Julien Anguenot wrote:

when re-indexing an entire site, no problem is experienced (I actually
re-indexed 24 sites, and the persistent connection works like a charm).

kewl !

yes indeed :-)

however in production (i.e. in our case, with front-end servers behind a
load-balancer, virtual IPs, firewall rules, weird routing tables,
long-lasting connections that mysteriously die after a few minutes ...)
the persistent connection stops working after a few minutes. Eventually
the xmlrpc client in nuxeo.lucene.catalog.py fails to connect to the
lucene server with the message "Connection to %s FAILED.
nuxeo.lucene.catalog won't work until the connection will be possible
again" and the server needs to be restarted.

Argh, this is strange... Is the NXLucene server stalled in that case? If
so, that's a bug... Can you check this, please?

In fact the NXLucene server can still be used once Zope is restarted, so I would rather say that it's the client that is getting stalled (the xmlrpc.py module which is now being bypassed).

But again the production environment that we have may not be optimal for HTTP persistent connections...

Apart from that, the NXLucene server died a day ago. I'm not sure whether this is related to the persistent connection bug on the client, to a log rotation that failed, or to an invalid query; I'm still trying to figure it out. Since the lucene server was restarted, it has worked without any problem.

So I changed:

   def _getProxy(self):
       if getattr(self, '_v_proxy', None) is None:
           self._v_proxy = xmlrpclib.ServerProxy(
               self._getServerURL(), transport=PersistentTransport())

to:

   def _getProxy(self):
       if getattr(self, '_v_proxy', None) is None:
           self._v_proxy = xmlrpclib.ServerProxy(
               self._getServerURL())

and individual queries now work perfectly well too.

So probably depending on the type of method called in the catalog
(search vs reindex) a persistent transport is better adapted in some
cases, or more dangerous in others. Note that I didn't experience any of
these issues in the development environment.

I think before doing this we need to check the server status. If it
is stalled then we have an issue somewhere, because it means a query is
responsible for this...

The problem is probably on the client: it keeps opening new connections and then considers them unusable a few minutes later, so it could be that the connections don't really get freed on the server either.
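
For what it's worth, here is a minimal sketch of one way a client could cope with a cached connection that has gone stale: drop the cached proxy on a connection-level error and retry once with a fresh one. This is not the nuxeo.lucene.catalog code; the class name ReconnectingProxy, its call() method and the proxy_factory parameter are hypothetical names made up for illustration, and the sketch uses the Python 3 module xmlrpc.client rather than the xmlrpclib module quoted above.

```python
import http.client
import xmlrpc.client  # Python 3 name of the xmlrpclib module


class ReconnectingProxy:
    """Hypothetical wrapper that caches a ServerProxy and rebuilds it
    once when a call fails with a connection-level error."""

    def __init__(self, url, proxy_factory=xmlrpc.client.ServerProxy):
        self._url = url
        # proxy_factory is an assumption: it lets a transport (or a test
        # double) be plugged in without touching the retry logic.
        self._proxy_factory = proxy_factory
        self._proxy = None

    def _get_proxy(self):
        # Same lazy caching idea as _getProxy() in the snippets above.
        if self._proxy is None:
            self._proxy = self._proxy_factory(self._url)
        return self._proxy

    def call(self, method, *args):
        for attempt in (1, 2):
            try:
                return getattr(self._get_proxy(), method)(*args)
            except (OSError, http.client.HTTPException):
                # Forget the possibly dead connection so that the next
                # attempt starts from a fresh proxy.
                self._proxy = None
                if attempt == 2:
                    raise
```

Whether a single retry is enough in a setup where the load-balancer or firewall silently drops idle connections is something I can't tell from here; the point is only that the cached proxy gets discarded instead of being reused until Zope is restarted.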

Otherwise, everything works perfectly, it's a *huge* performance boost
compared to ZCatalog!

eheh great ! Glad it does work fine :)

Are you indexing 24 CPS sites in the same NXLucene?

Can you tell us which OS you are running NXLucene on, and the
gcc version, please?

Cheers,

        J.


Yes, we have 24 CPS sites on the same lucene catalog, with an extra 'site_id' keyword to restrict searches. We are going to use the central catalog to spread local news and events between sites. That's a total of 6 Zope servers connected to the same lucene.

the lucene server setup is:

4 CPU 2GHz 64bits
Red Hat Enterprise Linux AS release 4
gcc-3.4.6-3 (included in RedHat)
pylucene2 :-)

Cheers,
/JM


_______________________________________________
cps-devel mailing list
http://lists.nuxeo.com/mailman/listinfo/cps-devel
