Jean-Marc Orliaguet wrote:
> Julien Anguenot wrote:
>>
>>> when re-indexing an entire site, no problem is experienced (I actually
>>> re-indexed 24 sites, and the persistent connection works like a charm).
>>>     
>>
>> kewl !
>>   
> 
> yes indeed :-)
> 
>>> however in production (i.e. in our case, with front-end servers behind a
>>> load-balancer, virtual IPs, firewall rules, weird routing tables,
>>> long-lasting connections that mysteriously die after a few minutes ...)
>>> the persistent connection stops working after a few minutes. Eventually
>>> the xmlrpc client in nuxeo.lucene.catalog.py fails to connect to the
>>> lucene server with the message "Connection to %s FAILED.
>>> nuxeo.lucene.catalog won't work until the connection will be possible
>>> again" and the server needs to be restarted.
>>>     
>>
>> Argh, this is strange... Is the NXLucene server stalled in the case ? If
>> it's the case that's a bug... Can you check this point out please ?
>>   
> 
> In fact the NXLucene server can still be used if zope is restarted, so
> rather I would say that it's the client that is getting stalled (the
> xmlrpc.py module which is now being bypassed).

Then it smells like a client problem. The class handling the persistent
transport should reconnect if the connection dies, though. Maybe we are
missing a timeout over there. I wrote that quite some time ago, so I
don't remember the details of the implementation right now...
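
To make the reconnect idea concrete, here is a rough sketch only, not the
actual nuxeo.lucene code. `ReconnectingProxy`, `make_proxy` and `retries`
are hypothetical names: the wrapper drops the cached proxy when a call
fails at the socket level and retries with a freshly built one. It assumes
a dead or timed-out connection surfaces as `socket.error`. A timeout would
have to be set separately, for instance with `socket.setdefaulttimeout()`.

```python
import socket


class ReconnectingProxy(object):
    """Hypothetical wrapper around any XML-RPC style proxy object.

    `make_proxy` is a zero-argument callable returning a new proxy,
    e.g. lambda: xmlrpclib.ServerProxy(url, transport=PersistentTransport())
    """

    def __init__(self, make_proxy, retries=1):
        self._make_proxy = make_proxy
        self._retries = retries
        self._proxy = None

    def call(self, method, *args):
        attempts = self._retries + 1
        for attempt in range(attempts):
            if self._proxy is None:
                # Build (or rebuild) the proxy lazily.
                self._proxy = self._make_proxy()
            try:
                return getattr(self._proxy, method)(*args)
            except socket.error:
                # Assume the connection is dead: forget the proxy so the
                # next attempt opens a new connection.
                self._proxy = None
                if attempt == attempts - 1:
                    # Out of retries: let the caller see the error.
                    raise
```

With something like this the catalog would not stay stuck on a stale
connection: one failed call costs one reconnect instead of a Zope restart.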

> But again the production environment that we have may not be optimal for
> HTTP persistent connections...

Maybe...

> apart from that, the NXLucene server died a day ago, I'm not sure if
> this is related to the persistent connection bug on the client, or if it
> is due to a log rotation that failed, or an invalid query. I'm still
> trying to figure out. But since the lucene server was restarted it has
> worked without any problem.

Did you get a core dump?

>>  
>>> So I changed:
>>>
>>>    def _getProxy(self):
>>>        if getattr(self, '_v_proxy', None) is None:
>>>            self._v_proxy = xmlrpclib.ServerProxy(
>>>                self._getServerURL(), transport=PersistentTransport())
>>>
>>> to:
>>>
>>>    def _getProxy(self):
>>>        if getattr(self, '_v_proxy', None) is None:
>>>            self._v_proxy = xmlrpclib.ServerProxy(
>>>                self._getServerURL())
>>>
>>> and individual queries now work perfectly well too.
>>>
>>> So probably depending on the type of method called in the catalog
>>> (search vs reindex) a persistent transport is better adapted in some
>>> cases, or more dangerous in others. Note that I didn't experience any of
>>> these issues in the development environment.
>>>     
>>
>> I think before doing this we need to check the server status.. If this
>> is stalled then we have an issue somewhere because it means a query's
>> responsible of this...
>>   
> 
> the problem is probably on the client, because it keeps on opening new
> connections and thinks that they are not usable a few minutes later, so
> it could be that the connection doesn't really get freed on the server
> either.
> 
>>  
>>> Otherwise, everything works perfectly, it's a *huge* performance boost
>>> compared to ZCatalog!
>>>     
>>
>> eheh great ! Glad it does work fine :)
>>
>> You indexing 24 CPS sites in the same NXLucene ?
>>
>> Can you give us the OS against which you are running NXLucene and the
>> gcc version please ?

[...]

> yes, we have 24 CPS sites on the same lucene catalog with an extra
> 'site_id' keyword to restrict searches. We are going to use the central
> catalog to spread local news and events between sites.. that's a total
> of 6 zope servers connected to the same lucene.

Really great. It was designed to do that in the first place, so glad
you use it the way it was intended :)

> the lucene server setup is:
> 
> 4 CPU 2GHz 64bits
> Red Hat Enterprise Linux AS release 4
> gcc-3.4.6-3 (included in RedHat)
> pylucene2 :-)

Of course, Red Hat Enterprise :) You shouldn't get as many gcj bugs as
with other distros.

Are you aware of the gcc 3.4.6 bug related to the 2 GB PyLucene store
limitation?

Cheers,

        J.

-- 
Julien Anguenot | Nuxeo R&D (Paris, France)
Open Source ECM - http://www.nuxeo.com
Nuxeo 5 : http://www.nuxeo.org
Mobile: +33 (0) 6 72 57 57 66

_______________________________________________
cps-devel mailing list
http://lists.nuxeo.com/mailman/listinfo/cps-devel
