RE: RESOLUTION: Re: [ZODB-Dev] more lockup information / zope2.9.6+zodb3.6.2

2007-04-18 Thread Paul Williams
It sounds like we have the same problem.  We contracted Tres Seaver
to write us a keepalive tool that pings the server periodically.  This has
fixed our problem: we haven't had a lockup in 8 days, and we used to
have one at least once a day.

The biggest issue is that some see it as a bug in Zope or Python,
since we fixed it with a keepalive.  How do we definitively clear the ZEO
infrastructure?  Is it somehow linked to Python code not recognizing the
connection loss, or is this strictly an iptables issue?  Is it a bug in
iptables or just a misconfiguration?

Thanks,
Paul Williams


-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Alan Runyan
Sent: Sunday, April 15, 2007 11:52 AM
To: Dieter Maurer
Cc: zodb-dev@zope.org
Subject: RESOLUTION: Re: [ZODB-Dev] more lockup information / zope2.9.6+zodb3.6.2

Thanks to Jim, Theune, Dieter and all others who weighed in on this
thread.

The problem: ZEO clients would lock up randomly, requiring a restart.

The hints: Lots of 'Connection timed out' and 'No route to host' in
ZEO Server log files.

The solution: The machine hosting the ZEO server was running iptables.
Even though it allowed the ZEO server port, 8100, it was filtering on
all interfaces.  I simply changed the rules so that traffic to port 8100
on the internal network interface is not filtered at all.  ZEO listens
specifically on the internal IP, port 8100.

The system has been stable for several days, and the ZEO server logs no
longer contain any connection errors.  I had never had this problem
before because, well, we use firewalls on our customers' external
interfaces and our internal interfaces never have filtering.  This
particular sysadmin was running iptables on all public and private
interfaces, locking the machine down as much as possible.
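
One way to check whether the connection errors have really stopped is to
count the two hint strings in the ZEO server log.  The following is a
minimal sketch, assuming the log is plain text with one event per line.
The function name and the sample lines are hypothetical and only there
for illustration; real ZEO log lines will look different.

```python
from collections import Counter

# The two error strings reported in the ZEO server log.
HINTS = ("Connection timed out", "No route to host")


def count_connection_errors(lines):
    """Count how many lines mention each hint string."""
    counts = Counter()
    for line in lines:
        for hint in HINTS:
            if hint in line:
                counts[hint] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "2007-04-10 12:00:01 ERROR socket error: Connection timed out",
    "2007-04-10 12:05:17 ERROR socket error: No route to host",
    "2007-04-10 12:06:00 INFO new connection",
    "2007-04-10 12:09:42 ERROR socket error: Connection timed out",
]
counts = count_connection_errors(sample)
```

Because the function accepts any iterable of lines, an open log file can
be passed in directly instead of the sample list.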

Thanks again guys!

-- 
Alan Runyan
Enfold Systems, Inc.
http://www.enfoldsystems.com/
phone: +1.713.942.2377x111
fax: +1.832.201.8856
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


RE: RESOLUTION: Re: [ZODB-Dev] more lockup information / zope2.9.6+zodb3.6.2

2007-04-18 Thread Paul Williams
That would be a very significant amount of work for us to implement.  We
are using Plone 2.5.1, which doesn't support anything newer than Zope 2.9.x.

We are also using Zope 2.9.5.  I am assuming that this is the same
problem as Alan's, since our logs are filled with the warning
signs described below.

We are still waiting for copies of the iptables rules, hosts.allow, and
hosts.deny to analyze.

Thanks,
Paul Williams


-----Original Message-----
From: Jim Fulton [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, April 18, 2007 10:13 AM
To: Paul Williams
Cc: Alan Runyan; Dieter Maurer; zodb-dev@zope.org
Subject: Re: RESOLUTION: Re: [ZODB-Dev] more lockup information / zope2.9.6+zodb3.6.2


It would be interesting for you to try this with Zope 2.10/ZODB 3.7.

In ZODB 3.7, I added an attempt to do this sort of keep-alive check.
Now, clients attempt to send an empty message to the server every 30
seconds during periods of inactivity.  Of course, I couldn't actually
test that this fixed anything, as I didn't have a way to reproduce
the problem.  I didn't mention this earlier in the thread, because it
didn't sound to me like this was the problem Alan was having.
Alan said he was using Zope 2.9 and thus ZODB 3.6.
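
The idle keep-alive described above can be sketched roughly as follows.
This is not the ZODB 3.7 code.  The class name, its methods, and the
`send` callable are hypothetical; only the 30-second interval and the
empty message come from the description.

```python
import time


class IdleKeepAlive:
    """Send an empty message when the connection has been idle too long."""

    def __init__(self, send, interval=30.0, clock=time.monotonic):
        self._send = send            # hypothetical callable that writes to the server
        self._interval = interval    # seconds of inactivity before a keep-alive
        self._clock = clock
        self._last_activity = clock()

    def note_activity(self):
        """Call whenever real traffic is sent or received."""
        self._last_activity = self._clock()

    def poll(self):
        """Call periodically from the client's event loop.

        Returns True if a keep-alive message was sent.
        """
        now = self._clock()
        if now - self._last_activity >= self._interval:
            self._send(b"")          # empty message
            self._last_activity = now
            return True
        return False


# Drive the sketch with a fake clock so the example runs instantly.
fake_now = [0.0]
sent = []
keepalive = IdleKeepAlive(sent.append, interval=30.0, clock=lambda: fake_now[0])

fake_now[0] = 10.0
first = keepalive.poll()     # idle for 10 s: nothing is sent

fake_now[0] = 45.0
second = keepalive.poll()    # idle for 45 s: a keep-alive is sent
```

With this shape, a send that fails would surface a dead connection to
the client instead of leaving it waiting indefinitely, and regular
traffic would keep an otherwise idle connection active.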

Jim

--
Jim Fulton          mailto:[EMAIL PROTECTED]    Python Powered!
CTO                 (540) 361-1714              http://www.python.org
Zope Corporation    http://www.zope.com         http://www.zope.org





RE: [ZODB-Dev] zeo client patch in connection.py

2007-03-21 Thread Paul Williams
Since we implemented this, any task that takes more than a few seconds (e.g.
packing the database) throws a ClientDisconnected error.  I am hoping someone out
there can shed some light on what might be happening.

-----Original Message-----
From: Christian Theune [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, March 07, 2007 9:43 AM
To: Paul Williams
Cc: zodb-dev@zope.org
Subject: Re: [ZODB-Dev] zeo client patch in connection.py

Hi,

could you send this again as a patch produced using `diff` against your
original sources, or tell us which version of Zope/ZODB you used as the
starting point for this patch?

Thanks,
Christian

-- 
gocept gmbh & co. kg - forsterstraße 29 - 06112 halle/saale - germany
www.gocept.com - [EMAIL PROTECTED] - phone +49 345 122 9889 7 -
fax +49 345 122 9889 1 - zope and plone consulting and development


[ZODB-Dev] zeo client patch in connection.py

2007-03-07 Thread Paul Williams
Hi Everyone, 

We were experiencing problems with our ZEO client setup on Red Hat RHEL4.
The client would just quit responding.  No memory or CPU increase was
associated with this.  The client would remain hung until it was
restarted.

On the client, netstat showed the connection to the ZEO server as
ESTABLISHED.  On the ZEO server, netstat said LISTENING.
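
The comparison of the two netstat views can be automated.  The sketch
below assumes Linux-style `netstat -tn` output, where the state column
reads LISTEN rather than LISTENING.  The sample output, the addresses,
and port 8100 are hypothetical, as are the function and variable names.
If the server shows no ESTABLISHED row for a connection the client
still considers ESTABLISHED, the two ends disagree about whether the
connection exists.

```python
def connection_states(netstat_output, port):
    """Return (local, remote, state) for TCP rows that involve the given port."""
    rows = []
    suffix = ":%d" % port
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers and anything that is not a TCP row.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            rows.append((local, remote, state))
    return rows


# Hypothetical netstat output captured on the ZEO client.
CLIENT_SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.5:43210          10.0.0.1:8100           ESTABLISHED
tcp        0      0 10.0.0.5:22             10.0.0.9:51515          ESTABLISHED
"""

# Hypothetical netstat output captured on the ZEO server.
SERVER_SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 10.0.0.1:8100           0.0.0.0:*               LISTEN
"""

client_rows = connection_states(CLIENT_SAMPLE, 8100)
server_rows = connection_states(SERVER_SAMPLE, 8100)

client_established = any(state == "ESTABLISHED" for _, _, state in client_rows)
server_established = any(state == "ESTABLISHED" for _, _, state in server_rows)

# True when the client believes in a connection the server does not have.
mismatch = client_established and not server_established
```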

When running the DeadlockDebugger, one thread was in an asyncore wait.  The
others were doing normal work such as folder listings or folder contents.

We added a couple of lines of code at line 641 of connection.py in
the ZEO/zrpc package.

We added an else clause that calls self.close() if the delay is over one
second.  We found that one second wasn't quite enough and raised it to 5
seconds.
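
The patch itself was sent as an attachment and is not reproduced in
this message.  The behaviour described above, a wait whose delay grows
and a close once the delay passes a threshold, might look roughly like
the following sketch.  It is not the ZEO code: the function, its
parameters, and the `poll` and `close` callables are hypothetical
stand-ins.

```python
def wait_for_reply(poll, close, max_delay=5.0, initial_delay=0.001):
    """Poll for a reply with a doubling delay; close once the delay is too long.

    poll(delay) returns a reply or None (stand-in for one pass of the I/O loop).
    close() is a stand-in for calling self.close() on the connection.
    """
    delay = initial_delay
    while True:
        reply = poll(delay)
        if reply is not None:
            return reply
        if delay < max_delay:
            delay += delay      # exponential backoff
        else:
            close()             # the added else clause: give up on this connection
            raise ConnectionError("no reply before the delay reached %.1f s" % max_delay)


# Case 1: the reply arrives on the third poll, so nothing is closed.
calls = []


def quick_poll(delay):
    calls.append(delay)
    return "ok" if len(calls) == 3 else None


closed = []
result = wait_for_reply(quick_poll, lambda: closed.append(True))

# Case 2: the reply never arrives, so the connection is closed.
slow_closed = []
try:
    wait_for_reply(lambda delay: None, lambda: slow_closed.append(True))
    timed_out = False
except ConnectionError:
    timed_out = True
```

Under this sketch, any call whose reply legitimately takes longer than
the threshold is cut off as well, not just calls on a dead connection.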

This turned out to drastically improve our performance.  The
servers now take < 1 second per page load.  Before, they could take 5
seconds or more, if they loaded at all.  Also, our servers used to crash
several times a day, and they now haven't crashed in almost a week.

I just wanted to put this out there and see if anyone has any comments
at all.  I need to get a more permanent solution than this, but it is
what we have for now. 

System Configuration 
Zope 2.9.5 
Plone 2.5.1 
Python 2.4.3 
Red Hat RHEL4

Communications between our zeo clients and zeo server only route through
a switch. 


Thank you for any help, 
Paul Williams



connection.py
Description: connection.py


RE: [ZODB-Dev] zeo client patch in connection.py

2007-03-07 Thread Paul Williams
Here is a patch.  The ZODB version is 3.6.2, as distributed with Zope 2.9.5.

Thank you,
Paul Williams



connection.py.patch
Description: connection.py.patch