Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3

2004-05-26 Thread Heimo Laukkanen
On Wed, 26 May 2004 08:09:53 +0200, Dario Lopez-Kästen  
[EMAIL PROTECTED] wrote:

Heimo Laukkanen wrote:
 Hi,
 I am hitting my head against a wall and witnessing strange behaviour:
after some time most of the load concentrates on a single thread. I am not
sure what causes this; so far we have seen different things happen
before the load concentrated on that one thread.
We also have this behaviour, but I have thought that our overuse of
DCOracle2 connections was the culprit, along with our
bad TTW code.

Do you use any DA, and do you have long-running SQL queries?
Yes, extensively. We use DCOracle2, which had to be patched to work with
UTF-8. And we combine it with badly scripted code too. We started the
prototype with our own Archetypes storage layer that stores data from
content objects in the Oracle database in a certain format, so every access
to an object actually generates a load of queries to the Oracle database.

Connections to the database are handled by the Oracle adapter in Zope, so
we have not opened any connections of our own but just run queries through
the Zope adapter.

We have no fix for this, but have resigned ourselves to doing restarts every
now and then. We use Zope 2.5.1, Redhat 7.3
How did you debug or pinpoint the culprit to be the DCOracle connection?
Have you tried cx_Oracle?
--
-huima
___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )


Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3

2004-05-26 Thread Heimo Laukkanen
On Wed, 26 May 2004 08:44:44 +0200, Dario Lopez-Kästen  
[EMAIL PROTECTED] wrote:

We use a lot of queries (minimum 10-15) for each request, across around
7 DAs connected to various Oracle schemas. Most of the queries are
fairly simple, but a couple of frequently used queries are just plain
bad; they are so bad that it is very difficult to fix them.
Ok. In our case we use very simple queries, but this stupid code generates
a lot of them. One thing to do is to throw away the stupidity and the
Archetypes storage layer, which is a bad idea, and make an object flush and
read its data from Oracle only when actually necessary.
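
A minimal sketch of that "only when necessary" idea, not taken from Archetypes or DCOracle2. The class name LazyRecord and the loader/saver callables are hypothetical; in a real setup they would wrap one SELECT and one UPDATE. The record is fetched on first read only and written back only if it changed.

```python
class LazyRecord:
    """Hypothetical wrapper: loads its row on first read, writes only when dirty."""

    def __init__(self, key, loader, saver):
        self._key = key
        self._loader = loader    # callable(key) -> dict, e.g. one SELECT
        self._saver = saver      # callable(key, dict), e.g. one UPDATE
        self._data = None        # nothing fetched yet
        self._dirty = False

    def _load(self):
        # Fetch at most once; later reads are served from memory.
        if self._data is None:
            self._data = dict(self._loader(self._key))
        return self._data

    def get(self, name):
        return self._load()[name]

    def set(self, name, value):
        self._load()[name] = value
        self._dirty = True

    def flush(self):
        """Write back once, and only if something changed."""
        if self._dirty:
            self._saver(self._key, self._data)
            self._dirty = False


# Usage with in-memory stand-ins for the database calls.
calls = []
store = {42: {"title": "old"}}

def load(key):
    calls.append("select")
    return store[key]

def save(key, data):
    calls.append("update")
    store[key] = dict(data)

rec = LazyRecord(42, load, save)
rec.flush()                 # nothing loaded, nothing dirty: no queries
rec.get("title")
rec.get("title")            # second read does not hit the loader again
rec.set("title", "new")
rec.flush()                 # exactly one write
```

With this shape, several reads and one change cost one read query and one write query instead of a query per attribute access.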

Chris Withers has done some work on improving DCOracle2's connections
and general bug-fixing. If you haven't used it, grab the latest
DCOracle2 from CVS - it is much better.
We did take it from CVS. Has there been much work on it lately?
We see this behaviour mostly under heavy load - many users accessing the
database all the time. Using the latest DCOracle and improving parts of
our code has removed a lot of the problems; however, I believe it is
still unclear what the problem is - in our setting, my stance is that
we have only cured the symptoms, not the real problems...
Sounds familiar.
Have you tried cx_Oracle?
No I was not aware that they had a Zope adaptor.
Well, Dieter and others suggested that it should be doable to write an
adaptor based on any Python DB-API compliant adaptor, using either
DCOracle, psycopg or similar as a base. However, since I've never done that
kind of work before, I am reluctant to take the step and be bitten, hence
the question whether you have had more courage.
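
For illustration, a minimal sketch of what "DB-API compliant" buys you: the same connect / cursor / execute / fetchall calls work against any conforming module. The helper name run_query is made up for this example, and sqlite3 stands in for an Oracle driver only so that the snippet runs anywhere. This is not a Zope DA; a real adaptor would also have to tie into Zope's transaction handling, which is not shown.

```python
import sqlite3


def run_query(dbapi_module, connect_args, sql, params=()):
    """Run one query through any DB-API 2.0 module and return all rows."""
    conn = dbapi_module.connect(*connect_args)
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        rows = cur.fetchall()
        conn.commit()
        return rows
    finally:
        # Always release the connection, even if the query fails.
        conn.close()


# sqlite3 uses "?" placeholders; other drivers may use a different style,
# which each module advertises in its paramstyle attribute.
rows = run_query(sqlite3, (":memory:",), "SELECT ? + ?", (1, 2))
```

Swapping the driver would mean changing the module, the connect arguments and the placeholder style, while the calling code stays the same.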

--
-huima


[Zope-dev] Zope threads on 2.7.0, python 2.3.3

2004-05-25 Thread Heimo Laukkanen
Hi,
I am hitting my head against a wall and witnessing strange behaviour:
after some time most of the load concentrates on a single thread. I am not
sure what causes this; so far we have seen different things happen
before the load concentrated on that one thread.

After reading the thread
http://www.gossamer-threads.com/lists/zope/dev/24230?page=last
I started to wonder whether that signalling behaviour is also a cause
of our problems, since we are running Debian Linux without NPTL
(Linux 2.4.26 #1 SMP Thu Apr 22 11:16:14 EEST 2004 i686 unknown).

When Zope is started, it nicely starts multiple threads, and when tested
with ab each thread gets its share of the work. Previously we saw
exceptions in the event log that were never caught, and wondered whether
those were the source of the problem. Even though we added ugly try/except
code around the problematic code, we still see the load concentrate on
this one thread.
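
As a general Python illustration (not ZServer's actual code): an exception that escapes a thread's target function ends that thread, so a pool can shrink without anyone noticing. A minimal sketch of a worker loop that logs the failure and carries on; the names worker, jobs and results are made up for this example.

```python
import queue
import threading
import traceback


def worker(jobs, results):
    """Handle jobs until a None sentinel arrives; survive failing jobs."""
    while True:
        job = jobs.get()
        if job is None:
            break
        try:
            results.append(job())
        except Exception:
            # Log and keep going, so one bad request does not end the thread.
            traceback.print_exc()
            results.append("failed")


jobs = queue.Queue()
results = []
t = threading.Thread(target=worker, args=(jobs, results))
t.start()
jobs.put(lambda: 1 / 0)      # raises ZeroDivisionError inside the worker
jobs.put(lambda: "ok")       # still handled, because the thread survived
jobs.put(None)               # sentinel: ask the worker to stop
t.join()
```

Without the try/except the first job would end the thread and the second job would stay in the queue unhandled.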

I will try to look at those no-longer-working threads with gdb, but
would like to know if anyone could give any pointers on what to look for
or what to check.

-huima


Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3

2004-05-25 Thread Heimo Laukkanen
On Tue, 25 May 2004 20:21:21 +0200, Dieter Maurer [EMAIL PROTECTED]  
wrote:

I would be surprised...
What does Control_Panel -- Debug information tell you
about the use of your connections (at the bottom of the page)?
At the moment it said that there was only one opened connection and the
others were none. I have no terminal access to the machines right now to
check how the threads are behaving.

I did try to look at the processes with gdb, but I need to read more of
the gdb manual before I can provide more than a plain backtrace of each
process. I'm not familiar with Python internals, so any pointers
on what to expect or what not to expect would be appreciated.

Below is the output of ps -auxfww | grep portaali, where portaali is
the name of the user who owns the processes.

The listing shows that all threads were started at the same time, but the
thread with pid 14553 has done most of the work. While putting on more
load and watching top, it is also the only thread that gets any
share of processor time. I have no information on when this started,
since the event log does not show anything peculiar.

portaali 14530  0.0  0.2  6116 4432 ? S 12:12   0:00 /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/zdaemon/zdrun.py -S /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/zopeschema.xml -b 10 -d -f -s /usr/local/portaali/zope_instance/var/zopectlsock -x 0,2 -z /usr/local/portaali/zope_instance /usr/local/portaali/zope_instance/bin/runzope
portaali 14531  0.2 16.6 365220 345168 ? S 12:12   0:46  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14532  0.0 16.6 365220 345168 ? S 12:12   0:00  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14553 13.0 16.6 365220 345168 ? S 12:12  34:40 \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14552  0.1 16.6 365220 345168 ? S 12:12   0:18 \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14548  0.8 16.6 365220 345168 ? S 12:12   2:16 \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14545  0.1 16.6 365220 345168 ? S 12:12   0:17 \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14547  0.2 16.6 365220 345168 ? S 12:12   0:36 \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py -C /usr/local/portaali/zope_instance/etc/zope.conf
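
To put a number on the imbalance, here is a small sketch that sums the TIME column of the run.py lines in the listing above. It assumes the TIME values are minutes:seconds; the helper name cpu_seconds is made up for this example.

```python
def cpu_seconds(time_field):
    """Convert a ps TIME value such as '34:40' (minutes:seconds) to seconds."""
    minutes, seconds = time_field.split(":")
    return int(minutes) * 60 + int(seconds)


# PID -> TIME, copied from the run.py lines of the listing above.
times = {
    14531: "0:46", 14532: "0:00", 14553: "34:40", 14552: "0:18",
    14548: "2:16", 14545: "0:17", 14547: "0:36",
}
total = sum(cpu_seconds(t) for t in times.values())
busiest = max(times, key=lambda pid: cpu_seconds(times[pid]))
share = cpu_seconds(times[busiest]) / total
print(busiest, round(share * 100))   # pid and its share of CPU time in percent
```

On these figures pid 14553 accounts for roughly 89% of the accumulated CPU time.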

And then backtrace with gdb from those quiet threads
~# gdb program 14552
backtrace
#0  0x4007b87e in sigsuspend () from /lib/libc.so.6
#1  0x4001e879 in __pthread_wait_for_restart_signal () from  
/lib/libpthread.so.0
#2  0x4001fee1 in sem_wait@@GLIBC_2.1 () from /lib/libpthread.so.0
#3  0x080c4830 in PyThread_acquire_lock (lock=0x861a908, waitflag=1) at  
Python/thread_pthread.h:406
#4  0x080c76c8 in lock_PyThread_acquire_lock (self=0x417c4840,  
args=0x4017002c) at ./Modules/threadmodule.c:63
#5  0x080e0e3f in PyCFunction_Call (func=0x41c7c7cc, arg=0x4017002c,  
kw=0x0) at Objects/methodobject.c:73
#6  0x080a3beb in call_function (pp_stack=0xbe7fb904, oparg=0) at  
Python/ceval.c:3439
#7  0x080a22f6 in eval_frame (f=0x9b5ea24) at Python/ceval.c:2116
#8  0x080a31e2 in PyEval_EvalCodeEx (co=0x40442c20, globals=0x40445acc,  
locals=0x0, args=0x8bdb024, argcount=1,
kws=0x8bdb028, kwcount=1, defs=0x4047eb18, defcount=5, closure=0x0) at  
Python/ceval.c:2663
#9  0x080a55ec in fast_function (func=0x404a9924, pp_stack=0xbe7fba94,  
n=3, na=1, nk=1) at Python/ceval.c:3529
#10 0x080a3c71 in call_function (pp_stack=0xbe7fba94, oparg=256) at  
Python/ceval.c:3458

Another thread ( 14548 )
#0  0x401153c4 in read () from /lib/libc.so.6
#1  0x40029ae0 in __DTOR_END__ () from /lib/libpthread.so.0
#2  0x412474f0 in nttrd () from  
/opt/portaali/comp/oracle/lib/libclntsh.so.9.0
#3  0x410fdcf8 in nsprecv () from  
/opt/portaali/comp/oracle/lib/libclntsh.so.9.0
#4  0x41101ac5 in nsrdr () from  
/opt/portaali/comp/oracle/lib/libclntsh.so.9.0
#5  

[Zope-dev] Threads dying on zope 2.7.0 on one Zeo setup, ZServer exception

2004-05-18 Thread Heimo Laukkanen
Hi ya,
I noticed some weird behaviour that was bringing a setup down on our
servers in a testing installation. In our setup we create development
sandboxes with a makefile, and testing/production servers in similar
fashion.

Our testing/production setup has slightly different config files:
all the commented-out lines from the standard config have been
removed. Also, compared to the development instances, the instance home
is in a different place. Other than that the setups are the same.

At the moment, under load test, ZServer in the testing setup dies, leaving
all but one thread at almost 0% CPU and that one thread working under a
high load of 90%. Testing a similar ZEO setup with development instances
does not produce this.

We are a bit scared by this, since we have no clue what causes the
following behaviour. Even though we see that we can get it to work,
we would like to have an idea of what might be behind this, so that it
does not bite our butt again.

Below is a log snippet and the traceback of the exceptions raised when
the other threads go down.

In the log file we get:
2004-05-17T16:47:17 ERROR(200) ZServer uncaptured python exception,
closing channel ZServer.HTTPServer.zhttp_channel connected
127.0.0.1:47855 at 0x42fdb36c channel#: 589 requests:
(socket.error:(104, 'Connection reset by peer')
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asynchat.py|initiate_send|218]
[/opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/medusa/http_server.py|send|417]
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asyncore.py|send|337])
2004-05-17T16:48:49 ERROR(200) ZServer uncaptured python exception,
closing channel ZServer.HTTPServer.zhttp_channel connected
127.0.0.1:47924 at 0x430538ac channel#: 651 requests:
(socket.error:(104, 'Connection reset by peer')
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asynchat.py|initiate_send|218]
[/opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/medusa/http_server.py|send|417]
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asyncore.py|send|337])
2004-05-17T16:50:05 ERROR(200) ZServer uncaptured python exception,
closing channel ZServer.HTTPServer.zhttp_channel connected
127.0.0.1:47961 at 0x42f3ba4c channel#: 683 requests:
(socket.error:(104, 'Connection reset by peer')
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asynchat.py|initiate_send|218]
[/opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/medusa/http_server.py|send|417]
[/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/asyncore.py|send|337])

In the traceback we get:
2004-05-18T11:44:14 INFO(0) Zope Ready to handle requests
Unhandled exception in thread started by class
ZServer.PubCore.ZServerPublisher.ZServerPublisher at 0x405435cc
Traceback (most recent call last):
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
    response=response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 372, in publish_module
    environ, debug, request, response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 173, in publish_module_standard
    request.response.exception()
AttributeError: 'NoneType' object has no attribute 'exception'
Unhandled exception in thread started by class
ZServer.PubCore.ZServerPublisher.ZServerPublisher at 0x405435cc
Traceback (most recent call last):
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
    response=response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 372, in publish_module
    environ, debug, request, response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 173, in publish_module_standard
    request.response.exception()
AttributeError: 'NoneType' object has no attribute 'exception'
Unhandled exception in thread started by class
ZServer.PubCore.ZServerPublisher.ZServerPublisher at 0x405435cc
Traceback (most recent call last):
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__
    response=response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 372, in publish_module
    environ, debug, request, response)
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZPublisher/Publish.py, line 173, in publish_module_standard
    request.response.exception()
AttributeError: 'NoneType' object has no attribute 'exception'
Unhandled exception in thread started by class
ZServer.PubCore.ZServerPublisher.ZServerPublisher at 0x405435cc
Traceback (most recent call last):
  File /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/ZServer/PubCore/ZServerPublisher.py, line 23, in __init__