Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Heimo Laukkanen wrote at 2004-5-25 22:46 +0300:
> ...
>> What does "Control_Panel --> Debug information" tell you
>> about the use of your connections (at the bottom of the page)?
>
>At the moment it said that only one connection was opened and the others
>were none. I have no terminal access to the machines at the moment to
>check how the threads are behaving.

That looks strange...

Maybe the only running thread does not release the GIL?

>And then backtrace with gdb from those quiet threads

My most essential tools when analysing Python code in "gdb"
are the following two GDB command definitions:

    def ps
    x/s ({PyStringObject}$arg0)->ob_sval
    end

    def pfr
    ps f->f_code->co_filename
    ps f->f_code->co_name
    p f->f_lineno
    end

"ps" allows you to look at a Python string variable, and
"pfr" can be called in "eval_frame" frames. It tells you what
code this frame is executing -- identified by module and function.
You cannot trust the "lineno" in Python 2.3 (it is the start of
the function, not where you actually are).

>~# gdb program 14552
>backtrace
> ...
>#7 0x080a22f6 in eval_frame (f=0x9b5ea24) at Python/ceval.c:2116

Use "fr 7" to select this frame and then "pfr" to see where this is.

>...
>Another thread ( 14548 )
>
>#0 0x401153c4 in read () from /lib/libc.so.6
>#1 0x40029ae0 in __DTOR_END__ () from /lib/libpthread.so.0
>#2 0x412474f0 in nttrd () from
>/opt/portaali/comp/oracle/lib/libclntsh.so.9.0
>#3 0x410fdcf8 in nsprecv () from

Obviously, this is a call to Oracle. Maybe the GIL is not released?

> ...
>#14 0x40e13631 in Cursor_execute (self=0x4216fcd0, args=0x4017002c) at
>src/dco2.c:3740

The GIL should have been released in the function above.
Check it for "Py_BEGIN_ALLOW_THREADS" and "Py_END_ALLOW_THREADS"
around line 3740.

>PID: 14547
>
>#0 0x401153c4 in read () from /lib/libc.so.6
>#1 0x40029ae0 in __DTOR_END__ () from /lib/libpthread.so.0
>#2 0x412474f0 in nttrd () from
>/opt/portaali/comp/oracle/lib/libclntsh.so.9.0

This, too, is waiting for Oracle...

>PID: 14545
>
>#0 0x401153c4 in read () from /lib/libc.so.6
>#1 0x40029ae0 in __DTOR_END__ () from /lib/libpthread.so.0
>#2 0x412474f0 in nttrd () from
>/opt/portaali/comp/oracle/lib/libclntsh.so.9.0
>#3 0x410fdcf8 in nsprecv () from

This, too...

>PID: 14531
>
>#0 0x4011c7ee in select () from /lib/libc.so.6
>#1 0x40568cb4 in __DTOR_END__ () from
>/opt/portaali/ContentManagement-1.0/python-2.3.3/lib/python2.3/lib-dynload/select.so
>#2 0x080e0e3f in PyCFunction_Call (func=0x4049636c, arg=0x426476e4,
>kw=0x0) at Objects/methodobject.c:73
>#3 0x080a3beb in call_function (pp_stack=0xb11c, oparg=4) at
>Python/ceval.c:3439
>#4 0x080a22f6 in eval_frame (f=0x9033f64) at Python/ceval.c:2116

Check where you are here (--> "pfr").
It may be the medusa/asyncore main loop.

You have several threads waiting for responses from Oracle.
Are you sure that "Control_Panel --> Debug information" tells you
only about a single ZODB connection?

--
Dieter

___
Zope-Dev maillist - [EMAIL PROTECTED]
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
 http://mail.zope.org/mailman/listinfo/zope-announce
 http://mail.zope.org/mailman/listinfo/zope )
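For readers on a newer Python than the 2.3.3 discussed in this thread, a rough in-process counterpart to the "pfr" macro can be sketched with sys._current_frames(). That function only exists since Python 2.5, so it would not have worked on the setup above. The helper names and the waiting worker are made up for illustration, and the sketch only reports the innermost Python frame of each thread; it cannot tell whether a C extension is holding the GIL.

```python
import sys
import threading


def innermost_frames():
    """Map each thread id to (filename, function name, line number).

    Reports the same three values the "pfr" gdb macro prints,
    but from inside the running process.
    """
    report = {}
    for thread_id, frame in sys._current_frames().items():
        code = frame.f_code
        report[thread_id] = (code.co_filename, code.co_name, frame.f_lineno)
    return report


def start_waiting_worker():
    """Start a hypothetical worker that blocks, like a thread waiting on a reply."""
    release = threading.Event()
    worker = threading.Thread(target=release.wait)
    worker.daemon = True
    worker.start()
    return worker, release


if __name__ == "__main__":
    worker, release = start_waiting_worker()
    for thread_id, (filename, function, lineno) in sorted(innermost_frames().items()):
        print("%s: %s in %s (line %d)" % (thread_id, function, filename, lineno))
    release.set()
    worker.join()
```

Calling innermost_frames() from a debugging hook gives one entry per thread, which is enough to see which threads sit in a lock acquire and which are inside application code.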
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Dario Lopez-Kästen wrote:
> Chris Withers has done some work in improving DCOracle2's connections
> and general bug-fixing. If you haven't used it, grab the latest
> DCOracle2 from CVS - it is much better.

My work is still on a branch...

>> Have you tried cx_Oracle?
>
> No, I was not aware that they had a Zope adaptor.

I think they do. If it's significantly better than DCOracle, then I'll
tell people to switch instead of trying to make DCOracle work further ;-)

Chris

--
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Heimo Laukkanen wrote:
> On Wed, 26 May 2004 08:44:44 +0200, Dario Lopez-Kästen <[EMAIL PROTECTED]> wrote:
>
> Ok. In our case we use very simple queries, but this stupid code
> generates a lot of those. One thing to do is to throw away the
> stupidity and the Archetypes storage layer, which is a bad idea, and
> make the object flush and read its data from Oracle only when
> actually necessary.

I agree. We cannot do it, however, because most of my code is TTW, so
we have no custom objects. However, this is a good strategy in general:
do not read from external sources unless it is necessary...

>> Chris Withers has done some work in improving DCOracle2's connections
>> and general bug-fixing. If you haven't used it, grab the latest
>> DCOracle2 from CVS - it is much better.
>
> We did take it from CVS. Has there been much work on it lately?

Not that I am aware of. (I haven't checked :)

>>> Have you tried cx_Oracle?
>>
>> No, I was not aware that they had a Zope adaptor.
>
> Well, Dieter and others suggested that it should be doable to write
> an adaptor based on any Python DB-API compliant adaptor - either using
> DCOracle, psycopg or similar as a base. However, since I've never done
> that kind of work before, I am reluctant to take the step and be
> bitten, hence the question whether you have had more courage.

No, sorry. For me, this too is beyond what I can do at the moment,
mostly due to time constraints. Hopefully this will change as more
people become involved with Zope at my work...

/dario

--
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
On Wed, 26 May 2004 08:44:44 +0200, Dario Lopez-Kästen <[EMAIL PROTECTED]> wrote:
> We use a lot of queries (minimum 10-15) for each request, across
> around 7 DAs connected to various Oracle schemas. Most of the queries
> are fairly simple, but there are a couple of often-used queries that
> are just plain bad; they are so bad that it is very difficult to fix
> them.

Ok. In our case we use very simple queries, but this stupid code
generates a lot of those. One thing to do is to throw away the
stupidity and the Archetypes storage layer, which is a bad idea, and
make the object flush and read its data from Oracle only when actually
necessary.

> Chris Withers has done some work in improving DCOracle2's connections
> and general bug-fixing. If you haven't used it, grab the latest
> DCOracle2 from CVS - it is much better.

We did take it from CVS. Has there been much work on it lately?

> We see this behaviour mostly under heavy load - many users accessing
> the database all the time. Using the latest DCOracle and improving
> parts of our code has removed a lot of the problems; however, I
> believe it is still unclear what the problem is - in our setting, my
> stance is that we have only cured the symptoms, not the real
> problems...

Sounds familiar.

>> Have you tried cx_Oracle?
>
> No, I was not aware that they had a Zope adaptor.

Well, Dieter and others suggested that it should be doable to write an
adaptor based on any Python DB-API compliant adaptor - either using
DCOracle, psycopg or similar as a base. However, since I've never done
that kind of work before, I am reluctant to take the step and be
bitten, hence the question whether you have had more courage.

--
-huima
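The idea of reading an object's data from Oracle only when actually necessary could look roughly like the sketch below. It is not Archetypes or DCOracle2 code: LazyRecord, CountingLoader and the "doc-1" row are hypothetical names, and the loader simply stands in for whatever query the storage layer would run.

```python
class LazyRecord(object):
    """Defer loading a record until one of its fields is first read."""

    def __init__(self, key, loader):
        self._key = key
        self._loader = loader   # callable(key) -> dict of field values
        self._data = None       # nothing fetched yet

    def get(self, field):
        if self._data is None:
            # First real access: run the expensive query once and cache it.
            self._data = self._loader(self._key)
        return self._data[field]

    def invalidate(self):
        """Drop the cached values so the next read fetches fresh data."""
        self._data = None


class CountingLoader(object):
    """Hypothetical stand-in for a database query; counts how often it runs."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def __call__(self, key):
        self.calls += 1
        return dict(self.rows[key])


if __name__ == "__main__":
    loader = CountingLoader({"doc-1": {"title": "Hello", "body": "World"}})
    record = LazyRecord("doc-1", loader)
    print("queries after creation:", loader.calls)
    record.get("title")
    record.get("body")
    print("queries after two reads:", loader.calls)
```

Creating the object costs no query at all, and repeated reads share one fetch until invalidate() is called, which is the point where a real storage layer would have to decide how stale the cached data may become.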
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Heimo Laukkanen wrote:
>> Do you use any DA, and have long running SQL queries?
>
> Yes, extensively. We use DCOracle2, which had to be patched to work
> with UTF-8. And we combine it with badly scripted code too. We started
> to build a prototype using our own Archetypes storage layer that
> stores data from content objects in the Oracle database in a certain
> format - so every access to an object actually creates a load of
> queries to the Oracle database. Connections to the database are
> handled by the Oracle adapter in Zope, so we have not made any
> connections of our own but rather just run queries through the Zope
> adapter.

We use a lot of queries (minimum 10-15) for each request, across around
7 DAs connected to various Oracle schemas. Most of the queries are
fairly simple, but there are a couple of often-used queries that are
just plain bad; they are so bad that it is very difficult to fix them.

These queries generate a lot of load on the DB server (they are *very*
bad SQL), so they tend to take a long time to respond.

>> We have no fix for this, but have resigned ourselves to doing
>> restarts every now and then. We use Zope 2.5.1, Redhat 7.3
>
> How did you debug or pinpoint the culprit to be DCOracle connection?

Gut feeling, and some basic optimisation of the not-so-complicated
cases. I did receive a few tips on how to do real debugging from Matt,
Dieter and others, but those tips assume a level of knowledge that I do
not possess yet - also, until recently I have not had any hardware to
do that kind of debugging on.

I just made some basic observations - sometimes Zope would stop
responding and angry users would call, and I would find threads that
had been running for more than 7 seconds, while at the same time I
could observe queries on the database that had been running for a
similar length of time, occasionally even blocking the database
altogether.

Sometimes it seemed like Zope would "lose" the connection to the
database (we still get that randomly now and then), but I am not 100%
sure that this is only Zope's fault - it may be that the database was
under so much load that it "lost" the connection to Zope, thus
triggering some kind of "wait for reply" loop on Zope's side.

Chris Withers has done some work in improving DCOracle2's connections
and general bug-fixing. If you haven't used it, grab the latest
DCOracle2 from CVS - it is much better.

We see this behaviour mostly under heavy load - many users accessing
the database all the time. Using the latest DCOracle and improving
parts of our code has removed a lot of the problems; however, I believe
it is still unclear what the problem is - in our setting, my stance is
that we have only cured the symptoms, not the real problems...

> Have you tried cx_Oracle?

No, I was not aware that they had a Zope adaptor.

/dario

--
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
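The observation about threads running for more than 7 seconds can be made systematic with a small bookkeeping helper. The sketch below is an assumption about how one might do it, not something from Zope or DCOracle2: SlowRequestTracker and its method names are invented here, and the 7-second default merely echoes the figure mentioned above.

```python
import threading
import time


class SlowRequestTracker(object):
    """Remember when each request started and list the ones running too long."""

    def __init__(self, threshold_seconds=7.0):
        self.threshold = threshold_seconds
        self._lock = threading.Lock()
        self._started = {}   # request id -> (description, start time)

    def begin(self, request_id, description, now=None):
        now = time.time() if now is None else now
        with self._lock:
            self._started[request_id] = (description, now)

    def end(self, request_id):
        with self._lock:
            self._started.pop(request_id, None)

    def overdue(self, now=None):
        """Return (request id, description, age in seconds), slowest first."""
        now = time.time() if now is None else now
        with self._lock:
            items = list(self._started.items())
        slow = [(rid, desc, now - start)
                for rid, (desc, start) in items
                if now - start > self.threshold]
        return sorted(slow, key=lambda entry: entry[2], reverse=True)


if __name__ == "__main__":
    tracker = SlowRequestTracker()
    tracker.begin("req-1", "hypothetical slow query", now=100.0)
    tracker.begin("req-2", "hypothetical fast query", now=105.0)
    for request_id, description, age in tracker.overdue(now=108.0):
        print("%s (%s) has been running for %.1f s" % (request_id, description, age))
```

Wrapping each database call in begin()/end() and polling overdue() from a monitoring thread would at least show which queries are outstanding when Zope stops responding, which narrows down whether the database or the adapter is the one that is stuck.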
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
On Wed, 26 May 2004 08:09:53 +0200, Dario Lopez-Kästen <[EMAIL PROTECTED]> wrote:
> Heimo Laukkanen wrote:
>> Hi, I am hitting my head against the wall - and witnessing strange
>> behaviour, where after a time most of the load focuses on only one
>> thread. I am not sure what the cause of this is, but so far we have
>> seen different things before the load has concentrated on one single
>> thread.
>
> We also have this behaviour, but I have thought that it was our
> overuse of DCOracle2 connections that was the culprit, along with our
> bad TTW code.
>
> Do you use any DA, and have long running SQL queries?

Yes, extensively. We use DCOracle2, which had to be patched to work
with UTF-8. And we combine it with badly scripted code too. We started
to build a prototype using our own Archetypes storage layer that stores
data from content objects in the Oracle database in a certain format -
so every access to an object actually creates a load of queries to the
Oracle database. Connections to the database are handled by the Oracle
adapter in Zope, so we have not made any connections of our own but
rather just run queries through the Zope adapter.

> We have no fix for this, but have resigned ourselves to doing restarts
> every now and then. We use Zope 2.5.1, Redhat 7.3

How did you debug or pinpoint the culprit to be DCOracle connection?
Have you tried cx_Oracle?

--
-huima
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Heimo Laukkanen wrote:
> Hi, I am hitting my head against the wall - and witnessing strange
> behaviour, where after a time most of the load focuses on only one
> thread. I am not sure what the cause of this is, but so far we have
> seen different things before the load has concentrated on one single
> thread.

We also have this behaviour, but I have thought that it was our overuse
of DCOracle2 connections that was the culprit, along with our bad TTW
code.

Do you use any DA, and have long running SQL queries?

We have no fix for this, but have resigned ourselves to doing restarts
every now and then. We use Zope 2.5.1, Redhat 7.3

/dario

--
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
On Tue, 25 May 2004 20:21:21 +0200, Dieter Maurer <[EMAIL PROTECTED]> wrote:
> I would be surprised...
>
> What does "Control_Panel --> Debug information" tell you
> about the use of your connections (at the bottom of the page)?

At the moment it said that only one connection was opened and the others
were none. I have no terminal access to the machines at the moment to
check how the threads are behaving.

However, I did try to look at the processes with gdb. I need to do more
reading of the gdb manual to provide more info beyond just a backtrace
of the process status. I'm not familiar with Python internals, so any
pointers on what to expect or what not to expect would be appreciated.

Below is the first output from

    ps -auxfww | grep portaali

where portaali is the name of the user who owns the processes.

Below we will see that each thread has started at the same time - but
the thread with pid 14553 has done most of the work. And while putting
on more load and looking at top, that is also the only thread that gets
any percentage of process time. I have no information on when this
happened, since the event log does not tell anything peculiar.

portaali 14530  0.0  0.2   6116   4432 ?  S  12:12   0:00 /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/zdaemon/zdrun.py
    -S /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/zopeschema.xml
    -b 10 -d -f -s /usr/local/portaali/zope_instance/var/zopectlsock -x 0,2
    -z /usr/local/portaali/zope_instance /usr/local/portaali/zope_instance/bin/runzope
portaali 14531  0.2 16.6 365220 345168 ?  S  12:12   0:46  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14532  0.0 16.6 365220 345168 ?  S  12:12   0:00  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14553 13.0 16.6 365220 345168 ?  S  12:12  34:40  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14552  0.1 16.6 365220 345168 ?  S  12:12   0:18  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14548  0.8 16.6 365220 345168 ?  S  12:12   2:16  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14545  0.1 16.6 365220 345168 ?  S  12:12   0:17  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf
portaali 14547  0.2 16.6 365220 345168 ?  S  12:12   0:36  \_ /opt/portaali/ContentManagement-1.0/python-2.3.3/bin/python
    /opt/portaali/ContentManagement-1.0/zope-2.7.0/lib/python/Zope/Startup/run.py
    -C /usr/local/portaali/zope_instance/etc/zope.conf

And then backtrace with gdb from those quiet threads

~# gdb program 14552
backtrace
#0  0x4007b87e in sigsuspend () from /lib/libc.so.6
#1  0x4001e879 in __pthread_wait_for_restart_signal () from /lib/libpthread.so.0
#2  0x4001fee1 in sem_wait@@GLIBC_2.1 () from /lib/libpthread.so.0
#3  0x080c4830 in PyThread_acquire_lock (lock=0x861a908, waitflag=1)
    at Python/thread_pthread.h:406
#4  0x080c76c8 in lock_PyThread_acquire_lock (self=0x417c4840, args=0x4017002c)
    at ./Modules/threadmodule.c:63
#5  0x080e0e3f in PyCFunction_Call (func=0x41c7c7cc, arg=0x4017002c, kw=0x0)
    at Objects/methodobject.c:73
#6  0x080a3beb in call_function (pp_stack=0xbe7fb904, oparg=0)
    at Python/ceval.c:3439
#7  0x080a22f6 in eval_frame (f=0x9b5ea24) at Python/ceval.c:2116
#8  0x080a31e2 in PyEval_EvalCodeEx (co=0x40442c20, globals=0x40445acc,
    locals=0x0, args=0x8bdb024, argcount=1, kws=0x8bdb028, kwcount=1,
    defs=0x4047eb18, defcount=5, closure=0x0) at Python/ceval.c:2663
#9  0x080a55ec in fast_function (func=0x404a9924, pp_stack=0xbe7fba94,
    n=3, na=1, nk=1) at Python/ceval.c:3529
#10 0x080a3c71 in call_function (pp_stack=0xbe7fba94, oparg=256)
    at Python/ceval.c:3458

Another thread ( 14548 )

#0  0x401153c4 in read () from /lib/libc.so.6
#1  0x40029ae0 in __DTOR_END__ () from /lib/libpthread.so.0
#2  0x412474f0 in nttrd () from /opt/portaali/comp/oracle/lib/libclntsh.so.9.0
#3  0x410fdcf8 in nsprecv () from /opt/portaali/comp/oracle/lib/libclntsh.so.9.0
#4  0x41101ac5 in nsrdr () from /opt/portaali/comp/oracle/lib/libclntsh.so.9.0
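To put a number on "pid 14553 has done most of the work", the cumulative CPU times from the ps listing can be turned into shares. This is only a worked-arithmetic sketch: the function names are invented, the TIME values are copied from the listing (the zdrun.py parent, PID 14530, is left out), and it assumes the column is in minutes:seconds form as it appears there.

```python
def parse_cpu_time(text):
    """Convert a ps TIME value such as "34:40" (minutes:seconds) to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


# PID -> cumulative CPU time, copied from the ps output in this message.
CPU_TIMES = {
    "14531": "0:46",
    "14532": "0:00",
    "14553": "34:40",
    "14552": "0:18",
    "14548": "2:16",
    "14545": "0:17",
    "14547": "0:36",
}


def cpu_shares(times):
    """Return each PID's fraction of the total CPU time consumed."""
    seconds = dict((pid, parse_cpu_time(value)) for pid, value in times.items())
    total = float(sum(seconds.values()))
    if total == 0:
        return dict((pid, 0.0) for pid in seconds)
    return dict((pid, spent / total) for pid, spent in seconds.items())


if __name__ == "__main__":
    shares = cpu_shares(CPU_TIMES)
    for pid in sorted(shares, key=shares.get, reverse=True):
        print("%s %5.1f%%" % (pid, 100.0 * shares[pid]))
```

With these figures, 14553 accounts for 2080 of 2333 CPU seconds, roughly 89% of the total, while the threads seen waiting on Oracle have each used well under three minutes.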
Re: [Zope-dev] Zope threads on 2.7.0, python 2.3.3
Heimo Laukkanen wrote at 2004-5-25 14:54 +0300:
> ...
>witnessing strange behaviour,
>where after a time most of the load focuses on only one thread.
> ...
>After reading thread:
>http://www.gossamer-threads.com/lists/zope/dev/24230?page=last
>
>I started to wonder whether that signalling behaviour is also a cause
>for our problems - since we are running Debian linux without NPTL (
>Linux 2.4.26 #1 SMP Thu Apr 22 11:16:14 EEST 2004 i686 unknown ).

I would be surprised...

What does "Control_Panel --> Debug information" tell you
about the use of your connections (at the bottom of the page)?

--
Dieter