Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
On Thursday 14 April 2005 20:23, Tim Peters wrote:

> The size of the objects in the database has little to do with memory
> consumed by a FileStorage pack; it's more the number of distinct object
> revisions at work, since an in-memory object reachability graph is
> constructed. I'm not sure how DirectoryStorage could perform packing
> without constructing a similar reachability graph (Toby?).

Both storages *traverse* the object reachability graph, keeping a record of which oids are reachable. They both keep a traversal to-do list in memory, which is sized proportional to the height of the reachability graph. They differ in how they record which oids are reachable.

FileStorage uses an fsIndex instance, which stores everything in memory (in a memory-efficient manner).

The default implementation in DirectoryStorage uses a bit in the file permissions to mark reached objects. The I/O cost of this is the main reason for DirectoryStorage's relative slowness in packing.

There is an alternative implementation in DirectoryStorage which creates a second temporary ZODB to hold an OIBTree to store the list of reachable objects. This also has a fixed memory cost and performs better than the standard permissions bit implementation. One big disadvantage last time I looked was memory leaks when creating and destroying ZODB.DB objects - but I think Tim and Jeremy have since addressed that.

> The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
> slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
> regretted the then-last round of packing changes, which favored reducing RAM
> usage at the cost of increasing runtime. That appears to have been a wrong
> tradeoff for most modern boxes.

Interesting. DirectoryStorage can use an all-in-memory implementation too. Anyone with a big storage fancy trying it?

> Toby, I know (or think I know) that DirectoryStorage won't commit a
> transaction containing dangling references. I think that's great, and I'd
> like (if possible) to introduce such a check at a higher level, so that all
> storages would benefit.

There are races in this dangling reference detection. I guess that's OK since it is only there to warn about a bug in a higher layer.

> Does DirectoryStorage do something beyond that
> check specifically aimed at preventing POSKeyErrors?

There are numerous corner cases that can lead to objects incorrectly appearing to be unreachable during packing. I describe one here:
http://mail.zope.org/pipermail/zodb-dev/2002-May/002601.html

DirectoryStorage takes two precautions to reduce the chances of being bitten by this class of problem:

a. Ensuring that the pack threshold time leaves sufficient margin of safety. storage.pack(one day ago) is fine. storage.pack(zero days ago) is silently converted to storage.pack(10 minutes ago).

b. Both storages keep all objects that are reachable from a sufficiently recent version of the root object. DirectoryStorage will also keep objects that have been modified in any sufficiently recent transaction even if they do not appear to be reachable. (This set is almost always empty, unless we have hit a corner case. Objects almost always have to be reachable in order to get modified.)

-- 
Toby Dickenson
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list - ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
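
A minimal Python sketch of the packing traversal and the two precautions described in the message above. It is not code from FileStorage or DirectoryStorage: the dict-based graph, the function names and the `recently_modified` argument are all hypothetical, and treating recently modified oids as extra traversal roots is an assumption made here for illustration. The point it tries to show is that the traversal only needs a to-do list, while the record of reached oids sits behind two callbacks, so it could equally be an in-memory set, a permission bit on a file, or a BTree in a temporary database.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical stand-in for a storage: each oid maps to the oids it references.
Graph = Dict[int, List[int]]

# The "10 minutes ago" floor mentioned in the message.
SAFETY_MARGIN_SECONDS = 10 * 60


def effective_pack_time(requested: float, now: float) -> float:
    """Clamp a pack threshold so it is never more recent than now - margin."""
    return min(requested, now - SAFETY_MARGIN_SECONDS)


def mark_reachable(graph: Graph,
                   roots: Iterable[int],
                   is_marked: Callable[[int], bool],
                   mark: Callable[[int], None]) -> None:
    """Walk the graph from the roots and record every oid that is reached.

    Only the to-do list is held here. Where the marks are stored is
    decided entirely by the two callbacks.
    """
    todo = list(roots)
    while todo:
        oid = todo.pop()
        if is_marked(oid):
            continue  # already visited, possibly via a cycle
        mark(oid)
        todo.extend(ref for ref in graph.get(oid, ()) if not is_marked(ref))


def oids_to_keep(graph: Graph, root: int,
                 recently_modified: Iterable[int] = ()) -> Set[int]:
    """Return the oids a pack should keep, using an in-memory set as the record.

    Assumption: recently modified oids are treated as additional roots, so
    that anything they reference is kept as well.
    """
    reached: Set[int] = set()
    mark_reachable(graph, [root, *recently_modified],
                   reached.__contains__, reached.add)
    return reached
```

In this sketch an object that looks unreachable from the root but was modified recently still survives the pack, along with whatever it references, which is the safety net precaution (b) describes.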
RE: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
[Chris Withers]
>> Out of interest, why are you using DirectoryStorage?

[Dario Lopez-Kästen]
> I chose it for several reasons:

I don't want to talk you out of it, but since this is a general list I feel compelled to respond to these points wrt current FileStorage. You're using a by-now very old Zope (2.6.2), and may not be aware of the info at:

http://zope.org/Wikis/ZODB/FileStorageBackup

> 1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip,
> tar-balls, etc) in this particular application (it's a student portal,
> course admin portal and an LMS). While we are not yet in the
> multigigabyte realm, we are storing archive copies of all the previous
> year's materials, which will eventually grow to be a lot of stuff.

If I understand correctly, DirectoryStorage and FileStorage both store this stuff in giant pickles -- and then there's no cause for "large" total size difference I'm aware of. The storage comparison matrix at

http://cvs.zope.org/ZODB3/Doc/storages.html?rev=1

says DirectoryStorage requires "Roughly 30% more [disk] space than Data.fs", not less disk space. Indeed, it's hard to imagine any non-compressing scheme that could require less total disk space than FileStorage.

> 2) There is the issue of huge Data.fs files and making daily backups. We
> need to have incremental backups

See the link above: repozo.py supports incremental Data.fs backup, taking (using -Q) time roughly proportional to the increase in Data.fs size since the most recent backup. It goes fast!

> 3) HA - while DirStor is not a HA-tool per se, it provides the necessary
> tools for building something that provides some aspects of HA, ie. the
> replication features, etc.

Unsure what "HA" means to you. "High availability", perhaps? ZRS is available for FileStorage, but it's admittedly not free:

http://www.zope.com/Products/ZRS.html

> 4) Maintenance. While I have not yet dared to pack the DB, the mere size
> of the database will make packing a non-trivial operation memorywise in
> FileStorage. DirStor does not have the same memory requirements when
> packing.

The size of the objects in the database has little to do with memory consumed by a FileStorage pack; it's more the number of distinct object revisions at work, since an in-memory object reachability graph is constructed. I'm not sure how DirectoryStorage could perform packing without constructing a similar reachability graph (Toby?).

The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and regretted the then-last round of packing changes, which favored reducing RAM usage at the cost of increasing runtime. That appears to have been a wrong tradeoff for most modern boxes. Then again, data storages are growing ever bigger too. It's very nice that DirectoryStorage's direct RAM consumption is independent of the number of objects.

> 5) POSKeyErrors. We were getting quite a few of those, and that scared
> me. with DirStor, I do not see them as much as before.

Do you see _any_? FWIW, several nasty causes (bugs in ZODB and Zope) for POSKeyErrors have been fixed since Zope 2.6.2, and reports of POSKeyErrors from current Zope/ZODB installations are conspicuous by absence.

Toby, I know (or think I know) that DirectoryStorage won't commit a transaction containing dangling references. I think that's great, and I'd like (if possible) to introduce such a check at a higher level, so that all storages would benefit. Does DirectoryStorage do something beyond that check specifically aimed at preventing POSKeyErrors?

...
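
The reason an incremental Data.fs backup can take time proportional to the growth of the file is that a FileStorage file only grows by appending between packs, so a backup needs just the bytes past the point the previous backup reached. A minimal Python sketch of that idea follows. It is not repozo's code and does not reproduce its repository layout or options: `incremental_copy`, its arguments and the single destination file are hypothetical, and refusing to continue when the source has shrunk is a simplification chosen for this sketch.

```python
import os

CHUNK_SIZE = 64 * 1024


def incremental_copy(src_path: str, dest_path: str, already_saved: int) -> int:
    """Append the bytes of src beyond `already_saved` to dest.

    Returns the new offset, to be passed in as `already_saved` next time.
    Assumes src is append-only between calls.
    """
    size = os.path.getsize(src_path)
    if size < already_saved:
        # The file shrank (for example after a pack), so the old offset is
        # meaningless and a full backup would be needed instead.
        raise ValueError("source is smaller than the last backup offset")

    remaining = size - already_saved
    with open(src_path, "rb") as src, open(dest_path, "ab") as dest:
        src.seek(already_saved)
        while remaining > 0:
            chunk = src.read(min(CHUNK_SIZE, remaining))
            if not chunk:
                break  # file was truncated while we were reading
            dest.write(chunk)
            remaining -= len(chunk)
    return size - remaining
```

The work done per call depends only on how much was appended since the previous call, not on the total size of the file.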
Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
On Thu, Apr 14, 2005 at 09:38:39AM +0200, Dario Lopez-Kästen wrote:
> 4) Maintenance. While I have not yet dared to pack the DB, the mere size
> of the database will make packing a non-trivial operation memorywise in
> FileStorage. DirStor does not have the same memory requirements when
> packing.

but OTOH, it can take a long time to pack. I have a weekly cron job that packs my ~3 GB DirectoryStorages and they frequently take well over an hour each. Full backups take roughly an hour too. Incremental backups are pretty quick, e.g. last night's 91 M backup took 1 minute.

I'm a big fan of DirectoryStorage, but in terms of raw speed it's slower in some ways. Packing is memory-friendly but slow. (If

Also, you mention issues with incremental backup of Data.fs. Did you ever try repozo.py? Just mentioning it for completeness.

> 5) POSKeyErrors. We were getting quite a few of those, and that scared
> me. with DirStor, I do not see them as much as before.

I don't see them ever :-) That was the big motivator for me.

-- 
Paul Winkler
http://www.slinkp.com
Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
On Thursday 14 April 2005 08:38, Dario Lopez-Kästen wrote:
> Chris Withers wrote:
> > You sure you're using the latest
> > DirectoryStorage? Can you reproduce this using just plain FileStorage?
>
> No, I am not using the latest, because of my *n*l sysadmin (actually
> there are two of them, but only one whines :-). I'll try beating him in
> the head a few times, er, I mean, "discuss the issue with him" to make
> him change his mind.

If it helps, I'm pretty sure that no deadlocking bugs have been fixed in DirectoryStorage since version 1.1.4.

> 5) POSKeyErrors. We were getting quite a few of those, and that
> scared me. with DirStor, I do not see them as much as before.

I would be interested to see if you are getting *any* unexpected POSKeyErrors with DirectoryStorage.

-- 
Toby Dickenson
Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
Chris Withers wrote:
> Dario Lopez-Kästen wrote:
>> I am in need of some help. We are using Zope 2.6.2, DBTab on the clients
>> (4 of them on 2 servers) and Directory storage on the ZEO side.
>
> Out of interest, why are you using DirectoryStorage?

I chose it for several reasons:

1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip, tar-balls, etc) in this particular application (it's a student portal, course admin portal and an LMS). While we are not yet in the multigigabyte realm, we are storing archive copies of all the previous year's materials, which will eventually grow to be a lot of stuff.

2) There is the issue of huge Data.fs files and making daily backups. We need to have incremental backups

3) HA - while DirStor is not a HA-tool per se, it provides the necessary tools for building something that provides some aspects of HA, ie. the replication features, etc.

4) Maintenance. While I have not yet dared to pack the DB, the mere size of the database will make packing a non-trivial operation memorywise in FileStorage. DirStor does not have the same memory requirements when packing.

5) POSKeyErrors. We were getting quite a few of those, and that scared me. with DirStor, I do not see them as much as before.

> Well, ZEO storage server is single threaded, so I guess something could
> lock there causing the other clients to wait infinitely for the storage
> server. Never heard of it though. You sure you're using the latest
> DirectoryStorage? Can you reproduce this using just plain FileStorage?

No, I am not using the latest, because of my *n*l sysadmin (actually there are two of them, but only one whines :-). I'll try beating him in the head a few times, er, I mean, "discuss the issue with him" to make him change his mind.

For practical reasons, there is no way I can replicate this behaviour with FileStorage. That would entail taking the system down for a few days and then taking it back up again after we've moved all the contents to FileStorage.

Today I discovered that the problem may not be in the Zope layer but in the Oracle layer, not because DCOracle or Oracle suck, but because we share our instance with another app that is known for its bad programming style.

And our app is not the most brilliant piece of code either. Several parts of it have not been touched since before the introduction of Script(Python) in Zope, and back then we were all Zope newbies using DTML for everything. So there may be DB congestion issues at the core of it all.

The reason I sent the mail is that this is something that has been happening all of a sudden for three weeks now.

I'll report back when I get a full report from the DBA guys and see if there is any change in usage pattern on the DB level.

thanks,

/dario

-- 
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
"...and click? damn, I need to kill -9 Word again..." - b using macosx
Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
Dario Lopez-Kästen wrote:
> I am in need of some help. We are using Zope 2.6.2, DBTab on the clients
> (4 of them on 2 servers) and Directory storage on the ZEO side.

Out of interest, why are you using DirectoryStorage?

> here is where the weirdness starts. It seems that ALL the clients stop
> responding. I verified this by accident when I restarted the two nodes
> of the first server. Suddenly all four nodes started responding (even
> the two that were not restarted) and the ZEO-serverlog started to log
> events again. This leads me to wonder if there indeed is some
> interaction between the ZEO-clients that we need to be aware of.

Well, ZEO storage server is single threaded, so I guess something could lock there causing the other clients to wait infinitely for the storage server. Never heard of it though. You sure you're using the latest DirectoryStorage? Can you reproduce this using just plain FileStorage?

cheers,

Chris

-- 
Simplistix - Content Management, Zope & Python Consulting
           - http://www.simplistix.co.uk
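
To make the guess above concrete: if a server handles requests strictly one at a time, a single request that never finishes delays every request queued behind it, whichever client sent it. The Python sketch below is a toy model of that effect only. It does not use or describe ZEO's actual internals, and `completion_times`, the tuple layout and the one-request-per-client simplification are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

# (client name, arrival time, service time), all times in seconds.
Request = Tuple[str, float, float]


def completion_times(requests: List[Request]) -> Dict[str, float]:
    """Finish time of each client's request on a one-at-a-time FIFO server.

    Assumes one request per client. A service time of math.inf models a
    request that never returns.
    """
    clock = 0.0
    finished: Dict[str, float] = {}
    for client, arrival, service in sorted(requests, key=lambda r: r[1]):
        start = max(clock, arrival)  # wait until the server is free
        clock = start + service
        finished[client] = clock
    return finished


healthy = completion_times([("a", 0, 1), ("b", 1, 1), ("c", 2, 1)])
stuck = completion_times([("a", 0, 1), ("b", 1, math.inf), ("c", 2, 1)])
```

In the `stuck` case client "c" never completes even though its own request is cheap, which would look from the outside like all clients hanging at once.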
Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
On Tue, Apr 12, 2005 at 02:10:14PM +0200, Dario Lopez-Kästen wrote:
> Yesterday afternoon I restarted the ZEO server, but today we just now had
> our first hang again. The logs of the clients state:
>
> ERROR(200) ZServer uncaptured python exception, closing channel
> 0x1a310774 channel#: 0 requests:> (socket.error:(104, 'Connection reset by peer')
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asynchat.py|initiate_send|213]
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/http_server.py|send|417]
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asyncore.py|send|338])
>
> but it "seems" to fix itself because there are other log entries after
> that.

Ignore those. That just means somebody or something (probably a virus or a script kiddie) sent you a malformed request, probably scanning for known vulnerabilities to exploit. ZServer just logs the problem and ignores the request. If you get a lot of those you might consider an http sanitizer in front of your Zope. Pound might do.

As for the rest of your troubles... never seen that myself, afraid I have no idea.

-- 
Paul Winkler
http://www.slinkp.com
[ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?
Hello,

I am in need of some help. We are using Zope 2.6.2, DBTab on the clients (4 of them on 2 servers) and Directory storage on the ZEO side.

Almost 3 weeks ago we suddenly started experiencing intermittent server hangs. Since then we have had server hangs at least 2 times per day. Naturally users are non-happy.

Yesterday afternoon I restarted the ZEO server, but today we just now had our first hang again. The logs of the clients state:

ERROR(200) ZServer uncaptured python exception, closing channel uests:> (socket.error:(104, 'Connection reset by peer')
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asynchat.py|initiate_send|213]
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/http_server.py|send|417]
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asyncore.py|send|338])

but it "seems" to fix itself because there are other log entries after that.

After an hour or two the clients stop responding, however, and here is where the weirdness starts. It seems that ALL the clients stop responding. I verified this by accident when I restarted the two nodes of the first server. Suddenly all four nodes started responding (even the two that were not restarted) and the ZEO-serverlog started to log events again.

This leads me to wonder if there indeed is some interaction between the ZEO-clients that we need to be aware of.

Any response would be appreciated.

Sincerely,

/dario

-- 
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
"...and click? damn, I need to kill -9 Word again..." - b using macosx