Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-15 Thread Toby Dickenson
On Thursday 14 April 2005 20:23, Tim Peters wrote:

> The size of the objects in the database has little to do with memory
> consumed by a FileStorage pack; it's more the number of distinct object
> revisions at work, since an in-memory object reachability graph is
> constructed.  I'm not sure how DirectoryStorage could perform packing
> without constructing a similar reachability graph (Toby?).

Both storages *traverse* the object reachability graph, keeping a record of 
which oids are reachable. They both keep a traversal to-do list in memory, 
which is sized proportional to the height of the reachability graph.

They differ in how they record which oids are reachable. FileStorage uses an 
fsIndex instance, which stores everything in memory (in a memory-efficient 
manner). The default implementation in DirectoryStorages uses a bit in the 
file permissions to mark reached objects. The I/O cost of this is the main 
reason for DirectoryStorage's relative slowness in packing.
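
A minimal sketch of the traversal described above, assuming a plain dict in place of real object records. It is not the actual FileStorage or DirectoryStorage code; `traverse_reachable`, `references`, `mark` and `is_marked` are hypothetical names. The point is that the to-do list is shared, while the way reached oids get recorded is pluggable:

```python
def traverse_reachable(root_oid, references, mark, is_marked):
    """Walk the reachability graph starting from root_oid.

    references: dict mapping oid -> iterable of oids it refers to
                (a stand-in for unpickling an object record).
    mark / is_marked: how reached oids are recorded. This is where the
                two storages differ (in-memory index vs. on-disk flag).
    """
    todo = [root_oid]  # traversal to-do list, kept in memory
    while todo:
        oid = todo.pop()
        if is_marked(oid):
            continue
        mark(oid)
        todo.extend(references.get(oid, ()))


# In-memory recording, in the spirit of FileStorage's fsIndex.
reached = set()
graph = {"root": ["a", "b"], "a": ["c"], "b": ["c"], "c": [], "orphan": ["c"]}
traverse_reachable("root", graph, reached.add, reached.__contains__)
garbage = set(graph) - reached  # candidates for removal by the pack
```

Swapping `reached.add` for a function that flips a flag on disk gives the fixed-memory, I/O-heavy variant.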

There is an alternative implementation in DirectoryStorage which creates a 
second temporary ZODB to hold an OIBTree to store the list of reachable 
objects. This also has a fixed memory cost and performs better than the 
standard permissions bit implementation. One big disadvantage last time I 
looked was memory leaks when creating and destroying ZODB.DB objects - but I 
think Tim and Jeremy have since addressed that.

> The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
> slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
> regretted the then-last round of packing changes, which favored reducing RAM
> usage at the cost of increasing runtime.  That appears to have been a wrong
> tradeoff for most modern boxes.

Interesting. DirectoryStorage can use an all-in-memory implementation too. 
Anyone with a big storage fancy trying it?

> Toby, I know (or think I know) that DirectoryStorage won't commit a
> transaction containing dangling references.  I think that's great, and I'd
> like (if possible) to introduce such a check at a higher level, so that all
> storages would benefit. 

There are races in this dangling reference detection. I guess that's OK since 
it is only there to warn about a bug in a higher layer.
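
For illustration only, a dangling reference check at commit time could look roughly like this. The data structures are hypothetical (plain dicts and sets rather than anything from DirectoryStorage), and `find_dangling` is a made-up name:

```python
def find_dangling(txn_records, stored_oids):
    """Return references in a pending transaction that point nowhere.

    txn_records: dict mapping each oid written by the transaction to the
                 oids its new state refers to (hypothetical structure).
    stored_oids: set of oids already committed to the storage.
    """
    known = stored_oids | set(txn_records)
    dangling = {}
    for oid, refs in txn_records.items():
        missing = [r for r in refs if r not in known]
        if missing:
            dangling[oid] = missing
    return dangling


stored = {"root", "a"}
ok_txn = {"a": ["b"], "b": []}   # "b" is created in the same transaction
bad_txn = {"a": ["ghost"]}       # "ghost" exists nowhere
```

In this sketch `stored_oids` is a snapshot, so anything that changes the set of stored oids between the check and the commit would go unnoticed. That is one way such a check can race.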

> Does DirectoryStorage do something beyond that 
> check specifically aimed at preventing POSKeyErrors?  

There are numerous corner cases that can lead to objects incorrectly appearing 
to be unreachable during packing. I describe one here:
http://mail.zope.org/pipermail/zodb-dev/2002-May/002601.html
DirectoryStorage takes two precautions to reduce the chances of being bitten 
by this class of problem:

a. Ensuring that the pack threshold time leaves sufficient margin of safety. 
storage.pack(one day ago) is fine.
storage.pack(zero days ago) is silently converted to
storage.pack(10 minutes ago)

b. Both storages keep all objects that are reachable from a sufficiently 
recent version of the root object. DirectoryStorage will also keep objects 
that have been modified in any sufficiently recent transaction even if they 
do not appear to be reachable. (This set is almost always empty, unless we 
have hit a corner case. Objects almost always have to be reachable in order 
to get modified.)
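
The two precautions can be sketched as follows. This is an assumption-laden illustration, not DirectoryStorage code: the function names are hypothetical, and "sufficiently recent" is taken here to mean "modified after the pack threshold time":

```python
import time

SAFETY_MARGIN = 10 * 60  # seconds; the "10 minutes" from precaution (a)


def effective_pack_time(requested, now=None, margin=SAFETY_MARGIN):
    """Clamp a requested pack threshold so it is at least `margin` old."""
    if now is None:
        now = time.time()
    return min(requested, now - margin)


def oids_to_keep(reachable, modified_at, pack_time):
    """Precaution (b): keep reachable oids plus any oid modified after
    pack_time, even if it looks unreachable.

    modified_at: dict mapping oid -> last modification timestamp
                 (hypothetical stand-in for the storage's own records).
    """
    recent = {oid for oid, t in modified_at.items() if t > pack_time}
    return set(reachable) | recent
```

With these definitions, a request to pack "zero days ago" is silently moved back by ten minutes, while a request to pack "one day ago" is left alone.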


-- 
Toby Dickenson
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


RE: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-14 Thread Tim Peters
[Chris Withers]
>> Out of interest, why are you using DirectoryStorage?

[Dario Lopez-Kästen]
> I chose it for several reasons:

I don't want to talk you out of it, but since this is a general list I feel
compelled to respond to these points wrt current FileStorage.  You're
using a by-now very old Zope (2.6.2), and may not be aware of the info at:

http://zope.org/Wikis/ZODB/FileStorageBackup

> 1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip,
> tar-balls, etc) in this particular application (it's a student portal,
> course admin portal and an LMS). While we are not yet in the
> multigigabyte realm, we are storing archive copies of all the previous
> year's materials, which will eventually grow to be a lot of stuff.

If I understand correctly, DirectoryStorage and FileStorage both store this
stuff in giant pickles -- and then there's no cause for "large" total size
difference I'm aware of.  The storage comparison matrix at

http://cvs.zope.org/ZODB3/Doc/storages.html?rev=1

says DirectoryStorage requires "Roughly 30% more [disk] space than Data.fs",
not less disk space.  Indeed, it's hard to imagine any non-compressing
scheme that could require less total disk space than FileStorage.

> 2) There is the issue of huge Data.fs files and making daily backups. We
> need to have incremental backups

See the link above:  repozo.py supports incremental Data.fs backup, taking
(using -Q) time roughly proportional to the increase in Data.fs size since
the most recent backup.  It goes fast!
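
As a rough sketch, a nightly incremental backup could be driven from Python like this. The repository and Data.fs paths are made up, `repozo.py` is assumed to be on the PATH, and the helper names are hypothetical. The flags used are repozo's own: -B (backup), -r (repository directory), -f (Data.fs to back up), -Q (quick) and -F (force a full backup):

```python
import subprocess


def repozo_backup_argv(repository, data_fs, quick=True, full=False):
    """Build a repozo.py command line for a backup run."""
    argv = ["repozo.py", "-B", "-r", repository, "-f", data_fs]
    if quick:
        argv.append("-Q")  # only checksum the tail of the last backup
    if full:
        argv.append("-F")  # force a full backup instead of an incremental
    return argv


def run_backup(repository, data_fs):
    # Assumes repozo.py is on PATH; adjust for your installation.
    subprocess.check_call(repozo_backup_argv(repository, data_fs))


# Hypothetical paths, for illustration only.
nightly = repozo_backup_argv("/backups/zodb", "/var/zope/Data.fs")
```

Calling `run_backup` from a cron job would then produce an incremental backup whenever a full one already exists in the repository.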

> 3) HA - while DirStor is not a HA-tool per se, it provides the necessary
> tools for building something that provide some aspects of HA, ie. the
> replication features, etc.

Unsure what "HA" means to you.  "High availability", perhaps?  ZRS is
available for FileStorage, but it's admittedly not free:

http://www.zope.com/Products/ZRS.html
 
> 4) Maintenance. While I have not yet dared to pack the DB, the mere size
> of the database will make packing a non-trivial operation memorywise in
> FileStorage. DirStor does not have the same memory requirements when
> packing.

The size of the objects in the database has little to do with memory
consumed by a FileStorage pack; it's more the number of distinct object
revisions at work, since an in-memory object reachability graph is
constructed.  I'm not sure how DirectoryStorage could perform packing
without constructing a similar reachability graph (Toby?).

The last time Jeremy and I watched a pack work on a 20GB Data.fs, on a very
slow Solaris box, we noticed that it was only taking 10-20% of the RAM, and
regretted the then-last round of packing changes, which favored reducing RAM
usage at the cost of increasing runtime.  That appears to have been a wrong
tradeoff for most modern boxes.

Then again, data storages are growing ever bigger too.  It's very nice that
DirectoryStorage's direct RAM consumption is independent of the number of
objects.

> 5) POSKeyErrors. We were getting quite a few of those, and that scared
> me. With DirStor, I do not see them as much as before.

Do you see _any_?

FWIW, several nasty causes (bugs in ZODB and Zope) for POSKeyErrors have
been fixed since Zope 2.6.2, and reports of POSKeyErrors from current
Zope/ZODB installations are conspicuous by absence.

Toby, I know (or think I know) that DirectoryStorage won't commit a
transaction containing dangling references.  I think that's great, and I'd
like (if possible) to introduce such a check at a higher level, so that all
storages would benefit.  Does DirectoryStorage do something beyond that
check specifically aimed at preventing POSKeyErrors?  

...



Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-14 Thread Paul Winkler
On Thu, Apr 14, 2005 at 09:38:39AM +0200, Dario Lopez-Kästen wrote:
> 4) Maintenance. While I have not yet dared to pack the DB, the mere size 
> of the database will make packing a non-trivial operation memorywise in 
> FileStorage. DirStor does not have the same memory requirements when 
> packing.

But OTOH, it can take a long time to pack.
I have a weekly cron job that packs my ~3 GB DirectoryStorages and 
they frequently take well over an hour each.  

Full backups take roughly an hour too.  Incremental backups are pretty
quick, e.g. last night's 91 M backup took 1 minute.

I'm a big fan of DirectoryStorage, but in terms of raw speed it's 
slower in some ways.  Packing is memory-friendly but slow.
(If 

Also you mention issues with incremental backup of Data.fs.
Did you ever try repozo.py? Just mentioning it for completeness.
 
> 5) POSKeyErrors. We were getting quite a few of those, and that scared 
> me. With DirStor, I do not see them as much as before.

I don't see them ever :-)
That was the big motivator for me.
 
-- 

Paul Winkler
http://www.slinkp.com


Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-14 Thread Toby Dickenson
On Thursday 14 April 2005 08:38, Dario Lopez-Kästen wrote:
> Chris Withers wrote:
> > You sure you're using the latest 
> > DirectoryStorage? Can you reproduce this using just plain FileStorage?
> 
> No, I am not using the latest, because of my *n*l sysadmin (actually 
> there are two of them, but only one whines :-). I'll try beating him in 
> the head a few times, er, I mean, "discuss the issue with him" to make 
> him change his mind.

If it helps, I'm pretty sure that no deadlocking bugs have been fixed in 
DirectoryStorage since version 1.1.4.

> 5) POSKeyErrors. We were getting quite a few of those, and that
> scared me. With DirStor, I do not see them as much as before.

I would be interested to see if you are getting *any* unexpected POSKeyErrors 
with DirectoryStorage.

-- 
Toby Dickenson


Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-14 Thread Dario Lopez-Kästen
Chris Withers wrote:
> Dario Lopez-Kästen wrote:
>> I am in need of some help. We are using Zope 2.6.2, DBTab on the 
>> clients (4 of them on 2 servers) and Directory storage on the ZEO side.
>
> Out of interest, why are you using DirectoryStorage?

I chose it for several reasons:

1) we are storing large amounts of binary files (PDF, Word, Matlab, Zip, 
tar-balls, etc) in this particular application (it's a student portal, 
course admin portal and an LMS). While we are not yet in the 
multigigabyte realm, we are storing archive copies of all the previous 
year's materials, which will eventually grow to be a lot of stuff.

2) There is the issue of huge Data.fs files and making daily backups. We 
need to have incremental backups

3) HA - while DirStor is not a HA-tool per se, it provides the necessary 
tools for building something that provide some aspects of HA, ie. the 
replication features, etc.

4) Maintenance. While I have not yet dared to pack the DB, the mere size 
of the database will make packing a non-trivial operation memorywise in 
FileStorage. DirStor does not have the same memory requirements when 
packing.

5) POSKeyErrors. We were getting quite a few of those, and that scared 
me. With DirStor, I do not see them as much as before.

> Well, ZEO storage server is single threaded, so I guess something could 
> lock there causing the other clients to wait infinitely for the storage 
> server. Never heard of it though. You sure you're using the latest 
> DirectoryStorage? Can you reproduce this using just plain FileStorage?

No, I am not using the latest, because of my *n*l sysadmin (actually 
there are two of them, but only one whines :-). I'll try beating him in 
the head a few times, er, I mean, "discuss the issue with him" to make 
him change his mind.

For practical reasons, there is no way I can replicate this behaviour 
with FileStorage. That would entail taking the system down for a few 
days and then take it back up again after we've moved all the contents 
to FileStorage.

Today I discovered that the problem may not be in the Zope layer but in 
the Oracle layer, not because DCOracle or Oracle suck, but because we 
share our instance with another app that is known for its bad 
programming style.

And our app is not the most brilliant piece of code either. Several 
parts of it have not been touched since before the introduction of 
Script(Python) in Zope, and back then we were all Zope newbies using 
DTML for everything.

So there may be DB congestion issues at the core of it all. The reason I 
sent the mail is that this is something that has been happening all of 
a sudden for three weeks now. I'll report back when I get a full report 
from the DBA guys and see if there is any change in usage pattern on the 
DB level.

thanks,
/dario
--
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
"...and click? damn, I need to kill -9 Word again..." - b using macosx


Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-13 Thread Chris Withers
Dario Lopez-Kästen wrote:
> I am in need of some help. We are using Zope 2.6.2, DBTab on the 
> clients (4 of them on 2 servers) and Directory storage on the ZEO side.

Out of interest, why are you using DirectoryStorage?

> here is where the weirdness starts. It seems that ALL the clients stop 
> responding. I verified this by accident when I restarted the two nodes 
> of the first server. Suddenly all four nodes started responding (even 
> the two that were not restarted) and the ZEO-serverlog started to log 
> events again.
>
> This leads me to wonder if there indeed is some interaction between the 
> ZEO-clients that we need to be aware of.

Well, ZEO storage server is single threaded, so I guess something could 
lock there causing the other clients to wait infinitely for the storage 
server. Never heard of it though. You sure you're using the latest 
DirectoryStorage? Can you reproduce this using just plain FileStorage?

cheers,
Chris
--
Simplistix - Content Management, Zope & Python Consulting
   - http://www.simplistix.co.uk


Re: [ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-12 Thread Paul Winkler
On Tue, Apr 12, 2005 at 02:10:14PM +0200, Dario Lopez-Kästen wrote:
> Yesterday afternoon I restarted the ZEO server, but today we just now had 
> our first hang again. The logs of the clients state:
> 
> ERROR(200) ZServer uncaptured python exception, closing channel 
>  0x1a310774 channel#: 0 requests:> (socket.error:(104, 'Connection reset by peer') 
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asynchat.py|initiate_send|213] 
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/http_server.py|send|417] 
> [/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asyncore.py|send|338])
> 
> but it "seems" to fix itself because there are other log entries after 
> that. 

Ignore those.  That just means somebody or something (probably 
a virus or a script kiddie) sent you a malformed request, probably
scanning for known vulnerabilities to exploit.
ZServer just logs the problem and ignores the request.

If you get a lot of those you might consider an http sanitizer
in front of your Zope.  Pound might do.

As for the rest of your troubles... never seen that myself,
afraid I have no idea. 

-- 

Paul Winkler
http://www.slinkp.com


[ZODB-Dev] Hanging ZEO-client hangs all other ZEO-clients?

2005-04-12 Thread Dario Lopez-Kästen
Hello,
I am in need of some help. We are using Zope 2.6.2, DBTab on the 
clients (4 of them on 2 servers) and Directory storage on the ZEO side.

Almost 3 weeks ago we suddenly started experiencing intermittent server 
hangs. Since then we have server hangs at least 2 times per day. 
Naturally users are non-happy.

Yesterday afternoon I restarted the ZEO server, but today we just now had 
our first hang again. The logs of the clients state:

ERROR(200) ZServer uncaptured python exception, closing channel 
 0x1a310774 channel#: 0 requests:> (socket.error:(104, 'Connection reset by peer') 
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asynchat.py|initiate_send|213] 
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/http_server.py|send|417] 
[/usr/local/zope/software/zope-2.6.2-TfDsZeo/ZServer/medusa/asyncore.py|send|338])

but it "seems" to fix itself because there are other log entries after 
that. After an hour or two the clients stop responding, however, and 
here is where the weirdness starts. It seems that ALL the clients stop 
responding. I verified this by accident when I restarted the two nodes 
of the first server. Suddenly all four nodes started responding (even 
the two that were not restarted) and the ZEO-serverlog started to log 
events again.

This leads me to wonder if there indeed is some interaction between the 
ZEO-clients that we need to be aware of.

Any response would be appreciated.
Sincerely,
/dario
--
-- ---
Dario Lopez-Kästen, IT Systems & Services Chalmers University of Tech.
"...and click? damn, I need to kill -9 Word again..." - b using macosx