Re: ZEO cache best practices, was Re: [ZODB-Dev] ZEO 3.2 (Zope 2.7) ->3.6 (Zope 2.9) upgrading: Much slower startup due to cache file creation

2006-04-19 Thread Tim Peters
...

[Tim Peters]
>> OTOH, the larger the ZEO cache file, the longer it may take startup or
>> reconnection cache verification to complete, so there's always some
>> reason not to do the obvious thing <0.3 wink>.

[Jim Fulton]
> This is only a problem if:
>
> 1. a persistent cache is used or
>
> 2. the client gets disconnected from the server, without
> restarting, long enough for the server to commit too many
> transactions for it to still be able to do quick
> verification.

IME, the most frequent (or maybe just the loudest ;-)) complaint was
neither of those:  it's when multiple ZEO clients all use very large
non-persistent ZEO client caches, and the ZEO _server_ goes down. 
When the ZEO server comes back up, it (of course) has lost all the
in-memory caching done by the previous incarnation of the server to
support quick verification, so all the clients "take forever" to do
full verification, and all the clients appear non-responsive for the
duration.
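
A toy model of the scenario above, as a minimal sketch in Python.  It
assumes the server keeps only a bounded, in-memory list of recent
invalidations and can do quick verification only when that list covers
everything the client missed.  The ToyServer class, its methods, and the
queue size are made up for illustration; none of this is ZEO's actual
code or API.

```python
from collections import deque


class ToyServer:
    """Simplified stand-in for a server that remembers recent invalidations."""

    def __init__(self, queue_size):
        # (tid, oid) pairs for the most recent commits; older ones fall off.
        self.recent = deque(maxlen=queue_size)
        self.last_tid = 0

    def commit(self, oid):
        self.last_tid += 1
        self.recent.append((self.last_tid, oid))

    def restart(self):
        # The data survives a restart, but the in-memory queue does not.
        self.recent.clear()

    def verify(self, client_last_tid):
        """Return ("quick", oids_to_invalidate) or ("full", None)."""
        # Quick verification needs every transaction after client_last_tid
        # to still be in the queue.
        if not self.recent or self.recent[0][0] > client_last_tid + 1:
            return "full", None
        stale = [oid for tid, oid in self.recent if tid > client_last_tid]
        return "quick", stale


server = ToyServer(queue_size=3)
for oid in "abcde":
    server.commit(oid)

print(server.verify(3))   # client missed two commits, still in the queue
print(server.verify(1))   # client missed more than the queue remembers
server.restart()
print(server.verify(5))   # queue is gone, so even a current client pays
```

In this model the cost of the "full" case grows with the number of
objects in the client cache, which is why large caches hurt most when
every client hits that branch at the same time.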

IIRC, that's the rationale for adding a new "yes, I know I'm
non-persistent, but please stop trying to help me anyway"
never-verify-always-start-over-from-scratch ZEO client cache option. 
Ask Andrew Sawyers to be sure ;-)

> We seem to have a lot of problems with persistent caches for
> some reason, so I tend to recommend against their use.  I'm
> not sure what's going on there.

I believe there are multiple causes, from undetected cache-file
corruption after a non-clean shutdown (there's little redundancy to
_check_ in the pre- or post-MVCC cache designs, although I tried to
add a little more sanity-checking in the post-MVCC cache), to users
switching the database their ZEO client connects to without deleting
the persistent cache file(s) or changing any of the stuff on the
client that gets folded into that file's name.  The key seems to be
that almost all cases of persistent-cache problems I'm aware of "got
fixed by magic" just by deleting the cache file(s) and restarting.
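
The "delete the cache file(s) and restart" remedy can be scripted.  The
following is only a sketch:  the function name is made up, and the
"*.zec" pattern is an assumption about how the cache files are named on
your installation, so check the actual file names first.  Run it only
while the ZEO client is stopped.

```python
from pathlib import Path


def remove_cache_files(cache_dir, pattern="*.zec"):
    """Delete regular files matching pattern in cache_dir.

    Returns the sorted list of file names that were removed.  The default
    pattern is an assumption; pass the pattern that matches your files.
    """
    removed = []
    for path in sorted(Path(cache_dir).glob(pattern)):
        if path.is_file():
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling remove_cache_files("/path/to/client/var") before starting the
client makes it begin with an empty cache, which costs some warm-up time
but sidesteps whatever stale or corrupt state the old file held.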

> I still find them useful in situations in which the connection to the server 
> is slow.
>
> The second case should be rare.
>
> I would definitely err in the direction of using a larger cache.

Me too.
___
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev


Re: ZEO cache best practices, was Re: [ZODB-Dev] ZEO 3.2 (Zope 2.7) ->3.6 (Zope 2.9) upgrading: Much slower startup due to cache file creation

2006-04-19 Thread Jim Fulton

Tim Peters wrote:
> [Paul Winkler]
>> Heh. Well, I had a site that under some usage patterns would
>> occasionally slow to a crawl with cache flips every few minutes.  That
>> was with the old default 20 MB cache size.  I think I left it at
>> 500 MB or so and that site's been fine since. But the performance
>> demands were pretty low.
>
> Yes, 20MB is _very_ small.  It may have seemed conservatively safe 10
> years ago when disks were much smaller, but now it's ludicrously
> small.
>
> OTOH, the larger the ZEO cache file, the longer it may take startup or
> reconnection cache verification to complete, so there's always some
> reason not to do the obvious thing <0.3 wink>.


This is only a problem if:

1. a persistent cache is used or

2. the client gets disconnected from the server, without
   restarting, long enough for the server to commit too many
   transactions for it to still be able to do quick
   verification.

We seem to have a lot of problems with persistent caches for
some reason, so I tend to recommend against their use.  I'm
not sure what's going on there.  I still find them useful in
situations in which the connection to the server is slow.

The second case should be rare.

I would definitely err in the direction of using a
larger cache.

Jim

--
Jim Fulton   mailto:[EMAIL PROTECTED]   Python Powered!
CTO  (540) 361-1714   http://www.python.org
Zope Corporation http://www.zope.com   http://www.zope.org


Re: ZEO cache best practices, was Re: [ZODB-Dev] ZEO 3.2 (Zope 2.7) ->3.6 (Zope 2.9) upgrading: Much slower startup due to cache file creation

2006-04-18 Thread Tim Peters
[Paul Winkler]
> Heh. Well, I had a site that under some usage patterns would
> occasionally slow to a crawl with cache flips every few minutes.  That
> was with the old default 20 MB cache size.  I think I left it at
> 500 MB or so and that site's been fine since. But the performance
> demands were pretty low.

Yes, 20MB is _very_ small.  It may have seemed conservatively safe 10
years ago when disks were much smaller, but now it's ludicrously
small.

OTOH, the larger the ZEO cache file, the longer it may take startup or
reconnection cache verification to complete, so there's always some
reason not to do the obvious thing <0.3 wink>.

[on doc/ZEO/trace.txt]
> Thanks, those are very good documents!
>
> Out of curiosity, do you have any guesstimates on how much
> overhead enabling the cache trace can incur?

It's definitely intended that you be able to use ZEO's cache tracing
on a production box.

No particular memory burden:  trace records are written directly to a
disk file, not in any way cached in memory (apart from that the OS may
use RAM to buffer disk writes, of course).

The speed burden should be minor:  producing a trace record requires a
trivial amount of computation, and then whatever time it takes to pass
a binary string of a few dozen bytes off to the platform output
routines (note that this is summary info, and trace records have a
fixed small size independent of object sizes).  Given that a ZEO cache
hit has to do hundreds or thousands or ... bytes worth of file I/O
anyway (to read up the object pickle), and a cache hit is the cheapest
thing you can do with ZEO, it's relatively minor additional work.
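
To make the "fixed small size" point concrete, here is a minimal sketch
of writing a fixed-size binary record in Python.  The record layout
(timestamp, event code, object id, data size) and the function name are
hypothetical and chosen only for illustration; this is not ZEO's real
trace format.

```python
import io
import struct

# Hypothetical layout, NOT ZEO's actual trace record:
# 4-byte timestamp, 1-byte event code, 8-byte object id, 4-byte data size.
RECORD = struct.Struct(">IB8sI")


def write_event(out, timestamp, code, oid, data_size):
    """Append one fixed-size record to the binary file object out."""
    out.write(RECORD.pack(timestamp, code, oid, data_size))


buf = io.BytesIO()
write_event(buf, 1145400000, 0x20, b"\0" * 7 + b"\x01", 512)
write_event(buf, 1145400001, 0x22, b"\0" * 7 + b"\x02", 5_000_000)

# Both records take the same number of bytes, however large the
# object they describe was.
print(RECORD.size, len(buf.getvalue()))
```

The work per event is one small pack() and one write() of a constant
number of bytes, which is the sense in which the cost is independent of
the size of the object being loaded.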

Depending on cache activity, trace files can grow quickly (megabytes
per hour is common, and I contrived tests that produced hundreds of
megabytes per hour), and that's probably the biggest thing to look out
for.  For example, if the trace file is on a small partition, don't be
surprised if a traced ZEO craps out with an "out of disk space" error
when trying to append a new trace record to the file.
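
A rough way to see how much headroom a partition has is to divide the
free space by the growth rate you observe.  This is a sketch only:  the
function names are made up, and the growth rate is something you have to
measure on your own system (for example by comparing the trace file's
size an hour apart).

```python
import os
import shutil

MB = 1024 * 1024


def hours_left(free_bytes, growth_bytes_per_hour):
    """Hours until free_bytes is used up at a constant growth rate."""
    if growth_bytes_per_hour <= 0:
        raise ValueError("growth rate must be positive")
    return free_bytes / growth_bytes_per_hour


def hours_until_partition_full(trace_path, growth_bytes_per_hour):
    """Same estimate, using the free space of the partition holding trace_path."""
    directory = os.path.dirname(os.path.abspath(trace_path))
    free = shutil.disk_usage(directory).free
    return hours_left(free, growth_bytes_per_hour)


# 500 MB free and a trace growing 100 MB per hour leaves about 5 hours.
print(hours_left(500 * MB, 100 * MB))
```

The estimate ignores everything else writing to the same partition, so
treat it as an upper bound.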


Re: ZEO cache best practices, was Re: [ZODB-Dev] ZEO 3.2 (Zope 2.7) ->3.6 (Zope 2.9) upgrading: Much slower startup due to cache file creation

2006-04-18 Thread Paul Winkler
On Tue, Apr 18, 2006 at 02:12:15PM -0400, Tim Peters wrote:
> [Paul Winkler]
> > Interesting. Is there a recommended way now to judge whether your
> > ZEO cache is "big enough"?
> 
> Was there a recommended way before?  If so, it probably sucked too ;-)

Heh. Well, I had a site that under some usage patterns would
occasionally slow to a crawl with cache flips every few minutes.  That
was with the old default 20 MB cache size.  I think I left it at
500 MB or so and that site's been fine since. But the performance
demands were pretty low.

> The best approach for any knob is to try different settings with your
> actual app running an actual workload, measure whatever it is you're
> trying to optimize, and pick the next setting to try accordingly. 
> Lather, rinse, repeat.

Sure. But when you're in a hurry to fix a particular symptom, and
the first thing you try apparently makes it go away permanently,
sometimes that's good enough :-)
 
> ZEO supports a way to create a dump file summarizing "interesting"
> cache events, and there's a cache simulator program that uses that
> file as input to predict how various cache statistics (like overall
> hit rate) would change _if_ you had specified a different cache size. 
> That goes much faster than actually running the whole application
> again, but the reported results are an approximation.  I know several
> (but not many) people have tried this post-MVCC, and the few I heard
> back from said it was helpful.  You can read about it in ZODB's
> doc/ZEO/trace.txt.  cache.txt in the same directory gives a brief
> overview of the post-MVCC ZEO cache design.

Thanks, those are very good documents!

Out of curiosity, do you have any guesstimates on how much
overhead enabling the cache trace can incur?

-- 

Paul Winkler
http://www.slinkp.com


Re: ZEO cache best practices, was Re: [ZODB-Dev] ZEO 3.2 (Zope 2.7) ->3.6 (Zope 2.9) upgrading: Much slower startup due to cache file creation

2006-04-18 Thread Tim Peters
[Tim Peters]
>> That's an example:  the post-MVCC ZEO cache is a single file, and
>> there are no cache flips; flips are unique to the pre-MVCC two-file
>> ZEO cache design.

[Paul Winkler]
> Interesting. Is there a recommended way now to judge whether your
> ZEO cache is "big enough"?

Was there a recommended way before?  If so, it probably sucked too ;-)
It depends so much on the specifics of your app's "typical behavior",
and how you answer the question "'big enough' for _what_?".  For
example, Martin said he never saw cache flips under ZEO 3.2, and that
says he's probably got unusual (in the statistical sense, relative to
the universe of ZEO users) goals here.  If so, "an answer" that made
him happy probably wouldn't mean much to you.

> I used to grep the logs for cache flips to see how often they were
> happening.  (I didn't have a formula for what to do with that
> information, it was a pretty fuzzy process that typically ended up as
> "make the cache really huge and forget about it").

The best approach for any knob is to try different settings with your
actual app running an actual workload, measure whatever it is you're
trying to optimize, and pick the next setting to try accordingly. 
Lather, rinse, repeat.

ZEO supports a way to create a dump file summarizing "interesting"
cache events, and there's a cache simulator program that uses that
file as input to predict how various cache statistics (like overall
hit rate) would change _if_ you had specified a different cache size. 
That goes much faster than actually running the whole application
again, but the reported results are an approximation.  I know several
(but not many) people have tried this post-MVCC, and the few I heard
back from said it was helpful.  You can read about it in ZODB's
doc/ZEO/trace.txt.  cache.txt in the same directory gives a brief
overview of the post-MVCC ZEO cache design.
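
To illustrate the idea of replaying recorded cache events against
different cache sizes, here is a toy simulation in Python.  It is not
the simulator that ships with ZODB and does not read ZEO trace files:
the trace is a plain list of (oid, size) loads, the eviction policy is
simple LRU (an assumption, not a description of ZEO's cache), and
stores and invalidations are ignored.

```python
from collections import OrderedDict


def simulate_hit_rate(trace, cache_size):
    """Replay (oid, size) loads against a byte-limited LRU cache.

    Returns the fraction of loads that were hits (0.0 for an empty trace).
    """
    cache = OrderedDict()   # oid -> size, least recently used first
    used = 0
    hits = 0
    total = 0
    for oid, size in trace:
        total += 1
        if oid in cache:
            hits += 1
            cache.move_to_end(oid)      # mark as most recently used
            continue
        if size > cache_size:
            continue                    # can never fit; just a miss
        while used + size > cache_size:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[oid] = size
        used += size
    return hits / total if total else 0.0


# Three 10-byte objects loaded in a loop, tried at two cache sizes.
trace = [("a", 10), ("b", 10), ("c", 10)] * 2
for cache_size in (20, 30):
    print(cache_size, simulate_hit_rate(trace, cache_size))
```

Even this toy shows why the answer depends on the workload:  a cache
slightly too small for a cyclic access pattern can miss on every load,
while a slightly larger one hits on every repeat.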