RE: [Zope-dev] Cache growing during single REQUEST
Tim Peters wrote at 2003-9-14 16:40 -0400:

... [Dieter Maurer] Whoever wants to use it right now: the "no more ReadConflictErrors" patch on http://www.dieter.handshake.de/pyprojects/zope does precisely this (for storages that support history information).

[Tim] How has that been working out for people (not limited to Dieter)?

[Dieter] It works fine for us. We are working on a content management system with tens of thousands of large SGML/XML documents. My colleague likes to run parallel imports. Before the patch, cataloguing (using an improved version of Shane's QueueCatalog) made almost no progress because of the enormous number of ReadConflictErrors. After the patch, there are occasional WriteConflictErrors (caused by the catalog BTrees performing conflict resolution only at the leaf level), but in general the system is stable. We now also use it in production systems without problems.

[Tim] That's indeed what we're after, although Jeremy has in mind deeper/broader changes aimed at being more efficient than digging thru history.

[Dieter] The history is consulted only rarely (only when the current state is too young). However, in this case, the pickled data is read twice. Of course, I do not mind if it gets more efficient...

Dieter

___ Zope-Dev maillist - [EMAIL PROTECTED] http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
Re: [Zope-dev] Cache growing during single REQUEST
Tim Peters wrote:

Probably none for many apps. You'll be working with possibly non-current data, so think of ways your apps could possibly be damaged by that. For example, you're Bill Gates, using ZODB to track all your assets. A summary report takes hours to generate, and by the time you get it, perhaps a few of your billion-dollar overseas accounts were wiped out in the wee hours by an adverse court judgment, but the total you get added in the account values as of the time the report-generating transaction began. Oops. To the extent that MVCC hides that you're working with non-current data, to that extent also does an app relying on current data become vulnerable. When Bill is contemplating fleeing the country during turbulent times, he presumably needs to know how much cash he has right now, not what he had last night. Most apps aren't like that, but a one-size-fits-all policy for long-running transactions (like Bill's) doesn't exist.

[Chris] Ah, okay. That all makes sense... Of course, Bill may appreciate having a report that says "based on data no newer than X", where X is the time the transaction to generate the report started, rather than no report at all due to lots of read conflicts ;-)

Chris
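Chris's suggestion can be sketched in a few lines: the report carries the snapshot time, so a reader can tell how stale the total might be. This is only an illustration. `summary_report`, the balances and the timestamp are all made up for the example, and nothing here is a ZODB API.

```python
from datetime import datetime, timezone


def summary_report(balances, snapshot_time):
    """Format a total together with the time the underlying snapshot was taken.

    Hypothetical helper: `balances` stands in for whatever the report
    transaction read, `snapshot_time` for the moment that transaction began.
    """
    total = sum(balances.values())
    stamp = snapshot_time.strftime("%Y-%m-%d %H:%M UTC")
    return f"Total assets: {total} (based on data no newer than {stamp})"


# Illustrative values only.
snapshot_time = datetime(2003, 9, 14, 2, 0, tzinfo=timezone.utc)
balances = {"domestic": 100, "overseas": 200}
report = summary_report(balances, snapshot_time)
```

The design point is that staleness is made visible rather than hidden, which addresses the vulnerability Tim describes without giving up the report.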
RE: [Zope-dev] Cache growing during single REQUEST
[Tim] ... The ways in which MVCC loses will become obvious later <0.9 wink>.

[Chris Withers] Any ideas what they'll be yet?

[Tim] Probably none for many apps. You'll be working with possibly non-current data, so think of ways your apps could possibly be damaged by that. For example, you're Bill Gates, using ZODB to track all your assets. A summary report takes hours to generate, and by the time you get it, perhaps a few of your billion-dollar overseas accounts were wiped out in the wee hours by an adverse court judgment, but the total you get added in the account values as of the time the report-generating transaction began. Oops. To the extent that MVCC hides that you're working with non-current data, to that extent also does an app relying on current data become vulnerable. When Bill is contemplating fleeing the country during turbulent times, he presumably needs to know how much cash he has right now, not what he had last night. Most apps aren't like that, but a one-size-fits-all policy for long-running transactions (like Bill's) doesn't exist.
RE: [Zope-dev] Cache growing during single REQUEST
[Tim] ... The short course in ZODB is that, when MVCC is in effect, a read will return the state of the object as of the time the current transaction began, even if the object has subsequently been modified by some other transaction.

[Dieter Maurer] Whoever wants to use it right now: the "no more ReadConflictErrors" patch on http://www.dieter.handshake.de/pyprojects/zope does precisely this (for storages that support history information).

[Tim] How has that been working out for people (not limited to Dieter)? That's indeed what we're after, although Jeremy has in mind deeper/broader changes aimed at being more efficient than digging thru history.
RE: [Zope-dev] Cache growing during single REQUEST
Tim Peters wrote at 2003-9-11 23:50 -0400:

... You can google on the phrase to get a ton of more-or-less abstract overviews of the concept. The short course in ZODB is that, when MVCC is in effect, a read will return the state of the object as of the time the current transaction began, even if the object has subsequently been modified by some other transaction.

[Dieter] Whoever wants to use it right now: the "no more ReadConflictErrors" patch on http://www.dieter.handshake.de/pyprojects/zope does precisely this (for storages that support history information).

Dieter
Re: [Zope-dev] Cache growing during single REQUEST
Tim Peters wrote:

Multiversion concurrency control (MVCC for short) is the next step. If no other crises intervene, Jeremy and I will start implementing that on the ZODB3 3.2 branch (most likely that branch -- can't swear to it) soon (ditto).

[Chris] From Jeremy's recent notes, I think it'll be the Zope 2.6 branch...

[Tim] You can google on the phrase to get a ton of more-or-less abstract overviews of the concept. The short course in ZODB is that, when MVCC is in effect, a read will return the state of the object as of the time the current transaction began, even if the object has subsequently been modified by some other transaction.

[Chris] Sounds great :-) If you need any implementations tested, do let me know!

[Tim] The overriding concern in all schemes is that you don't see inconsistent data.

[Chris] Yep.

[Tim] The current ReadConflictError prevents you from seeing inconsistent data by preventing you from loading objects that have changed since your current transaction began.

[Chris] Indeed, but that could be a lot of things, and often in situations where it wouldn't matter, as long as the data is consistent...

[Tim] MVCC prevents it by (possibly) delivering non-current object states. I don't think either can be viewed as a pure win. The ways in which ReadConflictError loses are obvious to people here because they've experienced them. The ways in which MVCC loses will become obvious later <0.9 wink>.

[Chris] Any ideas what they'll be yet?

Chris
Re: [Zope-dev] Cache growing during single REQUEST
On Fri, 2003-09-12 at 07:34, Chris Withers wrote:

[Tim] Multiversion concurrency control (MVCC for short) is the next step. If no other crises intervene, Jeremy and I will start implementing that on the ZODB3 3.2 branch (most likely that branch -- can't swear to it) soon (ditto).

[Chris] From Jeremy's recent notes, I think it'll be the Zope 2.6 branch...

[Tres] Nope. The 'ZODB3-3_1-branch' has been retired; all future ZODB3 3.1.x work will take place on the 'Zope-2_6-branch'. MVCC, however, is new feature work, and will take place on the 3.2 branch (which may be the Zope-2_7-branch :)

[Tim] You can google on the phrase to get a ton of more-or-less abstract overviews of the concept. The short course in ZODB is that, when MVCC is in effect, a read will return the state of the object as of the time the current transaction began, even if the object has subsequently been modified by some other transaction.

[Chris] Sounds great :-) If you need any implementations tested, do let me know!

[Tim] The overriding concern in all schemes is that you don't see inconsistent data.

[Chris] Yep.

[Tim] The current ReadConflictError prevents you from seeing inconsistent data by preventing you from loading objects that have changed since your current transaction began.

[Chris] Indeed, but that could be a lot of things, and often in situations where it wouldn't matter, as long as the data is consistent...

[Tim] MVCC prevents it by (possibly) delivering non-current object states. I don't think either can be viewed as a pure win. The ways in which ReadConflictError loses are obvious to people here because they've experienced them. The ways in which MVCC loses will become obvious later <0.9 wink>.

[Chris] Any ideas what they'll be yet?

[Tres] Some applications will have no downside at all; for instance, requests that do no writing to the database are pure wins for MVCC (e.g., long-running report requests). Requests that write to the database on the basis of stale-but-consistent data *may* still be OK, but that will need to be a policy set by the application.

Tres.
--
Tres Seaver [EMAIL PROTECTED] Zope Corporation Zope Dealers http://www.zope.com
RE: [Zope-dev] Cache growing during single REQUEST
[Toby Dickenson] You get a ReadConflictError when loading an object if it has been modified since the start of the transaction. This exception therefore becomes increasingly likely as time progresses since the start of the transaction.

[Chris Withers] What's the thing Zope Corp are touting as the long term solution to this problem?

[Tim] Multiversion concurrency control (MVCC for short) is the next step. If no other crises intervene, Jeremy and I will start implementing that on the ZODB3 3.2 branch (most likely that branch -- can't swear to it) soon (ditto).

You can google on the phrase to get a ton of more-or-less abstract overviews of the concept. The short course in ZODB is that, when MVCC is in effect, a read will return the state of the object as of the time the current transaction began, even if the object has subsequently been modified by some other transaction.

The overriding concern in all schemes is that you don't see inconsistent data. The current ReadConflictError prevents you from seeing inconsistent data by preventing you from loading objects that have changed since your current transaction began. MVCC prevents it by (possibly) delivering non-current object states. I don't think either can be viewed as a pure win. The ways in which ReadConflictError loses are obvious to people here because they've experienced them. The ways in which MVCC loses will become obvious later <0.9 wink>.
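The two policies Tim contrasts can be put side by side in a toy model. The sketch below is not ZODB code: `ToyStorage`, `ToyTransaction` and the integer transaction ids are hypothetical, and the storage simply keeps every committed revision in memory. It assumes only what the thread states: the strict policy refuses to load an object modified after the transaction began, while MVCC returns the state as of the transaction's start.

```python
class ReadConflictError(Exception):
    """Toy stand-in: raised by the strict policy for a too-young object."""


class ToyStorage:
    """Hypothetical storage keeping every committed revision of every object."""

    def __init__(self):
        self._tid = 0
        self._revisions = {}  # oid -> list of (commit_tid, state), oldest first

    def commit(self, oid, state):
        self._tid += 1
        self._revisions.setdefault(oid, []).append((self._tid, state))
        return self._tid

    def begin(self, mvcc):
        # A transaction remembers the newest commit visible when it started.
        return ToyTransaction(self, self._tid, mvcc)


class ToyTransaction:
    def __init__(self, storage, start_tid, mvcc):
        self._storage = storage
        self.start_tid = start_tid
        self._mvcc = mvcc

    def load(self, oid):
        revisions = self._storage._revisions[oid]
        tid, state = revisions[-1]
        if tid <= self.start_tid:
            return state  # current state is old enough, nothing to resolve
        if not self._mvcc:
            raise ReadConflictError(oid)
        # MVCC: walk back through history to the newest revision that was
        # already committed when this transaction began.
        for tid, state in reversed(revisions):
            if tid <= self.start_tid:
                return state
        raise KeyError(oid)  # the object did not exist yet at start_tid


storage = ToyStorage()
storage.commit("account", 100)

strict = storage.begin(mvcc=False)
snapshot = storage.begin(mvcc=True)
storage.commit("account", 0)  # some other transaction modifies the object

try:
    strict.load("account")
    strict_result = "loaded"
except ReadConflictError:
    strict_result = "conflict"

snapshot_result = snapshot.load("account")             # state as of its start
fresh_result = storage.begin(mvcc=True).load("account")  # new transaction sees the change
```

Both readers stay consistent; the strict one gets no data at all, while the MVCC one gets data that is already out of date. That is the trade-off neither side of the thread calls a pure win.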
Re: [Zope-dev] Cache growing during single REQUEST
On Wednesday 03 September 2003 12:15, Chris Withers wrote:

[Chris] If the object load would cause the cache to go above its maximum number,

[Toby] *Number* isn't the right parameter to control here. We need to limit the total amount of RAM. Objects are of variable size, and the largest ZODB objects are very much bigger than the average. One task I would like to find time for is making the cache aware of this. For example, this would fix the problem where the current cache unfairly penalises ZCatalog operations because its BTree nodes are so small.

[Chris] then boot an object out of the cache in order to make room for the new one.

[Toby] That would have a bad effect on ReadConflictErrors. Cache purging should only happen on transaction boundaries for storages where ReadConflictErrors are possible.

[Chris] control the amount of memory Zope actually uses and other requests would stand a chance of being processed normally.

[Toby] I haven't seen a mention of ulimit or autolance earlier in this thread. They are mostly adequate protection against the worst problems.

--
Toby Dickenson
Re: [Zope-dev] Cache growing during single REQUEST
Toby Dickenson wrote:

On Wednesday 03 September 2003 12:15, Chris Withers wrote:

[Chris] If the object load would cause the cache to go above its maximum number,

[Toby] *Number* isn't the right parameter to control here. We need to limit the total amount of RAM. Objects are of variable size, and the largest ZODB objects are very much bigger than the average.

[Chris] This is true, but isn't this much harder to do, as we run into the same issue that people have trying to produce folderish objects for Zope that limit the size of their contained objects? Hmmm, maybe not... could we make a note of the pickle's size when the data is loaded and update that size when it's committed? Is this the same as the in-memory size?

[Chris] then boot an object out of the cache in order to make room for the new one.

[Toby] That would have a bad effect on ReadConflictErrors.

[Chris] Don't follow, can you explain?

[Toby] Cache purging should only happen on transaction boundaries for storages where ReadConflictErrors are possible.

[Chris] Can you put some brackets in that statement? I don't follow it...

[Toby] I haven't seen a mention of ulimit or autolance earlier in this thread. They are mostly adequate protection against the worst problems.

[Chris] Do they kill the thread or the whole Zope? I'm also keen for users not to get MemoryErrors, but to just have their request take much longer (cache thrashing and the like...)

cheers,

Chris
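The pickle-size idea can be tried out with a small stand-alone cache. This is a sketch under assumptions, not the ZODB cache: `SizeBoundedCache` is a hypothetical class, eviction is plain least-recently-used, and `len(pickle.dumps(obj))` is used as the size estimate, which answers Chris's question only approximately since the in-memory size of an object generally differs from its pickle size.

```python
import pickle
from collections import OrderedDict


class SizeBoundedCache:
    """Hypothetical LRU cache bounded by summed pickle sizes, not object count."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.total_bytes = 0
        self._entries = OrderedDict()  # oid -> (obj, estimated_size)

    def store(self, oid, obj):
        size = len(pickle.dumps(obj))  # an estimate, not the in-memory size
        if oid in self._entries:
            self.total_bytes -= self._entries.pop(oid)[1]
        self._entries[oid] = (obj, size)
        self.total_bytes += size
        # Evict least recently used entries, but never the one just stored.
        while self.total_bytes > self.max_bytes and len(self._entries) > 1:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.total_bytes -= evicted_size

    def get(self, oid):
        obj, size = self._entries.pop(oid)
        self._entries[oid] = (obj, size)  # mark as most recently used
        return obj

    def __contains__(self, oid):
        return oid in self._entries

    def __len__(self):
        return len(self._entries)


cache = SizeBoundedCache(max_bytes=1550)
for i in range(10):
    cache.store(("node", i), [i])     # many small objects, like BTree nodes
small_count = len(cache)
cache.store("document", "x" * 1500)   # one large object pushes small ones out
```

With a byte budget, ten small nodes cost almost nothing, and it is the single large document that forces evictions. A count-based limit would treat all eleven objects alike, which is the unfairness Toby describes for ZCatalog.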
Re: [Zope-dev] Cache growing during single REQUEST
On Friday 05 September 2003 14:21, Chris Withers wrote:

[Chris] Hmmm, maybe not... could we make a note of the pickle's size when the data is loaded and update that size when it's committed? Is this the same as the in-memory size?

[Toby] Yes, I hope it's close enough for most purposes. It would be nice to have a way for the object to override that in the cases where it is badly wrong.

[Chris] then boot an object out of the cache in order to make room for the new one.

[Toby] That would have a bad effect on ReadConflictErrors.

[Chris] Don't follow, can you explain?

[Toby] You get a ReadConflictError when loading an object if it has been modified since the start of the transaction. This exception therefore becomes increasingly likely as time progresses since the start of the transaction. Today you can minimise the probability of a read conflict by touching the objects you need (at least the ones likely to change) near the start of the transaction. This works because, once loaded, the objects stay in memory until the end of the transaction.

Another problem I didn't mention is that _v_ attributes are supposed to last at least until the end of the transaction.

[Toby] I haven't seen a mention of ulimit or autolance earlier in this thread. They are mostly adequate protection against the worst problems.

[Chris] Do they kill the thread or the whole Zope?

[Toby] Whole Zope. Of course there isn't a great overhead in using twice as many Zopes, each with half as many publisher threads (assuming you already have ZEO, of course).

[Chris] I'm also keen for users not to get MemoryErrors, but to just have their request take much longer (cache thrashing and the like...)

[Toby] Use squid, and it will retry the request.

--
Toby Dickenson
Re: [Zope-dev] Cache growing during single REQUEST
Leonardo Rochael Almeida wrote:

[Shane] as a lower bound. (Note that the cache is still allowed to grow indefinitely within the scope of a request, however.)

[Chris] This is the single biggest cause of Zope becoming unresponsive for me: people doing silly things that drag hordes of objects into memory in a single request,

[Leo] Now that would be an interesting feature: an upper bound on the number of objects a request is allowed to touch, period. If a request requires more than that, it's rolled back.

[Chris] Hmm, not so useful if people just keep retrying the request. What I'd like to see is the cache checking its size on object load. If the object load would cause the cache to go above its maximum number, then boot an object out of the cache in order to make room for the new one. So, you'd get slowness because of cache thrashing on THAT PARTICULAR REQUEST, but at least you'd be able to control the amount of memory Zope actually uses and other requests would stand a chance of being processed normally.

<snip very enlightening explanation of why a Python process using lots of memory, even for a short period of time, is a 'bad thing'>

That was pretty informative, but does give even more of a good reason why we really need to be able to put a maximum upper bound on the amount of memory Zope can use at any one point...

Chris
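What Chris asks for, checking the limit on every load and evicting to make room, looks roughly like this. It is a sketch only: `EvictOnLoadCache` and its `loader` callback are hypothetical, the limit is a plain object count as in his description, and it ignores the objections raised elsewhere in the thread (evicting mid-transaction makes read conflicts more likely and loses _v_ attributes).

```python
from collections import OrderedDict


class EvictOnLoadCache:
    """Hypothetical cache that enforces its object-count limit at load time."""

    def __init__(self, max_objects, loader):
        self.max_objects = max_objects
        self._loader = loader          # callable: oid -> object (e.g. from storage)
        self._objects = OrderedDict()  # oid -> object, least recently used first
        self.loads = 0                 # how many times the loader was called

    def get(self, oid):
        if oid in self._objects:
            self._objects.move_to_end(oid)  # cache hit: mark as recently used
            return self._objects[oid]
        if len(self._objects) >= self.max_objects:
            self._objects.popitem(last=False)  # boot one object out first
        self.loads += 1
        obj = self._objects[oid] = self._loader(oid)
        return obj

    def __len__(self):
        return len(self._objects)


# A "request" that walks 5 objects twice with room for only 3 in the cache.
cache = EvictOnLoadCache(max_objects=3, loader=lambda oid: {"oid": oid})
for _ in range(2):
    for oid in range(5):
        cache.get(oid)
```

The cache never holds more than three objects, but every one of the ten accesses is a miss. That is the thrashing Chris accepts for the offending request in exchange for a hard memory bound.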
Re: [Zope-dev] Cache growing during single REQUEST
Shane Hathaway wrote:

as a lower bound. (Note that the cache is still allowed to grow indefinitely within the scope of a request, however.)

[Chris] Which is still pretty broken IMHO :-( This is the single biggest cause of Zope becoming unresponsive for me: people doing silly things that drag hordes of objects into memory in a single request, made worse by the fact that Python is extraordinarily bad at giving memory back to the OS after that request has been completed ;-)

cheers,

Chris
Re: [Zope-dev] Cache growing during single REQUEST
On Fri, 2003-08-29 at 06:28, Chris Withers wrote:

[Shane] as a lower bound. (Note that the cache is still allowed to grow indefinitely within the scope of a request, however.)

[Chris] Which is still pretty broken IMHO :-( This is the single biggest cause of Zope becoming unresponsive for me: people doing silly things that drag hordes of objects into memory in a single request,

[Leo] Now that would be an interesting feature: an upper bound on the number of objects a request is allowed to touch, period. If a request requires more than that, it's rolled back.

[Chris] made worse by the fact that Python is extraordinarily bad at giving memory back to the OS after that request has been completed ;-)

[Leo] On Linux (and probably other nixes), this is a libc (mis)feature. You see, malloc() is a libc abstraction. The memory the OS makes available to each process is actually a contiguous block with a variable upper (or is it lower?) bound. This bound is usually changed through the sbrk() call, which is a wrapper around the brk() call. The latter tells the OS the amount of heap space the process thinks it needs. The former is an increment or decrement wrapper around the latter.

When the process runs out of heap (malloc()) space, it (the libc in the Python interpreter, in our case) calls sbrk() with a positive number (i.e. it asks for more memory). Now, no matter how much memory is later free()d (or, in our case, Python objects deleted), the system never calls sbrk() with a negative value. If it were to do that, it (the libc) would first have to make sure that there are no pointers to allocated space still in the area. I'm not sure, but I think that the glibc in Linux just doesn't bother. Defragmentation of the area is not an option because, in C, due to pointer arithmetic, you can never be sure where all the pointers are, to change them to the new locations.

Now this is all from memory (and a cursory read of the sbrk(2) man page) so the above could be full of, erm..., incorrections :-) Corrections to the above explanation are welcome.

Cheers, Leo

--
Ideas don't stay in some minds very long because they don't like solitary confinement.
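The constraint Leo describes, that a single contiguous heap can only shrink from the top, can be illustrated with a toy model. `ToyHeap` below is hypothetical and deliberately simplified: it never reuses freed holes, and it models the best case in which the allocator lowers the break whenever the top of the heap is free. As far as I know glibc does return memory from the top of the heap once enough of it is free (see the trim threshold in mallopt(3)), so "never" may be too strong, but the pinning effect shown here applies either way.

```python
class ToyHeap:
    """Contiguous toy heap whose size is a single 'break', as with brk()/sbrk()."""

    def __init__(self):
        self.brk = 0        # current top of the heap
        self._blocks = {}   # start offset -> size, live blocks only

    def malloc(self, size):
        # Simplification: always grow the heap, never reuse freed holes.
        start = self.brk
        self.brk += size    # like sbrk(+size)
        self._blocks[start] = size
        return start

    def free(self, start):
        del self._blocks[start]
        # The break can only drop to just above the highest live block;
        # blocks cannot be moved because pointers to them may exist anywhere.
        self.brk = max((s + n for s, n in self._blocks.items()), default=0)


heap = ToyHeap()
big = [heap.malloc(1000) for _ in range(100)]  # a request drags in many objects
small = heap.malloc(16)                        # one small long-lived allocation
for block in big:
    heap.free(block)                           # the request ends
brk_pinned = heap.brk                          # still above the small block
heap.free(small)
brk_after = heap.brk                           # only now can the heap shrink
```

One 16-byte survivor at the top keeps 100,000 freed bytes below it inside the process, which is why a short memory spike in one request can leave the process large afterwards.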