Re: [Zope-dev] How bad _are_ ConflictErrors
Chris Withers wrote at 2005-11-21 16:33 +:

... here's a line from one of our event logs:

2005-11-17T08:00:27 INFO(0) ZODB conflict error at /some_uri (347 conflicts since startup at 2005-11-08T17:56:20)

What is this telling me?

It is incredibly stupid. The message above only tells you that (at the given time) a request for /some_uri resulted in a ConflictError and that since startup (at the given time) 347 conflicts occurred. Unfortunately, it does not tell you

* what object caused the conflict
* whether it is a read or a write conflict (read conflicts are very rare since the introduction of MVCC, but they may still happen)
* for write conflicts: which versions of the object participated

A long time ago, I posted an extension making this additional information available (it is all present in the exception instance; Zope is just too stupid to read and log it).

Did the user actually see a ConflictError page?

Usually not.

Or was this error successfully resolved?

It may (or may not) later be resolved. This is still not clear when the message is generated.

What object did this ConflictError occur on and/or how can I modify our Zope instances to find out where the conflict was occurring?

See above -- search the archive for the extension...

Now, when should the number of ConflictErrors logged in this way start to become worrying?

When you start to see lots of them (per time unit).

I analysed the logs from our cluster and we're getting about 450 conflict errors in our busiest hours when the cluster of 8 ZEO clients is taking about 11,000 hits in that hour. Is this 'bad'?

I would not be happy: it is about 5 %. This gives quite some chance that your customers see failures caused by the conflicts (when 3 repetitions are not enough).

If so, where should I start to make things better?

You find out which objects cause the conflicts. You analyse what you can do to reduce concurrent writes to these objects (split into separate persistent subobjects) or whether you can provide conflict resolution.

-- Dieter
___
Zope-Dev maillist - Zope-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zope-dev
** No cross posts or HTML encoding! **
(Related lists -
http://mail.zope.org/mailman/listinfo/zope-announce
http://mail.zope.org/mailman/listinfo/zope )
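Dieter notes that the missing details (which object, read or write conflict, which versions) are all present in the exception instance. The following is a minimal sketch of turning them into a log line; it is not the extension he refers to. It assumes, without checking against any particular ZODB release, that the exception carries optional `oid`, `class_name` and `serials` attributes and that read conflicts are signalled by a class whose name contains "Read". `describe_conflict` and `FakeConflictError` are hypothetical names used only so the sketch runs without ZODB installed.

```python
def describe_conflict(exc, url):
    """Build one log line from whatever conflict details `exc` carries.

    Assumption: `exc` may have `oid`, `class_name` and `serials`
    attributes. Anything missing is reported as 'unknown'.
    """
    oid = getattr(exc, "oid", None)
    class_name = getattr(exc, "class_name", None)
    serials = getattr(exc, "serials", None)

    # Assumption: a read conflict uses a class named like ReadConflictError.
    kind = "read" if "Read" in type(exc).__name__ else "write"

    oid_text = repr(oid) if oid is not None else "unknown"
    class_text = class_name if class_name else "unknown"
    return "%s conflict at %s: oid=%s class=%s serials=%r" % (
        kind, url, oid_text, class_text, serials)


class FakeConflictError(Exception):
    """Hypothetical stand-in for the real exception, for illustration."""

    def __init__(self, oid, class_name, serials):
        Exception.__init__(self, "conflict")
        self.oid = oid
        self.class_name = class_name
        self.serials = serials


if __name__ == "__main__":
    exc = FakeConflictError(b"\x00" * 7 + b"\x2a",
                            "Products.Foo.Counter", (b"old", b"new"))
    print(describe_conflict(exc, "/some_uri"))
```

Reading the attributes with `getattr` keeps the helper usable even if a given exception lacks one of them, which seems safer than relying on a specific constructor signature.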
Re: [Zope-dev] How bad _are_ ConflictErrors
Conflicts and how they interact with the database and sessioning machinery is my hot button right at the moment )-: I hope I have not included too much information.

I ran a quick report and we see about 1000 conflicts per hour at about 12 hits per hour. These are order of magnitude numbers and are highly variable. The 1% number is way bigger than I am comfortable with although I have no basis to scale my expectations. I'd be much happier were it a couple of orders of magnitude smaller.

Conflict errors are not always errors. As I understand it, Zope retries when a conflict occurs and usually is able to commit both sides of the conflicting transaction. Sometimes Zope cannot commit conflicting transactions--and it is at that point that an error occurs.

There are supposed to be significant changes in the Zope 2.8.4/ZODB 3.4.2 system. Read-read conflicts no longer generate conflict errors and the retry mechanism has been reworked at the ZODB level to retry once and then raise a POSKEY exception.

The optimistic locking used by Zope can cause problems, particularly when the conflicting method changes external state. We have seen instances where an action was taken multiple times due to conflicts and their resolution. In one instance, we had an infinite loop in the conflict resolution. The interactions which can cause conflicts are not always obvious. I am still learning.

We do have occasional instances where unresolved conflicts raise user-visible diagnostics. These are real errors. While I have not explored the reasons why, it appears that at least some of these errors are not logged in event.log but only displayed to the user.

I asked the list the other day whether anyone had prepared a set of best practice guidelines on the techniques to use to minimize conflicts. Dieter Maurer responded:

* Move attributes with high write frequency out into separate persistent objects. E.g. when you have a counter, put it into its own persistent object (you can use a BTrees.Length.Length object for a counter).

* Implement conflict resolution for your frequently written persistent objects. Formerly, TemporaryStorage had only very limited history information to support conflict resolution (which limited the wholesome effect of conflict resolution). Rumours say that this improved with Zope 2.8.

* Write only when you really change something. E.g. instead of

      session[XXX] = sss

  use

      if session[XXX] != sss: session[XXX] = sss

  (at least, if there is a high chance that session already contains the correct value).

Session variables present a particularly vexing problem since they may trigger writes even though they are apparently read-only. Chris McDonough wrote in response to my posting:

On Nov 20, 2005, at 12:16 PM, Dennis Allison wrote:

[...] Looking at the code, I don't understand why I am seeing conflicts. As I understand things, neither variables in the dtml-let space nor the REQUEST/RESPONSE space are stored in the ZODB so modifications to them don't look like writes to the conflict mechanism. Am I incorrect in my understanding?

Yes, but that's understandable. It's not exactly obvious. The sessioning machinery is one of the few places in Zope where it's necessary for the code to do what's known as a write on read in the ZODB database. Even if you're just reading from a session, looking up a session, or doing anything otherwise related to sessioning, it's possible for your code to generate a ZODB write. This is why you get conflicts even if you're just reading; whenever you access the sessioning machinery, you are potentially (but not always) causing a ZODB write. All writes can potentially cause a conflict error. While this might sound fantastic, it's pretty much impossible to avoid when using ZODB as a sessioning backend.
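Two of Dieter's suggestions quoted above (conflict resolution for a frequently written object, and writing only on real change) can be sketched as follows. This is an illustration under stated assumptions, not code from Zope: `ResolvingCounter` and `set_if_changed` are hypothetical names, the class is a plain Python object rather than a `persistent.Persistent` subclass so that it runs without ZODB, and the three arguments to `_p_resolveConflict` are assumed to be state dictionaries of the form `{"value": n}`.

```python
class ResolvingCounter(object):
    """Counter whose concurrent increments can be merged.

    Assumption: in a real application this would be a persistent
    object, and the database would call _p_resolveConflict with the
    common ancestor state, the already committed state, and the state
    this transaction is trying to commit.
    """

    def __init__(self, value=0):
        self.value = value

    def change(self, delta):
        self.value += delta

    def _p_resolveConflict(self, old_state, saved_state, new_state):
        # Both transactions started from old_state, so apply both
        # deltas: saved + (new - old).
        merged = dict(new_state)
        merged["value"] = (saved_state["value"]
                           + new_state["value"]
                           - old_state["value"])
        return merged


def set_if_changed(mapping, key, value):
    """Store `value` only if it differs from what is already there.

    Returns True when a write happened. Unlike the one-line form in
    the email, a missing key is treated as a change instead of
    raising KeyError.
    """
    if key in mapping and mapping[key] == value:
        return False
    mapping[key] = value
    return True


if __name__ == "__main__":
    counter = ResolvingCounter()
    print(counter._p_resolveConflict({"value": 10},
                                     {"value": 12},
                                     {"value": 15}))
    session = {"lang": "en"}
    print(set_if_changed(session, "lang", "en"))
```

Merging works here only because increments commute; an object whose updates depend on order would need a different resolution rule or none at all.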
The sessioning machinery has been tuned to generate as few conflicts as possible, and you can help it by doing your own timeout, resolution, and housekeeping tuning as has been suggested. MVCC gets rid of read conflicts. But it's not possible to completely avoid write conflicts under the current design. Here's why.

The sessioning machinery is composed of three major data structures:

- an index of timeslice to bucket. A timeslice is an integer representing some range of time (the range of time is variable, depending on the resolution, but out of the box, it represents 20 seconds). This mapping is an IOBTree.

- A bucket is a mapping from a browser id to session data object (aka transient object). This mapping is an OOBTree.

- three increasers which mark the last timeslice in which something was done (called the garbage collector, called the finalizer, etc).

The point of sessioning is to provide a writable namespace
Re: [Zope-dev] How bad _are_ ConflictErrors
[Dennis Allison]

... Conflict errors are not always errors.

At the ZODB level, an unresolved conflict always raises an exception. Whether such an exception is considered to be an error isn't ZODB's decision -- that's up to the app. My understanding (which may be wrong) is that Zope tries up to 3 times to commit a given transaction, suppressing any conflict exceptions for the duration, before giving up.

As I understand it, Zope retries when a conflict occurs and usually is able to commit both sides of the conflicting transaction.

Right (although note that there may be more than two sides).

Sometimes Zope cannot commit conflicting transactions--and it is at that point that an error occurs.

Right, Zope eventually gives up on a transaction that keeps on raising conflict exceptions.

There are supposed to be significant changes in the Zope 2.8.4/ZODB 3.4.2 system.

There are. ZODB 3.3 introduced multiversion concurrency control (MVCC), which eliminates read conflicts in normal operation.

Read-read conflicts no longer generate conflict errors

Not really: under MVCC, there simply aren't any read conflicts. There may still be write conflicts.

and the retry mechanism has been reworked at the ZODB level to retry once and then raise a POSKEY exception.

Nope, no version of ZODB ever retries a transaction on its own. If an application (like Zope) wants to retry, it's entirely up to it to do so.

The optimistic locking used by Zope

ZODB's transactional approach is optimistic, precisely because it _doesn't_ lock objects modified by a transaction. Any number of transactions are free to modify the same object at the same time -- no locking mechanism attempts to stop that. If multiple transactions do modify the same object at the same time, and that object doesn't implement conflict resolution, then only the first transaction to commit its changes to that object can succeed.

can cause problems, particularly when the conflicting method changes external state.
Yes -- but do note it's not a transactional system then (ZODB can roll back all changes _it_ makes, so that a failure to commit does no harm to the database state; external resources that can't take back provisional changes are indeed challenging to use in a transactional system).
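The points above (the application, not ZODB, decides whether to retry, and external state is not rolled back) can be illustrated with a small sketch. It is not Zope's publisher code: `ConflictError` here is a hypothetical stand-in class, `run_with_retries` and `make_flaky_request` are made-up names, and the limit of 3 attempts simply mirrors the hedged figure mentioned in the message.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the database's conflict exception."""


def run_with_retries(work, max_attempts=3):
    """Call work() until it succeeds or max_attempts conflicts occur.

    Returns (result, attempts_used). The last ConflictError is
    re-raised when every attempt conflicts.
    """
    attempt = 0
    while True:
        attempt += 1
        try:
            return work(), attempt
        except ConflictError:
            if attempt >= max_attempts:
                raise


# Stands in for a non-transactional external resource such as a mailhost.
sent_mail = []


def make_flaky_request(conflicts_before_success):
    """Build a request that conflicts a fixed number of times, then commits."""
    state = {"remaining": conflicts_before_success}

    def request():
        # Side effect that a database rollback cannot take back.
        sent_mail.append("hello")
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise ConflictError("simulated conflict")
        return "committed"

    return request


if __name__ == "__main__":
    result, attempts = run_with_retries(make_flaky_request(2))
    print(result, attempts, len(sent_mail))
```

In this sketch the request that conflicts twice commits on its third attempt, but the simulated mail is appended on every attempt, which is the "action taken multiple times" effect described earlier in the thread.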
Re: [Zope-dev] How bad _are_ ConflictErrors
On Nov 21, 2005, at 2:10 PM, Dennis Allison wrote: Conflicts and how they interact with the database and sessioning machinery is my hot button right at the moment )-: I Hope I have not included too much information. I ran a quick report and we see about 1000 conflicts per hour at about 12 hits per hour. Is this the number of log messages that indicate a conflict error occurred (e.g. x conflict errors since DATE messages in the event log) or the number of conflict errors that are retried more than three times and thus make it out to the app user? I'm guessing the former. These are order of magnitude numbers and are highly variable. The 1% number is way bigger than I am comfortable with although I have no basis to scale my expectations. I'd be much happier were it a couple of orders of magnitude smaller. I would be too. It's considerably difficult when ZODB is used as the sessioning backend. A lot of effort has been put in to reducing the potential for conflicts already. It could of course be better if more time was put in, but there hasn't been any reason (besides a sense of accomplishment and contribution to the greater good, anyway ;-) to put in that effort since the last time this machinery was overhauled. That said, if no conflict errors actually bubble up to the user using the application, the penalty is just app performance and knowledge expense (e.g. you can't use a nontransactional mailhost, you can't use a nontransactional database table, etc). You've already paid for the latter the hard way. ;-) I can't judge the expense of the former to you but I assume that's what you're primarily worried about now. Conflict errors are not always errors. The real reason they're called errors is only because they're implemented as Python exceptions. They are implemented as exceptions because it was the easiest mechanism to use (exceptions are already built into Python). 
As I understand it, Zope retries when a conflict occurs and usually is able to commit both sides of the conflicting transaction. There can be more than two sides (actually there always are... there are three.. the two conflicting in-progress connection states and the database state). Sometimes Zope cannot commit conflicting transactions--and it is at that point that an error occurs. An exception occurs, yes. Oops, I just realized Tim responded to the rest of these points, so I won't go on. We do have occasional instances where unresolved conflicts raise user visible diagnostics. These are real errors. While I have not explored the reasons why, it appears that at least some of these errors are not logged in event.log but only displayed to the user. To be pedantic, if you're right about conflict error tracebacks being shown to end users, it's not because they are unresolved (in the sense that 'application-level conflict resolution' could have prevented them), it's because a request was issued that resulted in a conflict error, which was retried, and then that retried request raised a conflict error, and then twice more. The only way to figure out what's going on here is to see the traceback. IIRC, Zope logs conflict error tracebacks at the BLATHER log level (as well as a deluge of other ancillary info). However, even if BLATHER logging mode is not on, if no obvious error is put in the event log when a conflict error is relayed to a user, that's definitely a bug. I'd believe it in a second! ;-) The Zope conflict exception catching code is written in such a complicated way (and without the benefit of any automated tests) that tracking that down could take an entire day which I don't have to burn ATM. So I'm afraid the status quo will prevail until someone gets so indignant about it that they either pay for it to be fixed or fix it themselves. Apologies for that. 
:-(

- C
Re: [Zope-dev] How bad _are_ ConflictErrors
These are order of magnitude numbers and are highly variable. The 1% number is way bigger than I am comfortable with although I have no basis to scale my expectations. I'd be much happier were it a couple of orders of magnitude smaller.

I would be too. It's considerably difficult when ZODB is used as the sessioning backend. A lot of effort has been put in to reducing the potential for conflicts already. It could of course be better if more time was put in, but there hasn't been any reason (besides a sense of accomplishment and contribution to the greater good, anyway ;-) to put in that effort since the last time this machinery was overhauled.

I should also say that without the benefit of knowing whether you've taken the advice of turning the knobs available to you that help reduce conflicts (bumping up timeout resolution, turning off inband housekeeping, using a local database rather than a ClientStorage-backed database for session data), that we enumerated in previous emails, it's hard to know whether doing any more work would be beneficial.

- C
Re: [Zope-dev] How bad _are_ ConflictErrors
On Mon, 21 Nov 2005, Chris McDonough wrote: On Nov 21, 2005, at 2:10 PM, Dennis Allison wrote: Conflicts and how they interact with the database and sessioning machinery is my hot button right at the moment )-: I Hope I have not included too much information. I ran a quick report and we see about 1000 conflicts per hour at about 12 hits per hour. Is this the number of log messages that indicate a conflict error occurred (e.g. x conflict errors since DATE messages in the event log) or the number of conflict errors that are retried more than three times and thus make it out to the app user? I'm guessing the former. *** you are correct -- this is the easy hack on the event.log. It's much harder to know how many make it out to the user. We have an associated bug in the MySQL interface which generates threading errors, apparently triggered by a conflict error and the subsequent backout. These occur with most conflicts which involve the database--almost every conflict with our system structure. These are order of magnitude numbers and are highly variable. The 1% number is way bigger than I am comfortable with although I have no basis to scale my expectations. I'd be much happier were it a couple of orders of magnitude smaller. I would be too. It's considerably difficult when ZODB is used as the sessioning backend. A lot of effort has been put in to reducing the potential for conflicts already. It could of course be better if more time was put in, but there hasn't been any reason (besides a sense of accomplishment and contribution to the greater good, anyway ;-) to put in that effort since the last time this machinery was overhauled. *** I've moved from a ZODB sessioning backend to local sessioning. There has not been a significant change, I think because the MySQL problem dominates at the moment. That said, if no conflict errors actually bubble up to the user using the application, the penalty is just app performance and knowledge expense (e.g. 
you can't use a nontransactional mailhost, you can't use a nontransactional database table, etc). You've already paid for the latter the hard way. ;-) I can't judge the expense of the former to you but I assume that's what you're primarily worried about now. *** Right now, we have major problems with our transactional database and locks. Once that gets resolved, we will address how to refactor to minimize the cost of transactions and ensure correctness in the presence of conflicts. Correctness is already pretty much guaranteed with our current systems structure. Conflict errors are not always errors. The real reason they're called errors is only because they're implemented as Python exceptions. They are implemented as exceptions because it was the easiest mechanism to use (exceptions are already built into Python). As I understand it, Zope retries when a conflict occurs and usually is able to commit both sides of the conflicting transaction. There can be more than two sides (actually there always are... there are three.. the two conflicting in-progress connection states and the database state). Sometimes Zope cannot commit conflicting transactions--and it is at that point that an error occurs. An exception occurs, yes. Oops, I just realized Tim responded to the rest of these points, so I won't go on. *** Yes, he did. THANKS TIM for your comments and help. (And you too Chris) We do have occasional instances where unresolved conflicts raise user visible diagnostics. These are real errors. While I have not explored the reasons why, it appears that at least some of these errors are not logged in event.log but only displayed to the user. 
To be pedantic, if you're right about conflict error tracebacks being shown to end users, it's not because they are unresolved (in the sense that 'application-level conflict resolution' could have prevented them), it's because a request was issued that resulted in a conflict error, which was retried, and then that retried request raised a conflict error, and then twice more. The only way to figure out what's going on here is to see the traceback. IIRC, Zope logs conflict error tracebacks at the BLATHER log level (as well as a deluge of other ancillary info). However, even if BLATHER logging mode is not on, if no obvious error is put in the event log when a conflict error is relayed to a user, that's definitely a bug. I'd believe it in a second! ;-) *** have done that but no helpful results as of yet. The Zope conflict exception catching code is written in such a complicated way (and without the benefit of any automated tests) that tracking that down could take an entire day which I don't have to burn ATM. So I'm afraid the status quo will prevail until someone gets so indignant about