Re: [Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
On Thu, 2003-05-29 at 01:08, Jeffrey P Shell wrote:
> Thanks for the information. Is it safe at all to try to catch a ConflictError during the critical part of the code, log some information, and then reraise the error to let the system do what it needs?

Sure, but I'm not sure what that buys you in your case. The system will still retry the request if you reraise a conflict error. And it would be spotty coverage at best; it's almost impossible to know where a ConflictError might be raised. The only reasonable solution would be to change ZPublisher's default behavior to not retry requests on conflict errors, which is probably not what you want either.

> I guess you're right though - it's hard to know when it will occur. In the production system, in this particular method, there are only two known persistent object interactions. At the end of the entire method, after a notification email has been sent, I have something like:
>
>     session['pieces'] = {}
>
> (session['pieces'] was a dictionary of {item_id:integer} bits. It never gets large for an individual user). I think that the one recent case of desync'd data happened when we got to this point. Since it's at the very end of the script (no more writes are expected beyond this point), I imagine that a get_transaction().commit() might be OK to precede this statement, just so that even if any conflicts happen when trying to write back to the session, we at least have synchronized data between the two systems. Although, prior to this, there are a few reads of this session data.
> Might it be safer to do something like this at the top of the method?:
>
>     pieces = session['pieces'].copy()

    pieces = session.get('pieces', {})

..at the top of the method might be better, particularly because you'll need to explicitly resave the dictionary into the session like so at the end of the method anyway:

    session['pieces'] = pieces

(standard persistence rules apply to session data as well, so you need to re-store basic types after you mutate them if you want the changes to persist). We've also found that accessing session data early in the request can help reduce the number of conflicts that happen later in the request. See http://mail.zope.org/pipermail/zope-dev/2003-March/019081.html for more information.

> I apologize if this post is making little sense (or stupid sense) - dealing with threads, locks, conflicts, etc, has been the part of Zope I've understood the least. I like that for the most part I don't have to think about it, but I don't know where to go for [fairly] current documentation on how to deal with it for those rare times I do.

FWIW, the Zope Book 2.6 edition session chapter speaks a bit to what conflict errors are. The ZDG persistence chapter talks a bit about threading and concurrency.

> The other persistent data write occurs earlier in the method: an object that generates serial numbers based off of some simple data in a PersistentMapping gets updated. I think that PersistentMapping has become fairly large by now. It maps the item_id referenced above to a regular dictionary containing three key/value pairs each. I make sure to follow the rules of persistence when dealing with these dictionaries-within-a-PersistentMapping, but I'm guessing that an OOBTree might be better instead. I still don't understand the potential pitfalls of Zope/ZODB BTrees (I keep reading about 'bucket splits' causing conflicts, and I don't know if that would be better or worse than any pitfalls a PersistentMapping gives).
Know that any change to a PersistentMapping needs to load and repersist the entire data set in the mapping when a key or value is updated or added. It is very likely that this will cause a conflict, particularly when two threads try to do this at once. OTOH, a BTree is made up of many other persistent subobjects, and there is less of a chance (but still a good chance) that two concurrent accesses to a BTree will cause a conflict error.

> Finally, the system in question has a few (three? four?) public Zope sites using the same session storage. Is there any documentation, notes, etc, about fine tuning the default session storage set up to handle large sites (or groups of sites) with fewer conflicts?

The best source of docs for sessions is the 2.6 Zope Book sessions chapter. The maillist thread that I mentioned above gives some information from Toby Dickenson about accessing session data early in a transaction to reduce the possibility of read conflicts.

> Thanks again for the help. I'll take a look at MailDropHost. Maybe I'll have to wrap another gateway around the gateway to the external system to try to catch these conflict situations. Fortunately, the critical area only occurs once in the current copy of the code. Hopefully that will make it easier to protect.

Good luck!

- C

> Thanks again,
> Jeffrey
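The read-early, mutate-a-local, reassign-at-the-end advice in this message can be sketched roughly as follows. This is only a toy model: ToySession is a hypothetical stand-in (not Zope's session object) that records data only on item assignment, which is an assumption made here to mimic the persistence rule for plain dictionaries.

```python
import copy


class ToySession:
    """Hypothetical stand-in for a session: only item assignment stores data."""

    def __init__(self):
        self._saved = {}

    def get(self, key, default=None):
        # Hand back a working copy, as if freshly loaded from storage.
        return copy.deepcopy(self._saved.get(key, default))

    def __setitem__(self, key, value):
        self._saved[key] = copy.deepcopy(value)


session = ToySession()
session["pieces"] = {"a1": 2}

pieces = session.get("pieces", {})        # read early, at the top of the method
pieces["b7"] = 1                          # mutate the local dictionary only
before_resave = session.get("pieces", {})  # the stored data has not changed yet
session["pieces"] = pieces                # reassign so the change is recorded
after_resave = session.get("pieces", {})

print(before_resave, after_resave)
```

In this model the in-place change is invisible to the store until the final assignment, which is the reason for the explicit resave at the end of the method.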
Re: [Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
On Thursday, May 29, 2003, at 07:32 AM, Chris McDonough wrote:
> On Thu, 2003-05-29 at 01:08, Jeffrey P Shell wrote:
>> Thanks for the information. Is it safe at all to try to catch a ConflictError during the critical part of the code, log some information, and then reraise the error to let the system do what it needs?
>
> Sure, but I'm not sure what that buys you in your case. The system will still retry the request if you reraise a conflict error. And it would be spotty coverage at best; it's almost impossible to know where a ConflictError might be raised. The only reasonable solution would be to change ZPublisher's default behavior to not retry requests on conflict errors, which is probably not what you want either.

Changing ZPublisher doesn't sound like fun times. I think I'll avoid that one :). The main thing that I want to catch is the fact that this event has occurred so that we can (a) get notified by monitoring software that something is screwy, (b) have a record identifier in the logs that would help us clear the right record out of the external system as soon as possible, and (c) let us know with accuracy when this situation occurs. We've only had one report of this happening, but we have some other suspicious data that we're not sure about.

I have an idea on how to handle it, thanks to your TransactionManager suggestion. We still need to send the data out through the gateway immediately (we can't wait for a transaction commit), but might be able to do this:

    def send(self, data):
        self._txn_note = data['item_id']
        ...

    def abort(self, reallyme, t):
        if self._txn_note:
            LOG('Outro', PROBLEM,
                'Received abort after sending transaction id %s' % self._txn_note)
        self._txn_note = None

    def tpc_finish(self, transaction):
        self._txn_note = None

It's a rough view of the transaction methods that I think I'll have to implement to do this. It might make a good howto/recipe ultimately. *shrug*.
> We've also found that accessing session data early in the request can help reduce the number of conflicts that happen later in the request. See http://mail.zope.org/pipermail/zope-dev/2003-March/019081.html for more information.

Thanks. Quite an interesting read (I got a bit of a ways through it and have bookmarked it for closer evaluation when I get to the office).

>> I apologize if this post is making little sense (or stupid sense) - dealing with threads, locks, conflicts, etc, has been the part of Zope I've understood the least. I like that for the most part I don't have to think about it, but I don't know where to go for [fairly] current documentation on how to deal with it for those rare times I do.
>
> FWIW, the Zope Book 2.6 edition session chapter speaks a bit to what conflict errors are. The ZDG persistence chapter talks a bit about threading and concurrency.

A little off-topic: any idea when the 2.6 ZB can be marked as current? Or are we waiting for Zope 2.7 to do that? ;) That documentation has been helpful. Although I still don't know when Zope/ZODB start new threads. Is it for every REQUEST?

> Know that any change to a PersistentMapping needs to load and repersist the entire data set in the mapping when a key or value is updated or added. It is very likely that this will cause a conflict, particularly when two threads try to do this at once. OTOH, a BTree is made up of many other persistent subobjects, and there is less of a chance (but still a good chance) that two concurrent accesses to a BTree will cause a conflict error.

I'll move that data structure to a BTree. I have to do some other work on that code anyways.

> The best source of docs for sessions is the 2.6 Zope Book sessions chapter. The maillist thread that I mentioned above gives some information from Toby Dickenson about accessing session data early in a transaction to reduce the possibility of read conflicts.
>
> Good luck!

Thanks. I owe you yet more beer.
-- Jeffrey P Shell [EMAIL PROTECTED] ___ Zope-Dev maillist - [EMAIL PROTECTED] http://mail.zope.org/mailman/listinfo/zope-dev ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope )
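Jeffrey's abort-logging idea from this message can be filled out into something runnable along these lines. It is a sketch under assumptions: the standard logging module stands in for Zope's LOG call, NotingGateway and ListHandler are hypothetical names, and the step that registers the object with the transaction machinery (so that abort and tpc_finish actually get called) is left out.

```python
import logging

log = logging.getLogger("Outro")


class ListHandler(logging.Handler):
    """Collects formatted log messages in a list so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


class NotingGateway:
    """Hypothetical gateway wrapper: sends at once, remembers what it sent."""

    def __init__(self, deliver):
        self._deliver = deliver      # callable doing the real external write
        self._txn_note = None

    def send(self, data):
        self._deliver(data)                  # goes out immediately
        self._txn_note = data["item_id"]     # remember it for this transaction

    def abort(self, transaction=None):
        # An abort after a send means the external system may be out of sync.
        if self._txn_note is not None:
            log.error("Received abort after sending item id %s", self._txn_note)
        self._txn_note = None

    def tpc_finish(self, transaction=None):
        self._txn_note = None                # committed cleanly, nothing to report


captured = ListHandler()
log.addHandler(captured)

sent = []
gateway = NotingGateway(sent.append)
gateway.send({"item_id": 101})
gateway.abort()                  # first attempt rolled back: logged
gateway.send({"item_id": 101})
gateway.tpc_finish()             # retried attempt committed: nothing logged
print(captured.messages)
```

This does not prevent the duplicate send; it only leaves a record identifier in the log, which is what points (a) through (c) in the message ask for.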
Re: [Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
Jeffrey P Shell wrote at 2003-5-28 19:33 -0600:
> I need to know more about Conflict Errors. We're running into a lot of them lately, it seems, on production Zope 2.6.1 sites (running on FreeBSD). The primary culprit seems to be Temporary Storage/Sessions.

When the conflicts are ReadConflictErrors, then my Snapshot Isolation patch may help you.

When the conflicts are (write) ConflictErrors, then application specific conflict resolution might reduce the probability of Conflicts. You must not expect to get rid of all (write) ConflictErrors.

> Something that has happened, and is causing a small amount of alarm, is that a large method that interfaces to external non-transactional systems seems to (on occasion) send their information off to that external system twice, but there's only one matching set of Zope data. As the two writes to the non-transactional system are very close to each other and contain nearly identical data (except for one bit that gets regenerated in the method), and there are conflict INFO reports in the Event Log from around the same time, I'm assuming that a conflict error is happening somewhere in this method and causing the transaction to be retried (if I'm understanding how Conflict Errors work). Zope and the relational databases seem to do things fine with rolling back the data, but the non-transactional systems now have duplicate data that they **absolutely should not have**.

If this *MUST* not happen, you cannot use your current system...

You can probably make your non-transactional system transactional very easily. Jens (Vagenpohl, maybe slightly misspelled) has a transactional mailhost drop in. Analysing it will give you some hints on how to proceed with your system.

Dieter
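One way to read the suggestion of making the external system transactional is to queue outgoing data and deliver it only once the transaction has committed. The sketch below is a toy illustration of that idea, not MailDropHost's code: BufferedGateway is a hypothetical class, the method names abort and tpc_finish are borrowed from elsewhere in this thread, and the registration with the real transaction machinery is omitted. It only applies where the send can wait until commit.

```python
class BufferedGateway:
    """Hypothetical wrapper: queue sends, deliver only after a commit."""

    def __init__(self, deliver):
        self._deliver = deliver      # callable doing the real external write
        self._pending = []

    def send(self, data):
        self._pending.append(data)   # nothing leaves the process yet

    def tpc_finish(self, transaction=None):
        # The transaction committed, so the queued data may go out now.
        pending, self._pending = self._pending, []
        for data in pending:
            self._deliver(data)

    def abort(self, transaction=None):
        # The transaction was rolled back (for example before a retry).
        self._pending = []


delivered = []
gateway = BufferedGateway(delivered.append)

gateway.send({"item_id": 7})   # first attempt
gateway.abort()                # conflict: the attempt is discarded
gateway.send({"item_id": 7})   # retried attempt
gateway.tpc_finish()           # commit succeeds, one delivery happens

print(delivered)
```

Because the aborted attempt never reaches the external system, the retry produces a single delivery instead of two.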
[Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
I need to know more about Conflict Errors. We're running into a lot of them lately, it seems, on production Zope 2.6.1 sites (running on FreeBSD). The primary culprit seems to be Temporary Storage/Sessions.

Something that has happened, and is causing a small amount of alarm, is that a large method that interfaces to external non-transactional systems seems to (on occasion) send their information off to that external system twice, but there's only one matching set of Zope data. As the two writes to the non-transactional system are very close to each other and contain nearly identical data (except for one bit that gets regenerated in the method), and there are conflict INFO reports in the Event Log from around the same time, I'm assuming that a conflict error is happening somewhere in this method and causing the transaction to be retried (if I'm understanding how Conflict Errors work). Zope and the relational databases seem to do things fine with rolling back the data, but the non-transactional systems now have duplicate data that they **absolutely should not have**.

This doesn't happen often, but (as stated), this is a critical operation that needs to be better protected. All other exceptions and bits and pieces in the block of code in question have been tested thoroughly and we have not had any other problems that cause erroneous writes. Is there a way I can protect against Conflict Error retries as well? Is there some sort of Try/Except or Try/Finally I can wrap around the code that won't interfere with the ZODB? Is there any other sort of best-practice here that could help me (and others) who might unknowingly trigger this problem?

I know there are some fixes likely to be in Zope 2.6.2 that may help with the situation, but I'd like to put extra protections around this code regardless of what may be coming in the future.
Thanks in advance,
Jeffrey P Shell [EMAIL PROTECTED]
Re: [Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
On Wed, 2003-05-28 at 21:33, Jeffrey P Shell wrote:
> Something that has happened, and is causing a small amount of alarm, is that a large method that interfaces to external non-transactional systems seems to (on occasion) send their information off to that external system twice, but there's only one matching set of Zope data. As the two writes to the non-transactional system are very close to each other and contain nearly identical data (except for one bit that gets regenerated in the method), and there are conflict INFO reports in the Event Log from around the same time, I'm assuming that a conflict error is happening somewhere in this method and causing the transaction to be retried (if I'm understanding how Conflict Errors work). Zope and the relational databases seem to do things fine with rolling back the data, but the non-transactional systems now have duplicate data that they **absolutely should not have**.

Within Zope, when a conflict error is raised, ZPublisher catches the exception and retries the request up to 3 times. This is why sometimes, for example, you'll see double email notifications from Wiki subscriptions on zope.org.

> This doesn't happen often, but (as stated), this is a critical operation that needs to be better protected. All other exceptions and bits and pieces in the block of code in question have been tested thoroughly and we have not had any other problems that cause erroneous writes. Is there a way I can protect against Conflict Error retries as well? Is there some sort of Try/Except or Try/Finally I can wrap around the code that won't interfere with the ZODB? Is there any other sort of best-practice here that could help me (and others) who might unknowingly trigger this problem?

Not infallibly. You can really never know where a ConflictError might be raised. Any concurrent access to a persistent object is a possible candidate.
> I know there are some fixes likely to be in Zope 2.6.2 that may help with the situation, but I'd like to put extra protections around this code regardless of what may be coming in the future.

It will only get worse with 2.6.2: the number of conflict errors caused by the sessioning machinery in 2.6.2 is going to go up as compared to 2.6.1 and below. This is because the strategy currently used to reduce the number of conflict errors causes data desynchronization problems.

Some folks have created products that mimic transactional semantics (like Jens' MailDropHost) to avoid this kind of problem. You might want to try the same...

- C
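The retry behaviour described in this reply, and the way it duplicates external writes, can be illustrated with a toy model. Nothing here is ZPublisher code: the ConflictError class is a local stand-in, and publish, handler and external_system are hypothetical names. The retry count of 3 follows the figure given in the message.

```python
class ConflictError(Exception):
    """Local stand-in for the ZODB conflict exception (illustration only)."""


def publish(request_handler, max_retries=3):
    """Toy publisher: run the handler, re-running it after a conflict."""
    for attempt in range(max_retries + 1):
        try:
            return request_handler(attempt)
        except ConflictError:
            if attempt == max_retries:
                raise
            # The database changes of this attempt are rolled back here,
            # but nothing undoes writes that already left the process.


external_system = []   # stands in for the non-transactional gateway


def handler(attempt):
    external_system.append("item-42")    # irreversible external write
    if attempt == 0:
        # Simulate a conflict on the first attempt only.
        raise ConflictError("session write conflicted")
    return "ok"


result = publish(handler)
print(result, external_system)
```

The request finishes normally on the second attempt, yet the external list holds two entries for one committed transaction, which matches the duplicate data reported at the start of the thread.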
Re: [Zope-dev] [CRITICAL] Conflict Errors, Transactions, Retries, Oh My....
On Wednesday, May 28, 2003, at 10:19 PM, Chris McDonough wrote:
>> This doesn't happen often, but (as stated), this is a critical operation that needs to be better protected. All other exceptions and bits and pieces in the block of code in question have been tested thoroughly and we have not had any other problems that cause erroneous writes. Is there a way I can protect against Conflict Error retries as well? Is there some sort of Try/Except or Try/Finally I can wrap around the code that won't interfere with the ZODB? Is there any other sort of best-practice here that could help me (and others) who might unknowingly trigger this problem?
>
> Not infallibly. You can really never know where a ConflictError might be raised. Any concurrent access to a persistent object is a possible candidate.

Thanks for the information. Is it safe at all to try to catch a ConflictError during the critical part of the code, log some information, and then reraise the error to let the system do what it needs?

I guess you're right though - it's hard to know when it will occur. In the production system, in this particular method, there are only two known persistent object interactions. At the end of the entire method, after a notification email has been sent, I have something like:

    session['pieces'] = {}

(session['pieces'] was a dictionary of {item_id:integer} bits. It never gets large for an individual user). I think that the one recent case of desync'd data happened when we got to this point. Since it's at the very end of the script (no more writes are expected beyond this point), I imagine that a get_transaction().commit() might be OK to precede this statement, just so that even if any conflicts happen when trying to write back to the session, we at least have synchronized data between the two systems. Although, prior to this, there are a few reads of this session data.
Might it be safer to do something like this at the top of the method?:

    pieces = session['pieces'].copy()

I apologize if this post is making little sense (or stupid sense) - dealing with threads, locks, conflicts, etc, has been the part of Zope I've understood the least. I like that for the most part I don't have to think about it, but I don't know where to go for [fairly] current documentation on how to deal with it for those rare times I do.

The other persistent data write occurs earlier in the method: an object that generates serial numbers based off of some simple data in a PersistentMapping gets updated. I think that PersistentMapping has become fairly large by now. It maps the item_id referenced above to a regular dictionary containing three key/value pairs each. I make sure to follow the rules of persistence when dealing with these dictionaries-within-a-PersistentMapping, but I'm guessing that an OOBTree might be better instead. I still don't understand the potential pitfalls of Zope/ZODB BTrees (I keep reading about 'bucket splits' causing conflicts, and I don't know if that would be better or worse than any pitfalls a PersistentMapping gives).

Finally, the system in question has a few (three? four?) public Zope sites using the same session storage. Is there any documentation, notes, etc, about fine tuning the default session storage set up to handle large sites (or groups of sites) with fewer conflicts?

Thanks again for the help. I'll take a look at MailDropHost. Maybe I'll have to wrap another gateway around the gateway to the external system to try to catch these conflict situations. Fortunately, the critical area only occurs once in the current copy of the code. Hopefully that will make it easier to protect.

Thanks again,
Jeffrey
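The catch, log and re-raise approach asked about in this message might look roughly like the sketch below. It assumes a local stand-in for the exception (in a real Zope 2 setup the class would be imported from ZODB.POSException), and critical_section, send and write_session are hypothetical names. As noted elsewhere in the thread, coverage is spotty, since a conflict can also surface outside the guarded block.

```python
import logging

log = logging.getLogger("gateway")


class ConflictError(Exception):
    """Local stand-in; real code would import ZODB.POSException.ConflictError."""


def critical_section(send, write_session, item_id):
    """Send externally, then write persistent data, logging any conflict."""
    send(item_id)                      # external, non-transactional write
    try:
        write_session()                # persistent write that may conflict
    except ConflictError:
        log.error("Conflict after sending item %s; the request may be retried",
                  item_id)
        raise                          # bare raise keeps the original exception


sent = []


def failing_write():
    raise ConflictError("simulated conflict")


try:
    critical_section(sent.append, failing_write, 5)
    reraised = False
except ConflictError:
    reraised = True

print(reraised, sent)
```

Re-raising with a bare raise leaves the exception untouched, so whatever retry handling sits above this code still sees the conflict.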