On 06/02/2011 07:00 PM, Raphaël Valyi - http://www.akretion.com wrote:
> Olivier, sorry, I still don't get it:
> if the sequence is locked by some batch (like the MRP), what is the
> benefit of "not waiting even a second before failing".
> Can you cite a use case where there is a benefit in not waiting even
> for a second?
Have a look at the example transactions in the text file I attached; they show why waiting is useless.

It's simple: concurrent updates of the same row(s) by interleaved transactions are simply not allowed under snapshot isolation, regardless of how long you are willing to wait. If two transactions start in parallel and try to update the same row(s), only one can succeed, and the second one will fail 100% of the time. Even after waiting for the first transaction to finish, the second one is still only allowed to see the "snapshot" of the database as it was when its own transaction started. Because that snapshot is now stale, updating rows in it is not allowed and fails with "ERROR: could not serialize access due to concurrent update".

To convince yourself, start 2 psql clients in parallel, begin a serializable transaction in each, and try to update the same row. The second one will fail as soon as the first commits, e.g.:

  test=# BEGIN ISOLATION LEVEL SERIALIZABLE;
  BEGIN
  test=# -- dummy SELECT to really start the transaction:
  test=# SELECT count(id) FROM res_company;
  (...)
  test=# UPDATE res_users SET write_date = now() WHERE id = 1;
  ERROR: could not serialize access due to concurrent update
  test=#

Please don't reply again to this bug until you've actually played with that and tried the different variations ;-) (and remember that a transaction only starts when you access the database at least once after the BEGIN).

> You say: "instead of waiting a bit to fail later anyway."

Yes, exactly. Without SELECT FOR UPDATE NOWAIT, you will still fail 100% of the time; it will just take longer to notice. No amount of waiting *during the transaction* will help. This is a consequence of snapshot isolation, PostgreSQL's closest match for serializable isolation.
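The mechanism above can be sketched as a toy model of snapshot isolation in pure Python (no real database involved; `Row`, `Transaction` and `SerializationError` are illustrative names, not PostgreSQL or OpenERP API). The point it demonstrates: the second transaction fails because its snapshot is stale, and sleeping before the update changes nothing.

```python
import time

class SerializationError(Exception):
    """Stand-in for 'could not serialize access due to concurrent update'."""

class Row:
    def __init__(self, value):
        self.value = value
        self.version = 0   # bumped on every committed update

class Transaction:
    """Drastically simplified serializable transaction: it remembers the
    row version it saw at its first access (its 'snapshot')."""
    def __init__(self, row):
        self.row = row
        self.snapshot_version = row.version   # snapshot fixed at start

    def update(self, new_value, wait_seconds=0.0):
        # Waiting cannot help: the snapshot was already taken above.
        time.sleep(wait_seconds)
        if self.row.version != self.snapshot_version:
            raise SerializationError(
                "could not serialize access due to concurrent update")
        self.row.value = new_value
        self.row.version += 1   # commit

# Two interleaved transactions on the same row:
row = Row(value=1)
t1 = Transaction(row)
t2 = Transaction(row)          # starts while t1 is in flight
t1.update(2)                   # the first committer wins
try:
    t2.update(3, wait_seconds=0.1)   # waiting does not help
except SerializationError as e:
    print("t2 failed:", e)
```

However long `wait_seconds` is, `t2` always fails, which is exactly why adding a delay before the failure is pure waste.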
You need to start a new transaction and try again to have a chance of overcoming that, but doing so automatically in the code of stock or sequence would mean breaking the atomicity of the enclosing transaction, so that's not possible.

> If we take the MRP batch case, or other batches like we have with
> base_sale_multichannels, there is often a commit after a record is
> processed, and I believe this is a good practice as it would prevent
> a full rollback just because of some specific record issue.

Sure, during batch processing it makes sense to consider each item on its own and process it in its own transaction. But the core code of stock or sequence is much lower-level, so it cannot do rogue commits nor escape its enclosing transaction.

> Since there are commits, and since the batch spends time doing other
> things* before locking the ir_sequence table, unless I miss something
> it means that there are small time periods while the ir_sequence is
> not locked. Am I wrong?
> Then, unless I miss something, if we wait say up to 1 sec when
> requesting a sequence number, wouldn't we get a chance to obtain our
> sequence number during those small periods when it's not locked?

That won't work, for the same reason: the transactions would be interleaved, so any transaction that starts while another one increments the sequence is going to fail no matter what. Looping a few times would not work either; once the transaction has begun, it's too late. You would need to retry the whole transaction from scratch (batch processing can and should do that).

> And overall, what is the benefit of waiting 0 milliseconds before
> blowing up on a locked sequence table exception? Any use case where
> this really brings value?

Two benefits: you cut out the useless wait (you're going to fail *anyway*, just later), and you remove a possible source of deadlocks, because any transaction that is waiting is such a source. Good enough?
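To make "retry the whole transaction" concrete, here is a minimal retry wrapper of the kind a batch-processing layer could use. It is a sketch, not OpenERP code: `SerializationError` stands in for the database's serialization-failure error, and `run_transaction` is any callable that executes one complete, fresh transaction.

```python
import random
import time

class SerializationError(Exception):
    """Stand-in for the database's serialization-failure error."""

def retry_transaction(run_transaction, max_attempts=5):
    """Re-run the *entire* transaction when it loses a serialization conflict.

    Each attempt is a fresh transaction and therefore gets a fresh
    snapshot; retrying *inside* the old transaction would fail forever.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return run_transaction()
        except SerializationError:
            if attempt == max_attempts:
                raise
            # Small randomized backoff so two batches don't retry in lockstep.
            time.sleep(random.uniform(0.01, 0.05) * attempt)

# Illustration: a transaction that loses the race twice, then succeeds.
attempts = {"n": 0}
def flaky_transaction():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise SerializationError(
            "could not serialize access due to concurrent update")
    return "committed"

print(retry_transaction(flaky_transaction))
```

Note that the retry boundary is the whole unit of work, which is precisely why this logic belongs in the batch loop (which commits per record anyway) and not inside stock or sequence.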
> * I even consider very seriously introducing some sleep(a few millisecs)
> at least in the base_sale_multi_channels import batches if that would
> help make other processes more fluid (and maybe we could optionally do
> it in the MRP batch too).

If you put a sleep() right after the cr.commit(), you are indeed decreasing the chance of conflicts with manual transactions. Conflicts can still occur, just less often.

Wrapping up again, our suggested solution to improve the situation is to have two different behaviors: general sequences would be implemented with PostgreSQL sequences, with no locking but no guarantee about gaps, and good performance. Only certain accounting sequences would be implemented with the gapless, locking sequences, because that is still the only reliable way we can guarantee the absence of gaps. This means user actions and batch processes should still be prepared to face a temporary error in *rare* cases. We could have a temporary exception class in the exception hierarchy for that, easy to distinguish from permanent exceptions and to handle appropriately in batch processing.

--
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.
https://bugs.launchpad.net/bugs/746620

Title:
  Implementation of faster gap-tolerant sequences

Status in OpenERP Server:
  Confirmed

Bug description:
  OpenERP is classified as Enterprise-class software, meaning that more
  than one person (at least 2) should be able to use the system
  simultaneously: packers shipping products, taking orders, reserving
  products. I find that, in fact, simultaneous creation of stock moves
  is not possible. Say I am importing orders from a shop; this creates
  stock moves because the orders are paid. At the same time I am
  shipping products, which is normal for an ERP system. I might also
  import orders from a separate shop, say a POS system. It is not
  possible!
  [2011-03-31 13:10:47,657][midwestsupplies] WARNING:stock.location:Failed
  attempt to reserve 1.0 x product 1669, likely due to another transaction
  already in progress. Next attempt is likely to work. Detailed error
  available at DEBUG level.

  OperationalError: could not obtain lock on row in relation "ir_sequence"

  Two different errors. I can only perform one action at a time! What
  happens is that any time the program calls for a stock_move, it locks
  the stock_move table so no other process can access it, which means
  you can hardly do anything unless it is done individually. Any MS
  system, or any Enterprise system, would be able to handle many
  simultaneous actions, but this program has a serious architecture flaw
  in not being able to support this.

_______________________________________________
Mailing list: https://launchpad.net/~c2c-oerpscenario
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~c2c-oerpscenario
More help   : https://help.launchpad.net/ListHelp
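P.S. The two-tier proposal above (gap-tolerant PostgreSQL sequences by default, locking gapless sequences only for accounting, plus a distinguishable temporary exception) could be sketched like this. All class names are hypothetical, not actual OpenERP code, and the `_locked` flag merely stands in for a real SELECT FOR UPDATE NOWAIT on the ir_sequence row.

```python
import itertools

class TemporaryError(Exception):
    """Transient failure: the caller may retry the whole transaction."""

class GapTolerantSequence:
    """Fast path: behaves like a PostgreSQL sequence. Never blocks, but a
    number handed out inside a rolled-back transaction is lost (a gap)."""
    def __init__(self, start=1):
        self._counter = itertools.count(start)

    def next(self):
        return next(self._counter)

class GaplessSequence:
    """Slow path for accounting: takes a lock and fails *immediately* when
    the row is already locked, instead of waiting only to fail later."""
    def __init__(self, start=1):
        self._value = start
        self._locked = False

    def next(self):
        if self._locked:
            # Temporary, retryable error, easy to tell apart from
            # permanent errors in a batch-processing loop.
            raise TemporaryError("sequence row is locked; retry the transaction")
        self._locked = True   # released on commit/rollback in a real system
        value = self._value
        self._value += 1
        return value

    def release(self):
        """Stands in for the commit/rollback that frees the row lock."""
        self._locked = False
```

A batch loop would then catch `TemporaryError`, roll back, and retry the whole record, while letting permanent exceptions propagate.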

