On 06/07/2011 12:11 AM, Raphaël Valyi - http://www.akretion.com wrote:
> Some additional remarks: for me the gap-enabled solution might have some
> benefits in some situations, but it is still not a silver bullet. Take
> Kyle's case: we are importing orders from both Magento and OpenbravoPOS in
> batches, and there are several hundred such orders a day. We certainly
> don't want holes in the invoice and accounting sequences (POS or credit
> card orders are automatically reconciled by base_sale_multichannels), and
> the probability of concurrent access is still huge.
Yes, you would apparently be in a situation where you need gapless sequences
under a high-contention workload. True serializability would be required to
address this with no risk of locking issues, but PostgreSQL does not
implement it, for performance reasons. Even the new Serializable Snapshot
Isolation (SSI) mode introduced in PostgreSQL 9.1 won't help here: it only
detects more serialization anomalies, and raises a similar error when it
does. Apparently, performance under true serializability would be so poor
[1] that SSI performs better in all cases, even with some transactions
having to be replayed.

[1] http://wiki.postgresql.org/wiki/Serializable#Serializable_Isolation_Implementation_Strategies

> In this situation I would instead go for: at each order iteration, being
> able to strictly catch this error (I believe that wouldn't be too clean
> with today's code), and if the order process attempt failed because of
> that, then we wait some amount of time, retry, and repeat over and over.

That's the approach we are suggesting, and with a dedicated kind of
exception to detect, it should be very easy. In fact, we could implement
some sort of auto-retry at the RPC level whenever such a temporary lock
exception is raised. This would not cover batch processing that is not done
via RPC, but batch jobs could implement the same logic. And it would at
least help regular users, without much risk (given a limit on the number of
retries and a semi-random delay between them to avoid obvious issues).

> The only thing that I don't like with that is that some resource is
> wasted: when we start an iteration over, we are going to do quite a lot
> of things (import order data via WS or SQL, map fields...) before trying
> to get a number from ir_sequence, and all of that is wasted when we have
> to start the whole iteration cycle over.

Yes, this should be minimized as much as possible, but it cannot be 100%
avoided except with full serializability, which has a big performance
impact.
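To make the retry idea concrete, here is a minimal sketch of such an
auto-retry wrapper. The `TemporaryLockError` exception class is hypothetical
(it stands in for the dedicated exception type proposed above, which does not
exist in the server yet), and the retry limit and delays are illustrative
only:

```python
import random
import time


class TemporaryLockError(Exception):
    """Hypothetical dedicated exception for a transient lock failure."""


def auto_retry(func, max_retries=5, base_delay=0.1, sleep=time.sleep):
    """Call func(), retrying on TemporaryLockError with a capped number
    of attempts and a semi-random delay between them."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except TemporaryLockError:
            if attempt == max_retries:
                raise  # give up: surface the error to the caller
            # Semi-random back-off so concurrent callers don't retry
            # in lock-step and collide again immediately.
            sleep(base_delay * (attempt + 1) + random.uniform(0, base_delay))
```

An RPC dispatcher could wrap each incoming call in `auto_retry`, making
transient lock failures invisible to both users and business code, while a
batch job could call it explicitly around each order iteration.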
One obvious mitigation is to schedule the batch imports at different times
of the day (e.g. OBPOS vs Magento), and if possible outside of the peak
activity hours of regular users.

> Then I kind of think base_sale_multichannels, and possibly other batches
> like MRP, could have some kind of mutex (it could be a request on
> ir_sequence itself): we would wait until we can acquire the mutex, run an
> iteration cycle, fire a commit, and then release the mutex.
> What do you think about this approach?

This would be an ad-hoc implementation of serializability, and I think it
shares some of the same pitfalls. To work easily in a multi-instance
system, the mutex would need to be taken at the database level, e.g. with a
LOCK TABLE statement. Such an exclusive lock waits until all readers and
writers are done, and blocks new ones. As a consequence, it has a very high
chance of causing deadlocks, because it must wait for every single reader
to be gone: two transactions each doing SELECT then LOCK TABLE on the same
table would immediately deadlock each other. To avoid this, a dedicated
mutex table could be used, with the consequence that non-batch transactions
would not use the mutex (the LOCK TABLE needs to be taken immediately at
transaction start, otherwise it doesn't work, so this is difficult outside
of controlled jobs).

That last option might indeed be sufficient to drastically reduce the
chance of locking failures caused by two batch imports being executed in
parallel, because the imports would be serialized, leaving only conflicts
with manual transactions. But given the discussion above, it could well be
outperformed by simply retrying failed transactions under regular snapshot
isolation, though this probably depends a lot on the usage patterns and on
how the jobs are scheduled.
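A sketch of the dedicated-mutex-table variant, assuming a single-purpose
empty table (called `batch_mutex` here, a name invented for this example)
created once with `CREATE TABLE batch_mutex ();`. Note how the LOCK TABLE is
the very first statement of the transaction, so the job never holds other
locks while waiting for the mutex, which avoids the SELECT-then-LOCK
deadlock described above:

```python
def run_serialized_batch(conn, batch_fn):
    """Serialize batch jobs via an exclusive lock on a dedicated table.

    conn is any DB-API 2.0 connection to PostgreSQL; batch_fn(cursor)
    performs one batch iteration. The exclusive lock is released when the
    transaction ends, whether by commit or rollback.
    """
    cr = conn.cursor()
    try:
        # First statement of the transaction: blocks until any other
        # batch job holding the mutex has committed or rolled back.
        cr.execute("LOCK TABLE batch_mutex IN ACCESS EXCLUSIVE MODE")
        batch_fn(cr)
        conn.commit()    # releases the mutex
    except Exception:
        conn.rollback()  # also releases the mutex
        raise
    finally:
        cr.close()
```

Since only batch jobs call this helper, regular interactive transactions
never touch `batch_mutex` and are unaffected by it; they can still conflict
with a running batch on ir_sequence rows, which is exactly the residual risk
discussed above.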
My gut feeling is: KISS as much as possible, so perhaps this is sufficient:
 - most sequences would be lightweight, gap-tolerant and lock-free
 - accounting sequences and stock reservation would still do the right
   thing wrt locking, possibly raising a temporary-failure exception
 - the server would do some level of auto-retry at the RPC level for
   temporary exceptions: transparent to both users and business code
 - batch jobs that use accounting sequences or stock reservations must
   take care of retries themselves, and should be scheduled in their own
   time-frame as much as possible, so that lock failures are very rare and
   waste is minimized

-- 
You received this bug notification because you are a member of C2C
OERPScenario, which is subscribed to the OpenERP Project Group.
https://bugs.launchpad.net/bugs/746620

Title:
  Implementation of faster gap-tolerant sequences

Status in OpenERP Server:
  Confirmed

Bug description:
  OpenERP is classified as enterprise-class software, meaning that more
  than one person (at least two) should be able to use the system
  simultaneously: packers shipping products, taking orders, reserving
  products. I find that in fact it is not possible to create stock moves
  simultaneously. Say I am importing orders from a shop; this creates
  stock moves because the orders are paid. At the same time I am shipping
  products, which is normal for an ERP system. I might also import orders
  from a separate shop, say a POS system. It is not possible!

  [2011-03-31 13:10:47,657][midwestsupplies] WARNING:stock.location:Failed
  attempt to reserve 1.0 x product 1669, likely due to another transaction
  already in progress. Next attempt is likely to work. Detailed error
  available at DEBUG level.

  OperationalError: could not obtain lock on row in relation "ir_sequence"

  Two different errors. I can only perform one action at a time!
  What happens is that any time the program creates a stock move, it locks
  the stock_move table so that no other process can access it, which means
  you can hardly do anything unless it is done one action at a time. An MS
  system, or any enterprise system, would be able to handle many
  simultaneous actions; not supporting this is a serious architectural
  flaw in this program.

_______________________________________________
Mailing list: https://launchpad.net/~c2c-oerpscenario
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~c2c-oerpscenario
More help   : https://help.launchpad.net/ListHelp

