Thank you for the explanation, it is largely as I expected, but I am stil learning my way around the ofbiz way. I appreciate the help.
See inline comments below. -- Matt Warnock <[email protected]> RidgeCrest Herbals, Inc. On Fri, 2010-05-21 at 10:44 -0700, James McGill wrote: > I will try to explain what we mean by "special" here. > > The Entity delegator has a facility to do sequence generation, per entity. > That utility may or may not have a concurrency problem in distributed > environments. In any case, that's not being used here. > > Consider a scenario where you have two companies that you are doing invoices > for. Call them Company "AA" and company "BB", and let these be the > partyIds. Seems like this might arise even for separate warehouses of the same company, if each has a separate invoice sequence. > > Let us introduce a business rule that requires invoices to be numbered in > the sequence "AA1, AA2 ..." and "BB1, BB2 ...". > > This rule is implemented by OFBiz: In PartyAcctgPreference, keyed by > PartyId, there is a flag "INVSQ_ENF_SEQ". I think that stands for "Invoice > Sequence Enforce Sequential". In that PartyAcctgPreference, the sequence > number for that Party is stored as "lastInvoiceNumber". > > Let's walk through the single-thread scenario for createInvoice. > applications/accounting/script/org/ofbiz/accounting/invoice/InvoiceServices.xml#createInvoice > > The service opens a transaction. The first thing it does is call > createInvoiceId. createInvoiceId also opens a transaction. It looks up > PartyAcctgPreference for "AA". It checks INVSQ_ENF_SEQ. Seeing "true", it > then reads lastInvoiceNumber, increments it by one, writes it back to the > Entity, returns this value to createInvoice, committing its transaction. > createInvoiceId writes the Invoice and commits its transaction. Everything > is fine in a single thread, of course. An here is where it breaks. Nested transactions are one of the dicey areas of SQL. Some implementations don't even support them. PostgreSQL added this feature only in the last few years, and I don't know if I'd trust it on a multi-platform design like ofbiz. If it was all in one transaction, we'd be fine, I think, but it has to be the outer one. If the inner transaction commits, it should be provisional, and should ALSO roll back if the outer one does. > > In multiple threads, we see two threads reading the same lastInvoiceNumber > for a party, even though createInvoiceId and createInvoice each runs under a > transaction. To add to the puzzle, we see this happen even when those two > threads are more than a few seconds apart (an eternity). > > I'm still not convinced that putting createInvoiceId in a synchronized > section will solve this problem, because the service already uses a > transaction, which means it is already treated as a critical section. I > think that createInvoice itself should be exclusive to the thread as well, > since it looks like two threads enter createInvoice, then they both get the > same invoiceId from createInvoiceId even though that shouldn't be possible > because of the transaction. > > Once this happens, it becomes impossible to use the system until > lastInvoiceNumber is incremented for the party. I'm trying to write a test > case that triggers the bug so that we can show it's fixed by adding > synchronized to the minilang. > Another issue may be the fact that we are keeping a lastInvoiceNumber separate from the table. That seems like it might violate normalization rules. Why not look up the lastInvoiceNumber value in the Invoice table itself? As long as the lookup, increment, and write operations are all in the same transaction, you should be OK. If the ID field is properly indexed, a "SELECT invoiceID FROM invoice DESCENDING LIMIT 1 WHERE partyID = ?" shouldn't be enough slower than "SELECT lastInvoiceNumber FROM AccountingPref WHERE partyID = ?" to make the difference worthwhile. MySQL's non-standard "autoincrement" seems to do this internally. While I am not a fan of the fact that it is non-standard, it does have some advantages. Stored procedures could do the same thing in a more standard way. > > Now, MOST businesses will require that every invoice number is unique, > > and MANY will further require that all order/invoice/check numbers be > > accounted for, even on prenumbered forms (no "gaps") for auditing > > purposes. Is that "special" to ofbiz? > > > > No. At the risk of being repetitive, consider the case with multiple > parties, each of which needs sequential invoices. If the PK of Invoice were > simply a serial number, the normal sequence generator would work. But if we > need a serial number *per party*, then we need to generate a sequence per > party and use the partyId as a prefix. (At least this is the way OFBiz does > it). So this way, company AA can have invoiceId AA1, and company BB can > have invoiceId BB1, and these can each be used as a PK in the same column. > Which is fine. > > Alternatively, some businesses I have seen enforce a particular format > > on the invoice/order/document number (like OOOYYYYMMDDSSS, where OOO is > > the office number, followed by a date, and SSS is a sequence number > > within the day), and maybe that is where the trouble lies. In the US, > > Social Security numbers are generated on a "special" pattern like this. > > > > You would probably use "referenceNumber" for something like this. Your > "SSS" component of your field would get you to the same place as the bug > we're discussing, given a partyId of "OOO". > > You need an thread-safe/atomic method of generating SSS. Even though > getInvoiceId runs in a transaction, two threads get the same SSS. > I think we are on the same page here. I was envisioning multiple offices, like 528 for South California and 552 for Utah, to borrow from the Social Security number case. > > > In either event, I remain convinced, and on reviewing the thread, David > > Jones seems to agree, that this problem cannot be easily solved except > > with atomic SQL transactions. Is there some reason they are not being > > used in this situation? Or are they in use, and is the OP using a > > REALLY broken JDBC? > > > mysql-connector-java-5.1.8-bin.jar > > > > Or are they in use, but a record write has > > inadvertently snuck outside the transaction? > > > > That's what I'm wondering. As of MySQL 5.5 at least, transactions cannot be nested. http://dev.mysql.com/doc/refman/5.5/en/implicit-commit.html > > At our company, we have two people putting orders in all morning (and we > > are a very small company), and the chance of a collision in your > > scenario is NOT rare, but quite high indeed, as the OP has already > > pointed out, even on a single server with multiple users. > > > If you only generate them for a single PartyID and you don't have > orderSequenceEnumId=ODRSQ_ENF_SEQ > you won't ever hit this part of the service. > Understood. > > > The issue is not that the sequence generator is "slow" to store the new > > invoice number, but rather that "last ID" read, the "new ID" generator, > > and the "new ID" write are not ALL encased inside a transaction that > > will ensure that the database is ALWAYS internally consistent. The > > "race" occurs because there is no enclosing transaction, and the "race" > > will never be completely controlled by being "fast" enough. > > > > But there supposedly is an enclosing transaction. The services run in a > transaction by default, and just to be sure, I've explicitly set > use-transaction="true" for both. So it's like this: > > begin(Tx1) > createInvoice > begin(Tx2) > getInvoiceId > commit(Tx2) > commit(Tx1) > > It looks like the right idea to me, but it ends up not working, and I > haven't been smart enough (so far) to figure out why. > I'm almost certain nested transactions are the issue. > > > If there is a long lag between read and write (as where a user is > > entering order line items) then you either have to 1) do the > > read/increment/write at the END of the data entry process, or 2) face > > the high likelihood that two entry processes will collide, and one of > > them will have to roll back and then be renumbered. > > > > I don't see a way to do try/retry looping in minilang. I think you'd need > an ECA to handle the error condition, but I think there's nothing wrong with > Java (to sum up how I feel about minilang in general.) > > > > Most software I have seen is not very bright about this, and it reads > > and generates the new order number first, displaying it on the input > > screen as data is entered. > > > Please look at createInvoice and how it gets the Id. There's no UI event > involved, and since services run in a transaction by default, it doesn't > really look like the design is wrong. I might not even be right about the > source of the problem. But it breaks about once a week, on both Quote and > Invoice (which use the same approach) and the log definitely shows two > threads getting the same lastId. > > Whether you use a SQL standard "sequence" or not (or even the mysql > > "autoincrement" abomination), and regardless of the primary key status > > of the orderID (any SQL "unique" declaration could have the same effect, > > and a well designed order table should at LEAST have a unique orderID, > > whether it is part of the primary key or not), SQL transactions should > > resolve this issue easily and scalably. > > > > On the other hand, using locking or semaphores in the application code > > is reinventing the SQL wheel and fraught with possible error. As David > > suggested, there may be multiple front ends talking to a single > > database, so a SQL solution is best. I am merely respectfully > > suggesting that it would be better to let SQL do what it does best, and > > atomic transactions are the essence of good SQL. > > > > Hope this helps explain where I am coming from. > > > > I hope you will look at the code in question and I hope you can explain why > transactions haven't worked. > MySQL5, isolation-level="ReadCommitted". I believe OFBiz is already taking > the approach you suggest, relying on SQL transactions to protect a critical > section, and it's not working for some reason. > > Any or all of my assumptions could be wrong. I just need to fix this so it > stops shutting us down. >
