Hi, Your analysis sounds sensible and changing onDocumentWrite() looks ok to me as well from a quick look at the code, we would only add an allocation of the array, which is quit performance-wise compared to all the stuff that happens while saving the document.
A reproducing test-case would be nice, so we can verify this going forward. Dominik. On Fri, Feb 23, 2018 at 10:34 PM, Greg Woolsey <greg.wool...@gmail.com> wrote: > I'm finally able to work around the problem by changing > XSSFRow.onDocumentWrite() to always copy the current state of the _cells > collection to the _row CArray, instead of only when things don't look like > they are ordered. There are several cases I've uncovered where the two > groups can be the same size, with the same cell reference strings in the > same order, but still not equal. One involves adding cells then saving > then editing, another involves possible side effects of > XSSFCell.setBlank(). > > The safest option I can see is to just simply the method to just the code > inside the if(!isOrdered) statement, without the precondition checking. I > copied that code to my test project and applied it after every data table > row update, and that fixed my issue without appearing to cause a > performance hit. In fact, removing the ordering checking probably makes it > a constant time change, since it was doing essentially the same loops > whether it applied any changes or not. > > Anyone want to weigh in on that simplification of > XSSFRow.onDocumentWrite()? > > > On Fri, Feb 23, 2018 at 12:55 PM Greg Woolsey <greg.wool...@gmail.com> > wrote: > > > For what it's worth, I'm using a build from last August. The code > changes > > I'm seeing since then are almost entirely about API changes to use Enums > > and ints and remove deprecated methods. > > > > I'm suspicious there are not a lot of use cases where a document is > > opened, edited, saved, edited again without re-opening, then saved again. > > That's where I'm seeing the sync issues, after the first save. Then some > > subsequent edits get lost. it appears to be for cells that did not exist > > originally, then were added by edits. After the first save, edits to > these > > cells are not passed on to the CTRow, causing the second save to contain > > incorrect data. I'm still digging. > > > > On Fri, Feb 23, 2018 at 11:09 AM Alain FAGOT BÉAREZ < > abea...@for-scala.it> > > wrote: > > > >> Hi, > >> > >> It might have been the same symptoms which have led Sandeep Tiwari to > >> make a code change that I was reluctant to accept in the branch about > >> charts in XWPF. His explanations were in the line of yours. But I > couldn't > >> not accept the idea that such a fundamental feature had been broken. > >> > >> I hope to have some time this weekend to dive into the commits history > to > >> point out any change done recently that might affect that. > >> > >> Best regards, > >> Alain FAGOT BÉAREZ > >> > >> > >> > >> Von: Greg Woolsey > >> Gesendet: Fri Feb 23 03:53:08 GMT-03:00 2018 > >> An: POI Developers List > >> Betreff: baffling save behavior > >> > >> I hesitate to even bring it up, thinking I should be able to figure it > >> out, > >> but I've never seen behavior like this. Keep reading if you want to > twist > >> your brain in knots like mine. > >> > >> TL/DR - stop now if you'd rather be productive for your day job. > >> > >> Somehow, I have a consistent state where a file originating in Excel, > >> modified through POI, then saved and opened in Excel causes not just an > >> Excel repair, but incorrect/old data as well. > >> > >> So far, just a coding error, I figure. But the weirdness starts when I > use > >> the debugger to check on things. A specific row that I know comes out > >> wrong has the right cell values when I use > >> wb.getSheet(#).getRow(#).getCell(#).toString(), but when I check the > CTRow > >> and CTCell objects, then I find the old values still. So somehow, I've > got > >> the XSSFRow._cells collection out of sync with the CTRow _row contents. > >> > >> Anyone know how I could have done that? > >> > >> I may be calling workbook.write() in multiple places for different > >> purposes/copies. It looks to me like XSSFRow.onDocumentWrite() > explicitly > >> replaces the CTRow array of CTCell beans with new ones due to bug, > #56170, > >> but it is replacing the CTCell references in the XSSFCell objects with > the > >> new copies, . > >> > >> I haven't found a smoking gun yet, but any thoughts are welcome. Perhaps > >> the issue may lie with the formula evaluator created prior to the save > >> event? > >> > >> It may be related to that bug, or even directly tied to it, as that was > >> about updating tables, and that's part of the updates my process is > doing > >> as well. > >> > >> Sorry for the long document, normally I just figure these out on my own, > >> but I've been at this a week so far, which has never happened before. If > >> it comes down to black box behavior by XMLBeans, that would help explain > >> it, as I know next to nothing about that library's details. > >> > >> Greg > >> > >> > >> -------- Originale Nachricht -------- > >> Von: Greg Woolsey <greg.wool...@gmail.com> > >> Gesendet: Fri Feb 23 03:53:08 GMT-03:00 2018 > >> An: POI Developers List <dev@poi.apache.org> > >> Betreff: baffling save behavior > >> > >> I hesitate to even bring it up, thinking I should be able to figure it > >> out, > >> but I've never seen behavior like this. Keep reading if you want to > twist > >> your brain in knots like mine. > >> > >> TL/DR - stop now if you'd rather be productive for your day job. > >> > >> Somehow, I have a consistent state where a file originating in Excel, > >> modified through POI, then saved and opened in Excel causes not just an > >> Excel repair, but incorrect/old data as well. > >> > >> So far, just a coding error, I figure. But the weirdness starts when I > >> use > >> the debugger to check on things. A specific row that I know comes out > >> wrong has the right cell values when I use > >> wb.getSheet(#).getRow(#).getCell(#).toString(), but when I check the > CTRow > >> and CTCell objects, then I find the old values still. So somehow, I've > >> got > >> the XSSFRow._cells collection out of sync with the CTRow _row contents. > >> > >> Anyone know how I could have done that? > >> > >> I may be calling workbook.write() in multiple places for different > >> purposes/copies. It looks to me like XSSFRow.onDocumentWrite() > explicitly > >> replaces the CTRow array of CTCell beans with new ones due to bug, > #56170, > >> but it is replacing the CTCell references in the XSSFCell objects with > the > >> new copies, . > >> > >> I haven't found a smoking gun yet, but any thoughts are welcome. > Perhaps > >> the issue may lie with the formula evaluator created prior to the save > >> event? > >> > >> It may be related to that bug, or even directly tied to it, as that was > >> about updating tables, and that's part of the updates my process is > doing > >> as well. > >> > >> Sorry for the long document, normally I just figure these out on my own, > >> but I've been at this a week so far, which has never happened before. > If > >> it comes down to black box behavior by XMLBeans, that would help explain > >> it, as I know next to nothing about that library's details. > >> > >> Greg > >> > > >