Re: Regression Test Run for upcoming 5.0.0
Hi,

the 2nd run of the regression tests is now finished. Results look much better now, with only very few failures left (56 failures in 12 stacktraces):

1) o.a.p.ooxml.POIXMLException: error: The document is not a xml@urn:schemas-poi-apache-org:vmldrawing: document element namespace mismatch expected "urn:schemas-poi-apache-org:vmldrawing" got "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
   => Seems to have been introduced by #64773 - Visual signatures for .xlsx/.docx, Subversion Revision 1882394
2) A few failures related to drawing slideshows, likely introduced by supporting much more functionality there; not sure if we need to fix those
3) java.lang.RuntimeException: CountryRecord or SSTRecord not found: This is just a change in an error message, which needs to be caught differently in the integration tests
4) Some documents try to allocate very large arrays, which I would ignore, as a user can easily increase the allowed max allocated memory
5) "java.lang.IllegalArgumentException: Invalid char (*) found at index (*) in sheet name *" => now happens because we fixed another issue, so not an actual regression

Full reports are at
http://people.apache.org/~centic/poi_regression/reports/index412RC3to500RC1.html
and
http://people.apache.org/~centic/poi_regression/reportsAll/index412RC3to500RC1.html

I think we only need to take a look at 1) and 2) before releasing.

Thanks... Dominik.

On Sun, Jan 3, 2021 at 1:08 PM Dominik Stadler wrote:
> Hi,
>
> Thanks for the fixes and the "stress" documents. I added a few more and
> added a test to the normal unit tests to trigger those documents;
> otherwise the ooxml-schema-lite does not contain them as far as I saw.
>
> Next regression-run is underway...
>
> Dominik.
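Items 3) and 5) of the report above both amount to teaching the integration tests that a given exception message is expected rather than a regression. A minimal sketch of such a check follows. The class and method names are hypothetical and this is not POI's actual integration-test code; the two patterns are taken from the messages quoted in the report, with each `*` placeholder read as a wildcard.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of POI: decides whether a failure
// carries one of the messages the report classifies as "expected".
final class ExpectedFailures {
    // Assumption: these two messages are the ones to tolerate.
    private static final List<Pattern> EXPECTED = List.of(
        Pattern.compile("CountryRecord or SSTRecord not found"),
        Pattern.compile("Invalid char \\(.*\\) found at index \\(.*\\) in sheet name .*")
    );

    // Walks the cause chain so that wrapped exceptions are matched too.
    static boolean isExpected(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            String msg = c.getMessage();
            if (msg == null) {
                continue;
            }
            for (Pattern p : EXPECTED) {
                if (p.matcher(msg).find()) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Matching on message text is brittle by nature, which is exactly why a reworded error message shows up as a "failure" in item 3); matching on exception type where possible would be more robust.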
Re: Regression Test Run for upcoming 5.0.0
Hi,

Thanks for the fixes and the "stress" documents. I added a few more and added a test to the normal unit tests to trigger those documents; otherwise the ooxml-schema-lite does not contain them as far as I saw.

Next regression-run is underway...

Dominik.

On Wed, Dec 30, 2020 at 8:25 PM Andreas Beeker wrote:
> Hi,
>
> I've mentioned it in our private slack group *) - there's also an ant
> error, which ignores quite a few *$Factory.class-es when packing the lite jar.
> I'm currently trying to figure out how I can work around this.
>
> > Another potential approach: ...
>
> This was my first approach (class -> xsb), but it was not reliable, so
> I spent some time figuring out the (few lines of) byte-buddy code.
> So those .xsb are the ones we use in our tests. If we do b), those should be
> picked up.
>
> Andi
>
> *) this is just a participation reminder for the rest - I'm happy to
> invite you if you tell me your asf slack id ;)
Re: Regression Test Run for upcoming 5.0.0
Hi,

I've mentioned it in our private slack group *) - there's also an ant error, which ignores quite a few *$Factory.class-es when packing the lite jar. I'm currently trying to figure out how I can work around this.

> Another potential approach: ...

This was my first approach (class -> xsb), but it was not reliable, so I spent some time figuring out the (few lines of) byte-buddy code. So those .xsb are the ones we use in our tests. If we do b), those should be picked up.

Andi

*) this is just a participation reminder for the rest - I'm happy to invite you if you tell me your asf slack id ;)

On 30.12.20 20:04, Dominik Stadler wrote:
> Hi,
>
> I'd go for b). Hopefully not too many are necessary; it seems a simple test
> which reads in the document triggers the necessary parts in most of the
> cases.
>
> c) would mean anybody out there with such a file would now get
> regression-errors unless he switches to the full file.
>
> Another potential approach: I don't know much about how you do all this
> agent-stuff nowadays, but is there a way to match the classes to the xsb to
> find those missing ones? We seem to cover the classes themselves already,
> as they are only included when used in tests.
>
> Dominik.
Re: Regression Test Run for upcoming 5.0.0
Hi,

I'd go for b). Hopefully not too many are necessary; it seems a simple test which reads in the document triggers the necessary parts in most of the cases.

c) would mean anybody out there with such a file would now get regression-errors unless he switches to the full file.

Another potential approach: I don't know much about how you do all this agent-stuff nowadays, but is there a way to match the classes to the xsb to find those missing ones? We seem to cover the classes themselves already, as they are only included when used in tests.

Dominik.

On Wed, Dec 30, 2020 at 7:09 PM Andreas Beeker wrote:
> Hi Dominik,
>
> thank you for running the regression test.
>
> > * Most of these are because the "lite" ooxml-schema jar is still missing
> > some stuff, not sure if the new way of building the lite-jar is the cause
> > or if we now use more parts in the regression tests
>
> The lite jar used to contain all *.xsb files and now it only contains
> the ones used in the tests, which decreased its size by around 40%.
>
> Should we ... ?
> a) roll back the change and include all *.xsbs - the class files might
> still be missing
> b) provide unit tests for the failing files - we might need a few
> roundtrips to fix those cases, i.e. best would be a reduced file list of
> those failures
> c) use the full schema for the regression tests
>
> Andi
Re: Regression Test Run for upcoming 5.0.0
Hi Dominik,

thank you for running the regression test.

> * Most of these are because the "lite" ooxml-schema jar is still missing
> some stuff, not sure if the new way of building the lite-jar is the cause
> or if we now use more parts in the regression tests

The lite jar used to contain all *.xsb files and now it only contains the ones used in the tests, which decreased its size by around 40%.

Should we ... ?
a) roll back the change and include all *.xsbs - the class files might still be missing
b) provide unit tests for the failing files - we might need a few roundtrips to fix those cases, i.e. best would be a reduced file list of those failures
c) use the full schema for the regression tests

Andi

On 30.12.20 17:37, Dominik Stadler wrote:
> Hi,
>
> In order to get the release-preparations rolling a bit, I have finished a
> first run of the "mass regression test" exercise.
>
> As usual it brings up cases where documents fail now, but did work fine
> previously, i.e. regressions that we may have introduced since the previous
> release.
>
> I now process 3,356,984 documents (460k of those are skipped because they
> are duplicates). Currently there are around 3800 documents which show a
> regression:
> * Most of these are because the "lite" ooxml-schema jar is still missing
>   some stuff, not sure if the new way of building the lite-jar is the cause
>   or if we now use more parts in the regression tests
> * Some exceptions/NPEs are probably related to more support for
>   drawing/rendering PPT(X), so some may in fact simply be new "expected"
>   exceptions for broken documents
> * Note: The ones with TIMEOUT or OLDFORMAT are not regressions
>
> 5.0.0 vs. 4.1.2:
> http://people.apache.org/~centic/poi_regression/reports/index412RC3to500RC1.html
>
> 5.0.0 overall errors:
> http://people.apache.org/~centic/poi_regression/reportsAll/index412RC3to500RC1.html
>
> I can fairly easily re-run this as soon as we have fixes for some of the
> things.
>
> Thanks... Dominik.
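Whichever of a), b) or c) is chosen, it helps to know exactly which *.xsb resources the lite jar lacks compared to the full schema jar. The following is a rough sketch of such a comparison using only the JDK. The class name and the idea of diffing the two jars are assumptions for illustration; this is not the byte-buddy based mechanism used in the POI build.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Collection;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: lists .xsb entries present in the full jar
// but absent from the lite jar.
final class XsbDiff {
    // Reads the names of all entries in a jar file.
    static Set<String> entryNames(Path jar) throws IOException {
        try (JarFile jf = new JarFile(jar.toFile())) {
            return jf.stream().map(e -> e.getName()).collect(Collectors.toSet());
        }
    }

    // Returns the sorted .xsb entry names found in 'full' but not in 'lite'.
    static SortedSet<String> missingXsb(Collection<String> full, Collection<String> lite) {
        SortedSet<String> missing = new TreeSet<>();
        for (String name : full) {
            if (name.endsWith(".xsb") && !lite.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Usage would be along the lines of `XsbDiff.missingXsb(XsbDiff.entryNames(fullJar), XsbDiff.entryNames(liteJar))`, where `fullJar` and `liteJar` are paths to the two schema jars. The result only shows what is absent, not which of those entries real documents actually need.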
Regression Test Run for upcoming 5.0.0
Hi,

In order to get the release-preparations rolling a bit, I have finished a first run of the "mass regression test" exercise.

As usual it brings up cases where documents fail now, but did work fine previously, i.e. regressions that we may have introduced since the previous release.

I now process 3,356,984 documents (460k of those are skipped because they are duplicates). Currently there are around 3800 documents which show a regression:
* Most of these are because the "lite" ooxml-schema jar is still missing some stuff, not sure if the new way of building the lite-jar is the cause or if we now use more parts in the regression tests
* Some exceptions/NPEs are probably related to more support for drawing/rendering PPT(X), so some may in fact simply be new "expected" exceptions for broken documents
* Note: The ones with TIMEOUT or OLDFORMAT are not regressions

5.0.0 vs. 4.1.2:
http://people.apache.org/~centic/poi_regression/reports/index412RC3to500RC1.html

5.0.0 overall errors:
http://people.apache.org/~centic/poi_regression/reportsAll/index412RC3to500RC1.html

I can fairly easily re-run this as soon as we have fixes for some of the things.

Thanks... Dominik.
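The message above mentions that about 460k documents are skipped as duplicates but does not say how duplicates are detected. One common way is to hash the file contents and skip any content already seen. The sketch below assumes that approach; the class name is hypothetical and it is not taken from the actual regression tooling.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: remembers a SHA-256 digest per document and
// reports whether given content has been seen before.
final class DuplicateFilter {
    private final Set<String> seen = new HashSet<>();

    // Returns true the first time this exact content is seen,
    // false for any byte-identical repeat.
    boolean isNew(byte[] content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(content);
            return seen.add(Base64.getEncoder().encodeToString(digest));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }
}
```

This only catches byte-identical files; two documents that differ in a timestamp or metadata field would still be processed separately.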