https://issues.apache.org/bugzilla/show_bug.cgi?id=51524
Sergey Vladimirov <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
         OS/Version|                            |All

--- Comment #2 from Sergey Vladimirov <[email protected]> 2011-07-18 18:53:57 UTC ---
Antoni,

Rebuilding the PAPX table is the central point of paragraph processing. It
should not be disabled in a "give me the old POI" way, because the old POI was
based on an incorrect algorithm.

The previous version looked for all PAPX and called them paragraphs. The
current version constructs the text, splits it into paragraphs and applies
PAPX. That is, in the previous version you missed some text from the document
whenever there was no PAPX for it. For example, in the document you linked,
151 parts of the text were missing.

So disabling this feature can be useful in only one case: you want to work
with PAPX directly. I have no idea why someone would want to do that outside
of POI.

Anyway, personally I'm sad it is done in PapBinTable. It should be done on a
different level, in a completely different place. There should be at least two
different APIs: one for low-level file processing, including parsing PAPX
structures, complex file tables, etc., and a second one to work with the
document: tables, lists, paragraphs, etc. Rebuilding the paragraph list
belongs on the second level.

But making such a change will break the API for sure. So we need to wait until
4.0, where the "old" way to work with PAPX will be provided and preserved in
the "first level API", as well as a DocumentParser with different options on
the "second level API".

Anyway, again, I have partially rewritten this code and now it takes 4
seconds. Please let me know if that is acceptable. If possible, please check
whether the parsed data are correct; there could be errors.
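
To illustrate the difference between the two approaches described in the
comment, here is a minimal Java sketch. It is a toy model only and not the
POI implementation: the Papx and Para classes, both method names, the
"default" fallback style and the rule that the PAPX covering the paragraph
mark decides the style are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: these types are hypothetical and are not POI classes.
class PapxSketch {

    /** A formatting run covering characters [start, end) of the document text. */
    static final class Papx {
        final int start, end;
        final String style;
        Papx(int start, int end, String style) {
            this.start = start; this.end = end; this.style = style;
        }
    }

    /** A paragraph of text plus the style resolved for it. */
    static final class Para {
        final String text, style;
        Para(String text, String style) { this.text = text; this.style = style; }
    }

    /** Old approach: every PAPX is treated as a paragraph; uncovered text is lost. */
    static List<Para> fromPapxOnly(String text, List<Papx> papxs) {
        List<Para> out = new ArrayList<>();
        for (Papx p : papxs) {
            out.add(new Para(text.substring(p.start, p.end), p.style));
        }
        return out;
    }

    /** New approach: split the text at paragraph marks, then apply a PAPX to each paragraph. */
    static List<Para> fromText(String text, List<Papx> papxs) {
        List<Para> out = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            boolean last = i == text.length() - 1;
            if (text.charAt(i) == '\r' || last) {
                int end = i + 1;
                String style = "default"; // assumption: fallback when no PAPX covers the paragraph
                for (Papx p : papxs) {
                    // assumption: the PAPX covering the paragraph's final character decides its style
                    if (p.start <= i && i < p.end) { style = p.style; break; }
                }
                out.add(new Para(text.substring(start, end), style));
                start = end;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "First\rSecond\rThird\r";
        List<Papx> papxs = new ArrayList<>();
        papxs.add(new Papx(0, 6, "Heading"));  // covers "First\r"
        papxs.add(new Papx(13, 19, "Body"));   // covers "Third\r"; nothing covers "Second\r"
        System.out.println(fromPapxOnly(text, papxs).size()); // 2: "Second" is missing
        System.out.println(fromText(text, papxs).size());     // 3: all text is kept
    }
}
```

In this sketch the PAPX-only pass silently drops the middle paragraph because
no PAPX covers it, while the text-first pass keeps it and falls back to a
default style.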
