Without having looked at it: can we use that code for reading, and modify it to also allow writing a 2006 ML document as a single XML document? I have no opinion on the read-only parser.

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Wednesday, November 23, 2016 2:38 PM
To: POI Developers List <dev@poi.apache.org>
Subject: RE: 2006 ML format?

All,

I went it alone for the 2006ml format on Tika, see details [1]. If you have any feedback on that bit of code, I'd appreciate it!

Major questions:
1) Do we want to move some/most of that into POI for 2006ml?
2) Do we want to offer a streaming read-only XWPF parser based on that code for the regular docx?

Cheers,

Tim

[1] https://issues.apache.org/jira/browse/TIKA-2179?focusedCommentId=15691150&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15691150

-----Original Message-----
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Monday, November 21, 2016 7:14 AM
To: POI Developers List <dev@poi.apache.org>
Subject: RE: 2006 ML format?

Y, I experimented with adding an InlineOPCPackage; I couldn't quite get it to work, and even if I did, it makes a mess of our OPCPackage and ZipPackage.

I'm thinking I might use this as a reason to build a beanless SXWPF read-only SAX parser. I suspect that we could very easily re-use whatever I develop for this format on the "modern" ooxml...suspicions have been wrong before...only code and unit tests will tell. :)

-----Original Message-----
From: Mark Murphy [mailto:jmarkmur...@gmail.com]
Sent: Saturday, November 19, 2016 5:19 PM
To: POI Developers List <dev@poi.apache.org>
Subject: Re: 2006 ML format?

Wow, this is nothing like what I thought it would be. I discovered that you can write a document in this format by selecting save as xml document.

On Fri, Nov 18, 2016 at 7:03 AM, Allison, Timothy B. <talli...@mitre.org> wrote:

> Thank you, Javen. I worry that I'll be adding duct tape to
> OPCPackage, but let me put together a patch and we can decide if
> adding an InlinePackage is too Frankenstein-y for POI.
>
> -----Original Message-----
> From: Javen O'Neal [mailto:javenon...@gmail.com]
> Sent: Thursday, November 17, 2016 5:58 PM
> To: POI Developers List <dev@poi.apache.org>
> Subject: Re: 2006 ML format?
>
> This would probably be of interest to users of POI who are not
> necessarily using Tika.
>
> If someone spends the effort to add support for a Microsoft Office
> format, POI seems like a better host.
>
> On Nov 17, 2016 10:55 AM, "Allison, Timothy B." <talli...@mitre.org>
> wrote:
>
> All,
> On TIKA-2179 [1], Sean Story submitted a document that appears to be
> a 2006 ML format .xml file. It appears to inline the components of a
> regular docx into a single xml file, no zip. Is it worth the effort
> to build a read-only subclass of OPCPackage (say, InlinePackage) that
> would parallel our ZipPackage? Or would it be better to handle this
> purely on the Tika side and rewrite the file as a temporary ZipFile
> that can be read by our current OPCPackage?
> Thank you.
>
> Best,
>
> Tim
> [1] https://issues.apache.org/jira/browse/TIKA-2179

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@poi.apache.org
For additional commands, e-mail: dev-h...@poi.apache.org
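
For readers following the thread, here is a minimal sketch of the streaming, read-only idea discussed above: a SAX handler that lists the parts inlined in a 2006 ML (Flat OPC) file. It assumes the usual pkg:package / pkg:part layout under the namespace http://schemas.microsoft.com/office/2006/xmlPackage. The class and method names (FlatOpcPartLister, listParts) are hypothetical and are not taken from POI or from the Tika patch; it only illustrates the approach, not the code referenced in [1].

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: streams over a Flat OPC document and records each inlined part.
final class FlatOpcPartLister {

    // Namespace of the pkg:package / pkg:part wrapper elements (assumed layout).
    static final String PKG_NS = "http://schemas.microsoft.com/office/2006/xmlPackage";

    // Returns one "name -> contentType" entry per pkg:part, in document order.
    static List<String> listParts(InputSource source) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on namespace URI + local name
        final List<String> parts = new ArrayList<>();
        factory.newSAXParser().parse(source, new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (PKG_NS.equals(uri) && "part".equals(localName)) {
                    // pkg:name and pkg:contentType are namespace-qualified attributes
                    String name = atts.getValue(PKG_NS, "name");
                    String type = atts.getValue(PKG_NS, "contentType");
                    parts.add(name + " -> " + type);
                }
            }
        });
        return parts;
    }

    public static void main(String[] args) throws Exception {
        // Tiny hand-written sample, not a real Word export.
        String xml =
            "<pkg:package xmlns:pkg=\"" + PKG_NS + "\">"
          + "<pkg:part pkg:name=\"/word/document.xml\" pkg:contentType=\""
          + "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml\">"
          + "<pkg:xmlData><doc/></pkg:xmlData>"
          + "</pkg:part>"
          + "</pkg:package>";
        for (String part : listParts(new InputSource(new StringReader(xml)))) {
            System.out.println(part);
        }
    }
}
```

A handler like this never builds the whole document in memory, which is the appeal of the SAX route over an InlinePackage subclass; a real parser would additionally dispatch on the part name and read the content nested under pkg:xmlData or pkg:binaryData.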