Case insensitive entries in OLE2?

2023-06-20 Thread Tim Allison
All, Over on TIKA-4091, Ross Spencer noted that according to the spec, entries in OLE2 containers should be case insensitive. We can fix our detection on Tika at least to work in this way on OLE2, but it looks like POI requires case sensitivity. If I had the time, would there be any appetite

Re: [ANNOUNCE] Apache POI 5.2.3 released

2022-09-19 Thread Tim Allison
Thank you, PJ and team! On Sat, Sep 17, 2022 at 6:52 AM PJ Fanning wrote: > > The Apache POI project is pleased to announce the release of POI 5.2.3. > Featured are a handful of new areas of functionality, and numerous bug fixes. > > > See the downloads page for binary and source distributions:

Re: Plea - test the POI 5.0.0 snapshot

2020-12-21 Thread Tim Allison
Andi, Is it safe for Tika to exclude batik-all, de.rotator.pdfbox and pdfbox? Our tests appear to pass without them...not sure if there will be surprises. Thank you, again! Cheers, Tim On Sat, Dec 19, 2020 at 8:45 AM Tim Allison wrote: > Will integrate w T

Re: Plea - test the POI 5.0.0 snapshot

2020-12-21 Thread Tim Allison
, Dec 21, 2020 at 11:26 AM Tim Allison wrote: > Andi, > Thank you for all of your work on this! This is probably user error, > but I'm getting a failed test when I integrate poi trunk with Tika. Is > this something I can fix at the Tika level? > > org.apache.tika.excep

Re: Plea - test the POI 5.0.0 snapshot

2020-12-21 Thread Tim Allison
(CompositeParser.java:280) On Sat, Dec 19, 2020 at 8:47 AM Tim Allison wrote: > If anyone else on this list has time and an interest POI 5.0.0 is on the > way! Please help test! > > -- Forwarded message ----- > From: Tim Allison > Date: Sat, Dec 19, 2020 at 8:45 AM > Subject: Re

Re: Plea - test the POI 5.0.0 snapshot

2020-12-19 Thread Tim Allison
Will integrate w Tika on Monday and test it out. Thank you!!! On Sat, Dec 19, 2020 at 7:52 AM Andreas Beeker wrote: > Dear POI users, > > we are shortly before releasing POI 5.0.0 and there have been some > breaking changes [1]. > Notably the JPMS/JigSaw migration and the upgrade of the

Re: POI 4.1.1

2019-09-20 Thread Tim Allison
I think I remember a regression in emf/wmf...could be spurious or my fault at the Tika level. I’ll take a look today. On Fri, Sep 20, 2019 at 12:46 AM Thimo von Rauchhaupt < thimo.von.rauchha...@empic.de> wrote: > Hi Andi, > > > > Yes, the nightly snapshot worked very well. > > Thanks for the

Re: macros in xlsx and xls

2019-07-30 Thread Tim Allison
I'm sorry for my delay. I've been meaning to respond. Code from Tika might be useful. We do extract macros with POI; but we don't do it based on links in xlsx. For ooxml:

Re: streaming detection of OLE?

2019-04-16 Thread Tim Allison
w use > case the Tika user has. > > I assume that all you need is the first block or two to confirm this looks > like an OLE document. > > Regards, > Dave > > > On Apr 16, 2019, at 12:29 PM, Tim Allison wrote: > > > > Thank you, Dave! The reading exa
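
Dave's suggestion in this thread, that the first block may be enough to confirm a file looks like OLE, can be sketched with plain JDK streams. This is only an illustration of checking the 8-byte OLE2 header signature without spooling the file; it is not Tika's or POI's implementation, the class and method names are hypothetical, and it does not distinguish Word from Excel or other subtypes, which still requires the directory entries.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper: peeks at the start of a stream and compares it
// with the OLE2 compound-file header signature.
class Ole2SignatureSniffer {

    // The 8-byte signature found at offset 0 of OLE2 compound files.
    private static final byte[] OLE2_SIGNATURE = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Returns true if the stream starts with the OLE2 signature.
    // The stream is reset afterwards, so it must support mark/reset
    // (wrap it in a BufferedInputStream if it does not).
    static boolean looksLikeOle2(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(OLE2_SIGNATURE.length);
        try {
            byte[] head = new byte[OLE2_SIGNATURE.length];
            int total = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < head.length) {
                int n = in.read(head, total, head.length - total);
                if (n < 0) {
                    break;
                }
                total += n;
            }
            return total == head.length && Arrays.equals(head, OLE2_SIGNATURE);
        } finally {
            in.reset();
        }
    }
}
```

Only eight bytes are read from the source, so a slow network drive is touched once; anything beyond "is this an OLE2 container at all" is outside what this sketch shows.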

Re: streaming detection of OLE?

2019-04-16 Thread Tim Allison
like it opens a POIFS first no matter how you register a listener. On Tue, Apr 16, 2019 at 3:20 PM Dave Fisher wrote: > Hi Tim, > > Maybe the answer is using HPSF - > > https://poi.apache.org/components/hpsf/how-to.html > > Regards, > Dave > > > On Apr 16, 2019

streaming detection of OLE?

2019-04-16 Thread Tim Allison
All, In Tika, when we do file type detection of OLE files (POIFSContainerDetector), we spool the file to disk, open a POIFS and make a decision based on document/directory names. A user on TIKA-2849 does not want to copy the full file from a slow network drive for detection. When I tried using

Re: Getting contents out of HemfComment.EmfCommentDataWMF ?

2019-04-08 Thread Tim Allison
Are you ok w it? On Mon, Apr 8, 2019 at 6:11 PM Andreas Beeker wrote: > Oh sorry ... I've just realized you've introduced the method recently ... > how embarrassing ... > >

Getting contents out of HemfComment.EmfCommentDataWMF ?

2019-04-08 Thread Tim Allison
All, I'm trying to update Tika for POI 4.1.0, and we used to do the following in our EMFParser: private void handleWMF(HemfCommentPublic.WindowsMetafile comment, ...) throws ... { ... try (InputStream is = TikaInputStream.get(comment.getWmfInputStream())) { ...parse(is)

Linkage error with TypeClassHolder

2018-12-12 Thread Tim Allison
Would anyone (with more XMLBeans knowledge than I have) be willing to take a look at this? Is this a POI issue or a problem caused by the classloader/other jars on the path? https://issues.apache.org/jira/browse/TIKA-2789 java.lang.LinkageError: loader (instance of

Re: Is it POI error starting 3.14 version onwards

2018-05-16 Thread Tim Allison
You need to make your SAXReader namespace aware: saxFactory.setNamespaceAware(true); On Wed, May 16, 2018 at 8:59 AM, Tim Allison <talli...@apache.org> wrote: > Sorry for my delay. I just tested your file with Apache Tika 1.18 which > uses POI 3.17..., and I got:
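
For the advice in this reply, a minimal JDK-only sketch of what `setNamespaceAware(true)` changes. The sample XML, namespace URI, and class name are made up for illustration. The sketch assumes the handler in question matches elements by `localName`, which is only populated with the unprefixed element name when the parser is namespace aware.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports the localName SAX passes for the root element.
class NamespaceAwareSaxDemo {

    static String rootLocalName(boolean namespaceAware) throws Exception {
        SAXParserFactory saxFactory = SAXParserFactory.newInstance();
        // The one-line fix from the reply: without this, handlers that
        // match on localName will not see the names they expect.
        saxFactory.setNamespaceAware(namespaceAware);
        SAXParser parser = saxFactory.newSAXParser();

        // Made-up prefixed document; the namespace URI is a placeholder.
        String xml = "<x:worksheet xmlns:x=\"http://example.com/ns\"><x:row/></x:worksheet>";
        final String[] firstLocalName = new String[1];

        parser.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
            new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes attrs) {
                    // Record only the first (root) element.
                    if (firstLocalName[0] == null) {
                        firstLocalName[0] = localName;
                    }
                }
            });
        return firstLocalName[0];
    }

    public static void main(String[] args) throws Exception {
        System.out.println("namespace aware: [" + rootLocalName(true) + "]");
        System.out.println("not namespace aware: [" + rootLocalName(false) + "]");
    }
}
```

With namespace awareness on, the root is reported as `worksheet`; with it off, the handler only gets the prefixed `qName`, so code keyed on `localName` silently matches nothing.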

Re: Is it POI error starting 3.14 version onwards

2018-05-15 Thread Tim Allison
Any chance you can share the file? On Tue, May 15, 2018 at 3:19 AM Syed Mudassir Ahmed < syed.mudas...@gaianconsultants.com> wrote: > Hi, > I am trying to read data from a XLSX sheet via XSSFSheetXMLHandler. The > source code is below. > > public static void main(String str[]) throws

CVE-2017-12626 – Denial of Service Vulnerabilities in Apache POI < 3.17

2018-01-26 Thread Tim Allison
ich accept content from external or untrusted sources are advised to upgrade to Apache POI 3.17 or newer. -Tim Allison on behalf of the Apache POI PMC   [0] https://bz.apache.org/bugzilla/show_bug.cgi?id=61338 [1] https://bz.apache.org/bugzilla/show_bug.cgi?id=61294 [2] https://bz.apache.org/bugzi

[CVE-2016-5000] XML External Entity (XXE) Vulnerability in Apache POI's XLSX2CSV Example

2016-07-22 Thread Tim Allison
CVE-2016-5000: XML External Entity (XXE) Vulnerability in Apache POI's XLSX2CSV Example Severity: Important Vendor: The Apache Software Foundation Versions Affected: POI 3.5-3.13 Description: Apache POI's XLSX2CSV example uses Java's XML components to parse OpenXML files. Applications
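
As general background on this class of issue, a hedged sketch of how a Java SAX parser can be configured to refuse DOCTYPE declarations, which blocks external entity expansion. This is a generic hardening pattern for the JDK's built-in parser, not the change made in POI and not a substitute for the advisory's own mitigation; the class name and sample documents are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: a SAX factory that rejects any document with a DOCTYPE.
class XxeHardenedSaxDemo {

    static SAXParserFactory hardenedFactory() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Ask the implementation to apply its secure-processing limits.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Xerces-style feature (supported by the JDK's default parser):
        // any DOCTYPE declaration becomes a fatal error.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory;
    }

    // Returns true if the document parses, false if the parser rejects it.
    static boolean parsesCleanly(String xml) throws Exception {
        try {
            hardenedFactory().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler());
            return true;
        } catch (SAXException rejected) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String plain = "<r>ok</r>";
        String withEntity = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE r [<!ENTITY e SYSTEM \"file:///etc/passwd\">]>"
            + "<r>&e;</r>";
        System.out.println("plain document parses: " + parsesCleanly(plain));
        System.out.println("DOCTYPE document parses: " + parsesCleanly(withEntity));
    }
}
```

Because the DOCTYPE itself is rejected, the external entity is never resolved; legitimate OOXML parts do not need a DOCTYPE, so this setting should not affect them.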