I don't know any more about those other features, but I'm very excited to
get the new property flags into production. I think that will unlock a
bunch of learning.
Thank you for adding that.

On Tue, Feb 3, 2026 at 1:07 PM Tim Allison <[email protected]> wrote:

> Thank you, Mike.
>
> If you know more about the other features I added, please review them
> carefully. I trusted Claude on those and to generate test files for DDE, etc.
>
> On Tue, Feb 3, 2026 at 7:57 AM Mike Flester via user <[email protected]>
> wrote:
>
>> Thank you, Tim! The PR looks great.
>>
>> On Mon, Feb 2, 2026 at 7:18 PM Tim Allison <[email protected]> wrote:
>>
>>> Thank you for raising this and sharing a triggering doc. I've opened:
>>> https://issues.apache.org/jira/browse/TIKA-4646
>>>
>>> On Mon, Feb 2, 2026 at 11:02 AM Mike Flester via user <
>>> [email protected]> wrote:
>>>
>>>> Hello -
>>>>
>>>> The ooxml/docx began life as a phishing email attachment. The attacker
>>>> hyperlink has been replaced with something benign.
>>>>
>>>> Tika did not extract the link because (I think) it's in "instructional
>>>> text". The document appears to work fine (the victim is able to click the
>>>> link).
>>>>
>>>> I have a bit of POI code (not production quality) that can dig this
>>>> instructional text out.
>>>>
>>>> Link to both the docx and the java code -
>>>> https://limewire.com/d/qtC1E#79Q8zip1SU
>>>>
>>>> $ javac -cp ~/tika/tika-app/target/lib/*:. POILinkExtractor.java
>>>> $ java -cp ~/tika/tika-app/target/lib/*:. POILinkExtractor missing-link.docx
>>>>
>>>> Is this something that might see a place in Tika? As an option on the
>>>> existing XWPFWordExtractorDecorator? Or as a new parser in that package? Or
>>>> would I be best doing something outside of Tika for this case?
>>>>
>>>> Thanks,
>>>> Mike
>>>>
>>>
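
For readers of the archive, here is a minimal sketch of the idea discussed above: pulling HYPERLINK targets out of field instruction text (w:instrText) in a docx. It is neither the POILinkExtractor code from the original message nor what was implemented for TIKA-4646. The class name InstrTextLinks and its methods are hypothetical, and it uses only the JDK rather than POI. It assumes the main part lives at word/document.xml, that the URL is quoted, and that fields are not nested; headers, footers, and other parts are not scanned.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper: collects HYPERLINK targets from field instructions. */
public class InstrTextLinks {
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    // Matches: HYPERLINK "target". Unquoted targets and \l anchors are skipped.
    private static final Pattern HYPERLINK =
            Pattern.compile("HYPERLINK\\s+\"([^\"]+)\"");

    /** Assumes the main document part is word/document.xml. */
    public static List<String> fromDocx(Path docx) {
        try (ZipFile zip = new ZipFile(docx.toFile())) {
            ZipEntry entry = zip.getEntry("word/document.xml");
            if (entry == null) {
                return List.of();
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return fromDocumentXml(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static List<String> fromDocumentXml(String xml) {
        return fromDocumentXml(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static List<String> fromDocumentXml(InputStream xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            // Untrusted input: refuse DOCTYPE declarations to avoid XXE.
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // All WordprocessingML elements, in document order.
            NodeList all = f.newDocumentBuilder().parse(xml)
                    .getElementsByTagNameNS(W_NS, "*");
            List<String> links = new ArrayList<>();
            StringBuilder instr = new StringBuilder();
            for (int i = 0; i < all.getLength(); i++) {
                Element e = (Element) all.item(i);
                switch (e.getLocalName()) {
                    case "instrText":
                        // One instruction may be split across several runs.
                        instr.append(e.getTextContent());
                        break;
                    case "fldChar":
                        // begin/separate/end all bound the instruction text.
                        collect(instr, links);
                        break;
                    case "fldSimple":
                        collect(new StringBuilder(e.getAttributeNS(W_NS, "instr")), links);
                        break;
                    default:
                        break;
                }
            }
            collect(instr, links);
            return links;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse document XML", e);
        }
    }

    private static void collect(StringBuilder instr, List<String> links) {
        Matcher m = HYPERLINK.matcher(instr);
        while (m.find()) {
            links.add(m.group(1));
        }
        instr.setLength(0);
    }
}
```

Calling InstrTextLinks.fromDocx(Path.of("missing-link.docx")) would return the quoted HYPERLINK targets found in the main document part, if any. Buffering the w:instrText fragments until the next w:fldChar matters because Word often splits a single instruction across several runs.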
