[ 
https://issues.apache.org/jira/browse/TIKA-3711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515974#comment-17515974
 ] 

Luís Filipe Nassif commented on TIKA-3711:
------------------------------------------

Well, IMHO the user may want to check the image in the original document if 
he/she knows there is some (non-OCRed) image at a specific text location. This 
could certainly be done with a specific XML handler implementation, like you 
said, but if the current behavior is going to be changed to suppress the 
current text output, maybe an option to enable it again would help...
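
To illustrate the handler idea together with an opt-in switch, here is a minimal 
sketch using only the JDK SAX API. It assumes the image file name arrives as 
character data inside an element carrying a marker class. The class name 
ImageNameFilter, the marker value "embedded-image-name" and the 
includeImageNames flag are all hypothetical and are not taken from Tika's 
actual output or configuration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Collects text from SAX events, optionally dropping image file names. */
public class ImageNameFilter extends DefaultHandler {
    // Hypothetical marker: NOT confirmed to be what Tika emits for images.
    static final String MARKER_CLASS = "embedded-image-name";

    private final boolean includeImageNames;
    private final StringBuilder text = new StringBuilder();
    // Greater than zero while we are inside an element whose text is dropped.
    private int skipDepth = 0;

    public ImageNameFilter(boolean includeImageNames) {
        this.includeImageNames = includeImageNames;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (skipDepth > 0) {
            skipDepth++; // nested element inside a suppressed one
        } else if (!includeImageNames
                && MARKER_CLASS.equals(atts.getValue("class"))) {
            skipDepth = 1; // start suppressing at the marked element
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (skipDepth > 0) {
            skipDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (skipDepth == 0) {
            text.append(ch, start, length);
        }
    }

    public String getText() {
        return text.toString().trim();
    }

    /** Parses a well-formed XHTML string and returns the collected text. */
    static String extract(String xhtml, boolean includeImageNames)
            throws Exception {
        ImageNameFilter handler = new ImageNameFilter(includeImageNames);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.getText();
    }
}
```

In Tika itself the same logic would presumably live in a ContentHandler 
wrapper (for example one extending org.apache.tika.sax.ContentHandlerDecorator) 
passed to the parser, with the flag defaulting to whichever behavior is chosen 
for this issue.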

> Image file names included in parsed Word Document text
> ------------------------------------------------------
>
>                 Key: TIKA-3711
>                 URL: https://issues.apache.org/jira/browse/TIKA-3711
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 2.3.0
>            Reporter: Sam Stephens
>            Priority: Minor
>         Attachments: word-doc-with-image.docx
>
>
> The attached Word document includes nothing but a single image. Running it 
> through the Tika 2.2.0 AutoDetectParser correctly returns null. Running it 
> through the Tika 2.3.0 AutoDetectParser returns the text:
> {{image1.png}}
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
