[ https://issues.apache.org/jira/browse/TIKA-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346120#comment-14346120 ]
Tyler Palsulich commented on TIKA-1057:
---------------------------------------
Can someone provide a .doc file with a Status metadata field? Or do they all
have it?
> document content property "Status" is not extracted for *.doc files
> -------------------------------------------------------------------
>
> Key: TIKA-1057
> URL: https://issues.apache.org/jira/browse/TIKA-1057
> Project: Tika
> Issue Type: Bug
> Components: parser
> Environment: java 1.5/1.6 / Windows 7
> Reporter: Thomas Stroeter
> Priority: Minor
>
> I would like to use Tika to extract the document property "Status" from a
> Word 97-2003 *.doc file.
>
> Tika correctly extracts the document status property from XML-based *.docx
> files as "Content-Status" and "cp:contentStatus", but I cannot extract this
> metadata from *.doc Word documents using Tika.
> Word 2010, however, has no problem setting and reading that document
> metadata in a *.doc file.
> Is there a way to extract this information from *.doc files with Tika, too?
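
For reference when comparing against a .docx counterpart of a test file: the
"cp:contentStatus" value mentioned above is an element of the OOXML core
properties part inside the .docx zip package. The following is a minimal
sketch, not Tika code, that reads it with the JDK only. The class name
DocxContentStatus and the method readContentStatus are hypothetical, and the
sketch assumes the part sits at the conventional path docProps/core.xml. It
does not handle the binary *.doc format, which is what this issue is about.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of Tika.
public class DocxContentStatus {

    // Namespace bound to the "cp" prefix in the OOXML core properties part.
    static final String CP_NS =
        "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";

    // Returns the text of cp:contentStatus, or null if the part or element is absent.
    public static String readContentStatus(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: core properties live at the conventional path.
                if (!"docProps/core.xml".equals(entry.getName())) {
                    continue;
                }
                // Buffer the entry so the XML parser cannot close the zip stream.
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = zip.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
                factory.setNamespaceAware(true);
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(buf.toByteArray()));
                NodeList nodes = doc.getElementsByTagNameNS(CP_NS, "contentStatus");
                return nodes.getLength() > 0 ? nodes.item(0).getTextContent() : null;
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxContentStatus <file.docx>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            System.out.println("cp:contentStatus = " + readContentStatus(in));
        }
    }
}
```

Running this against a .docx saved from the same Word document gives the
expected Status value to look for once a *.doc sample is attached.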
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)