Thomas Stroeter created TIKA-1057:
-------------------------------------
Summary: The document property "Status" is not extracted for *.doc files
Key: TIKA-1057
URL: https://issues.apache.org/jira/browse/TIKA-1057
Project: Tika
Issue Type: Bug
Environment: Java 1.5 / Windows
Reporter: Thomas Stroeter
Priority: Minor
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.
Tika correctly extracts the status property from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from *.doc Word documents using Tika.
Word 2010, however, has no problem setting and reading that document
metadata in a *.doc file. (Two example files are attached.)
Is there a way to extract this information with Tika for *.doc files, too?
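
To make the expected behaviour concrete, here is a minimal sketch of the check described above. It assumes the metadata Tika returns has already been copied into a plain name/value map; the class name StatusLookup, the method findStatus, and the sample value "Draft" are hypothetical and only serve the illustration. The two key names are the ones reported for *.docx files.

```java
import java.util.HashMap;
import java.util.Map;

class StatusLookup {
    // Metadata keys that, per this report, carry the status for *.docx files.
    static final String[] STATUS_KEYS = {"Content-Status", "cp:contentStatus"};

    /** Returns the first status value found, or null if neither key is present. */
    static String findStatus(Map<String, String> metadata) {
        for (String key : STATUS_KEYS) {
            String value = metadata.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; real values would come from the parsed files.
        Map<String, String> docx = new HashMap<String, String>();
        docx.put("cp:contentStatus", "Draft");

        // For *.doc files neither key is present, which is the reported problem.
        Map<String, String> doc = new HashMap<String, String>();

        System.out.println("docx status: " + findStatus(docx));
        System.out.println("doc status:  " + findStatus(doc));
    }
}
```

With the attached files, the expectation is that both lookups return the same status value once the property is also extracted for *.doc.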
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira