[ 
https://issues.apache.org/jira/browse/TIKA-1057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Stroeter updated TIKA-1057:
----------------------------------

    Description: 
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika correctly extracts the status property from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from *.doc Word documents using Tika.

Word 2010, however, has no problem setting and reading this metadata
in a *.doc file.

Is there a way to extract this information from *.doc files with Tika, too?

  was:
I would like to use Tika to extract the document property "Status"
from a Word 97-2003 *.doc file.

Tika correctly extracts the status property from XML-based *.docx files as
"Content-Status" and "cp:contentStatus", but I cannot extract this metadata
from *.doc Word documents using Tika.

Word 2010, however, has no problem setting and reading this metadata
in a *.doc file. (Attached are two example files)

Is there a way to extract this information from *.doc files with Tika, too?

    
> The document property "Status" is not extracted for *.doc files
> ---------------------------------------------------------------
>
>                 Key: TIKA-1057
>                 URL: https://issues.apache.org/jira/browse/TIKA-1057
>             Project: Tika
>          Issue Type: Bug
>         Environment: java 1.5 / Windows
>            Reporter: Thomas Stroeter
>            Priority: Minor
>
> I would like to use Tika to extract the document property "Status"
> from a Word 97-2003 *.doc file.
>
> Tika correctly extracts the status property from XML-based *.docx files
> as "Content-Status" and "cp:contentStatus", but I cannot extract this
> metadata from *.doc Word documents using Tika.
> Word 2010, however, has no problem setting and reading this metadata
> in a *.doc file.
> Is there a way to extract this information from *.doc files with Tika, too?
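
For comparison, the *.docx side of this report can be checked without Tika.
The sketch below is only an illustration, not Tika's implementation: it reads
the cp:contentStatus element directly from the core properties part of a
*.docx package using the JDK alone. The class name DocxStatus and the method
readContentStatus are made up for this example, the part path
docProps/core.xml is the conventional location and is assumed rather than
resolved through the package relationships, and the code needs Java 7 or
later (not the Java 1.5 listed in the environment). It does not handle *.doc
files, which use a binary OLE2 container rather than a zip package.

```java
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DocxStatus {
    // Namespace bound to the "cp" prefix in the core properties part.
    static final String CP_NS =
        "http://schemas.openxmlformats.org/package/2006/metadata/core-properties";

    /** Returns the cp:contentStatus text, or null if the part or element is absent. */
    static String readContentStatus(String path) throws Exception {
        try (ZipFile zip = new ZipFile(path)) {
            // Assumption: core properties live at the conventional part name.
            ZipEntry entry = zip.getEntry("docProps/core.xml");
            if (entry == null) {
                return null;
            }
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the namespace lookup below
            try (InputStream in = zip.getInputStream(entry)) {
                Document doc = factory.newDocumentBuilder().parse(in);
                NodeList nodes = doc.getElementsByTagNameNS(CP_NS, "contentStatus");
                return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxStatus <file.docx>");
            return;
        }
        System.out.println("Status: " + readContentStatus(args[0]));
    }
}
```

Running this against one of the *.docx test files should show the same value
that Tika reports under "cp:contentStatus", which helps confirm that the
property is set in the file before looking at the *.doc case.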
