"Office" interface should be renamed "MSOffice".
------------------------------------------------
Key: TIKA-18
URL: https://issues.apache.org/jira/browse/TIKA-18
Project: Tika
Issue Type: Improvement
Components: general
Affects Versions: 0.1-incubator
Reporter: Keith R. Bennett
Priority: Minor
Fix For: 0.1-incubator
The org.apache.tika.metadata.Office interface should be renamed MSOffice to
make clear to the reader that it applies to Microsoft Office and not
(necessarily) OpenOffice.
According to Chris Mattmann, the author of this interface:
"Basically it [the Office interface] applies to whatever office met fields
are made available via the Jakarta POI library, which is where we got the
list of met keys from while using the interface on Apache Nutch.
"Each one of the interfaces in the org.apache.tika.metadata package are
collections of met keys that we found used in commonplace throughout the
Nutch code, and that we tried to generalize into a canonical source. The
Office interface, were properties that we saw used in the parse-msword,
parse-msexcel/etc. plugins, which all rely on Jakarta POI I believe."
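To illustrate the proposed rename, here is a minimal sketch of what the
interface might look like under its new name. The three key names and their
string values are assumptions chosen for illustration only, not the actual
constants in org.apache.tika.metadata, and MSOfficeExample is a hypothetical
caller rather than any existing Tika or Nutch class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the renamed interface. The keys and values below
// are illustrative assumptions, not the real Tika constants.
interface MSOffice {
    String KEYWORDS = "Keywords";
    String LAST_AUTHOR = "Last-Author";
    String PAGE_COUNT = "Page-Count";
}

// Hypothetical caller: stores values extracted from a Microsoft Office
// document under the shared key names.
class MSOfficeExample {
    static Map<String, String> describe(String lastAuthor, int pageCount) {
        Map<String, String> metadata = new HashMap<>();
        metadata.put(MSOffice.LAST_AUTHOR, lastAuthor);
        metadata.put(MSOffice.PAGE_COUNT, Integer.toString(pageCount));
        return metadata;
    }
}
```

With the new name, a reference such as MSOffice.LAST_AUTHOR tells the reader
at the call site that the key comes from the Microsoft Office formats and is
not guaranteed to apply to OpenOffice documents.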
Patch coming...