[ https://issues.apache.org/jira/browse/TIKA-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283063#comment-15283063 ]

Hudson commented on TIKA-1958:
------------------------------

SUCCESS: Integrated in tika-2.x #93 (See [https://builds.apache.org/job/tika-2.x/93/])
TIKA-1958: add mime detection and parsers for MSOffice 2003 wordml and 
(tallison: rev a882a3242f4c94728a0129643bb52381e0e4c096)
* tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/xml/SpreadsheetMLParser.java
* tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/xml/WordMLParser.java
* tika-test-resources/src/test/resources/test-documents/testEXCEL2003.xml
* tika-core/src/main/resources/org/apache/tika/mime/tika-mimetypes.xml
* tika-test-resources/src/test/resources/test-documents/testWORD2003.xml
* tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/xml/HyperlinkHandler.java
* tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/xml/AbstractXML2003Parser.java
* tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/xml/XML2003ParserTest.java
* CHANGES.txt
* tika-parser-bundles/tika-parser-office-bundle/src/test/java/org/apache/tika/module/office/BundleIT.java
* tika-parser-modules/tika-parser-office-module/src/main/resources/META-INF/services/org.apache.tika.parser.Parser


> Add mime detection and lightweight parsers for Office 2003 Word and Excel formats
> ---------------------------------------------------------------------------------
>
>                 Key: TIKA-1958
>                 URL: https://issues.apache.org/jira/browse/TIKA-1958
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 2.0, 1.14
>
>         Attachments: 2010-cal-eu.xls, excel_msword_2003.tar.bz2
>
>
> Over on POI, a user asked whether we support Office 2003 XML-based xls
> files.  It would be neat if we could add mime detection and a "good enough"
> parser to handle the 2003 XML versions of xls and doc files.
> This could be a great task for someone who wants to get started contributing
> to Tika.
> References:
> https://mail-archives.apache.org/mod_mbox/poi-user/201604.mbox/%3Calpine.BSO.2.20.1604210825140.22929%40ref.nmedia.net%3E
> https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats
> https://msdn.microsoft.com/en-us/library/bb226687(v=office.11).aspx
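For illustration only, here is a minimal sketch of the kind of root-element
check that mime detection for these two formats could rely on. It is not the
code from the commit listed above. It assumes the Office 2003 root elements
w:wordDocument (namespace http://schemas.microsoft.com/office/word/2003/wordml)
and Workbook (namespace urn:schemas-microsoft-com:office:spreadsheet), and it
assumes the media type names application/vnd.ms-wordml and
application/vnd.ms-spreadsheetml. The class and method names are hypothetical;
only JDK StAX classes are used.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Tika: guesses a media type from the
// root element of an XML document.
public class Office2003XmlSniffer {

    // Assumed namespace of the Word 2003 XML root element.
    static final String WORDML_NS =
            "http://schemas.microsoft.com/office/word/2003/wordml";
    // Assumed namespace of the Excel 2003 XML root element.
    static final String SPREADSHEETML_NS =
            "urn:schemas-microsoft-com:office:spreadsheet";

    static String sniff(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Do not process DTDs or external entities in untrusted input.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
        XMLStreamReader reader = null;
        try {
            reader = factory.createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                // Skip processing instructions and comments before the root.
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    String ns = reader.getNamespaceURI();
                    String name = reader.getLocalName();
                    if (WORDML_NS.equals(ns) && "wordDocument".equals(name)) {
                        return "application/vnd.ms-wordml";
                    }
                    if (SPREADSHEETML_NS.equals(ns) && "Workbook".equals(name)) {
                        return "application/vnd.ms-spreadsheetml";
                    }
                    // Some other XML vocabulary: only the root is inspected.
                    return "application/xml";
                }
            }
        } catch (XMLStreamException e) {
            // Not well-formed XML: fall through to the generic type.
        } finally {
            if (reader != null) {
                try {
                    reader.close();
                } catch (XMLStreamException ignored) {
                    // Nothing useful to do if closing fails.
                }
            }
        }
        return "application/octet-stream";
    }
}
```

Only the first start element is examined, so the check stays cheap even for
large files; a real detector would work on a bounded prefix of the input
stream rather than on a full String.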



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
