[ https://issues.apache.org/jira/browse/TIKA-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15756049#comment-15756049 ]

Hudson commented on TIKA-2210:
------------------------------

SUCCESS: Integrated in Jenkins build Tika-trunk #1160 (See [https://builds.apache.org/job/Tika-trunk/1160/])
TIKA-2210 -- add experimental SAX parser for pptx -- this is a first (tallison: rev 90cdf1f6a844e0d0541167bc0364bb3963f93b2d)
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/OfficeParserConfig.java
* (delete) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFRunProperties.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/AbstractOOXMLExtractor.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xslf/XSLFTikaBodyPartHandler.java
* (add) tika-parsers/src/test/resources/test-documents/testPPTX_overlappingRelations.pptx
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/SXWPFWordExtractorDecorator.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/MetadataExtractor.java
* (edit) CHANGES.txt
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/AbstractDocumentXMLBodyHandler.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xslf/XSLFDocumentXMLBodyHandler.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFEventBasedWordExtractor.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/AbstractOfficeParser.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFStylesShim.java
* (add) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXSLFExtractorTest.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractorFactory.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xslf/XSLFEventBasedPowerPointExtractor.java
* (add) tika-parsers/src/test/resources/test-documents/testPPT_various2.pptx
* (edit) tika-core/src/test/java/org/apache/tika/TikaTest.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/SXSLFPowerPointExtractorDecorator.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/RunProperties.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFTikaBodyPartHandler.java
* (add) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/ParagraphProperties.java
* (edit) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFDocumentXMLBodyHandler.java
* (delete) tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFParagraphProperties.java


> Add experimental SAX/Streaming XSLF/pptx extractor
> --------------------------------------------------
>
>                 Key: TIKA-2210
>                 URL: https://issues.apache.org/jira/browse/TIKA-2210
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>
> On TIKA-2201, [~sevaa] shared a reasonably sized pptx that caused an OOM.  
> While the SAX docx parser is still fresh in my mind, let's add one for pptx.
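
For readers unfamiliar with the approach, here is a minimal sketch of what SAX/streaming extraction of slide text can look like. It is not the code from this commit: the class name SlideTextSketch and the method extractText are hypothetical, and it uses only the JDK's SAX API. It assumes the input is the XML of a single slide part and that text runs sit in DrawingML a:t elements inside a:p paragraphs. Because the handler keeps only a flag and an output buffer, memory use does not grow with a full in-memory object model of the slide.

```java
import java.io.InputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration only; not Tika's XSLF extractor.
public class SlideTextSketch {
    // DrawingML main namespace, which holds the a:p (paragraph) and a:t (text) elements.
    static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Streams one slide's XML and returns its text, one line per a:p paragraph.
    static String extractText(InputStream slideXml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so uri/localName are populated
        final StringBuilder out = new StringBuilder();

        factory.newSAXParser().parse(slideXml, new DefaultHandler() {
            private boolean inText = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (DRAWINGML_NS.equals(uri) && "t".equals(localName)) {
                    inText = true;
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Only character data inside a:t is slide text.
                if (inText) {
                    out.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!DRAWINGML_NS.equals(uri)) {
                    return;
                }
                if ("t".equals(localName)) {
                    inText = false;
                } else if ("p".equals(localName)) {
                    out.append('\n'); // end of a paragraph
                }
            }
        });
        return out.toString();
    }
}
```

In a real pptx the slide parts are entries inside the zip package, so a caller would open each entry (for example with java.util.zip.ZipFile) and pass its stream to extractText.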



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
