[ 
https://issues.apache.org/jira/browse/TIKA-2210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15818815#comment-15818815
 ] 

Hudson commented on TIKA-2210:
------------------------------

SUCCESS: Integrated in Jenkins build tika-2.x #194 (See [https://builds.apache.org/job/tika-2.x/194/])
TIKA-2210 -- add experimental SAX parser for pptx and update (also (tallison: 
rev 68161573140cb584f8af136c57045fbca833fec5)
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XWPFListManager.java
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLWordAndPowerPointTextHandler.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/xwpf/ml2006/Word2006MLParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/OfficeParserConfig.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/AbstractOOXMLExtractor.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractor.java
* (add) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXSLFExtractorTest.java
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/ParagraphProperties.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/ml2006/Word2006MLDocHandler.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLExtractorFactory.java
* (add) tika-test-resources/src/test/resources/test-documents/testWORD_template.dotx
* (edit) tika-app/src/test/java/org/apache/tika/parser/TestParsers.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/MetadataExtractor.java
* (add) tika-test-resources/src/test/resources/test-documents/testWORD_template.docx
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFEventBasedWordExtractor.java
* (add) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/SXWPFExtractorTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSLFPowerPointExtractorDecorator.java
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/RunProperties.java
* (add) tika-test-resources/src/test/resources/test-documents/testPPTX_overlappingRelations.pptx
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xslf/XSLFEventBasedPowerPointExtractor.java
* (edit) CHANGES.txt
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLTikaBodyPartHandler.java
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/xwpf/SXWPFExtractorTest.java
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/SXSLFPowerPointExtractorDecorator.java
* (add) tika-test-resources/src/test/resources/test-documents/testWORD_embedded_pics.docx
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFNumberingShim.java
* (edit) tika-core/src/test/java/org/apache/tika/TikaTest.java
* (add) tika-test-resources/src/test/resources/test-documents/testPPT_various2.pptx
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/XWPFStylesShim.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/SXWPFWordExtractorDecorator.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/XSSFExcelExtractorDecorator.java
* (add) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/xwpf/ml2006/WordAndPowerPointTextPartHandler.java


> Add experimental SAX/Streaming XSLF/pptx extractor
> --------------------------------------------------
>
>                 Key: TIKA-2210
>                 URL: https://issues.apache.org/jira/browse/TIKA-2210
>             Project: Tika
>          Issue Type: Improvement
>          Components: parser
>    Affects Versions: 1.14
>            Reporter: Tim Allison
>            Assignee: Tim Allison
>            Priority: Minor
>             Fix For: 2.0, 1.15
>
>
> On TIKA-2201, [~sevaa] shared a reasonably sized pptx that caused an OOM.  
> While the SAX docx parser is still fresh in my mind, let's add one for pptx.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
