[ https://issues.apache.org/jira/browse/TIKA-2187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15710536#comment-15710536 ]
Hudson commented on TIKA-2187:
------------------------------

SUCCESS: Integrated in Jenkins build tika-2.x #180 (See [https://builds.apache.org/job/tika-2.x/180/])
TIKA-2187 -- make "ignore deleted" as the default in the experimental (tallison: rev 3d08da79febc75d1ca0fd3293a5f383983057b00)
* (edit) tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/xwpf/SXWPFExtractorTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/xwpf/ml2006/Word2006MLParserTest.java
* (edit) tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java
* (add) tika-test-resources/src/test/resources/test-documents/testWORD_2006ml.doc
* (edit) CHANGES.txt
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/OfficeParserConfig.java
* (edit) tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/microsoft/WordExtractor.java

> Align default behavior of experimental docx parser with that of doc parser in handling delText
> ----------------------------------------------------------------------------------------------
>
>                 Key: TIKA-2187
>                 URL: https://issues.apache.org/jira/browse/TIKA-2187
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Minor
>             Fix For: 2.0, 1.15
>
>
> Now that we can ignore delText via the experimental alternate SAXParser for .docx files, let's make that the default behavior to align with the expected behavior for our .doc parser (ignore deleted text).
> Let's also add the ability to include deleted text from .doc files.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)