Fix merge conflict.
Project: http://git-wip-us.apache.org/repos/asf/tika/repo
Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/9056894d
Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/9056894d
Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/9056894d

Branch: refs/heads/master
Commit: 9056894da580107d1a5a21b29a0b7042ffa15c42
Parents: 3fbc03c 7c245fa
Author: Chris Mattmann <[email protected]>
Authored: Tue Mar 1 21:41:57 2016 -0800
Committer: Chris Mattmann <[email protected]>
Committed: Tue Mar 1 21:41:57 2016 -0800

----------------------------------------------------------------------
 CHANGES.txt                                     |   2 +
 .../org/apache/tika/parser/pdf/PDF2XHTML.java   |  20 ++
 .../org/apache/tika/parser/pdf/PDFParser.java   |  35 +-
 .../apache/tika/parser/pdf/PDFParserConfig.java |  36 ++-
 .../apache/tika/parser/pdf/XFAExtractor.java    | 318 +++++++++++++++++++
 .../apache/tika/parser/pdf/PDFParser.properties |   3 +-
 .../apache/tika/parser/pdf/PDFParserTest.java   |  32 +-
 .../testPDF_XFA_govdocs1_258578.pdf             | Bin 0 -> 168176 bytes
 8 files changed, 442 insertions(+), 4 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/tika/blob/9056894d/CHANGES.txt
----------------------------------------------------------------------
diff --cc CHANGES.txt
index d5bebcd,05d6d76..e6603fa
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@@ -1,8 -1,7 +1,10 @@@
  Release 1.13 - ???
  
 +  * Tika now incorporates the Natural Language Toolkit (NLTK) from the
 +    Python community as an option for Named Entity Recognition (TIKA-1876).
 +
+   * Add support for XFA extraction via Pascal Essiembre (TIKA-1857).
+ 
    * Upgrade to sqlite-jdbc 3.8.11.2 (TIKA-1861). NOTE: this dependency
      is still <scope>provided</scope>. You need to include this dependency
      in order to parser sqlite files.
