svn commit: r1511816 - in /tika/trunk: CHANGES.txt tika-core/src/test/java/org/apache/tika/mime/MimeDetectionTest.java tika-core/src/test/resources/org/apache/tika/mime/test-tika-327.html

2013-08-08 Thread tallison
Author: tallison Date: Thu Aug 8 14:50:15 2013 New Revision: 1511816 URL: http://svn.apache.org/r1511816 Log: Tika 1139 update to 1129 Added: tika/trunk/tika-core/src/test/resources/org/apache/tika/mime/test-tika-327.html Modified: tika/trunk/CHANGES.txt tika/trunk/tika-core/src

svn commit: r1511901 - in /tika/trunk: ./ tika-parsers/src/main/java/org/apache/tika/parser/pdf/ tika-parsers/src/test/java/org/apache/tika/parser/pdf/ tika-parsers/src/test/resources/test-documents/

2013-08-08 Thread tallison
Author: tallison Date: Thu Aug 8 17:55:27 2013 New Revision: 1511901 URL: http://svn.apache.org/r1511901 Log: TIKA-1124, process attachments within an embedded PDF Added: tika/trunk/tika-parsers/src/test/resources/test-documents/TIKA-1142.docx (with props) Modified: tika/trunk

svn commit: r1511908 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/parser/pdf/PDFParserTest.java resources/test-documents/TIKA-1142.docx resources/test-documents/testPDFEmbeddingAndEmbe

2013-08-08 Thread tallison
Author: tallison Date: Thu Aug 8 18:09:39 2013 New Revision: 1511908 URL: http://svn.apache.org/r1511908 Log: Tika 1124 not 1142...sorry Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testPDFEmbeddingAndEmbedded.docx (with props) Removed: tika/trunk/tika-parsers

svn commit: r1514126 - in /tika/trunk: ./ tika-parsers/src/main/java/org/apache/tika/parser/html/ tika-parsers/src/test/java/org/apache/tika/parser/html/ tika-parsers/src/test/resources/test-documents

2013-08-14 Thread tallison
Author: tallison Date: Thu Aug 15 01:59:26 2013 New Revision: 1514126 URL: http://svn.apache.org/r1514126 Log: TIKA 1001 more flexible html meta-header encoding detector Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testHTMLNoisyMetaEncoding_1.html tika/trunk/tika

svn commit: r1514551 - in /tika/trunk: CHANGES.txt tika-parsers/pom.xml

2013-08-15 Thread tallison
Author: tallison Date: Fri Aug 16 01:15:40 2013 New Revision: 1514551 URL: http://svn.apache.org/r1514551 Log: TIKA-1153 upgrade PDFBox to 1.8.2 Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/tika

svn commit: r1524741 - /tika/trunk/tika-parsers/pom.xml

2013-09-19 Thread tallison
Author: tallison Date: Thu Sep 19 13:55:28 2013 New Revision: 1524741 URL: http://svn.apache.org/r1524741 Log: bumped poi to 3.10-beta2 Modified: tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/tika-parsers/pom.xml URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers/pom.xml?rev

svn commit: r1526570 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java resources/test-documents/testWORD_missing_ooxml_bean1.docx

2013-09-26 Thread tallison
Author: tallison Date: Thu Sep 26 15:25:19 2013 New Revision: 1526570 URL: http://svn.apache.org/r1526570 Log: TIKA-792 fixed by POI-3.10-beta2; added test for missing ooxml bean Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testWORD_missing_ooxml_bean1.docx

svn commit: r1526593 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java

2013-09-26 Thread tallison
Author: tallison Date: Thu Sep 26 16:18:07 2013 New Revision: 1526593 URL: http://svn.apache.org/r1526593 Log: commented out TIKA-792 test for now Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java Modified: tika/trunk/tika-parsers

svn commit: r1526907 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 14:03:14 2013 New Revision: 1526907 URL: http://svn.apache.org/r1526907 Log: second attempt to add test for detecting missing ooxml bean. Builds successfully locally. Jenkins failed last time. Stack traces didn't point to this test; but redirecting stderr may

svn commit: r1526975 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 16:18:54 2013 New Revision: 1526975 URL: http://svn.apache.org/r1526975 Log: TIKA-1076 extract text from tables in ppt. Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java tika/trunk/tika-parsers/src/test

svn commit: r1526981 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/parser/microsoft/ java/org/apache/tika/parser/microsoft/ooxml/ resources/test-documents/

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 16:49:52 2013 New Revision: 1526981 URL: http://svn.apache.org/r1526981 Log: TIKA-817 -- autodates in ppt and pptx. Already fixed by TIKA-805. Added files and tests to confirm behavior specified in POI-52367 and POI-52368 Added: tika/trunk/tika-parsers/src

svn commit: r1527030 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/microsoft/HSLFExtractor.java test/java/org/apache/tika/parser/microsoft/PowerPointParserTest.java

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 18:55:31 2013 New Revision: 1527030 URL: http://svn.apache.org/r1527030 Log: TIKA-1171 -- extra asterisks from master slide in PPT; added tests to TIKA-712 test files to show 1171 was fixed. Borrowed extraction code from POI PowerPointExtractor Modified

svn commit: r1527038 - /tika/trunk/CHANGES.txt

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 19:14:25 2013 New Revision: 1527038 URL: http://svn.apache.org/r1527038 Log: updated CHANGES.txt to cover recent activity Modified: tika/trunk/CHANGES.txt Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/tika/trunk/CHANGES.txt?rev

svn commit: r1527044 - /tika/trunk/CHANGES.txt

2013-09-27 Thread tallison
Author: tallison Date: Fri Sep 27 19:38:03 2013 New Revision: 1527044 URL: http://svn.apache.org/r1527044 Log: added 1130 to CHANGES.txt Modified: tika/trunk/CHANGES.txt Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/tika/trunk/CHANGES.txt?rev=1527044&r1=1527043&r2

svn commit: r1547037 - /tika/trunk/tika-parsers/pom.xml

2013-12-02 Thread tallison
Author: tallison Date: Mon Dec 2 14:46:38 2013 New Revision: 1547037 URL: http://svn.apache.org/r1547037 Log: TIKA-1200 upgrade pdfbox to 1.8.3 Modified: tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/tika-parsers/pom.xml URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers

svn commit: r1550725 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/ main/resources/org/apache/tika/parser/pdf/ test/java/org/apache/tika/parser/pdf/ test/resources/test-docum

2013-12-13 Thread tallison
Author: tallison Date: Fri Dec 13 13:20:43 2013 New Revision: 1550725 URL: http://svn.apache.org/r1550725 Log: TIKA-973 reopened. Would prefer test docs unequivocally consistent with Apache License 2.0. Deleted initial test docs from trunk and commented out test case. Also added

svn commit: r1561661 - in /tika/trunk: ./ tika-parsers/src/main/java/org/apache/tika/parser/pdf/ tika-parsers/src/test/java/org/apache/tika/parser/pdf/ tika-parsers/src/test/resources/test-documents/

2014-01-27 Thread tallison
Author: tallison Date: Mon Jan 27 13:09:16 2014 New Revision: 1561661 URL: http://svn.apache.org/r1561661 Log: TIKA-1226: PDF TextStripper fails when it encounters PDSignature Field. Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testPDF_acroform3.pdf (with props

svn commit: r1561665 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

2014-01-27 Thread tallison
Author: tallison Date: Mon Jan 27 13:18:54 2014 New Revision: 1561665 URL: http://svn.apache.org/r1561665 Log: TIKA-1226, removed println...doh. Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java Modified: tika/trunk/tika-parsers/src/main/java/org

svn commit: r1564042 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/ test/java/org/apache/tika/ test/java/org/apache/tika/parser/pdf/ test/resources/test-documents/

2014-02-03 Thread tallison
Author: tallison Date: Mon Feb 3 20:11:10 2014 New Revision: 1564042 URL: http://svn.apache.org/r1564042 Log: TIKA-1228: Look for attachments under Kids node if embeddedFiles.getNames() returns null Added: tika/trunk/tika-parsers/src/test/resources/test-documents

svn commit: r1567074 - /tika/trunk/tika-parsers/pom.xml

2014-02-11 Thread tallison
Author: tallison Date: Tue Feb 11 12:08:26 2014 New Revision: 1567074 URL: http://svn.apache.org/r1567074 Log: TIKA-1237 upgrade to poi-3.10-FINAL Modified: tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/tika-parsers/pom.xml URL: http://svn.apache.org/viewvc/tika/trunk/tika-parsers

svn commit: r1569788 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2014-02-19 Thread tallison
Author: tallison Date: Wed Feb 19 15:27:24 2014 New Revision: 1569788 URL: http://svn.apache.org/r1569788 Log: got rid of brittle requirement for specific number of pdfs to be tested in PDFParserTest Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf

svn commit: r1574959 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/ test/java/org/apache/tika/parser/pdf/ test/resources/test-documents/

2014-03-06 Thread tallison
Author: tallison Date: Thu Mar 6 16:52:19 2014 New Revision: 1574959 URL: http://svn.apache.org/r1574959 Log: TIKA-1232: add fine-grained pdf version extraction Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testPDF_Version.10.x.pdf (with props) tika/trunk/tika

svn commit: r1575112 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java

2014-03-06 Thread tallison
Author: tallison Date: Fri Mar 7 01:27:41 2014 New Revision: 1575112 URL: http://svn.apache.org/r1575112 Log: TIKA-1252 small clean up Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika

svn commit: r1575116 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/ test/java/org/apache/tika/parser/pdf/

2014-03-06 Thread tallison
Author: tallison Date: Fri Mar 7 01:50:34 2014 New Revision: 1575116 URL: http://svn.apache.org/r1575116 Log: clean up whitespace in PDFParser components Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java tika/trunk/tika-parsers/src/main/java/org

svn commit: r1575120 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/mbox/OutlookPSTParser.java

2014-03-06 Thread tallison
Author: tallison Date: Fri Mar 7 01:57:26 2014 New Revision: 1575120 URL: http://svn.apache.org/r1575120 Log: cleanup whitespace in OutlookPSTParser Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/mbox/OutlookPSTParser.java Modified: tika/trunk/tika-parsers/src

svn commit: r1586529 - in /tika/trunk/tika-parsers/src/test/java/org/apache/tika: ./ parser/microsoft/ parser/microsoft/ooxml/ parser/pdf/ parser/xml/

2014-04-10 Thread tallison
Author: tallison Date: Fri Apr 11 01:48:48 2014 New Revision: 1586529 URL: http://svn.apache.org/r1586529 Log: TIKA-1271: trivial refactoring of classes useful for testing embedded document handling Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/TikaTest.java tika

svn commit: r1588005 [1/4] - in /tika/trunk: ./ tika-core/src/main/java/org/apache/tika/io/ tika-core/src/main/java/org/apache/tika/metadata/ tika-core/src/test/java/org/apache/tika/io/ tika-parsers/s

2014-04-16 Thread tallison
Author: tallison Date: Wed Apr 16 18:04:20 2014 New Revision: 1588005 URL: http://svn.apache.org/r1588005 Log: TIKA-1010 extract embedded documents from RTF Added: tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/RTFMetadata.java tika/trunk/tika-parsers/src/main/java/org

svn commit: r1589778 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/code/SourceCodeParserTest.java

2014-04-24 Thread tallison
Author: tallison Date: Thu Apr 24 16:02:01 2014 New Revision: 1589778 URL: http://svn.apache.org/r1589778 Log: TIKA-1279 trivial fix caps in testJAVA.java in test cases so that tests pass in *nix Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/code

svn commit: r1593983 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java

2014-05-12 Thread tallison
Author: tallison Date: Mon May 12 14:46:16 2014 New Revision: 1593983 URL: http://svn.apache.org/r1593983 Log: TIKA-1233: removed catch blocks after upgrade to PDFBOX-1.8.5; see PDFBOX-1803 Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java Modified

svn commit: r1593996 - in /tika/trunk: CHANGES.txt tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

2014-05-12 Thread tallison
Author: tallison Date: Mon May 12 15:14:09 2014 New Revision: 1593996 URL: http://svn.apache.org/r1593996 Log: TIKA-1231: added more null checks after underlying fix was made in PDFBox-1.8.5 Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser

svn commit: r1594958 - /tika/trunk/tika-parsers/src/test/resources/test-documents/testPDFTripleLangTitle.pdf

2014-05-16 Thread tallison
Author: tallison Date: Thu May 15 15:50:18 2014 New Revision: 1594958 URL: http://svn.apache.org/r1594958 Log: test doc actually added for r1594957 temporary bug fix until TIKA-1295 is resolved Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testPDFTripleLangTitle.pdf

svn commit: r1594930 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2014-05-16 Thread tallison
Author: tallison Date: Thu May 15 14:36:24 2014 New Revision: 1594930 URL: http://svn.apache.org/r1594930 Log: Ignore a test until TIKA-1298 is fixed Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java Modified: tika/trunk/tika-parsers/src/test

svn commit: r1597132 - in /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/rtf: RTFEmbObjHandler.java RTFObjDataParser.java

2014-05-23 Thread tallison
Author: tallison Date: Fri May 23 17:11:28 2014 New Revision: 1597132 URL: http://svn.apache.org/r1597132 Log: add license header to RTFObjDataParser and clean up whitespace in RTFEmbObjHandler Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/rtf/RTFEmbObjHandler.java

svn commit: r1597856 - in /tika/trunk: ./ tika-core/src/main/java/org/apache/tika/metadata/ tika-parsers/src/main/java/org/apache/tika/parser/pdf/ tika-parsers/src/main/resources/org/apache/tika/parse

2014-05-27 Thread tallison
Author: tallison Date: Tue May 27 19:33:07 2014 New Revision: 1597856 URL: http://svn.apache.org/r1597856 Log: TIKA-1294 add ability to turn off image extraction from PDFs Modified: tika/trunk/CHANGES.txt tika/trunk/tika-core/src/main/java/org/apache/tika/metadata

svn commit: r1598305 - in /tika/trunk: tika-core/src/main/java/org/apache/tika/metadata/TikaCoreProperties.java tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java tika-parsers/src/te

2014-05-29 Thread tallison
Author: tallison Date: Thu May 29 14:37:25 2014 New Revision: 1598305 URL: http://svn.apache.org/r1598305 Log: fix to TIKA-1294, uppercase enum Modified: tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/TikaCoreProperties.java tika/trunk/tika-parsers/src/main/java/org/apache

svn commit: r1598693 - in /tika/trunk: CHANGES.txt tika-parsers/src/main/java/org/apache/tika/parser/rtf/TextExtractor.java tika-parsers/src/test/java/org/apache/tika/parser/rtf/RTFParserTest.java

2014-05-30 Thread tallison
Author: tallison Date: Fri May 30 18:23:15 2014 New Revision: 1598693 URL: http://svn.apache.org/r1598693 Log: TIKA-1305: make RTF list handling slightly more robust against corrupt list metadata Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/src/main/java/org/apache/tika

svn commit: r1598698 - /tika/trunk/tika-parsers/src/test/resources/test-documents/testRTFCorruptListOverride.rtf

2014-05-30 Thread tallison
Author: tallison Date: Fri May 30 18:30:44 2014 New Revision: 1598698 URL: http://svn.apache.org/r1598698 Log: TIKA-1305: test file added to svn...argh. Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testRTFCorruptListOverride.rtf Added: tika/trunk/tika-parsers/src/test

svn commit: r1600554 - in /tika/trunk: ./ src/site/apt/ tika-app/ tika-app/src/main/java/org/apache/tika/cli/ tika-serialization/ tika-serialization/src/ tika-serialization/src/main/ tika-serializatio

2014-06-04 Thread tallison
Author: tallison Date: Thu Jun 5 01:42:27 2014 New Revision: 1600554 URL: http://svn.apache.org/r1600554 Log: TIKA-1311 centralize serialization Added: tika/trunk/tika-serialization/ tika/trunk/tika-serialization/pom.xml tika/trunk/tika-serialization/src/ tika/trunk/tika

svn commit: r1603208 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/PDFParser.java test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2014-06-17 Thread tallison
Author: tallison Date: Tue Jun 17 16:05:44 2014 New Revision: 1603208 URL: http://svn.apache.org/r1603208 Log: TIKA-1341: fix double endDocument in PDFParser Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java tika/trunk/tika-parsers/src/test/java

svn commit: r1604989 - in /tika/trunk/tika-parsers: pom.xml src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java src/main/java/org/apache/tika/parser/pdf/PDFParser.java src/test/java/org/apache/tik

2014-06-23 Thread tallison
Author: tallison Date: Tue Jun 24 01:12:45 2014 New Revision: 1604989 URL: http://svn.apache.org/r1604989 Log: TIKA-1352 upgrade to PDFBox 1.8.6 Modified: tika/trunk/tika-parsers/pom.xml tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java tika/trunk/tika

svn commit: r1604995 - /tika/trunk/CHANGES.txt

2014-06-23 Thread tallison
Author: tallison Date: Tue Jun 24 01:51:28 2014 New Revision: 1604995 URL: http://svn.apache.org/r1604995 Log: TIKA-1352 update CHANGES.txt Modified: tika/trunk/CHANGES.txt Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/tika/trunk/CHANGES.txt?rev=1604995&r1=1604994&r2

svn commit: r1613122 - in /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf: PDF2XHTML.java PDFParser.java PDFParserConfig.java

2014-07-24 Thread tallison
Author: tallison Date: Thu Jul 24 13:37:35 2014 New Revision: 1613122 URL: http://svn.apache.org/r1613122 Log: Fix potential NPE and fix javadoc refs for PDFParser Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java tika/trunk/tika-parsers/src/main

svn commit: r1613395 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

2014-07-25 Thread tallison
Author: tallison Date: Fri Jul 25 11:46:51 2014 New Revision: 1613395 URL: http://svn.apache.org/r1613395 Log: TIKA-1375: decrease memory consumption when extracting images in PDFs Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java Modified: tika

svn commit: r1615174 - /tika/trunk/tika-parsers/src/test/resources/test-documents/testWORD_missing_text.docx

2014-08-01 Thread tallison
Author: tallison Date: Fri Aug 1 17:31:21 2014 New Revision: 1615174 URL: http://svn.apache.org/r1615174 Log: TIKA-1380: staging an updated test file for the actual patch once POI 3.11-beta-1 is released Modified: tika/trunk/tika-parsers/src/test/resources/test-documents

svn commit: r1615538 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/parser/microsoft/ java/org/apache/tika/parser/microsoft/ooxml/ resources/test-documents/

2014-08-04 Thread tallison
Author: tallison Date: Mon Aug 4 12:12:46 2014 New Revision: 1615538 URL: http://svn.apache.org/r1615538 Log: added test and test docs for comments in xls and xlsx; lack of tests detected during work on TIKA-1380 Added: tika/trunk/tika-parsers/src/test/resources/test-documents

svn commit: r1615630 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/ java/org/apache/tika/parser/microsoft/ java/org/apache/tika/parser/microsoft/ooxml/ resources/test-documents/

2014-08-04 Thread tallison
Author: tallison Date: Mon Aug 4 15:51:28 2014 New Revision: 1615630 URL: http://svn.apache.org/r1615630 Log: Found existing comments test in TestParsers; clean up earlier tests for comments in xls and xlsx Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testComment.xls

svn commit: r1615667 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/microsoft/ooxml/XWPFWordExtractorDecorator.java test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTes

2014-08-04 Thread tallison
Author: tallison Date: Mon Aug 4 16:45:36 2014 New Revision: 1615667 URL: http://svn.apache.org/r1615667 Log: TIKA-1317 extract contents from SDTs within cells in tables in XWPF (docx) files Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml

svn commit: r1615675 - in /tika/branches/1.6/tika-parsers/src: main/java/org/apache/tika/parser/microsoft/ooxml/XWPFWordExtractorDecorator.java test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLPa

2014-08-04 Thread tallison
Author: tallison Date: Mon Aug 4 16:51:40 2014 New Revision: 1615675 URL: http://svn.apache.org/r1615675 Log: TIKA-1317 extract contents from SDTs within cells in tables in XWPF (docx) files Modified: tika/branches/1.6/tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ooxml

svn commit: r1615923 - in /tika/branches/1.6: CHANGES.txt tika-parsers/pom.xml

2014-08-05 Thread tallison
Author: tallison Date: Tue Aug 5 13:03:05 2014 New Revision: 1615923 URL: http://svn.apache.org/r1615923 Log: TIKA-1275 upgrade Commons Compress to 1.8.1; updated CHANGES.txt, too Modified: tika/branches/1.6/CHANGES.txt tika/branches/1.6/tika-parsers/pom.xml Modified: tika/branches/1.6

svn commit: r1615926 - in /tika/trunk: CHANGES.txt tika-parsers/pom.xml

2014-08-05 Thread tallison
Author: tallison Date: Tue Aug 5 13:15:12 2014 New Revision: 1615926 URL: http://svn.apache.org/r1615926 Log: TIKA-1275 upgrade commons compress to 1.8.1; updated CHANGES.txt, too Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/CHANGES.txt URL

svn commit: r1615970 - in /tika/branches/1.6: tika-app/src/test/java/org/apache/tika/cli/ tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ tika-parsers/src/main/java/org/apache/tika/parser

2014-08-05 Thread tallison
Author: tallison Date: Tue Aug 5 18:17:39 2014 New Revision: 1615970 URL: http://svn.apache.org/r1615970 Log: TIKA-1380; fix for null ole.getLabel() Modified: tika/branches/1.6/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java tika/branches/1.6/tika-parsers/src/main/java/org

svn commit: r1615980 - in /tika/trunk: tika-app/src/test/java/org/apache/tika/cli/ tika-parsers/src/main/java/org/apache/tika/parser/microsoft/ tika-parsers/src/main/java/org/apache/tika/parser/micros

2014-08-05 Thread tallison
Author: tallison Date: Tue Aug 5 19:02:11 2014 New Revision: 1615980 URL: http://svn.apache.org/r1615980 Log: TIKA-1380; fix cases where ole.getLabel() == null for ole attachments Modified: tika/trunk/tika-app/src/test/java/org/apache/tika/cli/TikaCLITest.java tika/trunk/tika-parsers

svn commit: r1626221 - in /tika/trunk: tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/main/java/org/apache/tika/gui/ tika-app/src/test/java/org/apache/tika/cli/ tika-core/src/main/java/org/a

2014-09-19 Thread tallison
Author: tallison Date: Fri Sep 19 14:00:24 2014 New Revision: 1626221 URL: http://svn.apache.org/r1626221 Log: TIKA-1418 add example for how to dump tika config; and add --config to CLI Modified: tika/trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java tika/trunk/tika-app/src

svn commit: r1626222 - in /tika/trunk: tika-app/src/test/resources/test-data/ tika-example/src/main/java/org/apache/tika/example/ tika-example/src/test/java/org/apache/tika/example/

2014-09-19 Thread tallison
Author: tallison Date: Fri Sep 19 14:02:16 2014 New Revision: 1626222 URL: http://svn.apache.org/r1626222 Log: TIKA-1418 add files Added: tika/trunk/tika-app/src/test/resources/test-data/bad_xml.xml tika/trunk/tika-app/src/test/resources/test-data/tika-config1.xml tika/trunk/tika

svn commit: r1626223 - /tika/trunk/tika-example/src/main/java/org/apache/tika/example/DumpTikaConfigExample.java

2014-09-19 Thread tallison
Author: tallison Date: Fri Sep 19 14:10:20 2014 New Revision: 1626223 URL: http://svn.apache.org/r1626223 Log: TIKA-1418 remove println...the horror. Modified: tika/trunk/tika-example/src/main/java/org/apache/tika/example/DumpTikaConfigExample.java Modified: tika/trunk/tika-example/src

svn commit: r1626300 - in /tika/trunk: tika-core/src/main/java/org/apache/tika/parser/ tika-core/src/main/java/org/apache/tika/sax/ tika-parsers/src/test/java/org/apache/tika/ tika-parsers/src/test/ja

2014-09-19 Thread tallison
Author: tallison Date: Fri Sep 19 19:18:08 2014 New Revision: 1626300 URL: http://svn.apache.org/r1626300 Log: TIKA-1329 add RecursiveParserWrapper Added: tika/trunk/tika-core/src/main/java/org/apache/tika/parser/RecursiveParserWrapper.java tika/trunk/tika-core/src/main/java/org/apache

svn commit: r1627304 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java

2014-09-24 Thread tallison
Author: tallison Date: Wed Sep 24 12:58:56 2014 New Revision: 1627304 URL: http://svn.apache.org/r1627304 Log: TIKA-1424: clear PDFont's resources after each document Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java Modified: tika/trunk/tika

svn commit: r1627308 - in /tika/trunk: CHANGES.txt tika-parsers/pom.xml

2014-09-24 Thread tallison
Author: tallison Date: Wed Sep 24 13:10:10 2014 New Revision: 1627308 URL: http://svn.apache.org/r1627308 Log: TIKA-1419: upgrade to PDFBox 1.8.7 and update CHANGES.txt for this and a few recent changes Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/pom.xml Modified: tika

svn commit: r1628350 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/PDF2XHTML.java test/java/org/apache/tika/parser/pdf/PDFParserTest.java test/resources/test-documents/testPD

2014-09-29 Thread tallison
Author: tallison Date: Tue Sep 30 01:41:20 2014 New Revision: 1628350 URL: http://svn.apache.org/r1628350 Log: TIKA-1433 : extract documents embedded within annotations in PDFs Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testPDFFileEmbInAnnotation.pdf (with props

svn commit: r1628715 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

2014-10-01 Thread tallison
Author: tallison Date: Wed Oct 1 14:35:46 2014 New Revision: 1628715 URL: http://svn.apache.org/r1628715 Log: TIKA-1427, small clean up to ensure that inline image number tracks with extracted file Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

svn commit: r1633357 - /tika/trunk/tika-app/src/main/java/org/apache/tika/io/

2014-10-21 Thread tallison
Author: tallison Date: Tue Oct 21 12:26:15 2014 New Revision: 1633357 URL: http://svn.apache.org/r1633357 Log: clean up from TIKA-1311 Removed: tika/trunk/tika-app/src/main/java/org/apache/tika/io/

svn commit: r1633499 - in /tika/trunk: ./ tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/main/java/org/apache/tika/gui/ tika-app/src/test/java/org/apache/tika/cli/ tika-app/src/test/resource

2014-10-21 Thread tallison
Author: tallison Date: Wed Oct 22 00:31:37 2014 New Revision: 1633499 URL: http://svn.apache.org/r1633499 Log: TIKA-1451 add RecursiveParserWrapper output to CLI and GUI Added: tika/trunk/tika-app/src/test/resources/test-data/test_recursive_embedded.docx (with props) tika/trunk/tika

svn commit: r1633845 - /tika/trunk/tika-serialization/src/main/java/org/apache/tika/metadata/serialization/JsonMetadataBase.java

2014-10-23 Thread tallison
Author: tallison Date: Thu Oct 23 15:45:20 2014 New Revision: 1633845 URL: http://svn.apache.org/r1633845 Log: move pretty print metadata key sorter into standalone class Modified: tika/trunk/tika-serialization/src/main/java/org/apache/tika/metadata/serialization/JsonMetadataBase.java

svn commit: r1633846 - /tika/trunk/tika-serialization/src/main/java/org/apache/tika/metadata/serialization/PrettyMetadataKeyComparator.java

2014-10-23 Thread tallison
Author: tallison Date: Thu Oct 23 15:46:09 2014 New Revision: 1633846 URL: http://svn.apache.org/r1633846 Log: move pretty print metadata key sorter into standalone class, with added PrettyMetadataKeyComparator...argh Added: tika/trunk/tika-serialization/src/main/java/org/apache/tika

svn commit: r1633850 - in /tika/trunk/tika-serialization: pom.xml src/test/java/org/apache/tika/metadata/serialization/JsonMetadataListTest.java src/test/java/org/apache/tika/metadata/serialization/Js

2014-10-23 Thread tallison
Author: tallison Date: Thu Oct 23 15:56:51 2014 New Revision: 1633850 URL: http://svn.apache.org/r1633850 Log: upgrade gson to 2.2.4 Modified: tika/trunk/tika-serialization/pom.xml tika/trunk/tika-serialization/src/test/java/org/apache/tika/metadata/serialization

svn commit: r1634594 - in /tika/trunk: tika-app/src/main/java/org/apache/tika/gui/TikaGUI.java tika-core/src/main/java/org/apache/tika/sax/BasicContentHandlerFactory.java tika-core/src/test/java/org/a

2014-10-27 Thread tallison
Author: tallison Date: Mon Oct 27 17:00:03 2014 New Revision: 1634594 URL: http://svn.apache.org/r1634594 Log: TIKA-1459 fix write limit bug in BasicContentHandlerFactory when creating a BodyContentHandler Added: tika/trunk/tika-core/src/test/java/org/apache/tika/sax

svn commit: r1635097 - /tika/trunk/tika-app/pom.xml

2014-10-29 Thread tallison
Author: tallison Date: Wed Oct 29 10:57:28 2014 New Revision: 1635097 URL: http://svn.apache.org/r1635097 Log: cleanup tika-app pom, remove unnecessary gson dependency Modified: tika/trunk/tika-app/pom.xml Modified: tika/trunk/tika-app/pom.xml URL: http://svn.apache.org/viewvc/tika/trunk

svn commit: r1637868 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/PDFParser.java test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2014-11-10 Thread tallison
Author: tallison Date: Mon Nov 10 14:15:22 2014 New Revision: 1637868 URL: http://svn.apache.org/r1637868 Log: TIKA-1467: in PDFParser, move metadata set isEncrypted() to before decryption step. Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java

svn commit: r1645684 - in /tika/trunk/tika-parsers: pom.xml src/main/java/org/apache/tika/parser/pdf/PDFParser.java src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2014-12-15 Thread tallison
Author: tallison Date: Mon Dec 15 16:13:31 2014 New Revision: 1645684 URL: http://svn.apache.org/r1645684 Log: TIKA-1442, upgrade to PDFBox 1.8.8 Modified: tika/trunk/tika-parsers/pom.xml tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java tika/trunk

svn commit: r1646612 - /tika/trunk/tika-server/src/main/java/org/apache/tika/server/TikaServerCli.java

2014-12-18 Thread tallison
Author: tallison Date: Fri Dec 19 02:07:04 2014 New Revision: 1646612 URL: http://svn.apache.org/r1646612 Log: TIKA-1498: now actually add providers to cli...argh Modified: tika/trunk/tika-server/src/main/java/org/apache/tika/server/TikaServerCli.java Modified: tika/trunk/tika-server/src

svn commit: r1646616 - in /tika/trunk/tika-server: ./ src/main/java/org/apache/tika/server/ src/test/java/org/apache/tika/server/

2014-12-18 Thread tallison
Author: tallison Date: Fri Dec 19 03:12:38 2014 New Revision: 1646616 URL: http://svn.apache.org/r1646616 Log: TIKA-1497: add JSON and XMP output to tika-server's /meta Added: tika/trunk/tika-server/src/main/java/org/apache/tika/server/XMPMessageBodyWriter.java Modified: tika/trunk/tika

svn commit: r1646617 - /tika/trunk/CHANGES.txt

2014-12-18 Thread tallison
Author: tallison Date: Fri Dec 19 03:13:56 2014 New Revision: 1646617 URL: http://svn.apache.org/r1646617 Log: TIKA-1497: update changes.txt Modified: tika/trunk/CHANGES.txt Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc/tika/trunk/CHANGES.txt?rev=1646617&r1=1646616&r2

svn commit: r1657952 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/mime/TestMimeTypes.java java/org/apache/tika/parser/font/FontParsersTest.java resources/test-documents/testTrueType2.t

2015-02-06 Thread tallison
Author: tallison Date: Fri Feb 6 20:33:02 2015 New Revision: 1657952 URL: http://svn.apache.org/r1657952 Log: TIKA-1542 substitute Apache friendly TTF test file for our current copyrighted file, take 2. See PDFBOX-2383 Added: tika/trunk/tika-parsers/src/test/resources/test-documents

svn commit: r1657739 - in /tika/trunk/tika-parsers/src/test: java/org/apache/tika/mime/TestMimeTypes.java java/org/apache/tika/parser/font/FontParsersTest.java resources/test-documents/testTrueType.tt

2015-02-05 Thread tallison
Author: tallison Date: Fri Feb 6 02:26:22 2015 New Revision: 1657739 URL: http://svn.apache.org/r1657739 Log: TIKA-1542 substitute Apache friendly TTF test file for our current copyrighted file Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testTrueType2.ttf

svn commit: r1653994 - /tika/trunk/tika-core/src/main/java/org/apache/tika/parser/external/ExternalParser.java

2015-01-22 Thread tallison
Author: tallison Date: Thu Jan 22 18:29:19 2015 New Revision: 1653994 URL: http://svn.apache.org/r1653994 Log: TIKA-1526: initial fix for jvm bug that can affect users with a default Locale of tr running on MACOSX or BSD. We still need to confirm that this fixes the problem and/or add a unit

svn commit: r1659449 - in /tika/trunk: ./ tika-bundle/ tika-parsers/ tika-parsers/src/main/appended-resources/META-INF/ tika-parsers/src/main/resources/META-INF/services/ tika-parsers/src/test/resourc

2015-02-12 Thread tallison
Author: tallison Date: Fri Feb 13 02:03:39 2015 New Revision: 1659449 URL: http://svn.apache.org/r1659449 Log: TIKA-1511 add parser for sqlite3 Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testSqlite3b.db (with props) Modified: tika/trunk/CHANGES.txt tika/trunk

svn commit: r1659446 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pdf/PDFParser.java test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2015-02-12 Thread tallison
Author: tallison Date: Fri Feb 13 01:00:31 2015 New Revision: 1659446 URL: http://svn.apache.org/r1659446 Log: TIKA-1548 improve handling of encrypted pdfs when wrong password is offered Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDFParser.java tika

svn commit: r1661129 - in /tika/trunk: ./ tika-parsers/src/test/java/org/apache/tika/ tika-parsers/src/test/java/org/apache/tika/parser/evil/ tika-parsers/src/test/resources/META-INF/ tika-parsers/src

2015-02-20 Thread tallison
Author: tallison Date: Fri Feb 20 14:16:18 2015 New Revision: 1661129 URL: http://svn.apache.org/r1661129 Log: TIKA-1553: add an EvilParser for testing purposes Added: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/evil/ tika/trunk/tika-parsers/src/test/java/org/apache

svn commit: r1654225 - /tika/trunk/tika-core/src/test/java/org/apache/tika/sax/BasicContentHandlerFactoryTest.java

2015-01-23 Thread tallison
Author: tallison Date: Fri Jan 23 14:36:36 2015 New Revision: 1654225 URL: http://svn.apache.org/r1654225 Log: TIKA-1529: step 1...get rid of toLowerCase in BasicContentHandlerFactoryTest Modified: tika/trunk/tika-core/src/test/java/org/apache/tika/sax/BasicContentHandlerFactoryTest.java

svn commit: r1655431 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/pkg/PackageParser.java test/java/org/apache/tika/parser/pkg/Seven7ParserTest.java

2015-01-28 Thread tallison
Author: tallison Date: Wed Jan 28 18:57:00 2015 New Revision: 1655431 URL: http://svn.apache.org/r1655431 Log: TIKA-1521: follow commons-compress and require installation of jce before testing password on 7z file Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pkg

svn commit: r1655433 - in /tika/trunk: CHANGES.txt tika-parsers/pom.xml

2015-01-28 Thread tallison
Author: tallison Date: Wed Jan 28 19:04:39 2015 New Revision: 1655433 URL: http://svn.apache.org/r1655433 Log: TIKA-1534: Upgrade to Commons Compress 1.9 Modified: tika/trunk/CHANGES.txt tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/CHANGES.txt URL: http://svn.apache.org/viewvc

svn commit: r1659547 - /tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/Database.java

2015-02-13 Thread tallison
Author: tallison Date: Fri Feb 13 12:54:45 2015 New Revision: 1659547 URL: http://svn.apache.org/r1659547 Log: TIKA-1511, third time is the charm...many apologies Added: tika/trunk/tika-core/src/main/java/org/apache/tika/metadata/Database.java Added: tika/trunk/tika-core/src/main/java/org

svn commit: r1659598 - /tika/trunk/tika-parsers/pom.xml

2015-02-13 Thread tallison
Author: tallison Date: Fri Feb 13 16:40:55 2015 New Revision: 1659598 URL: http://svn.apache.org/r1659598 Log: TIKA-1511 try to revert to earlier version of sqlite-jdbc to avoid unsatisfiedlikeerror on ubuntu Modified: tika/trunk/tika-parsers/pom.xml Modified: tika/trunk/tika-parsers

svn commit: r1658947 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/rtf/TextExtractor.java test/java/org/apache/tika/parser/rtf/RTFParserTest.java test/resources/test-documents/te

2015-02-11 Thread tallison
Author: tallison Date: Wed Feb 11 12:59:03 2015 New Revision: 1658947 URL: http://svn.apache.org/r1658947 Log: TIKA-1544 consecutive new lines not preserved in rtf Added: tika/trunk/tika-parsers/src/test/resources/test-documents/testRTFNewlines.rtf Modified: tika/trunk/tika-parsers/src

svn commit: r1659545 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/jdbc/ test/java/org/apache/tika/parser/jdbc/

2015-02-13 Thread tallison
Author: tallison Date: Fri Feb 13 12:43:56 2015 New Revision: 1659545 URL: http://svn.apache.org/r1659545 Log: TIKA-1511, with new files added...doh Added: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/jdbc/ tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser

svn commit: r1650117 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/ocr/TesseractOCRParserTest.java

2015-01-07 Thread tallison
Author: tallison Date: Wed Jan 7 16:48:43 2015 New Revision: 1650117 URL: http://svn.apache.org/r1650117 Log: TIKA-1445: add tests to TesseractOCRParserTest to ensure metadata is extracted Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/ocr

svn commit: r1650111 - /tika/trunk/tika-server/src/test/java/org/apache/tika/server/TikaMimeTypesTest.java

2015-01-07 Thread tallison
Author: tallison Date: Wed Jan 7 16:35:33 2015 New Revision: 1650111 URL: http://svn.apache.org/r1650111 Log: TIKA-1445: need to fix TikaMimeTypesTest in tika-server to accommodate two options for parser Modified: tika/trunk/tika-server/src/test/java/org/apache/tika/server

svn commit: r1665232 - in /tika/trunk: ./ tika-core/src/test/java/org/apache/tika/parser/mock/ tika-parsers/src/test/java/org/apache/tika/parser/mock/ tika-parsers/src/test/resources/test-documents/mo

2015-03-09 Thread tallison
Author: tallison Date: Mon Mar 9 13:40:44 2015 New Revision: 1665232 URL: http://svn.apache.org/r1665232 Log: TIKA-1553, add action types for printing to stdout and stderr Modified: tika/trunk/CHANGES.txt tika/trunk/tika-core/src/test/java/org/apache/tika/parser/mock/MockParser.java

svn commit: r1664641 - /tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java

2015-03-06 Thread tallison
Author: tallison Date: Fri Mar 6 14:50:46 2015 New Revision: 1664641 URL: http://svn.apache.org/r1664641 Log: turn off pdfbox logging in PDFParserTest Modified: tika/trunk/tika-parsers/src/test/java/org/apache/tika/parser/pdf/PDFParserTest.java Modified: tika/trunk/tika-parsers/src/test

svn commit: r1664635 - in /tika/trunk: ./ tika-core/src/test/java/org/apache/tika/parser/ tika-core/src/test/java/org/apache/tika/parser/mock/ tika-parsers/ tika-parsers/src/test/java/org/apache/tika/

2015-03-06 Thread tallison
Author: tallison Date: Fri Mar 6 14:41:07 2015 New Revision: 1664635 URL: http://svn.apache.org/r1664635 Log: TIKA-1553 change EvilParser to MockParser and move to core Added: tika/trunk/tika-core/src/test/java/org/apache/tika/parser/mock/ tika/trunk/tika-core/src/test/java/org/apache

svn commit: r1670090 - in /tika/trunk/tika-parsers/src: main/java/org/apache/tika/parser/microsoft/WordExtractor.java test/java/org/apache/tika/parser/microsoft/WordParserTest.java test/resources/test

2015-03-30 Thread tallison
Author: tallison Date: Mon Mar 30 13:29:11 2015 New Revision: 1670090 URL: http://svn.apache.org/r1670090 Log: TIKA-1512 temporary workaround. Currently not including test docs or tests that derive from govdocs1 Added: tika/trunk/tika-parsers/src/test/resources/test-documents

svn commit: r1670095 - in /tika/trunk/tika-server/src: main/java/org/apache/tika/server/resource/ test/java/org/apache/tika/server/

2015-03-30 Thread tallison
Author: tallison Date: Mon Mar 30 13:57:06 2015 New Revision: 1670095 URL: http://svn.apache.org/r1670095 Log: TIKA-1584: fixed regression in Tika 1.7 that prevents processing of embedded docs with /tika service Modified: tika/trunk/tika-server/src/main/java/org/apache/tika/server/resource

svn commit: r1670185 - in /tika/trunk/tika-batch/src/main: java/org/apache/tika/batch/fs/builders/BasicTikaFSConsumersBuilder.java resources/org/apache/tika/batch/fs/default-tika-batch-config.xml

2015-03-30 Thread tallison
Author: tallison Date: Mon Mar 30 19:43:38 2015 New Revision: 1670185 URL: http://svn.apache.org/r1670185 Log: TIKA-1330, trivial fixes to avoid NPE with consumersManagerMaxMillis parameter Modified: tika/trunk/tika-batch/src/main/java/org/apache/tika/batch/fs/builders

svn commit: r1670238 - in /tika/trunk: tika-app/pom.xml tika-server/pom.xml

2015-03-30 Thread tallison
Author: tallison Date: Tue Mar 31 01:58:04 2015 New Revision: 1670238 URL: http://svn.apache.org/r1670238 Log: TIKA-1423: exclude pdfs and readme.txt files from tika-app and tika-server jars. Anything else we can exclude? Modified: tika/trunk/tika-app/pom.xml tika/trunk/tika-server

svn commit: r1670237 - in /tika/trunk: tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/test/java/org/apache/tika/cli/ tika-batch/src/main/java/org/apache/tika/batch/builders/ tika-batch/src/m

2015-03-30 Thread tallison
Author: tallison Date: Tue Mar 31 01:54:40 2015 New Revision: 1670237 URL: http://svn.apache.org/r1670237 Log: TIKA-1330: add integration tests to TikaCLITest Modified: tika/trunk/tika-app/src/main/java/org/apache/tika/cli/TikaCLI.java tika/trunk/tika-app/src/test/java/org/apache/tika

svn commit: r1663424 - /tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java

2015-03-02 Thread tallison
Author: tallison Date: Mon Mar 2 20:40:35 2015 New Revision: 1663424 URL: http://svn.apache.org/r1663424 Log: TIKA-758 clean up after remembering PDFBOX-1130 Modified: tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/pdf/PDF2XHTML.java Modified: tika/trunk/tika-parsers/src

svn commit: r1663764 - in /tika/trunk: tika-core/src/main/java/org/apache/tika/exception/ tika-core/src/main/java/org/apache/tika/metadata/ tika-parsers/src/main/java/org/apache/tika/parser/pdf/ tika-

2015-03-03 Thread tallison
Author: tallison Date: Tue Mar 3 18:51:41 2015 New Revision: 1663764 URL: http://svn.apache.org/r1663764 Log: TIKA-1489 add optional accessibility checking to PDF files Added: tika/trunk/tika-core/src/main/java/org/apache/tika/exception/AccessPermissionException.java tika/trunk/tika

svn commit: r1661200 [3/3] - in /tika/trunk/tika-server/src: main/java/org/apache/tika/server/ test/java/org/apache/tika/server/

2015-02-20 Thread tallison
Modified: tika/trunk/tika-server/src/test/java/org/apache/tika/server/MetadataResourceTest.java URL: http://svn.apache.org/viewvc/tika/trunk/tika-server/src/test/java/org/apache/tika/server/MetadataResourceTest.java?rev=1661200&r1=1661199&r2=1661200&view=diff

svn commit: r1661193 - in /tika/trunk: ./ tika-server/ tika-server/src/main/java/org/apache/tika/server/ tika-server/src/test/java/org/apache/tika/server/ tika-server/src/test/resources/META-INF/ tika

2015-02-20 Thread tallison
Author: tallison Date: Fri Feb 20 19:11:44 2015 New Revision: 1661193 URL: http://svn.apache.org/r1661193 Log: TIKA-1323: allow tika-server to return stack traces from parse exceptions for easier analysis of parser exceptions via tika-server. Added: tika/trunk/tika-server/src/main/java/org

svn commit: r1661200 [2/3] - in /tika/trunk/tika-server/src: main/java/org/apache/tika/server/ test/java/org/apache/tika/server/

2015-02-20 Thread tallison
Modified: tika/trunk/tika-server/src/main/java/org/apache/tika/server/TikaResource.java URL: http://svn.apache.org/viewvc/tika/trunk/tika-server/src/main/java/org/apache/tika/server/TikaResource.java?rev=1661200&r1=1661199&r2=1661200&view=diff
