[2/3] tika git commit: TIKA-2041, upgrade ICU4j's charset detector to avoid multithreading bug.

2016-07-26 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/9f6c71fa/tika-parser-modules/tika-parser-text-module/src/main/java/org/apache/tika/parser/txt/CharsetRecog_sbcs.java -- diff --git

[3/3] tika git commit: TIKA-2041, upgrade ICU4j's charset detector to avoid multithreading bug.

2016-07-26 Thread tallison
TIKA-2041, upgrade ICU4j's charset detector to avoid multithreading bug. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/9f6c71fa Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/9f6c71fa Diff:

[1/3] tika git commit: TIKA-2041, upgrade ICU4j's charset detector to avoid multithreading bug.

2016-07-26 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x f89887d2f -> 9f6c71fa6 http://git-wip-us.apache.org/repos/asf/tika/blob/9f6c71fa/tika-parser-modules/tika-parser-web-module/src/test/java/org/apache/tika/parser/html/HtmlParserTest.java

tika git commit: TIKA-2040 - prevent permanent hang/oom on corrupt chm file

2016-07-26 Thread tallison
Repository: tika Updated Branches: refs/heads/master f5b04b60c -> 71cb9363c TIKA-2040 - prevent permanent hang/oom on corrupt chm file Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/71cb9363 Tree:

tika git commit: TIKA-2040 - prevent permanent hang/oom on corrupt chm file

2016-07-26 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 9f6c71fa6 -> 1c582aba6 TIKA-2040 - prevent permanent hang/oom on corrupt chm file Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/1c582aba Tree:

tika git commit: TIKA-2029 -- upgrade jackcess to 2.1.4

2016-07-22 Thread tallison
Repository: tika Updated Branches: refs/heads/master a383567c2 -> f00ab040d TIKA-2029 -- upgrade jackcess to 2.1.4 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/f00ab040 Tree:

tika git commit: TIKA-2039 upgrade to jackcess 2.1.4

2016-07-22 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x f4bacf859 -> 8b951a43c TIKA-2039 upgrade to jackcess 2.1.4 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/8b951a43 Tree:

tika git commit: clean up MatParser

2016-08-11 Thread tallison
Repository: tika Updated Branches: refs/heads/master 85e538500 -> 8a68b5d47 clean up MatParser Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/8a68b5d4 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/8a68b5d4 Diff:

tika git commit: cleanup MatParser

2016-08-11 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x fc7c372f5 -> 6ebbd7ef7 cleanup MatParser Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/6ebbd7ef Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/6ebbd7ef Diff:

tika git commit: TIKA-2041 - add important diffs between new copy/paste from ICU4J and legacy code which may have included Tika-specific mods.

2016-08-11 Thread tallison
Repository: tika Updated Branches: refs/heads/master 8a68b5d47 -> bd9a9b911 TIKA-2041 - add important diffs between new copy/paste from ICU4J and legacy code which may have included Tika-specific mods. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

tika git commit: TIKA-2041 - add important diffs between new copy/paste from ICU4J and legacy code which may have included Tika-specific mods.

2016-08-11 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 6ebbd7ef7 -> b41c0b2a8 TIKA-2041 - add important diffs between new copy/paste from ICU4J and legacy code which may have included Tika-specific mods. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

[1/2] tika git commit: fix for TIKA-1980 contributed by naegelejd

2016-08-12 Thread tallison
Repository: tika Updated Branches: refs/heads/master bd9a9b911 -> 5495ffcf3 fix for TIKA-1980 contributed by naegelejd Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/9024f12e Tree:

[2/2] tika git commit: TIKA-1980 - via Joseph Naegele. This closes #121

2016-08-12 Thread tallison
TIKA-1980 - via Joseph Naegele. This closes #121 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/5495ffcf Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/5495ffcf Diff:

[3/3] tika git commit: TIKA-1980 via Joseph Naegele

2016-08-12 Thread tallison
TIKA-1980 via Joseph Naegele Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/27bc383e Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/27bc383e Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/27bc383e Branch:

[2/3] tika git commit: TIKA-1938 via Joseph Naegele

2016-08-12 Thread tallison
TIKA-1938 via Joseph Naegele Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/09bd22fb Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/09bd22fb Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/09bd22fb Branch:

[1/3] tika git commit: TIKA-1938 via Joseph Naegele

2016-08-12 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x b41c0b2a8 -> 27bc383eb TIKA-1938 via Joseph Naegele Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/5358bf1e Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/5358bf1e

[2/2] tika git commit: TIKA-2024 -- extract original file name/location, initial patch: rtf, applefile, word2003, word, pdf

2016-06-28 Thread tallison
t(33, new Pair("file_5.pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation")); +expected.put(32, new Pair("thumbnail.jpeg", "image/jpeg")); +expected.put(36, new Pair("file_6.doc", "application/msword")); +expe

[1/2] tika git commit: TIKA-2022 -- clean up AppleSingleFileParser to use EndianUtils, shorten test file, make field types private

2016-06-28 Thread tallison
Repository: tika Updated Branches: refs/heads/master 7db0ab628 -> b6d55ae34 TIKA-2022 -- clean up AppleSingleFileParser to use EndianUtils, shorten test file, make field types private Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

tika git commit: TIKA-2024 -- remove debugging test

2016-06-28 Thread tallison
s RTFParserTest extends TikaTest { assertEquals("C:\\Users\\tallison\\AppData\\Local\\Temp\\testJPEG_普林斯顿.jpg", metadataList.get(45).get(TikaCoreProperties.ORIGINAL_RESOURCE_NAME)); } - -@Test -public void oneOff() throws Exception { -

[2/2] tika git commit: TIKA-2022 -- add applefile parser

2016-06-27 Thread tallison
TIKA-2022 -- add applefile parser Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/0f3b0bdb Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/0f3b0bdb Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/0f3b0bdb Branch:

tika git commit: TIKA-2026 -- improve extraction of embedded files from ppt, pptx and xlsx

2016-06-28 Thread tallison
Repository: tika Updated Branches: refs/heads/master 69d825005 -> 7cc610e1b TIKA-2026 -- improve extraction of embedded files from ppt, pptx and xlsx Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/7cc610e1 Tree:

tika git commit: TIKA-2026 --fix caps on test files

2016-06-28 Thread tallison
Repository: tika Updated Branches: refs/heads/master 7cc610e1b -> 52f04bea6 TIKA-2026 --fix caps on test files Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/52f04bea Tree:

[5/5] tika git commit: TIKA-2026 -- improve extraction of attachments for PPT, PPTX, XLSX

2016-06-28 Thread tallison
TIKA-2026 -- improve extraction of attachments for PPT, PPTX, XLSX Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/dd3c2a48 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/dd3c2a48 Diff:

[1/5] tika git commit: fix indentation

2016-06-28 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 5bc597dc8 -> dd3c2a486 fix indentation Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/865c45cd Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/865c45cd Diff:

[2/5] tika git commit: TIKA-2022 - clean up -- make entries private, move more into EndianUtils

2016-06-28 Thread tallison
TIKA-2022 - clean up -- make entries private, move more into EndianUtils Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/c84855f6 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/c84855f6 Diff:

[4/5] tika git commit: rm inconsistently capitalized test files

2016-06-28 Thread tallison
rm inconsistently capitalized test files Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/933af20e Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/933af20e Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/933af20e

[2/2] tika git commit: Merge remote-tracking branch 'origin/2.x' into 2.x

2016-07-08 Thread tallison
Merge remote-tracking branch 'origin/2.x' into 2.x Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/cdfacdb4 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/cdfacdb4 Diff:

[1/2] tika git commit: TIKA-2030 - add handling for element to ODT parser. Thanks to David Pilato for opening this issue.

2016-07-08 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 573527bbc -> cdfacdb41 TIKA-2030 - add handling for element to ODT parser. Thanks to David Pilato for opening this issue. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

[1/2] tika git commit: TIKA-2030 - add processing for element in odt, thanks to David Pilato for identifying this.

2016-07-08 Thread tallison
Repository: tika Updated Branches: refs/heads/master 636060eb6 -> 8d29f7a62 TIKA-2030 - add processing for element in odt, thanks to David Pilato for identifying this. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

[2/2] tika git commit: Merge remote-tracking branch 'origin/master'

2016-07-08 Thread tallison
Merge remote-tracking branch 'origin/master' Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/8d29f7a6 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/8d29f7a6 Diff:

tika git commit: TIKA-2030 -- fix test document so that it is correctly detected.

2016-07-08 Thread tallison
Repository: tika Updated Branches: refs/heads/master 8d29f7a62 -> ff187a0f4 TIKA-2030 -- fix test document so that it is correctly detected. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/ff187a0f Tree:

tika git commit: TIKA-2030 - fix test file so that it is correctly detected

2016-07-08 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x cdfacdb41 -> e27526b84 TIKA-2030 - fix test file so that it is correctly detected Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/e27526b8 Tree:

tika git commit: TIKA-2029: add some content for links so that we don't generate bad html

2016-07-06 Thread tallison
Repository: tika Updated Branches: refs/heads/master 23a11eff3 -> 95b2cd127 TIKA-2029: add some content for links so that we don't generate bad html http://tika.apache.org/"/> Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

tika git commit: TIKA-2048

2016-08-05 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 1c582aba6 -> fc7c372f5 TIKA-2048 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/fc7c372f Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/fc7c372f Diff:

tika git commit: TIKA-2048

2016-08-05 Thread tallison
Repository: tika Updated Branches: refs/heads/master 71cb9363c -> 85e538500 TIKA-2048 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/85e53850 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/85e53850 Diff:

tika git commit: TIKA-2025 -- override general format in excel to extract 15 digit integers

2016-07-22 Thread tallison
osoft/ooxml/OOXMLParserTest.java b/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java index bc2b0ae..8625fa3 100644 --- a/tika-parsers/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java +++ b/tika-parsers/src/test/java/org/apache/tika/parser/

tika git commit: TIKA-2025 increase number of significant digits extracted in "general" format in xls/xlsx

2016-07-22 Thread tallison
src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java +++ b/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParserTest.java @@ -1236,6 +1236,16 @@ public class OOXMLParserTest extends TikaTest { assertContains("C:\\Use

tika git commit: TIKA-2024 add location extraction for OLE1.0 embedded files

2016-06-29 Thread tallison
c void testOrigSourcePath() throws Exception { +Metadata embed1_zip_metadata = getRecursiveJson("test_recursive_embedded.doc").get(11); +assertContains("C:\\Users\\tallison\\AppData\\Local\\Temp\\embed1.zip", + Arrays.asList(embed1_zip_metadata.getVa

tika git commit: TIKA-2024 extract original path name from OLE1.0 embedded objects

2016-06-29 Thread tallison
alues); assertContains("Hard Drive:Course Folders:276:276-s00:07-Force-on-a-current-S00", values); } + +@Test +public void testOrigSourcePath() throws Exception { +Metadata embed1_zip_metadata = getRecursiveJson("test_recursive_embedded.doc"

[31/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/mp3/Mp3Parser.java -- diff --git

[19/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/chm/TestChmBlockInfo.java -- diff --git

[03/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-web-module/src/main/java/org/apache/tika/parser/html/IdentityHtmlMapper.java -- diff --git

[20/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/rtf/TextExtractor.java -- diff --git

[25/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/core/ChmConstants.java -- diff --git

[33/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/image/ImageParser.java -- diff --git

[30/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/image/ImageParserTest.java -- diff --git

[10/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-package-module/src/test/java/org/apache/tika/parser/pkg/ZipParserTest.java -- diff --git

[27/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/accessor/ChmItspHeader.java -- diff --git

[01/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x dd3c2a486 -> c7a6bcac4 http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-web-module/src/test/java/org/apache/tika/parser/mail/RFC822ParserTest.java

[35/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-code-module/src/test/java/org/apache/tika/parser/asm/ClassParserTest.java -- diff --git

[38/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-core/src/test/java/org/apache/tika/utils/ConcurrentUtilsTest.java -- diff --git a/tika-core/src/test/java/org/apache/tika/utils/ConcurrentUtilsTest.java

[14/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
ext/html; charset=UTF-8")); -expected.put(17, new Pair("testJPEG_\u666E\u6797\u65AF\u987F.jpg", "image/jpeg")); -expected.put(20, new Pair("file_2.xls", "application/vnd.ms-excel")); - expected.put(23, new Pair("testMSG_\u66

[24/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/lzx/ChmLzxBlock.java -- diff --git

[22/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/odf/OpenDocumentContentParser.java -- diff --git

[12/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-package-module/src/main/java/org/apache/tika/parser/pkg/PackageParser.java -- diff --git

[15/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/WordParserTest.java -- diff --git

[28/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/accessor/ChmDirectoryListingSet.java -- diff --git

[21/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/opc/OPCDetector.java -- diff --git

[09/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-text-module/src/main/java/org/apache/tika/parser/txt/CharsetDetector.java -- diff --git

[26/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/accessor/ChmPmgiHeader.java -- diff --git

[36/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-advanced-module/src/main/java/org/apache/tika/module/advanced/internal/Activator.java -- diff --git

[18/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/chm/TestDirectoryListingEntry.java -- diff --git

[39/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
Convert new lines from windows to unix Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/c7a6bcac Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/c7a6bcac Diff: http://git-wip-us.apache.org/repos/asf/tika/diff/c7a6bcac

[23/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/main/java/org/apache/tika/parser/chm/lzx/ChmLzxState.java -- diff --git

[11/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-package-module/src/test/java/org/apache/tika/parser/iwork/IWorkParserTest.java -- diff --git

[13/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-package-module/src/main/java/org/apache/tika/parser/iwork/KeynoteContentHandler.java -- diff --git

[06/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-text-module/src/main/java/org/apache/tika/parser/txt/CharsetRecognizer.java -- diff --git

[37/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-bundles/tika-parser-office-bundle/pom.xml -- diff --git a/tika-parser-bundles/tika-parser-office-bundle/pom.xml

[04/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-web-module/src/main/java/org/apache/tika/parser/html/BoilerpipeContentHandler.java -- diff --git

[29/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-multimedia-module/src/test/java/org/apache/tika/parser/mp3/Mp3ParserTest.java -- diff --git

[17/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/ExcelParserTest.java -- diff --git

[32/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-multimedia-module/src/main/java/org/apache/tika/parser/mp3/ID3v22Handler.java -- diff --git

[07/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-text-module/src/main/java/org/apache/tika/parser/txt/CharsetRecog_sbcs.java -- diff --git

[16/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-office-module/src/test/java/org/apache/tika/parser/microsoft/POIContainerExtractionTest.java -- diff --git

[08/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-text-module/src/main/java/org/apache/tika/parser/txt/CharsetRecog_Unicode.java -- diff --git

[05/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-text-module/src/test/java/org/apache/tika/parser/txt/TXTParserTest.java -- diff --git

[02/39] tika git commit: Convert new lines from windows to unix

2016-06-29 Thread tallison
http://git-wip-us.apache.org/repos/asf/tika/blob/c7a6bcac/tika-parser-modules/tika-parser-web-module/src/test/java/org/apache/tika/parser/html/HtmlParserTest.java -- diff --git

tika git commit: fix getRecursiveJson -> getRecursiveMetadata in TikaTest, no json is involved here...

2016-06-29 Thread tallison
s TikaTest { @Test public void testOrigSourcePath() throws Exception { -Metadata embed1_zip_metadata = getRecursiveJson("test_recursive_embedded.doc").get(11); +Metadata embed1_zip_metadata = getRecursiveMetadata("test_recursive_embedded.doc").get(11);

tika git commit: fix getRecursiveJson -> getRecursiveMetadata in TikaTest, no json is involved here...

2016-06-29 Thread tallison
estOrigSourcePath() throws Exception { -Metadata embed1_zip_metadata = getRecursiveJson("test_recursive_embedded.doc").get(11); +Metadata embed1_zip_metadata = getRecursiveMetadata("test_recursive_embedded.doc").get(11); assertContains("C:\\Users\\tall

tika git commit: TIKA 2259 -- improve url extraction from PDFs = copy Tilman Hausherr's code from PDFBOX 3644

2017-02-02 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 2d4889f44 -> 7b0655cc1 TIKA 2259 -- improve url extraction from PDFs = copy Tilman Hausherr's code from PDFBOX 3644 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

tika git commit: TIKA-2259 -- improve url extraction from PDFs = copy Tilman Hausherr's code from PDFBOX-3644

2017-02-02 Thread tallison
Repository: tika Updated Branches: refs/heads/master da8363fe6 -> 7555b136d TIKA-2259 -- improve url extraction from PDFs = copy Tilman Hausherr's code from PDFBOX-3644 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

tika git commit: TIKA 2025 -- fix xls/x testBigIntegersWGeneralFormat to work in multiple locales

2017-02-02 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 28010d90d -> 2d4889f44 TIKA 2025 -- fix xls/x testBigIntegersWGeneralFormat to work in multiple locales Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/2d4889f4 Tree:

[1/2] tika git commit: TIKA-2025 -- fix xls/x testBigIntegersWGeneralFormat to work in multiple locales. This closes #151

2017-02-02 Thread tallison
Repository: tika Updated Branches: refs/heads/master 73a37a4c2 -> da8363fe6 TIKA-2025 -- fix xls/x testBigIntegersWGeneralFormat to work in multiple locales. This closes #151 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

[2/2] tika git commit: Merge remote-tracking branch 'origin/master'

2017-02-02 Thread tallison
Merge remote-tracking branch 'origin/master' Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/da8363fe Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/da8363fe Diff:

tika git commit: TIKA-2181 - upgrade to POI 3.16.beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/master 7555b136d -> 0d54f07fa TIKA-2181 - upgrade to POI 3.16.beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/0d54f07f Tree:

tika git commit: TIKA-2181 upgrade to POI 3 16 beta2, make sure to upgrade overall bundle

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x cf3996ed0 -> 27e81b97a TIKA-2181 upgrade to POI 3 16 beta2, make sure to upgrade overall bundle Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/27e81b97 Tree:

tika git commit: TIKA-2198 - add null check to Tika after upgrade to POI 3.16-beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 27e81b97a -> 0d7f5bad0 TIKA-2198 - add null check to Tika after upgrade to POI 3.16-beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/0d7f5bad Tree:

tika git commit: TIKA-2134 -- remove npe catch after upgrade to POI 3.16.beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/master bc3b26369 -> 27e026eff TIKA-2134 -- remove npe catch after upgrade to POI 3.16.beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/27e026ef Tree:

tika git commit: TIKA-2134 - remove npe catch after upgrade to POI 3.16.beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 0d7f5bad0 -> d9f376c12 TIKA-2134 - remove npe catch after upgrade to POI 3.16.beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/d9f376c1 Tree:

tika git commit: TIKA 2181 upgrade to POI 3 16 beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 7b0655cc1 -> cf3996ed0 TIKA 2181 upgrade to POI 3 16 beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/cf3996ed Tree:

tika git commit: TIKA-2198 - add null check to Tika after upgrade to POI 3.16.beta2

2017-02-06 Thread tallison
Repository: tika Updated Branches: refs/heads/master 0d54f07fa -> bc3b26369 TIKA-2198 - add null check to Tika after upgrade to POI 3.16.beta2 Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/bc3b2636 Tree:

tika git commit: TIKA-2247 and TIKA-2246 -- add parsers for EMF/WMF

2017-02-06 Thread tallison
cation/vnd.ms-outlook")); +expected.put(27, new Pair("file_3.pdf", "application/pdf")); +expected.put(30, new Pair("file_4.ppt", "application/vnd.ms-powerpoint")); +expected.put(34, new Pair("file_5.pptx", "application/vnd.o

tika git commit: TIKA-2246 and TIKA-2247 -add parsers for EMF and WMF

2017-02-06 Thread tallison
")); +expected.put(30, new Pair("file_4.ppt", "application/vnd.ms-powerpoint")); +expected.put(34, new Pair("file_5.pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"));

tika git commit: TIKA-2249 -- update javadocs to alert devs that tables are not "maintained" by the PDFParser

2017-01-24 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 4599374d6 -> 235c2adab TIKA-2249 -- update javadocs to alert devs that tables are not "maintained" by the PDFParser Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/235c2ada

tika git commit: TIKA-2249 -- update javadocs to alert devs that tables are not "maintained" by the PDFParser

2017-01-24 Thread tallison
Repository: tika Updated Branches: refs/heads/master 7afcfc702 -> fe94908c0 TIKA-2249 -- update javadocs to alert devs that tables are not "maintained" by the PDFParser Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:

[2/2] tika git commit: Merge remote-tracking branch 'origin/master'

2017-01-24 Thread tallison
Merge remote-tracking branch 'origin/master' Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/7afcfc70 Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/7afcfc70 Diff:

[1/2] tika git commit: TIKA 2244 -- be more parsimonious with BufferedInputStream. AutoDetectReader

2017-01-24 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x bd667acde -> 4599374d6 TIKA 2244 -- be more parsimonious with BufferedInputStream. AutoDetectReader Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/985c1aef Tree:

[2/2] tika git commit: Merge remote-tracking branch 'origin/2.x' into 2.x

2017-01-24 Thread tallison
Merge remote-tracking branch 'origin/2.x' into 2.x Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/4599374d Tree: http://git-wip-us.apache.org/repos/asf/tika/tree/4599374d Diff:

tika git commit: TIKA-2251 improve exception handling in SAX pptx/docx parsers

2017-01-26 Thread tallison
Repository: tika Updated Branches: refs/heads/2.x 235c2adab -> 3df8ce8b2 TIKA-2251 improve exception handling in SAX pptx/docx parsers Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit: http://git-wip-us.apache.org/repos/asf/tika/commit/3df8ce8b Tree:

tika git commit: TIKA-2251 -- make catch blocks as small as possible and improve "logging" with malformed files in new experimental SAX docx/pptx parsers.

2017-01-26 Thread tallison
Repository: tika Updated Branches: refs/heads/master fe94908c0 -> 280ab87e8 TIKA-2251 -- make catch blocks as small as possible and improve "logging" with malformed files in new experimental SAX docx/pptx parsers. Project: http://git-wip-us.apache.org/repos/asf/tika/repo Commit:
