scm info in pom.xml

2016-02-06 Thread Ken Krugler
I'm revisiting the creation of a new tika-langdetect module in the 2.x branch, and have created a pom.xml. But in looking at what I started with (from tika-translate), I see this: http://svn.apache.org/viewvc/tika/trunk/tika-langdetect

project.build.sourceEncoding

2016-02-06 Thread Ken Krugler
Hi devs, I ran into an issue where a test file that contained UTF-8 text was being displayed in Eclipse as us-ascii. I had thought that Tika would use UTF-8 everywhere for file encodings, but… Currently the tika-parent/pom.xml has: 1.7 1.7 ${project.build.sourceEncoding}
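
Note (illustrative, not part of the original message): the archive appears to have dropped the XML element names from the pom fragment quoted above, so they are left as-is rather than guessed. As a minimal sketch, assuming the standard Maven property names, a parent pom would typically pin the source encoding as shown here. Whether tika-parent/pom.xml defined it this way at the time is an assumption.

```xml
<!-- Sketch only: standard Maven properties, not copied from tika-parent/pom.xml -->
<properties>
  <!-- Encoding the compiler, resources, and other plugins use to read source files -->
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <!-- Reporting plugins commonly reuse the same value -->
  <project.reporting.outputEncoding>${project.build.sourceEncoding}</project.reporting.outputEncoding>
</properties>
```

When project.build.sourceEncoding is left unset, Maven plugins fall back to the platform default encoding and warn that the build is platform dependent. IDEs that import Maven settings may also read this property, which could explain a file being shown in an unexpected encoding.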

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-06 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136003#comment-15136003 ] Ken Krugler commented on TIKA-1851: --- I got a clean build w/o any pre-installed modules, so much better,

[jira] [Commented] (TIKA-1723) Integrate language-detector into Tika

2016-02-06 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136077#comment-15136077 ] Ken Krugler commented on TIKA-1723: --- OK, I've committed this code to a new tika-langdetect module in the

Tracking 2.x migration changes

2016-02-06 Thread Ken Krugler
Is there a document where we're tracking what (breaking) API changes are occurring in the 2.x branch, and the migration path from 1.x for Tika users? If not, should this be a wiki page that we all edit iteratively? Thanks, -- Ken

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-06 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15135828#comment-15135828 ] Hudson commented on TIKA-1851: -- SUCCESS: Integrated in tika-2.x #23 (See

[jira] [Commented] (TIKA-741) "Zip bomb" (XML nesting) detection is too strict

2016-02-06 Thread Pascal Essiembre (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136128#comment-15136128 ] Pascal Essiembre commented on TIKA-741: --- It looks like maxDepth 100 is not enough. I am using Tika

[jira] [Commented] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-06 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136079#comment-15136079 ] Ken Krugler commented on TIKA-1851: --- After poking around a bit, my vote would be to (a) move the test