[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130376#comment-15130376 ] Nick Burch commented on TIKA-1848: -- I'm not sure if our test files should have license headers in them,

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-03 Thread Ray Gauss II (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130386#comment-15130386 ] Ray Gauss II commented on TIKA-1824: bq. Thank you, Bob Paulin! Again, this is fantastic. Indeed,

[jira] [Commented] (TIKA-1846) Set up Hudson (or similar?) with new Git repo

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130321#comment-15130321 ] Tim Allison commented on TIKA-1846: --- Thank you! > Set up Hudson (or similar?) with new Git repo >

[jira] [Commented] (TIKA-1723) Integrate language-detector into Tika

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130360#comment-15130360 ] Tim Allison commented on TIKA-1723: --- Come on over to the 2.x branch, the water is fine. :) Plenty of

[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130400#comment-15130400 ] Tim Allison commented on TIKA-1848: --- I tested adding headers, and they don't break our tests with the

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130563#comment-15130563 ] Boris Slobodin commented on TIKA-1850: -- Note, this may be related to

[jira] [Resolved] (TIKA-1821) Problem in Tika().detect for xml file signed in CADES

2016-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1821. -- Resolution: Fixed Fix Version/s: 1.13 Thanks for these, I've used them to add unit tests which

[jira] [Commented] (TIKA-1821) Problem in Tika().detect for xml file signed in CADES

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130509#comment-15130509 ] Hudson commented on TIKA-1821: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #904 (See

[jira] [Created] (TIKA-1849) RTF Exception

2016-02-03 Thread Andrea (JIRA)
Andrea created TIKA-1849: Summary: RTF Exception Key: TIKA-1849 URL: https://issues.apache.org/jira/browse/TIKA-1849 Project: Tika Issue Type: Bug Components: parser Affects Versions:

[jira] [Created] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
Boris Slobodin created TIKA-1850: Summary: Tika erroneously detects some versions of jQuery as "text/html" Key: TIKA-1850 URL: https://issues.apache.org/jira/browse/TIKA-1850 Project: Tika

[jira] [Assigned] (TIKA-1847) Clean up parser version parameters in 2.x

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-1847: - Assignee: Tim Allison > Clean up parser version parameters in 2.x >

[jira] [Created] (TIKA-1851) Tika 2.0 - Move test resources from core to test-resources

2016-02-03 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1851: - Summary: Tika 2.0 - Move test resources from core to test-resources Key: TIKA-1851 URL: https://issues.apache.org/jira/browse/TIKA-1851 Project: Tika Issue Type:

[jira] [Comment Edited] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131500#comment-15131500 ] Tim Allison edited comment on TIKA-1824 at 2/4/16 1:21 AM: --- bq. Perhaps add

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131500#comment-15131500 ] Tim Allison commented on TIKA-1824: --- bq. Perhaps add "parser(s?)" to the artifactId Y, sorry,

[jira] [Commented] (TIKA-1847) Tika 2.0 - Clean up tika-parsers pom dependencies and a few other things

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131535#comment-15131535 ] Hudson commented on TIKA-1847: -- UNSTABLE: Integrated in tika-2.x #17 (See

[jira] [Updated] (TIKA-1847) Tika 2.0 - Clean up tika-parsers pom dependencies and a few other things

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1847: -- Summary: Tika 2.0 - Clean up tika-parsers pom dependencies and a few other things (was: Clean up parser

[jira] [Resolved] (TIKA-1847) Tika 2.0 - Clean up tika-parsers pom dependencies and a few other things

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1847. --- Resolution: Fixed > Tika 2.0 - Clean up tika-parsers pom dependencies and a few other things >

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-03 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131650#comment-15131650 ] Bob Paulin commented on TIKA-1824: -- So before we go that way let me explain what about your previous

[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130667#comment-15130667 ] Lewis John McGibbney commented on TIKA-1848: Ack Ken -- *Lewis* > Address issues with

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130678#comment-15130678 ] Nick Burch commented on TIKA-1850: -- Looks like a duplicate to me, are you happy to close it as such? >

[jira] [Commented] (TIKA-1849) RTF Exception

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130677#comment-15130677 ] Hudson commented on TIKA-1849: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #905 (See

[jira] [Resolved] (TIKA-1845) Unable to extract content from certain RTFs using tika-server versions since 1.5

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1845. --- Resolution: Fixed in both trunk and 2.x Thank you for raising this and supplying a test file! >

[jira] [Commented] (TIKA-1845) Unable to extract content from certain RTFs using tika-server versions since 1.5

2016-02-03 Thread Ian Williams (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130686#comment-15130686 ] Ian Williams commented on TIKA-1845: I am out of the office until Thu 04 Feb 2016. Regards Ian >

[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130582#comment-15130582 ] Lewis John McGibbney commented on TIKA-1848: Hi Folks, I am +1 to this being closed then as

[jira] [Commented] (TIKA-1723) Integrate language-detector into Tika

2016-02-03 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130676#comment-15130676 ] Ken Krugler commented on TIKA-1723: --- [~talli...@apache.org] I must admit, focusing on this change in 2.0,

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Lauri Lehmijoki (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131798#comment-15131798 ] Lauri Lehmijoki commented on TIKA-1850: --- Hi Nick, I was unable to find the nightly build in a Maven

[jira] [Commented] (TIKA-1824) Tika 2.0 - Create Initial Parser Modules

2016-02-03 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131749#comment-15131749 ] Ken Krugler commented on TIKA-1824: --- As someone who regularly deals with 100s of jars in the dependency

[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130325#comment-15130325 ] Tim Allison commented on TIKA-1848: --- So, um, I'll try to fix these in trunk. Do we need an rc2 where

[jira] [Commented] (TIKA-1845) Unable to extract content from certain RTFs using tika-server versions since 1.5

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130766#comment-15130766 ] Hudson commented on TIKA-1845: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #906 (See

[jira] [Commented] (TIKA-1848) Address issues with Tika 1.12rc#1

2016-02-03 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130666#comment-15130666 ] Ken Krugler commented on TIKA-1848: --- Unless I'm not understanding the issues properly, I agree with the

[jira] [Commented] (TIKA-1141) javascript files that contain "<html" are detected as text/html

2016-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130723#comment-15130723 ] Nick Burch commented on TIKA-1141: -- I've tweaked the mime magic for HTML, so we give javascript files

[jira] [Commented] (TIKA-1845) Unable to extract content from certain RTFs using tika-server versions since 1.5

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130734#comment-15130734 ] Hudson commented on TIKA-1845: -- UNSTABLE: Integrated in tika-2.x #15 (See

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130846#comment-15130846 ] Boris Slobodin commented on TIKA-1850: -- Nick, that issue is almost 3 years old, was hoping this may

[jira] [Issue Comment Deleted] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Slobodin updated TIKA-1850: - Comment: was deleted (was: Nick, also, according to your comment

[jira] [Commented] (TIKA-1141) javascript files that contain "<html" are detected as text/html

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130853#comment-15130853 ] Boris Slobodin commented on TIKA-1141: -- This would be a great workaround. > javascript files that

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130850#comment-15130850 ] Boris Slobodin commented on TIKA-1850: -- Nick, also, according to your comment

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Boris Slobodin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130857#comment-15130857 ] Boris Slobodin commented on TIKA-1850: -- I see your recent comment on the other issue from an hour ago.

[jira] [Commented] (TIKA-1850) Tika erroneously detects some versions of jQuery as "text/html"

2016-02-03 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15130860#comment-15130860 ] Nick Burch commented on TIKA-1850: -- Please grab a nightly build / build from git, and check - the test

[jira] [Commented] (TIKA-1849) RTF Exception

2016-02-03 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131024#comment-15131024 ] Hudson commented on TIKA-1849: -- UNSTABLE: Integrated in tika-2.x #16 (See

[jira] [Commented] (TIKA-1849) RTF Exception

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131036#comment-15131036 ] Tim Allison commented on TIKA-1849: --- I'm not able to reproduce this in our test suite. To confirm, this

[jira] [Comment Edited] (TIKA-1849) RTF Exception

2016-02-03 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131036#comment-15131036 ] Tim Allison edited comment on TIKA-1849 at 2/3/16 8:06 PM: --- I'm not able to