[jira] [Resolved] (TIKA-1363) .mat files not parsing

2014-07-15 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris A. Mattmann resolved TIKA-1363. - Resolution: Fixed Fix Version/s: 1.6 Assignee: Chris A. Mattmann - great

[jira] [Commented] (TIKA-1363) .mat files not parsing

2014-07-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061805#comment-14061805 ] Hudson commented on TIKA-1363: -- SUCCESS: Integrated in tika-trunk-jdk1.7 #94 (See

[jira] [Commented] (TIKA-1363) .mat files not parsing

2014-07-15 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061817#comment-14061817 ] Hudson commented on TIKA-1363: -- SUCCESS: Integrated in tika-trunk-jdk1.6 #93 (See

[jira] [Commented] (TIKA-1095) Only gibberish extracted from this PDF

2014-07-15 Thread Stefan Postema (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061818#comment-14061818 ] Stefan Postema commented on TIKA-1095: -- I'm having the same problem. The file is also

[jira] [Commented] (TIKA-1095) Only gibberish extracted from this PDF

2014-07-15 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061867#comment-14061867 ] Hong-Thai Nguyen commented on TIKA-1095: Even with latest Tika can't convert this

[jira] [Updated] (TIKA-1095) Only gibberish extracted from this PDF

2014-07-15 Thread Hong-Thai Nguyen (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong-Thai Nguyen updated TIKA-1095: --- Component/s: (was: general) parser Only gibberish extracted from this

[jira] [Updated] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Beryozkin updated TIKA-1367: --- Description: tika-parsers module has many strong transitive parser dependencies. Maven users

[jira] [Updated] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Beryozkin updated TIKA-1367: --- Description: tika-parsers module has many strong transitive parser dependencies. Maven users

[jira] [Created] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-07-15 Thread Sergey Beryozkin (JIRA)
Sergey Beryozkin created TIKA-1367: -- Summary: Tika documentation should list tika-parsers parser dependencies Key: TIKA-1367 URL: https://issues.apache.org/jira/browse/TIKA-1367 Project: Tika

[jira] [Created] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
Sergey Beryozkin created TIKA-1368: -- Summary: Improve the modularity of tika-parsers Key: TIKA-1368 URL: https://issues.apache.org/jira/browse/TIKA-1368 Project: Tika Issue Type:

[jira] [Commented] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061893#comment-14061893 ] Sergey Beryozkin commented on TIKA-1368: The documentation needs to be in place

[jira] [Updated] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Beryozkin updated TIKA-1368: --- Affects Version/s: 1.7 Improve the modularity of tika-parsers

Re: Can some of tika-parsers module dependencies be made optional ?

2014-07-15 Thread Sergey Beryozkin
Hi All, I've opened 2 JIRA issues, see [1] and [2]. [1] is about documenting the 3rd party transitive tika-parser dependencies to help Maven users exclude the libs not required in a given project. Help on resolving [1] from true Tika experts like Nick and others would be appreciated :-).

Re: Can some of tika-parsers module dependencies be made optional ?

2014-07-15 Thread Ray Gauss
I’m not sure the third option is much more work up front than pulling apart the transitive dependencies for documentation purposes, though it is more sensitive as you say. Just to confirm, with any of the other solutions we would need to manually document not just immediate dependencies but

Re: Can some of tika-parsers module dependencies be made optional ?

2014-07-15 Thread Sergey Beryozkin
Hi, On 15/07/14 12:34, Ray Gauss wrote: I’m not sure the third option is much more work up front than pulling apart the transitive dependencies for documentation purposes, though it is more sensitive as you say. As far as I understand the 3rd option would require introducing many micro

[jira] [Commented] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Ken Krugler (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062024#comment-14062024 ] Ken Krugler commented on TIKA-1368: --- While I also wish there was a better way to get only

[jira] [Comment Edited] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098 ] Sergey Beryozkin edited comment on TIKA-1368 at 7/15/14 2:14 PM:

[jira] [Commented] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098 ] Sergey Beryozkin commented on TIKA-1368: #2 is for those users who know what they

[jira] [Comment Edited] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098 ] Sergey Beryozkin edited comment on TIKA-1368 at 7/15/14 2:15 PM:

[jira] [Comment Edited] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098 ] Sergey Beryozkin edited comment on TIKA-1368 at 7/15/14 2:14 PM:

[jira] [Comment Edited] (TIKA-1368) Improve the modularity of tika-parsers

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062098#comment-14062098 ] Sergey Beryozkin edited comment on TIKA-1368 at 7/15/14 2:14 PM:

Re: tika-trunk-jdk1.7 - Build # 75 - Failure

2014-07-15 Thread Gregory Kanevsky
unsubscribe On Tue, Jul 1, 2014 at 3:13 AM, Apache Jenkins Server jenk...@builds.apache.org wrote: The Apache Jenkins build system has built tika-trunk-jdk1.7 (build #75) Status: Failure Check console output at https://builds.apache.org/job/tika-trunk-jdk1.7/75/ to view the results.

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-15 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062180#comment-14062180 ] Tyler Palsulich commented on TIKA-1365: --- Thanks! It looks like the html is malformed,

[jira] [Commented] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-07-15 Thread Tyler Palsulich (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062188#comment-14062188 ] Tyler Palsulich commented on TIKA-1367: --- I think that letting users know just how big

[jira] [Created] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-07-15 Thread John Gibson (JIRA)
John Gibson created TIKA-1369: - Summary: Date parsing and thread safety in ImageMetadataExtractor Key: TIKA-1369 URL: https://issues.apache.org/jira/browse/TIKA-1369 Project: Tika Issue Type:

[jira] [Commented] (TIKA-1369) Date parsing and thread safety in ImageMetadataExtractor

2014-07-15 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062288#comment-14062288 ] Nick Burch commented on TIKA-1369: -- General comment - adding more dependencies to the Tika

Re: [jira] [Commented] (TIKA-1363) .mat files not parsing

2014-07-15 Thread Annie Burgess
I pulled the new trunk and looks like Tika is now successfully parsing Matlab .mat files at the command line and in the GUI. Thanks all for your help on this new parser! On Tue, Jul 15, 2014 at 12:05 AM, Hudson (JIRA) j...@apache.org wrote: [

Re: [jira] [Commented] (TIKA-1363) .mat files not parsing

2014-07-15 Thread Mattmann, Chris A (3980)
Great work, Annie! -Original Message- From: Annie Burgess anniebry...@gmail.com Reply-To: dev@tika.apache.org dev@tika.apache.org, anniebryant.burg...@gmail.com anniebryant.burg...@gmail.com Date: Tuesday, July 15, 2014 10:36 AM To: dev@tika.apache.org dev@tika.apache.org Subject: Re:

[jira] [Commented] (TIKA-1367) Tika documentation should list tika-parsers parser dependencies

2014-07-15 Thread Sergey Beryozkin (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062606#comment-14062606 ] Sergey Beryozkin commented on TIKA-1367: Thanks for the proposal, I'm not sure

[jira] [Commented] (TIKA-1365) Incorrectly MimeType detection for Apache Lucene web site

2014-07-15 Thread Tien Nguyen Manh (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062979#comment-14062979 ] Tien Nguyen Manh commented on TIKA-1365: I think XMLParser throws that exception is