Re: [ANNOUNCE] Apache Tika 1.5 Released

2014-02-22 Thread Mattmann, Chris A (3980)
woot, hear hear! ++ Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 171-283, Mailstop: 171-246 Email:

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909414#comment-13909414 ] Nick Burch commented on TIKA-1243: -- Sadly it's not that simple - I've raised COMPRESS-267

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909417#comment-13909417 ] Nick Burch commented on TIKA-1243: -- In r1570857 I've added a disabled unit test for this,

buildbot failure in ASF Buildbot on tika-trunk

2014-02-22 Thread buildbot
The Buildbot has detected a new failure on builder tika-trunk while building ASF Buildbot. Full details are available at: http://ci.apache.org/builders/tika-trunk/builds/1159 Buildbot URL: http://ci.apache.org/ Buildslave for this Build: portunus_ubuntu Build Reason: scheduler Build Source

[jira] [Commented] (TIKA-1241) Tika does not recognise empty nor spanning ZIP files magic

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909422#comment-13909422 ] Nick Burch commented on TIKA-1241: -- Thanks for this, applied with minor tweak + new unit

[jira] [Resolved] (TIKA-1241) Tika does not recognise empty nor spanning ZIP files magic

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1241. -- Resolution: Fixed Fix Version/s: 1.6 Tika does not recognise empty nor spanning ZIP files magic

Re: [VOTE] Apache Tika 1.5 RC2

2014-02-22 Thread Nick Burch
On Fri, 14 Feb 2014, Annie Burgess wrote: I also live in a sort-of removed location - Anchorage, AK. If anyone knows of any developers up north, I'd love to try to connect with the AK Apache community. There are two main public places where Apache committers announce their locations: *

[jira] [Resolved] (TIKA-1225) MDI files detection

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Burch resolved TIKA-1225. -- Resolution: Fixed Fix Version/s: 1.6 Thanks for this, applied in r1570879. MDI files detection

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-02-22 Thread Luis Filipe Nassif (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909581#comment-13909581 ] Luis Filipe Nassif commented on TIKA-1243: -- Wow, looks like Compress-267 was

[jira] [Commented] (TIKA-1245) Incorrect MIME type detection

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909602#comment-13909602 ] Nick Burch commented on TIKA-1245: -- Please attach a PDF file that shows this problem -

[jira] [Commented] (TIKA-1243) Support for 7z archives

2014-02-22 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13909603#comment-13909603 ] Nick Burch commented on TIKA-1243: -- 7z is not compatible with streams, it only works with

[jira] [Updated] (TIKA-1245) Incorrect MIME type detection

2014-02-22 Thread Mohamed Mustafa Khimani (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohamed Mustafa Khimani updated TIKA-1245: -- Attachment: 0001 pages 250-329.pdf Test file attached.. Incorrect MIME

Re: CSCI ASSIGNMENT QUESTION

2014-02-22 Thread Mattmann, Chris A (3980)
Hi Mohamed, Thank you for your question. Your code below looks like it's accomplishing the basics, and the requirements of assignment #1. BTW, I'm CC'ing dev@tika.apache.org. The (optional check OCR quality) refers to the fact that in Tika 1.5, we rely on PDF parsing code that doesn't always get