[jira] [Commented] (TIKA-93) OCR support

2014-02-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895886#comment-13895886 ] Grant Ingersoll commented on TIKA-93: FYI:

[jira] [Commented] (TIKA-93) OCR support

2014-02-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895887#comment-13895887 ] Grant Ingersoll commented on TIKA-93: Not sure I am happy w/ the changes here yet, esp.

[jira] [Commented] (TIKA-93) OCR support

2014-02-09 Thread Nick Burch (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13895897#comment-13895897 ] Nick Burch commented on TIKA-93: Generally speaking, when a parser finds embedded resources,

[jira] [Updated] (TIKA-93) OCR support

2014-02-09 Thread Grant Ingersoll (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-93?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Grant Ingersoll updated TIKA-93: Attachment: testOCR.pptx testOCR.pdf testOCR.docx

Re: [VOTE] Apache Tika 1.5 RC1

2014-02-09 Thread Dave Meikle
Hi Guys, Whilst there were lots of positive votes for releasing, I have cut another RC fixing the version numbers of tika-dotnet and tika-java7 to tidy up the issue Julien noticed, and am therefore planning to scrap this vote and call another quick one. Cheers, Dave On 5 February 2014 11:11,

[VOTE] Apache Tika 1.5 RC2

2014-02-09 Thread Dave Meikle
Hi Guys, A new release candidate for the Tika 1.5 release is now available at: http://people.apache.org/~dmeikle/tika-1.5-rc2/ This fixes the issues with the POM version numbers for tika-dotnet and tika-java7 in Tika 1.5 RC1. The release candidate is a zip archive of the sources in:

[jira] [Updated] (TIKA-1233) PDFBox can throw StringIndexOutOfBoundsException on some dates

2014-02-09 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1233: Description: PDFBox's date parser can throw a StringIndexOutOfBoundsException if a date string for