[jira] [Commented] (TIKA-1683) Add encryption support to Jackcess parser

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628585#comment-14628585 ] Tim Allison commented on TIKA-1683: --- At this point we have a version clash on bouncycastl

[jira] [Commented] (TIKA-1588) Upgrade to PDFBox 1.8.10 when available

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14628970#comment-14628970 ] Tim Allison commented on TIKA-1588: --- Interesting. This must be another case of the multi-

[jira] [Resolved] (TIKA-1681) Fix file opening in Jackcess to enable read only for v1997 files

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1681. --- Resolution: Fixed Many thanks again to James Ahlborn. > Fix file opening in Jackcess to enable read on

[jira] [Created] (TIKA-1684) Clean up metadata properties in Jackcess parser

2015-07-15 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1684: - Summary: Clean up metadata properties in Jackcess parser Key: TIKA-1684 URL: https://issues.apache.org/jira/browse/TIKA-1684 Project: Tika Issue Type: Improvement

[jira] [Resolved] (TIKA-1684) Clean up metadata properties in Jackcess parser

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1684. --- Resolution: Fixed r 1691297 > Clean up metadata properties in Jackcess parser > --

[jira] [Assigned] (TIKA-1684) Clean up metadata properties in Jackcess parser

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-1684: - Assignee: Tim Allison > Clean up metadata properties in Jackcess parser >

[jira] [Created] (TIKA-1685) Clean up deprecated components

2015-07-15 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1685: - Summary: Clean up deprecated components Key: TIKA-1685 URL: https://issues.apache.org/jira/browse/TIKA-1685 Project: Tika Issue Type: Task Reporter: Ti

[jira] [Resolved] (TIKA-1685) Clean up some deprecated components

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1685. --- Resolution: Fixed r1691299 > Clean up some deprecated components > ---

[jira] [Updated] (TIKA-1685) Clean up some deprecated components

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1685: -- Summary: Clean up some deprecated components (was: Clean up deprecated components) > Clean up some depr

[jira] [Created] (TIKA-1686) Upgrade metadata-extractor to 2.8.1

2015-07-15 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1686: - Summary: Upgrade metadata-extractor to 2.8.1 Key: TIKA-1686 URL: https://issues.apache.org/jira/browse/TIKA-1686 Project: Tika Issue Type: Task Reporte

[jira] [Commented] (TIKA-1686) Upgrade metadata-extractor to 2.8.1

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14629046#comment-14629046 ] Tim Allison commented on TIKA-1686: --- We're getting a test failure: {noformat} org.junit.C

[jira] [Created] (TIKA-1687) Upgrade xerial.org's sqlite-jdbc to 3.8.10.1

2015-07-15 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1687: - Summary: Upgrade xerial.org's sqlite-jdbc to 3.8.10.1 Key: TIKA-1687 URL: https://issues.apache.org/jira/browse/TIKA-1687 Project: Tika Issue Type: Task

[jira] [Resolved] (TIKA-1687) Upgrade xerial.org's sqlite-jdbc to 3.8.10.1

2015-07-15 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1687. --- Resolution: Fixed r 1691302 > Upgrade xerial.org's sqlite-jdbc to 3.8.10.1 > -

[jira] [Commented] (TIKA-1671) Wrapped lines in PDF files not processed correctly

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631200#comment-14631200 ] Tim Allison commented on TIKA-1671: --- I think this is an issue with PDFs in general, not P

[jira] [Commented] (TIKA-1671) Wrapped lines in PDF files not processed correctly

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631204#comment-14631204 ] Tim Allison commented on TIKA-1671: --- And a few other points... Encoding instructions wit

[jira] [Assigned] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-1690: - Assignee: Tim Allison > Inconsistent (buggy) behavior when using tika-server > ---

[jira] [Commented] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631209#comment-14631209 ] Tim Allison commented on TIKA-1690: --- Thank you for raising this. As I mentioned on the u

[jira] [Commented] (TIKA-1689) Parser sort order change in TIKA-1517 breaks parser override capability

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631212#comment-14631212 ] Tim Allison commented on TIKA-1689: --- [~dwarren], thank you for raising this. [~chrismatt

[jira] [Commented] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-17 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14631245#comment-14631245 ] Tim Allison commented on TIKA-1690: --- [~chrismattmann], as I look at this, it looks like w

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633429#comment-14633429 ] Tim Allison commented on TIKA-1678: --- [~tilman], y, that's taken from the xmp. As you fou

[jira] [Commented] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633433#comment-14633433 ] Tim Allison commented on TIKA-1238: --- [~rangma], Any chance you could share a test file?

[jira] [Commented] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633436#comment-14633436 ] Tim Allison commented on TIKA-1690: --- tmpFile? Do you mean the fileUrl? Sorry. > Incon

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633454#comment-14633454 ] Tim Allison commented on TIKA-1678: --- The good news is that with PDFBox 2.0, we get a {{nu

[jira] [Comment Edited] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633454#comment-14633454 ] Tim Allison edited comment on TIKA-1678 at 7/20/15 11:43 AM: - T

[jira] [Commented] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633487#comment-14633487 ] Tim Allison commented on TIKA-1238: --- The stacktrace is related to my original problem, bu

[jira] [Commented] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633511#comment-14633511 ] Tim Allison commented on TIKA-1238: --- Got it. For now, let's see if I can find some trigg

[jira] [Commented] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633610#comment-14633610 ] Tim Allison commented on TIKA-1690: --- Is the problem {{is.available()}}? {noformat}

[jira] [Comment Edited] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633610#comment-14633610 ] Tim Allison edited comment on TIKA-1690 at 7/20/15 1:43 PM: Is

[jira] [Commented] (TIKA-1285) Upgrade to PDFBox 2.0.0 when available

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633643#comment-14633643 ] Tim Allison commented on TIKA-1285: --- Still hammering out some issues. If regression tests

[jira] [Updated] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1238: -- Attachment: (was: 873911_100_20061124_191408.msg) > Update OutlookExtractor to handle codepage id

[jira] [Commented] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633696#comment-14633696 ] Tim Allison commented on TIKA-1238: --- Probably not the best way to transfer a file... I m

[jira] [Resolved] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1238. --- Resolution: Fixed That should do it. > Update OutlookExtractor to handle codepage identification more

[jira] [Comment Edited] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633487#comment-14633487 ] Tim Allison edited comment on TIKA-1238 at 7/20/15 3:34 PM: The

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633720#comment-14633720 ] Tim Allison commented on TIKA-1678: --- Very helpful! If we require that the string start w

[jira] [Reopened] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reopened TIKA-1238: --- Doh. Reopening until we get the mods to POI and then the updated Tika code after the next POI release. >

[jira] [Commented] (TIKA-1238) Update OutlookExtractor to handle codepage identification more rigorously

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633786#comment-14633786 ] Tim Allison commented on TIKA-1238: --- That's up to the community, but I think we have anot

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633986#comment-14633986 ] Tim Allison commented on TIKA-1678: --- That works perfectly. Thank you, [~tilman]! Now I'

[jira] [Comment Edited] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14633986#comment-14633986 ] Tim Allison edited comment on TIKA-1678 at 7/20/15 7:38 PM: Tha

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634052#comment-14634052 ] Tim Allison commented on TIKA-1678: --- No good deed goes unpunished. Thank you! Let me kn

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-20 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14634391#comment-14634391 ] Tim Allison commented on TIKA-1678: --- Slight modification of [~tilman]'s example added in

[jira] [Resolved] (TIKA-1683) Add encryption support to Jackcess parser

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1683. --- Resolution: Fixed r1692100 > Add encryption support to Jackcess parser > -

[jira] [Created] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
Tim Allison created TIKA-1692: - Summary: Enable getExtension() for texty file types Key: TIKA-1692 URL: https://issues.apache.org/jira/browse/TIKA-1692 Project: Tika Issue Type: Improvement

[jira] [Updated] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1692: -- Description: {{getExtension()}} offers a handy way to add a "detected" extension from a {{MimeType}} for

[jira] [Updated] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1692: -- Attachment: MimeUtilTest.java So the use case is: you've already done file type identification to figure

[jira] [Comment Edited] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635674#comment-14635674 ] Tim Allison edited comment on TIKA-1692 at 7/21/15 7:29 PM: So

[jira] [Commented] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635687#comment-14635687 ] Tim Allison commented on TIKA-1692: --- Nothing like unit tests... So, all is well for stra

[jira] [Comment Edited] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635687#comment-14635687 ] Tim Allison edited comment on TIKA-1692 at 7/21/15 7:38 PM: Not

[jira] [Comment Edited] (TIKA-1692) Enable getExtension() for texty file types

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14635687#comment-14635687 ] Tim Allison edited comment on TIKA-1692 at 7/21/15 7:40 PM: Not

[jira] [Updated] (TIKA-1692) Enable getExtension() for texty file types that include encoding information

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1692: -- Summary: Enable getExtension() for texty file types that include encoding information (was: Enable getEx

[jira] [Commented] (TIKA-1692) Enable getExtension() for texty file types that include encoding information

2015-07-21 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636076#comment-14636076 ] Tim Allison commented on TIKA-1692: --- Y. That's perfect. Will give it a try. > Enable get

[jira] [Commented] (TIKA-1692) Enable getExtension() for texty file types that include encoding information

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636901#comment-14636901 ] Tim Allison commented on TIKA-1692: --- Hmmm If we modify {{getRegisteredMimeType}} to t

[jira] [Updated] (TIKA-1692) Enable getExtension() for mime strings with parameters

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1692: -- Summary: Enable getExtension() for mime strings with parameters (was: Enable getExtension() for texty fi

[jira] [Comment Edited] (TIKA-1692) Enable getExtension() for mime strings with parameters

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14636901#comment-14636901 ] Tim Allison edited comment on TIKA-1692 at 7/22/15 2:12 PM: Hmm

[jira] [Resolved] (TIKA-1692) Enable getExtension() for mime strings with parameters

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1692. --- Resolution: Fixed r1692283. Thank you, [~gagravarr]! > Enable getExtension() for mime strings with pa

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14637535#comment-14637535 ] Tim Allison commented on TIKA-1678: --- Got it. Thank you! > PDF metadata extraction fails

[jira] [Resolved] (TIKA-1233) PDFBox can throw StringIndexOutOfBoundsException on some dates

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1233. --- Resolution: Fixed Upgraded to PDFBox 1.8.10 with r1692341 > PDFBox can throw StringIndexOutOfBoundsExc

[jira] [Resolved] (TIKA-1588) Upgrade to PDFBox 1.8.10 when available

2015-07-22 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1588. --- Resolution: Fixed Fix Version/s: 1.10 r1692341 > Upgrade to PDFBox 1.8.10 when available >

[jira] [Commented] (TIKA-1693) Tika OSGi bundle build fails

2015-07-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14638614#comment-14638614 ] Tim Allison commented on TIKA-1693: --- Thank you, [~bobpaulin]! Will fix shortly. Any ide

[jira] [Resolved] (TIKA-1690) Inconsistent (buggy) behavior when using tika-server

2015-07-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1690. --- Resolution: Fixed Reverted r1678515's fileUrl capability in tika-server. 1692383 If we want this capa

[jira] [Resolved] (TIKA-1667) Upgrade to POI 3.13-beta1 when available

2015-07-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1667?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1667. --- Resolution: Fixed r1692422 > Upgrade to POI 3.13-beta1 when available > --

[jira] [Resolved] (TIKA-1046) Get "java.util.zip.ZipException: unknown compression method" when indexing ppf97-file containing wmf-image

2015-07-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1046. --- Resolution: Fixed Fixed with upgrade to POI-3.13-beta1 in r1692422. Many thanks, again, [~kiwiwings] f

[jira] [Resolved] (TIKA-1315) Basic list support in WordExtractor

2015-07-23 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1315. --- Resolution: Fixed Completely fixed, now, with upgrade to POI 3.13-beta1 in r1692422. Overrides now wo

[jira] [Updated] (TIKA-1689) Parser sort order change in TIKA-1517 breaks parser override capability

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-1689: -- Priority: Blocker (was: Major) > Parser sort order change in TIKA-1517 breaks parser override capability

[jira] [Commented] (TIKA-1689) Parser sort order change in TIKA-1517 breaks parser override capability

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640350#comment-14640350 ] Tim Allison commented on TIKA-1689: --- To confirm [~dwarren]'s point and to supplement with

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640370#comment-14640370 ] Tim Allison commented on TIKA-1678: --- I'll try to build a test file today with the fix on

[jira] [Comment Edited] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640370#comment-14640370 ] Tim Allison edited comment on TIKA-1678 at 7/24/15 12:18 PM: - I

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640458#comment-14640458 ] Tim Allison commented on TIKA-1678: --- Duh. I had subclassed BaseParser, but then I was tr

[jira] [Commented] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640480#comment-14640480 ] Tim Allison commented on TIKA-1678: --- As a former professor of (Ancient) Greek, it hurts t

[jira] [Resolved] (TIKA-1678) PDF metadata extraction fails to spot UTF-16 encoded title

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1678. --- Resolution: Fixed Fix Version/s: 1.10 Thank you, [~anjackson], for raising this. Thank you, [~t

[jira] [Resolved] (TIKA-1689) Parser sort order change in TIKA-1517 breaks parser override capability

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-1689. --- Resolution: Fixed Fix Version/s: 1.10 I flipped the sort order back to what it was in r1692564.

[jira] [Comment Edited] (TIKA-1689) Parser sort order change in TIKA-1517 breaks parser override capability

2015-07-24 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14640870#comment-14640870 ] Tim Allison edited comment on TIKA-1689 at 7/24/15 6:29 PM: In

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642685#comment-14642685 ] Tim Allison commented on TIKA-1524: --- [~grossws], does this modification depend on TIKA-16

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642786#comment-14642786 ] Tim Allison commented on TIKA-1524: --- Y, I was just looking at the small bit of code that

[jira] [Comment Edited] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642786#comment-14642786 ] Tim Allison edited comment on TIKA-1524 at 7/27/15 2:36 PM: Y,

[jira] [Comment Edited] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642786#comment-14642786 ] Tim Allison edited comment on TIKA-1524 at 7/27/15 2:37 PM: Y,

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642959#comment-14642959 ] Tim Allison commented on TIKA-1524: --- Y, sorry, I wasn't being as precise as I should have

[jira] [Comment Edited] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642959#comment-14642959 ] Tim Allison edited comment on TIKA-1524 at 7/27/15 4:32 PM: Y,

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643214#comment-14643214 ] Tim Allison commented on TIKA-1524: --- I don't think I'll have time to install the geotopic

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-27 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643447#comment-14643447 ] Tim Allison commented on TIKA-1524: --- ;) ha! It looks to me like the current JSON parser

[jira] [Commented] (TIKA-1524) Can install Tika-Bundle, missing JUnit dependency

2015-07-28 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644298#comment-14644298 ] Tim Allison commented on TIKA-1524: --- +1. Looks great to me. Thank you! [~bobpaulin], t

[jira] [Commented] (TIKA-1691) Apache Tika for enabling metadata interoperability

2015-07-28 Thread Tim Allison (JIRA)
[ https://issues.apache.org/jira/browse/TIKA-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14644309#comment-14644309 ] Tim Allison commented on TIKA-1691: --- My 2 cents: Metadata interoperability is a goal tha

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 3.9.2 version

2024-04-30 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842401#comment-17842401 ] Tim Allison commented on TIKA-4249: --- I'm guessing you mean 2.9.0->2.9.2. The challenge

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 3.9.2 version

2024-04-30 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842402#comment-17842402 ] Tim Allison commented on TIKA-4249: --- Modifying the first hit from {{offset="0"}} to {{of

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 3.9.2 version

2024-04-30 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842405#comment-17842405 ] Tim Allison commented on TIKA-4249: --- Files never cease to amaze! Thank you. Onwards! >

[jira] [Resolved] (TIKA-4249) EML file is treating it as text file in 3.9.2 version

2024-05-01 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4249. --- Fix Version/s: 3.0.0 2.9.3 Resolution: Fixed > EML file is treating it as te

[jira] [Updated] (TIKA-4249) EML file is treating it as text file in 2.9.2 version

2024-05-01 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4249: -- Summary: EML file is treating it as text file in 2.9.2 version (was: EML file is treating it as text fi

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 3.9.2 version

2024-05-01 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842604#comment-17842604 ] Tim Allison commented on TIKA-4249: --- The example file shared was actually kind of weird.

[jira] [Commented] (TIKA-4243) tika configuration overhaul

2024-05-01 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842605#comment-17842605 ] Tim Allison commented on TIKA-4243: --- Do we put it in tika-serialization or a new module?

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 2.9.2 version

2024-05-01 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17842745#comment-17842745 ] Tim Allison commented on TIKA-4249: --- Version numbers for the fix are noted above: 2.9.3

[jira] [Created] (TIKA-4250) Add a libpst-based parser

2024-05-02 Thread Tim Allison (Jira)
Tim Allison created TIKA-4250: - Summary: Add a libpst-based parser Key: TIKA-4250 URL: https://issues.apache.org/jira/browse/TIKA-4250 Project: Tika Issue Type: Task Reporter: Tim All

[jira] [Commented] (TIKA-4249) EML file is treating it as text file in 2.9.2 version

2024-05-03 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843217#comment-17843217 ] Tim Allison commented on TIKA-4249: --- > Crystal ball is murky on the timing of the next 2

[jira] [Commented] (TIKA-4250) Add a libpst-based parser

2024-05-03 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843361#comment-17843361 ] Tim Allison commented on TIKA-4250: --- Hahahahaha. I figured you'd have input on this [~lf

[jira] [Commented] (TIKA-4250) Add a libpst-based parser

2024-05-04 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843428#comment-17843428 ] Tim Allison commented on TIKA-4250: --- Given your experience, I think it would be valuable

[jira] [Comment Edited] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843740#comment-17843740 ] Tim Allison edited comment on TIKA-4250 at 5/6/24 1:02 PM: --- Wow.

[jira] [Updated] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4250: -- Attachment: 8.msg > Add a libpst-based parser > - > > Key: TIKA-

[jira] [Updated] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4250: -- Attachment: 8.eml > Add a libpst-based parser > - > > Key: TIKA-

[jira] [Commented] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843798#comment-17843798 ] Tim Allison commented on TIKA-4250: --- So, I caught an example of libpst not reading an at

[jira] [Comment Edited] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843798#comment-17843798 ] Tim Allison edited comment on TIKA-4250 at 5/6/24 5:02 PM: --- So,

[jira] [Comment Edited] (TIKA-4250) Add a libpst-based parser

2024-05-06 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843746#comment-17843746 ] Tim Allison edited comment on TIKA-4250 at 5/6/24 5:03 PM: --- Wait

[jira] [Created] (TIKA-4251) [DISCUSS] move to cosium's git-code-format-maven-plugin

2024-05-06 Thread Tim Allison (Jira)
Tim Allison created TIKA-4251: - Summary: [DISCUSS] move to cosium's git-code-format-maven-plugin Key: TIKA-4251 URL: https://issues.apache.org/jira/browse/TIKA-4251 Project: Tika Issue Type: Task
