[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086045#comment-16086045
]
Tim Allison commented on TIKA-2428:
---
bq. Our algorithm for recovering deleted files often recovers
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085640#comment-16085640
]
Tim Allison edited comment on TIKA-2428 at 7/13/17 4:48 PM:
Thank you,
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085965#comment-16085965
]
Luis Filipe Nassif commented on TIKA-2428:
--
bq. If bytes skipped is more than requested, we've hit
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085665#comment-16085665
]
Luis Filipe Nassif commented on TIKA-2428:
--
I just put the stacktrace, you found the cause.
But
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085742#comment-16085742
]
Tim Allison commented on TIKA-2428:
---
bq. But I understood it can skip more than are remaining in the
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085852#comment-16085852
]
Luis Filipe Nassif commented on TIKA-2428:
--
Strange, I don't think the javadocs allow that. Maybe
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085742#comment-16085742
]
Tim Allison edited comment on TIKA-2428 at 7/13/17 3:14 PM:
bq. But I
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085709#comment-16085709
]
Matthew Caruana Galizia commented on TIKA-2042:
---
I'd like to ask for this issue to be
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085730#comment-16085730
]
Tim Allison commented on TIKA-2428:
---
I wonder why I didn't see this in our common crawl/govdocs1 corpus?
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085842#comment-16085842
]
Matthew Caruana Galizia commented on TIKA-2042:
---
[~gagravarr] thank you - that fixes the
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthew Caruana Galizia updated TIKA-2042:
--
Attachment: mbox_email_section.txt
Sample of one of the message sections from
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085881#comment-16085881
]
Tim Allison commented on TIKA-2428:
---
bq. Maybe there is an issue with IOUtils.skipFully()
Y, completely.
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085880#comment-16085880
]
Luis Filipe Nassif commented on TIKA-2042:
--
This problem is very very recurrent. I think we should
[
https://issues.apache.org/jira/browse/TIKA-879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthew Caruana Galizia updated TIKA-879:
-
Attachment: mbox_email_section.txt
As described in TIKA-2042, the attached file
> Sorry [~talli...@apache.org]! Just now having time to test against our
> forensic test corpus...
Let us know what else you find...perhaps give tika-eval a try on a sample of
docs?
Between this and TIKA-2042, at least I won't have time to forget my signing
key's password. :)
Onward to 1.17!
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085847#comment-16085847
]
Matthew Caruana Galizia edited comment on TIKA-2042 at 7/13/17 3:13 PM:
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085747#comment-16085747
]
Nick Burch commented on TIKA-2042:
--
[~mcaruanagalizia] I've added some more patterns in
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthew Caruana Galizia updated TIKA-2042:
--
Attachment: mbox_header.txt
Header attached with identifying information
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085742#comment-16085742
]
Tim Allison edited comment on TIKA-2428 at 7/13/17 2:12 PM:
bq. But I
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085838#comment-16085838
]
Hudson commented on TIKA-2042:
--
FAILURE: Integrated in Jenkins build Tika-trunk #1331 (See
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085863#comment-16085863
]
Tim Allison commented on TIKA-2428:
---
https://bz.apache.org/bugzilla/show_bug.cgi?id=61294
> EMFParser
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085709#comment-16085709
]
Matthew Caruana Galizia edited comment on TIKA-2042 at 7/13/17 2:22 PM:
Tim Allison created TIKA-2429:
-
Summary: Upgrade to POI 3.17-beta2 when available
Key: TIKA-2429
URL: https://issues.apache.org/jira/browse/TIKA-2429
Project: Tika
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085877#comment-16085877
]
Tim Allison commented on TIKA-2428:
---
bq. I don't think the javadocs allow that.
I think the javadocs
[
https://issues.apache.org/jira/browse/TIKA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085883#comment-16085883
]
Luis Filipe Nassif commented on TIKA-2042:
--
See TIKA-879. Looks like widening the magic search
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086318#comment-16086318
]
Luis Filipe Nassif commented on TIKA-2428:
--
That would be very nice!
> EMFParser loops forever
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086436#comment-16086436
]
Tim Allison commented on TIKA-2428:
---
https://bz.apache.org/bugzilla/show_bug.cgi?id=61295
I suspect
Tim Allison created TIKA-2430:
-
Summary: Add at least dev test capability to run Tika against
corrupted files in our test suite
Key: TIKA-2430
URL: https://issues.apache.org/jira/browse/TIKA-2430
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085979#comment-16085979
]
Tim Allison commented on TIKA-2428:
---
Sorry. I misunderstood. Right. That's my belief.
> EMFParser
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085640#comment-16085640
]
Tim Allison edited comment on TIKA-2428 at 7/13/17 12:34 PM:
-
Thank you,
[
https://issues.apache.org/jira/browse/TIKA-2428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16085640#comment-16085640
]
Tim Allison commented on TIKA-2428:
---
Thank you, [~lfcnassif], for reporting this and finding the cause.