[
https://issues.apache.org/jira/browse/TIKA-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17576143#comment-17576143
]
Hudson commented on TIKA-3795:
--
FAILURE: Integrated in Jenkins build Tika » tika-main-jdk8 #732 (See
[
https://issues.apache.org/jira/browse/TIKA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17576078#comment-17576078
]
Hudson commented on TIKA-3831:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #731 (See
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575945#comment-17575945
]
Tika User edited comment on TIKA-3827 at 8/5/22 11:20 PM:
--
Okay
was (Author:
[
https://issues.apache.org/jira/browse/TIKA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-3831:
--
Summary: Allow for retries in S3Fetcher (was: Small improvements to
S3Fetcher)
> Allow for retries in
[
https://issues.apache.org/jira/browse/TIKA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-3831:
--
Description: We should allow for retries. (was: When using the s3fetcher
with aws public datasets, no
[
https://issues.apache.org/jira/browse/TIKA-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison updated TIKA-3831:
--
Summary: Small improvements to S3Fetcher (was: S3Fetcher does not need to
require credentials)
>
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575951#comment-17575951
]
Lakatos Gyula commented on TIKA-3832:
-
[~tallison] Thanks a lot for fixing the problem! Tika is
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575945#comment-17575945
]
Tika User commented on TIKA-3827:
-
When will this fix be available? In the next version?
> Word Document
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575936#comment-17575936
]
Hudson commented on TIKA-3832:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #730 (See
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575935#comment-17575935
]
Hudson commented on TIKA-3827:
--
SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk8 #730 (See
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575915#comment-17575915
]
Hudson commented on TIKA-3832:
--
SUCCESS: Integrated in Jenkins build Tika » tika-branch1x-jdk8 #244 (See
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575891#comment-17575891
]
Tim Allison commented on TIKA-3827:
---
For now, I've added a mediatype hint that the bytes are of type
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575890#comment-17575890
]
Tim Allison commented on TIKA-3827:
---
That's the client code, but we don't know what "getImageData()" is
[
https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575883#comment-17575883
]
Tim Allison edited comment on TIKA-3829 at 8/5/22 2:47 PM:
---
You can exclude
[
https://issues.apache.org/jira/browse/TIKA-3827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575881#comment-17575881
]
Tika User commented on TIKA-3827:
-
Below is the code:
You can easily extract text from the document
[
https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575883#comment-17575883
]
Tim Allison commented on TIKA-3829:
---
You can exclude parsers and exclude specific mime types from
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Allison resolved TIKA-3832.
---
Fix Version/s: 1.28.5
2.4.2
Resolution: Fixed
Thank you [~Laxika] for
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575856#comment-17575856
]
Tim Allison commented on TIKA-3832:
---
We have to defend against cycles in BookMarks... Facepalm, we do in
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575847#comment-17575847
]
Tim Allison commented on TIKA-3832:
---
Thank you for sharing the file. PDFBox's ExtractText has no
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575814#comment-17575814
]
Nick Burch commented on TIKA-3832:
--
Any chance you could try with Apache PDFBox directly? They've got a
Lakatos Gyula created TIKA-3832:
---
Summary: Required array length is too large when reading a PDF file
Key: TIKA-3832
URL: https://issues.apache.org/jira/browse/TIKA-3832
Project: Tika
Issue
[
https://issues.apache.org/jira/browse/TIKA-3832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lakatos Gyula updated TIKA-3832:
Summary: Required array length is too large (OOM) error when reading a PDF
file (was: Required
[
https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575607#comment-17575607
]
John edited comment on TIKA-3829 at 8/5/22 7:01 AM:
Ok. Will check and get back to you if
[
https://issues.apache.org/jira/browse/TIKA-3829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17575607#comment-17575607
]
John commented on TIKA-3829:
Ok. Will check and get back to you if we face this problem again.
There is any
THausherr merged PR #641:
URL: https://github.com/apache/tika/pull/641
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org
THausherr merged PR #640:
URL: https://github.com/apache/tika/pull/640
THausherr merged PR #642:
URL: https://github.com/apache/tika/pull/642