[jira] [Commented] (TIKA-4326) General updates for 3.0.1

2025-01-16 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913676#comment-17913676 ] Hudson commented on TIKA-4326: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_3x-jd

[jira] [Commented] (TIKA-4239) Update to 2.9.3

2025-01-16 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913678#comment-17913678 ] Hudson commented on TIKA-4239: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_2x-jd

[jira] [Commented] (TIKA-4366) Upgrade to POI 5.4.0

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913631#comment-17913631 ] mbiso commented on TIKA-4366: - Hi, because I read some issues and I saw image  is 2 months old

[jira] [Commented] (TIKA-4366) Upgrade to POI 5.4.0

2025-01-16 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913653#comment-17913653 ] Tilman Hausherr commented on TIKA-4366: --- Although I'm not involved in the core relea

[jira] [Commented] (TIKA-4327) General updates for 4.0.0

2025-01-16 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913655#comment-17913655 ] Hudson commented on TIKA-4327: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #

[jira] [Commented] (TIKA-4366) Upgrade to POI 5.4.0

2025-01-16 Thread Tilman Hausherr (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913616#comment-17913616 ] Tilman Hausherr commented on TIKA-4366: --- What would need to be done? > Upgrade to P

[jira] [Commented] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913730#comment-17913730 ] mbiso commented on TIKA-4367: - Hi. I tried with docker image apache/tika:2.9.2.1-full so with

Re: Release schedule for 2.x and 3.x?

2025-01-16 Thread Tim Allison
Sorry, on second thought, a small tweak: I propose that we release 3.1.0 after PDFBox 3.x is released. I further propose that we make a 2.9.3 release at some point after the 3.1.0 release IF we get requests for a 2.x release...otherwise we'll do a final 2.x EOL release in April, 2025. On Thu, Jan

[jira] [Commented] (TIKA-4366) Upgrade to POI 5.4.0

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913729#comment-17913729 ] Tim Allison commented on TIKA-4366: --- Y, we'd have to make a new release. Frankly, it is

[jira] [Commented] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913734#comment-17913734 ] Tim Allison commented on TIKA-4367: --- It really shouldn't work! LOL... I'm wondering if f

[jira] [Commented] (TIKA-4186) tika server shut down innocent connections

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913735#comment-17913735 ] Tim Allison commented on TIKA-4186: --- [~mbiso] I see how your question is related to this

[jira] [Resolved] (TIKA-4186) tika server shut down innocent connections

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4186. --- Resolution: Not A Problem > tika server shut down innocent connections > -

[jira] [Updated] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread luman (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] luman updated TIKA-4368: Description: # Non-rich text content is not checked for the latest version, so when the content is TextExtendedAsci

[jira] [Commented] (TIKA-3932) New repeatable test failures on Solr integration tests for Solr 6 on macosx aarch

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913738#comment-17913738 ] Tim Allison commented on TIKA-3932: --- I don't have a mac anymore to check. :( Maybe [~ndi

[jira] [Created] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread luman (Jira)
luman created TIKA-4368: --- Summary: Unable to correctly extract content in OneNote Key: TIKA-4368 URL: https://issues.apache.org/jira/browse/TIKA-4368 Project: Tika Issue Type: Bug Components:

[jira] [Updated] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mbiso updated TIKA-4367: Attachment: tika-config_new.xml > Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked > proces

[jira] [Commented] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913752#comment-17913752 ] mbiso commented on TIKA-4367: - I updated tika-config_new.xml with your hints, thanks a lot, as

[jira] [Updated] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] mbiso updated TIKA-4367: Description: Hi. I have this problem on my tika-server running in a Docker container. Due to large files, I obtain

[jira] [Updated] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread luman (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] luman updated TIKA-4368: Description: # Non-rich text content is not checked for the latest version, so when the content is TextExtendedAsci

[jira] [Updated] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread luman (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] luman updated TIKA-4368: Description: # Non-rich text content is not checked for the latest version, so when the content is TextExtendedAsci

[jira] [Commented] (TIKA-4369) Pages extracted twice

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913775#comment-17913775 ] Tim Allison commented on TIKA-4369: --- Thank you [~tilman], let me take a look before you

[jira] [Comment Edited] (TIKA-4369) Pages extracted twice

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913775#comment-17913775 ] Tim Allison edited comment on TIKA-4369 at 1/16/25 3:18 PM: Th

Release schedule for 2.x and 3.x?

2025-01-16 Thread Tim Allison
All, It has been a while since we last released 2.x (April 2024) and 3.x (October 2024). We've had a number of dependency updates. PDFBox is on the cusp of a new 3.x release. I propose that we release 3.1.0 after PDFBox 3.x is released and that we make a 2.9.3 release the following week. WDYT

[jira] [Commented] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913745#comment-17913745 ] Tim Allison commented on TIKA-4368: --- Committers are standing by for the PR. :lol: Thank

[jira] [Assigned] (TIKA-4368) Unable to correctly extract content in OneNote

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison reassigned TIKA-4368: - Assignee: Tim Allison > Unable to correctly extract content in OneNote >

[jira] [Commented] (TIKA-4366) Upgrade to POI 5.4.0

2025-01-16 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913740#comment-17913740 ] Tim Allison commented on TIKA-4366: --- See: https://lists.apache.org/thread/g23c5vq7d85k1f

[jira] [Created] (TIKA-4369) Pages extracted twice

2025-01-16 Thread Tilman Hausherr (Jira)
Tilman Hausherr created TIKA-4369: - Summary: Pages extracted twice Key: TIKA-4369 URL: https://issues.apache.org/jira/browse/TIKA-4369 Project: Tika Issue Type: Bug Components: pars

[jira] [Commented] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913751#comment-17913751 ] mbiso commented on TIKA-4367: - I updated tika-config_new.xml with your hints, thanks a lot, as

[jira] (TIKA-4367) Problem with the: org.apache.tika.server.core.ServerStatusWatcher forked process observed TIMEOUT and is shutting down

2025-01-16 Thread mbiso (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4367 ] mbiso deleted comment on TIKA-4367: - was (Author: JIRAUSER308329): I updated tika-config_new.xml with your hints, thanks a lot, as in the attachment. I will retry with the apache/tika:latest-full and

Re: [PR] Introduce GoogleDrive Fetcher for tika-pipes [tika]

2025-01-16 Thread via GitHub
nddipiazza commented on PR #2077: URL: https://github.com/apache/tika/pull/2077#issuecomment-2596585852 I am porting this into https://github.com/nddipiazza/tika-pipes starting now. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

[jira] [Commented] (TIKA-4239) Update to 2.9.3

2025-01-16 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17913849#comment-17913849 ] Hudson commented on TIKA-4239: -- SUCCESS: Integrated in Jenkins build Tika » tika-branch_2x-jd