[GitHub] [tika] THausherr merged pull request #991: Bump jetty-bom from 9.4.50.v20221201 to 9.4.51.v20230217

2023-02-27 Thread via GitHub
THausherr merged PR #991: URL: https://github.com/apache/tika/pull/991 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@tika.apache.org

[GitHub] [tika] THausherr merged pull request #988: Bump maven-compiler-plugin from 3.10.1 to 3.11.0

2023-02-27 Thread via GitHub
THausherr merged PR #988: URL: https://github.com/apache/tika/pull/988

[GitHub] [tika] THausherr merged pull request #989: Bump aws.version from 1.12.415 to 1.12.416

2023-02-27 Thread via GitHub
THausherr merged PR #989: URL: https://github.com/apache/tika/pull/989

[GitHub] [tika] THausherr merged pull request #990: Bump zstd-jni from 1.5.4-1 to 1.5.4-2

2023-02-27 Thread via GitHub
THausherr merged PR #990: URL: https://github.com/apache/tika/pull/990

[GitHub] [tika] dependabot[bot] opened a new pull request, #991: Bump jetty-bom from 9.4.50.v20221201 to 9.4.51.v20230217

2023-02-27 Thread via GitHub
dependabot[bot] opened a new pull request, #991: URL: https://github.com/apache/tika/pull/991 Bumps [jetty-bom](https://github.com/eclipse/jetty.project) from 9.4.50.v20221201 to 9.4.51.v20230217. Release notes Sourced from

[jira] [Updated] (TIKA-3981) Tika parser meets window system file

2023-02-27 Thread Tika User (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tika User updated TIKA-3981: Attachment: Tika_Testing.docx > Tika parser meets window system file

[jira] [Commented] (TIKA-3981) Tika parser meets window system file

2023-02-27 Thread Tika User (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694330#comment-17694330 ] Tika User commented on TIKA-3981: - Hi [~nick], Only the special files, existed in the

[GitHub] [tika] dependabot[bot] opened a new pull request, #990: Bump zstd-jni from 1.5.4-1 to 1.5.4-2

2023-02-27 Thread via GitHub
dependabot[bot] opened a new pull request, #990: URL: https://github.com/apache/tika/pull/990 Bumps [zstd-jni](https://github.com/luben/zstd-jni) from 1.5.4-1 to 1.5.4-2. Commits [46699bb](https://github.com/luben/zstd-jni/commit/46699bbb024a7e04a61e61d7dbe12fdb1ed9c5dd)

[GitHub] [tika] dependabot[bot] opened a new pull request, #989: Bump aws.version from 1.12.415 to 1.12.416

2023-02-27 Thread via GitHub
dependabot[bot] opened a new pull request, #989: URL: https://github.com/apache/tika/pull/989 Bumps `aws.version` from 1.12.415 to 1.12.416. Updates `aws-java-sdk-s3` from 1.12.415 to 1.12.416 Changelog Sourced from

[GitHub] [tika] dependabot[bot] opened a new pull request, #988: Bump maven-compiler-plugin from 3.10.1 to 3.11.0

2023-02-27 Thread via GitHub
dependabot[bot] opened a new pull request, #988: URL: https://github.com/apache/tika/pull/988 Bumps [maven-compiler-plugin](https://github.com/apache/maven-compiler-plugin) from 3.10.1 to 3.11.0. Commits

[jira] [Created] (TIKA-3983) Snapshot versions mismatch

2023-02-27 Thread Alexey Pismenskiy (Jira)
Alexey Pismenskiy created TIKA-3983:
    Summary: Snapshot versions mismatch
    Key: TIKA-3983
    URL: https://issues.apache.org/jira/browse/TIKA-3983
    Project: Tika
    Issue Type: Bug

[GitHub] [tika] apismensky commented on pull request #985: [TIKA-3979] OneNoteParser - Improve performance for deserialization

2023-02-27 Thread via GitHub
apismensky commented on PR #985: URL: https://github.com/apache/tika/pull/985#issuecomment-1446966128 Confirming with my file: Before fix: 26844 ms. After fix: 692 ms. Yay yay! @nddipiazza thanks for fixing!

[jira] [Commented] (TIKA-3979) OneNoteParser - Improve performance for deserialization

2023-02-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694154#comment-17694154 ] ASF GitHub Bot commented on TIKA-3979: -- apismensky commented on PR #985: URL:

[jira] [Commented] (TIKA-3979) OneNoteParser - Improve performance for deserialization

2023-02-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694134#comment-17694134 ] ASF GitHub Bot commented on TIKA-3979: -- nddipiazza commented on PR #985: URL:

[GitHub] [tika] nddipiazza commented on pull request #985: [TIKA-3979] OneNoteParser - Improve performance for deserialization

2023-02-27 Thread via GitHub
nddipiazza commented on PR #985: URL: https://github.com/apache/tika/pull/985#issuecomment-1446882008 Yes, that is because the OneNote parser for the alternative format was just printing some general header information before. Now it's actually parsing it (slowly, due to the bug), which should

[jira] [Commented] (TIKA-3979) OneNoteParser - Improve performance for deserialization

2023-02-27 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17694096#comment-17694096 ] ASF GitHub Bot commented on TIKA-3979: -- apismensky commented on PR #985: URL:

[GitHub] [tika] apismensky commented on pull request #985: [TIKA-3979] OneNoteParser - Improve performance for deserialization

2023-02-27 Thread via GitHub
apismensky commented on PR #985: URL: https://github.com/apache/tika/pull/985#issuecomment-1446743975 I was going to submit this issue last week. My observation was similar: lots of overhead around BitSet (memory allocations and CPU). We switched from Tika 1.27 to 2.7.0. For one of