Re: [PR] Bump aws.version from 1.12.656 to 1.12.657 [tika]

2024-02-12 Thread via GitHub
THausherr merged PR #1592: URL: https://github.com/apache/tika/pull/1592 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[PR] Bump aws.version from 1.12.656 to 1.12.657 [tika]

2024-02-12 Thread via GitHub
dependabot[bot] opened a new pull request, #1592: URL: https://github.com/apache/tika/pull/1592 Bumps `aws.version` from 1.12.656 to 1.12.657. Updates `com.amazonaws:aws-java-sdk-s3` from 1.12.656 to 1.12.657 Changelog Sourced from

[jira] [Commented] (TIKA-3784) Detector returns "application/x-x509-key" when scanning a .p12 file

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816827#comment-17816827 ] Tim Allison commented on TIKA-3784: --- Well, sure, if you want to make it easy! Y, let's go with something

[jira] [Comment Edited] (TIKA-3784) Detector returns "application/x-x509-key" when scanning a .p12 file

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816811#comment-17816811 ] Lonzak edited comment on TIKA-3784 at 2/12/24 11:16 PM: PKCS12 is not the easiest

[jira] [Comment Edited] (TIKA-3784) Detector returns "application/x-x509-key" when scanning a .p12 file

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816811#comment-17816811 ] Lonzak edited comment on TIKA-3784 at 2/12/24 11:15 PM: PKCS12 is not the easiest

[jira] [Updated] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lonzak updated TIKA-4194: - Description: We use tika to detect the type of a file which is uploaded. In most cases this works quite well.

[jira] [Commented] (TIKA-3784) Detector returns "application/x-x509-key" when scanning a .p12 file

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816811#comment-17816811 ] Lonzak commented on TIKA-3784: -- PKCS12 is not the easiest format :-| The oid for pkcs12 starts with

[jira] [Commented] (TIKA-4191) tika-core and other deps should be "provided" in non-app contexts

2024-02-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816804#comment-17816804 ] Hudson commented on TIKA-4191: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1505 (See

[jira] [Commented] (TIKA-3784) Detector returns "application/x-x509-key" when scanning a .p12 file

2024-02-12 Thread Nick Burch (Jira)
[ https://issues.apache.org/jira/browse/TIKA-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816788#comment-17816788 ] Nick Burch commented on TIKA-3784: -- From [https://datatracker.ietf.org/doc/rfc7292/] it looks like

[jira] [Commented] (TIKA-4196) Add a BOM charset detector

2024-02-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816779#comment-17816779 ] Hudson commented on TIKA-4196: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1504 (See

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816781#comment-17816781 ] Hudson commented on TIKA-4194: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1504 (See

[jira] [Commented] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816780#comment-17816780 ] Hudson commented on TIKA-4195: -- SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk11 #1504 (See

[jira] [Commented] (TIKA-4191) tika-core and other deps should be "provided" in non-app contexts

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816715#comment-17816715 ] ASF GitHub Bot commented on TIKA-4191: -- tballison merged PR #1575: URL:

Re: [PR] TIKA-4191 -- reduce tika-core's scope to "provided" where possible [tika]

2024-02-12 Thread via GitHub
tballison merged PR #1575: URL: https://github.com/apache/tika/pull/1575

[jira] [Resolved] (TIKA-4197) Downgrade jackrabbit in 2.x

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4197. --- Fix Version/s: 2.9.2 Resolution: Fixed > Downgrade jackrabbit in 2.x >

[jira] [Created] (TIKA-4197) Downgrade jackrabbit in 2.x

2024-02-12 Thread Tim Allison (Jira)
Tim Allison created TIKA-4197: - Summary: Downgrade jackrabbit in 2.x Key: TIKA-4197 URL: https://issues.apache.org/jira/browse/TIKA-4197 Project: Tika Issue Type: Bug Reporter: Tim

[jira] [Updated] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4195: -- Description: The JSoupParser runs encoding detection on the InputStream. If the result is null, the

[jira] [Resolved] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison resolved TIKA-4195. --- Fix Version/s: 3.0.0 Resolution: Fixed > JSoupParser conceals null from the EncodingDetector >

Re: [PR] TIKA-4195 -- jsoup parser shouldn't conceal backoff to default encoding [tika]

2024-02-12 Thread via GitHub
tballison merged PR #1591: URL: https://github.com/apache/tika/pull/1591

[jira] [Commented] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816694#comment-17816694 ] ASF GitHub Bot commented on TIKA-4195: -- tballison merged PR #1591: URL:

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816689#comment-17816689 ] Tim Allison commented on TIKA-4194: --- Merged and cherry-picked into branch_2x. [~tom_1st] if you do have

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816687#comment-17816687 ] ASF GitHub Bot commented on TIKA-4194: -- tballison merged PR #1589: URL:

Re: [PR] [TIKA-4194] Fix for unrecognized pkcs12 keystores [tika]

2024-02-12 Thread via GitHub
tballison merged PR #1589: URL: https://github.com/apache/tika/pull/1589

[jira] [Commented] (TIKA-4196) Add a BOM charset detector

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816680#comment-17816680 ] ASF GitHub Bot commented on TIKA-4196: -- tballison merged PR #1590: URL:

Re: [PR] TIKA-4196 [tika]

2024-02-12 Thread via GitHub
tballison merged PR #1590: URL: https://github.com/apache/tika/pull/1590

[jira] [Commented] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816679#comment-17816679 ] ASF GitHub Bot commented on TIKA-4195: -- tballison opened a new pull request, #1591: URL:

[PR] TIKA-4195 -- jsoup parser shouldn't conceal backoff to default encoding [tika]

2024-02-12 Thread via GitHub
tballison opened a new pull request, #1591: URL: https://github.com/apache/tika/pull/1591 Thanks for your contribution to [Apache Tika](https://tika.apache.org/)! Your help is appreciated! Before opening the pull request, please verify that * there is an open issue on the

[jira] [Updated] (TIKA-4196) Add a BOM charset detector

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tim Allison updated TIKA-4196: -- Description: The ICU4j and the StandardHtmlEncodingDetector detectors include a bom detector, but for

[jira] [Commented] (TIKA-4196) Add a BOM charset detector

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816668#comment-17816668 ] ASF GitHub Bot commented on TIKA-4196: -- tballison opened a new pull request, #1590: URL:

[PR] TIKA-4196 [tika]

2024-02-12 Thread via GitHub
tballison opened a new pull request, #1590: URL: https://github.com/apache/tika/pull/1590

[jira] [Created] (TIKA-4196) Add a BOM charset detector

2024-02-12 Thread Tim Allison (Jira)
Tim Allison created TIKA-4196: - Summary: Add a BOM charset detector Key: TIKA-4196 URL: https://issues.apache.org/jira/browse/TIKA-4196 Project: Tika Issue Type: New Feature

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Tim Allison (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816661#comment-17816661 ] Tim Allison commented on TIKA-4194: --- Thank you for this! I'll try to take a look later today. Is there

[jira] [Created] (TIKA-4195) JSoupParser conceals null from the EncodingDetector

2024-02-12 Thread Tim Allison (Jira)
Tim Allison created TIKA-4195: - Summary: JSoupParser conceals null from the EncodingDetector Key: TIKA-4195 URL: https://issues.apache.org/jira/browse/TIKA-4195 Project: Tika Issue Type:

[jira] [Comment Edited] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816607#comment-17816607 ] Lonzak edited comment on TIKA-4194 at 2/12/24 1:47 PM: --- Interestingly the

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816607#comment-17816607 ] Lonzak commented on TIKA-4194: -- Interestingly the "application/pkcs7-signature" type looks quite similar:

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread Lonzak (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816608#comment-17816608 ] Lonzak commented on TIKA-4194: -- Added a pull request: https://github.com/apache/tika/pull/1589 > tika fails

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816606#comment-17816606 ] ASF GitHub Bot commented on TIKA-4194: -- Lonzak commented on PR #1589: URL:

Re: [PR] [TIKA-4194] Fix for unrecognized pkcs12 keystores [tika]

2024-02-12 Thread via GitHub
Lonzak commented on PR #1589: URL: https://github.com/apache/tika/pull/1589#issuecomment-1938709322 It would be appreciated if the change could go into 2.9.X ([branch_2x](https://github.com/apache/tika/tree/branch_2x))

[jira] [Commented] (TIKA-4194) tika fails to detect certain pkcs12 keystores types p12 pfx

2024-02-12 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/TIKA-4194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17816605#comment-17816605 ] ASF GitHub Bot commented on TIKA-4194: -- Lonzak opened a new pull request, #1589: URL: