[tika] branch dependabot/maven/aws.version-1.12.568 deleted (was 4310eca9c)

2023-10-17 Thread github-bot
This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a change to branch dependabot/maven/aws.version-1.12.568
in repository https://gitbox.apache.org/repos/asf/tika.git


 was 4310eca9c Bump aws.version from 1.12.567 to 1.12.568

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[tika] branch main updated (ce5ca0ffe -> 8f8d40bc9)

2023-10-17 Thread tilman

tilman pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git


from ce5ca0ffe Merge pull request #1408 from apache/dependabot/maven/aws.version-1.12.567
 add 4310eca9c Bump aws.version from 1.12.567 to 1.12.568
 new 8f8d40bc9 Merge pull request #1409 from apache/dependabot/maven/aws.version-1.12.568

The 1 revision listed above as "new" is entirely new to this
repository and will be described in a separate email.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 tika-parent/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[tika] 01/01: Merge pull request #1409 from apache/dependabot/maven/aws.version-1.12.568

2023-10-17 Thread tilman

tilman pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git

commit 8f8d40bc922472fa090d95e42e7591fbb0fb0bf8
Merge: ce5ca0ffe 4310eca9c
Author: Tilman Hausherr 
AuthorDate: Wed Oct 18 07:31:47 2023 +0200

Merge pull request #1409 from apache/dependabot/maven/aws.version-1.12.568

Bump aws.version from 1.12.567 to 1.12.568

 tika-parent/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)



[tika] branch dependabot/maven/aws.version-1.12.568 created (now 4310eca9c)

2023-10-17 Thread github-bot

github-bot pushed a change to branch dependabot/maven/aws.version-1.12.568
in repository https://gitbox.apache.org/repos/asf/tika.git


  at 4310eca9c Bump aws.version from 1.12.567 to 1.12.568

No new revisions were added by this update.



[tika] branch branch_2x updated: update some dependencies

2023-10-17 Thread tallison

tallison pushed a commit to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git


The following commit(s) were added to refs/heads/branch_2x by this push:
 new d8a85c7c7 update some dependencies
d8a85c7c7 is described below

commit d8a85c7c709cdd8f1dcc515eca1005ca325f0560
Author: tallison 
AuthorDate: Tue Oct 17 09:51:22 2023 -0400

update some dependencies
---
 tika-parent/pom.xml | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/tika-parent/pom.xml b/tika-parent/pom.xml
index 69c14fdfd..79e5952c1 100644
--- a/tika-parent/pom.xml
+++ b/tika-parent/pom.xml
@@ -306,7 +306,7 @@
 
 
 2.28.0
-1.12.565
+1.12.567
 
-3.5.6
+3.5.7
 1.27
 1.0.0-M2
 
@@ -343,7 +343,7 @@
 62.2
 1.4.0
 2.21.19
-2.15.2
+2.15.3
 1.3.2
 2.0
 2.1.1
@@ -1061,12 +1061,6 @@
 h2
 2.2.224
 
-
-
-com.fasterxml.jackson.core
-jackson-databind
-2.15.2
-
   
   true
 



svn commit: r64583 [2/3] - /dev/tika/2.9.1/

2023-10-17 Thread tallison


Added: dev/tika/2.9.1/CHANGES-2.9.1.txt
==
--- dev/tika/2.9.1/CHANGES-2.9.1.txt (added)
+++ dev/tika/2.9.1/CHANGES-2.9.1.txt Tue Oct 17 11:11:43 2023
@@ -0,0 +1,3198 @@
+Release 2.9.1 - 10/17/2023
+
+   * Dependency upgrades including commons-compress to fix CVE-2023-42503.
+
+   * Improve RFC822 detection (TIKA-4153).
+
+   * Enable configuration of "maxJsonStringFieldLength" in TikaConfig to allow users to
+ avoid DEFAULT_MAX_STRING_LEN exceptions from Jackson (TIKA-4154).
+
+   * Fix bug in DateUtils that stripped timezone information from
+ incoming Calendar objects (TIKA-4126).
+
+   * The InputStreamDigester now calculates stream length (TIKA-4016).
+
+Release 2.9.0 - 8/23/2023
+
+   * With user configuration, the PDFParser can now throw an EncryptedDocumentException
+ for Microsoft IRM PDF containers with encrypted payloads. Separately,
+ the PDFParser now throws an EncryptedDocumentException instead of an IOException
+ if the security handler cannot be found (TIKA-4082).
+
+   * Fix bug that led to duplicate extraction of macros from some OLE2 containers (TIKA-4116).
+
+   * Parse iframe's srcdoc as an embedded file (TIKA-3109).
+
+   * Add detection of warc.gz as a specialization of gz and parse as if a standard WARC (TIKA-4048).
+
+   * Allow users to modify the attachment limit size in the /unpack resource (TIKA-4039)
+
+   * Fixed write limit bug in RecursiveParserWrapper (TIKA-4055).
+
+   * Add mime detection for many files with thanks to Gregory Lepore (TIKA-3992).
+
+   * Fixed iWork 13 keynote detection on files with wrong extension (TIKA-4111).
+
+Release 2.8.0 - 5/11/2023
+
+   * Enable counting and/or parsing of incremental updates in PDFs.  This
+ is an experimental feature and may change in later releases (TIKA-4017).
+
+   * Fixed bug that prevented the loading of CompositeExternalParser in tika-app and
+ tika-server-standard. This parser will call exiftool and ffmpeg if those are installed, as was
+ the behavior in Tika 1.x. Exclude org.apache.tika.parser.external.CompositeExternalParser
+ if you do not want this behavior (TIKA-4022).
+
+   * Removed the shading of tika-parsers-standard-module (TIKA-4038).
+
+   * Enable optional extraction of file system metadata in FileSystemFetcher (TIKA-4035).
+
+   * Allow pretty printing in FileSystemEmitter (TIKA-4034).
+
+   * Add detection for and a new mime type for older postscript-based
+ Adobe Illustrator "application/illustrator+ps" files (TIKA-3971).
+
+   * Add magic detection for canon raw file types: crw, cr2 and cr3 (TIKA-3991).
+
+   * Add detection for ONIX message files (TIKA-4011).
+
+   * Add detection and a parser for ActiveMime files (TIKA-3987).
+
+   * Add extraction of rendition layout value and version from Epub (TIKA-4013).
+
+   * Improve embedded file extraction from PDFs (TIKA-4012).
+
+   * Improve metadata extraction from WARCs (TIKA-4018).
+
+   * Update to PDFBox 2.0.28 (TIKA-4016).
+
+   * Users may now avoid the ZeroByteFileException via a
+ setting on the AutoDetectParserConfig (TIKA-3976).
+
+   * Fix bug in closing  elements in the presence of  elements
+ in RTF files (TIKA-3972).
+
+   * Improve extraction of embedded file names in .docx (TIKA-3968).
+
+   * Normalize author, title, subject and description to their Dublin Core
+ properties in the HTMLParser (TIKA-3963).
+
+
+Release 2.7.0 - 1/31/2023
+
+   * Add SVG detection for svg files that lack the xml header (TIKA-3308).
+
+   * Migrate to a live fork of Universal Charset Detector (TIKA-3213).
+
+   * Improve handling of text-based attachments inside .eml files (TIKA-3959).
+
+   * Add tika-parser-nlp-package to release artifacts (TIKA-3958).
+
+   * Remove need for  element in classes that extend ConfigBase (TIKA-3946).
+
+   * Add X-TIKA:embedded_id_path to ensure unique embedded file paths (TIKA-3942).
+
+   * Fix bug that prevented digests when the fallback/EmptyParser
+ was called (TIKA-3939).
+
+   * Remove log4j 1.2.x (and slf4j-log4j12 which now redirects to slf4j-reload4j) from
+ all modules (TIKA-3935).
+
+   * Upgrade mime4j to 0.8.9 (TIKA-3950).
+
+   * Refactor date parsing for emails (TIKA-3957)
+
+   * Upgrade to Bouncy Castle 1.71 and jdk18on jars (TIKA-3933).
+
+   * Add a JDBCPipesReporter (TIKA-3931).
+
+   * Add multivalued field strategy option in jdbc-emitter (TIKA-3930).
+ Default is now 'concatenate' with ', ' as the delimiter.
+
+   * Downgrade logging in PipesClient for each parse from info to debug.
+
+Release 2.6.0 - 11/3/2022
+
+   * Add optional Siegfried detector (TIKA-3901).
+
+   * Move OverrideDetector's functionality to the CompositeDetector (TIKA-3904).
+
+   * The FileCommandDetector has been refactored to have the same
+ behavior as the Siegfried detector; see setUseMime in the javadoc (TIKA-3902).
+
+   * Fix bug in OpenSearch emitter that 

svn commit: r64583 [3/3] - /dev/tika/2.9.1/

2023-10-17 Thread tallison
Added: dev/tika/2.9.1/tika-2.9.1-src.zip
==
Binary file - no diff available.

Propchange: dev/tika/2.9.1/tika-2.9.1-src.zip
--
svn:mime-type = application/octet-stream

Added: dev/tika/2.9.1/tika-2.9.1-src.zip.asc
==
--- dev/tika/2.9.1/tika-2.9.1-src.zip.asc (added)
+++ dev/tika/2.9.1/tika-2.9.1-src.zip.asc Tue Oct 17 11:11:43 2023
@@ -0,0 +1,16 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQIzBAABCgAdFiEEGERU+thpd2Dz4A0uSlGkW5RP/VEFAmUuaooACgkQSlGkW5RP
+/VHbHA//WOL7S9Om6xtkiL/vJ5dTzmKuepOUvSMWSQmd/gGoeD+OhI7DPWh1ZAnx
+mQFF9E/x33pzPChJS0EUUueeaNQFwQyuAclS/2arNXQfBEHpp00aEhDrDPvX5bPD
+fCdXzLCqtvDhhzj/Xr8P7xuToIytmk7b1usiXGPis8bVorFhK+V+61qRn0O1Pyr0
+SnC7Sdch+TIXT2GPLj5hee3PMUfWHLLVKCpy72YLhlEnAwfnalSc++M5ESOvIub4
+tIiydwe6DKVszcV4PvQxlykrnyNvQkiIwQYnThEzoBHUA2AOXQHgyS4Fgic0oJ4+
+qXiafOFVIDqz29JokwdPpTECCzjAAJshVHR4VgkHnmKluVTOmeKW1zfVjPb8osr1
+j1ykPTWWJ04zw1yBLcDyr6b6jzpL8i2pT+1FHhJg/P2a8MkPvdYz5b5dtLAoUt3+
+flZo5NYwA4pumkIc6h2/q0pXMROd48LyZx0igjFhntPgXy3qMeibeuMyK+od7yi5
+U0dLRN+PqzlqGw3YVhCLxxr5GvDKs7cfj7ivKvNpvpvJyYA0QwvsTIPtJvvhgOED
+pRdjLftvrDHE3Z0kFqv1YpT/qLAOzCttxGS1wuy8XWsRN6zj05WNa3oXpXnZ1yW8
+ohQwniA3bkDYCDuGzZI3YgyRAERuU9IBbxGZmAhsWs6GYp8fEeM=
+=BuLA
+-----END PGP SIGNATURE-----

Added: dev/tika/2.9.1/tika-2.9.1-src.zip.sha512
==
--- dev/tika/2.9.1/tika-2.9.1-src.zip.sha512 (added)
+++ dev/tika/2.9.1/tika-2.9.1-src.zip.sha512 Tue Oct 17 11:11:43 2023
@@ -0,0 +1 @@
+ba13a0d22994ca84cccd9ad2931e099051870d46a5a3440258f93bd63f6e3b03de51709c51cf0e4029e57ba9c44cdb243ac440d76e695dfc081dfd9d956d8777
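
The .sha512 files added in this commit carry a hex-encoded SHA-512 digest of the artifact next to them. As a rough illustration only (not part of the Tika release tooling), a reviewer could check a downloaded artifact against such a file along these lines. The class name Sha512Check and its two helper methods are hypothetical names chosen for this sketch; only the JDK's MessageDigest and Files APIs are used, and the sketch assumes the digest is the first whitespace-delimited token in the .sha512 file.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for this sketch; not part of Tika.
class Sha512Check {

    // Returns the lowercase hex SHA-512 digest of the given bytes.
    static String sha512Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-512");
        byte[] digest = md.digest(data);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Compares the computed digest with the first whitespace-delimited
    // token of the .sha512 file contents (assumed to be the hex digest).
    static boolean matches(byte[] data, String sha512FileContents)
            throws NoSuchAlgorithmException {
        String expected = sha512FileContents.trim().split("\\s+")[0];
        return sha512Hex(data).equalsIgnoreCase(expected);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.out.println("usage: java Sha512Check <artifact> <artifact.sha512>");
            return;
        }
        // Reads the whole artifact into memory; fine for a sketch,
        // a streaming digest would be kinder to large files.
        byte[] artifact = Files.readAllBytes(Paths.get(args[0]));
        String published = new String(
                Files.readAllBytes(Paths.get(args[1])), StandardCharsets.UTF_8);
        System.out.println(matches(artifact, published) ? "OK" : "MISMATCH");
    }
}
```

For example, running it with tika-2.9.1-src.zip and tika-2.9.1-src.zip.sha512 as the two arguments would print OK when the digests agree. A matching checksum only shows the download is intact; the .asc signatures are what tie the artifact to the release manager's key.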

Added: dev/tika/2.9.1/tika-app-2.9.1.jar
==
Binary file - no diff available.

Propchange: dev/tika/2.9.1/tika-app-2.9.1.jar
--
svn:mime-type = application/octet-stream

Added: dev/tika/2.9.1/tika-app-2.9.1.jar.asc
==
--- dev/tika/2.9.1/tika-app-2.9.1.jar.asc (added)
+++ dev/tika/2.9.1/tika-app-2.9.1.jar.asc Tue Oct 17 11:11:43 2023
@@ -0,0 +1,16 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQIzBAABCgAdFiEEGERU+thpd2Dz4A0uSlGkW5RP/VEFAmUuaCIACgkQSlGkW5RP
+/VGZAA/6A+DR4UTmMHz6moH3gU9dDWr8x/VfiXEfvNcdvZ9mfxEuNFgGn1WNWaGC
+IB5XMfcn+W2D27xYrPSik7YQNZWorIYAxl+Ip+eROGisRbrFLjd1VIdHppGE1Unp
+zEfHTnVV6ewY8fgOmmhEstnFHWZpNpOrU+ePXhb0JkI09kDXC8JZWh6A5jeub4Yc
+DDriuJCnP86RgGMbNcYyUUNNs01ziNpVkuuCcD7WpGp95bOuMMMQmnGC4fQZBTff
+B8XRVebnivk592Bhucqoz9rlYuXTmQnKlcFL3xkN6U9pkJR/ZBG+0wRjxPMK4gPc
+/YB+IH3AlBKhCEK+B5pJBXhunOpKb10AI8c6YJnHrZzWcK3sFtaoUIdtNI1Do8sw
+0XTtylGQtVsVeW2IE/mPYj5NpDpSjQxAAPkuX5MGFG3QFJ/4K7a78fR4CpoiMaKD
+YiaS0wfdrM8awiw4uHkx4HSW+Xpp0XoMe2SDwLmPjlB3My5SEGCPIgtQIvZDHzMv
+C7HhNf90THJqdkQElLc2VAGoi1xB9sBovzxwS7nFt6Ytxgh6UqIW9VhySnGKJXg8
+tWsrgHTfYEp9rzry6OtE0bwjL6ebskRRD3JDJmgY/tcZQNnTQ1nEUkmUUXtSc/Hv
+t+dz8AUeDsiAeEdHw3LT9HEenBRzjsxDM+P1ESqWvp/djVbNDQ0=
+=PgoF
+-----END PGP SIGNATURE-----

Added: dev/tika/2.9.1/tika-app-2.9.1.jar.sha512
==
--- dev/tika/2.9.1/tika-app-2.9.1.jar.sha512 (added)
+++ dev/tika/2.9.1/tika-app-2.9.1.jar.sha512 Tue Oct 17 11:11:43 2023
@@ -0,0 +1 @@
+843392a5d378098ae81e09da559a685c9582a390790da7e658e6f4c81d2d49c5285566875485ea2a0e8bd634e7e338ebac2e77f449972343c7f4a98d6179dec7

Added: dev/tika/2.9.1/tika-eval-app-2.9.1.jar
==
Binary file - no diff available.

Propchange: dev/tika/2.9.1/tika-eval-app-2.9.1.jar
--
svn:mime-type = application/octet-stream

Added: dev/tika/2.9.1/tika-eval-app-2.9.1.jar.asc
==
--- dev/tika/2.9.1/tika-eval-app-2.9.1.jar.asc (added)
+++ dev/tika/2.9.1/tika-eval-app-2.9.1.jar.asc Tue Oct 17 11:11:43 2023
@@ -0,0 +1,16 @@
+-----BEGIN PGP SIGNATURE-----
+
+iQIzBAABCgAdFiEEGERU+thpd2Dz4A0uSlGkW5RP/VEFAmUuajwACgkQSlGkW5RP
+/VEm7A/+LNOz50cvovRqK5SF1wB7dRCeHvX0LA+A76InEnN8NB5TQOICatZv+ERs
+WPJU1n+Y7CFzbZdXYqLvCSQojsiKBmP1iC5D955OLWLwPaCAHpULBb2sHDYZnC2Z
+Hk4szsi9fBjLrX4lj3rMcPBwayq6mJV4pg3pxAQZeIenSrTjZ5rMr48v+UN2hsSs
+hcSc7OAtsKb/xi6chz3EyJyrzrtaR3e2hIZJz2Kp44gkotj8ERU467G41TwV9mVT
+A8smodplwqUCElahFznSjjdEYYUiWavBQXtAw6bxEMp2xtDaFS5vJLMmY4d9cwtq
+b3YVOnj0X8ZOyyqDOtZN2p/+cs+9aIwypVXa6YFYtgIopoUuo8Uzl/rg74tsx76d
+nrbFfw2MOj9ZW3y8a7nzWJnBQvCmFs53GJIxWiw1Ouvb1rEhxcJvAph5zrEHNRDk
+Pe7KV0FAQBCGsWHfxwoXAlKvvQnZDNmCstZKf+5SluZoQ7ntQ6KylyWQ0215uUL9
+K5GaY4+iYQlHdnwOfpV/ODqIJSgO+tZ1rBNar1z958gPaIN3nR+wcR4HSWJK/9sW

svn commit: r64583 [1/3] - /dev/tika/2.9.1/

2023-10-17 Thread tallison
Author: tallison
Date: Tue Oct 17 11:11:43 2023
New Revision: 64583

Log:
Add Tika 2.9.1 RC#1 artifacts

Added:
dev/tika/2.9.1/
dev/tika/2.9.1/CHANGES-2.9.1.txt
dev/tika/2.9.1/tika-2.9.1-src.zip   (with props)
dev/tika/2.9.1/tika-2.9.1-src.zip.asc
dev/tika/2.9.1/tika-2.9.1-src.zip.sha512
dev/tika/2.9.1/tika-app-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-app-2.9.1.jar.asc
dev/tika/2.9.1/tika-app-2.9.1.jar.sha512
dev/tika/2.9.1/tika-eval-app-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-eval-app-2.9.1.jar.asc
dev/tika/2.9.1/tika-eval-app-2.9.1.jar.sha512
dev/tika/2.9.1/tika-parser-nlp-package-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-parser-nlp-package-2.9.1.jar.asc
dev/tika/2.9.1/tika-parser-nlp-package-2.9.1.jar.sha512
dev/tika/2.9.1/tika-parser-scientific-package-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-parser-scientific-package-2.9.1.jar.asc
dev/tika/2.9.1/tika-parser-scientific-package-2.9.1.jar.sha512
dev/tika/2.9.1/tika-parser-sqlite3-package-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-parser-sqlite3-package-2.9.1.jar.asc
dev/tika/2.9.1/tika-parser-sqlite3-package-2.9.1.jar.sha512
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.tgz   (with props)
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.tgz.asc
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.tgz.sha512
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.zip   (with props)
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.zip.asc
dev/tika/2.9.1/tika-server-standard-2.9.1-bin.zip.sha512
dev/tika/2.9.1/tika-server-standard-2.9.1.jar   (with props)
dev/tika/2.9.1/tika-server-standard-2.9.1.jar.asc
dev/tika/2.9.1/tika-server-standard-2.9.1.jar.sha512



[tika] annotated tag 2.9.1-rc1 updated (3b4365064 -> 60ecc6336)

2023-10-17 Thread tallison

tallison pushed a change to annotated tag 2.9.1-rc1
in repository https://gitbox.apache.org/repos/asf/tika.git


*** WARNING: tag 2.9.1-rc1 was modified! ***

from 3b4365064 (commit)
  to 60ecc6336 (tag)
 tagging 3b4365064ea9354c4dd0a2fc0b6f5348f1ae43e1 (commit)
 replaces 2.9.0
  by tallison
  on Tue Oct 17 06:30:29 2023 -0400

- Log -
[maven-release-plugin] copy for tag 2.9.1-rc1
---


No new revisions were added by this update.

Summary of changes:



[tika] branch branch_2x updated: [maven-release-plugin] prepare for next development iteration

2023-10-17 Thread tallison

tallison pushed a commit to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git


The following commit(s) were added to refs/heads/branch_2x by this push:
 new f24a11c54 [maven-release-plugin] prepare for next development iteration
f24a11c54 is described below

commit f24a11c54fb14a2273bbb4ff36ee4fa55d6c7806
Author: tallison 
AuthorDate: Tue Oct 17 06:30:31 2023 -0400

[maven-release-plugin] prepare for next development iteration
---
 pom.xml|   4 +-
 tika-app/pom.xml   |   4 +-
 tika-batch/pom.xml |   4 +-
 tika-bom/pom.xml   | 146 ++---
 tika-bundles/pom.xml   |   6 +-
 tika-bundles/tika-bundle-standard/pom.xml  |   6 +-
 tika-core/pom.xml  |   4 +-
 tika-detectors/pom.xml |   2 +-
 tika-detectors/tika-detector-siegfried/pom.xml |   2 +-
 tika-eval/pom.xml  |   4 +-
 tika-eval/tika-eval-app/pom.xml|   4 +-
 tika-eval/tika-eval-core/pom.xml   |   4 +-
 tika-example/pom.xml   |   4 +-
 tika-fuzzing/pom.xml   |   2 +-
 tika-integration-tests/pom.xml |   4 +-
 .../tika-pipes-kafka-integration-tests/pom.xml |   4 +-
 .../pom.xml|   4 +-
 .../tika-pipes-s3-integration-tests/pom.xml|   4 +-
 .../tika-pipes-solr-integration-tests/pom.xml  |   4 +-
 .../tika-resource-loading-tests/pom.xml|   2 +-
 tika-java7/pom.xml |   4 +-
 tika-langdetect/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-lingo24/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-mitll-text/pom.xml |   4 +-
 tika-langdetect/tika-langdetect-opennlp/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-optimaize/pom.xml  |   4 +-
 .../tika-langdetect-test-commons/pom.xml   |   4 +-
 tika-langdetect/tika-langdetect-tika/pom.xml   |   4 +-
 tika-parent/pom.xml|   6 +-
 tika-parsers/pom.xml   |   4 +-
 tika-parsers/tika-parsers-extended/pom.xml |   4 +-
 .../tika-parser-scientific-module/pom.xml  |   4 +-
 .../tika-parser-scientific-package/pom.xml |   4 +-
 .../tika-parser-sqlite3-module/pom.xml |   4 +-
 .../tika-parser-sqlite3-package/pom.xml|   4 +-
 .../pom.xml|   4 +-
 tika-parsers/tika-parsers-ml/pom.xml   |   4 +-
 .../tika-parsers-ml/tika-age-recogniser/pom.xml|   4 +-
 tika-parsers/tika-parsers-ml/tika-dl/pom.xml   |   4 +-
 .../tika-parser-advancedmedia-module/pom.xml   |   4 +-
 .../tika-parser-advancedmedia-package/pom.xml  |   4 +-
 .../tika-parsers-ml/tika-parser-nlp-module/pom.xml |   4 +-
 .../tika-parser-nlp-package/pom.xml|   4 +-
 .../tika-parsers-ml/tika-transcribe-aws/pom.xml|   4 +-
 tika-parsers/tika-parsers-standard/pom.xml |   4 +-
 .../tika-parsers-standard-modules/pom.xml  |   4 +-
 .../tika-parser-apple-module/pom.xml   |   4 +-
 .../tika-parser-audiovideo-module/pom.xml  |   4 +-
 .../tika-parser-cad-module/pom.xml |   4 +-
 .../tika-parser-code-module/pom.xml|   4 +-
 .../tika-parser-crypto-module/pom.xml  |   4 +-
 .../tika-parser-digest-commons/pom.xml |   4 +-
 .../tika-parser-font-module/pom.xml|   4 +-
 .../tika-parser-html-commons/pom.xml   |   4 +-
 .../tika-parser-html-module/pom.xml|   4 +-
 .../tika-parser-image-module/pom.xml   |   4 +-
 .../tika-parser-jdbc-commons/pom.xml   |   4 +-
 .../tika-parser-mail-commons/pom.xml   |   4 +-
 .../tika-parser-mail-module/pom.xml|   4 +-
 .../tika-parser-microsoft-module/pom.xml   |   4 +-
 .../tika-parser-miscoffice-module/pom.xml  |   4 +-
 .../tika-parser-news-module/pom.xml|   4 +-
 .../tika-parser-ocr-module/pom.xml |   4 +-
 .../tika-parser-pdf-module/pom.xml |   4 +-
 .../tika-parser-pkg-module/pom.xml |   4 +-
 .../tika-parser-text-module/pom.xml|   4 +-
 .../tika-parser-webarchive-module/pom.xml  |   4 +-
 .../tika-parser-xml-module/pom.xml |   4 +-
 .../tika-parser-xmp-commons/pom.xml|   4 +-
 .../tika-parser-zip-commons/pom.xml|   4 +-
 .../tika-parsers-standard-package/pom.xml  |   4 +-
 tika-pipes/pom.xml |   4 +-
 tika-pipes/tika-async-cli/pom.xml   

[tika] branch branch_2x updated (83b8de0ea -> 3b4365064)

2023-10-17 Thread tallison

tallison pushed a change to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git


from 83b8de0ea update CHANGES.txt for 2.9.1 release process
 add 3b4365064 [maven-release-plugin] prepare release 2.9.1-rc1

No new revisions were added by this update.

Summary of changes:
 pom.xml|   4 +-
 tika-app/pom.xml   |   4 +-
 tika-batch/pom.xml |   4 +-
 tika-bom/pom.xml   | 146 ++---
 tika-bundles/pom.xml   |   6 +-
 tika-bundles/tika-bundle-standard/pom.xml  |   6 +-
 tika-core/pom.xml  |   4 +-
 tika-detectors/pom.xml |   2 +-
 tika-detectors/tika-detector-siegfried/pom.xml |   2 +-
 tika-eval/pom.xml  |   4 +-
 tika-eval/tika-eval-app/pom.xml|   4 +-
 tika-eval/tika-eval-core/pom.xml   |   4 +-
 tika-example/pom.xml   |   4 +-
 tika-fuzzing/pom.xml   |   2 +-
 tika-integration-tests/pom.xml |   4 +-
 .../tika-pipes-kafka-integration-tests/pom.xml |   4 +-
 .../pom.xml|   4 +-
 .../tika-pipes-s3-integration-tests/pom.xml|   4 +-
 .../tika-pipes-solr-integration-tests/pom.xml  |   4 +-
 .../tika-resource-loading-tests/pom.xml|   2 +-
 tika-java7/pom.xml |   4 +-
 tika-langdetect/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-lingo24/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-mitll-text/pom.xml |   4 +-
 tika-langdetect/tika-langdetect-opennlp/pom.xml|   4 +-
 tika-langdetect/tika-langdetect-optimaize/pom.xml  |   4 +-
 .../tika-langdetect-test-commons/pom.xml   |   4 +-
 tika-langdetect/tika-langdetect-tika/pom.xml   |   4 +-
 tika-parent/pom.xml|   6 +-
 tika-parsers/pom.xml   |   4 +-
 tika-parsers/tika-parsers-extended/pom.xml |   4 +-
 .../tika-parser-scientific-module/pom.xml  |   4 +-
 .../tika-parser-scientific-package/pom.xml |   4 +-
 .../tika-parser-sqlite3-module/pom.xml |   4 +-
 .../tika-parser-sqlite3-package/pom.xml|   4 +-
 .../pom.xml|   4 +-
 tika-parsers/tika-parsers-ml/pom.xml   |   4 +-
 .../tika-parsers-ml/tika-age-recogniser/pom.xml|   4 +-
 tika-parsers/tika-parsers-ml/tika-dl/pom.xml   |   4 +-
 .../tika-parser-advancedmedia-module/pom.xml   |   4 +-
 .../tika-parser-advancedmedia-package/pom.xml  |   4 +-
 .../tika-parsers-ml/tika-parser-nlp-module/pom.xml |   4 +-
 .../tika-parser-nlp-package/pom.xml|   4 +-
 .../tika-parsers-ml/tika-transcribe-aws/pom.xml|   4 +-
 tika-parsers/tika-parsers-standard/pom.xml |   4 +-
 .../tika-parsers-standard-modules/pom.xml  |   4 +-
 .../tika-parser-apple-module/pom.xml   |   4 +-
 .../tika-parser-audiovideo-module/pom.xml  |   4 +-
 .../tika-parser-cad-module/pom.xml |   4 +-
 .../tika-parser-code-module/pom.xml|   4 +-
 .../tika-parser-crypto-module/pom.xml  |   4 +-
 .../tika-parser-digest-commons/pom.xml |   4 +-
 .../tika-parser-font-module/pom.xml|   4 +-
 .../tika-parser-html-commons/pom.xml   |   4 +-
 .../tika-parser-html-module/pom.xml|   4 +-
 .../tika-parser-image-module/pom.xml   |   4 +-
 .../tika-parser-jdbc-commons/pom.xml   |   4 +-
 .../tika-parser-mail-commons/pom.xml   |   4 +-
 .../tika-parser-mail-module/pom.xml|   4 +-
 .../tika-parser-microsoft-module/pom.xml   |   4 +-
 .../tika-parser-miscoffice-module/pom.xml  |   4 +-
 .../tika-parser-news-module/pom.xml|   4 +-
 .../tika-parser-ocr-module/pom.xml |   4 +-
 .../tika-parser-pdf-module/pom.xml |   4 +-
 .../tika-parser-pkg-module/pom.xml |   4 +-
 .../tika-parser-text-module/pom.xml|   4 +-
 .../tika-parser-webarchive-module/pom.xml  |   4 +-
 .../tika-parser-xml-module/pom.xml |   4 +-
 .../tika-parser-xmp-commons/pom.xml|   4 +-
 .../tika-parser-zip-commons/pom.xml|   4 +-
 .../tika-parsers-standard-package/pom.xml  |   4 +-
 tika-pipes/pom.xml |   4 +-
 tika-pipes/tika-async-cli/pom.xml  |   4 +-
 tika-pipes/tika-emitters/pom.xml   |   4 +-
 .../tika-emitters/tika-emitter-az-blob/pom.xml |   4 +-
 

[tika] branch branch_2x updated (c3b2bd735 -> 83b8de0ea)

2023-10-17 Thread tallison

tallison pushed a change to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git


from c3b2bd735 TIKA-4155 -- disable tests that cause failures in github actions. :(
 new aaf7b5f83 revert POI in prep for 2.9.1 release process
 new 83b8de0ea update CHANGES.txt for 2.9.1 release process

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 CHANGES.txt  | 9 -
 tika-parent/pom.xml  | 2 +-
 .../java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java | 3 ++-
 3 files changed, 11 insertions(+), 3 deletions(-)



[tika] 02/02: update CHANGES.txt for 2.9.1 release process

2023-10-17 Thread tallison

tallison pushed a commit to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git

commit 83b8de0eac0649f4728ddae548fb30d8d7f8219e
Author: tallison 
AuthorDate: Tue Oct 17 06:11:10 2023 -0400

update CHANGES.txt for 2.9.1 release process
---
 CHANGES.txt | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/CHANGES.txt b/CHANGES.txt
index 9d6a3b3ad..4367faa6a 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,11 @@
-Release 2.9.1 - ??
+Release 2.9.1 - 10/17/2023
+
+   * Dependency upgrades including commons-compress to fix CVE-2023-42503.
+
+   * Improve RFC822 detection (TIKA-4153).
+
+   * Enable configuration of "maxJsonStringFieldLength" in TikaConfig to allow users to
+ avoid DEFAULT_MAX_STRING_LEN exceptions from Jackson (TIKA-4154).
 
* Fix bug in DateUtils that stripped timezone information from
  incoming Calendar objects (TIKA-4126).



[tika] 01/02: revert POI in prep for 2.9.1 release process

2023-10-17 Thread tallison

tallison pushed a commit to branch branch_2x
in repository https://gitbox.apache.org/repos/asf/tika.git

commit aaf7b5f83804c01776ea878348dd7b23e01d1b93
Author: tallison 
AuthorDate: Tue Oct 17 06:05:48 2023 -0400

revert POI in prep for 2.9.1 release process
---
 tika-parent/pom.xml| 2 +-
 .../main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java  | 3 ++-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/tika-parent/pom.xml b/tika-parent/pom.xml
index d1899c7ee..de65a1a99 100644
--- a/tika-parent/pom.xml
+++ b/tika-parent/pom.xml
@@ -383,7 +383,7 @@
 4.13.5
 2.0.29
 
-5.2.4
+5.2.3
 2.3.2
 1.1.12
 2.1.0
diff --git a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java b/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java
index f365dc94f..e267debd8 100644
--- a/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java
+++ b/tika-parsers/tika-parsers-standard/tika-parsers-standard-modules/tika-parser-microsoft-module/src/main/java/org/apache/tika/parser/microsoft/ooxml/OOXMLParser.java
@@ -102,7 +102,8 @@ public class OOXMLParser extends AbstractOfficeParser {
 //turn off POI's zip bomb detection because we have our own
 ZipSecureFile.setMinInflateRatio(-1.0d);
 //bump this to a higher value than POI's default of 1000
-ZipSecureFile.setMaxFileCount(1);
+//turn this back on with > 5.2.3
+//ZipSecureFile.setMaxFileCount(1);
 }
 
 public Set<MediaType> getSupportedTypes(ParseContext context) {



[tika] branch dependabot/maven/aws.version-1.12.567 deleted (was 58fb978c2)

2023-10-17 Thread github-bot

github-bot pushed a change to branch dependabot/maven/aws.version-1.12.567
in repository https://gitbox.apache.org/repos/asf/tika.git


 was 58fb978c2 Bump aws.version from 1.12.566 to 1.12.567

The revisions that were on this branch are still contained in
other references; therefore, this change does not discard any commits
from the repository.



[tika] branch main updated (7825b59cb -> ce5ca0ffe)

2023-10-17 Thread tilman

tilman pushed a change to branch main
in repository https://gitbox.apache.org/repos/asf/tika.git


from 7825b59cb TIKA-4153 -- revert changes to robots.txt detection and add unit test for robots file starting with comments
 add 58fb978c2 Bump aws.version from 1.12.566 to 1.12.567
 add ce5ca0ffe Merge pull request #1408 from apache/dependabot/maven/aws.version-1.12.567

No new revisions were added by this update.

Summary of changes:
 tika-parent/pom.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)