dev
Messages by Thread
FYI dead-end with TikaPipes and Docker
Mikhail Khludnev
[PR] TIKA-4671-lang-aware-charset-detection [tika]
via GitHub
[jira] [Commented] (TIKA-4671) Use langid to adjudicate charset detector disagreements
Hudson (Jira)
[jira] [Commented] (TIKA-4671) Use langid to adjudicate charset detector disagreements
ASF GitHub Bot (Jira)
[jira] [Created] (TIKA-4671) Use langid to adjudicate charset detector disagreements
Tim Allison (Jira)
[PR] fix flaky windows timeouts on ci/cd [tika]
via GitHub
Re: [PR] fix flaky windows timeouts on ci/cd [tika]
via GitHub
[PR] TIKA-4663 -- add cli option for markdown in 3.x [tika]
via GitHub
Re: [PR] TIKA-4663 -- add cli option for markdown in 3.x [tika]
via GitHub
[jira] [Commented] (TIKA-4670) Improve early crash detection in pipesserver in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4670) Improve early crash detection in pipesserver in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4670) Improve early crash detection in pipesserver in 4.x
Hudson (Jira)
[PR] TIKA-4670 -- improve exit handling btwn pipesclient and pipesserver [tika]
via GitHub
Re: [PR] TIKA-4670 -- improve exit handling btwn pipesclient and pipesserver [tika]
via GitHub
[jira] [Created] (TIKA-4670) Improve early crash detection in pipesserver in 4.x
Tim Allison (Jira)
[jira] [Commented] (TIKA-4669) Improve runtime serialization updates in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4669) Improve runtime serialization updates in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4669) Improve runtime serialization updates in 4.x
Hudson (Jira)
[PR] TIKA-4669 -- improve serdes [tika]
via GitHub
Re: [PR] TIKA-4669 -- improve serdes [tika]
via GitHub
[jira] [Created] (TIKA-4669) Improve runtime serialization updates in 4.x
Tim Allison (Jira)
[jira] [Commented] (TIKA-4668) Simplify version with maven $revision in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4668) Simplify version with maven $revision in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4668) Simplify version with maven $revision in 4.x
Hudson (Jira)
[PR] TIKA-4668 -- modernize versioning with $revision [tika]
via GitHub
Re: [PR] TIKA-4668 -- modernize versioning with $revision [tika]
via GitHub
[jira] [Created] (TIKA-4668) Simplify version with maven $revision in 4.x
Tim Allison (Jira)
[jira] [Resolved] (TIKA-4661) Automate tika-helm release on Tika version update
Lewis John McGibbney (Jira)
[jira] [Commented] (TIKA-4667) Add tess4j wrapper in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4667) Add tess4j wrapper in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4667) Add tess4j wrapper in 4.x
Hudson (Jira)
[PR] TIKA-4667 - add Tess4J in-process OCR parser and docs [tika]
via GitHub
Re: [PR] TIKA-4667 - add Tess4J in-process OCR parser and docs [tika]
via GitHub
[jira] [Commented] (TIKA-4666) Add VLM/modern OCR options parsers in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4666) Add VLM/modern OCR options parsers in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4666) Add VLM/modern OCR options parsers in 4.x
Hudson (Jira)
[PR] TIKA-4665-inference-module [tika]
via GitHub
Re: [PR] TIKA-4665-inference-module [tika]
via GitHub
[PR] TIKA-4666 - add VLM parsers (Claude, Gemini, OpenAI) [tika]
via GitHub
Re: [PR] TIKA-4666 - add VLM parsers (Claude, Gemini, OpenAI) [tika]
via GitHub
[jira] [Commented] (TIKA-4665) Add chunking and inference handling poc in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4665) Add chunking and inference handling poc in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4665) Add chunking and inference handling poc in 4.x
Hudson (Jira)
[jira] [Commented] (TIKA-4664) Add poppler renderer and remove mutool rendering wrapper in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4664) Add poppler renderer and remove mutool rendering wrapper in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4664) Add poppler renderer and remove mutool rendering wrapper in 4.x
Hudson (Jira)
[PR] TIKA-4664 - add Poppler renderer, replace MuPDF, add OCR safety limits [tika]
via GitHub
Re: [PR] TIKA-4664 - add Poppler renderer, replace MuPDF, add OCR safety limits [tika]
via GitHub
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
Hudson (Jira)
[jira] [Commented] (TIKA-4663) Switch default handler type to markdown in 4.x
Hudson (Jira)
[PR] TIKA-4663 - add content handler type metadata and switch default to markdown [tika]
via GitHub
Re: [PR] TIKA-4663 - add content handler type metadata and switch default to markdown [tika]
via GitHub
[jira] [Resolved] (TIKA-4630) embeddedRelationshipId is missing from tar files that are children of gzip files (i.e. tarballs)
Tim Allison (Jira)
[jira] [Updated] (TIKA-4665) Add chunking and inference handling poc in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4667) Add tess4j wrapper in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4666) Add VLM/modern OCR options parsers in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4665) Add chunking and inference handling poc in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4664) Add poppler renderer and remove mutool rendering wrapper in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4663) Switch default handler type to markdown in 4.x
Tim Allison (Jira)
[jira] [Commented] (TIKA-4662) Modernize lang-detector for at least 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4662) Modernize lang-detector for at least 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4662) Modernize lang-detector for at least 4.x
Hudson (Jira)
[PR] TIKA-4662 -- update language detection [tika]
via GitHub
Re: [PR] TIKA-4662 -- update language detection [tika]
via GitHub
[jira] [Created] (TIKA-4662) Modernize lang-detector for at least 4.x
Tim Allison (Jira)
[PR] Bump software.amazon.awssdk:bom from 2.41.28 to 2.41.29 [tika]
via GitHub
Re: [PR] Bump software.amazon.awssdk:bom from 2.41.28 to 2.41.29 [tika]
via GitHub
[PR] Bump org.springframework:spring-context from 7.0.3 to 7.0.4 [tika]
via GitHub
Re: [PR] Bump org.springframework:spring-context from 7.0.3 to 7.0.4 [tika]
via GitHub
[PR] Bump org.xerial:sqlite-jdbc from 3.51.1.0 to 3.51.2.0 [tika]
via GitHub
Re: [PR] Bump org.xerial:sqlite-jdbc from 3.51.1.0 to 3.51.2.0 [tika]
via GitHub
[jira] [Commented] (TIKA-4661) Automate tika-helm release on Tika version update
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4661) Automate tika-helm release on Tika version update
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4661) Automate tika-helm release on Tika version update
ASF GitHub Bot (Jira)
[jira] [Updated] (TIKA-4661) Automate tika-helm release on Tika version update
Lewis John McGibbney (Jira)
[PR] TIKA-4661 Automate tika-helm release on Tika version update [tika-helm]
via GitHub
Re: [PR] TIKA-4661 Automate tika-helm release on Tika version update [tika-helm]
via GitHub
Re: [PR] TIKA-4661 Automate tika-helm release on Tika version update [tika-helm]
via GitHub
[jira] [Created] (TIKA-4661) Automate tika-helm release on Tika version update
Lewis John McGibbney (Jira)
[jira] [Resolved] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
Lewis John McGibbney (Jira)
[PR] Bump Apache Tika Docker image to 3.2.3.0-full [tika-helm]
via GitHub
Re: [PR] Bump Apache Tika Docker image to 3.2.3.0-full [tika-helm]
via GitHub
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
Lewis John McGibbney (Jira)
[PR] TIKA-4660 Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions [tika-helm]
via GitHub
Re: [PR] TIKA-4660 Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions [tika-helm]
via GitHub
Re: [PR] TIKA-4660 Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions [tika-helm]
via GitHub
[PR] TIKA-4660 Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions [tika-helm]
via GitHub
Re: [PR] TIKA-4660 Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions [tika-helm]
via GitHub
[jira] [Updated] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Updated] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Created] (TIKA-4660) Add automated Tika Docker image version bump workflow and upgrade all GitHub Actions
Lewis John McGibbney (Jira)
[jira] [Resolved] (TIKA-4657) Endnote content in tables omitted from .docx text
Tim Allison (Jira)
[PR] remove bean setters/getters on parsers and detectors; other fixes to work when tesseract is installed [tika]
via GitHub
Re: [PR] remove bean setters/getters on parsers and detectors; other fixes to work when tesseract is installed [tika]
via GitHub
[PR] TIKA-4657 -- improve extraction from footnote/endnotes in xwpf [tika]
via GitHub
Re: [PR] TIKA-4657 -- improve extraction from footnote/endnotes in xwpf [tika]
via GitHub
[jira] [Commented] (TIKA-4659) Add tika-eval-lite for embedded junk detection
ASF GitHub Bot (Jira)
[PR] TIKA-4659 -- tika-eval-lite [tika]
via GitHub
[jira] [Created] (TIKA-4659) Add tika-eval-lite for embedded junk detection
Tim Allison (Jira)
[jira] [Commented] (TIKA-4657) Endnote content in tables omitted from .docx text
Tilman Hausherr (Jira)
[jira] [Commented] (TIKA-4657) Endnote content in tables omitted from .docx text
Tim Allison (Jira)
[jira] [Commented] (TIKA-4657) Endnote content in tables omitted from .docx text
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4657) Endnote content in tables omitted from .docx text
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4657) Endnote content in tables omitted from .docx text
Tim Allison (Jira)
[jira] [Updated] (TIKA-4657) Endnote content in tables omitted from .docx text
Tilman Hausherr (Jira)
[jira] [Updated] (TIKA-4658) Clean up parser configuration in 4.x
Tim Allison (Jira)
[jira] [Updated] (TIKA-4658) Clean up parser configuration in 4.x
Tim Allison (Jira)
[jira] [Updated] (TIKA-4658) Clean up parser configuration in 4.x
Tim Allison (Jira)
[jira] [Updated] (TIKA-4658) Clean up parser and detector configuration in 4.x
Tim Allison (Jira)
[jira] [Created] (TIKA-4658) Clean up tesseract configuration in 4.x
Tim Allison (Jira)
[PR] serialization tweaks [tika]
via GitHub
Re: [PR] serialization tweaks [tika]
via GitHub
[jira] [Created] (TIKA-4657) Endnote content in tables omitted from .docx text
Klara Mazurak (Jira)
[jira] [Comment Edited] (TIKA-4654) Experiment with docstrum for clustering TextPositions for PDFs
Tim Allison (Jira)
[jira] [Commented] (TIKA-4654) Experiment with docstrum for clustering TextPositions for PDFs
Tim Allison (Jira)
[PR] TIKA-4653 -- fix up extra whitespace [tika]
via GitHub
Re: [PR] TIKA-4653 -- fix up extra whitespace [tika]
via GitHub
[jira] [Resolved] (TIKA-4655) Fix application/x-grib mimetype determination
Tim Allison (Jira)
JDK 26 Release Candidate | JavaOne 2026 and More
David Delabassee via dev
[PR] TIKA-4655 - fix grib detection [tika]
via GitHub
Re: [PR] TIKA-4655 - fix grib detection [tika]
via GitHub
[jira] [Commented] (TIKA-4655) Fix application/x-grib mimetype determination
Tim Allison (Jira)
[jira] [Commented] (TIKA-4655) Fix application/x-grib mimetype determination
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4655) Fix application/x-grib mimetype determination
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4655) Fix application/x-grib mimetype determination
Hudson (Jira)
[jira] [Commented] (TIKA-4656) Allow text-only option alongside rmeta and concatenated in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4656) Allow text-only option alongside rmeta and concatenated in 4.x
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4656) Allow text-only option alongside rmeta and concatenated in 4.x
Hudson (Jira)
[PR] TIKA-4656-allow-content-only [tika]
via GitHub
Re: [PR] TIKA-4656-allow-content-only [tika]
via GitHub
[jira] [Created] (TIKA-4656) Allow text-only option alongside rmeta and concatenated in 4.x
Tim Allison (Jira)
[jira] [Updated] (TIKA-4655) Fix application/x-grib mimetype determination
Laura Delmaestro (Jira)
[jira] [Updated] (TIKA-4655) Fix application/x-grib mimetype determination
Laura Delmaestro (Jira)
[jira] [Updated] (TIKA-4655) Fix application/x-grib mimetype determination
Laura Delmaestro (Jira)
[jira] [Created] (TIKA-4655) Fix application/x-grib mimetype determination
Laura Delmaestro (Jira)
[jira] [Resolved] (TIKA-4653) Add markdown contenthandler
Tim Allison (Jira)
[jira] [Created] (TIKA-4654) Experiment with docstrum for clustering TextPositions for PDFs
Tim Allison (Jira)
[PR] TIKA-4653 - markdown for 3x [tika]
via GitHub
Re: [PR] TIKA-4653 - markdown for 3x [tika]
via GitHub
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
Hudson (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
Hudson (Jira)
[jira] [Commented] (TIKA-4653) Add markdown contenthandler
ASF GitHub Bot (Jira)
[PR] TIKA-4653-markdown-handler [tika]
via GitHub
Re: [PR] TIKA-4653-markdown-handler [tika]
via GitHub
Re: [PR] TIKA-4653-markdown-handler [tika]
via GitHub
[PR] Bump org.ops4j.pax.url:pax-url-aether from 3.0.1 to 3.0.2 [tika]
via GitHub
Re: [PR] Bump org.ops4j.pax.url:pax-url-aether from 3.0.1 to 3.0.2 [tika]
via GitHub
[PR] Bump software.amazon.awssdk:bom from 2.41.23 to 2.41.24 [tika]
via GitHub
Re: [PR] Bump software.amazon.awssdk:bom from 2.41.23 to 2.41.24 [tika]
via GitHub
[PR] Bump com.github.luben:zstd-jni from 1.5.7-6 to 1.5.7-7 [tika]
via GitHub
Re: [PR] Bump com.github.luben:zstd-jni from 1.5.7-6 to 1.5.7-7 [tika]
via GitHub
[PR] Bump org.apache.maven.plugins:maven-dependency-plugin from 3.9.0 to 3.10.0 [tika]
via GitHub
Re: [PR] Bump org.apache.maven.plugins:maven-dependency-plugin from 3.9.0 to 3.10.0 [tika]
via GitHub
[PR] Bump org.codehaus.mojo:animal-sniffer-annotations from 1.26 to 1.27 [tika]
via GitHub
Re: [PR] Bump org.codehaus.mojo:animal-sniffer-annotations from 1.26 to 1.27 [tika]
via GitHub
[PR] Bump com.google.errorprone:error_prone_annotations from 2.46.0 to 2.47.0 [tika]
via GitHub
Re: [PR] Bump com.google.errorprone:error_prone_annotations from 2.46.0 to 2.47.0 [tika]
via GitHub
[PR] Bump com.puppycrawl.tools:checkstyle from 12.3.1 to 13.2.0 [tika]
via GitHub
Re: [PR] Bump com.puppycrawl.tools:checkstyle from 12.3.1 to 13.2.0 [tika]
via GitHub
Re: [PR] Bump com.puppycrawl.tools:checkstyle from 12.3.1 to 13.2.0 [tika]
via GitHub
[PR] Bump org.jetbrains.kotlin:kotlin-stdlib from 2.3.0 to 2.3.10 [tika]
via GitHub
Re: [PR] Bump org.jetbrains.kotlin:kotlin-stdlib from 2.3.0 to 2.3.10 [tika]
via GitHub
[PR] Bump com.microsoft.graph:microsoft-graph from 6.60.0 to 6.61.0 [tika]
via GitHub
Re: [PR] Bump com.microsoft.graph:microsoft-graph from 6.60.0 to 6.61.0 [tika]
via GitHub
[PR] Bump io.netty:netty-bom from 4.2.9.Final to 4.2.10.Final [tika]
via GitHub
Re: [PR] Bump io.netty:netty-bom from 4.2.9.Final to 4.2.10.Final [tika]
via GitHub
[jira] [Created] (TIKA-4653) Add markdown contenthandler
Tim Allison (Jira)
[jira] [Resolved] (TIKA-4650) Improve zip parsing in 4.x
Tim Allison (Jira)
[jira] [Resolved] (TIKA-4651) Use pipes fork parser where possible
Tim Allison (Jira)
[jira] [Resolved] (TIKA-4652) Add yolo option for tika-pipes
Tim Allison (Jira)
[jira] [Commented] (TIKA-4652) Add yolo option for tika-pipes
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4652) Add yolo option for tika-pipes
ASF GitHub Bot (Jira)
[PR] TIKA-4652 -- add a yolo option to tika-pipes to restore legacy crashing behavior but with the safety of pipes [tika]
via GitHub
Re: [PR] TIKA-4652 -- add a yolo option to tika-pipes to restore legacy crashability but with the safety of pipes [tika]
via GitHub
[jira] [Created] (TIKA-4652) Add yolo option for tika-pipes
Tim Allison (Jira)
[jira] [Created] (TIKA-4651) Use pipes fork parser where possible
Tim Allison (Jira)
[jira] [Commented] (TIKA-4651) Use pipes fork parser where possible
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4651) Use pipes fork parser where possible
ASF GitHub Bot (Jira)
[jira] [Commented] (TIKA-4651) Use pipes fork parser where possible
Hudson (Jira)
[PR] TIKA-4651 -- refactor cli to use pipes for parser [tika]
via GitHub
Re: [PR] TIKA-4651 -- refactor cli to use pipes for parser [tika]
via GitHub
[PR] TIKA-4650 - improvements for 3.x [tika]
via GitHub
Re: [PR] TIKA-4650 - improvements for 3.x [tika]
via GitHub
[PR] TIKA-4650-refactor-zip-parser [tika]
via GitHub
Re: [PR] TIKA-4650-refactor-zip-parser [tika]
via GitHub