[ 
https://issues.apache.org/jira/browse/TIKA-4666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18059210#comment-18059210
 ] 

Hudson commented on TIKA-4666:
------------------------------

SUCCESS: Integrated in Jenkins build Tika » tika-main-jdk17 #1212 (See 
[https://ci-builds.apache.org/job/Tika/job/tika-main-jdk17/1212/])
TIKA-4666 - add VLM parsers (Claude, Gemini, OpenAI) (#2614) (github: 
[https://github.com/apache/tika/commit/2c98c636774ad85d7c1878adc90862ee865dacf0])
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/test/java/org/apache/tika/parser/vlm/GeminiVLMParserTest.java
* (add) docs/modules/ROOT/examples/claude-vlm-full.json
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/AbstractVLMParser.java
* (add) docs/modules/ROOT/examples/openai-vlm-basic.json
* (add) docs/modules/ROOT/pages/advanced/local-vlm-server.adoc
* (add) docs/modules/ROOT/examples/gemini-vlm-full.json
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/VLMOCRConfig.java
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/test/java/org/apache/tika/parser/vlm/MarkdownToXHTMLEmitterTest.java
* (edit) tika-parsers/tika-parsers-ml/pom.xml
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/test/java/org/apache/tika/parser/vlm/ClaudeVLMParserTest.java
* (add) docs/modules/ROOT/examples/vlm-pdf-parsing.json
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/ClaudeVLMParser.java
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/test/java/org/apache/tika/parser/vlm/OpenAIVLMParserTest.java
* (add) docs/modules/ROOT/examples/claude-vlm-basic.json
* (add) docs/modules/ROOT/examples/openai-vlm-full.json
* (add) docs/modules/ROOT/examples/gemini-vlm-basic.json
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/pom.xml
* (edit) docs/modules/ROOT/pages/advanced/index.adoc
* (add) docs/modules/ROOT/pages/configuration/parsers/vlm-parsers.adoc
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/GeminiVLMParser.java
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/OpenAIVLMParser.java
* (edit) docs/modules/ROOT/nav.adoc
* (add) tika-parsers/tika-parsers-ml/tika-parser-vlm-ocr-module/src/main/java/org/apache/tika/parser/vlm/MarkdownToXHTMLEmitter.java


> Add VLM/modern OCR options parsers in 4.x
> -----------------------------------------
>
>                 Key: TIKA-4666
>                 URL: https://issues.apache.org/jira/browse/TIKA-4666
>             Project: Tika
>          Issue Type: New Feature
>            Reporter: Tim Allison
>            Priority: Major
>
> Along the lines of TIKA-4665, it would be great if we could integrate with
> modern VLMs. Ideally, these would call out to VLM APIs, but we could include
> examples of running Qwen or Jina AI models locally in a Python server.
> Again, as with our old deeplearning4j module, I think we should offer
> examples of these kinds of integrations whether or not they get picked up in
> production.
> Modern VLMs can be a drop-in replacement for Tesseract OCR, depending on the
> prompt. Or they can be used to label images; the user chooses what the prompt
> will be.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
