oscerd commented on code in PR #1423:
URL: https://github.com/apache/camel-website/pull/1423#discussion_r2433536366


##########
content/blog/2025/10/camel-docling/index.md:
##########
@@ -0,0 +1,227 @@
+---
+title: "Building Intelligent Document Processing with Apache Camel: Docling 
meets LangChain4j"
+date: 2025-10-15
+draft: false
+authors: [ oscerd ]
+categories: ["Camel", "AI"]
+preview: "The new camel-docling component meets camel-langchain4j"
+---
+
+In the rapidly evolving landscape of AI-powered applications, the ability to 
process and understand documents has become increasingly crucial. Whether 
you're dealing with PDFs, Word documents, or PowerPoint presentations, 
extracting meaningful insights from unstructured data is a challenge many 
developers face daily.
+
+In this post, we'll explore how Apache Camel's new AI components enable 
developers to build sophisticated RAG (Retrieval Augmented Generation) 
pipelines with minimal code. We'll combine the power of Docling for document 
conversion with LangChain4j for AI orchestration, all wired together through 
Camel's YAML DSL.
+
+## The Challenge: Document Intelligence at Scale
+
+Companies are drowning in documents. Legal firms process contracts, healthcare 
providers manage medical records, and financial institutions analyze reports. 
The traditional approach of manual document review simply doesn't scale.
+
+So this is a possible space where we could apply RAG and Apache Camel. The steps:
+
+* Convert documents from any format to structured text
+* Extract key insights and summaries
+* Answer questions about document content
+* Process documents in real-time as they arrive
+
+This is where the combination of Docling and LangChain4j shines, and Apache 
Camel provides the perfect integration layer to bring them together.
+
+## Meet the Components
+
+### Camel-Docling: Enterprise Document Conversion
+
+The `camel-docling` component integrates IBM's Docling library, an AI-powered 
document parser that can handle various formats including PDF, Word, 
PowerPoint, and more. What makes Docling special is its ability to preserve 
document structure while converting to clean Markdown, HTML, or JSON.
+
+Key features:
+
+* **Multiple Operations**: Convert to Markdown, HTML, JSON, or extract 
structured data
+* **Flexible Deployment**: Works with both CLI and API (docling-serve) modes
+* **Content Control**: Return content directly in the message body or as file 
paths
+* **OCR Support**: Handle scanned documents with optical character recognition
+
+### Camel-LangChain4j: AI Orchestration Made Simple
+
+The `camel-langchain4j-chat` component provides seamless integration with 
Large Language Models through the LangChain4j framework. It supports various 
LLM providers including OpenAI, Ollama, and more.
+
+Perfect for:
+
+* Document analysis and summarization
+* Question-answering systems
+* Content generation
+* RAG implementations
+
+## Building a RAG Pipeline with YAML
+
+Let's walk through a complete example that demonstrates the power of combining 
these components. Our goal is to create a system that automatically processes 
documents, analyzes them with AI, and generates comprehensive reports: a 
classic example.
+
+### Architecture Overview
+
+The flow is straightforward:
+
+1. Watch a directory for new documents
+2. Convert documents to Markdown using Docling
+3. Send the converted content to an LLM for analysis
+4. Generate a comprehensive analysis report
+5. Clean up processed files
+
+All of this is defined declaratively in YAML, making it easy to understand and 
modify.
+
+### Setting Up the Infrastructure
+
+First, we need our services running. Thanks to the `camel infra` command, this is 
pretty simple:
+
+```shell
+# Start Docling (if camel infra supports it)
+$ jbang -Dcamel.jbang.version=4.16.0-SNAPSHOT camel@apache/camel infra run 
docling

Review Comment:
   @squakez now it seems to be a link problem:
   
   [check:links    ] Found invalid urls in blog/2024/11/camel-k-2-5/index.html:
   [check:links    ]    Linked file at path 
camel-k/next/kamelets/kamelets-user.html does not exist!
   ERROR: "check:links" exited with 1.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

