smongiar commented on PR #21994:
URL: https://github.com/apache/camel/pull/21994#issuecomment-4066914217
Hi @gnodet ... thank you for your review and observations to the PR.
Anyway, let me answer first about "_Test changes modify what's being
tested_" complain. The most general, important one.
These "RAG" classes are not implementing RAG logic. They are simple
Processor implementations that can be used as bean ref in YAML DSL without
defining any additional Java code.
They don't do retrieval-augmented generation,they don't call an LLM, they
don't combine retrieved context with prompts, they don't do chunking or any RAG
pipeline orchestration.
I _agree that The "RAG" prefix is arguably a misnomer_ (maybe something like
_MilvusHelper_ would be less confusing), so I can change them in case, but it's
the same we did for Qdrant example.
But the tests, are still testing the component in isolation.
The reason why I'm focusing on YAML DSL, _is to make the AI components
easily configurable_ so they can be well integrated in UI Camel Tools, used by
customers. It something we will do for each AI component, to follow this
direction.
Then, about other review observation (sorry but I need to explain on some
points, there have been too much replies and this make the review a little bit
confusing, so let me explain briefly some of them, for now):
**Input validation and error handling:**
It's a standard Camel pattern. Dozens of components across the codebase do
split("=") and access [0]/[1] without bounds checking. Adding defensive
validation here would make these helpers inconsistent with the rest of the
project, so these properties are developer-configured, not user-supplied. Just
to make an example, if a developer writes "content" instead of "content=text",
they have a bug in their route definition (which is effectively the same as a
startup-time configuration error). Wrapping these in custom exceptions adds
complexity for no real benefit. A _NumberFormatException_ from
_Integer.parseInt("abc")_ already tells you "abc" is not a valid integer.
Wrapping it to say "_dimension must be a valid integer_" adds negligible value
for a developer audience. Same for _ArrayIndexOutOfBoundsException_ — the stack
trace points directly at the _split("=")_ line, making the cause obvious.
In case, I can consider a _@param_ Javadoc note on the setter documenting
the expected format (e.g., "_field1=var1,field2=var2_"), not runtime validation
code. Tell me if you agree.
**Hardcoded values:**
"_Make everything configurable_", it's just the purpose of these helpers.
Basically we are talking about reasonable defaults, that need to be changed,
plus they involve optional parameters sometime.
As example, what are you saying about _RAGCreateIndex_, and
_.withIndexName("userFaceIndex")_, it's supported and originally reported by
the related IT code. But _indexName_ is an optional parameter in Milvus.
Omitting it just auto-generates a name, which is fine for most use cases.
**Code duplication (INSERT/UPSERT)**:
These are two separate Processor beans meant to be independently referenced
in YAML DSL. They happen to look similar today, but they map to fundamentally
different Milvus operations (INSERT vs UPSERT) which have different semantics
and could diverge in the future. It was so in the original implementation, why
consider this as duplication of code, now, during the refactoring?
**Numeric properties as Strings:**
When you say "_If this is intentional for JBang/YAML compatibility_" ... So,
yes, it is intentional, as already mentioned above. These are beans configured
via YAML DSL property binding, where everything comes in as strings. Using
String types is the pragmatic, consistent choice that aligns with how the
existing Qdrant or Milvus RAG helpers already work. It's a deliberate pattern,
not an oversight.
**Revert existing test modifications — add dedicated RAG helper tests
instead:**
It sounds like you are treating the RAG helpers as a separate "system under
test" that needs its own test suite. But as we've already established, these
helpers are just convenience processors that build Milvus SDK parameter
objects. They aren't a new feature with independent logic worth testing in
isolation. Other than that I didn't changed those tests. Processor were just
collected and refactored.
So, why this need to create RAG tests, since RAG itself is not tested as
explained before?
Most of complains I suppose are due to misleading classes names. Or at
least, I suppose. So I hope, this clarification, about the use of "RAG" prefix
should address the review in a different context.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]