Guillaume Nodet created CAMEL-22894:
---------------------------------------
Summary: Refactor Simple/CSimple language implementation for
better maintainability
Key: CAMEL-22894
URL: https://issues.apache.org/jira/browse/CAMEL-22894
Project: Camel
Issue Type: New Feature
Reporter: Guillaume Nodet
h2. Background
The Simple and CSimple language implementations have grown organically over
time and now present significant maintainability challenges. This issue is to
discuss potential refactoring approaches with the community.
h2. Current State Analysis
|| File || Lines || Concern ||
| SimpleFunctionExpression.java | 3,713 | Dual responsibility: runtime + code generation |
| CSimpleHelper.java | 1,199 | Static helper methods for compiled expressions |
| SimpleFunctionStart.java | 707 | Function block handling |
| SimpleTest.java | 3,595 | Simple language tests |
| OriginalSimpleTest.java | 3,344 | Near-duplicate CSimple tests |
| *Total* | *~12,500* | |
h2. Key Issues
h3. 1. Massive Class with Dual Responsibility
{{SimpleFunctionExpression.java}} (3,713 lines) contains both:
- {{createExpression()}} path for runtime Simple language
- {{createCode()}} path for compile-time CSimple language
Each function (body, header, variable, date, bean, file, math, etc.) is
implemented *twice* with similar parsing logic but different output generation.
h3. 2. Test Duplication
{{SimpleTest.java}} and {{OriginalSimpleTest.java}} are essentially copies of
each other (~7,000 lines combined) that differ mainly in the language under
test ("simple" vs "csimple"). When adding new features (e.g., the ternary
operator in CAMEL-22873), tests must be manually duplicated in both files.
h3. 3. Parallel Implementation Pattern
Every AST node supporting csimple must implement both {{createExpression()}}
and {{createCode()}}. Adding the ternary operator required ~140 lines of nearly
duplicate code.
h3. 4. String-Based Code Generation
Code generation uses error-prone string concatenation:
{code:java}
return "Object value = " + exp + ";\n return convertTo(exchange, " +
type + ", value)";
{code}
h2. Proposed Refactoring Approach
h3. 1. Declarative Function Registry
Replace if-else chains with declarative function definitions:
{code:java}
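@SimpleFunction(name = "headerAs", syntax = "headerAs(key, type)")
public class HeaderAsFunction implements FunctionDescriptor {
    // Runtime path (Simple): build the Expression evaluated per exchange
    @Override
    public Expression interpret(FunctionContext ctx) { ... }

    // Compile-time path (CSimple): emit the equivalent Java source
    @Override
    public CodeFragment compile(FunctionContext ctx) { ... }
}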
{code}
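To make the registry idea concrete, here is a minimal, self-contained sketch of how lookup by name could replace the if-else chain. Everything in it is an assumption for illustration: the {{FunctionRegistry}} class, the simplified {{FunctionDescriptor}} signatures (plain argument lists and header maps instead of {{FunctionContext}}, {{Expression}} and {{CodeFragment}}), and the shape of the generated source are hypothetical and not existing Camel API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified descriptor: one object owns both backends of a function.
interface FunctionDescriptor {
    String name();

    // Runtime path: returns a callable evaluated against the message headers.
    Function<Map<String, Object>, Object> interpret(List<String> args);

    // Compile-time path: returns a Java source fragment (its shape is made up here).
    String compile(List<String> args);
}

final class HeaderFunction implements FunctionDescriptor {
    public String name() {
        return "header";
    }

    public Function<Map<String, Object>, Object> interpret(List<String> args) {
        String key = args.get(0);
        return headers -> headers.get(key);
    }

    public String compile(List<String> args) {
        return "header(message, \"" + args.get(0) + "\")";
    }
}

// Hypothetical registry: the parser resolves a function by name instead of
// walking a long if-else chain.
final class FunctionRegistry {
    private final Map<String, FunctionDescriptor> functions = new LinkedHashMap<>();

    FunctionRegistry register(FunctionDescriptor descriptor) {
        functions.put(descriptor.name(), descriptor);
        return this;
    }

    FunctionDescriptor lookup(String name) {
        FunctionDescriptor descriptor = functions.get(name);
        if (descriptor == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        return descriptor;
    }
}
```

With this shape, adding a function means registering one more descriptor; the parser and both backends stay untouched.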
h3. 2. Visitor Pattern for AST Processing
Separate parsing from output generation:
- Parse once → produce AST
- InterpreterVisitor → produces Expression (Simple)
- CodeGenVisitor → produces Java code (CSimple)
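The split above could look roughly like the following sketch. The node types ({{Literal}}, {{HeaderRef}}, {{Concat}}), the {{Visitor}} interface and the use of a plain {{Function}} in place of Camel's {{Expression}} are all hypothetical simplifications; only the visitor names come from the proposal.

```java
import java.util.Map;
import java.util.function.Function;

// Hypothetical minimal AST; none of these types are Camel's actual classes.
interface Node {
    <R> R accept(Visitor<R> visitor);
}

interface Visitor<R> {
    R visitLiteral(Literal node);
    R visitHeader(HeaderRef node);
    R visitConcat(Concat node);
}

final class Literal implements Node {
    final String text;
    Literal(String text) { this.text = text; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitLiteral(this); }
}

final class HeaderRef implements Node {
    final String key;
    HeaderRef(String key) { this.key = key; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitHeader(this); }
}

final class Concat implements Node {
    final Node left;
    final Node right;
    Concat(Node left, Node right) { this.left = left; this.right = right; }
    public <R> R accept(Visitor<R> visitor) { return visitor.visitConcat(this); }
}

// Simple backend: turns the AST into a callable (stand-in for an Expression).
final class InterpreterVisitor implements Visitor<Function<Map<String, String>, String>> {
    public Function<Map<String, String>, String> visitLiteral(Literal node) {
        return headers -> node.text;
    }
    public Function<Map<String, String>, String> visitHeader(HeaderRef node) {
        return headers -> headers.get(node.key);
    }
    public Function<Map<String, String>, String> visitConcat(Concat node) {
        Function<Map<String, String>, String> left = node.left.accept(this);
        Function<Map<String, String>, String> right = node.right.accept(this);
        return headers -> left.apply(headers) + right.apply(headers);
    }
}

// CSimple backend: turns the same AST into Java source text.
// Sketch only: literal text is not escaped.
final class CodeGenVisitor implements Visitor<String> {
    public String visitLiteral(Literal node) {
        return "\"" + node.text + "\"";
    }
    public String visitHeader(HeaderRef node) {
        return "headers.get(\"" + node.key + "\")";
    }
    public String visitConcat(Concat node) {
        return node.left.accept(this) + " + " + node.right.accept(this);
    }
}
```

The AST is built once; each backend is a separate visitor, so a new node type adds one class plus one method per visitor rather than two hand-written code paths inside the node.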
h3. 3. Unified Test Framework
Use JUnit 5 parameterized tests:
{code:java}
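import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.ValueSource;

@ParameterizedTest
@ValueSource(strings = { "simple", "csimple" })
void testTernaryOperator(String languageName) {
    // Single test body runs against both backends
}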
{code}
h3. 4. Type-Safe Code Generation
Replace string concatenation with builders that can validate generated code.
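As one possible reading of "builders that can validate generated code", the sketch below assembles a two-statement method body like the one in the concatenation example and checks a few structural properties while doing so. {{MethodBodyBuilder}} and its methods are hypothetical names invented for this illustration, and the validation is deliberately shallow (identifier syntax and a trailing return only; Java keywords are not rejected).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical builder, not an existing Camel class.
final class MethodBodyBuilder {
    private final List<String> statements = new ArrayList<>();

    // Emits "<type> <name> = <expression>;" and owns the statement terminator.
    MethodBodyBuilder declare(String type, String name, String expression) {
        statements.add(type + " " + requireIdentifier(name) + " = " + expression + ";");
        return this;
    }

    // Emits "return <method>(<args>);" with the arguments joined consistently.
    MethodBodyBuilder returnCall(String method, String... args) {
        statements.add("return " + requireIdentifier(method)
                + "(" + String.join(", ", args) + ");");
        return this;
    }

    String build() {
        if (statements.isEmpty()
                || !statements.get(statements.size() - 1).startsWith("return ")) {
            throw new IllegalStateException("Method body must end with a return statement");
        }
        return String.join("\n", statements);
    }

    // Shallow check: syntactically valid identifier characters only.
    private static String requireIdentifier(String name) {
        if (name.isEmpty()
                || !Character.isJavaIdentifierStart(name.charAt(0))
                || !name.chars().skip(1).allMatch(Character::isJavaIdentifierPart)) {
            throw new IllegalArgumentException("Not a Java identifier: " + name);
        }
        return name;
    }
}
```

A call such as {{new MethodBodyBuilder().declare("Object", "value", exp).returnCall("convertTo", "exchange", type, "value").build()}} would stand in for the hand-concatenated string, with separators and terminators supplied by the builder rather than by each call site.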
h2. Expected Benefits
|| Aspect || Current || After Refactoring ||
| Add new function | Edit 2 places in 3700-line file | Create single class |
| Add new operator | Edit multiple AST classes + tests | One AST node + visitor methods |
| Test maintenance | Duplicate ~3500 lines | Single parameterized suite |
| Estimated LOC | ~12,500 | ~6,000-7,000 |
h2. Migration Approach
This could be done incrementally:
1. Create unified test framework (low risk, immediate benefit)
2. Extract function registry, migrate functions one at a time
3. Introduce visitor pattern for new features
4. Migrate existing AST nodes to visitor pattern
5. Clean up legacy code
h2. Questions for Discussion
1. Is there appetite for this level of refactoring?
2. Should we prioritize any specific aspect (e.g., tests first)?
3. Are there backward compatibility concerns to consider?
4. Should this be a GSoC or similar project given the scope?
Related: CAMEL-22873 (ternary operator) exposed the maintenance burden when
adding new features.