platinumhamburg opened a new issue, #2398:
URL: https://github.com/apache/fluss/issues/2398

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/fluss/issues) and 
found nothing similar.
   
   
   ### Motivation
   
   Runtime code generation is essential for achieving high performance in data processing systems. Compared with reflection or generic implementations, generated code can:
   
   1. **Eliminate virtual method dispatch** - Fields are read directly through type-specific operations
   2. **Enable JIT optimization** - Straight-line, type-specific code is easier for the JVM to inline and optimize
   3. **Reduce boxing/unboxing overhead** - Type-specific code reads primitives without boxing them
   4. **Support complex type comparisons** - Nested types (arrays, maps, rows) require specialized comparison logic
   
   The initial use case is `RecordEqualiser` for comparing `InternalRow` 
instances, which is needed for:
   - Change data capture (CDC) deduplication
   - Aggregation state management
   - Primary key table updates
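
To make the motivation concrete, here is a minimal sketch that puts a generic, boxing equaliser next to the kind of specialized class a generator might emit for a row of `(INT, BIGINT)`. The `Row` interface is a hypothetical stand-in for `InternalRow`, and the `RecordEqualiser` shown here is a simplified assumption rather than the actual Fluss interface.

```java
import java.util.Objects;

public class EqualiserSketch {

    /** Hypothetical stand-in for InternalRow; the real Fluss interface is larger. */
    interface Row {
        boolean isNullAt(int pos);
        int getInt(int pos);
        long getLong(int pos);
        Object getField(int pos); // generic accessor that returns boxed values
    }

    /** Simplified assumption of the RecordEqualiser contract. */
    interface RecordEqualiser {
        boolean equals(Row left, Row right);
    }

    /** Demo-only row backed by an Object[]; a real row would read from binary memory. */
    static final class ArrayRow implements Row {
        private final Object[] values;
        ArrayRow(Object... values) { this.values = values; }
        @Override public boolean isNullAt(int pos) { return values[pos] == null; }
        @Override public int getInt(int pos) { return (Integer) values[pos]; }
        @Override public long getLong(int pos) { return (Long) values[pos]; }
        @Override public Object getField(int pos) { return values[pos]; }
    }

    /** Generic approach: every field goes through Object and Objects.equals. */
    static final class GenericEqualiser implements RecordEqualiser {
        private final int arity;
        GenericEqualiser(int arity) { this.arity = arity; }
        @Override
        public boolean equals(Row left, Row right) {
            for (int i = 0; i < arity; i++) {
                if (!Objects.equals(left.getField(i), right.getField(i))) {
                    return false;
                }
            }
            return true;
        }
    }

    /** What generated code for (INT, BIGINT) could look like: unrolled, primitive reads. */
    static final class IntLongEqualiser implements RecordEqualiser {
        @Override
        public boolean equals(Row left, Row right) {
            if (left.isNullAt(0) != right.isNullAt(0)) return false;
            if (!left.isNullAt(0) && left.getInt(0) != right.getInt(0)) return false;
            if (left.isNullAt(1) != right.isNullAt(1)) return false;
            if (!left.isNullAt(1) && left.getLong(1) != right.getLong(1)) return false;
            return true;
        }
    }

    public static void main(String[] args) {
        Row a = new ArrayRow(1, 2L);
        Row b = new ArrayRow(1, 2L);
        Row c = new ArrayRow(1, null);
        RecordEqualiser generated = new IntLongEqualiser();
        System.out.println(generated.equals(a, b));
        System.out.println(generated.equals(a, c));
    }
}
```

Both equalisers return the same answers; the specialized one simply avoids the loop, the `Object` accessor, and the boxed comparison.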
   
   ### Solution
   
   ### Core Framework
   
   | Component | Description |
   |-----------|-------------|
   | `CodeGeneratorContext` | Manages reusable code fragments, member variables, and class-level declarations |
   | `JavaCodeBuilder` | Type-safe builder for constructing Java source code with a fluent API |
   | `CompileUtils` | Compiles generated source code using Janino with LRU caching |
   | `GeneratedClass<T>` | Wrapper holding the generated source code and the compiled class |
   | `CodeGenException` | Exception type for code generation failures |
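
The LRU caching mentioned for `CompileUtils` could be sketched roughly as follows. This is not the proposed implementation: `CompileCacheSketch` is a hypothetical name, and the Janino call is replaced by a pluggable `Function<String, Class<?>>` so the example runs with the JDK alone. The cache relies on `LinkedHashMap` in access order with `removeEldestEntry`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class CompileCacheSketch {

    private final Map<String, Class<?>> cache;
    private final Function<String, Class<?>> compiler;

    /**
     * @param maxEntries maximum number of compiled classes to retain
     * @param compiler   stand-in for the real compile step (an assumption)
     */
    CompileCacheSketch(int maxEntries, Function<String, Class<?>> compiler) {
        this.compiler = compiler;
        // accessOrder = true makes iteration order least-recently-used first.
        this.cache = new LinkedHashMap<String, Class<?>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Class<?>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached class for this source, compiling it on a miss. */
    synchronized Class<?> compile(String source) {
        Class<?> compiled = cache.get(source);
        if (compiled == null) {
            compiled = compiler.apply(source);
            cache.put(source, compiled);
        }
        return compiled;
    }

    synchronized int cachedCount() {
        return cache.size();
    }

    public static void main(String[] args) {
        // The lambda pretends to compile; a real one would produce a new class.
        CompileCacheSketch cache = new CompileCacheSketch(2, src -> Object.class);
        cache.compile("class A {}");
        cache.compile("class B {}");
        cache.compile("class C {}");
        System.out.println(cache.cachedCount());
    }
}
```

Keying the cache on the full source string keeps the sketch simple; a real implementation might also need to key on the class loader.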
   
   ### Type-Safe API
   
   - `Modifier` enum: PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL, ABSTRACT, 
SYNCHRONIZED, VOLATILE, TRANSIENT
   - `PrimitiveType` enum: BOOLEAN, BYTE, CHAR, SHORT, INT, LONG, FLOAT, 
DOUBLE, VOID
   - `Param` class: Type-safe method parameter representation
   - Helper methods: `mods()`, `params()`, `typeOf()`, `arrayOf()`
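
As a rough illustration of how such a type-safe API might read, the sketch below assembles a method declaration from enums and a `Param` class instead of raw strings. The names `Modifier`, `PrimitiveType`, `Param`, `mods()` and `params()` come from the list above, but every signature here, and the `method(...)` helper, is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.Locale;
import java.util.stream.Collectors;

public class BuilderSketch {

    enum Modifier {
        PUBLIC, PRIVATE, PROTECTED, STATIC, FINAL;
        String keyword() { return name().toLowerCase(Locale.ROOT); }
    }

    enum PrimitiveType {
        BOOLEAN, INT, LONG, VOID;
        String keyword() { return name().toLowerCase(Locale.ROOT); }
    }

    /** A typed method parameter; rendered as "type name". */
    static final class Param {
        final String type;
        final String name;

        private Param(String type, String name) {
            this.type = type;
            this.name = name;
        }

        static Param of(PrimitiveType type, String name) {
            return new Param(type.keyword(), name);
        }

        static Param of(String className, String name) {
            return new Param(className, name);
        }

        @Override
        public String toString() { return type + " " + name; }
    }

    /** Joins modifiers with spaces, e.g. "public final". */
    static String mods(Modifier... modifiers) {
        return Arrays.stream(modifiers).map(Modifier::keyword).collect(Collectors.joining(" "));
    }

    /** Joins parameters with commas, e.g. "int a, long b". */
    static String params(Param... params) {
        return Arrays.stream(params).map(Param::toString).collect(Collectors.joining(", "));
    }

    /** Hypothetical helper that renders a complete method declaration. */
    static String method(String mods, PrimitiveType returnType, String name,
                         String params, String body) {
        return mods + " " + returnType.keyword() + " " + name + "(" + params + ") {\n"
                + "    " + body + "\n"
                + "}";
    }

    public static void main(String[] args) {
        String source = method(
                mods(Modifier.PUBLIC),
                PrimitiveType.BOOLEAN,
                "equals",
                params(Param.of("InternalRow", "left"), Param.of("InternalRow", "right")),
                "return true;");
        System.out.println(source);
    }
}
```

The point of the enums is that a misspelled modifier or primitive type becomes a compile error in the generator rather than a Janino error at runtime.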
   
   ### Code Generators
   
   | Generator | Output Interface | Description |
   |-----------|------------------|-------------|
   | `EqualiserCodeGenerator` | `RecordEqualiser` | Generates code for comparing two `InternalRow` instances |
   
   ### Supported Data Types
   
   The `EqualiserCodeGenerator` supports all Fluss data types:
   
   - **Primitive types**: BOOLEAN, TINYINT, SMALLINT, INT, BIGINT, FLOAT, DOUBLE
   - **String types**: CHAR, VARCHAR, STRING
   - **Binary types**: BINARY, VARBINARY, BYTES
   - **Temporal types**: DATE, TIME, TIMESTAMP, TIMESTAMP_LTZ
   - **Numeric types**: DECIMAL (with precision/scale)
   - **Complex types**: ARRAY, MAP, ROW (nested)
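
One way a generator could choose a comparison per data type is sketched below: each type maps to an accessor and a comparison style, and the generator emits a Java expression as a string. `FieldCompareSketch`, `FieldType` and `Kind` are hypothetical, the accessor names (`getInt`, `getString`, `getBytes`, ...) are assumed to exist on `InternalRow`, and only a handful of types are covered. Nested ARRAY, MAP and ROW types would need recursive handling that is omitted here.

```java
public class FieldCompareSketch {

    /** How values of a type are compared in the emitted code. */
    enum Kind { PRIMITIVE, OBJECT, BYTE_ARRAY }

    /** A small subset of data types with their assumed row accessors. */
    enum FieldType {
        BOOLEAN("getBoolean", Kind.PRIMITIVE),
        INT("getInt", Kind.PRIMITIVE),
        BIGINT("getLong", Kind.PRIMITIVE),
        STRING("getString", Kind.OBJECT),
        BYTES("getBytes", Kind.BYTE_ARRAY);

        final String getter;
        final Kind kind;

        FieldType(String getter, Kind kind) {
            this.getter = getter;
            this.kind = kind;
        }
    }

    /** Emits an expression comparing two non-null values at the given position. */
    static String valueEquals(FieldType type, int pos) {
        String left = "left." + type.getter + "(" + pos + ")";
        String right = "right." + type.getter + "(" + pos + ")";
        switch (type.kind) {
            case PRIMITIVE:
                return left + " == " + right;
            case OBJECT:
                return left + ".equals(" + right + ")";
            case BYTE_ARRAY:
                return "java.util.Arrays.equals(" + left + ", " + right + ")";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Wraps the value comparison so that two nulls are equal and null never equals non-null. */
    static String nullSafeEquals(FieldType type, int pos) {
        String leftNull = "left.isNullAt(" + pos + ")";
        String rightNull = "right.isNullAt(" + pos + ")";
        return "(" + leftNull + " ? " + rightNull
                + " : !" + rightNull + " && " + valueEquals(type, pos) + ")";
    }

    public static void main(String[] args) {
        FieldType[] schema = {FieldType.INT, FieldType.STRING, FieldType.BYTES};
        for (int i = 0; i < schema.length; i++) {
            System.out.println(nullSafeEquals(schema[i], i));
        }
    }
}
```

Floating-point types are left out on purpose: whether `NaN` should equal `NaN` is a semantic choice the generator has to make explicitly, and this sketch does not assume one.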
   
   ### Features
   
   - Field projection support for partial row comparison
   - Compiled class caching with configurable cache size
   - Janino dependency shaded to `org.apache.fluss.shaded.org.codehaus.janino` 
to avoid classpath conflicts
   - Comprehensive Javadoc and package-info documentation
   
   
   ### Anything else?
   
   _No response_
   
   ### Willingness to contribute
   
   - [x] I'm willing to submit a PR!


