wuchong commented on PR #1404: URL: https://github.com/apache/fluss/pull/1404#issuecomment-3315842899
Thanks @polyzos for the pull request. In general this util looks good and useful. But I think we shouldn’t expose it directly to users. Ideally, users should call `writer.upsert(T record)` directly, without handling the conversion. The conversion from POJO to `InternalRow` should happen automatically under the API, because we already know the schema of the table. This magic is very similar to Flink Table API `StreamTableEnvironment#fromDataStream(org.apache.flink.streaming.api.datastream.DataStream<T>)` which automatically converts a `DataStream<T>` into `Table`. Underlying, it generates a `InputConversionOperator` to convert the POJO into `RowData`. It has a `DataStructureConverter` abstraction to do the conversion, and the POJO conversion magic is in `StructuredObjectConverter` (it uses codegen for conversion, but we can Java refelction in the first version). We can introduce a `SpecificUpsertWriter<T> extends UpsertWriter` which provides additional POJO-type-based APIs `upsert(T record)` and `delete(T record)`. The `SpecificUpsertWriter<T>` can be got from `Upsert#createWriter(Class<T> specifcPojo)`. When the framework creates a `SpecificUpsertWriter`, it first checks the POJO type (whether it's a valid POJO type, and whether it can convert from/to InternalRow). We can introduce similar APIs for `Append` and `Scan`. Back to this pull request, I think we can first polish this PR a bit (I will leave some review comments later), and have another design/FIP to do the final API expose. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
