openinx opened a new issue #1305:
URL: https://github.com/apache/iceberg/issues/1305


   We have upgraded the Flink version to 1.11, and Flink 1.11 has switched its 
row data type from `Row` to `RowData`. The parquet/avro readers and writers we 
developed previously were based on the `Row` type. Now @JingsongLi has 
contributed the `RowData` avro reader and writer 
(https://github.com/apache/iceberg/pull/1232), @chenjunjiedada is helping to 
contribute the `RowData` parquet reader 
(https://github.com/apache/iceberg/pull/1266) and writer 
(https://github.com/apache/iceberg/pull/1272), and I've pushed a `RowData` 
orc reader and writer (https://github.com/apache/iceberg/pull/1255) for 
review.
   
   IMO, we'd better replace `Row` with `RowData` in the flink module as 
soon as possible, so that we can unify all the code paths and put all the 
resources (both development and review) on the `RowData` path. My plan is: 
   
   1.  The patch (https://github.com/apache/iceberg/pull/1145) for the flink 
IcebergStreamWriter has been reviewed and is ready to merge, so let's get 
this patch into the master branch first. 
   2.  The flink TaskWriter unit tests currently run against a `Row` partition 
key, so before switching to `RowData` we need to implement a `RowData` 
partition key first. I've prepared the `RowDataWrapper` patch 
(https://github.com/apache/iceberg/pull/1299) for this. Getting this patch 
merged is the second step. 
   3.  We will need an extra patch that does the refactoring to replace all 
`Row` types with `RowData` (I have implemented one in my own branch: 
https://github.com/apache/iceberg/commit/2af37c53fd36639ba41aebd362f379c7f5451ed1) 
and makes sure all the unit tests pass. From that point on, all 
flink development and unit tests will use `RowData`. 
   4.  The upcoming `RowData` parquet/orc readers and writers will be added to 
the `TaskWriter` tests. 
   
   




