openinx commented on pull request #1774: URL: https://github.com/apache/iceberg/pull/1774#issuecomment-733569256
I removed the `PartitionedFanoutWriter` in #1818 because:

1. I found the code simpler and easier to understand after unifying the unpartitioned and partitioned fanout writers into a single [RowDataTaskWriter](https://github.com/apache/iceberg/pull/1818/files#diff-137cbe4278e90eab7d4d545be87f5daf929e48a012f1c791ca1e7fc7d7fe5eddR41).
2. Flink needs to parse the `RowKind` to decide whether a row should be dispatched to the `write` method or the `delete` method. The previous abstraction is better suited to this requirement, so I created a unified task writer for Flink.

For the Spark fanout task writer, I think it's reasonable for Spark streaming scenarios, because in that case we don't necessarily need to shuffle the records by partition key. Moving the `PartitionedFanoutWriter` from the `flink` module to the `core` module looks good to me.

@XuQianJin-Stars Would you mind updating this PR to address the CI issue? Thanks.
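
To illustrate the `RowKind` dispatch mentioned in point 2, here is a minimal, dependency-free sketch. It is not the code from #1818: `RowKindDispatchSketch`, `Row`, and `SketchTaskWriter` are hypothetical names, the local `RowKind` enum only mirrors the four values of Flink's `RowKind`, and routing `UPDATE_BEFORE` to `delete` and `UPDATE_AFTER` to `write` is an assumption about how the rows would be split.

```java
import java.util.ArrayList;
import java.util.List;

public class RowKindDispatchSketch {
  // Local copy of the four change kinds in Flink's RowKind, so the sketch needs no Flink dependency.
  enum RowKind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

  // Hypothetical stand-in for a row: a change kind plus a payload.
  static final class Row {
    final RowKind kind;
    final String data;

    Row(RowKind kind, String data) {
      this.kind = kind;
      this.data = data;
    }
  }

  // Hypothetical unified task writer that records which path each row took.
  static final class SketchTaskWriter {
    final List<String> written = new ArrayList<>();
    final List<String> deleted = new ArrayList<>();

    void write(String data) {
      written.add(data);
    }

    void delete(String data) {
      deleted.add(data);
    }

    // Assumed routing: new row images go to write, old row images go to delete.
    void dispatch(Row row) {
      switch (row.kind) {
        case INSERT:
        case UPDATE_AFTER:
          write(row.data);
          break;
        case UPDATE_BEFORE:
        case DELETE:
          delete(row.data);
          break;
        default:
          throw new UnsupportedOperationException("Unknown row kind: " + row.kind);
      }
    }
  }
}
```

Because the dispatch depends only on the row's kind and not on whether the table is partitioned, a single writer of this shape could cover both the partitioned and unpartitioned cases.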
