yyanyy commented on a change in pull request #2214:
URL: https://github.com/apache/iceberg/pull/2214#discussion_r599194200
##########
File path: core/src/main/java/org/apache/iceberg/avro/Avro.java
##########
@@ -284,6 +286,11 @@ public DeleteWriteBuilder equalityFieldIds(int...
fieldIds) {
return this;
}
+ public DeleteWriteBuilder withSortOrder(SortOrder newSortOrder) {
Review comment:
I think the first sounds good to me! I think both `DataWriter` and
`*DeleteWriter` are wrapper around appender, but today to construct the latter
we have to specify everything in appender factories for each file format, and
`Parquet` `Avro` actually directly pass them over to the underlying appender. I
wonder if we can have separate `newAppender` and `newDeleteAppender` in
`FileAppenderFactory` (or even generalize into one) so that we don't do this
delegation per file format, and move the three methods you mentioned to
`WriterFactory` to separate the two writer factories. In this case
`WriterFactory` may always rely on a `FileAppenderFactory` in constructor,
which might still be fine.
For adding `buildWriter` to Parquet, Avro, ORC classes, I'm not entirely
sure though, since `DataWriter` does not differ much among three file formats,
so adding them to each file format may actually increase duplicated code. I
haven't looked into the code deeply but I wonder if we can abstract the
creation of appender out from delete builders, so that the new `WriterFactory`
can create delete writers as simple as create data writers today, so that we
will just have this logic in each `WriterFactory`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]