[GitHub] [iceberg] yyanyy commented on a change in pull request #2214: Add sort order to writer classes

GitBox Mon, 22 Mar 2021 19:05:11 -0700


yyanyy commented on a change in pull request #2214:
URL: https://github.com/apache/iceberg/pull/2214#discussion_r599194200




##########
File path: core/src/main/java/org/apache/iceberg/avro/Avro.java
##########
@@ -284,6 +286,11 @@ public DeleteWriteBuilder equalityFieldIds(int... 
fieldIds) {
       return this;
     }
 
+    public DeleteWriteBuilder withSortOrder(SortOrder newSortOrder) {

Review comment:
       I think the first sounds good to me! I think both `DataWriter` and 
`*DeleteWriter` are wrapper around appender, but today to construct the latter 
we have to specify everything in appender factories for each file format, and 
`Parquet` `Avro` actually directly pass them over to the underlying appender. I 
wonder if we can have separate `newAppender` and `newDeleteAppender` in 
`FileAppenderFactory` (or even generalize into one) so that we don't do this 
delegation per file format, and move the three methods you mentioned to 
`WriterFactory` to separate the two writer factories. In this case 
`WriterFactory` may always rely on a `FileAppenderFactory` in constructor, 
which might still be fine. 
   
   For adding `buildWriter` to Parquet, Avro, ORC classes, I'm not entirely 
sure though, since `DataWriter` does not differ much among three file formats, 
so adding them to each file format may actually increase duplicated code. I 
haven't looked into the code deeply but I wonder if we can abstract the 
creation of appender out from delete builders, so that the new `WriterFactory` 
can create delete writers as simple as create data writers today, so that we 
will just have this logic in each `WriterFactory`.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [iceberg] yyanyy commented on a change in pull request #2214: Add sort order to writer classes

Reply via email to