aokolnychyi commented on a change in pull request #2362:
URL: https://github.com/apache/iceberg/pull/2362#discussion_r615074539



##########
File path: spark3/src/main/java/org/apache/iceberg/spark/source/SparkWrite.java
##########
@@ -479,27 +488,17 @@ public void doCommit(long epochId, WriterCommitMessage[] messages) {
   }
 
   private static class WriterFactory implements DataWriterFactory, StreamingDataWriterFactory {
-    private final PartitionSpec spec;
+    private final Broadcast<Table> tableBroadcast;
     private final FileFormat format;
-    private final LocationProvider locations;
-    private final Map<String, String> properties;
-    private final Broadcast<FileIO> io;
-    private final Broadcast<EncryptionManager> encryptionManager;
     private final long targetFileSize;
     private final Schema writeSchema;
     private final StructType dsSchema;
     private final boolean partitionedFanoutEnabled;
 
-    protected WriterFactory(PartitionSpec spec, FileFormat format, LocationProvider locations,
-                            Map<String, String> properties, Broadcast<FileIO> io,
-                            Broadcast<EncryptionManager> encryptionManager, long targetFileSize,
+    protected WriterFactory(Broadcast<Table> tableBroadcast, FileFormat format, long targetFileSize,
                             Schema writeSchema, StructType dsSchema, boolean partitionedFanoutEnabled) {
-      this.spec = spec;
+      this.tableBroadcast = tableBroadcast;

Review comment:
       This is a follow-up to
https://github.com/apache/iceberg/commit/2843db8cbe2a2b6eba2209ec6f758d2530d5c94b
where we redesigned the way we serialize tables. According to the benchmarks
I ran, the new broadcast size should be up to 50% smaller, even though we now
broadcast the table. We changed the way we serialize the Hadoop conf in
`FileIO`.




