[GitHub] [iceberg] openinx commented on a change in pull request #1774: [iceberg-1746] Implement spark fanout writer

GitBox Wed, 25 Nov 2020 01:21:31 -0800


openinx commented on a change in pull request #1774:
URL: https://github.com/apache/iceberg/pull/1774#discussion_r530219917




##########
File path: spark/src/test/java/org/apache/iceberg/spark/source/TestSparkDataWrite.java
##########
@@ -374,6 +375,58 @@ public void testPartitionedCreateWithTargetFileSizeViaOption() throws IOExceptio
     }
   }
 
+  @Test
+  public void testPartitionedFanoutCreateWithTargetFileSizeViaOption() throws IOException {
+    File parent = temp.newFolder(format.toString());
+    File location = new File(parent, "test");
+
+    HadoopTables tables = new HadoopTables(CONF);
+    PartitionSpec spec = PartitionSpec.builderFor(SCHEMA).identity("data").build();
+    Table table = tables.create(SCHEMA, spec, location.toString());
+    table.updateProperties()
+        .set(WRITE_PARTITIONED_FANOUT_ENABLED, "true")
+        .commit();
+
+    List<SimpleRecord> expected = Lists.newArrayListWithCapacity(8000);
+    for (int i = 0; i < 2000; i++) {
+      expected.add(new SimpleRecord(i, "a"));
+      expected.add(new SimpleRecord(i, "b"));
+      expected.add(new SimpleRecord(i, "c"));
+      expected.add(new SimpleRecord(i, "d"));
+    }
+
+    Dataset<Row> df = spark.createDataFrame(expected, SimpleRecord.class);
+
+    df.select("id", "data").sort("data").write()

Review comment:
       For the partitioned fanout case, we shouldn't need to sort on the `data`
column, should we? Otherwise, what is the difference compared to `PartitionedWriter`?
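
       To illustrate the question, here is a minimal sketch of the two
behaviors. It assumes that a clustered writer keeps a single open file and
rejects a partition key that reappears after its file was closed, while a
fanout writer keeps one open file per partition key. `FanoutSketch`,
`writeClustered` and `writeFanout` are hypothetical names for illustration only
and are not Iceberg classes or APIs; each "file" is modeled as a row count per
partition key.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for illustration; not part of Iceberg.
public class FanoutSketch {

  // Clustered behavior (assumed): one open file at a time, so input must
  // arrive grouped by partition key. A key that comes back after its file
  // was closed is rejected.
  static Map<String, Integer> writeClustered(List<String> keys) {
    Map<String, Integer> rowsPerPartition = new LinkedHashMap<>();
    Set<String> closed = new HashSet<>();
    String current = null;
    for (String key : keys) {
      if (!key.equals(current)) {
        if (current != null) {
          closed.add(current); // switching partitions closes the current file
        }
        if (closed.contains(key)) {
          throw new IllegalStateException("Already closed files for partition: " + key);
        }
        current = key;
      }
      rowsPerPartition.merge(key, 1, Integer::sum);
    }
    return rowsPerPartition;
  }

  // Fanout behavior (assumed): one open file per partition key, so the
  // order of incoming rows does not matter.
  static Map<String, Integer> writeFanout(List<String> keys) {
    Map<String, Integer> rowsPerPartition = new LinkedHashMap<>();
    for (String key : keys) {
      rowsPerPartition.merge(key, 1, Integer::sum);
    }
    return rowsPerPartition;
  }

  public static void main(String[] args) {
    List<String> unsorted = Arrays.asList("a", "b", "a", "c");
    System.out.println(writeFanout(unsorted)); // prints {a=2, b=1, c=1}
    try {
      writeClustered(unsorted);
    } catch (IllegalStateException e) {
      System.out.println("clustered sketch rejected unsorted input");
    }
  }
}
```

       Under these assumptions, a test that sorts by `data` first would pass
with either writer, so it would not exercise the fanout-specific path.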




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



