sudssf commented on a change in pull request #882: fix: Failed to get status issue because of s3 eventual consistency
URL: https://github.com/apache/incubator-iceberg/pull/882#discussion_r401960590
##########
File path: spark/src/test/java/org/apache/iceberg/spark/source/TestDataSourceOptions.java
##########
@@ -201,6 +201,36 @@ public void testSplitOptionsOverridesTableProperties() throws IOException {
     Assert.assertEquals("Spark partitions should match", 2, resultDf.javaRDD().getNumPartitions());
   }
 
+  @Test
+  public void testSplitOptionsOverridesTablePropertiesWithWriterLength() throws IOException {
+    String tableLocation = temp.newFolder("iceberg-table").toString();
+
+    HadoopTables tables = new HadoopTables(CONF);
+    PartitionSpec spec = PartitionSpec.unpartitioned();
+    Map<String, String> options = Maps.newHashMap();
+    options.put(TableProperties.SPLIT_SIZE, String.valueOf(128L * 1024 * 1024)); // 128Mb
+    tables.create(SCHEMA, spec, options, tableLocation);
+
+    List<SimpleRecord> expectedRecords = Lists.newArrayList(
+        new SimpleRecord(1, "a"),
+        new SimpleRecord(2, "b")
+    );
+    Dataset<Row> originalDf = spark.createDataFrame(expectedRecords, SimpleRecord.class);
+    originalDf.select("id", "data").write()
+        .format("iceberg")
+        .mode("append")
+        .option("use-writer-length-as-file-size", true)
+        .save(tableLocation);
+
+    Dataset<Row> resultDf = spark.read()
+        .format("iceberg")
+        .option("split-size", String.valueOf(611 + 103)) // 611 bytes is the size of SimpleRecord(1,"a")

Review comment:
I think this happens only for Parquet.
https://github.com/apache/incubator-iceberg/blob/master/parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java#L142
`writeStore` seems to return a nonzero result from `getBufferedSize` even after close.
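
To make the concern about `getBufferedSize` concrete, here is a minimal sketch of how a writer length computed as "bytes written plus bytes buffered" would over-report a file's size if the buffered count is still nonzero after close. `WriterLengthSketch`, `FakeWriteStore`, and `reportedLength` are hypothetical names for illustration only; they are not Iceberg or Parquet classes. The values 611 and 103 are borrowed from the test above, and reading 103 as the leftover buffered size is an assumption, not something the thread confirms.

```java
// Hypothetical model of the behavior described in the review comment.
// None of these types exist in Iceberg or Parquet.
public class WriterLengthSketch {

    /** Stand-in for a write store that still reports buffered bytes after close. */
    static final class FakeWriteStore {
        private final long bufferedSize;

        FakeWriteStore(long bufferedSize) {
            this.bufferedSize = bufferedSize;
        }

        long getBufferedSize() {
            return bufferedSize;
        }
    }

    /** Length computed as bytes already written plus bytes reported as buffered. */
    static long reportedLength(long bytesWritten, FakeWriteStore store) {
        return bytesWritten + store.getBufferedSize();
    }

    public static void main(String[] args) {
        long bytesWritten = 611L;                              // size quoted in the test comment
        FakeWriteStore afterClose = new FakeWriteStore(103L);  // assumed leftover buffered size

        long reported = reportedLength(bytesWritten, afterClose);
        System.out.println("written=" + bytesWritten + " reported=" + reported);
    }
}
```

If the store reported zero buffered bytes once closed, `reportedLength` would equal the bytes written, and the test would not need to add an offset to the split size.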