NeQuissimus commented on a change in pull request #3293:
URL: https://github.com/apache/iceberg/pull/3293#discussion_r729981179
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java
##########
@@ -187,11 +215,6 @@ private void flushRowGroup(boolean finished) {
private void startRowGroup() {
Preconditions.checkState(!closed, "Writer is closed");
- try {
- this.nextRowGroupSize = Math.min(writer.getNextRowGroupSize(), targetRowGroupSize);
- } catch (IOException e) {
- throw new RuntimeIOException(e);
- }
Review comment:
As I tried to explain in the PR description, this is not needed. Given how
alignment works inside Iceberg, with maxPadding set to 0, nextRowGroupSize is
always identical to targetRowGroupSize.
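
A minimal sketch of that reasoning, assuming a simplified model of
padding-based alignment. The class and method names here are hypothetical and
are not the Parquet or Iceberg source; the point is only that a maximum
padding of 0 makes the removed `Math.min(...)` a no-op.

```java
// Simplified, hypothetical model of padding-based row group alignment.
// It is NOT the Parquet or Iceberg implementation; names are illustrative.
final class AlignmentSketch {
  private final long blockSize;
  private final long maxPadding;

  AlignmentSketch(long blockSize, long maxPadding) {
    this.blockSize = blockSize;
    this.maxPadding = maxPadding;
  }

  // Size the next row group should aim for, given the current file position.
  long nextRowGroupSize(long position, long targetRowGroupSize) {
    if (maxPadding == 0) {
      // Alignment is effectively disabled: the target comes back unchanged.
      return targetRowGroupSize;
    }
    long remaining = blockSize - (position % blockSize);
    if (remaining <= maxPadding) {
      // Assumed behaviour: pad to the block boundary, then use a full-size group.
      return targetRowGroupSize;
    }
    // Otherwise cap the group so it does not cross the block boundary.
    return Math.min(remaining, targetRowGroupSize);
  }

  public static void main(String[] args) {
    long target = 128L * 1024 * 1024;
    AlignmentSketch noPadding = new AlignmentSketch(64L * 1024 * 1024, 0);
    for (long position : new long[] {0L, 1_000L, 100_000_000L}) {
      // Mirrors the removed line: min(next, target) with maxPadding == 0.
      long next = Math.min(noPadding.nextRowGroupSize(position, target), target);
      System.out.println("position=" + position + " equalsTarget=" + (next == target));
    }
  }
}
```

With a non-zero padding the result can be smaller than the target, which is the
only case where the `Math.min` would have mattered.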
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/Parquet.java
##########
@@ -183,6 +184,16 @@ public WriteBuilder writerVersion(WriterVersion version) {
return this;
}
+ public WriteBuilder lazy() {
+ this.lazyWriter = true;
+ return this;
+ }
+
+ public WriteBuilder lazy(boolean lazy) {
+ this.lazyWriter = lazy;
+ return this;
+ }
+
Review comment:
There are situations where you might want an empty file to exist as soon as
you are ready to write, and some of the tests assume this behaviour.
I did not want to change the default here, so the builder leaves the choice
to the API user.
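
A rough sketch of the opt-in behaviour described above, assuming a
hypothetical `SketchWriter` that stands in for the real writer. None of these
names come from Iceberg apart from the `lazy()` / `lazy(boolean)` builder
methods shown in the diff. The default stays eager, so the file exists as soon
as the writer is built; with `lazy()`, creation is deferred to the first write.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for a writer whose builder exposes an opt-in lazy mode.
final class SketchWriter implements AutoCloseable {
  private final Path path;
  private OutputStream out; // stays null until the file has been created

  private SketchWriter(Path path, boolean lazy) {
    this.path = path;
    if (!lazy) {
      ensureOpen(); // eager default: the file exists before any write
    }
  }

  static Builder builder(Path path) {
    return new Builder(path);
  }

  // Creates (or truncates) the file on first use.
  private void ensureOpen() {
    if (out == null) {
      try {
        out = Files.newOutputStream(path);
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  void write(byte[] bytes) {
    ensureOpen();
    try {
      out.write(bytes);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public void close() {
    if (out != null) {
      try {
        out.close();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }

  static final class Builder {
    private final Path path;
    private boolean lazyWriter = false; // default behaviour is unchanged

    private Builder(Path path) {
      this.path = path;
    }

    Builder lazy() {
      return lazy(true);
    }

    Builder lazy(boolean lazy) {
      this.lazyWriter = lazy;
      return this;
    }

    SketchWriter build() {
      return new SketchWriter(path, lazyWriter);
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempDirectory("lazy-sketch").resolve("data.bin");
    try (SketchWriter writer = SketchWriter.builder(file).lazy().build()) {
      System.out.println("exists before write: " + Files.exists(file));
      writer.write(new byte[] {1, 2, 3});
      System.out.println("exists after write: " + Files.exists(file));
    }
  }
}
```

A lazy writer that is closed without any write leaves no file behind in this
sketch, whereas the eager default always leaves one, possibly empty.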
##########
File path: parquet/src/main/java/org/apache/iceberg/parquet/ParquetWriter.java
##########
@@ -187,11 +215,6 @@ private void flushRowGroup(boolean finished) {
private void startRowGroup() {
Preconditions.checkState(!closed, "Writer is closed");
- try {
- this.nextRowGroupSize = Math.min(writer.getNextRowGroupSize(), targetRowGroupSize);
- } catch (IOException e) {
- throw new RuntimeIOException(e);
- }
Review comment:
Wasn't sure if I made sense, hope that cleared it up :)