rdblue commented on a change in pull request #3480:
URL: https://github.com/apache/iceberg/pull/3480#discussion_r744329774
##########
File path: core/src/main/java/org/apache/iceberg/ManifestWriter.java
##########
@@ -149,13 +150,17 @@ public long length() {
return writer.length();
}
+ void useSequenceNumber(long sequenceNumber) {
Review comment:
All of the other configuration is passed in when creating the writer. Is
there a reason not to do that here? The benefit of doing that is that the
sequence number would not be mutable. So you wouldn't be able to do something
like this:
```java
ManifestWriter<DataFile> writer = new ManifestWriter(...);
writer.add(dataFile1);
writer.setSequenceNumber(N);
writer.add(dataFile2);
```
It isn't really clear what the correct behavior for that code would be.
Also, if the sequence number was fixed at create time, then we could write
the sequence number into every file that doesn't have a sequence number instead
of using inheritance. I think that we prefer setting the sequence number on
entries rather than inheriting. And also the manifest itself could have the
correct sequence number when _it_ was added.
If we always have the correct sequence number for manifests themselves, then
I think we get what we wanted with the proposal to use two sequence numbers.
The manifest sequence number is always when the ADDED files were actually
added. The file sequence number is always when the the is in time.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]