RussellSpitzer commented on code in PR #4612:
URL: https://github.com/apache/iceberg/pull/4612#discussion_r857711524
##########
core/src/main/java/org/apache/iceberg/BaseRewriteManifests.java:
##########
@@ -257,7 +257,10 @@ private void validateDeletedManifests(Set<ManifestFile>
currentManifests) {
.filter(manifest -> !currentManifests.contains(manifest))
.findAny()
.ifPresent(manifest -> {
- throw new ValidationException("Manifest is missing: %s",
manifest.path());
+ throw new ValidationException("Cannot commit RewriteManifests; " +
+ "manifests that existed at the beginning of this rewrite
have already been removed by another " +
+ "operation. Manifest %s, which would be replaced by this
operation, has already been removed.",
+ manifest.path());
Review Comment:
I think "Manifest is missing" is a very confusing message. From my talks
with folks who have hit this as well as other Iceberg dev's, the gut reaction
to this message is that the table has become corrupted. After all why would
"manifest is missing" be an issue for an operation which is removing manifests?
Having a missing manifest also generally means that the table is no longer
stable so users are on high alert.
To further confound things, the manifest probably still exists and a user
can find it, which leads them to ask "Why would Iceberg think the file is
missing when it is clearly on disk?". Users often let me know in conjunction
that Iceberg says a manifest is missing but they can see it on disk
In my opinion a good error message has to explain what went wrong, why it
went wrong, and how to respond (if such a response is possible and non-obvious).
At the moment the error message at best explains the "why it went wrong" but
without enough context to really let the user even know what part of the
rewrite actually broke.
If I was going to order information here the most important concept to get
across is
1. "What" - The commit has failed (not writing, reading or deleting
manifests)
2. "Why" - The change we prepared no longer cleanly applies because a
manifest we thought existed is no longer current
3. "Remediation" - Retry the operation if desired
So if we want to trim it down to the essentials
```
"Cannot commit, table has been concurrently modified and manifest %s is no
longer part of the table"
```
Although even knowing which manifest is missing is not even that valid so we
could remove that too.
I actually have issues with most of our Validation messages in
MergingSnapshotProducer for the similar reason. Users are often confused by
"Found conflicting files that can contain records matching true". But we can
save that for future PR's
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]