Vinish Reddy created HUDI-9060:
----------------------------------

             Summary: Remove validations for clustering metadata
                 Key: HUDI-9060
                 URL: https://issues.apache.org/jira/browse/HUDI-9060
             Project: Apache Hudi
          Issue Type: Bug
          Components: clustering, deltastreamer
            Reporter: Vinish Reddy
            Assignee: Vinish Reddy


When a clustering plan contains log files that delete every record in the 
partition/base file, clustering fails on the following check in 
validateClusteringCommit:
[https://github.com/apache/hudi/blob/master/hudi-client/hudi-client-common/src/main/java/org/apache/hudi/client/BaseHoodieTableServiceClient.java#L490C5-L490C29]
 
{code:java}
if (clusteringMetadata.getWriteStatuses().isEmpty()) {
  HoodieClusteringPlan clusteringPlan = ClusteringUtils.getClusteringPlan(
          table.getMetaClient(),
          ClusteringUtils.getInflightClusteringInstant(clusteringCommitTime,
              table.getActiveTimeline(), table.getInstantGenerator()).get())
      .map(Pair::getRight).orElseThrow(() -> new HoodieClusteringException(
          "Unable to read clustering plan for instant: " + clusteringCommitTime));
  throw new HoodieClusteringException("Clustering plan produced 0 WriteStatus for " + clusteringCommitTime
      + " #groups: " + clusteringPlan.getInputGroups().size() + " expected at least "
      + clusteringPlan.getInputGroups().stream().mapToInt(HoodieClusteringGroup::getNumOutputFileGroups).sum()
      + " write statuses");
}
{code}

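We can remove this validation: when every input record has been deleted, a 
clustering run that produces zero write statuses is a valid outcome, not an error.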
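
A minimal sketch of the failure mode, for illustration only. The class and method names below are hypothetical stand-ins, not Hudi APIs, and the sketch assumes that an input group whose records are all deleted writes no output file and therefore reports no write status.

```java
import java.util.List;

// Hypothetical illustration only; none of these names are Hudi APIs.
final class ClusteringValidationSketch {

    // Each entry is the number of records left in one input file group
    // after its log-file deletes are applied. Assumption: a group with no
    // surviving records writes no output file and yields no write status.
    static long countWriteStatuses(List<Long> liveRecordsPerGroup) {
        return liveRecordsPerGroup.stream().filter(n -> n > 0).count();
    }

    // Mimics the strict check quoted in the issue: zero write statuses
    // is treated as an error.
    static void strictValidate(long writeStatuses) {
        if (writeStatuses == 0) {
            throw new IllegalStateException("Clustering plan produced 0 WriteStatus");
        }
    }

    public static void main(String[] args) {
        // Two input groups whose records were all deleted by log files.
        long statuses = countWriteStatuses(List.of(0L, 0L));
        System.out.println("write statuses: " + statuses);
        try {
            strictValidate(statuses);
        } catch (IllegalStateException e) {
            System.out.println("strict check rejects the run: " + e.getMessage());
        }
    }
}
```

Under these assumptions the run did nothing wrong, yet the strict check rejects it, which is the case this issue proposes to stop treating as a failure.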



--
This message was sent by Atlassian Jira
(v8.20.10#820010)