JackeyLee007 opened a new pull request, #7227:
URL: https://github.com/apache/paimon/pull/7227

   <!-- Please specify the module before the PR name: [core] ... or [flink] ... 
-->
   
   ### Purpose
   
   <!-- What is the purpose of the change -->
   Make sure the auto-tag exists as expected even when no data is provided.
   
   The previous version of the `trigger_tag_automatic_creation` procedure 
creates the tag only if it finds a suitable snapshot that falls after 
`tag.creation-delay`. This is not always sufficient, because such a snapshot 
may never be produced.
   
   The auto-tag is normally created within the Flink CDC job, but creation 
fails if the Flink job:
   * crashes accidentally
   * fails to create checkpoint
   * is delayed in creating checkpoint
   
   To meet the auto-tag expectation, this PR enhances the 
`trigger_tag_automatic_creation` procedure with a new `force` parameter, 
defaulting to false. When it is true and the auto-tag still does not exist 
after the first auto-creation attempt, the procedure makes an empty commit to 
force the creation of a snapshot, which can then be used to create the 
auto-tag during the second run.
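
   The two-attempt flow described above can be sketched roughly as follows. This is only an illustrative model, not the code in this PR: `SketchTable`, `tryAutoCreateTag`, `commitEmpty`, and the millisecond `thresholdMillis` (standing in for "period end plus `tag.creation-delay`") are hypothetical names, and the real snapshot-selection rule in Paimon is more involved.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical in-memory stand-in for a table; NOT Paimon's actual API. */
class SketchTable {
    /** Commit time (millis) of each snapshot, oldest first. */
    final List<Long> snapshotTimes = new ArrayList<>();
    /** Names of the tags created so far. */
    final Set<String> tags = new HashSet<>();

    /** Models an empty commit: a new snapshot that carries no data. */
    void commitEmpty(long nowMillis) {
        snapshotTimes.add(nowMillis);
    }

    /**
     * Simplified assumption: the tag can be created only if some snapshot
     * was committed at or after thresholdMillis.
     */
    boolean tryAutoCreateTag(String tagName, long thresholdMillis) {
        for (long t : snapshotTimes) {
            if (t >= thresholdMillis) {
                tags.add(tagName);
                return true;
            }
        }
        return false;
    }
}

class ForceTagSketch {
    /** Returns true if the tag exists when the call finishes. */
    static boolean triggerTagAutomaticCreation(
            SketchTable table, String tagName,
            long thresholdMillis, long nowMillis, boolean force) {
        // First attempt: succeeds only if a suitable snapshot already exists.
        if (table.tryAutoCreateTag(tagName, thresholdMillis)) {
            return true;
        }
        // Without force, keep the previous behaviour: no tag is created.
        if (!force) {
            return false;
        }
        // With force: make an empty commit so that a fresh snapshot exists,
        // then run the auto-creation a second time.
        table.commitEmpty(nowMillis);
        return table.tryAutoCreateTag(tagName, thresholdMillis);
    }
}
```

   In this sketch, calling with `force = false` on a table whose only snapshot predates the threshold leaves the tag missing, while `force = true` adds one empty snapshot and then creates the tag, provided the current time is itself past the threshold.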
   
   ### Tests
   
   <!-- List UT and IT cases to verify this change -->
   Flink IT `TriggerTagAutomaticCreationProcedureITCase` and Spark UT 
`TriggerTagAutomaticCreationProcedureTest` were adapted.
   
   ### API and Format
   
   <!-- Does this change affect API or storage format -->
   NO
   
   ### Documentation
   
   <!-- Does this change introduce a new feature -->
   Docs provided for the new parameter.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

