nsivabalan edited a comment on pull request #2438:
URL: https://github.com/apache/hudi/pull/2438#issuecomment-864413719
I think we can simplify things. Let me walk through the relevant pseudocode.
within DeltaSync.read()
```
// set right checkpoint value
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
```
Note that the first if condition checks Checkpoint_RESET_Key, whereas the
else-if condition checks Checkpoint_Key.
within write()
```
// towards the end
commitMetadata.put(Checkpoint_Key, updatedCheckpoint); // checkpoint after writing this batch
if (cfg.checkpoint != null) {
  commitMetadata.add(Checkpoint_RESET_Key);
}
```
If cfg.checkpoint is set, it is honored only during the first round. At the
end of the first batch, we add Checkpoint_RESET_Key to the commit metadata, so
in subsequent batches the checkpoint is read from commitMetadata instead.
With this PR, the only addition is a new checkpoint type. Let me propose a
simple add-on to the above code that would work for us.
within DeltaSync.read()
```
// set right checkpoint value
boolean resetCheckpointType = true; // New addition
if (cfg.checkpoint != null && !commitMetadata.contains(Checkpoint_RESET_Key)) {
  checkpoint = cfg.checkpoint;
  resetCheckpointType = false; // New addition
} else if (commitMetadata.contains(Checkpoint_Key)) {
  checkpoint = commitMetadata.get(Checkpoint_Key);
} else {
  checkpoint = Option.empty();
}
// New addition
if (resetCheckpointType) {
  // reset checkpoint type if set
}
```
No other changes are required. This is based on the assumption that
Checkpoint_RESET_Key and the checkpoint type go hand in hand. During the first
batch, the checkpoint type may be set, but no Checkpoint_RESET_Key exists yet.
From the second batch onward, it should be the reverse: the checkpoint type
should not be used, but Checkpoint_RESET_Key should be part of the commit
metadata. Given this assumption, we don't need to add the checkpoint type to
commitMetadata, and we can still decide whether to use the checkpoint type or
not.
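
To make the two-batch behavior concrete, here is a minimal, self-contained Java sketch of the proposal. It is not the actual DeltaSync code: the class, the key strings, the resolve/write helpers, and the value stored under the reset key are all hypothetical stand-ins, and the commit metadata is modeled as a plain map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class CheckpointResolutionSketch {
    // Hypothetical key names; the real constants live elsewhere in the project.
    static final String CHECKPOINT_KEY = "checkpoint.key";
    static final String CHECKPOINT_RESET_KEY = "checkpoint.reset_key";

    /** What the read side decided to use for this batch. */
    static final class Resolved {
        final Optional<String> checkpoint;
        final Optional<String> checkpointType;

        Resolved(Optional<String> checkpoint, Optional<String> checkpointType) {
            this.checkpoint = checkpoint;
            this.checkpointType = checkpointType;
        }
    }

    /** Mirrors the proposed read() logic: pick the checkpoint, then decide on the type. */
    static Resolved resolve(String cfgCheckpoint, String cfgCheckpointType,
                            Map<String, String> lastCommitMetadata) {
        boolean resetCheckpointType = true;
        Optional<String> checkpoint;
        if (cfgCheckpoint != null && !lastCommitMetadata.containsKey(CHECKPOINT_RESET_KEY)) {
            // First batch: honor the configured checkpoint and keep its type.
            checkpoint = Optional.of(cfgCheckpoint);
            resetCheckpointType = false;
        } else if (lastCommitMetadata.containsKey(CHECKPOINT_KEY)) {
            // Later batches: resume from the checkpoint stored by the previous commit.
            checkpoint = Optional.of(lastCommitMetadata.get(CHECKPOINT_KEY));
        } else {
            checkpoint = Optional.empty();
        }
        // Drop the configured type whenever the configured checkpoint was not used.
        Optional<String> type = resetCheckpointType
            ? Optional.empty()
            : Optional.ofNullable(cfgCheckpointType);
        return new Resolved(checkpoint, type);
    }

    /** Mirrors the write() logic: build the metadata recorded at the end of a batch. */
    static Map<String, String> write(String cfgCheckpoint, String updatedCheckpoint) {
        Map<String, String> commitMetadata = new HashMap<>();
        commitMetadata.put(CHECKPOINT_KEY, updatedCheckpoint);
        if (cfgCheckpoint != null) {
            // Assumption: store the configured checkpoint as the marker value.
            commitMetadata.put(CHECKPOINT_RESET_KEY, cfgCheckpoint);
        }
        return commitMetadata;
    }

    public static void main(String[] args) {
        String cfgCheckpoint = "ckpt-0";          // hypothetical config value
        String cfgCheckpointType = "custom-type"; // hypothetical config value

        // Batch 1: no previous commit, so the configured checkpoint and type apply.
        Resolved first = resolve(cfgCheckpoint, cfgCheckpointType, new HashMap<>());
        System.out.println("batch 1: " + first.checkpoint + " / " + first.checkpointType);

        // Batch 2: the reset key is now present, so the stored checkpoint wins
        // and the configured type is ignored.
        Map<String, String> metadata = write(cfgCheckpoint, "ckpt-1");
        Resolved second = resolve(cfgCheckpoint, cfgCheckpointType, metadata);
        System.out.println("batch 2: " + second.checkpoint + " / " + second.checkpointType);
    }
}
```

In this sketch the checkpoint type never reaches the commit metadata; the presence of the reset key alone decides whether the configured type is used.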