HeartSaVioR opened a new pull request, #54083:
URL: https://github.com/apache/spark/pull/54083
### What changes were proposed in this pull request?
This PR changes the default delimiter of the RocksDB merge operator to an
empty string, so that a merge operation concatenates the two values directly,
with no delimiter character between them.
Changing the delimiter is not compatible with existing checkpoints, so the
change is gated by a SQL config and relies on the known offset log metadata
trick to apply the new delimiter only to new streaming queries.
* New SQL config: `spark.sql.streaming.stateStore.rocksdb.mergeOperatorVersion`
  * Default: 2 ('' as delimiter)
  * Default for existing checkpoints: 1 (',' as delimiter)
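As a rough illustration of the version gating described above, the sketch below resolves the merge operator version for a query. It is not Spark's code. The function names and the idea of passing the offset log's recorded confs as a plain mapping are hypothetical. It also assumes that a restarted query prefers the value recorded in its offset log and falls back to version 1 when nothing was recorded. Only the config key, the two versions, and their delimiters come from this PR description.

```python
from typing import Mapping, Optional

CONF_KEY = "spark.sql.streaming.stateStore.rocksdb.mergeOperatorVersion"

# Values taken from the PR description.
DEFAULT_VERSION_NEW_QUERY = 2
DEFAULT_VERSION_EXISTING_CHECKPOINT = 1
DELIMITER_BY_VERSION = {1: b",", 2: b""}


def resolve_merge_operator_version(
    offset_log_conf: Optional[Mapping[str, str]],
    session_conf: Mapping[str, str],
) -> int:
    """Hypothetical helper: pick the merge operator version for a query.

    offset_log_conf is None for a brand-new query (no checkpoint yet);
    otherwise it holds the confs recorded in the existing checkpoint.
    """
    if offset_log_conf is None:
        # New query: use the session value, or the new default.
        return int(session_conf.get(CONF_KEY, DEFAULT_VERSION_NEW_QUERY))
    recorded = offset_log_conf.get(CONF_KEY)
    if recorded is None:
        # Assumption: a checkpoint written before this config existed
        # must keep the old ',' delimiter.
        return DEFAULT_VERSION_EXISTING_CHECKPOINT
    # Assumption: the recorded value wins over the session value.
    return int(recorded)


def delimiter_for(version: int) -> bytes:
    """Map a merge operator version to its delimiter bytes."""
    return DELIMITER_BY_VERSION[version]
```

Under these assumptions, a new query resolves to version 2 (empty delimiter), while a query restarted from an older checkpoint keeps version 1 (',' delimiter) no matter what the session config says.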
### Why are the changes needed?
We found that, with the current delimiter, there is no way to distinguish two
cases: 1) a put against a non-existent value followed by a merge, and 2) a
merge against a non-existent value followed by a merge. There has been an
implicit assumption that operators call merge only when they know the key
already exists. This effectively requires a GET operation, which can have a
significant performance impact depending on the logic.
Making the delimiter an empty string (i.e. no delimiter) eliminates the
difference between the two cases, allowing operators to perform a blind merge
without checking whether the key exists.
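The access-pattern difference can be sketched with a toy in-memory store. This is not RocksDB or Spark code: `ToyStore`, `guarded_append`, and `blind_append` are hypothetical names. The merge semantics used here (store the operand as-is when the key is absent, otherwise append the delimiter and the operand) are a simplifying assumption. The sketch only uses the empty delimiter, and only shows that a blind merge then produces the same bytes as a get-then-put-or-merge sequence while skipping the GET.

```python
from typing import Dict, Optional


class ToyStore:
    """In-memory stand-in for a key-value store with an append-style merge."""

    def __init__(self, delimiter: bytes = b"") -> None:
        self.delimiter = delimiter
        self.data: Dict[bytes, bytes] = {}
        self.get_calls = 0  # counts lookups issued by callers

    def get(self, key: bytes) -> Optional[bytes]:
        self.get_calls += 1
        return self.data.get(key)

    def put(self, key: bytes, value: bytes) -> None:
        self.data[key] = value

    def merge(self, key: bytes, operand: bytes) -> None:
        # Assumed semantics for illustration only.
        existing = self.data.get(key)  # internal read, not counted
        if existing is None:
            self.data[key] = operand
        else:
            self.data[key] = existing + self.delimiter + operand


def guarded_append(store: ToyStore, key: bytes, value: bytes) -> None:
    """Check existence first, then put or merge (needs one GET per call)."""
    if store.get(key) is None:
        store.put(key, value)
    else:
        store.merge(key, value)


def blind_append(store: ToyStore, key: bytes, value: bytes) -> None:
    """Merge without checking whether the key exists."""
    store.merge(key, value)


guarded = ToyStore(delimiter=b"")
blind = ToyStore(delimiter=b"")
for v in (b"a", b"b", b"c"):
    guarded_append(guarded, b"k", v)
    blind_append(blind, b"k", v)
```

In this toy model both stores end up holding the same value for the key, but the guarded path issues one lookup per append and the blind path issues none.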
### Does this PR introduce _any_ user-facing change?
No, the change is internal and there is no user-facing change.
### How was this patch tested?
Added UTs.
### Was this patch authored or co-authored using generative AI tooling?
Co-authored by claude-4.5-sonnet
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]