kbuci opened a new pull request, #18123:
URL: https://github.com/apache/hudi/pull/18123
**Summary:** When using the ZooKeeper-based lock provider, the lock node in
ZooKeeper now stores metadata (including application id) so lock holders can be
identified. Application id is taken from the engine context (e.g. Spark
application id) and passed through write config into lock config and into
`HoodieInterProcessMutex`, which writes it into the ZK lock node.
**Changelog:**
- **hudi-common**
  - `LockConfiguration`: Added `LOCK_HOLDER_APP_ID_KEY`
    (`hoodie.write.lock.app_id`).
  - `HoodieEngineContext`: Added default `getApplicationId()` returning
    `"Unknown"`.
- **hudi-client-common**
  - Added `HoodieInterProcessMutex`: Wraps Curator `InterProcessMutex` and
    overrides `getLockNodeBytes()` to set lock node data from
    `LockConfiguration` (including application id).
  - `BaseZookeeperBasedLockProvider`: Uses `HoodieInterProcessMutex` instead
    of `InterProcessMutex`, passing `LockConfiguration` so lock node bytes
    include app id.
  - `LockManager`: When building `LockConfiguration`, copies lock props and
    sets `LOCK_HOLDER_APP_ID_KEY` from `writeConfig.getApplicationId()` so the
    lock provider receives the app id.
  - `HoodieWriteConfig`: Added `applicationId` (default `"Unknown"`),
    `getApplicationId()`, and `setApplicationId(String)`.
  - `BaseHoodieWriteClient`: In both constructors that take
    `HoodieEngineContext`, added
    `config.setApplicationId(context.getApplicationId())` so the write config
    gets the engine’s application id for use by `LockManager`.
- **hudi-spark-client**
  - `HoodieSparkEngineContext`: Overrode `getApplicationId()` to return
    `javaSparkContext.sc().applicationId()`.
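
The propagation path listed above (engine context, then write config, then lock config, then lock node bytes) can be sketched with stdlib-only stand-ins. This is an illustration under stated assumptions, not the code in this PR: `EngineContext`, `SparkLikeContext`, `MutexBase`, `AppIdMutex`, and `LockMetadataSketch` are hypothetical stand-ins for `HoodieEngineContext`, `HoodieSparkEngineContext`, Curator's `InterProcessMutex`, `HoodieInterProcessMutex`, and the `LockManager` step. Only the key `hoodie.write.lock.app_id`, the `"Unknown"` default, and the `application_id=<value>` payload come from this description; UTF-8 encoding and a single-field payload are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical stand-in for HoodieEngineContext: engines that do not
// override getApplicationId() report "Unknown".
interface EngineContext {
  default String getApplicationId() {
    return "Unknown";
  }
}

// Hypothetical stand-in for an engine context that knows its application id,
// as HoodieSparkEngineContext does for Spark.
class SparkLikeContext implements EngineContext {
  private final String appId;

  SparkLikeContext(String appId) {
    this.appId = appId;
  }

  @Override
  public String getApplicationId() {
    return appId;
  }
}

// Hypothetical stand-in for Curator's InterProcessMutex, reduced to the one
// hook that matters here: the bytes stored in the lock node.
class MutexBase {
  protected byte[] getLockNodeBytes() {
    return null; // no custom payload in this stand-in
  }

  // Simulates the data that would be written to the lock node.
  final byte[] nodeData() {
    byte[] bytes = getLockNodeBytes();
    return bytes == null ? new byte[0] : bytes;
  }
}

// Hypothetical stand-in for HoodieInterProcessMutex: overrides the hook so
// the lock node carries the application id from the lock properties.
class AppIdMutex extends MutexBase {
  static final String LOCK_HOLDER_APP_ID_KEY = "hoodie.write.lock.app_id";

  private final Properties lockProps;

  AppIdMutex(Properties lockProps) {
    this.lockProps = lockProps;
  }

  @Override
  protected byte[] getLockNodeBytes() {
    String appId = lockProps.getProperty(LOCK_HOLDER_APP_ID_KEY, "Unknown");
    // Assumed payload layout: a single UTF-8 "application_id=<value>" entry.
    return ("application_id=" + appId).getBytes(StandardCharsets.UTF_8);
  }
}

class LockMetadataSketch {
  // Mirrors the LockManager step: copy the lock props, then set the app id.
  static Properties buildLockProps(Properties writeProps, EngineContext ctx) {
    Properties copy = new Properties();
    copy.putAll(writeProps);
    copy.setProperty(AppIdMutex.LOCK_HOLDER_APP_ID_KEY, ctx.getApplicationId());
    return copy;
  }

  // Returns the text that would end up in the lock node for this context.
  static String nodeDataFor(EngineContext ctx) {
    Properties lockProps = buildLockProps(new Properties(), ctx);
    return new String(new AppIdMutex(lockProps).nodeData(), StandardCharsets.UTF_8);
  }
}
```

Copying the properties before setting the app id, as the `LockManager` bullet describes, keeps the caller's configuration untouched; the sketch follows the same choice.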
### Impact
- **User-facing:** No breaking changes to public APIs; the new accessors and
config key are additive. Existing ZK lock config continues to work;
application id defaults to `"Unknown"` if not set.
- **Behavior:** Lock nodes created by the ZK lock provider now store
`application_id=<value>` in the node data, so tools (e.g. `zkCli.sh`) can show
which application holds the lock. Spark users get the Spark application id
automatically; other engines keep `"Unknown"` unless they override
`getApplicationId()` or set it on `HoodieWriteConfig`.
- **Performance:** Negligible (one string in lock config and in ZK node
data).
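
For operators, reading the holder back out of the node data could look roughly like the sketch below. `LockHolderInspector` is a hypothetical helper, not part of this PR. It assumes the node data is UTF-8 text in which `application_id=<value>` appears on its own line, possibly next to other metadata; the exact layout is not specified in this description. Fetching the bytes from ZooKeeper is left out so the example stays stdlib-only.

```java
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper: extracts the application id from lock node data.
class LockHolderInspector {
  static final String APP_ID_PREFIX = "application_id=";

  static Optional<String> applicationId(byte[] nodeData) {
    if (nodeData == null || nodeData.length == 0) {
      return Optional.empty(); // node carries no metadata
    }
    String text = new String(nodeData, StandardCharsets.UTF_8);
    // Scan line by line so additional metadata fields do not break parsing.
    for (String line : text.split("\\R")) {
      String trimmed = line.trim();
      if (trimmed.startsWith(APP_ID_PREFIX)) {
        return Optional.of(trimmed.substring(APP_ID_PREFIX.length()));
      }
    }
    return Optional.empty(); // field not present in this node
  }
}
```

Returning `Optional` rather than a default string lets a caller tell a node that has no `application_id` field apart from one whose holder reported `"Unknown"`.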
### Risk Level
**Low.** Changes are additive and backward compatible: default application
id is `"Unknown"`, and existing ZK lock behavior is unchanged except for the
extra metadata in the lock node. `TestHoodieInterProcessMutex` verifies
`getLockNodeBytes()` behavior; existing ZK lock tests remain valid.
### Documentation Update
None. No new user-facing config is required; `hoodie.write.lock.app_id` is
optional and used internally when set via write config. No website or
config-doc changes needed unless we later document this for operators.
### Contributor's checklist
- [ ] Read through [contributor's
guide](https://hudi.apache.org/contribute/how-to-contribute)
- [x] Enough context is provided in the sections above
- [x] Adequate tests were added if applicable