bkonold opened a new pull request #1367:
URL: https://github.com/apache/samza/pull/1367
**Issues**: `TaskSideInputStorageManager` conflates processing and storage
logic. This is problematic because adding support for transactional state in
standby containers requires heavy additions to both. Refactoring is necessary
to keep `TaskSideInputStorageManager` from becoming a difficult-to-maintain
behemoth (like `ContainerStorageManager` is now). After this change, support
for transactional state in standby containers can be introduced and evolved
cleanly.
**Changes**: Processing logic is moved out of `TaskSideInputStorageManager`
and into a new class `TaskSideInputHandler`. This includes: coordination of
oldestOffsets, lastProcessedOffsets, startingOffsets, and processing behavior
for a given SSP envelope.
Management of stores' StorageEngines remains in
`TaskSideInputStorageManager` as it is today.
`ContainerStorageManager` now interfaces only with `TaskSideInputHandler` to
handle side input lifecycle, processing, and flush.
There is **NO NEW** functionality added in this patch, only a refactor.
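To illustrate the split described above, here is a minimal, self-contained sketch. It is not the code in this PR: the class names `StorageManagerSketch` and `HandlerSketch`, their method signatures, and the use of plain strings for SSPs and offsets are all assumptions made for illustration. It only shows the intended shape, with offset bookkeeping and per-envelope processing on the handler side, store ownership on the storage side, and the caller talking only to the handler.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the storage role that stays in TaskSideInputStorageManager.
// It only owns store contents and knows how to flush them.
class StorageManagerSketch {
    private final Map<String, String> store = new HashMap<>();
    private int flushCount = 0;

    void put(String key, String value) {
        store.put(key, value);
    }

    String get(String key) {
        return store.get(key);
    }

    void flush() {
        flushCount++; // a real store would persist its contents here
    }

    int getFlushCount() {
        return flushCount;
    }
}

// Hypothetical stand-in for the processing role moved into TaskSideInputHandler.
// It tracks offsets per SSP and decides what to write for each envelope.
class HandlerSketch {
    private final StorageManagerSketch storageManager;
    private final Map<String, String> lastProcessedOffsets = new HashMap<>();

    HandlerSketch(StorageManagerSketch storageManager) {
        this.storageManager = storageManager;
    }

    // Processes one envelope: writes to the store, then records the offset.
    void process(String ssp, String offset, String key, String value) {
        storageManager.put(key, value);
        lastProcessedOffsets.put(ssp, offset);
    }

    String getLastProcessedOffset(String ssp) {
        return lastProcessedOffsets.get(ssp);
    }

    // Flush is requested through the handler, which delegates to storage.
    void flush() {
        storageManager.flush();
    }
}

class SideInputSplitSketch {
    public static void main(String[] args) {
        StorageManagerSketch storage = new StorageManagerSketch();
        HandlerSketch handler = new HandlerSketch(storage);

        // The caller (the ContainerStorageManager role) only touches the handler.
        handler.process("side-input-0", "41", "member-1", "profile-a");
        handler.process("side-input-0", "42", "member-1", "profile-b");
        handler.flush();

        System.out.println(handler.getLastProcessedOffset("side-input-0"));
        System.out.println(storage.get("member-1"));
    }
}
```

In this sketch the caller never reaches into the storage class directly, which mirrors the statement that `ContainerStorageManager` interfaces only with `TaskSideInputHandler`.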
**Tests**: Existing unit tests for `TaskSideInputStorageManager` have either
been adapted to the class's new responsibilities or moved and re-implemented
for `TaskSideInputHandler`.
**API Changes**: None. The classes touched are internal to Samza.
**Upgrade Instructions**: None.
**Usage Instructions**: None.
**NOTES**: Since it is difficult to tell where code came from when new classes
are written, I've included comments / annotations across the PR to indicate
the source of code that appears as an "addition".