shekhars-li opened a new pull request #1582: URL: https://github.com/apache/samza/pull/1582
Problems addressed:
1. During init, SnapshotIndex blobs of stores that were previously configured and have a blob ID in the checkpoint topic are downloaded and deserialized. This fails if the blob is missing from the blob store or was incorrectly serialized, which causes the container to fail during init and, subsequently, the job to fail.
2. SnapshotIndexSerde was incorrectly serializing the SnapshotIndex blob, possibly because the Jackson-datatype-JDK8 module was not loaded at runtime.

Fix:
1. The backup/restore managers already receive the set of currently configured stores from ContainerStorageManager (CSM). We pass this set to BlobStoreUtil and download only the blobs relevant to it, which avoids the problem of missing or badly serialized blobs for stores that are no longer configured.
2. We move the loading of the Jackson-datatype-JDK8 module to SamzaObjectMapper to ensure the module is loaded and present at runtime when the SnapshotIndexSerde object is created. We also added a debug line to verify the hypothesis and check whether the module is available at runtime.

Tests: Added a new unit test to verify that SnapshotIndex blobs are downloaded only for configured stores. Updated some existing unit tests to match the new method signature of `getSnapshotIndexes()`.
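
For illustration, the store-filtering idea behind the first fix can be sketched roughly as below. This is a minimal, self-contained sketch and not the code in this PR: the class `SnapshotIndexFilter`, the method `selectConfiguredStores`, and the representation of the checkpoint as a map from store name to blob ID are all hypothetical; the actual change is in BlobStoreUtil and `getSnapshotIndexes()`.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Samza: illustrates narrowing checkpointed
// SnapshotIndex blob IDs down to the currently configured stores.
final class SnapshotIndexFilter {
  private SnapshotIndexFilter() { }

  /**
   * Returns only the entries of checkpointedBlobIds (store name -> SnapshotIndex
   * blob ID) whose store name is in configuredStores. Insertion order is preserved.
   */
  static Map<String, String> selectConfiguredStores(
      Map<String, String> checkpointedBlobIds, Set<String> configuredStores) {
    Map<String, String> selected = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : checkpointedBlobIds.entrySet()) {
      // Skip stores that appear in the checkpoint but are no longer configured,
      // so their (possibly missing or corrupt) blobs are never fetched.
      if (configuredStores.contains(entry.getKey())) {
        selected.put(entry.getKey(), entry.getValue());
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    Map<String, String> checkpointed = new LinkedHashMap<>();
    checkpointed.put("current-store", "blob-1");
    checkpointed.put("removed-store", "blob-2"); // previously configured, now dropped

    Map<String, String> toDownload =
        selectConfiguredStores(checkpointed, Set.of("current-store"));
    System.out.println(toDownload);
  }
}
```

Filtering before any download means a missing or unreadable blob for a removed store is never touched during init, which is the failure mode described in the first problem.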
