shekhars-li opened a new pull request #1582:
URL: https://github.com/apache/samza/pull/1582


   Problems addressed:
   
   1. During init, SnapshotIndex blobs are downloaded and deserialized for 
every store that was previously configured and has a blob ID in the checkpoint 
topic. This causes problems if a blob is missing from the blob store or was 
incorrectly serialized: the container fails during init and, subsequently, the 
job fails. 
   2. SnapshotIndexSerde was incorrectly serializing the SnapshotIndex blob, 
possibly because the Jackson-datatype-JDK8 module was not being loaded at 
runtime. 
   
   Fix:
   
   1. Backup/Restore managers already receive the set of currently configured 
stores from ContainerStorageManager (CSM). We pass this set to BlobStoreUtil 
and download only the blobs relevant to it, which avoids failures caused by 
missing or badly serialized blobs for stores that are no longer configured. 
   2. We move the loading of the Jackson-datatype-JDK8 module to 
SamzaObjectMapper to ensure the module is loaded and present at runtime when 
the SnapshotIndexSerde object is created. We also added a debug line to verify 
the hypothesis by checking whether the module is available at runtime. 
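
   The filtering in fix 1 can be illustrated with a minimal sketch. This is 
not the actual BlobStoreUtil code: the class `ConfiguredStoreFilter`, the 
method `filterToConfiguredStores`, and the plain `Map<String, String>` of store 
name to blob ID are hypothetical stand-ins, assumed here only to show the idea 
of skipping checkpoint entries for stores outside the configured set. 

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not Samza's BlobStoreUtil API.
final class ConfiguredStoreFilter {
  private ConfiguredStoreFilter() {}

  /**
   * Returns only the checkpoint entries (store name -> SnapshotIndex blob ID)
   * whose store name is in the currently configured set. Entries for other
   * stores are skipped, so their blobs are never downloaded or deserialized.
   */
  static Map<String, String> filterToConfiguredStores(
      Map<String, String> checkpointedBlobIds, Set<String> configuredStores) {
    Map<String, String> toDownload = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : checkpointedBlobIds.entrySet()) {
      if (configuredStores.contains(entry.getKey())) {
        toDownload.put(entry.getKey(), entry.getValue());
      }
    }
    return toDownload;
  }

  public static void main(String[] args) {
    Map<String, String> checkpointed = new LinkedHashMap<>();
    checkpointed.put("storeA", "blob-1");
    checkpointed.put("removedStore", "blob-2"); // no longer configured

    // prints {storeA=blob-1}
    System.out.println(filterToConfiguredStores(checkpointed, Set.of("storeA")));
  }
}
```

   With this shape, a missing or corrupt blob for `removedStore` cannot fail 
init, because that entry is dropped before any download is attempted. 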
   
   Tests:
   Added a new unit test to verify that only the SnapshotIndex blobs for 
configured stores are downloaded. Updated some existing unit tests to match the 
new method signature of `getSnapshotIndexes()`.



