[ https://issues.apache.org/jira/browse/STORM-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jungtaek Lim resolved STORM-3122.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.2.3

Merged into 1.x-branch.

> FNFE due to race condition between "async localizer" and "update blob" timer thread
> -----------------------------------------------------------------------------------
>
>                 Key: STORM-3122
>                 URL: https://issues.apache.org/jira/browse/STORM-3122
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core
>    Affects Versions: 1.x
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Critical
>              Labels: pull-request-available
>             Fix For: 1.2.3
>
>          Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> There's a race condition between the "async localizer" and the "update blob"
> timer thread.
> When a worker is shutting down, the reference count for its blobs drops to 0
> and the supervisor removes the actual blob files. There is also an "update
> blob" timer thread, which tries to keep blobs updated for downloaded
> topologies. While updating a topology it has to read some of the blob files,
> assuming they have already been downloaded, and that assumption is broken by
> the async localizer.
> [~arunmahadevan] suggested an approach to fix this: "updateBlobsForTopology"
> can just catch the FileNotFoundException and skip updating the blobs when it
> can't find the stormconf. This looks like the simplest fix, so I'll provide a
> patch based on the suggestion.
> Btw, this doesn't apply to the master branch, since there all blobs are
> synced separately (no need to read stormconf to enumerate topology-related
> blobs) and the update logic is already fault-tolerant (it skips to the next
> sync when it can't pull the blob).
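For readers who want to see the shape of the suggested fix, here is a minimal Java sketch of "catch the FileNotFoundException and skip". It is not the merged patch: the class name, the boolean return value, and the file handling are assumptions made for illustration, and only the method name "updateBlobsForTopology" comes from the issue text.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for the supervisor's periodic "update blob" task.
class BlobUpdateSketch {

    /**
     * Tries to read the topology's stormconf file before updating blobs.
     * Returns true if the file was read (the update would proceed) and
     * false if the file was missing and this round was skipped.
     */
    static boolean updateBlobsForTopology(File stormConf) throws IOException {
        try (InputStream in = new FileInputStream(stormConf)) {
            byte[] buf = new byte[4096];
            // Consume the file; a real implementation would parse the conf
            // here to find out which blobs belong to the topology.
            while (in.read(buf) != -1) {
                // nothing to do in this sketch
            }
            return true;
        } catch (FileNotFoundException e) {
            // The localizer may already have removed the file because the
            // worker shut down. Skip this round instead of failing the timer.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path, used only to exercise the "missing file" branch.
        File missing = new File(System.getProperty("java.io.tmpdir"),
                "stormconf-missing-" + System.nanoTime() + ".yaml");
        System.out.println("updated = " + updateBlobsForTopology(missing));
    }
}
```

Other I/O errors still propagate in this sketch, so only the "file already removed" case is treated as a harmless skip.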