[
https://issues.apache.org/jira/browse/HDFS-13101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682387#comment-16682387
]
Wei-Chiu Chuang commented on HDFS-13101:
----------------------------------------
A quick update:
After HDFS-13314 was applied, we were finally able to identify a set of
fsimages + edits that reproduces the corruption reliably. Spoiler: the fsimage
was already partially corrupted before HDFS-13314 aborted the checkpoint, so it
took us a little while to understand the other sequences of events leading to
the null pointer. The gist of the issue is a trash deletion followed by
snapshot deletions.
The fix for this bug is still in progress. [~sodonnell], [~smeng] and
[~adam.antal] have put in a tremendous number of hours on it, and they've been
actively working on the reproduction for the past few weeks.
[~yzhangal] mind if I reassign this jira to [~smeng] for tracking purposes?
> Yet another fsimage corruption related to snapshot
> --------------------------------------------------
>
> Key: HDFS-13101
> URL: https://issues.apache.org/jira/browse/HDFS-13101
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Priority: Major
>
> Lately we saw a case similar to HDFS-9406, even though the HDFS-9406 fix is
> present, so it's likely another case not covered by the fix. We are currently
> trying to collect good fsimage + editlogs to replay, reproduce it, and
> investigate.