ZanderXu opened a new pull request, #5430: URL: https://github.com/apache/hadoop/pull/5430
Jira: [HDFS-16933](https://issues.apache.org/jira/browse/HDFS-16933)

We encountered a problem where the NameNode randomly ends up with wrong ownership after loading the same fsimage when parallel fsimage loading is enabled. After tracing the problem, we found that there may be a race in SerialNumberMap.

```
public int get(T t) {
  if (t == null) {
    return 0;
  }
  Integer sn = t2i.get(t);
  if (sn == null) {
    // Assume there are two threads with different t, such as:
    //   T1 with hbase
    //   T2 with hdfs
    // If T1 and T2 run concurrently, they can end up with the same sn, such as 10.
    sn = current.getAndIncrement();
    if (sn > max) {
      current.getAndDecrement();
      throw new IllegalStateException(name + ": serial number map is full");
    }
    Integer old = t2i.putIfAbsent(t, sn);
    if (old != null) {
      current.getAndDecrement();
      return old;
    }
    // If T1 puts 10 -> hbase into i2t first, T2 will overwrite it with 10 -> hdfs.
    // The inodes then get the wrong owner hdfs when it should actually be hbase.
    i2t.put(sn, t);
  }
  return sn;
}
```

There are two mappings in SerialNumberMap, t2i and i2t. They should be safely updated together.
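One way to update both mappings together is sketched below. This is an illustration only, not necessarily the change made in this pull request: the class name `SafeSerialNumberMap`, its constructor, and the `lookup` method are hypothetical. It assumes that allocating a serial number and writing both maps under a single lock is acceptable, because the lock is taken only the first time a value is seen.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical, simplified stand-in for SerialNumberMap (not the actual patch).
class SafeSerialNumberMap<T> {
  private final String name;
  private final int max;
  private int current; // next serial number to hand out; guarded by "this"
  private final ConcurrentMap<T, Integer> t2i = new ConcurrentHashMap<>();
  private final ConcurrentMap<Integer, T> i2t = new ConcurrentHashMap<>();

  SafeSerialNumberMap(String name, int first, int max) {
    this.name = name;
    this.current = first;
    this.max = max;
  }

  public int get(T t) {
    if (t == null) {
      return 0;
    }
    Integer sn = t2i.get(t); // lock-free fast path for values already known
    if (sn == null) {
      synchronized (this) {
        sn = t2i.get(t); // re-check: another thread may have inserted t
        if (sn == null) {
          if (current > max) {
            throw new IllegalStateException(name + ": serial number map is full");
          }
          // The number is allocated and both maps are written inside one
          // critical section, so no serial number is ever rolled back or
          // shared by two different values.
          sn = current++;
          // Publish the reverse mapping first, so that a reader who sees
          // the t2i entry can always resolve it through i2t.
          i2t.put(sn, t);
          t2i.put(t, sn);
        }
      }
    }
    return sn;
  }

  // Hypothetical reverse lookup; 0 is reserved for null, as in get(T).
  public T lookup(int sn) {
    return sn == 0 ? null : i2t.get(sn);
  }
}
```

Because the counter is only advanced when a new entry is certain to be inserted, the sketch needs no `getAndDecrement()` rollback path, and concurrent callers passing `hbase` and `hdfs` always receive distinct serial numbers.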
