[
https://issues.apache.org/jira/browse/HDFS-6821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Nauroth resolved HDFS-6821.
---------------------------------
Resolution: Won't Fix
Hi, [~samera].
Ideas similar to this have been proposed several times. The consensus has
always been that pushing a recursive operation all the way to the NameNode for
atomicity would impact throughput too severely. The implementation would
require holding the write lock while updating every inode in a subtree. During
that time, all other RPC caller threads would block waiting for release of the
write lock. A finer-grained locking implementation would help mitigate this,
but it wouldn't eliminate the problem completely.
It's typical behavior in many file systems that recursive operations are driven
from user space, and the syscalls modify a single inode at a time. HDFS isn't
different in this respect.
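As a rough sketch of what "driven from user space" means here (using POSIX calls via Python rather than the HDFS client API, so this is an analogy, not HDFS code), a recursive chmod walks the tree and issues one chmod syscall per inode:

```python
import os
import stat
import tempfile

def chmod_recursive(root, mode):
    """Apply `mode` to every inode under `root`, one chmod syscall each.

    A crash partway through leaves some inodes changed and some not --
    the same partial-update window described in this issue.
    """
    os.chmod(root, mode)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.chmod(os.path.join(dirpath, name), mode)

# Build a small tree and chmod it recursively.
base = tempfile.mkdtemp()
sub = os.path.join(base, "sub")
os.mkdir(sub)
open(os.path.join(sub, "f.txt"), "w").close()

chmod_recursive(base, 0o700)
print(oct(stat.S_IMODE(os.stat(os.path.join(sub, "f.txt")).st_mode)))  # 0o700
```

The HDFS shell's chmod -R works the same way: the client enumerates the subtree and issues one setPermission RPC per path, so atomicity is per inode, not per subtree.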
I'm going to resolve this as won't fix.
> Atomicity of multi file operations
> ----------------------------------
>
> Key: HDFS-6821
> URL: https://issues.apache.org/jira/browse/HDFS-6821
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Samer Al-Kiswany
> Priority: Minor
>
> Looking at how HDFS updates the edit log in the case of chmod -R or chown -R
> operations. In these operations, the HDFS NameNode seems to update each file
> separately; consequently, the strace output of the operation looks as follows.
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> -----------------------
> append(edits)
> fsync(edits)
> append(edits)
> fsync(edits)
> If a crash happens in the middle of this operation (e.g. at the dashed line
> in the trace), the system will end up with part of the files updated with the
> new owner or permissions and part still with the old ones.
> Isn't it better to log the whole operation (chown -R) as one entry in the
> edit log?
--
This message was sent by Atlassian JIRA
(v6.2#6252)