[
https://issues.apache.org/jira/browse/HDFS-2635?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163802#comment-13163802
]
[email protected] commented on HDFS-2635:
---------------------------------------
My submitted patch fixes the file duplication by incorporating the mapper's key
into the filename. The combination of hostname and map key results in a unique
filename across multiple hosts with a variable number of map slots per host.
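
As a rough illustration of why adding the map key removes the collisions, here
is a minimal, standalone sketch. It is not the code from hdfs-2635.patch: the
class name, the method names, and the exact per-mapper layout
(file_<hostname>_<mapKey>_<filenum>) are assumptions made for this example.
Only the original file_<hostname>__<filenum> format comes from the issue
description.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not taken from NNBench or the attached patch.
final class NNBenchNameSketch {

    // Original scheme from the issue description: file_<hostname>__<filenum>
    static String perHostName(String hostname, int fileNum) {
        return "file_" + hostname + "__" + fileNum;
    }

    // Assumed per-mapper scheme: the separator and field order are guesses.
    static String perMapperName(String hostname, String mapKey, int fileNum) {
        return "file_" + hostname + "_" + mapKey + "_" + fileNum;
    }

    // Counts the distinct names produced when `mappers` mappers on one host
    // each create `files` files, under either naming scheme.
    static int distinctNames(String hostname, int mappers, int files, boolean perMapper) {
        Set<String> names = new HashSet<>();
        for (int m = 0; m < mappers; m++) {
            for (int f = 0; f < files; f++) {
                names.add(perMapper
                        ? perMapperName(hostname, String.valueOf(m), f)
                        : perHostName(hostname, f));
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        // Two mappers on one client, three files each.
        System.out.println(distinctNames("client1", 2, 3, false)); // prints 3
        System.out.println(distinctNames("client1", 2, 3, true));  // prints 6
    }
}
```

With the per-host scheme, both mappers produce the same three names, so they
contend for the same files. With the map key included, each mapper gets its own
set.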
> NNBench creates duplicate files if multiple maps are run from the same client
> -----------------------------------------------------------------------------
>
> Key: HDFS-2635
> URL: https://issues.apache.org/jira/browse/HDFS-2635
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 0.20.203.0, 0.20.204.0, 0.20.205.0
> Reporter: [email protected]
> Priority: Minor
> Attachments: hdfs-2635.patch
>
>
> NNBench creates files in the format:
> file_<hostname>__<filenum>
> This works seamlessly as long as all of the Hadoop clients in the cluster are
> each running with a single map slot. If multiple map slots are available on a
> single client, each mapper tries to create the same set of files. This can
> result in lock contention on some non-Hadoop HDFS implementations, thereby
> defeating the purpose of the NNBench test.
> Making the files unique per mapper, not per host, resolves this.