[ http://issues.apache.org/jira/browse/HADOOP-171?page=comments#action_12376786 ]
Konstantin Shvachko commented on HADOOP-171:
--------------------------------------------
I think there are two different issues here:
1) Asking for a replication hint.
This is what highReplicationHint() is for.
We can add a strategy parameter, making it look like
short highReplicationHint( enum strategy )
This one is easy since it just returns a number based on the number of
registered data nodes.
Still, it is enough for choosing the replication of submitted job files
(.jar, .xml).
2) Dynamically maintaining the replication of a file based on a chosen
strategy and cluster size.
This is hard.
Do we really need that now?
Are submitted job files removed after the job is done?
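To make item 1 concrete, here is a minimal sketch of what such a hint could look like. It is an illustration only, not the proposed patch: the class name, the Strategy constants, and the explicit registeredDataNodes argument are all assumptions (in the name node the count would presumably come from its own datanode registry). The constant 10 and the square root are the two starting points suggested in the issue description.

```java
// Hypothetical sketch: none of these names exist in the Hadoop code base.
final class ReplicationHint {

    /** Illustrative strategies for picking a "high" replication. */
    enum Strategy { CONSTANT, SQRT, ALL_NODES }

    /**
     * Returns a replication hint derived only from the number of
     * registered data nodes. No per-file state is kept, so the value
     * does not adjust itself if the cluster size changes later.
     */
    static short highReplicationHint(Strategy strategy, int registeredDataNodes) {
        if (registeredDataNodes <= 0) {
            return 1; // nothing registered yet: fall back to a single replica
        }
        long hint;
        switch (strategy) {
            case CONSTANT:
                hint = 10; // the constant suggested in the issue description
                break;
            case SQRT:
                hint = (long) Math.ceil(Math.sqrt(registeredDataNodes));
                break;
            case ALL_NODES:
                hint = registeredDataNodes;
                break;
            default:
                throw new IllegalArgumentException("unknown strategy: " + strategy);
        }
        // Never ask for more replicas than there are nodes,
        // and stay within the range of a short.
        hint = Math.min(hint, registeredDataNodes);
        hint = Math.min(hint, Short.MAX_VALUE);
        return (short) Math.max(1, hint);
    }
}
```

A caller such as JobClient could ask for the hint once when it copies job.xml and job.jar and pass the result as the replication of those files. That covers item 1 only; nothing here re-replicates existing files when nodes join or leave, which is the hard part in item 2.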
> need standard API to set dfs replication = high
> -----------------------------------------------
>
> Key: HADOOP-171
> URL: http://issues.apache.org/jira/browse/HADOOP-171
> Project: Hadoop
> Type: New Feature
> Components: dfs
> Versions: 0.2
> Reporter: Doug Cutting
> Assignee: Konstantin Shvachko
>
> There should be a standard way to indicate that files should be highly
> replicated, appropriate for files that all nodes will read. This should be
> settable both on file creation and for already-existing files. Perhaps
> specifying a particular replication value, like Short.MAX_VALUE, or zero, can
> be used to signal this. The level should not be constant, but should be
> relative to the cluster size and network topography. If more nodes are added
> or if nodes are deleted, the actual replication count should increase or
> decrease.
> Initially, all that is needed is an API to specify this. It could initially
> be implemented with a constant (e.g., 10) or with something related to the
> number of datanodes (sqrt?), and needn't auto-adjust as the cluster size
> changes. That is only the long-term goal.
> When JobClient copies job files (job.xml & job.jar) into the job's
> filesystem, it should specify this replication level.