[ https://issues.apache.org/jira/browse/HADOOP-6827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905027#action_12905027 ]
Steve Loughran commented on HADOOP-6827:
----------------------------------------
I think the term "rack" confuses people; we have
# a physical rack topology (which matters for maintenance, power outages,
fork-lift truck related incidents, etc.)
# a network topology (which should really be a hierarchy of switches, with some
metadata about cost/throttling on each one, maybe even extending to clusters in
the same or separate facilities)
# a power topology which could map from a server name to its UPS ID and perhaps
a shared PSU ID, so you'd know to prefer replicating content across different
UPSs and across different shared physical PSUs
# HDD batches whose failures may not be independent.
# Disks inside each server
Really, we should know all these things, so that people can start worrying
about them. Currently everyone discusses "rack awareness" when really it tends
to be switch awareness, but once you start looking at failure modes of a
cluster as in HDFS-1094, the differences matter.
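
To make the distinction concrete, here is a minimal Java sketch of a per-node
record that keeps these domains separate instead of folding them into one
"rack" string. It is only an illustration under stated assumptions: the class
NodeDomains, its fields and the method sharedWith are hypothetical names, not
part of Hadoop, and the network topology is reduced to a single switch ID
rather than a full hierarchy.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-node record; none of these names come from Hadoop. */
final class NodeDomains {
    final String host;
    final String physicalRack; // physical rack: maintenance, fork-lift incidents
    final String switchId;     // network topology, reduced here to one switch
    final String upsId;        // power topology
    final String hddBatch;     // drive batch whose failures may be correlated

    NodeDomains(String host, String physicalRack, String switchId,
                String upsId, String hddBatch) {
        this.host = host;
        this.physicalRack = physicalRack;
        this.switchId = switchId;
        this.upsId = upsId;
        this.hddBatch = hddBatch;
    }

    /** Returns the names of the failure domains this node shares with other. */
    List<String> sharedWith(NodeDomains other) {
        List<String> shared = new ArrayList<>();
        if (physicalRack.equals(other.physicalRack)) shared.add("rack");
        if (switchId.equals(other.switchId)) shared.add("switch");
        if (upsId.equals(other.upsId)) shared.add("ups");
        if (hddBatch.equals(other.hddBatch)) shared.add("hddBatch");
        return shared;
    }
}

final class FailureDomainDemo {
    public static void main(String[] args) {
        NodeDomains a = new NodeDomains("node1", "rackA", "sw1", "ups1", "batch7");
        NodeDomains b = new NodeDomains("node2", "rackB", "sw2", "ups1", "batch7");
        // Different racks and switches, yet the two nodes still share a UPS
        // and a drive batch.
        System.out.println(a.sharedWith(b)); // prints [ups, hddBatch]
    }
}
```

Two nodes that look independent under "rack awareness" can still fail
together, which is the case a single rack label cannot express.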
> Extend the API of NetworkTopology
> ---------------------------------
>
> Key: HADOOP-6827
> URL: https://issues.apache.org/jira/browse/HADOOP-6827
> Project: Hadoop Common
> Issue Type: New Feature
> Reporter: Rodrigo Schmidt
> Assignee: Rodrigo Schmidt
>
> The current API for NetworkTopology is too restricted (and somewhat biased
> towards the default block placement algorithm).
> For instance, it lacks methods to list racks or limit machines within a rack,
> which are fundamental for HDFS-1094.
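
As a rough illustration of the kind of methods the description asks for, the
following Java sketch keeps a rack-to-hosts index and exposes a rack listing
and a bounded per-rack host listing. Everything here is an assumption made for
illustration: RackIndex, listRacks and nodesInRack are hypothetical names and
are not the NetworkTopology API, and "limit machines within a rack" is read as
"return at most N hosts from a given rack".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical sketch; this is not the Hadoop NetworkTopology class. */
final class RackIndex {
    // Sorted containers keep the listings deterministic.
    private final Map<String, SortedSet<String>> nodesByRack = new TreeMap<>();

    /** Registers a host under the given rack. */
    void add(String rack, String host) {
        nodesByRack.computeIfAbsent(rack, r -> new TreeSet<>()).add(host);
    }

    /** Returns all known racks in sorted order. */
    List<String> listRacks() {
        return new ArrayList<>(nodesByRack.keySet());
    }

    /** Returns at most limit hosts from the rack; empty if the rack is unknown. */
    List<String> nodesInRack(String rack, int limit) {
        List<String> out = new ArrayList<>();
        SortedSet<String> hosts =
            nodesByRack.getOrDefault(rack, Collections.<String>emptySortedSet());
        for (String host : hosts) {
            if (out.size() >= limit) {
                break;
            }
            out.add(host);
        }
        return out;
    }
}
```

A placement policy could then call listRacks() to enumerate candidates and
nodesInRack(rack, n) to restrict itself to a fixed number of machines per rack.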