[
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667902#comment-13667902
]
Mck SembWever commented on CASSANDRA-2388:
------------------------------------------
{quote}The biggest problem is [avoiding endpoints in a different DC]. Maybe the
way to do this is to change the getSplits logic to never return replicas in
another DC. I think this would require adding DC info to the describe_ring call{quote}
Tasktrackers may have access to a set of datacenters, so this DC info needs
to contain a list of DCs.
For example, our setup separates datacenters by physical datacenter and
Hadoop usage, like:
{noformat}
DC1 "Production + Hadoop"
  c*01 c*03
DC2 "Production + Hadoop"
  c*02 c*04
DC3 "Production"
  c*05
DC4 "Production"
  c*06
{noformat}
So here we'd pass DC info like "DC1,DC2" to getSplits().
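A rough sketch of what that filtering could look like, assuming getSplits() had an endpoint-to-DC mapping available (which describe_ring does not provide as discussed above). The class and method names here are hypothetical and not part of Cassandra's API; the topology is the one from the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Cassandra.
final class DcFilter {

    /** Parses a comma-separated DC list such as "DC1,DC2" into a set. */
    static Set<String> parseAllowedDcs(String dcInfo) {
        Set<String> dcs = new LinkedHashSet<>();
        for (String dc : dcInfo.split(",")) {
            String trimmed = dc.trim();
            if (!trimmed.isEmpty()) {
                dcs.add(trimmed);
            }
        }
        return dcs;
    }

    /** Keeps only endpoints whose DC is in the allowed set, preserving order. */
    static List<String> filterEndpoints(List<String> endpoints,
                                        Map<String, String> endpointToDc,
                                        Set<String> allowedDcs) {
        List<String> kept = new ArrayList<>();
        for (String endpoint : endpoints) {
            String dc = endpointToDc.get(endpoint);
            // Endpoints with an unknown DC are dropped rather than guessed at.
            if (dc != null && allowedDcs.contains(dc)) {
                kept.add(endpoint);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumed mapping; in practice this would have to come from the ring description.
        Map<String, String> endpointToDc = new LinkedHashMap<>();
        endpointToDc.put("c*01", "DC1");
        endpointToDc.put("c*03", "DC1");
        endpointToDc.put("c*02", "DC2");
        endpointToDc.put("c*04", "DC2");
        endpointToDc.put("c*05", "DC3");
        endpointToDc.put("c*06", "DC4");

        Set<String> allowed = parseAllowedDcs("DC1,DC2");
        List<String> replicas = Arrays.asList("c*05", "c*02", "c*01");
        System.out.println(filterEndpoints(replicas, endpointToDc, allowed));
    }
}
```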
But the problem remains: given a task executing on c*01 that fails to connect
to localhost, we can now prevent a connection to DC3 or DC4, but we can't
favour a connection to any other split in DC1 over anything in DC2. Is this
solvable?
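One possible direction, sketched under the assumption that the record reader knew the DC of each replica and of the local host (neither is available today): drop replicas outside the allowed DCs, then try localhost first, the local DC next, and the other allowed DCs last. All names below are hypothetical and this is not what any of the attached patches do.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Cassandra.
final class ReplicaOrdering {

    /** 0 = the local host, 1 = another host in the local DC, 2 = any other allowed DC. */
    static int rank(String endpoint, String localHost, String localDc,
                    Map<String, String> endpointToDc) {
        if (endpoint.equals(localHost)) {
            return 0;
        }
        String dc = endpointToDc.get(endpoint);
        if (dc != null && dc.equals(localDc)) {
            return 1;
        }
        return 2;
    }

    /** Returns the allowed replicas in the order they should be tried. */
    static List<String> order(List<String> replicas, String localHost,
                              Map<String, String> endpointToDc,
                              Set<String> allowedDcs) {
        String localDc = endpointToDc.get(localHost);
        List<String> candidates = new ArrayList<>();
        for (String replica : replicas) {
            String dc = endpointToDc.get(replica);
            if (dc != null && allowedDcs.contains(dc)) {
                candidates.add(replica);
            }
        }
        // List.sort is stable, so replicas with equal rank keep their original order.
        candidates.sort(Comparator.comparingInt(
                (String r) -> rank(r, localHost, localDc, endpointToDc)));
        return candidates;
    }

    public static void main(String[] args) {
        // Assumed mapping, taken from the example setup above.
        Map<String, String> endpointToDc = new HashMap<>();
        endpointToDc.put("c*01", "DC1");
        endpointToDc.put("c*03", "DC1");
        endpointToDc.put("c*02", "DC2");
        endpointToDc.put("c*04", "DC2");
        endpointToDc.put("c*05", "DC3");
        endpointToDc.put("c*06", "DC4");

        Set<String> allowed = new HashSet<>(Arrays.asList("DC1", "DC2"));
        List<String> replicas = Arrays.asList("c*05", "c*02", "c*03", "c*01");
        // prints [c*01, c*03, c*02]
        System.out.println(order(replicas, "c*01", endpointToDc, allowed));
    }
}
```

With an ordering like this, a task on c*01 that cannot reach localhost would fall back to c*03 before c*02, and would never try c*05.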
> ColumnFamilyRecordReader fails for a given split because a host is down, even
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 0.6
> Reporter: Eldon Stegall
> Assignee: Mck SembWever
> Priority: Minor
> Labels: hadoop, inputformat
> Fix For: 1.2.6
>
> Attachments: 0002_On_TException_try_next_split.patch,
> CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch,
> CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch,
> CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We
> should try multiple locations for a given split.