[ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13667902#comment-13667902
 ] 

Mck SembWever commented on CASSANDRA-2388:
------------------------------------------

{quote}The biggest problem is [avoiding endpoints in a different DC]. Maybe the 
way to do this is to change the getSplits logic to never return replicas in another DC. 
I think this would require adding DC info to the describe_ring call{quote}

Tasktrackers may have access to a set of datacenters, so this DC info needs 
to contain a list of DCs.

For example, our setup separates datacenters by physical datacenter and 
Hadoop usage, like:{noformat}DC1 "Production + Hadoop"
  c*01 c*03
DC2 "Production + Hadoop"
  c*02 c*04
DC3 "Production"
  c*05
DC4 "Production"
  c*06{noformat}

So here we'd pass to getSplits() a DC info like "DC1,DC2".
But the problem remains: given a task executing on c*01 that fails to connect to 
localhost, we can now prevent a connection to DC3 or DC4, but we still can't 
favour a connection to any other split in DC1 over anything in DC2. Is this 
solvable?
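
To make the desired ordering concrete, here is a rough sketch only. It assumes the client could somehow obtain a host-to-DC lookup, which (per the quote above) describe_ring does not provide today. The class ReplicaOrdering, the method order() and the dcOf map are hypothetical names for illustration, not existing Cassandra API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Cassandra: orders a split's replicas by DC preference. */
class ReplicaOrdering {

    /**
     * @param replicas   replica hosts for one split, in the order they were returned
     * @param dcOf       assumed host -> datacenter lookup (not available from describe_ring)
     * @param allowedDcs DCs the tasktracker may connect to, e.g. {"DC1", "DC2"}
     * @param localHost  host the task is executing on
     * @return replicas to try, most preferred first; disallowed DCs are dropped
     */
    static List<String> order(List<String> replicas, Map<String, String> dcOf,
                              Set<String> allowedDcs, String localHost) {
        String localDc = dcOf.get(localHost);
        List<String> candidates = new ArrayList<>();
        for (String host : replicas) {
            String dc = dcOf.get(host);
            // Drop replicas whose DC is unknown or not allowed (DC3, DC4 in the example).
            if (dc != null && allowedDcs.contains(dc)) {
                candidates.add(host);
            }
        }
        // Rank: 0 = local host, 1 = same DC as the task, 2 = any other allowed DC.
        // List.sort is stable, so equally ranked hosts keep their original order.
        candidates.sort(Comparator.comparingInt((String host) ->
                host.equals(localHost) ? 0 : dcOf.get(host).equals(localDc) ? 1 : 2));
        return candidates;
    }
}
```

With the layout above, a task on c*01 reading a split replicated on c*05, c*02, c*03 and c*01 would try c*01 first, then c*03, then c*02, and never c*05. Whether the record reader can actually be given this information is the open part of the question.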
                
> ColumnFamilyRecordReader fails for a given split because a host is down, even 
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.2.6
>
>         Attachments: 0002_On_TException_try_next_split.patch, 
> CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, 
> CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
> CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We 
> should try multiple locations for a given split.

