[
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048671#comment-13048671
]
Mck SembWever commented on CASSANDRA-2388:
------------------------------------------
{quote}
bq. public String[] sort_endpoints_by_proximity(String endpoint, String[]
endpoints, boolean restrictToSameDC)
I don't think it makes sense to send the client endpoint to this call since the
endpoint might not be a cassandra node. It's a reasonable assumption that the
endpoint it's talking to is local enough to the client to use that.
{quote}
For the test set I was running against, RF=2 and each split has two endpoints,
always in different datacenters.
If the "local" endpoint is down, getLocations() will then call
client.sort_endpoints_by_proximity(..)
The connection to the "local" endpoint (the initialAddress) will naturally fail too.
It then makes a client connection through the "other" endpoint. \[see
CFRR.describeDatacenter(..)].
That connection will presume the wrong datacenter, and the "other" endpoint will
return itself as a valid endpoint.
I need some way to know what the original datacenter is, even when the "local"
endpoint is down.
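
To make the failure mode concrete, here is a minimal sketch, not the actual CFRR or Thrift code. The class SplitEndpointFilter, the method liveInSameDatacenter, the addresses, and the datacenter names are all hypothetical; the endpoint-to-datacenter map stands in for whatever the cluster's snitch would report. It only shows why the datacenter has to be resolved from the original endpoint rather than from whichever endpoint the fallback connection happens to reach.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class SplitEndpointFilter {
    private SplitEndpointFilter() {}

    // Returns the live endpoints that sit in the same datacenter as referenceEndpoint.
    // datacenterOf is an assumed lookup table standing in for the snitch.
    static List<String> liveInSameDatacenter(String referenceEndpoint,
                                             List<String> endpoints,
                                             Map<String, String> datacenterOf,
                                             Set<String> liveEndpoints) {
        String referenceDc = datacenterOf.get(referenceEndpoint);
        List<String> result = new ArrayList<>();
        for (String endpoint : endpoints) {
            boolean sameDc = referenceDc != null
                    && referenceDc.equals(datacenterOf.get(endpoint));
            if (sameDc && liveEndpoints.contains(endpoint)) {
                result.add(endpoint);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> datacenterOf = new HashMap<>();
        datacenterOf.put("10.0.1.1", "DC1"); // "local" replica, currently down
        datacenterOf.put("10.0.2.1", "DC2"); // "other" replica, still up
        List<String> splitEndpoints = Arrays.asList("10.0.1.1", "10.0.2.1");
        Set<String> live = Collections.singleton("10.0.2.1");

        // Datacenter taken from the fallback connection: the remote node
        // looks local to itself and is returned as a valid endpoint.
        System.out.println(liveInSameDatacenter("10.0.2.1", splitEndpoints, datacenterOf, live));
        // prints [10.0.2.1]

        // Datacenter taken from the original endpoint: no live replica in DC1,
        // so the caller can see that any read would cross datacenters.
        System.out.println(liveInSameDatacenter("10.0.1.1", splitEndpoints, datacenterOf, live));
        // prints []
    }
}
```

With restrictToSameDC set, the first result is the behaviour described above; the second is what I would expect if the original datacenter were known.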
> ColumnFamilyRecordReader fails for a given split because a host is down, even
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Reporter: Eldon Stegall
> Assignee: Mck SembWever
> Labels: hadoop, inputformat
> Fix For: 0.8.1
>
> Attachments: 0002_On_TException_try_next_split.patch,
> CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We
> should try multiple locations for a given split.