[ https://issues.apache.org/jira/browse/CASSANDRA-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Ellis updated CASSANDRA-4886:
--------------------------------------
Affects Version/s: (was: 1.1.6)
Fix Version/s: 2.0 (was: 1.1.6)
> Remote ColumnFamilyInputFormat
> ------------------------------
>
> Key: CASSANDRA-4886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4886
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Scott Fines
> Fix For: 2.0
>
> Attachments: CASSANDRA-4886.patch
>
>
> As written, the ColumnFamilyInputFormat does not have a great deal of fault
> tolerance.
> It only attempts to perform a read from a single replica, with an infinite
> timeout. If that replica is not available, then the Task fails, and must be
> retried on a different node.
> This is fine if the TaskTrackers are colocated with Cassandra nodes, but is
> very fragile when this is not possible. When the TaskTrackers are remote to
> Cassandra, the same rules about clients should apply: there should be a
> strict (configurable) timeout, and the ability to retry requests on a
> different replica if a single request fails.
> It seems obvious that we'd want to support both types of architecture; to do
> that, we should probably have a configuration which allows the user to
> specify their architecture choices explicitly.
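
For illustration only, the behaviour requested above (a bounded per-request timeout plus failover to another replica) could be sketched roughly as follows. This is not the attached patch and not an existing Cassandra API: the class ReplicaFailover, the method readWithFailover, and its parameters are all hypothetical names, and a replica is represented by a plain host string.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical sketch; none of these names exist in Cassandra. */
final class ReplicaFailover {

    /**
     * Tries each replica in order, allowing each attempt at most
     * timeoutMillis. Returns the first successful result, or throws
     * once every replica has failed or timed out.
     */
    static <T> T readWithFailover(List<String> replicas,
                                  Function<String, T> readFromReplica,
                                  long timeoutMillis) throws Exception {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas given");
        }
        // Daemon threads so a hung read cannot keep the JVM alive.
        ExecutorService pool = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Exception last = null;
        try {
            for (String replica : replicas) {
                Callable<T> task = () -> readFromReplica.apply(replica);
                Future<T> attempt = pool.submit(task);
                try {
                    // Strict, configurable timeout instead of waiting forever.
                    return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    // Give up on this replica and move on to the next one.
                    attempt.cancel(true);
                    last = e;
                }
            }
        } finally {
            pool.shutdownNow();
        }
        throw new Exception("all " + replicas.size() + " replicas failed", last);
    }
}
```

In a colocated deployment the replica list would hold only the local node, so a failure would still surface as a failed Task and be rescheduled by Hadoop as it is today. A remote deployment would pass every replica for the split.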