[ https://issues.apache.org/jira/browse/CASSANDRA-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14166651#comment-14166651 ]
mck commented on CASSANDRA-4886:
--------------------------------
AFAIK everything Thrift-related is frozen :( so I presume the patch isn't going
to be applied to master.
> Remote ColumnFamilyInputFormat
> ------------------------------
>
> Key: CASSANDRA-4886
> URL: https://issues.apache.org/jira/browse/CASSANDRA-4886
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Scott Fines
> Fix For: 3.0
>
> Attachments: CASSANDRA-4886.patch
>
>
> As written, the ColumnFamilyInputFormat does not have a great deal of fault
> tolerance.
> It only attempts to perform a read from a single replica, with an infinite
> timeout. If that replica is not available, then the Task fails, and must be
> retried on a different node.
> This is fine if the TaskTrackers are colocated with Cassandra nodes, but is
> very fragile when this is not possible. When the TaskTrackers are remote to
> Cassandra, the same rules about clients should apply--there should be a
> strict (configurable) timeout, and the ability to retry requests on a
> different replica if a single request fails.
> It seems obvious that we'd want to support both types of architecture; to do
> that, we should probably have a configuration which allows the user to
> specify their architecture choices explicitly.
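
For illustration only, the behaviour requested in the description (a bounded
per-request timeout, then a retry against the next replica) could be sketched
roughly as below. This is not the attached patch and not the
ColumnFamilyInputFormat API: the class ReplicaFailover, the method
readWithFailover, the replica addresses and the read function are all
hypothetical names invented for the sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper: none of these names exist in Cassandra or Hadoop.
final class ReplicaFailover {

    /**
     * Tries each replica in order. An attempt is abandoned if it throws or
     * does not finish within timeoutMillis; the next replica is then tried.
     * Throws IllegalStateException if every replica fails.
     */
    static <T> T readWithFailover(List<String> replicas,
                                  Function<String, T> read,
                                  long timeoutMillis) {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas given");
        }
        ExecutorService pool = Executors.newCachedThreadPool();
        Exception lastFailure = null;
        try {
            for (String replica : replicas) {
                Callable<T> task = () -> read.apply(replica);
                Future<T> attempt = pool.submit(task);
                try {
                    // Strict timeout instead of waiting forever on one replica.
                    return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    attempt.cancel(true); // interrupt the abandoned read
                    lastFailure = e;      // remember why, then move on
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(
                        "interrupted while reading from " + replica, e);
                }
            }
        } finally {
            pool.shutdownNow(); // interrupts any read that is still running
        }
        throw new IllegalStateException("all replicas failed", lastFailure);
    }

    public static void main(String[] args) {
        // Made-up addresses; the first replica is simulated as unavailable.
        List<String> replicas = Arrays.asList("10.0.0.1", "10.0.0.2");
        String rows = readWithFailover(replicas, host -> {
            if (host.equals("10.0.0.1")) {
                throw new IllegalStateException("replica down");
            }
            return "rows from " + host;
        }, 500);
        System.out.println(rows);
    }
}
```

In this sketch the colocated case from the description would correspond to
passing a single local replica, and the remote case to passing the full
replica list with a configured timeout. Cancelling an attempt only interrupts
its thread, so a read that ignores interruption would keep running in the
background until it returns.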