[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

Mck SembWever (JIRA) Sat, 18 May 2013 14:51:18 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661436#comment-13661436
 ]


Mck SembWever commented on CASSANDRA-2388:
------------------------------------------

Jonathan,
 I can't say i'm in favour of enforcing data locality.
Because  data locality in hadoop doesn't work this way… when a tasktracker 
through the next heartbeat announces that it has a task slot free the 
jobtracker will do its best to assign a task with data locality to it but 
failing this will assign it a random task. the number of these random tasks can 
be quite high, just like i mentioned above
{quote} Tasks are still being evenly distributed around the ring regardless of 
what the ColumnFamilySplit.locations is. {quote}

This can be almost solved by upgrading to hadoop-0.21+, using the fair 
scheduler and setting the property {code}<property>
        <name>mapred.fairscheduler.locality.delay</name>
        <value>360000000</value>
<property>{code}.

At the end of the day while hadoop encourages data locality it does not enforce 
it.
The ideal approach would be to sort all locations by proximity.
The feasible approach hopefully is still [~tjake]'s above. In addition i'd be 
in favour of a setting in the job's configuration as to whether a location from 
another datacenter can be used.

references:
 - http://www.infoq.com/articles/HadoopInputFormat
 - http://www.mentby.com/matei-zaharia/running-only-node-local-jobs.html
 - 
https://groups.google.com/a/cloudera.org/forum/?fromgroups#!topic/cdh-user/3ggnE5hV0PY
 - http://www.cs.berkeley.edu/~matei/papers/2010/eurosys_delay_scheduling.pdf
                
> ColumnFamilyRecordReader fails for a given split because a host is down, even 
> if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.2.5
>
>         Attachments: 0002_On_TException_try_next_split.patch, 
> CASSANDRA-2388-addition1.patch, CASSANDRA-2388-extended.patch, 
> CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, 
> CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We 
> should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.

Reply via email to