[
https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230511#comment-13230511
]
Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------
I can see there is a misunderstanding about our Hadoop cluster setup. We use
quite a common setup, which is why I made this ticket "critical".
Our setup looks like this:
||node||Components||
|node1|Datanode, Tasktracker, Cassandra|
|node2|Datanode, Tasktracker, Cassandra|
|node3|Datanode, Tasktracker, Cassandra|
|node4|Datanode, Tasktracker, Cassandra|
|node5|Namenode, Jobtracker, our mapreduce jobs|
All with rpc_endpoints: 0.0.0.0
I'd say this is quite a reasonable, clean setup.
The problem is that with this setup we can't use Cassandra 0.8.8 and above
because of the problem I've described earlier today. With this exact setup our
jobs simply fail to start because there is no Cassandra on node5. Moving our
jobs to, for example, node1 allows them to run, but CFIF gets wrong split sizes
(it asks just 0.0.0.0 for all of the key ranges) and there are tasks that show
thousands of percent of progress. Please carefully read my earlier post about
describe_splits().
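To make that concrete, here is a minimal sketch of the kind of check you can run against the cluster, assuming the 0.8 Thrift API (the host name "node1" and keyspace "MyKeyspace" are placeholders). It issues the same describe_ring() call CFIF relies on; with rpc_address bound to all interfaces, every rpc_endpoint in the reply comes back as 0.0.0.0, so that is the only address CFIF has to ask for key ranges:
{code:java}
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.TokenRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DescribeRingCheck
{
    public static void main(String[] args) throws Exception
    {
        // "node1" and "MyKeyspace" are placeholders for this sketch.
        TTransport transport = new TFramedTransport(new TSocket("node1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // The same call ColumnFamilyInputFormat uses to discover token ranges.
        for (TokenRange range : client.describe_ring("MyKeyspace"))
        {
            System.out.println("endpoints:     " + range.endpoints);
            // With rpc_address set to 0.0.0.0 this prints 0.0.0.0 for every node.
            System.out.println("rpc_endpoints: " + range.rpc_endpoints);
        }
        transport.close();
    }
}
{code}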
We use rpc_endpoint: 0.0.0.0 because we have other non-Hadoop components that
connect to Cassandra for data and they are on different interfaces.
I hope I've explained the setup well enough that you can understand why, from
my point of view, it is critical. With Cassandra 0.8.8 and above our Hadoop
jobs either fail to start or fail to complete their work.
We have quite wide rows (even tens of thousands of columns) and a write-heavy
cluster, so we use batch.size=512 and split.size=8196 for our Hadoop jobs. That
may or may not be connected to the wrong key ranges.
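For completeness, this is roughly how those two values get set on the job configuration; a minimal sketch, assuming the 0.8-era ConfigHelper setters (the keyspace and column family names are placeholders, exact setter names may differ slightly between versions):
{code:java}
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class WideRowJobSetup
{
    public static void main(String[] args) throws Exception
    {
        Job job = new Job(new Configuration(), "wide-row-scan");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        Configuration conf = job.getConfiguration();

        // Placeholder keyspace / column family names.
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyCF");

        // Rows fetched per get_range_slices call (cassandra.range.batch.size)
        // and rows targeted per input split (cassandra.input.split.size).
        ConfigHelper.setRangeBatchSize(conf, 512);
        ConfigHelper.setInputSplitSize(conf, 8196);
    }
}
{code}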
Regards,
Patrik
> Empty rpc_address prevents running MapReduce job outside a cluster
> ------------------------------------------------------------------
>
> Key: CASSANDRA-3811
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3811
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 0.8.9, 0.8.10
> Environment: Debian Stable,
> Cassandra 0.8.9,
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03),
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
> Reporter: Patrik Modesto
> Priority: Minor
>
> Setting rpc_address to empty to make Cassandra listen on all network
> interfaces breaks running a mapreduce job from outside the cluster. The jobs
> won't even start, showing these messages:
> {noformat}
> 12/01/26 11:15:21 DEBUG hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
> java.io.IOException: unable to connect to server
>     at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
>     at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
>     at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
>     ... 9 more
> Caused by: java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>     at java.net.Socket.connect(Socket.java:529)
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
>     ... 11 more
> ...
> Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:156)
>     ... 19 more
> Caused by: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:241)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> {noformat}
> describe_ring returns:
> {noformat}
> describe_ring returns:
> endpoints: 10.0.18.129,10.0.18.99,10.0.18.98
> rpc_endpoints: 0.0.0.0,0.0.0.0,0.0.0.0
> {noformat}
> [Michael
> Frisch|http://www.mail-archive.com/[email protected]/msg20180.html]
> found a possible bug in the Cassandra source:
> {quote}
> If the code in the 0.8 branch is reflective of what is actually included in
> Cassandra 0.8.9 (here:
> http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java)
> then the problem is that line 202 is doing an == comparison on strings. The
> correct way to compare would be endpoint_address.equals("0.0.0.0") instead.
> - Mike
> {quote}
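For context, a minimal sketch of the comparison Mike is pointing at, assuming the surrounding code looks roughly like the getSubSplits() loop in the 0.8 branch (field and method names are approximations, not the verbatim source):
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.cassandra.thrift.TokenRange;

public class EndpointFallbackSketch
{
    // Sketch of the address selection getSubSplits() should perform for each
    // token range returned by describe_ring(); not the verbatim 0.8 source.
    static List<String> candidateHosts(TokenRange range)
    {
        List<String> hosts = new ArrayList<String>();
        for (int i = 0; i < range.rpc_endpoints.size(); i++)
        {
            String host = range.rpc_endpoints.get(i);

            // Broken variant from the report: '==' compares object identity,
            // which never matches a string deserialized from Thrift, so the
            // fallback is skipped and CFIF keeps dialing 0.0.0.0.
            //   if (host == null || host == "0.0.0.0")

            // Intended variant: compare by value and fall back to the node's
            // gossip/listen address for this range.
            if (host == null || host.equals("0.0.0.0"))
                host = range.endpoints.get(i);

            hosts.add(host);
        }
        return hosts;
    }
}
{code}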