[
https://issues.apache.org/jira/browse/CASSANDRA-3811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230511#comment-13230511
]
Patrik Modesto commented on CASSANDRA-3811:
-------------------------------------------
I can see there is a misunderstanding about our Hadoop cluster setup. We use
quite a common setup, which is why I made this ticket "critical".
Our setup looks like this:
||node||Components||
|node1|Datanode, Tasktracker, Cassandra|
|node2|Datanode, Tasktracker, Cassandra|
|node3|Datanode, Tasktracker, Cassandra|
|node4|Datanode, Tasktracker, Cassandra|
|node5|Namenode, Jobtracker, our mapreduce jobs|
All with rpc_endpoints: 0.0.0.0
I'd say this is quite a reasonable, clean setup.
The problem is that with this setup we can't use Cassandra 0.8.8 and above
because of the problem I've described earlier today. With this exact setup our
jobs simply fail to start because there is no Cassandra on node5. Moving our
jobs to, for example, node1 allows them to run, but CFIF gets wrong split sizes
(it asks just 0.0.0.0 for all of the key ranges) and there are tasks that show
thousands of percent of progress. Please carefully read my earlier post about
describe_splits().
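To make that concrete, here is a minimal sketch of the kind of check you can run against the cluster, assuming the 0.8 Thrift API (the host name "node1" and keyspace "MyKeyspace" are placeholders). It issues the same describe_ring() call CFIF relies on; with rpc_address bound to all interfaces, every rpc_endpoint in the reply comes back as 0.0.0.0, so that is the only address CFIF has to ask for key ranges:
{code:java}
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.TokenRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class DescribeRingCheck
{
    public static void main(String[] args) throws Exception
    {
        // "node1" and "MyKeyspace" are placeholders for this sketch.
        TTransport transport = new TFramedTransport(new TSocket("node1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // The same call ColumnFamilyInputFormat uses to discover token ranges.
        for (TokenRange range : client.describe_ring("MyKeyspace"))
        {
            System.out.println("endpoints:     " + range.endpoints);
            // With rpc_address set to 0.0.0.0 this prints 0.0.0.0 for every node.
            System.out.println("rpc_endpoints: " + range.rpc_endpoints);
        }
        transport.close();
    }
}
{code}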
We use rpc_endpoint: 0.0.0.0 because we have other non-Hadoop components that
connect to Cassandra for data and they are on different interfaces.
I hope I've explained the setup well enough that you can understand why, from
my point of view, it is critical. With Cassandra 0.8.8 and above our Hadoop
jobs either fail to start or fail to complete their work.
We have quite wide rows (even tens of thousands of columns) and a write-heavy
cluster, so we use batch.size=512 and split.size=8196 for our Hadoop jobs. That
may or may not be connected to the wrong key ranges.
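For completeness, this is roughly how those two values get set on the job configuration; a minimal sketch, assuming the 0.8-era ConfigHelper setters (the keyspace and column family names are placeholders, exact setter names may differ slightly between versions):
{code:java}
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class WideRowJobSetup
{
    public static void main(String[] args) throws Exception
    {
        Job job = new Job(new Configuration(), "wide-row-scan");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        Configuration conf = job.getConfiguration();

        // Placeholder keyspace / column family names.
        ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyCF");

        // Rows fetched per get_range_slices call (cassandra.range.batch.size)
        // and rows targeted per input split (cassandra.input.split.size).
        ConfigHelper.setRangeBatchSize(conf, 512);
        ConfigHelper.setInputSplitSize(conf, 8196);
    }
}
{code}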
Regards,
Patrik
> Empty rpc_address prevents running MapReduce job outside a cluster
> ------------------------------------------------------------------
>
> Key: CASSANDRA-3811
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3811
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 0.8.9, 0.8.10
> Environment: Debian Stable,
> Cassandra 0.8.9,
> Java(TM) SE Runtime Environment (build 1.6.0_26-b03),
> Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
> Reporter: Patrik Modesto
> Priority: Minor
>
> Setting rpc_address to empty to make Cassandra listen on all network
> interfaces breaks running a mapreduce job from outside the cluster. The jobs
> won't even start, showing these messages:
> {noformat}
> 12/01/26 11:15:21 DEBUG hadoop.ColumnFamilyInputFormat: failed connect to endpoint 0.0.0.0
> java.io.IOException: unable to connect to server
>     at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:389)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:224)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:183)
>     at org.apache.thrift.transport.TFramedTransport.open(TFramedTransport.java:81)
>     at org.apache.cassandra.hadoop.ConfigHelper.createConnection(ConfigHelper.java:385)
>     ... 9 more
> Caused by: java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
>     at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:211)
>     at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
>     at java.net.Socket.connect(Socket.java:529)
>     at org.apache.thrift.transport.TSocket.open(TSocket.java:178)
>     ... 11 more
> ...
> Caused by: java.util.concurrent.ExecutionException: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
>     at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSplits(ColumnFamilyInputFormat.java:156)
>     ... 19 more
> Caused by: java.io.IOException: failed connecting to all endpoints 10.0.18.129,10.0.18.99,10.0.18.98
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.getSubSplits(ColumnFamilyInputFormat.java:241)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat.access$200(ColumnFamilyInputFormat.java:73)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:193)
>     at org.apache.cassandra.hadoop.ColumnFamilyInputFormat$SplitCallable.call(ColumnFamilyInputFormat.java:178)
>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> {noformat}
> describe_ring returns:
> {noformat}
> describe_ring returns:
> endpoints: 10.0.18.129,10.0.18.99,10.0.18.98
> rpc_endpoints: 0.0.0.0,0.0.0.0,0.0.0.0
> {noformat}
> [Michael
> Frisch|http://www.mail-archive.com/[email protected]/msg20180.html]
> found a possible bug in the Cassandra source:
> {quote}
> If the code in the 0.8 branch is reflective of what is actually included in
> Cassandra 0.8.9 (here:
> http://svn.apache.org/repos/asf/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/hadoop/ColumnFamilyInputFormat.java)
> then the problem is that line 202 is doing an == comparison on strings. The
> correct way to compare would be endpoint_address.equals("0.0.0.0") instead.
> - Mike
> {quote}
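For context, a minimal sketch of the comparison Mike is pointing at, assuming the surrounding code looks roughly like the getSubSplits() loop in the 0.8 branch (field and method names are approximations, not the verbatim source):
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.cassandra.thrift.TokenRange;

public class EndpointFallbackSketch
{
    // Sketch of the address selection getSubSplits() should perform for each
    // token range returned by describe_ring(); not the verbatim 0.8 source.
    static List<String> candidateHosts(TokenRange range)
    {
        List<String> hosts = new ArrayList<String>();
        for (int i = 0; i < range.rpc_endpoints.size(); i++)
        {
            String host = range.rpc_endpoints.get(i);

            // Broken variant from the report: '==' compares object identity,
            // which never matches a string deserialized from Thrift, so the
            // fallback is skipped and CFIF keeps dialing 0.0.0.0.
            //   if (host == null || host == "0.0.0.0")

            // Intended variant: compare by value and fall back to the node's
            // gossip/listen address for this range.
            if (host == null || host.equals("0.0.0.0"))
                host = range.endpoints.get(i);

            hosts.add(host);
        }
        return hosts;
    }
}
{code}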