RE: MapReduce with two ethernet cards
Looks like that did it, thanks!

Scott

From: Brandon Williams [dri...@gmail.com]
Sent: Thursday, October 13, 2011 2:16 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

On Thu, Oct 13, 2011 at 1:17 PM, Scott Fines scott.fi...@nisc.coop wrote:

When I look at the source for ColumnFamilyInputFormat, it appears that it does a call to client.describe_ring; when you do the equivalent call with nodetool, you get the 10.1.1.* addresses. This seems to indicate to me that I should open up the firewall and attempt to contact those IPs instead of the normal thrift IPs. That leads me to think that I need to have thrift listening on both IPs, though. Would that then be the case?

My mistake, I thought I'd committed this: https://issues.apache.org/jira/browse/CASSANDRA-3214

Can you see if that solves your issue?

-Brandon
RE: MapReduce with two ethernet cards
I upgraded to cassandra 0.8.7, and the problem persists.

Scott

From: Brandon Williams [dri...@gmail.com]
Sent: Monday, October 10, 2011 12:28 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

Your cassandra is old, upgrade to the latest version.

-Brandon
Re: MapReduce with two ethernet cards
What is your rpc_address set to? If it's 0.0.0.0 (bind everything) then that's not going to work if listen_address is blocked.

-Brandon

On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote:

I upgraded to cassandra 0.8.7, and the problem persists.

Scott
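The point about 0.0.0.0 can be illustrated with a small sketch. It assumes, as the reply above implies, that a node whose rpc_address is the wildcard has no concrete client-facing address to hand out, so clients end up with listen_address instead. The class and method names are hypothetical and are not Cassandra code; 172.28.0.24 is a made-up address inside the 172.28.*.* range mentioned in the thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper, not part of Cassandra: models which address a client
// would end up using for a node, given the two address settings discussed here.
final class AdvertisedAddress {

    // Returns rpcAddress unless it is the wildcard (0.0.0.0 or ::), in which
    // case the only concrete address left to hand out is listenAddress.
    static String forClients(String rpcAddress, String listenAddress) {
        try {
            // IP literals are parsed directly; no DNS lookup is performed.
            InetAddress rpc = InetAddress.getByName(rpcAddress);
            return rpc.isAnyLocalAddress() ? listenAddress : rpcAddress;
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("bad rpc address: " + rpcAddress, e);
        }
    }

    public static void main(String[] args) {
        // Wildcard rpc_address: the client is pointed at the internal IP.
        System.out.println(forClients("0.0.0.0", "10.1.1.24"));
        // Concrete rpc_address: the client is pointed at the Thrift-facing IP.
        System.out.println(forClients("172.28.0.24", "10.1.1.24"));
    }
}
```

With a firewall in front of the 10.1.1.* network, only the second case gives an external client an address it can reach.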
RE: MapReduce with two ethernet cards
The listen_address on every machine is set to its 10.1.1.* address, while the Thrift rpc_address is set to its 172.28.* address.

From: Brandon Williams [dri...@gmail.com]
Sent: Thursday, October 13, 2011 12:28 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

What is your rpc_address set to? If it's 0.0.0.0 (bind everything) then that's not going to work if listen_address is blocked.

-Brandon
RE: MapReduce with two ethernet cards
When I look at the source for ColumnFamilyInputFormat, it appears that it does a call to client.describe_ring; when you do the equivalent call with nodetool, you get the 10.1.1.* addresses. This seems to indicate to me that I should open up the firewall and attempt to contact those IPs instead of the normal thrift IPs. That leads me to think that I need to have thrift listening on both IPs, though. Would that then be the case?

Scott

From: Scott Fines [scott.fi...@nisc.coop]
Sent: Thursday, October 13, 2011 12:40 PM
To: user@cassandra.apache.org
Subject: RE: MapReduce with two ethernet cards

The listen address on all machines are set to the 10.1.1.* addresses, while the thrift rpc address is the 172.28.* addresses
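To make the describe_ring observation concrete, here is a rough, self-contained model of a client choosing which address to try for each token range. It is only a sketch: RangeInfo, its field names, and the "client-facing first, internal as fallback" ordering are assumptions for illustration, not the Thrift-generated classes or the actual ColumnFamilyInputFormat logic. The connection attempt is passed in as a predicate so the example runs without a cluster.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model, not Cassandra code.
final class SplitConnector {

    // One entry of a ring description (field names are assumptions).
    static final class RangeInfo {
        final List<String> endpoints;     // internal (listen_address) IPs
        final List<String> rpcEndpoints;  // client-facing (rpc_address) IPs

        RangeInfo(List<String> endpoints, List<String> rpcEndpoints) {
            this.endpoints = endpoints;
            this.rpcEndpoints = rpcEndpoints;
        }
    }

    // Candidate order: client-facing addresses first, internal ones as fallback.
    static List<String> candidates(RangeInfo range) {
        List<String> ordered = new ArrayList<>(range.rpcEndpoints);
        for (String endpoint : range.endpoints) {
            if (!ordered.contains(endpoint)) {
                ordered.add(endpoint);
            }
        }
        return ordered;
    }

    // Returns the first candidate accepted by canConnect, or fails with a
    // message shaped like the one quoted in the thread.
    static String firstReachable(RangeInfo range, Predicate<String> canConnect)
            throws IOException {
        List<String> ordered = candidates(range);
        for (String address : ordered) {
            if (canConnect.test(address)) {
                return address;
            }
        }
        throw new IOException(
                "failed connecting to all endpoints " + String.join(",", ordered));
    }

    public static void main(String[] args) throws IOException {
        RangeInfo range = new RangeInfo(
                Arrays.asList("10.1.1.24"), Arrays.asList("172.28.0.24"));
        // Pretend the firewall only lets the 172.28.* network through.
        System.out.println(firstReachable(range, a -> a.startsWith("172.28.")));
    }
}
```

If a client only ever sees the internal list, every attempt from outside the firewall fails and the loop ends in the IOException, which matches the symptom described in the original report.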
Re: MapReduce with two ethernet cards
On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:

Hi all,

This may be a silly question, but I'm at a bit of a loss, and was hoping for some help.

I have a Cassandra cluster set up with two NICs--one for internal communication between Cassandra machines (10.1.1.*), and one to respond to Thrift RPC (172.28.*.*). I also have a Hadoop cluster set up, which, for unrelated reasons, has to remain separate from Cassandra, so I've written a little MapReduce job to copy data from Cassandra to Hadoop. However, when I try to run my job, I get

java.io.IOException: failed connecting to all endpoints 10.1.1.24,10.1.1.17,10.1.1.16

which is puzzling to me. It seems like the MR job is attempting to connect to the internal communication IPs instead of the external Thrift IPs. Since I set up a firewall to block external access to the internal IPs of Cassandra, this is obviously going to fail.

So my question is: why does Cassandra MR seem to be grabbing the listen_address instead of the Thrift one? Presuming it's not a funky configuration error or something on my part, is that strictly necessary? All told, I'd prefer if it was connecting to the Thrift IPs, but if it can't, should I open up port 7000 or port 9160 between Hadoop and Cassandra?

Thanks for your help,

Scott

Your cassandra is old, upgrade to the latest version.

-Brandon
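For anyone debugging a similar setup, a quick way to see what the firewall allows from a Hadoop node is a plain TCP probe. The sketch below is a hypothetical helper, not part of Cassandra or Hadoop; it uses the IPs from the error message above and the two ports named in the question (9160 for Thrift, 7000 for inter-node traffic), and the 2-second timeout is an arbitrary choice.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical diagnostic helper: checks whether a TCP port is reachable.
final class PortProbe {

    // True if a TCP connection to host:port succeeds within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, dropped by a firewall, unresolvable, or timed out.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] hosts = {"10.1.1.24", "10.1.1.17", "10.1.1.16"};
        int[] ports = {9160, 7000};
        for (String host : hosts) {
            for (int port : ports) {
                System.out.println(host + ":" + port + " reachable="
                        + canConnect(host, port, 2000));
            }
        }
    }
}
```

Running the same probe against the 172.28.* addresses shows whether the Thrift-facing interface is the one a MapReduce task can actually reach.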