Re: bulk load problem
Hi all,

I am facing the same problem when trying to load data into Cassandra with sstableloader. I am running a Cassandra instance on my own machine, and sstableloader is invoked from that same machine. These are the steps I followed:

- got a copy of the running Cassandra instance
- set up another loopback address with
      sudo ifconfig lo:0 127.0.0.2 netmask 255.0.0.0 up
- set the listen address and rpc address in the copied Cassandra's cassandra.yaml to 127.0.0.2
- ran
      ./sstableloader -d 127.0.0.2 <directory of created sstables>

But this gives me the error 'Could not retrieve endpoint ranges: ' and nothing more. I would be grateful for any hints on getting past this.

What I actually want is to run sstableloader from Java code. I couldn't get that working either, so I am trying to understand the required arguments from the command line first. Help with either case would be great.

Thanks in advance!

On Tue, Jul 3, 2012 at 5:16 AM, aaron morton aa...@thelastpickle.com wrote:
> Do you have the full stack trace? It will include a cause.
>
> Cheers
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 27/06/2012, at 12:07 PM, James Pirz wrote:
>> Dear all,
>>
>> I am trying to use sstableloader in Cassandra 1.1.1 to bulk load some data into a single-node cluster. I am running the following command:
>>
>>     bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/
>>
>> from another node (not the node on which Cassandra is running); the data should be loaded into a keyspace named tpch. I made sure that the second node, from which I run sstableloader, has the same copy of cassandra.yaml as the destination node.
>>
>> I have put
>>
>>     tpch-cf0-hd-1-Data.db
>>     tpch-cf0-hd-1-Index.db
>>
>> under the path I passed to sstableloader, but I am getting the following error:
>>
>>     Could not retrieve endpoint ranges:
>>
>> Any hint?
>>
>> Thanks in advance,
>> James

--
Pushpalanka Jayawardhana | Undergraduate | Computer Science and Engineering
University of Moratuwa
+94779716248 | http://pushpalankajaya.blogspot.com
Twitter: http://twitter.com/Pushpalanka | Slideshare: http://www.slideshare.net/Pushpalanka
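
For the "run sstableloader from Java" part of the question above, one possible approach is to launch the same command line as a child process. What follows is only a minimal sketch under stated assumptions: the class and method names (SstableLoaderCommand, build, run) are invented for illustration, /path/to/sstables is a placeholder, and the only loader arguments used are the "-d <host> <directory>" form that appears in this thread.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Cassandra: wraps the command line used in this thread.
final class SstableLoaderCommand {

    // Builds the argument list for: <loaderPath> -d <host> <sstableDir>
    static List<String> build(String loaderPath, String host, String sstableDir) {
        List<String> cmd = new ArrayList<>();
        cmd.add(loaderPath);
        cmd.add("-d");
        cmd.add(host);
        cmd.add(sstableDir);
        return cmd;
    }

    // Starts the command as a child process, forwards its output to this
    // process's console, and returns the child's exit code.
    static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) {
        // "/path/to/sstables" is a placeholder; pass the real directory as the first argument.
        String dir = args.length > 0 ? args[0] : "/path/to/sstables";
        List<String> cmd = build("./sstableloader", "127.0.0.2", dir);
        // Only prints the command; call run(cmd) to actually launch it.
        System.out.println(String.join(" ", cmd));
    }
}
```

Passing the arguments as a list rather than one concatenated string avoids quoting problems if the sstable directory contains spaces.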
Re: bulk load problem
I couldn't get the same-host sstableloader to work either. But it's easier to use the JMX bulk-load hook that's built into Cassandra anyway. The following is what I implemented to do this:

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import javax.management.JMX;
    import javax.management.MBeanServerConnection;
    import javax.management.MalformedObjectNameException;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    import org.apache.cassandra.service.StorageServiceMBean;

    public class JmxBulkLoader {

        private JMXConnector connector;
        private StorageServiceMBean storageBean;

        public JmxBulkLoader(String host, int port) throws Exception {
            connect(host, port);
        }

        // Opens a JMX connection and creates a proxy for the StorageService MBean.
        private void connect(String host, int port) throws IOException, MalformedObjectNameException {
            JMXServiceURL jmxUrl = new JMXServiceURL(
                    String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
            Map<String,Object> env = new HashMap<String,Object>();
            connector = JMXConnectorFactory.connect(jmxUrl, env);
            MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
            ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
            storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
        }

        public void close() throws IOException {
            connector.close();
        }

        public void bulkLoad(String path) {
            storageBean.bulkLoad(path);
        }

        // Each command-line argument is a path to a directory of sstables to load.
        public static void main(String[] args) throws Exception {
            if (args.length == 0) {
                throw new IllegalArgumentException("usage: paths to bulk files");
            }
            JmxBulkLoader np = new JmxBulkLoader("localhost", 7199);
            for (String arg : args) {
                np.bulkLoad(arg);
            }
            np.close();
        }
    }
Re: bulk load problem
Due to the change in directory structure from ver 1.1, you have to create a directory like

    /path/to/sstables/<Keyspace name>/<ColumnFamily name>

and put your sstables there. In your case, I think it would be /data/ssTable/tpch/tpch/cf0.

You then have to specify that directory as the parameter for sstableloader:

    bin/sstableloader -d 192.168.100.1 /data/ssTable/tpch/tpch/cf0

Yuki
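
The layout rule described above can be sketched in Java as a small helper that derives the target directory from an sstable file name. This is an illustration only: the class name SstableLayout is invented, and it assumes file names of the form keyspace-columnfamily-version-generation-Component.db (as in tpch-cf0-hd-1-Data.db from this thread), with no hyphens inside the keyspace or column family name.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Cassandra.
final class SstableLayout {

    // Returns <base>/<keyspace>/<columnfamily> for a file named
    // <keyspace>-<columnfamily>-<version>-<generation>-<component>.db.
    // Assumption: neither the keyspace nor the column family name contains '-'.
    static Path targetDir(Path base, String fileName) {
        String[] parts = fileName.split("-");
        if (parts.length < 5) {
            throw new IllegalArgumentException("unexpected sstable file name: " + fileName);
        }
        return base.resolve(parts[0]).resolve(parts[1]);
    }

    public static void main(String[] args) {
        // Values taken from the thread: base directory and data file name.
        Path dir = targetDir(Paths.get("/data/ssTable/tpch"), "tpch-cf0-hd-1-Data.db");
        System.out.println(dir);
    }
}
```

The resulting directory is the one to pass to sstableloader after moving the sstable component files (Data, Index, and so on) into it.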
Re: bulk load problem
Hi,

Thanks, Brian, for your code, and thanks, Yuki.

The directory structure was one problem, and I was able to correct it with Yuki's guidance. The error was still the same after that, and it turned out to be caused by a wrong Thrift rpc address. After correcting it, bulk loading was successful.

--
Pushpalanka Jayawardhana
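
Since the remaining failure here came down to the Thrift rpc address, a quick way to rule that out before running the loader is to check whether the host passed with -d accepts TCP connections on the Thrift port. The sketch below is an assumption-laden illustration: the class name ThriftPortCheck is invented, and 9160 is used because it is the default rpc_port in Cassandra 1.1; use whatever rpc_port your cassandra.yaml sets.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Cassandra.
final class ThriftPortCheck {

    // Returns true if a TCP connection to host:port succeeds within timeoutMs.
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "127.0.0.2";
        // Assumption: rpc_port is left at its default of 9160.
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 9160;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 2000));
    }
}
```

A successful connection only shows that something is listening on that address and port; it does not prove the loader will succeed.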
Re: bulk load problem
What are your cassandra.yaml settings for the rpc and listen addresses on the destination node?

Nury
Re: bulk load problem
Thank you so much!

The problem was the RPC address: it was different from the listen address.

I appreciate your help.

Best,
James
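
The mismatch reported above can be spotted with a small check over cassandra.yaml. This is a rough sketch, not a YAML parser: the class name AddressCheck is invented, it only looks at simple top-level "key: value" lines, and it does not handle quoting or trailing comments. It also treats a mismatch as suspicious only because that was the cause in this thread; the two settings are allowed to differ in general.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Cassandra.
final class AddressCheck {

    // Collects top-level "key: value" pairs, skipping blank lines, comments and indented lines.
    static Map<String, String> topLevelSettings(List<String> yamlLines) {
        Map<String, String> settings = new HashMap<>();
        for (String line : yamlLines) {
            if (line.isEmpty() || line.startsWith("#") || Character.isWhitespace(line.charAt(0))) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon <= 0) {
                continue;
            }
            settings.put(line.substring(0, colon).trim(), line.substring(colon + 1).trim());
        }
        return settings;
    }

    // True only if both listen_address and rpc_address are present and identical.
    static boolean addressesMatch(List<String> yamlLines) {
        Map<String, String> settings = topLevelSettings(yamlLines);
        String listen = settings.get("listen_address");
        String rpc = settings.get("rpc_address");
        return listen != null && listen.equals(rpc);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: AddressCheck <path to cassandra.yaml>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("listen_address equals rpc_address: " + addressesMatch(lines));
    }
}
```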