If you are doing a straight one-to-one copy from one cluster to another, try…
1) nodetool snapshot on each prod node for the system and application keyspaces.
2) rsync the system and application keyspace snapshots to the new cluster.
3) Update the yaml files on the new cluster to have the correct initial_token values. This is not strictly necessary, as the tokens are stored in the system keyspace, but it limits surprises later.
4) Start the new cluster.

For bulk load you will want to use the sstableloader:
http://www.datastax.com/dev/blog/bulk-loading

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/01/2012, at 3:32 AM, Scott Fines wrote:

> Hi all,
>
> I'm trying to copy a column family from our production cluster to our
> development one for testing purposes, so I thought I would try the bulkload
> API. Since I'm lazy, I'm using the Cassandra bulkLoad JMX call from one of
> the development machines. Here are the steps I followed:
>
> 1. (on production C* node): nodetool flush <keyspace> <CF>
> 2. rsync SSTables from production C* node to development C* node
> 3. bulkLoad SSTables through JMX
>
> But when I do that, on one of the development C* nodes, I keep getting this
> exception:
>
> java.lang.NullPointerException
>     at org.apache.cassandra.io.sstable.SSTable.getMinimalKey(SSTable.java:156)
>     at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:334)
>     at org.apache.cassandra.io.sstable.SSTableWriter.closeAndOpenReader(SSTableWriter.java:302)
>     at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:156)
>     at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:88)
>     at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:184)
>
> After which, the node itself seems to stream data successfully (I'm in the
> middle of checking that right now).
>
> Is this an error that I should be concerned about?
>
> Thanks,
>
> Scott
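
P.S. As a rough sketch of steps 1) and 2) from the reply at the top, something like the following could generate the commands to run on each production node. It only builds and prints the commands (a dry run), nothing is executed. The host names, the data directory, the snapshot tag, and the on-disk snapshot layout (`<data_dir>/<keyspace>/snapshots/<tag>/`) are all assumptions for illustration, as is the use of the `-t` tag option; check them against your Cassandra version before using any of it.

```python
import shlex


def plan_copy(node_pairs, keyspaces, tag="cluster_copy",
              data_dir="/var/lib/cassandra/data"):
    """Build (run_on_host, argv) pairs for the snapshot and rsync steps.

    node_pairs: list of (prod_host, new_host), one pair per node, since
    this is a straight one-to-one copy. All names here are hypothetical.
    """
    commands = []
    for prod_host, new_host in node_pairs:
        for ks in keyspaces:
            # Step 1: snapshot the keyspace, run locally on the prod node.
            commands.append(
                (prod_host,
                 ["nodetool", "-h", "localhost", "snapshot", ks, "-t", tag]))
            # Step 2: push the snapshot files to the matching new node.
            # The snapshot path below is an assumed layout, not a given.
            src = f"{data_dir}/{ks}/snapshots/{tag}/"
            dst = f"{new_host}:{data_dir}/{ks}/"
            commands.append((prod_host, ["rsync", "-a", src, dst]))
    return commands


if __name__ == "__main__":
    # Hypothetical two-node clusters and keyspace names.
    pairs = [("prod1", "new1"), ("prod2", "new2")]
    for host, argv in plan_copy(pairs, ["system", "my_app_ks"]):
        print(f"[{host}] {shlex.join(argv)}")
```

Steps 3) and 4) (setting initial_token in the yaml and starting the nodes) are left manual here, since they depend on how each node is configured.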