I checked the system.log for the Cassandra node that I did the jconsole JMX session against and which had the data to load. Lots of log output indicating that it's busy loading the files, and lots of stack traces indicating a broken pipe. I have no reason to believe there are connectivity issues between the nodes, but verifying that is beyond my expertise. What's indicative is this last bit of log output:

INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,441 StreamReplyVerbHandler.java (line 44) Successfully sent /srv/cas-snapshot-06-17-2015/endpoints/endpoint_messages/endpoints-endpoint_messages-ic-34-Data.db to /10.205.55.101
INFO [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,457 OutputHandler.java (line 42) Streaming session to /10.205.55.101 failed
ERROR [Streaming to /10.205.55.101:5] 2015-06-19 21:20:45,458 CassandraDaemon.java (line 253) Exception in thread Thread[Streaming to /10.205.55.101:5,5,RMI Runtime]
java.lang.RuntimeException: java.io.IOException: Broken pipe
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
        at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:433)
        at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:565)
        at org.apache.cassandra.streaming.compress.CompressedFileStreamTask.stream(CompressedFileStreamTask.java:93)
        at org.apache.cassandra.streaming.FileStreamTask.runMayThrow(FileStreamTask.java:91)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        ... 3 more
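A broken pipe on the sending side just means the receiving end closed the socket mid-transfer, so one thing worth ruling out is basic reachability of the streaming port between the nodes. A crude check (a sketch only; it assumes the default storage_port of 7000, so adjust to whatever your cassandra.yaml says):

```shell
#!/bin/sh
# Sketch: probe a peer's storage (streaming) port from this node.
# Assumes the default storage_port of 7000; adjust for your cassandra.yaml.
check_port() {
  host=$1
  port=$2
  # /dev/tcp is a bash feature, hence the explicit bash -c.
  if timeout 5 bash -c "echo > /dev/tcp/$host/$port" 2>/dev/null; then
    echo "$host:$port reachable"
  else
    echo "$host:$port NOT reachable"
  fi
}

check_port 10.205.55.101 7000
```

That only proves the TCP handshake works, of course; it says nothing about firewalls or timeouts killing long-lived streams, which can also surface as broken pipes partway through a transfer.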
And then right after that I see what appears to be the output from the nodetool refresh:

INFO [RMI TCP Connection(2480)-10.2.101.114] 2015-06-19 21:22:56,877 ColumnFamilyStore.java (line 478) Loading new SSTables for endpoints/endpoint_messages...
INFO [RMI TCP Connection(2480)-10.2.101.114] 2015-06-19 21:22:56,878 ColumnFamilyStore.java (line 524) No new SSTables were found for endpoints/endpoint_messages

Notice that Cassandra hasn't found any new SSTables, even though it was just so busy loading them. What's also noteworthy is that the output from the originating node shows it successfully sent endpoints-endpoint_messages-ic-34-Data.db to another node. But then in the system.log for that destination node, I see no mention of that file. What I do see on the destination node are a few INFO messages about streaming one of the .db files, and every time that's immediately followed by an error message:

INFO [Thread-108] 2015-06-19 21:20:45,453 StreamInSession.java (line 142) Streaming of file /srv/cas-snapshot-06-17-2015/endpoints/endpoint_messages/endpoints-endpoint_messages-ic-26-Data.db sections=1 progress=0/105137329 - 0% for org.apache.cassandra.streaming.StreamInSession@46c039ef failed: requesting a retry.
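One thing I've since realized about nodetool refresh: it only picks up SSTables sitting directly in the table's live data directory, i.e. under data_file_directories as keyspace/table/, not files in some arbitrary directory. A sketch of the layout it scans (using a temp directory as a stand-in for the real /var/lib/cassandra/data, and a touch as a stand-in for copying the actual snapshot files):

```shell
#!/bin/sh
# Sketch: the directory layout "nodetool refresh <keyspace> <table>" scans.
# Uses a temp dir as a stand-in for the real data_file_directories setting.
DATA_DIR=$(mktemp -d)        # stand-in for /var/lib/cassandra/data
KEYSPACE=endpoints
TABLE=endpoint_messages

mkdir -p "$DATA_DIR/$KEYSPACE/$TABLE"
# Stand-in for copying every component of each SSTable (not just -Data.db)
# from the snapshot into the live table directory:
touch "$DATA_DIR/$KEYSPACE/$TABLE/$KEYSPACE-$TABLE-ic-34-Data.db"
ls "$DATA_DIR/$KEYSPACE/$TABLE"

# With the real files in place you would then run, on that node:
#   nodetool refresh endpoints endpoint_messages
```

Anyway, back to the destination node's log. Here is the error that immediately follows each of those streaming INFO lines: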
ERROR [Thread-109] 2015-06-19 21:20:45,456 CassandraDaemon.java (line 253) Exception in thread Thread[Thread-109,5,main]
java.lang.RuntimeException: java.nio.channels.AsynchronousCloseException
        at com.google.common.base.Throwables.propagate(Throwables.java:160)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:32)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.AsynchronousCloseException
        at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
        at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:412)
        at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203)
        at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
        at org.apache.cassandra.streaming.compress.CompressedInputStream$Reader.runMayThrow(CompressedInputStream.java:151)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        ... 1 more

I don't know, I'm seeing enough flakiness here that I consider Cassandra bulk-loading a lost cause, even if there is something wrong and fixable about my particular cluster. On to exporting and re-importing the data at the proprietary application level. Life is too short.

On Fri, Jun 19, 2015 at 2:40 PM, Mitch Gitman <mgit...@gmail.com> wrote:

> Fabien, thanks for the reply. We do have Thrift enabled. From what I can
> tell, the "Could not retrieve endpoint ranges:" error crops up under various
> circumstances.
>
> From further reading on sstableloader, it occurred to me that it might be
> a safer bet to use the JMX StorageService bulkLoad command, considering
> that the data to import was already on one of the Cassandra nodes, just in
> an arbitrary directory outside the Cassandra data directories.
>
> I was able to get this bulkLoad command to fail with a message that the
> directory structure did not follow the expected keyspace/table/ pattern.
> So I created a keyspace directory and then a table directory within that, and
> moved all the files under the table directory. Executed bulkLoad, passing
> in that directory. It succeeded.
>
> Then I went and ran a nodetool refresh on the table in question.
>
> Only one problem. If I then went to query the table for, well, anything,
> nothing came back. And this was after successfully querying the table
> before and truncating the table just prior to the bulkLoad, so that I knew
> that only the data coming from the bulkLoad could show up there.
>
> Oh, and for good measure, I stopped and started all the nodes too. No luck
> still.
>
> What's puzzling about this is that the bulkLoad silently succeeds, even
> though it doesn't appear to be doing anything. I haven't bothered yet to
> check the Cassandra logs.
>
> On Fri, Jun 19, 2015 at 12:28 AM, Fabien Rousseau <fabifab...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I already got this error on a 2.1 cluster because thrift was disabled.
>> So you should check that thrift is enabled and accessible from the
>> sstableloader process.
>>
>> Hope this helps
>>
>> Fabien
>>
>> On Jun 19, 2015 at 05:44, "Mitch Gitman" <mgit...@gmail.com> wrote:
>>
>>> I'm using sstableloader to bulk-load a table from one cluster to
>>> another. I can't just copy sstables because the clusters have different
>>> topologies. While we're looking to upgrade soon to Cassandra 2.0.x, we're
>>> on Cassandra 1.2.19. The source data comes from a "nodetool snapshot."
>>>
>>> Here's the command I ran:
>>> sstableloader -d *IP_ADDRESSES_OF_SEED_NODES* */SNAPSHOT_DIRECTORY/*
>>>
>>> Here's the result I got:
>>> Could not retrieve endpoint ranges:
>>> -pr,--principal               kerberos principal
>>> -k,--keytab                   keytab location
>>> --ssl-keystore                ssl keystore location
>>> --ssl-keystore-password       ssl keystore password
>>> --ssl-keystore-type           ssl keystore type
>>> --ssl-truststore              ssl truststore location
>>> --ssl-truststore-password     ssl truststore password
>>> --ssl-truststore-type         ssl truststore type
>>>
>>> Not sure what to make of this, what with the hints at security arguments
>>> that pop up. The source and destination clusters have no security.
>>>
>>> Hoping this might ring a bell with someone out there.
>>
>