Hi all, I'm trying to use bulk insertion with the SSTableSimpleUnsortedWriter class from the Cassandra API and I'm facing some problems. After generating the .db files and uploading them with the ./sstableloader command, I noticed the data didn't match what I inserted.
I've put the code I used below to try to explain the behaviour. I'm trying to generate the data files using only one row key and one super column, where the super column has 10 columns.

    IPartitioner p = new Murmur3Partitioner();
    CFMetaData scf = new CFMetaData("myKeySpace", "Column", ColumnFamilyType.Super, BytesType.instance, BytesType.instance);
    SSTableSimpleUnsortedWriter usersWriter = new SSTableSimpleUnsortedWriter(new File("./"), scf, p, 64);

    int rowKey = 10;
    int superColumnKey = 20;

    for (int i = 0; i < 10; i++) {
        usersWriter.newRow(ByteBufferUtil.bytes(rowKey));
        usersWriter.newSuperColumn(ByteBufferUtil.bytes(superColumnKey));
        usersWriter.addColumn(ByteBufferUtil.bytes(i + 1), ByteBufferUtil.bytes(i + 1), System.currentTimeMillis());
    }
    usersWriter.close();

After uploading, the result is:

    RowKey: 0000000a
    => (super_column=00000014,
         (name=00000001, value=00000001, timestamp=1381348293144))

    1 Row Returned.

First question: shouldn't my super column have 10 columns, with names/values from 00000001 to 0000000a (1 to 10), since I'm always writing to the same super column? The documentation says the newRow method can be invoked many times for the same row key and that this only impacts performance.

Second question: if this is the correct behaviour, shouldn't the surviving column value be 0000000a, since 10 is the last value passed as an argument to addColumn(...) in the loop?

Thanks in advance,
Elias.
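P.S. In case it helps, below is a sketch of the variant I would have expected to behave the same, with newRow and newSuperColumn called only once before the loop. It uses the same classes and constructor arguments as the snippet above; the wrapping class and main method are only there to make it self-contained, and I haven't verified whether it actually changes the result.

    import java.io.File;
    import java.io.IOException;

    import org.apache.cassandra.config.CFMetaData;
    import org.apache.cassandra.db.ColumnFamilyType;
    import org.apache.cassandra.db.marshal.BytesType;
    import org.apache.cassandra.dht.IPartitioner;
    import org.apache.cassandra.dht.Murmur3Partitioner;
    import org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter;
    import org.apache.cassandra.utils.ByteBufferUtil;

    // Sketch only: same super column family definition as above,
    // but the row and super column are started once, outside the loop.
    public class BulkWriterSketch {
        public static void main(String[] args) throws IOException {
            IPartitioner p = new Murmur3Partitioner();
            CFMetaData scf = new CFMetaData("myKeySpace", "Column",
                    ColumnFamilyType.Super, BytesType.instance, BytesType.instance);
            SSTableSimpleUnsortedWriter usersWriter =
                    new SSTableSimpleUnsortedWriter(new File("./"), scf, p, 64);

            int rowKey = 10;
            int superColumnKey = 20;

            // Start the row and the super column once.
            usersWriter.newRow(ByteBufferUtil.bytes(rowKey));
            usersWriter.newSuperColumn(ByteBufferUtil.bytes(superColumnKey));

            // Only addColumn is called inside the loop, adding columns 1..10.
            for (int i = 0; i < 10; i++) {
                usersWriter.addColumn(ByteBufferUtil.bytes(i + 1),
                        ByteBufferUtil.bytes(i + 1), System.currentTimeMillis());
            }
            usersWriter.close();
        }
    }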