Hmmm... In general it looks like Cassandra reads an invalid value when trying to get the key length (it should be ~100-150, but it gets 2048). Based on that value it reads too much data, and when it then tries to read the next key's length it picks up some garbage and turns it into a gargantuan int that has nothing to do with the real data layout in the file, so in the end it crashes.
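To make that concrete, this is roughly the per-row read sequence as I understand it from the debug output below (paraphrased pseudo-Java, not the actual Cassandra code):

// Rough sketch (paraphrased, not the real source) - a single wrong key
// length shifts every later read:
int keyLength = in.readUnsignedShort();   // should be ~100-150, reads 2048
byte[] key = new byte[keyLength];
in.readFully(key);                        // consumes ~1900 bytes of other rows' data
long dataSize = in.readLong();            // now lands mid-column -> garbage like
                                          // 3831491449718795617
// ...and every following length-prefixed read stays shifted by the same
// amount, until one of them runs past the end of the decompressed chunk.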

I paste here some more logs (when streaming only one file it crashes in a slightly different place in the code than it did on production, but given the steps I have to take to reproduce it, I'm sure the root cause is the same). I've modified the code a bit to print debug messages in a few more places, and I print the "real" key in DecoratedKey:

----- HERE STARTS THE LAST VALID KEY/VALUE PAIR
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 370) readShortLength->readUnsignedShort: 121
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 407) READ with length 121
INFO [Thread-8] 2013-05-23 13:07:06,766 IncomingStreamReader.java (line 165) DK: DecoratedKey(5664330507961197044404922676062547179, *[VALID HEX STRING]*, java.nio.HeapByteBuffer[pos=0 lim=121 cap=121], key: 0,daily,20130325,1,domain,uniqueusers,<the rest of the key I can't show - total length is 121 as expected and it's 100% valid>)
INFO [Thread-8] 2013-05-23 13:07:06,766 IncomingStreamReader.java (line 169) dataSize: 56
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 370) readShortLength->readUnsignedShort: 8
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 407) READ with length 8
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 356) readWithLength: 9
INFO [Thread-8] 2013-05-23 13:07:06,766 ByteBufferUtil.java (line 407) READ with length 9

----- HERE STARTS THE KEY/VALUE PAIR THAT FAILS
INFO [Thread-8] 2013-05-23 13:07:06,767 ByteBufferUtil.java (line 370) readShortLength->readUnsignedShort: 2048 *<-- 100% wrong, there's no such long key in this file!*
INFO [Thread-8] 2013-05-23 13:07:06,767 ByteBufferUtil.java (line 407) READ with length 2048
INFO [Thread-8] 2013-05-23 13:07:06,768 IncomingStreamReader.java (line 165) DK: DecoratedKey(72742576182685307053268326336110565659, *[2048-byte HEX string]*, java.nio.HeapByteBuffer[pos=0 lim=2048 cap=2048], key: *[HERE'S THE LONG CHUNK - 2048 bytes - OF GARBAGE DATA! I can see a few keys & columns here + some strange characters that describe the data - key lengths etc.]*)
INFO [Thread-60] 2013-05-24 12:08:06,756 IncomingStreamReader.java (line 169) dataSize: 3831491449718795617 *<-- WOOT?! U MAD?!*
INFO [Thread-60] 2013-05-24 12:08:07,116 ByteBufferUtil.java (line 370) readShortLength->readUnsignedShort: 14938
INFO [Thread-60] 2013-05-24 12:08:07,116 ByteBufferUtil.java (line 407) READ with length 14938
INFO [Thread-60] 2013-05-24 12:08:07,116 ByteBufferUtil.java (line 356) readWithLength: 131080
INFO [Thread-60] 2013-05-24 12:08:07,117 ByteBufferUtil.java (line 407) READ with length 131080
INFO [Thread-60] 2013-05-24 12:08:07,121 ByteBufferUtil.java (line 370) readShortLength->readUnsignedShort: 27749
INFO [Thread-60] 2013-05-24 12:08:07,121 ByteBufferUtil.java (line 407) READ with length 27749
INFO [Thread-60] 2013-05-24 12:08:07,121 ByteBufferUtil.java (line 356) readWithLength: 81622009
INFO [Thread-60] 2013-05-24 12:08:07,186 CompressedInputStream.java (line 94) CompressedInputStream->read: 6405671, 6291456, 114215
INFO [Thread-60] 2013-05-24 12:08:07,186 CompressedInputStream.java (line 95) CompressedInputStream compare: 6405671 >= 6291456 && 6405671 < 6405671 *<-- The assertion that fails*
ERROR [Thread-60] 2013-05-24 12:08:07,221 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-60,5,main]
java.lang.AssertionError
        at org.apache.cassandra.streaming.compress.CompressedInputStream.read(CompressedInputStream.java:98)
        at java.io.InputStream.read(InputStream.java:163)
        at java.io.DataInputStream.readFully(DataInputStream.java:178)
        at java.io.DataInputStream.readFully(DataInputStream.java:152)
        at org.apache.cassandra.utils.BytesReadTracker.readFully(BytesReadTracker.java:94)
        at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:405)
        at org.apache.cassandra.utils.ByteBufferUtil.readWithLength(ByteBufferUtil.java:363)
        at org.apache.cassandra.db.ColumnSerializer.deserializeColumnBody(ColumnSerializer.java:108)
        at org.apache.cassandra.db.OnDiskAtom$Serializer.deserializeFromSSTable(OnDiskAtom.java:92)
        at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:249)
        at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:187)
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
        at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
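For reference, the check that blows up looks more or less like this (reconstructed from my extra debug lines above, so the field names are my guesses and not necessarily the exact 1.2 source):

public int read() throws IOException
{
    if (current >= bufferOffset + buffer.length)
        decompressNextChunk();

    // By this point the stream is already desynced: 'current' has run exactly
    // up to the end of the valid decompressed bytes, so the check
    // 6405671 >= 6291456 && 6405671 < 6405671 fails:
    assert current >= bufferOffset && current < bufferOffset + validBufferBytes;

    return buffer[(int) (current++ - bufferOffset)] & 0xFF;
}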


It gets more interesting... ;-)

M.

On 24.05.2013 10:46, Michal Michalski wrote:
Sounds like a nasty heisenbug, can you replace or rebuild the machine?

Heisenbug :D
(never heard this name before :-) )

I thought so too, but I finally managed to reproduce it locally (it
requires 3 nodes, one of which needs to have a specific token assigned,
while the rest just have to be present in the cluster; plus the special,
"invalid" file too, of course), so it's probably token/partitioner
related, but for now I have no bloody idea how. Thus I'm rather sure
that rebuilding/replacing the node will not help :-(

I also thought it might be a very weird DeflateCompressor bug causing it
to produce invalid output for some strange characters (not very
likely, but I had no better idea ;-) ). However, I tested
compression/decompression and it's fine.
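For the record, the round-trip check was essentially this (plain java.util.zip Deflater/Inflater, which is what I believe DeflateCompressor uses under the hood; just a sketch, not the exact code I ran):

import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateRoundTrip
{
    public static void main(String[] args) throws Exception
    {
        // sample input resembling one of our keys
        byte[] original = "0,daily,20130325,1,domain,uniqueusers,...".getBytes("UTF-8");

        // compress
        Deflater deflater = new Deflater();
        deflater.setInput(original);
        deflater.finish();
        byte[] compressed = new byte[original.length * 2 + 64];
        int compressedLength = deflater.deflate(compressed);
        deflater.end();

        // decompress
        Inflater inflater = new Inflater();
        inflater.setInput(compressed, 0, compressedLength);
        byte[] restored = new byte[original.length];
        int restoredLength = inflater.inflate(restored);
        inflater.end();

        System.out.println("round trip OK: "
                + (restoredLength == original.length && Arrays.equals(original, restored)));
    }
}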

D'oh!

M.


  Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 21/05/2013, at 9:36 PM, Michal Michalski <mich...@opera.com> wrote:

I've finally had some time to experiment a bit with this problem (it
occurred twice again) and here's what I found:

1. So far (three occurrences in total), *when* it happened, it
happened only when streaming to *one* specific C* node (though
streaming to this node works fine 99.9% of the time too)
2. It happens with compression turned on
(cassandra.output.compression.class set to
org.apache.cassandra.io.compress.DeflateCompressor - see the
job-setup snippet below; it doesn't matter what the chunk length is)
3. Everything works fine when compression is turned off.
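For completeness, this is roughly how we enable it in the job setup (just a sketch of the relevant lines; the property name is the one mentioned above, the rest of the job configuration is omitted):

// Sketch of the relevant part of our MR job setup (standard Hadoop
// Configuration API; only the compression-related property is shown):
Configuration conf = job.getConfiguration();
conf.set("cassandra.output.compression.class",
         "org.apache.cassandra.io.compress.DeflateCompressor");
// (the chunk length is configurable too, but as noted it doesn't seem to matter)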

So it looks like I have a workaround for now, but I don't really
understand the root cause of this problem or what the "right"
solution would be if we want to keep using compression.

Anyway, the thing that interests me the most is why it fails so
rarely and - assuming it's not a coincidence - why only for one C*
node...

Could it be a DeflateCompressor bug?
Any other ideas?

Regards,
Michał


On 31.03.2013 12:01, aaron morton wrote:
  but yesterday one of 600 mappers failed

:)

From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
exhausted?) input buffer (?) causing the token to be set to -1, which
is not valid for RandomPartitioner:
Yes, there is a zero-length key which has a -1 token.

However, I can't figure out what the root cause of this problem is.
Any ideas?
mmm, the BulkOutputFormat uses an SSTableSimpleUnsortedWriter and
neither of them checks for zero-length row keys. I would look there
first.

There is no validation in AbstractSSTableSimpleWriter either; not sure
if that is by design or an oversight. Can you catch the zero-length
key in your map job?
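Something along these lines in the mapper, before the key is handed to the writer (just a sketch, the names are illustrative only - adapt it to however you build your keys):

// Sketch: reject empty row keys in the map job before they ever reach
// the SSTable writer:
private boolean isValidRowKey(ByteBuffer key)
{
    return key != null && key.remaining() > 0;
}

// in map():
//   if (!isValidRowKey(rowKey)) { /* log the offending input record */ return; }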

Cheers

-----------------
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 2:26 PM, Michal Michalski <mich...@opera.com> wrote:

We're streaming data to Cassandra directly from a MapReduce job using
BulkOutputFormat. It's been working for more than a year without
any problems, but yesterday one of 600 mappers failed and we got a
strange-looking exception on one of the C* nodes.

IMPORTANT: It happens on one node and on one cluster only. We've
loaded the same data to a test cluster and it worked.


ERROR [Thread-1340977] 2013-03-28 06:35:47,695 CassandraDaemon.java (line 133) Exception in thread Thread[Thread-1340977,5,main]
java.lang.RuntimeException: Last written key DecoratedKey(5664330507961197044404922676062547179, 302c6461696c792c32303133303332352c312c646f6d61696e2c756e6971756575736572732c633a494e2c433a6d63635f6d6e635f636172726965725f43656c6c4f6e655f4b61726e6174616b615f2842616e67616c6f7265295f494e2c643a53616d73756e675f47542d49393037302c703a612c673a3133) >= current key DecoratedKey(-1, ) writing into /cassandra/production/IndexedValues/production-IndexedValues-tmp-ib-240346-Data.db
    at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:133)
    at org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:209)
    at org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:179)
    at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)
    at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)
    at org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)
    at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)



From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
exhausted?) input buffer (?) causing the token to be set to -1, which
is not valid for RandomPartitioner:

public BigIntegerToken getToken(ByteBuffer key)
{
    if (key.remaining() == 0)
        return MINIMUM;        // Which is -1
    return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
}

However, I can't figure out what the root cause of this problem is.
Any ideas?

Of course I can't rule out a bug in my code that streams this data,
but - as I said - it works when loading the same data to a test
cluster (which has a different number of nodes and thus a different
token assignment, which might also be a factor).

Michał





