[
https://issues.apache.org/jira/browse/CASSANDRA-11028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15105685#comment-15105685
]
Paulo Motta commented on CASSANDRA-11028:
-----------------------------------------
Thanks for the report [~autocracy]. While working on CASSANDRA-10961 I added
more detailed debug logging to the stream writer and reader, printing the
source sstable, keyspace, table and faulty partition key in case of
error/corruption on the receiver side, so this should be improved in upcoming
releases. Some additional logging was also added in CASSANDRA-9294.
> Streaming errors caused by corrupt tables need more logging
> -----------------------------------------------------------
>
> Key: CASSANDRA-11028
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11028
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jeff Ferland
>
> Example output: ERROR [STREAM-IN-/10.0.10.218] 2016-01-17 16:01:38,431
> StreamSession.java:505 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993]
> Streaming error occurred
> java.lang.IllegalArgumentException: Unknown type 0
> In some cases logging shows a message more like:
> ERROR [STREAM-IN-/10.0.10.12] 2016-01-05 14:44:38,690 StreamSession.java:505
> - [Stream #472d28e0-b347-11e5-8b40-bb4d80df86f4] Streaming error occurred
> java.io.IOException: Too many retries for Header (cfId:
> 6b262d58-8730-36ca-8e3e-f0a40beaf92f, #0, version: ka, estimated keys: 58880,
> transfer size: 2159040, compressed?: true, repairedAt: 0)
> In the majority of cases, however, no information identifying the column
> family is shown, and the source file that was being streamed is never
> identified.
> Errors do not stop the streaming process, but do mark the streaming as failed
> at the end. This usually results in a log message pattern like:
> INFO [StreamReceiveTask:252] 2016-01-18 04:45:01,190
> StreamResultFuture.java:180 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993]
> Session with /10.0.10.219 is complete
> WARN [StreamReceiveTask:252] 2016-01-18 04:45:01,215
> StreamResultFuture.java:207 - [Stream #e6ca4590-bc66-11e5-84be-571ffcecc993]
> Stream failed
> ERROR [main] 2016-01-18 04:45:01,217 CassandraDaemon.java:579 - Exception
> encountered during startup
> ... which is highly confusing given the error occurred hours before.
> Request: more detail in logging messages for stream failure indicating what
> column family was being used, and if possible a clarification between network
> issues and corrupt file issues.
> The actual cause of the errors is corruption on the offending node, and the
> solution is running nodetool scrub there. Scrubbing the whole space blindly
> is rather expensive compared with targeting only the problem tables. In our
> particular case, out-of-order keys were caused by a bug in a previous
> version of Cassandra.
> WARN [CompactionExecutor:19552] 2016-01-18 16:02:10,155
> OutputHandler.java:52 - 378490 out of order rows found while scrubbing
> SSTableReader(path='/mnt/cassandra/data/keyspace/cf-888a52f96d1d389790ee586a6100916c/keyspace-cf-ka-133-Data.db');
> Those have been written (in order) to a new sstable
> (SSTableReader(path='/mnt/cassandra/data/keyspace/cf-888a52f96d1d389790ee586a6100916c/keyspace-cf-ka-179-Data.db'))
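
The confusing pattern described in the report (a "Streaming error occurred" line logged hours before the final "Stream failed" line) can be untangled by grouping log lines on the stream ID they share. The following is a minimal sketch of that idea, not part of Cassandra: the function names `group_by_stream` and `first_error` are hypothetical, and it assumes each log entry sits on a single line in the real log file, with the level at the start and a `[Stream #<uuid>]` tag as in the examples quoted above.

```python
import re
from collections import defaultdict

# Matches the "[Stream #<uuid>]" tag seen in the quoted log lines.
STREAM_ID = re.compile(r"\[Stream #([0-9a-f-]{36})\]")
# Matches the leading log level, e.g. "ERROR [STREAM-IN-/10.0.10.218] ...".
LEVEL = re.compile(r"^(ERROR|WARN|INFO|DEBUG)\b")


def group_by_stream(lines):
    """Return {stream_id: [(level, line), ...]}, keeping the input order."""
    sessions = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        id_match = STREAM_ID.search(line)
        level_match = LEVEL.match(line)
        # Lines without a stream tag (e.g. the startup exception) are skipped.
        if id_match and level_match:
            sessions[id_match.group(1)].append((level_match.group(1), line))
    return dict(sessions)


def first_error(entries):
    """Return the earliest ERROR line of one stream session, or None."""
    for level, line in entries:
        if level == "ERROR":
            return line
    return None


# Sample input: three lines from the report, unwrapped to one line each.
sample = [
    "ERROR [STREAM-IN-/10.0.10.218] 2016-01-17 16:01:38,431 StreamSession.java:505 - "
    "[Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Streaming error occurred",
    "INFO [StreamReceiveTask:252] 2016-01-18 04:45:01,190 StreamResultFuture.java:180 - "
    "[Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Session with /10.0.10.219 is complete",
    "WARN [StreamReceiveTask:252] 2016-01-18 04:45:01,215 StreamResultFuture.java:207 - "
    "[Stream #e6ca4590-bc66-11e5-84be-571ffcecc993] Stream failed",
]

sessions = group_by_stream(sample)
for stream_id, entries in sessions.items():
    print(stream_id, "->", first_error(entries))
```

Run against a full system log, this would pair each late "Stream failed" warning with the first error recorded for the same session. It still cannot name the column family or source file when the original error line does not contain them, which is the gap the report asks to close.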