[
https://issues.apache.org/jira/browse/CASSANDRA-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16542961#comment-16542961
]
Dimitar Dimitrov edited comment on CASSANDRA-13938 at 9/11/18 5:52 AM:
-----------------------------------------------------------------------
{quote}The problem is that when {{CompressedInputStream#position()}} is called,
the new position might be in the middle of a buffer. We need to remember that
offset, and subtract that value when updating {{current}} in
{{#reBuffer(boolean)}}. The reason why is that those offset bytes get double
counted on the first call to {{#reBuffer()}} after {{#position()}} as we add
the {{buffer.position()}} to {{current}}. {{current}} already accounts for
those offset bytes when {{#position()}} was called.
{quote}
[~jasobrown], isn't that equivalent (although a bit more complex) to simply
setting {{current}} to the last position reached/read in the stream when
rebuffering, i.e. {{current = streamOffset + buffer.position()}}?
I might be missing something, but the role of {{currentBufferOffset}} seems to
be solely to "align" {{current}} and {{streamOffset}} the first time after a
new section is started. Then {{current += buffer.position() -
currentBufferOffset}} expands to {{current = -current- + buffer.position() +
streamOffset - -current-}}, which is the same as {{current = streamOffset +
buffer.position()}}. After that first time, {{current}} naturally follows
{{streamOffset}} without needing any adjustment, so it seems more natural to
express this as {{streamOffset + buffer.position()}} rather than either the new
expression or the old {{current + buffer.position()}}. To me, it's also a bit
more intuitive and easier to understand (hopefully it's also right in addition
to intuitive :)).
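To make the algebra above concrete, here's a tiny standalone sketch; the names
({{current}}, {{streamOffset}}, {{currentBufferOffset}}) just mirror the
discussion and are not the actual {{CompressedInputStream}} fields. Given the
invariant {{current == streamOffset + currentBufferOffset}} right after
{{#position()}}, both updates yield the same value:
{noformat}
import java.nio.ByteBuffer;

// Standalone sketch of the offset bookkeeping discussed above. The names only
// mirror the discussion; this is not the actual CompressedInputStream code.
public class OffsetBookkeepingSketch
{
    public static void main(String[] args)
    {
        long streamOffset = 1_000;       // uncompressed offset of the current buffer's start
        long currentBufferOffset = 40;   // in-buffer offset where position() landed
        long current = streamOffset + currentBufferOffset; // state right after position()

        ByteBuffer buffer = ByteBuffer.allocate(128);
        buffer.position(100);            // pretend 100 bytes of the buffer were consumed

        // Patch variant: subtract the remembered in-buffer offset so the bytes
        // before position() are not counted twice.
        long patched = current + buffer.position() - currentBufferOffset;

        // Proposed variant: current is simply the absolute position reached so far.
        long proposed = streamOffset + buffer.position();

        System.out.println(patched + " == " + proposed); // prints 1100 == 1100
    }
}
{noformat}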
The equivalence above holds only if {{current}} and {{streamOffset}} don't
change in the meantime, but I believe that is ensured by the well-ordered,
sequential fashion in which the decompression and the offset bookkeeping of
{{CompressedInputStream}} happen on the thread running the corresponding
{{StreamDeserializingTask}}.
* The aforementioned well-ordered sequential fashion seems to be POSITION
followed by 0-N repetitions of REBUFFER + DECOMPRESS, where the first REBUFFER
might not update {{current}} with the above calculation if {{current}} is
already too far ahead (i.e. the new section does not start within the current
buffer); a rough sketch of that guard follows below.
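For completeness, a rough sketch of what that guarded first REBUFFER update
could look like, assuming the semantics described in the bullet above (again
with hypothetical names, not the actual implementation):
{noformat}
// Hypothetical guard, assuming the semantics described above: only fold the
// buffer's progress into 'current' when the section being read actually
// resumes within the buffer that was just consumed; if position() already
// moved 'current' past the end of this buffer's data, leave it untouched so
// that 'current' only ever moves forward.
if (current < streamOffset + buffer.position())
    current = streamOffset + buffer.position();
{noformat}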
> Default repair is broken, crashes other nodes participating in repair (in
> trunk)
> --------------------------------------------------------------------------------
>
> Key: CASSANDRA-13938
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13938
> Project: Cassandra
> Issue Type: Bug
> Components: Repair
> Reporter: Nate McCall
> Assignee: Jason Brown
> Priority: Critical
> Fix For: 4.x
>
> Attachments: 13938.yaml, test.sh
>
>
> Running through a simple scenario to test some of the new repair features, I
> was not able to make a repair command work. Further, the exception seemed to
> trigger a nasty failure state that basically shuts down the netty connections
> for messaging *and* CQL on the nodes transferring back data to the node being
> repaired. The following steps reproduce this issue consistently.
> Cassandra stress profile (probably not necessary, but this one provides a
> really simple schema and consistent data shape):
> {noformat}
> keyspace: standard_long
> keyspace_definition: |
> CREATE KEYSPACE standard_long WITH replication = {'class':'SimpleStrategy',
> 'replication_factor':3};
> table: test_data
> table_definition: |
> CREATE TABLE test_data (
> key text,
> ts bigint,
> val text,
> PRIMARY KEY (key, ts)
> ) WITH COMPACT STORAGE AND
> CLUSTERING ORDER BY (ts DESC) AND
> bloom_filter_fp_chance=0.010000 AND
> caching={'keys':'ALL', 'rows_per_partition':'NONE'} AND
> comment='' AND
> dclocal_read_repair_chance=0.000000 AND
> gc_grace_seconds=864000 AND
> read_repair_chance=0.000000 AND
> compaction={'class': 'SizeTieredCompactionStrategy'} AND
> compression={'sstable_compression': 'LZ4Compressor'};
> columnspec:
> - name: key
> population: uniform(1..50000000) # 50 million records available
> - name: ts
> cluster: gaussian(1..50) # Up to 50 inserts per record
> - name: val
> population: gaussian(128..1024) # varying size of value data
> insert:
> partitions: fixed(1) # only one insert per batch for individual partitions
> select: fixed(1)/1 # each insert comes in one at a time
> batchtype: UNLOGGED
> queries:
> single:
> cql: select * from test_data where key = ? and ts = ? limit 1;
> series:
> cql: select key,ts,val from test_data where key = ? limit 10;
> {noformat}
> The commands to build and run:
> {noformat}
> ccm create 4_0_test -v git:trunk -n 3 -s
> ccm stress user profile=./histo-test-schema.yml
> ops\(insert=20,single=1,series=1\) duration=15s -rate threads=4
> # flush the memtable just to get everything on disk
> ccm node1 nodetool flush
> ccm node2 nodetool flush
> ccm node3 nodetool flush
> # disable hints for nodes 2 and 3
> ccm node2 nodetool disablehandoff
> ccm node3 nodetool disablehandoff
> # stop node1
> ccm node1 stop
> ccm stress user profile=./histo-test-schema.yml
> ops\(insert=20,single=1,series=1\) duration=45s -rate threads=4
> # wait 10 seconds
> ccm node1 start
> # Note that we are local to ccm's nodetool install 'cause repair preview is
> not reported yet
> node1/bin/nodetool repair --preview
> node1/bin/nodetool repair standard_long test_data
> {noformat}
> The error outputs from the last repair command follow. First, this is stdout
> from node1:
> {noformat}
> $ node1/bin/nodetool repair standard_long test_data
> objc[47876]: Class JavaLaunchHelper is implemented in both
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/bin/java
> (0x10274d4c0) and
> /Library/Java/JavaVirtualMachines/jdk1.8.0_101.jdk/Contents/Home/jre/lib/libinstrument.dylib
> (0x1047b64e0). One of the two will be used. Which one is undefined.
> [2017-10-05 14:31:52,425] Starting repair command #4
> (7e1a9150-a98e-11e7-ad86-cbd2801b8de2), repairing keyspace standard_long with
> repair options (parallelism: parallel, primary range: false, incremental:
> true, job threads: 1, ColumnFamilies: [test_data], dataCenters: [], hosts:
> [], previewKind: NONE, # of ranges: 3, pull repair: false, force repair:
> false)
> [2017-10-05 14:32:07,045] Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2
> for range [(3074457345618258602,-9223372036854775808],
> (-9223372036854775808,-3074457345618258603],
> (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> [2017-10-05 14:32:07,048] null
> [2017-10-05 14:32:07,050] Repair command #4 finished in 14 seconds
> error: Repair job has failed with the error message: [2017-10-05
> 14:32:07,048] null
> -- StackTrace --
> java.lang.RuntimeException: Repair job has failed with the error message:
> [2017-10-05 14:32:07,048] null
> at org.apache.cassandra.tools.RepairRunner.progress(RepairRunner.java:122)
> at
> org.apache.cassandra.utils.progress.jmx.JMXNotificationProgressListener.handleNotification(JMXNotificationProgressListener.java:77)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.dispatchNotification(ClientNotifForwarder.java:583)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.doRun(ClientNotifForwarder.java:533)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$NotifFetcher.run(ClientNotifForwarder.java:452)
> at
> com.sun.jmx.remote.internal.ClientNotifForwarder$LinearExecutor$1.run(ClientNotifForwarder.java:108)
> {noformat}
> node1's {{system.log}}:
> {noformat}
> INFO [Stream-Deserializer-/127.0.0.2:63069-e0af297f] 2017-10-05 14:32:07,037
> StreamResultFuture.java:193 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Session with /127.0.0.2 is complete
> INFO [Stream-Deserializer-/127.0.0.3:63068-eb8f23bc] 2017-10-05 14:32:07,037
> StreamResultFuture.java:193 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Session with /127.0.0.3 is complete
> ERROR [Streaming-Netty-Thread-5-5] 2017-10-05 14:32:07,037
> StreamSession.java:617 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Streaming error occurred on session with peer 127.0.0.3
> java.nio.channels.ClosedChannelException: null
> at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown
> Source) ~[netty-all-4.1.14.Final.jar:4.1.14.Final]
> ERROR [Streaming-Netty-Thread-5-7] 2017-10-05 14:32:07,038
> StreamSession.java:617 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Streaming error occurred on session with peer 127.0.0.2
> java.nio.channels.ClosedChannelException: null
> at io.netty.channel.AbstractChannel$AbstractUnsafe.write(...)(Unknown
> Source) ~[netty-all-4.1.14.Final.jar:4.1.14.Final]
> WARN [Stream-Deserializer-/127.0.0.2:63069-e0af297f] 2017-10-05 14:32:07,038
> StreamResultFuture.java:220 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Stream failed
> WARN [Stream-Deserializer-/127.0.0.3:63068-eb8f23bc] 2017-10-05 14:32:07,038
> StreamResultFuture.java:220 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Stream failed
> WARN [RepairJobTask:1] 2017-10-05 14:32:07,038 RepairJob.java:176 - [repair
> #7e2e8e80-a98e-11e7-ad86-cbd2801b8de2] test_data sync failed
> ERROR [Stream-Deserializer-/127.0.0.3:7000-48246b87] 2017-10-05 14:32:07,041
> StreamSession.java:757 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Remote peer 127.0.0.3 failed stream session.
> ERROR [RepairJobTask:1] 2017-10-05 14:32:07,042 RepairSession.java:326 -
> [repair #7e2e8e80-a98e-11e7-ad86-cbd2801b8de2] Session completed with the
> following error
> org.apache.cassandra.streaming.StreamException: Stream failed
> at
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88)
> ~[main/:na]
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> ~[guava-18.0.jar:na]
> at
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:221)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:197)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:488)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:601)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:207)
> ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
> ERROR [RepairJobTask:1] 2017-10-05 14:32:07,043 RepairRunnable.java:564 -
> Repair session 7e2e8e80-a98e-11e7-ad86-cbd2801b8de2 for range
> [(3074457345618258602,-9223372036854775808],
> (-9223372036854775808,-3074457345618258603],
> (-3074457345618258603,3074457345618258602]] failed with error Stream failed
> org.apache.cassandra.streaming.StreamException: Stream failed
> at
> org.apache.cassandra.streaming.management.StreamEventJMXNotifier.onFailure(StreamEventJMXNotifier.java:88)
> ~[main/:na]
> at com.google.common.util.concurrent.Futures$6.run(Futures.java:1310)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
> ~[guava-18.0.jar:na]
> at
> com.google.common.util.concurrent.AbstractFuture.setException(AbstractFuture.java:202)
> ~[guava-18.0.jar:na]
> at
> org.apache.cassandra.streaming.StreamResultFuture.maybeComplete(StreamResultFuture.java:221)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamResultFuture.handleSessionComplete(StreamResultFuture.java:197)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamSession.closeSession(StreamSession.java:488)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.StreamSession.onError(StreamSession.java:601)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:207)
> ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
> INFO [RepairJobTask:1] 2017-10-05 14:32:07,045 CoordinatorSession.java:233 -
> Incremental repair session 7e1a9150-a98e-11e7-ad86-cbd2801b8de2 failed
> ERROR [Stream-Deserializer-/127.0.0.2:7000-4b83e3cb] 2017-10-05 14:32:07,045
> StreamSession.java:757 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Remote peer 127.0.0.2 failed stream session.
> INFO [AntiEntropyStage:1] 2017-10-05 14:32:07,048
> CoordinatorSession.java:233 - Incremental repair session
> 7e1a9150-a98e-11e7-ad86-cbd2801b8de2 failed
> INFO [AntiEntropyStage:1] 2017-10-05 14:32:07,049 LocalSessions.java:501 -
> Failing local repair session 7e1a9150-a98e-11e7-ad86-cbd2801b8de2
> INFO [RepairJobTask:1] 2017-10-05 14:32:07,049 RepairRunnable.java:647 -
> Repair command #4 finished in 14 seconds
> {noformat}
> node2's {{system.log}} (note the transport shutdowns at the end):
> {noformat}
> INFO [AntiEntropyStage:1] 2017-10-05 18:31:52,521 LocalSessions.java:560 -
> Beginning local incremental repair session
> LocalSession{sessionID=7e1a9150-a98e-11e7-ad86-cbd2801b8de2, state=PREPARING,
> coordinator=/127.0.0.1, tableIds=[99d53860-a98d-11e7-9807-39cb3e573e5c],
> repairedAt=1507181512483, ranges=[(3074457345618258602,-9223372036854775808],
> (-9223372036854775808,-3074457345618258603],
> (-3074457345618258603,3074457345618258602]], participants=[/127.0.0.1,
> /127.0.0.2, /127.0.0.3], startedAt=1507181512, lastUpdate=1507181512}
> INFO [CompactionExecutor:224] 2017-10-05 18:31:52,539
> CompactionManager.java:642 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> Starting anticompaction for standard_long.test_data on 2/2 sstables
> INFO [CompactionExecutor:224] 2017-10-05 18:31:52,539
> CompactionManager.java:664 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> SSTable
> BigTableReader(path='/Users/zznate/.ccm/4_0_test/node2/data0/standard_long/test_data-99d53860a98d11e7980739cb3e573e5c/na-27-big-Data.db')
> fully contained in range (-9223372036854775808,-9223372036854775808],
> mutating repairedAt instead of anticompacting
> INFO [CompactionExecutor:224] 2017-10-05 18:31:52,539
> CompactionManager.java:664 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> SSTable
> BigTableReader(path='/Users/zznate/.ccm/4_0_test/node2/data0/standard_long/test_data-99d53860a98d11e7980739cb3e573e5c/na-26-big-Data.db')
> fully contained in range (-9223372036854775808,-9223372036854775808],
> mutating repairedAt instead of anticompacting
> INFO [CompactionExecutor:224] 2017-10-05 18:31:52,547
> CompactionManager.java:699 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> Completed anticompaction successfully
> INFO [AntiEntropyStage:1] 2017-10-05 18:31:57,500 Validator.java:292 -
> [repair #7e2e8e80-a98e-11e7-ad86-cbd2801b8de2] Sending completed merkle tree
> to /127.0.0.1 for standard_long.test_data
> INFO [Stream-Deserializer-/127.0.0.1:63064-3a39d969] 2017-10-05 18:32:05,417
> StreamResultFuture.java:115 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Creating new streaming plan for Repair
> INFO [Stream-Deserializer-/127.0.0.1:63064-3a39d969] 2017-10-05 18:32:05,418
> StreamResultFuture.java:122 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2,
> ID#0] Received streaming plan for Repair
> INFO [NonPeriodicTasks:1] 2017-10-05 18:32:05,856
> StreamResultFuture.java:179 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Prepare completed. Receiving 1 files(8.136MiB), sending 2
> files(42.689MiB)
> INFO [Stream-Deserializer-/127.0.0.1:63064-3a39d969] 2017-10-05 18:32:06,625
> StreamResultFuture.java:179 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Prepare completed. Receiving 1 files(8.136MiB), sending 2
> files(42.689MiB)
> WARN [Stream-Deserializer-/127.0.0.1:63066-c7002e89] 2017-10-05 18:32:06,747
> CompressedStreamReader.java:112 - [Stream
> 85d4b790-a98e-11e7-ad86-cbd2801b8de2] Error while reading partition
> DecoratedKey(-9060243433852736644, 5f1c6c5d747c) from stream on
> ks='standard_long' and table='test_data'.
> ERROR [Stream-Deserializer-/127.0.0.1:63066-c7002e89] 2017-10-05 18:32:06,759
> StreamSession.java:617 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Streaming error occurred on session with peer 127.0.0.1
> org.apache.cassandra.streaming.StreamReceiveException:
> java.lang.AssertionError: stream can only read forward.
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178)
> ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.lang.AssertionError: stream can only read forward.
> at
> org.apache.cassandra.streaming.compress.CompressedInputStream.position(CompressedInputStream.java:108)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:96)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:58)
> ~[main/:na]
> ... 4 common frames omitted
> INFO [Stream-Deserializer-/127.0.0.1:63066-c7002e89] 2017-10-05 18:32:06,761
> StreamResultFuture.java:193 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Session with /127.0.0.1 is complete
> WARN [Stream-Deserializer-/127.0.0.1:63066-c7002e89] 2017-10-05 18:32:06,762
> StreamResultFuture.java:220 - [Stream #85d4b790-a98e-11e7-ad86-cbd2801b8de2]
> Stream failed
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,765
> CassandraDaemon.java:211 - Exception in thread
> Thread[NettyStreaming-Outbound-/127.0.0.1:1,5,main]
> org.apache.cassandra.io.FSReadError:
> java.nio.channels.ClosedByInterruptException
> at
> org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:133)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:94)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:111)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:53)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:41)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.NettyStreamingMessageSender$FileStreamTask.run(NettyStreamingMessageSender.java:324)
> ~[main/:na]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ~[na:1.8.0_101]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> ~[na:1.8.0_101]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_101]
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
> Caused by: java.nio.channels.ClosedByInterruptException: null
> at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> ~[na:1.8.0_101]
> at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
> ~[na:1.8.0_101]
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
> ~[na:1.8.0_101]
> at
> org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:129)
> ~[main/:na]
> ... 12 common frames omitted
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,769
> StorageService.java:393 - Stopping gossiper
> WARN [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,769
> StorageService.java:315 - Stopping gossip by operator request
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,769
> Gossiper.java:1527 - Announcing shutdown
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,770
> StorageService.java:2202 - Node /127.0.0.2 state jump to shutdown
> INFO [AntiEntropyStage:1] 2017-10-05 18:32:07,049 LocalSessions.java:501 -
> Failing local repair session 7e1a9150-a98e-11e7-ad86-cbd2801b8de2
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:08,771
> StorageService.java:398 - Stopping native transport
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:08,774
> Server.java:180 - Stop listening for CQL clients
> {noformat}
> And node3 {{system.log}} (similar to node2):
> {noformat}
> INFO [AntiEntropyStage:1] 2017-10-05 18:31:52,521 LocalSessions.java:560 -
> Beginning local incremental repair session
> LocalSession{sessionID=7e1a9150-a98e-11e7-ad86-cbd2801b8de2, state=PREPARING,
> coordinator=/127.0.0.1, tableIds=[99d53860-a98d-11e7-9807-39cb3e573e5c],
> repairedAt=1507181512483, ranges=[(3074457345618258602,-9223372036854775808],
> (-9223372036854775808,-3074457345618258603],
> (-3074457345618258603,3074457345618258602]], participants=[/127.0.0.1,
> /127.0.0.2, /127.0.0.3], startedAt=1507181512, lastUpdate=1507181512}
> INFO [CompactionExecutor:249] 2017-10-05 18:31:52,542
> CompactionManager.java:642 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> Starting anticompaction for standard_long.test_data on 2/2 sstables
> INFO [CompactionExecutor:249] 2017-10-05 18:31:52,543
> CompactionManager.java:664 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> SSTable
> BigTableReader(path='/Users/zznate/.ccm/4_0_test/node3/data0/standard_long/test_data-99d53860a98d11e7980739cb3e573e5c/na-27-big-Data.db')
> fully contained in range (-9223372036854775808,-9223372036854775808],
> mutating repairedAt instead of anticompacting
> INFO [CompactionExecutor:249] 2017-10-05 18:31:52,543
> CompactionManager.java:664 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> SSTable
> BigTableReader(path='/Users/zznate/.ccm/4_0_test/node3/data0/standard_long/test_data-99d53860a98d11e7980739cb3e573e5c/na-26-big-Data.db')
> fully contained in range (-9223372036854775808,-9223372036854775808],
> mutating repairedAt instead of anticompacting
> INFO [CompactionExecutor:249] 2017-10-05 18:31:52,550
> CompactionManager.java:699 - [repair #7e1a9150-a98e-11e7-ad86-cbd2801b8de2]
> Completed anticompaction successfully
> INFO [AntiEntropyStage:1] 2017-10-05 18:31:57,918 Validator.java:292 -
> [repair #7e2e8e80-a98e-11e7-ad86-cbd2801b8de2] Sending completed merkle tree
> to /127.0.0.1 for standard_long.test_data
> INFO [Stream-Deserializer-/127.0.0.1:63063-d6987513] 2017-10-05 18:32:05,817
> StreamResultFuture.java:115 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Creating new streaming plan for Repair
> INFO [Stream-Deserializer-/127.0.0.1:63063-d6987513] 2017-10-05 18:32:05,818
> StreamResultFuture.java:122 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2,
> ID#0] Received streaming plan for Repair
> INFO [NonPeriodicTasks:1] 2017-10-05 18:32:05,866
> StreamResultFuture.java:179 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Prepare completed. Receiving 1 files(8.136MiB), sending 2
> files(42.679MiB)
> INFO [Stream-Deserializer-/127.0.0.1:63063-d6987513] 2017-10-05 18:32:06,622
> StreamResultFuture.java:179 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2
> ID#0] Prepare completed. Receiving 1 files(8.136MiB), sending 2
> files(42.679MiB)
> WARN [Stream-Deserializer-/127.0.0.1:63067-6347c9a8] 2017-10-05 18:32:06,759
> CompressedStreamReader.java:112 - [Stream
> 85d3f440-a98e-11e7-ad86-cbd2801b8de2] Error while reading partition
> DecoratedKey(-9060243433852736644, 5f1c6c5d747c) from stream on
> ks='standard_long' and table='test_data'.
> ERROR [Stream-Deserializer-/127.0.0.1:63067-6347c9a8] 2017-10-05 18:32:06,773
> StreamSession.java:617 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Streaming error occurred on session with peer 127.0.0.1
> org.apache.cassandra.streaming.StreamReceiveException:
> java.lang.AssertionError: stream can only read forward.
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:63)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:41)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.deserialize(StreamMessage.java:55)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.StreamingInboundHandler$StreamDeserializingTask.run(StreamingInboundHandler.java:178)
> ~[main/:na]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
> Caused by: java.lang.AssertionError: stream can only read forward.
> at
> org.apache.cassandra.streaming.compress.CompressedInputStream.position(CompressedInputStream.java:108)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.compress.CompressedStreamReader.read(CompressedStreamReader.java:96)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.IncomingFileMessage$1.deserialize(IncomingFileMessage.java:58)
> ~[main/:na]
> ... 4 common frames omitted
> INFO [GossipStage:1] 2017-10-05 18:32:06,774 Gossiper.java:1040 -
> InetAddress /127.0.0.2 is now DOWN
> INFO [Stream-Deserializer-/127.0.0.1:63067-6347c9a8] 2017-10-05 18:32:06,775
> StreamResultFuture.java:193 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Session with /127.0.0.1 is complete
> WARN [Stream-Deserializer-/127.0.0.1:63067-6347c9a8] 2017-10-05 18:32:06,775
> StreamResultFuture.java:220 - [Stream #85d3f440-a98e-11e7-ad86-cbd2801b8de2]
> Stream failed
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,778
> CassandraDaemon.java:211 - Exception in thread
> Thread[NettyStreaming-Outbound-/127.0.0.1:1,5,main]
> org.apache.cassandra.io.FSReadError:
> java.nio.channels.ClosedByInterruptException
> at
> org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:133)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.compress.CompressedStreamWriter.write(CompressedStreamWriter.java:94)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage.serialize(OutgoingFileMessage.java:111)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:53)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.OutgoingFileMessage$1.serialize(OutgoingFileMessage.java:42)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.messages.StreamMessage.serialize(StreamMessage.java:41)
> ~[main/:na]
> at
> org.apache.cassandra.streaming.async.NettyStreamingMessageSender$FileStreamTask.run(NettyStreamingMessageSender.java:324)
> ~[main/:na]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_101]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> ~[na:1.8.0_101]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> ~[na:1.8.0_101]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_101]
> at
> org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:81)
> [main/:na]
> at java.lang.Thread.run(Thread.java:745) ~[na:1.8.0_101]
> Caused by: java.nio.channels.ClosedByInterruptException: null
> at
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
> ~[na:1.8.0_101]
> at sun.nio.ch.FileChannelImpl.readInternal(FileChannelImpl.java:746)
> ~[na:1.8.0_101]
> at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:727)
> ~[na:1.8.0_101]
> at
> org.apache.cassandra.io.util.ChannelProxy.read(ChannelProxy.java:129)
> ~[main/:na]
> ... 12 common frames omitted
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,781
> StorageService.java:393 - Stopping gossiper
> WARN [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,781
> StorageService.java:315 - Stopping gossip by operator request
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,781
> Gossiper.java:1527 - Announcing shutdown
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:06,782
> StorageService.java:2202 - Node /127.0.0.3 state jump to shutdown
> INFO [AntiEntropyStage:1] 2017-10-05 18:32:07,049 LocalSessions.java:501 -
> Failing local repair session 7e1a9150-a98e-11e7-ad86-cbd2801b8de2
> ERROR [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:08,782
> StorageService.java:398 - Stopping native transport
> INFO [NettyStreaming-Outbound-/127.0.0.1:1] 2017-10-05 18:32:08,785
> Server.java:180 - Stop listening for CQL clients
> {noformat}
> The final state of the cluster after running this repair command:
> {noformat}
> $ ccm node1 nodetool status
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> -- Address Load Tokens Owns (effective) Host ID
> Rack
> UN 127.0.0.1 8.62 MiB 1 100.0%
> ffe7466b-2937-4322-a388-cca1819f6513 rack1
> DN 127.0.0.2 44.54 MiB 1 100.0%
> e374f662-1da5-477d-b1fb-173b8311c4a9 rack1
> DN 127.0.0.3 44.53 MiB 1 100.0%
> d8d99bd6-4b9f-4510-a4c3-62951be1b4d2 rack1
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)