[ https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16206662#comment-16206662 ]
Nate McCall commented on CASSANDRA-13863:
-----------------------------------------
Note: removing the *read_repair_chance options entirely is currently being
discussed under CASSANDRA-13910.
> Speculative retry causes read repair even if read_repair_chance is 0.0.
> -----------------------------------------------------------------------
>
> Key: CASSANDRA-13863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Hiro Wakabayashi
> Attachments: 0001-Use-read_repair_chance-when-starting-repairs-due-to-.patch, speculative retries.pdf
>
>
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should
> cause no read repair, but read repair still happens when speculative retry
> fires. I think setting both options to 0.0 should stop read repair
> completely, because users have legitimate reasons to disable it, as the two
> cases below show.
> {panel:title=Case 1: TWCS users}
> The
> [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance]
> states how to disable read repair.
> {quote}While TWCS tries to minimize the impact of comingled data, users
> should attempt to avoid this behavior. Specifically, users should avoid
> queries that explicitly set the timestamp via CQL USING TIMESTAMP.
> Additionally, users should run frequent repairs (which streams data in such a
> way that it does not become comingled), and disable background read repair by
> setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
> {quote}
> {panel}
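> For reference, disabling background read repair the way the documentation
> describes is a plain table change (shown here only as an illustration,
> reusing the table from the reproduction below):
> {noformat}
> ALTER TABLE ks1.t1 WITH read_repair_chance = 0.0 AND dclocal_read_repair_chance = 0.0;
> {noformat}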
> {panel:title=Case 2: Strict SLA for read latency}
> At peak times read latency is critical for us, and read repair pushes
> latency higher than reads without it. We can run anti-entropy repair off
> peak to keep replicas consistent.
> {panel}
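> For example, an off-peak anti-entropy repair of just this table could be
> scheduled with nodetool (illustrative, using the ccm cluster from the
> reproduction below):
> {noformat}
> $ ccm node1 nodetool repair ks1 t1
> {noformat}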
>
> Here is my procedure to reproduce the problem.
> h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
> {noformat}
> $ ccm create -v 3.0.14 -n 3 cluster_3.0.14
> $ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> $ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> $ ccm start{noformat}
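> Hinted handoff must stay off so that node2 and node3 really miss the write
> while they are down. If your nodetool build includes it, the setting can
> also be checked at runtime (optional):
> {noformat}
> $ ccm node1 nodetool statushandoff
> {noformat}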
> h3. 2. Create a keyspace and a table.
> {noformat}
> $ ccm node1 cqlsh
> DROP KEYSPACE IF EXISTS ks1;
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
> CREATE TABLE ks1.t1 (
>     key text PRIMARY KEY,
>     value blob
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = 'ALWAYS';
> QUIT;
> {noformat}
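> Note that {{speculative_retry = 'ALWAYS'}} makes the coordinator send a
> redundant read to another replica on every request, which is what triggers
> the behavior shown below. The other valid settings (listed here only for
> illustration) trade that off differently:
> {noformat}
> ALTER TABLE ks1.t1 WITH speculative_retry = '99percentile';  -- retry only reads slower than the p99 latency
> ALTER TABLE ks1.t1 WITH speculative_retry = 'NONE';          -- never send speculative reads
> {noformat}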
> h3. 3. Stop node2 and node3. Insert a row.
> {noformat}
> $ ccm node3 stop && ccm node2 stop && ccm status
> Cluster: 'cluster_3.0.14'
> -------------------------
> node1: UP
> node3: DOWN
> node2: DOWN
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1));"
> Current consistency level is ONE.
> Now Tracing is enabled
> Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501
> activity | timestamp | source | source_elapsed
> -----------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
> Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 | 0
> Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 | 4323
> Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.320000 | 127.0.0.1 | 5250
> Determining replicas for mutation [SharedPool-Worker-1] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 11886
> Appending to commitlog [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12195
> Adding to t1 memtable [SharedPool-Worker-3] | 2017-09-12 23:59:42.327000 | 127.0.0.1 | 12392
> Request complete | 2017-09-12 23:59:42.328680 | 127.0.0.1 | 12680
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
> key | value
> ----------+--------------------
> mmullass | 0x0000000000000001
> (1 rows)
> Tracing session: 3420ce90-97cb-11e7-8ea7-c1bd4d549501
> activity | timestamp | source | source_elapsed
> ----------------------------------------------------------------------------+----------------------------+-----------+----------------
> Execute CQL3 query | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 0
> Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 296
> Preparing statement [SharedPool-Worker-1] | 2017-09-13 00:01:06.681000 | 127.0.0.1 | 561
> Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1056
> Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1142
> Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1206
> Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:01:06.682000 | 127.0.0.1 | 1455
> Request complete | 2017-09-13 00:01:06.682794 | 127.0.0.1 | 1794
> {noformat}
> h3. 4. Start node2 and confirm node2 has no data.
> {noformat}
> $ ccm node2 start && ccm status
> Cluster: 'cluster_3.0.14'
> -------------------------
> node1: UP
> node3: DOWN
> node2: UP
> $ ccm node2 nodetool flush
> $ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
> ls: /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db: No such file or directory
> {noformat}
> h3. 5. Select the row from node2 and observe read repair running.
> {noformat}
> $ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
> key | value
> -----+-------
> (0 rows)
> Tracing session: 72a71fc0-97cb-11e7-83cc-a3af9d3da979
> activity | timestamp | source | source_elapsed
> ------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
> Execute CQL3 query | 2017-09-13 00:02:51.582000 | 127.0.0.2 | 0
> Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1112
> Preparing statement [SharedPool-Worker-2] | 2017-09-13 00:02:51.583000 | 127.0.0.2 | 1412
> reading data from /127.0.0.1 [SharedPool-Worker-2] | 2017-09-13 00:02:51.584000 | 127.0.0.2 | 2107
> Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3492
> Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3516
> Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:02:51.585000 | 127.0.0.2 | 3595
> Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3673
> Read 0 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:02:51.585001 | 127.0.0.2 | 3851
> READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:02:51.588000 | 127.0.0.1 | 33
> Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12444
> Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12536
> Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12765
> Enqueuing response to /127.0.0.2 [SharedPool-Worker-2] | 2017-09-13 00:02:51.600000 | 127.0.0.1 | 12929
> Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:02:51.602000 | 127.0.0.1 | 14686
> REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:02:51.603000 | 127.0.0.2 | --
> Processing response from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
> Initiating read-repair [SharedPool-Worker-3] | 2017-09-13 00:02:51.610000 | 127.0.0.2 | --
> Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4886857781295767937, 6d6d756c6c617373) (d41d8cd98f00b204e9800998ecf8427e vs f8e0f9262a889cd3ebf4e5d50159757b) [ReadRepairStage:1] | 2017-09-13 00:02:51.624000 | 127.0.0.2 | --
> Request complete | 2017-09-13 00:02:51.586892 | 127.0.0.2 | 4892
> {noformat}
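> The trace shows the coordinator reading data from /127.0.0.1 in parallel,
> hitting a digest mismatch, and initiating read repair even though both
> *_chance options are 0.0. The chance options appear to gate only the
> probabilistic background read repair, not the repair performed when the
> responses gathered by a speculative read disagree. One way to confirm that
> the speculative read path is responsible (not run here) would be to disable
> it and repeat the query:
> {noformat}
> $ ccm node2 cqlsh -k ks1 -e "ALTER TABLE ks1.t1 WITH speculative_retry = 'NONE';"
> $ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
> {noformat}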
> h3. 6. As a result, node2 has the row.
> {noformat}
> $ ccm node2 cqlsh -k ks1 -e "consistency; tracing on; select * from ks1.t1 where key = 'mmullass';"
> Current consistency level is ONE.
> Now Tracing is enabled
> key | value
> ----------+--------------------
> mmullass | 0x0000000000000001
> (1 rows)
> Tracing session: 78526330-97cb-11e7-83cc-a3af9d3da979
> activity | timestamp | source | source_elapsed
> ------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
> Execute CQL3 query | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 0
> Parsing select * from ks1.t1 where key = 'mmullass'; [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 216
> Preparing statement [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 390
> reading data from /127.0.0.1 [SharedPool-Worker-3] | 2017-09-13 00:03:01.091000 | 127.0.0.2 | 808
> Executing single-partition query on t1 [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1041
> READ message received from /127.0.0.2 [MessagingService-Incoming-/127.0.0.2] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 33
> Sending READ message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1036
> Executing single-partition query on t1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 189
> Acquiring sstable references [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1113
> Acquiring sstable references [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 276
> Merging memtable contents [SharedPool-Worker-2] | 2017-09-13 00:03:01.092000 | 127.0.0.2 | 1172
> Merging memtable contents [SharedPool-Worker-1] | 2017-09-13 00:03:01.092000 | 127.0.0.1 | 332
> REQUEST_RESPONSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2017-09-13 00:03:01.093000 | 127.0.0.2 | --
> Read 1 live and 0 tombstone cells [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 565
> Enqueuing response to /127.0.0.2 [SharedPool-Worker-1] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 648
> Sending REQUEST_RESPONSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2017-09-13 00:03:01.093000 | 127.0.0.1 | 783
> Processing response from /127.0.0.1 [SharedPool-Worker-1] | 2017-09-13 00:03:01.094000 | 127.0.0.2 | --
> Initiating read-repair [SharedPool-Worker-1] | 2017-09-13 00:03:01.099000 | 127.0.0.2 | --
> Read 1 live and 0 tombstone cells [SharedPool-Worker-2] | 2017-09-13 00:03:01.101000 | 127.0.0.2 | 10113
> Request complete | 2017-09-13 00:03:01.092830 | 127.0.0.2 | 1830
> $ ccm node2 nodetool flush
> $ ls ~/.ccm/cluster_3.0.14/node2/data0/ks1/t1-*/*-Data.db
> /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db
> $ ~/.ccm/repository/3.0.14/tools/bin/sstabledump /Users/hiwakaba/.ccm/cluster_3.0.14/node2/data0/ks1/t1-ec659e0097ca11e78ea7c1bd4d549501/mc-1-big-Data.db -k mmullass
> [
>   {
>     "partition" : {
>       "key" : [ "mmullass" ],
>       "position" : 0
>     },
>     "rows" : [
>       {
>         "type" : "row",
>         "position" : 36,
>         "liveness_info" : { "tstamp" : "2017-09-12T14:59:42.312969Z" },
>         "cells" : [
>           { "name" : "value", "value" : "0000000000000001" }
>         ]
>       }
>     ]
>   }
> ]
> {noformat}
> In [CASSANDRA-11409|https://issues.apache.org/jira/browse/CASSANDRA-11409],
> [~cam1982] commented that this behavior was not a bug, so I filed this
> issue as an improvement.