[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-10-17 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16207542#comment-16207542
 ] 

Aleksey Yeschenko commented on CASSANDRA-13863:
---

[~zznate] Yep, though that JIRA is not about removing read repair or about making it non-blocking; those are orthogonal concerns.

> Speculative retry causes read repair even if read_repair_chance is 0.0.
> ---
>
> Key: CASSANDRA-13863
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13863
> Project: Cassandra
> Issue Type: Improvement
> Reporter: Hiro Wakabayashi
> Attachments: 0001-Use-read_repair_chance-when-starting-repairs-due-to-.patch, speculative retries.pdf
>
>
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> cause no read repair, but read repair happens with speculative retry. I think 
> {{read_repair_chance = 0.0}} and {{dclocal_read_repair_chance = 0.0}} should 
> stop read repair completely because the user wants to stop read repair in 
> some cases.
> {panel:title=Case 1: TWCS users}
> The 
> [documentation|http://cassandra.apache.org/doc/latest/operating/compaction.html?highlight=read_repair_chance]
>  states how to disable read repair.
> {quote}While TWCS tries to minimize the impact of comingled data, users 
> should attempt to avoid this behavior. Specifically, users should avoid 
> queries that explicitly set the timestamp via CQL USING TIMESTAMP. 
> Additionally, users should run frequent repairs (which streams data in such a 
> way that it does not become comingled), and disable background read repair by 
> setting the table’s read_repair_chance and dclocal_read_repair_chance to 0.
> {quote}
> {panel}
> {panel:title=Case 2: Strict SLA for read latency}
> At peak times, read latency is key for us, but read repair causes higher latency than no read repair. We can use anti-entropy repair at off-peak times for consistency.
> {panel}
>  
> Here is my procedure to reproduce the problem.
> h3. 1. Create a cluster and set {{hinted_handoff_enabled}} to false.
> {noformat}
> $ ccm create -v 3.0.14 -n 3 cluster_3.0.14
> $ for h in $(seq 1 3) ; do perl -pi -e 's/hinted_handoff_enabled: true/hinted_handoff_enabled: false/' ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> $ for h in $(seq 1 3) ; do grep "hinted_handoff_enabled:" ~/.ccm/cluster_3.0.14/node$h/conf/cassandra.yaml ; done
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> hinted_handoff_enabled: false
> $ ccm start
> {noformat}
> h3. 2. Create a keyspace and a table.
> {noformat}
> $ ccm node1 cqlsh
> DROP KEYSPACE IF EXISTS ks1;
> CREATE KEYSPACE ks1 WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
> CREATE TABLE ks1.t1 (
>     key text PRIMARY KEY,
>     value blob
> ) WITH bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = 'ALWAYS';
> QUIT;
> {noformat}
> h3. 3. Stop node2 and node3. Insert a row.
> {noformat}
> $ ccm node3 stop && ccm node2 stop && ccm status
> Cluster: 'cluster_3.0.14'
> --
> node1: UP
> node3: DOWN
> node2: DOWN
> $ ccm node1 cqlsh -k ks1 -e "consistency; tracing on; insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1));"
> Current consistency level is ONE.
> Now Tracing is enabled
> Tracing session: 01d74590-97cb-11e7-8ea7-c1bd4d549501
>  activity                                                                                            | timestamp                  | source    | source_elapsed
> -----------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------
>                                                                                   Execute CQL3 query | 2017-09-12 23:59:42.316000 | 127.0.0.1 |              0
>  Parsing insert into ks1.t1 (key, value) values ('mmullass', bigintAsBlob(1)); [SharedPool-Worker-1] | 2017-09-12 23:59:42.319000 | 127.0.0.1 |           4323
>                                                              Preparing statement [SharedPool-Worker-1] | 2017-09-12 23:59:42.32 | 127.0.0.1 |           5250
> {noformat}

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-10-16 Thread Nate McCall (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16206662#comment-16206662
 ] 

Nate McCall commented on CASSANDRA-13863:
-

Note: removing the {{*read_repair_chance}} options entirely is currently being discussed under CASSANDRA-13910.

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-10-05 Thread Shogo Hoshii (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16194173#comment-16194173
 ] 

Shogo Hoshii commented on CASSANDRA-13863:
--

Hello,

I am Shogo, from the same company as Mr. Wakabayashi and Mr. Mohanan.
I have attached a PDF file that shows the results of a performance test comparing 3.0.9 and 3.0.12.

[~muru] Could you attach the patch you made and, if possible, the performance test results from the patched cluster? They would be helpful for the discussion.

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-26 Thread Jeremy Hanna (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181708#comment-16181708
 ] 

Jeremy Hanna commented on CASSANDRA-13863:
--

Linking to CASSANDRA-10726, as one intent of that ticket is to remove blocking read repairs.

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-26 Thread Murukesh Mohanan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181007#comment-16181007
 ] 

Murukesh Mohanan commented on CASSANDRA-13863:
--

How about:
* having the {{DigestMismatchException}} handler get a read repair decision? While discussing this problem with Wakabayashi-san, I made a quick patch that does this (a rough sketch of the idea follows below), but I have the feeling it's not the right way to go about this.
* or, a global flag to disable automatic read repairs? I'm not sure how this would go - I'm guessing there are quite a few places where read repairs can be initiated?
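
To illustrate the first option, a minimal, self-contained sketch; every class and method name in it is a made-up stand-in, not the actual 3.0 read path:

{noformat}
// Hypothetical sketch of consulting the table's read repair chances in the
// digest-mismatch path; names are stand-ins, not real Cassandra classes.
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

public final class DigestMismatchSketch {

    static final class DigestMismatchException extends Exception {}

    /** Table-level repair chances, as set via CQL. */
    static final class RepairParams {
        final double readRepairChance;
        final double dclocalReadRepairChance;
        RepairParams(double rrc, double dclocalRrc) {
            this.readRepairChance = rrc;
            this.dclocalReadRepairChance = dclocalRrc;
        }
    }

    static void compareDigests(List<byte[]> digests) throws DigestMismatchException {
        for (byte[] d : digests)
            if (!Arrays.equals(d, digests.get(0)))
                throw new DigestMismatchException();
    }

    static void sendRepairMutations() {
        System.out.println("sending repair mutations to out-of-date replicas");
    }

    /**
     * Resolve replica responses. Today the catch block repairs
     * unconditionally; the proposal is to roll the dice against the
     * table's chances first, so that chance = 0.0 really disables
     * read repair.
     */
    static void resolve(List<byte[]> digests, RepairParams params) {
        try {
            compareDigests(digests);
        } catch (DigestMismatchException e) {
            double chance = Math.max(params.readRepairChance, params.dclocalReadRepairChance);
            if (ThreadLocalRandom.current().nextDouble() < chance)
                sendRepairMutations();
            // else: still answer the client from the freshest response,
            // but skip the repair writes entirely
        }
    }

    public static void main(String[] args) {
        List<byte[]> mismatching = List.of(new byte[]{1}, new byte[]{2});
        resolve(mismatching, new RepairParams(0.0, 0.0)); // no repair is sent
    }
}
{noformat}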

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-13 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164538#comment-16164538
 ] 

Aleksey Yeschenko commented on CASSANDRA-13863:
---

bq. read_repair_chance = 0.0 and dclocal_read_repair_chance = 0.0 should cause 
no read repair, but read repair happens with speculative retry. I think 
read_repair_chance = 0.0 and dclocal_read_repair_chance = 0.0 should stop read 
repair completely because the user wants to stop read repair in some cases.

{{(dclocal_)?read_repair_chance}} only determines whether or not Cassandra should attempt to fetch responses from additional replicas (for the purposes of repair). Whenever we have more than one response (because of non-zero read repair chances, speculative retry, or just CL > ONE) and some responses mismatch, we perform repair, unconditionally. So you could say the settings are confusingly named, and that maybe we could use a way to disable repair of mismatched responses, but yeah, this is not a bug.
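
In code terms, what happens today is roughly the following. This is a simplified, self-contained sketch; the names are illustrative, not the actual Cassandra read path:

{noformat}
// Simplified model of the read path behaviour described above.
import java.util.concurrent.ThreadLocalRandom;

public final class ReadPathSketch {

    static int replicasToContact(int blockFor, double readRepairChance,
                                 double dclocalReadRepairChance, boolean speculate) {
        int contact = blockFor; // replicas needed to satisfy the consistency level
        // (dclocal_)?read_repair_chance only decides whether we fetch
        // responses from *additional* replicas, for repair purposes...
        double chance = Math.max(readRepairChance, dclocalReadRepairChance);
        if (ThreadLocalRandom.current().nextDouble() < chance)
            contact++;
        // ...but speculative retry can add extra responses regardless:
        if (speculate)
            contact++;
        return contact;
    }

    static void onResponses(int responses, boolean digestsMatch) {
        // Whenever more than one response is in hand (extra replicas,
        // speculative retry, or simply CL > ONE) and the responses
        // mismatch, repair happens unconditionally -- no chance roll here.
        if (responses > 1 && !digestsMatch)
            System.out.println("repairing mismatched replicas");
    }

    public static void main(String[] args) {
        // Both chances are 0.0, yet speculative retry still yields a
        // second response, so a mismatch is repaired anyway:
        int responses = replicasToContact(1, 0.0, 0.0, true);
        onResponses(responses, false); // -> "repairing mismatched replicas"
    }
}
{noformat}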

[jira] [Commented] (CASSANDRA-13863) Speculative retry causes read repair even if read_repair_chance is 0.0.

2017-09-12 Thread Jeff Jirsa (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164175#comment-16164175
 ] 

Jeff Jirsa commented on CASSANDRA-13863:


Was similarly surprised when I first saw this in the code. I suspect the logic is that since we've already done the read, it seems silly NOT to repair the data, but TWCS/DTCS is a pretty good example of it not being a strict win.


