Probably you want to read this blog:

https://academy.datastax.com/support-blog/read-repair

-Arvinder

On Tue, Jan 1, 2019, 12:43 PM Jeff Jirsa <jji...@gmail.com> wrote:

> There are two types of read repair
>
> - Blocking/foreground read repair, triggered when you read at your
> consistency level (LOCAL_QUORUM for you) and the responses don't match
>
> - Probabilistic read repair, which queries extra hosts in advance and read
> repairs them if they mismatch, AFTER responding to the caller/client
>
> You’ve disabled the latter but you can’t disable the former (there’s a
> proposal to configure that but I don’t recall if it’s been committed and
> I’m mobile so not gonna go search JIRA).
>
> The big mutation is due to a large mismatch - probably caused by the
> bounces and by reads happening before hints replayed (the default hint
> throttle is quite low in 3.11, you may want to increase it).
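For reference, a sketch of how the hint throttle might be raised (the option names below are from cassandra.yaml in 3.11; the values are illustrative, verify against your version before applying):

```yaml
# cassandra.yaml -- throttle is in KB/s, shared across delivery threads;
# the default of 1024 can make hints drain slowly after a bounce
hinted_handoff_throttle_in_kb: 10240
max_hints_delivery_threads: 2
```

The throttle can also be changed at runtime on a live node with `nodetool sethintedhandoffthrottlekb 10240` (no restart needed), though only the yaml change persists across restarts.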
>
>
> --
> Jeff Jirsa
>
>
> On Jan 1, 2019, at 11:51 AM, Vlad <qa23d-...@yahoo.com.invalid> wrote:
>
> Hi, thanks for the answer.
>
> What I don't understand is:
>
> - Why are there read repair attempts if both read repair chances are 0.0?
> - What can cause the big mutation size?
> - Why didn't hinted handoff prevent the inconsistency? (Because of the big
> mutation size?)
>
> Thanks.
>
>
> On Tuesday, January 1, 2019 9:41 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>
>
> Read repair due to digest mismatch and speculative retry can both cause
> some behaviors that are hard to reason about (usually seen if a host stops
> accepting writes due to a bad disk, which you haven't described, but
> generally speaking, there are times when reads will block on writing to
> extra replicas).
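If you suspect speculative retry is part of why reads touch extra replicas, one diagnostic (a sketch only; the keyspace/table names are hypothetical, and disabling speculative retry trades away tail-latency protection, so weigh that before leaving it in place) is:

```sql
-- Disable speculative retry on the affected table while investigating
ALTER TABLE myks.mytable WITH speculative_retry = 'NONE';

-- Restore the 3.11 default afterwards
ALTER TABLE myks.mytable WITH speculative_retry = '99PERCENTILE';
```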
>
> The patch from https://issues.apache.org/jira/browse/CASSANDRA-10726
> changes this behavior significantly.
>
> The last message in this thread (about huge read repair mutations)
> suggests that your writes during the bounce got some partitions quite out
> of sync, hints aren't replaying fast enough to fill in the gaps before
> you read, and the read repair is timing out. The read repair timing out
> wouldn't block the read after 10726, so if you're seeing read timeouts
> right now, you probably want to either run repair, read much smaller
> pages so that read repair succeeds, or increase your commitlog segment
> size from 32M to 128M or so until the read repair actually succeeds.
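The commitlog segment size Jeff mentions is a cassandra.yaml setting; a sketch of the change (it requires a rolling restart to take effect; verify the option name against your version):

```yaml
# cassandra.yaml -- the maximum mutation size defaults to half the
# segment size, so 128 MB segments allow mutations up to ~64 MB,
# which is what lets a very large read-repair mutation go through
commitlog_segment_size_in_mb: 128
```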
>
>
> On Tue, Jan 1, 2019 at 12:18 AM Vlad <qa23d-...@yahoo.com.invalid> wrote:
>
> Hi All and Happy New Year!!!
>
> This year started with Cassandra 3.11.3 sometimes forcing consistency
> level ALL despite the query level being LOCAL_QUORUM (there is actually
> only one DC), and it fails with a timeout.
>
> As far as I understand, it can be caused by read repair attempts (we see
> "DigestMismatch" errors in the Cassandra log), but the table has no read
> repair configured:
>
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
>
> Any suggestions?
>
> Thanks.
>
