I've collected some more data points, and I still see dropped
mutations with compaction_throughput_mb_per_sec set to 8.
The only notable thing about the current setup is that I have another
keyspace (not being repaired, though) with really wide rows (100 MB
per partition), but in theory that shouldn't have any impact.
The nodes don't seem that overloaded either, and I don't see any GC
spikes while those mutations are dropped :/
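
For reference, the drops I'm referring to are the ones visible via
plain nodetool, e.g. (the dropped-message counters are cumulative
since node start, so two samples need to be diffed):

    # "Message type / Dropped" section includes MUTATION
    nodetool tpstats

    # confirm anticompaction is what's running at that moment
    nodetool compactionstats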

Hitting a dead end here; any idea where else I should look?

Regards,
Stefano

On Wed, Aug 10, 2016 at 12:41 PM, Stefano Ortolani <ostef...@gmail.com> wrote:
> That's what I was thinking. Maybe GC pressure?
> Some more details: during anticompaction I have some CFs exploding to
> 1K SSTables (dropping back to ~200 upon completion).
> HW specs should be quite good (12 cores / 32 GB RAM) but, I admit,
> still relying on spinning disks, with ~150 GB per node.
> Current version is 3.0.8.
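>
> (For reference, the SSTable counts above come from the standard
> tooling, e.g.:
>
>     nodetool compactionstats        # active (anti)compaction tasks
>     nodetool cfstats <keyspace>     # per-CF "SSTable count"
>
> where <keyspace> is just a placeholder for the keyspace under repair.)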
>
>
> On Wed, Aug 10, 2016 at 12:36 PM, Paulo Motta <pauloricard...@gmail.com>
> wrote:
>>
>> That's pretty low already, but perhaps you should lower it further to
>> see if that improves the dropped mutations during anticompaction (even
>> if it increases repair time); otherwise the problem might be somewhere
>> else. Generally, dropped mutations are a signal of cluster overload,
>> so if there's nothing else wrong, perhaps you need to increase your
>> capacity. What version are you on?
>>
>> 2016-08-10 8:21 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>>>
>>> Not yet. Right now I have it set at 16.
>>> Would halving it more or less double the repair time?
>>>
>>> On Tue, Aug 9, 2016 at 7:58 PM, Paulo Motta <pauloricard...@gmail.com>
>>> wrote:
>>>>
>>>> Anticompaction throttling can be done by setting the usual
>>>> compaction_throughput_mb_per_sec knob in cassandra.yaml or via
>>>> nodetool setcompactionthroughput. Did you try lowering that and
>>>> checking whether it improves the dropped mutations?
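>>>>
>>>> For example (8 is just an illustrative value; the nodetool change
>>>> applies immediately but does not survive a restart, while the yaml
>>>> setting applies from the next restart):
>>>>
>>>>     # runtime, per node
>>>>     nodetool setcompactionthroughput 8
>>>>     nodetool getcompactionthroughput
>>>>
>>>>     # cassandra.yaml
>>>>     compaction_throughput_mb_per_sec: 8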
>>>>
>>>> 2016-08-09 13:32 GMT-03:00 Stefano Ortolani <ostef...@gmail.com>:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I am running incremental repairs on a weekly basis (I can't do it
>>>>> every day, as one single run takes 36 hours), and every time at
>>>>> least one node drops mutations as part of the process (almost
>>>>> always during the anticompaction phase). Ironically, this leads to
>>>>> a system where repairing makes some data consistent at the cost of
>>>>> making other data inconsistent.
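>>>>>
>>>>> (For context, the weekly run is essentially the stock incremental
>>>>> repair, along the lines of:
>>>>>
>>>>>     nodetool repair <keyspace>   # incremental is the default since 2.2
>>>>>
>>>>> with <keyspace> standing in for the actual keyspaces.)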
>>>>>
>>>>> Does anybody know why this is happening?
>>>>>
>>>>> My feeling is that this might be caused by anticompacting column
>>>>> families with really wide rows and many SSTables. If that is the
>>>>> case, is there any way I can throttle that?
>>>>>
>>>>> Thanks!
>>>>> Stefano
>>>>
>>>>
>>>
>>
>
