I have been circling around a thought process over batches. Now that Cassandra has aggregate functions, it might be possible to write a type of record that has an END_OF_BATCH marker, with the data suppressed from view until it is all there.
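Concretely, that idea could be sketched client-side like this. Everything below is hypothetical (the marker key, the function names, a plain dict standing in for a table) - none of it is an existing Cassandra feature:

```python
import hashlib
import json

# Hypothetical marker key; nothing here is an existing Cassandra API.
END_OF_BATCH = "__END_OF_BATCH__"

def batch_checksum(rows):
    """Stable checksum over the batch payloads."""
    h = hashlib.sha256()
    for row in rows:
        h.update(json.dumps(row, sort_keys=True).encode("utf-8"))
    return h.hexdigest()

def write_batch(store, batch_id, rows):
    """Write the rows, then a final marker row a reader can verify."""
    for i, row in enumerate(rows):
        store[(batch_id, i)] = row
    store[(batch_id, END_OF_BATCH)] = {
        "count": len(rows),
        "checksum": batch_checksum(rows),
    }

def read_batch(store, batch_id):
    """Return the rows only if the batch is provably complete."""
    marker = store.get((batch_id, END_OF_BATCH))
    if marker is None:
        return None  # marker not written yet: suppress the batch
    rows = [store[(batch_id, i)]
            for i in range(marker["count"]) if (batch_id, i) in store]
    if len(rows) != marker["count"] or batch_checksum(rows) != marker["checksum"]:
        return None  # rows missing or corrupt: suppress the batch
    return rows
```

In Cassandra terms, the marker would just be one more row written last, and the suppression would live in an intelligent client that refuses to surface the batch until the marker checks out.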
I.e., you write something like a checksum record that an intelligent client can use to tell whether the rest of the batch is complete.

On Wed, Dec 7, 2016 at 11:58 AM, Voytek Jarnot <voytek.jar...@gmail.com> wrote:

> Been about a month since I gave up on it, but it was very much related to
> the stuff you're dealing with ... basically Cassandra just stepping on its
> own.... errrrr, tripping over its own feet streaming MVs.
>
> On Dec 7, 2016 10:45 AM, "Benjamin Roth" <benjamin.r...@jaumo.com> wrote:
>
>> I meant the MV thing
>>
>> On Dec 7, 2016 at 17:27, "Voytek Jarnot" <voytek.jar...@gmail.com> wrote:
>>
>>> Sure, about which part?
>>>
>>> The default batch size warning threshold is 5kb. I've increased it to
>>> 30kb, and will need to increase it to 40kb (8x the default setting) to
>>> avoid WARN log messages about batch sizes. I realize it's just a
>>> WARNing, but I may as well avoid those if I can configure it out. That
>>> said, having to increase it so substantially (and we're only dealing
>>> with 5 tables) makes me wonder whether I'm taking the correct approach
>>> in using batches to guarantee atomicity.
>>>
>>> On Wed, Dec 7, 2016 at 10:13 AM, Benjamin Roth <benjamin.r...@jaumo.com>
>>> wrote:
>>>
>>>> Could you please be more specific?
>>>>
>>>> On Dec 7, 2016 at 17:10, "Voytek Jarnot" <voytek.jar...@gmail.com> wrote:
>>>>
>>>>> Should've mentioned - running 3.9. Also - please do not recommend
>>>>> MVs: I tried them, they're broken, we punted.
>>>>>
>>>>> On Wed, Dec 7, 2016 at 10:06 AM, Voytek Jarnot
>>>>> <voytek.jar...@gmail.com> wrote:
>>>>>
>>>>>> The low default value for batch_size_warn_threshold_in_kb is making
>>>>>> me wonder if I'm perhaps approaching the problem of atomicity in a
>>>>>> non-ideal fashion.
>>>>>>
>>>>>> With one data set duplicated/denormalized into 5 tables to support
>>>>>> queries, we use batches to ensure inserts make it to all of the
>>>>>> tables or none. This works fine, but I've had to bump the warn
>>>>>> threshold and fail threshold substantially (8x higher for the warn
>>>>>> threshold). This - in turn - makes me wonder, with a default setting
>>>>>> so low, whether I'm solving this problem in the canonical/standard
>>>>>> way.
>>>>>>
>>>>>> Mostly just looking for confirmation that we're not unintentionally
>>>>>> doing something weird...
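For anyone following along, the thresholds in question are set in cassandra.yaml. The values below are only an illustration of the 8x bump described above, assuming the 3.x defaults of 5kb (warn) and 50kb (fail):

```yaml
# cassandra.yaml - logged batch size thresholds (illustrative values)
batch_size_warn_threshold_in_kb: 40    # 8x the 5kb default, per the thread
batch_size_fail_threshold_in_kb: 100   # keep comfortably above the warn value
```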