Re: How does the "batch" commit log sync works

2016-10-30 Thread Hiroyuki Yamada
Hello Benedict and Edward,

Thank you very much for the comments.
I think the batch parameter is useful when doing some transactional
processing on C* where we need atomicity and higher durability.

Anyway, I think it is not working as expected, at least in the latest
versions of 2.1 and 2.2.
So, I created a ticket in JIRA.
https://issues.apache.org/jira/browse/CASSANDRA-12864

I hope it will be fixed soon.

Thanks,
Hiro

On Fri, Oct 28, 2016 at 6:00 PM, Benedict Elliott Smith wrote:
> That is the maximum length of time that queries may be batched together for,
> not the minimum. If there is a break in the flow of queries for the commit
> log, it will commit those outstanding immediately.  It will anyway commit in
> clusters of commit log file size (default 32 MB).
>
> I know the documentation used to disagree with itself in a few places, and
> with actual behaviour, but I thought that had been fixed.  I suggest you
> file a ticket if you find a mention that does not match this description.
>
> Really the batch period is a near useless parameter.  If it were to be
> honoured as a minimum, performance would decline due to the threading model
> in Cassandra (and it will be years before this and memory management improve
> enough to support that behaviour).
>
> Conversely honouring it as a maximum is only possible for very small values,
> just by nature of queueing theory.
>
> I believe I proposed removing the parameter entirely some time ago, though
> it is lost in the mists of time.
>
> Anyway, many people do indeed use this commitlog mode successfully, although
> it is by far less common than periodic mode.  This behaviour does not mean
> your data is in any way unsafe.
>
>
> On Friday, 28 October 2016, Edward Capriolo  wrote:
>>
>> I mentioned during my Cassandra.yaml presentation at the summit that I
>> never saw anyone use these settings. Things off by default are typically not
>> covered well by tests. It sounds like it is not working. Quick
>> suggestion: go back in time maybe to a version like 1.2.X or 0.7 and see if
>> it behaves like the yaml suggests it should.
>>
>> On Thu, Oct 27, 2016 at 11:48 PM, Hiroyuki Yamada wrote:
>>>
>>> Hello Satoshi and the community,
>>>
>>> I am also using commitlog_sync for durability, but I had never
>>> modified the commitlog_sync_batch_window_in_ms parameter before,
>>> so I wondered whether it is working.
>>>
>>> As Satoshi said, I also changed commitlog_sync_batch_window_in_ms (to
>>> 1), restarted C*, and issued some INSERT commands.
>>> However, they returned immediately after being issued.
>>>
>>> So, it seems like the parameter is not working correctly.
>>> Are we missing something?
>>>
>>> Thanks,
>>> Hiro
>>>
>>> On Thu, Oct 27, 2016 at 5:58 PM, Satoshi Hikida wrote:
>>> > Hi, all.
>>> >
>>> > I have a question about "batch" commit log sync behavior with C* version
>>> > 2.2.8.
>>> >
>>> > Here's what I have done:
>>> >
>>> > * set commitlog_sync to the "batch" mode as follows:
>>> >
>>> >> commitlog_sync: batch
>>> >> commitlog_sync_batch_window_in_ms: 1
>>> >
>>> > * ran a script which inserts data into a table
>>> > * prepared a disk dedicated to storing the commit logs
>>> >
>>> > According to the DataStax document, I expected that fsync is done once in a
>>> > batch window (one fsync per 10 sec in this case) and writes issued within
>>> > this batch window are blocked until fsync is completed.
>>> >
>>> > In my experiment, however, it seems that the write requests returned almost
>>> > immediately (within 300~400 ms).
>>> >
>>> > Am I misunderstanding something? If so, can someone give me any advice as
>>> > to the reason why C* behaves like this?
>>> >
>>> >
>>> > I referred to this document:
>>> >
>>> > https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps
>>> >
>>> > Regards,
>>> > Satoshi
>>> >
>>
>>
>


Re: How does the "batch" commit log sync works

2016-10-28 Thread Benedict Elliott Smith
That is the maximum length of time that queries may be batched together
for, not the minimum. If there is a break in the flow of queries for the
commit log, it will commit those outstanding immediately.  It will anyway
commit in clusters of commit log file size (default 32 MB).
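
The two readings of the batch window discussed in this thread can be compared with a toy model. The sketch below is not Cassandra's implementation: the function names, the fixed fsync cost, and the "sync as soon as the syncer is idle" rule are illustrative assumptions, and the maximum-window cap itself is left out for simplicity.

```python
import math
from typing import List


def acks_sync_when_idle(arrivals_ms: List[float], fsync_ms: float) -> List[float]:
    """Window-as-maximum reading (simplified): the syncer starts an fsync as
    soon as it is idle and at least one write is pending."""
    order = sorted(range(len(arrivals_ms)), key=lambda k: arrivals_ms[k])
    acks = [0.0] * len(arrivals_ms)
    syncer_free_at = 0.0
    i = 0
    while i < len(order):
        # The next fsync starts once the oldest pending write has arrived
        # and the previous fsync has finished.
        start = max(arrivals_ms[order[i]], syncer_free_at)
        done = start + fsync_ms
        # Every write that has arrived by `start` is covered by this fsync.
        while i < len(order) and arrivals_ms[order[i]] <= start:
            acks[order[i]] = done
            i += 1
        syncer_free_at = done
    return acks


def acks_sync_on_window_ticks(
    arrivals_ms: List[float], window_ms: float, fsync_ms: float
) -> List[float]:
    """Window-as-minimum reading: fsync happens only at multiples of window_ms,
    so each write waits for the next tick."""
    return [math.ceil(t / window_ms) * window_ms + fsync_ms for t in arrivals_ms]


if __name__ == "__main__":
    arrivals = [100.0, 102.0, 103.0, 5000.0]  # hypothetical arrival times in ms
    print(acks_sync_when_idle(arrivals, fsync_ms=5.0))
    print(acks_sync_on_window_ticks(arrivals, window_ms=10000.0, fsync_ms=5.0))
```

In the first model each write is acknowledged a few milliseconds after it arrives, and writes that arrive during an in-flight fsync share the next one. In the second model every write waits for the next window tick, which is the behaviour the original question expected.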

I know the documentation used to disagree with itself in a few places, and
with actual behaviour, but I thought that had been fixed.  I suggest you
file a ticket if you find a mention that does not match this description.

Really the batch period is a near useless parameter.  If it were to be
honoured as a minimum, performance would decline due to the threading model
in Cassandra (and it will be years before this and memory management
improve enough to support that behaviour).

Conversely honouring it as a maximum is only possible for very small
values, just by nature of queueing theory.

I believe I proposed removing the parameter entirely some time ago, though
it is lost in the mists of time.

Anyway, many people do indeed use this commitlog mode
successfully, although it is by far less common than periodic mode.  This
behaviour does not mean your data is in any way unsafe.

On Friday, 28 October 2016, Edward Capriolo  wrote:

> I mentioned during my Cassandra.yaml presentation at the summit that I
> never saw anyone use these settings. Things off by default are typically
> not covered well by tests. It sounds like it is not working.
> Quick suggestion: go back in time maybe to a version like 1.2.X or 0.7 and
> see if it behaves like the yaml suggests it should.
>
> On Thu, Oct 27, 2016 at 11:48 PM, Hiroyuki Yamada wrote:
>
>> Hello Satoshi and the community,
>>
>> I am also using commitlog_sync for durability, but I had never
>> modified the commitlog_sync_batch_window_in_ms parameter before,
>> so I wondered whether it is working.
>>
>> As Satoshi said, I also changed commitlog_sync_batch_window_in_ms (to
>> 1), restarted C*, and issued some INSERT commands.
>> However, they returned immediately after being issued.
>>
>> So, it seems like the parameter is not working correctly.
>> Are we missing something?
>>
>> Thanks,
>> Hiro
>>
>> On Thu, Oct 27, 2016 at 5:58 PM, Satoshi Hikida wrote:
>> > Hi, all.
>> >
>> > I have a question about "batch" commit log sync behavior with C* version
>> > 2.2.8.
>> >
>> > Here's what I have done:
>> >
>> > * set commitlog_sync to the "batch" mode as follows:
>> >
>> >> commitlog_sync: batch
>> >> commitlog_sync_batch_window_in_ms: 1
>> >
>> > * ran a script which inserts data into a table
>> > * prepared a disk dedicated to storing the commit logs
>> >
>> > According to the DataStax document, I expected that fsync is done once in a
>> > batch window (one fsync per 10 sec in this case) and writes issued within
>> > this batch window are blocked until fsync is completed.
>> >
>> > In my experiment, however, it seems that the write requests returned almost
>> > immediately (within 300~400 ms).
>> >
>> > Am I misunderstanding something? If so, can someone give me any advice as
>> > to the reason why C* behaves like this?
>> >
>> >
>> > I referred to this document:
>> > https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps
>> >
>> > Regards,
>> > Satoshi
>> >
>>
>
>


Re: How does the "batch" commit log sync works

2016-10-27 Thread Edward Capriolo
I mentioned during my Cassandra.yaml presentation at the summit that I
never saw anyone use these settings. Things off by default are typically
not covered well by tests. It sounds like it is not working.
Quick suggestion: go back in time maybe to a version like 1.2.X or 0.7 and
see if it behaves like the yaml suggests it should.

On Thu, Oct 27, 2016 at 11:48 PM, Hiroyuki Yamada wrote:

> Hello Satoshi and the community,
>
> I am also using commitlog_sync for durability, but I had never
> modified the commitlog_sync_batch_window_in_ms parameter before,
> so I wondered whether it is working.
>
> As Satoshi said, I also changed commitlog_sync_batch_window_in_ms (to
> 1), restarted C*, and issued some INSERT commands.
> However, they returned immediately after being issued.
>
> So, it seems like the parameter is not working correctly.
> Are we missing something?
>
> Thanks,
> Hiro
>
> On Thu, Oct 27, 2016 at 5:58 PM, Satoshi Hikida wrote:
> > Hi, all.
> >
> > I have a question about "batch" commit log sync behavior with C* version
> > 2.2.8.
> >
> > Here's what I have done:
> >
> > * set commitlog_sync to the "batch" mode as follows:
> >
> >> commitlog_sync: batch
> >> commitlog_sync_batch_window_in_ms: 1
> >
> > * ran a script which inserts data into a table
> > * prepared a disk dedicated to storing the commit logs
> >
> > According to the DataStax document, I expected that fsync is done once in a
> > batch window (one fsync per 10 sec in this case) and writes issued within
> > this batch window are blocked until fsync is completed.
> >
> > In my experiment, however, it seems that the write requests returned almost
> > immediately (within 300~400 ms).
> >
> > Am I misunderstanding something? If so, can someone give me any advice as
> > to the reason why C* behaves like this?
> >
> >
> > I referred to this document:
> > https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps
> >
> > Regards,
> > Satoshi
> >
>


Re: How does the "batch" commit log sync works

2016-10-27 Thread Hiroyuki Yamada
Hello Satoshi and the community,

I am also using commitlog_sync for durability, but I had never
modified the commitlog_sync_batch_window_in_ms parameter before,
so I wondered whether it is working.

As Satoshi said, I also changed commitlog_sync_batch_window_in_ms (to
1), restarted C*, and issued some INSERT commands.
However, they returned immediately after being issued.

So, it seems like the parameter is not working correctly.
Are we missing something?

Thanks,
Hiro

On Thu, Oct 27, 2016 at 5:58 PM, Satoshi Hikida  wrote:
> Hi, all.
>
> I have a question about "batch" commit log sync behavior with C* version
> 2.2.8.
>
> Here's what I have done:
>
> * set commitlog_sync to the "batch" mode as follows:
>
>> commitlog_sync: batch
>> commitlog_sync_batch_window_in_ms: 1
>
> * ran a script which inserts data into a table
> * prepared a disk dedicated to storing the commit logs
>
> According to the DataStax document, I expected that fsync is done once in a
> batch window (one fsync per 10 sec in this case) and writes issued within
> this batch window are blocked until fsync is completed.
>
> In my experiment, however, it seems that the write requests returned almost
> immediately (within 300~400 ms).
>
> Am I misunderstanding something? If so, can someone give me any advice as
> to the reason why C* behaves like this?
>
>
> I referred to this document:
> https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps
>
> Regards,
> Satoshi
>


How does the "batch" commit log sync works

2016-10-27 Thread Satoshi Hikida
Hi, all.

I have a question about "batch" commit log sync behavior with C* version
2.2.8.

Here's what I have done:

* set commitlog_sync to the "batch" mode as follows:

> commitlog_sync: batch
> commitlog_sync_batch_window_in_ms: 1

* ran a script which inserts data into a table
* prepared a disk dedicated to storing the commit logs

According to the DataStax document, I expected that fsync is done once in a
batch window (one fsync per 10 sec in this case) and writes issued within
this batch window are blocked until fsync is completed.

In my experiment, however, it seems that the write requests returned almost
immediately (within 300~400 ms).
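
A small sketch of how such an observation could be checked against the expectation. Note that commitlog_sync_batch_window_in_ms is expressed in milliseconds, so a 10-second window corresponds to a value of 10000, while a value of 1 is a 1 ms window. Everything below is hypothetical: `write` stands in for whatever issues one INSERT, and the "median latency of at least half the window" rule is only a rough heuristic, assuming writes arrive at random points within a window that is synced on fixed ticks.

```python
import statistics
import time
from typing import Callable, List


def window_ms_from_seconds(seconds: float) -> float:
    """Convert a window given in seconds to the millisecond unit used by
    commitlog_sync_batch_window_in_ms."""
    return seconds * 1000.0


def measure_latencies_ms(write: Callable[[], None], n: int) -> List[float]:
    """Time n calls of `write` (a stand-in for one INSERT) in milliseconds."""
    latencies = []
    for _ in range(n):
        t0 = time.perf_counter()
        write()
        latencies.append((time.perf_counter() - t0) * 1000.0)
    return latencies


def looks_window_bound(latencies_ms: List[float], window_ms: float) -> bool:
    """Heuristic (an assumption, not a Cassandra rule): if writes were blocked
    until a fixed tick every window_ms, a typical wait would be roughly half
    the window, so compare the median latency against window_ms / 2."""
    return statistics.median(latencies_ms) >= window_ms / 2.0


if __name__ == "__main__":
    window_ms = window_ms_from_seconds(10)
    observed = [300.0, 400.0, 350.0]  # latencies in the range reported above
    print(window_ms, looks_window_bound(observed, window_ms))
```

With latencies in the 300~400 ms range and a 10-second window, the heuristic reports that the writes do not appear to be waiting on the window.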

Am I misunderstanding something? If so, can someone give me any advice as
to the reason why C* behaves like this?


I referred to this document:
https://docs.datastax.com/en/cassandra/2.2/cassandra/configuration/configCassandra_yaml.html#configCassandra_yaml__PerformanceTuningProps

Regards,
Satoshi