Hi Igor, I'm not a big Cassandra expert, but here are my thoughts.
1. Sending updates in a batch is always better than sending them one by one. For example, if you do putAll in Ignite with 100 entries, and these entries are split across 5 nodes, the client will send 5 requests instead of 100. This provides a significant performance improvement. Is there a way to use a similar approach in Cassandra?

2. As for logged batches, I can easily believe that this is a rarely used feature, but since it exists in Cassandra, I can't find a single reason not to support it in our store as an option. Users that come across those rare cases will only say thank you to us :) What do you think?

-Val

On Thu, Jul 28, 2016 at 10:41 PM, Igor Rudyak <irud...@gmail.com> wrote:

> There are actually some cases when atomic read isolation in Cassandra could be important. Let's assume a batch was persisted in Cassandra, but not finalized yet - a read operation from Cassandra returns only the partially committed data of the batch. In such a situation we have problems when:
>
> 1) Some of the batch records have already expired from the Ignite cache and we read them from the persistent store (Cassandra in our case).
>
> 2) All Ignite nodes storing the batch records (or a subset of them) died (or, for example, became unavailable for 10 sec because of a network problem). While reading such records from the Ignite cache we will be redirected to the persistent store.
>
> 3) Network separation occurred in such a way that we now have two Ignite clusters, but all the replicas of the batch data are located in only one of these clusters. Again, while reading such records from the Ignite cache on the second cluster we will be redirected to the persistent store.
>
> In all the mentioned cases, if the Cassandra batch isn't finalized yet, we will read partially committed transaction data.
>
> On Thu, Jul 28, 2016 at 6:52 AM, Luiz Felipe Trevisan <luizfelipe.trevi...@gmail.com> wrote:
>
> > I totally agree with you regarding the guarantees we have with logged batches, and I'm also pretty much aware of the performance penalty involved in using this solution.
> >
> > But since all read operations are executed via Ignite, isolation at the Cassandra level is not really important. I think the only guarantee really needed is that we don't end up with a partial insert in Cassandra in case we have a failure in Ignite and we lose the node that was responsible for this write operation.
> >
> > My other assumption is that the write operation needs to finish before an eviction happens for this entry and we lose the data in the cache (since a batch doesn't guarantee isolation). However, if we cannot achieve this, I don't see why we would use Ignite as a cache store.
> >
> > Luiz
> >
> > --
> > Luiz Felipe Trevisan
> >
> > On Wed, Jul 27, 2016 at 4:55 PM, Igor Rudyak <irud...@gmail.com> wrote:
> >
> >> Hi Luiz,
> >>
> >> Logged batches are not the solution to achieve an atomic view of your Ignite transaction changes in Cassandra.
> >>
> >> The problem with logged (aka atomic) batches is that they only guarantee that if any part of the batch succeeds, all of it will; no other transactional enforcement is done at the batch level. For example, there is no batch isolation. Clients are able to read the first updated rows from the batch while other rows are still being updated on the server (in RDBMS terminology this means *READ-UNCOMMITTED* isolation level). So Cassandra means "atomic" only in the database sense that if any part of the batch succeeds, all of it will.
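
To make my point 2 above concrete: the "option" I have in mind is simply a logged batch at the driver level. A rough sketch with the DataStax Java driver - the class, method and parameter names are made up and this is not the actual module code; as the quoted text explains, it only buys atomicity, not isolation:

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.Session;
    import java.util.Map;

    class LoggedBatchSketch {
        // Hypothetical optional "logged batch" write mode: either all mutations are
        // eventually applied or none are, but readers may still observe a partially
        // applied batch while it is in flight (atomicity without isolation).
        static void writeAllLogged(Session session, PreparedStatement insert, Map<Long, String> entries) {
            BatchStatement batch = new BatchStatement(BatchStatement.Type.LOGGED);

            for (Map.Entry<Long, String> e : entries.entrySet())
                batch.add(insert.bind(e.getKey(), e.getValue()));

            // Single round trip; the batch is written to Cassandra's batch log first.
            session.execute(batch);
        }
    }

Since every logged batch goes through the batch log and costs performance, this would clearly have to stay opt-in.
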
> >> Probably the best way to achieve read-atomic isolation for an Ignite transaction persisting data into Cassandra is to implement RAMP transactions (http://www.bailis.org/papers/ramp-sigmod2014.pdf) on top of Cassandra.
> >>
> >> I may create a ticket for this if the community would like it.
> >>
> >> Igor Rudyak
> >>
> >> On Wed, Jul 27, 2016 at 12:55 PM, Luiz Felipe Trevisan <luizfelipe.trevi...@gmail.com> wrote:
> >>
> >>> Hi Igor,
> >>>
> >>> Does it make sense to you to use logged batches to guarantee atomicity in Cassandra in cases where we are doing a cross-cache transaction operation?
> >>>
> >>> Luiz
> >>>
> >>> --
> >>> Luiz Felipe Trevisan
> >>>
> >>> On Wed, Jul 27, 2016 at 2:05 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >>>
> >>>> I am still very confused. Ilya, can you please explain what happens in Cassandra if a user calls the IgniteCache.putAll(...) method?
> >>>>
> >>>> In Ignite, if putAll(...) is called, Ignite will make the best effort to execute the update as a batch, in which case the performance is better. What is the analogy in Cassandra?
> >>>>
> >>>> D.
> >>>>
> >>>> On Tue, Jul 26, 2016 at 9:16 PM, Igor Rudyak <irud...@gmail.com> wrote:
> >>>>
> >>>> > Dmitriy,
> >>>> >
> >>>> > It is exactly the same approach for all async read/write/delete operations - the Cassandra session just provides an executeAsync(statement) function for all types of operations.
> >>>> >
> >>>> > To be more detailed about Cassandra batches, there are actually two types of batches:
> >>>> >
> >>>> > 1) *Logged batch* (aka atomic) - the main purpose of such batches is to keep duplicated data in sync while updating multiple tables, but at the cost of performance.
> >>>> >
> >>>> > 2) *Unlogged batch* - the only specific case for such a batch is when all updates are addressed to only *one* partition key and the batch has a "*reasonable size*". In such a situation there *could be* performance benefits if you are using the Cassandra *TokenAware* load balancing policy. In this particular case all the updates go directly, without any additional coordination, to the primary node which is responsible for storing data for this partition key.
> >>>> >
> >>>> > The *generic rule* is that *individual updates using async mode* provide the best performance (https://docs.datastax.com/en/cql/3.1/cql/cql_using/useBatch.html). That's because they spread all updates across the whole cluster. In contrast, when you are using batches, what you are actually doing is putting a huge amount of pressure on a single coordinator node, because the coordinator needs to forward each individual insert/update/delete to the correct replicas. In general you're just losing all the benefit of the Cassandra TokenAware load balancing policy when you're updating different partitions in a single round trip to the database.
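
For reference, my understanding of the "individual updates in async mode" approach described above, as a simplified sketch against the DataStax Java driver - the class, method and parameter names are invented, so this is not the actual CassandraSessionImpl code:

    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;

    class AsyncWritesSketch {
        // Every mutation is sent independently, so a token-aware policy can route each
        // one straight to a replica; the caller blocks only once, on the collected futures.
        static void writeAllAsync(Session session, PreparedStatement insert, Map<Long, String> entries) {
            List<ResultSetFuture> futures = new ArrayList<>();

            for (Map.Entry<Long, String> e : entries.entrySet())
                futures.add(session.executeAsync(insert.bind(e.getKey(), e.getValue())));

            for (ResultSetFuture f : futures)
                f.getUninterruptibly(); // failures surface here and can be retried per statement
        }
    }

A nice side effect is that a failed statement can be retried on its own, which matters for the concerns listed right below.
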
> >>>> > Probably the only enhancement which could be done is to split our batch into smaller batches, each of which updates records having the same partition key. In that case it could provide some performance benefits when used in combination with the Cassandra TokenAware policy. But there are several concerns:
> >>>> >
> >>>> > 1) It looks like a rather rare case.
> >>>> > 2) It makes error handling more complex - you don't know which operations in a batch succeeded and which failed, so you need to retry the whole batch.
> >>>> > 3) Retry logic could produce more load on the cluster - with individual updates you only need to retry the mutations that failed, while with batches you need to retry the whole batch.
> >>>> > 4) *Unlogged batch is deprecated in Cassandra 3.0* (https://docs.datastax.com/en/cql/3.3/cql/cql_reference/batch_r.html), which we are currently using for the Ignite Cassandra module.
> >>>> >
> >>>> > Igor Rudyak
> >>>> >
> >>>> > On Tue, Jul 26, 2016 at 4:45 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >>>> >
> >>>> > > On Tue, Jul 26, 2016 at 5:53 PM, Igor Rudyak <irud...@gmail.com> wrote:
> >>>> > >
> >>>> > >> Hi Valentin,
> >>>> > >>
> >>>> > >> For writeAll/readAll the Cassandra cache store implementation uses async operations (http://www.datastax.com/dev/blog/java-driver-async-queries) and futures, which have the best characteristics in terms of performance.
> >>>> > >
> >>>> > > Thanks, Igor. This link describes the query operations, but I could not find any mention of writes.
> >>>> > >
> >>>> > >> The Cassandra BATCH statement is actually quite often an anti-pattern for those who come from the relational world. The BATCH statement concept in Cassandra is totally different from the relational one and is not meant for optimizing batch/bulk operations. The main purpose of a Cassandra BATCH is to keep denormalized data in sync, for example when you duplicate the same data into several tables. All other cases are not recommended for Cassandra batches:
> >>>> > >> - https://medium.com/@foundev/cassandra-batch-loading-without-the-batch-keyword-40f00e35e23e#.k4xfir8ij
> >>>> > >> - http://christopher-batey.blogspot.com/2015/02/cassandra-anti-pattern-misuse-of.html
> >>>> > >> - https://inoio.de/blog/2016/01/13/cassandra-to-batch-or-not-to-batch/
> >>>> > >>
> >>>> > >> It's also good to mention that in the CassandraCacheStore implementation (actually in CassandraSessionImpl) every operation against Cassandra is wrapped in a loop: in case of failure, up to 20 attempts are made to retry the operation, with incrementally increasing timeouts starting from 100 ms and specific exception handling logic (Cassandra host unavailability, etc.). Thus it provides a quite reliable persistence mechanism. According to load tests, even on a heavily overloaded Cassandra cluster (CPU load > 10 per core) there were no lost writes/reads/deletes, and at most 6 attempts were needed to perform one operation.
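
The retry wrapper described above boils down to something like the following - just an illustration that mirrors the numbers from the mail (20 attempts, timeouts growing from 100 ms); the real CassandraSessionImpl also has per-exception handling for host unavailability and the like, so treat this as a sketch rather than the actual code:

    import java.util.concurrent.Callable;

    class RetrySketch {
        // Retry an operation a fixed number of times, sleeping a little longer after each failure.
        static <R> R executeWithRetry(Callable<R> op) throws Exception {
            int attempts = 20;
            long sleepMs = 100; // grows after every failed attempt

            for (int i = 0; i < attempts; i++) {
                try {
                    return op.call();
                }
                catch (Exception e) {
                    if (i == attempts - 1)
                        throw e; // out of attempts - propagate the last failure

                    Thread.sleep(sleepMs);
                    sleepMs += 100; // incrementally increasing timeout
                }
            }

            throw new AssertionError("unreachable");
        }
    }
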
> >>>> > > I think that the main point about Cassandra batch operations is not about reliability, but about performance. If a user batches up 100s of updates in one Cassandra batch, then it will be a lot faster than doing them 1-by-1 in Ignite. Wrapping them into an Ignite "putAll(...)" call just seems more logical to me, no?
> >>>> > >
> >>>> > >> Igor Rudyak
> >>>> > >>
> >>>> > >> On Tue, Jul 26, 2016 at 1:58 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> >>>> > >>
> >>>> > >> > Hi Igor,
> >>>> > >> >
> >>>> > >> > I noticed that the current Cassandra store implementation doesn't support batching for the writeAll and deleteAll methods; it simply executes all updates one by one (asynchronously in parallel).
> >>>> > >> >
> >>>> > >> > I think it can be useful to provide such support, so I created a ticket [1]. Can you please give your input on this? Does it make sense in your opinion?
> >>>> > >> >
> >>>> > >> > [1] https://issues.apache.org/jira/browse/IGNITE-3588
> >>>> > >> >
> >>>> > >> > -Val
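
One more thought to tie this back to IGNITE-3588: if we do add batching to writeAll/deleteAll, the variant Igor mentioned (splitting the batch into smaller per-partition-key unlogged batches, sent asynchronously) could look roughly like the sketch below. The Key class, the method name and the mapping from cache key to partition key are all invented for illustration, not a proposal for the module's actual code:

    import com.datastax.driver.core.BatchStatement;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    class PartitionKeyBatchSketch {
        // Hypothetical cache key: 'bucket' maps to the Cassandra partition key, 'seq' to a clustering column.
        static class Key {
            final long bucket;
            final long seq;
            Key(long bucket, long seq) { this.bucket = bucket; this.seq = seq; }
        }

        // Group mutations by partition key and send one small UNLOGGED batch per key, asynchronously,
        // so a token-aware policy can still route each batch straight to its replica set.
        static void writeAllGroupedByPartition(Session session, PreparedStatement insert, Map<Key, String> entries) {
            Map<Long, BatchStatement> batches = new HashMap<>();

            for (Map.Entry<Key, String> e : entries.entrySet())
                batches.computeIfAbsent(e.getKey().bucket, k -> new BatchStatement(BatchStatement.Type.UNLOGGED))
                    .add(insert.bind(e.getKey().bucket, e.getKey().seq, e.getValue()));

            List<ResultSetFuture> futures = new ArrayList<>();

            for (BatchStatement b : batches.values())
                futures.add(session.executeAsync(b));

            for (ResultSetFuture f : futures)
                f.getUninterruptibly();
        }
    }

Each batch touches a single partition, which is the only case where unlogged batches are said to help - though, as Igor notes, a failed batch would still have to be retried as a whole.
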