Re: [DISCUSS] java 9 and the future of cassandra on the jdk

2018-03-20 Thread Kant Kodali
Java 10 is releasing today!

On Tue, Mar 20, 2018 at 9:07 AM, Ariel Weisberg  wrote:

> Hi,
>
> +1 to what Jordan is saying.
>
> It seems like if we are cutting a release off of trunk we want to make
> sure we get N years of supported JDK out of it. For a single LTS release N
> could be at most 3 and historically that isn't long enough and it's very
> likely we will get < 3 after a release is cut.
>
> Going beyond 3 years could be tricky in the worst case because bringing in
> up to 3 years of JDK changes to an older release might mean some of our
> dependencies no longer function and now it's not just minor fixes it's
> bringing in who knows what in terms of updated dependencies.
>
> I think in some cases we are going to need to take a release we have
> already cut and make it work with an LTS release that didn't exist when the
> release was cut.
>
> We also need to update how CI works. We should at least build and run a
> quick smoke test with the JDKs we are claiming to support and
> asynchronously run all the tests on the rather large matrix that now exists.
>
> Ariel
>
> On Tue, Mar 20, 2018, at 11:07 AM, Jeremiah Jordan wrote:
> > My suggestion would be to keep trunk on the latest LTS by default, but
> > with compatibility with the latest release if possible.  Since Oracle
> > LTS releases are every 3 years, I would not want to tie us to that
> > release cycle?
> > So until Java 11 is out that would mean trunk should work under Java 8,
> > with the option of being compiled/run under Java 9 or 10.  Once Java 11
> > is out we could then switch to 11 only.
> >
> > -Jeremiah
> >
> > On Mar 20, 2018, at 10:48 AM, Jason Brown  wrote:
> >
> > >>> Wouldn't that potentially leave us in a situation where we're ready
> for
> > > a C* release but blocked waiting on a new LTS cut?
> > >
> > > Agreed, and perhaps if we're close enough to a LTS release (say three
> > > months or less), we could choose to delay (probably with community
> > > input/vote). If we're a year or two out, then, no, we should not wait.
> I
> > > think this is what I meant to communicate by "Perhaps we can evaluate
> this
> > > over time." (poorly stated, in hindsight)
> > >
> > >> On Tue, Mar 20, 2018 at 7:22 AM, Josh McKenzie 
> wrote:
> > >>
> > >> Need a little clarification on something:
> > >>
> > >>> 2) always release cassandra on a LTS version
> > >> combined with:
> > >>> 3) keep trunk on the latest jdk version, assuming we release a major
> > >>> cassandra version close enough to a LTS release.
> > >>
> > >> Wouldn't that potentially leave us in a situation where we're ready
> > >> for a C* release but blocked waiting on a new LTS cut? For example, if
> > >> JDK 9 were the currently supported LTS and trunk was on JDK 11, we'd
> > >> either have to get trunk to work with 9 or wait for 11 to resolve
> > >> that.
> > >>
> > >>> On Tue, Mar 20, 2018 at 9:32 AM, Jason Brown 
> wrote:
> > >>> Hi all,
> > >>>
> > >>>
> > >>> TL;DR Oracle has started revving the JDK version much faster, and we
> need
> > >>> an agreed upon plan.
> > >>>
> > >>> Well, we probably should have had this discussion already by now, but
> > >> here
> > >>> we are. Oracle announced plans to release an updated JDK version every
> six
> > >>> months, and each new version immediately supersedes the previous in all
> > >> ways:
> > >>> no updates/security fixes to previous versions is the main thing, and
> > >>> previous versions are EOL'd immediately. In addition, Oracle has
> planned
> > >>> parallel LTS versions that will live for three years, and then
> superseded
> > >>> by the next LTS; but not immediately EOL'd from what I can tell.
> Please
> > >> see
> > >>> [1, 2] for Oracle's official comments about this change ([3] was
> > >>> particularly useful, imo), [4] and many other postings on the
> internet
> > >> for
> > >>> discussion/commentary.
> > >>>
> > >>> We have a jira [5] where Robert Stupp did most of the work to get us
> onto
> > >>> Java 9 (thanks, Robert), but then the announcement of the JDK version
> > >>> changes happened last fall after Robert had done much of the work on
> the
> > >>> ticket.
> > >>>
> > >>> Here's an initial proposal of how to move forward. I suspect
> it's not
> > >>> complete, but it's a decent place to start a conversation.
> > >>>
> > >>> 1) recommend OracleJDK over OpenJDK. IIUC from [3], the OpenJDK will
> > >>> release every six months, and the OracleJDK will release every three
> > >> years.
> > >>> Thus, the OracleJDK is the LTS version, and it just comes from a
> snapshot
> > >>> of one of those OpenJDK builds.
> > >>>
> > >>> 2) always release cassandra on a LTS version. I don't think we can
> > >>> reasonably expect operators to update the JDK every six months, on
> time.
> > >>> Further, if there are breaking changes to the JDK, we don't want to
> have
> > >> to
> > >>> update established c* versions due to those changes, every six
> 

Re: Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali
I had to do brew upgrade jemalloc to fix this issue.

On Mon, Sep 11, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi All,
>
> I am trying to start Cassandra 3.11 on macOS Sierra 10.12.6. When I invoke
> the cassandra binary I get the following error
>
> java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
> 0x7fffedb6c000
>
> I have xcode version 8.3.3 installed (latest). Any clue ?
>
> Thanks!
>


Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali
Hi All,

I am trying to start Cassandra 3.11 on macOS Sierra 10.12.6. When I invoke
the cassandra binary I get the following error

java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
0x7fffedb6c000

I have xcode version 8.3.3 installed (latest). Any clue ?

Thanks!


Re: Does partition size limitation still exists in Cassandra 3.10 given there is a B-tree implementation?

2017-05-11 Thread Kant Kodali
Oh, this looks like the one I am looking for:
https://issues.apache.org/jira/browse/CASSANDRA-9754. Is this in Cassandra
3.10, or has it been merged somewhere?

On Thu, May 11, 2017 at 1:13 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi DuyHai,
>
> I am trying to see what are the possible things we can do to get over this
> limitation?
>
> 1. Would this https://issues.apache.org/jira/browse/CASSANDRA-7447 help
> at all?
> 2. Can we have Merkle trees built for groups of rows in partition ? such
> that we can stream only those groups where the hash is different?
> 3. It would be interesting to see if we can spread a partition across
> nodes.
>
> I am just trying to validate some ideas that can help potentially get over
> this 100MB limitation since we may not always fit into a time series model.
>
> Thanks!
>
> On Thu, May 11, 2017 at 12:37 AM, DuyHai Doan <doanduy...@gmail.com>
> wrote:
>
>> Yes the recommendation still applies
>>
>> Wide partitions have huge impact on repair (over streaming), compaction
>> and bootstrap
>>
>> On May 10, 2017 at 23:54, "Kant Kodali" <k...@peernova.com> wrote:
>>
>> Hi All,
>>
>> Cassandra community had always been recommending 100MB per partition as a
>> sweet spot however does this limitation still exist given there is a
>> B-tree
>> implementation to identify rows inside a partition?
>>
>> https://github.com/apache/cassandra/blob/trunk/src/java/org/
>> apache/cassandra/db/rows/BTreeRow.java
>>
>> Thanks!
>>
>>
>>
>
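
A common workaround for the ~100MB guidance, short of changes like CASSANDRA-9754, is to bucket wide partitions by hand: add a synthetic bucket column to the partition key so each physical partition stays small. Below is a minimal sketch using the DataStax Java driver; the keyspace/table names (ks.events), the bucket count, and the way the bucket is derived are all assumptions for illustration, not a prescribed design.

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class BucketedWrites {
    // Hypothetical fixed bucket count; tune it so each bucket stays well under ~100MB.
    static final int BUCKETS = 16;

    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // The bucket is part of the partition key, so one logical key is split
            // across BUCKETS physical partitions.
            session.execute("CREATE TABLE IF NOT EXISTS ks.events ("
                    + " key text, bucket int, seq bigint, payload blob,"
                    + " PRIMARY KEY ((key, bucket), seq))");

            String key = "row-1";
            long seq = 42L;
            int bucket = (int) (seq % BUCKETS); // derive the bucket deterministically from the sequence
            session.execute("INSERT INTO ks.events (key, bucket, seq, payload) VALUES (?, ?, ?, ?)",
                    key, bucket, seq, java.nio.ByteBuffer.allocate(0));
        }
    }
}

Reading a logical key back then means querying each bucket (or only the buckets a deterministic rule says can contain the data), which is the trade-off this approach makes against unbounded partitions.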


Re: Does partition size limitation still exists in Cassandra 3.10 given there is a B-tree implementation?

2017-05-11 Thread Kant Kodali
Hi DuyHai,

I am trying to see what we can possibly do to get past this
limitation.

1. Would this https://issues.apache.org/jira/browse/CASSANDRA-7447 help at
all?
2. Can we have Merkle trees built for groups of rows in a partition, such
that we stream only those groups where the hash is different?
3. It would be interesting to see if we can spread a partition across nodes.

I am just trying to validate some ideas that can help potentially get over
this 100MB limitation since we may not always fit into a time series model.

Thanks!

On Thu, May 11, 2017 at 12:37 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> Yes the recommendation still applies
>
> Wide partitions have huge impact on repair (over streaming), compaction
> and bootstrap
>
> On May 10, 2017 at 23:54, "Kant Kodali" <k...@peernova.com> wrote:
>
> Hi All,
>
> Cassandra community had always been recommending 100MB per partition as a
> sweet spot however does this limitation still exist given there is a B-tree
> implementation to identify rows inside a partition?
>
> https://github.com/apache/cassandra/blob/trunk/src/java/org/
> apache/cassandra/db/rows/BTreeRow.java
>
> Thanks!
>
>
>


Does partition size limitation still exists in Cassandra 3.10 given there is a B-tree implementation?

2017-05-10 Thread Kant Kodali
Hi All,

Cassandra community had always been recommending 100MB per partition as a
sweet spot however does this limitation still exist given there is a B-tree
implementation to identify rows inside a partition?

https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/rows/BTreeRow.java

Thanks!


Re: State of triggers

2017-03-02 Thread Kant Kodali
+1

On Thu, Mar 2, 2017 at 11:04 AM, S G  wrote:

> Hi,
>
> I am not able to find any documentation on the current state of triggers
> being production ready.
>
> The post at
> http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-
> 0-prototype-triggers-support
> says that "The current implementation is experimental, and there is some
> work to do before triggers in Cassandra can be declared final and
> production-ready."
>
> So by which version of Cassandra should we expect triggers to be stable
> enough?
> Our requirement is to develop a solution for several Cassandra users all
> running on different versions (they won't upgrade easily) and no one is
> using 3.5+ versions.
> So the smallest Cassandra version which has production ready triggers would
> be really good to know.
>
> Also any advice on common gotchas with Cassandra triggers would be great to
> know.
>
> Thanks
> SG
>


Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali
Agreed that async performs better than sync in general but the catch here
to me is the "order".

The whole point of async is out-of-order processing. By that I mean: say
request 1 comes in at time t1 and request 2 comes in at time t2, where t1 < t2,
and request 1 takes longer to process than request 2; then request 2 should get
a response first and request 1 afterwards. That is where I would imagine all
the benefits of async come in, but the moment you introduce ordering, by saying
that for Last Write Wins all the async requests should be processed in order,
I would imagine all the benefits of async are lost.

Let's see if anyone can comment about how it works inside C*.

Thanks!



On Mon, Feb 20, 2017 at 10:54 PM, Dor Laor <d...@scylladb.com> wrote:

> Could be. Let's stay tuned to see if someone else pick it up.
> Anyway, if it's synchronous, you'll have a large penalty for latency.
>
> On Mon, Feb 20, 2017 at 10:11 PM, Kant Kodali <k...@peernova.com> wrote:
>
>> Thanks again for the response! If they mean it between the client and the server,
>> I am not sure why they would use the word "replication" in the statement
>> below, since there is no replication between the client and the server (coordinator).
>>
>> "Choose between synchronous or asynchronous replication for each update."
>>>
>>
>> Sent from my iPhone
>>
>> On Feb 20, 2017, at 5:30 PM, Dor Laor <d...@scylladb.com> wrote:
>>
>> I think they mean the client to server and not among the servers
>>
>> On Mon, Feb 20, 2017 at 5:28 PM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> Also here is a statement from C* website
>>>
>>> "Choose between synchronous or asynchronous replication for each update.
>>> "
>>>
>>> http://cassandra.apache.org/
>>>
>>> Looks like we can choose then either sync or async then?
>>>
>>> On Mon, Feb 20, 2017 at 5:25 PM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Hi Dor,
>>>>
>>>> Great response! My comments are inline.
>>>>
>>>> Thanks a lot,
>>>> kant
>>>>
>>>>
>>>> On Mon, Feb 20, 2017 at 4:41 PM, Dor Laor <d...@scylladb.com> wrote:
>>>>
>>>>> I sent this answer but it bounced off the user@apache.
>>>>> Here is the email anyway:
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Dor Laor <d...@scylladb.com>
>>>>> Date: Mon, Feb 20, 2017 at 4:37 PM
>>>>> Subject: Re: Does C* coordinator writes to replicas in same order or
>>>>> different order?
>>>>> To: dev@cassandra.apache.org
>>>>> Cc: u...@cassandra.apache.org
>>>>>
>>>>>
>>>>> + The C* coordinator send async write requests to the replicas.
>>>>>This is very important since it allows it to return a low latency
>>>>>reply to the client once the CL is reached. You wouldn't want
>>>>>to serialize the replicas one after the other.
>>>>>
>>>>
>>>> *so the coordinator won't wait until the CL is reached before it
>>>> processes another request? *
>>>>
>>>>>
>>>>>  + The client <-> server sync/async isn't related to the coordinator
>>>>> in this case.
>>>>>
>>>>>  + In the case of concurrent writes (always the case...), the time
>>>>> stamp
>>>>> sets the order. Note that it's possible to work with client
>>>>> timestamps or
>>>>> server timestamps. The client ones are usually the best choice.
>>>>>
>>>>
>>>>  *In theory, when we say concurrent writes, they should have the same
>>>> timestamp, right?  What I am really looking for is: if I send a write
>>>> request concurrently for record 1 and record 2 are they guaranteed to be
>>>> inserted in the same order across replicas? (Whatever order coordinator may
>>>> choose is fine but I want the same order across all replicas and with async
>>>> replication I am not sure how that is possible ? for example,  if a request
>>>> arrives with timestamp t1 and another request arrives with a timestamp t2
>>>> where t1 < t2...with async replication what if one replica chooses to
>>>> execute t2 first and then t1 simply because t1 is slow while another
>>>> replica choose to execute t1 first and then t2..how would that work?  )*
>>>>
>>>>>
>>>>> Note that C* each node can be a coordinator (one per request) and its
>>>>> the desired case in order to load balance the incoming requests. Once
>>>>> again,
>>>>> timestamps determine the order among the requests.
>>>>>
>>>>> Cheers,
>>>>> Dor
>>>>>
>>>>> On Mon, Feb 20, 2017 at 4:12 PM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> when the C* coordinator writes to replicas, does it write in the same
>>>>>> order or a different order? In other words, does the replication happen
>>>>>> synchronously or asynchronously? Also, does this depend on a sync or
>>>>>> async client? What happens in the case of concurrent writes to a
>>>>>> coordinator?
>>>>>>
>>>>>> Thanks,
>>>>>> kant
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
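
Since the replicas apply mutations asynchronously and last-write-wins is decided purely by the write timestamp, one way to make the outcome deterministic for the t1/t2 scenario above is to assign timestamps on the client. A small sketch with the DataStax Java driver (3.x); the ks.t table is hypothetical, and the two options shown (a driver-level timestamp generator vs. stamping a single statement) are alternatives, not both required:

import com.datastax.driver.core.AtomicMonotonicTimestampGenerator;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class ClientTimestamps {
    public static void main(String[] args) {
        // Option 1: let the driver stamp every statement with a monotonically increasing
        // client-side timestamp, so conflict resolution is the same on every replica.
        try (Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withTimestampGenerator(new AtomicMonotonicTimestampGenerator())
                .build();
             Session session = cluster.connect()) {

            // Option 2: stamp one statement explicitly (microseconds since the epoch).
            SimpleStatement stmt = new SimpleStatement(
                    "INSERT INTO ks.t (k, v) VALUES (?, ?)", "row-1", "value-2");
            stmt.setDefaultTimestamp(System.currentTimeMillis() * 1000);
            session.execute(stmt);
        }
    }
}

Whatever order the replicas happen to apply the mutations in, reads then converge on the value with the highest timestamp, which is the property the thread is asking about.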


Does C* coordinator writes to replicas in same order or different order?

2017-02-20 Thread Kant Kodali
Hi,

when the C* coordinator writes to replicas, does it write in the same order or
a different order? In other words, does the replication happen synchronously or
asynchronously? Also, does this depend on a sync or async client? What happens
in the case of concurrent writes to a coordinator?

Thanks,
kant


Are Cassandra Triggers Thread Safe? ("Tough questions perhaps!")

2017-02-20 Thread Kant Kodali
Hi,

1. Are Cassandra Triggers thread safe? What happens if two writes invoke
the trigger and the trigger tries to modify the same row in a partition?
2. Has anyone used them successfully in production? If so, any issues? (I am
using the latest version of C* 3.10)
3. I have partitions that are about 10K rows, and each row in a partition
needs to have a pointer to the previous row (the pointer in this case is
a hash). Using triggers here would greatly simplify our application logic.

It will be a huge help if I can get answers to this.

Thanks,
kant


Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
@Benjamin Roth: How do you decide whether something is a different PRIMARY KEY now?
It looks like you are saying

the below is the same partition key and the same primary key?

PRIMARY KEY ((a, b), c, d) and
PRIMARY KEY ((a, b), d, c)

@Russell Great to see you here! As always that is spot on!

On Fri, Feb 10, 2017 at 11:13 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> Thanks a lot for that post. If I read the code right, then there is one
> case missing in your post.
> According to StorageProxy.mutateMV, local updates are NOT put into a batch
> and are instantly applied locally. So a batch is only created if remote
> mutations have to be applied and only for those mutations.
>
> 2017-02-10 19:58 GMT+01:00 DuyHai Doan <doanduy...@gmail.com>:
>
> > See my blog post to understand how MV is implemented:
> > http://www.doanduyhai.com/blog/?p=1930
> >
> > On Fri, Feb 10, 2017 at 7:48 PM, Benjamin Roth <benjamin.r...@jaumo.com>
> > wrote:
> >
> > > Same partition key:
> > >
> > > PRIMARY KEY ((a, b), c, d) and
> > > PRIMARY KEY ((a, b), d, c)
> > >
> > > PRIMARY KEY ((a), b, c) and
> > > PRIMARY KEY ((a), c, b)
> > >
> > > Different partition key:
> > >
> > > PRIMARY KEY ((a, b), c, d) and
> > > PRIMARY KEY ((a), b, d, c)
> > >
> > > PRIMARY KEY ((a), b) and
> > > PRIMARY KEY ((b), a)
> > >
> > >
> > > 2017-02-10 19:46 GMT+01:00 Kant Kodali <k...@peernova.com>:
> > >
> > > > Okies now I understand what you mean by "same" partition key.  I
> think
> > > you
> > > > are saying
> > > >
> > > > PRIMARY KEY(col1, col2, col3) == PRIMARY KEY(col2, col1, col3) // so
> > far
> > > I
> > > > assumed they are different partition keys.
> > > >
> > > > On Fri, Feb 10, 2017 at 10:36 AM, Benjamin Roth <
> > benjamin.r...@jaumo.com
> > > >
> > > > wrote:
> > > >
> > > > > There are use cases where the partition key is the same. For
> example
> > if
> > > > you
> > > > > need a sorting within a partition or a filtering different from the
> > > > > original clustering keys.
> > > > > We actually use this for some MVs.
> > > > >
> > > > > If you want "dumb" denormalization with simple append only cases
> (or
> > > more
> > > > > general cases that don't require a read before write on update) you
> > are
> > > > > maybe better off with batched denormalized atomic writes.
> > > > >
> > > > > The main benefit of MVs is if you need denormalization to sort or
> > > filter
> > > > by
> > > > > a non-primary key field.
> > > > >
> > > > > 2017-02-10 19:31 GMT+01:00 Kant Kodali <k...@peernova.com>:
> > > > >
> > > > > > yes thanks for the clarification.  But why would I ever have MV
> > with
> > > > the
> > > > > > same partition key? if it is the same partition key I could just
> > read
> > > > > from
> > > > > > the base table right? our MV Partition key contains the columns
> > from
> > > > the
> > > > > > base table partition key but in a different order plus an
> > additional
> > > > > column
> > > > > > (which is allowed as of today)
> > > > > >
> > > > > > On Fri, Feb 10, 2017 at 10:23 AM, Benjamin Roth <
> > > > benjamin.r...@jaumo.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > It depends on your model.
> > > > > > > If the base table + MV have the same partition key, then the MV
> > > > > mutations
> > > > > > > are applied synchronously, so they are written as soon the
> write
> > > > > request
> > > > > > > returns.
> > > > > > > => In this case you can rely on R+W > RF
> > > > > > >
> > > > > > > If the partition key of the MV is different, the partition of
> the
> > > MV
> > > > is
> > > > > > > probably placed on a different host (or said differently it
> > cannot
> > > be
> > > > > > > guaranteed that it is on the same host). In this case, the MV
> > > updates
> > > > > a

Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
In that case I can't even say same partition key == same row key

The below would be different partition keys according to you right?

PRIMARY KEY ((a, b), c, d) and
PRIMARY KEY ((a, b), d, c, e)

On Fri, Feb 10, 2017 at 10:48 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> Same partition key:
>
> PRIMARY KEY ((a, b), c, d) and
> PRIMARY KEY ((a, b), d, c)
>
> PRIMARY KEY ((a), b, c) and
> PRIMARY KEY ((a), c, b)
>
> Different partition key:
>
> PRIMARY KEY ((a, b), c, d) and
> PRIMARY KEY ((a), b, d, c)
>
> PRIMARY KEY ((a), b) and
> PRIMARY KEY ((b), a)
>
>
> 2017-02-10 19:46 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
> > Okies now I understand what you mean by "same" partition key.  I think
> you
> > are saying
> >
> > PRIMARY KEY(col1, col2, col3) == PRIMARY KEY(col2, col1, col3) // so far
> I
> > assumed they are different partition keys.
> >
> > On Fri, Feb 10, 2017 at 10:36 AM, Benjamin Roth <benjamin.r...@jaumo.com
> >
> > wrote:
> >
> > > There are use cases where the partition key is the same. For example if
> > you
> > > need a sorting within a partition or a filtering different from the
> > > original clustering keys.
> > > We actually use this for some MVs.
> > >
> > > If you want "dumb" denormalization with simple append only cases (or
> more
> > > general cases that don't require a read before write on update) you are
> > > maybe better off with batched denormalized atomic writes.
> > >
> > > The main benefit of MVs is if you need denormalization to sort or
> filter
> > by
> > > a non-primary key field.
> > >
> > > 2017-02-10 19:31 GMT+01:00 Kant Kodali <k...@peernova.com>:
> > >
> > > > yes thanks for the clarification.  But why would I ever have MV with
> > the
> > > > same partition key? if it is the same partition key I could just read
> > > from
> > > > the base table right? our MV Partition key contains the columns from
> > the
> > > > base table partition key but in a different order plus an additional
> > > column
> > > > (which is allowed as of today)
> > > >
> > > > On Fri, Feb 10, 2017 at 10:23 AM, Benjamin Roth <
> > benjamin.r...@jaumo.com
> > > >
> > > > wrote:
> > > >
> > > > > It depends on your model.
> > > > > If the base table + MV have the same partition key, then the MV
> > > mutations
> > > > > are applied synchronously, so they are written as soon the write
> > > request
> > > > > returns.
> > > > > => In this case you can rely on R+W > RF
> > > > >
> > > > > If the partition key of the MV is different, the partition of the
> MV
> > is
> > > > > probably placed on a different host (or said differently it cannot
> be
> > > > > guaranteed that it is on the same host). In this case, the MV
> updates
> > > are
> > > > > executed async in a logged batch. So it can be guaranteed they will
> > be
> > > > > applied eventually but not at the time the write request returns.
> > > > > => You cannot rely and there is no possibility to absolutely
> > guarantee
> > > > > anything, not matter what CL you choose. A MV update may always
> > "arrive
> > > > > late". I guess it has been implemented like this to not block in
> case
> > > of
> > > > > remote request to prefer the cluster sanity over consistency.
> > > > >
> > > > > Is it now 100% clear?
> > > > >
> > > > > 2017-02-10 19:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
> > > > >
> > > > > > So R+W > RF doesnt apply for reads on MV right because say I set
> > > QUORUM
> > > > > > level consistency for both reads and writes then there can be a
> > > > scenario
> > > > > > where a write is successful to the base table and then say
> > > immediately
> > > > I
> > > > > do
> > > > > > a read through MV but prior to MV getting the update from the
> base
> > > > table.
> > > > > > so there isn't any way to make sure to read after MV had been
> > > > > successfully
> > > > > > updated. is that correct?
> >

Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
Okies now I understand what you mean by "same" partition key.  I think you
are saying

PRIMARY KEY(col1, col2, col3) == PRIMARY KEY(col2, col1, col3) // so far I
assumed they are different partition keys.

On Fri, Feb 10, 2017 at 10:36 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> There are use cases where the partition key is the same. For example if you
> need a sorting within a partition or a filtering different from the
> original clustering keys.
> We actually use this for some MVs.
>
> If you want "dumb" denormalization with simple append only cases (or more
> general cases that don't require a read before write on update) you are
> maybe better off with batched denormalized atomic writes.
>
> The main benefit of MVs is if you need denormalization to sort or filter by
> a non-primary key field.
>
> 2017-02-10 19:31 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
> > yes thanks for the clarification.  But why would I ever have MV with the
> > same partition key? if it is the same partition key I could just read
> from
> > the base table right? our MV Partition key contains the columns from the
> > base table partition key but in a different order plus an additional
> column
> > (which is allowed as of today)
> >
> > On Fri, Feb 10, 2017 at 10:23 AM, Benjamin Roth <benjamin.r...@jaumo.com
> >
> > wrote:
> >
> > > It depends on your model.
> > > If the base table + MV have the same partition key, then the MV
> mutations
> > > are applied synchronously, so they are written as soon the write
> request
> > > returns.
> > > => In this case you can rely on R+W > RF
> > >
> > > If the partition key of the MV is different, the partition of the MV is
> > > probably placed on a different host (or said differently it cannot be
> > > guaranteed that it is on the same host). In this case, the MV updates
> are
> > > executed async in a logged batch. So it can be guaranteed they will be
> > > applied eventually but not at the time the write request returns.
> > > => You cannot rely and there is no possibility to absolutely guarantee
> > > anything, not matter what CL you choose. A MV update may always "arrive
> > > late". I guess it has been implemented like this to not block in case
> of
> > > remote request to prefer the cluster sanity over consistency.
> > >
> > > Is it now 100% clear?
> > >
> > > 2017-02-10 19:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
> > >
> > > > So R+W > RF doesnt apply for reads on MV right because say I set
> QUORUM
> > > > level consistency for both reads and writes then there can be a
> > scenario
> > > > where a write is successful to the base table and then say
> immediately
> > I
> > > do
> > > > a read through MV but prior to MV getting the update from the base
> > table.
> > > > so there isn't any way to make sure to read after MV had been
> > > successfully
> > > > updated. is that correct?
> > > >
> > > > On Fri, Feb 10, 2017 at 6:30 AM, Benjamin Roth <
> > benjamin.r...@jaumo.com>
> > > > wrote:
> > > >
> > > > > Hi Kant
> > > > >
> > > > > Is it clear now?
> > > > > Sorry for the confusion!
> > > > >
> > > > > Have a nice one
> > > > >
> > > > > On 10.02.2017 at 09:17, "Kant Kodali" <k...@peernova.com> wrote:
> > > > >
> > > > > thanks!
> > > > >
> > > > > On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth <
> > benjamin.r...@jaumo.com
> > > >
> > > > > wrote:
> > > > >
> > > > > > Yes it is
> > > > > >
> > > > > > On 10.02.2017 at 00:46, "Kant Kodali" <k...@peernova.com> wrote:
> > > > > >
> > > > > > > If reading from materialized view with a consistency level of
> > > quorum
> > > > am
> > > > > I
> > > > > > > guaranteed to have the most recent view? other words is w + r
> > n
> > > > > > contract
> > > > > > > maintained for MV's as well for both reads and writes?
> > > > > > >
> > > > > > > Thanks!
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > Benjamin Roth
> > > Prokurist
> > >
> > > Jaumo GmbH · www.jaumo.com
> > > Wehrstraße 46 · 73035 Göppingen · Germany
> > > Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> > > AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
> > >
> >
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>
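
To make the same/different partition key distinction above concrete, here is a hedged sketch of one base table and two views, executed through the Java driver (the names ks.base, base_by_d and base_by_a are made up). The first view only reorders the clustering columns, so its rows are co-located with the base rows and, per the explanation above, the view mutation is applied synchronously; the second view repartitions by a different key, so its update is shipped asynchronously in a logged batch:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class MvPartitionKeys {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            session.execute("CREATE TABLE IF NOT EXISTS ks.base ("
                    + " a text, b text, c text, d text, v text,"
                    + " PRIMARY KEY ((a, b), c, d))");

            // Same partition key (a, b): only the clustering order changes,
            // so the view partition lives on the same replicas as the base partition.
            session.execute("CREATE MATERIALIZED VIEW IF NOT EXISTS ks.base_by_d AS"
                    + " SELECT * FROM ks.base"
                    + " WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL AND d IS NOT NULL"
                    + " PRIMARY KEY ((a, b), d, c)");

            // Different partition key (a alone): the view partition may be owned by other
            // replicas, so its update can arrive after the base write has already returned.
            session.execute("CREATE MATERIALIZED VIEW IF NOT EXISTS ks.base_by_a AS"
                    + " SELECT * FROM ks.base"
                    + " WHERE a IS NOT NULL AND b IS NOT NULL AND c IS NOT NULL AND d IS NOT NULL"
                    + " PRIMARY KEY ((a), b, d, c)");
        }
    }
}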


Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
yes thanks for the clarification.  But why would I ever have MV with the
same partition key? if it is the same partition key I could just read from
the base table right? our MV Partition key contains the columns from the
base table partition key but in a different order plus an additional column
(which is allowed as of today)

On Fri, Feb 10, 2017 at 10:23 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> It depends on your model.
> If the base table + MV have the same partition key, then the MV mutations
> are applied synchronously, so they are written as soon the write request
> returns.
> => In this case you can rely on R+W > RF
>
> If the partition key of the MV is different, the partition of the MV is
> probably placed on a different host (or said differently it cannot be
> guaranteed that it is on the same host). In this case, the MV updates are
> executed async in a logged batch. So it can be guaranteed they will be
> applied eventually but not at the time the write request returns.
> => You cannot rely and there is no possibility to absolutely guarantee
> anything, not matter what CL you choose. A MV update may always "arrive
> late". I guess it has been implemented like this to not block in case of
> remote request to prefer the cluster sanity over consistency.
>
> Is it now 100% clear?
>
> 2017-02-10 19:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
> > So R+W > RF doesnt apply for reads on MV right because say I set QUORUM
> > level consistency for both reads and writes then there can be a scenario
> > where a write is successful to the base table and then say immediately I
> do
> > a read through MV but prior to MV getting the update from the base table.
> > so there isn't any way to make sure to read after MV had been
> successfully
> > updated. is that correct?
> >
> > On Fri, Feb 10, 2017 at 6:30 AM, Benjamin Roth <benjamin.r...@jaumo.com>
> > wrote:
> >
> > > Hi Kant
> > >
> > > Is it clear now?
> > > Sorry for the confusion!
> > >
> > > Have a nice one
> > >
> > > On 10.02.2017 at 09:17, "Kant Kodali" <k...@peernova.com> wrote:
> > >
> > > thanks!
> > >
> > > On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth <benjamin.r...@jaumo.com
> >
> > > wrote:
> > >
> > > > Yes it is
> > > >
> > > > On 10.02.2017 at 00:46, "Kant Kodali" <k...@peernova.com> wrote:
> > > >
> > > > > If reading from materialized view with a consistency level of
> quorum
> > am
> > > I
> > > > > guaranteed to have the most recent view? other words is w + r > n
> > > > contract
> > > > > maintained for MV's as well for both reads and writes?
> > > > >
> > > > > Thanks!
> > > > >
> > > >
> > >
> >
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>


Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
So R+W > RF doesn't apply for reads on an MV, right? Say I set QUORUM
consistency for both reads and writes; there can still be a scenario where a
write succeeds on the base table and I immediately do a read through the MV,
but before the MV has received the update from the base table. So there isn't
any way to make sure a read happens only after the MV has been successfully
updated. Is that correct?

On Fri, Feb 10, 2017 at 6:30 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> Hi Kant
>
> Is it clear now?
> Sorry for the confusion!
>
> Have a nice one
>
> On 10.02.2017 at 09:17, "Kant Kodali" <k...@peernova.com> wrote:
>
> thanks!
>
> On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth <benjamin.r...@jaumo.com>
> wrote:
>
> > Yes it is
> >
> > On 10.02.2017 at 00:46, "Kant Kodali" <k...@peernova.com> wrote:
> >
> > > If reading from materialized view with a consistency level of quorum am
> I
> > > guaranteed to have the most recent view? other words is w + r > n
> > contract
> > > maintained for MV's as well for both reads and writes?
> > >
> > > Thanks!
> > >
> >
>


Re: If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-10 Thread Kant Kodali
thanks!

On Thu, Feb 9, 2017 at 8:51 PM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> Yes it is
>
> On 10.02.2017 at 00:46, "Kant Kodali" <k...@peernova.com> wrote:
>
> > If reading from materialized view with a consistency level of quorum am I
> > guaranteed to have the most recent view? other words is w + r > n
> contract
> > maintained for MV's as well for both reads and writes?
> >
> > Thanks!
> >
>


If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-09 Thread Kant Kodali
If reading from a materialized view with a consistency level of QUORUM, am I
guaranteed to have the most recent view? In other words, is the w + r > n
contract maintained for MVs as well, for both reads and writes?

Thanks!


Re: AW: Why does CockroachDB github website say Cassandra has noAvailability on datacenter failure?

2017-02-07 Thread Kant Kodali
https://github.com/cockroachdb/cockroach/commit/f46a547827d3439b57baa5c3a11f8f9ad2d8b153

On Tue, Feb 7, 2017 at 3:20 PM, Kant Kodali <k...@peernova.com> wrote:

> LOL They took down that image finally!! But I would still keep an eye on
> what kind of fake benchmarks they might come up with.
>
> On Tue, Feb 7, 2017 at 7:11 AM, Amit Trivedi <tria...@gmail.com> wrote:
>
>> It indeed is a marketing gimmick. By clubbing Cassandra with likes of
>> HBase that favors consistency over availability, points under cons section
>> are all true, just not true when applied to any one database from that
>> group.
>>
>> Thanks and Regards
>>
>> Amit Trivedi
>>
>> On Feb 7, 2017, 7:32 AM -0500, j.kes...@enercast.de, wrote:
>>
>> Deeper inside there is a diagram:
>>
>>
>>
>> https://raw.githubusercontent.com/cockroachdb/cockroach/mast
>> er/docs/media/sql-nosql-newsql.png
>>
>>
>>
>> They compare to NoSQL along with Riak, HBase and Cassandra.
>>
>>
>>
>> Of course you CAN have a Cassandra cluster which is neither fully available
>> on the loss of a DC nor consistent.
>>
>>
>>
>> Marketing 
>>
>>
>>
>> Sent from my Windows 10 Phone
>>
>>
>>
>> *From:* DuyHai Doan <doanduy...@gmail.com>
>> *Sent:* Tuesday, February 7, 2017 11:53
>> *To:* dev@cassandra.apache.org
>> *Cc:* u...@cassandra.apache.org
>> *Subject:* Re: Why does CockroachDB github website say Cassandra has
>> noAvailability on datacenter failure?
>>
>>
>>
>> The link you posted doesn't say anything about Cassandra
>>
>> On February 7, 2017 at 11:41, "Kant Kodali" <k...@peernova.com> wrote:
>>
>> Why does CockroachDB github website say Cassandra has no Availability on
>> datacenter failure?
>>
>> https://github.com/cockroachdb/cockroach
>>
>>
>>
>>
>


Re: AW: Why does CockroachDB github website say Cassandra has noAvailability on datacenter failure?

2017-02-07 Thread Kant Kodali
LOL They took down that image finally!! But I would still keep an eye on
what kind of fake benchmarks they might come up with.

On Tue, Feb 7, 2017 at 7:11 AM, Amit Trivedi <tria...@gmail.com> wrote:

> It indeed is a marketing gimmick. By clubbing Cassandra with likes of
> HBase that favors consistency over availability, points under cons section
> are all true, just not true when applied to any one database from that
> group.
>
> Thanks and Regards
>
> Amit Trivedi
>
> On Feb 7, 2017, 7:32 AM -0500, j.kes...@enercast.de, wrote:
>
> Deeper inside there is a diagram:
>
>
>
> https://raw.githubusercontent.com/cockroachdb/cockroach/
> master/docs/media/sql-nosql-newsql.png
>
>
>
> They compare to NoSQL along with Riak, HBase and Cassandra.
>
>
>
> Of course you CAN have a Cassandra cluster which is neither fully available
> on the loss of a DC nor consistent.
>
>
>
> Marketing 
>
>
>
> Sent from my Windows 10 Phone
>
>
>
> *From:* DuyHai Doan <doanduy...@gmail.com>
> *Sent:* Tuesday, February 7, 2017 11:53
> *To:* dev@cassandra.apache.org
> *Cc:* u...@cassandra.apache.org
> *Subject:* Re: Why does CockroachDB github website say Cassandra has
> noAvailability on datacenter failure?
>
>
>
> The link you posted doesn't say anything about Cassandra
>
> On February 7, 2017 at 11:41, "Kant Kodali" <k...@peernova.com> wrote:
>
> Why does CockroachDB github website say Cassandra has no Availability on
> datacenter failure?
>
> https://github.com/cockroachdb/cockroach
>
>
>
>


Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali
yes agreed with this response

On Tue, Feb 7, 2017 at 5:07 AM, James Carman <ja...@carmanconsulting.com>
wrote:

> I think folks might agree that it's not worth the time to worry about what
> they say.  The ASF isn't a commercial entity, so we don't worry about
> market share or anything.  Sure, it's not cool for folks to say misleading
> or downright false statements about Cassandra, but we can't police the
> internet.  We would be better served focusing on what we can control, which
> is Cassandra, making it the best NoSQL database it can be.  Perhaps you
> should write a blog post showing Cassandra survive a failure and we can
> link to it from the Cassandra site.
>
> Now, this doesn't apply to trademarks, as the PMC is responsible for
> "defending" its marks.
>
>
>
> On Tue, Feb 7, 2017 at 7:59 AM Kant Kodali <k...@peernova.com> wrote:
>
> > @James I don't see how people can agree to it if they know Cassandra or
> > even better Distributed systems reasonably well
> >
> > On Tue, Feb 7, 2017 at 4:54 AM, Bernardo Sanchez <
> > bernard...@pointclickcare.com> wrote:
> >
> > > same. yra
> > >
> > > Sent from my BlackBerry - the most secure mobile device - via the Bell
> > > Network
> > > From: benjamin.r...@jaumo.com
> > > Sent: February 7, 2017 7:51 AM
> > > To: dev@cassandra.apache.org
> > > Reply-to: dev@cassandra.apache.org
> > > Subject: Re: Why does CockroachDB github website say Cassandra has no
> > > Availability on datacenter failure?
> > >
> > >
> > > Btw this isn't the Bronx either. It's not incorrect to be polite.
> > >
> > > Am 07.02.2017 13:45 schrieb "Bernardo Sanchez" <
> > > bernard...@pointclickcare.com>:
> > >
> > > > guys this isn't twitter. stop your stupid posts
> > > >
> > > > From: benjamin.le...@datastax.com
> > > > Sent: February 7, 2017 7:43 AM
> > > > To: dev@cassandra.apache.org
> > > > Reply-to: dev@cassandra.apache.org
> > > > Subject: Re: Why does CockroachDB github website say Cassandra has no
> > > > Availability on datacenter failure?
> > > >
> > > >
> > > > Do not get angry about that. It is not worth it. :-)
> > > >
> > > > On Tue, Feb 7, 2017 at 1:11 PM, Kant Kodali <k...@peernova.com>
> wrote:
> > > >
> > > > > lol. But seriously are they even allowed to say something that is
> not
> > > > true
> > > > > about another product ?
> > > > >
> > > > > On Tue, Feb 7, 2017 at 4:05 AM, kurt greaves <k...@instaclustr.com
> >
> > > > wrote:
> > > > >
> > > > > > Marketing never lies. Ever
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali
@James I don't see how people can agree to it if they know Cassandra or
even better Distributed systems reasonably well

On Tue, Feb 7, 2017 at 4:54 AM, Bernardo Sanchez <
bernard...@pointclickcare.com> wrote:

> same. yra
>
> Sent from my BlackBerry - the most secure mobile device - via the Bell
> Network
> From: benjamin.r...@jaumo.com
> Sent: February 7, 2017 7:51 AM
> To: dev@cassandra.apache.org
> Reply-to: dev@cassandra.apache.org
> Subject: Re: Why does CockroachDB github website say Cassandra has no
> Availability on datacenter failure?
>
>
> Btw this isn't the Bronx either. It's not incorrect to be polite.
>
> Am 07.02.2017 13:45 schrieb "Bernardo Sanchez" <
> bernard...@pointclickcare.com>:
>
> > guys this isn't twitter. stop your stupid posts
> >
> > From: benjamin.le...@datastax.com
> > Sent: February 7, 2017 7:43 AM
> > To: dev@cassandra.apache.org
> > Reply-to: dev@cassandra.apache.org
> > Subject: Re: Why does CockroachDB github website say Cassandra has no
> > Availability on datacenter failure?
> >
> >
> > Do not get angry about that. It is not worth it. :-)
> >
> > On Tue, Feb 7, 2017 at 1:11 PM, Kant Kodali <k...@peernova.com> wrote:
> >
> > > lol. But seriously are they even allowed to say something that is not
> > true
> > > about another product ?
> > >
> > > On Tue, Feb 7, 2017 at 4:05 AM, kurt greaves <k...@instaclustr.com>
> > wrote:
> > >
> > > > Marketing never lies. Ever
> > > >
> > >
> >
>


Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali
lol. But seriously, are they even allowed to say something that is not true
about another product?

On Tue, Feb 7, 2017 at 4:05 AM, kurt greaves  wrote:

> Marketing never lies. Ever
>


Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali
On Tue, Feb 7, 2017 at 3:52 AM, Kant Kodali <k...@peernova.com> wrote:

>
>
> This is the picture taken from https://github.com/cockroachdb/cockroach
>
> On Tue, Feb 7, 2017 at 3:01 AM, Benjamin Lerer <
> benjamin.le...@datastax.com> wrote:
>
>> ... and by the way, it is far from being the only wrong stuff in their
>> list.
>>
>> On Tue, Feb 7, 2017 at 11:57 AM, Benjamin Lerer <
>> benjamin.le...@datastax.com
>> > wrote:
>>
>> > May be you should ask them? We are not involved in the CockroachDB
>> github
>> > website.
>> >
>> > On Tue, Feb 7, 2017 at 11:41 AM, Kant Kodali <k...@peernova.com> wrote:
>> >
>> >> Why does CockroachDB github website say Cassandra has no Availability
>> on
>> >> datacenter failure?
>> >>
>> >> https://github.com/cockroachdb/cockroach
>> >>
>> >
>> >
>>
>
>


Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali
Why does CockroachDB github website say Cassandra has no Availability on
datacenter failure?

https://github.com/cockroachdb/cockroach


Re: quick question

2017-02-01 Thread Kant Kodali
Adding dev only for this thread.

On Wed, Feb 1, 2017 at 4:39 AM, Kant Kodali <k...@peernova.com> wrote:

> What is the difference between accepting a value and committing a value?
>
>
>
> On Wed, Feb 1, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Hi,
>>
>> Thanks for the response. I finished watching the video, but I still have a
>> few questions.
>>
>> 1) The speaker seems to suggest that different consistency levels are used
>> in different phases of the Paxos protocol. If so, what is the right
>> consistency level to set for these phases?
>>
>> 2) Right now we just set the consistency level to QUORUM at the global
>> level, and I don't think we ever change it, so in this case what consistency
>> levels are used in the different phases?
>>
>> 3) The fact that one should think about reading before the commit phase or
>> after the commit phase (but not any other phase) sounds like there is
>> something special about the commit phase; what is that? When I set QUORUM
>> consistency at the global level, does the commit phase happen right after
>> the accept phase or not? Or, irrespective of the consistency level, when
>> does the commit phase happen anyway, and what happens during the commit
>> phase?
>>
>>
>> Thanks,
>> kant
>>
>>
>> On Wed, Feb 1, 2017 at 3:30 AM, Alain RODRIGUEZ <arodr...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I believe that this talk from Christopher Batey at the Cassandra Summit
>>> 2016 might answer most of your questions around LWT:
>>> https://www.youtube.com/watch?v=wcxQM3ZN20c
>>>
>>> He explains a lot of stuff including consistency considerations. My
>>> understanding is that the quorum read can only see the data written using
>>> LWT after the commit phase. A SERIAL Read would see it (video, around
>>> 23:40).
>>>
>>> Here are the slides as well: http://fr.slideshare.net
>>> /DataStax/light-weight-transactions-under-stress-christopher
>>> -batey-the-last-pickle-cassandra-summit-2016
>>>
>>> Let us know if you still have questions after watching this (about 35
>>> minutes).
>>>
>>> C*heers,
>>> ---
>>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>>> France
>>>
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>> 2017-02-01 10:57 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>>
>>>> When you initiate a LWT(write) and do a QUORUM read is there a chance
>>>> that one might not see the LWT write ? If so, can someone explain a bit
>>>> more?
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>
>
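
Following the explanation in the talk referenced above (a QUORUM read may miss an LWT write that has been accepted but not yet committed, while a SERIAL read will see it), here is a minimal sketch with the DataStax Java driver. The ks.users table is hypothetical and this only illustrates where the consistency levels are set:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;

public class SerialReadAfterLwt {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // LWT write: the Paxos round runs at SERIAL consistency by default.
            ResultSet lwt = session.execute(new SimpleStatement(
                    "INSERT INTO ks.users (id, name) VALUES (?, ?) IF NOT EXISTS", "u1", "kant"));
            System.out.println("applied=" + lwt.wasApplied());

            // A SERIAL read goes through Paxos and observes an accepted-but-uncommitted
            // LWT value; a plain QUORUM read only sees it once the commit phase has run.
            SimpleStatement read = new SimpleStatement(
                    "SELECT name FROM ks.users WHERE id = ?", "u1");
            read.setConsistencyLevel(ConsistencyLevel.SERIAL);
            session.execute(read);
        }
    }
}

Note that SERIAL reads are noticeably more expensive than QUORUM reads, since they involve a Paxos round trip.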


Re: Is there a way to do Read and Set at Cassandra level?

2016-11-05 Thread Kant Kodali
But then don't I need to evict for every batch of writes? I thought a cache
would make sense when reads/writes > 1, so to speak. What do you think?

On Sat, Nov 5, 2016 at 3:33 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> "I have a requirement where I need to know last value that is written
> successfully so I could read that value and do some computation and include
> it in the subsequent write"
>
> Maybe keeping the last written value in a distributed cache is cheaper
> than doing a read before write in Cassandra ?
>
> On Sat, Nov 5, 2016 at 11:24 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> I have a requirement where I need to know last value that is written
>> successfully so I could read that value and do some computation and include
>> it in the subsequent write. For now we are doing read before write which
>> significantly degrades the performance. Light weight transactions are more
>> of a compare and set than a Read and Set. The very first thing I tried is
>> to see if I can eliminate this need by the application but looks like it is
>> a strong requirement for us so I am wondering if there is any way I can
>> optimize that? I know batching could help in the sense I can do one read
>> for every batch so that the writes in the batch doesn't take a read
>> performance hit but I wonder if there is any clever ideas or tricks I can
>> do?
>>
>
>
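
As noted above, LWT is a compare-and-set rather than a read-and-set, but a read-and-set can be emulated with a read followed by a conditional update that only applies if the value has not changed, retried on contention. A sketch with the Java driver; ks.chain and the compute() step are hypothetical, and this does not remove the extra read the thread is trying to avoid, it only makes the read-modify-write safe under concurrency:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class ReadAndSet {

    // Placeholder for the application-specific computation over the previous value.
    static String compute(String previous) {
        return previous + "+1";
    }

    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            String key = "row-1";
            while (true) {
                // Read the last successfully written value (assumes the row for this key
                // was seeded beforehand).
                Row current = session.execute(
                        "SELECT last_value FROM ks.chain WHERE key = ?", key).one();
                String oldValue = current.getString("last_value");

                String newValue = compute(oldValue);

                // Compare-and-set: only applies if nobody else updated the row in between.
                ResultSet cas = session.execute(
                        "UPDATE ks.chain SET last_value = ? WHERE key = ? IF last_value = ?",
                        newValue, key, oldValue);
                if (cas.wasApplied()) {
                    break; // our read-modify-write took effect
                }
                // Lost the race; loop and read the newer value.
            }
        }
    }
}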


Is there a way to do Read and Set at Cassandra level?

2016-11-05 Thread Kant Kodali
I have a requirement where I need to know the last value that was written
successfully so I can read it, do some computation, and include the result in
the subsequent write. For now we are doing a read before write, which
significantly degrades performance. Lightweight transactions are more of a
compare-and-set than a read-and-set. The first thing I tried was to see if I
could eliminate this need in the application, but it looks like it is a strong
requirement for us, so I am wondering if there is any way I can optimize it. I
know batching could help, in the sense that I can do one read per batch so
that the writes in the batch don't take a read-performance hit, but I wonder
if there are any clever ideas or tricks I can use.


Re: Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-21 Thread Kant Kodali
Why can't a secondary index be broken down into token ranges like the primary
index, at least for exact matches? That way we don't need to scan the whole
cluster, at least for exact matches. I understand that if it is a substring
search there will be 2^n substrings, which equates to 2^n hashes/tokens, which
can be a lot!

On Sat, Oct 15, 2016 at 4:35 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> If each indexed value has very few matching rows, then querying using SASI
> (or any impl of secondary index) may scan the whole cluster.
>
> This is because the index are "distributed" e.g. the indexed values stay
> on the same nodes as the base data. And even SASI with its own
> data-structure will not help much here.
>
> One should understand that the 2nd index query has to deal with 2 layers:
>
> 1) The cluster layer, which is common for any impl of 2nd index. Read my
> blog post here: http://www.planetcassandra.org/blog/
> cassandra-native-secondary-index-deep-dive/
>
> 2) The local read path, which depends on the impl of 2nd index. Some are
> using Lucene library like Stratio impl, some rolls in its own data
> structures like SASI
>
> If you have a 1-to-1 relationship between the index value and the matching
> row (or 1-to-a few), I would recommend using materialized views instead:
>
> http://www.slideshare.net/doanduyhai/sasi-cassandra-on-
> the-full-text-search-ride-voxxed-daybelgrade-2016/25
>
> Materialized views guarantee that for each search indexed value, you only
> hit a single node (or N replicas depending on the used consistency level)
>
> However, materialized views have their own drawbacks (weaker consistency
> guarantee) and you can't use range queries (<,  >, ≤, ≥) or full text
> search on the indexed value
>
>
>
>
>
> On Sat, Oct 15, 2016 at 11:55 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Well I went with the definition from wikipedia and that definition rules
>> out #1 so it is #2 and it is just one matching row in my case.
>>
>>
>>
>> On Sat, Oct 15, 2016 at 2:40 AM, DuyHai Doan <doanduy...@gmail.com>
>> wrote:
>>
>> > Define precisely what you mean by "high cardinality columns". Do you
>> mean:
>> >
>> > 1) a single indexed value is present in a lot of rows
>> > 2) a single indexed value has only a few (if not just one) matching row
>> >
>> >
>> > On Sat, Oct 15, 2016 at 8:37 AM, Kant Kodali <k...@peernova.com> wrote:
>> >
>> >> I understand Secondary Indexes in general are inefficient on high
>> >> cardinality columns but since SASI is built from scratch I wonder if
>> the
>> >> same argument applies there? If not, Why? Because I believe primary
>> keys in
>> >> Cassandra are indeed indexed and since Primary key is supposed to be
>> the
>> >> column with highest cardinality why not do the same for secondary
>> indexes?
>> >>
>> >
>> >
>>
>
>
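
For reference, the two options discussed above look roughly like this (table, index, and view names are made up; run here through the Java driver). An exact-match query against the SASI index still fans out across the cluster, because the index entries live alongside the base data on every node, whereas the materialized view turns the same lookup into an ordinary partition-key read against a single replica set:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

public class SasiVsMv {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            session.execute("CREATE TABLE IF NOT EXISTS ks.users ("
                    + " id text PRIMARY KEY, email text, name text)");

            // Option 1: SASI index; exact matches are supported but the query is
            // distributed to the whole cluster.
            session.execute("CREATE CUSTOM INDEX IF NOT EXISTS users_email_idx ON ks.users (email)"
                    + " USING 'org.apache.cassandra.index.sasi.SASIIndex'");
            session.execute("SELECT id FROM ks.users WHERE email = ?", "a@b.c");

            // Option 2: materialized view keyed by the looked-up column; an exact match
            // hits only the replicas owning that email's partition.
            session.execute("CREATE MATERIALIZED VIEW IF NOT EXISTS ks.users_by_email AS"
                    + " SELECT * FROM ks.users"
                    + " WHERE email IS NOT NULL AND id IS NOT NULL"
                    + " PRIMARY KEY (email, id)");
            session.execute("SELECT id FROM ks.users_by_email WHERE email = ?", "a@b.c");
        }
    }
}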


Re: Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-15 Thread Kant Kodali
Well I went with the definition from wikipedia and that definition rules
out #1 so it is #2 and it is just one matching row in my case.



On Sat, Oct 15, 2016 at 2:40 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> Define precisely what you mean by "high cardinality columns". Do you mean:
>
> 1) a single indexed value is present in a lot of rows
> 2) a single indexed value has only a few (if not just one) matching row
>
>
> On Sat, Oct 15, 2016 at 8:37 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> I understand Secondary Indexes in general are inefficient on high
>> cardinality columns but since SASI is built from scratch I wonder if the
>> same argument applies there? If not, Why? Because I believe primary keys in
>> Cassandra are indeed indexed and since Primary key is supposed to be the
>> column with highest cardinality why not do the same for secondary indexes?
>>
>
>


Is SASI index in Cassandra efficient for high cardinality columns?

2016-10-15 Thread Kant Kodali
I understand secondary indexes in general are inefficient on high-cardinality
columns, but since SASI is built from scratch I wonder if the same argument
applies there. If not, why? I believe primary keys in Cassandra are indeed
indexed, and since the primary key is supposed to be the column with the
highest cardinality, why not do the same for secondary indexes?


Re: Why does Cassandra need to have 2B column limit? why can't we have unlimited ?

2016-10-12 Thread Kant Kodali
I did mention this in my previous email.  This is not time series data. I
understand how to structure it if it were time series data.

What do you mean by globally sorted? Do you mean keeping every partition
sorted (since I come from the Cassandra world)?

rowkey 1 -> blob
page -> int or long or bigint
col1  -> text
col2 -> blob
co3 -> bigint

On Wed, Oct 12, 2016 at 1:37 AM, Dorian Hoxha <dorian.ho...@gmail.com>
wrote:

> There are some issues working on larger partitions.
> HBase doesn't do what you say! You also have to be careful on HBase not
> to create large rows! But since they are globally sorted, you can easily
> sort between them and create small rows.
>
> In my opinion, cassandra people are wrong, in that they say "globally
> sorted is the devil!" while all fb/google/etc actually use globally-sorted
> most of the time! You have to be careful though (just like with random
> partition)
>
> Can you tell what rowkey1, page1, col(x) actually are ? Maybe there is a
> way.
> The most "recent", means there's a timestamp in there ?
>
> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Hi All,
>>
>> I understand Cassandra can have a maximum of 2B rows per partition but in
>> practice some people seem to suggest the magic number is 100K. why not
>> create another partition/rowkey automatically (whenever we reach a safe
>> limit that  we consider would be efficient)  with auto increment bigint  as
>> a suffix appended to the new rowkey? so that the driver can return the new
>> rowkey  indicating that there is a new partition and so on...Now I
>> understand this would involve allowing partial row key searches which
>> currently Cassandra wouldn't do (but I believe HBASE does) and thinking
>> about token ranges and potentially many other things..
>>
>> My current problem is this
>>
>> I have a row key followed by bunch of columns (this is not time series
>> data)
>> and these columns can grow to any number so since I have 100K limit (or
>> whatever the number is. say some limit) I want to break the partition into
>> level/pages
>>
>> rowkey1, page1->col1, col2, col3..
>> rowkey1, page2->col1, col2, col3..
>>
>> now say my Cassandra db is populated with data and say my application
>> just got booted up and I want to most recent value of a certain partition
>> but I don't know which page it belongs to since my application just got
>> booted up? how do I solve this in the most efficient that is possible in
>> Cassandra today? I understand I can create MV, other tables that can hold
>> some auxiliary data such as number of pages per partition and so on..but
>> that involves the maintenance cost of that other table which I cannot
>> afford really because I have MV's, secondary indexes for other good
>> reasons. so it would be great if someone can explain the best way possible
>> as of today with Cassandra? By best way I mean is it possible with one
>> request? If Yes, then how? If not, then what is the next best way to solve
>> this?
>>
>> Thanks,
>> kant
>>
>
>
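
For what it's worth, the pattern usually suggested for the "which page is current?" problem is exactly the auxiliary table mentioned above, kept as small as possible: one tiny partition per logical rowkey recording the current page, updated alongside the data writes. It does carry the maintenance cost Kant mentions, but finding the most recent value after a cold start is then two partition-key reads. A hedged sketch with hypothetical names:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

public class PagedPartitions {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Data table: (rowkey, page) is the partition key, so each page stays small.
            session.execute("CREATE TABLE IF NOT EXISTS ks.data ("
                    + " rowkey text, page bigint, col text, value blob,"
                    + " PRIMARY KEY ((rowkey, page), col))");
            // Lookup table: one small row per logical rowkey holding the current page.
            session.execute("CREATE TABLE IF NOT EXISTS ks.current_page ("
                    + " rowkey text PRIMARY KEY, page bigint)");

            // Cold start: find the most recent page, then read it. Two partition-key reads.
            String rowkey = "rowkey1";
            Row cur = session.execute(
                    "SELECT page FROM ks.current_page WHERE rowkey = ?", rowkey).one();
            if (cur != null) {
                long page = cur.getLong("page");
                session.execute("SELECT col, value FROM ks.data WHERE rowkey = ? AND page = ?",
                        rowkey, page);
            }
        }
    }
}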


Why does Cassandra need to have 2B column limit? why can't we have unlimited ?

2016-10-12 Thread Kant Kodali
Hi All,

I understand Cassandra can have a maximum of 2B rows per partition, but in
practice some people seem to suggest the magic number is 100K. Why not
create another partition/rowkey automatically (whenever we reach a safe
limit that we consider efficient) with an auto-increment bigint appended as
a suffix to the new rowkey, so that the driver can return the new rowkey
indicating that there is a new partition, and so on? Now, I understand this
would involve allowing partial row key searches, which Cassandra currently
doesn't do (but I believe HBase does), and thinking about token ranges and
potentially many other things.

My current problem is this

I have a row key followed by a bunch of columns (this is not time series data),
and these columns can grow to any number, so since I have a 100K limit (or
whatever the number is; say some limit) I want to break the partition into
levels/pages

rowkey1, page1->col1, col2, col3..
rowkey1, page2->col1, col2, col3..

now say my Cassandra db is populated with data, my application just got booted
up, and I want the most recent value of a certain partition, but I don't know
which page it belongs to since my application just got booted up. How do I
solve this in the most efficient way possible in Cassandra today? I understand
I can create MVs or other tables that hold some auxiliary data, such as the
number of pages per partition, and so on, but that involves the maintenance
cost of that other table, which I cannot really afford because I already have
MVs and secondary indexes for other good reasons. So it would be great if
someone could explain the best way possible as of today with Cassandra. By
best way I mean: is it possible with one request? If yes, then how? If not,
then what is the next best way to solve this?

Thanks,
kant


How to write a trigger in Cassandra to only detect updates of an existing row ?

2016-10-04 Thread Kant Kodali

Hi all,
How to write a trigger in Cassandra to detect updates? My requirement is that I
want a trigger to alert me only when there is an update to an existing row and
looks like given the way INSERT and Update works this might be hard to do
because INSERT will just overwrite if there is an existing row and Update
becomes new insert where there is no row that belongs to certain partition key.
is there a way to solve this problem?
Thanks,

kant
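
As the mail notes, INSERT simply overwrites an existing row, so a trigger has no cheap way to tell an insert from an update. One application-level alternative (not a trigger) is a lightweight transaction: INSERT ... IF NOT EXISTS reports whether the row already existed, and the "already existed" case is the update you want to alert on. A sketch with the Java driver; ks.items is a hypothetical table, and the extra Paxos round makes every write more expensive:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class DetectUpdate {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // LWT insert: applied only if no row exists yet for this primary key.
            ResultSet rs = session.execute(
                    "INSERT INTO ks.items (id, value) VALUES (?, ?) IF NOT EXISTS",
                    "item-1", "v1");

            if (!rs.wasApplied()) {
                // The row already existed, so this write would have been an update:
                // raise the alert here and decide whether to apply an explicit UPDATE.
                System.out.println("existing row detected for item-1");
            }
        }
    }
}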

Problems with cassandra on AWS

2016-07-10 Thread Kant Kodali
Hi Guys,

I installed a 3 node Cassandra cluster on AWS and my replication factor is
3. I am trying to insert some data into a table, and I set a consistency
level of QUORUM at the Cassandra Session level. It only inserts into one node
and is unable to talk to the other nodes, because it is trying to contact them
through their private IPs and that is obviously failing. I am not sure what
settings to change, in cassandra.yaml or elsewhere, so that rpc_address in
the system.peers table is updated to the public IPs. I tried changing the
seeds to all public IPs, but that didn't work, as it looks like EC2 instances
cannot talk to each other using public IPs. Any help would be appreciated!

Thanks,
kant
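
The usual fix for this is not to change the seeds but to set the broadcast addresses in cassandra.yaml, so each node binds to its private interface while advertising a public address to peers and clients. A hedged sketch of the relevant settings (the IPs are placeholders; within a single VPC, keeping private IPs and opening the security groups is often enough, and Ec2MultiRegionSnitch can derive these addresses from EC2 metadata instead):

# cassandra.yaml (per node; addresses are placeholders)
listen_address: 10.0.0.11            # private IP the node binds to for internode traffic
broadcast_address: 54.1.2.3          # public IP advertised to other nodes via gossip
rpc_address: 0.0.0.0                 # bind the client transport on all interfaces
broadcast_rpc_address: 54.1.2.3      # public IP published in system.peers for drivers
# endpoint_snitch: Ec2MultiRegionSnitch   # alternative: let the snitch pick public/private IPs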