When you initiate an LWT (write) and then do a QUORUM read, is there a chance
that one might not see the LWT write? If so, can someone explain a bit more?
Rodriguez - @arodream - al...@thelastpickle.com
> The Last Pickle - Apache Cassandra Consulting
> 2017-02-01 10:57 GMT+01:00 Kant Kodali <k...@peernova.com>:
>> When you initiate a LWT(write) and do a QUORUM rea
What is the difference between accepting a value and committing a value?
On Wed, Feb 1, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:
> Thanks for the response. I finished watching this video but I still got
> few questions.
> 1) The
Why does CockroachDB github website say Cassandra has no Availability on
lol. But seriously, are they even allowed to say something that is not true
about another product?
On Tue, Feb 7, 2017 at 4:05 AM, kurt greaves wrote:
> Marketing never lies. Ever
; *Cc:* email@example.com
> *Subject:* Re: Why does CockroachDB github website say Cassandra has
> no Availability on datacenter failure?
> The link you posted doesn't say anything about Cassandra
> On Feb 7, 2017 at 11:41, "Kant Kodali" <k...@peernova.com> wrote:
> Why does CockroachDB github website say Cassandra has no Availability on
> datacenter failure?
On Tue, Feb 7, 2017 at 3:20 PM, Kant Kodali <k...@peernova.com> wrote:
> LOL They took down that image finally!! But I would still keep an eye on
> what kind of fake benchmarks they m
Adding dev only for this thread.
On Wed, Feb 1, 2017 at 4:39 AM, Kant Kodali <k...@peernova.com> wrote:
> What is the difference between accepting a value and committing a value?
> On Wed, Feb 1, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:
Let's say I have 2 DCs with a 3-node cluster in each DC and one replica
in each DC. I would like to maintain strong consistency and high
1) First of all, how do I even set up one replica in each DC?
2) What should my read and write consistency levels be when I am
It looks like there is ordering within one client (ordering based on
timestamp) and looks like this *order is preserved across all replicas*
however, the benefits of async given the ordering restriction are slightly
blurry for me.
On Tue, Feb 21, 2017 at 2:35 AM, Kant Kodali <k...@peernova.
fferent use case.
> CS was not designed to guarantee order. It was built to be linearly
> scalable, highly concurrent and eventually consistent.
> To me it sounds like an ACID DB better serves what you are asking for.
> 2017-02-21 10:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
t, read the order based on
> PK (locally) and update "the pointer" on every write (also locally). If you
> then store your pointer with the last known timestamp of your base data,
> you also have a LWW on your pointer so also the last pointer wins when
> reading with > CL_
synchronous, you'll have a large penalty for latency.
> On Mon, Feb 20, 2017 at 10:11 PM, Kant Kodali <k...@peernova.com> wrote:
>> Thanks again for the response! if they mean it between client and server
>> I am not sure why they would use the word "replication
> That paper does go on to discuss electing a distinguished proposer, but
> that was never done for C*. I believe it's not considered a good fit for C*
> On Thu, Feb 16, 2017, at 04:20 PM, Kant Kodali wrote:
> @Ariel Weisberg
Goal: want to create check_duplicate UDA on a blob column
Context: I have a partition of 10 million rows with a size of 10GB (I know
this is bad). I want to check if there are duplicates in a blob column in
this partition. The blob column can be at most 256 bytes.
Question: can I create
1. Are Cassandra triggers thread safe? What happens if two writes invoke
the trigger where the trigger is trying to modify the same row in a partition?
2. Has anyone used them successfully in production? If so, any issues? (I am
using the latest version of C*, 3.10)
3. I have partitions that are
When a C* coordinator writes to replicas, does it write in the same order or
a different order? In other words, does the replication happen synchronously or
asynchronously? Also, does this depend on whether the client is sync or async?
What happens in the case of concurrent writes to a coordinator?
61.21 802187.44 10299432635
Min 0.00 5.72 9.89 125
Max 6.00 668489.53 8582860.53 10299432635
On Sat, Feb 18, 2017 at 12:28 AM, Kant Kodali &l
Is there a query to find out the largest partition in a table? Does the
query below give me the largest partition?
select max(mean_partition_size) from size_estimates ;
9:47 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:
>> No it's not going to be in 3.11.x. The earliest release it could make it
>> into is 4.0.
>> On Wed, Feb 22, 2017, at 03:34 AM, Kant Kodali wrote:
Saw this one today...
On Tue, Jan 3, 2017 at 6:27 AM, Eric Evans
> On Mon, Jan 2, 2017 at 2:26 PM, Edward Capriolo
> > Lets be clear:
> > What I am saying is avoiding being loose
How does Cassandra achieve Linearizability with “Last write wins” (conflict
resolution methods based on time-of-day clocks) ?
Relying on synchronized clocks is almost certainly non-linearizable,
because clock timestamps cannot be guaranteed to be consistent with actual
event ordering due to
If I read from a materialized view with a consistency level of QUORUM, am I
guaranteed to have the most recent view? In other words, is the w + r > n contract
maintained for MVs as well, for both reads and writes?
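The w + r > n rule referred to in the question is plain arithmetic and can be sketched as below. This is a minimal illustration, assuming `n` is the replication factor and that QUORUM means floor(n/2) + 1 replicas; the function names `quorum` and `overlaps` are hypothetical. It only shows when read and write replica sets must intersect for a base table and says nothing about whether materialized views honor the same contract.

```python
def quorum(n: int) -> int:
    """Number of replicas that must respond for QUORUM at replication factor n."""
    return n // 2 + 1


def overlaps(w: int, r: int, n: int) -> bool:
    """True when any r-replica read set must intersect any w-replica write set."""
    return w + r > n


if __name__ == "__main__":
    for n in (3, 5):
        q = quorum(n)
        print(f"RF={n}: QUORUM={q}, "
              f"QUORUM write + QUORUM read overlap: {overlaps(q, q, n)}")
        print(f"RF={n}: ONE write + ONE read overlap: {overlaps(1, 1, n)}")
```

With a replication factor of 3, QUORUM is 2, so a QUORUM write followed by a QUORUM read touches at least one common replica, while ONE plus ONE does not guarantee that.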
m (probably stratum 2 or 3) source.
> As Jonathan mentioned, there's no guarantee from Cassandra, but if you
> need as close as you can get, you'll probably need to do it yourself.
> (I run several stratum 2 ntpd servers for pool.ntp.org)
> Kind regards,
> We use our own NTP cluster to reduce clock drift as much as possible, but
> public NTP servers are good enough for most uses. https://www.instaclustr.
> On T
s very expensive. CS is made for
> A+P mostly with tunable C. In ACID databases this is a completely different
> thing as they are mostly either not partition tolerant, not highly
> available or not scalable (in a distributed manner, not speaking of
> "monolithic super servers"
eal problem that isn’t going away anytime soon.
> This problem is sometimes addressed with event sourcing rather than
> mutating in place.
> Hope this helps.
> On Feb 9, 2017, at 5:21 PM, Kant Kodali <k...@peernova.com> wrote:
> @Justin I rea
"That’s the safety blanket everyone wants but is extremely expensive,
especially in Cassandra."
Yes, LWTs are expensive. Are there any plans to make this better?
On Fri, Feb 10, 2017 at 12:17 AM, Kant Kodali <k...@peernova.com> wrote:
> Hi Jon,
> Thanks a lot for
constraint to tolerate F failures ? I understand it is not needed when not
using LWT's since Cassandra is a master-less system.
On Fri, Feb 10, 2017 at 10:25 AM, Kant Kodali <k...@peernova.com> wrote:
> Thanks Ariel! Yes I knew there are so many variations and optimizations of
> Paxos. I
Also attached is a flame graph generated from a thread dump.
On Mon, Feb 27, 2017 at 2:32 AM, Kant Kodali <k...@peernova.com> wrote:
> Attached are the stats of my Cassandra node running on a 4-core CPU. I am
> using sjk-plus tool for the first time so what are t
uld you want to achieve? *5000 is the target.*
> On Monday, February 27, 2017 at 12:48, Kant Kodali <k...@peernova.com> wrote:
> Also attached is a flame graph generated from a thread dump.
> On Mon, Feb 27, 2017 at 2:32 AM,
Hi! My answers are inline.
On Mon, Feb 27, 2017 at 11:48 AM, Kant Kodali <k...@peernova.com> wrote:
> On Mon, Feb 27, 2017 at 10:30 AM, Romain Hardouin <romainh...@yahoo.fr>
>> Regarding shared pool workers see CASSANDRA-11
> version. There are likely other new feature tickets that should really
> say 4.x.
> Kind regards,
> On 02/22/2017 07:28 PM, Kant Kodali wrote:
> > I hope that patch is reviewed as quickly as possible. We use LWT's
> > heavily and we are getting a throug
Thanks again. My responses are inline.
On Tue, Feb 28, 2017 at 10:04 AM, Romain Hardouin
> > we are currently using 3.0.9. should we use 3.8 or 3.10
> No, don't use 3.X in production unless you really need a major feature.
> I would advise to
On Tue, Feb 28, 2017 at 7:51 PM, Kant Kodali <k...@peernova.com> wrote:
> Hi Romain,
> Thanks again. My responses are inline.
> On Tue, Feb 28, 2017 at 10:04 AM, Romain Hardouin <romainh...@yahoo.fr>
>> > we are currently using 3.0
On Wed, Oct 5, 2016 at 12:20 AM, Kant Kodali <k...@peernova.com> wrote:
Thanks a lot. This helps me decide not to write one, for the
performance reasons you pointed out!
On Tue, Oct 4, 2016 11:42 AM, Eric Stevens migh...@gmail.com
along with its associated overhead. But all that said, it should be possible,
though you'll have to write it for yourself in your trigger code.
On Tue, Oct 4, 2016 at 12:29 PM Kant Kodali <k...@peernova.com> wrote:
How to write a trigger in Cassandra to detect updates? My requi
How do I write a trigger in Cassandra to detect updates? My requirement is that I
want a trigger to alert me only when there is an update to an existing row, and
it looks like, given the way INSERT and UPDATE work, this might be hard to do
because INSERT will just overwrite if there is an
sure as long as that isolated instance is treated as separate cluster you
shouldn't run into any problems.
On Thu, Oct 6, 2016 4:08 PM, Ali Akhtar ali.rac...@gmail.com
Is it possible to create an isolated cassandra instance which is run during
integration tests and it disappears
2016 at 5:10 AM, Kant Kodali <k...@peernova.com> wrote:
You don't need to look for a Cassandra Java API to start/stop an instance. You just
need to write a shell script, or use Python, Java or any other language, to execute shell
On Thu, Oct 6, 2016 4:57 PM, Ali Akhtar ali.rac...@gmail.com
ngth(), you can decide whether
it is an insert/update(length > 0) OR delete(length == 0)
I would urge you to try the snippet once on your own, to see what kind of data it
produces in next. You could dump the output of next into a column of an audit table
to see that output.
I have a scenario where every write/row depends on some of the data written
in the previous row so I end up doing a read before write which is
degrading the performance by a significant margin so I am thinking if I
should keep track of the last row written for every partition in a cache so
I have a requirement where I need to know the last value that was written
successfully so I can read that value, do some computation and include
it in the subsequent write. For now we are doing a read before write, which
significantly degrades performance. Lightweight transactions are more
e that is written
> successfully so I could read that value and do some computation and include
> it in the subsequent write"
> Maybe keeping the last written value in a distributed cache is cheaper
> than doing a read before write in Cassandra ?
> On Sat, Nov 5, 2016 at 11:2
Can a Cassandra cluster direct or load-balance requests by detecting the
resource usage of a particular node?
> On Wed, 19 Oct 2016 06:14:27 -0400, Kant Kodali <k...@peernova.com> wrote:
> Can a Cassandra cluster direct or load-balance requests by detecting the
> resource usage of a particular node?
embellished on a statement to make a splashy article.
>> The effect is something like this:
>> Iced tea does not cause kidney stones! Cassandr
(thats like 15 million columns where each column can have a data of size
On Fri, Oct 14, 2016 at 11:30 PM, Kant Kodali <k...@peernova.com> wrote:
> "Robert said he could treat safely 10 15GB partitions at his presentation"
> This sounds like there is
I understand Secondary Indexes in general are inefficient on high
cardinality columns but since SASI is built from scratch I wonder if the
same argument applies there? If not, Why? Because I believe primary keys in
Cassandra are indeed indexed and since Primary key is supposed to be the
es each of them have a 15GB partition".
> What I wanted to say is that we can store many more rows (and columns) in a
> partition than before 3.6.
> 2016-10-15 15:34 GMT+09:00 Kant Kodali <k...@peernova.com>:
>> "Robert said he could treat safely 10
te data --> pressure on compaction
> - bootstrapping of new nodes --> failure to stream a partition in the
> middle will force to re-send the whole partition from the beginning again -->
> the receiving node has a bunch of duplicate data --> pressure on compaction
Do you mean:
> 1) a single indexed value is present in a lot of rows
> 2) a single indexed value has only a few (if not just one) matching row
> On Sat, Oct 15, 2016 at 8:37 AM, Kant Kodali <k...@peernova.com> wrote:
>> I understand Secondary Indexes in ge
resumed" in a middle of a partition, the operational pains will still
> be there. Same for compaction
> On Sat, Oct 15, 2016 at 12:00 PM, Kant Kodali <k...@peernova.com> wrote:
>> 1) It will be great if someone can confirm that there is no limi
torage engine, you'll create a bunch of tombstones and
> duplicates of values
> On Sun, Oct 23, 2016 at 9:35 PM, Kant Kodali <k...@peernova.com> wrote:
>> Hi All,
>> Is there any problem having too many clustering columns? My goal is to
Is there any problem having too many clustering columns? My goal is to
store data by columns in order and for any given partition (primary key)
each of its non-clustering columns (columns that are not part of the primary
key) can lead to a new column underneath or the CQL equivalent would be a
What is the maximum value of Cassandra Counter Column?
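For reference, the CQL documentation describes the `counter` type as a 64-bit signed integer, so its bounds follow from two's-complement arithmetic. The sketch below only computes those bounds; the constant names and the `fits_in_counter` helper are hypothetical, and it does not model what Cassandra does when an increment would pass the limit.

```python
COUNTER_BITS = 64  # assumption: counter values are signed 64-bit integers

COUNTER_MAX = 2 ** (COUNTER_BITS - 1) - 1   # largest representable value
COUNTER_MIN = -(2 ** (COUNTER_BITS - 1))    # smallest representable value


def fits_in_counter(value: int) -> bool:
    """True when value is representable as a signed 64-bit integer."""
    return COUNTER_MIN <= value <= COUNTER_MAX


if __name__ == "__main__":
    print(COUNTER_MAX)  # 9223372036854775807
    print(COUNTER_MIN)  # -9223372036854775808
```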
Where does it say that a counter is implemented as a long?
On Sun, Oct 23, 2016 at 1:13 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
> Probably: https://docs.oracle.com/javase/8/docs/api/java/
> On Sun, Oct 23, 2016 at 1:12 PM, Kant Kodali <k...@peernova.
I just read the following link
and I wonder what the point of the counter type is when we can do the same
thing with int or bigint. What are the benefits of using the counter data type?
t; arrives – counters attempt to solve this
> *From: *Kant Kodali <k...@peernova.com>
> *Reply-To: *"firstname.lastname@example.org" <email@example.com>
> *Date: *Monday, October 17, 2016 at 5:20 PM
> *To: *"firstname.lastname@example.org
Also are you saying counters are atomic?
On Mon, Oct 17, 2016 at 6:43 PM, Kant Kodali <k...@peernova.com> wrote:
> How about “Set the value 1 above what it is now" ? The same principle
> should apply right?
> On Mon, Oct 17, 2016 at 6:21 PM, Jeff J
Sorry, I shouldn't have said adding a node. Sometimes data seems to be corrupted
or inconsistent, in which case I would like to run a repair.
> On Oct 19, 2016, at 10:10 AM, Sean Bridges
> Thanks, we will try that.
ou know what the caveats are and
> use a proper tool to orchestrate it, that would save you from repairing all
> 10TB each time.
> CASSANDRA-12580 might help too as Romain showed us :
> On Wed, Oct 19, 2016 at 6:42 PM Kant Kodali <k...@peernova.com> wrote:
> Another question on the same note: what would be the fastest way to
> repair a 10TB cluster? Full repairs are taking days. So among repair
Another question on the same note: what would be the fastest way to
repair a 10TB cluster? Full repairs are taking days. So between parallel
repair and subrange repair, which is faster in the case of, say, adding a new
node to the cluster?
> On Oct 19, 2016, at
+1 to Chris Lohfink's response
I would also restate the following sentence, "java GC pauses are pretty much
a fact of life", as "Any GC based system pauses are pretty much a fact of
I would be more than happy to see someone prove otherwise.
On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink
's Garbage Collector?)
What timeouts are you referring to here?
On Sun, Nov 27, 2016 at 9:57 PM, Harikrishnan Pillai <
> Hi @Kant Kodali,
> We have multiple clusters running Zing.
> One cluster has 11/11 and another one also
C4 (Zing's Garbage
*What timeouts are you referring to?*
On Mon, Nov 28, 2016 at 7:39 AM, Harikrishnan Pillai <
> Hi @Kant Kodali,
> 11 /11 , 11 nodes in DC1 and 11 nodes in DC2.
1) What is the size of each Virtual Node token range?
2) Are all vnode token ranges on one server the same size?
3) If these token ranges are predefined, doesn't that imply that the
maximum total number of rows on a server is also predefined?
maximum total number of rows in a server =
> all the list entries and comparing them with a key you are looking for.
> This thesis is maybe more correct:
> There can be no more than 2^64 nodes in a cluster as then 2 nodes would
> share exactly the same token, and this does not really make sense.
ialized views guarantee that for each search indexed value, you only
> hit a single node (or N replicas depending on the used consistency level)
> However, materialized views have their own drawbacks (weaker consistency
> guarantee) and you can't use range queries (<, >, ≤,
I keep reading the articles below but the biggest questions for me are as
1) What is the "data size" per request? Without the data size it is hard for me to
see anything sensible.
2) Is there batching here?
On Tue, Nov 1, 2016 at 2:10 AM, _ _ wrote:
> Currently I am running a Cassandra cluster of 3 nodes (with it replicating
> to both nodes) and am experiencing poor performance, usually getting second
> response times when running queries when I am expecting/needing
ng all overhead is about 400 bytes.
>> Not too sure about the batching - it may be one of the parameters to
>> On Mon, Oct 31, 2016 at 4:07 PM, Kant Kodali <k...@peernova.com> wrote:
>>> Hi Guys,
I don't see a store() call in your receive().
Search for store() in here
On Wed, Nov 2, 2016 at 10:23 AM, Cassa L wrote:
> I am using spark 1.6. I wrote a custom receiver to read from WebSocket.
are not necessarily logN in my case because I may divide the number of
nodes at each level by 1/4 or 1/8.
On Wed, Oct 26, 2016 at 1:24 AM, Kant Kodali <k...@peernova.com> wrote:
> @Ali hmm..I didn't mean to say I store the same data across two tables and
> neither tables are depe
If one were given a choice of fitting all the data into one table vs.
fitting the data into two tables (while keeping the runtime and
space complexity of CRUD operations the same in either case), which one
would you choose and why?
restating my first question.
On Wed, Oct 26, 2016 at 1:19 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
> You would need to do each write twice and data will take up twice the
> space as it's duplicated in two places.
> On Wed, Oct 26, 2016 at 1:17 PM, Kant Kodali <k...@peernova.com
I guess the question can be rephrased into "What is the overhead of
creating and maintaining an additional table?"
On Wed, Oct 26, 2016 at 1:12 AM, Ali Akhtar <ali.rac...@gmail.com> wrote:
> Depends on the use case. No one right answer.
> On Wed, Oct 26, 2016 a
st like with random
>> Can you tell what rowkey1, page1, col(x) actually are ? Maybe there is a
>> The most "recent", means there's a timestamp in there ?
>> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <k...@peernova.com> wr
I understand Cassandra can have a maximum of 2B rows per partition but in
practice some people seem to suggest the magic number is 100K. Why not
create another partition/rowkey automatically (whenever we reach a safe
limit that we consider would be efficient) with auto increment bigint
to be careful though (just like with random
> Can you tell what rowkey1, page1, col(x) actually are ? Maybe there is a
> The most "recent", means there's a timestamp in there ?
> On Wed, Oct 12, 2016 at 9:58 AM, Kant Kodali <k...@peernov
Note: Non-system keyspaces don't have the same replication settings,
effective ownership information is meaningless
> On 25.11.2016 at 23:38, "Kant Kodali" <k...@peernova.com> wrote:
>> +1 Chris Lohfink response
>> I would also restate the following sentence "java GC pauses are pretty
>> much a fact of life" to "Any GC based sys
Good to know about Zing! I will have to take a look.
On Sat, Nov 26, 2016 at 8:27 PM, Kant Kodali <k...@peernova.com> wrote:
> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
> completely I mean no GC pauses whatsoever.
> When you say Java is NO
seen there looked very
> interesting and promising! By the way it's written in C++.
> 2016-11-27 7:06 GMT+01:00 Kant Kodali <k...@peernova.com>:
>> Automatic reference counting sounds like a college-level idea that we
>> have all been hearing about since GC was born! Th
taining apps that
> are built in C or C++ has never been such a pain.
> On the other hand, Java is easier to handle for developers. And coding
> plain C is also a pain.
> That's why I said it's a philosophical discussion.
> Anyway, Cassandra runs on Java so we have to deal wi
> with Azul. We never had a major GC above 10 ms.
> > On Nov 25, 2016, at 3:49 PM, Martin Schröder <mar...@oneiros.de> wrote:
> > 2016-11-25 23:38 GMT+01:00 Kant Kodali <k...@peernova.com>:
> >> I would also rest
Are materialized views persisted on disk? Sorry for the naive question.
ue (as 4).
> On Sat, Dec 17, 2016 at 10:21 PM, Kant Kodali <k...@peernova.com> wrote:
>> I keep hearing that the minimum number of Cassandra nodes required to
>> achieve Quorum consensus is 4 I wonder why not 3? In fact, many container
>> deployments by defau
u’re trying to read/write with quorum
> consistency then the read/write operation will fail. You could still do
> reads/writes with CL=ONE, though (provided that at least 1 of the replicas
> was up).
> - Max
> > On Dec 17, 2016, at 1:21 pm, Kant Kodali <k...@peernova.com>
I keep hearing that the minimum number of Cassandra nodes required to
achieve quorum consensus is 4. I wonder, why not 3? In fact, many container
deployments by default seem to deploy 4 nodes. Can anyone shed some light
What happens if I have 3 nodes and replication factor of 3 and
. You look at
Google or FB and see how much open source contribution they have
done. Oracle doesn't come anywhere close to that.
On Mon, Jan 2, 2017 at 8:08 PM, Edward Capriolo <edlinuxg...@gmail.com>
> On Mon, Jan 2, 2017 at 8:30 PM, Kant Kodali <k...@peernova.com> w
This is a subjective question and of course it would turn into opinionated
answers and I think we should welcome that (Nothing wrong in debating a
topic). We have many such debates as SEs, such as programming language
comparisons, Architectural debates, Framework/Library debates and so on.
Yeah, you should use async writes. Also, you cannot neglect data size, so you
might want to let us know what your data size is.
On Thu, Jan 5, 2017 at 2:57 PM, kurt Greaves wrote:
> you should try switching to async writes and then perform the test. sync
> writes won't make much
On Wed, Dec 21, 2016 at 2:59 AM, Kant Kodali <k...@peernova.com> wrote:
> On Wed, Dec 21, 2016 at 2:58 AM, Kant Kodali <k..
>Oracle can offer support but maybe only for Oracle JDK.
>Twitter uses OpenJDK, but they have their own JVM support team. Not
>sure everyone can afford that.
> As a side note I’ll add that Oracle is paying talented engineers to work
> on the JVM to make it
On Wed, Dec 21, 2016 at 2:58 AM, Kant Kodali <k...@peernova.com> wrote:
> The fact is Oracle is horrible :)
> On Wed, Dec 21, 2016 at 2:54 AM, Brice Dutheil <brice.duth...@gmail.com
> -- Brice
> On Wed, Dec 21, 2016 at 11:34 AM, Kant Kodali <k...@peernova.com> wrote:
>> yeah well I don't think Oracle is treating Java the way Google is
>> treating Go and I am not a big fan of Go mainly because I understand the
>> JVM is far m
Looking at this
I don't know why Cassandra recommends the Oracle JVM.
JVM is a great piece of software but I would like to stay away from Oracle
as much as possible. Oracle is just horrible the way they
> Option (a) will impact cluster stability more than (b).
> *From:* Kant Kodali [mailto:k...@peernova.com]
> *Sent:* Saturday, December 17, 2016 22:21
> *To:* email@example.com