There are n vnodes regardless of the size of the physical cluster.
Regards
Milind
On Jun 10, 2013 7:48 AM, Theo Hultberg t...@iconara.net wrote:
Hi,
The default number of vnodes is 256, is there any significance in this
number? Since Cassandra's vnodes don't work like for example Riak's,
Why would you use Cassandra as the primary store for logging information? Have
you considered Kafka?
You could, of course, then fan out the logs from Kafka: to Cassandra on a
near-real-time basis and, on a daily basis (if you wish), extract the
deltas from Kafka into an RDBMS, with no Pig/Hive etc.
IMO
You would use Cassandra counters (or another variation of distributed
counting) once you have determined that a centralized version of
counting is not going to work.
You'd determine the infeasibility of centralized counting by figuring the
speed at which you need to sustain writes and
1. Assuming that the majority of the line items are new and
2. The lookup of an existing line-item will dictate the performance of the
system because reads are slower than writes in C*.
3. Assuming that you are using counters in C*
Therefore eliminate that problem by implementing a bloom
Kafka is relatively stable and has an active, well-supported newsgroup as
well.
As discussed by Brian, you would be inverting the paradigm of
store-process. Essentially in your original approach, you are storing the
messages first and then processing them after the fact. In the Kafka model,
you
On 1, countandra.org.
On 2, the issue runs a little deeper (we have investigated this at
countandra). To approach it a little more comprehensively, the issue has
more to do with events than with counts (at least IMO).
A similar issue is about averages... countandra does sums and counts
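The point about averages can be illustrated with a small sketch: an average cannot be merged across nodes or time buckets, but a (sum, count) pair can, and the average is derived at read time. The names below (`SumCount`, `merge`) are hypothetical and are not Countandra's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SumCount:
    """A mergeable (sum, count) pair; hypothetical, for illustration only."""
    total: float = 0.0
    count: int = 0

    def add(self, value):
        # Record one observation.
        return SumCount(self.total + value, self.count + 1)

    def merge(self, other):
        # Combine partial results from another node or time bucket.
        return SumCount(self.total + other.total, self.count + other.count)

    def average(self):
        # Derived at read time; None when nothing has been recorded.
        return self.total / self.count if self.count else None


node_a = SumCount().add(10).add(20)   # two observations on one node
node_b = SumCount().add(60)           # one observation on another node
combined = node_a.merge(node_b)

# Averaging the per-node averages weights both nodes equally, which is wrong here.
naive = (node_a.average() + node_b.average()) / 2
```

Here `combined.average()` reflects all three observations, while `naive` does not, which is why storing sums and counts separately is the safer design.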
Cool. www.countandra.org calls them cascaded counters, and it will also be
based on Kafka.
/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/
On Feb 22, 2012 7:22 PM, Edward Capriolo edlinuxg...@gmail.com
The composite-key approach with counters would work very well in this case.
It will also obviate the concern of not knowing the exact column names
a priori... although for efficiency, you might want to look at maintaining a
secondary cache-like cf for lookup
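
A minimal in-memory sketch of that idea, assuming a counter keyed by a composite (entity, dimension, value) tuple plus a secondary lookup structure that remembers which composites exist. The dictionaries stand in for column families; nothing here is a real Cassandra client call, and all names are hypothetical.

```python
from collections import defaultdict

# In-memory stand-ins for two column families (not real Cassandra calls).
counters = defaultdict(int)       # (entity, dimension, value) -> count
known_columns = defaultdict(set)  # lookup cf: entity -> {(dimension, value)}


def increment(entity, dimension, value, by=1):
    """Bump the counter and remember the composite name for later lookup."""
    counters[(entity, dimension, value)] += by
    known_columns[entity].add((dimension, value))


def read_all(entity):
    """Read every counter for an entity without knowing the names a priori."""
    return {key: counters[(entity, *key)] for key in sorted(known_columns[entity])}


increment("page42", "country", "SE")
increment("page42", "country", "SE")
increment("page42", "browser", "firefox")
result = read_all("page42")
```

The lookup structure costs one extra write per increment, but it lets a reader enumerate the counters for an entity directly.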
Depending on your data patterns(not to
My bad ~s/X:X-Value/Y:Y-Value/ after rereading the SELECT.
On Jan 22, 2012 6:40 AM, Milind Parikh milindpar...@gmail.com wrote:
The composite-key approach
I used rainbird as inspiration for Countandra (some of the publicly available
data structures from the rainbird preso). That said, there are significant
differences between the two architectures. Additionally, as Cassandra begins
to provide triggers, some very interesting things will become possible in
You might want to look at the code at countandra.org, regardless of whether
you use it. It uses a model of dynamic composite keys (although static
composite keys would have worked as well). For the actual query, only one
row is hit. This of course only works because the data model is attuned for the
Inspired by twitter's rainbird project, Countandra is a hierarchical
distributed counting engine at scale.
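
As a rough illustration of what hierarchical counting can mean, here is a sketch that assumes dotted keys where an event on a leaf also increments every ancestor. This is an assumption for illustration only, not Countandra's actual data model or API; `post_event` is a hypothetical name.

```python
from collections import Counter

counts = Counter()  # in-memory stand-in for the counter store


def post_event(key, amount=1):
    """Increment the key and every ancestor prefix of a dotted key."""
    parts = key.split(".")
    for depth in range(1, len(parts) + 1):
        counts[".".join(parts[:depth])] += amount


post_event("com.example.music")
post_event("com.example.video", 2)
```

After these two events, a query on `com.example` is a single lookup rather than a scan over its children.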
It provides a complete HTTP-based interface for both posting events and
running queries. Event postings use a FORMS-compatible syntax.
The result of the query is emitted in
For 99% of current applications requiring a persistent datastore, Oracle,
PgSQL and MySQL variants will suffice.
For the other 1% of applications, consider C* if
(a) you have given up on distributed transactions (ACIDLY; but
NOT BASEICLY)
(b) you are wondering about this newfangled
Why have two rings? Cassandra manages the replication for you; one ring
with physical nodes in two DCs might be a better option. Of course, depending
on the inter-DC failure characteristics, you might need to endure split-brain
for a while.
Use ZooKeeper. Scott Fines has a great library on top of ZK.
On Fri, Sep 16, 2011 at 7:08 PM, Daning Wang dan...@netseer.com wrote:
We try to implement an ordered queue system in Cassandra(ver 0.8.5). In
initial design we use a row as queue, a column for each item in queue.
that means
Why not use couchdb for this use case?
Milind
On Aug 18, 2011 9:07 PM, Nicholas Neuberger nneuberg...@gmail.com wrote:
I've been using Cassandra as a
In order to be predictable at big-data scale, the intensity and periodicity of
STW garbage collection have to be brought down. Assume that SLABS (Cass 2252)
will be available in the main line at some time and assume that this will
have the impact that other projects (hbase etc) are reporting. I
If I understand this correctly, then the epoch integer would be generated by
each node. Since time always flows forward, the assumption would be, I
suppose, that the epochs would be tagged with the node that generated them
and additionally the counter would carry as much history as necessary (and
I believe that the key reason is souped-up performance for the most recent data.
And yes, an intelligent flush leaves you vulnerable to some data loss.
On May
Other interesting flavors of distributed cache: Terracotta and GemFire.
Together with a complex event processing engine like OCEP, this
drives a lot of low-latency, high-frequency trading where nanoseconds matter.
Most likely because in the wild, you can't assume a reliable DNS.
Just as an aside... this question comes up often in the context of managing
Cassandra clusters, especially in elastic situations. Most CMDBs assume a
static name (host names/static IPs) for nodes. However this often proves to
be
At the risk of repeating the previous conclusions:
(a) This configuration obviates the need for a patch that I had posted
earlier. This is a good thing.
(b) The reported latency (@Sasha) is less than ordinary latencies in EC2. The
reasons behind this are not well understood. However I wouldn't
On Apr 25, 2011 3:54 AM, David Strauss da...@davidstrauss.net wrote:
On Fri, 2011-04-22 at 13:31 -0700, Milind Parikh wrote:
Is there a chance of getting manual confli...
You can actually already perform manual conflict resolution in
Cassandra
award both external and internal IP
address for each node? or we have to explicitly buy the external IP's?
I am looking into overlay networks.
On Mon, Apr 25, 2011 at 5:20 PM, Milind Parikh milindpar...@gmail.com
wrote:
I stand corrected. I show how Cassandra can be deployed
need other ports for basic
setup, right?
If anyone could get 'nodetool repair' working with this patch (across
regions), let me know. It may be that I am doing something wrong.
On Wed, Mar 23, 2011 at 1:08 AM, Milind Parikh milindpar...@gmail.com
wrote:
@aj
are you sure that all ports are accessible from all nodes?
@sasha
I think that being able to have the semantics of address aNAT address can
enable security from a different perspective. Describing an overlay network will
take too long here. But that may solve your security concerns over the internet.
code.
Dave Viner
On Mon, Mar 21, 2011 at 9:41 AM, A J s5a...@gmail.com wrote:
Thanks for sharing the document, Milind !
Followed the instructions and it worked for me.
On Mon, Mar 21, 2011 at 5:01 AM, Milind Parikh milindpar...@gmail.com
wrote:
Here's the document on Cassandra
https://docs.google.com/document/d/13Yc2t4d07290TdiRmSTchuAk9sbp4BeqOpqeYhbcDFM/edit?hl=en
There was an excellent session on vector clocks and synchronous writes in
Cassandra. Here are my gleanings from it.