Hi,
Sorry about the gravedigging, but what would be a good starting value for
tuning rpc_max_threads?
I mean, the default is unlimited and the commented-out value is 2048. Native protocol
seems to only allow 128 simultaneous threads. Should I stick to 2048 or try
with something closer to 128 or even something
Hi guys,
I am looking at the options added and dropped in Cassandra between 1.2.18 and
2.0.11 and this makes me wonder:
Why has the index_interval option been removed from cassandra.yaml ? I know
we can also define it on a per table basis, yet, this global option was
quite useful to tune memory
Hi,
Is it better to use a counter for the user click count than to create a
new row keyed as user id : timestamp for each click and count those rows?
Basically we want to track the user clicks and use them for
hourly/daily/monthly reports.
Thanks
Ajay
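
To make the raw-row option concrete, here is a minimal sketch in plain Python of rolling up per-click rows into hourly, daily and monthly counts. It does not talk to Cassandra at all; the sample rows, the user ids and the names roll_up and BUCKET_FORMATS are hypothetical and only illustrate the counting step.

```python
from collections import Counter
from datetime import datetime

# Hypothetical raw click rows, one per click: (user_id, click timestamp).
clicks = [
    ("u1", datetime(2014, 12, 29, 16, 5)),
    ("u1", datetime(2014, 12, 29, 16, 40)),
    ("u1", datetime(2014, 12, 29, 17, 1)),
    ("u2", datetime(2014, 12, 30, 9, 15)),
]

# strftime patterns that truncate a timestamp to each report bucket.
BUCKET_FORMATS = {
    "hourly": "%Y-%m-%d %H:00",
    "daily": "%Y-%m-%d",
    "monthly": "%Y-%m",
}


def roll_up(rows, granularity):
    """Count clicks per (user_id, bucket) for the given granularity."""
    fmt = BUCKET_FORMATS[granularity]
    return Counter((user, ts.strftime(fmt)) for user, ts in rows)


hourly = roll_up(clicks, "hourly")
daily = roll_up(clicks, "daily")
monthly = roll_up(clicks, "monthly")
```

The cost of this approach is that every report has to read all raw rows in the bucket, which is why it is accurate but slower than reading a single counter.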
Hi!
It’s really a tradeoff between accuracy and speed, and it depends on your read
access patterns; if you need it to be fairly fast, use counters by all means, but
accept the fact that they will (especially in older versions of Cassandra or under
adverse network conditions) drift away from the true click count.
Hi,
So you mean to say counters are not accurate? (It is highly likely that
multiple parallel threads will be trying to increment the counter as users
click the links.)
Thanks
Ajay
On Mon, Dec 29, 2014 at 4:49 PM, Janne Jalkanen janne.jalka...@ecyrd.com
wrote:
Hi!
It’s really a tradeoff between
Hi Ajay,
Here is a good explanation you might want to read.
http://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
Though we have used counters for 3 years now, starting with C* 0.8, and
we are happy with them. The limits I can see with both approaches are:
This is a bit difficult. Depending on your access patterns and data
volume, I'd be inclined to keep a separate table with a (count,
foreign_key) clustering key. Then do a client-side join to read the data
back in the order you're looking for. That will at least make the heavily
updated table
https://issues.apache.org/jira/browse/CASSANDRA-3534
On Mon, Dec 29, 2014 at 6:58 PM, Alain RODRIGUEZ arodr...@gmail.com wrote:
Hi guys,
I am looking at the options added and dropped in Cassandra between 1.2.18 and
2.0.11 and this makes me wonder:
Why has the index_interval option been removed
Thanks for the clarification.
In my case, Cassandra is the only storage. If the counters become incorrect,
they couldn't be corrected. To allow for that we would have to store the raw
data, in which case we might as well go with that approach. But the granularity
has to be at the seconds level, as more than one user can click the same link. So the
If the counters get incorrect, it couldn't be corrected
You'd have to store something that allowed you to correct it. For example,
the TimeUUID approach to keep true counts, which are slow to read but
accurate, and a background process that trues up your counter columns
periodically.
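
A minimal sketch of that true-up idea, in plain Python with in-memory stand-ins for the two tables. Everything here is an assumption for illustration: raw_clicks, counters and true_up are hypothetical names, and the data is made up. The correction is computed as a delta because a Cassandra counter column can only be incremented or decremented, not set to a value.

```python
import uuid
from collections import Counter

# Hypothetical stand-in for the raw table: one entry per click, keyed by a
# time-based UUID. Accurate, but counting means reading every entry.
raw_clicks = [(uuid.uuid1(), link) for link in ["a", "a", "b", "a", "b"]]

# Hypothetical stand-in for the counter table: fast to read, may have drifted.
counters = {"a": 4, "b": 2, "c": 1}


def true_up(raw, counter_table):
    """Bring each counter in line with the raw events.

    Returns the delta applied per link (only for links that had drifted).
    """
    truth = Counter(link for _, link in raw)
    deltas = {}
    for link in set(truth) | set(counter_table):
        current = counter_table.get(link, 0)
        delta = truth.get(link, 0) - current
        if delta != 0:
            deltas[link] = delta
            # Apply as an increment/decrement, as a counter update would be.
            counter_table[link] = current + delta
    return deltas


corrections = true_up(raw_clicks, counters)
```

In a real deployment the background job would also have to deal with clicks arriving while it runs, for example by only reconciling time buckets that are already closed.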
On Mon,
Thanks for the pointer Jason,
Yet, I thought that cache and memtables went off-heap only in version 2.1
and not 2.0 (As of Cassandra 2.0, there are two major pieces of the
storage engine that still depend on the JVM heap: memtables and the key
cache. --
Did you solve this issue?
I guess nobody answered you because this is very weird. I also guess you've
made some mistake in the configuration.
Anyway, let me know if you managed to get out of the mess somehow or if you
still need help.
C*heers
2014-12-03 15:57 GMT+01:00 Castelain, Alain
I noticed (and reported) a bug that made me drop this tool --
https://github.com/BrianGallew/cassandra_range_repair/issues/16
Might this be related somehow ?
C*heers
Alain
2014-11-21 13:30 GMT+01:00 Paulo Ricardo Motta Gomes
paulo.mo...@chaordicsystems.com:
Hey guys,
Just reviving this
What you are asking may be answered at the code level and is pretty deep stuff,
at least from a user's (like me) point of view. But to quote Jonathan
in CASSANDRA-3534, Then you will be able to say use X amount of memory for
memtables, Y amount for the cache (and monitor Z amount for the bloom
filters)
I made an error in the topic title.
We are indeed going to do it (that's why I made the mistake), but I am
speaking of 1.2 -- 2.0 here, and we will start with this before going to
2.1, since we want to do it as a rolling upgrade.
Thanks for your enlightening pointer about this vanished pressure
On Tue, Dec 23, 2014 at 10:26 AM, Peter Lin wool...@gmail.com wrote:
I'm biased in favor of using both thrift and CQL3, though many people on the
list probably think I'm crazy.
I don't think you're crazy but I do think you will ultimately face the
deprecation of thrift.
Briefly, I disbelieve
In my biased opinion something else should replace CQL and it needs a proper
rewrite on the server side.
I've studied the code and having written query parsers and planners, what is
there today isn't going to work long term.
Whatever replaces both thrift and CQL needs to provide 100% of the
Thanks Ryan.
I want to understand what is the best way to increase/change the replication
factor of the Cassandra cluster? My priority is consistency and I can probably
tolerate some downtime of the cluster. Is it totally weird to try
changing the replication factor later or are there people doing it for
On Wed, Dec 24, 2014 at 9:41 AM, Phil Burress philtburr...@gmail.com
wrote:
Just upgraded our cluster from 2.1.1 to 2.1.2 and our nodes keep dying.
The kernel is killing the process due to out of memory:
https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
Appears to
On Tue, Dec 23, 2014 at 12:29 AM, Jiri Horky ho...@avast.com wrote:
just a follow up. We've seen this behavior multiple times now. It seems
that the receiving node loses connectivity to the cluster and thus
thinks that it is the sole online node, whereas the rest of the cluster
thinks that it
On Mon, Dec 29, 2014 at 1:40 PM, Pranay Agarwal agarwalpran...@gmail.com
wrote:
I want to understand what is the best way to increase/change the replication
factor of the Cassandra cluster? My priority is consistency and I can probably
tolerate some downtime of the cluster. Is it totally
Should I stick to 2048 or try
with something closer to 128 or even something else ?
2048 worked fine for us.
About HSHA,
I anti-recommend hsha, serious apparently unresolved problems exist with
it.
We saw an improvement when we switched to HSHA, particularly for our
offline
Might be https://issues.apache.org/jira/browse/CASSANDRA-8061 or one of the
linked/duplicate tickets.
=Rob
On Mon, Dec 29, 2014 at 1:40 PM, Robert Coli rc...@eventbrite.com wrote:
On Wed, Dec 24, 2014 at 9:41 AM, Phil Burress philtburr...@gmail.com
wrote:
Just upgraded our cluster from
On Mon, Dec 29, 2014 at 2:03 PM, mck m...@apache.org wrote:
We saw an improvement when we switched to HSHA, particularly for our
offline (hadoop/spark) nodes.
Sorry, I don't have the data anymore to support that statement, although
I can say that improvement paled in comparison to
Hi folks,
Perhaps this is a question better addressed to the Cassandra developers
directly, but I thought I'd ask it here first. We've recently been
benchmarking certain uses of secondary indexes in Cassandra 2.1.x, and
we've noticed that when the number of items in an index reaches beyond
Perf is better, correctness seems less so. I value the latter more than the
former.
Yeah no doubt.
Especially in CASSANDRA-6285 I see some scary stuff went down.
But there are no outstanding bugs that we know of, are there?
(CASSANDRA-6815 remains just a wrap up of how options are to be
presented
So while not exactly the same, this seems like a good analogy for
suggesting a third interface to fix problems with existing interfaces:
http://xkcd.com/927/
Even if the CQL parsing code in Cassandra is subpar (I haven't studied it),
that's not an especially compelling case to suggest replacing
Secondary indexes are there for convenience, not performance. If you're
looking for something performant, you'll need to maintain your own indexes.
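
As a rough illustration of what maintaining your own index means, here is a sketch in plain Python where two dictionaries stand in for a primary table and a hand-maintained lookup table. The names users, users_by_city, upsert_user and find_by_city are hypothetical; the point is only that the application updates the index on every write, including removing the stale entry when the indexed value changes.

```python
from collections import defaultdict

# Stand-in for the primary table: user_id -> row.
users = {}
# Stand-in for the hand-maintained index table: city -> set of user_ids.
users_by_city = defaultdict(set)


def upsert_user(user_id, city):
    """Write the row and keep the index in step with it."""
    old = users.get(user_id)
    if old is not None and old["city"] != city:
        # Remove the stale index entry before adding the new one.
        users_by_city[old["city"]].discard(user_id)
    users[user_id] = {"city": city}
    users_by_city[city].add(user_id)


def find_by_city(city):
    """Look up user ids through the index instead of scanning all rows."""
    return sorted(users_by_city.get(city, ()))


upsert_user("u1", "Paris")
upsert_user("u2", "Paris")
upsert_user("u1", "Helsinki")  # u1 moves; the Paris entry must be dropped
```

In a distributed store the two writes are not atomic by default, so a real implementation has to decide how to handle a failure between them.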
On Mon Dec 29 2014 at 3:22:58 PM Sam Klock skl...@akamai.com wrote:
Hi folks,
Perhaps this is a question better addressed to the Cassandra
The kind of query language I'm thinking of is closer to Datalog, which is
what Datomic uses. It's a personal bias, but I find it easier and cleaner
to express joins, subqueries and correlated subqueries in a
Lisp-like/Datalog-like syntax than in SQL.
Since CQL is modeled/inspired by SQL, it inherits