Please don't post to two mailing lists at once; it makes it impossible for
people who are not subscribed to both mailing lists to follow the thread
(and is bad form in general). If you are unsure which one is the most
appropriate, that's fine, pick your best guess (in this case it's clearly a java driver
Ok, looks fair enough.
Thanks guys. It would be great to be able to add disks when the amount of data
grows and add nodes when throughput increases... :)
2014-06-19 5:27 GMT+02:00 Ben Bromhead b...@instaclustr.com:
Hi,
I have a column family with a secondary index on one of its columns. I
noticed that when I write a row to the column family, and immediately query
that row through the secondary index, every now and then it won't give any
results.
Could it be that Cassandra performs the write to the internal
I would say this is worth benchmarking before jumping to conclusions. The
network being a bottleneck (or a source of latency) for EBS is, to my
knowledge, supposition, and instances can be started with direct
connections to EBS if this is a concern. The blog post below shows that
even without SSDs the
Hi everybody,
we have some problems running repairs on a timely schedule. We have a
three node deployment, and we start repair on one node every week,
repairing one column family at a time.
However, when we run into the big column families, repair
sessions usually hang indefinitely, and we have
Hey guys.
If you haven't seen KairosDB, it's a time series database on top of
cassandra.
Anyway, we're deploying it in production. However, the existing APIs are a
bit raw (requiring you to send JSON directly) and don't provide much in the
way of syntactic sugar.
There's the codahale metrics API
What a coincidence! Today it happened in my cluster of 7 nodes as well.
Regards,
Pavel
On Wed, Jun 18, 2014 at 11:13 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I have a 10 node cluster with cassandra 2.0.8.
I am getting these exceptions in the log when I run my code. What my
The DataStax doc should be current best practices:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
If you or anybody else finds it inadequate, speak up.
-- Jack Krupansky
-Original Message-
From: Paolo Crosato
Sent: Thursday, June 19,
If someone really wanted to try this, I recommend adding an Elastic
Network Interface or two for gossip and client/API traffic. This lets EBS
and management traffic have the pre-configured network.
On Thu, Jun 19, 2014 at 6:54 AM, Benedict Elliott Smith
belliottsm...@datastax.com wrote:
I
It turns out this is caused by an earlier, failed attempt to upgrade.
Removing all pre-sstablemetamigration snapshot directories solved the issue.
Credits to Markus Eriksson.
On Wed, Jun 11, 2014 at 9:42 AM, Tom van den Berge t...@drillster.com
wrote:
No, unfortunately I haven't.
On Tue,
Does an elastic network interface really use a different physical network
interface? Or is it just a way to provide multiple IP addresses?
On June 19, 2014 at 3:56:34 PM, Nate McCall (n...@thelastpickle.com) wrote:
If someone really wanted to try this, I recommend adding an Elastic
Hello Paolo,
I just published an open source version of the dsetool list_subranges
command, which will enable you to perform subrange repair as described in
the post.
You can find the code and usage instructions here:
https://github.com/pauloricardomg/cassandra-list-subranges
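As a rough illustration of the idea behind subrange repair (not the linked tool's implementation), the sketch below splits a token range into equal pieces and builds one `nodetool repair -st ... -et ...` command per piece. The helper names `split_range` and `repair_commands`, the keyspace and table names, and the even split are all assumptions made for the example; the linked tool works from the ranges a node actually owns, and the exact flag handling should be checked against your Cassandra version.

```python
# Minimal sketch, assuming Murmur3Partitioner token bounds.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_range(start, end, pieces):
    """Split (start, end] into `pieces` contiguous subranges (hypothetical helper)."""
    width = end - start
    if pieces < 1 or width < pieces:
        raise ValueError("need 1 <= pieces <= end - start")
    # Integer arithmetic keeps the bounds exact even for 64-bit tokens.
    bounds = [start + (width * i) // pieces for i in range(pieces)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace, table, start, end, pieces):
    """Build one subrange repair command per piece (hypothetical helper)."""
    return [
        "nodetool repair -st {} -et {} {} {}".format(s, e, keyspace, table)
        for s, e in split_range(start, end, pieces)
    ]


if __name__ == "__main__":
    # "my_keyspace" and "my_table" are placeholder names.
    for cmd in repair_commands("my_keyspace", "my_table", MIN_TOKEN, MAX_TOKEN, 4):
        print(cmd)
```

Running the commands one at a time keeps each repair session small, which is the point of repairing by subrange.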
Currently
Hello,
I am using Cassandra 2.1.0-rc1 and trying to set up internode encryption.
Here's how I have generated the certificates and keystores:
keytool -genkeypair -v -keyalg RSA -keysize 1024 -alias node1 -keystore
node1.keystore -storepass 'mypassword' -dname 'CN=Development' -keypass
Hi Brian,
What compaction are you running? Have you tried using leveled compaction? AFAIK
it should generally require less disk space during compaction.
Cheers,
Jens
—
Sent from Mailbox
On Wed, Jun 18, 2014 at 6:02 PM, Brian Tarbox tar...@cabotresearch.com
wrote:
I'm running on AWS
...and temporarily adding more nodes and rebalancing is not an option?
Sent from Mailbox
On Wed, Jun 18, 2014 at 9:39 PM, Brian Tarbox tar...@cabotresearch.com
wrote:
I don't think I have the space to run a major compaction right now (I'm
above 50% disk space used already) and compaction can
Sorry - should have been clear I was speaking in terms of route optimizing,
not bandwidth. No idea as to the implementation (probably instance
specific) and I doubt it actually doubles bandwidth.
Specifically: having an ENI dedicated to API traffic did smooth out some
recent load tests we did for
Never mind fellas.
Found the stupid error here. Sharing with you just in case: it was a typo in
my script that generates those.
I had the '' characters while generating the keystore and certificates:
-keystore 'mypassword', while the correct form is -keystore mypassword
I knew it was a certificate issue,
I now know it's caused by the heap filling up on some nodes. When it
fills up, the node goes down, GC runs more, then the node comes up again.
Looking for GCInspector in the log, I see GC takes more time to run each
time it runs, as shown below.
I have set key cache to 100 mb and I was used
Pavel,
Out of curiosity, did it start to happen before some update? Which version
of Cassandra are you using?
[]s
2014-06-19 16:10 GMT-03:00 Pavel Kogan pavel.ko...@cortica.com:
What a coincidence! Today it happened in my cluster of 7 nodes as well.
Regards,
Pavel
On Wed, Jun 18, 2014
I was taking a look at Cassandra anti-patterns list:
http://www.datastax.com/documentation/cassandra/2.0/cassandra/architecture/architecturePlanningAntiPatterns_c.html
Among them is
SELECT ... IN or index lookups
Your other option is to fire off async queries. It's pretty
straightforward w/ the java or python drivers.
On Thu, Jun 19, 2014 at 5:56 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
I was taking a look at Cassandra anti-patterns list:
But wouldn't using async queries be even worse than using SELECT IN?
The justification in the docs is that I could query many nodes, but I would
still do that with async queries.
Today, I use both async queries AND SELECT IN:
SELECT_ENTITY_LOOKUP = "SELECT entity_id FROM " + ENTITY_LOOKUP + " WHERE
name=%s and value in(%s)"
If you use async and your driver is token aware, it will go to the
proper node, rather than requiring the coordinator to do so.
Realistically you're going to have a connection open to every server
anyways. It's the difference between you querying for the data
directly and using a coordinator as
This is interesting, I didn't know that!
It might make sense then to use SELECT with = plus async plus token aware,
I will try to change my code.
But would it be a recommended solution for these cases? Any other options?
I still wonder if this is the right use case for Cassandra, to look for
random keys in a
The only case in which it might be better to use an IN clause is if
the entire query can be satisfied from that machine. Otherwise, go
async.
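A minimal sketch of the "go async" approach: send one single-key query per value, start them all before waiting on any, then gather the rows. It assumes `session` behaves like the DataStax Python driver's `Session`, whose `execute_async(query, parameters)` returns a future with a `result()` method. The helper name `fetch_many` is made up for the example, and the token-aware load balancing setup is not shown.

```python
def fetch_many(session, statement, name, values):
    """Run one query per value concurrently and collect all rows.

    `statement` is assumed to take two bind parameters (name, value),
    replacing a single `value IN (...)` query.
    """
    # Start every request first so they are all in flight at the same time.
    futures = [session.execute_async(statement, (name, value)) for value in values]

    rows = []
    for future in futures:
        # result() blocks until this response arrives, or raises on error.
        rows.extend(future.result())
    return rows
```

With a token-aware driver each of these requests can be routed to a replica that owns the key, instead of one coordinator fanning the IN list out to other nodes.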
The native driver reuses connections and intelligently manages the
pool for you. It can also multiplex queries over a single connection.
I am assuming
Irrespective of performance and latency numbers there are fundamental flaws
with using EBS/NAS and Cassandra, particularly around bandwidth contention and
what happens when the shared storage medium breaks. Also obligatory reference
to