Re: Querying key cache

2017-09-14 Thread Jeff Jirsa
Could keep another table and insert into it with IF NOT EXISTS. If the insert is applied, you can act on it (perhaps inserting with a column indicating you're starting the action). When complete, you can update the column to indicate completion. Of course, this all sounds queue-like; sometimes it's
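A minimal CQL sketch of the pattern described in this reply, assuming a hypothetical event_dedup table, status column, and event id; the names are illustrative, not taken from the thread:

    -- Hypothetical dedup table keyed by the event id that must be acted on only once.
    CREATE TABLE IF NOT EXISTS event_dedup (
        event_id text PRIMARY KEY,
        status   text
    );

    -- Lightweight transaction: only the first writer for this event_id sees [applied] = true.
    INSERT INTO event_dedup (event_id, status)
    VALUES ('evt-123', 'started')
    IF NOT EXISTS;

    -- After the action completes, record completion conditionally so a stale retry can't clobber it.
    UPDATE event_dedup SET status = 'done'
    WHERE event_id = 'evt-123'
    IF status = 'started';

The client checks the [applied] flag returned by each conditional statement and only performs the action when the initial insert was applied.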

Re: Querying key cache

2017-09-14 Thread Jagadeesh Mohan
This is our use case: we would like to store a series of events and act on each event only once (a duplicate or abnormal event for the same event id is expected to be dropped). We thought of using the key cache to check and insert. Would you suggest any other workaround? On Fri, Sep 15, 2017 at 11:10 AM, Jeff Jirsa

Re: Querying key cache

2017-09-14 Thread Jeff Jirsa
> On Sep 14, 2017, at 10:31 PM, Jagadeesh Mohan wrote: > Hi, I would like to know if there is a way to query the key cache. No. > Follow-up question: Is there a way to use the presence of a key in the key cache in conditional updates? No. > Primary use case is

Querying key cache

2017-09-14 Thread Jagadeesh Mohan
Hi, I would like to know if there is a way to query the key cache. Follow-up question: Is there a way to use the presence of a key in the key cache in conditional updates? The primary use case is deduplication of the data. -- With Regards, Jagadeesh

Re: Re[4]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread kurt greaves
Sorry, that only applies if you're using NTS. You're right that SimpleStrategy won't work very well in this case. To migrate you'll likely need to do a DC migration to ensure no downtime, as replica placement will change even if RF stays the same. On 15 Sep. 2017 08:26, "kurt greaves"
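A rough CQL sketch of the replication change such a DC migration involves, with placeholder keyspace and datacenter names (my_keyspace, dc_old, dc_new); this is only the keyspace-level step, not a full runbook:

    -- Switch to NetworkTopologyStrategy and add the new datacenter alongside the old one.
    ALTER KEYSPACE my_keyspace
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'dc_old': 3,
        'dc_new': 3
    };

    -- Existing data then needs to be streamed to the new DC's nodes
    -- (typically nodetool rebuild -- dc_old on each new node), followed by repair.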

Re: Re[4]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread kurt greaves
If you have racks configured and lose nodes, you should replace the node with one from the same rack. You then need to repair, and definitely don't decommission until you do. Also, 40 nodes with 256 vnodes is not a fun time for repair. On 15 Sep. 2017 03:36, "Dominik Petrovic"

Re[4]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread Dominik Petrovic
@jeff, I'm using 3 availability zones. During the life of the cluster we lost nodes and retired others, and we ended up having some of the data written/replicated to a single availability zone. We saw it with nodetool getendpoints. Regards >Thursday, September 14, 2017 9:23 AM -07:00 from Jeff

Re: Re[2]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread Jeff Jirsa
With one datacenter/region, what did you discover in an outage that you think you'll solve with NetworkTopologyStrategy? It should be equivalent for a single DC. -- Jeff Jirsa > On Sep 14, 2017, at 8:47 AM, Dominik Petrovic wrote: > Thank you for the
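To illustrate the equivalence point, switching a single-DC keyspace from SimpleStrategy to NetworkTopologyStrategy while keeping the same replication factor could look like the sketch below (keyspace and datacenter names are placeholders; the DC name must match what the snitch reports, and repairs afterwards are advisable as noted elsewhere in the thread):

    -- Same RF as before, now expressed per datacenter.
    ALTER KEYSPACE my_keyspace
    WITH replication = {'class': 'NetworkTopologyStrategy', 'datacenter1': 3};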

Re[2]: Modify keyspace replication strategy and rebalance the nodes

2017-09-14 Thread Dominik Petrovic
Thank you for the replies! @jeff, my current cluster details are: 1 datacenter, 40 nodes, with vnodes=256, RF=3. What is your advice? It is a production cluster, so I need to be very careful about it. Regards >Thu, 14 Sep 2017 -2:47:52 -0700 from Jeff Jirsa: > >The token

Re: Self-healing data integrity?

2017-09-14 Thread Carlos Rolo
Wouldn't it be easier for 1) the CRC to be checked by the sender, and not sent if it doesn't match? 2) And once the stream ends, you could compare the two CRCs to see if something got weird during the transfer? Also, you could implement this in two pieces instead of reviewing the streaming architecture

Re: Compaction in cassandra

2017-09-14 Thread Jeff Jirsa
Shouldn't need it under normal circumstances, and you should avoid it unless you explicitly need it. -- Jeff Jirsa > On Sep 13, 2017, at 11:49 PM, Akshit Jain wrote: > Is it helpful to run nodetool compaction in Cassandra, or is automatic compaction just fine? >

Compaction in cassandra

2017-09-14 Thread Akshit Jain
Is it helpful to run nodetool compaction in Cassandra, or is automatic compaction just fine? Regards

Re: Cassandra configuration thumb rules

2017-09-14 Thread Jeff Jirsa
Some notes from ~7 years of running in prod below. Note, though, that none of this matters: the only thing that matters is benchmarking your load on your own hardware. Definitely run benchmarks and figure out what works for you. 166k/s is something you CAN hit with a 3-5 node cluster with the