Re: ttl in collections

2015-01-06 Thread Jens-U. Mozdzen
Hi Eduardo, Quoting Eduardo Cusa eduardo.c...@usmediaconsulting.com: [...] I have to worry about the tombstones generated? Considering that I will have many daily set updates That depends on your definition of "many"... we've run into a situation where we wanted to age out old data using

Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-06 Thread yhqruc
Hi all: I use cassandra.thrift to implement a replace-row interface in this way: first use batch_mutate to delete the row, then use batch_mutate to insert a new row. I always find that after calling this interface, the row does not exist. I suspect that the problem is caused by

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Rahul Bhardwaj
Hi Joe.. Thanks for your valuable solution.. it worked. But for this problem *The processes are killed by kernel, coz they are eating all memory (oom-killer). We have set JAVA heap to default (i.e. it is using 8G) because we have 64 GB RAM.* Should I apply patch given for issue

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
Btw side note here, you're using GIANT batches, and the logs are indicating such; this will cause a significant amount of heap pressure. The root cause fix is not to use giant batches in the first place. On Tue, Jan 6, 2015 at 4:43 AM, Rahul Bhardwaj rahul.bhard...@indiamart.com wrote: Hi

Re: Re: Cassandra update row after delete immediately, and read that, the data not right?

2015-01-06 Thread Ryan Svihla
So the coordinator node of a given request sets the timestamp unless overridden by the client (which you can do on a per-statement basis). While you can move all of your timestamps to the client side, eventually as you add more clients you have a similar problem set and will still have to use NTP to
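Editor's sketch of the client-side timestamp point above, not taken from the thread. It assumes write timestamps are microseconds since the Unix epoch; `MonotonicMicros` is a hypothetical helper name and is not part of any Cassandra driver. It only keeps timestamps strictly increasing within one process, so several clients would still depend on synchronized clocks (for example via NTP).

```python
import threading
import time


class MonotonicMicros:
    """Hypothetical helper: strictly increasing microsecond timestamps for one process."""

    def __init__(self, clock=time.time):
        self._clock = clock          # injectable so the behaviour can be tested
        self._last = 0
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            now = int(self._clock() * 1_000_000)
            # Never repeat or go backwards, even if the wall clock does.
            self._last = max(now, self._last + 1)
            return self._last
```

Each value returned by `next()` could be passed as the explicit timestamp of a statement; how that is done depends on the client library in use.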

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
woops wrong thread..ignore that :) Robert is correct in this regard by and large even though I disagree with the tradeoff, as my experience has shown me, for a lot of use cases it's not a happy tradeoff, YMMV and there are some that do exist (low write throughput). On Tue, Jan 6, 2015 at 12:58

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
I would add that STCS and JBOD are logically a bad fit anyway, and that doing it with nodetool compact is extra silly. For this reason I tend to only use JBOD with LCS and therefore with SSD. As far as modeling out tombstones, I tend to push towards more around the model, for example if you're

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
as long as they know how to handle node recovery and don't bring deleted data back from the dead. On Tue, Jan 6, 2015 at 12:52 PM, Robert Coli rc...@eventbrite.com wrote: On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla r...@foundev.pro wrote: In general today, large amounts of

Re: Question about `nodetool rebuild` finsh

2015-01-06 Thread Ryan Svihla
Without more information it's hard to say what the bottleneck is. There could be a great deal of GC traffic, it could be hung (some old streaming bugs in some older versions of Cassandra), or it could be that disk I/O is falling behind with your compaction of new sstables. On Sun, Dec 28, 2014 at

Re: ttl in collections

2015-01-06 Thread Eduardo Cusa
Thanks Jens and Ryan, it is clear to me what happens with tombstones for a CF row. Now, does the same behavior that applies to CF rows also apply to elements in a set data type? Regards On Tue, Jan 6, 2015 at 12:31 PM, Ryan Svihla r...@foundev.pro wrote: Tombstone management is a big conversation, you

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Robert Coli
On Tue, Jan 6, 2015 at 7:39 AM, Ryan Svihla r...@foundev.pro wrote: In general today, large amounts of hints still pretty much makes a node angry (just no longer nearly as nasty as it was before), unless you have a really low throughput, you're probably not going to gain much in practice by

Re: deletedAt and localDeletion

2015-01-06 Thread Ryan Svihla
If you look at the source there are some useful comments regarding those specifics https://github.com/apache/cassandra/blob/8d8fed52242c34b477d0384ba1d1ce3978efbbe8/src/java/org/apache/cassandra/db/DeletionTime.java /** * A timestamp (typically in microseconds since the unix epoch, although this

Re: Changing replication factor of Cassandra cluster

2015-01-06 Thread Pranay Agarwal
Thanks Robert. Also, I have seen the node-repair operation fail for some nodes. What are the chances of the data getting corrupt if node-repair fails? I am okay with data availability issues for some time as long as I don't lose or corrupt data. Also, is there a way to restore the graph without

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Robert Coli
On Tue, Jan 6, 2015 at 11:14 AM, Ryan Svihla r...@foundev.pro wrote: woops wrong thread..ignore that :) Robert is correct in this regard by and large even though I disagree with the tradeoff, as my experience has shown me, for a lot of use cases it's not a happy tradeoff, YMMV and there are

Re: Question about `nodetool rebuild` finsh

2015-01-06 Thread Robert Coli
On Tue, Jan 6, 2015 at 12:51 PM, Robert Coli rc...@eventbrite.com wrote: This particular one will never finish, because if it's hung that long it's hung forever. Restart affected nodes, wipe the one that was partially rebuilt, and start again. Bleh, missed that netstats shows streaming

Re: Question about `nodetool rebuild` finsh

2015-01-06 Thread Robert Coli
On Sun, Dec 28, 2014 at 7:00 PM, 李洛 luolee...@gmail.com wrote: I want to know _how I could know when the rebuild finishes_. This particular one will never finish, because if it's hung that long it's hung forever. Restart affected nodes, wipe the one that was partially rebuilt, and start again.

Re: Changing replication factor of Cassandra cluster

2015-01-06 Thread Robert Coli
On Tue, Jan 6, 2015 at 4:40 PM, Pranay Agarwal agarwalpran...@gmail.com wrote: Thanks Robert. Also, I have seen the node-repair operation to fail for some nodes. What are the chances of the data getting corrupt if node-repair fails? If repair does not complete before gc_grace_seconds, chance

Fwd: Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-06 Thread yhqruc
Hi, I found that in my function, both delete and update use the client-side timestamp. The update timestamp should always be bigger than the deletion timestamp. I wonder why the update failed in some cases? Thank you. - Original Message - From: Ryan Svihla r...@foundev.pro
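Editor's sketch, not taken from the thread: a simplified model of last-write-wins reconciliation, which may help when reasoning about a delete followed by an insert. `Cell` and `reconcile` are hypothetical names, not Cassandra or Thrift APIs. The model assumes that the higher timestamp wins and that, when the timestamps are equal, the delete (tombstone) wins over live data. Whether this explains the behavior reported above depends on the timestamps the client actually sent, which the message does not show.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One version of a value; value=None stands for a tombstone (a delete)."""
    timestamp: int               # client-supplied, microseconds since the Unix epoch
    value: Optional[str]

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(a: Cell, b: Cell) -> Cell:
    """Return the surviving version of two conflicting cells (simplified model)."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Equal timestamps: a tombstone beats live data.
    if a.is_tombstone != b.is_tombstone:
        return a if a.is_tombstone else b
    if a.is_tombstone:
        return a                 # both are deletes; either one will do
    # Both live with equal timestamps: break the tie on the greater value.
    return a if a.value >= b.value else b
```

Under this model, an insert that carries the same timestamp as the preceding delete is shadowed by it, while an insert whose timestamp is at least one microsecond later survives.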

Re: ttl in collections

2015-01-06 Thread Ryan Svihla
Tombstone management is a big conversation; you can manage it in one of the following ways: 1) Set a gc_grace_seconds of 0 and then run nodetool compact while using size-tiered compaction, as frequently as needed. This often is a pretty lousy solution as gc_grace_seconds means you're not very

Re: Is it possible to implement a interface to replace a row in cassandra using cassandra.thrift?

2015-01-06 Thread Ryan Svihla
replies inline On Tue, Jan 6, 2015 at 2:28 AM, yhq...@sina.com wrote: Hi, all: I use cassandra.thrift to implement a replace row interface in this way: First use batch_mutate to delete that row, then use batch_mutate to insert a new row. I always find that after call this

Re: Implications of ramping up max_hint_window_in_ms

2015-01-06 Thread Ryan Svihla
In general today, large amounts of hints still pretty much make a node angry (just no longer nearly as nasty as it was before). Unless you have a really low throughput, you're probably not going to gain much in practice by raising the hints window today. Later on when we get file system based

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Rahul Bhardwaj
Hi All, I applied the Cassandra patch for issue 8248 to see what it does. Now I noticed the errors below in my system.log : ERROR [NonPeriodicTasks:1] 2015-01-07 10:55:48,869 CassandraDaemon.java:153 - Exception in thread Thread[NonPeriodicTasks:1,5,main] java.lang.AssertionError: null at

Re: Queries required before data modeling?

2015-01-06 Thread Srinivasa T N
Thanks for the info guys. Regards, Seenu. On Tue, Jan 6, 2015 at 11:31 PM, Ryan Svihla r...@foundev.pro wrote: Yes, however in most cases this means just one new table, so you make a new table and copy the data over. In many ways this is not unlike a schema change, or if you need to change

Re:

2015-01-06 Thread Nagesh
Thanks Ryan, Srinivas for your answers. Finally I have decided to create three column families 1. product_date_id (mm, dd, prodid) PRIMARY KEY ((mm), dd, prodid) - Record the arrival date on updates of a product - Get list of products that are recently added/updated Ex: [(mm, dd)

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Rahul Bhardwaj
Thanks Ryan... We will keep your valuable suggestion in mind in resolving this issue. But what is your take on the Cassandra patch for issue 8248 to resolve this? On Tuesday, January 6, 2015, Ryan Svihla r...@foundev.pro wrote: Btw side note here, you're using GIANT Batches, and the logs are indicating

Queries required before data modeling?

2015-01-06 Thread Srinivasa T N
Hi All, I was just googling around and reading the various articles on data modeling in Cassandra. All of them talk about working backwards, i.e., first know what type of queries you are going to make and select the right data model which can support those queries efficiently. But one thing I

Re:

2015-01-06 Thread Ryan Svihla
The normal data modeling approach in Cassandra is a separate column family for each of those queries, so that each is answerable with one partition key (that's going to be the fastest). I'm very suspicious of - get list of products for a given range of ids Is this being driven by another query to get a list of

Re: Queries required before data modeling?

2015-01-06 Thread James Rothering
Yes, remodeling the schema will be required to have good performance for new queries which things had not been cached ahead of time to accommodate. In C*, you're going to pre-compute all caching ahead of time, in order to maximize performance. This is in contrast to the relational approach where

Re: Cassandra consuming whole RAM (64 G)

2015-01-06 Thread Ryan Svihla
That even with that patch you'll likely run into heap pressure with batches of that size, so either increase your heap and take the GC hit on CPU (and have longer GCs) or don't use large batches. The batch conversation is a bigger one which I discuss here

Re: Queries required before data modeling?

2015-01-06 Thread Ryan Svihla
Yes, however in most cases this means just one new table, so you make a new table and copy the data over. In many ways this is not unlike a schema change, or if you need to change your primary key on an existing table in traditional SQL databases. This design around partition key is true of all

Re: STCS limitation with JBOD?

2015-01-06 Thread Dan Kinder
Thanks for the info guys. Regardless of the reason for using nodetool compact, it seems like the question still stands... but the impression I'm getting is that nodetool compact on JBOD as I described will basically fall apart. Is that correct? To answer Colin's question as an aside: we have a

Re: STCS limitation with JBOD?

2015-01-06 Thread Ryan Svihla
nodetool compact is the ultimate running-with-scissors solution; far more people manage to stab themselves in the eye. Customers running with scissors successfully notwithstanding. My favorite discussions usually tend to result: 1. We still have tombstones ( so they set gc_grace_seconds to

Re: Reload/resync system.peers table

2015-01-06 Thread Ryan Svihla
auto_bootstrap: false shouldn't help here any more than true. So when I had this issue before in prod I've actually just executed delete statements to the bogus nodes, this however only solved a symptom (the ghosts came back) and the issue was a bug (