At last year's Cassandra Summit there was a presentation from Instaclustr: https://www.instaclustr.com/meetups/presentation-by-ben-bromhead-at-cassandra-summit-2014-san-francisco/. It could be the solution you are looking for. However, I don't see that the code has been checked in or that a JIRA ticket has been created, so for now you'd better plan your capacity carefully.
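As a rough aid for that capacity planning, the rule of thumb from Eric's message quoted below (pending compaction tasks under 5, preferably 0 or 1) can be checked with a small script. This is only a sketch: it assumes the output of `nodetool compactionstats` contains a line of the form `pending tasks: N`, and the function names and the sample text are made up for illustration.

```python
import re

# Threshold from the quoted advice: pending tasks should stay below 5.
MAX_PENDING = 5


def pending_compactions(compactionstats_output: str) -> int:
    """Extract the pending-task count from `nodetool compactionstats` text.

    Assumes the text contains a line like "pending tasks: 12".
    """
    match = re.search(r"pending tasks:\s*(\d+)", compactionstats_output)
    if match is None:
        raise ValueError("no 'pending tasks' line found")
    return int(match.group(1))


def is_keeping_up(compactionstats_output: str, limit: int = MAX_PENDING) -> bool:
    """Return True when the node has fewer pending compactions than `limit`."""
    return pending_compactions(compactionstats_output) < limit


if __name__ == "__main__":
    # Hypothetical sample text, not captured from a real cluster.
    sample = "pending tasks: 12\n"
    print(pending_compactions(sample), is_keeping_up(sample))
```

In practice you would feed it the captured output of `nodetool compactionstats` from each node and look at the nodes that fail the check before adding more load.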
On Wed, Jan 21, 2015 at 11:21 PM, Yatong Zhang <bluefl...@gmail.com> wrote:

> Yes, my cluster is almost full and there are lots of pending tasks. You
> helped me a lot and thank you Eric~
>
> On Thu, Jan 22, 2015 at 11:59 AM, Eric Stevens <migh...@gmail.com> wrote:
>
>> Yes, bootstrapping a new node will cause read loads on your existing
>> nodes - it is becoming the owner and replica of a whole new set of existing
>> data. To do that it needs to know what data it's now responsible for, and
>> that's what bootstrapping is for.
>>
>> If you're at the point where bootstrapping a new node is placing a
>> too-heavy burden on your existing nodes, you may be dangerously close to or
>> even past the tipping point where you ought to have already grown your
>> cluster. You need to grow your cluster as soon as possible, and chances
>> are you're close to no longer being able to keep up with compaction (see
>> nodetool compactionstats, make sure pending tasks is <5, preferably 0 or
>> 1). Once you're falling behind on compaction, it becomes difficult to
>> successfully bootstrap new nodes, and you're in a very tough spot.
>>
>>
>> On Wed, Jan 21, 2015 at 7:43 PM, Yatong Zhang <bluefl...@gmail.com>
>> wrote:
>>
>>> Thanks for the reply. The bootstrap of the new node put a heavy burden on
>>> the whole cluster and I don't know why. So that's the issue I want to fix
>>> actually.
>>>
>>> On Mon, Jan 12, 2015 at 6:08 AM, Eric Stevens <migh...@gmail.com> wrote:
>>>
>>>> Yes, but it won't do what I suspect you're hoping for. If you disable
>>>> auto_bootstrap in cassandra.yaml the node will join the cluster and will
>>>> not stream any old data from existing nodes.
>>>>
>>>> The cluster will now be in an inconsistent state. If you bring enough
>>>> nodes online this way to violate your read consistency level (e.g. RF=3,
>>>> CL=Quorum, if you bring on 2 nodes this way), some of your queries will be
>>>> missing data that they ought to have returned.
>>>>
>>>> There is no way to bring a new node online and have it be responsible
>>>> just for new data, and have no responsibility for old data. It *will* be
>>>> responsible for old data, it just won't *know* about the old data it
>>>> should be responsible for. Executing a repair will fix this, but only
>>>> because the existing nodes will stream all the missing data to the new
>>>> node. This will create more pressure on your cluster than just normal
>>>> bootstrapping would have.
>>>>
>>>> I can't think of any reason you'd want to do that unless you needed to
>>>> grow your cluster really quickly, and were ok with corrupting your old
>>>> data.
>>>>
>>>> On Sat, Jan 10, 2015 at 12:39 AM, Yatong Zhang <bluefl...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi there,
>>>>>
>>>>> I am using C* 2.0.10 and I was trying to add a new node to a
>>>>> cluster (actually replacing a dead node). But after adding the new node,
>>>>> some other nodes in the cluster had a very high workload, which affected
>>>>> the performance of the whole cluster.
>>>>> So I am wondering: is there a way to add a new node and have this node
>>>>> only take on new data?
>>>>>
>>>>
>>>>
>>>
>>
>
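
The RF=3, CL=Quorum example in Eric's message above can be checked with a little arithmetic. The sketch below is a simplified model, not how Cassandra itself decides anything: it assumes every pre-existing replica of a range holds all of that range's data, that each node joined with auto_bootstrap disabled becomes an empty replica of that range, and the function names are invented for illustration.

```python
def quorum(rf: int) -> int:
    """Number of replicas a QUORUM read must hear from: floor(RF / 2) + 1."""
    return rf // 2 + 1


def quorum_read_can_miss(rf: int, empty_replicas: int) -> bool:
    """True when the empty replicas alone are enough to satisfy a quorum.

    In that case a read may be answered entirely by replicas that never
    streamed the old data, so it can return nothing for rows that exist.
    """
    if not 0 <= empty_replicas <= rf:
        raise ValueError("empty_replicas must be between 0 and rf")
    return empty_replicas >= quorum(rf)


if __name__ == "__main__":
    rf = 3
    for empty in range(rf + 1):
        print(f"RF={rf}, empty replicas={empty}: "
              f"quorum={quorum(rf)}, can miss data={quorum_read_can_miss(rf, empty)}")
```

With RF=3 the quorum is 2, so under these assumptions two empty replicas are enough for a quorum read to miss data, which matches the case described in the message.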