Also, not sure if this is relevant, but I just noticed this in the nodetool tpstats output:
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
FlushWriter                       0         0           1136         0               512

Looks like about 45% of flushes (512 of 1136) have been blocked at some point.

On Tue, Aug 5, 2014 at 10:14 AM, Ruchir Jha <ruchir....@gmail.com> wrote:

> Yes, num_tokens is set to 256. initial_token is blank on all nodes,
> including the new one.
>
>
> On Tue, Aug 5, 2014 at 10:03 AM, Mark Reddy <mark.re...@boxever.com>
> wrote:
>
>>> My understanding was that if initial_token is left empty on the new
>>> node, it just contacts the heaviest node and bisects its token range.
>>
>> If you are using vnodes and have num_tokens set to 256, the new node
>> will take token ranges dynamically. What is the configuration of your
>> other nodes? Are you setting num_tokens or initial_token on those?
>>
>>
>> Mark
>>
>>
>> On Tue, Aug 5, 2014 at 2:57 PM, Ruchir Jha <ruchir....@gmail.com> wrote:
>>
>>> Thanks, Patricia, for your response!
>>>
>>> On the new node, I just see a lot of the following:
>>>
>>> INFO [FlushWriter:75] 2014-08-05 09:53:04,394 Memtable.java (line 400)
>>> Writing Memtable
>>> INFO [CompactionExecutor:3] 2014-08-05 09:53:11,132 CompactionTask.java
>>> (line 262) Compacted 12 sstables to
>>>
>>> So basically it is just busy flushing and compacting. Would you have
>>> any idea why disk usage has roughly doubled? My understanding was that
>>> if initial_token is left empty on the new node, it just contacts the
>>> heaviest node and bisects its token range. The heaviest node is around
>>> 2.1 TB, and the new node is already at 4 TB. Could this be because
>>> compaction is falling behind?
>>>
>>> Ruchir
>>>
>>>
>>> On Mon, Aug 4, 2014 at 7:23 PM, Patricia Gorla
>>> <patri...@thelastpickle.com> wrote:
>>>
>>>> Ruchir,
>>>>
>>>> What exactly are you seeing in the logs? Are you running major
>>>> compactions on the new bootstrapping node?
>>>>
>>>> With respect to the seed list, it is generally advisable to use 3 seed
>>>> nodes per AZ / DC.
>>>>
>>>> Cheers,
>>>>
>>>>
>>>> On Mon, Aug 4, 2014 at 11:41 AM, Ruchir Jha <ruchir....@gmail.com>
>>>> wrote:
>>>>
>>>>> I am trying to bootstrap a thirteenth node into a 12-node cluster
>>>>> where the average data size per node is about 2.1 TB. The bootstrap
>>>>> streaming has been going on for 2 days now, and the disk usage on the
>>>>> new node is already above 4 TB and still growing. Is this because the
>>>>> new node is running major compactions while the streaming is going
>>>>> on?
>>>>>
>>>>> One thing I noticed that seemed off: the seeds property in the yaml
>>>>> of the 13th node comprises nodes 1 through 12, whereas the seeds
>>>>> property on each of the existing 12 nodes consists of all the other
>>>>> nodes except the thirteenth. Is this an issue?
>>>>>
>>>>> Any other insight is appreciated.
>>>>>
>>>>> Ruchir.
>>>>
>>>>
>>>> --
>>>> Patricia Gorla
>>>> @patriciagorla
>>>>
>>>> Consultant
>>>> Apache Cassandra Consulting
>>>> http://www.thelastpickle.com
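
P.S. My understanding is that a non-zero "All time blocked" on FlushWriter
means flushes stalled waiting on a full flush queue, which would fit a node
that is streaming, flushing, and compacting all at once. If that is right,
the knobs would be in cassandra.yaml; a minimal sketch, assuming a 2.0.x-era
config (the values below are illustrative, not recommendations for any
particular hardware):

    # cassandra.yaml -- flush tuning (2.0.x-era settings; verify against your version)
    memtable_flush_writers: 2      # defaults to the number of data directories
    memtable_flush_queue_size: 8   # default is 4; flushes block once this queue fills

I am also planning to check whether compaction is the bottleneck with
"nodetool compactionstats"; a large pending count there would help explain
the temporary 2x disk usage, since freshly streamed sstables sit
uncompacted until compaction catches up.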