Re: Data in multi disks is not evenly distributed

2017-06-13 Thread Akhil Mehra
Hi, I came across the following method (https://github.com/apache/cassandra/blob/afd68abe60742c6deb6357ba4605268dfb3d06ea/src/java/org/apache/cassandra/service/StorageService.java#L5006-L5021). It seems data is evenly split across disks according to local token ranges. It might be that data
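
To illustrate what "evenly split across disks according to local token ranges" could mean, here is a minimal Python sketch. It is not the Cassandra method linked above; the function name split_ranges_evenly and the input shape (sorted, non-wrapping (left, right] token ranges) are assumptions made for illustration only.

```python
def split_ranges_evenly(ranges, parts):
    """Return one upper-bound token per part so that each part covers
    roughly the same number of tokens.

    ranges: sorted, non-overlapping, non-wrapping (left, right] pairs
            (a hypothetical stand-in for a node's local token ranges).
    parts:  number of data directories (disks).
    """
    total = sum(right - left for left, right in ranges)
    per_part = total // parts
    boundaries = []
    needed = per_part  # tokens still needed to fill the current part
    for left, right in ranges:
        width = right - left
        # Cut inside this range as often as it can fill a whole part.
        while width >= needed and len(boundaries) < parts - 1:
            left += needed
            width -= needed
            boundaries.append(left)
            needed = per_part
        needed -= width
    # The last part always ends at the end of the last local range.
    boundaries.append(ranges[-1][1])
    return boundaries


if __name__ == "__main__":
    # Two local ranges of 100 tokens each, split over two disks.
    print(split_ranges_evenly([(0, 100), (200, 300)], 2))
```

Note that a split like this balances token ownership, not bytes on disk, so a single very large sstable can still leave one data directory much fuller than the others.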

Re: Data in multi disks is not evenly distributed

2017-06-11 Thread Erick Ramirez
That's the cause of the imbalance -- an excessively large sstable, which suggests to me that at some point you performed a manual major compaction with nodetool compact. If the table is using STCS, there won't be other compaction partners in the near future so you split the sstable manually with

Re: Data in multi disks is not evenly distributed

2017-06-11 Thread Xihui He
Hi Vladimir, The disks are all the same size, 1.8T as shown in df, and are used only by Cassandra. It seems to me that the largest compacted file may be on data01, which uses 1.1T. Thanks, Xihui On 11 June 2017 at 17:26, Vladimir Yudovin wrote: > Hi, > > Do your disks have

Re: Data in multi disks is not evenly distributed

2017-06-11 Thread Vladimir Yudovin
Hi, Do your disks have the same size? AFAIK Cassandra distributes data in proportion to disk size, i.e. keeps the same percentage of used space. Best regards, Vladimir Yudovin, Winguzone - Cloud Cassandra Hosting On Wed, 07 Jun 2017 06:15:48 -0400 Xihui He xihu...@gmail.com
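
A quick way to compare the percentage of used space per disk is sketched below in Python. The mount points /data01, /data02 and /data03 are assumptions; substitute your own data directories. shutil.disk_usage is part of the standard library.

```python
import shutil


def used_percent(total, used):
    """Percentage of a disk's capacity that is in use."""
    return 100.0 * used / total


def report(mount_points):
    """Map each mount point to its used-space percentage."""
    result = {}
    for path in mount_points:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        result[path] = used_percent(usage.total, usage.used)
    return result


if __name__ == "__main__":
    # Hypothetical data directories; adjust to your layout.
    for path, pct in report(["/data01", "/data02", "/data03"]).items():
        print(f"{path}: {pct:.1f}% used")
```

For the figures mentioned in this thread (1.1T used on a 1.8T disk), used_percent(1800, 1100) comes out at roughly 61%.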

RE: Data in multi disks is not evenly distributed

2017-06-08 Thread ZAIDI, ASAD A
Check the load with the nodetool status command. Make sure there isn’t a huge number of pending compactions for your tables. Ideally, data distribution should be even across your nodes. You should have reserved an extra 15% of free space relative to the maximum size of your table