Re: Can "data_file_directories" make use of multiple disks?

2018-04-09 Thread Venkata Hari Krishna Nukala
Paulo, thanks for the confirmation. I have raised a ticket for this: https://issues.apache.org/jira/browse/CASSANDRA-14372
On Tue, Apr 10, 2018 at 2:37 AM, Paulo Motta wrote:
> > cassandra.yaml states that "Directories where Cassandra should store data on disk.

Re: Can "data_file_directories" make use of multiple disks?

2018-04-09 Thread Paulo Motta
> cassandra.yaml states that "Directories where Cassandra should store data on disk. Cassandra will spread data evenly across them, subject to the granularity of the configured compaction strategy.". I feel it is not correct anymore. Is it worth updating the doc?
In fact this changed

Re: Can "data_file_directories" make use of multiple disks?

2018-04-09 Thread Venkata Hari Krishna Nukala
I spent some time in the code (trunk) to understand it better. If I understand it correctly, the DiskBoundaryManager.getDiskBoundaries() method does the partitioning, and it has nothing to do with the compaction strategy. Is that correct? cassandra.yaml states that "Directories where Cassandra should store data

Re: Can "data_file_directories" make use of multiple disks?

2018-03-27 Thread Jonathan Haddad
In Cassandra 3.2 and later, data is partitioned by token range, which should give you even distribution of data. If you're going to move to 3.x, please use the latest 3.11, which at this time is 3.11.2.
On Tue, Mar 27, 2018 at 8:05 AM Venkata Hari Krishna Nukala <

Re: Can "data_file_directories" make use of multiple disks?

2018-03-27 Thread Rahul Singh
Yes, you can have multiple entries from multiple disks. As far as I can see, there is no guarantee of even distribution. If you want even distribution, there are better mechanisms for this at the filesystem layer.
-- Rahul Singh rahul.si...@anant.us Anant Corporation
On Mar 27, 2018, 8:05 AM -0700, Venkata Hari

Can "data_file_directories" make use of multiple disks?

2018-03-27 Thread Venkata Hari Krishna Nukala
Hi, I am trying to replace production machines that have HDDs with slightly more powerful machines that have SSDs. Each node holds around 300 GB of data, but the newer machines have 2 x 200 GB SSDs instead of a single disk. "data_file_directories" looks like a multi-valued config which I can use. Am
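
For illustration, a multi-valued "data_file_directories" entry in cassandra.yaml might look like the sketch below. The mount points are hypothetical, chosen on the assumption that each SSD is mounted separately; they do not come from the thread.

```yaml
# Sketch only: both paths are hypothetical mount points, one per SSD.
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
```

Per the replies in this thread, Cassandra 3.2 and later partitions data across the listed directories by token range, so how evenly the two disks fill depends on the version in use.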