On 05/13/2014 08:13 PM, Yatong Zhang wrote:
> Thank you Aaron, but we're planning about 20T per node, is that feasible?

20T per node is 4x the maximum recommended data per node of 5T, and that 5T figure assumes high-end hardware: 16+ cores, 128-256G of RAM, SSDs, and 10GigE.

pgs 12-13 (the whole doc is well worth a careful read):
http://www.datastax.com/wp-content/uploads/2014/01/WP-DataStax-Enterprise-Reference-Architecture.pdf

In looking at your other disk space thread, I see that you are using 4T drives, so those definitely aren't SSDs. It also looks like you partitioned /dev/sda for an OS partition and are using the rest for data. I assume your commitlog is on /dev/sda1, which puts your /dev/sda3 data partition on the same spindle as your commitlog - not recommended.
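
If you can spare a dedicated spindle (or the otherwise-idle OS disk) for the commitlog, the split is just two settings in cassandra.yaml. A rough sketch - the paths here are placeholders for whatever mount points you actually use:

    # cassandra.yaml: keep the commitlog off the data spindles
    commitlog_directory: /commitlog/cassandra
    data_file_directories:
        - /data/cassandra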

I would RAID0 all those data drives, personally, and give up managing them separately. They are on multiple PCIe controllers, one drive per channel, right?
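
For what it's worth, building that stripe with Linux software RAID is only a few commands. A sketch, assuming three spare data disks at /dev/sdb through /dev/sdd and the default Cassandra data mount point - adjust the device names and filesystem to whatever you actually have:

    # stripe the data disks into one volume and mount it for Cassandra data
    mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sdb /dev/sdc /dev/sdd
    mkfs.xfs /dev/md0
    mount /dev/md0 /var/lib/cassandra/data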

You were already having trouble running out of disk space, and then opted for LCS, which is about 2x more I/O intensive; on spindles, that could add a different level of pain.
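
For reference, the LCS switch itself is just a per-table compaction option. A sketch with a made-up keyspace/table name:

    -- switch an existing table to leveled compaction
    ALTER TABLE mykeyspace.mytable
        WITH compaction = { 'class': 'LeveledCompactionStrategy' };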

To be honest, I would highly suggest re-thinking how you want to set up your data model and re-planning your cluster accordingly. I'm not saying that working with what you have is impossible, but you are experiencing pain because you are pushing far beyond the recommended bounds for Cassandra. If you have a high threshold for pain, then carry on - and I mean that admiringly; I would *love* to hear back that you have sorted out all your issues, so do continue to post questions for help. I could be completely off-base in my reading.

I do think many more nodes is the way to go with this much data - that is Cassandra's strength. I'm not sure what your data actually contains, but if you are storing large blobs like image data, think about putting the blob data somewhere else and keeping only the metadata in Cassandra, e.g. with URL pointers to where the image data can be retrieved - something along those lines will help.
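
A rough sketch of that kind of metadata-only table - all of these names are made up, so shape it to your own data:

    -- keep only metadata plus a pointer in Cassandra; the image bytes live elsewhere
    CREATE TABLE media.image_metadata (
        image_id   uuid PRIMARY KEY,
        filename   text,
        size_bytes bigint,
        created_at timestamp,
        image_url  text   -- e.g. an HTTP or S3 location holding the actual bytes
    );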

--
Warm regards,
Michael Shuler
