[ https://issues.apache.org/jira/browse/CASSANDRA-8460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16414714#comment-16414714 ]
Jon Haddad commented on CASSANDRA-8460:
---------------------------------------

{quote}
The requirement we're looking to target, as per the original JIRA, is people who have data that is hot for a short period but then they need to keep around for a long time with infrequent access (i.e. well-defined rules on hot vs cold, not deciding what is hot based on what was recently read). Typically when I've seen this requirement people want:
1) The best possible performance for the hot data
2) Lowest cost of storage for the cold data

It seems to me that with LVM we're not doing the best we could in terms of either of these.
{quote}

If you want the best possible read performance for hot data, there's not going to be a better option than the caching layer. Treating a disk as part of the Cassandra storage pool, rather than as a cache layer managed by the OS, introduces the need for explicit configuration and the need to explicitly manage the free data. By this I mean you will need to keep some definition in the schema or code about when to keep things on the hot disk and when to move them off. My gut tells me this will result in an underutilized disk, mostly because the more efficient you get on the fast disk, the greater the risk of failure. Imagine a large compaction happening on the hot disk - this patch will need to ensure it starts moving older data off to the slow drive, which is going to block compactions from happening on the hot disk.

Regarding the low cost, I agree with you: duplicating the data on a cache drive is going to cost more than using the aggregate space of the two drives.

{quote}
For performance, there is the write-through slowdown you mentioned. Depending on where you draw the line on moving to slow disk vs the final TWCS compaction, you might have compactions pushing data you want to be quick out of cache, and if you used EBS for both the hot disk and the slow disk you are increasing usage of the EBS bandwidth to copy to and from cache (although using local SSD as the cache negates this last one).
{quote}

I'm not sure how much of a problem this is in practice. Cassandra's sequential writes are going to avoid a lot of performance issues related to spinning disks. In my experience, the biggest performance problem limiting compaction throughput is going to be GC pauses, not the ability to write bytes to disk.

{quote}
In terms of cost, with LVM the fast disk is purely being used as cache rather than a primary store, so you are having to duplicate that amount of data storage - whether that is significant probably depends on your desired ratio of fast to slow disk and how cost sensitive you are.
{quote}

Agreed. To me, the main benefit of having the fast disk involved is the ability to increase density significantly at very low cost. If you were to have a small SSD backed by 3-5TB of slow storage, that's a pretty good win in my opinion.

{quote}
Whether these downsides are worth the extra complexity is of course a matter of judgement rather than facts, so happy to go with the community consensus here, but thought I'd put in my POV.
{quote}

To be clear - I'm not shooting down the patch, or saying it's a bad idea. I think there are some interesting aspects to it, with some valid use cases. I'd just like everyone to be aware of existing alternatives, as I didn't see anyone bring up lvmcache in the three years this ticket has existed.

> Make it possible to move non-compacting sstables to slow/big storage in DTCS
> ----------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8460
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8460
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Marcus Eriksson
>            Assignee: Lerh Chuan Low
>            Priority: Major
>              Labels: doc-impacting, dtcs
>             Fix For: 4.x
>
>
> It would be nice if we could configure DTCS to have a set of extra data
> directories where we move the sstables once they are older than
> max_sstable_age_days.
> This would enable users to have a quick, small SSD for hot, new data, and big
> spinning disks for data that is rarely read and never compacted.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)