Hi,

It depends on the compaction strategy to an extent. Leveled compaction
partitions sstables by token range, so there is a wider variety of
scenarios where it works. I haven't done the napkin math at 10 terabytes
to figure out what % of sstables will be leveled to the point that they
work with 256 vnodes.

It's also probably possible without vnodes to use other compaction
strategies by specifying multiple data directories so that they are
partitioned by token range to match replication. I don't know what the
operational process is for that; maybe Dinesh does.
Ariel
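
To make the token-range point concrete, here is a rough sketch of the
containment check as I understand it from the blog post: an sstable can
only be sent as a whole file if everything in it falls inside the ranges
being streamed. This is a simplified model, not Cassandra's code. Tokens
are plain integers, ranges are (start, end] with no wrap-around, an
sstable must fit inside a single range, and all function names are
hypothetical.

```python
# Simplified model: tokens are plain integers and ranges are (start, end]
# with start < end (no wrap-around). All names here are hypothetical.

def fully_contained(sstable_first, sstable_last, ranges):
    """Return True if the sstable's token span lies inside one requested range."""
    return any(start < sstable_first and sstable_last <= end
               for start, end in ranges)


def whole_file_fraction(sstables, ranges):
    """Fraction of sstables that could be sent as whole files under this model."""
    if not sstables:
        return 0.0
    eligible = sum(fully_contained(first, last, ranges)
                   for first, last in sstables)
    return eligible / len(sstables)


# Two requested ranges, as a node with vnodes might stream.
requested = [(0, 100), (500, 600)]

# Narrow, leveled-style sstables versus one wide, size-tiered-style sstable.
leveled_like = [(10, 40), (50, 90), (510, 590), (95, 120)]
size_tiered_like = [(5, 595)]

leveled_fraction = whole_file_fraction(leveled_like, requested)
size_tiered_fraction = whole_file_fraction(size_tiered_like, requested)
```

In this toy input, three of the four narrow sstables fit inside a
requested range, while the single wide sstable does not. The numbers are
made up purely to illustrate the shape of the problem.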


On Wed, Aug 29, 2018, at 3:33 AM, kurt greaves wrote:
> My reasoning was that if you have a small cluster with vnodes you're
> more likely to have enough overlap between nodes that whole SSTables
> will be streamed on major ops. As N gets > RF you'll have fewer common
> ranges and thus be less likely to be streaming complete SSTables.
> Correct me if I've misunderstood.
>
> On 28 August 2018 at 01:37, Dinesh Joshi
> <dinesh.jo...@yahoo.com.invalid> wrote:
>> Although the extent of benefits depends on the specific use case, the
>> cluster size is definitely not a limiting factor.
>>
>> Dinesh
>> 
>> 
>> On Aug 27, 2018, at 5:05 AM, kurt greaves
>> <k...@instaclustr.com> wrote:
>>> I believe there are caveats that it will only really help if you're
>>> not using vnodes, or you have a very small cluster, and also
>>> internode encryption is not enabled. Alternatively if you're using
>>> JBOD vnodes will be marginally better, but JBOD is not a great idea
>>> (and doesn't guarantee a massive improvement).
>>>
>>> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID
>>> <dinesh.jo...@yahoo.com.invalid> wrote:
>>>> Yes, this feature will help with operating nodes with higher data
>>>> density.
>>>> 
>>>> Dinesh
>>>> 
>>>> 
>>>> 
>>>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester
>>>> <onmstes...@zoho.com> wrote:
>>>>
>>>>
>>>> I've noticed this new feature of 4.0:
>>>> Streaming optimizations
>>>> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
>>>>
>>>> Does this mean that we could have much more data density with
>>>> Cassandra 4.0 (fewer problems than 3.X)? I mean > 10 TB of data on
>>>> each node without worrying about node join/remove?
>>>>
>>>> This is something needed for write-heavy applications that do not
>>>> read a lot. When you have like 2 TB of data per day and need to
>>>> keep it for 6 months, it would be a waste of money to purchase 180
>>>> servers (even commodity or cloud).
>>>>
>>>> IMHO, even if 4.0 fixes the problem with streaming/joining a new
>>>> node, compaction is still another evil for a big node, but we could
>>>> tolerate that somehow.
>>>>
>>>> Sent using Zoho Mail[1]


Links:

  1. https://www.zoho.com/mail/
