Re: bigger data density with Cassandra 4.0?

2018-08-30 Thread dinesh.jo...@yahoo.com.INVALID
With LCS and CASSANDRA-6696 you can maximize the percentage of SSTables that 
use the new streaming path. With LCS and relatively small SSTables you should 
see good gains. Bootstrap is a use case that should see the maximum benefit. 
This feature will get better with time.
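For illustration, a minimal sketch of that tuning with the DataStax Python 
driver (the contact point, keyspace, and table names are hypothetical; 160 MB 
is just the LCS default SSTable target, and 4.0's cassandra.yaml also has to 
leave stream_entire_sstables enabled, which it is by default):

# Sketch: switch a table to LCS with a modest SSTable size target so that
# more SSTables become eligible for 4.0's whole-SSTable (zero-copy) streaming.
# Keyspace, table, and contact point below are hypothetical examples.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("my_keyspace")
session.execute("""
    ALTER TABLE my_table
    WITH compaction = {'class': 'LeveledCompactionStrategy',
                       'sstable_size_in_mb': '160'}
""")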
Dinesh 

On Wednesday, August 29, 2018, 12:34:32 AM PDT, kurt greaves 
 wrote:  
 
My reasoning was that if you have a small cluster with vnodes you're more likely 
to have enough overlap between nodes that whole SSTables will be streamed on major 
ops. As N gets greater than RF you'll have fewer common ranges and thus be less 
likely to be streaming complete SSTables. Correct me if I've misunderstood.
On 28 August 2018 at 01:37, Dinesh Joshi  wrote:

Although the extent of the benefits depends on the specific use case, the cluster 
size is definitely not a limiting factor.

Dinesh
On Aug 27, 2018, at 5:05 AM, kurt greaves  wrote:


I believe there are caveats that it will only really help if you're not using 
vnodes, or you have a very small cluster, and also internode encryption is not 
enabled. Alternatively if you're using JBOD vnodes will be marginally better, 
but JBOD is not a great idea (and doesn't guarantee a massive improvement).
On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID 
 wrote:

Yes, this feature will help with operating nodes with higher data density.
Dinesh 

On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester 
 wrote:  
 
 I've noticed this new feature of 4.0:
Streaming optimizations 
(https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
Does this mean that we could have much higher data density with Cassandra 4.0 
(fewer problems than 3.x)? I mean > 10 TB of data on each node without worrying 
about node join/remove?
This is something needed for write-heavy applications that do not read a lot. 
When you have around 2 TB of data per day and need to keep it for 6 months, it 
would be a waste of money to purchase 180 servers (even commodity or cloud). 
IMHO, even if 4.0 fixes the problem with streaming/joining a new node, 
compaction is still another evil for a big node, but we could tolerate that somehow.
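To put rough numbers behind that estimate (a back-of-envelope sketch only; the 
2 TB and 10 TB per-node densities are assumptions, and replication factor and 
compression are ignored):

# Back-of-envelope sizing for the scenario above (no replication, no
# compression, straight division).
daily_tb = 2
retention_days = 6 * 30
total_tb = daily_tb * retention_days          # ~360 TB of raw data
print(total_tb / 2)     # ~180 nodes at ~2 TB per node (the "180 servers")
print(total_tb / 10)    # ~36 nodes at ~10 TB per node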


Sent using Zoho Mail


RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Rahul Singh
YugaByte is another new dancer in the Cassandra dance. The data store is based 
on RocksDB, and it's written in C++. Although they are wire compatible with C*, 
I'm pretty sure everything under the hood is NOT a port like Scylla was 
initially.

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Aug 29, 2018, 10:05 AM -0400, Durity, Sean R , 
wrote:
> If you are going to compare vs commercial offerings like Scylla and CosmosDB, 
> you should be looking at DataStax Enterprise. They are moving more quickly 
> than open source (IMO) on adding features and tools that enterprises really 
> need. I think they have some emerging tech for large/dense nodes, in 
> particular. The ability to handle different data model types (Graph and 
> Search) and embedded analytics sets it apart from plain Cassandra. Plus, they 
> have replaced Cassandra’s SEDA architecture to give it a significant boost in 
> performance. As a customer, I see the value in what they are doing.
>
>
> Sean Durity
> From: onmstester onmstester 
> Sent: Wednesday, August 29, 2018 7:43 AM
> To: user 
> Subject: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?
>
> Could you please explain more (do you mean slower performance compared to 
> Cassandra?) about:
> ---HBase tends to be quite average for transactional data
>
> and about:
> ---ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
> C*'s mistakes.
> ScyllaDB is a much younger project than Cassandra, with much less usage and 
> attention. Currently I face a dilemma when launching new clusters: should I 
> wait for the Cassandra community to apply all the enhancements and bug fixes 
> that their main competitors (ScyllaDB or Cosmos DB) have applied, or just 
> switch to a competitor (afraid of the new world!)?
> For example, right now is there a motivation to handle denser nodes in the 
> near future?
>
> Again, Thank you for your time
>
> Sent using Zoho Mail
>
>
>  On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves  
> wrote 
>
> > Most of the issues around big nodes are related to streaming, which is 
> > currently quite slow (should be a bit better in 4.0). HBase is built on top 
> > of Hadoop, which is much better at large files/very dense nodes, and tends 
> > to be quite average for transactional data. ScyllaDB IDK, I'd assume they 
> > just sorted out streaming by learning from C*'s mistakes.
> >
> > On 29 August 2018 at 19:43, onmstester onmstester  
> > wrote:
> >
> > >
> > > Thanks Kurt,
> > > Actually my cluster has > 10 nodes, so there is only a tiny chance of 
> > > streaming a complete SSTable.
> > > Logically, any columnar NoSQL db like Cassandra always needs to re-sort 
> > > grouped data for later fast reads, and having nodes with a big amount of 
> > > data (> 2 TB) would be annoying for this background process. How is it 
> > > possible that some of these databases, like HBase and ScyllaDB, do not 
> > > emphasize small nodes (like Cassandra does)?
> > >
> > > Sent using Zoho Mail
> > >
> > >
> > >  Forwarded message 
> > > From : kurt greaves 
> > > To : "User"
> > > Date : Wed, 29 Aug 2018 12:03:47 +0430
> > > Subject : Re: bigger data density with Cassandra 4.0?
> > >  Forwarded message 
> > >
> > > > My reasoning was that if you have a small cluster with vnodes you're 
> > > > more likely to have enough overlap between nodes that whole SSTables 
> > > > will be streamed on major ops. As N gets greater than RF you'll have 
> > > > fewer common ranges and thus be less likely to be streaming complete 
> > > > SSTables. Correct me if I've misunderstood.
> > >
>
>
>
>


Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Ariel Weisberg
Hi,

It depends on the compaction strategy to an extent. Leveled compaction
partitions sstables by token range, so there is a wider variety of
scenarios where it works. I haven't done the napkin math at 10 terabytes
to figure out what % of sstables will be leveled to the point where they work
with 256 vnodes.
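For a partial stab at that napkin math (a sketch only: it assumes the default
160 MB LCS SSTable target and a fan-out of 10, fills levels bottom-up, ignores
L0, and says nothing about whether each SSTable is narrow enough to fall within
a single vnode range):

# Napkin math: approximate LCS level occupancy at a given per-node data size,
# assuming the default 160 MB SSTable target and a level fan-out of 10.
def lcs_levels(total_mb, sstable_mb=160, fanout=10):
    levels, remaining, level = [], total_mb, 1
    while remaining > 0:
        capacity = sstable_mb * (fanout ** level)   # max data this level holds
        used = min(remaining, capacity)
        levels.append((level, int(used // sstable_mb)))
        remaining -= used
        level += 1
    return levels

ten_tb_in_mb = 10 * 1024 * 1024
for lvl, sstables in lcs_levels(ten_tb_in_mb):
    print(f"L{lvl}: ~{sstables} sstables")
# Under these assumptions roughly 54k of the ~65k SSTables end up in the
# highest level, i.e. the bulk of the data is fully leveled.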
It's also probably possible, without vnodes, to use other compaction
strategies by specifying multiple data directories so that they are
partitioned by token range to match replication. I don't know what the
operational process for that is; maybe Dinesh does.
Ariel


On Wed, Aug 29, 2018, at 3:33 AM, kurt greaves wrote:
> My reasoning was that if you have a small cluster with vnodes you're more
> likely to have enough overlap between nodes that whole SSTables will
> be streamed on major ops. As N gets greater than RF you'll have fewer common
> ranges and thus be less likely to be streaming complete SSTables. Correct
> me if I've misunderstood.
>
> On 28 August 2018 at 01:37, Dinesh Joshi wrote:
>> Although the extent of the benefits depends on the specific use case, the
>> cluster size is definitely not a limiting factor.
>>
>> Dinesh
>>
>> On Aug 27, 2018, at 5:05 AM, kurt greaves wrote:
>>> I believe there are caveats that it will only really help if you're
>>> not using vnodes, or you have a very small cluster, and also
>>> internode encryption is not enabled. Alternatively if you're using
>>> JBOD vnodes will be marginally better, but JBOD is not a great idea
>>> (and doesn't guarantee a massive improvement).
>>>
>>> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID wrote:
>>>> Yes, this feature will help with operating nodes with higher data
>>>> density.
>>>>
>>>> Dinesh
>>>>
>>>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester wrote:
>>>>
>>>> I've noticed this new feature of 4.0:
>>>> Streaming optimizations
>>>> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
>>>> Does this mean that we could have much higher data density with
>>>> Cassandra 4.0 (fewer problems than 3.x)? I mean > 10 TB of data on
>>>> each node without worrying about node join/remove? This is something
>>>> needed for write-heavy applications that do not read a lot. When you
>>>> have around 2 TB of data per day and need to keep it for 6 months, it
>>>> would be a waste of money to purchase 180 servers (even commodity or
>>>> cloud). IMHO, even if 4.0 fixes the problem with streaming/joining a
>>>> new node, compaction is still another evil for a big node, but we
>>>> could tolerate that somehow.
>>>>
>>>> Sent using Zoho Mail (https://www.zoho.com/mail/)


RE: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread Durity, Sean R
If you are going to compare vs commercial offerings like Scylla and CosmosDB, 
you should be looking at DataStax Enterprise. They are moving more quickly than 
open source (IMO) on adding features and tools that enterprises really need. I 
think they have some emerging tech for large/dense nodes, in particular. The 
ability to handle different data model types (Graph and Search) and embedded 
analytics sets it apart from plain Cassandra. Plus, they have replaced 
Cassandra’s SEDA architecture to give it a significant boost in performance. As 
a customer, I see the value in what they are doing.


Sean Durity
From: onmstester onmstester 
Sent: Wednesday, August 29, 2018 7:43 AM
To: user 
Subject: [EXTERNAL] Re: Re: bigger data density with Cassandra 4.0?

Could you please explain more (do you mean slower performance compared to 
Cassandra?) about:
---HBase tends to be quite average for transactional data

and about:
---ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
C*'s mistakes.
ScyllaDB is a much younger project than Cassandra, with much less usage and 
attention. Currently I face a dilemma when launching new clusters: should I 
wait for the Cassandra community to apply all the enhancements and bug fixes 
that their main competitors (ScyllaDB or Cosmos DB) have applied, or just 
switch to a competitor (afraid of the new world!)?
For example, right now is there a motivation to handle denser nodes in the near 
future?

Again, Thank you for your time


Sent using Zoho Mail <https://www.zoho.com/mail/>


 On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves <k...@instaclustr.com> wrote 

Most of the issues around big nodes are related to streaming, which is currently 
quite slow (should be a bit better in 4.0). HBase is built on top of Hadoop, 
which is much better at large files/very dense nodes, and tends to be quite 
average for transactional data. ScyllaDB IDK, I'd assume they just sorted out 
streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester <onmstes...@zoho.com> wrote:


Thanks Kurt,
Actually my cluster has > 10 nodes, so there is only a tiny chance of streaming 
a complete SSTable.
Logically, any columnar NoSQL db like Cassandra always needs to re-sort grouped 
data for later fast reads, and having nodes with a big amount of data (> 2 TB) 
would be annoying for this background process. How is it possible that some of 
these databases, like HBase and ScyllaDB, do not emphasize small nodes (like 
Cassandra does)?


Sent using Zoho Mail <https://www.zoho.com/mail/>


 Forwarded message 
From : kurt greaves <k...@instaclustr.com>
To : "User" <user@cassandra.apache.org>
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was that if you have a small cluster with vnodes you're more likely 
to have enough overlap between nodes that whole SSTables will be streamed on 
major ops. As N gets greater than RF you'll have fewer common ranges and thus be 
less likely to be streaming complete SSTables. Correct me if I've misunderstood.








Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Could you please explain more (do you mean slower performance compared to 
Cassandra?) about:
---HBase tends to be quite average for transactional data

and about:
---ScyllaDB IDK, I'd assume they just sorted out streaming by learning from 
C*'s mistakes.

ScyllaDB is a much younger project than Cassandra, with much less usage and 
attention. Currently I face a dilemma when launching new clusters: should I 
wait for the Cassandra community to apply all the enhancements and bug fixes 
that their main competitors (ScyllaDB or Cosmos DB) have applied, or just 
switch to a competitor (afraid of the new world!)?
For example, right now is there a motivation to handle denser nodes in the 
near future?

Again, thank you for your time

Sent using Zoho Mail

 On Wed, 29 Aug 2018 15:16:40 +0430 kurt greaves wrote:

Most of the issues around big nodes are related to streaming, which is 
currently quite slow (should be a bit better in 4.0). HBase is built on top of 
Hadoop, which is much better at large files/very dense nodes, and tends to be 
quite average for transactional data. ScyllaDB IDK, I'd assume they just sorted 
out streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester wrote:

Thanks Kurt,
Actually my cluster has > 10 nodes, so there is only a tiny chance of streaming 
a complete SSTable.
Logically, any columnar NoSQL db like Cassandra always needs to re-sort grouped 
data for later fast reads, and having nodes with a big amount of data (> 2 TB) 
would be annoying for this background process. How is it possible that some of 
these databases, like HBase and ScyllaDB, do not emphasize small nodes (like 
Cassandra does)?

Sent using Zoho Mail

 Forwarded message 
From : kurt greaves
To : "User"
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was that if you have a small cluster with vnodes you're more 
likely to have enough overlap between nodes that whole SSTables will be 
streamed on major ops. As N gets greater than RF you'll have fewer common 
ranges and thus be less likely to be streaming complete SSTables. Correct me if 
I've misunderstood.

Re: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
Most of the issues around big nodes are related to streaming, which is
currently quite slow (should be a bit better in 4.0). HBase is built on top
of Hadoop, which is much better at large files/very dense nodes, and tends
to be quite average for transactional data. ScyllaDB IDK, I'd assume they
just sorted out streaming by learning from C*'s mistakes.

On 29 August 2018 at 19:43, onmstester onmstester 
wrote:

> Thanks Kurt,
> Actually my cluster has > 10 nodes, so there is only a tiny chance of
> streaming a complete SSTable.
> Logically, any columnar NoSQL db like Cassandra always needs to re-sort
> grouped data for later fast reads, and having nodes with a big amount of
> data (> 2 TB) would be annoying for this background process. How is it
> possible that some of these databases, like HBase and ScyllaDB, do not
> emphasize small nodes (like Cassandra does)?
>
> Sent using Zoho Mail <https://www.zoho.com/mail/>
>
>
>  Forwarded message 
> From : kurt greaves 
> To : "User"
> Date : Wed, 29 Aug 2018 12:03:47 +0430
> Subject : Re: bigger data density with Cassandra 4.0?
>  Forwarded message 
>
> My reasoning was that if you have a small cluster with vnodes you're more
> likely to have enough overlap between nodes that whole SSTables will be
> streamed on major ops. As N gets greater than RF you'll have fewer common
> ranges and thus be less likely to be streaming complete SSTables. Correct me
> if I've misunderstood.
>
>
>
>


Fwd: Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread onmstester onmstester
Thanks Kurt,
Actually my cluster has > 10 nodes, so there is only a tiny chance of streaming 
a complete SSTable.
Logically, any columnar NoSQL db like Cassandra always needs to re-sort grouped 
data for later fast reads, and having nodes with a big amount of data (> 2 TB) 
would be annoying for this background process. How is it possible that some of 
these databases, like HBase and ScyllaDB, do not emphasize small nodes (like 
Cassandra does)?

Sent using Zoho Mail

 Forwarded message 
From : kurt greaves
To : "User"
Date : Wed, 29 Aug 2018 12:03:47 +0430
Subject : Re: bigger data density with Cassandra 4.0?
 Forwarded message 

My reasoning was that if you have a small cluster with vnodes you're more 
likely to have enough overlap between nodes that whole SSTables will be 
streamed on major ops. As N gets greater than RF you'll have fewer common 
ranges and thus be less likely to be streaming complete SSTables. Correct me if 
I've misunderstood.

Re: bigger data density with Cassandra 4.0?

2018-08-29 Thread kurt greaves
My reasoning was that if you have a small cluster with vnodes you're more likely
to have enough overlap between nodes that whole SSTables will be streamed
on major ops. As N gets greater than RF you'll have fewer common ranges and thus
be less likely to be streaming complete SSTables. Correct me if I've misunderstood.
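That intuition is easy to sanity-check with a toy simulation (a sketch under
simplifying assumptions: random vnode tokens, SimpleStrategy-style placement
with no racks, and "shared ranges" used only as a rough proxy for how much of
two nodes' data overlaps):

# Toy check of the reasoning above: as N grows past RF, the fraction of token
# ranges that any two given nodes both replicate shrinks, so less of their
# data overlaps and fewer SSTables are candidates for being streamed whole.
import random

def shared_range_fraction(num_nodes, vnodes=256, rf=3, seed=42):
    random.seed(seed)
    # random vnode tokens on a [0, 1) ring, each tagged with its owning node
    ring = sorted((random.random(), node)
                  for node in range(num_nodes)
                  for _ in range(vnodes))
    total = len(ring)
    shared = 0
    for i in range(total):
        replicas, j = [], i
        while len(replicas) < rf:            # walk clockwise for RF distinct owners
            owner = ring[j % total][1]
            if owner not in replicas:
                replicas.append(owner)
            j += 1
        if 0 in replicas and 1 in replicas:  # do nodes 0 and 1 share this range?
            shared += 1
    return shared / total

for n in (3, 4, 6, 12, 24):
    print(f"{n} nodes: {shared_range_fraction(n):.2f} of ranges shared by a pair")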

On 28 August 2018 at 01:37, Dinesh Joshi 
wrote:

> Although the extent of the benefits depends on the specific use case, the
> cluster size is definitely not a limiting factor.
>
> Dinesh
>
> On Aug 27, 2018, at 5:05 AM, kurt greaves  wrote:
>
> I believe there are caveats that it will only really help if you're not
> using vnodes, or you have a very small cluster, and also internode
> encryption is not enabled. Alternatively if you're using JBOD vnodes will
> be marginally better, but JBOD is not a great idea (and doesn't guarantee a
> massive improvement).
>
> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID <
> dinesh.jo...@yahoo.com.invalid> wrote:
>
>> Yes, this feature will help with operating nodes with higher data density.
>>
>> Dinesh
>>
>>
>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester <
>> onmstes...@zoho.com> wrote:
>>
>>
>> I've noticed this new feature of 4.0:
>> Streaming optimizations
>> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
>> Does this mean that we could have much higher data density with Cassandra 4.0
>> (fewer problems than 3.x)? I mean > 10 TB of data on each node without
>> worrying about node join/remove?
>> This is something needed for write-heavy applications that do not read a
>> lot. When you have around 2 TB of data per day and need to keep it for 6
>> months, it would be a waste of money to purchase 180 servers (even commodity
>> or cloud).
>> IMHO, even if 4.0 fixes the problem with streaming/joining a new node,
>> compaction is still another evil for a big node, but we could tolerate that
>> somehow.
>>
>> Sent using Zoho Mail 
>>
>>
>>
>


Re: bigger data density with Cassandra 4.0?

2018-08-27 Thread Dinesh Joshi
Although the extent of the benefits depends on the specific use case, the cluster 
size is definitely not a limiting factor.

Dinesh

> On Aug 27, 2018, at 5:05 AM, kurt greaves  wrote:
> 
> I believe there are caveats that it will only really help if you're not using 
> vnodes, or you have a very small cluster, and also internode encryption is 
> not enabled. Alternatively if you're using JBOD vnodes will be marginally 
> better, but JBOD is not a great idea (and doesn't guarantee a massive 
> improvement).
> 
>> On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID 
>>  wrote:
>> Yes, this feature will help with operating nodes with higher data density.
>> 
>> Dinesh
>> 
>> 
>> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester 
>>  wrote:
>> 
>> 
>> I've noticed this new feature of 4.0:
>> Streaming optimizations 
>> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
>> Does this mean that we could have much higher data density with Cassandra 4.0 
>> (fewer problems than 3.x)? I mean > 10 TB of data on each node without 
>> worrying about node join/remove?
>> This is something needed for write-heavy applications that do not read a 
>> lot. When you have around 2 TB of data per day and need to keep it for 6 
>> months, it would be a waste of money to purchase 180 servers (even commodity 
>> or cloud). 
>> IMHO, even if 4.0 fixes the problem with streaming/joining a new node, 
>> compaction is still another evil for a big node, but we could tolerate that somehow.
>> 
>> Sent using Zoho Mail
>> 
>> 
>> 
> 


Re: bigger data density with Cassandra 4.0?

2018-08-27 Thread kurt greaves
I believe there are caveats that it will only really help if you're not
using vnodes, or you have a very small cluster, and also internode
encryption is not enabled. Alternatively if you're using JBOD vnodes will
be marginally better, but JBOD is not a great idea (and doesn't guarantee a
massive improvement).

On 27 August 2018 at 15:46, dinesh.jo...@yahoo.com.INVALID <
dinesh.jo...@yahoo.com.invalid> wrote:

> Yes, this feature will help with operating nodes with higher data density.
>
> Dinesh
>
>
> On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester <
> onmstes...@zoho.com> wrote:
>
>
> I've noticed this new feature of 4.0:
> Streaming optimizations
> (https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
> Does this mean that we could have much higher data density with Cassandra 4.0
> (fewer problems than 3.x)? I mean > 10 TB of data on each node without
> worrying about node join/remove?
> This is something needed for write-heavy applications that do not read a
> lot. When you have around 2 TB of data per day and need to keep it for 6
> months, it would be a waste of money to purchase 180 servers (even commodity
> or cloud).
> IMHO, even if 4.0 fixes the problem with streaming/joining a new node,
> compaction is still another evil for a big node, but we could tolerate that
> somehow.
>
> Sent using Zoho Mail 
>
>
>


Re: bigger data density with Cassandra 4.0?

2018-08-26 Thread dinesh.jo...@yahoo.com.INVALID
Yes, this feature will help with operating nodes with higher data density.
Dinesh 

On Saturday, August 25, 2018, 9:01:27 PM PDT, onmstester onmstester 
 wrote:  
 
 I've noticed this new feature of 4.0:
Streaming optimizations 
(https://cassandra.apache.org/blog/2018/08/07/faster_streaming_in_cassandra.html)
Does this mean that we could have much higher data density with Cassandra 4.0 
(fewer problems than 3.x)? I mean > 10 TB of data on each node without worrying 
about node join/remove?
This is something needed for write-heavy applications that do not read a lot. 
When you have around 2 TB of data per day and need to keep it for 6 months, it 
would be a waste of money to purchase 180 servers (even commodity or cloud). 
IMHO, even if 4.0 fixes the problem with streaming/joining a new node, 
compaction is still another evil for a big node, but we could tolerate that somehow.


Sent using Zoho Mail