We based the estimate on an earlier controlled observation. We generated a
year's worth of one-minute data for a single identifier and recorded the
size of the resulting sstable. By adding the data one month at a time, we
observed that the sstable size grew linearly and predictably.
From that we simply multiplied the single-identifier size by the number of
identifiers, in this case 700, to get the 7GB estimate.
As noted above, this estimate is correct once the data is compacted into
one sstable, but it is wrong while there are multiple sstables.
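
To make the arithmetic concrete, here is a rough sketch of the extrapolation.
The function names and the monthly sizes are made up for illustration, and the
roughly 10 MB per identifier is only back-derived from 7GB / 700 (decimal units
assumed); it is not a measured figure.

```python
def is_linear(sizes, tolerance=0.05):
    """Return True if successive size increments stay within
    `tolerance` (as a fraction) of their mean increment."""
    increments = [b - a for a, b in zip(sizes, sizes[1:])]
    mean = sum(increments) / len(increments)
    return all(abs(inc - mean) <= tolerance * mean for inc in increments)


def estimate_compacted_size(bytes_per_identifier, identifiers):
    """Linear extrapolation: size of one fully compacted sstable,
    assuming every identifier contributes the same amount."""
    return bytes_per_identifier * identifiers


# Hypothetical cumulative sstable sizes (bytes) after each month of data
# for a single identifier; placeholder values, not real measurements.
monthly_sizes = [833_000 * month for month in range(1, 13)]
assert is_linear(monthly_sizes)

# Assumed: about 10 MB per identifier per year, back-derived from 7GB / 700.
per_identifier = 10_000_000
total = estimate_compacted_size(per_identifier, 700)
print(total / 1e9)  # 7.0 (GB, decimal)
```

This only models the fully compacted case; it says nothing about the on-disk
total while several sstables still exist.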

Phil


Andreas Finke wrote
> Hi Phil,
> 
> there is no dumb question ;) What is your size estimation based on, e.g.
> what size is a column in your calculation?




