On 6/6/2011 11:25 PM, Benjamin Coverston wrote:
Currently, my data dir has about 16 such sets. I thought that compaction
(with nodetool) would clean up these files, but it doesn't. Neither
does cleanup or repair.
And that's before even getting into snapshots taken with nodetool snapshot.
You can find useful information in:
http://www.datastax.com/docs/0.8/operations/scheduled_tasks
SSTables are immutable: once written to disk, they are never updated.
When you take a snapshot, the tool makes hard links to the SSTable files.
After some time you will have accumulated a number of SSTables from
memtable flushes.
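The hard-link behaviour described above can be seen with plain filesystem tools. A minimal sketch (the directory layout and file names here are made up for illustration; nodetool snapshot does the equivalent inside the Cassandra data directory):

```shell
# Simulate what a snapshot does: hard-link the immutable sstable file.
mkdir -p data/Keyspace1/snapshots/1307430000
echo "immutable sstable contents" > data/Keyspace1/Users-e-1-Data.db
ln data/Keyspace1/Users-e-1-Data.db \
   data/Keyspace1/snapshots/1307430000/Users-e-1-Data.db

# Both directory entries point at the same inode, so the snapshot
# costs no extra disk space while the original still exists:
ls -li data/Keyspace1/Users-e-1-Data.db \
       data/Keyspace1/snapshots/1307430000/Users-e-1-Data.db

# When compaction deletes the live file, the snapshot copy survives:
rm data/Keyspace1/Users-e-1-Data.db
cat data/Keyspace1/snapshots/1307430000/Users-e-1-Data.db
```

This is why snapshots are cheap to take but still hold disk space after compaction: the snapshot link keeps the old SSTable's data alive until you delete the snapshot itself.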
Re: Backups, Snapshots, SSTable Data Files, Compaction
On 6/7/2011 2:29 AM, Maki Watanabe wrote:
Hi AJ,
Unfortunately, storage capacity planning is a bit of a guessing game.
Until you run your load against it and profile the usage, you just are
not going to know for sure. I have seen cases where planning for 50%
excess capacity per node was plenty, and I have seen other extreme cases
where it was not nearly enough.
Thanks to everyone who responded thus far.
On 6/7/2011 10:16 AM, Benjamin Coverston wrote:
snip
Not to say that there aren't workloads where many TB/node works, but if
you're planning to read from the data you're writing, you do want to
ensure that your working set fits in memory (the OS page cache).
I'd also say consider what happens during maintenance and failure scenarios:
moving tens of TB around takes a lot longer than hundreds of GB.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 8 Jun 2011, at 06:40, AJ wrote:
Aaron makes a good point; the happiest customers, in my opinion, are the
ones that choose nodes on the smaller side, and more of them.
Regarding the working set, I am referring to the OS cache. On Linux,
with JNA, Cassandra utilizes memory-mapped files to great effect,
and this is where your working set is held.
Hi,
I am working on a backup strategy and am trying to understand what is
going on in the data directory.
I notice that after a write to a CF and then a flush, a new set of data
files is created with an incremented index number in their names, such as:
Initially:
Users-e-1-Filter.db
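For anyone scripting backups against these files: in the 0.7/0.8 naming scheme shown above (CFName-version-generation-Component.db), the generation number is the third dash-separated field. A minimal sketch, using made-up file names in a throwaway directory:

```shell
# Illustrative sstable file names, matching the pattern in the listing
# above; these are created here only for demonstration.
mkdir -p demo && cd demo
touch Users-e-1-Data.db Users-e-1-Filter.db \
      Users-e-2-Data.db Users-e-2-Filter.db

# Extract the highest generation number for the Users CF:
# field 3 of the dash-separated name is the generation.
latest=$(ls Users-e-*-Data.db | awk -F- '{print $3}' | sort -n | tail -1)
echo "latest generation: $latest"
```

Each flush writes a complete new set of components (Data, Filter, Index, ...) at the next generation, which is why the data directory accumulates many such sets between compactions.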
Hi AJ,
inline:
On 6/6/11 11:03 PM, AJ wrote:
Hi,
I am working on a backup strategy and am trying to understand what is
going on in the data directory.
I notice that after a write to a CF and then a flush, a new set of data
files is created with an incremented index number in their names,