Re: STCS limitation with JBOD?
I would add that STCS and JBOD are logically a bad fit anyway, and that doing it with nodetool compact is extra silly. For these reasons I tend to only use JBOD with LCS, and therefore with SSDs.

As for modeling out tombstones, I tend to push toward handling it in the model. For example, if you're partitioning based on time, say daily (for the sake of ease of understanding, say you have a table for Monday, Tuesday, Wednesday, etc.), but you only query the last 2 days, then as soon as a table falls out of scope you truncate it (on Wednesday you can safely truncate Monday). You can take this same approach with work queues (truncate the work queue table when it's done), or really any logical model that has data that falls out of scope. This can mean querying more than one table at a time, but if you do that in an async fashion the tradeoff can totally be worth it compared to managing tombstones. And really, LCS does pin read times reasonably well, especially when compared to STCS combined with nodetool compact (either you're spiking read times during the compact, or you're spiking beforehand because you had a burst of write traffic prior to your nodetool compact run). Details on modeling these approaches are here: http://lostechies.com/ryansvihla/2014/10/20/domain-modeling-around-deletes-or-using-cassandra-as-a-queue-even-when-you-know-better/

Finally, I'm not typically a big fan of rewriting all data to a new table, though I've done that for some models that were hard to partition (session data that had variable times of aging out, so we pushed over only the new records).

--
Thanks,
Ryan Svihla
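Ryan's truncation scheme above can be sketched in a few lines of bookkeeping. This is a minimal sketch, assuming one table per weekday named `events_<weekday>` and a 2-day query window (both assumptions, not anything from a real schema); the generated TRUNCATE statement would be run through whatever driver you use:

```python
import datetime

RETENTION_DAYS = 2  # assumption: we only ever query the last 2 days


def table_for(day: datetime.date) -> str:
    # One table per weekday, e.g. events_monday (hypothetical naming).
    return "events_" + day.strftime("%A").lower()


def tables_to_query(today: datetime.date) -> list:
    # The tables still in scope: today's and yesterday's.
    # Querying both asynchronously is the tradeoff Ryan mentions.
    return [table_for(today - datetime.timedelta(days=d))
            for d in range(RETENTION_DAYS)]


def truncate_statement(today: datetime.date) -> str:
    # The table that just fell out of scope is safe to truncate.
    # TRUNCATE drops the sstables outright, so no tombstones are written
    # and no major compaction is ever needed to reclaim the space.
    stale = today - datetime.timedelta(days=RETENTION_DAYS)
    return "TRUNCATE %s" % table_for(stale)
```

On a Wednesday this queries `events_wednesday` and `events_tuesday` and truncates `events_monday`, matching the example in the message.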
Re: STCS limitation with JBOD?
Thanks for the info, guys. Regardless of the reason for using nodetool compact, it seems like the question still stands... but the impression I'm getting is that nodetool compact on JBOD as I described will basically fall apart. Is that correct?

To answer Colin's question as an aside: we have a dataset with a fairly high insert load and periodic range reads (batch processing). We have a situation where we may want to rewrite some rows (changing the primary key) by deleting each row and inserting it as a new row. This is not something we would do on a regular basis, but during or after that process a compact would greatly help to clear out the tombstones and rewritten data.

@Ryan Svihla it also sounds like your suggestion in this case would be: create a new column family, rewrite all data into that, truncate/remove the previous one, and replace it with the new one.

--
Dan Kinder
Senior Software Engineer
Turnitin – www.turnitin.com
dkin...@turnitin.com
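The migration Dan summarizes (rewrite everything into a new table under the new primary key, then drop the old table) sidesteps tombstones entirely, because nothing is ever deleted row-by-row. A toy sketch of the copy step, with tables modeled as plain dicts and a caller-supplied key function standing in for the real primary-key change (all names here are illustrative, not driver API):

```python
def migrate(old_table: dict, new_key) -> dict:
    # Copy every row into a fresh table under its new primary key,
    # instead of delete-then-reinsert in place (which would leave one
    # tombstone per row in the old table and motivate a major compaction).
    new_table = {}
    for key, row in old_table.items():
        new_table[new_key(key, row)] = row
    return new_table
```

Afterwards the old table is truncated or dropped, which removes its sstables wholesale; no tombstones are written, so no compaction is needed to clean up.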
Re: STCS limitation with JBOD?
nodetool compact is the ultimate running-with-scissors solution: far more people manage to stab themselves in the eye than run successfully (customers running with scissors successfully notwithstanding). My favorite discussions usually run:

1. "We still have tombstones" (so they set gc_grace_seconds to 0).
2. "We added a node after fixing it, and now a bunch of records that were deleted have come back" (usually after setting gc_grace_seconds to 0 and then not wiping nodes that had been offline).
3. "Why are my read latencies so spikey?" (because they're on STCS and the major compaction left one giant SSTable, which worked fine when their data set was tiny; now they're looking at 100 sstables on STCS, which means slooow reads).
4. "We still have tombstones" (yes, this again, but this is usually after they've switched to LCS, with which nodetool compact is basically a no-op).

All of this is manageable when you have a team that understands the tradeoffs of nodetool compact, but I categorically reject the idea that it's a good experience for new users, as I've unfortunately had about a dozen fire drills this year as a result of nodetool compact alone. Data modeling around partitions that are truncated when they fall out of scope is typically far more manageable, works with any compaction strategy, and doesn't require operational awareness at the same scale.

--
Thanks,
Ryan Svihla
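The resurrection problem in item 2 falls out of the tombstone timing rules: compaction may purge a tombstone once it is older than gc_grace_seconds, and any replica that was offline longer than that can have missed the delete and re-introduce the "live" row via repair. A toy model of just that timing logic (a sketch of the semantics as I understand them, not driver or server code):

```python
DEFAULT_GC_GRACE = 864000  # Cassandra's default gc_grace_seconds: 10 days


def purgeable(tombstone_age_s: int,
              gc_grace_seconds: int = DEFAULT_GC_GRACE) -> bool:
    # Compaction may drop a tombstone only after gc_grace_seconds,
    # giving offline replicas that long to learn about the delete.
    return tombstone_age_s > gc_grace_seconds


def resurrection_risk(node_downtime_s: int,
                      gc_grace_seconds: int = DEFAULT_GC_GRACE) -> bool:
    # A replica down longer than gc_grace_seconds may have missed a
    # delete whose tombstone has since been purged elsewhere; when it
    # rejoins, repair can copy the deleted row back. With
    # gc_grace_seconds = 0, *any* downtime is enough -- failure mode #2.
    return node_downtime_s >= gc_grace_seconds
```

This is why setting gc_grace_seconds to 0 to "fix" tombstones trades one problem for a worse one unless offline nodes are wiped before rejoining.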
Re: STCS limitation with JBOD?
Forcing a major compaction is usually a bad idea. What is your reason for doing that?

--
Colin Clark
+1-320-221-9531
STCS limitation with JBOD?
Hi,

Forcing a major compaction (using nodetool compact: http://datastax.com/documentation/cassandra/2.1/cassandra/tools/toolsCompact.html) with STCS will result in a single sstable (ignoring repair data). However, this seems like it could be a problem for large JBOD setups. For example, if I have 12 disks of 1T each, then it seems like on this node I cannot have one column family store more than 1T worth of data (more or less), because all of the data will end up in a single sstable that can exist only on one disk. Is this accurate? The compaction write path docs (http://datastax.com/documentation/cassandra/2.1/cassandra/dml/dml_write_path_c.html) give a bit of hope that Cassandra could split the one final sstable across the disks, but I doubt it is able to, and I want to confirm.

I imagine that RAID/LVM, using LCS, or multiple Cassandra instances not in JBOD mode could be solutions to this (with their own problems), but I want to verify that this actually is a problem.

-dan
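The arithmetic behind the question can be made concrete: a single sstable lives on exactly one JBOD data directory, so after a major compaction under STCS the per-table ceiling is one disk's capacity, not the sum of all twelve. Worse, the compaction's inputs and output coexist on disk while it runs, so a common rule of thumb (an assumption here, and workload-dependent) is to keep roughly half the disk free:

```python
DISKS = 12
DISK_SIZE_TB = 1.0

# Raw JBOD capacity across all data directories.
total_tb = DISKS * DISK_SIZE_TB

# After `nodetool compact` with STCS there is one output sstable, and an
# sstable can only live on a single data directory, so the table is
# effectively capped at one disk regardless of total capacity.
max_table_tb = DISK_SIZE_TB

# During the compaction the old sstables and the new one coexist, so as
# a rough 50%-headroom rule of thumb (assumption) the practical ceiling
# is about half a disk.
practical_table_tb = DISK_SIZE_TB / 2
```

So with 12 x 1T disks, 12T of raw capacity buys roughly 0.5-1T of usable space for any one major-compacted table, which is exactly the mismatch the question describes.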
Re: STCS limitation with JBOD?
On Fri, Jan 2, 2015 at 11:28 AM, Colin co...@clark.ws wrote:

> Forcing a major compaction is usually a bad idea. What is your reason for doing that?

I'd say "often" and not "usually". Lots of people have schemas where they create way too much garbage, and major compaction can be a good response. The docs' historic incoherent FUD notwithstanding.

=Rob