Re: get_slice slow

B. Todd Burruss Tue, 24 Aug 2010 22:17:54 -0700

thx artie,

i haven't used a super CF because i thought it has more trouble doing slices because the entire row must be deserialized to get to the subcolumn you want?

iostat is nothing, 0.0. i have plenty of RAM and the OS is I/O caching nicely

i haven't used the key cache, because i only have one key, the row of the queue ;)

i haven't used row cache because i need the row to grow quite large, millions of columns. and the size of data could be arbitrary - right now i am testing with < 32 byte values per column.


i do need quorum consistency.

i have read previously that some folks are using a single row with millions of columns. is anyone using get_slice to pick off the first or the last column in the row?


On 08/24/2010 09:25 PM, Artie Copeland wrote:

Have you tried using a super column? it seems that having a row with over 100K columns and growing would be a lot for cassandra to deserialize. what are iostat and jmeter telling you? it would be interesting to see that data. also, what are you using for your key or row caching? do you need to use a quorum consistency? that can slow down reads as well. can you use a lower consistency level?


Artie

On Tue, Aug 24, 2010 at 9:14 PM, B. Todd Burruss <bburr...@real.com> wrote:


    i am using get_slice to pull columns from a row to emulate a
    queue.  column names are TimeUUID and the values are small, < 32
    bytes.  simple ColumnFamily.

    i am using SlicePredicate like this to pull the first ("oldest")
    column in the row:

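           SlicePredicate predicate = new SlicePredicate();
           // empty start and finish = no bounds on the column names,
           // reversed = false (comparator order, oldest TimeUUID first),
           // count = 1 (return only the first column)
           predicate.setSlice_range(
               new SliceRange(new byte[] {}, new byte[] {}, false, 1));

           get_slice(rowKey, colParent, predicate, QUORUM);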

    once i get the column i remove it.  so there are a lot of gets and
    mutates, leaving lots of deleted columns.

    get_slice starts off performing just fine, but then falls off
    dramatically as the number of columns grows.  at its peak there
    are 100,000 columns and get_slice is taking over 100ms to return.
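
One plausible contributor to the falloff described above: a removed column is not dropped right away. It stays in the row as a tombstone until a compaction after gc_grace_seconds purges it, so a slice that starts at the beginning of the row may have to read past every tombstone before it reaches the first live column. The sketch below is only a toy in-memory model of that effect, not Cassandra's actual read path; the class name and its methods (enqueue, dequeue, compact) are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model of a queue kept in one sorted row. Hypothetical names throughout;
// this is not Cassandra code and does not use the Thrift API.
class TombstoneQueueSketch {
    // column name -> true if the column is live, false if it is a tombstone
    private final TreeMap<Long, Boolean> columns = new TreeMap<>();

    void enqueue(long name) {
        columns.put(name, true);
    }

    // Finds the first live column in sorted order, marks it deleted
    // (leaves a tombstone), and returns how many entries were examined.
    int dequeue() {
        int scanned = 0;
        for (Map.Entry<Long, Boolean> e : columns.entrySet()) {
            scanned++;
            if (e.getValue()) {
                e.setValue(false); // deletion leaves a tombstone behind
                return scanned;
            }
        }
        return scanned; // no live column found
    }

    // Stand-in for a compaction that is allowed to purge tombstones.
    void compact() {
        columns.values().removeIf(live -> !live);
    }

    public static void main(String[] args) {
        TombstoneQueueSketch q = new TombstoneQueueSketch();
        for (long i = 0; i < 1000; i++) {
            q.enqueue(i);
        }
        int first = q.dequeue();
        int last = first;
        for (int i = 1; i < 1000; i++) {
            last = q.dequeue();
        }
        System.out.println("first dequeue scanned " + first);  // 1
        System.out.println("1000th dequeue scanned " + last);  // 1000

        q.compact();
        q.enqueue(1000);
        System.out.println("after compact scanned " + q.dequeue()); // 1
    }
}
```

In this model the cost of each read grows with the number of tombstones sitting in front of the first live column, which matches the symptom of get_slice slowing down as gets and removes accumulate. Whether this is the dominant cost in the setup described here would need to be confirmed by measurement.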

    i am running a single instance of cassandra 0.7 on localhost,
    default config.  i've done some googling and can't find any tweaks
    or tuning suggestions specific to get_slice.  i already know about
    separating commitlog and data, watching iostat, GC, etc.

    any low hanging tuning fruit anyone can think of?  in 0.6 i recall
    an index for columns, maybe that is what i need?

    thx




--
http://yeslinux.org
http://yestech.org
