Re: Multiget performance

2014-04-11 Thread Allan C
For sanity, I ran the same Python script with the same row ids again today and it was 10x faster. Must be something going wrong intermittently in my cluster.

-Allan

On April 11, 2014 at 11:02:11 AM, Allan C (alla...@gmail.com) wrote:

It’s a fairly standard relational-like CF. Description is t

Re: Multiget performance

2014-04-11 Thread Allan C
It’s a fairly standard relational-like CF. Description is the only field that’s potentially big (can be up to 1k).

CREATE COLUMN FAMILY 'Event' WITH
  key_validation_class = 'UTF8Type' AND
  comparator = 'UTF8Type' AND
  default_validation_class = 'UTF8Type' AND
  bloom_filter_fp_chance = 0.1 AN

Re: Multiget performance

2014-04-10 Thread Tyler Hobbs
On Thu, Apr 10, 2014 at 6:26 PM, Allan C wrote:

> Looks like the amount of data returned has a big effect. When I only
> return one column, python reports only 20ms compared to 150ms when
> returning the whole row. Rows are each less than 1k in size, but there must
> be client overhead.

That

Re: Multiget performance

2014-04-10 Thread DuyHai Doan
As far as I understand, multiget performance is bound by the slowest node responding to the coordinator. If you are fetching 100 partitions spread across *n* nodes, the coordinator will issue requests to those nodes and wait until all the responses have come back before returning the results to the
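
The point above can be sketched as a toy model: if the coordinator has to wait for every contacted node, the multiget takes roughly as long as the slowest of them. This is only an illustration under simplifying assumptions (no coordinator overhead, replication, or consistency-level effects); the function name and the per-node timings are hypothetical and were not measured on any cluster.

```python
def multiget_latency_ms(node_latencies_ms):
    """Toy model: the coordinator waits for all nodes, so the slowest one sets the total."""
    if not node_latencies_ms:
        return 0.0
    return max(node_latencies_ms.values())


# Hypothetical per-node response times in milliseconds (made-up numbers).
latencies = {"node1": 4.0, "node2": 6.5, "node3": 140.0}

total = multiget_latency_ms(latencies)
print(total)  # 140.0: one slow node dominates the whole multiget
```

In this model, adding more nodes to a single multiget can only keep the total the same or make it worse, which fits the intermittent slowness described in the thread if one node is occasionally slow.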

Re: Multiget performance

2014-04-09 Thread Tyler Hobbs
Can you trace the query and paste the results?

On Wed, Apr 9, 2014 at 11:17 AM, Allan C wrote:

> As one CQL statement:
>
> SELECT * from Event WHERE key IN ([100 keys]);
>
> -Allan
>
> On April 9, 2014 at 12:52:13 AM, Daniel Chia (danc...@coursera.org) wrote:
>
> Are you making the 100 calls i

Re: Multiget performance

2014-04-09 Thread Allan C
As one CQL statement:

SELECT * from Event WHERE key IN ([100 keys]);

-Allan

On April 9, 2014 at 12:52:13 AM, Daniel Chia (danc...@coursera.org) wrote:

Are you making the 100 calls in serial, or in parallel?

Thanks,
Daniel

On Tue, Apr 8, 2014 at 11:22 PM, Allan C wrote:

Hi all, I’ve always
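
For readers wondering what a statement like the one above might look like when built from a client, here is a minimal sketch. It is not the script used in this thread. The helper name `build_multiget_cql` and the example keys are hypothetical, and it assumes a client that accepts `%s`-style positional placeholders for bound values; adjust the placeholder style for your driver.

```python
def build_multiget_cql(table, key_column, n_keys):
    """Return a single SELECT ... IN (...) statement with one placeholder per key."""
    if n_keys < 1:
        raise ValueError("need at least one key")
    # One "%s" per key; the key values themselves are bound separately by the client.
    placeholders = ", ".join(["%s"] * n_keys)
    return f"SELECT * FROM {table} WHERE {key_column} IN ({placeholders})"


keys = ["event-1", "event-2", "event-3"]  # hypothetical row keys
query = build_multiget_cql("Event", "key", len(keys))
print(query)  # SELECT * FROM Event WHERE key IN (%s, %s, %s)
```

Keeping the keys out of the statement text and binding them as parameters avoids quoting mistakes when the key list is generated programmatically.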

Re: Multiget performance

2014-04-09 Thread Daniel Chia
Are you making the 100 calls in serial, or in parallel?

Thanks,
Daniel

On Tue, Apr 8, 2014 at 11:22 PM, Allan C wrote:

> Hi all,
>
> I've always been told that multigets are a Cassandra anti-pattern for
> performance reasons. I ran a quick test tonight to prove it to myself, and,
> sure enough

Multiget performance

2014-04-08 Thread Allan C
Hi all, I’ve always been told that multigets are a Cassandra anti-pattern for performance reasons. I ran a quick test tonight to prove it to myself, and, sure enough, slowness ensued. It takes about 150ms to get 100 keys for my use case. Not terrible, but at least an order of magnitude from wha