date:20100309

Re: schema design question

2010-03-09 Thread Matteo Caprari

Thanks Jonathan. Correct me if I'm wrong: you are suggesting that each time we receive a new row (item, [users]) we do 2 operations: 1) insert (or merge) this row 'as it is' (item, [users]) 2) for each user in [users]: insert (user, [item]) Each incoming item is liked by 100 users, so it would be
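
The two operations described in this message can be sketched in Python, with plain dictionaries standing in for the two column families. This is only an illustration under assumptions: `record_likes`, `users_by_item` and `items_by_user` are hypothetical names, and no Cassandra client API is used.

```python
from collections import defaultdict

# In-memory stand-ins for two column families (hypothetical names).
users_by_item = defaultdict(set)  # item -> users who liked it
items_by_user = defaultdict(set)  # user -> items the user liked


def record_likes(item, users):
    """Apply both operations for one incoming (item, [users]) row.

    Returns the number of writes performed in this sketch.
    """
    # 1) insert (or merge) the row as it is
    users_by_item[item].update(users)
    # 2) for each user, insert the reverse entry (user, [item])
    for user in users:
        items_by_user[user].add(item)
    # one write for the row plus one per user
    return 1 + len(users)
```

With 100 users per incoming item, this sketch performs 101 writes per row.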

Re: schema design question

2010-03-09 Thread Matteo Caprari

On Tue, Mar 9, 2010 at 1:23 PM, Jonathan Ellis wrote: > One quad-core node can handle ~14000 inserts per second so you are in > good shape. Well, yeah! >> instead of 'all users that liked N items'? > > That's true. So you'd want to use a custom comparator where first 64 > bits is the Long and t

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Sylvain Lebresne

On Tue, Mar 9, 2010 at 2:52 PM, Jonathan Ellis wrote: > By "reads" do you mean what stress.py counts (rows) or rows * columns? > If it is rows, then you are still actually reading more columns/s in > case 2. Well, unless I'm mistaken, that's the same in my example as I give in both cases to stre

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Jonathan Ellis

On Tue, Mar 9, 2010 at 7:15 AM, Sylvain Lebresne wrote: > 1) stress.py -t 10 -o read -n 5000 -c 1 -r > 2) stress.py -t 10 -o read -n 50 -c 1 -r > > In the case 1) I get around 200 reads/seconds and that's pretty stable. The > disk is spinning like crazy (~25% io_wait), very few cpu or me

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Jonathan Ellis

On Tue, Mar 9, 2010 at 8:31 AM, Sylvain Lebresne wrote: > Well, unless I'm mistaking, that's the same in my example as I give in > both case > to stress.py the option '-c 1' which tells it to retrieve only one > column each time > even in the case where I have 100 columns by row. Oh. Why would y

Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Sylvain Lebresne

Hello, I've done some tests and it seems that having many rows with few columns is somehow better than having few rows with many columns, at least as far as read performance is concerned. Using stress.py, on a quad-core 2.27 GHz machine with 4 GB RAM and the out-of-the-box Cassandra configuration, I in

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Jesse McConnell

in my experience #2 will work well up to a point where it will trigger a limitation of cassandra (slated to be resolved in .7 \o/) where all of the columns under a given key must be able to fit into memory. For things like indexes of data I have opted to shard the keys for really large data sets t
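
One way to shard a very wide row across several keys, as mentioned in this message, is to derive a shard suffix from the column name. The following is a minimal sketch, assuming a fixed shard count and hypothetical helpers `shard_key` and `all_shard_keys`; the message does not say which scheme was actually used.

```python
import zlib

NUM_SHARDS = 16  # assumed value, for illustration only


def shard_key(base_key, column_name, num_shards=NUM_SHARDS):
    """Return the row key of the shard that should hold column_name."""
    # crc32 is stable across processes, unlike Python's built-in hash()
    shard = zlib.crc32(column_name.encode("utf-8")) % num_shards
    return "%s:%d" % (base_key, shard)


def all_shard_keys(base_key, num_shards=NUM_SHARDS):
    """Row keys to query when reading the whole logical row back."""
    return ["%s:%d" % (base_key, i) for i in range(num_shards)]
```

Each physical row then holds roughly 1/16 of the columns, at the cost of querying every shard key when the whole logical row is needed.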

Re: schema design question

2010-03-09 Thread Jonathan Ellis

On Tue, Mar 9, 2010 at 3:53 AM, Matteo Caprari wrote: > Thanks Jonathan. > > Correct if I'm wrong: you are suggesting that each time we receive a new > row (item, [users]) we do 2 operations: > > 1) insert (or merge) this row 'as it is' (item, [users]) > 2) for each user in [users]: insert (user,

another ConcurrentModificationException

2010-03-09 Thread B. Todd Burruss

using cassandra-0.6.0-beta2/ 2010-03-09 09:17:26,827 ERROR [pool-1-thread-675] [Cassandra.java:1166] Internal error processing get java.util.ConcurrentModificationException at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372) at java.util.AbstractList$Itr.n

Re: Cassandra hardware - balancing CPU/memory/iops/disk space

2010-03-09 Thread B. Todd Burruss

our dataset is too big to fit into cache, so we are hitting disk. not a problem for normal operation, but when a node is restored, hinted handoff, load balanced, or if reads/write simply build up we see a problem. the nodes can't seem to catch up. this seems to be centered around drive seek

Re: another ConcurrentModificationException

2010-03-09 Thread Jonathan Ellis

Cool, you're doing a great job finding these. :) Can you create a ticket? On Tue, Mar 9, 2010 at 11:57 AM, B. Todd Burruss wrote: > using cassandra-0.6.0-beta2/ > > > 2010-03-09 09:17:26,827 ERROR [pool-1-thread-675] [Cassandra.java:1166] > Internal error processing get > java.util.ConcurrentMod

Re: another ConcurrentModificationException

2010-03-09 Thread B. Todd Burruss

np, you give me free software, i give you free testing ;) i have some more so i'll just create tix and send them along i just switched to using thunderbird and any new messages i send to the list are being flagged as spam. i have no problems with evolution. anyone have an idea? (i can repl

new bug tix

2010-03-09 Thread B. Todd Burruss

these are both ConcurrentModificationExceptions https://issues.apache.org/jira/browse/CASSANDRA-864 https://issues.apache.org/jira/browse/CASSANDRA-865 this one is an AssertionError https://issues.apache.org/jira/browse/CASSANDRA-866

Re: Cassandra hardware - balancing CPU/memory/iops/disk space

2010-03-09 Thread Jesse McConnell

let us know how the SSDs pan out, I am curious about that as well cheers, jesse -- jesse mcconnell jesse.mcconn...@gmail.com On Tue, Mar 9, 2010 at 12:08, B. Todd Burruss wrote: > our dataset is too big to fit into cache, so we are hitting disk. not a > problem for normal operation, but whe

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Sylvain Lebresne

Alright, What I'm observing shows up better with bigger columns, so I've slightly modified the stress.py test so that it inserts columns of 50K bytes (I attach the modified stress.py for info, but it really just reads 5 bytes from /dev/null and uses that as data. I also added a sleep to the insert ot

no longer in storage-conf.xml in 0.6

2010-03-09 Thread Bill Au

I am checking out the 0.6 release since I need the batch_mutate command. I noticed that is no longer in storage-conf.xml for 0.6. Is that not used anymore? Or is that not configurable anymore? If it is still used but not configurable, how do I run multiple instances of Cassandra on a single ma

Re: no longer in storage-conf.xml in 0.6

2010-03-09 Thread Jonathan Ellis

It's no longer used. And it was always assumed that ControlPort and StoragePort are the same across all instances; you run multiple instances on a single machine by varying the IP address, not the ports. On Tue, Mar 9, 2010 at 1:21 PM, Bill Au wrote: > I am checking out the 0.6 release since I n

Re: schema design question

2010-03-09 Thread Jonathan Ellis

On Tue, Mar 9, 2010 at 7:30 AM, Matteo Caprari wrote: > On Tue, Mar 9, 2010 at 1:23 PM, Jonathan Ellis wrote: >> That's true. So you'd want to use a custom comparator where first 64 >> bits is the Long and the rest is the userid, for instance. >> >> (Long + something else is common enough that w
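
The comparator idea quoted here (first 64 bits a Long, the rest the userid) can be illustrated by packing the column name so that plain byte-wise ordering sorts by the Long first. This is a sketch only: `pack_name` and `unpack_name` are hypothetical helpers, the Long is assumed to be non-negative, and this is not a comparator that Cassandra ships.

```python
import struct


def pack_name(count, userid):
    """Build a composite name: 8-byte big-endian Long, then the userid."""
    # big-endian unsigned 64-bit, so byte order matches numeric order
    return struct.pack(">Q", count) + userid.encode("utf-8")


def unpack_name(name):
    """Split a composite name back into (count, userid)."""
    (count,) = struct.unpack(">Q", name[:8])
    return count, name[8:].decode("utf-8")


names = [pack_name(10, "a"), pack_name(2, "z"), pack_name(2, "b")]
ordered = [unpack_name(n) for n in sorted(names)]
```

Sorting the packed names orders them by the Long first and by userid second, which is the ordering the quoted suggestion relies on.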

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Brandon Williams

On Tue, Mar 9, 2010 at 1:14 PM, Sylvain Lebresne wrote: > I've inserted 1000 row of 100 column each (python stress.py -t 2 -n > 1000 -c 100 -i 5) > If I read, I get the roughly the same number of row whether I read the > whole row > (python stress.py -t 10 -n 1000 -o read -r -c 100) or only the f

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Sylvain Lebresne

> A row causes a disk seek while columns are contiguous. So if the row isn't > in the cache, you're being impaired by the seeks. In general, fatter rows > should be more performant than skinny ones. Sure, I understand that. Still, I get 400 columns per second (i.e., 400 seeks per second) when the

cassandra 0.6.0 beta 2 download contains beta 1?

2010-03-09 Thread Omer van der Horst Jansen

The apache-cassandra-0.6.0-beta2-bin.tar.gz download contains both these files in the apache-cassandra-0.6.0-beta2/lib directory: apache-cassandra-0.6.0-beta1.jar apache-cassandra-0.6.0-beta2.jar Given the way the classpath is constructed, it's possible that anyone using this download is actual

Re: Bad read performances: 'few rows of many columns' vs 'many rows of few columns'

2010-03-09 Thread Brandon Williams

On Tue, Mar 9, 2010 at 2:28 PM, Sylvain Lebresne wrote: > > A row causes a disk seek while columns are contiguous. So if the row > isn't > > in the cache, you're being impaired by the seeks. In general, fatter > rows > > should be more performant than skinny ones. > > Sure, I understand that. S

IllegalStateException: Queue full

2010-03-09 Thread Todd Burruss

using tip of 0.6 branch with 864.txt patch. i have 4 nodes, one node is overcome with compaction right now. i started with no load then added a tiny bit of load and almost immediately got these errors on the other 3 nodes. 2010-03-09 16:05:43,004 ERROR [RESPONSE-STAGE:982] [CassandraDaemon.jav

Re: IllegalStateException: Queue full

2010-03-09 Thread Jonathan Ellis

v2 of patch attached to #864 (replaces old one) On Tue, Mar 9, 2010 at 6:08 PM, Todd Burruss wrote: > using tip of 0.6 branch with 864.txt patch. i have 4 nodes, one node is > overcome with compaction right now. i started with no load then added a tiny > bit of load and almost immediately got

Re: Hackathon?!?

2010-03-09 Thread Dan Di Spaltro

Alright guys, we have settled on a date for the Cassandra meetup on... April 15th, better known as, Tax day! We can host it here at Cloudkick, unless a cooler startup wants to host it. http://maps.google.com/maps/ms?ie=UTF8&hl=en&msa=0&msid=100290781618196563860.000478354937656785449&z=19

Re: Hackathon?!?

2010-03-09 Thread Jonathan Ellis

I can make it. \o/ On Tue, Mar 9, 2010 at 8:05 PM, Dan Di Spaltro wrote: > Alright guys, we have settled on a date for the Cassandra meetup on... > April 15th, better known as, Tax day! > We can host it here at Cloudkick, unless a cooler startup wants to host it. > http://maps.google.com/maps/ms?

Re: Hackathon?!?

2010-03-09 Thread Jeff Hodges

I'm down. -- Jeff On Tue, Mar 9, 2010 at 6:18 PM, Jonathan Ellis wrote: > I can make it. \o/ > > On Tue, Mar 9, 2010 at 8:05 PM, Dan Di Spaltro > wrote: >> Alright guys, we have settled on a date for the Cassandra meetup on... >> April 15th, better known as, Tax day! >> We can host it here at C

Re: Hackathon?!?

2010-03-09 Thread Stu Hood

Definitely on board! -Original Message- From: "Dan Di Spaltro" Sent: Tuesday, March 9, 2010 8:05pm To: cassandra-user@incubator.apache.org Subject: Re: Hackathon?!? Alright guys, we have settled on a date for the Cassandra meetup on... April 15th, better known as, Tax day! We can host

Re: atomicity across keys and secondary index support

2010-03-09 Thread Patricio Echagüe

Hey Jonathan, has there been any update on this feature? Thanks a lot Patricio On Thu, Dec 3, 2009 at 2:35 PM, Jonathan Ellis wrote: > that is still very firmly in the category of "future work." > > 2009/12/3 Patricio Echagüe : > > Hi all, I was reading the original paper[1] looking for answers

Re: atomicity across keys and secondary index support

2010-03-09 Thread Jonathan Ellis

Atomicity: no. 2ary indexes: CASSANDRA-749 is targeting the 0.8 release 2010/3/9 Patricio Echagüe : > Hey Jonathan, has there been any update on this feature? > > Thanks a lot > Patricio > > On Thu, Dec 3, 2009 at 2:35 PM, Jonathan Ellis wrote: >> >> that is still very firmly in the category of

Re: Hackathon?!?

2010-03-09 Thread Dan Di Spaltro

Great, that would probably get us a lot more room. Sweet, so it's settled, we'll do it at Digg WHQ! On Tue, Mar 9, 2010 at 9:13 PM, Chris Goffinet wrote: > +1 from Digg if you wanna have it at our place as well, got the OK from the > boss. > > -Chris > > On Mar 9, 2010, at 6:05 PM, Dan Di Spalt

Re: Hackathon?!?

2010-03-09 Thread Ryan King

I'm already committed to talking about cassandra that day at our company's developer conference (chirp.twitter.com). -ryan On Tue, Mar 9, 2010 at 6:26 PM, Jeff Hodges wrote: > I'm down. > -- > Jeff > > On Tue, Mar 9, 2010 at 6:18 PM, Jonathan Ellis wrote: >> I can make it. \o/ >> >> On Tue, Mar

Re: Hackathon?!?

2010-03-09 Thread Jeff Hodges

Ah, hell. Thought this was the first day. Can't make it. -- Jeff On Mar 9, 2010 9:32 PM, "Ryan King" wrote: I'm already committed to talking about cassandra that day at our company's developer conference (chirp.twitter.com). -ryan On Tue, Mar 9, 2010 at 6:26 PM, Jeff Hodges wrote: > I'm down


33 matches
