Re: slides for Testing out a slab allocator for Cassandra to reduce GC promotion failures by @stuhood ?

2011-08-25 Thread Ryan King
On Thu, Aug 25, 2011 at 9:33 AM, Yang tedd...@gmail.com wrote: http://twitoaster.com/country-us/lenn0x/testing-out-a-slab-allocator-for-cassandra-to-reduce-gc-promotion-failures-by-stuhood-cassandra-memtables-gc-cc-jointheflock/ hi: I'm interested in learning more about the slab allocator,

Re: slides for Testing out a slab allocator for Cassandra to reduce GC promotion failures by @stuhood ?

2011-08-25 Thread Ryan King
background (on HBase, though): http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3/ The Cassandra implementation is somewhat similar. -ryan thanks On Thu, Aug 25, 2011 at 10:01 AM, Ryan King r...@twitter.com wrote: On Thu, Aug 25, 2011

Re: Customized Secondary Index Schema

2011-08-24 Thread Ryan King
On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW alvi...@gmail.com wrote: Hello, As mentioned by Ed Anuff in his blog and slides, one way to build customized secondary index is: We use one CF, each row to represent a secondary index, with the secondary index name as row key. For example,

Re: Avoid Simultaneous Minor Compactions?

2011-08-21 Thread Ryan King
You should throttle your compactions to a sustainable level. -ryan On Sun, Aug 21, 2011 at 10:22 PM, Hefeng Yuan hfy...@rhapsody.com wrote: We just noticed that at one time, 4 nodes were doing minor compaction together, each of them took 20~60 minutes. We're on 0.8.1, 6 nodes, RF5. This

Re: cassandra server disk full

2011-08-03 Thread Ryan King
2, 2011 at 9:27 AM, Jim Ancona j...@anconafamily.com wrote: On Mon, Aug 1, 2011 at 6:12 PM, Ryan King r...@twitter.com wrote: On Fri, Jul 29, 2011 at 12:02 PM, Chris Burroughs chris.burrou...@gmail.com wrote: On 07/25/2011 01:53 PM, Ryan King wrote: Actually I was wrong– our patch will disable

Re: cassandra server disk full

2011-08-01 Thread Ryan King
On Fri, Jul 29, 2011 at 12:02 PM, Chris Burroughs chris.burrou...@gmail.com wrote: On 07/25/2011 01:53 PM, Ryan King wrote: Actually I was wrong: our patch will disable gossip and thrift but leave the process running: https://issues.apache.org/jira/browse/CASSANDRA-2118 If people

Re: Cassandra Pig with network topology and data centers.

2011-07-29 Thread Ryan King
It'd be great if we had different settings for inter- and intra-DC read repair. -ryan On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani jak...@gmail.com wrote: Yes, it's read repair; you can lower the read repair chance to tune this. On Jul 29, 2011, at 6:31 PM, Aaron Griffith

Re: Aggregation and Co-Processors

2011-07-28 Thread Ryan King
On Thu, Jul 28, 2011 at 12:08 PM, Stephen Pope stephen.p...@quest.com wrote: I just finished watching the video by Eric Evans on “CQL – Not just NoSQL. It’s MoSQL”, and I heard mention of aggregation queries. He said there’s been some talk about it, and that you guys were calling it

Re: cassandra server disk full

2011-07-25 Thread Ryan King
We have a patch somewhere that will kill the node on IOErrors, since those tend to be of the class that are unrecoverable. -ryan On Thu, Jul 7, 2011 at 8:02 PM, Jonathan Ellis jbel...@gmail.com wrote: Yeah, ideally it should probably die or drop into read-only mode if it runs out of space.

Re: cassandra server disk full

2011-07-25 Thread Ryan King
Actually I was wrong: our patch will disable gossip and thrift but leave the process running: https://issues.apache.org/jira/browse/CASSANDRA-2118 If people are interested in that I can make sure it's up to date with our latest version. -ryan On Mon, Jul 25, 2011 at 10:07 AM, Ryan King r

Re: Strong Consistency with ONE read/writes

2011-07-12 Thread Ryan King
If you're interested in this idea, you should read up about Spinnaker: http://www.vldb.org/pvldb/vol4/p243-rao.pdf -ryan On Mon, Jul 11, 2011 at 2:48 PM, Yang tedd...@gmail.com wrote: I'm not proposing any changes to be done, but this looks like a very interesting topic for

Re: question on capacity planning

2011-06-29 Thread Ryan King
On Wed, Jun 29, 2011 at 5:36 AM, Jacob, Arun arun.ja...@disney.com wrote: if I'm planning to store 20TB of new data per week, and expire all data every 2 weeks, with a replication factor of 3, do I only need approximately 120 TB of disk? I'm going to use ttl in my column values to automatically
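
The 120 TB figure in the question above is plain multiplication: ingest rate times retention window times replication factor. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and the result should be read as a lower bound only, since TTL-expired data stays on disk until it is compacted away and compaction itself needs free headroom.

```python
def raw_disk_needed_tb(ingest_tb_per_week, retention_weeks, replication_factor):
    """Steady-state data volume across the cluster, ignoring all overhead.

    Hypothetical helper: live data is whatever arrived within the retention
    window, and every byte of it is stored once per replica.
    """
    live_data_tb = ingest_tb_per_week * retention_weeks
    return live_data_tb * replication_factor


# The numbers from the question: 20 TB/week, 2-week expiry, RF=3.
print(raw_disk_needed_tb(20, 2, 3))  # 120
```

Any real plan would add a margin on top of this number for compaction, not-yet-purged expired columns, and per-column overhead.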

Re: counter question

2011-06-24 Thread Ryan King
On Fri, Jun 24, 2011 at 6:08 AM, Joseph Stein crypt...@gmail.com wrote: cool now that 0.8 is out any chance Rainbird is going to be open sourced? Not anytime soon. We're busy launching a bunch of stuff (some of which you'll hear about at CassandraSF). -ryan if not then I guess I will be

Re: No Transactions: An Example

2011-06-23 Thread Ryan King
On Thu, Jun 23, 2011 at 2:05 PM, Les Hazlewood l...@katasoft.com wrote: Hi Dominic, Thanks so much for providing this information.  I was unaware of Cages and this looks like it could be used effectively for certain things. This is because Cassandra uses the timestamps of columns that have

Re: hinted handoff sleeping

2011-06-23 Thread Ryan King
On Thu, Jun 23, 2011 at 2:55 PM, Jeffrey Wang jw...@palantir.com wrote: Hey all, We’re running a slightly patched version of 0.7.3 on a cluster of 5 nodes. I’ve been noticing a number of messages in our logs which look like this (after a node goes “down” and comes back up, usually just due

Re: simple question about merged SSTable sizes

2011-06-22 Thread Ryan King
On Wed, Jun 22, 2011 at 10:00 AM, Jonathan Colby jonathan.co...@gmail.com wrote: Thanks for the explanation.  I'm still a bit skeptical. So if you really needed to control the maximum size of compacted SSTables,   you need to delete data at such a rate that the new files created by

Re: 99.999% uptime - Operations Best Practices?

2011-06-22 Thread Ryan King
On Wed, Jun 22, 2011 at 2:24 PM, Les Hazlewood l...@katasoft.com wrote: I'm planning on using Cassandra as a product's core data store, and it is imperative that it never goes down or loses data, even in the event of a data center failure.  This uptime requirement (five nines: 99.999% uptime)

Re: SSTable corruption blocking compaction and scrub can't fix it

2011-06-17 Thread Ryan King
Even without lsof, you should be able to get the data from /proc/$pid -ryan On Fri, Jun 17, 2011 at 5:08 AM, Dominic Williams dwilli...@system7.co.uk wrote: Unfortunately I shut down that node and anyway lsof wasn't installed. But $ulimit gives unlimited On 17 June 2011 13:00, Sylvain

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 8:18 AM, AJ a...@dude.podzone.net wrote: Good morning all. Hypothetical Setup: 1 data center RF = 3 Total nodes 3 Problem: Suppose I need maximum consistency for one critical operation; thus I specify CL = ALL for reads.  However, this will fail if only 1 replica

Re: snitch thrift

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 6:11 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi all! Assuming a node ends up in GC land for a while, there is a good chance that even though it performs terribly and the dynamic snitching will help you to avoid it on the gossip side, it will not really

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 1:05 PM, AJ a...@dude.podzone.net wrote: On 6/16/2011 10:58 AM, Dan Hendry wrote: I think this would add a lot of complexity behind the scenes and be conceptually confusing, particularly for new users. I'm not so sure about this.  Cass is already somewhat

Re: compression for regular column names?

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 3:41 PM, E R pc88m...@gmail.com wrote: Hi all, As a way of gaining familiarity with Cassandra I am migrating a table that is currently stored in a relational database and mapping it into a Cassandra column family. We add about 700,000 new rows a day to this table, and

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King
On Thu, Jun 16, 2011 at 2:12 PM, AJ a...@dude.podzone.net wrote: On 6/16/2011 2:37 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 1:05 PM, AJ a...@dude.podzone.net wrote: snip The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open to address this: https://issues.apache.org/jira/browse/CASSANDRA-1974 -ryan On Wed, Jun 15, 2011 at 8:49 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: On Thu, Jun 16, 2011 at 12:48 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Even if the gc call

Re: Forcing Cassandra to free up some space

2011-06-15 Thread Ryan King
There's a ticket open for this: https://issues.apache.org/jira/browse/CASSANDRA-2521. Vote on it if you think it's important. -ryan On Wed, Jun 15, 2011 at 7:34 PM, Jeffrey Kesselman jef...@gmail.com wrote: The GC cleanup approach, if depending on specific objects being GCd, is fundamentally

Re: Cassandra scaling problem in virtualized environment

2011-06-14 Thread Ryan King
On Tue, Jun 14, 2011 at 8:16 AM, Schuilenga, Jan Taeke jantaeke.schuile...@ocwduo.nl wrote: Hi All, We are having issues testing Cassandra in a virtualized environment (Vmware ESX). Our challenge is to combine a  high number of concurrent users with a very low maximum response time.

Re: need some help with counters

2011-06-09 Thread Ryan King
On Thu, Jun 9, 2011 at 12:41 PM, Ian Holsman had...@holsman.net wrote: Hi. I had a brief look at CASSANDRA-2103 (expiring counter columns), and I was wondering if anyone can help me with my problem. I want to keep some page-view stats on a URL at different levels of granularity (page

Re: need some help with counters

2011-06-09 Thread Ryan King
On Thu, Jun 9, 2011 at 1:06 PM, Ian Holsman had...@holsman.net wrote: Hi Ryan. you wouldn't have your version of cassandra up on github would you?? No, and the patch isn't in our version yet either. We're still working on it. -ryan

Re: [RELEASE] 0.8.0

2011-06-06 Thread Ryan King
On Mon, Jun 6, 2011 at 6:09 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Of course I talked too soon. I saw a corrupted commitlog some days back after killing cassandra and I just came across a committed hints file after a cluster restart for some config changes :( Will look into

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
On Sun, May 22, 2011 at 11:00 AM, Yang tedd...@gmail.com wrote: Thanks, I did read through that pdf doc, and went through the counters code in 0.8-rc2, I think I understand the logic in that code. in my hypothetical implementation, I am not suggesting to overstep the complicated logic

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
On Mon, May 23, 2011 at 12:06 PM, Yang tedd...@gmail.com wrote: Thanks Ryan, could you please share more details: according to what you observed in testing,  why was performance  worse if you do not do extra buffering? I was thinking (could be wrong)  that without extra buffering, the

Re: rainbird question (why is the 1minute buffer needed?)

2011-05-23 Thread Ryan King
Maybe. We haven't really tested it without buffering and probably won't anytime soon. 1 minute latency is good enough for what we're doing. On Mon, May 23, 2011 at 1:58 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: On May 23, 2011, at 2:23 PM, Ryan King wrote: On Mon, May 23, 2011 at 12

Re: Ghost token

2011-05-13 Thread Ryan King
That's the same as the last one. The token space is a circle, so the last token in the list is repeated at the top. -ryan On Fri, May 13, 2011 at 9:59 AM, Scott McPheeters smcpheet...@healthx.com wrote: Has anyone seen this and know if it is causing an issue or how to fix it?  Anytime I run
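
The wrap-around described in the reply above can be shown with a toy ring. This is a simplified sketch, not Cassandra's code: the tokens are small integers, the function name is hypothetical, and it assumes each node owns the range from the previous node's token (exclusive) up to its own token (inclusive). Under that assumption the first node's range begins at the last node's token, which is why a ring listing shows that token again at the top.

```python
from bisect import bisect_left


def owner_of(key_token, node_tokens):
    """Return the token of the node responsible for key_token.

    node_tokens must be sorted ascending. A key past the highest token
    wraps around to the node with the lowest token, closing the circle.
    """
    index = bisect_left(node_tokens, key_token)
    return node_tokens[index % len(node_tokens)]


ring = [10, 50, 90]
print(owner_of(11, ring))  # 50
print(owner_of(95, ring))  # 10, wrapped around past the last token
```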

Re: CQL, 0.8, and need for language drivers

2011-04-13 Thread Ryan King
On Tue, Apr 12, 2011 at 7:16 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote: As some may have heard, CQL is going to be in 0.8.  It's a level of abstraction that will hopefully make the lives of client developers substantially easier.  The ideal is to make it so client devs only need to do

Re: Analysing hotspot gc logs

2011-04-11 Thread Ryan King
On Mon, Apr 11, 2011 at 10:35 AM, Chris Burroughs chris.burrou...@gmail.com wrote: To avoid taking my own thread [1] off on a tangent.  Does anyone have a recommendation for a tool to do graphical analysis (i.e., make useful graphs) of hotspot GC logs?  Google searches have turned up several

Re: problem with large batch mutation set

2011-04-07 Thread Ryan King
On Wed, Apr 6, 2011 at 11:49 PM, Ross Black ross.w.bl...@gmail.com wrote: Hi, I am using the thrift client batch_mutate method with Cassandra 0.7.0 on Ubuntu 10.10. When the size of the mutations gets too large, the client fails with the following exception: Caused by:

Re: RTG/MRTG/Cricket replacement using Cassandra?

2011-03-31 Thread Ryan King
We have a solution for time series data on cassandra at Twitter that we'd like to open source, but it requires 0.8/trunk so we're not going to release it until that's stable. See http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011 -ryan On Thu, Mar 31, 2011

Re: stress.py bug?

2011-03-21 Thread Ryan King
On Mon, Mar 21, 2011 at 4:02 AM, pob peterob...@gmail.com wrote: Hi, I'm inserting data from client node with stress.py to cluster of 6 nodes. They are all on 1Gbps network, max real throughput of network is 930Mbps (after measurement). python stress.py -c 1 -S 17  -d{6nodes}  -l3 -e

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 9:24 AM, Flachbart, Dirk (HP Software - TransactionVision) dirk.flachb...@hp.com wrote: Hi, We are trying to use Cassandra for high-performance insertion of simple key/value records. I have set up Cassandra on two of my machines in my local network (Windows 2008

Re: Question about insert performance in multiple node cluster

2011-02-28 Thread Ryan King
On Mon, Feb 28, 2011 at 2:05 PM, Flachbart, Dirk (HP Software - TransactionVision) dirk.flachb...@hp.com wrote: Replication factor is set to 1, and I'm using ConsistencyLevel.ANY. And yep, I tried doubling the threads from 16 to 32 when running with the second server, didn't make a

Re: Column name size

2011-02-11 Thread Ryan King
On Fri, Feb 11, 2011 at 2:06 AM, Patrik Modesto patrik.mode...@gmail.com wrote: Hi all! I'm wondering whether the size of a column name could matter for a large dataset in Cassandra (I mean lots of rows). For example, what if I have a row with 10 columns, each with a 10-byte value and a 10-byte name. Do I

Re: Ruby thrift is trying to write Time as string

2011-02-07 Thread Ryan King
On Sat, Feb 5, 2011 at 10:12 PM, Joshua Partogi joshua.j...@gmail.com wrote: Hi, I don't know whether my assumption is right or not. When I tried to insert a Time value into a column I am getting this exception: vendor/ruby/1.8/gems/thrift-0.5.0/lib/thrift/protocol/binary_protocol.rb:106:in

Re: Using a synchronized counter that keeps track of no of users on the application using it to allot UserIds/ keys to the new users after sign up

2011-02-04 Thread Ryan King
On Thu, Feb 3, 2011 at 9:12 PM, Aklin_81 asdk...@gmail.com wrote: Thanks Matthew Ryan, The main inspiration behind me trying to generate Ids in sequential manner is to reduce the size of the userId, since I am using it for heavy denormalization. UUIDs are 16 bytes long, but I can also have a

Re: New Generation Size guidelines

2011-02-04 Thread Ryan King
On Fri, Feb 4, 2011 at 1:45 PM, Oleg Proudnikov ol...@cloudorange.com wrote: Hi All, I have a 3 server cluster with RF=2. My heap is 2G out of a 4G RAM. The servers have 4 cores. I used default heap settings. The Eden space ended up around 60M and the Survivor spaces are around 7M. This

Re: Using a synchronized counter that keeps track of no of users on the application using it to allot UserIds/ keys to the new users after sign up

2011-02-03 Thread Ryan King
You could also consider snowflake: http://github.com/twitter/snowflake which gives you ids that roughly sort by time (but aren't sequential). -ryan On Thu, Feb 3, 2011 at 11:13 AM, Matthew E. Kennedy matt.kenn...@spadac.com wrote: Unless you need your user identifiers to be sequential for
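
For readers wondering how ids can "roughly sort by time" without being sequential, here is a minimal sketch of the general idea: put a millisecond timestamp in the high bits, a worker number in the middle, and a per-millisecond counter in the low bits. This is not snowflake's code. The class name, the epoch, and the 10-bit worker / 12-bit sequence split are assumptions chosen for illustration.

```python
import threading
import time

# Assumed layout for this sketch: timestamp | 10-bit worker | 12-bit sequence.
EPOCH_MS = 1293840000000  # arbitrary custom epoch (2011-01-01 UTC), an assumption
WORKER_BITS = 10
SEQUENCE_BITS = 12
SEQUENCE_MASK = (1 << SEQUENCE_BITS) - 1


class IdGenerator:
    """Hypothetical generator of time-ordered 64-bit-style ids."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock  # injectable so the example can be tested
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self):
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                # Same millisecond: bump the counter.
                self.sequence = (self.sequence + 1) & SEQUENCE_MASK
                if self.sequence == 0:
                    # Counter exhausted: spin until the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            timestamp = now - EPOCH_MS
            return (
                (timestamp << (WORKER_BITS + SEQUENCE_BITS))
                | (self.worker_id << SEQUENCE_BITS)
                | self.sequence
            )
```

Because the timestamp occupies the high bits, ids from different workers interleave approximately in time order, and no coordination between workers is needed beyond giving each one a distinct worker number.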

Re: reduced cached mem; resident set size growth

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 6:22 AM, Chris Burroughs chris.burrou...@gmail.com wrote: On 01/28/2011 09:19 PM, Chris Burroughs wrote: Thanks Oleg and Zhu.  I swear that wasn't a new hotspot version when I checked, but that's obviously not the case.  I'll update one node to the latest as soon as I

Re: reduced cached mem; resident set size growth

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 10:29 AM, Chris Burroughs chris.burrou...@gmail.com wrote: On 02/02/2011 12:49 PM, Ryan King wrote: We're seeing a similar problem with one of our clusters (but over a longer time scale). It's possible that it's not a leak, but just fragmentation. Unless you've told

Re: 0.7.0 mx4j, get attribute

2011-02-02 Thread Ryan King
On Wed, Feb 2, 2011 at 10:40 AM, Chris Burroughs chris.burrou...@gmail.com wrote: I'm using 0.7.0 and experimenting with the new mx4j support. http://host:port/mbean?objectname=org.apache.cassandra.request%3Atype%3DReadStage Returns a nice pretty html page.  For purposes of monitoring I would

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
Not open source, but here's a preso on how simplegeo do it: http://www.slideshare.net/mmalone/scaling-gis-data-in-nonrelational-data-stores Note: we do it very differently here at Twitter (but aren't at liberty to discuss in detail)– I say this just to point out that there are several valid

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Ryan King
On Fri, Jan 21, 2011 at 12:24 PM, Joseph Stein crypt...@gmail.com wrote: Thanks Ryan, Jake and Mike for the quick responses. I will mull through this weekend between engineering things from scratch or going the Solr/Solandra route as Jake points out is an option (and the effort/time related

Re: Document Mapper for Ruby?

2011-01-20 Thread Ryan King
Not sure what you mean by document mapper, but CassandraObject might fit the bill: https://github.com/nzkoz/cassandra_object -ryan On Wed, Jan 19, 2011 at 11:03 PM, Joshua Partogi joshua.j...@gmail.com wrote: Hi all, Is anyone aware of a document mapper for Ruby similar to MongoMapper?

Re: Tombstone lifespan after multiple deletions

2011-01-17 Thread Ryan King
On Sun, Jan 16, 2011 at 6:53 AM, David Boxenhorn da...@lookin2.com wrote: If I delete a row, and later on delete it again, before GCGraceSeconds has elapsed, does the tombstone live longer? Each delete is a new tombstone, which should answer your question. -ryan In other words, if I have the

Re: cassandra row cache

2011-01-13 Thread Ryan King
I'm not sure if this is entirely true, but I *think* older versions of Cassandra used a version of ConcurrentLinkedHashMap (which backs the row cache) that used the Second Chance algorithm, rather than LRU, which might explain this non-LRU-like behavior. I may be entirely wrong about this

Re: Storing big objects into columns

2011-01-13 Thread Ryan King
On Thu, Jan 13, 2011 at 2:38 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Dear all, In a project I would like to store big objects in columns, serialized. For example entire images (several KB to several MB), flash animations (several MB) etc... Does someone use Cassandra with those

Re: Storing big objects into columns

2011-01-13 Thread Ryan King
On Thu, Jan 13, 2011 at 2:44 PM, Victor Kabdebon victor.kabde...@gmail.com wrote: Is there any recommended maximum size for a column? (not the very upper limit, which is 2 GB) Why is it useful to chunk the content into multiple columns? I think you're going to have to do some tests yourself.

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:00 PM, mck m...@apache.org wrote: I'm using 0.7.0-rc3, 3 nodes, RF=3, and ByteOrderedPartitioner. When i run nodetool ring it reports Address         Status State   Load            Owns    Token                                                        

Re: Should nodetool ring give equal load ?

2011-01-12 Thread Ryan King
On Wed, Jan 12, 2011 at 2:08 PM, mck m...@apache.org wrote: You're using an ordered partitioner and your nodes are evenly spread around the ring, but your data probably isn't evenly distributed. This load number seems equals to `du -hs data_file_directories` and since i've got N == RF

Re: Ruby database migrations for Cassandra - ActiveColumn

2011-01-11 Thread Ryan King
Awesome and great to see you're using our fauna cassandra gem. :) -ryan On Tue, Jan 11, 2011 at 10:18 AM, Mike Wynholds m...@carbonfive.com wrote: Happy new year all- I just wanted to mention that I have released a new Cassandra data management gem called ActiveColumn.  The first major

Re: Insert LongType with ruby

2011-01-04 Thread Ryan King
On Tue, Jan 4, 2011 at 12:50 PM, vicent roca daniel sap...@gmail.com wrote: I'm getting more consistent results using Time.stamp instead of Time From: https://github.com/fauna/cassandra/blob/master/lib/cassandra/long.rb Yeah, you were probably overwriting values then. -ryan

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
On Sun, Jan 2, 2011 at 3:45 PM, vicent roca daniel sap...@gmail.com wrote: Hi guys, I need your help. I'm trying to insert a column name of type LongType using the ruby wrapper, but I can't get it working. What I'm trying is something like this:    app.insert(:Data, 'device1-cpu', { Time.now

Re: Insert LongType with ruby

2011-01-03 Thread Ryan King
' from (irb):6 from /Users/armandolalala/.rvm/rubies/ruby-1.9.2-p0/bin/irb:17:in `main' On Mon, Jan 3, 2011 at 10:06 PM, Ryan King r...@twitter.com wrote: On Mon, Jan 3, 2011 at 12:56 PM, vicent roca daniel sap...@gmail.com wrote: Hi again! code: require 'rubygems' require 'cassandra

Re: Too many open files Exception + java.lang.ArithmeticException: / by zero

2010-12-16 Thread Ryan King
Are you creating a new connection for each row you insert (and if so are you closing it)? -ryan On Wed, Dec 15, 2010 at 8:13 AM, Amin Sakka, Novapost amin.sa...@novapost.fr wrote: Hello, I'm using cassandra 0.7.0 rc1, a single node configuration, replication factor 1, random partitioner, 2 GO

Re: Fauna Questions

2010-12-15 Thread Ryan King
On Tue, Dec 14, 2010 at 7:14 AM, Alberto Velandia betovelan...@gmail.com wrote: Hi has anyone noticed that the documentation for the Cassandra Class is gone from the website? http://blog.evanweaver.com/2010/12/06/cassandra-0-8/ http://rdoc.info/gems/cassandra will always have the latest

Re: Running multiple instances on a single server --micrandra ??

2010-12-09 Thread Ryan King
Overall, I don't think this is a crazy idea, though I think I'd prefer cassandra to manage this setup. The problem you will run into is that because the storage port is assumed to be the same across the cluster you'll only be able to do this if you can assign multiple IPs to each server (one for

fauna cassandra client 0.9.0

2010-12-08 Thread Ryan King
I just pushed a 0.9.0 release of the fauna-cassandra ruby client. This is our first release that includes support for Cassandra 0.7 (currently supporting RC1 and not earlier 0.7 releases). code/download: https://rubygems.org/gems/cassandra git: http://github.com/fauna/cassandra File any bugs on

Re: If one seed node crash, how can I add one seed node?

2010-12-07 Thread Ryan King
Note that there's not really anything special about the seed node and it's all relative: the cluster doesn't necessarily have to agree on who the seeds are. So, to bring up a new node to replace the old seed, just set the new node's seed to any existing node in the system. After that you can go

Re: fauna/cassandra gem does not work with Cassandra 0.7

2010-12-07 Thread Ryan King
Please file this on github issues: https://github.com/fauna/cassandra/issues. And I'll get to it soon. -ryan On Tue, Dec 7, 2010 at 2:21 AM, Joshua Partogi joshua.j...@gmail.com wrote: Hi, I pull out fauna/cassandra gem 0.10.0 from github. I then tried to get a value from cassandra as such.

Re: If one seed node crash, how can I add one seed node?

2010-12-07 Thread Ryan King
On Tue, Dec 7, 2010 at 1:07 PM, Eric Gilmore e...@riptano.com wrote: What would comprise a sane and reasonably balanced list? Should there be a certain proportion of seeds per total nodes? Any other considerations besides a) list must be identical on all nodes and b) you can't

Testathon at Twitter on December 13th

2010-12-06 Thread Ryan King
We're going to be hosting people at the Twitter offices the evening of December 13th to focus on testing 0.7. If you're interested please contact me offlist and I'll add you to the invite. Note that we're trying to keep the group small and focused. -ryan

Re: Newbie question about connecting to a cassandra server from another server using Fauna

2010-12-06 Thread Ryan King
It would help if you give us more context. The code snippet you've given us is incomplete and not very helpful. -ryan On Mon, Dec 6, 2010 at 12:33 PM, Alberto Velandia betovelan...@gmail.com wrote: Hi I've successfully managed to connect to the server through the cassandra-cli command but

Re: Cassandra 0.7 beta3 BinaryMemtable and Supercolumns

2010-11-12 Thread Ryan King
On Fri, Nov 12, 2010 at 7:33 AM, Aditya Muralidharan aditya.muralidha...@nisc.coop wrote: Thanks for the response. We're trying to get a general idea of the insert and retrieval performance, and we figured BinaryMemtable would be a great enabler for our bulk import scenarios. Normal thrift

Re: CF Stats in 0.7beta3

2010-11-10 Thread Ryan King
Yeah, that's really microsecond latency. Note, though, that this isn't the full request timing; it's just from the storage proxy down, so it doesn't account for any latency added by thrift or the network. -ryan On Wed, Nov 10, 2010 at 1:43 PM, Rock, Paul paul.r...@teamaol.com wrote: Afternoon all -

Re: High BloomFilterFalseRation

2010-11-02 Thread Ryan King
On Tue, Nov 2, 2010 at 1:28 AM, Daniel Doubleday daniel.double...@gmx.net wrote: Hi all had some time yesterday to dig a lil deeper. And maybe this saves someone who made the same mistake the time so ... After trying to reproduce the problem in unit tests with the same data which led

Re: atomic test-or-set

2010-10-05 Thread Ryan King
On Tue, Oct 5, 2010 at 8:23 AM, Ian Rogers ian.rog...@contactclean.com wrote: Does Cassandra have an atomic test-or-set operation? That is, I want to check to see if a key has a value and, if not, set it to something.  But it must be an atomic operation - I can't do a separate fetch and then

Re: avro + cassandra + ruby

2010-09-30 Thread Ryan King
On Thu, Sep 30, 2010 at 1:08 PM, Gabor Torok gabor.to...@sunpowercorp.com wrote: I added a comment to an existing issue: https://issues.apache.org/jira/browse/AVRO-537 Cool. I'll work with Jeff (who sits about 10 feet from me) to get this fixed. :) -ryan

Re: avro + cassandra + ruby

2010-09-29 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok gabor.to...@sunpowercorp.com wrote: Hi, I'm attempting to use avro to talk to cassandra because the ruby thrift client's read performance is pretty bad (I measured 4x slower than java). However, I run into a problem when calling multiget_slice.

Re: avro + cassandra + ruby

2010-09-28 Thread Ryan King
On Tue, Sep 28, 2010 at 4:06 PM, Gabor Torok gabor.to...@sunpowercorp.com wrote: Hi, I'm attempting to use avro to talk to cassandra because the ruby thrift client's read performance is pretty bad (I measured 4x slower than java). Only 4x feels like a win. :) One thing you should try is to

Re: is it my cassandra cluster ok?

2010-08-25 Thread Ryan King
Looks like you need to do some load balancing. -ryan On Wed, Aug 25, 2010 at 12:33 AM, john xie shanfengg...@gmail.com wrote: /opt/apache-cassandra-0.6.4/bin/nodetool --host 192.168.123.100 ring Address       Status     Load          Range      Ring 162027259805094200094770502377853667196

Re: SEO friendly pagination

2010-08-25 Thread Ryan King
On Wed, Aug 25, 2010 at 11:20 AM, Petr Odut petr.o...@gmail.com wrote: Hi, I've read about pagination in cassandra. My current implementation is get_range_slices with startKey = lastKey + 1, but I need to get the specified page directly. Is it any chance to do this? If you look at twitter,

Re: cache sizes using percentages

2010-08-17 Thread Ryan King
On Tue, Aug 17, 2010 at 10:55 AM, Artie Copeland yeslinux@gmail.com wrote: if i set a key cache size of 100% the way i understand how that works is: - the cache is not write through, but read through - a key gets added to the cache on the first read if not already available - the size of

Re: How does cfstats calculate Row Size?

2010-08-12 Thread Ryan King
On Thu, Aug 12, 2010 at 9:08 AM, Julie julie.su...@nextcentury.com wrote: I am chasing down a row size discrepancy and am confused. I populated a single node Cassandra cluster with 10,000 rows of data, using numeric keys 1-10,000, where each row is a little over 100kB in length and has a

Re: Soliciting thoughts on possible read optimization

2010-08-11 Thread Ryan King
On Tue, Aug 10, 2010 at 8:43 PM, Arya Asemanfar aryaaseman...@gmail.com wrote: I mentioned this today to a couple folks at Cassandra Summit, and thought I'd solicit some more thoughts here. Currently, the read stage includes checking row cache. So if your concurrent reads is N and you have N

Re: Cassandra 0.7 Ruby/Thrift Bindings

2010-08-06 Thread Ryan King
Make sure the client and server are both using the same transport (framed vs. non-framed) -ryan On Fri, Aug 6, 2010 at 9:47 AM, Mark static.void@gmail.com wrote: Has anyone had any success using Cassandra 0.7 w/ ruby? I'm attempting to use the fauna/cassandra gem

Re: Cassandra 0.7 Ruby/Thrift Bindings

2010-08-06 Thread Ryan King
On Fri, Aug 6, 2010 at 9:57 AM, Mark static.void@gmail.com wrote: Wow.. fast answer AND correct. In Cassandra.yml # Frame size for thrift (maximum field length). # 0 disables TFramedTransport in favor of TSocket. thrift_framed_transport_size_in_mb: 15 I just had to change that value to

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-20 Thread Ryan King
On Tue, Jul 20, 2010 at 6:20 AM, Juho Mäkinen juho.maki...@gmail.com wrote: I managed to run a few benchmarks. Servers   r/s   1        64.5k   2        59.5k The configuration: Client: Machine with four Quad Core Intel Xeon CPU E5520 @ 2.27Ghz cpus (total 16 cores), 4530 bogomips per

Re: Ran into an issue where Cassandra Crashed when running out of heap space

2010-07-20 Thread Ryan King
On Tue, Jul 20, 2010 at 1:28 PM, Peter Schuller peter.schul...@infidyne.com wrote: Attaching Jconsole shows that there is a growth of memory and weird spikes. Unfortunately I did not take a screen shot of the growth of the spike over time. I'll do that when it occurs again. Note that expected

Re: Cassandra benchmarking on Rackspace Cloud

2010-07-19 Thread Ryan King
On Mon, Jul 19, 2010 at 11:02 AM, David Schoonover david.schoono...@gmail.com wrote: Multiple client processes, or multiple client machines? I ran it with both one and two client machines making requests, and ensured the sum of the request threads across the clients was 50. That was on the

Re: TechCrunch article on Twitter and Cassandra

2010-07-10 Thread Ryan King
On Sat, Jul 10, 2010 at 10:33 AM, Marty Greenia martygree...@gmail.com wrote: It almost seems counter-intuitive. For analytics, you'd think they'd want a database that supports more sophisticated query functionality (sql). Whereas for everyday tweet storage, something fast and high-throughput

Re: Implementing Counter on Cassandra

2010-06-29 Thread Ryan King
On Tue, Jun 29, 2010 at 9:42 AM, Utku Can Topçu u...@topcu.gen.tr wrote: Hey Guys, Currently in a project I'm involved in, I need to have some columns holding incremented data. The easy approach for implementing a counter with increments is right now as I figured out is read - increment -

Re: Digg 4 Preview on TWiT

2010-06-28 Thread Ryan King
On Mon, Jun 28, 2010 at 9:35 AM, Chris Goffinet c...@chrisgoffinet.com wrote: Digg is not forking Cassandra. We use 0.6 for production, with a few in-house patches (related to our infrastructure). The biggest difference with our branch and apache 0.6 branch is we have the work Kelvin and

Re: Learning-by-doing (also announcing a new Ruby Client Codename: Greek Architect)

2010-06-19 Thread Ryan King
On Sat, Jun 19, 2010 at 9:30 AM, Christian van der Leeden christian.vanderlee...@googlemail.com wrote: Hi Thomas,        did you look at cassandra gem from twitter (fauna/cassandra) on github? They also use the thrift_client and already have the basic cassandra API accessible. I'm also

Re: Panasas and Cassandra

2010-05-25 Thread Ryan King
On Tue, May 25, 2010 at 9:06 AM, Fernanda Foertter fernanda.foert...@pic.com wrote: Hi everyone, So we have Panasas (http://www.panasas.com), and want to avoid local drives.  Because panasas has its own redundancy and cache, Can I set RF=1?  If so, can you think of any reason why we shouldn’t

Re: Why Cassandra is space inefficient compared to MySQL?

2010-05-25 Thread Ryan King
Also, timestamps for each column. -ryan On Tue, May 25, 2010 at 5:41 AM, Jonathan Ellis jbel...@gmail.com wrote: That's true.  But fundamentally Cassandra is expected to use more space than mysql for a few reasons; usually the biggest factor is that Cassandra has to write out each column name

Re: is cassandra really a 'handsoff' solution once setup?

2010-05-14 Thread Ryan King
On Fri, May 14, 2010 at 11:46 AM, S Ahmed sahmed1...@gmail.com wrote: realizing cassandra might be a little tricky to setup at first due to lack of docs etc. Once it is up and running/humming, is it a hands-off solution or does it require hand-holding/monitoring? I recall Joe Stump's blog

Re: Cache capacities set by nodetool

2010-05-12 Thread Ryan King
It's a bug: https://issues.apache.org/jira/browse/CASSANDRA-1079 -ryan On Wed, May 12, 2010 at 8:16 AM, James Golick jamesgol...@gmail.com wrote: When I first brought this cluster online, the storage-conf.xml file had a few cache capacities set. Since then, we've completely changed how we use

Re: Memory usage continually increases with reads

2010-04-28 Thread Ryan King
On Wed, Apr 28, 2010 at 12:12 PM, Kyusik Chung kyu...@discovereads.com wrote: Hello.  I am using Cassandra 0.6.1 on ubuntu 8.04.  3 node cluster. I notice that when I start making lots of read requests (serially), memory usage of jsvc keeps climbing until it uses up all memory on the server

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 1:31 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Thanks Ryan for the fast response! Can you explain to me why binding against 127.0.0.1 causes the problem? Maybe it's useful to point this out in the documentation to keep users from deploying this kind of setup. Are

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 1:38 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Nope, I'm doing some tests locally on my notebook (Macbook OSX 10.6.3 w/4GB RAM). My script inserts several hundred thousand columns with stable speed, and then it exits throwing that exception. It's possible

Re: ThriftTransportException using Ruby Gem 0.8.2 against Cassandra 0.6.1

2010-04-27 Thread Ryan King
On Tue, Apr 27, 2010 at 2:29 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: On 27/04/2010, at 18:11, Ryan King wrote: On Tue, Apr 27, 2010 at 1:38 PM, Lucas Di Pentima lu...@di-pentima.com.ar wrote: Nope, I'm doing some tests locally on my notebook (Macbook OSX 10.6.3 w/4GB RAM

Re: Can Cassandra make real use of several DataFileDirectories?

2010-04-26 Thread Ryan King
I would recommend using RAID-0 rather than multiple data directories. -ryan 2010/4/26 Roland Hänel rol...@haenel.me: I have a configuration like this: <DataFileDirectories> <DataFileDirectory>/storage01/cassandra/data</DataFileDirectory>
