Re: Looking for work

2010-03-02 Thread Joe Stump
Us too at SimpleGeo! We're Python, Cassandra, Erlang, and a smattering of Java and C++. We have offices in Boulder, CO and SF. --Joe -- Typed with big fingers on a small keyboard. On Mar 2, 2010, at 19:01, Peter Halliday wrote: I'm looking for work. My previous employer was a non-profi

Re: Map return for multiget_slice() query

2010-01-13 Thread Joe Stump
I ran into this problem in Python because dicts aren't ordered in Python. Not sure if that applies here. --Joe On Jan 12, 2010, at 2:22 AM, Richard Grossman wrote: > Hi > > I've a simple CF like this : > Name="channelShow" > FlushPeriodInMinutes=
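A minimal Python sketch of the ordering issue Joe describes, assuming the client returns the multiget result as a plain map keyed by row key. The `fetch_rows` function and its data are hypothetical stand-ins, not part of any Cassandra client API. The point is only that walking the requested key list, rather than the returned map, recovers the request order.

```python
def fetch_rows(keys):
    """Hypothetical stand-in for a multiget call: returns a map keyed by row key.

    The map is deliberately built in reverse to mimic a result whose
    iteration order does not match the order the keys were requested in.
    """
    store = {"show:1": "News", "show:2": "Sports", "show:3": "Weather"}
    return {k: store[k] for k in reversed(keys) if k in store}


def in_request_order(keys, result):
    """Walk the requested keys, not the map, to recover request order."""
    return [(k, result[k]) for k in keys if k in result]


requested = ["show:2", "show:3", "show:1"]
rows = in_request_order(requested, fetch_rows(requested))
print(rows)
```

Keys missing from the result are skipped, so the caller gets only the rows that came back, in the order they were asked for.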

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Joe Stump
On Dec 28, 2009, at 11:40 AM, Ted Zlatanov wrote: > I can see that's a problem. In my case, row keys represent switches in > production so I don't expect more than a few hundred. An application > can't find out how many switches are known without enumerating the > keys; how would you suggest I

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Joe Stump
On Dec 28, 2009, at 11:00 AM, Ted Zlatanov wrote: > Is this worth a JIRA feature request? Or is it something Cassandra will > never support fully? From the user's perspective it's very useful. I don't know why it'd be very useful, to be honest. Lots of us have CFs with billions of keys. Ours,

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Joe Stump
On Dec 28, 2009, at 9:51 AM, Ted Zlatanov wrote: > If each node does a key enumeration, can the results be aggregated > somehow? It seems useful to get a list of all the keys across the > cluster even if it's not 100% accurate. I didn't see discussions of > such a feature in JIRA or in the arch

Re: Partition data - advantage and disadvantage

2009-12-28 Thread Joe Stump
On Dec 28, 2009, at 5:43 AM, JKnight JKnight wrote: >~ org.apache.cassandra.dht.RandomPartitioner, The advantage of the random partitioner is that it randomly distributes your keys across the cluster. This (theoretically) avoids key clustering on nodes. The big disadvantage is that you can
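To illustrate the distribution property mentioned above, here is a rough sketch, assuming keys are placed by hashing them with MD5 (the hash RandomPartitioner uses to derive tokens). The modulo placement and the `node_for` helper are simplifications invented for this example; a real cluster assigns token ranges on a ring rather than taking a remainder.

```python
import hashlib
from collections import Counter


def node_for(key: str, node_count: int) -> int:
    """Map a row key to a node index via its MD5 token (simplified placement)."""
    token = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)
    return token % node_count


# Sequential, similar-looking keys: the hash scatters them across nodes.
keys = [f"user{i:05d}" for i in range(10000)]
placement = Counter(node_for(k, 4) for k in keys)
print(sorted(placement.items()))
```

Each node should end up with roughly a quarter of the keys, even though the keys themselves are sequential.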

Re: Cassandra vs HBase

2009-12-05 Thread Joe Stump
On Dec 5, 2009, at 7:41 PM, Bill Hastings wrote: > [Is] HBase used for real timish applications and if so any ideas what the > largest deployment is. I don't know of anyone off the top of my head who's using anything built on top of Hadoop for a real-time environment. Hadoop just wasn't built

Re: Persistently increasing read latency

2009-12-04 Thread Joe Stump
On Dec 4, 2009, at 1:09 PM, Jonathan Ellis wrote: > If you don't have enough room for both, it doesn't matter how you prioritize. I'm assuming another alternative is to add more boxes to your cluster. --Joe

Re: What is the limit number of subcolumns in one Super Coloumn

2009-11-29 Thread Joe Stump
On Nov 29, 2009, at 4:25 PM, Thanh. Chau Nguyen Nhat wrote: > is there any limit number of subcolumns in one SuperColumn? I believe you're limited by the amount of RAM on a machine (for now). So if you've given the JVM 8GB then a single SuperColumn can't be more than 8GB. --Joe

Re: Cassandra users survey

2009-11-20 Thread Joe Stump
SimpleGeo is using Cassandra as the backend of our real-time location infrastructure. We needed something that was distributed, could scale, could handle lots of writes, etc. We looked into all the usual suspects, but went with Cassandra because it was written in Java (we have two guys who

Re: [VOTE] Website

2009-11-11 Thread Joe Stump
+1 - Great work.

Re: bandwidth limiting Cassandra's replication and access control

2009-11-11 Thread Joe Stump
On Nov 11, 2009, at 3:29 PM, Alexander Vushkan wrote: ...but authentication support would be nice to have... I'll continue to object to this. If you're considering running Cassandra (or MySQL or Redis or Memcache or MemcacheDB or ...) on an open network, Ur Doin' It Wrong. This is what VP

Re: Incr/Decr Counters in Cassandra

2009-11-04 Thread Joe Stump
SimpleGeo would be interested. --Joe On Nov 4, 2009, at 2:32 PM, Chris Goffinet wrote: Hey, At Digg we've been thinking about counters in Cassandra. In a lot of our use cases we need this type of support from a distributed storage system. Anyone else out there who has such needs as well?

Re: Custom partitioners

2009-10-10 Thread Joe Stump
As I said in my email, it's a code test. Simply meant to test an applicant's skills during the hiring process. I'm aware of OPP and COPP and choosing partitioners. --Joe On Oct 10, 2009, at 7:02, Mark Robson wrote: 2009/10/10 Joe Stump I've got a guy doing a code te

Custom partitioners

2009-10-09 Thread Joe Stump
I've got a guy doing a code test for us and he has some questions about custom partitioners: http://gist.github.com/205537 Wondering if anyone could chime in. Thanks! --Joe

Re: cassandra as permanent datastore

2009-10-01 Thread Joe Stump
For click tracking data I might look at Hadoop as well. It can handle the writes, replication, etc. along with having mechanisms for crunching large datasets built in (e.g. MapReduce). That being said, you're working with data that, for the most part, it won't be the end of the world if you

Re: New Features - Future releases

2009-09-18 Thread Joe Stump
On Sep 18, 2009, at 9:46 PM, wrote: Your idea is not bad: having a service layer in front of Cassandra. How about a separate opensource project or a standard/spec for ACL in the service layer? Sure. SOLR is kind of like this for Lucene. --Joe

Re: New Features - Future releases

2009-09-18 Thread Joe Stump
On Sep 18, 2009, at 9:33 PM, wrote: • ACL I'm strongly against ACL. Cassandra was built for highly scalable and highly distributed environments, which always sit behind firewalls. ACLs can easily be implemented in a service layer in front of Cassandra. • Multiple data cente

Re: Got Logo?

2009-09-17 Thread Joe Stump
On Sep 17, 2009, at 11:34 AM, David Pollak wrote: We collected money from the community and put up a $500 bounty. $500 is dangerously close to what a good logo designer charges (dache.ch, for instance, charges $800). Just throwing that out there. I'll be happy to kick in $50 towards a 99D

Re: random n00b question

2009-09-14 Thread Joe Stump
I'd recommend still using Memcached for sessions. The reason is that Memcached has built-in garbage collection of zombie sessions (via LRU) and Cassandra does not. --Joe On Sep 14, 2009, at 5:09 PM, Matt Kydd wrote: I'm looking at a similar use for Cass - storing sessions and some denor
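A small sketch of the LRU behaviour referred to above, assuming a fixed-capacity cache. `LRUSessionStore` is a hypothetical class for illustration only, not Memcached's implementation. It shows how sessions that are never touched again fall out on their own once the store is full.

```python
from collections import OrderedDict


class LRUSessionStore:
    """Toy fixed-capacity session store that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, session_id):
        if session_id not in self._items:
            return None
        # Reading a session marks it as recently used.
        self._items.move_to_end(session_id)
        return self._items[session_id]

    def set(self, session_id, data):
        self._items[session_id] = data
        self._items.move_to_end(session_id)
        if len(self._items) > self.capacity:
            # Drop the entry that has gone longest without being touched.
            self._items.popitem(last=False)


store = LRUSessionStore(capacity=2)
store.set("alice", {"cart": 1})
store.set("zombie", {"cart": 0})
store.get("alice")             # alice is active, zombie is never read again
store.set("bob", {"cart": 3})  # store is full, so the stale session is evicted
print(store.get("zombie"))
```

No explicit cleanup job is needed in this model: abandoned sessions are pushed out by active ones.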

Re: lazyboy does not work with cassandra

2009-08-24 Thread Joe Stump
It only works with 0.3. There are people making progress to get it working on 0.4. Namely, Drew Schleck: http://github.com/dschleck/lazyboy/ --Joe On Aug 24, 2009, at 10:20 PM, mobiledream...@gmail.com wrote: lazyboy does not work with cassandra python columnfamily.py Traceback (most recent c