CassandraFS in 1.0?

2011-07-06 Thread Joseph Stein
Hey folks, I am going to start prototyping our media tier using Cassandra as a file system (meaning upload video/audio/images to a web server, save them in Cassandra, and then stream them out). Has anyone done this before? I was thinking Brisk's CassandraFS might be a fantastic implementation for this

Paging Columns from a Row

2011-06-05 Thread Joseph Stein
What are the best practices here for paging and slicing columns from a row? So let's say I have 1,000,000 columns in a row. I read the row but want to have 1 thread read columns 0 - , second thread (actor in my case) 1 - 1 ... and so on, so I can have 100 workers processing 10,000 columns for
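A minimal sketch of the split described above, assuming the columns are already in memory as a Python list: it cuts 1,000,000 column positions into 100 contiguous ranges of 10,000 and hands each range to a worker. The names `column_ranges`, `process_range`, and `process_row` are hypothetical, and `process_range` is placeholder work, not a Cassandra call. Note that Cassandra slices are bounded by column name plus a count rather than by numeric offset, so a real implementation would likely page by passing the last column name seen as the next start.

```python
from concurrent.futures import ThreadPoolExecutor


def column_ranges(total_columns, workers):
    """Split [0, total_columns) into contiguous (start, end) ranges, end exclusive."""
    base, extra = divmod(total_columns, workers)
    ranges = []
    start = 0
    for w in range(workers):
        # The first `extra` workers take one more column when the split is uneven.
        size = base + (1 if w < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def process_range(columns, bounds):
    """Placeholder work for one worker: sum the values in its range (assumption)."""
    start, end = bounds
    return sum(columns[start:end])


def process_row(columns, workers=100):
    """Run one task per range and return the per-range results in order."""
    ranges = column_ranges(len(columns), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: process_range(columns, b), ranges))
```

With 1,000,000 columns and 100 workers, the first range covers positions 0 through 9,999 and the second 10,000 through 19,999.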

Re: Paging Columns from a Row

2011-06-05 Thread Joseph Stein
at the same time. On Sun, Jun 5, 2011 at 1:43 PM, Joseph Stein crypt...@gmail.com wrote: What are the best practices here for paging and slicing columns from a row? So let's say I have 1,000,000 columns in a row. I read the row but want to have 1 thread read columns 0 - , second thread (actor

Re: [RELEASE] 0.8.0

2011-06-02 Thread Joseph Stein
Awesome! On Thu, Jun 2, 2011 at 7:36 PM, Eric Evans eev...@rackspace.com wrote: I am very pleased to announce the official release of Cassandra 0.8.0. If you haven't been paying attention to this release, this is your last chance, because by this time tomorrow all your friends are going to

Re: Cassandra Hackathon?

2011-05-17 Thread Joseph Stein
...@gmail.com wrote: I had it on our list of ideas for the Cassandra NYC meetup. I am down for action. On Mon, May 16, 2011 at 9:40 PM, Joseph Stein crypt...@gmail.com wrote: Any interest for a Cassandra Hackathon evening in NYC? Any committer(s) going to be in the NYC area together that can lead

Cassandra Hackathon?

2011-05-16 Thread Joseph Stein
Any interest for a Cassandra Hackathon evening in NYC? Any committer(s) going to be in the NYC area together that can lead/guide this? http://www.meetup.com/NYC-Cassandra-User-Group/events/18635801/ I have a thumbs up to use our office www.medialets.com in the Milk Studios building. It is a big

GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Joseph Stein
I hear that a bunch of folks have GeoIndexing built on top of Cassandra and running in production. Any of them open sourced (Twitter? SimpleGeo? Bueller?) planning on it? /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */

Re: GeoIndexing in Cassandra, Open Sourced?

2011-01-21 Thread Joseph Stein
On Fri, Jan 21, 2011 at 1:49 PM, Mike Malone m...@simplegeo.com wrote: A more recent preso I gave about the SimpleGeo architecture is up at http://strangeloop2010.com/system/talks/presentations/000/014/495/Malone-DimensionalDataDHT.pdf Mike On Fri, Jan 21, 2011 at 10:02 AM, Joseph Stein

Re: [RELEASE] 0.7.0 (and 0.6.9)

2011-01-11 Thread Joseph Stein
Many thanks to those that put in all the hard work, time, dedication, etc for another awesome release !!! /* Joe Stein http://www.linkedin.com/in/charmalloc Twitter: @allthingshadoop */ On Tue, Jan 11, 2011 at 12:23 PM, Eric Evans eev...@rackspace.com wrote: As some of you may already be

Re: Cassandra vs MongoDB

2010-07-28 Thread Joseph Stein
If you are looking to store web logs and then do ad hoc queries, you might/should be using Hadoop (depending on how big your logs are). While MongoDB has MapReduce (built in), it is there to simulate SQL GROUP BY and not for large scale analytics by any means. MongoDB uses a global read/write lock

geo distance calculations

2010-06-26 Thread Joseph Stein
I believe I have asked before but now that I am really getting into the weeds with this it seems I am about to go down the MongoDB path... before I do let me ask again (as I would prefer to stick with Cassandra for this app) Has anyone implemented geo (long lat) calculations (distance) using
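For reference, the distance calculation being asked about is often done with the haversine formula. The sketch below is a generic great-circle distance in plain Python, not something taken from this thread or from any Cassandra library; the function name `haversine_km` and the mean Earth radius constant are assumptions made for illustration.

```python
import math

# Mean Earth radius in kilometres (an assumed constant; the Earth is not a perfect sphere).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 so floating-point rounding near antipodal points cannot break asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

A function like this only compares two points; finding nearby points in a store still needs some spatial index or bounding-box prefilter on top.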

Re: timeout while running simple hadoop job

2010-05-07 Thread Joseph Stein
You can manage the number of map tasks per node with mapred.tasktracker.map.tasks.maximum=1 On Fri, May 7, 2010 at 9:53 AM, gabriele renzi rff@gmail.com wrote: On Fri, May 7, 2010 at 2:44 PM, Jonathan Ellis jbel...@gmail.com wrote: Sounds like you need to configure Hadoop to not create a whole

Re: Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Joseph Stein
great talk tonight in NYC I attended in regards to using Cassandra as a Lucene Index store (really great idea nicely implemented) http://blog.sematext.com/2010/02/09/lucandra-a-cassandra-based-lucene-backend/ so Lucinda uses Cassandra as a distributed cache of indexes =8^) On Mon, Apr 26, 2010

Re: Cassandra use cases: as a datagrid ? as a distributed cache ?

2010-04-26 Thread Joseph Stein
(sp) Lucandra http://github.com/tjake/Lucandra On Mon, Apr 26, 2010 at 11:08 PM, Joseph Stein crypt...@gmail.com wrote: great talk tonight in NYC I attended in regards to using Cassandra as a Lucene Index store (really great idea nicely implemented) http://blog.sematext.com/2010/02/09/lucandra

Re: The Difference Between Cassandra and HBase

2010-04-25 Thread Joseph Stein
It is kind of the classic distinction between OLTP and OLAP. Cassandra is to OLTP as HBase is to OLAP (for those SAT nutz). Both are useful and valuable in their own right, agreed. On Sun, Apr 25, 2010 at 12:20 PM, Jeff Hodges jhod...@twitter.com wrote: HBase is awesome when you need high

download links 404 on main site

2010-03-15 Thread Joseph Stein
so i just moved to a new dev machine and went to download 0.5.1 was excited to see when googling cassandra coming up #1 (under the top level site now) but upset when EVERY mirror I tried came up 404 error not found =8^( http://cassandra.apache.org/ try to download 0.5.1, no luck ... not sure