depend on the maximum capacity of a
single host, but my guess is that a map/reduce tool written specifically
for Cassandra, from the beginning, could perform much better than a tool
written for HDFS and adapted. I hear people saying Map/Reduce on
Cassandra/HBase is usually 30% slower than M/R on HDFS. Does it really
make sense? Should we expect a result like this?
Final question: Do you think writing a new M/R tool like the one described
would be reinventing the wheel
On Mon, Jul 21, 2014 at 10:54 AM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
My understanding (please some correct me if I am wrong) is that when you
insert N items in a Cassandra CF, you are executing N binary searches to
insert the item already indexed by a key. When you read
Hi Robert,
First of all, thanks for answering.
2014-07-21 20:18 GMT-03:00 Robert Coli rc...@eventbrite.com:
You're wrong, unless you're talking about insertion into a memtable, which
you probably aren't, and which probably doesn't actually work that way in
any meaningful sense.
On disk,
On Mon, Jul 21, 2014 at 5:45 PM, Marcelo Elias Del Valle
marc...@s1mbi0se.com.br wrote:
Although several sstables (disk fragments) may have the same row key,
inside a single sstable row keys and column keys are indexed, right?
Otherwise, doing a GET in Cassandra would take some time.
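To make the write/read path under discussion concrete, here is a toy sketch — not Cassandra's actual code; the class and all names in it are invented for illustration. Writes land in a sorted in-memory memtable, a flush produces an immutable sorted sstable, and a read checks the memtable first and then binary-searches each sstable from newest to oldest:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeMap;

// Toy model of the path discussed above: inserts go into a sorted
// memtable; a flush writes it out as an immutable, key-sorted sstable;
// a read checks the memtable, then binary-searches each sstable.
public class ReadPathSketch {
    private final TreeMap<String, String> memtable = new TreeMap<>();
    private final List<List<String>> sstableKeys = new ArrayList<>();
    private final List<List<String>> sstableVals = new ArrayList<>();

    public void write(String key, String value) {
        memtable.put(key, value); // O(log n) insert into the sorted memtable
    }

    public void flush() {
        // The memtable is already sorted, so the sstable is written in key order
        sstableKeys.add(new ArrayList<>(memtable.keySet()));
        sstableVals.add(new ArrayList<>(memtable.values()));
        memtable.clear();
    }

    public String read(String key) {
        if (memtable.containsKey(key)) return memtable.get(key);
        // Newest data wins, so scan sstables from most recent to oldest
        for (int i = sstableKeys.size() - 1; i >= 0; i--) {
            int idx = Collections.binarySearch(sstableKeys.get(i), key);
            if (idx >= 0) return sstableVals.get(i).get(idx);
        }
        return null;
    }
}
```

This is why a GET stays fast even with several sstables on disk: each lookup is one O(log n) binary search per sstable, not a scan.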
Hi,
But if you are only relying on memtables to sort writes, that seems like a
pretty heavyweight reason to use Cassandra?
Actually, it's not a reason to use Cassandra. I already use Cassandra and I
need to map reduce data from it. I am trying to see a reason to use the
conventional M/R
From: cscetbon@orange.com
Reply-To: user@cassandra.apache.org
Date: Thursday, January 17, 2013 8:58 AM
To: user@cassandra.apache.org
Subject: Re: Pig / Map Reduce on Cassandra
Jimmy,
I understand that CFS can replace HDFS for those who use Hadoop. I just want
to use pig and hive on cassandra. I know that pig samples are provided and
work now with cassandra natively (they are part of the core). However
someone else will have to answer
that question.
From: cscetbon@orange.com
Reply-To: user@cassandra.apache.org
Date: Thursday, January 17, 2013 8:58 AM
To: user@cassandra.apache.org user@cassandra.apache.org
Subject: Re: Pig / Map Reduce on Cassandra
Jimmy,
I understand that CFS can
What do you mean? It's not needed by Pig or Hive to access Cassandra data.
Regards
On Jan 16, 2013, at 11:14 PM, Brandon Williams
dri...@gmail.com wrote:
You won't get CFS,
but it's not a hard requirement, either.
--
http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-src.tar.gz
--Jimmy
From: cscetbon@orange.com
Reply-To: user@cassandra.apache.org
Date: Thursday, January 17, 2013 6:35 AM
To: user@cassandra.apache.org user@cassandra.apache.org
Subject: Re: Pig / Map Reduce on Cassandra
Hi,
I know that the DataStax Enterprise package provides Brisk, but is there a
community version? Is it easy to interface Hadoop with Cassandra as the
storage, or do we absolutely have to use Brisk for that?
I know CassandraFS is natively available in cassandra 1.2, the version I use,
so is there
Here are a few examples I have worked on, reading from xml.gz files then
writing to cassandra.
https://github.com/jschappet/medline
You will also need:
https://github.com/jschappet/medline-base
These examples are Hadoop Jobs using Cassandra as the Data Store.
This one is a good place to
I don't want to write to Cassandra as it replicates data from another
datacenter, but I just want to use Hadoop Jobs (Pig and Hive) to read data from
it. I would like to use the same configuration as
http://www.datastax.com/dev/blog/hadoop-mapreduce-in-the-cassandra-cluster but
I want to know
Try this one then, it reads from cassandra, then writes back to cassandra,
but you could change the write to wherever you would like.
getConf().set(IN_COLUMN_NAME, columnName);
Job job = new Job(getConf(), "ProcessRawXml");
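For context, wiring a Hadoop job to read a column family typically went through Cassandra's 1.x-era ConfigHelper and ColumnFamilyInputFormat. The sketch below shows the general shape only; the keyspace and column family names are placeholders, and only the job name and columnName come from the snippet above:

```java
// Sketch (not a complete job): configuring a Hadoop job to read a
// column family via the pre-CQL org.apache.cassandra.hadoop API.
// "myKeyspace" / "rawXml" are placeholder names.
Job job = new Job(getConf(), "ProcessRawXml");
job.setInputFormatClass(ColumnFamilyInputFormat.class);
ConfigHelper.setInputInitialAddress(job.getConfiguration(), "127.0.0.1");
ConfigHelper.setInputRpcPort(job.getConfiguration(), "9160");
ConfigHelper.setInputPartitioner(job.getConfiguration(), "RandomPartitioner");
ConfigHelper.setInputColumnFamily(job.getConfiguration(), "myKeyspace", "rawXml");
// Restrict the slice to the one column the job actually reads
SlicePredicate predicate = new SlicePredicate()
        .setColumn_names(Arrays.asList(ByteBufferUtil.bytes(columnName)));
ConfigHelper.setInputSlicePredicate(job.getConfiguration(), predicate);
```

The output side is symmetric (ConfigHelper has matching setOutput* methods), which is why redirecting the write elsewhere is a small change.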
Brisk is pretty much stagnant. I think someone forked it to work with 1.0
but not sure how that is going. You'll need to pay for DSE to get CFS
(which is essentially Brisk) if you want to use any modern version of C*.
Best,
Michael
On 1/16/13 11:17 AM, cscetbon@orange.com wrote:
Here is the point. You're right this github repository has not been updated for
a year and a half. I thought brisk was just a bundle of some technologies and
that it was possible to install the same components and make them work together
without using this bundle :(
On Jan 16, 2013, at 8:22
I'm having some problems running a Map Reduce program using
Cassandra as input.
I already wrote some MapReduce programs using cassandra 1.0.9, but now I'm
trying with an old version with a patch that supports triggers (this one:
https://issues.apache.org/jira/browse/CASSANDRA-1311).
When I
Hey Bill,
A few months ago we did an experiment with 5 hadoop nodes pulling from
4 cass nodes. It was pulling down 1 column family with 8 small columns
just dumping the raw data to hdfs. It was cycling through around 17K
map tasks per sec. The machines weren't being taxed too hard, so I'm
sure
Hi All
How performant is M/R on Cassandra when compared to running it on HDFS?
Anyone have any numbers they can share? Specifically how much of data the
M/R job was run against and what was the throughput etc. Any information
would be very helpful.
--
Cheers
Bill