Jonathan,

Thanks for your thoughts.

I've run some simple benchmarks with the batch insert APIs and was hoping for 
something more performant.  Is there a batch row-insert call that I missed?
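For concreteness, what I benchmarked looks roughly like the sketch below.  To
be clear, this is my best reading of the 0.3-era Thrift interface; the names
batch_mutation_t / column_t / batch_insert and the block_for argument are
assumptions on my part, so check your generated client against cassandra.thrift:

import java.util.Collections;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class BatchInsertBench {
    public static void main(String[] args) throws Exception {
        // thrift packages may still be com.facebook.thrift in older builds
        TSocket socket = new TSocket("localhost", 9160);
        socket.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));

        // one batch_mutation_t carries all the columns for a single row
        // key, saving round trips versus per-column insert() calls
        column_t col = new column_t("col1", "some value".getBytes(),
                                    System.currentTimeMillis());
        batch_mutation_t batch = new batch_mutation_t(
            "Table1",                              // table
            "row-key-1",                           // row key
            Collections.singletonMap("Standard1",  // column family -> columns
                Collections.singletonList(col)));

        client.batch_insert(batch, 1);  // block_for; check cassandra.thrift
        socket.close();
    }
}

Note it's still one row key per call, which is why I'm hoping there's a
multi-row variant I missed.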

Any pointers (at all) to anything related to FB's bulk loading or the 
BinaryMemtable?  I've attempted this myself by writing a custom IVerbHandler 
for ingestion and talking to the MessagingService internally, but it's not 
that clean.
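Roughly, the shape of that attempt is below.  Hedged sketch: the handler
interface and registration call are as I understood the internals at the
time, and may not match current trunk; the verb string is my own invention.

import org.apache.cassandra.net.IVerbHandler;
import org.apache.cassandra.net.Message;

// Handler deployed on every node; the Hadoop side sends messages tagged
// with the same verb string via MessagingService.
public class IngestVerbHandler implements IVerbHandler {
    public void doVerb(Message message) {
        // deserialize the row mutations carried in the message body and
        // apply them to the local table -- this is the part that isn't
        // clean, since there's no public format for the payload
    }
}

// registered once at node startup, along the lines of:
//   MessagingService.getMessagingInstance()
//       .registerVerbHandlers("HADOOP-INGEST", new IngestVerbHandler());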

Thanks again,
-Alex



________________________________
From: Jonathan Ellis <[email protected]>
To: [email protected]
Sent: Thursday, May 21, 2009 7:44:59 AM
Subject: Re: Ingesting from Hadoop to Cassandra

Have you benchmarked the batch insert APIs?  If those are "fast enough,"
then they're by far the simplest way to go.

Otherwise you'll have to use the BinaryMemtable stuff, which is
undocumented and not exposed as a client API (you basically write a
custom "loader" build of Cassandra to use it, I think).  FB used this
for their own bulk loading, so it works at some level, but clearly
some assembly is required.
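To give a feel for the assembly required, the loader boils down to
something like the sketch below.  Every type and method in it is a
hypothetical stand-in, not the real internal API; the actual entry points
are in and around BinaryMemtable in the source tree.

import java.util.Map;

// Sketch of the loader shape only: all names here are placeholders.
public class LoaderSketch {
    /** stand-in for "ship this pre-built row to the node(s) owning key" */
    interface BinarySender {
        void send(String key, byte[] serializedColumnFamily);
    }

    // The win is doing the expensive work -- building and serializing a
    // complete, sorted row -- inside Hadoop, then handing each storage
    // node an opaque blob that goes through the BinaryMemtable straight
    // to an sstable, bypassing the commit log and the normal per-column
    // write path.
    static void load(Map<String, byte[]> rowsFromHadoop, BinarySender sender) {
        for (Map.Entry<String, byte[]> row : rowsFromHadoop.entrySet()) {
            sender.send(row.getKey(), row.getValue());
        }
    }
}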

-Jonathan

On Thu, May 21, 2009 at 2:28 AM, Alexandre Linares <[email protected]> wrote:
> Hi all,
>
> I'm trying to find the optimal way to ingest my content from Hadoop into
> Cassandra.  Assuming I have figured out the table representation for this
> content, what is the best way to go about pushing it from my cluster?  Which
> Cassandra client batch APIs do you suggest I use?  I'm sure this is a common
> pattern, so I'm curious to see how it has been implemented.  Assume millions
> of rows and thousands of columns.
>
> Thanks in advance,
> -Alex
>
>