Have you benchmarked the batch insert APIs?  If they're "fast enough",
that's by far the simplest way to go.
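
For concreteness, here's roughly what that looks like from Java over
Thrift.  This assumes the batch_insert signature from the current API
(keyspace, key, cfmap, consistency level) and the stock
Keyspace1/Standard1 sample schema; package names and signatures have
been moving between releases, so check it against your version:

    import java.util.*;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.cassandra.service.*;  // thrift-generated classes

    public class BatchInsertSketch
    {
        public static void main(String[] args) throws Exception
        {
            TSocket socket = new TSocket("localhost", 9160);
            Cassandra.Client client =
                new Cassandra.Client(new TBinaryProtocol(socket));
            socket.open();

            // build up all the columns for one row client-side
            long ts = System.currentTimeMillis();
            List<ColumnOrSuperColumn> columns =
                new ArrayList<ColumnOrSuperColumn>();
            for (int i = 0; i < 1000; i++)
            {
                ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
                cosc.column = new Column(("col" + i).getBytes("UTF-8"),
                                         ("val" + i).getBytes("UTF-8"),
                                         ts);
                columns.add(cosc);
            }

            // one round trip for the whole row
            Map<String, List<ColumnOrSuperColumn>> cfmap =
                new HashMap<String, List<ColumnOrSuperColumn>>();
            cfmap.put("Standard1", columns);
            client.batch_insert("Keyspace1", "row-key-1", cfmap,
                                ConsistencyLevel.ONE);

            socket.close();
        }
    }

From the Hadoop side the natural place for this is the reducer:
accumulate a row's columns, issue one batch_insert per row, and keep
the connections open across calls so you're not paying setup cost
per row.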

Otherwise you'll have to use the BinaryMemtable stuff, which is
undocumented and not exposed as a client API (you basically write a
custom "loader" version of Cassandra to use it, I think).  Facebook
used this for their own bulk loading, so it works at some level, but
clearly there is some assembly required.
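
If you do go down that road, the rough shape is a process that links
against the Cassandra jar itself rather than talking Thrift: it builds
pre-serialized rows and ships them to the replicas as binary messages
so they land in a BinaryMemtable instead of going through the normal
write path.  Everything below (RowMutation, the BINARY verb,
MessagingService) is internal and version-specific, so treat it as a
sketch of the idea, not working code:

    import java.net.InetAddress;
    import org.apache.cassandra.db.ColumnFamily;
    import org.apache.cassandra.db.RowMutation;
    import org.apache.cassandra.net.Message;
    import org.apache.cassandra.net.MessagingService;
    import org.apache.cassandra.service.StorageService;

    public class BinaryLoaderSketch
    {
        // send one pre-built row straight to a replica's BinaryMemtable
        public void loadRow(String keyspace, String key, ColumnFamily cf,
                            InetAddress replica) throws Exception
        {
            RowMutation rm = new RowMutation(keyspace, key);
            rm.add(cf);

            // the BINARY verb tells the receiving node to apply this to
            // a BinaryMemtable (no commit log, minimal write-path work)
            Message msg =
                rm.makeRowMutationMessage(StorageService.Verb.BINARY);
            MessagingService.instance.sendOneWay(msg, replica);
        }
    }

The win is skipping the commit log and the per-column write path, but
the loader has to work out the right replica endpoints for each key
itself, which is part of the assembly required.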

-Jonathan

On Thu, May 21, 2009 at 2:28 AM, Alexandre Linares <[email protected]> wrote:
> Hi all,
>
> I'm trying to find the most efficient way to ingest my content from Hadoop to
> Cassandra.  Assuming I have figured out the table representation for this
> content, what is the best way to go about pushing it from my cluster?  What
> Cassandra client batch APIs do you suggest I use to push to Cassandra?  I'm
> sure this is a common pattern, and I'm curious to see how it has been
> implemented.  Assume millions of rows and 1000s of columns.
>
> Thanks in advance,
> -Alex
>
>
