Hello,

On Mon, Dec 23, 2013 at 3:23 PM, Jie Deng <[email protected]> wrote:

> I am using Java, and Spark has APIs for Java as well. There is a saying
> that Java in Spark is slower than the Scala shell, but that depends on your
> requirements.
> I am not an expert in Spark, but as far as I know, Spark provides different
> levels of storage, including memory and disk. And for the disk part, HDFS is
> just one choice. I am not using HDFS myself, but then you lose the benefits
> of HDFS as well. In other words, it is also just based on your requirements.
> And MongoDB or S3 are also doable, at least with the Java APIs, I suppose.
>
>
I guess that answers the question of whether it is doable. Where/how do I
find out *how* it is doable? :)

I am guessing every pipeline is a "custom job" of sorts, and hence it is the
developer's job to write the "connectors" to 0mq or DynamoDB, for example?
Or is there some kind of "plug-in" system for Spark?
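For what it's worth, the closest thing Spark has to a "plug-in" system for
storage is the Hadoop I/O layer: SparkContext can read anything with a Hadoop
InputFormat (via hadoopFile / newAPIHadoopRDD), and S3 works out of the box
because the Hadoop filesystem layer ships an s3n:// connector. A minimal
sketch with the Java API, assuming a hypothetical bucket name and local
master just for illustration:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class S3LineCount {
    public static void main(String[] args) {
        // "local" master and app name; in a real deployment you would
        // point this at your cluster instead.
        JavaSparkContext sc = new JavaSparkContext("local", "s3-example");

        // s3n://... paths are handled by Hadoop's S3 filesystem connector,
        // which Spark reuses -- no Spark-specific "connector" code needed.
        // The bucket/path here is hypothetical.
        JavaRDD<String> lines = sc.textFile("s3n://my-bucket/logs/*.txt");

        System.out.println("line count: " + lines.count());
        sc.stop();
    }
}
```

For stores without a Hadoop InputFormat (0mq, DynamoDB at the time), you
would indeed write that glue yourself, typically by implementing or reusing
an InputFormat, or by building the RDD from data you fetch in the driver.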

Thanks!
