On Fri, Dec 12, 2014 at 2:24 PM, Ted Dunning <[email protected]> wrote:
>
> Hadoop dependencies are a quagmire.
>
> It would be far preferable to rewrite the necessary serialization to avoid
> Hadoop dependencies entirely.
>
> If we are dropping the MR code, why do we need to reference the VectorWritable
> class at all?
>

Yes, this is the only form of serialization right now. And yes, it would be
much preferable to rewrite it in Kryo terms rather than building on
Writable.

Given the amount of activity in that domain lately, though, I am just being
realistic here.

But yes, I support getting rid of Writable-type serialization.
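
For illustration only, here is a rough sketch of what a Hadoop-free encoding
of a dense vector could look like. It uses plain java.io rather than Kryo,
purely to keep the example dependency-free. The class name PlainVectorCodec
and the wire layout (an int length followed by the doubles) are made up for
this sketch; this is not what VectorWritable does today.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: encodes a dense vector using
// only java.io, so no Hadoop Writable type is involved.
final class PlainVectorCodec {

    // Assumed layout for this sketch: int length, then each double in order.
    static byte[] encode(double[] values) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Reads back exactly what encode() wrote.
    static double[] decode(byte[] bytes) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            int n = in.readInt();
            double[] values = new double[n];
            for (int i = 0; i < n; i++) {
                values[i] = in.readDouble();
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[] original = {1.0, 0.0, -2.5};
        double[] roundTrip = decode(encode(original));
        System.out.println(Arrays.equals(original, roundTrip));
    }
}
```

A Kryo serializer would carry the same payload; the point is only that
nothing above needs a Hadoop class on the classpath.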

We do need the SequenceFile format, though.

Also keep in mind that Spark brings in Hadoop dependencies as well, which is
both a blessing and a curse.

A blessing, because we no longer have to declare a particular Hadoop
dependency.

A curse, because the actual Hadoop version depends on the parameters Spark
was compiled with, not on what the pom and Maven tell us. So we are limited
to the pieces that are "forever" compatible across Hadoop's history.


>
