Re: Proposal for concrete operator for writing to HDFS file

Yogi Devendra Sun, 06 Mar 2016 00:40:19 -0800

Thomas,

I agree that toString() may not give valid output for most objects.
But my intent was to keep csv/json/avro conversion separate from
this operator.

The same conversions will be required for a few other stores. How about
having a separate POJO-to-csv/json/avro converter before this operator,
which would emit byte[] or String?
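
To make the idea of a separate converter concrete, here is a rough sketch.
All names in it (ConverterSketch, Trade, TO_CSV) are hypothetical and only
for illustration, and the CSV formatting is deliberately naive (no quoting
or escaping). A JSON or Avro converter would have the same shape.

```java
import java.util.function.Function;

// Hypothetical sketch: a converter stage that sits before the writer and
// turns a POJO into a String. Not an existing operator or library API.
final class ConverterSketch {
  private ConverterSketch() {}

  // Illustrative POJO; the fields are assumptions made for this example.
  static final class Trade {
    final String symbol;
    final int quantity;

    Trade(String symbol, int quantity) {
      this.symbol = symbol;
      this.quantity = quantity;
    }
  }

  // POJO -> CSV line. Naive on purpose: no quoting or escaping of fields.
  static final Function<Trade, String> TO_CSV =
      t -> t.symbol + "," + t.quantity;
}
```

With this split, the writer only ever sees String (or byte[]) and the same
converter can be reused in front of other stores.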

I mentioned toString() in the proposal on the assumption that the operator
would receive byte[] or String, which gives useful output. If it gets
some other type, it can fall back to toString() to convert the tuple to
byte[] instead of throwing an error.
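
The fallback described above could look roughly like the sketch below.
TupleBytes and toBytes are hypothetical names, and UTF-8 is an assumed
encoding; the proposal does not specify one.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the fallback; not an existing API.
final class TupleBytes {
  private TupleBytes() {}

  static byte[] toBytes(Object tuple) {
    if (tuple instanceof byte[]) {
      // byte[] is passed on without any conversion
      return (byte[]) tuple;
    }
    if (tuple instanceof String) {
      // Assumption: Strings are encoded as UTF-8
      return ((String) tuple).getBytes(StandardCharsets.UTF_8);
    }
    // Any other type: fall back to toString() instead of throwing.
    // String.valueOf also maps a null tuple to the text "null".
    return String.valueOf(tuple).getBytes(StandardCharsets.UTF_8);
  }
}
```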

Questions:
1. Does it make sense to keep csv/json/avro conversion separate from
writing to HDFS?

2. To restrict the allowed types:
Should we just have two input ports, for byte[] and String respectively? By
doing this, we can formally disqualify any other type.
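
A minimal sketch of option 2, assuming each method below stands in for the
process call of one typed input port. TypedWriterSketch and its methods are
hypothetical names, the in-memory buffer stands in for the HDFS output
stream, and UTF-8 is an assumed encoding.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the writer operator; not an existing API.
final class TypedWriterSketch {
  // In-memory buffer used here in place of the real HDFS output stream
  private final ByteArrayOutputStream out = new ByteArrayOutputStream();

  // Entry point for the byte[] port: bytes are written as-is
  void processBytes(byte[] tuple) {
    out.write(tuple, 0, tuple.length);
  }

  // Entry point for the String port: encode, then reuse the byte[] path
  void processString(String tuple) {
    processBytes(tuple.getBytes(StandardCharsets.UTF_8));
  }

  // Everything written so far, for inspection
  byte[] written() {
    return out.toByteArray();
  }
}
```

Because only byte[] and String have an entry point, passing any other type
is rejected at compile time rather than handled at run time.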

~ Yogi

On 6 March 2016 at 09:58, Thomas Weise <[email protected]> wrote:

> >
> > > 1. Take any java object as input and get the bytes of the string
> > > returned from toString method on the object.
> > >
> >
> > Yes. It would allow any java object and byte[] will be derived from the
> > toString(). If input is byte[]; then it would be passed on without any
> > conversion.
> >
> >
> Relying on toString() does not seem appropriate. Since you want the
> operator configurable, why not let the user configure how to serialize the
> data. Default could be JSON.
>
> Thanks
>
