Re: Reading Parquet data from input stream and write to output stream

Ryan Blue Mon, 09 Nov 2015 09:36:44 -0800

Selina,

You should be able to write to S3 without needing to flush to an OutputStream. You would just use the S3 FileSystem to write data instead of HDFS. Parquet doesn't need to write to an OutputStream instead of a file for this to work. Is there a reason why you want to supply an OutputStream instead?
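
As a rough illustration of the point above, and assuming the writer is handed a Hadoop path whose URI scheme selects the FileSystem, the only thing that changes between HDFS and S3 is the base location. The sketch below uses only java.net.URI; the class name, the batchFile helper, the bucket and namenode names, and the topic/partition/offset layout are all hypothetical, and the s3a scheme assumes the S3 connector and credentials are already configured on the Hadoop side.

```java
import java.net.URI;

public class ParquetDestination {
    // Hypothetical helper: builds the output location for one batch of
    // Kafka messages. Only baseUri differs between HDFS and S3.
    public static URI batchFile(String baseUri, String topic, int partition, long firstOffset) {
        // Ensure the base ends with '/' so resolve() appends instead of replacing the last segment.
        String base = baseUri.endsWith("/") ? baseUri : baseUri + "/";
        // Zero-pad the offset so file names sort in offset order.
        String name = String.format("%s/partition=%d/%020d.parquet", topic, partition, firstOffset);
        return URI.create(base).resolve(name);
    }

    public static void main(String[] args) {
        URI onHdfs = batchFile("hdfs://namenode:8020/warehouse", "events", 3, 42L);
        URI onS3 = batchFile("s3a://example-bucket/warehouse", "events", 3, 42L);
        // The scheme (hdfs vs. s3a) is what tells Hadoop which FileSystem to use.
        System.out.println(onHdfs.getScheme() + " -> " + onHdfs);
        System.out.println(onS3.getScheme() + " -> " + onS3);
    }
}
```

The resulting location would then be wrapped in a Hadoop Path and passed to whichever Parquet writer you already use for HDFS, with no OutputStream involved.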


rb

On 11/05/2015 05:56 PM, Selina Tech wrote:

Dear all:

       I am wondering whether I can read from an input stream such as Kafka,
convert the data to Parquet, and write it back to an output stream. All the
examples I found convert a data file to Parquet.

       I know this feature was not available last year. How about now?

       I am trying to aggregate Kafka messages with Samza, convert them to
Parquet, and then save the result to S3. What is the best way to implement this?


Sincerely,
Selina

reference:
https://github.com/Parquet/parquet-mr/issues/231



--
Ryan Blue
Software Engineer
Cloudera, Inc.
