Selina,
You should be able to write to S3 without flushing to an output
stream. Just use the S3 FileSystem to write the data instead of
HDFS. That doesn't require Parquet to write to an OutputStream
instead of a file. Is there a reason you want to supply an output
stream instead?
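As a rough sketch of what I mean, assuming parquet-avro and a Hadoop S3
connector (hadoop-aws for s3a://) are on the classpath and credentials are
already set in the Hadoop configuration: the writer only takes a Path, and
Hadoop picks the FileSystem from the URI scheme. The class name, schema,
bucket, and key below are made up for illustration, and depending on your
Hadoop version the scheme may need to be s3n:// rather than s3a://.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.avro.AvroParquetWriter;
import org.apache.parquet.hadoop.ParquetWriter;

public class ParquetToS3Sketch {

    // Writes one batch of records as a single Parquet file at the given URI.
    // Hadoop selects the FileSystem from the URI scheme, so "hdfs://...",
    // "file:///..." and "s3a://..." all go through this same code path.
    static void writeBatch(String uri, Schema schema, List<GenericRecord> batch,
                           Configuration conf) throws IOException {
        try (ParquetWriter<GenericRecord> writer =
                 AvroParquetWriter.<GenericRecord>builder(new Path(uri))
                     .withSchema(schema)
                     .withConf(conf)
                     .build()) {
            for (GenericRecord record : batch) {
                writer.write(record);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical schema standing in for an aggregated Kafka message.
        Schema schema = SchemaBuilder.record("Message").fields()
            .requiredString("key")
            .requiredLong("timestamp")
            .endRecord();

        GenericRecord record = new GenericData.Record(schema);
        record.put("key", "example");
        record.put("timestamp", System.currentTimeMillis());

        // Hypothetical bucket and key; pass a different URI as the first argument.
        String uri = args.length > 0
            ? args[0]
            : "s3a://example-bucket/messages/part-00000.parquet";

        writeBatch(uri, schema, Arrays.asList(record), new Configuration());
    }
}
```

In a Samza job you would buffer messages and call something like writeBatch
once per batch with a fresh file name, since the writer refuses to overwrite
an existing file by default.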
rb
On 11/05/2015 05:56 PM, Selina Tech wrote:
Dear all:
I am wondering whether I could read an input stream such as Kafka, convert
it to Parquet data, and write it back to an output stream. All the examples
I found convert a data file to Parquet data.
I know this feature was not available last year. How about now?
I am trying to aggregate Kafka messages with Samza, convert them to
Parquet data, and then save them to S3. What is the best way to implement this?
Sincerely,
Selina
reference:
https://github.com/Parquet/parquet-mr/issues/231
--
Ryan Blue
Software Engineer
Cloudera, Inc.