Hello,

I'm wondering whether it is possible to create a streaming Parquet file
sink in PyFlink (Table API) or in Java Flink (DataStream API).

To give an example of the expected behaviour: each element of the stream
contains a JSON string. I want to write this stream to Parquet files
without having to explicitly define the schema/types of the messages
(and using a single sink).
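The crux, I think, is inferring the schema from the messages themselves
rather than declaring it up front. As a rough illustration of what I mean
(plain Python, no Flink; the type names and the fallback rule are just my
assumptions, not any library's API):

```python
import json

# Map a JSON value to a coarse type name, roughly what a Parquet
# writer would need in order to pick a physical column type.
# (bool is checked before int because bool subclasses int in Python.)
def json_type(value):
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int64"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    return "json"  # nested objects/arrays kept as raw JSON text

# Merge the fields seen across a batch of JSON messages into one
# schema, so nothing has to be declared up front. On a type conflict
# between messages, fall back to string.
def infer_schema(messages):
    schema = {}
    for msg in messages:
        for key, value in json.loads(msg).items():
            t = json_type(value)
            if schema.get(key, t) != t:
                schema[key] = "string"
            else:
                schema[key] = t
    return schema

batch = ['{"id": 1, "name": "a"}', '{"id": 2, "score": 0.5}']
print(infer_schema(batch))
# → {'id': 'int64', 'name': 'string', 'score': 'double'}
```

Something like this would have to run per checkpoint/batch inside the
sink, since the schema can only be fixed once a file is opened.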

If this is possible (perhaps in Java Flink using a custom
ParquetBulkWriterFactory or similar), any direction for the
implementation would be appreciated.

Best regards,
Kamil