Hey Ryan,

I've never encountered a use case for writing Protobuf-encoded files to a
filesystem.

Best regards,

Martijn

On Fri, May 26, 2023 at 6:39 PM Ryan Skraba via user <user@flink.apache.org>
wrote:

> Hello all!
>
> I discovered while investigating FLINK-32008[1] that we can write to the
> filesystem connector with the protobuf format, but today, the resulting
> file is pretty unlikely to be useful or rereadable.
>
> There's no real standard for storing many protobuf messages in a single
> file container, although the documentation mentions writing size-delimited
> messages sequentially[2].  In practice, I've never encountered protobuf
> binaries stored on filesystems without using some other sort of "framing"
> (like how Parquet can be accessed with either an Avro- or a
> protobuf-oriented API).
>
> Does anyone have any use cases for bulk storage of protobuf messages on a
> filesystem?  Should these files just be considered temporary storage for
> Flink jobs, or do they need to be compatible with other systems?  Is there
> a splittable / compressible file format?
>
> The alternative might be to just forbid file storage for protobuf
> messages!  Any opinions?
>
> All my best, Ryan Skraba
>
> [1]: https://issues.apache.org/jira/browse/FLINK-32008
> [2]: https://protobuf.dev/programming-guides/techniques/#streaming
>
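
For readers unfamiliar with the size-delimited technique mentioned in [2],
here is a minimal sketch in Python. It is an illustration only, not a
description of what the Flink filesystem connector writes. The names
write_delimited and read_delimited are hypothetical, the payloads are plain
bytes standing in for already-serialized messages, and the varint length
prefix is an assumption, chosen because it is the convention used by
protobuf's Java writeDelimitedTo helper.

```python
import io
from typing import BinaryIO, Iterator


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one record as a base-128 varint length followed by the payload."""
    size = len(payload)
    while True:
        byte = size & 0x7F
        size >>= 7
        if size:
            stream.write(bytes([byte | 0x80]))  # high bit set: more length bytes follow
        else:
            stream.write(bytes([byte]))  # last length byte
            break
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield payloads until the stream ends cleanly on a record boundary."""
    while True:
        size, shift = 0, 0
        while True:
            raw = stream.read(1)
            if not raw:
                if shift == 0:
                    return  # end of file exactly between two records
                raise EOFError("truncated length prefix")
            size |= (raw[0] & 0x7F) << shift
            if not raw[0] & 0x80:
                break
            shift += 7
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("truncated record")
        yield payload


# Round trip through an in-memory "file"; the byte strings are placeholders
# for serialized protobuf messages.
buffer = io.BytesIO()
for record in (b"first", b"", b"x" * 300):
    write_delimited(buffer, record)
buffer.seek(0)
records = list(read_delimited(buffer))
```

A file framed this way has no sync markers or block index, so a reader can
only locate record boundaries by scanning from the beginning. That is the
gap a container format fills.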
