Hey Ryan,

I've never encountered a use case for writing Protobuf-encoded files to a filesystem.

Best regards,
Martijn

On Fri, May 26, 2023 at 6:39 PM Ryan Skraba via user <user@flink.apache.org> wrote:

> Hello all!
>
> I discovered while investigating FLINK-32008[1] that we can write to the
> filesystem connector with the protobuf format, but today, the resulting
> file is pretty unlikely to be useful or rereadable.
>
> There's no real standard for storing many protobuf messages in a single
> file container, although the documentation mentions writing size-delimited
> messages sequentially[2]. In practice, I've never encountered protobuf
> binaries stored on filesystems without using some other sort of "framing"
> (like how parquet can be accessed with either an Avro or a protobuf
> oriented API).
>
> Does anyone have any use cases for bulk storage of protobuf messages on a
> filesystem? Should these files just be considered temporary storage for
> Flink jobs, or do they need to be compatible with other systems? Is there
> a splittable / compressible file format?
>
> The alternative might be to just forbid file storage for protobuf
> messages! Any opinions?
>
> All my best, Ryan Skraba
>
> [1]: https://issues.apache.org/jira/browse/FLINK-32008
> [2]: https://protobuf.dev/programming-guides/techniques/#streaming
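
For anyone following the thread, here is a rough sketch of the size-delimited approach mentioned in [2]: each record is written as a length prefix followed by the serialized bytes. This is only an illustration, not what the Flink filesystem connector does. The function names (encode_varint, write_delimited, read_delimited) are made up for this example, the payloads are plain bytes standing in for serialized messages, and the choice of a varint length prefix is an assumption (as far as I know it matches what the Java writeDelimitedTo helper emits).

```python
import io
from typing import BinaryIO, Iterator


def encode_varint(value: int) -> bytes:
    """Encode a non-negative int as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("length must be non-negative")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


def write_delimited(stream: BinaryIO, payload: bytes) -> None:
    """Write one record: varint length prefix, then the payload bytes."""
    stream.write(encode_varint(len(payload)))
    stream.write(payload)


def read_delimited(stream: BinaryIO) -> Iterator[bytes]:
    """Yield payloads until a clean end of stream at a record boundary."""
    while True:
        size = 0
        shift = 0
        first = True
        while True:
            byte = stream.read(1)
            if not byte:
                if first:
                    return  # end of stream exactly between records
                raise EOFError("truncated length prefix")
            first = False
            size |= (byte[0] & 0x7F) << shift
            if not (byte[0] & 0x80):
                break
            shift += 7
        payload = stream.read(size)
        if len(payload) != size:
            raise EOFError("truncated record")
        yield payload


if __name__ == "__main__":
    # Hypothetical payloads; in practice these would be serialized messages.
    records = [b"first", b"", b"x" * 200]
    buf = io.BytesIO()
    for record in records:
        write_delimited(buf, record)
    buf.seek(0)
    assert list(read_delimited(buf)) == records
```

Note that this framing carries no sync markers or block index, so a reader has to scan from the start of the file. On its own it does not answer the "splittable" part of the question.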