Re: [DISCUSS] Add support for Apache Arrow format

2023-04-13 Thread Aitozi
> Which connectors would be commonly used when reading in Arrow format? Filesystem?
Currently, yes. Ideally it could be combined with different connectors, but I have not figured out how to integrate the Arrow format deserializer with the `DecodingFormat` or `DeserializationSchema`

Re: [DISCUSS] Add support for Apache Arrow format

2023-04-12 Thread Martijn Visser
Which connectors would be commonly used when reading in Arrow format? Filesystem?
On Wed, Apr 12, 2023 at 4:27 AM Jacky Lau wrote:
> Hi
> I also think arrow format will be useful when reading/writing with message queue.
> Arrow defines a language-independent columnar memory format for

Re: [DISCUSS] Add support for Apache Arrow format

2023-04-11 Thread Jacky Lau
Hi, I also think the Arrow format will be useful when reading from and writing to message queues. Arrow defines a language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware like CPUs and GPUs. The Arrow memory format also
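As a rough illustration of what "columnar" means in the message above, the following Python sketch pivots a few row records into one contiguous typed buffer per column. It uses only the standard library and is not Arrow's actual memory layout, which additionally defines validity bitmaps and offset buffers. The names `rows`, `to_columns` and `columns` are made up for this example.

```python
from array import array

# Row-oriented layout: one record per element, fields interleaved.
rows = [
    {"id": 1, "price": 9.5},
    {"id": 2, "price": 3.0},
    {"id": 3, "price": 7.25},
]


def to_columns(records):
    """Pivot row records into one contiguous typed buffer per column.

    Hypothetical helper for illustration only; it assumes every record
    has an integer "id" and a float "price".
    """
    return {
        "id": array("q", (r["id"] for r in records)),        # 64-bit signed ints
        "price": array("d", (r["price"] for r in records)),  # 64-bit floats
    }


columns = to_columns(rows)

# An analytic operation such as a sum scans a single contiguous buffer
# instead of visiting every field of every record.
total_price = sum(columns["price"])
```

Because each column is a flat buffer of fixed-width values, an aggregate over `price` never touches `id`, which is the property that makes this kind of layout attractive for analytic scans.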

Re: [DISCUSS] Add support for Apache Arrow format

2023-04-02 Thread Aitozi
Hi all, thanks for your input.
@Ran
> However, as mentioned in the issue you listed, it may take a lot of work and the community's consideration for integrating Arrow.
To clarify, this proposal solely aims to introduce flink-arrow as a new format, similar to flink-csv and flink-protobuf. It

Re: [DISCUSS] Add support for Apache Arrow format

2023-03-30 Thread Jim Hughes
Hi all, How do Flink formats relate to or interact with Paimon (formerly Flink-Table-Store)? If the Flink format interface is used there, then it may be useful to consider Arrow along with other columnar formats. Separately, from previous experience, I've seen the Arrow format be useful as an

Re: [DISCUSS] Add support for Apache Arrow format

2023-03-30 Thread Martijn Visser
Hi, To be honest, I haven't seen that much demand for supporting the Arrow format directly in Flink as a flink-format. I'm wondering if there's really much benefit for the Flink project to add another file format, over properly supporting the format that we already have in the project. Best

Re: [DISCUSS] Add support for Apache Arrow format

2023-03-30 Thread Ran Tao
Integrating Apache Arrow into Flink as a format is a good idea. Arrow can take advantage of SIMD-specific or vectorized optimizations, which should be of great benefit to batch tasks. However, as mentioned in the issue you listed, it may take a lot of work and the community's consideration for

Re: [Discuss] Add support for Apache Arrow

2019-04-11 Thread Flavio Pompermaier
Very BIG +1 for adoption of Apache Arrow. This would greatly simplify integration with other tools.
On Thu, Apr 11, 2019 at 2:21 PM Run wrote:
> Hi guys,
>
> Apache Arrow provides a cross-language, standardized, columnar, memory format for data.
> So it is highly desirable to import Arrow