Very BIG +1 for adoption of Apache Arrow. This would greatly simplify
integration with other tools.

On Thu, Apr 11, 2019 at 2:21 PM Run <fan_li...@foxmail.com> wrote:

> Hi guys,
>
>
> Apache Arrow provides a cross-language, standardized, columnar in-memory
> format for data.
> So it is highly desirable to bring Arrow into Flink and make use of its
> memory layout and memory management facilities.
> More background on this can be found in
> https://issues.apache.org/jira/browse/FLINK-10929
>
>
> The problem is that incorporating Arrow into Flink is a big move and may
> affect many modules of Flink.
> In our efforts to vectorize Blink batch jobs, we have tried incorporating
> Arrow into Flink. Those were incremental changes that can easily be turned
> off with a single flag, and they are transparent to other parts of
> the code base.
>
>
> This is the draft of our design document:
>
> https://docs.google.com/document/d/1LstaiGzlzTdGUmyG_-9GSWleKpLRid8SJWXQCTqcQB4/edit?usp=sharing
>
>
> For the first step, I suggest providing a flag, disabled by default,
> that lets the MemoryManager depend on the Arrow Buffer Allocator.
> With this change, all MemorySegments will be based on Arrow buffers, but
> this is transparent to other components and should never break them.
>
>
> After this step, we can apply the changes described in the design
> document incrementally.
> These are our initial thoughts. We would appreciate your comments.
>
>
> Best,
> Liya Fan
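
To make the first step of the quoted proposal a bit more concrete, here is a
minimal sketch of a flag that is off by default and switches the buffer source
behind a memory manager. Every name in it (SegmentAllocator,
HeapSegmentAllocator, ArrowLikeSegmentAllocator, SketchMemoryManager) is
hypothetical and is not a Flink or Arrow API. A direct ByteBuffer stands in for
an Arrow-allocated buffer so that the example runs with only the JDK.

```java
import java.nio.ByteBuffer;

/** Hypothetical pluggable buffer source; not a Flink or Arrow interface. */
interface SegmentAllocator {
    ByteBuffer allocate(int size);
    String name();
}

/** Default path: plain heap buffers. */
final class HeapSegmentAllocator implements SegmentAllocator {
    public ByteBuffer allocate(int size) {
        return ByteBuffer.allocate(size);
    }
    public String name() {
        return "heap";
    }
}

/**
 * Stand-in for an allocator that would delegate to Arrow.
 * Assumption: a direct ByteBuffer is used here only as a placeholder.
 */
final class ArrowLikeSegmentAllocator implements SegmentAllocator {
    public ByteBuffer allocate(int size) {
        return ByteBuffer.allocateDirect(size);
    }
    public String name() {
        return "arrow-like";
    }
}

/** Hypothetical memory manager whose buffer source is chosen by one flag. */
final class SketchMemoryManager {
    private final SegmentAllocator allocator;

    /** Callers that do not pass a flag get the existing (heap) behavior. */
    SketchMemoryManager() {
        this(false);
    }

    SketchMemoryManager(boolean arrowEnabled) {
        this.allocator = arrowEnabled
                ? new ArrowLikeSegmentAllocator()
                : new HeapSegmentAllocator();
    }

    /** Callers see only a ByteBuffer, whichever allocator is active. */
    ByteBuffer allocateSegment(int size) {
        return allocator.allocate(size);
    }

    String allocatorName() {
        return allocator.name();
    }
}
```

Code that calls allocateSegment does not need to know which allocator is
active, which is the sense in which such a change could stay transparent to
other components.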
