This refers to the Java implementation. Our backend is written primarily in
Java, so we decided to prototype our integration there. To illustrate,
suppose we want to create a shared file in memory (e.g. under /dev/shm) that
FPGAs will later access via DMA. When we populate an Arrow schema with the
current Java implementation, the addresses of the data buffers backing
BaseValueVectors are not page-aligned. We wanted those addresses to be
page-aligned so that we could natively mmap() them to shared files in
/dev/shm. Alternatively, we would also benefit from being able to use
MemoryMappedFiles as data buffers.
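
To make the idea concrete, here is a minimal sketch in plain java.nio, not
Arrow's API and not our actual patch. It rounds a buffer size up to a whole
number of pages and maps a file of that length, falling back to the JVM temp
directory if /dev/shm is not available. The class name, helper names and the
fixed 4096-byte PAGE_SIZE are assumptions made for illustration; wiring such
a mapping into a BaseValueVector is not shown.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

public class PageAlignedMapping {
    // Assumed page size (4KB, as discussed in this thread); not queried from the OS.
    public static final long PAGE_SIZE = 4096;

    /** Rounds size up to the next multiple of alignment (must be a power of two). */
    public static long alignUp(long size, long alignment) {
        return (size + alignment - 1) & ~(alignment - 1);
    }

    /** Maps the file read-write, with its length padded to whole pages. */
    public static MappedByteBuffer mapShared(Path file, long bytes) throws IOException {
        long length = alignUp(bytes, PAGE_SIZE);
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // READ_WRITE mapping grows the file to `length` if it is shorter;
            // the mapping remains valid after the channel is closed.
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, length);
        }
    }

    public static void main(String[] args) throws IOException {
        // Prefer /dev/shm (Linux); otherwise fall back to the temp directory.
        Path dir = Paths.get("/dev/shm");
        if (!Files.isDirectory(dir) || !Files.isWritable(dir)) {
            dir = Paths.get(System.getProperty("java.io.tmpdir"));
        }
        Path file = Files.createTempFile(dir, "column-", ".buf");
        try {
            // Room for 1000 ints, padded up to a page boundary.
            MappedByteBuffer buf = mapShared(file, 1000L * Integer.BYTES);
            buf.putInt(0, 42);
            System.out.println("mapped " + buf.capacity() + " bytes at " + file);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

A native process (or a DMA driver) could then mmap() the same file; since the
operating system hands out mappings at page granularity, the start of the
mapped region is page-aligned.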

Dimitris

On Wed, Mar 20, 2019 at 11:17 AM Micah Kornfield <[email protected]>
wrote:

> Hi Dimitris,
> This sounds interesting.  Is this for the C++ implementation?  Could you go
> into a little bit more detail?  How would this differ from using a
> MemoryPool implementation that always aligns to 4KB boundaries (instead of
> the 64 byte boundaries the default one does today [1])?
>
> Thanks,
> Micah
>
> [1]
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/memory_pool.cc#L40
>
> On Wed, Mar 20, 2019 at 1:53 AM Dimitris Lekkas <[email protected]>
> wrote:
>
> > Hello folks,
> >
> > I am working at Inaccel where we utilize FPGAs to accelerate machine
> > learning workloads. Recently, we wanted to integrate our platform with
> > Arrow and we stumbled upon the non-alignment of data-buffers to page
> > boundaries (4KB). We implemented the option to supply per-column metadata
> > to page-align column vectors and we later memory mapped those columns
> > using native mmap() calls. Accelerators rely on page alignment to perform
> > I/O efficiently, so such an option might be an interesting addition to the
> > project.
> > In case you are interested, I will create a PR.
> >
> > Dimitris
> >
>
