> as far as we can tell, this filesystem layer
> is unaware of expressions, record batches, etc

You're correct that the filesystem layer doesn't directly support
Expressions. However, the datasets API includes the Partitioning classes,
which embed expressions in paths. Depending on which expressions you need
to embed, you could implement a RadosFileSystem class that wraps an IO
context and treats object names as paths. If the RADOS objects contain
Arrow-formatted data, then a FileSystemDataset (using IpcFileFormat) can be
constructed which views the IO context and exploits the partitioning
information embedded in object names to accelerate filtering. Does that
accommodate your use case?
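To make that concrete, here is an untested sketch of the discovery path.
RadosFileSystem (and the idea of exposing object names like
"year=2020/part-0" as paths) is hypothetical; FileSystemDatasetFactory,
HivePartitioning, and IpcFileFormat are the existing datasets APIs:

```cpp
// Untested sketch: "RadosFileSystem" is hypothetical; everything under
// arrow::dataset below is the existing datasets API.
#include <arrow/api.h>
#include <arrow/dataset/discovery.h>
#include <arrow/dataset/file_ipc.h>
#include <arrow/dataset/partition.h>
#include <arrow/filesystem/filesystem.h>

namespace ds = arrow::dataset;
namespace fs = arrow::fs;

// `rados_fs` would be an instance of the hypothetical RadosFileSystem,
// wrapping an IO context and exposing object names as paths.
arrow::Result<std::shared_ptr<ds::Dataset>> MakeRadosDataset(
    std::shared_ptr<fs::FileSystem> rados_fs) {
  fs::FileSelector selector;
  selector.base_dir = "/";
  selector.recursive = true;

  ds::FileSystemFactoryOptions options;
  // Recover guarantees such as ("year" == 2020) from object names,
  // letting scans skip objects excluded by a filter.
  options.partitioning = std::make_shared<ds::HivePartitioning>(
      arrow::schema({arrow::field("year", arrow::int32())}));

  ARROW_ASSIGN_OR_RAISE(
      auto factory,
      ds::FileSystemDatasetFactory::Make(
          rados_fs, selector, std::make_shared<ds::IpcFileFormat>(),
          options));
  return factory->Finish();
}
```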

> Our main concern is that this new arrow::dataset::RadosFormat class will
> be deriving from the arrow::dataset::FileFormat class, which seems to
> raise a conceptual mismatch as there isn’t really a RADOS format

IIUC RADOS doesn't interact with a filesystem directly, so RadosFileFormat
would indeed be a conceptually problematic point of extension. If a RADOS
file system is not viable, then I think the ideal approach would be to
directly implement the Fragment [1] and Dataset [2] interfaces, forgoing a
FileFormat implementation altogether. Unfortunately, the only example we
have of this approach is InMemoryFragment, which simply wraps a vector of
record batches.

[1]
https://github.com/apache/arrow/blob/975f4eb/cpp/src/arrow/dataset/dataset.h#L45-L90
[2]
https://github.com/apache/arrow/blob/975f4eb/cpp/src/arrow/dataset/dataset.h#L119-L158
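A direct Fragment implementation might start along these lines. This is an
untested sketch: RadosFragment and its members (e.g. `object_id_`) are
hypothetical, and the authoritative set of virtuals to override is in
dataset.h at [1]:

```cpp
// Untested sketch: RadosFragment is hypothetical; ds::Fragment and
// ds::ScanTaskIterator are the interfaces linked above.
#include <arrow/dataset/dataset.h>
#include <arrow/dataset/scanner.h>

namespace ds = arrow::dataset;

class RadosFragment : public ds::Fragment {
 public:
  RadosFragment(std::shared_ptr<arrow::Schema> schema, std::string object_id)
      : schema_(std::move(schema)), object_id_(std::move(object_id)) {}

  // Each ScanTask yielded here would forward the scan (filter and
  // projection) to the storage node holding the object, rather than
  // reading bytes locally and scanning on the client.
  arrow::Result<ds::ScanTaskIterator> Scan(
      std::shared_ptr<ds::ScanOptions> options,
      std::shared_ptr<ds::ScanContext> context) override;

  std::string type_name() const override { return "rados"; }

  // (Remaining virtual methods declared in dataset.h omitted for brevity.)

 private:
  std::shared_ptr<arrow::Schema> schema_;
  std::string object_id_;  // hypothetical: RADOS object backing this fragment
};
```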


On Fri, Aug 28, 2020 at 1:27 PM Ivo Jimenez <ivo.jime...@gmail.com> wrote:

> Hi Antoine
>
> > > Yes, that is our plan. Since this is going to be done on the
> > > storage-/server-side, this would be transparent to the client. So our
> > > main concern is whether this would be OK from the design perspective,
> > > and could this eventually be merged upstream?
> >
> > Arrow datasets have no notion of client and server, so I'm not sure
> > what you mean here.
>
>
> Sorry for the confusion. This is where we see a mismatch between the
> current design and what we are trying to achieve.
>
> Our goal is to push down computations into a cloud storage system. By
> pushing we mean actually sending computation tasks to storage nodes
> (e.g. filters executing on storage nodes). Ideally this would be done by
> implementing a new plugin for arrow::fs, but as far as we can tell, this
> filesystem layer is unaware of expressions, record batches, etc., so this
> information cannot be communicated down to storage.
>
> So what we thought would work is to implement this at the Dataset API
> level, with a scanner (and writer) that defer these operations to storage
> nodes. For example, the RadosScanTask class would ask a storage node to
> actually perform a scan and fetch the result, as opposed to doing the
> scan locally.
>
> We would immensely appreciate it if you could let us know if the above is
> OK, or if you think there is a better alternative for accomplishing this,
> as we would rather implement this functionality in a way that is
> compatible with your overall vision.
>
>
> > Do you simply mean contributing RadosFormat to the Arrow
> > codebase?
>
>
> Yes, so that others wanting to achieve this on a Ceph cluster could
> leverage this as well.
>
>
> > I would say that depends on the required dependencies, and
> > ease of testing (and/or CI) for other developers.
>
>
> OK, yes we will pay attention to these aspects as part of an eventual PR.
> We will include tests and ensure that CI covers the changes we introduce.
>
> thanks!
>
