Hi Jayjeet,

I've successfully validated basic functions based on the links you provided, on both Arm64 and x86, with binaries built from your PR. Everything looks fine. From perf, I can see Arrow code running actively on the Ceph OSD nodes.

Currently, I have deployed and tested on 4 VMs. For performance evaluation, I will move to bare metal servers with NVMe drives. Do you have test cases for benchmarking? Thanks.

Yibo

On 6/8/21 4:18 PM, Jayjeet Chakraborty wrote:
Hi Yibo,

Thanks a lot for your interest in our work. Please refer to this [1] guide to
deploy a complete environment on a cluster of nodes. Regarding your comment
about a Ceph patch, the Arrow object class that we implement is actually a
plugin and does not require the Ceph source tree for building or maintaining
it. It only requires the rados-objclass-dev package as a dependency. We provide
the CMake option ARROW_CLS to allow an optional build of the object class
plugin. Please let us know if you have any questions or comments. We really
appreciate you taking the time to look into our project. Thanks.

Best,
Jayjeet

[1] 
https://github.com/uccross/skyhookdm-arrow/blob/arrow-master/cpp/src/arrow/adapters/arrow-rados-cls/docs/deploy.md

On 2021/06/07 10:36:08, Yibo Cai <yibo....@arm.com> wrote:
Hi Jayjeet,

It is exciting to see a real-world computational storage solution built
upon Arrow and Ceph. Amazing work!

We are interested in this project (I'm from the Arm open source software
team, focusing on storage and big data OSS), and would like to reproduce
your work first, then evaluate performance on the Arm platform.

I went through your Arrow PR; it looks great. IIUC, there should be a
corresponding Ceph patch implementing the object class with Arrow.

I wonder what the best approach is to deploy a complete environment for a
quick evaluation. Any comments are welcome. Thanks.

Yibo

On 6/2/21 3:42 AM, Jayjeet Chakraborty wrote:
Dear Arrow Community,

In our previous discussion, we planned on implementing a new Dataset API
like InMemoryDataset to interact with objects containing IPC data stored in
Ceph/RADOS <https://ceph.io/>. We had implemented this design and raised a
PR <https://github.com/apache/arrow/pull/8647>. But when we started adding
the dataset discovery functionality, we found ourselves reimplementing
filesystem abstractions and their metadata management. We closed the original
PR and raised a new PR <https://github.com/apache/arrow/pull/10431> in which
we redesigned our implementation to use the Ceph filesystem as our file I/O
interface, since it provides fast metadata support via the Ceph metadata
servers (MDS). We also decided to store data using one of the file formats
supported by Arrow; one of our driving use cases favored Parquet.

Since we perform the scan operation inside the storage layer using Ceph
object class
<https://docs.ceph.com/en/latest/rados/api/objclass-sdk/#:~:text=Ceph%20can%20be%20extended%20by,object%20classes%20within%20the%20tree.>
methods, which need to be invoked directly on objects, we utilize the
striping strategy information provided by CephFS to translate a filename in
CephFS to an object ID in RADOS. To make this mapping one-to-one, we split
Parquet files in a manner similar to how Spark splits Parquet files for HDFS
and ensure that each fragment is backed by a single RADOS object.
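
To make the splitting concrete, here is a minimal sketch (not the project's
actual tooling; the helper name, file naming, and rows-per-file figure are
assumptions) of writing each fragment as its own small Parquet file so that
CephFS striping can back it with a single RADOS object:

    # Sketch only: split a table into small Parquet files, one fragment per
    # file. Paths and sizing are assumptions; the real splitting is sized
    # against the CephFS object/stripe size.
    import pyarrow as pa
    import pyarrow.parquet as pq

    def write_split_parquet(table: pa.Table, prefix: str, rows_per_file: int):
        """Write `table` as several Parquet files, each intended to map onto
        exactly one RADOS object."""
        for i, offset in enumerate(range(0, table.num_rows, rows_per_file)):
            chunk = table.slice(offset, rows_per_file)
            pq.write_table(chunk, f"{prefix}.part{i}.parquet")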

In the new PR, we extend the FileFormat interface to create a
RadosParquetFileFormat
<https://github.com/uccross/skyhookdm-arrow/blob/arrow-master/cpp/src/arrow/dataset/file_rados_parquet.h#L129>
that offloads Parquet file scan operations to the RADOS layer in Ceph. Since
we now utilize a filesystem interface, we can just use the FileSystemDataset
API and plug in our new format to offload scan operations. We have also added
Python bindings for the new APIs that we implemented. In all, our patch
consists of only around 3,000 LoC and introduces new dependencies only on
Ceph’s librados and the object class SDK (which can be disabled via CMake
flags).
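
For intuition, a hypothetical Python-level scan might look like the sketch
below; the CephFS mount path, dataset name, and the RadosParquetFileFormat
binding shown in the comment are assumptions based on this description, not a
confirmed API:

    # Hypothetical usage sketch; paths and dataset name are assumptions.
    import pyarrow.dataset as ds

    # Plain Parquet: the client pulls whole files over the network and scans
    # them locally.
    dataset = ds.dataset("/mnt/cephfs/nyc-taxi", format="parquet")

    # With the proposed format plugged into the same FileSystemDataset
    # machinery, the projection and filter below would instead be evaluated
    # by the object class inside the Ceph OSDs, e.g. (assumed binding):
    #   dataset = ds.dataset("/mnt/cephfs/nyc-taxi",
    #                        format=RadosParquetFileFormat(...))
    table = dataset.to_table(columns=["passenger_count"],
                             filter=ds.field("passenger_count") > 2)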

We have added an architecture document
<https://github.com/uccross/skyhookdm-arrow/blob/rados-parquet-pr/cpp/src/arrow/adapters/arrow-rados-cls/docs/architecture.md>
with our PR which describes the overall architecture along with the life of a
dataset scan when using RadosParquet. Additionally, we recently wrote up a
paper <https://arxiv.org/abs/2105.09894> describing our design and
implementation along with some initial benchmarks. We plan to raise a PR
<https://github.com/apache/arrow/pull/10431> to upstream our format to
apache/arrow soon, and hence look forward to your comments and thoughts on
this new feature. Please let us know if you have any questions. Thank you.

Best regards,

Jayjeet Chakraborty

On 2020/09/15 18:06:56, Micah Kornfield <emkornfi...@gmail.com> wrote:
gmock is already a dependency. We haven't upgraded gmock/gtest in a while;
we might want to consider doing that (but this is orthogonal).

On Tue, Sep 15, 2020 at 10:16 AM Antoine Pitrou <anto...@python.org> wrote:


Hi Ivo,

You can open a JIRA once you've got a PR ready.  No need to do it before
you think you're ready for submission.

AFAIK, gmock is already a dependency.

Regards

Antoine.



On 15/09/2020 at 18:49, Ivo Jimenez wrote:
Hi again,

We noticed in the contribution guidelines that there needs to be an issue in
JIRA for every PR. Should we open one for the eventual PR for the work we're
doing on implementing the dataset on Ceph's RADOS?

Also, on a related note, we would like to mock the RADOS client so that we
can integrate it into CI tests. Would it be OK to include gmock as a
dependency?

thanks!

On 2020/09/02 22:05:51, Ivo Jimenez <ivo.jime...@gmail.com> wrote:
Hi Ben,


Our main concern is that this new arrow::dataset::RadosFormat class will be
deriving from the arrow::dataset::FileFormat class, which seems to raise a
conceptual mismatch as there isn’t really a RADOS format

IIUC RADOS doesn't interact with a filesystem directly, so RadosFileFormat
would indeed be a conceptually problematic point of extension. If a RADOS
file system is not viable then I think the ideal approach would be to
directly implement the Fragment [1] and Dataset [2] interfaces, forgoing a
FileFormat implementation altogether. Unfortunately the only example we have
of this approach is InMemoryFragment, which simply wraps a vector of record
batches.


This is what we will go with, as this seems to be the quickest way for us to
have a PoC and start experimenting with this.

Thanks a lot for the invaluable feedback! 🙏




