lidavidm commented on a change in pull request #12620: URL: https://github.com/apache/arrow/pull/12620#discussion_r830644580
########## File path: docs/source/cpp/examples/dataset_skyhook_scan_example.rst ########## @@ -0,0 +1,85 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Arrow Skyhook example +========================= Review comment: ```suggestion ===================== Arrow Skyhook example ===================== ``` ########## File path: docs/source/cpp/examples/dataset_skyhook_scan_example.rst ########## @@ -0,0 +1,85 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. 
specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Arrow Skyhook example +========================= + +The file ``cpp/examples/arrow/dataset_skyhook_scan_example.cc`` +located inside the source tree contains an example of using Skyhook to +offload filters and projections to a Ceph cluster. + +Instuctions +-------------------- + +1. Install Ceph and Skyhook dependencies. +.. code-block:: bash + + apt update + apt install -y cmake \ + libradospp-dev \ + rados-objclass-dev \ + ceph \ + ceph-common \ + ceph-osd \ + ceph-mon \ + ceph-mgr \ + ceph-mds \ + rbd-mirror \ + ceph-fuse + +2. Build and install Skyhook. + +.. code-block:: bash + + git clone https://github.com/apache/arrow + cd arrow/ + mkdir -p cpp/release + cd cpp/release + cmake -DARROW_SKYHOOK=ON \ + -DARROW_PARQUET=ON \ + -DARROW_WITH_SNAPPY=ON \ + -DARROW_BUILD_EXAMPLES=ON \ + -DARROW_DATASET=ON \ + -DARROW_CSV=ON \ + -DARROW_WITH_LZ4=ON \ + .. + + make -j${nproc} install + cp release/libcls_skyhook.so /usr/lib/x86_64-linux-gnu/rados-classes/ + +3. Deploy a Ceph cluster with a single in-memory OSD. + +.. code-block:: bash + + mkdir -p /tmp/skyhook + ../examples/scripts/micro-osd.sh /tmp/skyhook + +4. Generate an example dataset. + +.. code-block:: bash + + python3 ../../ci/scripts/generate_dataset.py + cp -r nyc /mnt/cephfs/ + +5. Compile and Run the example. + +.. code-block:: bash + g++ -std=c++11 ../examples/arrow/dataset_skyhook_scan_example.cc -larrow -larrow_dataset -larrow_skyhook -o skyhook_example Review comment: This should just be `make dataset_skyhook_scan_example`, the Arrow build system will build the examples ########## File path: docs/source/cpp/examples/dataset_skyhook_scan_example.rst ########## @@ -0,0 +1,85 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. 
regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Arrow Skyhook example +========================= + +The file ``cpp/examples/arrow/dataset_skyhook_scan_example.cc`` +located inside the source tree contains an example of using Skyhook to +offload filters and projections to a Ceph cluster. + +Instuctions +-------------------- + +1. Install Ceph and Skyhook dependencies. +.. code-block:: bash + + apt update Review comment: Is this for Debian or Ubuntu? Can we mention the distro's version? ########## File path: cpp/examples/scripts/micro-osd.sh ########## @@ -0,0 +1,103 @@ +# +# Copyright (C) 2013,2014 Loic Dachary <[email protected]> +# +# This program is free software: you can redistribute it and/or modify +# it under the terms of the GNU Affero General Public License as published by +# the Free Software Foundation, either version 3 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU Affero General Public License for more details. +# +# You should have received a copy of the GNU Affero General Public License +# along with this program. If not, see <http://www.gnu.org/licenses/>. +# Review comment: We can't commit an AGPL-licensed script into our repo. 
If it's available somewhere else, we can just remove it from here and link to it from the docs instead? ########## File path: docs/source/cpp/examples/dataset_skyhook_scan_example.rst ########## @@ -0,0 +1,85 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Arrow Skyhook example +========================= + +The file ``cpp/examples/arrow/dataset_skyhook_scan_example.cc`` +located inside the source tree contains an example of using Skyhook to +offload filters and projections to a Ceph cluster. + +Instuctions +-------------------- Review comment: ```suggestion Instuctions =========== ``` ########## File path: docs/source/cpp/examples/dataset_skyhook_scan_example.rst ########## @@ -0,0 +1,85 @@ +.. Licensed to the Apache Software Foundation (ASF) under one +.. or more contributor license agreements. See the NOTICE file +.. distributed with this work for additional information +.. regarding copyright ownership. The ASF licenses this file +.. to you under the Apache License, Version 2.0 (the +.. "License"); you may not use this file except in compliance +.. with the License. You may obtain a copy of the License at + +.. http://www.apache.org/licenses/LICENSE-2.0 + +.. 
Unless required by applicable law or agreed to in writing, +.. software distributed under the License is distributed on an +.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +.. KIND, either express or implied. See the License for the +.. specific language governing permissions and limitations +.. under the License. + +.. default-domain:: cpp +.. highlight:: cpp + +Arrow Skyhook example +========================= + +The file ``cpp/examples/arrow/dataset_skyhook_scan_example.cc`` +located inside the source tree contains an example of using Skyhook to +offload filters and projections to a Ceph cluster. + +Instuctions +-------------------- + +1. Install Ceph and Skyhook dependencies. +.. code-block:: bash + + apt update + apt install -y cmake \ + libradospp-dev \ + rados-objclass-dev \ + ceph \ + ceph-common \ + ceph-osd \ + ceph-mon \ + ceph-mgr \ + ceph-mds \ + rbd-mirror \ + ceph-fuse + +2. Build and install Skyhook. + +.. code-block:: bash + + git clone https://github.com/apache/arrow + cd arrow/ + mkdir -p cpp/release + cd cpp/release + cmake -DARROW_SKYHOOK=ON \ + -DARROW_PARQUET=ON \ + -DARROW_WITH_SNAPPY=ON \ + -DARROW_BUILD_EXAMPLES=ON \ + -DARROW_DATASET=ON \ + -DARROW_CSV=ON \ + -DARROW_WITH_LZ4=ON \ + .. Review comment: nit, but the indentation-after-backslash is a little inconsistent for each of these code blocks
