drin commented on a change in pull request #9810:
URL: https://github.com/apache/arrow/pull/9810#discussion_r610317754



##########
File path: docs/source/cpp/dataset.rst
##########
@@ -0,0 +1,381 @@
+.. Licensed to the Apache Software Foundation (ASF) under one
+.. or more contributor license agreements.  See the NOTICE file
+.. distributed with this work for additional information
+.. regarding copyright ownership.  The ASF licenses this file
+.. to you under the Apache License, Version 2.0 (the
+.. "License"); you may not use this file except in compliance
+.. with the License.  You may obtain a copy of the License at
+
+..   http://www.apache.org/licenses/LICENSE-2.0
+
+.. Unless required by applicable law or agreed to in writing,
+.. software distributed under the License is distributed on an
+.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+.. KIND, either express or implied.  See the License for the
+.. specific language governing permissions and limitations
+.. under the License.
+
+.. default-domain:: cpp
+.. highlight:: cpp
+
+================
+Tabular Datasets
+================
+
+.. seealso::
+   :doc:`Dataset API reference <api/dataset>`
+
+.. warning::
+
+    The ``arrow::dataset`` namespace is experimental, and a stable API
+    is not yet guaranteed.
+
+The Arrow Datasets library provides functionality to efficiently work with
+tabular, potentially larger than memory and multi-file datasets:

Review comment:
       Looking at the section below, maybe lift the first bullet point into 
this sentence so that the Datasets library isn't defined circularly as a 
library for datasets? This suggestion also incorporates @westonpace's next few 
comments and the bullet points that follow:
   
   ```
   The Arrow Datasets library provides a unified interface for operations on 
tabular data from various sources and in various formats. The data may come 
from buffers, local filesystems, or cloud filesystems, and may be larger than 
memory or span multiple units (e.g. multiple files). Supported data formats 
include Parquet, Feather, and IPC. Supported operations include discovery, 
predicate pushdown, and read parallelism.
   ```





