This is an automated email from the ASF dual-hosted git repository.
agrove pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/master by this push:
new 9c8547e25 MINOR: Make crate READMEs consistent (#2437)
9c8547e25 is described below
commit 9c8547e25036f74a492bd7b231f0e78e6dc5d599
Author: Andy Grove <[email protected]>
AuthorDate: Tue May 3 19:31:55 2022 -0600
MINOR: Make crate READMEs consistent (#2437)
---
CONTRIBUTING.md | 2 +-
ballista/rust/client/Cargo.toml | 1 +
ballista/rust/core/Cargo.toml | 1 +
ballista/rust/core/README.md | 3 +--
ballista/rust/executor/Cargo.toml | 1 +
ballista/rust/executor/README.md | 3 +--
ballista/rust/scheduler/Cargo.toml | 1 +
ballista/rust/scheduler/README.md | 3 +--
data-access/README.md | 10 ++++++++--
datafusion-cli/Cargo.toml | 1 +
datafusion-cli/README.md | 7 ++++++-
datafusion/common/README.md | 4 +++-
datafusion/expr/Cargo.toml | 2 +-
datafusion/expr/README.md | 4 +++-
datafusion/expr/src/aggregate_function.rs | 2 +-
datafusion/expr/src/lib.rs | 23 +++++++++++------------
datafusion/jit/Cargo.toml | 2 +-
datafusion/{common => jit}/README.md | 6 ++++--
datafusion/physical-expr/Cargo.toml | 2 +-
datafusion/physical-expr/README.md | 6 ++++--
datafusion/proto/Cargo.toml | 2 +-
datafusion/{common => proto}/README.md | 6 ++++--
datafusion/row/Cargo.toml | 2 +-
datafusion/{common => row}/README.md | 6 ++++--
dev/release/README.md | 4 +++-
25 files changed, 65 insertions(+), 39 deletions(-)
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index c3983cd56..ab3381dff 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -253,5 +253,5 @@ $ prettier --version
After you've confirmed your prettier version, you can format all the `.md`
files:
```bash
-prettier -w {ballista,datafusion,datafusion-examples,dev,docs,python}/**/*.md
+prettier -w {ballista,datafusion,data-access,datafusion-cli,datafusion-examples,dev,docs}/**/*.md
```
diff --git a/ballista/rust/client/Cargo.toml b/ballista/rust/client/Cargo.toml
index bb5a6b789..9052380f1 100644
--- a/ballista/rust/client/Cargo.toml
+++ b/ballista/rust/client/Cargo.toml
@@ -22,6 +22,7 @@ license = "Apache-2.0"
version = "0.6.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
edition = "2021"
rust-version = "1.59"
diff --git a/ballista/rust/core/Cargo.toml b/ballista/rust/core/Cargo.toml
index 18b8f05ec..edac7f507 100644
--- a/ballista/rust/core/Cargo.toml
+++ b/ballista/rust/core/Cargo.toml
@@ -22,6 +22,7 @@ license = "Apache-2.0"
version = "0.6.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
edition = "2018"
build = "build.rs"
diff --git a/ballista/rust/core/README.md b/ballista/rust/core/README.md
index 2ab95f313..2b4c9fbfd 100644
--- a/ballista/rust/core/README.md
+++ b/ballista/rust/core/README.md
@@ -20,5 +20,4 @@
# Ballista Core Library
This crate contains the Ballista core library which is used as a dependency by the `ballista-client`,
-`ballista-scheduler`, and `ballista-executor` crates. Refer to <https://crates.io/crates/ballista> for
-general Ballista documentation.
+`ballista-scheduler`, and `ballista-executor` crates.
diff --git a/ballista/rust/executor/Cargo.toml b/ballista/rust/executor/Cargo.toml
index 3282f1f88..236f92f7e 100644
--- a/ballista/rust/executor/Cargo.toml
+++ b/ballista/rust/executor/Cargo.toml
@@ -22,6 +22,7 @@ license = "Apache-2.0"
version = "0.6.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
edition = "2018"
diff --git a/ballista/rust/executor/README.md b/ballista/rust/executor/README.md
index 731c2dca7..91f4c3266 100644
--- a/ballista/rust/executor/README.md
+++ b/ballista/rust/executor/README.md
@@ -19,5 +19,4 @@
# Ballista Executor Process
-This crate contains the Ballista executor process. Refer to <https://crates.io/crates/ballista> for
-documentation.
+This crate contains the Ballista executor process.
diff --git a/ballista/rust/scheduler/Cargo.toml b/ballista/rust/scheduler/Cargo.toml
index d1eac599e..07ed56f68 100644
--- a/ballista/rust/scheduler/Cargo.toml
+++ b/ballista/rust/scheduler/Cargo.toml
@@ -22,6 +22,7 @@ license = "Apache-2.0"
version = "0.6.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
edition = "2018"
diff --git a/ballista/rust/scheduler/README.md b/ballista/rust/scheduler/README.md
index fbc35e427..382cce8cb 100644
--- a/ballista/rust/scheduler/README.md
+++ b/ballista/rust/scheduler/README.md
@@ -19,5 +19,4 @@
# Ballista Scheduler Process
-This crate contains the Ballista scheduler process. Refer to <https://crates.io/crates/ballista> for
-documentation.
+This crate contains the Ballista scheduler process.
diff --git a/data-access/README.md b/data-access/README.md
index 36fdb7095..526603f69 100644
--- a/data-access/README.md
+++ b/data-access/README.md
@@ -17,6 +17,12 @@
under the License.
-->
-# Data Access Layer
+# DataFusion Data Access Layer
-This module contains an `async` API for accessing data, either remotely or locally. Currently, it's based on the object store interfaces. In the future, this module may include interfaces for accessing databases, or streaming data.
\ No newline at end of file
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides an `async` API for accessing data, either remotely or locally.
+Currently, it is based on the object store interfaces. In the future, this module may include interfaces for accessing
+databases, or streaming data.
+
+[df]: https://crates.io/crates/datafusion
diff --git a/datafusion-cli/Cargo.toml b/datafusion-cli/Cargo.toml
index e9895deb1..8ec1a1f88 100644
--- a/datafusion-cli/Cargo.toml
+++ b/datafusion-cli/Cargo.toml
@@ -26,6 +26,7 @@ license = "Apache-2.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
rust-version = "1.59"
+readme = "README.md"
[dependencies]
arrow = { version = "12" }
diff --git a/datafusion-cli/README.md b/datafusion-cli/README.md
index b83539975..5c72f16a4 100644
--- a/datafusion-cli/README.md
+++ b/datafusion-cli/README.md
@@ -19,6 +19,8 @@
# DataFusion Command-line Interface
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
The DataFusion CLI allows SQL queries to be executed by an in-process DataFusion context, or by a distributed
Ballista context.
@@ -75,6 +77,7 @@ cargo build
```
## Ballista
+
If you want to execute the SQL in ballista by `datafusion-cli`, you must build/compile the `datafusion-cli` with features of "ballista" first.
```bash
@@ -86,4 +89,6 @@ The DataFusion CLI can connect to a Ballista scheduler for query execution.
```bash
datafusion-cli --host localhost --port 50050
-```
\ No newline at end of file
+```
+
+[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/common/README.md b/datafusion/common/README.md
index 8c44d78ef..9bccf3f18 100644
--- a/datafusion/common/README.md
+++ b/datafusion/common/README.md
@@ -19,6 +19,8 @@
# DataFusion Common
-This is an internal module for the most fundamental types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides common data types and utilities.
[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/expr/Cargo.toml b/datafusion/expr/Cargo.toml
index 4095d4ebc..35be6570b 100644
--- a/datafusion/expr/Cargo.toml
+++ b/datafusion/expr/Cargo.toml
@@ -21,7 +21,7 @@ description = "Logical plan and expression representation for DataFusion query e
version = "7.0.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
-readme = "../README.md"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
license = "Apache-2.0"
keywords = [ "datafusion", "logical", "plan", "expressions" ]
diff --git a/datafusion/expr/README.md b/datafusion/expr/README.md
index 6ce82347c..bcce30be3 100644
--- a/datafusion/expr/README.md
+++ b/datafusion/expr/README.md
@@ -19,6 +19,8 @@
# DataFusion Logical Plan and Expressions
-This is an internal module for fundamental expression types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides data types and utilities for logical plans and expressions.
[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/expr/src/aggregate_function.rs b/datafusion/expr/src/aggregate_function.rs
index 4e590c467..14cd46615 100644
--- a/datafusion/expr/src/aggregate_function.rs
+++ b/datafusion/expr/src/aggregate_function.rs
@@ -682,7 +682,7 @@ pub fn is_correlation_support_arg_type(arg_type: &DataType) -> bool {
}
/// Return `true` if `arg_type` is of a [`DataType`] that the
-/// [`ApproxPercentileCont`] aggregation can operate on.
+/// [`AggregateFunction::ApproxPercentileCont`] aggregation can operate on.
pub fn is_approx_percentile_cont_supported_arg_type(arg_type: &DataType) -> bool {
matches!(
arg_type,
diff --git a/datafusion/expr/src/lib.rs b/datafusion/expr/src/lib.rs
index b513bf52d..871f4f37d 100644
--- a/datafusion/expr/src/lib.rs
+++ b/datafusion/expr/src/lib.rs
@@ -15,6 +15,16 @@
// specific language governing permissions and limitations
// under the License.
+//! [DataFusion](https://github.com/apache/arrow-datafusion)
+//! is an extensible query execution framework that uses
+//! [Apache Arrow](https://arrow.apache.org) as its in-memory format.
+//!
+//! This crate is a submodule of DataFusion that provides types representing
+//! logical query plans ([LogicalPlan]) and logical expressions ([Expr]) as well as utilities for
+//! working with these types.
+//!
+//! The [expr_fn] module contains functions for creating expressions.
+
mod accumulator;
pub mod aggregate_function;
pub mod array_expressions;
@@ -44,18 +54,7 @@ pub use aggregate_function::AggregateFunction;
pub use built_in_function::BuiltinScalarFunction;
pub use columnar_value::{ColumnarValue, NullColumnarValue};
pub use expr::Expr;
-pub use expr_fn::{
- abs, acos, and, approx_distinct, approx_percentile_cont, array, ascii, asin, atan,
- avg, bit_length, btrim, case, ceil, character_length, chr, coalesce, col, concat,
- concat_expr, concat_ws, concat_ws_expr, cos, count, count_distinct, date_part,
- date_trunc, digest, exists, exp, floor, in_list, in_subquery, initcap, left, length,
- ln, log10, log2, lower, lpad, ltrim, max, md5, min, not_exists, not_in_subquery, now,
- now_expr, nullif, octet_length, or, random, regexp_match, regexp_replace, repeat,
- replace, reverse, right, round, rpad, rtrim, scalar_subquery, sha224, sha256, sha384,
- sha512, signum, sin, split_part, sqrt, starts_with, strpos, substr, sum, tan, to_hex,
- to_timestamp_micros, to_timestamp_millis, to_timestamp_seconds, translate, trim,
- trunc, upper, when,
-};
+pub use expr_fn::*;
pub use expr_schema::ExprSchemable;
pub use function::{
AccumulatorFunctionImplementation, ReturnTypeFunction, ScalarFunctionImplementation,
diff --git a/datafusion/jit/Cargo.toml b/datafusion/jit/Cargo.toml
index fe1f278f2..29be2f153 100644
--- a/datafusion/jit/Cargo.toml
+++ b/datafusion/jit/Cargo.toml
@@ -21,7 +21,7 @@ description = "Just In Time (JIT) compilation support for DataFusion query engin
version = "7.0.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
-readme = "../README.md"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
license = "Apache-2.0"
keywords = [ "arrow", "query", "sql" ]
diff --git a/datafusion/common/README.md b/datafusion/jit/README.md
similarity index 79%
copy from datafusion/common/README.md
copy to datafusion/jit/README.md
index 8c44d78ef..de931ed67 100644
--- a/datafusion/common/README.md
+++ b/datafusion/jit/README.md
@@ -17,8 +17,10 @@
under the License.
-->
-# DataFusion Common
+# DataFusion JIT
-This is an internal module for the most fundamental types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides JIT code generation.
[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/physical-expr/Cargo.toml b/datafusion/physical-expr/Cargo.toml
index 57bb0bf2c..fd2d8444e 100644
--- a/datafusion/physical-expr/Cargo.toml
+++ b/datafusion/physical-expr/Cargo.toml
@@ -21,7 +21,7 @@ description = "Physical expression implementation for DataFusion query engine"
version = "7.0.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
-readme = "../README.md"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
license = "Apache-2.0"
keywords = [ "arrow", "query", "sql" ]
diff --git a/datafusion/physical-expr/README.md b/datafusion/physical-expr/README.md
index 9c9202338..a887d3eb2 100644
--- a/datafusion/physical-expr/README.md
+++ b/datafusion/physical-expr/README.md
@@ -17,8 +17,10 @@
under the License.
-->
-# DataFusion Physical Expr
+# DataFusion Physical Expressions
-This is an internal module for fundamental physical expression types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides data types and utilities for physical expressions.
[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/proto/Cargo.toml b/datafusion/proto/Cargo.toml
index bafc32712..c7e338b77 100644
--- a/datafusion/proto/Cargo.toml
+++ b/datafusion/proto/Cargo.toml
@@ -21,7 +21,7 @@ description = "Protobuf serialization of DataFusion logical plan expressions"
version = "7.0.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
-readme = "../README.md"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
license = "Apache-2.0"
keywords = [ "arrow", "query", "sql" ]
diff --git a/datafusion/common/README.md b/datafusion/proto/README.md
similarity index 75%
copy from datafusion/common/README.md
copy to datafusion/proto/README.md
index 8c44d78ef..b928652e9 100644
--- a/datafusion/common/README.md
+++ b/datafusion/proto/README.md
@@ -17,8 +17,10 @@
under the License.
-->
-# DataFusion Common
+# DataFusion Proto
-This is an internal module for the most fundamental types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides a protocol buffer format for representing query plans and expressions.
[df]: https://crates.io/crates/datafusion
diff --git a/datafusion/row/Cargo.toml b/datafusion/row/Cargo.toml
index 26b517300..041a13d6d 100644
--- a/datafusion/row/Cargo.toml
+++ b/datafusion/row/Cargo.toml
@@ -21,7 +21,7 @@ description = "Row backed by raw bytes for DataFusion query engine"
version = "7.0.0"
homepage = "https://github.com/apache/arrow-datafusion"
repository = "https://github.com/apache/arrow-datafusion"
-readme = "../README.md"
+readme = "README.md"
authors = ["Apache Arrow <[email protected]>"]
license = "Apache-2.0"
keywords = [ "arrow", "query", "sql" ]
diff --git a/datafusion/common/README.md b/datafusion/row/README.md
similarity index 78%
copy from datafusion/common/README.md
copy to datafusion/row/README.md
index 8c44d78ef..9a93bbaa7 100644
--- a/datafusion/common/README.md
+++ b/datafusion/row/README.md
@@ -17,8 +17,10 @@
under the License.
-->
-# DataFusion Common
+# DataFusion Row
-This is an internal module for the most fundamental types of [DataFusion][df].
+[DataFusion](df) is an extensible query execution framework, written in Rust, that uses Apache Arrow as its in-memory format.
+
+This crate is a submodule of DataFusion that provides a format for row-based data.
[df]: https://crates.io/crates/datafusion
diff --git a/dev/release/README.md b/dev/release/README.md
index 3244d32d2..03bd105b5 100644
--- a/dev/release/README.md
+++ b/dev/release/README.md
@@ -34,14 +34,16 @@ Python binding or Ballista always requires a new DataFusion version release.
### Major Release
-DataFusion typically has major releases from the `master` branch every 3 months, including breaking API changes.
+DataFusion typically has major releases from the `master` branch every 3 months, including breaking API changes.
### Minor Release
Starting v7.0.0, we are experimenting with maintaining an active stable release branch (e.g. `maint-7.x`). Every month, we will review the `maint-*` branch and prepare a minor release (e.g. v7.1.0) when necessary. A patch release (v7.0.1) can be requested on demand if it is urgent bug/security fix.
#### How to add changes to `maint-*` branch?
+
If you would like to propose your change for inclusion in the maintenance branch
+
1. follow normal workflow to create PR to `master` branch and wait for its approval and merges.
2. after PR is squash merged to `master`, branch from most recent maintenance branch (e.g. `maint-7-x`), cherry-pick the commit and create a PR to maintenance branch (e.g. `maint-7-x`).