This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/arrow.git
The following commit(s) were added to refs/heads/master by this push:
new 43d00e9 ARROW-12015: [Rust] [DataFusion] Integrate doc-comment crate
to ensure readme examples remain valid
43d00e9 is described below
commit 43d00e9629fe34dc40c78ea96c008de186726a39
Author: Ruan Pearce-Authers <[email protected]>
AuthorDate: Fri Mar 19 06:54:26 2021 -0400
ARROW-12015: [Rust] [DataFusion] Integrate doc-comment crate to ensure
readme examples remain valid
As discussed
[here](https://github.com/apache/arrow/pull/9710#discussion_r596404956), we
were looking into how we might add code examples to the DataFusion readme
whilst keeping them in sync with reality as we go through API revisions etc.
This PR pulls in a new dev dependency, `doc-comment`, which allows for
detecting all the `rust`-tagged code blocks in a Markdown file and treating
them as doctests, and wires this up for `README.md`.
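To make "detecting all the `rust`-tagged code blocks" concrete, here is a minimal sketch using only the standard library. It is not how `doc-comment` or rustdoc is implemented; the real extraction is done by the tooling. The helper name `rust_blocks` is hypothetical, and the sketch assumes plain triple-backtick fences and ignores untagged fences. The fence string is built at runtime so that this snippet does not contain a literal fence.

```rust
/// Hypothetical helper (not part of `doc-comment`): returns the body of every
/// fenced code block whose info string is exactly `rust`.
fn rust_blocks(markdown: &str) -> Vec<String> {
    // Build the fence marker at runtime to avoid a literal fence in this snippet.
    let fence = "`".repeat(3);
    let mut blocks = Vec::new();
    // `Some(buffer)` while inside a `rust` fence, `None` otherwise.
    let mut current: Option<String> = None;
    // True while inside a fence with any other tag, so its content is skipped.
    let mut in_other = false;

    for line in markdown.lines() {
        let trimmed = line.trim();
        if let Some(rest) = trimmed.strip_prefix(fence.as_str()) {
            if let Some(body) = current.take() {
                // Closing fence of a `rust` block: keep what was collected.
                blocks.push(body);
            } else if in_other {
                // Closing fence of a block we are not interested in.
                in_other = false;
            } else if rest == "rust" {
                // Opening fence of a `rust` block.
                current = Some(String::new());
            } else {
                // Opening fence with some other tag (or no tag).
                in_other = true;
            }
        } else if let Some(body) = current.as_mut() {
            body.push_str(line);
            body.push('\n');
        }
    }
    blocks
}

fn main() {
    let f = "`".repeat(3);
    let readme = format!(
        "# Demo\n{f}rust\nlet x = 1;\n{f}\n{f}text\nnot code\n{f}\n",
        f = f
    );
    let blocks = rust_blocks(&readme);
    println!("found {} rust block(s)", blocks.len());
}
```

Each string returned by such a helper corresponds to one snippet that would be compiled and run as a doctest, which is why every README sample needs its own imports and `main`.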
My only concerns are:
- because the end result is a full-blown doctest, you do need to make sure
imports etc. are present, which makes the samples more verbose than some people
would perhaps like
- again on the verbosity front: we have lots of async code which requires a
`#[tokio::main] async fn main() { ... }` wrapper
Neither of these is inherently bad in my opinion, but both are worth noting upfront.
As an example of a readme sample that passes as a doctest (borrowed from
@alamb's latest documentation PR, #9710):
```rust
use datafusion::prelude::*;
use arrow::util::pretty::print_batches;
use arrow::record_batch::RecordBatch;

#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
    let mut ctx = ExecutionContext::new();
    // create the dataframe
    let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;
    let df = df.filter(col("a").lt_eq(col("b")))?
        .aggregate(&[col("a")], &[min(col("b"))])?
        .limit(100)?;
    let results: Vec<RecordBatch> = df.collect().await?;
    print_batches(&results)?;
    Ok(())
}
```
Closes #9749 from returnString/readme_doctest
Authored-by: Ruan Pearce-Authers <[email protected]>
Signed-off-by: Andrew Lamb <[email protected]>
---
rust/datafusion/Cargo.toml | 1 +
rust/datafusion/src/lib.rs | 3 +++
2 files changed, 4 insertions(+)
diff --git a/rust/datafusion/Cargo.toml b/rust/datafusion/Cargo.toml
index b713b77..3d795ba 100644
--- a/rust/datafusion/Cargo.toml
+++ b/rust/datafusion/Cargo.toml
@@ -78,6 +78,7 @@ tempfile = "3"
prost = "0.7"
arrow-flight = { path = "../arrow-flight", version = "4.0.0-SNAPSHOT" }
tonic = "0.4"
+doc-comment = "0.3"
[[bench]]
name = "aggregate_query_sql"
diff --git a/rust/datafusion/src/lib.rs b/rust/datafusion/src/lib.rs
index 5126f90..f0fcc4f 100644
--- a/rust/datafusion/src/lib.rs
+++ b/rust/datafusion/src/lib.rs
@@ -175,3 +175,6 @@ pub mod test;
#[macro_use]
#[cfg(feature = "regex_expressions")]
extern crate lazy_static;
+
+#[cfg(doctest)]
+doc_comment::doctest!("../README.md", readme_example_test);