returnString opened a new pull request #9749:
URL: https://github.com/apache/arrow/pull/9749


   As discussed [here](https://github.com/apache/arrow/pull/9710#discussion_r596404956), 
we were looking into how we might add code examples to the DataFusion README 
while keeping them in sync with the actual API as it goes through revisions.
   
   This PR pulls in a new dev dependency, `doc-comment`, which allows for 
detecting all the `rust`-tagged code blocks in a Markdown file and treating 
them as doctests, and wires this up for `README.md`.
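   
   For readers unfamiliar with `doc-comment`, the wiring can be sketched roughly as 
follows. This is a minimal sketch rather than the exact diff in this PR: it assumes 
the crate root is `src/lib.rs` with `README.md` one directory above it, and that the 
`0.3` release line of `doc-comment` is the one in use.
   
   ```rust
   // Sketch of src/lib.rs; the file location and relative README path are assumptions.
   //
   // Assumed Cargo.toml entry (the version is an assumption, not taken from this PR):
   //   [dev-dependencies]
   //   doc-comment = "0.3"

   // Only compiled while rustdoc collects doctests, so regular builds are unaffected.
   // The macro attaches the contents of README.md as documentation on a dummy item,
   // which makes rustdoc compile and run the Rust code blocks found in the file.
   #[cfg(doctest)]
   doc_comment::doctest!("../README.md");
   ```
   
   With something like this in place, `cargo test --doc` should fail whenever a README 
sample stops compiling or returns an error.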
   
   My only concerns are:
   - because the end result is a full-blown doctest, you need to make sure 
imports etc. are present, which makes the samples more verbose than some people 
might like
   - also on the verbosity front: much of our code is async, so samples need a 
`#[tokio::main] async fn main() { ... }` wrapper
   
   Neither of these is inherently bad imo, but both are worth noting upfront.
   
   As an example of a readme sample that passes as a doctest (borrowed from 
@alamb's latest documentation PR, #9710):
   
   ```rust
   use datafusion::prelude::*;
   use arrow::util::pretty::print_batches;
   use arrow::record_batch::RecordBatch;
   
   #[tokio::main]
   async fn main() -> datafusion::error::Result<()> {
     let mut ctx = ExecutionContext::new();
     // create the dataframe
     let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;
   
     let df = df.filter(col("a").lt_eq(col("b")))?
               .aggregate(&[col("a")], &[min(col("b"))])?
               .limit(100)?;
   
     let results: Vec<RecordBatch> = df.collect().await?;
     print_batches(&results)?;
   
     Ok(())
   }
   ```
   



