andygrove commented on a change in pull request #923:
URL: https://github.com/apache/arrow-datafusion/pull/923#discussion_r694009745
##########
File path: datafusion/src/dataframe.rs
##########
@@ -223,6 +223,21 @@ pub trait DataFrame: Send + Sync {
/// ```
async fn collect(&self) -> Result<Vec<RecordBatch>>;
+ /// Print results.
+ ///
+ /// ```
+ /// # use datafusion::prelude::*;
+ /// # use datafusion::error::Result;
+ /// # #[tokio::main]
+ /// # async fn main() -> Result<()> {
+ /// let mut ctx = ExecutionContext::new();
+ /// let df = ctx.read_csv("tests/example.csv", CsvReadOptions::new())?;
+ /// df.show().await?;
+ /// # Ok(())
+ /// # }
+ /// ```
+ async fn show(&self) -> Result<()>;
Review comment:
Sorry to be a pain, but this seems confusing to me. It might be better to
revert to your original code and document that the user can limit output
with `df.limit(20).show()`. Alternatively, we could add a
`show_limit(limit: usize)` method that lets the user specify how many rows
to print.
The issue I have with the current approach is that it shows 20 rows by
default, and a user who wants more has to add a limit, which is
counterintuitive because a limit normally reduces the number of rows. Also,
this code only works if the final operator is a limit, so it won't work
consistently if the limit is wrapped in a sort, for example.
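To make the concern concrete, here is a rough, self-contained sketch. It uses a
toy `Plan` enum rather than DataFusion's actual logical plan types, and the
function names (`rows_shown_implicit`, `rows_shown_explicit`,
`rows_shown_with_limit`) and the `DEFAULT_ROWS` constant are hypothetical,
chosen only for illustration. It assumes the implicit behavior is "cap at 20
rows unless the outermost operator is a limit" and compares that with the two
alternatives suggested above.

```rust
// Toy model of a query plan. This is NOT DataFusion's LogicalPlan; it only
// tracks how many rows each operator would produce.
#[derive(Debug)]
enum Plan {
    Scan { rows: usize },
    Limit { n: usize, input: Box<Plan> },
    Sort { input: Box<Plan> },
}

// Hypothetical default cap, mirroring the "20 rows by default" behavior.
const DEFAULT_ROWS: usize = 20;

/// Number of rows the plan itself produces.
fn row_count(plan: &Plan) -> usize {
    match plan {
        Plan::Scan { rows } => *rows,
        Plan::Limit { n, input } => row_count(input).min(*n),
        Plan::Sort { input } => row_count(input),
    }
}

/// Assumed implicit behavior: apply the default cap unless the outermost
/// operator is already a limit.
fn rows_shown_implicit(plan: &Plan) -> usize {
    match plan {
        Plan::Limit { .. } => row_count(plan),
        _ => row_count(plan).min(DEFAULT_ROWS),
    }
}

/// Alternative 1: show whatever the plan produces; the caller adds a limit.
fn rows_shown_explicit(plan: &Plan) -> usize {
    row_count(plan)
}

/// Alternative 2: a `show_limit`-style call with an explicit row count.
fn rows_shown_with_limit(plan: &Plan, limit: usize) -> usize {
    row_count(plan).min(limit)
}

/// limit(50) over a 100-row scan.
fn limited() -> Plan {
    Plan::Limit { n: 50, input: Box::new(Plan::Scan { rows: 100 }) }
}

/// The same limit, wrapped in a sort.
fn sorted_limit() -> Plan {
    Plan::Sort { input: Box::new(limited()) }
}

fn main() {
    // Implicit cap: the user's limit of 50 is honored only when it is outermost.
    println!("implicit, limit outermost: {}", rows_shown_implicit(&limited()));
    println!("implicit, limit under sort: {}", rows_shown_implicit(&sorted_limit()));

    // Explicit alternatives behave the same regardless of plan shape.
    println!("explicit, limit under sort: {}", rows_shown_explicit(&sorted_limit()));
    println!("show_limit(10): {}", rows_shown_with_limit(&sorted_limit(), 10));
}
```

In this model the implicit version shows 50 rows for the plain limit but only
20 once the same limit sits under a sort, while both explicit variants give the
same answer for either plan shape.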