tv42 opened a new issue, #8476:
URL: https://github.com/apache/arrow-datafusion/issues/8476
### Describe the bug
`DataFrame::cache` returns an error for a query that succeeds when its
results are not cached first.
I would have expected caching to have no effect on success or failure.
### To Reproduce
```rust
use datafusion::prelude::SessionContext;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let sql = "SELECT CASE WHEN true THEN NULL ELSE 1 END;";
    let ctx = SessionContext::new();
    let plan = ctx.state().create_logical_plan(sql).await?;
    let df = ctx.execute_logical_plan(plan).await?;
    // Comment out the next line to make the error go away.
    let df = df.cache().await?;
    let batches = df.collect().await?;
    let display =
        datafusion::arrow::util::pretty::pretty_format_batches(&batches).unwrap();
    println!("{}", display);
    Ok(())
}
```
### Expected behavior
Behavior with and without `let df = df.cache().await?` should be functionally
the same, differing only in performance and memory use.
### Additional context
_No response_