alamb opened a new issue, #2873:
URL: https://github.com/apache/arrow-datafusion/issues/2873
**Describe the bug**
For a `DictionaryArray` column `col`, evaluating an expression like
```sql
CASE
  WHEN col IS NULL THEN ''
  ELSE col
END
```
generates an error:
```
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(InvalidArgumentError("arguments need to have the same data type"))', src/main.rs:45:82
```
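One plausible reading of the message is that the `THEN` branch (a plain string literal) and the `ELSE` branch (a dictionary-encoded column) reach a kernel that insists on identical types. The sketch below is a toy model in plain Rust, not Arrow or DataFusion code: `Column`, `data_type`, `unpack`, and `zip_case` are hypothetical names invented for illustration. It only shows the shape of the mismatch and how decoding the dictionary first would make the two sides agree.

```rust
// Toy model only: these types stand in for Arrow arrays and are NOT the real API.
#[derive(Debug, Clone, PartialEq)]
enum Column {
    /// Plain string column, analogous to a Utf8 array.
    Utf8(Vec<Option<String>>),
    /// Dictionary-encoded column: each key indexes into `values`.
    Dictionary { keys: Vec<Option<usize>>, values: Vec<String> },
}

impl Column {
    /// Human-readable type name used for the equality check below.
    fn data_type(&self) -> &'static str {
        match self {
            Column::Utf8(_) => "Utf8",
            Column::Dictionary { .. } => "Dictionary(Int32, Utf8)",
        }
    }

    /// Decode a dictionary column into a plain string column.
    fn unpack(&self) -> Column {
        match self {
            Column::Utf8(v) => Column::Utf8(v.clone()),
            Column::Dictionary { keys, values } => Column::Utf8(
                keys.iter().map(|k| k.map(|i| values[i].clone())).collect(),
            ),
        }
    }
}

/// Take `then_col[i]` where `mask[i]` is true, otherwise `else_col[i]`.
/// Inputs of different types are rejected, mirroring the reported error text.
fn zip_case(mask: &[bool], then_col: &Column, else_col: &Column) -> Result<Column, String> {
    if then_col.data_type() != else_col.data_type() {
        return Err(format!(
            "arguments need to have the same data type ({} vs {})",
            then_col.data_type(),
            else_col.data_type()
        ));
    }
    match (then_col, else_col) {
        (Column::Utf8(t), Column::Utf8(e)) => Ok(Column::Utf8(
            mask.iter()
                .zip(t.iter().zip(e.iter()))
                .map(|(&m, (t, e))| if m { t.clone() } else { e.clone() })
                .collect(),
        )),
        // The toy model only combines plain string columns.
        _ => Err("unsupported column type in this sketch".to_string()),
    }
}

fn main() {
    let host = Column::Dictionary {
        keys: vec![Some(0), None, Some(1)],
        values: vec!["host1".to_string(), "host2".to_string()],
    };
    let empty = Column::Utf8(vec![Some(String::new()); 3]);
    let is_null = [false, true, false];

    // Mixed types are rejected, as in the report.
    println!("{:?}", zip_case(&is_null, &empty, &host));
    // Decoding the dictionary first gives both sides the same type.
    println!("{:?}", zip_case(&is_null, &empty, &host.unpack()));
}
```

Whether the real fix should decode the dictionary, coerce the literal, or teach the kernel about dictionaries is a separate design question that this sketch does not settle.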
**To Reproduce**
```rust
use std::sync::Arc;
use datafusion::arrow::datatypes::Int32Type;
use datafusion::prelude::*;
use datafusion::arrow::array::DictionaryArray;
use datafusion::datasource::MemTable;
use datafusion::logical_plan::{LogicalPlanBuilder, provider_as_source, when};
use datafusion::physical_plan::collect;
use datafusion::error::Result;
use datafusion::arrow::{self, record_batch::RecordBatch};

#[tokio::main]
async fn main() -> Result<()> {
    let ctx = SessionContext::new();

    let host: DictionaryArray<Int32Type> = vec![Some("host1"), None, Some("host2")]
        .into_iter()
        .collect();

    let batch = RecordBatch::try_from_iter(vec![
        ("host", Arc::new(host) as _),
    ]).unwrap();

    let t = MemTable::try_new(batch.schema(), vec![vec![batch]]).unwrap();

    let expr = when(col("host").is_null(), lit(""))
        .otherwise(col("host"))
        .unwrap();

    let projection = None;
    let builder = LogicalPlanBuilder::scan(
        "cpu_load_short",
        provider_as_source(Arc::new(t)),
        projection
    ).unwrap();

    let logical_plan = builder
        .project(vec![expr])
        .unwrap()
        .build()
        .unwrap();

    // manually optimize the plan
    let physical_plan = ctx.create_physical_plan(&logical_plan).await.unwrap();

    let results: Vec<RecordBatch> = collect(physical_plan, ctx.task_ctx()).await.unwrap();

    // format the results
    println!("Results:\n\n{}", arrow::util::pretty::pretty_format_batches(&results).unwrap());

    Ok(())
}
```
```
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(InvalidArgumentError("arguments need to have the same data type"))', src/main.rs:45:82
stack backtrace:
   0: rust_begin_unwind
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/panicking.rs:142:14
   2: core::result::unwrap_failed
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/result.rs:1785:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/result.rs:1078:23
   4: rust_arrow_playground::main::{{closure}}
             at ./src/main.rs:45:37
   5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/future/mod.rs:91:19
   6: tokio::park::thread::CachedParkThread::block_on::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/park/thread.rs:263:54
   7: tokio::coop::with_budget::{{closure}}
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:102:9
   8: std::thread::local::LocalKey<T>::try_with
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/thread/local.rs:445:16
   9: std::thread::local::LocalKey<T>::with
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/std/src/thread/local.rs:421:9
  10: tokio::coop::with_budget
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:95:5
  11: tokio::coop::budget
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/coop.rs:72:5
  12: tokio::park::thread::CachedParkThread::block_on
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/park/thread.rs:263:31
  13: tokio::runtime::enter::Enter::block_on
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/enter.rs:151:13
  14: tokio::runtime::thread_pool::ThreadPool::block_on
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/thread_pool/mod.rs:90:9
  15: tokio::runtime::Runtime::block_on
             at /Users/alamb/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.19.2/src/runtime/mod.rs:482:43
  16: rust_arrow_playground::main
             at ./src/main.rs:49:5
  17: core::ops::function::FnOnce::call_once
             at /rustc/a8314ef7d0ec7b75c336af2c9857bfaf43002bfc/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```
**Expected behavior**
The test passes with this output:
```
+------------------------------------------------------------------------------------+
| CASE WHEN #cpu_load_short.host IS NULL THEN Utf8("") ELSE #cpu_load_short.host END |
+------------------------------------------------------------------------------------+
| host1                                                                              |
|                                                                                    |
| host2                                                                              |
+------------------------------------------------------------------------------------+
```
**Additional context**
This test used to pass. The last commit on which it passed was
57f47ab9230a9a12b3244191dcf1623f8b69fd61.
It appears to fail starting with da392f4b3d77ad5fec0018a50146746a0efabac6 (which
came in via https://github.com/apache/arrow-datafusion/pull/2819), which makes
sense given the change.
Found while debugging an upgrade in IOx:
https://github.com/influxdata/influxdb_iox/pull/5079