tustvold commented on issue #2453:
URL: https://github.com/apache/arrow-datafusion/issues/2453#issuecomment-1141300151

I can confirm https://github.com/apache/arrow-datafusion/pull/2631 closes this issue, although it should be noted it now runs into a different issue that will need to be triaged and fixed

```
called `Result::unwrap()` on an `Err` value: ArrowError(ComputeError("concat requires input of at least one array"))
thread 'physical_plan::file_format::parquet::tests::temp' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(ComputeError("concat requires input of at least one array"))', datafusion/common/src/scalar.rs:1206:18
stack backtrace:
   0: rust_begin_unwind
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/panicking.rs:143:14
   2: core::result::unwrap_failed
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1785:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/result.rs:1078:23
   4: datafusion_common::scalar::ScalarValue::to_array_of_size
             at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1198:22
   5: datafusion_common::scalar::ScalarValue::to_array_of_size::{{closure}}
             at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1253:45
   6: core::iter::adapters::map::map_fold::{{closure}}
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/adapters/map.rs:84:28
   7: core::iter::traits::iterator::Iterator::fold
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:2362:21
   8: <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::fold
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/adapters/map.rs:124:9
   9: core::iter::traits::iterator::Iterator::for_each
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:779:9
  10: <alloc::vec::Vec<T,A> as alloc::vec::spec_extend::SpecExtend<T,I>>::spec_extend
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_extend.rs:40:17
  11: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_from_iter_nested.rs:62:9
  12: <alloc::vec::Vec<T> as alloc::vec::spec_from_iter::SpecFromIter<T,I>>::from_iter
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/spec_from_iter.rs:33:9
  13: <alloc::vec::Vec<T> as core::iter::traits::collect::FromIterator<T>>::from_iter
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/vec/mod.rs:2554:9
  14: core::iter::traits::iterator::Iterator::collect
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/iter/traits/iterator.rs:1784:9
  15: datafusion_common::scalar::ScalarValue::to_array_of_size
             at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:1248:48
  16: datafusion_common::scalar::ScalarValue::to_array
             at /home/raphael/repos/external/arrow-datafusion/datafusion/common/src/scalar.rs:658:9
  17: datafusion::datasource::get_statistics_with_limit::{{closure}}
             at ./src/datasource/mod.rs:75:56
  18: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  19: datafusion::datasource::listing::table::ListingTable::list_files_for_scan::{{closure}}
             at ./src/datasource/listing/table.rs:394:67
  20: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
  21: <datafusion::datasource::listing::table::ListingTable as datafusion::datasource::datasource::TableProvider>::scan::{{closure}}
```

However, if you disable stats collection it works correctly :tada:

```
#[tokio::test]
async fn temp() {
    let ctx = SessionContext::new();
    let mut options = ParquetReadOptions::default()
        .parquet_pruning(true)
        .to_listing_options(2);

    // Disable stats collection
    options.collect_stat = false;

    ctx.register_listing_table("patient", "/home/raphael/Downloads/part-00000-f6337bce-7fcd-4021-9f9d-040413ea83f8-c000.snappy.parquet", options, None).await.unwrap();

    let df = ctx.sql("SELECT patient.meta FROM patient LIMIT 10").await.unwrap();
    df.show().await.unwrap();
}
```

I will file a follow on ticket.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
