alamb opened a new issue, #9269: URL: https://github.com/apache/arrow-datafusion/issues/9269
### Describe the bug

There is a bug when reading from partitioned tables that have quotes in their column names.

Here is the test: https://github.com/apache/arrow-datafusion/blob/b2a04519da97c2ff81789ef41dd652870794a73a/datafusion/sqllogictest/test_files/copy.slt#L109

### To Reproduce

Run this script:

```sql
-- create a table with quotes in the column names
create table test ("'test'" varchar, "'test2'" varchar, "'test3'" varchar);
insert into test VALUES ('a', 'x', 'aa'), ('b','y', 'bb'), ('c', 'z', 'cc');
copy test to '/tmp/escape_quote' (format csv, partition_by '''test2'',''test3''');

-- read back from the table
CREATE EXTERNAL TABLE validate_partitioned_escape_quote STORED AS CSV
LOCATION '/tmp/escape_quote/' PARTITIONED BY ("'test2'", "'test3'");

-- This panics
select * from validate_partitioned_escape_quote;
```

Here is an example:

```sql
❯ -- create a table with quotes in the column names
create table test ("'test'" varchar, "'test2'" varchar, "'test3'" varchar);
insert into test VALUES ('a', 'x', 'aa'), ('b','y', 'bb'), ('c', 'z', 'cc');
copy test to '/tmp/escape_quote' (format csv, partition_by '''test2'',''test3''');
0 rows in set. Query took 0.008 seconds.

+-------+
| count |
+-------+
| 3     |
+-------+
1 row in set. Query took 0.009 seconds.

+-------+
| count |
+-------+
| 3     |
+-------+
1 row in set. Query took 0.029 seconds.

❯ -- read back from the table
CREATE EXTERNAL TABLE validate_partitioned_escape_quote STORED AS CSV
LOCATION '/tmp/escape_quote/' PARTITIONED BY ("'test2'", "'test3'");
0 rows in set. Query took 0.004 seconds.

❯ -- This panics
select * from validate_partitioned_escape_quote;
thread 'thread 'tokio-runtime-workertokio-runtime-worker' panicked at ' panicked at /Users/andrewlamb/Software/arrow-datafusion/datafusion/core/src/datasource/physical_plan/file_scan_config.rs/Users/andrewlamb/Software/arrow-datafusion/datafusion/core/src/datasource/physical_plan/file_scan_config.rs::248:thread '54248: :tokio-runtime-workerindex out of bounds: the len is 0 but the index is 054' panicked at /Users/andrewlamb/Software/arrow-datafusion/datafusion/core/src/datasource/physical_plan/file_scan_config.rs:248: :index out of bounds: the len is 0 but the index is 054 : index out of bounds: the len is 0 but the index is 0
stack backtrace:
   0: rust_begin_unwind
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
   1: core::panicking::panic_fmt
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
   2: core::panicking::panic_bounds_check
             at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:208:5
   3: datafusion::datasource::physical_plan::file_scan_config::PartitionColumnProjector::project
   4: <datafusion::datasource::physical_plan::file_stream::FileStream<F> as futures_core::stream::Stream>::poll_next
   5: datafusion_physical_plan::stream::RecordBatchReceiverStreamBuilder::run_input::{{closure}}
   6: tokio::runtime::task::core::Core<T,S>::poll
   7: tokio::runtime::task::harness::Harness<T,S>::poll
   8: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run
  10: tokio::runtime::context::runtime::enter_runtime
  11: tokio::runtime::scheduler::multi_thread::worker::run
  12: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
  13: tokio::runtime::task::core::Core<T,S>::poll
  14: tokio::runtime::task::harness::Harness<T,S>::poll
  15: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with
`RUST_BACKTRACE=full` for a verbose backtrace.
```

### Expected behavior

Note the data is written correctly:

```shell
andrewlamb@Andrews-MacBook-Pro:~/Software/influxdb_iox$ find /tmp/escape_quote
/tmp/escape_quote
/tmp/escape_quote/'test2'=x
/tmp/escape_quote/'test2'=x/'test3'=aa
/tmp/escape_quote/'test2'=x/'test3'=aa/3zMw255TXFQxId14.csv
/tmp/escape_quote/'test2'=y
/tmp/escape_quote/'test2'=y/'test3'=bb
/tmp/escape_quote/'test2'=y/'test3'=bb/3zMw255TXFQxId14.csv
/tmp/escape_quote/'test2'=z
/tmp/escape_quote/'test2'=z/'test3'=cc
/tmp/escape_quote/'test2'=z/'test3'=cc/3zMw255TXFQxId14.csv
```

```
andrewlamb@Andrews-MacBook-Pro:~/Software/influxdb_iox$ cat /tmp/escape_quote/\'test2\'\=x/\'test3\'\=aa/3zMw255TXFQxId14.csv
'test'
a
```

### Additional context

@devinjdangelo found this in https://github.com/apache/arrow-datafusion/pull/9240

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
