matthewmturner opened a new issue #1488: URL: https://github.com/apache/arrow-datafusion/issues/1488
**Describe the bug**
I am working on adding DataFusion to db-benchmarks (#147). As part of that, I am using datafusion-cli to test writing queries on the db-benchmark data (which can be generated here: https://github.com/h2oai/db-benchmark/tree/master/_data). While creating a table from one of the datasets, I noticed that one of the column types was inferred incorrectly. Specifically, column v1 was picked up as Utf8 instead of float / decimal.

**To Reproduce**
```
DataFusion CLI v5.1.0

❯ CREATE EXTERNAL TABLE x STORED AS CSV WITH HEADER ROW LOCATION "data/J1_1e7_NA_0_0.csv";
0 rows in set. Query took 5.536 seconds.

❯ SHOW COLUMNS FROM x;
+---------------+--------------+------------+-------------+-----------+-------------+
| table_catalog | table_schema | table_name | column_name | data_type | is_nullable |
+---------------+--------------+------------+-------------+-----------+-------------+
| datafusion    | public       | x          | id1         | Int64     | NO          |
| datafusion    | public       | x          | id2         | Int64     | NO          |
| datafusion    | public       | x          | id3         | Int64     | NO          |
| datafusion    | public       | x          | id4         | Utf8      | NO          |
| datafusion    | public       | x          | id5         | Utf8      | NO          |
| datafusion    | public       | x          | id6         | Utf8      | NO          |
| datafusion    | public       | x          | v1          | Utf8      | NO          |
+---------------+--------------+------------+-------------+-----------+-------------+
7 rows in set. Query took 0.002 seconds.
```

**Expected behavior**
Column v1 should have a data_type of float or decimal.
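
A possible stopgap while inference misbehaves is to declare the schema explicitly so that nothing has to be inferred. This is only a sketch, not a confirmed fix: the table name `x_typed` is hypothetical, the column types are assumed from the inferred schema above (with `v1` assumed to hold floating-point values), and it assumes this version of datafusion-cli accepts a column list in `CREATE EXTERNAL TABLE`.

```sql
-- Hypothetical table name; column types are assumptions based on the
-- SHOW COLUMNS output above, with v1 declared as a float column.
CREATE EXTERNAL TABLE x_typed (
    id1 BIGINT NOT NULL,
    id2 BIGINT NOT NULL,
    id3 BIGINT NOT NULL,
    id4 VARCHAR NOT NULL,
    id5 VARCHAR NOT NULL,
    id6 VARCHAR NOT NULL,
    v1  DOUBLE NOT NULL
)
STORED AS CSV
WITH HEADER ROW
LOCATION 'data/J1_1e7_NA_0_0.csv';

-- Check that v1 is now reported with a float type instead of Utf8.
SHOW COLUMNS FROM x_typed;
```

If a query against `x_typed` then fails while parsing `v1`, the offending values may point to why inference fell back to Utf8.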
