2010YOUY01 opened a new issue, #17774: URL: https://github.com/apache/datafusion/issues/17774
### Describe the bug datafusion-cli is compiled from the latest main commit 5bbdb7eb1 To reproduce, tpch dataset can be generated via https://github.com/clflushopt/tpchgen-rs/tree/main/tpchgen-cli, and update the following paths ``` CREATE EXTERNAL TABLE customer STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/customer.parquet'; CREATE EXTERNAL TABLE nation STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/nation.parquet'; CREATE EXTERNAL TABLE part STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/part.parquet'; CREATE EXTERNAL TABLE region STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/region.parquet'; CREATE EXTERNAL TABLE lineitem STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/lineitem.parquet'; CREATE EXTERNAL TABLE orders STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/orders.parquet'; CREATE EXTERNAL TABLE partsupp STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/partsupp.parquet'; CREATE EXTERNAL TABLE supplier STORED AS PARQUET LOCATION '/Users/yongting/Code/datafusion-sqlstorm/data/supplier.parquet'; WITH MonthlySales AS ( SELECT DATE_TRUNC('month', o_orderdate) AS sales_month, SUM(l_extendedprice) AS total_sales FROM orders AS o JOIN lineitem AS l ON o.o_orderkey = l.l_orderkey GROUP BY sales_month ), TopRegions AS ( SELECT n.n_name AS region_name, SUM(ps.ps_supplycost) AS region_supply_cost FROM partsupp AS ps JOIN supplier AS s ON ps.ps_suppkey = s.s_suppkey JOIN nation AS n ON s.s_nationkey = n.n_nationkey GROUP BY region_name ORDER BY region_supply_cost DESC LIMIT 5 ) SELECT ms.sales_month, ms.total_sales, tr.region_name, tr.region_supply_cost FROM MonthlySales AS ms LEFT JOIN TopRegions AS tr ON tr.region_supply_cost > ( SELECT AVG(region_supply_cost) FROM TopRegions ); Optimizer rule 'eliminate_cross_join' failed caused by Check optimizer-specific invariants after optimizer rule: eliminate_cross_join caused by Internal error: Failed due to a difference in schemas: original schema: DFSchema { inner: Schema { fields: [Field { name: "sales_month", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "total_sales", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_name", data_type: Utf8View, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_supply_cost", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [Some(Bare { table: "ms" }), Some(Bare { table: "ms" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" })], functional_dependencies: FunctionalDependencies { deps: [FunctionalDependence { source_indices: [0], target_indices: [0, 1], nullable: true, mode: Multi }, FunctionalDependence { source_indices: [2], target_indices: [2, 3] , nullable: true, mode: Multi }] } }, new schema: DFSchema { inner: Schema { fields: [Field { name: "sales_month", data_type: Timestamp(Nanosecond, None), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "total_sales", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_name", data_type: Utf8View, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "region_supply_cost", data_type: Decimal128(25, 2), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "avg(topregions.region_supply_cost)", data_type: Decimal128(29, 6), nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }], metadata: {} }, field_qualifiers: [Some(Bare { table: "ms" }), Some(Bare { table: "ms" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" }), Some(Bare { table: "tr" })], functional_dependencies: FunctionalDependencies { deps: [FunctionalD ependence { source_indices: [0], target_indices: [0, 1], nullable: true, mode: Multi }, FunctionalDependence { source_indices: [2], target_indices: [2, 3], nullable: true, mode: Multi }] } }. This issue was likely caused by a bug in DataFusion's code. Please help us to resolve this by filing a bug report in our issue tracker: https://github.com/apache/datafusion/issues ``` ### To Reproduce _No response_ ### Expected behavior _No response_ ### Additional context Found by SQLStorm #17698 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
