alamb commented on code in PR #6566:
URL: https://github.com/apache/arrow-datafusion/pull/6566#discussion_r1221355361
##########
datafusion/core/src/execution/context.rs:
##########
@@ -518,7 +518,7 @@ impl SessionContext {
let physical = DataFrame::new(self.state(), input);
let batches: Vec<_> = physical.collect_partitioned().await?;
- let table = Arc::new(MemTable::try_new(schema, batches)?);
+ let table = Arc::new(MemTable::new_not_registered(schema,
batches));
Review Comment:
I think using the same names in the physical and logical plans is preferable
because the rest of the code expects this and sometimes assumes it is the
case (because it mostly is).
If we don't make the logical and physical plans match up, I predict we will
continue to hit a long tail of bugs caused by the discrepancy: schema
mismatches that appear only when using window functions.
If the long display name is a problem (and I can see how it would be),
perhaps we can figure out how to make `display_name` produce something shorter
for window functions, rather than serializing the entire window definition.
Here is what Postgres does:
```sql
postgres=# select first_value(x) over (order by x) from foo;
first_value
-------------
1
(1 row)
```
We probably need to do something more sophisticated as DataFusion needs
distinct column names.
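
To make the "short but distinct" idea concrete, here is a minimal sketch of one possible naming scheme. `WindowNamer` and its `name` method are hypothetical and are not part of DataFusion; the `#N` suffix format is also just an assumption for illustration. The sketch builds a Postgres-style short name from the function and its arguments and only appends a suffix when the same short name has already been issued:

```rust
use std::collections::HashMap;

/// Hypothetical helper (not a DataFusion API): hands out short,
/// distinct column names for window expressions.
#[derive(Default)]
struct WindowNamer {
    // Number of times each short base name has been issued so far.
    seen: HashMap<String, usize>,
}

impl WindowNamer {
    /// Builds a short name such as `first_value(x)` and appends `#N`
    /// when the same base name was already issued N-1 times.
    fn name(&mut self, func: &str, args: &[&str]) -> String {
        let base = format!("{}({})", func, args.join(", "));
        let count = self.seen.entry(base.clone()).or_insert(0);
        *count += 1;
        if *count == 1 {
            base
        } else {
            format!("{}#{}", base, *count)
        }
    }
}
```

For the names to match up, a scheme like this would have to be applied once and the resulting name reused by both the logical and the physical plan, rather than each plan deriving its own.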