houqp commented on issue #1507: URL: https://github.com/apache/arrow-datafusion/issues/1507#issuecomment-1019798820
@matthewmturner this is indeed expected behavior. If you need to create a dataframe from record batches, you need to explicitly add aliases so that all output field names are unique. Postgres behaves the same way; see this subquery as an example: http://sqlfiddle.com/#!17/bf2fd/26130.

> Given Arrow is in-memory dataframe, with extra parameter, it should be possible to ask ctx.sql to return dataframe directly?

@jychen7, SQL queries only return record batches, which have no concept of column qualifiers (table names). This also aligns the behavior with other databases and query engines. For example, Postgres strips all qualifiers from its query output as well. The `create_dataframe` API is actually not a very well-named method: what it actually does is create a memtable and wrap it in a TableScan plan. It wouldn't make much sense for a SQL query to return a table (memtable) in this case.
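To illustrate the point about qualifiers being stripped from query output, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in engine. It is not DataFusion or Postgres, and the tables `t1`/`t2` and the helpers `output_column_names` and `duplicate_names` are hypothetical names made up for this example. It only shows the general pattern: an unaliased join can yield duplicate output names, and explicit aliases make them unique.

```python
import sqlite3
from collections import Counter


def output_column_names(conn, query):
    """Return the column names of a query result, in output order."""
    cursor = conn.execute(query)
    return [col[0] for col in cursor.description]


def duplicate_names(names):
    """Return the names that occur more than once, sorted."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


# Hypothetical tables that share a column name (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, v TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER, w TEXT)")
conn.execute("INSERT INTO t1 VALUES (1, 'a')")
conn.execute("INSERT INTO t2 VALUES (1, 'b')")

# Without aliases, the table qualifiers do not survive into the output names.
unaliased = output_column_names(
    conn, "SELECT t1.id, t2.id FROM t1 JOIN t2 ON t1.id = t2.id"
)

# With explicit aliases, every output field name is unique.
aliased = output_column_names(
    conn,
    "SELECT t1.id AS t1_id, t2.id AS t2_id FROM t1 JOIN t2 ON t1.id = t2.id",
)

print(unaliased, duplicate_names(unaliased))
print(aliased, duplicate_names(aliased))
```

A check like `duplicate_names` could be run on a result's field names before handing the record batches to `create_dataframe`, assuming the field names are available as a plain list of strings.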