luobei42 opened a new issue, #3819:
URL: https://github.com/apache/arrow-datafusion/issues/3819
Just ran into an issue with a left join. The author with id 0 has no matching
rows in the right table, so a left join should still return that author with a
count of 0, but the query returns no result. Using datafusion==0.6.0 from
Python. The following is a small test case to reproduce the issue:
import pyarrow as pa
import datafusion
ctx = datafusion.SessionContext()
batch = pa.RecordBatch.from_arrays(
    [pa.array(a) for a in [
        [0, 1, 2, 3],
        ['Aldous0 Abfalterer0', 'Bob0 Abramovich0',
         'Cyril0 Jobstreibitzer0', 'Dmitry0 Laurent0']]
    ],
    names=['id', 'name']
)
ctx.register_record_batches('author', [[batch]])
batch = pa.RecordBatch.from_arrays(
    [pa.array(a) for a in [
        [0, 1, 2, 3, 4, 5],
        [1, 2, 2, 3, 3, 3],
        ['always a true house', 'always one true banana',
         'never the false tree', 'always the true table',
         'never a false house', 'always one green window']]
    ],
    names=['id', 'author_id', 'name']
)
ctx.register_record_batches('book', [[batch]])
print(ctx.sql("select count(*) from author").collect()[0].to_pylist())  # ok
print(ctx.sql("select count(*) from book").collect()[0].to_pylist())  # ok
print(ctx.sql("""\
    SELECT a.id, a.name, count(distinct b.id)
    from author a
    left join book b on a.id=b.author_id
    where a.id=0
    group by a.id, a.name
""").collect()[0].to_pylist())  # no result, left join not working
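
For comparison, here is a minimal sketch of the expected result, run against Python's built-in sqlite3 module instead of datafusion. The table definitions and column types are an assumption that mirrors the record batches above; the query is the same. It only illustrates standard left join semantics and says nothing about where the datafusion behaviour comes from.

```python
import sqlite3

# In-memory SQLite database standing in for the datafusion context.
# The schema below is an assumption mirroring the record batches above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE book (id INTEGER, author_id INTEGER, name TEXT)")

conn.executemany("INSERT INTO author VALUES (?, ?)", [
    (0, 'Aldous0 Abfalterer0'),
    (1, 'Bob0 Abramovich0'),
    (2, 'Cyril0 Jobstreibitzer0'),
    (3, 'Dmitry0 Laurent0'),
])
conn.executemany("INSERT INTO book VALUES (?, ?, ?)", [
    (0, 1, 'always a true house'),
    (1, 2, 'always one true banana'),
    (2, 2, 'never the false tree'),
    (3, 3, 'always the true table'),
    (4, 3, 'never a false house'),
    (5, 3, 'always one green window'),
])

# Same query as in the test case. Author 0 has no books, so the left join
# keeps the author row with NULLs on the book side, and count(distinct b.id)
# counts no non-NULL values.
rows = conn.execute("""
    SELECT a.id, a.name, count(distinct b.id)
    FROM author a
    LEFT JOIN book b ON a.id = b.author_id
    WHERE a.id = 0
    GROUP BY a.id, a.name
""").fetchall()
print(rows)  # [(0, 'Aldous0 Abfalterer0', 0)]
```

SQLite returns one row for author 0 with a count of 0, which is the result the datafusion query above would be expected to produce.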