luobei42 opened a new issue, #3819:
URL: https://github.com/apache/arrow-datafusion/issues/3819

   I just ran into an issue with a left join: the query returns no rows when 
the right table has no matching rows, although a left join should still return 
the row from the left table. I am using datafusion==0.6.0 from Python. The 
following small test case reproduces the issue:
   
   import pyarrow as pa
   import datafusion
   
   ctx = datafusion.SessionContext()
   
   batch = pa.RecordBatch.from_arrays(
       [pa.array(a) for a in [
           [0, 1, 2, 3],
           ['Aldous0 Abfalterer0', 'Bob0 Abramovich0', 'Cyril0 Jobstreibitzer0', 'Dmitry0 Laurent0']]
       ],
       names=['id', 'name']
   )
   ctx.register_record_batches('author', [[batch]])
   
   batch = pa.RecordBatch.from_arrays(
       [pa.array(a) for a in [
           [0, 1, 2, 3, 4, 5],
           [1, 2, 2, 3, 3, 3],
           ['always a true house', 'always one true banana', 'never the false tree', 'always the true table',
             'never a false house', 'always one green window']]
       ],
       names=['id', 'author_id', 'name']
   )
   ctx.register_record_batches('book', [[batch]])
   
   print(ctx.sql("select count(*) from author").collect()[0].to_pylist()) #ok
   print(ctx.sql("select count(*) from book").collect()[0].to_pylist()) #ok
   print(ctx.sql("""\
   SELECT a.id, a.name, count(distinct b.id) 
               from author a 
               left join book b on a.id=b.author_id
               where a.id=0
               group by a.id, a.name
   """).collect()[0].to_pylist()) #no result, left join not working
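
   For comparison, here is a minimal sketch that runs the same query against 
SQLite via Python's standard `sqlite3` module. It is not DataFusion code; it 
only illustrates the result a left join is expected to give for this data, 
assuming standard SQL semantics: author 0 has no books, so one row with a 
count of 0 should come back.

```python
import sqlite3

# In-memory SQLite database used only as a reference for standard SQL semantics.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE author (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE book (id INTEGER, author_id INTEGER, name TEXT)")

# Same rows as in the DataFusion test case above.
conn.executemany(
    "INSERT INTO author VALUES (?, ?)",
    [
        (0, "Aldous0 Abfalterer0"),
        (1, "Bob0 Abramovich0"),
        (2, "Cyril0 Jobstreibitzer0"),
        (3, "Dmitry0 Laurent0"),
    ],
)
conn.executemany(
    "INSERT INTO book VALUES (?, ?, ?)",
    [
        (0, 1, "always a true house"),
        (1, 2, "always one true banana"),
        (2, 2, "never the false tree"),
        (3, 3, "always the true table"),
        (4, 3, "never a false house"),
        (5, 3, "always one green window"),
    ],
)

# The left join keeps author 0 even though no book references it;
# count(DISTINCT b.id) ignores the NULL produced for the missing right side.
expected = conn.execute(
    """
    SELECT a.id, a.name, count(DISTINCT b.id)
    FROM author a
    LEFT JOIN book b ON a.id = b.author_id
    WHERE a.id = 0
    GROUP BY a.id, a.name
    """
).fetchall()
print(expected)  # [(0, 'Aldous0 Abfalterer0', 0)]
```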


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]