devoopsman45 commented on PR #1881:
URL: 
https://github.com/apache/datafusion-ballista/pull/1881#issuecomment-4754090781

   @martin-g Addressed the comments. Please take a look when you can. I also 
found another issue (at least, it seems like one) and would like your thoughts. 
   
   While testing this PR, I started the cluster with 
`docker compose -f docker-compose.quick.yml up` and ran the existing 
`remote-sql` example against it:
   
     ```bash   
     cd examples && cargo run --release --example remote-sql
     ```
     
     The query ran without any error but returned an empty result:
   
     ```
     ++
     ++
     ```
   
     Looking at the scheduler logs, the execution plan shows `EmptyExec` at the 
leaf where a `CsvExec` should be:
   
     ```
     FilterExec: c11@1 > 0.1 AND c11@1 < 0.9
       EmptyExec
     ```
   
     It seems that calling `register_csv` on a Rust `SessionContext` created 
with `SessionContext::remote_with_state` only registers the table locally on 
the client; the registration does not reach the scheduler. When the plan is 
serialized and sent to the scheduler, the table scan is lost and becomes 
`EmptyExec`. The same behavior occurs when running against local binaries (not 
just Docker), so it doesn't appear to be a networking issue.
    
   Interestingly, `tpch.py` (which uses `BallistaSessionContext` from the 
Python client) does not have this problem: it registers tables and returns 
output rows, so its plan is correct. Is this a known limitation? Happy to file 
a separate issue if it would be useful.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

