devoopsman45 commented on PR #1881:
URL:
https://github.com/apache/datafusion-ballista/pull/1881#issuecomment-4754090781
@martin-g Addressed the comments. Please take a look when you can. I also
found another possible issue (at least, it seems like one) and would like your thoughts.
While testing this PR, I started the cluster with
`docker compose -f docker-compose.quick.yml up` and ran the existing
`remote-sql` example against it:
```bash
cd examples && cargo run --release --example remote-sql
```
The query ran without any error but returned an empty result:
```
++
++
```
Looking at the scheduler logs, the execution plan shows `EmptyExec` at the
leaf where a `CsvExec` should be:
```
FilterExec: c11@1 > 0.1 AND c11@1 < 0.9
  EmptyExec
```
It seems that calling `register_csv` on a Rust `SessionContext` created with
`SessionContext::remote_with_state` only stores the table locally on the
client; it never reaches the scheduler. When the plan is serialized and sent
to the scheduler, the table scan is lost and becomes `EmptyExec`. The same
behavior occurs when running against local binaries (not just Docker), so it
does not appear to be a networking issue.
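For anyone who wants to check their own scheduler logs for the same symptom,
here is a minimal Python sketch. The helper name `has_empty_exec` is my own and
is not part of Ballista or DataFusion. It only does a plain text match on the
plan as it is printed in the logs, so treat it as a rough check rather than a
real plan inspection. The "healthy" plan in the example is a hypothetical
shape, not copied from a real run.

```python
def has_empty_exec(plan_text: str) -> bool:
    """Return True if any node line in a displayed plan starts with EmptyExec.

    Assumption: the plan is printed one node per line, as in the scheduler
    logs quoted above. Indentation is ignored.
    """
    return any(
        line.strip().startswith("EmptyExec")
        for line in plan_text.splitlines()
    )


# Plan fragment as reported in the scheduler logs above.
observed_plan = """\
FilterExec: c11@1 > 0.1 AND c11@1 < 0.9
  EmptyExec
"""

# Hypothetical healthy shape, for contrast only (not taken from a real run).
hypothetical_healthy_plan = """\
FilterExec: c11@1 > 0.1 AND c11@1 < 0.9
  CsvExec
"""

print(has_empty_exec(observed_plan))              # True
print(has_empty_exec(hypothetical_healthy_plan))  # False
```

A check like this only confirms the symptom; it says nothing about why the
table scan was dropped.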
Interestingly, `tpch.py` (which uses `BallistaSessionContext` from the
Python client) does not have this problem: it registers tables and returns
the expected output rows, so the plan appears to be correct. Is this a known
limitation? Happy to file a separate issue if it would be useful.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]