ZhangqyTJ opened a new issue, #10:
URL: https://github.com/apache/arrow-ballista/issues/10
I added an S3 (minio_store) module in
datafusion/src/datasource/object_store and registered it in
benchmarks/tpch.rs through the register_object_store() method of
ExecutionContext. However, after starting the Scheduler and Executor and
running "cargo run --bin tpch --release ...", the data in MinIO could not
be read.
After checking the code, I found that LocalFileSystem is hard-coded at
ballista/rust/core/src/serde/physical_plan/from_proto.rs(789) and
ballista/rust/core/src/serde/logical_plan/from_proto.rs(201). After I
changed both places to use minio_store instead, the benchmark ran
successfully.
How can Ballista be made to support external file systems without
editing these two files by hand?
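One possible direction, shown only as a minimal sketch: instead of constructing LocalFileSystem directly when a plan is deserialized, the code could look up a store by the URI scheme of the path. Everything below is a hypothetical stand-in written for illustration (the `ObjectStore` trait here has a single `name()` method, and `ObjectStoreRegistry`, `register`, and `resolve` are assumed names); it is not the actual DataFusion or Ballista API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an object store trait. A real trait would
/// expose methods for listing and reading files; `name` is illustrative.
pub trait ObjectStore: Send + Sync {
    fn name(&self) -> &str;
}

/// Stand-in for the local file system store.
pub struct LocalFileSystem;

impl ObjectStore for LocalFileSystem {
    fn name(&self) -> &str {
        "local"
    }
}

/// Stand-in for the MinIO-backed store described in this issue.
pub struct MinioStore {
    pub endpoint: String,
}

impl ObjectStore for MinioStore {
    fn name(&self) -> &str {
        "minio"
    }
}

/// Hypothetical registry mapping a URI scheme (e.g. "s3") to a store.
pub struct ObjectStoreRegistry {
    stores: HashMap<String, Arc<dyn ObjectStore>>,
}

impl ObjectStoreRegistry {
    /// Creates a registry with the local file system under scheme "file".
    pub fn new() -> Self {
        let mut stores: HashMap<String, Arc<dyn ObjectStore>> = HashMap::new();
        stores.insert("file".to_string(), Arc::new(LocalFileSystem));
        Self { stores }
    }

    /// Registers a store for a scheme, returning any store it replaced.
    pub fn register(
        &mut self,
        scheme: &str,
        store: Arc<dyn ObjectStore>,
    ) -> Option<Arc<dyn ObjectStore>> {
        self.stores.insert(scheme.to_string(), store)
    }

    /// Resolves a store from a path; paths without "://" are treated as local.
    pub fn resolve(&self, uri: &str) -> Option<Arc<dyn ObjectStore>> {
        let scheme = match uri.split_once("://") {
            Some((scheme, _)) => scheme,
            None => "file",
        };
        self.stores.get(scheme).cloned()
    }
}
```

With a registry like this available on both the scheduler and the executor, the two from_proto.rs call sites could call something like `resolve(path)` rather than naming a concrete store, so a path such as s3://test1/tpch_tbl/cutdata would map to the MinIO store while plain paths keep using the local file system. How the registry would reach the executors (configuration, plugin, or serialized with the plan) is left open here.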
**Repository with minio_store added**
_https://github.com/ZhangqyTJ/arrow-datafusion.git_
**Code to modify before running**
ballista/rust/core/src/serde/physical_plan/from_proto.rs(789)
ballista/rust/core/src/serde/logical_plan/from_proto.rs(201)
**Run commands**
To run the scheduler from source:
```bash
cd $ARROW_HOME/ballista/rust/scheduler
RUST_LOG=info cargo run --release
```
By default the scheduler will bind to `0.0.0.0` and listen on port 50050.
To run the executor from source:
```bash
cd $ARROW_HOME/ballista/rust/executor
RUST_LOG=info cargo run --release
```
To run the benchmarks:
```bash
cargo run --bin tpch --release benchmark ballista --host localhost \
  --port 50050 --query 1 --partitions 1 --path s3://test1/tpch_tbl/cutdata \
  --format tbl --storage-type minio --endpoint 192.168.75.81:9091 \
  --username minioadmin --password minioadmin --bucket test1
```