This is an automated email from the ASF dual-hosted git repository.

agrove pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow-ballista.git


The following commit(s) were added to refs/heads/main by this push:
     new 31eb4f18 feat: Added Quick Start Documentation (#960)
31eb4f18 is described below

commit 31eb4f18ac5ecf1d07cc4fbed1fcb3bdc9a203ed
Author: Matthew Aylward <[email protected]>
AuthorDate: Thu Feb 1 16:20:38 2024 +0100

    feat: Added Quick Start Documentation (#960)
    
    * feat: Added Quick Start Documentation
    
    * feat: Prettier Fix
---
 docs/source/user-guide/deployment/index.rst      |   1 +
 docs/source/user-guide/deployment/quick-start.md | 147 +++++++++++++++++++++++
 2 files changed, 148 insertions(+)

diff --git a/docs/source/user-guide/deployment/index.rst b/docs/source/user-guide/deployment/index.rst
index 29e255b6..28278d6f 100644
--- a/docs/source/user-guide/deployment/index.rst
+++ b/docs/source/user-guide/deployment/index.rst
@@ -21,6 +21,7 @@ Start a Ballista Cluster
 .. toctree::
    :maxdepth: 2
 
+   Quick Start <quick-start>
    Cargo Install <cargo-install>
    Docker <docker>
    Docker Compose <docker-compose>
diff --git a/docs/source/user-guide/deployment/quick-start.md b/docs/source/user-guide/deployment/quick-start.md
new file mode 100644
index 00000000..14c17fc0
--- /dev/null
+++ b/docs/source/user-guide/deployment/quick-start.md
@@ -0,0 +1,147 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Ballista Quickstart
+
+A simple way to start a local cluster for testing purposes is to use cargo to build the project and then run the scheduler and executor binaries directly along with the Ballista UI.
+
+Project Requirements:
+
+- [Rust](https://www.rust-lang.org/tools/install)
+- [Node.js](https://nodejs.org/en/download)
+- [Yarn](https://classic.yarnpkg.com/lang/en/docs/install)
+
+## Build the project
+
+From the root of the project, build release binaries.
+
+```shell
+cargo build --release
+```
+
+Start a Ballista scheduler process in a new terminal session.
+
+```shell
+RUST_LOG=info ./target/release/ballista-scheduler
+```
+
+Start one or more Ballista executor processes in new terminal sessions. When starting more than one
+executor, a unique port number must be specified for each executor.
+
+```shell
+RUST_LOG=info ./target/release/ballista-executor -c 2 -p 50051
+
+RUST_LOG=info ./target/release/ballista-executor -c 2 -p 50052
+```
+
+Start the Ballista UI in a new terminal session.
+
+```shell
+cd ballista/scheduler/ui
+yarn
+yarn start
+```
+
+You can now access the UI at http://localhost:3000/
+
+## Running the examples
+
+The examples can be run using the `cargo run --bin` syntax. Open a new terminal session and run the following commands.
+
+## Distributed SQL Example
+
+```bash
+cd examples
+cargo run --release --bin sql
+```
+
+### Source code for distributed SQL example
+
+```rust
+use ballista::prelude::*;
+use datafusion::prelude::CsvReadOptions;
+
+/// This example demonstrates executing a simple query against an Arrow data source (CSV) and
+/// fetching results, using SQL
+#[tokio::main]
+async fn main() -> Result<()> {
+    let config = BallistaConfig::builder()
+        .set("ballista.shuffle.partitions", "4")
+        .build()?;
+    let ctx = BallistaContext::remote("localhost", 50050, &config).await?;
+
+    // register csv file with the execution context
+    ctx.register_csv(
+        "test",
+        "testdata/aggregate_test_100.csv",
+        CsvReadOptions::new(),
+    )
+    .await?;
+
+    // execute the query
+    let df = ctx
+        .sql(
+            "SELECT c1, MIN(c12), MAX(c12) \
+        FROM test \
+        WHERE c11 > 0.1 AND c11 < 0.9 \
+        GROUP BY c1",
+        )
+        .await?;
+
+    // print the results
+    df.show().await?;
+
+    Ok(())
+}
+```
+
+## Distributed DataFrame Example
+
+```bash
+cd examples
+cargo run --release --bin dataframe
+```
+
+### Source code for distributed DataFrame example
+
+```rust
+use ballista::prelude::*;
+use datafusion::prelude::{col, lit, ParquetReadOptions};
+
+#[tokio::main]
+async fn main() -> Result<()> {
+    let config = BallistaConfig::builder()
+        .set("ballista.shuffle.partitions", "4")
+        .build()?;
+    let ctx = BallistaContext::remote("localhost", 50050, &config).await?;
+
+    let filename = "testdata/alltypes_plain.parquet";
+
+    // define the query using the DataFrame trait
+    let df = ctx
+        .read_parquet(filename, ParquetReadOptions::default())
+        .await?
+        .select_columns(&["id", "bool_col", "timestamp_col"])?
+        .filter(col("id").gt(lit(1)))?;
+
+    // print the results
+    df.show().await?;
+
+    Ok(())
+}
+```
