This is an automated email from the ASF dual-hosted git repository.
alamb pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 201ebf0 add datafusion roadmap (#154)
201ebf0 is described below
commit 201ebf0d7238c89c3749ee228bc8583008678970
Author: QP Hou <[email protected]>
AuthorDate: Wed Oct 20 03:36:09 2021 -0700
add datafusion roadmap (#154)
---
datafusion/_modules/index.html | 7 +
datafusion/_sources/cli/index.rst.txt | 6 +-
datafusion/_sources/community/communication.md.txt | 17 ++
datafusion/_sources/index.rst.txt | 1 +
datafusion/_sources/specification/roadmap.md.txt | 99 ++++++++
.../_sources/user-guide/example-usage.md.txt | 4 -
datafusion/_sources/user-guide/library.md.txt | 5 +-
datafusion/cli/index.html | 16 +-
datafusion/community/communication.html | 24 ++
datafusion/genindex.html | 32 ++-
datafusion/index.html | 8 +
datafusion/objects.inv | Bin 1632 -> 1694 bytes
datafusion/py-modindex.html | 5 +
datafusion/python/api/dataframe.html | 10 +
datafusion/python/api/execution_context.html | 10 +
datafusion/python/api/expression.html | 10 +
datafusion/python/api/functions.html | 10 +
.../python/generated/datafusion.DataFrame.html | 10 +
.../generated/datafusion.ExecutionContext.html | 10 +
.../python/generated/datafusion.Expression.html | 10 +
.../python/generated/datafusion.functions.html | 44 ++++
datafusion/search.html | 5 +
datafusion/searchindex.js | 2 +-
.../roadmap.html} | 250 ++++++++++++++-------
datafusion/user-guide/example-usage.html | 14 +-
datafusion/user-guide/library.html | 15 +-
26 files changed, 525 insertions(+), 99 deletions(-)
diff --git a/datafusion/_modules/index.html b/datafusion/_modules/index.html
index 7bcb0b0..0233d14 100644
--- a/datafusion/_modules/index.html
+++ b/datafusion/_modules/index.html
@@ -319,6 +319,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -392,6 +397,8 @@
<h1>All modules for which code is available</h1>
<ul><li><a href="builtins.html">builtins</a></li>
+<li><a href="datafusion/functions.html">datafusion.functions</a></li>
+<li><a href="functions.html">functions</a></li>
</ul>
</div>
diff --git a/datafusion/_sources/cli/index.rst.txt
b/datafusion/_sources/cli/index.rst.txt
index 93ae173..2b91430 100644
--- a/datafusion/_sources/cli/index.rst.txt
+++ b/datafusion/_sources/cli/index.rst.txt
@@ -53,7 +53,7 @@ Usage
.. code-block:: bash
- DataFusion 5.0.0-SNAPSHOT
+ DataFusion 5.1.0-SNAPSHOT
DataFusion is an in-memory query engine that uses Apache Arrow as the
memory model. It supports executing SQL queries
against CSV and Parquet files as well as querying directly against
in-memory data.
@@ -68,8 +68,10 @@ Usage
OPTIONS:
-c, --batch-size <batch-size> The batch size of each query, or use
DataFusion default
-p, --data-path <data-path> Path to your data, default to current
directory
- -f, --file <file> Execute commands from file, then exit
+ -f, --file <file>... Execute commands from file(s), then
exit
--format <format> Output format [default: table]
[possible values: csv, tsv, table, json, ndjson]
+ --host <host> Ballista scheduler host
+ --port <port> Ballista scheduler port
Type `exit` or `quit` to exit the CLI.
diff --git a/datafusion/_sources/community/communication.md.txt
b/datafusion/_sources/community/communication.md.txt
index bbf07a1..7d8e58a 100644
--- a/datafusion/_sources/community/communication.md.txt
+++ b/datafusion/_sources/community/communication.md.txt
@@ -52,6 +52,23 @@ server ([invite link](https://discord.gg/Qw5gKqHxUM)) in
case you are not able
to join the Slack workspace. If you need an invite to the Slack workspace, you
can also ask for one in our Discord server.
+### Sync up Zoom calls
+
+We have biweekly sync calls every other Thursday at 16:00 UTC
+(starting September 30, 2021) on Zoom [Meeting
Link](https://influxdata.zoom.us/j/94666921249)
+
+The [agenda](https://docs.google.com/document/d/1atCVnoff5SR4eM4Lwf2M1BBJTY6g3_HUNR6qswYJW_U/edit)
+is available if you would like to add a topic for discussion or see what is
planned.
+
+The goals of these calls are:
+
+1. Help "put a face to the name" of some of the other contributors we are working
with
+2. Discuss / synchronize on the goals and major initiatives from different
stakeholders to identify areas where more alignment is needed
+
+No decisions are made on the call and anything of substance will be discussed
on this mailing list or in github issues / google docs.
+
+We will send a summary of all sync ups to the dev@arrow.apache.org mailing
list.
+
## Contributing
Our source code is hosted on
diff --git a/datafusion/_sources/index.rst.txt
b/datafusion/_sources/index.rst.txt
index 6956d0b..bf6b250 100644
--- a/datafusion/_sources/index.rst.txt
+++ b/datafusion/_sources/index.rst.txt
@@ -52,6 +52,7 @@ Table of content
:maxdepth: 1
:caption: Specification
+ specification/roadmap
specification/invariants
specification/output-field-name-semantic
diff --git a/datafusion/_sources/specification/roadmap.md.txt
b/datafusion/_sources/specification/roadmap.md.txt
new file mode 100644
index 0000000..520815b
--- /dev/null
+++ b/datafusion/_sources/specification/roadmap.md.txt
@@ -0,0 +1,99 @@
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Roadmap
+
+This document describes high level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.
+
+DataFusion and Ballista are part of the [Apache
+Arrow](https://arrow.apache.org/) project and governed by the Apache
+Software Foundation governance model. These projects are entirely
+driven by volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you start a conversation using a github issue or the
dev@arrow.apache.org mailing list to make review efficient and avoid
+surprises.
+
+# DataFusion
+
+DataFusion's goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+[Rust](https://www.rust-lang.org/) and [Apache
Arrow](https://arrow.apache.org/)
+to provide:
+
+1. Best-in-class single node query performance
+2. A Declarative SQL query interface compatible with PostgreSQL
+3. A Dataframe API, similar to those offered by Pandas and Spark
+4. A Procedural API for programmatically creating and running execution plans
+5. High performance, data race free, ergonomic extensibility points at
every layer
+
+## Additional SQL Language Features
+
+- Complete support list on
[status](https://github.com/apache/arrow-datafusion/blob/master/README.md#status)
+- Timestamp Arithmetic
[#194](https://github.com/apache/arrow-datafusion/issues/194)
+- SQL Parser extension point
[#533](https://github.com/apache/arrow-datafusion/issues/533)
+- Support for nested structures (fields, lists, structs)
[#119](https://github.com/apache/arrow-datafusion/issues/119)
+- Remaining Set Operators (`INTERSECT` / `EXCEPT`)
[#1082](https://github.com/apache/arrow-datafusion/issues/1082)
+- Run all queries from the TPCH benchmark (see
[milestone](https://github.com/apache/arrow-datafusion/milestone/2) for more
details)
+
+## Query Optimizer
+
+- Additional constant folding / partial evaluation
[#1070](https://github.com/apache/arrow-datafusion/issues/1070)
+- More sophisticated cost based optimizer for join ordering
+- Implement advanced query optimization framework (Tokomak) #440
+
+## Datasources
+
+- Better support for reading data from remote filesystems (e.g. S3) without
caching it locally
[#907](https://github.com/apache/arrow-datafusion/issues/907)
[#1060](https://github.com/apache/arrow-datafusion/issues/1060)
+- Support for partitioned datasources
[#1139](https://github.com/apache/arrow-datafusion/issues/1139) and make the
integration of other table formats (Delta, Iceberg...) simpler
+- Improve performance of file format datasources (parallelize file listings,
async Arrow readers, file chunk prefetching capability...)
+
+## Runtime / Infrastructure
+
+- Migrate to some sort of arrow2 based implementation (see
[milestone](https://github.com/apache/arrow-datafusion/milestone/3) for more
details)
+- Add DataFusion to h2oai/db-benchmark
[#147](https://github.com/apache/arrow-datafusion/issues/147)
+- Improve build time
[#348](https://github.com/apache/arrow-datafusion/issues/348)
+
+## Resource Management
+
+- Finer grain control and limit of runtime memory
[#587](https://github.com/apache/arrow-datafusion/issues/587) and CPU usage
[#54](https://github.com/apache/arrow-datafusion/issues/64)
+
+## Python Interface
+
+TBD
+
+## DataFusion CLI (`datafusion-cli`)
+
+Note: There are some additional thoughts on a datafusion-cli vision on
[#1096](https://github.com/apache/arrow-datafusion/issues/1096#issuecomment-939418770).
+
+- Better abstraction between REPL parsing and queries so that commands are
separated and handled correctly
+- Connect to the `Statistics` subsystem and have the cli print out more stats
for query debugging, etc.
+- Improved error handling for interactive use and shell scripting usage
+- publishing to apt, brew, and possibly the NuGet registry so that people can use
it more easily
+- adopt a shorter name, like dfcli?
+
+## Ballista
+
+# Vision
+
+TBD
diff --git a/datafusion/_sources/user-guide/example-usage.md.txt
b/datafusion/_sources/user-guide/example-usage.md.txt
index 4280079..c09e1e8 100644
--- a/datafusion/_sources/user-guide/example-usage.md.txt
+++ b/datafusion/_sources/user-guide/example-usage.md.txt
@@ -23,8 +23,6 @@ Run a SQL query against data stored in a CSV:
```rust
use datafusion::prelude::*;
-use arrow::util::pretty::print_batches;
-use arrow::record_batch::RecordBatch;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
@@ -45,8 +43,6 @@ Use the DataFrame API to process data stored in a CSV:
```rust
use datafusion::prelude::*;
-use arrow::util::pretty::print_batches;
-use arrow::record_batch::RecordBatch;
#[tokio::main]
async fn main() -> datafusion::error::Result<()> {
diff --git a/datafusion/_sources/user-guide/library.md.txt
b/datafusion/_sources/user-guide/library.md.txt
index bfaf741..f4c5083 100644
--- a/datafusion/_sources/user-guide/library.md.txt
+++ b/datafusion/_sources/user-guide/library.md.txt
@@ -38,9 +38,8 @@ worth noting that using the settings in the
`[profile.release]` section will sig
```toml
[dependencies]
datafusion = { version = "5.0" , features = ["simd"]}
-tokio = { version = "^1.0", features = ["macros", "rt", "rt-multi-thread"] }
-snmalloc-rs = {version = "0.2", features= ["cache-friendly"]}
-num_cpus = "1.0"
+tokio = { version = "^1.0", features = ["rt-multi-thread"] }
+snmalloc-rs = "0.2"
[profile.release]
lto = true
diff --git a/datafusion/cli/index.html b/datafusion/cli/index.html
index 07aed1b..0715064 100644
--- a/datafusion/cli/index.html
+++ b/datafusion/cli/index.html
@@ -321,6 +321,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -364,6 +369,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
@@ -464,7 +474,7 @@ docker run -it -v <span
class="k">$(</span>your_data_location<span class="k">)</
</div>
<div class="section" id="usage">
<h2>Usage<a class="headerlink" href="#usage" title="Permalink to this
headline">¶</a></h2>
-<div class="highlight-bash notranslate"><div
class="highlight"><pre><span></span>DataFusion <span
class="m">5</span>.0.0-SNAPSHOT
+<div class="highlight-bash notranslate"><div
class="highlight"><pre><span></span>DataFusion <span
class="m">5</span>.1.0-SNAPSHOT
DataFusion is an <span class="k">in</span>-memory query engine that uses
Apache Arrow as the memory model. It supports executing SQL queries
against CSV and Parquet files as well as querying directly against <span
class="k">in</span>-memory data.
@@ -479,8 +489,10 @@ FLAGS:
OPTIONS:
-c, --batch-size <batch-size> The batch size of each query, or
use DataFusion default
-p, --data-path <data-path> Path to your data, default to
current directory
- -f, --file <file> Execute commands from file, <span
class="k">then</span> <span class="nb">exit</span>
+ -f, --file <file>... Execute commands from file<span
class="o">(</span>s<span class="o">)</span>, <span class="k">then</span> <span
class="nb">exit</span>
--format <format> Output format <span
class="o">[</span>default: table<span class="o">]</span> <span
class="o">[</span>possible values: csv, tsv, table, json, ndjson<span
class="o">]</span>
+ --host <host> Ballista scheduler host
+ --port <port> Ballista scheduler port
</pre></div>
</div>
<p>Type <cite>exit</cite> or <cite>quit</cite> to exit the CLI.</p>
diff --git a/datafusion/community/communication.html
b/datafusion/community/communication.html
index 5b23a71..f06b978 100644
--- a/datafusion/community/communication.html
+++ b/datafusion/community/communication.html
@@ -320,6 +320,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -404,6 +409,11 @@
Slack and Discord
</a>
</li>
+ <li class="toc-h3 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#sync-up-zoom-calls">
+ Sync up Zoom calls
+ </a>
+ </li>
</ul>
</li>
<li class="toc-h2 nav-item toc-entry">
@@ -488,6 +498,20 @@ server (<a class="reference external"
href="https://discord.gg/Qw5gKqHxUM">invit
to join the Slack workspace. If you need an invite to the Slack workspace, you
can also ask for one in our Discord server.</p>
</div>
+<div class="section" id="sync-up-zoom-calls">
+<h3>Sync up Zoom calls<a class="headerlink" href="#sync-up-zoom-calls"
title="Permalink to this headline">¶</a></h3>
+<p>We have biweekly sync calls every other Thursday at 16:00 UTC
+(starting September 30, 2021) on Zoom <a class="reference external"
href="https://influxdata.zoom.us/j/94666921249">Meeting Link</a></p>
+<p>The <a class="reference external"
href="https://docs.google.com/document/d/1atCVnoff5SR4eM4Lwf2M1BBJTY6g3_HUNR6qswYJW_U/edit">agenda</a>
+is available if you would like to add a topic for discussion or see what is
planned.</p>
+<p>The goals of these calls are:</p>
+<ol class="simple">
+<li><p>Help “put a face to the name” of some of the other contributors we are
working with</p></li>
+<li><p>Discuss / synchronize on the goals and major initiatives from different
stakeholders to identify areas where more alignment is needed</p></li>
+</ol>
+<p>No decisions are made on the call and anything of substance will be
discussed on this mailing list or in github issues / google docs.</p>
+<p>We will send a summary of all sync ups to the dev@arrow.apache.org
mailing list.</p>
+</div>
</div>
<div class="section" id="contributing">
<h2>Contributing<a class="headerlink" href="#contributing" title="Permalink to
this headline">¶</a></h2>
diff --git a/datafusion/genindex.html b/datafusion/genindex.html
index e1f44b7..fe789e5 100644
--- a/datafusion/genindex.html
+++ b/datafusion/genindex.html
@@ -320,6 +320,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -412,6 +417,7 @@
| <a href="#S"><strong>S</strong></a>
| <a href="#T"><strong>T</strong></a>
| <a href="#U"><strong>U</strong></a>
+ | <a href="#V"><strong>V</strong></a>
</div>
<h2 id="_">_</h2>
@@ -439,6 +445,8 @@
</li>
<li><a
href="python/generated/datafusion.Expression.html#datafusion.Expression.alias">alias()
(datafusion.Expression method)</a>
</li>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.approx_distinct">approx_distinct()
(in module datafusion.functions)</a>
+</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.array">array()
(in module datafusion.functions)</a>
@@ -503,6 +511,8 @@
<td style="width: 33%; vertical-align: top;"><ul>
<li><a
href="python/generated/datafusion.functions.html#module-datafusion.functions">datafusion.functions
(module)</a>
</li>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.digest">digest()
(in module datafusion.functions)</a>
+</li>
</ul></td>
</tr></table>
@@ -535,10 +545,12 @@
<h2 id="I">I</h2>
<table style="width: 100%" class="indextable genindextable"><tr>
<td style="width: 33%; vertical-align: top;"><ul>
- <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.in_list">in_list()
(in module datafusion.functions)</a>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.Volatility.immutable">immutable()
(datafusion.functions.Volatility static method)</a>
</li>
</ul></td>
<td style="width: 33%; vertical-align: top;"><ul>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.in_list">in_list()
(in module datafusion.functions)</a>
+</li>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.initcap">initcap()
(in module datafusion.functions)</a>
</li>
</ul></td>
@@ -661,20 +673,22 @@
</li>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.sin">sin()
(in module datafusion.functions)</a>
</li>
- </ul></td>
- <td style="width: 33%; vertical-align: top;"><ul>
<li><a
href="python/generated/datafusion.DataFrame.html#datafusion.DataFrame.sort">sort()
(datafusion.DataFrame method)</a>
<ul>
<li><a
href="python/generated/datafusion.Expression.html#datafusion.Expression.sort">(datafusion.Expression
method)</a>
</li>
</ul></li>
+ </ul></td>
+ <td style="width: 33%; vertical-align: top;"><ul>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.split_part">split_part()
(in module datafusion.functions)</a>
</li>
<li><a
href="python/generated/datafusion.ExecutionContext.html#datafusion.ExecutionContext.sql">sql()
(datafusion.ExecutionContext method)</a>
</li>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.sqrt">sqrt()
(in module datafusion.functions)</a>
</li>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.Volatility.stable">stable()
(datafusion.functions.Volatility static method)</a>
+</li>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.starts_with">starts_with()
(in module datafusion.functions)</a>
</li>
<li><a
href="python/generated/datafusion.functions.html#datafusion.functions.strpos">strpos()
(in module datafusion.functions)</a>
@@ -720,6 +734,18 @@
</ul></td>
</tr></table>
+<h2 id="V">V</h2>
+<table style="width: 100%" class="indextable genindextable"><tr>
+ <td style="width: 33%; vertical-align: top;"><ul>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.Volatility.volatile">volatile()
(datafusion.functions.Volatility static method)</a>
+</li>
+ </ul></td>
+ <td style="width: 33%; vertical-align: top;"><ul>
+ <li><a
href="python/generated/datafusion.functions.html#datafusion.functions.Volatility">Volatility
(class in datafusion.functions)</a>
+</li>
+ </ul></td>
+</tr></table>
+
</div>
diff --git a/datafusion/index.html b/datafusion/index.html
index 87c1527..8a69e67 100644
--- a/datafusion/index.html
+++ b/datafusion/index.html
@@ -320,6 +320,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -451,6 +456,9 @@
<div class="toctree-wrapper compound" id="toc-specs">
<p class="caption"><span class="caption-text">Specification</span><a
class="headerlink" href="#toc-specs" title="Permalink to this toctree">¶</a></p>
<ul>
+<li class="toctree-l1"><a class="reference internal"
href="specification/roadmap.html">Roadmap</a></li>
+<li class="toctree-l1"><a class="reference internal"
href="specification/roadmap.html#datafusion">DataFusion</a></li>
+<li class="toctree-l1"><a class="reference internal"
href="specification/roadmap.html#vision">Vision</a></li>
<li class="toctree-l1"><a class="reference internal"
href="specification/invariants.html">DataFusion’s Invariants</a></li>
<li class="toctree-l1"><a class="reference internal"
href="specification/output-field-name-semantic.html">Datafusion output field
name semantic</a></li>
</ul>
diff --git a/datafusion/objects.inv b/datafusion/objects.inv
index 75f7edc..630ef29 100644
Binary files a/datafusion/objects.inv and b/datafusion/objects.inv differ
diff --git a/datafusion/py-modindex.html b/datafusion/py-modindex.html
index d5bd232..ebc4985 100644
--- a/datafusion/py-modindex.html
+++ b/datafusion/py-modindex.html
@@ -322,6 +322,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="specification/invariants.html">
DataFusion’s Invariants
</a>
diff --git a/datafusion/python/api/dataframe.html
b/datafusion/python/api/dataframe.html
index 651ddaf..9965b9a 100644
--- a/datafusion/python/api/dataframe.html
+++ b/datafusion/python/api/dataframe.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/api/execution_context.html
b/datafusion/python/api/execution_context.html
index 95f538c..b0580c2 100644
--- a/datafusion/python/api/execution_context.html
+++ b/datafusion/python/api/execution_context.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/api/expression.html
b/datafusion/python/api/expression.html
index c1ef4b0..1e9ab9a 100644
--- a/datafusion/python/api/expression.html
+++ b/datafusion/python/api/expression.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/api/functions.html
b/datafusion/python/api/functions.html
index 342d708..f771b80 100644
--- a/datafusion/python/api/functions.html
+++ b/datafusion/python/api/functions.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/generated/datafusion.DataFrame.html
b/datafusion/python/generated/datafusion.DataFrame.html
index bd238c2..e03283a 100644
--- a/datafusion/python/generated/datafusion.DataFrame.html
+++ b/datafusion/python/generated/datafusion.DataFrame.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/generated/datafusion.ExecutionContext.html
b/datafusion/python/generated/datafusion.ExecutionContext.html
index 547bdb4..0b4078c 100644
--- a/datafusion/python/generated/datafusion.ExecutionContext.html
+++ b/datafusion/python/generated/datafusion.ExecutionContext.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/generated/datafusion.Expression.html
b/datafusion/python/generated/datafusion.Expression.html
index b2cb1db..1809823 100644
--- a/datafusion/python/generated/datafusion.Expression.html
+++ b/datafusion/python/generated/datafusion.Expression.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
diff --git a/datafusion/python/generated/datafusion.functions.html
b/datafusion/python/generated/datafusion.functions.html
index b9db979..6229760 100644
--- a/datafusion/python/generated/datafusion.functions.html
+++ b/datafusion/python/generated/datafusion.functions.html
@@ -273,6 +273,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -316,6 +321,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
@@ -560,6 +570,27 @@
</tr>
</tbody>
</table>
+<dl class="class">
+<dt id="datafusion.functions.Volatility">
+<em class="property">class </em><code class="sig-prename
descclassname">datafusion.functions.</code><code class="sig-name
descname">Volatility</code><a class="headerlink"
href="#datafusion.functions.Volatility" title="Permalink to this
definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal
notranslate"><span class="pre">object</span></code></p>
+<dl class="method">
+<dt id="datafusion.functions.Volatility.immutable">
+<em class="property">static </em><code class="sig-name
descname">immutable</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.Volatility.immutable" title="Permalink to this
definition">¶</a></dt>
+<dd></dd></dl>
+
+<dl class="method">
+<dt id="datafusion.functions.Volatility.stable">
+<em class="property">static </em><code class="sig-name
descname">stable</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.Volatility.stable" title="Permalink to this
definition">¶</a></dt>
+<dd></dd></dl>
+
+<dl class="method">
+<dt id="datafusion.functions.Volatility.volatile">
+<em class="property">static </em><code class="sig-name
descname">volatile</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.Volatility.volatile" title="Permalink to this
definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
+
<dl class="function">
<dt id="datafusion.functions.abs">
<code class="sig-prename descclassname">datafusion.functions.</code><code
class="sig-name descname">abs</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.abs" title="Permalink to this definition">¶</a></dt>
@@ -571,6 +602,12 @@
<dd></dd></dl>
<dl class="function">
+<dt id="datafusion.functions.approx_distinct">
+<code class="sig-prename descclassname">datafusion.functions.</code><code
class="sig-name descname">approx_distinct</code><span
class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.approx_distinct" title="Permalink to this
definition">¶</a></dt>
+<dd><p>This function is not documented yet</p>
+</dd></dl>
+
+<dl class="function">
<dt id="datafusion.functions.array">
<code class="sig-prename descclassname">datafusion.functions.</code><code
class="sig-name descname">array</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.array" title="Permalink to this
definition">¶</a></dt>
<dd></dd></dl>
@@ -737,6 +774,13 @@ NULL arguments are ignored.</p>
</dd></dl>
<dl class="function">
+<dt id="datafusion.functions.digest">
+<code class="sig-prename descclassname">datafusion.functions.</code><code
class="sig-name descname">digest</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.digest" title="Permalink to this
definition">¶</a></dt>
+<dd><p>Computes a binary hash of the given data. type is the algorithm to use.
+Standard algorithms are md5, sha224, sha256, sha384, sha512, blake2s, blake2b,
and blake3.</p>
+</dd></dl>
+
+<dl class="function">
<dt id="datafusion.functions.min">
<code class="sig-prename descclassname">datafusion.functions.</code><code
class="sig-name descname">min</code><span class="sig-paren">(</span><span
class="sig-paren">)</span><a class="headerlink"
href="#datafusion.functions.min" title="Permalink to this definition">¶</a></dt>
<dd><p>This function is not documented yet</p>
diff --git a/datafusion/search.html b/datafusion/search.html
index 497f011..ff49897 100644
--- a/datafusion/search.html
+++ b/datafusion/search.html
@@ -324,6 +324,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="specification/invariants.html">
DataFusion’s Invariants
</a>
diff --git a/datafusion/searchindex.js b/datafusion/searchindex.js
index 6ed70da..58a3f9c 100644
--- a/datafusion/searchindex.js
+++ b/datafusion/searchindex.js
@@ -1 +1 @@
-Search.setIndex({docnames:["cli/index","community/communication","index","python/api","python/api/dataframe","python/api/execution_context","python/api/expression","python/api/functions","python/generated/datafusion.DataFrame","python/generated/datafusion.ExecutionContext","python/generated/datafusion.Expression","python/generated/datafusion.functions","python/index","specification/invariants","specification/output-field-name-semantic","user-guide/cli","user-guide/distributed/clients/ind
[...]
\ No newline at end of file
+Search.setIndex({docnames:["cli/index","community/communication","index","python/api","python/api/dataframe","python/api/execution_context","python/api/expression","python/api/functions","python/generated/datafusion.DataFrame","python/generated/datafusion.ExecutionContext","python/generated/datafusion.Expression","python/generated/datafusion.functions","python/index","specification/invariants","specification/output-field-name-semantic","specification/roadmap","user-guide/cli","user-guide
[...]
\ No newline at end of file
diff --git a/datafusion/community/communication.html
b/datafusion/specification/roadmap.html
similarity index 56%
copy from datafusion/community/communication.html
copy to datafusion/specification/roadmap.html
index 5b23a71..a8178ef 100644
--- a/datafusion/community/communication.html
+++ b/datafusion/specification/roadmap.html
@@ -4,7 +4,7 @@
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8" />
- <title>Communication — Arrow Datafusion documentation</title>
+ <title>Roadmap — Arrow Datafusion documentation</title>
<link href="../_static/css/theme.css" rel="stylesheet" />
<link href="../_static/css/index.c5995385ac14fb8791e8eb36b4908be2.css"
rel="stylesheet" />
@@ -34,7 +34,8 @@
<script src="../_static/language_data.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
- <link rel="prev" title="Datafusion output field name semantic"
href="../specification/output-field-name-semantic.html" />
+ <link rel="next" title="DataFusion’s Invariants" href="invariants.html" />
+ <link rel="prev" title="Frequently Asked Questions"
href="../user-guide/faq.html" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="docsearch:language" content="en" />
@@ -318,14 +319,19 @@
Specification
</span>
</p>
-<ul class="nav bd-sidenav">
+<ul class="current nav bd-sidenav">
+ <li class="toctree-l1 current active">
+ <a class="current reference internal" href="#">
+ Roadmap
+ </a>
+ </li>
<li class="toctree-l1">
- <a class="reference internal" href="../specification/invariants.html">
+ <a class="reference internal" href="invariants.html">
DataFusion’s Invariants
</a>
</li>
<li class="toctree-l1">
- <a class="reference internal"
href="../specification/output-field-name-semantic.html">
+ <a class="reference internal" href="output-field-name-semantic.html">
Datafusion output field name semantic
</a>
</li>
@@ -352,9 +358,9 @@
Community
</span>
</p>
-<ul class="current nav bd-sidenav">
- <li class="toctree-l1 current active">
- <a class="current reference internal" href="#">
+<ul class="nav bd-sidenav">
+ <li class="toctree-l1">
+ <a class="reference internal" href="../community/communication.html">
Communication
</a>
</li>
@@ -389,26 +395,67 @@
<nav id="bd-toc-nav">
<ul class="visible nav section-nav flex-column">
- <li class="toc-h2 nav-item toc-entry">
- <a class="reference internal nav-link" href="#questions">
- Questions?
+ <li class="toc-h1 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#">
+ Roadmap
+ </a>
+ </li>
+ <li class="toc-h1 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#datafusion">
+ DataFusion
</a>
- <ul class="nav section-nav flex-column">
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#mailing-list">
- Mailing list
+ <ul class="visible nav section-nav flex-column">
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link"
href="#additional-sql-language-features">
+ Additional SQL Language Features
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#query-optimizer">
+ Query Optimizer
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#datasources">
+ Datasources
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#runtime-infrastructure">
+ Runtime / Infrastructure
</a>
</li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#slack-and-discord">
- Slack and Discord
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#resource-management">
+ Resource Management
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#python-interface">
+ Python Interface
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link"
href="#datafusion-cli-datafusion-cli">
+ DataFusion CLI (
+ <code class="docutils literal notranslate">
+ <span class="pre">
+ datafusion-cli
+ </span>
+ </code>
+ )
+ </a>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#ballista">
+ Ballista
</a>
</li>
</ul>
</li>
- <li class="toc-h2 nav-item toc-entry">
- <a class="reference internal nav-link" href="#contributing">
- Contributing
+ <li class="toc-h1 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#vision">
+ Vision
</a>
</li>
</ul>
@@ -420,7 +467,7 @@
<div class="tocsection editthispage">
- <a
href="https://github.com/apache/arrow-datafusion/edit/master/docs/source/community/communication.md">
+ <a
href="https://github.com/apache/arrow-datafusion/edit/master/docs/source/specification/roadmap.md">
<i class="fas fa-pencil-alt"></i> Edit this page
</a>
</div>
@@ -439,68 +486,116 @@
<div>
- <!---
- Licensed to the Apache Software Foundation (ASF) under one
- or more contributor license agreements. See the NOTICE file
- distributed with this work for additional information
- regarding copyright ownership. The ASF licenses this file
- to you under the Apache License, Version 2.0 (the
- "License"); you may not use this file except in compliance
- with the License. You may obtain a copy of the License at
+ <!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
- Unless required by applicable law or agreed to in writing,
- software distributed under the License is distributed on an
- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
- KIND, either express or implied. See the License for the
- specific language governing permissions and limitations
- under the License.
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
-->
-<div class="section" id="communication">
-<h1>Communication<a class="headerlink" href="#communication" title="Permalink
to this headline">¶</a></h1>
-<p>We welcome participation from everyone and encourage you to join us, ask
-questions, and get involved.</p>
-<p>All participation in the Apache Arrow DataFusion project is governed by the
-Apache Software Foundation’s <a class="reference external"
href="https://www.apache.org/foundation/policies/conduct.html">code of
-conduct</a>.</p>
-<div class="section" id="questions">
-<h2>Questions?<a class="headerlink" href="#questions" title="Permalink to this
headline">¶</a></h2>
-<div class="section" id="mailing-list">
-<h3>Mailing list<a class="headerlink" href="#mailing-list" title="Permalink to
this headline">¶</a></h3>
-<p>We use arrow.apache.org’s <code class="docutils literal notranslate"><span
class="pre">dev@</span></code> mailing list for project management, release
-coorindation and design discussions
-(<a class="reference external"
href="mailto:dev-subscribe%40arrow.apache.org">subscribe</a>,
-<a class="reference external"
href="mailto:dev-unsubscribe%40arrow.apache.org">unsubscribe</a>,
-<a class="reference external"
href="https://lists.apache.org/list.html?dev@arrow.apache.org">archives</a>).</p>
-<p>When emailing the dev list, please make sure to prefix the subject line
with a
-<code class="docutils literal notranslate"><span
class="pre">[DataFusion]</span></code> tag, e.g. <code class="docutils literal
notranslate"><span class="pre">"[DataFusion]</span> <span
class="pre">New</span> <span class="pre">API</span> <span
class="pre">for</span> <span class="pre">remote</span> <span
class="pre">data</span> <span class="pre">sources"</span></code>, so
-that the appropriate people in the Apache Arrow community notice the
message.</p>
+<div class="section" id="roadmap">
+<h1>Roadmap<a class="headerlink" href="#roadmap" title="Permalink to this
headline">¶</a></h1>
+<p>This document describes the high-level goals of the DataFusion and
+Ballista development community. It is not meant to restrict
+possibilities, but rather help newcomers understand the broader
+context of where the community is headed, and inspire
+additional contributions.</p>
+<p>DataFusion and Ballista are part of the <a class="reference external"
href="https://arrow.apache.org/">Apache
+Arrow</a> project and governed by the Apache
+Software Foundation governance model. These projects are entirely
+driven by volunteers, and we welcome contributions for items not on
+this roadmap. However, before submitting a large PR, we strongly
+suggest you start a conversation using a GitHub issue or the
+dev@arrow.apache.org mailing list to make review efficient and avoid
+surprises.</p>
+</div>
+<div class="section" id="datafusion">
+<h1>DataFusion<a class="headerlink" href="#datafusion" title="Permalink to
this headline">¶</a></h1>
+<p>DataFusion’s goal is to become the embedded query engine of choice
+for new analytic applications, by leveraging the unique features of
+<a class="reference external" href="https://www.rust-lang.org/">Rust</a> and
<a class="reference external" href="https://arrow.apache.org/">Apache Arrow</a>
+to provide:</p>
+<ol class="simple">
+<li><p>Best-in-class single node query performance</p></li>
+<li><p>A Declarative SQL query interface compatible with PostgreSQL</p></li>
+<li><p>A Dataframe API, similar to those offered by Pandas and Spark</p></li>
+<li><p>A Procedural API for programmatically creating and running execution
plans</p></li>
+<li><p>High-performance, data-race-free, ergonomic extensibility points at
every layer</p></li>
+</ol>
+<div class="section" id="additional-sql-language-features">
+<h2>Additional SQL Language Features<a class="headerlink"
href="#additional-sql-language-features" title="Permalink to this
headline">¶</a></h2>
+<ul class="simple">
+<li><p>Complete support list on <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/README.md#status">status</a></p></li>
+<li><p>Timestamp Arithmetic <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/194">#194</a></p></li>
+<li><p>SQL Parser extension point <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/533">#533</a></p></li>
+<li><p>Support for nested structures (fields, lists, structs) <a
class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/119">#119</a></p></li>
+<li><p>Remaining Set Operators (<code class="docutils literal
notranslate"><span class="pre">INTERSECT</span></code> / <code class="docutils
literal notranslate"><span class="pre">EXCEPT</span></code>) <a
class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/1082">#1082</a></p></li>
+<li><p>Run all queries from the TPCH benchmark (see <a class="reference
external"
href="https://github.com/apache/arrow-datafusion/milestone/2">milestone</a> for
more details)</p></li>
+</ul>
+</div>
+<div class="section" id="query-optimizer">
+<h2>Query Optimizer<a class="headerlink" href="#query-optimizer"
title="Permalink to this headline">¶</a></h2>
+<ul class="simple">
+<li><p>Additional constant folding / partial evaluation <a class="reference
external"
href="https://github.com/apache/arrow-datafusion/issues/1070">#1070</a></p></li>
+<li><p>More sophisticated cost based optimizer for join ordering</p></li>
+<li><p>Implement advanced query optimization framework (Tokomak) #440</p></li>
+</ul>
+</div>
+<div class="section" id="datasources">
+<h2>Datasources<a class="headerlink" href="#datasources" title="Permalink to
this headline">¶</a></h2>
+<ul class="simple">
+<li><p>Better support for reading data from remote filesystems (e.g. S3)
without caching it locally <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/907">#907</a> <a
class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/1060">#1060</a></p></li>
+<li><p>Support for partitioned datasources <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/1139">#1139</a> and
make the integration of other table formats (Delta, Iceberg…) simpler</p></li>
+<li><p>Improve the performance of file format datasources (parallelize file
listings, async Arrow readers, file chunk prefetching capability…)</p></li>
+</ul>
+</div>
+<div class="section" id="runtime-infrastructure">
+<h2>Runtime / Infrastructure<a class="headerlink"
href="#runtime-infrastructure" title="Permalink to this headline">¶</a></h2>
+<ul class="simple">
+<li><p>Migrate to some sort of arrow2 based implementation (see <a
class="reference external"
href="https://github.com/apache/arrow-datafusion/milestone/3">milestone</a> for
more details)</p></li>
+<li><p>Add DataFusion to h2oai/db-benchmark <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/147">#147</a></p></li>
+<li><p>Improve build time <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/348">#348</a></p></li>
+</ul>
</div>
-<div class="section" id="slack-and-discord">
-<h3>Slack and Discord<a class="headerlink" href="#slack-and-discord"
title="Permalink to this headline">¶</a></h3>
-<p>We use the official <a class="reference external"
href="https://s.apache.org/slack-invite">ASF</a> Slack workspace
-for informal discussions and coordination. This is a great place to meet other
-contributors and get guidance on where to contribute. Join us in the
-<code class="docutils literal notranslate"><span
class="pre">#arrow-rust</span></code> channel.</p>
-<p>We also have a backup Arrow Rust Discord
-server (<a class="reference external"
href="https://discord.gg/Qw5gKqHxUM">invite link</a>) in case you are not able
-to join the Slack workspace. If you need an invite to the Slack workspace, you
-can also ask for one in our Discord server.</p>
+<div class="section" id="resource-management">
+<h2>Resource Management<a class="headerlink" href="#resource-management"
title="Permalink to this headline">¶</a></h2>
+<ul class="simple">
+<li><p>Finer-grained control and limits on runtime memory <a class="reference
external" href="https://github.com/apache/arrow-datafusion/issues/587">#587</a>
and CPU usage <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/64">#54</a></p></li>
+</ul>
+</div>
+<div class="section" id="python-interface">
+<h2>Python Interface<a class="headerlink" href="#python-interface"
title="Permalink to this headline">¶</a></h2>
+<p>TBD</p>
+</div>
+<div class="section" id="datafusion-cli-datafusion-cli">
+<h2>DataFusion CLI (<code class="docutils literal notranslate"><span
class="pre">datafusion-cli</span></code>)<a class="headerlink"
href="#datafusion-cli-datafusion-cli" title="Permalink to this
headline">¶</a></h2>
+<p>Note: There are some additional thoughts on a datafusion-cli vision in <a
class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/1096#issuecomment-939418770">#1096</a>.</p>
+<ul class="simple">
+<li><p>Better abstraction between REPL parsing and queries so that commands
are separated and handled correctly</p></li>
+<li><p>Connect to the <code class="docutils literal notranslate"><span
class="pre">Statistics</span></code> subsystem and have the cli print out more
stats for query debugging, etc.</p></li>
+<li><p>Improved error handling for interactive use and shell scripting
usage</p></li>
+<li><p>Publish to apt, brew, and possibly the NuGet registry so that people can
use it more easily</p></li>
+<li><p>Adopt a shorter name, like dfcli?</p></li>
+</ul>
</div>
+<div class="section" id="ballista">
+<h2>Ballista<a class="headerlink" href="#ballista" title="Permalink to this
headline">¶</a></h2>
</div>
-<div class="section" id="contributing">
-<h2>Contributing<a class="headerlink" href="#contributing" title="Permalink to
this headline">¶</a></h2>
-<p>Our source code is hosted on
-<a class="reference external"
href="https://github.com/apache/arrow-datafusion">GitHub</a>. For developers
new to
-the project, we have curated a
-<a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22">good-first-issue</a>
-list to help you get started.</p>
-<p>We use GitHub issues for maintaining a queue of development work and as the
-public record. We often use Google docs, Github issues and pull requests for
-quick and small design discussions. For major design change proposals, please
-make sure to send them to the dev list for more visibility.</p>
</div>
+<div class="section" id="vision">
+<h1>Vision<a class="headerlink" href="#vision" title="Permalink to this
headline">¶</a></h1>
+<p>TBD</p>
</div>
@@ -509,7 +604,8 @@ make sure to send them to the dev list for more
visibility.</p>
<div class='prev-next-bottom'>
- <a class='left-prev' id="prev-link"
href="../specification/output-field-name-semantic.html" title="previous
page">Datafusion output field name semantic</a>
+ <a class='left-prev' id="prev-link" href="../user-guide/faq.html"
title="previous page">Frequently Asked Questions</a>
+ <a class='right-next' id="next-link" href="invariants.html" title="next
page">DataFusion’s Invariants</a>
</div>
diff --git a/datafusion/user-guide/example-usage.html
b/datafusion/user-guide/example-usage.html
index 9a29bb5..34237d6 100644
--- a/datafusion/user-guide/example-usage.html
+++ b/datafusion/user-guide/example-usage.html
@@ -321,6 +321,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -364,6 +369,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
@@ -430,8 +440,6 @@
<h1>Example Usage<a class="headerlink" href="#example-usage" title="Permalink
to this headline">¶</a></h1>
<p>Run a SQL query against data stored in a CSV:</p>
<div class="highlight-rust notranslate"><div
class="highlight"><pre><span></span><span class="k">use</span><span class="w">
</span><span class="n">datafusion</span>::<span class="n">prelude</span>::<span
class="o">*</span><span class="p">;</span><span class="w"></span>
-<span class="k">use</span><span class="w"> </span><span
class="n">arrow</span>::<span class="n">util</span>::<span
class="n">pretty</span>::<span class="n">print_batches</span><span
class="p">;</span><span class="w"></span>
-<span class="k">use</span><span class="w"> </span><span
class="n">arrow</span>::<span class="n">record_batch</span>::<span
class="n">RecordBatch</span><span class="p">;</span><span class="w"></span>
<span class="cp">#[tokio::main]</span><span class="w"></span>
<span class="k">async</span><span class="w"> </span><span class="k">fn</span>
<span class="nf">main</span><span class="p">()</span><span class="w">
</span>-> <span class="nc">datafusion</span>::<span
class="n">error</span>::<span class="nb">Result</span><span
class="o"><</span><span class="p">()</span><span class="o">></span><span
class="w"> </span><span class="p">{</span><span class="w"></span>
@@ -450,8 +458,6 @@
</div>
<p>Use the DataFrame API to process data stored in a CSV:</p>
<div class="highlight-rust notranslate"><div
class="highlight"><pre><span></span><span class="k">use</span><span class="w">
</span><span class="n">datafusion</span>::<span class="n">prelude</span>::<span
class="o">*</span><span class="p">;</span><span class="w"></span>
-<span class="k">use</span><span class="w"> </span><span
class="n">arrow</span>::<span class="n">util</span>::<span
class="n">pretty</span>::<span class="n">print_batches</span><span
class="p">;</span><span class="w"></span>
-<span class="k">use</span><span class="w"> </span><span
class="n">arrow</span>::<span class="n">record_batch</span>::<span
class="n">RecordBatch</span><span class="p">;</span><span class="w"></span>
<span class="cp">#[tokio::main]</span><span class="w"></span>
<span class="k">async</span><span class="w"> </span><span class="k">fn</span>
<span class="nf">main</span><span class="p">()</span><span class="w">
</span>-> <span class="nc">datafusion</span>::<span
class="n">error</span>::<span class="nb">Result</span><span
class="o"><</span><span class="p">()</span><span class="o">></span><span
class="w"> </span><span class="p">{</span><span class="w"></span>
diff --git a/datafusion/user-guide/library.html
b/datafusion/user-guide/library.html
index cc49c3c..abc081c 100644
--- a/datafusion/user-guide/library.html
+++ b/datafusion/user-guide/library.html
@@ -321,6 +321,11 @@
</p>
<ul class="nav bd-sidenav">
<li class="toctree-l1">
+ <a class="reference internal" href="../specification/roadmap.html">
+ Roadmap
+ </a>
+ </li>
+ <li class="toctree-l1">
<a class="reference internal" href="../specification/invariants.html">
DataFusion’s Invariants
</a>
@@ -364,6 +369,11 @@
Issue tracker
</a>
</li>
+ <li class="toctree-l1">
+ <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/master/CODE_OF_CONDUCT.md">
+ Code of conduct
+ </a>
+ </li>
</ul>
@@ -458,9 +468,8 @@
worth noting that using the settings in the <code class="docutils literal
notranslate"><span class="pre">[profile.release]</span></code> section will
significantly increase the build time.</p>
<div class="highlight-toml notranslate"><div
class="highlight"><pre><span></span><span class="k">[dependencies]</span>
<span class="n">datafusion</span> <span class="o">=</span> <span
class="p">{</span> <span class="n">version</span> <span class="o">=</span>
<span class="s">"5.0"</span> <span class="p">,</span> <span
class="n">features</span> <span class="o">=</span> <span
class="p">[</span><span class="s">"simd"</span><span
class="p">]}</span>
-<span class="n">tokio</span> <span class="o">=</span> <span class="p">{</span>
<span class="n">version</span> <span class="o">=</span> <span
class="s">"^1.0"</span><span class="p">,</span> <span
class="n">features</span> <span class="o">=</span> <span
class="p">[</span><span class="s">"macros"</span><span
class="p">,</span> <span class="s">"rt"</span><span
class="p">,</span> <span class="s">"rt-multi-thread"</span><span
class="p">]</span> <span cla [...]
-<span class="n">snmalloc-rs</span> <span class="o">=</span> <span
class="p">{</span><span class="n">version</span> <span class="o">=</span> <span
class="s">"0.2"</span><span class="p">,</span> <span
class="n">features</span><span class="o">=</span> <span class="p">[</span><span
class="s">"cache-friendly"</span><span class="p">]}</span>
-<span class="n">num_cpus</span> <span class="o">=</span> <span
class="s">"1.0"</span>
+<span class="n">tokio</span> <span class="o">=</span> <span class="p">{</span>
<span class="n">version</span> <span class="o">=</span> <span
class="s">"^1.0"</span><span class="p">,</span> <span
class="n">features</span> <span class="o">=</span> <span
class="p">[</span><span class="s">"rt-multi-thread"</span><span
class="p">]</span> <span class="p">}</span>
+<span class="n">snmalloc-rs</span> <span class="o">=</span> <span
class="s">"0.2"</span>
<span class="k">[profile.release]</span>
<span class="n">lto</span> <span class="o">=</span> <span
class="kc">true</span>