This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 20b88cd58a Publish built docs triggered by
33b15c1e8a670bee7ceb11f5f02e445e0e16bff0
20b88cd58a is described below
commit 20b88cd58afaf3e949640659007ee83318de4337
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Tue May 16 21:36:24 2023 +0000
Publish built docs triggered by 33b15c1e8a670bee7ceb11f5f02e445e0e16bff0
---
_sources/contributor-guide/index.md.txt | 45 ++++++-----
contributor-guide/index.html | 128 ++++++++++++++++----------------
searchindex.js | 2 +-
3 files changed, 87 insertions(+), 88 deletions(-)
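
Note on the unit-test convention described in the diff below: the following is a minimal sketch, not code from DataFusion, of a test module that lives in the same source file as the code it covers. The helper `normalize_identifier` is hypothetical and exists only for illustration. Rust projects commonly name the module `tests` and gate it with `#[cfg(test)]`.

```rust
// Hypothetical helper used only for illustration; it is not part of DataFusion.
pub fn normalize_identifier(name: &str) -> String {
    // Strip surrounding whitespace, then fold to lowercase.
    name.trim().to_lowercase()
}

// The test module sits next to the code under test and is compiled
// only when building tests.
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn trims_and_lowercases() {
        assert_eq!(normalize_identifier("  MyTable "), "mytable");
    }
}
```

Running `cargo test` in the containing crate would pick up this module along with any other tests.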
diff --git a/_sources/contributor-guide/index.md.txt
b/_sources/contributor-guide/index.md.txt
index 174b5827a6..6dfd29a5c1 100644
--- a/_sources/contributor-guide/index.md.txt
+++ b/_sources/contributor-guide/index.md.txt
@@ -33,7 +33,7 @@ list to help you get started.
# Developer's guide
-## Pull Requests
+## Pull Request Overview
We welcome pull requests (PRs) from anyone from the community.
@@ -115,42 +115,41 @@ or run them all at once:
-
[dev/rust_lint.sh](https://github.com/apache/arrow-datafusion/blob/main/dev/rust_lint.sh)
-### Test Organization
+## Testing
-Tests are very important to ensure that improvemens or fixes are not
accidentally broken during subsequent refactorings.
+Tests are critical to ensure that DataFusion is working properly and
+is not accidentally broken during refactorings. All new features
+should have test coverage.
DataFusion has several levels of tests in its [Test
Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
-and tries to follow rust standard [Testing
Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in
the The Book.
+and tries to follow the Rust standard [Testing
Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in
The Book.
-This section highlights the most important test modules that exist
+### Unit tests
-#### Unit tests
+Tests for code in an individual module are defined in the same source file
with a `test` module, following Rust convention.
-Tests for the code in an individual module are defined in the same source file
with a `test` module, following Rust convention.
+### sqllogictests Tests
-#### Rust Integration Tests
+DataFusion's SQL implementation is tested using
[sqllogictest](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests)
tests, which are run like any other Rust test using `cargo test --test sqllogictests`.
-There are several tests of the public interface of the DataFusion library in
the
[tests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests)
directory.
-
-You can run these tests individually using a command such as
+`sqllogictests` tests may be less convenient for new contributors who are
familiar with writing `.rs` tests as they require learning another tool.
However, `sqllogictest` based tests are much easier to develop and maintain as
they 1) do not require a slow recompile/link cycle and 2) can be automatically
updated via `cargo test --test sqllogictests -- --complete`.
-```shell
-cargo test -p datafusion --test sql_integration
-```
+Like similar systems such as [DuckDB](https://duckdb.org/dev/testing),
DataFusion has chosen to trade off a slightly higher barrier to contribution
for longer term maintainability. While we are still in the process of
[migrating some old sql_integration
tests](https://github.com/apache/arrow-datafusion/issues/6195), all new tests
should be written using sqllogictests if possible.
-One very important test is the
[sql_integration](https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sql_integration.rs)
test which validates DataFusion's ability to run a large assortment of SQL
queries against an assortment of data setups.
+### Rust Integration Tests
-#### sqllogictests Tests
+There are several tests of the public interface of the DataFusion library in
the
[tests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests)
directory.
-The
[sqllogictests](https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests)
also validate DataFusion SQL against an assortment of data setups.
+You can run these tests individually using `cargo` as normal, with a command such as
-Data Driven tests have many benefits including being easier to write and
maintain. We are in the process of [migrating sql_integration
tests](https://github.com/apache/arrow-datafusion/issues/4460) and encourage
-you to add new tests using sqllogictests if possible.
+```shell
+cargo test -p datafusion --test dataframe
+```
-### Benchmarks
+## Benchmarks
-#### Criterion Benchmarks
+### Criterion Benchmarks
[Criterion](https://docs.rs/criterion/latest/criterion/index.html) is a
statistics-driven micro-benchmarking framework used by DataFusion for
evaluating the performance of specific code-paths. In particular, the criterion
benchmarks help to both guide optimisation efforts, and prevent performance
regressions within DataFusion.
@@ -164,7 +163,7 @@ A full list of benchmarks can be found
[here](https://github.com/apache/arrow-da
_[cargo-criterion](https://github.com/bheisler/cargo-criterion) may also be
used for more advanced reporting._
-#### Parquet SQL Benchmarks
+### Parquet SQL Benchmarks
The parquet SQL benchmarks can be run with
@@ -178,7 +177,7 @@ If the environment variable `PARQUET_FILE` is set, the
benchmark will run querie
The benchmark will automatically remove any generated parquet file on exit,
however, if interrupted (e.g. by CTRL+C) it will not. This can be useful for
analysing the particular file after the fact, or preserving it to use with
`PARQUET_FILE` in subsequent runs.
-#### Upstream Benchmark Suites
+### Upstream Benchmark Suites
Instructions and tooling for running upstream benchmark suites against
DataFusion can be found in
[benchmarks](https://github.com/apache/arrow-datafusion/tree/main/benchmarks).
diff --git a/contributor-guide/index.html b/contributor-guide/index.html
index 21821d538f..58b55acdce 100644
--- a/contributor-guide/index.html
+++ b/contributor-guide/index.html
@@ -289,8 +289,8 @@
</a>
<ul class="visible nav section-nav flex-column">
<li class="toc-h2 nav-item toc-entry">
- <a class="reference internal nav-link" href="#pull-requests">
- Pull Requests
+ <a class="reference internal nav-link" href="#pull-request-overview">
+ Pull Request Overview
</a>
</li>
<li class="toc-h2 nav-item toc-entry">
@@ -313,49 +313,49 @@
Bootstrap environment
</a>
</li>
+ </ul>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#testing">
+ Testing
+ </a>
+ <ul class="nav section-nav flex-column">
+ <li class="toc-h3 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#unit-tests">
+ Unit tests
+ </a>
+ </li>
+ <li class="toc-h3 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#sqllogictests-tests">
+ sqllogictests Tests
+ </a>
+ </li>
+ <li class="toc-h3 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#rust-integration-tests">
+ Rust Integration Tests
+ </a>
+ </li>
+ </ul>
+ </li>
+ <li class="toc-h2 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#benchmarks">
+ Benchmarks
+ </a>
+ <ul class="nav section-nav flex-column">
+ <li class="toc-h3 nav-item toc-entry">
+ <a class="reference internal nav-link" href="#criterion-benchmarks">
+ Criterion Benchmarks
+ </a>
+ </li>
<li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#test-organization">
- Test Organization
+ <a class="reference internal nav-link" href="#parquet-sql-benchmarks">
+ Parquet SQL Benchmarks
</a>
- <ul class="nav section-nav flex-column">
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link" href="#unit-tests">
- Unit tests
- </a>
- </li>
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link" href="#rust-integration-tests">
- Rust Integration Tests
- </a>
- </li>
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link" href="#sqllogictests-tests">
- sqllogictests Tests
- </a>
- </li>
- </ul>
</li>
<li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#benchmarks">
- Benchmarks
+ <a class="reference internal nav-link" href="#upstream-benchmark-suites">
+ Upstream Benchmark Suites
</a>
- <ul class="nav section-nav flex-column">
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link" href="#criterion-benchmarks">
- Criterion Benchmarks
- </a>
- </li>
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link" href="#parquet-sql-benchmarks">
- Parquet SQL Benchmarks
- </a>
- </li>
- <li class="toc-h4 nav-item toc-entry">
- <a class="reference internal nav-link"
href="#upstream-benchmark-suites">
- Upstream Benchmark Suites
- </a>
- </li>
- </ul>
</li>
</ul>
</li>
@@ -460,8 +460,8 @@ list to help you get started.</p>
</section>
<section id="developer-s-guide">
<h1>Developer’s guide<a class="headerlink" href="#developer-s-guide"
title="Permalink to this heading">¶</a></h1>
-<section id="pull-requests">
-<h2>Pull Requests<a class="headerlink" href="#pull-requests" title="Permalink
to this heading">¶</a></h2>
+<section id="pull-request-overview">
+<h2>Pull Request Overview<a class="headerlink" href="#pull-request-overview"
title="Permalink to this heading">¶</a></h2>
<p>We welcome pull requests (PRs) from anyone from the community.</p>
<p>DataFusion is a very active fast-moving project and we try to review and
merge PRs quickly to keep the review backlog down and the pace up. After review
and approval, one of the <a class="reference external"
href="https://arrow.apache.org/committers/">many people with commit access</a>
will merge your PR.</p>
<p>Review bandwidth is currently our most limited resource, and we highly
encourage reviews by the broader community. If you are waiting for your PR to
be reviewed, consider helping review other PRs that are waiting. Such review
both helps the reviewer to learn the codebase and become more expert, as well
as helps identify issues in the PR (such as lack of test coverage), that can be
addressed and make future reviews faster and more efficient.</p>
@@ -534,37 +534,38 @@ libprotoc<span class="w"> </span><span
class="m">3</span>.12.4
<li><p><a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/main/dev/rust_lint.sh">dev/rust_lint.sh</a></p></li>
</ul>
</section>
-<section id="test-organization">
-<h3>Test Organization<a class="headerlink" href="#test-organization"
title="Permalink to this heading">¶</a></h3>
-<p>Tests are very important to ensure that improvemens or fixes are not
accidentally broken during subsequent refactorings.</p>
+</section>
+<section id="testing">
+<h2>Testing<a class="headerlink" href="#testing" title="Permalink to this
heading">¶</a></h2>
+<p>Tests are critical to ensure that DataFusion is working properly and
+is not accidentally broken during refactorings. All new features
+should have test coverage.</p>
<p>DataFusion has several levels of tests in its <a class="reference external"
href="https://martinfowler.com/articles/practical-test-pyramid.html">Test
Pyramid</a>
-and tries to follow rust standard <a class="reference external"
href="https://doc.rust-lang.org/book/ch11-03-test-organization.html">Testing
Organization</a> in the The Book.</p>
-<p>This section highlights the most important test modules that exist</p>
+and tries to follow the Rust standard <a class="reference external"
href="https://doc.rust-lang.org/book/ch11-03-test-organization.html">Testing
Organization</a> in The Book.</p>
<section id="unit-tests">
-<h4>Unit tests<a class="headerlink" href="#unit-tests" title="Permalink to
this heading">¶</a></h4>
-<p>Tests for the code in an individual module are defined in the same source
file with a <code class="docutils literal notranslate"><span
class="pre">test</span></code> module, following Rust convention.</p>
+<h3>Unit tests<a class="headerlink" href="#unit-tests" title="Permalink to
this heading">¶</a></h3>
+<p>Tests for code in an individual module are defined in the same source file
with a <code class="docutils literal notranslate"><span
class="pre">test</span></code> module, following Rust convention.</p>
+</section>
+<section id="sqllogictests-tests">
+<h3>sqllogictests Tests<a class="headerlink" href="#sqllogictests-tests"
title="Permalink to this heading">¶</a></h3>
+<p>DataFusion’s SQL implementation is tested using <a class="reference
external"
href="https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests">sqllogictest</a>
tests, which are run like any other Rust test using <code class="docutils literal
notranslate"><span class="pre">cargo</span> <span class="pre">test</span> <span
class="pre">--test</span> <span class="pre">sqllogictests</span></code>.</p>
+<p><code class="docutils literal notranslate"><span
class="pre">sqllogictests</span></code> tests may be less convenient for new
contributors who are familiar with writing <code class="docutils literal
notranslate"><span class="pre">.rs</span></code> tests as they require learning
another tool. However, <code class="docutils literal notranslate"><span
class="pre">sqllogictest</span></code> based tests are much easier to develop
and maintain as they 1) do not require a slow recompile/link [...]
+<p>Like similar systems such as <a class="reference external"
href="https://duckdb.org/dev/testing">DuckDB</a>, DataFusion has chosen to
trade off a slightly higher barrier to contribution for longer term
maintainability. While we are still in the process of <a class="reference
external"
href="https://github.com/apache/arrow-datafusion/issues/6195">migrating some
old sql_integration tests</a>, all new tests should be written using
sqllogictests if possible.</p>
</section>
<section id="rust-integration-tests">
-<h4>Rust Integration Tests<a class="headerlink" href="#rust-integration-tests"
title="Permalink to this heading">¶</a></h4>
+<h3>Rust Integration Tests<a class="headerlink" href="#rust-integration-tests"
title="Permalink to this heading">¶</a></h3>
<p>There are several tests of the public interface of the DataFusion library
in the <a class="reference external"
href="https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests">tests</a>
directory.</p>
-<p>You can run these tests individually using a command such as</p>
-<div class="highlight-shell notranslate"><div
class="highlight"><pre><span></span>cargo<span class="w"> </span><span
class="nb">test</span><span class="w"> </span>-p<span class="w">
</span>datafusion<span class="w"> </span>--test<span class="w">
</span>sql_integration
+<p>You can run these tests individually using <code class="docutils literal
notranslate"><span class="pre">cargo</span></code> as normal, with a command such as</p>
+<div class="highlight-shell notranslate"><div
class="highlight"><pre><span></span>cargo<span class="w"> </span><span
class="nb">test</span><span class="w"> </span>-p<span class="w">
</span>datafusion<span class="w"> </span>--test<span class="w"> </span>dataframe
</pre></div>
</div>
-<p>One very important test is the <a class="reference external"
href="https://github.com/apache/arrow-datafusion/blob/main/datafusion/core/tests/sql_integration.rs">sql_integration</a>
test which validates DataFusion’s ability to run a large assortment of SQL
queries against an assortment of data setups.</p>
-</section>
-<section id="sqllogictests-tests">
-<h4>sqllogictests Tests<a class="headerlink" href="#sqllogictests-tests"
title="Permalink to this heading">¶</a></h4>
-<p>The <a class="reference external"
href="https://github.com/apache/arrow-datafusion/tree/main/datafusion/core/tests/sqllogictests">sqllogictests</a>
also validate DataFusion SQL against an assortment of data setups.</p>
-<p>Data Driven tests have many benefits including being easier to write and
maintain. We are in the process of <a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues/4460">migrating
sql_integration tests</a> and encourage
-you to add new tests using sqllogictests if possible.</p>
</section>
</section>
<section id="benchmarks">
-<h3>Benchmarks<a class="headerlink" href="#benchmarks" title="Permalink to
this heading">¶</a></h3>
+<h2>Benchmarks<a class="headerlink" href="#benchmarks" title="Permalink to
this heading">¶</a></h2>
<section id="criterion-benchmarks">
-<h4>Criterion Benchmarks<a class="headerlink" href="#criterion-benchmarks"
title="Permalink to this heading">¶</a></h4>
+<h3>Criterion Benchmarks<a class="headerlink" href="#criterion-benchmarks"
title="Permalink to this heading">¶</a></h3>
<p><a class="reference external"
href="https://docs.rs/criterion/latest/criterion/index.html">Criterion</a> is a
statistics-driven micro-benchmarking framework used by DataFusion for
evaluating the performance of specific code-paths. In particular, the criterion
benchmarks help to both guide optimisation efforts, and prevent performance
regressions within DataFusion.</p>
<p>Criterion integrates with Cargo’s built-in <a class="reference external"
href="https://doc.rust-lang.org/cargo/commands/cargo-bench.html">benchmark
support</a> and a given benchmark can be run with</p>
<div class="highlight-default notranslate"><div
class="highlight"><pre><span></span><span class="n">cargo</span> <span
class="n">bench</span> <span class="o">--</span><span class="n">bench</span>
<span class="n">BENCHMARK_NAME</span>
@@ -574,7 +575,7 @@ you to add new tests using sqllogictests if possible.</p>
<p><em><a class="reference external"
href="https://github.com/bheisler/cargo-criterion">cargo-criterion</a> may also
be used for more advanced reporting.</em></p>
</section>
<section id="parquet-sql-benchmarks">
-<h4>Parquet SQL Benchmarks<a class="headerlink" href="#parquet-sql-benchmarks"
title="Permalink to this heading">¶</a></h4>
+<h3>Parquet SQL Benchmarks<a class="headerlink" href="#parquet-sql-benchmarks"
title="Permalink to this heading">¶</a></h3>
<p>The parquet SQL benchmarks can be run with</p>
<div class="highlight-default notranslate"><div
class="highlight"><pre><span></span> <span class="n">cargo</span> <span
class="n">bench</span> <span class="o">--</span><span class="n">bench</span>
<span class="n">parquet_query_sql</span>
</pre></div>
@@ -584,12 +585,11 @@ you to add new tests using sqllogictests if possible.</p>
<p>The benchmark will automatically remove any generated parquet file on exit,
however, if interrupted (e.g. by CTRL+C) it will not. This can be useful for
analysing the particular file after the fact, or preserving it to use with
<code class="docutils literal notranslate"><span
class="pre">PARQUET_FILE</span></code> in subsequent runs.</p>
</section>
<section id="upstream-benchmark-suites">
-<h4>Upstream Benchmark Suites<a class="headerlink"
href="#upstream-benchmark-suites" title="Permalink to this heading">¶</a></h4>
+<h3>Upstream Benchmark Suites<a class="headerlink"
href="#upstream-benchmark-suites" title="Permalink to this heading">¶</a></h3>
<p>Instructions and tooling for running upstream benchmark suites against
DataFusion can be found in <a class="reference external"
href="https://github.com/apache/arrow-datafusion/tree/main/benchmarks">benchmarks</a>.</p>
<p>These are valuable for comparative evaluation against alternative Arrow
implementations and query engines.</p>
</section>
</section>
-</section>
<section id="howtos">
<h2>HOWTOs<a class="headerlink" href="#howtos" title="Permalink to this
heading">¶</a></h2>
<section id="how-to-add-a-new-scalar-function">
diff --git a/searchindex.js b/searchindex.js
index 3c8abafa05..2c4cbdcf1b 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/configs", "user-guide/dataframe",
"user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "use
[...]
\ No newline at end of file
+Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/configs", "user-guide/dataframe",
"user-guide/example-usage", "user-guide/expressions", "user-guide/faq", "use
[...]
\ No newline at end of file