This is an automated email from the ASF dual-hosted git repository.
github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/arrow-datafusion.git
The following commit(s) were added to refs/heads/asf-site by this push:
new 9e7473721b Publish built docs triggered by
e9edd0cb4592eb3ca644bc2a2a7674042486802b
9e7473721b is described below
commit 9e7473721bf357977ba269ceefe302a056489bda
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Mon Apr 10 21:32:28 2023 +0000
Publish built docs triggered by e9edd0cb4592eb3ca644bc2a2a7674042486802b
---
_sources/contributor-guide/architecture.md.txt | 13 ++++---------
_sources/contributor-guide/index.md.txt | 17 ++++++++++++-----
contributor-guide/architecture.html | 17 +++--------------
contributor-guide/index.html | 16 +++++++++++-----
searchindex.js | 2 +-
5 files changed, 31 insertions(+), 34 deletions(-)
diff --git a/_sources/contributor-guide/architecture.md.txt
b/_sources/contributor-guide/architecture.md.txt
index 081866aa8f..48c065f5b7 100644
--- a/_sources/contributor-guide/architecture.md.txt
+++ b/_sources/contributor-guide/architecture.md.txt
@@ -19,13 +19,8 @@
# Architecture
-There is no formal document describing DataFusion's architecture yet, but the
following presentations offer a good overview of its different components and
how they interact together.
+DataFusion's code structure and organization is described in the
+[Crate Documentation], to keep it as close to the source as
+possible.
-- [Apr 2023]: The Apache Arrow DataFusion Architecture talks series by @alamb
- - _Query Engine_: [recording](https://youtu.be/NVKujPxwSBA) and
[slides](https://docs.google.com/presentation/d/1D3GDVas-8y0sA4c8EOgdCvEjVND4s2E7I6zfs67Y4j8/edit#slide=id.p)
- - _Logical Plan and Expressions_: [recording](https://youtu.be/EzZTLiSJnhY)
and
[slides](https://docs.google.com/presentation/d/1ypylM3-w60kVDW7Q6S99AHzvlBgciTdjsAfqNP85K30/edit#slide=id.gbe21b752a6_0_218)
- - _Physical Plan and Execution_: [recording](https://youtu.be/2jkWU3_w6z0)
and
[slides](https://docs.google.com/presentation/d/1cA2WQJ2qg6tx6y4Wf8FH2WVSm9JQ5UgmBWATHdik0hg/edit?usp=sharing)
-- [February 2021]: How DataFusion is used within the Ballista Project is
described in \*Ballista: Distributed Compute with Rust and Apache Arrow:
[recording](https://www.youtube.com/watch?v=ZZHQaOap9pQ)
-- [July 2022]: DataFusion and Arrow: Supercharge Your Data Analytical Tool
with a Rusty Query Engine:
[recording](https://www.youtube.com/watch?v=Rii1VTn3seQ) and
[slides](https://docs.google.com/presentation/d/1q1bPibvu64k2b7LPi7Yyb0k3gA1BiUYiUbEklqW1Ckc/view#slide=id.g11054eeab4c_0_1165)
-- [March 2021]: The DataFusion architecture is described in _Query Engine
Design and the Rust-Based DataFusion in Apache Arrow_:
[recording](https://www.youtube.com/watch?v=K6eCAVEk4kU) (DataFusion content
starts [~ 15 minutes in](https://www.youtube.com/watch?v=K6eCAVEk4kU&t=875s))
and
[slides](https://www.slideshare.net/influxdata/influxdb-iox-tech-talks-query-engine-design-and-the-rustbased-datafusion-in-apache-arrow-244161934)
-- [February 2021]: How DataFusion is used within the Ballista Project is
described in \*Ballista: Distributed Compute with Rust and Apache Arrow:
[recording](https://www.youtube.com/watch?v=ZZHQaOap9pQ)
+[crate documentation]:
https://docs.rs/datafusion/latest/datafusion/index.html#code-organization
diff --git a/_sources/contributor-guide/index.md.txt
b/_sources/contributor-guide/index.md.txt
index 322153d4fd..174b5827a6 100644
--- a/_sources/contributor-guide/index.md.txt
+++ b/_sources/contributor-guide/index.md.txt
@@ -23,9 +23,9 @@ We welcome and encourage contributions of all kinds, such as:
1. Tickets with issue reports or feature requests
2. Documentation improvements
-3. Code (PR or PR Review)
+3. Code, both PR and (especially) PR Review.
-In addition to submitting new PRs, we have a healthy tradition of community
members helping review each other's PRs. Doing so is a great way to help the
community as well as get more familiar with Rust and the relevant codebases.
+In addition to submitting new PRs, we have a healthy tradition of community
members reviewing each other's PRs. Doing so is a great way to help the
community as well as get more familiar with Rust and the relevant codebases.
You can find a curated
[good-first-issue](https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22)
@@ -41,6 +41,11 @@ DataFusion is a very active fast-moving project and we try
to review and merge P
Review bandwidth is currently our most limited resource, and we highly
encourage reviews by the broader community. If you are waiting for your PR to
be reviewed, consider helping review other PRs that are waiting. Such reviews
help the reviewer learn the codebase and build expertise, and they help
identify issues in the PR (such as lack of test coverage) that can be
addressed to make future reviews faster and more efficient.
+Things to look for when reviewing a PR:
+
+1. Is the feature or fix covered sufficiently with tests (see `Test
Organization` below)?
+2. Is the code clear, and does it fit the style of the existing codebase?
+
Since we are a worldwide community, we have contributors in many timezones who
review and comment. To ensure anyone who wishes has an opportunity to review a
PR, our committers try to ensure that at least 24 hours passes between when a
"major" PR is approved and when it is merged.
A "major" PR means there is a substantial change in design or a change in the
API. Committers apply their best judgment to determine what constitutes a
substantial change. A "minor" PR might be merged without a 24 hour delay, again
subject to the judgment of the committer. Examples of potential "minor" PRs are:
@@ -112,15 +117,17 @@ or run them all at once:
### Test Organization
+Tests are very important to ensure that improvements or fixes are not
accidentally broken during subsequent refactorings.
+
DataFusion has several levels of tests in its [Test
Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html)
-and tries to follow [Testing
Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in
the The Book.
+and tries to follow the standard Rust [Testing
Organization](https://doc.rust-lang.org/book/ch11-03-test-organization.html) in
The Book.
This section highlights the most important test modules that exist.
#### Unit tests
-Tests for the code in an individual module are defined in the same source file
with a `test` module, following Rust convention
+Tests for the code in an individual module are defined in the same source file
with a `test` module, following Rust convention.
#### Rust Integration Tests
@@ -240,7 +247,7 @@ dot -Tpdf < /tmp/plan.dot > /tmp/plan.pdf
## Specifications
-We formalize DataFusion semantics and behaviors through specification
+We formalize some DataFusion semantics and behaviors through specification
documents. These specifications serve as references that help
resolve ambiguities during development or code reviews.
diff --git a/contributor-guide/architecture.html
b/contributor-guide/architecture.html
index fc17ac21f2..d93cc03f80 100644
--- a/contributor-guide/architecture.html
+++ b/contributor-guide/architecture.html
@@ -347,20 +347,9 @@
-->
<section id="architecture">
<h1>Architecture<a class="headerlink" href="#architecture" title="Permalink to
this heading">¶</a></h1>
-<p>There is no formal document describing DataFusion’s architecture yet, but
the following presentations offer a good overview of its different components
and how they interact together.</p>
-<ul class="simple">
-<li><p>[Apr 2023]: The Apache Arrow DataFusion Architecture talks series by
@alamb</p>
-<ul>
-<li><p><em>Query Engine</em>: <a class="reference external"
href="https://youtu.be/NVKujPxwSBA">recording</a> and <a class="reference
external"
href="https://docs.google.com/presentation/d/1D3GDVas-8y0sA4c8EOgdCvEjVND4s2E7I6zfs67Y4j8/edit#slide=id.p">slides</a></p></li>
-<li><p><em>Logical Plan and Expressions</em>: <a class="reference external"
href="https://youtu.be/EzZTLiSJnhY">recording</a> and <a class="reference
external"
href="https://docs.google.com/presentation/d/1ypylM3-w60kVDW7Q6S99AHzvlBgciTdjsAfqNP85K30/edit#slide=id.gbe21b752a6_0_218">slides</a></p></li>
-<li><p><em>Physical Plan and Execution</em>: <a class="reference external"
href="https://youtu.be/2jkWU3_w6z0">recording</a> and <a class="reference
external"
href="https://docs.google.com/presentation/d/1cA2WQJ2qg6tx6y4Wf8FH2WVSm9JQ5UgmBWATHdik0hg/edit?usp=sharing">slides</a></p></li>
-</ul>
-</li>
-<li><p>[February 2021]: How DataFusion is used within the Ballista Project is
described in *Ballista: Distributed Compute with Rust and Apache Arrow: <a
class="reference external"
href="https://www.youtube.com/watch?v=ZZHQaOap9pQ">recording</a></p></li>
-<li><p>[July 2022]: DataFusion and Arrow: Supercharge Your Data Analytical
Tool with a Rusty Query Engine: <a class="reference external"
href="https://www.youtube.com/watch?v=Rii1VTn3seQ">recording</a> and <a
class="reference external"
href="https://docs.google.com/presentation/d/1q1bPibvu64k2b7LPi7Yyb0k3gA1BiUYiUbEklqW1Ckc/view#slide=id.g11054eeab4c_0_1165">slides</a></p></li>
-<li><p>[March 2021]: The DataFusion architecture is described in <em>Query
Engine Design and the Rust-Based DataFusion in Apache Arrow</em>: <a
class="reference external"
href="https://www.youtube.com/watch?v=K6eCAVEk4kU">recording</a> (DataFusion
content starts <a class="reference external"
href="https://www.youtube.com/watch?v=K6eCAVEk4kU&amp;t=875s">~ 15 minutes
in</a>) and <a class="reference external"
href="https://www.slideshare.net/influxdata/influxdb-iox-tech-talks-query-engi
[...]
-<li><p>[February 2021]: How DataFusion is used within the Ballista Project is
described in *Ballista: Distributed Compute with Rust and Apache Arrow: <a
class="reference external"
href="https://www.youtube.com/watch?v=ZZHQaOap9pQ">recording</a></p></li>
-</ul>
+<p>DataFusion’s code structure and organization is described in the
+<a class="reference external"
href="https://docs.rs/datafusion/latest/datafusion/index.html#code-organization">Crate
Documentation</a>, to keep it as close to the source as
+possible.</p>
</section>
diff --git a/contributor-guide/index.html b/contributor-guide/index.html
index 98487fdfb4..e38fd30b32 100644
--- a/contributor-guide/index.html
+++ b/contributor-guide/index.html
@@ -477,9 +477,9 @@
<ol class="arabic simple">
<li><p>Tickets with issue reports or feature requests</p></li>
<li><p>Documentation improvements</p></li>
-<li><p>Code (PR or PR Review)</p></li>
+<li><p>Code, both PR and (especially) PR Review.</p></li>
</ol>
-<p>In addition to submitting new PRs, we have a healthy tradition of community
members helping review each other’s PRs. Doing so is a great way to help the
community as well as get more familiar with Rust and the relevant codebases.</p>
+<p>In addition to submitting new PRs, we have a healthy tradition of community
members reviewing each other’s PRs. Doing so is a great way to help the
community as well as get more familiar with Rust and the relevant codebases.</p>
<p>You can find a curated
<a class="reference external"
href="https://github.com/apache/arrow-datafusion/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22">good-first-issue</a>
list to help you get started.</p>
@@ -491,6 +491,11 @@ list to help you get started.</p>
<p>We welcome pull requests (PRs) from anyone from the community.</p>
<p>DataFusion is a very active fast-moving project and we try to review and
merge PRs quickly to keep the review backlog down and the pace up. After review
and approval, one of the <a class="reference external"
href="https://arrow.apache.org/committers/">many people with commit access</a>
will merge your PR.</p>
<p>Review bandwidth is currently our most limited resource, and we highly
encourage reviews by the broader community. If you are waiting for your PR to
be reviewed, consider helping review other PRs that are waiting. Such reviews
help the reviewer learn the codebase and build expertise, and they help
identify issues in the PR (such as lack of test coverage) that can be
addressed to make future reviews faster and more efficient.</p>
+<p>Things to look for when reviewing a PR:</p>
+<ol class="arabic simple">
+<li><p>Is the feature or fix covered sufficiently with tests (see <code
class="docutils literal notranslate"><span class="pre">Test</span> <span
class="pre">Organization</span></code> below)?</p></li>
+<li><p>Is the code clear, and does it fit the style of the existing codebase?</p></li>
+</ol>
<p>Since we are a worldwide community, we have contributors in many timezones
who review and comment. To ensure anyone who wishes has an opportunity to
review a PR, our committers try to ensure that at least 24 hours passes between
when a “major” PR is approved and when it is merged.</p>
<p>A “major” PR means there is a substantial change in design or a change in
the API. Committers apply their best judgment to determine what constitutes a
substantial change. A “minor” PR might be merged without a 24 hour delay, again
subject to the judgment of the committer. Examples of potential “minor” PRs
are:</p>
<ol class="arabic simple">
@@ -557,13 +562,14 @@ libprotoc<span class="w"> </span><span
class="m">3</span>.12.4
</section>
<section id="test-organization">
<h3>Test Organization<a class="headerlink" href="#test-organization"
title="Permalink to this heading">¶</a></h3>
+<p>Tests are very important to ensure that improvements or fixes are not
accidentally broken during subsequent refactorings.</p>
<p>DataFusion has several levels of tests in its <a class="reference external"
href="https://martinfowler.com/articles/practical-test-pyramid.html">Test
Pyramid</a>
-and tries to follow <a class="reference external"
href="https://doc.rust-lang.org/book/ch11-03-test-organization.html">Testing
Organization</a> in the The Book.</p>
+and tries to follow the standard Rust <a class="reference external"
href="https://doc.rust-lang.org/book/ch11-03-test-organization.html">Testing
Organization</a> in The Book.</p>
<p>This section highlights the most important test modules that exist.</p>
<section id="unit-tests">
<h4>Unit tests<a class="headerlink" href="#unit-tests" title="Permalink to
this heading">¶</a></h4>
-<p>Tests for the code in an individual module are defined in the same source
file with a <code class="docutils literal notranslate"><span
class="pre">test</span></code> module, following Rust convention</p>
+<p>Tests for the code in an individual module are defined in the same source
file with a <code class="docutils literal notranslate"><span
class="pre">test</span></code> module, following Rust convention.</p>
</section>
<section id="rust-integration-tests">
<h4>Rust Integration Tests<a class="headerlink" href="#rust-integration-tests"
title="Permalink to this heading">¶</a></h4>
@@ -687,7 +693,7 @@ can be displayed. For example, the following command
creates a
</section>
<section id="specifications">
<h2>Specifications<a class="headerlink" href="#specifications"
title="Permalink to this heading">¶</a></h2>
-<p>We formalize DataFusion semantics and behaviors through specification
+<p>We formalize some DataFusion semantics and behaviors through specification
documents. These specifications serve as references that help
resolve ambiguities during development or code reviews.</p>
<p>You are also welcome to propose changes to existing specifications or create
diff --git a/searchindex.js b/searchindex.js
index 1868597499..40cbcde105 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/comparison", "user-guide/configs",
"user-guide/dataframe", "user-guide/example-usage", "user-guide/expressions
[...]
\ No newline at end of file
+Search.setIndex({"docnames": ["contributor-guide/architecture",
"contributor-guide/communication", "contributor-guide/index",
"contributor-guide/quarterly_roadmap", "contributor-guide/roadmap",
"contributor-guide/specification/index",
"contributor-guide/specification/invariants",
"contributor-guide/specification/output-field-name-semantic", "index",
"user-guide/cli", "user-guide/comparison", "user-guide/configs",
"user-guide/dataframe", "user-guide/example-usage", "user-guide/expressions
[...]
\ No newline at end of file