This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git

The following commit(s) were added to refs/heads/asf-site by this push:
     new 214114f783 Publish built docs triggered by accd2255f05acc91827016ffdd96b66e774ed2dc
214114f783 is described below

commit 214114f783033e7ac70fafb95d38ff55bd9f4b1d
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed Jun 18 12:41:21 2025 +0000

    Publish built docs triggered by accd2255f05acc91827016ffdd96b66e774ed2dc
---
 _sources/contributor-guide/roadmap.md.txt |  84 ++------------
 contributor-guide/roadmap.html            | 187 ++---------------------------
 searchindex.js                            |   2 +-
 3 files changed, 19 insertions(+), 254 deletions(-)

diff --git a/_sources/contributor-guide/roadmap.md.txt b/_sources/contributor-guide/roadmap.md.txt
index 3d9c1ee371..79add1b86f 100644
--- a/_sources/contributor-guide/roadmap.md.txt
+++ b/_sources/contributor-guide/roadmap.md.txt
@@ -46,81 +46,13 @@ make review efficient and avoid surprises.
 
 # Quarterly Roadmap
 
-A quarterly roadmap will be published to give the DataFusion community
-visibility into the priorities of the projects contributors. This roadmap is not
-binding and we would welcome any/all contributions to help keep this list up to
-date.
+The DataFusion roadmap is driven by the priorities of contributors rather than
+any single organization or coordinating committee. We typically discuss our
+roadmap using GitHub issues, approximately quarterly, and invite you to join the
+discussion.
 
-## 2023 Q4
+For more information:
 
-- Improve data output (`COPY`, `INSERT` and DataFrame) output capability [#6569](https://github.com/apache/datafusion/issues/6569)
-- Implementation of `ARRAY` types and related functions [#6980](https://github.com/apache/datafusion/issues/6980)
-- Write an industrial paper about DataFusion for SIGMOD [#6782](https://github.com/apache/datafusion/issues/6782)
-
-## 2022 Q2
-
-### DataFusion Core
-
-- IO Improvements
-  - Reading, registering, and writing more file formats from both DataFrame API and SQL
-  - Additional options for IO including partitioning and metadata support
-- Work Scheduling
-  - Improve predictability, observability and performance of IO and CPU-bound work
-  - Develop a more explicit story for managing parallelism during plan execution
-- Memory Management
-  - Add more operators for memory limited execution
-- Performance
-  - Incorporate row-format into operators such as aggregate
-  - Add row-format benchmarks
-  - Explore JIT-compiling complex expressions
-  - Explore LLVM for JIT, with inline Rust functions as the primary goal
-  - Improve performance of Sort and Merge using Row Format / JIT expressions
-- Documentation
-  - General improvements to DataFusion website
-  - Publish design documents
-- Streaming
-  - Create `StreamProvider` trait
-
-### Ballista
-
-- Make production ready
-  - Shuffle file cleanup
-  - Fill functional gaps between DataFusion and Ballista
-  - Improve task scheduling and data exchange efficiency
-  - Better error handling
-    - Task failure
-    - Executor lost
-    - Schedule restart
-  - Improve monitoring and logging
-  - Auto scaling support
-- Support for multi-scheduler deployments. Initially for resiliency and fault tolerance but ultimately to support sharding for scalability and more efficient caching.
-- Executor deployment grouping based on resource allocation
-
-### Extensions ([datafusion-contrib](https://github.com/datafusion-contrib))
-
-### [DataFusion-Python](https://github.com/datafusion-contrib/datafusion-python)
-
-- Add missing functionality to DataFrame and SessionContext
-- Improve documentation
-
-### [DataFusion-S3](https://github.com/datafusion-contrib/datafusion-objectstore-s3)
-
-- Create Python bindings to use with datafusion-python
-
-### [DataFusion-Tui](https://github.com/datafusion-contrib/datafusion-tui)
-
-- Create multiple SQL editors
-- Expose more Context and query metadata
-- Support new data sources
-  - BigTable, HDFS, HTTP APIs
-
-### [DataFusion-BigTable](https://github.com/datafusion-contrib/datafusion-bigtable)
-
-- Python binding to use with datafusion-python
-- Timestamp range predicate pushdown
-- Multi-threaded partition aware execution
-- Production ready Rust SDK
-
-### [DataFusion-Streams](https://github.com/datafusion-contrib/datafusion-streams)
-
-- Create experimental implementation of `StreamProvider` trait
+1. [Search for issues labeled `roadmap`](https://github.com/apache/datafusion/issues?q=is%3Aissue%20%20%20roadmap)
+2. [DataFusion Road Map: Q3-Q4 2025](https://github.com/apache/datafusion/issues/15878)
+3. [2024 Q4 / 2025 Q1 Roadmap](https://github.com/apache/datafusion/issues/13274)
diff --git a/contributor-guide/roadmap.html b/contributor-guide/roadmap.html
index 478208636b..bc29e46cbd 100644
--- a/contributor-guide/roadmap.html
+++ b/contributor-guide/roadmap.html
@@ -571,60 +571,6 @@
  <a class="reference internal nav-link" href="#quarterly-roadmap">
  Quarterly Roadmap
  </a>
- <ul class="visible nav section-nav flex-column">
- <li class="toc-h2 nav-item toc-entry">
- <a class="reference internal nav-link" href="#q4">
- 2023 Q4
- </a>
- </li>
- <li class="toc-h2 nav-item toc-entry">
- <a class="reference internal nav-link" href="#q2">
- 2022 Q2
- </a>
- <ul class="nav section-nav flex-column">
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-core">
- DataFusion Core
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#ballista">
- Ballista
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#extensions-datafusion-contrib">
- Extensions (datafusion-contrib)
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-python">
- DataFusion-Python
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-s3">
- DataFusion-S3
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-tui">
- DataFusion-Tui
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-bigtable">
- DataFusion-BigTable
- </a>
- </li>
- <li class="toc-h3 nav-item toc-entry">
- <a class="reference internal nav-link" href="#datafusion-streams">
- DataFusion-Streams
- </a>
- </li>
- </ul>
- </li>
- </ul>
  </li>
  </ul>
 
@@ -698,129 +644,16 @@ make review efficient and avoid surprises.</p>
 </section>
 <section id="quarterly-roadmap">
 <h1>Quarterly Roadmap<a class="headerlink" href="#quarterly-roadmap" title="Link to this heading">¶</a></h1>
-<p>A quarterly roadmap will be published to give the DataFusion community
-visibility into the priorities of the projects contributors. This roadmap is not
-binding and we would welcome any/all contributions to help keep this list up to
-date.</p>
-<section id="q4">
-<h2>2023 Q4<a class="headerlink" href="#q4" title="Link to this heading">¶</a></h2>
-<ul class="simple">
-<li><p>Improve data output (<code class="docutils literal notranslate"><span class="pre">COPY</span></code>, <code class="docutils literal notranslate"><span class="pre">INSERT</span></code> and DataFrame) output capability <a class="reference external" href="https://github.com/apache/datafusion/issues/6569">#6569</a></p></li>
-<li><p>Implementation of <code class="docutils literal notranslate"><span class="pre">ARRAY</span></code> types and related functions <a class="reference external" href="https://github.com/apache/datafusion/issues/6980">#6980</a></p></li>
-<li><p>Write an industrial paper about DataFusion for SIGMOD <a class="reference external" href="https://github.com/apache/datafusion/issues/6782">#6782</a></p></li>
-</ul>
-</section>
-<section id="q2">
-<h2>2022 Q2<a class="headerlink" href="#q2" title="Link to this heading">¶</a></h2>
-<section id="datafusion-core">
-<h3>DataFusion Core<a class="headerlink" href="#datafusion-core" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>IO Improvements</p>
-<ul>
-<li><p>Reading, registering, and writing more file formats from both DataFrame API and SQL</p></li>
-<li><p>Additional options for IO including partitioning and metadata support</p></li>
-</ul>
-</li>
-<li><p>Work Scheduling</p>
-<ul>
-<li><p>Improve predictability, observability and performance of IO and CPU-bound work</p></li>
-<li><p>Develop a more explicit story for managing parallelism during plan execution</p></li>
-</ul>
-</li>
-<li><p>Memory Management</p>
-<ul>
-<li><p>Add more operators for memory limited execution</p></li>
-</ul>
-</li>
-<li><p>Performance</p>
-<ul>
-<li><p>Incorporate row-format into operators such as aggregate</p></li>
-<li><p>Add row-format benchmarks</p></li>
-<li><p>Explore JIT-compiling complex expressions</p></li>
-<li><p>Explore LLVM for JIT, with inline Rust functions as the primary goal</p></li>
-<li><p>Improve performance of Sort and Merge using Row Format / JIT expressions</p></li>
-</ul>
-</li>
-<li><p>Documentation</p>
-<ul>
-<li><p>General improvements to DataFusion website</p></li>
-<li><p>Publish design documents</p></li>
-</ul>
-</li>
-<li><p>Streaming</p>
-<ul>
-<li><p>Create <code class="docutils literal notranslate"><span class="pre">StreamProvider</span></code> trait</p></li>
-</ul>
-</li>
-</ul>
-</section>
-<section id="ballista">
-<h3>Ballista<a class="headerlink" href="#ballista" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Make production ready</p>
-<ul>
-<li><p>Shuffle file cleanup</p></li>
-<li><p>Fill functional gaps between DataFusion and Ballista</p></li>
-<li><p>Improve task scheduling and data exchange efficiency</p></li>
-<li><p>Better error handling</p>
-<ul>
-<li><p>Task failure</p></li>
-<li><p>Executor lost</p></li>
-<li><p>Schedule restart</p></li>
-</ul>
-</li>
-<li><p>Improve monitoring and logging</p></li>
-<li><p>Auto scaling support</p></li>
-</ul>
-</li>
-<li><p>Support for multi-scheduler deployments. Initially for resiliency and fault tolerance but ultimately to support sharding for scalability and more efficient caching.</p></li>
-<li><p>Executor deployment grouping based on resource allocation</p></li>
-</ul>
-</section>
-<section id="extensions-datafusion-contrib">
-<h3>Extensions (<a class="reference external" href="https://github.com/datafusion-contrib">datafusion-contrib</a>)<a class="headerlink" href="#extensions-datafusion-contrib" title="Link to this heading">¶</a></h3>
-</section>
-<section id="datafusion-python">
-<h3><a class="reference external" href="https://github.com/datafusion-contrib/datafusion-python">DataFusion-Python</a><a class="headerlink" href="#datafusion-python" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Add missing functionality to DataFrame and SessionContext</p></li>
-<li><p>Improve documentation</p></li>
-</ul>
-</section>
-<section id="datafusion-s3">
-<h3><a class="reference external" href="https://github.com/datafusion-contrib/datafusion-objectstore-s3">DataFusion-S3</a><a class="headerlink" href="#datafusion-s3" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Create Python bindings to use with datafusion-python</p></li>
-</ul>
-</section>
-<section id="datafusion-tui">
-<h3><a class="reference external" href="https://github.com/datafusion-contrib/datafusion-tui">DataFusion-Tui</a><a class="headerlink" href="#datafusion-tui" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Create multiple SQL editors</p></li>
-<li><p>Expose more Context and query metadata</p></li>
-<li><p>Support new data sources</p>
-<ul>
-<li><p>BigTable, HDFS, HTTP APIs</p></li>
-</ul>
-</li>
-</ul>
-</section>
-<section id="datafusion-bigtable">
-<h3><a class="reference external" href="https://github.com/datafusion-contrib/datafusion-bigtable">DataFusion-BigTable</a><a class="headerlink" href="#datafusion-bigtable" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Python binding to use with datafusion-python</p></li>
-<li><p>Timestamp range predicate pushdown</p></li>
-<li><p>Multi-threaded partition aware execution</p></li>
-<li><p>Production ready Rust SDK</p></li>
-</ul>
-</section>
-<section id="datafusion-streams">
-<h3><a class="reference external" href="https://github.com/datafusion-contrib/datafusion-streams">DataFusion-Streams</a><a class="headerlink" href="#datafusion-streams" title="Link to this heading">¶</a></h3>
-<ul class="simple">
-<li><p>Create experimental implementation of <code class="docutils literal notranslate"><span class="pre">StreamProvider</span></code> trait</p></li>
-</ul>
-</section>
-</section>
+<p>The DataFusion roadmap is driven by the priorities of contributors rather than
+any single organization or coordinating committee. We typically discuss our
+roadmap using GitHub issues, approximately quarterly, and invite you to join the
+discussion.</p>
+<p>For more information:</p>
+<ol class="arabic simple">
+<li><p><a class="reference external" href="https://github.com/apache/datafusion/issues?q=is%3Aissue%20%20%20roadmap">Search for issues labeled <code class="docutils literal notranslate"><span class="pre">roadmap</span></code></a></p></li>
+<li><p><a class="reference external" href="https://github.com/apache/datafusion/issues/15878">DataFusion Road Map: Q3-Q4 2025</a></p></li>
+<li><p><a class="reference external" href="https://github.com/apache/datafusion/issues/13274">2024 Q4 / 2025 Q1 Roadmap</a></p></li>
+</ol>
 </section>
 
 
diff --git a/searchindex.js b/searchindex.js
index 4f0c9b4fea..089f216068 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[57,"op-neq"]],"!~":[[57,"op-re-not-match"]],"!~*":[[57,"op-re-not-match-i"]],"!~~":[[57,"id19"]],"!~~*":[[57,"id20"]],"#":[[57,"op-bit-xor"]],"%":[[57,"op-modulo"]],"&":[[57,"op-bit-and"]],"(relation, name) tuples in logical fields and logical columns are unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[57,"op-multiply"]],"+":[[57,"op-plus"]],"-":[[57,"op-minus"]],"/":[[57,"op-divide"]],"2022 Q2":[[10,"q2"]] [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[57,"op-neq"]],"!~":[[57,"op-re-not-match"]],"!~*":[[57,"op-re-not-match-i"]],"!~~":[[57,"id19"]],"!~~*":[[57,"id20"]],"#":[[57,"op-bit-xor"]],"%":[[57,"op-modulo"]],"&":[[57,"op-bit-and"]],"(relation, name) tuples in logical fields and logical columns are unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[57,"op-multiply"]],"+":[[57,"op-plus"]],"-":[[57,"op-minus"]],"/":[[57,"op-divide"]],"<":[[57,"op-lt"]],"< [...]
\ No newline at end of file

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@datafusion.apache.org
For additional commands, e-mail: commits-h...@datafusion.apache.org