This is an automated email from the ASF dual-hosted git repository. github-bot pushed a commit to branch asf-staging in repository https://gitbox.apache.org/repos/asf/datafusion-site.git
The following commit(s) were added to refs/heads/asf-staging by this push:
     new b95fcfa  Commit build products
b95fcfa is described below

commit b95fcfa2ca4b3957a3e636c23b90a846d5c3792a
Author: Build Pelican (action) <priv...@infra.apache.org>
AuthorDate: Sat Aug 9 10:03:30 2025 +0000

    Commit build products
---
 blog/2025/08/15/external-parquet-indexes/index.html | 4 ++--
 blog/feeds/all-en.atom.xml | 4 ++--
 blog/feeds/andrew-lamb-influxdata.atom.xml | 4 ++--
 blog/feeds/blog.atom.xml | 4 ++--
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/blog/2025/08/15/external-parquet-indexes/index.html b/blog/2025/08/15/external-parquet-indexes/index.html
index df50aa2..5ad76ae 100644
--- a/blog/2025/08/15/external-parquet-indexes/index.html
+++ b/blog/2025/08/15/external-parquet-indexes/index.html
@@ -259,7 +259,7 @@ data:</p>
 <p><strong>Figure 6</strong>: Step 1: File Pruning. Given a query predicate, systems use external indexes to quickly rule out files that cannot match the query. In this case, by consulting the index all but two files can be ruled out.</p>
-<p>There are many different systems that use external indexs to find files such as
+<p>There are many different systems that use external indexes to find files such as
 <a href="https://cwiki.apache.org/confluence/display/Hive/Design#Design-Metastore">Hive Metadata Store</a>, <a href="https://iceberg.apache.org/">Iceberg</a>, <a href="https://delta.io/">Delta Lake</a>,
@@ -579,7 +579,7 @@ it out, we would love for you to join us.</p>
 database literature</a> after the first research paper to describe the technique.</p> <p><a id="footnote3"></a><code>3</code>: Benchmaxxing (verb): to add specific optimizations that only impact benchmark results and are not widely applicable to real world use cases.</p>
-<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is which is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
+<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
 store information about the data in the files. For example, a directory structure like <code>year=2025/month=08/day=15/</code> can be used to store data for a specific day and the system can quickly rule out directories that do not match the query predicate.</p> <p><a id="footnote5"></a><code>5</code>: I am also convinced that we can speed up the process of parsing Parquet footer
diff --git a/blog/feeds/all-en.atom.xml b/blog/feeds/all-en.atom.xml
index 4128311..b6a8345 100644
--- a/blog/feeds/all-en.atom.xml
+++ b/blog/feeds/all-en.atom.xml
@@ -238,7 +238,7 @@ data:</p>
 <p><strong>Figure 6</strong>: Step 1: File Pruning. Given a query predicate, systems use external indexes to quickly rule out files that cannot match the query. In this case, by consulting the index all but two files can be ruled out.</p>
-<p>There are many different systems that use external indexs to find files such as
+<p>There are many different systems that use external indexes to find files such as
 <a href="https://cwiki.apache.org/confluence/display/Hive/Design#Design-Metastore">Hive Metadata Store</a>, <a href="https://iceberg.apache.org/">Iceberg</a>, <a href="https://delta.io/">Delta Lake</a>,
@@ -558,7 +558,7 @@ it out, we would love for you to join us.</p>
 database literature</a> after the first research paper to describe the technique.</p> <p><a id="footnote3"></a><code>3</code>: Benchmaxxing (verb): to add specific optimizations that only impact benchmark results and are not widely applicable to real world use cases.</p>
-<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is which is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
+<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
 store information about the data in the files. For example, a directory structure like <code>year=2025/month=08/day=15/</code> can be used to store data for a specific day and the system can quickly rule out directories that do not match the query predicate.</p> <p><a id="footnote5"></a><code>5</code>: I am also convinced that we can speed up the process of parsing Parquet footer
diff --git a/blog/feeds/andrew-lamb-influxdata.atom.xml b/blog/feeds/andrew-lamb-influxdata.atom.xml
index 5428eab..a3c5589 100644
--- a/blog/feeds/andrew-lamb-influxdata.atom.xml
+++ b/blog/feeds/andrew-lamb-influxdata.atom.xml
@@ -238,7 +238,7 @@ data:</p>
 <p><strong>Figure 6</strong>: Step 1: File Pruning. Given a query predicate, systems use external indexes to quickly rule out files that cannot match the query. In this case, by consulting the index all but two files can be ruled out.</p>
-<p>There are many different systems that use external indexs to find files such as
+<p>There are many different systems that use external indexes to find files such as
 <a href="https://cwiki.apache.org/confluence/display/Hive/Design#Design-Metastore">Hive Metadata Store</a>, <a href="https://iceberg.apache.org/">Iceberg</a>, <a href="https://delta.io/">Delta Lake</a>,
@@ -558,7 +558,7 @@ it out, we would love for you to join us.</p>
 database literature</a> after the first research paper to describe the technique.</p> <p><a id="footnote3"></a><code>3</code>: Benchmaxxing (verb): to add specific optimizations that only impact benchmark results and are not widely applicable to real world use cases.</p>
-<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is which is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
+<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
 store information about the data in the files. For example, a directory structure like <code>year=2025/month=08/day=15/</code> can be used to store data for a specific day and the system can quickly rule out directories that do not match the query predicate.</p> <p><a id="footnote5"></a><code>5</code>: I am also convinced that we can speed up the process of parsing Parquet footer
diff --git a/blog/feeds/blog.atom.xml b/blog/feeds/blog.atom.xml
index 58f480c..05f3435 100644
--- a/blog/feeds/blog.atom.xml
+++ b/blog/feeds/blog.atom.xml
@@ -238,7 +238,7 @@ data:</p>
 <p><strong>Figure 6</strong>: Step 1: File Pruning. Given a query predicate, systems use external indexes to quickly rule out files that cannot match the query. In this case, by consulting the index all but two files can be ruled out.</p>
-<p>There are many different systems that use external indexs to find files such as
+<p>There are many different systems that use external indexes to find files such as
 <a href="https://cwiki.apache.org/confluence/display/Hive/Design#Design-Metastore">Hive Metadata Store</a>, <a href="https://iceberg.apache.org/">Iceberg</a>, <a href="https://delta.io/">Delta Lake</a>,
@@ -558,7 +558,7 @@ it out, we would love for you to join us.</p>
 database literature</a> after the first research paper to describe the technique.</p> <p><a id="footnote3"></a><code>3</code>: Benchmaxxing (verb): to add specific optimizations that only impact benchmark results and are not widely applicable to real world use cases.</p>
-<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is which is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
+<p><a id="footnote4"></a><code>4</code>: Hive Style Partitioning is a simple and widely used form of indexing based on directory paths, where the directory structure is used to
 store information about the data in the files.
For example, a directory structure like <code>year=2025/month=08/day=15/</code> can be used to store data for a specific day and the system can quickly rule out directories that do not match the query predicate.</p> <p><a id="footnote5"></a><code>5</code>: I am also convinced that we can speed up the process of parsing Parquet footer