This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/datafusion.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 8476e75d23 Publish built docs triggered by a91e0421ebadf3a155508e28e272f5fb8356bca1
8476e75d23 is described below

commit 8476e75d23c67d0f3224a4272e579609158f04aa
Author: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
AuthorDate: Wed Jun 11 08:49:35 2025 +0000

    Publish built docs triggered by a91e0421ebadf3a155508e28e272f5fb8356bca1
---
 _sources/user-guide/cli/datasources.md.txt | 87 +++++++++++++++++++++---------
 searchindex.js                             |  2 +-
 user-guide/cli/datasources.html            | 75 ++++++++++++++++++++------
 3 files changed, 123 insertions(+), 41 deletions(-)

diff --git a/_sources/user-guide/cli/datasources.md.txt b/_sources/user-guide/cli/datasources.md.txt
index 2e14f1f54c..afc4f6c0c5 100644
--- a/_sources/user-guide/cli/datasources.md.txt
+++ b/_sources/user-guide/cli/datasources.md.txt
@@ -82,22 +82,29 @@ select count(*) from 'https://datasets.clickhouse.com/hits_compatible/athena_par
 To read from AWS S3 or GCS, use `s3` or `gs` as a protocol prefix. For
 example, to read a file in an S3 bucket named `my-data-bucket` use the URL
 `s3://my-data-bucket` and set the relevant access credentials as environmental
-variables (e.g. for AWS S3 you need to at least `AWS_ACCESS_KEY_ID` and
+variables (e.g. for AWS S3 you can use `AWS_ACCESS_KEY_ID` and
 `AWS_SECRET_ACCESS_KEY`).
 
 ```sql
-select count(*) from 's3://my-data-bucket/athena_partitioned/hits.parquet'
+> select count(*) from 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/';
++------------+
+| count(*)   |
++------------+
+| 1310903963 |
++------------+
 ```
 
-See the [`CREATE EXTERNAL TABLE`](#create-external-table) section for
+See the [`CREATE EXTERNAL TABLE`](#create-external-table) section below for
 additional configuration options.
 
 # `CREATE EXTERNAL TABLE`
 
 It is also possible to create a table backed by files or remote locations via
-`CREATE EXTERNAL TABLE` as shown below. Note that DataFusion does not support wildcards (e.g. `*`) in file paths; instead, specify the directory path directly to read all compatible files in that directory.
+`CREATE EXTERNAL TABLE` as shown below. Note that DataFusion does not support
+wildcards (e.g. `*`) in file paths; instead, specify the directory path directly
+to read all compatible files in that directory.
 
-For example, to create a table `hits` backed by a local parquet file, use:
+For example, to create a table `hits` backed by a local parquet file named `hits.parquet`:
 
 ```sql
 CREATE EXTERNAL TABLE hits
@@ -105,7 +112,7 @@ STORED AS PARQUET
 LOCATION 'hits.parquet';
 ```
 
-To create a table `hits` backed by a remote parquet file via HTTP(S), use
+To create a table `hits` backed by a remote parquet file via HTTP(S):
 
 ```sql
 CREATE EXTERNAL TABLE hits
@@ -127,7 +134,11 @@ select count(*) from hits;
 
 **Why Wildcards Are Not Supported**
 
-Although wildcards (e.g., _.parquet or \*\*/_.parquet) may work for local filesystems in some cases, they are not officially supported by DataFusion. This is because wildcards are not universally applicable across all storage backends (e.g., S3, GCS). Instead, DataFusion expects the user to specify the directory path, and it will automatically read all compatible files within that directory.
+Although wildcards (e.g., `*.parquet` or `**/*.parquet`) may work for local
+filesystems in some cases, they are not supported by DataFusion CLI. This
+is because wildcards are not universally applicable across all storage backends
+(e.g., S3, GCS). Instead, DataFusion expects the user to specify the directory
+path, and it will automatically read all compatible files within that directory.
 
 For example, the following usage is not supported:
 
@@ -148,7 +159,7 @@ CREATE EXTERNAL TABLE test (
     day DATE
 )
 STORED AS PARQUET
-LOCATION 'gs://bucket/my_table';
+LOCATION 'gs://bucket/my_table/';
 ```
 
 # Formats
@@ -168,6 +179,11 @@ LOCATION '/mnt/nyctaxi/tripdata.parquet';
 Register a single folder parquet datasource. Note: All files inside must be valid
 parquet files and have compatible schemas
 
+:::{note}
+Paths must end in slash `/`
+: The path must end in `/`, otherwise DataFusion will treat the path as a file and not a directory.
+:::
+
 ```sql
 CREATE EXTERNAL TABLE taxi
 STORED AS PARQUET
@@ -178,7 +194,7 @@ LOCATION '/mnt/nyctaxi/';
 
 DataFusion will infer the CSV schema automatically or you can provide it explicitly.
 
-Register a single file csv datasource with a header row.
+Register a single file csv datasource with a header row:
 
 ```sql
 CREATE EXTERNAL TABLE test
@@ -187,7 +203,7 @@ LOCATION '/path/to/aggregate_test_100.csv'
 OPTIONS ('has_header' 'true');
 ```
 
-Register a single file csv datasource with explicitly defined schema.
+Register a single file csv datasource with explicitly defined schema:
 
 ```sql
 CREATE EXTERNAL TABLE test (
@@ -213,7 +229,7 @@ LOCATION '/path/to/aggregate_test_100.csv';
 
 ## HTTP(s)
 
-To read from a remote parquet file via HTTP(S) you can use the following:
+To read from a remote parquet file via HTTP(S):
 
 ```sql
 CREATE EXTERNAL TABLE hits
@@ -223,9 +239,12 @@ LOCATION 'https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hit
 
 ## S3
 
-[AWS S3](https://aws.amazon.com/s3/) data sources must have connection credentials configured.
+DataFusion CLI supports configuring [AWS S3](https://aws.amazon.com/s3/) via the
+`CREATE EXTERNAL TABLE` statement and standard AWS configuration methods (via the
+[`aws-config`] AWS SDK crate).
 
-To create an external table from a file in an S3 bucket:
+To create an external table from a file in an S3 bucket with explicit
+credentials:
 
 ```sql
 CREATE EXTERNAL TABLE test
@@ -238,7 +257,7 @@ OPTIONS(
 LOCATION 's3://bucket/path/file.parquet';
 ```
 
-It is also possible to specify the access information using environment variables:
+To create an external table using environment variables:
 
 ```bash
 $ export AWS_DEFAULT_REGION=us-east-2
@@ -247,7 +266,7 @@ $ export AWS_ACCESS_KEY_ID=******
 
 $ datafusion-cli
 `datafusion-cli v21.0.0
-> create external table test stored as parquet location 's3://bucket/path/file.parquet';
+> CREATE EXTERNAL TABLE test STORED AS PARQUET LOCATION 's3://bucket/path/file.parquet';
 0 rows in set. Query took 0.374 seconds.
 > select * from test;
 +----------+----------+
@@ -258,19 +277,39 @@ $ datafusion-cli
 1 row in set. Query took 0.171 seconds.
 ```
 
+To read from a public S3 bucket without signatures, use the
+`aws.SKIP_SIGNATURE` option:
+
+```sql
+CREATE EXTERNAL TABLE nyc_taxi_rides
+STORED AS PARQUET LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/'
+OPTIONS(aws.SKIP_SIGNATURE true);
+```
+
+Credentials are taken in this order of precedence:
+
+1. Explicitly specified in the `OPTIONS` clause of the `CREATE EXTERNAL TABLE` statement.
+2. Determined by the [`aws-config`] crate (standard environment variables such as `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`, as well as other AWS-specific features).
+
+If no credentials are specified, DataFusion CLI will use unsigned requests to S3,
+which allows reading from public buckets.
+
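+As a sketch of that unsigned fallback (assuming no AWS credentials are set in
+the environment), the same public bucket can be registered with no `OPTIONS` at
+all:
+
+```sql
+-- No credentials are configured anywhere, so DataFusion CLI
+-- falls back to unsigned (anonymous) requests to S3.
+CREATE EXTERNAL TABLE nyc_taxi_rides
+STORED AS PARQUET
+LOCATION 's3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/';
+```
+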
 Supported configuration options are:
 
-| Environment Variable                     | Configuration Option    | Description                                          |
-| ---------------------------------------- | ----------------------- | ---------------------------------------------------- |
-| `AWS_ACCESS_KEY_ID`                      | `aws.access_key_id`     |                                                      |
-| `AWS_SECRET_ACCESS_KEY`                  | `aws.secret_access_key` |                                                      |
-| `AWS_DEFAULT_REGION`                     | `aws.region`            |                                                      |
-| `AWS_ENDPOINT`                           | `aws.endpoint`          |                                                      |
-| `AWS_SESSION_TOKEN`                      | `aws.token`             |                                                      |
-| `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` |                         | See [IAM Roles]                                      |
-| `AWS_ALLOW_HTTP`                         |                         | set to "true" to permit HTTP connections without TLS |
+| Environment Variable                     | Configuration Option    | Description                                    |
+| ---------------------------------------- | ----------------------- | ---------------------------------------------- |
+| `AWS_ACCESS_KEY_ID`                      | `aws.access_key_id`     |                                                |
+| `AWS_SECRET_ACCESS_KEY`                  | `aws.secret_access_key` |                                                |
+| `AWS_DEFAULT_REGION`                     | `aws.region`            |                                                |
+| `AWS_ENDPOINT`                           | `aws.endpoint`          |                                                |
+| `AWS_SESSION_TOKEN`                      | `aws.token`             |                                                |
+| `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` |                         | See [IAM Roles]                                |
+| `AWS_ALLOW_HTTP`                         |                         | If "true", permit HTTP connections without TLS |
+| `AWS_SKIP_SIGNATURE`                     | `aws.skip_signature`    | If "true", does not sign requests              |
+|                                          | `aws.nosign`            | Alias for `skip_signature`                     |
 
 [iam roles]: https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-iam-roles.html
+[`aws-config`]: https://docs.rs/aws-config/latest/aws_config/
 
 ## OSS
 
diff --git a/searchindex.js b/searchindex.js
index 0936b2408d..0ef662758d 100644
--- a/searchindex.js
+++ b/searchindex.js
@@ -1 +1 @@
-Search.setIndex({"alltitles":{"!=":[[54,"op-neq"]],"!~":[[54,"op-re-not-match"]],"!~*":[[54,"op-re-not-match-i"]],"!~~":[[54,"id19"]],"!~~*":[[54,"id20"]],"#":[[54,"op-bit-xor"]],"%":[[54,"op-modulo"]],"&":[[54,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[54,"op-multiply"]],"+":[[54,"op-plus"]],"-":[[54,"op-minus"]],"/":[[54,"op-divide"]],"2022
 Q2":[[10,"q2"]] [...]
\ No newline at end of file
+Search.setIndex({"alltitles":{"!=":[[54,"op-neq"]],"!~":[[54,"op-re-not-match"]],"!~*":[[54,"op-re-not-match-i"]],"!~~":[[54,"id19"]],"!~~*":[[54,"id20"]],"#":[[54,"op-bit-xor"]],"%":[[54,"op-modulo"]],"&":[[54,"op-bit-and"]],"(relation,
 name) tuples in logical fields and logical columns are 
unique":[[12,"relation-name-tuples-in-logical-fields-and-logical-columns-are-unique"]],"*":[[54,"op-multiply"]],"+":[[54,"op-plus"]],"-":[[54,"op-minus"]],"/":[[54,"op-divide"]],"2022
 Q2":[[10,"q2"]] [...]
\ No newline at end of file
diff --git a/user-guide/cli/datasources.html b/user-guide/cli/datasources.html
index eb18590b8a..a9ea511ff3 100644
--- a/user-guide/cli/datasources.html
+++ b/user-guide/cli/datasources.html
@@ -703,25 +703,32 @@ For example, to read from a remote parquet file via HTTP(S) you can use the foll
 <p>To read from an AWS S3 or GCS, use <code class="docutils literal 
notranslate"><span class="pre">s3</span></code> or <code class="docutils 
literal notranslate"><span class="pre">gs</span></code> as a protocol prefix. 
For
 example, to read a file in an S3 bucket named <code class="docutils literal 
notranslate"><span class="pre">my-data-bucket</span></code> use the URL
 <code class="docutils literal notranslate"><span 
class="pre">s3://my-data-bucket</span></code>and set the relevant access 
credentials as environmental
-variables (e.g. for AWS S3 you need to at least <code class="docutils literal 
notranslate"><span class="pre">AWS_ACCESS_KEY_ID</span></code> and
+variables (e.g. for AWS S3 you can use <code class="docutils literal 
notranslate"><span class="pre">AWS_ACCESS_KEY_ID</span></code> and
 <code class="docutils literal notranslate"><span 
class="pre">AWS_SECRET_ACCESS_KEY</span></code>).</p>
-<div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">select</span><span 
class="w"> </span><span class="k">count</span><span class="p">(</span><span 
class="o">*</span><span class="p">)</span><span class="w"> </span><span 
class="k">from</span><span class="w"> </span><span 
class="s1">&#39;s3://my-data-bucket/athena_partitioned/hits.parquet&#39;</span>
+<div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="o">&gt;</span><span class="w"> 
</span><span class="k">select</span><span class="w"> </span><span 
class="k">count</span><span class="p">(</span><span class="o">*</span><span 
class="p">)</span><span class="w"> </span><span class="k">from</span><span 
class="w"> </span><span 
class="s1">&#39;s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/&#39;</span><span
 class="p">;</span>
+<span class="o">+</span><span class="c1">------------+</span>
+<span class="o">|</span><span class="w"> </span><span 
class="k">count</span><span class="p">(</span><span class="o">*</span><span 
class="p">)</span><span class="w">   </span><span class="o">|</span>
+<span class="o">+</span><span class="c1">------------+</span>
+<span class="o">|</span><span class="w"> </span><span 
class="mi">1310903963</span><span class="w"> </span><span class="o">|</span>
+<span class="o">+</span><span class="c1">------------+</span>
 </pre></div>
 </div>
-<p>See the <a class="reference internal" href="#create-external-table"><code 
class="docutils literal notranslate"><span class="pre">CREATE</span> <span 
class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code></a> section 
for
+<p>See the <a class="reference internal" href="#create-external-table"><code 
class="docutils literal notranslate"><span class="pre">CREATE</span> <span 
class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code></a> section 
below for
 additional configuration options.</p>
 </section>
 <section id="create-external-table">
 <h1><code class="docutils literal notranslate"><span class="pre">CREATE</span> 
<span class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code><a 
class="headerlink" href="#create-external-table" title="Link to this 
heading">¶</a></h1>
 <p>It is also possible to create a table backed by files or remote locations 
via
-<code class="docutils literal notranslate"><span class="pre">CREATE</span> 
<span class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code> as 
shown below. Note that DataFusion does not support wildcards (e.g. <code 
class="docutils literal notranslate"><span class="pre">*</span></code>) in file 
paths; instead, specify the directory path directly to read all compatible 
files in that directory.</p>
-<p>For example, to create a table <code class="docutils literal 
notranslate"><span class="pre">hits</span></code> backed by a local parquet 
file, use:</p>
+<code class="docutils literal notranslate"><span class="pre">CREATE</span> 
<span class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code> as 
shown below. Note that DataFusion does not support
+wildcards (e.g. <code class="docutils literal notranslate"><span 
class="pre">*</span></code>) in file paths; instead, specify the directory path 
directly
+to read all compatible files in that directory.</p>
+<p>For example, to create a table <code class="docutils literal 
notranslate"><span class="pre">hits</span></code> backed by a local parquet 
file named <code class="docutils literal notranslate"><span 
class="pre">hits.parquet</span></code>:</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">hits</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;hits.parquet&#39;</span><span class="p">;</span>
 </pre></div>
 </div>
-<p>To create a table <code class="docutils literal notranslate"><span 
class="pre">hits</span></code> backed by a remote parquet file via HTTP(S), 
use</p>
+<p>To create a table <code class="docutils literal notranslate"><span 
class="pre">hits</span></code> backed by a remote parquet file via HTTP(S):</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">hits</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet&#39;</span><span
 class="p">;</span>
@@ -738,7 +745,11 @@ additional configuration options.</p>
 </pre></div>
 </div>
 <p><strong>Why Wildcards Are Not Supported</strong></p>
-<p>Although wildcards (e.g., <em>.parquet or **/</em>.parquet) may work for 
local filesystems in some cases, they are not officially supported by 
DataFusion. This is because wildcards are not universally applicable across all 
storage backends (e.g., S3, GCS). Instead, DataFusion expects the user to 
specify the directory path, and it will automatically read all compatible files 
within that directory.</p>
+<p>Although wildcards (e.g., <code class="docutils literal notranslate"><span class="pre">*.parquet</span></code> or <code class="docutils literal notranslate"><span class="pre">**/*.parquet</span></code>) may work for local
+filesystems in some cases, they are not supported by DataFusion CLI. This
+is because wildcards are not universally applicable across all storage backends
+(e.g., S3, GCS). Instead, DataFusion expects the user to specify the directory
+path, and it will automatically read all compatible files within that directory.</p>
 <p>For example, the following usage is not supported:</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">test</span><span 
class="w"> </span><span class="p">(</span>
 <span class="w">    </span><span class="n">message</span><span class="w"> 
</span><span class="nb">TEXT</span><span class="p">,</span>
@@ -754,7 +765,7 @@ additional configuration options.</p>
 <span class="w">    </span><span class="k">day</span><span class="w"> 
</span><span class="nb">DATE</span>
 <span class="p">)</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
-<span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;gs://bucket/my_table&#39;</span><span class="p">;</span>
+<span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;gs://bucket/my_table/&#39;</span><span class="p">;</span>
 </pre></div>
 </div>
 </section>
@@ -771,6 +782,13 @@ additional configuration options.</p>
 </div>
 <p>Register a single folder parquet datasource. Note: All files inside must be 
valid
 parquet files and have compatible schemas</p>
+<div class="admonition note">
+<p class="admonition-title">Note</p>
+<dl class="simple myst">
+<dt>Paths must end in slash <code class="docutils literal notranslate"><span class="pre">/</span></code></dt><dd><p>The path must end in <code class="docutils literal notranslate"><span class="pre">/</span></code>, otherwise DataFusion will treat the path as a file and not a directory.</p>
+</dd>
+</dl>
+</div>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">taxi</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;/mnt/nyctaxi/&#39;</span><span class="p">;</span>
@@ -780,14 +798,14 @@ parquet files and have compatible schemas</p>
 <section id="csv">
 <h2>CSV<a class="headerlink" href="#csv" title="Link to this 
heading">¶</a></h2>
 <p>DataFusion will infer the CSV schema automatically or you can provide it 
explicitly.</p>
-<p>Register a single file csv datasource with a header row.</p>
+<p>Register a single file csv datasource with a header row:</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">test</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">CSV</span>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;/path/to/aggregate_test_100.csv&#39;</span>
 <span class="k">OPTIONS</span><span class="w"> </span><span 
class="p">(</span><span class="s1">&#39;has_header&#39;</span><span class="w"> 
</span><span class="s1">&#39;true&#39;</span><span class="p">);</span>
 </pre></div>
 </div>
-<p>Register a single file csv datasource with explicitly defined schema.</p>
+<p>Register a single file csv datasource with explicitly defined schema:</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">test</span><span 
class="w"> </span><span class="p">(</span>
 <span class="w">    </span><span class="n">c1</span><span class="w">  
</span><span class="nb">VARCHAR</span><span class="w"> </span><span 
class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span 
class="p">,</span>
 <span class="w">    </span><span class="n">c2</span><span class="w">  
</span><span class="nb">INT</span><span class="w"> </span><span 
class="k">NOT</span><span class="w"> </span><span class="k">NULL</span><span 
class="p">,</span>
@@ -813,7 +831,7 @@ parquet files and have compatible schemas</p>
 <h1>Locations<a class="headerlink" href="#locations" title="Link to this 
heading">¶</a></h1>
 <section id="http-s">
 <h2>HTTP(s)<a class="headerlink" href="#http-s" title="Link to this 
heading">¶</a></h2>
-<p>To read from a remote parquet file via HTTP(S) you can use the 
following:</p>
+<p>To read from a remote parquet file via HTTP(S):</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">hits</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_1.parquet&#39;</span><span
 class="p">;</span>
@@ -822,8 +840,11 @@ parquet files and have compatible schemas</p>
 </section>
 <section id="s3">
 <h2>S3<a class="headerlink" href="#s3" title="Link to this heading">¶</a></h2>
-<p><a class="reference external" href="https://aws.amazon.com/s3/">AWS S3</a> data sources must have connection credentials configured.</p>
-<p>To create an external table from a file in an S3 bucket:</p>
+<p>DataFusion CLI supports configuring <a class="reference external" href="https://aws.amazon.com/s3/">AWS S3</a> via the
+<code class="docutils literal notranslate"><span class="pre">CREATE</span> <span class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code> statement and standard AWS configuration methods (via the
+<a class="reference external" href="https://docs.rs/aws-config/latest/aws_config/"><code class="docutils literal notranslate"><span class="pre">aws-config</span></code></a> AWS SDK crate).</p>
crate).</p>
+<p>To create an external table from a file in an S3 bucket with explicit
+credentials:</p>
 <div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span class="n">test</span>
 <span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span>
 <span class="k">OPTIONS</span><span class="p">(</span>
@@ -834,14 +855,14 @@ parquet files and have compatible schemas</p>
 <span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;s3://bucket/path/file.parquet&#39;</span><span 
class="p">;</span>
 </pre></div>
 </div>
-<p>It is also possible to specify the access information using environment 
variables:</p>
+<p>To create an external table using environment variables:</p>
 <div class="highlight-bash notranslate"><div 
class="highlight"><pre><span></span>$<span class="w"> </span><span 
class="nb">export</span><span class="w"> </span><span 
class="nv">AWS_DEFAULT_REGION</span><span class="o">=</span>us-east-2
 $<span class="w"> </span><span class="nb">export</span><span class="w"> 
</span><span class="nv">AWS_SECRET_ACCESS_KEY</span><span 
class="o">=</span>******
 $<span class="w"> </span><span class="nb">export</span><span class="w"> 
</span><span class="nv">AWS_ACCESS_KEY_ID</span><span class="o">=</span>******
 
 $<span class="w"> </span>datafusion-cli
 <span class="sb">`</span>datafusion-cli<span class="w"> </span>v21.0.0
-&gt;<span class="w"> </span>create<span class="w"> </span>external<span 
class="w"> </span>table<span class="w"> </span><span 
class="nb">test</span><span class="w"> </span>stored<span class="w"> 
</span>as<span class="w"> </span>parquet<span class="w"> </span>location<span 
class="w"> </span><span 
class="s1">&#39;s3://bucket/path/file.parquet&#39;</span><span 
class="p">;</span>
+&gt;<span class="w"> </span>CREATE<span class="w"> </span>EXTERNAL<span class="w"> </span>TABLE<span class="w"> </span><span class="nb">test</span><span class="w"> </span>STORED<span class="w"> </span>AS<span class="w"> </span>PARQUET<span class="w"> </span>LOCATION<span class="w"> </span><span class="s1">&#39;s3://bucket/path/file.parquet&#39;</span><span class="p">;</span>
 <span class="m">0</span><span class="w"> </span>rows<span class="w"> 
</span><span class="k">in</span><span class="w"> </span>set.<span class="w"> 
</span>Query<span class="w"> </span>took<span class="w"> </span><span 
class="m">0</span>.374<span class="w"> </span>seconds.
 &gt;<span class="w"> </span><span class="k">select</span><span class="w"> 
</span>*<span class="w"> </span>from<span class="w"> </span>test<span 
class="p">;</span>
 +----------+----------+
@@ -852,6 +873,20 @@ $<span class="w"> </span>datafusion-cli
 <span class="m">1</span><span class="w"> </span>row<span class="w"> 
</span><span class="k">in</span><span class="w"> </span>set.<span class="w"> 
</span>Query<span class="w"> </span>took<span class="w"> </span><span 
class="m">0</span>.171<span class="w"> </span>seconds.
 </pre></div>
 </div>
+<p>To read from a public S3 bucket without signatures, use the
+<code class="docutils literal notranslate"><span 
class="pre">aws.SKIP_SIGNATURE</span></code> option:</p>
+<div class="highlight-sql notranslate"><div 
class="highlight"><pre><span></span><span class="k">CREATE</span><span 
class="w"> </span><span class="k">EXTERNAL</span><span class="w"> </span><span 
class="k">TABLE</span><span class="w"> </span><span 
class="n">nyc_taxi_rides</span>
+<span class="n">STORED</span><span class="w"> </span><span 
class="k">AS</span><span class="w"> </span><span class="n">PARQUET</span><span 
class="w"> </span><span class="k">LOCATION</span><span class="w"> </span><span 
class="s1">&#39;s3://altinity-clickhouse-data/nyc_taxi_rides/data/tripdata_parquet/&#39;</span>
+<span class="k">OPTIONS</span><span class="p">(</span><span 
class="n">aws</span><span class="p">.</span><span 
class="n">SKIP_SIGNATURE</span><span class="w"> </span><span 
class="k">true</span><span class="p">);</span>
+</pre></div>
+</div>
+<p>Credentials are taken in this order of precedence:</p>
+<ol class="arabic simple">
+<li><p>Explicitly specified in the <code class="docutils literal 
notranslate"><span class="pre">OPTIONS</span></code> clause of the <code 
class="docutils literal notranslate"><span class="pre">CREATE</span> <span 
class="pre">EXTERNAL</span> <span class="pre">TABLE</span></code> 
statement.</p></li>
+<li><p>Determined by the <a class="reference external" href="https://docs.rs/aws-config/latest/aws_config/"><code class="docutils literal notranslate"><span class="pre">aws-config</span></code></a> crate (standard environment variables such as <code class="docutils literal notranslate"><span class="pre">AWS_ACCESS_KEY_ID</span></code> and <code class="docutils literal notranslate"><span class="pre">AWS_SECRET_ACCESS_KEY</span></code>, as well as other AWS-specific features).</p></li>
+</ol>
+<p>If no credentials are specified, DataFusion CLI will use unsigned requests 
to S3,
+which allows reading from public buckets.</p>
 <p>Supported configuration options are:</p>
 <table class="table">
 <thead>
@@ -887,7 +922,15 @@ $<span class="w"> </span>datafusion-cli
 </tr>
 <tr class="row-even"><td><p><code class="docutils literal notranslate"><span 
class="pre">AWS_ALLOW_HTTP</span></code></p></td>
 <td><p></p></td>
-<td><p>set to “true” to permit HTTP connections without TLS</p></td>
+<td><p>If “true”, permit HTTP connections without TLS</p></td>
+</tr>
+<tr class="row-odd"><td><p><code class="docutils literal notranslate"><span 
class="pre">AWS_SKIP_SIGNATURE</span></code></p></td>
+<td><p><code class="docutils literal notranslate"><span 
class="pre">aws.skip_signature</span></code></p></td>
+<td><p>If “true”, does not sign requests</p></td>
+</tr>
+<tr class="row-even"><td><p></p></td>
+<td><p><code class="docutils literal notranslate"><span 
class="pre">aws.nosign</span></code></p></td>
+<td><p>Alias for <code class="docutils literal notranslate"><span 
class="pre">skip_signature</span></code></p></td>
 </tr>
 </tbody>
 </table>


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@datafusion.apache.org
For additional commands, e-mail: commits-h...@datafusion.apache.org
