[drill] branch gh-pages updated: doc edits for 1.15

bridgetb Thu, 10 Jan 2019 15:56:02 -0800

This is an automated email from the ASF dual-hosted git repository.

bridgetb pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git



The following commit(s) were added to refs/heads/gh-pages by this push:
     new 12be27a  doc edits for 1.15
12be27a is described below

commit 12be27ab53289852e1d89ed5a90cfbdb017719bc
Author: Bridget Bevens <[email protected]>
AuthorDate: Thu Jan 10 15:52:04 2019 -0800

    doc edits for 1.15
---
 .../plugins/110-s3-storage-plugin.md               | 79 ++++++++++++++--------
 .../026-parquet-filter-pushdown.md                 |  4 +-
 2 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md b/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
index d688576..726e1ce 100644
--- a/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
+++ b/_docs/connect-a-data-source/plugins/110-s3-storage-plugin.md
@@ -1,6 +1,6 @@
 ---
 title: "S3 Storage Plugin"
-date: 2018-12-22
+date: 2019-01-10
 parent: "Connect a Data Source"
 ---
 Drill works with data stored in the cloud. With a few simple steps, you can 
configure the S3 storage plugin for Drill and be off to the races running 
queries. 
@@ -19,7 +19,7 @@ For additional information, refer to the [HDFS S3 
documentation](https://hadoop.
 
 ## Providing AWS Credentials  
 
-Your environment determines where you provide your AWS credentials. You can 
use the following methods to define your AWS credentials:  
+Your environment determines where you provide your AWS credentials. You define 
your AWS credentials:  
 
 - In the S3 storage plugin configuration:
        - [You can point to an encrypted file in an external 
provider.]({{site.baseurl}}/docs/s3-storage-plugin/#using-an-external-provider-for-credentials)
 (Drill 1.15 and later) 
@@ -64,43 +64,47 @@ If you use IAM roles/Instance profiles, to access data in 
s3, use the following
 
 ##Configuring the S3 Storage Plugin
 
-The Storage page in the Drill Web UI provides an S3 storage plugin that you 
configure to connect Drill to the S3 distributed file system registered in 
core-site.xml. If you did not define your AWS credentials in the core-site.xml 
file, you can define them in the storage plugin configuration. You can define 
the credentials directly in the configuration, or you can use an external 
provider. 
+The **Storage** page in the Drill Web UI provides an S3 storage plugin that 
you configure to connect Drill to the S3 distributed file system registered in 
`core-site.xml`. If you did not define your AWS credentials in the 
`core-site.xml` file, you can define them in the storage plugin configuration. 
You can define the credentials directly in the S3 storage plugin configuration, 
or you can configure the S3 storage plugin to use an external provider.
 
-To configure the S3 storage plugin, log in to the Drill Web UI at 
`http://<drill-hostname>:8047`. The drill-hostname is a node on which Drill is 
running. Go to the **Storage** page and click **Update** next to the S3 storage 
plugin option. Edit the configuration and then click **Update** to save the 
configuration.  
+To configure the S3 storage plugin, log in to the Drill Web UI at 
`http://<drill-hostname>:8047`. The `drill-hostname` is a node on which Drill 
is running. Go to the **Storage** page and click **Update** next to the S3 
storage plugin option. 
 
-**Note:** The `"config"` block in the S3 storage plugin configuration contains 
contains properties to define your AWS credentials. Do not include the 
`"config"` block in your S3 storage plugin configuration if you defined your 
AWS credentials in the core-site.xml file. 
+**Note:** The `"config"` block in the S3 storage plugin configuration contains 
properties to define your AWS credentials. Do not include the `"config"` block 
in your S3 storage plugin configuration if you defined your AWS credentials in 
the `core-site.xml` file.  
 
-Use either of the following methods to provide your credentials:
+Configure the S3 storage plugin configuration to use an external provider for 
credentials or directly add the credentials in the configuration itself, as 
described in the following sections. Click **Update** to save the configuration 
when done. 
 
 ### Using an External Provider for Credentials
-Starting in Drill 1.15, the S3 storage plugin supports the [Hadoop Credential 
Provider 
API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html]),
 which allows you to store secret keys and other sensitive data in an encrypted 
file in an external provider versus storing them in plain text in a 
configuration file or storage plugin configuration.
+Starting in Drill 1.15, the S3 storage plugin supports the [Hadoop Credential 
Provider 
API](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/CredentialProviderAPI.html%5D),
 which allows you to store secret keys and other sensitive data in an encrypted 
file in an external provider versus storing them in plain text in a 
configuration file or directly in the storage plugin configuration.
+ 
+When you configure the S3 storage plugin to use an external provider, Drill 
first checks the external provider for the keys. If the keys are not available 
via the provider, or the provider is not configured, Drill can fall back to 
using the plain text data in the `core-site.xml` file or S3 storage plugin 
configuration. 
 
-When you configure the S3 storage plugin to use an external provider, Drill 
first checks the external provider for the keys. If the keys are not available 
via the provider, or the provider is not configured, Drill can fall back to 
using the plain text data in the `core-site.xml` file or S3 configuration, 
unless the `hadoop.security.credential.clear-text-fallback` property is set to 
`false`.  
+For fallback to work, you must include the 
`hadoop.security.credential.clear-text-fallback` property in the S3 storage 
plugin configuration, with the property set to 'true'. 
 
-**Configuring the S3 Plugin to use an External Provider**  
+For subsequent connections, if you want Drill to connect using different 
credentials, you can include the `fs.s3a.impl.disable.cache` property in the  
configuration. See [Reconnecting to an S3 Bucket Using Different 
Credentials]({{site.baseurl}}/docs/s3-storage-plugin/#reconnecting-to-an-s3-bucket-using-different-credentials)
 for more information.  
 
-Add the bucket name, `hadoop.security.credential.provider.path` and 
`fs.s3a.impl.disable.cache` properties to the S3 storage plugin configuration, 
as shown in the following example:
+**Configuring the S3 Plugin to use an External Provider**  
+Add the bucket name and the `hadoop.security.credential.provider.path` 
property to the S3 storage plugin configuration. The 
`hadoop.security.credential.provider.path` property should point to a file that 
contains your encrypted passwords. Optionally, include the 
`hadoop.security.credential.clear-text-fallback` property for fallback and the 
`fs.s3a.impl.disable.cache` property to reconnect using different credentials. 
  
+The following example shows an S3 storage plugin configuration with the S3 
bucket, `hadoop.security.credential.provider.path`, and 
`fs.s3a.impl.disable.cache` properties set:  
+
        {
-        "type":
-       "file",
+       "type":
+    "file",
          "connection": "s3a://bucket-name/",
          "config": {
-               
"hadoop.security.credential.provider.path":"jceks://file/tmp/s3.jceks",
-               "Fs.s3a.impl.disable.cache":"true",
-               ...
-               },
+           
"hadoop.security.credential.provider.path":"jceks://file/tmp/s3.jceks",
+           "fs.s3a.impl.disable.cache":"true",
+           ...
+           },
          "workspaces": {
            ...
-         }
-
- 
-**Note:** The `hadoop.security.credential.provider.path` property should point 
to a file that contains your encrypted passwords. The 
`fs.s3a.impl.disable.cache` option must be set to true.
+         }  
 
 ###Adding Credentials Directly to the S3 Plugin  
-You can add your AWS credentials directly to the S3 configuration, though this 
method is the least secure, but sufficient for use on a single machine, such as 
a laptop. 
+You can add your AWS credentials directly to the S3 configuration, though this 
method is the least secure, but sufficient for use on a single machine, such as 
a laptop. Include the S3 bucket name, the AWS access keys, and the S3 endpoint 
in the configuration. 
+
+Optionally, for subsequent connections, if you want Drill to connect using 
different credentials, you can include the `fs.s3a.impl.disable.cache` property 
in the  configuration. See [Reconnecting to an S3 Bucket Using Different 
Credentials]({{site.baseurl}}/docs/s3-storage-plugin/#reconnecting-to-an-s3-bucket-using-different-credentials)
 for more information.
 
-Add the S3 bucket name and the `"config"` block with the properties shown in 
the following example: 
+The following example shows an S3 storage plugin configuration with the S3 
bucket, access key properties, and `fs.s3a.impl.disable.cache` property:
 
     {
        "type": "file",
@@ -109,13 +113,34 @@ Add the S3 bucket name and the `"config"` block with the 
properties shown in the
        "config": {
                "fs.s3a.access.key": "<key>",
                "fs.s3a.secret.key": "<key>",
-               "fs.s3a.endpoint": "s3.us-west-1.amazonaws.com"
+               "fs.s3a.endpoint": "s3.us-west-1.amazonaws.com",
+           "fs.s3a.impl.disable.cache":"true"
        },
        "workspaces": {...
-               },
-       
-         
-Drill can now use the HDFS s3a library to access data in S3.
+               },  
+
+###Reconnecting to an S3 Bucket Using Different Credentials 
+Whether you store credentials in the S3 storage plugin configuration directly 
or in an external provider, you can reconnect to an existing S3 bucket using 
different credentials when you include the `fs.s3a.impl.disable.cache` property 
in the S3 storage plugin configuration. The `fs.s3a.impl.disable.cache` 
property disables the S3 file system cache when set to 'true'. If 
`fs.s3a.impl.disable.cache` is set to 'false' when Drill reconnects, Drill uses 
the previous credentials to connect. Yo [...]
+
+The following example S3 storage plugin configuration includes the 
fs.s3a.impl.disable.cache property:
+
+
+{
+ "type":
+"file",
+  "connection": "s3a://bucket-name/",
+  "config": {
+    "hadoop.security.credential.provider.path":"jceks://file/tmp/s3.jceks",
+    "fs.s3a.impl.disable.cache":"true",
+    ...
+    },
+  "workspaces": {
+    ...
+  }
+
+
+
+  
 
 
 ## Quering Parquet Format Files On S3 
diff --git a/_docs/performance-tuning/026-parquet-filter-pushdown.md b/_docs/performance-tuning/026-parquet-filter-pushdown.md
index fe50bf7..3976c2e 100644
--- a/_docs/performance-tuning/026-parquet-filter-pushdown.md
+++ b/_docs/performance-tuning/026-parquet-filter-pushdown.md
@@ -1,6 +1,6 @@
 ---
 title: "Parquet Filter Pushdown"
-date: 2018-12-14
+date: 2019-01-10
 parent: "Performance Tuning"
 ---
 
@@ -113,7 +113,7 @@ The following table lists the supported and unsupported 
clauses, operators, data
 | Clauses              | WHERE,   <sup>1</sup>WITH, HAVING (HAVING is 
supported if Drill can pass the filter through GROUP   BY.)                     
                                                                                
                                                            | -                 
                      |
 | Operators            | <sup>2</sup>BETWEEN,   <sup>2</sup>ITEM, AND, OR, 
NOT, <sup>1</sup>IS [NOT] NULL, <sup>1</sup>IS [NOT] TRUE, <sup>1</sup>IS [NOT] 
FALSE, IN (An   IN list is converted to OR if the number in the IN list is 
within a certain   threshold, for example 20. If greater than the threshold, 
pruning cannot   occur.) | -                                       |
 | Comparison Operators | <>,   <, >, <=, >=, =                                 
                                                                                
                                                                                
                                        | -                                     
  |
-| Data Types           | INT,   BIGINT, FLOAT, DOUBLE, DATE, TIMESTAMP, TIME, 
<sup>1</sup>BOOLEAN (true, false), <sup>3</sup>VARCHAR and DECIMAL columns      
                                                                                
                                                                                
             | CHAR,   Hive TIMESTAMP |
+| Data Types           | INT, BIGINT, FLOAT, DOUBLE, DATE, TIMESTAMP, TIME, 
<sup>1</sup>BOOLEAN (true, false), <sup>3</sup>VARCHAR, CHAR (treated as 
VARCHAR), and DECIMAL columns                                                   
                                                                                
                                                | Hive TIMESTAMP |
 | Function             | CAST   is supported among the following types only: 
int, bigint, float, double,   <sup>1</sup>date, <sup>1</sup>timestamp, and 
<sup>1</sup>time                                                                
                                                                                
| -                                       |
 | Other                | <sup>2</sup>Enabled   native Hive reader, Files with 
multiple row groups, <sup>2</sup>Joins                                          
                                                                                
                                                             | -                
                       |
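
A minimal sketch, not part of the commit, of how the S3 storage plugin configuration described in the 110-s3-storage-plugin.md changes above could be assembled programmatically. The helper name `build_s3_plugin_config` and its parameters are hypothetical. The property names and the `jceks://file/tmp/s3.jceks` path come from the examples in the diff, and the workspaces block, which those examples elide, is left as an empty placeholder.

```python
import json


def build_s3_plugin_config(bucket, provider_path=None,
                           clear_text_fallback=None, disable_cache=None):
    """Return a dict shaped like the S3 storage plugin examples in the diff.

    Hypothetical helper for illustration only; Drill does not ship it.
    """
    config = {}
    if provider_path is not None:
        # Points to the file that holds the encrypted credentials.
        config["hadoop.security.credential.provider.path"] = provider_path
    if clear_text_fallback is not None:
        # Per the doc text, fallback requires this property set to 'true'.
        config["hadoop.security.credential.clear-text-fallback"] = (
            str(bool(clear_text_fallback)).lower())
    if disable_cache is not None:
        # 'true' disables the S3 file system cache, so a reconnect can use
        # different credentials.
        config["fs.s3a.impl.disable.cache"] = str(bool(disable_cache)).lower()

    plugin = {"type": "file", "connection": f"s3a://{bucket}/"}
    if config:
        # The doc says to leave out the "config" block when the credentials
        # are defined in core-site.xml, so it is added only when non-empty.
        plugin["config"] = config
    plugin["workspaces"] = {}  # placeholder; elided in the original examples
    return plugin


example = build_s3_plugin_config(
    "bucket-name",
    provider_path="jceks://file/tmp/s3.jceks",
    disable_cache=True,
)
print(json.dumps(example, indent=2))
```

The values are emitted as the strings "true" and "false" to match the quoting used in the examples in the diff. The printed JSON would still need real workspace definitions before being pasted into the Storage page of the Drill Web UI.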
