[drill] branch gh-pages updated: team update edits for DRILL-6744

bridgetb Fri, 14 Dec 2018 13:33:39 -0800

This is an automated email from the ASF dual-hosted git repository.

bridgetb pushed a commit to branch gh-pages
in repository https://gitbox.apache.org/repos/asf/drill.git



The following commit(s) were added to refs/heads/gh-pages by this push:
     new e6a1c3d  team update edits for DRILL-6744
e6a1c3d is described below

commit e6a1c3d454d92f6a1d575d3b7c182c7510403300
Author: Bridget Bevens <[email protected]>
AuthorDate: Fri Dec 14 13:32:27 2018 -0800

    team update edits for DRILL-6744
---
 .../026-parquet-filter-pushdown.md                 | 24 +++---
 team.md                                            | 91 ++++++++++++----------
 2 files changed, 64 insertions(+), 51 deletions(-)

diff --git a/_docs/performance-tuning/026-parquet-filter-pushdown.md 
b/_docs/performance-tuning/026-parquet-filter-pushdown.md
index a1c5416..fe50bf7 100644
--- a/_docs/performance-tuning/026-parquet-filter-pushdown.md
+++ b/_docs/performance-tuning/026-parquet-filter-pushdown.md
@@ -6,9 +6,11 @@ parent: "Performance Tuning"
 
 Drill 1.9 introduces the Parquet filter pushdown option. Parquet filter 
pushdown is a performance optimization that prunes extraneous data from a 
Parquet file to reduce the amount of data that Drill scans and reads when a 
query on a Parquet file contains a filter expression. Pruning data reduces the 
I/O, CPU, and network overhead to optimize Drill’s performance.
  
-Parquet filter pushdown is enabled by default. When a query contains a filter 
expression, you can run the [EXPLAIN PLAN 
command]({{site.baseurl}}/docs/explain/) to see if Drill applies Parquet filter 
pushdown to the query. You can enable and disable this feature using the [ALTER 
SYSTEM|SESSION SET]({{site.baseurl}}/docs/alter-system/) command with the 
`planner.store.parquet.rowgroup.filter.pushdown` option.  
+Parquet filter pushdown is enabled by default. When a query contains a filter 
expression, you can run the [EXPLAIN PLAN 
command]({{site.baseurl}}/docs/explain/) to see if Drill applies Parquet filter 
pushdown to the query. You can enable and disable this feature through the 
`planner.store.parquet.rowgroup.filter.pushdown` option, as shown:   
 
-As of Drill 1.13, the query planner in Drill can apply project push down, 
filter push down, and partition pruning to star queries in common table 
expressions (CTEs), views, and subqueries, for example:  
+       SET `planner.store.parquet.rowgroup.filter.pushdown`='false'   
+
+Starting in Drill 1.13, the query planner in Drill can apply project push 
down, filter push down, and partition pruning to star queries in common table 
expressions (CTEs), views, and subqueries, for example:  
   
        select col1 from (select * from t)  
 
@@ -37,7 +39,7 @@ If Parquet files were created with a pre-1.10.0 version of 
Parquet, and the data
 In Hive 2.3, Parquet files are created by a pre-1.10.0 version of Parquet. If 
the data in the binary columns is in ASCII format, you can enable the 
`store.parquet.reader.strings_signed_min_max` option to enable pushdown support 
for VARCHAR data types. DECIMAL filter pushdown is not supported.  
 
 ###Drill Generated Metadata Files  
-Parquet filter pushdown for DECIMAL and VARCHAR data types may not work 
correctly on Drill metadata files that were generated prior to Drill 1.15. 
Regenerate all Drill metadata files using Drill 1.15 or later to ensure that 
Parquet filter pushdown works correctly on Drill generated metadata files.
+Parquet filter pushdown for DECIMAL and VARCHAR data types may not work 
correctly on Drill metadata files that were generated prior to Drill 1.15. 
Regenerate all Drill metadata files using Drill 1.15 or later to ensure that 
Parquet filter pushdown on VARCHAR and DECIMAL data types works correctly on 
Drill generated metadata files.
 
 If the `store.parquet.reader.strings_signed_min_max` option is not enabled 
during regeneration, the minimum and maximum values for the binary data will 
not be written. When the binary data is in ASCII format, enabling the 
`store.parquet.reader.strings_signed_min_max` option during regeneration 
ensures that the minimum and maximum values are written and thus read back and 
used during filter pushdown.  
 
@@ -72,17 +74,15 @@ Currently, Parquet filter pushdown only supports filters 
that reference columns
 Parquet filter pushdown works best if you presort the data. You do not have to 
sort the entire data set at once. You can sort a subset of the data set, sort 
another subset, and so on.   
 
 ###Configuring Parquet Filter Pushdown  
-Use the [ALTER SYSTEM|SESSION SET]({{site.baseurl}}/docs/alter-system/) 
command with the Parquet filter pushdown options to enable or disable the 
feature, and set the number of row groups for a table.  
+Use the [ALTER SYSTEM]({{site.baseurl}}/docs/alter-system/) or 
[SET]({{site.baseurl}}/docs/set/) command with the Parquet filter pushdown 
options to enable or disable the related features.  
 
 The following table lists the Parquet filter pushdown options with their 
descriptions and default values:  
 
-|       Option                                               | Description     
                                                                                
                                                                                
                                                                                
                                                                                
           | Default   |
-|------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|
-| "planner.store.parquet.rowgroup.filter.pushdown"           | Turns the 
Parquet filter pushdown feature on or   off.                                    
                                                                                
                                                                                
                                                                                
                 | TRUE      |
-| "planner.store.parquet.rowgroup.filter.pushdown.threshold" | Sets the number 
of row groups that a table can   have. You can increase the threshold if the 
filter can prune many row groups.   However, if this setting is too high, the 
filter evaluation overhead   increases. Base this setting on the data set. 
Reduce this setting if the   planning time is significant, or you do not see 
any benefit at runtime. | 10,000    |  
-
-###Viewing the Query Plan
-Because Drill applies Parquet filter pushdown during the query planning phase, 
you can view the query execution plan to see if Drill pushes down the filter 
when a query on a Parquet file contains a filter expression. You can run the 
[EXPLAIN PLAN command]({{site.baseurl}}/docs/explain/) to see the execution 
plan for the query, as shown in the following example.
+| Option                                                   | Description       
                                                                                
                                                                                
                                                                                
                                                                                
                                                                                
              [...]
+|----------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 [...]
+| planner.store.parquet.rowgroup.filter.pushdown           | Turns   the 
Parquet filter pushdown feature on or off.                                      
                                                                                
                                                                                
                                                                                
                                                                                
                    [...]
+| planner.store.parquet.rowgroup.filter.pushdown.threshold | Sets   the number 
of row groups that a table can have. You can increase the   threshold if the 
filter can prune many row groups. However, if this setting   is too high, the 
filter evaluation overhead increases. Base this setting on   the data set. 
Reduce this setting if the planning time is significant, or you   do not see 
any benefit at runtime.                                                         
                           [...]
+| store.parquet.reader.strings_signed_min_max              | Allows binary 
statistics usage   for Parquet files created with a pre-1.10.0 version of 
Parquet. Files created   pre-1.10.0 have incorrectly calculated statistics for 
UTF-8 data. If you know   that data in the binary columns is in ASCII (not 
UTF-8), setting this option   to 'true' enables statistics usage for VARCHAR 
and DECIMAL data types.   Default is unset; empty string. Allowed values are 
'true', 'false', '' (empty   string [...]
 
 **Example**  
 
@@ -113,7 +113,7 @@ The following table lists the supported and unsupported 
clauses, operators, data
 | Clauses              | WHERE,   <sup>1</sup>WITH, HAVING (HAVING is 
supported if Drill can pass the filter through GROUP   BY.)                     
                                                                                
                                                            | -                 
                      |
 | Operators            | <sup>2</sup>BETWEEN,   <sup>2</sup>ITEM, AND, OR, 
NOT, <sup>1</sup>IS [NOT] NULL, <sup>1</sup>IS [NOT] TRUE, <sup>1</sup>IS [NOT] 
FALSE, IN (An   IN list is converted to OR if the number in the IN list is 
within a certain   threshold, for example 20. If greater than the threshold, 
pruning cannot   occur.) | -                                       |
 | Comparison Operators | <>,   <, >, <=, >=, =                                 
                                                                                
                                                                                
                                        | -                                     
  |
-| Data Types           | INT,   BIGINT, FLOAT, DOUBLE, DATE, TIMESTAMP, TIME, 
<sup>1</sup>BOOLEAN (true, false), <sup>3</sup>VARCHAR columns                  
                                                                                
                                                                                
 | CHAR,   Hive TIMESTAMP |
+| Data Types           | INT,   BIGINT, FLOAT, DOUBLE, DATE, TIMESTAMP, TIME, 
<sup>1</sup>BOOLEAN (true, false), <sup>3</sup>VARCHAR and DECIMAL columns      
                                                                                
                                                                                
             | CHAR,   Hive TIMESTAMP |
 | Function             | CAST   is supported among the following types only: 
int, bigint, float, double,   <sup>1</sup>date, <sup>1</sup>timestamp, and 
<sup>1</sup>time                                                                
                                                                                
| -                                       |
 | Other                | <sup>2</sup>Enabled   native Hive reader, Files with 
multiple row groups, <sup>2</sup>Joins                                          
                                                                                
                                                             | -                
                       |
 
diff --git a/team.md b/team.md
index bd65f6d..4a8a3ec 100755
--- a/team.md
+++ b/team.md
@@ -8,42 +8,55 @@ We welcome contributions to the project. If you're interested 
in contributing, t
 
 ## Drill Committers
 
-| Name | Alias (email is &lt;alias&gt;@apache.org) |
-|------|-------|
-| Jacques Nadeau | jacques |
-| Tomer Shiran | tshiran |
-| Ted Dunning | tdunning |
-| Jason Frantz | jason |
-| MC Srivas | srivas |
-| Julian Hyde | jhyde |
-| Tim Chen | tnachen |
-| Mehant Baid | mehant |
-| Jinfeng Ni | jni |
-| Venki Korukanti | venki |
-| Jason Altekruse | json |
-| Aditya Kishore | adi |
-| Parth Chandra | parthc |
-| Aman Sinha | amansinha |
-| Steven Phillips | smp |
-| Bridget Bevens | bridgetb |
-| Hanifi Gunes | hg |
-| Abdelhakim Deneche | adeneche |
-| Sudheesh Katkam | sudheesh |
-| Ellen Friedman | ellenf |
-| Kris Hahn | krishahn |
-| Neeraja Rentachintala | neerajar |
-| Chris Westin | cwestin |
-| Abhishek Girish | agirish |
-| Rahul Challapalli | rkins |
-| Arina Ielchiieva | arina |  
-| Paul Rogers | progers |
-| Laurent Goujon | laurent |  
-| Charles Givre | cgivre |   
-| Boaz Ben-Zvi | boaz |  
-| Anil Kumar Batchu | akumarb2010 |  
-| Vitalii Diravka  | vitalii |  
-| Kamesh Bhallamudi | kameshb |  
-| Kunal Khatua | kunal |  
-| Volodymyr Vysotskyi | volodymyr |
-| Sorabh Hamirwasia | sorabh | 
-| Timothy Farkas | timothyfarkas |  
+| **Name**                    | **Alias (email is <alias>@apache.org)** |
+|-------------------------|-------------------------------------|
+| Abdel Hakim Deneche     | adeneche                            |
+| Aditya Kishore          | adi                                 |
+| Abhishek Girish         | agirish                             |
+| AnilKumar B             | akumarb2010                         |
+| Aman Sinha              | amansinha                           |
+| Arina Ielchiieva        | arina                               |
+| Boaz Ben-Zvi            | boaz                                |
+| Bridget Bevens          | bridgetb                            |
+| Kamesh Bhallamudi       | bvskamesh                           |
+| Charles Givre           | cgivre                              |
+| Chunhui Shi             | cshi                                |
+| Chris Wensel            | cwensel                             |
+| Chris Westin            | cwestin                             |
+| Ellen Friedman          | ellenf                              |
+| German Shegalov         | gera                                |
+| Gautam Parai            | gparai                              |
+| Grant Ingersoll         | gsingers                            |
+| Hanifi Gunes            | hg                                  |
+| Hanumath Rao Maduri     | hmaduri                             |
+| Hsuan-Yi Chu            | hsuanyichu                          |
+| Isabel Drost-Fromm      | isabel                              |
+| Jacques Nadeau          | jacques                             |
+| Jason Frantz            | jason                               |
+| Julian Hyde             | jhyde                               |
+| Jinfeng Ni              | jni                                 |
+| Jason Altekruse         | json                                |
+| Karthikeyan Manivannan  | karthikm                            |
+| Keys Botzum             | kbotzum                             |
+| Kris Hahn               | krishahn                            |
+| Kunal Khatua            | kunal                               |
+| Laurent Goujon          | laurent                             |
+| Mehant Baid             | mehant                              |
+| Neeraja Rentachintala   | neerajar                            |
+| Parth Chandra           | parthc                              |
+| Padma Penumarthy        | ppadma                              |
+| Paul Rogers             | progers                             |
+| Ryan Rawson             | rawson                              |
+| Rahul Kumar Challapalli | rkins                               |
+| Steven Phillips         | smp                                 |
+| Sorabh Hamirwasia       | sorabh                              |
+| Srivas                  | srivas                              |
+| Sudheesh Katkam         | sudheesh                            |
+| Ted Dunning             | tdunning                            |
+| Timothy Farkas          | timothyfarkas                       |
+| Timothy Chen            | tnachen                             |
+| Tomer Shiran            | tshiran                             |
+| Venki Korukanti         | venki                               |
+| Vitalii Diravka         | vitalii                             |
+| Vova Vysotskyi          | volodymyr                           |
+| Weijie Tong             | weijie                              |
\ No newline at end of file
