[jira] [Commented] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586937#comment-16586937
 ] 

ASF GitHub Bot commented on DRILL-5735:
---

kkhatua commented on issue #1279: DRILL-5735: Allow search/sort in the Options 
webUI
URL: https://github.com/apache/drill/pull/1279#issuecomment-414550320
 
 
   Done. Rebased and squashed. Cleared unit and functional tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> UI options grouping and filtering & Metrics hints
> -
>
> Key: DRILL-5735
> URL: https://issues.apache.org/jira/browse/DRILL-5735
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0, 1.10.0, 1.11.0
>Reporter: Muhammad Gelbana
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> I'm thinking of some UI improvements that could make all the difference for 
> users trying to optimize low-performing queries.
> h2. Options
> h3. Grouping
> We can group the options by their scope of effect; this will
> help users easily locate the options they may need to tune.
> h3. Filtering
> Since there are many options, we can add a filtering mechanism (i.e. string 
> search or group/scope filtering) so users can filter out the options they are 
> not interested in. To provide more benefit than the grouping idea mentioned 
> above, filtering could also match keywords and not just the option name, 
> since users may not know the name of the option they are looking for.
> h2. Metrics
> I'm referring here to the metrics page and the query execution plan page that 
> displays the overview section and major/minor fragment metrics. We can show 
> hints for each metric such as:
> # What it represents, in more detail.
> # Which option(s) or scope of options to tune (increase? decrease?) to improve 
> the performance reported by this metric.
> # Maybe even provide a small dialog to quickly allow modifying the 
> option(s) related to that metric.
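Drill already exposes its options through the sys.options system table, so the keyword search the description asks for can be approximated today with a plain query. This is only an illustrative sketch; the exact column set of sys.options (name, kind, status) is assumed here and varies between Drill versions:

```sql
-- Hypothetical illustration of option filtering: find every option whose
-- name contains the keyword 'memory', e.g. as a backend for a webUI search box.
SELECT name, kind, status
FROM   sys.options
WHERE  name LIKE '%memory%'
ORDER  BY name;
```

A keyword index over option descriptions (not just names) would extend this beyond what a LIKE on the name can match.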



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586743#comment-16586743
 ] 

Robert Hou commented on DRILL-6566:
---

The parquet views can be found at 10.10.100.186:/tmp/createViewsParquet.sql

> Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.  AGGR OOM at First Phase.
> --
>
> Key: DRILL-6566
> URL: https://issues.apache.org/jira/browse/DRILL-6566
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: Boaz Ben-Zvi
>Priority: Critical
> Fix For: 1.15.0
>
> Attachments: drillbit.log.6566
>
>
> This is TPCDS Query 66.
> Query: tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql
> SELECT w_warehouse_name,
> w_warehouse_sq_ft,
> w_city,
> w_county,
> w_state,
> w_country,
> ship_carriers,
> year1,
> Sum(jan_sales) AS jan_sales,
> Sum(feb_sales) AS feb_sales,
> Sum(mar_sales) AS mar_sales,
> Sum(apr_sales) AS apr_sales,
> Sum(may_sales) AS may_sales,
> Sum(jun_sales) AS jun_sales,
> Sum(jul_sales) AS jul_sales,
> Sum(aug_sales) AS aug_sales,
> Sum(sep_sales) AS sep_sales,
> Sum(oct_sales) AS oct_sales,
> Sum(nov_sales) AS nov_sales,
> Sum(dec_sales) AS dec_sales,
> Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot,
> Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot,
> Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot,
> Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot,
> Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot,
> Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot,
> Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot,
> Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot,
> Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot,
> Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot,
> Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot,
> Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot,
> Sum(jan_net)   AS jan_net,
> Sum(feb_net)   AS feb_net,
> Sum(mar_net)   AS mar_net,
> Sum(apr_net)   AS apr_net,
> Sum(may_net)   AS may_net,
> Sum(jun_net)   AS jun_net,
> Sum(jul_net)   AS jul_net,
> Sum(aug_net)   AS aug_net,
> Sum(sep_net)   AS sep_net,
> Sum(oct_net)   AS oct_net,
> Sum(nov_net)   AS nov_net,
> Sum(dec_net)   AS dec_net
> FROM   (SELECT w_warehouse_name,
> w_warehouse_sq_ft,
> w_city,
> w_county,
> w_state,
> w_country,
> 'ZOUROS'
> || ','
> || 'ZHOU' AS ship_carriers,
> d_year AS year1,
> Sum(CASE
> WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jan_sales,
> Sum(CASE
> WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS feb_sales,
> Sum(CASE
> WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS mar_sales,
> Sum(CASE
> WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS apr_sales,
> Sum(CASE
> WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS may_sales,
> Sum(CASE
> WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jun_sales,
> Sum(CASE
> WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jul_sales,
> Sum(CASE
> WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS aug_sales,
> Sum(CASE
> WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS sep_sales,
> Sum(CASE
> WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS oct_sales,
> Sum(CASE
> WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS nov_sales,
> Sum(CASE
> WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS dec_sales,
> Sum(CASE
> WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS jan_net,
> Sum(CASE
> WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS feb_net,
> Sum(CASE
> WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS mar_net,
> Sum(CASE
> WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS apr_net,
> Sum(CASE
> WHEN d_moy = 5 THEN 

[jira] [Commented] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586720#comment-16586720
 ] 

Robert Hou commented on DRILL-6566:
---

When I run query 66 with the Hive-generated data and set 
max_query_memory_per_node to 10 GB, I see the AGGR OOM message:

   Error: RESOURCE ERROR: One or more nodes ran out of memory while executing 
the query.

   AGGR OOM at First Phase. Partitions: 1. Estimated batch size: 31260672. 
values size: 25165824. Output alloc size: 25165824. Planned batches: 1 Memory 
limit: 2302755 so far allocated: 262144. 
Fragment 6:0

If I use the default max_query_memory_per_node of 2 GB, I see a generic 
message:

   Error: RESOURCE ERROR: One or more nodes ran out of memory while executing 
the query.
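For reference, the memory setting used in the experiment above can be changed per session with an ALTER SESSION statement. The option name matches the one discussed in the comment; the value below simply mirrors the 10 GB setting described there:

```sql
-- Raise the per-node query memory budget for this session to 10 GB
-- (value is in bytes) before re-running query 66.
ALTER SESSION SET `planner.memory.max_query_memory_per_node` = 10737418240;
```

Using ALTER SYSTEM instead would apply the change to all sessions on the cluster.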


[jira] [Commented] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Boaz Ben-Zvi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586708#comment-16586708
 ] 

Boaz Ben-Zvi commented on DRILL-6566:
-

It looks like, as batch sizing makes the number of rows smaller, the estimate 
(used to determine when to spill) still relies on the original estimate (based 
on 64K rows per batch). We need to check whether this estimate should be updated.

 

 


[jira] [Assigned] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Boaz Ben-Zvi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Boaz Ben-Zvi reassigned DRILL-6566:
---

Assignee: Boaz Ben-Zvi  (was: Timothy Farkas)


[jira] [Updated] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-6566:
--
Attachment: drillbit.log.6566


[jira] [Commented] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586681#comment-16586681
 ] 

Robert Hou commented on DRILL-6566:
---

Here is the explain plan for the Hive-generated parquet file.

{noformat}
| 00-00Screen
00-01  Project(w_warehouse_name=[$0], w_warehouse_sq_ft=[$1], w_city=[$2], 
w_county=[$3], w_state=[$4], w_country=[$5], ship_carriers=[$6], year1=[$7], 
jan_sales=[$8], feb_sales=[$9], mar_sales=[$10], apr_sales=[$11], 
may_sales=[$12], jun_sales=[$13], jul_sales=[$14], aug_sales=[$15], 
sep_sales=[$16], oct_sales=[$17], nov_sales=[$18], dec_sales=[$19], 
jan_sales_per_sq_foot=[$20], feb_sales_per_sq_foot=[$21], 
mar_sales_per_sq_foot=[$22], apr_sales_per_sq_foot=[$23], 
may_sales_per_sq_foot=[$24], jun_sales_per_sq_foot=[$25], 
jul_sales_per_sq_foot=[$26], aug_sales_per_sq_foot=[$27], 
sep_sales_per_sq_foot=[$28], oct_sales_per_sq_foot=[$29], 
nov_sales_per_sq_foot=[$30], dec_sales_per_sq_foot=[$31], jan_net=[$32], 
feb_net=[$33], mar_net=[$34], apr_net=[$35], may_net=[$36], jun_net=[$37], 
jul_net=[$38], aug_net=[$39], sep_net=[$40], oct_net=[$41], nov_net=[$42], 
dec_net=[$43])
00-02SelectionVectorRemover
00-03  Limit(fetch=[100])
00-04SelectionVectorRemover
00-05  TopN(limit=[100])
00-06HashAgg(group=[{0, 1, 2, 3, 4, 5, 6, 7}], 
jan_sales=[SUM($8)], feb_sales=[SUM($9)], mar_sales=[SUM($10)], 
apr_sales=[SUM($11)], may_sales=[SUM($12)], jun_sales=[SUM($13)], 
jul_sales=[SUM($14)], aug_sales=[SUM($15)], sep_sales=[SUM($16)], 
oct_sales=[SUM($17)], nov_sales=[SUM($18)], dec_sales=[SUM($19)], 
jan_sales_per_sq_foot=[SUM($20)], feb_sales_per_sq_foot=[SUM($21)], 
mar_sales_per_sq_foot=[SUM($22)], apr_sales_per_sq_foot=[SUM($23)], 
may_sales_per_sq_foot=[SUM($24)], jun_sales_per_sq_foot=[SUM($25)], 
jul_sales_per_sq_foot=[SUM($26)], aug_sales_per_sq_foot=[SUM($27)], 
sep_sales_per_sq_foot=[SUM($28)], oct_sales_per_sq_foot=[SUM($29)], 
nov_sales_per_sq_foot=[SUM($30)], dec_sales_per_sq_foot=[SUM($31)], 
jan_net=[SUM($32)], feb_net=[SUM($33)], mar_net=[SUM($34)], apr_net=[SUM($35)], 
may_net=[SUM($36)], jun_net=[SUM($37)], jul_net=[SUM($38)], aug_net=[SUM($39)], 
sep_net=[SUM($40)], oct_net=[SUM($41)], nov_net=[SUM($42)], dec_net=[SUM($43)])
00-07  Project(w_warehouse_name=[$0], w_warehouse_sq_ft=[$1], 
w_city=[$2], w_county=[$3], w_state=[$4], w_country=[$5], ship_carriers=[$6], 
year1=[$7], jan_sales=[$8], feb_sales=[$9], mar_sales=[$10], apr_sales=[$11], 
may_sales=[$12], jun_sales=[$13], jul_sales=[$14], aug_sales=[$15], 
sep_sales=[$16], oct_sales=[$17], nov_sales=[$18], dec_sales=[$19], $f20=[$20], 
$f21=[$21], $f22=[$22], $f23=[$23], $f24=[$24], $f25=[$25], $f26=[$26], 
$f27=[$27], $f28=[$28], $f29=[$29], $f30=[$30], $f31=[$31], jan_net=[$32], 
feb_net=[$33], mar_net=[$34], apr_net=[$35], may_net=[$36], jun_net=[$37], 
jul_net=[$38], aug_net=[$39], sep_net=[$40], oct_net=[$41], nov_net=[$42], 
dec_net=[$43])
00-08HashToRandomExchange(dist0=[[$0]])
01-01  UnorderedMuxExchange
02-01Project(w_warehouse_name=[$0], 
w_warehouse_sq_ft=[$1], w_city=[$2], w_county=[$3], w_state=[$4], 
w_country=[$5], ship_carriers=[$6], year1=[$7], jan_sales=[$8], feb_sales=[$9], 
mar_sales=[$10], apr_sales=[$11], may_sales=[$12], jun_sales=[$13], 
jul_sales=[$14], aug_sales=[$15], sep_sales=[$16], oct_sales=[$17], 
nov_sales=[$18], dec_sales=[$19], $f20=[$20], $f21=[$21], $f22=[$22], 
$f23=[$23], $f24=[$24], $f25=[$25], $f26=[$26], $f27=[$27], $f28=[$28], 
$f29=[$29], $f30=[$30], $f31=[$31], jan_net=[$32], feb_net=[$33], 
mar_net=[$34], apr_net=[$35], may_net=[$36], jun_net=[$37], jul_net=[$38], 
aug_net=[$39], sep_net=[$40], oct_net=[$41], nov_net=[$42], dec_net=[$43], 
E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($0, 1301011)])
02-02  UnionAll(all=[true])
02-04Project(w_warehouse_name=[$0], 
w_warehouse_sq_ft=[$1], w_city=[$2], w_county=[$3], w_state=[$4], 
w_country=[$5], ship_carriers=[||(||('ZOUROS', ','), 'ZHOU')], year1=[$6], 
jan_sales=[$7], feb_sales=[$8], mar_sales=[$9], apr_sales=[$10], 
may_sales=[$11], jun_sales=[$12], jul_sales=[$13], aug_sales=[$14], 
sep_sales=[$15], oct_sales=[$16], nov_sales=[$17], dec_sales=[$18], $f20=[/($7, 
$1)], $f21=[/($8, $1)], $f22=[/($9, $1)], $f23=[/($10, $1)], $f24=[/($11, $1)], 
$f25=[/($12, $1)], $f26=[/($13, $1)], $f27=[/($14, $1)], $f28=[/($15, $1)], 
$f29=[/($16, $1)], $f30=[/($17, $1)], $f31=[/($18, $1)], jan_net=[$19], 
feb_net=[$20], mar_net=[$21], apr_net=[$22], may_net=[$23], jun_net=[$24], 
jul_net=[$25], aug_net=[$26], sep_net=[$27], oct_net=[$28], nov_net=[$29], 
dec_net=[$30])
02-06  HashAgg(group=[{0, 1, 2, 3, 4, 5, 6}], 
jan_sales=[SUM($7)], feb_sales=[SUM($8)], mar_sales=[SUM($9)], 
apr_sales=[SUM($10)], may_sales=[SUM($11)], 

[jira] [Updated] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou updated DRILL-6566:
--
Description: 
This is TPCDS Query 66.

Query: tpcds/tpcds_sf1/hive-generated-parquet/hive1_native/query66.sql

SELECT w_warehouse_name,
w_warehouse_sq_ft,
w_city,
w_county,
w_state,
w_country,
ship_carriers,
year1,
Sum(jan_sales) AS jan_sales,
Sum(feb_sales) AS feb_sales,
Sum(mar_sales) AS mar_sales,
Sum(apr_sales) AS apr_sales,
Sum(may_sales) AS may_sales,
Sum(jun_sales) AS jun_sales,
Sum(jul_sales) AS jul_sales,
Sum(aug_sales) AS aug_sales,
Sum(sep_sales) AS sep_sales,
Sum(oct_sales) AS oct_sales,
Sum(nov_sales) AS nov_sales,
Sum(dec_sales) AS dec_sales,
Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot,
Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot,
Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot,
Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot,
Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot,
Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot,
Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot,
Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot,
Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot,
Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot,
Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot,
Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot,
Sum(jan_net)   AS jan_net,
Sum(feb_net)   AS feb_net,
Sum(mar_net)   AS mar_net,
Sum(apr_net)   AS apr_net,
Sum(may_net)   AS may_net,
Sum(jun_net)   AS jun_net,
Sum(jul_net)   AS jul_net,
Sum(aug_net)   AS aug_net,
Sum(sep_net)   AS sep_net,
Sum(oct_net)   AS oct_net,
Sum(nov_net)   AS nov_net,
Sum(dec_net)   AS dec_net
FROM   (SELECT w_warehouse_name,
w_warehouse_sq_ft,
w_city,
w_county,
w_state,
w_country,
'ZOUROS'
|| ','
|| 'ZHOU' AS ship_carriers,
d_year AS year1,
Sum(CASE
WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS jan_sales,
Sum(CASE
WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS feb_sales,
Sum(CASE
WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS mar_sales,
Sum(CASE
WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS apr_sales,
Sum(CASE
WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS may_sales,
Sum(CASE
WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS jun_sales,
Sum(CASE
WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS jul_sales,
Sum(CASE
WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS aug_sales,
Sum(CASE
WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS sep_sales,
Sum(CASE
WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS oct_sales,
Sum(CASE
WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS nov_sales,
Sum(CASE
WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity
ELSE 0
END)  AS dec_sales,
Sum(CASE
WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS jan_net,
Sum(CASE
WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS feb_net,
Sum(CASE
WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS mar_net,
Sum(CASE
WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS apr_net,
Sum(CASE
WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS may_net,
Sum(CASE
WHEN d_moy = 6 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS jun_net,
Sum(CASE
WHEN d_moy = 7 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS jul_net,
Sum(CASE
WHEN d_moy = 8 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS aug_net,
Sum(CASE
WHEN d_moy = 9 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS sep_net,
Sum(CASE
WHEN d_moy = 10 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS oct_net,
Sum(CASE
WHEN d_moy = 11 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS nov_net,
Sum(CASE
WHEN d_moy = 12 THEN ws_net_paid_inc_ship * ws_quantity
ELSE 0
END)  AS dec_net
FROM   web_sales,
warehouse,
date_dim,
time_dim,
ship_mode
WHERE  ws_warehouse_sk = w_warehouse_sk
AND ws_sold_date_sk = d_date_sk
AND ws_sold_time_sk = t_time_sk
AND ws_ship_mode_sk = sm_ship_mode_sk
AND d_year = 1998
AND t_time BETWEEN 7249 AND 7249 + 28800
AND sm_carrier IN ( 'ZOUROS', 'ZHOU' )
GROUP  BY w_warehouse_name,
w_warehouse_sq_ft,
w_city,
w_county,
w_state,
w_country,
d_year
UNION ALL
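Each monthly column in the query above is produced by conditional aggregation: a SUM over a CASE expression that zeroes out rows belonging to other months. A minimal sketch of that pivot pattern (hypothetical toy data, Java standing in for the SQL):

```java
// Sketch of the conditional-aggregation ("pivot") pattern used in query 66:
// each month column is a SUM over a CASE that zeroes out rows from other
// months. Hypothetical toy data, not the TPC-DS tables.
public class PivotSketch {
    public static void main(String[] args) {
        // rows: {d_moy, ws_ext_sales_price * ws_quantity}
        int[][] rows = { {1, 100}, {1, 50}, {2, 70}, {12, 30} };
        long[] monthly = new long[13];   // index 1..12 = jan..dec buckets
        for (int[] r : rows) {
            // CASE WHEN d_moy = m THEN amount ELSE 0 END, for every m at once
            monthly[r[0]] += r[1];
        }
        System.out.println(monthly[1] + " " + monthly[2] + " " + monthly[12]);
    }
}
```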

[jira] [Commented] (DRILL-6552) Drill Metadata management "Drill MetaStore"

2018-08-20 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586572#comment-16586572
 ] 

Robert Hou commented on DRILL-6552:
---

Are we considering a prototype for this feature?  I would be concerned about 
the ability of HMS to scale.  If we find that HMS on an RDBMS does not scale, 
then a solution based on HMS on an RDBMS might not be worth much.

> Drill Metadata management "Drill MetaStore"
> ---
>
> Key: DRILL-6552
> URL: https://issues.apache.org/jira/browse/DRILL-6552
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Metadata
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 2.0.0
>
>
> It would be useful for Drill to have some sort of metastore which would 
> enable Drill to remember previously defined schemata so Drill doesn’t have to 
> do the same work over and over again.
> It would allow storing schemas and statistics, which would speed up query 
> validation, planning, and execution. It would also increase the stability of 
> Drill and help avoid various kinds of issues, such as "schema change 
> exceptions" and "limit 0" optimization. 
> One of the main candidates is the Hive Metastore.
> Starting from version 3.0, the Hive Metastore can run as a service separate 
> from the Hive server:
> [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration]
> Optional enhancement is storing Drill's profiles, UDFs, plugins configs in 
> some kind of metastore as well.
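The core idea — remembering previously resolved schemas so planning can skip re-scanning files — can be illustrated with a toy cache (a hedged sketch only; not a design for the proposed metastore, and `schemaFor` is a hypothetical name):

```java
import java.util.HashMap;
import java.util.Map;

// Toy in-memory schema cache illustrating the metastore idea above.
// Remembering a previously resolved schema lets later queries skip
// re-reading the underlying files.
public class SchemaCacheSketch {
    private final Map<String, String> schemaByTable = new HashMap<>();
    private int scans = 0;

    String schemaFor(String tablePath) {
        // computeIfAbsent: perform the expensive scan only on a cache miss
        return schemaByTable.computeIfAbsent(tablePath, p -> {
            scans++;
            return "(a INT, b VARCHAR)";   // stand-in for a discovered schema
        });
    }

    public static void main(String[] args) {
        SchemaCacheSketch cache = new SchemaCacheSketch();
        cache.schemaFor("dfs.tmp.t1");
        cache.schemaFor("dfs.tmp.t1");     // hit: no second scan
        System.out.println(cache.scans);
    }
}
```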



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6566) Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more nodes ran out of memory while executing the query. AGGR OOM at First Phase.

2018-08-20 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586555#comment-16586555
 ] 

Robert Hou commented on DRILL-6566:
---

The default batch size is 16 MB.  I ran the query with 8 MB and got the same 
errors.

> Jenkins Regression: TPCDS query 66 fails with RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.  AGGR OOM at First Phase.
> --
>
> Key: DRILL-6566
> URL: https://issues.apache.org/jira/browse/DRILL-6566
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: Timothy Farkas
>Priority: Critical
> Fix For: 1.15.0
>
>
> This is TPCDS Query 66.
> Query: tpcds/tpcds_sf1/original/parquet/query66.sql
> SELECT w_warehouse_name,
> w_warehouse_sq_ft,
> w_city,
> w_county,
> w_state,
> w_country,
> ship_carriers,
> year1,
> Sum(jan_sales) AS jan_sales,
> Sum(feb_sales) AS feb_sales,
> Sum(mar_sales) AS mar_sales,
> Sum(apr_sales) AS apr_sales,
> Sum(may_sales) AS may_sales,
> Sum(jun_sales) AS jun_sales,
> Sum(jul_sales) AS jul_sales,
> Sum(aug_sales) AS aug_sales,
> Sum(sep_sales) AS sep_sales,
> Sum(oct_sales) AS oct_sales,
> Sum(nov_sales) AS nov_sales,
> Sum(dec_sales) AS dec_sales,
> Sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot,
> Sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot,
> Sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot,
> Sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot,
> Sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot,
> Sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot,
> Sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot,
> Sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot,
> Sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot,
> Sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot,
> Sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot,
> Sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot,
> Sum(jan_net)   AS jan_net,
> Sum(feb_net)   AS feb_net,
> Sum(mar_net)   AS mar_net,
> Sum(apr_net)   AS apr_net,
> Sum(may_net)   AS may_net,
> Sum(jun_net)   AS jun_net,
> Sum(jul_net)   AS jul_net,
> Sum(aug_net)   AS aug_net,
> Sum(sep_net)   AS sep_net,
> Sum(oct_net)   AS oct_net,
> Sum(nov_net)   AS nov_net,
> Sum(dec_net)   AS dec_net
> FROM   (SELECT w_warehouse_name,
> w_warehouse_sq_ft,
> w_city,
> w_county,
> w_state,
> w_country,
> 'ZOUROS'
> || ','
> || 'ZHOU' AS ship_carriers,
> d_year AS year1,
> Sum(CASE
> WHEN d_moy = 1 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jan_sales,
> Sum(CASE
> WHEN d_moy = 2 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS feb_sales,
> Sum(CASE
> WHEN d_moy = 3 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS mar_sales,
> Sum(CASE
> WHEN d_moy = 4 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS apr_sales,
> Sum(CASE
> WHEN d_moy = 5 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS may_sales,
> Sum(CASE
> WHEN d_moy = 6 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jun_sales,
> Sum(CASE
> WHEN d_moy = 7 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS jul_sales,
> Sum(CASE
> WHEN d_moy = 8 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS aug_sales,
> Sum(CASE
> WHEN d_moy = 9 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS sep_sales,
> Sum(CASE
> WHEN d_moy = 10 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS oct_sales,
> Sum(CASE
> WHEN d_moy = 11 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS nov_sales,
> Sum(CASE
> WHEN d_moy = 12 THEN ws_ext_sales_price * ws_quantity
> ELSE 0
> END)  AS dec_sales,
> Sum(CASE
> WHEN d_moy = 1 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS jan_net,
> Sum(CASE
> WHEN d_moy = 2 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS feb_net,
> Sum(CASE
> WHEN d_moy = 3 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS mar_net,
> Sum(CASE
> WHEN d_moy = 4 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS apr_net,
> Sum(CASE
> WHEN d_moy = 5 THEN ws_net_paid_inc_ship * ws_quantity
> ELSE 0
> END)  AS 

[jira] [Commented] (DRILL-6685) Error in parquet record reader

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586554#comment-16586554
 ] 

ASF GitHub Bot commented on DRILL-6685:
---

Ben-Zvi closed pull request #1433: DRILL-6685: Fixed exception when reading 
Parquet data
URL: https://github.com/apache/drill/pull/1433
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenAbstractPageEntryReader.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenAbstractPageEntryReader.java
index fecf1ce3158..a708f52353f 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenAbstractPageEntryReader.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenAbstractPageEntryReader.java
@@ -97,4 +97,22 @@ protected final boolean load(boolean force) {
   protected final int remainingPageData() {
 return pageInfo.pageDataLen - pageInfo.pageDataOff;
   }
+
+  /**
+   * Fixed length readers calculate upfront the maximum number of entries to 
process as entry length
+   * are known.
+   * @param valuesToRead requested number of values to read
+   * @param entrySz sizeof(integer) + column's precision
+   * @return maximum entries to read within each call (based on the bulk 
entry, entry size, and requested
+   * number of entries to read)
+   */
+  protected final int getFixedLengthMaxRecordsToRead(int valuesToRead, int 
entrySz) {
+// Let's start with bulk's entry and requested values-to-read constraints
+int numEntriesToRead = Math.min(entry.getMaxEntries(), valuesToRead);
+
+// Now include the size of the fixed entry (since they are fixed)
+numEntriesToRead = Math.min(numEntriesToRead, buffer.limit() / entrySz);
+
+return numEntriesToRead;
+  }
 }
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenFixedEntryReader.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenFixedEntryReader.java
index a6e7077241a..e66bd051163 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenFixedEntryReader.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenFixedEntryReader.java
@@ -43,7 +43,7 @@ final VarLenColumnBulkEntry getEntry(int valuesToRead) {
 
 final int expectedDataLen = columnPrecInfo.precision;
 final int entrySz = 4 + columnPrecInfo.precision;
-final int readBatch = Math.min(entry.getMaxEntries(), valuesToRead);
+final int readBatch = getFixedLengthMaxRecordsToRead(valuesToRead, 
entrySz);
 Preconditions.checkState(readBatch > 0, "Read batch count [%d] should be 
greater than zero", readBatch);
 
 final int[] valueLengths = entry.getValuesLength();
diff --git 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenNullableFixedEntryReader.java
 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenNullableFixedEntryReader.java
index 3869113249b..caf5c73472b 100644
--- 
a/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenNullableFixedEntryReader.java
+++ 
b/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/VarLenNullableFixedEntryReader.java
@@ -45,7 +45,7 @@ final VarLenColumnBulkEntry getEntry(int valuesToRead) {
 
 final int expectedDataLen = columnPrecInfo.precision;
 final int entrySz = 4 + columnPrecInfo.precision;
-final int readBatch = Math.min(entry.getMaxEntries(), valuesToRead);
+final int readBatch = getFixedLengthMaxRecordsToRead(valuesToRead, 
entrySz);
 Preconditions.checkState(readBatch > 0, "Read batch count [%s] should be 
greater than zero", readBatch);
 
 final int[] valueLengths = entry.getValuesLength();
diff --git 
a/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetBulkReader.java
 
b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetBulkReader.java
new file mode 100644
index 000..315ff93be4c
--- /dev/null
+++ 
b/exec/java-exec/src/test/java/org/apache/drill/exec/store/parquet/TestParquetBulkReader.java
@@ -0,0 +1,149 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in 
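The fix above bounds the read batch by a third constraint: how many fixed-size entries actually fit in the loaded page buffer. A simplified, standalone sketch of that clamp (names simplified; not the actual Drill classes):

```java
// Sketch of the clamp introduced by the DRILL-6685 fix. The read batch is
// bounded by the bulk entry's capacity, the requested value count, and how
// many fixed-size entries actually fit in the loaded page buffer.
public class ReadBatchClamp {
    static int fixedLengthMaxRecordsToRead(int maxEntries, int valuesToRead,
                                           int bufferLimit, int entrySz) {
        int n = Math.min(maxEntries, valuesToRead);   // pre-fix constraints
        return Math.min(n, bufferLimit / entrySz);    // added buffer constraint
    }

    public static void main(String[] args) {
        // entrySz = 4-byte length prefix + 8 bytes of data; only 3 entries
        // fit in a 40-byte buffer, even though 100 values were requested.
        System.out.println(fixedLengthMaxRecordsToRead(1024, 100, 40, 12));
    }
}
```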

[jira] [Updated] (DRILL-6702) OperatingSystemMXBean class cast exception when loaded under IBM JVM

2018-08-20 Thread Rob Wu (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rob Wu updated DRILL-6702:
--
Description: 
Related to: https://issues.apache.org/jira/browse/DRILL-6289

 

https://github.com/apache/drill/blob/1.14.0/common/src/main/java/org/apache/drill/exec/metrics/CpuGaugeSet.java#L28

 

Exception in thread "main" java.lang.ExceptionInInitializerError

    at java.lang.J9VMInternals.ensureError(J9VMInternals.java:141)

    at 
java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:130)

    at 
org.apache.drill.exec.metrics.DrillMetrics.getRegistry(DrillMetrics.java:111)

    at 
org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:64)

    at 
org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:48)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:45)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:40)

    ...

Caused by: java.lang.ClassCastException: 
com.ibm.lang.management.ExtendedOperatingSystem incompatible with 
com.sun.management.OperatingSystemMXBean

    at org.apache.drill.exec.metrics.CpuGaugeSet.(CpuGaugeSet.java:40)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.registerSystemMetrics(DrillMetrics.java:63)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.(DrillMetrics.java:53)

  was:
https://github.com/apache/drill/blob/1.14.0/common/src/main/java/org/apache/drill/exec/metrics/CpuGaugeSet.java#L28

 

Exception in thread "main" java.lang.ExceptionInInitializerError

    at java.lang.J9VMInternals.ensureError(J9VMInternals.java:141)

    at 
java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:130)

    at 
org.apache.drill.exec.metrics.DrillMetrics.getRegistry(DrillMetrics.java:111)

    at 
org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:64)

    at 
org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:48)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:45)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:40)

    ...

Caused by: java.lang.ClassCastException: 
com.ibm.lang.management.ExtendedOperatingSystem incompatible with 
com.sun.management.OperatingSystemMXBean

    at org.apache.drill.exec.metrics.CpuGaugeSet.(CpuGaugeSet.java:40)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.registerSystemMetrics(DrillMetrics.java:63)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.(DrillMetrics.java:53)


> OperatingSystemMXBean class cast exception when loaded under IBM JVM
> 
>
> Key: DRILL-6702
> URL: https://issues.apache.org/jira/browse/DRILL-6702
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Rob Wu
>Assignee: Kunal Khatua
>Priority: Minor
>
> Related to: https://issues.apache.org/jira/browse/DRILL-6289
>  
> https://github.com/apache/drill/blob/1.14.0/common/src/main/java/org/apache/drill/exec/metrics/CpuGaugeSet.java#L28
>  
> Exception in thread "main" java.lang.ExceptionInInitializerError
>     at java.lang.J9VMInternals.ensureError(J9VMInternals.java:141)
>     at 
> java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:130)
>     at 
> org.apache.drill.exec.metrics.DrillMetrics.getRegistry(DrillMetrics.java:111)
>     at 
> org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:64)
>     at 
> org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:48)
>     at 
> org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:45)
>     at 
> org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:40)
>     ...
> Caused by: java.lang.ClassCastException: 
> 

[jira] [Created] (DRILL-6702) OperatingSystemMXBean class cast exception when loaded under IBM JVM

2018-08-20 Thread Rob Wu (JIRA)
Rob Wu created DRILL-6702:
-

 Summary: OperatingSystemMXBean class cast exception when loaded 
under IBM JVM
 Key: DRILL-6702
 URL: https://issues.apache.org/jira/browse/DRILL-6702
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.14.0
Reporter: Rob Wu
Assignee: Kunal Khatua


https://github.com/apache/drill/blob/1.14.0/common/src/main/java/org/apache/drill/exec/metrics/CpuGaugeSet.java#L28

 

Exception in thread "main" java.lang.ExceptionInInitializerError

    at java.lang.J9VMInternals.ensureError(J9VMInternals.java:141)

    at 
java.lang.J9VMInternals.recordInitializationFailure(J9VMInternals.java:130)

    at 
org.apache.drill.exec.metrics.DrillMetrics.getRegistry(DrillMetrics.java:111)

    at 
org.apache.drill.exec.memory.AllocationManager.(AllocationManager.java:64)

    at 
org.apache.drill.exec.memory.BaseAllocator.(BaseAllocator.java:48)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:45)

    at 
org.apache.drill.exec.memory.RootAllocatorFactory.newRoot(RootAllocatorFactory.java:40)

    ...

Caused by: java.lang.ClassCastException: 
com.ibm.lang.management.ExtendedOperatingSystem incompatible with 
com.sun.management.OperatingSystemMXBean

    at org.apache.drill.exec.metrics.CpuGaugeSet.(CpuGaugeSet.java:40)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.registerSystemMetrics(DrillMetrics.java:63)

    at 
org.apache.drill.exec.metrics.DrillMetrics$RegistryHolder.(DrillMetrics.java:53)
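The root cause is an unconditional cast of the platform `OperatingSystemMXBean` to `com.sun.management.OperatingSystemMXBean`, which IBM J9's implementation does not extend. One defensive pattern (a hedged illustration, not the actual Drill fix) is to test the type before casting:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

// Defensive alternative to the cast that throws ClassCastException on IBM
// J9: check the runtime type first and fall back to the standard interface.
public class SafeOsBean {
    static String describe(OperatingSystemMXBean bean) {
        if (bean instanceof com.sun.management.OperatingSystemMXBean) {
            return "extended"; // HotSpot path: extra CPU/memory gauges usable
        }
        return "standard";     // e.g. IBM J9, where the cast would throw
    }

    public static void main(String[] args) {
        System.out.println(describe(ManagementFactory.getOperatingSystemMXBean()));
    }
}
```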



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6552) Drill Metadata management "Drill MetaStore"

2018-08-20 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586511#comment-16586511
 ] 

Volodymyr Vysotskyi commented on DRILL-6552:


[~weijie], sure, we will share the main points of the discussion, and the 
presentation itself, on the dev mailing list.

> Drill Metadata management "Drill MetaStore"
> ---
>
> Key: DRILL-6552
> URL: https://issues.apache.org/jira/browse/DRILL-6552
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Metadata
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 2.0.0
>
>
> It would be useful for Drill to have some sort of metastore which would 
> enable Drill to remember previously defined schemata so Drill doesn’t have to 
> do the same work over and over again.
> It would allow storing schemas and statistics, which would speed up query 
> validation, planning, and execution. It would also increase the stability of 
> Drill and help avoid various kinds of issues, such as "schema change 
> exceptions" and "limit 0" optimization. 
> One of the main candidates is the Hive Metastore.
> Starting from version 3.0, the Hive Metastore can run as a service separate 
> from the Hive server:
> [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration]
> Optional enhancement is storing Drill's profiles, UDFs, plugins configs in 
> some kind of metastore as well.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6701) Non-admin users can access options page via restapi

2018-08-20 Thread Krystal (JIRA)
Krystal created DRILL-6701:
--

 Summary: Non-admin users can access options page via restapi
 Key: DRILL-6701
 URL: https://issues.apache.org/jira/browse/DRILL-6701
 Project: Apache Drill
  Issue Type: Bug
  Components: Security
Affects Versions: 1.14.0
Reporter: Krystal


Only admin users should be able to access Drill's options page.  However, via 
the REST API, a non-admin user can access the options page.
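The report implies the admin check runs on the web UI path but not on the REST path. A minimal sketch of the invariant — the same authorization predicate must guard every entry point (hypothetical names; not Drill's actual security code):

```java
import java.util.Set;

// Hypothetical sketch of an admin-only check that must be applied uniformly:
// the web UI and the REST API should share one authorization predicate.
public class OptionsAuth {
    static final Set<String> ADMINS = Set.of("admin");

    static boolean canAccessOptions(String user) {
        return ADMINS.contains(user);   // enforce regardless of entry point
    }

    public static void main(String[] args) {
        System.out.println(canAccessOptions("admin") + " " + canAccessOptions("krystal"));
    }
}
```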



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6640) Drill takes long time in planning when there are large number of files in views/tables DFS parent directory

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586396#comment-16586396
 ] 

ASF GitHub Bot commented on DRILL-6640:
---

ilooner commented on issue #1405: DRILL-6640: Modifying DotDrillUtil 
implementation to avoid using globStatus calls
URL: https://github.com/apache/drill/pull/1405#issuecomment-414435259
 
 
   Sorry for the delayed review @kr-arjun. Please use BaseDirTestWatcher for 
creating temp directories in tests. There is information in the javadoc for the 
class, an example in ExampleTest, and you can take a look at the other tests 
that use it for more examples.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Drill takes long time in planning when there are large number of files in  
> views/tables DFS parent directory
> 
>
> Key: DRILL-6640
> URL: https://issues.apache.org/jira/browse/DRILL-6640
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Query Planning  Optimization
>Reporter: Arjun
>Assignee: Arjun
>Priority: Major
> Fix For: 1.15.0
>
>
> When Drill is used to query views/tables, the query planning time increases 
> as the number of files in the views'/tables' parent directory grows. This 
> becomes unacceptably long with complex queries.
> This is caused by the globStatus operation, which uses a GLOB pattern to 
> retrieve the view file status. It can be improved by avoiding GLOB patterns 
> for Drill metadata files such as view files.
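The difference between the two lookup strategies can be sketched with `java.nio` (an assumed stand-in for the Hadoop `FileSystem` API that Drill actually calls; `.view.drill` is Drill's view-file suffix):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Glob expansion scans every sibling in the directory, while a direct
// status check touches a single known path. java.nio stands in for the
// Hadoop FileSystem API here.
public class ViewLookup {
    static boolean viewExistsByGlob(Path dir, String name) throws IOException {
        // O(number of files in dir): the pattern is matched against all entries
        try (DirectoryStream<Path> s =
                 Files.newDirectoryStream(dir, name + ".view.drill")) {
            return s.iterator().hasNext();
        }
    }

    static boolean viewExistsDirect(Path dir, String name) {
        // O(1): a single stat on the known metadata file name
        return Files.exists(dir.resolve(name + ".view.drill"));
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("views");
        Files.createFile(dir.resolve("myview.view.drill"));
        System.out.println(viewExistsByGlob(dir, "myview") + " "
            + viewExistsDirect(dir, "myview"));
    }
}
```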



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586345#comment-16586345
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211361059
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
   private List subDirs;
 
   protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   @vvysotskyi the arguments from both of you are reasonable, but I prefer not 
to make changes without an obvious benefit. If you insist, please describe the 
case and post it as a [DISCUSSION] on the Drill dev mailing list with a link to 
the current PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586296#comment-16586296
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vrozov commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211351021
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
   private List subDirs;
 
   protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   @vvysotskyi I already explained reasons to keep `Preconditions` in my other 
comments. Here is the summary in random order:
   - The change does not remove the dependency on the guava library.
   - `Preconditions` is not deprecated by the Guava team and provides a stable 
interface.
   - `Preconditions` functionality is much wider than that of the JDK classes.
   - Using `Preconditions` will be more consistent with other checks like 
`checkState` or `checkArgument` and also with other Apache projects.
   - It avoids unnecessary changes and keeps original contributions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6693) When a query is started from Drill Web Console, the UI becomes inaccessible until the query is completed

2018-08-20 Thread Vlad Rozov (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586273#comment-16586273
 ] 

Vlad Rozov commented on DRILL-6693:
---

[~angozhiy] Did you verify that changing 
{{drill.exec.http.jetty.server.selectors}} also changes the number of Jetty 
threads? Please attach a jstack of a drillbit with a non-default number of 
acceptors/selectors.

[~arina] DRILL-5994 configures Drill to accept 2 concurrent requests. If this 
is not the case, there may be a bug in the DRILL-5994 implementation that needs 
to be investigated. If by default Drill handles 2 **concurrent** requests, the 
behavior is by design: Drill should not be used as a web server. If it is 
necessary to serve more than 2 concurrent requests, DRILL-5994 introduces 
configuration parameters to manage how many concurrent requests (not sessions) 
a drillbit should be able to handle.

Note that the design of Jetty, or any other web server, implies that 
long-running SQL queries submitted over the web interface should also be served 
asynchronously (similar to how requests are handled on Netty threads). Jetty 
threads, acceptors, and selectors are limited resources (whether there are 2 or 
100).
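The resource-exhaustion behavior described here can be reproduced with a plain two-thread executor standing in for Jetty's request threads (an illustration only, not Drill code):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// With 2 workers, two long-running "queries" occupy both threads and a
// third request (the UI page) queues behind them until a worker frees up.
public class TwoWorkerDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        Callable<String> longQuery = () -> { Thread.sleep(200); return "query done"; };
        Callable<String> uiPage = () -> "profiles page";

        pool.submit(longQuery);
        pool.submit(longQuery);
        Future<String> ui = pool.submit(uiPage);   // queued behind the queries

        System.out.println(ui.isDone());   // typically false: UI request is stuck
        System.out.println(ui.get());      // served only after a worker frees up
        pool.shutdown();
    }
}
```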

> When a query is started from Drill Web Console, the UI becomes inaccessible 
> until the query is completed
> 
>
> Key: DRILL-6693
> URL: https://issues.apache.org/jira/browse/DRILL-6693
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Anton Gozhiy
>Priority: Major
>
> *Steps:*
>  # From Web UI, run the following query:
> {noformat}
> select * 
> from (
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role
> from cp.`employee.json`)
> where last_name = 'Blumberg'
> {noformat}
>  # While the query is running, try to open the Profiles page (or any other). If it 
> completes too fast, add some unions to the query above.
> *Expected result:*
>  Profiles page should be opened. The running query should be listed.
> *Actual result:*
>  The Web UI hangs until the query completes.
> *Notes:*
> - If you open another tab with the Web Console, it also gets stuck when the query is 
> running.
> - If the query is started from sqlline, everything is fine.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586271#comment-16586271
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vvysotskyi commented on a change in pull request #1397: DRILL-6633: Replace 
usage of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211344318
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
  private List<File> subDirs;
 
  protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List<File> subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   @vrozov, I didn't express my point of view correctly: I don't see a reason 
for using methods from third-party libraries if the JDK provides methods with 
exactly the same functionality. Such usages of third-party libraries should be 
avoided for the sake of consistency.
   
   Using JDK methods also helps to prevent problems when updating library 
versions, since the API of a third-party library may change, while the JDK 
guarantees backward compatibility.
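As a small illustration of the equivalence being discussed (the class and method names below are invented for the sketch), `Objects.requireNonNull` fails the same way Guava's `Preconditions.checkNotNull` does: it throws a `NullPointerException`, optionally with a message, and returns its argument otherwise.

```java
import java.util.Objects;

public class RequireNonNullDemo {
  static String greet(String name) {
    // JDK equivalent of Guava's Preconditions.checkNotNull(name, msg):
    // throws NullPointerException with the given message when name is null,
    // and returns the (non-null) argument otherwise.
    return "hello " + Objects.requireNonNull(name, "name must not be null");
  }

  public static void main(String[] args) {
    System.out.println(greet("drill")); // hello drill
    try {
      greet(null);
    } catch (NullPointerException e) {
      System.out.println(e.getMessage()); // name must not be null
    }
  }
}
```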


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586235#comment-16586235
 ] 

ASF GitHub Bot commented on DRILL-6385:
---

amansinha100 commented on issue #1334: DRILL-6385: Support JPPD feature
URL: https://github.com/apache/drill/pull/1334#issuecomment-414392376
 
 
   @sohami thanks for the detailed follow-up review and for running the 
regression tests with the PR (I assume with the default setting of run-time 
pushdown disabled).  @weijietong sorry I have not been responsive lately due to 
personal reasons. Thanks for addressing the review comments. 
   @sohami, it's a +1 from me as well, so when you get a chance please go ahead 
and merge the PR. 




> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.15.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> This feature is to support JPPD (Join Predicate Push Down). It will benefit 
> HashJoin and Broadcast HashJoin performance by reducing the number of rows 
> sent across the network and the memory consumed. This feature is already 
> supported by Impala, which calls it RuntimeFilter 
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
>  The first PR will try to push down a bloom filter of the HashJoin node to 
> Parquet's scan node. The proposed basic procedure is as follows:
>  # The HashJoin build side accumulates the equal-join-condition rows to 
> construct a bloom filter, then sends the bloom filter to the foreman node.
>  # The foreman node passively accepts the bloom filters from all the 
> fragments that have a HashJoin operator. It then aggregates them to form a 
> global bloom filter.
>  # The foreman node broadcasts the global bloom filter to all the probe-side 
> scan nodes, which may already have sent out partial data to the hash join 
> nodes (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node and 
> uses it to filter the remaining rows.
>  
> To implement the above execution flow, the main new notions are described 
> below:
>       1. RuntimeFilter
> A filter container which may hold a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> Wraps the logic to send the hash join's bloom filter to the foreman. The 
> serialized bloom filter is sent out through the data tunnel. This object is 
> instantiated by the FragmentExecutor and passed to the FragmentContext, so 
> the HashJoin operator can obtain it through the FragmentContext.
>      3. RuntimeFilterRequestHandler
> Responsible for accepting a SendRuntimeFilterRequest RPC and stripping the 
> actual BloomFilter from the network. It then hands the filter to the 
> WorkerBee's new registerRuntimeFilter interface.
> Another RPC type is BroadcastRuntimeFilterRequest. It registers the accepted 
> global bloom filter with the WorkerBee via the registerRuntimeFilter method 
> and then propagates it to the FragmentContext, through which the probe-side 
> scan node can fetch the aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman instantiates a RuntimeFilterManager, which indirectly obtains 
> every RuntimeFilter via the WorkerBee. Once all the BloomFilters have been 
> accepted and aggregated, it broadcasts the aggregated bloom filter to all 
> the probe-side scan nodes through the data tunnel via a 
> BroadcastRuntimeFilterRequest RPC.
>      5. RuntimeFilterEnableOption 
>  A global option will be added to decide whether to enable this new feature.
>  
> Suggestions and advice are welcome. The related PR will be presented as soon 
> as possible.
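The build/probe idea in the steps above can be illustrated with a deliberately simplified, hypothetical sketch. The class, constants, and two-hash scheme below are invented for illustration only; Drill's actual RuntimeFilter has its own bloom filter implementation and RPC plumbing.

```java
import java.util.BitSet;
import java.util.List;

// Simplified, hypothetical sketch of a join-predicate-push-down runtime
// filter: the hash join's build side inserts its join keys into a bloom
// filter, and the probe-side scan later uses it to skip rows that cannot
// possibly match.
public class RuntimeFilterSketch {
  static final int BITS = 1 << 16; // bloom filter size (illustrative only)

  // Build side: hash each join key twice and set the corresponding bits.
  static BitSet buildBloomFilter(List<String> buildKeys) {
    BitSet bloom = new BitSet(BITS);
    for (String key : buildKeys) {
      bloom.set(Math.floorMod(key.hashCode(), BITS));
      bloom.set(Math.floorMod(key.hashCode() * 31 + 7, BITS));
    }
    return bloom;
  }

  // Probe side: false means "definitely not on the build side" (safe to
  // skip); true may be a false positive, so the join still verifies matches.
  static boolean mightMatch(BitSet bloom, String probeKey) {
    return bloom.get(Math.floorMod(probeKey.hashCode(), BITS))
        && bloom.get(Math.floorMod(probeKey.hashCode() * 31 + 7, BITS));
  }

  public static void main(String[] args) {
    BitSet bloom = buildBloomFilter(List.of("Blumberg", "Smith"));
    System.out.println(mightMatch(bloom, "Blumberg")); // true
  }
}
```

The foreman-side aggregation of per-fragment filters in step 2 then corresponds to OR-ing the bit sets (`global.or(fragmentBloom)`), which preserves the no-false-negatives property.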





[jira] [Created] (DRILL-6700) Graceful shutdown from command line against parquet tables fails

2018-08-20 Thread Krystal (JIRA)
Krystal created DRILL-6700:
--

 Summary: Graceful shutdown from command line against parquet 
tables fails 
 Key: DRILL-6700
 URL: https://issues.apache.org/jira/browse/DRILL-6700
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow
Affects Versions: 1.14.0
Reporter: Krystal
 Attachments: drillbit.log

Triggering a graceful shutdown from the command line while a query against a 
parquet table is running fails with the following error:

"Error: SYSTEM ERROR: IOException: Filesystem closed

Fragment 5:11"

This does not occur if graceful shutdown is triggered from the webUI.

Attached is the drillbit.log.

 [^drillbit.log]

 





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586214#comment-16586214
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vrozov commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211334757
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
  private List<File> subDirs;
 
  protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List<File> subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   @vvysotskyi -1: individual preferences do not matter in Apache projects. In 
the new code that you contribute, you may use `Objects.requireNonNull()`, but 
there is no technical reason to change existing contributions. To change 
community preference, the issue must be discussed and voted on dev@drill.




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6422) Update guava to 23.0 and shade it

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586150#comment-16586150
 ] 

ASF GitHub Bot commented on DRILL-6422:
---

vvysotskyi commented on issue #1264:  DRILL-6422: Update guava to 23.0 and 
shade it
URL: https://github.com/apache/drill/pull/1264#issuecomment-414372108
 
 
   @vrozov, I don't mind creating a separate PR, but I still think it is 
unnecessary, since we need to merge the first PR, publish the shaded guava 
package to Maven manually, and then merge the second PR.
   
   Could you please explain how to manually publish the shaded guava package to 
the Maven repository?




> Update guava to 23.0 and shade it
> -
>
> Key: DRILL-6422
> URL: https://issues.apache.org/jira/browse/DRILL-6422
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Some hadoop libraries use old versions of guava and most of them are 
> incompatible with guava 23.0.
> To allow usage of new guava version, it should be shaded and shaded version 
> should be used in the project.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586135#comment-16586135
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vvysotskyi commented on a change in pull request #1397: DRILL-6633: Replace 
usage of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211305912
 
 

 ##
 File path: 
contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java
 ##
 @@ -162,7 +163,7 @@ public void setUp() throws UnknownHostException {
   @Test
   public void testMongoGroupScanAssignmentMix() throws UnknownHostException,
   ExecutionSetupException {
-final List<DrillbitEndpoint> endpoints = Lists.newArrayList();
+final List<DrillbitEndpoint> endpoints = new ArrayList<>();
 
 Review comment:
   Thanks.




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586134#comment-16586134
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vvysotskyi commented on a change in pull request #1397: DRILL-6633: Replace 
usage of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211279104
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
  private List<File> subDirs;
 
  protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List<File> subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   I prefer using JDK facilities to third-party libraries, so I will leave the 
code with `Objects.requireNonNull()` (no additional changes in the PR).




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6693) When a query is started from Drill Web Console, the UI becomes inaccessible until the query is completed

2018-08-20 Thread Alvin Chua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586132#comment-16586132
 ] 

Alvin Chua commented on DRILL-6693:
---

Any temporary workaround for this issue? The REST API server can't handle 
simultaneous requests because of it, and the system becomes very unresponsive.

> When a query is started from Drill Web Console, the UI becomes inaccessible 
> until the query is completed
> 
>
> Key: DRILL-6693
> URL: https://issues.apache.org/jira/browse/DRILL-6693
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Anton Gozhiy
>Priority: Major
>
> *Steps:*
>  # From Web UI, run the following query:
> {noformat}
> select * 
> from (
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role
> from cp.`employee.json`)
> where last_name = 'Blumberg'
> {noformat}
>  # While the query is running, try to open the Profiles page (or any other). 
> If it completes too fast, add more unions to the query above.
> *Expected result:*
>  Profiles page should be opened. The running query should be listed.
> *Actual result:*
>  The Web UI hangs until the query completes.
> *Notes:*
> - If you open another tab with the Web Console, it also gets stuck while the 
> query is running.
> - If the query is started from sqlline, everything is fine.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586133#comment-16586133
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vvysotskyi commented on a change in pull request #1397: DRILL-6633: Replace 
usage of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211305444
 
 

 ##
 File path: 
contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
 ##
 @@ -62,16 +62,16 @@ public void testEncode() throws Exception {
 
   @Test
   public void testXpath_Double() throws Exception {
-final String query = "select xpath_double ('<a><b>20</b><c>40</c></a>', 
'a/b * a/c') as col \n" +
+String query = "select xpath_double ('<a><b>20</b><c>40</c></a>', 'a/b * 
a/c') as col \n" +
 "from hive.kv \n" +
 
 Review comment:
   I agree with you that the purpose of the test is not directly connected with 
`limit 0`, but it helps to highlight that only the schema is required.
   Thanks for giving me the chance to make the final decision :) I'll leave the 
code as it is.




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6685) Error in parquet record reader

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16586130#comment-16586130
 ] 

ASF GitHub Bot commented on DRILL-6685:
---

sachouche commented on issue #1433: DRILL-6685: Fixed exception when reading 
Parquet data
URL: https://github.com/apache/drill/pull/1433#issuecomment-414368128
 
 
   Arina, 
   - First of all, thank you for the review!
   - With regard to the test, I see your point, as I had considered the same; 
my preference for the "try {} finally {}" pattern is simply to influence future 
tests, as I felt this pattern is less error-prone.
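A hypothetical sketch of the pattern referred to above (the option field and names are invented): the test alters shared state, and the `finally` block guarantees it is restored even when the body throws, so one failing test cannot leak state into the tests that follow.

```java
public class TryFinallyPatternDemo {
  static int batchSizeLimit = 1024; // stands in for a shared test/session option

  static String runWithLimit(int testLimit) {
    int saved = batchSizeLimit;
    batchSizeLimit = testLimit;   // alter the option for this "test" only
    try {
      return "ran with limit " + batchSizeLimit;
    } finally {
      batchSizeLimit = saved;     // restored even if the body had thrown
    }
  }

  public static void main(String[] args) {
    System.out.println(runWithLimit(16)); // ran with limit 16
    System.out.println(batchSizeLimit);   // 1024 (restored)
  }
}
```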




> Error in parquet record reader
> --
>
> Key: DRILL-6685
> URL: https://issues.apache.org/jira/browse/DRILL-6685
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.15.0
>
> Attachments: drillbit.log.6685
>
>
> This is the query:
> select VarbinaryValue1 from 
> dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet` limit 
> 36;
> It appears to be caused by this commit:
> DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader
> aee899c1b26ebb9a5781d280d5a73b42c273d4d5
> This is the stack trace:
> {noformat}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message: 
> Hadoop path: 
> /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet/0_0_0.parquet
> Total records read: 0
> Row group index: 0
> Records in row group: 1250
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
>   optional int64 Index;
>   optional binary VarbinaryValue1;
>   optional int64 BigIntValue;
>   optional boolean BooleanValue;
>   optional int32 DateValue (DATE);
>   optional float FloatValue;
>   optional binary VarcharValue1 (UTF8);
>   optional double DoubleValue;
>   optional int32 IntegerValue;
>   optional int32 TimeValue (TIME_MILLIS);
>   optional int64 TimestampValue (TIMESTAMP_MILLIS);
>   optional binary VarbinaryValue2;
>   optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);
>   optional binary VarcharValue2 (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks: 
> [BlockMetaData{1250, 23750308 [ColumnMetaData{UNCOMPRESSED [Index] optional 
> int64 Index  [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED 
> [VarbinaryValue1] optional binary VarbinaryValue1  [PLAIN, RLE, BIT_PACKED], 
> 10057}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue  
> [PLAIN, RLE, BIT_PACKED], 8174655}, ColumnMetaData{UNCOMPRESSED 
> [BooleanValue] optional boolean BooleanValue  [PLAIN, RLE, BIT_PACKED], 
> 8179722}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue 
> (DATE)  [PLAIN, RLE, BIT_PACKED], 8179916}, ColumnMetaData{UNCOMPRESSED 
> [FloatValue] optional float FloatValue  [PLAIN, RLE, BIT_PACKED], 8184959}, 
> ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1 
> (UTF8)  [PLAIN, RLE, BIT_PACKED], 8190002}, ColumnMetaData{UNCOMPRESSED 
> [DoubleValue] optional double DoubleValue  [PLAIN, RLE, BIT_PACKED], 
> 10230058}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32 
> IntegerValue  [PLAIN, RLE, BIT_PACKED], 10240111}, 
> ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue 
> (TIME_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10245154}, 
> ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue 
> (TIMESTAMP_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10250197}, 
> ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2 
>  [PLAIN, RLE, BIT_PACKED], 10260250}, ColumnMetaData{UNCOMPRESSED 
> [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19632385}, ColumnMetaData{UNCOMPRESSED 
> [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19647446}, ColumnMetaData{UNCOMPRESSED 
> [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19662507}, ColumnMetaData{UNCOMPRESSED 
> [VarcharValue2] optional binary VarcharValue2 (UTF8)  [PLAIN, RLE, 
> BIT_PACKED], 19677568}]}]}
> Fragment 0:0
> [Error Id: 

[jira] [Comment Edited] (DRILL-6677) Check style reports JavaDocs imports only statements as unused

2018-08-20 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585832#comment-16585832
 ] 

Vitalii Diravka edited comment on DRILL-6677 at 8/20/18 3:09 PM:
-

Looks like this is currently an issue with the Drill Maven checkstyle and it is 
not related to a specific IDE.
It can be reproduced by adding `LogicalFilter` to the import statements in 
DrillRelFactories and running {{mvn clean install -DskipTests}}


was (Author: vitalii):
Looks like currently this is the issue with Drill Maven check-style and it is 
related to specific IDE.
Can be reproduced with adding `LogicalFilter` to import statements in 
DrillRelFactories and {{mvn clean install -DskipTests}}

> Check style reports JavaDocs imports only statements as unused
> --
>
> Key: DRILL-6677
> URL: https://issues.apache.org/jira/browse/DRILL-6677
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Major
>
> Consider the following Java snippet:
> {code}
> import com.foo.Bar;
> /**
>   This is a reference to {@link com.foo.Bar}
> {code}
> Eclipse will notice the reference to {{com.foo.Bar}} in the Javadoc comment 
> and its automatic import fixer-upper will include the import.
> But, check style appears to ignore Javadoc imports. So, Check style reports 
> the import as unused.
> The only way, at present, to make Check style happy is to turn off the 
> Eclipse import fixer-upper and do everything manually.
> Request: modify check style to also check for class references in Javadoc 
> comments as such references are required for the Javadoc to build.
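For reference, recent Checkstyle versions expose a `processJavadoc` property on the `UnusedImports` check that makes it count Javadoc {@link} references as uses of an import; whether it is available depends on the Checkstyle version the build pins, so treat this fragment as a sketch rather than a drop-in fix:

```
<module name="UnusedImports">
  <!-- Count {@link com.foo.Bar} style Javadoc references as uses of an import. -->
  <property name="processJavadoc" value="true"/>
</module>
```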





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585887#comment-16585887
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211218552
 
 

 ##
 File path: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseStoragePluginConfig.java
 ##
 @@ -30,7 +31,6 @@
 import com.fasterxml.jackson.annotation.JsonTypeName;
 import com.google.common.annotations.VisibleForTesting;
 import com.google.common.collect.ImmutableMap;
 
 Review comment:
   ok




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585897#comment-16585897
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211243710
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/ischema/InfoSchemaFilter.java
 ##
 @@ -77,12 +77,9 @@ public FunctionExprNode(String function, List<ExprNode> 
args) {
 
 @Override
 public String toString() {
-  StringBuilder builder = new StringBuilder();
-  builder.append(function);
-  builder.append("(");
-  builder.append(Joiner.on(",").join(args));
-  builder.append(")");
-  return builder.toString();
+  return function + args.stream()
+  .map(ExprNode::toString)
+  .collect(Collectors.joining(",", "(", ")"));
 
 Review comment:
   Not a major issue. Let's leave it as is.
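As a small illustration of the JDK replacement shown in the diff above (the class name and sample arguments are invented), `Collectors.joining(delimiter, prefix, suffix)` subsumes the Guava `Joiner` plus the manual `"("` / `")"` appends:

```java
import java.util.List;
import java.util.stream.Collectors;

public class JoiningDemo {
  // Renders function(arg1,arg2,...) the way the rewritten toString() does.
  static String render(String function, List<String> args) {
    return function + args.stream()
        .collect(Collectors.joining(",", "(", ")"));
  }

  public static void main(String[] args) {
    System.out.println(render("LIKE", List.of("TABLE_NAME", "'emp%'")));
    // LIKE(TABLE_NAME,'emp%')
  }
}
```

Note that the empty-argument case still yields the bare parentheses, e.g. `f()`.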




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585892#comment-16585892
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211240663
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/xsort/ExternalSortBatch.java
 ##
 @@ -586,9 +587,9 @@ public BatchGroup mergeAndSpill(LinkedList 
batchGroups) throws Schem
 c1.setRecordCount(count);
 
 String spillDir = dirs.next();
-Path currSpillPath = new Path(Joiner.on("/").join(spillDir, fileName));
+Path currSpillPath = new Path(spillDir + "/" + fileName);
 currSpillDirs.add(currSpillPath);
-String outputFile = Joiner.on("/").join(currSpillPath, spillCount++);
+String outputFile = currSpillPath+ "/" + spillCount++;
 
 Review comment:
   `String.valueOf()` could be used, but the statement would be longer; OK, 
let's leave it as is. 




> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.





[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585895#comment-16585895
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211236890
 
 

 ##
 File path: 
contrib/storage-mongo/src/test/java/org/apache/drill/exec/store/mongo/TestMongoChunkAssignment.java
 ##
 @@ -162,7 +163,7 @@ public void setUp() throws UnknownHostException {
   @Test
   public void testMongoGroupScanAssignmentMix() throws UnknownHostException,
   ExecutionSetupException {
-final List<DrillbitEndpoint> endpoints = Lists.newArrayList();
+final List<DrillbitEndpoint> endpoints = new ArrayList<>();
 
 Review comment:
   I concentrated on your changes; you have already improved a lot of similar 
code.
   It's not critical, so it's up to you.




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585886#comment-16585886
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211214140
 
 

 ##
 File path: 
contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/MapRDBGroupScan.java
 ##
 @@ -42,9 +44,6 @@
 import com.fasterxml.jackson.annotation.JsonProperty;
 import com.google.common.base.Preconditions;
 import com.google.common.base.Stopwatch;
 
 Review comment:
   Agree




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585885#comment-16585885
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211213987
 
 

 ##
 File path: common/src/test/java/org/apache/drill/test/SubDirTestWatcher.java
 ##
 @@ -83,10 +84,10 @@
   private List subDirs;
 
   protected SubDirTestWatcher(File baseDir, boolean createAtBeginning, boolean 
deleteAtEnd, List subDirs) {
-this.baseDir = Preconditions.checkNotNull(baseDir);
+this.baseDir = Objects.requireNonNull(baseDir);
 this.createAtBeginning = createAtBeginning;
 this.deleteAtEnd = deleteAtEnd;
-this.subDirs = Preconditions.checkNotNull(subDirs);
+this.subDirs = Objects.requireNonNull(subDirs);
 
 Preconditions.checkArgument(!subDirs.isEmpty(), "The list of subDirs is 
empty.");
 
 Review comment:
   `Objects.requireNonNull` was introduced mostly for `Streams`, not for 
replacing Guava Preconditions.
   On the other hand, they have the same implementation, except that 
`Preconditions.checkNotNull()` has overloads with NPE error message 
formatting, and Guava recommends using Preconditions, see [Guava 
master](https://github.com/google/guava/blob/master/guava/src/com/google/common/base/Preconditions.java#L93).
   
   Looks like the Calcite project replaces these methods, see [Calcite 
master](https://github.com/apache/calcite/blob/3fa29455664bec0056c436491b369e0cd72242ea/src/main/config/forbidden-apis/signatures.txt#L58),
 but Apex doesn't replace them.
   
   Since we stay with Guava, it could be reasonable to follow their 
documentation.
   But it is not critical, so I will leave the decision to you.
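For reference, a minimal sketch of the JDK side of this trade-off; for the plain-message case `Preconditions.checkNotNull` behaves the same way, while its extra overloads add printf-style message templates (Guava itself is left out to keep the sketch self-contained):

```java
import java.util.Objects;

public class NullCheckExample {
    static String require(String path) {
        // Throws NullPointerException with the fixed message when path is null.
        return Objects.requireNonNull(path, "path is required");
    }

    public static void main(String[] args) {
        System.out.println(require("/drill"));
        try {
            require(null);
        } catch (NullPointerException e) {
            System.out.println(e.getMessage()); // prints: path is required
        }
    }
}
```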
   
   
   




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585884#comment-16585884
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211200356
 
 

 ##
 File path: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/test/Bug1735ResultSetCloseReleasesBuffersTest.java
 ##
 @@ -69,23 +69,19 @@ public static void tearDownClass() {
   public void test() throws Exception {
 withNoDefaultSchema()
 .withConnection(
-new Function() {
-  public Void apply( Connection connection ) {
-try {
-  Statement statement = connection.createStatement();
-  ResultSet resultSet = statement.executeQuery( "USE dfs.tmp" );
-  // TODO:  Purge nextUntilEnd(...) and calls when remaining 
fragment
-  // race conditions are fixed (not just DRILL-2245 fixes).
-  // resultSet.close( resultSet );
-  statement.close();
-  // connection.close() is in withConnection(...)
-  return null;
-} catch ( SQLException e ) {
-  throw new RuntimeException( e );
-}
+(Function) connection -> {
 
 Review comment:
   Right, there are a bunch of other tests in this package with a similar name.
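The diff above is the standard JDK 8 rewrite of a Guava/`java.util.function` anonymous `Function` class into a lambda; a minimal sketch of the two forms (the Integer types here are illustrative, not Drill's):

```java
import java.util.function.Function;

public class LambdaExample {
    // Pre-JDK 8 style: anonymous inner class implementing Function.
    static final Function<Integer, Integer> ANONYMOUS = new Function<Integer, Integer>() {
        @Override
        public Integer apply(Integer x) {
            return x + 1;
        }
    };

    // JDK 8 style, as in the diff above: a lambda with identical semantics.
    static final Function<Integer, Integer> LAMBDA = x -> x + 1;

    public static void main(String[] args) {
        System.out.println(ANONYMOUS.apply(41) + " " + LAMBDA.apply(41)); // prints: 42 42
    }
}
```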




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585890#comment-16585890
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211228814
 
 

 ##
 File path: 
contrib/storage-hive/core/src/test/java/org/apache/drill/exec/fn/hive/TestInbuiltHiveUDFs.java
 ##
 @@ -62,16 +62,16 @@ public void testEncode() throws Exception {
 
   @Test
   public void testXpath_Double() throws Exception {
-final String query = "select xpath_double ('<a><b>20</b><c>40</c></a>', 'a/b * a/c') as col \n" +
+String query = "select xpath_double ('<a><b>20</b><c>40</c></a>', 'a/b * a/c') as col \n" +
 "from hive.kv \n" +
 
 Review comment:
   The `limit 0` is used for getting the schema only; the purpose of the test is to 
verify the `xpath_double` UDF.
   It is not even SQL formatting style.
   But I will not argue, since it is minor and not your code. I will let you 
make the final decision.




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585893#comment-16585893
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211222476
 
 

 ##
 File path: 
contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/schema/HiveSchemaFactory.java
 ##
 @@ -46,7 +47,6 @@
 import org.apache.thrift.TException;
 
 import com.google.common.collect.ImmutableList;
 
 Review comment:
   Great. Please replace `ImmutableMap.of()` and `ImmutableSet.of()` too.
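On JDK 8 there is no direct counterpart to `ImmutableMap.of()`/`ImmutableSet.of()` (`Map.of`/`Set.of` only arrive in JDK 9), so a plain replacement wraps a mutable collection in an unmodifiable view; a minimal sketch (helper names are hypothetical):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ImmutableMapExample {
    // JDK 8 substitute for Guava's ImmutableMap.of("k", "v").
    static Map<String, String> singletonMap() {
        Map<String, String> m = new HashMap<>();
        m.put("k", "v");
        return Collections.unmodifiableMap(m);
    }

    public static void main(String[] args) {
        Map<String, String> m = singletonMap();
        System.out.println(m.get("k")); // prints: v
        try {
            m.put("x", "y"); // the unmodifiable view rejects mutation
        } catch (UnsupportedOperationException e) {
            System.out.println("immutable");
        }
    }
}
```

For one-entry cases, `Collections.singletonMap("k", "v")` is an even shorter JDK 8 option.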




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585891#comment-16585891
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211232521
 
 

 ##
 File path: 
contrib/storage-jdbc/src/main/java/org/apache/drill/exec/store/jdbc/JdbcStoragePlugin.java
 ##
 @@ -157,10 +155,10 @@ public JdbcStoragePlugin getPlugin() {
   }
 
   /**
-   * Returns whether a condition is supported by {@link JdbcJoin}.
+   * Returns whether a condition is supported by {@link JdbcRules.JdbcJoin}.
*
* Corresponds to the capabilities of
-   * {@link SqlImplementor#convertConditionToSqlNode}.
+   * {@link 
org.apache.calcite.rel.rel2sql.SqlImplementor#convertConditionToSqlNode}.
 
 Review comment:
   Looks like Drill's Maven check style currently prohibits it. I have updated 
the Jira with a comment.




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585888#comment-16585888
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211242647
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/schema/ObjectSchema.java
 ##
 @@ -71,19 +70,16 @@ public void resetMarkedFields() {
 
 @Override
 public Iterable removeUnreadFields() {
-final List removedFields = Lists.newArrayList();
-Iterables.removeIf(fields.values(), new Predicate() {
-@Override
-public boolean apply(Field field) {
-if (!field.isRead()) {
-removedFields.add(field);
-return true;
-} else if (field.hasSchema()) {
-Iterables.addAll(removedFields, 
field.getAssignedSchema().removeUnreadFields());
-}
-
-return false;
+List removedFields = new ArrayList<>();
+fields.values().removeIf(field -> {
+if (!field.isRead()) {
+removedFields.add(field);
+return true;
+} else if (field.hasSchema()) {
+Iterables.addAll(removedFields, 
field.getAssignedSchema().removeUnreadFields());
 
 Review comment:
   It makes sense.
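The rewrite above leans on JDK 8's `Collection.removeIf`, with the predicate doubling as a collector for the removed elements; a minimal standalone sketch of that "remove and collect in one pass" pattern (names are illustrative):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RemoveIfExample {
    // Removes odd values from the list and returns them, mirroring the
    // pattern in the diff above.
    static List<Integer> removeOdd(List<Integer> values) {
        List<Integer> removed = new ArrayList<>();
        values.removeIf(v -> {
            if (v % 2 != 0) {
                removed.add(v);
                return true;  // drop from the source collection
            }
            return false;     // keep in place
        });
        return removed;
    }

    public static void main(String[] args) {
        List<Integer> values = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(removeOdd(values)); // prints: [1, 3]
        System.out.println(values);            // prints: [2, 4]
    }
}
```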




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585894#comment-16585894
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211237733
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZookeeperClient.java
 ##
 @@ -192,24 +190,24 @@ public boolean hasPath(final String path, final boolean 
consistent, final DataCh
* @param consistent consistency check
* @param version version holder
*/
-  public byte[] get(final String path, final boolean consistent, final 
DataChangeVersion version) {
-Preconditions.checkNotNull(path, "path is required");
+  public byte[] get(String path, boolean consistent, DataChangeVersion 
version) {
+Objects.requireNonNull(path, "path is required");
 
-final String target = PathUtils.join(root, path);
+String target = PathUtils.join(root, path);
 if (consistent) {
   try {
 if (version != null) {
   Stat stat = new Stat();
-  final byte[] bytes = 
curator.getData().storingStatIn(stat).forPath(target);
+  byte[] bytes = curator.getData().storingStatIn(stat).forPath(target);
   version.setVersion(stat.getVersion());
   return bytes;
 }
 return curator.getData().forPath(target);
-  } catch (final Exception ex) {
+  } catch (Exception ex) {
 
 Review comment:
   `ex` is used rarely and is more widespread in other languages. 
   But ok, you can leave it.




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585889#comment-16585889
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211243271
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/options/TypeValidators.java
 ##
 @@ -242,7 +241,7 @@ public String getAdminUserGroups(OptionManager 
optionManager) {
   // if this option has not been changed by the user then return the
   // process user groups
   if (adminUserGroups.equals(DEFAULT_ADMIN_USER_GROUPS)) {
-adminUserGroups = 
Joiner.on(",").join(ImpersonationUtil.getProcessUserGroupNames());
+adminUserGroups = String.join(",", 
ImpersonationUtil.getProcessUserGroupNames());
 
 Review comment:
   It depends on where `adminUserGroups` is used. 
   Ok, let's leave it as is
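`String.join` covers Guava's `Joiner.on(",").join(...)` for the common case of non-null elements (Joiner additionally offers `skipNulls()`/`useForNull()`); a minimal sketch with an illustrative group list:

```java
import java.util.Arrays;
import java.util.List;

public class JoinExample {
    // JDK 8 replacement for Joiner.on(",").join(groups), as in the diff above.
    static String joinGroups(List<String> groups) {
        return String.join(",", groups);
    }

    public static void main(String[] args) {
        System.out.println(joinGroups(Arrays.asList("admins", "ops"))); // prints: admins,ops
    }
}
```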




[jira] [Commented] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585896#comment-16585896
 ] 

ASF GitHub Bot commented on DRILL-6633:
---

vdiravka commented on a change in pull request #1397: DRILL-6633: Replace usage 
of Guava classes by JDK ones
URL: https://github.com/apache/drill/pull/1397#discussion_r211202697
 
 

 ##
 File path: common/src/main/java/org/apache/drill/common/KerberosUtil.java
 ##
 @@ -57,9 +57,9 @@ public static String getPrincipalFromParts(final String 
primary, final String in
   public static String[] splitPrincipalIntoParts(final String principal) {
 final String[] components = principal.split("[/@]");
 checkState(components.length == 3);
 
 Review comment:
   Nothing wrong with `Guava Preconditions`, nor with `Apache Commons`.
   IMO the method names in `Apache Commons` are clearer.
   
   But I wrongly supposed that the purpose was to get rid of Guava as much as 
possible.
   
   We can keep Guava Preconditions, since the code is already written and its 
usage is more common.




[jira] [Commented] (DRILL-6693) When a query is started from Drill Web Console, the UI becomes inaccessible until the query is completed

2018-08-20 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585867#comment-16585867
 ] 

Arina Ielchiieva commented on DRILL-6693:
-

[~vrozov] do you have any idea DRILL-5994 might have caused such behavior?

> When a query is started from Drill Web Console, the UI becomes inaccessible 
> until the query is completed
> 
>
> Key: DRILL-6693
> URL: https://issues.apache.org/jira/browse/DRILL-6693
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Anton Gozhiy
>Priority: Major
>
> *Steps:*
>  # From Web UI, run the following query:
> {noformat}
> select * 
> from (
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role 
> from cp.`employee.json` 
> union
> select employee_id, full_name, first_name, last_name, position_id, 
> position_title, store_id, department_id, birth_date, hire_date, salary, 
> supervisor_id, education_level, marital_status, gender, management_role
> from cp.`employee.json`)
> where last_name = 'Blumberg'
> {noformat}
>  # While the query is running, try to open the Profiles page (or any other). If it 
> completes too fast, add some unions to the query above.
> *Expected result:*
>  Profiles page should be opened. The running query should be listed.
> *Actual result:*
>  The Web UI hangs until the query completes.
> *Notes:*
> - If you open another tab with the Web Console, it also gets stuck when the query is 
> running.
> - If the query is started from sqlline, everything is fine.





[jira] [Comment Edited] (DRILL-6693) When a query is started from Drill Web Console, the UI becomes inaccessible until the query is completed

2018-08-20 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585867#comment-16585867
 ] 

Arina Ielchiieva edited comment on DRILL-6693 at 8/20/18 12:43 PM:
---

[~vrozov] do you have any idea why DRILL-5994 might have caused such behavior?


was (Author: arina):
[~vrozov] do you have any idea DRILL-5994 might have caused such behavior?



[jira] [Commented] (DRILL-6693) When a query is started from Drill Web Console, the UI becomes inaccessible until the query is completed

2018-08-20 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585854#comment-16585854
 ] 

Anton Gozhiy commented on DRILL-6693:
-

I pinpointed the commit where the issue was introduced:
{noformat}
commit 49faae0452935e9ee1054c056df3e038391048ba
Author: Vlad Rozov 
Date:   Fri Mar 2 10:39:31 2018 -0800

DRILL-5994: Enable configuring number of Jetty acceptors and selectors 
(default to 1 acceptor and 2 selectors)

closes #1148
{noformat}
Two options were added: drill.exec.http.jetty.server.acceptors (default 1) and 
drill.exec.http.jetty.server.selectors (default 2).
I tried changing them, but it didn't help.



[jira] [Updated] (DRILL-6677) Check style reports JavaDocs imports only statements as unused

2018-08-20 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6677:
---
Summary: Check style reports JavaDocs imports only statements as unused  
(was: Check style unused import check conflicts with Eclipse)

> Check style reports JavaDocs imports only statements as unused
> --
>
> Key: DRILL-6677
> URL: https://issues.apache.org/jira/browse/DRILL-6677
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Major
>
> Consider the following Java snippet:
> {code}
> import com.foo.Bar;
> /**
>   This is a reference to {@link com.foo.Bar}
> {code}
> Eclipse will notice the reference to {{com.foo.Bar}} in the Javadoc comment 
> and its automatic import fixer-upper will include the import.
> But, check style appears to ignore Javadoc imports. So, Check style reports 
> the import as unused.
> The only way, at present, to make Check style happy is to turn off the 
> Eclipse import fixer-upper and do everything manually.
> Request: modify check style to also check for class references in Javadoc 
> comments as such references are required for the Javadoc to build.





[jira] [Commented] (DRILL-6677) Check style unused import check conflicts with Eclipse

2018-08-20 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585832#comment-16585832
 ] 

Vitalii Diravka commented on DRILL-6677:


Looks like this is currently an issue with the Drill Maven check style rather than 
with a specific IDE.
It can be reproduced by adding `LogicalFilter` to the import statements in 
DrillRelFactories and running {{mvn clean install -DskipTests}}

> Check style unused import check conflicts with Eclipse
> --
>
> Key: DRILL-6677
> URL: https://issues.apache.org/jira/browse/DRILL-6677
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Priority: Major
>


[jira] [Commented] (DRILL-6644) In Some Cases The HashJoin Memory Calculator Over Reserves Memory For The Probe Side During The Build Phase

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585773#comment-16585773
 ] 

ASF GitHub Bot commented on DRILL-6644:
---

arina-ielchiieva commented on issue #1409: DRILL-6644: Don't reserve space for 
incoming probe batches unnecessarily during the build phase.
URL: https://github.com/apache/drill/pull/1409#issuecomment-414274214
 
 
   @ilooner I'll remove ready-to-commit label, please put it back when test 
failures are resolved.




> In Some Cases The HashJoin Memory Calculator Over Reserves Memory For The 
> Probe Side During The Build Phase
> ---
>
> Key: DRILL-6644
> URL: https://issues.apache.org/jira/browse/DRILL-6644
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.15.0
>
>
> There are two cases where the HashJoin memory calculator over-reserves memory:
>  1. It reserves space for a maximum-size incoming probe batch during the build
> phase. This is unnecessary because probe data is not fetched until the probe
> phase; we only have to account for the data received during OK_NEW_SCHEMA.
>  2. https://issues.apache.org/jira/browse/DRILL-6646
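A hypothetical sketch of the accounting change described in item 1; the class and method names below are illustrative and do not correspond to Drill's actual HashJoin memory calculator classes.

```java
// Illustrative sketch only; not Drill's actual HashJoinMemoryCalculator code.
public class BuildPhaseReservationSketch {

  /**
   * Memory to set aside for the probe side while the build phase runs.
   *
   * @param maxProbeBatchSize       worst-case size of an incoming probe batch
   * @param firstProbeBatchSize     size of the single probe batch already received
   *                                with OK_NEW_SCHEMA (schema discovery)
   * @param reserveFutureBatch      old behavior: also reserve a worst-case batch
   */
  static long probeReservation(long maxProbeBatchSize,
                               long firstProbeBatchSize,
                               boolean reserveFutureBatch) {
    // Old behavior: reserve room for a worst-case incoming probe batch even
    // though no probe data is fetched until the probe phase begins.
    // Proposed behavior: only the batch received during OK_NEW_SCHEMA is held
    // during the build phase, so only that batch needs to be accounted for.
    return reserveFutureBatch
        ? firstProbeBatchSize + maxProbeBatchSize
        : firstProbeBatchSize;
  }

  public static void main(String[] args) {
    System.out.println(probeReservation(16_000_000L, 1_000_000L, true));   // 17000000
    System.out.println(probeReservation(16_000_000L, 1_000_000L, false));  // 1000000
  }
}
```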





[jira] [Updated] (DRILL-6644) In Some Cases The HashJoin Memory Calculator Over Reserves Memory For The Probe Side During The Build Phase

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6644:

Labels:   (was: ready-to-commit)






[jira] [Updated] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6385:

Labels: doc-impacting ready-to-commit  (was: ready-to-commit)

> Support JPPD (Join Predicate Push Down)
> ---
>
> Key: DRILL-6385
> URL: https://issues.apache.org/jira/browse/DRILL-6385
> Project: Apache Drill
>  Issue Type: New Feature
>  Components:  Server, Execution - Flow
>Affects Versions: 1.15.0
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> This feature is to support JPPD (Join Predicate Push Down). It will improve
> HashJoin and Broadcast HashJoin performance by reducing both the number of
> rows sent across the network and the memory consumed. This feature is
> already supported by Impala, which calls it runtime filtering
> ([https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_runtime_filtering.html]).
> The first PR will push down a bloom filter from the HashJoin node to
> Parquet's scan node. The proposed basic procedure is as follows:
>  # The HashJoin build side accumulates the equi-join condition rows to
> construct a bloom filter, then sends the bloom filter to the foreman node.
>  # The foreman node passively accepts the bloom filters from all the
> fragments that have the HashJoin operator, then aggregates them into a
> global bloom filter.
>  # The foreman node broadcasts the global bloom filter to all the probe-side
> scan nodes, which may already have sent partial data to the hash join nodes
> (currently the hash join node prefetches one batch from both sides).
>  # The scan node accepts the global bloom filter from the foreman node and
> uses it to filter the remaining rows.
>  
> To implement the above execution flow, the main new notions are described below:
>       1. RuntimeFilter
> A filter container which may contain a BloomFilter or a MinMaxFilter.
>       2. RuntimeFilterReporter
> It wraps the logic to send the hash join's bloom filter to the foreman. The
> serialized bloom filter is sent out through the data tunnel. This object is
> instantiated by the FragmentExecutor and passed to the FragmentContext, so
> the HashJoin operator can obtain it through the FragmentContext.
>       3. RuntimeFilterRequestHandler
> It is responsible for accepting a SendRuntimeFilterRequest RPC and stripping
> the actual BloomFilter from the network. It then passes the filter to the
> WorkerBee's new registerRuntimeFilter interface.
> Another RPC type is BroadcastRuntimeFilterRequest. It registers the accepted
> global bloom filter with the WorkerBee via the registerRuntimeFilter method
> and then propagates it to the FragmentContext, through which the probe-side
> scan node can fetch the aggregated bloom filter.
>       4. RuntimeFilterManager
> The foreman instantiates a RuntimeFilterManager. It indirectly obtains every
> RuntimeFilter via the WorkerBee. Once all the BloomFilters have been
> accepted and aggregated, it broadcasts the aggregated bloom filter to all
> the probe-side scan nodes through the data tunnel via a
> BroadcastRuntimeFilterRequest RPC.
>       5. RuntimeFilterEnableOption
> A global option will be added to decide whether to enable this new feature.
>  
> Suggestions and advice are welcome. The related PR will be presented as soon
> as possible.
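The steps above can be sketched with a hand-rolled bloom filter. This is a self-contained illustration, not Drill's actual RuntimeFilter/RuntimeFilterManager code: the build side populates a filter from its join keys, the "foreman" ORs partial filters into a global one, and the probe-side scan drops rows whose key cannot be on the build side (bloom filters have no false negatives, only false positives).

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Self-contained sketch of the JPPD flow; class and method names are illustrative.
public class JppdSketch {

  static final int BITS = 1 << 16;   // filter size in bits
  static final int HASHES = 3;       // hash functions per key

  // Step 1: the build side accumulates its join keys into a bloom filter.
  static BitSet buildFilter(List<String> buildKeys) {
    BitSet filter = new BitSet(BITS);
    for (String key : buildKeys) {
      for (int i = 0; i < HASHES; i++) {
        filter.set(Math.floorMod(key.hashCode() * 31 + i * 0x9E3779B9, BITS));
      }
    }
    return filter;
  }

  // Step 2: the foreman aggregates partial filters into a global one by OR-ing.
  static BitSet aggregate(List<BitSet> partials) {
    BitSet global = new BitSet(BITS);
    partials.forEach(global::or);
    return global;
  }

  // Step 4: the probe-side scan tests each row's key against the global filter.
  static boolean mightContain(BitSet filter, String key) {
    for (int i = 0; i < HASHES; i++) {
      if (!filter.get(Math.floorMod(key.hashCode() * 31 + i * 0x9E3779B9, BITS))) {
        return false;  // definitely absent from the build side: skip the row
      }
    }
    return true;       // possibly present: keep the row
  }

  public static void main(String[] args) {
    BitSet f1 = buildFilter(Arrays.asList("a", "b"));
    BitSet f2 = buildFilter(Arrays.asList("c"));
    BitSet global = aggregate(Arrays.asList(f1, f2));
    System.out.println(mightContain(global, "a"));    // true (no false negatives)
    System.out.println(mightContain(global, "zzz"));  // almost certainly false
  }
}
```

The payoff is that a row failing `mightContain` never needs to leave the scan node, which is exactly the network and memory saving the proposal describes.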





[jira] [Updated] (DRILL-6385) Support JPPD (Join Predicate Push Down)

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6385:

Fix Version/s: 1.15.0






[jira] [Updated] (DRILL-6685) Error in parquet record reader

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6685:

Labels: ready-to-commit  (was: pull-request-available ready-to-commit)

> Error in parquet record reader
> --
>
> Key: DRILL-6685
> URL: https://issues.apache.org/jira/browse/DRILL-6685
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Assignee: salim achouche
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.15.0
>
> Attachments: drillbit.log.6685
>
>
> This is the query:
> select VarbinaryValue1 from 
> dfs.`/drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet` limit 
> 36;
> It appears to be caused by this commit:
> DRILL-6570: Fixed IndexOutofBoundException in Parquet Reader
> aee899c1b26ebb9a5781d280d5a73b42c273d4d5
> This is the stack trace:
> {noformat}
> Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
> Message: 
> Hadoop path: 
> /drill/testdata/batch_memory/fourvarchar_asc_nulls_16MB.parquet/0_0_0.parquet
> Total records read: 0
> Row group index: 0
> Records in row group: 1250
> Parquet Metadata: ParquetMetaData{FileMetaData{schema: message root {
>   optional int64 Index;
>   optional binary VarbinaryValue1;
>   optional int64 BigIntValue;
>   optional boolean BooleanValue;
>   optional int32 DateValue (DATE);
>   optional float FloatValue;
>   optional binary VarcharValue1 (UTF8);
>   optional double DoubleValue;
>   optional int32 IntegerValue;
>   optional int32 TimeValue (TIME_MILLIS);
>   optional int64 TimestampValue (TIMESTAMP_MILLIS);
>   optional binary VarbinaryValue2;
>   optional fixed_len_byte_array(12) IntervalYearValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalDayValue (INTERVAL);
>   optional fixed_len_byte_array(12) IntervalSecondValue (INTERVAL);
>   optional binary VarcharValue2 (UTF8);
> }
> , metadata: {drill-writer.version=2, drill.version=1.14.0-SNAPSHOT}}, blocks: 
> [BlockMetaData{1250, 23750308 [ColumnMetaData{UNCOMPRESSED [Index] optional 
> int64 Index  [PLAIN, RLE, BIT_PACKED], 4}, ColumnMetaData{UNCOMPRESSED 
> [VarbinaryValue1] optional binary VarbinaryValue1  [PLAIN, RLE, BIT_PACKED], 
> 10057}, ColumnMetaData{UNCOMPRESSED [BigIntValue] optional int64 BigIntValue  
> [PLAIN, RLE, BIT_PACKED], 8174655}, ColumnMetaData{UNCOMPRESSED 
> [BooleanValue] optional boolean BooleanValue  [PLAIN, RLE, BIT_PACKED], 
> 8179722}, ColumnMetaData{UNCOMPRESSED [DateValue] optional int32 DateValue 
> (DATE)  [PLAIN, RLE, BIT_PACKED], 8179916}, ColumnMetaData{UNCOMPRESSED 
> [FloatValue] optional float FloatValue  [PLAIN, RLE, BIT_PACKED], 8184959}, 
> ColumnMetaData{UNCOMPRESSED [VarcharValue1] optional binary VarcharValue1 
> (UTF8)  [PLAIN, RLE, BIT_PACKED], 8190002}, ColumnMetaData{UNCOMPRESSED 
> [DoubleValue] optional double DoubleValue  [PLAIN, RLE, BIT_PACKED], 
> 10230058}, ColumnMetaData{UNCOMPRESSED [IntegerValue] optional int32 
> IntegerValue  [PLAIN, RLE, BIT_PACKED], 10240111}, 
> ColumnMetaData{UNCOMPRESSED [TimeValue] optional int32 TimeValue 
> (TIME_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10245154}, 
> ColumnMetaData{UNCOMPRESSED [TimestampValue] optional int64 TimestampValue 
> (TIMESTAMP_MILLIS)  [PLAIN, RLE, BIT_PACKED], 10250197}, 
> ColumnMetaData{UNCOMPRESSED [VarbinaryValue2] optional binary VarbinaryValue2 
>  [PLAIN, RLE, BIT_PACKED], 10260250}, ColumnMetaData{UNCOMPRESSED 
> [IntervalYearValue] optional fixed_len_byte_array(12) IntervalYearValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19632385}, ColumnMetaData{UNCOMPRESSED 
> [IntervalDayValue] optional fixed_len_byte_array(12) IntervalDayValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19647446}, ColumnMetaData{UNCOMPRESSED 
> [IntervalSecondValue] optional fixed_len_byte_array(12) IntervalSecondValue 
> (INTERVAL)  [PLAIN, RLE, BIT_PACKED], 19662507}, ColumnMetaData{UNCOMPRESSED 
> [VarcharValue2] optional binary VarcharValue2 (UTF8)  [PLAIN, RLE, 
> BIT_PACKED], 19677568}]}]}
> Fragment 0:0
> [Error Id: 25852cdb-3217-4041-9743-66e9f3a2fbe4 on qa-node186.qa.lab:31010] 
> (state=,code=0)
> {noformat}
> Table can be found in 10.10.100.186:/tmp/fourvarchar_asc_nulls_16MB.parquet
> sys.version is:
> 1.15.0-SNAPSHOT  a05f17d6fcd80f0d21260d3b1074ab895f457bac  Changed 
> PROJECT_OUTPUT_BATCH_SIZE to System + Session  30.07.2018 @ 17:12:53 PDT  
> r...@mapr.com  30.07.2018 @ 17:25:21 PDT
> fourvarchar_asc_nulls70.q





[jira] [Commented] (DRILL-6685) Error in parquet record reader

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585766#comment-16585766
 ] 

ASF GitHub Bot commented on DRILL-6685:
---

arina-ielchiieva commented on issue #1433: DRILL-6685: Fixed exception when 
reading Parquet data
URL: https://github.com/apache/drill/pull/1433#issuecomment-414271589
 
 
   @sachouche I don't have a strong opinion on that, so the PR can go in as is. 
Though, since all your tests use the same options, a shared before/after setup 
could save you from logic duplication.





[jira] [Commented] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585765#comment-16585765
 ] 

ASF GitHub Bot commented on DRILL-5735:
---

arina-ielchiieva commented on issue #1279: DRILL-5735: Allow search/sort in the 
Options webUI
URL: https://github.com/apache/drill/pull/1279#issuecomment-414270793
 
 
   @kkhatua please squash the commits.




> UI options grouping and filtering & Metrics hints
> -
>
> Key: DRILL-5735
> URL: https://issues.apache.org/jira/browse/DRILL-5735
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Affects Versions: 1.9.0, 1.10.0, 1.11.0
>Reporter: Muhammad Gelbana
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.15.0
>
>
> I'm thinking of some UI improvements that could make all the difference for 
> users trying to optimize low-performing queries.
> h2. Options
> h3. Grouping
> We can group the options by their scope of effect; this will help users
> easily locate the options they may need to tune.
> h3. Filtering
> Since there are many options, we can add a filtering mechanism (e.g. string
> search or group/scope filtering) so users can filter out the options they
> are not interested in. To provide more benefit than the grouping idea above,
> filtering could match keywords as well as the option name, since users may
> not know the name of the option they are looking for.
> h2. Metrics
> I'm referring here to the metrics page and the query execution plan page
> that displays the overview section and major/minor fragment metrics. We can
> show hints for each metric, such as:
> # What it represents, in more detail.
> # Which option (or scope of options) to tune (increase? decrease?) to
> improve the performance reported by this metric.
> # Maybe even provide a small dialog to quickly modify the option(s) related
> to that metric.
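The filtering idea above can be sketched as follows. The option model below is illustrative, not Drill's actual OptionManager API: the query is matched against both the option name and its description text, so a user can find an option without knowing its exact name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative sketch of keyword-based option filtering; not Drill's real option classes.
public class OptionFilterSketch {

  static class Option {
    final String name;
    final String description;
    Option(String name, String description) {
      this.name = name;
      this.description = description;
    }
  }

  // Keep an option when the query appears in its name OR its description,
  // case-insensitively, so keyword searches work without exact names.
  static List<Option> filter(List<Option> options, String query) {
    String q = query.toLowerCase(Locale.ROOT);
    List<Option> matches = new ArrayList<>();
    for (Option o : options) {
      if (o.name.toLowerCase(Locale.ROOT).contains(q)
          || o.description.toLowerCase(Locale.ROOT).contains(q)) {
        matches.add(o);
      }
    }
    return matches;
  }

  public static void main(String[] args) {
    List<Option> opts = Arrays.asList(
        new Option("planner.memory.max_query_memory_per_node",
            "Maximum memory a query may use on each node"),
        new Option("exec.errors.verbose",
            "Include stack traces in error messages"));
    // "memory" matches only the first option (by name and by description keyword).
    System.out.println(filter(opts, "memory").size());  // 1
  }
}
```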





[jira] [Updated] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5735:

Reviewer: Arina Ielchiieva  (was: John Omernik)






[jira] [Updated] (DRILL-5735) UI options grouping and filtering & Metrics hints

2018-08-20 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5735:

Labels: doc-impacting ready-to-commit  (was: )






[jira] [Commented] (DRILL-6179) Added pcapng-format support

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585744#comment-16585744
 ] 

ASF GitHub Bot commented on DRILL-6179:
---

arina-ielchiieva commented on a change in pull request #1126: DRILL-6179: Added 
pcapng-format support
URL: https://github.com/apache/drill/pull/1126#discussion_r211207014
 
 

 ##
 File path: protocol/src/main/protobuf/UserBitShared.proto
 ##
 @@ -343,6 +343,7 @@ enum CoreOperatorType {
   IMAGE_SUB_SCAN = 52;
   SEQUENCE_SUB_SCAN = 53;
   PARTITION_LIMIT = 54;
+  PCAPNG_SUB_SCAN = 55;
 
 Review comment:
   You also need to generate C++ code.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Added pcapng-format support
> ---
>
> Key: DRILL-6179
> URL: https://issues.apache.org/jira/browse/DRILL-6179
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.13.0
>Reporter: Vlad
>Assignee: Vlad
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.15.0
>
>
> The _PCAP Next Generation Dump File Format_ (or pcapng for short) [1] is an 
> attempt to overcome the limitations of the currently widely used (but 
> limited) libpcap format.
> At a first level, it is desirable to query and filter by source and 
> destination IP and port, source/destination MAC address, or by protocol. Beyond 
> that, however, it would be very useful to be able to group packets by TCP 
> session and eventually to look at packet contents.
> Initial work is available at  
> https://github.com/mapr-demos/drill/tree/pcapng_dev
> [1] https://pcapng.github.io/pcapng/
>  





[jira] [Commented] (DRILL-6179) Added pcapng-format support

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585738#comment-16585738
 ] 

ASF GitHub Bot commented on DRILL-6179:
---

arina-ielchiieva commented on a change in pull request #1126: DRILL-6179: Added 
pcapng-format support
URL: https://github.com/apache/drill/pull/1126#discussion_r211205567
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/pcapng/PcapngFormatPlugin.java
 ##
 @@ -0,0 +1,76 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.pcapng;
+
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.proto.UserBitShared;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.exec.store.RecordWriter;
+import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
+import org.apache.drill.exec.store.dfs.easy.EasyWriter;
+import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.hadoop.conf.Configuration;
+
+import java.util.List;
+
+public class PcapngFormatPlugin extends EasyFormatPlugin<PcapngFormatConfig> {
+
+  public static final String DEFAULT_NAME = "pcapng";
+
+  public PcapngFormatPlugin(String name, DrillbitContext context, Configuration fsConf,
+                            StoragePluginConfig storagePluginConfig) {
+    this(name, context, fsConf, storagePluginConfig, new PcapngFormatConfig());
+  }
+
+  public PcapngFormatPlugin(String name, DrillbitContext context, Configuration fsConf,
+                            StoragePluginConfig config, PcapngFormatConfig formatPluginConfig) {
+    super(name, context, fsConf, config, formatPluginConfig, true,
+        false, true, false,
+        formatPluginConfig.getExtensions(), DEFAULT_NAME);
+  }
+
+  @Override
+  public boolean supportsPushDown() {
+    return true;
+  }
+
+  @Override
+  public RecordReader getRecordReader(FragmentContext context, DrillFileSystem dfs,
+                                      FileWork fileWork, List<SchemaPath> columns,
+                                      String userName) {
+    return new PcapngRecordReader(fileWork.getPath(), dfs, columns);
+  }
+
+  @Override
+  public RecordWriter getRecordWriter(FragmentContext context, EasyWriter writer) {
+    throw new UnsupportedOperationException("unimplemented");
+  }
+
+  @Override
+  public int getReaderOperatorType() {
+    return UserBitShared.CoreOperatorType.PCAPNG_SUB_SCAN_VALUE;
+  }
+
+  @Override
+  public int getWriterOperatorType() {
+    return 0;
 
 Review comment:
   UnsupportedOperationException?









[jira] [Commented] (DRILL-6179) Added pcapng-format support

2018-08-20 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585727#comment-16585727
 ] 

ASF GitHub Bot commented on DRILL-6179:
---

Vlad-Storona commented on issue #1126: DRILL-6179: Added pcapng-format support
URL: https://github.com/apache/drill/pull/1126#issuecomment-414262364
 
 
   @paul-rogers thanks for the great review.









[jira] [Updated] (DRILL-6633) Replace usage of Guava classes by JDK ones

2018-08-20 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6633:
---
Issue Type: Improvement  (was: Bug)

> Replace usage of Guava classes by JDK ones
> --
>
> Key: DRILL-6633
> URL: https://issues.apache.org/jira/browse/DRILL-6633
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.14.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.15.0
>
>
> Currently, Drill uses classes from Guava which can be replaced after moving 
> to JDK 8.
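As an illustration of the kind of replacement intended, the snippets below pair common Guava idioms (shown in comments) with their JDK 8 equivalents. These are generic examples, not lines taken from the Drill codebase:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.stream.Collectors;

// Illustrative examples (not lifted from the Drill codebase) of Guava
// idioms and their JDK 8 equivalents.
public class GuavaToJdk {

  // Guava: Preconditions.checkNotNull(name, "name is required")
  // JDK 8: java.util.Objects.requireNonNull
  public static String requireName(String name) {
    return Objects.requireNonNull(name, "name is required");
  }

  // Guava: Lists.newArrayList() and Lists.transform(list, function)
  // JDK 8: the diamond operator and the Stream API
  public static List<String> upperCaseAll(List<String> names) {
    List<String> result = new ArrayList<>();
    result.addAll(names.stream().map(String::toUpperCase).collect(Collectors.toList()));
    return result;
  }

  // Guava: Optional.fromNullable(value).or("default")
  // JDK 8: java.util.Optional.ofNullable(value).orElse("default")
  public static String orDefault(String value) {
    return Optional.ofNullable(value).orElse("default");
  }
}
```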





[jira] [Created] (DRILL-6699) Drill client session authorization

2018-08-20 Thread Oleksandr Kalinin (JIRA)
Oleksandr Kalinin created DRILL-6699:


 Summary: Drill client session authorization
 Key: DRILL-6699
 URL: https://issues.apache.org/jira/browse/DRILL-6699
 Project: Apache Drill
  Issue Type: New Feature
Reporter: Oleksandr Kalinin


Currently Drill relies on pluggable security mechanisms to perform user 
authentication. Any positively authenticated user is permitted to establish a 
session and execute queries on the cluster. If impersonation is enabled, queries 
are executed on behalf of the authenticated user. Authorization is performed at 
the data (file system) level.

While this model secures access to data, it doesn't secure cluster resources in 
some use cases, such as running multiple Drill clusters within a single YARN 
cluster. Since YARN resources in multi-tenant environments are themselves subject 
to authorization, not every positively authenticated user is actually authorized 
to use the YARN resources that run the Drill cluster.

A secondary issue is that it can be challenging to enable impersonation with 
the non-admin / low-privilege accounts typically used to run applications on 
YARN (and hence Drill-on-YARN clusters too).

The above issues could be addressed by introducing session authorization in 
Drill. A cluster admin could configure simple ACLs defining the users and/or 
groups permitted to connect to and use the cluster. After authentication, and 
before the client session is finalized, an authorization step could check the 
authenticated user against those ACLs.

While the proposed feature is primarily aimed at the Drill-on-YARN use case, it 
could also be useful for access control on standalone clusters. Otherwise, admins 
must push authorization handling into pluggable security mechanisms, which is 
much more complex to implement than a simple ACL config, and sometimes even 
unfeasible.
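The proposed check could be sketched as below. Class and config names are hypothetical, chosen only to illustrate the user/group ACL semantics described above; this is not an existing Drill API:

```java
import java.util.Set;

// Hypothetical sketch of the proposed session-authorization step: names and
// config shape are illustrative, not an existing or proposed Drill API.
public class SessionAcl {
  private final Set<String> allowedUsers;
  private final Set<String> allowedGroups;

  public SessionAcl(Set<String> allowedUsers, Set<String> allowedGroups) {
    this.allowedUsers = allowedUsers;
    this.allowedGroups = allowedGroups;
  }

  /** Would run after authentication, before the client session is finalized. */
  public boolean isAuthorized(String user, Set<String> userGroups) {
    if (allowedUsers.contains(user)) {
      return true;
    }
    for (String group : userGroups) {
      if (allowedGroups.contains(group)) {
        return true;
      }
    }
    return false;
  }
}
```

A rejected user would get a clear "not authorized to use this cluster" error at connect time rather than consuming YARN resources.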





[jira] [Commented] (DRILL-6552) Drill Metadata management "Drill MetaStore"

2018-08-20 Thread weijie.tong (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16585538#comment-16585538
 ] 

weijie.tong commented on DRILL-6552:


[~vvysotskyi] could you share your hangout discussion about your presentation 
on the dev mailing list?

> Drill Metadata management "Drill MetaStore"
> ---
>
> Key: DRILL-6552
> URL: https://issues.apache.org/jira/browse/DRILL-6552
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Metadata
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 2.0.0
>
>
> It would be useful for Drill to have some sort of metastore which would 
> enable Drill to remember previously defined schemata so Drill doesn’t have to 
> do the same work over and over again.
> It would allow storing schemas and statistics, which would accelerate query 
> validation, planning, and execution. It would also increase Drill's stability 
> and help avoid several kinds of issues: "schema change" exceptions, "limit 0" 
> optimization, and so on. 
> One of the main candidates is Hive Metastore.
> Starting from 3.0 version Hive Metastore can be the separate service from 
> Hive server:
> [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration]
> Optional enhancement is storing Drill's profiles, UDFs, plugins configs in 
> some kind of metastore as well.
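The idea can be sketched as a simple lookup keyed by schema and table, holding remembered schema plus planning statistics. All names and fields here are hypothetical, for illustration only, and say nothing about how a Hive Metastore-backed implementation would actually look:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch, not a proposed Drill API: a metastore that remembers a
// table's schema and statistics so planning can skip re-discovering them.
public class SimpleMetastore {

  public static class TableMetadata {
    public final String schemaJson; // remembered schema (serialized form)
    public final long rowCount;     // statistic used for planning estimates

    public TableMetadata(String schemaJson, long rowCount) {
      this.schemaJson = schemaJson;
      this.rowCount = rowCount;
    }
  }

  private final Map<String, TableMetadata> tables = new HashMap<>();

  private static String key(String schema, String table) {
    return schema + "." + table;
  }

  public void putTable(String schema, String table, TableMetadata metadata) {
    tables.put(key(schema, table), metadata);
  }

  public Optional<TableMetadata> getTable(String schema, String table) {
    return Optional.ofNullable(tables.get(key(schema, table)));
  }
}
```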


