[jira] [Comment Edited] (DRILL-5247) Text form of EXPLAIN statement does not have same information as profile

2017-02-08 Thread Chunhui Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858883#comment-15858883
 ] 

Chunhui Shi edited comment on DRILL-5247 at 2/9/17 2:19 AM:


"explain plan including all attributes for " will get the cost details 
returned. Is that not enough? if you need cost information about abandoned 
plans, usually I will go to calcite logs.
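For reference, the full statement looks like this (an illustrative sketch; the
table and column names are assumed, not taken from the original report):

{code}
EXPLAIN PLAN INCLUDING ALL ATTRIBUTES FOR
SELECT id_i, name_s20 FROM t WHERE id_i = 10;
{code}

The INCLUDING ALL ATTRIBUTES clause adds the planner's cost estimates (row
count, CPU, I/O, network) to each operator in the printed plan.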



was (Author: cshi):
"explain plan including all attributes for " will get the cost details 
returned. Is that not enough, if you need cost information about abandoned 
plans, usually I will go to calcite logs.


> Text form of EXPLAIN statement does not have same information as profile
> 
>
> Key: DRILL-5247
> URL: https://issues.apache.org/jira/browse/DRILL-5247
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Create a simple query. Run it and view the "physical plan" in the Web UI or 
> the profile JSON. That plan contains a rich set of information about operator 
> costs and so on.
> Now, with the same query, execute an EXPLAIN statement. The resulting plan 
> looks like the one in the profile, but lacks the cost detail. (See below.)
> Since the cost detail comes from the planner, and is essential to 
> understanding why a plan was chosen, the information should appear in the 
> EXPLAIN output. (After all, the output is supposed to EXPLAIN the plan...)
> Example of EXPLAIN output:
> {code}
> 00-00Screen
> 00-01  Project(id_i=[$0], name_s20=[$1])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, 10)])
> 00-04Scan(groupscan=[MockGroupScanPOP [url=null, 
> readEntries=[MockScanEntry [records=1, columns=[MockColumn 
> [minorType=INT, name=id_i, mode=REQUIRED], MockColumn [minorType=VARCHAR, 
> name=name_s20, mode=REQUIRED]])
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-5247) Text form of EXPLAIN statement does not have same information as profile

2017-02-08 Thread Chunhui Shi (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858883#comment-15858883
 ] 

Chunhui Shi commented on DRILL-5247:


"explain plan including all attributes for " will get the cost details 
returned. Is that not enough, if you need cost information about abandoned 
plans, usually I will go to calcite logs.


> Text form of EXPLAIN statement does not have same information as profile
> 
>
> Key: DRILL-5247
> URL: https://issues.apache.org/jira/browse/DRILL-5247
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>Priority: Minor
>
> Create a simple query. Run it and view the "physical plan" in the Web UI or 
> the profile JSON. That plan contains a rich set of information about operator 
> costs and so on.
> Now, with the same query, execute an EXPLAIN statement. The resulting plan 
> looks like the one in the profile, but lacks the cost detail. (See below.)
> Since the cost detail comes from the planner, and is essential to 
> understanding why a plan was chosen, the information should appear in the 
> EXPLAIN output. (After all, the output is supposed to EXPLAIN the plan...)
> Example of EXPLAIN output:
> {code}
> 00-00Screen
> 00-01  Project(id_i=[$0], name_s20=[$1])
> 00-02SelectionVectorRemover
> 00-03  Filter(condition=[=($0, 10)])
> 00-04Scan(groupscan=[MockGroupScanPOP [url=null, 
> readEntries=[MockScanEntry [records=1, columns=[MockColumn 
> [minorType=INT, name=id_i, mode=REQUIRED], MockColumn [minorType=VARCHAR, 
> name=name_s20, mode=REQUIRED]])
> {code}





[jira] [Created] (DRILL-5247) Text form of EXPLAIN statement does not have same information as profile

2017-02-08 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5247:
--

 Summary: Text form of EXPLAIN statement does not have same 
information as profile
 Key: DRILL-5247
 URL: https://issues.apache.org/jira/browse/DRILL-5247
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Paul Rogers
Priority: Minor


Create a simple query. Run it and view the "physical plan" in the Web UI or 
the profile JSON. That plan contains a rich set of information about operator 
costs and so on.

Now, with the same query, execute an EXPLAIN statement. The resulting plan 
looks like the one in the profile, but lacks the cost detail. (See below.)

Since the cost detail comes from the planner, and is essential to understanding 
why a plan was chosen, the information should appear in the EXPLAIN output. 
(After all, the output is supposed to EXPLAIN the plan...)

Example of EXPLAIN output:

{code}
00-00Screen
00-01  Project(id_i=[$0], name_s20=[$1])
00-02SelectionVectorRemover
00-03  Filter(condition=[=($0, 10)])
00-04Scan(groupscan=[MockGroupScanPOP [url=null, 
readEntries=[MockScanEntry [records=1, columns=[MockColumn [minorType=INT, 
name=id_i, mode=REQUIRED], MockColumn [minorType=VARCHAR, name=name_s20, 
mode=REQUIRED]])
{code}






[jira] [Updated] (DRILL-5246) NULL not supported in VALUES clause

2017-02-08 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers updated DRILL-5246:
---
Description: 
The following is valid in DRILL:

{code}
SELECT * FROM (VALUES ( TRUE, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE))
{code}

But, the following is not:
{code}
SELECT * FROM (VALUES ( TRUE, NULL, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE), (NULL))
{code}

Internally, a bare "null" may be an issue since Drill does not have a null 
type. So the first example might be a problem (a column whose only value is 
NULL, with no way to infer the null's type). But the second example should 
be fine: we know that the type is boolean.

While Drill's error reporting in the web UI is not clear, it seems that I'm 
getting a syntax error when using Null:

{code}
org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR:
From line 1, column 43 to line 1, column 46: Illegal use of 'NULL' SQL Query 
null 
{code}
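A possible workaround (an untested sketch, assuming standard Calcite casting 
behavior) is to give the NULL an explicit type with CAST:

{code}
SELECT * FROM (VALUES ( TRUE ), (FALSE), (CAST(NULL AS BOOLEAN)))
{code}

With the cast, the validator no longer has to infer a type for a bare NULL 
literal.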


  was:
The following is valid in DRILL:

{code}
SELECT * FROM (VALUES ( TRUE, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE))
{code}

But, the following is not:
{code}
SELECT * FROM (VALUES ( TRUE, NULL, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE), (NULL))
{code}

Internally, a bare "null" may be an issue since Drill does not have a null 
type. So, the first example might be a problem (a column for which the only 
value is Null, but we don't know a null of which type.) But, the second example 
should be fine: we know that the type is boolean.

While Drill's error reporting in the web UI is not clear, it seems that I'm 
getting a syntax error when using Null.



> NULL not supported in VALUES clause
> ---
>
> Key: DRILL-5246
> URL: https://issues.apache.org/jira/browse/DRILL-5246
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.9.0
>Reporter: Paul Rogers
>
> The following is valid in DRILL:
> {code}
> SELECT * FROM (VALUES ( TRUE, FALSE ))
> SELECT * FROM (VALUES ( TRUE ), (FALSE))
> {code}
> But, the following is not:
> {code}
> SELECT * FROM (VALUES ( TRUE, NULL, FALSE ))
> SELECT * FROM (VALUES ( TRUE ), (FALSE), (NULL))
> {code}
> Internally, a bare "null" may be an issue since Drill does not have a null 
> type. So the first example might be a problem (a column whose only value is 
> NULL, with no way to infer the null's type). But the second example should 
> be fine: we know that the type is boolean.
> While Drill's error reporting in the web UI is not clear, it seems that I'm 
> getting a syntax error when using Null:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR:
> From line 1, column 43 to line 1, column 46: Illegal use of 'NULL' SQL Query 
> null 
> {code}





[jira] [Created] (DRILL-5246) NULL not supported in VALUES clause

2017-02-08 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5246:
--

 Summary: NULL not supported in VALUES clause
 Key: DRILL-5246
 URL: https://issues.apache.org/jira/browse/DRILL-5246
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Paul Rogers


The following is valid in DRILL:

{code}
SELECT * FROM (VALUES ( TRUE, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE))
{code}

But, the following is not:
{code}
SELECT * FROM (VALUES ( TRUE, NULL, FALSE ))
SELECT * FROM (VALUES ( TRUE ), (FALSE), (NULL))
{code}

Internally, a bare "null" may be an issue since Drill does not have a null 
type. So the first example might be a problem (a column whose only value is 
NULL, with no way to infer the null's type). But the second example should 
be fine: we know that the type is boolean.

While Drill's error reporting in the web UI is not clear, it seems that I'm 
getting a syntax error when using Null.






[jira] [Closed] (DRILL-1159) query a particular row within a csv file caused IllegalStateException

2017-02-08 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-1159.
-
Resolution: Cannot Reproduce

This is a very old JIRA and I have lost the original data, so I am closing it.

> query a particular row within a csv file caused IllegalStateException
> -
>
> Key: DRILL-1159
> URL: https://issues.apache.org/jira/browse/DRILL-1159
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Chun Chang
>Assignee: Chun Chang
> Fix For: Future
>
> Attachments: jira1159
>
>
> #Mon Jul 14 10:10:52 PDT 2014
> git.commit.id.abbrev=699851b
> I have some data in CSV format, and the following query caused an 
> IllegalStateException:
> 0: jdbc:drill:schema=dfs> select * from dfs.`bugsb.csv` where columns[0]=887;
> Error: exception while executing query (state=,code=0)
> The data is sensitive so it is not shown here, but I will paste the stack trace.
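Worth noting when reproducing: Drill reads CSV `columns` array values as 
VARCHAR, so a numeric comparison like the one above generally needs an 
explicit cast (an illustrative sketch, not from the original report):

{code}
SELECT * FROM dfs.`bugsb.csv` WHERE CAST(columns[0] AS INT) = 887;
{code}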





[jira] [Updated] (DRILL-4842) SELECT * on JSON data results in NumberFormatException

2017-02-08 Thread Kunal Khatua (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Khatua updated DRILL-4842:

Reviewer: Chun Chang  (was: Chunhui Shi)

[~cch...@maprtech.com] Please commit the tests to the test framework when 
ready. Are there any other tests that can potentially be affected by this? 

> SELECT * on JSON data results in NumberFormatException
> --
>
> Key: DRILL-4842
> URL: https://issues.apache.org/jira/browse/DRILL-4842
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Serhii Harnyk
>  Labels: ready-to-commit
> Attachments: tooManyNulls.json
>
>
> Note that SELECT c1 returns correct results; the failure is seen only when 
> we do SELECT star. json.all_text_mode was set to true.
> The JSON file tooManyNulls.json has 4096 records in which key c1 is null; in 
> the 4097th record, c1 has the value "Hello World".
> git commit ID : aaf220ff
> MapR Drill 1.8.0 RPM
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `store.json.all_text_mode`=true;
> +-------+------------------------------------+
> |  ok   |              summary               |
> +-------+------------------------------------+
> | true  | store.json.all_text_mode updated.  |
> +-------+------------------------------------+
> 1 row selected (0.27 seconds)
> 0: jdbc:drill:schema=dfs.tmp> SELECT c1 FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> +--------------+
> |      c1      |
> +--------------+
> | Hello World  |
> +--------------+
> 1 row selected (0.243 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select * FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> Error: SYSTEM ERROR: NumberFormatException: Hello World
> Fragment 0:0
> [Error Id: 9cafb3f9-3d5c-478a-b55c-900602b8765e on centos-01.qa.lab:31010]
>  (java.lang.NumberFormatException) Hello World
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI():95
> 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt():120
> org.apache.drill.exec.test.generated.FiltererGen1169.doSetup():45
> org.apache.drill.exec.test.generated.FiltererGen1169.setup():54
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():195
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():107
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> Caused by: java.lang.NumberFormatException: Hello World
> at 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:95)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.ap

[jira] [Closed] (DRILL-1200) mondrian2580.q - cause schema change exception

2017-02-08 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-1200.
-
Resolution: Fixed

Not seen in recent test runs.

> mondrian2580.q - cause schema change exception
> --
>
> Key: DRILL-1200
> URL: https://issues.apache.org/jira/browse/DRILL-1200
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Reporter: Chun Chang
>Priority: Minor
> Fix For: Future
>
>
> #Mon Jul 21 10:24:21 PDT 2014
> git.commit.id.abbrev=e5c2da0
> The following Mondrian query (query2580.q) caused a schema change exception:
> 0: jdbc:drill:schema=dfs> select time_by_day.the_year as c0, 
> sum(sales_fact_1997.unit_sales) as m0, sum(sales_fact_1997.store_cost) as m1, 
> sum(sales_fact_1997.store_sales) as m2, count(sales_fact_1997.product_id) as 
> m3, count(distinct sales_fact_1997.customer_id) as m4, sum((case when 
> sales_fact_1997.promotion_id = 0 then 0 else sales_fact_1997.store_sales 
> end)) as m5 from time_by_day as time_by_day, sales_fact_1997 as 
> sales_fact_1997 where sales_fact_1997.time_id = time_by_day.time_id and 
> time_by_day.the_year = 1997 group by time_by_day.the_year;
> Query failed: org.apache.drill.exec.rpc.RpcException: Remote failure while 
> running query.[error_id: "b9fa1177-505a-4ae1-9d44-52b1d40f9a92"
> endpoint {
>   address: "qa-node120.qa.lab"
>   user_port: 31010
>   control_port: 31011
>   data_port: 31012
> }
> error_type: 0
> message: "Failure while running fragment. < IllegalStateException:[ You tried 
> to do a batch data read operation when you were in a state of STOP.  You can 
> only do this type of operation when you are in a state of OK or 
> OK_NEW_SCHEMA. ]"
> ]
> Error: exception while executing query (state=,code=0)
> physical plan:
> 14:31:16.198 [19aa0bca-d59e-4061-962d-b3fabb0f24b7:foreman] DEBUG 
> o.a.d.e.p.s.h.DefaultSqlHandler - Drill Physical :
> 00-00Screen: rowcount = 1.0, cumulative cost = {874658.1 rows, 3052726.1 
> cpu, 0.0 io, 0.0 network}, id = 3109
> 00-01  Project(c0=[$0], m0=[$1], m1=[$2], m2=[$3], m3=[$4], m4=[$5], 
> m5=[$6]): rowcount = 1.0, cumulative cost = {874658.0 rows, 3052726.0 cpu, 
> 0.0 io, 0.0 network}, id = 3108
> 00-02Project(c0=[$0], m0=[CASE(=($2, 0), null, $1)], m1=[CASE(=($4, 
> 0), null, $3)], m2=[CASE(=($6, 0), null, $5)], m3=[$7], m4=[$11], 
> m5=[CASE(=($9, 0), null, $8)]): rowcount = 1.0, cumulative cost = {874657.0 
> rows, 3052698.0 cpu, 0.0 io, 0.0 network}, id = 3107
> 00-03  HashJoin(condition=[IS NOT DISTINCT FROM($0, $10)], 
> joinType=[inner]): rowcount = 1.0, cumulative cost = {874656.0 rows, 
> 3052670.0 cpu, 0.0 io, 0.0 network}, id = 3106
> 00-05HashAgg(group=[{0}], agg#0=[$SUM0($1)], agg#1=[COUNT($1)], 
> agg#2=[$SUM0($2)], agg#3=[COUNT($2)], agg#4=[$SUM0($3)], agg#5=[COUNT($3)], 
> m3=[COUNT($4)], agg#7=[$SUM0($6)], agg#8=[COUNT($6)]): rowcount = 1.0, 
> cumulative cost = {437326.0 rows, 1743481.5 cpu, 0.0 io, 0.0 network}, id = 
> 3093
> 00-07  Project(c0=[$0], unit_sales=[$3], store_cost=[$4], 
> store_sales=[$5], product_id=[$6], customer_id=[$7], 
> $f6=[CASE(=(CAST($8):INTEGER, 0), CAST(0):ANY, $5)]): rowcount = 1.0, 
> cumulative cost = {437325.0 rows, 1743365.5 cpu, 0.0 io, 0.0 network}, id = 
> 3092
> 00-09HashJoin(condition=[=($2, $1)], joinType=[inner]): 
> rowcount = 1.0, cumulative cost = {437324.0 rows, 1743337.5 cpu, 0.0 io, 0.0 
> network}, id = 3091
> 00-12  SelectionVectorRemover: rowcount = 109.5, cumulative 
> cost = {3029.5 rows, 5227.5 cpu, 0.0 io, 0.0 network}, id = 3087
> 00-15Filter(condition=[=(CAST($0):INTEGER, 1997)]): 
> rowcount = 109.5, cumulative cost = {2920.0 rows, 5118.0 cpu, 0.0 io, 0.0 
> network}, id = 3086
> 00-19  Project(the_year=[$1], time_id=[$0]): rowcount = 
> 730.0, cumulative cost = {2190.0 rows, 2198.0 cpu, 0.0 io, 0.0 network}, id = 
> 3085
> 00-23ProducerConsumer: rowcount = 730.0, cumulative 
> cost = {1460.0 rows, 2190.0 cpu, 0.0 io, 0.0 network}, id = 3084
> 00-26  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath [path=maprfs:/user/root/mondrian/time_by_day]], 
> selectionRoot=/user/root/mondrian/time_by_day, columns=[SchemaPath 
> [`the_year`], SchemaPath [`time_id`): rowcount = 730.0, cumulative cost = 
> {730.0 rows, 1460.0 cpu, 0.0 io, 0.0 network}, id = 3058
> 00-11  Project(time_id0=[$0], unit_sales=[$1], 
> store_cost=[$2], store_sales=[$3], product_id=[$4], customer_id=[$5], 
> promotion_id=[$6]): rowcount = 86837.0, cumulative cost = {347348.0 rows, 
> 694752.0 cpu, 0.0 io, 0.0 network}, id = 3090
> 00-14Project(time_id=[$4], unit_sales=[$1], 
> store_cost=[$5], store_sa

[jira] [Closed] (DRILL-1801) Need to support referencing a column from a SELECT * subquery

2017-02-08 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-1801.
-
Resolution: Fixed

This query passes in recent runs. 

> Need to support referencing a column from a SELECT * subquery
> -
>
> Key: DRILL-1801
> URL: https://issues.apache.org/jira/browse/DRILL-1801
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 0.7.0
>Reporter: Chun Chang
> Fix For: Future
>
>
> #Tue Dec 02 14:38:34 EST 2014
> git.commit.id.abbrev=757e9a2
> Mondrian query5843.q used to work but failed with the following stack:
> 2014-12-02 16:55:39,696 [2b81a073-d825-bd6d-85c8-022726952867:frag:0:0] WARN  
> o.a.d.e.e.ExpressionTreeMaterializer - Unable to find value vector of path 
> `T9¦¦*`, returning null instance.
> 2014-12-02 16:55:39,771 [2b81a073-d825-bd6d-85c8-022726952867:frag:0:0] WARN  
> o.a.d.e.w.fragment.FragmentExecutor - Error while initializing or executing 
> fragment
> java.lang.RuntimeException: Only COUNT aggregate function supported for 
> Boolean type
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.setupInterior(HashAggTemplate.java:72)
>  ~[na:na]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.setup(HashAggTemplate.java:150)
>  ~[na:na]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen91048$BatchHolder.access$600(HashAggTemplate.java:117)
>  ~[na:na]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen91048.addBatchHolder(HashAggTemplate.java:445)
>  ~[na:na]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen91048.setup(HashAggTemplate.java:260)
>  ~[na:na]
>   at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregatorInternal(HashAggBatch.java:263)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.createAggregator(HashAggBatch.java:189)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.buildSchema(HashAggBatch.java:97)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:130)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:99)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:89)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:132)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:142)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.next(IteratorValidatorBatchIterator.java:118)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:67) 
> ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   a

[jira] [Closed] (DRILL-1802) Change Fragment and Foreman state and cancellation to use full synchronous acknowledgement

2017-02-08 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-1802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-1802.
-
Resolution: Fixed

No longer seen in testing; likely fixed by other changes.

> Change Fragment and Foreman state and cancellation to use full synchronous 
> acknowledgement
> --
>
> Key: DRILL-1802
> URL: https://issues.apache.org/jira/browse/DRILL-1802
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 0.7.0
>Reporter: Chun Chang
>Assignee: Chun Chang
>Priority: Critical
> Fix For: Future
>
>
> #Tue Dec 02 14:38:34 EST 2014
> git.commit.id.abbrev=757e9a2
> While running Mondrian queries, I saw many instances of the following 
> exception in drillbit.log; the queries themselves were successful:
> 2014-12-02 14:30:42,940 [2b81c26d-4109-6df8-018b-c32616ca359c:frag:0:0] INFO  
> o.a.drill.exec.work.foreman.Foreman - Dropping request to move to COMPLETED 
> state as query is already at FAILED state (which is terminal).
> 2014-12-02 14:35:23,392 [2b81c153-a391-0045-d9eb-8801b7d08159:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - Dropping request to move to FAILED 
> state as query is already at COMPLETED state (which is terminal).
> org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception 
> during fragment initialization.
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:194) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:254)
>  [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper. 
> Failure while accessing Zookeeper
>   at 
> org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:111)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.QueryStatus.updateQueryStateInStore(QueryStatus.java:132)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.Foreman.recordNewState(Foreman.java:502) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:396) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.foreman.Foreman.runPhysicalPlan(Foreman.java:311) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:510) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:185) 
> [drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 4 common frames omitted
> Caused by: java.lang.RuntimeException: Failure while accessing Zookeeper
>   at 
> org.apache.drill.exec.store.sys.zk.ZkEStore.createNodeInZK(ZkEStore.java:53) 
> ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.store.sys.zk.ZkAbstractStore.put(ZkAbstractStore.java:106)
>  ~[drill-java-exec-0.7.0-SNAPSHOT-rebuffed.jar:0.7.0-SNAPSHOT]
>   ... 10 common frames omitted
> Caused by: org.apache.zookeeper.KeeperException$NodeExistsException: 
> KeeperErrorCode = NodeExists for 
> /drill/running/2b81c153-a391-0045-d9eb-8801b7d08159
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:119) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) 
> ~[zookeeper-3.4.5-mapr-1406.jar:3.4.5-mapr-1406--1]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676)
>  ~[curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660)
>  ~[curator-framework-2.5.0.jar:na]
>   at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) 
> ~[curator-client-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656)
>  ~[curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441)
>  ~[curator-framework-2.5.0.jar:na]
>   at 
> org.apache.curator.framework.imps.Cre

[jira] [Closed] (DRILL-2290) Very slow performance for a query involving nested map

2017-02-08 Thread Chun Chang (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-2290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chang closed DRILL-2290.
-
Resolution: Resolved

Tested with 1.8.0 and the performance hit is gone.

{noformat}
0: jdbc:drill:schema=dfs.md1314> select b.id, a.ooa[1].fl.f1, b.oooi, 
a.ooof.oa.oab.oabc from dfs.`/drill/testdata/complex/json/complex.json` a inner 
join dfs.`/drill/testdata/complex/json/complex.json` b on 
a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
+-----+----------+-----------------------------+----------+
| id  |  EXPR$1  |            oooi             |  EXPR$3  |
+-----+----------+-----------------------------+----------+
| 1   | 1.6789   | {"oa":{"oab":{}}}           | null     |
| 3   | 3.6789   | {"oa":{"oab":{}}}           | 3.5678   |
| 4   | 4.6789   | {"oa":{"oab":{}}}           | 4.5678   |
| 5   | 5.6789   | {"oa":{"oab":{}}}           | 5.5678   |
| 7   | 7.6789   | {"oa":{"oab":{}}}           | null     |
| 9   | 9.6789   | {"oa":{"oab":{}}}           | null     |
| 11  | 11.6789  | {"oa":{"oab":{}}}           | 11.5678  |
| 12  | 12.6789  | {"oa":{"oab":{}}}           | null     |
| 13  | 13.6789  | {"oa":{"oab":{}}}           | null     |
| 17  | 17.6789  | {"oa":{"oab":{}}}           | 17.5678  |
| 18  | 18.6789  | {"oa":{"oab":{}}}           | null     |
| 20  | 20.6789  | {"oa":{"oab":{}}}           | null     |
| 21  | 21.6789  | {"oa":{"oab":{"oabc":21}}}  | null     |
| 22  | 22.6789  | {"oa":{"oab":{"oabc":22}}}  | 22.5678  |
| 23  | 23.6789  | {"oa":{"oab":{}}}           | 23.5678  |
| 27  | 27.6789  | {"oa":{"oab":{}}}           | null     |
| 30  | 30.6789  | {"oa":{"oab":{}}}           | 30.5678  |
| 32  | 32.6789  | {"oa":{"oab":{"oabc":32}}}  | null     |
| 34  | 34.6789  | {"oa":{"oab":{"oabc":34}}}  | 34.5678  |
| 36  | 36.6789  | {"oa":{"oab":{}}}           | 36.5678  |
+-----+----------+-----------------------------+----------+
20 rows selected (142.316 seconds)
{noformat}

> Very slow performance for a query involving nested map
> --
>
> Key: DRILL-2290
> URL: https://issues.apache.org/jira/browse/DRILL-2290
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 0.8.0
>Reporter: Chun Chang
> Fix For: Future
>
>
> #Thu Feb 19 18:40:10 EST 2015
> git.commit.id.abbrev=1ceddff
> This query took 17 minutes to complete, which is far too long. I think this 
> regression appeared after the fix dealing with nested maps.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select b.id, a.ooa[1].fl.f1, 
> b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner join `complex.json` b 
> on a.ooa[1].fl.f1=b.ooa[1].fl.f1 order by b.id limit 20;
> +-----+----------+-----------------------------+----------+
> | id  |  EXPR$1  |            oooi             |  EXPR$3  |
> +-----+----------+-----------------------------+----------+
> | 1   | 1.6789   | {"oa":{"oab":{"oabc":1}}}   | 1.5678   |
> | 3   | 3.6789   | {"oa":{"oab":{"oabc":3}}}   | 3.5678   |
> | 4   | 4.6789   | {"oa":{"oab":{"oabc":4}}}   | 4.5678   |
> | 5   | 5.6789   | {"oa":{"oab":{"oabc":5}}}   | 5.5678   |
> | 7   | 7.6789   | {"oa":{"oab":{"oabc":7}}}   | 7.5678   |
> | 9   | 9.6789   | {"oa":{"oab":{"oabc":9}}}   | 9.5678   |
> | 10  | 10.6789  | {"oa":{"oab":{"oabc":10}}}  | 10.5678  |
> | 11  | 11.6789  | {"oa":{"oab":{"oabc":11}}}  | 11.5678  |
> | 13  | 13.6789  | {"oa":{"oab":{"oabc":13}}}  | 13.5678  |
> | 14  | 14.6789  | {"oa":{"oab":{"oabc":14}}}  | 14.5678  |
> | 15  | 15.6789  | {"oa":{"oab":{"oabc":15}}}  | 15.5678  |
> | 16  | 16.6789  | {"oa":{"oab":{"oabc":16}}}  | 16.5678  |
> | 17  | 17.6789  | {"oa":{"oab":{"oabc":17}}}  | 17.5678  |
> | 18  | 18.6789  | {"oa":{"oab":{"oabc":18}}}  | 18.5678  |
> | 19  | 19.6789  | {"oa":{"oab":{"oabc":19}}}  | 19.5678  |
> | 20  | 20.6789  | {"oa":{"oab":{"oabc":20}}}  | 20.5678  |
> | 21  | 21.6789  | {"oa":{"oab":{"oabc":21}}}  | 21.5678  |
> | 22  | 22.6789  | {"oa":{"oab":{"oabc":22}}}  | 22.5678  |
> | 24  | 24.6789  | {"oa":{"oab":{"oabc":24}}}  | 24.5678  |
> | 25  | 25.6789  | {"oa":{"oab":{"oabc":25}}}  | 25.5678  |
> +-----+----------+-----------------------------+----------+
> 20 rows selected (1020.036 seconds)
> {code}
> The query deals with just under 1 million records, so it should not be 
> that slow.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexJ> select count(*) from (select 
> b.id, a.ooa[1].fl.f1, b.oooi, a.ooof.oa.oab.oabc from `complex.json` a inner 
> join `complex.json` b on a.ooa[1].fl.f1=b.ooa[1].fl.f1) c;
> +---------+
> | EXPR$0  |
> +---------+
> | 900190  |
> +---------+
> 1 row selected (700.516
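One way to think about the reported slowdown (this is an illustrative sketch only, not Drill's implementation): a nested join key such as `ooa[1].fl.f1` can be projected once per row into a scalar before the join, instead of re-evaluating the path for every key comparison. The row shapes below are invented to mimic `complex.json`.

```python
def nested_key(row):
    # evaluate the nested path ooa[1].fl.f1 for one row
    return row["ooa"][1]["fl"]["f1"]

# toy rows shaped loosely like the complex.json records in the report
rows = [{"id": i, "ooa": [None, {"fl": {"f1": float(i)}}]} for i in range(5)]

# pre-project the join key once per row, then join/compare on the scalar
keyed = {nested_key(r): r["id"] for r in rows}
print(keyed[3.0])  # 3
```

The point of the sketch is only that per-comparison path evaluation multiplies the nested-access cost by the number of comparisons, while a one-time projection pays it once per row.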

[jira] [Commented] (DRILL-2954) OOM: CTAS from JSON to Parquet on a single wide row JSON file

2017-02-08 Thread Chun Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858692#comment-15858692
 ] 

Chun Chang commented on DRILL-2954:
---

just select * ran out of memory:

{noformat}
[root@perfnode166 ~]# sqlline --maxWidth=1 -n mapr -p mapr -u 
"jdbc:drill:schema=dfs.md1314;drillbit=10.10.30.166"
apache drill 1.8.0
"json ain't no thang"
0: jdbc:drill:schema=dfs.md1314> select * from 
dfs.`/drill/testdata/complex/json/singlewide.json`;
Error: RESOURCE ERROR: One or more nodes ran out of memory while executing the 
query.

Failure allocating buffer.
Fragment 0:0

[Error Id: bc23ef37-9ece-41f3-a40d-028bab775750 on 10.10.30.166:31010] 
(state=,code=0)
{noformat}

> OOM: CTAS from JSON to Parquet on a single wide row JSON file
> -
>
> Key: DRILL-2954
> URL: https://issues.apache.org/jira/browse/DRILL-2954
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet, Storage - Writer
>Affects Versions: 0.9.0
>Reporter: Chun Chang
> Fix For: Future
>
> Attachments: singlewide.json
>
>
> #Generated by Git-Commit-Id-Plugin
> #Sun May 03 18:33:43 EDT 2015
> git.commit.id.abbrev=10833d2
> Have a single-row JSON file with a nested structure about 5 levels deep. The 
> file size is about 3.8M, so a single row of about 3.8M.
> Converting this file to parquet using CTAS, drillbit quickly ran out of 
> memory.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDirComplexP> create table `singlewide.json` 
> as select * from dfs.`/drill/testdata/complex/json/singlewide.json`;
> Query failed: RESOURCE ERROR: One or more nodes ran out of memory while 
> executing the query.
> Fragment 0:0
> [c6ec52c8-8307-4313-97c8-b9da9e3125e5 on qa-node119.qa.lab:31010]
> Error: exception while executing query: Failure while executing query. 
> (state=,code=0)
> {code}
> drillbit log:
> {code}
> 2015-05-04 14:20:13,071 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:foreman] INFO  
> o.a.drill.exec.work.foreman.Foreman - State change requested.  PENDING --> 
> RUNNING
> 2015-05-04 14:20:13,254 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  
> o.a.d.e.w.fragment.FragmentExecutor - 
> 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0: State change requested from 
> AWAITING_ALLOCATION --> RUNNING for
> 2015-05-04 14:20:13,255 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  
> o.a.d.e.w.f.AbstractStatusReporter - State changed for 
> 2ab81d73-3343-1627-1f34-bb4e88bb4c0c:0:0. New state: RUNNING
> 2015-05-04 14:20:45,486 [2ab81d73-3343-1627-1f34-bb4e88bb4c0c:frag:0:0] INFO  
> o.a.d.c.e.DrillRuntimeException - User Error Occurred
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: One or more 
> nodes ran out of memory while executing the query.
> [c6ec52c8-8307-4313-97c8-b9da9e3125e5 ]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:465)
>  ~[drill-common-0.9.0-rebuffed.jar:0.9.0]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:210)
>  [drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-0.9.0-rebuffed.jar:0.9.0]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_45]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_45]
>   at java.lang.Thread.run(Thread.java:744) [na:1.7.0_45]
> Caused by: java.lang.OutOfMemoryError: Direct buffer memory
>   at java.nio.Bits.reserveMemory(Bits.java:658) ~[na:1.7.0_45]
>   at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:123) 
> ~[na:1.7.0_45]
>   at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:306) 
> ~[na:1.7.0_45]
>   at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:437) 
> ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
>   at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:179) 
> ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
>   at io.netty.buffer.PoolArena.allocate(PoolArena.java:168) 
> ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
>   at io.netty.buffer.PoolArena.allocate(PoolArena.java:98) 
> ~[netty-buffer-4.0.24.Final.jar:4.0.24.Final]
>   at 
> io.netty.buffer.PooledByteBufAllocatorL.newDirectBuffer(PooledByteBufAllocatorL.java:46)
>  ~[drill-java-exec-0.9.0-rebuffed.jar:4.0.24.Final]
>   at 
> io.netty.buffer.PooledByteBufAllocatorL.directBuffer(PooledByteBufAllocatorL.java:66)
>  ~[drill-java-exec-0.9.0-rebuffed.jar:4.0.24.Final]
>   at 
> org.apache.drill.exec.memory.TopLevelAllocator$ChildAllocator.buffer(TopLevelAllocator.java:227)
>  ~[drill-java-exec-0.9.0-rebuffed.jar:0.9.0]
>   at 
> org.apache.drill.exec.memory.TopLevelAl
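The attached singlewide.json is not reproduced in the report. A file with a broadly similar shape — one JSON object, roughly 5 levels deep, a few MB when serialized — can be generated as follows; the fanout and leaf sizes here are assumptions for illustration, not the actual attachment.

```python
import json

def wide_row(depth=5, fanout=8, leaf="x" * 64):
    """Build one deeply nested object. With these (assumed) parameters the
    serialized row is a few MB, similar in shape to the reported file."""
    if depth == 0:
        return leaf
    return {"k%d" % i: wide_row(depth - 1, fanout, leaf) for i in range(fanout)}

row = json.dumps(wide_row())
print(len(row) > 1_000_000)  # True: one row, multiple MB when serialized
```

Such a file is useful for reproducing "one huge record" failure modes, since the whole row must fit in a single record batch.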

[jira] [Commented] (DRILL-5245) Using filter and offset could lead to an assertion error in Calcite

2017-02-08 Thread Rahul Challapalli (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858611#comment-15858611
 ] 

Rahul Challapalli commented on DRILL-5245:
--

The above query only got past the assertion error when I used an *OFFSET of 1*.

> Using filter and offset could lead to an assertion error in Calcite
> ---
>
> Key: DRILL-5245
> URL: https://issues.apache.org/jira/browse/DRILL-5245
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.10.0
>Reporter: Rahul Challapalli
>
> git.commit.id.abbrev=2af709f
> Based on the filter selectivity, the planner might think that the number of 
> records from upstream is less than the "OFFSET" value, and it can fail with an 
> assertion error, even though in reality the selectivity-based estimate could 
> be wrong.
> Below is one such example where I hit this issue:
> {code}
> select * from (
>   select * from (
> select d.*, concat(d.c_first_name, d.c_last_name) as name from (
>   SELECT 
> *
>   FROM   catalog_sales,
>          customer
>   WHERE  cs_bill_customer_sk = c_customer_sk
> ) as d 
> order by d.c_email_address nulls first 
>   ) as d1 
>   where d1.name is not null
> ) d2
> OFFSET 1434510;
> {code}
> Exception from the logs
> {code}
> 2017-02-08 11:42:39,925 [27648b4f-98e5-22a9-f7d7-eccb587854a6:foreman] ERROR 
> o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError
> [Error Id: d026ab7f-9e11-4854-b39c-66a7846b6a3a on qa-node190.qa.lab:31010]
> org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError
> [Error Id: d026ab7f-9e11-4854-b39c-66a7846b6a3a on qa-node190.qa.lab:31010]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
>  ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:825)
>  [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:945) 
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281) 
> [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>  [na:1.7.0_111]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>  [na:1.7.0_111]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
> Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
> exception during fragment initialization: null
> ... 4 common frames omitted
> Caused by: java.lang.AssertionError: null
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.isNonNegative(RelMetadataQuery.java:524)
>  ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.validateResult(RelMetadataQuery.java:543)
>  ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:87)
>  ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:103)
>  ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:160) 
> ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:283) 
> ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1927) 
> ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.log(DefaultSqlHandler.java:138)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.log(DefaultSqlHandler.java:132)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:411)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240)
>  ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.h

[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858569#comment-15858569
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on the issue:

https://github.com/apache/drill/pull/578
  
+ Addressed review comments
+ Updated commit messages
+ Rebased on latest master


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop support Kerberos, there is potential for a 
> lot of reuse of ideas, if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (DRILL-5245) Using filter and offset could lead to an assertion error in Calcite

2017-02-08 Thread Rahul Challapalli (JIRA)
Rahul Challapalli created DRILL-5245:


 Summary: Using filter and offset could lead to an assertion error 
in Calcite
 Key: DRILL-5245
 URL: https://issues.apache.org/jira/browse/DRILL-5245
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.10.0
Reporter: Rahul Challapalli


git.commit.id.abbrev=2af709f

Based on the filter selectivity, the planner might think that the number of 
records from upstream is less than the "OFFSET" value, and it can fail with an 
assertion error, even though in reality the selectivity-based estimate could 
be wrong.

Below is one such example where I hit this issue:
{code}
select * from (
  select * from (
select d.*, concat(d.c_first_name, d.c_last_name) as name from (
  SELECT 
*
  FROM   catalog_sales,
         customer
  WHERE  cs_bill_customer_sk = c_customer_sk
) as d 
order by d.c_email_address nulls first 
  ) as d1 
  where d1.name is not null
) d2
OFFSET 1434510;
{code}
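Back-of-the-envelope arithmetic shows how a large OFFSET can push a selectivity-based row estimate negative; this is illustrative only, not Calcite's actual formula, and the row counts and selectivity are invented.

```python
def naive_rows_after_offset(input_rows, filter_selectivity, offset):
    # Illustrative planner arithmetic (not Calcite's code): estimate the
    # rows surviving the filter, then subtract the OFFSET without clamping.
    return input_rows * filter_selectivity - offset

# A selective-looking filter plus a large OFFSET pushes the estimate below
# zero -- the kind of value a non-negativity sanity check rejects.
est = naive_rows_after_offset(input_rows=2_000_000,
                              filter_selectivity=0.15,
                              offset=1_434_510)
print(est < 0)  # True
```

This matches the stack trace below, where `RelMetadataQuery.isNonNegative` asserts while computing a row count during plan logging.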

Exception from the logs
{code}
2017-02-08 11:42:39,925 [27648b4f-98e5-22a9-f7d7-eccb587854a6:foreman] ERROR 
o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AssertionError


[Error Id: d026ab7f-9e11-4854-b39c-66a7846b6a3a on qa-node190.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AssertionError


[Error Id: d026ab7f-9e11-4854-b39c-66a7846b6a3a on qa-node190.qa.lab:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:544)
 ~[drill-common-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:825)
 [drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.work.foreman.Foreman.moveToState(Foreman.java:945) 
[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:281) 
[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_111]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_111]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_111]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected 
exception during fragment initialization: null
... 4 common frames omitted
Caused by: java.lang.AssertionError: null
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.isNonNegative(RelMetadataQuery.java:524)
 ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.validateResult(RelMetadataQuery.java:543)
 ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.calcite.rel.metadata.RelMetadataQuery.getRowCount(RelMetadataQuery.java:87)
 ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.explain_(RelWriterImpl.java:103)
 ~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:160) 
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:283) 
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at org.apache.calcite.plan.RelOptUtil.toString(RelOptUtil.java:1927) 
~[calcite-core-1.4.0-drill-r19.jar:1.4.0-drill-r19]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.log(DefaultSqlHandler.java:138)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.log(DefaultSqlHandler.java:132)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:411)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:343)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:240)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:290)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:168)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:117)
 ~[drill-java-exec-1.10.0-SNAPSHOT.jar:1.10.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.

[jira] [Commented] (DRILL-3214) Config option to cast empty string to null does not cast empty string to null

2017-02-08 Thread Brett Archuleta (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858503#comment-15858503
 ] 

Brett Archuleta commented on DRILL-3214:


Hi there, just curious what the development status on this is... I'm a 
developer who is working extensively with Drill and we just ran into this bug 
on 1.8, and I'd like to contribute to the Drill project and help fix it.
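For anyone picking this up, the option's documented intent can be sketched as follows; the option name is Drill's, but the logic is an illustrative stand-in, not Drill's cast implementation.

```python
def cast_to_int(value, cast_empty_string_to_null=True):
    # Sketch of drill.exec.functions.cast_empty_string_to_null semantics:
    # with the option on, '' casts to NULL; with it off, the cast fails.
    if value == "":
        if cast_empty_string_to_null:
            return None
        # mirrors the NumberFormatException seen in the report
        raise ValueError("empty string is not a number")
    return int(value)

print(cast_to_int("234"))                               # 234
print(cast_to_int("", cast_empty_string_to_null=True))  # None
```

The bug report below shows the cast throwing NumberFormatException even with the option set to true.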

> Config option to cast empty string to null does not cast empty string to null
> -
>
> Key: DRILL-3214
> URL: https://issues.apache.org/jira/browse/DRILL-3214
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.0.0
> Environment: faec150598840c40827e6493992d81209aa936da
>Reporter: Khurram Faraaz
>Assignee: Sean Hsuan-Yi Chu
> Fix For: 1.1.0
>
>
> Config option drill.exec.functions.cast_empty_string_to_null does not seem to 
> be working as designed.
> Disable casting of empty strings to null. 
> {code}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `drill.exec.functions.cast_empty_string_to_null` = false;
> +-------+-----------------------------------------------------------+
> |  ok   |                          summary                          |
> +-------+-----------------------------------------------------------+
> | true  | drill.exec.functions.cast_empty_string_to_null updated.   |
> +-------+-----------------------------------------------------------+
> 1 row selected (0.078 seconds)
> {code}
> In this query we see empty strings are retained in query output in columns[1].
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT columns[0], columns[1], columns[2] from 
> `threeColsDouble.csv`;
> +----------+---------+---------+
> |  EXPR$0  | EXPR$1  | EXPR$2  |
> +----------+---------+---------+
> | 156      | 234     | 1       |
> | 2653543  | 434     | 0       |
> | 367345   | 567567  | 23      |
> | 34554    | 1234    | 45      |
> | 4345     | 567678  | 19876   |
> | 34556    | 0       | 1109    |
> | 5456     | -1      | 1098    |
> | 6567     |         | 34534   |
> | 7678     | 1       | 6       |
> | 8798     | 456     | 243     |
> | 265354   | 234     | 123     |
> | 367345   |         | 234     |
> | 34554    | 1       | 2       |
> | 4345     | 0       | 10      |
> | 34556    | -1      | 19      |
> | 5456     | 23423   | 345     |
> | 6567     | 0       | 2348    |
> | 7678     | 1       | 2       |
> | 8798     |         | 45      |
> | 099      | 19      | 17      |
> +----------+---------+---------+
> 20 rows selected (0.13 seconds)
> {code}
> Casting empty strings to integer leads to NumberFormatException
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT columns[0], cast(columns[1] as int), 
> columns[2] from `threeColsDouble.csv`;
> Error: SYSTEM ERROR: java.lang.NumberFormatException: 
> Fragment 0:0
> [Error Id: b08f4247-263a-460d-b37b-91a70375f7ba on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Enable casting empty string to null.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `drill.exec.functions.cast_empty_string_to_null` = true;
> +-------+-----------------------------------------------------------+
> |  ok   |                          summary                          |
> +-------+-----------------------------------------------------------+
> | true  | drill.exec.functions.cast_empty_string_to_null updated.   |
> +-------+-----------------------------------------------------------+
> 1 row selected (0.077 seconds)
> {code}
> Run query
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT columns[0], cast(columns[1] as int), 
> columns[2] from `threeColsDouble.csv`;
> Error: SYSTEM ERROR: java.lang.NumberFormatException: 
> Fragment 0:0
> [Error Id: de633399-15f9-4a79-a21f-262bd5551207 on centos-03.qa.lab:31010] 
> (state=,code=0)
> {code}
> Note from the output of the query below that the empty strings are not cast to 
> null, although drill.exec.functions.cast_empty_string_to_null was set to true.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> SELECT columns[0], columns[1], columns[2] from 
> `threeColsDouble.csv`;
> +----------+---------+---------+
> |  EXPR$0  | EXPR$1  | EXPR$2  |
> +----------+---------+---------+
> | 156      | 234     | 1       |
> | 2653543  | 434     | 0       |
> | 367345   | 567567  | 23      |
> | 34554    | 1234    | 45      |
> | 4345     | 567678  | 19876   |
> | 34556    | 0       | 1109    |
> | 5456     | -1      | 1098    |
> | 6567     |         | 34534   |
> | 7678     | 1       | 6       |
> | 8798     | 456     | 243     |
> | 265354   | 234     | 123     |
> | 367345   |         | 234     |
> | 34554    | 1       | 2       |
> | 4345     | 0       | 10      |
> | 34556    | -1      | 19      |
> | 5456     | 23423   | 345     |
> | 6567   

[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858452#comment-15858452
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100152148
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java ---
@@ -88,22 +129,183 @@ public void submitQuery(UserResultsListener 
resultsListener, RunQuery query) {
 send(queryResultHandler.getWrappedListener(resultsListener), 
RpcType.RUN_QUERY, query, QueryId.class);
   }
 
-  public void connect(RpcConnectionHandler handler, 
DrillbitEndpoint endpoint,
-  UserProperties props, UserBitShared.UserCredentials 
credentials) {
+  public CheckedFuture connect(DrillbitEndpoint 
endpoint, DrillProperties parameters,
+   UserCredentials 
credentials) {
+final FutureHandler handler = new FutureHandler();
 UserToBitHandshake.Builder hsBuilder = UserToBitHandshake.newBuilder()
 .setRpcVersion(UserRpcConfig.RPC_VERSION)
 .setSupportListening(true)
 .setSupportComplexTypes(supportComplexTypes)
 .setSupportTimeout(true)
 .setCredentials(credentials)
-.setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName));
+.setClientInfos(UserRpcUtils.getRpcEndpointInfos(clientName))
+.setSaslSupport(SaslSupport.SASL_AUTH)
+.setProperties(parameters.serializeForServer());
+this.properties = parameters;
+
+
connectAsClient(queryResultHandler.getWrappedConnectionHandler(handler),
+hsBuilder.build(), endpoint.getAddress(), endpoint.getUserPort());
+return handler;
+  }
+
+  /**
+   * Check (after {@link #connect connecting}) if server requires 
authentication.
+   *
+   * @return true if server requires authentication
+   */
+  public boolean serverRequiresAuthentication() {
+return supportedAuthMechs != null;
+  }
+
+  /**
+   * Returns a list of supported authentication mechanism. If called 
before {@link #connect connecting},
+   * returns null. If called after {@link #connect connecting}, returns a 
list of supported mechanisms
+   * iff authentication is required.
+   *
+   * @return list of supported authentication mechanisms
+   */
+  public List getSupportedAuthenticationMechanisms() {
--- End diff --

non-internal API


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop support Kerberos, there is potential for a 
> lot of reuse of ideas, if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (DRILL-5034) Select timestamp from hive generated parquet always return in UTC

2017-02-08 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5034:
---
Labels: ready-to-commit  (was: )

> Select timestamp from hive generated parquet always return in UTC
> -
>
> Key: DRILL-5034
> URL: https://issues.apache.org/jira/browse/DRILL-5034
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.9.0
>Reporter: Krystal
>Assignee: Vitalii Diravka
>  Labels: ready-to-commit
>
> commit id: 5cea9afa6278e21574c6a982ae5c3d82085ef904
> Reading timestamp data against a hive parquet table from drill automatically 
> converts the timestamp data to UTC. 
> {code}
> SELECT TIMEOFDAY() FROM (VALUES(1));
> +----------------------------------------------+
> |                    EXPR$0                    |
> +----------------------------------------------+
> | 2016-11-10 12:33:26.547 America/Los_Angeles  |
> +----------------------------------------------+
> {code}
> data schema:
> {code}
> message hive_schema {
>   optional int32 voter_id;
>   optional binary name (UTF8);
>   optional int32 age;
>   optional binary registration (UTF8);
>   optional fixed_len_byte_array(3) contributions (DECIMAL(6,2));
>   optional int32 voterzone;
>   optional int96 create_timestamp;
>   optional int32 create_date (DATE);
> }
> {code}
> Using drill-1.8, the returned timestamps match the table data:
> {code}
> select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from 
> `/user/hive/warehouse/voter_hive_parquet` limit 5;
> +------------------------+
> |         EXPR$0         |
> +------------------------+
> | 2016-10-23 20:03:58.0  |
> | null                   |
> | 2016-09-09 12:01:18.0  |
> | 2017-03-06 20:35:55.0  |
> | 2017-01-20 22:32:43.0  |
> +------------------------+
> 5 rows selected (1.032 seconds)
> {code}
> If the user timezone is changed to UTC, then the timestamp data is returned in 
> UTC time.
> Using drill-1.9, the returned timestamps got converted to UTC even though the 
> user timezone is in PST.
> {code}
> select convert_from(create_timestamp, 'TIMESTAMP_IMPALA') from 
> dfs.`/user/hive/warehouse/voter_hive_parquet` limit 5;
> +------------------------+
> |         EXPR$0         |
> +------------------------+
> | 2016-10-24 03:03:58.0  |
> | null                   |
> | 2016-09-09 19:01:18.0  |
> | 2017-03-07 04:35:55.0  |
> | 2017-01-21 06:32:43.0  |
> +------------------------+
> {code}
> {code}
> alter session set `store.parquet.reader.int96_as_timestamp`=true;
> +-------+---------------------------------------------------+
> |  ok   |                      summary                      |
> +-------+---------------------------------------------------+
> | true  | store.parquet.reader.int96_as_timestamp updated.  |
> +-------+---------------------------------------------------+
> select create_timestamp from dfs.`/user/hive/warehouse/voter_hive_parquet` 
> limit 5;
> +------------------------+
> |    create_timestamp    |
> +------------------------+
> | 2016-10-24 03:03:58.0  |
> | null                   |
> | 2016-09-09 19:01:18.0  |
> | 2017-03-07 04:35:55.0  |
> | 2017-01-21 06:32:43.0  |
> +------------------------+
> {code}
>  
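The drill-1.9 values above look like the drill-1.8 wall-clock values shifted from America/Los_Angeles into UTC. A quick check of that reading (illustrative only; the fixed -7h offset is an assumption for PDT on this date, and this is not how Drill decodes int96):

```python
from datetime import datetime, timezone, timedelta

# Interpret the drill-1.8 value as Pacific wall-clock time (PDT = UTC-7
# on 2016-10-23) and convert it to UTC.
pdt = timezone(timedelta(hours=-7))
local = datetime(2016, 10, 23, 20, 3, 58, tzinfo=pdt)
utc = local.astimezone(timezone.utc)
print(utc.strftime("%Y-%m-%d %H:%M:%S"))  # 2016-10-24 03:03:58
```

The result matches the first shifted row in the drill-1.9 output, consistent with a timezone-conversion regression rather than data corruption.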



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (DRILL-4842) SELECT * on JSON data results in NumberFormatException

2017-02-08 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-4842:

Labels: ready-to-commit  (was: )

> SELECT * on JSON data results in NumberFormatException
> --
>
> Key: DRILL-4842
> URL: https://issues.apache.org/jira/browse/DRILL-4842
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.2.0
>Reporter: Khurram Faraaz
>Assignee: Serhii Harnyk
>  Labels: ready-to-commit
> Attachments: tooManyNulls.json
>
>
> Note that doing SELECT c1 returns correct results; the failure is seen when 
> we do SELECT star. json.all_text_mode was set to true.
> JSON file tooManyNulls.json has one key c1 with 4096 nulls as its value, and 
> the 4097th key c1 has the value "Hello World".
> git commit ID : aaf220ff
> MapR Drill 1.8.0 RPM
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> alter session set 
> `store.json.all_text_mode`=true;
> +-------+------------------------------------+
> |  ok   |              summary               |
> +-------+------------------------------------+
> | true  | store.json.all_text_mode updated.  |
> +-------+------------------------------------+
> 1 row selected (0.27 seconds)
> 0: jdbc:drill:schema=dfs.tmp> SELECT c1 FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> +--------------+
> |      c1      |
> +--------------+
> | Hello World  |
> +--------------+
> 1 row selected (0.243 seconds)
> 0: jdbc:drill:schema=dfs.tmp> select * FROM `tooManyNulls.json` WHERE c1 IN 
> ('Hello World');
> Error: SYSTEM ERROR: NumberFormatException: Hello World
> Fragment 0:0
> [Error Id: 9cafb3f9-3d5c-478a-b55c-900602b8765e on centos-01.qa.lab:31010]
>  (java.lang.NumberFormatException) Hello World
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI():95
> 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt():120
> org.apache.drill.exec.test.generated.FiltererGen1169.doSetup():45
> org.apache.drill.exec.test.generated.FiltererGen1169.setup():54
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.generateSV2Filterer():195
> 
> org.apache.drill.exec.physical.impl.filter.FilterRecordBatch.setupNewSchema():107
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():78
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.innerNext():94
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext():51
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():135
> org.apache.drill.exec.record.AbstractRecordBatch.next():162
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():257
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():251
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():415
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():251
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> 0: jdbc:drill:schema=dfs.tmp>
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> Caused by: java.lang.NumberFormatException: Hello World
> at 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.nfeI(StringFunctionHelpers.java:95)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]
> at 
> org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.varTypesToInt(StringFunctionHelpers.java:120)
>  ~[drill-java-exec-1.8.0-SNAPSHOT.jar:1.8.0-SNAPSHOT]

[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858336#comment-15858336
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100131255
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/rpc/data/DataClient.java ---
@@ -75,27 +85,106 @@ public MessageLite getResponseDefaultInstance(int rpcType) throws RpcException {
   }
 
   @Override
-  protected Response handle(DataClientConnection connection, int rpcType, ByteBuf pBody, ByteBuf dBody) throws RpcException {
+  protected void handle(DataClientConnection connection, int rpcType, ByteBuf pBody, ByteBuf dBody,
+ResponseSender sender) throws RpcException {
 throw new UnsupportedOperationException("DataClient is unidirectional by design.");
   }
 
   BufferAllocator getAllocator() {
-return allocator;
+return config.getAllocator();
   }
 
   @Override
   protected void validateHandshake(BitServerHandshake handshake) throws RpcException {
 if (handshake.getRpcVersion() != DataRpcConfig.RPC_VERSION) {
-  throw new RpcException(String.format("Invalid rpc version.  Expected %d, actual %d.", handshake.getRpcVersion(), DataRpcConfig.RPC_VERSION));
+  throw new RpcException(String.format("Invalid rpc version.  Expected %d, actual %d.",
+  handshake.getRpcVersion(), DataRpcConfig.RPC_VERSION));
+}
+
+if (handshake.getAuthenticationMechanismsCount() != 0) { // remote requires authentication
--- End diff --

Correct me if I am wrong, but your two intentions are different.

I've addressed Sorabh's comment, i.e. "check for the case where authentication 
is enabled on this client and, for some reason, the server is sending an empty 
list of mechanisms (maybe a wrong config); then we should throw an exception".

But regarding Laurent's comment: the "code" is the "same as in ControlClient", 
but the objects are all different (handshake, connection, config). That 
refactoring would require a lot more changes to BasicClient. I'll open a 
ticket once this PR is merged.


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop supports Kerberos there is a potential for a 
> lot of reuse of ideas if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858320#comment-15858320
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100129942
  
--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -407,37 +422,155 @@ connectionStatus_t DrillClientImpl::validateHandshake(DrillUserProperties* prope
 if(ret!=CONN_SUCCESS){
 return ret;
 }
-if(this->m_handshakeStatus != exec::user::SUCCESS){
-switch(this->m_handshakeStatus){
-case exec::user::RPC_VERSION_MISMATCH:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Invalid rpc version.  Expected "
-<< DRILL_RPC_VERSION << ", actual "<< m_handshakeVersion << "." << std::endl;)
-return handleConnError(CONN_BAD_RPC_VER,
-getMessage(ERR_CONN_BAD_RPC_VER, DRILL_RPC_VERSION,
-m_handshakeVersion,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-case exec::user::AUTH_FAILED:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Authentication failed." << std::endl;)
-return handleConnError(CONN_AUTH_FAILED,
-getMessage(ERR_CONN_AUTHFAIL,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-case exec::user::UNKNOWN_FAILURE:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown error during handshake." << std::endl;)
-return handleConnError(CONN_HANDSHAKE_FAILED,
-getMessage(ERR_CONN_UNKNOWN_ERR,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-default:
-break;
+
+switch(this->m_handshakeStatus) {
+case exec::user::SUCCESS:
+// reset io_service after handshake is validated before running queries
+m_io_service.reset();
+return CONN_SUCCESS;
+case exec::user::RPC_VERSION_MISMATCH:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Invalid rpc version.  Expected "
+<< DRILL_RPC_VERSION << ", actual "<< m_handshakeVersion << "." << std::endl;)
+return handleConnError(CONN_BAD_RPC_VER, getMessage(ERR_CONN_BAD_RPC_VER, DRILL_RPC_VERSION,
+m_handshakeVersion,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::AUTH_FAILED:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Authentication failed." << std::endl;)
+return handleConnError(CONN_AUTH_FAILED, getMessage(ERR_CONN_AUTHFAIL,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::UNKNOWN_FAILURE:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown error during handshake." << std::endl;)
+return handleConnError(CONN_HANDSHAKE_FAILED, getMessage(ERR_CONN_UNKNOWN_ERR,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::AUTH_REQUIRED:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Server requires SASL authentication." << std::endl;)
+return authenticate(properties);
+default:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown return status." << std::endl;)
+return handleConnError(CONN_HANDSHAKE_FAILED, getMessage(ERR_CONN_UNKNOWN_ERR,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+}
+}
+
+connectionStatus_t DrillClientImpl::authenticate(const DrillUserProperties* userProperties) {
+try {
+m_saslAuthenticator = new SaslAuthenticatorImpl(userProperties);
+} catch (std::runtime_error& e) {
--- End diff --

Suggestion for an alternative?


[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858319#comment-15858319
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100129719
  
--- Diff: contrib/native/client/src/clientlib/saslAuthenticatorImpl.cpp ---
@@ -0,0 +1,210 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include 
+#include 
+#include 
+#include "saslAuthenticatorImpl.hpp"
+
+#include "drillClientImpl.hpp"
+#include "logger.hpp"
+
+namespace Drill {
+
+#define DEFAULT_SERVICE_NAME "drill"
+
+#define KERBEROS_SIMPLE_NAME "kerberos"
+#define KERBEROS_SASL_NAME "gssapi"
+#define PLAIN_NAME "plain"
+
+const std::map SaslAuthenticatorImpl::MECHANISM_MAPPING = boost::assign::map_list_of
+(KERBEROS_SIMPLE_NAME, KERBEROS_SASL_NAME)
+(PLAIN_NAME, PLAIN_NAME)
+;
+
+boost::mutex SaslAuthenticatorImpl::s_mutex;
+bool SaslAuthenticatorImpl::s_initialized = false;
+
+SaslAuthenticatorImpl::SaslAuthenticatorImpl(const DrillUserProperties* const properties) :
+m_properties(properties), m_pConnection(NULL), m_secret(NULL) {
+
+if (!s_initialized) {
--- End diff --

This allows for lazy init in case auth is not used.
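The lazy-init pattern under discussion, initialize the SASL library only when authentication is first requested and guard the one-time setup with a mutex, can be sketched in Java roughly as follows; all names here are illustrative stand-ins, not the actual C++ client code:

```java
public class SaslLibrary {
    private static final Object LOCK = new Object();
    private static volatile boolean initialized = false;
    static int initCalls = 0; // for illustration: counts how often init actually runs

    // Initialize the underlying library at most once, and only when
    // authentication is actually needed (lazy init).
    static void ensureInitialized() {
        if (!initialized) {              // fast path: no lock once initialized
            synchronized (LOCK) {
                if (!initialized) {      // re-check under the lock
                    initCalls++;         // stands in for the real library init call
                    initialized = true;
                }
            }
        }
    }

    public static void main(String[] args) {
        SaslLibrary.ensureInitialized();
        SaslLibrary.ensureInitialized();
        System.out.println(SaslLibrary.initCalls); // prints 1
    }
}
```

A client that never authenticates never pays the initialization cost, which is the benefit the comment points at.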


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop supports Kerberos there is a potential for a 
> lot of reuse of ideas if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 





[jira] [Commented] (DRILL-4824) JSON with complex nested data produces incorrect output with missing fields

2017-02-08 Thread Paul Rogers (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858294#comment-15858294
 ] 

Paul Rogers commented on DRILL-4824:


Let's step back to establish whether we want a correct solution, or just a less-bad 
workaround. In this JIRA, we've been talking about workarounds.

The fundamental problem is that Drill discards information important to JSON. 
In JSON, a field can have multiple states: not-provided, null, map (perhaps 
empty), list (perhaps empty), number, string.

Drill cannot represent the following:

* Variable types
* Not-provided
* Null map
* Null array

As a result, we try to “compress” the JSON states into the smaller set of Drill 
states. To have an accurate solution, we must make (at least) three changes:

* Add a null bit to map and array (Go from MapVector to NullableMapVector, etc.)
* Include the not-provided bit.
* Support variant (union) vectors.

The good news is that Drill already provides the essential pieces.

* Drill provides null flag vectors for other vectors. There is nothing (other 
than work) preventing us from adding them to maps, arrays, and lists.
* While we call the nullable flag a "bit" vector, we actually use an entire 
byte per record. As a result, we can simply grab one of the seven unused bits 
to use as a "not-provided" bit.
* Drill provides a (partial implementation of a) variant (or "union") vector.

Building on those three components, we can achieve complete support of the JSON 
standard. 
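As a rough sketch of the second point above, packing both a null flag and a "not-provided" flag into the byte that a "bit" (validity) vector already spends per record, consider the following Java illustration; the class, masks, and helpers are invented for this example and are not Drill's actual vector API:

```java
public class ValidityByte {
    // bit 0: value is set (non-null); bit 1: field appeared in the input record.
    // The remaining six bits of the per-record byte stay free.
    static final int SET_MASK = 0x01;
    static final int PROVIDED_MASK = 0x02;

    static byte encode(boolean provided, boolean nonNull) {
        int b = 0;
        if (provided) b |= PROVIDED_MASK;
        if (nonNull) b |= SET_MASK;
        return (byte) b;
    }

    static boolean isProvided(byte b) { return (b & PROVIDED_MASK) != 0; }
    static boolean isNull(byte b)     { return (b & SET_MASK) == 0; }

    public static void main(String[] args) {
        byte missing = encode(false, false);   // field absent from the JSON record
        byte nullVal = encode(true, false);    // "field": null
        byte present = encode(true, true);     // "field": <value>
        System.out.println(isProvided(missing) + " " + isNull(nullVal) + " " + isNull(present));
    }
}
```

This is the distinction that lets a writer reproduce "not-provided" faithfully instead of collapsing it into null or an empty container.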

> JSON with complex nested data produces incorrect output with missing fields
> ---
>
> Key: DRILL-4824
> URL: https://issues.apache.org/jira/browse/DRILL-4824
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JSON
>Affects Versions: 1.0.0
>Reporter: Roman
>Assignee: Serhii Harnyk
>
> There is incorrect output in case of JSON file with complex nested data.
> _JSON:_
> {code:none|title=example.json|borderStyle=solid}
> {
> "Field1" : {
> }
> }
> {
> "Field1" : {
> "InnerField1": {"key1":"value1"},
> "InnerField2": {"key2":"value2"}
> }
> }
> {
> "Field1" : {
> "InnerField3" : ["value3", "value4"],
> "InnerField4" : ["value5", "value6"]
> }
> }
> {code}
> _Query:_
> {code:sql}
> select Field1 from dfs.`/tmp/example.json`
> {code}
> _Incorrect result:_
> {code:none}
> +---+
> |  Field1   |
> +---+
> {"InnerField1":{},"InnerField2":{},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{"key1":"value1"},"InnerField2" 
> {"key2":"value2"},"InnerField3":[],"InnerField4":[]}
> {"InnerField1":{},"InnerField2":{},"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--+
> {code}
> There is no need to output missing fields. In the case of a deeply nested 
> structure, we would get an unreadable result for the user.
> _Correct result:_
> {code:none}
> +--+
> | Field1   |
> +--+
> |{} 
> {"InnerField1":{"key1":"value1"},"InnerField2":{"key2":"value2"}}
> {"InnerField3":["value3","value4"],"InnerField4":["value5","value6"]}
> +--+
> {code}





[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858287#comment-15858287
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100122930
  
--- Diff: contrib/native/client/src/clientlib/drillClientImpl.cpp ---
@@ -407,37 +422,155 @@ connectionStatus_t DrillClientImpl::validateHandshake(DrillUserProperties* prope
 if(ret!=CONN_SUCCESS){
 return ret;
 }
-if(this->m_handshakeStatus != exec::user::SUCCESS){
-switch(this->m_handshakeStatus){
-case exec::user::RPC_VERSION_MISMATCH:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Invalid rpc version.  Expected "
-<< DRILL_RPC_VERSION << ", actual "<< m_handshakeVersion << "." << std::endl;)
-return handleConnError(CONN_BAD_RPC_VER,
-getMessage(ERR_CONN_BAD_RPC_VER, DRILL_RPC_VERSION,
-m_handshakeVersion,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-case exec::user::AUTH_FAILED:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Authentication failed." << std::endl;)
-return handleConnError(CONN_AUTH_FAILED,
-getMessage(ERR_CONN_AUTHFAIL,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-case exec::user::UNKNOWN_FAILURE:
-DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown error during handshake." << std::endl;)
-return handleConnError(CONN_HANDSHAKE_FAILED,
-getMessage(ERR_CONN_UNKNOWN_ERR,
-this->m_handshakeErrorId.c_str(),
-this->m_handshakeErrorMsg.c_str()));
-default:
-break;
+
+switch(this->m_handshakeStatus) {
+case exec::user::SUCCESS:
+// reset io_service after handshake is validated before running queries
+m_io_service.reset();
+return CONN_SUCCESS;
+case exec::user::RPC_VERSION_MISMATCH:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Invalid rpc version.  Expected "
+<< DRILL_RPC_VERSION << ", actual "<< m_handshakeVersion << "." << std::endl;)
+return handleConnError(CONN_BAD_RPC_VER, getMessage(ERR_CONN_BAD_RPC_VER, DRILL_RPC_VERSION,
+m_handshakeVersion,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::AUTH_FAILED:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Authentication failed." << std::endl;)
+return handleConnError(CONN_AUTH_FAILED, getMessage(ERR_CONN_AUTHFAIL,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::UNKNOWN_FAILURE:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown error during handshake." << std::endl;)
+return handleConnError(CONN_HANDSHAKE_FAILED, getMessage(ERR_CONN_UNKNOWN_ERR,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+case exec::user::AUTH_REQUIRED:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Server requires SASL authentication." << std::endl;)
+return authenticate(properties);
+default:
+DRILL_MT_LOG(DRILL_LOG(LOG_TRACE) << "Unknown return status." << std::endl;)
+return handleConnError(CONN_HANDSHAKE_FAILED, getMessage(ERR_CONN_UNKNOWN_ERR,
+this->m_handshakeErrorId.c_str(),
+this->m_handshakeErrorMsg.c_str()));
+}
+}
+
+connectionStatus_t DrillClientImpl::authenticate(const DrillUserProperties* userProperties) {
+try {
+m_saslAuthenticator = new SaslAuthenticatorImpl(userProperties);
--- End diff --

I don't think so. (I am not sure how else this could be done.)


> Kerberos

[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858284#comment-15858284
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100122717
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/client/DrillClient.java ---
@@ -568,19 +565,13 @@ public void runQuery(QueryType type, List planFragments, UserResul
* Helper method to generate the UserCredentials message from the properties.
*/
   private UserBitShared.UserCredentials getUserCredentials() {
-// If username is not propagated as one of the properties
-String userName = "anonymous";
-
-if (props != null) {
-  for (Property property: props.getPropertiesList()) {
-if (property.getKey().equalsIgnoreCase("user") && !Strings.isNullOrEmpty(property.getValue())) {
-  userName = property.getValue();
-  break;
-}
-  }
+String userName = properties.getProperty(DrillProperties.USER);
+if (Strings.isNullOrEmpty(userName)) {
+  userName = "anonymous"; // if username is not propagated as one of the properties
--- End diff --

To keep the functionality as is; I am not sure what the original intent was.
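The refactored fallback shown in the diff, read the user property and default to "anonymous" when it is missing or empty, can be sketched with plain `java.util.Properties`; the helper class here is illustrative, only the key and the default come from the diff:

```java
import java.util.Properties;

public class UserNameFallback {
    static final String USER_KEY = "user";

    // Mirrors the diff's logic: use the configured user name, or fall back
    // to "anonymous" when the property is missing or empty.
    static String userName(Properties props) {
        String name = props.getProperty(USER_KEY);
        return (name == null || name.isEmpty()) ? "anonymous" : name;
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        System.out.println(userName(p));       // anonymous
        p.setProperty(USER_KEY, "alice");
        System.out.println(userName(p));       // alice
    }
}
```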


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop supports Kerberos there is a potential for a 
> lot of reuse of ideas if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 





[jira] [Commented] (DRILL-4280) Kerberos Authentication

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-4280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858281#comment-15858281
 ] 

ASF GitHub Bot commented on DRILL-4280:
---

Github user sudheeshkatkam commented on a diff in the pull request:

https://github.com/apache/drill/pull/578#discussion_r100122562
  
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/rpc/control/ConnectionManagerRegistry.java ---
@@ -32,24 +29,19 @@
 
   private final ConcurrentMap registry = Maps.newConcurrentMap();
 
-  private final ControlMessageHandler handler;
-  private final BootStrapContext context;
-  private volatile DrillbitEndpoint localEndpoint;
-  private final BufferAllocator allocator;
+  private final BitConnectionConfigImpl config;
 
-  public ConnectionManagerRegistry(BufferAllocator allocator, ControlMessageHandler handler, BootStrapContext context) {
-super();
-this.handler = handler;
-this.context = context;
-this.allocator = allocator;
+  public ConnectionManagerRegistry(BitConnectionConfigImpl config) {
--- End diff --

Each impl of ConnectionConfig is package-private. There are subtle 
differences among them, which is why references to the impl types are used within 
the package. But in generic classes (e.g. AbstractServerConnection), the interface 
is used.


> Kerberos Authentication
> ---
>
> Key: DRILL-4280
> URL: https://issues.apache.org/jira/browse/DRILL-4280
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Keys Botzum
>Assignee: Sudheesh Katkam
>  Labels: security
>
> Drill should support Kerberos based authentication from clients. This means 
> that both the ODBC and JDBC drivers as well as the web/REST interfaces should 
> support inbound Kerberos. For Web this would most likely be SPNEGO while for 
> ODBC and JDBC this will be more generic Kerberos.
> Since Hive and much of Hadoop supports Kerberos there is a potential for a 
> lot of reuse of ideas if not implementation.
> Note that this is related to but not the same as 
> https://issues.apache.org/jira/browse/DRILL-3584 





[jira] [Updated] (DRILL-5040) Interrupted CTAS should not succeed & should not create physical file on disk

2017-02-08 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-5040:

Labels: ready-to-commit  (was: )

> Interrupted CTAS should not succeed & should not create physical file on disk
> -
>
> Key: DRILL-5040
> URL: https://issues.apache.org/jira/browse/DRILL-5040
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.9.0
>Reporter: Khurram Faraaz
>Assignee: Arina Ielchiieva
>  Labels: ready-to-commit
> Fix For: 1.10.0
>
>
> We should not allow CTAS to succeed (i.e. create a physical file on disk) in 
> the case where it was interrupted (via Ctrl-C).
> Drill 1.9.0
> git commit ID : db30854
> Consider the below CTAS that was interrupted using Ctrl-C
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> create table temp_t1 as select * from 
> `twoKeyJsn.json`; 
> [ issue Ctrl-C while the above CTAS is running ]
> No rows affected (7.694 seconds)
> {noformat}
> I verified that a physical file was created on disk, even though the above CTAS 
> was canceled.
> {noformat}
> [root@centos-01 ~]# hadoop fs -ls /tmp/temp_t1*
> -rwxr-xr-x   3 root root   36713198 2016-11-14 10:51 
> /tmp/temp_t1/0_0_0.parquet
> {noformat}
> We are able to run a select on the CTAS table (above) even though it was canceled.
> {noformat}
> 0: jdbc:drill:schema=dfs.tmp> select count(*) from temp_t1;
> +--+
> |  EXPR$0  |
> +--+
> | 3747840  |
> +--+
> 1 row selected (0.183 seconds)
> {noformat}





[jira] [Commented] (DRILL-5002) Using hive's date functions on top of date column gives wrong results for local time-zone

2017-02-08 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15858106#comment-15858106
 ] 

Vitalii Diravka commented on DRILL-5002:


I changed the UTC time zone to the local one (-10:00) and reproduced the issue. So 
the root cause of the problem is the combination of Hive's date functions and the 
local time zone: Hive receives UTC time and converts it to local time, so it is 
necessary to pass UTC time to Hive.

But the issue applies to every data source. For example:
{code}
0: jdbc:drill:zk=local> select to_date('1994-01-01','yyyy-mm-dd') from (VALUES(1));
+-+
|   EXPR$0|
+-+
| 1994-01-01  |
+-+
1 row selected (0.096 seconds)
0: jdbc:drill:zk=local> select last_day(to_date('1994-01-01','yyyy-mm-dd')) from (VALUES(1));
+-+
|   EXPR$0|
+-+
| 1993-12-31  |
+-+
{code}
Therefore I changed the name of this ticket.
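The shift described above is easy to reproduce outside Drill: a DATE held as midnight-UTC millis, when read back through a UTC-10 calendar, lands on the previous day. A minimal Java illustration (not Drill or Hive code; the zone id is just the example offset):

```java
import java.util.Calendar;
import java.util.TimeZone;

public class DateShiftDemo {
    // Millis for 1994-01-01T00:00:00 in the given zone.
    static long midnightMillis(String zone) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone(zone));
        c.clear();
        c.set(1994, Calendar.JANUARY, 1);
        return c.getTimeInMillis();
    }

    // Day-of-month seen when the same instant is read back in another zone.
    static int dayInZone(long millis, String zone) {
        Calendar c = Calendar.getInstance(TimeZone.getTimeZone(zone));
        c.setTimeInMillis(millis);
        return c.get(Calendar.DAY_OF_MONTH);
    }

    public static void main(String[] args) {
        long utcMidnight = midnightMillis("UTC");
        System.out.println(dayInZone(utcMidnight, "UTC"));       // 1
        System.out.println(dayInZone(utcMidnight, "GMT-10:00")); // 31 (previous day)
    }
}
```

The same instant renders as 1994-01-01 in UTC but 1993-12-31 at UTC-10, which is exactly the `last_day` result shown above.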

> Using hive's date functions on top of date column gives wrong results for 
> local time-zone
> -
>
> Key: DRILL-5002
> URL: https://issues.apache.org/jira/browse/DRILL-5002
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>Priority: Critical
> Attachments: 0_0_0.parquet
>
>
> git.commit.id.abbrev=190d5d4
> Wrong Result 1 :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1994-02-01' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1994-02-01  | 1   |
> | 1994-02-01  | 1   |
> +-+-+
> {code}
> Wrong Result 2 : 
> {code}
> select l_shipdate, `day`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1998-06-02  | 1   |
> | 1998-06-02  | 1   |
> +-+-+
> {code}
> Correct Result :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1998-06-02  | 6   |
> | 1998-06-02  | 6   |
> +-+-+
> {code}
> It looks like we are getting wrong results when the 'day' is '01'. I only 
> tried the month and day hive functions, but wouldn't be surprised if others 
> have similar issues too.





[jira] [Updated] (DRILL-5002) Using hive's date functions on top of date column gives wrong results for local time-zone

2017-02-08 Thread Vitalii Diravka (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-5002:
---
Summary: Using hive's date functions on top of date column gives wrong 
results for local time-zone  (was: Using hive's date functions on top of date 
column in parquet gives wrong results)

> Using hive's date functions on top of date column gives wrong results for 
> local time-zone
> -
>
> Key: DRILL-5002
> URL: https://issues.apache.org/jira/browse/DRILL-5002
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Hive, Storage - Parquet
>Reporter: Rahul Challapalli
>Assignee: Vitalii Diravka
>Priority: Critical
> Attachments: 0_0_0.parquet
>
>
> git.commit.id.abbrev=190d5d4
> Wrong Result 1 :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1994-02-01' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1994-02-01  | 1   |
> | 1994-02-01  | 1   |
> +-+-+
> {code}
> Wrong Result 2 : 
> {code}
> select l_shipdate, `day`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1998-06-02  | 1   |
> | 1998-06-02  | 1   |
> +-+-+
> {code}
> Correct Result :
> {code}
> select l_shipdate, `month`(l_shipdate) from cp.`tpch/lineitem.parquet` where 
> l_shipdate = date '1998-06-02' limit 2;
> +-+-+
> | l_shipdate  | EXPR$1  |
> +-+-+
> | 1998-06-02  | 6   |
> | 1998-06-02  | 6   |
> +-+-+
> {code}
> It looks like we are getting wrong results when the 'day' is '01'. I only 
> tried the month and day hive functions, but wouldn't be surprised if others 
> have similar issues too.





[jira] [Commented] (DRILL-5243) Fix TestContextFunctions.sessionIdUDFWithinSameSession unit test

2017-02-08 Thread Arina Ielchiieva (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857845#comment-15857845
 ] 

Arina Ielchiieva commented on DRILL-5243:
-

Merged with commit id 478de241dd28b41bcb4487fe67937dca33522dc7

> Fix TestContextFunctions.sessionIdUDFWithinSameSession unit test
> 
>
> Key: DRILL-5243
> URL: https://issues.apache.org/jira/browse/DRILL-5243
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>  Labels: ready-to-commit
> Fix For: 1.10.0
>
>
> After DRILL-5043 was merged into master, it introduced unit test 
> TestContextFunctions.sessionIdUDFWithinSameSession which is currently failing 
> with the following error:
> {noformat}
> java.lang.Exception: org.apache.drill.common.exceptions.UserRemoteException: 
> PARSE ERROR: Encountered ";" at line 1, column 48.
> {noformat}
> Fix:
> remove the semicolon at the end of the query
> {noformat}
> final String sessionIdQuery = "select session_id as sessionId from 
> (values(1));"
> {noformat}





[jira] [Commented] (DRILL-5080) Create a memory-managed version of the External Sort operator

2017-02-08 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15857604#comment-15857604
 ] 

ASF GitHub Bot commented on DRILL-5080:
---

Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/717
  
Some comments got lost in the force-push. One was related to the output 
batch size, suggesting we cap it at 16 MB. The reason is that value vectors 
above 16 MB cause memory fragmentation. A later fix will limit an output batch 
to either 64K rows (the size of an sv2) or so that the longest vector is 
smaller than 16 MB. The most recent commit added per-column size information so 
that we can enforce this limit. For example, we can have 64K rows with columns 
of size 256 bytes within a 16 MB vector. There is no reason not to allow 64K 
rows even for rows with four of the 256-byte columns. Total batch size would be 64 
MB, but no single vector would be above 16 MB.

That fix will be offered, along with tests and enabling the managed sort by 
default, in a subsequent PR.
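The arithmetic in the comment, 64K rows of 256-byte columns fill a 16 MB vector exactly, so four such columns make a 64 MB batch with no single vector over the cap, can be sketched as follows; the constants come from the comment and the helper method is illustrative, not Drill's sizing code:

```java
public class BatchSizeLimits {
    static final int MAX_ROWS = 65536;               // 64K rows, the size of an sv2
    static final long MAX_VECTOR_BYTES = 16L << 20;  // 16 MB cap per value vector

    // Largest row count such that the widest column's vector stays within 16 MB.
    static int rowLimit(int widestColumnBytes) {
        long byWidth = MAX_VECTOR_BYTES / widestColumnBytes;
        return (int) Math.min(MAX_ROWS, byWidth);
    }

    public static void main(String[] args) {
        // 256-byte columns: 64K rows * 256 B = 16 MB per vector, exactly at the cap,
        // so four such columns give a 64 MB batch with no vector over 16 MB.
        System.out.println(rowLimit(256));   // 65536
        // A 512-byte column would overflow 16 MB at 64K rows, so rows are halved.
        System.out.println(rowLimit(512));   // 32768
    }
}
```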




> Create a memory-managed version of the External Sort operator
> -
>
> Key: DRILL-5080
> URL: https://issues.apache.org/jira/browse/DRILL-5080
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.8.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>  Labels: ready-to-commit
> Fix For: 1.10.0
>
> Attachments: ManagedExternalSortDesign.pdf
>
>
> We propose to create a "managed" version of the external sort operator that 
> works to a clearly-defined memory limit. Attached is a design specification 
> for the work.
> The project will include fixing a number of bugs related to the external 
> sort, include as sub-tasks of this umbrella task.


