[jira] [Commented] (DRILL-7243) Storage Plugins Management

2019-05-07 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16834946#comment-16834946
 ] 

Pritesh Maker commented on DRILL-7243:
--

[~vitalii] any thoughts?

> Storage Plugins Management
> --
>
> Key: DRILL-7243
> URL: https://issues.apache.org/jira/browse/DRILL-7243
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Information Schema
>Affects Versions: 1.16.0
>Reporter: benj
>Priority: Minor
>
> Since 1.16, when updating a storage plugin configuration via the web interface, the 
> page changes to the global management page for all storage plugins.
> Previously, pressing the update button did not change the page, so it was 
> possible to make other changes immediately.
> Returning to the global page was done by pressing the "back" button.
> On our side, the previous behavior was better than the one introduced in 1.16.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7239) Download page wrong date

2019-05-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7239:


Assignee: Bridget Bevens

> Download page wrong date
> 
>
> Key: DRILL-7239
> URL: https://issues.apache.org/jira/browse/DRILL-7239
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Bridget Bevens
>Priority: Major
>
> [https://drill.apache.org/download/] says:
>  
> "Drill 1.16 was released on May 02, 2018."
> I think that is wrong.
>  
> The page also says:
> "Copyright © 2012-2014"
> If the year is mentioned, it should be updated to reflect the last substantive change.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7235) Download page should link to verification instructions

2019-05-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7235:


Assignee: Bridget Bevens

> Download page should link to verification instructions
> --
>
> Key: DRILL-7235
> URL: https://issues.apache.org/jira/browse/DRILL-7235
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Sebb
>Assignee: Bridget Bevens
>Priority: Major
>
> The download page includes links to KEYS, sigs, and hashes, but does not 
> provide any information on why or how they should be used.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6032) Use RecordBatchSizer to estimate size of columns in HashAgg

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6032:
-
Fix Version/s: (was: 1.17.0)

> Use RecordBatchSizer to estimate size of columns in HashAgg
> ---
>
> Key: DRILL-6032
> URL: https://issues.apache.org/jira/browse/DRILL-6032
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> We need to use the RecordBatchSizer to estimate the size of columns in the 
> partition batches created by HashAgg.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6739) Update Kafka libs to 2.0.0 version

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6739:
-
Fix Version/s: (was: 1.17.0)

> Update Kafka libs to 2.0.0 version
> --
>
> Key: DRILL-6739
> URL: https://issues.apache.org/jira/browse/DRILL-6739
> Project: Apache Drill
>  Issue Type: Task
>  Components: Storage - Kafka
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> The current version of the Kafka libs is 0.11.0.1.
>  The latest version is 2.0.0 (September 2018): 
> https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
> It looks like the only changes needed are:
>  * replacing the {{serverConfig()}} method with {{staticServerConfig()}} in Drill's 
> {{EmbeddedKafkaCluster}} class
>  * replacing the deprecated {{AdminUtils}} with {{kafka.zk.AdminZkClient}} 
> [https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/core/src/main/scala/kafka/admin/AdminUtils.scala#L35]
>  https://issues.apache.org/jira/browse/KAFKA-6545
> The initial work: https://github.com/vdiravka/drill/commits/DRILL-6739
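> For illustration, a hedged sketch of what the {{AdminUtils}} -> {{kafka.zk.AdminZkClient}} 
> replacement could look like from Java; the constructor and method arguments below 
> follow the Kafka 2.0-era Scala API and should be verified against the actual classes:
> {code:java}
> import java.util.Properties;
> import kafka.admin.RackAwareMode;
> import kafka.zk.AdminZkClient;
> import kafka.zk.KafkaZkClient;
> import org.apache.kafka.common.utils.Time;
> 
> public class AdminZkClientSketch {
>   public static void main(String[] args) {
>     // Connect to ZooKeeper; metricGroup/metricType are Scala default arguments
>     // that Java callers must pass explicitly.
>     KafkaZkClient zkClient = KafkaZkClient.apply("localhost:2181", false, 30000,
>         30000, Integer.MAX_VALUE, Time.SYSTEM, "kafka.server", "SessionExpireListener");
>     AdminZkClient adminZkClient = new AdminZkClient(zkClient);
>     // Rough equivalent of the deprecated AdminUtils.createTopic(...) call.
>     adminZkClient.createTopic("drill-test-topic", 1, 1, new Properties(),
>         RackAwareMode.Disabled$.MODULE$);
>     zkClient.close();
>   }
> }
> {code}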



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6540) Upgrade to HADOOP-3.0 libraries

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6540:
-
Fix Version/s: (was: 1.17.0)

> Upgrade to HADOOP-3.0 libraries 
> 
>
> Key: DRILL-6540
> URL: https://issues.apache.org/jira/browse/DRILL-6540
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>
> Currently Drill uses version 2.7.4 of the Hadoop libraries (hadoop-common, 
> hadoop-hdfs, hadoop-annotations, hadoop-aws, hadoop-yarn-api, hadoop-client, 
> hadoop-yarn-client).
> A year ago [Hadoop 3.0|https://hadoop.apache.org/docs/r3.0.0/index.html] 
> was released, and recently it was updated to [Hadoop 
> 3.2.0|https://hadoop.apache.org/docs/r3.2.0/].
> We need this upgrade to run Drill on a Hadoop 3.0 distribution. The 
> newer version also includes new features which can be useful for Drill.
>  This upgrade is also needed to leverage the newest Zookeeper 
> libraries and Hive 3.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6465) Transitive closure is not working in Drill for Join with multiple local conditions

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6465:
-
Fix Version/s: (was: 1.17.0)

> Transitive closure is not working in Drill for Join with multiple local 
> conditions
> --
>
> Key: DRILL-6465
> URL: https://issues.apache.org/jira/browse/DRILL-6465
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Vitalii Diravka
>Priority: Minor
> Attachments: drill.zip
>
>
> For several SQL operators, transitive closure is not applied during partition 
> pruning and filter pushdown for the left table in a join.
>  If several local conditions are used, Drill scans the full left table in the join.
>  But if the additional conditions are moved to the WHERE clause, transitive 
> closure works fine for all joined tables.
> *Query BETWEEN:*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y BETWEEN 1987 AND 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=8, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2])]{code}
> *Actual result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=16, partitions= [Partition(values:[1987, 5, 
> 1]), Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2]), Partition(values:[1990, 4, 1]), 
> Partition(values:[1990, 4, 2]), Partition(values:[1990, 5, 1]), 
> Partition(values:[1990, 5, 2]), Partition(values:[1991, 3, 1]), 
> Partition(values:[1991, 3, 2]), Partition(values:[1991, 3, 3]), 
> Partition(values:[1991, 3, 4])
> ]
> {code}
> *Transitive closure behaves the same way for these logical operators:*
>  * NOT IN
>  * LIKE
>  * NOT LIKE
> Also, transitive closure is not applied during partition pruning and filter 
> pushdown for these comparison operators:
> *Query <*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y < 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=4, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2])]{code}
> *Actual result:*
> {code:java}
> 00-00 Screen
> 00-01 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-02 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-03 HashJoin(condition=[=($1, $6)], joinType=[inner])
> 00-05 Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:h_tab1), columns=[`**`], numPartitions=16, partitions= 
> [Partition(values:[1987, 5, 1]), Partition(values:[1987, 5, 2]), 
> Partition(values:[1987, 7, 1]), Partition(values:[1987, 7, 2]), 
> Partition(values:[1988, 11, 1]), Partition(values:[1988, 11, 2]), 
> Partition(values:[1988, 12, 1]), Partition(values:[1988, 12, 2]), 
> Partition(values:[1990, 4, 1]), Partition(values:[1990, 4, 2]), 
> Partition(values:[1990, 5, 1]), Partition(values:[1990, 5, 2]), 
> Partition(values:[1991, 3, 1]), Partition(values:[1991, 3, 2]), 
> Partition(values:[1991, 3, 3]), Partition(values:[1991, 3, 4])], 
> inputDirectories=[maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/1, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/2, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/3, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/4, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/5, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/6, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/7, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/8, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/9, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/10, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/11, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/12, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/13, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/14, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/15, 
> 

[jira] [Updated] (DRILL-6806) Start moving code for handling a partition in HashAgg into a separate class

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6806:
-
Fix Version/s: (was: 1.17.0)

> Start moving code for handling a partition in HashAgg into a separate class
> ---
>
> Key: DRILL-6806
> URL: https://issues.apache.org/jira/browse/DRILL-6806
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
>
> Since this involves a lot of refactoring this will be a multiple PR effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7230) Add README.md with instructions for release

2019-05-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7230?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7230:


Assignee: Sorabh Hamirwasia

> Add README.md with instructions for release
> ---
>
> Key: DRILL-7230
> URL: https://issues.apache.org/jira/browse/DRILL-7230
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Reporter: Sorabh Hamirwasia
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.17.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7099) Resource Management in Exchange Operators

2019-04-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7099:
-
Fix Version/s: 1.17.0

> Resource Management in Exchange Operators
> -
>
> Key: DRILL-7099
> URL: https://issues.apache.org/jira/browse/DRILL-7099
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.17.0
>
>
> This Jira will be used to track the changes required for implementing 
> Resource Management in Exchange operators.
> The design can be found here: 
> https://docs.google.com/document/d/1N9OXfCWcp68jsxYVmSt9tPgnZRV_zk8rwwFh0BxXZeE/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7099) Resource Management in Exchange Operators

2019-04-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7099:
-
Issue Type: Improvement  (was: Bug)

> Resource Management in Exchange Operators
> -
>
> Key: DRILL-7099
> URL: https://issues.apache.org/jira/browse/DRILL-7099
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.17.0
>
>
> This Jira will be used to track the changes required for implementing 
> Resource Management in Exchange operators.
> The design can be found here: 
> https://docs.google.com/document/d/1N9OXfCWcp68jsxYVmSt9tPgnZRV_zk8rwwFh0BxXZeE/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7171) Count(*) query on leaf level directory is not reading summary cache file.

2019-04-15 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7171:
-
Fix Version/s: 1.17.0

> Count(*) query on leaf level directory is not reading summary cache file.
> -
>
> Key: DRILL-7171
> URL: https://issues.apache.org/jira/browse/DRILL-7171
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
> Fix For: 1.17.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Since the leaf-level directory doesn't store the metadata directories file, 
> if the directories cache file is not present while reading the summary, the 
> cache is assumed to be possibly corrupt and reading of the summary cache 
> file is skipped. The metadata directories cache file should be created at the 
> leaf level.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7175) Configuration of store.parquet.use_new_reader in drill-override.conf has no effect

2019-04-14 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7175:


Assignee: Venkata Jyothsna Donapati

> Configuration of store.parquet.use_new_reader in drill-override.conf has no 
> effect
> --
>
> Key: DRILL-7175
> URL: https://issues.apache.org/jira/browse/DRILL-7175
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: benj
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
>
> As reported on the Drill user mailing list 
> ([http://mail-archives.apache.org/mod_mbox/drill-user/201904.mbox/%3ceec5a6bc-fa95-44ce-d2b6-c02c1bfd0...@laposte.net%3e]),
> it's possible to configure any Drill system option (sys.option) in 
> drill-override.conf.
>  Example with _drill.exec.storage.file.partition.column.label_:
> {code:java}
> drill.exec: {
>   cluster-id: "drillbits-test",
>   ...
> },
> drill.exec.options: {
>   drill.exec.storage.file.partition.column.label: "drill_dir",
>   ...
> }
> {code}
> But configuring the particular option *store.parquet.use_new_reader* 
> in the same way has absolutely no effect.
> I have no idea whether this is related to the "Not supported in this release" 
> description found for this option, but it seems at the very least strange that 
> it's possible to configure it with ALTER SESSION/SYSTEM and not in 
> drill-override.conf.
> As an additional point, I had a little trouble finding that the options have 
> to go into *drill.exec.options:*, so I propose adding to 
> drill-override-example.conf a sample configuration of an option to avoid 
> this trouble,
> even while keeping the mention found in drill-module.conf if necessary:
> {code:json}
> # Users are not supposed to set these options in the drill-override.conf file.
> # Users should use ALTER SYSTEM and ALTER SESSION to set the options.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7175) Configuration of store.parquet.use_new_reader in drill-override.conf has no effect

2019-04-14 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7175:
-
Fix Version/s: 1.17.0

> Configuration of store.parquet.use_new_reader in drill-override.conf has no 
> effect
> --
>
> Key: DRILL-7175
> URL: https://issues.apache.org/jira/browse/DRILL-7175
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: benj
>Assignee: Venkata Jyothsna Donapati
>Priority: Minor
> Fix For: 1.17.0
>
>
> As reported on the Drill user mailing list 
> ([http://mail-archives.apache.org/mod_mbox/drill-user/201904.mbox/%3ceec5a6bc-fa95-44ce-d2b6-c02c1bfd0...@laposte.net%3e]),
> it's possible to configure any Drill system option (sys.option) in 
> drill-override.conf.
>  Example with _drill.exec.storage.file.partition.column.label_:
> {code:java}
> drill.exec: {
>   cluster-id: "drillbits-test",
>   ...
> },
> drill.exec.options: {
>   drill.exec.storage.file.partition.column.label: "drill_dir",
>   ...
> }
> {code}
> But configuring the particular option *store.parquet.use_new_reader* 
> in the same way has absolutely no effect.
> I have no idea whether this is related to the "Not supported in this release" 
> description found for this option, but it seems at the very least strange that 
> it's possible to configure it with ALTER SESSION/SYSTEM and not in 
> drill-override.conf.
> As an additional point, I had a little trouble finding that the options have 
> to go into *drill.exec.options:*, so I propose adding to 
> drill-override-example.conf a sample configuration of an option to avoid 
> this trouble,
> even while keeping the mention found in drill-module.conf if necessary:
> {code:json}
> # Users are not supposed to set these options in the drill-override.conf file.
> # Users should use ALTER SYSTEM and ALTER SESSION to set the options.
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7135) Upgrade to Jetty 9.4

2019-04-10 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7135:
-
Fix Version/s: (was: Future)
   1.17.0

> Upgrade to Jetty 9.4
> 
>
> Key: DRILL-7135
> URL: https://issues.apache.org/jira/browse/DRILL-7135
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Vitalii Diravka
>Priority: Minor
> Fix For: 1.17.0
>
>
> Initially DRILL-7051 updated Jetty to version 9.4 and DRILL-7081 updated 
> Jersey to version 2.28. These versions work fine for Drill with 
> Hadoop versions below 3.0.
>  Starting from Hadoop 3.0, Hadoop uses 
> [org.eclipse.jetty|https://github.com/apache/hadoop/blob/branch-3.0/hadoop-project/pom.xml#L38]
>  version 9.3, which conflicts with newer Jetty versions.
> Drill can update the Jetty and Jersey versions after resolution of HADOOP-14930 
> and HBASE-19256.
>  Alternatively, these libs can be shaded in Drill, but there is no real 
> reason to do that nowadays.
> See details in PR 
> [#1681|https://github.com/apache/drill/pull/1681#discussion_r265904521].
> _Notes_: 
> * For the Jersey update it is necessary to add 
> org.glassfish.jersey.inject:jersey-hk2 to Drill to solve all compilation 
> failures.
> * See doc for Jetty update: 
> https://www.eclipse.org/jetty/documentation/9.4.x/upgrading-jetty.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7135) Upgrade to Jetty 9.4

2019-04-10 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7135:


Assignee: Arina Ielchiieva

> Upgrade to Jetty 9.4
> 
>
> Key: DRILL-7135
> URL: https://issues.apache.org/jira/browse/DRILL-7135
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Vitalii Diravka
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.17.0
>
>
> Initially DRILL-7051 updated Jetty to version 9.4 and DRILL-7081 updated 
> Jersey to version 2.28. These versions work fine for Drill with 
> Hadoop versions below 3.0.
>  Starting from Hadoop 3.0, Hadoop uses 
> [org.eclipse.jetty|https://github.com/apache/hadoop/blob/branch-3.0/hadoop-project/pom.xml#L38]
>  version 9.3, which conflicts with newer Jetty versions.
> Drill can update the Jetty and Jersey versions after resolution of HADOOP-14930 
> and HBASE-19256.
>  Alternatively, these libs can be shaded in Drill, but there is no real 
> reason to do that nowadays.
> See details in PR 
> [#1681|https://github.com/apache/drill/pull/1681#discussion_r265904521].
> _Notes_: 
> * For the Jersey update it is necessary to add 
> org.glassfish.jersey.inject:jersey-hk2 to Drill to solve all compilation 
> failures.
> * See doc for Jetty update: 
> https://www.eclipse.org/jetty/documentation/9.4.x/upgrading-jetty.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7162) Apache Drill uses 3rd Party with Highest CVEs

2019-04-10 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7162:
-
Fix Version/s: 1.17.0

>  Apache Drill uses 3rd Party with Highest CVEs
> --
>
> Key: DRILL-7162
> URL: https://issues.apache.org/jira/browse/DRILL-7162
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0, 1.14.0, 1.15.0
>Reporter: Ayush Sharma
>Priority: Major
> Fix For: 1.17.0
>
>
> Apache Drill uses 3rd-party libraries with 250+ CVEs.
> Most of the CVEs are in the older version of Jetty (9.1.x), whereas the 
> current version of Jetty is 9.4.x.
> Many of the other libraries are also at EOL versions and are not patched 
> even in the latest release.
> This creates a security issue when we use it in production.
> We are able to replace many older versions of libraries with the latest 
> CVE-free versions; however, many of them are not replaceable as-is and 
> would require some changes in the source code.
> The Jetty version is the highest priority and needs migration to 9.4.x 
> immediately.
>  
> Please look into this issue with immediate priority, as it compromises the 
> security of applications utilizing Apache Drill.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7049) REST API returns the toString of byte arrays (VARBINARY types)

2019-04-04 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7049:


Assignee: jean-claude

> REST API returns the toString of byte arrays (VARBINARY types)
> --
>
> Key: DRILL-7049
> URL: https://issues.apache.org/jira/browse/DRILL-7049
> Project: Apache Drill
>  Issue Type: Bug
>  Components:  Server, Web Server
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Minor
> Fix For: 1.16.0
>
>
> Doing a query using the REST API will return VARBINARY columns as a Java byte 
> array hashcode instead of the actual data of the VARBINARY.
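> This is the generic Java pitfall of calling {{toString()}} on a byte array; a 
> minimal plain-Java illustration (not Drill code):
> {code:java}
> import java.util.Base64;
> 
> public class ByteArrayToString {
>   public static void main(String[] args) {
>     byte[] value = {0x44, 0x72, 0x69, 0x6C, 0x6C};  // the bytes of "Drill"
>     // What the REST response effectively shows: type marker plus hashcode,
>     // something like "[B@6f2b958e" (the exact hex digits vary per run).
>     System.out.println(value.toString());
>     // What a client actually needs: an encoding of the bytes themselves,
>     // e.g. Base64 (prints "RHJpbGw=").
>     System.out.println(Base64.getEncoder().encodeToString(value));
>   }
> }
> {code}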



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7149) Kerberos Code Missing from Drill on YARN

2019-04-03 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16808919#comment-16808919
 ] 

Pritesh Maker commented on DRILL-7149:
--

[~agirish] any thoughts?

> Kerberos Code Missing from Drill on YARN
> 
>
> Key: DRILL-7149
> URL: https://issues.apache.org/jira/browse/DRILL-7149
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Security
>Affects Versions: 1.14.0
>Reporter: Charles Givre
>Priority: Blocker
>
> My company is trying to deploy Drill using Drill on YARN (DoY), and we 
> have run into the issue that DoY does not seem to support passing Kerberos 
> credentials in order to interact with HDFS. 
> Upon checking the source code available in Git 
> (https://github.com/apache/drill/blob/1.14.0/drill-yarn/src/main/java/org/apache/drill/yarn/core/)
>  and referring to the Apache YARN documentation 
> (https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnApplicationSecurity.html)
>  , we saw no section for passing the security credentials needed by the 
> application to interact with any Hadoop cluster services and applications. 
> We feel this needs to be added to the source code so that delegation tokens 
> can be passed inside the container for the process to be able to access the 
> Drill archive on HDFS and start. It probably should be added to the 
> ContainerLaunchContext within the ApplicationSubmissionContext for DoY, as 
> suggested in the Apache documentation.
>  
> We tried the same DoY utility on a non-Kerberized cluster and the process 
> started well, although we ran into a different issue there of hosts getting 
> blacklisted.
> We tested with the Single Principal per cluster option.
>  
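> For reference, a minimal sketch of the standard YARN pattern the report points 
> to, using stock Hadoop/YARN APIs (this is not existing DoY code, and the renewer 
> principal below is a placeholder):
> {code:java}
> import java.nio.ByteBuffer;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.io.DataOutputBuffer;
> import org.apache.hadoop.security.Credentials;
> import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
> 
> public class DoYTokenSketch {
>   // Obtain HDFS delegation tokens and serialize them for the launch context.
>   public static ByteBuffer collectHdfsTokens(Configuration conf) throws Exception {
>     Credentials credentials = new Credentials();
>     FileSystem fs = FileSystem.get(conf);
>     fs.addDelegationTokens("rm/_HOST@EXAMPLE.COM", credentials);  // placeholder renewer
>     DataOutputBuffer dob = new DataOutputBuffer();
>     credentials.writeTokenStorageToStream(dob);
>     return ByteBuffer.wrap(dob.getData(), 0, dob.getLength());
>   }
> 
>   // The step the report suggests is missing from DoY: attaching the tokens
>   // to the container launch context before submission.
>   public static void attach(ContainerLaunchContext clc, ByteBuffer tokens) {
>     clc.setTokens(tokens);
>   }
> }
> {code}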



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7153) Drill Fails to Build using JDK 1.8.0_65

2019-04-03 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7153:
-
Reviewer: Volodymyr Vysotskyi

> Drill Fails to Build using JDK 1.8.0_65
> ---
>
> Key: DRILL-7153
> URL: https://issues.apache.org/jira/browse/DRILL-7153
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Blocker
> Fix For: 1.16.0
>
>
> Drill fails to build when using Java 1.8.0_65. It throws the following error:
> {noformat}
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile 
> (default-compile) on project drill-java-exec: Compilation failure
> [ERROR] 
> /Users/cgivre/github/drill-dev/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/FilterEvaluatorUtils.java:[59,68]
>  error: unreported exception E; must be caught or declared to be thrown
> [ERROR]   where E,T,V are type-variables:
> [ERROR] E extends Exception declared in method 
> accept(ExprVisitor<T,V,E>,V)
> [ERROR] T extends Object declared in method 
> accept(ExprVisitor<T,V,E>,V)
> [ERROR] V extends Object declared in method 
> accept(ExprVisitor<T,V,E>,V)
> [ERROR]
> [ERROR] -> [Help 1]
> [ERROR]
> [ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
> switch.
> [ERROR] Re-run Maven using the -X switch to enable full debug logging.
> [ERROR]
> [ERROR] For more information about the errors and possible solutions, please 
> read the following articles:
> [ERROR] [Help 1] 
> http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
> [ERROR]
> [ERROR] After correcting the problems, you can resume the build with the 
> command
> [ERROR]   mvn <goals> -rf :drill-java-exec
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7136) Num_buckets for HashAgg in profile may be inaccurate

2019-04-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7136:


Assignee: Gautam Parai  (was: Pritesh Maker)

> Num_buckets for HashAgg in profile may be inaccurate
> 
>
> Key: DRILL-7136
> URL: https://issues.apache.org/jira/browse/DRILL-7136
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Robert Hou
>Assignee: Gautam Parai
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: 23650ee5-6721-8a8f-7dd3-f5dd09a3a7b0.sys.drill
>
>
> I ran TPCH query 17 with sf 1000.  Here is the query:
> {noformat}
> select
>   sum(l.l_extendedprice) / 7.0 as avg_yearly
> from
>   lineitem l,
>   part p
> where
>   p.p_partkey = l.l_partkey
>   and p.p_brand = 'Brand#13'
>   and p.p_container = 'JUMBO CAN'
>   and l.l_quantity < (
> select
>   0.2 * avg(l2.l_quantity)
> from
>   lineitem l2
> where
>   l2.l_partkey = p.p_partkey
>   );
> {noformat}
> One of the hash agg operators has resized 6 times.  It should have 4M 
> buckets.  But the profile shows it has 64K buckets.
> I have attached a sample profile.  In this profile, the hash agg operator is 
> (04-02).
> {noformat}
> Operator Metrics for Minor Fragment 04-00-02:
> NUM_BUCKETS             65,536
> NUM_ENTRIES             748,746
> NUM_RESIZING            6
> RESIZING_TIME_MS        364
> NUM_PARTITIONS          1
> SPILLED_PARTITIONS      
> SPILL_MB                582
> SPILL_CYCLE             0
> INPUT_BATCH_COUNT       813
> AVG_INPUT_BATCH_BYTES   582,653
> AVG_INPUT_ROW_BYTES     18
> INPUT_RECORD_COUNT      26,316,456
> OUTPUT_BATCH_COUNT      401
> AVG_OUTPUT_BATCH_BYTES  1,631,943
> AVG_OUTPUT_ROW_BYTES    25
> OUTPUT_RECORD_COUNT     26,176,350
> {noformat}
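> For reference, the expected count follows from the table doubling on each 
> resize, assuming it started at 64K buckets (an inference from the reported 
> numbers, not a confirmed default):
> {code:java}
> public class BucketMath {
>   public static void main(String[] args) {
>     int initialBuckets = 65_536;                   // 64K
>     int numResizing = 6;                           // NUM_RESIZING from the profile
>     int expected = initialBuckets << numResizing;  // prints 4194304, i.e. 4M
>     System.out.println(expected);
>   }
> }
> {code}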



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7139) Date_add() can produce incorrect results when adding to a timestamp

2019-03-28 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7139:


Assignee: Arina Ielchiieva  (was: Pritesh Maker)

> Date_add() can produce incorrect results when adding to a timestamp
> ---
>
> Key: DRILL-7139
> URL: https://issues.apache.org/jira/browse/DRILL-7139
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.15.0
>Reporter: Robert Hou
>Assignee: Arina Ielchiieva
>Priority: Major
>
> I am using date_add() to create a sequence of timestamps:
> {noformat}
> select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107374,'M') 
> as interval minute)) timestamp_id from (values(1));
> +--+
> |   timestamp_id   |
> +--+
> | 1970-01-25 20:31:12.704  |
> +--+
> 1 row selected (0.121 seconds)
> {noformat}
> When I add one more, I get an older timestamp:
> {noformat}
> select date_add(timestamp '1970-01-01 00:00:00', cast(concat('PT',107375,'M') 
> as interval minute)) timestamp_id from (values(1));
> +--+
> |   timestamp_id   |
> +--+
> | 1969-12-07 03:29:25.408  |
> +--+
> 1 row selected (0.126 seconds)
> {noformat}
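> The two results are consistent with the interval's minutes being converted to 
> milliseconds in 32-bit arithmetic somewhere along the way (an assumption, not a 
> confirmed root cause); a plain-Java illustration:
> {code:java}
> import java.util.Date;
> 
> public class MinuteOverflow {
>   public static void main(String[] args) {
>     // 107,374 min = 6,442,440,000 ms, which wraps to 2,147,472,704 as a 32-bit
>     // int -- exactly 1970-01-25 20:31:12.704 UTC, the first result above.
>     int ok = 107_374 * 60_000;
>     // 107,375 min = 6,442,500,000 ms wraps past Integer.MAX_VALUE to
>     // -2,147,434,592 -- exactly 1969-12-07 03:29:25.408 UTC, the second result.
>     int wrapped = 107_375 * 60_000;
>     System.out.println(new Date(ok));       // Date.toString() uses the local timezone
>     System.out.println(new Date(wrapped));
>     // 64-bit math gives the correct timestamp (in mid-March 1970).
>     System.out.println(new Date(107_375 * 60_000L));
>   }
> }
> {code}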



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.

2019-03-21 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7118:
-
Reviewer: Aman Sinha

> Filter not getting pushed down on MapR-DB tables.
> -
>
> Key: DRILL-7118
> URL: https://issues.apache.org/jira/browse/DRILL-7118
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning  Optimization
>Affects Versions: 1.15.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
> Fix For: 1.16.0
>
>
> A simple IS NULL filter is not being pushed down for MapR-DB tables. Here 
> is a repro:
> {code:java}
> 0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b 
> is null;
> ANTLR Tool version 4.5 used for code generation does not match the current 
> runtime version 4.7.1ANTLR Runtime version 4.5 used for parser compilation 
> does not match the current runtime version 4.7.1ANTLR Tool version 4.5 used 
> for code generation does not match the current runtime version 4.7.1ANTLR 
> Runtime version 4.5 used for parser compilation does not match the current 
> runtime version 
> 4.7.1
> +--+--+
> | text | json |
> +--+--+
> | 00-00 Screen
> 00-01 Project(**=[$0])
> 00-02 Project(T0¦¦**=[$0])
> 00-03 SelectionVectorRemover
> 00-04 Filter(condition=[IS NULL($1)])
> 00-05 Project(T0¦¦**=[$0], b=[$1])
> 00-06 Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan 
> [ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, 
> `b`], maxwidth=1]])
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7107) Unable to connect to Drill 1.15 through ZK

2019-03-15 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7107:
-
Fix Version/s: 1.16.0

> Unable to connect to Drill 1.15 through ZK
> --
>
> Key: DRILL-7107
> URL: https://issues.apache.org/jira/browse/DRILL-7107
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.16.0
>
>
> After upgrading to Drill 1.15, users are finding that they are no longer able to 
> connect to Drill using a ZK quorum. They are getting the following "Unable to 
> setup ZK for client" error:
> [~]$ sqlline -u "jdbc:drill:zk=172.16.2.165:5181;auth=maprsasl"
> Error: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client. 
> (state=,code=0)
> java.sql.SQLNonTransientConnectionException: Failure in connecting to Drill: 
> org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for client.
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:174)
>  at 
> org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:67)
>  at 
> org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:67)
>  at 
> org.apache.calcite.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:138)
>  at org.apache.drill.jdbc.Driver.connect(Driver.java:72)
>  at sqlline.DatabaseConnection.connect(DatabaseConnection.java:130)
>  at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:179)
>  at sqlline.Commands.connect(Commands.java:1247)
>  at sqlline.Commands.connect(Commands.java:1139)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
>  at sqlline.SqlLine.dispatch(SqlLine.java:722)
>  at sqlline.SqlLine.initArgs(SqlLine.java:416)
>  at sqlline.SqlLine.begin(SqlLine.java:514)
>  at sqlline.SqlLine.start(SqlLine.java:264)
>  at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.exec.rpc.RpcException: Failure setting up ZK for 
> client.
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:340)
>  at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.<init>(DrillConnectionImpl.java:165)
>  ... 18 more
> Caused by: java.lang.NullPointerException
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.findACLProvider(ZKACLProviderFactory.java:68)
>  at 
> org.apache.drill.exec.coord.zk.ZKACLProviderFactory.getACLProvider(ZKACLProviderFactory.java:47)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.<init>(ZKClusterCoordinator.java:114)
>  at 
> org.apache.drill.exec.coord.zk.ZKClusterCoordinator.<init>(ZKClusterCoordinator.java:86)
>  at org.apache.drill.exec.client.DrillClient.connect(DrillClient.java:337)
>  ... 19 more
> Apache Drill 1.15.0.0
> "This isn't your grandfather's SQL."
> sqlline>
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7102) Apache Metrics WEBUI Unavailable

2019-03-15 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794044#comment-16794044
 ] 

Pritesh Maker commented on DRILL-7102:
--

[~agirish] any thoughts on this? Did you encounter anything like this in the 
Kubernetes work that you are doing?

> Apache Metrics WEBUI Unavailable 
> -
>
> Key: DRILL-7102
> URL: https://issues.apache.org/jira/browse/DRILL-7102
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.15.0
> Environment: kubernetes v1.13.2
> ubuntu:18.04
> Apache Drill 1.15.0
> 64GB RAM
> 8 vCpu Cores
>Reporter: Gene
>Priority: Minor
> Attachments: Screen Shot 2019-03-13 at 1.16.14 PM.png, Screen Shot 
> 2019-03-14 at 2.44.37 PM.png
>
>
> Apache Drill Metrics unavailable in webUI when exposed through NodePort in 
> Kubernetes.
> Error:
> {code:java}
> Failed to load resource: net::ERR_CONNECTION_REFUSED
> {code}
> The browser is unable to resolve the requested URL.
> Maybe we could have a feature that lets us change the resource name the 
> browser looks for.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5658) Documentation for Drill Crypto Functions

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-5658:
-
Issue Type: Task  (was: Improvement)

> Documentation for Drill Crypto Functions
> 
>
> Key: DRILL-5658
> URL: https://issues.apache.org/jira/browse/DRILL-5658
> Project: Apache Drill
>  Issue Type: Task
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Charles Givre
>Assignee: Bridget Bevens
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> Attached is the documentation for the crypto functions that are being added 
> to Drill 1.11.0.
> https://gist.github.com/cgivre/63b25bdc85159bec484f069406858adc



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6965) Adjust table function usage for all storage plugins and implement schema parameter

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6965:
-
Fix Version/s: (was: 1.16.0)

> Adjust table function usage for all storage plugins and implement schema 
> parameter
> --
>
> Key: DRILL-6965
> URL: https://issues.apache.org/jira/browse/DRILL-6965
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
>
> Design doc - 
> https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6806) Start moving code for handling a partition in HashAgg into a separate class

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6806:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Start moving code for handling a partition in HashAgg into a separate class
> ---
>
> Key: DRILL-6806
> URL: https://issues.apache.org/jira/browse/DRILL-6806
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.17.0
>
>
> Since this involves a lot of refactoring this will be a multiple PR effort.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6835) Schema Provision using File / Table Function

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6835:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Schema Provision using File / Table Function
> 
>
> Key: DRILL-6835
> URL: https://issues.apache.org/jira/browse/DRILL-6835
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.17.0
>
>
> Schema Provision using File / Table Function design document:
> https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6543:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.17.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., a default of 25%). This *allowance* may prevent many 
> of the cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN - the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> clearly explains this issue (that doc suggests reserving a 50% allowance, as it 
> was written when the Hash-Join was non-buffered; i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the highest calculated between 
> the MQMPN and the PPQ. The new reserve option would verify that this figure 
> allows the allowance.
>  
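> Illustrative arithmetic only for the proposed check; the option names come from 
> this description, while the 25% reserve and the error text are hypothetical:
> {code:java}
> public class AllowanceCheck {
>   public static void main(String[] args) {
>     long directMemory = 100L << 20;   // 100 MB of Direct Memory
>     double reserve = 0.25;            // proposed allowance for non-buffered operators
>     long mqmpn = 100L << 20;          // planner.memory.max_query_memory_per_node
>     long usableByBuffered = (long) (directMemory * (1 - reserve));  // 75 MB
>     if (mqmpn > usableByBuffered) {
>       // The query would fail at initiation instead of OOM-ing later when the
>       // non-buffered operators grab their share of Direct Memory.
>       throw new IllegalStateException(
>           "max_query_memory_per_node exceeds Direct Memory minus the reserve");
>     }
>   }
> }
> {code}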



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6845) Eliminate duplicates for Semi Hash Join

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6845:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Eliminate duplicates for Semi Hash Join
> ---
>
> Key: DRILL-6845
> URL: https://issues.apache.org/jira/browse/DRILL-6845
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.17.0
>
>
> Following DRILL-6735: The performance of the new Semi Hash Join may degrade 
> if the build side contains an excessive number of join-key duplicate rows; this 
> is mainly a result of the need to store all those rows first, before the hash 
> table is built.
>   Proposed solution: For Semi, the Hash Agg would create a Hash-Table 
> initially, and use it to eliminate key-duplicate rows as they arrive.
>   Proposed extra: That Hash-Table has an added cost (e.g. resizing), so 
> perform "runtime stats": check the initial number of incoming rows (e.g. 32k), 
> and if the number of duplicates is less than some threshold (e.g. 20%), 
> cancel that "early" hash table.
>  
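> A minimal sketch of that proposed heuristic; all names and numbers here are 
> illustrative, not existing Drill code:
> {code:java}
> import java.util.HashSet;
> import java.util.Set;
> 
> public class EarlyDedupHeuristic {
>   private static final int SAMPLE_ROWS = 32_768;    // e.g. the first 32k rows
>   private static final double MIN_DUP_RATIO = 0.20; // e.g. the 20% threshold
> 
>   // Decide, after sampling the first rows, whether the "early" hash table
>   // eliminates enough duplicates to justify its cost (e.g. resizing).
>   public static boolean keepEarlyHashTable(long[] joinKeyHashes) {
>     Set<Long> seen = new HashSet<>();
>     int sampled = 0;
>     int duplicates = 0;
>     for (long keyHash : joinKeyHashes) {
>       if (sampled == SAMPLE_ROWS) {
>         break;
>       }
>       sampled++;
>       if (!seen.add(keyHash)) {
>         duplicates++;
>       }
>     }
>     return sampled > 0 && (double) duplicates / sampled >= MIN_DUP_RATIO;
>   }
> }
> {code}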



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6951) Merge row set based mock data source

2019-03-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6951:
-
Fix Version/s: (was: 1.16.0)

> Merge row set based mock data source
> 
>
> Key: DRILL-6951
> URL: https://issues.apache.org/jira/browse/DRILL-6951
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> The mock reader framework is an obscure bit of code used in tests that 
> generates fake data for use in things like testing sort, filters and so on.
> Because the mock reader is simple, it is a good demonstration case for the 
> new scanner framework based on the result set loader. This task merges the 
> existing work in migrating the mock data source into master via a PR.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6953) Merge row set-based JSON reader

2019-03-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6953:
-
Fix Version/s: (was: 1.16.0)

> Merge row set-based JSON reader
> ---
>
> Key: DRILL-6953
> URL: https://issues.apache.org/jira/browse/DRILL-6953
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> The final step in the ongoing "result set loader" saga is to merge the 
> revised JSON reader into master. This reader does three key things:
> * Demonstrates the prototypical "late schema" style of data reading (discover 
> schema while reading).
> * Implements many tricks and hacks to handle schema changes while loading.
> * Shows that, even with all these tricks, the only true solution is to 
> actually have a schema.
> The new JSON reader:
> * Uses an expanded state machine when parsing rather than the complex set of 
> if-statements in the current version.
> * Handles reading a run of nulls before seeing the first data value (as long 
> as the data value shows up in the first record batch).
> * Uses the result-set loader to generate fixed-size batches regardless of the 
> complexity, depth of structure, or width of variable-length fields.
> While the JSON reader itself is helpful, the key contribution is that it 
> shows how to use the entire kit of parts: result set loader, projection 
> framework, and so on. Since the projection framework can handle an external 
> schema, it is also a handy foundation for the ongoing schema project.
> Key work to complete after this merger will be to reconcile actual data with 
> the external schema. For example, if we know a column is supposed to be a 
> VarChar, then read the column as a VarChar regardless of the type JSON itself 
> picks. Or, if a column is supposed to be a Double, then convert Int and 
> String JSON values into Doubles.
> The Row Set framework was designed to allow inserting custom column writers. 
> This would be a great opportunity to do the work needed to create them. Then, 
> use the new JSON framework to allow parsing a JSON field as a specified Drill 
> type.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6896) Extraneous columns being projected past a join

2019-03-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6896:
-
Fix Version/s: (was: 1.16.0)
   Future

> Extraneous columns being projected past a join
> --
>
> Key: DRILL-6896
> URL: https://issues.apache.org/jira/browse/DRILL-6896
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Karthikeyan Manivannan
>Assignee: Aman Sinha
>Priority: Major
> Fix For: Future
>
>
> [~rhou] noted that TPCH13 on Drill 1.15 was running slower than Drill 1.14. 
> Analysis revealed that an extra column was being projected in 1.15 and the 
> slowdown was because the extra column was being unnecessarily pushed across 
> an exchange.
> Here is a simplified query written by [~amansinha100] that exhibits the same 
> problem:
> In the first plan (on 1.15.0), o_custkey and o_comment are both extraneous 
> projections. 
>  In the second plan (on 1.14.0), there is also an extraneous projection, 
> o_custkey, but not o_comment.
> On 1.15.0:
> -
> {noformat}
> explain plan without implementation for 
> select
>   c.c_custkey
> from
>cp.`tpch/customer.parquet` c 
>  left outer join cp.`tpch/orders.parquet` o 
>   on c.c_custkey = o.o_custkey
>  and o.o_comment not like '%special%requests%'
>;
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$2], o_custkey=[$0], o_comment=[$1])
>   DrillJoinRel(condition=[=($2, $0)], joinType=[right])
> DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
>   DrillScanRel(table=[[cp, tpch/orders.parquet]], 
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
> DrillScanRel(table=[[cp, tpch/customer.parquet]], 
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/customer.parquet]], 
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}
> On 1.14.0:
> -
> {noformat}
> DrillScreenRel
>   DrillProjectRel(c_custkey=[$0])
> DrillProjectRel(c_custkey=[$1], o_custkey=[$0])
>   DrillJoinRel(condition=[=($1, $0)], joinType=[right])
> DrillProjectRel(o_custkey=[$0])
>   DrillFilterRel(condition=[NOT(LIKE($1, '%special%requests%'))])
> DrillScanRel(table=[[cp, tpch/orders.parquet]], 
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/orders.parquet]], 
> selectionRoot=classpath:/tpch/orders.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`o_custkey`, `o_comment`]]])
> DrillScanRel(table=[[cp, tpch/customer.parquet]], 
> groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath 
> [path=classpath:/tpch/customer.parquet]], 
> selectionRoot=classpath:/tpch/customer.parquet, numFiles=1, numRowGroups=1, 
> usedMetadataFile=false, columns=[`c_custkey`]]])
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7091) Query with EXISTS and correlated subquery fails with NPE in HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl

2019-03-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7091:


Assignee: Boaz Ben-Zvi

> Query with EXISTS and correlated subquery fails with NPE in 
> HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl
> --
>
> Key: DRILL-7091
> URL: https://issues.apache.org/jira/browse/DRILL-7091
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Boaz Ben-Zvi
>Priority: Major
>
> Steps to reproduce:
> 1. Create view:
> {code:sql}
> create view dfs.tmp.nation_view as select * from cp.`tpch/nation.parquet`;
> {code}
> 2. Run the following query:
> {code:sql}
> SELECT n_nationkey, n_name
> FROM  dfs.tmp.nation_view a
> WHERE EXISTS (SELECT 1
> FROM cp.`tpch/region.parquet` b
> WHERE b.r_regionkey =  a.n_regionkey)
> {code}
> This query fails with NPE:
> {noformat}
> [Error Id: 9a592635-f792-4403-965c-bd2eece7e8fc on cv1:31010]
>   at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:364)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:219)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:330)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_161]
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_161]
>   at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl.initialize(HashJoinMemoryCalculatorImpl.java:267)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:959)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:525)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.test.generated.HashAggregatorGen2.doWork(HashAggTemplate.java:642)
>  ~[na:na]
>   at 
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:295)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> 

[jira] [Updated] (DRILL-7075) Fix debian package issue with control files

2019-03-05 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7075:
-
Fix Version/s: (was: Future)
   1.16.0

> Fix debian package issue with control files
> ---
>
> Key: DRILL-7075
> URL: https://issues.apache.org/jira/browse/DRILL-7075
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: Future
> Environment: Verified under Ubuntu OS installed on ARM64
>Reporter: Naresh Bhat
>Assignee: Naresh Bhat
>Priority: Major
> Fix For: 1.16.0
>
>
> There is a debian package issue with control files. The master branch is broken: 
> while generating the debian package, the build looks for control files under 
> distribution/src/deb/. I don't think it is a good idea to remove the control 
> files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7060) Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' in JsonReader

2019-02-28 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7060:
-
Reviewer: Sorabh Hamirwasia

> Support JsonParser Feature 'ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER' in 
> JsonReader
> -
>
> Key: DRILL-7060
> URL: https://issues.apache.org/jira/browse/DRILL-7060
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - JSON
>Affects Versions: 1.15.0, 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.16.0
>
>
> Some JSON files may have strings with backslashes, which are read as escape 
> characters. By default, only standard escape characters are allowed, so 
> querying such files fails. For example:
> Data
> {code}
> {"file":"C:\Sfiles\escape.json"}
> {code}
> Error
> {code}
> (com.fasterxml.jackson.core.JsonParseException) Unrecognized character escape 
> 'S' (code 83)
>  at [Source: (org.apache.drill.exec.store.dfs.DrillFSDataInputStream); line: 
> 1, column: 178]
> com.fasterxml.jackson.core.JsonParser._constructError():1804
> com.fasterxml.jackson.core.base.ParserMinimalBase._reportError():663
> 
> com.fasterxml.jackson.core.base.ParserMinimalBase._handleUnrecognizedCharacterEscape():640
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeEscaped():3243
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser._skipString():2537
> com.fasterxml.jackson.core.json.UTF8StreamJsonParser.nextToken():683
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeData():342
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeDataSwitch():298
> org.apache.drill.exec.vector.complex.fn.JsonReader.writeToVector():246
> org.apache.drill.exec.vector.complex.fn.JsonReader.write():205
> org.apache.drill.exec.store.easy.json.JSONRecordReader.next():216
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():223
> ...
> ...
> {code}
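For reference, the Jackson feature named in the title can be enabled on the parser factory. A minimal sketch with plain jackson-core (the wiring into Drill's JsonReader is not shown; the class name here is made up):

{code:java}
import com.fasterxml.jackson.core.JsonFactory;
import com.fasterxml.jackson.core.JsonParser;

public class LenientEscapes {
  public static void main(String[] args) throws Exception {
    JsonFactory factory = new JsonFactory();
    // Treat a backslash before any character as an escape for that character,
    // instead of failing with "Unrecognized character escape"
    factory.enable(JsonParser.Feature.ALLOW_BACKSLASH_ESCAPING_ANY_CHARACTER);

    // Same shape as the data above: contains the non-standard escapes \S and \e
    String json = "{\"file\":\"C:\\Sfiles\\escape.json\"}";
    try (JsonParser parser = factory.createParser(json)) {
      while (parser.nextToken() != null) {
        // with the feature enabled, tokens stream through without an exception
      }
    }
  }
}
{code}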



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7067) Querying parquet file with null value field error

2019-02-28 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780978#comment-16780978
 ] 

Pritesh Maker commented on DRILL-7067:
--

Thanks for reporting this issue [~jcachola] - are you able to attach a sample 
file to demonstrate the error?

> Querying parquet file with null value field error
> -
>
> Key: DRILL-7067
> URL: https://issues.apache.org/jira/browse/DRILL-7067
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
> Environment: Ubuntu 18.04.1 LTS 
> Apache Drill 1.15.0 
>Reporter: Jardhel Cachola
>Priority: Blocker
> Fix For: Future
>
>
> When we try to join two parquet tables and one of them has null values in any 
> field, the query doesn't run. It fails with the following error: 
> Error: SYSTEM ERROR: IllegalStateException: Failure while reading vector. 
> Expected vector class of org.apache.drill.exec.vector.NullableIntVector but 
> was holding vector class 
> org.apache.drill.exec.vector.NullableVarDecimalVector, field= [`id_cub` 
> (VARDECIMAL(38, 4):OPTIONAL), children=([`$bits$` (UINT1:REQUIRED)], 
> [`id_cub` (VARDECIMAL(38, 4):OPTIONAL), children=([`$offsets$` 
> (UINT4:REQUIRED)])])]
> Fragment 2:0
> Please, refer to logs for more information.
> [Error Id: 48f63255-c771-4809-8252-ef7a78fda31b on 
> ip-172-18-250-4.us-west-2.compute.internal:31010]
> (java.lang.IllegalStateException) Failure while reading vector. Expected 
> vector class of org.apache.drill.exec.vector.NullableIntVector but was 
> holding vector class org.apache.drill.exec.vector.NullableVarDecimalVector, 
> field= [`id_cub` (VARDECIMAL(38, 4):OPTIONAL), children=([`$bits$` 
> (UINT1:REQUIRED)], [`id_cub` (VARDECIMAL(38, 4):OPTIONAL), 
> children=([`$offsets$` (UINT4:REQUIRED)])])] 
>  org.apache.drill.exec.record.VectorContainer.getValueAccessorById():324
>  org.apache.drill.exec.record.RecordBatchLoader.getValueAccessorById():251
>  
> org.apache.drill.exec.physical.impl.unorderedreceiver.UnorderedReceiverBatch.getValueAccessorById():142
>  
> org.apache.drill.exec.test.generated.PartitionerGen1732$OutgoingRecordBatch.doSetup():114
>  
> org.apache.drill.exec.test.generated.PartitionerGen1732$OutgoingRecordBatch.initializeBatch():399
>  
> org.apache.drill.exec.test.generated.PartitionerGen1732.flushOutgoingBatches():185
>  
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$FlushBatchesHandlingClass.execute():285
>  
> org.apache.drill.exec.physical.impl.partitionsender.PartitionerDecorator$PartitionerTask.run():340
>  java.util.concurrent.ThreadPoolExecutor.runWorker():1149
>  java.util.concurrent.ThreadPoolExecutor$Worker.run():624
>  java.lang.Thread.run():748 (state=,code=0)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7054) timestamp in milliseconds

2019-02-28 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7054:
-
Fix Version/s: 1.16.0

> timestamp in milliseconds
> -
>
> Key: DRILL-7054
> URL: https://issues.apache.org/jira/browse/DRILL-7054
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Reporter: Angelo Mantellini
>Priority: Minor
> Fix For: 1.16.0
>
>
> It is important to be able to show timestamps with microsecond precision.
> timestamp currently has millisecond precision, and in some cases that is not 
> enough.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6739) Update Kafka libs to 2.0.0 version

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6739:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Update Kafka libs to 2.0.0 version
> --
>
> Key: DRILL-6739
> URL: https://issues.apache.org/jira/browse/DRILL-6739
> Project: Apache Drill
>  Issue Type: Task
>  Components: Storage - Kafka
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.17.0
>
>
> The current version of the Kafka libs is 0.11.0.1.
>  The latest version is 2.0.0 (September 2018): 
> https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
> Looks like the only changes needed are:
>  * replacing {{serverConfig()}} method with {{staticServerConfig()}} in Drill 
> {{EmbeddedKafkaCluster}} class
>  * Replacing deprecated {{AdminUtils}} with {{kafka.zk.AdminZkClient}} 
> [https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/core/src/main/scala/kafka/admin/AdminUtils.scala#L35]
>  https://issues.apache.org/jira/browse/KAFKA-6545
> The initial work: https://github.com/vdiravka/drill/commits/DRILL-6739
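For illustration, the broker-side admin API that replaces the deprecated Zookeeper-based {{AdminUtils}} in the 2.0.0 client libs looks roughly like this (a sketch using {{org.apache.kafka.clients.admin.AdminClient}}; the broker address and topic name are made up):

{code:java}
import java.util.Collections;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicSketch {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    try (AdminClient admin = AdminClient.create(props)) {
      // One partition, replication factor 1 -- enough for an embedded test cluster
      admin.createTopics(Collections.singleton(new NewTopic("drill-kafka-test", 1, (short) 1)))
          .all().get();
    }
  }
}
{code}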



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6465) Transitive closure is not working in Drill for Join with multiple local conditions

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6465:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Transitive closure is not working in Drill for Join with multiple local 
> conditions
> --
>
> Key: DRILL-6465
> URL: https://issues.apache.org/jira/browse/DRILL-6465
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Vitalii Diravka
>Priority: Minor
> Fix For: 1.17.0
>
> Attachments: drill.zip
>
>
> For several SQL operators, transitive closure does not work during Partition 
> Pruning and Filter Pushdown for the left table in a join.
>  If I use several local join conditions, Drill scans the full left table.
>  But if we move the additional conditions to the WHERE statement, transitive 
> closure works fine for all joined tables.
> *Query BETWEEN:*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y BETWEEN 1987 AND 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=8, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2])]{code}
> *Actual result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=16, partitions= [Partition(values:[1987, 5, 
> 1]), Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2]), Partition(values:[1990, 4, 1]), 
> Partition(values:[1990, 4, 2]), Partition(values:[1990, 5, 1]), 
> Partition(values:[1990, 5, 2]), Partition(values:[1991, 3, 1]), 
> Partition(values:[1991, 3, 2]), Partition(values:[1991, 3, 3]), 
> Partition(values:[1991, 3, 4])
> ]
> {code}
> *There is the same transitive closure behavior for these logical operators:*
>  * NOT IN
>  * LIKE
>  * NOT LIKE
> Also, transitive closure does not work during Partition Pruning and Filter 
> Pushdown for these comparison operators:
> *Query <*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y < 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=4, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2])]{code}
> *Actual result:*
> {code:java}
> 00-00 Screen
> 00-01 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-02 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-03 HashJoin(condition=[=($1, $6)], joinType=[inner])
> 00-05 Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:h_tab1), columns=[`**`], numPartitions=16, partitions= 
> [Partition(values:[1987, 5, 1]), Partition(values:[1987, 5, 2]), 
> Partition(values:[1987, 7, 1]), Partition(values:[1987, 7, 2]), 
> Partition(values:[1988, 11, 1]), Partition(values:[1988, 11, 2]), 
> Partition(values:[1988, 12, 1]), Partition(values:[1988, 12, 2]), 
> Partition(values:[1990, 4, 1]), Partition(values:[1990, 4, 2]), 
> Partition(values:[1990, 5, 1]), Partition(values:[1990, 5, 2]), 
> Partition(values:[1991, 3, 1]), Partition(values:[1991, 3, 2]), 
> Partition(values:[1991, 3, 3]), Partition(values:[1991, 3, 4])], 
> inputDirectories=[maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/1, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/2, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/3, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/4, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/5, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/6, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/7, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/8, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/9, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/10, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/11, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/12, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/13, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/14, 
> 

[jira] [Updated] (DRILL-6430) Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or Locally

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6430:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Drill Should Not Fail If It Sees Deprecated Options Stored In Zookeeper Or 
> Locally
> --
>
> Key: DRILL-6430
> URL: https://issues.apache.org/jira/browse/DRILL-6430
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.17.0
>
>
> This is required for resource management since we will likely remove many 
> options.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-3090) sqlline : save SQL to script file and replay from script, results in error

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-3090:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> sqlline : save SQL to script file and replay from script, results in error
> --
>
> Key: DRILL-3090
> URL: https://issues.apache.org/jira/browse/DRILL-3090
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.0.0
> Environment: ffbb9c7adc6360744bee186e1f69d47dc743f73e
>Reporter: Khurram Faraaz
>Assignee: Arina Ielchiieva
>Priority: Minor
> Fix For: 1.17.0
>
>
> Saving a SQL query to a script file and replaying the SQL from the script file 
> using !run at the sqlline prompt throws an error. We should not see the error 
> when we replay the SQL from the script file.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> !script file3
> Saving command script to "/opt/mapr/drill/drill-1.0.0/bin/file3". Enter 
> "script" with no arguments to stop it.
> 0: jdbc:drill:schema=dfs.tmp> select * from sys.drillbits;
> +-------------------+------------+---------------+------------+----------+
> |     hostname      | user_port  | control_port  | data_port  | current  |
> +-------------------+------------+---------------+------------+----------+
> | centos-04.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-02.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-01.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-03.qa.lab  | 31010      | 31011         | 31012      | true     |
> +-------------------+------------+---------------+------------+----------+
> 4 rows selected (0.176 seconds)
> 0: jdbc:drill:schema=dfs.tmp> !script
> Script closed. Enter "run /opt/mapr/drill/drill-1.0.0/bin/file3" to replay it.
> 0: jdbc:drill:schema=dfs.tmp> !run /opt/mapr/drill/drill-1.0.0/bin/file3
> 1/2  select * from sys.drillbits;
> +------------+------------+---------------+------------+----------+
> |  hostname  | user_port  | control_port  | data_port  | current  |
> +------------+------------+---------------+------------+----------+
> | centos-04  | 31010      | 31011         | 31012      | false    |
> | centos-02  | 31010      | 31011         | 31012      | false    |
> | centos-01  | 31010      | 31011         | 31012      | false    |
> | centos-03  | 31010      | 31011         | 31012      | true     |
> +------------+------------+---------------+------------+----------+
> 4 rows selected (0.178 seconds)
> 2/2  !script
> Usage: script 
> Aborting command set because "force" is false and command failed: "!script"
> {code}
> I looked at the contents of file3 under /opt/mapr/drill/drill-1.0.0/bin.
> There seems to be an extra "!script" entry in the file.
> {code}
> [root@centos-01 bin]# cat file3
> select * from sys.drillbits;
> !script
> [root@centos-01 bin]# 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6989) Upgrade to SqlLine 1.7.0

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6989:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Upgrade to SqlLine 1.7.0
> 
>
> Key: DRILL-6989
> URL: https://issues.apache.org/jira/browse/DRILL-6989
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.15.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.17.0
>
>
> Upgrade to SqlLine 1.7.0 after it is officially released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7038) Queries on partitioned columns scan the entire datasets

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7038:
-
Reviewer: Gautam Parai

> Queries on partitioned columns scan the entire datasets
> ---
>
> Key: DRILL-7038
> URL: https://issues.apache.org/jira/browse/DRILL-7038
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> For tables with hive-style partitions like
> {code}
> /table/2018/Q1
> /table/2018/Q2
> /table/2019/Q1
> etc.
> {code}
> if any of the following queries is run:
> {code}
> select distinct dir0 from dfs.`/table`
> {code}
> {code}
> select dir0 from dfs.`/table` group by dir0
> {code}
> it will actually scan every single record in the table rather than just 
> getting a list of directories at the dir0 level. This applies even when 
> cached metadata is available, and it is a big penalty, especially as the 
> datasets grow.
> To avoid such situations, a logical prune rule can be used to collect 
> partition columns (`dir0`), either from the metadata cache (if available) or 
> from the group scan, and drop unnecessary files from being read. The rule 
> will be applied under the following conditions (sketched below):
> 1) all queried columns are partition columns, and
> 2) either a {{DISTINCT}} or a {{GROUP BY}} operation is performed.
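A hypothetical sketch of that applicability check (the names are illustrative only, not Drill's actual planner API):

{code:java}
import java.util.List;
import java.util.Set;

public class PruneRuleSketch {
  // The scan can be pruned to directory metadata only when every queried
  // column is a partition column and the query aggregates duplicates away
  static boolean ruleApplies(List<String> queriedColumns,
                             Set<String> partitionColumns,
                             boolean hasDistinctOrGroupBy) {
    return hasDistinctOrGroupBy && partitionColumns.containsAll(queriedColumns);
  }
}
{code}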



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6839) Failed to plan (aggregate + Hash or NL join) when slice target is low

2019-02-27 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6839:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Failed to plan (aggregate + Hash or NL join) when slice target is low 
> --
>
> Key: DRILL-6839
> URL: https://issues.apache.org/jira/browse/DRILL-6839
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
> Fix For: 1.17.0
>
>
> *Case 1.* When nested loop join is about to be used:
>  - Option "_planner.enable_nljoin_for_scalar_only_" is set to false
>  - Option "_planner.slice_target_" is set to low value for imitation of big 
> input tables
>  
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testCrossJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       client.alterSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName(), false);
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           ", cp.`tpch/region.parquet` r")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Case 2.* When hash join is about to be used:
>  - Option "planner.enable_mergejoin" is set to false, so hash join will be 
> used instead
>  - Option "planner.slice_target" is set to low value for imitation of big 
> input tables
>  - Comment out //ruleList.add(HashJoinPrule.DIST_INSTANCE); in 
> PlannerPhase.getPhysicalRules method
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testInnerJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       client.alterSession(PlannerSettings.MERGEJOIN.getOptionName(), false);
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           "INNER JOIN cp.`tpch/region.parquet` r " +
>           "ON r.nation_id = l.nation_id")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.MERGEJOIN.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Workaround:* To avoid the exception, we need to set the option 
> "_planner.enable_multiphase_agg_" to false. By doing this we avoid 
> unsuccessful attempts to create a 2-phase aggregation plan in StreamAggPrule 
> and guarantee that the logical aggregate will be converted to a physical one. 
>  
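In the test style used above, the workaround amounts to one extra session option (a sketch; the option name is taken verbatim from the ticket text):

{code:java}
// Disable multi-phase aggregation so the logical aggregate is converted
// directly to a single-phase physical one
client.alterSession("planner.enable_multiphase_agg", false);
{code}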



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7032) Ignore corrupt rows in a PCAP file

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7032:
-
Fix Version/s: 1.16.0

> Ignore corrupt rows in a PCAP file
> --
>
> Key: DRILL-7032
> URL: https://issues.apache.org/jira/browse/DRILL-7032
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.15.0
> Environment: OS: Ubuntu 18.4
> Drill version: 1.15.0
> Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
>Reporter: Giovanni Conte
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.16.0
>
>
> It would be useful for Drill to have some ability to ignore corrupt rows in a 
> PCAP file instead of throwing a Java exception.
> There are many pcap files with corrupted lines, and this 
> functionality would avoid having to pre-fix the packet captures (example 
> file attached).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7056) Drill fails with NPE when starting in distributed mode and 31010 port is used

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7056:
-
Fix Version/s: 1.16.0

> Drill fails with NPE when starting in distributed mode and 31010 port is used
> -
>
> Key: DRILL-7056
> URL: https://issues.apache.org/jira/browse/DRILL-7056
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Volodymyr Vysotskyi
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.16.0
>
>
> Steps to reproduce:
> 1. Run a process which uses port 31010, for example, execute the following 
> code (a listening-socket variant is sketched after the log below):
> {code:java}
> import java.net.Socket;
>
> // Keep a connection involving port 31010 open for a while
> try (Socket localhost = new Socket("localhost", 31010)) {
>   Thread.sleep(500_000);
> }
> {code}
> 2. Start Drill in distributed mode.
> As a result, Drill does not start and the following stack trace with an NPE 
> appears in {{drillbit.log}}:
> {noformat}
> [Error Id: 2600a33d-c06d-4361-a39b-8f4b3d05b639 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.rpc.BasicServer.bind(BasicServer.java:211) 
> ~[drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.service.ServiceEngine.start(ServiceEngine.java:100) 
> ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:207) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:527) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:497) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:493) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method) ~[na:1.8.0_191]
> at sun.nio.ch.Net.bind(Net.java:433) ~[na:1.8.0_191]
> at sun.nio.ch.Net.bind(Net.java:425) ~[na:1.8.0_191]
> at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) 
> ~[na:1.8.0_191]
> at 
> io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
>  ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:500)
>  ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
>  ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:495)
>  ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:480)
>  ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965) 
> ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:209) 
> ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:355) 
> ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
>  ~[netty-common-4.0.48.Final.jar:4.0.48.Final]
> at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463) 
> ~[netty-transport-4.0.48.Final.jar:4.0.48.Final]
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:131)
>  ~[netty-common-4.0.48.Final.jar:4.0.48.Final]
> at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_191]
> 2019-02-25 17:58:03,888 [main] WARN  o.apache.drill.exec.server.Drillbit - 
> Failure on close()
> java.lang.NullPointerException: null
> at 
> org.apache.drill.exec.server.rest.WebServer.generateOptionsDescriptionJSFile(WebServer.java:480)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.server.rest.WebServer.getTmpJavaScriptDir(WebServer.java:138)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.server.rest.WebServer.close(WebServer.java:471) 
> ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:87) 
> ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> 
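As an aside on step 1 above: a listening socket occupies the port more directly than an outbound connection. A minimal sketch (31010 is Drill's default user port):

{code:java}
import java.net.ServerSocket;

public class PortHolder {
  public static void main(String[] args) throws Exception {
    // Bind and hold Drill's default user port so the Drillbit's own bind fails
    try (ServerSocket holder = new ServerSocket(31010)) {
      Thread.sleep(500_000);
    }
  }
}
{code}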

[jira] [Assigned] (DRILL-6893) Invalid output for star and self-join queries for RDBMS Storage Plugin

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6893:


Assignee: Volodymyr Vysotskyi

> Invalid output for star and self-join queries for RDBMS Storage Plugin
> --
>
> Key: DRILL-6893
> URL: https://issues.apache.org/jira/browse/DRILL-6893
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.14.0
> Environment: mysql-5.7.23-0ubuntu0.18.04.1
> mysql-connector-java-5.1.39-bin.jar
>Reporter: Vitalii Diravka
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.16.0
>
>
> Invalid output for star and self-join queries for RDBMS Storage Plugin:
> {code:java}
> 0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM 
> mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN 
> (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` 
> ON `t`.`PersonID` = `t0`.`PersonID` ;
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> | PersonID  | OrderID  | ItemID  | GroupID  | PersonID0  | OrderID0  | ItemID0  | GroupID0  |
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> | 10        | 10       | 10      | 10       | null       | null      | null     | null      |
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> 1 row selected (1.402 seconds)
> 0: jdbc:drill:zk=local> select * from sys.version;
> +------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
> | version          | commit_id                                 | commit_message                                 | commit_time                | build_email                | build_time                 |
> +------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
> | 1.15.0-SNAPSHOT  | 100a68b314230d4cf327477f7a10f9c650720513  | DRILL-540: Allow querying hive views in Drill  | 30.11.2018 @ 10:50:46 EET  | vitalii.dira...@gmail.com  | 10.12.2018 @ 15:46:54 EET  |
> +------------------+-------------------------------------------+------------------------------------------------+----------------------------+----------------------------+----------------------------+
> 1 row selected (0.302 seconds)
> {code}
> The same result in older 1.11.0 Drill version:
> {code:java}
> 0: jdbc:drill:zk=local> SELECT * FROM (SELECT * FROM 
> mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t` INNER JOIN 
> (SELECT * FROM mysql.`testdb`.`mscIdentities3` WHERE `PersonID` = 10) AS `t0` 
> ON `t`.`PersonID` = `t0`.`PersonID`;
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> | PersonID  | OrderID  | ItemID  | GroupID  | PersonID0  | OrderID0  | ItemID0  | GroupID0  |
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> | 10        | 10       | 10      | 10       | null       | null      | null     | null      |
> +-----------+----------+---------+----------+------------+-----------+----------+-----------+
> 1 row selected (1.344 seconds)
> 0: jdbc:drill:zk=local> select * from sys.version;
> +----------+-------------------------------------------+------------------------------------------------------+------------------------------+----------------------------+----------------------------+
> | version  | commit_id                                 | commit_message                                       | commit_time                  | build_email                | build_time                 |
> +----------+-------------------------------------------+------------------------------------------------------+------------------------------+----------------------------+----------------------------+
> | 1.11.0   | 4220fb2fffbc81883df3e5fea575fa0a584852b3  | [maven-release-plugin] prepare release drill-1.11.0  | 24.07.2017 @ 16:47:07 EEST   | vitalii.dira...@gmail.com  | 06.12.2018 @ 14:36:39 EET  |
> +----------+-------------------------------------------+------------------------------------------------------+------------------------------+----------------------------+----------------------------+
> 1 row selected (0.271 seconds)
> {code}
> But the same query in MySQL:
> {code:java}
> mysql> select * from `mscIdentities3` t1 join `mscIdentities3` t2 on 
> t1.`PersonId` = t2.`PersonID` where t1.`PersonID` = 10;
> 

[jira] [Updated] (DRILL-345) Addition of a genericUDF/reflectUDF

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-345:

Fix Version/s: (was: 1.16.0)

> Addition of a genericUDF/reflectUDF
> ---
>
> Key: DRILL-345
> URL: https://issues.apache.org/jira/browse/DRILL-345
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 0.5.0
>Reporter: Yash Sharma
>Assignee: Mehant Baid
>Priority: Major
> Fix For: Future
>
> Attachments: DRILL-345.patch
>
>
> Add a genericUDF/reflectUDF that will allow users to call Java methods 
> directly, instead of having to write a UDF wrapper on top of an existing Java 
> method.
> *Usage:*
> {code:sql}
> SELECT reflect("java.lang.String", "valueOf", 1),
>reflect("java.lang.String", "isEmpty"),
>reflect("java.lang.Math", "max", 2, 3),
>reflect("java.lang.Math", "min", 2, 3),
>reflect("java.lang.Math", "round", 2.5),
>reflect("java.lang.Math", "exp", 1.0),
>reflect("java.lang.Math", "floor", 1.9)
> FROM src LIMIT 1;
> 1   true3   2   3   2.7182818284590455  1.0
> {code}
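Conceptually, the UDF would resolve and invoke the target method via reflection. A rough sketch covering static methods only (illustrative; this is not the attached patch):

{code:java}
import java.lang.reflect.Method;

public class ReflectSketch {
  // Invoke the first public static method whose name and arity match; a real
  // implementation would also match parameter types across overloads
  static Object reflect(String className, String methodName, Object... args) throws Exception {
    Class<?> clazz = Class.forName(className);
    for (Method m : clazz.getMethods()) {
      if (m.getName().equals(methodName) && m.getParameterCount() == args.length) {
        return m.invoke(null, args);
      }
    }
    throw new NoSuchMethodException(className + "#" + methodName);
  }

  public static void main(String[] args) throws Exception {
    System.out.println(reflect("java.lang.Math", "floor", 1.9)); // prints 1.0
  }
}
{code}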



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6707) Query with 10-way merge join fails with IllegalArgumentException

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6707:
-
Fix Version/s: 1.16.0

> Query with 10-way merge join fails with IllegalArgumentException
> 
>
> Key: DRILL-6707
> URL: https://issues.apache.org/jira/browse/DRILL-6707
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Abhishek Girish
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: drillbit.zip
>
>
> Query
> {code}
> SELECT   *
> FROM si.tpch_sf1_parquet.customer C,
>  si.tpch_sf1_parquet.orders O,
>  si.tpch_sf1_parquet.lineitem L,
>  si.tpch_sf1_parquet.part P,
>  si.tpch_sf1_parquet.supplier S,
>  si.tpch_sf1_parquet.partsupp PS,
>  si.tpch_sf1_parquet.nation S_N,
>  si.tpch_sf1_parquet.region S_R,
>  si.tpch_sf1_parquet.nation C_N,
>  si.tpch_sf1_parquet.region C_R
> WHEREC.C_CUSTKEY = O.O_CUSTKEY 
> AND  O.O_ORDERKEY = L.L_ORDERKEY
> AND  L.L_PARTKEY = P.P_PARTKEY
> AND  L.L_SUPPKEY = S.S_SUPPKEY
> AND  P.P_PARTKEY = PS.PS_PARTKEY
> AND  P.P_SUPPKEY = PS.PS_SUPPKEY
> AND  S.S_NATIONKEY = S_N.N_NATIONKEY
> AND  S_N.N_REGIONKEY = S_R.R_REGIONKEY
> AND  C.C_NATIONKEY = C_N.N_NATIONKEY
> AND  C_N.N_REGIONKEY = C_R.R_REGIONKEY
> {code}
> Plan
> {code}
> 00-00Screen : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0, 
> DYNAMIC_STAR **1, DYNAMIC_STAR **2, DYNAMIC_STAR **3, DYNAMIC_STAR **4, 
> DYNAMIC_STAR **5, DYNAMIC_STAR **6, DYNAMIC_STAR **7, DYNAMIC_STAR **8): 
> rowcount = 6001215.0, cumulative cost = {1.151087965E8 rows, 
> 2.66710261332395E9 cpu, 3.198503E7 io, 5.172844544E11 network, 1.87681384E9 
> memory}, id = 419943
> 00-01  ProjectAllowDup(**=[$0], **0=[$1], **1=[$2], **2=[$3], **3=[$4], 
> **4=[$5], **5=[$6], **6=[$7], **7=[$8], **8=[$9]) : rowType = 
> RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0, DYNAMIC_STAR **1, DYNAMIC_STAR 
> **2, DYNAMIC_STAR **3, DYNAMIC_STAR **4, DYNAMIC_STAR **5, DYNAMIC_STAR **6, 
> DYNAMIC_STAR **7, DYNAMIC_STAR **8): rowcount = 6001215.0, cumulative cost = 
> {1.14508675E8 rows, 2.66650249182395E9 cpu, 3.198503E7 io, 5.172844544E11 
> network, 1.87681384E9 memory}, id = 419942
> 00-02UnionExchange : rowType = RecordType(DYNAMIC_STAR T19¦¦**, 
> DYNAMIC_STAR T18¦¦**, DYNAMIC_STAR T12¦¦**, DYNAMIC_STAR T17¦¦**, 
> DYNAMIC_STAR T13¦¦**, DYNAMIC_STAR T16¦¦**, DYNAMIC_STAR T14¦¦**, 
> DYNAMIC_STAR T15¦¦**, DYNAMIC_STAR T20¦¦**, DYNAMIC_STAR T21¦¦**): rowcount = 
> 6001215.0, cumulative cost = {1.0850746E8 rows, 2.60649034182395E9 cpu, 
> 3.198503E7 io, 5.172844544E11 network, 1.87681384E9 memory}, id = 419941
> 01-01  Project(T19¦¦**=[$0], T18¦¦**=[$3], T12¦¦**=[$6], 
> T17¦¦**=[$10], T13¦¦**=[$13], T16¦¦**=[$16], T14¦¦**=[$19], T15¦¦**=[$22], 
> T20¦¦**=[$24], T21¦¦**=[$27]) : rowType = RecordType(DYNAMIC_STAR T19¦¦**, 
> DYNAMIC_STAR T18¦¦**, DYNAMIC_STAR T12¦¦**, DYNAMIC_STAR T17¦¦**, 
> DYNAMIC_STAR T13¦¦**, DYNAMIC_STAR T16¦¦**, DYNAMIC_STAR T14¦¦**, 
> DYNAMIC_STAR T15¦¦**, DYNAMIC_STAR T20¦¦**, DYNAMIC_STAR T21¦¦**): rowcount = 
> 6001215.0, cumulative cost = {1.02506245E8 rows, 2.55848062182395E9 cpu, 
> 3.198503E7 io, 2.71474688E11 network, 1.87681384E9 memory}, id = 419940
> 01-02Project(T19¦¦**=[$21], C_CUSTKEY=[$22], C_NATIONKEY=[$23], 
> T18¦¦**=[$18], O_CUSTKEY=[$19], O_ORDERKEY=[$20], T12¦¦**=[$0], 
> L_ORDERKEY=[$1], L_PARTKEY=[$2], L_SUPPKEY=[$3], T17¦¦**=[$15], 
> P_PARTKEY=[$16], P_SUPPKEY=[$17], T13¦¦**=[$4], S_SUPPKEY=[$5], 
> S_NATIONKEY=[$6], T16¦¦**=[$12], PS_PARTKEY=[$13], PS_SUPPKEY=[$14], 
> T14¦¦**=[$7], N_NATIONKEY=[$8], N_REGIONKEY=[$9], T15¦¦**=[$10], 
> R_REGIONKEY=[$11], T20¦¦**=[$24], N_NATIONKEY0=[$25], N_REGIONKEY0=[$26], 
> T21¦¦**=[$27], R_REGIONKEY0=[$28]) : rowType = RecordType(DYNAMIC_STAR 
> T19¦¦**, ANY C_CUSTKEY, ANY C_NATIONKEY, DYNAMIC_STAR T18¦¦**, ANY O_CUSTKEY, 
> ANY O_ORDERKEY, DYNAMIC_STAR T12¦¦**, ANY L_ORDERKEY, ANY L_PARTKEY, ANY 
> L_SUPPKEY, DYNAMIC_STAR T17¦¦**, ANY P_PARTKEY, ANY P_SUPPKEY, DYNAMIC_STAR 
> T13¦¦**, ANY S_SUPPKEY, ANY S_NATIONKEY, DYNAMIC_STAR T16¦¦**, ANY 
> PS_PARTKEY, ANY PS_SUPPKEY, DYNAMIC_STAR T14¦¦**, ANY N_NATIONKEY, ANY 
> N_REGIONKEY, DYNAMIC_STAR T15¦¦**, ANY R_REGIONKEY, DYNAMIC_STAR T20¦¦**, ANY 
> N_NATIONKEY0, ANY N_REGIONKEY0, DYNAMIC_STAR T21¦¦**, ANY R_REGIONKEY0): 
> rowcount = 6001215.0, cumulative cost = {9.650503E7 rows, 2.49846847182395E9 
> cpu, 3.198503E7 io, 2.71474688E11 network, 1.87681384E9 memory}, id = 419939
> 01-03  MergeJoin(condition=[=($20, $1)], joinType=[inner]) : 
> rowType 

[jira] [Updated] (DRILL-6707) Query with 10-way merge join fails with IllegalArgumentException

2019-02-25 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6707:
-
Reviewer: Sorabh Hamirwasia

> Query with 10-way merge join fails with IllegalArgumentException
> 
>
> Key: DRILL-6707
> URL: https://issues.apache.org/jira/browse/DRILL-6707
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators, Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Abhishek Girish
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: drillbit.zip
>
>
> Query
> {code}
> SELECT   *
> FROM si.tpch_sf1_parquet.customer C,
>  si.tpch_sf1_parquet.orders O,
>  si.tpch_sf1_parquet.lineitem L,
>  si.tpch_sf1_parquet.part P,
>  si.tpch_sf1_parquet.supplier S,
>  si.tpch_sf1_parquet.partsupp PS,
>  si.tpch_sf1_parquet.nation S_N,
>  si.tpch_sf1_parquet.region S_R,
>  si.tpch_sf1_parquet.nation C_N,
>  si.tpch_sf1_parquet.region C_R
> WHEREC.C_CUSTKEY = O.O_CUSTKEY 
> AND  O.O_ORDERKEY = L.L_ORDERKEY
> AND  L.L_PARTKEY = P.P_PARTKEY
> AND  L.L_SUPPKEY = S.S_SUPPKEY
> AND  P.P_PARTKEY = PS.PS_PARTKEY
> AND  P.P_SUPPKEY = PS.PS_SUPPKEY
> AND  S.S_NATIONKEY = S_N.N_NATIONKEY
> AND  S_N.N_REGIONKEY = S_R.R_REGIONKEY
> AND  C.C_NATIONKEY = C_N.N_NATIONKEY
> AND  C_N.N_REGIONKEY = C_R.R_REGIONKEY
> {code}
> Plan
> {code}
> 00-00Screen : rowType = RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0, 
> DYNAMIC_STAR **1, DYNAMIC_STAR **2, DYNAMIC_STAR **3, DYNAMIC_STAR **4, 
> DYNAMIC_STAR **5, DYNAMIC_STAR **6, DYNAMIC_STAR **7, DYNAMIC_STAR **8): 
> rowcount = 6001215.0, cumulative cost = {1.151087965E8 rows, 
> 2.66710261332395E9 cpu, 3.198503E7 io, 5.172844544E11 network, 1.87681384E9 
> memory}, id = 419943
> 00-01  ProjectAllowDup(**=[$0], **0=[$1], **1=[$2], **2=[$3], **3=[$4], 
> **4=[$5], **5=[$6], **6=[$7], **7=[$8], **8=[$9]) : rowType = 
> RecordType(DYNAMIC_STAR **, DYNAMIC_STAR **0, DYNAMIC_STAR **1, DYNAMIC_STAR 
> **2, DYNAMIC_STAR **3, DYNAMIC_STAR **4, DYNAMIC_STAR **5, DYNAMIC_STAR **6, 
> DYNAMIC_STAR **7, DYNAMIC_STAR **8): rowcount = 6001215.0, cumulative cost = 
> {1.14508675E8 rows, 2.66650249182395E9 cpu, 3.198503E7 io, 5.172844544E11 
> network, 1.87681384E9 memory}, id = 419942
> 00-02UnionExchange : rowType = RecordType(DYNAMIC_STAR T19¦¦**, 
> DYNAMIC_STAR T18¦¦**, DYNAMIC_STAR T12¦¦**, DYNAMIC_STAR T17¦¦**, 
> DYNAMIC_STAR T13¦¦**, DYNAMIC_STAR T16¦¦**, DYNAMIC_STAR T14¦¦**, 
> DYNAMIC_STAR T15¦¦**, DYNAMIC_STAR T20¦¦**, DYNAMIC_STAR T21¦¦**): rowcount = 
> 6001215.0, cumulative cost = {1.0850746E8 rows, 2.60649034182395E9 cpu, 
> 3.198503E7 io, 5.172844544E11 network, 1.87681384E9 memory}, id = 419941
> 01-01  Project(T19¦¦**=[$0], T18¦¦**=[$3], T12¦¦**=[$6], 
> T17¦¦**=[$10], T13¦¦**=[$13], T16¦¦**=[$16], T14¦¦**=[$19], T15¦¦**=[$22], 
> T20¦¦**=[$24], T21¦¦**=[$27]) : rowType = RecordType(DYNAMIC_STAR T19¦¦**, 
> DYNAMIC_STAR T18¦¦**, DYNAMIC_STAR T12¦¦**, DYNAMIC_STAR T17¦¦**, 
> DYNAMIC_STAR T13¦¦**, DYNAMIC_STAR T16¦¦**, DYNAMIC_STAR T14¦¦**, 
> DYNAMIC_STAR T15¦¦**, DYNAMIC_STAR T20¦¦**, DYNAMIC_STAR T21¦¦**): rowcount = 
> 6001215.0, cumulative cost = {1.02506245E8 rows, 2.55848062182395E9 cpu, 
> 3.198503E7 io, 2.71474688E11 network, 1.87681384E9 memory}, id = 419940
> 01-02Project(T19¦¦**=[$21], C_CUSTKEY=[$22], C_NATIONKEY=[$23], 
> T18¦¦**=[$18], O_CUSTKEY=[$19], O_ORDERKEY=[$20], T12¦¦**=[$0], 
> L_ORDERKEY=[$1], L_PARTKEY=[$2], L_SUPPKEY=[$3], T17¦¦**=[$15], 
> P_PARTKEY=[$16], P_SUPPKEY=[$17], T13¦¦**=[$4], S_SUPPKEY=[$5], 
> S_NATIONKEY=[$6], T16¦¦**=[$12], PS_PARTKEY=[$13], PS_SUPPKEY=[$14], 
> T14¦¦**=[$7], N_NATIONKEY=[$8], N_REGIONKEY=[$9], T15¦¦**=[$10], 
> R_REGIONKEY=[$11], T20¦¦**=[$24], N_NATIONKEY0=[$25], N_REGIONKEY0=[$26], 
> T21¦¦**=[$27], R_REGIONKEY0=[$28]) : rowType = RecordType(DYNAMIC_STAR 
> T19¦¦**, ANY C_CUSTKEY, ANY C_NATIONKEY, DYNAMIC_STAR T18¦¦**, ANY O_CUSTKEY, 
> ANY O_ORDERKEY, DYNAMIC_STAR T12¦¦**, ANY L_ORDERKEY, ANY L_PARTKEY, ANY 
> L_SUPPKEY, DYNAMIC_STAR T17¦¦**, ANY P_PARTKEY, ANY P_SUPPKEY, DYNAMIC_STAR 
> T13¦¦**, ANY S_SUPPKEY, ANY S_NATIONKEY, DYNAMIC_STAR T16¦¦**, ANY 
> PS_PARTKEY, ANY PS_SUPPKEY, DYNAMIC_STAR T14¦¦**, ANY N_NATIONKEY, ANY 
> N_REGIONKEY, DYNAMIC_STAR T15¦¦**, ANY R_REGIONKEY, DYNAMIC_STAR T20¦¦**, ANY 
> N_NATIONKEY0, ANY N_REGIONKEY0, DYNAMIC_STAR T21¦¦**, ANY R_REGIONKEY0): 
> rowcount = 6001215.0, cumulative cost = {9.650503E7 rows, 2.49846847182395E9 
> cpu, 3.198503E7 io, 2.71474688E11 network, 1.87681384E9 memory}, id = 419939
> 01-03  MergeJoin(condition=[=($20, $1)], joinType=[inner]) : 
> 

[jira] [Assigned] (DRILL-7050) RexNode convert exception in subquery

2019-02-22 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7050:


Assignee: Hanumath Rao Maduri

> RexNode convert exception in subquery
> -
>
> Key: DRILL-7050
> URL: https://issues.apache.org/jira/browse/DRILL-7050
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0, 1.15.0
>Reporter: Oleg Zinoviev
>Assignee: Hanumath Rao Maduri
>Priority: Major
>
> If the query contains a subquery whose filters are correlated with the main 
> query, an error occurs: *PLAN ERROR: Cannot convert RexNode to equivalent 
> Drill expression. RexNode Class: org.apache.calcite.rex.RexCorrelVariable*
> Steps to reproduce:
> 1) Create source table (or view, doesn't matter)
> {code:sql}
> create table dfs.root.source as  (
> select 1 as id union all select 2 as id
> )
> {code}
> 2) Execute query
> {code:sql}
> select t1.id,
>   (select count(t2.id) 
>   from dfs.root.source t2 where t2.id = t1.id)
> from  dfs.root.source t1
> {code}
> Reason: 
> The method 
> {code:java}org.apache.calcite.sql2rel.SqlToRelConverter.Blackboard.lookupExp{code}
> calls {code:java}RexBuilder.makeCorrel{code} in some cases.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7045) UDF string_binary java.lang.IndexOutOfBoundsException:

2019-02-21 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7045:
-
Fix Version/s: 1.16.0

> UDF string_binary java.lang.IndexOutOfBoundsException:
> --
>
> Key: DRILL-7045
> URL: https://issues.apache.org/jira/browse/DRILL-7045
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Priority: Minor
> Fix For: 1.16.0
>
>
> Given a large field like
>  
> cat input.json
> { "col0": 
> 

[jira] [Commented] (DRILL-7045) UDF string_binary java.lang.IndexOutOfBoundsException:

2019-02-21 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16774391#comment-16774391
 ] 

Pritesh Maker commented on DRILL-7045:
--

[~jccote] thanks for the contribution - do let us know when it's ready for 
review.

> UDF string_binary java.lang.IndexOutOfBoundsException:
> --
>
> Key: DRILL-7045
> URL: https://issues.apache.org/jira/browse/DRILL-7045
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Minor
> Fix For: 1.16.0
>
>
> Given a large field like
>  
> cat input.json
> { "col0": 
> 

[jira] [Assigned] (DRILL-7045) UDF string_binary java.lang.IndexOutOfBoundsException:

2019-02-21 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7045:


Assignee: jean-claude

> UDF string_binary java.lang.IndexOutOfBoundsException:
> --
>
> Key: DRILL-7045
> URL: https://issues.apache.org/jira/browse/DRILL-7045
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Minor
> Fix For: 1.16.0
>
>
> Given a large field like
>  
> cat input.json
> { "col0": 
> 

[jira] [Updated] (DRILL-7042) DRILL: Apache drill v1.15.0 failed to generate deb/rpm package

2019-02-19 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7042:
-
Fix Version/s: (was: Future)
   1.16.0

> DRILL: Apache drill v1.15.0 failed to generate deb/rpm package
> --
>
> Key: DRILL-7042
> URL: https://issues.apache.org/jira/browse/DRILL-7042
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.15.0
> Environment: Ubuntu/Debian/CentOS installed on ARM64 server machine.
>Reporter: Naresh Bhat
>Priority: Major
> Fix For: 1.16.0
>
>
> I tried to create a debian/rpm package on an ARM64 machine, but it failed to 
> generate the debian/rpm package on Ubuntu/CentOS machines. The pom.xml file 
> under the distribution folder needs to be fixed.
> Error logs while generating the DEB package:
> =
> drill$ git branch 
>   master
> * v1.15.0
> drill$ mvn clean -X package -Pdeb -DskipTests
> .
> ..
> [INFO] Creating debian package: target/drill-1.15.0.deb
> [INFO] Building data
> [ERROR] Failed to create debian package target/drill-1.15.0.deb
> org.vafer.jdeb.PackagingException: Failed to create debian package 
> target/drill-1.15.0.deb
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:247)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
> (Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch 
> (Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
> (Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main 
> (Launcher.java:356)
> Caused by: org.vafer.jdeb.PackagingException: Could not create deb package
> at org.vafer.jdeb.Processor.createDeb (Processor.java:172)
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:244)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> 

[jira] [Assigned] (DRILL-7042) DRILL: Apache drill v1.15.0 failed to generate deb/rpm package

2019-02-19 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7042:


Assignee: Naresh Bhat

> DRILL: Apache drill v1.15.0 failed to generate deb/rpm package
> --
>
> Key: DRILL-7042
> URL: https://issues.apache.org/jira/browse/DRILL-7042
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.15.0
> Environment: Ubuntu/Debian/CentOS installed on ARM64 server machine.
>Reporter: Naresh Bhat
>Assignee: Naresh Bhat
>Priority: Major
> Fix For: 1.16.0
>
>
> I tried to create a debian/rpm package on an ARM64 machine, but it failed to 
> generate the debian/rpm package on Ubuntu/CentOS machines. The pom.xml file 
> under the distribution folder needs to be fixed.
> Error logs while generating the DEB package:
> =
> drill$ git branch 
>   master
> * v1.15.0
> drill$ mvn clean -X package -Pdeb -DskipTests
> .
> ..
> [INFO] Creating debian package: target/drill-1.15.0.deb
> [INFO] Building data
> [ERROR] Failed to create debian package target/drill-1.15.0.deb
> org.vafer.jdeb.PackagingException: Failed to create debian package 
> target/drill-1.15.0.deb
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:247)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
> (Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch 
> (Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
> (Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main 
> (Launcher.java:356)
> Caused by: org.vafer.jdeb.PackagingException: Could not create deb package
> at org.vafer.jdeb.Processor.createDeb (Processor.java:172)
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:244)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> 

[jira] [Updated] (DRILL-7042) DRILL: Apache drill v1.15.0 failed to generate deb/rpm package

2019-02-19 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7042:
-
Reviewer: Vitalii Diravka

> DRILL: Apache drill v1.15.0 failed to generate deb/rpm package
> --
>
> Key: DRILL-7042
> URL: https://issues.apache.org/jira/browse/DRILL-7042
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build & Test
>Affects Versions: 1.15.0
> Environment: Ubuntu/Debian/CentOS installed on ARM64 server machine.
>Reporter: Naresh Bhat
>Assignee: Naresh Bhat
>Priority: Major
> Fix For: 1.16.0
>
>
> I tried to create a debian/rpm package on an ARM64 machine, but it failed to 
> generate the debian/rpm package on Ubuntu/CentOS machines. The pom.xml file 
> under the distribution folder needs to be fixed.
> Error logs while generating the DEB package:
> =
> drill$ git branch 
>   master
> * v1.15.0
> drill$ mvn clean -X package -Pdeb -DskipTests
> .
> ..
> [INFO] Creating debian package: target/drill-1.15.0.deb
> [INFO] Building data
> [ERROR] Failed to create debian package target/drill-1.15.0.deb
> org.vafer.jdeb.PackagingException: Failed to create debian package 
> target/drill-1.15.0.deb
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:247)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
> at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:288)
> at org.apache.maven.cli.MavenCli.main (MavenCli.java:192)
> at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke 
> (NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke 
> (DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke (Method.java:498)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced 
> (Launcher.java:289)
> at org.codehaus.plexus.classworlds.launcher.Launcher.launch 
> (Launcher.java:229)
> at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode 
> (Launcher.java:415)
> at org.codehaus.plexus.classworlds.launcher.Launcher.main 
> (Launcher.java:356)
> Caused by: org.vafer.jdeb.PackagingException: Could not create deb package
> at org.vafer.jdeb.Processor.createDeb (Processor.java:172)
> at org.vafer.jdeb.maven.DebMaker.makeDeb (DebMaker.java:244)
> at org.vafer.jdeb.maven.DebMojo.execute (DebMojo.java:416)
> at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo 
> (DefaultBuildPluginManager.java:137)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:210)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:156)
> at org.apache.maven.lifecycle.internal.MojoExecutor.execute 
> (MojoExecutor.java:148)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:117)
> at 
> org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject 
> (LifecycleModuleBuilder.java:81)
> at 
> org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build
>  (SingleThreadedBuilder.java:56)
> at org.apache.maven.lifecycle.internal.LifecycleStarter.execute 
> (LifecycleStarter.java:128)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
> at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
> at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
> at org.apache.maven.cli.MavenCli.execute (MavenCli.java:956)
>

[jira] [Assigned] (DRILL-7041) CompileException happens if a nested coalesce function returns null

2019-02-15 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7041:


Assignee: Bohdan Kazydub

> CompileException happens if a nested coalesce function returns null
> ---
>
> Key: DRILL-7041
> URL: https://issues.apache.org/jira/browse/DRILL-7041
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
>
> *Query:*
> {code:sql}
> select coalesce(coalesce(n_name1, n_name2), n_name) from 
> cp.`tpch/nation.parquet`
> {code}
> *Expected result:*
> Values from "n_name" column should be returned
> *Actual result:*
> An exception happens:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> CompileException: Line 57, Column 27: Assignment conversion not possible from 
> type "org.apache.drill.exec.expr.holders.NullableVarCharHolder" to type 
> "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer 
> to logs for more information. [Error Id: e54d5bfd-604d-4a39-b62f-33bb964e5286 
> on userf87d-pc:31010] (org.apache.drill.exec.exception.SchemaChangeException) 
> Failure while attempting to load generated class 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():573
>  
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
>  org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
>  org.apache.drill.exec.record.AbstractRecordBatch.next():186 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284 
> java.security.AccessController.doPrivileged():-2 
> javax.security.auth.Subject.doAs():422 
> org.apache.hadoop.security.UserGroupInformation.doAs():1746 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284 
> org.apache.drill.common.SelfCleaningRunnable.run():38 
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624 
> java.lang.Thread.run():748 Caused By 
> (org.apache.drill.exec.exception.ClassTransformationException) 
> java.util.concurrent.ExecutionException: 
> org.apache.drill.exec.exception.ClassTransformationException: Failure 
> generating transformation classes for value: package 
> org.apache.drill.exec.test.generated; import 
> org.apache.drill.exec.exception.SchemaChangeException; import 
> org.apache.drill.exec.expr.holders.BigIntHolder; import 
> org.apache.drill.exec.expr.holders.BitHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarBinaryHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarCharHolder; import 
> org.apache.drill.exec.expr.holders.VarCharHolder; import 
> org.apache.drill.exec.ops.FragmentContext; import 
> org.apache.drill.exec.record.RecordBatch; import 
> org.apache.drill.exec.vector.UntypedNullHolder; import 
> org.apache.drill.exec.vector.UntypedNullVector; import 
> org.apache.drill.exec.vector.VarCharVector; public class ProjectorGen35 { 
> BigIntHolder const6; BitHolder constant9; UntypedNullHolder constant13; 
> VarCharVector vv14; UntypedNullVector vv19; public void doEval(int inIndex, 
> int outIndex) throws SchemaChangeException { { UntypedNullHolder out0 = new 
> UntypedNullHolder(); if (constant9 .value == 1) { if (constant13 .isSet!= 0) 
> { out0 = constant13; } } else { VarCharHolder out17 = new VarCharHolder(); { 
> out17 .buffer = vv14 .getBuffer(); long startEnd = vv14 
> .getAccessor().getStartEnd((inIndex)); out17 .start = ((int) startEnd); out17 
> .end = ((int)(startEnd >> 32)); } // start of eval portion of 
> convertToNullableVARCHAR function. // NullableVarCharHolder out18 = new 
> NullableVarCharHolder(); { final NullableVarCharHolder output = new 
> NullableVarCharHolder(); VarCharHolder input = out17; 
> GConvertToNullableVarCharHolder_eval: { output.isSet = 1; output.start = 
> input.start; output.end = input.end; output.buffer = input.buffer; } out18 = 
> output; } // end of eval portion of convertToNullableVARCHAR function. 
> // if (out18 .isSet!= 0) { out0 = out18; } } if (!(out0 .isSet == 0)) { 
> vv19 .getMutator().set((outIndex), out0 .isSet, out0); } } } public void 
> doSetup(FragmentContext context, RecordBatch incoming, RecordBatch outgoing) 
> throws SchemaChangeException { { UntypedNullHolder out1 = new 
> UntypedNullHolder(); NullableVarBinaryHolder out2 = new 
> 

[jira] [Commented] (DRILL-7037) Apache Drill Crashes when a 50mb json string is queried via the REST API provided

2019-02-12 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766258#comment-16766258
 ] 

Pritesh Maker commented on DRILL-7037:
--

[~er.ayushsha...@gmail.com] please do share any logs/stack traces as 
[~vitalii] suggested. 

I do want to understand the use case and the scenario better - when you say 
that you are supplying a 50MB JSON query parameter, can you give an example of 
the REST API Call, the query and the parameter value? This will help us analyze 
potential code paths that could be impacted.
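
For reference, a query is normally posted to Drill's REST endpoint roughly as 
follows (a minimal Java sketch; the host, port and query text are 
placeholders, and the 8047 web port assumes the default configuration):

{code:java}
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class DrillRestQuery {
  public static void main(String[] args) throws Exception {
    // Default Drill web port is 8047; "localhost" is a placeholder host.
    URL url = new URL("http://localhost:8047/query.json");
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod("POST");
    conn.setRequestProperty("Content-Type", "application/json");
    conn.setDoOutput(true);
    // The report describes a very large "query" value in this body.
    String body = "{\"queryType\":\"SQL\","
        + "\"query\":\"SELECT * FROM cp.`employee.json` LIMIT 1\"}";
    try (OutputStream os = conn.getOutputStream()) {
      os.write(body.getBytes(StandardCharsets.UTF_8));
    }
    System.out.println("HTTP " + conn.getResponseCode());
  }
}
{code}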

> Apache Drill Crashes when a 50mb json string is queried via the REST API 
> provided
> -
>
> Key: DRILL-7037
> URL: https://issues.apache.org/jira/browse/DRILL-7037
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.14.0
> Environment: Windows 10 
> 24GB RAM
> 8 Cores
> Used the REST API call to query drill
>Reporter: Ayush Sharma
>Priority: Blocker
> Attachments: scheduler.txt
>
>
> Apache Drill crashes with an OutOfMemoryException (24GB RAM) when a REST API 
> call is made by supplying a JSON query of 50MB in the query parameter of the 
> REST API.
> The REST API even crashes for a 10MB query (16GB RAM), and works with a 5MB 
> query.
> This is a blocker for us and will need immediate remediation.
> We are also not aware of any sys.options that might bring the HEAP usage down 
> drastically, or of what is currently making it go up.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6992) Support column histogram statistics

2019-02-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6992:
-
Fix Version/s: (was: 1.16.0)

> Support column histogram statistics
> ---
>
> Key: DRILL-6992
> URL: https://issues.apache.org/jira/browse/DRILL-6992
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
>
> As a follow-up to 
> [DRILL-1328|https://issues.apache.org/jira/browse/DRILL-1328], which adds 
> NDV (num distinct values) support and creates the framework for statistics, 
> we also need histograms. These are needed for range predicate selectivity 
> estimation, as well as for equality predicates when the data distribution 
> is non-uniform.
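> As a rough illustration of why histograms matter for range predicates, a 
> selectivity estimate can be read off an equi-height histogram like this (a 
> minimal sketch, not Drill code; the bucket layout and linear interpolation 
> are simplifying assumptions):
> {code:java}
> public class EquiHeightHistogram {
>   // boundaries[i]..boundaries[i+1] delimit bucket i; every bucket holds an
>   // equal share of the rows, which is what keeps the estimate simple.
>   private final double[] boundaries;
> 
>   public EquiHeightHistogram(double[] boundaries) {
>     this.boundaries = boundaries;
>   }
> 
>   // Estimated fraction of rows satisfying "col < v", interpolating
>   // linearly inside the bucket that contains v.
>   public double selectivityLessThan(double v) {
>     int buckets = boundaries.length - 1;
>     if (v <= boundaries[0]) return 0.0;
>     if (v >= boundaries[buckets]) return 1.0;
>     for (int i = 0; i < buckets; i++) {
>       if (v < boundaries[i + 1]) {
>         double within = (v - boundaries[i]) / (boundaries[i + 1] - boundaries[i]);
>         return (i + within) / buckets;
>       }
>     }
>     return 1.0;
>   }
> }
> {code}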



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6992) Support column histogram statistics

2019-02-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6992:
-
Fix Version/s: 1.16.0

> Support column histogram statistics
> ---
>
> Key: DRILL-6992
> URL: https://issues.apache.org/jira/browse/DRILL-6992
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Aman Sinha
>Assignee: Aman Sinha
>Priority: Major
> Fix For: 1.16.0
>
>
> As a follow-up to 
> [DRILL-1328|https://issues.apache.org/jira/browse/DRILL-1328], which adds 
> NDV (num distinct values) support and creates the framework for statistics, 
> we also need histograms. These are needed for range predicate selectivity 
> estimation, as well as for equality predicates when the data distribution 
> is non-uniform.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error

2019-02-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7035:
-
Fix Version/s: 1.16.0

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to 
> communication error 
> --
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Affects Versions: 1.12.0
>Reporter: Rob Wu
>Assignee: Debraj Ray
>Priority: Major
> Fix For: 1.16.0
>
>
> [~debraj92] found that under some circumstances the 
> SaslAuthenticatorImpl's sasl_dispose() function crashes at 
> destruction. The crash appears to be random and occurs only when certain 
> authentication and encryption combinations are used during connection.
> After digging a little deeper, I found that when a BOOST communication error 
> occurs, shutdownSocket (which eventually triggers sasl_dispose()) can be 
> called from various threads, resulting in a race condition when freeing the 
> handle. This can be reproduced with the querysubmitter, and has been 
> reproducible since 1.12.0.
> [~debraj92] will be adding a patch to resolve this incident.
>  
> {code:java}
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Handle 
> Read from buffer 04E1D850
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Cancel 
> deadline timer.
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: 
> ERR_QRY_COMMERR. Boost Communication Error: End of file
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: +++ ENTER +++ 
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: +++ ENTER +++ 
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: --- EXIT ---
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Socket shutdown
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: --- EXIT ---
> {code}
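> The underlying pattern is a non-idempotent dispose reached from multiple 
> threads. Independently of the actual C++ fix, the shape of the remedy is to 
> make disposal happen exactly once; a Java sketch of the pattern only (the 
> client code itself is C++):
> {code:java}
> import java.util.concurrent.atomic.AtomicBoolean;
> 
> class SaslHandle {
>   private final AtomicBoolean disposed = new AtomicBoolean(false);
> 
>   // Whichever thread wins the compare-and-set performs the single dispose;
>   // later callers (reader thread, shutdown path, destructor) become no-ops.
>   void dispose() {
>     if (disposed.compareAndSet(false, true)) {
>       // the sasl_dispose() equivalent would run here, exactly once
>     }
>   }
> }
> {code}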



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7035) Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to communication error

2019-02-11 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7035:


Assignee: Debraj Ray

> Drill C++ Client crashes on multiple SaslAuthenticatorImpl Destruction due to 
> communication error 
> --
>
> Key: DRILL-7035
> URL: https://issues.apache.org/jira/browse/DRILL-7035
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Affects Versions: 1.12.0
>Reporter: Rob Wu
>Assignee: Debraj Ray
>Priority: Major
>
> [~debraj92] found that under some circumstances the 
> SaslAuthenticatorImpl's sasl_dispose() function crashes at 
> destruction. The crash appears to be random and occurs only when certain 
> authentication and encryption combinations are used during connection.
> After digging a little deeper, I found that when a BOOST communication error 
> occurs, shutdownSocket (which eventually triggers sasl_dispose()) can be 
> called from various threads, resulting in a race condition when freeing the 
> handle. This can be reproduced with the querysubmitter, and has been 
> reproducible since 1.12.0.
> [~debraj92] will be adding a patch to resolve this incident.
>  
> {code:java}
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Handle 
> Read from buffer 04E1D850
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: Cancel 
> deadline timer.
> 2019-Feb-11 10:44:01 : TRACE : 2d74 : DrillClientImpl::handleRead: 
> ERR_QRY_COMMERR. Boost Communication Error: End of file
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: +++ ENTER +++ 
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: +++ ENTER +++ 
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Disposing 2: --- EXIT ---
> 2019-Feb-11 10:44:31 : TRACE : 2d74 : Socket shutdown
> 2019-Feb-11 10:44:31 : TRACE : 3df8 : Disposing 1: --- EXIT ---
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-5028) Opening profiles page from web ui gets very slow when a lot of history files have been stored in HDFS or Local FS.

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5028:


Assignee: Kunal Khatua

> Opening profiles page from web ui gets very slow when a lot of history files 
> have been stored in HDFS or Local FS.
> --
>
> Key: DRILL-5028
> URL: https://issues.apache.org/jira/browse/DRILL-5028
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.8.0
>Reporter: Account Not Used
>Assignee: Kunal Khatua
>Priority: Minor
> Fix For: 1.16.0
>
>
> We have a Drill cluster with 20+ nodes and we store all history profiles in 
> HDFS. Without periodic cleanup of HDFS, the profiles page gets slower as 
> more queries are served.
> Code in LocalPersistentStore.java uses fs.list(false, basePath) to 
> fetch the latest 100 history profiles by default. I suspect this operation 
> blocks page loading (millions of small files can accumulate in the basePath); 
> maybe we can try some other way to reach the same goal, such as the bounded 
> listing sketched below.
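> A minimal sketch of such a bounded listing, using the Hadoop FileSystem API 
> (note that iteration order is filesystem-specific, so this only bounds the 
> scan; picking the *latest* 100 still needs name-based pruning or sorting, 
> which is assumed here):
> {code:java}
> import java.io.IOException;
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.hadoop.fs.FileStatus;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.RemoteIterator;
> 
> public class ProfileListing {
>   // Stream the directory and stop after 'limit' entries instead of
>   // materializing millions of FileStatus objects up front.
>   static List<FileStatus> boundedList(FileSystem fs, Path basePath, int limit)
>       throws IOException {
>     List<FileStatus> result = new ArrayList<>(limit);
>     RemoteIterator<FileStatus> it = fs.listStatusIterator(basePath);
>     while (it.hasNext() && result.size() < limit) {
>       result.add(it.next());
>     }
>     return result;
>   }
> }
> {code}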



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6949) Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition the inner data any further" when Semi join is enabled

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6949:


Assignee: Boaz Ben-Zvi

> Query fails with "UNSUPPORTED_OPERATION ERROR: Hash-Join can not partition 
> the inner data any further" when Semi join is enabled
> 
>
> Key: DRILL-6949
> URL: https://issues.apache.org/jira/browse/DRILL-6949
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Abhishek Ravi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
> Attachments: 23cc1240-74ff-a0c0-8cd5-938fc136e4e2.sys.drill, 
> 23cc1369-0812-63ce-1861-872636571437.sys.drill
>
>
> The following query fails with *Error: UNSUPPORTED_OPERATION ERROR: 
> Hash-Join can not partition the inner data any further (probably due to too 
> many join-key duplicates)* on TPC-H SF100 data.
> {code:sql}
> set `exec.hashjoin.enable.runtime_filter` = true;
> set `exec.hashjoin.runtime_filter.max.waiting.time` = 1;
> set `planner.enable_broadcast_join` = false;
> select
>  count(*)
> from
>  lineitem l1
> where
>  l1.l_discount IN (
>  select
>  distinct(cast(l2.l_discount as double))
>  from
>  lineitem l2);
> reset `exec.hashjoin.enable.runtime_filter`;
> reset `exec.hashjoin.runtime_filter.max.waiting.time`;
> reset `planner.enable_broadcast_join`;
> {code}
> The subquery contains the *distinct* keyword, so there should not be 
> duplicate values. 
> I suspect the failure is caused by semi-join, because the query succeeds 
> when semi-join is disabled explicitly.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6970:


Assignee: jean-claude

> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You 
> can query small log files but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embeded. The log storage plugin is configured like so
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log files is very simple
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  
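> For what it's worth, the usual remedy in Drill vector code is to write 
> through the mutator's setSafe() methods, which grow the underlying buffer on 
> demand; a minimal sketch (the vector and method shape are illustrative, not 
> the plugin's actual code):
> {code:java}
> import org.apache.drill.exec.vector.VarCharVector;
> 
> public class SafeWrite {
>   // setSafe() reallocates the vector's DrillBuf when rowIndex/bytes exceed
>   // the current capacity, where a plain set() would overflow it.
>   static void writeLine(VarCharVector vector, int rowIndex, byte[] bytes) {
>     vector.getMutator().setSafe(rowIndex, bytes, 0, bytes.length);
>   }
> }
> {code}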



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6966) JDBC Storage Plugin fails to retrieve schema of Oracle DB

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6966:
-
Fix Version/s: (was: 1.16.0)

> JDBC Storage Plugin fails to retrieve schema of Oracle DB
> --
>
> Key: DRILL-6966
> URL: https://issues.apache.org/jira/browse/DRILL-6966
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - JDBC
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Priority: Major
> Fix For: Future
>
>
>  
> I've configured drill to use an JDBC storage plugin. My connection string is 
> for an Oracle database. I have included the Oracle JDBC driver to my drill 
> deployment.
>  
> The connection is established correctly. However the storage plugin fails to 
> retrieve the schema of the database. 
>  
> The JDBC API provides two ways to get metadata about the database: 
> getSchemas() and getCatalogs(). Depending on the database vendor one of those 
> calls will return the information. The storage plugin correctly tries both 
> calls. However I believe there is an issue when using Oracle.
>  
> The JDBC API documents getSchemas() as returning two columns, "table 
> schema" and "table catalog", but in reality it returns only "table schema" as 
> the first column of the result set. I believe this to be a bug. Has anyone 
> tested the JDBC storage plugin against an Oracle DB and successfully 
> retrieved the list of schemas and tables? 
>  
> This article explains in details the issue in retrieving schema information 
> using JDBC. 
> [https://www.wisdomjobs.com/e-university/jdbc-tutorial-278/what-are-catalogs-and-schemas-1147.html]
>  
>  
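> A minimal probe of both metadata calls, reading columns by label rather than 
> by position, looks like this (the connection string and credentials are 
> placeholders):
> {code:java}
> import java.sql.Connection;
> import java.sql.DatabaseMetaData;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> 
> public class SchemaProbe {
>   public static void main(String[] args) throws Exception {
>     // Placeholder Oracle URL and credentials.
>     try (Connection conn = DriverManager.getConnection(
>         "jdbc:oracle:thin:@//host:1521/service", "user", "pass")) {
>       DatabaseMetaData md = conn.getMetaData();
>       try (ResultSet schemas = md.getSchemas()) {
>         while (schemas.next()) {
>           // Reading by label avoids assuming a fixed column order/count.
>           System.out.println("schema: " + schemas.getString("TABLE_SCHEM"));
>         }
>       }
>       try (ResultSet catalogs = md.getCatalogs()) {
>         while (catalogs.next()) {
>           System.out.println("catalog: " + catalogs.getString("TABLE_CAT"));
>         }
>       }
>     }
>   }
> }
> {code}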



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-5673) NPE from planner when using a table function with an invalid table

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-5673:


Assignee: Arina Ielchiieva

> NPE from planner when using a table function with an invalid table
> --
>
> Key: DRILL-5673
> URL: https://issues.apache.org/jira/browse/DRILL-5673
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.12.0
>
>
> Create a CSV file, with headers and call it "data_3.csv."
> Set up a storage plugin config with headers, delimiter of ",". Call it 
> "myws". Then read the file:
> {code}
> SELECT * FROM `dfs.myws`.`data_3.csv`
> {code}
> This query works.
> Try the same with a table function:
> {code}
> SELECT * FROM table(dfs.myws.`data_3.csv` (type => 'text', fieldDelimiter => 
> ',' , extractHeader => true))
> {code}
> This also works.
> Now, let's use an incorrect name:
> {code}
> SELECT * FROM `dfs.myws`.`data_33.csv`
> {code}
> (Note the "33" in the name.)
> This produces an error to the client:
> Now the bug. Do the same thing with the table function:
> {code}
> SELECT * FROM table(dfs.myws.`data_33.csv` (type => 'text', fieldDelimiter => 
> ',' , extractHeader => true))
> {code}
> This results in an NPE:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: null
> SQL Query null
> [Error Id: cf151c28-9879-4ecc-893a-78d85a11c2f4 on 10.250.50.74:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
> ...
> Caused by: java.lang.NullPointerException: null
>   at 
> org.apache.drill.exec.planner.logical.DrillTranslatableTable.getRowType(DrillTranslatableTable.java:49)
>  ~[classes/:na]
>   at 
> org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl(ProcedureNamespace.java:68)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:883)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:869)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2806)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:2791)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3014)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:86)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:883)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:869)
>  ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:210) 
> ~[calcite-core-1.4.0-drill-r21.jar:1.4.0-drill-r21]
> ...
> {code}
> This bug causes much user confusion as the user cannot immediately tell that 
> this is "user error" vs. something terribly wrong with Drill.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6989) Upgrade to SqlLine 1.7.0

2019-02-07 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6989:


Assignee: Arina Ielchiieva

> Upgrade to SqlLine 1.7.0
> 
>
> Key: DRILL-6989
> URL: https://issues.apache.org/jira/browse/DRILL-6989
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.15.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.16.0
>
>
> Upgrade to SqlLine 1.7.0 after its officially released.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6960) Auto Limit Wrapping should not apply to non-select query

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6960:
-
Remaining Estimate: (was: 168h)
 Original Estimate: (was: 168h)

> Auto Limit Wrapping should not apply to non-select query
> 
>
> Key: DRILL-6960
> URL: https://issues.apache.org/jira/browse/DRILL-6960
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Kunal Khatua
>Assignee: Kunal Khatua
>Priority: Major
>  Labels: user-experience
> Fix For: 1.16.0
>
>
> [~IhorHuzenko] pointed out that DRILL-6050 can cause submission of queries 
> with incorrect syntax. 
> For example, when the user enters {{SHOW DATABASES}}, after limit 
> wrapping {{SELECT * FROM (SHOW DATABASES) LIMIT 10}} will be posted. 
> This results in parsing errors, like:
> {{Query Failed: An Error Occurred 
> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 
> Encountered "( show" at line 2, column 15. Was expecting one of:  
> ... }}.
> The fix should involve a javascript check for all non-select queries, so 
> that the LIMIT wrap is not applied to them.
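> Whether the check lives in the web UI's javascript or server-side, it 
> amounts to something like the following (a Java sketch of the logic only; 
> the set of wrappable statement prefixes is an assumption):
> {code:java}
> import java.util.regex.Pattern;
> 
> public class AutoLimit {
>   // Only plain SELECT/WITH queries are safe to wrap; SHOW, SET, DROP,
>   // CREATE, etc. must pass through untouched.
>   private static final Pattern SELECT_LIKE = Pattern.compile(
>       "^\\s*(select|with)\\b.*",
>       Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
> 
>   static String maybeWrap(String sql, int limit) {
>     return SELECT_LIKE.matcher(sql).matches()
>         ? "SELECT * FROM (" + sql + ") LIMIT " + limit
>         : sql;
>   }
> }
> {code}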



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-2362) Drill should manage Query Profiling archiving

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-2362:
-
Issue Type: New Feature  (was: Bug)

> Drill should manage Query Profiling archiving
> -
>
> Key: DRILL-2362
> URL: https://issues.apache.org/jira/browse/DRILL-2362
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Assignee: Kunal Khatua
>Priority: Major
>
> We collect query profile information for analysis purposes, but we keep it 
> forever. At this time, for a few queries, it isn't a problem. But as users 
> start putting Drill into production, automated use via other applications 
> will make this grow quickly. We need to come up with a retention policy 
> mechanism, with suitable settings administrators can use, and implement it so 
> that this data can be cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-2861) enhance drill profile file management

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-2861:
-
Fix Version/s: (was: 1.16.0)

> enhance drill profile file management
> -
>
> Key: DRILL-2861
> URL: https://issues.apache.org/jira/browse/DRILL-2861
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 0.9.0
>Reporter: Chun Chang
>Assignee: Kunal Khatua
>Priority: Major
>
> We need to manage profile files better. Currently each query creates one 
> profile file on the local filesystem of the foreman node. You can imagine how 
> this can quickly get out of hand in a production environment.
> We need:
> 1. to be able to turn profiling on and off, preferably on the fly
> 2. profile files to be managed the same way as log files
> 3. to be able to change the default file location, for example to a 
> distributed filesystem



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6991) Kerberos ticket is being dumped in the log if log level is "debug" for stdout

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6991:


Assignee: Sorabh Hamirwasia

> Kerberos ticket is being dumped in the log if log level is "debug" for stdout 
> --
>
> Key: DRILL-6991
> URL: https://issues.apache.org/jira/browse/DRILL-6991
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Sorabh Hamirwasia
>Priority: Major
>
> *Prerequisites:*
>  # Drill is installed on cluster with Kerberos security
>  # Into conf/logback.xml, set the following log level:
> {code:xml}
>   
> 
> 
>   
> {code}
> *Steps:*
> # Start Drill
> # Connect using sqlline using the following string:
> {noformat}
> bin/sqlline -u "jdbc:drill:zk=;principal="
> {noformat}
> *Expected result:*
> No sensitive information should be displayed
> *Actual result:*
> Kerberos  ticket and session key are being dumped into console output:
> {noformat}
> 14:35:38.806 [TGT Renewer for mapr/node1.cluster.com@NODE1] DEBUG 
> o.a.h.security.UserGroupInformation - Found tgt Ticket (hex) = 
> : 61 82 01 3D 30 82 01 39   A0 03 02 01 05 A1 07 1B  a..=0..9
> 0010: 05 4E 4F 44 45 31 A2 1A   30 18 A0 03 02 01 02 A1  .NODE1..0...
> 0020: 11 30 0F 1B 06 6B 72 62   74 67 74 1B 05 4E 4F 44  .0...krbtgt..NOD
> 0030: 45 31 A3 82 01 0B 30 82   01 07 A0 03 02 01 12 A1  E10.
> 0040: 03 02 01 01 A2 81 FA 04   81 F7 03 8D A9 FA 7D 89  
> 0050: 1B DF 37 B7 4D E6 6C 99   3E 8F FA 48 D9 9A 79 F3  ..7.M.l.>..H..y.
> 0060: 92 34 7F BF 67 1E 77 4A   2F C9 AF 82 93 4E 46 1D  .4..g.wJ/NF.
> 0070: 41 74 B0 AF 41 A8 8B 02   71 83 CC 14 51 72 60 EE  At..A...q...Qr`.
> 0080: 29 67 14 F0 A6 33 63 07   41 AA 8D DC 7B 5B 41 F3  )g...3c.A[A.
> 0090: 83 48 8B 2A 0B 4D 6D 57   9A 6E CF 6B DC 0B C0 D1  .H.*.MmW.n.k
> 00A0: 83 BB 27 40 88 7E 9F 2B   D1 FD A8 6A E1 BF F6 CC  ..'@...+...j
> 00B0: 0E 0C FB 93 5D 69 9A 8B   11 88 0C F2 7C E1 FD 04  ]i..
> 00C0: F5 AB 66 0C A4 A4 7B 30   D1 7F F1 2D D6 A1 52 D1  ..f0...-..R.
> 00D0: 79 59 F2 06 CB 65 FB 73   63 1D 5B E9 4F 28 73 EB  yY...e.sc.[.O(s.
> 00E0: 72 7F 04 46 34 56 F4 40   6C C0 2C 39 C0 5B C6 25  r..F4V.@l.,9.[.%
> 00F0: ED EF 64 07 CE ED 35 9D   D7 91 6C 8F C9 CE 16 F5  ..d...5...l.
> 0100: CA 5E 6F DE 08 D2 68 30   C7 03 97 E7 C0 FF D9 52  .^o...h0...R
> 0110: F8 1D 2F DB 63 6D 12 4A   CD 60 AD D0 BA FA 4B CF  ../.cm.J.`K.
> 0120: 2C B9 8C CA 5A E6 EC 10   5A 0A 1F 84 B0 80 BD 39  ,...Z...Z..9
> 0130: 42 2C 33 EB C0 AA 0D 44   F0 F4 E9 87 24 43 BB 9A  B,3D$C..
> 0140: 52 R
> Client Principal = mapr/node1.cluster.com@NODE1
> Server Principal = krbtgt/NODE1@NODE1
> Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
> : 50 DA D1 D7 91 D3 64 BE   45 7B D8 02 25 81 18 25  P.d.E...%..%
> 0010: DA 59 4F BA 76 67 BB 39   9C F7 17 46 A7 C5 00 E2  .YO.vg.9...F
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6991) Kerberos ticket is being dumped in the log if log level is "debug" for stdout

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6991:
-
Fix Version/s: 1.16.0

> Kerberos ticket is being dumped in the log if log level is "debug" for stdout 
> --
>
> Key: DRILL-6991
> URL: https://issues.apache.org/jira/browse/DRILL-6991
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.16.0
>
>
> *Prerequisites:*
>  # Drill is installed on cluster with Kerberos security
>  # Into conf/logback.xml, set the following log level:
> {code:xml}
>   
> 
> 
>   
> {code}
> *Steps:*
> # Start Drill
> # Connect using sqlline using the following string:
> {noformat}
> bin/sqlline -u "jdbc:drill:zk=;principal="
> {noformat}
> *Expected result:*
> No sensitive information should be displayed
> *Actual result:*
> Kerberos  ticket and session key are being dumped into console output:
> {noformat}
> 14:35:38.806 [TGT Renewer for mapr/node1.cluster.com@NODE1] DEBUG 
> o.a.h.security.UserGroupInformation - Found tgt Ticket (hex) = 
> : 61 82 01 3D 30 82 01 39   A0 03 02 01 05 A1 07 1B  a..=0..9
> 0010: 05 4E 4F 44 45 31 A2 1A   30 18 A0 03 02 01 02 A1  .NODE1..0...
> 0020: 11 30 0F 1B 06 6B 72 62   74 67 74 1B 05 4E 4F 44  .0...krbtgt..NOD
> 0030: 45 31 A3 82 01 0B 30 82   01 07 A0 03 02 01 12 A1  E10.
> 0040: 03 02 01 01 A2 81 FA 04   81 F7 03 8D A9 FA 7D 89  
> 0050: 1B DF 37 B7 4D E6 6C 99   3E 8F FA 48 D9 9A 79 F3  ..7.M.l.>..H..y.
> 0060: 92 34 7F BF 67 1E 77 4A   2F C9 AF 82 93 4E 46 1D  .4..g.wJ/NF.
> 0070: 41 74 B0 AF 41 A8 8B 02   71 83 CC 14 51 72 60 EE  At..A...q...Qr`.
> 0080: 29 67 14 F0 A6 33 63 07   41 AA 8D DC 7B 5B 41 F3  )g...3c.A[A.
> 0090: 83 48 8B 2A 0B 4D 6D 57   9A 6E CF 6B DC 0B C0 D1  .H.*.MmW.n.k
> 00A0: 83 BB 27 40 88 7E 9F 2B   D1 FD A8 6A E1 BF F6 CC  ..'@...+...j
> 00B0: 0E 0C FB 93 5D 69 9A 8B   11 88 0C F2 7C E1 FD 04  ]i..
> 00C0: F5 AB 66 0C A4 A4 7B 30   D1 7F F1 2D D6 A1 52 D1  ..f0...-..R.
> 00D0: 79 59 F2 06 CB 65 FB 73   63 1D 5B E9 4F 28 73 EB  yY...e.sc.[.O(s.
> 00E0: 72 7F 04 46 34 56 F4 40   6C C0 2C 39 C0 5B C6 25  r..F4V.@l.,9.[.%
> 00F0: ED EF 64 07 CE ED 35 9D   D7 91 6C 8F C9 CE 16 F5  ..d...5...l.
> 0100: CA 5E 6F DE 08 D2 68 30   C7 03 97 E7 C0 FF D9 52  .^o...h0...R
> 0110: F8 1D 2F DB 63 6D 12 4A   CD 60 AD D0 BA FA 4B CF  ../.cm.J.`K.
> 0120: 2C B9 8C CA 5A E6 EC 10   5A 0A 1F 84 B0 80 BD 39  ,...Z...Z..9
> 0130: 42 2C 33 EB C0 AA 0D 44   F0 F4 E9 87 24 43 BB 9A  B,3D$C..
> 0140: 52 R
> Client Principal = mapr/node1.cluster.com@NODE1
> Server Principal = krbtgt/NODE1@NODE1
> Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
> : 50 DA D1 D7 91 D3 64 BE   45 7B D8 02 25 81 18 25  P.d.E...%..%
> 0010: DA 59 4F BA 76 67 BB 39   9C F7 17 46 A7 C5 00 E2  .YO.vg.9...F
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-2362) Drill should manage Query Profiling archiving

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-2362:


Assignee: Kunal Khatua

> Drill should manage Query Profiling archiving
> -
>
> Key: DRILL-2362
> URL: https://issues.apache.org/jira/browse/DRILL-2362
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Assignee: Kunal Khatua
>Priority: Major
>
> We collect query profile information for analysis purposes, but we keep it 
> forever. At this time, for a few queries, it isn't a problem. But as users 
> start putting Drill into production, automated use via other applications 
> will make this grow quickly. We need to come up with a retention policy 
> mechanism, with suitable settings administrators can use, and implement it so 
> that this data can be cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-2362) Drill should manage Query Profiling archiving

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-2362:
-
Fix Version/s: (was: 1.16.0)

> Drill should manage Query Profiling archiving
> -
>
> Key: DRILL-2362
> URL: https://issues.apache.org/jira/browse/DRILL-2362
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Priority: Major
>
> We collect query profile information for analysis purposes, but we keep it 
> forever. At this time, for a few queries, it isn't a problem. But as users 
> start putting Drill into production, automated use via other applications 
> will make this grow quickly. We need to come up with a retention policy 
> mechanism, with suitable settings administrators can use, and implement it so 
> that this data can be cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-2362) Drill should manage Query Profiling archiving

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-2362:


Assignee: (was: Padma Penumarthy)

> Drill should manage Query Profiling archiving
> -
>
> Key: DRILL-2362
> URL: https://issues.apache.org/jira/browse/DRILL-2362
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Other
>Affects Versions: 0.7.0
>Reporter: Chris Westin
>Priority: Major
> Fix For: 1.16.0
>
>
> We collect query profile information for analysis purposes, but we keep it 
> forever. At this time, for a few queries, it isn't a problem. But as users 
> start putting Drill into production, automated use via other applications 
> will make this grow quickly. We need to come up with a retention policy 
> mechanism, with suitable settings administrators can use, and implement it so 
> that this data can be cleaned up.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-2861) enhance drill profile file management

2019-02-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-2861:


Assignee: Kunal Khatua  (was: Padma Penumarthy)

> enhance drill profile file management
> -
>
> Key: DRILL-2861
> URL: https://issues.apache.org/jira/browse/DRILL-2861
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 0.9.0
>Reporter: Chun Chang
>Assignee: Kunal Khatua
>Priority: Major
> Fix For: 1.16.0
>
>
> We need to manage profile files better. Currently each query creates one 
> profile file on the local filesystem of the foreman node. You can imagine how 
> this can quickly get out of hand in a production environment.
> We need:
> 1. to be able to turn profiling on and off, preferably on the fly
> 2. profile files to be managed the same way as log files
> 3. to be able to change the default file location, for example to a 
> distributed filesystem



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7018) Drill Query (when store.parquet.reader.int96_as_timestamp=true) on Parquet File fails with Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 0, writerIndex: 37

2019-01-30 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7018:
-
Remaining Estimate: 0h  (was: 24h)
 Original Estimate: 0h  (was: 24h)
  Reviewer: Boaz Ben-Zvi

> Drill Query (when store.parquet.reader.int96_as_timestamp=true) on Parquet 
> File fails with Error: SYSTEM ERROR: IndexOutOfBoundsException: readerIndex: 
> 0, writerIndex: 372 (expected: 0 <= readerIndex <= writerIndex <= 
> capacity(256))
> 
>
> Key: DRILL-7018
> URL: https://issues.apache.org/jira/browse/DRILL-7018
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Parquet
>Affects Versions: 1.14.0
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
> Fix For: 1.16.0
>
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> alter system set `store.parquet.reader.int96_as_timestamp`= true
> run a query which projects a column of type Parquet INT96 timestamp with 31 
> nulls
> The following exception will be thrown:
> java.lang.IndexOutOfBoundsException: readerIndex: 0, writerIndex: 372 
> (expected: 0 <= readerIndex <= writerIndex <= capacity(256))
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7014) Format plugin for LTSV files

2019-01-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7014:
-
Fix Version/s: 1.16.0

> Format plugin for LTSV files
> 
>
> Key: DRILL-7014
> URL: https://issues.apache.org/jira/browse/DRILL-7014
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.15.0
>Reporter: Takako Shimamoto
>Assignee: Takako Shimamoto
>Priority: Major
> Fix For: 1.16.0
>
>
> I would like to contribute [this 
> plugin|https://github.com/bizreach/drill-ltsv-plugin] to Drill.
> h4. Abstract
> storage-plugins-override.conf
> {code:json}
> "storage":{
>   dfs: {
> type: "file",
> connection: "file:///",
> formats: {
>   "ltsv": {
> "type": "ltsv",
> "extensions": [
>   "ltsv"
> ]
>   }
> },
> enabled: true
>   }
> }
> {code}
> sample.ltsv
> {code}
> time:30/Nov/2016:00:55:08 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/xxx HTTP/1.1  status:200  size:4968 referer:- ua:Java/1.8.0_131 
> reqtime:2.532 apptime:2.532 vhost:api.example.com
> time:30/Nov/2016:00:56:37 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/yyy HTTP/1.1  status:200  size:412  referer:- ua:Java/1.8.0_201 
> reqtime:3.580 apptime:3.580 vhost:api.example.com
> {code}
> Run query
> {code:sh}
> root@1805183e9b65:/apache-drill-1.15.0# ./bin/drill-embedded 
> Apache Drill 1.15.0
> "Drill must go on."
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.`/apache-drill-1.15.0/sample-data/sample.ltsv` WHERE reqtime > 3.0;
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> |            time             |       host       | forwardedfor  |          req          | status  | size  | referer  |       ua        | reqtime  | apptime  |      vhost       |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> | 30/Nov/2016:00:56:37 +0900  | xxx.xxx.xxx.xxx  | -             | GET /v1/yyy HTTP/1.1  | 200     | 412   | -        | Java/1.8.0_201  | 3.580    | 3.580    | api.example.com  |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> 1 row selected (6.074 seconds)
> 0: jdbc:drill:zk=local> 
> {code}
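> For readers unfamiliar with the format: LTSV is one record per line, with 
> tab-separated {{label:value}} fields. A minimal parsing sketch (not the 
> plugin's actual code):
> {code:java}
> import java.util.LinkedHashMap;
> import java.util.Map;
> 
> public class LtsvLine {
>   // Parse one LTSV record: fields are tab-separated, each "label:value".
>   static Map<String, String> parse(String line) {
>     Map<String, String> row = new LinkedHashMap<>();
>     for (String field : line.split("\t")) {
>       int colon = field.indexOf(':');
>       if (colon > 0) {
>         row.put(field.substring(0, colon), field.substring(colon + 1));
>       }
>     }
>     return row;
>   }
> }
> {code}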



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7014) Format plugin for LTSV files

2019-01-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-7014:
-
Reviewer: Charles Givre

> Format plugin for LTSV files
> 
>
> Key: DRILL-7014
> URL: https://issues.apache.org/jira/browse/DRILL-7014
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.15.0
>Reporter: Takako Shimamoto
>Assignee: Takako Shimamoto
>Priority: Major
> Fix For: 1.16.0
>
>
> I would like to contribute [this 
> plugin|https://github.com/bizreach/drill-ltsv-plugin] to Drill.
> h4. Abstract
> storage-plugins-override.conf
> {code:json}
> "storage":{
>   dfs: {
> type: "file",
> connection: "file:///",
> formats: {
>   "ltsv": {
> "type": "ltsv",
> "extensions": [
>   "ltsv"
> ]
>   }
> },
> enabled: true
>   }
> }
> {code}
> sample.ltsv
> {code}
> time:30/Nov/2016:00:55:08 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/xxx HTTP/1.1  status:200  size:4968 referer:- ua:Java/1.8.0_131 
> reqtime:2.532 apptime:2.532 vhost:api.example.com
> time:30/Nov/2016:00:56:37 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/yyy HTTP/1.1  status:200  size:412  referer:- ua:Java/1.8.0_201 
> reqtime:3.580 apptime:3.580 vhost:api.example.com
> {code}
> Run query
> {code:sh}
> root@1805183e9b65:/apache-drill-1.15.0# ./bin/drill-embedded 
> Apache Drill 1.15.0
> "Drill must go on."
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.`/apache-drill-1.15.0/sample-data/sample.ltsv` WHERE reqtime > 3.0;
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> |            time             |       host       | forwardedfor  |          req          | status  | size  | referer  |       ua        | reqtime  | apptime  |      vhost       |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> | 30/Nov/2016:00:56:37 +0900  | xxx.xxx.xxx.xxx  | -             | GET /v1/yyy HTTP/1.1  | 200     | 412   | -        | Java/1.8.0_201  | 3.580    | 3.580    | api.example.com  |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> 1 row selected (6.074 seconds)
> 0: jdbc:drill:zk=local> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7014) Format plugin for LTSV files

2019-01-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-7014:


Assignee: Takako Shimamoto

> Format plugin for LTSV files
> 
>
> Key: DRILL-7014
> URL: https://issues.apache.org/jira/browse/DRILL-7014
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.15.0
>Reporter: Takako Shimamoto
>Assignee: Takako Shimamoto
>Priority: Major
>
> I would like to contribute [this 
> plugin|https://github.com/bizreach/drill-ltsv-plugin] to Drill.
> h4. Abstract
> storage-plugins-override.conf
> {code:json}
> "storage":{
>   dfs: {
> type: "file",
> connection: "file:///",
> formats: {
>   "ltsv": {
> "type": "ltsv",
> "extensions": [
>   "ltsv"
> ]
>   }
> },
> enabled: true
>   }
> }
> {code}
> sample.ltsv
> {code}
> time:30/Nov/2016:00:55:08 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/xxx HTTP/1.1  status:200  size:4968 referer:- ua:Java/1.8.0_131 
> reqtime:2.532 apptime:2.532 vhost:api.example.com
> time:30/Nov/2016:00:56:37 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/yyy HTTP/1.1  status:200  size:412  referer:- ua:Java/1.8.0_201 
> reqtime:3.580 apptime:3.580 vhost:api.example.com
> {code}
> Run query
> {code:sh}
> root@1805183e9b65:/apache-drill-1.15.0# ./bin/drill-embedded 
> Apache Drill 1.15.0
> "Drill must go on."
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.`/apache-drill-1.15.0/sample-data/sample.ltsv` WHERE reqtime > 3.0;
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> |            time             |       host       | forwardedfor  |          req          | status  | size  | referer  |       ua        | reqtime  | apptime  |      vhost       |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> | 30/Nov/2016:00:56:37 +0900  | xxx.xxx.xxx.xxx  | -             | GET /v1/yyy HTTP/1.1  | 200     | 412   | -        | Java/1.8.0_201  | 3.580    | 3.580    | api.example.com  |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> 1 row selected (6.074 seconds)
> 0: jdbc:drill:zk=local> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7014) Format plugin for LTSV files

2019-01-29 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755545#comment-16755545
 ] 

Pritesh Maker commented on DRILL-7014:
--

[~cgivre] would you be able to review this contribution?

> Format plugin for LTSV files
> 
>
> Key: DRILL-7014
> URL: https://issues.apache.org/jira/browse/DRILL-7014
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.15.0
>Reporter: Takako Shimamoto
>Assignee: Takako Shimamoto
>Priority: Major
>
> I would like to contribute [this 
> plugin|https://github.com/bizreach/drill-ltsv-plugin] to Drill.
> h4. Abstract
> storage-plugins-override.conf
> {code:json}
> "storage":{
>   dfs: {
> type: "file",
> connection: "file:///",
> formats: {
>   "ltsv": {
> "type": "ltsv",
> "extensions": [
>   "ltsv"
> ]
>   }
> },
> enabled: true
>   }
> }
> {code}
> sample.ltsv
> {code}
> time:30/Nov/2016:00:55:08 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/xxx HTTP/1.1  status:200  size:4968 referer:- ua:Java/1.8.0_131 
> reqtime:2.532 apptime:2.532 vhost:api.example.com
> time:30/Nov/2016:00:56:37 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/yyy HTTP/1.1  status:200  size:412  referer:- ua:Java/1.8.0_201 
> reqtime:3.580 apptime:3.580 vhost:api.example.com
> {code}
> Run query
> {code:sh}
> root@1805183e9b65:/apache-drill-1.15.0# ./bin/drill-embedded 
> Apache Drill 1.15.0
> "Drill must go on."
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.`/apache-drill-1.15.0/sample-data/sample.ltsv` WHERE reqtime > 3.0;
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> |            time             |       host       | forwardedfor  |          req          | status  | size  | referer  |       ua        | reqtime  | apptime  |      vhost       |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> | 30/Nov/2016:00:56:37 +0900  | xxx.xxx.xxx.xxx  | -             | GET /v1/yyy HTTP/1.1  | 200     | 412   | -        | Java/1.8.0_201  | 3.580    | 3.580    | api.example.com  |
> +-----------------------------+------------------+---------------+-----------------------+---------+-------+----------+-----------------+----------+----------+------------------+
> 1 row selected (6.074 seconds)
> 0: jdbc:drill:zk=local> 
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6991) Kerberos ticket is being dumped in the log if log level is "debug" for stdout

2019-01-29 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755543#comment-16755543
 ] 

Pritesh Maker commented on DRILL-6991:
--

[~shamirwasia] do you recommend we close this issue?

> Kerberos ticket is being dumped in the log if log level is "debug" for stdout 
> --
>
> Key: DRILL-6991
> URL: https://issues.apache.org/jira/browse/DRILL-6991
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Priority: Major
>
> *Prerequisites:*
>  # Drill is installed on cluster with Kerberos security
>  # Into conf/logback.xml, set the following log level:
> {code:xml}
>   
> 
> 
>   
> {code}
> *Steps:*
> # Start Drill
> # Connect using sqlline using the following string:
> {noformat}
> bin/sqlline -u "jdbc:drill:zk=;principal="
> {noformat}
> *Expected result:*
> No sensitive information should be displayed
> *Actual result:*
> Kerberos  ticket and session key are being dumped into console output:
> {noformat}
> 14:35:38.806 [TGT Renewer for mapr/node1.cluster.com@NODE1] DEBUG 
> o.a.h.security.UserGroupInformation - Found tgt Ticket (hex) = 
> : 61 82 01 3D 30 82 01 39   A0 03 02 01 05 A1 07 1B  a..=0..9
> 0010: 05 4E 4F 44 45 31 A2 1A   30 18 A0 03 02 01 02 A1  .NODE1..0...
> 0020: 11 30 0F 1B 06 6B 72 62   74 67 74 1B 05 4E 4F 44  .0...krbtgt..NOD
> 0030: 45 31 A3 82 01 0B 30 82   01 07 A0 03 02 01 12 A1  E10.
> 0040: 03 02 01 01 A2 81 FA 04   81 F7 03 8D A9 FA 7D 89  
> 0050: 1B DF 37 B7 4D E6 6C 99   3E 8F FA 48 D9 9A 79 F3  ..7.M.l.>..H..y.
> 0060: 92 34 7F BF 67 1E 77 4A   2F C9 AF 82 93 4E 46 1D  .4..g.wJ/NF.
> 0070: 41 74 B0 AF 41 A8 8B 02   71 83 CC 14 51 72 60 EE  At..A...q...Qr`.
> 0080: 29 67 14 F0 A6 33 63 07   41 AA 8D DC 7B 5B 41 F3  )g...3c.A[A.
> 0090: 83 48 8B 2A 0B 4D 6D 57   9A 6E CF 6B DC 0B C0 D1  .H.*.MmW.n.k
> 00A0: 83 BB 27 40 88 7E 9F 2B   D1 FD A8 6A E1 BF F6 CC  ..'@...+...j
> 00B0: 0E 0C FB 93 5D 69 9A 8B   11 88 0C F2 7C E1 FD 04  ]i..
> 00C0: F5 AB 66 0C A4 A4 7B 30   D1 7F F1 2D D6 A1 52 D1  ..f0...-..R.
> 00D0: 79 59 F2 06 CB 65 FB 73   63 1D 5B E9 4F 28 73 EB  yY...e.sc.[.O(s.
> 00E0: 72 7F 04 46 34 56 F4 40   6C C0 2C 39 C0 5B C6 25  r..F4V.@l.,9.[.%
> 00F0: ED EF 64 07 CE ED 35 9D   D7 91 6C 8F C9 CE 16 F5  ..d...5...l.
> 0100: CA 5E 6F DE 08 D2 68 30   C7 03 97 E7 C0 FF D9 52  .^o...h0...R
> 0110: F8 1D 2F DB 63 6D 12 4A   CD 60 AD D0 BA FA 4B CF  ../.cm.J.`K.
> 0120: 2C B9 8C CA 5A E6 EC 10   5A 0A 1F 84 B0 80 BD 39  ,...Z...Z..9
> 0130: 42 2C 33 EB C0 AA 0D 44   F0 F4 E9 87 24 43 BB 9A  B,3D$C..
> 0140: 52 R
> Client Principal = mapr/node1.cluster.com@NODE1
> Server Principal = krbtgt/NODE1@NODE1
> Session Key = EncryptionKey: keyType=18 keyBytes (hex dump)=
> : 50 DA D1 D7 91 D3 64 BE   45 7B D8 02 25 81 18 25  P.d.E...%..%
> 0010: DA 59 4F BA 76 67 BB 39   9C F7 17 46 A7 C5 00 E2  .YO.vg.9...F
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6855) Query from non-existent proxy user fails with "No default schema selected" when impersonation is enabled

2019-01-29 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6855:
-
Fix Version/s: 1.16.0

> Query from non-existent proxy user fails with "No default schema selected" 
> when impersonation is enabled
> 
>
> Key: DRILL-6855
> URL: https://issues.apache.org/jira/browse/DRILL-6855
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Abhishek Ravi
>Assignee: Abhishek Ravi
>Priority: Major
> Fix For: 1.16.0
>
>
> A query from a *proxy user* fails with the following error when 
> *impersonation* is *enabled* but the user does not exist. This behaviour was 
> discovered when running Drill on MapR.
> {noformat}
> Error: VALIDATION ERROR: Schema [[dfs]] is not valid with respect to either 
> root schema or current default schema.
> Current default schema: No default schema selected
> {noformat}
> The above error is very confusing and made it very hard to relate to the 
> non-existent proxy user + impersonation issue. 
> The {{fs.access(wsPath, FsAction.READ)}} call in 
> {{WorkspaceSchemaFactory.accessible}} fails with an {{IOException}}, which is 
> not handled in {{accessible}} but in {{DynamicRootSchema.loadSchemaFactory}}. 
> At this point none of the schemas are registered, and hence the root schema 
> will be registered as the default schema. 
> Query execution then continues and fails much later, in 
> {{DrillSqlWorker.getQueryPlan}}, where {{SqlConverter.validate}} 
> eventually throws {{SchemaUtilites.throwSchemaNotFoundException}}.
> One possible fix could be to handle {{IOException}} the same way as 
> {{FileNotFoundException}} in {{WorkspaceSchemaFactory.accessible}}, as 
> sketched below.
>  
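> A minimal sketch of that handling (the method shape is illustrative, not the 
> actual {{WorkspaceSchemaFactory}} code):
> {code:java}
> import java.io.IOException;
> import org.apache.hadoop.fs.FileSystem;
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.fs.permission.FsAction;
> 
> public class WorkspaceAccessCheck {
>   // Treat any IOException like the missing-file case, so a non-existent
>   // proxy user yields "workspace not accessible" instead of surfacing a
>   // confusing "no default schema" error later in planning.
>   static boolean accessible(FileSystem fs, Path wsPath) {
>     try {
>       fs.access(wsPath, FsAction.READ);
>       return true;
>     } catch (IOException e) { // includes FileNotFoundException
>       return false;
>     }
>   }
> }
> {code}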



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-4858) REPEATED_COUNT on JSON containing an array of maps

2019-01-19 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-4858:
-
Fix Version/s: 1.16.0

> REPEATED_COUNT on JSON containing an array of maps
> --
>
> Key: DRILL-4858
> URL: https://issues.apache.org/jira/browse/DRILL-4858
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: jean-claude
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.16.0
>
>
> REPEATED_COUNT of JSON containing an array of maps does not work.
> JSON file
> {code}
> drill$ cat /Users/jccote/repeated_count.json 
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {code}
> select
> {code}
> 0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
> dfs.`/Users/jccote/repeated_count.json`;
> {code}
> error
> {code}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
> (state=,code=0)
> {code}
> Looking at org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions, it 
> looks like this function is not enabled yet. 
> {code}
>   // TODO - need to confirm that these work   SMP: They do not
>   @FunctionTemplate(name = "repeated_count", scope = 
> FunctionTemplate.FunctionScope.SIMPLE)
>   public static class RepeatedLengthMap implements DrillSimpleFunc {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-4858) REPEATED_COUNT on JSON containing an array of maps

2019-01-19 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-4858:


Assignee: Bohdan Kazydub

> REPEATED_COUNT on JSON containing an array of maps
> --
>
> Key: DRILL-4858
> URL: https://issues.apache.org/jira/browse/DRILL-4858
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: jean-claude
>Assignee: Bohdan Kazydub
>Priority: Minor
>
> REPEATED_COUNT on JSON containing an array of maps does not work.
> JSON file
> {code}
> drill$ cat /Users/jccote/repeated_count.json 
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {"intArray": [1,2,3,4], "mapArray": [{"name": "foo"},{"name": "foo"}], 
> "label": "foo"}
> {code}
> select
> {code}
> 0: jdbc:drill:zk=local> select repeated_count(mapArray) from 
> dfs.`/Users/jccote/repeated_count.json`;
> {code}
> error
> {code}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [repeated_count(MAP-REPEATED)].  Full expression: --UNKNOWN EXPRESSION--..
> Fragment 0:0
> [Error Id: 1057bb8e-1cc4-4a9a-a748-3a6a14092858 on 192.168.1.3:31010] 
> (state=,code=0)
> {code}
> Looking at org.apache.drill.exec.expr.fn.impl.SimpleRepeatedFunctions, it 
> looks like this function is not enabled yet. 
> {code}
>   // TODO - need to confirm that these work   SMP: They do not
>   @FunctionTemplate(name = "repeated_count", scope = 
> FunctionTemplate.FunctionScope.SIMPLE)
>   public static class RepeatedLengthMap implements DrillSimpleFunc {
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6938) SQL get the wrong result after hashjoin and hashagg disabled

2019-01-18 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6938:
-
Fix Version/s: 1.16.0

> SQL get the wrong result after hashjoin and hashagg disabled
> 
>
> Key: DRILL-6938
> URL: https://issues.apache.org/jira/browse/DRILL-6938
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.13.0
>Reporter: Dony Dong
>Assignee: Boaz Ben-Zvi
>Priority: Critical
> Fix For: 1.16.0
>
>
> Hi Team
> After we disabled hashjoin and hashagg to work around an out-of-memory issue, 
> we got wrong results.
> With these two options enabled, we get 8 rows. After we disable them, the 
> query returns only 3 rows. It seems some MEM_ID values are excluded before the 
> group-by or some other step.
> {code:sql}
> select b.MEM_ID, count(distinct b.DEP_NO)
> from dfs.test.emp b
> where b.DEP_NO <> '-'
> and b.MEM_ID in ('68','412','852','117','657','816','135','751')
> and b.HIRE_DATE > '2014-06-01'
> group by b.MEM_ID
> order by 1;
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6845) Eliminate duplicates for Semi Hash Join

2019-01-14 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742389#comment-16742389
 ] 

Pritesh Maker commented on DRILL-6845:
--

[~ben-zvi] adding the link to the PR - https://github.com/apache/drill/pull/1606

> Eliminate duplicates for Semi Hash Join
> ---
>
> Key: DRILL-6845
> URL: https://issues.apache.org/jira/browse/DRILL-6845
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.16.0
>
>
> Following DRILL-6735: The performance of the new Semi Hash Join may degrade 
> if the build side contains an excessive number of join-key duplicate rows; this 
> is mainly a result of the need to store all those rows first, before the hash 
> table is built.
>   Proposed solution: For Semi, the Hash Agg would create a Hash-Table 
> initially, and use it to eliminate key-duplicate rows as they arrive.
>   Proposed extra: That Hash-Table has an added cost (e.g. resizing). So 
> perform "runtime stats" – check the initial number of incoming rows (e.g. 32k), 
> and if the fraction of duplicates is less than some threshold (e.g. 20%) – 
> cancel that "early" hash table.
>  
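> A rough sketch of that build-side flow follows; the sample size and threshold 
> constants are the illustrative values from above (32k rows, 20%), and the 
> class is a stand-in, not the actual HashAgg code.
> {code:java}
> import java.util.ArrayList;
> import java.util.HashSet;
> import java.util.List;
> import java.util.Set;
> 
> class SemiBuildSketch {
>   static final int SAMPLE_SIZE = 32_768;    // check the first 32k rows
>   static final double DUP_THRESHOLD = 0.20; // cancel early table below 20%
> 
>   static List<Long> buildSide(Iterable<Long> incomingKeys) {
>     List<Long> kept = new ArrayList<>();
>     Set<Long> earlyTable = new HashSet<>(); // the "early" hash table
>     int seen = 0;
>     boolean dedup = true;
>     for (long key : incomingKeys) {
>       if (dedup) {
>         if (earlyTable.add(key)) {
>           kept.add(key); // keep only the first occurrence of each key
>         }
>         seen++;
>         if (seen == SAMPLE_SIZE) {
>           double dupRatio = 1.0 - (double) earlyTable.size() / seen;
>           if (dupRatio < DUP_THRESHOLD) {
>             dedup = false;     // few duplicates: table not worth its cost
>             earlyTable = null; // cancel the "early" hash table
>           }
>         }
>       } else {
>         kept.add(key); // store as before; dedup happens when the real
>                        // hash table is built
>       }
>     }
>     return kept;
>   }
> }
> {code}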



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6880) Hash-Join: Many null keys on the build side form a long linked chain in the Hash Table

2019-01-08 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6880:
-
Reviewer: Aman Sinha

> Hash-Join: Many null keys on the build side form a long linked chain in the 
> Hash Table
> --
>
> Key: DRILL-6880
> URL: https://issues.apache.org/jira/browse/DRILL-6880
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Critical
> Fix For: 1.16.0
>
>
> When building the Hash Table for the Hash-Join, each new key is matched with 
> an existing key (same bucket) by calling the generated method 
> `isKeyMatchInternalBuild`, which compares the two. However when both keys are 
> null, the method returns *false* (meaning not-equal; i.e. it is a new key), 
> thus the new key is added into the list following the old key. When a third 
> null key is found, it would be matched with the prior two, and added as well. 
> Etc etc ...
> This way many null values would perform on the order of N^2 / 2 key comparisons.
> _Suggested improvement_: The generated code should return a third result, 
> meaning "two null keys". Then in case of Inner or Left joins all the 
> duplicate nulls can be discarded.
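> A hedged sketch of such a three-valued comparison; the enum and method here 
> are illustrative, not the actual generated code:
> {code:java}
> enum KeyMatch { MATCH, NO_MATCH, BOTH_NULL }
> 
> class NullKeySketch {
>   static KeyMatch isKeyMatch(Integer newKey, Integer oldKey) {
>     if (newKey == null && oldKey == null) {
>       // Today this case reports "not equal", chaining every null key;
>       // reporting BOTH_NULL lets inner/left joins drop the duplicate.
>       return KeyMatch.BOTH_NULL;
>     }
>     if (newKey == null || oldKey == null) {
>       return KeyMatch.NO_MATCH;
>     }
>     return newKey.equals(oldKey) ? KeyMatch.MATCH : KeyMatch.NO_MATCH;
>   }
> }
> {code}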
> Below is a simple example; note the time difference between the non-null and 
> the all-nulls tables (instrumentation also showed that for nulls, the method 
> above was called 1249975000 times!!)
> {code:java}
> 0: jdbc:drill:zk=local> use dfs.tmp;
> 0: jdbc:drill:zk=local> create table testNull as (select cast(null as int) 
> mycol from 
>  dfs.`/data/test128M.tbl` limit 5);
> 0: jdbc:drill:zk=local> create table test1 as (select cast(1 as int) mycol1 
> from 
>  dfs.`/data/test128M.tbl` limit 6);
> 0: jdbc:drill:zk=local> create table test2 as (select cast(2 as int) mycol2 
> from dfs.`/data/test128M.tbl` limit 5);
> 0: jdbc:drill:zk=local> select count(*) from test1 join test2 on test1.mycol1 
> = test2.mycol2;
> +---------+
> | EXPR$0  |
> +---------+
> | 0       |
> +---------+
> 1 row selected (0.443 seconds)
> 0: jdbc:drill:zk=local> select count(*) from test1 join testNull on 
> test1.mycol1 = testNull.mycol;
> +---------+
> | EXPR$0  |
> +---------+
> | 0       |
> +---------+
> 1 row selected (140.098 seconds)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6253) HashAgg Unit Testing And Refactoring

2019-01-04 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6253:
-
Fix Version/s: (was: 1.16.0)
   Future

> HashAgg Unit Testing And Refactoring
> 
>
> Key: DRILL-6253
> URL: https://issues.apache.org/jira/browse/DRILL-6253
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: Future
>
>
> This is a parent issue to hold all the subtasks required to refactor HashAgg 
> to make it unit testable. Design doc
> https://docs.google.com/document/d/110BAWg3QXMfdmuqB0p3HuaoKpPGY-lqCRtHFxdh53Ds/edit?usp=sharing



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5897) Support Query Cancellation when WebConnection is closed on client side both for authenticated and unauthenticated user's

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-5897:
-
Fix Version/s: (was: 1.16.0)

> Support Query Cancellation when WebConnection is closed on client side both 
> for authenticated and unauthenticated user's
> 
>
> Key: DRILL-5897
> URL: https://issues.apache.org/jira/browse/DRILL-5897
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Web Server
>Reporter: Sorabh Hamirwasia
>Priority: Major
> Fix For: Future
>
>
> Today no session is created (using cookies) for an unauthenticated web user, 
> whereas for authenticated users a session is created. Also, when a user submits 
> a query, we wait until the entire result is gathered on the WebServer side and 
> then send the entire web page in the response (probably that's how ftl works).
> For authenticated users we only cancel the in-flight queries when the 
> session is invalidated (either by timeout or logout). However, in the absence 
> of a session we do nothing for unauthenticated users, so once a query is 
> submitted it will run until it fails or succeeds. The only way to explicitly 
> cancel a query is from the profile page, which will not work when profiles are 
> disabled.
> We should research whether it's possible to get the underlying 
> WebConnection (not session) close event and cancel the queries running on that 
> connection as part of the close event. Also, since today we wait for the entire 
> query to finish on the backend server before sending the response back, by the 
> time a bad connection is detected it doesn't make sense to cancel 
> (there is a 1:1 mapping between request and connection) since the query has 
> already completed. Instead we can send the header followed by batches of data 
> (pagination); then we can detect early enough whether the connection is still 
> valid and cancel the query in response to that. More research is needed in this 
> area, along with knowledge of Jetty on how this can be achieved, to make our 
> WebServer more performant.
>  It would also be good to explore whether we can provide sessions for 
> unauthenticated user connections too, based on a timeout, and then handle 
> query cancellation as part of the session timeout. This will also impact the 
> way we support the impersonation-without-authentication scenario, where we ask 
> the user to input the query user name for each request. If we support sessions, 
> the username should be captured at the session level rather than per request, 
> which can be achieved by logging the user in without a password (similar to the 
> authentication flow).
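> A rough sketch of the batched-write idea using only standard java.io; the 
> {{QueryRunner}} interface is a hypothetical stand-in for Drill's query 
> machinery, not an actual Drill class.
> {code:java}
> import java.io.PrintWriter;
> import java.util.Iterator;
> 
> class PaginatedResponseSketch {
> 
>   // Hypothetical stand-in for backend query execution.
>   interface QueryRunner {
>     Iterator<String> batches(); // rendered batches of result rows
>     void cancel();              // cancels the still-running query
>   }
> 
>   // Write batches as they are produced; a dropped connection then surfaces
>   // as a write error while the query is still running, so cancellation is
>   // still useful -- unlike the current gather-everything-first model.
>   static void stream(QueryRunner query, PrintWriter out) {
>     Iterator<String> batches = query.batches();
>     while (batches.hasNext()) {
>       out.write(batches.next());
>       if (out.checkError()) { // flushes and reports a failed write
>         query.cancel();
>         return;
>       }
>     }
>   }
> }
> {code}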



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6914) Query with RuntimeFilter and SemiJoin fails with IllegalStateException: Memory was leaked by query

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6914:


Assignee: Boaz Ben-Zvi

> Query with RuntimeFilter and SemiJoin fails with IllegalStateException: 
> Memory was leaked by query
> --
>
> Key: DRILL-6914
> URL: https://issues.apache.org/jira/browse/DRILL-6914
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.15.0
>Reporter: Abhishek Ravi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.16.0
>
>
> The following query fails on the TPC-H SF 100 dataset when 
> exec.hashjoin.enable.runtime_filter = true AND planner.enable_semijoin = true.
> Note that the query does not fail if either one or both are disabled.
> {code:sql}
> set `exec.hashjoin.enable.runtime_filter` = true;
> set `exec.hashjoin.runtime_filter.max.waiting.time` = 1;
> set `planner.enable_broadcast_join` = false;
> set `planner.enable_semijoin` = true;
> select
>  count(*) as row_count
> from
>  lineitem l1
> where
>  l1.l_shipdate IN (
>  select
>  distinct(cast(l2.l_shipdate as date))
>  from
>  lineitem l2);
> reset `exec.hashjoin.enable.runtime_filter`;
> reset `exec.hashjoin.runtime_filter.max.waiting.time`;
> reset `planner.enable_broadcast_join`;
> reset `planner.enable_semijoin`;
> {code}
>  
> {noformat}
> Error: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. 
> Memory leaked: (134217728)
> Allocator(frag:1:0) 800/134217728/172453568/70126322567 
> (res/actual/peak/limit)
> Fragment 1:0
> Please, refer to logs for more information.
> [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010] 
> (state=,code=0)
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked 
> by query. Memory leaked: (134217728)
> Allocator(frag:1:0) 800/134217728/172453568/70126322567 
> (res/actual/peak/limit)
> Fragment 1:0
> Please, refer to logs for more information.
> [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010]
> at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:536)
> at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:640)
> at org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:217)
> at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:151)
> at sqlline.BufferedRows.<init>(BufferedRows.java:37)
> at sqlline.SqlLine.print(SqlLine.java:1716)
> at sqlline.Commands.execute(Commands.java:949)
> at sqlline.Commands.sql(Commands.java:882)
> at sqlline.SqlLine.dispatch(SqlLine.java:725)
> at sqlline.SqlLine.runCommands(SqlLine.java:1779)
> at sqlline.Commands.run(Commands.java:1485)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
> at sqlline.SqlLine.dispatch(SqlLine.java:722)
> at sqlline.SqlLine.initArgs(SqlLine.java:458)
> at sqlline.SqlLine.begin(SqlLine.java:514)
> at sqlline.SqlLine.start(SqlLine.java:264)
> at sqlline.SqlLine.main(SqlLine.java:195)
> Caused by: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM 
> ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: 
> (134217728)
> Allocator(frag:1:0) 800/134217728/172453568/70126322567 
> (res/actual/peak/limit)
> Fragment 1:0
> Please, refer to logs for more information.
> [Error Id: ccee18b3-c3ff-4fdb-b314-23a6cfed0a0e on qa-node185.qa.lab:31010]
> at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
> at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422)
> at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273)
> at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243)
> at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
> at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
> at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
> at 
> 

[jira] [Assigned] (DRILL-6739) Update Kafka libs to 2.0.0 version

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6739:


Assignee: Vitalii Diravka

> Update Kafka libs to 2.0.0 version
> --
>
> Key: DRILL-6739
> URL: https://issues.apache.org/jira/browse/DRILL-6739
> Project: Apache Drill
>  Issue Type: Task
>  Components: Storage - Kafka
>Affects Versions: 1.14.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.16.0
>
>
> The current version of the Kafka libs is 0.11.0.1.
>  The latest version is 2.0.0 (September 2018): 
> https://mvnrepository.com/artifact/org.apache.kafka/kafka-clients
> Looks like the only changes which should be done are:
>  * replacing {{serverConfig()}} method with {{staticServerConfig()}} in Drill 
> {{EmbeddedKafkaCluster}} class
>  * Replacing deprecated {{AdminUtils}} with {{kafka.zk.AdminZkClient}} 
> [https://github.com/apache/kafka/blob/3cdc78e6bb1f83973a14ce1550fe3874f7348b05/core/src/main/scala/kafka/admin/AdminUtils.scala#L35]
>  https://issues.apache.org/jira/browse/KAFKA-6545
> The initial work: https://github.com/vdiravka/drill/commits/DRILL-6739
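> For the {{AdminUtils}} replacement, a hedged sketch of what topic creation 
> could look like after the migration; the argument list follows the Kafka 2.0 
> Scala API as called from Java and is an assumption, not code from the branch 
> linked above.
> {code:java}
> import java.util.Properties;
> 
> import kafka.admin.RackAwareMode;
> import kafka.zk.AdminZkClient;
> import kafka.zk.KafkaZkClient;
> 
> class KafkaAdminMigrationSketch {
>   // Replaces the deprecated AdminUtils.createTopic(zkUtils, ...) call.
>   static void createTopic(KafkaZkClient zkClient, String topic,
>                           int partitions, int replication) {
>     AdminZkClient adminZkClient = new AdminZkClient(zkClient);
>     adminZkClient.createTopic(topic, partitions, replication,
>         new Properties(), RackAwareMode.Disabled$.MODULE$);
>   }
> }
> {code}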



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6903) SchemaBuilder improvements

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6903:


Assignee: Arina Ielchiieva

> SchemaBuilder improvements
> --
>
> Key: DRILL-6903
> URL: https://issues.apache.org/jira/browse/DRILL-6903
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.16.0
>
>
> SchemaBuilder code will be moved to the exec package from test in DRILL-6901.
>  There are a couple of improvements that can be done in the existing code:
> *1. ColumnBuilder: OPTIONAL vs REQUIRED*
>  {{ColumnBuilder}} constructor sets mode as REQUIRED by default. We might 
> consider setting OPTIONAL as default.
>  
> [https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/rowSet/schema/ColumnBuilder.java#L40]
> *2. ColumnBuilder: setScale method*
>  The {{setScale}} method is a bit awkward: it takes precision as its second 
> parameter, while it is more natural to pass precision first, then scale.
>  The suggestion is to have a {{setPrecisionAndScale}} method instead, which 
> accepts precision and scale as the first and second parameters (see the sketch 
> after this list). 
>  
> [https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/rowSet/schema/ColumnBuilder.java#L57]
> *3. SchemaContainer: addColumn method*
>  {{addColumn}} method has parameter {{AbstractColumnMetadata}}, since we have 
> interface {{ColumnMetadata}}, it's better to operate on the interface level 
> rather than on the abstract class.
>  
> [https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/test/rowSet/schema/SchemaContainer.java#L28]
> *4. MapBuilder / RepeatedListBuilder / UnionBuilder: buildCol method*
>  The {{buildCol}} method is private in these classes. These classes create 
> columns and add them to the schema. There might be use cases where a dev needs 
> only the column and will add it to the schema when needed. The suggestion is to 
> make the {{buildCol}} method public. Also, all these classes require a 
> {{SchemaContainer}} as parent, though when we only need them to build the 
> column, a second constructor can be added without the {{SchemaContainer}} parameter.
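> A minimal sketch of the item 2 suggestion; this simplified builder only 
> illustrates the argument order and is not Drill's actual {{ColumnBuilder}}:
> {code:java}
> // Simplified stand-in for ColumnBuilder, showing the argument order only.
> class ColumnBuilderSketch {
>   private int precision;
>   private int scale;
> 
>   // Precision first, then scale -- mirroring DECIMAL(precision, scale).
>   ColumnBuilderSketch setPrecisionAndScale(int precision, int scale) {
>     this.precision = precision;
>     this.scale = scale;
>     return this;
>   }
> 
>   @Override
>   public String toString() {
>     return "DECIMAL(" + precision + ", " + scale + ")";
>   }
> }
> {code}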



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-3090) sqlline : save SQL to script file and replay from script, results in error

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-3090:


Assignee: Vitalii Diravka

> sqlline : save SQL to script file and replay from script, results in error
> --
>
> Key: DRILL-3090
> URL: https://issues.apache.org/jira/browse/DRILL-3090
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - CLI
>Affects Versions: 1.0.0
> Environment: ffbb9c7adc6360744bee186e1f69d47dc743f73e
>Reporter: Khurram Faraaz
>Assignee: Vitalii Diravka
>Priority: Minor
> Fix For: 1.16.0
>
>
> Saving a SQL query to a script file and replaying the SQL from the script file 
> using !run at the sqlline prompt throws an error. We should not see the error 
> when we replay the SQL from the script file.
> {code}
> 0: jdbc:drill:schema=dfs.tmp> !script file3
> Saving command script to "/opt/mapr/drill/drill-1.0.0/bin/file3". Enter 
> "script" with no arguments to stop it.
> 0: jdbc:drill:schema=dfs.tmp> select * from sys.drillbits;
> +-------------------+------------+---------------+------------+----------+
> |     hostname      | user_port  | control_port  | data_port  | current  |
> +-------------------+------------+---------------+------------+----------+
> | centos-04.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-02.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-01.qa.lab  | 31010      | 31011         | 31012      | false    |
> | centos-03.qa.lab  | 31010      | 31011         | 31012      | true     |
> +-------------------+------------+---------------+------------+----------+
> 4 rows selected (0.176 seconds)
> 0: jdbc:drill:schema=dfs.tmp> !script
> Script closed. Enter "run /opt/mapr/drill/drill-1.0.0/bin/file3" to replay it.
> 0: jdbc:drill:schema=dfs.tmp> !run /opt/mapr/drill/drill-1.0.0/bin/file3
> 1/2  select * from sys.drillbits;
> +------------+------------+---------------+------------+----------+
> |  hostname  | user_port  | control_port  | data_port  | current  |
> +------------+------------+---------------+------------+----------+
> | centos-04  | 31010      | 31011         | 31012      | false    |
> | centos-02  | 31010      | 31011         | 31012      | false    |
> | centos-01  | 31010      | 31011         | 31012      | false    |
> | centos-03  | 31010      | 31011         | 31012      | true     |
> +------------+------------+---------------+------------+----------+
> 4 rows selected (0.178 seconds)
> 2/2  !script
> Usage: script 
> Aborting command set because "force" is false and command failed: "!script"
> {code}
> I looked at the contents of file3 under /opt/mapr/drill/drill-1.0.0/bin.
> There seems to be an extra "!script" line in the file.
> {code}
> [root@centos-01 bin]# cat file3
> select * from sys.drillbits;
> !script
> [root@centos-01 bin]# 
> {code}
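> A hedged sketch of the likely fix: when recording commands to the script 
> file, skip the {{!script}} command itself so that replay doesn't abort; the 
> method and check are illustrative, not sqlline's actual code.
> {code:java}
> class ScriptRecorderSketch {
>   // Record everything except the command that starts/stops recording.
>   static boolean shouldRecord(String command) {
>     return !command.trim().startsWith("!script");
>   }
> }
> {code}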



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-6839) Failed to plan (aggregate + Hash or NL join) when slice target is low

2019-01-02 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker reassigned DRILL-6839:


Assignee: Igor Guzenko

> Failed to plan (aggregate + Hash or NL join) when slice target is low 
> --
>
> Key: DRILL-6839
> URL: https://issues.apache.org/jira/browse/DRILL-6839
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
> Fix For: 1.16.0
>
>
> *Case 1.* When nested loop join is about to be used:
>  - Option "_planner.enable_nljoin_for_scalar_only_" is set to false
>  - Option "_planner.slice_target_" is set to low value for imitation of big 
> input tables
>  
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testCrossJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       client.alterSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName(), false);
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           ", cp.`tpch/region.parquet` r")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.NLJOIN_FOR_SCALAR.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Case 2.* When hash join is about to be used:
>  - Option "planner.enable_mergejoin" is set to false, so hash join will be 
> used instead
>  - Option "planner.slice_target" is set to low value for imitation of big 
> input tables
>  - Comment out //ruleList.add(HashJoinPrule.DIST_INSTANCE); in 
> PlannerPhase.getPhysicalRules method
> {code:java}
> @Category(SqlTest.class)
> public class CrossJoinTest extends ClusterTest {
>
>   @BeforeClass
>   public static void setUp() throws Exception {
>     startCluster(ClusterFixture.builder(dirTestWatcher));
>   }
>
>   @Test
>   public void testInnerJoinSucceedsForLowSliceTarget() throws Exception {
>     try {
>       client.alterSession(PlannerSettings.MERGEJOIN.getOptionName(), false);
>       client.alterSession(ExecConstants.SLICE_TARGET, 1);
>       queryBuilder().sql(
>           "SELECT COUNT(l.nation_id) " +
>           "FROM cp.`tpch/nation.parquet` l " +
>           "INNER JOIN cp.`tpch/region.parquet` r " +
>           "ON r.nation_id = l.nation_id")
>         .run();
>     } finally {
>       client.resetSession(ExecConstants.SLICE_TARGET);
>       client.resetSession(PlannerSettings.MERGEJOIN.getOptionName());
>     }
>   }
> }
> {code}
>  
> *Workaround:* To avoid the exception we need to set the option 
> "_planner.enable_multiphase_agg_" to false. By doing this we avoid 
> unsuccessful attempts to create a 2-phase aggregation plan in StreamAggPrule 
> and guarantee that the logical aggregate will be converted to a physical one. 
>  
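> In the test fixture shown above, the workaround can be expressed as follows 
> (the option name is written as a plain string, since the exact 
> {{PlannerSettings}} constant is not cited here):
> {code:java}
> client.alterSession("planner.enable_multiphase_agg", false);
> {code}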



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-5360) Timestamp type documented as UTC, implemented as local time

2019-01-01 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-5360:
-
Issue Type: Improvement  (was: Bug)

> Timestamp type documented as UTC, implemented as local time
> ---
>
> Key: DRILL-5360
> URL: https://issues.apache.org/jira/browse/DRILL-5360
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Bohdan Kazydub
>Priority: Critical
> Fix For: 2.0.0
>
>
> The Drill documentation implies that the {{Timestamp}} type is in UTC:
> bq. JDBC timestamp in year, month, date hour, minute, second, and optional 
> milliseconds format: -MM-dd HH:mm:ss.SSS. ... TIMESTAMP literals: Drill 
> stores values in Coordinated Universal Time (UTC). Drill supports time 
> functions in the range 1971 to 2037. ... Drill does not support TIMESTAMP 
> with time zone.
> The above is ambiguous. The first part talks about JDBC timestamps. From the 
> JDK Javadoc:
> bq. Timestamp: A thin wrapper around java.util.Date. ... Date class is 
> intended to reflect coordinated universal time (UTC)...
> So, a JDBC timestamp is intended to represent time in UTC. (The "intended to 
> reflect" statement leaves open the possibility of misusing {{Date}} to 
> represent times in other time zones. This was common practice in early Java 
> development and was the reason for the eventual development of the Joda, then 
> Java 8, date/time classes.)
> The Drill documentation implies that timestamp *literals* are in UTC, but a 
> careful read of the documentation does allow an interpretation that the 
> internal representation can be other than UTC. If this is true, then we would 
> also rely on a liberal reading of the Java `Timestamp` class to also not be 
> UTC. (Or, we rely on the Drill JDBC driver to convert from the (unknown) 
> server time zone to a UTC value returned by the Drill JDBC client.)
> Still, a superficial reading (and common practice) would suggest that a Drill 
> Timestamp should be in UTC.
> However, a test on a Mac, with an embedded Drillbit (run in the Pacific time 
> zone, with Daylight Savings Time in effect) shows that the Timestamp binary 
> value is actual local time:
> {code}
>   long before = System.currentTimeMillis();
>   long value = getDateValue(client, "SELECT NOW() FROM (VALUES(1))" );
>   double hrsDiff = (value - before) / (1000.00 * 60 * 60);
>   System.out.println("Hours: " + hrsDiff);
> {code}
> The above gets the actual UTC time from Java. Then, it runs a query that gets 
> Drill's idea of the current time using the {{NOW()}} function. (The 
> {{getDateValue}} function uses the new test framework to access the actual 
> {{long}} value from the returned value vector.) Finally, we compute the 
> difference between the two times, converted to hours. Output:
> {code}
> Hours: -6.975
> {code}
> As it turns out, this is the difference between UTC and PDT. So, the time is 
> in local time, not UTC.
> Since the documentation and implementation are both ambiguous, it is hard to 
> know the intent of the Drill Timestamp. Clearly, common practice is to use 
> UTC. But, there is wiggle-room.
> If the Timestamp value is supposed to be local time, then Drill should 
> provide a function to return the server's time zone offset (in ms) from UTC 
> so that the client can do the needed local-to-UTC conversion to get a true 
> timestamp.
> On the other hand, if the Timestamp is supposed to be UTC (per common 
> practice), then {{NOW()}} should not report local time, it should return UTC.
> Further, if {{NOW()}} returns local time, but Timestamp literals are UTC, 
> then it is hard to see how any query can be rationally written if one 
> timestamp value is local, but a literal is UTC.
> So, job #1 is to define the Timestamp semantics. Then, use that to figure out 
> where the bug lies and make the implementation consistent with the 
> documentation (or vice versa.)
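> A small illustration of the proposed server-offset function and the 
> client-side correction; the names here are hypothetical, not an existing 
> Drill API:
> {code:java}
> import java.util.TimeZone;
> 
> class TzOffsetSketch {
>   // What a server_tz_offset()-style function would return: the offset (in
>   // ms) of the server's default zone from UTC at the given instant.
>   static long serverOffsetMillis(long instantMillis) {
>     return TimeZone.getDefault().getOffset(instantMillis);
>   }
> 
>   // Client-side correction: local-time millis minus the offset yields UTC.
>   static long toUtc(long localMillis, long serverOffsetMillis) {
>     return localMillis - serverOffsetMillis;
>   }
> }
> {code}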



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   3   4   5   6   7   8   9   10   >