[jira] [Commented] (DRILL-7109) Statistics adds external sort, which spills to disk

2019-03-22 Thread Gautam Parai (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799449#comment-16799449
 ] 

Gautam Parai commented on DRILL-7109:
-

[~rhou] can you please create another Jira for the issue where we see filter 
predicates of the form $0=$0 in TPCH4? That is a separate issue which should be 
investigated outside the scope of statistics.

> Statistics adds external sort, which spills to disk
> ---
>
> Key: DRILL-7109
> URL: https://issues.apache.org/jira/browse/DRILL-7109
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.16.0
>Reporter: Robert Hou
>Assignee: Gautam Parai
>Priority: Major
> Fix For: 1.16.0
>
>
> TPCH query 4 with sf 100 runs many times slower.  One issue is that an extra 
> external sort has been added, and both external sorts spill to disk.
> Also, the hash join sees 100x more data.
> Here is the query:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and 
>   exists (
> select
>   *
> from
>   lineitem l
> where
>   l.l_orderkey = o.o_orderkey
>   and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7102) Apache Metrics WEBUI Unavailable

2019-03-22 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799418#comment-16799418
 ] 

Abhishek Girish commented on DRILL-7102:


One issue could be that you are using the hostname:port, which may not be 
resolvable. Can you try the external IP:port instead and let me know if that 
works? 

> Apache Metrics WEBUI Unavailable 
> -
>
> Key: DRILL-7102
> URL: https://issues.apache.org/jira/browse/DRILL-7102
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.15.0
> Environment: kubernetes v1.13.2
> ubuntu:18.04
> Apache Drill 1.15.0
> 64GB RAM
> 8 vCpu Cores
>Reporter: Gene
>Assignee: Abhishek Girish
>Priority: Minor
> Attachments: Screen Shot 2019-03-13 at 1.16.14 PM.png, Screen Shot 
> 2019-03-14 at 2.44.37 PM.png
>
>
> Apache Drill Metrics unavailable in webUI when exposed through NodePort in 
> Kubernetes.
> Error:
> {code:java}
> Failed to load resource: net::ERR_CONNECTION_REFUSED
> {code}
> Browser unable to resolve requested url.
> Maybe we can have a feature where we can change the resource name that the 
> browser is looking for.





[jira] [Commented] (DRILL-7102) Apache Metrics WEBUI Unavailable

2019-03-22 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799417#comment-16799417
 ] 

Abhishek Girish commented on DRILL-7102:


[~tocinoatbp], I just tried with LoadBalancer and NodePort configurations - 
could not reproduce the issue. I was able to open the Metrics page and click 
through them. Can you please share more details?

> Apache Metrics WEBUI Unavailable 
> -
>
> Key: DRILL-7102
> URL: https://issues.apache.org/jira/browse/DRILL-7102
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.15.0
> Environment: kubernetes v1.13.2
> ubuntu:18.04
> Apache Drill 1.15.0
> 64GB RAM
> 8 vCpu Cores
>Reporter: Gene
>Assignee: Abhishek Girish
>Priority: Minor
> Attachments: Screen Shot 2019-03-13 at 1.16.14 PM.png, Screen Shot 
> 2019-03-14 at 2.44.37 PM.png
>
>
> Apache Drill Metrics unavailable in webUI when exposed through NodePort in 
> Kubernetes.
> Error:
> {code:java}
> Failed to load resource: net::ERR_CONNECTION_REFUSED
> {code}
> Browser unable to resolve requested url.
> Maybe we can have a feature where we can change the resource name that the 
> browser is looking for.





[jira] [Assigned] (DRILL-7102) Apache Metrics WEBUI Unavailable

2019-03-22 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish reassigned DRILL-7102:
--

Assignee: Abhishek Girish

> Apache Metrics WEBUI Unavailable 
> -
>
> Key: DRILL-7102
> URL: https://issues.apache.org/jira/browse/DRILL-7102
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - HTTP
>Affects Versions: 1.15.0
> Environment: kubernetes v1.13.2
> ubuntu:18.04
> Apache Drill 1.15.0
> 64GB RAM
> 8 vCpu Cores
>Reporter: Gene
>Assignee: Abhishek Girish
>Priority: Minor
> Attachments: Screen Shot 2019-03-13 at 1.16.14 PM.png, Screen Shot 
> 2019-03-14 at 2.44.37 PM.png
>
>
> Apache Drill Metrics unavailable in webUI when exposed through NodePort in 
> Kubernetes.
> Error:
> {code:java}
> Failed to load resource: net::ERR_CONNECTION_REFUSED
> {code}
> Browser unable to resolve requested url.
> Maybe we can have a feature where we can change the resource name that the 
> browser is looking for.





[jira] [Resolved] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Robert Hou (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou resolved DRILL-7132.
---
Resolution: Not A Problem

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799365#comment-16799365
 ] 

Volodymyr Vysotskyi commented on DRILL-7132:


Here is a discussion from the PR with an explanation of the necessity of 
changing format: 
[https://github.com/apache/drill/pull/805#discussion_r117578486]

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799355#comment-16799355
 ] 

Robert Hou commented on DRILL-7132:
---

While I agree that there is no requirement to store data in a human-readable 
format, there are advantages for supporting and debugging customer issues. But 
I assume you considered this and decided the pros of using a different format 
were more important.

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799353#comment-16799353
 ] 

Robert Hou commented on DRILL-7132:
---

The online decoder works.

Thanks.

--Robert

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799349#comment-16799349
 ] 

Volodymyr Vysotskyi commented on DRILL-7132:


These values may be decoded using java code ({{Base64}} class, or 
alternatives), or using online decoders, like this one: 
[https://www.base64decode.org/].

Also, old metadata cache files without encoding (V3_2 or older) should be 
handled correctly by Drill.
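The decoding step can be sketched with the Java standard library alone; the two 
encoded strings below are the minValue entries quoted from the issue description:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class DecodeCacheValues {
    public static void main(String[] args) {
        // Base64-encoded min/max strings copied from the metadata cache file.
        String[] encoded = {"aW9lZ2pOSkt2bmtk", "UDE4NTgyRA=="};
        for (String value : encoded) {
            // Decode back to the plain value shown by parquet tools.
            String plain = new String(Base64.getDecoder().decode(value),
                    StandardCharsets.UTF_8);
            System.out.println(value + " -> " + plain);
        }
        // Prints: aW9lZ2pOSkt2bmtk -> ioegjNJKvnkd
        //         UDE4NTgyRA== -> P18582D
    }
}
```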

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Comment Edited] (DRILL-7130) IllegalStateException: Read batch count [0] should be greater than zero

2019-03-22 Thread Timothy Farkas (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799295#comment-16799295
 ] 

Timothy Farkas edited comment on DRILL-7130 at 3/22/19 9:01 PM:


I made a mistake and the Jira number is not included in the commit message. If 
searching for this change in the commit history, look for "Fixed 
IllegalStateException while reading Parquet data". Also, the commit id is 
0a547708d6734f893ca0d6bf673f7a6ae856375e


was (Author: timothyfarkas):
Made a mistake and the Jira number is not included in the commit message. If 
searching for this change in the commit history look for "Fixed 
IllegalStateException while reading Parquet data"

> IllegalStateException: Read batch count [0] should be greater than zero
> ---
>
> Key: DRILL-7130
> URL: https://issues.apache.org/jira/browse/DRILL-7130
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
> Fix For: 1.16.0
>
>
> The following exception is being hit when reading parquet data:
> Caused by: java.lang.IllegalStateException: Read batch count [0] should be 
> greater than zero at 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:509)
>  ~[drill-shaded-guava-23.0.jar:23.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenNullableFixedEntryReader.getEntry(VarLenNullableFixedEntryReader.java:49)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getFixedEntry(VarLenBulkPageReader.java:167)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:132)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:154)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:38)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624)
>  ~[vector-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:716)
>  ~[vector-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:114)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:92)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:156)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:43)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:288)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] ... 29 common frames omitted
>  





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Robert Hou (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799343#comment-16799343
 ] 

Robert Hou commented on DRILL-7132:
---

[~vvysotskyi] Sounds good.

How does QA verify that the values are correct?  We have some metadata cache 
tests that are failing, and they should be re-verified with the new base64 
values.  And I'm about to add some new ones for an enhancement to the metadata 
cache feature.
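One possible verification approach, sketched here (not an official QA 
procedure): take the plain min/max values reported by parquet tools and 
re-encode them with java.util.Base64; the results should match the strings in 
the metadata cache file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class VerifyCacheValues {
    public static void main(String[] args) {
        // Plain values from parquet tools, re-encoded for comparison with
        // the minValue/maxValue strings in the metadata cache file.
        String varcharEncoded = Base64.getEncoder()
                .encodeToString("ioegjNJKvnkd".getBytes(StandardCharsets.UTF_8));
        String intervalEncoded = Base64.getEncoder()
                .encodeToString("P18582D".getBytes(StandardCharsets.UTF_8));
        System.out.println(varcharEncoded);   // aW9lZ2pOSkt2bmtk
        System.out.println(intervalEncoded);  // UDE4NTgyRA==
    }
}
```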

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Commented] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799339#comment-16799339
 ] 

Volodymyr Vysotskyi commented on DRILL-7132:


[~rhou], parquet metadata cache contains min/max values for varchar, decimal, 
interval, and some other types encoded using base64, so they differ from the 
values displayed by parquet tools.

There is no need to store values in the same format/encoding. The main 
requirement is that Drill should be able to handle these values from parquet 
metadata cache files correctly, and it does.

As a side note, DRILL-4139 made a change to use base64 encoding in the parquet 
metadata cache so that statistics for decimal and interval types could be 
handled correctly.

> Metadata cache does not have correct min/max values for varchar and interval 
> data types
> ---
>
> Key: DRILL-7132
> URL: https://issues.apache.org/jira/browse/DRILL-7132
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.14.0
>Reporter: Robert Hou
>Priority: Major
> Fix For: 1.17.0
>
> Attachments: 0_0_10.parquet
>
>
> The parquet metadata cache does not have correct min/max values for varchar 
> and interval data types.
> I have attached a parquet file.  Here is what parquet tools shows for varchar:
> [varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
> average: 67 total: 67 (raw data: 65 saving -3%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 65 max: 65 average: 65 total: 65
>   column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "varchar_col" ],
> "minValue" : "aW9lZ2pOSkt2bmtk",
> "maxValue" : "aW9lZ2pOSkt2bmtk",
> "nulls" : 0
> Here is what parquet tools shows for interval:
> [interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
> average: 52 total: 52 (raw data: 50 saving -4%)
>   values: min: 1 max: 1 average: 1 total: 1
>   uncompressed: min: 50 max: 50 average: 50 total: 50
>   column values statistics: min: P18582D, max: P18582D, num_nulls: 0
> Here is what the metadata cache file shows:
> "name" : [ "interval_col" ],
> "minValue" : "UDE4NTgyRA==",
> "maxValue" : "UDE4NTgyRA==",
> "nulls" : 0





[jira] [Created] (DRILL-7132) Metadata cache does not have correct min/max values for varchar and interval data types

2019-03-22 Thread Robert Hou (JIRA)
Robert Hou created DRILL-7132:
-

 Summary: Metadata cache does not have correct min/max values for 
varchar and interval data types
 Key: DRILL-7132
 URL: https://issues.apache.org/jira/browse/DRILL-7132
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Affects Versions: 1.14.0
Reporter: Robert Hou
 Fix For: 1.17.0
 Attachments: 0_0_10.parquet

The parquet metadata cache does not have correct min/max values for varchar and 
interval data types.

I have attached a parquet file.  Here is what parquet tools shows for varchar:

[varchar_col] BINARY 14.6% of all space [PLAIN, BIT_PACKED] min: 67 max: 67 
average: 67 total: 67 (raw data: 65 saving -3%)
  values: min: 1 max: 1 average: 1 total: 1
  uncompressed: min: 65 max: 65 average: 65 total: 65
  column values statistics: min: ioegjNJKvnkd, max: ioegjNJKvnkd, num_nulls: 0

Here is what the metadata cache file shows:

"name" : [ "varchar_col" ],
"minValue" : "aW9lZ2pOSkt2bmtk",
"maxValue" : "aW9lZ2pOSkt2bmtk",
"nulls" : 0

Here is what parquet tools shows for interval:

[interval_col] BINARY 11.3% of all space [PLAIN, BIT_PACKED] min: 52 max: 52 
average: 52 total: 52 (raw data: 50 saving -4%)
  values: min: 1 max: 1 average: 1 total: 1
  uncompressed: min: 50 max: 50 average: 50 total: 50
  column values statistics: min: P18582D, max: P18582D, num_nulls: 0

Here is what the metadata cache file shows:

"name" : [ "interval_col" ],
"minValue" : "UDE4NTgyRA==",
"maxValue" : "UDE4NTgyRA==",
"nulls" : 0








[jira] [Commented] (DRILL-7127) Update hbase version for mapr profile

2019-03-22 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799251#comment-16799251
 ] 

Abhishek Girish commented on DRILL-7127:


Currently seeing test failures. Moving out of 1.16.0 scope, as it's not a 
blocker. 

> Update hbase version for mapr profile
> -
>
> Key: DRILL-7127
> URL: https://issues.apache.org/jira/browse/DRILL-7127
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase, Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>
> Current hbase version for mapr profile is {{1.1.1-mapr-1602-m7-5.2.0}} - 
> which is over 3 years old. Needs to be updated to {{1.1.8-mapr-1808}}





[jira] [Updated] (DRILL-7127) Update hbase version for mapr profile

2019-03-22 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7127:
---
Fix Version/s: (was: 1.16.0)

> Update hbase version for mapr profile
> -
>
> Key: DRILL-7127
> URL: https://issues.apache.org/jira/browse/DRILL-7127
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase, Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>
> Current hbase version for mapr profile is {{1.1.1-mapr-1602-m7-5.2.0}} - 
> which is over 3 years old. Needs to be updated to {{1.1.8-mapr-1808}}





[jira] [Commented] (DRILL-7109) Statistics adds external sort, which spills to disk

2019-03-22 Thread Aman Sinha (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16799215#comment-16799215
 ] 

Aman Sinha commented on DRILL-7109:
---

[~rhou] for all such issues related to the planning, please add the EXPLAIN 
plan with and without statistics for faster diagnosis.  
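A minimal way to capture both plans in sqlline, sketched under the assumption 
that the Drill 1.16 session option controlling statistics usage is named 
planner.statistics.use (verify the option name on your build, e.g. via 
sys.options):

```sql
-- Assumed option name; check with:
-- SELECT * FROM sys.options WHERE name LIKE '%statistics%';
ALTER SESSION SET `planner.statistics.use` = true;
EXPLAIN PLAN FOR
SELECT o.o_orderpriority, count(*) AS order_count
FROM orders o
WHERE o.o_orderdate >= DATE '1996-10-01'
  AND o.o_orderdate < DATE '1996-10-01' + INTERVAL '3' MONTH
  AND EXISTS (SELECT * FROM lineitem l
              WHERE l.l_orderkey = o.o_orderkey
                AND l.l_commitdate < l.l_receiptdate)
GROUP BY o.o_orderpriority
ORDER BY o.o_orderpriority;

-- Then disable statistics and run the same EXPLAIN PLAN FOR query
-- again to get the plan without statistics.
ALTER SESSION SET `planner.statistics.use` = false;
```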

> Statistics adds external sort, which spills to disk
> ---
>
> Key: DRILL-7109
> URL: https://issues.apache.org/jira/browse/DRILL-7109
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.16.0
>Reporter: Robert Hou
>Assignee: Gautam Parai
>Priority: Major
> Fix For: 1.16.0
>
>
> TPCH query 4 with sf 100 runs many times slower.  One issue is that an extra 
> external sort has been added, and both external sorts spill to disk.
> Also, the hash join sees 100x more data.
> Here is the query:
> {noformat}
> select
>   o.o_orderpriority,
>   count(*) as order_count
> from
>   orders o
> where
>   o.o_orderdate >= date '1996-10-01'
>   and o.o_orderdate < date '1996-10-01' + interval '3' month
>   and 
>   exists (
> select
>   *
> from
>   lineitem l
> where
>   l.l_orderkey = o.o_orderkey
>   and l.l_commitdate < l.l_receiptdate
>   )
> group by
>   o.o_orderpriority
> order by
>   o.o_orderpriority;
> {noformat}





[jira] [Updated] (DRILL-7130) IllegalStateException: Read batch count [0] should be greater than zero

2019-03-22 Thread salim achouche (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

salim achouche updated DRILL-7130:
--
Reviewer: Timothy Farkas

> IllegalStateException: Read batch count [0] should be greater than zero
> ---
>
> Key: DRILL-7130
> URL: https://issues.apache.org/jira/browse/DRILL-7130
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
> Fix For: 1.17.0
>
>
> The following exception is being hit when reading parquet data:
> Caused by: java.lang.IllegalStateException: Read batch count [0] should be 
> greater than zero at 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkState(Preconditions.java:509)
>  ~[drill-shaded-guava-23.0.jar:23.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenNullableFixedEntryReader.getEntry(VarLenNullableFixedEntryReader.java:49)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getFixedEntry(VarLenBulkPageReader.java:167)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBulkPageReader.getEntry(VarLenBulkPageReader.java:132)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:154)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenColumnBulkInput.next(VarLenColumnBulkInput.java:38)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.vector.VarCharVector$Mutator.setSafe(VarCharVector.java:624)
>  ~[vector-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.vector.NullableVarCharVector$Mutator.setSafe(NullableVarCharVector.java:716)
>  ~[vector-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLengthColumnReaders$NullableVarCharColumn.setSafe(VarLengthColumnReaders.java:215)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLengthValuesColumn.readRecordsInBulk(VarLengthValuesColumn.java:98)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readRecordsInBulk(VarLenBinaryReader.java:114)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.VarLenBinaryReader.readFields(VarLenBinaryReader.java:92)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.BatchReader$VariableWidthReader.readRecords(BatchReader.java:156)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.BatchReader.readBatch(BatchReader.java:43)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] at 
> org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.next(ParquetRecordReader.java:288)
>  ~[drill-java-exec-1.15.0.0.jar:1.15.0.0] ... 29 common frames omitted
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7127) Update hbase version for mapr profile

2019-03-22 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-7127:
-
Labels:   (was: ready-to-commit)

> Update hbase version for mapr profile
> -
>
> Key: DRILL-7127
> URL: https://issues.apache.org/jira/browse/DRILL-7127
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase, Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.16.0
>
>
> The current HBase version for the mapr profile is {{1.1.1-mapr-1602-m7-5.2.0}} - 
> which is over 3 years old. It needs to be updated to {{1.1.8-mapr-1808}}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7131) generate_series / generator

2019-03-22 Thread benj (JIRA)
benj created DRILL-7131:
---

 Summary: generate_series / generator
 Key: DRILL-7131
 URL: https://issues.apache.org/jira/browse/DRILL-7131
 Project: Apache Drill
  Issue Type: Wish
  Components: Functions - Drill
Affects Versions: 1.15.0
Reporter: benj


Please add a very useful feature: an equivalent of generate_series in 
PostgreSQL / Oracle, or the generator mechanism in MySQL.

[https://www.postgresql.org/docs/9.1/functions-srf.html]
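For reference, the requested semantics (as defined in the PostgreSQL documentation linked above) can be sketched in Python. This is only an illustration of the behaviour being asked for, not Drill code: both bounds are inclusive, and the step may be negative.

```python
def generate_series(start, stop, step=1):
    """Yield values from start to stop inclusive, like PostgreSQL generate_series."""
    if step == 0:
        raise ValueError("step must be non-zero")
    value = start
    if step > 0:
        while value <= stop:
            yield value
            value += step
    else:
        while value >= stop:
            yield value
            value += step

# generate_series(1, 5)      -> 1, 2, 3, 4, 5
# generate_series(5, 1, -2)  -> 5, 3, 1
```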

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7079) Drill can't query views from the S3 storage when plain authentication is enabled

2019-03-22 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-7079:
---
Labels: ready-to-commit  (was: )

> Drill can't query views from the S3 storage when plain authentication is 
> enabled
> 
>
> Key: DRILL-7079
> URL: https://issues.apache.org/jira/browse/DRILL-7079
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Denys Ordynskiy
>Assignee: Bohdan Kazydub
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Enable plain authentication in Drill.
> Create the view on the S3 storage:
> create view s3.tmp.`testview` as select * from cp.`employee.json` limit 20;
> Try to select data from the created view:
> select * from s3.tmp.`testview`;
> *Actual result*:
> {noformat}
> 2019-02-27 17:01:09,202 [Client-1] INFO  
> o.a.d.j.i.DrillCursor$ResultsListener - [#4] Query failed: 
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> IllegalArgumentException: A valid userName is expected
> Please, refer to logs for more information.
> [Error Id: 2271c3aa-6d09-4b51-a585-0e0e954b46eb on maprhost:31010]
>   at 
> org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) 
> [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) 
> [drill-rpc-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88)
>  [netty-codec-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287)
>  [netty-handler-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102)
>  [netty-codec-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312)
>  [netty-codec-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286)
>  [netty-codec-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335)
>  [netty-transport-4.0.48.Final.jar:4.0.48.Final]
>   at 
> 

[jira] [Closed] (DRILL-7041) CompileException happens if a nested coalesce function returns null

2019-03-22 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy closed DRILL-7041.
---

> CompileException happens if a nested coalesce function returns null
> ---
>
> Key: DRILL-7041
> URL: https://issues.apache.org/jira/browse/DRILL-7041
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select coalesce(coalesce(n_name1, n_name2), n_name) from 
> cp.`tpch/nation.parquet`
> {code}
> *Expected result:*
> Values from "n_name" column should be returned
> *Actual result:*
> An exception happens:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> CompileException: Line 57, Column 27: Assignment conversion not possible from 
> type "org.apache.drill.exec.expr.holders.NullableVarCharHolder" to type 
> "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer 
> to logs for more information. [Error Id: e54d5bfd-604d-4a39-b62f-33bb964e5286 
> on userf87d-pc:31010] (org.apache.drill.exec.exception.SchemaChangeException) 
> Failure while attempting to load generated class 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():573
>  
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
>  org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
>  org.apache.drill.exec.record.AbstractRecordBatch.next():186 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284 
> java.security.AccessController.doPrivileged():-2 
> javax.security.auth.Subject.doAs():422 
> org.apache.hadoop.security.UserGroupInformation.doAs():1746 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284 
> org.apache.drill.common.SelfCleaningRunnable.run():38 
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624 
> java.lang.Thread.run():748 Caused By 
> (org.apache.drill.exec.exception.ClassTransformationException) 
> java.util.concurrent.ExecutionException: 
> org.apache.drill.exec.exception.ClassTransformationException: Failure 
> generating transformation classes for value: package 
> org.apache.drill.exec.test.generated; import 
> org.apache.drill.exec.exception.SchemaChangeException; import 
> org.apache.drill.exec.expr.holders.BigIntHolder; import 
> org.apache.drill.exec.expr.holders.BitHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarBinaryHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarCharHolder; import 
> org.apache.drill.exec.expr.holders.VarCharHolder; import 
> org.apache.drill.exec.ops.FragmentContext; import 
> org.apache.drill.exec.record.RecordBatch; import 
> org.apache.drill.exec.vector.UntypedNullHolder; import 
> org.apache.drill.exec.vector.UntypedNullVector; import 
> org.apache.drill.exec.vector.VarCharVector; public class ProjectorGen35 { 
> BigIntHolder const6; BitHolder constant9; UntypedNullHolder constant13; 
> VarCharVector vv14; UntypedNullVector vv19; public void doEval(int inIndex, 
> int outIndex) throws SchemaChangeException { { UntypedNullHolder out0 = new 
> UntypedNullHolder(); if (constant9 .value == 1) { if (constant13 .isSet!= 0) 
> { out0 = constant13; } } else { VarCharHolder out17 = new VarCharHolder(); { 
> out17 .buffer = vv14 .getBuffer(); long startEnd = vv14 
> .getAccessor().getStartEnd((inIndex)); out17 .start = ((int) startEnd); out17 
> .end = ((int)(startEnd >> 32)); } // start of eval portion of 
> convertToNullableVARCHAR function. // NullableVarCharHolder out18 = new 
> NullableVarCharHolder(); { final NullableVarCharHolder output = new 
> NullableVarCharHolder(); VarCharHolder input = out17; 
> GConvertToNullableVarCharHolder_eval: { output.isSet = 1; output.start = 
> input.start; output.end = input.end; output.buffer = input.buffer; } out18 = 
> output; } // end of eval portion of convertToNullableVARCHAR function. 
> // if (out18 .isSet!= 0) { out0 = out18; } } if (!(out0 .isSet == 0)) { 
> vv19 .getMutator().set((outIndex), out0 .isSet, out0); } } } public void 
> doSetup(FragmentContext context, RecordBatch incoming, RecordBatch outgoing) 
> throws SchemaChangeException { { UntypedNullHolder out1 = new 
> UntypedNullHolder(); NullableVarBinaryHolder out2 = new 
> 
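For context, the COALESCE semantics the failing query relies on can be sketched in Python (an illustration only, not Drill's implementation): a nested coalesce whose inner call yields NULL should simply fall through to the next argument, not fail at code generation time.

```python
def coalesce(*args):
    """Return the first non-None argument, or None if all are None."""
    for a in args:
        if a is not None:
            return a
    return None

# coalesce(coalesce(n_name1, n_name2), n_name) with n_name1 = n_name2 = NULL
# should return n_name, e.g. "CANADA".
```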

[jira] [Commented] (DRILL-7041) CompileException happens if a nested coalesce function returns null

2019-03-22 Thread Anton Gozhiy (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798928#comment-16798928
 ] 

Anton Gozhiy commented on DRILL-7041:
-

Verified with Drill version 1.16.0-SNAPSHOT (commit 
bf1bdec6069f6fdd2132608450357edea47d328c)

> CompileException happens if a nested coalesce function returns null
> ---
>
> Key: DRILL-7041
> URL: https://issues.apache.org/jira/browse/DRILL-7041
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select coalesce(coalesce(n_name1, n_name2), n_name) from 
> cp.`tpch/nation.parquet`
> {code}
> *Expected result:*
> Values from "n_name" column should be returned
> *Actual result:*
> An exception happens:
> {code}
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> CompileException: Line 57, Column 27: Assignment conversion not possible from 
> type "org.apache.drill.exec.expr.holders.NullableVarCharHolder" to type 
> "org.apache.drill.exec.vector.UntypedNullHolder" Fragment 0:0 Please, refer 
> to logs for more information. [Error Id: e54d5bfd-604d-4a39-b62f-33bb964e5286 
> on userf87d-pc:31010] (org.apache.drill.exec.exception.SchemaChangeException) 
> Failure while attempting to load generated class 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():573
>  
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
>  org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
>  org.apache.drill.exec.record.AbstractRecordBatch.next():186 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83 
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284 
> java.security.AccessController.doPrivileged():-2 
> javax.security.auth.Subject.doAs():422 
> org.apache.hadoop.security.UserGroupInformation.doAs():1746 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284 
> org.apache.drill.common.SelfCleaningRunnable.run():38 
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149 
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624 
> java.lang.Thread.run():748 Caused By 
> (org.apache.drill.exec.exception.ClassTransformationException) 
> java.util.concurrent.ExecutionException: 
> org.apache.drill.exec.exception.ClassTransformationException: Failure 
> generating transformation classes for value: package 
> org.apache.drill.exec.test.generated; import 
> org.apache.drill.exec.exception.SchemaChangeException; import 
> org.apache.drill.exec.expr.holders.BigIntHolder; import 
> org.apache.drill.exec.expr.holders.BitHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarBinaryHolder; import 
> org.apache.drill.exec.expr.holders.NullableVarCharHolder; import 
> org.apache.drill.exec.expr.holders.VarCharHolder; import 
> org.apache.drill.exec.ops.FragmentContext; import 
> org.apache.drill.exec.record.RecordBatch; import 
> org.apache.drill.exec.vector.UntypedNullHolder; import 
> org.apache.drill.exec.vector.UntypedNullVector; import 
> org.apache.drill.exec.vector.VarCharVector; public class ProjectorGen35 { 
> BigIntHolder const6; BitHolder constant9; UntypedNullHolder constant13; 
> VarCharVector vv14; UntypedNullVector vv19; public void doEval(int inIndex, 
> int outIndex) throws SchemaChangeException { { UntypedNullHolder out0 = new 
> UntypedNullHolder(); if (constant9 .value == 1) { if (constant13 .isSet!= 0) 
> { out0 = constant13; } } else { VarCharHolder out17 = new VarCharHolder(); { 
> out17 .buffer = vv14 .getBuffer(); long startEnd = vv14 
> .getAccessor().getStartEnd((inIndex)); out17 .start = ((int) startEnd); out17 
> .end = ((int)(startEnd >> 32)); } // start of eval portion of 
> convertToNullableVARCHAR function. // NullableVarCharHolder out18 = new 
> NullableVarCharHolder(); { final NullableVarCharHolder output = new 
> NullableVarCharHolder(); VarCharHolder input = out17; 
> GConvertToNullableVarCharHolder_eval: { output.isSet = 1; output.start = 
> input.start; output.end = input.end; output.buffer = input.buffer; } out18 = 
> output; } // end of eval portion of convertToNullableVARCHAR function. 
> // if (out18 .isSet!= 0) { out0 = out18; } } if (!(out0 .isSet == 0)) { 
> vv19 .getMutator().set((outIndex), out0 .isSet, out0); } } } public void 
> doSetup(FragmentContext context, RecordBatch incoming, RecordBatch outgoing) 

[jira] [Commented] (DRILL-7104) Change of data type when parquet with multiple fragment

2019-03-22 Thread benj (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16798891#comment-16798891
 ] 

benj commented on DRILL-7104:
-

With the query that builds a Parquet file with more than one fragment, if you 
add a UNION on an empty string ('') you get an error
 
{code:java}
CREATE TABLE `bug` AS
((SELECT CAST(NULL AS VARCHAR) AS demo
  ,md5(CAST(rand() AS VARCHAR)) AS jam
  FROM `onebigfile` LIMIT 100)
 UNION
 (SELECT CAST(NULL AS VARCHAR) AS demo
  ,md5(CAST(rand() AS VARCHAR)) AS jam
  FROM `onebigfile` LIMIT 100)
 UNION
 (SELECT CAST('' AS VARCHAR) AS demo, 'jam' AS jam FROM (VALUES(1)))
);
=>
Error: SYSTEM ERROR: NumberFormatException:
{code}
Please find the complete log of this error here:

 
 * [^DRILL-7104_ErrorNumberFormatException_20190322.log]

> Change of data type when parquet with multiple fragment
> ---
>
> Key: DRILL-7104
> URL: https://issues.apache.org/jira/browse/DRILL-7104
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: benj
>Priority: Major
> Attachments: DRILL-7104_ErrorNumberFormatException_20190322.log
>
>
> When creating a Parquet file with a column filled only with "CAST(NULL AS 
> VARCHAR)", if the Parquet file has several fragments, the type is read as INT 
> instead of VARCHAR.
> First, create a +Parquet file with only one fragment+ - all is fine (the type of 
> "demo" is correct).
> {code:java}
> CREATE TABLE `nobug` AS 
>  (SELECT CAST(NULL AS VARCHAR) AS demo
>   , md5(CAST(rand() AS VARCHAR)) AS jam 
>   FROM `onebigfile` LIMIT 100));
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 1000   |
> SELECT drilltypeof(demo) AS goodtype FROM `nobug` LIMIT 1;
> ++
> | goodtype   |
> ++
> | VARCHAR|
> {code}
> Second, create a +Parquet file with at least 2 fragments+ - the type of "demo" 
> changes to INT
> {code:java}
> CREATE TABLE `bug` AS 
> ((SELECT CAST(NULL AS VARCHAR) AS demo
>   ,md5(CAST(rand() AS VARCHAR)) AS jam 
>   FROM `onebigfile` LIMIT 100) 
>  UNION 
>  (SELECT CAST(NULL AS VARCHAR) AS demo
>   ,md5(CAST(rand() AS VARCHAR)) AS jam
>   FROM `onebigfile` LIMIT 100));
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 1_1   | 1000276|
> | 1_0   | 999724 |
> SELECT drilltypeof(demo) AS badtype FROM `bug` LIMIT 1;
> ++
> | badtype|
> ++
> | INT|{code}
> The change of type is really terrible...
>  
>  
>  
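A plausible mechanism behind the repro above (this is an assumption, not confirmed from the Drill source): if the writer records an all-NULL column with a default placeholder type (assumed here to be INT), and the multi-fragment read path resolves the merged schema from fragment metadata rather than from the query plan, the declared VARCHAR type is lost. The sketch below only illustrates that hazard; the function names and the INT placeholder are hypothetical.

```python
DEFAULT_NULL_TYPE = "INT"  # assumed placeholder type for all-null columns

def fragment_type(declared_type, values):
    # writer side: an all-null column degenerates to the placeholder type
    if all(v is None for v in values):
        return DEFAULT_NULL_TYPE
    return declared_type

def merged_type(fragment_types):
    # naive reader side: trust fragment metadata and take the first entry
    return fragment_types[0]
```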



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7104) Change of data type when parquet with multiple fragment

2019-03-22 Thread benj (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

benj updated DRILL-7104:

Attachment: DRILL-7104_ErrorNumberFormatException_20190322.log

> Change of data type when parquet with multiple fragment
> ---
>
> Key: DRILL-7104
> URL: https://issues.apache.org/jira/browse/DRILL-7104
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: benj
>Priority: Major
> Attachments: DRILL-7104_ErrorNumberFormatException_20190322.log
>
>
> When creating a Parquet file with a column filled only with "CAST(NULL AS 
> VARCHAR)", if the Parquet file has several fragments, the type is read as INT 
> instead of VARCHAR.
> First, create a +Parquet file with only one fragment+ - all is fine (the type of 
> "demo" is correct).
> {code:java}
> CREATE TABLE `nobug` AS 
>  (SELECT CAST(NULL AS VARCHAR) AS demo
>   , md5(CAST(rand() AS VARCHAR)) AS jam 
>   FROM `onebigfile` LIMIT 100));
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 0_0   | 1000   |
> SELECT drilltypeof(demo) AS goodtype FROM `nobug` LIMIT 1;
> ++
> | goodtype   |
> ++
> | VARCHAR|
> {code}
> Second, create a +Parquet file with at least 2 fragments+ - the type of "demo" 
> changes to INT
> {code:java}
> CREATE TABLE `bug` AS 
> ((SELECT CAST(NULL AS VARCHAR) AS demo
>   ,md5(CAST(rand() AS VARCHAR)) AS jam 
>   FROM `onebigfile` LIMIT 100) 
>  UNION 
>  (SELECT CAST(NULL AS VARCHAR) AS demo
>   ,md5(CAST(rand() AS VARCHAR)) AS jam
>   FROM `onebigfile` LIMIT 100));
> +---++
> | Fragment  | Number of records written  |
> +---++
> | 1_1   | 1000276|
> | 1_0   | 999724 |
> SELECT drilltypeof(demo) AS badtype FROM `bug` LIMIT 1;
> ++
> | badtype|
> ++
> | INT|{code}
> The change of type is really terrible...
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7118) Filter not getting pushed down on MapR-DB tables.

2019-03-22 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-7118:
---
Labels: ready-to-commit  (was: )

> Filter not getting pushed down on MapR-DB tables.
> -
>
> Key: DRILL-7118
> URL: https://issues.apache.org/jira/browse/DRILL-7118
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning & Optimization
>Affects Versions: 1.15.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> A simple IS NULL filter is not being pushed down for MapR-DB tables. Here is 
> a repro:
> {code:java}
> 0: jdbc:drill:zk=local> explain plan for select * from dfs.`/tmp/js` where b 
> is null;
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> ANTLR Tool version 4.5 used for code generation does not match the current runtime version 4.7.1
> ANTLR Runtime version 4.5 used for parser compilation does not match the current runtime version 4.7.1
> +--+--+
> | text | json |
> +--+--+
> | 00-00 Screen
> 00-01 Project(**=[$0])
> 00-02 Project(T0¦¦**=[$0])
> 00-03 SelectionVectorRemover
> 00-04 Filter(condition=[IS NULL($1)])
> 00-05 Project(T0¦¦**=[$0], b=[$1])
> 00-06 Scan(table=[[dfs, /tmp/js]], groupscan=[JsonTableGroupScan 
> [ScanSpec=JsonScanSpec [tableName=/tmp/js, condition=null], columns=[`**`, 
> `b`], maxwidth=1]])
> {code}
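The plan above shows the missing optimization: the planner leaves a separate Filter operator above the scan (condition=null in the JsonScanSpec) instead of folding the predicate into the scan. The general shape of filter pushdown can be sketched as follows; the function names and plan representation are hypothetical, not Drill's actual planner API.

```python
def supported(pred):
    # assume the storage layer can evaluate simple IS NULL predicates
    return pred.endswith("IS NULL")

def push_down(plan):
    """Fold a supported predicate into the scan spec, eliminating the Filter."""
    scan, filt = plan["scan"], plan.get("filter")
    if filt and supported(filt):
        scan["condition"] = filt   # predicate becomes part of the scan spec
        plan.pop("filter")         # the separate Filter operator disappears
    return plan
```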



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)