[jira] [Commented] (DRILL-7096) Develop vector for canonical Map

2019-03-12 Thread Paul Rogers (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16791308#comment-16791308
 ] 

Paul Rogers commented on DRILL-7096:


Interesting idea. There are a number of items to consider.

Drill is a SQL engine; JDBC and ODBC are its primary client APIs. These APIs, 
and relational operators in general, work on relational structures. Drill's 
"structs" are, in fact, nested tuples. Because of that, users can apply 
operators such as UNNEST to convert them to relational form.

It would be good to understand the use cases for true maps in a SQL query. For 
example, which functions would users want to apply to map keys and values?

Another issue is the form of the values. The simplest case is when all values 
are strings. If values can be of mixed types, then the "valuesVector" must be a 
union vector, and union vectors are very complex, space-inefficient, not fully 
supported in Drill, and not well defined in SQL.
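To make the space cost of a union-typed valuesVector concrete, it can be sketched as a type tag per entry plus one storage buffer per member type. This is a simplified illustration with hypothetical names, not Drill's actual UnionVector implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified tagged-union value store: one tag per entry plus a
// per-type buffer. Every entry pays for its tag, and each member
// type keeps its own backing storage -- the space overhead noted above.
public class UnionValuesSketch {
    enum Tag { INT, STR }

    final List<Tag> tags = new ArrayList<>();     // type tag per entry
    final List<Integer> slot = new ArrayList<>(); // index into the typed buffer
    final List<Integer> ints = new ArrayList<>(); // buffer for INT entries
    final List<String> strs = new ArrayList<>();  // buffer for STR entries

    void addInt(int v)    { tags.add(Tag.INT); slot.add(ints.size()); ints.add(v); }
    void addStr(String v) { tags.add(Tag.STR); slot.add(strs.size()); strs.add(v); }

    // Every read must dispatch on the tag -- part of why union
    // vectors are awkward for SQL operators and code generation.
    Object get(int i) {
        return tags.get(i) == Tag.INT ? ints.get(slot.get(i)) : strs.get(slot.get(i));
    }
}
```

Even with only two member types, every entry carries a tag and a slot index on top of its payload, and adding a third type means another parallel buffer.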

> Develop vector for canonical Map
> -
>
> Key: DRILL-7096
> URL: https://issues.apache.org/jira/browse/DRILL-7096
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Igor Guzenko
>Assignee: Bohdan Kazydub
>Priority: Major
>
> A canonical Map datatype can be represented using a combination of three 
> value vectors:
> keysVector - stores the keys of each map
> valuesVector - stores the values of each map
> offsetsVector - stores the start index of each map's entries
> Creating such a Map vector is not very hard, but there is a major issue with 
> this representation: it is hard to search map values by key. We need to 
> investigate advanced techniques to make such searches efficient, or find 
> other, more suitable options for representing the map datatype in the world 
> of vectors.
> When asked about maps, Apache Arrow developers responded that for Java they 
> do not have a real Map vector; for now they have only a logical Map type 
> definition, where Map is defined as List<Struct<key: key_type, 
> value: value_type>>. So an implementation of this value vector would be 
> useful for Arrow too.
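The three-vector layout described in the ticket can be sketched with plain Java arrays. This is a toy illustration with hypothetical names, not Drill's actual vector classes (which use off-heap buffers):

```java
// Toy sketch of the keysVector / valuesVector / offsetsVector layout.
// Map #i occupies entries offsets[i] .. offsets[i+1]-1 of keys/values.
public class MapVectorSketch {
    final String[] keys;    // keys of all maps, concatenated
    final String[] values;  // values, parallel to keys
    final int[] offsets;    // length = mapCount + 1; offsets[i] = first entry of map i

    MapVectorSketch(String[] keys, String[] values, int[] offsets) {
        this.keys = keys;
        this.values = values;
        this.offsets = offsets;
    }

    // Look up a key in map #mapIndex. A linear scan over that map's
    // entries -- this is exactly the inefficiency the ticket calls out:
    // O(entries per map) per lookup, with no index structure to help.
    String get(int mapIndex, String key) {
        for (int i = offsets[mapIndex]; i < offsets[mapIndex + 1]; i++) {
            if (keys[i].equals(key)) {
                return values[i];
            }
        }
        return null; // key absent from this map
    }
}
```

For two maps {a:1, b:2} and {c:3}, keys = [a, b, c], values = [1, 2, 3], and offsets = [0, 2, 3]; get(0, "b") scans entries 0..1 of the shared arrays.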



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7100) parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative

2019-03-12 Thread salim achouche (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

salim achouche updated DRILL-7100:
--
Reviewer: Timothy Farkas

> parquet RecordBatchSizerManager : IllegalArgumentException: the requested 
> size must be non-negative
> ---
>
> Key: DRILL-7100
> URL: https://issues.apache.org/jira/browse/DRILL-7100
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Major
>
> The table has string columns that can range from 1024 bytes to 32 MB in 
> length; Drill should be able to handle such wide string columns when querying 
> Parquet.
> Hive Version 2.3.3
> Drill Version 1.15
> {noformat}
> CREATE TABLE temp.cust_bhsf_ce_blob_parquet (
>  event_id DECIMAL, 
>  valid_until_dt_tm string, 
>  blob_seq_num DECIMAL, 
>  valid_from_dt_tm string, 
>  blob_length DECIMAL, 
>  compression_cd DECIMAL, 
>  blob_contents string, 
>  updt_dt_tm string, 
>  updt_id DECIMAL, 
>  updt_task DECIMAL, 
>  updt_cnt DECIMAL, 
>  updt_applctx DECIMAL, 
>  last_utc_ts string, 
>  ccl_load_dt_tm string, 
>  ccl_updt_dt_tm string )
>  STORED AS PARQUET;
> {noformat}
>  
> The source table is stored as ORC format.
> Failing query.
> {noformat}
> SELECT event_id, BLOB_CONTENTS FROM hive.temp.cust_bhsf_ce_blob_parquet WHERE 
> event_id = 3443236037
> 2019-03-07 14:40:17,886 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] 
> INFO o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: the requested 
> size must be non-negative (the requested size must be non-negative)
> org.apache.drill.common.exceptions.UserException: INTERNAL_ERROR ERROR: the 
> requested size must be non-negative
> {noformat}
> Snippet from drillbit.log file
> {noformat}
> [Error Id: 41a4d597-f54d-42a6-be6d-5dbeb7f642ba ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>  [hadoop-common-2.7.0-mapr-1808.jar:na]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181]
> 
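The ticket does not state the root cause, but one plausible mechanism for a negative requested size (an assumption, not a confirmed diagnosis) is 32-bit overflow when an allocation size is computed as record count times per-value width. With 32 MiB values, even a small batch overflows Java's int:

```java
// Hypothetical illustration of how a "requested size must be non-negative"
// error can arise: Java int multiplication wraps around silently.
public class SizeOverflowDemo {
    static int requestedSize(int valueCount, int bytesPerValue) {
        return valueCount * bytesPerValue; // 32-bit product, may wrap negative
    }

    public static void main(String[] args) {
        int width = 32 * 1024 * 1024;        // 32 MiB per value, as in the ticket
        int bad = requestedSize(65, width);  // 65 * 2^25 wraps past Integer.MAX_VALUE
        long safe = (long) 65 * width;       // widening first avoids the wrap
        System.out.println(bad + " vs " + safe); // negative vs 2181038080
    }
}
```

A guard in the allocator (or doing the arithmetic in long before the precondition check) would turn the confusing IllegalArgumentException into a clear out-of-bounds allocation error.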

[jira] [Closed] (DRILL-7101) IllegalArgumentException when reading parquet data

2019-03-12 Thread salim achouche (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

salim achouche closed DRILL-7101.
-
Resolution: Duplicate

> IllegalArgumentException when reading parquet data
> --
>
> Key: DRILL-7101
> URL: https://issues.apache.org/jira/browse/DRILL-7101
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: salim achouche
>Assignee: salim achouche
>Priority: Major
> Fix For: 1.16.0
>
>
> The Parquet reader fails with the below stack trace:
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181] at 
> javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181] at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>  [hadoop-common-2.7.0-mapr-1808.jar:na] at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_181] at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181] 
> Caused by: java.lang.IllegalArgumentException: the requested size must be 
> non-negative at 
> org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkArgument(Preconditions.java:135)
>  ~[drill-shaded-guava-23.0.jar:23.0] at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:224) 
> ~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:211) 
> ~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:394)
>  ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:250)
>  ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:41)
>  ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:54)
>  ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
> org.apache.drill.exec.store.parquet.columnreaders.batchsizing.RecordBatchSizerManager.allocate(RecordBatchSizerManager.java:165)
>  

[jira] [Created] (DRILL-7101) IllegalArgumentException when reading parquet data

2019-03-12 Thread salim achouche (JIRA)
salim achouche created DRILL-7101:
-

 Summary: IllegalArgumentException when reading parquet data
 Key: DRILL-7101
 URL: https://issues.apache.org/jira/browse/DRILL-7101
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.15.0
Reporter: salim achouche
Assignee: salim achouche
 Fix For: 1.16.0


The Parquet reader fails with the below stack trace:

at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
 ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181] at 
javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181] at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
 [hadoop-common-2.7.0-mapr-1808.jar:na] at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[na:1.8.0_181] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_181] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181] Caused 
by: java.lang.IllegalArgumentException: the requested size must be non-negative 
at 
org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkArgument(Preconditions.java:135)
 ~[drill-shaded-guava-23.0.jar:23.0] at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:224) 
~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:211) 
~[drill-memory-base-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.vector.VarCharVector.allocateNew(VarCharVector.java:394) 
~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.vector.NullableVarCharVector.allocateNew(NullableVarCharVector.java:250)
 ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.vector.AllocationHelper.allocatePrecomputedChildCount(AllocationHelper.java:41)
 ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.vector.AllocationHelper.allocate(AllocationHelper.java:54)
 ~[vector-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.store.parquet.columnreaders.batchsizing.RecordBatchSizerManager.allocate(RecordBatchSizerManager.java:165)
 ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.allocate(ParquetRecordReader.java:276)
 ~[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 
org.apache.drill.exec.physical.impl.ScanBatch.internalNext(ScanBatch.java:221) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr] at 

[jira] [Created] (DRILL-7100) parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative

2019-03-12 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-7100:
-

 Summary: parquet RecordBatchSizerManager : 
IllegalArgumentException: the requested size must be non-negative
 Key: DRILL-7100
 URL: https://issues.apache.org/jira/browse/DRILL-7100
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.15.0
Reporter: Khurram Faraaz


The table has string columns that can range from 1024 bytes to 32 MB in 
length; Drill should be able to handle such wide string columns when querying 
Parquet.

Hive Version 2.3.3
Drill Version 1.15

{noformat}
CREATE TABLE temp.cust_bhsf_ce_blob_parquet (
 event_id DECIMAL, 
 valid_until_dt_tm string, 
 blob_seq_num DECIMAL, 
 valid_from_dt_tm string, 
 blob_length DECIMAL, 
 compression_cd DECIMAL, 
 blob_contents string, 
 updt_dt_tm string, 
 updt_id DECIMAL, 
 updt_task DECIMAL, 
 updt_cnt DECIMAL, 
 updt_applctx DECIMAL, 
 last_utc_ts string, 
 ccl_load_dt_tm string, 
 ccl_updt_dt_tm string )
 STORED AS PARQUET;
{noformat}
 
The source table is stored as ORC format.

Failing query.
{noformat}
SELECT event_id, BLOB_CONTENTS FROM hive.temp.cust_bhsf_ce_blob_parquet WHERE 
event_id = 3443236037

2019-03-07 14:40:17,886 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] INFO 
o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: the requested size 
must be non-negative (the requested size must be non-negative)
org.apache.drill.common.exceptions.UserException: INTERNAL_ERROR ERROR: the 
requested size must be non-negative
{noformat}

Snippet from drillbit.log file
{noformat}
[Error Id: 41a4d597-f54d-42a6-be6d-5dbeb7f642ba ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
 ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
[drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
 [hadoop-common-2.7.0-mapr-1808.jar:na]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284)
 [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[na:1.8.0_181]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: java.lang.IllegalArgumentException: the requested size must be 
non-negative
at 
org.apache.drill.shaded.guava.com.google.common.base.Preconditions.checkArgument(Preconditions.java:135)
 ~[drill-shaded-guava-23.0.jar:23.0]
at org.apache.drill.exec.memory.BaseAllocator.buffer(BaseAllocator.java:224) 

[jira] [Assigned] (DRILL-7100) parquet RecordBatchSizerManager : IllegalArgumentException: the requested size must be non-negative

2019-03-12 Thread Khurram Faraaz (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Khurram Faraaz reassigned DRILL-7100:
-

Assignee: salim achouche

> parquet RecordBatchSizerManager : IllegalArgumentException: the requested 
> size must be non-negative
> ---
>
> Key: DRILL-7100
> URL: https://issues.apache.org/jira/browse/DRILL-7100
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.15.0
>Reporter: Khurram Faraaz
>Assignee: salim achouche
>Priority: Major
>
> The table has string columns that can range from 1024 bytes to 32 MB in 
> length; Drill should be able to handle such wide string columns when querying 
> Parquet.
> Hive Version 2.3.3
> Drill Version 1.15
> {noformat}
> CREATE TABLE temp.cust_bhsf_ce_blob_parquet (
>  event_id DECIMAL, 
>  valid_until_dt_tm string, 
>  blob_seq_num DECIMAL, 
>  valid_from_dt_tm string, 
>  blob_length DECIMAL, 
>  compression_cd DECIMAL, 
>  blob_contents string, 
>  updt_dt_tm string, 
>  updt_id DECIMAL, 
>  updt_task DECIMAL, 
>  updt_cnt DECIMAL, 
>  updt_applctx DECIMAL, 
>  last_utc_ts string, 
>  ccl_load_dt_tm string, 
>  ccl_updt_dt_tm string )
>  STORED AS PARQUET;
> {noformat}
>  
> The source table is stored as ORC format.
> Failing query.
> {noformat}
> SELECT event_id, BLOB_CONTENTS FROM hive.temp.cust_bhsf_ce_blob_parquet WHERE 
> event_id = 3443236037
> 2019-03-07 14:40:17,886 [237e8c79-0e9b-45d6-9134-0da95dba462f:frag:1:269] 
> INFO o.a.d.exec.physical.impl.ScanBatch - User Error Occurred: the requested 
> size must be non-negative (the requested size must be non-negative)
> org.apache.drill.common.exceptions.UserException: INTERNAL_ERROR ERROR: the 
> requested size must be non-negative
> {noformat}
> Snippet from drillbit.log file
> {noformat}
> [Error Id: 41a4d597-f54d-42a6-be6d-5dbeb7f642ba ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:293) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:69)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.SingleSenderCreator$SingleSenderRootExec.innerNext(SingleSenderCreator.java:93)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) 
> [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:297)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_181]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_181]
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>  [hadoop-common-2.7.0-mapr-1808.jar:na]
> at 
> org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:284)
>  [drill-java-exec-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38)
>  [drill-common-1.15.0.0-mapr.jar:1.15.0.0-mapr]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  

[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: UpdateExport.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Export.png, ExportAll.png, Storage.png, 
> UpdateExport.png, create.png, image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: create.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Export.png, ExportAll.png, Storage.png, create.png, 
> image-2018-07-23-02-55-02-024.png, image-2018-10-22-20-20-24-658.png, 
> image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: ExportAll.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Export.png, ExportAll.png, Storage.png, 
> image-2018-07-23-02-55-02-024.png, image-2018-10-22-20-20-24-658.png, 
> image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: Export.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Export.png, Storage.png, 
> image-2018-07-23-02-55-02-024.png, image-2018-10-22-20-20-24-658.png, 
> image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: Storage.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Storage.png, image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: (was: screenshot-2.png)

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: Storage.png, image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: (was: image-2019-03-13-00-40-36-672.png)

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png, 
> screenshot-2.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: screenshot-2.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png, 
> screenshot-2.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-6562) Plugin Management improvements

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6562:
---
Attachment: image-2019-03-13-00-40-36-672.png

> Plugin Management improvements
> --
>
> Key: DRILL-6562
> URL: https://issues.apache.org/jira/browse/DRILL-6562
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Client - HTTP, Web Server
>Affects Versions: 1.14.0
>Reporter: Abhishek Girish
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
> Attachments: image-2018-07-23-02-55-02-024.png, 
> image-2018-10-22-20-20-24-658.png, image-2018-10-22-20-20-59-105.png, 
> image-2019-03-13-00-40-36-672.png
>
>
> Follow-up to DRILL-4580.
> Provide ability to export all storage plugin configurations at once, with a 
> new "Export All" option on the Storage page of the Drill web UI





[jira] [Updated] (DRILL-7093) Batch Sizing in SingleSender

2019-03-12 Thread Karthikeyan Manivannan (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthikeyan Manivannan updated DRILL-7093:
--
Issue Type: Sub-task  (was: Bug)
Parent: DRILL-7099

> Batch Sizing in SingleSender
> 
>
> Key: DRILL-7093
> URL: https://issues.apache.org/jira/browse/DRILL-7093
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
>
> SingleSender batch sizing: SingleSender does not have a mechanism to control 
> the size of batches sent to the receiver. This results in excessive memory 
> use. 
> This bug captures the changes required in SingleSender to control batch size 
> using the RecordBatchSizer.
>  
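The batch-size control described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the class and method names are hypothetical, and the real RecordBatchSizer API is considerably richer than this.

```java
// Hypothetical sketch of batch-size control in a sender: cap the number of
// rows per outgoing batch so that rows * averageRowWidth stays within a
// memory budget. Not the actual Drill RecordBatchSizer API.
public class BatchSizerSketch {

    public static int rowsPerBatch(long batchMemoryBudget, int averageRowWidthBytes) {
        if (averageRowWidthBytes <= 0) {
            return Integer.MAX_VALUE; // nothing measured yet; no cap
        }
        long rows = batchMemoryBudget / averageRowWidthBytes;
        // Drill batches are also bounded by a 64K row limit.
        return (int) Math.min(rows, 64 * 1024);
    }

    public static void main(String[] args) {
        // 16 MB budget with 512-byte rows allows 32768 rows per batch.
        System.out.println(rowsPerBatch(16L * 1024 * 1024, 512));
    }
}
```

A real implementation would re-measure the average row width for each incoming batch, since row width varies with the data.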





[jira] [Created] (DRILL-7099) Resource Management in Exchange Operators

2019-03-12 Thread Karthikeyan Manivannan (JIRA)
Karthikeyan Manivannan created DRILL-7099:
-

 Summary: Resource Management in Exchange Operators
 Key: DRILL-7099
 URL: https://issues.apache.org/jira/browse/DRILL-7099
 Project: Apache Drill
  Issue Type: Bug
Reporter: Karthikeyan Manivannan
Assignee: Karthikeyan Manivannan


This Jira will be used to track the changes required for implementing Resource 
Management in Exchange operators.

The design can be found here: 
https://docs.google.com/document/d/1N9OXfCWcp68jsxYVmSt9tPgnZRV_zk8rwwFh0BxXZeE/edit?usp=sharing





[jira] [Updated] (DRILL-6552) Drill Metadata management "Drill MetaStore"

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6552:
---
Labels: ready-to-commit  (was: )

> Drill Metadata management "Drill MetaStore"
> ---
>
> Key: DRILL-6552
> URL: https://issues.apache.org/jira/browse/DRILL-6552
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Metadata
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 2.0.0
>
>
> It would be useful for Drill to have some sort of metastore which would 
> enable Drill to remember previously defined schemata so Drill doesn’t have to 
> do the same work over and over again.
> It would allow storing schema and statistics, which would accelerate query 
> validation, planning and execution. It would also increase the stability of 
> Drill and help avoid different kinds of issues: "schema change 
> exceptions", the "limit 0" optimization and so on.
> One of the main candidates is Hive Metastore.
> Starting from version 3.0, Hive Metastore can run as a separate service from 
> the Hive server:
> [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration]
> An optional enhancement is storing Drill's profiles, UDFs and plugin configs 
> in the metastore as well.





[jira] [Updated] (DRILL-6524) Two CASE statements in projection influence results of each other

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6524:
---
Labels: ready-to-commit  (was: )

> Two CASE statements in projection influence results of each other
> -
>
> Key: DRILL-6524
> URL: https://issues.apache.org/jira/browse/DRILL-6524
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Query Planning  Optimization
>Affects Versions: 1.11.0
> Environment: Linux 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 
> 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux,
> NAME="CentOS Linux"
> VERSION="7 (Core)"
> apache drill 1.11.0, 
> openjdk version "1.8.0_171"
> OpenJDK Runtime Environment (build 1.8.0_171-b10)
> OpenJDK 64-Bit Server VM (build 25.171-b10, mixed mode)
>Reporter: Oleksandr Chornyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> h3. Steps to Reproduce
> Run the following query via {{sqlline}}:
> {code:sql}
> select
>   case when expr$0 = 3 then expr$0 else expr$1 end,
>   case when expr$0 = 1 then expr$0 else expr$1 end
> from (values(1, 2));
> {code}
> h4. Actual Results
> {noformat}
> +-+-+
> | EXPR$0  | EXPR$1  |
> +-+-+
> | 2   | 2   |
> +-+-+
> {noformat}
> h4. Expected Results
> {noformat}
> +-+-+
> | EXPR$0  | EXPR$1  |
> +-+-+
> | 2   | 1   |
> +-+-+
> {noformat}
> Note that changing the order of the CASE statements fixes the issue. The 
> following query yields correct results:
> {code:sql}
> select
>   case when expr$0 = 1 then expr$0 else expr$1 end,
>   case when expr$0 = 3 then expr$0 else expr$1 end
> from (values(1, 2));
> {code}





[jira] [Updated] (DRILL-6552) Drill Metadata management "Drill MetaStore"

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-6552:
---
Labels:   (was: ready-to-commit)

> Drill Metadata management "Drill MetaStore"
> ---
>
> Key: DRILL-6552
> URL: https://issues.apache.org/jira/browse/DRILL-6552
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Metadata
>Affects Versions: 1.13.0
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 2.0.0
>
>
> It would be useful for Drill to have some sort of metastore which would 
> enable Drill to remember previously defined schemata so Drill doesn’t have to 
> do the same work over and over again.
> It would allow storing schema and statistics, which would accelerate query 
> validation, planning and execution. It would also increase the stability of 
> Drill and help avoid different kinds of issues: "schema change 
> exceptions", the "limit 0" optimization and so on.
> One of the main candidates is Hive Metastore.
> Starting from version 3.0, Hive Metastore can run as a separate service from 
> the Hive server:
> [https://cwiki.apache.org/confluence/display/Hive/AdminManual+Metastore+3.0+Administration]
> An optional enhancement is storing Drill's profiles, UDFs and plugin configs 
> in the metastore as well.





[jira] [Updated] (DRILL-5658) Documentation for Drill Crypto Functions

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-5658:
-
Issue Type: Task  (was: Improvement)

> Documentation for Drill Crypto Functions
> 
>
> Key: DRILL-5658
> URL: https://issues.apache.org/jira/browse/DRILL-5658
> Project: Apache Drill
>  Issue Type: Task
>  Components: Functions - Drill
>Affects Versions: 1.11.0
>Reporter: Charles Givre
>Assignee: Bridget Bevens
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> Attached is the documentation for the crypto functions that are being added 
> to Drill 1.11.0.
> https://gist.github.com/cgivre/63b25bdc85159bec484f069406858adc





[jira] [Updated] (DRILL-6965) Adjust table function usage for all storage plugins and implement schema parameter

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6965:
-
Fix Version/s: (was: 1.16.0)

> Adjust table function usage for all storage plugins and implement schema 
> parameter
> --
>
> Key: DRILL-6965
> URL: https://issues.apache.org/jira/browse/DRILL-6965
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
>
> Design doc - 
> https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit





[jira] [Updated] (DRILL-6806) Start moving code for handling a partition in HashAgg into a separate class

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6806:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Start moving code for handling a partition in HashAgg into a separate class
> ---
>
> Key: DRILL-6806
> URL: https://issues.apache.org/jira/browse/DRILL-6806
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Timothy Farkas
>Assignee: Timothy Farkas
>Priority: Major
> Fix For: 1.17.0
>
>
> Since this involves a lot of refactoring, this will be a multi-PR effort.





[jira] [Updated] (DRILL-6835) Schema Provision using File / Table Function

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6835:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Schema Provision using File / Table Function
> 
>
> Key: DRILL-6835
> URL: https://issues.apache.org/jira/browse/DRILL-6835
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.17.0
>
>
> Schema Provision using File / Table Function design document:
> https://docs.google.com/document/d/1mp4egSbNs8jFYRbPVbm_l0Y5GjH3HnoqCmOpMTR_g4w/edit?usp=sharing





[jira] [Updated] (DRILL-6543) Option for memory mgmt: Reserve allowance for non-buffered

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6543:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Option for memory mgmt: Reserve allowance for non-buffered
> --
>
> Key: DRILL-6543
> URL: https://issues.apache.org/jira/browse/DRILL-6543
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Major
> Fix For: 1.17.0
>
>
> Introduce a new option to enforce/remind users to reserve some allowance when 
> budgeting their memory:
> The problem: When the "planner.memory.max_query_memory_per_node" (MQMPN) 
> option is set equal (or "nearly equal") to the allocated *Direct Memory*, an 
> OOM is still possible. The reason is that the memory used by the 
> "non-buffered" operators is not taken into account.
> For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
> operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
> When other non-buffered operators (e.g., a Scanner, or a Sender) also grab 
> some of the Direct Memory, then less than 100 MB is left available. And if 
> all those 5 Hash-Joins are pushing their limits, then one HJ may have only 
> allocated 12MB so far, but on the next 1MB allocation it will hit an OOM 
> (from the JVM, as all the 100MB Direct memory is already used).
> A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
> non-buffered operators (e.g., 25% by default). This *allowance* may prevent 
> many cases like the example above. The new option would return an error 
> (when a query initiates) if the MQMPN is set too high. Note that this option 
> +can not+ address concurrent queries.
> This should also apply to the alternative to the MQMPN -- the 
> {{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
> _*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
> explains this issue (that doc suggests reserving a 50% allowance, as it was 
> written when the Hash-Join was non-buffered, i.e., before spill was 
> implemented).
> The memory given to the buffered operators is the higher of the values 
> calculated from the MQMPN and the PPQ. The new reserve option would verify 
> that this figure still leaves room for the allowance.
>  
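The admission check proposed above can be sketched as follows. This is a minimal illustration only: the class and method names are hypothetical, and the 25% reserve figure is taken from the example in this discussion, not from actual Drill code.

```java
// Hypothetical sketch of the proposed "reserve allowance" check: reject a
// query at admission time if the per-query budget (MQMPN) would leave less
// than the reserved fraction of Direct Memory for non-buffered operators.
public class MemoryAllowanceSketch {

    public static boolean admitQuery(long directMemory,
                                     long maxQueryMemoryPerNode,
                                     double reservedFraction) {
        long allowance = (long) (directMemory * reservedFraction);
        return maxQueryMemoryPerNode <= directMemory - allowance;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024;
        // 100 MB Direct Memory with a 25% reserve: a 100 MB MQMPN is
        // rejected, while a 75 MB MQMPN is admitted.
        System.out.println(admitQuery(100 * mb, 100 * mb, 0.25)); // false
        System.out.println(admitQuery(100 * mb, 75 * mb, 0.25));  // true
    }
}
```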





[jira] [Updated] (DRILL-6845) Eliminate duplicates for Semi Hash Join

2019-03-12 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6845:
-
Fix Version/s: (was: 1.16.0)
   1.17.0

> Eliminate duplicates for Semi Hash Join
> ---
>
> Key: DRILL-6845
> URL: https://issues.apache.org/jira/browse/DRILL-6845
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.17.0
>
>
> Following DRILL-6735: The performance of the new Semi Hash Join may degrade 
> if the build side contains an excessive number of join-key duplicate rows; 
> this is mainly a result of the need to store all those rows first, before 
> the hash table is built.
>   Proposed solution: For Semi, the Hash Agg would create a Hash-Table 
> initially, and use it to eliminate key-duplicate rows as they arrive.
>   Proposed extra: That Hash-Table has an added cost (e.g. resizing). So 
> perform "runtime stats" -- check the initial number of incoming rows (e.g. 
> 32k), and if the number of duplicates is less than some threshold (e.g. 
> 20%), cancel that "early" hash table.
>  
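The "early hash table with runtime stats" idea above can be sketched like this. All names are hypothetical, the sample size and threshold are the example figures from the proposal, and a real implementation would hash rows into value vectors rather than a Java Set.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch of early duplicate elimination on the build side of
// a semi hash join: dedup keys as they arrive, but cancel the early hash
// table if a sample shows too few duplicates to justify its cost.
public class SemiJoinDedupSketch {

    static final int SAMPLE_SIZE = 32_768;     // rows to sample
    static final double MIN_DUP_RATIO = 0.20;  // cancel below this

    private final Set<Object> seenKeys = new HashSet<>();
    private boolean earlyTableActive = true;
    private int sampled = 0;
    private int duplicates = 0;

    // Returns true if the row should be kept (its key has not been seen).
    public boolean offer(Object joinKey) {
        if (!earlyTableActive) {
            return true; // dedup cancelled; keep every row
        }
        boolean isNew = seenKeys.add(joinKey);
        if (!isNew) {
            duplicates++;
        }
        if (++sampled == SAMPLE_SIZE
                && (double) duplicates / sampled < MIN_DUP_RATIO) {
            // Too few duplicates to pay the hash-table cost: cancel it.
            earlyTableActive = false;
            seenKeys.clear();
        }
        return isNew;
    }
}
```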





[jira] [Commented] (DRILL-7061) Selecting option to limit results to 1000 on web UI causes parse error

2019-03-12 Thread Kunal Khatua (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790831#comment-16790831
 ] 

Kunal Khatua commented on DRILL-7061:
-

Yes. I've removed it.

> Selecting option to limit results to 1000 on web UI causes parse error
> --
>
> Key: DRILL-7061
> URL: https://issues.apache.org/jira/browse/DRILL-7061
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Khurram Faraaz
>Assignee: Kunal Khatua
>Priority: Critical
> Fix For: 1.16.0
>
> Attachments: image-2019-02-27-14-17-24-348.png
>
>
> Selecting the option to limit results to 1,000 causes a parse error in the 
> web UI; a screenshot is attached. The browser used was Chrome.
> Drill version => 1.16.0-SNAPSHOT
> commit = e342ff5
> Error reported in the web UI when the Submit button is pressed:
> {noformat}
> Query Failed: An Error Occurred 
> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 'LIMIT 
> start, count' is not allowed under the current SQL conformance level SQL 
> Query -- [autoLimit: 1,000 rows] select * from ( select length(varStr) from 
> dfs.`/root/many_json_files` ) limit 1,000 [Error Id: 
> e252d1cc-54d4-4530-837c-a1726a5be89f on qa102-45.qa.lab:31010]{noformat}
>  Stack trace from drillbit.log
> {noformat}
> 2019-02-27 21:59:18,428 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
> 2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2 issued by anonymous: -- [autoLimit: 
> 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> 2019-02-27 21:59:18,438 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.d.exec.planner.sql.SqlConverter - User Error Occurred: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level ('LIMIT start, 
> count' is not allowed under the current SQL conformance level)
> org.apache.drill.common.exceptions.UserException: PARSE ERROR: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> SQL Query -- [autoLimit: 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> [Error Id: 286b7236-bafd-4ddc-ab10-aaac07e5c088 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:193) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:138)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:110)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:76)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_191]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
> Caused by: org.apache.calcite.sql.parser.SqlParseException: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.convertException(DrillParserImpl.java:357)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.normalizeException(DrillParserImpl.java:145)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:185) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> ... 8 common frames omitted
> Caused by: org.apache.drill.exec.planner.sql.parser.impl.ParseException: 
> 'LIMIT start, count' is not allowed under the current SQL conformance level
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.OrderedQueryOrExpr(DrillParserImpl.java:489)
>  
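One plausible cause consistent with the error above (an assumption, not confirmed by the report) is that the auto-limit value is rendered with a locale-aware thousands separator before being spliced into the SQL, producing `limit 1,000`, which the parser reads as the unsupported 'LIMIT start, count' form. A minimal sketch with hypothetical names, not Drill's actual web-server code:

```java
import java.text.NumberFormat;
import java.util.Locale;

// Sketch of the suspected formatting pitfall: a grouped number ("1,000")
// in the LIMIT clause versus a plain integer ("1000").
public class AutoLimitSketch {

    public static String wrapBuggy(String query, int autoLimit) {
        String pretty = NumberFormat.getIntegerInstance(Locale.US).format(autoLimit);
        return "select * from ( " + query + " ) limit " + pretty; // "limit 1,000"
    }

    public static String wrapFixed(String query, int autoLimit) {
        return "select * from ( " + query + " ) limit " + autoLimit; // "limit 1000"
    }

    public static void main(String[] args) {
        System.out.println(wrapBuggy("select 1", 1000));
        System.out.println(wrapFixed("select 1", 1000));
    }
}
```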

[jira] [Resolved] (DRILL-6976) SchemaChangeException happens when using split function in subquery if it returns empty result.

2019-03-12 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy resolved DRILL-6976.
-
Resolution: Fixed

> SchemaChangeException happens when using split function in subquery if it 
> returns empty result.
> ---
>
> Key: DRILL-6976
> URL: https://issues.apache.org/jira/browse/DRILL-6976
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select substr(col, 2, 3) 
> from (select split(n_comment, ' ') [3] col 
>   from cp.`tpch/nation.parquet` 
>   where n_nationkey = -1 
>   group by n_comment 
>   order by n_comment 
>   limit 5);
> {code}
> *Expected result:*
> {noformat}
> +-+
> | EXPR$0  |
> +-+
> +-+
> {noformat}
> *Actual result:*
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 86515d74-7b9c-4949-8ece-c9c17e00afc3 on userf87d-pc:31010]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while 
> trying to materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():498
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {noformat}
> *Note:* The filter "where n_nationkey = -1" doesn't return any rows. With 
> "= 1", for example, the query returns a result without error.
> *Workaround:* Use a cast on the split function, for example:
> {code:sql}
> cast(split(n_comment, ' ') [3] as varchar)
> {code}





[jira] [Assigned] (DRILL-6976) SchemaChangeException happens when using split function in subquery if it returns empty result.

2019-03-12 Thread Anton Gozhiy (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anton Gozhiy reassigned DRILL-6976:
---

Assignee: Bohdan Kazydub  (was: Anton Gozhiy)

> SchemaChangeException happens when using split function in subquery if it 
> returns empty result.
> ---
>
> Key: DRILL-6976
> URL: https://issues.apache.org/jira/browse/DRILL-6976
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: Anton Gozhiy
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.16.0
>
>
> *Query:*
> {code:sql}
> select substr(col, 2, 3) 
> from (select split(n_comment, ' ') [3] col 
>   from cp.`tpch/nation.parquet` 
>   where n_nationkey = -1 
>   group by n_comment 
>   order by n_comment 
>   limit 5);
> {code}
> *Expected result:*
> {noformat}
> +-+
> | EXPR$0  |
> +-+
> +-+
> {noformat}
> *Actual result:*
> {noformat}
> Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to 
> materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> Fragment 0:0
> Please, refer to logs for more information.
> [Error Id: 86515d74-7b9c-4949-8ece-c9c17e00afc3 on userf87d-pc:31010]
>   (org.apache.drill.exec.exception.SchemaChangeException) Failure while 
> trying to materialize incoming schema.  Errors:
>  
> Error in expression at index -1.  Error: Missing function implementation: 
> [castVARCHAR(NULL-OPTIONAL, BIGINT-REQUIRED)].  Full expression: --UNKNOWN 
> EXPRESSION--..
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput():498
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchema():583
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():101
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():143
> org.apache.drill.exec.record.AbstractRecordBatch.next():186
> org.apache.drill.exec.physical.impl.BaseRootExec.next():104
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():94
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():297
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():284
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1746
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():284
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748 (state=,code=0)
> {noformat}
> *Note:* The filter "where n_nationkey = -1" doesn't return any rows. With 
> "= 1", for example, the query returns a result without error.
> *Workaround:* Use a cast on the split function, for example:
> {code:sql}
> cast(split(n_comment, ' ') [3] as varchar)
> {code}





[jira] [Commented] (DRILL-7061) Selecting option to limit results to 1000 on web UI causes parse error

2019-03-12 Thread Sorabh Hamirwasia (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790799#comment-16790799
 ] 

Sorabh Hamirwasia commented on DRILL-7061:
--

Should we remove the link to DRILL-6960, since this issue is different?

> Selecting option to limit results to 1000 on web UI causes parse error
> --
>
> Key: DRILL-7061
> URL: https://issues.apache.org/jira/browse/DRILL-7061
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Khurram Faraaz
>Assignee: Kunal Khatua
>Priority: Critical
> Fix For: 1.16.0
>
> Attachments: image-2019-02-27-14-17-24-348.png
>
>
> Selecting the option to limit results to 1,000 causes a parse error in the 
> web UI; a screenshot is attached. The browser used was Chrome.
> Drill version => 1.16.0-SNAPSHOT
> commit = e342ff5
> Error reported in the web UI when the Submit button is pressed:
> {noformat}
> Query Failed: An Error Occurred 
> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 'LIMIT 
> start, count' is not allowed under the current SQL conformance level SQL 
> Query -- [autoLimit: 1,000 rows] select * from ( select length(varStr) from 
> dfs.`/root/many_json_files` ) limit 1,000 [Error Id: 
> e252d1cc-54d4-4530-837c-a1726a5be89f on qa102-45.qa.lab:31010]{noformat}
>  Stack trace from drillbit.log
> {noformat}
> 2019-02-27 21:59:18,428 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
> 2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2 issued by anonymous: -- [autoLimit: 
> 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> 2019-02-27 21:59:18,438 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.d.exec.planner.sql.SqlConverter - User Error Occurred: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level ('LIMIT start, 
> count' is not allowed under the current SQL conformance level)
> org.apache.drill.common.exceptions.UserException: PARSE ERROR: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> SQL Query -- [autoLimit: 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> [Error Id: 286b7236-bafd-4ddc-ab10-aaac07e5c088 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:193) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:138)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:110)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:76)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_191]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
> Caused by: org.apache.calcite.sql.parser.SqlParseException: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.convertException(DrillParserImpl.java:357)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.normalizeException(DrillParserImpl.java:145)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:185) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> ... 8 common frames omitted
> Caused by: org.apache.drill.exec.planner.sql.parser.impl.ParseException: 
> 'LIMIT start, count' is not allowed under the current SQL conformance level
> at 
> 
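The failure above stems from the autoLimit text "limit 1,000": a locale-formatted row count, grouping comma included, is spliced into the SQL, where the parser reads the comma as the unsupported `LIMIT start, count` syntax. A minimal sketch of the mechanism (plain Java, not Drill's actual web-server code; the method names are illustrative):

```java
import java.text.NumberFormat;
import java.util.Locale;

public class AutoLimitSketch {
    // Locale-aware formatting inserts a grouping separator: 1000 -> "1,000".
    static String prettyCount(int n) {
        return NumberFormat.getInstance(Locale.US).format(n);
    }

    // Splicing the pretty-printed count into SQL yields "limit 1,000",
    // which the parser reads as the 'LIMIT start, count' form.
    static String wrapWithLimit(String sql, String count) {
        return "select * from (" + sql + ") limit " + count;
    }

    public static void main(String[] args) {
        String inner = "select length(varStr) from dfs.`/root/many_json_files`";
        System.out.println(wrapWithLimit(inner, prettyCount(1000)));    // broken
        System.out.println(wrapWithLimit(inner, String.valueOf(1000))); // valid
    }
}
```

Using the raw integer (no locale formatting) keeps the generated SQL valid.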

[jira] [Assigned] (DRILL-5683) Incorrect query result when query uses NOT(IS NOT NULL) expression

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi reassigned DRILL-5683:
--

Assignee: Volodymyr Vysotskyi  (was: Vitalii Diravka)

> Incorrect query result when query uses NOT(IS NOT NULL) expression 
> ---
>
> Key: DRILL-5683
> URL: https://issues.apache.org/jira/browse/DRILL-5683
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jinfeng Ni
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>
> The following repo was modified from a testcase provided by Arjun 
> Rajan(ara...@mapr.com).
> 1. Prepare dataset with null.
> {code}
> create table dfs.tmp.t1 as 
>   select r_regionkey, r_name, case when mod(r_regionkey, 3) > 0 then 
> mod(r_regionkey, 3) else null end as flag 
>   from cp.`tpch/region.parquet`;
> select * from dfs.tmp.t1;
> +--+--+---+
> | r_regionkey  |r_name| flag  |
> +--+--+---+
> | 0| AFRICA   | null  |
> | 1| AMERICA  | 1 |
> | 2| ASIA | 2 |
> | 3| EUROPE   | null  |
> | 4| MIDDLE EAST  | 1 |
> +--+--+---+
> {code}
> 2. Query with NOT(IS NOT NULL) expression in the filter. 
> {code}
> select * from dfs.tmp.t1 where NOT (flag IS NOT NULL);
> +--+-+---+
> | r_regionkey  | r_name  | flag  |
> +--+-+---+
> | 0| AFRICA  | null  |
> | 3| EUROPE  | null  |
> +--+-+---+
> {code}
> 3. Switch the run-time code compiler from the default to 'JDK', and get a wrong result. 
> {code}
> alter system set `exec.java_compiler` = 'JDK';
> +---+--+
> |  ok   |   summary|
> +---+--+
> | true  | exec.java_compiler updated.  |
> +---+--+
> select * from dfs.tmp.t1 where NOT (flag IS NOT NULL);
> +--+--+---+
> | r_regionkey  |r_name| flag  |
> +--+--+---+
> | 0| AFRICA   | null  |
> | 1| AMERICA  | 1 |
> | 2| ASIA | 2 |
> | 3| EUROPE   | null  |
> | 4| MIDDLE EAST  | 1 |
> +--+--+---+
> {code}
> 4. A wrong result can also occur when NOT(IS NOT NULL) is used in the Project operator.
> {code}
> select r_regionkey, r_name, NOT(flag IS NOT NULL) as exp1 from dfs.tmp.t1;
> +--+--+---+
> | r_regionkey  |r_name| exp1  |
> +--+--+---+
> | 0| AFRICA   | true  |
> | 1| AMERICA  | true  |
> | 2| ASIA | true  |
> | 3| EUROPE   | true  |
> | 4| MIDDLE EAST  | true  |
> +--+--+---+
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
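For reference, the behavior the filter in step 2 relies on is SQL's three-valued NOT: NULL stays NULL, and a known boolean is flipped. A hedged sketch over a Drill-style nullable holder (the holder class here is a stand-in mirroring the shape of Drill's NullableBitHolder, not the real class):

```java
// Stand-in for a Drill-style nullable bit holder (isSet/value int fields).
class NullableBit {
    int isSet; // 0 = SQL NULL, 1 = value present
    int value; // 0 = false, 1 = true
}

public class NullableNotDemo {
    // Three-valued NOT: NULL stays NULL, otherwise the bit is flipped.
    static NullableBit not(NullableBit in) {
        NullableBit out = new NullableBit();
        out.isSet = in.isSet;
        if (in.isSet == 1) {
            out.value = 1 - in.value;
        }
        return out;
    }

    public static void main(String[] args) {
        // For a null flag, "flag IS NOT NULL" evaluates to false (0)...
        NullableBit isNotNull = new NullableBit();
        isNotNull.isSet = 1;
        isNotNull.value = 0;
        // ...so NOT(flag IS NOT NULL) must be true, letting null rows through.
        System.out.println(not(isNotNull).value); // prints 1
    }
}
```

A correct evaluator must therefore keep exactly the rows with a null flag, as in step 2 of the repro.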


[jira] [Updated] (DRILL-6825) Applying different hash function according to data types and data size

2019-03-12 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6825:
-
Reviewer: Boaz Ben-Zvi

> Applying different hash function according to data types and data size
> --
>
> Key: DRILL-6825
> URL: https://issues.apache.org/jira/browse/DRILL-6825
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Codegen
>Reporter: weijie.tong
>Assignee: weijie.tong
>Priority: Major
> Fix For: 1.16.0
>
>
> Different hash functions have different performance depending on the data 
> type and data size. We should choose the right one to apply, rather than 
> always using MurmurHash.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
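One way to realize the idea above is per-key dispatch: a cheap multiplicative hash where the key data permits it, a stronger murmur-style mixer otherwise. A hedged sketch (illustrative, not Drill's codegen; the dispatch criterion is an assumption):

```java
public class HashDispatch {
    // MurmurHash3 32-bit finalizer: strong avalanche, a good default mixer.
    static int murmurFmix32(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Cheap multiplicative (Fibonacci) hash: can win for small int keys that
    // are already well distributed, at the cost of weaker mixing.
    static int cheapHash(int key) {
        return key * 0x9E3779B9;
    }

    // Illustrative dispatch: pick the hash based on a hint about the keys.
    static int hash(int key, boolean smallWellDistributed) {
        return smallWellDistributed ? cheapHash(key) : murmurFmix32(key);
    }
}
```

The interesting engineering question is how to obtain the hint cheaply (data type, key width, observed cardinality), which is what the issue proposes to explore.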


[jira] [Updated] (DRILL-6845) Eliminate duplicates for Semi Hash Join

2019-03-12 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6845:
-
Reviewer: Timothy Farkas

> Eliminate duplicates for Semi Hash Join
> ---
>
> Key: DRILL-6845
> URL: https://issues.apache.org/jira/browse/DRILL-6845
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Execution - Relational Operators
>Affects Versions: 1.14.0
>Reporter: Boaz Ben-Zvi
>Assignee: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.16.0
>
>
> Following DRILL-6735: The performance of the new Semi Hash Join may degrade 
> if the build side contains an excessive number of join-key duplicate rows; 
> this is mainly a result of the need to store all those rows first, before 
> the hash table is built.
>   Proposed solution: For Semi, the Hash Agg would create a Hash-Table 
> initially, and use it to eliminate key-duplicate rows as they arrive.
>   Proposed extra: That Hash-Table has an added cost (e.g. resizing). So 
> perform "runtime stats" – check the initial number of incoming rows (e.g. 
> 32k), and if the number of duplicates is less than some threshold (e.g. 20%) 
> – cancel that "early" hash table.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
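The proposed early duplicate elimination with a runtime-stats fallback can be sketched as follows (plain Java over a list of keys, not Drill's vectorized operator; the 32k sample size and 20% threshold are the example values from the description):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class SemiJoinDedupSketch {
    // Example values from the proposal: sample the first 32k rows and keep
    // the early hash table only if at least 20% of them were duplicates.
    static final int SAMPLE_ROWS = 32 * 1024;
    static final double MIN_DUPLICATE_RATIO = 0.20;

    // Eliminates join-key duplicates as rows arrive; cancels the "early"
    // hash table if the sampled duplicate ratio is below the threshold.
    static List<Long> dedupBuildSide(List<Long> keys) {
        Set<Long> seen = new HashSet<>();
        List<Long> kept = new ArrayList<>();
        int sampled = 0;
        int duplicates = 0;
        for (int i = 0; i < keys.size(); i++) {
            if (sampled == SAMPLE_ROWS
                    && (double) duplicates / sampled < MIN_DUPLICATE_RATIO) {
                // Too few duplicates: the hash table's cost (e.g. resizing)
                // is not worth it, so pass the remaining rows through as-is.
                kept.addAll(keys.subList(i, keys.size()));
                return kept;
            }
            long key = keys.get(i);
            sampled++;
            if (seen.add(key)) {
                kept.add(key);
            } else {
                duplicates++;
            }
        }
        return kept;
    }
}
```

With many duplicates the build side shrinks before the hash table is built; with few, only the first 32k rows pay the dedup cost.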


[jira] [Updated] (DRILL-7098) File Metadata Metastore Plugin

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-7098:
---
Affects Version/s: (was: 1.15.0)

> File Metadata Metastore Plugin
> --
>
> Key: DRILL-7098
> URL: https://issues.apache.org/jira/browse/DRILL-7098
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components:  Server, Metadata
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.16.0
>
>
> DRILL-6852 introduces the Drill Metastore API. 
> The second step is to create an internal Drill Metastore mechanism (and a 
> File Metastore Plugin), which will use the Metastore API and can be extended 
> for use by other Storage Plugins.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-7098) File Metadata Metastore Plugin

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-7098:
---
Labels: Metastore  (was: )

> File Metadata Metastore Plugin
> --
>
> Key: DRILL-7098
> URL: https://issues.apache.org/jira/browse/DRILL-7098
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components:  Server, Metadata
>Reporter: Vitalii Diravka
>Assignee: Vitalii Diravka
>Priority: Major
>  Labels: Metastore
> Fix For: 1.16.0
>
>
> DRILL-6852 introduces the Drill Metastore API. 
> The second step is to create an internal Drill Metastore mechanism (and a 
> File Metastore Plugin), which will use the Metastore API and can be extended 
> for use by other Storage Plugins.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7098) File Metadata Metastore Plugin

2019-03-12 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-7098:
--

 Summary: File Metadata Metastore Plugin
 Key: DRILL-7098
 URL: https://issues.apache.org/jira/browse/DRILL-7098
 Project: Apache Drill
  Issue Type: Sub-task
  Components:  Server, Metadata
Affects Versions: 1.15.0
Reporter: Vitalii Diravka
Assignee: Vitalii Diravka
 Fix For: 1.16.0


DRILL-6852 introduces the Drill Metastore API. 
The second step is to create an internal Drill Metastore mechanism (and a File 
Metastore Plugin), which will use the Metastore API and can be extended for 
use by other Storage Plugins.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6524) Two CASE statements in projection influence results of each other

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16790756#comment-16790756
 ] 

Volodymyr Vysotskyi commented on DRILL-6524:


In the second commit of the pull request, changes were made to avoid such cases 
of reference assignment, so that a scalar replacement can still be produced.
For the following query:
{code:sql}
select
  case when expr$0 = 3 then expr$0 else expr$1 end,
  case when expr$0 = 1 then expr$0 else expr$1 end
from (values(1, 2));
{code}
without the changes, the following class is generated:
{code:java}
package org.apache.drill.exec.test.generated;

import org.apache.drill.exec.exception.SchemaChangeException;
import org.apache.drill.exec.expr.holders.BigIntHolder;
import org.apache.drill.exec.expr.holders.IntHolder;
import org.apache.drill.exec.expr.holders.NullableBigIntHolder;
import org.apache.drill.exec.expr.holders.NullableBitHolder;
import org.apache.drill.exec.ops.FragmentContext;
import org.apache.drill.exec.physical.impl.project.ProjectorTemplate;
import org.apache.drill.exec.record.RecordBatch;
import org.apache.drill.exec.vector.NullableBigIntVector;

public class ProjectorGen0
extends ProjectorTemplate
{

NullableBigIntVector vv1;
IntHolder const5;
BigIntHolder constant7;
NullableBigIntVector vv9;
NullableBigIntVector vv13;
IntHolder const17;
BigIntHolder constant19;
NullableBigIntVector vv21;
NullableBigIntVector vv25;

public ProjectorGen0() {
try {
__DRILL_INIT__();
} catch (SchemaChangeException e) {
throw new UnsupportedOperationException(e);
}
}

public void doEval(int inIndex, int outIndex)
throws SchemaChangeException
{
{
NullableBigIntHolder out0 = new NullableBigIntHolder();
NullableBigIntHolder out4 = new NullableBigIntHolder();
{
out4 .isSet = vv1 .getAccessor().isSet((inIndex));
if (out4 .isSet == 1) {
out4 .value = vv1 .getAccessor().get((inIndex));
}
}
// start of eval portion of equal function. //
NullableBitHolder out8 = new NullableBitHolder();
{
if (out4 .isSet == 0) {
out8 .isSet = 0;
} else {
final NullableBitHolder out = new NullableBitHolder();
NullableBigIntHolder left = out4;
BigIntHolder right = constant7;
GCompareBigIntVsBigInt$EqualsBigIntVsBigInt_eval:
{
if (Double.isNaN(left.value) && Double.isNaN(right.value)) {
out.value = 1;
} else
{
out.value = left.value == right.value ? 1 : 0;
}
}
out.isSet = 1;
out8 = out;
out.isSet = 1;
}
}
// end of eval portion of equal function. //
if ((out8 .isSet == 1)&&(out8 .value == 1)) {
if (out4 .isSet!= 0) {
out0 = out4;
}
} else {
NullableBigIntHolder out12 = new NullableBigIntHolder();
{
out12 .isSet = vv9 .getAccessor().isSet((inIndex));
if (out12 .isSet == 1) {
out12 .value = vv9 .getAccessor().get((inIndex));
}
}
if (out12 .isSet!= 0) {
out0 = out12;
}
}
if (!(out0 .isSet == 0)) {
vv13 .getMutator().set((outIndex), out0 .isSet, out0 .value);
}
NullableBigIntHolder out16 = new NullableBigIntHolder();
// start of eval portion of equal function. //
NullableBitHolder out20 = new NullableBitHolder();
{
if (out4 .isSet == 0) {
out20 .isSet = 0;
} else {
final NullableBitHolder out = new NullableBitHolder();
NullableBigIntHolder left = out4;
BigIntHolder right = constant19;
GCompareBigIntVsBigInt$EqualsBigIntVsBigInt_eval:
{
if (Double.isNaN(left.value) && Double.isNaN(right.value)) {
out.value = 1;
} else
{
out.value = left.value == right.value ? 1 : 0;
}
}
out.isSet = 1;
out20 = out;
out.isSet = 1;
}
}
// end of eval portion of equal function. //
if ((out20 .isSet == 1)&&(out20 .value == 1)) {
if (out4 .isSet!= 0) {
out16 = out4;
}
} else {
NullableBigIntHolder out24 = new NullableBigIntHolder();
{
out24 .isSet = 

[jira] [Updated] (DRILL-7061) Selecting option to limit results to 1000 on web UI causes parse error

2019-03-12 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-7061:
-
Reviewer: Sorabh Hamirwasia

> Selecting option to limit results to 1000 on web UI causes parse error
> --
>
> Key: DRILL-7061
> URL: https://issues.apache.org/jira/browse/DRILL-7061
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Web Server
>Affects Versions: 1.16.0
>Reporter: Khurram Faraaz
>Assignee: Kunal Khatua
>Priority: Critical
> Fix For: 1.16.0
>
> Attachments: image-2019-02-27-14-17-24-348.png
>
>
> Selecting the option to limit results to 1,000 causes a parse error on the 
> web UI; a screenshot is attached. The browser used was Chrome.
> Drill version => 1.16.0-SNAPSHOT
> commit = e342ff5
> Error reported on the web UI when the Submit button is pressed:
> {noformat}
> Query Failed: An Error Occurred 
> org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: 'LIMIT 
> start, count' is not allowed under the current SQL conformance level SQL 
> Query -- [autoLimit: 1,000 rows] select * from ( select length(varStr) from 
> dfs.`/root/many_json_files` ) limit 1,000 [Error Id: 
> e252d1cc-54d4-4530-837c-a1726a5be89f on qa102-45.qa.lab:31010]{noformat}
>  Stack trace from drillbit.log
> {noformat}
> 2019-02-27 21:59:18,428 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.drill.exec.work.foreman.Foreman - Query text for query with id 
> 2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2 issued by anonymous: -- [autoLimit: 
> 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> 2019-02-27 21:59:18,438 [2388f7c9-2cb4-0ef8-4088-3ffcab1f0ed2:foreman] INFO 
> o.a.d.exec.planner.sql.SqlConverter - User Error Occurred: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level ('LIMIT start, 
> count' is not allowed under the current SQL conformance level)
> org.apache.drill.common.exceptions.UserException: PARSE ERROR: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> SQL Query -- [autoLimit: 1,000 rows]
> select * from (
> select length(varStr) from dfs.`/root/many_json_files`
> ) limit 1,000
> [Error Id: 286b7236-bafd-4ddc-ab10-aaac07e5c088 ]
> at 
> org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
>  ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:193) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:138)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:110)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:76)
>  [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:584) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:272) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  [na:1.8.0_191]
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  [na:1.8.0_191]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_191]
> Caused by: org.apache.calcite.sql.parser.SqlParseException: 'LIMIT start, 
> count' is not allowed under the current SQL conformance level
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.convertException(DrillParserImpl.java:357)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.normalizeException(DrillParserImpl.java:145)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:156) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:181) 
> ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
> at 
> org.apache.drill.exec.planner.sql.SqlConverter.parse(SqlConverter.java:185) 
> [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> ... 8 common frames omitted
> Caused by: org.apache.drill.exec.planner.sql.parser.impl.ParseException: 
> 'LIMIT start, count' is not allowed under the current SQL conformance level
> at 
> org.apache.drill.exec.planner.sql.parser.impl.DrillParserImpl.OrderedQueryOrExpr(DrillParserImpl.java:489)
>  ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT]
> at 

[jira] [Comment Edited] (DRILL-6524) Two CASE statements in projection influence results of each other

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788807#comment-16788807
 ] 

Volodymyr Vysotskyi edited comment on DRILL-6524 at 3/12/19 4:38 PM:
-

The problem here is caused by incorrect scalar replacement. It still happens 
when a reference declared outside a conditional block is reassigned inside 
that block, which produces wrong results when the code in the conditional 
block is not executed.
The code below demonstrates the issue:

Original class:
{code:java}
package org.apache.drill;

import org.apache.drill.exec.expr.holders.NullableBigIntHolder;

public class CompileClassWithIfs {

  public static void doSomething() {
int a = 2;
NullableBigIntHolder out0 = new NullableBigIntHolder();
out0.isSet = 1;
NullableBigIntHolder out4 = new NullableBigIntHolder();
out4.isSet = 0;
NullableBigIntHolder out5 = new NullableBigIntHolder();

if (a == 0) {
  out0 = out4;
} else {
}

if (out4.isSet == 0) {
  out0.isSet = 1;
  out5.value = 3;
} else {
  out0.isSet = 0;
  assert false : "Incorrect class transformation. This code should never be 
executed.";
}
  }
}
{code}

Code after scalar replacement:
{code:java}
package org.apache.drill;

public class CompileClassWithIfs {
  public static void doSomething() {
byte var0 = 2;
boolean var1 = false;
long var2 = 0L;
var1 = true;
boolean var4 = false;
long var5 = 0L;
var4 = false;
boolean var7 = false;
long var8 = 0L;
if (var0 == 0) {
  var1 = var4;
}

if (!var1) {
  var1 = true;
  var8 = (long)3;
} else {
  var1 = false;
  if (true) {
throw new AssertionError("Incorrect class transformation. This code 
should never be executed.");
  }
}

  }

  public CompileClassWithIfs() {
  }
}
{code}

Please note that in the original code the assertion error is not thrown, but 
after scalar replacement {{out0}} and {{out4}} were inlined into the same 
variables, so they ended up holding the same values.

Code after the changes in https://github.com/apache/drill/pull/1686:
{code:java}
package org.apache.drill;

import org.apache.drill.exec.expr.holders.NullableBigIntHolder;

public class CompileClassWithIfs {
  public static void doSomething() {
byte var0 = 2;
NullableBigIntHolder var1 = new NullableBigIntHolder();
var1.isSet = 1;
NullableBigIntHolder var2 = new NullableBigIntHolder();
var2.isSet = 0;
boolean var3 = false;
long var4 = 0L;
if (var0 == 0) {
  var1 = var2;
}

if (var2.isSet == 0) {
  var1.isSet = 1;
  var4 = (long)3;
} else {
  var1.isSet = 0;
  if (true) {
throw new AssertionError("Incorrect class transformation. This code 
should never be executed.");
  }
}

  }

  public CompileClassWithIfs() {
  }
}
{code}

Please note that the two {{NullableBigIntHolder}} objects whose references 
were assigned to each other in the if block were not replaced, since it is 
impossible to produce a correct replacement for them.

Bytecode of the last class:
{code}
 // class version 50.0 (50)
// access flags 0x21
public class org/apache/drill/CompileClassWithIfs {


  // access flags 0x9
  public static doSomething()V
ICONST_2
ISTORE 0
NEW org/apache/drill/exec/expr/holders/NullableBigIntHolder
DUP
INVOKESPECIAL 
org/apache/drill/exec/expr/holders/NullableBigIntHolder. ()V
ASTORE 1
ALOAD 1
ICONST_1
PUTFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I
NEW org/apache/drill/exec/expr/holders/NullableBigIntHolder
DUP
INVOKESPECIAL 
org/apache/drill/exec/expr/holders/NullableBigIntHolder. ()V
ASTORE 2
ALOAD 2
ICONST_0
PUTFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I
ICONST_0
ISTORE 3
LCONST_0
LSTORE 4
ILOAD 0
ICONST_0
IF_ICMPNE L0
ALOAD 2
ASTORE 1
   L0
ALOAD 2
GETFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I
ICONST_0
IF_ICMPNE L1
ALOAD 1
ICONST_1
PUTFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I
ICONST_3
I2L
LSTORE 4
GOTO L2
   L1
ALOAD 1
ICONST_0
PUTFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I
ICONST_0
IFNE L2
NEW java/lang/AssertionError
DUP
LDC "Incorrect class transformation. This code should never be executed."
INVOKESPECIAL java/lang/AssertionError. (Ljava/lang/Object;)V
ATHROW
   L2
RETURN
MAXSTACK = 3
MAXLOCALS = 6

  // access flags 0x1
  public ()V
ALOAD 0
INVOKESPECIAL java/lang/Object. ()V
RETURN
MAXSTACK = 1
MAXLOCALS = 1
}
{code}
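The aliasing hazard can be reproduced without Drill. A minimal standalone demonstration (the Holder class here is a stand-in for a Drill holder, not real Drill code) of why two holders that may alias after a conditional reference assignment cannot be split into independent scalar variables:

```java
// Stand-in for a Drill holder class with a single mutable field.
class Holder {
    int isSet;
}

public class AliasingDemo {
    static int demo(boolean condition) {
        Holder a = new Holder();
        Holder b = new Holder();
        if (condition) {
            a = b;       // a and b now refer to the same object
        }
        a.isSet = 1;     // visible through b only when they alias
        // Scalar replacement would turn a.isSet and b.isSet into separate
        // local variables, losing the aliasing and changing this result.
        return b.isSet;
    }

    public static void main(String[] args) {
        System.out.println(demo(true));   // prints 1: the holders alias
        System.out.println(demo(false));  // prints 0: they stay independent
    }
}
```

This is exactly the pattern in the generated `doEval` code (`out0 = out4` under a condition), which is why the fix leaves such holders unreplaced.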



was (Author: vvysotskyi):
The problem here is connected with incorrect scalar replacement. It still 

[jira] [Comment Edited] (DRILL-6524) Two CASE statements in projection influence results of each other

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16788807#comment-16788807
 ] 

Volodymyr Vysotskyi edited comment on DRILL-6524 at 3/12/19 4:33 PM:
-

The problem here is caused by incorrect scalar replacement. It still happens 
when a reference declared outside a conditional block is reassigned inside 
that block, which produces wrong results when the code in the conditional 
block is not executed.
The code below demonstrates the issue:

Original class:
{code:java}
package org.apache.drill;

import org.apache.drill.exec.expr.holders.NullableBigIntHolder;

public class CompileClassWithIfs {

  public static void doSomething() {
int a = 2;
NullableBigIntHolder out0 = new NullableBigIntHolder();
out0.isSet = 1;
NullableBigIntHolder out4 = new NullableBigIntHolder();
out4.isSet = 0;
NullableBigIntHolder out5 = new NullableBigIntHolder();

if (a == 0) {
  out0 = out4;
} else {
}

if (out4.isSet == 0) {
  out0.isSet = 1;
  out5.value = 3;
} else {
  out0.isSet = 0;
  assert false : "Incorrect class transformation. This code should never be 
executed.";
}
  }
}
{code}

Code after scalar replacement:
{code:java}
package org.apache.drill;

public class CompileClassWithIfs {
  public static void doSomething() {
byte var0 = 2;
boolean var1 = false;
long var2 = 0L;
var1 = true;
boolean var4 = false;
long var5 = 0L;
var4 = false;
boolean var7 = false;
long var8 = 0L;
if (var0 == 0) {
  var1 = var4;
}

if (!var1) {
  var1 = true;
  var8 = (long)3;
} else {
  var1 = false;
  if (true) {
throw new AssertionError("Incorrect class transformation. This code 
should never be executed.");
  }
}

  }

  public CompileClassWithIfs() {
  }
}
{code}

Please note that in the original code the assertion error is not thrown, but 
after scalar replacement {{out0}} and {{out4}} were inlined into the same 
variables, so they ended up holding the same values.

Code after the changes in https://github.com/apache/drill/pull/1686:
{code:java}
package org.apache.drill;

import org.apache.drill.exec.expr.holders.NullableBigIntHolder;

public class CompileClassWithIfs {
  public static void doSomething() {
byte var0 = 2;
NullableBigIntHolder var1 = new NullableBigIntHolder();
var1.isSet = 1;
NullableBigIntHolder var2 = new NullableBigIntHolder();
var2.isSet = 0;
boolean var3 = false;
long var4 = 0L;
if (var0 == 0) {
  var1 = var2;
}

if (var2.isSet == 0) {
  var1.isSet = 1;
  var4 = (long)3;
} else {
  var1.isSet = 0;
  if (true) {
throw new AssertionError("Incorrect class transformation. This code 
should never be executed.");
  }
}

  }

  public CompileClassWithIfs() {
  }
}
{code}

Please note that the two {{NullableBigIntHolder}} objects whose references 
were assigned to each other in the if block were not replaced, since it is 
impossible to produce a correct replacement for them.


was (Author: vvysotskyi):
The problem here is caused by incorrect scalar replacement. It still happens 
when a reference declared outside a conditional block is reassigned inside 
that block, which produces wrong results when the code in the conditional 
block is not executed.
The code below demonstrates the issue:

Original class:
{code:java}
package org.apache.drill;

import org.apache.drill.exec.expr.holders.NullableBigIntHolder;

public class CompileClassWithIfs {

  public static void doSomething() {
int a = 2;
NullableBigIntHolder out0 = new NullableBigIntHolder();
out0.isSet = 1;
NullableBigIntHolder out4 = new NullableBigIntHolder();
out4.isSet = 0;
if (a == 0) {
  out0 = out4;
} else {
}

if (out4.isSet == 0) {
  out0.isSet = 1;
} else {
  out0.isSet = 0;
  assert false : "incorrect value";
}
  }
}
{code}

Code after scalar replacement:
{code:java}
package org.apache.drill;

public class CompileClassWithIfs {
  public static void doSomething() {
int a = 2;
boolean var1 = false;
long var2 = 0L;
var1 = true;
boolean var4 = false;
long var5 = 0L;
var4 = false;
if (a == 0) {
  var1 = var4;
}

if (!var1) {
  var1 = true;
} else {
  var1 = false;
  if (true) {
throw new AssertionError("incorrect value");
  }
}

  }

  public CompileClassWithIfs() {
  }
}
{code}

Please note that in the original code the assertion error is not thrown, but 
after scalar replacement {{out0}} and {{out4}} were inlined into the same 
variables, so they ended up holding the same values.

> Two CASE statements in projection influence results of each other
> 

[jira] [Updated] (DRILL-7075) Fix debian package issue with control files

2019-03-12 Thread Vitalii Diravka (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vitalii Diravka updated DRILL-7075:
---
Labels: archive archives deb ready-to-commit rpm  (was: archive archives 
deb rpm)

> Fix debian package issue with control files
> ---
>
> Key: DRILL-7075
> URL: https://issues.apache.org/jira/browse/DRILL-7075
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Tools, Build  Test
>Affects Versions: Future
> Environment: Verified under Ubuntu OS installed on ARM64
>Reporter: Naresh Bhat
>Assignee: Naresh Bhat
>Priority: Major
>  Labels: archive, archives, deb, ready-to-commit, rpm
> Fix For: 1.16.0
>
>
> Fix the Debian package issue with control files. The master branch is 
> broken: while generating the Debian package, the build looks for control 
> files under distribution/src/deb/. I don't think it is a good idea to remove 
> the control files.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (DRILL-7096) Develop vector for canonical Map

2019-03-12 Thread Igor Guzenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko reassigned DRILL-7096:
---

Assignee: Bohdan Kazydub  (was: Igor Guzenko)

> Develop vector for canonical Map
> -
>
> Key: DRILL-7096
> URL: https://issues.apache.org/jira/browse/DRILL-7096
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Igor Guzenko
>Assignee: Bohdan Kazydub
>Priority: Major
>
> A canonical Map datatype can be represented using a combination of three 
> value vectors:
> keysVector - vector for storing the keys of each map
> valuesVector - vector for storing the values of each map
> offsetsVector - vector for storing the start index of each map
> So it's not very hard to create such a Map vector, but there is a major 
> issue with this representation: it's hard to search map values by key in 
> such a vector. We need to investigate some advanced techniques to make such 
> a search efficient, or find other, more suitable options for representing 
> the map datatype in the world of vectors.
> After a question about maps, the Apache Arrow developers responded that for 
> Java they don't have a real Map vector; for now they just have a logical Map 
> type definition where they define Map as List< Struct<key: key_type, 
> value: value_type> >. So an implementation of such a value vector would be 
> useful for Arrow too.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7097) Rename MapVector to StructVector

2019-03-12 Thread Igor Guzenko (JIRA)
Igor Guzenko created DRILL-7097:
---

 Summary: Rename MapVector to StructVector
 Key: DRILL-7097
 URL: https://issues.apache.org/jira/browse/DRILL-7097
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Igor Guzenko
Assignee: Igor Guzenko


For a long time, Drill's MapVector has actually been more suitable for 
representing Struct data, and in Apache Arrow it was indeed renamed to 
StructVector. To align our code with Arrow and make room for the planned 
implementation of a canonical Map (DRILL-7096), we need to rename the existing 
MapVector and all related classes. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7096) Develop vector for canonical Map

2019-03-12 Thread Igor Guzenko (JIRA)
Igor Guzenko created DRILL-7096:
---

 Summary: Develop vector for canonical Map
 Key: DRILL-7096
 URL: https://issues.apache.org/jira/browse/DRILL-7096
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Igor Guzenko
Assignee: Igor Guzenko


A canonical Map datatype can be represented using a combination of three value 
vectors:

keysVector - vector for storing the keys of each map
valuesVector - vector for storing the values of each map
offsetsVector - vector for storing the start index of each map

So it's not very hard to create such a Map vector, but there is a major issue 
with this representation: it's hard to search map values by key in such a 
vector. We need to investigate some advanced techniques to make such a search 
efficient, or find other, more suitable options for representing the map 
datatype in the world of vectors.

After a question about maps, the Apache Arrow developers responded that for 
Java they don't have a real Map vector; for now they just have a logical Map 
type definition where they define Map as List< Struct<key: key_type, 
value: value_type> >. So an implementation of such a value vector would be 
useful for Arrow too.





[jira] [Assigned] (DRILL-3587) Select hive's struct data gives IndexOutOfBoundsException instead of unsupported error

2019-03-12 Thread Igor Guzenko (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-3587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Igor Guzenko reassigned DRILL-3587:
---

Assignee: Igor Guzenko

> Select hive's struct data gives IndexOutOfBoundsException instead of 
> unsupported error
> --
>
> Key: DRILL-3587
> URL: https://issues.apache.org/jira/browse/DRILL-3587
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Hive
>Affects Versions: 1.2.0
>Reporter: Krystal
>Assignee: Igor Guzenko
>Priority: Major
> Fix For: Future
>
>
> I have a hive table that has a STRUCT data column.
> hive> select c15 from alltypes;
> OK
> NULL
> {"r":null,"s":null}
> {"r":1,"s":{"a":2,"b":"x"}}
> From drill:
> select c15 from alltypes;
> Error: SYSTEM ERROR: IndexOutOfBoundsException: index (1) must be less than 
> size (1)
> Since Drill currently does not support the Hive struct data type, it should 
> display a user-friendly error saying that the Hive struct data type is not 
> supported.





[jira] [Updated] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-12 Thread Charles Givre (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Charles Givre updated DRILL-6970:
-
Labels: ready-to-commit  (was: )

> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You can 
> query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embedded. The log storage plugin is configured like so:
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log file is very simple:
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  





[jira] [Created] (DRILL-7095) Expose Tuple Metadata to the physical plan

2019-03-12 Thread Arina Ielchiieva (JIRA)
Arina Ielchiieva created DRILL-7095:
---

 Summary: Expose Tuple Metadata to the physical plan
 Key: DRILL-7095
 URL: https://issues.apache.org/jira/browse/DRILL-7095
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
 Fix For: 1.16.0


Provide mechanism to expose Tuple Metadata to the physical plan (sub scan).





[jira] [Resolved] (DRILL-4093) Use CalciteSchema from Calcite master branch

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi resolved DRILL-4093.

Resolution: Done

Done in the scope of DRILL-3993.

> Use CalciteSchema from Calcite master branch
> 
>
> Key: DRILL-4093
> URL: https://issues.apache.org/jira/browse/DRILL-4093
> Project: Apache Drill
>  Issue Type: Task
>  Components: Query Planning  Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>Priority: Major
>
> Calcite-911 (https://issues.apache.org/jira/browse/CALCITE-911) pushed some 
> Drill-specific changes related to the CalciteSchema code to the Calcite master 
> branch. Drill should pick up those changes in the fork and make adjustments in 
> Drill's code.
> This would reduce the required effort when Drill wants to rebase the fork 
> onto Calcite master, or wants to get rid of the fork.
>   





[jira] [Updated] (DRILL-4093) Use CalciteSchema from Calcite master branch

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-4093:
---
Issue Type: Task  (was: Bug)

> Use CalciteSchema from Calcite master branch
> 
>
> Key: DRILL-4093
> URL: https://issues.apache.org/jira/browse/DRILL-4093
> Project: Apache Drill
>  Issue Type: Task
>  Components: Query Planning  Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>Priority: Major
>
> Calcite-911 (https://issues.apache.org/jira/browse/CALCITE-911) pushed some 
> Drill-specific changes related to the CalciteSchema code to the Calcite master 
> branch. Drill should pick up those changes in the fork and make adjustments in 
> Drill's code.
> This would reduce the required effort when Drill wants to rebase the fork 
> onto Calcite master, or wants to get rid of the fork.
>   





[jira] [Updated] (DRILL-4093) Use CalciteSchema from Calcite master branch

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi updated DRILL-4093:
---
Fix Version/s: 1.13.0

> Use CalciteSchema from Calcite master branch
> 
>
> Key: DRILL-4093
> URL: https://issues.apache.org/jira/browse/DRILL-4093
> Project: Apache Drill
>  Issue Type: Task
>  Components: Query Planning  Optimization
>Reporter: Jinfeng Ni
>Assignee: Jinfeng Ni
>Priority: Major
> Fix For: 1.13.0
>
>
> Calcite-911 (https://issues.apache.org/jira/browse/CALCITE-911) pushed some 
> Drill-specific changes related to the CalciteSchema code to the Calcite master 
> branch. Drill should pick up those changes in the fork and make adjustments in 
> Drill's code.
> This would reduce the required effort when Drill wants to rebase the fork 
> onto Calcite master, or wants to get rid of the fork.
>   





[jira] [Resolved] (DRILL-4004) Fix bugs in JDK8 Tests before updating enforcer to JDK8

2019-03-12 Thread Volodymyr Vysotskyi (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Volodymyr Vysotskyi resolved DRILL-4004.

   Resolution: Fixed
Fix Version/s: 1.13.0

Fixed in DRILL-4329

> Fix bugs in JDK8 Tests before updating enforcer to JDK8
> ---
>
> Key: DRILL-4004
> URL: https://issues.apache.org/jira/browse/DRILL-4004
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Jacques Nadeau
>Priority: Major
> Fix For: 1.13.0
>
>
> The following tests fail on JDK8
> {code}
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownIsEqual
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownGreaterThanWithSingleField
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownLessThanWithSingleField
> org.apache.drill.TestFrameworkTest.testRepeatedColumnMatching
> org.apache.drill.TestFrameworkTest.testCSVVerificationOfOrder_checkFailure
> org.apache.drill.exec.physical.impl.flatten.TestFlattenPlanning.testFlattenPlanningAvoidUnnecessaryProject
> org.apache.drill.exec.record.vector.TestValueVector.testFixedVectorReallocation
> org.apache.drill.exec.record.vector.TestValueVector.testVariableVectorReallocation
> {code}





[jira] [Updated] (DRILL-7068) Support memory adjustment framework for resource management with Queues

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7068:

Labels: ready-to-commit  (was: )

> Support memory adjustment framework for resource management with Queues
> ---
>
> Key: DRILL-7068
> URL: https://issues.apache.org/jira/browse/DRILL-7068
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Query Planning  Optimization
>Affects Versions: 1.16.0
>Reporter: Hanumath Rao Maduri
>Assignee: Hanumath Rao Maduri
>Priority: Major
>  Labels: ready-to-commit
>
> Add support for a memory adjustment framework based on queue configuration for 
> a query. 
> It also addresses re-factoring the existing queue-based resource 
> management in Drill.
> For more details on the design please refer to the parent JIRA.





[jira] [Updated] (DRILL-7058) Refresh command to support subset of columns

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7058:

Reviewer: Arina Ielchiieva  (was: Gautam Parai)

> Refresh command to support subset of columns
> 
>
> Key: DRILL-7058
> URL: https://issues.apache.org/jira/browse/DRILL-7058
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Modify the REFRESH TABLE METADATA command to accept selected columns which 
> are deemed interesting in some form (e.g. sorted/partitioned/clustered by); 
> column metadata will be stored only for those columns. The proposed 
> syntax is 
>   REFRESH TABLE METADATA *_[ COLUMNS (list of columns) / NONE ]_* <table path>
> For example, REFRESH TABLE METADATA COLUMNS (age, salary) `/tmp/employee` 
> stores column metadata only for the age and salary columns. REFRESH TABLE 
> METADATA COLUMNS NONE `/tmp/employee` will not store column metadata for any 
> column. 
> By default, if the optional 'COLUMNS' clause is omitted, column metadata is 
> collected for all the columns.  
>  





[jira] [Updated] (DRILL-7058) Refresh command to support subset of columns

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7058:

Labels: doc-impacting ready-to-commit  (was: doc-impacting)

> Refresh command to support subset of columns
> 
>
> Key: DRILL-7058
> URL: https://issues.apache.org/jira/browse/DRILL-7058
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Metadata
>Reporter: Venkata Jyothsna Donapati
>Assignee: Venkata Jyothsna Donapati
>Priority: Major
>  Labels: doc-impacting, ready-to-commit
> Fix For: 1.16.0
>
>   Original Estimate: 72h
>  Remaining Estimate: 72h
>
> Modify the REFRESH TABLE METADATA command to accept selected columns which 
> are deemed interesting in some form (e.g. sorted/partitioned/clustered by); 
> column metadata will be stored only for those columns. The proposed 
> syntax is 
>   REFRESH TABLE METADATA *_[ COLUMNS (list of columns) / NONE ]_* <table path>
> For example, REFRESH TABLE METADATA COLUMNS (age, salary) `/tmp/employee` 
> stores column metadata only for the age and salary columns. REFRESH TABLE 
> METADATA COLUMNS NONE `/tmp/employee` will not store column metadata for any 
> column. 
> By default, if the optional 'COLUMNS' clause is omitted, column metadata is 
> collected for all the columns.  
>  





[jira] [Updated] (DRILL-7014) Format plugin for LTSV files

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7014:

Labels: doc-impacting  (was: )

> Format plugin for LTSV files
> 
>
> Key: DRILL-7014
> URL: https://issues.apache.org/jira/browse/DRILL-7014
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.15.0
>Reporter: Takako Shimamoto
>Assignee: Takako Shimamoto
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.16.0
>
>
> I would like to contribute [this 
> plugin|https://github.com/bizreach/drill-ltsv-plugin] to Drill.
> h4. Abstract
> storage-plugins-override.conf
> {code:json}
> "storage":{
>   dfs: {
> type: "file",
> connection: "file:///",
> formats: {
>   "ltsv": {
> "type": "ltsv",
> "extensions": [
>   "ltsv"
> ]
>   }
> },
> enabled: true
>   }
> }
> {code}
> sample.ltsv
> {code}
> time:30/Nov/2016:00:55:08 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/xxx HTTP/1.1  status:200  size:4968 referer:- ua:Java/1.8.0_131 
> reqtime:2.532 apptime:2.532 vhost:api.example.com
> time:30/Nov/2016:00:56:37 +0900 host:xxx.xxx.xxx.xxx  forwardedfor:-  req:GET 
> /v1/yyy HTTP/1.1  status:200  size:412  referer:- ua:Java/1.8.0_201 
> reqtime:3.580 apptime:3.580 vhost:api.example.com
> {code}
> Run query
> {code:sh}
> root@1805183e9b65:/apache-drill-1.15.0# ./bin/drill-embedded 
> Apache Drill 1.15.0
> "Drill must go on."
> 0: jdbc:drill:zk=local> SELECT * FROM 
> dfs.`/apache-drill-1.15.0/sample-data/sample.ltsv` WHERE reqtime > 3.0;
> +-+--+---+---+-+---+--+-+--+--+--+
> |time |   host   | forwardedfor  |  
> req  | status  | size  | referer  |   ua| reqtime  | 
> apptime  |  vhost   |
> +-+--+---+---+-+---+--+-+--+--+--+
> | 30/Nov/2016:00:56:37 +0900  | xxx.xxx.xxx.xxx  | - | GET 
> /v1/yyy HTTP/1.1  | 200 | 412   | -| Java/1.8.0_201  | 3.580| 
> 3.580| api.example.com  |
> +-+--+---+---+-+---+--+-+--+--+--+
> 1 row selected (6.074 seconds)
> 0: jdbc:drill:zk=local> 
> {code}





[jira] [Updated] (DRILL-6852) Adapt current Parquet Metadata cache implementation to use Drill Metastore API

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6852:

Labels: ready-to-commit  (was: )

> Adapt current Parquet Metadata cache implementation to use Drill Metastore API
> --
>
> Key: DRILL-6852
> URL: https://issues.apache.org/jira/browse/DRILL-6852
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> According to the design document for DRILL-6552, the existing metadata cache API 
> should be adapted to use the generalized metastore API, and the parquet metadata 
> cache will be presented as an implementation of the metastore API.
> The aim of this Jira is to refactor the Parquet Metadata cache implementation and 
> adapt it to use the Drill Metastore API.
> Execution plan:
>  - Refactor AbstractParquetGroupScan and its implementations to use metastore 
> metadata classes. Store Drill data types in metadata files for Parquet tables.
>  - Store the least restrictive type instead of the current first file's column 
> data type.
>  - Rework logic in AbstractParquetGroupScan to allow filtering at different 
> metadata layers: partition, file, row group, etc. The same for pushing the 
> limit.
>  - Implement logic to convert existing parquet metadata to metastore metadata 
> to preserve backward compatibility.
>  - Implement fetching metadata only when it is needed (for filtering, limit, 
> count(*) etc.)





[jira] [Created] (DRILL-7094) UnSupported Bson type: DECIMAL128 Mongodb 3.6 onwards

2019-03-12 Thread Binod Sarkar (JIRA)
Binod Sarkar created DRILL-7094:
---

 Summary: UnSupported Bson type: DECIMAL128 Mongodb 3.6 onwards
 Key: DRILL-7094
 URL: https://issues.apache.org/jira/browse/DRILL-7094
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - MongoDB
Affects Versions: 1.15.0
Reporter: Binod Sarkar


DECIMAL128 does not work with MongoDB 3.6 onwards; queries fail with the error *Error: 
INTERNAL_ERROR ERROR: UnSupported Bson type: DECIMAL128*





[jira] [Updated] (DRILL-6852) Adapt current Parquet Metadata cache implementation to use Drill Metastore API

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6852:

Reviewer: Aman Sinha

> Adapt current Parquet Metadata cache implementation to use Drill Metastore API
> --
>
> Key: DRILL-6852
> URL: https://issues.apache.org/jira/browse/DRILL-6852
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> According to the design document for DRILL-6552, the existing metadata cache API 
> should be adapted to use the generalized metastore API, and the parquet metadata 
> cache will be presented as an implementation of the metastore API.
> The aim of this Jira is to refactor the Parquet Metadata cache implementation and 
> adapt it to use the Drill Metastore API.
> Execution plan:
>  - Refactor AbstractParquetGroupScan and its implementations to use metastore 
> metadata classes. Store Drill data types in metadata files for Parquet tables.
>  - Store the least restrictive type instead of the current first file's column 
> data type.
>  - Rework logic in AbstractParquetGroupScan to allow filtering at different 
> metadata layers: partition, file, row group, etc. The same for pushing the 
> limit.
>  - Implement logic to convert existing parquet metadata to metastore metadata 
> to preserve backward compatibility.
>  - Implement fetching metadata only when it is needed (for filtering, limit, 
> count(*) etc.)





[jira] [Updated] (DRILL-6970) Issue with LogRegex format plugin where drillbuf was overflowing

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6970:

Reviewer: Charles Givre

> Issue with LogRegex format plugin where drillbuf was overflowing 
> -
>
> Key: DRILL-6970
> URL: https://issues.apache.org/jira/browse/DRILL-6970
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.15.0
>Reporter: jean-claude
>Assignee: jean-claude
>Priority: Major
> Fix For: 1.16.0
>
>
> The log format plugin does not re-allocate the drillbuf when it fills up. You can 
> query small log files, but larger ones will fail with this error:
> 0: jdbc:drill:zk=local> select * from dfs.root.`/prog/test.log`;
> Error: INTERNAL_ERROR ERROR: index: 32724, length: 108 (expected: range(0, 
> 32768))
> Fragment 0:0
> Please, refer to logs for more information.
>  
> I'm running drill-embedded. The log storage plugin is configured like so:
> {code:java}
> "log": {
> "type": "logRegex",
> "regex": "(.+)",
> "extension": "log",
> "maxErrors": 10,
> "schema": [
> {
> "fieldName": "line"
> }
> ]
> },
> {code}
> The log file is very simple:
> {code:java}
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> jdsaljfldaksjfldsajfldasjflkjdsfldsjfljsdalfk
> ...{code}
>  
>  





[jira] [Updated] (DRILL-7092) Rename map to struct in schema definition

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7092:

Labels: ready-to-commit  (was: )

> Rename map to struct in schema definition
> -
>
> Key: DRILL-7092
> URL: https://issues.apache.org/jira/browse/DRILL-7092
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Drill's map concept is closer to the Struct concept. A good write-up by Paul 
> about this:
> https://github.com/paul-rogers/drill/wiki/Drill-Maps. Internally such objects 
> are represented by MapVector, which in the future will be renamed to StructVector 
> (similar was done in the Arrow project). Drill also does not support true maps, 
> which will also be implemented in the future with proper naming (MapVector).
> Nevertheless, the internal map / struct representation does not affect users; 
> before schema provisioning they did not deal with this concept at all.
> Initially, the map complex type in the Schema Provisioning schema was represented 
> with the map keyword followed by angle brackets: col1 map<...>. On second 
> thought, it looks better to use struct in the schema definition and internally 
> convert it to map metadata (which, again, will eventually be renamed to struct). 
> Such an approach would ease the map-to-struct renaming (since it will be done 
> internally), and users will keep using the struct naming as before. Once Drill 
> implements true maps, schema provisioning will allow the map type in the schema.





[jira] [Updated] (DRILL-2326) scalar replacement fails in TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical()

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-2326:

Description: 
Scalar replacement fails in 
TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical():

for now, we've worked around this by using a retry strategy in ClassTransformer 
which will fall back to using the code without scalar replacement when the 
scalar replacement fails. This needs to be revisited to find out what is 
failing about this particular case.

  was:For now, we've worked around this by using a retry strategy in 
ClassTransformer which will fall back to using the code without scalar 
replacement when the scalar replacement fails. This needs to be revisited to 
find out what is failing about this particular case.


> scalar replacement fails in 
> TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical()
> 
>
> Key: DRILL-2326
> URL: https://issues.apache.org/jira/browse/DRILL-2326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Volodymyr Vysotskyi
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Scalar replacement fails in 
> TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical():
> for now, we've worked around this by using a retry strategy in 
> ClassTransformer which will fall back to using the code without scalar 
> replacement when the scalar replacement fails. This needs to be revisited to 
> find out what is failing about this particular case.





[jira] [Updated] (DRILL-2326) Scalar replacement fails in TestConvertFunctions

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-2326:

Summary: Scalar replacement fails in TestConvertFunctions  (was: scalar 
replacement fails in TestConvertFunctions)

> Scalar replacement fails in TestConvertFunctions
> 
>
> Key: DRILL-2326
> URL: https://issues.apache.org/jira/browse/DRILL-2326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Volodymyr Vysotskyi
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Scalar replacement fails in 
> TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical():
> for now, we've worked around this by using a retry strategy in 
> ClassTransformer which will fall back to using the code without scalar 
> replacement when the scalar replacement fails. This needs to be revisited to 
> find out what is failing about this particular case.





[jira] [Updated] (DRILL-2326) scalar replacement fails in TestConvertFunctions

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-2326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-2326:

Summary: scalar replacement fails in TestConvertFunctions  (was: scalar 
replacement fails in 
TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical())

> scalar replacement fails in TestConvertFunctions
> 
>
> Key: DRILL-2326
> URL: https://issues.apache.org/jira/browse/DRILL-2326
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Codegen
>Affects Versions: 0.8.0
>Reporter: Chris Westin
>Assignee: Volodymyr Vysotskyi
>Priority: Minor
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Scalar replacement fails in 
> TestConvertFunctions.testBigIntVarCharReturnTripConvertLogical():
> for now, we've worked around this by using a retry strategy in 
> ClassTransformer which will fall back to using the code without scalar 
> replacement when the scalar replacement fails. This needs to be revisited to 
> find out what is failing about this particular case.





[jira] [Updated] (DRILL-6951) Merge row set based mock data source

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6951:

Fix Version/s: 1.17.0

> Merge row set based mock data source
> 
>
> Key: DRILL-6951
> URL: https://issues.apache.org/jira/browse/DRILL-6951
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.17.0
>
>
> The mock reader framework is an obscure bit of code used in tests that 
> generates fake data for use in things like testing sort, filters and so on.
> Because the mock reader is simple, it is a good demonstration case for the 
> new scanner framework based on the result set loader. This task merges the 
> existing work in migrating the mock data source into master via a PR.





[jira] [Updated] (DRILL-6953) Merge row set-based JSON reader

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6953:

Fix Version/s: 1.17.0

> Merge row set-based JSON reader
> ---
>
> Key: DRILL-6953
> URL: https://issues.apache.org/jira/browse/DRILL-6953
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.17.0
>
>
> The final step in the ongoing "result set loader" saga is to merge the 
> revised JSON reader into master. This reader does three key things:
> * Demonstrates the prototypical "late schema" style of data reading (discover 
> schema while reading).
> * Implements many tricks and hacks to handle schema changes while loading.
> * Shows that, even with all these tricks, the only true solution is to 
> actually have a schema.
> The new JSON reader:
> * Uses an expanded state machine when parsing rather than the complex set of 
> if-statements in the current version.
> * Handles reading a run of nulls before seeing the first data value (as long 
> as the data value shows up in the first record batch).
> * Uses the result-set loader to generate fixed-size batches regardless of the 
> complexity, depth of structure, or width of variable-length fields.
> While the JSON reader itself is helpful, the key contribution is that it 
> shows how to use the entire kit of parts: result set loader, projection 
> framework, and so on. Since the projection framework can handle an external 
> schema, it is also a handy foundation for the ongoing schema project.
> Key work to complete after this merger will be to reconcile actual data with 
> the external schema. For example, if we know a column is supposed to be a 
> VarChar, then read the column as a VarChar regardless of the type JSON itself 
> picks. Or, if a column is supposed to be a Double, then convert Int and 
> String JSON values into Doubles.
> The Row Set framework was designed to allow inserting custom column writers. 
> This would be a great opportunity to do the work needed to create them. Then, 
> use the new JSON framework to allow parsing a JSON field as a specified Drill 
> type.





[jira] [Updated] (DRILL-7092) Rename map to struct in schema definition

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7092:

Reviewer: Paul Rogers

> Rename map to struct in schema definition
> -
>
> Key: DRILL-7092
> URL: https://issues.apache.org/jira/browse/DRILL-7092
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.16.0
>
>
> Drill's map concept is closer to the Struct concept. A good write-up by Paul 
> about this:
> https://github.com/paul-rogers/drill/wiki/Drill-Maps. Internally such objects 
> are represented by MapVector, which in the future will be renamed to StructVector 
> (similar was done in the Arrow project). Drill also does not support true maps, 
> which will also be implemented in the future with proper naming (MapVector).
> Nevertheless, the internal map / struct representation does not affect users; 
> before schema provisioning they did not deal with this concept at all.
> Initially, the map complex type in the Schema Provisioning schema was represented 
> with the map keyword followed by angle brackets: col1 map<...>. On second 
> thought, it looks better to use struct in the schema definition and internally 
> convert it to map metadata (which, again, will eventually be renamed to struct). 
> Such an approach would ease the map-to-struct renaming (since it will be done 
> internally), and users will keep using the struct naming as before. Once Drill 
> implements true maps, schema provisioning will allow the map type in the schema.





[jira] [Updated] (DRILL-7086) Enhance row-set scan framework to use external schema

2019-03-12 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-7086:

Fix Version/s: 1.16.0

> Enhance row-set scan framework to use external schema
> -
>
> Key: DRILL-7086
> URL: https://issues.apache.org/jira/browse/DRILL-7086
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.16.0
>
>
> Modify the row-set scan framework to work with an external (partial) schema, 
> inserting "type conversion shims" to convert as needed. The reader provides 
> an "input schema": the data types the reader is prepared to handle. An 
> optional "output schema" describes the types of the value vectors to create. 
> The type conversion "shims" give the reader the "setFoo" method it wants to 
> use, while converting the data to the type needed for the vector. For 
> example, the CSV reader might read only text fields, while the shim converts 
> a column to an INT.
> This is just the framework layer, DRILL-7011 will combine this mechanism with 
> the plan-side features to enable use of the feature in the new row-set based 
> CSV reader.
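The shim mechanism described above can be sketched roughly as follows (all names here are illustrative assumptions, not Drill's actual column-writer API): the reader keeps calling the text-oriented setter it wants, while the shim converts each value to the type the output schema requires.

```java
// Hypothetical sketch of a "type conversion shim": the reader calls
// setString, while the shim parses the text and forwards an int to the
// output column. Names are illustrative, not Drill's real API.
public class ConversionShimSketch {
    interface IntWriter { void setInt(int v); }

    // Exposes the setter a text-only reader (e.g. CSV) expects, but
    // writes converted values to an INT column writer.
    static class StringToIntShim {
        private final IntWriter out;
        StringToIntShim(IntWriter out) { this.out = out; }
        void setString(String v) { out.setInt(Integer.parseInt(v.trim())); }
    }

    public static void main(String[] args) {
        int[] column = new int[1];                // stand-in for an INT vector
        StringToIntShim shim = new StringToIntShim(v -> column[0] = v);
        shim.setString(" 42 ");                   // reader sees only text...
        System.out.println(column[0]);            // ...column received an int
    }
}
```

The point of the design is that the reader's code is unchanged whether or not a conversion happens; only the writer handed to it differs.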


