[jira] [Updated] (DRILL-6529) Project Batch Sizing causes two LargeFileCompilation tests to timeout

2018-06-26 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-6529:
--
Labels:   (was: ready-to-commit)

> Project Batch Sizing causes two LargeFileCompilation tests to timeout
> -
>
> Key: DRILL-6529
> URL: https://issues.apache.org/jira/browse/DRILL-6529
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
> Fix For: 1.14.0
>
>
> Timeout failures are seen in TestLargeFileCompilation testExternal_Sort and 
> testTop_N_Sort. These tests are stress tests for compilation where the 
> queries cover projections over 5000 columns and sort over 500 columns. These 
> tests pass if they are run stand-alone. Something triggers the timeouts when 
> the tests are run in parallel as part of a unit test run.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6529) Project Batch Sizing causes two LargeFileCompilation tests to timeout

2018-06-26 Thread Timothy Farkas (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas updated DRILL-6529:
--
Labels: ready-to-commit  (was: )

> Project Batch Sizing causes two LargeFileCompilation tests to timeout
> -
>
> Key: DRILL-6529
> URL: https://issues.apache.org/jira/browse/DRILL-6529
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Karthikeyan Manivannan
>Assignee: Karthikeyan Manivannan
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Timeout failures are seen in TestLargeFileCompilation testExternal_Sort and 
> testTop_N_Sort. These tests are stress tests for compilation where the 
> queries cover projections over 5000 columns and sort over 500 columns. These 
> tests pass if they are run stand-alone. Something triggers the timeouts when 
> the tests are run in parallel as part of a unit test run.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6530) JVM crash with a query involving multiple json files with one file having a schema change of one column from string to list

2018-06-26 Thread Sorabh Hamirwasia (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524488#comment-16524488
 ] 

Sorabh Hamirwasia commented on DRILL-6530:
--

The issue here was in ListWriters.java: when it adds a writer for a repeated type, 
it takes the count of vectors currently in the container and, after calling 
*container.addOrGet*, gets the vector count again to see whether a new vector was 
added. If a new vector was added, it allocates memory for the newly created vector, 
since by default the DrillBuf for a vector is empty.

In cases where the schema changes such that the number of fields stays the same 
but only the type of a field changes, the vector count before and after calling 
*container.addOrGet* is the same even though a new vector for that field is 
created internally. Hence the space for the DrillBuf was not getting allocated. 
Later, when 
[startNewValue|https://github.com/apache/drill/blob/master/exec/vector/src/main/java/org/apache/drill/exec/vector/complex/BaseRepeatedValueVector.java#L270]
 was called as part of 
[setPosition|https://github.com/apache/drill/blob/master/exec/vector/src/main/codegen/templates/ListWriters.java#L117],
 it did a realloc but initialized only the second half of the offset vector, 
leaving the first half uninitialized, which ultimately resulted in referencing 
memory at an index taken from an uninitialized garbage value.

The fix is to detect the creation of a new value vector on a schema change by 
comparing the old and new ValueVector instances.
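
A minimal sketch of that detection idea, using stand-in types for illustration 
(not the actual Drill classes or the real ListWriters code):

{code:java}
// Illustrative stand-ins only; the real code lives in ListWriters.java and
// uses Drill's VectorContainer/ValueVector types.
interface ValueVector {
  void allocateNew();                       // give the vector backing buffers
}

interface Container {
  ValueVector getExisting(String field);    // null if the field is not present yet
  ValueVector addOrGet(String field);       // may replace the vector when the type changes
}

class RepeatedWriterSetup {
  void addRepeatedWriter(Container container, String field) {
    ValueVector before = container.getExisting(field);
    ValueVector after = container.addOrGet(field);

    // Counting vectors misses the case where a field merely changed type:
    // the count stays the same although a brand-new vector was created.
    // Comparing references catches both the "added" and the "replaced" cases.
    if (after != before) {
      after.allocateNew();                  // ensure the fresh vector has backing memory
    }
  }
}
{code}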

> JVM crash with a query involving multiple json files with one file having a 
> schema change of one column from string to list
> ---
>
> Key: DRILL-6530
> URL: https://issues.apache.org/jira/browse/DRILL-6530
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.14.0
>Reporter: Kedar Sankar Behera
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 0_0_92.json, 0_0_93.json, drillbit.log, drillbit.out, 
> hs_err_pid32076.log
>
>
> JVM crash with a Lateral Unnest query involving multiple json files with one 
> file having a schema change of one column from string to list.
> Query:
> {code}
> SELECT customer.c_custkey,customer.c_acctbal,orders.o_orderkey, 
> orders.o_totalprice,orders.o_orderdate,orders.o_shippriority,customer.c_address,orders.o_orderpriority,customer.c_comment
> FROM customer, LATERAL 
> (SELECT O.ord.o_orderkey as o_orderkey, O.ord.o_totalprice as 
> o_totalprice,O.ord.o_orderdate as o_orderdate ,O.ord.o_shippriority as 
> o_shippriority,O.ord.o_orderpriority 
> as o_orderpriority FROM UNNEST(customer.c_orders) O(ord))orders;
> {code}
> The error seen was:
> {code}
> o.a.d.e.p.impl.join.LateralJoinBatch - Output batch still has some space 
> left, getting new batches from left and right
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_custkey
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_phone
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_acctbal
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_orders
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_mktsegment
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_address
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_nationkey
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_name
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_comment
> 2018-06-21 15:25:16,316 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.e.v.c.AbstractContainerVector - Field [o_comment] mutated from 
> [NullableVarCharVector] to [RepeatedVarCharVector]
> 2018-06-21 15:25:16,318 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.drill.exec.vector.UInt4Vector - Reallocating vector [[`$offsets$` 
> (UINT4:REQUIRED)]]. # of bytes: [16384] -> [32768]
> 

[jira] [Updated] (DRILL-6530) JVM crash with a query involving multiple json files with one file having a schema change of one column from string to list

2018-06-26 Thread Sorabh Hamirwasia (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6530?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sorabh Hamirwasia updated DRILL-6530:
-
Component/s: Execution - Data Types

> JVM crash with a query involving multiple json files with one file having a 
> schema change of one column from string to list
> ---
>
> Key: DRILL-6530
> URL: https://issues.apache.org/jira/browse/DRILL-6530
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.14.0
>Reporter: Kedar Sankar Behera
>Assignee: Sorabh Hamirwasia
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: 0_0_92.json, 0_0_93.json, drillbit.log, drillbit.out, 
> hs_err_pid32076.log
>
>
> JVM crash with a Lateral Unnest query involving multiple json files with one 
> file having a schema change of one column from string to list.
> Query:
> {code}
> SELECT customer.c_custkey,customer.c_acctbal,orders.o_orderkey, 
> orders.o_totalprice,orders.o_orderdate,orders.o_shippriority,customer.c_address,orders.o_orderpriority,customer.c_comment
> FROM customer, LATERAL 
> (SELECT O.ord.o_orderkey as o_orderkey, O.ord.o_totalprice as 
> o_totalprice,O.ord.o_orderdate as o_orderdate ,O.ord.o_shippriority as 
> o_shippriority,O.ord.o_orderpriority 
> as o_orderpriority FROM UNNEST(customer.c_orders) O(ord))orders;
> {code}
> The error seen was:
> {code}
> o.a.d.e.p.impl.join.LateralJoinBatch - Output batch still has some space 
> left, getting new batches from left and right
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_custkey
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_phone
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_acctbal
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_orders
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_mktsegment
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_address
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_nationkey
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_name
> 2018-06-21 15:25:16,303 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.exec.physical.impl.ScanBatch - set record count 0 for vv c_comment
> 2018-06-21 15:25:16,316 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.d.e.v.c.AbstractContainerVector - Field [o_comment] mutated from 
> [NullableVarCharVector] to [RepeatedVarCharVector]
> 2018-06-21 15:25:16,318 [24d3da36-bdb8-cb5b-594c-82135bfb84aa:frag:0:0] DEBUG 
> o.a.drill.exec.vector.UInt4Vector - Reallocating vector [[`$offsets$` 
> (UINT4:REQUIRED)]]. # of bytes: [16384] -> [32768]
> {code}
> On further investigation with [~shamirwasia], it was found that the crash only 
> happens when [o_comment] mutates from [NullableVarCharVector] to 
> [RepeatedVarCharVector], not the other way around.
> Please find the logs, stack trace, and data files attached.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6543) Options for memory mgmt: Reserve allowance for non-buffered, and Hash-Join default to not fallback

2018-06-26 Thread Boaz Ben-Zvi (JIRA)
Boaz Ben-Zvi created DRILL-6543:
---

 Summary: Options for memory mgmt: Reserve allowance for 
non-buffered, and Hash-Join default to not fallback   
 Key: DRILL-6543
 URL: https://issues.apache.org/jira/browse/DRILL-6543
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Relational Operators
Affects Versions: 1.13.0
Reporter: Boaz Ben-Zvi
Assignee: Boaz Ben-Zvi
 Fix For: 1.14.0


Changes to options related to memory budgeting:

(1) Change the default for "drill.exec.hashjoin.fallback.enabled" to *false* 
(same as for the similar Hash-Agg option). This would force users to calculate 
and assign sufficient memory for the query, or explicitly choose to fall back.

(2) When the "planner.memory.max_query_memory_per_node" (MQMPN) option is set 
equal (or "nearly equal") to the allocated *Direct Memory*, an OOM is still 
possible. The reason is that the memory used by the "non-buffered" operators is 
not taken into account.

For example, MQMPN == Direct-Memory == 100 MB. Run a query with 5 buffered 
operators (e.g., 5 instances of a Hash-Join), so each gets "promised" 20 MB. 
When other non-buffered operators (e.g., a Scanner or a Sender) also grab some 
of the Direct Memory, less than 100 MB is left available. And if all those 5 
Hash-Joins are pushing their limits, one HJ may have allocated only 12 MB so 
far, but on the next 1 MB allocation it will hit an OOM (from the JVM, as all 
100 MB of Direct Memory is already used).

A solution -- a new option to _*reserve*_ some of the Direct Memory for those 
non-buffered operators (e.g., 25% by default). This *allowance* may prevent many 
of the cases like the example above. The new option would return an error (when 
a query initiates) if the MQMPN is set too high. Note that this option +can not+ 
address concurrent queries.

This should also apply to the alternative to the MQMPN - the 
{{"planner.memory.percent_per_query"}} option (PPQ). The PPQ does not 
_*reserve*_ such memory (e.g., it can be set to 100%); only its documentation 
clearly explains this issue (that doc suggests reserving a 50% allowance, as it 
was written when the Hash-Join was non-buffered, i.e., before spill was 
implemented).

The memory given to the buffered operators is the higher of the MQMPN and PPQ 
calculations. The new reserve option would verify that this figure still leaves 
room for the allowance.
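
A minimal sketch of the proposed check, under stated assumptions: the 25% default 
and the 100 MB figures come from the example above, the class and method names are 
illustrative only, and the buffered-operator budget stands for the higher of the 
MQMPN and PPQ figures.

{code:java}
// Illustrative sketch; not the actual Drill option handling.
class MemoryAllowanceCheck {
  static final double DEFAULT_RESERVE_FRACTION = 0.25;   // 25% kept for non-buffered operators

  // Fails query setup when the buffered-operator budget would not leave the
  // reserved allowance inside the direct-memory limit.
  static void validate(long directMemoryBytes, long bufferedBudgetBytes, double reserveFraction) {
    long allowed = (long) (directMemoryBytes * (1.0 - reserveFraction));
    if (bufferedBudgetBytes > allowed) {
      throw new IllegalStateException(String.format(
          "Buffered-operator budget %,d exceeds %,d (direct memory %,d minus %.0f%% allowance)",
          bufferedBudgetBytes, allowed, directMemoryBytes, reserveFraction * 100));
    }
  }

  public static void main(String[] args) {
    long directMemory = 100L * 1024 * 1024;   // 100 MB, as in the example above
    try {
      validate(directMemory, directMemory, DEFAULT_RESERVE_FRACTION);  // MQMPN == Direct Memory
    } catch (IllegalStateException e) {
      System.out.println("Rejected at query start: " + e.getMessage());
    }
  }
}
{code}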

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6542) IndexOutOfBoundsException for multilevel lateral queries with schema changed partitioned complex data

2018-06-26 Thread Kedar Sankar Behera (JIRA)
Kedar Sankar Behera created DRILL-6542:
--

 Summary: IndexOutOfBoundsException for multilevel lateral queries 
with schema changed partitioned complex data
 Key: DRILL-6542
 URL: https://issues.apache.org/jira/browse/DRILL-6542
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.14.0
Reporter: Kedar Sankar Behera
 Fix For: 1.14.0


IndexOutOfBoundsException for multilevel lateral queries with schema changed 
partitioned complex data
Query:
{code}
select customer.c_custkey, customer.c_name, orders.orderkey, orders.totalprice, 
olineitems.l_partkey, olineitems.l_linenumber, olineitems.l_quantity from 
customer, 
lateral (select t1.o.o_orderkey as orderkey, t1.o.o_totalprice as totalprice, 
t1.o.o_lineitems as lineitems from unnest(customer.c_orders) t1(o)) orders, 
lateral (select t2.l.l_partkey as l_partkey, t2.l.l_linenumber as l_linenumber, 
t2.l.l_quantity as l_quantity from unnest(orders.lineitems) t2(l)) olineitems 
order by customer.c_custkey, orders.orderkey, orders.totalprice, 
olineitems.l_partkey, olineitems.l_linenumber, olineitems.l_quantity limit 50;
{code}

Error:
{code}
[Error Id: 7427fa7e-af4a-4f11-acd9-ced71848a1ed on drill182:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
IndexOutOfBoundsException: writerIndex: 1 (expected: readerIndex(0) <= 
writerIndex <= capacity(0))

Fragment 0:0

[Error Id: 7427fa7e-af4a-4f11-acd9-ced71848a1ed on drill182:31010]
 at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633)
 ~[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:361)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:216)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:327)
 [drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) 
[na:1.8.0_161]
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) 
[na:1.8.0_161]
 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Caused by: java.lang.IndexOutOfBoundsException: writerIndex: 1 (expected: 
readerIndex(0) <= writerIndex <= capacity(0))
 at io.netty.buffer.AbstractByteBuf.writerIndex(AbstractByteBuf.java:104) 
~[netty-buffer-4.0.48.Final.jar:4.0.48.Final]
 at 
org.apache.drill.exec.vector.UInt1Vector.splitAndTransferTo(UInt1Vector.java:329)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.vector.NullableBigIntVector.splitAndTransferTo(NullableBigIntVector.java:312)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.vector.NullableBigIntVector$TransferImpl.splitAndTransfer(NullableBigIntVector.java:339)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.vector.complex.RepeatedMapVector$SingleMapTransferPair.splitAndTransfer(RepeatedMapVector.java:298)
 ~[vector-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.physical.impl.unnest.UnnestImpl.unnestRecords(UnnestImpl.java:101)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.physical.impl.unnest.UnnestRecordBatch.doWork(UnnestRecordBatch.java:283)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.physical.impl.unnest.UnnestRecordBatch.innerNext(UnnestRecordBatch.java:236)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:137)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:172)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 
org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119)
 ~[drill-java-exec-1.14.0-SNAPSHOT.jar:1.14.0-SNAPSHOT]
 at 

[jira] [Updated] (DRILL-6539) Record count not set for this vector container error

2018-06-26 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6539:
-
Reviewer: Timothy Farkas

> Record count not set for this vector container error 
> -
>
> Key: DRILL-6539
> URL: https://issues.apache.org/jira/browse/DRILL-6539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
>
> This error is randomly seen when executing queries.
> [Error Id: 6a2a49e5-28d9-4587-ab8b-5262c07f8fdc on drill196:31010]
>   (java.lang.IllegalStateException) Record count not set for this vector 
> container
> com.google.common.base.Preconditions.checkState():173
> org.apache.drill.exec.record.VectorContainer.getRecordCount():394
> org.apache.drill.exec.record.RecordBatchSizer.<init>():681
> org.apache.drill.exec.record.RecordBatchSizer.<init>():665
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():441
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():882
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():891
> 
> org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():578
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():937
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():754
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():335
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():403
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():354
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():299
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():281
> org.apache.drill.common.SelfCleaningRunnable.run():38
> 

[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed

2018-06-26 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524094#comment-16524094
 ] 

Khurram Faraaz commented on DRILL-6453:
---

[~priteshm] 

Results of the other test on Apache Drill 1.14.0 commit b92f599, with the two 
options set to the values below:

alter system set `planner.memory.max_query_memory_per_node` = 8589934592;
alter system set `drill.exec.hashagg.fallback.enabled` = true;

And in drill-env.sh we set the direct memory to:

export DRILL_MAX_DIRECT_MEMORY=${DRILL_MAX_DIRECT_MEMORY:-"12G"}


TPC-DS query 72 is reported to be in the Canceled state on the UI after having 
run for 2 hr 11 min 21.958 sec.

The last Exception we see in the drillbit.log is this 
 
IllegalStateException: Record count not set for this vector container

> TPC-DS query 72 has regressed
> -
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where it 
> was canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On, Drill 1.14.0-SNAPSHOT 
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took 
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not 
> VARCHAR type
> On, Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete, I had to 
> Cancel it by stopping the Foreman drillbit.
> As a result several minor fragments are reported to be in 
> CANCELLATION_REQUESTED state on UI.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6539) Record count not set for this vector container error

2018-06-26 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16524076#comment-16524076
 ] 

Khurram Faraaz commented on DRILL-6539:
---

[~sachouche] you should try to run the test on a 4-node cluster to hit the 
issue; since you ran the test on a single node (Mac), you may not hit it. I can 
share test cluster details for you to try and repro the issue.

> Record count not set for this vector container error 
> -
>
> Key: DRILL-6539
> URL: https://issues.apache.org/jira/browse/DRILL-6539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
>
> This error is randomly seen when executing queries.
> [Error Id: 6a2a49e5-28d9-4587-ab8b-5262c07f8fdc on drill196:31010]
>   (java.lang.IllegalStateException) Record count not set for this vector 
> container
> com.google.common.base.Preconditions.checkState():173
> org.apache.drill.exec.record.VectorContainer.getRecordCount():394
> org.apache.drill.exec.record.RecordBatchSizer.<init>():681
> org.apache.drill.exec.record.RecordBatchSizer.<init>():665
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():441
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():882
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():891
> 
> org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():578
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():937
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():754
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():335
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():403
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():354
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():299
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> 

[jira] [Commented] (DRILL-6539) Record count not set for this vector container error

2018-06-26 Thread salim achouche (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523959#comment-16523959
 ] 

salim achouche commented on DRILL-6539:
---

I have been trying to reproduce this issue on my Mac OS, but Khurram's TPC-DS 
test succeeded. [~ppenumarthy] Do you have another repro case?

> Record count not set for this vector container error 
> -
>
> Key: DRILL-6539
> URL: https://issues.apache.org/jira/browse/DRILL-6539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
>
> This error is randomly seen when executing queries.
> [Error Id: 6a2a49e5-28d9-4587-ab8b-5262c07f8fdc on drill196:31010]
>   (java.lang.IllegalStateException) Record count not set for this vector 
> container
> com.google.common.base.Preconditions.checkState():173
> org.apache.drill.exec.record.VectorContainer.getRecordCount():394
> org.apache.drill.exec.record.RecordBatchSizer.<init>():681
> org.apache.drill.exec.record.RecordBatchSizer.<init>():665
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():441
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():882
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():891
> 
> org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():578
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():937
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():754
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():335
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():403
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():354
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():299
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1595
> 

[jira] [Updated] (DRILL-6465) Transitive closure is not working in Drill for Join with multiple local conditions

2018-06-26 Thread Pritesh Maker (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pritesh Maker updated DRILL-6465:
-
Fix Version/s: (was: 1.14.0)
   1.15.0

> Transitive closure is not working in Drill for Join with multiple local 
> conditions
> --
>
> Key: DRILL-6465
> URL: https://issues.apache.org/jira/browse/DRILL-6465
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.15.0
>
> Attachments: drill.zip
>
>
> For several SQL operators, Transitive closure is not working during Partition 
> Pruning and Filter Pushdown for the left table in a Join.
>  If I use several local conditions, then Drill scans the full left table in the Join.
>  But if we move the additional conditions to the WHERE statement, then Transitive 
> closure works fine for all joined tables.
> *Query BETWEEN:*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y BETWEEN 1987 AND 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=8, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2])]{code}
> *Actual result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=16, partitions= [Partition(values:[1987, 5, 
> 1]), Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2]), Partition(values:[1990, 4, 1]), 
> Partition(values:[1990, 4, 2]), Partition(values:[1990, 5, 1]), 
> Partition(values:[1990, 5, 2]), Partition(values:[1991, 3, 1]), 
> Partition(values:[1991, 3, 2]), Partition(values:[1991, 3, 3]), 
> Partition(values:[1991, 3, 4])
> ]
> {code}
> *There is the same Transitive closure behavior for these logical operators:*
>  * NOT IN
>  * LIKE
>  * NOT LIKE
> Also Transitive closure is not working during Partition Pruning and Filter 
> Pushdown for these comparison operators:
> *Query <*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y < 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=4, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2])]{code}
> *Actual result:*
> {code:java}
> 00-00 Screen
> 00-01 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-02 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-03 HashJoin(condition=[=($1, $6)], joinType=[inner])
> 00-05 Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:h_tab1), columns=[`**`], numPartitions=16, partitions= 
> [Partition(values:[1987, 5, 1]), Partition(values:[1987, 5, 2]), 
> Partition(values:[1987, 7, 1]), Partition(values:[1987, 7, 2]), 
> Partition(values:[1988, 11, 1]), Partition(values:[1988, 11, 2]), 
> Partition(values:[1988, 12, 1]), Partition(values:[1988, 12, 2]), 
> Partition(values:[1990, 4, 1]), Partition(values:[1990, 4, 2]), 
> Partition(values:[1990, 5, 1]), Partition(values:[1990, 5, 2]), 
> Partition(values:[1991, 3, 1]), Partition(values:[1991, 3, 2]), 
> Partition(values:[1991, 3, 3]), Partition(values:[1991, 3, 4])], 
> inputDirectories=[maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/1, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/2, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/3, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/4, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/5, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/6, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/7, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/8, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/9, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/10, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/11, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/12, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/13, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/14, 
> 

[jira] [Commented] (DRILL-6465) Transitive closure is not working in Drill for Join with multiple local conditions

2018-06-26 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523939#comment-16523939
 ] 

Pritesh Maker commented on DRILL-6465:
--

Ok, let's wait for the next Calcite upgrade. I'll mark the issue for 1.15 
release.

> Transitive closure is not working in Drill for Join with multiple local 
> conditions
> --
>
> Key: DRILL-6465
> URL: https://issues.apache.org/jira/browse/DRILL-6465
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Denys Ordynskiy
>Assignee: Vitalii Diravka
>Priority: Major
> Fix For: 1.14.0
>
> Attachments: drill.zip
>
>
> For several SQL operators, Transitive closure is not working during Partition 
> Pruning and Filter Pushdown for the left table in a Join.
>  If I use several local conditions, then Drill scans the full left table in the Join.
>  But if we move the additional conditions to the WHERE statement, then Transitive 
> closure works fine for all joined tables.
> *Query BETWEEN:*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y BETWEEN 1987 AND 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=8, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2])]{code}
> *Actual result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=16, partitions= [Partition(values:[1987, 5, 
> 1]), Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2]), Partition(values:[1988, 11, 1]), 
> Partition(values:[1988, 11, 2]), Partition(values:[1988, 12, 1]), 
> Partition(values:[1988, 12, 2]), Partition(values:[1990, 4, 1]), 
> Partition(values:[1990, 4, 2]), Partition(values:[1990, 5, 1]), 
> Partition(values:[1990, 5, 2]), Partition(values:[1991, 3, 1]), 
> Partition(values:[1991, 3, 2]), Partition(values:[1991, 3, 3]), 
> Partition(values:[1991, 3, 4])
> ]
> {code}
> *There is the same Transitive closure behavior for these logical operators:*
>  * NOT IN
>  * LIKE
>  * NOT LIKE
> Also Transitive closure is not working during Partition Pruning and Filter 
> Pushdown for these comparison operators:
> *Query <*
> {code:java}
> EXPLAIN PLAN FOR
> SELECT * FROM hive.`h_tab1` t1
> JOIN hive.`h_tab2` t2
> ON t1.y=t2.y
> AND t2.y < 1988;
> {code}
> *Expected result:*
> {code:java}
> Scan(groupscan=[HiveScan [table=Table(dbName:default, tableName:h_tab1), 
> columns=[`**`], numPartitions=4, partitions= [Partition(values:[1987, 5, 1]), 
> Partition(values:[1987, 5, 2]), Partition(values:[1987, 7, 1]), 
> Partition(values:[1987, 7, 2])]{code}
> *Actual result:*
> {code:java}
> 00-00 Screen
> 00-01 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-02 Project(itm=[$0], y=[$1], m=[$2], category=[$3], itm0=[$4], 
> category0=[$5], y0=[$6], m0=[$7])
> 00-03 HashJoin(condition=[=($1, $6)], joinType=[inner])
> 00-05 Scan(groupscan=[HiveScan [table=Table(dbName:default, 
> tableName:h_tab1), columns=[`**`], numPartitions=16, partitions= 
> [Partition(values:[1987, 5, 1]), Partition(values:[1987, 5, 2]), 
> Partition(values:[1987, 7, 1]), Partition(values:[1987, 7, 2]), 
> Partition(values:[1988, 11, 1]), Partition(values:[1988, 11, 2]), 
> Partition(values:[1988, 12, 1]), Partition(values:[1988, 12, 2]), 
> Partition(values:[1990, 4, 1]), Partition(values:[1990, 4, 2]), 
> Partition(values:[1990, 5, 1]), Partition(values:[1990, 5, 2]), 
> Partition(values:[1991, 3, 1]), Partition(values:[1991, 3, 2]), 
> Partition(values:[1991, 3, 3]), Partition(values:[1991, 3, 4])], 
> inputDirectories=[maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/1, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/2, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/3, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/4, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/5, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/6, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/7, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/8, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/9, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/10, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/11, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/12, 
> maprfs:/drill/testdata/ctas/parquet/DRILL_6173/tab1/13, 
> 

[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed

2018-06-26 Thread Pritesh Maker (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523934#comment-16523934
 ] 

Pritesh Maker commented on DRILL-6453:
--

Thanks [~khfaraaz] - what was the result of the second test?

> TPC-DS query 72 has regressed
> -
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where it 
> was canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On, Drill 1.14.0-SNAPSHOT 
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took 
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not 
> VARCHAR type
> On, Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete, I had to 
> Cancel it by stopping the Foreman drillbit.
> As a result several minor fragments are reported to be in 
> CANCELLATION_REQUESTED state on UI.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6538) MongoDb Collection are accessible only after certain time

2018-06-26 Thread Aniket (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aniket updated DRILL-6538:
--
Component/s: Storage - MongoDB

> MongoDb Collection are accessible only after certain time
> -
>
> Key: DRILL-6538
> URL: https://issues.apache.org/jira/browse/DRILL-6538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill, Storage - MongoDB
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Aniket
>Priority: Major
>
> After creating a new collection in Mongo, if a Drill query on that collection 
> is fired immediately, an error is returned. Please note that there is a 
> compound index created in Mongo.
> *org.apache.calcite.runtime.CalciteContextException: From line 1, column 433 
> to line 1, column 437: Object  not found within 
> 'mongo.'*
>  After 60 seconds the same query works properly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6538) MongoDb Collection are accessible only after certain time

2018-06-26 Thread Aniket (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523437#comment-16523437
 ] 

Aniket edited comment on DRILL-6538 at 6/26/18 3:30 PM:


I saw the code in *MongoSchemaFactory.java*

Path: 
storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

 

{code:java}
databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());
{code}

I think this is causing the 1-minute delay. 

Is there any way we can override this?
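
A minimal sketch of one possible way around it, under assumptions: the 
configurable expiry and the refreshDatabase helper below are illustrative only, 
not an existing Drill option or API.

{code:java}
import java.util.List;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// Sketch: make the schema-cache expiry a parameter instead of the hard-coded
// one minute. DatabaseLoader in MongoSchemaFactory would play the loader role.
class ConfigurableMongoSchemaCache {
  private final LoadingCache<String, List<String>> databases;

  ConfigurableMongoSchemaCache(long expirySeconds, CacheLoader<String, List<String>> loader) {
    this.databases = CacheBuilder.newBuilder()
        .expireAfterAccess(expirySeconds, TimeUnit.SECONDS)
        .build(loader);
  }

  // A newly created collection could also be picked up on demand by dropping
  // the cached entry for its database before the next query.
  void refreshDatabase(String dbName) {
    databases.invalidate(dbName);
  }
}
{code}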


was (Author: aniketamrutkar):
I saw in the code  in *MongoSchemaFactory.java*

path : 
storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema]/MongoSchemaFactory.java

 

databases = CacheBuilder //
.newBuilder() //
.expireAfterAccess(1, TimeUnit.MINUTES) //
.build(new DatabaseLoader());
 
tableNameLoader = CacheBuilder //
.newBuilder() //
.expireAfterAccess(1, TimeUnit.MINUTES) //
.build(new TableNameLoader()); 

I think this is causing the 1-minute delay. 

Is there any way we can overwrite this ?

> MongoDb Collection are accessible only after certain time
> -
>
> Key: DRILL-6538
> URL: https://issues.apache.org/jira/browse/DRILL-6538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Aniket
>Priority: Major
>
> After creating a new collection in Mongo, if a Drill query on that collection 
> is fired immediately, an error is returned. Please note that there is a 
> compound index created in Mongo.
> *org.apache.calcite.runtime.CalciteContextException: From line 1, column 433 
> to line 1, column 437: Object  not found within 
> 'mongo.'*
>  After 60 seconds the same query works properly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522622#comment-16522622
 ] 

Vitalii Diravka edited comment on DRILL-6534 at 6/26/18 2:53 PM:
-

Currently Drill uses ZooKeeper 3.4.6 and curator-framework 2.7.1.
 The latest versions in Maven are 
[3.4.12|https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.4.12]
 and 
[4.0.1|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/4.0.1]
 
([2.12.0|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/2.12.0]
 for the second major version).
 But these dependencies are pulled in transitively from "hadoop-common". Do we 
need to update "hadoop-common" first?


was (Author: vitalii):
Currently Drill uses 3.4.6 version of zookeeper and 2.7.1 version of 
curator-framework.
The last versions in maven are: 
[3.4.12|https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.4.12]
 and 
[4.0.1|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/4.0.1]
 
([2.12.0|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/2.12.0]
 for 2 major version).
But these dependencies are leveraged as transitive from "hadoop-common". Do we 
need to update "hadoop-common" at first?

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6532) Support of Archive.org as an S3 Provider

2018-06-26 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523518#comment-16523518
 ] 

Vitalii Diravka commented on DRILL-6532:


Thanks [~h4ck3rm1k3].
Please try to configure Drill's S3 plugin for Archive.org.
Just FYI, in case it is necessary for you, I have created a separate Jira for 
upgrading the Hadoop libraries.

> Support of Archive.org as an S3 Provider
> 
>
> Key: DRILL-6532
> URL: https://issues.apache.org/jira/browse/DRILL-6532
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: james michael dupont
>Priority: Minor
>
> To support archive.org using the s3 protocol



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6541) Upgrade ZooKeeper patch version to 3.4.11 for mapr profile

2018-06-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6541:
--
Reviewer: Vitalii Diravka

> Upgrade ZooKeeper patch version to 3.4.11 for mapr profile
> --
>
> Key: DRILL-6541
> URL: https://issues.apache.org/jira/browse/DRILL-6541
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6541) Upgrade ZooKeeper patch version to 3.4.11 for mapr profile

2018-06-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6541:
--
Fix Version/s: 1.15.0

> Upgrade ZooKeeper patch version to 3.4.11 for mapr profile
> --
>
> Key: DRILL-6541
> URL: https://issues.apache.org/jira/browse/DRILL-6541
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6541) Upgrade ZooKeeper patch version to 3.4.11 for mapr profile

2018-06-26 Thread Bohdan Kazydub (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6541:
--
Affects Version/s: 1.13.0

> Upgrade ZooKeeper patch version to 3.4.11 for mapr profile
> --
>
> Key: DRILL-6541
> URL: https://issues.apache.org/jira/browse/DRILL-6541
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.13.0
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.15.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6541) Upgrade ZooKeeper patch version to 3.4.11 for mapr profile

2018-06-26 Thread Bohdan Kazydub (JIRA)
Bohdan Kazydub created DRILL-6541:
-

 Summary: Upgrade ZooKeeper patch version to 3.4.11 for mapr profile
 Key: DRILL-6541
 URL: https://issues.apache.org/jira/browse/DRILL-6541
 Project: Apache Drill
  Issue Type: Task
Reporter: Bohdan Kazydub
Assignee: Bohdan Kazydub






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522622#comment-16522622
 ] 

Vitalii Diravka edited comment on DRILL-6534 at 6/26/18 10:00 AM:
--

Currently Drill uses 3.4.6 version of zookeeper and 2.7.1 version of 
curator-framework.
The last versions in maven are: 
[3.4.12|https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.4.12]
 and 
[4.0.1|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/4.0.1]
 
([2.12.0|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/2.12.0]
 for 2 major version).
But these dependencies are leveraged as transitive from "hadoop-common". Do we 
need to update "hadoop-common" at first?


was (Author: vitalii):
Currently Drill uses 3.4.6 version of zookeeper and 2.7.1 version of 
curator-framework.
The last versions in maven are: 
[3.4.12|https://mvnrepository.com/artifact/org.apache.zookeeper/zookeeper/3.4.12]
 and 
[4.0.1|https://mvnrepository.com/artifact/org.apache.curator/curator-framework/4.0.1]
 
([2.12.0|[https://mvnrepository.com/artifact/org.apache.curator/curator-framework/2.12.0]]
 for 2 major version).
But these dependencies are leveraged as transitive from "hadoop-common". Do we 
need to update "hadoop-common" at first?

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Vitalii Diravka (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523511#comment-16523511
 ] 

Vitalii Diravka commented on DRILL-6534:


[~arina] It makes sense. Done.
[DRILL-6540: Upgrade to HADOOP-3.1 
libraries|https://issues.apache.org/jira/browse/DRILL-6540].

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-6540) Upgrade to HADOOP-3.1 libraries

2018-06-26 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-6540:
--

 Summary: Upgrade to HADOOP-3.1 libraries 
 Key: DRILL-6540
 URL: https://issues.apache.org/jira/browse/DRILL-6540
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Vitalii Diravka


Currently Drill uses version 2.7.1 of the Hadoop libraries (hadoop-common, 
hadoop-hdfs, hadoop-annotations, hadoop-aws, hadoop-yarn-api, hadoop-client, 
hadoop-yarn-client).
Half a year ago [Hadoop 3.0|https://hadoop.apache.org/docs/r3.0.0/index.html] 
was released, and it was recently updated to 
[Hadoop 3.1|https://hadoop.apache.org/docs/r3.1.0/].

To run Drill on a Hadoop 3.0 distribution we need this upgrade. The newer 
version also includes new features that can be useful for Drill.
This upgrade is also needed to leverage the newest version of the ZooKeeper 
libraries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523448#comment-16523448
 ] 

Arina Ielchiieva edited comment on DRILL-6534 at 6/26/18 9:12 AM:
--

I would say that should be done apart from this Jira. [~vitalii], could you 
please create a Jira for the upgrade?


was (Author: arina):
I would say that should be done part of this Jira, [~vitalii] could you please 
create Jira for the upgrade?

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6534:

Reviewer: Vitalii Diravka  (was: Arina Ielchiieva)

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6534) Upgrade ZooKeeper patch version to 3.4.11

2018-06-26 Thread Arina Ielchiieva (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523448#comment-16523448
 ] 

Arina Ielchiieva commented on DRILL-6534:
-

I would say that should be done apart from this Jira. [~vitalii], could you 
please create a Jira for the upgrade?

> Upgrade ZooKeeper patch version to 3.4.11
> -
>
> Key: DRILL-6534
> URL: https://issues.apache.org/jira/browse/DRILL-6534
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Bohdan Kazydub
>Assignee: Bohdan Kazydub
>Priority: Minor
> Fix For: 1.14.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6538) MongoDb Collection are accessible only after certain time

2018-06-26 Thread Aniket (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523437#comment-16523437
 ] 

Aniket edited comment on DRILL-6538 at 6/26/18 8:52 AM:


I looked at the code in *MongoSchemaFactory.java*

path: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());

I think this hard-coded one-minute expireAfterAccess is causing the 1-minute delay.

Is there any way we can override this?
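
For reference, below is a minimal sketch (not Drill's actual code) of how the same Guava cache could be built with a configurable expiry instead of the hard-coded one minute. The class name, constructor parameter, and loader body are assumptions made purely for illustration; "schemaCacheTtlSeconds" is not an existing Drill option.

{noformat}
import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;

import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;

// Sketch only: a schema cache whose TTL is passed in rather than fixed at 1 minute.
class ConfigurableSchemaCacheSketch {
  private final LoadingCache<String, List<String>> databases;

  ConfigurableSchemaCacheSketch(long schemaCacheTtlSeconds) {
    this.databases = CacheBuilder.newBuilder()
        .expireAfterAccess(schemaCacheTtlSeconds, TimeUnit.SECONDS)
        .build(new CacheLoader<String, List<String>>() {
          @Override
          public List<String> load(String dbName) {
            // placeholder for the real lookup (e.g. listing database names from Mongo)
            return Collections.emptyList();
          }
        });
  }

  List<String> getDatabases(String dbName) {
    return databases.getUnchecked(dbName);
  }
}
{noformat}

With such a knob, raising the TTL would reduce refresh overhead, while lowering it (or exposing an explicit invalidation) would make newly created collections visible sooner.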


was (Author: aniketamrutkar):
I looked at the code in *MongoSchemaFactory.java*

path: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());

I think this hard-coded one-minute expireAfterAccess is causing the 1-minute delay.

Is there any way we can override this?

> MongoDb Collection are accessible only after certain time
> -
>
> Key: DRILL-6538
> URL: https://issues.apache.org/jira/browse/DRILL-6538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Aniket
>Priority: Major
>
> After creating a new collection in Mongo, if a Drill query is fired on that 
> collection immediately, an error is returned. Please note that a compound 
> index is created in Mongo.
> *org.apache.calcite.runtime.CalciteContextException: From line 1, column 433 
> to line 1, column 437: Object  not found within 
> 'mongo.'*
> After 60 seconds the same query works properly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6538) MongoDb Collection are accessible only after certain time

2018-06-26 Thread Aniket (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523437#comment-16523437
 ] 

Aniket commented on DRILL-6538:
---

I looked at the code in *MongoSchemaFactory.java*

path: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());

I think this hard-coded one-minute expireAfterAccess is causing the 1-minute delay.

Is there any way we can override this?

> MongoDb Collection are accessible only after certain time
> -
>
> Key: DRILL-6538
> URL: https://issues.apache.org/jira/browse/DRILL-6538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Aniket
>Priority: Major
>
> After creating a new collection in Mongo, if a Drill query is fired on that 
> collection immediately, an error is returned. Please note that a compound 
> index is created in Mongo.
> *org.apache.calcite.runtime.CalciteContextException: From line 1, column 433 
> to line 1, column 437: Object  not found within 
> 'mongo.'*
> After 60 seconds the same query works properly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (DRILL-6538) MongoDb Collection are accessible only after certain time

2018-06-26 Thread Aniket (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6538?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523437#comment-16523437
 ] 

Aniket edited comment on DRILL-6538 at 6/26/18 8:53 AM:


I looked at the code in *MongoSchemaFactory.java*

path: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());

I think this hard-coded one-minute expireAfterAccess is causing the 1-minute delay.

Is there any way we can override this?


was (Author: aniketamrutkar):
I looked at the code in *MongoSchemaFactory.java*

path: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/schema/MongoSchemaFactory.java

databases = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new DatabaseLoader());

tableNameLoader = CacheBuilder //
    .newBuilder() //
    .expireAfterAccess(1, TimeUnit.MINUTES) //
    .build(new TableNameLoader());

I think this hard-coded one-minute expireAfterAccess is causing the 1-minute delay.

Is there any way we can override this?

> MongoDb Collection are accessible only after certain time
> -
>
> Key: DRILL-6538
> URL: https://issues.apache.org/jira/browse/DRILL-6538
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Functions - Drill
>Affects Versions: 1.12.0, 1.13.0
>Reporter: Aniket
>Priority: Major
>
> After creating a new collection in Mongo, if a Drill query is fired on that 
> collection immediately, an error is returned. Please note that a compound 
> index is created in Mongo.
> *org.apache.calcite.runtime.CalciteContextException: From line 1, column 433 
> to line 1, column 437: Object  not found within 
> 'mongo.'*
> After 60 seconds the same query works properly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6453) TPC-DS query 72 has regressed

2018-06-26 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523255#comment-16523255
 ] 

Khurram Faraaz commented on DRILL-6453:
---

[~priteshm] On the latest Apache master (commit b92f599), with DEFAULT values for 
system options and the default DRILL_MAX_DIRECT_MEMORY, we see that the query ends 
up in the Canceled state after running for 2 hr 11 min.

As part of the other test, we will increase DRILL_MAX_DIRECT_MEMORY to 12G, also 
set the options below, and share the test results.

alter system set `planner.memory.max_query_memory_per_node` = 10737418240;

alter system set `drill.exec.hashagg.fallback.enabled` = true;

> TPC-DS query 72 has regressed
> -
>
> Key: DRILL-6453
> URL: https://issues.apache.org/jira/browse/DRILL-6453
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow
>Affects Versions: 1.14.0
>Reporter: Khurram Faraaz
>Assignee: Boaz Ben-Zvi
>Priority: Blocker
> Fix For: 1.14.0
>
> Attachments: 24f75b18-014a-fb58-21d2-baeab5c3352c.sys.drill
>
>
> TPC-DS query 72 seems to have regressed; the query profile for the case where it 
> was Canceled after 2 hours on Drill 1.14.0 is attached here.
> {noformat}
> On, Drill 1.14.0-SNAPSHOT 
> commit : 931b43e (TPC-DS query 72 executed successfully on this commit, took 
> around 55 seconds to execute)
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> TPC-DS query 72 executed successfully & took 47 seconds to complete execution.
> {noformat}
> {noformat}
> TPC-DS data in the below run has date values stored as DATE datatype and not 
> VARCHAR type
> On, Drill 1.14.0-SNAPSHOT
> commit : 82e1a12
> SF1 parquet data on 4 nodes; 
> planner.memory.max_query_memory_per_node = 10737418240. 
> drill.exec.hashagg.fallback.enabled = true
> and
> alter system set `exec.hashjoin.num_partitions` = 1;
> TPC-DS query 72 executed for 2 hrs and 11 mins and did not complete; I had to 
> Cancel it by stopping the Foreman drillbit.
> As a result, several minor fragments are reported to be in 
> CANCELLATION_REQUESTED state in the UI.
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6539) Record count not set for this vector container error

2018-06-26 Thread Khurram Faraaz (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16523247#comment-16523247
 ] 

Khurram Faraaz commented on DRILL-6539:
---

This issue is the same as https://issues.apache.org/jira/browse/DRILL-6517

We can consistently reproduce this error when TPC-DS query 72 is executed on the 
latest Apache master 1.14.0 (commit b92f599) with DEFAULT values for system 
options.
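
For context on the stack trace quoted below: the exception is raised by a Guava Preconditions.checkState guard inside org.apache.drill.exec.record.VectorContainer.getRecordCount(). Below is a minimal sketch of that guard pattern (not Drill's actual class; the field name and the -1 sentinel are assumptions for illustration):

{noformat}
import com.google.common.base.Preconditions;

// Sketch only: a container that remembers whether its record count was ever set.
// Reading the count before any operator has assigned it trips the precondition
// and produces the IllegalStateException seen in the stack trace below.
class VectorContainerSketch {
  private int recordCount = -1;   // -1 used here as an assumed "not set" sentinel

  void setRecordCount(int count) {
    this.recordCount = count;
  }

  int getRecordCount() {
    Preconditions.checkState(recordCount != -1,
        "Record count not set for this vector container");
    return recordCount;
  }
}
{noformat}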

> Record count not set for this vector container error 
> -
>
> Key: DRILL-6539
> URL: https://issues.apache.org/jira/browse/DRILL-6539
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Affects Versions: 1.13.0
>Reporter: Padma Penumarthy
>Assignee: Padma Penumarthy
>Priority: Major
> Fix For: 1.14.0
>
>
> This error is randomly seen when executing queries.
> [Error Id: 6a2a49e5-28d9-4587-ab8b-5262c07f8fdc on drill196:31010]
>   (java.lang.IllegalStateException) Record count not set for this vector 
> container
> com.google.common.base.Preconditions.checkState():173
> org.apache.drill.exec.record.VectorContainer.getRecordCount():394
> org.apache.drill.exec.record.RecordBatchSizer.():681
> org.apache.drill.exec.record.RecordBatchSizer.():665
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate$BatchHolder.getActualSize():441
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.getActualSize():882
> 
> org.apache.drill.exec.physical.impl.common.HashTableTemplate.makeDebugString():891
> 
> org.apache.drill.exec.physical.impl.common.HashPartition.makeDebugString():578
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.makeDebugString():937
> 
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase():754
> org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext():335
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.test.generated.HashAggregatorGen89497.doWork():617
> org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext():176
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.loadBatch():403
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.load():354
> 
> org.apache.drill.exec.physical.impl.xsort.managed.ExternalSortBatch.innerNext():299
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():109
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():63
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():137
> org.apache.drill.exec.record.AbstractRecordBatch.next():172
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():83
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():294
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():281
> java.security.AccessController.doPrivileged():-2
>