[GitHub] drill pull request #1009: DRILL-5905: Exclude jdk-tools from project depende...

2017-10-24 Thread vrozov
GitHub user vrozov opened a pull request:

https://github.com/apache/drill/pull/1009

DRILL-5905: Exclude jdk-tools from project dependencies



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/drill DRILL-5905

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1009.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1009


commit 27b1deb03dbf6697715a9f368512f73b7b4e59c8
Author: Vlad Rozov 
Date:   2017-10-25T02:10:37Z

DRILL-5905: Exclude jdk-tools from project dependencies




---


[jira] [Created] (DRILL-5905) Exclude jdk-tools from project dependencies

2017-10-24 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-5905:
-

 Summary: Exclude jdk-tools from project dependencies
 Key: DRILL-5905
 URL: https://issues.apache.org/jira/browse/DRILL-5905
 Project: Apache Drill
  Issue Type: Improvement
  Components: Tools, Build & Test
Reporter: Vlad Rozov
Assignee: Vlad Rozov
Priority: Minor


hadoop-annotations and hbase-annotations have a system-scope dependency on the JDK's 
tools.jar. This dependency is provided by the JDK itself and should be excluded from 
the project dependencies.
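
For illustration, a hedged sketch of the kind of pom.xml exclusion this describes 
(the exact changes in the PR may differ; hadoop-annotations declares tools.jar 
under the jdk.tools coordinates with system scope):

{code}
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-annotations</artifactId>
  <exclusions>
    <!-- tools.jar is provided by the JDK itself; don't pull it in transitively -->
    <exclusion>
      <groupId>jdk.tools</groupId>
      <artifactId>jdk.tools</artifactId>
    </exclusion>
  </exclusions>
</dependency>
{code}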



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1001: JIRA DRILL-5879: Like operator performance improve...

2017-10-24 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1001#discussion_r146708658
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java
 ---
@@ -17,37 +17,166 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
-public class SqlPatternContainsMatcher implements SqlPatternMatcher {
+public final class SqlPatternContainsMatcher implements SqlPatternMatcher {
   final String patternString;
   CharSequence charSequenceWrapper;
   final int patternLength;
+  final MatcherFcn matcherFcn;
 
   public SqlPatternContainsMatcher(String patternString, CharSequence 
charSequenceWrapper) {
-this.patternString = patternString;
+this.patternString   = patternString;
 this.charSequenceWrapper = charSequenceWrapper;
-patternLength = patternString.length();
+patternLength= patternString.length();
+
+// The idea is to write loops with simple condition checks to allow 
the Java Hotspot achieve
+// better optimizations (especially vectorization)
+if (patternLength == 1) {
+  matcherFcn = new Matcher1();
--- End diff --

Padma, I have two reasons for the added complexity:
1) The new code is encapsulated within the Contains matching logic, so it doesn't 
increase overall code complexity
2) It is measurably faster:
o I created a test with the original match logic; the pattern and input were 
Strings, though passed as CharSequence
o Ran the test with the new and old methods (1 billion iterations) on MacOS
o pattern length 
o The old match method ran in 43sec, whereas the new one ran in 15sec
o The reason for the speedup is that the custom matcher functions execute fewer 
instructions (loads and comparisons)
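
To make the idea behind the diff's Matcher1/MatcherFcn concrete, here is an 
illustrative sketch of a matcher specialized for a one-character pattern 
(hypothetical class, not the actual PR code): the loop body reduces to one load 
and one comparison, which HotSpot can optimize, and potentially vectorize, far 
better than the general-purpose algorithm.

{code}
// Illustrative only: a contains-matcher hard-wired for patternLength == 1.
final class SingleCharContainsMatcher {
  private final char patternChar;

  SingleCharContainsMatcher(char patternChar) {
    this.patternChar = patternChar;
  }

  // Returns 1 when the input contains the pattern character, 0 otherwise,
  // mirroring SqlPatternMatcher.match() semantics.
  int match(CharSequence input) {
    final int length = input.length();
    for (int idx = 0; idx < length; idx++) {
      if (input.charAt(idx) == patternChar) {
        return 1;
      }
    }
    return 0;
  }
}
{code}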


---


[GitHub] drill pull request #1001: JIRA DRILL-5879: Like operator performance improve...

2017-10-24 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/1001#discussion_r146705325
  
--- Diff: 
exec/java-exec/src/main/codegen/templates/CastFunctionsSrcVarLenTargetVarLen.java
 ---
@@ -73,6 +73,9 @@ public void eval() {
 out.start =  in.start;
 if (charCount <= length.value || length.value == 0 ) {
   out.end = in.end;
+  if (charCount == (out.end - out.start)) {
+out.asciiMode = 
org.apache.drill.exec.expr.holders.VarCharHolder.CHAR_MODE_IS_ASCII; // we can 
conclude this string is ASCII
--- End diff --

- As previously stated (when responding to Paul's comment), the expression 
framework is able to reuse the same VarCharHolder input variable when it is 
shared amongst multiple expressions
- If the original column was of type var-binary, then the expression 
framework will include a cast to var-char
- The cast logic also computes the string length
- We use this information to deduce whether the string is pure ASCII or not
- UTF-8 encoding uses 1 byte for ASCII characters and 2, 3, or 4 bytes for all 
other characters
- So if the encoded byte length and the character count are equal, the string 
must be pure ASCII
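
A self-contained sketch of that deduction (hypothetical helper, not the Drill 
cast code):

{code}
import java.nio.charset.StandardCharsets;

final class AsciiCheck {
  // In UTF-8, ASCII code points encode to exactly 1 byte, while all other
  // characters need 2 to 4 bytes; so encoded length == char count => ASCII.
  static boolean isPureAscii(String value) {
    return value.getBytes(StandardCharsets.UTF_8).length == value.length();
  }

  public static void main(String[] args) {
    System.out.println(isPureAscii("drill"));  // true: 5 bytes, 5 chars
    System.out.println(isPureAscii("drîll"));  // false: 6 bytes, 5 chars
  }
}
{code}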


---


[GitHub] drill pull request #1008: drill-5890: Fixed a file descriptor leak in Drill'...

2017-10-24 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1008


---


[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

2017-10-24 Thread HanumathRao
Github user HanumathRao commented on the issue:

https://github.com/apache/drill/pull/996
  
@arina-ielchiieva I think this check shouldn't cause much of a performance 
impact, as it is in parser code and is already performed by Drill's custom 
overload of getTable; if that function throws an exception, the exception is 
caught and reported to the user. One possible improvement is to skip the check 
on the valid code path and run it only when super.getTable returns null, so 
that we report the error only when we couldn't find the table. Please let me 
know if this approach seems reasonable, so that I can go ahead and change the 
code accordingly.
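
A minimal sketch of the flow being proposed (names are hypothetical, not 
Drill's actual classes or signatures): the valid code path pays nothing extra, 
and validation runs only to produce a precise error.

{code}
import java.util.List;

// Hypothetical skeleton illustrating the proposal; Drill's real overload
// of getTable lives elsewhere and has a different signature.
abstract class LazySchemaValidation {
  abstract Object superGetTable(List<String> names);     // the existing lookup
  abstract void validateSchemaPath(List<String> names);  // throws if schema is invalid

  Object getTable(List<String> names) {
    Object table = superGetTable(names);
    if (table == null) {
      // Validate only on the failure path, purely to report a precise
      // "schema not found" versus "table not found" error to the user.
      validateSchemaPath(names);
    }
    return table;
  }
}
{code}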


---


[jira] [Resolved] (DRILL-5706) Select * on hbase table having multiple regions(one or more empty) returns wrong result intermittently

2017-10-24 Thread Paul Rogers (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Rogers resolved DRILL-5706.

Resolution: Fixed

> Select * on hbase table having multiple regions(one or more empty) returns 
> wrong result intermittently
> --
>
> Key: DRILL-5706
> URL: https://issues.apache.org/jira/browse/DRILL-5706
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HBase
>Affects Versions: 1.11.0
>Reporter: Prasad Nagaraj Subramanya
>Assignee: Paul Rogers
>
> 1) Create a hbase table with 4 regions
> {code}
> create 'myhbase', 'cf1', {SPLITS => ['a', 'b', 'c']}
> put 'myhbase','a','cf1:col1','somedata'
> put 'myhbase','b','cf1:col1','somedata'
> put 'myhbase','c','cf1:col1','somedata'
> {code}
> 2) Run select * on the hbase table
> {code}
> select * from hbase.myhbase;
> {code}
> The query returns wrong result intermittently



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (DRILL-5873) Drill C++ Client should throw proper/complete error message for the ODBC driver to consume

2017-10-24 Thread Parth Chandra (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Parth Chandra resolved DRILL-5873.
--
Resolution: Fixed

> Drill C++ Client should throw proper/complete error message for the ODBC 
> driver to consume
> --
>
> Key: DRILL-5873
> URL: https://issues.apache.org/jira/browse/DRILL-5873
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Client - C++
>Reporter: Krystal
>Assignee: Parth Chandra
>  Labels: ready-to-commit
>
> The Drill C++ Client should throw a proper/complete error message for the 
> driver to utilize.
> The ODBC driver is directly outputting the exception message thrown by the 
> client by calling the getError() API after the connect() API has failed with 
> an error status.
> For the Java client, similar logic is hard coded at 
> https://github.com/apache/drill/blob/1.11.0/exec/java-exec/src/main/java/org/apache/drill/exec/rpc/user/UserClient.java#L247.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1005: DRILL-5896: Handle HBase columns vector creation i...

2017-10-24 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146658667
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -75,6 +75,8 @@
 
   private TableName hbaseTableName;
   private Scan hbaseScan;
+  private Scan hbaseScan1;
+  Set<String> completeFamilies;
--- End diff --

`private`?


---


[GitHub] drill pull request #1005: DRILL-5896: Handle HBase columns vector creation i...

2017-10-24 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146659717
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -186,6 +192,10 @@ public void setup(OperatorContext context, 
OutputMutator output) throws Executio
   }
 }
   }
+
+  for (String familyName : completeFamilies) {
+getOrCreateFamilyVector(familyName, false);
+  }
--- End diff --

Does this create just the map, or also the vectors within the map? Maybe a 
comment to explain the goals?


---


[GitHub] drill pull request #1005: DRILL-5896: Handle HBase columns vector creation i...

2017-10-24 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146659105
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -121,16 +125,18 @@ public HBaseRecordReader(Connection connection, 
HBaseSubScan.HBaseSubScanSpec su
 byte[] family = root.getPath().getBytes();
 transformed.add(SchemaPath.getSimplePath(root.getPath()));
 PathSegment child = root.getChild();
-if (!completeFamilies.contains(new String(family, 
StandardCharsets.UTF_8).toLowerCase())) {
-  if (child != null && child.isNamed()) {
-byte[] qualifier = child.getNameSegment().getPath().getBytes();
+if (child != null && child.isNamed()) {
+  byte[] qualifier = child.getNameSegment().getPath().getBytes();
+  hbaseScan1.addColumn(family, qualifier);
+  if (!completeFamilies.contains(new String(family, 
StandardCharsets.UTF_8))) {
--- End diff --

Redundant conversion of `family` to `String`, here and below.
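
An illustrative sketch of the suggested cleanup (hypothetical helper; the 
actual reader code differs): hoist the byte[]-to-String conversion so it 
happens once and is reused.

{code}
import java.nio.charset.StandardCharsets;
import java.util.Set;

final class FamilyNames {
  // Convert the column-family bytes to a String a single time, then reuse
  // the value for every membership check instead of re-converting inline.
  static boolean isCompleteFamily(byte[] family, Set<String> completeFamilies) {
    final String familyName = new String(family, StandardCharsets.UTF_8);
    return completeFamilies.contains(familyName);
  }
}
{code}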


---


[GitHub] drill pull request #1005: DRILL-5896: Handle HBase columns vector creation i...

2017-10-24 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1005#discussion_r146658771
  
--- Diff: 
contrib/storage-hbase/src/main/java/org/apache/drill/exec/store/hbase/HBaseRecordReader.java
 ---
@@ -87,6 +89,7 @@ public HBaseRecordReader(Connection connection, 
HBaseSubScan.HBaseSubScanSpec su
 hbaseTableName = TableName.valueOf(
 Preconditions.checkNotNull(subScanSpec, "HBase reader needs a 
sub-scan spec").getTableName());
 hbaseScan = new Scan(subScanSpec.getStartRow(), 
subScanSpec.getStopRow());
+hbaseScan1 = new Scan();
--- End diff --

Better name or comment to explain.


---


[GitHub] drill issue #1008: drill-5890: Fixed a file descriptor leak in Drill's test-...

2017-10-24 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1008
  
+1. 
Great catch Salim. 


---


[GitHub] drill issue #1008: drill-5890: Fixed a file descriptor leak in Drill's test-...

2017-10-24 Thread sachouche
Github user sachouche commented on the issue:

https://github.com/apache/drill/pull/1008
  
@parthchandra can you please review this change? thanks!


---


[GitHub] drill pull request #1008: drill-5890: Fixed a file descriptor leak in Drill'...

2017-10-24 Thread sachouche
GitHub user sachouche opened a pull request:

https://github.com/apache/drill/pull/1008

drill-5890: Fixed a file descriptor leak in Drill's test-suite

Problem Description
- The Drill test-suite uses two surefire processes to run tests
- This has the advantage of avoiding the class reloading that would occur if 
the JVM exited after running each test class
- The side effect of this approach is that resource leaks can be problematic
- When running the Drill test-suite on MacOS (Sierra), my tests failed after 
reaching the maximum number of open file descriptors (FDs)
- I had to increase the maximum number of open FDs both machine-wide and per 
process
- The process is described in the following 
[link](https://superuser.com/questions/302754/increase-the-maximum-number-of-open-file-descriptors-in-snow-leopard)
- Two limit files, "limit.maxfiles.plist" and "limit.maxproc.plist", have to 
be created under "/Library/LaunchDaemons"
- Originally, I had to set the per-process maximum number of FDs to a large 
value (100,000, and the system-wide limit to 200,000) for the tests to succeed

FD Leak Cause
While debugging the Drill test suite, I noticed the following:
- The base class BaseTestQuery has @BeforeClass and @AfterClass annotations
- This means that each Drill test class extending BaseTestQuery has a setup 
method called before any tests are executed and a cleanup method invoked when 
all the tests are done (or a fatal error occurs in between)
- The openClient() method was starting a DrillBit and creating a client 
connection to it
- The DrillBit's BootStrapContext class was initializing two Netty 
EventLoopGroup objects, each of which internally opened 20 FDs
- One of the two was not getting de-initialized

Fix (see the sketch after this list)
- Added logic within the BootStrapContext object to shut down the 
EventLoopGroup objects if they have not already been shut down (and are not in 
the process of shutting down)
- The fix tries to shut down both objects because the container class should 
ideally manage the lifecycle of its objects; at the very least, the code 
should clearly articulate lifecycle-management responsibilities to avoid leaks
- Used the "shutdownGracefully" method since it a) was already used by our 
code and b) is advertised to have sensible timeout values
- The added shutdown calls are invoked only when the consumer objects have 
also been shut down
- Running the tests shows that the number of FDs per surefire process doesn't 
grow beyond a few hundred (the majority created for loading JAR files)
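
As referenced above, a hedged sketch of the guarded shutdown (simplified; the 
real BootStrapContext logic differs):

{code}
import io.netty.channel.EventLoopGroup;

final class EventLoopGroupCloser {
  // Shut the group down gracefully only if no shutdown is already underway;
  // shutdownGracefully() applies sensible default quiet-period/timeout values.
  static void closeQuietly(EventLoopGroup group) {
    if (group != null && !group.isShuttingDown() && !group.isShutdown()) {
      group.shutdownGracefully();
    }
  }
}
{code}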


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sachouche/drill drill-5890

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1008.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1008


commit 5f8bad865a78ab265015ca21184d6c59e22d1c95
Author: Salim Achouche 
Date:   2017-10-24T17:12:19Z

drill-5890: Fixed a file descriptor leak in Drill's test-suite




---


[jira] [Created] (DRILL-5904) TestCTAS.testPartitionByForAllType fails sporadically on some build machines

2017-10-24 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-5904:
-

 Summary: TestCTAS.testPartitionByForAllType fails sporadically on 
some build machines
 Key: DRILL-5904
 URL: https://issues.apache.org/jira/browse/DRILL-5904
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas


Vlad found that the TestCTAS.testPartitionByForAllType test sporadically fails 
with the following stack trace:

testPartitionByForAllTypes(org.apache.drill.exec.sql.TestCTAS)
java.lang.Exception: test timed out after 10 milliseconds
at java.io.UnixFileSystem.canonicalize0(Native Method) ~[na:1.7.0_131]
at java.io.UnixFileSystem.canonicalize(UnixFileSystem.java:172) 
~[na:1.7.0_131]
at java.io.File.getCanonicalPath(File.java:618) ~[na:1.7.0_131]
at java.io.File.getCanonicalFile(File.java:643) ~[na:1.7.0_131]
at org.apache.commons.io.FileUtils.isSymlink(FileUtils.java:2935) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1534) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.forceDelete(FileUtils.java:2270) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1653) 
~[commons-io-2.4.jar:2.4]
at org.apache.commons.io.FileUtils.deleteQuietly(FileUtils.java:1566) 
~[commons-io-2.4.jar:2.4]
at 
org.apache.drill.exec.sql.TestCTAS.testPartitionByForAllTypes(TestCTAS.java:292)
 ~[test-classes/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[na:1.7.0_131]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
~[na:1.7.0_131]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.7.0_131]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_131]
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 ~[junit-4.11.jar:na]
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 ~[junit-4.11.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 ~[junit-4.11.jar:na]
at 
mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.executeTestMethod(JUnit4TestRunnerDecorator.java:120)
 ~[jmockit-1.3.jar:na]
at 
mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:65)
 ~[jmockit-1.3.jar:na]
at 
mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
 ~[jmockit-1.3.jar:na]
at sun.reflect.GeneratedMethodAccessor97.invoke(Unknown Source) ~[na:na]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.7.0_131]
at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_131]
at 
mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
 ~[jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76) 
~[jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
~[jmockit-1.3.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 ~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
~[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
 ~[junit-4.11.jar:na]



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #996: DRILL-5878: TableNotFound exception is being reported for ...

2017-10-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/996
  
So does it mean that schema validation will be done twice: first in Drill 
and then in Calcite? Will it influence query parsing performance?


---


[jira] [Resolved] (DRILL-5901) Drill test framework can have successful run even if a random failure occurs

2017-10-24 Thread Robert Hou (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Hou resolved DRILL-5901.
---
Resolution: Not A Bug

This is a bug in the Drill Test Framework, not in Drill itself.

> Drill test framework can have successful run even if a random failure occurs
> 
>
> Key: DRILL-5901
> URL: https://issues.apache.org/jira/browse/DRILL-5901
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build & Test
>Affects Versions: 1.11.0
>Reporter: Robert Hou
>
> From Jenkins:
> http://10.10.104.91:8080/view/Nightly/job/TPCH-SF100-baseline/574/console
> Random Failures:
> /root/drillAutomation/framework-master/framework/resources/Advanced/tpch/tpch_sf1/original/parquet/query17.sql
> Query: 
> SELECT
>   SUM(L.L_EXTENDEDPRICE) / 7.0 AS AVG_YEARLY
> FROM
>   lineitem L,
>   part P
> WHERE
>   P.P_PARTKEY = L.L_PARTKEY
>   AND P.P_BRAND = 'BRAND#13'
>   AND P.P_CONTAINER = 'JUMBO CAN'
>   AND L.L_QUANTITY < (
> SELECT
>   0.2 * AVG(L2.L_QUANTITY)
> FROM
>   lineitem L2
> WHERE
>   L2.L_PARTKEY = P.P_PARTKEY
>   )
> Failed with exception
> java.sql.SQLException: SYSTEM ERROR: IllegalStateException: Memory was leaked 
> by query. Memory leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
> org.apache.drill.exec.ops.OperatorContextImpl.close():108
> org.apache.drill.exec.ops.FragmentContext.suppressingClose():435
> org.apache.drill.exec.ops.FragmentContext.close():424
> 
> org.apache.drill.exec.work.fragment.FragmentExecutor.closeOutResources():324
> org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup():155
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():267
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():744
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:489)
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.loadInitialSchema(DrillCursor.java:561)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:1895)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.execute(DrillResultSetImpl.java:61)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:473)
>   at 
> org.apache.drill.jdbc.impl.DrillMetaImpl.prepareAndExecute(DrillMetaImpl.java:1100)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:477)
>   at 
> org.apache.drill.jdbc.impl.DrillConnectionImpl.prepareAndExecuteInternal(DrillConnectionImpl.java:181)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:110)
>   at 
> oadd.org.apache.calcite.avatica.AvaticaStatement.executeQuery(AvaticaStatement.java:130)
>   at 
> org.apache.drill.jdbc.impl.DrillStatementImpl.executeQuery(DrillStatementImpl.java:112)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:206)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:115)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserRemoteException: 
> SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> Fragment 8:2
> [Error Id: f21a2560-7259-4e13-88c2-9bac29e2930a on atsqa6c88.qa.lab:31010]
>   (java.lang.IllegalStateException) Memory was leaked by query. Memory 
> leaked: (2097152)
> Allocator(op:8:2:6:ParquetRowGroupScan) 100/0/7675904/100 
> (res/actual/peak/limit)
> org.apache.drill.exec.memory.BaseAllocator.close():519
> org.apache.drill.exec.ops.AbstractOperatorExecContext.close():86
>

[GitHub] drill issue #949: DRILL-5795: Parquet Filter push down at rowgroup level

2017-10-24 Thread dprofeta
Github user dprofeta commented on the issue:

https://github.com/apache/drill/pull/949
  
Indeed, the plan now carries an additional piece of information: the number of 
row groups scanned. I think the plan check should be fixed to verify numFiles 
and usedMetadataFile separately, without expecting them to appear next to each 
other.
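
A hedged sketch of what the decoupled check could look like, using Drill's 
PlanTestBase helper (the query and patterns here are illustrative only):

{code}
import org.apache.drill.PlanTestBase;
import org.junit.Test;

public class TestRowGroupPushDownPlan extends PlanTestBase {
  @Test
  public void testPlanChecksAttributesSeparately() throws Exception {
    String query = "select * from dfs.`/tmp/parquet_table` where a = 1";
    // Each pattern is matched independently, so extra plan attributes
    // (such as a row-group count) between them no longer break the test.
    testPlanMatchingPatterns(query,
        new String[] {"numFiles=1", "usedMetadataFile=true"},
        null);
  }
}
{code}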


---


[jira] [Created] (DRILL-5903) Query encounters "Waited for 15000ms, but tasks for 'Fetch parquet metadata' are not complete."

2017-10-24 Thread Robert Hou (JIRA)
Robert Hou created DRILL-5903:
-

 Summary: Query encounters "Waited for 15000ms, but tasks for 
'Fetch parquet metadata' are not complete."
 Key: DRILL-5903
 URL: https://issues.apache.org/jira/browse/DRILL-5903
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata, Storage - Parquet
Affects Versions: 1.11.0
Reporter: Robert Hou
Priority: Critical


Query is:
{noformat}
select a.int_col, b.date_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` a 
inner join ( select date_col, int_col from 
dfs.`/drill/testdata/parquet_date/metadata_cache/mixed/fewtypes_null_large` 
where dir0 = '1.2' and date_col > '1996-03-07' ) b on cast(a.date_col as date)= 
date_add(b.date_col, 5) where a.int_col = 7 and a.dir0='1.9' group by 
a.int_col, b.date_col
{noformat}

From drillbit.log:
{noformat}
fc65-d430-ac1103638113: SELECT SUM(col_int) OVER() sum_int FROM vwOnParq_wCst_35
2017-10-23 11:20:50,122 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] ERROR 
o.a.d.exec.store.parquet.Metadata - Waited for 15000ms, but tasks for 'Fetch 
parquet metadata' are not complete. Total runnable size 3, parallelism 3.
2017-10-23 11:20:50,127 [26122f83-6956-5aa8-d8de-d4808f572160:foreman] INFO  
o.a.d.exec.store.parquet.Metadata - User Error Occurred: Waited for 15000ms, 
but tasks for 'Fetch parquet metadata' are not complete. Total runnable size 3, 
parallelism 3.
org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: Waited for 
15000ms, but tasks for 'Fetch parquet metadata' are not complete. Total 
runnable size 3, parallelism 3.


[Error Id: 7484e127-ea41-4797-83c0-6619ea9b2bcd ]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.TimedRunnable.run(TimedRunnable.java:151) 
[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetFileMetadata_v3(Metadata.java:341)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:318)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.Metadata.getParquetTableMetadata(Metadata.java:142)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.init(ParquetGroupScan.java:934)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:227)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetGroupScan.(ParquetGroupScan.java:190)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:170)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.parquet.ParquetFormatPlugin.getGroupScan(ParquetFormatPlugin.java:66)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.dfs.FileSystemPlugin.getPhysicalScan(FileSystemPlugin.java:144)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.store.AbstractStoragePlugin.getPhysicalScan(AbstractStoragePlugin.java:100)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillTable.getGroupScan(DrillTable.java:85)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.logical.DrillPushProjIntoScan.onMatch(DrillPushProjIntoScan.java:62)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:811)
 [calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:310) 
[calcite-core-1.4.0-drill-r22.jar:1.4.0-drill-r22]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:400)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:342)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:241)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:291)
 

ElasticSearch via sql4es (JDBC)

2017-10-24 Thread Charuta Rajopadhye
Hi Team,

I am trying to connect to ElasticSearch via Drill. I came across this, but the 
work seems to be in progress. So I tried using the JDBC storage plugin for 
ElasticSearch with the sql4es jar, but to no avail.
I keep getting the error: Please retry: error (unable to create/ update
storage)
The following is the configuration I used:
{
  "type": "jdbc",
  "driver": "nl.anchormen.sql4es.jdbc.ESDriver",
  "url": "jdbc:sql4es://localhost:9200/48FxIll?cluster.name=elasticsearch",
  "username": "",
  "password": "",
  "enabled": true
}
as well as a few permutations of it, for example changing the url to just 
jdbc:sql4es://localhost:9200, etc.
I am pretty sure I don't need credentials, since I have not installed the 
X-Pack extension for ElasticSearch, but I am unsure whether some default still 
needs to be supplied.
Can somebody please guide me in this regard?

Thanks,
Charuta