[GitHub] drill issue #1015: DRILL-5899: Simple pattern matchers can work with DrillBu...

2017-11-07 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1015
  
@paul-rogers updated with latest review comments taken care of. Please take 
a look.


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1015#discussion_r149554356
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java
 ---
@@ -17,33 +17,30 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
-public class SqlPatternEndsWithMatcher implements SqlPatternMatcher {
-  final String patternString;
-  CharSequence charSequenceWrapper;
-  final int patternLength;
-
-  public SqlPatternEndsWithMatcher(String patternString, CharSequence 
charSequenceWrapper) {
-this.charSequenceWrapper = charSequenceWrapper;
-this.patternString = patternString;
-this.patternLength = patternString.length();
+import io.netty.buffer.DrillBuf;
+
+public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher {
+
+  public SqlPatternEndsWithMatcher(String patternString) {
+super(patternString);
   }
 
   @Override
-  public int match() {
-int txtIndex = charSequenceWrapper.length();
-int patternIndex = patternLength;
-boolean matchFound = true; // if pattern is empty string, we always 
match.
+  public int match(int start, int end, DrillBuf drillBuf) {
+
+if ( (end - start) < patternLength) { // No match if input string 
length is less than pattern length.
--- End diff --

`( (end - start)` --> `(end - start`


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1015#discussion_r149554195
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternContainsMatcher.java
 ---
@@ -17,37 +17,48 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
-public class SqlPatternContainsMatcher implements SqlPatternMatcher {
-  final String patternString;
-  CharSequence charSequenceWrapper;
-  final int patternLength;
-
-  public SqlPatternContainsMatcher(String patternString, CharSequence 
charSequenceWrapper) {
-this.patternString = patternString;
-this.charSequenceWrapper = charSequenceWrapper;
-patternLength = patternString.length();
+import io.netty.buffer.DrillBuf;
+
+public class SqlPatternContainsMatcher extends AbstractSqlPatternMatcher {
+
+  public SqlPatternContainsMatcher(String patternString) {
+super(patternString);
   }
 
   @Override
-  public int match() {
-final int txtLength = charSequenceWrapper.length();
-int patternIndex = 0;
-int txtIndex = 0;
-
-// simplePattern string has meta characters i.e % and _ and escape 
characters removed.
-// so, we can just directly compare.
-while (patternIndex < patternLength && txtIndex < txtLength) {
-  if (patternString.charAt(patternIndex) != 
charSequenceWrapper.charAt(txtIndex)) {
-// Go back if there is no match
-txtIndex = txtIndex - patternIndex;
-patternIndex = 0;
-  } else {
-patternIndex++;
+  public int match(int start, int end, DrillBuf drillBuf) {
+
+if (patternLength == 0) { // Everything should match for null pattern 
string
+  return 1;
+}
+
+final int txtLength = end - start;
+
+// no match if input string length is less than pattern length
+if (txtLength < patternLength) {
+  return 0;
+}
+
+outer:
+for (int txtIndex = 0; txtIndex < txtLength; txtIndex++) {
+
+  // boundary check
+  if (txtIndex + patternLength > txtLength) {
--- End diff --

Better:
```
int end = txtLength - patternLength;
for (int txtIndex = 0; txtIndex < end; txtIndex++) {
```
And omit the boundary check on every iteration. That is, no reason to 
iterate past the last possible match, then use an if-statement to shorten the 
loop. Just shorten the loop.


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1015#discussion_r149552453
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java
 ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import com.google.common.base.Charsets;
+import org.apache.drill.common.exceptions.UserException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.CharsetEncoder;
+import static 
org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger;
+
+// To get good performance for most commonly used pattern matches
--- End diff --

Javadoc?

```
/**
 * This is a Javadoc comment and appears in generated documentation.
 */
// This is a plain comment and does not appear in documentation.
```


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1015#discussion_r149552506
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/AbstractSqlPatternMatcher.java
 ---
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.expr.fn.impl;
+
+import com.google.common.base.Charsets;
+import org.apache.drill.common.exceptions.UserException;
+import java.nio.ByteBuffer;
+import java.nio.CharBuffer;
+import java.nio.charset.CharacterCodingException;
+import java.nio.charset.CharsetEncoder;
+import static 
org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.logger;
+
+// To get good performance for most commonly used pattern matches
+// i.e. CONSTANT('ABC'), STARTSWITH('%ABC'), ENDSWITH('ABC%') and 
CONTAINS('%ABC%'),
+// we have simple pattern matchers.
+// Idea is to have our own implementation for simple pattern matchers so 
we can
+// avoid heavy weight regex processing, skip UTF-8 decoding and char 
conversion.
+// Instead, we encode the pattern string and do byte comparison against 
native memory.
+// Overall, this approach
+// gives us orders of magnitude performance improvement for simple pattern 
matches.
+// Anything that is not simple is considered
+// complex pattern and we use Java regex for complex pattern matches.
+
+public abstract class AbstractSqlPatternMatcher implements 
SqlPatternMatcher {
+  final String patternString;
--- End diff --

`protected final`


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1015#discussion_r149555002
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/impl/SqlPatternEndsWithMatcher.java
 ---
@@ -17,33 +17,30 @@
  */
 package org.apache.drill.exec.expr.fn.impl;
 
-public class SqlPatternEndsWithMatcher implements SqlPatternMatcher {
-  final String patternString;
-  CharSequence charSequenceWrapper;
-  final int patternLength;
-
-  public SqlPatternEndsWithMatcher(String patternString, CharSequence 
charSequenceWrapper) {
-this.charSequenceWrapper = charSequenceWrapper;
-this.patternString = patternString;
-this.patternLength = patternString.length();
+import io.netty.buffer.DrillBuf;
+
+public class SqlPatternEndsWithMatcher extends AbstractSqlPatternMatcher {
+
+  public SqlPatternEndsWithMatcher(String patternString) {
+super(patternString);
   }
 
   @Override
-  public int match() {
-int txtIndex = charSequenceWrapper.length();
-int patternIndex = patternLength;
-boolean matchFound = true; // if pattern is empty string, we always 
match.
+  public int match(int start, int end, DrillBuf drillBuf) {
+
+if ( (end - start) < patternLength) { // No match if input string 
length is less than pattern length.
+  return 0;
+}
 
 // simplePattern string has meta characters i.e % and _ and escape 
characters removed.
 // so, we can just directly compare.
-while (patternIndex > 0 && txtIndex > 0) {
-  if (charSequenceWrapper.charAt(--txtIndex) != 
patternString.charAt(--patternIndex)) {
-matchFound = false;
-break;
+for (int index = 1; index <= patternLength; index++) {
--- End diff --

```
int txtStart = end - patternLength;
if (txtStart < start) { return 0; }
for (int index = 0; index < patternLength; index++) {
   ... patternByteBuffer.get(index) ... drillBuf.getByte(txtStart + index) 
...
```



---


[jira] [Created] (DRILL-5943) Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism

2017-11-07 Thread Sorabh Hamirwasia (JIRA)
Sorabh Hamirwasia created DRILL-5943:


 Summary: Avoid the strong check introduced by DRILL-5582 for PLAIN 
mechanism
 Key: DRILL-5943
 URL: https://issues.apache.org/jira/browse/DRILL-5943
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Sorabh Hamirwasia
Assignee: Sorabh Hamirwasia


For PLAIN mechanism we will weaken the strong check introduced with DRILL-5582 
to keep the forward compatibility between Drill 1.12 client and Drill 1.9 
server. This is fine since with and without this strong check PLAIN mechanism 
is still vulnerable to MITM during handshake itself unlike mutual 
authentication protocols like Kerberos.

Also for keeping forward compatibility with respect to SASL we will treat 
UNKNOWN_SASL_SUPPORT as valid value. For handshake message received from a 
client which is running on later version (let say 1.13) then Drillbit (1.12) 
and having a new value for SaslSupport field which is unknown to server, this 
field will be decoded as UNKNOWN_SASL_SUPPORT. In this scenario client will be 
treated as one aware about SASL protocol but server doesn't know exact 
capabilities of client. Hence the SASL handshake will still be required from 
server side.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #1028: DRILL-5943: Avoid the strong check introduced by DRILL-55...

2017-11-07 Thread sohami
Github user sohami commented on the issue:

https://github.com/apache/drill/pull/1028
  
@parthchandra & @laurentgo - Please help to review this PR.


---


[GitHub] drill issue #1021: DRILL-5923: Display name for query state

2017-11-07 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1021
  
@arina-ielchiieva, @prasadns14, here is my two cents.

The names and numbers used in the protobuf definitions are part of Drill's 
public network API. This API is not versioned, so we can't really change it. If 
we changed the names, then, say, C-code or Java code that expects the old names 
will break. Being part of the public API, that code may not even be in the 
Drill source tree; perhaps someone has generated, say, a Python binding. So, 
can't change the public API.

For purely aesthetic reasons, the contributor wishes to change the message 
displayed in the UI. This is purely a UI decision (the user is not expected to 
map the display names to the Protobuf enums.) And, the display name is subject 
to change. Maybe other UIs want to use other names. Maybe we want to show 
icons, or abbreviate the names ("Fail", "OK", etc.) And, of course, what if the 
display name should have spaces other characters: "In Progress", "In Queue" or 
"Didn't Work!". Can't put those in enum names. You get the idea.

For this reason, the mapping from enum values to display names should be 
part of the UI, not the network protocol definition. The present change 
provides a UI-specific mapping from API Protobuf enum values to display 
strings, which seems like a good idea.

So, the key questions are:

* Should we use display strings other than the Protobuf constants (seems a 
good idea.)
* Should we do the mapping in Java or in Freemarker? (Java seems simpler.)

Thoughts?


---


[GitHub] drill pull request #1028: DRILL-5943: Avoid the strong check introduced by D...

2017-11-07 Thread sohami
GitHub user sohami opened a pull request:

https://github.com/apache/drill/pull/1028

DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN…

… mechanism

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sohami/drill DRILL-5943

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1028


commit 708dbc203b63700fb520445e585826a5c1e911e4
Author: Sorabh Hamirwasia 
Date:   2017-11-07T23:27:45Z

DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN 
mechanism




---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149550602
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -260,6 +288,10 @@ void close() {
   // when the main thread is blocked waiting for the result.  In that 
case
   // we want to unblock the main thread.
   firstMessageReceived.countDown(); // TODO:  Why not call 
releaseIfFirst as used elsewhere?
+  //Stopping timeout clock
--- End diff --

Ok. Guess we'll do away with it. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149550418
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -96,6 +105,14 @@ private void throwIfClosed() throws 
AlreadyClosedSqlException,
 throw new AlreadyClosedSqlException( "ResultSet is already 
closed." );
   }
 }
+
+//Implicit check for whether timeout is set
+if (elapsedTimer != null) {
--- End diff --

```yes, pausing before execute would totally work!```
Current here is what the test does (_italics indicate what we're doing 
under the covers_):
1. Init Statement
2. Set timeout on statement (_validating the timeout value_)
3. Calling `execute()` and fetching ResultSet instance (_starting the 
clock_) 
4. Fetching a row using ResultSet.next()
5. Pausing briefly
6. Repeat step 4 onwards (_enough pause to trigger timeout_)

I was intending to pause between step 3 and 4 as an additional step. 
You believe that we are not exercising any tests for timeout within the 
`execute()` call? 
(Ref: 
https://github.com/kkhatua/drill/blob/9c4e3f3f727e70ca058facd4767556087a1876e1/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java#L1908
 )



---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149548759
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -333,8 +368,14 @@ void close() {
 final int batchQueueThrottlingThreshold =
 client.getConfig().getInt(
 ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD );
-resultsListener = new ResultsListener(batchQueueThrottlingThreshold);
+resultsListener = new ResultsListener(this, 
batchQueueThrottlingThreshold);
 currentBatchHolder = new RecordBatchLoader(client.getAllocator());
+try {
+  setTimeout(this.statement.getQueryTimeout());
+} catch (SQLException e) {
+  // Printing any unexpected SQLException stack trace
+  e.printStackTrace();
--- End diff --

I agree. Thankfully, the _caller_ does handle any thrown `SQLException`s, 
so I'm going to pass this off to that. IMO, I don't think we'll have an issue 
because the `Statement.setQueryTimeout()` would have handled any corner cases 
before this is invoked via `execute()`


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149548337
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -96,6 +105,14 @@ private void throwIfClosed() throws 
AlreadyClosedSqlException,
 throw new AlreadyClosedSqlException( "ResultSet is already 
closed." );
   }
 }
+
+//Implicit check for whether timeout is set
+if (elapsedTimer != null) {
--- End diff --

yes, pausing before execute would totally work! After execute, likely not 
since injection is done when query is executed on the server side.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149547712
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -96,6 +105,14 @@ private void throwIfClosed() throws 
AlreadyClosedSqlException,
 throw new AlreadyClosedSqlException( "ResultSet is already 
closed." );
   }
 }
+
+//Implicit check for whether timeout is set
+if (elapsedTimer != null) {
--- End diff --

I was originally wondering as to when should we trigger the countdown on 
the timer. 
Creating a `[Prepared]Statement` object should not be the basis for the 
starting the clock, but only when you actually call execute(). The 
`DrillCursor` is initialized in this method and is what starts the clock. 
I could create a clone of the `testTriggeredQueryTimeout` method and simply 
have the client pause after `execute()` but before fetching the `ResultSet` 
instance or invoking `ResultSet.next()` . Would that work ?


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-07 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r149542196
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -348,6 +354,21 @@ public void run() {
  */
   }
 
+  /*
+Check if the foreman is ONLINE. If not dont accept any new queries.
+   */
+  public void checkForemanState() throws ForemanException{
+DrillbitEndpoint foreman = drillbitContext.getEndpoint();
+Collection dbs = drillbitContext.getAvailableBits();
--- End diff --

Maybe add it to the DrillbitContext class ?


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-07 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r149541807
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.server;
+/*
+  State manager to manage the state of drillbit.
+ */
+public class DrillbitStateManager {
+
+
+  public DrillbitStateManager(DrillbitState currentState) {
+this.currentState = currentState;
+  }
+
+  public enum DrillbitState {
+STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN
+  }
+
+  public DrillbitState getState() {
+return currentState;
+  }
+
+  private DrillbitState currentState;
--- End diff --

I think Drillbit.quiescentMode and Drillbit.forceful_shutdown also need NOT 
be volatile given the way they are used now. You don't have to enforce 
happens-before (by preventing re-ordering) here and even if these variables are 
volatile, the read of these variables in close() can anyway race with the 
setting of these variables in another thread doing a stop/gracefulShutdown. Let 
me know if I am missing anything.

That said, adding volatiles can only makes the code more correct (and 
slower). Since this code is not critical you can let it be as it is.  


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-07 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r149544267
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java ---
@@ -0,0 +1,323 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.test;
+
+import ch.qos.logback.classic.Level;
+import org.apache.commons.io.FileUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState;
+import org.apache.drill.exec.server.Drillbit;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.omg.PortableServer.THREAD_POLICY_ID;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.Collection;
+import java.util.Properties;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.fail;
+
+public class TestGracefulShutdown {
+
+  @BeforeClass
+  public static void setUpTestData() {
+for( int i = 0; i < 1000; i++) {
+  setupFile(i);
+}
+  }
+
+
+  public static final Properties WEBSERVER_CONFIGURATION = new 
Properties() {
+{
+  put(ExecConstants.HTTP_ENABLE, true);
+}
+  };
+
+  public FixtureBuilder enableWebServer(FixtureBuilder builder) {
+Properties props = new Properties();
+props.putAll(WEBSERVER_CONFIGURATION);
+builder.configBuilder.configProps(props);
+return builder;
+  }
+
+
+  /*
+  Start multiple drillbits and then shutdown a drillbit. Query the online
+  endpoints and check if the drillbit still exists.
+   */
+  @Test
+  public void testOnlineEndPoints() throws  Exception {
+
+String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"};
+FixtureBuilder builder = 
ClusterFixture.builder().withBits(drillbits).withLocalZk();
+
+
+try ( ClusterFixture cluster = builder.build();
+  ClientFixture client = cluster.clientFixture()) {
+
+  Drillbit drillbit = cluster.drillbit("db2");
+  DrillbitEndpoint drillbitEndpoint =  
drillbit.getRegistrationHandle().getEndPoint();
+  int grace_period = 
drillbit.getContext().getConfig().getInt("drill.exec.grace_period");
+  new Thread(new Runnable() {
+public void run() {
+  try {
+cluster.close_drillbit("db2");
+  } catch (Exception e) {
+e.printStackTrace();
+  }
+}
+  }).start();
+  //wait for graceperiod
+  Thread.sleep(grace_period);
+  Collection drillbitEndpoints = 
cluster.drillbit().getContext()
+  .getClusterCoordinator()
+  .getOnlineEndPoints();
+  Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint));
+}
+  }
+  /*
+Test if the drillbit transitions from ONLINE state when a shutdown
+request is initiated
+   */
+  @Test
+  public void testStateChange() throws  Exception {
+
+String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"};
+FixtureBuilder builder = 
ClusterFixture.builder().withBits(drillbits).withLocalZk();
+
+try ( ClusterFixture cluster = builder.build();
+  ClientFixture client = cluster.clientFixture()) {
+  Drillbit drillbit = cluster.drillbit("db2");
+  int grace_period = 
drillbit.getContext().getConfig().getInt("drill.exec.grace_period");
+  DrillbitEndpoint drillbitEndpoint =  
drillbit.getRegistrationHandle().getEndPoint();
+  new Thread(new Runnable() {
+public void run() 

[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149546431
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -376,6 +417,19 @@ synchronized void cleanup() {
 currentBatchHolder.clear();
   }
 
+  long getTimeoutInMilliseconds() {
+return timeoutInMilliseconds;
+  }
+
+  //Set the cursor's timeout in seconds
+  void setTimeout(int timeoutDurationInSeconds){
+this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L;
--- End diff --

+1 


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149546258
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() {
   return stopped;
 }
 
-public void awaitFirstMessage() throws InterruptedException {
-  firstMessageReceived.await();
+public void awaitFirstMessage() throws InterruptedException, 
SQLTimeoutException {
+  //Check if a non-zero timeout has been set
+  if ( parent.timeoutInMilliseconds > 0 ) {
+//Identifying remaining in milliseconds to maintain a granularity 
close to integer value of timeout
+long timeToTimeout = (parent.timeoutInMilliseconds) - 
parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
+if ( timeToTimeout > 0 ) {
--- End diff --

Affects readability, but I think comments can convey the intent. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149546105
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -260,6 +288,10 @@ void close() {
   // when the main thread is blocked waiting for the result.  In that 
case
   // we want to unblock the main thread.
   firstMessageReceived.countDown(); // TODO:  Why not call 
releaseIfFirst as used elsewhere?
+  //Stopping timeout clock
--- End diff --

Nothing is actually running in Stopwatch (it's just a state to indicate if 
elapsed time should use the current time or the time when Stopwatch was 
stopped...)


---


[GitHub] drill issue #904: DRILL-5717: change some date time test cases with specific...

2017-11-07 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/904
  
applied the review comments


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149545640
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -260,6 +288,10 @@ void close() {
   // when the main thread is blocked waiting for the result.  In that 
case
   // we want to unblock the main thread.
   firstMessageReceived.countDown(); // TODO:  Why not call 
releaseIfFirst as used elsewhere?
+  //Stopping timeout clock
--- End diff --

Just wrapping up any 'running' resources. 


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149545534
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, 
InterruptedException {
 }
 return qdb;
   }
+
+  // Check and throw SQLTimeoutException
+  if ( parent.timeoutInMilliseconds > 0 && 
parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) 
{
--- End diff --

Darn! +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149543163
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -376,6 +417,19 @@ synchronized void cleanup() {
 currentBatchHolder.clear();
   }
 
+  long getTimeoutInMilliseconds() {
+return timeoutInMilliseconds;
+  }
+
+  //Set the cursor's timeout in seconds
+  void setTimeout(int timeoutDurationInSeconds){
+this.timeoutInMilliseconds = timeoutDurationInSeconds*1000L;
--- End diff --

Preferably use `TimeUnit.SECONDS.toMillis(timeoutDurationInSeconds)` to 
avoid magic constants


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149542468
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -139,8 +147,22 @@ private boolean stopThrottlingIfSo() {
   return stopped;
 }
 
-public void awaitFirstMessage() throws InterruptedException {
-  firstMessageReceived.await();
+public void awaitFirstMessage() throws InterruptedException, 
SQLTimeoutException {
+  //Check if a non-zero timeout has been set
+  if ( parent.timeoutInMilliseconds > 0 ) {
+//Identifying remaining in milliseconds to maintain a granularity 
close to integer value of timeout
+long timeToTimeout = (parent.timeoutInMilliseconds) - 
parent.elapsedTimer.elapsed(TimeUnit.MILLISECONDS);
+if ( timeToTimeout > 0 ) {
--- End diff --

maybe a style issue, but to avoid code duplication both conditions could be 
checked together?
```
if ( timeToTimeout <= 0 || !firstMessageReceived.await(timeToTimeout, 
TimeUnit.MILLISECONDS) ) {
  throw new 
SqlTimeoutException(TimeUnit.MILLISECONDS.toSeconds(parent.timeoutInMilliseconds));
}
```


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149543078
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -333,8 +368,14 @@ void close() {
 final int batchQueueThrottlingThreshold =
 client.getConfig().getInt(
 ExecConstants.JDBC_BATCH_QUEUE_THROTTLING_THRESHOLD );
-resultsListener = new ResultsListener(batchQueueThrottlingThreshold);
+resultsListener = new ResultsListener(this, 
batchQueueThrottlingThreshold);
 currentBatchHolder = new RecordBatchLoader(client.getAllocator());
+try {
+  setTimeout(this.statement.getQueryTimeout());
+} catch (SQLException e) {
+  // Printing any unexpected SQLException stack trace
+  e.printStackTrace();
--- End diff --

two choices here:
- we don't think it's important if we cannot get the value, so we should 
log it properly and not simply dump the exception
- we think this is important, and we propagate the exception to the caller

(I think it is important: the most likely reason why we could not get the 
value if that the statement was closed, and we should probably notify the user 
about it).


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149542720
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -260,6 +288,10 @@ void close() {
   // when the main thread is blocked waiting for the result.  In that 
case
   // we want to unblock the main thread.
   firstMessageReceived.countDown(); // TODO:  Why not call 
releaseIfFirst as used elsewhere?
+  //Stopping timeout clock
--- End diff --

since we are closing, do we need to care about the stopwatch?


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149543581
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -96,6 +105,14 @@ private void throwIfClosed() throws 
AlreadyClosedSqlException,
 throw new AlreadyClosedSqlException( "ResultSet is already 
closed." );
   }
 }
+
+//Implicit check for whether timeout is set
+if (elapsedTimer != null) {
--- End diff --

maybe an helper method from the cursor to see if we timed out instead of 
exposing elapsedTimer?

I'm not sure if this is really necessary (posted another comment about it 
previously), except maybe because of unit tests where it's hard to time out 
inside the cursor?

I did a prototype too and used control injection to pause the screen 
operator: the test would look like this:
```
  /**
   * Test setting timeout for a query that actually times out
   */
  @Test ( expected = SqlTimeoutException.class )
  public void testTriggeredQueryTimeout() throws SQLException {
// Prevent the server to complete the query to trigger a timeout
final String controls = Controls.newBuilder()
  .addPause(ScreenCreator.class, "send-complete", 0)
  .build();

try(Statement statement = connection.createStatement()) {
  assertThat(
  statement.execute(String.format(
  "ALTER session SET `%s` = '%s'",
  ExecConstants.DRILLBIT_CONTROL_INJECTIONS,
  controls)),
  equalTo(true));
}
String queryId = null;
try(Statement statement = connection.createStatement()) {
  int timeoutDuration = 3;
  //Setting to a very low value (3sec)
  statement.setQueryTimeout(timeoutDuration);
  ResultSet rs = statement.executeQuery(SYS_VERSION_SQL);
  queryId = ((DrillResultSet) rs).getQueryId();
  //Fetch rows
  while (rs.next()) {
rs.getBytes(1);
  }
} catch (SQLException sqlEx) {
  if (sqlEx instanceof SqlTimeoutException) {
throw (SqlTimeoutException) sqlEx;
  }
} finally {
  // Do not forget to unpause to avoid memory leak.
  if (queryId != null) {
DrillClient drillClient = ((DrillConnection) 
connection).getClient();

drillClient.resumeQuery(QueryIdHelper.getQueryIdFromString(queryId));
  }
  }
```

Works for PreparedStatementTest too, need to make sure you pause after 
prepared statement is created but before it is executed.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149542622
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -239,6 +261,11 @@ QueryDataBatch getNext() throws UserException, 
InterruptedException {
 }
 return qdb;
   }
+
+  // Check and throw SQLTimeoutException
+  if ( parent.timeoutInMilliseconds > 0 && 
parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInMilliseconds ) 
{
--- End diff --

wrong unit for the comparison (should be millis)


---


[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

2017-11-07 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1024
  
@laurentgo Done the changes... ready for review. 


---


[GitHub] drill issue #1015: DRILL-5899: Simple pattern matchers can work with DrillBu...

2017-11-07 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1015
  
@paul-rogers Thanks a lot for the review. Updated the PR with code review 
comments. Please take a look. 

Overall, good improvement with this change. Here are the numbers.

select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where 
l_comment like '%a' 
1.4 sec vs 7 sec

select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where 
l_comment like '%a%' 
6.5 sec vs 13.5 sec

select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where 
l_comment like 'a%' 
1.4 sec vs 5.8 sec

select count(*) from `/Users/ppenumarthy/MAPRTECH/padma/testdata` where 
l_comment like 'a' 
1.1.65 sec vs 5.8 sec


I think for "contains", improvement is not as much as others, probably 
because of nested for loops. @sachouche changes on top of these changes can 
improve further. 



---


DRILL-4364 Image Metadata Format Plugin

2017-11-07 Thread Charles Givre
Hello all, 
I’d really love to see this PR make it into the next version of Drill.  Is 
there anything that I could do to assist?
Thanks,
- C

[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149506412
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
+  // TODO: handle exception
+  assertThat( e.getMessage(), containsString( "illegal timeout value") 
);
+  //Converting this to match expected Exception
+  throw new InvalidParameterSqlException(e.getMessage());
--- End diff --

Yep. Agree. Was trying to make use of the large number of 
`###SqlException`s defined within the Drill JDBC package. Will fix this. +1


---


[jira] [Resolved] (DRILL-5138) TopN operator on top of ~110 GB data set is very slow

2017-11-07 Thread Timothy Farkas (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Farkas resolved DRILL-5138.
---
Resolution: Fixed

> TopN operator on top of ~110 GB data set is very slow
> -
>
> Key: DRILL-5138
> URL: https://issues.apache.org/jira/browse/DRILL-5138
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Relational Operators
>Reporter: Rahul Challapalli
>Assignee: Timothy Farkas
>
> git.commit.id.abbrev=cf2b7c7
> No of cores : 23
> No of disks : 5
> DRILL_MAX_DIRECT_MEMORY="24G"
> DRILL_MAX_HEAP="12G"
> The below query ran for more than 4 hours and did not complete. The table is 
> ~110 GB
> {code}
> select * from catalog_sales order by cs_quantity, cs_wholesale_cost limit 1;
> {code}
> Physical Plan :
> {code}
> 00-00Screen : rowType = RecordType(ANY *): rowcount = 1.0, cumulative 
> cost = {1.00798629141E10 rows, 4.17594320691E10 cpu, 0.0 io, 
> 4.1287118487552E13 network, 0.0 memory}, id = 352
> 00-01  Project(*=[$0]) : rowType = RecordType(ANY *): rowcount = 1.0, 
> cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 
> 4.1287118487552E13 network, 0.0 memory}, id = 351
> 00-02Project(T0¦¦*=[$0]) : rowType = RecordType(ANY T0¦¦*): rowcount 
> = 1.0, cumulative cost = {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 
> 4.1287118487552E13 network, 0.0 memory}, id = 350
> 00-03  SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, ANY 
> cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = 
> {1.0079862914E10 rows, 4.1759432069E10 cpu, 0.0 io, 4.1287118487552E13 
> network, 0.0 memory}, id = 349
> 00-04Limit(fetch=[1]) : rowType = RecordType(ANY T0¦¦*, ANY 
> cs_quantity, ANY cs_wholesale_cost): rowcount = 1.0, cumulative cost = 
> {1.0079862913E10 rows, 4.1759432068E10 cpu, 0.0 io, 4.1287118487552E13 
> network, 0.0 memory}, id = 348
> 00-05  SingleMergeExchange(sort0=[1 ASC], sort1=[2 ASC]) : 
> rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost): 
> rowcount = 1.439980416E9, cumulative cost = {1.0079862912E10 rows, 
> 4.1759432064E10 cpu, 0.0 io, 4.1287118487552E13 network, 0.0 memory}, id = 347
> 01-01SelectionVectorRemover : rowType = RecordType(ANY T0¦¦*, 
> ANY cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative 
> cost = {8.639882496E9 rows, 3.0239588736E10 cpu, 0.0 io, 2.3592639135744E13 
> network, 0.0 memory}, id = 346
> 01-02  TopN(limit=[1]) : rowType = RecordType(ANY T0¦¦*, ANY 
> cs_quantity, ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative 
> cost = {7.19990208E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 
> network, 0.0 memory}, id = 345
> 01-03Project(T0¦¦*=[$0], cs_quantity=[$1], 
> cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, 
> ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = 
> {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, 
> 0.0 memory}, id = 344
> 01-04  HashToRandomExchange(dist0=[[$1]], dist1=[[$2]]) : 
> rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY 
> E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = 
> {5.759921664E9 rows, 2.879960832E10 cpu, 0.0 io, 2.3592639135744E13 network, 
> 0.0 memory}, id = 343
> 02-01UnorderedMuxExchange : rowType = RecordType(ANY 
> T0¦¦*, ANY cs_quantity, ANY cs_wholesale_cost, ANY 
> E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, cumulative cost = 
> {4.319941248E9 rows, 1.1519843328E10 cpu, 0.0 io, 0.0 network, 0.0 memory}, 
> id = 342
> 03-01  Project(T0¦¦*=[$0], cs_quantity=[$1], 
> cs_wholesale_cost=[$2], E_X_P_R_H_A_S_H_F_I_E_L_D=[hash32AsDouble($2, 
> hash32AsDouble($1))]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, ANY 
> cs_wholesale_cost, ANY E_X_P_R_H_A_S_H_F_I_E_L_D): rowcount = 1.439980416E9, 
> cumulative cost = {2.879960832E9 rows, 1.0079862912E10 cpu, 0.0 io, 0.0 
> network, 0.0 memory}, id = 341
> 03-02Project(T0¦¦*=[$0], cs_quantity=[$1], 
> cs_wholesale_cost=[$2]) : rowType = RecordType(ANY T0¦¦*, ANY cs_quantity, 
> ANY cs_wholesale_cost): rowcount = 1.439980416E9, cumulative cost = 
> {1.439980416E9 rows, 4.319941248E9 cpu, 0.0 io, 0.0 network, 0.0 memory}, id 
> = 340
> 03-03  Scan(groupscan=[ParquetGroupScan 
> [entries=[ReadEntryWithPath 
> [path=maprfs:///drill/testdata/tpcds/parquet/sf1000/catalog_sales]], 
> selectionRoot=maprfs:/drill/testdata/tpcds/parquet/sf1000/catalog_sales, 
> numFiles=1, usedMetadataFile=false, columns=[`*`]]]) : rowType = 
> (DrillRecordRow[*, cs_quantity, cs_wholesale_cost]): 

[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149477955
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -376,6 +415,19 @@ synchronized void cleanup() {
 currentBatchHolder.clear();
   }
 
+  //Set the cursor's timeout in seconds
--- End diff --

you just need to get the value when the query is executed (in DrillCursor) 
once to make sure the timeout doesn't change (that and StopWatch being managed 
by DrillCursor too.

Also, it is subject to interpretation but it seems the intent of the API is 
to time bound how much time it takes the query to complete. I'm not sure it is 
necessary to make the extra work of having a slow client reading the result set 
data although all data has already been read by the driver from the server (and 
from the server point of view, the query is completed).


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149477222
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
+  // TODO: handle exception
+  assertThat( e.getMessage(), containsString( "illegal timeout value") 
);
+  //Converting this to match expected Exception
+  throw new InvalidParameterSqlException(e.getMessage());
--- End diff --

but you did since you catch the exception and do a check on the message. 
Rewrapping it so that the test framework can check the new type has no value.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149477233
  
--- Diff: exec/jdbc/src/test/java/org/apache/drill/jdbc/StatementTest.java 
---
@@ -61,55 +71,129 @@ public static void tearDownStatement() throws 
SQLException {
   //
   // getQueryTimeout():
 
-  /** Tests that getQueryTimeout() indicates no timeout set. */
+  /**
+   * Test for reading of default query timeout
+   */
   @Test
-  public void testGetQueryTimeoutSaysNoTimeout() throws SQLException {
-assertThat( statement.getQueryTimeout(), equalTo( 0 ) );
+  public void testDefaultGetQueryTimeout() throws SQLException {
+Statement stmt = connection.createStatement();
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
--- End diff --

+1 


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149476820
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
+  // TODO: handle exception
+  assertThat( e.getMessage(), containsString( "illegal timeout value") 
);
+  //Converting this to match expected Exception
+  throw new InvalidParameterSqlException(e.getMessage());
+}
+  }
+
+  /**
+   * Test setting a valid timeout
+   */
+  @Test
+  public void testValidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting positive value
+int valueToSet = new Random(System.currentTimeMillis()).nextInt(60);
+if (0L == valueToSet) {
+  valueToSet++;
+}
+stmt.setQueryTimeout(valueToSet);
+assert( valueToSet == stmt.getQueryTimeout() );
+  }
+
+  /**
+   * Test setting timeout as zero and executing
+   */
+  @Test
+  public void testSetQueryTimeoutAsZero() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_RANDOM_SQL);
+stmt.setQueryTimeout(0);
+stmt.executeQuery();
+ResultSet rs = stmt.getResultSet();
+int rowCount = 0;
+while (rs.next()) {
+  rs.getBytes(1);
+  rowCount++;
+}
+stmt.close();
+assert( 3 == rowCount );
+  }
+
+  /**
+   * Test setting timeout for a query that actually times out
+   */
+  @Test ( expected = SQLTimeoutException.class )
+  public void testTriggeredQueryTimeout() throws SQLException {
+PreparedStatement stmt = null;
+//Setting to a very low value (3sec)
+int timeoutDuration = 3;
+int rowsCounted = 0;
+try {
+  stmt = connection.prepareStatement(SYS_RANDOM_SQL);
+  stmt.setQueryTimeout(timeoutDuration);
+  System.out.println("Set a timeout of "+ stmt.getQueryTimeout() +" 
seconds");
--- End diff --

I think I previously came across some unit tests that are using System.out 
instead of logger, so i figured there wasn't any preference. Logger is probably 
the cleaner way of doing things. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149476740
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -66,11 +70,27 @@
   private final DrillConnectionImpl connection;
   private volatile boolean hasPendingCancelationNotification = false;
 
+  private Stopwatch elapsedTimer;
+
+  private int queryTimeoutInSeconds;
+
   DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature,
  ResultSetMetaData resultSetMetaData, TimeZone 
timeZone,
  Meta.Frame firstFrame) {
 super(statement, signature, resultSetMetaData, timeZone, firstFrame);
 connection = (DrillConnectionImpl) statement.getConnection();
+try {
+  if (statement.getQueryTimeout() > 0) {
+queryTimeoutInSeconds = statement.getQueryTimeout();
+  }
+} catch (Exception e) {
+  e.printStackTrace();
--- End diff --

I think so


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149476642
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ---
@@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl 
connection,
 if (preparedStatementHandle != null) {
   ((DrillColumnMetaDataList) 
signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList());
 }
+//Implicit query timeout
+this.queryTimeoutInSeconds = 0;
+this.elapsedTimer = Stopwatch.createUnstarted();
--- End diff --

not even true for a statement: you can execute multiple queries, but the 
previous resultset will be closed and a new cursor created...


---


[jira] [Created] (DRILL-5942) Drill Resource Management

2017-11-07 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-5942:
-

 Summary: Drill Resource Management
 Key: DRILL-5942
 URL: https://issues.apache.org/jira/browse/DRILL-5942
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas


OutOfMemoryExceptions still occur in Drill. This ticket address a plan for what 
it would take to ensure all queries are able to execute on Drill without 
running out of memory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149476190
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
+  // TODO: handle exception
+  assertThat( e.getMessage(), containsString( "illegal timeout value") 
);
+  //Converting this to match expected Exception
+  throw new InvalidParameterSqlException(e.getMessage());
+}
+  }
+
+  /**
+   * Test setting a valid timeout
+   */
+  @Test
+  public void testValidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting positive value
+int valueToSet = new Random(System.currentTimeMillis()).nextInt(60);
--- End diff --

I am trying to add some randomness to the test parameters, since the 
expected behaviour should be the same. I'll fix this up and get rid of that 
check.  +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149475535
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
+  // TODO: handle exception
+  assertThat( e.getMessage(), containsString( "illegal timeout value") 
);
+  //Converting this to match expected Exception
+  throw new InvalidParameterSqlException(e.getMessage());
--- End diff --

Wanted to make sure that the unit test also reports the correct exception. 
This only rewraps the thrown SQLException to an 
InvalidParameterSqlException for JUnit to confirm.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149475115
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
+  valueToSet--;
+}
+try {
+  stmt.setQueryTimeout(valueToSet);
+} catch ( final Exception e) {
--- End diff --

Yes, it should be. Might be a legacy code. Will fix it. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149474798
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+int timeoutValue = stmt.getQueryTimeout();
+assert( 0 == timeoutValue );
+  }
+
+  /**
+   * Test Invalid parameter by giving negative timeout
+   */
+  @Test ( expected = InvalidParameterSqlException.class )
+  public void testInvalidSetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
+//Setting negative value
+int valueToSet = -10;
+if (0L == valueToSet) {
--- End diff --

My bad. The original code would assign the negation of a random integer.,.. 
hence the check for 0L and followed by a decrement. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149474455
  
--- Diff: 
exec/jdbc/src/test/java/org/apache/drill/jdbc/PreparedStatementTest.java ---
@@ -237,6 +245,127 @@ public String toString() {
 }
   }
 
+  /**
+   * Test for reading of default query timeout
+   */
+  @Test
+  public void testDefaultGetQueryTimeout() throws SQLException {
+PreparedStatement stmt = connection.prepareStatement(SYS_VERSION_SQL);
--- End diff --

+1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149473999
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -96,6 +117,13 @@ private void throwIfClosed() throws 
AlreadyClosedSqlException,
 throw new AlreadyClosedSqlException( "ResultSet is already 
closed." );
   }
 }
+
+//Query Timeout Check. The timer has already been started by the 
DrillCursor at this point
--- End diff --

This code block gets touched even if there is no timeout set, hence the 
check to implicitly confirm if there is a timeout set.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149473521
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -66,11 +70,27 @@
   private final DrillConnectionImpl connection;
   private volatile boolean hasPendingCancelationNotification = false;
 
+  private Stopwatch elapsedTimer;
+
+  private int queryTimeoutInSeconds;
+
   DrillResultSetImpl(AvaticaStatement statement, Meta.Signature signature,
  ResultSetMetaData resultSetMetaData, TimeZone 
timeZone,
  Meta.Frame firstFrame) {
 super(statement, signature, resultSetMetaData, timeZone, firstFrame);
 connection = (DrillConnectionImpl) statement.getConnection();
+try {
+  if (statement.getQueryTimeout() > 0) {
+queryTimeoutInSeconds = statement.getQueryTimeout();
+  }
+} catch (Exception e) {
+  e.printStackTrace();
--- End diff --

Guess I was not sure what am I to do if `getQueryTImeout()` threw an 
Exception. Didn't want to lose the stack trace. Should I just ignore it?


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149472806
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ---
@@ -61,8 +65,14 @@ protected DrillPreparedStatementImpl(DrillConnectionImpl 
connection,
 if (preparedStatementHandle != null) {
   ((DrillColumnMetaDataList) 
signature.columns).updateColumnMetaData(preparedStatementHandle.getColumnsList());
 }
+//Implicit query timeout
+this.queryTimeoutInSeconds = 0;
+this.elapsedTimer = Stopwatch.createUnstarted();
--- End diff --

I thought the Statement and Cursor had a 1:1 relationship, so they can 
share the timer. I guess for a PreparedStatement I cannot make that assumption. 
Will fix this. +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149472887
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillResultSetImpl.java ---
@@ -66,11 +70,27 @@
   private final DrillConnectionImpl connection;
   private volatile boolean hasPendingCancelationNotification = false;
 
+  private Stopwatch elapsedTimer;
--- End diff --

+1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149471937
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillPreparedStatementImpl.java
 ---
@@ -46,6 +48,8 @@
DrillRemoteStatement {
 
   private final PreparedStatement preparedStatementHandle;
+  int queryTimeoutInSeconds = 0;
--- End diff --

Ah.. i missed this during the clean up. Thanks! +1


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149471249
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -100,13 +103,17 @@
 final LinkedBlockingDeque batchQueue =
 Queues.newLinkedBlockingDeque();
 
+private final DrillCursor parent;
--- End diff --

Stopwatch seemed a convenient way of visualizing a timer object that is 
passed between different JDBC entities, and also provides a clean way of 
specifying elapsed time, etc. 


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149467572
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -376,6 +415,19 @@ synchronized void cleanup() {
 currentBatchHolder.clear();
   }
 
+  //Set the cursor's timeout in seconds
--- End diff --

We do get the timeout value from the Statement (Ref: 
https://github.com/kkhatua/drill/blob/a008707c7b97ea95700ab0f2eb5182d779a9bcb3/exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java#L372
 )
However, the Statement is referred to by the ResultSet object as well, to 
get a handle of the timer object. During testing, I found that there is a 
possibility that the DrillCursor completes fetching all batches, but a slow 
client would call ResultSet.next() slowly and time out. The ResultSet object 
has no reference to the timer, except via the Statement object.
There is a bigger problem that this block of code fixes. During iteration, 
we don't want to be able to change the timeout period. Hence, the DrillCursor 
(invoked by the _first_ `ResultSet.next()` call) will be initialized and set 
the timer to start ticking.Thereafter, any attempt to change the timeout can be 
ignored.


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149465309
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, 
InterruptedException {
 }
 return qdb;
   }
+
+  // Check and throw SQLTimeoutException
+  if ( parent.timeoutInSeconds > 0 && 
parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) {
--- End diff --

you don't really need a check after the pool: if it's not null, it means it 
completed before timeout and you can proceed forward. If it is null, then you 
would loop and redo the check based on the current time and might be able to 
throw a timeout exception


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-07 Thread kkhatua
Github user kkhatua commented on a diff in the pull request:

https://github.com/apache/drill/pull/1024#discussion_r149463638
  
--- Diff: 
exec/jdbc/src/main/java/org/apache/drill/jdbc/impl/DrillCursor.java ---
@@ -239,6 +259,11 @@ QueryDataBatch getNext() throws UserException, 
InterruptedException {
 }
 return qdb;
   }
+
+  // Check and throw SQLTimeoutException
+  if ( parent.timeoutInSeconds > 0 && 
parent.elapsedTimer.elapsed(TimeUnit.SECONDS) >= parent.timeoutInSeconds ) {
--- End diff --

Good point, and I thought it might help in avoiding going into polling all 
together.  
However, the granularity of the timeout is in seconds, so 50ms is 
insignificant. If I do a check before the poll, I'd need to do after the poll 
as well.. over a 50ms window. So, a post-poll check works fine, because we'll, 
at most, exceed the timeout by 50ms. So a timeout of 1sec would occur in 
1.05sec. For any larger timeout values, the 50ms is of diminishing significance.


---


[GitHub] drill pull request #904: DRILL-5717: change some date time test cases with s...

2017-11-07 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/904#discussion_r149440034
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/fn/interp/TestConstantFolding.java
 ---
@@ -117,6 +123,13 @@ public void createFiles(int smallFileLines, int 
bigFileLines) throws Exception{
 
   @Test
   public void testConstantFolding_allTypes() throws Exception {
+new MockUp() {
--- End diff --

I meant to use the same method in all tests where Locale should be mocked. 
Please replace it.


---


[GitHub] drill pull request #904: DRILL-5717: change some date time test cases with s...

2017-11-07 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/904#discussion_r149440814
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/TestCastFunctions.java
 ---
@@ -77,16 +82,23 @@ public void testCastByConstantFolding() throws 
Exception {
 
   @Test // DRILL-3769
   public void testToDateForTimeStamp() throws Exception {
-final String query = "select to_date(to_timestamp(-1)) as col \n" +
-"from (values(1))";
+new MockUp() {
--- End diff --

This mock is used twice. So let's also move the code into the separate 
method.


---


[GitHub] drill pull request #904: DRILL-5717: change some date time test cases with s...

2017-11-07 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/904#discussion_r149440392
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/fn/impl/testing/TestDateConversions.java
 ---
@@ -225,4 +243,16 @@ public void testPostgresDateFormatError() throws 
Exception {
   throw e;
 }
   }
+
+  /**
+   * mock current locale to US
+   */
+  private void mockUSLocale() {
--- End diff --

Please move this method into ExecTest class and make it public and static.


---


[jira] [Created] (DRILL-5941) Skip header / footer logic works incorrectly for Hive tables when file has several input splits

2017-11-07 Thread Arina Ielchiieva (JIRA)
Arina Ielchiieva created DRILL-5941:
---

 Summary: Skip header / footer logic works incorrectly for Hive 
tables when file has several input splits
 Key: DRILL-5941
 URL: https://issues.apache.org/jira/browse/DRILL-5941
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Hive
Affects Versions: 1.11.0
Reporter: Arina Ielchiieva
Assignee: Arina Ielchiieva
 Fix For: 1.12.0


*To reproduce*
1. Create csv file with two columns (key, value) for 329 rows, where first 
row is a header.
The data file has size of should be greater than chunk size of 256 MB. Copy 
file to the distributed file system.

2. Create table in Hive:
{noformat}
CREATE EXTERNAL TABLE `h_table`(
  `key` bigint,
  `value` string)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS INPUTFORMAT
  'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'maprfs:/tmp/h_table'
TBLPROPERTIES (
 'skip.header.line.count'='1');
{noformat}

3. Execute query {{select * from hive.h_table}} in Drill (query data using Hive 
plugin). The result will return less rows then expected. Expected result is 
328 (total count minus one row as header).

*The root cause*
Since file is greater than default chunk size, it's split into several 
fragments, known as input splits. For example:
{noformat}
maprfs:/tmp/h_table/h_table.csv:0+268435456
maprfs:/tmp/h_table/h_table.csv:268435457+492782112
{noformat}

TextHiveReader is responsible for handling skip header and / or footer logic.
Currently Drill creates reader [for each input 
split|https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveScanBatchCreator.java#L84]
 and skip header and /or footer logic is applied for each input splits, though 
ideally the above mentioned input splits should have been read by one reader, 
so skip / header footer logic was applied correctly.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5940) Avro with schema registry support for Kafka

2017-11-07 Thread B Anil Kumar (JIRA)
B Anil Kumar created DRILL-5940:
---

 Summary: Avro with schema registry support for Kafka
 Key: DRILL-5940
 URL: https://issues.apache.org/jira/browse/DRILL-5940
 Project: Apache Drill
  Issue Type: New Feature
  Components: Storage - Other
Reporter: B Anil Kumar
Assignee: Bhallamudi Venkata Siva Kamesh


Support Avro messages with Schema registry for Kafka storage plugin



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5939) NullPointerException in convert_toJSON function

2017-11-07 Thread Volodymyr Tkach (JIRA)
Volodymyr Tkach created DRILL-5939:
--

 Summary: NullPointerException in convert_toJSON function
 Key: DRILL-5939
 URL: https://issues.apache.org/jira/browse/DRILL-5939
 Project: Apache Drill
  Issue Type: Bug
Reporter: Volodymyr Tkach
Priority: Minor


The query: `select convert_toJSON(convert_fromJSON('\{"key": "value"\}')) from 
(values(1));` fails with exception.
Although, when we apply it for the data from the file from disk it succeeds.
select convert_toJSON(convert_fromJSON(columns\[0\])) from dfs.tmp.`some.csv`;
Some.csv
\{"key":"val"\},val2

{noformat}
Fragment 0:0

[Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: 
NullPointerException

Fragment 0:0

[Error Id: 016ca995-16f9-4eab-83c2-7679071faad4 on userf206-pc:31010]
at 
org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586)
 ~[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:298)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267)
 [drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) 
[drill-common-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
[na:1.7.0_80]
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
[na:1.7.0_80]
at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
Caused by: java.lang.NullPointerException: null
at 
org.apache.drill.exec.expr.fn.DrillFuncHolder.addProtectedBlock(DrillFuncHolder.java:183)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.DrillFuncHolder.generateBody(DrillFuncHolder.java:169)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.fn.DrillSimpleFuncHolder.renderEnd(DrillSimpleFuncHolder.java:86)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitFunctionHolderExpression(EvaluationVisitor.java:205)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitFunctionHolderExpression(EvaluationVisitor.java:1089)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:827)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitFunctionHolderExpression(EvaluationVisitor.java:807)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.common.expression.FunctionHolderExpression.accept(FunctionHolderExpression.java:53)
 ~[drill-logical-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitValueVectorWriteExpression(EvaluationVisitor.java:362)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$EvalVisitor.visitUnknown(EvaluationVisitor.java:344)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$ConstantFilter.visitUnknown(EvaluationVisitor.java:1339)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:1038)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor$CSEFilter.visitUnknown(EvaluationVisitor.java:807)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.ValueVectorWriteExpression.accept(ValueVectorWriteExpression.java:64)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.EvaluationVisitor.addExpr(EvaluationVisitor.java:104)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.expr.ClassGenerator.addExpr(ClassGenerator.java:335) 
~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.setupNewSchemaFromInput(ProjectRecordBatch.java:476)
 ~[drill-java-exec-1.12.0-SNAPSHOT.jar:1.12.0-SNAPSHOT]
at 

[jira] [Created] (DRILL-5938) Write unit tests for math function with NaN and Infinity numbers

2017-11-07 Thread Volodymyr Tkach (JIRA)
Volodymyr Tkach created DRILL-5938:
--

 Summary: Write unit tests for math function with NaN and Infinity 
numbers 
 Key: DRILL-5938
 URL: https://issues.apache.org/jira/browse/DRILL-5938
 Project: Apache Drill
  Issue Type: Test
Reporter: Volodymyr Tkach
Priority: Minor
 Fix For: Future


Drill math function needs to be covered with test cases when input is 
non-numeric numbers: NaN or Infinity.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1026: DRILL-5919: Add session option to allow json reade...

2017-11-07 Thread vladimirtkach
GitHub user vladimirtkach opened a pull request:

https://github.com/apache/drill/pull/1026

DRILL-5919: Add session option to allow json reader/writer to work with 
NaN,INF

 Added two session options `store.json.reader.non_numeric_numbers` and 
`store.json.reader.non_numeric_numbers` that allow to read/write `NaN` and 
`Infinity` as numbers

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vladimirtkach/drill DRILL-5919

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1026.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1026


commit 0e972bac9d472f6681e6f16d232f61e6d0bfcb44
Author: Volodymyr Tkach 
Date:   2017-11-03T16:13:29Z

DRILL-5919: Add session option to allow json reader/writer to work with 
NaN,INF




---


[GitHub] drill pull request #1020: DRILL-5921: Display counter metrics in table

2017-11-07 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1020#discussion_r149349985
  
--- Diff: exec/java-exec/src/main/resources/rest/metrics/metrics.ftl ---
@@ -138,21 +154,14 @@
   });
 };
 
-function updateOthers(metrics) {
-  $.each(["counters", "meters"], function(i, key) {
-if(! $.isEmptyObject(metrics[key])) {
-  $("#" + key + "Val").html(JSON.stringify(metrics[key], null, 2));
-}
-  });
-};
-
 var update = function() {
   $.get("/status/metrics", function(metrics) {
 updateGauges(metrics.gauges);
 updateBars(metrics.gauges);
 if(! $.isEmptyObject(metrics.timers)) createTable(metrics.timers, 
"timers");
 if(! $.isEmptyObject(metrics.histograms)) 
createTable(metrics.histograms, "histograms");
-updateOthers(metrics);
+if(! $.isEmptyObject(metrics.counters)) 
createCountersTable(metrics.counters);
+if(! $.isEmptyObject(metrics.meters)) 
$("#metersVal").html(JSON.stringify(metrics.meters, null, 2));
--- End diff --

@prasadns14 
1. Please add two screenshots before and after the changes.
2. Can you please think of the way to make create table generic so can be 
used for timers, histograms and counters?
3. What about meters? How they are displayed right now? Maybe we need to 
display them in table as well?
Ideally, we can display all metrics in the same way.


---


[GitHub] drill issue #1021: DRILL-5923: Display name for query state

2017-11-07 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1021
  
@prasadns14 as far as I understood, you made all these changes to replace 
`completed` with `succeeded`. What if you just make changes in State enum 
itself, refactor some code and thus no changes in rest part will be required? 
From UserBitShared.proto
```
enum QueryState {
  STARTING = 0; // query has been scheduled for execution. This is 
post-enqueued.
  RUNNING = 1;
  COMPLETED = 2; // query has completed successfully
  CANCELED = 3; // query has been cancelled, and all cleanup is complete
  FAILED = 4;
  CANCELLATION_REQUESTED = 5; // cancellation has been requested, and 
is being processed
  ENQUEUED = 6; // query has been enqueued. this is pre-starting.
}
```
After the renaming, please don't forget to regenerate protobuf.


---


[jira] [Resolved] (DRILL-5746) Pcap PR manually edited Protobuf files, values lost on next build

2017-11-07 Thread Arina Ielchiieva (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva resolved DRILL-5746.
-
Resolution: Fixed

In the scope of DRILL-5716.

> Pcap PR manually edited Protobuf files, values lost on next build
> -
>
> Key: DRILL-5746
> URL: https://issues.apache.org/jira/browse/DRILL-5746
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.10.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
> Fix For: 1.12.0
>
>
> Drill recently accepted the pcap format plugin. As part of that work, the 
> author added a new operator type, {{PCAP_SUB_SCAN_VALUE}}.
> But, apparently this was done by editing the generated Protobuf files to add 
> the values, rather than modifying the protobuf definitions and rebuilding the 
> generated files. The result is, on the next build of the Protobuf sources, 
> the following compile error appears:
> {code}
> [ERROR] 
> /Users/paulrogers/git/drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/pcap/PcapFormatPlugin.java:[80,41]
>  error: cannot find symbol
> [ERROR] symbol:   variable PCAP_SUB_SCAN_VALUE
> [ERROR] location: class CoreOperatorType
> {code}
> The solution is to properly edit the Protobuf definitions with the required 
> symbol.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5937) prepare.statement.create_timeout_ms default is 10 seconds but code comment says default should be 10 mins

2017-11-07 Thread Pushpendra Jaiswal (JIRA)
Pushpendra Jaiswal created DRILL-5937:
-

 Summary:  prepare.statement.create_timeout_ms default is 10 
seconds but code comment says default should be 10 mins
 Key: DRILL-5937
 URL: https://issues.apache.org/jira/browse/DRILL-5937
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.8.0
Reporter: Pushpendra Jaiswal


prepare.statement.create_timeout_ms default is 10 seconds but code comment says 
default should be 10 mins
The value is by default set to 1 ms 

https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/ExecConstants.java#L526
 

/**
   * Timeout for create prepare statement request. If the request exceeds this 
timeout, then request is timed out.
   * Default value is 10mins.
   */





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)