[jira] [Created] (DRILL-5932) HiveServer2 queries throw error with jdbc connection

2017-11-03 Thread tooptoop4 (JIRA)
tooptoop4 created DRILL-5932:


 Summary: HiveServer2 queries throw error with jdbc connection
 Key: DRILL-5932
 URL: https://issues.apache.org/jira/browse/DRILL-5932
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC, Storage - Hive
Affects Versions: 1.11.0
 Environment: linux
2.3 hive version
Reporter: tooptoop4
Priority: Blocker


Basic hive queries all throw error!

{code:sql}

copied 
https://repo1.maven.org/maven2/org/apache/hive/hive-jdbc/2.3.0/hive-jdbc-2.3.0-standalone.jar
 to /usr/lib/apache-drill-1.11.0/jars/3rdparty/hive-jdbc-2.3.0-standalone.jar

added this storage plugin:

{
  "type": "jdbc",
  "driver": "org.apache.hive.jdbc.HiveDriver",
  "url": "jdbc:hive2://host:1/default",
  "username": "hive",
  "password": "hive1234",
  "enabled": true
}

[ec2-user@host ~]$ cd /usr/lib/apache-drill-1.11.0
[ec2-user@host apache-drill-1.11.0]$ ./bin/drill-embedded
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was 
removed in 8.0
Nov 01, 2017 7:53:53 AM org.glassfish.jersey.server.ApplicationHandler 
initialize
INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...
apache drill 1.11.0
"this isn't your grandfather's sql"
0: jdbc:drill:zk=local> SELECT count(*) FROM hive2.`contact`;
Error: DATA_READ ERROR: The JDBC storage plugin failed while trying setup the 
SQL query.

sql SELECT COUNT(*) AS EXPR$0
FROM (SELECT 0 AS $f0
FROM.default.contact) AS t
plugin hive2
Fragment 0:0

[Error Id: 4b293e97-7547-49c5-91da-b9ee2f2184fc on 
ip-myip.mydomain.orghere.com:31010] (state=,code=0)




0: jdbc:drill:zk=local> ALTER SESSION SET `exec.errors.verbose` = true;
+---+---+
|  ok   |summary|
+---+---+
| true  | exec.errors.verbose updated.  |
+---+---+
1 row selected (0.351 seconds)
0: jdbc:drill:zk=local> SELECT count(*) FROM hive2.`contact`;
Error: DATA_READ ERROR: The JDBC storage plugin failed while trying setup the 
SQL query.

sql SELECT COUNT(*) AS EXPR$0
FROM (SELECT 0 AS $f0
FROM.default.contact) AS t
plugin hive2
Fragment 0:0

[Error Id: ac5cc8f2-69a4-455e-b1a6-5d22cde99729 on 
ip-myip.mydomain.orghere.com:31010]

  (org.apache.hive.service.cli.HiveSQLException) Error while compiling 
statement: FAILED: ParseException line 1:23 missing EOF at '$' near 'EXPR'
org.apache.hive.jdbc.Utils.verifySuccess():267
org.apache.hive.jdbc.Utils.verifySuccessWithInfo():253
org.apache.hive.jdbc.HiveStatement.runAsyncOnServer():313
org.apache.hive.jdbc.HiveStatement.execute():253
org.apache.hive.jdbc.HiveStatement.executeQuery():476
org.apache.commons.dbcp.DelegatingStatement.executeQuery():208
org.apache.commons.dbcp.DelegatingStatement.executeQuery():208
org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup():177
org.apache.drill.exec.physical.impl.ScanBatch.():104
org.apache.drill.exec.physical.impl.ScanBatch.():126
org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():40
org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():33
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():156
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():179
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():136
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():179
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():109
org.apache.drill.exec.physical.impl.ImplCreator.getExec():87
org.apache.drill.exec.work.fragment.FragmentExecutor.run():207
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
  Caused By (org.apache.hive.service.cli.HiveSQLException) Error while 
compiling statement: FAILED: ParseException line 1:23 missing EOF at '$' near 
'EXPR'
org.apache.hive.service.cli.operation.Operation.toSQLException():380
org.apache.hive.service.cli.operation.SQLOperation.prepare():206
org.apache.hive.service.cli.operation.SQLOperation.runInternal():290
org.apache.hive.service.cli.operation.Operation.run():320

org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal():530

org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync():517
sun.reflect.GeneratedMethodAccessor86.invoke():-1
sun.reflect.DelegatingMethodAccessorImpl.invoke():43
java.lang.reflect.Method.invoke():498
org.apache.hive.service.cli.session.HiveSessionProxy.invoke():78
org.apache.hive.service.cli.session.HiveSessionProxy.access$000():36
org.apache.hive.service.cli.session.HiveSessionProxy$1.run():63
java.security.AccessController.doPrivileged():-2

[jira] [Created] (DRILL-5931) Drill queries against hive metastore tables return 0 rows rather than error

2017-11-03 Thread tooptoop4 (JIRA)
tooptoop4 created DRILL-5931:


 Summary: Drill queries against hive metastore tables return 0 rows 
rather than error
 Key: DRILL-5931
 URL: https://issues.apache.org/jira/browse/DRILL-5931
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Hive
Affects Versions: 1.11.0
 Environment: linux
drill 1.11
hive 2.3.0
Reporter: tooptoop4
Priority: Critical


Note that the hive user called 'hive' does not have a linux account so has no 
access to /home/ec2-user/warehouse/ file system. Rather than returning an error 
message, Drill pretends that the table is empty.

{code:sql}
[ec2-user@host ~]$ cd /usr/lib/apache-drill-1.11.0
[ec2-user@host apache-drill-1.11.0]$ ./bin/drill-embedded
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was 
removed in 8.0
Nov 01, 2017 7:53:53 AM org.glassfish.jersey.server.ApplicationHandler 
initialize
INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26...
apache drill 1.11.0
"this isn't your grandfather's sql"
0: jdbc:drill:zk=local> SHOW SCHEMAS;
+-+
|   SCHEMA_NAME   |
+-+
| INFORMATION_SCHEMA  |
| coffeemysql.COFFEEBREAK |
| coffeemysql.information_schema  |
| coffeemysql.mysql   |
| coffeemysql.performance_schema  |
| coffeemysql.test|
| coffeemysql |
| cp.default  |
| dfs.default |
| dfs.root|
| dfs.tmp |
| hive.default|
| sys |
+-+
13 rows selected (0.317 seconds)
0: jdbc:drill:zk=local> use hive;
+---+---+
|  ok   |  summary  |
+---+---+
| true  | Default schema changed to [hive]  |
+---+---+
1 row selected (0.16 seconds)
0: jdbc:drill:zk=local> show tables;
+---+-+
| TABLE_SCHEMA  | TABLE_NAME  |
+---+-+
| hive.default  | contact |
| hive.default  | account |
+---+-+
2 rows selected (0.958 seconds)
0: jdbc:drill:zk=local> select * from contact;
+-+-++-+-+---+---++--+-+-+--+
| contact_id  | first_name  | last_name  | address_line_1  | address_line_2  | 
city  | postcode  | email  | dob  | gender  | marital_status  | tfn  |
+-+-++-+-+---+---++--+-+-+--+
+-+-++-+-+---+---++--+-+-+--+
No rows selected (0.104 seconds)




This is the hive plugin in the Drill browser ‘storage’ plugin
{
  "type": "hive",
  "enabled": true,
  "configProps": {
"hive.metastore.uris": "thrift://host:9083",
"javax.jdo.option.ConnectionURL": "jdbc:mysql://host:3306/metastore",
"javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver",
"javax.jdo.option.ConnectionUserName": "hive",
"javax.jdo.option.ConnectionPassword": "hive1234",
"hive.metastore.warehouse.dir": "file:///home/ec2-user/warehouse",
"fs.default.name": "file:///"
  }
}



Below connection with beeline proves there is data in the table:

Beeline version 2.3.0 by Apache Hive
0: jdbc:hive2://localhost:1/default> show create table contact;
++
|   createtab_stmt   |
++
| CREATE TABLE `contact`(|
|   `contact_id` varchar(50),|
|   `first_name` varchar(50),|
|   `last_name` varchar(50), |
|   `address_line_1` varchar(100),   |
|   `address_line_2` varchar(100),   |
|   `city` varchar(50),  |
|   `postcode` varchar(10),  |
|   `email` varchar(100),|
|   `dob` date,  |
|   `gender` varchar(1), |
|   `marital_status` varchar(1), |
|   `tfn` varchar(20))   |
| ROW FORMAT SERDE   |
|   'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'  |
| STORED AS INPUTFORMAT  |
|   'org.apache.hadoop.mapred.TextInputFormat'   |
| 

[GitHub] drill issue #987: DRILL-5863: Sortable table incorrectly sorts fragments/tim...

2017-11-03 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/987
  
Thanks for the explanations.
+1


---


[GitHub] drill issue #858: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

2017-11-03 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/858
  
Closing this in favor of #1024 


---


[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

2017-11-03 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1024
  
@parthchandra / @laurentgo 
I've implemented changes based on the conversation in PR 
https://github.com/apache/drill/pull/858 and waiting for your review of this 
commit.


---


[GitHub] drill pull request #858: DRILL-3640: Support JDBC Statement.setQueryTimeout(...

2017-11-03 Thread kkhatua
Github user kkhatua closed the pull request at:

https://github.com/apache/drill/pull/858


---


[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...

2017-11-03 Thread kkhatua
GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/1024

DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

Allow for queries to be cancelled if they don't complete within the 
stipulated time.
This is done by having `Drill[Prepared]StatementImpl` create a `Stopwatch` 
timer to track elapsed time. 
  * `DrillCursor` uses this to detect timeouts. 
  * `DrillResultSetImpl` uses this to detech timeout from the client side 
(e.g. a slow client, when all batches have been processed by DrillCursor)

Tests added to test these and other query timeout scenarios for both, 
`Statement` and `PreparedStatement`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-3640_Alt

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1024.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1024


commit 0fe2cfa71a60e7ed17ea12c1ba15fa28be66508f
Author: Kunal Khatua 
Date:   2017-11-04T00:20:59Z

DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

Allow for queries to be cancelled if they don't complete within the 
stipulated time.
This is done by having Drill[Prepared]StatementImpl create a Stopwatch 
timer to track elapsed time. 
  * DrillCursor uses this to detect timeouts. 
  * DrillResultSetImpl uses this to detech timeout from the client side 
(e.g. a slow client, when all batches have been processed by DrillCursor)
Tests added to test these and other query timeout scenarios.




---


[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...

2017-11-03 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/889
  
ping @arina-ielchiieva @amansinha100 


---


[GitHub] drill issue #904: DRILL-5717: change some date time test cases with specific...

2017-11-03 Thread weijietong
Github user weijietong commented on the issue:

https://github.com/apache/drill/pull/904
  
@vvysotskyi any more advice ?


---


[jira] [Created] (DRILL-5930) Table function escape handling does not handle lone backslash

2017-11-03 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5930:
--

 Summary: Table function escape handling does not handle lone 
backslash
 Key: DRILL-5930
 URL: https://issues.apache.org/jira/browse/DRILL-5930
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Paul Rogers
Assignee: Paul Rogers
Priority: Minor


Consider the following query from the test framework 
({{Functional/table_function/positive/drill-3149_10.q}}):

{code}
select * from 
table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
{code}

Notice the double-backslash. Drill translates this into a single backslash. 
But, what happens if the user provides just a single backslash? Drill 
translates that into the empty string. Drill passes the empty string to the 
text reader, which fails with the error described in DRILL-5929.

Better would be to either:

* Leave lone back-slashes unchanged, or
* Fail a query with an lone backslash.

The fix for this bug prefers the first (because it is easiest.) However, this 
fix does not handle a triple backslash, which will be translated to a single 
backslash:

* \ \ --> \
* \ --> (nothing)

The code in question:

{code}
class FormatPluginOp ...
  FormatPluginConfig createConfigForTable(TableInstance t) {
...
  param = StringEscapeUtils.unescapeJava(param);
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #:

2017-11-03 Thread ilooner
Github user ilooner commented on the pull request:


https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25391123
  
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java:
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
on line 192:
I wanted the method to wait for all the fragments and queries to complete 
even if it was interrupted. I don't think there is any benefit to exiting 
prematurely during the shutdown process because of an interrupt. I think it's 
better to ignore the interrupt and continue waiting for the exit condition. One 
possible reason for obeying the interrupt would be to prevent the system from 
hanging during the shutdown of a corrupted Drillbit. But the waitToExit method 
is already time bound to 30 seconds, so we will not hang.


---


[GitHub] drill pull request #:

2017-11-03 Thread ilooner
Github user ilooner commented on the pull request:


https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25390975
  
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java:
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
on line 188:
Your right the javadoc does say the lock is reacquired before returning in 
all cases. Fixed


---


[GitHub] drill pull request #:

2017-11-03 Thread Ben-Zvi
Github user Ben-Zvi commented on the pull request:


https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25390249
  
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java:
In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java 
on line 192:
Shouldn't the code actually exit here (e.g. throw ...) instead of returning 
into the loop ? 



---


[jira] [Created] (DRILL-5929) Misleading error for text file with blank line delimiter

2017-11-03 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5929:
--

 Summary: Misleading error for text file with blank line delimiter
 Key: DRILL-5929
 URL: https://issues.apache.org/jira/browse/DRILL-5929
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Paul Rogers
Priority: Minor


Consider the following functional test query:

{code}
select * from 
table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\'))
{code}

For some reason (yet to be determined), when running this from Java, the line 
delimiter ended up empty. This cases the following line to fail with an 
{{ArrayIndexOutOfBoundsException}}:

{code}
class TextInput ...
  public final byte nextChar() throws IOException {
if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0
{code}

We then translate the exception:

{code}
class TextReader ...
  public final boolean parseNext() throws IOException {
...
} catch (Exception ex) {
  try {
throw handleException(ex);
...
  private TextParsingException handleException(Exception ex) throws IOException 
{
...
if (ex instanceof ArrayIndexOutOfBoundsException) {
  // Not clear this exception is still thrown...

  ex = UserException
  .dataReadError(ex)
  .message(
  "Drill failed to read your text file.  Drill supports up to %d 
columns in a text file.  Your file appears to have more than that.",
  MAXIMUM_NUMBER_COLUMNS)
  .build(logger);
}
{code}

That is, due to a missing delimiter, we get an index out of bounds exception, 
which we translate to an error about having too many fields. But, the file 
itself has only a handful of fields. Thus, the error is completely wrong.

Then, we compound the error:

{code}
  private TextParsingException handleException(Exception ex) throws IOException 
{
...
throw new TextParsingException(context, message, ex);

class CompliantTextReader ...
  public boolean next() {
...
} catch (IOException | TextParsingException e) {
  throw UserException.dataReadError(e)
  .addContext("Failure while reading file %s. Happened at or shortly 
before byte position %d.",
split.getPath(), reader.getPos())
  .build(logger);
{code}

That is, our AIOB exception became a user exception that became a text parsing 
exception that became a data read error.

But, this is not a data read error. It is an error in Drill's own validation 
logic. Not clear we should be wrapping user exceptions in other errors that we 
wrap in other user exceptions.





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148896551
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
+  }
+} catch (InterruptedException e) {
+  logger.warn("Interrupted while waiting to exit");
+  exitLock.lock();
+}
   }
 
-  exitLatch = new ExtendedLatch();
-}
+  if (!(queries.isEmpty() && runningFragments.isEmpty())) {
+logger.warn("Timed out after %d millis. Shutting down before all 
fragments and foremen " +
+  "have completed.", EXIT_TIMEOUT);
--- End diff --

This should actually just break from the loop instead of print a message, 
since EXIT_TIMEOUT millis have elapsed in this case, and the Timed out message 
is already printed below. I've updated the PR with this change.


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148894504
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
+  }
+} catch (InterruptedException e) {
+  logger.warn("Interrupted while waiting to exit");
+  exitLock.lock();
--- End diff --

1. If await is interrupted the condition variable does not acquire the lock 
for us, so we have to acquire it manually.
2. I'll change from warning to error.


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148894158
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
--- End diff --

If the await times out, the condition variable does not automatically 
reacquire the lock for us, so we must do it manually.


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148893046
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
--- End diff --

Fixed


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148892741
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

All the unit tests passed locally on my laptop. They also passed on our 
build server along with all of the functional and TPCH tests.



---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148891238
  
--- Diff: pom.xml ---
@@ -442,7 +442,7 @@
   
-Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on"
   -Ddrill.test.query.printing.silent=true
   -Ddrill.catastrophic_to_standard_out=true
-  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M
+  -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M
--- End diff --

 Did all the regression tests pass ?  This change could have some impact on 
other tests (e.g., if was expecting an OOM, now gone).



---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148889490
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
+  }
+} catch (InterruptedException e) {
+  logger.warn("Interrupted while waiting to exit");
+  exitLock.lock();
+}
   }
 
-  exitLatch = new ExtendedLatch();
-}
+  if (!(queries.isEmpty() && runningFragments.isEmpty())) {
+logger.warn("Timed out after %d millis. Shutting down before all 
fragments and foremen " +
+  "have completed.", EXIT_TIMEOUT);
--- End diff --

The "time out" time should be `diff`, not EXIT_TIMEOUT 
 


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148890716
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
+  }
+} catch (InterruptedException e) {
+  logger.warn("Interrupted while waiting to exit");
+  exitLock.lock();
--- End diff --

1. Why calling lock() here ?
2. Should the code issue an error instead of a warning ?



---


[jira] [Created] (DRILL-5928) Drill allows a COUNT(*) over records that are two wide to load

2017-11-03 Thread Paul Rogers (JIRA)
Paul Rogers created DRILL-5928:
--

 Summary: Drill allows a COUNT(*) over records that are two wide to 
load
 Key: DRILL-5928
 URL: https://issues.apache.org/jira/browse/DRILL-5928
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.11.0
Reporter: Paul Rogers
Priority: Minor


Consider the following test framework query:

{noformat}
/root/drillAutomation/mapr/framework/resources/Functional/data-shapes/wide-columns/general/q2.q
{noformat}

Defined as follows:

{noformat}
select count(*) from `data-shapes/wide-columns/general/10.tbl`
{noformat}

The data file ({{10.tbl}}) contains a single row, single field of 100K 
width. The query suceeds and returns 1.

However, if we did the following:

{noformat}
select * from `data-shapes/wide-columns/general/10.tbl`
{noformat}

then the query would fail with an oversize field exception. (Before Drill 1.13 
the error is "Tried to write something large in a field", while 1.13 and after 
the error is "Text column is too large."

The question is, should the query succeed in the {{COUNT\(*)}} case if it would 
fail in the {{SELECT *}} case?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r14668
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
--- End diff --

Usually, lock is called outside of try to avoid calling unlock in finally 
if lock was not successful.


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...

2017-11-03 Thread vrozov
Github user vrozov commented on a diff in the pull request:

https://github.com/apache/drill/pull/1023#discussion_r148889092
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -158,38 +165,51 @@ public DrillbitContext getContext() {
 return dContext;
   }
 
-  private ExtendedLatch exitLatch = null; // used to wait to exit when 
things are still running
-
   /**
* Waits until it is safe to exit. Blocks until all currently running 
fragments have completed.
-   *
-   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
+   * This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
   public void waitToExit() {
-synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
-return;
+final long startTime = System.currentTimeMillis();
+
+try {
+  exitLock.lock();
+  long diff;
+  while ((diff = (System.currentTimeMillis() - startTime)) < 
EXIT_TIMEOUT) {
+if (queries.isEmpty() && runningFragments.isEmpty()) {
+  break;
+}
+
+try {
+  final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, 
TimeUnit.MILLISECONDS);
+
+  if (!success) {
+logger.warn("Timed out after %d millis while waiting to 
exit.", EXIT_TIMEOUT);
+exitLock.lock();
--- End diff --

Why is this lock necessary?


---


[GitHub] drill issue #1023: DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 Stabali...

2017-11-03 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1023
  
@Ben-Zvi @paul-rogers 


---


[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 ...

2017-11-03 Thread ilooner
GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1023

DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 Stabalize TestValueVector 
tests.

##DRILL-5922

 - QueryContext was never closed when the Foreman finished. So the query 
child allocator was never closed.
 - The PlanSplitter created a QueryContext temporarily to construct an RPC 
message but never closed it.
 - The waitForExit method was error prone. Changed it to use the standard 
condition variable pattern.

##DRILL-5926

The TestValueVector tests would run out of memory. I simple increased the 
MaxDirectMemorySize for the forked test processes in the pom to avoid this.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-5922

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1023.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1023


commit f4058a93e70b21328f1f06e1e1ede5451d00b499
Author: Timothy Farkas 
Date:   2017-11-02T18:27:56Z

 - Deleted commented out code in unit test

commit 56fd6929d3ef738e1d756c067e4e4f80cf965e37
Author: Timothy Farkas 
Date:   2017-11-02T20:34:47Z

 - Removed non-existant parameter from java docs

commit a32c3bed4cf52f15bcf2d0e284c04329827b32b7
Author: Timothy Farkas 
Date:   2017-11-03T16:11:04Z

 - Initial attempt at fix

commit c0afcc8853e4449ec5d7de6dfeee51efc8ad70ab
Author: Timothy Farkas 
Date:   2017-11-03T18:19:08Z

 - Fixed hanging unit tests

commit d1cd5672e1da1553b0d8cf94c5b3c47354d417a3
Author: Timothy Farkas 
Date:   2017-11-03T18:28:47Z

 - More wait to exit improvements

commit ce3a9555482b02a19579c2388a65c4dd29353588
Author: Timothy Farkas 
Date:   2017-11-03T19:14:24Z

 - Deleted unused logger
 - Increased direct memory to prevent TestValueVector tests from running 
out of memory




---


[jira] [Created] (DRILL-5927) Root allocator consistently Leaks a buffer in unit tests

2017-11-03 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-5927:
-

 Summary: Root allocator consistently Leaks a buffer in unit tests
 Key: DRILL-5927
 URL: https://issues.apache.org/jira/browse/DRILL-5927
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas
Priority: Minor


TestBsonRecordReader consistently poduces this exception when running on my 
laptop

{code}
13:09:15.777 [main] ERROR o.a.d.exec.server.BootStrapContext - Error while 
closing
java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding 
buffers allocated (1).
Allocator(ROOT) 0/1024/10113536/4294967296 (res/actual/peak/limit)
  child allocators: 0
  ledgers: 1
ledger[79] allocator: ROOT), isOwning: true, size: 1024, references: 1, 
life: 340912804170064..0, allocatorManager: [71, life: 340912803759189..0] 
holds 1 buffers. 
DrillBuf[106], udle: [72 0..1024]
  reservations: 0

at 
org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:502) 
~[classes/:na]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
[classes/:na]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
[classes/:na]
at 
org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256) 
~[classes/:na]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) 
[classes/:na]
at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) 
[classes/:na]
at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) 
[classes/:na]
at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) 
[test-classes/:na]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
~[na:1.8.0_144]
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
~[na:1.8.0_144]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 [junit-4.11.jar:na]
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 [junit-4.11.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 [junit-4.11.jar:na]
at 
mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:44)
 [jmockit-1.3.jar:na]
at 
mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29)
 [jmockit-1.3.jar:na]
at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[na:na]
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[na:1.8.0_144]
at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144]
at 
mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95)
 [jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76) 
[jmockit-1.3.jar:na]
at 
mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) 
[jmockit-1.3.jar:na]
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java) 
[junit-4.11.jar:na]
at 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) 
[junit-4.11.jar:na]
at org.junit.runners.ParentRunner.run(ParentRunner.java:309) 
[junit-4.11.jar:na]
at org.junit.runner.JUnitCore.run(JUnitCore.java:160) 
[junit-4.11.jar:na]
at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 [junit-rt.jar:na]
at 
com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47)
 [junit-rt.jar:na]
at 
com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242)
 [junit-rt.jar:na]
at 
com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) 
[junit-rt.jar:na]
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148868884
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java ---
@@ -165,32 +176,60 @@ public DrillbitContext getContext() {
*
* This is intended to be used by {@link 
org.apache.drill.exec.server.Drillbit#close()}.
*/
-  public void waitToExit() {
+  public void waitToExit(Drillbit bit, boolean forcceful_shutdown) {
 synchronized(this) {
-  if (queries.isEmpty() && runningFragments.isEmpty()) {
+  numOfRunningQueries = queries.size();
+  numOfRunningFragments = runningFragments.size();
+  if ( queries.isEmpty() && runningFragments.isEmpty()) {
 return;
   }
-
+  logger.info("Draining " + queries +" queries and "+ 
runningFragments+" fragments.");
   exitLatch = new ExtendedLatch();
 }
-
-// Wait for at most 5 seconds or until the latch is released.
-exitLatch.awaitUninterruptibly(5000);
+// Wait uninterruptibly until all the queries and running fragments on 
that drillbit goes down
+// to zero
+if( forcceful_shutdown) {
+  exitLatch.awaitUninterruptibly(5000);
+}
+else {
--- End diff --

Inconsistent if-else formatting


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148871345
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -251,6 +252,11 @@ public void run() {
 final String originalName = currentThread.getName();
 currentThread.setName(queryIdString + ":foreman");
 
+try {
+  checkForemanState();
+} catch (ForemanException e) {
+  addToEventQueue(QueryState.FAILED, new ForemanException("Query 
submission failed since foreman is shutting down"));
--- End diff --

why are you not just adding 'e' to the eventQueue ? why do you have to 
create a new ForemanException object?


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148872536
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java ---
@@ -471,6 +471,22 @@ public void close() throws Exception {
   }
 
   /**
+   * Shutdown the drillbit given the name of the drillbit.
+   */
+  public void close_drillbit(final String drillbitname) throws Exception {
--- End diff --

camel case.


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148684246
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -70,7 +72,10 @@
   private final CountDownLatch initialConnection = new CountDownLatch(1);
   private final TransientStoreFactory factory;
   private ServiceCache serviceCache;
+  private DrillbitEndpoint endpoint;
 
+//private HashMap endpointsMap = new 
HashMap();
+  private ConcurrentHashMap endpointsMap = new 
ConcurrentHashMap();
--- End diff --

please add a comment about what this map contains. 


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148173100
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -70,7 +72,10 @@
   private final CountDownLatch initialConnection = new CountDownLatch(1);
   private final TransientStoreFactory factory;
   private ServiceCache serviceCache;
+  private DrillbitEndpoint endpoint;
 
+//private HashMap endpointsMap = new 
HashMap();
--- End diff --

commented code


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148682043
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -229,27 +272,52 @@ public DrillbitEndpoint 
apply(ServiceInstance input) {
 });
 
   // set of newly dead bits : original bits - new set of active bits.
-  Set unregisteredBits = new HashSet<>(endpoints);
-  unregisteredBits.removeAll(newDrillbitSet);
-
+  Set unregisteredBits = new HashSet<>();
   // Set of newly live bits : new set of active bits - original bits.
-  Set registeredBits = new HashSet<>(newDrillbitSet);
-  registeredBits.removeAll(endpoints);
-
-  endpoints = newDrillbitSet;
+  Set registeredBits = new HashSet<>();
 
+  // Update the endpoints map with change in state of the endpoint or 
with the addition
+  // of new drillbit endpoints. Registered endpoints is set to newly 
live drillbit endpoints.
+  for ( DrillbitEndpoint endpoint : newDrillbitSet) {
+if (endpointsMap.containsKey(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort( {
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+else {
+  registeredBits.add(endpoint);
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+  }
+//  Iterator iterator = endpointsMap.keySet().iterator() ;
--- End diff --

move the common code out of the if-else


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148872381
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java ---
@@ -348,6 +354,21 @@ public void run() {
  */
   }
 
+  /*
+Check if the foreman is ONLINE. If not dont accept any new queries.
+   */
+  public void checkForemanState() throws ForemanException{
+DrillbitEndpoint foreman = drillbitContext.getEndpoint();
+Collection dbs = drillbitContext.getAvailableBits();
--- End diff --

I would recommend refactoring this into a generic boolean 
isOnline(DrillbitEndpoint endpoint) in the DrillbitEndpoint class.


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148682167
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -229,27 +272,52 @@ public DrillbitEndpoint 
apply(ServiceInstance input) {
 });
 
   // set of newly dead bits : original bits - new set of active bits.
-  Set unregisteredBits = new HashSet<>(endpoints);
-  unregisteredBits.removeAll(newDrillbitSet);
-
+  Set unregisteredBits = new HashSet<>();
   // Set of newly live bits : new set of active bits - original bits.
-  Set registeredBits = new HashSet<>(newDrillbitSet);
-  registeredBits.removeAll(endpoints);
-
-  endpoints = newDrillbitSet;
+  Set registeredBits = new HashSet<>();
 
+  // Update the endpoints map with change in state of the endpoint or 
with the addition
--- End diff --

loop below does not handle change in state, so the comment should be 
changed. Is the endpointsMap necessary because of port hunting ? 


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148872633
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java ---
@@ -471,6 +471,22 @@ public void close() throws Exception {
   }
 
   /**
+   * Shutdown the drillbit given the name of the drillbit.
+   */
+  public void close_drillbit(final String drillbitname) throws Exception {
+Exception ex = null;
+for (Drillbit bit : drillbits())
+{
--- End diff --

placement of '{'


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148861255
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java
 ---
@@ -0,0 +1,80 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.server;
+/*
+  State manager to manage the state of drillbit.
+ */
+public class DrillbitStateManager {
+
+
+  public DrillbitStateManager(DrillbitState currentState) {
+this.currentState = currentState;
+  }
+
+  public enum DrillbitState {
+STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN
+  }
+
+  public DrillbitState getState() {
+return currentState;
+  }
+
+  private DrillbitState currentState;
--- End diff --

Drillbit.quiescentMode and Drillbit.forceful_shutdown are volatiles but 
currentState is not. Can you explain why this is so ?


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148668138
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -229,27 +272,52 @@ public DrillbitEndpoint 
apply(ServiceInstance input) {
 });
 
   // set of newly dead bits : original bits - new set of active bits.
-  Set unregisteredBits = new HashSet<>(endpoints);
-  unregisteredBits.removeAll(newDrillbitSet);
-
+  Set unregisteredBits = new HashSet<>();
   // Set of newly live bits : new set of active bits - original bits.
-  Set registeredBits = new HashSet<>(newDrillbitSet);
-  registeredBits.removeAll(endpoints);
-
-  endpoints = newDrillbitSet;
+  Set registeredBits = new HashSet<>();
 
+  // Update the endpoints map with change in state of the endpoint or 
with the addition
+  // of new drillbit endpoints. Registered endpoints is set to newly 
live drillbit endpoints.
+  for ( DrillbitEndpoint endpoint : newDrillbitSet) {
+if (endpointsMap.containsKey(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort( {
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+else {
+  registeredBits.add(endpoint);
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+  }
+//  Iterator iterator = endpointsMap.keySet().iterator() ;
--- End diff --

commented code


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148674083
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/ClusterCoordinator.java
 ---
@@ -60,7 +61,26 @@
*/
   public abstract Collection getAvailableEndpoints();
 
+  /**
+   * Get a collection of ONLINE drillbit endpoints by excluding the 
drillbits
+   * that are in QUIESCENT state (drillbits that are shutting down). 
Primarily used by the planner
+   * to plan queries only on ONLINE drillbits and used by the client 
during initial connection
+   * phase to connect to a drillbit (foreman)
+   * @return A collection of ONLINE endpoints
+   */
+
+  public abstract Collection getOnlineEndPoints();
+
+  public abstract RegistrationHandle update(RegistrationHandle handle, 
State state);
+
   public interface RegistrationHandle {
+/**
+ * Get the drillbit endpoint associated with the registration handle
+ * @return drillbit endpoint
+ */
+public abstract DrillbitEndpoint getEndPoint();
+
+public abstract void setEndPoint( DrillbitEndpoint endpoint);
--- End diff --

spacing


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148874640
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java ---
@@ -0,0 +1,323 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.test;
+
+import ch.qos.logback.classic.Level;
+import org.apache.commons.io.FileUtils;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState;
+import org.apache.drill.exec.server.Drillbit;
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+import org.omg.PortableServer.THREAD_POLICY_ID;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.HttpURLConnection;
+import java.net.URL;
+import java.util.Collection;
+import java.util.Properties;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotEquals;
+import static org.junit.Assert.fail;
+
+public class TestGracefulShutdown {
+
+  @BeforeClass
+  public static void setUpTestData() {
+for( int i = 0; i < 1000; i++) {
+  setupFile(i);
+}
+  }
+
+
+  public static final Properties WEBSERVER_CONFIGURATION = new 
Properties() {
+{
+  put(ExecConstants.HTTP_ENABLE, true);
+}
+  };
+
+  public FixtureBuilder enableWebServer(FixtureBuilder builder) {
+Properties props = new Properties();
+props.putAll(WEBSERVER_CONFIGURATION);
+builder.configBuilder.configProps(props);
+return builder;
+  }
+
+
+  /*
+  Start multiple drillbits and then shutdown a drillbit. Query the online
+  endpoints and check if the drillbit still exists.
+   */
+  @Test
+  public void testOnlineEndPoints() throws  Exception {
+
+String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"};
+FixtureBuilder builder = 
ClusterFixture.builder().withBits(drillbits).withLocalZk();
+
+
+try ( ClusterFixture cluster = builder.build();
+  ClientFixture client = cluster.clientFixture()) {
+
+  Drillbit drillbit = cluster.drillbit("db2");
+  DrillbitEndpoint drillbitEndpoint =  
drillbit.getRegistrationHandle().getEndPoint();
+  int grace_period = 
drillbit.getContext().getConfig().getInt("drill.exec.grace_period");
+  new Thread(new Runnable() {
+public void run() {
+  try {
+cluster.close_drillbit("db2");
+  } catch (Exception e) {
+e.printStackTrace();
+  }
+}
+  }).start();
+  //wait for graceperiod
+  Thread.sleep(grace_period);
+  Collection drillbitEndpoints = 
cluster.drillbit().getContext()
+  .getClusterCoordinator()
+  .getOnlineEndPoints();
+  Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint));
+}
+  }
+  /*
+Test if the drillbit transitions from ONLINE state when a shutdown
+request is initiated
+   */
+  @Test
+  public void testStateChange() throws  Exception {
+
+String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"};
+FixtureBuilder builder = 
ClusterFixture.builder().withBits(drillbits).withLocalZk();
+
+try ( ClusterFixture cluster = builder.build();
+  ClientFixture client = cluster.clientFixture()) {
+  Drillbit drillbit = cluster.drillbit("db2");
+  int grace_period = 
drillbit.getContext().getConfig().getInt("drill.exec.grace_period");
+  DrillbitEndpoint drillbitEndpoint =  
drillbit.getRegistrationHandle().getEndPoint();
+  new Thread(new Runnable() {
+public void run() 

[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148171637
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/local/LocalClusterCoordinator.java
 ---
@@ -85,13 +88,62 @@ public void unregister(final RegistrationHandle handle) 
{
 endpoints.remove(handle);
   }
 
+  /**
+   * Update drillbit endpoint state. Drillbit advertises its
+   * state. State information is used during planning and initial
+   * client connection phases.
+   */
+  @Override
+  public RegistrationHandle update(RegistrationHandle handle, State state) 
{
+DrillbitEndpoint endpoint = handle.getEndPoint();
+endpoint = endpoint.toBuilder().setState(state).build();
+handle.setEndPoint(endpoint);
+endpoints.put(handle,endpoint);
+return handle;
+  }
+
   @Override
   public Collection getAvailableEndpoints() {
 return endpoints.values();
   }
 
+  /**
+   * Get a collection of ONLINE Drillbit endpoints by excluding the 
drillbits
+   * that are in QUIESCENT state (drillbits shutting down). Primarily used 
by the planner
+   * to plan queries only on ONLINE drillbits and used by the client 
during initial connection
+   * phase to connect to a drillbit (foreman)
+   * @return A collection of ONLINE endpoints
+   */
+  @Override
+  public Collection getOnlineEndPoints() {
+Collection runningEndPoints = new ArrayList<>();
+for (DrillbitEndpoint endpoint: endpoints.values()){
+  if(!endpoint.hasState() || endpoint.getState().equals(State.ONLINE)) 
{
+runningEndPoints.add(endpoint);
+  }
+}
+return runningEndPoints;
+  }
+
   private class Handle implements RegistrationHandle {
 private final UUID id = UUID.randomUUID();
+private DrillbitEndpoint drillbitEndpoint;
+
+/**
+ * Get the drillbit endpoint associated with the registration handle
+ * @return drillbit endpoint
+ */
+public DrillbitEndpoint getEndPoint() {
+  return drillbitEndpoint;
+}
+
+public void setEndPoint( DrillbitEndpoint endpoint) {
--- End diff --

spacing


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148676672
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -200,11 +206,47 @@ public void unregister(RegistrationHandle handle) {
 }
   }
 
+  /**
+   * Update drillbit endpoint state. Drillbit advertises its
+   * state in Zookeeper when a shutdown request of drillbit is
+   * triggered. State information is used during planning and
+   * initial client connection phases.
+   */
+  public RegistrationHandle update(RegistrationHandle handle, State state) 
{
+ZKRegistrationHandle h = (ZKRegistrationHandle) handle;
+  try {
+endpoint = h.endpoint.toBuilder().setState(state).build();
+ServiceInstance serviceInstance = 
ServiceInstance.builder().name(serviceName).id(h.id).payload(endpoint).build();
--- End diff --

suggestion: wrap this long line since you have already wrapped the other 
lines and comments


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148686859
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java ---
@@ -69,14 +73,30 @@
 
   public final static String SYSTEM_OPTIONS_NAME = 
"org.apache.drill.exec.server.Drillbit.system_options";
 
-  private boolean isClosed = false;
-
   private final ClusterCoordinator coord;
   private final ServiceEngine engine;
   private final PersistentStoreProvider storeProvider;
   private final WorkManager manager;
   private final BootStrapContext context;
   private final WebServer webServer;
+  private final int gracePeriod;
+  private DrillbitStateManager stateManager;
+
+  public void setQuiescentMode(boolean quiescentMode) {
+this.quiescentMode = quiescentMode;
+  }
+
+  private volatile boolean quiescentMode;
+
+  public void setForceful_shutdown(boolean forceful_shutdown) {
+this.forceful_shutdown = forceful_shutdown;
+  }
+
+  private volatile boolean forceful_shutdown = false;
--- End diff --

any reason why you did not camel-case this variable? 


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148685835
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -229,27 +272,52 @@ public DrillbitEndpoint 
apply(ServiceInstance input) {
 });
 
   // set of newly dead bits : original bits - new set of active bits.
-  Set unregisteredBits = new HashSet<>(endpoints);
-  unregisteredBits.removeAll(newDrillbitSet);
-
+  Set unregisteredBits = new HashSet<>();
   // Set of newly live bits : new set of active bits - original bits.
-  Set registeredBits = new HashSet<>(newDrillbitSet);
-  registeredBits.removeAll(endpoints);
-
-  endpoints = newDrillbitSet;
+  Set registeredBits = new HashSet<>();
 
+  // Update the endpoints map with change in state of the endpoint or 
with the addition
+  // of new drillbit endpoints. Registered endpoints is set to newly 
live drillbit endpoints.
+  for ( DrillbitEndpoint endpoint : newDrillbitSet) {
+if (endpointsMap.containsKey(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort( {
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+else {
+  registeredBits.add(endpoint);
+  endpointsMap.put(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint);
+}
+  }
+//  Iterator iterator = endpointsMap.keySet().iterator() ;
+//  while(iterator.hasNext()) {
+//MultiKey key = iterator.next();
+//if(!newDrillbitSet.contains(endpointsMap.get(key))) {
+//  unregisteredBits.add(endpointsMap.get(key));
+//  iterator.remove();
+//}
+//  }
+
+//   Remove all the endpoints that are newly dead
+  for ( MultiKey key: endpointsMap.keySet())
+  {
--- End diff --

open-brace placement for the for-loop is inconsistent with other instances


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-03 Thread bitblender
Github user bitblender commented on a diff in the pull request:

https://github.com/apache/drill/pull/921#discussion_r148669660
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java
 ---
@@ -229,27 +272,52 @@ public DrillbitEndpoint 
apply(ServiceInstance input) {
 });
 
   // set of newly dead bits : original bits - new set of active bits.
-  Set unregisteredBits = new HashSet<>(endpoints);
-  unregisteredBits.removeAll(newDrillbitSet);
-
+  Set unregisteredBits = new HashSet<>();
   // Set of newly live bits : new set of active bits - original bits.
-  Set registeredBits = new HashSet<>(newDrillbitSet);
-  registeredBits.removeAll(endpoints);
-
-  endpoints = newDrillbitSet;
+  Set registeredBits = new HashSet<>();
 
+  // Update the endpoints map with change in state of the endpoint or 
with the addition
+  // of new drillbit endpoints. Registered endpoints is set to newly 
live drillbit endpoints.
+  for ( DrillbitEndpoint endpoint : newDrillbitSet) {
+if (endpointsMap.containsKey(new 
MultiKey(endpoint.getAddress(),endpoint.getUserPort( {
--- End diff --

I would recommend extracting endpoint.getAddress() and 
endpoint.getUserPort() into locals


---


[jira] [Created] (DRILL-5926) TestValueVector tests fail sporadically

2017-11-03 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-5926:
-

 Summary: TestValueVector tests fail sporadically
 Key: DRILL-5926
 URL: https://issues.apache.org/jira/browse/DRILL-5926
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas


The following tests fail sporadically with out of memory exception:

* TestValueVector.testFixedVectorReallocation
* TestValueVector.testVariableVectorReallocation





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1021: DRILL-5923: Display name for query state

2017-11-03 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1021#discussion_r148860361
  
--- Diff: exec/java-exec/src/main/resources/rest/profile/profile.ftl ---
@@ -135,6 +135,8 @@ table.sortable thead .sorting_desc { background-image: 
url("/static/img/black-de
 
   <#assign queueName = model.getProfile().getQueueName() />
   <#assign queued = queueName != "" && queueName != "-" />
+  <#assign queryStateDisplayName = ["Starting", "Running", "Succeeded", 
"Canceled", "Failed",
--- End diff --

Why do we have need duplicated state display name array here and in 
`ProfileResources.java`. Is there a possibility to not duplicate it?


---


[jira] [Created] (DRILL-5925) Unit test TestValueVector.testFixedVectorReallocation TestValueVector.testVariableVectorReallocation always fail

2017-11-03 Thread Chunhui Shi (JIRA)
Chunhui Shi created DRILL-5925:
--

 Summary: Unit test TestValueVector.testFixedVectorReallocation 
TestValueVector.testVariableVectorReallocation always fail
 Key: DRILL-5925
 URL: https://issues.apache.org/jira/browse/DRILL-5925
 Project: Apache Drill
  Issue Type: Bug
Reporter: Chunhui Shi


Tests in error: 
  TestValueVector.testFixedVectorReallocation »  Unexpected exception, 
expected<...
  TestValueVector.testVariableVectorReallocation »  Unexpected exception, 
expect...

Tests run: 2401, Failures: 0, Errors: 2, Skipped: 142

We are seeing these failures quite often. We should disable these two tests or 
modify the expected exception to be OutofMemory



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #774: DRILL-5337: OpenTSDB storage plugin

2017-11-03 Thread Vlad-Storona
Github user Vlad-Storona commented on the issue:

https://github.com/apache/drill/pull/774
  
@arina-ielchiieva Thanks for the review. I updated the PR. 

Please review updated changes.


---