[jira] [Created] (DRILL-5932) HiveServer2 queries throw error with jdbc connection
tooptoop4 created DRILL-5932: Summary: HiveServer2 queries throw error with jdbc connection Key: DRILL-5932 URL: https://issues.apache.org/jira/browse/DRILL-5932 Project: Apache Drill Issue Type: Bug Components: Client - JDBC, Storage - Hive Affects Versions: 1.11.0 Environment: linux 2.3 hive version Reporter: tooptoop4 Priority: Blocker Basic hive queries all throw error! {code:sql} copied https://repo1.maven.org/maven2/org/apache/hive/hive-jdbc/2.3.0/hive-jdbc-2.3.0-standalone.jar to /usr/lib/apache-drill-1.11.0/jars/3rdparty/hive-jdbc-2.3.0-standalone.jar added this storage plugin: { "type": "jdbc", "driver": "org.apache.hive.jdbc.HiveDriver", "url": "jdbc:hive2://host:1/default", "username": "hive", "password": "hive1234", "enabled": true } [ec2-user@host ~]$ cd /usr/lib/apache-drill-1.11.0 [ec2-user@host apache-drill-1.11.0]$ ./bin/drill-embedded OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0 Nov 01, 2017 7:53:53 AM org.glassfish.jersey.server.ApplicationHandler initialize INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26... apache drill 1.11.0 "this isn't your grandfather's sql" 0: jdbc:drill:zk=local> SELECT count(*) FROM hive2.`contact`; Error: DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query. sql SELECT COUNT(*) AS EXPR$0 FROM (SELECT 0 AS $f0 FROM.default.contact) AS t plugin hive2 Fragment 0:0 [Error Id: 4b293e97-7547-49c5-91da-b9ee2f2184fc on ip-myip.mydomain.orghere.com:31010] (state=,code=0) 0: jdbc:drill:zk=local> ALTER SESSION SET `exec.errors.verbose` = true; +---+---+ | ok |summary| +---+---+ | true | exec.errors.verbose updated. | +---+---+ 1 row selected (0.351 seconds) 0: jdbc:drill:zk=local> SELECT count(*) FROM hive2.`contact`; Error: DATA_READ ERROR: The JDBC storage plugin failed while trying setup the SQL query. sql SELECT COUNT(*) AS EXPR$0 FROM (SELECT 0 AS $f0 FROM.default.contact) AS t plugin hive2 Fragment 0:0 [Error Id: ac5cc8f2-69a4-455e-b1a6-5d22cde99729 on ip-myip.mydomain.orghere.com:31010] (org.apache.hive.service.cli.HiveSQLException) Error while compiling statement: FAILED: ParseException line 1:23 missing EOF at '$' near 'EXPR' org.apache.hive.jdbc.Utils.verifySuccess():267 org.apache.hive.jdbc.Utils.verifySuccessWithInfo():253 org.apache.hive.jdbc.HiveStatement.runAsyncOnServer():313 org.apache.hive.jdbc.HiveStatement.execute():253 org.apache.hive.jdbc.HiveStatement.executeQuery():476 org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 org.apache.commons.dbcp.DelegatingStatement.executeQuery():208 org.apache.drill.exec.store.jdbc.JdbcRecordReader.setup():177 org.apache.drill.exec.physical.impl.ScanBatch.():104 org.apache.drill.exec.physical.impl.ScanBatch.():126 org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():40 org.apache.drill.exec.store.jdbc.JdbcBatchCreator.getBatch():33 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():156 org.apache.drill.exec.physical.impl.ImplCreator.getChildren():179 org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():136 org.apache.drill.exec.physical.impl.ImplCreator.getChildren():179 org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():109 org.apache.drill.exec.physical.impl.ImplCreator.getExec():87 org.apache.drill.exec.work.fragment.FragmentExecutor.run():207 org.apache.drill.common.SelfCleaningRunnable.run():38 java.util.concurrent.ThreadPoolExecutor.runWorker():1149 java.util.concurrent.ThreadPoolExecutor$Worker.run():624 java.lang.Thread.run():748 Caused By (org.apache.hive.service.cli.HiveSQLException) Error while compiling statement: FAILED: ParseException line 1:23 missing EOF at '$' near 'EXPR' org.apache.hive.service.cli.operation.Operation.toSQLException():380 org.apache.hive.service.cli.operation.SQLOperation.prepare():206 org.apache.hive.service.cli.operation.SQLOperation.runInternal():290 org.apache.hive.service.cli.operation.Operation.run():320 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementInternal():530 org.apache.hive.service.cli.session.HiveSessionImpl.executeStatementAsync():517 sun.reflect.GeneratedMethodAccessor86.invoke():-1 sun.reflect.DelegatingMethodAccessorImpl.invoke():43 java.lang.reflect.Method.invoke():498 org.apache.hive.service.cli.session.HiveSessionProxy.invoke():78 org.apache.hive.service.cli.session.HiveSessionProxy.access$000():36 org.apache.hive.service.cli.session.HiveSessionProxy$1.run():63 java.security.AccessController.doPrivileged():-2
[jira] [Created] (DRILL-5931) Drill queries against hive metastore tables return 0 rows rather than error
tooptoop4 created DRILL-5931: Summary: Drill queries against hive metastore tables return 0 rows rather than error Key: DRILL-5931 URL: https://issues.apache.org/jira/browse/DRILL-5931 Project: Apache Drill Issue Type: Improvement Components: Storage - Hive Affects Versions: 1.11.0 Environment: linux drill 1.11 hive 2.3.0 Reporter: tooptoop4 Priority: Critical Note that the hive user called 'hive' does not have a linux account so has no access to /home/ec2-user/warehouse/ file system. Rather than returning an error message, Drill pretends that the table is empty. {code:sql} [ec2-user@host ~]$ cd /usr/lib/apache-drill-1.11.0 [ec2-user@host apache-drill-1.11.0]$ ./bin/drill-embedded OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0 Nov 01, 2017 7:53:53 AM org.glassfish.jersey.server.ApplicationHandler initialize INFO: Initiating Jersey application, version Jersey: 2.8 2014-04-29 01:25:26... apache drill 1.11.0 "this isn't your grandfather's sql" 0: jdbc:drill:zk=local> SHOW SCHEMAS; +-+ | SCHEMA_NAME | +-+ | INFORMATION_SCHEMA | | coffeemysql.COFFEEBREAK | | coffeemysql.information_schema | | coffeemysql.mysql | | coffeemysql.performance_schema | | coffeemysql.test| | coffeemysql | | cp.default | | dfs.default | | dfs.root| | dfs.tmp | | hive.default| | sys | +-+ 13 rows selected (0.317 seconds) 0: jdbc:drill:zk=local> use hive; +---+---+ | ok | summary | +---+---+ | true | Default schema changed to [hive] | +---+---+ 1 row selected (0.16 seconds) 0: jdbc:drill:zk=local> show tables; +---+-+ | TABLE_SCHEMA | TABLE_NAME | +---+-+ | hive.default | contact | | hive.default | account | +---+-+ 2 rows selected (0.958 seconds) 0: jdbc:drill:zk=local> select * from contact; +-+-++-+-+---+---++--+-+-+--+ | contact_id | first_name | last_name | address_line_1 | address_line_2 | city | postcode | email | dob | gender | marital_status | tfn | +-+-++-+-+---+---++--+-+-+--+ +-+-++-+-+---+---++--+-+-+--+ No rows selected (0.104 seconds) This is the hive plugin in the Drill browser ‘storage’ plugin { "type": "hive", "enabled": true, "configProps": { "hive.metastore.uris": "thrift://host:9083", "javax.jdo.option.ConnectionURL": "jdbc:mysql://host:3306/metastore", "javax.jdo.option.ConnectionDriverName": "com.mysql.jdbc.Driver", "javax.jdo.option.ConnectionUserName": "hive", "javax.jdo.option.ConnectionPassword": "hive1234", "hive.metastore.warehouse.dir": "file:///home/ec2-user/warehouse", "fs.default.name": "file:///" } } Below connection with beeline proves there is data in the table: Beeline version 2.3.0 by Apache Hive 0: jdbc:hive2://localhost:1/default> show create table contact; ++ | createtab_stmt | ++ | CREATE TABLE `contact`(| | `contact_id` varchar(50),| | `first_name` varchar(50),| | `last_name` varchar(50), | | `address_line_1` varchar(100), | | `address_line_2` varchar(100), | | `city` varchar(50), | | `postcode` varchar(10), | | `email` varchar(100),| | `dob` date, | | `gender` varchar(1), | | `marital_status` varchar(1), | | `tfn` varchar(20)) | | ROW FORMAT SERDE | | 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' | | STORED AS INPUTFORMAT | | 'org.apache.hadoop.mapred.TextInputFormat' | |
[GitHub] drill issue #987: DRILL-5863: Sortable table incorrectly sorts fragments/tim...
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/987 Thanks for the explanations. +1 ---
[GitHub] drill issue #858: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/858 Closing this in favor of #1024 ---
[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1024 @parthchandra / @laurentgo I've implemented changes based on the conversation in PR https://github.com/apache/drill/pull/858 and waiting for your review of this commit. ---
[GitHub] drill pull request #858: DRILL-3640: Support JDBC Statement.setQueryTimeout(...
Github user kkhatua closed the pull request at: https://github.com/apache/drill/pull/858 ---
[GitHub] drill pull request #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout...
GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/1024 DRILL-3640: Support JDBC Statement.setQueryTimeout(int) Allow for queries to be cancelled if they don't complete within the stipulated time. This is done by having `Drill[Prepared]StatementImpl` create a `Stopwatch` timer to track elapsed time. * `DrillCursor` uses this to detect timeouts. * `DrillResultSetImpl` uses this to detech timeout from the client side (e.g. a slow client, when all batches have been processed by DrillCursor) Tests added to test these and other query timeout scenarios for both, `Statement` and `PreparedStatement`. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kkhatua/drill DRILL-3640_Alt Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1024.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1024 commit 0fe2cfa71a60e7ed17ea12c1ba15fa28be66508f Author: Kunal KhatuaDate: 2017-11-04T00:20:59Z DRILL-3640: Support JDBC Statement.setQueryTimeout(int) Allow for queries to be cancelled if they don't complete within the stipulated time. This is done by having Drill[Prepared]StatementImpl create a Stopwatch timer to track elapsed time. * DrillCursor uses this to detect timeouts. * DrillResultSetImpl uses this to detech timeout from the client side (e.g. a slow client, when all batches have been processed by DrillCursor) Tests added to test these and other query timeout scenarios. ---
[GitHub] drill issue #889: DRILL-5691: enhance scalar sub queries checking for the ca...
Github user weijietong commented on the issue: https://github.com/apache/drill/pull/889 ping @arina-ielchiieva @amansinha100 ---
[GitHub] drill issue #904: DRILL-5717: change some date time test cases with specific...
Github user weijietong commented on the issue: https://github.com/apache/drill/pull/904 @vvysotskyi any more advice ? ---
[jira] [Created] (DRILL-5930) Table function escape handling does not handle lone backslash
Paul Rogers created DRILL-5930: -- Summary: Table function escape handling does not handle lone backslash Key: DRILL-5930 URL: https://issues.apache.org/jira/browse/DRILL-5930 Project: Apache Drill Issue Type: Bug Affects Versions: 1.11.0 Reporter: Paul Rogers Assignee: Paul Rogers Priority: Minor Consider the following query from the test framework ({{Functional/table_function/positive/drill-3149_10.q}}): {code} select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\')) {code} Notice the double-backslash. Drill translates this into a single backslash. But, what happens if the user provides just a single backslash? Drill translates that into the empty string. Drill passes the empty string to the text reader, which fails with the error described in DRILL-5929. Better would be to either: * Leave lone back-slashes unchanged, or * Fail a query with an lone backslash. The fix for this bug prefers the first (because it is easiest.) However, this fix does not handle a triple backslash, which will be translated to a single backslash: * \ \ --> \ * \ --> (nothing) The code in question: {code} class FormatPluginOp ... FormatPluginConfig createConfigForTable(TableInstance t) { ... param = StringEscapeUtils.unescapeJava(param); {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #:
Github user ilooner commented on the pull request: https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25391123 In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java: In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java on line 192: I wanted the method to wait for all the fragments and queries to complete even if it was interrupted. I don't think there is any benefit to exiting prematurely during the shutdown process because of an interrupt. I think it's better to ignore the interrupt and continue waiting for the exit condition. One possible reason for obeying the interrupt would be to prevent the system from hanging during the shutdown of a corrupted Drillbit. But the waitToExit method is already time bound to 30 seconds, so we will not hang. ---
[GitHub] drill pull request #:
Github user ilooner commented on the pull request: https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25390975 In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java: In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java on line 188: Your right the javadoc does say the lock is reacquired before returning in all cases. Fixed ---
[GitHub] drill pull request #:
Github user Ben-Zvi commented on the pull request: https://github.com/apache/drill/commit/a9f2a1c6ad59386828f41731b7415e18a9459cc6#commitcomment-25390249 In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java: In exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java on line 192: Shouldn't the code actually exit here (e.g. throw ...) instead of returning into the loop ? ---
[jira] [Created] (DRILL-5929) Misleading error for text file with blank line delimiter
Paul Rogers created DRILL-5929: -- Summary: Misleading error for text file with blank line delimiter Key: DRILL-5929 URL: https://issues.apache.org/jira/browse/DRILL-5929 Project: Apache Drill Issue Type: Bug Affects Versions: 1.11.0 Reporter: Paul Rogers Priority: Minor Consider the following functional test query: {code} select * from table(`table_function/colons.txt`(type=>'text',lineDelimiter=>'\\')) {code} For some reason (yet to be determined), when running this from Java, the line delimiter ended up empty. This cases the following line to fail with an {{ArrayIndexOutOfBoundsException}}: {code} class TextInput ... public final byte nextChar() throws IOException { if (byteChar == lineSeparator[0]) { // but, lineSeparator.length == 0 {code} We then translate the exception: {code} class TextReader ... public final boolean parseNext() throws IOException { ... } catch (Exception ex) { try { throw handleException(ex); ... private TextParsingException handleException(Exception ex) throws IOException { ... if (ex instanceof ArrayIndexOutOfBoundsException) { // Not clear this exception is still thrown... ex = UserException .dataReadError(ex) .message( "Drill failed to read your text file. Drill supports up to %d columns in a text file. Your file appears to have more than that.", MAXIMUM_NUMBER_COLUMNS) .build(logger); } {code} That is, due to a missing delimiter, we get an index out of bounds exception, which we translate to an error about having too many fields. But, the file itself has only a handful of fields. Thus, the error is completely wrong. Then, we compound the error: {code} private TextParsingException handleException(Exception ex) throws IOException { ... throw new TextParsingException(context, message, ex); class CompliantTextReader ... public boolean next() { ... } catch (IOException | TextParsingException e) { throw UserException.dataReadError(e) .addContext("Failure while reading file %s. Happened at or shortly before byte position %d.", split.getPath(), reader.getPos()) .build(logger); {code} That is, our AIOB exception became a user exception that became a text parsing exception that became a data read error. But, this is not a data read error. It is an error in Drill's own validation logic. Not clear we should be wrapping user exceptions in other errors that we wrap in other user exceptions. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148896551 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); + } +} catch (InterruptedException e) { + logger.warn("Interrupted while waiting to exit"); + exitLock.lock(); +} } - exitLatch = new ExtendedLatch(); -} + if (!(queries.isEmpty() && runningFragments.isEmpty())) { +logger.warn("Timed out after %d millis. Shutting down before all fragments and foremen " + + "have completed.", EXIT_TIMEOUT); --- End diff -- This should actually just break from the loop instead of print a message, since EXIT_TIMEOUT millis have elapsed in this case, and the Timed out message is already printed below. I've updated the PR with this change. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148894504 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); + } +} catch (InterruptedException e) { + logger.warn("Interrupted while waiting to exit"); + exitLock.lock(); --- End diff -- 1. If await is interrupted the condition variable does not acquire the lock for us, so we have to acquire it manually. 2. I'll change from warning to error. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148894158 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); --- End diff -- If the await times out, the condition variable does not automatically reacquire the lock for us, so we must do it manually. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148893046 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); --- End diff -- Fixed ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148892741 --- Diff: pom.xml --- @@ -442,7 +442,7 @@ -Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on" -Ddrill.test.query.printing.silent=true -Ddrill.catastrophic_to_standard_out=true - -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M + -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M --- End diff -- All the unit tests passed locally on my laptop. They also passed on our build server along with all of the functional and TPCH tests. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148891238 --- Diff: pom.xml --- @@ -442,7 +442,7 @@ -Dorg.apache.drill.exec.server.Drillbit.system_options="org.apache.drill.exec.compile.ClassTransformer.scalar_replacement=on" -Ddrill.test.query.printing.silent=true -Ddrill.catastrophic_to_standard_out=true - -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=3072M + -XX:MaxPermSize=512M -XX:MaxDirectMemorySize=4096M --- End diff -- Did all the regression tests pass ? This change could have some impact on other tests (e.g., if was expecting an OOM, now gone). ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148889490 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); + } +} catch (InterruptedException e) { + logger.warn("Interrupted while waiting to exit"); + exitLock.lock(); +} } - exitLatch = new ExtendedLatch(); -} + if (!(queries.isEmpty() && runningFragments.isEmpty())) { +logger.warn("Timed out after %d millis. Shutting down before all fragments and foremen " + + "have completed.", EXIT_TIMEOUT); --- End diff -- The "time out" time should be `diff`, not EXIT_TIMEOUT ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148890716 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); + } +} catch (InterruptedException e) { + logger.warn("Interrupted while waiting to exit"); + exitLock.lock(); --- End diff -- 1. Why calling lock() here ? 2. Should the code issue an error instead of a warning ? ---
[jira] [Created] (DRILL-5928) Drill allows a COUNT(*) over records that are two wide to load
Paul Rogers created DRILL-5928: -- Summary: Drill allows a COUNT(*) over records that are two wide to load Key: DRILL-5928 URL: https://issues.apache.org/jira/browse/DRILL-5928 Project: Apache Drill Issue Type: Bug Affects Versions: 1.11.0 Reporter: Paul Rogers Priority: Minor Consider the following test framework query: {noformat} /root/drillAutomation/mapr/framework/resources/Functional/data-shapes/wide-columns/general/q2.q {noformat} Defined as follows: {noformat} select count(*) from `data-shapes/wide-columns/general/10.tbl` {noformat} The data file ({{10.tbl}}) contains a single row, single field of 100K width. The query suceeds and returns 1. However, if we did the following: {noformat} select * from `data-shapes/wide-columns/general/10.tbl` {noformat} then the query would fail with an oversize field exception. (Before Drill 1.13 the error is "Tried to write something large in a field", while 1.13 and after the error is "Text column is too large." The question is, should the query succeed in the {{COUNT\(*)}} case if it would fail in the {{SELECT *}} case? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r14668 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); --- End diff -- Usually, lock is called outside of try to avoid calling unlock in finally if lock was not successful. ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Leak. DRILL-5926 ...
Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1023#discussion_r148889092 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -158,38 +165,51 @@ public DrillbitContext getContext() { return dContext; } - private ExtendedLatch exitLatch = null; // used to wait to exit when things are still running - /** * Waits until it is safe to exit. Blocks until all currently running fragments have completed. - * - * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. + * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ public void waitToExit() { -synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { -return; +final long startTime = System.currentTimeMillis(); + +try { + exitLock.lock(); + long diff; + while ((diff = (System.currentTimeMillis() - startTime)) < EXIT_TIMEOUT) { +if (queries.isEmpty() && runningFragments.isEmpty()) { + break; +} + +try { + final boolean success = exitCondition.await(EXIT_TIMEOUT - diff, TimeUnit.MILLISECONDS); + + if (!success) { +logger.warn("Timed out after %d millis while waiting to exit.", EXIT_TIMEOUT); +exitLock.lock(); --- End diff -- Why is this lock necessary? ---
[GitHub] drill issue #1023: DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 Stabali...
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1023 @Ben-Zvi @paul-rogers ---
[GitHub] drill pull request #1023: DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 ...
GitHub user ilooner opened a pull request: https://github.com/apache/drill/pull/1023 DRILL-5922 Fixed Child Allocator Lead. DRILL-5926 Stabalize TestValueVector tests. ##DRILL-5922 - QueryContext was never closed when the Foreman finished. So the query child allocator was never closed. - The PlanSplitter created a QueryContext temporarily to construct an RPC message but never closed it. - The waitForExit method was error prone. Changed it to use the standard condition variable pattern. ##DRILL-5926 The TestValueVector tests would run out of memory. I simple increased the MaxDirectMemorySize for the forked test processes in the pom to avoid this. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ilooner/drill DRILL-5922 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1023.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1023 commit f4058a93e70b21328f1f06e1e1ede5451d00b499 Author: Timothy FarkasDate: 2017-11-02T18:27:56Z - Deleted commented out code in unit test commit 56fd6929d3ef738e1d756c067e4e4f80cf965e37 Author: Timothy Farkas Date: 2017-11-02T20:34:47Z - Removed non-existant parameter from java docs commit a32c3bed4cf52f15bcf2d0e284c04329827b32b7 Author: Timothy Farkas Date: 2017-11-03T16:11:04Z - Initial attempt at fix commit c0afcc8853e4449ec5d7de6dfeee51efc8ad70ab Author: Timothy Farkas Date: 2017-11-03T18:19:08Z - Fixed hanging unit tests commit d1cd5672e1da1553b0d8cf94c5b3c47354d417a3 Author: Timothy Farkas Date: 2017-11-03T18:28:47Z - More wait to exit improvements commit ce3a9555482b02a19579c2388a65c4dd29353588 Author: Timothy Farkas Date: 2017-11-03T19:14:24Z - Deleted unused logger - Increased direct memory to prevent TestValueVector tests from running out of memory ---
[jira] [Created] (DRILL-5927) Root allocator consistently Leaks a buffer in unit tests
Timothy Farkas created DRILL-5927: - Summary: Root allocator consistently Leaks a buffer in unit tests Key: DRILL-5927 URL: https://issues.apache.org/jira/browse/DRILL-5927 Project: Apache Drill Issue Type: Bug Reporter: Timothy Farkas Assignee: Timothy Farkas Priority: Minor TestBsonRecordReader consistently poduces this exception when running on my laptop {code} 13:09:15.777 [main] ERROR o.a.d.exec.server.BootStrapContext - Error while closing java.lang.IllegalStateException: Allocator[ROOT] closed with outstanding buffers allocated (1). Allocator(ROOT) 0/1024/10113536/4294967296 (res/actual/peak/limit) child allocators: 0 ledgers: 1 ledger[79] allocator: ROOT), isOwning: true, size: 1024, references: 1, life: 340912804170064..0, allocatorManager: [71, life: 340912803759189..0] holds 1 buffers. DrillBuf[106], udle: [72 0..1024] reservations: 0 at org.apache.drill.exec.memory.BaseAllocator.close(BaseAllocator.java:502) ~[classes/:na] at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [classes/:na] at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [classes/:na] at org.apache.drill.exec.server.BootStrapContext.close(BootStrapContext.java:256) ~[classes/:na] at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:76) [classes/:na] at org.apache.drill.common.AutoCloseables.close(AutoCloseables.java:64) [classes/:na] at org.apache.drill.exec.server.Drillbit.close(Drillbit.java:205) [classes/:na] at org.apache.drill.BaseTestQuery.closeClient(BaseTestQuery.java:315) [test-classes/:na] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_144] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_144] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_144] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144] at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) [junit-4.11.jar:na] at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.11.jar:na] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) [junit-4.11.jar:na] at mockit.integration.junit4.internal.JUnit4TestRunnerDecorator.invokeExplosively(JUnit4TestRunnerDecorator.java:44) [jmockit-1.3.jar:na] at mockit.integration.junit4.internal.MockFrameworkMethod.invokeExplosively(MockFrameworkMethod.java:29) [jmockit-1.3.jar:na] at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source) ~[na:na] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_144] at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_144] at mockit.internal.util.MethodReflection.invokeWithCheckedThrows(MethodReflection.java:95) [jmockit-1.3.jar:na] at mockit.internal.annotations.MockMethodBridge.callMock(MockMethodBridge.java:76) [jmockit-1.3.jar:na] at mockit.internal.annotations.MockMethodBridge.invoke(MockMethodBridge.java:41) [jmockit-1.3.jar:na] at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java) [junit-4.11.jar:na] at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:33) [junit-4.11.jar:na] at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:na] at org.junit.runner.JUnitCore.run(JUnitCore.java:160) [junit-4.11.jar:na] at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68) [junit-rt.jar:na] at com.intellij.rt.execution.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:47) [junit-rt.jar:na] at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:242) [junit-rt.jar:na] at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:70) [junit-rt.jar:na] {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148868884 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +176,60 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcceful_shutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcceful_shutdown) { + exitLatch.awaitUninterruptibly(5000); +} +else { --- End diff -- Inconsistent if-else formatting ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148871345 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -251,6 +252,11 @@ public void run() { final String originalName = currentThread.getName(); currentThread.setName(queryIdString + ":foreman"); +try { + checkForemanState(); +} catch (ForemanException e) { + addToEventQueue(QueryState.FAILED, new ForemanException("Query submission failed since foreman is shutting down")); --- End diff -- why are you not just adding 'e' to the eventQueue ? why do you have to create a new ForemanException object? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872536 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,22 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void close_drillbit(final String drillbitname) throws Exception { --- End diff -- camel case. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148684246 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -70,7 +72,10 @@ private final CountDownLatch initialConnection = new CountDownLatch(1); private final TransientStoreFactory factory; private ServiceCache serviceCache; + private DrillbitEndpoint endpoint; +//private HashMapendpointsMap = new HashMap (); + private ConcurrentHashMap endpointsMap = new ConcurrentHashMap (); --- End diff -- please add a comment about what this map contains. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148173100 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -70,7 +72,10 @@ private final CountDownLatch initialConnection = new CountDownLatch(1); private final TransientStoreFactory factory; private ServiceCache serviceCache; + private DrillbitEndpoint endpoint; +//private HashMapendpointsMap = new HashMap (); --- End diff -- commented code ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148682043 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort( { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; --- End diff -- move the common code out of the if-else ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872381 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java --- @@ -348,6 +354,21 @@ public void run() { */ } + /* +Check if the foreman is ONLINE. If not dont accept any new queries. + */ + public void checkForemanState() throws ForemanException{ +DrillbitEndpoint foreman = drillbitContext.getEndpoint(); +Collection dbs = drillbitContext.getAvailableBits(); --- End diff -- I would recommend refactoring this into a generic boolean isOnline(DrillbitEndpoint endpoint) in the DrillbitEndpoint class. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148682167 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition --- End diff -- loop below does not handle change in state, so the comment should be changed. Is the endpointsMap necessary because of port hunting ? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148872633 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,22 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void close_drillbit(final String drillbitname) throws Exception { +Exception ex = null; +for (Drillbit bit : drillbits()) +{ --- End diff -- placement of '{' ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148861255 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- Drillbit.quiescentMode and Drillbit.forceful_shutdown are volatiles but currentState is not. Can you explain why this is so ? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148668138 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort( { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; --- End diff -- commented code ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148674083 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/ClusterCoordinator.java --- @@ -60,7 +61,26 @@ */ public abstract Collection getAvailableEndpoints(); + /** + * Get a collection of ONLINE drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits that are shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + + public abstract Collection getOnlineEndPoints(); + + public abstract RegistrationHandle update(RegistrationHandle handle, State state); + public interface RegistrationHandle { +/** + * Get the drillbit endpoint associated with the registration handle + * @return drillbit endpoint + */ +public abstract DrillbitEndpoint getEndPoint(); + +public abstract void setEndPoint( DrillbitEndpoint endpoint); --- End diff -- spacing ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148874640 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,323 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; + +import ch.qos.logback.classic.Level; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.proto.UserBitShared.QueryResult.QueryState; +import org.apache.drill.exec.server.Drillbit; +import org.junit.AfterClass; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import org.omg.PortableServer.THREAD_POLICY_ID; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + @BeforeClass + public static void setUpTestData() { +for( int i = 0; i < 1000; i++) { + setupFile(i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); +} + }; + + public FixtureBuilder enableWebServer(FixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. + */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.close_drillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +FixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + new Thread(new Runnable() { +public void run()
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148171637 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/local/LocalClusterCoordinator.java --- @@ -85,13 +88,62 @@ public void unregister(final RegistrationHandle handle) { endpoints.remove(handle); } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state. State information is used during planning and initial + * client connection phases. + */ + @Override + public RegistrationHandle update(RegistrationHandle handle, State state) { +DrillbitEndpoint endpoint = handle.getEndPoint(); +endpoint = endpoint.toBuilder().setState(state).build(); +handle.setEndPoint(endpoint); +endpoints.put(handle,endpoint); +return handle; + } + @Override public Collection getAvailableEndpoints() { return endpoints.values(); } + /** + * Get a collection of ONLINE Drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + @Override + public Collection getOnlineEndPoints() { +Collection runningEndPoints = new ArrayList<>(); +for (DrillbitEndpoint endpoint: endpoints.values()){ + if(!endpoint.hasState() || endpoint.getState().equals(State.ONLINE)) { +runningEndPoints.add(endpoint); + } +} +return runningEndPoints; + } + private class Handle implements RegistrationHandle { private final UUID id = UUID.randomUUID(); +private DrillbitEndpoint drillbitEndpoint; + +/** + * Get the drillbit endpoint associated with the registration handle + * @return drillbit endpoint + */ +public DrillbitEndpoint getEndPoint() { + return drillbitEndpoint; +} + +public void setEndPoint( DrillbitEndpoint endpoint) { --- End diff -- spacing ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148676672 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -200,11 +206,47 @@ public void unregister(RegistrationHandle handle) { } } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state in Zookeeper when a shutdown request of drillbit is + * triggered. State information is used during planning and + * initial client connection phases. + */ + public RegistrationHandle update(RegistrationHandle handle, State state) { +ZKRegistrationHandle h = (ZKRegistrationHandle) handle; + try { +endpoint = h.endpoint.toBuilder().setState(state).build(); +ServiceInstance serviceInstance = ServiceInstance.builder().name(serviceName).id(h.id).payload(endpoint).build(); --- End diff -- suggestion: wrap this long line since you have already wrapped the other lines and comments ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148686859 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -69,14 +73,30 @@ public final static String SYSTEM_OPTIONS_NAME = "org.apache.drill.exec.server.Drillbit.system_options"; - private boolean isClosed = false; - private final ClusterCoordinator coord; private final ServiceEngine engine; private final PersistentStoreProvider storeProvider; private final WorkManager manager; private final BootStrapContext context; private final WebServer webServer; + private final int gracePeriod; + private DrillbitStateManager stateManager; + + public void setQuiescentMode(boolean quiescentMode) { +this.quiescentMode = quiescentMode; + } + + private volatile boolean quiescentMode; + + public void setForceful_shutdown(boolean forceful_shutdown) { +this.forceful_shutdown = forceful_shutdown; + } + + private volatile boolean forceful_shutdown = false; --- End diff -- any reason why you did not camel-case this variable? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148685835 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort( { + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} +else { + registeredBits.add(endpoint); + endpointsMap.put(new MultiKey(endpoint.getAddress(),endpoint.getUserPort()),endpoint); +} + } +// Iterator iterator = endpointsMap.keySet().iterator() ; +// while(iterator.hasNext()) { +//MultiKey key = iterator.next(); +//if(!newDrillbitSet.contains(endpointsMap.get(key))) { +// unregisteredBits.add(endpointsMap.get(key)); +// iterator.remove(); +//} +// } + +// Remove all the endpoints that are newly dead + for ( MultiKey key: endpointsMap.keySet()) + { --- End diff -- open-brace placement for the for-loop is inconsistent with other instances ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r148669660 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +272,52 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); - - endpoints = newDrillbitSet; + Set registeredBits = new HashSet<>(); + // Update the endpoints map with change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +if (endpointsMap.containsKey(new MultiKey(endpoint.getAddress(),endpoint.getUserPort( { --- End diff -- I would recommend extracting endpoint.getAddress() and endpoint.getUserPort() into locals ---
[jira] [Created] (DRILL-5926) TestValueVector tests fail sporadically
Timothy Farkas created DRILL-5926: - Summary: TestValueVector tests fail sporadically Key: DRILL-5926 URL: https://issues.apache.org/jira/browse/DRILL-5926 Project: Apache Drill Issue Type: Bug Reporter: Timothy Farkas Assignee: Timothy Farkas The following tests fail sporadically with out of memory exception: * TestValueVector.testFixedVectorReallocation * TestValueVector.testVariableVectorReallocation -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1021: DRILL-5923: Display name for query state
Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1021#discussion_r148860361 --- Diff: exec/java-exec/src/main/resources/rest/profile/profile.ftl --- @@ -135,6 +135,8 @@ table.sortable thead .sorting_desc { background-image: url("/static/img/black-de <#assign queueName = model.getProfile().getQueueName() /> <#assign queued = queueName != "" && queueName != "-" /> + <#assign queryStateDisplayName = ["Starting", "Running", "Succeeded", "Canceled", "Failed", --- End diff -- Why do we have need duplicated state display name array here and in `ProfileResources.java`. Is there a possibility to not duplicate it? ---
[jira] [Created] (DRILL-5925) Unit test TestValueVector.testFixedVectorReallocation TestValueVector.testVariableVectorReallocation always fail
Chunhui Shi created DRILL-5925: -- Summary: Unit test TestValueVector.testFixedVectorReallocation TestValueVector.testVariableVectorReallocation always fail Key: DRILL-5925 URL: https://issues.apache.org/jira/browse/DRILL-5925 Project: Apache Drill Issue Type: Bug Reporter: Chunhui Shi Tests in error: TestValueVector.testFixedVectorReallocation » Unexpected exception, expected<... TestValueVector.testVariableVectorReallocation » Unexpected exception, expect... Tests run: 2401, Failures: 0, Errors: 2, Skipped: 142 We are seeing these failures quite often. We should disable these two tests or modify the expected exception to be OutofMemory -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill issue #774: DRILL-5337: OpenTSDB storage plugin
Github user Vlad-Storona commented on the issue: https://github.com/apache/drill/pull/774 @arina-ielchiieva Thanks for the review. I updated the PR. Please review updated changes. ---