[GitHub] drill pull request #1037: DRILL-5968: Add support for empty service_host use...
GitHub user superbstreak opened a pull request: https://github.com/apache/drill/pull/1037 DRILL-5968: Add support for empty service_host user property You can merge this pull request into a Git repository by running: $ git pull https://github.com/superbstreak/drill DRILL-5968-Add-support-for-empty-service_host Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1037.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1037 commit 7deed948b05d35ff48287defad4ccefe61f0128f Author: Rob WuDate: 2017-11-15T07:27:29Z DRILL-5968: Add support for empty service_host user property ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r151019919 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java --- @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.vector.accessor.writer; + +import java.math.BigDecimal; + +import org.apache.drill.exec.vector.accessor.ColumnWriterIndex; +import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter; +import org.joda.time.Period; + +/** + * Column writer implementation that acts as the basis for the + * generated, vector-specific implementations. All set methods + * throw an exception; subclasses simply override the supported + * method(s). + * + * The only tricky part to this class is understanding the + * state of the write indexes as the write proceeds. There are + * two pointers to consider: + * + * lastWriteIndex: The position in the vector at which the + * client last asked us to write data. This index is maintained + * in this class because it depends only on the actions of this + * class. + * vectorIndex: The position in the vector at which we will + * write if the client chooses to write a value at this time. + * The vector index is shared by all columns at the same repeat + * level. It is incremented as the client steps through the write + * and is observed in this class each time a write occurs. + * + * A repeat level is defined as any of the following: + * + * The set of top-level scalar columns, or those within a + * top-level, non-repeated map, or nested to any depth within + * non-repeated maps rooted at the top level. + * The values for a single scalar array. + * The set of scalar columns within a repeated map, or + * nested within non-repeated maps within a repeated map. + * + * Items at a repeat level index together and share a vector + * index. However, the columns within a repeat level + * do not share a last write index: some can lag further + * behind than others. + * + * Let's illustrate the states. Let's focus on one column and + * illustrate the three states that can occur during write: + * + * Behind: the last write index is more than one position behind + * the vector index. Zero-filling will be needed to catch up to + * the vector index. + * Written: the last write index is the same as the vector + * index because the client wrote data at this position (and previous + * values were back-filled with nulls, empties or zeros.) + * Unwritten: the last write index is one behind the vector + * index. This occurs when the column was written, then the client + * moved to the next row or array position. + * Restarted: The current row is abandoned (perhaps filtered + * out) and is to be rewritten. The last write position moves + * back one position. Note that, the Restarted state is + * indistinguishable from the unwritten state: the only real + * difference is that the current slot (pointed to by the + * vector index) contains the previous written value that must + * be overwritten or back-filled. But, this is fine, because we + * assume that unwritten values are garbage anyway. + * + * To illustrate: + * Behind WrittenUnwrittenRestarted + * |X| |X| |X| |X| + * lw >|X| |X| |X| |X| + * | | |0| |0| lw > |0| + *v >| | lw, v > |X|lw > |X| v > |X| + *v > | | + * + * The illustrated state transitions are: + * + * Suppose the state starts in Behind. + * If the client writes a value, then the empty slot is + * back-filled and the state moves to Written. + * If the client does not write a value, the state stays + * at Behind, and the gap of unfilled values
[GitHub] drill issue #652: DRILL-4990:Use new HDFS API access instead of listStatus t...
Github user sohami commented on the issue: https://github.com/apache/drill/pull/652 Based on the last in-person discussion it was decided to further look into why the test was failing on Windows platform. Is it the test environment setup issue or an actual issue w.r.t platform implementation of access method ? ---
[jira] [Resolved] (DRILL-5894) A Storage plugin called dfs_test is added in unit tests. It's actually unnecessary and we should just use dfs
[ https://issues.apache.org/jira/browse/DRILL-5894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved DRILL-5894. Resolution: Fixed > A Storage plugin called dfs_test is added in unit tests. It's actually > unnecessary and we should just use dfs > - > > Key: DRILL-5894 > URL: https://issues.apache.org/jira/browse/DRILL-5894 > Project: Apache Drill > Issue Type: Improvement >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Minor > > The unit tests create a dfs_test storage plugin and use it in queries. It's > actually completely unnecessary and we can use the existing dfs storage > plugin. This would make the tests less confusing and more consistent since > many tests flipflop between dfs and dfs_test -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (DRILL-5841) Unit tests fail when unexpected files are in the tmp folder
[ https://issues.apache.org/jira/browse/DRILL-5841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Paul Rogers resolved DRILL-5841. Resolution: Fixed > Unit tests fail when unexpected files are in the tmp folder > --- > > Key: DRILL-5841 > URL: https://issues.apache.org/jira/browse/DRILL-5841 > Project: Apache Drill > Issue Type: Bug >Reporter: Timothy Farkas >Assignee: Timothy Farkas >Priority: Minor > > My Mac creates wifi*.log files in the /tmp folder. This can cause some unit > tests to fail because Drill does not like finding unexpected files in the > /tmp folder. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #984: DRILL-5783 Made a unit test for generated Priority ...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/984 ---
[GitHub] drill issue #984: DRILL-5783 Made a unit test for generated Priority Queue. ...
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/984 Discussed in person. Conflicts are resolved. We have a plan for merging with other in-flight PRs. +1 (again) ---
[jira] [Resolved] (DRILL-5966) Upgrade DRILL to Calcite 1.13.0
[ https://issues.apache.org/jira/browse/DRILL-5966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pritesh Maker resolved DRILL-5966. -- Resolution: Duplicate [~kkhatua] this is already in progress with DRILL-3993 -- [~RomanKulyk] is working on this. I am closing this as a duplicate. > Upgrade DRILL to Calcite 1.13.0 > --- > > Key: DRILL-5966 > URL: https://issues.apache.org/jira/browse/DRILL-5966 > Project: Apache Drill > Issue Type: Improvement > Components: Query Planning & Optimization >Reporter: Kunal Khatua >Assignee: Pritesh Maker > Fix For: 1.13.0 > > > Apache Drill is currently on a Drill-specific fork of Calcite-1.4.0 (released > on Sep 2015). > While the fork has been continuously updated with some fixes being ported to > the Calcite project as well, improvements on Calcite in the meanwhile have > been missed out by Drill. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5968) C++ Client should set service_host to the connected host if service_host is empty
Parth Chandra created DRILL-5968: Summary: C++ Client should set service_host to the connected host if service_host is empty Key: DRILL-5968 URL: https://issues.apache.org/jira/browse/DRILL-5968 Project: Apache Drill Issue Type: Bug Reporter: Parth Chandra In the ODBC driver - The krbSpnConfigurationsRequired parameter in odbc.ini is not working as expected. If I set the following: AuthenticationType=Kerberos UID= PWD= DelegationUID= KrbServiceName= KrbServiceHost=maprsasl krbSpnConfigurationsRequired=1 I could connect successfully. I was expecting to get an error message. If only either KrbServiceHost is missing or both KrbServiceHost and KrbServiceName are missing then I get the expected error message. Turning off the parameter, I was able to connect using the following setting: AuthenticationType=Kerberos UID= PWD= DelegationUID= KrbServiceName= KrbServiceHost=maprsasl krbSpnConfigurationsRequired=0 However, if either KrbServiceHost or both KrbServiceHost and KrbServiceName are missing, I would get the following error message: 1: SQLDriverConnect = [MapR][Drill] (30) User authentication failed. Server message: DrillClientImpl::handleAuthentication: Authentication failed. [Details: Encryption: enabled ,MaxWrappedSize: 32768 ,WrapSizeLimit: 0, Error: -1]. Check connection parameters? (30) SQLSTATE=28000 1: ODBC_Connect = [MapR][Drill] (30) User authentication failed. Server message: DrillClientImpl::handleAuthentication: Authentication failed. [Details: Encryption: enabled ,MaxWrappedSize: 32768 ,WrapSizeLimit: 0, Error: -1]. Check connection parameters? (30) SQLSTATE=28000 The Drill C++ Client should set service_host to the connected host (if direct) if service_host is empty (similar logic for zookeeper connection). Going through the source code, looks like this logic was removed by this commit: https://github.com/apache/drill/commit/f246c3cad7f44baeb8153913052ebc963c62276a#diff-8e6df071d8ca863fcfa578892944c1dcL123 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5967) Memory leak by HashPartitionSender
Timothy Farkas created DRILL-5967: - Summary: Memory leak by HashPartitionSender Key: DRILL-5967 URL: https://issues.apache.org/jira/browse/DRILL-5967 Project: Apache Drill Issue Type: Bug Reporter: Timothy Farkas Assignee: Timothy Farkas The error found by [~cch...@maprtech.com] and [~dechanggu] {code} 2017-10-25 15:43:28,658 [260eec84-7de3-03ec-300f-7fdbc111fb7c:frag:2:9] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (9216) Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 (res/actual/peak/limit) Fragment 2:9 [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IllegalStateException: Memory was leaked by query. Memory leaked: (9216) Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 (res/actual/peak/limit) Fragment 2:9 [Error Id: 7eae6c2a-868c-49f8-aad8-b690243ffe9b on mperf113.qa.lab:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.11.0-mapr.jar:1.11.0-mapr] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.11.0-mapr.jar:1.11.0-mapr] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_121] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_121] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121] Caused by: java.lang.IllegalStateException: Memory was leaked by query. Memory leaked: (9216) Allocator(op:2:9:0:HashPartitionSender) 100/9216/12831744/100 (res/actual/peak/limit) {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (DRILL-5966) Upgrade DRILL to Calcite 1.13.0
Kunal Khatua created DRILL-5966: --- Summary: Upgrade DRILL to Calcite 1.13.0 Key: DRILL-5966 URL: https://issues.apache.org/jira/browse/DRILL-5966 Project: Apache Drill Issue Type: Improvement Components: Query Planning & Optimization Reporter: Kunal Khatua Assignee: Aman Sinha Fix For: 0.8.0 -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill issue #984: DRILL-5783 Made a unit test for generated Priority Queue. ...
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/984 Resolved conflicts. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user bitblender commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r151003788 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } + + public boolean isForemanOnline() { +DrillbitEndpoint foreman = getEndpoint(); +Collection dbs = getAvailableBits(); +for( DrillbitEndpoint db : dbs) { + if( db.getAddress().equals(foreman.getAddress()) && db.getUserPort() == foreman.getUserPort()) { --- End diff -- Can you please explain the logic in isForemanOnline(). Why do you have to get the list of endpoints from ZK and then check for foreman in that list before making the isOnline test ? Why can't it be done on the foreman object? Is this to make sure that the state is updated in ZK before refusing to take queries ? Why do you assume that the foreman is online if the foreman is not found in the list of endPoints? i.e. if it is not in the dbs list why do you return true in that case ? ---
[GitHub] drill pull request #1036: DRILL-5962: Adding ST_AsJSON functionality
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1036#discussion_r150999524 --- Diff: contrib/gis/src/main/java/org/apache/drill/exec/expr/fn/impl/gis/STAsJSON.java --- @@ -0,0 +1,58 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.expr.fn.impl.gis; + +import javax.inject.Inject; + +import org.apache.drill.exec.expr.DrillSimpleFunc; +import org.apache.drill.exec.expr.annotations.FunctionTemplate; +import org.apache.drill.exec.expr.annotations.Output; +import org.apache.drill.exec.expr.annotations.Param; +import org.apache.drill.exec.expr.holders.VarBinaryHolder; +import org.apache.drill.exec.expr.holders.VarCharHolder; + +import io.netty.buffer.DrillBuf; + +@FunctionTemplate(name = "st_asjson", scope = FunctionTemplate.FunctionScope.SIMPLE, + nulls = FunctionTemplate.NullHandling.NULL_IF_NULL) +public class STAsJSON implements DrillSimpleFunc { + @Param + VarBinaryHolder geom1Param; + + @Output + VarCharHolder out; + + @Inject + DrillBuf buffer; + + public void setup() { + } + + public void eval() { +com.esri.core.geometry.ogc.OGCGeometry geom1 = com.esri.core.geometry.ogc.OGCGeometry + .fromBinary(geom1Param.buffer.nioBuffer(geom1Param.start, geom1Param.end - geom1Param.start)); + +String json = geom1.asJson(); + +int outputSize = json.getBytes().length; --- End diff -- Better to cache the bytes, and use it twice, rather than converting the string from UTF-16 to UTF-8 twice. ---
[GitHub] drill issue #1028: DRILL-5943: Avoid the strong check introduced by DRILL-55...
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/1028 @parthchandra can you please review this? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user dvjyothsna commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150994906 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -57,19 +57,19 @@ Control Port Data Port Version + Status <#assign i = 1> <#list model.getDrillbits() as drillbit> - + ${i} -${drillbit.getAddress()} - <#if drillbit.isCurrent()> +${drillbit.getAddress()}<#if drillbit.isCurrent()> --- End diff -- I have tried that before. But it was giving some alignment issues. ---
RE: Drill Views getting Connection Closed error after adding more columns
Did you resolve this? Could you provide more details about the data and the view? -Original Message- From: Sanchita Pandey [mailto:sanchit...@gmail.com] Sent: Tuesday, September 26, 2017 3:12 AM To: dev@drill.apache.org Subject: Drill Views getting Connection Closed error after adding more columns Hi , I have modified my existing drill View and added new columns. It includes few placeholder attributes also where I am passing ‘null’ as String(also tried with blank ‘’) but its impacting performance. Drill Views are getting connection timeout very frequently. If I remove additional and new columns added, it works fine. Please suggest Regards, Sanchita
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user dvjyothsna commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150993718 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -57,19 +57,19 @@ Control Port Data Port Version + Status <#assign i = 1> <#list model.getDrillbits() as drillbit> - + ${i} -${drillbit.getAddress()} - <#if drillbit.isCurrent()> +${drillbit.getAddress()}<#if drillbit.isCurrent()> Current -${drillbit.getUserPort()} +${drillbit.getUserPort()} --- End diff -- Yes, true. But for every row I have a different ID ("row-1") and I'm accessing "port"(child) of that row. This was the entire row is unique and need not assign a new unique id for every element in that row. Am I missing something here? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user dvjyothsna commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150992676 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +169,59 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcefulShutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcefulShutdown ) { + exitLatch.awaitUninterruptibly(5000); --- End diff -- This was the default one that existed before. That is the reason I didn't changed it. ---
[GitHub] drill issue #976: DRILL-5797: Choose parquet reader from read columns
Github user priteshm commented on the issue: https://github.com/apache/drill/pull/976 @dprofeta will you be able to address the issues before the release? ---
[jira] [Created] (DRILL-5965) Array index access returns an empty array
Khurram Faraaz created DRILL-5965: - Summary: Array index access returns an empty array Key: DRILL-5965 URL: https://issues.apache.org/jira/browse/DRILL-5965 Project: Apache Drill Issue Type: Bug Components: Storage - JSON Affects Versions: 1.12.0 Reporter: Khurram Faraaz Accessing an array with [] from JSON data, returns an empty string, whereas it should return the actual data from the array at the index and not an empty array. Drill 1.11.0-mapr commit: 065d72ba48c7af6b389b763753ecb6bf7d229ce8 {noformat} 0: jdbc:drill:schema=dfs.tmp> select t.structured_rep[0] from `cornell_nlvr_train.json` t limit 1; +-+ | EXPR$0 | +-+ | [] | +-+ 1 row selected (0.249 seconds) {noformat} Where as accessing the elements of the array returns correct results. {noformat} 0: jdbc:drill:schema=dfs.tmp> select t.structured_rep[0][0] from `cornell_nlvr_train.json` t limit 1; +---+ |EXPR$0 | +---+ | {"y_loc":21,"size":20,"type":"triangle","x_loc":27,"color":"Yellow"} | +---+ 1 row selected (0.325 seconds) 0: jdbc:drill:schema=dfs.tmp> select t.structured_rep[0][1] from `cornell_nlvr_train.json` t limit 1; +-+ | EXPR$0| +-+ | {"y_loc":60,"size":10,"type":"circle","x_loc":59,"color":"Yellow"} | +-+ 1 row selected (0.247 seconds) {noformat} Data used in the test {noformat} { "sentence": "There is a circle closely touching a corner of a box.", "label": "true", "identifier": "1304-0", "directory": "74", "evals": { "r0": "true" }, "structured_rep": [ [{ "y_loc": 21, "size": 20, "type": "triangle", "x_loc": 27, "color": "Yellow" }, { "y_loc": 60, "size": 10, "type": "circle", "x_loc": 59, "color": "Yellow" }], [{ "y_loc": 81, "size": 10, "type": "triangle", "x_loc": 48, "color": "Yellow" }, { "y_loc": 64, "size": 20, "type": "circle", "x_loc": 77, "color": "#0099ff" }], [{ "y_loc": 2, "size": 20, "type": "triangle", "x_loc": 62, "color": "Yellow" }, { "y_loc": 70, "size": 30, "type": "circle", "x_loc": 70, "color": "Black" }, { "y_loc": 51, "size": 20, "type": "circle", "x_loc": 30, "color": "#0099ff" }, { "y_loc": 42, "size": 20, "type": "circle", "x_loc": 67, "color": "Yellow" }, { "y_loc": 73, "size": 20, "type": "circle", "x_loc": 37, "color": "Black" }, { "y_loc": 14, "size": 30, "type": "triangle", "x_loc": 7, "color": "Yellow" }, { "y_loc": 27, "size": 10, "type": "circle", "x_loc": 48, "color": "Black" }] ] } {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150985739 --- Diff: protocol/src/main/java/org/apache/drill/exec/proto/beans/RpcType.java --- @@ -25,28 +25,8 @@ HANDSHAKE(0), ACK(1), GOODBYE(2), -RUN_QUERY(3), -CANCEL_QUERY(4), -REQUEST_RESULTS(5), -RESUME_PAUSED_QUERY(11), -GET_QUERY_PLAN_FRAGMENTS(12), -GET_CATALOGS(14), -GET_SCHEMAS(15), -GET_TABLES(16), -GET_COLUMNS(17), -CREATE_PREPARED_STATEMENT(22), -GET_SERVER_META(8), -QUERY_DATA(6), -QUERY_HANDLE(7), -QUERY_PLAN_FRAGMENTS(13), -CATALOGS(18), -SCHEMAS(19), -TABLES(20), -COLUMNS(21), -PREPARED_STATEMENT(23), -SERVER_META(9), -QUERY_RESULT(10), -SASL_MESSAGE(24); +REQ_RECORD_BATCH(3), +SASL_MESSAGE(4); --- End diff -- Is this your change? Why has the list of RPC types changed? Is this a merge issue? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150985401 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; +import com.google.common.io.Files; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.server.Drillbit; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + static String testDirPath; + @BeforeClass + public static void setUpTestData() throws Exception { + +final File testDir = getTempDir("graceful_shutdown"); +testDirPath = testDir.getAbsolutePath(); +for( int i = 0; i < 500; i++) { + setupFile(testDir, i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); + put(ExecConstants.HTTP_PORT_HUNT, true); +} + }; + + public ClusterFixtureBuilder enableWebServer(ClusterFixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. + */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.closeDrillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + new Thread(new Runnable() { +
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150981136 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +169,59 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcefulShutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcefulShutdown ) { + exitLatch.awaitUninterruptibly(5000); +} else { + exitLatch.awaitUninterruptibly(); +} } /** * If it is safe to exit, and the exitLatch is in use, signals it so that waitToExit() will - * unblock. + * unblock. Logs the number of pending fragments and queries that are running on that + * drillbit to track the progress of shutdown process. */ private void indicateIfSafeToExit() { synchronized(this) { if (exitLatch != null) { +logger.info("Waiting for "+ queries.size() +" queries to complete before shutting down"); +logger.info("Waiting for "+ runningFragments.size() +" running fragments to complete before shutting down"); +if(runningFragments.size() > numOfRunningFragments|| queries.size() > numOfRunningQueries) { + logger.info("New Fragments or queries are added while drillbit is Shutting down"); +} if (queries.isEmpty() && runningFragments.isEmpty()) { + // Both Queries and Running fragments are empty. + // So its safe for the drillbit to exit. --- End diff -- As it turns out @ilooner is making significant changes to this area. Can you two coordinate? We just need the @dvjyothsna is making to work long enough for the @ilooner changes replace them. But, Tim's changes need to support this new graceful shutdown model. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150984396 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,21 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void closeDrillbit(final String drillbitName) throws Exception { +Exception ex = null; +for (Drillbit bit : drillbits()) { + if(bit.equals(bits.get(drillbitName))) { +bit.close(); + } +} +if(ex != null) { --- End diff -- Looks like `ex` is declared, but never set. Was the goal to capture exceptions from `bit.close()`? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150984962 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; +import com.google.common.io.Files; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.server.Drillbit; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + static String testDirPath; + @BeforeClass + public static void setUpTestData() throws Exception { + +final File testDir = getTempDir("graceful_shutdown"); +testDirPath = testDir.getAbsolutePath(); +for( int i = 0; i < 500; i++) { + setupFile(testDir, i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); + put(ExecConstants.HTTP_PORT_HUNT, true); +} + }; + + public ClusterFixtureBuilder enableWebServer(ClusterFixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. + */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.closeDrillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + new Thread(new Runnable() { +
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150985212 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; +import com.google.common.io.Files; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.server.Drillbit; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + static String testDirPath; + @BeforeClass + public static void setUpTestData() throws Exception { + +final File testDir = getTempDir("graceful_shutdown"); +testDirPath = testDir.getAbsolutePath(); +for( int i = 0; i < 500; i++) { + setupFile(testDir, i); +} + } + + + public static final Properties WEBSERVER_CONFIGURATION = new Properties() { +{ + put(ExecConstants.HTTP_ENABLE, true); + put(ExecConstants.HTTP_PORT_HUNT, true); +} + }; + + public ClusterFixtureBuilder enableWebServer(ClusterFixtureBuilder builder) { +Properties props = new Properties(); +props.putAll(WEBSERVER_CONFIGURATION); +builder.configBuilder.configProps(props); +return builder; + } + + + /* + Start multiple drillbits and then shutdown a drillbit. Query the online + endpoints and check if the drillbit still exists. + */ + @Test + public void testOnlineEndPoints() throws Exception { + +String[] drillbits = {"db1" ,"db2","db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + + Drillbit drillbit = cluster.drillbit("db2"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + new Thread(new Runnable() { +public void run() { + try { +cluster.closeDrillbit("db2"); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + //wait for graceperiod + Thread.sleep(grace_period); + Collection drillbitEndpoints = cluster.drillbit().getContext() + .getClusterCoordinator() + .getOnlineEndPoints(); + Assert.assertFalse(drillbitEndpoints.contains(drillbitEndpoint)); +} + } + /* +Test if the drillbit transitions from ONLINE state when a shutdown +request is initiated + */ + @Test + public void testStateChange() throws Exception { + +String[] drillbits = {"db1" ,"db2", "db3", "db4", "db5", "db6"}; +ClusterFixtureBuilder builder = ClusterFixture.builder().withBits(drillbits).withLocalZk(); + +try ( ClusterFixture cluster = builder.build(); + ClientFixture client = cluster.clientFixture()) { + Drillbit drillbit = cluster.drillbit("db2"); + int grace_period = drillbit.getContext().getConfig().getInt("drill.exec.grace_period"); + DrillbitEndpoint drillbitEndpoint = drillbit.getRegistrationHandle().getEndPoint(); + new Thread(new Runnable() { +
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150983985 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -78,6 +78,11 @@ ${drillbit.getVersion()} +${drillbit.getState()} + + SHUTDOWN + + --- End diff -- Again. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150982788 --- Diff: exec/java-exec/src/main/resources/drill-module.conf --- @@ -368,6 +368,13 @@ drill.exec: { // planning and managing queries. Primarily for testing. cpus_per_node: 0, } + # Grace period is the amount of time where the drillbit accepts work after + # the shutdown request is triggered. The primary use of grace period is to + # avoid the race conditions caused by zookeeper delay in updating the state + # information of the drillbit that is shutting down. So, it is advisable + # to have a grace period that is atleast twice the amount of zookeeper + # refresh time. + grace_period : 1 --- End diff -- Units? Just add a note to the end of your comment to say the units are ms. In the future, I've found it to be handy to put the units in the name: `grace_period_ms` for example. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150976659 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; + public void setState(DrillbitState state) { +switch (state) { + case ONLINE: +if (currentState == DrillbitState.STARTUP) { + currentState = state; +} else { + throw new IllegalStateException("Cannot set drillbit to" + state + "from" + currentState); --- End diff -- To avoid redundant code and to catch new states: ``` setState(State newState) { currentState = transitionTo(newState); } State transitionTo(State newState) { switch (newState) { case ONLINE: if (currentState == DrillbitState.STARTUP) { return newState; } break; ... default: break; } throw new IllegalStateException("Cannot set drillbit to" + state + "from" + currentState); ``` An alternative is to switch on current state to say, essentially "if we are in state X, we allow transition to states Y and Z." But, that may be overkill here. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150975172 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } --- End diff -- Please move enum before constructor. General order that I observe in Drill is: * Nested interfaces, classes or enums. * Static fields * Instance fields * Constructors * Static "builder" or "factory" methods * Other methods ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150979320 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/WebServer.java --- @@ -114,11 +117,12 @@ * @param context Bootstrap context. * @param workManager WorkManager instance. */ - public WebServer(final BootStrapContext context, final WorkManager workManager) { + public WebServer(final BootStrapContext context, final WorkManager workManager, final Drillbit drillbit) { --- End diff -- Do we really want to pass the entire Drillbit to the web server? Seems like this creates a coupling that is too tight. What services are needed? Only the shutdown? Something else? Can we define an interface just for those services, and let `Drillbit` implement that interface to avoid the tight coupling? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150978597 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -58,13 +63,170 @@ @Inject UserAuthEnabled authEnabled; @Inject WorkManager work; @Inject SecurityContext sc; + @Inject Drillbit drillbit; @GET @Produces(MediaType.TEXT_HTML) public Viewable getClusterInfo() { return ViewableWithPermissions.create(authEnabled.get(), "/rest/index.ftl", sc, getClusterInfoJSON()); } + + @SuppressWarnings("resource") + @GET + @Path("/state") + @Produces(MediaType.APPLICATION_JSON) + public Response getDrillbitStatus(){ +Collection drillbits = getClusterInfoJSON().getDrillbits(); +MapdrillStatusMap = new HashMap(); +for (DrillbitInfo drillbit : drillbits) { + drillStatusMap.put(drillbit.getAddress()+"-"+drillbit.getUserPort(),drillbit.getState()); +} +return Response.ok() +.entity(drillStatusMap) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @GET + @Path("/graceperiod") + @Produces(MediaType.APPLICATION_JSON) + public Map getGracePeriod(){ + +final DrillConfig config = work.getContext().getConfig(); +final int gracePeriod = config.getInt(ExecConstants.GRACE_PERIOD); +Map gracePeriodMap = new HashMap (); +gracePeriodMap.put("graceperiod",gracePeriod); +return gracePeriodMap; + } + + @SuppressWarnings("resource") + @GET + @Path("/queriesCount") + @Produces(MediaType.APPLICATION_JSON) + public Response getRemainingQueries() { +Map queriesInfo = new HashMap (); +queriesInfo = work.getRemainingQueries(); +return Response.ok() +.entity(queriesInfo) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @POST + @Path("/graceful_shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response shutdownDrillbit() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + shutdownInfo.put("response", "Shutdown request is triggered"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} +catch (Exception e) { + logger.debug("Exception in triggering shutdown request",e); + shutdownInfo.put("response", "Error in triggering shutdown request"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} + } + + @SuppressWarnings("resource") + @POST + @Path("/shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response ShutdownForcefully() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.setForcefulShutdown(true); +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + shutdownInfo.put("response", "Forceful shutdown request is triggered"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") +
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150977759 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -58,13 +63,170 @@ @Inject UserAuthEnabled authEnabled; @Inject WorkManager work; @Inject SecurityContext sc; + @Inject Drillbit drillbit; @GET @Produces(MediaType.TEXT_HTML) public Viewable getClusterInfo() { return ViewableWithPermissions.create(authEnabled.get(), "/rest/index.ftl", sc, getClusterInfoJSON()); } + + @SuppressWarnings("resource") + @GET + @Path("/state") + @Produces(MediaType.APPLICATION_JSON) + public Response getDrillbitStatus(){ +Collection drillbits = getClusterInfoJSON().getDrillbits(); +MapdrillStatusMap = new HashMap(); +for (DrillbitInfo drillbit : drillbits) { + drillStatusMap.put(drillbit.getAddress()+"-"+drillbit.getUserPort(),drillbit.getState()); --- End diff -- Small point: spaces around `+` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150973540 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -176,18 +200,37 @@ public void run() throws Exception { logger.info("Startup completed ({} ms).", w.elapsed(TimeUnit.MILLISECONDS)); } + /* +Wait uninterruptibly + */ + public void waitForGracePeriod() { +ExtendedLatch exitLatch = null; +exitLatch = new ExtendedLatch(); +exitLatch.awaitUninterruptibly(gracePeriod); --- End diff -- I wonder about synchronization. This latch is private, so never triggered, except in a timeout. How is this different than `Thread.sleep()`, other than blocking interrupts? If so, maybe a comment to this effect. Note also that this is called from inside a synchronized block. Just want to verify that this is the intent. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150978981 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -315,7 +481,12 @@ public int compareTo(DrillbitInfo drillbitToCompare) { if (this.isVersionMatch() == drillbitToCompare.isVersionMatch()) { if (this.version.equals(drillbitToCompare.getVersion())) { - return this.address.compareTo(drillbitToCompare.getAddress()); + { +if(this.address.equals(drillbitToCompare.getAddress())) { + return (this.controlPort.compareTo(drillbitToCompare.getControlPort())); --- End diff -- Repeat of the `isSameDrillbit()` pattern noted earlier. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150984759 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; +import com.google.common.io.Files; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.server.Drillbit; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { + + static String testDirPath; + @BeforeClass + public static void setUpTestData() throws Exception { + +final File testDir = getTempDir("graceful_shutdown"); --- End diff -- Will want to change this to integated with the temp dir changes that @ilooner is making for Drill 1.12. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150978095 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -58,13 +63,170 @@ @Inject UserAuthEnabled authEnabled; @Inject WorkManager work; @Inject SecurityContext sc; + @Inject Drillbit drillbit; @GET @Produces(MediaType.TEXT_HTML) public Viewable getClusterInfo() { return ViewableWithPermissions.create(authEnabled.get(), "/rest/index.ftl", sc, getClusterInfoJSON()); } + + @SuppressWarnings("resource") + @GET + @Path("/state") + @Produces(MediaType.APPLICATION_JSON) + public Response getDrillbitStatus(){ +Collection drillbits = getClusterInfoJSON().getDrillbits(); +MapdrillStatusMap = new HashMap(); +for (DrillbitInfo drillbit : drillbits) { + drillStatusMap.put(drillbit.getAddress()+"-"+drillbit.getUserPort(),drillbit.getState()); +} +return Response.ok() +.entity(drillStatusMap) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @GET + @Path("/graceperiod") + @Produces(MediaType.APPLICATION_JSON) + public Map getGracePeriod(){ + +final DrillConfig config = work.getContext().getConfig(); +final int gracePeriod = config.getInt(ExecConstants.GRACE_PERIOD); +Map gracePeriodMap = new HashMap (); +gracePeriodMap.put("graceperiod",gracePeriod); +return gracePeriodMap; + } + + @SuppressWarnings("resource") + @GET + @Path("/queriesCount") + @Produces(MediaType.APPLICATION_JSON) + public Response getRemainingQueries() { +Map queriesInfo = new HashMap (); +queriesInfo = work.getRemainingQueries(); +return Response.ok() +.entity(queriesInfo) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @POST + @Path("/graceful_shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response shutdownDrillbit() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); --- End diff -- Printing the stack trace is OK for debugging. Fo production, we need to log: ``` logger.error("Web request to shutdown drillbit failed", e); ``` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150984249 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/ClusterFixture.java --- @@ -471,6 +471,21 @@ public void close() throws Exception { } /** + * Shutdown the drillbit given the name of the drillbit. + */ + public void closeDrillbit(final String drillbitName) throws Exception { +Exception ex = null; +for (Drillbit bit : drillbits()) { + if(bit.equals(bits.get(drillbitName))) { --- End diff -- Space after `if`, here and below. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150972227 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -77,6 +80,24 @@ private final WorkManager manager; private final BootStrapContext context; private final WebServer webServer; + private final int gracePeriod; + private DrillbitStateManager stateManager; + + public void setQuiescentMode(boolean quiescentMode) { +this.quiescentMode = quiescentMode; + } + + private boolean quiescentMode; + + public void setForcefulShutdown(boolean forcefulShutdown) { +this.forcefulShutdown = forcefulShutdown; + } + + private boolean forcefulShutdown = false; --- End diff -- Same. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150980131 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +169,59 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcefulShutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); --- End diff -- Here we update the counts inside a synchronized block. Do we access them outside the block? How do we synchronize reads? Or, can these be local variables inside this method? Or, should we have an `updateQueryCount()` method that handles the updates? That method would be synchronized, maybe? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150981324 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +169,59 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcefulShutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcefulShutdown ) { + exitLatch.awaitUninterruptibly(5000); +} else { + exitLatch.awaitUninterruptibly(); +} } /** * If it is safe to exit, and the exitLatch is in use, signals it so that waitToExit() will - * unblock. + * unblock. Logs the number of pending fragments and queries that are running on that + * drillbit to track the progress of shutdown process. */ private void indicateIfSafeToExit() { synchronized(this) { if (exitLatch != null) { +logger.info("Waiting for "+ queries.size() +" queries to complete before shutting down"); +logger.info("Waiting for "+ runningFragments.size() +" running fragments to complete before shutting down"); +if(runningFragments.size() > numOfRunningFragments|| queries.size() > numOfRunningQueries) { + logger.info("New Fragments or queries are added while drillbit is Shutting down"); +} if (queries.isEmpty() && runningFragments.isEmpty()) { + // Both Queries and Running fragments are empty. + // So its safe for the drillbit to exit. exitLatch.countDown(); } } } } + /** + * Get the number of queries that are running on a drillbit. + * Primarily used to monitor the number of running queries after a + * shutdown request is triggered. + */ + public MapgetRemainingQueries() { +synchronized (this) { --- End diff -- Since the entire body is synchronized, put the `synchronized` on the method itself. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150974512 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } + + public boolean isForemanOnline() { +DrillbitEndpoint foreman = getEndpoint(); +Collection dbs = getAvailableBits(); +for( DrillbitEndpoint db : dbs) { + if( db.getAddress().equals(foreman.getAddress()) && db.getUserPort() == foreman.getUserPort()) { +if( !isOnline(db)) { --- End diff -- `if( !isOnline...` --> `if (! isOnline...` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150980543 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/work/WorkManager.java --- @@ -165,32 +169,59 @@ public DrillbitContext getContext() { * * This is intended to be used by {@link org.apache.drill.exec.server.Drillbit#close()}. */ - public void waitToExit() { + public void waitToExit(Drillbit bit, boolean forcefulShutdown) { synchronized(this) { - if (queries.isEmpty() && runningFragments.isEmpty()) { + numOfRunningQueries = queries.size(); + numOfRunningFragments = runningFragments.size(); + if ( queries.isEmpty() && runningFragments.isEmpty()) { return; } - + logger.info("Draining " + queries +" queries and "+ runningFragments+" fragments."); exitLatch = new ExtendedLatch(); } - -// Wait for at most 5 seconds or until the latch is released. -exitLatch.awaitUninterruptibly(5000); +// Wait uninterruptibly until all the queries and running fragments on that drillbit goes down +// to zero +if( forcefulShutdown ) { + exitLatch.awaitUninterruptibly(5000); --- End diff -- Why 5000? I guess that is the old magic number. Perhaps move this to a constant: ``` private final int FORCEFUL_SHUTDOWN_GRACE_MS = 5000; ``` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150983656 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -57,19 +57,19 @@ Control Port Data Port Version + Status <#assign i = 1> <#list model.getDrillbits() as drillbit> - + ${i} -${drillbit.getAddress()} - <#if drillbit.isCurrent()> +${drillbit.getAddress()}<#if drillbit.isCurrent()> Current -${drillbit.getUserPort()} +${drillbit.getUserPort()} --- End diff -- Can't us an id here; this is a list of many Drillbits and ids in HTML must be unique (that is, after all, what "identifier" means...) If this is form formatting, use a class. If for identification, append a suffix: "id-1" or "id-xxx-xxx-xxx-xxx:xx". ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150978855 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -58,13 +63,170 @@ @Inject UserAuthEnabled authEnabled; @Inject WorkManager work; @Inject SecurityContext sc; + @Inject Drillbit drillbit; @GET @Produces(MediaType.TEXT_HTML) public Viewable getClusterInfo() { return ViewableWithPermissions.create(authEnabled.get(), "/rest/index.ftl", sc, getClusterInfoJSON()); } + + @SuppressWarnings("resource") + @GET + @Path("/state") + @Produces(MediaType.APPLICATION_JSON) + public Response getDrillbitStatus(){ +Collection drillbits = getClusterInfoJSON().getDrillbits(); +MapdrillStatusMap = new HashMap(); +for (DrillbitInfo drillbit : drillbits) { + drillStatusMap.put(drillbit.getAddress()+"-"+drillbit.getUserPort(),drillbit.getState()); +} +return Response.ok() +.entity(drillStatusMap) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @GET + @Path("/graceperiod") + @Produces(MediaType.APPLICATION_JSON) + public Map getGracePeriod(){ + +final DrillConfig config = work.getContext().getConfig(); +final int gracePeriod = config.getInt(ExecConstants.GRACE_PERIOD); +Map gracePeriodMap = new HashMap (); +gracePeriodMap.put("graceperiod",gracePeriod); +return gracePeriodMap; + } + + @SuppressWarnings("resource") + @GET + @Path("/queriesCount") + @Produces(MediaType.APPLICATION_JSON) + public Response getRemainingQueries() { +Map queriesInfo = new HashMap (); +queriesInfo = work.getRemainingQueries(); +return Response.ok() +.entity(queriesInfo) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @POST + @Path("/graceful_shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response shutdownDrillbit() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + shutdownInfo.put("response", "Shutdown request is triggered"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} +catch (Exception e) { + logger.debug("Exception in triggering shutdown request",e); + shutdownInfo.put("response", "Error in triggering shutdown request"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} + } + + @SuppressWarnings("resource") + @POST + @Path("/shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response ShutdownForcefully() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.setForcefulShutdown(true); +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + shutdownInfo.put("response", "Forceful shutdown request is triggered"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") +
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150974066 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } + + public boolean isForemanOnline() { +DrillbitEndpoint foreman = getEndpoint(); +Collection dbs = getAvailableBits(); +for( DrillbitEndpoint db : dbs) { + if( db.getAddress().equals(foreman.getAddress()) && db.getUserPort() == foreman.getUserPort()) { --- End diff -- `if( db...` --> `if (db...` Also, `for( Drillbit...` --> `for (Drillbit...` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150972745 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -176,18 +200,37 @@ public void run() throws Exception { logger.info("Startup completed ({} ms).", w.elapsed(TimeUnit.MILLISECONDS)); } + /* +Wait uninterruptibly + */ + public void waitForGracePeriod() { +ExtendedLatch exitLatch = null; +exitLatch = new ExtendedLatch(); --- End diff -- ``` ExtendedLatch exitLatch = new ExtendedLatch(); ``` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150978228 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/DrillRoot.java --- @@ -58,13 +63,170 @@ @Inject UserAuthEnabled authEnabled; @Inject WorkManager work; @Inject SecurityContext sc; + @Inject Drillbit drillbit; @GET @Produces(MediaType.TEXT_HTML) public Viewable getClusterInfo() { return ViewableWithPermissions.create(authEnabled.get(), "/rest/index.ftl", sc, getClusterInfoJSON()); } + + @SuppressWarnings("resource") + @GET + @Path("/state") + @Produces(MediaType.APPLICATION_JSON) + public Response getDrillbitStatus(){ +Collection drillbits = getClusterInfoJSON().getDrillbits(); +MapdrillStatusMap = new HashMap(); +for (DrillbitInfo drillbit : drillbits) { + drillStatusMap.put(drillbit.getAddress()+"-"+drillbit.getUserPort(),drillbit.getState()); +} +return Response.ok() +.entity(drillStatusMap) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @GET + @Path("/graceperiod") + @Produces(MediaType.APPLICATION_JSON) + public Map getGracePeriod(){ + +final DrillConfig config = work.getContext().getConfig(); +final int gracePeriod = config.getInt(ExecConstants.GRACE_PERIOD); +Map gracePeriodMap = new HashMap (); +gracePeriodMap.put("graceperiod",gracePeriod); +return gracePeriodMap; + } + + @SuppressWarnings("resource") + @GET + @Path("/queriesCount") + @Produces(MediaType.APPLICATION_JSON) + public Response getRemainingQueries() { +Map queriesInfo = new HashMap (); +queriesInfo = work.getRemainingQueries(); +return Response.ok() +.entity(queriesInfo) +.header("Access-Control-Allow-Origin", "*") +.header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") +.allow("OPTIONS").build(); + } + + @SuppressWarnings("resource") + @POST + @Path("/graceful_shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response shutdownDrillbit() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); + } +} + }).start(); + shutdownInfo.put("response", "Shutdown request is triggered"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} +catch (Exception e) { + logger.debug("Exception in triggering shutdown request",e); + shutdownInfo.put("response", "Error in triggering shutdown request"); + return Response.ok() + .entity(shutdownInfo) + .header("Access-Control-Allow-Origin", "*") + .header("Access-Control-Allow-Methods", "GET, POST, DELETE, PUT") + .header("Access-Control-Allow-Credentials","true") + .allow("OPTIONS").build(); +} + } + + @SuppressWarnings("resource") + @POST + @Path("/shutdown") + @Produces(MediaType.APPLICATION_JSON) + public Response ShutdownForcefully() throws Exception { +Map shutdownInfo = new HashMap (); +try { + new Thread(new Runnable() { +public void run() { + try { +drillbit.setForcefulShutdown(true); +drillbit.close(); + } catch (Exception e) { +e.printStackTrace(); --- End diff -- See above. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150973864 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } --- End diff -- A bit too much code for a single line. Maybe put the body inside the method on a separate line. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150983939 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -78,6 +78,11 @@ ${drillbit.getVersion()} +${drillbit.getState()} + + SHUTDOWN --- End diff -- Same id issue. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150983296 --- Diff: exec/java-exec/src/main/resources/rest/index.ftl --- @@ -57,19 +57,19 @@ Control Port Data Port Version + Status <#assign i = 1> <#list model.getDrillbits() as drillbit> - + ${i} -${drillbit.getAddress()} - <#if drillbit.isCurrent()> +${drillbit.getAddress()}<#if drillbit.isCurrent()> --- End diff -- `"address" >` --> `"address">` If-statement back to being on its own line? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150984538 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/TestGracefulShutdown.java --- @@ -0,0 +1,248 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.test; +import com.google.common.io.Files; +import org.apache.commons.io.FileUtils; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.exec.server.Drillbit; +import org.junit.Assert; +import org.junit.BeforeClass; +import org.junit.Test; +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.HttpURLConnection; +import java.net.URL; +import java.util.Collection; +import java.util.Properties; +import static org.junit.Assert.assertNotEquals; +import static org.junit.Assert.fail; + +public class TestGracefulShutdown { --- End diff -- Add `extends DrillTest` so we get the narration of tests as they run. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150975442 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; + public void setState(DrillbitState state) { --- End diff -- Maybe `state` --> `newState` to make clear that this is the state we would like to transition to. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150975269 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitStateManager.java --- @@ -0,0 +1,80 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.server; +/* + State manager to manage the state of drillbit. + */ +public class DrillbitStateManager { + + + public DrillbitStateManager(DrillbitState currentState) { +this.currentState = currentState; + } + + public enum DrillbitState { +STARTUP, ONLINE, GRACE, DRAINING, OFFLINE, SHUTDOWN + } + + public DrillbitState getState() { +return currentState; + } + + private DrillbitState currentState; --- End diff -- Please move members after enum, before constructor. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150972162 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -77,6 +80,24 @@ private final WorkManager manager; private final BootStrapContext context; private final WebServer webServer; + private final int gracePeriod; + private DrillbitStateManager stateManager; + + public void setQuiescentMode(boolean quiescentMode) { +this.quiescentMode = quiescentMode; + } + + private boolean quiescentMode; --- End diff -- Please move field up with other fields. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150971745 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -200,11 +206,50 @@ public void unregister(RegistrationHandle handle) { } } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state in Zookeeper when a shutdown request of drillbit is + * triggered. State information is used during planning and + * initial client connection phases. + */ + public RegistrationHandle update(RegistrationHandle handle, State state) { +ZKRegistrationHandle h = (ZKRegistrationHandle) handle; + try { +endpoint = h.endpoint.toBuilder().setState(state).build(); +ServiceInstance serviceInstance = ServiceInstance.builder() +.name(serviceName) +.id(h.id) +.payload(endpoint).build(); +discovery.updateService(serviceInstance); + } catch (Exception e) { +e.printStackTrace(); + } + return handle; + } + @Override public Collection getAvailableEndpoints() { return this.endpoints; } + /* + * Get a collection of ONLINE Drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + @Override + public Collection getOnlineEndPoints() { +Collection runningEndPoints = new ArrayList<>(); +for (DrillbitEndpoint endpoint: endpoints){ + if(!endpoint.hasState() || endpoint.getState().equals(State.ONLINE)) { --- End diff -- This stanza appears multiple times. Can we define a static function that does the double check to avoid redundant code? ``` boolean isInState(State state) { ... ``` Choose an appropriate name, maybe `isDrillbitInState` or whatever. ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150972546 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/Drillbit.java --- @@ -135,8 +157,9 @@ public Drillbit( profileStoreProvider = storeProvider; } -engine = new ServiceEngine(manager, context, allowPortHunting, isDistributedMode); +engine = new ServiceEngine(manager, context, true, isDistributedMode); --- End diff -- Do we really want to change to always allow port hunting? This means that an attempt to start a second Drillbit on a node will succeed by default, rather than fail. OK? ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150974323 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/DrillbitContext.java --- @@ -157,10 +157,29 @@ public DrillConfig getConfig() { return context.getConfig(); } - public Collection getBits() { + public Collection getAvailableBits() { return coord.getAvailableEndpoints(); } + public Collection getBits() { +return coord.getOnlineEndPoints(); + } + + public boolean isOnline(DrillbitEndpoint endpoint) { return endpoint.getState().equals(DrillbitEndpoint.State.ONLINE); } + + public boolean isForemanOnline() { +DrillbitEndpoint foreman = getEndpoint(); +Collection dbs = getAvailableBits(); +for( DrillbitEndpoint db : dbs) { + if( db.getAddress().equals(foreman.getAddress()) && db.getUserPort() == foreman.getUserPort()) { --- End diff -- This is another place for a static function: ``` public boolean isSameDrillbit(DrillbitEndpoint target) ... ``` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150971276 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -200,11 +206,50 @@ public void unregister(RegistrationHandle handle) { } } + /** + * Update drillbit endpoint state. Drillbit advertises its + * state in Zookeeper when a shutdown request of drillbit is + * triggered. State information is used during planning and + * initial client connection phases. + */ + public RegistrationHandle update(RegistrationHandle handle, State state) { +ZKRegistrationHandle h = (ZKRegistrationHandle) handle; + try { +endpoint = h.endpoint.toBuilder().setState(state).build(); +ServiceInstance serviceInstance = ServiceInstance.builder() +.name(serviceName) +.id(h.id) +.payload(endpoint).build(); +discovery.updateService(serviceInstance); + } catch (Exception e) { +e.printStackTrace(); + } + return handle; + } + @Override public Collection getAvailableEndpoints() { return this.endpoints; } + /* + * Get a collection of ONLINE Drillbit endpoints by excluding the drillbits + * that are in QUIESCENT state (drillbits shutting down). Primarily used by the planner + * to plan queries only on ONLINE drillbits and used by the client during initial connection + * phase to connect to a drillbit (foreman) + * @return A collection of ONLINE endpoints + */ + @Override + public Collection getOnlineEndPoints() { +Collection runningEndPoints = new ArrayList<>(); +for (DrillbitEndpoint endpoint: endpoints){ + if(!endpoint.hasState() || endpoint.getState().equals(State.ONLINE)) { +runningEndPoints.add(endpoint); + } +} +logger.debug("Online endpoints in ZK are"+runningEndPoints.toString()); --- End diff -- Minor: spaces around `+` ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/921#discussion_r150971910 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java --- @@ -229,27 +275,42 @@ public DrillbitEndpoint apply(ServiceInstance input) { }); // set of newly dead bits : original bits - new set of active bits. - Set unregisteredBits = new HashSet<>(endpoints); - unregisteredBits.removeAll(newDrillbitSet); - + Set unregisteredBits = new HashSet<>(); // Set of newly live bits : new set of active bits - original bits. - Set registeredBits = new HashSet<>(newDrillbitSet); - registeredBits.removeAll(endpoints); + Set registeredBits = new HashSet<>(); - endpoints = newDrillbitSet; + // Updates the endpoints map if there is a change in state of the endpoint or with the addition + // of new drillbit endpoints. Registered endpoints is set to newly live drillbit endpoints. + for ( DrillbitEndpoint endpoint : newDrillbitSet) { +String endpointAddress = endpoint.getAddress(); +int endpointPort = endpoint.getUserPort(); +if (! endpointsMap.containsKey(new MultiKey(endpointAddress, endpointPort))) { + registeredBits.add(endpoint); +} +endpointsMap.put(new MultiKey(endpointAddress, endpointPort),endpoint); + } +// Remove all the endpoints that are newly dead --- End diff -- Minor: comment character usually is aligned with code, as above. ---
[jira] [Created] (DRILL-5964) Do not allow queries to access paths outside the current workspace root
Parth Chandra created DRILL-5964: Summary: Do not allow queries to access paths outside the current workspace root Key: DRILL-5964 URL: https://issues.apache.org/jira/browse/DRILL-5964 Project: Apache Drill Issue Type: Improvement Reporter: Parth Chandra Workspace definitions in the dfs plugin are intended to provide a convenient shortcut to long directory paths. However, some users may wish to disallow access to paths outside the root of the workspace, possibly to prevent accidental access. Note that this is a convenience option and not a substitute for permissions on the file system. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/914 Rebased on master. Squashed all previous commits; left code review revisions separate for now. ---
[GitHub] drill pull request #1036: Adding ST_AsJSON functionality
GitHub user ChrisSandison opened a pull request: https://github.com/apache/drill/pull/1036 Adding ST_AsJSON functionality Fixes #DRILL-5962 You can merge this pull request into a Git repository by running: $ git pull https://github.com/ChrisSandison/drill DRILL-5962 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1036.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1036 commit fe633ab7282d69b8eca30a255f84ec68f2bafb3b Author: chrisDate: 2017-11-14T21:26:21Z Adding ST_AsJSON functionality Fixes #DRILL-5962 ---
[GitHub] drill pull request #1029: DRILL-5867: List profiles in pages rather than a l...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1029#discussion_r150938880 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java --- @@ -93,13 +96,35 @@ public ProfileInfo(DrillConfig drillConfig, String queryId, long startTime, long this.time = new Date(startTime); this.foreman = foreman; this.link = generateLink(drillConfig, foreman, queryId); - this.query = query.substring(0, Math.min(query.length(), 150)); + this.query = extractQuerySnippet(query); this.state = state; this.user = user; this.totalCost = totalCost; this.queueName = queueName; } +private String extractQuerySnippet(String queryText) { + //Extract upto max char limit as snippet + String sizeCappedQuerySnippet = queryText.substring(0, Math.min(queryText.length(), QUERY_SNIPPET_MAX_CHAR)); + //Trimming down based on line-count + if ( QUERY_SNIPPET_MAX_LINES < sizeCappedQuerySnippet.split(System.lineSeparator()).length ) { --- End diff -- +1 ---
[GitHub] drill pull request #1029: DRILL-5867: List profiles in pages rather than a l...
Github user kkhatua commented on a diff in the pull request: https://github.com/apache/drill/pull/1029#discussion_r150938808 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/server/rest/profile/ProfileResources.java --- @@ -93,13 +96,35 @@ public ProfileInfo(DrillConfig drillConfig, String queryId, long startTime, long this.time = new Date(startTime); this.foreman = foreman; this.link = generateLink(drillConfig, foreman, queryId); - this.query = query.substring(0, Math.min(query.length(), 150)); + this.query = extractQuerySnippet(query); this.state = state; this.user = user; this.totalCost = totalCost; this.queueName = queueName; } +private String extractQuerySnippet(String queryText) { + //Extract upto max char limit as snippet + String sizeCappedQuerySnippet = queryText.substring(0, Math.min(queryText.length(), QUERY_SNIPPET_MAX_CHAR)); + //Trimming down based on line-count + if ( QUERY_SNIPPET_MAX_LINES < sizeCappedQuerySnippet.split(System.lineSeparator()).length ) { +int linesConstructed = 0; +StringBuilder lineCappedQuerySnippet = new StringBuilder(); +String[] queryParts = sizeCappedQuerySnippet.split(System.lineSeparator()); +for (String qPart : queryParts) { + lineCappedQuerySnippet.append(qPart); + if ( ++linesConstructed < QUERY_SNIPPET_MAX_LINES ) { +lineCappedQuerySnippet.append(System.lineSeparator()); --- End diff -- I wanted to preserve the original query format, hence, I'm applying new line instead of spaces. Besides, the snippet can also carry single-line comments, and putting them into 1 line will make it appear that anything following a `--` is all comments ---
[jira] [Created] (DRILL-5963) Canceling a query hung in planning state, leaves the query in ENQUEUED state for ever.
Khurram Faraaz created DRILL-5963: - Summary: Canceling a query hung in planning state, leaves the query in ENQUEUED state for ever. Key: DRILL-5963 URL: https://issues.apache.org/jira/browse/DRILL-5963 Project: Apache Drill Issue Type: Bug Components: Execution - Flow Affects Versions: 1.12.0 Environment: Drill 1.12.0-SNAPSHOT, commit: 4a718a0bd728ae02b502ac93620d132f0f6e1b6c Reporter: Khurram Faraaz Priority: Critical Canceling the below query that is hung in planning state, leaves the query in ENQUEUED state for ever. Here is the query that is hung in planning state {noformat} 0: jdbc:drill:schema=dfs.tmp> select 1 || ',' || 2 || ',' || 3 || ',' || 4 || ',' || 5 || ',' || 6 || ',' || 7 || ',' || 8 || ',' || 9 || ',' || 0 || ',' AS CSV_DATA from (values(1)); +--+ | | +--+ +--+ No rows selected (304.291 seconds) {noformat} Explain plan for that query also just hangs. {noformat} explain plan for select 1 || ',' || 2 || ',' || 3 || ',' || 4 || ',' || 5 || ',' || 6 || ',' || 7 || ',' || 8 || ',' || 9 || ',' || 0 || ',' AS CSV_DATA from (values(1)); ... {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/914 Pushed a commit with the changes suggested by Parth's review comments. @parthchandra, please review. ---
[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/914 Finally, a note on the fragmentation issue. As you noted, this is a subtle issue. It is true that Netty maintains a memory pool, based on binary allocations, that minimizes the normal kind of fragmentation that results from random sized allocations from a common pool. The cost of the binary structure is _internal_ fragmentation. Today, Drill vectors have, on average, 25% internal fragmentation. This PR does not address this issue per-se, but sets us on the road toward a solution. The key fragmentation issue that this PR _does_ deal with is that which occurs when allocations exceed the 16 MB (default) Netty block size. In that case, Netty does, in fact, go to the OS. The OS does a fine job of coalescing large blocks to prevent fragmentation. The problem, however, is that, over time, more and more memory resides in the Netty free list. Eventually, there simply is not enough memory left outside of Netty to service a jumbo (> 16MB) block. Drill gets an OOM error though Netty has many GB of memory free; just none available in the 32+ MB size we want. We could force Netty to release unused memory. In fact, the original [JE-Malloc paper](https://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf) (that you provided way back when, thanks) points out that the allocator should monitor its pools and release memory back to the system when a pool usage drops to zero. It does not appear that `PooledByteBufAllocatorL` implemented this feature, so the allocator never releases memory once it lands in the allocator's free list. We could certainly fix this; the JE-Malloc paper provides suggestions. Still, however, we could end up with usage patterns in which some slice of memory is used from each chunk, blocking any chunk from being released to the OS, and thereby blocking a "jumbo" block allocation, again though much memory is free on the free list. This is yet another form of fragmentation. Finally, as you point out, all of this assumes that we want to continue to allocate "jumbo" blocks. But, as we discovered in the managed sort work, and the hash agg spill work, Drill has two conflicting tendencies. On the one hand, "managed" operators wish to operate within a constrained memory footprint. (Which seems to often end up being on the order of 30 MB for the sort for various reasons.) If the scan operator, say, decides to allocate a batch that contains 32 MB vectors, then the sort can't accept even one of those batches an an OOM ensues. So, rather than solve our memory fragmentation issues by mucking with Netty (force free of unused chunks, increase chunk size, etc.) The preferred solution is to live within a budget: both the constraints of the Netty chunk size *and* the constraints placed on Drill operator memory usage. In short, we started by wanting to solve the fragmentation issue, but we realized that the best solution is to also solve the unlimited-batch-size issue, hence this PR. ---
[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/914 @parthchandra, thank you for taking the time to review this large chunk of code. Thanks for noticing the tests; I see those as the only way to ensure that a change of this size actually works; expecting a human to "mentally execute" the code is not practical at this scale. I did incorporate the change you suggested for grabbing the chunk size from `PooledByteBufAllocatorL` via a new static method in `AllocationManager`. Very good suggestion; thanks. ---
[GitHub] drill issue #1035: DRILL-5801: Gantt chart (fragment timeline) enhancements
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1035 @paul-rogers (cc: @arina-ielchiieva ) since this was your ask, please review the PR. ---
[GitHub] drill pull request #1035: DRILL-5801: Gantt chart (fragment timeline) enhanc...
GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/1035 DRILL-5801: Gantt chart (fragment timeline) enhancements 1. Labelled X and Y axes on the Gantt Chart that expresses the fragments' timelines 2. Support mouse hover to reveal major fragment ID * With 104 minor fragments / 5 major fragments : _You can see the tooltip (mouse pointer was not captured in screenshot)_ ![image](https://user-images.githubusercontent.com/4335237/32798813-b55b6f52-c92a-11e7-80b0-205f95e35081.png) * With 14 minor fragments / 4 major fragments : _Check for label alignment_ ![image](https://user-images.githubusercontent.com/4335237/32798883-e57f4f1e-c92a-11e7-9ce0-b2522f42c584.png) * With 1 minor fragment / 1 major fragment : _Check for label alignment_ ![image](https://user-images.githubusercontent.com/4335237/32798928-07a5f21e-c92b-11e7-9f14-9ee2e2605719.png) You can merge this pull request into a Git repository by running: $ git pull https://github.com/kkhatua/drill DRILL-5801 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1035.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1035 commit d8a9ff6b2cd112fe78ff9d412e1c6769b4b3a717 Author: Kunal KhatuaDate: 2017-11-14T18:48:36Z DRILL-5801: Gantt chart (fragment timeline) enhancements 1. Labelled X and Y axes on the Gantt Chart that expresses the fragments' timelines 2. Support mouse hover to reveal major fragment ID ---
[jira] [Created] (DRILL-5962) Add function STAsGeoJSON to extend GIS support
Chris Sandison created DRILL-5962: - Summary: Add function STAsGeoJSON to extend GIS support Key: DRILL-5962 URL: https://issues.apache.org/jira/browse/DRILL-5962 Project: Apache Drill Issue Type: Bug Components: Functions - Drill Reporter: Chris Sandison Priority: Minor Add function as wrapper to ESRI's `asJson` -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill pull request #1034: Adding asGeoJSON function
GitHub user ChrisSandison opened a pull request: https://github.com/apache/drill/pull/1034 Adding asGeoJSON function Fixes #5960 You can merge this pull request into a Git repository by running: $ git pull https://github.com/ChrisSandison/drill DRILL-5960 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1034.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1034 commit a34cad50e942408220bfc575ec10df89d64c57e5 Author: chrisDate: 2017-11-14T14:28:36Z Adding asGeoJSON function Fixes #5960 ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r150753441 --- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java --- @@ -191,141 +180,268 @@ public void bind(RowIndex vectorIndex, ValueVector vector) { <#if accessorType=="BigDecimal"> <#assign label="Decimal"> +<#if drillType == "VarChar" || drillType == "Var16Char"> + <#assign accessorType = "byte[]"> + <#assign label = "Bytes"> + <#if ! notyet> // // ${drillType} readers and writers - public static class ${drillType}ColumnReader extends AbstractColumnReader { + public static class ${drillType}ColumnReader extends BaseScalarReader { -<@bindReader "" drillType /> +<@bindReader "" drillType false /> -<@getType label /> +<@getType drillType label /> <@get drillType accessorType label false/> } - public static class Nullable${drillType}ColumnReader extends AbstractColumnReader { + public static class Nullable${drillType}ColumnReader extends BaseScalarReader { -<@bindReader "Nullable" drillType /> +<@bindReader "Nullable" drillType false /> -<@getType label /> +<@getType drillType label /> @Override public boolean isNull() { - return accessor().isNull(vectorIndex.index()); -} - -<@get drillType accessorType label false/> - } - - public static class Repeated${drillType}ColumnReader extends AbstractArrayReader { - -<@bindReader "Repeated" drillType /> - -<@getType label /> - -@Override -public int size() { - return accessor().getInnerValueCountAt(vectorIndex.index()); + return accessor().isNull(vectorIndex.vectorIndex()); } -<@get drillType accessorType label true/> +<@get drillType accessorType label false /> } - public static class ${drillType}ColumnWriter extends AbstractColumnWriter { + public static class Repeated${drillType}ColumnReader extends BaseElementReader { -<@bindWriter "" drillType /> +<@bindReader "" drillType true /> -<@getType label /> +<@getType drillType label /> -<@set drillType accessorType label false "set" /> +<@get drillType accessorType label true /> } - public static class Nullable${drillType}ColumnWriter extends AbstractColumnWriter { - -<@bindWriter "Nullable" drillType /> + <#assign varWidth = drillType == "VarChar" || drillType == "Var16Char" || drillType == "VarBinary" /> + <#if varWidth> + public static class ${drillType}ColumnWriter extends BaseVarWidthWriter { + <#else> + public static class ${drillType}ColumnWriter extends BaseFixedWidthWriter { +<#if drillType = "Decimal9" || drillType == "Decimal18" || + drillType == "Decimal28Sparse" || drillType == "Decimal38Sparse"> +private MajorType type; + +private static final int VALUE_WIDTH = ${drillType}Vector.VALUE_WIDTH; + +private final ${drillType}Vector vector; + +public ${drillType}ColumnWriter(final ValueVector vector) { + <#if varWidth> + super(((${drillType}Vector) vector).getOffsetVector()); + <#else> +<#if drillType = "Decimal9" || drillType == "Decimal18" || + drillType == "Decimal28Sparse" || drillType == "Decimal38Sparse"> + type = vector.getField().getType(); + + + this.vector = (${drillType}Vector) vector; +} -<@getType label /> +@Override public ValueVector vector() { return vector; } +<#-- All change of buffer comes through this function to allow capturing + the buffer address and capacity. Only two ways to set the buffer: + by binding to a vector in bindVector(), or by resizing the vector + in writeIndex(). --> @Override -public void setNull() { - mutator.setNull(vectorIndex.index()); +protected final void setAddr() { + final DrillBuf buf = vector.getBuffer(); + bufAddr = buf.addr(); + <#if varWidth> + capacity = buf.capacity(); + <#else> + <#-- Turns out that keeping track of capacity as the count of + values simplifies the per-value code path. --> + capacity = buf.capacity() / VALUE_WIDTH; + } -<@set drillType accessorType label true "set" /> - } - - public static class
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r150755238 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/record/MaterializedField.java --- @@ -168,6 +174,58 @@ public boolean equals(Object obj) { Objects.equals(this.type, other.type); } + public boolean isEquivalent(MaterializedField other) { +if (! name.equalsIgnoreCase(other.name)) { + return false; +} + +// Requires full type equality, including fields such as precision and scale. +// But, unset fields are equivalent to 0. Can't use the protobuf-provided +// isEquals(), that treats set and unset fields as different. + +if (type.getMinorType() != other.type.getMinorType()) { + return false; +} +if (type.getMode() != other.type.getMode()) { + return false; +} +if (type.getScale() != other.type.getScale()) { + return false; +} +if (type.getPrecision() != other.type.getPrecision()) { + return false; +} + +// Compare children -- but only for maps, not the internal children +// for Varchar, repeated or nullable types. + +if (type.getMinorType() != MinorType.MAP) { + return true; +} + +if (children == null || other.children == null) { + return children == other.children; +} +if (children.size() != other.children.size()) { + return false; +} + +// Maps are name-based, not position. But, for our +// purposes, we insist on identical ordering. + +Iterator thisIter = children.iterator(); +Iterator otherIter = other.children.iterator(); +while (thisIter.hasNext()) { --- End diff -- The row set & writer abstractions require identical ordering so that column indexes are well-defined. Here we are facing the age-old philosophical question of "sameness." Sameness is instrumental: sameness-for-a-purpose. Here, we want to know if two schemas are equivalent for the purposes of referencing columns by index. We recently did a fix elsewhere we do use the looser definition: that A and B contain the same columns, but in possibly different orderings. Added a comment to explain this. ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r150758630 --- Diff: exec/vector/src/main/java/org/apache/drill/exec/vector/accessor/writer/BaseScalarWriter.java --- @@ -0,0 +1,264 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.vector.accessor.writer; + +import java.math.BigDecimal; + +import org.apache.drill.exec.vector.accessor.ColumnWriterIndex; +import org.apache.drill.exec.vector.accessor.impl.HierarchicalFormatter; +import org.joda.time.Period; + +/** + * Column writer implementation that acts as the basis for the + * generated, vector-specific implementations. All set methods + * throw an exception; subclasses simply override the supported + * method(s). + * + * The only tricky part to this class is understanding the + * state of the write indexes as the write proceeds. There are + * two pointers to consider: + * + * lastWriteIndex: The position in the vector at which the + * client last asked us to write data. This index is maintained + * in this class because it depends only on the actions of this + * class. + * vectorIndex: The position in the vector at which we will + * write if the client chooses to write a value at this time. + * The vector index is shared by all columns at the same repeat + * level. It is incremented as the client steps through the write + * and is observed in this class each time a write occurs. + * + * A repeat level is defined as any of the following: + * + * The set of top-level scalar columns, or those within a + * top-level, non-repeated map, or nested to any depth within + * non-repeated maps rooted at the top level. + * The values for a single scalar array. + * The set of scalar columns within a repeated map, or + * nested within non-repeated maps within a repeated map. + * + * Items at a repeat level index together and share a vector + * index. However, the columns within a repeat level + * do not share a last write index: some can lag further + * behind than others. + * + * Let's illustrate the states. Let's focus on one column and + * illustrate the three states that can occur during write: + * + * Behind: the last write index is more than one position behind + * the vector index. Zero-filling will be needed to catch up to + * the vector index. + * Written: the last write index is the same as the vector + * index because the client wrote data at this position (and previous + * values were back-filled with nulls, empties or zeros.) + * Unwritten: the last write index is one behind the vector + * index. This occurs when the column was written, then the client + * moved to the next row or array position. + * Restarted: The current row is abandoned (perhaps filtered + * out) and is to be rewritten. The last write position moves + * back one position. Note that, the Restarted state is + * indistinguishable from the unwritten state: the only real + * difference is that the current slot (pointed to by the + * vector index) contains the previous written value that must + * be overwritten or back-filled. But, this is fine, because we + * assume that unwritten values are garbage anyway. + * + * To illustrate: + * Behind WrittenUnwrittenRestarted + * |X| |X| |X| |X| + * lw >|X| |X| |X| |X| + * | | |0| |0| lw > |0| + *v >| | lw, v > |X|lw > |X| v > |X| + *v > | | + * + * The illustrated state transitions are: + * + * Suppose the state starts in Behind. + * If the client writes a value, then the empty slot is + * back-filled and the state moves to Written. + * If the client does not write a value, the state stays + * at Behind, and the gap of unfilled values
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r150725372 --- Diff: exec/java-exec/src/test/java/org/apache/drill/test/rowSet/test/RowSetTest.java --- @@ -19,420 +19,648 @@ import static org.junit.Assert.assertEquals; import static org.junit.Assert.assertFalse; -import static org.junit.Assert.assertNotNull; -import static org.junit.Assert.assertNull; import static org.junit.Assert.assertSame; import static org.junit.Assert.assertTrue; +import static org.junit.Assert.fail; + +import java.io.UnsupportedEncodingException; -import org.apache.drill.common.types.TypeProtos.DataMode; import org.apache.drill.common.types.TypeProtos.MinorType; import org.apache.drill.exec.record.BatchSchema; +import org.apache.drill.exec.record.TupleMetadata; +import org.apache.drill.exec.vector.ValueVector; import org.apache.drill.exec.vector.accessor.ArrayReader; import org.apache.drill.exec.vector.accessor.ArrayWriter; -import org.apache.drill.exec.vector.accessor.TupleAccessor.TupleSchema; +import org.apache.drill.exec.vector.accessor.ObjectType; +import org.apache.drill.exec.vector.accessor.ScalarElementReader; +import org.apache.drill.exec.vector.accessor.ScalarReader; +import org.apache.drill.exec.vector.accessor.ScalarWriter; +import org.apache.drill.exec.vector.accessor.TupleReader; +import org.apache.drill.exec.vector.accessor.TupleWriter; +import org.apache.drill.exec.vector.accessor.ValueType; +import org.apache.drill.exec.vector.complex.MapVector; +import org.apache.drill.exec.vector.complex.RepeatedMapVector; import org.apache.drill.test.SubOperatorTest; import org.apache.drill.test.rowSet.RowSet.ExtendableRowSet; -import org.apache.drill.test.rowSet.RowSet.RowSetReader; -import org.apache.drill.test.rowSet.RowSet.RowSetWriter; import org.apache.drill.test.rowSet.RowSet.SingleRowSet; import org.apache.drill.test.rowSet.RowSetComparison; -import org.apache.drill.test.rowSet.RowSetSchema; -import org.apache.drill.test.rowSet.RowSetSchema.FlattenedSchema; -import org.apache.drill.test.rowSet.RowSetSchema.PhysicalSchema; +import org.apache.drill.test.rowSet.RowSetReader; +import org.apache.drill.test.rowSet.RowSetWriter; import org.apache.drill.test.rowSet.SchemaBuilder; +import org.bouncycastle.util.Arrays; --- End diff -- Eclipse auto-import... Fixed. ---
[GitHub] drill pull request #914: DRILL-5657: Size-aware vector writer structure
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/914#discussion_r150753305 --- Diff: exec/vector/src/main/codegen/templates/ColumnAccessors.java --- @@ -191,141 +180,268 @@ public void bind(RowIndex vectorIndex, ValueVector vector) { <#if accessorType=="BigDecimal"> <#assign label="Decimal"> +<#if drillType == "VarChar" || drillType == "Var16Char"> + <#assign accessorType = "byte[]"> + <#assign label = "Bytes"> + <#if ! notyet> // // ${drillType} readers and writers - public static class ${drillType}ColumnReader extends AbstractColumnReader { + public static class ${drillType}ColumnReader extends BaseScalarReader { -<@bindReader "" drillType /> +<@bindReader "" drillType false /> -<@getType label /> +<@getType drillType label /> <@get drillType accessorType label false/> } - public static class Nullable${drillType}ColumnReader extends AbstractColumnReader { + public static class Nullable${drillType}ColumnReader extends BaseScalarReader { -<@bindReader "Nullable" drillType /> +<@bindReader "Nullable" drillType false /> -<@getType label /> +<@getType drillType label /> @Override public boolean isNull() { - return accessor().isNull(vectorIndex.index()); -} - -<@get drillType accessorType label false/> - } - - public static class Repeated${drillType}ColumnReader extends AbstractArrayReader { - -<@bindReader "Repeated" drillType /> - -<@getType label /> - -@Override -public int size() { - return accessor().getInnerValueCountAt(vectorIndex.index()); + return accessor().isNull(vectorIndex.vectorIndex()); } -<@get drillType accessorType label true/> +<@get drillType accessorType label false /> } - public static class ${drillType}ColumnWriter extends AbstractColumnWriter { + public static class Repeated${drillType}ColumnReader extends BaseElementReader { -<@bindWriter "" drillType /> +<@bindReader "" drillType true /> -<@getType label /> +<@getType drillType label /> -<@set drillType accessorType label false "set" /> +<@get drillType accessorType label true /> } - public static class Nullable${drillType}ColumnWriter extends AbstractColumnWriter { - -<@bindWriter "Nullable" drillType /> + <#assign varWidth = drillType == "VarChar" || drillType == "Var16Char" || drillType == "VarBinary" /> + <#if varWidth> + public static class ${drillType}ColumnWriter extends BaseVarWidthWriter { + <#else> + public static class ${drillType}ColumnWriter extends BaseFixedWidthWriter { +<#if drillType = "Decimal9" || drillType == "Decimal18" || + drillType == "Decimal28Sparse" || drillType == "Decimal38Sparse"> +private MajorType type; + +private static final int VALUE_WIDTH = ${drillType}Vector.VALUE_WIDTH; + +private final ${drillType}Vector vector; + +public ${drillType}ColumnWriter(final ValueVector vector) { + <#if varWidth> + super(((${drillType}Vector) vector).getOffsetVector()); + <#else> +<#if drillType = "Decimal9" || drillType == "Decimal18" || + drillType == "Decimal28Sparse" || drillType == "Decimal38Sparse"> + type = vector.getField().getType(); + + + this.vector = (${drillType}Vector) vector; +} -<@getType label /> +@Override public ValueVector vector() { return vector; } +<#-- All change of buffer comes through this function to allow capturing + the buffer address and capacity. Only two ways to set the buffer: + by binding to a vector in bindVector(), or by resizing the vector + in writeIndex(). --> @Override -public void setNull() { - mutator.setNull(vectorIndex.index()); +protected final void setAddr() { + final DrillBuf buf = vector.getBuffer(); + bufAddr = buf.addr(); + <#if varWidth> + capacity = buf.capacity(); + <#else> + <#-- Turns out that keeping track of capacity as the count of + values simplifies the per-value code path. --> + capacity = buf.capacity() / VALUE_WIDTH; + } -<@set drillType accessorType label true "set" /> - } - - public static class
[GitHub] drill issue #1030: DRILL-5941: Skip header / footer improvements for Hive st...
Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1030 @ppadma To create reader for each input split and maintain skip header / footer functionality we need to know how many rows are in input split. Unfortunately, input split does not hold such information, only number of bytes. [1] We can't apply skip header functionality for the first input split and skip footer for the last input either since we don't know how many rows will be skipped, it can be the situation that we need to skip the whole first input split and partially second. @paul-rogers To read from hive we actually use Hadoop reader [2, 3] so if I am not mistaken unfortunately the described above approach can be applied. [1] https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/mapred/FileSplit.html [2] https://github.com/apache/drill/blob/master/contrib/storage-hive/core/src/main/java/org/apache/drill/exec/store/hive/HiveAbstractReader.java#L234 [3] https://hadoop.apache.org/docs/r2.7.0/api/org/apache/hadoop/mapred/RecordReader.html ---
[jira] [Created] (DRILL-5961) For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments
Vlad Rozov created DRILL-5961: - Summary: For long running queries (> 10 min) Drill may raise FragmentSetupException for completed/cancelled fragments Key: DRILL-5961 URL: https://issues.apache.org/jira/browse/DRILL-5961 Project: Apache Drill Issue Type: Bug Reporter: Vlad Rozov Assignee: Vlad Rozov {{WorkEventBus}} uses {{recentlyFinishedFragments}} cache to check for completed or cancelled fragments. Such check is not reliable as entries in {{recentlyFinishedFragments}} expire after 10 minutes, so {{FragmentSetupException}} is raised even for completed or cancelled queries. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[GitHub] drill issue #984: DRILL-5783 Made a unit test for generated Priority Queue. ...
Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/984 @ilooner as far as know you are doing presentation today connected with this pull request. If overall approach will be accepted, then we'll try to merge this pull request. Please resolve the conflicts as well. ---