[GitHub] drill issue #972: DRILL-5838: Fix MaprDB filter pushdown for the case of nes...

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/972
  
I think we have an opportunity here to take as step back and think a bit 
about design rather than just applying code patches.

Drill must handle a large variety of names:

{noformat}
this.kind
this kind
$stranger+stuff#
{noformat}

Drill supports paths of names using Drill's SQL syntax:

*part1* . *part2* . *part3*

Drill's SQL rules require quoting if the name does not follow SQL naming 
syntax or would be confused with a SQL reserved word (something we should fix):

{noformat}
fine.`needs.quote`.`also-needs+quotes`
{noformat}

So, we have three problems when dealing with external systems:

* How do we convert from the naming system of the external tool into Drill?
* How do we represent the names within Drill unambiguously?
* How do we give Drill's names back to the external system?

Drill must handle a variety of systems; Drill's internal naming system 
can't bend with the wind to mean one thing for MySQL, another for Parquet and 
something else for MapR DB. The conversion must be done at the interface.

Our convention is (or should be):

* External names, or at least those that are not valid Drill symbols, must 
be quoted in SQL.
* Internally, each name segment is stored as a `NamePart` object, with a 
collection of parts forming a `SchemaPath`.
* When calling back out to the external system, the interface (plugin) must 
convert from Drill format back out to the external system format.

Now, let's consider this change. In Drill code, we call Drill methods to 
convert Drill names to MapR DB format. While this just might work, it does not 
really follow the above rules.

What we want is a `String drillToMapRDName(SchemaPath path)` method that 
applies DB-specific rules for conversion at the point of calls into DB. All 
names within Drill (unless cached only in the DB storage plugin) must be in 
Drill's internal form.

This point would not be that important if Drill worked with just one 
external system (DB in this case). But, as Drill evolves, it must work with a 
variety of systems. We should not rely on methods on Drill classes to provide 
the proper name format for external systems: external systems must provide 
those rules. This is true even if it happens to work out, for now, that Drill's 
current methods happen to work.

Just to drive the point home. Here we are changing a Drill method to 
produce names in a format that DB needs. But, might some other external system 
interface depend on the old format? How will this converge?

Bottom line:

Do the conversion in DB-specific code and will work fine.


---


[GitHub] drill pull request #972: DRILL-5838: Fix MaprDB filter pushdown for the case...

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/972#discussion_r143107921
  
--- Diff: 
contrib/format-maprdb/src/main/java/org/apache/drill/exec/store/mapr/db/json/JsonConditionBuilder.java
 ---
@@ -159,73 +159,73 @@ private void setIsCondition(QueryCondition c,
   private JsonScanSpec createJsonScanSpec(FunctionCall call,
   CompareFunctionsProcessor processor) {
 String functionName = processor.getFunctionName();
-SchemaPath field = processor.getPath();
+String fieldPath = processor.getPath().asPathString(false);
--- End diff --

This needs an explanation. This is a `String`. The method is 
`asPathString()` which looks like we will take name parts ["a", "b.c", "d"] to 
produce "a.b.c.d". This cannot be right as either 1) it is represents a path to 
be split, or 2) represents a full name to match. But, I might have fields 
["a.b", "c.d"] which also expands to "a.b.c.d". So, the matching part can't be 
right. The path split part can't be right since we cannot recover the name 
parts from the string.

In short, should we path the parsed path (collection of name parts) to the 
functions?

See comments for more info.


---


[GitHub] drill pull request #972: DRILL-5838: Fix MaprDB filter pushdown for the case...

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/972#discussion_r143107072
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/expression/PathSegment.java ---
@@ -151,6 +151,15 @@ public boolean isNamed() {
   return true;
 }
 
+/**
+ * Checks that the path of this name segment is complex.
+ *
+ * @return true if the path of this name segment contains dots
+ */
+public boolean isComplex() {
--- End diff --

Not sure I agree with this. We just did changes to enforce the rule that a 
path segment is represented by a distinct object. If a path segment can contain 
dots, then we don't need to represent parts as multiple parts. But, we went the 
multiple part route to allow dots in names.

How will we know if a dot in a name represents a complex (multi-part) name 
vs. a simple name that happens to contain a dot?

In short, this seems a bit of a hack and introduces undesirable ambiguity.


---


[GitHub] drill pull request #972: DRILL-5838: Fix MaprDB filter pushdown for the case...

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/972#discussion_r143107377
  
--- Diff: 
logical/src/main/java/org/apache/drill/common/expression/SchemaPath.java ---
@@ -264,6 +264,56 @@ public String getRootSegmentPath() {
 return rootSegment.getPath();
   }
 
+  /**
+   * Returns {@code String} representation of this schema path,
+   * quoting all name segments if specified {@code quote} is true or 
quoting
+   * only those name segments which have a complex name (their name 
contains dots).
+   *
+   * @param quoted is name segment should be quoted
+   * @return the {@code String} representation of this {@code SchemaPath}
+   * @throws IllegalStateException if root segment is {@code ArraySegment}
+   */
+  public String asPathString(boolean quoted) throws RuntimeException {
+StringBuilder sb = new StringBuilder();
+PathSegment seg = rootSegment;
+if (seg.isArray()) {
+  throw new IllegalStateException("Drill doesn't currently support top 
level arrays");
+}
+NameSegment nameSegment = seg.getNameSegment();
+writeQuoted(sb, nameSegment.getPath(), quoted || 
nameSegment.isComplex());
+
+while ((seg = seg.getChild()) != null) {
+  if (seg.isNamed()) {
+nameSegment = seg.getNameSegment();
+sb.append('.');
+writeQuoted(sb, nameSegment.getPath(), quoted || 
nameSegment.isComplex());
--- End diff --

If `isComplex()` is meant to indicate that a name must be quoted, then 
`requiresQuotes()` would be a better name. Some other names that require quotes:

{noformat}
names with spaces
names-with-dashes
anything/with+an*operator
$looks$like$internal$name
maybeEvenCaseSensitive
MAYBEeVENcASEsENSIVIVE
!on
...
{noformat}

Basically, anything that is not a symbol (initial alpha followed by any 
number of alphanumeric...)


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143086373
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -392,10 +419,12 @@ drill.exec.options:  {
 exec.query_profile.debug_mode: false,
 exec.query_profile.save: true,
 exec.queue.enable: false,
-exec.queue.large: 10,
-exec.queue.small: 100,
+exec.queue.large: 2,
--- End diff --

This section is where we externalized the default values for system/session 
options; so users should never change these values in the config system, only 
via `ALTER SESSION`. (The queueing items are system-only.) The values are 
externalized so that distributions of Drill can alter the defaults without 
having to muck with ZK.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143084043
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143082027
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DefaultResourceManager.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.physical.PhysicalPlan;
+import org.apache.drill.exec.server.BootStrapContext;
+import org.apache.drill.exec.util.MemoryAllocationUtilities;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.apache.drill.exec.work.foreman.Foreman;
+
+/**
+ * Represents a default resource manager for clusters that do not provide 
query
+ * queues. Without queues to provide a hard limit on the query admission 
rate,
+ * the number of active queries must be estimated and the resulting 
resource
+ * allocations will be rough estimates.
+ */
+
+public class DefaultResourceManager implements ResourceManager {
+
+  public static class DefaultResourceAllocator implements 
QueryResourceAllocator {
+
+private QueryContext queryContext;
+
+protected DefaultResourceAllocator(QueryContext queryContext) {
+  this.queryContext = queryContext;
+}
+
+@Override
+public void visitAbstractPlan(PhysicalPlan plan) {
+  if (plan == null || plan.getProperties().hasResourcePlan) {
+return;
+  }
+  MemoryAllocationUtilities.setupBufferedOpsMemoryAllocations(plan, 
queryContext);
+}
+
+@Override
+public void visitPhysicalPlan(QueryWorkUnit work) {
+}
+  }
+
+  public static class DefaultQueryResourceManager extends 
DefaultResourceAllocator implements QueryResourceManager {
+
+@SuppressWarnings("unused")
+private final DefaultResourceManager rm;
+
+public DefaultQueryResourceManager(final DefaultResourceManager rm, 
final Foreman foreman) {
+  super(foreman.getQueryContext());
+  this.rm = rm;
+}
+
+@Override
+public void setCost(double cost) {
+  // Nothing to do by default.
+}
+
+@Override
+public void admit() {
+  // No queueing by default
+}
+
+@Override
+public void exit() {
+  // No queueing by default
+}
+
+@Override
+public boolean hasQueue() { return false; }
+
+@Override
+public String queueName() { return null; }
+  }
+
+  BootStrapContext bootStrapContext;
+  public long memoryPerNode;
--- End diff --

Fixed.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143085902
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
+private long queryMemory;
+
+public EmbeddedQueueLease(QueryId queryId, long queryMemory) {
+  this.queryId = queryId;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString( ) {
+  String msg = "Embedded queue lease for " +
+  QueryIdHelper.getQueryId(queryId);
+  if (released) {
+msg += " (released)";
--- End diff --

Sure.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143082265
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
--- End diff --

Does Drill have a standard? Nested-classes-at-the-top is the rule I've 
always followed, though in Drill I also see classes-at-the-bottom and even 
classes-in-the-middle.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143088158
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -392,10 +419,12 @@ drill.exec.options:  {
 exec.query_profile.debug_mode: false,
 exec.query_profile.save: true,
 exec.queue.enable: false,
-exec.queue.large: 10,
-exec.queue.small: 100,
+exec.queue.large: 2,
--- End diff --

As for the smaller values... Drill ships with a default of 8 GB heap, and 
allows 2 GB per query per node by default. Drill only has enough memory to run 
3 "normal-sized" queries. These new numbers are, in fact, wildly optimistic 
that, in that configuration, we can run 10 small and two large queries. We 
assume that each will get (8 GB * .8) / 30 = ~210 MB per small query, about 2 
GB per large query.

In fact, in explaining this, I realized that the small queue number is 
still too large for the default install and have reduced the numbers to large = 
2, small = 4. Memory for small is now (8 GB * 0.8) / 15 = ~430 MB, with large 
now ~ 2 GB. This maps a "large" query to the default memory-per-query-per-node.

Granted, the 20% overhead is a SWAG, we have evidence that it is sometimes 
larger. Fixing that is another project...


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143084915
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
--- End diff --

This particular bit of code (which I modified to be consistent with the 
other refresh case) is designed to avoid banging on the options for every 
query. If the system gets 100 queries per second, we don't need to keep 
rechecking the options for every query. A small latency is fine and reduces 
unnecessary options checks.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143085678
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
--- End diff --

Fixed.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143083580
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143086005
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
+private long queryMemory;
+
+public EmbeddedQueueLease(QueryId queryId, long queryMemory) {
+  this.queryId = queryId;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString( ) {
+  String msg = "Embedded queue lease for " +
+  QueryIdHelper.getQueryId(queryId);
+  if (released) {
+msg += " (released)";
+  }
+  return msg;
+}
+
+@Override
+public long queryMemoryPerNode() {
+  return queryMemory;
+}
+
+@Override
+public void release() {
+  EmbeddedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return "local-queue"; }
+  }
+
+  private final int queueTimeoutMs;
+  private final int queueSize;
+  private final Semaphore semaphore;
+  private long memoryPerQuery;
+  private final long minimumOperatorMemory;
+
+  public EmbeddedQueryQueue(DrillbitContext context) {
+DrillConfig config = context.getConfig();
+queueTimeoutMs = config.getInt(TIMEOUT_MS);
+queueSize = config.getInt(QUEUE_SIZE);
+semaphore = new Semaphore(queueSize, true);
+minimumOperatorMemory = context.getOptionManager()
+.getOption(ExecConstants.MIN_MEMORY_PER_BUFFERED_OP);
+  }
+
+  @Override
+  public boolean enabled() { return true; }
+
+  @Override
+  public void setMemoryPerNode(long memoryPerNode) {
+memoryPerQuery = memoryPerNode / queueSize;
+  }
+
+  @Override
+  public long defaultQueryMemoryPerNode(double cost) {
+return memoryPerQuery;
+  }
+
+  @Override
+  public QueueLease enqueue(QueryId queryId, double cost)
+  throws QueueTimeoutException, QueryQueueException {
+try {
+  if (! semaphore.tryAcquire(queueTimeoutMs, TimeUnit.MILLISECONDS) ) {
+throw new 

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143084292
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143102000
  
--- Diff: 
protocol/src/main/java/org/apache/drill/exec/proto/UserBitShared.java ---
@@ -12140,6 +12140,31 @@ public Builder clearDef() {
  */
 com.google.protobuf.ByteString
 getOptionsJsonBytes();
+
+// optional double total_cost = 7;
--- End diff --

I would, but this is generated code...


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143085493
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
+  return;
+}
+lastUpdateTime = now;
+@SuppressWarnings("resource")
+SystemOptionManager systemOptions = context.getOptionManager();
+if (systemOptions.getOption(ExecConstants.ENABLE_QUEUE)) {
+  if (queueingRm == null) {
+StatusAdapter statusAdapter = new StatusAdapter() {
+  @Override
+  public boolean inShutDown() {
+// Drill provides no shutdown state at present. Once
+// DRILL-4286 (graceful shutdown) is merged, use the
+// new Drillbit status to determine when the Drillbit
+// is shutting down.
+return false;
+  }
+};
+queueingRm = new ThrottledResourceManager(context,
+new DistributedQueryQueue(context, statusAdapter));
+  }
+  if (activeRm != queueingRm) {
+logger.debug("Enabling ZK-based query queue.");
+activeRm = queueingRm;
+  }
+} else {
+  if (defaultRm == null) {
+defaultRm = new DefaultResourceManager();
+  }
+  if (activeRm != defaultRm) {
+logger.debug("Disabling ZK-based query queue.");
+activeRm = defaultRm;
+  }
+}
+  }
+
+  @Override
+  public void close() {
+if (defaultRm != null) {
+  defaultRm.close();
+  defaultRm = null;
+}
--- End diff --

Paranoid-ified the `close()` method.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143100991
  
--- Diff: exec/java-exec/src/main/resources/rest/index.ftl ---
@@ -74,23 +74,66 @@
 
   
   
-Encryption Info 
+Encryption
 
-  
+  
 
 
   Client to Bit Encryption:
--- End diff --

Isn't it? This table has list style:

Client to Bit Encryption: Enabled

Compared with the query profile that has table style:

Plan TimeTotal Time
5 mins  6 mins.

I poked around a bit and found a style that might work: right-align values 
to eliminate the need for the colons. Seems to look OK.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143082463
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143083783
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143082012
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DefaultResourceManager.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.physical.PhysicalPlan;
+import org.apache.drill.exec.server.BootStrapContext;
+import org.apache.drill.exec.util.MemoryAllocationUtilities;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.apache.drill.exec.work.foreman.Foreman;
+
+/**
+ * Represents a default resource manager for clusters that do not provide 
query
+ * queues. Without queues to provide a hard limit on the query admission 
rate,
+ * the number of active queries must be estimated and the resulting 
resource
+ * allocations will be rough estimates.
+ */
+
+public class DefaultResourceManager implements ResourceManager {
+
+  public static class DefaultResourceAllocator implements 
QueryResourceAllocator {
+
+private QueryContext queryContext;
+
+protected DefaultResourceAllocator(QueryContext queryContext) {
+  this.queryContext = queryContext;
+}
+
+@Override
+public void visitAbstractPlan(PhysicalPlan plan) {
+  if (plan == null || plan.getProperties().hasResourcePlan) {
+return;
+  }
+  MemoryAllocationUtilities.setupBufferedOpsMemoryAllocations(plan, 
queryContext);
+}
+
+@Override
+public void visitPhysicalPlan(QueryWorkUnit work) {
+}
+  }
+
+  public static class DefaultQueryResourceManager extends 
DefaultResourceAllocator implements QueryResourceManager {
+
+@SuppressWarnings("unused")
+private final DefaultResourceManager rm;
+
+public DefaultQueryResourceManager(final DefaultResourceManager rm, 
final Foreman foreman) {
+  super(foreman.getQueryContext());
+  this.rm = rm;
+}
+
+@Override
+public void setCost(double cost) {
+  // Nothing to do by default.
+}
+
+@Override
+public void admit() {
+  // No queueing by default
+}
+
+@Override
+public void exit() {
+  // No queueing by default
+}
+
+@Override
+public boolean hasQueue() { return false; }
+
+@Override
+public String queueName() { return null; }
+  }
+
+  BootStrapContext bootStrapContext;
--- End diff --

Fixed.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143084974
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
+  return;
+}
+lastUpdateTime = now;
+@SuppressWarnings("resource")
+SystemOptionManager systemOptions = context.getOptionManager();
+if (systemOptions.getOption(ExecConstants.ENABLE_QUEUE)) {
+  if (queueingRm == null) {
+StatusAdapter statusAdapter = new StatusAdapter() {
+  @Override
+  public boolean inShutDown() {
+// Drill provides no shutdown state at present. Once
--- End diff --

Fixed.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143086088
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/QueryQueue.java
 ---
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+
+/**
+ * Interface which defines a queue implementation for query queues.
+ * Implementations can queue locally, queue distributed, or do
+ * nothing at all.
+ * 
+ * A queue can report itself as enabled or disabled. When enabled,
+ * all queries must obtain a lease prior to starting execution. The
+ * lease must be released at the completion of execution.
+ */
+
+public interface QueryQueue {
+
+  /**
+   * The opaque lease returned once a query is admitted
+   * for execution.
+   */
+
+  public interface QueueLease {
+long queryMemoryPerNode();
+
+/**
+ * Release a query lease obtained from {@link #queue(QueryId, 
double))}.
+ * Should be called by the per-query resource manager.
+ *
+ * @param lease the lease to be released.
+ */
+
+void release();
+
+String queueName();
+  };
+
+  /**
+   * Exception thrown if a query exceeds the configured wait time
+   * in the query queue.
+   */
+
+  @SuppressWarnings("serial")
+  public class QueueTimeoutException extends Exception {
+
+private QueryId queryId;
--- End diff --

Fixed.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r143081467
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/AbstractResourceManager.java
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Abstract base class for a resource manager. Handles tasks common to all
+ * resource managers: learning the resources available on this Drillbit.
+ * In the current version, Drillbits must be symmetrical, so that knowing
+ * the resources on one node is sufficient to know resources available on
+ * all nodes.
+ */
+
+public abstract class AbstractResourceManager implements ResourceManager {
+
+  protected final DrillbitContext context;
+  private final long memoryPerNode;
+  private final int cpusPerNode;
+
+  public AbstractResourceManager(final DrillbitContext context) {
+this.context = context;
+DrillConfig config = context.getConfig();
+
+// Normally we use the actual direct memory configured on the JVM 
command
+// line. However, if the config param is set, we use that instead (if 
it is
+// lower than actual memory). Primarily for testing.
+
+long memLimit = DrillConfig.getMaxDirectMemory();
+long configMemoryPerNode = 
config.getBytes(ExecConstants.MAX_MEMORY_PER_NODE);
--- End diff --

Actually, `getBytes()` allows the value to be expressed as `128 MB` or '2 
GB` instead of the ridiculously long numbers we've used elsewhere.


---


[jira] [Created] (DRILL-5848) Implement Parquet Columnar Processing & Use Bulk APIs for processing

2017-10-05 Thread salim achouche (JIRA)
salim achouche created DRILL-5848:
-

 Summary: Implement Parquet Columnar Processing & Use Bulk APIs for 
processing
 Key: DRILL-5848
 URL: https://issues.apache.org/jira/browse/DRILL-5848
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Storage - Parquet
Affects Versions: 1.11.0
Reporter: salim achouche
 Fix For: 1.12.0


* Change Flat Parquet Reader processing from row based to columnar
* Use Bulk APIs during the parsing and data loading phase



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Parquet Metadata table on Rolling window

2017-10-05 Thread Padma Penumarthy
Unfortunately, we do not do incremental metadata updates. 
If new files are getting added constantly, refresh table metadata will not help.

Thanks
Padma


> On Oct 5, 2017, at 5:36 PM, François Méthot  wrote:
> 
> Hi,
> 
>  I have been using drill for more than year now, we are running 1.10.
> 
> My queries can spend from 5 to 10 minutes for planning because I am dealing
> with lots of file in HDFS. (then 5 min to 60 min for execution)
> 
> I maintain a rolling window of data  partitionned by the epoch seconds
> rounded to the hour.
> /mydata/3/   -> Next partition to be deleted (nightly check)
> /mydata/4/
> /mydata/.../
> /mydata/109/
> /mydata/110/ -> current hour, this is where new parquet files are added
> 
> I am  considering using REFRESH TABLE METADATA.
> Is it beneficial at all in a situation where new files are added
> constantly, (but only to the latest partition, older partition are set in
> stone)?
> Will drill detect that new files are added to the latest partition (110) ?
> -Will it trigger a refresh metadata on all the directory, on just on
> /mydata/110?
> 
> 
> Thanks for your help
> François



Parquet Metadata table on Rolling window

2017-10-05 Thread François Méthot
Hi,

  I have been using drill for more than year now, we are running 1.10.

My queries can spend from 5 to 10 minutes for planning because I am dealing
with lots of file in HDFS. (then 5 min to 60 min for execution)

I maintain a rolling window of data  partitionned by the epoch seconds
rounded to the hour.
/mydata/3/   -> Next partition to be deleted (nightly check)
/mydata/4/
/mydata/.../
/mydata/109/
/mydata/110/ -> current hour, this is where new parquet files are added

I am  considering using REFRESH TABLE METADATA.
Is it beneficial at all in a situation where new files are added
constantly, (but only to the latest partition, older partition are set in
stone)?
Will drill detect that new files are added to the latest partition (110) ?
-Will it trigger a refresh metadata on all the directory, on just on
/mydata/110?


Thanks for your help
François


[GitHub] drill pull request #975: DRILL-5743: Handling column family and column scan ...

2017-10-05 Thread prasadns14
GitHub user prasadns14 opened a pull request:

https://github.com/apache/drill/pull/975

DRILL-5743: Handling column family and column scan for hbase

This PR handles the scenario where the projected column list contains both 
a column family and a column within the same family.

@paul-rogers please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/prasadns14/drill DRILL-5743

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/975.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #975


commit 0469197e18b466dcfbb26b4c95c47c22b683416e
Author: Prasad Nagaraj Subramanya 
Date:   2017-10-05T22:18:43Z

DRILL-5743: Handling column family and column scan for hbase




---


Hangout minutes for Oct/3 2017

2017-10-05 Thread Chunhui Shi
Attendees: Sorabh, Sindhu, Padma, Arina, Vitalii, Volodymyr, Vova, Pritesh, 
Aman, Vlad, Boaz


We discussed about 1.12.0 release timeline, and might want to set the release 
time to early November. Arina offered to work as release manager for this 
release and will come up with the timeline proposal. Thanks Arina!


We also talked about some possible features to be included in 1.12.0. E.g. 
Kafka storage plugin. And what progress or obstacle in these works.


No other topic was raised.


Thank you everyone,


Chunhui


[GitHub] drill issue #974: DRILL-5839: Handle Empty Batches in Merge Receiver

2017-10-05 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/974
  
@paul-rogers Thank you Paul. I made the change and pushed the new diffs. 


---


[jira] [Created] (DRILL-5847) Flat Parquet Reader Performance Analysis

2017-10-05 Thread salim achouche (JIRA)
salim achouche created DRILL-5847:
-

 Summary: Flat Parquet Reader Performance Analysis
 Key: DRILL-5847
 URL: https://issues.apache.org/jira/browse/DRILL-5847
 Project: Apache Drill
  Issue Type: Sub-task
  Components: Storage - Parquet
Affects Versions: 1.11.0
Reporter: salim achouche
 Fix For: 1.12.0


This task is to analyze the Flat Parquet Reader logic looking for performance 
improvements opportunities.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5846) Improve Parquet Reader Performance for Flat Data types

2017-10-05 Thread salim achouche (JIRA)
salim achouche created DRILL-5846:
-

 Summary: Improve Parquet Reader Performance for Flat Data types 
 Key: DRILL-5846
 URL: https://issues.apache.org/jira/browse/DRILL-5846
 Project: Apache Drill
  Issue Type: Improvement
  Components: Storage - Parquet
Affects Versions: 1.11.0
Reporter: salim achouche
 Fix For: 1.12.0


The Parquet Reader is a key use-case for Drill. This JIRA is an attempt to 
further improve the Parquet Reader performance as several users reported that 
Parquet parsing represents the lion share of the overall query execution. It 
tracks Flat Data types only as Nested DTs might involve functional and 
processing enhancements (e.g., a nested column can be seen as a Document; user 
might want to perform operations scoped at the document level that is no need 
to span all rows). Another JIRA will be created to handle the nested columns 
use-case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #974: DRILL-5839: Handle Empty Batches in Merge Receiver

2017-10-05 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/974#discussion_r143049947
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/mock/MockRecordReader.java
 ---
@@ -52,7 +52,7 @@ public MockRecordReader(FragmentContext context, 
MockScanEntry config) {
 
   private int getEstimatedRecordSize(MockColumn[] types) {
 int x = 0;
-for (int i = 0; i < types.length; i++) {
+for (int i = 0; i < (types == null ? 0 : types.length); i++) {
--- End diff --

Nit: maybe use:

```
if (types == null) { return 0; }
// Original code here..
```


---


[GitHub] drill issue #950: DRILL-5431: SSL Support

2017-10-05 Thread parthchandra
Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/950
  
@arina-ielchiieva  The `TestSSLConfig.testMissingKeystorePassword` test 
will fail if the exception is not thrown. The test is checking that the 
exception is in fact being thrown. Anyway, I added back the assertTrue check as 
well. It doesn't hurt to have it there.
Re the failing `TestUserBitSSL.testClientConfigHostnameVerification`. The 
test generates a self signed certificate with the local machine host name
```
  String fqdn = InetAddress.getLocalHost().getHostName();
  SelfSignedCertificate certificate = new SelfSignedCertificate(fqdn);
```
It then connects to the Drillbit which, one assumes, should have the same 
hostname so we can test that the certificate's hostname is being validated.  
I really don't know why this would fail unless you have your test machine 
hostname not set or set incorrectly. It should work for all cases where the 
hostname is the default localhost.
It does appear that this test might cause angst down the road and it might 
be best to disable it for the general case.


---


[GitHub] drill pull request #950: DRILL-5431: SSL Support

2017-10-05 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/950#discussion_r143042574
  
--- Diff: contrib/native/client/src/clientlib/wincert.ipp ---
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#if defined(IS_SSL_ENABLED)
+
+#include 
+#include 
+
+#if defined _WIN32  || defined _WIN64
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+#pragma comment (lib, "crypt32.lib")
+#pragma comment (lib, "cryptui.lib")
+
+#define MY_ENCODING_TYPE  (PKCS_7_ASN_ENCODING | X509_ASN_ENCODING)
+
+inline
+int loadSystemTrustStore(const SSL *ssl, std::string& msg) {
--- End diff --

I'm actually already using these methods (see `SSLStreamChannel::init()`). 
The verification callback implements validating the certificate. In our case we 
are using the boost provided rfc2818 verification method. The load verify file 
should point to the truststore containing the certificates in pem format. 
OpenSSL will read this file and load the certificate into its in-memory X509 
certificate store.
The `loadSystemTrustStore` method reads the certificates from the Windows 
store (probably the registry) converts from the native store format into X509 
and then loads it into the in-memory store. After that OpenSSL takes over and 
does the verification. 
For Keychain, we will have to do something similar. 
Writing our own certificate verification is going to be error prone, 
especially if you want to do rfc2818 verification. Not sure I'm up to it :(.


---


[GitHub] drill pull request #970: DRILL-5832: Migrate OperatorFixture to use SystemOp...

2017-10-05 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/970#discussion_r143017778
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet2/DrillParquetReader.java
 ---
@@ -84,14 +82,7 @@
   private RecordReader recordReader;
   private DrillParquetRecordMaterializer recordMaterializer;
   private int recordCount;
-  private List primitiveVectors;
   private OperatorContext operatorContext;
--- End diff --

Can you explain the reason you are removing the existing logic?


---


[GitHub] drill pull request #970: DRILL-5832: Migrate OperatorFixture to use SystemOp...

2017-10-05 Thread sachouche
Github user sachouche commented on a diff in the pull request:

https://github.com/apache/drill/pull/970#discussion_r143020963
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -184,25 +183,26 @@ public void testAllScalarTypes() throws Exception {
 
 try {
   // read all of the types with the complex reader
-  test(String.format("alter session set %s = true", 
ExecConstants.PARQUET_NEW_RECORD_READER));
+  alterSession(ExecConstants.PARQUET_NEW_RECORD_READER, true);
--- End diff --

Paul,

- I noticed the key PARQUET_NEW_RECORD_READER is erroneous
- There are currently two readers
   o The old one is used when nested data is used as it can handle all 
parquet use-cases
   o The new reader only deals with Flat parquet data types
- We might want to rename the keys as the new reader cannot always be 
substituted with the old one 



---


[GitHub] drill pull request #950: DRILL-5431: SSL Support

2017-10-05 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/950#discussion_r142994068
  
--- Diff: contrib/native/client/src/clientlib/channel.hpp ---
@@ -0,0 +1,236 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#ifndef CHANNEL_HPP
+#define CHANNEL_HPP
+
+#include "drill/common.hpp"
+#include "drill/drillClient.hpp"
+#include "streamSocket.hpp"
+
+namespace Drill {
+
+class UserProperties;
+
+class ConnectionEndpoint{
+public:
+ConnectionEndpoint(const char* connStr);
+ConnectionEndpoint(const char* host, const char* port);
+~ConnectionEndpoint();
+
+//parse the connection string and set up the host and port to 
connect to
+connectionStatus_t getDrillbitEndpoint();
+
+std::string& getProtocol(){return m_protocol;}
+std::string& getHost(){return m_host;}
+std::string& getPort(){return m_port;}
+DrillClientError* getError(){ return m_pError;};
+
+private:
+void parseConnectString();
+bool isDirectConnection();
+bool isZookeeperConnection();
+connectionStatus_t getDrillbitEndpointFromZk();
+connectionStatus_t handleError(connectionStatus_t status, 
std::string msg);
+
+std::string m_connectString;
+std::string m_pathToDrill;
+std::string m_protocol; 
+std::string m_hostPortStr;
+std::string m_host;
+std::string m_port;
+
+DrillClientError* m_pError;
+
+};
+
+class ChannelContext{
+public:
+ChannelContext(DrillUserProperties* 
props):m_properties(props){};
+virtual ~ChannelContext(){};
+const DrillUserProperties* getUserProperties() const { return 
m_properties;}
+protected:
+DrillUserProperties* m_properties;
+};
+
+class SSLChannelContext: public ChannelContext{
+public:
+static boost::asio::ssl::context::method 
getTlsVersion(std::string version){
--- End diff --

sorry, it's abbreviation for const ref (I believe std::cref was introduced 
in C++11, which we are not using sadly, although there's a boost equivalent of 
course).


---


[GitHub] drill pull request #950: DRILL-5431: SSL Support

2017-10-05 Thread laurentgo
Github user laurentgo commented on a diff in the pull request:

https://github.com/apache/drill/pull/950#discussion_r142993025
  
--- Diff: contrib/native/client/src/clientlib/wincert.ipp ---
@@ -0,0 +1,98 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#if defined(IS_SSL_ENABLED)
+
+#include 
+#include 
+
+#if defined _WIN32  || defined _WIN64
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+
+#pragma comment (lib, "crypt32.lib")
+#pragma comment (lib, "cryptui.lib")
+
+#define MY_ENCODING_TYPE  (PKCS_7_ASN_ENCODING | X509_ASN_ENCODING)
+
+inline
+int loadSystemTrustStore(const SSL *ssl, std::string& msg) {
--- End diff --

it looks like boost::asio support both loading a file and/or a verify 
callback:
- 
http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/reference/ssl__context/set_verify_callback/overload1.html
- 
http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/reference/ssl__context/load_verify_file.html

It seems you wouldn't even need to access the native handle when using 
these functions


---


[jira] [Created] (DRILL-5845) Columns returned by select with "ORDER BY" and "LIMIT" clauses returned in correct order

2017-10-05 Thread Vitalii Diravka (JIRA)
Vitalii Diravka created DRILL-5845:
--

 Summary: Columns returned by select with "ORDER BY" and "LIMIT" 
clauses returned in correct order
 Key: DRILL-5845
 URL: https://issues.apache.org/jira/browse/DRILL-5845
 Project: Apache Drill
  Issue Type: Bug
  Components: Query Planning & Optimization
Affects Versions: 1.11.0
Reporter: Vitalii Diravka
 Fix For: Future


Column order is proper for queries with only one clause: ORDER BY or LIMIT. For 
queries with both these clauses column order isn't preserved.

Test case for reproduce:
{code}
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` limit 1;
+--+--+--+--+
| n_nationkey  |  n_name  | n_regionkey  |  n_comment   
|
+--+--+--+--+
| 0| ALGERIA  | 0|  haggle. carefully final deposits 
detect slyly agai  |
+--+--+--+--+
1 row selected (0.181 seconds)
0: jdbc:drill:zk=local> select * from cp.`tpch/nation.parquet` order by n_name 
limit 1;
+--+--+--+--+
|  n_comment   |  n_name  | n_nationkey 
 | n_regionkey  |
+--+--+--+--+
|  haggle. carefully final deposits detect slyly agai  | ALGERIA  | 0   
 | 0|
+--+--+--+--+
1 row selected (0.154 seconds)
{code}
For json files the column ordering is not preserved as well:
{code}
select * from cp.`employee.json` limit 1;
select * from cp.`employee.json` order by full_name limit 1;
{code}

Perhaps the wrong operator for sorting is used.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142948996
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
+private long queryMemory;
+
+public EmbeddedQueueLease(QueryId queryId, long queryMemory) {
+  this.queryId = queryId;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString( ) {
+  String msg = "Embedded queue lease for " +
+  QueryIdHelper.getQueryId(queryId);
+  if (released) {
+msg += " (released)";
+  }
+  return msg;
+}
+
+@Override
+public long queryMemoryPerNode() {
+  return queryMemory;
+}
+
+@Override
+public void release() {
+  EmbeddedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return "local-queue"; }
+  }
+
+  private final int queueTimeoutMs;
+  private final int queueSize;
+  private final Semaphore semaphore;
+  private long memoryPerQuery;
+  private final long minimumOperatorMemory;
+
+  public EmbeddedQueryQueue(DrillbitContext context) {
+DrillConfig config = context.getConfig();
+queueTimeoutMs = config.getInt(TIMEOUT_MS);
+queueSize = config.getInt(QUEUE_SIZE);
+semaphore = new Semaphore(queueSize, true);
+minimumOperatorMemory = context.getOptionManager()
+.getOption(ExecConstants.MIN_MEMORY_PER_BUFFERED_OP);
+  }
+
+  @Override
+  public boolean enabled() { return true; }
+
+  @Override
+  public void setMemoryPerNode(long memoryPerNode) {
+memoryPerQuery = memoryPerNode / queueSize;
+  }
+
+  @Override
+  public long defaultQueryMemoryPerNode(double cost) {
+return memoryPerQuery;
+  }
+
+  @Override
+  public QueueLease enqueue(QueryId queryId, double cost)
+  throws QueueTimeoutException, QueryQueueException {
+try {
+  if (! semaphore.tryAcquire(queueTimeoutMs, TimeUnit.MILLISECONDS) ) {
+throw new 

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142702103
  
--- Diff: exec/java-exec/src/main/resources/rest/index.ftl ---
@@ -74,23 +74,66 @@
 
   
   
-Encryption Info 
+Encryption
 
-  
+  
 
 
   Client to Bit Encryption:
--- End diff --

Since you have updated this part, could you please also remove colon, it is 
not needed in the table.





---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142718897
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DefaultResourceManager.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.physical.PhysicalPlan;
+import org.apache.drill.exec.server.BootStrapContext;
+import org.apache.drill.exec.util.MemoryAllocationUtilities;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.apache.drill.exec.work.foreman.Foreman;
+
+/**
+ * Represents a default resource manager for clusters that do not provide 
query
+ * queues. Without queues to provide a hard limit on the query admission 
rate,
+ * the number of active queries must be estimated and the resulting 
resource
+ * allocations will be rough estimates.
+ */
+
+public class DefaultResourceManager implements ResourceManager {
+
+  public static class DefaultResourceAllocator implements 
QueryResourceAllocator {
+
+private QueryContext queryContext;
+
+protected DefaultResourceAllocator(QueryContext queryContext) {
+  this.queryContext = queryContext;
+}
+
+@Override
+public void visitAbstractPlan(PhysicalPlan plan) {
+  if (plan == null || plan.getProperties().hasResourcePlan) {
+return;
+  }
+  MemoryAllocationUtilities.setupBufferedOpsMemoryAllocations(plan, 
queryContext);
+}
+
+@Override
+public void visitPhysicalPlan(QueryWorkUnit work) {
+}
+  }
+
+  public static class DefaultQueryResourceManager extends 
DefaultResourceAllocator implements QueryResourceManager {
+
+@SuppressWarnings("unused")
+private final DefaultResourceManager rm;
+
+public DefaultQueryResourceManager(final DefaultResourceManager rm, 
final Foreman foreman) {
+  super(foreman.getQueryContext());
+  this.rm = rm;
+}
+
+@Override
+public void setCost(double cost) {
+  // Nothing to do by default.
+}
+
+@Override
+public void admit() {
+  // No queueing by default
+}
+
+@Override
+public void exit() {
+  // No queueing by default
+}
+
+@Override
+public boolean hasQueue() { return false; }
+
+@Override
+public String queueName() { return null; }
+  }
+
+  BootStrapContext bootStrapContext;
+  public long memoryPerNode;
--- End diff --

final?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142946871
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142944789
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DefaultResourceManager.java
 ---
@@ -0,0 +1,121 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.physical.PhysicalPlan;
+import org.apache.drill.exec.server.BootStrapContext;
+import org.apache.drill.exec.util.MemoryAllocationUtilities;
+import org.apache.drill.exec.work.QueryWorkUnit;
+import org.apache.drill.exec.work.foreman.Foreman;
+
+/**
+ * Represents a default resource manager for clusters that do not provide 
query
+ * queues. Without queues to provide a hard limit on the query admission 
rate,
+ * the number of active queries must be estimated and the resulting 
resource
+ * allocations will be rough estimates.
+ */
+
+public class DefaultResourceManager implements ResourceManager {
+
+  public static class DefaultResourceAllocator implements 
QueryResourceAllocator {
+
+private QueryContext queryContext;
+
+protected DefaultResourceAllocator(QueryContext queryContext) {
+  this.queryContext = queryContext;
+}
+
+@Override
+public void visitAbstractPlan(PhysicalPlan plan) {
+  if (plan == null || plan.getProperties().hasResourcePlan) {
+return;
+  }
+  MemoryAllocationUtilities.setupBufferedOpsMemoryAllocations(plan, 
queryContext);
+}
+
+@Override
+public void visitPhysicalPlan(QueryWorkUnit work) {
+}
+  }
+
+  public static class DefaultQueryResourceManager extends 
DefaultResourceAllocator implements QueryResourceManager {
+
+@SuppressWarnings("unused")
+private final DefaultResourceManager rm;
+
+public DefaultQueryResourceManager(final DefaultResourceManager rm, 
final Foreman foreman) {
+  super(foreman.getQueryContext());
+  this.rm = rm;
+}
+
+@Override
+public void setCost(double cost) {
+  // Nothing to do by default.
+}
+
+@Override
+public void admit() {
+  // No queueing by default
+}
+
+@Override
+public void exit() {
+  // No queueing by default
+}
+
+@Override
+public boolean hasQueue() { return false; }
+
+@Override
+public String queueName() { return null; }
+  }
+
+  BootStrapContext bootStrapContext;
--- End diff --

unused ...


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142726899
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
+  return;
+}
+lastUpdateTime = now;
+@SuppressWarnings("resource")
+SystemOptionManager systemOptions = context.getOptionManager();
+if (systemOptions.getOption(ExecConstants.ENABLE_QUEUE)) {
+  if (queueingRm == null) {
+StatusAdapter statusAdapter = new StatusAdapter() {
+  @Override
+  public boolean inShutDown() {
+// Drill provides no shutdown state at present. Once
+// DRILL-4286 (graceful shutdown) is merged, use the
+// new Drillbit status to determine when the Drillbit
+// is shutting down.
+return false;
+  }
+};
+queueingRm = new ThrottledResourceManager(context,
+new DistributedQueryQueue(context, statusAdapter));
+  }
+  if (activeRm != queueingRm) {
+logger.debug("Enabling ZK-based query queue.");
+activeRm = queueingRm;
+  }
+} else {
+  if (defaultRm == null) {
+defaultRm = new DefaultResourceManager();
+  }
+  if (activeRm != defaultRm) {
+logger.debug("Disabling ZK-based query queue.");
+activeRm = defaultRm;
+  }
+}
+  }
+
+  @Override
+  public void close() {
+if (defaultRm != null) {
+  defaultRm.close();
+  defaultRm = null;
+}
--- End diff --

What if `defaultRM` closing fails, is it ok  that `queueingRm` remains 
unclosed?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142943128
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
--- End diff --

Why do we need `recheckDelayMs`? It seems we can remove it, so when queuing 
is enabled / disabled, the change was applied right away.


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142723896
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
--- End diff --

Frankly saying, I prefer when class content (like variables and methods) 
are placed at the top of class, and other helper stuff at the bottom, not 
critical though. 


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142954623
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142967011
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -392,10 +419,12 @@ drill.exec.options:  {
 exec.query_profile.debug_mode: false,
 exec.query_profile.save: true,
 exec.queue.enable: false,
-exec.queue.large: 10,
-exec.queue.small: 100,
+exec.queue.large: 2,
--- End diff --

Do we need to add description of this properties in 
`drill-override-example.conf`?
Also is there a reason why we made limit smaller than before?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142935606
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142727260
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
+private long queryMemory;
+
+public EmbeddedQueueLease(QueryId queryId, long queryMemory) {
+  this.queryId = queryId;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString( ) {
+  String msg = "Embedded queue lease for " +
+  QueryIdHelper.getQueryId(queryId);
+  if (released) {
+msg += " (released)";
--- End diff --

`StringBuilder`?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142933298
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/AbstractResourceManager.java
 ---
@@ -0,0 +1,68 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Abstract base class for a resource manager. Handles tasks common to all
+ * resource managers: learning the resources available on this Drillbit.
+ * In the current version, Drillbits must be symmetrical, so that knowing
+ * the resources on one node is sufficient to know resources available on
+ * all nodes.
+ */
+
+public abstract class AbstractResourceManager implements ResourceManager {
+
+  protected final DrillbitContext context;
+  private final long memoryPerNode;
+  private final int cpusPerNode;
+
+  public AbstractResourceManager(final DrillbitContext context) {
+this.context = context;
+DrillConfig config = context.getConfig();
+
+// Normally we use the actual direct memory configured on the JVM 
command
+// line. However, if the config param is set, we use that instead (if 
it is
+// lower than actual memory). Primarily for testing.
+
+long memLimit = DrillConfig.getMaxDirectMemory();
+long configMemoryPerNode = 
config.getBytes(ExecConstants.MAX_MEMORY_PER_NODE);
--- End diff --

`config.getLong`?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142726171
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DynamicResourceManager.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.QueryContext;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+import org.apache.drill.exec.work.foreman.Foreman;
+import 
org.apache.drill.exec.work.foreman.rm.DistributedQueryQueue.StatusAdapter;
+
+/**
+ * Wrapper around the default and/or distributed resource managers
+ * to allow dynamically enabling and disabling queueing.
+ */
+
+public class DynamicResourceManager implements ResourceManager {
+
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DynamicResourceManager.class);
+
+  private final DrillbitContext context;
+  private ResourceManager defaultRm;
+  private ResourceManager queueingRm;
+  private ResourceManager activeRm;
+  public long lastUpdateTime;
+  public int recheckDelayMs = 5000;
+
+  public DynamicResourceManager(final DrillbitContext context) {
+this.context = context;
+refreshRM();
+  }
+
+  public synchronized ResourceManager activeRM() {
+refreshRM();
+return activeRm;
+  }
+
+  @Override
+  public long memoryPerNode() {
+return activeRm.memoryPerNode();
+  }
+
+  @Override
+  public int cpusPerNode() {
+return activeRm.cpusPerNode();
+  }
+
+  @Override
+  public synchronized QueryResourceAllocator 
newResourceAllocator(QueryContext queryContext) {
+refreshRM();
+return activeRm.newResourceAllocator(queryContext);
+  }
+
+  @Override
+  public synchronized QueryResourceManager newQueryRM(Foreman foreman) {
+refreshRM();
+return activeRm.newQueryRM(foreman);
+  }
+
+  private void refreshRM() {
+long now = System.currentTimeMillis();
+if (lastUpdateTime + recheckDelayMs >= now) {
+  return;
+}
+lastUpdateTime = now;
+@SuppressWarnings("resource")
+SystemOptionManager systemOptions = context.getOptionManager();
+if (systemOptions.getOption(ExecConstants.ENABLE_QUEUE)) {
+  if (queueingRm == null) {
+StatusAdapter statusAdapter = new StatusAdapter() {
+  @Override
+  public boolean inShutDown() {
+// Drill provides no shutdown state at present. Once
--- End diff --

may be add as `//todo`?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142721394
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142719680
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/QueryQueue.java
 ---
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+
+/**
+ * Interface which defines a queue implementation for query queues.
+ * Implementations can queue locally, queue distributed, or do
+ * nothing at all.
+ * 
+ * A queue can report itself as enabled or disabled. When enabled,
+ * all queries must obtain a lease prior to starting execution. The
+ * lease must be released at the completion of execution.
+ */
+
+public interface QueryQueue {
+
+  /**
+   * The opaque lease returned once a query is admitted
+   * for execution.
+   */
+
+  public interface QueueLease {
+long queryMemoryPerNode();
+
+/**
+ * Release a query lease obtained from {@link #queue(QueryId, 
double))}.
+ * Should be called by the per-query resource manager.
+ *
+ * @param lease the lease to be released.
+ */
+
+void release();
+
+String queueName();
+  };
+
+  /**
+   * Exception thrown if a query exceeds the configured wait time
+   * in the query queue.
+   */
+
+  @SuppressWarnings("serial")
+  public class QueueTimeoutException extends Exception {
+
+private QueryId queryId;
--- End diff --

final?


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142721931
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/DistributedQueryQueue.java
 ---
@@ -0,0 +1,342 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.TimeUnit;
+
+import javax.xml.bind.annotation.XmlRootElement;
+
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.coord.ClusterCoordinator;
+import org.apache.drill.exec.coord.DistributedSemaphore;
+import org.apache.drill.exec.coord.DistributedSemaphore.DistributedLease;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.server.options.SystemOptionManager;
+
+/**
+ * Distributed query queue which uses a Zookeeper distributed semaphore to
+ * control queuing across the cluster. The distributed queue is actually 
two
+ * queues: one for "small" queries, another for "large" queries. Query 
size is
+ * determined by the Planner's estimate of query cost.
+ * 
+ * This queue is configured using system options:
+ * 
+ * exec.queue.enable
+ * 
+ * Set to true to enable the distributed queue.
+ * exec.queue.large
+ * 
+ * The maximum number of large queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.small
+ * 
+ * The maximum number of small queries to admit. Additional
+ * queries wait in the queue.
+ * exec.queue.threshold
+ * 
+ * The cost threshold. Queries below this size are small, at
+ * or above this size are large..
+ * exec.queue.timeout_millis
+ * 
+ * The maximum time (in milliseconds) a query will wait in the
+ * queue before failing.
+ * 
+ * 
+ * The above values are refreshed every five seconds. This aids performance
+ * a bit in systems with very high query arrival rates.
+ */
+
+public class DistributedQueryQueue implements QueryQueue {
+  private static final org.slf4j.Logger logger = 
org.slf4j.LoggerFactory.getLogger(DistributedQueryQueue.class);
+
+  private class DistributedQueueLease implements QueueLease {
+private final QueryId queryId;
+private DistributedLease lease;
+private final String queueName;
+
+/**
+ * Memory allocated to the query. Though all queries in the queue use
+ * the same memory allocation rules, those rules can change at any time
+ * as the user changes system options. This value captures the value
+ * calculated at the time that this lease was granted.
+ */
+private long queryMemory;
+
+public DistributedQueueLease(QueryId queryId, String queueName,
+DistributedLease lease, long queryMemory) {
+  this.queryId = queryId;
+  this.queueName = queueName;
+  this.lease = lease;
+  this.queryMemory = queryMemory;
+}
+
+@Override
+public String toString() {
+  return String.format("Lease for %s queue to query %s",
+  queueName, QueryIdHelper.getQueryId(queryId));
+}
+
+@Override
+public long queryMemoryPerNode() { return queryMemory; }
+
+@Override
+public void release() {
+  DistributedQueryQueue.this.release(this);
+}
+
+@Override
+public String queueName() { return queueName; }
+  }
+
+  /**
+   * Exposes a snapshot of internal state information for use in status
+   * reporting, such as in the UI.
+   */
+
+  @XmlRootElement
+  public static class ZKQueueInfo {
+public final int smallQueueSize;
+public final int largeQueueSize;
+public final double queueThreshold;
+public final long memoryPerNode;
+

[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142944163
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/rm/EmbeddedQueryQueue.java
 ---
@@ -0,0 +1,147 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.work.foreman.rm;
+
+import java.util.concurrent.Semaphore;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.proto.UserBitShared.QueryId;
+import org.apache.drill.exec.proto.helper.QueryIdHelper;
+import org.apache.drill.exec.server.DrillbitContext;
+
+/**
+ * Query queue to be used in an embedded Drillbit. This queue has scope of 
only
+ * the one Drillbit (not even multiple Drillbits in the same process.) 
Primarily
+ * intended for testing, but may possibly be useful for other embedded
+ * applications.
+ * 
+ * Configuration is via config parameters (not via system options as for 
the
+ * distributed queue.)
+ * 
+ * drill.queue.embedded.enabled
+ * Set to true to enable the embedded queue. But, this setting has 
effect
+ * only if the Drillbit is, in fact, embedded.
+ * drill.queue.embedded.size
+ * The number of active queries, all others queue. There is no upper 
limit
+ * on the number of queued entries.
+ * drill.queue.embedded.timeout_ms
+ * The maximum time a query will wait in the queue before failing.
+ * 
+ */
+
+public class EmbeddedQueryQueue implements QueryQueue {
+
+  public static String EMBEDDED_QUEUE = "drill.exec.queue.embedded";
+  public static String ENABLED = EMBEDDED_QUEUE + ".enable";
+  public static String QUEUE_SIZE = EMBEDDED_QUEUE + ".size";
+  public static String TIMEOUT_MS = EMBEDDED_QUEUE + ".timeout_ms";
+
+  public class EmbeddedQueueLease implements QueueLease {
+
+private final QueryId queryId;
+private boolean released;
--- End diff --

Is never assigned...


---


[GitHub] drill pull request #928: DRILL-5716: Queue-driven memory allocation

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/928#discussion_r142707486
  
--- Diff: exec/java-exec/src/main/resources/rest/index.ftl ---
@@ -74,23 +74,66 @@
 
   
   
-Encryption Info 
+Encryption
 
-  
+  
 
 
   Client to Bit Encryption:
-  ${model.isUserEncryptionEnabled()?string("enabled", 
"disabled")}
+  ${model.isUserEncryptionEnabled()?string("Enabled", 
"Disabled")}
 
 
   Bit to Bit Encryption:
-  ${model.isBitEncryptionEnabled()?string("enabled", 
"disabled")}
+  ${model.isBitEncryptionEnabled()?string("Enabled", 
"Disabled")}
 
 
   
 
   
   
+
+  <#assign queueInfo = model.queueInfo() />
+  
+  
+Query Throttling
+
+  
+
+   
+  Queue Status:
--- End diff --

May be we can remove colons?


---


[GitHub] drill pull request #950: DRILL-5431: SSL Support

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/950#discussion_r142862510
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/TestSSLConfig.java ---
@@ -49,12 +64,19 @@ public void testMissingKeystorePassword() throws 
Exception {
 ConfigBuilder config = new ConfigBuilder();
 config.put(ExecConstants.HTTP_KEYSTORE_PATH, "/root");
 config.put(ExecConstants.HTTP_KEYSTORE_PASSWORD, "");
+config.put(ExecConstants.SSL_USE_HADOOP_CONF, false);
+config.put(ExecConstants.USER_SSL_ENABLED, true);
 try {
-  SSLConfig sslv = new SSLConfig(config.build());
+  SSLConfig sslv = new SSLConfigBuilder()
+  .config(config.build())
+  .mode(SSLFactory.Mode.SERVER)
+  .initializeSSLContext(false)
+  .validateKeyStore(true)
+  .build();
   fail();
   //Expected
 } catch (Exception e) {
-  assertTrue(e instanceof DrillException);
+
--- End diff --

But the assert `assertTrue(e instanceof DrillException);` was removed and 
catch block is empty, test will never fail...


---


[GitHub] drill pull request #950: DRILL-5431: SSL Support

2017-10-05 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/950#discussion_r142861999
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/rpc/user/security/TestUserBitSSL.java
 ---
@@ -223,9 +223,12 @@ public void testClientConfigHostnameVerification() {
   ts.load(null, password.toCharArray());
   ts.setCertificateEntry("drillTest", certificate.cert());
   // Store away the truststore.
-  FileOutputStream fos1 = new FileOutputStream(tempFile1);
-  ts.store(fos1, password.toCharArray());
-  fos1.close();
+  try (FileOutputStream fos1 = new FileOutputStream(tempFile1);) {
+ts.store(fos1, password.toCharArray());
+fos1.close();
--- End diff --

No need to close stream. It will be closed automatically.


---


[jira] [Created] (DRILL-5844) Incorrect values of TABLE_TYPE returned from method DatabaseMetaData.getTables of JDBC API

2017-10-05 Thread second88 (JIRA)
second88 created DRILL-5844:
---

 Summary: Incorrect values of TABLE_TYPE returned from method 
DatabaseMetaData.getTables of JDBC API
 Key: DRILL-5844
 URL: https://issues.apache.org/jira/browse/DRILL-5844
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC, Metadata
Reporter: second88
Priority: Minor


As far as I can see, the values of TABLE_TYPE returned from method 
DatabaseMetaData.getTables of JDBC API of a Drill Connection include:
TABLE
VIEW
SYSTEM_TABLE

According to [JDBC 
API|http://docs.oracle.com/javase/7/docs/api/java/sql/DatabaseMetaData.html#getTables(java.lang.String,%20java.lang.String,%20java.lang.String,%20java.lang.String\[\])],
 the typical types are "TABLE", "VIEW", "SYSTEM TABLE", "GLOBAL TEMPORARY", 
"LOCAL TEMPORARY", "ALIAS", "SYNONYM".

Therefore "SYSTEM_TABLE" should be replaced by "SYSTEM TABLE".

Besides, I wonder if this bug is related to another bug 
[DRILL-5843|https://issues.apache.org/jira/browse/DRILL-5843] reported by me.
It should be noted that the values of TABLE_TYPE returned from methods 
DatabaseMetaData.getTables and DatabaseMetaData.getTableTypes should be 
one-to-one matched with but may not be the same as those in 
INFORMATION_SCHEMA.TABLES.TABLE_TYPE, for instance, "TABLE" VS "BASE TABLE".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (DRILL-5843) Incorrect values of INFORMATION_SCHEMA.TABLES.TABLE_TYPE

2017-10-05 Thread second88 (JIRA)
second88 created DRILL-5843:
---

 Summary: Incorrect values of INFORMATION_SCHEMA.TABLES.TABLE_TYPE
 Key: DRILL-5843
 URL: https://issues.apache.org/jira/browse/DRILL-5843
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Reporter: second88
Priority: Minor


As far as I can see, the values available in Drill include:
TABLE
VIEW
SYSTEM_TABLE

In a draft of SQL standard SQL:2011 Part 11: Information and Definition Schemas 
(SQL/Schemata), the available values should be (source: [SQL:201x 
(preliminary)|https://wiki.postgresql.org/wiki/Developer_FAQ#Where_can_I_get_a_copy_of_the_SQL_standards.3F]):
BASE TABLE
VIEW
GLOBAL TEMPORARY
LOCAL TEMPORARY
SYSTEM VERSIONED


Some common databases, which have implemented INFORMATION_SCHEMA.TABLES view, 
show "BASE TABLE" instead of "TABLE" and use " " instead of "_" as delimiter in 
the values of TABLE_TYPE:
[MySQL 5.7|https://dev.mysql.com/doc/refman/5.7/en/tables-table.html]
[PostgreSQL 
9.1|https://www.postgresql.org/docs/9.1/static/infoschema-tables.html]
[DB2 for 
i5/OS|https://www.ibm.com/support/knowledgecenter/ssw_i5_54/db2/rbafzmstcatalogans.htm#cattables]


In conclusion, "TABLE" should be replaced by "BASE TABLE" and "SYSTEM_TABLE" by 
"SYSTEM TABLE" in INFORMATION_SCHEMA.TABLES.TABLE_TYPE.
Besides, the words in other existing or future values of TABLE_TYPE should also 
be delimited by " " instead of "_", for example, "GLOBAL TEMPORARY" instead of 
"GLOBAL_TEMPORARY".



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)