[GitHub] drill issue #914: DRILL-5657: Size-aware vector writer structure

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/914
  
Thank you, @bitblender, for your in-person review. I've pushed a commit 
that reflects your comments. 


---


[GitHub] drill issue #1014: DRILL-5771: Fix serDe errors for format plugins

2017-11-13 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1014
  
Thanks, I took a look.


---


Re: Hangout Topics for 11/14/2017

2017-11-13 Thread Padma Penumarthy
Here are the topics so far:

Unit Testing - Tim
1.12 Release - Arina
Metadata Management - Padma

Thanks
Padma


On Nov 13, 2017, at 1:15 PM, Padma Penumarthy wrote:

Drill hangout tomorrow Nov 14th, at 10 AM PST.
If you have topics to discuss, please send email or bring them up tomorrow.

Hangout link:
https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Thanks
Padma




[GitHub] drill pull request #978: DRILL-5842: Refactor and simplify the fragment, ope...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/978


---


[jira] [Resolved] (DRILL-4708) connection closed unexpectedly

2017-11-13 Thread Karthikeyan Manivannan (JIRA)

 [ 
https://issues.apache.org/jira/browse/DRILL-4708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthikeyan Manivannan resolved DRILL-4708.
---
Resolution: Works for Me

> connection closed unexpectedly
> --
>
> Key: DRILL-4708
> URL: https://issues.apache.org/jira/browse/DRILL-4708
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - RPC
>Affects Versions: 1.7.0
>Reporter: Chun Chang
>Assignee: Karthikeyan Manivannan
>Priority: Critical
> Attachments: data.tgz
>
>
> Running DRILL functional automation, we often see query failed randomly due 
> to the following unexpected connection close error.
> {noformat}
> Execution Failures:
> /root/drillAutomation/framework/framework/resources/Functional/ctas/ctas_flatten/10rows/filter5.q
> Query: 
> select * from dfs.ctas_flatten.`filter5_10rows_ctas`
> Failed with exception
> java.sql.SQLException: CONNECTION ERROR: Connection /10.10.100.171:36185 <--> 
> drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
> Drillbit down?
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
>   at 
> org.apache.drill.jdbc.impl.DrillCursor.nextRowInternally(DrillCursor.java:247)
>   at org.apache.drill.jdbc.impl.DrillCursor.next(DrillCursor.java:321)
>   at 
> oadd.net.hydromatic.avatica.AvaticaResultSet.next(AvaticaResultSet.java:187)
>   at 
> org.apache.drill.jdbc.impl.DrillResultSetImpl.next(DrillResultSetImpl.java:172)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.executeQuery(DrillTestJdbc.java:210)
>   at 
> org.apache.drill.test.framework.DrillTestJdbc.run(DrillTestJdbc.java:99)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:744)
> Caused by: oadd.org.apache.drill.common.exceptions.UserException: CONNECTION 
> ERROR: Connection /10.10.100.171:36185 <--> 
> drillats4.qa.lab/10.10.100.174:31010 (user client) closed unexpectedly. 
> Drillbit down?
> [Error Id: 3d5dad8e-80d0-4c7f-9012-013bf01ce2b7 ]
>   at 
> oadd.org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:543)
>   at 
> oadd.org.apache.drill.exec.rpc.user.QueryResultHandler$ChannelClosedHandler$1.operationComplete(QueryResultHandler.java:373)
>   at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
>   at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:603)
>   at 
> oadd.io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:563)
>   at 
> oadd.io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:406)
>   at 
> oadd.io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:82)
>   at 
> oadd.io.netty.channel.AbstractChannel$CloseFuture.setClosed(AbstractChannel.java:943)
>   at 
> oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.doClose0(AbstractChannel.java:592)
>   at 
> oadd.io.netty.channel.AbstractChannel$AbstractUnsafe.close(AbstractChannel.java:584)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.closeOnRead(AbstractNioByteChannel.java:71)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:89)
>   at 
> oadd.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:162)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:511)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:468)
>   at 
> oadd.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:382)
>   at oadd.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:354)
>   at 
> oadd.io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
>   ... 1 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


"Batch size control" and unit testing

2017-11-13 Thread Paul Rogers
Hi All,

Here is the next installment in the “batch size control” project update.

Drill has a great many operators. As we move forward, we must update them to 
use the new batch size control framework. Unit testing becomes a major concern. 
This note explains how we address that issue in this project.

The “classic” way to test Drill is to build the product, fire up the Drill 
server, and use Sqlline to fire off queries. The problem, of course, is that the 
edit-compile-debug cycle is glacially slow (five minutes), and testing is manual 
(copy/paste the query into Sqlline, visually inspect the results).

Another alternative is to run the very same query, but as a JUnit test. Drill 
has many such tests. The “BaseTestQuery” framework and “TestBuilder” help. The 
newish “Cluster Framework” makes it very easy to start an embedded Drillbit 
with the desired options and settings, run a query, and examine the results. 
The edit-compile-debug cycle is much faster, on the order of 10-20 seconds.

This is good, but we still run the entire Drill operator stack and throw 
queries at it. We use a file for input and capture query results as output. 
But we want much finer-grained testing. That is, we want true unit testing: 
isolate a component, feed it some input, and verify its output.
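As a toy illustration of true unit testing in this sense (hypothetical code, not Drill's actual framework), consider a size-limited batch splitter, the kind of small component such tests isolate — no server, files, or queries involved:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component, not Drill code: splits an input "batch" into
// batches of at most maxRows rows, the kind of small unit that true
// unit testing isolates and exercises directly.
class BatchSplitter {
    static List<List<Integer>> split(List<Integer> input, int maxRows) {
        List<List<Integer>> batches = new ArrayList<>();
        for (int i = 0; i < input.size(); i += maxRows) {
            // Copy each sub-list so the batches are independent of the input
            batches.add(new ArrayList<>(
                input.subList(i, Math.min(i + maxRows, input.size()))));
        }
        return batches;
    }
}
```

A unit test then just calls `split()` on a handful of inputs and compares against expected batches — a sub-second edit-compile-debug cycle.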

A fact of Drill is that operators are tightly coupled with the fragment context, 
which is coupled with the Drillbit context, which needs the entire server. What 
to do? One solution is to use mocks, and, indeed, Drill has three solutions 
based on JMockit, Mockito, and Jinfeng’s handy new “Mini-Plan” framework.

Mocks are handy, but it is cleaner and simpler to have code that can be tested 
in isolation without mocks. The next step is the “sub-operator” test framework, 
the “RowSet” utilities and the “context” refactoring that break the tight 
coupling with the rest of Drill, allowing us to separate out an operator (after 
some simple changes to the code) to test in isolation. We can now easily pump 
in a very large variety of inputs (such as Drill’s 30+ data types in the 3 
cardinalities) without having to set up a lot of overhead for each.

Still, many operators are internally complex, and poking at them from the 
outside is limiting. We want to test not just, say, the sort operator as a 
whole, but also to exercise the bit of code that does the in-memory sort, or 
the one that writes batches to disk. To do this, we must “disaggregate” each 
operator into a series of separately-testable components, each with a clear API.

Such refactoring is practical only for new operators, or when we must make 
major changes to an existing one. As part of the “batch size control” 
project, we have created a new version of the scan operator using this model.

Refactoring scan pointed out an opportunity to refactor the core operator code 
itself. Each operator has three responsibilities:

* Implement the Drill iterator protocol.
* Hold a record batch.
* Implement the operator algorithm itself.

The next “batch size” PR will provide a new version of the base operator class 
that splits responsibility into classes for the first two items, and an 
interface for the third. This allows us to unit test the two classes once and 
for all. Per-operator, the focus is just the operator implementation.
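A minimal sketch of that split (all names here are hypothetical — this is not the actual Drill base class): a shell implements the iterator protocol and owns the batch holder, while the per-operator algorithm lives behind an interface that can be unit tested on its own, with no mocks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// The per-operator algorithm (responsibility 3): the only piece each
// operator must implement, and the only piece needing per-operator tests.
interface BatchAlgorithm {
    List<Integer> process(List<Integer> input);
}

// Holds the current record batch (responsibility 2).
class BatchHolder {
    private List<Integer> batch = new ArrayList<>();
    void set(List<Integer> b) { batch = b; }
    List<Integer> get() { return batch; }
}

// Implements the iterator protocol (responsibility 1). Tested once,
// reused by every operator.
class OperatorShell implements Iterator<List<Integer>> {
    private final Iterator<List<Integer>> upstream;
    private final BatchAlgorithm algorithm;
    private final BatchHolder holder = new BatchHolder();

    OperatorShell(Iterator<List<Integer>> upstream, BatchAlgorithm algorithm) {
        this.upstream = upstream;
        this.algorithm = algorithm;
    }

    @Override public boolean hasNext() { return upstream.hasNext(); }

    @Override public List<Integer> next() {
        // Run the algorithm on the next upstream batch and hold the result.
        holder.set(algorithm.process(upstream.next()));
        return holder.get();
    }
}
```

With this shape, a unit test supplies an in-memory upstream iterator and asserts on the batches the shell emits; the shell and holder never need re-testing per operator.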

The core operator algorithm implementation is designed to be loosely coupled to 
the rest of Drill, allowing complete unit testing without mocks. The scan 
operator revision, which we’ll describe in the next note, makes use of this 
structure.

Thanks,

- Paul




[GitHub] drill issue #1026: DRILL-5919: Add non-numeric support for JSON processing

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1026
  
On the two functions... Maybe just have one function that handles the 
NaN/Infinity case. As noted earlier, no matter what we do, JSON without these 
symbols will work. So, we need only consider JSON with the symbols. Either:

* The NaN/Infinity cases always work, or
* The NaN/Infinity cases work sometimes, fail others, depending on some 
option or argument.

I would vote for the first case: it is simpler. I just can't see how anyone 
would use Drill to query valid JSON and want the query to fail because it 
contains NaN or Infinity. Can you suggest a case where failing the query would 
help the user?
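For reference, the pair of options under discussion (names taken from the diff in this thread; the exact names and defaults may differ in the merged version) would be toggled per session like any other Drill option:

```sql
ALTER SESSION SET `store.json.reader.non_numeric_numbers` = true;
ALTER SESSION SET `store.json.writer.non_numeric_numbers` = true;
```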


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150688725
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

As noted in DRILL-5949, options like this should be part of the plugin 
config (as in CSV), not session options. For now, let's leave them as session 
options and I'll migrate them to plugin config options along with the others.


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150688949
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

See discussion below. If they are off, then queries that use NaN and 
Infinity will fail until the user turns them on. Queries that don't use NaN and 
Infinity won't care. So, what is the advantage to failing queries unnecessarily?


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150689236
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

OK. So, if the rest of Drill either does not support (or we don't know if 
it supports) NaN and Inf, should we introduce the change here that potentially 
leads to failures elsewhere?

Do we know if JDBC and ODBC support these values? (I suppose they do as 
they are features of Java's float/double primitives...)


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150689617
  
--- Diff: 
contrib/storage-mongo/src/main/java/org/apache/drill/exec/store/mongo/MongoRecordReader.java
 ---
@@ -73,6 +73,7 @@
   private final MongoStoragePlugin plugin;
 
   private final boolean enableAllTextMode;
+  private final boolean enableNonNumericNumbers;
--- End diff --

The JSON parser we use does use the term "non-numeric numbers", but not 
sure we want to keep that naming in Drill itself.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150682963
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/SchemaTreeProvider.java
 ---
@@ -20,12 +20,13 @@
 import java.io.IOException;
 import java.util.List;
 
-import org.apache.calcite.jdbc.SimpleCalciteSchema;
+//import org.apache.calcite.jdbc.SimpleCalciteSchema;
--- End diff --

Remove


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150686404
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java ---
@@ -229,7 +229,7 @@ public DrillbitContext getDrillbitContext() {
 return context;
   }
 
-  public SchemaPlus getRootSchema() {
+  public SchemaPlus getFullRootSchema() {
--- End diff --

Comment to explain what a "full root schema" is? Apparently, this is both 
the plugin config name and workspace combined?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150685672
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
 ---
@@ -175,6 +193,21 @@ public WorkspaceSchema createSchema(List<String> parentSchemaPath, SchemaConfig
 return new WorkspaceSchema(parentSchemaPath, schemaName, schemaConfig);
   }
 
+  public WorkspaceSchema createSchema(List<String> parentSchemaPath, SchemaConfig schemaConfig, DrillFileSystem fs) throws IOException {
+if (!accessible(fs)) {
--- End diff --

Is returning null sufficient to tell the user that they don't have 
permission to do this operation?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150685414
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
 ---
@@ -150,14 +152,30 @@ public WorkspaceSchemaFactory(
* @return True if the user has access. False otherwise.
*/
   public boolean accessible(final String userName) throws IOException {
-final FileSystem fs = ImpersonationUtil.createFileSystem(userName, 
fsConf);
+final DrillFileSystem fs = 
ImpersonationUtil.createFileSystem(userName, fsConf);
+return accessible(fs);
+  }
+
+  /**
+   * Checks whether a FileSystem object has the permission to list/read 
workspace directory
+   * @param fs a DrillFileSystem object that was created with certain user 
privilege
+   * @return True if the user has access. False otherwise.
+   * @throws IOException
+   */
+  public boolean accessible(DrillFileSystem fs) throws IOException {
 try {
-  // We have to rely on the listStatus as a FileSystem can have 
complicated controls such as regular unix style
-  // permissions, Access Control Lists (ACLs) or Access Control 
Expressions (ACE). Hadoop 2.7 version of FileSystem
-  // has a limited private API (FileSystem.access) to check the 
permissions directly
-  // (see https://issues.apache.org/jira/browse/HDFS-6570). Drill 
currently relies on Hadoop 2.5.0 version of
-  // FileClient. TODO: Update this when DRILL-3749 is fixed.
-  fs.listStatus(wsPath);
+  /**
+   * For Windows local file system, fs.access ends up using 
DeprecatedRawLocalFileStatus which has
+   * TrustedInstaller as owner, and a member of Administrators group 
could not satisfy the permission.
+   * In this case, we will still use method listStatus.
+   * In other cases, we use access method since it is cheaper.
+   */
+  if (SystemUtils.IS_OS_WINDOWS && 
fs.getUri().getScheme().equalsIgnoreCase("file")) {
--- End diff --

HDFS probably defines a constant for "file". Should we reference that?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150680095
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet<String> getTableNames() {
+    Set<String> pluginNames = Sets.newHashSet();
+    for (Map.Entry<String, StoragePlugin> storageEntry : getSchemaFactories()) {
+      pluginNames.add(storageEntry.getKey());
+    }
+    return Compatible.INSTANCE.navigableSet(
+        ImmutableSortedSet.copyOf(
+            Sets.union(pluginNames, getSubSchemaMap().keySet())));
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
+if (paths.length == 2) {
--- End diff --

Do we support only 1- and 2-part names? Should we assert that the length <= 
2?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150684042
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java
 ---
@@ -52,9 +55,20 @@
 
  private List<WorkspaceSchemaFactory> factories;
   private String schemaName;
+  protected FileSystemPlugin plugin;
 
  public FileSystemSchemaFactory(String schemaName, List<WorkspaceSchemaFactory> factories) {
 super();
+if (factories.size() > 0 ) {
+  this.plugin = factories.get(0).getPlugin();
--- End diff --

A comment would be helpful: what is special about plugin 0 that it should 
be the default? Is plugin 0 the default of some sort?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150679941
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet<String> getTableNames() {
+    Set<String> pluginNames = Sets.newHashSet();
+    for (Map.Entry<String, StoragePlugin> storageEntry : getSchemaFactories()) {
+      pluginNames.add(storageEntry.getKey());
+    }
+    return Compatible.INSTANCE.navigableSet(
+        ImmutableSortedSet.copyOf(
+            Sets.union(pluginNames, getSubSchemaMap().keySet())));
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
--- End diff --

Should this be done here in this simple way? How many other places do we do 
the same thing? Or should we have a common function to split schema names, so 
we can handle, say, escapes and other special cases that might come along?
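A hedged sketch of what such a common function might look like (the back-quote escaping rule here is an assumption for illustration, not Drill's actual identifier grammar):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not Drill code: splits a dotted schema name such as
// "dfs.tmp" into its parts, treating back-quoted segments as atomic so a
// quoted part may itself contain a dot.
class SchemaNames {
    static List<String> split(String name) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean quoted = false;
        for (char c : name.toCharArray()) {
            if (c == '`') {
                quoted = !quoted;           // toggle quoting; quotes are stripped
            } else if (c == '.' && !quoted) {
                parts.add(current.toString());
                current.setLength(0);       // dot outside quotes ends a part
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }
}
```

Centralizing the split in one helper means escapes and other special cases need handling in exactly one place, rather than at every call site that does `schemaName.split("\\.")`.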


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150686157
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
 ---
@@ -532,7 +572,10 @@ public boolean isMutable() {
 }
 
 public DrillFileSystem getFS() {
-  return fs;
+  if (this.fs == null) {
+this.fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), fsConf);
+  }
+  return this.fs;
--- End diff --

This class caches the file system, which is good. The other classes in this 
PR do not; they create the fs as needed.

Does Calcite allow some kind of session state in which we can cache the fs 
for the query (plan) rather than creating it on the fly each time we need it?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150682252
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet<String> getTableNames() {
+    Set<String> pluginNames = Sets.newHashSet();
+    for (Map.Entry<String, StoragePlugin> storageEntry : getSchemaFactories()) {
+      pluginNames.add(storageEntry.getKey());
+    }
+    return Compatible.INSTANCE.navigableSet(
+        ImmutableSortedSet.copyOf(
+            Sets.union(pluginNames, getSubSchemaMap().keySet())));
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
+if (paths.length == 2) {
+  plugin = getSchemaFactories().getPlugin(paths[0]);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+
+  final CalciteSchema secondLevelSchema = 
getSubSchemaMap().get(paths[0]).getSubSchema(paths[1], caseSensitive);
+  if (secondLevelSchema != null) {
+SchemaPlus secondlevel = secondLevelSchema.plus();
+org.apache.drill.exec.store.AbstractSchema drillSchema =
--- End diff --

Fully qualified name needed? Or, will an import work?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150680632
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet getTableNames() {
+Set pluginNames = Sets.newHashSet();
+for (Map.Entry storageEntry : 
getSchemaFactories()) {
+  pluginNames.add(storageEntry.getKey());
+}
+return Compatible.INSTANCE.navigableSet(
+ImmutableSortedSet.copyOf(
+Sets.union(pluginNames, getSubSchemaMap().keySet(;
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
+if (paths.length == 2) {
+  plugin = getSchemaFactories().getPlugin(paths[0]);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+
+  final CalciteSchema secondLevelSchema = 
getSubSchemaMap().get(paths[0]).getSubSchema(paths[1], caseSensitive);
+  if (secondLevelSchema != null) {
+SchemaPlus secondlevel = secondLevelSchema.plus();
+org.apache.drill.exec.store.AbstractSchema drillSchema =
+
secondlevel.unwrap(org.apache.drill.exec.store.AbstractSchema.class);
+SubSchemaWrapper wrapper = new SubSchemaWrapper(drillSchema);
+thisPlus.add(wrapper.getName(), wrapper);
+  }
+}
+  }
+} catch(ExecutionSetupException | IOException ex) {
+}
+  }
+
+  static 

---

[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150680951
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
--- End diff --

Class would benefit from general comments: purpose, references, structure.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150685946
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/WorkspaceSchemaFactory.java
 ---
@@ -532,7 +572,10 @@ public boolean isMutable() {
 }
 
 public DrillFileSystem getFS() {
-  return fs;
+  if (this.fs == null) {
+this.fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), fsConf);
+  }
+  return this.fs;
--- End diff --

No need for `this.fs`, just `fs` will do.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150682787
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicSchema.java
 ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import org.apache.calcite.jdbc.CalciteSchema;
+import org.apache.calcite.jdbc.SimpleCalciteSchema;
+import org.apache.calcite.schema.Schema;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+
+
+/**
+ * Unlike SimpleCalciteSchema, DynamicSchema could have an empty or 
partial schemaMap, but it could maintain a map of
+ * name->SchemaFactory, and only register schema when the corresponsdent 
name is requested.
+ */
+public class DynamicSchema extends SimpleCalciteSchema {
+
+  protected SchemaConfig schemaConfig;
--- End diff --

Seems the schemaConfig is never set or used.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150679636
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet getTableNames() {
+Set pluginNames = Sets.newHashSet();
+for (Map.Entry storageEntry : 
getSchemaFactories()) {
+  pluginNames.add(storageEntry.getKey());
+}
+return Compatible.INSTANCE.navigableSet(
+ImmutableSortedSet.copyOf(
+Sets.union(pluginNames, getSubSchemaMap().keySet(;
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
--- End diff --

If the name is `dfs.test`, we first look up the compound name, then the 
parts? Why? Do we put the compound names in the map? Or can we have one schema 
named "dfs.test" and another called `dfs`.`test`? Or, can this code be 
restructured a bit?


---
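The compound-name handling questioned above can be isolated in a small helper. The sketch below is purely illustrative (the class name and the two-level policy are assumptions, not the patch's code):

```java
// Hypothetical helper: split a possibly-compound schema name such as
// "dfs.tmp" into its plugin part ("dfs") and workspace part ("tmp").
public class SchemaNames {
  public static String[] split(String schemaName) {
    String[] paths = schemaName.split("\\.");
    // Only plugin or plugin.workspace forms are meaningful here; anything
    // deeper is returned whole so the caller can report a clear error.
    return (paths.length == 2) ? paths : new String[] { schemaName };
  }
}
```

With the split isolated, the question of whether `dfs.test` and `dfs`.`test` name the same schema is answered in exactly one place.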


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150684114
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java
 ---
@@ -52,9 +55,20 @@
 
   private List factories;
   private String schemaName;
+  protected FileSystemPlugin plugin;
 
   public FileSystemSchemaFactory(String schemaName, 
List factories) {
 super();
+if (factories.size() > 0 ) {
+  this.plugin = factories.get(0).getPlugin();
+}
+this.schemaName = schemaName;
+this.factories = factories;
+  }
+
+  public FileSystemSchemaFactory(FileSystemPlugin plugin, String 
schemaName, List factories) {
+super();
--- End diff --

Omit; Java does this by default.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150682074
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet getTableNames() {
+Set pluginNames = Sets.newHashSet();
+for (Map.Entry storageEntry : 
getSchemaFactories()) {
+  pluginNames.add(storageEntry.getKey());
+}
+return Compatible.INSTANCE.navigableSet(
+ImmutableSortedSet.copyOf(
+Sets.union(pluginNames, getSubSchemaMap().keySet(;
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
+if (paths.length == 2) {
+  plugin = getSchemaFactories().getPlugin(paths[0]);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
--- End diff --

What if the plugin is null? Should we fail with an error, or just keep 
going?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150684557
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java
 ---
@@ -73,9 +87,10 @@ public void registerSchemas(SchemaConfig schemaConfig, 
SchemaPlus parent) throws
 
 public FileSystemSchema(String name, SchemaConfig schemaConfig) throws 
IOException {
   super(ImmutableList.of(), name);
+  final DrillFileSystem fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), 
plugin.getFsConf());
--- End diff --

Comment to explain what's happening here?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150683680
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/SchemaTreeProvider.java
 ---
@@ -105,12 +106,36 @@ public SchemaPlus createRootSchema(final String 
userName, final SchemaConfigInfo
* @return
*/
   public SchemaPlus createRootSchema(SchemaConfig schemaConfig) {
+  final SchemaPlus rootSchema = 
DynamicSchema.createRootSchema(dContext.getStorage(), schemaConfig);
+  schemaTreesToClose.add(rootSchema);
+  return rootSchema;
+  }
+
+  /**
+   * Return full root schema with schema owner as the given user.
+   *
+   * @param userName Name of the user who is accessing the storage sources.
+   * @param provider {@link SchemaConfigInfoProvider} instance
+   * @return Root of the schema tree.
+   */
+  public SchemaPlus createFullRootSchema(final String userName, final 
SchemaConfigInfoProvider provider) {
+final String schemaUser = isImpersonationEnabled ? userName : 
ImpersonationUtil.getProcessUserName();
--- End diff --

Should this be factored out somewhere? Seems this magic stanza will be 
needed in many places: better to have one copy than several. Maybe a method in 
`ImpersonationUtil`?


---
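A sketch of the factored-out helper suggested above; the class and method names are hypothetical, not an existing `ImpersonationUtil` API:

```java
// Hypothetical utility: one place that decides which user name applies
// when impersonation is enabled or disabled.
public class UserNames {
  public static String resolve(boolean impersonationEnabled,
                               String requestUser, String processUser) {
    return impersonationEnabled ? requestUser : processUser;
  }
}
```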


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150682678
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicSchema.java
 ---
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import org.apache.calcite.jdbc.CalciteSchema;
+import org.apache.calcite.jdbc.SimpleCalciteSchema;
+import org.apache.calcite.schema.Schema;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+
+
+/**
+ * Unlike SimpleCalciteSchema, DynamicSchema could have an empty or 
partial schemaMap, but it could maintain a map of
+ * name->SchemaFactory, and only register schema when the corresponsdent 
name is requested.
+ */
+public class DynamicSchema extends SimpleCalciteSchema {
+
+  protected SchemaConfig schemaConfig;
+  protected StoragePluginRegistry storages;
--- End diff --

Seems that storages is never initialized or used except in the accessor.


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150685113
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java
 ---
@@ -73,9 +87,10 @@ public void registerSchemas(SchemaConfig schemaConfig, 
SchemaPlus parent) throws
 
 public FileSystemSchema(String name, SchemaConfig schemaConfig) throws 
IOException {
   super(ImmutableList.of(), name);
+  final DrillFileSystem fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), 
plugin.getFsConf());
   for(WorkspaceSchemaFactory f :  factories){
-if (f.accessible(schemaConfig.getUserName())) {
-  WorkspaceSchema s = f.createSchema(getSchemaPath(), 
schemaConfig);
+WorkspaceSchema s = f.createSchema(getSchemaPath(), schemaConfig, 
fs);
+if ( s != null) {
--- End diff --

Here we iterate over a list of workspace schema factories. For each, we 
resolve a schemaConfig against the file system.

Under what situations would we have multiple factories? Selecting from two 
distinct storage plugins?

Calcite tends to resolve the same things over and over. Will this method be 
called multiple times?


---


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150680454
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet getTableNames() {
+Set pluginNames = Sets.newHashSet();
+for (Map.Entry storageEntry : 
getSchemaFactories()) {
+  pluginNames.add(storageEntry.getKey());
+}
+return Compatible.INSTANCE.navigableSet(
+ImmutableSortedSet.copyOf(
+Sets.union(pluginNames, getSubSchemaMap().keySet(;
+  }
+
+  /**
+   * load schema factory(storage plugin) for schemaName
+   * @param schemaName
+   * @param caseSensitive
+   */
+  public void loadSchemaFactory(String schemaName, boolean caseSensitive) {
+try {
+  SchemaPlus thisPlus = this.plus();
+  StoragePlugin plugin = getSchemaFactories().getPlugin(schemaName);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+  else {
+//this schema name could be `dfs.tmp`, a 2nd level schema under 
'dfs'
+String[] paths = schemaName.split("\\.");
+if (paths.length == 2) {
+  plugin = getSchemaFactories().getPlugin(paths[0]);
+  if (plugin != null) {
+plugin.registerSchemas(schemaConfig, thisPlus);
+  }
+
+  final CalciteSchema secondLevelSchema = 
getSubSchemaMap().get(paths[0]).getSubSchema(paths[1], caseSensitive);
+  if (secondLevelSchema != null) {
+SchemaPlus secondlevel = secondLevelSchema.plus();
+org.apache.drill.exec.store.AbstractSchema drillSchema =
+
secondlevel.unwrap(org.apache.drill.exec.store.AbstractSchema.class);
+SubSchemaWrapper wrapper = new SubSchemaWrapper(drillSchema);
+thisPlus.add(wrapper.getName(), wrapper);
+  }
+}
+  }
+} catch(ExecutionSetupException | IOException ex) {
--- End diff --

Really? Just 

---

[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150678803
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
+  }
+
+  @Override
+  public NavigableSet getTableNames() {
+Set pluginNames = Sets.newHashSet();
--- End diff --

Should this be case insensitive?


---
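If the intent is a case-insensitive name set, a `TreeSet` built with `String.CASE_INSENSITIVE_ORDER` is one standard option (a sketch, not the patch's code):

```java
import java.util.NavigableSet;
import java.util.TreeSet;

// Sketch: a NavigableSet that treats "DFS" and "dfs" as the same name.
public class PluginNameSet {
  public static NavigableSet<String> caseInsensitive() {
    return new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
  }
}
```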


[GitHub] drill pull request #1032: DRILL-5089: Dynamically load schema of storage plu...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1032#discussion_r150678680
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DynamicRootSchema.java
 ---
@@ -0,0 +1,125 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.sql;
+
+import com.google.common.collect.ImmutableSortedSet;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import org.apache.calcite.DataContext;
+import org.apache.calcite.jdbc.CalciteRootSchema;
+import org.apache.calcite.jdbc.CalciteSchema;
+
+import org.apache.calcite.linq4j.tree.Expression;
+import org.apache.calcite.linq4j.tree.Expressions;
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.calcite.schema.impl.AbstractSchema;
+import org.apache.calcite.util.BuiltInMethod;
+import org.apache.calcite.util.Compatible;
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.apache.drill.exec.store.StoragePlugin;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.SubSchemaWrapper;
+
+import java.io.IOException;
+import java.util.Map;
+import java.util.NavigableSet;
+import java.util.Set;
+
+public class DynamicRootSchema extends DynamicSchema
+implements CalciteRootSchema {
+
+  /** Creates a root schema. */
+  DynamicRootSchema(StoragePluginRegistry storages, SchemaConfig 
schemaConfig) {
+super(null, new RootSchema(), "");
+this.schemaConfig = schemaConfig;
+this.storages = storages;
+  }
+
+  @Override
+  public CalciteSchema getSubSchema(String schemaName, boolean 
caseSensitive) {
+CalciteSchema retSchema = getSubSchemaMap().get(schemaName);
+
+if (retSchema == null) {
+  loadSchemaFactory(schemaName, caseSensitive);
+}
+
+retSchema = getSubSchemaMap().get(schemaName);
+return retSchema;
--- End diff --

If the original call returns non-null, we make the same map lookup a second 
time. Better:
```
retSchema = ...
if (retSchema != null) { return retSchema; }
loadSchemaFactory(...)
return getSubSchemaMap()...
```
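A minimal, self-contained sketch of the suggested restructuring, using a plain `Map` and an instrumentation counter in place of Calcite's sub-schema map (all names are illustrative, not the actual `DynamicRootSchema` code): the hit path returns after a single lookup, and only a miss falls through to `loadSchemaFactory`.

```java
import java.util.HashMap;
import java.util.Map;

public class SchemaLookupSketch {
  private final Map<String, String> subSchemas = new HashMap<>();
  int lookups = 0;  // counts map lookups, to show the hit path does only one

  String getSubSchema(String name) {
    lookups++;
    String schema = subSchemas.get(name);
    if (schema != null) {
      return schema;           // hit: return early, no second map access
    }
    loadSchemaFactory(name);   // miss: lazily register the schema
    lookups++;
    return subSchemas.get(name);
  }

  // Stand-in for DynamicRootSchema.loadSchemaFactory(...)
  private void loadSchemaFactory(String name) {
    subSchemas.put(name, "schema:" + name);
  }

  public static void main(String[] args) {
    SchemaLookupSketch s = new SchemaLookupSketch();
    s.getSubSchema("dfs");                      // miss: loads the schema
    int before = s.lookups;
    System.out.println(s.getSubSchema("dfs"));  // hit, prints schema:dfs
    System.out.println(s.lookups - before);     // 1: single lookup on a hit
  }
}
```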


---


[GitHub] drill issue #1026: DRILL-5919: Add non-numeric support for JSON processing

2017-11-13 Thread vladimirtkach
Github user vladimirtkach commented on the issue:

https://github.com/apache/drill/pull/1026
  
What do you think about having two functions instead: convertFromJSON and 
convertFromJSON plus some suffix? The second would be able to convert NaN and Infinity.


---


Hangout Topics for 11/14/2017

2017-11-13 Thread Padma Penumarthy
Drill hangout tomorrow Nov 14th, at 10 AM PST.
Please send email or bring them up tomorrow, if you have topics to discuss.

Hangout link:
https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc

Thanks
Padma



[GitHub] drill issue #1030: DRILL-5941: Skip header / footer improvements for Hive st...

2017-11-13 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1030
  
FWIW, the native CSV reader does the following:

* To read the header, it seeks to offset 0 in the file, regardless of the 
block being read, then reads the header, which may be a remote read.
* The reader then seeks to the start of its block. If this is the first 
block, it skips the header, else it searches for the start of the next record.

Since Hive faces the same challenges, it must have solved this already; we 
only have to research that existing solution.

One simple solution is:

* If block number is 0, skip the header.
* If block number is 1 or larger, look for the next record separator.


---


[GitHub] drill issue #1030: DRILL-5941: Skip header / footer improvements for Hive st...

2017-11-13 Thread ppadma
Github user ppadma commented on the issue:

https://github.com/apache/drill/pull/1030
  
@arina-ielchiieva That would have a performance impact. A better way would be 
to keep one reader per split and see if we can figure out a way to tell readers 
how many rows they should skip (for the header or footer).


---


[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill

2017-11-13 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1031
  
@arina-ielchiieva Please test with the latest commit. I changed mapr 
version back to 5.2.1, note that 2.7.0-mapr-1707 is compiled with 5.2.2


---


[jira] [Created] (DRILL-5960) Add function STAsGeoJSON to extend GIS support

2017-11-13 Thread Chris Sandison (JIRA)
Chris Sandison created DRILL-5960:
-

 Summary: Add function STAsGeoJSON to extend GIS support
 Key: DRILL-5960
 URL: https://issues.apache.org/jira/browse/DRILL-5960
 Project: Apache Drill
  Issue Type: Bug
Reporter: Chris Sandison
Priority: Minor


Add a function wrapping ESRI's `asGeoJson` functionality. 

The implementation is very similar to STAsText.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill issue #984: DRILL-5783 Made a unit test for generated Priority Queue. ...

2017-11-13 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/984
  
@arina-ielchiieva I can put **DRILL-5783** in a separate PR. **DRILL-5841** 
and **DRILL-5894**, however, are tightly coupled and must be in the same PR. 
Unfortunately, **DRILL-5841** and **DRILL-5894** touched many files because all 
the unit tests were updated, so the PR will still remain large. 


---


[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1031
  
@vrozov I have posted all content of drillbit.out. drillbit.log just had 
information that could not create remote function registry directory.


---


[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill

2017-11-13 Thread vrozov
Github user vrozov commented on the issue:

https://github.com/apache/drill/pull/1031
  
@arina-ielchiieva Please attach the full log to JIRA.


---


[GitHub] drill issue #1030: DRILL-5941: Skip header / footer improvements for Hive st...

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1030
  
@ppadma really good point! I did not realize that, by the time readers are 
created, input splits can already be distributed among several drillbits. So I guess we 
should make sure that, when a table has skip header / footer logic, input 
splits of the same file reside in one `HiveSubScan` during the planning stage. I'll 
have to make some changes and let you know once done.
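A minimal sketch of the planning-time grouping described above (illustrative names, not Drill's actual `HiveSubScan` code): splits are keyed by file so that all splits of one file can be assigned to the same sub-scan, and thus to one drillbit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SplitGroupingSketch {
  // Groups (file, splitId) pairs by file; each value list would become the
  // contents of a single sub-scan in the plan.
  static Map<String, List<String>> groupByFile(List<String[]> splits) {
    Map<String, List<String>> byFile = new LinkedHashMap<>();
    for (String[] s : splits) {  // s[0] = file path, s[1] = split id
      byFile.computeIfAbsent(s[0], k -> new ArrayList<>()).add(s[1]);
    }
    return byFile;
  }

  public static void main(String[] args) {
    List<String[]> splits = Arrays.asList(
        new String[]{"a.csv", "s0"},
        new String[]{"b.csv", "s1"},
        new String[]{"a.csv", "s2"});
    System.out.println(groupByFile(splits));  // {a.csv=[s0, s2], b.csv=[s1]}
  }
}
```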


---


[GitHub] drill pull request #970: DRILL-5832: Migrate OperatorFixture to use SystemOp...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/970


---


[GitHub] drill issue #978: DRILL-5842: Refactor and simplify the fragment, operator c...

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/978
  
+1, LGTM.


---


[GitHub] drill issue #970: DRILL-5832: Migrate OperatorFixture to use SystemOptionMan...

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/970
  
+1, LGTM.


---


[GitHub] drill issue #984: DRILL-5783 Made a unit test for generated Priority Queue. ...

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/984
  
@ilooner changes in 361 files seem to be too much :) I suggest you split 
this PR into at least three according to the addressed issues, so we can try to merge 
smaller chunks.


---


[GitHub] drill issue #1031: DRILL-5917: Ban org.json:json library in Drill

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1031
  
@vrozov when I have deployed Drill with your changes on test cluster, Drill 
start up failed with the following error:
```
2017-11-13 01:30:12,3367 ERROR JniCommon 
fs/client/fileclient/cc/jni_MapRClient.cc:616 Thread: 26996 Mismatch found for 
java and native libraries java build version 5.2.2.44680GA, native build 
version 5.2.2.44680.GA java patch vserion $Id: mapr-version: 5.2.2.44680GA 
44680:b0ba7dcefd80b9aaf39f05f3, native patch version $Id: mapr-version: 
5.2.2.44680.GA 44680:b0ba7dcefd80b9aaf39f05f
2017-11-13 01:30:12,3368 ERROR JniCommon 
fs/client/fileclient/cc/jni_MapRClient.cc:626 Thread: 26996 Client 
initialization failed.
Exception in thread "main" 
org.apache.drill.exec.exception.DrillbitStartupException: Failure during 
initial startup of Drillbit.
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:353)
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:323)
at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:319)
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Error 
during udf area creation [/user/root/drill/udf/registry] on file system 
[maprfs:///]
at 
org.apache.drill.common.exceptions.DrillRuntimeException.format(DrillRuntimeException.java:49)
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.createArea(RemoteFunctionRegistry.java:277)
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.prepareAreas(RemoteFunctionRegistry.java:234)
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.init(RemoteFunctionRegistry.java:105)
at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:167)
at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:349)
... 2 more
Caused by: java.io.IOException: Could not create FileClient
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:626)
at com.mapr.fs.MapRFileSystem.lookupClient(MapRFileSystem.java:679)
at com.mapr.fs.MapRFileSystem.makeDir(MapRFileSystem.java:1239)
at com.mapr.fs.MapRFileSystem.mkdirs(MapRFileSystem.java:1276)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1917)
at 
org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.createArea(RemoteFunctionRegistry.java:254)
... 6 more
```
Without your changes start up was successful.


---


[GitHub] drill pull request #904: DRILL-5717: change some date time test cases with s...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/904


---


[GitHub] drill pull request #774: DRILL-5337: OpenTSDB storage plugin

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/774


---


[GitHub] drill pull request #987: DRILL-5863: Sortable table incorrectly sorts fragme...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/987


---


[GitHub] drill pull request #1005: DRILL-5896: Handle HBase columns vector creation i...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1005


---


[GitHub] drill pull request #1021: DRILL-5923: Display name for query state

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1021


---


[GitHub] drill pull request #1015: DRILL-5899: Simple pattern matchers can work with ...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1015


---


[GitHub] drill pull request #1019: DRILL-5909: Added new Counter metrics

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1019


---


[GitHub] drill pull request #949: DRILL-5795: Parquet Filter push down at rowgroup le...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/949


---


[GitHub] drill pull request #1017: DRILL-5822: The query with "SELECT *" with "ORDER ...

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1017


---


[GitHub] drill pull request #1020: DRILL-5921: Display counter metrics in table

2017-11-13 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1020


---


[GitHub] drill pull request #1033: DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150512392
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTAS.java ---
@@ -313,6 +313,59 @@ public void createTableWithCustomUmask() throws 
Exception {
 }
   }
 
+  @Test // DRILL-5951
+  public void 
testCreateTableIfNotExistsWhenTableWithSameNameAlreadyExists() throws Exception{
+final String newTblName = 
"createTableIfNotExistsWhenATableWithSameNameAlreadyExists";
+
+try {
+  test(String.format("CREATE TABLE %s.%s AS SELECT * from 
cp.`region.json`", TEMP_SCHEMA, newTblName));
+
+  final String ctasQuery =
+String.format("CREATE TABLE IF NOT EXISTS %s.%s AS SELECT * FROM 
cp.`employee.json`", TEMP_SCHEMA, newTblName);
+
+  testBuilder()
+.sqlQuery(ctasQuery)
+.unOrdered()
+.baselineColumns("ok", "summary")
+.baselineValues(false, String.format("A table or view with given 
name [%s] already exists in schema [%s]", newTblName, TEMP_SCHEMA))
+.go();
+} finally {
+  test(String.format("DROP TABLE %s.%s", TEMP_SCHEMA, newTblName));
--- End diff --

Please use `drop table if exists` here, since the test may fail before it creates 
the table and we don't want one more exception not related to the problem.


---


[GitHub] drill pull request #1033: DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150513171
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/CreateTableHandler.java
 ---
@@ -83,7 +84,20 @@ public PhysicalPlan getPlan(SqlNode sqlNode) throws 
ValidationException, RelConv
 final DrillConfig drillConfig = context.getConfig();
 final AbstractSchema drillSchema = resolveSchema(sqlCreateTable, 
config.getConverter().getDefaultSchema(), drillConfig);
 
-checkDuplicatedObjectExistence(drillSchema, originalTableName, 
drillConfig, context.getSession());
+String schemaPath = drillSchema.getFullSchemaName();
+
+// Check duplicate object existence
+boolean isTemporaryTable = 
context.getSession().isTemporaryTable(drillSchema, drillConfig, 
originalTableName);
--- End diff --

1. Please add unit tests for temp tables as well in `TestCTTAS` class.
2. Should we consider similar functionality for views?


---


[GitHub] drill pull request #1033: DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150512944
  
--- Diff: exec/java-exec/src/main/codegen/includes/parserImpls.ftl ---
@@ -241,8 +243,8 @@ SqlNode SqlCreateTable() :
 
 query = OrderedQueryOrExpr(ExprContext.ACCEPT_QUERY)
 {
-return new SqlCreateTable(pos,tblName, fieldList, 
partitionFieldList, query,
-SqlLiteral.createBoolean(isTemporary, 
getPos()));
+return new SqlCreateTable(pos, 
SqlLiteral.createBoolean(tableNonExistenceCheck, getPos()), tblName, fieldList,
--- End diff --

Not sure if it is good practice to add a new parameter at the beginning of 
the constructor rather than at the end.


---


[GitHub] drill pull request #1033: DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150512516
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTAS.java ---
@@ -313,6 +313,59 @@ public void createTableWithCustomUmask() throws 
Exception {
 }
   }
 
+  @Test // DRILL-5951
+  public void 
testCreateTableIfNotExistsWhenTableWithSameNameAlreadyExists() throws Exception{
+final String newTblName = 
"createTableIfNotExistsWhenATableWithSameNameAlreadyExists";
+
+try {
+  test(String.format("CREATE TABLE %s.%s AS SELECT * from 
cp.`region.json`", TEMP_SCHEMA, newTblName));
+
+  final String ctasQuery =
+String.format("CREATE TABLE IF NOT EXISTS %s.%s AS SELECT * FROM 
cp.`employee.json`", TEMP_SCHEMA, newTblName);
+
+  testBuilder()
+.sqlQuery(ctasQuery)
+.unOrdered()
+.baselineColumns("ok", "summary")
+.baselineValues(false, String.format("A table or view with given 
name [%s] already exists in schema [%s]", newTblName, TEMP_SCHEMA))
+.go();
+} finally {
+  test(String.format("DROP TABLE %s.%s", TEMP_SCHEMA, newTblName));
+}
+  }
+
+  @Test // DRILL-5951
+  public void 
testCreateTableIfNotExistsWhenViewWithSameNameAlreadyExists() throws Exception{
+final String newTblName = 
"createTableIfNotExistsWhenAViewWithSameNameAlreadyExists";
+
+try {
+  test(String.format("CREATE VIEW %s.%s AS SELECT * from 
cp.`region.json`", TEMP_SCHEMA, newTblName));
+
+  final String ctasQuery =
+String.format("CREATE TABLE IF NOT EXISTS %s.%s AS SELECT * FROM 
cp.`employee.json`", TEMP_SCHEMA, newTblName);
+
+  testBuilder()
+.sqlQuery(ctasQuery)
+.unOrdered()
+.baselineColumns("ok", "summary")
+.baselineValues(false, String.format("A table or view with given 
name [%s] already exists in schema [%s]", newTblName, TEMP_SCHEMA))
+.go();
+} finally {
+  test(String.format("DROP VIEW %s.%s", TEMP_SCHEMA, newTblName));
+}
+  }
+
+  @Test // DRILL-5951
+  public void 
testCreateTableIfNotExistsWhenTableWithSameNameDoesNotExist() throws Exception{
+final String newTblName = 
"createTableIfNotExistsWhenATableWithSameNameDoesNotExist";
+
+try {
+  test(String.format("CREATE TABLE IF NOT EXISTS %s.%s AS SELECT * 
FROM cp.`employee.json`", TEMP_SCHEMA, newTblName));
--- End diff --

You might want to check the baseline here as well, but only for `true`, to 
confirm that the table was actually created.


---


[GitHub] drill pull request #1033: DRILL-5952: Implement "CREATE TABLE IF NOT EXISTS"

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/1033#discussion_r150512175
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/sql/TestCTAS.java ---
@@ -313,6 +313,59 @@ public void createTableWithCustomUmask() throws 
Exception {
 }
   }
 
+  @Test // DRILL-5951
+  public void 
testCreateTableIfNotExistsWhenTableWithSameNameAlreadyExists() throws Exception{
+final String newTblName = 
"createTableIfNotExistsWhenATableWithSameNameAlreadyExists";
+
+try {
+  test(String.format("CREATE TABLE %s.%s AS SELECT * from 
cp.`region.json`", TEMP_SCHEMA, newTblName));
--- End diff --

The `test` method already applies `String.format`, so you can use `test("CREATE 
TABLE %s.%s AS SELECT * from cp.`region.json`", TEMP_SCHEMA, newTblName)`


---


[jira] [Created] (DRILL-5959) Unable to apply Union on multiple select query expressions

2017-11-13 Thread Rajeshwar Rao (JIRA)
Rajeshwar Rao created DRILL-5959:


 Summary: Unable to apply Union on  multiple select query 
expressions
 Key: DRILL-5959
 URL: https://issues.apache.org/jira/browse/DRILL-5959
 Project: Apache Drill
  Issue Type: Bug
  Components: Client - JDBC
Affects Versions: 1.11.0, 1.10.0
Reporter: Rajeshwar Rao
Priority: Minor
 Fix For: Future


Unable to combine the result sets of  more than  two separate query expressions

example :

select color, Paul  , 'Paul' name
 from mysql.devcdp25.pivtest
 union all
 select color, John , 'John' name
 from mysql.devcdp25.pivtest
 union all
 select color, Tim , 'Tim' name
 from mysql.devcdp25.pivtest

getting the following error : 

Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: The 
JDBC storage plugin failed while trying setup the SQL query. sql SELECT * FROM 
(SELECT `color`, `Paul`, 'Paul' AS `name` FROM `devcdp25`.`pivtest` UNION ALL 
SELECT `color`, `John`, 'John' AS `name` FROM `devcdp25`.`pivtest`) UNION ALL 
SELECT `color`, `Tim`, 'Tim' AS `name` FROM `devcdp25`.`pivtest` plugin mysql 
Fragment 0:0





[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread vladimirtkach
Github user vladimirtkach commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150500913
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

I think we should stick to the JSON standard and leave them switched off by 
default. If the user gets an exception, we can show the name of the option they 
need to switch on.  
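A minimal sketch of surfacing the option name in the error, as suggested above (the class and message are illustrative, not Drill's actual reader code; `store.json.reader.allow_nan_inf` is the option name discussed later in this thread):

```java
public class NanOptionSketch {
  // Parses a numeric token; non-standard values are rejected with a message
  // that names the session option the user can enable.
  static double parseNumber(String token, boolean allowNanInf) {
    if (!allowNanInf
        && ("NaN".equals(token) || "Infinity".equals(token)
            || "-Infinity".equals(token))) {
      throw new IllegalArgumentException(
          "Non-standard numeric value '" + token + "' encountered; "
          + "enable `store.json.reader.allow_nan_inf` to read it");
    }
    return Double.parseDouble(token);
  }

  public static void main(String[] args) {
    System.out.println(parseNumber("1.5", false));  // standard JSON is fine
    try {
      parseNumber("NaN", false);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());  // names the option to switch on
    }
  }
}
```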


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread vladimirtkach
Github user vladimirtkach commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150500204
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

Thanks, `allow_nan_inf` is definitely better.


---


[GitHub] drill pull request #1026: DRILL-5919: Add non-numeric support for JSON proce...

2017-11-13 Thread vladimirtkach
Github user vladimirtkach commented on a diff in the pull request:

https://github.com/apache/drill/pull/1026#discussion_r150500059
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -502,6 +502,8 @@ drill.exec.options: {
 store.format: "parquet",
 store.hive.optimize_scan_with_native_readers: false,
 store.json.all_text_mode: false,
+store.json.writer.non_numeric_numbers: false,
+store.json.reader.non_numeric_numbers: false,
--- End diff --

No, we don't, and we should test them. I think we have to create a separate 
JIRA for testing math functions, because the code changes from this PR don't 
affect the logic of any math function, and while testing math functions there will 
be other issues not connected with this functionality (like the BigInteger 
constructor not accepting NaN, Infinity and others).


---


[GitHub] drill issue #1014: DRILL-5771: Fix serDe errors for format plugins

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/1014
  
@ilooner Paul has correctly described the race condition issue. At planning 
time, once we have determined the plugin configuration, we want to make sure it 
stays the same during query execution. Relying on the plugin configuration name is 
unsafe, since the config under that particular name can change, so one 
fragment of the query could be executed with one configuration and another fragment with a 
different one (a typical race condition).

Regarding your last question, during physical plan creation I don't think 
we deserialize DrillTable; rather, we deserialize query fragments (as can be 
seen from the json profile). Also, you can start looking, for example, from here:
https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/work/QueryWorkUnit.java#L59
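A minimal, self-contained sketch of the race described above (all names here are made up for illustration, not Drill's actual classes): re-resolving the plugin config by name at execution time can observe a concurrent update, while carrying the config by value in the serialized plan pins the planning-time version.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConfigRaceSketch {
  static final Map<String, String> registry = new ConcurrentHashMap<>();

  // Returns true if the executing fragment sees the same config that was in
  // effect at planning time. byValue=true models serializing the config into
  // the plan; byValue=false models re-resolving it by name at execution time.
  static boolean sameConfigSeen(boolean byValue) {
    registry.put("dfs", "config-v1");
    String planned = registry.get("dfs");  // captured at planning time
    registry.put("dfs", "config-v2");      // plugin updated mid-query
    String seen = byValue ? planned : registry.get("dfs");
    return seen.equals(planned);
  }

  public static void main(String[] args) {
    System.out.println(sameConfigSeen(false));  // false: by-name lookup races
    System.out.println(sameConfigSeen(true));   // true: serialized value holds
  }
}
```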



---


[GitHub] drill issue #904: DRILL-5717: change some date time test cases with specific...

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/904
  
+1, LGTM.


---


[GitHub] drill issue #774: DRILL-5337: OpenTSDB storage plugin

2017-11-13 Thread arina-ielchiieva
Github user arina-ielchiieva commented on the issue:

https://github.com/apache/drill/pull/774
  
+1, LGTM.


---