[GitHub] [drill] paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues

2020-03-10 Thread GitBox
paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues
URL: https://github.com/apache/drill/pull/2010#issuecomment-597381456
 
 
   Rebased on latest master to resolve conflict.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues

2020-03-10 Thread GitBox
paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues
URL: https://github.com/apache/drill/pull/2010#issuecomment-597381178
 
 
   @agozhiy, here is how I read the old behavior:
   
   * The UI sends the plugin as JSON via the `createOrUpdatePlugin()` mehod.
   * `StorageResources` configures its mapper (which is really the global one 
from `LogicalPlanPersistence`) to accept comments, then deserializes the JSON 
to create a config.
   * If deserialization fails, the UI returns an error message.
   * `StorageResources` passes the name and config, as a `PluginConfigWrapper` 
to `createOrUpdatePluginJSON()` which calls `createOrUpdate()` in the plugin 
registry.
   * `createOrUpdate()` first creates an instance of the plugin. The result 
will be one of a) `null` if the config is disabled, b) a `StoragePlugin` if 
enabled and valid, or c) an exception if invalid.
   * `createOrUpdate()` now does one of:
 * If enabled, replace existing plugin in the in-memory cache with the new 
one, or add the new one
 * If disabled, remove the existing plugin from the in-memory cache
   * `createOrUpdate()` then writes the config to persistent storage 
independent of the above choices.
   
   There are bugs in the above. The above writes to the local cache before 
storage meaning that, for a brief time, the two are out of sync. If, during 
that time, another update comes in, then the two will again be out of sync. 
These issues don't crop up in the current tests, of course.
   
   The new code does:
   
   * The UI sends the plugin as JSON via the `createOrUpdatePlugin()` mehod.
   * `StorageResources` calls `putJson()` in the plugin registry with the name 
and JSON text.
   * `putJson()` decodes the JSON using the system mapper, configured to allow 
comments, throwing an exception if the decode fails. If succeeds, calls `put()` 
with the config.
   * `put()` validates that the plugin is not a system plugin, then updates 
persistent storage.
   
   I see the problem, the `put()` call above should be the new `validatedPut()` 
which creates an instance of the plugin first. Once we make that fix:
   
   * `validatedPut()` creates an instance of the plugin, throwing an exception 
if instance creation fails.
   * If OK, `validatedPut()` writes he plugin to persistent storage and updates 
the plugin cache.
   
   There is a second bug here: the logic should be:
   
   *  `validatedPut()` does one of:
  * If enabled, updates the config in the plugin cache.
  * If disabled, removes the config from the plugin cache.
   * `validatedPut()` then checks if the old entry had a plugin instance. If , 
adds it to the ephemeral store.
   * `validatedPut()`  writes the updated config to persistent storage.
   
   The planner bugs explained earlier still occur. (A query should not iterate 
over all registered plugins.) Still, the above fixes should get us back to the 
original behavior, with the addition of the other fixes we're trying to make.
   
   Made the two fixes noted above. Added unit tests for this case.
   
   @agozhiy, if you run the UI scenario from your earlier comment, you should 
see that the plugin is rejected if invalid and enabled. The config is stored, 
but not instantiated, if invalid and disabled.
   
   If you are brave, you can use ZK to change a stored valid config to make it 
invalid. Doing so simulates the external system failing. With the old version, 
Drill will fail on startup. With this version, Drill will start, but you'll see 
error messages in the log each time you plan a query. A future PR will ensure 
you see no errors at all.
   
   All of this points out more open issues:
   
   * If a plugin ever fails, should it be disabled? How would this work for a 
temporary outage? The user would have to manually re-enable the plugin when it 
becomes valid again.
   * On a completely unrelated topic, there is code in the original version 
that forces all names of stored plugins to lower case (removing any mixed-case 
versions and replacing them), but no code enforces the lower-case rule in the 
in-memory cache. (I have tests that specifically test this case.) However, 
names are forced to lower case in storage. This seems more on the "bug" side of 
the ledger than the "feature" side. Should we allow mixed case everywhere or 
nowhere? Yet another thing to fix in the next PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin 
for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390678433
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient() {

[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin 
for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390677994
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
 ##
 @@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.map.CaseInsensitiveMap;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+
+@JsonTypeName(HttpStoragePluginConfig.NAME)
+public class HttpStoragePluginConfig extends StoragePluginConfigBase {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpStoragePluginConfig.class);
+
+  public static final String NAME = "http";
+
+  public final Map connections;
+
+  public final boolean cacheResults;
+
+  public int timeout;
+
+  @JsonCreator
+  public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean 
cacheResults,
+ @JsonProperty("connections") Map connections,
+ @JsonProperty("timeout") int timeout) {
+logger.debug("Initialize HttpStoragePluginConfig {}", connections);
 
 Review comment:
   In the next revision, I'll remove all these debug statements.  These were 
there for me to follow the flow of the code. 


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin 
for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390677421
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
+return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List columns) {
+logger.debug("HttpGroupScan clone {}", columns);
+HttpGroupScan newScan = new HttpGroupScan(this);
+newScan.columns = columns;
+return newScan;
+  }
+
+  @Override
+  public String getDigest() {
+return toString();
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List children) {
+Preconditions.checkArgument(children.isEmpty());
+return new HttpGroupScan(this);
+  }
+
+  @Override
+  public ScanStats getScanStats() {
+int colCount = columns.size();
+int estRowCount = 1;
+int estDataSize = estRowCount * 200 * colCount;
 
 Review comment:
   Hey... this is a learning experience for us all... (but mainly for me... ) 
;-)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the sp

[GitHub] [drill] vvysotskyi merged pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system

2020-03-10 Thread GitBox
vvysotskyi merged pull request #2021: DRILL-7635: Mongo tests fail for profiles 
with the non-file default file system
URL: https://github.com/apache/drill/pull/2021
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] dobesv commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries

2020-03-10 Thread GitBox
dobesv commented on a change in pull request #1997: DRILL-7604: Allow session 
options to be set in HTTP queries
URL: https://github.com/apache/drill/pull/1997#discussion_r390652771
 
 

 ##
 File path: exec/java-exec/src/main/resources/rest/query/query.ftl
 ##
 @@ -82,6 +82,16 @@
   Submit
 
 checked> Limit results to  rows 

+ Store Format
+  
+Parquet
+JSON
+Comma-Separated Values
+Pipe-Separated Values
+Tab-Separated Values
 
 Review comment:
   @arina-ielchiieva OK, I'll just remove the field for now.  I added it here 
for demo/testing but I don't actually need it to be there for my own purposes.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries

2020-03-10 Thread GitBox
arina-ielchiieva commented on a change in pull request #1997: DRILL-7604: Allow 
session options to be set in HTTP queries
URL: https://github.com/apache/drill/pull/1997#discussion_r390649598
 
 

 ##
 File path: exec/java-exec/src/main/resources/rest/query/query.ftl
 ##
 @@ -82,6 +82,16 @@
   Submit
 
 checked> Limit results to  rows 

+ Store Format
+  
+Parquet
+JSON
+Comma-Separated Values
+Pipe-Separated Values
+Tab-Separated Values
 
 Review comment:
   @dobesv As I have stated before, we should not hard-code format names since 
they can be customized, please remove this part of code, other changes are ok.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #2021: DRILL-7635: Mongo tests fail for 
profiles with the non-file default file system
URL: https://github.com/apache/drill/pull/2021#issuecomment-597352332
 
 
   +1, LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi opened a new pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system

2020-03-10 Thread GitBox
vvysotskyi opened a new pull request #2021: DRILL-7635: Mongo tests fail for 
profiles with the non-file default file system
URL: https://github.com/apache/drill/pull/2021
 
 
   # [DRILL-7635](https://issues.apache.org/jira/browse/DRILL-7635): Mongo 
tests fail for profiles with the non-file default file system
   
   ## Description
   Please see Jira for details.
   
   ## Documentation
   NA
   
   ## Testing
   Ran all tests.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390583045
 
 

 ##
 File path: contrib/storage-http/src/main/resources/drill-module.conf
 ##
 @@ -0,0 +1,27 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+#  This file tells Drill to consider this module when class path scanning.
+#  This file can also include any supplementary configuration information.
+#  This file is in HOCON format, see 
https://github.com/typesafehub/config/blob/master/HOCON.md for more information.
+
+drill: {
+  classpath.scanning: {
+packages += "org.apache.drill.exec.store.http"
+  }
+}
 
 Review comment:
   Nit: newline


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390547367
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
 
 Review comment:
   Unless you debug by reading logs, no need for these debug statements. These 
methods are called when needed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390582732
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390585371
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
 ##
 @@ -253,11 +268,20 @@ private void updateRunningCount() {
 runningRecordCount += recordCount;
   }
 
+  public void setInputStream(InputStream in) {
+this.inputStream = in;
+  }
+
   @Override
   public void close() throws Exception {
 if (stream != null) {
   stream.close();
   stream = null;
 }
+
+if (inputStream != null) {
+  inputStream.close();
+  inputStream = null;
+}
 
 Review comment:
   Doesn't the Jackson parser close the input stream when the parser itself is 
closed? At least, that is what I thought I found in my own JSON tests. If not, 
we've got a pile of open HDFS streams that we've never closed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390573286
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
 
 Review comment:
   Timeout?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390582369
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390556073
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
+return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List columns) {
+logger.debug("HttpGroupScan clone {}", columns);
+HttpGroupScan newScan = new HttpGroupScan(this);
+newScan.columns = columns;
+return newScan;
+  }
+
+  @Override
+  public String getDigest() {
+return toString();
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List children) {
+Preconditions.checkArgument(children.isEmpty());
+return new HttpGroupScan(this);
+  }
+
+  @Override
+  public ScanStats getScanStats() {
+int colCount = columns.size();
+int estRowCount = 1;
+int estDataSize = estRowCount * 200 * colCount;
+int estCpuCost = 1;
 
 Review comment:
   Just to be safe, I'd make CPU cost a function of data size. I think there 
are functions for this in the planner somewhere. See `DrillCostBase`.


This is an automated message from the Apache Git Se

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390572977
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
 
 Review comment:
   nit:
   ```
   requestBuilder.addHeader(entry.getKey(), entry.getValue());
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390557489
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpRecordReader.java
 ##
 @@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.easy.json.JSONRecordReader;
+import org.apache.drill.exec.store.http.util.SimpleHttp;
+
+import java.io.InputStream;
+import java.util.List;
+
+
+public class HttpRecordReader extends JSONRecordReader {
+
+  private final HttpSubScan subScan;
+
+  private final SimpleHttp http;
+
+  public HttpRecordReader(FragmentContext context, List 
projectedColumns, HttpStoragePluginConfig config, HttpSubScan subScan) {
+super(context, projectedColumns);
+this.subScan = subScan;
+String connectionName = subScan.tableSpec().database();
+this.http = new SimpleHttp(config, context, connectionName);
+InputStream inputStream = getInputStream();
+
+setInputStream(inputStream);
+  }
+  /**
+   * Executes the HTTP request and returns an InputStream to the retrieved data
+   * @return InputStream the InputStream containing the data
+   */
+  private InputStream getInputStream() {
+String url = subScan.getFullURL();
+return http.getInputStream(url);
 
 Review comment:
   Originally wrote a note asking about error handling. Then, saw it was done 
in the `getInputStream` method. Maybe add a Javadoc `@throws UserException` to 
explain this fact.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390570782
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpSubScan.java
 ##
 @@ -0,0 +1,132 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.UserBitShared.CoreOperatorType;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+@JsonTypeName("http-sub-scan")
+public class HttpSubScan extends AbstractBase implements SubScan {
+
+  private final HttpScanSpec tableSpec;
+  private final HttpStoragePluginConfig config;
+  private final List columns;
+
+  @JsonCreator
+  public HttpSubScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("tableSpec") HttpScanSpec tableSpec,
+@JsonProperty("columns") List columns) {
+super("user-if-needed");
+this.config = config;
+this.tableSpec = tableSpec;
+this.columns = columns;
+  }
+  @JsonProperty("tableSpec")
+  public HttpScanSpec tableSpec() {
+return tableSpec;
+  }
+
+  @JsonProperty("columns")
+  public List columns() {
+return columns;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() {
+return config;
+  }
+
+  @JsonIgnore
+  public String getURL() {
+return tableSpec.getURL();
+  }
+
+  @JsonIgnore
+  public String getFullURL() {
+String selectedConnection = tableSpec.database();
+String url = config.connections().get(selectedConnection).url();
+return url + tableSpec.tableName();
+  }
+
+ @Override
+  public  T accept(
+   PhysicalVisitor physicalVisitor, X value) throws E {
+return physicalVisitor.visitSubScan(this, value);
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List children) {
+return new HttpSubScan(config, tableSpec, columns);
+  }
+
+  @Override
+  @JsonIgnore
+  public int getOperatorType() {
+return CoreOperatorType.HTTP_SUB_SCAN_VALUE;
+  }
+
+  @Override
+  public Iterator iterator() {
+return ImmutableSet.of().iterator();
+  }
+
+  @Override
+  public String toString() {
+return MoreObjects.toStringHelper(this)
+  .add("tableSpec", tableSpec)
+  .add("columns", columns)
+  .add("config", config)
+  .toString();
+  }
+
+  @Override
+  public int hashCode() {
+return Arrays.hashCode(
+  new Object[]{tableSpec, columns, config});
 
 Review comment:
   `Objects.hash(tableSpec, columns, config)`. Maybe this is old code?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390559029
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanBatchCreator.java
 ##
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.List;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.ops.ExecutorFragmentContext;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HttpScanBatchCreator implements BatchCreator {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpScanBatchCreator.class);
+
+  @Override
+  public ScanBatch getBatch(ExecutorFragmentContext context, HttpSubScan 
subScan, List children) throws ExecutionSetupException {
+logger.debug("getBatch called");
+Preconditions.checkArgument(children == null || children.isEmpty());
+
+HttpStoragePluginConfig config = subScan.config();
+List readers = Lists.newArrayList();
+List columns;
+
+if ((columns = subScan.columns()) == null) {
+  columns = GroupScan.ALL_COLUMNS;
+}
 
 Review comment:
   Clever. But the idiomatic Java way to be clever is:
   
   ```
   List columns = subscan.columns() == null
   ? GroupScan.ALL_COLUMNS : subScan.columns();
   ```
   
   Note that this code does what was suggested earlier: handle the no-columns 
case here so that the cost calcs know if columns were pushed down.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390579548
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390580075
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390583400
 
 

 ##
 File path: 
contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
 ##
 @@ -0,0 +1,518 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+import static org.apache.drill.test.rowSet.RowSetUtilities.mapValue;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNull;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import okhttp3.mockwebserver.MockResponse;
+import okhttp3.mockwebserver.MockWebServer;
+import okhttp3.mockwebserver.RecordedRequest;
+import okio.Buffer;
+import okio.Okio;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.file.Paths;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.concurrent.TimeUnit;
+
+/**
+ * This class tests the HTTP Storage plugin. Since the plugin makes use of 
REST requests, this
+ * test class makes use of the okhttp3 MockWebServer to simulate a remote web 
server.  There are
+ * two unit tests that make remote REST calls, however these tests are ignored 
by default.
+ *
+ * The HTTP reader uses Drill's existing JSON reader class, so the unit tests 
focus on testing the plugin configurations
+ * rather than how well it parses the JSON as this is tested elsewhere.
+ */
+public class TestHttpPlugin extends ClusterTest {
+  private static final Logger logger = 
LoggerFactory.getLogger(TestHttpPlugin.class);
+
+  private final String TEST_JSON_RESPONSE = 
"{\"results\":{\"sunrise\":\"6:13:58 AM\",\"sunset\":\"5:59:55 
PM\",\"solar_noon\":\"12:06:56 PM\",\"day_length\":\"11:45:57\"," +
 
 Review comment:
   Suggestion: place this text in a resource file and read it. Easier for us 
poor humans to read and modify.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390548155
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,170 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec,
+@JacksonInject StoragePluginRegistry engineRegistry
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+return 0;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
+return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List columns) {
+logger.debug("HttpGroupScan clone {}", columns);
+HttpGroupScan newScan = new HttpGroupScan(this);
+newScan.columns = columns;
 
 Review comment:
   This is messy. The constructor would be `(HttpGroupScan from, 
List columns)`. There is a set of these conversion constructors 
that each plugin needs.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390561752
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePlugin.java
 ##
 @@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.drill.common.JSONOptions;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.AbstractStoragePlugin;
+import org.apache.drill.exec.store.SchemaConfig;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.core.type.TypeReference;
+import java.io.IOException;
+
+public class HttpStoragePlugin extends AbstractStoragePlugin {
+
+  private final DrillbitContext context;
+
+  private final HttpStoragePluginConfig engineConfig;
+
+  private final HttpSchemaFactory schemaFactory;
+
+  public HttpStoragePlugin(HttpStoragePluginConfig configuration, 
DrillbitContext context, String name) {
+super(context, name);
+engineConfig = configuration;
+schemaFactory = new HttpSchemaFactory(this, name);
+this.context = context;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) {
+schemaFactory.registerSchemas(schemaConfig, parent);
+  }
+
+  @Override
+  public HttpStoragePluginConfig getConfig() {
+return engineConfig;
+  }
+
+  @Override
+  public DrillbitContext getContext() {
+return context;
+  }
+
+  @Override
+  public boolean supportsRead() {
+return true;
+  }
+
+  @Override
+  public AbstractGroupScan getPhysicalScan(String userName, JSONOptions 
selection) throws IOException {
+HttpScanSpec scanSpec = selection.getListWith(new ObjectMapper(), new 
TypeReference() {});
 
 Review comment:
   So I discovered that the right way to do this is not to create your own 
mapper, but rather to use the intuitively obvious:
   
   ```
   context.getLpPersistence().getMapper();
   ```
   
   Which has all the Drill Jackson goodies built in. Here `context` is the 
`DrillbitContext` from the base class.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390580267
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
+
+try {
+  // Execute the request
+  Response response = client
+.newCall(request)
+.execute();
+
+  // If the request is unsuccessful, throw a UserException
+  if (!response.isSuccessful()) {
+throw UserException
+  .dataReadError()
+  .message("Error retrieving data from HTTP Storage Plugin: " + 
response.code() + " " + response.message())
+  .addContext("URL: ", urlStr)
+  .addContext("Response code: ", response.code())
+  .build(logger);
+  }
+  logger.debug("HTTP Request for {} successful.", urlStr);
+  logger.debug("Response Headers: {} ", response.headers().toString());
+
+  // Return the InputStream of the response
+  return Objects.requireNonNull(response.body()).byteStream();
+} catch (IOException e) {
+  throw UserException
+.dataReadError(e)
+.message("Error retrieving data from HTTP Storage Plugin: %s", 
e.getMessage())
+.addContext("URL Requested:" + urlStr)
+.build(logger);
+}
+  }
+
+  /**
+   * Function configures the OkHTTP3 server object with configuration info 
from the user.
+   *
+   * @return OkHttpClient configured server
+   */
+  private OkHttpClient setupHttpClient

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390565084
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
 ##
 @@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.map.CaseInsensitiveMap;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+
+@JsonTypeName(HttpStoragePluginConfig.NAME)
+public class HttpStoragePluginConfig extends StoragePluginConfigBase {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpStoragePluginConfig.class);
+
+  public static final String NAME = "http";
+
+  public final Map connections;
+
+  public final boolean cacheResults;
+
+  public int timeout;
+
+  @JsonCreator
+  public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean 
cacheResults,
+ @JsonProperty("connections") Map connections,
+ @JsonProperty("timeout") int timeout) {
+logger.debug("Initialize HttpStoragePluginConfig {}", connections);
 
 Review comment:
   I see many of these. If you are debugging via log messages, you can save 
yourself a huge amount of time by debugging via unit tests instead.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390568363
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpSchemaFactory.java
 ##
 @@ -0,0 +1,105 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Map;
+import java.util.Set;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.drill.common.exceptions.DrillRuntimeException;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.store.AbstractSchema;
+import org.apache.drill.exec.store.AbstractSchemaFactory;
+import org.apache.drill.exec.store.SchemaConfig;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HttpSchemaFactory extends AbstractSchemaFactory {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpSchemaFactory.class);
+
+  private final HttpStoragePlugin plugin;
+
+  public HttpSchemaFactory(HttpStoragePlugin plugin, String schemaName) {
+super(schemaName);
+this.plugin = plugin;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) {
+HttpSchema schema = new HttpSchema(getName());
+logger.debug("Registering {} {}", schema.getName(), schema.toString());
+
+SchemaPlus schemaPlus = parent.add(getName(), schema);
+schema.setHolder(schemaPlus);
+  }
+
+  class HttpSchema extends AbstractSchema {
+
+public HttpSchema(String name) {
+  super(Collections.emptyList(), name);
+}
+
+void setHolder(SchemaPlus plusOfThis) {
+  for (String s : getSubSchemaNames()) {
+plusOfThis.add(s, getSubSchemaKnownExists(s));
+  }
+}
+
+@Override
+public Set getSubSchemaNames() {
+  HttpStoragePluginConfig config = plugin.getConfig();
+  Map connections = config.connections();
+  Set subSchemaNames = new HashSet<>();
+
+  // Get the possible subschemas.
+  for (Map.Entry entry : connections.entrySet()) {
+subSchemaNames.add(entry.getKey());
+  }
+  return subSchemaNames;
+}
+
+@Override
+public AbstractSchema getSubSchema(String name) {
+  try {
+if (!plugin.getConfig().connections().containsKey(name)) {
+  throw UserException
+.connectionError()
+.message("API '{}' does not exist in HTTP Storage plugin '{}'", 
name, getName())
+.build(logger);
+}
+return getSubSchemaKnownExists(name);
+  } catch (Exception e) {
+throw new DrillRuntimeException(e);
 
 Review comment:
   Was this a copy/paste? If we go to the trouble of throwing a 
`UserException`, we should throw it, not catch it and wrap it in another kind 
of exception (which will, later be wrapped in a `UserException`.) Just let the 
`UserException` be thrown.
   
   Also, nit, reverse polarity: `if (...containsKey...) { return get... } else 
{ throw ... }`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390569328
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePlugin.java
 ##
 @@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import org.apache.calcite.schema.SchemaPlus;
+import org.apache.drill.common.JSONOptions;
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.AbstractStoragePlugin;
+import org.apache.drill.exec.store.SchemaConfig;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.core.type.TypeReference;
+import java.io.IOException;
+
+public class HttpStoragePlugin extends AbstractStoragePlugin {
+
+  private final DrillbitContext context;
+
+  private final HttpStoragePluginConfig engineConfig;
+
+  private final HttpSchemaFactory schemaFactory;
+
+  public HttpStoragePlugin(HttpStoragePluginConfig configuration, 
DrillbitContext context, String name) {
+super(context, name);
+engineConfig = configuration;
+schemaFactory = new HttpSchemaFactory(this, name);
+this.context = context;
+  }
+
+  @Override
+  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) {
+schemaFactory.registerSchemas(schemaConfig, parent);
+  }
+
+  @Override
+  public HttpStoragePluginConfig getConfig() {
+return engineConfig;
+  }
+
+  @Override
+  public DrillbitContext getContext() {
+return context;
+  }
 
 Review comment:
   Repeats method from base class; remove.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390575455
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
 
 Review comment:
   Looks like both paths do `new Request.Builder().url(urlStr)`. Maybe put only 
the `.post()` in the post-specific block. If so, maybe check for `post`, 
instead of `! get`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390550604
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.List;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private final List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig httpStoragePluginConfig;
+  private boolean filterPushedDown = true;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.httpStoragePluginConfig = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+httpStoragePluginConfig = that.getStorageConfig();
+httpScanSpec = that.getScanSpec();
+columns = that.getColumns();
+  }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
+return new HttpSubScan(httpStoragePluginConfig, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List columns) {
+logger.debug("HttpGroupScan clone {}", columns);
+return new HttpGroupScan(this);
+  }
+
+  @Override
+  public String getDigest() {
+return toString();
+  }
+
+  @Override
+  public List getColumns() {
+return columns;
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List children) {
+Preconditions.checkArgument(children.isEmpty());
+return new HttpGroupScan(this);
+  }
+
+  @Override
+  public ScanStats getScanStats() {
 
 Review comment:
   This is obviously a vague, tricky, messy issue. There is no a-priori way to 
know that a particular API will return 1 vs 1 million rows. (Think of the Drill 
REST API. Quick, how man rows will it return if you know nothing about the data 
source or query?)
   
   Fortunately, the problem can be broken down into two groups: "large" or 
"small". Large results should be hash partitioned if joined. Small results can 
be broadcast to all join nodes.
   
   Maybe add an "expectedRowCount" parameter. If missing, assume 10K. Now, 
trying to explain this issue to the poor user will be a challenge. Would be 
best if we could simply mark the result as eligible for broadcast or not, but 
I'm not sure how we might do that.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390559726
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanSpec.java
 ##
 @@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+
+@JsonTypeName("http-scan-spec")
+public class HttpScanSpec {
+
+  protected final String schemaName;
+
+  protected final String database;
+
+  protected final String tableName;
+
+  protected final HttpStoragePluginConfig config;
+
+  @JsonCreator
+  public HttpScanSpec(@JsonProperty("schemaName") String schemaName,
+  @JsonProperty("database") String database,
+  @JsonProperty("tableName") String tableName,
+  @JsonProperty("config") HttpStoragePluginConfig config) {
+this.schemaName = schemaName;
+this.database = database;
+this.tableName = tableName;
+this.config = config;
+  }
+
+  @JsonProperty("database")
+  public String database() {
+return database;
+  }
+
+  @JsonProperty("tableName")
+  public String tableName() {
+return tableName;
+  }
+
+  @JsonIgnore
+  public String getURL() {
+return database;
+  }
+
+  @Override
+  public String toString() {
+return MoreObjects.toStringHelper(this)
 
 Review comment:
   Should be in Drill's "plan format" as it will appear as part of the scan 
spec in the logical plan.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390552671
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+return 1;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List columns) {
+return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+logger.debug("HttpGroupScan getSpecificScan");
+return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List columns) {
+logger.debug("HttpGroupScan clone {}", columns);
+HttpGroupScan newScan = new HttpGroupScan(this);
+newScan.columns = columns;
+return newScan;
+  }
+
+  @Override
+  public String getDigest() {
+return toString();
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List children) {
+Preconditions.checkArgument(children.isEmpty());
+return new HttpGroupScan(this);
+  }
+
+  @Override
+  public ScanStats getScanStats() {
+int colCount = columns.size();
+int estRowCount = 1;
+int estDataSize = estRowCount * 200 * colCount;
 
 Review comment:
   This, unfortunately, does the *opposite* of what you want. When Drill does 
project, data size is 0. When you do project, data size is non-zero. Calcite 
will choose the cheapest, hence it will let Drill do the project.
   
   What you want is, if the columns are provided, reduce the da

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390531796
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication 
type. At the time of implementation, the plugin only supports basic 
authentication, however, the
+ plugin will likely support OAUTH2 in the future. Defaults to `none`. If the 
`authType` is set to `basic`, `username` and `password` must be set in the 
configuration as well. 
+ * `username`: The username for basic authentication. 
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent 
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin is different from other plugins in that it the table component of 
the `FROM` clause is different. In normal Drill queries, the `FROM` clause is 
constructed as follows:
+```sql
+FROM ..
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or 
+
+FROM mongo.stats.sales_data
+```
+
+The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your 
REST call. The structure is:
+```sql
+FROM ..
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1:  Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+   {
+ "results":
+ {
+   "sunrise":"7:27:02 AM",
+   "sunset":"5:05:55 PM",
+   "solar_noon":"12:16:28 PM",
+   "day_length":"9:38:53",
+   "civil_twilight_begin":"6:58:14 AM",
+   "civil_twilight_end":"5:34:43 PM",
+   "nautical_twilight_begin":"6:25:47 AM",
+   "nautical_twilight_end":"6:07:10 PM",
+   "astronomical_twilight_begin":"5:54:14 AM",
+   "astronomical_twilight_end":"6:38:43 PM"
+ },
+  "status":"OK"
+   }
+   }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ {
+   "type": "http",
+   "cacheResults": false,
+   "enabled": true,
+   "timeout" 5,
 
 Review comment:
   Nit: missing colon after `"timeout"`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390533239
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication 
type. At the time of implementation, the plugin only supports basic 
authentication, however, the
+ plugin will likely support OAUTH2 in the future. Defaults to `none`. If the 
`authType` is set to `basic`, `username` and `password` must be set in the 
configuration as well. 
+ * `username`: The username for basic authentication. 
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent 
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin is different from other plugins in that it the table component of 
the `FROM` clause is different. In normal Drill queries, the `FROM` clause is 
constructed as follows:
+```sql
+FROM ..
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or 
+
+FROM mongo.stats.sales_data
+```
+
+The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your 
REST call. The structure is:
+```sql
+FROM ..
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1:  Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+   {
+ "results":
+ {
+   "sunrise":"7:27:02 AM",
+   "sunset":"5:05:55 PM",
+   "solar_noon":"12:16:28 PM",
+   "day_length":"9:38:53",
+   "civil_twilight_begin":"6:58:14 AM",
+   "civil_twilight_end":"5:34:43 PM",
+   "nautical_twilight_begin":"6:25:47 AM",
+   "nautical_twilight_end":"6:07:10 PM",
+   "astronomical_twilight_begin":"5:54:14 AM",
+   "astronomical_twilight_end":"6:38:43 PM"
+ },
+  "status":"OK"
+   }
+   }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ {
+   "type": "http",
+   "cacheResults": false,
+   "enabled": true,
+   "timeout" 5,
+   "connections": {
+ "sunrise": {
+   "url": "https://api.sunrise-sunset.org/";,
+   "method": "get",
+   "headers": null,
+   "authType": "none",
+   "userName": null,
+   "password": null,
+   "postBody": null
+ }
+   },
+}
+```
+Then, to execute a query:
+```sql
+SELECT api_results.results.sunrise AS sunrise, 
+api_results.results.sunset AS sunset
+FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` AS 
api_results;
 
 Review comment:
   This is screaming out for filter push-down. Let's do this: get the basic 
version in so we have a firm foundation. Then, add filter push-down, using this 
plugin as the primary use case.


This is an automated mess

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390574529
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
 ##
 @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. 
The core method is the getInputStream()
+ * method which accepts a url and opens an InputStream with that URL's 
contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = 
LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+
+  private final HttpStoragePluginConfig config;
+
+  private final FragmentContext context;
+
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, 
String connectionName) {
+this.config = config;
+this.context = context;
+this.apiConfig = config.connections().get(connectionName);
+client = setupHttpClient();
+  }
+
+
+
+  public InputStream getInputStream(String urlStr) {
+Request.Builder requestBuilder;
+
+// The configuration does not allow for any other request types other than 
POST and GET.
+if (apiConfig.method().equals("get")) {
+  // Handle GET requests
+  requestBuilder = new Request.Builder().url(urlStr);
+} else {
+  // Handle POST requests
+  FormBody.Builder formBodyBuilder = buildPostBody();
+  requestBuilder = new Request.Builder()
+.url(urlStr)
+.post(formBodyBuilder.build());
+}
+
+// Add headers to request
+if (apiConfig.headers() != null) {
+  for (Map.Entry entry : apiConfig.headers().entrySet()) {
+String key = entry.getKey();
+String value = entry.getValue();
+requestBuilder.addHeader(key, value);
+  }
+}
+
+// Build the request object
+Request request = requestBuilder.build();
+logger.debug("Headers: {}", request.headers());
 
 Review comment:
   In a busy distributed system, when we look a the logs, there will be no easy 
way to associate the headers here with anything. Do we elsewhere emit the URL? 
Then, we can at least hunt down the URL using the fragment ID, and use the same 
fragment ID to hunt down the headers. Or, add the URL here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390535951
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
 
 Review comment:
   HTTP could fetch movies, images, CSV files, Parquet files... pretty much 
anything. REST, however, is a request, with data encoded in the URL query 
string (usually, or in a POST), with response in the form of JSON. People wage 
huge battles about the "one true, pure REST", but most folks are pragmatic. 
This code fits the pragmatic description.
   
   If we did eventually do a file-over-http plugin what would we call that if 
not `http`? `filehttp`? If so, then wouldn't this be `resthttp`? Or, since REST 
implies HTTP, maybe just `rest`?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390564687
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
 ##
 @@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.map.CaseInsensitiveMap;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+
+@JsonTypeName(HttpStoragePluginConfig.NAME)
+public class HttpStoragePluginConfig extends StoragePluginConfigBase {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpStoragePluginConfig.class);
+
+  public static final String NAME = "http";
+
+  public final Map connections;
+
+  public final boolean cacheResults;
 
 Review comment:
   What does this do? I saw no code to cache results. Not sure it even makes 
sense in a distributed system unless there is some way to direct the next scan 
for the same URL to the node that holds the cached results. (Or, add a way to 
write the cached results to DFS, and expire them after some period.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390532236
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication 
type. At the time of implementation, the plugin only supports basic 
authentication, however, the
+ plugin will likely support OAUTH2 in the future. Defaults to `none`. If the 
`authType` is set to `basic`, `username` and `password` must be set in the 
configuration as well. 
+ * `username`: The username for basic authentication. 
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent 
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin is different from other plugins in that it the table component of 
the `FROM` clause is different. In normal Drill queries, the `FROM` clause is 
constructed as follows:
+```sql
+FROM ..
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or 
+
+FROM mongo.stats.sales_data
+```
+
+The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your 
REST call. The structure is:
+```sql
+FROM ..
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1:  Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+   {
+ "results":
+ {
+   "sunrise":"7:27:02 AM",
+   "sunset":"5:05:55 PM",
+   "solar_noon":"12:16:28 PM",
+   "day_length":"9:38:53",
+   "civil_twilight_begin":"6:58:14 AM",
+   "civil_twilight_end":"5:34:43 PM",
+   "nautical_twilight_begin":"6:25:47 AM",
+   "nautical_twilight_end":"6:07:10 PM",
+   "astronomical_twilight_begin":"5:54:14 AM",
+   "astronomical_twilight_end":"6:38:43 PM"
+ },
+  "status":"OK"
+   }
+   }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ {
+   "type": "http",
+   "cacheResults": false,
+   "enabled": true,
+   "timeout" 5,
+   "connections": {
+ "sunrise": {
+   "url": "https://api.sunrise-sunset.org/";,
+   "method": "get",
+   "headers": null,
+   "authType": "none",
+   "userName": null,
+   "password": null,
+   "postBody": null
+ }
+   },
 
 Review comment:
   Suggestion: use the old C-book trick: copy-paste this into a test case, 
ensure it works, then copy-paste the working config back here.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390546981
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
 ##
 @@ -0,0 +1,167 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+HttpStoragePluginConfig config,
+HttpScanSpec scanSpec,
+List columns
+  ) {
+super("no-user");
+this.config = config;
+this.httpScanSpec = scanSpec;
+this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : 
columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+super(that);
+config = that.config();
+httpScanSpec = that.httpScanSpec();
+columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+@JsonProperty("config") HttpStoragePluginConfig config,
+@JsonProperty("columns") List columns,
+@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec
+  ) {
+super("no-user");
+this.config = config;
+this.columns = columns;
+this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List endpoints) {
+logger.debug("HttpGroupScan applyAssignments");
 
 Review comment:
   No need to log this, it will be called. Maybe this is a proxy for a comment 
that says "not sure what to do", or "assignments not used."
   
   A necessary weakness of the present design is that it must run in only a 
single Drillbit. Not good for downloading large result sets. But, distribution 
will require filter push-down and an idea about how to shard a request. That 
can come later.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390531073
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
 
 Review comment:
   Nit: Json typo: colon after the second key, no comma after the colon in the 
first key.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390544725
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Objects;
+
+public class HttpAPIConfig {
+
+  private final String url;
+
+  private final String method;
+
+  private final Map headers;
+
+  private final String authType;
+
+  private final String userName;
+
+  private final String password;
+
+  private final String postBody;
+
+  public HttpAPIConfig(@JsonProperty("url") String url,
+   @JsonProperty("method") String method,
+   @JsonProperty("headers") Map headers,
+   @JsonProperty("authType") String authType,
+   @JsonProperty("userName") String userName,
+   @JsonProperty("password") String password,
+   @JsonProperty("postBody") String postBody) {
+
+// Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
+if (method.toLowerCase().equals("get") || 
method.toLowerCase().equals("post")) {
+  this.method = method.toLowerCase();
+} else {
+  this.method = "get";
+}
+this.headers = headers;
+
+// Put a trailing slash on the URL if it is missing
+if (url.charAt(url.length() - 1) != '/') {
+  this.url = url + "/";
+} else {
+  this.url = url;
+}
+
+// Get the authentication method. Future functionality will include OAUTH2 
authentication but for now
+// Accept either basic or none.  The default is none.
+this.authType = (authType == null || authType.isEmpty()) ? "none" : 
authType;
 
 Review comment:
   Consider `Strings.isNullOrEmpty()`. See 
[this](https://stackoverflow.com/questions/2272169/string-isnullorempty-in-java).
 Also, should any of these be trimmed?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390537966
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Objects;
+
+public class HttpAPIConfig {
+
+  private final String url;
+
+  private final String method;
+
+  private final Map headers;
+
+  private final String authType;
+
+  private final String userName;
+
+  private final String password;
+
+  private final String postBody;
+
+  public HttpAPIConfig(@JsonProperty("url") String url,
+   @JsonProperty("method") String method,
+   @JsonProperty("headers") Map headers,
+   @JsonProperty("authType") String authType,
+   @JsonProperty("userName") String userName,
+   @JsonProperty("password") String password,
+   @JsonProperty("postBody") String postBody) {
+
+// Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
+if (method.toLowerCase().equals("get") || 
method.toLowerCase().equals("post")) {
 
 Review comment:
   This will crash if `method` is not provided. Since `GET` is the default, 
this should accept `null`.
   
   Also, nit, after doing the `null` check, a `switch` statement would be 
clearer for the other case.
   
   Finally, nit, this method lower-cases the method three times. Could be done 
just once.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390565386
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
 ##
 @@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.map.CaseInsensitiveMap;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+
+@JsonTypeName(HttpStoragePluginConfig.NAME)
+public class HttpStoragePluginConfig extends StoragePluginConfigBase {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpStoragePluginConfig.class);
+
+  public static final String NAME = "http";
+
+  public final Map connections;
+
+  public final boolean cacheResults;
+
+  public int timeout;
+
+  @JsonCreator
+  public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean 
cacheResults,
+ @JsonProperty("connections") Map connections,
+ @JsonProperty("timeout") int timeout) {
+logger.debug("Initialize HttpStoragePluginConfig {}", connections);
+this.cacheResults = cacheResults;
+
+if (connections != null) {
 
 Review comment:
   Nit: reverse polarity: `if (connections == null)` ...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390564074
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
 ##
 @@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.map.CaseInsensitiveMap;
+import org.apache.drill.common.logical.StoragePluginConfigBase;
+import org.apache.drill.shaded.guava.com.google.common.base.Objects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Optional;
+
+
+@JsonTypeName(HttpStoragePluginConfig.NAME)
+public class HttpStoragePluginConfig extends StoragePluginConfigBase {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpStoragePluginConfig.class);
+
+  public static final String NAME = "http";
+
+  public final Map connections;
 
 Review comment:
   The issue is that common config must be duplicated. If config is duplicated, 
we are really creating two different configs; we might as well store them at 
the top level.
   
   I might think differently if things like user, auth and so on were common 
for a site, then each service in that site had its own URL.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390541287
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Objects;
+
+public class HttpAPIConfig {
+
+  private final String url;
+
+  private final String method;
+
+  private final Map headers;
+
+  private final String authType;
+
+  private final String userName;
+
+  private final String password;
+
+  private final String postBody;
+
+  public HttpAPIConfig(@JsonProperty("url") String url,
+   @JsonProperty("method") String method,
+   @JsonProperty("headers") Map headers,
+   @JsonProperty("authType") String authType,
+   @JsonProperty("userName") String userName,
+   @JsonProperty("password") String password,
+   @JsonProperty("postBody") String postBody) {
+
+// Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
+if (method.toLowerCase().equals("get") || 
method.toLowerCase().equals("post")) {
+  this.method = method.toLowerCase();
+} else {
+  this.method = "get";
+}
+this.headers = headers;
+
+// Put a trailing slash on the URL if it is missing
+if (url.charAt(url.length() - 1) != '/') {
+  this.url = url + "/";
+} else {
+  this.url = url;
+}
 
 Review comment:
   It is perhaps thinking how this works. The JSON comes from either 1) the 
user, or 2) persistent storage. If from the user, we will doctor up the values 
in this constructor, and the result will be converted back to JSON in the 
persistent store. If from the persistent store (older version wrote the JSON), 
then the form used in-memory won't match the persistent version until the next 
update.
   
   I wonder, would it be better to do this fix-up when creating the reader 
rather than in the config ctor?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390537567
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Objects;
+
+public class HttpAPIConfig {
+
+  private final String url;
+
+  private final String method;
+
+  private final Map headers;
+
+  private final String authType;
+
+  private final String userName;
+
+  private final String password;
+
+  private final String postBody;
+
+  public HttpAPIConfig(@JsonProperty("url") String url,
+   @JsonProperty("method") String method,
+   @JsonProperty("headers") Map headers,
+   @JsonProperty("authType") String authType,
+   @JsonProperty("userName") String userName,
+   @JsonProperty("password") String password,
+   @JsonProperty("postBody") String postBody) {
+
+// Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
+if (method.toLowerCase().equals("get") || 
method.toLowerCase().equals("post")) {
+  this.method = method.toLowerCase();
+} else {
+  this.method = "get";
 
 Review comment:
   Please log a warning. Else, if I really want to use `HEAD`, I'll have no way 
of understanding why my server receives `GET`.
   
   Also, the convention is for HTTP methods to be upper case: `GET` and `POST`.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390557812
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanBatchCreator.java
 ##
 @@ -0,0 +1,54 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http;
+
+import java.util.List;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.ops.ExecutorFragmentContext;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HttpScanBatchCreator implements BatchCreator {
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpScanBatchCreator.class);
+
+  @Override
+  public ScanBatch getBatch(ExecutorFragmentContext context, HttpSubScan 
subScan, List children) throws ExecutionSetupException {
+logger.debug("getBatch called");
+Preconditions.checkArgument(children == null || children.isEmpty());
+
+HttpStoragePluginConfig config = subScan.config();
+List readers = Lists.newArrayList();
 
 Review comment:
   nit: `new ArrayList<>()`


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390533397
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication 
type. At the time of implementation, the plugin only supports basic 
authentication, however, the
+ plugin will likely support OAUTH2 in the future. Defaults to `none`. If the 
`authType` is set to `basic`, `username` and `password` must be set in the 
configuration as well. 
+ * `username`: The username for basic authentication. 
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent 
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin is different from other plugins in that it the table component of 
the `FROM` clause is different. In normal Drill queries, the `FROM` clause is 
constructed as follows:
+```sql
+FROM ..
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or 
+
+FROM mongo.stats.sales_data
+```
+
+The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your 
REST call. The structure is:
+```sql
+FROM ..
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1:  Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+   {
+ "results":
+ {
+   "sunrise":"7:27:02 AM",
+   "sunset":"5:05:55 PM",
+   "solar_noon":"12:16:28 PM",
+   "day_length":"9:38:53",
+   "civil_twilight_begin":"6:58:14 AM",
+   "civil_twilight_end":"5:34:43 PM",
+   "nautical_twilight_begin":"6:25:47 AM",
+   "nautical_twilight_end":"6:07:10 PM",
+   "astronomical_twilight_begin":"5:54:14 AM",
+   "astronomical_twilight_end":"6:38:43 PM"
+ },
+  "status":"OK"
+   }
+   }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ {
+   "type": "http",
+   "cacheResults": false,
+   "enabled": true,
+   "timeout" 5,
+   "connections": {
+ "sunrise": {
+   "url": "https://api.sunrise-sunset.org/";,
+   "method": "get",
+   "headers": null,
+   "authType": "none",
+   "userName": null,
+   "password": null,
+   "postBody": null
+ }
+   },
+}
+```
+Then, to execute a query:
+```sql
+SELECT api_results.results.sunrise AS sunrise, 
+api_results.results.sunset AS sunset
+FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` AS 
api_results;
+```
+Which yields the following results:
+```
++++
+|  sunrise   |   sunset   |
++++
+| 7:17:46 AM | 5:01:33 PM |
++++
+1 row selected (0.632 seconds)
+```
+
+### Example 2: JIRA
+JIRA Cloud has a REST API which is [documented 

[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390545178
 
 

 ##
 File path: 
contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java
 ##
 @@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.http;
+
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Arrays;
+import java.util.Map;
+import java.util.Objects;
+
+@JsonTypeName("http-api-config")
+public class HttpAPIConfig {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(HttpAPIConfig.class);
+
+  private final String url;
+
+  private final String method;
+
+  private final Map headers;
+
+  private final String authType;
+
+  private final String username;
+
+  private final String password;
+
+  private final String postBody;
+
+  public HttpAPIConfig(@JsonProperty("url") String url,
+   @JsonProperty("method") String method,
+   @JsonProperty("headers") Map headers,
+   @JsonProperty("authType") String authType,
+   @JsonProperty("username") String username,
+   @JsonProperty("password") String password,
+   @JsonProperty("postBody") String postBody) {
+this.url = url;
+
+// Get the request method.  Only accept GET and POST requests.  Anything 
else will default to GET.
+if (method.toLowerCase().equals("get") || 
method.toLowerCase().equals("post")) {
+  this.method = method.toLowerCase();
+} else {
+  this.method = "get";
+}
+this.headers = headers;
+
+// Get the authentication method. Future functionality will include OAUTH2 
authentication but for now
+// Accept either basic or none.  The default is none.
+this.authType = (authType == null || authType.isEmpty()) ? "none" : 
authType;
+this.username = username;
+this.password = password;
+this.postBody = postBody;
+
+// Validate the authentication type
+
+  }
+
+  @JsonProperty("url")
+  public String url() { return url; }
+
+  @JsonProperty("method")
+  public String method() { return method; }
+
+  @JsonProperty("headers")
+  public Map headers() { return headers; }
+
+  @JsonProperty("authType")
+  public String authType() { return authType; }
+
+  @JsonProperty("username")
+  public String username() { return username; }
+
+  @JsonProperty("password")
+  public String password() { return password; }
+
+  @JsonProperty("postBody")
+  public String postBody() { return postBody; }
+
+  @Override
+  public int hashCode() {
+return Arrays.hashCode(
+  new Object[]{url, method, headers, authType, username, password, 
postBody});
+  }
+
+  @Override
+  public String toString() {
+return MoreObjects.toStringHelper(this)
+  .add("url", url)
+  .add("method", method)
+  .add("headers", headers)
+  .add("authType", authType)
+  .add("username", username)
+  .add("password", password)
+  .add("postBody", postBody)
+  .toString();
 
 Review comment:
   Class moved to another PR, will hopefully arrive soon.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390582930
 
 

 ##
 File path: 
contrib/storage-http/src/main/resources/bootstrap-storage-plugins.json
 ##
 @@ -0,0 +1,9 @@
+{
+  "storage":{
+"http" : {
+  "type":"http",
+  "connections": {},
+  "enabled": false
+}
+  }
+}
 
 Review comment:
   Nit: add newline.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage 
Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390531631
 
 

 ##
 File path: contrib/storage-http/README.md
 ##
 @@ -0,0 +1,218 @@
+
+# Generic REST API Storage Plugin
+This plugin is intended to enable you to query APIs over HTTP/REST. At this 
point, the API reader will only accept JSON as input however in the future, it 
may be possible to
+ add additional format readers to allow for APIs which return XML, CSV or 
other formats.  
+ 
+Note:  This plugin should **NOT** be used for interacting with tools which 
have REST APIs such as Splunk or Solr. It will not be performant for those use 
cases.  
+
+## Configuration
+To configure the plugin, create a new storage plugin, and add the following 
configuration options which apply to ALL connections defined in this plugin:
+
+```json
+{
+  "type": "http",
+  "connection": "https:///",
+  "cacheResults": true,
+  "timeout": 0,
+  "enabled": true
+}
+```
+The options are:
+* `type`:  This should be `http`
+* `cacheResults`:  Enable caching of the HTTP responses
+* `timeout`:  Sets the response timeout in seconds. Defaults to `0` which is 
no timeout. 
+
+### Configuring the API Connections
+The HTTP Storage plugin allows you to configure multiple APIS which you can 
query directly from this plugin. To do so, first add a `connections` parameter 
to the configuration
+. Next give the connection a name, which will be used in queries.  For 
instance `stockAPI` or `jira`.
+
+The `connection` can accept the following options:
+* `url`: The base URL which Drill will query. You should include the ending 
slash if there are additional arguments which you are passing.
+* `method`: The request method. Must be `get` or `post`. Other methods are not 
allowed and will default to `GET`.  
+* `headers`: Often APIs will require custom headers as part of the 
authentication. This field allows you to define key/value pairs which are 
submitted with the http request
+.  The format is:
+```json
+headers: {
+   "key1":, "Value1",
+   "key2", "Value2"
+}
+```
+* `authType`: If your API requires authentication, specify the authentication 
type. At the time of implementation, the plugin only supports basic 
authentication, however, the
+ plugin will likely support OAUTH2 in the future. Defaults to `none`. If the 
`authType` is set to `basic`, `username` and `password` must be set in the 
configuration as well. 
+ * `username`: The username for basic authentication. 
+ * `password`: The password for basic authentication.
+ * `postBody`: Contains data, in the form of key value pairs, which are sent 
during a `POST` request. Post body should be in the form:
+ ```
+key1=value1
+key2=value2
+```
+
+## Usage:
+This plugin is different from other plugins in that it the table component of 
the `FROM` clause is different. In normal Drill queries, the `FROM` clause is 
constructed as follows:
+```sql
+FROM ..
+```
+For example, you might have:
+```sql
+FROM dfs.test.`somefile.csv`
+
+-- or 
+
+FROM mongo.stats.sales_data
+```
+
+The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your 
REST call. The structure is:
+```sql
+FROM ..
+--Actual example:
+ FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today`
+```
+
+
+## Examples:
+### Example 1:  Reference Data, A Sunrise/Sunset API
+The API sunrise-sunset.org returns data in the following format:
+
+ ```json
+   {
+ "results":
+ {
+   "sunrise":"7:27:02 AM",
+   "sunset":"5:05:55 PM",
+   "solar_noon":"12:16:28 PM",
+   "day_length":"9:38:53",
+   "civil_twilight_begin":"6:58:14 AM",
+   "civil_twilight_end":"5:34:43 PM",
+   "nautical_twilight_begin":"6:25:47 AM",
+   "nautical_twilight_end":"6:07:10 PM",
+   "astronomical_twilight_begin":"5:54:14 AM",
+   "astronomical_twilight_end":"6:38:43 PM"
+ },
+  "status":"OK"
+   }
+   }
+```
+To query this API, set the configuration as follows:
+
+```json
+{
+ {
 
 Review comment:
   Nit: `{ {` is not valid JSON. Is there a key missing? Or, remove the outer 
brackets?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] dobesv commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page

2020-03-10 Thread GitBox
dobesv commented on issue #1994: DRILL-7203: Accept impersonation userName as 
form field & fix back button for query page
URL: https://github.com/apache/drill/pull/1994#issuecomment-597268755
 
 
   @arina-ielchiieva Thanks.  Sorry I didn't have the resources to fix the 
Safari issue - maybe someone with macOS will have a chance to fix it in a 
future PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (DRILL-7635) Mongo tests fail for profiles with non-file default file system

2020-03-10 Thread Vova Vysotskyi (Jira)
Vova Vysotskyi created DRILL-7635:
-

 Summary: Mongo tests fail for profiles with non-file default file 
system
 Key: DRILL-7635
 URL: https://issues.apache.org/jira/browse/DRILL-7635
 Project: Apache Drill
  Issue Type: Bug
Affects Versions: 1.18.0
Reporter: Vova Vysotskyi
Assignee: Vova Vysotskyi
 Fix For: 1.18.0


Some profiles in Drill have specific configurations for {{fs.defaultFS}} Hadoop 
property. This option was set to {{file:///}} in {{core-site.xml}} in tests 
resources in {{java-exec}} modulte. Since in DRILL-7547 was added another 
{{core-site.xml}} file into the mongo module, it overrides the existing one, so 
mongo tests stop using {{file:///}} for such profiles.
These failures became seen after fix for DRILL-7628, where they started running 
again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [drill] asfgit closed pull request #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page

2020-03-10 Thread GitBox
asfgit closed pull request #1994: DRILL-7203: Accept impersonation userName as 
form field & fix back button for query page
URL: https://github.com/apache/drill/pull/1994
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2011: DRILL-7623: Link error is displayed at the log content page on Web UI

2020-03-10 Thread GitBox
asfgit closed pull request #2011: DRILL-7623: Link error is displayed at the 
log content page on Web UI
URL: https://github.com/apache/drill/pull/2011
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
asfgit closed pull request #2012: DRILL-7625: Add options for SslContextFactory
URL: https://github.com/apache/drill/pull/2012
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2014: DRILL-7627: Update MySql version for JdbcStoragePlugin tests and cache ~/.embedmysql

2020-03-10 Thread GitBox
asfgit closed pull request #2014: DRILL-7627: Update MySql version for 
JdbcStoragePlugin tests and cache ~/.embedmysql
URL: https://github.com/apache/drill/pull/2014
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2017: DRILL-7632: Improve user exception formatting

2020-03-10 Thread GitBox
asfgit closed pull request #2017: DRILL-7632: Improve user exception formatting
URL: https://github.com/apache/drill/pull/2017
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547

2020-03-10 Thread GitBox
asfgit closed pull request #2015: DRILL-7628: Fix Mongo tests broken after the 
fix for DRILL-7547
URL: https://github.com/apache/drill/pull/2015
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2009: DRILL-7622: Compilation error when using HLL / TDigest with group by

2020-03-10 Thread GitBox
asfgit closed pull request #2009: DRILL-7622: Compilation error when using HLL 
/ TDigest with group by
URL: https://github.com/apache/drill/pull/2009
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet

2020-03-10 Thread GitBox
asfgit closed pull request #2019: DRILL-7630: Add additional types into 
SchemaParser for Parquet
URL: https://github.com/apache/drill/pull/2019
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2000: DRILL-7607: Support dynamic credit based flow control

2020-03-10 Thread GitBox
asfgit closed pull request #2000: DRILL-7607: Support dynamic credit based flow 
control
URL: https://github.com/apache/drill/pull/2000
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

2020-03-10 Thread GitBox
asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong 
result for the decimal value
URL: https://github.com/apache/drill/pull/2006
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL 
query returns the wrong result for the decimal value
URL: https://github.com/apache/drill/pull/2006#discussion_r386776641
 
 

 ##
 File path: common/src/main/java/org/apache/drill/common/types/Types.java
 ##
 @@ -795,15 +807,25 @@ public static boolean isSortable(MinorType type) {
*/
   public static MajorType.Builder calculateTypePrecisionAndScale(MajorType 
leftType, MajorType rightType, MajorType.Builder typeBuilder) {
 if (leftType.getMinorType().equals(rightType.getMinorType())) {
-  final boolean isScalarString = Types.isScalarStringType(leftType) && 
Types.isScalarStringType(rightType);
-  final boolean isDecimal = isDecimalType(leftType);
+  boolean isScalarString = Types.isScalarStringType(leftType) && 
Types.isScalarStringType(rightType);
+  boolean isDecimal = isDecimalType(leftType);
 
-  if ((isScalarString || isDecimal) && leftType.hasPrecision() && 
rightType.hasPrecision()) {
+  if (isScalarString && leftType.hasPrecision() && 
rightType.hasPrecision()) {
 typeBuilder.setPrecision(Math.max(leftType.getPrecision(), 
rightType.getPrecision()));
   }
 
-  if (isDecimal && leftType.hasScale() && rightType.hasScale()) {
-typeBuilder.setScale(Math.max(leftType.getScale(), 
rightType.getScale()));
+  if (isDecimal) {
+int scale = Math.max(leftType.getScale(), rightType.getScale());
+// resulting precision should take into account resulting scale value 
and be calculated as
+// sum of two components:
+// - max integer digits number (precision - scale) for left and right;
+// - resulting scale.
+// So for the case of cast( as decimal(4,0)) and cast(1.23 as 
decimal(3,2))
+// resulting scale would be Max(0, 2) = 2 and resulting precision 
would be Max(4 - 0, 3 - 2) + 2 = 6.
+// In this case, both values would fit into decimal(6, 2): .00, 
1.23
+int precision = Math.max(leftType.getPrecision() - 
leftType.getScale(), rightType.getPrecision() - rightType.getScale()) + scale;
+typeBuilder.setPrecision(precision);
+typeBuilder.setScale(scale);
 
 Review comment:
   Nit: please split the long line.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r379859349
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java
 ##
 @@ -15,78 +15,74 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+
 package org.apache.drill.exec.store.ltsv;
 
-import org.apache.drill.common.expression.SchemaPath;
 import org.apache.drill.common.logical.StoragePluginConfig;
-import org.apache.drill.exec.ops.FragmentContext;
-import org.apache.drill.exec.planner.common.DrillStatsTable.TableStatistics;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileReaderFactory;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileScanBuilder;
+import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader;
 import org.apache.drill.exec.proto.UserBitShared;
 import org.apache.drill.exec.server.DrillbitContext;
-import org.apache.drill.exec.store.RecordReader;
-import org.apache.drill.exec.store.RecordWriter;
-import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.server.options.OptionManager;
 import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
-import org.apache.drill.exec.store.dfs.easy.EasyWriter;
-import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.drill.exec.store.dfs.easy.EasySubScan;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
 
-import java.util.List;
 
 public class LTSVFormatPlugin extends EasyFormatPlugin 
{
 
 Review comment:
   Would be handy to reference [the spec](http://ltsv.org/) for us newbies not 
familiar with the format.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r390520873
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java
 ##
 @@ -0,0 +1,35 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator;
+import org.apache.drill.exec.store.easy.EasyEVFBatchReader;
+
+public class LTSVBatchReader extends EasyEVFBatchReader {
+
+  public LTSVBatchReader(LTSVFormatPluginConfig formatConfig) {
 
 Review comment:
   Since this code is not shared, maybe just omit the parameter. Then, nit, 
omit the empty call to `super()`. Finally, the ctor will end up doing nothing. 
so omit it entirely.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r379859411
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java
 ##
 @@ -15,78 +15,74 @@
  * See the License for the specific language governing permissions and
  * limitations under the License.
  */
+
 package org.apache.drill.exec.store.ltsv;
 
-import org.apache.drill.common.expression.SchemaPath;
 import org.apache.drill.common.logical.StoragePluginConfig;
-import org.apache.drill.exec.ops.FragmentContext;
-import org.apache.drill.exec.planner.common.DrillStatsTable.TableStatistics;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.Types;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileReaderFactory;
+import 
org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileScanBuilder;
+import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader;
 import org.apache.drill.exec.proto.UserBitShared;
 import org.apache.drill.exec.server.DrillbitContext;
-import org.apache.drill.exec.store.RecordReader;
-import org.apache.drill.exec.store.RecordWriter;
-import org.apache.drill.exec.store.dfs.DrillFileSystem;
+import org.apache.drill.exec.server.options.OptionManager;
 import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin;
-import org.apache.drill.exec.store.dfs.easy.EasyWriter;
-import org.apache.drill.exec.store.dfs.easy.FileWork;
+import org.apache.drill.exec.store.dfs.easy.EasySubScan;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
 import org.apache.hadoop.conf.Configuration;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
 
-import java.util.List;
 
 public class LTSVFormatPlugin extends EasyFormatPlugin 
{
 
-  private static final boolean IS_COMPRESSIBLE = true;
-
   private static final String DEFAULT_NAME = "ltsv";
 
-  public LTSVFormatPlugin(String name, DrillbitContext context, Configuration 
fsConf, StoragePluginConfig storageConfig) {
-this(name, context, fsConf, storageConfig, new LTSVFormatPluginConfig());
-  }
+  public static class LTSVReaderFactory extends FileReaderFactory {
+private final LTSVFormatPluginConfig ltsvFormatPluginConfig;
 
-  public LTSVFormatPlugin(String name, DrillbitContext context, Configuration 
fsConf, StoragePluginConfig config, LTSVFormatPluginConfig formatPluginConfig) {
-super(name, context, fsConf, config, formatPluginConfig, true, false, 
false, IS_COMPRESSIBLE, formatPluginConfig.getExtensions(), DEFAULT_NAME);
-  }
+public LTSVReaderFactory(LTSVFormatPluginConfig config) {
+  ltsvFormatPluginConfig = config;
+}
 
-  @Override
-  public RecordReader getRecordReader(FragmentContext context, DrillFileSystem 
dfs, FileWork fileWork, List columns, String userName) {
-return new LTSVRecordReader(context, fileWork.getPath(), dfs, columns);
+@Override
+public ManagedReader newReader() {
+  return new LTSVBatchReader(ltsvFormatPluginConfig);
 
 Review comment:
   Oh, never mind. I wrote the darn thing and forgot that file info is passed 
in to open, not into the constructor...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r390522026
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.util.Iterator;
+
+public class LTSVRecordIterator implements Iterator {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVRecordIterator.class);
+
+  private RowSetLoader rowWriter;
+
+  private BufferedReader reader;
+
+  private String line;
+
+  private int recordCount;
+
+  public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) {
+this.rowWriter = rowWriter;
+this.reader = reader;
+
+// Get the first line
+try {
+  line = reader.readLine();
+} catch (IOException e) {
+  throw UserException
+.dataReadError()
+.message("Error reading LTSV Data: {}", e.getMessage())
+.build(logger);
+}
+  }
+
+  @Override
+  public boolean hasNext() {
+return line != null;
+  }
+
+  @Override
+  public Boolean next() {
+// Skip empty lines
+if (line.trim().length() == 0) {
+  try {
+// Advance the line to the next line
+line = reader.readLine();
+  } catch (IOException e) {
+throw UserException
+  .dataReadError()
+  .message("Error reading LTSV Data: {}", e.getMessage())
+  .build(logger);
+  }
+  return Boolean.TRUE;
+} else if (line == null) {
+  return Boolean.FALSE;
+}
+
+// Process the row
+processRow();
+
+// Increment record counter
+recordCount++;
+
+// Get the next line
+try {
+  line = reader.readLine();
+  if(line == null) {
 
 Review comment:
   Nit: `if (` (space before paren.)


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r379859570
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.util.Iterator;
+
+public class LTSVRecordIterator implements Iterator {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVRecordIterator.class);
+
+  private RowSetLoader rowWriter;
+
+  private BufferedReader reader;
+
+  private String line;
+
+  private int recordCount;
+
+  public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) {
+this.rowWriter = rowWriter;
+this.reader = reader;
+
+// Get the first line
+try {
+  line = reader.readLine();
+} catch (IOException e) {
+  throw UserException
+.dataReadError()
+.message("Error reading LTSV Data: {}", e.getMessage())
+.build(logger);
+}
+  }
+
+  @Override
+  public boolean hasNext() {
+return line != null;
+  }
+
+  @Override
+  public Boolean next() {
+// Skip empty lines
+if (line.trim().length() == 0) {
+  try {
+// Advance the line to the next line
+line = reader.readLine();
+  } catch (IOException e) {
+throw UserException
+  .dataReadError()
+  .message("Error reading LTSV Data: {}", e.getMessage())
+  .build(logger);
+  }
+  return Boolean.TRUE;
+} else if (line == null) {
+  return Boolean.FALSE;
+}
+
+// Process the row
+processRow();
+
+// Increment record counter
+recordCount++;
+
+// Get the next line
+try {
+  line = reader.readLine();
+  if(line == null) {
+return Boolean.FALSE;
+  }
+} catch (IOException e) {
+  throw UserException
+.dataReadError()
+.message("Error reading LTSV Data: {}", e.getMessage())
+.build(logger);
+}
+return Boolean.TRUE;
+  }
+
+  /**
+   * Function processes one row of data, splitting it up first by tabs then 
splitting the key/value pairs
+   * finally recording it in the current Drill row.
+   */
+  private void processRow() {
+// Start the row
+rowWriter.start();
 
 Review comment:
   By the way, I recently was reminded that `start()` returns a `boolean`. We 
usually write
   
   ```
 while (!rs.isFull()) {
   rowWriter.start();
   ...
 }
   ```
   
   I now remember that the code was designed to allow:
   
   ```
  while (rowWriter.start()) {
 // Write stuff
 rowWriter.end();
 }
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert 
LTSV Format Plugin to EVF
URL: https://github.com/apache/drill/pull/1962#discussion_r374459658
 
 

 ##
 File path: 
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java
 ##
 @@ -0,0 +1,127 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.util.Iterator;
+
+public class LTSVRecordIterator implements Iterator {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVRecordIterator.class);
+
+  private RowSetLoader rowWriter;
+
+  private BufferedReader reader;
+
+  private String line;
+
+  private int recordCount;
+
+  public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) {
+this.rowWriter = rowWriter;
+this.reader = reader;
+
+// Get the first line
+try {
+  line = reader.readLine();
+} catch (IOException e) {
+  throw UserException
+.dataReadError()
+.message("Error reading LTSV Data: {}", e.getMessage())
+.build(logger);
+}
+  }
+
+  @Override
+  public boolean hasNext() {
+return line != null;
+  }
+
+  @Override
+  public Boolean next() {
+// Skip empty lines
+if (line.trim().length() == 0) {
+  try {
+// Advance the line to the next line
+line = reader.readLine();
+  } catch (IOException e) {
+throw UserException
+  .dataReadError()
+  .message("Error reading LTSV Data: {}", e.getMessage())
+  .build(logger);
+  }
+  return Boolean.TRUE;
+} else if (line == null) {
+  return Boolean.FALSE;
 
 Review comment:
   Boolean.FALSE --> false. No need to use the boxed type. Here, above and 
below.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on issue #2000: DRILL-7607: Support dynamic credit based flow control

2020-03-10 Thread GitBox
paul-rogers commented on issue #2000: DRILL-7607: Support dynamic credit based 
flow control
URL: https://github.com/apache/drill/pull/2000#issuecomment-597239145
 
 
   Looks good, +1.
   
   Seems to me that two things need to happen after this PR. First, is to find 
a way to run a stress/performance test to ensure that the potential extra 
memory usage is not an issue. Second is to, at some point, continue with our 
resource management efforts and to 1) allocate memory to the receivers they can 
use as their buffer pool, and 2) standardize batch size so the receiver knows 
roughly the sizes to expect. This work is beyond the scope of this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on issue #1993: DRILL-7601: Shift column conversion to reader from scan framework

2020-03-10 Thread GitBox
paul-rogers commented on issue #1993: DRILL-7601: Shift column conversion to 
reader from scan framework
URL: https://github.com/apache/drill/pull/1993#issuecomment-597236752
 
 
   Rebased on latest master.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift 
column conversion to reader from scan framework
URL: https://github.com/apache/drill/pull/1993#discussion_r390500706
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/avro/AvroBatchReader.java
 ##
 @@ -69,11 +69,15 @@ public boolean open(FileScanFramework.FileSchemaNegotiator 
negotiator) {
   negotiator.userName(), 
negotiator.context().getFragmentContext().getQueryUserName());
 
 logger.debug("Avro file schema: {}", reader.getSchema());
-TupleMetadata schema = AvroSchemaUtil.convert(reader.getSchema());
-logger.debug("Avro file converted schema: {}", schema);
-negotiator.setTableSchema(schema, true);
+TupleMetadata readerSchema = AvroSchemaUtil.convert(reader.getSchema());
+logger.debug("Avro file converted schema: {}", readerSchema);
+TupleMetadata providedSchema = negotiator.providedSchema();
+TupleMetadata tableSchema = 
StandardConversions.mergeSchemas(providedSchema, readerSchema);
 
 Review comment:
   You have identified a tricky area. The planner is the correct place to 
handle scan output schema: then the reader takes that output schema as gospel. 
Here we have readers that have a strong opinion about what schema they can 
produce; the provided schema is mostly a "suggestion." So, were in an akward 
middle area.
   
   In particular, the provided schema can include columns that the reader does 
not know about. Imagine the case of an Avro file where some files use schema 
v1, while others use schema 2 with an additional column. The v1 readers can't 
provide that extra column. Filling in missing columns is done via the scan 
framework. (Maybe this does not occur in Avro; maybe the Avro schema handles 
these cases. But, the issue does arise in CSV, JSON and others.)
   
   So, at the very least, the reader is not responsible for any column except 
those it can produce, any "extra" columns are handled by the scan framework.
   
   In the ideal case, the provided schema says to the reader, "of the ways you 
know how to convert a column, use **this** one." So, the reader has to know the 
target type of the column.
   
   All this said, my goal in a later PR is to reshuffle some of this stuff 
again. Now that we have multiple examples, is there some work that can be 
pulled out and reused? For example, can we build an explicit mapping from the 
reader's idea of the schema to the provided schema? Or, does it turn out that 
the reader's idea of types are unique to each reader so there is no common 
mechanism? For example, the Avro code looks quite specific to Avro and is 
perhaps not easily reused.
   
   Suggestions?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework

2020-03-10 Thread GitBox
paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift 
column conversion to reader from scan framework
URL: https://github.com/apache/drill/pull/1993#discussion_r390511107
 
 

 ##
 File path: 
exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvTableProperties.java
 ##
 @@ -547,4 +543,65 @@ public void testMessyQuotes() throws Exception {
 .build();
 RowSetUtilities.verify(expected, actual);
   }
+
+  private static final String[] trimData = {
+  " 10 , fred ",
+  " 20, wilma "
+};
+
+  /**
+   * Trim leading and trailing whitespace. This setting is currently
+   * only available via table properties.
+   */
+  @Test
+  public void testKeepWitespace() throws Exception {
+try {
+  enableSchemaSupport();
+  String tablePath = buildTable("noTrim", trimData);
+  String sql = String.format("create schema (%s) " +
+  "for table %s PROPERTIES ('" +
+  TextFormatPlugin.HAS_HEADERS_PROP + "'='false', '" +
+  TextFormatPlugin.SKIP_FIRST_LINE_PROP + "'='false')",
+  COL_SCHEMA, tablePath);
+  run(sql);
+  RowSet actual = client.queryBuilder().sql(SELECT_ALL, 
tablePath).rowSet();
+
+  TupleMetadata expectedSchema = new SchemaBuilder()
+  .add("id", MinorType.INT)
+  .add("name", MinorType.VARCHAR)
+  .buildSchema();
+
+  // String-to-number conversion trims strings automatically
+  RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema)
+  .addRow(10, " fred ")
+  .addRow(20, " wilma ")
+  .build();
+  RowSetUtilities.verify(expected, actual);
+} finally {
+  resetSchemaSupport();
+}
+  }
+
+ /**
+   * Trim leading and trailing whitespace. This setting is currently
+   * only available via table properties.
+   */
+  @Test
+  public void testTrimWitespace() throws Exception {
+try {
+  enableSchemaSupport();
+  String tablePath = buildTable("trim", trimData);
+  String sql = String.format("create schema (%s) " +
+  "for table %s PROPERTIES ('" +
+  TextFormatPlugin.HAS_HEADERS_PROP + "'='false', '" +
+  TextFormatPlugin.SKIP_FIRST_LINE_PROP + "'='false', '" +
+  TextFormatPlugin.TRIM_WHITESPACE_PROP + "'='true')",
 
 Review comment:
   Added detail to the "Documentation" section of this PR. There will be a few 
more new properties for JSON. Then I'll figure out how to show that in the 
docs. Perhaps in the provided schema section we need a table of which formats 
support the schema, and which properties each format supports, and what they do.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] paul-rogers opened a new pull request #2020: DRILL-7634: Rollup of code cleanup changes

2020-03-10 Thread GitBox
paul-rogers opened a new pull request #2020: DRILL-7634: Rollup of code cleanup 
changes
URL: https://github.com/apache/drill/pull/2020
 
 
   # [DRILL-7634](https://issues.apache.org/jira/browseDRILL-7634): Rollup of 
code cleanup changes
   
   ## Description
   
   Collection of code cleanup changes.
   
   The most significant is to create constants for function names. We already 
had a pretty good list in `FunctionGenerationHelper`. I move this list to a new 
interface, `FunctionNames` and added to it. I then changed all references to 
hard-coded names to instead refer to these constants.
   
   This change pointed out several places where code checks for variations of 
function names. Not sure why; perhaps an early version of the code did not have 
the map from alias to "canonical" name. Used constants for the canonical name, 
left the other names in place because I don't have enough knowledge to know if 
it is safe to remove them. Anyone know?
   
   Also made the `args` member of `FunctionCall` private and added accessors.
   
   ## Documentation
   
   None
   
   ## Testing
   
   Reran all unit tests.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Created] (DRILL-7634) Rollup of code cleanup changes

2020-03-10 Thread Paul Rogers (Jira)
Paul Rogers created DRILL-7634:
--

 Summary: Rollup of code cleanup changes
 Key: DRILL-7634
 URL: https://issues.apache.org/jira/browse/DRILL-7634
 Project: Apache Drill
  Issue Type: Improvement
Affects Versions: 1.17.0
Reporter: Paul Rogers
Assignee: Paul Rogers
 Fix For: 1.18.0


Pack of cosmetic code cleanup changes accumulated over recent months.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser

2020-03-10 Thread GitBox
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to 
the Json Structure Parser
URL: https://github.com/apache/drill/pull/2016#discussion_r390435908
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDef.java
 ##
 @@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.easy.json.parser;
+
+/**
+ * Description of a JSON value as inferred from looking ahead in
+ * the JSON stream. Includs a type (which can be empty for an empty
+ * array, or null), and an array size (which is 0 for simple values.)
+ * 
+ * To be clear, this is the JSON parser's best guess at a field type
+ * from the input token stream. This is not a description of the
+ * desired data type as JSON can only react to what it sees on input.
+ */
+public class ValueDef {
+
+  /**
+   * Description of JSON types as derived from JSON tokens.
+   */
+  public enum JsonType {
+OBJECT, NULL, BOOLEAN,
+INTEGER, FLOAT, STRING, EMBEDDED_OBJECT,
+
+/**
+ * Indicates an empty array.
+ */
+EMPTY,
+
+/**
+ * Indicates an unknown array, appears when replacing the
+ * value listener for an array.
+ */
+UNKNOWN;
+
+public boolean isObject() { return this == OBJECT; }
+
+public boolean isUnknown() {
+  return this == NULL || this == EMPTY ||
+ this == UNKNOWN;
+}
+
+public boolean isScalar() {
+  return !isObject() && !isUnknown();
+}
+  }
+
+  public static final ValueDef UNKNOWN_ARRAY = new ValueDef(JsonType.UNKNOWN, 
1);
+  public static final ValueDef UNKNOWN = new ValueDef(JsonType.UNKNOWN, 0);
+
+  private final int arrayDims;
+  private final JsonType type;
+
+  public ValueDef(JsonType type, int dims) {
+this.type = type;
+this.arrayDims = dims;
+  }
+
+  public JsonType type() { return type; }
+  public int dimensions() { return arrayDims; }
+  public boolean isArray() { return arrayDims > 0; }
+
+  @Override
+  public String toString() {
+String result = type.name();
+for (int i = 0; i < arrayDims; i++) {
+  result += "[]";
 
 Review comment:
   Could you please replace strings concatenation with `StringBuilder` usage.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser

2020-03-10 Thread GitBox
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to 
the Json Structure Parser
URL: https://github.com/apache/drill/pull/2016#discussion_r390433673
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ObjectListener.java
 ##
 @@ -52,80 +52,64 @@
  */
 public interface ObjectListener {
 
+  public enum FieldType {
 
 Review comment:
   No need to declare it explicitly as public. Everything declared in the 
interface is public by default.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser

2020-03-10 Thread GitBox
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to 
the Json Structure Parser
URL: https://github.com/apache/drill/pull/2016#discussion_r390425236
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDefFactory.java
 ##
 @@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.easy.json.parser;
+
+import org.apache.drill.exec.store.easy.json.parser.ValueDef.JsonType;
+
+import com.fasterxml.jackson.core.JsonToken;
+
+/**
+ * Constructs a {@link ValueDef} by looking ahead on the input stream.
+ * Looking ahead is safe because this class only looks at syntactic
+ * tokens such as {, {@code [} or the first value token.
+ * The underlying JSON parser is left with the first value token
+ * as its current token. Pushes other tokens back on the token stack
+ * so they can be re-consumed by the actual parser.
+ */
+public class ValueDefFactory {
+
+  private int arrayDims;
+  private JsonType jsonType = JsonType.EMPTY;
+
+  public ValueDefFactory(TokenIterator tokenizer) {
+inferValueType(tokenizer);
+  }
+
+  public static ValueDef lookAhead(TokenIterator tokenizer) {
+ValueDefFactory factory= new ValueDefFactory(tokenizer);
 
 Review comment:
   ```suggestion
   ValueDefFactory factory = new ValueDefFactory(tokenizer);
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser

2020-03-10 Thread GitBox
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to 
the Json Structure Parser
URL: https://github.com/apache/drill/pull/2016#discussion_r390434300
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDef.java
 ##
 @@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.easy.json.parser;
+
+/**
+ * Description of a JSON value as inferred from looking ahead in
+ * the JSON stream. Includs a type (which can be empty for an empty
 
 Review comment:
   ```suggestion
* the JSON stream. Includes a type (which can be empty for an empty
   ```


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser

2020-03-10 Thread GitBox
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to 
the Json Structure Parser
URL: https://github.com/apache/drill/pull/2016#discussion_r390423322
 
 

 ##
 File path: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ObjectListener.java
 ##
 @@ -52,80 +52,64 @@
  */
 public interface ObjectListener {
 
+  public enum FieldType {
+
+/**
+ * The field is unprojected, ignore its content. No value listener
+ * is created.
+ */
+IGNORE,
+
+/**
+ * Parse the JSON object according to its type.
+ */
+TYPED,
+
+/**
+ * The field is to be treated as "all-text". Used when the parser-level
+ * setting for {@code allTextMode} is {@code false}; allows per-field
+ * overrides to, perhaps, ride over inconsistent scalar types for a
+ * single field. The listener will receive only strings.
+ */
+TEXT,
+
+/**
+ * Parse the value, and all its children, as JSON.
+ * That is, converts the parsed JSON back into a
+ * JSON string. The listener will receive only strings.
+ */
+JSON
+  }
+
   /**
* Called at the start of a set of values for an object. That is, called
* when the structure parser accepts the { token.
*/
   void onStart();
 
-  /**
-   * Called at the end of a set of values for an object. That is, called
-   * when the structure parser accepts the } token.
-   */
-  void onEnd();
-
   /**
* Called by the structure parser when it first sees a new field for
-   * and object to determine if that field is to be projected (is needed
-   * by the listener.) If not projected, the structure parser will not
+   * and object to determine how to parse the field.
+   * If not projected, the structure parser will not
* ask for a value listener and will insert a "dummy" parser that will
* free-wheel over any value of that field. As a result, unprojected
* fields can not cause type errors: they are invisible as long as
* they are syntactically valid.
+   * 
+   * The {@link FieldType#JSON} type says to parse the entire field, and
+   * its children, as a JSON string. The parser will ask for a value
+   * listener to accept the JSON text.
*
* @param key the object field name
-   * @return {@code true} if this listener wants to provide a listener
-   * for the field, {@code false} if the field should be ignored
-   */
-  boolean isProjected(String key);
-
-  /**
-   * A new field has appeared with a scalar (or {@code null}) value.
-   * That is: {@code key: }.
-   *
-   * @param key the field name
-   * @param type the type as given by the JSON token for the value
-   * @return a value listener for the scalar value
+   * @return how the field should be parsed
*/
-  ValueListener addScalar(String key, JsonType type);
+  FieldType fieldType(String key);
 
-  /**
-   * A new field has appeared with a scalar, {@code null} or empty array
-   * value. That is, one of:
-   * 
-   * key: [+ 
-   * key: [+ null
-   * key: [+ ]
-   * 
-   * Where "[+" means one or more opening array elements.
-   *
-   * @param key the field name
-   * @param arrayDims number of dimensions observed in the first appearance
-   * of the array (more may appear later)
-   * @param type the observed type of the first element of the array, or
-   * {@link JsonType.NULL} if {@code null} was see, or
-   * {@link JsonType.EMPTY} if an empty array was seen
-   * @return a listener for the field itself which is prepared to
-   * return an array listener
-   */
-  ValueListener addArray(String key, int arrayDims, JsonType type);
+  ValueListener addField(String key, ValueDef valueDef);
 
 Review comment:
   Please add JavaDoc.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva opened a new pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet

2020-03-10 Thread GitBox
arina-ielchiieva opened a new pull request #2019: DRILL-7630: Add additional 
types into SchemaParser for Parquet
URL: https://github.com/apache/drill/pull/2019
 
 
   # [DRILL-7630](https://issues.apache.org/jira/browse/DRILL-7630): Add 
additional types into SchemaParser for Parquet
   
   ## Description
   
   Parquet files supports some types that are currently not included in the 
schema parser. Since schema parser is used during schema ser / de, processing 
of such Parquet files fails. We need to add additional data types to the schema 
parser.
   
   ## Documentation
   NA
   
   ## Testing
   Added unit test, ran all tests.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken 
after the fix for DRILL-7547
URL: https://github.com/apache/drill/pull/2015#issuecomment-597139833
 
 
   +1, LGTM


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva removed a comment on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547

2020-03-10 Thread GitBox
arina-ielchiieva removed a comment on issue #2015: DRILL-7628: Fix Mongo tests 
broken after the fix for DRILL-7547
URL: https://github.com/apache/drill/pull/2015#issuecomment-596991728
 
 
   @vvysotskyi I think maybe we'll get rid of this test suite logic?
   We can make all Mongo tests extend `MongoTestBase.class`, annotate 
`MongoTestBase.class` with `@Category({MongoStorageTest.class})` so we can skip 
or include Mongo tests and that's it, so suite logic will be required.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] agozhiy commented on issue #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
agozhiy commented on issue #2012: DRILL-7625: Add options for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#issuecomment-597130330
 
 
   Thank you, +1.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options 
for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#discussion_r390351471
 
 

 ##
 File path: 
common/src/main/java/org/apache/drill/common/config/NestedConfig.java
 ##
 @@ -111,6 +112,19 @@ public String getString(String path) {
 return c.getString(path);
   }
 
+  /**
+   * Returns string option or passed default value if option is not set.
+   *
+   * @param path option path
+   * @param defaultValue default value
+   * @return option or default value
+   */
+  public String getString(String path, String defaultValue) {
 
 Review comment:
   There is no need for this method at all, default values can be set directly 
in drill-module.conf.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options 
for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#discussion_r390350047
 
 

 ##
 File path: 
common/src/main/java/org/apache/drill/common/config/NestedConfig.java
 ##
 @@ -111,6 +112,19 @@ public String getString(String path) {
 return c.getString(path);
   }
 
+  /**
+   * Returns string option or passed default value if option is not set.
+   *
+   * @param path option path
+   * @param defaultValue default value
+   * @return option or default value
+   */
+  public String getString(String path, String defaultValue) {
 
 Review comment:
   There is no need for this method at all, default values can be set directly 
in drill-module.conf.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options 
for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#discussion_r390350047
 
 

 ##
 File path: 
common/src/main/java/org/apache/drill/common/config/NestedConfig.java
 ##
 @@ -111,6 +112,19 @@ public String getString(String path) {
 return c.getString(path);
   }
 
+  /**
+   * Returns string option or passed default value if option is not set.
+   *
+   * @param path option path
+   * @param defaultValue default value
+   * @return option or default value
+   */
+  public String getString(String path, String defaultValue) {
 
 Review comment:
   There is no need for this method at all, default values can be set directly 
in drill-module.conf.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options 
for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#discussion_r390347618
 
 

 ##
 File path: 
common/src/main/java/org/apache/drill/common/config/NestedConfig.java
 ##
 @@ -111,6 +111,14 @@ public String getString(String path) {
 return c.getString(path);
   }
 
+  public String getString(String path, String defaultValue) {
+String value = hasPath(path) ? getString(path) : null;
+if (value == null || (value = value.trim()).isEmpty()) {
+  return defaultValue;
+}
+return value;
 
 Review comment:
   There is no need for this method at all, default values can be set directly 
in drill-module.conf.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] cgivre commented on issue #2012: DRILL-7625: Add options for SslContextFactory

2020-03-10 Thread GitBox
cgivre commented on issue #2012: DRILL-7625: Add options for SslContextFactory
URL: https://github.com/apache/drill/pull/2012#issuecomment-597061241
 
 
   Could we make a note to include this in the web documentation for Drill once 
this is committed?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi merged pull request #1986: Additional changes for Drill Metastore docs

2020-03-10 Thread GitBox
vvysotskyi merged pull request #1986: Additional changes for Drill Metastore 
docs
URL: https://github.com/apache/drill/pull/1986
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] vvysotskyi merged pull request #2013: Clean up a bit, add example for s3 storage of query profiles

2020-03-10 Thread GitBox
vvysotskyi merged pull request #2013: Clean up a bit, add example for s3 
storage of query profiles
URL: https://github.com/apache/drill/pull/2013
 
 
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors

2020-03-10 Thread GitBox
arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes 
for union and repeated list accessors
URL: https://github.com/apache/drill/pull/2018#discussion_r390202680
 
 

 ##
 File path: 
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java
 ##
 @@ -96,11 +134,13 @@ public StructureType structureType() {
   public boolean isVariant() { return true; }
 
   @Override
-  public boolean isArray() { return type() == MinorType.LIST; }
+  public boolean isArray() {
+return super.isArray() || type() == MinorType.LIST;
+  }
 
   @Override
   public ColumnMetadata cloneEmpty() {
-return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty());
+return new VariantColumnMetadata(name, type, mode, new VariantSchema());
 
 Review comment:
   Also this class should override `public String typeString()` which is used 
during ser / de.
   For example, `UNION`. Not sure how to describe LIST 
though...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors

2020-03-10 Thread GitBox
arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes 
for union and repeated list accessors
URL: https://github.com/apache/drill/pull/2018#discussion_r390199915
 
 

 ##
 File path: 
exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java
 ##
 @@ -96,11 +134,13 @@ public StructureType structureType() {
   public boolean isVariant() { return true; }
 
   @Override
-  public boolean isArray() { return type() == MinorType.LIST; }
+  public boolean isArray() {
+return super.isArray() || type() == MinorType.LIST;
+  }
 
   @Override
   public ColumnMetadata cloneEmpty() {
-return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty());
+return new VariantColumnMetadata(name, type, mode, new VariantSchema());
 
 Review comment:
   Could you please also implement `public ColumnMetadata copy() {` below since 
yo are touching this class?


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #2017: DRILL-7632: Improve user exception formatting

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #2017: DRILL-7632: Improve user exception 
formatting
URL: https://github.com/apache/drill/pull/2017#issuecomment-596994971
 
 
   +1, LGTM.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #2010: DRILL-7620: Fix plugin mutability issues

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #2010: DRILL-7620: Fix plugin mutability 
issues
URL: https://github.com/apache/drill/pull/2010#issuecomment-596994090
 
 
   @paul-rogers I guess we should to address issues mentioned by Anton before 
moving forward...


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #1994: DRILL-7203: Accept impersonation 
userName as form field & fix back button for query page
URL: https://github.com/apache/drill/pull/1994#issuecomment-596993199
 
 
   I think we should still merge this PR since in Drill we don't have list of 
browsers we support, and Chrome is the one that is commonly used, since waiting 
widget works in Chrome, the fact it does not work in Safari should not block 
this PR.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[GitHub] [drill] arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547

2020-03-10 Thread GitBox
arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken 
after the fix for DRILL-7547
URL: https://github.com/apache/drill/pull/2015#issuecomment-596991728
 
 
   @vvysotskyi I think maybe we'll get rid of this test suite logic?
   We can make all Mongo tests extend `MongoTestBase.class`, annotate 
`MongoTestBase.class` with `@Category({MongoStorageTest.class})` so we can skip 
or include Mongo tests and that's it, so suite logic will be required.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


  1   2   >