[GitHub] [drill] paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues
paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues URL: https://github.com/apache/drill/pull/2010#issuecomment-597381456 Rebased on latest master to resolve conflict. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues
paul-rogers commented on issue #2010: DRILL-7620: Fix plugin mutability issues URL: https://github.com/apache/drill/pull/2010#issuecomment-597381178 @agozhiy, here is how I read the old behavior:
* The UI sends the plugin as JSON via the `createOrUpdatePlugin()` method.
* `StorageResources` configures its mapper (which is really the global one from `LogicalPlanPersistence`) to accept comments, then deserializes the JSON to create a config.
* If deserialization fails, the UI returns an error message.
* `StorageResources` passes the name and config, as a `PluginConfigWrapper`, to `createOrUpdatePluginJSON()`, which calls `createOrUpdate()` in the plugin registry.
* `createOrUpdate()` first creates an instance of the plugin. The result will be one of a) `null` if the config is disabled, b) a `StoragePlugin` if enabled and valid, or c) an exception if invalid.
* `createOrUpdate()` then does one of:
  * If enabled, replace the existing plugin in the in-memory cache with the new one, or add the new one.
  * If disabled, remove the existing plugin from the in-memory cache.
* `createOrUpdate()` then writes the config to persistent storage, independent of the above choices.

There are bugs in the above. It writes to the local cache before persistent storage, meaning that, for a brief time, the two are out of sync. If another update comes in during that window, the two will again be out of sync. These issues don't crop up in the current tests, of course.

The new code does:
* The UI sends the plugin as JSON via the `createOrUpdatePlugin()` method.
* `StorageResources` calls `putJson()` in the plugin registry with the name and JSON text.
* `putJson()` decodes the JSON using the system mapper, configured to allow comments, throwing an exception if the decode fails. If it succeeds, it calls `put()` with the config.
* `put()` validates that the plugin is not a system plugin, then updates persistent storage.
I see the problem: the `put()` call above should be the new `validatedPut()`, which creates an instance of the plugin first. Once we make that fix:
* `validatedPut()` creates an instance of the plugin, throwing an exception if instance creation fails.
* If OK, `validatedPut()` writes the plugin to persistent storage and updates the plugin cache.

There is a second bug here: the logic should be:
* `validatedPut()` does one of:
  * If enabled, updates the config in the plugin cache.
  * If disabled, removes the config from the plugin cache.
* `validatedPut()` then checks if the old entry had a plugin instance. If so, adds it to the ephemeral store.
* `validatedPut()` writes the updated config to persistent storage.

The planner bugs explained earlier still occur. (A query should not iterate over all registered plugins.) Still, the above fixes should get us back to the original behavior, with the addition of the other fixes we're trying to make.

Made the two fixes noted above. Added unit tests for this case. @agozhiy, if you run the UI scenario from your earlier comment, you should see that the plugin is rejected if invalid and enabled. The config is stored, but not instantiated, if invalid and disabled. If you are brave, you can use ZK to change a stored valid config to make it invalid. Doing so simulates the external system failing. With the old version, Drill will fail on startup. With this version, Drill will start, but you'll see error messages in the log each time you plan a query. A future PR will ensure you see no errors at all.

All of this points out more open issues:
* If a plugin ever fails, should it be disabled? How would this work for a temporary outage? The user would have to manually re-enable the plugin when it becomes valid again.
* On a completely unrelated topic, there is code in the original version that forces all names of stored plugins to lower case (removing any mixed-case versions and replacing them), but no code enforces the lower-case rule in the in-memory cache. (I have tests that specifically exercise this case.) So names are forced to lower case in storage, but not in memory. This seems more on the "bug" side of the ledger than the "feature" side. Should we allow mixed case everywhere or nowhere? Yet another thing to fix in the next PR.
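The corrected `validatedPut()` sequence described above can be sketched as follows. This is a hedged illustration with hypothetical names (`PluginRegistrySketch`, `StoredConfig`, `instantiate()`), not Drill's actual registry API; it shows only the enabled/disabled handling and the cache-vs-storage ordering, with validation up front so an invalid enabled config never lands in either place.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the fixed update sequence: validate (instantiate) first, then
// update the in-memory cache, then write through to persistent storage.
public class PluginRegistrySketch {
  static final class StoredConfig {
    final String json;
    final boolean enabled;
    StoredConfig(String json, boolean enabled) { this.json = json; this.enabled = enabled; }
  }

  private final Map<String, StoredConfig> cache = new ConcurrentHashMap<>();
  private final Map<String, StoredConfig> persistentStore = new ConcurrentHashMap<>();

  /** Stand-in for plugin creation: throws if the config is invalid. */
  private void instantiate(StoredConfig config) {
    if (config.json.isEmpty()) {
      throw new IllegalArgumentException("invalid plugin config");
    }
  }

  public void validatedPut(String name, StoredConfig config) {
    if (config.enabled) {
      instantiate(config);      // reject an invalid, enabled config up front
      cache.put(name, config);  // enabled: replace or add in the in-memory cache
    } else {
      cache.remove(name);       // disabled: drop from the in-memory cache
    }
    // Invalid-but-disabled configs still reach storage, matching the
    // "stored, but not instantiated" behavior described above.
    persistentStore.put(name, config);
  }

  public boolean isActive(String name) { return cache.containsKey(name); }
  public boolean isStored(String name) { return persistentStore.containsKey(name); }
}
```

Note the ephemeral store for a replaced plugin instance is omitted here; the sketch only pins down the ordering.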
[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390678433 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http.util; + +import okhttp3.Cache; +import okhttp3.Credentials; +import okhttp3.FormBody; +import okhttp3.Interceptor; +import okhttp3.OkHttpClient; +import okhttp3.OkHttpClient.Builder; +import okhttp3.Request; +import okhttp3.Response; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.ops.FragmentContext; +import org.apache.drill.exec.store.http.HttpAPIConfig; +import org.apache.drill.exec.store.http.HttpStoragePluginConfig; +import org.jetbrains.annotations.NotNull; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.util.Map; +import java.util.Objects; +import java.util.concurrent.TimeUnit; +import java.util.regex.Pattern; + + +/** + * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the getInputStream() + * method which accepts a url and opens an InputStream with that URL's contents. + */ +public class SimpleHttp { + private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class); + + private final OkHttpClient client; + + private final HttpStoragePluginConfig config; + + private final FragmentContext context; + + private final HttpAPIConfig apiConfig; + + public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) { +this.config = config; +this.context = context; +this.apiConfig = config.connections().get(connectionName); +client = setupHttpClient(); + } + + + + public InputStream getInputStream(String urlStr) { +Request.Builder requestBuilder; + +// The configuration does not allow for any other request types other than POST and GET. 
+if (apiConfig.method().equals("get")) { + // Handle GET requests + requestBuilder = new Request.Builder().url(urlStr); +} else { + // Handle POST requests + FormBody.Builder formBodyBuilder = buildPostBody(); + requestBuilder = new Request.Builder() +.url(urlStr) +.post(formBodyBuilder.build()); +} + +// Add headers to request +if (apiConfig.headers() != null) { + for (Map.Entry entry : apiConfig.headers().entrySet()) { +String key = entry.getKey(); +String value = entry.getValue(); +requestBuilder.addHeader(key, value); + } +} + +// Build the request object +Request request = requestBuilder.build(); +logger.debug("Headers: {}", request.headers()); + +try { + // Execute the request + Response response = client +.newCall(request) +.execute(); + + // If the request is unsuccessful, throw a UserException + if (!response.isSuccessful()) { +throw UserException + .dataReadError() + .message("Error retrieving data from HTTP Storage Plugin: " + response.code() + " " + response.message()) + .addContext("URL: ", urlStr) + .addContext("Response code: ", response.code()) + .build(logger); + } + logger.debug("HTTP Request for {} successful.", urlStr); + logger.debug("Response Headers: {} ", response.headers().toString()); + + // Return the InputStream of the response + return Objects.requireNonNull(response.body()).byteStream(); +} catch (IOException e) { + throw UserException +.dataReadError(e) +.message("Error retrieving data from HTTP Storage Plugin: %s", e.getMessage()) +.addContext("URL Requested:" + urlStr) +.build(logger); +} + } + + /** + * Function configures the OkHTTP3 server object with configuration info from the user. + * + * @return OkHttpClient configured server + */ + private OkHttpClient setupHttpClient() {
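The GET-vs-POST branching and header copying in `getInputStream()` above can be sketched with the JDK's own HTTP client instead of OkHttp. This is a hedged illustration, not the plugin's code: `buildRequest()` is a hypothetical helper, and the form body content is a placeholder.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.util.Map;

// Sketch of the same request-building flow using java.net.http (JDK 11+):
// choose the method from config, then copy any configured headers.
public class RequestSketch {
  public static HttpRequest buildRequest(String method, String url, Map<String, String> headers) {
    HttpRequest.Builder builder = HttpRequest.newBuilder().uri(URI.create(url));
    if ("get".equalsIgnoreCase(method)) {
      builder.GET();                                                   // GET: no body
    } else {
      builder.POST(HttpRequest.BodyPublishers.ofString("key=value")); // POST: placeholder form body
    }
    if (headers != null) {
      headers.forEach(builder::header);                               // copy configured headers
    }
    return builder.build();
  }
}
```

Building the request separately from executing it, as here and in the OkHttp code above, also makes it easy to log the final headers before the call goes out.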
[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390677994 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import org.apache.drill.common.map.CaseInsensitiveMap; +import org.apache.drill.common.logical.StoragePluginConfigBase; +import org.apache.drill.shaded.guava.com.google.common.base.Objects; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + + +@JsonTypeName(HttpStoragePluginConfig.NAME) +public class HttpStoragePluginConfig extends StoragePluginConfigBase { + private static final Logger logger = LoggerFactory.getLogger(HttpStoragePluginConfig.class); + + public static final String NAME = "http"; + + public final Map connections; + + public final boolean cacheResults; + + public int timeout; + + @JsonCreator + public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean cacheResults, + @JsonProperty("connections") Map connections, + @JsonProperty("timeout") int timeout) { +logger.debug("Initialize HttpStoragePluginConfig {}", connections); Review comment: In the next revision, I'll remove all these debug statements. These were there for me to follow the flow of the code. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
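Relatedly, the non-final `public int timeout` field in the config above makes it mutable. Drill generally treats storage plugin configs as value objects (they are compared and cached by value), so fields are usually final with `equals()`/`hashCode()` defined over them. A hedged sketch of that shape, with an illustrative class name and fields mirroring the snippet:

```java
import java.util.Objects;

// Sketch of the config as an immutable value object: all fields final,
// value-based equals()/hashCode(), accessors instead of public fields.
public final class HttpConfigSketch {
  private final boolean cacheResults;
  private final int timeout;

  public HttpConfigSketch(boolean cacheResults, int timeout) {
    this.cacheResults = cacheResults;
    this.timeout = timeout;
  }

  public boolean cacheResults() { return cacheResults; }
  public int timeout() { return timeout; }

  @Override
  public boolean equals(Object o) {
    if (this == o) { return true; }
    if (!(o instanceof HttpConfigSketch)) { return false; }
    HttpConfigSketch that = (HttpConfigSketch) o;
    return cacheResults == that.cacheResults && timeout == that.timeout;
  }

  @Override
  public int hashCode() { return Objects.hash(cacheResults, timeout); }
}
```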
[GitHub] [drill] cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
cgivre commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390677421 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.Arrays; +import java.util.List; +import java.util.Objects; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.common.expression.SchemaPath; + +import org.apache.drill.exec.physical.base.AbstractGroupScan; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.base.ScanStats; +import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty; +import org.apache.drill.exec.physical.base.SubScan; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +@JsonTypeName("http-scan") +public class HttpGroupScan extends AbstractGroupScan { + private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class); + + private List columns; + private final HttpScanSpec httpScanSpec; + private final HttpStoragePluginConfig config; + + public HttpGroupScan ( +HttpStoragePluginConfig config, +HttpScanSpec scanSpec, +List columns + ) { +super("no-user"); +this.config = config; +this.httpScanSpec = scanSpec; +this.columns = columns == null || columns.size() == 0 ? 
ALL_COLUMNS : columns; + } + + public HttpGroupScan(HttpGroupScan that) { +super(that); +config = that.config(); +httpScanSpec = that.httpScanSpec(); +columns = that.getColumns(); + } + + @JsonCreator + public HttpGroupScan( +@JsonProperty("config") HttpStoragePluginConfig config, +@JsonProperty("columns") List columns, +@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec + ) { +super("no-user"); +this.config = config; +this.columns = columns; +this.httpScanSpec = httpScanSpec; + } + + @JsonProperty("config") + public HttpStoragePluginConfig config() { return config; } + + @JsonProperty("columns") + public List columns() { return columns; } + + @JsonProperty("httpScanSpec") + public HttpScanSpec httpScanSpec() { return httpScanSpec; } + + @Override + public void applyAssignments(List endpoints) { +logger.debug("HttpGroupScan applyAssignments"); + } + + @Override + @JsonIgnore + public int getMaxParallelizationWidth() { +return 1; + } + + @Override + public boolean canPushdownProjects(List columns) { +return true; + } + + @Override + public SubScan getSpecificScan(int minorFragmentId) { +logger.debug("HttpGroupScan getSpecificScan"); +return new HttpSubScan(config, httpScanSpec, columns); + } + + @Override + public GroupScan clone(List columns) { +logger.debug("HttpGroupScan clone {}", columns); +HttpGroupScan newScan = new HttpGroupScan(this); +newScan.columns = columns; +return newScan; + } + + @Override + public String getDigest() { +return toString(); + } + + @Override + public PhysicalOperator getNewWithChildren(List children) { +Preconditions.checkArgument(children.isEmpty()); +return new HttpGroupScan(this); + } + + @Override + public ScanStats getScanStats() { +int colCount = columns.size(); +int estRowCount = 1; +int estDataSize = estRowCount * 200 * colCount; Review comment: Hey... this is a learning experience for us all... (but mainly for me... ) ;-) This is an automated message from the Apache Git Service. 
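The `getScanStats()` estimate under review boils down to placeholder arithmetic; a minimal sketch (the 200-byte per-column width is the snippet's own guess, not a measured value):

```java
// Sketch of the rough size estimate in the reviewed getScanStats(): with no
// real statistics available, guess a fixed byte width per column per row.
public class ScanStatsSketch {
  public static long estimateDataSize(int estRowCount, int colCount) {
    final int estimatedBytesPerColumn = 200; // placeholder width, as in the snippet
    return (long) estRowCount * estimatedBytesPerColumn * colCount;
  }
}
```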
[GitHub] [drill] vvysotskyi merged pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system
vvysotskyi merged pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system URL: https://github.com/apache/drill/pull/2021
[GitHub] [drill] dobesv commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries
dobesv commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries URL: https://github.com/apache/drill/pull/1997#discussion_r390652771 ## File path: exec/java-exec/src/main/resources/rest/query/query.ftl ## @@ -82,6 +82,16 @@ Submit checked> Limit results to rows + Store Format + +Parquet +JSON +Comma-Separated Values +Pipe-Separated Values +Tab-Separated Values Review comment: @arina-ielchiieva OK, I'll just remove the field for now. I added it here for demo/testing but I don't actually need it to be there for my own purposes.
[GitHub] [drill] arina-ielchiieva commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries
arina-ielchiieva commented on a change in pull request #1997: DRILL-7604: Allow session options to be set in HTTP queries URL: https://github.com/apache/drill/pull/1997#discussion_r390649598 ## File path: exec/java-exec/src/main/resources/rest/query/query.ftl ## @@ -82,6 +82,16 @@ Submit checked> Limit results to rows + Store Format + +Parquet +JSON +Comma-Separated Values +Pipe-Separated Values +Tab-Separated Values Review comment: @dobesv As I have stated before, we should not hard-code format names since they can be customized. Please remove this part of the code; the other changes are OK.
[GitHub] [drill] arina-ielchiieva commented on issue #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system
arina-ielchiieva commented on issue #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system URL: https://github.com/apache/drill/pull/2021#issuecomment-597352332 +1, LGTM.
[GitHub] [drill] vvysotskyi opened a new pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system
vvysotskyi opened a new pull request #2021: DRILL-7635: Mongo tests fail for profiles with the non-file default file system URL: https://github.com/apache/drill/pull/2021

# [DRILL-7635](https://issues.apache.org/jira/browse/DRILL-7635): Mongo tests fail for profiles with the non-file default file system

## Description
Please see Jira for details.

## Documentation
NA

## Testing
Ran all tests.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390583045 ## File path: contrib/storage-http/src/main/resources/drill-module.conf ## @@ -0,0 +1,27 @@ +# +# Licensed to the Apache Software Foundation (ASF) under one +# or more contributor license agreements. See the NOTICE file +# distributed with this work for additional information +# regarding copyright ownership. The ASF licenses this file +# to you under the Apache License, Version 2.0 (the +# "License"); you may not use this file except in compliance +# with the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# This file tells Drill to consider this module when class path scanning. +# This file can also include any supplementary configuration information. +# This file is in HOCON format, see https://github.com/typesafehub/config/blob/master/HOCON.md for more information. + +drill: { + classpath.scanning: { +packages += "org.apache.drill.exec.store.http" + } +} Review comment: Nit: newline This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390547367 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.Arrays; +import java.util.List; +import java.util.Objects; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.common.expression.SchemaPath; + +import org.apache.drill.exec.physical.base.AbstractGroupScan; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.base.ScanStats; +import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty; +import org.apache.drill.exec.physical.base.SubScan; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +@JsonTypeName("http-scan") +public class HttpGroupScan extends AbstractGroupScan { + private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class); + + private List columns; + private final HttpScanSpec httpScanSpec; + private final HttpStoragePluginConfig config; + + public HttpGroupScan ( +HttpStoragePluginConfig config, +HttpScanSpec scanSpec, +List columns + ) { +super("no-user"); +this.config = config; +this.httpScanSpec = scanSpec; +this.columns = columns == null || columns.size() == 0 ? 
ALL_COLUMNS : columns; + } + + public HttpGroupScan(HttpGroupScan that) { +super(that); +config = that.config(); +httpScanSpec = that.httpScanSpec(); +columns = that.getColumns(); + } + + @JsonCreator + public HttpGroupScan( +@JsonProperty("config") HttpStoragePluginConfig config, +@JsonProperty("columns") List columns, +@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec + ) { +super("no-user"); +this.config = config; +this.columns = columns; +this.httpScanSpec = httpScanSpec; + } + + @JsonProperty("config") + public HttpStoragePluginConfig config() { return config; } + + @JsonProperty("columns") + public List columns() { return columns; } + + @JsonProperty("httpScanSpec") + public HttpScanSpec httpScanSpec() { return httpScanSpec; } + + @Override + public void applyAssignments(List endpoints) { +logger.debug("HttpGroupScan applyAssignments"); + } + + @Override + @JsonIgnore + public int getMaxParallelizationWidth() { +return 1; + } + + @Override + public boolean canPushdownProjects(List columns) { +return true; + } + + @Override + public SubScan getSpecificScan(int minorFragmentId) { +logger.debug("HttpGroupScan getSpecificScan"); Review comment: Unless you debug by reading logs, no need for these debug statements. These methods are called when needed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390582732 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http.util; + +import okhttp3.Cache; +import okhttp3.Credentials; +import okhttp3.FormBody; +import okhttp3.Interceptor; +import okhttp3.OkHttpClient; +import okhttp3.OkHttpClient.Builder; +import okhttp3.Request; +import okhttp3.Response; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.ops.FragmentContext; +import org.apache.drill.exec.store.http.HttpAPIConfig; +import org.apache.drill.exec.store.http.HttpStoragePluginConfig; +import org.jetbrains.annotations.NotNull; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.util.Map; +import java.util.Objects; +import java.util.concurrent.TimeUnit; +import java.util.regex.Pattern; + + +/** + * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the getInputStream() + * method which accepts a url and opens an InputStream with that URL's contents. + */ +public class SimpleHttp { + private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class); + + private final OkHttpClient client; + + private final HttpStoragePluginConfig config; + + private final FragmentContext context; + + private final HttpAPIConfig apiConfig; + + public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) { +this.config = config; +this.context = context; +this.apiConfig = config.connections().get(connectionName); +client = setupHttpClient(); + } + + + + public InputStream getInputStream(String urlStr) { +Request.Builder requestBuilder; + +// The configuration does not allow for any other request types other than POST and GET. 
+if (apiConfig.method().equals("get")) { + // Handle GET requests + requestBuilder = new Request.Builder().url(urlStr); +} else { + // Handle POST requests + FormBody.Builder formBodyBuilder = buildPostBody(); + requestBuilder = new Request.Builder() +.url(urlStr) +.post(formBodyBuilder.build()); +} + +// Add headers to request +if (apiConfig.headers() != null) { + for (Map.Entry entry : apiConfig.headers().entrySet()) { +String key = entry.getKey(); +String value = entry.getValue(); +requestBuilder.addHeader(key, value); + } +} + +// Build the request object +Request request = requestBuilder.build(); +logger.debug("Headers: {}", request.headers()); + +try { + // Execute the request + Response response = client +.newCall(request) +.execute(); + + // If the request is unsuccessful, throw a UserException + if (!response.isSuccessful()) { +throw UserException + .dataReadError() + .message("Error retrieving data from HTTP Storage Plugin: " + response.code() + " " + response.message()) + .addContext("URL: ", urlStr) + .addContext("Response code: ", response.code()) + .build(logger); + } + logger.debug("HTTP Request for {} successful.", urlStr); + logger.debug("Response Headers: {} ", response.headers().toString()); + + // Return the InputStream of the response + return Objects.requireNonNull(response.body()).byteStream(); +} catch (IOException e) { + throw UserException +.dataReadError(e) +.message("Error retrieving data from HTTP Storage Plugin: %s", e.getMessage()) +.addContext("URL Requested:" + urlStr) +.build(logger); +} + } + + /** + * Function configures the OkHTTP3 server object with configuration info from the user. + * + * @return OkHttpClient configured server + */ + private OkHttpClient setupHttpClient
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390585371

## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/JSONRecordReader.java
## @@ -253,11 +268,20 @@ private void updateRunningCount() {
     runningRecordCount += recordCount;
   }

+  public void setInputStream(InputStream in) {
+    this.inputStream = in;
+  }
+
   @Override
   public void close() throws Exception {
     if (stream != null) {
       stream.close();
       stream = null;
     }
+
+    if (inputStream != null) {
+      inputStream.close();
+      inputStream = null;
+    }

Review comment: Doesn't the Jackson parser close the input stream when the parser itself is closed? At least, that is what I thought I found in my own JSON tests. If not, we've got a pile of open HDFS streams that we've never closed.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards, Apache Git Services
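For what it's worth, Jackson does close the source by default: `JsonParser.Feature.AUTO_CLOSE_SOURCE` is enabled unless explicitly turned off, so closing the parser closes the `InputStream` it was handed. The defensive close above is still safe either way, since calling `close()` on an already-closed stream is a no-op. A dependency-free sketch of that double-close question (the `TrackingInputStream` wrapper is made up for illustration):

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that records whether close() was ever called, standing in
// for the "did the parser already close my stream?" question from the review.
class TrackingInputStream extends FilterInputStream {
  boolean closed = false;

  TrackingInputStream(InputStream in) {
    super(in);
  }

  @Override
  public void close() throws IOException {
    closed = true;
    super.close();
  }
}

public class CloseDemo {
  public static void main(String[] args) throws IOException {
    TrackingInputStream in =
        new TrackingInputStream(new ByteArrayInputStream("{}".getBytes()));
    in.close();  // stands in for the parser auto-closing its source
    in.close();  // the reader's defensive second close: a harmless no-op
    System.out.println(in.closed);  // true
  }
}
```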
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390573286

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
## @@ -0,0 +1,240 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.http.util;
+
+import okhttp3.Cache;
+import okhttp3.Credentials;
+import okhttp3.FormBody;
+import okhttp3.Interceptor;
+import okhttp3.OkHttpClient;
+import okhttp3.OkHttpClient.Builder;
+import okhttp3.Request;
+import okhttp3.Response;
+
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.ExecConstants;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.http.HttpAPIConfig;
+import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
+import org.jetbrains.annotations.NotNull;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.InputStream;
+import java.util.Map;
+import java.util.Objects;
+import java.util.concurrent.TimeUnit;
+import java.util.regex.Pattern;
+
+/**
+ * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the
+ * getInputStream() method which accepts a url and opens an InputStream with that URL's contents.
+ */
+public class SimpleHttp {
+  private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class);
+
+  private final OkHttpClient client;
+  private final HttpStoragePluginConfig config;
+  private final FragmentContext context;
+  private final HttpAPIConfig apiConfig;
+
+  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) {
+    this.config = config;
+    this.context = context;
+    this.apiConfig = config.connections().get(connectionName);
+    client = setupHttpClient();
+  }
+
+  public InputStream getInputStream(String urlStr) {
+    Request.Builder requestBuilder;
+
+    // The configuration does not allow for any other request types other than POST and GET.
+    if (apiConfig.method().equals("get")) {
+      // Handle GET requests
+      requestBuilder = new Request.Builder().url(urlStr);
+    } else {
+      // Handle POST requests
+      FormBody.Builder formBodyBuilder = buildPostBody();
+      requestBuilder = new Request.Builder()
+        .url(urlStr)
+        .post(formBodyBuilder.build());
+    }

Review comment: Timeout?
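A fair catch: nothing in the snippet sets timeouts, so the client falls back on OkHttp's defaults (10-second connect/read/write timeouts in OkHttp 3, if memory serves), which can be short for slow REST endpoints. OkHttp's knobs live on `OkHttpClient.Builder` (`connectTimeout`, `readTimeout`, `writeTimeout`); as a dependency-free sketch of the same idea using the JDK 11 `java.net.http` client:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class TimeoutSketch {
  public static void main(String[] args) {
    // Connect timeout is configured on the client itself...
    HttpClient client = HttpClient.newBuilder()
        .connectTimeout(Duration.ofSeconds(5))
        .build();

    // ...while the response timeout is configured per request.
    HttpRequest request = HttpRequest.newBuilder(URI.create("http://example.com/"))
        .timeout(Duration.ofSeconds(30))
        .GET()
        .build();

    System.out.println(client.connectTimeout().get());  // PT5S
    System.out.println(request.timeout().get());        // PT30S
  }
}
```

No network traffic happens here: the builders only record the settings, which is why the values can be inspected directly.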
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390582369

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
[Duplicate quoted context elided; same SimpleHttp.java excerpt as quoted above. The review comment for this thread was not captured in the archive.]
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390556073

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
## @@ -0,0 +1,167 @@
(Apache license header elided)
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+
+import org.apache.drill.exec.physical.base.AbstractGroupScan;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.ScanStats;
+import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+@JsonTypeName("http-scan")
+public class HttpGroupScan extends AbstractGroupScan {
+  private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class);
+
+  private List<SchemaPath> columns;
+  private final HttpScanSpec httpScanSpec;
+  private final HttpStoragePluginConfig config;
+
+  public HttpGroupScan (
+      HttpStoragePluginConfig config,
+      HttpScanSpec scanSpec,
+      List<SchemaPath> columns) {
+    super("no-user");
+    this.config = config;
+    this.httpScanSpec = scanSpec;
+    this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : columns;
+  }
+
+  public HttpGroupScan(HttpGroupScan that) {
+    super(that);
+    config = that.config();
+    httpScanSpec = that.httpScanSpec();
+    columns = that.getColumns();
+  }
+
+  @JsonCreator
+  public HttpGroupScan(
+      @JsonProperty("config") HttpStoragePluginConfig config,
+      @JsonProperty("columns") List<SchemaPath> columns,
+      @JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec) {
+    super("no-user");
+    this.config = config;
+    this.columns = columns;
+    this.httpScanSpec = httpScanSpec;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() { return config; }
+
+  @JsonProperty("columns")
+  public List<SchemaPath> columns() { return columns; }
+
+  @JsonProperty("httpScanSpec")
+  public HttpScanSpec httpScanSpec() { return httpScanSpec; }
+
+  @Override
+  public void applyAssignments(List<DrillbitEndpoint> endpoints) {
+    logger.debug("HttpGroupScan applyAssignments");
+  }
+
+  @Override
+  @JsonIgnore
+  public int getMaxParallelizationWidth() {
+    return 1;
+  }
+
+  @Override
+  public boolean canPushdownProjects(List<SchemaPath> columns) {
+    return true;
+  }
+
+  @Override
+  public SubScan getSpecificScan(int minorFragmentId) {
+    logger.debug("HttpGroupScan getSpecificScan");
+    return new HttpSubScan(config, httpScanSpec, columns);
+  }
+
+  @Override
+  public GroupScan clone(List<SchemaPath> columns) {
+    logger.debug("HttpGroupScan clone {}", columns);
+    HttpGroupScan newScan = new HttpGroupScan(this);
+    newScan.columns = columns;
+    return newScan;
+  }
+
+  @Override
+  public String getDigest() {
+    return toString();
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List<PhysicalOperator> children) {
+    Preconditions.checkArgument(children.isEmpty());
+    return new HttpGroupScan(this);
+  }
+
+  @Override
+  public ScanStats getScanStats() {
+    int colCount = columns.size();
+    int estRowCount = 1;
+    int estDataSize = estRowCount * 200 * colCount;
+    int estCpuCost = 1;

Review comment: Just to be safe, I'd make CPU cost a function of data size. I think there are functions for this in the planner somewhere. See `DrillCostBase`.
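The suggestion can be sketched without the planner classes: scale CPU cost with the estimated data size instead of pinning it to 1. The scaling constant below is a made-up stand-in for whatever `DrillCostBase` provides, not Drill's actual value:

```java
public class ScanCostSketch {
  // Hypothetical per-byte CPU factor; DrillCostBase defines the real constants.
  static final double CPU_COST_PER_BYTE = 4.0;

  static double estimateCpuCost(int estRowCount, int colCount) {
    int estDataSize = estRowCount * 200 * colCount;  // ~200 bytes per column value
    return estDataSize * CPU_COST_PER_BYTE;          // cost now grows with data size
  }

  public static void main(String[] args) {
    System.out.println(estimateCpuCost(1, 10));  // 8000.0
  }
}
```

The payoff is that once columns are pushed down, `colCount` shrinks, the estimated data size shrinks, and the CPU cost estimate shrinks with it, so the planner can actually see the benefit of projection pushdown.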
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390572977

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
[Duplicate quoted context elided; this thread is attached to the header loop in getInputStream():]

+    // Add headers to request
+    if (apiConfig.headers() != null) {
+      for (Map.Entry<String, String> entry : apiConfig.headers().entrySet()) {
+        String key = entry.getKey();
+        String value = entry.getValue();
+        requestBuilder.addHeader(key, value);

Review comment: nit:
```
requestBuilder.addHeader(entry.getKey(), entry.getValue());
```
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390557489

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpRecordReader.java
## @@ -0,0 +1,53 @@
(Apache license header elided)
+package org.apache.drill.exec.store.http;
+
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.ops.FragmentContext;
+import org.apache.drill.exec.store.easy.json.JSONRecordReader;
+import org.apache.drill.exec.store.http.util.SimpleHttp;
+
+import java.io.InputStream;
+import java.util.List;
+
+public class HttpRecordReader extends JSONRecordReader {
+
+  private final HttpSubScan subScan;
+  private final SimpleHttp http;
+
+  public HttpRecordReader(FragmentContext context, List<SchemaPath> projectedColumns,
+      HttpStoragePluginConfig config, HttpSubScan subScan) {
+    super(context, projectedColumns);
+    this.subScan = subScan;
+    String connectionName = subScan.tableSpec().database();
+    this.http = new SimpleHttp(config, context, connectionName);
+    InputStream inputStream = getInputStream();
+    setInputStream(inputStream);
+  }
+
+  /**
+   * Executes the HTTP request and returns an InputStream to the retrieved data
+   * @return InputStream the InputStream containing the data
+   */
+  private InputStream getInputStream() {
+    String url = subScan.getFullURL();
+    return http.getInputStream(url);

Review comment: Originally wrote a note asking about error handling. Then, saw it was done in the `getInputStream` method. Maybe add a Javadoc `@throws UserException` to explain this fact.
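The suggested doc comment might look like the following; `UserException` is stubbed here purely so the sketch compiles on its own, and the method body is a stand-in for the real HTTP round trip:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

// Stub standing in for org.apache.drill.common.exceptions.UserException,
// only so this sketch is self-contained.
class UserException extends RuntimeException {
  UserException(String message) {
    super(message);
  }
}

public class ReaderSketch {
  /**
   * Executes the HTTP request and returns an InputStream to the retrieved data.
   *
   * @return InputStream the InputStream containing the data
   * @throws UserException if the request fails or the server returns a
   *         non-success status; unchecked, so it never appears in a throws
   *         clause, which is exactly why documenting it helps callers
   */
  InputStream getInputStream() {
    boolean requestFailed = true;  // stand-in for the real HTTP call
    if (requestFailed) {
      throw new UserException("Error retrieving data from HTTP Storage Plugin");
    }
    return new ByteArrayInputStream(new byte[0]);
  }

  public static void main(String[] args) {
    try {
      new ReaderSketch().getInputStream();
    } catch (UserException e) {
      System.out.println(e.getMessage());
    }
  }
}
```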
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390570782

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpSubScan.java
## @@ -0,0 +1,132 @@
(Apache license header elided)
+package org.apache.drill.exec.store.http;
+
+import java.util.Arrays;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Objects;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.exec.proto.UserBitShared.CoreOperatorType;
+import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
+import org.apache.drill.shaded.guava.com.google.common.collect.ImmutableSet;
+
+@JsonTypeName("http-sub-scan")
+public class HttpSubScan extends AbstractBase implements SubScan {
+
+  private final HttpScanSpec tableSpec;
+  private final HttpStoragePluginConfig config;
+  private final List<SchemaPath> columns;
+
+  @JsonCreator
+  public HttpSubScan(
+      @JsonProperty("config") HttpStoragePluginConfig config,
+      @JsonProperty("tableSpec") HttpScanSpec tableSpec,
+      @JsonProperty("columns") List<SchemaPath> columns) {
+    super("user-if-needed");
+    this.config = config;
+    this.tableSpec = tableSpec;
+    this.columns = columns;
+  }
+
+  @JsonProperty("tableSpec")
+  public HttpScanSpec tableSpec() {
+    return tableSpec;
+  }
+
+  @JsonProperty("columns")
+  public List<SchemaPath> columns() {
+    return columns;
+  }
+
+  @JsonProperty("config")
+  public HttpStoragePluginConfig config() {
+    return config;
+  }
+
+  @JsonIgnore
+  public String getURL() {
+    return tableSpec.getURL();
+  }
+
+  @JsonIgnore
+  public String getFullURL() {
+    String selectedConnection = tableSpec.database();
+    String url = config.connections().get(selectedConnection).url();
+    return url + tableSpec.tableName();
+  }
+
+  @Override
+  public <T, X, E extends Throwable> T accept(
+      PhysicalVisitor<T, X, E> physicalVisitor, X value) throws E {
+    return physicalVisitor.visitSubScan(this, value);
+  }
+
+  @Override
+  public PhysicalOperator getNewWithChildren(List<PhysicalOperator> children) {
+    return new HttpSubScan(config, tableSpec, columns);
+  }
+
+  @Override
+  @JsonIgnore
+  public int getOperatorType() {
+    return CoreOperatorType.HTTP_SUB_SCAN_VALUE;
+  }
+
+  @Override
+  public Iterator<PhysicalOperator> iterator() {
+    return ImmutableSet.<PhysicalOperator>of().iterator();
+  }
+
+  @Override
+  public String toString() {
+    return MoreObjects.toStringHelper(this)
+      .add("tableSpec", tableSpec)
+      .add("columns", columns)
+      .add("config", config)
+      .toString();
+  }
+
+  @Override
+  public int hashCode() {
+    return Arrays.hashCode(
+      new Object[]{tableSpec, columns, config});

Review comment: `Objects.hash(tableSpec, columns, config)`. Maybe this is old code?
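The two forms are interchangeable: `Objects.hash(x, y, z)` is specified to return exactly `Arrays.hashCode(new Object[]{x, y, z})`, so the varargs form changes readability, not behavior:

```java
import java.util.Arrays;
import java.util.Objects;

public class HashDemo {
  public static void main(String[] args) {
    Object tableSpec = "spec", columns = "cols", config = "conf";

    int viaArrays = Arrays.hashCode(new Object[] {tableSpec, columns, config});
    int viaObjects = Objects.hash(tableSpec, columns, config);

    // Objects.hash is documented to delegate to Arrays.hashCode, so these match.
    System.out.println(viaArrays == viaObjects);  // true
  }
}
```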
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390559029

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanBatchCreator.java
## @@ -0,0 +1,54 @@
(Apache license header elided)
+package org.apache.drill.exec.store.http;
+
+import java.util.List;
+
+import org.apache.drill.common.exceptions.ExecutionSetupException;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.exec.ops.ExecutorFragmentContext;
+import org.apache.drill.exec.physical.base.GroupScan;
+import org.apache.drill.exec.physical.impl.BatchCreator;
+import org.apache.drill.exec.physical.impl.ScanBatch;
+import org.apache.drill.exec.record.RecordBatch;
+import org.apache.drill.exec.store.RecordReader;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+import org.apache.drill.shaded.guava.com.google.common.collect.Lists;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class HttpScanBatchCreator implements BatchCreator<HttpSubScan> {
+  private static final Logger logger = LoggerFactory.getLogger(HttpScanBatchCreator.class);
+
+  @Override
+  public ScanBatch getBatch(ExecutorFragmentContext context, HttpSubScan subScan,
+      List<RecordBatch> children) throws ExecutionSetupException {
+    logger.debug("getBatch called");
+    Preconditions.checkArgument(children == null || children.isEmpty());
+
+    HttpStoragePluginConfig config = subScan.config();
+    List<RecordReader> readers = Lists.newArrayList();
+    List<SchemaPath> columns;
+
+    if ((columns = subScan.columns()) == null) {
+      columns = GroupScan.ALL_COLUMNS;
+    }

Review comment: Clever. But the idiomatic Java way to be clever is:
```
List<SchemaPath> columns = subScan.columns() == null
    ? GroupScan.ALL_COLUMNS : subScan.columns();
```
Note that this code does what was suggested earlier: handle the no-columns case here so that the cost calcs know if columns were pushed down.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
URL: https://github.com/apache/drill/pull/1892#discussion_r390579548

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
[Duplicate quoted context elided; same SimpleHttp.java excerpt as quoted above. The review comment for this thread was not captured in the archive.]
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390580075

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
## @@ -0,0 +1,240 @@

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one
 * or more contributor license agreements. See the NOTICE file
 * distributed with this work for additional information
 * regarding copyright ownership. The ASF licenses this file
 * to you under the Apache License, Version 2.0 (the
 * "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.drill.exec.store.http.util;

import okhttp3.Cache;
import okhttp3.Credentials;
import okhttp3.FormBody;
import okhttp3.Interceptor;
import okhttp3.OkHttpClient;
import okhttp3.OkHttpClient.Builder;
import okhttp3.Request;
import okhttp3.Response;

import org.apache.drill.common.exceptions.UserException;
import org.apache.drill.exec.ExecConstants;
import org.apache.drill.exec.ops.FragmentContext;
import org.apache.drill.exec.store.http.HttpAPIConfig;
import org.apache.drill.exec.store.http.HttpStoragePluginConfig;
import org.jetbrains.annotations.NotNull;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

/**
 * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the getInputStream()
 * method which accepts a url and opens an InputStream with that URL's contents.
 */
public class SimpleHttp {
  private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class);

  private final OkHttpClient client;

  private final HttpStoragePluginConfig config;

  private final FragmentContext context;

  private final HttpAPIConfig apiConfig;

  public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) {
    this.config = config;
    this.context = context;
    this.apiConfig = config.connections().get(connectionName);
    client = setupHttpClient();
  }

  public InputStream getInputStream(String urlStr) {
    Request.Builder requestBuilder;

    // The configuration does not allow for any other request types other than POST and GET.
    if (apiConfig.method().equals("get")) {
      // Handle GET requests
      requestBuilder = new Request.Builder().url(urlStr);
    } else {
      // Handle POST requests
      FormBody.Builder formBodyBuilder = buildPostBody();
      requestBuilder = new Request.Builder()
        .url(urlStr)
        .post(formBodyBuilder.build());
    }

    // Add headers to request
    if (apiConfig.headers() != null) {
      for (Map.Entry<String, String> entry : apiConfig.headers().entrySet()) {
        String key = entry.getKey();
        String value = entry.getValue();
        requestBuilder.addHeader(key, value);
      }
    }

    // Build the request object
    Request request = requestBuilder.build();
    logger.debug("Headers: {}", request.headers());

    try {
      // Execute the request
      Response response = client
        .newCall(request)
        .execute();

      // If the request is unsuccessful, throw a UserException
      if (!response.isSuccessful()) {
        throw UserException
          .dataReadError()
          .message("Error retrieving data from HTTP Storage Plugin: " + response.code() + " " + response.message())
          .addContext("URL: ", urlStr)
          .addContext("Response code: ", response.code())
          .build(logger);
      }
      logger.debug("HTTP Request for {} successful.", urlStr);
      logger.debug("Response Headers: {} ", response.headers().toString());

      // Return the InputStream of the response
      return Objects.requireNonNull(response.body()).byteStream();
    } catch (IOException e) {
      throw UserException
        .dataReadError(e)
        .message("Error retrieving data from HTTP Storage Plugin: %s", e.getMessage())
        .addContext("URL Requested:" + urlStr)
        .build(logger);
    }
  }

  /**
   * Function configures the OkHTTP3 server object with configuration info from the user.
   *
   * @return OkHttpClient configured server
   */
  private OkHttpClient setupHttpClient
```
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390583400

## File path: contrib/storage-http/src/test/java/org/apache/drill/exec/store/http/TestHttpPlugin.java
## @@ -0,0 +1,518 @@

```java
package org.apache.drill.exec.store.http;

import static org.apache.drill.test.rowSet.RowSetUtilities.mapValue;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertNull;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.fail;

import okhttp3.mockwebserver.MockResponse;
import okhttp3.mockwebserver.MockWebServer;
import okhttp3.mockwebserver.RecordedRequest;
import okio.Buffer;
import okio.Okio;
import org.apache.drill.common.types.TypeProtos;
import org.apache.drill.exec.physical.rowSet.RowSet;
import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
import org.apache.drill.exec.record.metadata.SchemaBuilder;
import org.apache.drill.exec.record.metadata.TupleMetadata;
import org.apache.drill.exec.store.StoragePluginRegistry;
import org.apache.drill.test.ClusterFixture;
import org.apache.drill.test.ClusterTest;
import org.apache.drill.test.rowSet.RowSetComparison;
import org.junit.BeforeClass;
import org.junit.Ignore;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import java.io.File;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

/**
 * This class tests the HTTP Storage plugin. Since the plugin makes use of REST requests, this
 * test class makes use of the okhttp3 MockWebServer to simulate a remote web server. There are
 * two unit tests that make remote REST calls, however these tests are ignored by default.
 *
 * The HTTP reader uses Drill's existing JSON reader class, so the unit tests focus on testing the plugin configurations
 * rather than how well it parses the JSON as this is tested elsewhere.
 */
public class TestHttpPlugin extends ClusterTest {
  private static final Logger logger = LoggerFactory.getLogger(TestHttpPlugin.class);

  private final String TEST_JSON_RESPONSE = "{\"results\":{\"sunrise\":\"6:13:58 AM\",\"sunset\":\"5:59:55 PM\",\"solar_noon\":\"12:06:56 PM\",\"day_length\":\"11:45:57\"," +
```

Review comment: Suggestion: place this text in a resource file and read it. Easier for us poor humans to read and modify.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
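The suggestion above can be sketched with a small helper that loads the expected JSON from the test classpath instead of concatenating it as a string literal. This is an illustrative, self-contained sketch, not Drill code; the `TestResources` class and the resource path are hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads expected-response JSON from the test classpath
// (e.g. a file under src/test/resources) rather than embedding it in the test.
class TestResources {
  static String load(String name) throws IOException {
    try (InputStream in = TestResources.class.getResourceAsStream(name)) {
      if (in == null) {
        throw new IOException("Resource not found on classpath: " + name);
      }
      return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }
  }
}
```

A test would then call something like `String json = TestResources.load("/data/response.json");` (hypothetical path), which keeps the long JSON readable and editable as a plain file.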
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390548155

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java
## @@ -0,0 +1,170 @@

```java
package org.apache.drill.exec.store.http;

import java.util.Arrays;
import java.util.List;
import java.util.Objects;

import com.fasterxml.jackson.annotation.JacksonInject;
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonTypeName;
import org.apache.drill.common.expression.SchemaPath;

import org.apache.drill.exec.physical.base.AbstractGroupScan;
import org.apache.drill.exec.physical.base.GroupScan;
import org.apache.drill.exec.physical.base.PhysicalOperator;
import org.apache.drill.exec.physical.base.ScanStats;
import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty;
import org.apache.drill.exec.physical.base.SubScan;
import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint;
import org.apache.drill.exec.store.StoragePluginRegistry;
import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects;
import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@JsonTypeName("http-scan")
public class HttpGroupScan extends AbstractGroupScan {
  private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class);

  private List<SchemaPath> columns;
  private final HttpScanSpec httpScanSpec;
  private final HttpStoragePluginConfig config;

  public HttpGroupScan (
      HttpStoragePluginConfig config,
      HttpScanSpec scanSpec,
      List<SchemaPath> columns
  ) {
    super("no-user");
    this.config = config;
    this.httpScanSpec = scanSpec;
    this.columns = columns == null || columns.size() == 0 ? ALL_COLUMNS : columns;
  }

  public HttpGroupScan(HttpGroupScan that) {
    super(that);
    config = that.config();
    httpScanSpec = that.httpScanSpec();
    columns = that.getColumns();
  }

  @JsonCreator
  public HttpGroupScan(
      @JsonProperty("config") HttpStoragePluginConfig config,
      @JsonProperty("columns") List<SchemaPath> columns,
      @JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec,
      @JacksonInject StoragePluginRegistry engineRegistry
  ) {
    super("no-user");
    this.config = config;
    this.columns = columns;
    this.httpScanSpec = httpScanSpec;
  }

  @JsonProperty("config")
  public HttpStoragePluginConfig config() { return config; }

  @JsonProperty("columns")
  public List<SchemaPath> columns() { return columns; }

  @JsonProperty("httpScanSpec")
  public HttpScanSpec httpScanSpec() { return httpScanSpec; }

  @Override
  public void applyAssignments(List<DrillbitEndpoint> endpoints) {
    logger.debug("HttpGroupScan applyAssignments");
  }

  @Override
  @JsonIgnore
  public int getMaxParallelizationWidth() {
    return 0;
  }

  @Override
  public boolean canPushdownProjects(List<SchemaPath> columns) {
    return true;
  }

  @Override
  public SubScan getSpecificScan(int minorFragmentId) {
    logger.debug("HttpGroupScan getSpecificScan");
    return new HttpSubScan(config, httpScanSpec, columns);
  }

  @Override
  public GroupScan clone(List<SchemaPath> columns) {
    logger.debug("HttpGroupScan clone {}", columns);
    HttpGroupScan newScan = new HttpGroupScan(this);
    newScan.columns = columns;
```

Review comment: This is messy. The constructor would be `(HttpGroupScan from, List<SchemaPath> columns)`. There is a set of these conversion constructors that each plugin needs.
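The conversion-constructor pattern suggested above can be shown in isolation. `FakeGroupScan` below is a simplified, self-contained stand-in (a plain `String` config and `List<String>` columns rather than Drill's `HttpStoragePluginConfig` and `List<SchemaPath>`); the class and method names are illustrative only.

```java
import java.util.List;

// Simplified stand-in for a group scan: all fields can stay final because a
// (from, columns) conversion constructor replaces clone-then-mutate.
class FakeGroupScan {
  final String config;          // stands in for HttpStoragePluginConfig
  final List<String> columns;   // stands in for List<SchemaPath>

  FakeGroupScan(String config, List<String> columns) {
    this.config = config;
    this.columns = columns;
  }

  // The suggested conversion constructor: copy everything from 'from'
  // except the columns, which are supplied by the caller.
  FakeGroupScan(FakeGroupScan from, List<String> columns) {
    this.config = from.config;
    this.columns = columns;
  }

  // clone(columns) then needs no field assignment after construction.
  FakeGroupScan cloneWith(List<String> columns) {
    return new FakeGroupScan(this, columns);
  }
}
```

The payoff is that `columns` no longer needs to be a mutable field, which removes the "construct, then reach in and overwrite" step the review flags as messy.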
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390561752

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePlugin.java
## @@ -0,0 +1,71 @@

```java
package org.apache.drill.exec.store.http;

import org.apache.calcite.schema.SchemaPlus;
import org.apache.drill.common.JSONOptions;
import org.apache.drill.exec.physical.base.AbstractGroupScan;
import org.apache.drill.exec.server.DrillbitContext;
import org.apache.drill.exec.store.AbstractStoragePlugin;
import org.apache.drill.exec.store.SchemaConfig;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.core.type.TypeReference;
import java.io.IOException;

public class HttpStoragePlugin extends AbstractStoragePlugin {

  private final DrillbitContext context;

  private final HttpStoragePluginConfig engineConfig;

  private final HttpSchemaFactory schemaFactory;

  public HttpStoragePlugin(HttpStoragePluginConfig configuration, DrillbitContext context, String name) {
    super(context, name);
    engineConfig = configuration;
    schemaFactory = new HttpSchemaFactory(this, name);
    this.context = context;
  }

  @Override
  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) {
    schemaFactory.registerSchemas(schemaConfig, parent);
  }

  @Override
  public HttpStoragePluginConfig getConfig() {
    return engineConfig;
  }

  @Override
  public DrillbitContext getContext() {
    return context;
  }

  @Override
  public boolean supportsRead() {
    return true;
  }

  @Override
  public AbstractGroupScan getPhysicalScan(String userName, JSONOptions selection) throws IOException {
    HttpScanSpec scanSpec = selection.getListWith(new ObjectMapper(), new TypeReference<HttpScanSpec>() {});
```

Review comment: So I discovered that the right way to do this is not to create your own mapper, but rather to use the intuitively obvious:

```java
context.getLpPersistence().getMapper();
```

which has all the Drill Jackson goodies built in. Here `context` is the `DrillbitContext` from the base class.
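Why a fresh mapper is the wrong choice can be shown with a plain-Java analog. `FakeMapper` below stands in for Jackson's `ObjectMapper`, and `FakeLpPersistence` for Drill's `LogicalPlanPersistence`; the module name is made up. The point is only that the shared instance carries registrations a freshly constructed one lacks.

```java
import java.util.HashSet;
import java.util.Set;

// Stand-in for Jackson's ObjectMapper: it holds a set of registered modules.
class FakeMapper {
  final Set<String> registeredModules = new HashSet<>();
}

// Stand-in for LogicalPlanPersistence: it configures one shared mapper once
// and hands out that same instance to every caller.
class FakeLpPersistence {
  private static final FakeMapper MAPPER = new FakeMapper();
  static {
    MAPPER.registeredModules.add("drill-logical-types"); // hypothetical module
  }
  static FakeMapper getMapper() {
    return MAPPER;
  }
}
```

A `new FakeMapper()` starts empty, mirroring how `new ObjectMapper()` in `getPhysicalScan()` would be missing whatever serializers Drill registered on its shared mapper.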
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390580267

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390565084

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java
## @@ -0,0 +1,90 @@

```java
package org.apache.drill.exec.store.http;

import org.apache.drill.common.map.CaseInsensitiveMap;
import org.apache.drill.common.logical.StoragePluginConfigBase;
import org.apache.drill.shaded.guava.com.google.common.base.Objects;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonTypeName;

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

@JsonTypeName(HttpStoragePluginConfig.NAME)
public class HttpStoragePluginConfig extends StoragePluginConfigBase {
  private static final Logger logger = LoggerFactory.getLogger(HttpStoragePluginConfig.class);

  public static final String NAME = "http";

  public final Map<String, HttpAPIConfig> connections;

  public final boolean cacheResults;

  public int timeout;

  @JsonCreator
  public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean cacheResults,
                                 @JsonProperty("connections") Map<String, HttpAPIConfig> connections,
                                 @JsonProperty("timeout") int timeout) {
    logger.debug("Initialize HttpStoragePluginConfig {}", connections);
```

Review comment: I see many of these. If you are debugging via log messages, you can save yourself a huge amount of time by debugging via unit tests instead.
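The advice above can be made concrete with a small sketch: instead of a `logger.debug()` call in the constructor, a test constructs the object and asserts on the result. `FakePluginConfig` is a self-contained stand-in for `HttpStoragePluginConfig`, not Drill's class.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Stand-in config: no debug logging in the constructor. Its construction
// behavior (e.g. null connections become an empty map) is verified by
// assertions in a test instead of being eyeballed in a log file.
class FakePluginConfig {
  final boolean cacheResults;
  final Map<String, String> connections;
  final int timeout;

  FakePluginConfig(boolean cacheResults, Map<String, String> connections, int timeout) {
    this.cacheResults = cacheResults;
    this.connections = connections == null
        ? Collections.<String, String>emptyMap()
        : new HashMap<>(connections);
    this.timeout = timeout;
  }
}
```

The test assertions below are the replacement for the debug statement: they document and check the constructor's contract on every build, not just when someone happens to read the log.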
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390568363

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpSchemaFactory.java
## @@ -0,0 +1,105 @@

```java
package org.apache.drill.exec.store.http;

import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

import org.apache.calcite.schema.SchemaPlus;
import org.apache.drill.common.exceptions.DrillRuntimeException;
import org.apache.drill.common.exceptions.UserException;
import org.apache.drill.exec.store.AbstractSchema;
import org.apache.drill.exec.store.AbstractSchemaFactory;
import org.apache.drill.exec.store.SchemaConfig;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class HttpSchemaFactory extends AbstractSchemaFactory {
  private static final Logger logger = LoggerFactory.getLogger(HttpSchemaFactory.class);

  private final HttpStoragePlugin plugin;

  public HttpSchemaFactory(HttpStoragePlugin plugin, String schemaName) {
    super(schemaName);
    this.plugin = plugin;
  }

  @Override
  public void registerSchemas(SchemaConfig schemaConfig, SchemaPlus parent) {
    HttpSchema schema = new HttpSchema(getName());
    logger.debug("Registering {} {}", schema.getName(), schema.toString());

    SchemaPlus schemaPlus = parent.add(getName(), schema);
    schema.setHolder(schemaPlus);
  }

  class HttpSchema extends AbstractSchema {

    public HttpSchema(String name) {
      super(Collections.emptyList(), name);
    }

    void setHolder(SchemaPlus plusOfThis) {
      for (String s : getSubSchemaNames()) {
        plusOfThis.add(s, getSubSchemaKnownExists(s));
      }
    }

    @Override
    public Set<String> getSubSchemaNames() {
      HttpStoragePluginConfig config = plugin.getConfig();
      Map<String, HttpAPIConfig> connections = config.connections();
      Set<String> subSchemaNames = new HashSet<>();

      // Get the possible subschemas.
      for (Map.Entry<String, HttpAPIConfig> entry : connections.entrySet()) {
        subSchemaNames.add(entry.getKey());
      }
      return subSchemaNames;
    }

    @Override
    public AbstractSchema getSubSchema(String name) {
      try {
        if (!plugin.getConfig().connections().containsKey(name)) {
          throw UserException
            .connectionError()
            .message("API '{}' does not exist in HTTP Storage plugin '{}'", name, getName())
            .build(logger);
        }
        return getSubSchemaKnownExists(name);
      } catch (Exception e) {
        throw new DrillRuntimeException(e);
```

Review comment: Was this a copy/paste? If we go to the trouble of throwing a `UserException`, we should throw it, not catch it and wrap it in another kind of exception (which will later be wrapped in a `UserException`). Just let the `UserException` be thrown. Also, nit, reverse the polarity: `if (...containsKey...) { return get... } else { throw ... }`
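Both points in the review above can be sketched in a self-contained form. `FakeSchema` is a simplified stand-in for the Drill schema class, and `IllegalArgumentException` stands in for `UserException.connectionError()`; names are illustrative only.

```java
import java.util.Map;

// Stand-in schema lookup showing the suggested shape: positive case first
// (reversed polarity), and the exception thrown directly rather than
// caught and re-wrapped in another exception type.
class FakeSchema {
  private final Map<String, String> connections;

  FakeSchema(Map<String, String> connections) {
    this.connections = connections;
  }

  String getSubSchema(String name) {
    if (connections.containsKey(name)) {
      return connections.get(name);
    }
    // IllegalArgumentException stands in for UserException here.
    throw new IllegalArgumentException(
        "API '" + name + "' does not exist in HTTP Storage plugin");
  }
}
```

With no surrounding try/catch, the intended exception reaches the caller intact instead of arriving wrapped inside a generic runtime exception.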
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390569328

## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePlugin.java

```java
  @Override
  public DrillbitContext getContext() {
    return context;
  }
```

Review comment: Repeats method from base class; remove.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390575455 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http.util; + +import okhttp3.Cache; +import okhttp3.Credentials; +import okhttp3.FormBody; +import okhttp3.Interceptor; +import okhttp3.OkHttpClient; +import okhttp3.OkHttpClient.Builder; +import okhttp3.Request; +import okhttp3.Response; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.ops.FragmentContext; +import org.apache.drill.exec.store.http.HttpAPIConfig; +import org.apache.drill.exec.store.http.HttpStoragePluginConfig; +import org.jetbrains.annotations.NotNull; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.util.Map; +import java.util.Objects; +import java.util.concurrent.TimeUnit; +import java.util.regex.Pattern; + + +/** + * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the getInputStream() + * method which accepts a url and opens an InputStream with that URL's contents. + */ +public class SimpleHttp { + private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class); + + private final OkHttpClient client; + + private final HttpStoragePluginConfig config; + + private final FragmentContext context; + + private final HttpAPIConfig apiConfig; + + public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) { +this.config = config; +this.context = context; +this.apiConfig = config.connections().get(connectionName); +client = setupHttpClient(); + } + + + + public InputStream getInputStream(String urlStr) { +Request.Builder requestBuilder; + +// The configuration does not allow for any other request types other than POST and GET. 
+if (apiConfig.method().equals("get")) { + // Handle GET requests + requestBuilder = new Request.Builder().url(urlStr); +} else { + // Handle POST requests + FormBody.Builder formBodyBuilder = buildPostBody(); + requestBuilder = new Request.Builder() +.url(urlStr) Review comment: Looks like both paths do `new Request.Builder().url(urlStr)`. Maybe put only the `.post()` in the post-specific block. If so, maybe check for `post`, instead of `! get`.
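The suggested refactor can be sketched as follows. To keep the sketch self-contained it uses the JDK's `HttpRequest` builder rather than OkHttp (so the class and method names here are stand-ins, not the PR's code), but the shape is the same: build the common part once, add only the body in the post-specific branch, and test explicitly for `post` rather than "not `get`".

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class RequestBuilderSketch {

  // Shared builder for both paths; only the POST body is branch-specific,
  // and the branch tests for "post" explicitly rather than "not get".
  static HttpRequest buildRequest(String method, String urlStr, String postBody) {
    HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(urlStr));
    if ("post".equalsIgnoreCase(method)) {
      builder.POST(HttpRequest.BodyPublishers.ofString(postBody));
    }
    // GET is the builder's default, so no else branch is needed.
    return builder.build();
  }

  public static void main(String[] args) {
    System.out.println(buildRequest("get", "https://example.com/json", null).method());
    System.out.println(buildRequest("post", "https://example.com/json", "key1=value1").method());
  }
}
```

The same shape applies to OkHttp: hoist `new Request.Builder().url(urlStr)` out of the branch and keep only `.post(formBodyBuilder.build())` inside it.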
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390550604 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java ## @@ -0,0 +1,137 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.List; +import org.apache.drill.common.expression.SchemaPath; + +import org.apache.drill.exec.physical.base.AbstractGroupScan; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.base.ScanStats; +import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty; +import org.apache.drill.exec.physical.base.SubScan; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class HttpGroupScan extends AbstractGroupScan { + private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class); + + private final List columns; + private final HttpScanSpec httpScanSpec; + private final HttpStoragePluginConfig httpStoragePluginConfig; + private boolean filterPushedDown = true; + + public HttpGroupScan ( +HttpStoragePluginConfig config, +HttpScanSpec scanSpec, +List columns + ) { +super("no-user"); +this.httpStoragePluginConfig = config; +this.httpScanSpec = scanSpec; +this.columns = columns == null || columns.size() == 0 ? 
ALL_COLUMNS : columns; + } + + public HttpGroupScan(HttpGroupScan that) { +super(that); +httpStoragePluginConfig = that.getStorageConfig(); +httpScanSpec = that.getScanSpec(); +columns = that.getColumns(); + } + + @Override + public void applyAssignments(List<DrillbitEndpoint> endpoints) { +logger.debug("HttpGroupScan applyAssignments"); + } + + @Override + public int getMaxParallelizationWidth() { +return 1; + } + + + @Override + public boolean canPushdownProjects(List<SchemaPath> columns) { +return true; + } + + @Override + public SubScan getSpecificScan(int minorFragmentId) { +logger.debug("HttpGroupScan getSpecificScan"); +return new HttpSubScan(httpStoragePluginConfig, httpScanSpec, columns); + } + + @Override + public GroupScan clone(List<SchemaPath> columns) { +logger.debug("HttpGroupScan clone {}", columns); +return new HttpGroupScan(this); + } + + @Override + public String getDigest() { +return toString(); + } + + @Override + public List<SchemaPath> getColumns() { +return columns; + } + + @Override + public PhysicalOperator getNewWithChildren(List<PhysicalOperator> children) { +Preconditions.checkArgument(children.isEmpty()); +return new HttpGroupScan(this); + } + + @Override + public ScanStats getScanStats() { Review comment: This is obviously a vague, tricky, messy issue. There is no a-priori way to know that a particular API will return 1 vs 1 million rows. (Think of the Drill REST API. Quick, how many rows will it return if you know nothing about the data source or query?) Fortunately, the problem can be broken down into two groups: "large" or "small". Large results should be hash partitioned if joined. Small results can be broadcast to all join nodes. Maybe add an "expectedRowCount" parameter. If missing, assume 10K. Now, trying to explain this issue to the poor user will be a challenge. Would be best if we could simply mark the result as eligible for broadcast or not, but I'm not sure how we might do that.
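The two-bucket sizing idea above might look like the following sketch. Both the `expectedRowCount` connection parameter and the 100K broadcast cutoff are hypothetical, taken from the suggestion in the comment rather than from the PR itself.

```java
public class RowEstimateSketch {

  static final long DEFAULT_ROW_COUNT = 10_000L;      // the "assume 10K" default from the comment
  static final long BROADCAST_THRESHOLD = 100_000L;   // hypothetical "small result" cutoff

  // expectedRowCount would be a new, optional connection parameter; null means unset.
  static long estimatedRows(Long expectedRowCount) {
    return expectedRowCount == null ? DEFAULT_ROW_COUNT : expectedRowCount;
  }

  // "Small" results could be broadcast to all join nodes; "large" ones hash-partitioned.
  static boolean broadcastEligible(Long expectedRowCount) {
    return estimatedRows(expectedRowCount) <= BROADCAST_THRESHOLD;
  }
}
```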
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390559726 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanSpec.java ## @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; + +@JsonTypeName("http-scan-spec") +public class HttpScanSpec { + + protected final String schemaName; + + protected final String database; + + protected final String tableName; + + protected final HttpStoragePluginConfig config; + + @JsonCreator + public HttpScanSpec(@JsonProperty("schemaName") String schemaName, + @JsonProperty("database") String database, + @JsonProperty("tableName") String tableName, + @JsonProperty("config") HttpStoragePluginConfig config) { +this.schemaName = schemaName; +this.database = database; +this.tableName = tableName; +this.config = config; + } + + @JsonProperty("database") + public String database() { +return database; + } + + @JsonProperty("tableName") + public String tableName() { +return tableName; + } + + @JsonIgnore + public String getURL() { +return database; + } + + @Override + public String toString() { +return MoreObjects.toStringHelper(this) Review comment: Should be in Drill's "plan format" as it will appear as part of the scan spec in the logical plan.
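A minimal stand-in for what a plan-format `toString()` would emit. The field names mirror `HttpScanSpec`, and the `Name [key=value, ...]` shape follows the convention Drill's plan output uses; the exact rendering in Drill may differ.

```java
public class PlanStringSketch {

  // Renders the scan spec the way it would appear inside a logical plan:
  // a single "Name [key=value, ...]" line rather than a multi-line dump.
  static String toPlanString(String schemaName, String database, String tableName) {
    return "HttpScanSpec [schemaName=" + schemaName
        + ", database=" + database
        + ", tableName=" + tableName + "]";
  }
}
```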
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390552671 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.Arrays; +import java.util.List; +import java.util.Objects; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.common.expression.SchemaPath; + +import org.apache.drill.exec.physical.base.AbstractGroupScan; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.base.ScanStats; +import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty; +import org.apache.drill.exec.physical.base.SubScan; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +@JsonTypeName("http-scan") +public class HttpGroupScan extends AbstractGroupScan { + private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class); + + private List columns; + private final HttpScanSpec httpScanSpec; + private final HttpStoragePluginConfig config; + + public HttpGroupScan ( +HttpStoragePluginConfig config, +HttpScanSpec scanSpec, +List columns + ) { +super("no-user"); +this.config = config; +this.httpScanSpec = scanSpec; +this.columns = columns == null || columns.size() == 0 ? 
ALL_COLUMNS : columns; + } + + public HttpGroupScan(HttpGroupScan that) { +super(that); +config = that.config(); +httpScanSpec = that.httpScanSpec(); +columns = that.getColumns(); + } + + @JsonCreator + public HttpGroupScan( +@JsonProperty("config") HttpStoragePluginConfig config, +@JsonProperty("columns") List columns, +@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec + ) { +super("no-user"); +this.config = config; +this.columns = columns; +this.httpScanSpec = httpScanSpec; + } + + @JsonProperty("config") + public HttpStoragePluginConfig config() { return config; } + + @JsonProperty("columns") + public List columns() { return columns; } + + @JsonProperty("httpScanSpec") + public HttpScanSpec httpScanSpec() { return httpScanSpec; } + + @Override + public void applyAssignments(List endpoints) { +logger.debug("HttpGroupScan applyAssignments"); + } + + @Override + @JsonIgnore + public int getMaxParallelizationWidth() { +return 1; + } + + @Override + public boolean canPushdownProjects(List columns) { +return true; + } + + @Override + public SubScan getSpecificScan(int minorFragmentId) { +logger.debug("HttpGroupScan getSpecificScan"); +return new HttpSubScan(config, httpScanSpec, columns); + } + + @Override + public GroupScan clone(List columns) { +logger.debug("HttpGroupScan clone {}", columns); +HttpGroupScan newScan = new HttpGroupScan(this); +newScan.columns = columns; +return newScan; + } + + @Override + public String getDigest() { +return toString(); + } + + @Override + public PhysicalOperator getNewWithChildren(List children) { +Preconditions.checkArgument(children.isEmpty()); +return new HttpGroupScan(this); + } + + @Override + public ScanStats getScanStats() { +int colCount = columns.size(); +int estRowCount = 1; +int estDataSize = estRowCount * 200 * colCount; Review comment: This, unfortunately, does the *opposite* of what you want. When Drill does project, data size is 0. When you do project, data size is non-zero. 
Calcite will choose the cheapest, hence it will let Drill do the project. What you want is: if the columns are provided, reduce the data size estimate.
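The fix described above amounts to making the data-size estimate shrink when columns are pushed down, so that Calcite sees the pushed-down project as the cheaper plan. A sketch of just that calculation, with an invented 10-column default width for the unprojected case (the PR's 200-bytes-per-column figure is reused):

```java
public class ScanStatsSketch {

  static final int BYTES_PER_COLUMN = 200;   // per-column size estimate reused from the PR
  static final int ASSUMED_TABLE_WIDTH = 10; // hypothetical width when projection is unknown

  // A scan with pushed-down columns must report a *smaller* data size than the
  // unprojected scan, otherwise Calcite prefers to let Drill's own Project
  // operator do the work.
  static long estimatedDataSize(long estRowCount, Integer projectedColumns) {
    int colCount = (projectedColumns == null) ? ASSUMED_TABLE_WIDTH : projectedColumns;
    return estRowCount * BYTES_PER_COLUMN * colCount;
  }
}
```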
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390531796 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input however in the future, it may be possible to + add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0` which is no timeout. + +### Configuring the API Connections +The HTTP Storage plugin allows you to configure multiple APIS which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration +. Next give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication. 
This field allows you to define key/value pairs which are submitted with the http request +. The format is: +```json +headers: { + "key1":, "Value1", + "key2", "Value2" +} +``` +* `authType`: If your API requires authentication, specify the authentication type. At the time of implementation, the plugin only supports basic authentication, however, the + plugin will likely support OAUTH2 in the future. Defaults to `none`. If the `authType` is set to `basic`, `username` and `password` must be set in the configuration as well. + * `username`: The username for basic authentication. + * `password`: The password for basic authentication. + * `postBody`: Contains data, in the form of key value pairs, which are sent during a `POST` request. Post body should be in the form: + ``` +key1=value1 +key2=value2 +``` + +## Usage: +This plugin is different from other plugins in that it the table component of the `FROM` clause is different. In normal Drill queries, the `FROM` clause is constructed as follows: +```sql +FROM .. +``` +For example, you might have: +```sql +FROM dfs.test.`somefile.csv` + +-- or + +FROM mongo.stats.sales_data +``` + +The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your REST call. The structure is: +```sql +FROM .. 
+--Actual example: + FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` +``` + + +## Examples: +### Example 1: Reference Data, A Sunrise/Sunset API +The API sunrise-sunset.org returns data in the following format: + + ```json + { + "results": + { + "sunrise":"7:27:02 AM", + "sunset":"5:05:55 PM", + "solar_noon":"12:16:28 PM", + "day_length":"9:38:53", + "civil_twilight_begin":"6:58:14 AM", + "civil_twilight_end":"5:34:43 PM", + "nautical_twilight_begin":"6:25:47 AM", + "nautical_twilight_end":"6:07:10 PM", + "astronomical_twilight_begin":"5:54:14 AM", + "astronomical_twilight_end":"6:38:43 PM" + }, + "status":"OK" + } + } +``` +To query this API, set the configuration as follows: + +```json +{ + { + "type": "http", + "cacheResults": false, + "enabled": true, + "timeout" 5, Review comment: Nit: missing colon after `"timeout"`.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390533239 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input however in the future, it may be possible to + add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0` which is no timeout. + +### Configuring the API Connections +The HTTP Storage plugin allows you to configure multiple APIS which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration +. Next give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication. 
This field allows you to define key/value pairs which are submitted with the http request +. The format is: +```json +headers: { + "key1":, "Value1", + "key2", "Value2" +} +``` +* `authType`: If your API requires authentication, specify the authentication type. At the time of implementation, the plugin only supports basic authentication, however, the + plugin will likely support OAUTH2 in the future. Defaults to `none`. If the `authType` is set to `basic`, `username` and `password` must be set in the configuration as well. + * `username`: The username for basic authentication. + * `password`: The password for basic authentication. + * `postBody`: Contains data, in the form of key value pairs, which are sent during a `POST` request. Post body should be in the form: + ``` +key1=value1 +key2=value2 +``` + +## Usage: +This plugin is different from other plugins in that it the table component of the `FROM` clause is different. In normal Drill queries, the `FROM` clause is constructed as follows: +```sql +FROM .. +``` +For example, you might have: +```sql +FROM dfs.test.`somefile.csv` + +-- or + +FROM mongo.stats.sales_data +``` + +The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your REST call. The structure is: +```sql +FROM .. 
+--Actual example: + FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` +``` + + +## Examples: +### Example 1: Reference Data, A Sunrise/Sunset API +The API sunrise-sunset.org returns data in the following format: + + ```json + { + "results": + { + "sunrise":"7:27:02 AM", + "sunset":"5:05:55 PM", + "solar_noon":"12:16:28 PM", + "day_length":"9:38:53", + "civil_twilight_begin":"6:58:14 AM", + "civil_twilight_end":"5:34:43 PM", + "nautical_twilight_begin":"6:25:47 AM", + "nautical_twilight_end":"6:07:10 PM", + "astronomical_twilight_begin":"5:54:14 AM", + "astronomical_twilight_end":"6:38:43 PM" + }, + "status":"OK" + } + } +``` +To query this API, set the configuration as follows: + +```json +{ + { + "type": "http", + "cacheResults": false, + "enabled": true, + "timeout" 5, + "connections": { + "sunrise": { + "url": "https://api.sunrise-sunset.org/", + "method": "get", + "headers": null, + "authType": "none", + "userName": null, + "password": null, + "postBody": null + } + }, +} +``` +Then, to execute a query: +```sql +SELECT api_results.results.sunrise AS sunrise, +api_results.results.sunset AS sunset +FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` AS api_results; Review comment: This is screaming out for filter push-down. Let's do this: get the basic version in so we have a firm foundation. Then, add filter push-down, using this plugin as the primary use case.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390574529 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/util/SimpleHttp.java ## @@ -0,0 +1,240 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http.util; + +import okhttp3.Cache; +import okhttp3.Credentials; +import okhttp3.FormBody; +import okhttp3.Interceptor; +import okhttp3.OkHttpClient; +import okhttp3.OkHttpClient.Builder; +import okhttp3.Request; +import okhttp3.Response; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.exec.ExecConstants; +import org.apache.drill.exec.ops.FragmentContext; +import org.apache.drill.exec.store.http.HttpAPIConfig; +import org.apache.drill.exec.store.http.HttpStoragePluginConfig; +import org.jetbrains.annotations.NotNull; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.IOException; +import java.io.InputStream; +import java.util.Map; +import java.util.Objects; +import java.util.concurrent.TimeUnit; +import java.util.regex.Pattern; + + +/** + * This class performs the actual HTTP requests for the HTTP Storage Plugin. The core method is the getInputStream() + * method which accepts a url and opens an InputStream with that URL's contents. + */ +public class SimpleHttp { + private static final Logger logger = LoggerFactory.getLogger(SimpleHttp.class); + + private final OkHttpClient client; + + private final HttpStoragePluginConfig config; + + private final FragmentContext context; + + private final HttpAPIConfig apiConfig; + + public SimpleHttp(HttpStoragePluginConfig config, FragmentContext context, String connectionName) { +this.config = config; +this.context = context; +this.apiConfig = config.connections().get(connectionName); +client = setupHttpClient(); + } + + + + public InputStream getInputStream(String urlStr) { +Request.Builder requestBuilder; + +// The configuration does not allow for any other request types other than POST and GET. 
+if (apiConfig.method().equals("get")) { + // Handle GET requests + requestBuilder = new Request.Builder().url(urlStr); +} else { + // Handle POST requests + FormBody.Builder formBodyBuilder = buildPostBody(); + requestBuilder = new Request.Builder() +.url(urlStr) +.post(formBodyBuilder.build()); +} + +// Add headers to request +if (apiConfig.headers() != null) { + for (Map.Entry<String, String> entry : apiConfig.headers().entrySet()) { +String key = entry.getKey(); +String value = entry.getValue(); +requestBuilder.addHeader(key, value); + } +} + +// Build the request object +Request request = requestBuilder.build(); +logger.debug("Headers: {}", request.headers()); Review comment: In a busy distributed system, when we look at the logs, there will be no easy way to associate the headers here with anything. Do we elsewhere emit the URL? Then, we can at least hunt down the URL using the fragment ID, and use the same fragment ID to hunt down the headers. Or, add the URL here.
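One way to act on the suggestion is to put the URL in the same debug line as the headers, so the two can be correlated in a busy distributed log. A sketch reduced to building the message string (the logger wiring and the OkHttp types are elided; the method name is a stand-in):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RequestLogSketch {

  // One log line carrying both the URL and the headers, so a reader of the
  // logs never has to join two separate debug lines by fragment ID.
  static String describeRequest(String url, Map<String, String> headers) {
    return String.format("Request to %s with headers %s", url, headers);
  }

  public static void main(String[] args) {
    Map<String, String> headers = new LinkedHashMap<>();
    headers.put("Authorization", "Basic ...");
    System.out.println(describeRequest("https://example.com/json", headers));
  }
}
```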
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390535951 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.http; Review comment: HTTP could fetch movies, images, CSV files, Parquet files... pretty much anything. REST, however, is a request, with data encoded in the URL query string (usually, or in a POST), with response in the form of JSON. People wage huge battles about the "one true, pure REST", but most folks are pragmatic. This code fits the pragmatic description. If we did eventually do a file-over-http plugin what would we call that if not `http`? `filehttp`? If so, then wouldn't this be `resthttp`? Or, since REST implies HTTP, maybe just `rest`?
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390564687 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import org.apache.drill.common.map.CaseInsensitiveMap; +import org.apache.drill.common.logical.StoragePluginConfigBase; +import org.apache.drill.shaded.guava.com.google.common.base.Objects; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + + +@JsonTypeName(HttpStoragePluginConfig.NAME) +public class HttpStoragePluginConfig extends StoragePluginConfigBase { + private static final Logger logger = LoggerFactory.getLogger(HttpStoragePluginConfig.class); + + public static final String NAME = "http"; + + public final Map<String, HttpAPIConfig> connections; + + public final boolean cacheResults; Review comment: What does this do? I saw no code to cache results. Not sure it even makes sense in a distributed system unless there is some way to direct the next scan for the same URL to the node that holds the cached results. (Or, add a way to write the cached results to DFS, and expire them after some period.)
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390532236 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input however in the future, it may be possible to + add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0` which is no timeout. + +### Configuring the API Connections +The HTTP Storage plugin allows you to configure multiple APIS which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration +. Next give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication. 
This field allows you to define key/value pairs which are submitted with the http request +. The format is: +```json +headers: { + "key1":, "Value1", + "key2", "Value2" +} +``` +* `authType`: If your API requires authentication, specify the authentication type. At the time of implementation, the plugin only supports basic authentication, however, the + plugin will likely support OAUTH2 in the future. Defaults to `none`. If the `authType` is set to `basic`, `username` and `password` must be set in the configuration as well. + * `username`: The username for basic authentication. + * `password`: The password for basic authentication. + * `postBody`: Contains data, in the form of key value pairs, which are sent during a `POST` request. Post body should be in the form: + ``` +key1=value1 +key2=value2 +``` + +## Usage: +This plugin is different from other plugins in that it the table component of the `FROM` clause is different. In normal Drill queries, the `FROM` clause is constructed as follows: +```sql +FROM .. +``` +For example, you might have: +```sql +FROM dfs.test.`somefile.csv` + +-- or + +FROM mongo.stats.sales_data +``` + +The HTTP/REST plugin the `FROM` clause enables you to pass arguments to your REST call. The structure is: +```sql +FROM .. 
+--Actual example: + FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` +``` + + +## Examples: +### Example 1: Reference Data, A Sunrise/Sunset API +The API sunrise-sunset.org returns data in the following format: + + ```json + { + "results": + { + "sunrise":"7:27:02 AM", + "sunset":"5:05:55 PM", + "solar_noon":"12:16:28 PM", + "day_length":"9:38:53", + "civil_twilight_begin":"6:58:14 AM", + "civil_twilight_end":"5:34:43 PM", + "nautical_twilight_begin":"6:25:47 AM", + "nautical_twilight_end":"6:07:10 PM", + "astronomical_twilight_begin":"5:54:14 AM", + "astronomical_twilight_end":"6:38:43 PM" + }, + "status":"OK" + } + } +``` +To query this API, set the configuration as follows: + +```json +{ + { + "type": "http", + "cacheResults": false, + "enabled": true, + "timeout" 5, + "connections": { + "sunrise": { + "url": "https://api.sunrise-sunset.org/", + "method": "get", + "headers": null, + "authType": "none", + "userName": null, + "password": null, + "postBody": null + } + }, Review comment: Suggestion: use the old C-book trick: copy-paste this into a test case, ensure it works, then copy-paste the working config back here.
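Following up on that suggestion, here is what the quoted example config would look like with its JSON syntax errors corrected (the doubled outer brace removed and a colon added after `"timeout"`); the values themselves are unchanged:

```json
{
  "type": "http",
  "cacheResults": false,
  "enabled": true,
  "timeout": 5,
  "connections": {
    "sunrise": {
      "url": "https://api.sunrise-sunset.org/",
      "method": "get",
      "headers": null,
      "authType": "none",
      "userName": null,
      "password": null,
      "postBody": null
    }
  }
}
```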
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390546981 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpGroupScan.java ## @@ -0,0 +1,167 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.Arrays; +import java.util.List; +import java.util.Objects; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonIgnore; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.common.expression.SchemaPath; + +import org.apache.drill.exec.physical.base.AbstractGroupScan; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.base.PhysicalOperator; +import org.apache.drill.exec.physical.base.ScanStats; +import org.apache.drill.exec.physical.base.ScanStats.GroupScanProperty; +import org.apache.drill.exec.physical.base.SubScan; +import org.apache.drill.exec.proto.CoordinationProtos.DrillbitEndpoint; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +@JsonTypeName("http-scan") +public class HttpGroupScan extends AbstractGroupScan { + private static final Logger logger = LoggerFactory.getLogger(HttpGroupScan.class); + + private List<SchemaPath> columns; + private final HttpScanSpec httpScanSpec; + private final HttpStoragePluginConfig config; + + public HttpGroupScan ( +HttpStoragePluginConfig config, +HttpScanSpec scanSpec, +List<SchemaPath> columns + ) { +super("no-user"); +this.config = config; +this.httpScanSpec = scanSpec; +this.columns = columns == null || columns.size() == 0 ?
ALL_COLUMNS : columns; + } + + public HttpGroupScan(HttpGroupScan that) { +super(that); +config = that.config(); +httpScanSpec = that.httpScanSpec(); +columns = that.getColumns(); + } + + @JsonCreator + public HttpGroupScan( +@JsonProperty("config") HttpStoragePluginConfig config, +@JsonProperty("columns") List<SchemaPath> columns, +@JsonProperty("httpScanSpec") HttpScanSpec httpScanSpec + ) { +super("no-user"); +this.config = config; +this.columns = columns; +this.httpScanSpec = httpScanSpec; + } + + @JsonProperty("config") + public HttpStoragePluginConfig config() { return config; } + + @JsonProperty("columns") + public List<SchemaPath> columns() { return columns; } + + @JsonProperty("httpScanSpec") + public HttpScanSpec httpScanSpec() { return httpScanSpec; } + + @Override + public void applyAssignments(List<DrillbitEndpoint> endpoints) { +logger.debug("HttpGroupScan applyAssignments"); Review comment: No need to log this, it will be called. Maybe this is a proxy for a comment that says "not sure what to do", or "assignments not used." A necessary weakness of the present design is that it must run in only a single Drillbit. Not good for downloading large result sets. But, distribution will require filter push-down and an idea about how to shard a request. That can come later.
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390531073 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input however in the future, it may be possible to + add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0` which is no timeout. + +### Configuring the API Connections +The HTTP Storage plugin allows you to configure multiple APIS which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration +. Next give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication. 
This field allows you to define key/value pairs which are submitted with the http request +. The format is: +```json +headers: { + "key1":, "Value1", + "key2", "Value2" Review comment: Nit: Json typo: colon after the second key, no comma after the colon in the first key.
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390544725 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; + +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; + +public class HttpAPIConfig { + + private final String url; + + private final String method; + + private final Map<String, String> headers; + + private final String authType; + + private final String userName; + + private final String password; + + private final String postBody; + + public HttpAPIConfig(@JsonProperty("url") String url, + @JsonProperty("method") String method, + @JsonProperty("headers") Map<String, String> headers, + @JsonProperty("authType") String authType, + @JsonProperty("userName") String userName, + @JsonProperty("password") String password, + @JsonProperty("postBody") String postBody) { + +// Get the request method.
Only accept GET and POST requests. Anything else will default to GET. +if (method.toLowerCase().equals("get") || method.toLowerCase().equals("post")) { + this.method = method.toLowerCase(); +} else { + this.method = "get"; +} +this.headers = headers; + +// Put a trailing slash on the URL if it is missing +if (url.charAt(url.length() - 1) != '/') { + this.url = url + "/"; +} else { + this.url = url; +} + +// Get the authentication method. Future functionality will include OAUTH2 authentication but for now +// Accept either basic or none. The default is none. +this.authType = (authType == null || authType.isEmpty()) ? "none" : authType; Review comment: Consider `Strings.isNullOrEmpty()`. See [this](https://stackoverflow.com/questions/2272169/string-isnullorempty-in-java). Also, should any of these be trimmed?
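As a rough sketch of that suggestion, using a dependency-free stand-in for Guava's `Strings.isNullOrEmpty()` and adding the trimming the reviewer asks about (class and method names here are illustrative, not Drill's):

```java
// Illustrative helper: default authType to "none" when null or blank, and
// trim the value. Names are hypothetical, not the plugin's actual code.
public class AuthTypeDefaults {

  // Plain-Java equivalent of Guava's Strings.isNullOrEmpty()
  static boolean isNullOrEmpty(String s) {
    return s == null || s.isEmpty();
  }

  static String resolveAuthType(String authType) {
    if (authType == null) {
      return "none";
    }
    String trimmed = authType.trim();
    return isNullOrEmpty(trimmed) ? "none" : trimmed;
  }

  public static void main(String[] args) {
    System.out.println(resolveAuthType(null));      // none
    System.out.println(resolveAuthType("  "));      // none
    System.out.println(resolveAuthType(" basic ")); // basic
  }
}
```

In the plugin itself the helper would presumably come from the shaded Guava path already imported elsewhere in the file (`org.apache.drill.shaded.guava.com.google.common.base`).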
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390537966 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; + +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; + +public class HttpAPIConfig { + + private final String url; + + private final String method; + + private final Map<String, String> headers; + + private final String authType; + + private final String userName; + + private final String password; + + private final String postBody; + + public HttpAPIConfig(@JsonProperty("url") String url, + @JsonProperty("method") String method, + @JsonProperty("headers") Map<String, String> headers, + @JsonProperty("authType") String authType, + @JsonProperty("userName") String userName, + @JsonProperty("password") String password, + @JsonProperty("postBody") String postBody) { + +// Get the request method.
Only accept GET and POST requests. Anything else will default to GET. +if (method.toLowerCase().equals("get") || method.toLowerCase().equals("post")) { Review comment: This will crash if `method` is not provided. Since `GET` is the default, this should accept `null`. Also, nit, after doing the `null` check, a `switch` statement would be clearer for the other case. Finally, nit, this method lower-cases the method three times. Could be done just once.
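A null-safe version that lower-cases only once and uses a `switch`, as the comment suggests, might look like this (standalone sketch, not the plugin's actual code):

```java
// Illustrative sketch: lower-case once, handle null, and let a switch
// send anything other than get/post to the GET default.
public class MethodResolver {

  static String resolveMethod(String method) {
    String m = method == null ? "get" : method.toLowerCase();
    switch (m) {
      case "get":
      case "post":
        return m;
      default:
        return "get";  // unknown methods fall back to GET
    }
  }

  public static void main(String[] args) {
    System.out.println(resolveMethod(null));   // get
    System.out.println(resolveMethod("POST")); // post
    System.out.println(resolveMethod("HEAD")); // get
  }
}
```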
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390565386 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import org.apache.drill.common.map.CaseInsensitiveMap; +import org.apache.drill.common.logical.StoragePluginConfigBase; +import org.apache.drill.shaded.guava.com.google.common.base.Objects; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + + +@JsonTypeName(HttpStoragePluginConfig.NAME) +public class HttpStoragePluginConfig extends StoragePluginConfigBase { + private static final Logger logger = LoggerFactory.getLogger(HttpStoragePluginConfig.class); + + public static final String NAME = "http"; + + public final Map<String, HttpAPIConfig> connections; + + public final boolean cacheResults; + + public int timeout; + + @JsonCreator + public HttpStoragePluginConfig(@JsonProperty("cacheResults") boolean cacheResults, + @JsonProperty("connections") Map<String, HttpAPIConfig> connections, + @JsonProperty("timeout") int timeout) { +logger.debug("Initialize HttpStoragePluginConfig {}", connections); +this.cacheResults = cacheResults; + +if (connections != null) { Review comment: Nit: reverse polarity: `if (connections == null)` ...
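The "reverse polarity" nit refers to guard-clause style: handle the `null` case first so the common path is not nested. A generic sketch (types simplified and the helper name `normalizeConnections` hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative guard-clause style: dispose of the exceptional (null) case
// first, then continue with the main path un-indented.
public class ConnectionsGuard {

  static Map<String, String> normalizeConnections(Map<String, String> connections) {
    if (connections == null) {
      return new HashMap<>();  // exceptional case handled up front
    }
    // Common path: the real code would copy into a case-insensitive map here
    return new HashMap<>(connections);
  }

  public static void main(String[] args) {
    System.out.println(normalizeConnections(null).size());  // 0
    Map<String, String> m = new HashMap<>();
    m.put("sunrise", "https://api.sunrise-sunset.org/");
    System.out.println(normalizeConnections(m).size());     // 1
  }
}
```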
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390564074 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpStoragePluginConfig.java ## @@ -0,0 +1,90 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import org.apache.drill.common.map.CaseInsensitiveMap; +import org.apache.drill.common.logical.StoragePluginConfigBase; +import org.apache.drill.shaded.guava.com.google.common.base.Objects; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; + +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; + + +@JsonTypeName(HttpStoragePluginConfig.NAME) +public class HttpStoragePluginConfig extends StoragePluginConfigBase { + private static final Logger logger = LoggerFactory.getLogger(HttpStoragePluginConfig.class); + + public static final String NAME = "http"; + + public final Map<String, HttpAPIConfig> connections; Review comment: The issue is that common config must be duplicated. If config is duplicated, we are really creating two different configs; we might as well store them at the top level. I might think differently if things like user, auth and so on were common for a site, then each service in that site had its own URL.
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390541287 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; + +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; + +public class HttpAPIConfig { + + private final String url; + + private final String method; + + private final Map<String, String> headers; + + private final String authType; + + private final String userName; + + private final String password; + + private final String postBody; + + public HttpAPIConfig(@JsonProperty("url") String url, + @JsonProperty("method") String method, + @JsonProperty("headers") Map<String, String> headers, + @JsonProperty("authType") String authType, + @JsonProperty("userName") String userName, + @JsonProperty("password") String password, + @JsonProperty("postBody") String postBody) { + +// Get the request method.
Only accept GET and POST requests. Anything else will default to GET. +if (method.toLowerCase().equals("get") || method.toLowerCase().equals("post")) { + this.method = method.toLowerCase(); +} else { + this.method = "get"; +} +this.headers = headers; + +// Put a trailing slash on the URL if it is missing +if (url.charAt(url.length() - 1) != '/') { + this.url = url + "/"; +} else { + this.url = url; +} Review comment: It is perhaps worth thinking about how this works. The JSON comes from either 1) the user, or 2) persistent storage. If from the user, we will doctor up the values in this constructor, and the result will be converted back to JSON in the persistent store. If from the persistent store (older version wrote the JSON), then the form used in-memory won't match the persistent version until the next update. I wonder, would it be better to do this fix-up when creating the reader rather than in the config ctor?
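One way to realize that idea is to leave the stored value untouched and apply the trailing-slash fix-up only when the request is built, so the in-memory config always round-trips to persistent storage unchanged. A minimal sketch (helper name hypothetical, not Drill's API):

```java
// Illustrative: apply the trailing-slash normalization lazily, at
// request-building time, instead of mutating the value in the config
// constructor. The config field keeps exactly what the user entered.
public class UrlFixup {

  static String effectiveUrl(String configuredUrl) {
    return configuredUrl.endsWith("/") ? configuredUrl : configuredUrl + "/";
  }

  public static void main(String[] args) {
    System.out.println(effectiveUrl("https://api.sunrise-sunset.org"));  // slash added
    System.out.println(effectiveUrl("https://api.sunrise-sunset.org/")); // unchanged
  }
}
```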
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390537567 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; + +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; + +public class HttpAPIConfig { + + private final String url; + + private final String method; + + private final Map<String, String> headers; + + private final String authType; + + private final String userName; + + private final String password; + + private final String postBody; + + public HttpAPIConfig(@JsonProperty("url") String url, + @JsonProperty("method") String method, + @JsonProperty("headers") Map<String, String> headers, + @JsonProperty("authType") String authType, + @JsonProperty("userName") String userName, + @JsonProperty("password") String password, + @JsonProperty("postBody") String postBody) { + +// Get the request method.
Only accept GET and POST requests. Anything else will default to GET. +if (method.toLowerCase().equals("get") || method.toLowerCase().equals("post")) { + this.method = method.toLowerCase(); +} else { + this.method = "get"; Review comment: Please log a warning. Else, if I really want to use `HEAD`, I'll have no way of understanding why my server receives `GET`. Also, the convention is for HTTP methods to be upper case: `GET` and `POST`.
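Combining both points, a sketch with conventional upper-case method names and a visible warning on fallback (a plain `System.err` print stands in for the plugin's logger so the example stays self-contained):

```java
// Illustrative: upper-case HTTP method names per convention, with a warning
// when an unsupported method silently falls back to GET.
public class MethodWithWarning {

  static String resolveMethod(String method) {
    String m = method == null ? "GET" : method.toUpperCase();
    if (m.equals("GET") || m.equals("POST")) {
      return m;
    }
    // In the plugin this would be logger.warn(...)
    System.err.println("Unsupported HTTP method '" + method + "', defaulting to GET");
    return "GET";
  }

  public static void main(String[] args) {
    System.out.println(resolveMethod("head")); // GET (warning printed to stderr)
    System.out.println(resolveMethod("post")); // POST
  }
}
```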
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390557812 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpScanBatchCreator.java ## @@ -0,0 +1,54 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store.http; + +import java.util.List; + +import org.apache.drill.common.exceptions.ExecutionSetupException; +import org.apache.drill.common.expression.SchemaPath; +import org.apache.drill.exec.ops.ExecutorFragmentContext; +import org.apache.drill.exec.physical.base.GroupScan; +import org.apache.drill.exec.physical.impl.BatchCreator; +import org.apache.drill.exec.physical.impl.ScanBatch; +import org.apache.drill.exec.record.RecordBatch; +import org.apache.drill.exec.store.RecordReader; +import org.apache.drill.shaded.guava.com.google.common.base.Preconditions; +import org.apache.drill.shaded.guava.com.google.common.collect.Lists; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class HttpScanBatchCreator implements BatchCreator<HttpSubScan> { + private static final Logger logger = LoggerFactory.getLogger(HttpScanBatchCreator.class); + + @Override + public ScanBatch getBatch(ExecutorFragmentContext context, HttpSubScan subScan, List<RecordBatch> children) throws ExecutionSetupException { +logger.debug("getBatch called"); +Preconditions.checkArgument(children == null || children.isEmpty()); + +HttpStoragePluginConfig config = subScan.config(); +List<RecordReader> readers = Lists.newArrayList(); Review comment: nit: `new ArrayList<>()`
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390533397 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input; however, in the future it may be possible to add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0`, which is no timeout. + +### Configuring the API Connections +The HTTP storage plugin allows you to configure multiple APIs which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration. Next, give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication.
This field allows you to define key/value pairs which are submitted with the HTTP request. The format is: +```json +headers: { + "key1": "Value1", + "key2": "Value2" +} +``` +* `authType`: If your API requires authentication, specify the authentication type. At the time of implementation, the plugin only supports basic authentication; however, the plugin will likely support OAUTH2 in the future. Defaults to `none`. If the `authType` is set to `basic`, `username` and `password` must be set in the configuration as well. + * `username`: The username for basic authentication. + * `password`: The password for basic authentication. + * `postBody`: Contains data, in the form of key/value pairs, which are sent during a `POST` request. The post body should be in the form: + ``` +key1=value1 +key2=value2 +``` + +## Usage: +This plugin is different from other plugins in that the table component of the `FROM` clause is different. In normal Drill queries, the `FROM` clause is constructed as follows: +```sql +FROM <storage plugin>.<workspace>.<table> +``` +For example, you might have: +```sql +FROM dfs.test.`somefile.csv` + +-- or + +FROM mongo.stats.sales_data +``` + +In the HTTP/REST plugin, the `FROM` clause enables you to pass arguments to your REST call. The structure is: +```sql +FROM <plugin>.<connection>.<arguments>
+--Actual example: + FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` +``` + + +## Examples: +### Example 1: Reference Data, A Sunrise/Sunset API +The API sunrise-sunset.org returns data in the following format: + + ```json + { + "results": + { + "sunrise":"7:27:02 AM", + "sunset":"5:05:55 PM", + "solar_noon":"12:16:28 PM", + "day_length":"9:38:53", + "civil_twilight_begin":"6:58:14 AM", + "civil_twilight_end":"5:34:43 PM", + "nautical_twilight_begin":"6:25:47 AM", + "nautical_twilight_end":"6:07:10 PM", + "astronomical_twilight_begin":"5:54:14 AM", + "astronomical_twilight_end":"6:38:43 PM" + }, + "status":"OK" + } + } +``` +To query this API, set the configuration as follows: + +```json +{ + { + "type": "http", + "cacheResults": false, + "enabled": true, + "timeout" 5, + "connections": { + "sunrise": { + "url": "https://api.sunrise-sunset.org/", + "method": "get", + "headers": null, + "authType": "none", + "userName": null, + "password": null, + "postBody": null + } + }, +} +``` +Then, to execute a query: +```sql +SELECT api_results.results.sunrise AS sunrise, +api_results.results.sunset AS sunset +FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` AS api_results; +``` +Which yields the following results: +``` ++------------+------------+ +| sunrise | sunset | ++------------+------------+ +| 7:17:46 AM | 5:01:33 PM | ++------------+------------+ +1 row selected (0.632 seconds) +``` + +### Example 2: JIRA +JIRA Cloud has a REST API which is [documented
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390545178 ## File path: contrib/storage-http/src/main/java/org/apache/drill/exec/store/http/HttpAPIConfig.java ## @@ -0,0 +1,135 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.http; + +import com.fasterxml.jackson.annotation.JsonProperty; +import com.fasterxml.jackson.annotation.JsonTypeName; +import org.apache.drill.shaded.guava.com.google.common.base.MoreObjects; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Arrays; +import java.util.Map; +import java.util.Objects; + +@JsonTypeName("http-api-config") +public class HttpAPIConfig { + + private static final Logger logger = LoggerFactory.getLogger(HttpAPIConfig.class); + + private final String url; + + private final String method; + + private final Map<String, String> headers; + + private final String authType; + + private final String username; + + private final String password; + + private final String postBody; + + public HttpAPIConfig(@JsonProperty("url") String url, + @JsonProperty("method") String method, + @JsonProperty("headers") Map<String, String> headers, + @JsonProperty("authType") String authType, + @JsonProperty("username") String username, + @JsonProperty("password") String password, + @JsonProperty("postBody") String postBody) { +this.url = url; + +// Get the request method. Only accept GET and POST requests. Anything else will default to GET. +if (method.toLowerCase().equals("get") || method.toLowerCase().equals("post")) { + this.method = method.toLowerCase(); +} else { + this.method = "get"; +} +this.headers = headers; + +// Get the authentication method. Future functionality will include OAUTH2 authentication but for now +// Accept either basic or none. The default is none. +this.authType = (authType == null || authType.isEmpty()) ?
"none" : authType; +this.username = username; +this.password = password; +this.postBody = postBody; + +// Validate the authentication type + + } + + @JsonProperty("url") + public String url() { return url; } + + @JsonProperty("method") + public String method() { return method; } + + @JsonProperty("headers") + public Map headers() { return headers; } + + @JsonProperty("authType") + public String authType() { return authType; } + + @JsonProperty("username") + public String username() { return username; } + + @JsonProperty("password") + public String password() { return password; } + + @JsonProperty("postBody") + public String postBody() { return postBody; } + + @Override + public int hashCode() { +return Arrays.hashCode( + new Object[]{url, method, headers, authType, username, password, postBody}); + } + + @Override + public String toString() { +return MoreObjects.toStringHelper(this) + .add("url", url) + .add("method", method) + .add("headers", headers) + .add("authType", authType) + .add("username", username) + .add("password", password) + .add("postBody", postBody) + .toString(); Review comment: Class moved to another PR, will hopefully arrive soon. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390582930 ## File path: contrib/storage-http/src/main/resources/bootstrap-storage-plugins.json ## @@ -0,0 +1,9 @@ +{ + "storage":{ +"http" : { + "type":"http", + "connections": {}, + "enabled": false +} + } +} Review comment: Nit: add newline.
[GitHub] [drill] paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API
paul-rogers commented on a change in pull request #1892: DRILL-7437: Storage Plugin for Generic HTTP REST API URL: https://github.com/apache/drill/pull/1892#discussion_r390531631 ## File path: contrib/storage-http/README.md ## @@ -0,0 +1,218 @@ + +# Generic REST API Storage Plugin +This plugin is intended to enable you to query APIs over HTTP/REST. At this point, the API reader will only accept JSON as input; however, in the future it may be possible to add additional format readers to allow for APIs which return XML, CSV or other formats. + +Note: This plugin should **NOT** be used for interacting with tools which have REST APIs such as Splunk or Solr. It will not be performant for those use cases. + +## Configuration +To configure the plugin, create a new storage plugin, and add the following configuration options which apply to ALL connections defined in this plugin: + +```json +{ + "type": "http", + "connection": "https:///", + "cacheResults": true, + "timeout": 0, + "enabled": true +} +``` +The options are: +* `type`: This should be `http` +* `cacheResults`: Enable caching of the HTTP responses +* `timeout`: Sets the response timeout in seconds. Defaults to `0`, which is no timeout. + +### Configuring the API Connections +The HTTP storage plugin allows you to configure multiple APIs which you can query directly from this plugin. To do so, first add a `connections` parameter to the configuration. Next, give the connection a name, which will be used in queries. For instance `stockAPI` or `jira`. + +The `connection` can accept the following options: +* `url`: The base URL which Drill will query. You should include the ending slash if there are additional arguments which you are passing. +* `method`: The request method. Must be `get` or `post`. Other methods are not allowed and will default to `GET`. +* `headers`: Often APIs will require custom headers as part of the authentication.
This field allows you to define key/value pairs which are submitted with the HTTP request. The format is: +```json +headers: { + "key1": "Value1", + "key2": "Value2" +} +``` +* `authType`: If your API requires authentication, specify the authentication type. At the time of implementation, the plugin only supports basic authentication; however, the plugin will likely support OAUTH2 in the future. Defaults to `none`. If the `authType` is set to `basic`, `username` and `password` must be set in the configuration as well. + * `username`: The username for basic authentication. + * `password`: The password for basic authentication. + * `postBody`: Contains data, in the form of key/value pairs, which are sent during a `POST` request. The post body should be in the form: + ``` +key1=value1 +key2=value2 +``` + +## Usage: +This plugin is different from other plugins in that the table component of the `FROM` clause is different. In normal Drill queries, the `FROM` clause is constructed as follows: +```sql +FROM <storage plugin>.<workspace>.<table> +``` +For example, you might have: +```sql +FROM dfs.test.`somefile.csv` + +-- or + +FROM mongo.stats.sales_data +``` + +In the HTTP/REST plugin, the `FROM` clause enables you to pass arguments to your REST call. The structure is: +```sql +FROM <plugin>.<connection>.<arguments>
+--Actual example: + FROM http.sunrise.`/json?lat=36.7201600&lng=-4.4203400&date=today` +``` + + +## Examples: +### Example 1: Reference Data, A Sunrise/Sunset API +The API sunrise-sunset.org returns data in the following format: + + ```json + { + "results": + { + "sunrise":"7:27:02 AM", + "sunset":"5:05:55 PM", + "solar_noon":"12:16:28 PM", + "day_length":"9:38:53", + "civil_twilight_begin":"6:58:14 AM", + "civil_twilight_end":"5:34:43 PM", + "nautical_twilight_begin":"6:25:47 AM", + "nautical_twilight_end":"6:07:10 PM", + "astronomical_twilight_begin":"5:54:14 AM", + "astronomical_twilight_end":"6:38:43 PM" + }, + "status":"OK" + } + } +``` +To query this API, set the configuration as follows: + +```json +{ + { Review comment: Nit: `{ {` is not valid JSON. Is there a key missing? Or, remove the outer brackets?
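Following the review suggestion, one way the example configuration could be made valid JSON (a sketch only: the duplicate outer brace and trailing comma removed, and the colon missing after `"timeout"` added; the field values are the README's own illustrations):

```json
{
  "type": "http",
  "cacheResults": false,
  "enabled": true,
  "timeout": 5,
  "connections": {
    "sunrise": {
      "url": "https://api.sunrise-sunset.org/",
      "method": "get",
      "headers": null,
      "authType": "none",
      "userName": null,
      "password": null,
      "postBody": null
    }
  }
}
```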
[GitHub] [drill] dobesv commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page
dobesv commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page URL: https://github.com/apache/drill/pull/1994#issuecomment-597268755 @arina-ielchiieva Thanks. Sorry I didn't have the resources to fix the Safari issue - maybe someone with macOS will have a chance to fix it in a future PR.
[jira] [Created] (DRILL-7635) Mongo tests fail for profiles with non-file default file system
Vova Vysotskyi created DRILL-7635: - Summary: Mongo tests fail for profiles with non-file default file system Key: DRILL-7635 URL: https://issues.apache.org/jira/browse/DRILL-7635 Project: Apache Drill Issue Type: Bug Affects Versions: 1.18.0 Reporter: Vova Vysotskyi Assignee: Vova Vysotskyi Fix For: 1.18.0 Some profiles in Drill have specific configurations for the {{fs.defaultFS}} Hadoop property. This option was set to {{file:///}} in the {{core-site.xml}} test resource in the {{java-exec}} module. Since DRILL-7547 added another {{core-site.xml}} file to the mongo module, it overrides the existing one, so mongo tests stopped using {{file:///}} for such profiles. These failures became visible after the fix for DRILL-7628, where these tests started running again. -- This message was sent by Atlassian Jira (v8.3.4#803005)
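For reference, the override the issue describes involves the standard Hadoop configuration file; the test resource in {{java-exec}} pins the default file system roughly like this (a sketch of the relevant property, assuming standard {{core-site.xml}} syntax):

```xml
<configuration>
  <property>
    <!-- Force a local (file-based) default file system in tests;
         a core-site.xml on the classpath from another module overrides this. -->
    <name>fs.defaultFS</name>
    <value>file:///</value>
  </property>
</configuration>
```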
[GitHub] [drill] asfgit closed pull request #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page
asfgit closed pull request #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page URL: https://github.com/apache/drill/pull/1994
[GitHub] [drill] asfgit closed pull request #2011: DRILL-7623: Link error is displayed at the log content page on Web UI
asfgit closed pull request #2011: DRILL-7623: Link error is displayed at the log content page on Web UI URL: https://github.com/apache/drill/pull/2011
[GitHub] [drill] asfgit closed pull request #2012: DRILL-7625: Add options for SslContextFactory
asfgit closed pull request #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012
[GitHub] [drill] asfgit closed pull request #2014: DRILL-7627: Update MySql version for JdbcStoragePlugin tests and cache ~/.embedmysql
asfgit closed pull request #2014: DRILL-7627: Update MySql version for JdbcStoragePlugin tests and cache ~/.embedmysql URL: https://github.com/apache/drill/pull/2014
[GitHub] [drill] asfgit closed pull request #2017: DRILL-7632: Improve user exception formatting
asfgit closed pull request #2017: DRILL-7632: Improve user exception formatting URL: https://github.com/apache/drill/pull/2017
[GitHub] [drill] asfgit closed pull request #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547
asfgit closed pull request #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547 URL: https://github.com/apache/drill/pull/2015
[GitHub] [drill] asfgit closed pull request #2009: DRILL-7622: Compilation error when using HLL / TDigest with group by
asfgit closed pull request #2009: DRILL-7622: Compilation error when using HLL / TDigest with group by URL: https://github.com/apache/drill/pull/2009
[GitHub] [drill] asfgit closed pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet
asfgit closed pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet URL: https://github.com/apache/drill/pull/2019
[GitHub] [drill] asfgit closed pull request #2000: DRILL-7607: Support dynamic credit based flow control
asfgit closed pull request #2000: DRILL-7607: Support dynamic credit based flow control URL: https://github.com/apache/drill/pull/2000
[GitHub] [drill] asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
asfgit closed pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value URL: https://github.com/apache/drill/pull/2006
[GitHub] [drill] paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value
paul-rogers commented on a change in pull request #2006: DRILL-7615: UNION ALL query returns the wrong result for the decimal value URL: https://github.com/apache/drill/pull/2006#discussion_r386776641 ## File path: common/src/main/java/org/apache/drill/common/types/Types.java ## @@ -795,15 +807,25 @@ public static boolean isSortable(MinorType type) { */ public static MajorType.Builder calculateTypePrecisionAndScale(MajorType leftType, MajorType rightType, MajorType.Builder typeBuilder) { if (leftType.getMinorType().equals(rightType.getMinorType())) { - final boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType); - final boolean isDecimal = isDecimalType(leftType); + boolean isScalarString = Types.isScalarStringType(leftType) && Types.isScalarStringType(rightType); + boolean isDecimal = isDecimalType(leftType); - if ((isScalarString || isDecimal) && leftType.hasPrecision() && rightType.hasPrecision()) { + if (isScalarString && leftType.hasPrecision() && rightType.hasPrecision()) { typeBuilder.setPrecision(Math.max(leftType.getPrecision(), rightType.getPrecision())); } - if (isDecimal && leftType.hasScale() && rightType.hasScale()) { -typeBuilder.setScale(Math.max(leftType.getScale(), rightType.getScale())); + if (isDecimal) { +int scale = Math.max(leftType.getScale(), rightType.getScale()); +// resulting precision should take into account resulting scale value and be calculated as +// sum of two components: +// - max integer digits number (precision - scale) for left and right; +// - resulting scale. +// So for the case of cast( as decimal(4,0)) and cast(1.23 as decimal(3,2)) +// resulting scale would be Max(0, 2) = 2 and resulting precision would be Max(4 - 0, 3 - 2) + 2 = 6. 
+// In this case, both values would fit into decimal(6, 2): .00, 1.23 +int precision = Math.max(leftType.getPrecision() - leftType.getScale(), rightType.getPrecision() - rightType.getScale()) + scale; +typeBuilder.setPrecision(precision); +typeBuilder.setScale(scale); Review comment: Nit: please split the long line.
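The precision/scale rule described in the diff's comment can be checked with a small standalone sketch (a hypothetical helper for illustration, not Drill's `Types` API):

```java
public class DecimalTypeRule {
  // Result scale is the max of the input scales; result precision reserves
  // room for the widest integer part (precision - scale) of either input,
  // plus that resulting scale.
  static int[] resultPrecisionScale(int p1, int s1, int p2, int s2) {
    int scale = Math.max(s1, s2);
    int precision = Math.max(p1 - s1, p2 - s2) + scale;
    return new int[] {precision, scale};
  }

  public static void main(String[] args) {
    // decimal(4,0) combined with decimal(3,2):
    // scale = max(0, 2) = 2; precision = max(4 - 0, 3 - 2) + 2 = 6
    int[] r = resultPrecisionScale(4, 0, 3, 2);
    System.out.println("decimal(" + r[0] + "," + r[1] + ")"); // decimal(6,2)
  }
}
```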
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r379859349 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java ## @@ -15,78 +15,74 @@ * See the License for the specific language governing permissions and * limitations under the License. */ + package org.apache.drill.exec.store.ltsv; -import org.apache.drill.common.expression.SchemaPath; import org.apache.drill.common.logical.StoragePluginConfig; -import org.apache.drill.exec.ops.FragmentContext; -import org.apache.drill.exec.planner.common.DrillStatsTable.TableStatistics; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.Types; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileReaderFactory; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileScanBuilder; +import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader; import org.apache.drill.exec.proto.UserBitShared; import org.apache.drill.exec.server.DrillbitContext; -import org.apache.drill.exec.store.RecordReader; -import org.apache.drill.exec.store.RecordWriter; -import org.apache.drill.exec.store.dfs.DrillFileSystem; +import org.apache.drill.exec.server.options.OptionManager; import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin; -import org.apache.drill.exec.store.dfs.easy.EasyWriter; -import org.apache.drill.exec.store.dfs.easy.FileWork; +import org.apache.drill.exec.store.dfs.easy.EasySubScan; +import org.apache.drill.shaded.guava.com.google.common.collect.Lists; import org.apache.hadoop.conf.Configuration; -import org.apache.hadoop.fs.FileSystem; -import org.apache.hadoop.fs.Path; -import java.util.List; public class LTSVFormatPlugin extends EasyFormatPlugin { Review comment: Would be handy to 
reference [the spec](http://ltsv.org/) for us newbies not familiar with the format.
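For readers new to the format: LTSV (Labeled Tab-Separated Values, per the spec at ltsv.org) encodes each record as tab-separated `label:value` pairs. A minimal, hypothetical parser sketch (not the plugin's actual reader code):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LtsvLineParser {
  // Split a record on tabs, then split each field on its first colon.
  static Map<String, String> parseLine(String line) {
    Map<String, String> row = new LinkedHashMap<>();
    for (String field : line.split("\t")) {
      int sep = field.indexOf(':');
      if (sep > 0) {
        row.put(field.substring(0, sep), field.substring(sep + 1));
      }
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, String> row =
        parseLine("host:127.0.0.1\tstatus:200\tsize:5316");
    System.out.println(row.get("status")); // 200
  }
}
```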
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r390520873 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java ## @@ -0,0 +1,35 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.drill.exec.store.ltsv; + +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator; +import org.apache.drill.exec.store.easy.EasyEVFBatchReader; + +public class LTSVBatchReader extends EasyEVFBatchReader { + + public LTSVBatchReader(LTSVFormatPluginConfig formatConfig) { Review comment: Since this code is not shared, maybe just omit the parameter. Then, nit, omit the empty call to `super()`. Finally, the ctor will end up doing nothing. so omit it entirely. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r379859411 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVFormatPlugin.java ## @@ -15,78 +15,74 @@ * See the License for the specific language governing permissions and * limitations under the License. */ + package org.apache.drill.exec.store.ltsv; -import org.apache.drill.common.expression.SchemaPath; import org.apache.drill.common.logical.StoragePluginConfig; -import org.apache.drill.exec.ops.FragmentContext; -import org.apache.drill.exec.planner.common.DrillStatsTable.TableStatistics; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.common.types.Types; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileSchemaNegotiator; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileReaderFactory; +import org.apache.drill.exec.physical.impl.scan.file.FileScanFramework.FileScanBuilder; +import org.apache.drill.exec.physical.impl.scan.framework.ManagedReader; import org.apache.drill.exec.proto.UserBitShared; import org.apache.drill.exec.server.DrillbitContext; -import org.apache.drill.exec.store.RecordReader; -import org.apache.drill.exec.store.RecordWriter; -import org.apache.drill.exec.store.dfs.DrillFileSystem; +import org.apache.drill.exec.server.options.OptionManager; import org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin; -import org.apache.drill.exec.store.dfs.easy.EasyWriter; -import org.apache.drill.exec.store.dfs.easy.FileWork; +import org.apache.drill.exec.store.dfs.easy.EasySubScan; +import org.apache.drill.shaded.guava.com.google.common.collect.Lists; import org.apache.hadoop.conf.Configuration; -import org.apache.hadoop.fs.FileSystem; -import org.apache.hadoop.fs.Path; -import java.util.List; public class LTSVFormatPlugin extends EasyFormatPlugin { - private static final boolean 
IS_COMPRESSIBLE = true; - private static final String DEFAULT_NAME = "ltsv"; - public LTSVFormatPlugin(String name, DrillbitContext context, Configuration fsConf, StoragePluginConfig storageConfig) { -this(name, context, fsConf, storageConfig, new LTSVFormatPluginConfig()); - } + public static class LTSVReaderFactory extends FileReaderFactory { +private final LTSVFormatPluginConfig ltsvFormatPluginConfig; - public LTSVFormatPlugin(String name, DrillbitContext context, Configuration fsConf, StoragePluginConfig config, LTSVFormatPluginConfig formatPluginConfig) { -super(name, context, fsConf, config, formatPluginConfig, true, false, false, IS_COMPRESSIBLE, formatPluginConfig.getExtensions(), DEFAULT_NAME); - } +public LTSVReaderFactory(LTSVFormatPluginConfig config) { + ltsvFormatPluginConfig = config; +} - @Override - public RecordReader getRecordReader(FragmentContext context, DrillFileSystem dfs, FileWork fileWork, List columns, String userName) { -return new LTSVRecordReader(context, fileWork.getPath(), dfs, columns); +@Override +public ManagedReader newReader() { + return new LTSVBatchReader(ltsvFormatPluginConfig); Review comment: Oh, never mind. I wrote the darn thing and forgot that file info is passed in to open, not into the constructor... This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r390522026 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.ltsv; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.exec.physical.resultSet.RowSetLoader; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.BufferedReader; +import java.io.IOException; +import java.util.Iterator; + +public class LTSVRecordIterator implements Iterator { + + private static final Logger logger = LoggerFactory.getLogger(LTSVRecordIterator.class); + + private RowSetLoader rowWriter; + + private BufferedReader reader; + + private String line; + + private int recordCount; + + public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) { +this.rowWriter = rowWriter; +this.reader = reader; + +// Get the first line +try { + line = reader.readLine(); +} catch (IOException e) { + throw UserException +.dataReadError() +.message("Error reading LTSV Data: {}", e.getMessage()) +.build(logger); +} + } + + @Override + public boolean hasNext() { +return line != null; + } + + @Override + public Boolean next() { +// Skip empty lines +if (line.trim().length() == 0) { + try { +// Advance the line to the next line +line = reader.readLine(); + } catch (IOException e) { +throw UserException + .dataReadError() + .message("Error reading LTSV Data: {}", e.getMessage()) + .build(logger); + } + return Boolean.TRUE; +} else if (line == null) { + return Boolean.FALSE; +} + +// Process the row +processRow(); + +// Increment record counter +recordCount++; + +// Get the next line +try { + line = reader.readLine(); + if(line == null) { Review comment: Nit: `if (` (space before paren.) This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r379859570 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.ltsv; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.exec.physical.resultSet.RowSetLoader; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.BufferedReader; +import java.io.IOException; +import java.util.Iterator; + +public class LTSVRecordIterator implements Iterator { + + private static final Logger logger = LoggerFactory.getLogger(LTSVRecordIterator.class); + + private RowSetLoader rowWriter; + + private BufferedReader reader; + + private String line; + + private int recordCount; + + public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) { +this.rowWriter = rowWriter; +this.reader = reader; + +// Get the first line +try { + line = reader.readLine(); +} catch (IOException e) { + throw UserException +.dataReadError() +.message("Error reading LTSV Data: {}", e.getMessage()) +.build(logger); +} + } + + @Override + public boolean hasNext() { +return line != null; + } + + @Override + public Boolean next() { +// Skip empty lines +if (line.trim().length() == 0) { + try { +// Advance the line to the next line +line = reader.readLine(); + } catch (IOException e) { +throw UserException + .dataReadError() + .message("Error reading LTSV Data: {}", e.getMessage()) + .build(logger); + } + return Boolean.TRUE; +} else if (line == null) { + return Boolean.FALSE; +} + +// Process the row +processRow(); + +// Increment record counter +recordCount++; + +// Get the next line +try { + line = reader.readLine(); + if(line == null) { +return Boolean.FALSE; + } +} catch (IOException e) { + throw UserException +.dataReadError() +.message("Error reading LTSV Data: {}", e.getMessage()) +.build(logger); +} +return Boolean.TRUE; + } + + /** + * Function processes one row of data, splitting it up first by tabs then splitting the key/value pairs + * finally recording it in the current Drill row. 
+ */ + private void processRow() { +// Start the row +rowWriter.start(); Review comment: By the way, I was recently reminded that `start()` returns a `boolean`. We usually write

```
while (!rs.isFull()) {
  rowWriter.start();
  ...
}
```

I now remember that the code was designed to allow:

```
while (rowWriter.start()) {
  // Write stuff
  rowWriter.end();
}
```
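The second loop shape can be demonstrated with a toy writer. This is a sketch, not Drill's actual `RowSetLoader`: the key property is that `start()` returns `false` once the batch is full, so its return value alone can drive the loop.

```java
public class Main {
  // Toy stand-in for RowSetLoader: start() fails once the batch is full.
  static class ToyRowWriter {
    private final int batchLimit;
    private int rowCount;

    ToyRowWriter(int batchLimit) { this.batchLimit = batchLimit; }

    boolean start() { return rowCount < batchLimit; } // false when full
    void end() { rowCount++; }                        // commit the row
    int rowCount() { return rowCount; }
  }

  public static void main(String[] args) {
    ToyRowWriter rowWriter = new ToyRowWriter(3);
    // The loop is driven entirely by start()'s return value.
    while (rowWriter.start()) {
      // ... write column values here ...
      rowWriter.end();
    }
    System.out.println(rowWriter.rowCount()); // 3
  }
}
```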
[GitHub] [drill] paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF
paul-rogers commented on a change in pull request #1962: DRILL-7554: Convert LTSV Format Plugin to EVF URL: https://github.com/apache/drill/pull/1962#discussion_r374459658 ## File path: contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVRecordIterator.java ## @@ -0,0 +1,127 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.drill.exec.store.ltsv; + +import org.apache.drill.common.exceptions.UserException; +import org.apache.drill.common.types.TypeProtos; +import org.apache.drill.exec.physical.resultSet.RowSetLoader; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.BufferedReader; +import java.io.IOException; +import java.util.Iterator; + +public class LTSVRecordIterator implements Iterator { + + private static final Logger logger = LoggerFactory.getLogger(LTSVRecordIterator.class); + + private RowSetLoader rowWriter; + + private BufferedReader reader; + + private String line; + + private int recordCount; + + public LTSVRecordIterator(RowSetLoader rowWriter, BufferedReader reader) { +this.rowWriter = rowWriter; +this.reader = reader; + +// Get the first line +try { + line = reader.readLine(); +} catch (IOException e) { + throw UserException +.dataReadError() +.message("Error reading LTSV Data: {}", e.getMessage()) +.build(logger); +} + } + + @Override + public boolean hasNext() { +return line != null; + } + + @Override + public Boolean next() { +// Skip empty lines +if (line.trim().length() == 0) { + try { +// Advance the line to the next line +line = reader.readLine(); + } catch (IOException e) { +throw UserException + .dataReadError() + .message("Error reading LTSV Data: {}", e.getMessage()) + .build(logger); + } + return Boolean.TRUE; +} else if (line == null) { + return Boolean.FALSE; Review comment: Boolean.FALSE --> false. No need to use the boxed type. Here, above and below. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
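The reviewer's point in a compilable nutshell: the primitive literal `false` autoboxes wherever a `Boolean` is required (autoboxing goes through `Boolean.valueOf`, which returns the cached `Boolean.FALSE` instance), so spelling out `Boolean.FALSE` buys nothing, and declaring `next()` with the boxed return type just invites needless boxing.

```java
public class Main {
  public static void main(String[] args) {
    Boolean boxed = false; // the primitive literal autoboxes automatically
    // Autoboxing reuses the cached canonical instance, so even reference
    // equality holds here.
    System.out.println(boxed == Boolean.FALSE);
  }
}
```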
[GitHub] [drill] paul-rogers commented on issue #2000: DRILL-7607: Support dynamic credit based flow control
paul-rogers commented on issue #2000: DRILL-7607: Support dynamic credit based flow control URL: https://github.com/apache/drill/pull/2000#issuecomment-597239145 Looks good, +1. Seems to me that two things need to happen after this PR. First is to find a way to run a stress/performance test to ensure that the potential extra memory usage is not an issue. Second is to, at some point, continue with our resource management efforts and to 1) allocate memory to the receivers that they can use as their buffer pool, and 2) standardize batch size so the receiver knows roughly the sizes to expect. This work is beyond the scope of this PR.
[GitHub] [drill] paul-rogers commented on issue #1993: DRILL-7601: Shift column conversion to reader from scan framework
paul-rogers commented on issue #1993: DRILL-7601: Shift column conversion to reader from scan framework URL: https://github.com/apache/drill/pull/1993#issuecomment-597236752 Rebased on latest master.
[GitHub] [drill] paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework
paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework URL: https://github.com/apache/drill/pull/1993#discussion_r390500706 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/avro/AvroBatchReader.java ## @@ -69,11 +69,15 @@ public boolean open(FileScanFramework.FileSchemaNegotiator negotiator) { negotiator.userName(), negotiator.context().getFragmentContext().getQueryUserName()); logger.debug("Avro file schema: {}", reader.getSchema()); -TupleMetadata schema = AvroSchemaUtil.convert(reader.getSchema()); -logger.debug("Avro file converted schema: {}", schema); -negotiator.setTableSchema(schema, true); +TupleMetadata readerSchema = AvroSchemaUtil.convert(reader.getSchema()); +logger.debug("Avro file converted schema: {}", readerSchema); +TupleMetadata providedSchema = negotiator.providedSchema(); +TupleMetadata tableSchema = StandardConversions.mergeSchemas(providedSchema, readerSchema); Review comment: You have identified a tricky area. The planner is the correct place to handle scan output schema: then the reader takes that output schema as gospel. Here we have readers that have a strong opinion about what schema they can produce; the provided schema is mostly a "suggestion." So, we're in an awkward middle area. In particular, the provided schema can include columns that the reader does not know about. Imagine the case of an Avro file where some files use schema v1, while others use schema v2 with an additional column. The v1 readers can't provide that extra column. Filling in missing columns is done via the scan framework. (Maybe this does not occur in Avro; maybe the Avro schema handles these cases. But, the issue does arise in CSV, JSON and others.) So, at the very least, the reader is not responsible for any column except those it can produce; any "extra" columns are handled by the scan framework.
In the ideal case, the provided schema says to the reader, "of the ways you know how to convert a column, use **this** one." So, the reader has to know the target type of the column. All this said, my goal in a later PR is to reshuffle some of this stuff again. Now that we have multiple examples, is there some work that can be pulled out and reused? For example, can we build an explicit mapping from the reader's idea of the schema to the provided schema? Or, does it turn out that each reader's idea of types is unique, so there is no common mechanism? For example, the Avro code looks quite specific to Avro and is perhaps not easily reused. Suggestions?
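The merge rule the comment describes can be illustrated with a toy, name-based schema merge. This is a hypothetical sketch, not Drill's `TupleMetadata`/`StandardConversions.mergeSchemas` API: for columns the reader can produce, the provided schema's type wins; provided-only columns are left for the scan framework to fill in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
  // Toy merge: schemas are just name -> type maps here.
  static Map<String, String> mergeSchemas(Map<String, String> provided,
                                          Map<String, String> reader) {
    Map<String, String> merged = new LinkedHashMap<>();
    for (Map.Entry<String, String> col : reader.entrySet()) {
      // The provided type overrides the reader's guess when both exist.
      merged.put(col.getKey(),
          provided.getOrDefault(col.getKey(), col.getValue()));
    }
    // Provided-only columns (e.g. "extra") are deliberately omitted:
    // the scan framework, not the reader, fills in missing columns.
    return merged;
  }

  public static void main(String[] args) {
    Map<String, String> reader = new LinkedHashMap<>();
    reader.put("id", "VARCHAR");
    reader.put("name", "VARCHAR");

    Map<String, String> provided = new LinkedHashMap<>();
    provided.put("id", "INT");       // "suggestion" the reader honors
    provided.put("extra", "DATE");   // unknown to this reader

    System.out.println(mergeSchemas(provided, reader));
  }
}
```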
[GitHub] [drill] paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework
paul-rogers commented on a change in pull request #1993: DRILL-7601: Shift column conversion to reader from scan framework URL: https://github.com/apache/drill/pull/1993#discussion_r390511107 ## File path: exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvTableProperties.java ## @@ -547,4 +543,65 @@ public void testMessyQuotes() throws Exception { .build(); RowSetUtilities.verify(expected, actual); } + + private static final String[] trimData = { + " 10 , fred ", + " 20, wilma " +}; + + /** + * Trim leading and trailing whitespace. This setting is currently + * only available via table properties. + */ + @Test + public void testKeepWitespace() throws Exception { +try { + enableSchemaSupport(); + String tablePath = buildTable("noTrim", trimData); + String sql = String.format("create schema (%s) " + + "for table %s PROPERTIES ('" + + TextFormatPlugin.HAS_HEADERS_PROP + "'='false', '" + + TextFormatPlugin.SKIP_FIRST_LINE_PROP + "'='false')", + COL_SCHEMA, tablePath); + run(sql); + RowSet actual = client.queryBuilder().sql(SELECT_ALL, tablePath).rowSet(); + + TupleMetadata expectedSchema = new SchemaBuilder() + .add("id", MinorType.INT) + .add("name", MinorType.VARCHAR) + .buildSchema(); + + // String-to-number conversion trims strings automatically + RowSet expected = new RowSetBuilder(client.allocator(), expectedSchema) + .addRow(10, " fred ") + .addRow(20, " wilma ") + .build(); + RowSetUtilities.verify(expected, actual); +} finally { + resetSchemaSupport(); +} + } + + /** + * Trim leading and trailing whitespace. This setting is currently + * only available via table properties. 
+ */ + @Test + public void testTrimWitespace() throws Exception { +try { + enableSchemaSupport(); + String tablePath = buildTable("trim", trimData); + String sql = String.format("create schema (%s) " + + "for table %s PROPERTIES ('" + + TextFormatPlugin.HAS_HEADERS_PROP + "'='false', '" + + TextFormatPlugin.SKIP_FIRST_LINE_PROP + "'='false', '" + + TextFormatPlugin.TRIM_WHITESPACE_PROP + "'='true')", Review comment: Added detail to the "Documentation" section of this PR. There will be a few more new properties for JSON. Then I'll figure out how to show that in the docs. Perhaps in the provided schema section we need a table of which formats support the schema, and which properties each format supports, and what they do.
[GitHub] [drill] paul-rogers opened a new pull request #2020: DRILL-7634: Rollup of code cleanup changes
paul-rogers opened a new pull request #2020: DRILL-7634: Rollup of code cleanup changes URL: https://github.com/apache/drill/pull/2020 # [DRILL-7634](https://issues.apache.org/jira/browse/DRILL-7634): Rollup of code cleanup changes ## Description Collection of code cleanup changes. The most significant is to create constants for function names. We already had a pretty good list in `FunctionGenerationHelper`. I moved this list to a new interface, `FunctionNames`, and added to it. I then changed all references to hard-coded names to instead refer to these constants. This change pointed out several places where code checks for variations of function names. Not sure why; perhaps an early version of the code did not have the map from alias to "canonical" name. Used constants for the canonical name, left the other names in place because I don't have enough knowledge to know if it is safe to remove them. Anyone know? Also made the `args` member of `FunctionCall` private and added accessors. ## Documentation None ## Testing Reran all unit tests.
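The constants pattern the PR describes can be sketched as follows. The names below are illustrative stand-ins, not the actual contents of Drill's `FunctionNames` interface.

```java
public class Main {
  // Sketch of an interface of function-name constants; call sites
  // reference these instead of repeating hard-coded string literals.
  interface FunctionNames {
    String ADD = "add";
    String EQ = "equal";
  }

  public static void main(String[] args) {
    String fn = "add"; // e.g. a function name parsed from a plan
    // Comparison against the constant rather than a literal.
    System.out.println(FunctionNames.ADD.equals(fn));
  }
}
```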
[jira] [Created] (DRILL-7634) Rollup of code cleanup changes
Paul Rogers created DRILL-7634: -- Summary: Rollup of code cleanup changes Key: DRILL-7634 URL: https://issues.apache.org/jira/browse/DRILL-7634 Project: Apache Drill Issue Type: Improvement Affects Versions: 1.17.0 Reporter: Paul Rogers Assignee: Paul Rogers Fix For: 1.18.0 Pack of cosmetic code cleanup changes accumulated over recent months. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser URL: https://github.com/apache/drill/pull/2016#discussion_r390435908 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDef.java ## @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.store.easy.json.parser; + +/** + * Description of a JSON value as inferred from looking ahead in + * the JSON stream. Includs a type (which can be empty for an empty + * array, or null), and an array size (which is 0 for simple values.) + * + * To be clear, this is the JSON parser's best guess at a field type + * from the input token stream. This is not a description of the + * desired data type as JSON can only react to what it sees on input. + */ +public class ValueDef { + + /** + * Description of JSON types as derived from JSON tokens. + */ + public enum JsonType { +OBJECT, NULL, BOOLEAN, +INTEGER, FLOAT, STRING, EMBEDDED_OBJECT, + +/** + * Indicates an empty array. + */ +EMPTY, + +/** + * Indicates an unknown array, appears when replacing the + * value listener for an array. 
+ */ +UNKNOWN; + +public boolean isObject() { return this == OBJECT; } + +public boolean isUnknown() { + return this == NULL || this == EMPTY || + this == UNKNOWN; +} + +public boolean isScalar() { + return !isObject() && !isUnknown(); +} + } + + public static final ValueDef UNKNOWN_ARRAY = new ValueDef(JsonType.UNKNOWN, 1); + public static final ValueDef UNKNOWN = new ValueDef(JsonType.UNKNOWN, 0); + + private final int arrayDims; + private final JsonType type; + + public ValueDef(JsonType type, int dims) { +this.type = type; +this.arrayDims = dims; + } + + public JsonType type() { return type; } + public int dimensions() { return arrayDims; } + public boolean isArray() { return arrayDims > 0; } + + @Override + public String toString() { +String result = type.name(); +for (int i = 0; i < arrayDims; i++) { + result += "[]"; Review comment: Could you please replace the string concatenation with `StringBuilder` usage?
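The suggested rewrite, isolated here so it can be run: the `StringBuilder` form produces the same result as the `result += "[]"` loop in the diff without allocating an intermediate `String` on every iteration.

```java
public class Main {
  // StringBuilder version of the ValueDef.toString() loop: appends
  // "[]" once per array dimension.
  static String typeName(String type, int arrayDims) {
    StringBuilder result = new StringBuilder(type);
    for (int i = 0; i < arrayDims; i++) {
      result.append("[]");
    }
    return result.toString();
  }

  public static void main(String[] args) {
    System.out.println(typeName("INTEGER", 2)); // INTEGER[][]
  }
}
```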
[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser URL: https://github.com/apache/drill/pull/2016#discussion_r390433673 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ObjectListener.java ## @@ -52,80 +52,64 @@ */ public interface ObjectListener { + public enum FieldType { Review comment: No need to declare it explicitly as public. Everything declared in an interface is public by default.
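A compilable illustration of the point: types declared inside an interface are implicitly `public` and `static`, so the modifier is redundant.

```java
public class Main {
  interface Listener {
    // No modifiers needed: implicitly public and static.
    enum FieldType { IGNORE, TYPED, TEXT, JSON }
  }

  public static void main(String[] args) {
    // Accessible from outside without any explicit modifier on the enum.
    System.out.println(Listener.FieldType.TYPED);
  }
}
```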
[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser URL: https://github.com/apache/drill/pull/2016#discussion_r390425236 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDefFactory.java ## @@ -0,0 +1,99 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.store.easy.json.parser; + +import org.apache.drill.exec.store.easy.json.parser.ValueDef.JsonType; + +import com.fasterxml.jackson.core.JsonToken; + +/** + * Constructs a {@link ValueDef} by looking ahead on the input stream. + * Looking ahead is safe because this class only looks at syntactic + * tokens such as {, {@code [} or the first value token. + * The underlying JSON parser is left with the first value token + * as its current token. Pushes other tokens back on the token stack + * so they can be re-consumed by the actual parser. 
+ */ +public class ValueDefFactory { + + private int arrayDims; + private JsonType jsonType = JsonType.EMPTY; + + public ValueDefFactory(TokenIterator tokenizer) { +inferValueType(tokenizer); + } + + public static ValueDef lookAhead(TokenIterator tokenizer) { +ValueDefFactory factory= new ValueDefFactory(tokenizer); Review comment:

```suggestion
ValueDefFactory factory = new ValueDefFactory(tokenizer);
```
[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser URL: https://github.com/apache/drill/pull/2016#discussion_r390434300 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ValueDef.java ## @@ -0,0 +1,84 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.store.easy.json.parser; + +/** + * Description of a JSON value as inferred from looking ahead in + * the JSON stream. Includs a type (which can be empty for an empty Review comment: ```suggestion * the JSON stream. Includes a type (which can be empty for an empty ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [drill] vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser
vvysotskyi commented on a change in pull request #2016: DRILL-7631: Updates to the Json Structure Parser URL: https://github.com/apache/drill/pull/2016#discussion_r390423322 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/store/easy/json/parser/ObjectListener.java ## @@ -52,80 +52,64 @@ */ public interface ObjectListener { + public enum FieldType { + +/** + * The field is unprojected, ignore its content. No value listener + * is created. + */ +IGNORE, + +/** + * Parse the JSON object according to its type. + */ +TYPED, + +/** + * The field is to be treated as "all-text". Used when the parser-level + * setting for {@code allTextMode} is {@code false}; allows per-field + * overrides to, perhaps, ride over inconsistent scalar types for a + * single field. The listener will receive only strings. + */ +TEXT, + +/** + * Parse the value, and all its children, as JSON. + * That is, converts the parsed JSON back into a + * JSON string. The listener will receive only strings. + */ +JSON + } + /** * Called at the start of a set of values for an object. That is, called * when the structure parser accepts the { token. */ void onStart(); - /** - * Called at the end of a set of values for an object. That is, called - * when the structure parser accepts the } token. - */ - void onEnd(); - /** * Called by the structure parser when it first sees a new field for - * and object to determine if that field is to be projected (is needed - * by the listener.) If not projected, the structure parser will not + * and object to determine how to parse the field. + * If not projected, the structure parser will not * ask for a value listener and will insert a "dummy" parser that will * free-wheel over any value of that field. As a result, unprojected * fields can not cause type errors: they are invisible as long as * they are syntactically valid. + * + * The {@link FieldType#JSON} type says to parse the entire field, and + * its children, as a JSON string. 
The parser will ask for a value + * listener to accept the JSON text. * * @param key the object field name - * @return {@code true} if this listener wants to provide a listener - * for the field, {@code false} if the field should be ignored - */ - boolean isProjected(String key); - - /** - * A new field has appeared with a scalar (or {@code null}) value. - * That is: {@code key: }. - * - * @param key the field name - * @param type the type as given by the JSON token for the value - * @return a value listener for the scalar value + * @return how the field should be parsed */ - ValueListener addScalar(String key, JsonType type); + FieldType fieldType(String key); - /** - * A new field has appeared with a scalar, {@code null} or empty array - * value. That is, one of: - * - * key: [+- * key: [+ null - * key: [+ ] - * - * Where "[+" means one or more opening array elements. - * - * @param key the field name - * @param arrayDims number of dimensions observed in the first appearance - * of the array (more may appear later) - * @param type the observed type of the first element of the array, or - * {@link JsonType.NULL} if {@code null} was see, or - * {@link JsonType.EMPTY} if an empty array was seen - * @return a listener for the field itself which is prepared to - * return an array listener - */ - ValueListener addArray(String key, int arrayDims, JsonType type); + ValueListener addField(String key, ValueDef valueDef); Review comment: Please add JavaDoc. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
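A possible Javadoc for `addField()`, drafted from the surrounding comments in the diff. The wording is an editor's suggestion, not the author's final documentation, and the stub types below stand in for the real `ValueDef` and `ValueListener`.

```java
public class Main {
  static class ValueDef { }        // stub for the real ValueDef
  interface ValueListener { }      // stub for the real ValueListener

  interface ObjectListenerSketch {
    /**
     * A new field has appeared within the object. Called once the
     * structure parser has determined, by look-ahead, the type and
     * array dimensions of the field's first value.
     *
     * @param key the object field name
     * @param valueDef the inferred type of the field's first value
     * @return a value listener to receive parse events for the field
     */
    ValueListener addField(String key, ValueDef valueDef);
  }

  public static void main(String[] args) {
    System.out.println("ok"); // stub so the sketch compiles and runs
  }
}
```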
[GitHub] [drill] arina-ielchiieva opened a new pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet
arina-ielchiieva opened a new pull request #2019: DRILL-7630: Add additional types into SchemaParser for Parquet URL: https://github.com/apache/drill/pull/2019 # [DRILL-7630](https://issues.apache.org/jira/browse/DRILL-7630): Add additional types into SchemaParser for Parquet ## Description Parquet files support some types that are currently not included in the schema parser. Since the schema parser is used during schema ser / de, processing of such Parquet files fails. We need to add additional data types to the schema parser. ## Documentation NA ## Testing Added a unit test, ran all tests.
[GitHub] [drill] arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547
arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547 URL: https://github.com/apache/drill/pull/2015#issuecomment-597139833 +1, LGTM
[GitHub] [drill] arina-ielchiieva removed a comment on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547
arina-ielchiieva removed a comment on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547 URL: https://github.com/apache/drill/pull/2015#issuecomment-596991728 @vvysotskyi I think maybe we can get rid of this test suite logic? We can make all Mongo tests extend `MongoTestBase.class`, annotate `MongoTestBase.class` with `@Category({MongoStorageTest.class})` so we can skip or include Mongo tests, and that's it, so no suite logic will be required.
[GitHub] [drill] agozhiy commented on issue #2012: DRILL-7625: Add options for SslContextFactory
agozhiy commented on issue #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012#issuecomment-597130330 Thank you, +1.
[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012#discussion_r390351471 ## File path: common/src/main/java/org/apache/drill/common/config/NestedConfig.java ## @@ -111,6 +112,19 @@ public String getString(String path) { return c.getString(path); }

+ /**
+  * Returns string option or passed default value if option is not set.
+  *
+  * @param path option path
+  * @param defaultValue default value
+  * @return option or default value
+  */
+ public String getString(String path, String defaultValue) {

Review comment: There is no need for this method at all; default values can be set directly in drill-module.conf.
[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012#discussion_r390350047 ## File path: common/src/main/java/org/apache/drill/common/config/NestedConfig.java ## @@ -111,6 +112,19 @@ public String getString(String path) { return c.getString(path); }

+ /**
+  * Returns string option or passed default value if option is not set.
+  *
+  * @param path option path
+  * @param defaultValue default value
+  * @return option or default value
+  */
+ public String getString(String path, String defaultValue) {

Review comment: There is no need for this method at all; default values can be set directly in drill-module.conf.
[GitHub] [drill] agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory
agozhiy commented on a change in pull request #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012#discussion_r390347618 ## File path: common/src/main/java/org/apache/drill/common/config/NestedConfig.java ## @@ -111,6 +111,14 @@ public String getString(String path) { return c.getString(path); }

+ public String getString(String path, String defaultValue) {
+   String value = hasPath(path) ? getString(path) : null;
+   if (value == null || (value = value.trim()).isEmpty()) {
+     return defaultValue;
+   }
+   return value;

Review comment: There is no need for this method at all; default values can be set directly in drill-module.conf.
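The semantics of the reviewed method are worth spelling out: a missing, null, or blank value falls back to the supplied default. Here is a dependency-free sketch of that fallback logic; `ConfigSketch` and its backing map are stand-ins for Drill's `NestedConfig`, which actually delegates to Typesafe Config (where, as the reviewer notes, defaults can instead live in drill-module.conf):

```java
import java.util.HashMap;
import java.util.Map;

// Toy config holder reproducing the fallback semantics from the reviewed diff:
// missing, null, or whitespace-only values fall back to the supplied default.
class ConfigSketch {
  private final Map<String, String> values = new HashMap<>();

  void set(String path, String value) { values.put(path, value); }

  String getString(String path, String defaultValue) {
    String value = values.get(path);
    if (value == null || (value = value.trim()).isEmpty()) {
      return defaultValue;
    }
    return value;
  }

  public static void main(String[] args) {
    ConfigSketch c = new ConfigSketch();
    c.set("ssl.protocol", "TLSv1.3");
    c.set("ssl.provider", "   ");                               // blank: treated as unset
    System.out.println(c.getString("ssl.protocol", "TLSv1.2")); // TLSv1.3
    System.out.println(c.getString("ssl.provider", "JDK"));     // JDK
    System.out.println(c.getString("ssl.missing", "default"));  // default
  }
}
```

The reviewer's alternative relies on HOCON's layered resolution: a default declared in drill-module.conf is overridden by drill-override.conf, making an in-code default parameter redundant.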
[GitHub] [drill] cgivre commented on issue #2012: DRILL-7625: Add options for SslContextFactory
cgivre commented on issue #2012: DRILL-7625: Add options for SslContextFactory URL: https://github.com/apache/drill/pull/2012#issuecomment-597061241 Could we make a note to include this in the web documentation for Drill once this is committed?
[GitHub] [drill] vvysotskyi merged pull request #1986: Additional changes for Drill Metastore docs
vvysotskyi merged pull request #1986: Additional changes for Drill Metastore docs URL: https://github.com/apache/drill/pull/1986
[GitHub] [drill] vvysotskyi merged pull request #2013: Clean up a bit, add example for s3 storage of query profiles
vvysotskyi merged pull request #2013: Clean up a bit, add example for s3 storage of query profiles URL: https://github.com/apache/drill/pull/2013
[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors
arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors URL: https://github.com/apache/drill/pull/2018#discussion_r390202680 ## File path: exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java ## @@ -96,11 +134,13 @@ public StructureType structureType() { public boolean isVariant() { return true; }

  @Override
- public boolean isArray() { return type() == MinorType.LIST; }
+ public boolean isArray() {
+   return super.isArray() || type() == MinorType.LIST;
+ }

  @Override
  public ColumnMetadata cloneEmpty() {
-   return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty());
+   return new VariantColumnMetadata(name, type, mode, new VariantSchema());

Review comment: Also this class should override `public String typeString()` which is used during ser / de. For example, `UNION`. Not sure how to describe LIST though...
[GitHub] [drill] arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors
arina-ielchiieva commented on a change in pull request #2018: DRILL-7633: Fixes for union and repeated list accessors URL: https://github.com/apache/drill/pull/2018#discussion_r390199915 ## File path: exec/vector/src/main/java/org/apache/drill/exec/record/metadata/VariantColumnMetadata.java ## @@ -96,11 +134,13 @@ public StructureType structureType() { public boolean isVariant() { return true; }

  @Override
- public boolean isArray() { return type() == MinorType.LIST; }
+ public boolean isArray() {
+   return super.isArray() || type() == MinorType.LIST;
+ }

  @Override
  public ColumnMetadata cloneEmpty() {
-   return new VariantColumnMetadata(name, type, variantSchema.cloneEmpty());
+   return new VariantColumnMetadata(name, type, mode, new VariantSchema());

Review comment: Could you please also implement `public ColumnMetadata copy()` below, since you are touching this class?
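The `isArray()` fix in the diff changes the rule from "array if and only if the type is LIST" to "array if the mode is REPEATED or the type is LIST". A compact sketch of that override pattern, using simplified stand-in enums and classes rather than Drill's real metadata hierarchy:

```java
// Simplified stand-ins for Drill's MinorType and DataMode; illustrative only.
enum MinorType { UNION, LIST, INT }
enum DataMode { OPTIONAL, REPEATED }

class AbstractColumnMetadata {
  private final MinorType type;
  private final DataMode mode;
  AbstractColumnMetadata(MinorType type, DataMode mode) {
    this.type = type;
    this.mode = mode;
  }
  MinorType type() { return type; }
  // Base rule: a column is an array when its mode is REPEATED.
  boolean isArray() { return mode == DataMode.REPEATED; }
}

class VariantColumnMetadata extends AbstractColumnMetadata {
  VariantColumnMetadata(MinorType type, DataMode mode) { super(type, mode); }
  // Fixed rule from the diff: the base (mode-driven) rule applies,
  // OR the type itself is LIST.
  @Override
  boolean isArray() { return super.isArray() || type() == MinorType.LIST; }
}

class VariantDemo {
  public static void main(String[] args) {
    System.out.println(new VariantColumnMetadata(MinorType.LIST, DataMode.OPTIONAL).isArray());  // true
    System.out.println(new VariantColumnMetadata(MinorType.UNION, DataMode.REPEATED).isArray()); // true
    System.out.println(new VariantColumnMetadata(MinorType.UNION, DataMode.OPTIONAL).isArray()); // false
  }
}
```

Before the fix, a repeated UNION column (the second case above) would incorrectly have reported `isArray() == false`, which is the accessor inconsistency the PR addresses.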
[GitHub] [drill] arina-ielchiieva commented on issue #2017: DRILL-7632: Improve user exception formatting
arina-ielchiieva commented on issue #2017: DRILL-7632: Improve user exception formatting URL: https://github.com/apache/drill/pull/2017#issuecomment-596994971 +1, LGTM.
[GitHub] [drill] arina-ielchiieva commented on issue #2010: DRILL-7620: Fix plugin mutability issues
arina-ielchiieva commented on issue #2010: DRILL-7620: Fix plugin mutability issues URL: https://github.com/apache/drill/pull/2010#issuecomment-596994090 @paul-rogers I guess we should address the issues mentioned by Anton before moving forward...
[GitHub] [drill] arina-ielchiieva commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page
arina-ielchiieva commented on issue #1994: DRILL-7203: Accept impersonation userName as form field & fix back button for query page URL: https://github.com/apache/drill/pull/1994#issuecomment-596993199 I think we should still merge this PR: in Drill we don't have a list of browsers we support, and Chrome is the most commonly used one. Since the waiting widget works in Chrome, the fact that it does not work in Safari should not block this PR.
[GitHub] [drill] arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547
arina-ielchiieva commented on issue #2015: DRILL-7628: Fix Mongo tests broken after the fix for DRILL-7547 URL: https://github.com/apache/drill/pull/2015#issuecomment-596991728 @vvysotskyi I think maybe we can get rid of this test suite logic? We can make all Mongo tests extend `MongoTestBase.class`, annotate `MongoTestBase.class` with `@Category({MongoStorageTest.class})` so we can skip or include Mongo tests, and that's it, so no suite logic will be required.
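The proposal above puts the category annotation on a single base class instead of maintaining a suite. The sketch below shows that layout; note the `@Category` here is a toy stand-in (the real JUnit 4 annotation lives in `org.junit.experimental.categories` and takes a `Class<?>[]`), and `MongoTestBase`/`MongoStorageTest`/`TestMongoQueries` follow the names in the comment:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Toy stand-in for JUnit's @Category, simplified to a single class value.
@Retention(RetentionPolicy.RUNTIME)
@interface Category { Class<?> value(); }

// Category marker interface: tests tagged with it can be included or
// excluded as a group by the build.
interface MongoStorageTest {}

// Every Mongo test extends this annotated base, so no suite class is needed.
@Category(MongoStorageTest.class)
class MongoTestBase {}

class TestMongoQueries extends MongoTestBase {}

class CategoryDemo {
  public static void main(String[] args) {
    // A runner can discover the category by walking up to the annotated base.
    Category cat = TestMongoQueries.class.getSuperclass().getAnnotation(Category.class);
    System.out.println(cat.value().getSimpleName()); // prints MongoStorageTest
  }
}
```

With real JUnit, the build (e.g. Surefire's `groups`/`excludedGroups`) does the include/exclude decision, which is what makes the explicit suite class redundant.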