Paul,

That's precisely the root of the problem. Your notes will be helpful in remedying this.
Thanks,
Rafael

-----Original Message-----
From: Paul Rogers <par0...@yahoo.com.INVALID>
Sent: Tuesday, March 31, 2020 4:39 PM
To: user@drill.apache.org
Subject: Re: REST data source?

Hi Rafael,

You may be running into something that I hit at a recent employer. The firm hosted its own in-house artifactory that would pull only from "authorized" repos. Drill has a couple of dependencies on MapR-hosted repos which this firm did not mirror, causing Drill to break.

Rather than argue with the Powers That Be to change the rules for my little POC, I found a work-around. If you are having the same issue, this might work for you. My notes from that time are at [1].

Of course, your issue could be different, so we might need a different solution. As I recall, the error I got was a bit different than the one you got. Still, worth a try.

Thanks,
- Paul

[1] https://github.com/paul-rogers/drill/wiki/Build-Drill-in-a-Corporate-Environment

On Tuesday, March 31, 2020, 12:21:42 PM PDT, Jaimes, Rafael - 0993 - MITLL <rafael.jai...@ll.mit.edu> wrote:

Hi Paul,

I tried that (even tried a vanilla build before on its own) and I run into the same dependency problem. There is something in apache-21.pom that I cannot resolve. If it works for you, I am certain it is a config on our end due to the way our proxies and mirrors are set up. We have to go through these internal channels when building, and it sometimes causes issues.

Charles,

I am almost up and running with your pre-built instance. I have narrowed the problem down to possibly being another proxy issue. The GET requests don't seem to be honoring my system env variable proxy settings. Do you think there's any way to force Drill/plug-in to use a proxy? I'm unable to get the examples you have posted working: I get a "Connection reset" error on HTTPS and a connect timeout with HTTP. The URLs work fine if I test them outside of Drill.
Thanks,
Rafael

-----Original Message-----
From: Paul Rogers <par0...@yahoo.com.INVALID>
Sent: Tuesday, March 31, 2020 2:36 PM
To: user@drill.apache.org
Subject: Re: REST data source?

Hi Rafael,

The easiest way to build the plugin will be to build all of Drill 1.18 Snapshot with the plugin included.

1. Grab master from GitHub.
2. Merge in Charles's PR branch.
3. mvn clean install -DskipTests

The above usually works for me. This process ensures that all the snapshot versions come from your own build.

Not sure how we started storing snapshot versions in a Maven repo. This causes issues. If you rebuild part of Drill, and have not built the other parts in more than a day, Maven helpfully downloads the snapshots from the repo, causing all kinds of chaos. (We should fix this.)

Once you do the build, you'll have a full Drill distribution, just like you'd download. You can use that distribution to run Drill with the plugin included.

There are other ways that also work; the above may be the simplest.

Thanks,
- Paul

On Tuesday, March 31, 2020, 10:51:18 AM PDT, Jaimes, Rafael - 0993 - MITLL <rafael.jai...@ll.mit.edu> wrote:

Hi Charles,

(1./2.) I have not been able to build Drill, from either a full clone of your tagged http-storage branch or from the standard Drill 1.17 release. I've narrowed it down to some dependency problems from the POM.
In particular, I run into issues here:

Downloading: https://repo.maven.apache.org/maven2/org/apache/apache/21/apache-21.pom
[ERROR] The build could not read 1 project -> [Help 1]
[ERROR]
[ERROR] The project org.apache.drill:drill-root:1.18.0-SNAPSHOT (/home/ra29435/drill-official/drill/pom.xml) has 1 error
[ERROR] Non-resolvable parent POM: Could not transfer artifact org.apache:apache:pom:21 from/to conjars (http://conjars.org/repo): Connection to http://conjars.org refused and 'parent.relativePath' points at no local POM @ line 24, column 11: Connection timed out (Connection timed out) -> [Help 2]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/ProjectBuildingException
[ERROR] [Help 2] http://cwiki.apache.org/confluence/display/MAVEN/UnresolvableModelException

I think it has something to do with the fact that I normally resolve dependencies from our local Maven repo mirrors. We have no problems getting stuff from Maven Central and common places, but I am unfamiliar with conjars.org. I wonder if it is related to that?

(3./4.) I tried putting the JAR into either jars/ or jars/3rdparty with the same error. I haven't gone down the dependency tree, so I have not made any JARs of the dependencies; that could be a major thing I'm missing.

Yes, this is still in a testing environment. I'm going to use your pre-built images for testing the REST endpoint; this is extremely helpful. If it works out, I'll go back to trying to build it. Also, I'm hoping that this will make its way into the next (1.18) release.

Best,
Rafael

-----Original Message-----
From: Charles Givre <cgi...@gmail.com>
Sent: Tuesday, March 31, 2020 1:34 PM
To: user <user@drill.apache.org>
Subject: Re: REST data source?
Hi Rafael,

Glad you're getting some value from Drill. Repackaging that directory as a truly pluggable jar is tricky. A few questions:

1. Did you copy the contrib/storage-http into its own folder and then do a build from that?
2. Did it build successfully?
3. Did you copy the JARs into your Drill jars/3rdparty folder?
4. You'll also have to get JARs of any dependencies as well and copy them to the jars/3rdparty. Have you done that?

I actually have a pre-built version of Drill with the storage-http plugin available here: https://github.com/cgivre/drill/releases. Please do not use that in any kind of production setup. If you're just wanting to try this out, it might be easier to d/l that and use that.

-- C

> On Mar 31, 2020, at 12:57 PM, Jaimes, Rafael - 0993 - MITLL <rafael.jai...@ll.mit.edu> wrote:
>
> Hi Charles,
>
> I am trying to use the http-storage plugin from your branch. I put the storage plug-in files in a jar and tried to keep the jar directory structure the same as other plug-ins. Upon starting drill-embedded I’m getting the error below. I am using your drill-module.conf and bootstrap-storage-plugins.json from your branch. Is there another step I need to perform to get Drill to recognize the plug-in? I am using the 1.17 release.
>
> Error: Failure in starting embedded Drillbit: java.lang.IllegalStateException: com.fasterxml.jackson.databind.exc.InvalidTypeIdException: Could not resolve type id 'http' as a subtype of [simple type, class org.apache.drill.common.logical.StoragePluginConfig]: known type ids = [InfoSchemaConfig, SystemTablePluginConfig, file, hbase, hive, jdbc, kafka, kudu, mock, mongo, named, openTSDB] (for POJO property 'storage')
> at [Source: (String)"{
>   "storage":{
>     "http" : {
>       "type":"http",
>       "connections": {},
>       "enabled": false
>     }
>   }
> }
> "; line: 4, column: 14] (through reference chain: org.apache.drill.exec.planner.logical.StoragePlugins["storage"]->java.util.LinkedHashMap["http"]) (state=,code=0)
>
> Paul,
>
> I don’t know much about this REST service quite yet (it is internal). We use REST APIs in many places where all responses are returned as JSON-formatted strings; I don’t think it is very sophisticated. I am not sure how it will handle projection and filter issues. My current pipeline involves using Python requests.get() and then unpacking the response string. It does have an authentication layer, so I am mildly concerned that the HTTP-storage-plugin will have a hiccup, although it looks like it can use “Basic”. If I can get Drill to query the endpoint, I will report back if I find anything else that might be useful to you.
>
> Thanks both for your great work with Drill!
>
> - Rafael
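
For reference, the pipeline described above (a Python requests.get() call followed by unpacking the JSON response string, against an endpoint that may use "Basic" authentication) could look roughly like the sketch below. It swaps in the standard library's urllib for requests so that it runs without extra packages. The function names are hypothetical, and the URL, user name, and password are placeholders; none of them come from Drill or the plugin.

```python
import base64
import json
import urllib.request


def basic_auth_header(user, password):
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(url, user, password):
    """Prepare a GET request that asks for JSON and carries Basic credentials."""
    headers = {
        "Authorization": basic_auth_header(user, password),
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers)


def unpack(body):
    """Turn the JSON-formatted response string into Python objects."""
    return json.loads(body)


def fetch_json(url, user, password, timeout=10):
    """Send the request and unpack the response (needs network access)."""
    request = build_request(url, user, password)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return unpack(response.read().decode("utf-8"))
```

A call such as fetch_json("https://rest.example.invalid/items", "user", "secret") would exercise the whole path; the host here is a placeholder. Comparing what this returns with what Drill returns for the same endpoint is one way to tell an authentication problem apart from a proxy problem.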
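
On the proxy question raised earlier in the thread (the plugin's GET requests not honoring the system proxy environment variables): a JVM does not read http_proxy-style environment variables by default. Its standard networking stack looks at system properties such as http.proxyHost, http.proxyPort, https.proxyHost, https.proxyPort, and http.nonProxyHosts. The thread does not confirm that the plugin's HTTP client consults those properties, so treat the sketch below as something to try, not as a known fix. It only translates the environment variables into -D flags; the function name is hypothetical, and where those flags belong in your Drill launch configuration is something to verify for your installation.

```python
import os
from urllib.parse import urlsplit


def jvm_proxy_flags(environ=None):
    """Translate http_proxy/https_proxy/no_proxy variables into JVM -D flags."""
    env = os.environ if environ is None else environ
    flags = []
    for scheme in ("http", "https"):
        value = env.get(f"{scheme}_proxy") or env.get(f"{scheme.upper()}_PROXY")
        if not value:
            continue
        # Accept both "http://host:port" and a bare "host:port".
        parts = urlsplit(value if "://" in value else f"//{value}")
        if not parts.hostname:
            continue
        flags.append(f"-D{scheme}.proxyHost={parts.hostname}")
        if parts.port:
            flags.append(f"-D{scheme}.proxyPort={parts.port}")
    no_proxy = env.get("no_proxy") or env.get("NO_PROXY")
    if no_proxy:
        hosts = [h.strip() for h in no_proxy.split(",") if h.strip()]
        # The JVM separates hosts with "|" and uses "*" wildcards,
        # so a leading-dot suffix like ".example.com" becomes "*.example.com".
        hosts = ["*" + h if h.startswith(".") else h for h in hosts]
        flags.append("-Dhttp.nonProxyHosts=" + "|".join(hosts))
    return flags


if __name__ == "__main__":
    # Print the flags derived from the current shell environment.
    print(" ".join(jvm_proxy_flags()))
```

Proxy credentials embedded in the variable (user:password@host) are deliberately ignored here. If the proxy requires authentication, that needs separate handling.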