Thanks for the quick responses!
I'm using Drill 1.4. I think I may have sorted out my S3 connection issues,
but I'm not sure because I'm having trouble executing a query:
My S3 connection (named "s3"):
{
"type": "file",
"enabled": true,
"connection": "s3://inrixprod-tapp/",
"workspaces": {
"root": {
"location": "/",
"writable": false,
"defaultInputFormat": null
}
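A config pasted from an email can fail to parse for mundane reasons, such as a dropped closing brace or curly quotes substituted by the mail client. Below is a minimal sketch, assuming Python 3 is at hand, for checking a config string before pasting it into the Drill Web UI. `check_plugin_json` is a hypothetical helper name, not part of Drill, and it only checks JSON syntax, not whether Drill accepts the fields.

```python
import json

# Curly quotes that mail clients sometimes substitute for plain ASCII quotes.
SMART_QUOTES = "\u201c\u201d\u2018\u2019"


def check_plugin_json(text):
    """Return a list of problems found in a storage plugin config string."""
    problems = []
    found = sorted({ch for ch in text if ch in SMART_QUOTES})
    if found:
        problems.append("curly quotes found: " + " ".join(found))
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        # Report where the parser gave up so the spot can be located by eye.
        problems.append(
            f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"
        )
        return problems
    if not isinstance(parsed, dict):
        problems.append("top-level value should be a JSON object")
    return problems


if __name__ == "__main__":
    # A config whose closing braces were lost in transit.
    truncated = '{"type": "file", "workspaces": {"root": {"location": "/"}'
    for problem in check_plugin_json(truncated):
        print(problem)
```

An empty list means the text is syntactically valid JSON; anything else points at the first place the parser gave up.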
Query:
SELECT * FROM
s3.`data/year=2016/month=02/day=28/part-r-00000-f2b42e00-ff01-4d82-84e3-c75aafa007ae.gz.parquet`
LIMIT 3;
Response:
Query Failed: An Error Occurred
org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR:
IOException: / doesn't exist [Error Id: 9e076a2b-c4fa-4020-af2e-4d43c2e9588c on
NickM-LPT02.inrix.corpnet.local:31010]
Nick Monetta | INRIX |[email protected] |Movement Intelligence | www.inrix.com |
mobile +1 646-248-4105 |
-----Original Message-----
From: Jason Altekruse [mailto:[email protected]]
Sent: Wednesday, April 20, 2016 4:45 PM
To: [email protected]
Cc: [email protected]
Subject: Re: Unable to connect to S3 parquet data using Drill
Which version of Drill are you running? The config block for adding your
credentials was added in a recent release, I believe 1.5.
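To make the difference between the two configs in this thread concrete, here is a rough sketch, assuming Python 3, that takes an `s3://` style plugin config and produces the `s3a://` form with the credentials in a `config` block. The helper name `with_s3a_credentials` and the bucket name in the usage note are hypothetical. Whether this change resolves the "/ doesn't exist" error is not confirmed anywhere in this thread, and the `config` block only applies on a Drill release that supports it.

```python
import copy


def with_s3a_credentials(plugin, access_key, secret_key):
    """Return a copy of a file-plugin config using s3a:// and inline keys."""
    updated = copy.deepcopy(plugin)
    connection = updated.get("connection", "")
    # Swap only the scheme; the bucket name is left untouched.
    if connection.startswith("s3://"):
        connection = "s3a://" + connection[len("s3://"):]
    updated["connection"] = connection
    # "config" may be missing or null in the original plugin definition.
    config = dict(updated.get("config") or {})
    config["fs.s3a.access.key"] = access_key
    config["fs.s3a.secret.key"] = secret_key
    updated["config"] = config
    return updated
```

For example, `with_s3a_credentials({"type": "file", "connection": "s3://my-bucket/", "config": None}, "<ACCESS KEY>", "<SECRET KEY>")` returns a new dict whose `connection` is `s3a://my-bucket/`, leaving the input unchanged; the result can be serialized with `json.dumps` and pasted into the storage plugin editor.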
Jason Altekruse
Software Engineer at Dremio
Apache Drill Committer
On Wed, Apr 20, 2016 at 1:38 PM, Nick Monetta <[email protected]> wrote:
> Copying and pasting your JSON directly into a new configuration gets
> me “Error (invalid JSON Mapping)”.
>
> What am I doing wrong?
>
> Nick Monetta | INRIX |[email protected] |Movement Intelligence |
> www.inrix.com | mobile +1 646-248-4105 |
>
> -----Original Message-----
> From: Jason Altekruse [mailto:[email protected]]
> Sent: Wednesday, April 20, 2016 4:27 PM
> To: [email protected]
> Cc: [email protected]
> Subject: Re: Unable to connect to S3 parquet data using Drill
>
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://PATH.TO.BUCKET/",
>   "config": {
>     "fs.s3a.access.key": "<YOUR ACCESS KEY HERE>",
>     "fs.s3a.secret.key": "<YOUR SECRET KEY HERE>"
>   },
>   "workspaces": {
>     "root": {
>       "location": "/",
>       "writable": false,
>       "defaultInputFormat": null
>     },
>     "tmp": {
>       "location": "/tmp",
>       "writable": true,
>       "defaultInputFormat": null
>     }
>   },
>   "formats": {
>     "psv": {
>       "type": "text",
>       "extensions": [
>         "tbl"
>       ],
>       "delimiter": "|"
>     },
>     "csv": {
>       "type": "text",
>       "extensions": [
>         "csv"
>       ],
>       "delimiter": ","
>     },
>     "tsv": {
>       "type": "text",
>       "extensions": [
>         "tsv"
>       ],
>       "delimiter": "\t"
>     },
>     "parquet": {
>       "type": "parquet"
>     },
>     "json": {
>       "type": "json",
>       "extensions": [
>         "json"
>       ]
>     },
>     "avro": {
>       "type": "avro"
>     },
>     "sequencefile": {
>       "type": "sequencefile",
>       "extensions": [
>         "seq"
>       ]
>     },
>     "csvh": {
>       "type": "text",
>       "extensions": [
>         "csvh"
>       ],
>       "extractHeader": true,
>       "delimiter": ","
>     }
>   }
> }
>
> Jason Altekruse
> Software Engineer at Dremio
> Apache Drill Committer
>
> On Wed, Apr 20, 2016 at 1:24 PM, Nick Monetta <[email protected]> wrote:
>
> > Can you send me the full JSON for the new config example you provided?
> > I keep getting JSON errors.
> >
> > Nick Monetta | INRIX |[email protected] |Movement Intelligence |
> > www.inrix.com | mobile +1 646-248-4105 |
> >
> > -----Original Message-----
> > From: Abhishek Girish [mailto:[email protected]]
> > Sent: Wednesday, April 20, 2016 12:57 PM
> > To: user <[email protected]>
> > Subject: Re: Unable to connect to S3 parquet data using Drill
> >
> > Hey Trang,
> >
> > A similar issue related to S3 config was discussed today on the
> > mailing list [1]. Can you see if that helps resolve the issue?
> >
> > [1]
> > http://mail-archives.apache.org/mod_mbox/drill-dev/201604.mbox/%3CCAN6ttnukzsAKgQE-RTF0RNCvBr1uWsB9SaxnS_7y-v0yBdUj%3Dw%40mail.gmail.com%3E
> >
> > -Abhishek
> >
> > On Tue, Apr 19, 2016 at 6:38 PM, Trang Nguyen <[email protected]> wrote:
> >
> > > Hi,
> > >
> > > I am having trouble connecting to an Amazon S3 bucket containing
> > > parquet files.
> > > I followed the instructions on
> > > https://drill.apache.org/docs/s3-storage-plugin/ to download
> > > jets3t-0.9.3 on my Ubuntu VM.
> > > My storage configs:
> > > {
> > >   "type": "file",
> > >   "enabled": true,
> > >   "connection": "s3://inrixprod-tapp",
> > >   "config": null,
> > >   "workspaces": {
> > >     "root": {
> > >       "location": "/",
> > >       "writable": false,
> > >       "defaultInputFormat": null
> > >     },
> > >     "tmp": {
> > >       "location": "/tmp",
> > >       "writable": true,
> > >       "defaultInputFormat": null
> > >     }
> > >   },
> > >   ...
> > > }
> > >
> > > I've started the embedded-drill instance but get the following error
> > > trying to connect:
> > > 0: jdbc:drill:zk=local> use s3-trips.`root`;
> > > Error: SYSTEM ERROR: IOException: / doesn't exist
> > >
> > > [Error Id: 081c66e6-177d-48fa-8eca-4ee1370ae785 on
> > > ubuntu-VirtualBox:31010] (state=,code=0)
> > >
> > > Any advice would be appreciated!
> > >
> > > Thanks,
> > > Trang