I think I will start a second thread for this, as it has moved on from my original question. It sounds like one cannot connect without a key even if the bucket is public, which is good to know.

Thx,
Jack
On Mon, Jun 12, 2017 at 2:25 PM, Jack Ingoldsby <[email protected]> wrote:

> Thanks, but unfortunately that didn't work either....
> {
>   "type": "file",
>   "enabled": true,
>   "connection": "s3a://sisense.citibike",
>   "config": {
>     "fs.s3a.access.key": "AKIAJELPGZYEPGRP6VBA",
>     "fs.s3a.secret.key": "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",
>     "fs.s3a.endpoint": "s3-us-east-2.amazonaws.com"
>   },
>
> On Mon, Jun 12, 2017 at 12:41 PM, Abhishek Girish <[email protected]> wrote:
>
>> That's good to know. I just didn't want the Drill community to be the place your keys were leaked :)
>>
>> I attempted with your keys and could reproduce the issue. One guess is that it could be due to location constraints [1].
>>
>> You can attempt to set the "fs.s3a.endpoint" property in the S3 config and give it a try. For example:
>>
>> {
>>   "type": "file",
>>   "enabled": true,
>>   "connection": "s3a://sisense.citibike",
>>   "config": {
>>     "fs.s3a.access.key": "AKIAJELPGZYEPGRP6VBA",
>>     "fs.s3a.secret.key": "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",
>>     "fs.s3a.endpoint": "s3-us-west-2.amazonaws.com" // Pointing to the region of the bucket
>>   }
>>   ...
>>   ...
>> }
>>
>> [1] http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
>>
>> On Mon, Jun 12, 2017 at 9:13 AM, Jack Ingoldsby <[email protected]> wrote:
>>
>> > Well, these are for a specific user I created for this bucket. The user only has read access to this bucket, which only contains this public citibike data and has no permissions access.
>> > So, I'm fine if anyone can connect (at least until I figure out the problem)
>> >
>> > On Mon, Jun 12, 2017 at 11:59 AM, Abhishek Girish <[email protected]> wrote:
>> >
>> > > I hope you haven't shared your actual access / secret keys with the community. If you have, please work on securing your account [1]!
>> > >
>> > > [1] https://aws.amazon.com/blogs/security/wheres-my-secret-access-key/
>> > >
>> > > On Mon, Jun 12, 2017 at 8:34 AM, Jack Ingoldsby <[email protected]> wrote:
>> > >
>> > > > Hi,
>> > > > Thanks. I'm actually more playing around with a proof of concept that I can query S3 using our tool via Drill.
>> > > > So, what I did was to download the citibike data and create my own S3 bucket with an access ID and secret key, but I'm having some problem connecting.
>> > > > I get the following error message when running a query:
>> > > >
>> > > > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: AmazonS3Exception: Status Code: 400, AWS Service: Amazon S3, AWS Request ID: 439EE2E823001E80, AWS Error Code: null, AWS Error Message: Bad Request
>> > > > [Error Id: 9da0c6bd-b173-48e0-aeac-47179812e696 on LAP-NY-CHENO.corp.sisense.com:31010]
>> > > >
>> > > > It appears to be a connection issue, but I can connect to the bucket sisense.citibike using the AWS command line utility, using the same access key and secret key.
>> > > > Does anything leap out?
>> > > >
>> > > > The configuration is set to
>> > > >
>> > > > {
>> > > >   "type": "file",
>> > > >   "enabled": true,
>> > > >   "connection": "s3a://sisense.citibike",
>> > > >   "config": {
>> > > >     "fs.s3a.access.key": "ID",
>> > > >     "fs.s3a.secret.key": "SECRET"
>> > > >   },
>> > > >
>> > > > Core-site.xml is set to
>> > > >
>> > > > <configuration>
>> > > >
>> > > >   <property>
>> > > >     <name>fs.s3a.access.key</name>
>> > > >     <value>AKIAJELPGZYEPGRP6VBA</value>
>> > > >   </property>
>> > > >
>> > > >   <property>
>> > > >     <name>fs.s3a.secret.key</name>
>> > > >     <value>h3CyqC/VzpRirOMi3nCImYJL2oNV1xwOcEBiYi02</value>
>> > > >   </property>
>> > > >
>> > > > </configuration>
>> > > >
>> > > > Thanks,
>> > > > Jack
>> > > >
>> > > > On Mon, Jun 12, 2017 at 10:43 AM, Andries Engelbrecht <[email protected]> wrote:
>> > > >
>> > > > > You may be better off downloading the NYC bike data set locally and converting it to parquet.
>> > > > > Converting from csv.zip to parquet will result in large improvements in performance if you do various queries on the data set.
>> > > > >
>> > > > > --Andries
>> > > > >
>> > > > > On 6/11/17, 10:48 PM, "Abhishek Girish" <[email protected]> wrote:
>> > > > >
>> > > > >     Drill connects to S3 buckets (AWS) via the S3a library. And the storage plugin configuration requires the access & secret keys [1].
>> > > > >
>> > > > >     I'm not sure if Drill can access S3 without the credentials. It might be possible via custom authenticators [2]. Hopefully others who have tried this will comment.
>> > > > >
>> > > > >     [1] https://drill.apache.org/docs/s3-storage-plugin/
>> > > > >     [2] http://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html
>> > > > >
>> > > > >     On Wed, Jun 7, 2017 at 3:02 PM, Jack Ingoldsby <[email protected]> wrote:
>> > > > >
>> > > > >     > Hi,
>> > > > >     > I'm trying to access the NYC Citibike S3 bucket, which seems to be publicly available
>> > > > >     >
>> > > > >     > https://s3.amazonaws.com/tripdata/index.html
>> > > > >     > If I leave the Access Key & Secret Key empty, I get the following message
>> > > > >     >
>> > > > >     > 0: jdbc:drill:zk=local> !tables
>> > > > >     > Error: Failure getting metadata: Unable to load AWS credentials from any provider in the chain (state=,code=0)
>> > > > >     >
>> > > > >     > If I try entering random numbers as keys, I get the following message
>> > > > >     >
>> > > > >     > Error: Failure getting metadata: Status Code: 403, AWS Service: Amazon S3, AWS Request ID: 1C888A3A21D79F87, AWS Error Code: InvalidAccessKeyId, AWS Error Message: The AWS Access Key Id you provided does not exist in our records. (state=,code=0)
>> > > > >     >
>> > > > >     > Is it possible to connect to a data source that does not seem to require a key?
>> > > > >     >
>> > > > >     > Thanks,
>> > > > >     > Jack
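
For anyone finding this thread later, here is a minimal sketch of how the storage plugin config discussed above could be generated rather than hand-edited, so the endpoint always matches the bucket's region and the keys never get pasted into a message. The helper name build_s3_plugin_config and the environment variable names are assumptions for illustration, not part of Drill. The endpoint uses the "s3-<region>.amazonaws.com" pattern from the example in the thread, and the parts of the plugin config that the thread elides with "..." are left out here too. As reported above, setting the endpoint alone did not resolve the 400 error, so treat this as a convenience, not a fix.

```python
import json
import os


def build_s3_plugin_config(bucket, region, access_key, secret_key):
    """Return an S3 storage plugin config shaped like the one in the thread.

    Hypothetical helper (not a Drill API). The endpoint is derived from the
    region using the "s3-<region>.amazonaws.com" pattern quoted in the thread.
    """
    return {
        "type": "file",
        "enabled": True,
        "connection": f"s3a://{bucket}",
        "config": {
            "fs.s3a.access.key": access_key,
            "fs.s3a.secret.key": secret_key,
            "fs.s3a.endpoint": f"s3-{region}.amazonaws.com",
        },
    }


# Read the keys from the environment (variable names are an assumption),
# falling back to the same "ID" / "SECRET" placeholders used in the thread.
config = build_s3_plugin_config(
    bucket="sisense.citibike",
    region="us-east-2",
    access_key=os.environ.get("AWS_ACCESS_KEY_ID", "ID"),
    secret_key=os.environ.get("AWS_SECRET_ACCESS_KEY", "SECRET"),
)

# Print JSON that can be pasted into the storage plugin editor.
print(json.dumps(config, indent=2))
```

The printed JSON still needs the sections omitted in the thread before Drill will accept it as a complete plugin definition.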
