Re: EMC ECS Configuration with Apache Drill

2019-08-21 Thread Prabu Mohan
Thanks Ted.

This is getting complex. I thought I might be missing something simple
while configuring Drill, but this seems to go far beyond that.

I'm not sure whether I can get a proxy in place. In case other issues come
up as well, is there a way to debug the code and see which values are
being passed?

On Tue, Aug 20, 2019 at 12:22 AM Ted Dunning  wrote:

> On Mon, Aug 19, 2019 at 11:33 AM Prabu Mohan 
> wrote:
>
> > but i am able to connect to ECS via python using boto3 libraries without
> > any issues, I am able to write files to the bucket and read them back ..
> >
> > not sure why i am facing issues with drill though with the same
> > credentials
> >
>
>
> The key here is your assumption that the same credentials are being passed
> through Drill to AWS and that there isn't some other consideration that
> keeps S3 from believing whatever credentials it is getting.
>
> That assumption has to be attacked by figuring out experiments that can
> prove or disprove aspects of it. For instance, if you can get a proxy in
> the middle of the connection, you should be able to see *exactly* what is
> on the wire. Likewise if you can get better logging out of Drill.
>
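
Where a real proxy is not available, a local stand-in endpoint can serve the same purpose of showing what is on the wire. The sketch below is only an illustration under assumptions: it is not a proxy and does not forward anything to ECS, it merely records the method, path and headers of each request and answers with an empty 200. The names `RecordingHandler`, `start_recorder` and `captured` are hypothetical and come from neither Drill nor ECS.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Every request the client sends is appended here for later inspection.
captured = []


class RecordingHandler(BaseHTTPRequestHandler):
    """Records method, path and headers, then answers with an empty 200."""

    def _record(self):
        # Read any request body so the client is not left waiting.
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length) if length else b""
        captured.append({
            "method": self.command,
            "path": self.path,
            "headers": dict(self.headers.items()),
            "body_bytes": len(body),
        })
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    # Route the verbs an S3 client commonly uses to the same recorder.
    do_HEAD = do_GET = do_PUT = do_POST = do_DELETE = _record

    def log_message(self, fmt, *args):
        # Silence the default stderr logging; `captured` is the record.
        pass


def start_recorder(port=0):
    """Start the recorder on localhost; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), RecordingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

With `fs.s3a.endpoint` temporarily pointed at the recorder's address, running `use ecstest` should still fail, but `captured` would then hold the first request Drill sent. Its `Authorization` header, if present, carries the access key id and the signature, which can be compared with what boto3 sends to the same recorder.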


Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Prabu Mohan
But I am able to connect to ECS from Python using the boto3 library without
any issues; I can write files to the bucket and read them back.

I'm not sure why I am facing issues with Drill when using the same
credentials.

On Mon, Aug 19, 2019 at 11:53 PM Ted Dunning  wrote:

> So that looks like the security tokens aren't getting through to AWS
> correctly.
>
> On Mon, Aug 19, 2019 at 11:21 AM Prabu Mohan 
> wrote:
>
> > Log info
> >
> > 2019-08-19 16:23:05,439 [22a54125-e1e7-29f6-a146-6cfd529b23d1:foreman]
> > INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id
> > 22a54125-e1e7-29f6-a146-6cfd529b23d1 issued by anonymous: use ecstest
> >
> > 2019-08-19 16:23:06,691 [22a54125-e1e7-29f6-a146-6cfd529b23d1:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
> >
> > Please, refer to logs for more information.
> >
> > [Error Id: 76b95ca3-0ed7-4463-b3a1-7b0a8a21cd1a on ]
> >
> > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
> >
> > Please, refer to logs for more information.
> >
> > [Error Id: 76b95ca3-0ed7-4463-b3a1-7b0a8a21cd1a on ]
> >
> >     at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:630) ~[drill-common-1.16.0.jar:1.16.0]
> >     at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:789) [drill-java-exec-1.16.0.jar:1.16.0]
> >     at org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325) [drill-java-exec-1.16.0.jar:1.16.0]
> >     at org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221) [drill-java-exec-1.16.0.jar:1.16.0]
> >     at org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83) [drill-java-exec-1.16.0.jar:1.16.0]
> >     at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:304) [drill-java-exec-1.16.0.jar:1.16.0]
> >     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92]
> >     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92]
> >     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
> > Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
> >     at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:305) [drill-java-exec-1.16.0.jar:1.16.0]
> >     ... 3 common frames omitted
> > Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
> >     at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798) ~[aws-java-sdk-1.7.4.jar:na]
> >     at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421) ~[aws-java-sdk-1.7.4.jar:na]
> >     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) ~[aws-java-sdk-1.7.4.jar:na]
> >     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) ~[aws-java-sdk-1.7.4.jar:na]
> >     at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031) ~[aws-java-sdk-1.7.4.jar:na]
> >     at com.amazonaws.services.s3.AmazonS3Client.doesBucketE

Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Prabu Mohan
(FileSystem.java:2703) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.drill.exec.store.dfs.DrillFileSystem.<init>(DrillFileSystem.java:93) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.util.ImpersonationUtil.lambda$createFileSystem$0(ImpersonationUtil.java:215) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at ...(:0) ~[na:na]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:213) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:205) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.store.dfs.FileSystemSchemaFactory$FileSystemSchema.<init>(FileSystemSchemaFactory.java:84) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.store.dfs.FileSystemSchemaFactory.registerSchemas(FileSystemSchemaFactory.java:72) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.store.dfs.FileSystemPlugin.registerSchemas(FileSystemPlugin.java:189) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.calcite.jdbc.DynamicRootSchema.loadSchemaFactory(DynamicRootSchema.java:84) ~[drill-java-exec-1.16.0.jar:1.18.0-drill-r0]
    at org.apache.calcite.jdbc.DynamicRootSchema.getImplicitSubSchema(DynamicRootSchema.java:69) ~[drill-java-exec-1.16.0.jar:1.18.0-drill-r0]
    at org.apache.calcite.jdbc.CalciteSchema.getSubSchema(CalciteSchema.java:262) ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
    at org.apache.calcite.jdbc.CalciteSchema$SchemaPlusImpl.getSubSchema(CalciteSchema.java:675) ~[calcite-core-1.18.0-drill-r0.jar:1.18.0-drill-r0]
    at org.apache.drill.exec.planner.sql.SchemaUtilites.searchSchemaTree(SchemaUtilites.java:108) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.planner.sql.SchemaUtilites.findSchema(SchemaUtilites.java:52) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.rpc.user.UserSession.setDefaultSchemaPath(UserSession.java:223) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.planner.sql.handlers.UseSchemaHandler.getPlan(UseSchemaHandler.java:43) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:216) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:130) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:87) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:276) ~[drill-java-exec-1.16.0.jar:1.16.0]
    ... 1 common frames omitted
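
The quoted part of this log passes through `AmazonS3Client.headBucket`, so the request that gets the 405 appears to be a HEAD on the bucket. A minimal sketch for checking how an endpoint reacts to that verb on its own, without any signing, might look like the following. The helper name `probe` is hypothetical, and the host, port and bucket path have to be supplied by the reader since the thread does not show them.

```python
import http.client


def probe(host, port, path, method="HEAD", timeout=5):
    """Send one unsigned request and return (status, reason)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request(method, path)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status, resp.reason
    finally:
        conn.close()
```

Calling `probe(host, 9020, "/bucketname/")` and then the same with `method="GET"` gives two status codes to compare. If the unsigned HEAD already returns 405 while the GET returns something else, that would suggest the verb or path is being rejected before credentials are even evaluated; this is only a hint, not a diagnosis.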



On Mon, Aug 19, 2019 at 12:16 PM Ted Dunning  wrote:

> Did you see anything in any logs?

Re: EMC ECS Configuration with Apache Drill

2019-08-18 Thread Prabu Mohan
I am able to connect to the HTTP endpoint using boto3 from Python (I can
store and retrieve files). From IE, using https and port 9021, it comes
back with 403 Forbidden, indicating that the browser reached the endpoint
but does not have permission to view the page (maybe due to credentials).

Regards,
Prabu


On Mon, Aug 19, 2019 at 12:59 AM Sorabh Apache wrote:

> Are you able to use the same configured endpoint *http://:9020* from your
> browser ?
>
> Thanks,
>
> Sorabh.


EMC ECS Configuration with Apache Drill

2019-08-18 Thread Prabu Mohan
I'm trying to configure Apache Drill with EMC ECS. It was quite easy to
configure with AWS S3 and GCP Cloud Storage, but I'm facing issues with
EMC ECS.

When I use http with port 9020 as the endpoint, I get this error:

 Error:SYSTEM ERROR: AmazonS3Exception: Status Code: 405, AWS
Service:Amazon S3, AWS Request ID:, AWS Error Code:null, AWS Error Message:
Method Not Allowed

core-site.xml

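<configuration>

  <property>
    <name>fs.s3a.access.key</name>
    <value>accesskey</value>
  </property>

  <property>
    <name>fs.s3a.secret.key</name>
    <value>secretkey</value>
  </property>

  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://:9020</value>
  </property>

  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>

</configuration>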

For connecting to the bucket, I use the following storage plugin
configuration:

{
  "type" : "file",
  "connection" : "s3a://bucketname/",
  "config" : {
    "fs.s3a.impl.disable.cache" : "true"
  },
  "workspaces" : {
    "tmp" : {
      "location" : "/tmp",
      "writable" : true,
      "defaultInputFormat" : null,
      "allowAccessOutsideWorkspace" : false
    },
    "root" : {
      "location" : "/",
      "writable" : false,
      "defaultInputFormat" : null,
      "allowAccessOutsideWorkspace" : false
    }
  },
  "formats" : {
    "psv" : {
      "type" : "text",
      "extensions" : [ "tbl" ],
      "delimiter" : "|"
    },
    "csv" : {
      "type" : "text",
      "extensions" : [ "csv" ],
      "delimiter" : ","
    },
    "tsv" : {
      "type" : "text",
      "extensions" : [ "tsv" ],
      "delimiter" : "\t"
    },
    "parquet" : {
      "type" : "parquet"
    },
    "json" : {
      "type" : "json",
      "extensions" : [ "json" ]
    },
    "avro" : {
      "type" : "avro"
    },
    "sequencefile" : {
      "type" : "sequencefile",
      "extensions" : [ "seq" ]
    },
    "csvh" : {
      "type" : "text",
      "extensions" : [ "csvh" ],
      "extractHeader" : true,
      "delimiter" : ","
    }
  },
  "enabled" : true
}

Any suggestions on how to get this working?
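
Since the question later in the thread is whether Drill really receives the same settings that work elsewhere, it may help to print the effective `fs.s3a.*` values from the file Drill reads. The sketch below assumes the usual Hadoop layout of `configuration` / `property` / `name` / `value` elements. The helper names `s3a_properties` and `masked` are hypothetical, and `SAMPLE` simply repeats the placeholder values from this message.

```python
import xml.etree.ElementTree as ET


def s3a_properties(xml_text):
    """Return the fs.s3a.* properties from a Hadoop-style configuration."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        value = (prop.findtext("value") or "").strip()
        if name.startswith("fs.s3a."):
            props[name] = value
    return props


def masked(props):
    """Hide the secret key so the result can be pasted into a mail."""
    return {k: ("***" if k == "fs.s3a.secret.key" else v)
            for k, v in props.items()}


# Placeholder values copied from the message above, not real credentials.
SAMPLE = """<configuration>
  <property><name>fs.s3a.access.key</name><value>accesskey</value></property>
  <property><name>fs.s3a.secret.key</name><value>secretkey</value></property>
  <property><name>fs.s3a.endpoint</name><value>http://:9020</value></property>
  <property><name>fs.s3a.connection.ssl.enabled</name><value>false</value></property>
</configuration>"""

if __name__ == "__main__":
    for key, value in sorted(masked(s3a_properties(SAMPLE)).items()):
        print(f"{key} = {value}")
```

Running the same function on the `core-site.xml` in Drill's `conf` directory and comparing the output with the values given to boto3 would at least rule out a typo or a stale file.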