Re: EMC ECS Configuration with Apache Drill

2019-08-21 Thread Paul Rogers
Hi Prabu & Ted,

Ted is right: the next step in tracking this down is debugging. As large 
projects go, Drill is actually easier to debug than most. (Hats off to the 
team for achieving this valuable goal!)

1. Fork, clone and build Drill: [1]
2. Load the project into your IDE (both Eclipse and IntelliJ work). We used to 
have setup info, but I can't find it now; [2] gives an overview. I think you 
can just import drill/pom.xml as a Maven project (in Eclipse).
3. Find the test TestCsvWithHeaders.java [3]. Run it to verify that things work.
4. Create an ad-hoc test in this same package. You really just need a setup 
method and a test method:

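  // Assumes the test class extends ClusterTest (like TestCsvWithHeaders),
  // which supplies startCluster(), dirTestWatcher and client.
  @BeforeClass
  public static void setup() throws Exception {
    // Start an embedded Drillbit with parallelization limited to 1.
    startCluster(
        ClusterFixture.builder(dirTestWatcher)
            .maxParallelization(1));
  }

  @Test
  public void adHocTest() throws Exception {
    String sql = "SELECT * FROM ...";
    RowSet actual = client.queryBuilder().sql(sql).rowSet();
    actual.print();
    actual.clear(); // release the memory held by the result
  }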

The setup method starts your cluster. The test just runs a query and will print 
the results. Put your SQL here. Works best if the file is small.

You'll need to configure your data source; the test does not hit ZooKeeper, 
where we store the definitions you set in the Drill web UI. My tests tend to do 
the setup in code, but this gets pretty messy.

Anyone know how to do the storage plugin setup in some file so it works for a 
unit test? Maybe edit bootstrap-storage-plugins.json [4] for a quick & dirty 
solution?

Once you get past this, run the test. It will fail and print a big nasty stack 
dump. If you look carefully (ignore the first few stacks, they are on the 
client), you should see a stack trace on the server (which is running in the 
same process) where Drill is trying to open your file. You can set a breakpoint 
here and start poking around to see what's what.
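
If the exception object you catch in the test carries its cause chain, one way to find a breakpoint candidate without reading the whole dump by eye is to walk to the innermost cause and pick the topmost frame from the package you care about. The sketch below is plain Java with no Drill dependency. The class name StackProbe and its methods are hypothetical, and a prefix such as "org.apache.hadoop.fs." is only an assumption about where the interesting frame lives. If the server-side stack only shows up as text in the error message, reading the dump is still the way to go.

```java
import java.util.Optional;

/** Illustrative helper (not part of Drill): locate interesting frames in a nested exception. */
public class StackProbe {

  /** Follows getCause() links to the innermost exception. */
  public static Throwable rootCause(Throwable t) {
    Throwable current = t;
    // Guard against self-referential cause chains.
    while (current.getCause() != null && current.getCause() != current) {
      current = current.getCause();
    }
    return current;
  }

  /** Returns the topmost frame of the root cause whose class name starts with the prefix. */
  public static Optional<StackTraceElement> firstFrameIn(Throwable t, String classPrefix) {
    for (StackTraceElement frame : rootCause(t).getStackTrace()) {
      if (frame.getClassName().startsWith(classPrefix)) {
        return Optional.of(frame);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Exception wrapped = new RuntimeException("query failed",
        new IllegalStateException("could not open file"));
    System.out.println("Root cause: " + rootCause(wrapped));
    // In a Drill test you might pass "org.apache.hadoop.fs." instead (an assumption).
    firstFrameIn(wrapped, "StackProbe")
        .ifPresent(f -> System.out.println("Breakpoint candidate: " + f));
  }
}
```

In the ad-hoc test, this would mean wrapping the query call in a try/catch and passing the caught exception to firstFrameIn().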

Quite a bit to get right, so feel free to ask here (or on dev) to get help. 
Note also that there is detailed info in the "Learning Apache Drill" book for 
setting up your development environment.

Thanks,
- Paul

 
[1] http://drill.apache.org/docs/compiling-drill-from-source/
[2] https://github.com/apache/drill/tree/master/docs/dev
[3] https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/store/easy/text/compliant/TestCsvWithHeaders.java
[4] https://github.com/apache/drill/blob/master/exec/java-exec/src/main/resources/bootstrap-storage-plugins.json



  

Re: EMC ECS Configuration with Apache Drill

2019-08-21 Thread Ted Dunning
Prabu,

Yes. You can debug the code. It is a large codebase so that can be a bit of
a trick to get started.

I think that one of the most stable approaches is to build a test case that
accesses the data you want (this doesn't have to become a public test case,
it just makes debugging easier by being very repeatable).

I am not up to speed on how to do this, however.

Is there somebody else on the list who could advise on this?





Re: EMC ECS Configuration with Apache Drill

2019-08-21 Thread Prabu Mohan
Thanks Ted.

This is getting complex now. I thought I might be missing something simple
while configuring Drill, but this seems to be far beyond that.

I'm not sure whether I can get a proxy. Also, in case other issues occur,
is there a way I can debug the code to understand what values are being
passed?
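
As a first check before attaching a debugger, it can help to confirm which fs.s3a.* values the core-site.xml from the first message in this thread actually contains. The following is a minimal sketch using only the JDK's XML parser. The class name CoreSiteDump and its methods are hypothetical, and the file path is supplied by the caller. Note that it only shows what the file contains, not what Drill actually loaded at run time.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Illustrative helper (not part of Drill or Hadoop): list properties from a core-site.xml. */
public class CoreSiteDump {

  /** Returns name -> value for every <property> whose name starts with the prefix. */
  public static Map<String, String> properties(String xml, String prefix) {
    Map<String, String> result = new TreeMap<>();
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
      NodeList props = doc.getElementsByTagName("property");
      for (int i = 0; i < props.getLength(); i++) {
        Element p = (Element) props.item(i);
        String name = text(p, "name");
        if (name.startsWith(prefix)) {
          result.put(name, text(p, "value"));
        }
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse XML", e);
    }
    return result;
  }

  /** Trimmed text of the first nested element with the given tag, or "" if absent. */
  private static String text(Element parent, String tag) {
    NodeList nodes = parent.getElementsByTagName(tag);
    return nodes.getLength() == 0 ? "" : nodes.item(0).getTextContent().trim();
  }

  public static void main(String[] args) throws Exception {
    // args[0] is the path to core-site.xml.
    String xml = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
    // Mask the access and secret keys so they do not end up in a terminal log.
    properties(xml, "fs.s3a.").forEach((k, v) ->
        System.out.println(k + " = " + (k.endsWith(".key") ? "****" : v)));
  }
}
```

Comparing this output with the endpoint and credentials that work from boto3 is a cheap way to rule out a simple mismatch.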



Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Ted Dunning
On Mon, Aug 19, 2019 at 11:33 AM Prabu Mohan  wrote:

> but i am able to connect to ECS via python using boto3 libraries without
> any issues, I am able to write files to the bucket and read them back ..
>
> not sure why i am facing issues with drill though with the same credentials
>


The key here is your assumption that the same credentials are being passed
through Drill to AWS and that there isn't some other consideration that
keeps S3 from believing whatever credentials it is getting.

That assumption has to be attacked by figuring out experiments that can
prove or disprove aspects of it. For instance, if you can get a proxy in
the middle of the connection, you should be able to see *exactly* what is
on the wire. Likewise if you can get better logging out of Drill.
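
One low-effort version of this experiment, when a real proxy is not available, is a throwaway HTTP stub that prints whatever it receives. The sketch below uses the JDK's built-in com.sun.net.httpserver package. The class name WireCapture is hypothetical, and port 9020 is used only because it mirrors the endpoint in this thread. It is a capture stub rather than a true proxy: it always answers 200 with an empty body, so the S3 client will probably fail afterwards, but the request method, path and headers will have been printed by then.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative stub (not part of Drill): logs every request it receives. */
public class WireCapture {

  /** Formats one request as "METHOD URI" followed by its headers, sorted by name. */
  public static String describe(String method, String uri, Map<String, List<String>> headers) {
    StringBuilder out = new StringBuilder(method).append(' ').append(uri).append('\n');
    for (Map.Entry<String, List<String>> h : new TreeMap<>(headers).entrySet()) {
      out.append("  ").append(h.getKey()).append(": ")
         .append(String.join(", ", h.getValue())).append('\n');
    }
    return out.toString();
  }

  /** Starts a server on the given port that prints each request and answers 200 with no body. */
  public static HttpServer start(int port) throws IOException {
    HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
    server.createContext("/", (HttpExchange ex) -> {
      System.out.print(describe(ex.getRequestMethod(),
          ex.getRequestURI().toString(), ex.getRequestHeaders()));
      ex.sendResponseHeaders(200, -1); // -1 means no response body
      ex.close();
    });
    server.start();
    return server;
  }

  public static void main(String[] args) throws IOException {
    // Port 9020 mirrors the endpoint in this thread; any free port works.
    start(9020);
    System.out.println("Listening on 9020; point fs.s3a.endpoint at http://localhost:9020");
  }
}
```

Running the same stub against the boto3 client and against Drill would show how the two requests differ. The Authorization header it prints includes the access key ID, so treat the output as sensitive.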


Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Prabu Mohan
But I am able to connect to ECS from Python using the boto3 library without
any issues; I can write files to the bucket and read them back.

I'm not sure why I am facing issues with Drill when using the same credentials.


Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Ted Dunning
So that looks like the security tokens aren't getting through to AWS
correctly.


Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Prabu Mohan
Log info

2019-08-19 16:23:05,439 [22a54125-e1e7-29f6-a146-6cfd529b23d1:foreman]
INFO  o.a.drill.exec.work.foreman.Foreman - Query text for query with id
22a54125-e1e7-29f6-a146-6cfd529b23d1 issued by anonymous: use ecstest

2019-08-19 16:23:06,691 [22a54125-e1e7-29f6-a146-6cfd529b23d1:foreman]
ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR:
AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request
ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message:
Method Not Allowed





Please, refer to logs for more information.



[Error Id: 76b95ca3-0ed7-4463-b3a1-7b0a8a21cd1a on ]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR:
AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request
ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message:
Method Not Allowed





Please, refer to logs for more information.



[Error Id: 76b95ca3-0ed7-4463-b3a1-7b0a8a21cd1a on ]

    at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:630) ~[drill-common-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.Foreman$ForemanResult.close(Foreman.java:789) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.QueryStateProcessor.checkCommonStates(QueryStateProcessor.java:325) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.QueryStateProcessor.planning(QueryStateProcessor.java:221) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.QueryStateProcessor.moveToState(QueryStateProcessor.java:83) [drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:304) [drill-java-exec-1.16.0.jar:1.16.0]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_92]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_92]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_92]
Caused by: org.apache.drill.exec.work.foreman.ForemanException: Unexpected exception during fragment initialization: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
    at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:305) [drill-java-exec-1.16.0.jar:1.16.0]
    ... 3 common frames omitted
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Status Code: 405, AWS Service: Amazon S3, AWS Request ID: a1552153:16baf0e3c53:351c7:13, AWS Error Code: null, AWS Error Message: Method Not Allowed
    at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:798) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:421) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:232) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3528) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.headBucket(AmazonS3Client.java:1031) ~[aws-java-sdk-1.7.4.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.doesBucketExist(AmazonS3Client.java:994) ~[aws-java-sdk-1.7.4.jar:na]
    at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:297) ~[hadoop-aws-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2669) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:94) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2703) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2685) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:373) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:172) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.drill.exec.store.dfs.DrillFileSystem.<init>(DrillFileSystem.java:93) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.util.ImpersonationUtil.lambda$createFileSystem$0(ImpersonationUtil.java:215) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at java.security.AccessController.doPrivileged(Native Method) ~[na:1.8.0_92]
    at javax.security.auth.Subject.doAs(Subject.java:422) ~[na:1.8.0_92]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:213) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.util.ImpersonationUtil.createFileSystem(ImpersonationUtil.java:205) ~[drill-java-exec-1.16.0.jar:1.16.0]

at

Re: EMC ECS Configuration with Apache Drill

2019-08-19 Thread Ted Dunning
Did you see anything in any logs?





Re: EMC ECS Configuration with Apache Drill

2019-08-18 Thread Prabu Mohan
I am able to connect to the HTTP endpoint using boto3 from Python (I can
retrieve and store files). From IE, using https and port 9021, it comes
back with 403 Forbidden, which indicates that the browser was able to
connect to the site but does not have permission to view the page (possibly
due to credentials).

Regards,
Prabu




Re: EMC ECS Configuration with Apache Drill

2019-08-18 Thread SorabhApache
Are you able to use the same configured endpoint *http://:9020* from your
browser ?

Thanks,
Sorabh



EMC ECS Configuration with Apache Drill

2019-08-18 Thread Prabu Mohan
I'm trying to configure Apache Drill with EMC ECS. It was quite easy to
configure with AWS S3 and GCP Cloud Storage, but I'm facing issues with
EMC ECS.

When I use http with port 9020 as the endpoint, I get this error:

 Error:SYSTEM ERROR: AmazonS3Exception: Status Code: 405, AWS
Service:Amazon S3, AWS Request ID:, AWS Error Code:null, AWS Error Message:
Method Not Allowed

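core-site.xml

<configuration>

  <property>
    <name>fs.s3a.access.key</name>
    <value>accesskey</value>
  </property>

  <property>
    <name>fs.s3a.secret.key</name>
    <value>secretkey</value>
  </property>

  <property>
    <name>fs.s3a.endpoint</name>
    <value>http://:9020</value>
  </property>

  <property>
    <name>fs.s3a.connection.ssl.enabled</name>
    <value>false</value>
  </property>

</configuration>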

To connect to the bucket, I use the following storage plugin configuration:

{
  "type" : "file",
  "connection" : "s3a://bucketname/",
  "config" : {
    "fs.s3a.impl.disable.cache" : "true"
  },
  "workspaces" : {
    "tmp" : {
      "location" : "/tmp",
      "writable" : true,
      "defaultInputFormat" : null,
      "allowAccessOutsideWorkspace" : false
    },
    "root" : {
      "location" : "/",
      "writable" : false,
      "defaultInputFormat" : null,
      "allowAccessOutsideWorkspace" : false
    }
  },
  "formats" : {
    "psv" : {
      "type" : "text",
      "extensions" : [ "tbl" ],
      "delimiter" : "|"
    },
    "csv" : {
      "type" : "text",
      "extensions" : [ "csv" ],
      "delimiter" : ","
    },
    "tsv" : {
      "type" : "text",
      "extensions" : [ "tsv" ],
      "delimiter" : "\t"
    },
    "parquet" : {
      "type" : "parquet"
    },
    "json" : {
      "type" : "json",
      "extensions" : [ "json" ]
    },
    "avro" : {
      "type" : "avro"
    },
    "sequencefile" : {
      "type" : "sequencefile",
      "extensions" : [ "seq" ]
    },
    "csvh" : {
      "type" : "text",
      "extensions" : [ "csvh" ],
      "extractHeader" : true,
      "delimiter" : ","
    }
  },
  "enabled" : true
}

Any suggestions on how to get this working ?