Re: Role-based S3 access outside of EMR

2016-08-14 Thread Steve Loughran
On 29 Jul 2016, at 00:07, Everett Anderson wrote: Hey, Just wrapping this up -- I ended up following the instructions to build a custom Spark release with Hadoop 2.7.2,

Re: Role-based S3 access outside of EMR

2016-07-28 Thread Everett Anderson
rson [mailto:ever...@nuna.com.INVALID] Sent: 21 July 2016 17:01 To: Gourav Sengupta <gourav.sengu...@gmail.com> Cc: Teng Qiu <teng...@gmail.com>; Andy Davidson <a...@santacruzintegration.com>; user <user@s

Re: Role-based S3 access outside of EMR

2016-07-23 Thread Steve Loughran
dson <a...@santacruzintegration.com>; user <user@spark.apache.org> Subject: Re: Role-based S3 access outside of EMR Hey, FWIW, we are using EMR, actually, in production. The main case I have for wa

RE: Role-based S3 access outside of EMR

2016-07-21 Thread Ewan Leith
Sengupta <gourav.sengu...@gmail.com> Cc: Teng Qiu <teng...@gmail.com>; Andy Davidson <a...@santacruzintegration.com>; user <user@spark.apache.org> Subject: Re: Role-based S3 access outside of EMR Hey, FWIW, we are using EMR, actually, in production. The main case I have

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Everett Anderson
Hey, FWIW, we are using EMR, actually, in production. The main case I have for wanting to access S3 with Spark outside of EMR is that during development, our developers tend to run EC2 sandbox instances that have all the rest of our code and access to some of the input data on S3. It'd be nice

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Gourav Sengupta
Hi Teng, This is totally news to me, that people cannot use EMR in production because it's not open sourced. I think that even Werner is not aware of such a problem. Is EMRFS open sourced? I am curious to know what HA stands for. Regards, Gourav On Thu, Jul 21, 2016 at 8:37 AM,

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Teng Qiu
There are several reasons that AWS users do not (or cannot) use EMR. One point for us is security compliance: EMR is not open sourced at all, so we cannot use it in a production system. The second is that EMR does not support HA yet. But to the original question from @Everett: -> Credentials and

Re: Role-based S3 access outside of EMR

2016-07-21 Thread Gourav Sengupta
But that would mean you would be accessing data over the internet, increasing data read latency and data transmission failures. Why are you not using EMR? Regards, Gourav On Thu, Jul 21, 2016 at 1:06 AM, Everett Anderson wrote: > Thanks, Andy. > > I am indeed often doing

Re: Role-based S3 access outside of EMR

2016-07-20 Thread Everett Anderson
Thanks, Andy. I am indeed often doing something similar, now -- copying data locally rather than dealing with the S3 impl selection and AWS credentials issues. It'd be nice if it worked a little easier out of the box, though! On Tue, Jul 19, 2016 at 2:47 PM, Andy Davidson <

Re: Role-based S3 access outside of EMR

2016-07-19 Thread Andy Davidson
Hi Everett, I always do my initial data exploration and all our product development in my local dev env. I typically select a small data set and copy it to my local machine. My main() has an optional command line argument '--runLocal'. Normally I load data from either hdfs:/// or s3n://. If the