Glad to help, and sorry for missing a t in Matt in my previous email.

On Thu, Dec 20, 2012 at 9:54 AM, Matt Goeke <[email protected]> wrote:

> Alejandro,
>
> I am pretty sure you just saved me a couple hours of digging :)
>
> Most of these issues stem from our use of VPC but it looks like the
> hostname that it was retrieving using the 'hostname -f' in that script was
> in fact unresolvable from the subnet my EMR instance is on. A simple
> override of that property with the IP of the Oozie EC2 box and everything
> is running perfectly.
>
> As always thank you for the kind advice.
>
> --
> Matt
>
> On Thu, Dec 20, 2012 at 3:39 AM, Alejandro Abdelnur <[email protected]> wrote:
>
> > Hi Mat,
> >
> > Your 11 min delay suggests that Oozie is falling back on its fail-safe
> > polling mechanism to detect that a job completed in the cluster.
> >
> > You may have to tweak the following VAR in the oozie-env.sh script:
> >
> > # The base URL for callback URLs to Oozie
> > #
> > # export OOZIE_BASE_URL="http://${OOZIE_HTTP_HOSTNAME}:${OOZIE_HTTP_PORT}/oozie"
> >
> > Oozie sets callback URLs in the jobconfs using this base URL, and the JT
> > calls back Oozie as soon as a job completes. If the hostname is not correct
> > (see the rest of the script to find out how the hostname is resolved), the
> > call from the JT will never arrive and Oozie will instead do a check every
> > 10 mins for each action.
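
A minimal sketch of such an override in oozie-env.sh, assuming the JobTracker
can reach the Oozie box by IP. The address 10.0.0.12 is a made-up placeholder,
and 11000 is Oozie's default HTTP port; substitute whatever applies to your
own setup.

```shell
# Hypothetical values: use an address the JobTracker can actually reach.
OOZIE_HTTP_HOSTNAME="10.0.0.12"   # assumed private IP of the Oozie EC2 box
OOZIE_HTTP_PORT="11000"           # Oozie's default HTTP port

# The base URL for callback URLs to Oozie, built from the two values above.
export OOZIE_BASE_URL="http://${OOZIE_HTTP_HOSTNAME}:${OOZIE_HTTP_PORT}/oozie"

# Print the result so it can be compared with the callback URL in a jobconf.
echo "${OOZIE_BASE_URL}"
```

Setting the hostname variable rather than hard-coding the whole URL keeps the
port and path consistent with the rest of the script.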
> >
> > Hope this solves your prob.
> >
> > Regarding the wiki: yes, the Oozie wiki,
> > https://cwiki.apache.org/confluence/display/OOZIE , which seems to be down
> > at the moment.
> >
> > Thx
> >
> >
> >
> > On Thu, Dec 20, 2012 at 9:25 AM, Matt Goeke <[email protected]>
> > wrote:
> >
> > > All,
> > >
> > > I have gotten over all of the blocker configuration hurdles between EC2
> > > and EMR and I am able to submit one of my jobs with success.
> > > Unfortunately, I am running into a weird issue with my actions where each
> > > one takes exactly 11 mins from start to finish even though the underlying
> > > MR job is nowhere near that long (one action is exactly 34 seconds and the
> > > other is 5 mins 21 seconds). I cannot guarantee that this is not an issue
> > > with overall resources on the node I am running the Oozie instance on,
> > > but I highly doubt it since this is on an m1.large spec, so I am curious
> > > whether there are any changes to the site that might help flush out what
> > > is causing this issue.
> > >
> > >
> > >
> > > Alejandro: The actual setup and configuration of this is fairly
> > > straightforward, so I am happy to write up a wiki on this if you guys have
> > > a specific wiki in mind. I am not sure many people are keen on using EMR
> > > as a persistent cluster (I assume most persistent clusters are set up
> > > across EC2 nodes) but I am actually very pleased with it so far since it
> > > greatly reduces the amount of initial setup required to spin up a cluster.
> > >
> > > --
> > > Matt
> > >
> > >
> > > On Tue, Dec 18, 2012 at 5:48 PM, Robert Kanter <[email protected]>
> > > wrote:
> > >
> > > > Hi Matt,
> > > >
> > > > The oozie.service.ProxyUserService.proxyuser.hadoop.hosts and
> > > > oozie.service.ProxyUserService.proxyuser.hadoop.groups
> > > > properties are part of Oozie's configuration and would go in
> > > > oozie-site.xml.  This lets you impersonate users on the Oozie side of
> > > > things.  See
> > > >
> > > >
> > >
> >
> http://oozie.apache.org/docs/3.3.0/AG_Install.html#User_ProxyUser_Configurationfor
> > > > more info.
> > > >
> > > > > There are two similar properties for Hadoop that go into core-site.xml:
> > > > > hadoop.proxyuser.oozie.hosts
> > > > > and hadoop.proxyuser.oozie.groups
> > > > > I think this is what you need to fix your error.  See
> > > > > http://hadoop.apache.org/docs/stable/Secure_Impersonation.html for
> > > > > more info.
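
As a sketch only, the two Hadoop-side properties named above could look like
this in core-site.xml, written in the same one-line format used elsewhere in
this thread. The middle segment of each name ("oozie" here) is the OS user the
Oozie server runs as, so it is an assumption: since the error in this thread
reads "User: hadoop is not allowed to impersonate hadoop", the server here may
be running as "hadoop", in which case the names would be
hadoop.proxyuser.hadoop.hosts and hadoop.proxyuser.hadoop.groups. The "*"
wildcard values are illustrative and can be narrowed to specific hosts and
groups.

```xml
<!-- core-site.xml (Hadoop side). "oozie" is assumed to be the user running the Oozie server. -->
<property><name>hadoop.proxyuser.oozie.hosts</name><value>*</value></property>
<property><name>hadoop.proxyuser.oozie.groups</name><value>*</value></property>
```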
> > > >
> > > > - Robert
> > > >
> > > >
> > > >
> > > > > On Tue, Dec 18, 2012 at 3:38 PM, Matt Goeke <[email protected]> wrote:
> > > >
> > > > > All,
> > > > >
> > > > > > Still working on getting Oozie 3.3 integrated with EMR, with most of
> > > > > > my time so far spent resolving the security group config needed for
> > > > > > VPC. The EC2 configuration was pretty simple but the main blocker
> > > > > > right now is getting past the error below:
> > > > >
> > > > > > Caused by: org.apache.hadoop.ipc.RemoteException: User: hadoop is not allowed to impersonate hadoop
> > > > > >         at org.apache.hadoop.ipc.Client.call(Client.java:1070)
> > > > > >         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
> > > > > >         at $Proxy24.getProtocolVersion(Unknown Source)
> > > > > >         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
> > > > > >         at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
> > > > > >         at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
> > > > > >         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
> > > > > >         at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
> > > > > >         at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
> > > > > >         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
> > > > > >         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> > > > > >         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
> > > > > >         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
> > > > > >         at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:411)
> > > > > >         at org.apache.oozie.service.HadoopAccessorService$2.run(HadoopAccessorService.java:409)
> > > > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > > > >         at javax.security.auth.Subject.doAs(Subject.java:415)
> > > > > >         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
> > > > > >         at org.apache.oozie.service.HadoopAccessorService.createFileSystem(HadoopAccessorService.java:409)
> > > > > >         ... 26 more
> > > > >
> > > > > > I know this is usually related to not having the correct proxy
> > > > > > configs in the core site but my current core site proxy configs are
> > > > > > below (and I have bounced both the NN and the JT since applying them):
> > > > >
> > > > >
> <property><name>dfs.permissions</name><value>false</value></property>
> > > > >
> > > > >
> > > >
> > >
> >
> <property><name>oozie.service.ProxyUserService.proxyuser.hadoop.hosts</name><value>*</value></property>
> > > > >
> > > > >
> > > >
> > >
> >
> <property><name>oozie.service.ProxyUserService.proxyuser.hadoop.groups</name><value>*</value></property>
> > > > >
> > > > > > If I recall correctly this authorization is only checked at the
> > > > > > JT/NN level and therefore shouldn't need to be pushed to the core
> > > > > > site on the slave machines, right? Also, would there be any reason
> > > > > > the wildcard would be incompatible across Hadoop distros (we are
> > > > > > currently using 1.0.3 from EMR)? Lastly, just for the sake of
> > > > > > clarity, is the proxy hosts config based on the box submitting the
> > > > > > Oozie request (edge node) or based on the boxes actually running
> > > > > > the jobs (data/task nodes)?
> > > > >
> > > > > --
> > > > > Matt
> > > > >
> > > > >
> > > > > > On Thu, Dec 13, 2012 at 6:19 PM, Alejandro Abdelnur <[email protected]> wrote:
> > > > >
> > > > > > Matt,
> > > > > >
> > > > > > > It is not a matter of bundling native code or not. Officially we
> > > > > > > are supposed to do source releases only. As a convenience we
> > > > > > > could do binaries, but there are discussions about that, such as
> > > > > > > whether they could be signed or not.
> > > > > >
> > > > > > > Regarding installing/running Oozie in EC2: I have never done it.
> > > > > > > Would you mind writing up a wiki on it once you figure it out?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > >
> > > > > > > On Thu, Dec 13, 2012 at 4:02 PM, Matt Goeke <[email protected]> wrote:
> > > > > >
> > > > > > > Thank you both for the follow-up.
> > > > > > >
> > > > > > > 2 other questions that pertain to this:
> > > > > > > > 1) I don't remember any natives being required for Oozie, so is
> > > > > > > > there a reason why we don't release with a -bin like most other
> > > > > > > > Apache projects?
> > > > > > > > 2) Are there any issues I might expect to run into when trying
> > > > > > > > to run this on EC2 backed by EMR?
> > > > > > >
> > > > > > > --
> > > > > > > Matt
> > > > > > >
> > > > > > >
> > > > > > > > On Thu, Dec 13, 2012 at 5:48 PM, Alejandro Abdelnur <[email protected]> wrote:
> > > > > > >
> > > > > > > > Matt,
> > > > > > > >
> > > > > > > > > Apache Oozie release artifacts are sources only. The easiest
> > > > > > > > > way to build the TARBALL is:
> > > > > > > >
> > > > > > > > * install Maven
> > > > > > > > * run bin/mkdistro.sh -DskipTests
> > > > > > > >
> > > > > > > > Then follow the Quick Start instructions.
> > > > > > > >
> > > > > > > > I'll open a JIRA to add this to the docs.
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Thu, Dec 13, 2012 at 3:36 PM, Matt Goeke <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > All,
> > > > > > > > >
> > > > > > > > > > I am falling back to Oozie 3.2 for now but can someone
> > > > > > > > > > possibly explain how Oozie 3.3 is supposed to be configured?
> > > > > > > > > > I was hoping to just follow the quick start guide but it
> > > > > > > > > > seems like the packaging does not match up at all.
> > > > > > > > > >
> > > > > > > > > > Trying to work through it I ended up downloading Maven and
> > > > > > > > > > running a 'mvn install' on the folder, which built some of
> > > > > > > > > > the hadooplibs, but I am still missing all of the bin scripts.
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Matt
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Alejandro
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Alejandro
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > Alejandro
> >
>



-- 
Alejandro
