Hey Zhijie,

Do you know what exactly is coming out in Apache Hadoop 2.4 for HA? Will it have ZK-backed (both state and leadership election) HA RMs? I've had a lot of trouble figuring out exactly what the state of HA is in YARN in all the JIRAs and CDH modifications.
Cheers,
Chris

On 3/20/14 11:14 AM, "Zhijie Shen" <[email protected]> wrote:

>If I remember correctly, qjournal is used by HDFS, but not YARN. Both of
>the components have separate HA stack. BTW, after Hadoop 2.4, HA should be
>in a better shape.
>
>- Zhijie
>
>
>On Thu, Mar 20, 2014 at 10:59 AM, Dan Di Spaltro
><[email protected]> wrote:
>
>> Is there a different type of YARN HA? It seems the method of HA for CDH5
>> uses the qjournal on top of the zkfc.
>>
>> -Dan
>>
>>
>> On Wed, Mar 19, 2014 at 10:53 AM, Yan Fang <[email protected]> wrote:
>>
>> > Hi Chris,
>> >
>> > I have made the Samza run in HA yarn, leveraging the high available
>> > configuration. Just put my coarse approach here in case someone faces the
>> > similar problem.
>> >
>> > The HA yarn is from CDH5-beta 2 version, which is ZK-based HA yarn. It
>> > seems not working by just replacing the jar file. So the way I made it work
>> > is a little hacky: changed the samza-yarn a little, having the client check
>> > the current active RM from Zookeeper every time it submits AM. (Because HA
>> > yarn keeps the active RM name in the ZK). Of course, Samza works well. It
>> > will automatically get restarted when the RM changes (that is, standby RM
>> > becomes active when active RM fails).
>> >
>> > Hope someone has a better idea for doing this. Thank you.
>> >
>> > Cheers,
>> >
>> > Fang, Yan
>> > [email protected]
>> > +1 (206) 849-4108
>> >
>> >
>> > On Mon, Mar 10, 2014 at 4:35 PM, Yan Fang <[email protected]> wrote:
>> >
>> > > Hi Chris,
>> > >
>> > > Thank you! You are correct, I am actually working in a CDH5-beta version.
>> > > Will definitely try as you recommended and do some experiments to see how
>> > > Samza performances.
>> > >
>> > > Cheers,
>> > >
>> > > Fang, Yan
>> > > [email protected]
>> > > +1 (206) 849-4108
>> > >
>> > >
>> > > On Mon, Mar 10, 2014 at 3:54 PM, Chris Riccomini <[email protected]> wrote:
>> > >
>> > >> Hey Yan,
>> > >>
>> > >> I'm not aware of anyone successfully running Samza with CDH5's HA YARN. As
>> > >> far as I understand, those patches are not fully merged in to Apache yet
>> > >> (I could be wrong, though).
>> > >>
>> > >> At a minimum, you'll probably need to replace Samza's 2.2 YARN jars with
>> > >> the CDH5 jars, so that Samza properly interprets the different configs
>> > >> (e.g. The new RM style of config, which you've mentioned).
>> > >>
>> > >> I'm not sure how Samza's YARN AM will behave when the RM is failed over.
>> > >> You'll have to experiment with this and see. If you find anything out,
>> > >> it'd be very very useful if you could share it with the rest of us. Samza
>> > >> and HA RMs is something that we're investigating as well.
>> > >>
>> > >> Cheers,
>> > >> Chris
>> > >>
>> > >> On 3/10/14 12:11 PM, "Yan Fang" <[email protected]> wrote:
>> > >>
>> > >> >Hi All,
>> > >> >
>> > >> >Happy daylight saving! I am wondering if anyone in this mailing-list has
>> > >> >successfully run the Samza in a HA YARN cluster ?
>> > >> >
>> > >> >We are trying to run Samza in CDH5 which has HA YARN configurations. I am
>> > >> >able to run Samza only by updating the yarn-default.xml (change
>> > >> >yarn.resourcemanager.address), the same approach Nirmal Kumar mentioned in
>> > >> >"Running Samza on multi node". Otherwise, it will always connect to 0.0.0.0
>> > >> >in yarn-default.xml. (I am sure I set the conf file and YARN_HOME
>> > >> >correctly.)
>> > >> >
>> > >> >So my question is:
>> > >> >1. Can't Samza interpret HA YARN configuration file correctly?
>> > >> >(Is that because the HA YARN configuration is using, say,
>> > >> >yarn.resourcemanager.address.*rm15* instead of
>> > >> >yarn.resourcemanager.address?)
>> > >> >
>> > >> >2. Is it possible to switch to a new RM automatically when one is down?
>> > >> >Because we have two RMs, one for Active and one for Standby but I can only
>> > >> >put one RM address in yarn-default.xml. I am wondering if it is possible to
>> > >> >detect the active RM automatically in Samza (or other method)?
>> > >> >
>> > >> >3. Any one has the luck to leverage the HA YARN?
>> > >> >
>> > >> >Thank you.
>> > >> >
>> > >> >Cheers,
>> > >> >
>> > >> >Fang, Yan
>> > >> >[email protected]
>> > >> >+1 (206) 849-4108
>> > >> >
>> > >> >
>> > >> >On Fri, Feb 21, 2014 at 3:23 PM, Chris Riccomini
>> > >> ><[email protected]> wrote:
>> > >> >
>> > >> >> Hey Ethan,
>> > >> >>
>> > >> >> YARN's HA support is marginal right now, and we're still investigating
>> > >> >> this stuff. Some useful things to read are:
>> > >> >>
>> > >> >> * https://issues.apache.org/jira/browse/YARN-128
>> > >> >> * https://issues.apache.org/jira/browse/YARN-149
>> > >> >> * https://issues.apache.org/jira/browse/YARN-353
>> > >> >> * https://issues.apache.org/jira/browse/YARN-556
>> > >> >>
>> > >> >> Also, CDH seems to be packaging some of the ZK-based HA stuff already:
>> > >> >>
>> > >> >> https://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-High-Availability-Guide/cdh5hag_cfg_RM_HA.html
>> > >> >>
>> > >> >> At LI, we're still experimenting with the best setup, so my guidance might
>> > >> >> not be state of the art.
>> > >> >> We currently configure the YARN RM's store
>> > >> >> (yarn.resourcemanager.store.class) to use the file system store
>> > >> >> (org.apache.hadoop.yarn.server.resourcemanager.recovery.FileSystemRMStateStore).
>> > >> >> The failover is a manual operation where we copy the RM state to a
>> > >> >> new machine, and then start the RM on that machine. You then need to front
>> > >> >> the RM with a VIP or DNS entry, which you can update to point to the new
>> > >> >> RM machine when a failover occurs. The NMs need to be configured to point
>> > >> >> to this VIP/DNS entry, so that when a failover occurs, the NMs don't need
>> > >> >> to update their yarn-site.xml files.
>> > >> >>
>> > >> >> It sounds like in the future you won't need to use VIPs/DNS entries. You
>> > >> >> should probably also email the YARN mailing list, just in case we're
>> > >> >> misinformed or unaware of some new updates.
>> > >> >>
>> > >> >> Cheers,
>> > >> >> Chris
>> > >> >>
>> > >> >> On 2/21/14 2:27 PM, "Ethan Setnik" <[email protected]> wrote:
>> > >> >>
>> > >> >> >I'm looking to deploy Samza on AWS infrastructure in a HA configuration. I
>> > >> >> >have a clear picture of how to configure all the components such that they
>> > >> >> >do not contain any single point of failure.
>> > >> >> >
>> > >> >> >I'm stuck, however, when it comes to the YARN architecture. It seems that
>> > >> >> >YARN relies on the single-master / multi-slave pattern as described in the
>> > >> >> >YARN documentation. This introduces a single point of failure at the
>> > >> >> >ResourceManager level such that a failed ResourceManager will fail the
>> > >> >> >entire YARN cluster.
>> > >> >> >How does LinkedIn architect a HA configuration for
>> > >> >> >Samza on YARN such that a complete instance failure of ResourceManager
>> > >> >> >provides failover for the YARN cluster?
>> > >> >> >
>> > >> >> >Thanks for your help.
>> > >> >> >
>> > >> >> >Best,
>> > >> >> >Ethan
>> > >> >> >
>> > >> >> >--
>> > >> >> >Ethan Setnik
>> > >> >> >MobileAware
>> > >> >> >
>> > >> >> >m: +1 617 513 2052
>> > >> >> >e: [email protected]
>>
>> --
>> Dan Di Spaltro
>
>--
>Zhijie Shen
>Hortonworks Inc.
>http://hortonworks.com/
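
On the question in the thread about HA-style keys such as yarn.resourcemanager.address.<rm-id> versus the plain yarn.resourcemanager.address, here is a minimal sketch of how a client might resolve RM addresses from an already-parsed config. It is an illustration only, not what Samza or YARN actually does. The function name resolve_rm_addresses and the dict-based config are hypothetical; the keys yarn.resourcemanager.ha.enabled and yarn.resourcemanager.ha.rm-ids are assumed from the Apache RM HA configuration and are not named in the thread; the 0.0.0.0:8032 fallback is an assumed stand-in for the yarn-default.xml value mentioned above.

```python
# Hypothetical helper: map HA-style YARN config keys to RM addresses.
# `conf` is assumed to be a plain dict of property name -> string value
# (for example, parsed from yarn-site.xml by some other code).

DEFAULT_RM_ADDRESS = "0.0.0.0:8032"  # assumed stand-in for the yarn-default.xml value


def resolve_rm_addresses(conf, default=DEFAULT_RM_ADDRESS):
    """Return {rm_id: address} when HA is configured, else {None: address}."""
    ha_enabled = conf.get("yarn.resourcemanager.ha.enabled", "false").strip().lower() == "true"
    raw_ids = conf.get("yarn.resourcemanager.ha.rm-ids", "")
    rm_ids = [rm_id.strip() for rm_id in raw_ids.split(",") if rm_id.strip()]

    if not (ha_enabled and rm_ids):
        # Non-HA case: a single address, falling back to the default.
        return {None: conf.get("yarn.resourcemanager.address", default)}

    addresses = {}
    for rm_id in rm_ids:
        key = "yarn.resourcemanager.address." + rm_id
        if key not in conf:
            raise KeyError("missing HA address property: " + key)
        addresses[rm_id] = conf[key]
    return addresses


if __name__ == "__main__":
    ha_conf = {
        "yarn.resourcemanager.ha.enabled": "true",
        "yarn.resourcemanager.ha.rm-ids": "rm15, rm16",
        "yarn.resourcemanager.address.rm15": "rm-a.example.com:8032",
        "yarn.resourcemanager.address.rm16": "rm-b.example.com:8032",
    }
    print(resolve_rm_addresses(ha_conf))
```

This only lists the candidate RMs; it does not say which one is currently active. Finding the active RM would still need something like the ZooKeeper lookup Yan describes, or trying each address in turn.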
