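No. The ZooKeeper daemons are the Apache ZooKeeper servers themselves (http://zookeeper.apache.org), not ZKFC. ZKFC is only a client of that ZooKeeper quorum.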
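
As a rough illustration of why the quorum services want an odd number of peers while ZKFC does not, here is a minimal Python sketch. It is not taken from Hadoop or ZooKeeper: the `check_layout` helper and the host names (`nn1`, `nn2`, `jt1`) are hypothetical, and it encodes only the rules stated in this thread (one ZKFC per NameNode; 3 or more, odd, ZK peers and JournalNodes).

```python
def majority(n: int) -> int:
    """Smallest number of peers that forms a strict majority of n."""
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    """How many peers can fail while a majority is still reachable."""
    return n - majority(n)


def check_layout(namenodes, zkfcs, zk_peers, journalnodes):
    """Return a list of problems with a proposed HA layout (hypothetical helper).

    Encodes only the rules stated in this thread:
    one ZKFC per NameNode, and 3 or more (odd) ZK peers and JournalNodes.
    """
    problems = []
    if set(zkfcs) != set(namenodes):
        problems.append("ZKFC must run on exactly the NameNode hosts")
    for name, peers in (("ZooKeeper", zk_peers), ("JournalNode", journalnodes)):
        if len(peers) < 3 or len(peers) % 2 == 0:
            problems.append(f"{name} pool should be an odd number >= 3")
    return problems


# Even pool sizes add a node without adding fault tolerance.
for n in (2, 3, 4, 5):
    print(n, "peers: majority", majority(n), "tolerates", tolerated_failures(n))

# Hypothetical host names; ZK peers and JNs co-located on the master hosts.
nns = ["nn1", "nn2"]
masters = ["nn1", "nn2", "jt1"]
print(check_layout(nns, zkfcs=nns, zk_peers=masters, journalnodes=masters))
print(check_layout(nns, zkfcs=masters, zk_peers=nns, journalnodes=masters))
```

With 2 peers a single failure already loses the majority, and 4 peers tolerate no more failures than 3 do, which is why 3 or 5 is the usual choice. The second `check_layout` call mirrors the layout first proposed in this thread (a third ZKFC on the JobTracker node) and reports it as a problem.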

On Tue, Jan 15, 2013 at 3:38 PM, ESGLinux <[email protected]> wrote:
> Hi Harsh,
>
> Now I'm completely confused :-))))
>
> As you pointed out, ZKFC runs only on the NNs. That looks right.
>
> So, what are the ZK peers (the odd number I'm looking for) and where do I
> have to run them? On another 3 nodes?
>
> As I can read from the previous URL:
>
> In a typical deployment, ZooKeeper daemons are configured to run on three
> or five nodes. Since ZooKeeper itself has light resource requirements, it
> is acceptable to collocate the ZooKeeper nodes on the same hardware as the
> HDFS NameNode and Standby Node. Many operators choose to deploy the third
> ZooKeeper process on the same node as the YARN ResourceManager. It is
> advisable to configure the ZooKeeper nodes to store their data on separate
> disk drives from the HDFS metadata for best performance and isolation.
>
> Here, ZooKeeper daemons = ZKFC?
>
> Thanks
>
> ESGLinux,
>
> 2013/1/15 Harsh J <[email protected]>
>
>> Hi,
>>
>> I fail to see your confusion.
>>
>> ZKFC != ZK
>>
>> ZK is quorum software, like QJM is. The ZK peers are to be run in odd
>> numbers, just as the JNs are.
>>
>> ZKFC is something the NN needs for its Automatic Failover capability. It
>> is a client to ZK and thereby demands ZK's presence, for which the odd
>> number of nodes is suggested. ZKFC itself is only to be run one per NN.
>>
>> On Tue, Jan 15, 2013 at 3:23 PM, ESGLinux <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I'm only testing the new HA feature. I'm not on a production system.
>>>
>>> Well, let's talk about the number of nodes and the ZKFC daemons.
>>>
>>> In this URL:
>>>
>>> https://ccp.cloudera.com/display/CDH4DOC/HDFS+High+Availability+Initial+Deployment#HDFSHighAvailabilityInitialDeployment-DeployingAutomaticFailover
>>>
>>> you can read:
>>> If you have configured automatic failover using the ZooKeeper
>>> FailoverController (ZKFC), you must install and start the zkfc daemon on
>>> each of the machines that runs a NameNode.
>>>
>>> So the number of ZKFC daemons is two, but in this URL:
>>>
>>> http://archive.cloudera.com/cdh4/cdh/4/hadoop/hadoop-yarn/hadoop-yarn-site/HDFSHighAvailabilityWithQJM.html#Deploying_ZooKeeper
>>>
>>> you can read this:
>>> In a typical deployment, ZooKeeper daemons are configured to run on
>>> three or five nodes
>>>
>>> I think that to ensure a good HA environment (of any kind) you need an
>>> odd number of nodes to avoid split-brain. The problem I see here is that
>>> if ZKFC monitors NameNodes, in a CDH4 environment you only have 2 NNs
>>> (active + standby).
>>>
>>> So I'm a bit confused by this deployment...
>>>
>>> Any suggestions?
>>>
>>> Thanks in advance for all your answers
>>>
>>> Kind regards,
>>>
>>> ESGLinux
>>>
>>> 2013/1/14 Colin McCabe <[email protected]>
>>>
>>>> On Mon, Jan 14, 2013 at 11:49 AM, Colin McCabe <[email protected]> wrote:
>>>> > Hi ESGLinux,
>>>> >
>>>> > In production, you need to run QJM on at least 3 nodes. You also need
>>>> > to run ZKFC on at least 3 nodes. You can run them on the same nodes
>>>> > if you like, though.
>>>>
>>>> Er, this should read "You also need to run ZooKeeper on at least 3
>>>> nodes." ZKFC, which talks to ZooKeeper, runs on only two nodes: the
>>>> active NN node and the standby NN node.
>>>>
>>>> Colin
>>>>
>>>> >
>>>> > Of course, none of this is "needed" to set up an example cluster. If
>>>> > you just want to try something out, you can run everything on the same
>>>> > node if you want. It depends on what you're trying to do.
>>>> >
>>>> > cheers,
>>>> > Colin
>>>> >
>>>> > On Fri, Dec 28, 2012 at 3:02 AM, ESGLinux <[email protected]> wrote:
>>>> >> Thank you for your answer Craig,
>>>> >>
>>>> >> I'm planning my cluster and for now I'm not sure how many machines I
>>>> >> need ;-)
>>>> >>
>>>> >> If I have doubts I'll do what Cloudera says, and if I have a problem
>>>> >> I have somewhere to ask for explanations :-)
>>>> >>
>>>> >> ESGLinux
>>>> >>
>>>> >> 2012/12/28 Craig Munro <[email protected]>
>>>> >>>
>>>> >>> OK, I have reliable storage on my datanodes so not an issue for me.
>>>> >>> If that's what Cloudera recommends then I'm sure it's fine.
>>>> >>>
>>>> >>> On Dec 28, 2012 10:38 AM, "ESGLinux" <[email protected]> wrote:
>>>> >>>>
>>>> >>>> Hi Craig,
>>>> >>>>
>>>> >>>> I'm a bit confused, I have read this from Cloudera:
>>>> >>>>
>>>> >>>> https://ccp.cloudera.com/display/CDH4DOC/Hardware+Configuration+for+Quorum-based+Storage
>>>> >>>>
>>>> >>>> The JournalNode daemon is relatively lightweight, so these daemons
>>>> >>>> can reasonably be collocated on machines with other Hadoop daemons,
>>>> >>>> for example NameNodes, the JobTracker, or the YARN ResourceManager.
>>>> >>>> Cloudera recommends that you deploy the JournalNode daemons on the
>>>> >>>> "master" host or hosts (NameNode, Standby NameNode, JobTracker,
>>>> >>>> etc.) so the JournalNodes' local directories can use the reliable
>>>> >>>> local storage on those machines.
>>>> >>>> There must be at least three JournalNode daemons, since edit log
>>>> >>>> modifications must be written to a majority of JournalNodes
>>>> >>>>
>>>> >>>> As you can read, they recommend putting the JournalNode daemons with
>>>> >>>> the NameNodes, but you say the opposite??
>>>> >>>>
>>>> >>>> Thanks for your answer,
>>>> >>>>
>>>> >>>> ESGLinux,
>>>> >>>>
>>>> >>>> 2012/12/28 Craig Munro <[email protected]>
>>>> >>>>>
>>>> >>>>> You need the following:
>>>> >>>>>
>>>> >>>>> - active namenode + zkfc
>>>> >>>>> - standby namenode + zkfc
>>>> >>>>> - pool of journal nodes (odd number, 3 or more)
>>>> >>>>> - pool of zookeeper nodes (odd number, 3 or more)
>>>> >>>>>
>>>> >>>>> As the journal nodes hold the namesystem transactions they should
>>>> >>>>> not be co-located with the namenodes in case of failure. I
>>>> >>>>> distribute the journal and zookeeper nodes across the hosts running
>>>> >>>>> datanodes or, as Harsh says, you could co-locate them on dedicated
>>>> >>>>> hosts.
>>>> >>>>>
>>>> >>>>> ZKFC does not monitor the JobTracker.
>>>> >>>>>
>>>> >>>>> Regards,
>>>> >>>>> Craig
>>>> >>>>>
>>>> >>>>> On Dec 28, 2012 9:25 AM, "ESGLinux" <[email protected]> wrote:
>>>> >>>>>>
>>>> >>>>>> Hi,
>>>> >>>>>>
>>>> >>>>>> Well, if I have understood you, I can configure my NN HA cluster
>>>> >>>>>> this way:
>>>> >>>>>>
>>>> >>>>>> - Active NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - Standby NameNode + 1 ZKFC daemon + Journal Node
>>>> >>>>>> - JobTracker node + 1 ZKFC daemon + Journal Node,
>>>> >>>>>>
>>>> >>>>>> Is this right?
>>>> >>>>>>
>>>> >>>>>> Thanks in advance,
>>>> >>>>>>
>>>> >>>>>> ESGLinux,
>>>> >>>>>>
>>>> >>>>>> 2012/12/27 Harsh J <[email protected]>
>>>> >>>>>>>
>>>> >>>>>>> Hi,
>>>> >>>>>>>
>>>> >>>>>>> There are two different things here: Automatic Failover and
>>>> >>>>>>> Quorum Journal Manager. The former, used via a ZooKeeper Failover
>>>> >>>>>>> Controller, is to manage failovers automatically (based on health
>>>> >>>>>>> checks of NNs). The latter, used via a set of Journal Nodes, is a
>>>> >>>>>>> medium of shared storage for namesystem transactions that helps
>>>> >>>>>>> enable HA.
>>>> >>>>>>>
>>>> >>>>>>> In a typical deployment, you want 3 or more (odd) JournalNodes
>>>> >>>>>>> for reliable HA, preferably on nodes of their own if possible
>>>> >>>>>>> (like you would for typical ZooKeepers, and you may co-locate
>>>> >>>>>>> with those as well) and one ZKFC for each NameNode (connected to
>>>> >>>>>>> the same ZK quorum).
>>>> >>>>>>>
>>>> >>>>>>> On Thu, Dec 27, 2012 at 5:33 PM, ESGLinux <[email protected]> wrote:
>>>> >>>>>>> > Hi all,
>>>> >>>>>>> >
>>>> >>>>>>> > I have a question about how to deploy ZooKeeper in a NN HA
>>>> >>>>>>> > cluster.
>>>> >>>>>>> >
>>>> >>>>>>> > As far as I know, I need at least three nodes to run three
>>>> >>>>>>> > ZooKeeper FailOver Controllers (ZKFC). I plan to place these 3
>>>> >>>>>>> > daemons this way:
>>>> >>>>>>> >
>>>> >>>>>>> > - Active NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - Standby NameNode + 1 ZKFC daemon
>>>> >>>>>>> > - JobTracker node + 1 ZKFC daemon, (is this right?)
>>>> >>>>>>> >
>>>> >>>>>>> > so the quorum is formed with these three nodes. The nodes that
>>>> >>>>>>> > run a namenode are right because the ZKFC monitors it, but what
>>>> >>>>>>> > does the third daemon do?
>>>> >>>>>>> >
>>>> >>>>>>> > As I read from this URL:
>>>> >>>>>>> >
>>>> >>>>>>> > https://ccp.cloudera.com/display/CDH4DOC/Software+Configuration+for+Quorum-based+Storage#SoftwareConfigurationforQuorum-basedStorage-AutomaticFailoverConfiguration
>>>> >>>>>>> >
>>>> >>>>>>> > these daemons are only related to NameNodes (Health monitoring -
>>>> >>>>>>> > the ZKFC pings its local NameNode on a periodic basis with a
>>>> >>>>>>> > health-check command.), so what does the third ZKFC do? I used
>>>> >>>>>>> > the JobTracker node but I could use another node without any
>>>> >>>>>>> > daemon on it...
>>>> >>>>>>> >
>>>> >>>>>>> > Thanks in advance,
>>>> >>>>>>> >
>>>> >>>>>>> > ESGLinux,
>>>> >>>>>>> >
>>>> >>>>>>>
>>>> >>>>>>> --
>>>> >>>>>>> Harsh J
>>
>> --
>> Harsh J
>

--
Harsh J
