On Sun, Nov 30, 2014 at 12:50AM, Leidle, Rob wrote:
> Thanks Roman,
>
> I actually fixed the problem. I had an existing process monitoring the
> daemon and restarting it if it terminated. However, puppet encapsulates this
> so it is no longer needed. Also, this process was causing the namenode
> service to terminate once. I removed my existing monitoring process and
> everything is working fine.
>
> That being said is there a recommended number of times we should retry the
> puppet scripts on failure?

Good to see you're coming through! As for the retries: if something doesn't
work I usually check the logs immediately. Sometimes after a second re-run.

Cos

> > On Nov 29, 2014, at 3:49 PM, Roman Shaposhnik <[email protected]> wrote:
> >
> >> On Fri, Nov 28, 2014 at 7:08 PM, Konstantin Boudnik <[email protected]>
> >> wrote:
> >>> On Sat, Nov 29, 2014 at 01:43AM, Leidle, Rob wrote:
> >>> Yes, I ran into Bigtop-1522 and figured out I needed to add mapred-app.
> >>> Sorry, I wrote what I said in the previous email incorrectly, yes,
> >>> resource manager does not install because the dependency namenode does
> >>> not install correctly. I will look more closely at the service logs to see
> >>> if I can figure out why it isn't starting. The error code of "3" indicates
> >>> from the /etc/init.d/hadoop-hdfs-namenode script that this means it can't
> >>> find the running process 5 seconds after starting it.
> >>
> >> Yes, please look into the logs - might be something obvious missed. We are
> >> running these recipes for a good 3+ years and they are fairly well tested.
> >> Would be good to fix last bugs if any ;)
> >
> > What Cos said above, but also note that Puppet encourages this unfortunate
> > 'eventual convergence' pattern. IOW, even if the first time around a few
> > services failed if everything goes OK on the next Puppet run -- the cluster
> > comes up.
> >
> > It would be very nice to debug the nitty gritty details of synchronization
> > issues like the ones you seem to be seeing. Unfortunately, we haven't really
> > had much of a focus there, since, like I said, for internal Bigtop testing
> > purposes the 'eventual convergence' suffices.
> >
> > Thanks,
> > Roman.
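
For illustration only, the 'eventual convergence' approach discussed above could be wrapped in a bounded retry loop. This is a minimal sketch, not anything taken from the Bigtop recipes: the helper names (run_puppet, converge), the manifest path argument, and the default of 3 runs are all assumptions, and the thread itself gives no recommended retry count. The only external fact relied on is that `puppet apply --detailed-exitcodes` exits with 0 (no changes) or 2 (changes applied) when nothing failed, and with 1, 4, or 6 when something did.

```python
import subprocess
import time

# Exit codes of `puppet apply --detailed-exitcodes` that indicate no failures:
# 0 = nothing changed, 2 = changes applied. Codes 1, 4 and 6 involve failures.
SUCCESS_CODES = {0, 2}


def run_puppet(manifest):
    """Run one Puppet pass and return its exit code.

    The manifest path is a placeholder supplied by the caller (assumption).
    """
    return subprocess.call(["puppet", "apply", "--detailed-exitcodes", manifest])


def converge(run_once, max_runs=3, delay_seconds=0):
    """Call run_once() until it reports success or max_runs is exhausted.

    max_runs=3 is an arbitrary illustrative default, not a recommendation
    from the thread. Returns (converged, exit_codes_seen).
    """
    codes = []
    for attempt in range(max_runs):
        code = run_once()
        codes.append(code)
        if code in SUCCESS_CODES:
            return True, codes
        if attempt < max_runs - 1:
            time.sleep(delay_seconds)  # give slow-starting services time to settle
    return False, codes
```

A caller might use `converge(lambda: run_puppet("site.pp"))`, where "site.pp" is a hypothetical manifest name, and go to the service logs as soon as the first element of the returned exit codes shows a failure, in line with the advice above to check logs right away rather than rely on retries alone.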
