Hi Hitesh:

Thanks a lot for your reply.

1. I did a puppet kick --ping to the clients from my ambari master, and all five nodes failed with the same log:

   Triggering vbaby2.cloud.eb
   Host vbaby2.cloud.eb failed: certificate verify failed. This is often because the time is out of sync on the server or client
   vbaby2.cloud.eb finished with exit code 2

   I manually ran "service ambari-agent start"; is that necessary? How can I fix this problem?

2. As you suggested, I ran the yum command manually and found that the installation was missing a dependency, php-gd. And I have to update my yum repo.

On Tue, Aug 14, 2012 at 1:01 AM, Hitesh Shah <[email protected]> wrote:
> Based on your deploy error log:
>
> "3": {
>   "nodeReport": {
>     "PUPPET_KICK_FAILED": [],
>     "PUPPET_OPERATION_FAILED": [
>       "vbaby3.cloud.eb",
>       "vbaby5.cloud.eb",
>       "vbaby4.cloud.eb",
>       "vbaby2.cloud.eb",
>       "vbaby6.cloud.eb",
>       "vbaby1.cloud.eb"
>     ],
>     "PUPPET_OPERATION_TIMEDOUT": [
>       "vbaby5.cloud.eb",
>       "vbaby4.cloud.eb",
>       "vbaby2.cloud.eb",
>       "vbaby6.cloud.eb",
>       "vbaby1.cloud.eb"
>     ],
>
> 5 nodes timed out which means the puppet agent is not running on them or they
> cannot communicate with the master. Try doing a puppet kick --ping to them
> from the master.
>
> For the one which failed, it failed at
>
> "\"Mon Aug 13 11:54:17 +0800 2012
> /Stage[1]/Hdp::Pre_install_pkgs/Hdp::Exec[yum install
> $pre_installed_pkgs]/Exec[yum install $pre_installed_pkgs]/returns (err):
> change from notrun to 0 failed: yum install -y hadoop hadoop-libhdfs
> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hadoop hadoop-libhdfs
> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hdp_mon_dashboard
> ganglia-gmond-3.2.0 gweb hdp_mon_ganglia_addons snappy snappy-devel returned
> 1 instead of one of [0] at
> /etc/puppet/agent/modules/hdp/manifests/init.pp:265\"",
>
> It seems like yum install failed on the host. Try running the command
> manually and see what the error is.
>
> -- Hitesh
>
> On Aug 13, 2012, at 2:28 AM, xu peng wrote:
>
>> Hi Hitesh:
>>
>> It's me again.
>>
>> Following your advice, I reinstalled the ambari server. But deploying the
>> cluster and uninstalling the cluster failed again. I really don't know why.
>>
>> I have supplied an attachment which contains the logs of all the nodes in
>> my cluster (/var/log/puppet_*.log, /var/log/puppet/*.log,
>> /var/log/yum.log, /var/log/hmc/hmc.log). vbaby3.cloud.eb is the
>> ambari server. Please take a look.
>>
>> The attachments DeployError and UninstallError are the logs supplied by the
>> ambari website on failure, and the attachment DeployingDetails.jpg shows
>> the deploy details of my cluster. Please take a look.
>>
>> Thanks again for your patience! I look forward to your reply.
>>
>> Xupeng
>>
>> On Sat, Aug 11, 2012 at 10:56 PM, Hitesh Shah <[email protected]> wrote:
>>> For uninstall failures, you will need to do a couple of things. Depending
>>> on where the uninstall failed, you may have to manually do a killall java
>>> on all the nodes to kill any missed processes. If you want to start with a
>>> complete clean install, you should also delete the hadoop dir in the mount
>>> points you selected during the previous install so that the new fresh
>>> install does not face errors when it tries to re-format hdfs.
>>>
>>> After that, simply uninstall and re-install the ambari rpm and that should
>>> allow you to re-create a fresh cluster.
>>>
>>> -- Hitesh
>>>
>>> On Aug 11, 2012, at 2:34 AM, xu peng wrote:
>>>
>>>> Hi Hitesh:
>>>>
>>>> Thanks a lot for your reply.
>>>>
>>>> I solved this problem; it was a silly mistake. Someone had changed the
>>>> owner of the "/" dir, and according to the error log, pdsh needs root to
>>>> proceed.
>>>>
>>>> After changing the owner of "/" to root, the problem was solved. Thank you
>>>> again for your reply.
>>>>
>>>> I have another question.
>>>> I had an uninstall failure, and there is no
>>>> button on the website for me to roll back, and I don't know what to do
>>>> about that. What should I do now to reinstall hadoop?
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Aug 10, 2012 at 10:55 PM, Hitesh Shah <[email protected]>
>>>> wrote:
>>>>> Hi
>>>>>
>>>>> Currently, the ambari installer requires everything to be run as root. It
>>>>> does not detect that the user is not root and use sudo either on the
>>>>> master or on the agent nodes.
>>>>> Furthermore, it seems like it is failing when trying to use pdsh to make
>>>>> remote calls to the host list that you passed in due to the errors
>>>>> mentioned in your script. This could be due to how it was installed but I
>>>>> am not sure.
>>>>>
>>>>> Could you switch to become root and run any simple command on all hosts
>>>>> using pdsh? If you want to reference exactly how ambari uses pdsh, you
>>>>> can look into /usr/share/hmc/php/frontend/commandUtils.php
>>>>>
>>>>> thanks
>>>>> -- Hitesh
>>>>>
>>>>> On Aug 9, 2012, at 9:04 PM, xu peng wrote:
>>>>>
>>>>>> According to the error log, is there something wrong with my account?
>>>>>>
>>>>>> I installed all the dependency modules and ambari as the user
>>>>>> "ambari" instead of root. I added the user "ambari" to /etc/sudoers
>>>>>> with no password.
>>>>>>
>>>>>> On Fri, Aug 10, 2012 at 11:49 AM, xu peng <[email protected]> wrote:
>>>>>>> There is no 100.log file in the /var/log/hmc dir, only a 55.log file (55
>>>>>>> is the biggest version number).
>>>>>>>
>>>>>>> The content of 55.log is:
>>>>>>> pdsh@vbaby1: module path "/usr/lib64/pdsh" insecure.
>>>>>>> pdsh@vbaby1: "/": Owner not root, current uid, or pdsh executable owner
>>>>>>> pdsh@vbaby1: Couldn't load any pdsh modules
>>>>>>>
>>>>>>> Thanks ~
>>>>>>>
>>>>>>> On Fri, Aug 10, 2012 at 11:36 AM, Hitesh Shah <[email protected]>
>>>>>>> wrote:
>>>>>>>> Sorry - my mistake. The last txn mentioned is 100 so please look for
>>>>>>>> the 100.log file.
>>>>>>>>
>>>>>>>> -- Hitesh
>>>>>>>>
>>>>>>>> On Aug 9, 2012, at 8:34 PM, Hitesh Shah wrote:
>>>>>>>>
>>>>>>>>> Thanks - will take a look and get back to you.
>>>>>>>>>
>>>>>>>>> Could you also look at /var/log/hmc/hmc.txn.55.log and see if there
>>>>>>>>> are any errors in it?
>>>>>>>>>
>>>>>>>>> -- Hitesh.
>>>>>>>>>
>>>>>>>>> On Aug 9, 2012, at 8:00 PM, xu peng wrote:
>>>>>>>>>
>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>
>>>>>>>>>> Thanks a lot for your reply. I have followed all your suggestions on
>>>>>>>>>> my ambari server, and the results are below.
>>>>>>>>>>
>>>>>>>>>> 1. I can confirm that the hosts.txt file is empty after I failed at
>>>>>>>>>> the step of finding reachable nodes.
>>>>>>>>>> 2. I tried making the hostdetails file on win7 and on redhat; both
>>>>>>>>>> failed. (Please see the attachment, my hostdetails file.)
>>>>>>>>>> 3. I removed the logging redirects and ran the .sh script. It seems
>>>>>>>>>> like the script works well: it prints the hostname to the console and
>>>>>>>>>> generates a file (content is "0") in the same dir. (Please see the
>>>>>>>>>> attachment, the result and my .sh script.)
>>>>>>>>>> 4. I attached the hmc.log and error_log too. Hope this helps ~
>>>>>>>>>>
>>>>>>>>>> Thanks ~
>>>>>>>>>> Xupeng
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 10, 2012 at 12:24 AM, Hitesh Shah
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Xupeng, can you confirm that the hosts.txt file at
>>>>>>>>>>> /var/run/hmc/clusters/EBHadoop/hosts.txt is empty?
>>>>>>>>>>>
>>>>>>>>>>> Also, can you ensure that the hostdetails file that you upload does
>>>>>>>>>>> not have any special characters that may be creating problems for
>>>>>>>>>>> the parsing layer?
>>>>>>>>>>>
>>>>>>>>>>> In the same dir, there should be an ssh.sh script. Can you create a
>>>>>>>>>>> copy of it, edit it to remove the logging re-directs to files and run
>>>>>>>>>>> the script manually from the command line (it takes in a hostname as
>>>>>>>>>>> the argument)?
>>>>>>>>>>> The output of that should show you what is
>>>>>>>>>>> going wrong.
>>>>>>>>>>>
>>>>>>>>>>> Also, please look at /var/log/hmc/hmc.log and httpd/error_log to
>>>>>>>>>>> see if there are any errors being logged which may shed more light
>>>>>>>>>>> on the issue.
>>>>>>>>>>>
>>>>>>>>>>> thanks
>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>
>>>>>>>>>>> On Aug 9, 2012, at 9:11 AM, Artem Ervits wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Which file are you supplying in the step? Hostdetail.txt or hosts?
>>>>>>>>>>>>
>>>>>>>>>>>> From: xupeng.bupt [mailto:[email protected]]
>>>>>>>>>>>> Sent: Thursday, August 09, 2012 11:33 AM
>>>>>>>>>>>> To: ambari-user
>>>>>>>>>>>> Subject: Re: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your reply ~
>>>>>>>>>>>>
>>>>>>>>>>>> I made only one hostdetail.txt file, which contains the names of
>>>>>>>>>>>> all servers, and I submitted this file on the website, but I still
>>>>>>>>>>>> have the same problem. I failed at the step of finding reachable
>>>>>>>>>>>> nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> The error log is: "
>>>>>>>>>>>> [ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop
>>>>>>>>>>>> root
>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>> "
>>>>>>>>>>>>
>>>>>>>>>>>> And my hostdetail.txt file is: "
>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>> "
>>>>>>>>>>>> Thank you very much ~
>>>>>>>>>>>>
>>>>>>>>>>>> 2012-08-09
>>>>>>>>>>>> xupeng.bupt
>>>>>>>>>>>> From: Artem Ervits
>>>>>>>>>>>> Sent: 2012-08-09 22:16:53
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Cc:
>>>>>>>>>>>> Subject: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>> the installer requires a hosts file which I believe you called
>>>>>>>>>>>> hostdetail. Make sure it's the same file. You also mention a
>>>>>>>>>>>> hosts.txt and host.txt. You only need one file with the names of
>>>>>>>>>>>> all servers.
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: xu peng [mailto:[email protected]]
>>>>>>>>>>>> Sent: Thursday, August 09, 2012 2:02 AM
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Subject: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>> Hi everyone:
>>>>>>>>>>>> I am trying to use ambari to set up a hadoop cluster, but I
>>>>>>>>>>>> encountered a problem on step 2. I already set up password-less
>>>>>>>>>>>> ssh, and I created a hostdetail.txt file.
>>>>>>>>>>>> The problem is that the file
>>>>>>>>>>>> "/var/run/hmc/clusters/EBHadoop/hosts.txt" is empty, no matter
>>>>>>>>>>>> how many times I submit the host.txt file on the website, and I
>>>>>>>>>>>> really don't know why.
>>>>>>>>>>>> {
>>>>>>>>>>>> Here is the log file : [2012:08:09
>>>>>>>>>>>> 05:17:56][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop
>>>>>>>>>>>> root
>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>> and my host.txt is like this (vbaby1.cloud.eb is the master node) :
>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>> }
>>>>>>>>>>>> Can anyone help me and tell me what i am doing wrong ?
>>>>>>>>>>>> Thank you very much ~!
>>>>>>>>>>>> This electronic message is intended to be for the use only of the
>>>>>>>>>>>> named recipient, and may contain information that is confidential
>>>>>>>>>>>> or privileged. If you are not the intended recipient, you are
>>>>>>>>>>>> hereby notified that any disclosure, copying, distribution or use
>>>>>>>>>>>> of the contents of this message is strictly prohibited. If you
>>>>>>>>>>>> have received this message in error or are not the named
>>>>>>>>>>>> recipient, please notify us immediately by contacting the sender
>>>>>>>>>>>> at the electronic mail address noted above, and delete and destroy
>>>>>>>>>>>> all copies of this message. Thank you.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>> <hmcLog.txt><hostdetails.txt><httpdLog.txt><ssh1.sh><ssh1_result.jpg>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>> <DeployError1_2012.8.13.txt><log.rar><DeployingDetails.jpg><UninstallError1_2012.8.13.txt>
>
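
The triage Hitesh applies to the deploy error log (hosts that timed out versus the host where the Puppet run itself failed) can be sketched in a few lines of Python. This is only an illustration: the key names and host names are copied from the nodeReport fragment quoted in this thread, while the helper name `classify_hosts` and the idea of post-processing the report with a script are assumptions and not part of Ambari.

```python
import json

# nodeReport fragment copied from the deploy error log quoted in this thread.
# The rest of the real log file is not shown in the thread, so only this
# fragment is parsed here.
NODE_REPORT = json.loads("""
{
  "PUPPET_KICK_FAILED": [],
  "PUPPET_OPERATION_FAILED": [
    "vbaby3.cloud.eb", "vbaby5.cloud.eb", "vbaby4.cloud.eb",
    "vbaby2.cloud.eb", "vbaby6.cloud.eb", "vbaby1.cloud.eb"
  ],
  "PUPPET_OPERATION_TIMEDOUT": [
    "vbaby5.cloud.eb", "vbaby4.cloud.eb", "vbaby2.cloud.eb",
    "vbaby6.cloud.eb", "vbaby1.cloud.eb"
  ]
}
""")


def classify_hosts(node_report):
    """Split hosts into 'timed out' and 'failed without timing out'.

    Hypothetical helper for illustration; it is not an Ambari API.
    """
    timed_out = set(node_report.get("PUPPET_OPERATION_TIMEDOUT", []))
    failed = set(node_report.get("PUPPET_OPERATION_FAILED", []))
    return {
        # Agent probably not running or not reachable from the master.
        "timed_out": sorted(timed_out),
        # Puppet ran but a resource failed; read the Puppet log on these hosts.
        "failed_not_timed_out": sorted(failed - timed_out),
    }


if __name__ == "__main__":
    for category, hosts in classify_hosts(NODE_REPORT).items():
        print(category, hosts)
```

With the fragment above, the only host in the second group is vbaby3.cloud.eb, which matches the single host whose yum install error is quoted in the thread; the other five are the ones to check with puppet kick --ping.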
