Hi Hitesh:

Thanks a lot for your reply.

1. I ran "puppet kick --ping" against the clients from my Ambari master.
All five nodes failed with the same log:

Triggering vbaby2.cloud.eb
Host vbaby2.cloud.eb failed: certificate verify failed.  This is often
because the time is out of sync on the server or client
vbaby2.cloud.eb finished with exit code 2

I started the agent manually with "service ambari-agent start". Is that
necessary? How can I fix this problem?

2. As you suggested, I ran the yum command manually and found that the
installation was missing a dependency, php-gd, so I have to update my
yum repo.
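
On point 1, the error text itself points at clock skew: SSL certificates
carry a validity window, so verification can fail when a host's clock falls
outside it. Below is a minimal Python sketch of how the clocks could be
compared once each host's epoch time has been collected (for example by
running "date +%s" over ssh, which is my assumption and not something
Ambari provides). The function name, the sample readings and the 300-second
tolerance are all hypothetical.

```python
from typing import Dict, List, Tuple


def find_skewed_hosts(
    master_epoch: float,
    host_epochs: Dict[str, float],
    tolerance_s: float = 300.0,
) -> List[Tuple[str, float]]:
    """Return (host, skew_seconds) for every host whose clock differs from
    the master's by more than tolerance_s. Positive skew means the host's
    clock is ahead of the master; negative means it is behind."""
    skewed = []
    for host, epoch in sorted(host_epochs.items()):
        skew = epoch - master_epoch
        if abs(skew) > tolerance_s:
            skewed.append((host, skew))
    return skewed


if __name__ == "__main__":
    # Hypothetical readings, not taken from the real cluster.
    master = 1344830000.0
    readings = {
        "vbaby2.cloud.eb": master + 5.0,      # within tolerance
        "vbaby4.cloud.eb": master - 86400.0,  # one day behind
    }
    for host, skew in find_skewed_hosts(master, readings):
        print(f"{host}: clock differs from master by {skew:+.0f} s")
```

If a host shows up in the result, syncing its clock (for example with NTP)
before retrying the ping would be the first thing I would try.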



On Tue, Aug 14, 2012 at 1:01 AM, Hitesh Shah <[email protected]> wrote:
> Based on your deploy error log:
>
> "3": {
>         "nodeReport": {
>             "PUPPET_KICK_FAILED": [],
>             "PUPPET_OPERATION_FAILED": [
>                 "vbaby3.cloud.eb",
>                 "vbaby5.cloud.eb",
>                 "vbaby4.cloud.eb",
>                 "vbaby2.cloud.eb",
>                 "vbaby6.cloud.eb",
>                 "vbaby1.cloud.eb"
>             ],
>             "PUPPET_OPERATION_TIMEDOUT": [
>                 "vbaby5.cloud.eb",
>                 "vbaby4.cloud.eb",
>                 "vbaby2.cloud.eb",
>                 "vbaby6.cloud.eb",
>                 "vbaby1.cloud.eb"
>             ],
>
> 5 nodes timed out which means the puppet agent is not running on them or they 
> cannot communicate with the master. Try doing a puppet kick --ping to them 
> from the master.
>
> For the one which failed, it failed at
>
> "\"Mon Aug 13 11:54:17 +0800 2012 
> /Stage[1]/Hdp::Pre_install_pkgs/Hdp::Exec[yum install 
> $pre_installed_pkgs]/Exec[yum install $pre_installed_pkgs]/returns (err): 
> change from notrun to 0 failed: yum install -y hadoop hadoop-libhdfs 
> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hadoop hadoop-libhdfs 
> hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hdp_mon_dashboard 
> ganglia-gmond-3.2.0 gweb hdp_mon_ganglia_addons snappy snappy-devel returned 
> 1 instead of one of [0] at 
> /etc/puppet/agent/modules/hdp/manifests/init.pp:265\"",
>
> It seems like yum install failed on the host. Try running the command 
> manually and see what the error is.
>
> -- Hitesh
>
>
>
> On Aug 13, 2012, at 2:28 AM, xu peng wrote:
>
>> Hi Hitesh :
>>
>> It's me again.
>>
>> Following your advice, I reinstalled the Ambari server, but deploying
>> the cluster and uninstalling the cluster failed again. I really don't
>> know why.
>>
>> I have attached the logs from all the nodes in my cluster
>> (/var/log/puppet_*.log, /var/log/puppet/*.log, /var/log/yum.log,
>> /var/log/hmc/hmc.log). vbaby3.cloud.eb is the Ambari server.
>>
>> The attachments DeployError and UninstallError are the logs shown by the
>> Ambari website when the operations failed, and DeployingDetails.jpg shows
>> the deploy details of my cluster.
>>
>>
>> Thanks again for your patience! I look forward to your reply.
>>
>> Xupeng
>>
>> On Sat, Aug 11, 2012 at 10:56 PM, Hitesh Shah <[email protected]> wrote:
>>> For uninstall failures, you will need to do a couple of things. Depending 
>>> on where the uninstall failed, you may have to manually do a killall java 
>>> on all the nodes to kill any missed processes. If you want to start with a 
>>> complete clean install, you should also delete the hadoop dir in the mount 
>>> points you selected during the previous install  so that the new fresh 
>>> install does not face errors when it tries to re-format hdfs.
>>>
>>> After that, simply uninstall and re-install the ambari rpm, and that should 
>>> allow you to re-create a fresh cluster.
>>>
>>> -- Hitesh
>>>
>>> On Aug 11, 2012, at 2:34 AM, xu peng wrote:
>>>
>>>> Hi Hitesh :
>>>>
>>>> Thanks a lot for your reply.
>>>>
>>>> I solved this problem; it was a silly mistake. Someone had changed the
>>>> owner of the "/" dir, and according to the error log, pdsh needs it to
>>>> be owned by root to proceed.
>>>>
>>>> After changing the owner of "/" back to root, the problem was solved.
>>>> Thank you again for your reply.
>>>>
>>>> I have another question. I had an uninstall failure, and there is no
>>>> button on the website for me to roll back, so I don't know what to do
>>>> about that. What should I do now to reinstall Hadoop?
>>>>
>>>> Thanks
>>>>
>>>> On Fri, Aug 10, 2012 at 10:55 PM, Hitesh Shah <[email protected]> 
>>>> wrote:
>>>>> Hi
>>>>>
>>>>> Currently, the ambari installer requires everything to be run as root. It 
>>>>> does not detect that the user is not root and use sudo either on the 
>>>>> master or on the agent nodes.
>>>>> Furthermore, it seems like it is failing when trying to use pdsh to make 
>>>>> remote calls to the host list that you passed in due to the errors 
>>>>> mentioned in your script. This could be due to how it was installed but I 
>>>>> am not sure.
>>>>>
>>>>> Could you switch to become root and run any simple command on all hosts 
>>>>> using pdsh? If you want to reference exactly how ambari uses pdsh, you 
>>>>> can look into /usr/share/hmc/php/frontend/commandUtils.php
>>>>>
>>>>> thanks
>>>>> -- Hitesh
>>>>>
>>>>> On Aug 9, 2012, at 9:04 PM, xu peng wrote:
>>>>>
>>>>>> According to the error log, is there something wrong with my account?
>>>>>>
>>>>>> I installed all the dependency modules and ambari as the user
>>>>>> "ambari" instead of root. I added the user "ambari" to /etc/sudoers
>>>>>> with no password required.
>>>>>>
>>>>>> On Fri, Aug 10, 2012 at 11:49 AM, xu peng <[email protected]> wrote:
>>>>>>> There is no 100.log file in the /var/log/hmc dir, only a 55.log file
>>>>>>> (55 is the highest version number).
>>>>>>>
>>>>>>> The content of 55.log is :
>>>>>>> pdsh@vbaby1: module path "/usr/lib64/pdsh" insecure.
>>>>>>> pdsh@vbaby1: "/": Owner not root, current uid, or pdsh executable owner
>>>>>>> pdsh@vbaby1: Couldn't load any pdsh modules
>>>>>>>
>>>>>>> Thanks ~
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Aug 10, 2012 at 11:36 AM, Hitesh Shah <[email protected]> 
>>>>>>> wrote:
>>>>>>>> Sorry - my mistake. The last txn mentioned is 100 so please look for 
>>>>>>>> the 100.log file.
>>>>>>>>
>>>>>>>> -- Hitesh
>>>>>>>>
>>>>>>>>
>>>>>>>> On Aug 9, 2012, at 8:34 PM, Hitesh Shah wrote:
>>>>>>>>
>>>>>>>>> Thanks - will take a look and get back to you.
>>>>>>>>>
>>>>>>>>> Could you also look at /var/log/hmc/hmc.txn.55.log and see if there 
>>>>>>>>> are any errors in it?
>>>>>>>>>
>>>>>>>>> -- Hitesh.
>>>>>>>>>
>>>>>>>>> On Aug 9, 2012, at 8:00 PM, xu peng wrote:
>>>>>>>>>
>>>>>>>>>> Hi Hitesh :
>>>>>>>>>>
>>>>>>>>>> Thanks a lot for your reply. I have tried all your suggestions on
>>>>>>>>>> my ambari server, and the results are below.
>>>>>>>>>>
>>>>>>>>>> 1. I can confirm that the hosts.txt file is empty after the failure
>>>>>>>>>> at the step that finds reachable nodes.
>>>>>>>>>> 2. I tried creating the hostdetails file on both Win7 and Red Hat;
>>>>>>>>>> both failed. (Please see my hostdetails file in the attachment.)
>>>>>>>>>> 3. I removed the logging redirects and ran the .sh script. The
>>>>>>>>>> script seems to work: it prints the hostname to the console and
>>>>>>>>>> generates a file (whose content is "0") in the same dir. (Please
>>>>>>>>>> see the result and my .sh script in the attachment.)
>>>>>>>>>> 4. I attached the hmc.log and error_log too. Hope this helps ~
>>>>>>>>>>
>>>>>>>>>> Thanks ~
>>>>>>>>>> Xupeng
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Aug 10, 2012 at 12:24 AM, Hitesh Shah 
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> Xupeng, can you confirm that the hosts.txt file at 
>>>>>>>>>>> /var/run/hmc/clusters/EBHadoop/hosts.txt is empty?
>>>>>>>>>>>
>>>>>>>>>>> Also, can you ensure that the hostdetails file that you upload does 
>>>>>>>>>>> not have any special characters that may be creating problems for 
>>>>>>>>>>> the parsing layer?
>>>>>>>>>>>
>>>>>>>>>>> In the same dir, there should be an ssh.sh script. Can you create a 
>>>>>>>>>>> copy of it, edit to remove the logging re-directs to files and run 
>>>>>>>>>>> the script manually from command-line ( it takes in a hostname as 
>>>>>>>>>>> the argument ) ? The output of that should show you as to what is 
>>>>>>>>>>> going wrong.
>>>>>>>>>>>
>>>>>>>>>>> Also, please look at /var/log/hmc/hmc.log and httpd/error_log to 
>>>>>>>>>>> see if there are any errors being logged which may shed more light 
>>>>>>>>>>> on the issue.
>>>>>>>>>>>
>>>>>>>>>>> thanks
>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Aug 9, 2012, at 9:11 AM, Artem Ervits wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Which file are you supplying in the step? Hostdetail.txt or hosts?
>>>>>>>>>>>>
>>>>>>>>>>>> From: xupeng.bupt [mailto:[email protected]]
>>>>>>>>>>>> Sent: Thursday, August 09, 2012 11:33 AM
>>>>>>>>>>>> To: ambari-user
>>>>>>>>>>>> Subject: Re: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your reply ~
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I made only one hostdetail.txt file, which contains the names of 
>>>>>>>>>>>> all servers, and I submitted this file on the website, but I still 
>>>>>>>>>>>> have the same problem: it fails at the step of finding reachable 
>>>>>>>>>>>> nodes.
>>>>>>>>>>>>
>>>>>>>>>>>> The error log is : "
>>>>>>>>>>>> [ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop 
>>>>>>>>>>>> root
>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>> "
>>>>>>>>>>>>
>>>>>>>>>>>> And my hostdetail.txt file is :"
>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>> "
>>>>>>>>>>>> Thank you very much ~
>>>>>>>>>>>>
>>>>>>>>>>>> 2012-08-09
>>>>>>>>>>>> xupeng.bupt
>>>>>>>>>>>> From: Artem Ervits
>>>>>>>>>>>> Sent: 2012-08-09  22:16:53
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Cc:
>>>>>>>>>>>> Subject: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>> The installer requires a hosts file which I believe you called 
>>>>>>>>>>>> hostdetail. Make sure it's the same file. You also mention a 
>>>>>>>>>>>> hosts.txt and host.txt. You only need one file with the names of 
>>>>>>>>>>>> all servers.
>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>> From: xu peng [mailto:[email protected]]
>>>>>>>>>>>> Sent: Thursday, August 09, 2012 2:02 AM
>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>> Subject: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>> Hi everyone:
>>>>>>>>>>>> I am trying to use ambari to set up a hadoop cluster, but I 
>>>>>>>>>>>> encountered a problem at step 2. I have already set up 
>>>>>>>>>>>> password-less ssh, and I created a hostdetail.txt file.
>>>>>>>>>>>> The problem is that the file
>>>>>>>>>>>> "/var/run/hmc/clusters/EBHadoop/hosts.txt" is empty, no matter 
>>>>>>>>>>>> how many times I submit the host.txt file on the website, and I 
>>>>>>>>>>>> really don't know why.
>>>>>>>>>>>> {
>>>>>>>>>>>> Here is the log file : [2012:08:09
>>>>>>>>>>>> 05:17:56][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop 
>>>>>>>>>>>> root
>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>> and my host.txt is like this(vbaby1.cloud.eb is the master node) :
>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>> }
>>>>>>>>>>>> Can anyone help me and tell me what I am doing wrong?
>>>>>>>>>>>> Thank you very much ~!
>>>>>>>>>>>> This electronic message is intended to be for the use only of the 
>>>>>>>>>>>> named recipient, and may contain information that is confidential 
>>>>>>>>>>>> or privileged. If you are not the intended recipient, you are 
>>>>>>>>>>>> hereby notified that any disclosure, copying, distribution or use 
>>>>>>>>>>>> of the contents of this message is strictly prohibited. If you 
>>>>>>>>>>>> have received this message in error or are not the named 
>>>>>>>>>>>> recipient, please notify us immediately by contacting the sender 
>>>>>>>>>>>> at the electronic mail address noted above, and delete and destroy 
>>>>>>>>>>>> all copies of this message. Thank you.
>>>>>>>>>>>
>>>>>>>>>> <hmcLog.txt><hostdetails.txt><httpdLog.txt><ssh1.sh><ssh1_result.jpg>
>>>>>>>>>
>>>>>>>>
>>>>>
>>>
>> <DeployError1_2012.8.13.txt><log.rar><DeployingDetails.jpg><UninstallError1_2012.8.13.txt>
>
