I think removing the ambari and mod_passenger rpms from all the nodes except the ambari master should suffice.

-- Hitesh
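A sketch of that removal (package names as given above; this is not from the original mail, so confirm the exact rpm names on each node with rpm -qa before removing anything):

    # on every node except the current ambari master (vbaby3 in this thread):
    rpm -qa | grep -i -e ambari -e passenger   # see what is actually installed
    yum remove -y ambari mod_passenger         # adjust to match the rpm -qa output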
On Aug 19, 2012, at 5:46 PM, xu peng wrote:

> No, I am not using vbaby1 as my new ambari master, but it is the former ambari master (I did not uninstall the dependency packages), and vbaby3 is the current ambari master.
>
> So do I have to uninstall the mod_passenger package on all the slave nodes? Or run the ganglia server on the same node as the ambari server?
>
> On Mon, Aug 20, 2012 at 1:30 AM, Hitesh Shah <[email protected]> wrote:
>> Yes - not using /dev/mapper/hdvg-rootlv was what I was planning to suggest.
>>
>> It seems to me that you installed mod_passenger and/or ambari on vbaby1. Is this your new ambari master?
>>
>> Try doing this on vbaby1:
>>
>> $ puppet master --no-daemonize --debug
>>
>> The above will create the cert required by the puppet master running in httpd. Kill the above process (it will run in the foreground).
>>
>> Now, try the httpd restart.
>>
>> (Also, note that you should not need to do anything for the ganglia start unless you are running the ganglia server on the same host as ambari.)
>>
>> thanks
>> -- Hitesh
>>
>> On Aug 19, 2012, at 6:03 AM, xu peng wrote:
>>
>>> Hi Hitesh:
>>>
>>> It is me again.
>>>
>>> I figured out the previous problem by changing the mount point to a custom path.
>>>
>>> But I failed at the step of starting the ganglia server.
>>>
>>> I ran this command manually on the vbaby1 node, and it failed; the other nodes succeeded:
>>>
>>> [root@vbaby1 log]# service httpd start
>>> Starting httpd: Syntax error on line 37 of /etc/httpd/conf.d/puppetmaster.conf:
>>> SSLCertificateChainFile: file '/var/lib/puppet/ssl/ca/ca_crt.pem' does
>>> not exist or is empty
>>> [FAILED]
>>>
>>> Please refer to the error log.
>>>
>>> Thanks a lot.
>>>
>>> On Sun, Aug 19, 2012 at 7:26 PM, xu peng <[email protected]> wrote:
>>>> Hi Hitesh:
>>>>
>>>> It is me again.
>>>>
>>>> I encountered another problem while deploying the services. According to the log, something went wrong when executing a command (Dependency Exec[mkdir -p /dev/mapper/hdvg-rootlv/hadoop/hdfs/data] has failures: true).
>>>>
>>>> Please refer to the attachment. It seems like all the rpm packages installed successfully, and I don't know which dependency failed.
>>>>
>>>> Please help, thanks a lot.
>>>>
>>>> On Sun, Aug 19, 2012 at 8:08 AM, xu peng <[email protected]> wrote:
>>>>> Hi Hitesh:
>>>>>
>>>>> I used the default settings for the mount point, but that path (/dev/mapper/hdvg-rootlv/) is not a directory, and I cannot execute mkdir -p on it. hdvg-rootlv is a block device file (brwxrwxrwx). Is something wrong?
>>>>>
>>>>> On Sun, Aug 19, 2012 at 3:38 AM, Hitesh Shah <[email protected]> wrote:
>>>>>> Hi
>>>>>>
>>>>>> Yes - you should install all packages from the new repo and none from the old repo. Most of the packages should be the same, but some like hadoop-lzo were re-factored to work correctly with respect to 32/64-bit installs on RHEL6.
>>>>>>
>>>>>> Regarding the mount points, from a hadoop point of view, the namenode and datanode dirs are just dirs. From a performance point of view, you want each dir to be created on a separate mount point to increase disk io bandwidth. This means that the mount points that you select on the UI should allow directories to be created.
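To make that concrete (a sketch using the device name from this thread): /dev/mapper/hdvg-rootlv is an LVM device node, a block special file rather than a directory, which is why the mkdir under it fails. The path to give the UI is wherever that device is actually mounted:

    ls -l /dev/mapper/hdvg-rootlv   # mode string starts with 'b': a block device, so mkdir -p fails here
    mount | grep hdvg-rootlv        # shows the real mount point, e.g. "/dev/mapper/hdvg-rootlv on / type ext4 ..."
    df -h                           # lists every mounted filesystem with its mount point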
>>>>>> If you have mounted certain kinds of filesystems which you do not wish to use for hadoop (any tmpfs, nfs mounts etc.), you should de-select them on the UI and/or use the custom mount point text box as appropriate. The UI currently does not distinguish valid mount points, and therefore it is up to the user to select them correctly.
>>>>>>
>>>>>> -- Hitesh
>>>>>>
>>>>>> On Aug 18, 2012, at 9:48 AM, xu peng wrote:
>>>>>>
>>>>>>> Hi Hitesh:
>>>>>>>
>>>>>>> Thanks again for your reply.
>>>>>>>
>>>>>>> I solved the dependency problem after updating the hdp repo.
>>>>>>>
>>>>>>> But here come two new problems:
>>>>>>>
>>>>>>> 1. I updated to the new hdp repo, but I had created a local copy of the old hdp repo, and I installed all the rpm packages except hadoop-lzo-native from the old repo. It seems hadoop-lzo-native has some conflict with hadoop-lzo. So do I have to install all the rpm packages from the new repo?
>>>>>>>
>>>>>>> 2. In the error log I can see a command "mkdir -p /var/.../.. (mount point of hadoop)", but I found the mount point is not a dir but a block device file (brwxrwxrwx), and the execution of this step failed. Did I do something wrong?
>>>>>>>
>>>>>>> I am sorry that this deploy error log is on my company's computer; I will upload it in my next email.
>>>>>>>
>>>>>>> Thanks
>>>>>>> -- Xupeng
>>>>>>>
>>>>>>> On Sat, Aug 18, 2012 at 4:43 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>> Hi again,
>>>>>>>>
>>>>>>>> You are actually hitting a problem caused by some changes in the code which require a modified repo. Unfortunately, I got delayed in modifying the documentation to point to the new repo.
>>>>>>>>
>>>>>>>> Could you try using
>>>>>>>> http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos5/hdp-release-1.0.1.14-1.el5.noarch.rpm
>>>>>>>> or
>>>>>>>> http://public-repo-1.hortonworks.com/HDP-1.0.1.14/repos/centos6/hdp-release-1.0.1.14-1.el6.noarch.rpm
>>>>>>>>
>>>>>>>> The above should install the yum repo configs to point to the correct repo, which will have the lzo packages.
>>>>>>>>
>>>>>>>> -- Hitesh
>>>>>>>>
>>>>>>>> On Aug 16, 2012, at 9:27 PM, xu peng wrote:
>>>>>>>>
>>>>>>>>> Hitesh Shah:
>>>>>>>>>
>>>>>>>>> It is my pleasure to file an ambari jira to help other users. As a matter of fact, I want to summarize all the problems once I have installed the ambari cluster successfully, and I will feed back as soon as possible.
>>>>>>>>>
>>>>>>>>> Here is another problem I encountered when installing hadoop using ambari: the rpm package "hadoop-lzo-native" is not in the hdp repo (baseurl=http://public-repo-1.hortonworks.com/HDP-1.0.13/repos/centos5), so I failed again during the deploy step.
>>>>>>>>>
>>>>>>>>> The attachment is the deploy log; please refer to it.
>>>>>>>>>
>>>>>>>>> Thanks a lot, and I look forward to your reply.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 14, 2012 at 11:35 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>> Ok - the cert issue is sometimes a result of uninstalling and re-installing ambari agents.
>>>>>>>>>>
>>>>>>>>>> The re-install causes ambari agents to regenerate a new certificate, and if the master was bootstrapped earlier, it would still be looking to match against the old certs.
>>>>>>>>>> To fix this, first stop the ambari master and remove the ambari-agent rpm from all hosts. Then:
>>>>>>>>>>
>>>>>>>>>> - on the master, do a puppet cert revoke for all hosts ( http://docs.puppetlabs.com/man/cert.html )
>>>>>>>>>> - you can do a cert list to get all signed or non-signed hosts
>>>>>>>>>>
>>>>>>>>>> On all hosts, delete the following dirs (if they exist):
>>>>>>>>>> - /etc/puppet/ssl
>>>>>>>>>> - /etc/puppet/[master|agent]/ssl/
>>>>>>>>>> - /var/lib/puppet/ssl/
>>>>>>>>>>
>>>>>>>>>> After doing the above, re-install the ambari agent.
>>>>>>>>>>
>>>>>>>>>> On the ambari master, stop the master and run the following command:
>>>>>>>>>>
>>>>>>>>>> puppet master --no-daemonize --debug
>>>>>>>>>>
>>>>>>>>>> The above runs in the foreground. The reason to run it is to make sure the cert for the master is recreated, since we deleted it earlier.
>>>>>>>>>>
>>>>>>>>>> Now kill the above foreground process and do a service ambari start to bring up the UI.
>>>>>>>>>>
>>>>>>>>>> You should be able to bootstrap from this point on.
>>>>>>>>>>
>>>>>>>>>> Would you mind filing a jira and mentioning all the various issues you have come across and how you solved them? We can use that to create an FAQ for other users.
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>> -- Hitesh
>>>>>>>>>>
>>>>>>>>>> On Aug 14, 2012, at 1:55 AM, xu peng wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>
>>>>>>>>>>> Thanks a lot for your reply.
>>>>>>>>>>>
>>>>>>>>>>> 1. I did a puppet kick --ping to the clients from my ambari master; all five nodes failed with the same log:
>>>>>>>>>>>
>>>>>>>>>>> Triggering vbaby2.cloud.eb
>>>>>>>>>>> Host vbaby2.cloud.eb failed: certificate verify failed. This is often
>>>>>>>>>>> because the time is out of sync on the server or client
>>>>>>>>>>> vbaby2.cloud.eb finished with exit code 2
>>>>>>>>>>>
>>>>>>>>>>> I manually ran "service ambari-agent start"; is that necessary? How can I fix this problem?
>>>>>>>>>>>
>>>>>>>>>>> 2. As you suggested, I ran the yum command manually and found that the installation was missing a dependency - php-gd. I have to update my yum repo.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Aug 14, 2012 at 1:01 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>> Based on your deploy error log:
>>>>>>>>>>>>
>>>>>>>>>>>> "3": {
>>>>>>>>>>>>   "nodeReport": {
>>>>>>>>>>>>     "PUPPET_KICK_FAILED": [],
>>>>>>>>>>>>     "PUPPET_OPERATION_FAILED": [
>>>>>>>>>>>>       "vbaby3.cloud.eb",
>>>>>>>>>>>>       "vbaby5.cloud.eb",
>>>>>>>>>>>>       "vbaby4.cloud.eb",
>>>>>>>>>>>>       "vbaby2.cloud.eb",
>>>>>>>>>>>>       "vbaby6.cloud.eb",
>>>>>>>>>>>>       "vbaby1.cloud.eb"
>>>>>>>>>>>>     ],
>>>>>>>>>>>>     "PUPPET_OPERATION_TIMEDOUT": [
>>>>>>>>>>>>       "vbaby5.cloud.eb",
>>>>>>>>>>>>       "vbaby4.cloud.eb",
>>>>>>>>>>>>       "vbaby2.cloud.eb",
>>>>>>>>>>>>       "vbaby6.cloud.eb",
>>>>>>>>>>>>       "vbaby1.cloud.eb"
>>>>>>>>>>>>     ],
>>>>>>>>>>>>
>>>>>>>>>>>> 5 nodes timed out, which means the puppet agent is not running on them or they cannot communicate with the master. Try doing a puppet kick --ping to them from the master.
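Put together, the cert reset Hitesh outlines above plus the kick check might look like this (a sketch against the puppet 2.x-era tooling this ambari/HMC release uses; the exact cert subcommand syntax varies across puppet versions, and host names are the ones from this cluster):

    # on the master: revoke stale agent certs and inspect what is signed
    puppet cert revoke vbaby2.cloud.eb      # repeat for each agent host
    puppet cert list --all

    # on every host: remove stale ssl state (only the dirs that exist)
    rm -rf /etc/puppet/ssl /var/lib/puppet/ssl

    # on the master: recreate the master cert in the foreground, then kill it
    puppet master --no-daemonize --debug

    # once the agents are re-installed and started, ping them from the master
    puppet kick --ping vbaby2.cloud.eb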
>>>>>>>>>>>> For the one which failed, it failed at:
>>>>>>>>>>>>
>>>>>>>>>>>> "\"Mon Aug 13 11:54:17 +0800 2012
>>>>>>>>>>>> /Stage[1]/Hdp::Pre_install_pkgs/Hdp::Exec[yum install $pre_installed_pkgs]/Exec[yum install $pre_installed_pkgs]/returns
>>>>>>>>>>>> (err): change from notrun to 0 failed: yum install -y hadoop hadoop-libhdfs hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo
>>>>>>>>>>>> hadoop hadoop-libhdfs hadoop-native hadoop-pipes hadoop-sbin hadoop-lzo hdp_mon_dashboard ganglia-gmond-3.2.0 gweb
>>>>>>>>>>>> hdp_mon_ganglia_addons snappy snappy-devel returned 1 instead of one of [0] at
>>>>>>>>>>>> /etc/puppet/agent/modules/hdp/manifests/init.pp:265\""
>>>>>>>>>>>>
>>>>>>>>>>>> It seems like the yum install failed on the host. Try running the command manually and see what the error is.
>>>>>>>>>>>>
>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>
>>>>>>>>>>>> On Aug 13, 2012, at 2:28 AM, xu peng wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>>>
>>>>>>>>>>>>> It's me again.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Following your advice, I reinstalled the ambari server, but deploying the cluster and uninstalling the cluster failed again. I really don't know why.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I attached an archive which contains the logs of all the nodes in my cluster (/var/log/puppet_*.log, /var/log/puppet/*.log, /var/log/yum.log, /var/log/hmc/hmc.log). vbaby3.cloud.eb is the ambari server. Please refer to it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Attachments DeployError and UninstallError are the logs shown by the ambari website on failure, and attachment DeployingDetails.jpg shows the deploy details of my cluster. Please refer to them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks again for your patience! I look forward to your reply.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Xupeng
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Aug 11, 2012 at 10:56 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>> For uninstall failures, you will need to do a couple of things. Depending on where the uninstall failed, you may have to manually do a killall java on all the nodes to kill any missed processes. If you want to start with a completely clean install, you should also delete the hadoop dir in the mount points you selected during the previous install, so that the fresh install does not face errors when it tries to re-format hdfs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> After that, simply uninstall and re-install the ambari rpm, and that should allow you to re-create a fresh cluster.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 11, 2012, at 2:34 AM, xu peng wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks a lot for your reply.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I solved this problem; it was a silly mistake. Someone had changed the owner of the "/" dir, and according to the error log, pdsh needs root ownership to proceed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After changing the owner of "/" back to root, the problem was solved. Thank you again for your reply.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have another question.
>>>>>>>>>>>>>>> I had an uninstall failure, and there is no button on the website for me to roll back, so I don't know what to do about that. What should I do now to reinstall hadoop?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 10:55 PM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>>>> Hi
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Currently, the ambari installer requires everything to be run as root. It does not detect that the user is not root and use sudo, either on the master or on the agent nodes. Furthermore, it seems like it is failing when trying to use pdsh to make remote calls to the host list that you passed in, due to the errors mentioned in your script. This could be due to how it was installed, but I am not sure.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Could you switch to root and run a simple command on all hosts using pdsh? If you want to see exactly how ambari uses pdsh, you can look into /usr/share/hmc/php/frontend/commandUtils.php
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Aug 9, 2012, at 9:04 PM, xu peng wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> According to the error log, is there something wrong with my account?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I installed all the dependency modules and ambari as the user "ambari" instead of root. I added user "ambari" to /etc/sudoers with no passwd.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 11:49 AM, xu peng <[email protected]> wrote:
>>>>>>>>>>>>>>>>>> There is no 100.log file in the /var/log/hmc dir, only a 55.log file (55 is the biggest version num).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The content of 55.log is:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> pdsh@vbaby1: module path "/usr/lib64/pdsh" insecure.
>>>>>>>>>>>>>>>>>> pdsh@vbaby1: "/": Owner not root, current uid, or pdsh executable owner
>>>>>>>>>>>>>>>>>> pdsh@vbaby1: Couldn't load any pdsh modules
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks ~
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 11:36 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>> Sorry - my mistake. The last txn mentioned is 100, so please look for the 100.log file.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 8:34 PM, Hitesh Shah wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks - will take a look and get back to you.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Could you also look at /var/log/hmc/hmc.txn.55.log and see if there are any errors in it?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> -- Hitesh.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 8:00 PM, xu peng wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Hi Hitesh:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks a lot for your reply. I have done all your suggestions on my ambari server, and the results are as below.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 1. I can confirm that the hosts.txt file is empty after I fail at the step of finding reachable nodes.
>>>>>>>>>>>>>>>>>>>>> 2. I tried making the hostdetails file on both win7 and redhat; both failed. (Please see the attachment for my hostdetails file.)
>>>>>>>>>>>>>>>>>>>>> 3. I removed the logging re-directs and ran the .sh script. The script seems to work well: it prints the hostname to the console and generates a file (content "0") in the same dir. (Please see the attachment for the result and my .sh script.)
>>>>>>>>>>>>>>>>>>>>> 4. I attached the hmc.log and error_log too. Hope this helps ~
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks ~
>>>>>>>>>>>>>>>>>>>>> Xupeng
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Fri, Aug 10, 2012 at 12:24 AM, Hitesh Shah <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>>> Xupeng, can you confirm that the hosts.txt file at /var/run/hmc/clusters/EBHadoop/hosts.txt is empty?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Also, can you ensure that the hostdetails file that you upload does not have any special characters that may be creating problems for the parsing layer?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> In the same dir, there should be an ssh.sh script. Can you create a copy of it, edit it to remove the logging re-directs to files, and run the script manually from the command line (it takes a hostname as the argument)? The output should show you what is going wrong.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Also, please look at /var/log/hmc/hmc.log and httpd/error_log to see if there are any errors being logged which may shed more light on the issue.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>>>>>>>> -- Hitesh
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Aug 9, 2012, at 9:11 AM, Artem Ervits wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Which file are you supplying in the step? Hostdetail.txt or hosts?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> From: xupeng.bupt [mailto:[email protected]]
>>>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 11:33 AM
>>>>>>>>>>>>>>>>>>>>>>> To: ambari-user
>>>>>>>>>>>>>>>>>>>>>>> Subject: Re: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you for your reply ~
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I made only one hostdetail.txt file, which contains the names of all servers. I submit this file on the website, but I still have the same problem: I fail at the step of finding reachable nodes.
>>>>>>>>>>>>>>>>>>>>>>> The error log is:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop root
>>>>>>>>>>>>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> And my hostdetail.txt file is:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thank you very much ~
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 2012-08-09
>>>>>>>>>>>>>>>>>>>>>>> xupeng.bupt
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> From: Artem Ervits
>>>>>>>>>>>>>>>>>>>>>>> Sent: 2012-08-09 22:16:53
>>>>>>>>>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>>>>>>>>>> Cc:
>>>>>>>>>>>>>>>>>>>>>>> Subject: RE: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The installer requires a hosts file, which I believe you called hostdetail. Make sure it's the same file. You also mention a hosts.txt and a host.txt. You only need one file with the names of all servers.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> -----Original Message-----
>>>>>>>>>>>>>>>>>>>>>>> From: xu peng [mailto:[email protected]]
>>>>>>>>>>>>>>>>>>>>>>> Sent: Thursday, August 09, 2012 2:02 AM
>>>>>>>>>>>>>>>>>>>>>>> To: [email protected]
>>>>>>>>>>>>>>>>>>>>>>> Subject: Problem when setting up hadoop cluster step 2
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Hi everyone:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I am trying to use ambari to set up a hadoop cluster, but I encounter a problem at step 2. I already set up password-less ssh, and I created a hostdetail.txt file.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> The problem is that the file "/var/run/hmc/clusters/EBHadoop/hosts.txt" is empty, no matter how many times I submit the host.txt file on the website, and I really don't know why.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Here is the log file:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> [2012:08:09 05:17:56][ERROR][sequentialScriptExecutor][sequentialScriptRunner.php:272][]:
>>>>>>>>>>>>>>>>>>>>>>> Encountered total failure in transaction 100 while running cmd:
>>>>>>>>>>>>>>>>>>>>>>> /usr/bin/php ./addNodes/findSshableNodes.php with args: EBHadoop root
>>>>>>>>>>>>>>>>>>>>>>> 35 100 36 /var/run/hmc/clusters/EBHadoop/hosts.txt
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> And my host.txt is like this (vbaby1.cloud.eb is the master node):
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> vbaby2.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby3.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby4.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby5.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>> vbaby6.cloud.eb
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Can anyone help me and tell me what I am doing wrong? Thank you very much ~!
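For this original "finding reachable nodes" failure, a quick manual check along the lines Hitesh suggested earlier in the thread (a sketch; host names and the HMC cluster dir are the ones from this thread, and ssh.sh is the script HMC generates in that dir):

    # password-less root ssh must work for every host in hostdetail.txt
    for h in vbaby2.cloud.eb vbaby3.cloud.eb vbaby4.cloud.eb vbaby5.cloud.eb vbaby6.cloud.eb; do
      ssh -o BatchMode=yes "root@$h" hostname || echo "unreachable: $h"
    done

    # run HMC's own per-host script by hand to see the raw error
    bash /var/run/hmc/clusters/EBHadoop/ssh.sh vbaby2.cloud.eb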
>>>>>>>>>>>>>>>>>>>>> <hmcLog.txt><hostdetails.txt><httpdLog.txt><ssh1.sh><ssh1_result.jpg>
>>>>>>>>>>>>> <DeployError1_2012.8.13.txt><log.rar><DeployingDetails.jpg><UninstallError1_2012.8.13.txt>
>>>>>>>>> <deployError2012.8.17.txt>
>>> <gangliaStartError.txt><4.jpg>
