This is the branch for 0.7.1 RC0 and all tests are working as expected: https://svn.apache.org/repos/asf/whirr/branches/branch-0.7

Can you give it a try? I'm still checking mapred.child.ulimit.
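A minimal way to try that branch locally, assuming a standard Subversion checkout and Maven build (the target directory name below is arbitrary):

    # check out the 0.7.1 RC0 branch and build it; assumes svn and Maven are installed
    svn checkout https://svn.apache.org/repos/asf/whirr/branches/branch-0.7 whirr-branch-0.7
    cd whirr-branch-0.7
    mvn clean install -DskipTests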
On Thu, Feb 23, 2012 at 9:05 PM, Edmar Ferreira <[email protected]> wrote:

Just some changes in install_hadoop.sh to install ruby and some dependencies. I'm running Whirr from trunk and I built it 5 days ago, I guess. Do you think I need to do an svn checkout and build it again?

On Thu, Feb 23, 2012 at 6:53 PM, Andrei Savu <[email protected]> wrote:

It's strange that this is happening, because the integration tests work as expected (we are actually running MR jobs).

Are you adding any other options?

On Thu, Feb 23, 2012 at 8:50 PM, Andrei Savu <[email protected]> wrote:

That looks like a change we've made in https://issues.apache.org/jira/browse/WHIRR-490

It seems like "unlimited" is not a valid value for mapred.child.ulimit. Let me investigate a bit more.

In the meantime you can add to your .properties file something like:

hadoop-mapreduce.mapred.child.ulimit=<very-large-number>
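For reference, that override goes into the cluster recipe (.properties file) along these lines; mapred.child.ulimit is a virtual memory limit expressed in kilobytes, and the number below is only an illustrative placeholder, not a value suggested in this thread:

    # hypothetical Whirr recipe fragment; 4194304 KB (4 GB) is just a placeholder
    hadoop-mapreduce.mapred.child.ulimit=4194304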
On Thu, Feb 23, 2012 at 8:36 PM, Edmar Ferreira <[email protected]> wrote:

I changed it and the cluster is running and I can access the fs and submit jobs, but all jobs always fail with this strange error:

java.lang.NumberFormatException: For input string: "unlimited"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:481)
        at java.lang.Integer.valueOf(Integer.java:570)
        at org.apache.hadoop.util.Shell.getUlimitMemoryCommand(Shell.java:86)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:379)

Also, when I try to access the full error log I see this in the browser:

HTTP ERROR: 410
Failed to retrieve stdout log for task: attempt_201202232026_0001_m_000005_0
RequestURI=/tasklog

My proxy is running and I'm using the SOCKS proxy on localhost 6666.

On Thu, Feb 23, 2012 at 5:25 PM, Andrei Savu <[email protected]> wrote:

That should work, but I recommend you try:

http://apache.osuosl.org/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz

archive.apache.org is extremely unreliable.

On Thu, Feb 23, 2012 at 7:18 PM, Edmar Ferreira <[email protected]> wrote:

I will destroy this cluster and launch it again with these lines in the properties:

whirr.hadoop.version=0.20.2
whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz

Any other ideas?

On Thu, Feb 23, 2012 at 5:16 PM, Andrei Savu <[email protected]> wrote:

Yep, so I think this is the root cause. I'm pretty sure you need to make sure you are running the same version.

On Thu, Feb 23, 2012 at 7:14 PM, Edmar Ferreira <[email protected]> wrote:

When I run hadoop version on one cluster machine I get:

Warning: $HADOOP_HOME is deprecated.

Hadoop 0.20.205.0
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205 -r 1179940
Compiled by hortonfo on Fri Oct 7 06:20:32 UTC 2011

When I run hadoop version on my local machine I get:

Hadoop 0.20.2
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010

On Thu, Feb 23, 2012 at 5:05 PM, Andrei Savu <[email protected]> wrote:

Does the local Hadoop version match the remote one?

On Thu, Feb 23, 2012 at 7:00 PM, Edmar Ferreira <[email protected]> wrote:

Yes, I did a

export HADOOP_CONF_DIR=~/.whirr/hadoop/

before running hadoop fs -ls

On Thu, Feb 23, 2012 at 4:56 PM, Ashish <[email protected]> wrote:

Did you set HADOOP_CONF_DIR=~/.whirr/<your cluster name> from the shell where you are running the hadoop command?

--
thanks
ashish

Blog: http://www.ashishpaliwal.com/blog
My Photo Galleries: http://www.pbase.com/ashishpaliwal
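Putting the client-side steps together, a minimal sketch looks like this; the cluster name "hadoop" and the proxy script path are taken from the Whirr log excerpt below, and listing / is only an example command:

    # terminal 1: start the SOCKS proxy script written by Whirr
    sh ~/.whirr/hadoop/hadoop-proxy.sh

    # terminal 2: point the local Hadoop client at the Whirr-generated config
    export HADOOP_CONF_DIR=~/.whirr/hadoop/
    hadoop fs -ls /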
On Fri, Feb 24, 2012 at 12:23 AM, Andrei Savu <[email protected]> wrote:

That looks fine.

Anything interesting in the Hadoop logs on the remote machines? Are all the daemons running as expected?

On Thu, Feb 23, 2012 at 6:48 PM, Edmar Ferreira <[email protected]> wrote:

Last lines:

2012-02-23 16:04:30,241 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running configure phase scripts on all cluster instances
2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-namenode
2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Namenode web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50070
2012-02-23 16:04:30,242 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop site file /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-site.xml
2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop proxy script /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-proxy.sh
2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-jobtracker
2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Jobtracker web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50030
2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopDataNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-datanode
2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopTaskTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-tasktracker
2012-02-23 16:04:30,253 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running start phase scripts on all cluster instances
2012-02-23 16:04:30,257 DEBUG [org.apache.whirr.service.ComputeCache] (Thread-3) closing ComputeServiceContext {provider=aws-ec2, endpoint=https://ec2.us-east-1.amazonaws.com, apiVersion=2010-06-15, buildVersion=, identity=08WMRG9HQYYGVQDT57R2, iso3166Codes=[US-VA, US-CA, US-OR, BR-SP, IE, SG, JP-13]}

On Thu, Feb 23, 2012 at 4:31 PM, Andrei Savu <[email protected]> wrote:

I think it's the first time I see this. Anything interesting in the logs?

On Thu, Feb 23, 2012 at 6:27 PM, Edmar Ferreira <[email protected]> wrote:

Hi guys,

When I launch a cluster and run the proxy everything seems to be right, but when I try to use any command in hadoop I get this error:

Bad connection to FS. command aborted.

Any suggestions?

Thanks

--
Edmar Ferreira
Co-Founder at Everwrite
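For context on why the proxy has to be running: the hadoop-site.xml that Whirr writes under ~/.whirr/hadoop routes client RPC through the SOCKS proxy. The usual generated settings look roughly like the sketch below (reconstructed from how Whirr normally configures this, not copied from this cluster's file):

    <!-- route Hadoop client RPC through the SOCKS proxy started by hadoop-proxy.sh -->
    <property>
      <name>hadoop.rpc.socket.factory.class.default</name>
      <value>org.apache.hadoop.net.SocksSocketFactory</value>
    </property>
    <property>
      <name>hadoop.socks.server</name>
      <value>localhost:6666</value>
    </property>

If the proxy is not running (or listens on a different port than the config expects), client commands cannot reach the namenode and fail with errors like the "Bad connection to FS" above.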
