That looks like a change we've made in https://issues.apache.org/jira/browse/WHIRR-490
It seems like "unlimited" is not a valid value for mapred.child.ulimit. Let me investigate a bit more. In the meantime you can add something like the following to your .properties file:

hadoop-mapreduce.mapred.child.ulimit=<very-large-number>
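For example, a minimal sketch of the override (the 4194304 value is an illustrative choice, not a recommendation; the number ends up in a ulimit -v call, so it is a per-task virtual-memory limit in KB):

# Whirr cluster .properties file -- use an explicit number instead of "unlimited",
# which Shell.getUlimitMemoryCommand cannot parse (see the stack trace below).
# 4194304 KB is roughly 4 GB per task process; size it to your instance type.
hadoop-mapreduce.mapred.child.ulimit=4194304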
On Thu, Feb 23, 2012 at 8:36 PM, Edmar Ferreira <[email protected]> wrote:
> I changed it and the cluster is running and I can access the fs and submit jobs, but all jobs always fail with this strange error:
>
> java.lang.NumberFormatException: For input string: "unlimited"
>         at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
>         at java.lang.Integer.parseInt(Integer.java:481)
>         at java.lang.Integer.valueOf(Integer.java:570)
>         at org.apache.hadoop.util.Shell.getUlimitMemoryCommand(Shell.java:86)
>         at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:379)
>
> Also, when I try to access the full error log I see this in the browser:
>
> HTTP ERROR: 410
> Failed to retrieve stdout log for task: attempt_201202232026_0001_m_000005_0
> RequestURI=/tasklog
>
> My proxy is running and I'm using the SOCKS proxy on localhost 6666.
>
> On Thu, Feb 23, 2012 at 5:25 PM, Andrei Savu <[email protected]> wrote:
>> That should work, but I recommend you try:
>>
>> http://apache.osuosl.org/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz
>>
>> archive.apache.org is extremely unreliable.
>>
>> On Thu, Feb 23, 2012 at 7:18 PM, Edmar Ferreira <[email protected]> wrote:
>>> I will destroy this cluster and launch again with these lines in the properties:
>>>
>>> whirr.hadoop.version=0.20.2
>>> whirr.hadoop.tarball.url=http://archive.apache.org/dist/hadoop/core/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
>>>
>>> Any other ideas?
>>>
>>> On Thu, Feb 23, 2012 at 5:16 PM, Andrei Savu <[email protected]> wrote:
>>>> Yep, so I think this is the root cause. I'm pretty sure you need to make sure you are running the same version.
>>>>
>>>> On Thu, Feb 23, 2012 at 7:14 PM, Edmar Ferreira <[email protected]> wrote:
>>>>> When I run hadoop version on one of the cluster machines I get:
>>>>>
>>>>> Warning: $HADOOP_HOME is deprecated.
>>>>>
>>>>> Hadoop 0.20.205.0
>>>>> Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-205 -r 1179940
>>>>> Compiled by hortonfo on Fri Oct 7 06:20:32 UTC 2011
>>>>>
>>>>> When I run hadoop version on my local machine I get:
>>>>>
>>>>> Hadoop 0.20.2
>>>>> Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707
>>>>> Compiled by chrisdo on Fri Feb 19 08:07:34 UTC 2010
>>>>>
>>>>> On Thu, Feb 23, 2012 at 5:05 PM, Andrei Savu <[email protected]> wrote:
>>>>>> Does the local Hadoop version match the remote one?
>>>>>>
>>>>>> On Thu, Feb 23, 2012 at 7:00 PM, Edmar Ferreira <[email protected]> wrote:
>>>>>>> Yes, I did an
>>>>>>>
>>>>>>> export HADOOP_CONF_DIR=~/.whirr/hadoop/
>>>>>>>
>>>>>>> before running hadoop fs -ls
>>>>>>>
>>>>>>> On Thu, Feb 23, 2012 at 4:56 PM, Ashish <[email protected]> wrote:
>>>>>>>> Did you set HADOOP_CONF_DIR=~/.whirr/<your cluster name> from the shell where you are running the hadoop command?
>>>>>>>>
>>>>>>>> On Fri, Feb 24, 2012 at 12:23 AM, Andrei Savu <[email protected]> wrote:
>>>>>>>>> That looks fine.
>>>>>>>>>
>>>>>>>>> Anything interesting in the Hadoop logs on the remote machines? Are all the daemons running as expected?
>>>>>>>>>
>>>>>>>>> On Thu, Feb 23, 2012 at 6:48 PM, Edmar Ferreira <[email protected]> wrote:
>>>>>>>>>> These are the last lines:
>>>>>>>>>>
>>>>>>>>>> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running configure phase scripts on all cluster instances
>>>>>>>>>> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-namenode
>>>>>>>>>> 2012-02-23 16:04:30,241 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Namenode web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50070
>>>>>>>>>> 2012-02-23 16:04:30,242 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop site file /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-site.xml
>>>>>>>>>> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopNameNodeClusterActionHandler] (main) Wrote Hadoop proxy script /Users/edmaroliveiraferreira/.whirr/hadoop/hadoop-proxy.sh
>>>>>>>>>> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-jobtracker
>>>>>>>>>> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopJobTrackerClusterActionHandler] (main) Jobtracker web UI available at http://ec2-23-20-110-12.compute-1.amazonaws.com:50030
>>>>>>>>>> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopDataNodeClusterActionHandler] (main) Completed configuration of hadoop role hadoop-datanode
>>>>>>>>>> 2012-02-23 16:04:30,246 INFO [org.apache.whirr.service.hadoop.HadoopTaskTrackerClusterActionHandler] (main) Completed configuration of hadoop role hadoop-tasktracker
>>>>>>>>>> 2012-02-23 16:04:30,253 INFO [org.apache.whirr.actions.ScriptBasedClusterAction] (main) Finished running start phase scripts on all cluster instances
>>>>>>>>>> 2012-02-23 16:04:30,257 DEBUG [org.apache.whirr.service.ComputeCache] (Thread-3) closing ComputeServiceContext {provider=aws-ec2, endpoint=https://ec2.us-east-1.amazonaws.com, apiVersion=2010-06-15, buildVersion=, identity=08WMRG9HQYYGVQDT57R2, iso3166Codes=[US-VA, US-CA, US-OR, BR-SP, IE, SG, JP-13]}
>>>>>>>>>>
>>>>>>>>>> On Thu, Feb 23, 2012 at 4:31 PM, Andrei Savu <[email protected]> wrote:
>>>>>>>>>>> I think it's the first time I see this. Anything interesting in the logs?
>>>>>>>>>>> On Thu, Feb 23, 2012 at 6:27 PM, Edmar Ferreira <[email protected]> wrote:
>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>
>>>>>>>>>>>> When I launch a cluster and run the proxy everything seems to be right, but when I try to use any command in hadoop I get this error:
>>>>>>>>>>>>
>>>>>>>>>>>> Bad connection to FS. command aborted.
>>>>>>>>>>>>
>>>>>>>>>>>> Any suggestions?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Edmar Ferreira
>>>>>>>>>>>> Co-Founder at Everwrite
>>>>>>>>
>>>>>>>> --
>>>>>>>> thanks
>>>>>>>> ashish
>>>>>>>>
>>>>>>>> Blog: http://www.ashishpaliwal.com/blog
>>>>>>>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>
> --
> Edmar Ferreira
> Co-Founder at Everwrite
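Putting the suggestions from this thread together, a minimal sketch of the client-side setup (the recipe file name is illustrative, the ulimit value is an arbitrary number in KB, and you should check that the osuosl mirror serves the path suggested above):

# hadoop.properties (illustrative name) -- pin the same Hadoop release locally and on the cluster
whirr.hadoop.version=0.20.2
whirr.hadoop.tarball.url=http://apache.osuosl.org/hadoop/common/hadoop-${whirr.hadoop.version}/hadoop-${whirr.hadoop.version}.tar.gz
# numeric workaround for the "unlimited" NumberFormatException
hadoop-mapreduce.mapred.child.ulimit=4194304

# on the client: point the hadoop CLI at the config Whirr generated,
# start the SOCKS proxy, then test the connection to the filesystem
export HADOOP_CONF_DIR=~/.whirr/hadoop
sh ~/.whirr/hadoop/hadoop-proxy.sh &
hadoop fs -ls /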
