[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved MAPREDUCE-3276.
-----------------------------------------
    Resolution: Later

This issue is pretty stale at this point.  Closing as Later.  If it is still 
a problem, then please open a new JIRA.

> hadoop dfs -copyToLocal/copyFromLocal called within Hadoop Streaming returns 
> early
> ----------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3276
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3276
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>    Affects Versions: 0.20.2
>         Environment: Linux RedHat Enterprise Linux 5.
> 31 node cluster with 1 as JobTracker and NameNode, and 30 as TaskTracker and 
> DataNode.
>            Reporter: Keith Stevens
>              Labels: hadoop, shell, streaming
>
> I'm using the Cloudera Hadoop release 0.20.2.+737 to parallelize bash 
> scripts with Hadoop Streaming.
> Below is an example script that I've been running, which simply copies a file 
> from HDFS to a local node.
> {code:title=SampleMapper.sh|borderStyle=solid}
>  hadoop dfs -copyToLocal /path/to/some/large/file/myFile myFile
>  # Spin until the file is fully copied.
>  while [ ! -f myFile ]
>  do 
>   echo "spin"
>   sleep 1 
>  done
> {code}
> Surprisingly, the copy call returns before the file is copied, if the file is 
> sufficiently large, and the while loop spins for several iterations.  I'm 
> seeing similar behavior with copyFromLocal.
> I've asked about this issue on other forums and no one else seems to have had 
> the problem, although I don't know how many people are attempting to do this 
> particular task.
> Has this been fixed in more recent versions of hadoop?
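
For anyone who runs into similar behavior, one way to make the failure
visible is to check the exit status of the copy rather than polling for the
file.  Note that `[ -f myFile ]` only tests that the file exists, not that it
has been fully written.  The sketch below is an illustration only:
copy_or_fail is a hypothetical helper name, not part of Hadoop, and it does
not address whatever caused the early return described in the report.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): run a copy command and fail
# loudly if it exits non-zero or if the destination is missing afterwards.
# Usage: copy_or_fail DEST COMMAND [ARGS...]
copy_or_fail() {
  local dest="$1"
  shift
  # Run the command in the foreground and inspect its exit status.
  if ! "$@"; then
    echo "copy command failed: $*" >&2
    return 1
  fi
  # A zero exit status with no destination file is also treated as an error.
  if [ ! -f "$dest" ]; then
    echo "copy reported success but $dest is missing" >&2
    return 1
  fi
  return 0
}

# In the mapper from the report this might be called as:
#   copy_or_fail myFile hadoop dfs -copyToLocal /path/to/some/large/file/myFile myFile || exit 1
```

With this in place the mapper exits with an error instead of spinning, which
should make the task attempt show up as failed in the job logs.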



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
