[ 
https://issues.apache.org/jira/browse/MAPREDUCE-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene A Slusarev updated MAPREDUCE-6568:
-----------------------------------------
    Attachment: PipeMapRed.diff

Attached my version of the patch.
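
For reference, a minimal sketch of the kind of guard the diff applies, assuming it simply skips job conf values that are too large to export as environment variables (the class name, the cap, and the warning below are illustrative, not the exact contents of the attached diff):

{code:java}
// Illustrative sketch only -- not the attached diff. The cap value,
// class name, and warning are assumptions; the real patch may differ.
import java.util.Map;
import java.util.Properties;
import org.apache.hadoop.mapred.JobConf;

public class EnvSizeGuard {
  // Assumed 100k cap; the safe value actually differs per platform.
  private static final int MAX_ENV_VALUE_LEN = 100 * 1024;

  /** Copies job conf entries into the child env, skipping oversized values. */
  public static void addJobConfToEnvironment(JobConf conf, Properties env) {
    for (Map.Entry<String, String> en : conf) {
      String name = en.getKey().replaceAll("[^A-Za-z0-9_]", "_"); // env-safe name
      String value = conf.get(en.getKey()); // re-get to apply variable expansion
      if (value != null && value.length() > MAX_ENV_VALUE_LEN) {
        System.err.println("Skipping oversized env var " + name
            + " (" + value.length() + " chars)");
        continue;
      }
      env.put(name, value);
    }
  }
}
{code}

Skipping with a warning rather than truncating avoids handing the streaming child a silently corrupted value.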

> Streaming Tasks die when Environment Variable value longer than 100k
> --------------------------------------------------------------------
>
>                 Key: MAPREDUCE-6568
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6568
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Eugene A Slusarev
>            Priority: Minor
>         Attachments: PipeMapRed.diff
>
>
> For some jobs I use 
> mapred.input.format.class=*org.apache.hadoop.mapred.lib.DelegatingInputFormat*, 
> which also requires 
> *mapred.input.dir.formats/mapreduce.input.multipleinputs.dir.formats* to be 
> defined: the list of files provided in 
> *mapred.input.dir/mapreduce.input.fileinputformat.inputdir*, extended with an 
> input reader class per entry. Sometimes this list becomes very large, and the 
> job starts failing because of the size of the environment variable.
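> A minimal sketch of how that property grows (the paths and count here are 
> hypothetical): each *MultipleInputs.addInputPath* call appends one 
> "path;formatClass" pair, so the value scales linearly with the number of inputs.
> {code:java}
> import org.apache.hadoop.fs.Path;
> import org.apache.hadoop.mapred.JobConf;
> import org.apache.hadoop.mapred.TextInputFormat;
> import org.apache.hadoop.mapred.lib.MultipleInputs;
> 
> public class ManyInputsExample {
>   public static void main(String[] args) {
>     JobConf conf = new JobConf();
>     // Each call appends "path;formatClass" to the dir.formats property
>     // (and sets DelegatingInputFormat as the job's input format), so
>     // thousands of inputs push the value well past 100k characters.
>     for (int i = 0; i < 5000; i++) { // hypothetical number of inputs
>       MultipleInputs.addInputPath(conf,
>           new Path("/data/day-" + i),  // hypothetical paths
>           TextInputFormat.class);
>     }
>   }
> }
> {code}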
> I added a 100k limit to *addJobConfToEnvironment* in 
> *org.apache.hadoop.streaming.PipeMapRed*, but that doesn't seem like a good 
> solution because the limit differs across platforms (Windows, Linux, etc.; on 
> Linux, for example, a single environment string passed to execve is capped by 
> MAX_ARG_STRLEN, typically 128 KiB).
> I'm sure there should be a better way to detect the system limit and make 
> this fix more flexible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)