[
https://issues.apache.org/jira/browse/MAPREDUCE-6568?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eugene A Slusarev updated MAPREDUCE-6568:
-----------------------------------------
Attachment: PipeMapRed.diff
Attached my version
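For reference, the idea is roughly the following (a simplified sketch, not the literal patch; the attached PipeMapRed.diff is authoritative, {{MAX_ENV_VALUE_LENGTH}} is an illustrative name, and {{LOG}} and {{safeEnvVarName}} are existing members of PipeMapRed that the method body below assumes):
{code:java}
// Sketch of a guarded addJobConfToEnvironment, as it would sit inside
// org.apache.hadoop.streaming.PipeMapRed. Illustrative constant name.
private static final int MAX_ENV_VALUE_LENGTH = 100 * 1024; // hard-coded 100k

void addJobConfToEnvironment(JobConf conf, Properties env) {
  for (Map.Entry<String, String> en : conf) {
    String name = en.getKey();
    String value = conf.get(name); // applies variable expansion
    if (value != null && value.length() > MAX_ENV_VALUE_LENGTH) {
      // Skip the oversized entry so the child exec() stays under the
      // platform limit, instead of the whole task dying.
      LOG.warn("Skipping env entry " + name + ", value length "
          + value.length() + " exceeds " + MAX_ENV_VALUE_LENGTH);
      continue;
    }
    env.put(safeEnvVarName(name), value);
  }
}
{code}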
> Streaming Tasks die when an Environment Variable value is longer than 100k
> --------------------------------------------------------------------------
>
> Key: MAPREDUCE-6568
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-6568
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Eugene A Slusarev
> Priority: Minor
> Attachments: PipeMapRed.diff
>
>
> For some jobs I use
> mapred.input.format.class=*org.apache.hadoop.mapred.lib.DelegatingInputFormat*,
> which also requires
> *mapred.input.dir.formats/mapreduce.input.multipleinputs.dir.formats* to be
> defined with the list of files from
> *mapred.input.dir/mapreduce.input.fileinputformat.inputdir*, extended with an
> input reader class per entry. Sometimes this list becomes very large and the
> job starts failing because of the size of the resulting environment variable,
> as illustrated below.
> I added a 100k limit to *addJobConfToEnvironment* in
> *org.apache.hadoop.streaming.PipeMapRed*, but it does not seem like a good
> solution because the limit differs across platforms (Windows, Linux, etc.).
> I'm sure there should be a better way to detect the system limits and make
> this fix more flexible; one possible direction is sketched below.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)