[
https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13293011#comment-13293011
]
Ivan Mitic commented on MAPREDUCE-4322:
---------------------------------------
bq. TaskLog.java - Any special reasons to perform the command line length check
multiple times instead of once at the end of buildCommandLine()?
taskjvm.cmd contains multiple lines that we execute, and the length limit
applies to each line, so I check every line individually. An example
taskjvm.cmd looks like this:
{code}
set HADOOP_CLIENT_OPTS=...
set SHELL="cmd"...
...
set CLASSPATH=...
C:\...\jre\bin\java ...
{code}
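The per-line check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name, the method names, and the sample script lines are hypothetical, and the 8191-character limit is an assumption based on the documented cmd.exe maximum.

```java
import java.util.Arrays;
import java.util.List;

class CommandLineLengthCheck {
    // Assumed limit: cmd.exe accepts at most 8191 characters per command line.
    static final int MAX_CMD_LINE_LENGTH = 8191;

    // Returns the index of the first line longer than the limit, or -1 if all lines fit.
    static int firstTooLongLine(List<String> scriptLines, int limit) {
        for (int i = 0; i < scriptLines.size(); i++) {
            if (scriptLines.get(i).length() > limit) {
                return i;
            }
        }
        return -1;
    }

    // Builds a string of n copies of c (a stand-in for a very long classpath).
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        // Hypothetical script lines; a real taskjvm.cmd is generated by the TaskTracker.
        List<String> lines = Arrays.asList(
            "set HADOOP_CLIENT_OPTS=-Dexample=1",
            "set CLASSPATH=" + repeat('x', 9000),
            "java ExampleTask");
        int bad = firstTooLongLine(lines, MAX_CMD_LINE_LENGTH);
        System.out.println(bad >= 0 ? "line " + bad + " exceeds the limit" : "all lines fit");
    }
}
```

Checking each line separately identifies which statement in the script is too long, which a single check on the concatenated script could not do.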
bq. the advantage with the -classpath argument was isolation of the classpath
to the specific spawned JVM. But by changing the classpath env var we risk
changing it for every spawned process too. Maybe that's not much of a problem.
I thought of this as well. Because we start a separate bash/cmd process for
every task, the CLASSPATH setting applies only to that task.
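As a rough Java analogue of this scoping (the patch itself sets the variable inside the per-task script, not through this API), the sketch below sets CLASSPATH on a child process's environment and leaves the parent's environment unchanged. The class name, method name, and jar name are hypothetical.

```java
class TaskEnvIsolation {
    // Returns a ProcessBuilder whose child environment carries the given CLASSPATH.
    static ProcessBuilder withClasspath(String classpath, String... command) {
        ProcessBuilder pb = new ProcessBuilder(command);
        // environment() starts as a copy of the parent's environment; changes made
        // here affect only processes started from this builder.
        pb.environment().put("CLASSPATH", classpath);
        return pb;
    }

    public static void main(String[] args) {
        // Hypothetical jar name; the process is configured but not started here.
        ProcessBuilder pb = withClasspath("hypothetical-task.jar", "java", "-version");
        System.out.println("child CLASSPATH:  " + pb.environment().get("CLASSPATH"));
        System.out.println("parent CLASSPATH: " + System.getenv("CLASSPATH"));
    }
}
```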
bq. What if CLASSPATH is already set on the machine? Will this append to it or
override it? From the code it looks like generating the classpath list will
pick up the parent classpath. So if CLASSPATH env var is already set then it
will be part of classpath list via the parent jvm (TaskTracker jvm). So even if
the taskjvm.cmd sets the CLASSPATH it will be a superset of any existing
CLASSPATH env var. Can you please verify this by having a pre-existing
CLASSPATH set?
Thanks. I just checked, and we do not include the system-level CLASSPATH.
However, the two settings appear to be mutually exclusive: if you pass the
classpath via {{-classpath}}, the CLASSPATH environment variable is ignored. I
tested this with a sample app that prints
{{System.getProperty("java.class.path")}}. It makes sense to be explicit in
this case and not include the system setting, as it can cause problems with
class resolution. Also, Hadoop users have other ways to specify custom
classpaths if needed. Agree?
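A sample app of the kind mentioned above could look like the following. The original test app is not shown in this thread, so the class and method names here are hypothetical.

```java
class PrintClasspath {
    // Returns the classpath the running JVM actually resolved.
    static String effectiveClasspath() {
        return System.getProperty("java.class.path");
    }

    public static void main(String[] args) {
        System.out.println(effectiveClasspath());
    }
}
```

Running it once with only the CLASSPATH environment variable set, and once with {{-classpath}} passed as well, shows which of the two the JVM used.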
> Fix command-line length abort issues on Windows
> -----------------------------------------------
>
> Key: MAPREDUCE-4322
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: tasktracker
> Environment: Windows, downstream applications with long aggregate
> classpaths
> Reporter: John Gordon
> Assignee: Ivan Mitic
> Attachments: MAPREDUCE-4322-branch-1-win.patch
>
> Original Estimate: 12h
> Remaining Estimate: 12h
>
> When a task is started on the tasktracker, it creates a small batch file to
> invoke java and runs that batch. Within the batch file, the invocation of
> Java currently has -classpath ${CLASSPATH} inline to the command. That line
> often exceeds 8000 characters. This is OK for most Linux distributions
> because the line limit env variable is often set much higher than this.
> However, for Windows this causes cmd to abort execution. This surfaces in
> Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the
> -classpath option into a config file to take the longest variable part of the
> line and put it somewhere that scales better.