[ 
https://issues.apache.org/jira/browse/MAPREDUCE-109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Egner updated MAPREDUCE-109:
----------------------------------------

    Attachment: MAPREDUCE-109.patch

Revised the patch to use a different mechanism for setting the XML version.  
I've been able to apply this patch to CDH3u0 (hadoop 0.20.2) and it works.

I haven't had luck following the instructions for submitting a patch (ant just 
sits there for hours without making progress on patch-test) and I haven't been 
able to determine why hudson fails.  At this point, I don't have any more time 
to spend on this issue, but it would be really cool if someone could come along 
and figure out what's wrong.

If you still get the error below after deploying the new jar, remember to 
restart your jobtracker and all tasktrackers:
{code}
11/04/28 02:52:54 WARN mapred.JobClient: Error reading task outputhttp://grid1-dn1:50060/tasklog?plaintext=true&attemptid=attempt_201104201944_0092_m_000034_2&filter=stdout
11/04/28 02:52:55 WARN mapred.JobClient: Error reading task outputhttp://grid1-dn1:50060/tasklog?plaintext=true&attemptid=attempt_201104201944_0092_m_000034_2&filter=stderr
11/04/28 02:52:56 INFO mapred.JobClient: Task Id : attempt_201104201944_0092_r_000001_2, Status : FAILED
Error initializing attempt_201104201944_0092_r_000001_2:
java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference "&#1" is an invalid XML character.
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1393)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1261)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1192)
        at org.apache.hadoop.conf.Configuration.get(Configuration.java:415)
        at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1957)
        at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:386)
        at org.apache.hadoop.mapred.DefaultTaskController.initializeJob(DefaultTaskController.java:194)
        at org.apache.hadoop.mapred.TaskTracker$4.run(TaskTracker.java:1199)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1174)
        at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1089)
        at org.apache.hadoop.mapred.TaskTracker.startNewTask(TaskTracker.java:2257)
        at org.apache.hadoop.mapred.TaskTracker$TaskLauncher.run(TaskTracker.java:2221)
Caused by: org.xml.sax.SAXParseException: Character reference "&#1" is an invalid XML character.
        at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:249)
        at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:284)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1313)
        ... 14 more
{code}
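
For anyone wondering why the XML version matters here, below is a minimal 
standalone sketch using only the JDK's built-in parser. It is not taken from 
the patch and does not touch Hadoop; the class name CtrlAXmlVersionDemo and 
its helper methods are made up for illustration. It relies on the fact that 
XML 1.0 forbids a character reference to U+0001, while XML 1.1 permits it 
when written as a character reference.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; not part of Hadoop or of MAPREDUCE-109.patch.
public class CtrlAXmlVersionDemo {

    // Builds a one-element document holding a character reference to ctrl-A,
    // declared with the given XML version ("1.0" or "1.1").
    static String document(String xmlVersion) {
        return "<?xml version=\"" + xmlVersion + "\"?><value>&#1;</value>";
    }

    // Parses the document and returns the element text.
    // Throws SAXParseException if the parser rejects the input.
    static String parseValue(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        // DefaultHandler rethrows fatal errors without printing to stderr.
        builder.setErrorHandler(new DefaultHandler());
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    // Returns true if parsing fails with a SAXParseException.
    static boolean isRejected(String xml) throws Exception {
        try {
            parseValue(xml);
            return false;
        } catch (SAXParseException e) {
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("XML 1.0 rejects &#1;: " + isRejected(document("1.0")));
        System.out.println("XML 1.1 rejects &#1;: " + isRejected(document("1.1")));
    }
}
```

The same SAXParseException shown in the trace above is what the "1.0" case 
runs into; whether a given Hadoop build writes its job XML as 1.0 or 1.1 
depends on whether the patch has been applied.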

> Setting up ctr-A as custom delimiter for "mapred.textoutputformat.separator"
> ----------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-109
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-109
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.20.2, 0.23.0
>            Reporter: Suhas Gogate
>         Attachments: MAPREDUCE-109.patch, MAPREDUCE-109.patch
>
>
> The feature added by this Jira fails when the separator is set to a 
> character that is invalid in XML, e.g. ctrl-A ("\u0001"):
> String delim = "\u0001";
> conf.set("mapred.textoutputformat.separator", delim);
> The job client serializes the jobconf with mapred.textoutputformat.separator 
> set to "\u0001" (ctrl-A), and the problem occurs when it is deserialized 
> (read back) by the job tracker, which encounters the invalid XML character.
> The test for this feature, testFormatWithCustomSeparator(), does not 
> serialize the jobconf after setting the separator to ctrl-A and hence does 
> not detect this specific problem.
> Here is an exception:
> 08/12/06 01:40:50 INFO mapred.FileInputFormat: Total input paths to process : 1
> org.apache.hadoop.ipc.RemoteException: java.io.IOException: java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference "&#1" is an invalid XML character.
> at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:961)
> at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:864)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:832)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:291)
> at org.apache.hadoop.mapred.JobConf.getJobPriority(JobConf.java:1163)
> at org.apache.hadoop.mapred.JobInProgress.<init>(JobInProgress.java:179)
> at org.apache.hadoop.mapred.JobTracker.submitJob(JobTracker.java:1783)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
> at org.apache.hadoop.ipc.Client.call(Client.java:715)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
> at org.apache.hadoop.mapred.$Proxy1.submitJob(Unknown Source)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:788)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1026)
> at

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira