Oh, thanks very much, I found the picture.

2009/11/17 Jason Venner <[email protected]>
> There is a very clear picture in chapter 8 of Pro Hadoop, covering all of
> the separators for streaming jobs.
>
> On Tue, Nov 10, 2009 at 6:53 AM, wd <[email protected]> wrote:
>
>> You mean the ^A ?
>> I tried \u0001 and \x01, but the streaming job treats each of them as a
>> literal string, not as ^A.
>>
>> :(
>>
>> 2009/11/10 Amogh Vasekar <[email protected]>
>>
>>> Hi,
>>> I'm pretty sure you need to specify the Unicode equivalent, or at least
>>> that is what I used in my Java map-red program.
>>>
>>> Amogh
>>>
>>> On 11/10/09 9:24 AM, "wd" <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to write a Hadoop streaming job in Perl, but I'm completely
>>> confused by the key/value separator.
>>>
>>> I found lots of separators I can set ...
>>>
>>> # -jobconf stream.map.output.field.separator=A \
>>> # -jobconf stream.reducer.output.field.separator=B \
>>> # -jobconf mapred.textoutputformat.separator=C \
>>> # -jobconf key.value.separator.in.input.line=D \
>>> # -jobconf stream.map.output.field.separator=A \
>>> # -jobconf stream.reduce.input.field.separator=AA \
>>> # -jobconf stream.reduce.output.field.separator=B \
>>> # -jobconf map.output.key.field.separator=C \
>>>
>>> But what do these separators mean?
>>>
>>> I tried to use ^A in my job and found this bug
>>> <http://issues.apache.org/jira/browse/HADOOP-3341>. It seems Hadoop fixed
>>> it in 0.19.0, but I still get the following error when I set the
>>> separator to ^A.
>>>
>>> [Fatal Error] :49:68: Character reference "" is an invalid XML character.
>>> 09/11/10 11:10:16 FATAL conf.Configuration: error parsing conf file: org.xml.sax.SAXParseException: Character reference "" is an invalid XML character.
>>> Exception in thread "main" java.lang.RuntimeException: org.xml.sax.SAXParseException: Character reference "" is an invalid XML character.
>>>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1167)
>>>     at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1039)
>>>     at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:979)
>>>     at org.apache.hadoop.conf.Configuration.get(Configuration.java:381)
>>>     at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1630)
>>>     at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:214)
>>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.<init>(LocalJobRunner.java:93)
>>>     at org.apache.hadoop.mapred.LocalJobRunner.submitJob(LocalJobRunner.java:372)
>>>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:800)
>>>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
>>>     at org.apache.hadoop.streaming.StreamJob.submitAndMonitorJob(StreamJob.java:873)
>>>     at org.apache.hadoop.streaming.StreamJob.run(StreamJob.java:118)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>     at org.apache.hadoop.streaming.HadoopStreaming.main(HadoopStreaming.java:32)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>> Caused by: org.xml.sax.SAXParseException: Character reference "" is an invalid XML character.
>>>     at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:239)
>>>     at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:283)
>>>     at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:124)
>>>     at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1091)
>>>     ... 19 more
>>>
>>> So, I can't use ^A as the separator?
>
> --
> Pro Hadoop, a book to guide you from beginner to hadoop mastery,
> http://www.amazon.com/dp/1430219424?tag=jewlerymall
> www.prohadoopbook.com a community for Hadoop Professionals
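
A note on the SAXParseException quoted in this thread: XML 1.0 does not permit U+0001 (^A) anywhere in a document, not even written as a character reference, whereas a tab is allowed. That would be consistent with the failure appearing when the job configuration file is parsed. The following is only a minimal Python sketch of that rule, not Hadoop code: the property name is hypothetical, and Python's expat-based parser stands in for the Xerces parser shown in the trace.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def parses_as_xml_value(char_ref: str) -> bool:
    """Return True if an XML 1.0 parser accepts char_ref inside a <value> element."""
    # "hypothetical.separator" is a made-up property name for illustration only.
    doc = (
        "<property>"
        "<name>hypothetical.separator</name>"
        f"<value>{char_ref}</value>"
        "</property>"
    )
    try:
        minidom.parseString(doc)
        return True
    except ExpatError:
        # Raised for character references outside the XML 1.0 Char range.
        return False


tab_ok = parses_as_xml_value("&#9;")     # tab is a legal XML 1.0 character
ctrl_a_ok = parses_as_xml_value("&#1;")  # ^A (U+0001) is not
print("tab accepted:", tab_ok)
print("^A accepted:", ctrl_a_ok)
```

If this is what is happening in the job above, the separator value itself may be fine on the command line and only break once it is serialized into an XML configuration file.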
