It looks mostly correct to me. I am not an expert on sequence files, and I have checked neither the text against the spec nor the binary numbers in it to be sure they add up to the correct lengths, but it looks good at first glance. I can see the SEQ tag at the beginning, which marks it as a sequence file, and org.apache.hadoop.io.Text as the type for both the keys and the values.
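For anyone who wants to check the lengths rather than eyeball them, here is a minimal Python sketch that walks the first 86 bytes of the quoted dump. It rests on an assumption about the uncompressed SequenceFile layout as I understand it, not something verified against the spec in this thread: "SEQ", a version byte, the key and value class names (each with a one-byte length prefix), two compression flag bytes, a 4-byte metadata count, a 16-byte sync marker, then a 4-byte record length and a 4-byte key length. The bytes are transcribed by hand from the dump, and the names DUMP_PREFIX, read_string and parse_prefix are made up for this example. If the assumed layout is right, the "random characters" in the question would be the sync marker.

```python
import struct

# Bytes transcribed by hand from the od -c dump (offsets 0000000 to 0000125).
CLASS_NAME = b"org.apache.hadoop.io.Text"
SYNC = bytes([ord("u"), 0o265, 0o330, 0o320, 0o252, ord('"'), ord("\n"),
              ord("#"), ord(";"), 0o374, ord("5"), 0o211, ord("V"), ord("'"),
              0o340, 0o376])
DUMP_PREFIX = (
    b"SEQ" + bytes([0o006])           # magic tag and version byte
    + bytes([0o031]) + CLASS_NAME     # key class name, length prefix 031
    + bytes([0o031]) + CLASS_NAME     # value class name, length prefix 031
    + bytes(2)                        # assumed: two compression flag bytes
    + bytes(4)                        # assumed: metadata entry count
    + SYNC                            # assumed: 16-byte sync marker
    + bytes([0, 0, 0, ord("X")])      # assumed: record length
    + bytes([0, 0, 0, 0o037])         # assumed: key length
)


def read_string(buf, pos):
    """Read a length-prefixed UTF-8 string; only one-byte lengths are handled."""
    length = buf[pos]
    if length > 127:
        raise ValueError("multi-byte length prefix not handled in this sketch")
    start = pos + 1
    return buf[start:start + length].decode("utf-8"), start + length


def parse_prefix(buf):
    """Parse the header and the first record's length fields (assumed layout)."""
    if buf[:3] != b"SEQ":
        raise ValueError("missing SEQ tag")
    version = buf[3]
    key_class, pos = read_string(buf, 4)
    value_class, pos = read_string(buf, pos)
    compressed = buf[pos] != 0
    block_compressed = buf[pos + 1] != 0
    pos += 2
    if compressed or block_compressed:
        raise ValueError("compressed files are not handled in this sketch")
    (metadata_count,) = struct.unpack_from(">i", buf, pos)
    pos += 4
    if metadata_count != 0:
        raise ValueError("metadata entries are not handled in this sketch")
    sync = buf[pos:pos + 16]
    pos += 16
    record_length, key_length = struct.unpack_from(">ii", buf, pos)
    pos += 8
    return {
        "version": version,
        "key_class": key_class,
        "value_class": value_class,
        "sync": sync,
        "record_length": record_length,
        "key_length": key_length,
        "data_offset": pos,
    }


info = parse_prefix(DUMP_PREFIX)
for name, value in info.items():
    print(name, "=", value)
```

Under that reading the lengths do add up: the record data would start at byte 86 and run for 88 bytes, ending at byte 174, which is octal 256, the final offset printed in the dump.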
--Bobby Evans

On 12/19/11 7:51 AM, "Pedro Costa" <psdc1...@gmail.com> wrote:

Hi,

In Hadoop MapReduce, I've executed the webdatascan example, and the reduce output is in a SequenceFile. The result is shown here (http://paste.lisp.org/display/126572). What's the trash (random characters), like "u 265 0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376" in the output? Is the output correct?

0000000 S E Q 006 031 o r g . a p a c h e .
0000020 h a d o o p . i o . T e x t 031 o
0000040 r g . a p a c h e . h a d o o p
0000060 . i o . T e x t \0 \0 \0 \0 \0 \0 u 265
0000100 330 320 252 " \n # ; 374 5 211 V ' 340 376 \0 \0
0000120 \0 X \0 \0 \0 037 a p p l e a p p
0000140 l e b a n a n a a p p l e
0000160 a p p l e 7 c a r r o t c a
0000200 r r o t c a r r o t c a r r
0000220 o t a p p l e b a n a n a
0000240 c a r r o t b a n a n a
0000256

--
Thanks,