Hassen,

I have lots of binary data that I parse with Python under Hadoop streaming.

The way I do this is to write the binary data into sequence files, saving each binary data object as the key and (null) as the value. When streaming, each key is then handed back to my script one line at a time, key by key, for an entire block.

For this to work in streaming on the command line you need to use *-inputformat SequenceFileAsTextInputFormat*

To create the sequence files I have a jar file that reads from a BufferedReader and writes to org.apache.hadoop.io.SequenceFile.Writer

I am not sure whether you can do this for your data, but if not, you can write your own InputFormat.

Good luck!

/*
Joe Stein
http://www.linkedin.com/in/charmalloc
Twitter: @allthingshadoop
*/

On Mon, Jun 20, 2011 at 4:13 PM, Hassen Riahi <hassen.ri...@cern.ch> wrote:

> Dear all,
>
> Is it possible to have a binary input to a map code written in python?
>
> Thank you
> Hassen
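
For illustration, here is a minimal sketch of the Python mapper side of the approach described above. It is not the actual script. It assumes the keys were written as BytesWritable, whose text form is space-separated hex pairs (for example "01 ff 3a"), and that each line arrives as key, tab, value. The names parse_record and map_records are hypothetical, and emitting the record length is only a placeholder for real parsing.

```python
def parse_record(line):
    """Split one streaming input line into (key_bytes, value_text)."""
    key_text, _, value_text = line.rstrip("\n").partition("\t")
    # Assumption: the key is the hex text of a BytesWritable,
    # e.g. "01 ff 3a". bytes.fromhex ignores the spaces.
    return bytes.fromhex(key_text), value_text


def map_records(lines):
    """Hypothetical mapper body: yield "<length>\t1" for each record."""
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        payload, _value = parse_record(line)
        # Placeholder logic: a real mapper would decode `payload` here.
        yield "%d\t1" % len(payload)


# In the mapper script itself you would finish with something like:
#   import sys
#   for out_line in map_records(sys.stdin):
#       print(out_line)
```

If the keys use a different Writable type, the text that reaches the script will look different, and parse_record would need to change to match.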