hi,

i would like to use binary input and output data in combination with hadoop
streaming.

the reason i want to use binary data is that parsing text into floats
seems to take a lot of time compared to reading the binary floats
directly.

i am using a mapper written in C (it reads the streaming data from stdin
and writes to stdout) and no reducer.
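
to make this concrete, here is a minimal sketch of the kind of mapper i
mean. it assumes the input is nothing but raw 4-byte floats in the
machine's native byte order; map_stream and process are placeholder names,
and process just stands in for the real computation.

```c
#include <stdio.h>

enum { CHUNK = 4096 };  /* floats read per fread call (arbitrary choice) */

/* Placeholder for the real per-value work of the mapper. */
static float process(float x) { return 2.0f * x; }

/* Read raw floats from `in`, transform them, write raw floats to `out`.
   Returns the number of floats processed, or -1 on a write error.
   Trailing bytes that do not form a whole float are ignored. */
long map_stream(FILE *in, FILE *out)
{
    float buf[CHUNK];
    size_t n;
    long total = 0;

    while ((n = fread(buf, sizeof buf[0], CHUNK, in)) > 0) {
        for (size_t i = 0; i < n; i++)
            buf[i] = process(buf[i]);
        if (fwrite(buf, sizeof buf[0], n, out) != n)
            return -1;
        total += (long)n;
    }
    return fflush(out) == 0 ? total : -1;
}

/* In the mapper itself, main would just call map_stream(stdin, stdout). */
```

no text parsing happens anywhere in this loop, which is the whole point.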

so my question is: how do i implement binary input and output in this
context? as far as i understand, i need to put a '\n' char at the end of
each binary 'line', so that hadoop knows how to split and distribute the
input data among the nodes and how to collect it for output (is that right?)
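
one thing i am unsure about with the '\n' idea: the raw bytes of a float
can themselves contain the newline byte (0x0A), so a line-based splitter
might cut a record in the middle. a small sketch to check a value for
this (float_contains_newline is just a name i made up, and i am assuming
4-byte IEEE 754 floats):

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if any byte of the float's in-memory representation
   equals '\n' (0x0A), otherwise 0. */
int float_contains_newline(float x)
{
    unsigned char bytes[sizeof x];

    memcpy(bytes, &x, sizeof x);  /* inspect the raw bytes safely */
    for (size_t i = 0; i < sizeof x; i++)
        if (bytes[i] == '\n')
            return 1;
    return 0;
}
```

for example, 1.0f has the bytes 3F 80 00 00 and is harmless, but any float
with a 0x0A byte somewhere would look like it contains a line break.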

is this approach reasonable?

thanks,
john