Hello all - I am currently trying to integrate the numpy Python fast-arrays package with Hadoop. I am basically looking for a way to read binary data similar to a SequenceFile, except without keys. That is, similar to how the TextInputFormat emits the position in the file as the key, I would like a SequenceFileInputFormat that simply emits the position or index in the file and the value.
Does such a facility exist? If not, how would I go about implementing this? Thanks, Pete
