When I run a MapReduce job with a HAR file as the input, I see that the
input splits refer to 64 MB boundaries inside the part file.

My mappers only know how to process the contents of each logical file inside
the HAR file. Is there a way to take the offset range specified by an input
split and determine which logical files lie in that range? (How else would
one run MapReduce over a HAR file?)
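
To make the question concrete, here is a minimal sketch of the lookup I have
in mind. It assumes that a (name, start offset, length) record for each
logical file can be obtained from the archive's index files; the Entry class,
the filesInSplit method and the sample file names are hypothetical and are
not part of any Hadoop API.

```java
import java.util.ArrayList;
import java.util.List;

public class HarSplitLookup {
    /** Hypothetical index record: one logical file stored inside a part file. */
    static final class Entry {
        final String name;
        final long start;   // byte offset of the file inside the part file
        final long length;  // length of the file in bytes

        Entry(String name, long start, long length) {
            this.name = name;
            this.start = start;
            this.length = length;
        }
    }

    /**
     * Returns the names of the logical files whose bytes overlap the
     * half-open range [splitStart, splitStart + splitLength).
     */
    static List<String> filesInSplit(List<Entry> index, long splitStart, long splitLength) {
        long splitEnd = splitStart + splitLength;
        List<String> hits = new ArrayList<>();
        for (Entry e : index) {
            long fileEnd = e.start + e.length;
            // Two half-open intervals overlap when each starts before the other ends.
            if (e.start < splitEnd && fileEnd > splitStart) {
                hits.add(e.name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;
        // Made-up archive layout, for illustration only.
        List<Entry> index = new ArrayList<>();
        index.add(new Entry("a.log", 0, 40 * MB));
        index.add(new Entry("b.log", 40 * MB, 40 * MB));
        index.add(new Entry("c.log", 80 * MB, 10 * MB));

        System.out.println(filesInSplit(index, 0, 64 * MB));       // first 64 MB split
        System.out.println(filesInSplit(index, 64 * MB, 64 * MB)); // second 64 MB split
    }
}
```

In this made-up layout b.log straddles the 64 MB boundary, so it is reported
for both splits. Some extra rule would be needed (for example, give a file to
the split that contains its first byte) so that each logical file is
processed by exactly one mapper.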

Roshan
