On Nov 8, 2007, at 5:14 PM, Milind A Bhandarkar wrote:

Does pipes deserialize and serialize data for the identity mappers, or does it just "pass it through"? (Streaming converts input to text, afaik.)

Pipes serializes the objects to bytes and sends them to the C++ program. The C++ program receives them as C++ strings, which are effectively byte arrays. Pipes does not do the conversion to Java strings that streaming does, so pipes can support arbitrary Writable objects. Hopefully in the future we can change the map/reduce API to optionally provide access to the raw bytes in the mapper and reducer. In that case, pipes would not need to serialize at all.
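
To illustrate what "C++ strings as byte arrays" can mean on the C++ side, here is a minimal standalone sketch. It does not use the pipes headers. The helper names decodeBigEndianInt32 and encodeBigEndianInt32 are hypothetical, and the four-byte big-endian layout is an assumption about how an integer Writable would arrive, so check the actual wire format before relying on it. The point is only that a std::string can carry embedded zero bytes, so binary values survive where a text conversion would not.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the pipes API): read a 32-bit integer
// from the first four bytes of a raw byte string, assuming big-endian order.
std::int32_t decodeBigEndianInt32(const std::string& bytes) {
    assert(bytes.size() >= 4 && "need at least four bytes");
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i) {
        // Cast through unsigned char so bytes >= 0x80 are not sign-extended.
        v = (v << 8) | static_cast<unsigned char>(bytes[i]);
    }
    return static_cast<std::int32_t>(v);
}

// Hypothetical inverse: produce the four-byte big-endian encoding.
// The result may contain '\0' bytes, which std::string stores without issue.
std::string encodeBigEndianInt32(std::int32_t value) {
    std::uint32_t v = static_cast<std::uint32_t>(value);
    std::string out(4, '\0');
    for (int i = 3; i >= 0; --i) {
        out[i] = static_cast<char>(v & 0xFF);
        v >>= 8;
    }
    return out;
}
```

In a real pipes mapper the byte string would come from the map context rather than from encodeBigEndianInt32; the encoder is included here only so the decoder can be exercised on its own.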

-- Owen
