Hi, I need to implement a Writable that contains a lot of data, and unfortunately I cannot break it down into smaller pieces. The output of a Mapper is potentially a single large record, anywhere from a few tens of MB to a few hundred MB.
Is there a way for me to deserialize the Writable into a location on the file system instead of into memory? Writable.readFields receives only a DataInput, which suggests I am expected to deserialize into RAM. If I could get a handle to the job's or task's output/temp directory, or even just a plain temp directory, that would be great: I could deserialize the record there and then read it in my Mapper/Reducer directly from the file system. I'm not sure whether I can rely on System.getProperty("java.io.tmpdir"). Will that work inside a task, or is there a FileSystem API I should use instead?

Thanks,
Shai
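
To make the idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration under stated assumptions: SpilledRecord is a hypothetical class name, its readFields/write methods mirror the Writable signatures but the class deliberately does not implement org.apache.hadoop.io.Writable so that it compiles without Hadoop on the classpath, and it assumes the directory behind java.io.tmpdir is writable and large enough on the task node, which is exactly the part I am unsure about. The payload is streamed through a fixed-size buffer, so it should never be held in RAM in full.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical record that keeps its payload in a local temp file, not on the heap. */
class SpilledRecord {
    private static final int CHUNK = 64 * 1024; // copy buffer size, arbitrary choice

    private Path spillFile; // created lazily under java.io.tmpdir
    private long length;    // payload size in bytes

    private void ensureSpillFile() throws IOException {
        if (spillFile == null) {
            // Assumption: java.io.tmpdir is usable inside the task JVM.
            spillFile = Files.createTempFile("record-", ".spill");
        }
    }

    /** Fills the record from any stream, copying straight to the spill file. */
    public void setPayload(InputStream src) throws IOException {
        ensureSpillFile();
        length = Files.copy(src, spillFile, StandardCopyOption.REPLACE_EXISTING);
    }

    /** Same shape as Writable.readFields: length prefix, then raw bytes. */
    public void readFields(DataInput in) throws IOException {
        ensureSpillFile();
        length = in.readLong();
        byte[] buf = new byte[CHUNK];
        // newOutputStream truncates by default, so a reused instance is overwritten.
        try (OutputStream out = Files.newOutputStream(spillFile)) {
            long remaining = length;
            while (remaining > 0) {
                int n = (int) Math.min(buf.length, remaining);
                in.readFully(buf, 0, n);
                out.write(buf, 0, n);
                remaining -= n;
            }
        }
    }

    /** Same shape as Writable.write: streams the spill file back out in chunks. */
    public void write(DataOutput out) throws IOException {
        if (spillFile == null) {
            out.writeLong(0L);
            return;
        }
        out.writeLong(length);
        byte[] buf = new byte[CHUNK];
        try (InputStream in = Files.newInputStream(spillFile)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public long length() {
        return length;
    }

    /** Opens the payload for streaming reads in the Mapper/Reducer. */
    public InputStream openPayload() throws IOException {
        return Files.newInputStream(spillFile);
    }

    /** Removes the spill file; the caller decides when the record is done. */
    public void delete() throws IOException {
        if (spillFile != null) {
            Files.deleteIfExists(spillFile);
            spillFile = null;
            length = 0;
        }
    }
}
```

In a real job the class would declare `implements Writable`, and the open question remains whether the temp file should live under java.io.tmpdir or in a directory obtained from the framework. Cleanup is also left to the caller here, which matters if the framework reuses Writable instances between records.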