Hi guys, I am not sure where the right place to post this question is, so I am sending it to both the Hive and Tez dev mailing lists.
I am trying to get a better understanding of how the input / output for a task is handled. Typically, input stages read the data to be processed. After that, all the data flows in the form of key / value pairs until the end of the job's execution.

1. Could you point me to the key files where I should look to see how this works? I am mostly interested in intercepting the point where data is read by a task and the point where data is written after the task processes its input.

2. Also, is there a way I can identify the types (and hence read the actual values) of a key / value pair, instead of just Object key, Object value?

Thanks in advance,
Robert
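
P.S. To make question 2 concrete, here is a minimal plain-Java sketch of the kind of inspection I would like to do at the point where a task sees a pair. KeyValueInspector, describe and typeName are hypothetical names of my own, not Hive or Tez classes, and the String / Integer / Long / Double pairs in main are just stand-ins for whatever types the job actually uses. It only assumes that the key and value arrive as plain Objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Hive or Tez: reports the runtime
// class and string form of a key / value pair handed over as Objects.
public class KeyValueInspector {

    // Returns the fully qualified runtime class name, or "null" for a null reference.
    static String typeName(Object o) {
        return o == null ? "null" : o.getClass().getName();
    }

    // Builds a one-line description of the pair: runtime type plus toString() value.
    static String describe(Object key, Object value) {
        return "key=" + typeName(key) + "(" + key + ")"
             + " value=" + typeName(value) + "(" + value + ")";
    }

    public static void main(String[] args) {
        // Stand-in pairs; a real job would supply its own key / value objects.
        Map<Object, Object> pairs = new LinkedHashMap<>();
        pairs.put("word", 42);
        pairs.put(7L, 3.5);

        for (Map.Entry<Object, Object> e : pairs.entrySet()) {
            System.out.println(describe(e.getKey(), e.getValue()));
        }
    }
}
```

What I do not know is where in the task code I could hook something like describe in, which is really what question 1 is about.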