[ https://issues.apache.org/jira/browse/HADOOP-1201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12499781 ]
Vivek Ratan commented on HADOOP-1201:
-------------------------------------

I'm assuming the suggestion here is that the Ping thread make a single RPC call to the TaskTracker, which would serve as both the ping and the progress report.

The ping call and the progress call have different signatures, return values, and meanings. They may also be called at different frequencies: the task pings every second for as long as it is alive, whereas it may not send a progress report that often, especially when it has made no progress since the last call but is still chugging along. So, IMO, unless there's a significant performance penalty in having two RPC calls from the task to the TaskTracker, combining the two calls will make the code harder to understand. The return value and the parameters of the combined call would really belong to two separate functions (ping and progress), and the code would be messier. Do we see a performance hit? If not, maybe we shouldn't use the ping thread to send the progress.

If we do decide to combine the calls, we should eliminate TaskUmbilicalProtocol::Ping(). TaskUmbilicalProtocol::progress() should return a boolean value (false if the task was not found in the TaskTracker, true otherwise). If the RPC fails (the TT is dead) or returns false, the Task kills itself. This is also what Devaraj suggested in HADOOP-1235.

My recommendation:

1. We leave the ping and progress threads as they are. If we see performance issues with two RPC calls, we eliminate Ping() and modify progress() to return a boolean.

2. Back to the original bug: there is inline code in ReduceTask to report progress. Moving this code to a separate thread, as we do for MapTask, is probably the right thing to do, but it ties in with whatever solution we adopt for HADOOP-1431. Please see my comments there.
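
For illustration only, here is a minimal sketch of what the combined call could look like if Ping() were eliminated and progress() returned a boolean. The Umbilical interface, its parameter list, and the CombinedReporter class are hypothetical simplifications and are not the actual TaskUmbilicalProtocol signatures.

```java
import java.io.IOException;

/** Hypothetical, simplified stand-in for the umbilical interface (not the real Hadoop signature). */
interface Umbilical {
    /** Returns false if the TaskTracker does not know this task, true otherwise. */
    boolean progress(String taskId, float progress, String state) throws IOException;
}

/** Hypothetical helper: one call serves as both ping and progress report. */
final class CombinedReporter {
    private final Umbilical umbilical;
    private final String taskId;

    CombinedReporter(Umbilical umbilical, String taskId) {
        this.umbilical = umbilical;
        this.taskId = taskId;
    }

    /** Sends one report; returns true if the task should keep running. */
    boolean reportOnce(float progress, String state) {
        try {
            // A false return means the TaskTracker no longer knows the task.
            return umbilical.progress(taskId, progress, state);
        } catch (IOException e) {
            // The RPC failed: treat the TaskTracker as dead.
            return false;
        }
    }
}
```

In this sketch the caller would terminate the task whenever reportOnce() returns false, which covers both the "task not found" and the "TT is dead" cases with a single check.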
> Progress reporting can be improved for both Map/Reduce tasks
> ------------------------------------------------------------
>
>                 Key: HADOOP-1201
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1201
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>
> Both the map and reduce tasks do progress reporting in separate threads.
> However, in the ReduceTask, after the sort phase, the progress reporting
> happens inline with the reducer invocations. This slows down the Reduce phase
> since RPC is involved for every progress report. The better thing to do would
> be to do progress reporting for all phases in separate threads and have the
> tasks just update the progress fields.
> One proposal is to extract out the reporting stuff that is there in
> MapTask/ReduceTask and put it in the Task superclass as a new class, and have
> methods in the new class that control what/when progress is reported.
> Thoughts?
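
As a rough sketch of the proposal in the issue description (tasks only update a progress field, and a separate thread does the reporting), something like the following could work. ProgressReporter, its sink callback, and the interval parameter are hypothetical names chosen for illustration; the sink stands in for the RPC to the TaskTracker.

```java
import java.util.function.Consumer;

/** Hypothetical reporter: the task writes a field, a background loop sends it. */
final class ProgressReporter implements Runnable {
    private volatile float progress;          // written by the task, read by the reporter
    private volatile boolean running = true;  // cleared by stop() to end the loop
    private final Consumer<Float> sink;       // assumed stand-in for the progress RPC
    private final long intervalMillis;

    ProgressReporter(Consumer<Float> sink, long intervalMillis) {
        this.sink = sink;
        this.intervalMillis = intervalMillis;
    }

    /** Called inline by the map or reduce loop; cheap because no RPC is involved. */
    void setProgress(float p) { progress = p; }

    /** Asks the reporting loop to exit after its current iteration. */
    void stop() { running = false; }

    @Override
    public void run() {
        while (running) {
            sink.accept(progress);  // the only place a (simulated) RPC happens
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

A task would typically start this with new Thread(reporter).start() and call setProgress() from the reducer loop, so the cost of each progress update is a single volatile write rather than an RPC.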