Just FYI: with HAMA-956, we now allow users to force-set the number of tasks.
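
A rough sketch of the task-count decision this implies, for illustration only. It is not the actual Hama code: the class and method names are hypothetical, the flag is passed in as a plain boolean standing in for Constants.FORCE_SET_BSP_TASKS, and the fallback of one task per input split is an assumption.

```java
// Hypothetical illustration; none of these names exist in Hama.
final class TaskCountSketch {

    /**
     * Returns the number of tasks to launch.
     *
     * forceSet        stands in for Constants.FORCE_SET_BSP_TASKS
     * requested       number of tasks the user asked for
     * numSplits       number of input splits (DFS blocks)
     * clusterCapacity maximum number of tasks the cluster can run
     */
    static int resolveNumTasks(boolean forceSet, int requested,
                               int numSplits, int clusterCapacity) {
        // Honor the user's value only when forced, when it fits in the
        // cluster, and when it is not smaller than the number of splits
        // (the smaller case is the one that still needs a patch).
        if (forceSet && requested <= clusterCapacity && requested >= numSplits) {
            return requested;
        }
        // Assumed fallback: one task per input split.
        return numSplits;
    }

    /** Tasks that get no input split and simply wait for messages. */
    static int inputlessTasks(int numTasks, int numSplits) {
        return Math.max(0, numTasks - numSplits);
    }

    public static void main(String[] args) {
        int tasks = resolveNumTasks(true, 8, 5, 10);
        System.out.println("tasks=" + tasks
            + ", inputless=" + inputlessTasks(tasks, 5));
    }
}
```
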

Internally, the framework launches the number of tasks the user requested if Constants.FORCE_SET_BSP_TASKS is true and the cluster has enough capacity. Tasks without input do nothing until they receive a message, which is why Constants.FORCE_SET_BSP_TASKS defaults to false. In the graph job case, this feature is used for the initial distribution of vertices.

Issue 1. The current implementation only works when the desired number of tasks is larger than the number of DFS blocks. For the opposite case, we'll need a patch. I guess it can be done simply in the read/writeRawSplits methods in BSPJobClient.

Issue 2. Input partitioning for a BSPJob currently runs as a separate job. We may want to get rid of this and use the messenger system instead.

--
Best Regards, Edward J. Yoon
