Hello,

In one of my use cases, I am sending map processes to a large number of Hadoop nodes. Assume that the nodes are obtained from a public cloud. I would like to ensure that the security of the nodes is not compromised. For this, I am planning to implement a voting mechanism in which multiple copies (say 3) of the same map process are sent to 3 different nodes. In this regard, I have the following questions.
1. I am using NLineInputFormat, so that each line is sent to one map process. Is there any mechanism in Hadoop to create 3 identical map processes for a single line? I can mimic this by writing each line three times in the input file read by NLineInputFormat. Is there a more elegant way to do this?
2. Is there any mechanism with which I can ensure that the identical map processes are sent to three different nodes? Is there any way to control the scheduling of map processes onto specific nodes? For example, map1 should go to node 1, and so on.
3. Or is there any scheduler that implements a voting mechanism and that I can use in conjunction with Hadoop?

I am not sure about the above approach. Basically, I would like to ensure that the results generated by the nodes are correct and can be trusted. For instance, I send one map process to three nodes and compare the results from the three nodes; if one node gives a different result, it is assumed that this node needs to be verified. If there is any other possible approach, please share it with me.

Regards,
rab
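
P.S. To make the voting idea concrete, here is a rough sketch in plain Java of the comparison step I have in mind. It is not tied to any Hadoop API: the class name ReplicaVote, the method names, and the node names are all hypothetical, and it assumes the results of the three copies have already been collected as strings keyed by node.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Hadoop: majority vote over replica results.
public class ReplicaVote {

    // Returns the result reported by a strict majority of replicas, if there is one.
    public static Optional<String> majority(List<String> results) {
        Map<String, Integer> counts = new HashMap<>();
        for (String r : results) {
            counts.merge(r, 1, Integer::sum);
        }
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() * 2 > results.size()) {
                return Optional.of(e.getKey());
            }
        }
        return Optional.empty();
    }

    // Returns the nodes whose result differs from the majority result.
    // If no majority exists, every node is reported, since none can be trusted.
    public static List<String> suspects(Map<String, String> resultByNode) {
        Optional<String> winner = majority(new ArrayList<>(resultByNode.values()));
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : resultByNode.entrySet()) {
            if (!winner.isPresent() || !e.getValue().equals(winner.get())) {
                out.add(e.getKey());
            }
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) {
        // Made-up example: three copies of one map process, one disagreeing node.
        Map<String, String> resultByNode = new LinkedHashMap<>();
        resultByNode.put("node1", "42");
        resultByNode.put("node2", "42");
        resultByNode.put("node3", "41");
        System.out.println("accepted result: " + majority(new ArrayList<>(resultByNode.values())).orElse("none"));
        System.out.println("nodes to verify: " + suspects(resultByNode));
    }
}
```

With three copies, a single deviating node is outvoted and flagged; if all three disagree, no result is accepted and all three nodes are flagged.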