Look at this thread. It has alternatives to DistributedCache. http://stackoverflow.com/questions/21239722/hadoop-distributedcache-is-deprecated-what-is-the-preferred-api
Basically, you can use the new method Job.addCacheFile() to pass files on to the individual tasks.

Regards,
Shahab

On Thu, Dec 11, 2014 at 9:07 PM, Srinivas Chamarthi <[email protected]> wrote:
>
> Hi,
>
> I want to cache map/reduce temporary output files so that I can compare
> two map results coming from two different nodes to verify the integrity
> check.
>
> I am simulating this use case with speculative execution by rescheduling
> the first task as soon as it is started and running.
>
> Now I want to compare output files coming from the speculative attempt and
> the prior attempt so that I can calculate the credit scoring of each node.
>
> I want to use DistributedCache to cache the local file system files in
> the CommitPending stage from TaskImpl. But DistributedCache is actually
> deprecated. Is there any other way I can do this?
>
> I think I can use HDFS to save the temporary output files so that other
> nodes can see them? But is there any in-memory solution I can use?
>
> Any pointers are greatly appreciated.
>
> thx & rgds,
> srinivas chamarthi
>
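For what it's worth, a minimal sketch of the non-deprecated API looks like the following. This is not from the thread; the class names, the HDFS path, and the "#ref" symlink fragment are all illustrative assumptions, and the job setup is abbreviated:

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheFileExample {

    public static class MyMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        protected void setup(Context context) throws IOException {
            // Cached files are symlinked into the task's working directory
            // under the URI fragment name ("ref", from "#ref" below).
            try (BufferedReader reader = new BufferedReader(new FileReader("ref"))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // ... load the reference output to compare against ...
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "cache-file-example");
        job.setJarByClass(CacheFileExample.class);
        job.setMapperClass(MyMapper.class);

        // Replaces the deprecated DistributedCache.addCacheFile(...) call.
        // The path is a hypothetical HDFS file; "#ref" sets the symlink name.
        job.addCacheFile(new URI("/user/srinivas/reference-output.txt#ref"));

        // ... set input/output paths and submit as usual ...
    }
}
```

Note the file still has to live on HDFS (or another shared filesystem) before the job is submitted, so for the in-memory part of the question this only covers distribution, not storage.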
