On 4 Apr 2012, at 22:07, John Armstrong <j...@ccri.com> wrote:

> On 04/04/2012 05:00 PM, Kevin Savage wrote:
>> However, what we have is one big file of design data that needs to go to all
>> the maps and many big files of climate data that need to go to one map each.
>> I've not been able to work out if there is a good way of doing this in
>> Hadoop.
>
> It sounds like "one big file" belongs on the DistributedCache, while the
> "many big files" should be set up as the input using some subclass of
> FileInputFormat
>
> hth
>
Hi John,

Sounds good, sounds like I just needed pointing at the right words!

Thanks,
Kevin