I would use DistributedCache. Put file2 in the distributed cache; the drawback is that every map task has to read it. If you find a better solution, please let me know, because I have a similar issue.
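Here is a rough sketch of the filtering logic I have in mind, in plain Java so it runs without a cluster. It assumes file2 is small enough to fit in memory, and it follows the example in the question, where records whose key appears in file2 are dropped. The class and method names are made up for illustration. In a real job, loadKeys would run once per map task against the cached copy of file2, and the check inside filter would sit in map().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java sketch of a map-side filter; the Hadoop wiring is left out.
// Class and method names are hypothetical.
class MapSideFilter {

    // Would run once per map task: load every key of file2 into memory.
    static Set<String> loadKeys(List<String> file2Lines) {
        Set<String> keys = new HashSet<>();
        for (String line : file2Lines) {
            String key = line.trim();
            if (!key.isEmpty()) {
                keys.add(key);
            }
        }
        return keys;
    }

    // Would be the body of map(): keep a "key,value" record only if
    // its key is NOT among the keys loaded from file2.
    static List<String> filter(List<String> file1Lines, Set<String> excluded) {
        List<String> kept = new ArrayList<>();
        for (String line : file1Lines) {
            String key = line.split(",", 2)[0].trim();
            if (!excluded.contains(key)) {
                kept.add(line);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample data taken from the example in the question.
        Set<String> keys = loadKeys(Arrays.asList("4", "2"));
        List<String> file1 = Arrays.asList(
                "2,1", "2,3", "2,5", "3,1", "3,2", "4,7", "4,9", "6,3");
        for (String record : filter(file1, keys)) {
            System.out.println(record);
        }
    }
}
```

Since each record is kept or dropped on its own, this can be a map-only job, and duplicate keys in file1 are not a problem. If you want the opposite (keep only the keys that are in file2), remove the negation in filter.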
Rasit

2009/3/18 Nir Zohar <nirzo...@hotmail.com>

> Hi,
>
> I would like your help with the below question.
>
> I have 2 files: file1 (key, value), file2 (only key) and I need to exclude
> all records from file1 that these key records not in file2.
>
> 1. The output format is key-value, not only keys.
>
> 2. The key is not primary key; hence it's not possible to have joined in
> the end.
>
> Can you assist?
>
> Thanks,
> Nir.
>
> Example:
>
> file1:
> 2,1
> 2,3
> 2,5
> 3,1
> 3,2
> 4,7
> 4,9
> 6,3
>
> file2:
> 4
> 2
>
> Output:
> 3,1
> 3,2
> 6,3

--
M. Raşit ÖZDAŞ