Hi, what would be the best way to write this script? I have two datasets: a huge one (hkey, hdata) and a small one (skey). I want to keep all the records from the huge dataset for which F(hdata, skey) is true. Please advise.
For example:

huge = load 'mydata' as (key:chararray, value:chararray);
small = load 'smalldata' as (skey:chararray);
h_s_cross = cross huge, small;
-- keep only the pairs for which the predicate holds
filtered = filter h_s_cross by CONTAINS(value, skey);

Thanks,
Aniket
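
A minimal sketch of how the cross-and-filter idea might look with built-ins only, assuming F is a plain substring test. The built-in INDEXOF stands in for the CONTAINS placeholder, and the relation names matched, projected and result are made up for illustration. The DISTINCT step is an assumption about the desired output (each huge record once, even when several skeys match). It would also collapse rows that are genuine duplicates in the input.

```pig
-- Sketch only: substring match via the built-in INDEXOF (assumed stand-in for F).
huge  = LOAD 'mydata' AS (key:chararray, value:chararray);
small = LOAD 'smalldata' AS (skey:chararray);

-- Pair every huge record with every small key.
h_s_cross = CROSS huge, small;

-- INDEXOF returns -1 when skey does not occur in value.
matched = FILTER h_s_cross BY INDEXOF(value, skey, 0) >= 0;

-- Drop the skey column, then remove repeats caused by multiple matching keys.
projected = FOREACH matched GENERATE key, value;
result = DISTINCT projected;
```

The cross product has |huge| x |small| rows, so this is only reasonable while the small dataset stays genuinely small.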
