I am not sure what you meant by "null match". Would this work?

F1 = load 'largefile' as (field1, ..);
F2 = load 'smallfile' as (field1, field2, field3);
-- As the second file is very small, use a replicated join.
J = join F1 by field1 left outer, F2 by field1 using 'replicated';
-- Where no row in F2 matched (F2::field1 is null), keep the values from F1.
FE = foreach J generate F1::field1,
     (F2::field1 is null ? F1::field2 : F2::field2),
     (F2::field1 is null ? F1::field3 : F2::field3);

On 8/2/10 7:13 AM, "Kochis, Allan" <allan.koc...@schwab.com> wrote:

Hi,

I have a Pig question. I have two HDFS files: a smaller file that has

|field1|field2|field3|

and a larger file that has

|..|.. |...|field2|....|field3|.....|field1|...| ..|

I would like to replace field2 and field3 in my larger file when they are null match on field1. I am currently doing this by caching my smaller file and using a Perl hash lookup to populate the larger records in a UDF. Can this be done in a Pig join?

Thanks,
Allan