Hello! Considering the following two relations...
grunt> querys = load 'query' as (id:int, token:chararray);
grunt> dump querys;
(11,foo)
(12,bar)
(13,frog)

and

grunt> documents = load 'document' as (id:int, text:chararray);
grunt> dump documents;
(21,foo bar frog)
(22,hello frog)

Is it possible to do a join where querys:token is not equal to, but contained in, documents:text? E.g.

(11,foo,21,foo bar frog)
(12,bar,21,foo bar frog)
(13,frog,21,foo bar frog)
(13,frog,22,hello frog)

I can certainly do this in Java map/reduce (as we all had to in the dark days before Pig), but is there a way to hack this together with a custom UDF or some other weird join backdoor (a custom partitioner for a group, or something wacky)?

It's been a long day, so maybe I'm just missing something super obvious.

Cheers!
Mat
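P.S. To pin down the semantics I'm after, here is a rough Python sketch of the result I want. It is only an illustration, not a Pig solution: the in-memory lists stand in for the two relations, the function name containment_join is made up for this example, and I'm assuming "contained in" means a plain substring match rather than whole-word matching.

```python
# Hypothetical in-memory stand-ins for the two Pig relations in the post.
querys = [(11, "foo"), (12, "bar"), (13, "frog")]
documents = [(21, "foo bar frog"), (22, "hello frog")]


def containment_join(querys, documents):
    """Pair each query with every document whose text contains its token.

    This is a cross product followed by a filter, so it does
    len(querys) * len(documents) comparisons.
    """
    return [
        (qid, token, did, text)
        for qid, token in querys
        for did, text in documents
        if token in text  # assumption: plain substring test, not word match
    ]


for row in containment_join(querys, documents):
    print(row)
```

In other words, logically it is a cross product of the two relations with a filter on top; what I'm hoping for is a way to express that in Pig.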
