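This script will work:

A = load 'input' as (val, gender);
F = filter A by gender == 'f';
M = filter A by gender == 'm';
F = distinct F;
M = distinct M;
B = join F by val full, M by val;
C = filter B by ((F::val is null) or (M::val is null));
D = foreach C generate ((F::val is null) ? M::val : F::val);
dump D;

The full join pairs each value seen with 'f' against the same value seen with 'm'. A value that occurs with only one gender has a null on the other side of the join, so filtering on that null keeps exactly the single-gender values.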
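To trace what the script does on the sample records from the question, here is a small sketch of the same logic in Python rather than Pig. The function name single_gender_values and the in-memory list of records are assumptions made for illustration only; they do not come from the script or from any Pig API.

```python
# Sample records from the question, as (val, gender) pairs.
records = [
    ("abc", "f"),
    ("abc", "m"),
    ("def", "m"),
    ("def", "m"),
    ("def", "m"),
]


def single_gender_values(rows):
    """Return the values seen with 'f' only or with 'm' only.

    Hypothetical helper that mirrors the Pig script step by step.
    """
    # filter + distinct: the set of values seen with each gender
    f_vals = {val for val, gender in rows if gender == "f"}
    m_vals = {val for val, gender in rows if gender == "m"}
    # full outer join, then keep rows where one side is null:
    # equivalent to the symmetric difference of the two sets
    return sorted(f_vals ^ m_vals)


print(single_gender_values(records))  # ['def']
```

"abc" appears with both genders, so it sits in both sets and drops out; "def" appears only with 'm', so it is kept.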
Thanks,
-Richard

-----Original Message-----
From: Kelvin Moss [mailto:[email protected]]
Sent: Thursday, February 11, 2010 2:25 PM
To: [email protected]
Subject: set functionality

I have a range of values that can have an associated gender like 'm', 'f'. I want to include all distinct values that have the same gender across all records. Like if the records are -

abc f
abc m
def m
def m
def m

I'd include "def" but exclude "abc". I need to simulate something like set (followed by count) functionality on a column. Is there a way to do it?

Thanks!
