Hi folks,
I've got a single result set, and I need to compare each row against all
the other rows within that same result set. For example:
R1
R2
R3
R4
R5
Take R1: I need to compare it with each of the rows R2 through R5. The
comparison itself will be written in a UDF. Here's what I have so far:
============================================
RAW = load 'raw_data.txt' using PigStorage(',');
RAW_2 = foreach RAW generate *;
PROCESSED = foreach RAW {
    /* perform the comparison here */
};
============================================
I'm stuck on the filtering inside the nested block. How should I go
about comparing the rows there?
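The only other idea I've had is to CROSS the relation with its copy and
filter out the self-pairs, roughly as below. This is an untested sketch:
the two-column schema, the assumption that the first column is a unique
row id, and the jar and UDF names (myudfs.jar, myudfs.CompareRows) are
all placeholders I made up for illustration.
============================================
-- hypothetical jar holding the comparison UDF
register myudfs.jar;
-- assumed schema: a unique row id plus a single value column
RAW = load 'raw_data.txt' using PigStorage(',') as (id:chararray, val:chararray);
RAW_2 = foreach RAW generate *;
-- pair every row with every row, including itself
PAIRS = cross RAW, RAW_2;
-- drop the pairs where a row meets itself
OTHERS = filter PAIRS by RAW::id != RAW_2::id;
-- myudfs.CompareRows is a placeholder for the comparison UDF
PROCESSED = foreach OTHERS generate
    RAW::id as left_id,
    RAW_2::id as right_id,
    myudfs.CompareRows(RAW::val, RAW_2::val) as result;
============================================
Since CROSS produces N x N pairs, I suspect this gets expensive on a
large input, so I'm not sure it's the right approach either.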
Any help is greatly appreciated.
Thanks!