What's the equivalent of a GROUP BY statement within a FOREACH statement?

Adamantios Corais Thu, 20 Mar 2014 06:00:31 -0700

Hi,

I have the following schema:

raw3: {group: (field1: int, field2: chararray, field3: chararray), raw2: {(field1: int, field2: chararray, field4: chararray, field3: chararray, field5: boolean)}}

What I am trying to do is extract the most frequent value of field4 and the most frequent value of field5, along with the group fields (field1, field2, field3).

I know that GROUP BY is not allowed (yet) inside FOREACH statements. How can I accomplish the same functionality without writing a UDF?


Example:

input: ((1,2,3),{(1,2,a,3,x),(1,2,b,3,x),(1,2,a,3,x),(1,2,v,3,x),(1,2,f,3,z),(1,2,a,3,z)})


output: (1,2,a,3,x)
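
To make the expected result concrete, here is a minimal sketch of the semantics I am after, written in Python rather than Pig. The names `most_frequent` and `summarize` are hypothetical and only illustrate the computation. It assumes the bag is non-empty and does not treat ties specially.

```python
from collections import Counter


def most_frequent(values):
    """Return the value that occurs most often in `values` (assumed non-empty)."""
    return Counter(values).most_common(1)[0][0]


def summarize(record):
    """Take one (group, bag) record and return
    (field1, field2, top field4, field3, top field5)."""
    group, bag = record
    field1, field2, field3 = group
    # Bag rows are laid out as (field1, field2, field4, field3, field5).
    top_field4 = most_frequent(row[2] for row in bag)
    top_field5 = most_frequent(row[4] for row in bag)
    return (field1, field2, top_field4, field3, top_field5)


# The example input from above, with the bag represented as a list of tuples.
record = (
    (1, 2, 3),
    [
        (1, 2, "a", 3, "x"),
        (1, 2, "b", 3, "x"),
        (1, 2, "a", 3, "x"),
        (1, 2, "v", 3, "x"),
        (1, 2, "f", 3, "z"),
        (1, 2, "a", 3, "z"),
    ],
)

print(summarize(record))  # (1, 2, 'a', 3, 'x')
```

Here field4 = a occurs 3 times out of 6 and field5 = x occurs 4 times out of 6, which gives the output shown above.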

Thank you,
Adam.
