Correction: in my examples of the combined tuple and the map, the first field
should be A, not $-NT or $NT:

(A, 1L, 2L, 6L, 0L, 1L)

(A, 1L#1L, 2L#2L, 3L#6L, 5L#1L)
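
For reference, here is a rough sketch of the target shapes along the lines of
the streaming-script route mentioned in the quoted message. It is Python, not
Pig Latin, so it does not answer the "strictly within Pig" part. The function
name densify and the fixed 1-5 range parameter are assumptions made for
illustration only.

```python
from collections import defaultdict


def densify(rows, max_occurrences=5):
    """Collapse (key, occurrences, count) rows into per-key records.

    Returns two dicts keyed by the first field:
      dense: (key, count_for_1, ..., count_for_max), with 0 for missing slots
      maps:  {occurrences: count}, with missing slots simply absent
    """
    by_key = defaultdict(dict)
    for key, occurrences, count in rows:
        by_key[key][occurrences] = count

    dense = {
        key: (key,) + tuple(counts.get(i, 0) for i in range(1, max_occurrences + 1))
        for key, counts in by_key.items()
    }
    return dense, dict(by_key)


# The four tuples from DUMP X, with Pig's long suffix dropped.
rows = [("A", 1, 1), ("A", 2, 2), ("A", 3, 6), ("A", 5, 1)]
dense, maps = densify(rows)
print(dense["A"])  # ('A', 1, 2, 6, 0, 1)
print(maps["A"])   # {1: 1, 2: 2, 3: 6, 5: 1}
```

The dense tuple makes the zero for 4 explicit, while the map signals it only
by the missing key, so a lookup there needs a default value.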

On May 5, 2010, at 5:06 PM, Greg Langmead wrote:

> At an intermediate point in my processing, I have these tuples:
> 
> DUMP X;
> (A,1L,1L)
> (A,2L,2L)
> (A,3L,6L)
> (A,5L,1L)
> 
> The middle element of these tuples can have any integer value from 1-5, and 
> the third element can have any positive integer value. (These data points 
> mean, for example for the third tuple, "I saw 6 distinct words that started 
> with the letter A that occurred 3 times each.") My problem is that to do the 
> math I need to do next, I need to know that there were 0 words that occurred 
> 4 times, so I need to group these four tuples into one record that permits me 
> to ask "what is the value that goes with 1, ... what is the value that goes 
> with 5".
> 
> I could stream these through a script and do what I want, but I'm new to Pig 
> and I'd like to explore what can be done strictly within Pig.
> 
> Maybe I could gather these into a tuple, but with a 0 at the position for 4:
> 
> ($-NT,1L,2L,6L,0L,1L)
> 
> or else somehow generate a map from this:
> 
> ($NT, 1L#1L, 2L#2L, 3L#6L, 5L#1L)
> 
> which would also alert me to the absence of 4L. Can I do either of these 
> things?
> 
> Thanks,
> Greg Langmead
> Research Scientist
> Language Weaver, Inc.
