Don't duplicate relation names as column names.

Russell Jurney http://datasyndrome.com

On Jul 6, 2012, at 12:56 PM, Chun Yang <[email protected]> wrote:

> Hi all,
>
> I'm walking through a pig script in grunt, but I am getting stuck with some
> issues using nested foreach. I'm using Pig version 0.9.2
>
> I'm trying to find the number of unique users from a bag 'top100'
>
> grunt> describe top100
> top100: {name: chararray,licenses: long,instance: chararray,transactions:
> long,users: {(projected::userId: chararray)},runTimes: {(projected::runTime:
> double)}}
>
> grunt> uu = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers as uniqUsers;
> >> }
> ERROR 1200: Pig script failed to parse:
> <line 132, column 9> Invalid scalar projection: uniqUsers : A column needs
> to be projected from a relation for it to be used as a scalar
>
> I realized that I had defined uniqUsers earlier, but I didn't think it would
> conflict inside the nested foreach block. The schema for uniqUsers is:
>
> grunt> describe uniqUsers
> uniqUsers: {key: chararray,uniqUsers: long}
>
> I tried a different alias for the distinct clause and it seems to work.
>
> grunt> uu = foreach top100 {
> >> un = distinct users;
> >> generate un as uniqUsers;
> >> }
> grunt> describe uu
> uu: {un: {(projected::userId: chararray)}}
> grunt> uu = foreach top100 {
> >> un = distinct users;
> >> generate COUNT(un) as uniqUsers;
> >> }
> grunt> describe uu
> uu: {uniqUsers: long}
>
> I was curious, so I tried the following, but I do not understand what the
> results are.
>
> grunt> u2 = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers.key;
> >> }
> grunt> describe u2
> u2: {projected::userId: chararray}
>
> grunt> u3 = foreach top100 {
> >> uniqUsers = distinct users;
> >> generate uniqUsers.uniqUsers;
> >> }
> grunt> describe u3
> u3: {projected::userId: chararray}
>
> Specifically, what is actually in the result of u3? Why is it a chararray
> when uniqUsers.uniqUsers is a long? Why is the alias still
> projected::userId?
>
> Thanks for any help!
>
> -Chun
>
> PS Sorry for the double post, I accidentally hit a keyboard shortcut for
> Send.
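
For reference, a minimal sketch of the workaround described in the thread, assuming the top100 schema from the describe output above. The aliases distinct_users and uniq_user_count are made up for illustration; the only point is that they differ from every relation already defined in the session (such as uniqUsers). The error message suggests that Pig resolved the nested alias to the earlier relation and tried to use it as a scalar.

```pig
-- Count distinct users per row of top100.
-- distinct_users and uniq_user_count are illustrative names (not from the
-- original script); neither matches an existing relation alias.
uu = FOREACH top100 {
    distinct_users = DISTINCT users;   -- de-duplicate the nested bag
    GENERATE name, COUNT(distinct_users) AS uniq_user_count;
};
```

Running describe on uu should then show a chararray name alongside a long count, in line with the uu: {uniqUsers: long} result in the quoted session.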
