Hello experienced users,

I am relatively new to Pig and I came across one thing I do not fully understand. I have the following script:

dirtydata = load '/data/120422' using AvroStorage();
sodtr = filter dirtydata by TransactionBlockNumber == 1;
sto = foreach sodtr generate Dob.Value as Dob, StoreId, Created.UnixUtcTime;
g = GROUP sto BY (Dob, StoreId);
sodtime = FOREACH g GENERATE group.Dob AS Dob, group.StoreId as StoreId, MAX(sto.UnixUtcTime) AS latestStartOfDayTime;
joined = join dirtydata by (Dob.Value, StoreId) LEFT OUTER, sodtime by (Dob, StoreId);
cleandata = filter joined by dirtydata.Created.UnixUtcTime >= sodtime.latestStartOfDayTime;
dump cleandata;

I am getting the following error:

ERROR 0: Exception while executing (Name: joined: Local Rearrange[tuple]{tuple}(false) - scope-166 Operator Key: scope-166): org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception while executing [POProject (Name: Project[long][0] - scope-152 Operator Key: scope-152) children: null at []]: org.apache.pig.backend.executionengine.ExecException: *ERROR 0: Scalar has more than one row in the output.* 1st : (1,(20120422),64619,2164,{(((20120422),64619,2164,(1335120734,-300),2,),{},(false,840),{},{(00200079-0000-0000-0000-000000000000,((1,LUNCH),(2097271,(2097271,WL 119),false),{(,(1335120734,-300),CheckPrint)},{},((0),PerGroup),20121,(3,Coffee Bar),),((34.57),(36.02)),{},{},{},{},{},{},{})},{})},(1412864847,-300)), 2nd :(1,(20120422),64619,1,{(((20120422),64619,1,(1335088853,-300),3,),{},(false,840),{},{},{({(ClockedIn,(1335088800,-300),(-62135596800,0),(1),{(0,(11),false)},0,(4,Baker),{},false)},(511,Roger Baeza-Vasquez))})},(1412864846,-300))
2014-10-14 05:28:25,165 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!

When I change the last relation to use :: after dirtydata:

cleandata = filter joined by dirtydata::Created.UnixUtcTime >= sodtime.latestStartOfDayTime;

then everything works fine. This seems like a mystery to me, because I would expect to need the same change for the sodtime relation. I am missing something here. Could someone please shed some light on it?

Thanks a lot
Jakub