Hello experienced users,

I am relatively new to Pig and I came across one thing I do not fully
understand. I have the following script:

dirtydata = load '/data/120422' using AvroStorage();

sodtr = filter dirtydata by TransactionBlockNumber == 1;
sto   = foreach sodtr generate Dob.Value as Dob, StoreId, Created.UnixUtcTime;
g     = GROUP sto BY (Dob, StoreId);
sodtime = FOREACH g GENERATE group.Dob AS Dob, group.StoreId as StoreId,
          MAX(sto.UnixUtcTime) AS latestStartOfDayTime;

joined = join dirtydata by (Dob.Value, StoreId) LEFT OUTER,
         sodtime by (Dob, StoreId);

cleandata = filter joined by dirtydata.Created.UnixUtcTime >=
            sodtime.latestStartOfDayTime;
dump cleandata

I am getting the following error:


 ERROR 0: Exception while executing (Name: joined: Local
Rearrange[tuple]{tuple}(false) - scope-166 Operator Key: scope-166):
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Exception
while executing [POProject (Name: Project[long][0] - scope-152 Operator
Key: scope-152) children: null at []]:
org.apache.pig.backend.executionengine.ExecException: ERROR 0: Scalar has
more than one row in the output. 1st :
(1,(20120422),64619,2164,{(((20120422),64619,2164,(1335120734,-300),2,),{},(false,840),{},{(00200079-0000-0000-0000-000000000000,((1,LUNCH),(2097271,(2097271,WL
119),false),{(,(1335120734,-300),CheckPrint)},{},((0),PerGroup),20121,(3,Coffee
Bar),),((34.57),(36.02)),{},{},{},{},{},{},{})},{})},(1412864847,-300)),
2nd
:(1,(20120422),64619,1,{(((20120422),64619,1,(1335088853,-300),3,),{},(false,840),{},{},{({(ClockedIn,(1335088800,-300),(-62135596800,0),(1),{(0,(11),false)},0,(4,Baker),{},false)},(511,Roger
Baeza-Vasquez))})},(1412864846,-300))
2014-10-14 05:28:25,165 [main] ERROR
org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!

When I change the relation to use :: instead of the dot after dirtydata:
cleandata = filter joined by dirtydata::Created.UnixUtcTime >=
sodtime.latestStartOfDayTime;

Then everything works fine. This seems like a mystery to me, because I would
expect to have to do the same for the sodtime relation. I am missing something
here. Could someone please shed some light on it?

Thanks a lot
Jakub
