Also, what version of pig?

2011/12/17 Jonathan Coveney <[email protected]>

> Locally, or on the cluster?
>
> Also, how much heap are you giving the process?
>
> Also, can you do:
>
> pig -e explain -script yourscript.pig on what you posted?
>
>
> 2011/12/17 Cameron Gandevia <[email protected]>
>
>> I'm trying to figure out why the following pig script takes forever to
>> run.
>>
>> logData = FOREACH flattenedLogData GENERATE opname, host, nanoTime, depth;
>>
>> opNameGroupAll = GROUP logData by opname;
>> opNameGroupPerHost = GROUP logData by (opname,host);
>>
>> overviewOpsAll = FOREACH opNameGroupAll GENERATE
>>     '$reportId', 'ALL' as scope,
>>     group as opname,
>>     COUNT(logData.opname) as cnt,
>>     AVG(logData.depth) as avgDepth,
>>     SUM(logData.nanoTime)/1000000 as sum,
>>     AVG(logData.nanoTime)/1000000 as avg,
>>     MAX(logData.nanoTime)/1000000 as max;
>>
>> overviewOpsPerHost = FOREACH opNameGroupPerHost GENERATE
>>     '$reportId', group.host as scope,
>>     group.opname as opname,
>>     COUNT(logData.opname) as cnt,
>>     AVG(logData.depth) as avgDepth,
>>     SUM(logData.nanoTime)/1000000 as sum,
>>     AVG(logData.nanoTime)/1000000 as avg,
>>     MAX(logData.nanoTime)/1000000 as max;
>>
>> STORE overviewOpsAll INTO '$outputPathRootDir/overviewOpsAll' using PigStorage();
>> STORE overviewOpsPerHost INTO '$outputPathRootDir/overviewOpsPerHost' using PigStorage();
>>
>> It usually gets to around 90% then takes forever to finish the reduce
>> phase. I notice the following log lines in output logs.
>>
>> 2011-12-17 20:00:08,714 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of
>> 336737356 bytes from 1 objects. init = 175243264(171136K) used =
>> 401178152(391775K) committed = 477233152(466048K) max = 536870912(524288K)
>>
>> 2011-12-17 20:00:13,015 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate
>> of 354470820 bytes from 1 objects. init = 175243264(171136K) used =
>> 397146280(387838K) committed = 536870912(524288K) max =
>> 536870912(524288K)
>>
>> 2011-12-17 20:00:17,814 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate
>> of 365633020 bytes from 1 objects. init = 175243264(171136K) used =
>> 407703960(398148K) committed = 536870912(524288K) max =
>> 536870912(524288K)
>>
>> 2011-12-17 20:00:22,572 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate
>> of 367290876 bytes from 1 objects. init = 175243264(171136K) used =
>> 407457224(397907K) committed = 536870912(524288K) max =
>> 536870912(524288K)
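
For reading the SpillableMemoryManager lines quoted above, here is a rough sketch in Python that pulls out the spilled, used and max byte counts and reports how full the heap was at each spill. The regex and the `parse_spills` helper are assumptions fitted only to the log lines in this thread, not to any documented Pig log format.

```python
import re

# Hypothetical pattern: fitted to the spill lines quoted in this thread,
# not taken from any documented Pig log format.
SPILL_RE = re.compile(
    r"Spilled an estimate of\s+(?P<spilled>\d+) bytes from (?P<objects>\d+) objects\."
    r"\s+init = (?P<init>\d+)\(\d+K\)"
    r"\s+used = (?P<used>\d+)\(\d+K\)"
    r"\s+committed = (?P<committed>\d+)\(\d+K\)"
    r"\s+max = (?P<max>\d+)\(\d+K\)"
)


def parse_spills(log_text):
    """Return one dict of integer fields per spill message found in log_text."""
    # Strip leading mail quote markers, then collapse whitespace so that
    # messages wrapped across several lines still match the pattern.
    unquoted = re.sub(r"^[> ]+", "", log_text, flags=re.MULTILINE)
    flat = " ".join(unquoted.split())
    return [
        {name: int(value) for name, value in m.groupdict().items()}
        for m in SPILL_RE.finditer(flat)
    ]


# First spill message from the thread, still in its quoted, wrapped form.
sample = """
>> 2011-12-17 20:00:08,714 INFO
>> org.apache.pig.impl.util.SpillableMemoryManager: Spilled an estimate of
>> 336737356 bytes from 1 objects. init = 175243264(171136K) used =
>> 401178152(391775K) committed = 477233152(466048K) max = 536870912(524288K)
"""

for s in parse_spills(sample):
    spilled_mib = s["spilled"] / 2**20
    used_fraction = s["used"] / s["max"]
    print(f"spilled {spilled_mib:.0f} MiB from {s['objects']} object(s), "
          f"heap used {used_fraction:.0%} of max")
```

All four quoted messages report max = 536870912 bytes (512 MiB) and a spill of more than 300 MiB from a single object, so running the same helper over the full task log shows whether that pattern holds for the whole reduce phase.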
