I was just looking at the SpillableMemoryManager code, and discovered
that it prints out this message whenever the GC gets called, even when
it doesn't actually spill anything.
To reproduce, try using the TupleFactory to generate a few million
tuples. There's nothing to spill, and yet you will see the message
when GC kicks in.
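
For anyone who wants to see the mechanism without Pig on the classpath, here is a rough JDK-only sketch of how a "low memory handler" can fire purely from heap usage crossing a threshold, with nothing registered to spill. It is not Pig's actual code: the class name, the helper methods and the threshold fraction are all made up for illustration, and I'm assuming SpillableMemoryManager hooks into the same java.lang.management notifications.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical stand-in for a low memory handler; not Pig's SpillableMemoryManager.
public class LowMemoryHandlerSketch {

    // Byte threshold for a given fraction of the pool's maximum size.
    static long thresholdBytes(long maxBytes, double fraction) {
        return (long) (maxBytes * fraction);
    }

    // First heap pool that supports usage thresholds and reports a maximum size.
    static Optional<MemoryPoolMXBean> findThresholdPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP
                    && pool.isUsageThresholdSupported()
                    && pool.getUsage().getMax() > 0) {
                return Optional.of(pool);
            }
        }
        return Optional.empty();
    }

    // Sets the usage threshold and registers a listener that only logs and counts.
    // Returns false if this JVM exposes no suitable pool.
    static boolean install(double fraction, AtomicInteger calls) {
        Optional<MemoryPoolMXBean> pool = findThresholdPool();
        if (!pool.isPresent()) {
            return false;
        }
        long max = pool.get().getUsage().getMax();
        pool.get().setUsageThreshold(thresholdBytes(max, fraction));

        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(notification.getType())) {
                // Nothing is registered to spill, yet the message still appears.
                calls.incrementAndGet();
                System.out.println("low memory handler called (nothing to spill)");
            }
        };
        // The platform MemoryMXBean is documented to be a NotificationEmitter.
        ((NotificationEmitter) ManagementFactory.getMemoryMXBean())
                .addNotificationListener(listener, null, null);
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        AtomicInteger calls = new AtomicInteger();
        // 0.01 is an arbitrary, deliberately low fraction chosen for the demo.
        if (!install(0.01, calls)) {
            System.out.println("no heap pool with usage threshold support found");
            return;
        }
        // Retain a bounded amount of memory (64 MB) so usage has a chance to cross the threshold.
        List<byte[]> retained = new ArrayList<>();
        for (int i = 0; i < 64; i++) {
            retained.add(new byte[1024 * 1024]);
        }
        Thread.sleep(500); // notifications are delivered asynchronously
        System.out.println("retained " + retained.size() + " MB, handler calls: " + calls.get());
    }
}
```

Whether the listener fires on a given run depends on the collector and heap size, so treat the output as indicative only. The point is that the notification is tied to pool usage crossing a threshold, not to any spill actually happening.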

But that's beside the point -- clearly something is wrong with the way
Corbin's data is getting distributed, if he gets gigs on one machine,
and megs on the rest. Please post the script :-).

-D

On Thu, May 6, 2010 at 11:36 AM, Olga Natkovich <[email protected]> wrote:
> This is just a warning saying that your job is spilling to the disk.
> Please, if you can, post a script that is causing this issue. In 0.6.0
> we moved a large chunk of the code away from using SpillableMemoryManager
> but it is still used in some places. More changes are coming in 0.7.0 as
> well.
>
> Olga
>
> -----Original Message-----
> From: Corbin Hoenes [mailto:[email protected]]
> Sent: Thursday, May 06, 2010 11:31 AM
> To: [email protected]
> Subject: Re: SpillableMemoryManager - low memory handler called
>
> 0.6
>
> Sent from my iPhone
>
> On May 6, 2010, at 12:16 PM, "Olga Natkovich" <[email protected]>
> wrote:
>
>> Which version of Pig are you using?
>>
>> -----Original Message-----
>> From: Corbin Hoenes [mailto:[email protected]]
>> Sent: Thursday, May 06, 2010 10:29 AM
>> To: [email protected]
>> Subject: SpillableMemoryManager - low memory handler called
>>
>> Hi Piggers - Seeing an issue with a particular script where our job is
>> taking 6hrs 42min to complete.
>>
>> syslogs are showing loads of these:
>> INFO : org.apache.pig.impl.util.SpillableMemoryManager - low memory
>> handler called (Usage threshold exceeded) init = 5439488(5312K) used =
>> 283443200(276800K) committed = 357957632(349568K) max =
>> 357957632(349568K)
>> INFO : org.apache.pig.impl.util.SpillableMemoryManager - low memory
>> handler called (Usage threshold exceeded) init = 5439488(5312K) used =
>> 267128840(260868K) committed = 357957632(349568K) max =
>> 357957632(349568K)
>> One interesting thing is it's the map phase that is slow and one of the
>> mappers is getting 8GB of input while the other 2000 or so mappers are
>> getting MBs and hundreds of MBs of data.
>>
>> Anywhere I can start looking?
>>
>>
>
