Re: Proactive Spill Count Recs

2011-08-03 Thread Thejas Nair
There are two ways Pig controls the memory used by large bags: 1. Triggers set on GC, similar to the mechanism described by Julien here: https://techblug.wordpress.com/2011/07/21/detecting-low-memory-in-java-part-2/ . When Pig gets notified about high memory usage, it goes through the list of Spi
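
For readers unfamiliar with the GC-trigger mechanism mentioned in this message, here is a minimal sketch using the JDK's standard java.lang.management API. It is not Pig's actual code: the class name LowMemoryWatcher, the helpers thresholdFor and watch, and the 0.7 fraction are all hypothetical and chosen only for illustration. The idea is to set a collection usage threshold on each heap pool so that the JVM sends a notification when the pool is still above that threshold after a garbage collection.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical illustration class, not part of Pig.
class LowMemoryWatcher {

    // Threshold in bytes for a pool with the given maximum size.
    // 'fraction' is assumed to be between 0 and 1.
    static long thresholdFor(long maxBytes, double fraction) {
        return (long) (maxBytes * fraction);
    }

    // Arms a post-GC usage threshold on every heap pool that supports it and
    // registers a callback. Returns the number of pools that were armed.
    static int watch(double fraction, Runnable onLowMemory) {
        int armed = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP
                    || !pool.isCollectionUsageThresholdSupported()) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null || usage.getMax() <= 0) {
                continue; // pool is invalid or its maximum size is undefined
            }
            pool.setCollectionUsageThreshold(thresholdFor(usage.getMax(), fraction));
            armed++;
        }

        // The platform MemoryMXBean emits the threshold notifications.
        NotificationEmitter emitter =
                (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        NotificationListener listener = (notification, handback) -> {
            if (MemoryNotificationInfo.MEMORY_COLLECTION_THRESHOLD_EXCEEDED
                    .equals(notification.getType())) {
                onLowMemory.run(); // an application would request spills here
            }
        };
        emitter.addNotificationListener(listener, null, null);
        return armed;
    }

    public static void main(String[] args) {
        int armed = watch(0.7, () -> System.out.println("memory still high after GC"));
        System.out.println("heap pools armed: " + armed);
    }
}
```

Because the threshold is checked only after a collection, a notification means the collector could not bring usage back under the limit, which is a more reliable low-memory signal than raw heap usage.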

Re: Proactive Spill Count Recs

2011-08-03 Thread Dmitriy Ryaboy
Daniel, IIRC spill requests are triggered by a GC, and spill_count is triggered by an actual spill, so the former number may be a bit misleading (if GC is effective, lots of GCs might be fine).

D

On Wed, Aug 3, 2011 at 10:12 AM, Daniel Dai wrote:
> Spill means Pig needs to dump memory to disk.

Re: Proactive Spill Count Recs

2011-08-03 Thread Daniel Dai
Spill means Pig needs to dump memory to disk. It happens when Pig deals with a large key and runs short of memory. A high number indicates that Pig needs to write to disk frequently and performance may degrade; you may want to explore approaches such as using a skewed join.

Daniel

On Tue, Aug 2, 2011

Proactive Spill Count Recs

2011-08-02 Thread Sean Barry
org.apache.pig.PigCounters
PROACTIVE_SPILL_COUNT_RECS 0 2,372,598 2,372,598
SPILLABLE_MEMORY_MANAGER_SPILL_COUNT 0 64 64
PROACTIVE_SPILL_COUNT_BAGS

I was checking my jobtracker and I have no idea what these three counters represent. Can anyone shed some light, please?

-S