-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27169/
-----------------------------------------------------------
(Updated Nov. 2, 2014, 1:24 p.m.)
Review request for pig.
Changes
-------
Removed size reduction from spill(), but made size reduction kick off in the
main thread by changing the number of records to sample to the current record
count. This avoids aggregation happening in the spill thread, which might not
have all thread-local variables set as expected.
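For reviewers who want the idea in isolation, here is a minimal sketch of the hand-off described above. It is not the code in the patch: the class PartialAggSketch, its fields, the default threshold, and the summing aggregation are all hypothetical. The point is only that spill() lowers the sampling threshold and returns, while the aggregation itself runs on the thread that calls add().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and structure are illustrative, not the patch's code.
class PartialAggSketch {
    private static final int DEFAULT_RECORDS_TO_SAMPLE = 10_000; // assumed value

    // Touched only by the processing (main) thread.
    private final Map<String, List<Long>> rawValues = new HashMap<>();
    private final Map<String, Long> aggregated = new HashMap<>();

    // Visible to both threads, hence volatile. Only the main thread writes recordCount.
    private volatile int recordCount = 0;
    private volatile int recordsToSample = DEFAULT_RECORDS_TO_SAMPLE;
    private volatile String aggregatingThread = null;

    // Called by the main thread for every input record.
    void add(String key, long value) {
        rawValues.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
        recordCount++;
        if (recordCount >= recordsToSample) {
            aggregate(); // runs here, on the main thread
        }
    }

    // May be called from the spill thread. It does no aggregation itself; it only
    // lowers the threshold so that the main thread aggregates on its next record.
    long spill() {
        recordsToSample = recordCount;
        return 0L; // nothing is freed directly by this call in the sketch
    }

    private void aggregate() {
        aggregatingThread = Thread.currentThread().getName();
        for (Map.Entry<String, List<Long>> e : rawValues.entrySet()) {
            long sum = 0;
            for (long v : e.getValue()) {
                sum += v;
            }
            aggregated.merge(e.getKey(), sum, Long::sum);
        }
        rawValues.clear();
        recordCount = 0;
        recordsToSample = DEFAULT_RECORDS_TO_SAMPLE;
    }

    Long aggregatedValue(String key) {
        return aggregated.get(key);
    }

    String aggregatingThread() {
        return aggregatingThread;
    }

    public static void main(String[] args) throws InterruptedException {
        PartialAggSketch agg = new PartialAggSketch();
        agg.add("a", 1);
        agg.add("a", 2);
        Thread spiller = new Thread(agg::spill, "spill-thread");
        spiller.start();
        spiller.join();
        agg.add("a", 3); // the main thread notices the lowered threshold here
        System.out.println(agg.aggregatedValue("a") + " aggregated on " + agg.aggregatingThread());
    }
}
```

In this sketch a spill request that races with the threshold reset can at worst be dropped or cause one early aggregation; a real operator would need to decide whether that is acceptable.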
Bugs: PIG-3979
https://issues.apache.org/jira/browse/PIG-3979
Repository: pig
Description
-------
Fixed a couple of issues with POPartialAgg:
- Made the spill of POPartialAgg synchronous so that System.gc() in
SpillableMemoryManager actually frees up memory.
- Avoided a lot of redundant aggregateSecondLevel() calls
- Fixed the SpillableMemoryManager to not invoke extraGC if POPartialAgg
- Made variables that are not required to be serialized in the plan
transient
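As background for the last item, here is a small sketch of what marking a field transient does under standard Java serialization. The class OperatorSketch and its fields are hypothetical and do not correspond to POPartialAgg's actual members; the sketch only shows that configuration fields survive a serialization round trip while transient runtime state comes back as null and has to be re-created lazily.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical operator: illustrative only, not a class from the patch.
class OperatorSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    // Part of the plan: must survive serialization.
    private final String keyColumn;

    // Runtime-only state: skipped by Java serialization because it is transient.
    private transient Map<String, Long> rawInputMap = new HashMap<>();

    OperatorSketch(String keyColumn) {
        this.keyColumn = keyColumn;
    }

    String getKeyColumn() {
        return keyColumn;
    }

    void buffer(String key, long value) {
        if (rawInputMap == null) {
            // Field initializers do not run on deserialization, so re-create lazily.
            rawInputMap = new HashMap<>();
        }
        rawInputMap.merge(key, value, Long::sum);
    }

    int bufferedKeys() {
        return rawInputMap == null ? 0 : rawInputMap.size();
    }

    // Serializes the operator to a byte array and reads it back.
    static OperatorSketch roundTrip(OperatorSketch op) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(op);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (OperatorSketch) in.readObject();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        OperatorSketch op = new OperatorSketch("user_id");
        op.buffer("a", 1L);
        OperatorSketch copy = roundTrip(op);
        System.out.println(copy.getKeyColumn() + " / buffered keys: " + copy.bufferedKeys());
    }
}
```

Keeping runtime buffers transient also keeps the serialized plan small, since only the configuration travels with it.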
Diffs (updated)
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPartialAgg.java
1635881
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/GroupingSpillable.java
PRE-CREATION
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/util/SpillableMemoryManager.java
1635881
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestPOPartialAgg.java
1635881
Diff: https://reviews.apache.org/r/27169/diff/
Testing
-------
Unit tests added to TestPOPartialAgg. Ran a couple of manual e2e tests to
check behaviour.
Thanks,
Rohini Palaniswamy