Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Pig Wiki" for change 
notification.

The following page has been changed by AlanGates:
http://wiki.apache.org/pig/PigMemory

------------------------------------------------------------------------------
  == Problem Statement ==
  
  1. Pig hogs memory.  In the 0.2.0 version, the expansion factor of data on 
disk to data in memory is 2-3x.  This causes Pig problems in efficiently 
executing users' programs.  It is caused largely by the extensive use of Java 
objects (Integer, etc.) to store internal data.
-  1. Java memory management and its garbage collector are poorly suited to the 
workload of intensive data processing.  Pig needs better control over where 
data is stored and when memory is deallocated.  For a complete discussion of 
this issue see M. A. Shah et. al., ''Java Support for Data-Intensive Systems:  
Experiences Building the Telegraph Dataflow System''.
+  1. Java memory management and its garbage collector are poorly suited to the 
workload of intensive data processing.  Pig needs better control over where 
data is stored and when memory is deallocated.  For a complete discussion of 
this issue see M. A. Shah et al., ''Java Support for Data-Intensive Systems:  
Experiences Building the Telegraph Dataflow System''.  In particular, this 
paper points out that a memory allocation and garbage collection scheme that is 
beyond the control of the programmer is a bad fit for a large data processing 
system.
+  1. Currently Pig waits until memory is low to begin spilling bags to disk.  
This has two issues (a sketch of this style of low-memory detection follows 
the list):
+    a. It is difficult to accurately determine when available memory is too 
low.
+    b. The system tries to spill and continue processing simultaneously.  
Sometimes the continued processing outruns the spilling and the system still 
runs out of memory.
+    
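
To make the difficulty in item 3 concrete, here is a minimal sketch of 
threshold-based low-memory detection through the standard JMX memory beans.  
It is illustrative only and is not Pig's spill code: the class name, the 
threshold fraction, and the spillRegisteredBags() hook are invented for this 
example.  Note that the notification only says a threshold was crossed; 
processing that keeps allocating while the listener spills can still exhaust 
the heap, which is exactly issue (b) above.

{{{
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import javax.management.Notification;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical sketch of threshold-based low-memory detection.
public class LowMemoryWatcher implements NotificationListener {

    // Ask to be notified when any heap pool crosses the given fraction of its max size.
    public void register(double thresholdFraction) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                long max = pool.getUsage().getMax();
                if (max > 0) {
                    pool.setUsageThreshold((long) (max * thresholdFraction));
                }
            }
        }
        // The MemoryMXBean acts as a NotificationEmitter in standard JVMs.
        NotificationEmitter emitter =
            (NotificationEmitter) ManagementFactory.getMemoryMXBean();
        emitter.addNotificationListener(this, null, null);
    }

    @Override
    public void handleNotification(Notification n, Object handback) {
        if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
            // By the time this fires, the pipeline may already be allocating
            // faster than the bags can be written out.
            spillRegisteredBags();
        }
    }

    private void spillRegisteredBags() { /* placeholder for illustration */ }
}
}}}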
  
  == Proposed Solution ==
  Switching from Java containers and objects to large memory buffers and a 
page cache will address all of these issues.
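
As a purely illustrative sketch of the direction (the IntPage name and methods 
below are invented for this example and are not part of the interface proposed 
below), many values are packed into one large ByteBuffer and referenced by 
offset, instead of being held as one Java object per value:

{{{
import java.nio.ByteBuffer;

// Illustrative only: one big allocation shared by many values, rather than one
// object per value, which is where most of the 2-3x expansion comes from.
public class IntPage {
    private static final int INT_SIZE = 4;   // bytes per int

    private final ByteBuffer page;

    public IntPage(int pageSizeBytes) {
        page = ByteBuffer.allocate(pageSizeBytes);
    }

    /** Append a value; returns its offset in the page, or -1 if the page is full. */
    public int write(int value) {
        if (page.remaining() < INT_SIZE) {
            return -1;                        // caller would ask the page cache for a new page
        }
        int offset = page.position();
        page.putInt(value);                   // relative put advances the position
        return offset;
    }

    /** Read a previously written value by its offset; does not move the write position. */
    public int read(int offset) {
        return page.getInt(offset);           // absolute get
    }
}
}}}

The write/read-by-offset pattern has the same shape as the write() and offset 
handling in the DataBuffer pseudocode below, just specialized to ints to keep 
the example short.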
@@ -93, +97 @@

          // number of tuples with references in the buffer
          private int refCnt;
          private int nextOffset;
+         private boolean isDirty;
+         private boolean isOnDisk;
          // Package level so others can see it without the overhead of a read 
call.
          DataBuffer data;
          File diskCache;
@@ -106, +112 @@

              diskCache = null;
              nextOffset = 0;
              refCnt = 0;
+             isDirty = false;
+             isOnDisk = false;
          }
              
          /**
@@ -131, +139 @@

           */
          int write(byte[] data) {
             bringIntoMemory();
             if insufficient space for data return -1
+             isDirty = true;
             write data into this.data;
             move nextOffset;
@@ -142, +151 @@

           * before the tuple begins to read the data.
           */
          void bringIntoMemory() {
-             if data on disk {
+             if (isOnDisk) {
                 data = MemoryManager.getMemoryManager().getDataBuffer();
                  read into memory
-                 diskCache = null;
+                 isDirty = false;
                  // pushes the buffer back onto the full queue, so it can be
                  // flushed again if necessary.
                 MemoryManager.getMemoryManager().markFull();
+                 isOnDisk = false;
              }
          }
  
@@ -183, +193 @@

           * @return 
           */
          DataBuffer flush() {
-             diskCache = new File;
-             diskCache.deleteOnExit();
-             write data to diskCache;
-             return data;
-         }
- 
+             if (isDirty) {
+                 isOnDisk = true;
+                 if (diskCache == null) diskCache = new File;
+                 open(diskCache);
+                 write buffer to diskCache;
+                 diskCache.deleteOnExit();
+                 isDirty = false;
+             }
+             // A clean buffer does not need to be rewritten before its memory
+             // is handed back.
+             return data;
+         }
      }
  
      /**
@@ -424, +436 @@

  that by totally circumventing the Java garbage collector they saw about a 
2.5x speedup of their system.  So it might be worth
  investigating.
  
+ == Reader Feedback ==
+ 
+ Ted Dunning commented in 
http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3cc7d45fc70905141943v72591b09u81009cf29b9f5...@mail.gmail.com%3e
+ 
+ Response:  
http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3ce3ac3b63-adb2-4c49-83fb-49a251cd9...@yahoo-inc.com%3e
+ 
+ Thejas Nair commented in 
http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3cc633011f.42186%25te...@yahoo-inc.com%3e
+ 
+ Response:  
http://mail-archives.apache.org/mod_mbox/hadoop-pig-dev/200905.mbox/%3c55caa9b7-c415-4c63-b80b-62d7a7539...@yahoo-inc.com%3e
+ 
+ Chris Olston made a few comments in a conversation:
+ 
+ 1. LRU will sometimes be a bad replacement choice.  In cases where one batch 
of data in the pipeline is larger than the total of all the memory pages, MRU 
would be a better choice.
+ 
+ Response:  Agreed, but it seems that in the case where an operator needs a 
few pages of memory but not all of them, LRU may be a better choice (assuming 
no other operators are taking all the other pages of memory).  Since we don't 
know ''a priori'' the size of the input to an operator, I don't know how to 
choose which is better.
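
As a rough illustration of the tradeoff (the class and method names here are 
invented and are not part of the proposed memory manager), the only difference 
between the two policies is which end of the recency order the victim comes 
from.  When one pass of data is larger than the whole pool, LRU keeps evicting 
exactly the pages that will be needed again soonest, which is why MRU can win 
in that case:

{{{
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of a pluggable victim-selection policy for buffer pages.
public class VictimPicker<P> {
    public enum Policy { LRU, MRU }

    private final Deque<P> recency = new ArrayDeque<>();  // most recently used page at the head
    private final Policy policy;

    public VictimPicker(Policy policy) {
        this.policy = policy;
    }

    /** Record that a page was just used. */
    public void touch(P page) {
        recency.remove(page);      // O(n), acceptable for a sketch
        recency.addFirst(page);
    }

    /** Choose the page to flush next: the tail for LRU, the head for MRU. */
    public P pickVictim() {
        return policy == Policy.LRU ? recency.pollLast() : recency.pollFirst();
    }
}
}}}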
+ 
+ 2. It might be useful to expand the interface to allow control of how many 
buffer pages go to a given operator.  This has a couple of benefits.  One, it 
is possible to prevent one
+  operator from taking all of the resources from another.  Two, you can choose 
different replacement algorithms based on what is best for that operator.
+ 
+ Response:  I agree that allowing assignments of memory pages to specific 
operators could be useful.  I am concerned that Pig's planner is not 
sophisticated enough to make intelligent choices here.  I would like to leave 
this as an area for future work.
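
To make the suggestion concrete, a hypothetical extension of the 
buffer-allocation interface might look like the sketch below.  None of these 
methods exist in the current proposal; the names and the DataBuffer placeholder 
are only there to illustrate per-operator page budgets and per-operator 
replacement policies.

{{{
// Hypothetical only: nothing below is part of the proposed MemoryManager.
public interface OperatorAwareMemoryManager {

    /** Stand-in for the DataBuffer described in the proposal above. */
    interface DataBuffer { }

    enum ReplacementPolicy { LRU, MRU }

    /** Cap the number of buffer pages the given operator may hold at once. */
    void setPageBudget(String operatorId, int maxPages);

    /** Choose how victims are picked when this operator's pages must be flushed. */
    void setReplacementPolicy(String operatorId, ReplacementPolicy policy);

    /**
     * Hand out a buffer charged to the operator's budget; if the budget is
     * exhausted, one of that operator's own pages is flushed first, so one
     * operator cannot starve another.
     */
    DataBuffer getDataBuffer(String operatorId);
}
}}}

As noted in the response, picking good values for the page budgets would fall 
to the planner, which is why this is left as an area for future work.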
+ 
