Re: A proposal for changing pig's memory management

2009-06-01 Thread Mridul Muralidharan

Alan Gates wrote:


On May 19, 2009, at 10:30 PM, Mridul Muralidharan wrote:



I am still not very convinced about the value of this implementation - 
particularly considering the advances made since 1.3 in memory 
allocators and garbage collection.


My fundamental concern is not with the slowness of garbage collection.  
I am asserting (along with the paper) that garbage collection is not an 
optimal choice for a large data processing system.  I don't want to 
improve the garbage collector, I want to manage a subset of the memory 
without it.



I should probably have elaborated better.
Most objects in Pig are in the young generation (please correct me if I am 
wrong) - so promoting them from there (where allocation and collection are 
handled very efficiently by the VM) into slower, longer-lived memory pools 
should be done with some thought (management of buffers, etc.).


The only (corner) cases where this is not valid, off the top of my head, are 
when a single tuple becomes really large due to a bag (usually) containing 
either a large number of tuples or tuples with large payloads: 
and IMO that results in quite similar costs under this proposal too - but 
I could be wrong.



The side effects of this proposal are many, and sometimes non-obvious: 
implicitly moving young-generation data into the old generation, 
causing much more memory pressure for the GC; fragmentation of memory 
blocks, causing quite a bit of memory pressure; replicating quite a bit 
of the garbage collector's functionality; the possibility of bugs with 
ref counting; etc.


I don't understand your concerns regarding the load on the gc and memory 
fragmentation.  Let's say I have 10,000 tuples, each with 10 fields.  
Let's also assume that these tuples live long enough to make it into the 
old memory pool, since this is the interesting case where objects live 
long enough to cause a problem.  In the current implementation there 
will be 110,000 objects that the gc has to manage moving into the old 
pool, and check every time it cleans the old pool.  In the proposed 
implementation there would be 10,001 objects (assuming all the data fit 
into one buffer) to manage.  And rather than allocating 100,000 small 
pieces of memory, we would have allocated one large segment.  My belief 
is that this would lighten the load on the gc.
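
To make the object-count arithmetic concrete, here is a minimal sketch 
(hypothetical names, not the proposal's actual API) of a tuple that reads 
its fields out of one shared buffer instead of holding ten field objects 
of its own:

    import java.nio.ByteBuffer;

    // Illustrative only: many tuples share one large allocation, so the
    // GC tracks one buffer per segment rather than one object per field.
    class BufferBackedTuple {
        private final ByteBuffer segment;  // shared buffer holding many tuples
        private final int base;            // start of this tuple's data
        private final int[] fieldOffset;   // per-field offset relative to base

        BufferBackedTuple(ByteBuffer segment, int base, int[] fieldOffset) {
            this.segment = segment;
            this.base = base;
            this.fieldOffset = fieldOffset;
        }

        // Read field i as an int without creating an Integer object.
        int getInt(int i) {
            return segment.getInt(base + fieldOffset[i]);
        }
    }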



Old-gen memory management is not trivial.
For example - and this should be fairly well known by now - if an 
old-gen block is freed but the cost of moving existing blocks around to 
reuse the 'free' block is high, the VM just leaves it in place. Over time, 
you end up with fragmentation in the old gen which can't be reclaimed. 
(This is not a VM bug - the costs outweigh the benefits.)



That being said, as I mentioned above, the cost of memory usage is not 
linear - the young gen is much faster (allocation, management, freeing) than 
objects promoted to older generations (successively) [compaction, 
reference updates, etc. in the GC].


In Pig's case, since processing is essentially streaming in nature, most 
tuples/bags - except in corner cases - would stay in the young gen, where 
things are faster.






Just a note though -
The last time I had to dabble in memory management for my server needs, 
it was already pretty complex and unintuitive (not to mention environment- 
and implementation-specific) - and that was a few years back. Unfortunately, 
I have not kept abreast of recent changes (and quite a few have gone into the 
VM for Java 6, I was told), so my comments above might no longer be valid.
Other than saying that you would probably want to test extensively, as we 
had to, and that things are not as simple as they normally appear 
[and IMO almost all books/articles get it wrong - so testing is the only way 
out], I can't really comment more authoritatively anymore :-) Any 
improvement to Pig's memory management would be a welcome change though!
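
(For that kind of testing, the standard HotSpot GC logging flags of this 
era are a reasonable starting point - for example, with a made-up driver 
class name:

    java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xmx1g SomePigJob

which prints per-collection details with timestamps, enough to watch 
promotion rates and old-gen behaviour under load.)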




Regards,
Mridul




This does replicate some of the functionality of the garbage 
collector.  Complex systems frequently need to re-implement foundational 
functionality in order to optimize it for their needs.  Hence many RDBMS 
engines have their own implementations of memory management, file I/O, 
thread scheduling, etc.


As for bugs in ref counting, I agree that forgetting to deallocate is 
one of the most pernicious problems of allowing programmers to do memory 
management.  But in this case all that will happen is that a buffer will 
get left around that isn't needed.  If the system needs more memory then 
that buffer will eventually get selected for flushing to disk, and then 
it will stay there as no one will call it back into memory.  So the cost 
of forgetting to deallocate is minor.
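
As a minimal sketch of that failure mode (illustrative names only, not the 
proposal's API): a forgotten release() leaves the count above zero, so the 
buffer is never reclaimed eagerly, but it remains an ordinary spill candidate:

    import java.util.concurrent.atomic.AtomicInteger;

    // Illustrative only: a buffer whose lifetime is managed by a ref count.
    class ManagedBuffer {
        private final byte[] data;
        private final AtomicInteger refCount = new AtomicInteger(1);

        ManagedBuffer(int size) { data = new byte[size]; }

        void retain() { refCount.incrementAndGet(); }

        // Forgetting to call release() does not corrupt anything: the buffer
        // just lingers until the memory manager picks it for spilling to disk.
        void release() { refCount.decrementAndGet(); }

        boolean unreferenced() { return refCount.get() <= 0; }
    }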





If the assumption is that the current working set of bags/tuples does not 
need to be spilled, and anything else can be, then in the worst case this 
will pretty much deteriorate to the current implementation.
That is not the assumption.  There are two issues:  1) trying to spill 
bags only when we determine we need to is highly error prone, because we 
can't accurately determine when we need to and because we sometimes 
can't dump fast enough to survive; 2) current memory usage is far too 
high, and needs to be reduced.

Re: A proposal for changing pig's memory management

2009-05-20 Thread Alan Gates


On May 19, 2009, at 10:30 PM, Mridul Muralidharan wrote:



I am still not very convinced about the value of this implementation - 
particularly considering the advances made since 1.3 in memory 
allocators and garbage collection.


My fundamental concern is not with the slowness of garbage  
collection.  I am asserting (along with the paper) that garbage  
collection is not an optimal choice for a large data processing  
system.  I don't want to improve the garbage collector, I want to  
manage a subset of the memory without it.





The side effects of this proposal are many, and sometimes non-obvious: 
implicitly moving young-generation data into the old generation, 
causing much more memory pressure for the GC; fragmentation of memory 
blocks, causing quite a bit of memory pressure; replicating quite a bit 
of the garbage collector's functionality; the possibility of bugs with 
ref counting; etc.


I don't understand your concerns regarding the load on the gc and  
memory fragmentation.  Let's say I have 10,000 tuples, each with 10  
fields.  Let's also assume that these tuples live long enough to make  
it into the old memory pool, since this is the interesting case  
where objects live long enough to cause a problem.  In the current  
implementation there will be 110,000 objects that the gc has to manage  
moving into the old pool, and check every time it cleans the old  
pool.  In the proposed implementation there would be 10,001 objects  
(assuming all the data fit into one buffer) to manage.  And rather  
than allocating 100,000 small pieces of memory, we would have  
allocated one large segment.  My belief is that this would lighten the  
load on the gc.


This does replicate some of the functionality of the garbage 
collector.  Complex systems frequently need to re-implement  
foundational functionality in order to optimize it for their needs.   
Hence many RDBMS engines have their own implementations of memory  
management, file I/O, thread scheduling, etc.


As for bugs in ref counting, I agree that forgetting to deallocate is  
one of the most pernicious problems of allowing programmers to do  
memory management.  But in this case all that will happen is that a  
buffer will get left around that isn't needed.  If the system needs  
more memory then that buffer will eventually get selected for flushing  
to disk, and then it will stay there as no one will call it back into  
memory.  So the cost of forgetting to deallocate is minor.





If the assumption is that the current working set of bags/tuples does not 
need to be spilled, and anything else can be, then in the worst case this 
will pretty much deteriorate to the current implementation.
That is not the assumption.  There are two issues:  1) trying to spill 
bags only when we determine we need to is highly error prone, because 
we can't accurately determine when we need to and because we sometimes 
can't dump fast enough to survive; 2) current memory usage is far too 
high, and needs to be reduced.


A much simpler method to gain benefits would be to handle 
primitives as ... primitives, and not through the Java wrapper 
classes for them.
It should be possible to write schema-aware tuples which make use of 
the primitives specified to take a fraction of the memory required 
(4 bytes + a null-check boolean for an int, plus an offset mapping, 
instead of the 24/32 bytes it currently takes, etc.).
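
A minimal sketch of what such a schema-aware tuple might look like 
(hypothetical names, assuming an all-int schema for brevity):

    // Illustrative only: primitives plus a null flag per field, instead of
    // one boxed Integer object (24/32 bytes with header) per field.
    class IntSchemaTuple {
        private final int[] value;       // 4 bytes per field
        private final boolean[] isNull;  // the null-check boolean per field

        IntSchemaTuple(int numFields) {
            value = new int[numFields];
            isNull = new boolean[numFields];
        }

        void setInt(int i, Integer v) {
            isNull[i] = (v == null);
            value[i] = (v == null) ? 0 : v.intValue();
        }

        Integer getInt(int i) {
            return isNull[i] ? null : Integer.valueOf(value[i]);
        }
    }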


In my observation, at least 50% of the data in pig is untyped, which  
means it's a byte array.  Of the 50% that people declare or is  
determined by the program, probably 50-80% of that are chararrays and  
maps.  So that means that somewhere around 25% of the data is  
numeric.  Shrinking that 25% by 75% will be nice, but not adequate.   
And it does nothing to help with the issue of being able to spill in a  
controlled way instead of only in emergency situations.
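
(To spell out the arithmetic from those estimates: if roughly 50% of the 
data is declared, and 50-80% of that is chararrays and maps, the numeric 
share is 50% x 20-50%, i.e. about 10-25% of all data; shrinking it by 75% 
would recover at most around 19% of total memory.)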


Alan.


Re: A proposal for changing pig's memory management

2009-05-19 Thread Alan Gates
The claims in the paper I was interested in were not issues like 
non-blocking I/O etc.  The claim that is of interest to pig is that a 
memory allocation and garbage collection scheme that is beyond the 
control of the programmer is a bad fit for a large data processing 
system.  This is a fundamental design choice in Java, and one that fits 
the vast majority of its uses well.  But for systems like Pig there 
seems to be no choice but to work around Java's memory management.  
I'll clarify this point in the document.


I took a closer look at NIO.  My concern is that it does not give the 
level of control I want.  NIO allows you to force a buffer to disk and 
request a buffer to load, but you cannot force a page out of memory.  
It doesn't even guarantee that after you load a page it will really be 
loaded.  One of the biggest issues in pig right now is that we run out 
of memory or get the garbage collector into a situation where it can't 
make sufficient progress.  Perhaps switching to large buffers instead of 
having many individual objects will address this.  But I'm concerned 
that if we cannot explicitly force data out of memory onto disk then 
we'll be back in the same boat of trusting the Java memory manager.
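
A minimal sketch of the kind of explicit control I mean (names made up; 
this is plain file I/O rather than NIO's memory mapping, precisely because 
with mapping it is the OS, not the application, that decides when a page 
leaves memory):

    import java.io.File;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    // Illustrative only: the memory manager, not the OS or the GC,
    // decides when this buffer's data leaves memory.
    class SpillableBuffer {
        private ByteBuffer data;  // null once spilled
        private File spillFile;   // where the data lives after spilling

        SpillableBuffer(ByteBuffer data) { this.data = data; }

        // Explicitly force the data out of memory onto disk.
        void spill() throws IOException {
            spillFile = File.createTempFile("pig-spill", ".buf");
            FileChannel out = new FileOutputStream(spillFile).getChannel();
            try {
                data.rewind();
                out.write(data);
            } finally {
                out.close();
            }
            data = null;  // drop the only strong reference; GC can reclaim it
        }
    }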


Alan.

On May 14, 2009, at 7:43 PM, Ted Dunning wrote:


That Telegraph dataflow paper is pretty long in the tooth.  Certainly
several of their claims have little force any more (lack of non-blocking
I/O, poor thread performance, no unmap, very expensive synchronization for
uncontested locks).  It is worth noting that they did all of their tests on
the 1.3 JVM, and things have come an enormous way since then.

Certainly, it is worth having opaque containers based on byte arrays, but
isn't that pretty much what the NIO byte buffers are there to provide?
Wouldn't a virtual tuple type that was nothing more than a byte buffer, type
and an offset do almost all of what is proposed here?

On Thu, May 14, 2009 at 5:33 PM, Alan Gates ga...@yahoo-inc.com  
wrote:



http://wiki.apache.org/pig/PigMemory

Alan.





Re: A proposal for changing pig's memory management

2009-05-19 Thread Ted Dunning
If you have a small number of long-lived large objects and a large number of
small ephemeral objects then the java collector should be in pig-heaven (as
it were).  The long-lived objects will take no time to collect and the
ephemeral objects won't be around to collect by the time the full GC
happens.

On Tue, May 19, 2009 at 3:44 PM, Alan Gates ga...@yahoo-inc.com wrote:

 Perhaps switching to large buffers instead of having many individual
 objects will address this.  But I'm concerned that if we cannot explicitly
 force data out of memory onto disk then we'll be back in the same boat of
 trusting the Java memory manager.


-- 
Ted Dunning, CTO
DeepDyve


Re: A proposal for changing pig's memory management

2009-05-19 Thread Mridul Muralidharan


I am still not very convinced about the value of this implementation 
- particularly considering the advances made since 1.3 in memory 
allocators and garbage collection.


The side effects of this proposal are many, and sometimes non-obvious: 
implicitly moving young-generation data into the old generation, 
causing much more memory pressure for the GC; fragmentation of memory blocks, 
causing quite a bit of memory pressure; replicating quite a bit of 
the garbage collector's functionality; the possibility of bugs with 
ref counting; etc.


If the assumption is that the current working set of bags/tuples does not 
need to be spilled, and anything else can be, then in the worst case this 
will pretty much deteriorate to the current implementation.





A much simpler method to gain benefits would be to handle 
primitives as ... primitives, and not through the Java wrapper classes 
for them.
It should be possible to write schema-aware tuples which make use of the 
primitives specified to take a fraction of the memory required (4 bytes + 
a null-check boolean for an int, plus an offset mapping, instead of the 
24/32 bytes it currently takes, etc.).




Regards,
Mridul

Alan Gates wrote:

http://wiki.apache.org/pig/PigMemory

Alan.




Re: A proposal for changing pig's memory management

2009-05-14 Thread Ted Dunning
That Telegraph dataflow paper is pretty long in the tooth.  Certainly
several of their claims have little force any more (lack of non-blocking
I/O, poor thread performance, no unmap, very expensive synchronization for
uncontested locks).  It is worth noting that they did all of their tests on
the 1.3 JVM, and things have come an enormous way since then.

Certainly, it is worth having opaque containers based on byte arrays, but
isn't that pretty much what the NIO byte buffers are there to provide?
Wouldn't a virtual tuple type that was nothing more than a byte buffer, type
and an offset do almost all of what is proposed here?
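
A minimal sketch of such a virtual tuple (names made up) - nothing but a 
buffer, a type tag, and an offset:

    import java.nio.ByteBuffer;

    // Illustrative only: a tuple that is just a typed view over a byte buffer.
    class VirtualTuple {
        private final ByteBuffer buf;
        private final byte type;   // tag describing the encoded layout
        private final int offset;  // where this tuple starts in buf

        VirtualTuple(ByteBuffer buf, byte type, int offset) {
            this.buf = buf;
            this.type = type;
            this.offset = offset;
        }

        // Example accessor: read a long field at a relative position.
        long getLong(int relative) {
            return buf.getLong(offset + relative);
        }
    }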

On Thu, May 14, 2009 at 5:33 PM, Alan Gates ga...@yahoo-inc.com wrote:

 http://wiki.apache.org/pig/PigMemory

 Alan.