I am still not very convinced of the value of this implementation, particularly considering the advances made since 1.3 in memory allocators and garbage collection.


The side effects of this proposal are many, and sometimes non-obvious.
For example: implicitly moving young generation data into the older generation, which causes much more memory pressure for the gc; fragmentation of memory blocks, which adds quite a bit of memory pressure of its own; replicating quite a bit of functionality that garbage collection already provides; the possibility of bugs in the ref counting; etc.

If the assumption is that the current working set of a bag/tuple does not need to be spilled, and anything else can be, then in the worst case this will pretty much deteriorate to the current impl.




A much simpler way to gain benefits would be to handle primitives as ... primitives, and not through the Java wrapper classes for them. It should be possible to write schema-aware tuples that use the primitive types specified in the schema and so take a fraction of the memory currently required (for an int: 4 bytes + a null_check boolean + offset mapping, instead of the 24/32 bytes it currently takes, etc).
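
To make the idea concrete, here is a rough sketch of what I have in mind, restricted to a schema where every field is an int. IntTuple and its methods are names I made up for illustration, they are not part of Pig; a real version would need the offset mapping to handle mixed field types.

```java
import java.util.BitSet;

/**
 * Hypothetical schema-aware tuple for an all-int schema.
 * Values live in a primitive int[] (4 bytes per field) and
 * nullness is tracked with one bit per field, so no
 * java.lang.Integer wrapper objects are allocated.
 */
final class IntTuple {
    private final int[] values;
    private final BitSet nulls;

    IntTuple(int fieldCount) {
        values = new int[fieldCount];
        nulls = new BitSet(fieldCount);
        nulls.set(0, fieldCount); // every field starts out null
    }

    /** Stores a primitive value and marks the field as non-null. */
    void setInt(int field, int value) {
        values[field] = value;
        nulls.clear(field);
    }

    /** Marks the field as null; the stale int is simply ignored. */
    void setNull(int field) {
        checkIndex(field);
        nulls.set(field);
    }

    boolean isNull(int field) {
        checkIndex(field);
        return nulls.get(field);
    }

    /** Returns the primitive value; throws if the field is null. */
    int getInt(int field) {
        if (isNull(field)) {
            throw new IllegalStateException("field " + field + " is null");
        }
        return values[field];
    }

    int size() {
        return values.length;
    }

    private void checkIndex(int field) {
        if (field < 0 || field >= values.length) {
            throw new IndexOutOfBoundsException("no field " + field);
        }
    }
}
```

The caller never sees a boxed Integer, so the per-field cost is the 4-byte slot plus one bit in the null map, on top of the fixed overhead of the two backing objects.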



Regards,
Mridul

Alan Gates wrote:
http://wiki.apache.org/pig/PigMemory

Alan.
