Sounds like a good project!

On 1/13/16 4:27 PM, Jianfeng Jia wrote:
Hi Devs,

First of all, Xi Zhang, a Master's student at UCI, wants to work with us for a 
while. Welcome, Xi!

We are thinking of starting with a Frame-based, memory-bounded 
SerializableVector. We expect this vector to resolve some of the occasional 
Java heap OutOfMemory (OOM) exceptions in Hyracks.
Although we did a good job of organizing the memory that holds the records, an 
OOM exception can still happen while operating on the auxiliary data structures. 
For example, in the sort run generator, instead of moving records around, we 
create an array of reference “pointers” to the original records. However, if the 
records are small, many of them fit in memory and that int array becomes large, 
so an OOM exception can occur, which is the case in issue [1].

One way to solve this problem is to put the auxiliary data structures into 
memory-bounded frames as well. In general, it is much easier to ask for 
multiple small memory blocks than for one big chunk of memory. I guess that is 
the same reason why we have “SerializableHashTable” for HashJoin. It would be 
nice to have a more general structure that can be used by all the operators.
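
To make the idea concrete, here is a minimal sketch of an int vector that 
stores its elements in fixed-size frames under a frame budget. The class name 
FrameIntVector, its methods, and the use of plain ByteBuffers as frames are all 
assumptions for illustration, not existing Hyracks APIs. A real version would 
obtain its frames from the operator's memory budget.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: an int vector stored in fixed-size frames. */
class FrameIntVector {
    private final int frameSizeBytes;
    private final int intsPerFrame; // capacity of one frame, in ints
    private final int maxFrames;    // memory budget, expressed in frames
    private final List<ByteBuffer> frames = new ArrayList<>();
    private int size = 0;

    FrameIntVector(int frameSizeBytes, int maxFrames) {
        if (frameSizeBytes < Integer.BYTES || maxFrames < 1) {
            throw new IllegalArgumentException("frame too small or no budget");
        }
        this.frameSizeBytes = frameSizeBytes;
        this.intsPerFrame = frameSizeBytes / Integer.BYTES;
        this.maxFrames = maxFrames;
    }

    /** Appends a value; returns false (instead of failing) when the budget is used up. */
    boolean append(int value) {
        int frameIdx = size / intsPerFrame;
        if (frameIdx == frames.size()) {
            if (frames.size() == maxFrames) {
                return false; // the caller can flush a run here rather than hit an OOM
            }
            // One small allocation per frame instead of one big contiguous array.
            frames.add(ByteBuffer.allocate(frameSizeBytes));
        }
        frames.get(frameIdx).putInt((size % intsPerFrame) * Integer.BYTES, value);
        size++;
        return true;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return frames.get(index / intsPerFrame).getInt((index % intsPerFrame) * Integer.BYTES);
    }

    int size() {
        return size;
    }

    int framesUsed() {
        return frames.size();
    }

    public static void main(String[] args) {
        FrameIntVector pointers = new FrameIntVector(32, 2); // 2 frames of 8 ints each
        int accepted = 0;
        while (pointers.append(accepted * 10)) {
            accepted++;
        }
        System.out.println("accepted " + accepted + " entries in "
                + pointers.framesUsed() + " frames");
    }
}
```

Because append reports a full budget by returning false, the caller always 
knows how much memory the auxiliary structure holds and can react before the 
heap runs out.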

The Frame-based Vector idea is inspired by the Scala Vector [2], which looks 
like a List but is implemented internally as a 32-ary tree. Its performance is 
very stable across a variety of sizes [3]. It would combine the benefits of 
ArrayList and LinkedList. In addition, we can include the memory usage of the 
auxiliary structure in the memory calculation. We will work on a detailed 
design doc later if we agree on this direction.
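
As a rough illustration of the 32-ary tree layout, the sketch below splits an 
index into 5-bit slices and uses each slice to pick one of 32 slots per level. 
It is a simplified, mutable, append-only version and not Scala's persistent 
implementation. The class name RadixIntVector and its methods are hypothetical. 
In a Frame-based design, each node could live in a frame instead of a Java 
array.

```java
/** Hypothetical sketch: an append-only int vector laid out as a 32-ary tree. */
class RadixIntVector {
    private static final int BITS = 5;          // 2^5 = 32 slots per node
    private static final int WIDTH = 1 << BITS;
    private static final int MASK = WIDTH - 1;

    private Object root = new int[WIDTH]; // the root is a leaf while size <= 32
    private int shift = 0;                // index bits consumed below the root level
    private int size = 0;

    void append(int value) {
        long capacity = 1L << (shift + BITS);
        if (size == capacity) {
            // The tree is full: add a new root level above the current root.
            Object[] newRoot = new Object[WIDTH];
            newRoot[0] = root;
            root = newRoot;
            shift += BITS;
        }
        Object node = root;
        for (int level = shift; level > 0; level -= BITS) {
            Object[] inner = (Object[]) node;
            int slot = (size >>> level) & MASK;
            if (inner[slot] == null) {
                // Allocate one small node on demand; leaves sit just below level BITS.
                inner[slot] = (level == BITS) ? new int[WIDTH] : new Object[WIDTH];
            }
            node = inner[slot];
        }
        ((int[]) node)[size & MASK] = value;
        size++;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        Object node = root;
        for (int level = shift; level > 0; level -= BITS) {
            node = ((Object[]) node)[(index >>> level) & MASK];
        }
        return ((int[]) node)[index & MASK];
    }

    int size() {
        return size;
    }

    /** Number of levels, counting the leaf level. */
    int height() {
        return shift / BITS + 1;
    }

    public static void main(String[] args) {
        RadixIntVector v = new RadixIntVector();
        for (int i = 0; i < 2000; i++) {
            v.append(i * 3);
        }
        System.out.println("size=" + v.size() + " height=" + v.height()
                + " get(1999)=" + v.get(1999));
    }
}
```

Every allocation is a node of 32 slots, and a lookup touches one node per 
level, so the cost of get grows only with the height of the tree.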

Any thoughts or suggestions? Thank you!


[1] https://code.google.com/p/asterixdb/issues/detail?id=934&can=1&q=last%20straw&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary%20ETA%20Severity
[2] https://bitbucket.org/astrieanna/bitmapped-vector-trie
[3] http://danielasfregola.com/2015/06/15/which-immutable-scala-collection/

Best,

Jianfeng Jia
PhD Candidate of Computer Science
University of California, Irvine


