@Till: do you have any thoughts on this subject before we close it? On Fri, Apr 1, 2016 at 2:08 PM, Jianfeng Jia <[email protected]> wrote:
> Dear devs, > > Xi has implemented the SerializableVector(SVector) which allocated memory > increasingly rather than a long java int array upfront. > We did some tests but it seems not a good enough answer to solve the OOM > issue caused by long java array allocation. > > The first experiment was to test if the standalone SVector approach would > get more memory comparing to the long int array. With the heap size of > -Xmx2g on a single machine, the SVector could allocate up to 1280M data, > while the original Java array can allocate up to 1080M data. It can > allocate about 18% more space. > > When we move to the AsterixDB, we replace the int array which stores the > (FrameId, TupleId) pair with the SVector. Then we gradually increasing the > "compiler.sortmemory" to increase the in-memory buffer for external sort. > Both master and SVector succeeded in 1150M case and failed at 1250M case. > Furthermore, the master branch succeeded in 1200M case but the SVector > failed in that case. > > Given that, we may think that allocate the memory chunk by chunk doesn't > win too much space comparing to allocate a big one directly. > I think we need to take the structure memory usage into the spilling case > calculation, but we may not need an extra structure to chop the memory. > > Any insights? > If you are interested, the code sample Xi wrote is here: > https://github.com/xixZ/JVMExperiment/tree/master/src/testing > > > On Tue, Feb 2, 2016 at 11:01 PM, Jianfeng Jia <[email protected]> > wrote: > > > Thanks Ted for the elaborate introduction! > > I did some reading about it. Based on my understanding, the main > advantage > > of ValueVector is its column-wise design. In that case, the per-record > > based metadata, e.g. indexes or hash keys, can be directly added as an > > column along with the existing record. Thus, the sorting within one row > > group should be easily addressed. One question still not very clear to me > > is how to generate the sorted result across several row groups? > > If you can get somebody to talk more about it that will be great. Thank > > you! > > > > On Feb 1, 2016, at 6:03 PM, Ted Dunning <[email protected]> wrote: > > > > So there are several key points for ValueVectors that I can describe, but > > for the authoritative voice, others would have to speak. > > > > The first point is that in Drill at least (and this is not required) > > ValueVectors are off-heap. This helps enormously in managing the > > life-cycle since vectors can be associated with queries and when the > query > > ends, all associated vectors can be deallocated quickly. That also > allows > > the memory footprint of Drill to be adjusted both up and down while > > running. > > > > Secondly, ValueVectors are stored column-wise, not record-wise. Most > > manipulations do not require copies. Projection simply requires ignoring > > some columns. New columns can be added without disturbing old ones. > > Filtering is done using a row selection bitmap. Sorting is often done > using > > an index column. > > > > The assumption is also that you will have a row group with something > like a > > hundred thousand rows in it. This means that the size of a single group > > isn't usually astronomical although very large data structures in a > single > > row can make the regulation of the size of row groups more difficult. > > > > Of particular interest is the fact that processing code can be generated > in > > Java that avoids almost all object creation so that most SQL-like queries > > don't require any object cons'ing at all during the row scans. Moreover, > > the code generated can even be rendered by the JIT into vectorized low > > level instructions because the access patterns on ValueVectors are so > > simple. > > > > Nested data structures are handled using invisible marking columns > similar > > to the way that nesting is marked in Dremel or Parquet. This means that > you > > get uniformly typed pseudo columns that represent a flattened view of the > > nested content. Many restructuring operations can be done by simply > > re-interpreting the nested data without any copying at all. > > > > If more detail is desired we should probably get somebody who is more > > active in the Drill implementation to talk about how this all works and > how > > it will be extracted into Apache Arrow. > > > > More information can be found here: > > > > https://drill.apache.org/docs/value-vectors/ > > > > > > > > > > On Mon, Feb 1, 2016 at 5:08 PM, Jianfeng Jia <[email protected]> > > wrote: > > > > If I understand correctly, it seems very similar to the IFrame in Hyrack, > > which is also a container to store a sequence of record into the > > ByteBuffer. > > > > I’m not clear about how records are manipulated inside the ValueVectors. > > In Hyracks case, we store the pointer(usually a pair of (frameID, > > recordID)) in one java array and manipulate the pointers instead of the > > original records. We mainly want to break the that one array into > multiple > > small ByteBuffer-based arrays. By doing so, we can reduce the risk of OOM > > for a large array and we may also take the memory usage of those pointers > > into account for the flush decisions. @Ted, could you share some > insights > > about how the ValueVectors handles manipulations? e.g. sort, hashing … > etc. > > > > On Feb 1, 2016, at 3:16 PM, Ted Dunning <[email protected]> wrote: > > > > Have you guys looked at the Drill ValueVectors? > > > > This structure is being spun out as Apache Arrow with multiple interfaces > > and language bindings. > > > > On Mon, Feb 1, 2016 at 9:56 AM, Jianfeng Jia <[email protected]> > > > > wrote: > > > > > > Hi, > > > > We plan to implement an append-only array at first. The main reason is > > this is how the auxiliary data structure be used so far. Then the > > implementation is straightforward. > > > > The tree-structured Vector in Scala can save a lot in updating case > > > > mainly > > > > because of their immutable requirement. It can saving unnecessary data > > copies comparing to other immutable list when updating. We only allow > > in-place update. The tree design may be overkill for us. > > > > Xi made on detailed design doc is here: > > > > > > > https://docs.google.com/document/d/1bs3JBCxmvuJZmBq_gKzUDt0FZdM9j3d3MPAmC29ZoSA/edit?usp=sharing > > > > Any thoughts or comments? > > > > > > On Jan 18, 2016, at 9:52 AM, Chen Li <[email protected]> wrote: > > > > @Xi and Jianfeng: after we come up with the design, let's share it with > > > > the > > > > group for an approval before the implementation. > > > > Chen > > > > On Fri, Jan 15, 2016 at 11:48 AM, Mike Carey <[email protected]> > > > > wrote: > > > > > > The accounting is just as critical as the chunking - we should do both > > (together). > > > > > > On 1/15/16 9:00 AM, Till Westmann wrote: > > > > I don’t have relevant experience on the subject. But I think that it > > sounds good to avoid arbitrarily long chunks of memory. Especially - > > > > as > > > > Jianfeng wrote - it would be good to be able to a) account for this > > > > memory > > > > and b) to manage it. > > An interesting question for me would be what the overhead of such a > > Vector is compared to a simple Java array and as a result where it > > > > should > > > > be used to replace arrays. (The comparison in [3] only compares > > > > different > > > > Scala collections, but doesn’t look at plain arrays.) > > > > Cheers, > > Till > > > > On 14 Jan 2016, at 22:05, Chen Li wrote: > > > > Before we ask Xi to work on this project, it will be good to know if > > > > other people have seen similar problems and agree with this plan. > > @Till: can you share some tips? > > > > Chen > > > > On Wed, Jan 13, 2016 at 4:27 PM, Jianfeng Jia < > > > > [email protected] > > > > > > wrote: > > > > Hi Devs, > > > > First of all, Xi Zhang is a Master student at UCI wants to work > > > > with > > > > us > > > > for a while. Welcome Xi! > > > > We are thinking of making a Frame-based, memory-bound > > SerializableVector at first. We expect this vector can solve some > > occasionally Java.Heap.OutOfMemory exceptions in Hyracks. > > Though we did a good job on organizing the record-located memory, > > > > the > > > > OOM exception can still happen while operating the auxiliary data > > structure. For example in the sort run generator, instead of moving > > > > record > > > > around we are creating an reference “pointer" array to the original > > > > record. > > > > However, if the record is small and the size of that int array will > > > > be > > > > large, then the OOM exception will occur, which is the case of > > > > issue > > > > [1]. > > > > > > One way to solve this problem is to put auxiliary data structures > > > > into > > > > the memory-bounded frame as well. In general, it will be much > > > > easier > > > > to ask > > > > for multiple small memory blocks than one big chunk of memory. I > > > > guess that > > > > was the same reason why we have “SerializableHashTable” for > > > > HashJoin. It > > > > will be nice to have a more general structure that can be used by > > > > all the > > > > operators. > > > > The Frame based Vector idea is inspired by the Scala Vector[2] > > > > which > > > > looks like a List, but internally it is implemented as a 32-ary > > > > tree. The > > > > performance of it is very stable for variety size of object[3]. It > > > > will > > > > have all the benefits of ArrayList and the LinkedList. In addition, > > > > we can > > > > take the memory usage of the auxiliary structure into the > > > > calculation. We > > > > will work on the detailed design doc later if we are agree on this > > direction. > > > > Any thoughts or suggestions? Thank you! > > > > > > [1] > > > > > > > > > https://code.google.com/p/asterixdb/issues/detail?id=934&can=1&q=last%20straw&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary%20ETA%20Severity > > > > < > > > > > > > > > https://code.google.com/p/asterixdb/issues/detail?id=934&can=1&q=last%20straw&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary%20ETA%20Severity > > > > > > > > [2] https://bitbucket.org/astrieanna/bitmapped-vector-trie < > > https://bitbucket.org/astrieanna/bitmapped-vector-trie> > > [3] > > > > http://danielasfregola.com/2015/06/15/which-immutable-scala-collection/ > > > > < > > > > http://danielasfregola.com/2015/06/15/which-immutable-scala-collection/ > > > > > > > > > > Best, > > > > Jianfeng Jia > > PhD Candidate of Computer Science > > University of California, Irvine > > > > > > > > > > > > > > Best, > > > > Jianfeng Jia > > PhD Candidate of Computer Science > > University of California, Irvine > > > > > > > > > > > > Best, > > > > Jianfeng Jia > > PhD Candidate of Computer Science > > University of California, Irvine > > > > > > > > > > > > Best, > > > > Jianfeng Jia > > PhD Candidate of Computer Science > > University of California, Irvine > > > > > > > -- > > ----------------- > Best Regards > > Jianfeng Jia > Ph.D. Candidate of Computer Science > University of California, Irvine >
