Dear devs,
Xi has implemented the SerializableVector (SVector), which allocates memory incrementally rather than as one long Java int array up front. We ran some tests, but it does not appear to be a good enough answer to the OOM issue caused by long Java array allocations.
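For reference, the core of the chunked design is roughly the following (a minimal sketch; the actual SVector in Xi's repo may differ in naming and details):

    // Minimal sketch of a chunked, append-only int vector: it grows one
    // fixed-size chunk at a time instead of allocating one huge int[].
    public class ChunkedIntVector {
        private static final int CHUNK_SIZE = 1 << 16; // 64K ints per chunk
        private final java.util.List<int[]> chunks = new java.util.ArrayList<>();
        private int size = 0;

        public void append(int value) {
            if (size == chunks.size() * CHUNK_SIZE) {
                chunks.add(new int[CHUNK_SIZE]); // allocate lazily, chunk by chunk
            }
            chunks.get(size / CHUNK_SIZE)[size % CHUNK_SIZE] = value;
            size++;
        }

        public int get(int index) {
            return chunks.get(index / CHUNK_SIZE)[index % CHUNK_SIZE];
        }

        public int size() {
            return size;
        }
    }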
The first experiment tested whether the standalone SVector approach could allocate more memory than the long int array. With a heap size of -Xmx2g on a single machine, the SVector could allocate up to 1280M of data, while the original Java array could allocate up to 1080M, so the SVector gains about 18% more space.
When we moved to AsterixDB, we replaced the int array that stores the (FrameId, TupleId) pairs with the SVector. We then gradually increased "compiler.sortmemory" to grow the in-memory buffer for the external sort. Both master and the SVector branch succeeded in the 1150M case and failed in the 1250M case. Furthermore, master succeeded in the 1200M case but the SVector failed in that case.
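(For clarity, an illustrative view of how the (FrameId, TupleId) pairs map onto the chunked vector; these helper names are hypothetical, not the exact Hyracks code:)

    // Each sort pointer is a (frameId, tupleId) pair stored as two ints,
    // appended to the chunked vector sketched above instead of living in
    // one contiguous int[2 * tupleCount].
    static void appendPointer(ChunkedIntVector ptrs, int frameId, int tupleId) {
        ptrs.append(frameId);
        ptrs.append(tupleId);
    }

    static int frameIdAt(ChunkedIntVector ptrs, int i) { return ptrs.get(2 * i); }
    static int tupleIdAt(ChunkedIntVector ptrs, int i) { return ptrs.get(2 * i + 1); }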
Given that, it seems that allocating the memory chunk by chunk doesn't win much space compared to allocating one big chunk directly. I think we need to take the structure's memory usage into account in the spilling calculation, but we may not need an extra structure to chop up the memory.
Any insights?
If you are interested, the code sample Xi wrote is here:
https://github.com/xixZ/JVMExperiment/tree/master/src/testing
On Tue, Feb 2, 2016 at 11:01 PM, Jianfeng Jia <[email protected]> wrote:
Thanks Ted for the elaborate introduction!
I did some reading about it. Based on my understanding, the main advantage of ValueVector is its column-wise design. With that design, per-record metadata, e.g. indexes or hash keys, can be added directly as a column alongside the existing records, so sorting within one row group should be easy to address. One question that is still not very clear to me is how to generate a sorted result across several row groups?
If you can get somebody to talk more about it, that would be great. Thank you!
On Feb 1, 2016, at 6:03 PM, Ted Dunning <[email protected]> wrote:
So there are several key points about ValueVectors that I can describe, but for the authoritative voice, others would have to speak.
The first point is that in Drill at least (and this is not required) ValueVectors are off-heap. This helps enormously in managing the life-cycle, since vectors can be associated with queries, and when the query ends, all associated vectors can be deallocated quickly. That also allows the memory footprint of Drill to be adjusted both up and down while running.
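(A minimal sketch of that lifecycle idea, assuming direct ByteBuffers; Drill's actual allocator is more sophisticated and frees memory deterministically:)

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch: all off-heap buffers for one query are tracked together,
    // so ending the query releases them as a group.
    public class QueryAllocator {
        private final List<ByteBuffer> buffers = new ArrayList<>();

        public ByteBuffer allocate(int bytes) {
            ByteBuffer buf = ByteBuffer.allocateDirect(bytes); // off-heap
            buffers.add(buf);
            return buf;
        }

        public void close() {
            buffers.clear(); // drop all references at end of query
        }
    }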
Secondly, ValueVectors are stored column-wise, not record-wise. Most manipulations do not require copies. Projection simply requires ignoring some columns. New columns can be added without disturbing old ones. Filtering is done using a row selection bitmap. Sorting is often done using an index column.
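(A tiny self-contained illustration of that last point: sort an index column and leave the data column untouched.)

    import java.util.Arrays;
    import java.util.Comparator;

    public class IndexSortDemo {
        public static void main(String[] args) {
            // Column-wise data: the values themselves never move.
            final long[] column = {42L, 7L, 19L, 3L};

            // Sort an index column instead of the data.
            Integer[] index = {0, 1, 2, 3};
            Arrays.sort(index, Comparator.comparingLong(i -> column[i]));

            for (int i : index) {
                System.out.println(column[i]); // prints 3, 7, 19, 42
            }
        }
    }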
The assumption is also that you will have a row group with something like a hundred thousand rows in it. This means that the size of a single group isn't usually astronomical, although very large data structures in a single row can make the regulation of row-group sizes more difficult.
Of particular interest is the fact that processing code can be generated in Java that avoids almost all object creation, so that most SQL-like queries don't require any object cons'ing at all during the row scans. Moreover, the generated code can even be rendered by the JIT into vectorized low-level instructions, because the access patterns on ValueVectors are so simple.
Nested data structures are handled using invisible marking columns, similar to the way that nesting is marked in Dremel or Parquet. This means that you get uniformly typed pseudo-columns that represent a flattened view of the nested content. Many restructuring operations can be done by simply re-interpreting the nested data without any copying at all.
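(A tiny illustration, using the standard Dremel/Parquet encoding rather than Drill's exact internals: the repeated field a in the three records {a: [1, 2]}, {a: []}, {a: [3]} flattens into one typed value column plus two marking columns:

    value:             1  2  -  3
    repetition level:  0  1  0  0   (0 = starts a new record)
    definition level:  1  1  0  1   (0 = the list is absent/empty)

Scanning the marking columns reconstructs the nesting without ever copying the values.)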
If more detail is desired, we should probably get somebody who is more active in the Drill implementation to talk about how this all works and how it will be extracted into Apache Arrow.
More information can be found here:
https://drill.apache.org/docs/value-vectors/
On Mon, Feb 1, 2016 at 5:08 PM, Jianfeng Jia <[email protected]> wrote:
If I understand correctly, it seems very similar to the IFrame in Hyracks, which is also a container that stores a sequence of records in a ByteBuffer.
I’m not clear about how records are manipulated inside the ValueVectors. In the Hyracks case, we store the pointers (usually a (frameID, recordID) pair) in one Java array and manipulate the pointers instead of the original records. We mainly want to break that one array into multiple small ByteBuffer-based arrays (a rough sketch is below). By doing so, we can reduce the risk of OOM from one large array, and we can also take the memory usage of those pointers into account for the flush decisions. @Ted, could you share some insights about how ValueVectors handle manipulations, e.g. sort, hashing, etc.?
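(To make the intent concrete, a rough sketch with hypothetical names and a fixed, illustrative frame size, not the actual Hyracks code:)

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.List;

    // Sketch: (frameID, recordID) pointers stored across small
    // frame-sized ByteBuffers instead of one big int[].
    public class PointerBuffers {
        private static final int FRAME_BYTES = 32 * 1024;          // illustrative frame size
        private static final int PAIRS_PER_FRAME = FRAME_BYTES / 8; // two ints per pair
        private final List<ByteBuffer> frames = new ArrayList<>();
        private int count = 0;

        public void append(int frameId, int recordId) {
            if (count == frames.size() * PAIRS_PER_FRAME) {
                frames.add(ByteBuffer.allocate(FRAME_BYTES)); // one more small block
            }
            ByteBuffer frame = frames.get(count / PAIRS_PER_FRAME);
            int offset = (count % PAIRS_PER_FRAME) * 8;
            frame.putInt(offset, frameId);
            frame.putInt(offset + 4, recordId);
            count++;
        }

        public int frameIdAt(int i) {
            return frames.get(i / PAIRS_PER_FRAME).getInt((i % PAIRS_PER_FRAME) * 8);
        }

        public int recordIdAt(int i) {
            return frames.get(i / PAIRS_PER_FRAME).getInt((i % PAIRS_PER_FRAME) * 8 + 4);
        }

        // Memory usage is trivially accountable for flush decisions.
        public int memoryUsed() {
            return frames.size() * FRAME_BYTES;
        }
    }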
On Feb 1, 2016, at 3:16 PM, Ted Dunning <[email protected]> wrote:
Have you guys looked at the Drill ValueVectors?
This structure is being spun out as Apache Arrow, with multiple interfaces and language bindings.
On Mon, Feb 1, 2016 at 9:56 AM, Jianfeng Jia <[email protected]> wrote:
Hi,
We plan to implement an append-only array at first. The main reason is that this is how the auxiliary data structures have been used so far, which makes the implementation straightforward.
The tree-structured Vector in Scala saves a lot in the update case mainly because of its immutability requirement: compared to other immutable lists, it avoids unnecessary data copies when updating. Since we only allow in-place updates, the tree design may be overkill for us.
The detailed design doc Xi wrote is here:
https://docs.google.com/document/d/1bs3JBCxmvuJZmBq_gKzUDt0FZdM9j3d3MPAmC29ZoSA/edit?usp=sharing
Any thoughts or comments?
On Jan 18, 2016, at 9:52 AM, Chen Li <[email protected]> wrote:
@Xi and Jianfeng: after we come up with the design, let's share it with the group for approval before the implementation.
Chen
On Fri, Jan 15, 2016 at 11:48 AM, Mike Carey <[email protected]> wrote:
The accounting is just as critical as the chunking - we should do both (together).
On 1/15/16 9:00 AM, Till Westmann wrote:
I don’t have relevant experience on the subject, but I think that it sounds good to avoid arbitrarily long chunks of memory. Especially, as Jianfeng wrote, it would be good to be able to a) account for this memory and b) manage it.
An interesting question for me would be what the overhead of such a Vector is compared to a simple Java array, and as a result, where it should be used to replace arrays. (The comparison in [3] only compares different Scala collections, but doesn’t look at plain arrays.)
Cheers,
Till
On 14 Jan 2016, at 22:05, Chen Li wrote:
Before we ask Xi to work on this project, it will be good to know if
other people have seen similar problems and agree with this plan.
@Till: can you share some tips?
Chen
On Wed, Jan 13, 2016 at 4:27 PM, Jianfeng Jia <[email protected]> wrote:
Hi Devs,
First of all, Xi Zhang is a Master's student at UCI who wants to work with us for a while. Welcome, Xi!
We are thinking of building a Frame-based, memory-bounded SerializableVector at first. We expect this vector to solve some occasional Java.Heap.OutOfMemory exceptions in Hyracks.
Though we did a good job of organizing the memory that holds the records themselves, the OOM exception can still happen while operating on the auxiliary data structures. For example, in the sort run generator, instead of moving records around we create a reference "pointer" array into the original records. However, if the records are small, that int array becomes large and the OOM exception occurs, which is the case in issue [1].
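(To see the scale with illustrative numbers: sorting 2 GB of 50-byte records means roughly 40 million records; at two ints, i.e. 8 bytes, per pointer, the pointer array alone is a single contiguous allocation of roughly 320 MB, which the JVM may fail to find as one piece even when enough total heap is free.)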
One way to solve this problem is to put the auxiliary data structures into memory-bounded frames as well. In general, it is much easier to ask for multiple small memory blocks than for one big chunk of memory. I guess that was the same reason why we have the "SerializableHashTable" for HashJoin. It would be nice to have a more general structure that can be used by all the operators.
The Frame-based Vector idea is inspired by the Scala Vector [2], which looks like a List but internally is implemented as a 32-ary tree. Its performance is very stable across a variety of object sizes [3], and it has the benefits of both ArrayList and LinkedList. In addition, we can take the memory usage of the auxiliary structure into the calculation. We will work on the detailed design doc later if we agree on this direction.
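(For intuition, indexing such a 32-ary tree is just shifts and masks; a minimal two-level sketch, not the full Scala implementation:)

    // Minimal sketch of 32-ary tree indexing (two levels, up to 32 * 32
    // elements); the real structure adds levels as it grows.
    public class TwoLevelVector {
        private final int[][] root = new int[32][];

        public void set(int i, int value) {
            int hi = (i >>> 5) & 31; // which child leaf (i / 32)
            int lo = i & 31;         // slot within the leaf (i % 32)
            if (root[hi] == null) {
                root[hi] = new int[32]; // allocate leaves lazily
            }
            root[hi][lo] = value;
        }

        public int get(int i) {
            return root[(i >>> 5) & 31][i & 31];
        }
    }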
Any thoughts or suggestions? Thank you!
[1] https://code.google.com/p/asterixdb/issues/detail?id=934&can=1&q=last%20straw&colspec=ID%20Type%20Status%20Priority%20Milestone%20Owner%20Summary%20ETA%20Severity
[2] https://bitbucket.org/astrieanna/bitmapped-vector-trie
[3] http://danielasfregola.com/2015/06/15/which-immutable-scala-collection/
--
-----------------
Best Regards
Jianfeng Jia
Ph.D. Candidate of Computer Science
University of California, Irvine