[ https://issues.apache.org/jira/browse/ARROW-7213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-7213:
----------------------------------
    Labels: pull-request-available  (was: )

> [Java] Represent a data element of a vector as a tree of ArrowBufPointer
> ------------------------------------------------------------------------
>
>                 Key: ARROW-7213
>                 URL: https://issues.apache.org/jira/browse/ARROW-7213
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>              Labels: pull-request-available
>
> For a fixed/variable width vector, each of its data elements can be
> represented as an ArrowBufPointer object, which represents a contiguous
> memory segment. This makes many tasks easier and more efficient (no
> memory copy is needed): calculating hash codes, comparing values, etc.
> This cannot be achieved for complex vectors, because their values often
> reside in more than one contiguous memory region. However, the contiguous
> memory regions for each data element form a tree-like structure, whose
> leaf nodes are the contiguous memory regions. For example, a data element
> of a struct vector forms a tree whose root corresponds to the struct
> vector and whose child nodes correspond to its child vectors.
> In this issue, we provide a data structure that represents each data element
> of a vector as a tree, whose leaf nodes are ArrowBufPointers representing
> the contiguous memory regions for the data element.
> With this data structure, many tasks also become easier and more efficient:
> calculating hash codes and comparing vector elements (ordering & equality).
> In addition, we can do things that were not possible before, such as
> placing data elements into a hash table or hash set.
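
The tree described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation from the pull request: java.nio.ByteBuffer stands in for ArrowBufPointer, and the names ElementTreeDemo, Node, Leaf, Branch and leaf() are hypothetical. It shows how a struct element whose values live in two separate child buffers can be hashed, compared and placed in a hash set without copying the underlying bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not Arrow's API: ByteBuffer stands in for ArrowBufPointer.
final class ElementTreeDemo {

  /** A node of the element tree: either a leaf memory segment or a branch. */
  interface Node {}

  /** Leaf: one contiguous memory region of a data element. */
  static final class Leaf implements Node {
    final ByteBuffer segment;

    Leaf(ByteBuffer segment) {
      // slice() shares the underlying memory, so no bytes are copied.
      this.segment = segment.slice();
    }

    @Override public boolean equals(Object o) {
      // ByteBuffer.equals compares the remaining bytes of both buffers.
      return o instanceof Leaf && segment.equals(((Leaf) o).segment);
    }

    @Override public int hashCode() {
      return segment.hashCode();
    }
  }

  /** Branch: a complex element (e.g. a struct) whose children are sub-trees. */
  static final class Branch implements Node {
    final List<Node> children;

    Branch(Node... children) {
      this.children = Arrays.asList(children.clone());
    }

    @Override public boolean equals(Object o) {
      return o instanceof Branch && children.equals(((Branch) o).children);
    }

    @Override public int hashCode() {
      return children.hashCode();
    }
  }

  /** Returns a leaf viewing bytes [start, end) of buf without copying. */
  static Leaf leaf(ByteBuffer buf, int start, int end) {
    ByteBuffer view = buf.duplicate();
    view.limit(end);
    view.position(start);
    return new Leaf(view);
  }

  public static void main(String[] args) {
    // Two child buffers of an imagined struct vector: an int child and a
    // character-data child, each holding two elements.
    ByteBuffer ints = ByteBuffer.allocate(8).putInt(0, 7).putInt(4, 7);
    ByteBuffer chars = ByteBuffer.wrap("abab".getBytes(StandardCharsets.UTF_8));

    // Each struct element is a branch over one region per child buffer.
    Node first = new Branch(leaf(ints, 0, 4), leaf(chars, 0, 2));
    Node second = new Branch(leaf(ints, 4, 8), leaf(chars, 2, 4));

    Set<Node> seen = new HashSet<>();
    seen.add(first);
    seen.add(second);

    System.out.println(first.equals(second)); // true: same bytes in every leaf
    System.out.println(seen.size());          // 1: the duplicate element is dropped
  }
}
```

Equality and hashing here are purely structural: two elements match when their trees have the same shape and every pair of leaves holds the same bytes. Ordering comparisons, which the issue also mentions, are left out of this sketch.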