[
https://issues.apache.org/jira/browse/ARROW-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972949#comment-15972949
]
Johan Mabille commented on ARROW-602:
-------------------------------------
Hi [~wesmckinn],
So, as I previously said, this approach would make sense if the wrappers that
provide an STL-like interface to the array classes are independent, that is, there
is no inheritance relationship between them. This way they would have pure value
semantics, just like STL containers. However, some questions:
- How do we handle mutating methods on non-mutable arrays? Methods such as
resize must be in the interface even when the array is not mutable (since this
information cannot be known at compile time). Would you consider throwing an
exception acceptable behavior?
- Why use a flag for handling mutability instead of relying on
const-correctness only? (This is more for my understanding of the design.)
- Why do you want to keep arrays as small as possible? (Again, this is more
for my understanding of the design.)
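To make the first question concrete, here is a minimal sketch of what "throwing on mutation" could look like. Everything in it is hypothetical (the class name Int32ArrayView, the is_mutable flag, the helper resize_throws); it is not Arrow's API, and a std::vector stands in for the underlying buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical value-semantic wrapper; names are illustrative only.
class Int32ArrayView {
public:
    Int32ArrayView(std::vector<std::int32_t> values, bool is_mutable)
        : values_(std::move(values)), mutable_(is_mutable) {}

    std::size_t size() const { return values_.size(); }
    bool is_mutable() const { return mutable_; }

    // Mutability is a run-time flag, so resize has to exist on every
    // instance; the only place left to reject the call is at run time.
    void resize(std::size_t n) {
        if (!mutable_) {
            throw std::logic_error("resize called on a non-mutable array");
        }
        values_.resize(n);
    }

private:
    std::vector<std::int32_t> values_;  // stand-in for the real buffer
    bool mutable_;
};

// Helper for illustration: reports whether resize was rejected.
inline bool resize_throws(Int32ArrayView& a, std::size_t n) {
    try {
        a.resize(n);
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

With const-correctness only, the same misuse would be a compile error on a const object, which is the trade-off behind the second question.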
About the return type for supporting missing values, I am not sure that
std::optional is what we need. I assume that we want to read the flag telling
whether the value is missing from the null bitmap, and that we do not want to
copy that bitmap, for performance reasons. In the case of a mutable array, if
a missing value becomes available (or vice versa), the corresponding bit in
the bitmap must be updated. However, std::optional stores its own flag and
cannot refer to a bit held elsewhere, so we may end up writing a dedicated type
for missing values (a type that acts as a proxy reference to the bit in the
bitmap). This type can provide an interface similar to that of std::optional.
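As an illustration of the kind of dedicated type I have in mind, here is a rough sketch. The names NullableInt32Ref and NullableInt32Array are made up for this example, and std::vector stands in for the value and bitmap buffers. It assumes the Arrow bitmap convention (a set bit means the value is present, least-significant bit first).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical proxy: refers to a value slot and to its bit in the bitmap,
// so nothing is copied and writes update the bitmap in place.
class NullableInt32Ref {
public:
    NullableInt32Ref(std::int32_t& value, std::uint8_t& bitmap_byte, std::uint8_t mask)
        : value_(value), byte_(bitmap_byte), mask_(mask) {}

    // std::optional-like read interface.
    bool has_value() const { return (byte_ & mask_) != 0; }
    explicit operator bool() const { return has_value(); }
    std::int32_t value() const {
        if (!has_value()) throw std::logic_error("value is missing");
        return value_;
    }
    std::int32_t value_or(std::int32_t fallback) const {
        return has_value() ? value_ : fallback;
    }

    // Assigning a value makes it available: store it and set the bit.
    NullableInt32Ref& operator=(std::int32_t v) {
        value_ = v;
        byte_ = static_cast<std::uint8_t>(byte_ | mask_);
        return *this;
    }
    // Marking the value missing clears the bit; the slot is left untouched.
    void reset() { byte_ = static_cast<std::uint8_t>(byte_ & ~mask_); }

private:
    std::int32_t& value_;
    std::uint8_t& byte_;
    std::uint8_t mask_;
};

// Hypothetical mutable array; all slots start out missing.
class NullableInt32Array {
public:
    explicit NullableInt32Array(std::size_t n)
        : values_(n, 0), validity_((n + 7) / 8, 0) {}

    std::size_t size() const { return values_.size(); }

    NullableInt32Ref operator[](std::size_t i) {
        return NullableInt32Ref(values_[i], validity_[i / 8],
                                static_cast<std::uint8_t>(1u << (i % 8)));
    }

    std::size_t null_count() const {
        std::size_t nulls = 0;
        for (std::size_t i = 0; i < values_.size(); ++i) {
            if ((validity_[i / 8] & (1u << (i % 8))) == 0) ++nulls;
        }
        return nulls;
    }

private:
    std::vector<std::int32_t> values_;
    std::vector<std::uint8_t> validity_;
};
```

Usage would look like arr[3] = 42; to make a value available and arr[3].reset(); to mark it missing again, with the bitmap kept in sync in both cases. This is the same trick as std::vector<bool>::reference, extended to carry the value as well.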
> C++: Provide iterator access to primitive elements inside a
> Column/ChunkedArray
> -------------------------------------------------------------------------------
>
> Key: ARROW-602
> URL: https://issues.apache.org/jira/browse/ARROW-602
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Uwe L. Korn
> Labels: beginner, newbie
>
> Given a ChunkedArray, an Arrow user must currently iterate over all its
> chunks and then cast them to their types to extract the primitive memory
> regions to access the values. A convenient way to access the underlying
> values would be to offer a function that takes a ChunkedArray and returns a
> C++ iterator over all elements.
> While this may not be the most performant way to access the underlying data,
> it should have sufficient performance and adds a convenience layer for new
> users.