[
https://issues.apache.org/jira/browse/ARROW-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576863#comment-17576863
]
Antoine Pitrou commented on ARROW-17339:
----------------------------------------
Also, for the record, the Go code for this is as follows:
{code:go}
for i, c := range codes {
	if offsets[c] == -1 {
		// offsets are guaranteed to be increasing according to the spec
		// so the first offset we find for a child is the initial offset
		// and will become the "0" for this child.
		offsets[c] = unshiftedOffsets[i]
		shiftedOffsets[i] = 0
	} else {
		shiftedOffsets[i] = unshiftedOffsets[i] - offsets[c]
	}
	lengths[c] = maxI32(lengths[c], shiftedOffsets[i]+1)
}
{code}
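To see what the loop above computes, here is a minimal self-contained sketch of the same idea. The function name `shiftDenseUnionOffsets`, its signature, and the sample data are hypothetical and not taken from the Arrow Go code base. It also assumes, for simplicity, that type codes are plain child indices in the range 0..numChildren-1.

```go
package main

import "fmt"

// shiftDenseUnionOffsets is a hypothetical helper that rebases the offsets of
// a sliced dense union so that every child's offsets start at 0.
// Assumption: type codes are child indices in [0, numChildren).
func shiftDenseUnionOffsets(codes []int8, unshifted []int32, numChildren int) (shifted, starts, lengths []int32) {
	shifted = make([]int32, len(unshifted))
	starts = make([]int32, numChildren)  // first offset seen per child, -1 if none
	lengths = make([]int32, numChildren) // number of child values the slice needs

	for c := range starts {
		starts[c] = -1
	}

	for i, c := range codes {
		if starts[c] == -1 {
			// Offsets are non-decreasing per child, so the first offset
			// seen for a child is its smallest one in this slice.
			starts[c] = unshifted[i]
		}
		shifted[i] = unshifted[i] - starts[c]
		if n := shifted[i] + 1; n > lengths[c] {
			lengths[c] = n
		}
	}
	return shifted, starts, lengths
}

func main() {
	// A made-up slice of a dense union with two children.
	codes := []int8{0, 1, 0, 1, 1}
	unshifted := []int32{2, 5, 3, 6, 7}

	shifted, starts, lengths := shiftDenseUnionOffsets(codes, unshifted, 2)
	fmt.Println(shifted, starts, lengths)
	// prints: [0 0 1 1 2] [2 5] [2 3]
}
```

Each child can then be sliced to `[starts[c], starts[c]+lengths[c])` before writing. A child that never appears in the slice keeps a start of -1 and a length of 0.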
> [C++] Simplify IPC writer for dense unions
> ------------------------------------------
>
> Key: ARROW-17339
> URL: https://issues.apache.org/jira/browse/ARROW-17339
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Antoine Pitrou
> Priority: Minor
> Labels: good-first-issue, good-second-issue
>
> In ARROW-10580 we fixed the Arrow C++ implementation so that dense union
> offsets are always (non-strictly) monotonic for a given child, as mandated by
> the spec.
> The IPC writer implementation, however, still assumes that dense union
> offsets may be in any order:
> https://github.com/apache/arrow/blob/5719576c611929dd790f7f8a1ae3169a8f96f7f1/cpp/src/arrow/ipc/writer.cc#L476-L485
> This can probably be simplified, making it slightly less costly to emit a
> sliced union array.