[ 
https://issues.apache.org/jira/browse/ARROW-17339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576863#comment-17576863
 ] 

Antoine Pitrou commented on ARROW-17339:
----------------------------------------

Also, for the record, the Go code for this is as follows:
{code:go}

        for i, c := range codes {
                if offsets[c] == -1 {
                        // Offsets are guaranteed to be increasing according to
                        // the spec, so the first offset we find for a child is
                        // its initial offset and becomes the "0" for this child.
                        offsets[c] = unshiftedOffsets[i]
                        shiftedOffsets[i] = 0
                } else {
                        shiftedOffsets[i] = unshiftedOffsets[i] - offsets[c]
                }
                lengths[c] = maxI32(lengths[c], shiftedOffsets[i]+1)
        }
{code}
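
For anyone who wants to run the loop in isolation, here is a minimal, self-contained sketch of it. The function name shiftDenseUnionOffsets, its signature, and the sample data are hypothetical and not taken from the Arrow Go sources; it also assumes the type codes have already been mapped to child indices in the range [0, numChildren).

```go
package main

import "fmt"

// shiftDenseUnionOffsets is a hypothetical helper (not part of the Arrow Go
// API). It rebases the offsets of a sliced dense union so that each child's
// offsets start at 0, relying on per-child offsets being non-decreasing.
// It returns the rebased offsets, the first offset seen for each child
// (-1 if the child never appears), and the length each child must cover.
func shiftDenseUnionOffsets(codes []int8, unshifted []int32, numChildren int) (shifted, starts, lengths []int32) {
	shifted = make([]int32, len(unshifted))
	starts = make([]int32, numChildren)
	lengths = make([]int32, numChildren)
	for c := range starts {
		starts[c] = -1 // sentinel: child not seen yet
	}
	for i, c := range codes {
		if starts[c] == -1 {
			// First occurrence of this child: its offset becomes the new 0.
			starts[c] = unshifted[i]
			shifted[i] = 0
		} else {
			shifted[i] = unshifted[i] - starts[c]
		}
		// Track how many child values are referenced so far.
		if n := shifted[i] + 1; n > lengths[c] {
			lengths[c] = n
		}
	}
	return shifted, starts, lengths
}

func main() {
	// Made-up slice of a dense union with two children.
	codes := []int8{0, 1, 0, 1, 1}
	unshifted := []int32{2, 5, 3, 6, 7}
	shifted, starts, lengths := shiftDenseUnionOffsets(codes, unshifted, 2)
	fmt.Println(shifted, starts, lengths) // prints: [0 0 1 1 2] [2 5] [2 3]
}
```

The single pass only works because of the monotonicity guarantee: if offsets for a child could appear in any order, the first offset seen would not necessarily be the smallest, and a separate pass to find each child's minimum would be needed.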

> [C++] Simplify IPC writer for dense unions
> ------------------------------------------
>
>                 Key: ARROW-17339
>                 URL: https://issues.apache.org/jira/browse/ARROW-17339
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Antoine Pitrou
>            Priority: Minor
>              Labels: good-first-issue, good-second-issue
>
> In ARROW-10580 we fixed the Arrow C++ implementation so that dense union 
> offsets are always (non-strictly) monotonic for a given child, as mandated by 
> the spec.
> The IPC writer implementation, however, still assumes that dense union 
> offsets may be in any order:
> https://github.com/apache/arrow/blob/5719576c611929dd790f7f8a1ae3169a8f96f7f1/cpp/src/arrow/ipc/writer.cc#L476-L485
> This can probably be simplified, making it slightly less costly to emit a 
> sliced union array.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
