[ https://issues.apache.org/jira/browse/ARROW-10208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17218278#comment-17218278 ]
Antoine Pitrou commented on ARROW-10208:
----------------------------------------
OK, so the actual problem is that the split kernel was reusing the input's null
bitmap as-is, even when the input is sliced.
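
To illustrate why reusing a bitmap on sliced input goes wrong, here is a minimal
standalone sketch. It is a toy model using only the standard library, not Arrow's
actual code or API: GetBit, SetBit and CopyBitmapSlice are hypothetical helpers
invented for this example, and the bit layout (least-significant bit first) is
assumed to mirror Arrow's validity bitmaps.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy validity bitmap: bit i set means element i is valid (LSB-first).
// These helpers are hypothetical and only model the idea.
inline bool GetBit(const std::vector<uint8_t>& bits, int64_t i) {
  return ((bits[static_cast<std::size_t>(i / 8)] >> (i % 8)) & 1) != 0;
}

inline void SetBit(std::vector<uint8_t>& bits, int64_t i) {
  bits[static_cast<std::size_t>(i / 8)] |= static_cast<uint8_t>(1u << (i % 8));
}

// Copy `length` validity bits starting at `offset` into a fresh bitmap whose
// bit 0 corresponds to the first element of the slice.
inline std::vector<uint8_t> CopyBitmapSlice(const std::vector<uint8_t>& bits,
                                            int64_t offset, int64_t length) {
  assert(offset >= 0 && length >= 0);
  std::vector<uint8_t> out(static_cast<std::size_t>((length + 7) / 8), 0);
  for (int64_t i = 0; i < length; ++i) {
    if (GetBit(bits, offset + i)) SetBit(out, i);
  }
  return out;
}
```

Take the parent array ["foo bar", "foo", null] with validity bits 1, 1, 0 and
slice it at offset 1 with length 2, giving ["foo", null]. If a freshly built
output (which starts at offset 0) simply points at the parent's bitmap, its
element 1 reads parent bit 1 and appears valid, so the null is lost. Reading the
bits through the slice offset, as CopyBitmapSlice(bitmap, 1, 2) does, yields
1, 0 as expected.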
> [C++] String split kernels do not propagate nulls correctly on sliced input
> ---------------------------------------------------------------------------
>
> Key: ARROW-10208
> URL: https://issues.apache.org/jira/browse/ARROW-10208
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Maarten Breddels
> Assignee: Antoine Pitrou
> Priority: Minor
> Fix For: 3.0.0
>
>
> I am not sure whether this is a test-specific issue or valid behavior, but while
> writing a test in [https://github.com/apache/arrow/pull/8271],
> the following test fails:
> {code:java}
> this->CheckUnary("split_pattern", R"(["foo bar", "foo", null])",
>                  list(this->type()),
>                  R"([["foo", "bar"], ["foo"], null])", &options);
> {code}
> with the following output:
> {code:java}
> Failed:
> Got:
> [
> [
> [
> "foo",
> "bar"
> ]
> ],
> [
> [
> "foo"
> ],
> null
> ]
> ]
> Expected:
> [
> [
> [
> "foo",
> "bar"
> ]
> ],
> [
> [
> "foo"
> ],
> null
> ]
> ]
> {code}
> While the printed outputs are the same, the arrays compare as unequal.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)