jorisvandenbossche opened a new issue, #39635:
URL: https://github.com/apache/arrow/issues/39635
### Describe the bug, including details regarding any error messages, version, and platform.
This is not a C++ reproducer; it uses very preliminary (not yet merged)
Python bindings to illustrate the problem:
```
>>> import pyarrow as pa
>>> builder = pa.lib.StringViewBuilder()
>>> builder.append("test")
>>> builder.append("very long string that is not inlined")
>>> builder.append(None)
>>> builder.append("test")
>>> arr = builder.finish()
>>> arr
<pyarrow.lib.Array object at 0x7f9a2e1fc4c0>
[
"test",
"very long string that is not inlined",
null,
"test"
]
>>> arr.type
DataType(string_view)
```
Computing the `unique` values of this array returns the missing value as
an empty string instead of a null:
```
>>> arr.unique()
<pyarrow.lib.Array object at 0x7f9a2e45fe20>
[
"test",
"very long string that is not inlined",
""
]
```
I haven't checked the code, but I _assume_ that the kernel is "just" not
taking the validity bitmap into account (the empty string being the value
that would otherwise be masked).
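
To make the suspected failure mode concrete, here is a minimal pure-Python
sketch. It is not the Arrow C++ implementation, and I have not confirmed that
this is what the kernel does. The helpers `unique_ignoring_validity` and
`unique_with_validity` are hypothetical names, and the flat `values` /
`validity` lists only stand in for the real views buffer and validity bitmap:

```python
from typing import List, Optional


def unique_ignoring_validity(values: List[str], validity: List[bool]) -> List[str]:
    """Suspected buggy behaviour: deduplicate raw slots, never reading `validity`."""
    seen: List[str] = []
    for value in values:
        if value not in seen:
            seen.append(value)
    return seen


def unique_with_validity(values: List[str], validity: List[bool]) -> List[Optional[str]]:
    """Expected behaviour: a slot whose validity bit is unset counts as null."""
    seen: List[Optional[str]] = []
    for value, is_valid in zip(values, validity):
        item = value if is_valid else None
        if item not in seen:
            seen.append(item)
    return seen


# Assumed layout: the null slot holds an empty string underneath the mask.
values = ["test", "very long string that is not inlined", "", "test"]
validity = [True, True, False, True]

print(unique_ignoring_validity(values, validity))
print(unique_with_validity(values, validity))
```

The first call reproduces the shape of the output above (an empty string
where the null should be), while the second keeps the null as a distinct
value.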
### Component(s)
C++