Hi Pierre,
While the Arrow format doesn't mandate particular values under null
slots, the Arrow C++ implementation should not create "undefined" values
(for security reasons: failing to initialize data could reveal
confidential information that was previously at the same memory location).
Hi Wes,
I guess the answer is that it is not fixed...
At this line,
https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/int_util.cc#L409,
we have a utility for transposing a dict from its original indices to a new
mapping that is used when unifying dictionaries. I use that when
concate
The Arrow format does not indicate any particular value "underneath" a
null so I'm not sure what can be "fixed" here. What precisely are you
doing with the data that is failing?
On Fri, Feb 28, 2020 at 4:57 PM Pierre Belzile wrote:
When I recover an array of type dictionary int32 -> string from a Parquet
file and that array has null positions, it seems that the indices that
correspond to null positions are undefined, i.e. not guaranteed to be 0.
This causes a crash when using a transpose map when trying to read the
transpose
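
For illustration only, here is a minimal sketch of the failure mode discussed in this thread and one defensive way around it. It does not use the Arrow API: `TransposeSkippingNulls` is a hypothetical helper, and the validity bitmap is modelled as a plain `std::vector<bool>`, which is an assumption made to keep the example self-contained. The idea is simply that the transpose map is never consulted for a null slot, so whatever value sits underneath a null cannot index out of bounds.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not part of the Arrow C++ API.
// Remaps dictionary indices through transpose_map. The value stored under a
// null slot is ignored, and 0 is written to the output in its place.
std::vector<int32_t> TransposeSkippingNulls(
    const std::vector<int32_t>& indices,
    const std::vector<bool>& valid,  // assumption: one flag per slot
    const std::vector<int32_t>& transpose_map) {
  assert(indices.size() == valid.size());
  std::vector<int32_t> out(indices.size(), 0);
  for (std::size_t i = 0; i < indices.size(); ++i) {
    if (!valid[i]) {
      continue;  // null slot: never read transpose_map with this index
    }
    const int32_t idx = indices[i];
    // A bad index in a valid slot is a real error, so report it.
    if (idx < 0 || static_cast<std::size_t>(idx) >= transpose_map.size()) {
      throw std::out_of_range("dictionary index out of range in a valid slot");
    }
    out[i] = transpose_map[static_cast<std::size_t>(idx)];
  }
  return out;
}
```

With indices `{0, 123456, 1}`, validity `{true, false, true}` and a transpose map `{2, 0}`, the garbage index in the null slot is skipped rather than used to read past the end of the map. A transpose that reads the map unconditionally would instead access memory out of bounds for that slot.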