Hi,

I'm investigating https://issues.apache.org/jira/browse/ARROW-12513.
While debugging, I found that when we create dictionary_ (see
https://github.com/apache/arrow/blob/master/cpp/src/arrow/array/array_dict.cc#L111)
we lose the information about null_count:
data_->null_count != 0, but data_->dictionary->null_count == 0.
As a result, we later return an array without correct statistics.
My question: is this the intended behaviour? Or do we need to return
an array with statistics? Or should these statistics have been added
to data_->dictionary somewhere else?
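
To illustrate what I mean, here is a simplified, self-contained sketch
(C++17). It does not use Arrow's actual classes: ToyArrayData and
DictionaryEncode are made-up names, and the sketch assumes that nulls are
recorded only on the indices while the dictionary holds just the distinct
non-null values, which is my understanding of the situation above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for arrow::ArrayData, reduced to the fields discussed here.
struct ToyArrayData {
  int64_t length = 0;
  int64_t null_count = 0;
  std::shared_ptr<ToyArrayData> dictionary;  // set only for dictionary-encoded data
};

// Toy dictionary encoding: nulls are counted on the outer (indices) data,
// and the dictionary receives only distinct non-null values.
std::shared_ptr<ToyArrayData> DictionaryEncode(
    const std::vector<std::optional<std::string>>& values) {
  auto encoded = std::make_shared<ToyArrayData>();
  encoded->length = static_cast<int64_t>(values.size());

  std::vector<std::string> distinct;
  for (const auto& v : values) {
    if (!v.has_value()) {
      ++encoded->null_count;  // the null stays with the indices
      continue;
    }
    if (std::find(distinct.begin(), distinct.end(), *v) == distinct.end()) {
      distinct.push_back(*v);
    }
  }

  auto dict = std::make_shared<ToyArrayData>();
  dict->length = static_cast<int64_t>(distinct.size());
  dict->null_count = 0;  // the dictionary itself never saw the nulls
  encoded->dictionary = dict;
  return encoded;
}

// Shows the mismatch: the outer data has nulls, its dictionary has none.
void Example() {
  auto encoded = DictionaryEncode(
      {std::string("a"), std::nullopt, std::string("b"), std::string("a")});
  assert(encoded->null_count == 1);
  assert(encoded->dictionary->null_count == 0);
}
```

In this toy model, anything that computes statistics from the dictionary
alone would report a null count of 0 even though the column contains a null.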

I wrote a more detailed explanation in the Jira issue.

-- 
Best regards,
Kirill Lykov
