Uniform types in Arrow table columns (pyarrow.array) and the case of python dictionaries

simba nyatsanga Sun, 21 Jan 2018 22:30:06 -0800

Hi Everyone,

I've got two questions that I'd like help with:


1. Pandas and numpy arrays can hold multiple types in one sequence, e.g. a
float and a string, by using dtype=object. From what I gather, Arrow
arrays enforce a uniform type, determined by the type of the first
element encountered in the sequence. This looks like a deliberate choice, and
I'd like to better understand the reason for enforcing this
uniformity. Does making the data structure's type deterministic allow for
efficient pointer arithmetic when reading contiguous blocks, and thus make
reads faster?

2. Pandas and numpy can also hold dictionary elements using
dtype=object, while pyarrow arrays can't. I'd like to understand the
reasoning behind this choice as well.
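
To make the hypothesis in question 1 concrete, here is a minimal sketch of the idea. It uses only Python's standard library (the array and struct modules), not pyarrow, and it assumes that a uniformly typed column is stored as one contiguous buffer of fixed-width values. The helper name read_at is made up for illustration, and this is not a description of how Arrow is actually implemented.

```python
import struct
from array import array

# A uniformly typed column: every value is a C double stored back to back.
values = array("d", [1.5, 2.5, 3.5, 4.5])
raw = values.tobytes()    # the contiguous block of bytes
width = values.itemsize   # bytes per element for typecode "d"


def read_at(buffer, index, item_width=width):
    """Hypothetical helper: fetch element `index` from its computed byte offset."""
    offset = index * item_width
    return struct.unpack_from("d", buffer, offset)[0]


third = read_at(raw, 2)

# A mixed sequence has no single element width, so an offset cannot be
# computed this way. A Python list (like dtype=object) instead stores
# references to separately allocated objects of differing types.
mixed = [1.5, "two", {"three": 3}]
mixed_types = [type(item).__name__ for item in mixed]
```

In the uniform case, locating an element is one multiplication and one read from the buffer. In the mixed case, each element has to be reached through its own reference and its type checked individually.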

Thanks again for taking my questions.

Kind Regards
Simba
