Hi Wes,

Great! Thanks for the pointer. From what I gather, this is a fundamental and deliberate design decision. Would I be correct in saying that the smaller memory footprint and faster access of a NumPy array, compared to a Python list, are the reason the conversion is done?

Kind Regards
Simba

On Thu, 18 Jan 2018 at 20:35 Wes McKinney <wesmck...@gmail.com> wrote:

> hi Simba,
>
> Yes -- Arrow list<T> types are converted to NumPy arrays when converting
> back to pandas with to_pandas(...). This conversion happens in C++ code in
>
> https://github.com/apache/arrow/blob/master/cpp/src/arrow/python/arrow_to_pandas.cc#L541
>
> - Wes
>
> On Thu, Jan 18, 2018 at 1:26 PM, simba nyatsanga <simnyatsa...@gmail.com>
> wrote:
>
> > Good day everyone,
> >
> > I noticed what looks like type inference happening after persisting a
> > pandas DataFrame where one of the column values is a list. When I load up
> > the DataFrame again and do df.to_dict(), the value is no longer a list but
> > a numpy array. I dug through functions in the pandas_compat.py to try and
> > figure out at what point the dtype is being applied for that value.
> >
> > I'd like to verify if this is the intended behaviour.
> >
> > Here's an illustration of the behaviour:
> >
> > [image: Screen Shot 2018-01-18 at 15.54.59.png]
> >
> > Kind Regards
> > Simba