Not sure about the conversion, but regarding self_destruct: the problem is that
it only provides memory savings in limited situations that are hard to identify
from the outside. When enabled, PyArrow discards its reference to each array
right after converting it, and if there are no other references, that frees the
array. But different arrays may be backed by the same underlying memory buffer
(this is generally true for data read via IPC or Flight, for example), so
freeing the array won't actually free any memory as long as the buffer is still
alive. It only saves memory if you ensure each array is backed by its own
memory allocation (which right now would generally mean copying the data up
front!).

On Thu, Aug 31, 2023, at 11:11, Li Jin wrote:
> Hi,
>
> I am working on some code where I have a list of pa.Arrays and I am
> creating a pandas.DataFrame from it. I also want to set the index of the
> pd.DataFrame to be the first Array in the list.
>
> Currently I am doing sth like:
> "
> df = pa.Table.from_arrays(arrs, names=input_names).to_pandas()
> df.set_index(input_names[0], inplace=True)
> "
>
> I am curious if this is the best I can do? Also I wonder if it is still
> worthwhile to use the "self_destruct=True" option here (I noticed it has
> been EXPERIMENTAL for a long time)
>
> Thanks!
> Li
