Slightly off-topic, but the recent work on PEP 574 (*) should allow efficient serialization of pandas dataframes (**) with standard pickle (or the pickle5 backport). Experimental support for pickle5 has already been merged in Arrow and NumPy (and pandas uses NumPy as its storage backend). My personal goal is to have the PEP accepted and integrated into Python 3.8.
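To give an idea of what out-of-band data means in practice, here is a minimal sketch using only the standard library. It assumes an interpreter where protocol 5 is available in the stock pickle module (otherwise the pickle5 backport would have to be imported instead), and the variable names are purely illustrative. It does not involve pandas or NumPy at all: a plain bytearray stands in for a column's storage.

```python
import pickle

# A large buffer standing in for a numeric column's storage (illustrative only).
raw = bytearray(1_000_000)

# In-band: the whole buffer is copied into the pickle stream.
in_band = pickle.dumps(pickle.PickleBuffer(raw), protocol=5)

# Out-of-band: the pickle stream only records "a buffer goes here";
# the buffer itself is handed to buffer_callback without being copied.
buffers = []
header = pickle.dumps(
    pickle.PickleBuffer(raw), protocol=5, buffer_callback=buffers.append
)

# The consumer supplies the same buffers, in the same order, when loading.
restored = pickle.loads(header, buffers=buffers)
view = memoryview(restored)

# The restored object still refers to the original memory:
# a write to `raw` is visible through `view`.
raw[0] = 1

print(len(in_band), len(header), len(buffers), view[0])
```

How the buffers travel between producer and consumer (shared memory, a socket, a file) is left to the application; the pickle stream itself stays small.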
Regards

Antoine.

(*) Pickle protocol 5 with out-of-band data:
https://www.python.org/dev/peps/pep-0574/

(**) No-copy semantics for pandas dataframes:
https://github.com/numpy/numpy/pull/12011#issuecomment-428915852


On Thu, 18 Oct 2018 21:22:04 -0700
Robert Nishihara <robertnishih...@gmail.com> wrote:
> How are you serializing the dataframe? If you use *pyarrow.serialize(df)*,
> then each column should be serialized separately and numeric columns will
> be handled efficiently.
>
> On Thu, Oct 18, 2018 at 9:10 PM Mitar <mmi...@gmail.com> wrote:
>
> > Hi!
> >
> > It seems that if a DataFrame contains both numeric and object columns,
> > the whole DataFrame is pickled and not that only object columns are
> > pickled? Is this right? Are there any plans to improve this?
> >
> >
> > Mitar
> >
> > --
> > http://mitar.tnode.com/
> > https://twitter.com/mitar_m
> >
>