Sure - you can do it even in pandas via the columns
parameter: pd.read_parquet('f.pq', columns=['A', 'B'])
Arrow is more useful if you need to do some conversion or filtering.
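If you want to stay in Arrow until the very end, a minimal sketch with
pyarrow.parquet (the file name 'f.pq' and column names 'A', 'B' are just
placeholders):

    import pyarrow.parquet as pq

    # List column names without reading any data; handy for iterating
    # over column pairs.
    names = pq.read_schema('f.pq').names

    # Read only the requested columns; the others are never loaded.
    table = pq.read_table('f.pq', columns=['A', 'B'])

    # Convert just those columns to a pandas DataFrame if needed.
    df = table.to_pandas()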
BR,
Jacek
On Fri, 12 Feb 2021 at 15:21, jonathan mercier <[email protected]>
wrote:
>
> Dear,
> I have a Parquet file with 300 000 columns and 30 000 rows.
> If I load such a file into a pandas DataFrame (with pyarrow), it takes
> around 100 GB of RAM.
>
> As I perform a pairwise comparison between columns, I could load the
> data N columns at a time.
>
> So is it possible to load only a few columns from a Parquet file by
> their names? That would save some memory.
>
> Thanks
>
>
> --
> Researcher computational biology
> PhD, Jonathan MERCIER
>
> Bioinformatics (LBI)
> 2, rue Gaston Crémieux
> 91057 Evry Cedex
>
>
> Tel: (+33)1 60 87 83 44
> Email: [email protected]
>
>
>