Dear all,
I have a Parquet file with 300,000 columns and 30,000 rows.
If I load such a file into a pandas DataFrame (with pyarrow), it takes
around 100 GB of RAM.

Since I perform pairwise comparisons between columns, I could load the
data in batches of N columns at a time.

So, is it possible to load only a few columns from a Parquet file by
their names? That would save a lot of memory.
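
To make it concrete, this is roughly the kind of call I am hoping is
possible (the file name and column names below are just placeholders):

    import pandas as pd

    # Placeholder path and column names, only to illustrate the idea:
    # read just the named columns instead of the whole 300,000-column file.
    subset = pd.read_parquet(
        "data.parquet",
        columns=["col_A", "col_B"],
        engine="pyarrow",
    )

or the pyarrow equivalent (pyarrow.parquet.read_table with a columns
argument), which I would then call in a loop over chunks of N column
names for the pairwise comparisons.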

Thanks


-- 
Jonathan MERCIER, PhD
Researcher, computational biology
Bioinformatics (LBI)
2, rue Gaston Crémieux
91057 Evry Cedex
Tel: (+33)1 60 87 83 44
Email: [email protected]
