[jira] [Updated] (ARROW-4143) [Python] Skip rows while reading parquet file

Wes McKinney (JIRA) Thu, 07 Feb 2019 21:06:28 -0800


     [ 
https://issues.apache.org/jira/browse/ARROW-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Wes McKinney updated ARROW-4143:
--------------------------------
    Summary: [Python] Skip rows while reading parquet file  (was: Skip rows 
while reading parquet file)

> [Python] Skip rows while reading parquet file
> ---------------------------------------------
>
>                 Key: ARROW-4143
>                 URL: https://issues.apache.org/jira/browse/ARROW-4143
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Developer Tools
>            Reporter: Sanchit
>            Priority: Minor
>              Labels: newbie
>
> Is there any functionality in pyarrow that allows reading a file partially, 
> for example only the first 10 rows of a parquet file? 
> I ran into this while trying the following:
> `df = pd.read_parquet(path='filepath', nrows=10)`  # raises an error
> I wanted to read just the first 10 rows into a pandas DataFrame using 
> read_parquet (which uses pyarrow as one of its engines to read 
> parquet files). Since the parquet file is very large, is there any 
> functionality we could add to the engine so that only the first n rows 
> are read?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
