[ 
https://issues.apache.org/jira/browse/ORC-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Loncaric reassigned ORC-1191:
------------------------------------

    Assignee: Martin Loncaric

> Benchmark Taxi CSV Dataset No Longer Exists
> -------------------------------------------
>
>                 Key: ORC-1191
>                 URL: https://issues.apache.org/jira/browse/ORC-1191
>             Project: ORC
>          Issue Type: Bug
>            Reporter: Martin Loncaric
>            Assignee: Martin Loncaric
>            Priority: Minor
>
> New York TLC has replaced their CSV dataset with a Parquet version, so we 
> should switch to that.
> Since 5/12, the NYC Taxi dataset used in the benchmarks no longer exists as 
> CSVs; it has been replaced with Parquet files:
> https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page
> "On 05/13/2022, we are making the following changes to trip record files: 
> All files will be stored in the Parquet format. Please see the ‘Working With 
> Parquet Format’ under the Data Dictionaries and MetaData section."



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
