Hello.
I am using the pyarrow csv module to read a CSV file:
from pyarrow import csv

fn = '/home/srruj/cars.csv'
# The five columns I expect in the resulting table.
column_names = ['year', 'make', 'model', 'comment', 'blank']
read_options = csv.ReadOptions(column_names=column_names)
# Keep all five columns, filling any that are missing with nulls.
convert_options = csv.ConvertOptions(include_columns=column_names,
                                     include_missing_columns=True,
                                     strings_can_be_null=True)
table = csv.read_csv(fn, read_options=read_options,
                     convert_options=convert_options)
table
I am getting the following error:
Csv parse error: Expected 5 columns, got 3
This is how the file looks:
year,make,model,comment,blank
"2012","Tesla","S","No comment",
1997,Ford,E350,"Go get one now they are going fast",
2015,Chevy,Volt
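To show where the column count differs, here is a minimal sketch that counts the fields on each line using Python's standard-library csv module (imported as std_csv so it does not clash with pyarrow's csv). The sample text is inlined from the file above rather than read from disk, and field_counts and pad_rows are illustrative helper names of my own, not pyarrow functions. Padding the short row is only an assumption about the result I would like to get, not something pyarrow does here.

```python
import csv as std_csv
import io

# Sample contents copied from the file shown above (inlined for illustration).
SAMPLE = '''year,make,model,comment,blank
"2012","Tesla","S","No comment",
1997,Ford,E350,"Go get one now they are going fast",
2015,Chevy,Volt
'''


def field_counts(text):
    """Return the number of fields on each line, header included."""
    return [len(row) for row in std_csv.reader(io.StringIO(text))]


def pad_rows(text, width):
    """Return all rows, with short rows padded by empty strings up to `width`."""
    return [row + [""] * (width - len(row))
            for row in std_csv.reader(io.StringIO(text))]


print(field_counts(SAMPLE))     # [5, 5, 5, 3]
print(pad_rows(SAMPLE, 5)[-1])  # ['2015', 'Chevy', 'Volt', '', '']
```

Only the last line has 3 fields instead of 5, which matches the numbers in the error message.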
I am able to read this file from Spark using spark.read.csv(..), but not with
pyarrow.
Can you please help?
Thanks
Sricheta.