[GitHub] [arrow] paiforsyth opened a new issue, #34637: Support separate null_values per column in pyarrow.csv.ConvertOptions

via GitHub Sun, 19 Mar 2023 12:51:49 -0700


paiforsyth opened a new issue, #34637:
URL: https://github.com/apache/arrow/issues/34637


   ### Describe the enhancement requested
   
   I have a CSV dataset in which nulls are encoded differently in different 
columns. When reading CSV data with pyarrow, it appears that the same list of 
null_values must be used for all columns (see 
[ConvertOptions](https://arrow.apache.org/docs/python/generated/pyarrow.csv.ConvertOptions.html)).
 This concerns me because a value used as a null code in one column ("9999", 
for example) may be a valid non-null value in another column. In pandas' 
[read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html), 
it is possible to pass a dictionary (via `na_values`) specifying different 
null codes for different columns. Could this functionality be added to pyarrow?
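
   For reference, a minimal sketch of the pandas behavior described above. The column names, sample data, and null codes are made up for illustration and are not from a real dataset; only the per-column `na_values` dictionary is the point.

```python
import io

import pandas as pd

# Hypothetical sample data: "9999" is a null code in `reading`
# but a valid value in `elevation`, where "-1" is the null code.
csv_text = (
    "station,elevation,reading\n"
    "A,9999,12\n"
    "B,150,9999\n"
    "C,-1,7\n"
)

# Per-column null codes (illustrative values only).
null_codes = {"elevation": ["-1"], "reading": ["9999"]}

# pandas accepts a dict for na_values, applied column by column.
df = pd.read_csv(io.StringIO(csv_text), na_values=null_codes)

print(df)
```

   Here "9999" stays a regular value in `elevation` while being treated as null in `reading`, which is the behavior I would like to be able to express in pyarrow.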
   
   ### Component(s)
   
   Other


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
