seddonm1 commented on pull request #9001:
URL: https://github.com/apache/arrow/pull/9001#issuecomment-750452242


   > Thanks for the PR @seddonm1!
   > Wouldn't it be better to keep the CSV reader simple and do the trim after 
loading the CSV (in Arrow/DataFusion)? There it could be a simple `trim(col)` 
too. I think the use case is relatively limited where you actually want to apply 
trimming on _all_ columns.
   > 
   > I'm also hesitant to depend on the "more advanced" csv crate features. I 
think at some point it makes sense to use `csv_core` instead (for better 
performance).
   
   Yes, it may be, so I think this is up for discussion.
   
   I have added the `trim` function in 
https://github.com/apache/arrow/pull/8966 but the actual flow is read csv -> 
trim -> cast.
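
To illustrate why the order read csv -> trim -> cast matters, here is a minimal sketch using only the Rust standard library. `parse_line` is a hypothetical helper, not part of the arrow csv reader or DataFusion; it splits naively on commas (no quoting support), and representing a failed cast as `None` is an assumption made for the example.

```rust
// Hypothetical helper for illustration only; not the arrow csv reader API.
// Splits one comma-separated line and casts each field to i64.
fn parse_line(line: &str, trim: bool) -> Vec<Option<i64>> {
    line.split(',')
        .map(|field| {
            // Step 2 of the flow: optionally trim surrounding whitespace.
            let field = if trim { field.trim() } else { field };
            // Step 3 of the flow: cast. Rust's integer parsing rejects
            // surrounding whitespace, so an untrimmed " 2 " fails here.
            // A failed cast is represented as None (an assumption).
            field.parse::<i64>().ok()
        })
        .collect()
}
```

Calling `parse_line("1, 2 ,3", false)` loses the middle value because the cast sees `" 2 "`, while `parse_line("1, 2 ,3", true)` keeps all three. Whether that trim belongs in the reader or in a later `trim(col)` step is the question under discussion.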
   



