[ 
https://issues.apache.org/jira/browse/ARROW-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17093435#comment-17093435
 ] 

Sascha Hofmann commented on ARROW-7251:
---------------------------------------

For us, support for different string encodings would be amazing. That being 
said, I admit other encodings are rare and dying out, but we stumble upon them 
once in a while. Of those, I don't know how many use a BOM to identify their 
encoding. We haven't actually tried it, but we might use pandas, as mentioned 
above, in cases where a file has a BOM other than the UTF-8 one (see comment 
above). I am not sure how you did the CSV reading in pandas, but I assume it 
might not be worth going through it again. In the end, it might be best to 
force people to use UTF-8. 
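
As a rough illustration of the BOM check discussed above, here is a minimal 
sketch that only uses the Python standard library. The helper names 
(detect_bom_encoding, to_utf8) are hypothetical and not part of Arrow or 
pandas, and the fallback to UTF-8 when no BOM is found is an assumption, 
since files without a BOM cannot be identified this way.

```python
import codecs

# Known BOMs and the Python codec that consumes them.
# UTF-32 is listed before UTF-16 because the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_bom_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return a codec name based on the leading BOM, or `default` if none."""
    for bom, name in _BOMS:
        if raw.startswith(bom):
            return name
    return default  # assumption: no BOM means the data is already UTF-8


def to_utf8(raw: bytes) -> bytes:
    """Decode using the BOM-derived codec and re-encode as UTF-8 without BOM."""
    return raw.decode(detect_bom_encoding(raw)).encode("utf-8")
```

The resulting bytes could then presumably be handed to a UTF-8-only CSV 
reader, for example by wrapping them in pyarrow.BufferReader and passing that 
to pyarrow.csv.read_csv. This reads the whole file into memory, so it is only 
a workaround for small files, not a replacement for native encoding support.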

> [Python] Open CSVs with different encodings
> -------------------------------------------
>
>                 Key: ARROW-7251
>                 URL: https://issues.apache.org/jira/browse/ARROW-7251
>             Project: Apache Arrow
>          Issue Type: Wish
>          Components: Python
>            Reporter: Sascha Hofmann
>            Priority: Major
>
> I would like to open UTF-16 encoded CSVs (among others) without 
> preprocessing in, let's say, pandas. Is there maybe a way to do this already?



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
