[jira] [Commented] (ARROW-16000) [C++][Dataset] Support Latin-1 encoding

Nicola Crane (Jira) Tue, 22 Mar 2022 06:11:06 -0700


    [ 
https://issues.apache.org/jira/browse/ARROW-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510484#comment-17510484
 ]


Nicola Crane commented on ARROW-16000:
--------------------------------------

[~lidavidm] OK, thanks for pointing this out. I've dug deeper into this, and it 
looks like there's a function, {{MakeReencodeInputStream}}, in the R 
package that handles this for single-file CSV reading.  Perhaps that function 
could be reused for Datasets.
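
For illustration only, the general idea of re-encoding an input stream before 
it reaches a UTF-8-only CSV parser can be sketched as below. This is not the 
{{MakeReencodeInputStream}} implementation from the R package; the helper name 
{{reencode_stream}}, its parameters, and the sample data are hypothetical, and 
the sketch uses only the Python standard library.

```python
import codecs
import io


def reencode_stream(raw, src_encoding="latin-1", chunk_size=4096):
    """Yield UTF-8 encoded chunks read from a binary stream in src_encoding.

    Hypothetical helper for illustration; not part of Arrow.
    """
    # An incremental decoder keeps state between chunks, so a multi-byte
    # sequence split across a chunk boundary is still decoded correctly
    # (this matters for source encodings other than Latin-1).
    decoder = codecs.getincrementaldecoder(src_encoding)()
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            break
        yield decoder.decode(chunk).encode("utf-8")
    # Flush anything the decoder is still buffering.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")


# Made-up sample data: a small CSV file stored as Latin-1 bytes.
latin1_csv = "name,city\nJosé,Málaga\n".encode("latin-1")
utf8_csv = b"".join(reencode_stream(io.BytesIO(latin1_csv)))
```

The re-encoded bytes could then be handed to any reader that expects UTF-8. 
Whether the Dataset scanner can wrap each file's stream this way is exactly 
the open question in this issue.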

> [C++][Dataset] Support Latin-1 encoding
> ---------------------------------------
>
>                 Key: ARROW-16000
>                 URL: https://issues.apache.org/jira/browse/ARROW-16000
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Nicola Crane
>            Priority: Major
>
> In ARROW-15992 a user reports problems reading files with Latin-1 
> encoding.  I had a look through the docs for the Dataset API and I 
> don't think this is currently supported.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
