[
https://issues.apache.org/jira/browse/ARROW-16000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17510484#comment-17510484
]
Nicola Crane commented on ARROW-16000:
--------------------------------------
[~lidavidm] OK, thanks for pointing this out. I've dug deeper, and it looks
like there's a function in the R package, {{MakeReencodeInputStream}}, which
handles this for single-file CSV reading. It might be possible to reuse that
function for Datasets.
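
For illustration only, here is a minimal sketch of the general idea of re-encoding an input stream to UTF-8 before it reaches a CSV parser. It is not the R package's implementation: it uses only the Python standard library, and the name reencode_to_utf8 and its parameters are hypothetical. The assumption is that "handles this" means transcoding Latin-1 bytes to UTF-8 chunk by chunk.

```python
import codecs
import io


def reencode_to_utf8(raw, source_encoding="latin-1", chunk_size=4096):
    """Yield UTF-8 encoded chunks read from a binary stream in source_encoding.

    Hypothetical helper for illustration; not part of Arrow.
    """
    # An incremental decoder keeps state between calls, so a multi-byte
    # sequence split across two chunks is still decoded correctly.
    decoder = codecs.getincrementaldecoder(source_encoding)()
    while True:
        chunk = raw.read(chunk_size)
        if not chunk:
            break
        text = decoder.decode(chunk)
        if text:
            yield text.encode("utf-8")
    # Flush anything the decoder is still buffering at end of input.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail.encode("utf-8")


# Usage: a small Latin-1 CSV payload, read in deliberately tiny chunks.
latin1_csv = "name,city\nJosé,Málaga\n".encode("latin-1")
utf8_csv = b"".join(reencode_to_utf8(io.BytesIO(latin1_csv), chunk_size=5))
```

A Dataset-level equivalent would presumably need to apply a wrapper like this to each file's input stream before parsing, which is the reuse suggested above.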
> [C++][Dataset] Support Latin-1 encoding
> ---------------------------------------
>
> Key: ARROW-16000
> URL: https://issues.apache.org/jira/browse/ARROW-16000
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Nicola Crane
> Priority: Major
>
> In ARROW-15992 a user reports problems reading files with Latin-1
> encoding. I had a look through the docs for the Dataset API and I don't
> think this is currently supported.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)