[ 
https://issues.apache.org/jira/browse/ARROW-13813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17407565#comment-17407565
 ] 

Weston Pace commented on ARROW-13813:
-------------------------------------

My only concern is that users should be able to do this easily themselves 
with the existing compute functions.  They could use a scanner to read in 
their data, project the offending column through an encoding kernel, and then 
partition on the projected column.
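
As a rough illustration of that do-it-yourself route, here is a small sketch 
in plain Python. It deliberately uses only the standard library rather than 
Arrow's scanner or compute API, so `encode_segment` and `partition_rows` are 
hypothetical helper names, and the hive-style `column=value` directory layout 
is an assumption made for the example.

```python
from collections import defaultdict
from urllib.parse import quote, unquote


def encode_segment(value: str) -> str:
    # Percent-encode every character outside the unreserved set.
    # safe="" means "/" is encoded too, so a value can never split a path.
    return quote(value, safe="")


def partition_rows(rows, column):
    """Group rows under a hive-style directory named after the encoded value."""
    groups = defaultdict(list)
    for row in rows:
        directory = f"{column}={encode_segment(str(row[column]))}"
        groups[directory].append(row)
    return dict(groups)


# Hypothetical sample data: one value contains a space and a slash.
rows = [
    {"city": "New York/NY", "n": 1},
    {"city": "Oslo", "n": 2},
    {"city": "New York/NY", "n": 3},
]
partitions = partition_rows(rows, "city")

for directory, members in partitions.items():
    # Decoding the segment recovers the original value on the read side.
    original = unquote(directory.split("=", 1)[1])
    print(directory, "->", original, len(members))
```

The point is only that encoding happens before partitioning, so the directory 
name is derived from the projected (encoded) column and decoding on read 
recovers the original value.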

However, since we already have segment encoding in partition objects it seems 
straightforward enough to provide.  It might be a good project to pair with 
ARROW-11378 if someone is looking for some good beginner C++ tasks.

> [C++][Dataset] Support URL encoding of partition field values for the file 
> path
> -------------------------------------------------------------------------------
>
>                 Key: ARROW-13813
>                 URL: https://issues.apache.org/jira/browse/ARROW-13813
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Joris Van den Bossche
>            Priority: Major
>              Labels: dataset
>
> In ARROW-12644, we added support for _decoding_ the file paths when reading 
> datasets. So a valid follow-up question: should we also support _encoding_ 
> when writing datasets?
> (see also https://github.com/apache/arrow/issues/11027)
> Rereading ARROW-12644, there wasn't yet much discussion on that aspect.
> cc [~westonpace] [~lidavidm]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)