[ 
https://issues.apache.org/jira/browse/ARROW-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469305#comment-17469305
 ] 

Jeroen van Straten commented on ARROW-15029:
--------------------------------------------

With respect to having {{-DARROW_WITH_UTF8PROC}} gate the whole file: that's 
not possible without making functional changes. Some of the UTF-8 kernels don't 
actually depend on {{utf8proc}} and are enabled regardless of whether 
{{ARROW_WITH_UTF8PROC}} is defined.
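
As a rough illustration of why the guard has to stay per-kernel, here is a 
minimal sketch. The function and kernel names are hypothetical placeholders, 
not the actual Arrow registration code; only the {{ARROW_WITH_UTF8PROC}} macro 
is taken from the real build.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for kernel registration; NOT Arrow's real API.
// Returns the names of the UTF-8 kernels that would be registered.
std::vector<std::string> RegisteredUtf8Kernels() {
  std::vector<std::string> names;

  // This (made-up) kernel only needs plain UTF-8 decoding, so it is
  // registered whether or not utf8proc is available.
  names.push_back("utf8_kernel_without_utf8proc");

#ifdef ARROW_WITH_UTF8PROC
  // This (made-up) kernel needs Unicode tables from utf8proc, so it is
  // only registered when the macro is defined.
  names.push_back("utf8_kernel_with_utf8proc");
#endif

  return names;
}
```

Wrapping the whole file in the {{#ifdef}} would also drop the first kind of 
kernel from builds without {{utf8proc}}, which is the functional change 
mentioned above.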

I've now completed the bulk of the refactor. What remains is fixing the 
{{ARROW_WITH_UTF8PROC}} {{#ifdef}} blocks (I also initially assumed that all 
UTF-8 kernels would be disabled by it) and cleaning up the {{#include}} lists. 
The result is 3517 lines for ASCII/binary, 1389 for UTF-8, and 498 common. So I 
guess it's not necessarily the most efficient split when it comes to 
compilation times, but it works.

> [C++] Split compute/kernels/scalar_string.cc
> --------------------------------------------
>
>                 Key: ARROW-15029
>                 URL: https://issues.apache.org/jira/browse/ARROW-15029
>             Project: Apache Arrow
>          Issue Type: Task
>          Components: C++
>            Reporter: Antoine Pitrou
>            Assignee: Jeroen van Straten
>            Priority: Minor
>              Labels: good-first-issue, good-second-issue
>
> {{compute/kernels/scalar_string.cc}}, which defines scalar string kernels, is 
> getting pretty large (and probably long-ish to compile). It would be nice to 
> split it up thematically into 2 or 3 source files. Common utilities may be 
> factored into a {{scalar_string_internal.h}} header, for example.



