[
https://issues.apache.org/jira/browse/ARROW-15029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469305#comment-17469305
]
Jeroen van Straten commented on ARROW-15029:
--------------------------------------------
With respect to {{-DARROW_WITH_UTF8PROC}} disabling the whole file: that's not
possible without making functional changes. Some of the UTF-8 kernels don't
actually depend on {{utf8proc}}, and are enabled regardless of whether
{{ARROW_WITH_UTF8PROC}} is defined.
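To illustrate why a single file-wide guard would change behavior, here is a minimal standalone sketch of the pattern. It is not code from {{scalar_string.cc}}; the function names {{Utf8Length}}, {{ToUpperCodepoint}} and {{HasUtf8procKernels}} are hypothetical. Only the second function needs the {{utf8proc}} library, so only that one sits behind the {{ARROW_WITH_UTF8PROC}} guard.

```cpp
#include <cstddef>
#include <string>

#ifdef ARROW_WITH_UTF8PROC
#include <utf8proc.h>
#endif

// Hypothetical kernel that only needs byte-level UTF-8 knowledge:
// counts code points by skipping continuation bytes (0b10xxxxxx).
// It has no utf8proc dependency, so it is compiled unconditionally.
std::size_t Utf8Length(const std::string& s) {
  std::size_t n = 0;
  for (unsigned char c : s) {
    if ((c & 0xC0) != 0x80) ++n;
  }
  return n;
}

#ifdef ARROW_WITH_UTF8PROC
// Hypothetical kernel that needs Unicode case tables, so it is only
// compiled when the build defines ARROW_WITH_UTF8PROC.
utf8proc_int32_t ToUpperCodepoint(utf8proc_int32_t cp) {
  return utf8proc_toupper(cp);
}
#endif

// Reports whether the utf8proc-backed functions were compiled in.
bool HasUtf8procKernels() {
#ifdef ARROW_WITH_UTF8PROC
  return true;
#else
  return false;
#endif
}
```

In this sketch, wrapping the whole file in the guard would also remove {{Utf8Length}} from builds without {{utf8proc}}, which is the kind of functional change mentioned above.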
I've now completed the bulk of the refactor. What remains is fixing the
{{ARROW_WITH_UTF8PROC}} {{#ifdef}} blocks (I also initially assumed that all
UTF-8 kernels would be disabled by it) and cleaning up the {{#include}} lists.
The result is 3517 lines for ASCII/binary, 1389 for UTF-8, and 498 common. The
split is uneven, so it's probably not the most efficient one when it comes to
compilation times, but it works.
> [C++] Split compute/kernels/scalar_string.cc
> --------------------------------------------
>
> Key: ARROW-15029
> URL: https://issues.apache.org/jira/browse/ARROW-15029
> Project: Apache Arrow
> Issue Type: Task
> Components: C++
> Reporter: Antoine Pitrou
> Assignee: Jeroen van Straten
> Priority: Minor
> Labels: good-first-issue, good-second-issue
>
> {{compute/kernels/scalar_string.cc}}, which defines scalar string kernels, is
> getting pretty large (and probably long-ish to compile). It would be nice to
> split it up thematically into 2 or 3 source files. Common utilities may be
> factored into a {{scalar_string_internal.h}} header, for example.