[
https://issues.apache.org/jira/browse/ARROW-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051302#comment-17051302
]
Wes McKinney commented on ARROW-555:
------------------------------------
We've been having some discussions about this topic in other places, e.g.
ARROW-7083. One idea that has been proposed is to generate single-function
kernels at compile time based on the LLVM IR that Gandiva spits out. So the
process would work like this:
* Generate a library of LLVM IR for all supported Gandiva kernels, with an
exported manifest so that you can dynamically determine which kernels are
available and what their input and output signatures are
* Compile that LLVM IR into a C shared library
* Implement a generic "invoker" that takes a C function kernel (the result of
compiling the LLVM IR produced by Gandiva) and evaluates it (with memory
allocation, etc. as needed)
Then the LLVM runtime would not be required to use the output of this process.
This would require some investment of time (perhaps not that much) to set up
the machinery to enable this, but it would seem to greatly simplify the process
of implementing new kernels, especially simple elementwise functions (for
numbers, strings, etc.)
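As a rough illustration of the manifest and generic invoker described above,
here is a minimal Python sketch. It is not Gandiva or Arrow code: the names
KernelEntry, MANIFEST, list_kernels and invoke, the kernel names add_int32 and
utf8_length, and the null-handling rule are all assumptions made for this
example, and plain Python callables stand in for the C functions that would
come from compiling the LLVM IR.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class KernelEntry:
    """One manifest row: a kernel name, its signature, and its entry point."""
    name: str
    input_types: Tuple[str, ...]
    output_type: str
    # Stand-in for a pointer to a precompiled C function (hypothetical).
    func: Callable[..., object]


# Hypothetical manifest; a real one would be exported alongside the shared library.
MANIFEST: Dict[str, KernelEntry] = {
    entry.name: entry
    for entry in (
        KernelEntry("add_int32", ("int32", "int32"), "int32", lambda a, b: a + b),
        KernelEntry("utf8_length", ("utf8",), "int32", lambda s: len(s)),
    )
}


def list_kernels() -> List[Tuple[str, Tuple[str, ...], str]]:
    """Discover available kernels and their signatures at run time."""
    return [(e.name, e.input_types, e.output_type) for e in MANIFEST.values()]


def invoke(name: str, *columns: Sequence[Optional[object]]) -> List[Optional[object]]:
    """Generic invoker: check arity, allocate the output, call the kernel per row."""
    entry = MANIFEST[name]
    if len(columns) != len(entry.input_types):
        raise TypeError(f"{name} expects {len(entry.input_types)} input column(s)")
    if len({len(c) for c in columns}) != 1:
        raise ValueError("all input columns must have the same length")
    n = len(columns[0])
    out: List[Optional[object]] = [None] * n  # output allocation lives in the invoker
    for i in range(n):
        args = [c[i] for c in columns]
        if any(a is None for a in args):
            continue  # assumption: a null in any input yields a null output
        out[i] = entry.func(*args)
    return out


print(invoke("add_int32", [1, None, 3], [10, 20, 30]))  # [11, None, 33]
```

The point of the split is that a new elementwise kernel only needs a scalar
function plus a manifest entry; iteration, null handling and allocation stay in
the one invoker.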
We've been dancing around this idea for several months now, so I would be
interested to see whether someone wants to explore this before we tunnel too
far in different directions.
cc [~emkornfield] [~apitrou] [~fsaintjacques] [~jnadeau] [~ravindra] for any
comments / thoughts on whether what I've written above jibes with prior
discussions
> [C++] String algorithm library for StringArray/BinaryArray
> ----------------------------------------------------------
>
> Key: ARROW-555
> URL: https://issues.apache.org/jira/browse/ARROW-555
> Project: Apache Arrow
> Issue Type: New Feature
> Components: C++
> Reporter: Wes McKinney
> Priority: Major
> Labels: Analytics
>
> This is a parent JIRA for starting a module for processing strings in-memory
> arranged in Arrow format. This will include using the re2 C++ regular
> expression library and other standard string manipulations (such as those
> found on Python's string objects)