[ 
https://issues.apache.org/jira/browse/ARROW-555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104443#comment-17104443
 ] 

Wes McKinney commented on ARROW-555:
------------------------------------

Update: I'm in the middle of overhauling the API for implementing new Array
functions / kernels, with the goal of making it much easier to add new
functions (e.g. generating a string function from an inlineable implementation
that computes a single value). I'm working on it right now, so that will be
done this month. Once it is, I will probably ask someone from my team to make
an initial cut at a precompiled string function set based on the functions
that are already in Gandiva / LLVM codegen, and to add new functions (e.g. from
Impala or other SQL engines) that are not yet present. The work need not be
monolithic: as soon as the framework is in place, it should be straightforward
to add new functions and test them. Adding Python bindings for the new
functions should also be easy: all you will need is the name of the function
you're calling, so some of the Cython binding boilerplate that exists now
should go away.
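
To make the idea concrete, here is a minimal, self-contained sketch of how an
array-level string function might be generated from an inlineable per-value
implementation and then looked up by name. It is an illustration only, not the
actual Arrow API: SimpleStringArray, StringTransform, AsciiUpper, ReverseBytes,
Registry and CallFunction are all hypothetical names, and null handling is
left out for brevity.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a string array: one contiguous byte buffer plus
// offsets, loosely modeled on a variable-length binary layout. No null bitmap.
struct SimpleStringArray {
  std::vector<int32_t> offsets{0};  // offsets.size() == length() + 1
  std::string data;

  int64_t length() const { return static_cast<int64_t>(offsets.size()) - 1; }

  std::string Value(int64_t i) const {
    assert(i >= 0 && i < length());
    return data.substr(offsets[i], offsets[i + 1] - offsets[i]);
  }

  void Append(const std::string& v) {
    data.append(v);
    offsets.push_back(static_cast<int32_t>(data.size()));
  }
};

// Generates an array-level function from a per-value operation. Op::Call is
// visible to the compiler at this point, so it can be inlined into the loop.
template <typename Op>
SimpleStringArray StringTransform(const SimpleStringArray& input) {
  SimpleStringArray out;
  for (int64_t i = 0; i < input.length(); ++i) {
    out.Append(Op::Call(input.Value(i)));
  }
  return out;
}

// Per-value implementations: each one only knows how to handle one string.
struct AsciiUpper {
  static std::string Call(const std::string& v) {
    std::string result(v);
    for (char& c : result) {
      c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    return result;
  }
};

struct ReverseBytes {
  static std::string Call(const std::string& v) {
    return std::string(v.rbegin(), v.rend());
  }
};

using ArrayFunction = std::function<SimpleStringArray(const SimpleStringArray&)>;

// Name-keyed registry: a binding layer only needs the function's name.
const std::map<std::string, ArrayFunction>& Registry() {
  static const std::map<std::string, ArrayFunction> registry = {
      {"ascii_upper", ArrayFunction(&StringTransform<AsciiUpper>)},
      {"reverse_bytes", ArrayFunction(&StringTransform<ReverseBytes>)},
  };
  return registry;
}

// Throws std::out_of_range if no function is registered under `name`.
SimpleStringArray CallFunction(const std::string& name,
                               const SimpleStringArray& input) {
  return Registry().at(name)(input);
}
```

In this sketch, adding a new function means writing one small struct with a
Call method and adding one registry entry. A caller, or a language binding,
would then invoke it as CallFunction("ascii_upper", array) without any
per-function wrapper code.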

> [C++] String algorithm library for StringArray/BinaryArray
> ----------------------------------------------------------
>
>                 Key: ARROW-555
>                 URL: https://issues.apache.org/jira/browse/ARROW-555
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: C++
>            Reporter: Wes McKinney
>            Priority: Major
>              Labels: Analytics
>
> This is a parent JIRA for starting a module for processing strings in-memory 
> arranged in Arrow format. This will include using the re2 C++ regular 
> expression library and other standard string manipulations (such as those 
> found on Python's string objects).



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
