niyue opened a new issue, #39052:
URL: https://github.com/apache/arrow/issues/39052

   ### Describe the enhancement requested
   
   # Description
   Currently, Gandiva has some internal stub functions, which are registered 
in two steps:
   1) Function metadata is registered in multiple internal registry classes, 
such as:
   1.1) `GetStringFunctionRegistry` in `function_registry_string.cc`
   1.2) `GetMathOpsFunctionRegistry` in `function_registry_math_ops.cc`
   1.3) etc
   2) The stub functions' implementations are mapped to the LLVM engine in:
   2.1) `ExportedStubFunctions::AddMappings`
   2.2) `ExportedHashFunctions::AddMappings`
   2.3) `ExportedStringFunctions::AddMappings`
   
   There are some issues with this way of organizing the code:
   * When adding or removing a stub function, developers need to find and 
change two places, which is inconvenient. For example, adding a new string 
function requires modifying both `GetStringFunctionRegistry` in 
`function_registry_string.cc` and `ExportedStringFunctions::AddMappings` in 
`gdv_string_function_stubs.cc`.
   * The LLVM type information provided in the `AddMappings` API largely 
duplicates the function signature metadata provided in the 
`GetXXXFunctionRegistry` API, which costs developers extra time and effort to 
maintain.
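   
   To make the duplication concrete, here is a minimal, self-contained sketch 
of the two-place pattern. It is a toy model, not Gandiva code: `ToyTable`, 
`ToyMetadataTable`, `ToyLlvmMappingTable`, `ExpectedLlvmArgCount` and 
`FindDriftedFunctions` are all hypothetical names, and the rule that a `utf8` 
parameter lowers to a pointer plus a length is an assumption made only for 
this example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical table type: function name -> list of type names.
using ToyTable = std::map<std::string, std::vector<std::string>>;

// Place 1 (hypothetical): signature metadata, as Arrow-style type names.
ToyTable ToyMetadataTable() {
  ToyTable table;
  table["toy_upper"] = std::vector<std::string>{"utf8"};
  table["toy_repeat"] = std::vector<std::string>{"utf8", "int32"};
  return table;
}

// Place 2 (hypothetical): hand-written LLVM-style argument lists.
// The entry for toy_repeat is deliberately missing its trailing i32 count,
// which is the kind of mistake the two-place layout allows.
ToyTable ToyLlvmMappingTable() {
  ToyTable table;
  table["toy_upper"] = std::vector<std::string>{"i8*", "i32"};
  table["toy_repeat"] = std::vector<std::string>{"i8*", "i32"};
  return table;
}

// Assumption for this sketch: a utf8 parameter lowers to a pointer plus a
// length (two LLVM args); every other parameter lowers to one LLVM arg.
std::size_t ExpectedLlvmArgCount(const std::vector<std::string>& params) {
  std::size_t count = 0;
  for (const std::string& p : params) {
    assert(!p.empty());  // type names must be non-empty
    count += (p == "utf8") ? 2 : 1;
  }
  return count;
}

// Returns the names whose hand-written mapping is missing or has a different
// argument count than the metadata implies.
std::vector<std::string> FindDriftedFunctions(const ToyTable& metadata,
                                              const ToyTable& mappings) {
  std::vector<std::string> drifted;
  for (const auto& entry : metadata) {
    auto it = mappings.find(entry.first);
    if (it == mappings.end() ||
        it->second.size() != ExpectedLlvmArgCount(entry.second)) {
      drifted.push_back(entry.first);
    }
  }
  return drifted;
}
```

   In this toy setup, nothing ties the two tables together, so a mismatch is 
only caught by an extra consistency check like `FindDriftedFunctions`.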
   
   # Proposal
   In PR https://github.com/apache/arrow/pull/38632, we added the capability to 
programmatically map a `NativeFunction` function signature into LLVM-typed 
args, so the LLVM args for each function in `AddMappings` could be derived 
directly from its `NativeFunction`. 
   This proposal plans to use `FunctionRegistry`'s `Register` C function API to 
register the existing stub functions internally, leveraging the above mapping 
capability. For stub functions, we could combine metadata registration and 
implementation mapping into one step, so that:
   * stub function metadata and implementation are associated and registered in 
one place, and developers don't have to look in two places for maintenance
   * when adding/updating a stub function's signature, developers no longer 
need to manually map the Arrow data type signature into LLVM-typed args, which 
makes the code easier to maintain and less error-prone. This should also 
simplify the code considerably: the change is expected to remove 1500+ lines 
of code.
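   
   The single-step idea can be sketched as follows. This is a toy model under 
stated assumptions, not the actual `FunctionRegistry` API: `DataType`, 
`ToLlvmArgs`, `StubSignature`, `RegisteredStub`, `ToyRegistry`, 
`toy_utf8_length` and `RegisterToyStubs` are hypothetical names, and the 
lowering rules in `ToLlvmArgs` are assumed for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Arrow data types; not Gandiva's real types.
enum class DataType { kInt32, kInt64, kFloat64, kUtf8 };

// Derives toy "LLVM-typed" args from a signature. Assumption: a utf8
// parameter expands to a byte pointer plus a 32-bit length.
std::vector<std::string> ToLlvmArgs(const std::vector<DataType>& params) {
  std::vector<std::string> out;
  for (DataType t : params) {
    switch (t) {
      case DataType::kInt32: out.push_back("i32"); break;
      case DataType::kInt64: out.push_back("i64"); break;
      case DataType::kFloat64: out.push_back("double"); break;
      case DataType::kUtf8: out.push_back("i8*"); out.push_back("i32"); break;
    }
  }
  return out;
}

struct StubSignature {
  std::string name;
  std::vector<DataType> params;
  DataType ret;
};

struct RegisteredStub {
  StubSignature signature;
  std::vector<std::string> llvm_args;  // derived, never written by hand
  void* c_function;                    // implementation pointer
};

class ToyRegistry {
 public:
  // One call records the metadata, the derived LLVM args, and the
  // implementation pointer. Rejects null pointers and duplicate names.
  bool Register(StubSignature sig, void* c_function) {
    if (c_function == nullptr || stubs_.count(sig.name) > 0) return false;
    std::string key = sig.name;
    std::vector<std::string> args = ToLlvmArgs(sig.params);
    stubs_.emplace(key, RegisteredStub{std::move(sig), std::move(args), c_function});
    return true;
  }

  // Returns the registered stub, or nullptr if the name is unknown.
  const RegisteredStub* Lookup(const std::string& name) const {
    auto it = stubs_.find(name);
    return it == stubs_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, RegisteredStub> stubs_;
};

// Hypothetical stub: counts UTF-8 code points by skipping continuation bytes.
extern "C" int32_t toy_utf8_length(const char* data, int32_t len) {
  int32_t count = 0;
  for (int32_t i = 0; i < len; ++i) {
    if ((static_cast<unsigned char>(data[i]) & 0xC0) != 0x80) ++count;
  }
  return count;
}

// Metadata and implementation are registered together in one place.
void RegisterToyStubs(ToyRegistry& registry) {
  StubSignature sig{"toy_utf8_length", {DataType::kUtf8}, DataType::kInt32};
  bool ok = registry.Register(std::move(sig),
                              reinterpret_cast<void*>(&toy_utf8_length));
  assert(ok);
  (void)ok;  // avoids an unused-variable warning when NDEBUG is defined
}
```

   In this sketch, changing a stub's signature means editing only the 
`StubSignature` passed to `Register`; the LLVM-style argument list is derived 
from it and is never maintained by hand.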
   
   ### Component(s)
   
   C++ - Gandiva

