> The definition of an external function registry can certainly belong in
> Gandiva, but how it's populated should be left to third-party projects

Are you proposing a more general approach, such as adding the following
APIs to Gandiva? (The function names and signatures are tentative and
for illustration only.)

1) AddExternalFunctionRegistry(ExternalFunctionRegistry function_registry)
2) AddFunctionBitcodeLoader(FunctionBitcodeLoader bitcode_loader)

Here `ExternalFunctionRegistry` would return a list of function definitions
and `FunctionBitcodeLoader` would return a list of bitcode buffers, so that
the logic for populating metadata and bitcode can be moved out of Gandiva.
Thanks.
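
To make the question concrete, here is a minimal C++ sketch of how the two
tentative APIs could fit together. Everything in it is hypothetical and
simplified: `FunctionDef`, `RegistryHost`, `GetFunctions`, `LoadBitcode` and
the in-memory implementations are illustrative names only, not existing
Gandiva types, and real signatures would use Arrow data types rather than
strings.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified function metadata (not Gandiva's real type).
struct FunctionDef {
  std::string name;                      // name used in expressions
  std::vector<std::string> param_types;  // e.g. {"utf8"}
  std::string return_type;               // e.g. "utf8"
  std::string symbol;                    // symbol name inside the bitcode
};

// Interface a third-party project would implement to supply metadata.
class ExternalFunctionRegistry {
 public:
  virtual ~ExternalFunctionRegistry() = default;
  virtual std::vector<FunctionDef> GetFunctions() const = 0;
};

// Interface a third-party project would implement to supply bitcode.
class FunctionBitcodeLoader {
 public:
  virtual ~FunctionBitcodeLoader() = default;
  virtual std::vector<std::vector<std::uint8_t>> LoadBitcode() const = 0;
};

// Hypothetical stand-in for the Gandiva side: it only consumes the two
// interfaces and knows nothing about JSON files or directory layouts.
class RegistryHost {
 public:
  void AddExternalFunctionRegistry(
      std::shared_ptr<ExternalFunctionRegistry> function_registry) {
    for (auto& def : function_registry->GetFunctions()) {
      functions_[def.name] = def;
    }
  }

  void AddFunctionBitcodeLoader(
      std::shared_ptr<FunctionBitcodeLoader> bitcode_loader) {
    for (auto& buffer : bitcode_loader->LoadBitcode()) {
      bitcode_.push_back(std::move(buffer));
    }
  }

  // An expression using `name` would be rejected when this returns false.
  bool HasFunction(const std::string& name) const {
    return functions_.count(name) > 0;
  }

  std::size_t NumBitcodeBuffers() const { return bitcode_.size(); }

 private:
  std::unordered_map<std::string, FunctionDef> functions_;
  std::vector<std::vector<std::uint8_t>> bitcode_;
};

// Example third-party implementations that keep everything in memory.
// A real project could instead read JSON, a database, or anything else.
class InMemoryRegistry : public ExternalFunctionRegistry {
 public:
  explicit InMemoryRegistry(std::vector<FunctionDef> defs)
      : defs_(std::move(defs)) {}
  std::vector<FunctionDef> GetFunctions() const override { return defs_; }

 private:
  std::vector<FunctionDef> defs_;
};

class InMemoryBitcodeLoader : public FunctionBitcodeLoader {
 public:
  explicit InMemoryBitcodeLoader(std::vector<std::vector<std::uint8_t>> buffers)
      : buffers_(std::move(buffers)) {}
  std::vector<std::vector<std::uint8_t>> LoadBitcode() const override {
    return buffers_;
  }

 private:
  std::vector<std::vector<std::uint8_t>> buffers_;
};
```

With something like this, a third-party project would construct its own
registry and loader however it likes and hand them to the host, and a
function lookup would succeed only for names that were registered.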

Regards,
Yue

On Tue, Sep 26, 2023 at 12:25 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hi Yue,
>
> On 25/09/2023 at 18:15, Yue Ni wrote:
> >
> >> a CMake entrypoint (for example a function) making it easy for
> >> third-party projects to compile their own functions
> >
> > I can come up with a minimal CMake template so that users can compile
> > C++-based functions. If the integration happens at the LLVM IR level, it
> > should also be possible to write functions in languages other than C++,
> > such as Rust or Zig, as long as the compiler can generate LLVM IR (the
> > Rust experiment I made surfaced other issues that need to be addressed,
> > but that can be another proposal/PR). If we make that work, CMake is
> > probably not so important either, since other languages can use their
> > own build tools such as Cargo or zig build, and we just need some
> > documentation describing how functions should typically be interfaced.
>
> As long as there's a well-known and supported way to generate the code
> for external functions, then it's fine to me.
>
> (also the required signature for these functions should be documented
> somewhere)
>
> >> The rest of the proposal (a specific JSON file format, a bunch of
> >> functions to iterate directory entries in a specific layout) is IMHO
> >> off-topic for Gandiva, and each third-party project can implement their
> >> own idioms for the discovery of external functions
> >
> > Could you give some more guidance on how this should work without an
> > external function registry containing metadata? As far as I know, for
> > each pre-compiled function used in an expression, Gandiva needs to look
> > up its signature in the function registry, which is currently a C++
> > class hard-coded to contain 6 categories of built-in functions
> > (arithmetic/datetime/hash/mathops/string/datetime arithmetic). If a
> > third-party function cannot be found in the registry, it cannot be used
> > in the expression. If we don't load the pre-compiled function metadata
> > from external files, how do we avoid Gandiva rejecting the expression
> > when a third-party function cannot be found in the function registry?
> > Thanks.
>
> What I'm saying is that code to load function metadata from JSON and
> walk directories of .bc files does not belong in Gandiva. The definition
> of an external function registry can certainly belong in Gandiva, but
> how it's populated should be left to third-party projects (which then
> don't have to use JSON or a given directory layout).
>
> Regards
>
> Antoine.
>
