benibus commented on issue #25025:
URL: https://github.com/apache/arrow/issues/25025#issuecomment-1437881001

   @lidavidm @westonpace @zeroshade @felipecrv
   I've been working on the PR for this - just wanted to give an update and get 
some design opinions.
   
   Currently, I've set things up in the build system so that everything outside 
of `compute/kernels` is built into libarrow unconditionally, along with the 
following kernel sources:
   - `scalar_cast_*.cc`
   - `vector_selection.cc` for `take` (used by dictionary casts, parquet)
   - `vector_hash.cc` for `unique` (used by parquet)
   
   If `ARROW_COMPUTE` is enabled, then the remaining kernel sources are built 
into libarrow_compute, which links against libarrow. Alternatively, we could 
introduce a new flag for this, but as it stands, `-DARROW_COMPUTE=ON` still 
gives you all the kernels (and we could unconditionally compile code that uses 
casts, e.g. the CSV writer and the STL tests).
   
   Assuming that sounds reasonable (if not, let me know), I just need to 
determine how registration will work for the extra kernels. Currently, this is 
done on the first invocation of `compute::GetFunctionRegistry` in libarrow.so. 
However, it'll no longer be possible to register the extra kernels this way, 
given that their registration functions live in a separate library that 
libarrow doesn't link against. As I see it, there are a couple of 
possibilities:
   - Load the kernels from libarrow_compute at runtime (more 
flexible/forward-looking, but platform-dependent)
   - Circularly link libarrow and libarrow_compute (would require fancy linker 
flags). You might be able to get the same effect with an intermediate library 
but I'm not sure if/how it would work in practice.
   
   Any suggestions? Perhaps I'm missing a more obvious solution. @wesm's 
original post suggests creating a "plugin hook" for loading kernels from an 
external lib. To me, that sounds like a `dlopen` type of deal, but I'm not 
positive.
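
   For discussion, here is a rough sketch of what the runtime-loading option 
could look like on POSIX. Everything in it is hypothetical: `ToyRegistry` is a 
stand-in for the real function registry, and `toy_register_extra_kernels`, 
`RegisterHook`, and `LoadExtraKernels` are made-up names, not existing Arrow 
APIs. Windows would need `LoadLibrary`/`GetProcAddress` in place of 
`dlopen`/`dlsym`, and older glibc versions need `-ldl` at link time.

```cpp
#include <dlfcn.h>  // POSIX only

#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Toy stand-in for the function registry (NOT the real Arrow class).
class ToyRegistry {
 public:
  using Kernel = std::function<int(int)>;

  // Returns false if a kernel with this name is already registered.
  bool Add(const std::string& name, Kernel kernel) {
    return kernels_.emplace(name, std::move(kernel)).second;
  }
  bool Has(const std::string& name) const { return kernels_.count(name) > 0; }
  std::size_t size() const { return kernels_.size(); }

 private:
  std::map<std::string, Kernel> kernels_;
};

// Hypothetical hook signature that the kernel library would export.
using RegisterHook = void (*)(ToyRegistry*);

// Core-library side: try to pull extra kernels out of a shared library.
// Returns false (leaving the registry untouched) if the library or the
// hook symbol cannot be found.
bool LoadExtraKernels(ToyRegistry* registry, const char* lib_path,
                      const char* hook_name = "toy_register_extra_kernels") {
  void* handle = dlopen(lib_path, RTLD_NOW | RTLD_LOCAL);
  if (handle == nullptr) return false;

  void* symbol = dlsym(handle, hook_name);
  if (symbol == nullptr) {
    dlclose(handle);
    return false;
  }

  reinterpret_cast<RegisterHook>(symbol)(registry);
  // The handle is deliberately left open: the registered kernels point at
  // code that lives inside the loaded library.
  return true;
}

// Kernel-library side: in a real split this definition would live in the
// separate library. It is defined here only so the sketch is one file.
extern "C" void toy_register_extra_kernels(ToyRegistry* registry) {
  registry->Add("negate", [](int x) { return -x; });
}
```

   With something along these lines, the core library would only need to know 
the path of the kernel library and the name of the hook symbol, which is 
roughly how I read the "plugin hook" idea. A missing library simply leaves the 
registry with the built-in kernels.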
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
