[GitHub] [arrow-rs] sundy-li commented on issue #1047: Add `Scalar` / `Datum` support to compute kernels

via GitHub Thu, 01 Jun 2023 18:47:32 -0700


sundy-li commented on issue #1047:
URL: https://github.com/apache/arrow-rs/issues/1047#issuecomment-1573015858


   > For what it is worth I think this is what DuckDB does (at least this is 
how I interpret this slide from the [22 - DuckDB Internals (CMU Advanced 
Databases / Spring 2023)](https://www.youtube.com/watch?v=bZOvAKGkzpQ) lecture)
   
   If the array is already in Flat format, this approach still needs to 
construct an extra `Selection vector` to iterate over it, which may add a 
little overhead compared with iterating the array directly.
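
To make the concern concrete, here is a minimal sketch, not taken from DuckDB or arrow-rs. The function names `sum_flat`, `sum_selected` and `identity_selection` are hypothetical, and the data is fixed to `i64`. It only illustrates the extra indirection; a real engine could special-case the identity selection and avoid materializing it.

```rust
/// Sum a flat array by walking it directly.
pub fn sum_flat(values: &[i64]) -> i64 {
    values.iter().sum()
}

/// Sum the same data through a selection vector: every element costs one
/// extra index load before the value itself is read.
pub fn sum_selected(values: &[i64], selection: &[u32]) -> i64 {
    selection.iter().map(|&i| values[i as usize]).sum()
}

/// For a flat array the selection vector is just the identity `0..len`,
/// which has to be built before a selection-based kernel can run.
pub fn identity_selection(len: usize) -> Vec<u32> {
    (0..len as u32).collect()
}
```

Calling `sum_selected(&values, &identity_selection(values.len()))` gives the same result as `sum_flat(&values)`, but it allocates the identity vector first and reads each value through an index.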
   
   In Databend we use an enum to represent the vector, which makes it hard to 
add support for another variant such as a dictionary vector.
   
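   ```rust
   // `EnumAsInner` comes from the `enum-as-inner` crate; `ValueType` is a
   // trait defined elsewhere in Databend.
   #[derive(Debug, Clone, PartialEq, EnumAsInner)]
   pub enum ValueRef<'a, T: ValueType> {
       // A single value that applies to every row.
       Scalar(T::ScalarRef<'a>),
       // A full column with one value per row.
       Column(T::Column),
   }
   ```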
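
As a rough illustration of why extending such an enum is costly, the sketch below uses a simplified, hypothetical `Value` type (fixed to `i64`, without the `ValueType` trait or lifetimes) and a hand-written `add` kernel. Neither is Databend's actual code.

```rust
/// Simplified stand-in for the enum above (assumption: `i64` only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(i64),
    Column(Vec<i64>),
}

/// Binary addition: two operands with two variants each need four arms.
pub fn add(lhs: &Value, rhs: &Value) -> Value {
    match (lhs, rhs) {
        (Value::Scalar(a), Value::Scalar(b)) => Value::Scalar(a + b),
        (Value::Scalar(a), Value::Column(b)) => {
            Value::Column(b.iter().map(|y| a + y).collect())
        }
        (Value::Column(a), Value::Scalar(b)) => {
            Value::Column(a.iter().map(|x| x + b).collect())
        }
        (Value::Column(a), Value::Column(b)) => {
            assert_eq!(a.len(), b.len(), "columns must have equal length");
            Value::Column(a.iter().zip(b).map(|(x, y)| x + y).collect())
        }
    }
}
```

With two variants, a function of n arguments needs 2^n arms. Adding a third variant, such as a dictionary vector, would raise that to 3^n (nine arms for every binary function), unless the new variant is first converted into one of the existing ones.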
   
   The main drawback is code bloat, so we use a code generator to help write 
the vectorized methods; [the generated file](https://github.com/datafuselabs/databend/blob/aa97d2c5b57f1585abc8192e5fb76f9fc4157903/src/query/expression/src/register.rs#L1092-L1105) 
is ~6000 LOC.
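
The idea of generating the per-variant boilerplate can be sketched in a few lines. Databend uses a separate code generator that emits the file linked above; a declarative macro is swapped in here only to keep the example self-contained. The macro name `vectorize_2_arg` and the simplified `Value` type (fixed to `i64`) are assumptions made for illustration.

```rust
/// Simplified stand-in for a Scalar/Column value (assumption: `i64` only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Scalar(i64),
    Column(Vec<i64>),
}

/// Expands one scalar function into a kernel that covers all four
/// Scalar/Column combinations, so the arms are written only once.
macro_rules! vectorize_2_arg {
    ($name:ident, $op:expr) => {
        pub fn $name(lhs: &Value, rhs: &Value) -> Value {
            // Non-capturing closures coerce to a plain function pointer.
            let op: fn(i64, i64) -> i64 = $op;
            match (lhs, rhs) {
                (Value::Scalar(a), Value::Scalar(b)) => Value::Scalar(op(*a, *b)),
                (Value::Scalar(a), Value::Column(b)) => {
                    Value::Column(b.iter().map(|&y| op(*a, y)).collect())
                }
                (Value::Column(a), Value::Scalar(b)) => {
                    Value::Column(a.iter().map(|&x| op(x, *b)).collect())
                }
                (Value::Column(a), Value::Column(b)) => {
                    assert_eq!(a.len(), b.len(), "columns must have equal length");
                    Value::Column(a.iter().zip(b).map(|(&x, &y)| op(x, y)).collect())
                }
            }
        }
    };
}

// Each invocation generates a complete four-arm kernel.
vectorize_2_arg!(add, |a, b| a + b);
vectorize_2_arg!(mul, |a, b| a * b);
```

Each new scalar function then costs one line at the call site, while the expanded code still grows with the number of variants and argument counts, which is consistent with a large generated file.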
   
   
   
   
   
   
   

