[GitHub] [arrow] seddonm1 commented on pull request #9376: ARROW-11446: [DataFusion] Added support for scalarValue in Builtin functions.

GitBox Sat, 13 Feb 2021 19:27:53 -0800


seddonm1 commented on pull request #9376:
URL: https://github.com/apache/arrow/pull/9376#issuecomment-778716899



   > I wonder if it is possible to get the best of both worlds, by extending 
the `Array` trait slightly and having a subclass of `Array` which denotes 
scalars, like `ScalarArray`, which does not need `Buffer` storage and just 
represents constant scalars. This way, functions would only need to deal with 
Array, but can recognize this subclass `ScalarArray` and do optimizations that 
way. This is a very half-formed thought at the moment. The train of thought is 
just what if the `Array` was not strictly a buffer-based representation but 
just a way to access columnar data, and in certain cases represents scalars.
   
   I also have many instances where knowing whether an argument is a 
`ScalarArray` or an `Array` would provide huge performance opportunities in the 
big PR that implements Postgres functions: 
https://github.com/apache/arrow/pull/9243, especially for things like regex 
matching. It would be relatively trivial to implement if the two cases could be 
pattern matched.
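
   To make the regex case concrete, here is a rough sketch of what pattern matching on scalar vs. array arguments could look like. It is not code from either PR: `Arg`, `Matcher`, `compile`, `matches`, and the `COMPILES` counter are all hypothetical names, and a plain substring search stands in for a real regex engine so the example has no dependencies. The point is only that a scalar pattern can be compiled once for the whole column, while an array of patterns has to be compiled per row.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical argument type: either a full column or a single constant.
enum Arg {
    Array(Vec<String>),
    Scalar(String),
}

/// Counts calls to `compile` so the saving is observable (illustration only).
static COMPILES: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for an expensive-to-build matcher such as a compiled regex.
struct Matcher {
    needle: String,
}

impl Matcher {
    /// Substring search stands in for real regex matching.
    fn is_match(&self, value: &str) -> bool {
        value.contains(&self.needle)
    }
}

/// Stand-in for regex compilation; records that it was called.
fn compile(pattern: &str) -> Matcher {
    COMPILES.fetch_add(1, Ordering::SeqCst);
    Matcher {
        needle: pattern.to_string(),
    }
}

/// Matches every value against the pattern argument.
fn matches(values: &[String], pattern: &Arg) -> Vec<bool> {
    match pattern {
        // Scalar pattern: compile once, reuse for every row.
        Arg::Scalar(p) => {
            let matcher = compile(p);
            values.iter().map(|v| matcher.is_match(v)).collect()
        }
        // Array pattern: each row may differ, so compile per row.
        Arg::Array(patterns) => {
            assert_eq!(values.len(), patterns.len(), "length mismatch");
            values
                .iter()
                .zip(patterns)
                .map(|(v, p)| compile(p).is_match(v))
                .collect()
        }
    }
}
```

   With a three-row column, calling `matches` with `Arg::Scalar` triggers one call to `compile`, whereas `Arg::Array` triggers three. With a real regex engine that difference would be the per-row compilation cost.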


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
