Omega359 commented on PR #12027:
URL: https://github.com/apache/datafusion/pull/12027#issuecomment-2295294998

   Thanks for taking the time to look at my idea and give feedback. I do have a 
few counterpoints to your concerns, which may or may not change the outcome.
   
   #1. For sure, some operations will incur a branch per call. However, I think 
the majority of usages of the StringArrays enum would just be to get an 
iterator, which results in only two branches: the initial one for .try_from and 
the other for the .iter() call. Overall I suspect that won't be measurable in 
any meaningful way. 
   
   Additionally, I would like to write a benchmark to see whether we can 
actually measure the impact of calling an operation such as 
`StringArrays::value` over and over in a loop. I wonder whether CPU branch 
prediction will kick in and make the overhead negligible.
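
   To make the two access patterns in #1 concrete, here is a minimal sketch. 
It assumes simplified stand-in types: `Utf8Like`, `ViewLike` and 
`StringArraysSketch` are hypothetical names for illustration only and are not 
the Arrow arrays or the enum from this PR. The first function branches on 
every `value` call, the second matches once and then iterates.

```rust
// Hypothetical stand-ins for two concrete string array types.
// These are NOT the real Arrow arrays; they only illustrate dispatch cost.
struct Utf8Like(Vec<String>);
struct ViewLike(Vec<String>);

// Simplified stand-in for an enum that wraps the concrete array types.
#[derive(Clone, Copy)]
enum StringArraysSketch<'a> {
    Utf8(&'a Utf8Like),
    View(&'a ViewLike),
}

impl<'a> StringArraysSketch<'a> {
    fn len(self) -> usize {
        match self {
            Self::Utf8(a) => a.0.len(),
            Self::View(a) => a.0.len(),
        }
    }

    // One `match` per call: this is the per-call branch under discussion.
    fn value(self, i: usize) -> &'a str {
        match self {
            Self::Utf8(a) => a.0[i].as_str(),
            Self::View(a) => a.0[i].as_str(),
        }
    }
}

// Pattern A: branch on the variant for every element.
fn total_len_per_call(arr: StringArraysSketch<'_>) -> usize {
    (0..arr.len()).map(|i| arr.value(i).len()).sum()
}

// Pattern B: branch once, then iterate the concrete storage directly.
fn total_len_match_once(arr: StringArraysSketch<'_>) -> usize {
    match arr {
        StringArraysSketch::Utf8(a) => a.0.iter().map(|s| s.len()).sum(),
        StringArraysSketch::View(a) => a.0.iter().map(|s| s.len()).sum(),
    }
}
```

   Both functions return the same result, so timing them against each other 
in a benchmark harness on large inputs would be one way to check whether the 
per-call branch is measurable. The sketch makes no claim about the outcome.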
   
   #2. Absolutely agree with this. In fact, I think it may be a great idea to 
add documentation to StringArrays noting that it is best used for the general 
case, and that any specialized implementations should use the StringArrayType 
trait or switch between the actual string array implementations as desired. 
That being said, I don't see any reason why code that needs to specialize 
could not just use any of the existing methods to switch between 
implementations based on the string array type.
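
   As a rough illustration of the specialized path in #2, the sketch below 
writes the kernel once against a trait and switches on the concrete type at 
the call site. `StringArrayTypeSketch`, `Utf8Like`, `ViewLike`, `AnyArray` and 
`count_with_prefix` are hypothetical names invented for this example; the real 
StringArrayType trait may look quite different.

```rust
// Hypothetical trait standing in for a common string-array interface.
trait StringArrayTypeSketch {
    fn len(&self) -> usize;
    fn value(&self, i: usize) -> &str;
}

// Hypothetical concrete array types (not the real Arrow arrays).
struct Utf8Like(Vec<String>);
struct ViewLike(Vec<String>);

impl StringArrayTypeSketch for Utf8Like {
    fn len(&self) -> usize {
        self.0.len()
    }
    fn value(&self, i: usize) -> &str {
        self.0[i].as_str()
    }
}

impl StringArrayTypeSketch for ViewLike {
    fn len(&self) -> usize {
        self.0.len()
    }
    fn value(&self, i: usize) -> &str {
        self.0[i].as_str()
    }
}

// Generic kernel: the compiler emits one copy per concrete type, so the
// inner loop contains no match on the array variant.
fn count_with_prefix<A: StringArrayTypeSketch>(arr: &A, prefix: &str) -> usize {
    (0..arr.len())
        .filter(|&i| arr.value(i).starts_with(prefix))
        .count()
}

// Hypothetical tagged union representing "some string array".
enum AnyArray {
    Utf8(Utf8Like),
    View(ViewLike),
}

// Switch between implementations once, based on the array type.
fn count_dispatch(arr: &AnyArray, prefix: &str) -> usize {
    match arr {
        AnyArray::Utf8(a) => count_with_prefix(a, prefix),
        AnyArray::View(a) => count_with_prefix(a, prefix),
    }
}
```

   The design choice here is that the single `match` lives outside the hot 
loop, which is the kind of specialization a general-purpose enum wrapper would 
not prevent callers from writing.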


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

