calebzulawski commented on PR #3452:
URL: https://github.com/apache/arrow-rs/pull/3452#issuecomment-1374949998

   > > I don't think it's reasonable to expect users to use target-cpu beyond 
local testing or targeting a particular embedded system
   > 
   > I don't disagree that it is definitely not ideal, but it also is not 
really feasible for us to generate multiple versions of all our kernels, as we 
already have severe issues with the amount of codegen and the corresponding 
build times.
   > 
   > Correct me if I am mistaken, but if a user enabled AVX using `target-cpu` 
the produced binary will support 95% of the CPUs in the latest Steam Hardware 
Survey. A more aggressive `target-cpu=haswell` will still support almost all 
CPUs from this decade. I'm inclined to think most people can safely make this 
trade-off?
   
   In this case, yes, AVX is commonly available.  When AVX512 stabilizes in 
Rust, however, you likely would not want to build an entire application that 
requires it.  If the amount of codegen is a concern, then removing 
multiversioning can definitely help. You could also put multiversioning behind 
a cargo feature and leave the choice to users.  Another thing to consider is 
that software distributions (e.g. Linux distributions) usually target a fixed 
baseline CPU for their packages, so they won't be able to take advantage of 
`target-cpu`.
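
   As a rough illustration of the cargo-feature idea, here is a minimal sketch that uses only the standard library rather than the `multiversion` crate. The feature name `multiversion` and the function names are hypothetical, not anything `arrow` defines. With the feature off, only the portable version is compiled, so no extra codegen is paid for.

```rust
// Sketch only: "multiversion" is a hypothetical cargo feature name that a
// crate would declare under [features] in its Cargo.toml.

/// Portable version, compiled for whatever baseline the build targets.
fn add_fallback(dst: &mut [f32], src: &[f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}

/// Same loop, compiled with AVX2 enabled so the compiler is free to use it.
/// Only exists when the hypothetical feature is on and the target is x86.
#[cfg(all(feature = "multiversion", any(target_arch = "x86", target_arch = "x86_64")))]
#[target_feature(enable = "avx2")]
unsafe fn add_avx2(dst: &mut [f32], src: &[f32]) {
    for (d, s) in dst.iter_mut().zip(src) {
        *d += *s;
    }
}

/// Public entry point. Dispatches at runtime only when the feature is on;
/// otherwise this is a plain call to the fallback.
pub fn add_in_place(dst: &mut [f32], src: &[f32]) {
    #[cfg(all(feature = "multiversion", any(target_arch = "x86", target_arch = "x86_64")))]
    {
        if is_x86_feature_detected!("avx2") {
            // SAFETY: AVX2 support was verified on the line above.
            unsafe { add_avx2(dst, src) };
            return;
        }
    }
    add_fallback(dst, src);
}
```

   Users who build with `target-cpu` could leave the feature off and still get vectorized code from the fallback, while users shipping a generic binary could turn it on.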
   
   
   
   > > I'm inclined to think most people can safely make this trade-off?
   > 
   > I personally think it is more beneficial to choose the appropriate target 
architecture higher up at the application level rather than in a lower level 
library.
   > 
   > For example, I would rather compile several versions of our top level 
`influxdb_iox` binary for various different architectures / instruction sets 
and pick between those binaries in a launcher rather than have the choice done 
inside of the low level arrow library
   > 
   > I would prefer the top level approach because more of my application could 
take advantage of the target cpu's instructions (via llvm's support for auto 
vectorization, for example), rather than just arrow.
   
   That is another possibility, one that's used by Intel's MKL for example.  
I'm not sure it's the best approach, however:
   * If the amount of codegen (or binary size) is a concern, this is the worst 
option, since all code is duplicated
   * Rust still doesn't (and perhaps never will) have a stable ABI
   * It's OS specific (dlopen etc) and precludes users from making statically 
linked executables
   * The vast majority of code, from what I've seen, does not benefit from 
vectorization. MKL is a special case, where nearly every function it provides 
is vectorized.
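
   For reference, the launcher variant described in the quote (separate executables rather than `dlopen`) might look roughly like the sketch below. The binary names are hypothetical, and choosing the `haswell` build when AVX2 and FMA are both present is an assumption made for illustration, not a complete check of that CPU level.

```rust
use std::env;
use std::io;
use std::process::{Command, ExitStatus};

/// The CPU capabilities this sketch looks at when choosing a build.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct CpuCaps {
    pub avx2: bool,
    pub fma: bool,
}

/// Runtime detection on x86 targets.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
pub fn detect() -> CpuCaps {
    CpuCaps {
        avx2: is_x86_feature_detected!("avx2"),
        fma: is_x86_feature_detected!("fma"),
    }
}

/// On other architectures, always report the baseline.
#[cfg(not(any(target_arch = "x86", target_arch = "x86_64")))]
pub fn detect() -> CpuCaps {
    CpuCaps { avx2: false, fma: false }
}

/// Pick which prebuilt executable to run (names are hypothetical).
pub fn pick_binary(caps: CpuCaps) -> &'static str {
    if caps.avx2 && caps.fma {
        "influxdb_iox-haswell"
    } else {
        "influxdb_iox-baseline"
    }
}

/// Run the chosen executable, forwarding the launcher's own arguments.
pub fn launch() -> io::Result<ExitStatus> {
    let binary = pick_binary(detect());
    Command::new(binary).args(env::args().skip(1)).status()
}
```

   Note that every build variant ships in full, which is the code duplication cost mentioned in the first bullet.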
   
   With all of this in mind, I don't have any opinion on what is the best 
option for `arrow`, but it might be helpful if I summarized all of this in 
`multiversion`'s documentation.


