Hi - I'm thinking about implementing an ARM NEON flavour for nova-simd. I have some questions...
In common/include/nova-simd/vec.hpp, vec_generic.hpp is always included, even when SSE is enabled. Is there a kind of override semantics going on here, i.e. does anything not implemented in vec_sse.hpp fall back to the vec_generic implementation?

Why does vec_generic say "template <typename float_type> struct vec" when vec_sse says "template <> struct vec<float>"?

Why "typedef __m128 internal_vector_type"? I don't see the latter used anywhere.

If implementing a NEON version, can I implement any subset of optimised instructions that I choose, or are there risks of breakage? (At the moment I'm not worrying about whether the implementation is optimal, just whether I can implement it incrementally without jeopardising correctness.) For example, I might start with NEON versions of load(), store(), get() and set() before getting on to the actual manipulations.

It looks to me like I don't need to worry about leftovers (e.g. the last 3 floats in an array of size 19): they're handled elsewhere, and I just need to deal with my chosen-sized chunks (i.e. 128-bit). Is that correct?

Thanks,
Dan
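
P.S. To make the template and leftovers questions concrete, here is a minimal sketch of how I currently read the pattern. This is not nova-simd code: the member names (lane, size), the plain-array storage and the copy_in_chunks() driver are all made up for illustration, and the float specialisation uses an ordinary loop where a real backend would use SSE or NEON intrinsics. As far as I understand C++ itself, an explicit specialisation replaces the primary template wholesale rather than inheriting from it, so any fallback to generic code would have to come from somewhere other than the specialisation mechanism.

```cpp
#include <cstddef>

// Hypothetical stand-in for the generic template: one scalar per "vector".
template <typename float_type>
struct vec {
    static const std::size_t size = 1;
    float_type lane[1];
    void load(const float_type* src) { lane[0] = src[0]; }
    void store(float_type* dst) const { dst[0] = lane[0]; }
};

// Explicit specialisation for float: four lanes, i.e. one 128-bit chunk.
// Nothing is inherited from the primary template, so every member that
// callers use has to be written out again here.
template <>
struct vec<float> {
    static const std::size_t size = 4;
    float lane[4];
    void load(const float* src) {
        for (std::size_t i = 0; i < size; ++i) lane[i] = src[i];
    }
    void store(float* dst) const {
        for (std::size_t i = 0; i < size; ++i) dst[i] = lane[i];
    }
};

// Hypothetical driver: copies whole chunks through vec<T>, then handles the
// leftover elements one at a time. Returns the number of leftover elements.
template <typename T>
std::size_t copy_in_chunks(const T* src, T* dst, std::size_t n) {
    const std::size_t step = vec<T>::size;
    const std::size_t whole = n - n % step;  // largest multiple of step <= n
    vec<T> v;
    for (std::size_t i = 0; i < whole; i += step) {
        v.load(src + i);
        v.store(dst + i);
    }
    for (std::size_t i = whole; i < n; ++i) dst[i] = src[i];  // scalar tail
    return n - whole;
}
```

With an array of 19 floats, the chunked loop in this sketch covers the first 16 elements and the scalar tail covers the last 3. My question is whether nova-simd has an equivalent of that tail loop somewhere outside the vec class.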
