https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100627
Bug ID: 100627
Summary: missing optimization
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: g.peterh...@t-online.de
Target Milestone: ---

Hello gcc team,

I think I wrote something like this a long time ago, but I'm not sure.

I think the standard conversion uint64_t -> float/double is inefficient when AVX512 is not available. This holds at least on x86; with SVE or on other CPUs it may not be the case.

Problems:
- a lot of conditional jumps are generated, which is not friendly to the branch prediction unit (BPU)
- the generated code is therefore not branch-free
- larger code size

I briefly implemented a few conversions for SSE/SSE2 (https://godbolt.org/z/n63WedKT9).

Advantages:
- branch-free
- mostly smaller code size
- faster

Wouldn't it make sense to implement the standard conversion in this way (including for AVX/AVX2)?

thx
Gero
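
For illustration, here is a minimal sketch of one well-known branch-free way to convert uint64_t to double without AVX512. It is not necessarily the code behind the Godbolt link above. The function name `u64_to_double_branchfree` is hypothetical, and the sketch assumes IEEE-754 binary64 doubles, the default round-to-nearest mode, and no value-changing floating-point optimizations such as -ffast-math.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: branch-free uint64_t -> double conversion.
 * Assumes IEEE-754 binary64 and round-to-nearest. */
static double u64_to_double_branchfree(uint64_t x)
{
    /* Put the upper 32 bits into the mantissa of 2^84:
     * the resulting double equals 2^84 + hi * 2^32. */
    uint64_t hi_bits = (x >> 32) | UINT64_C(0x4530000000000000);

    /* Put the lower 32 bits into the mantissa of 2^52:
     * the resulting double equals 2^52 + lo. */
    uint64_t lo_bits = (x & UINT64_C(0xFFFFFFFF)) | UINT64_C(0x4330000000000000);

    double hi, lo;
    memcpy(&hi, &hi_bits, sizeof hi);  /* bit cast, no numeric conversion */
    memcpy(&lo, &lo_bits, sizeof lo);

    /* Subtracting 2^84 + 2^52 is exact and leaves hi * 2^32 - 2^52.
     * The final addition yields hi * 2^32 + lo with a single rounding. */
    return (hi - 0x1.00000001p84) + lo;
}
```

A call such as `u64_to_double_branchfree(UINT64_MAX)` should give the same result as the cast `(double)UINT64_MAX`. The sequence consists only of a shift, a mask, two ORs, one subtraction and one addition, so it contains no conditional jumps.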