On 6/21/19 6:34 PM, Andrew Jones wrote: > + /* > + * In sve_vq_map each set bit is a supported vector length of > + * (bit-number + 1) * 16 bytes, i.e. each bit number + 1 is the vector > + * length in quadwords. We need a map size twice the maximum > + * quadword length though because we use two bits for each vector > + * length in order to track three states: uninitialized, off, and on. > + */ > + DECLARE_BITMAP(sve_vq_map, ARM_MAX_VQ * 2);
I don't see that having one continuous bitmap is more convenient than two. Indeed, there appear to be several places that would be clearer with two. > +static arm_vq_state arm_cpu_vq_map_get(ARMCPU *cpu, int vq) > +{ > + assert(vq <= ARM_MAX_VQ); > + > + return test_bit(vq - 1, cpu->sve_vq_map) | > + test_bit(vq - 1 + ARM_MAX_VQ, cpu->sve_vq_map) << 1; > +} Neither easier nor more complex w/ one or two bitmaps. > +static void arm_cpu_vq_map_init(ARMCPU *cpu) > +{ > + bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ * 2); > + bitmap_set(cpu->sve_vq_map, ARM_MAX_VQ, ARM_MAX_VQ); > +} Clearer with two bitmaps. bitmap_zero(cpu->sve_vq_map, ARM_MAX_VQ); bitmap_set(cpu->sve_vq_uninit_map, 0, ARM_MAX_VQ); > +static bool arm_cpu_vq_map_is_finalized(ARMCPU *cpu) > +{ > + DECLARE_BITMAP(map, ARM_MAX_VQ * 2); > + > + bitmap_zero(map, ARM_MAX_VQ * 2); > + bitmap_set(map, ARM_MAX_VQ, ARM_MAX_VQ); > + bitmap_and(map, map, cpu->sve_vq_map, ARM_MAX_VQ * 2); > + > + return bitmap_empty(map, ARM_MAX_VQ * 2); > +} Definitely clearer w/ 2 bitmaps, return bitmap_empty(cpu->sve_vq_uninit_map); As for how sve-max-vq=Y and sveX={on,off} interoperate... I wonder if we can just remove cpu->sve_max_vq. That might simplify your code a bit. What if sve-max-vq=Y "expands" to for (X = 1; X <= Y; X++) { sve(X*128)=on } Then you've got a reasonable in-order definition of how those two sets of switches interoperate. The uses of cpu->sve_max_vq within cpu.c and cpu64.c are all initialization stuff that you're replacing. The use within sve_zcr_len_for_el can be replaced by AVR_MAX_VQ. Your "select supported vector size not larger than" code goes at the end of that function, such that we select a supported maximum no larger than the raw .LEN values. The use within aarch64_sve_narrow_vq should in fact assert that the given vq is set within cpu->sve_vq_map. r~