On 11/06/2018 10:23 AM, Aubrey Li wrote:
> +static inline void update_avx_state(struct avx_state *avx)
> +{
> +	/*
> +	 * Check if XGETBV with ECX = 1 supported. XGETBV with ECX = 1
> +	 * returns the logical-AND of XCR0 and XINUSE. XINUSE is a bitmap
> +	 * by which the processor tracks the status of various components.
> +	 */
> +	if (!use_xgetbv1()) {
> +		avx->state = 0;
> +		return;
> +	}
> +	/*
> +	 * XINUSE is dynamic to track component state because VZEROUPPER
> +	 * happens on every function end and reset the bitmap to the
> +	 * initial configuration.
> +	 *
> +	 * State decay is introduced to solve the race condition between
> +	 * context switch and a function end. State is aggressively set
> +	 * once it's detected but need to be cleared by decay 3 context
> +	 * switches
> +	 */
> +	if (xgetbv(XINUSE_STATE_BITMAP_INDEX) & XFEATURE_MASK_Hi16_ZMM) {
> +		avx->state = 1;
> +		avx->decay_count = AVX_STATE_DECAY_COUNT;
> +	} else {
> +		if (!avx->decay_count)

Seems like the check should be if (avx->decay_count), since we should
only decrement decay_count when it is non-zero.

> +			avx->decay_count--;
> +		else
> +			avx->state = 0;
> +	}
> +}

Tim
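
To illustrate the point about the inverted test: as quoted, a zero decay_count gets decremented and a non-zero one clears the state right away, so the decay never takes effect. Below is a minimal user-space sketch of the decay branch with the check flipped. It is not the kernel code: the struct layout is hypothetical (the real struct avx_state is not shown in the quote), AVX_STATE_DECAY_COUNT is assumed to be 3 based on the patch comment, and the zmm_in_use parameter is a stand-in for the xgetbv() test.

```c
#include <stdbool.h>

/* Assumed value: the patch comment mentions decaying over 3 context switches. */
#define AVX_STATE_DECAY_COUNT 3

/* Hypothetical layout; the real struct avx_state is not shown in the quote. */
struct avx_state {
	unsigned int state;
	unsigned int decay_count;
};

/*
 * zmm_in_use stands in for
 * xgetbv(XINUSE_STATE_BITMAP_INDEX) & XFEATURE_MASK_Hi16_ZMM,
 * so the sketch can run without XGETBV support.
 */
static inline void update_avx_state_sketch(struct avx_state *avx, bool zmm_in_use)
{
	if (zmm_in_use) {
		/* Usage detected: set the state and re-arm the decay counter. */
		avx->state = 1;
		avx->decay_count = AVX_STATE_DECAY_COUNT;
	} else {
		/* Flipped check: drain the counter first, clear the state after. */
		if (avx->decay_count)
			avx->decay_count--;
		else
			avx->state = 0;
	}
}
```

With this ordering the state stays set while the counter drains and is cleared on the first call after the counter has reached zero.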