On Thu, Aug 30, 2012 at 10:09:06AM +0900, Takuya Yoshikawa wrote:
> On Thu, 30 Aug 2012 01:51:20 +0300
> "Michael S. Tsirkin" <[email protected]> wrote:
>
> > This text:
> > + if (likely(!word_offset && !word[0]))
> > + return -1;
> > is a left-over from the original implementation.
> >
> > There we did a ton of gratuitous calls to interrupt
> > injection, so it was important to streamline that path:
> > lookups in empty ISR and IRR.
> >
> > This is less common nowadays since we have kvm_make_request.
> > Additionally, 8680b94b0e6046af2644c17313287ec0cb5843dc
> > means for ISR lookups we do not need to scan
> > the vector at all, and
> > 33e4c68656a2e461b296ce714ec322978de85412
> > means we never need to scan IRR if it is empty.
> >
> > So I think the likely() here has really become bogus by now.
>
> Yes, thank you for the explanation.
>
> > But optimizing the vector lookups is the wrong thing to do,
> > I suspect: these lookups should basically almost never trigger.
>
> This patch is not optimizing anything; it just removes
> unnatural logic which could be justified only by that
> bogus likely().
>
> I explain this below.
>
> > Besides a single possible case: a single bit in IRR might
> > still be somewhat common I think.
>
> Then, the current code is doing very bad thing:
>
> while ((word_offset != 0) && (word[(--word_offset) << 2] == 0))
> continue;
>
> if (likely(!word_offset && !word[0]))
> return -1;
> else
> return fls(word[word_offset << 2]) - 1 + (word_offset << 5);
>
> - checking word[0] separately does not make sense
> - using fls(), not __fls(), means we check (word[i] == 0) twice
>
> Actually, how can this likely() work so effectively even when we
> return -1? If we do the (word[0] == 0) check in the same
> loop, the CPU should naturally predict the result:
>
> for (word_offset = (MAX_APIC_VECTOR >> 5) - 1;
> word_offset >= 0; --word_offset) {
> word = p[word_offset << 2];
> if (word)
> return __fls(word) + (word_offset << 5);
> }
>
> return -1;
>
> > If we want to optimize the case of a single bit set in IRR,
> > my guess is the best thing is to replace irr_pending with a
> > generalization of isr_count/highest_isr_cache so we can have
> > a highest IRR cache. This will avoid scans completely.
>
> Yes, but let's first fix the wrong code.
>
> Thanks,
> Takuya
After staring at your code for a while, it does appear to
do the right thing, and it looks cleaner than what
we have now. The commit log could be clearer, though.
It should state something like:
Clean up code in find_highest_vector:
- likely() is there for historical reasons; it is no longer
clear that it optimizes for the correct branch,
and find_highest_vector might not be worth optimizing at all
- checking word[0] separately does not make sense:
if (word[word_offset << 2]) would be clearer
- since we test word[...] != 0 beforehand, we can use __fls
instead of fls()
- a for loop iterating over an array is clearer than a while loop
Since you are going for cleanups, maybe we could also add:
- get rid of the ugly >> 5 and << 2 shifts; switch to using REG_POS instead?
Something like the pseudo code below would do this, I think:
#define APIC_VECTORS_PER_REG 32
int vec;
for (vec = MAX_APIC_VECTOR - APIC_VECTORS_PER_REG;
     vec >= 0; vec -= APIC_VECTORS_PER_REG) {
	u32 *reg = bitmap + REG_POS(vec);
	if (*reg)
		return __fls(*reg) + vec;
}
return -1;
count_vectors would be similar:
for (vec = 0; vec < MAX_APIC_VECTOR; vec += APIC_VECTORS_PER_REG) {
	u32 *reg = bitmap + REG_POS(vec);
	count += hweight32(*reg);
}
hmm?
--
MST
--