Andi Kleen wrote:
so INVLPG makes sense for pagetable fault related single-address
flushes, but they rarely make sense for range flushes. (and that's how
Linux uses it)
I think it would be an interesting experiment to switch flush_tlb_range()
over to INVLPG if the length is below some
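
A minimal sketch of the experiment suggested above, written as a user-space C model rather than kernel code. The cutoff INVLPG_MAX_PAGES, its value of 8, and the helper names pages_in_range() and choose_flush() are all hypothetical; the thread does not give a threshold, and a real value would have to come from measurement on the CPUs of interest.

```c
#include <assert.h>

#define PAGE_SIZE 4096UL

/* Hypothetical cutoff: at or below this many pages, flush page by page. */
#define INVLPG_MAX_PAGES 8UL

enum flush_kind {
    FLUSH_PER_PAGE, /* one INVLPG per page in the range */
    FLUSH_FULL      /* one full TLB flush, e.g. by reloading cr3 */
};

/* Number of pages touched by the half-open byte range [start, end). */
static unsigned long pages_in_range(unsigned long start, unsigned long end)
{
    unsigned long first, last;

    assert(start <= end);
    first = start & ~(PAGE_SIZE - 1);                  /* round down */
    last = (end + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1);   /* round up */
    return (last - first) / PAGE_SIZE;
}

/* Pick a strategy purely from the range length; no hardware is touched. */
static enum flush_kind choose_flush(unsigned long start, unsigned long end)
{
    if (pages_in_range(start, end) <= INVLPG_MAX_PAGES)
        return FLUSH_PER_PAGE;
    return FLUSH_FULL;
}
```

With these assumed values, choose_flush(0, 8 * PAGE_SIZE) selects the per-page path and choose_flush(0, 9 * PAGE_SIZE) falls back to the full flush.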
On Saturday 26 January 2008 01:11:28 Ingo Molnar wrote:
> (plus
> any add-on TLB miss costs - but those are amortized quite well as long
> as the pagetables are well cached - which they usually are on today's
> 2MB-ish L2 caches),
Did you measure the cost of that amortizing too?
My guess is
Jeremy Fitzhardinge wrote:
Now, all of this reminds me of something somewhat messy: if we share
the kernel page tables for trampoline page tables, as discussed
elsewhere, we HAVE to do a complete, all-tlb-including-global-pages
flush after use, since the kernel pages are global and otherwise
H. Peter Anvin wrote:
Keir Fraser wrote:
On 25/1/08 22:54, "Jeremy Fitzhardinge" <[EMAIL PROTECTED]> wrote:
The only possibly relevant comment I can find in vol3a is:
Older IA-32 processors that implement the PAE mechanism use uncached
accesses when loading page-directory-pointer
Ingo Molnar wrote:
* Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
Is there any guide about the tradeoff of when to use invlpg vs
flushing the whole tlb? 1 page? 10? 90% of the tlb?
i made measurements some time ago and INVLPG was quite uniformly slow on
all important CPU types - on the
Keir Fraser wrote:
On 25/1/08 22:54, "Jeremy Fitzhardinge" <[EMAIL PROTECTED]> wrote:
The only possibly relevant comment I can find in vol3a is:
Older IA-32 processors that implement the PAE mechanism use uncached
accesses when loading page-directory-pointer table entries. This
* Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote:
> Is there any guide about the tradeoff of when to use invlpg vs
> flushing the whole tlb? 1 page? 10? 90% of the tlb?
i made measurements some time ago and INVLPG was quite uniformly slow on
all important CPU types - on the order of 100+
Keir Fraser wrote:
Go read the Intel application note "TLBs, Paging-Structure Caches, and Their
Invalidation" at http://www.intel.com/design/processor/applnots/317080.pdf
Section 8.1 explains about the PDPTR cache in 32-bit PAE mode, which can
only be refreshed by appropriate tickling of CR0,
On 25/1/08 22:54, "Jeremy Fitzhardinge" <[EMAIL PROTECTED]> wrote:
> The only possibly relevant comment I can find in vol3a is:
>
> Older IA-32 processors that implement the PAE mechanism use uncached
> accesses when loading page-directory-pointer table entries. This
> behavior is
>
H. Peter Anvin wrote:
Jeremy Fitzhardinge wrote:
PAE mode requires that we reload cr3 in order to guarantee that
changes to the pgd will be noticed by the processor. This means that
in principle pud_clear needs to reload cr3 every time. However,
because reloading cr3 implies a tlb flush, we
PAE mode requires that we reload cr3 in order to guarantee that
changes to the pgd will be noticed by the processor. This means that
in principle pud_clear needs to reload cr3 every time. However,
because reloading cr3 implies a tlb flush, we want to avoid it where
possible.
pud_clear() is only
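
To illustrate the tradeoff described above, here is a user-space C model of one way to skip the reload: only reload cr3 when the cleared entry belongs to the top-level table the CPU is currently using. This is an assumption made for illustration, not necessarily what the patch does. struct model_cpu, model_pud_clear() and the reload counter are hypothetical stand-ins for real hardware state.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit PAE has four page-directory-pointer entries at the top level. */
#define PTRS_PER_PGD 4

/* Hypothetical model of the CPU state that matters here. */
struct model_cpu {
    const uint64_t *active_pgd; /* table the model's cr3 points at */
    unsigned int cr3_reloads;   /* counts simulated cr3 reloads */
};

/* True if entry lies inside the currently active top-level table. */
static bool entry_in_active_pgd(const struct model_cpu *cpu,
                                const uint64_t *entry)
{
    uintptr_t base = (uintptr_t)cpu->active_pgd;
    uintptr_t addr = (uintptr_t)entry;

    return addr >= base && addr < base + PTRS_PER_PGD * sizeof(uint64_t);
}

/* Clear one top-level entry; simulate a cr3 reload only when it is live. */
static void model_pud_clear(struct model_cpu *cpu, uint64_t *entry)
{
    *entry = 0;
    if (entry_in_active_pgd(cpu, entry))
        cpu->cr3_reloads++; /* stands in for reloading cr3 (and the TLB flush) */
}
```

In this model, clearing an entry in a table that is not active leaves cr3_reloads unchanged, while clearing an entry in the active table bumps it by one.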
Jeremy Fitzhardinge wrote:
PAE mode requires that we reload cr3 in order to guarantee that
changes to the pgd will be noticed by the processor. This means that
in principle pud_clear needs to reload cr3 every time. However,
because reloading cr3 implies a tlb flush, we want to avoid it where
On Saturday 26 January 2008 01:11:28 Ingo Molnar wrote:
(plus
any add-on TLB miss costs - but those are amortized quite well as long
as the pagetables are well cached - which they usually are on today's
2MB-ish L2 caches),
Did you measure the cost of that amortizing too?
My guess is that
Keir Fraser wrote:
Go read the Intel application note TLBs, Paging-Structure Caches, and Their
Invalidation at http://www.intel.com/design/processor/applnots/317080.pdf
Section 8.1 explains about the PDPTR cache in 32-bit PAE mode, which can
only be refreshed by appropriate tickling of CR0, CR3
On 25/1/08 22:54, Jeremy Fitzhardinge [EMAIL PROTECTED] wrote:
The only possibly relevant comment I can find in vol3a is:
Older IA-32 processors that implement the PAE mechanism use uncached
accesses when loading page-directory-pointer table entries. This
behavior is model