[PATCH 02/10] Clean up duplicate includes in arch/i386/xen/

2007-10-12 Thread Jeremy Fitzhardinge
This patch cleans up duplicate includes in arch/i386/xen/ Signed-off-by: Jesper Juhl <[EMAIL PROTECTED]> Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> --- arch/i386/xen/enlighten.c |1 - arch/i386/xen/mmu.c |2 -- 2 files changed,

[PATCH 04/10] xen: add batch completion callbacks

2007-10-12 Thread Jeremy Fitzhardinge
when unpinning pagetables" ] Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Stable Kernel <[EMAIL PROTECTED]> --- arch/i386/xen/multicalls.c | 29 ++--- arch/i386/xen/multicalls.h |3 +++ 2 files changed, 29 inse

[PATCH 03/10] xen: yield to IPI target if necessary

2007-10-12 Thread Jeremy Fitzhardinge
When sending a call-function IPI to a vcpu, yield if the vcpu isn't running. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> --- arch/i386/xen/smp.c | 14 ++ arch/i386/xen/time.c|6 ++ arch/i386/xen/xen-ops.h |2 ++ 3 files changed, 18 insertions

[PATCH RFC 2/2] paravirt: clean up lazy mode handling

2007-10-12 Thread Jeremy Fitzhardinge
imply leaving and re-entering the lazy mode. The result is that the Xen, lguest and VMI lazy mode implementations are much simpler. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Zach Amsden <[EMAIL PROTECTED]> Cc: Rusty Russell <

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-12 Thread Jeremy Fitzhardinge
Rusty Russell wrote: > Sure, but this can actually be a temporary thing inside the patch code > (or at > least static to that file if it's too big for the stack). > > struct paravirt_ops patch_template = { .pv_info = pv_info, .pv_cpu_ops > = > pv_cpu_ops, ... }; > > Then you

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-12 Thread Jeremy Fitzhardinge
Jeremy Fitzhardinge wrote: > I guess we could create a special-case interface to do the same thing > with XFS mappings, but it would be nicer to have something more generic. > > Is my analysis correct? Or should XFS not be holding stray mappings? > Or is there already some

Interaction between Xen and XFS: stray RW mappings

2007-10-12 Thread Jeremy Fitzhardinge
Hi Dave & other XFS folk, I'm tracking down a bug which appears to be a bad interaction between XFS and Xen. It looks like XFS is holding RW mappings on free pages, which Xen is trying to get an exclusive RO mapping on so it can turn them into pagetables. I'm assuming the pages are actually

Interaction between Xen and XFS: stray RW mappings

2007-10-12 Thread Jeremy Fitzhardinge
Hi Dave & other XFS folk, I'm tracking down a bug which appears to be a bad interaction between XFS and Xen. It looks like XFS is holding RW mappings on free pages, which Xen is trying to get an exclusive RO mapping on so it can turn them into pagetables. I'm assuming the pages are actually

Re: Interaction between Xen and XFS: stray RW mappings

2007-10-12 Thread Jeremy Fitzhardinge
Jeremy Fitzhardinge wrote: I guess we could create a special-case interface to do the same thing with XFS mappings, but it would be nicer to have something more generic. Is my analysis correct? Or should XFS not be holding stray mappings? Or is there already some kind of generic mechanism I

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-12 Thread Jeremy Fitzhardinge
Rusty Russell wrote: Sure, but this can actually be a temporary thing inside the patch code (or at least static to that file if it's too big for the stack). struct paravirt_ops patch_template = { .pv_info = pv_info, .pv_cpu_ops = pv_cpu_ops, ... }; Then you can even

[PATCH RFC 2/2] paravirt: clean up lazy mode handling

2007-10-12 Thread Jeremy Fitzhardinge
the lazy mode. The result is that the Xen, lguest and VMI lazy mode implementations are much simpler. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Andi Kleen [EMAIL PROTECTED] Cc: Zach Amsden [EMAIL PROTECTED] Cc: Rusty Russell [EMAIL PROTECTED] Cc: Avi Kivity [EMAIL PROTECTED] Cc

[PATCH 03/10] xen: yield to IPI target if necessary

2007-10-12 Thread Jeremy Fitzhardinge
When sending a call-function IPI to a vcpu, yield if the vcpu isn't running. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] --- arch/i386/xen/smp.c | 14 ++ arch/i386/xen/time.c|6 ++ arch/i386/xen/xen-ops.h |2 ++ 3 files changed, 18 insertions(+), 4

[PATCH 10/10] xfs: eagerly remove vmap mappings to avoid upsetting Xen

2007-10-12 Thread Jeremy Fitzhardinge
unmap its mappings. [ Stable: This works around a bug in 2.6.23. We may come up with a better solution for mainline, but this seems like a low-impact fix for the stable kernel. ] Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: XFS masters [EMAIL PROTECTED] Cc: Stable kernel [EMAIL

[PATCH 07/10] xen: ask the hypervisor how much space it needs reserved

2007-10-12 Thread Jeremy Fitzhardinge
Ask the hypervisor how much space it needs reserved, since 32-on-64 doesn't need any space, and it may change in future. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] --- arch/i386/xen/enlighten.c | 13 - 1 file changed, 12 insertions(+), 1 deletion

[PATCH 08/10] xen: fix incorrect vcpu_register_vcpu_info hypercall argument

2007-10-12 Thread Jeremy Fitzhardinge
The kernel's copy of struct vcpu_register_vcpu_info was out of date, at best causing the hypercall to fail and the guest kernel to fall back to the old mechanism, or worse, causing random memory corruption. [ Stable folks: applies to 2.6.23 ] Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED

[PATCH 09/10] xen: add some debug output for failed multicalls

2007-10-12 Thread Jeremy Fitzhardinge
Multicalls are expected to never fail, and the normal response to a failed multicall is very terse. In the interests of better debuggability, add some more verbose output. It may be worth turning this off once it all seems more tested. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED

[PATCH 00/10] REVIEW: Xen patches for 2.6.24

2007-10-12 Thread Jeremy Fitzhardinge
This is my current set of updates to Xen for 2.6.24. This is largely a bugfix set, and a couple of them are also relevant to 2.6.23. These are in the pre-x86 merge form; I'll update them once the merge goes into git. Quick overview: - remove some dead code in arch/i386/mm/init.c - clean up

[PATCH 01/10] remove dead code in pgtable_cache_init

2007-10-12 Thread Jeremy Fitzhardinge
The conversion from using a slab cache to quicklist left some residual dead code. I note that in the conversion it now always allocates a whole page for the pgd, rather than the 32 bytes needed for a PAE pgd. Was this intended? Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Christoph

[PATCH 02/10] Clean up duplicate includes in arch/i386/xen/

2007-10-12 Thread Jeremy Fitzhardinge
This patch cleans up duplicate includes in arch/i386/xen/ Signed-off-by: Jesper Juhl [EMAIL PROTECTED] Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] --- arch/i386/xen/enlighten.c |1 - arch/i386/xen/mmu.c |2 -- 2 files changed, 3 deletions

[PATCH 04/10] xen: add batch completion callbacks

2007-10-12 Thread Jeremy Fitzhardinge
unpinning pagetables ] Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stable Kernel [EMAIL PROTECTED] --- arch/i386/xen/multicalls.c | 29 ++--- arch/i386/xen/multicalls.h |3 +++ 2 files changed, 29 insertions(+), 3 deletions

[PATCH 05/10] xen: deal with stale cr3 values when unpinning pagetables

2007-10-12 Thread Jeremy Fitzhardinge
been completed. Other processors wishing to unpin a pagetable can check other vcpu's xen_current_cr3 values to see if any cross-cpu IPIs are needed to clean things up. [ Stable folks: 2.6.23 bugfix ] Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stable Kernel [EMAIL PROTECTED

[PATCH 06/10] xen: lock pte pages while pinning/unpinning

2007-10-12 Thread Jeremy Fitzhardinge
the PREEMPT_BITS portion of preempt counter - it locks and pins each pte page individually, and then finally pins the whole pagetable. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Rik van Riel [EMAIL PROTECTED] Cc: Hugh Dickins [EMAIL PROTECTED] Cc: David Rientjes [EMAIL PROTECTED] Cc: Andrew

Re: [PATCH] i386: remove dead code in pgtable_cache_init

2007-10-10 Thread Jeremy Fitzhardinge
Christoph Lameter wrote: > I believe that virtualization support needed a full pgd. > Yes, Xen requires it for PAE pgds, at least at the moment. But native, lguest, vmi and kvm don't. I'd made it so that the memory overhead was only paid in the Xen case. Allocating a whole page all the time

[PATCH] i386: remove dead code in pgtable_cache_init

2007-10-10 Thread Jeremy Fitzhardinge
The conversion from using a slab cache to quicklist left some residual dead code. I note that in the conversion it now always allocates a whole page for the pgd, rather than the 32 bytes needed for a PAE pgd. Was this intended? Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-10 Thread Jeremy Fitzhardinge
Huh, thought I did a more complete reply to this. Must have farted on it. Rusty Russell wrote: > Thanks Jeremy, I've actually taken time to finally review this in detail (I'm > assuming you'll refactor as necessary after the x86 arch merger). > Yep. >> +struct paravirt_ops paravirt_ops; >>

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-10 Thread Jeremy Fitzhardinge
Rusty Russell wrote: >> +OFFSET(PARAVIRT_enabled, pv_info, paravirt_enabled); >> > > I think this gives the right answer for the wrong reasons? > Actually, it was OK. The only use is in entry.S, which uses pv_info + PARAVIRT_enabled. J - To unsubscribe from this list: send the

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-10 Thread Jeremy Fitzhardinge
Rusty Russell wrote: +OFFSET(PARAVIRT_enabled, pv_info, paravirt_enabled); I think this gives the right answer for the wrong reasons? Actually, it was OK. The only use is in entry.S, which uses pv_info + PARAVIRT_enabled. J

Re: [PATCH RFC REPOST 1/2] paravirt: refactor struct paravirt_ops into smaller pv_*_ops

2007-10-10 Thread Jeremy Fitzhardinge
Huh, thought I did a more complete reply to this. Must have farted on it. Rusty Russell wrote: Thanks Jeremy, I've actually taken time to finally review this in detail (I'm assuming you'll refactor as necessary after the x86 arch merger). Yep. +struct paravirt_ops paravirt_ops; +

[PATCH] i386: remove dead code in pgtable_cache_init

2007-10-10 Thread Jeremy Fitzhardinge
The conversion from using a slab cache to quicklist left some residual dead code. I note that in the conversion it now always allocates a whole page for the pgd, rather than the 32 bytes needed for a PAE pgd. Was this intended? Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Christoph

Re: [PATCH] i386: remove dead code in pgtable_cache_init

2007-10-10 Thread Jeremy Fitzhardinge
Christoph Lameter wrote: I believe that virtualization support needed a full pgd. Yes, Xen requires it for PAE pgds, at least at the moment. But native, lguest, vmi and kvm don't. I'd made it so that the memory overhead was only paid in the Xen case. Allocating a whole page all the time

[PATCH 2/3] xen-netfront: rearrange netfront structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Keep tx and rx elements separate on different cachelines to prevent bouncing. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Stephen Hemminger <[EMAIL PROTECTED]> Cc: Christoph Hellwig <[EMAIL PROTECTED]> --- drivers/net/xen-netfront.c | 37 ++
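
A minimal userspace sketch of the idea in this patch description: place the transmit and receive fields on different cache lines so that one CPU updating tx state does not invalidate the line another CPU is using for rx state. The structure, its field names and the 64-byte line size are assumptions made for illustration; this is not the real struct netfront_info, and the snippet above does not show which alignment annotation the patch actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache line size; real hardware may differ. */
#define CACHELINE_SIZE 64

/* Hypothetical structure for illustration, not the driver's real layout. */
struct demo_netfront {
    /* Transmit-side state starts on its own cache line. */
    _Alignas(CACHELINE_SIZE) unsigned int tx_prod;
    unsigned int tx_cons;

    /* Receive-side state is pushed onto the next cache line. */
    _Alignas(CACHELINE_SIZE) unsigned int rx_prod;
    unsigned int rx_cons;
};

/* Returns the index of the cache line that contains the given byte offset. */
static size_t cacheline_of(size_t offset)
{
    return offset / CACHELINE_SIZE;
}

/* Returns 1 if the last tx field and the first rx field are on different lines. */
static int tx_rx_separated(void)
{
    size_t tx_last = offsetof(struct demo_netfront, tx_cons)
                     + sizeof(unsigned int) - 1;
    size_t rx_first = offsetof(struct demo_netfront, rx_prod);

    return cacheline_of(tx_last) < cacheline_of(rx_first);
}
```

The cost of this layout is padding: the structure grows to a multiple of the line size, which is usually acceptable for a per-device structure that exists once per interface.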

[PATCH 3/3] xen-netfront: remove dead code

2007-10-09 Thread Jeremy Fitzhardinge
This patch removes some residual dead code left over from removing the "flip" receive mode. This patch doesn't change the generated output at all, since gcc already realized it was dead. This resolves the "regression" reported by Adrian. Signed-off-by: Jeremy Fitzhardinge

[PATCH 1/3] xen-netfront: use net_device's stats structure

2007-10-09 Thread Jeremy Fitzhardinge
struct net_device has its own stats structure, so use that instead. Also, we can use the default get_stats function. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Stephen Hemminger <[EMAIL PROTECTED]> Cc: Rusty Russell <[EMAIL PROTECTED]> --- drivers/net/xen

Re: [PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Jeff Garzik wrote: > ACK but does not apply to jgarzik/netdev-2.6.git#upstream nor > davem/net-2.6.24.git OK, looks like you don't have the other two patches. Will post in a sec. J

Re: [PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Jeff Garzik wrote: > Jeremy Fitzhardinge wrote: >> Keep tx and rx elements separate on different cachelines to prevent >> bouncing. >> >> Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> >> Cc: Stephen Hemminger <[EMAIL PROTECTED]>

Re: __LITTLE_ENDIAN vs. __LITTLE_ENDIAN_BITFIELD

2007-10-09 Thread Jeremy Fitzhardinge
Krzysztof Halasa wrote: > Some pointer maybe? > Erm, a bit of googling will turn one up, but the gist is that IBM has traditionally bit 0 for MSB and x for LSB. It's a pain to work with: for one, bits in the same place in a word (say, control register) are renumbered in 32 vs 64. And I've
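
To make the numbering complaint above concrete, here is a small sketch of converting between the IBM-style convention (bit 0 is the most significant bit) and the more common convention (bit 0 is the least significant bit). The helper names are made up for this example. It shows why the same physical bit gets a different number when a register is widened from 32 to 64 bits.

```c
#include <assert.h>

/* Convert an MSB-0 (IBM-style) bit number to an LSB-0 bit number
 * for a register that is `width` bits wide. */
static unsigned int msb0_to_lsb0(unsigned int ibm_bit, unsigned int width)
{
    return width - 1 - ibm_bit;
}

/* Inverse mapping: LSB-0 bit number to MSB-0 bit number.
 * The formula is the same because the mapping is its own inverse. */
static unsigned int lsb0_to_msb0(unsigned int lsb_bit, unsigned int width)
{
    return width - 1 - lsb_bit;
}

/* Mask selecting the bit with the given MSB-0 number (width <= 64). */
static unsigned long long msb0_mask(unsigned int ibm_bit, unsigned int width)
{
    return 1ULL << msb0_to_lsb0(ibm_bit, width);
}
```

With these helpers, the least significant bit of a control register is "bit 31" in a 32-bit view and "bit 63" in a 64-bit view, even though it sits in the same place in the word.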

Re: __LITTLE_ENDIAN vs. __LITTLE_ENDIAN_BITFIELD

2007-10-09 Thread Jeremy Fitzhardinge
Krzysztof Halasa wrote: > There is no such thing as bit-order. The data lines are numbered, > say, D0 - D31, with D0 being LSB (bit) and D31 MSB. > Uh-huh. Check out an IBM Power manual some time. J

[PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Keep tx and rx elements separate on different cachelines to prevent bouncing. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Stephen Hemminger <[EMAIL PROTECTED]> Cc: Christoph Hellwig <[EMAIL PROTECTED]> --- drivers/net/xen-netfront.c | 37 ++

[PATCH RFC REPOST 2/2] paravirt: clean up lazy mode handling

2007-10-09 Thread Jeremy Fitzhardinge
current when entering a lazy mode, and make sure that the mode is current when leaving). Also, flush is handled in a common way, by simply leaving and re-entering the lazy mode. The result is that the Xen and VMI lazy mode implementations are much simpler; as would lguest's be. Signed-off-by:

[PATCH RFC REPOST 2/2] paravirt: clean up lazy mode handling

2007-10-09 Thread Jeremy Fitzhardinge
a lazy mode, and make sure that the mode is current when leaving). Also, flush is handled in a common way, by simply leaving and re-entering the lazy mode. The result is that the Xen and VMI lazy mode implementations are much simpler; as would lguest's be. Signed-off-by: Jeremy Fitzhardinge [EMAIL

[PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Keep tx and rx elements separate on different cachelines to prevent bouncing. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stephen Hemminger [EMAIL PROTECTED] Cc: Christoph Hellwig [EMAIL PROTECTED] --- drivers/net/xen-netfront.c | 37 ++--- 1 file

Re: __LITTLE_ENDIAN vs. __LITTLE_ENDIAN_BITFIELD

2007-10-09 Thread Jeremy Fitzhardinge
Krzysztof Halasa wrote: There is no such thing as bit-order. The data lines are numbered, say, D0 - D31, with D0 being LSB (bit) and D31 MSB. Uh-huh. Check out an IBM Power manual some time. J

Re: __LITTLE_ENDIAN vs. __LITTLE_ENDIAN_BITFIELD

2007-10-09 Thread Jeremy Fitzhardinge
Krzysztof Halasa wrote: Some pointer maybe? Erm, a bit of googling will turn one up, but the gist is that IBM has traditionally bit 0 for MSB and x for LSB. It's a pain to work with: for one, bits in the same place in a word (say, control register) are renumbered in 32 vs 64. And I've

Re: [PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Jeff Garzik wrote: Jeremy Fitzhardinge wrote: Keep tx and rx elements separate on different cachelines to prevent bouncing. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stephen Hemminger [EMAIL PROTECTED] Cc: Christoph Hellwig [EMAIL PROTECTED] --- drivers/net/xen-netfront.c

Re: [PATCH] xen-netfront: rearrange netfront_info structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Jeff Garzik wrote: ACK but does not apply to jgarzik/netdev-2.6.git#upstream nor davem/net-2.6.24.git OK, looks like you don't have the other two patches. Will post in a sec. J

[PATCH 1/3] xen-netfront: use net_device's stats structure

2007-10-09 Thread Jeremy Fitzhardinge
struct net_device has its own stats structure, so use that instead. Also, we can use the default get_stats function. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stephen Hemminger [EMAIL PROTECTED] Cc: Rusty Russell [EMAIL PROTECTED] --- drivers/net/xen-netfront.c | 25

[PATCH 2/3] xen-netfront: rearrange netfront structure to separate tx and rx

2007-10-09 Thread Jeremy Fitzhardinge
Keep tx and rx elements separate on different cachelines to prevent bouncing. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc: Stephen Hemminger [EMAIL PROTECTED] Cc: Christoph Hellwig [EMAIL PROTECTED] --- drivers/net/xen-netfront.c | 37 ++--- 1 file

[PATCH 3/3] xen-netfront: remove dead code

2007-10-09 Thread Jeremy Fitzhardinge
This patch removes some residual dead code left over from removing the flip receive mode. This patch doesn't change the generated output at all, since gcc already realized it was dead. This resolves the regression reported by Adrian. Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED] Cc

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-08 Thread Jeremy Fitzhardinge
Arjan van de Ven wrote: > s/implemented/merged/ :) > > In fact shared pagetables are already there for hugepages. > For small pages it's a patch at this point. > Is it kept up to date? Where does it live? > no I'm not saying that. I'm just saying that I'm worried about the > locking

Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Jeremy Fitzhardinge
Randy Dunlap wrote: > but Tested-by: doesn't have to involve any "actually looking at/reading > the patch." Right? > > IOW, the patch could be ugly as sin but it works... > Sure, absolutely. I never said it's a substitute for review. An ugly working patch is useful, because it's the raw

Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Jeremy Fitzhardinge
Jan Engelhardt wrote: >> Acked-by: >> Tested-by: >> > > * Used by random people to express their (dis)like/experience with the > patch. > Tested-by is more valuable than acked-by, because it's empirical. Acked-by generally means "I don't generally object to the idea of the patch, but may

Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Jeremy Fitzhardinge
Jan Engelhardt wrote: Acked-by: Tested-by: * Used by random people to express their (dis)like/experience with the patch. Tested-by is more valuable than acked-by, because it's empirical. Acked-by generally means I don't generally object to the idea of the patch, but may not have

Re: RFC: reviewer's statement of oversight

2007-10-08 Thread Jeremy Fitzhardinge
Randy Dunlap wrote: but Tested-by: doesn't have to involve any actually looking at/reading the patch. Right? IOW, the patch could be ugly as sin but it works... Sure, absolutely. I never said it's a substitute for review. An ugly working patch is useful, because it's the raw material for

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-08 Thread Jeremy Fitzhardinge
Arjan van de Ven wrote: s/implemented/merged/ :) In fact shared pagetables are already there for hugepages. For small pages it's a patch at this point. Is it kept up to date? Where does it live? no I'm not saying that. I'm just saying that I'm worried about the locking robustness of

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
On Oct 7, 2007, at 9:58 AM, Arjan van de Ven wrote: On Sat, 06 Oct 2007 23:31:33 -0700 I presume I'm the first person to try holding multiple pte locks at once, so there's no existing locking order for these locks. I'm always traversing and locking the pagetable in virtual address order

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
Peter Zijlstra wrote: >> >> I presume this is because I'm holding multiple pte locks (class >> "__pte_lockptr(new)"). Is there some way I can tell lockdep this is OK? >> > > Yeah, the typical way is to use spin_lock_nested(lock, nesting_level), > this allows one to annotate these nestings.

lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
I'm writing some code which is doing some batch processing on pte pages, and so wants to hold multiple pte locks at once. This seems OK, but lockdep is giving me the warning: = [ INFO: possible recursive locking detected ] 2.6.23-rc9-paravirt #1673

lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
I'm writing some code which is doing some batch processing on pte pages, and so wants to hold multiple pte locks at once. This seems OK, but lockdep is giving me the warning: = [ INFO: possible recursive locking detected ] 2.6.23-rc9-paravirt #1673

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
Peter Zijlstra wrote: I presume this is because I'm holding multiple pte locks (class __pte_lockptr(new)). Is there some way I can tell lockdep this is OK? Yeah, the typical way is to use spin_lock_nested(lock, nesting_level), this allows one to annotate these nestings. However,

Re: lockdep: how to tell it multiple pte locks is OK?

2007-10-07 Thread Jeremy Fitzhardinge
On Oct 7, 2007, at 9:58 AM, Arjan van de Ven wrote: On Sat, 06 Oct 2007 23:31:33 -0700 I presume I'm the first person to try holding multiple pte locks at once, so there's no existing locking order for these locks. I'm always traversing and locking the pagetable in virtual address order

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andrew Morton wrote: > Well yes, but DMA_BIT_MASK(0) invokes undefined behaviour, generates a > compiler warning and evaluates to 0x (with my setup). > > That won't be a problem in practice, but it is strictly wrong and doesn't set > a good example for the children ;) > It's

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andreas Schwab wrote: > #define DMA_BIT_MASK(n) ((u64)-1 >> (64 - (n))) > Yeah, that's cleaner. J

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Robert P. J. Day wrote: > or you could take advantage of the macros in kernel.h and write that > as: > > +#define DMA_BIT_MASK(n) (((n) == 64) ? ULLONG_MAX : ((1ULL<<(n))-1)) > But that's a more indirect way of expressing "I want all 1's". J
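
The two mask formulations discussed in this thread can be compared in a small userspace sketch. The DEMO_ macro names and the checking function are invented for this example so they do not clash with the kernel's own DMA_BIT_MASK, and uint64_t stands in for the kernel's u64. The point is that the special-case form avoids shifting a 64-bit value by 64 when n is 64, while the shift-down form handles n = 64 naturally but would shift by 64 when n is 0.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Special-case form: n == 64 is handled separately so that
 * 1ULL << 64 (undefined behaviour) is never evaluated. */
#define DEMO_MASK_SPECIAL(n) \
    (((n) == 64) ? ULLONG_MAX : ((1ULL << (n)) - 1))

/* Shift-down form: start from all ones and discard the high bits.
 * Well defined for 1 <= n <= 64; n == 0 would shift by 64. */
#define DEMO_MASK_SHIFT(n) ((uint64_t)-1 >> (64 - (n)))

/* Returns 1 if both forms produce the same mask for every width 1..64. */
static int demo_masks_agree(void)
{
    for (unsigned int n = 1; n <= 64; n++) {
        if ((uint64_t)DEMO_MASK_SPECIAL(n) != DEMO_MASK_SHIFT(n))
            return 0;
    }
    return 1;
}
```

Both forms are constant expressions when n is a constant, which matters for the compile-time uses raised elsewhere in the thread.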

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andrew Morton wrote: > From: Andrew Morton <[EMAIL PROTECTED]> > > Now that we have DMA_BIT_MASK(), these macros are pointless. > Except, unfortunately, DMA_64BIT_MASK. I guess we could special case it, assuming this works in all the contexts the macro is used in (ie, compile-time constant?):

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: > The PREEMPT_BITS limitation is a good argument for at least taking the pte > locks in small batches though (small batches is preferable to one-by-one > since we will want to batch the make-readonly-and-pin hypercall requests to > amortise the cost of the hypervisor trap). Hm,

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Rik van Riel wrote: > This makes for a narrow race window, during which ptep_test_and_clear_young > cannot clear the referenced bit and may end up causing a crash. We do not > care about it not clearing the referenced bit during that window, since it > will be cleared during the next go-around

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
The reason "lock" isn't mentioned is because I mis-analyzed the situation, and overlooked that page_referenced_one does actually take the pte's lock. > Maybe what I have to add is now of historical interest only, or none, > but I was prevented from answering your original mail ear

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: > I didn't think that nobbling config options for particular pv_ops > implementations was acceptable? I'm rather out of the loop though, and could > be wrong. > As a workaround it would be OK. As a dependency, perhaps. > The PREEMPT_BITS limitation is a good argument for

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: > > Hang on! How is the access unlocked? By my reading > page_referenced_one()->page_check_address()->spin_lock(pte_lockptr()). > Ah, OK. I'd overlooked that. > The problem here is most likely insufficient locking in the pin/unpin > table-walking code, in light of the fact

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: Hang on! How is the access unlocked? By my reading page_referenced_one()->page_check_address()->spin_lock(pte_lockptr()). Ah, OK. I'd overlooked that. The problem here is most likely insufficient locking in the pin/unpin table-walking code, in light of the fact that you

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: I didn't think that nobbling config options for particular pv_ops implementations was acceptable? I'm rather out of the loop though, and could be wrong. As a workaround it would be OK. As a dependency, perhaps. The PREEMPT_BITS limitation is a good argument for at

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Rik van Riel wrote: This makes for a narrow race window, during which ptep_test_and_clear_young cannot clear the referenced bit and may end up causing a crash. We do not care about it not clearing the referenced bit during that window, since it will be cleared during the next go-around and

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
is because I mis-analyzed the situation, and overlooked that page_referenced_one does actually take the pte's lock. Maybe what I have to add is now of historical interest only, or none, but I was prevented from answering your original mail earlier... On Thu, 4 Oct 2007, Jeremy Fitzhardinge wrote

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-05 Thread Jeremy Fitzhardinge
Keir Fraser wrote: The PREEMPT_BITS limitation is a good argument for at least taking the pte locks in small batches though (small batches is preferable to one-by-one since we will want to batch the make-readonly-and-pin hypercall requests to amortise the cost of the hypervisor trap). Hm, I

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andrew Morton wrote: From: Andrew Morton [EMAIL PROTECTED] Now that we have DMA_BIT_MASK(), these macros are pointless. Except, unfortunately, DMA_64BIT_MASK. I guess we could special case it, assuming this works in all the contexts the macro is used in (ie, compile-time constant?):

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Robert P. J. Day wrote: or you could take advantage of the macros in kernel.h and write that as: +#define DMA_BIT_MASK(n) (((n) == 64) ? ULLONG_MAX : ((1ULL<<(n))-1)) But that's a more indirect way of expressing "I want all 1's". J

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andreas Schwab wrote: #define DMA_BIT_MASK(n) ((u64)-1 >> (64 - (n))) Yeah, that's cleaner. J

Re: [PATCH 1/1] unify DMA_..BIT_MASK definitions: v3.1

2007-10-05 Thread Jeremy Fitzhardinge
Andrew Morton wrote: Well yes, but DMA_BIT_MASK(0) invokes undefined behaviour, generates a compiler warning and evaluates to 0x (with my setup). That won't be a problem in practice, but it is strictly wrong and doesn't set a good example for the children ;) It's

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
Rik van Riel wrote: > Either of these two would work. Another alternative could be to > let test_and_clear_pte_flags have an exception table entry, where > we jump right to the next instruction if the instruction clearing > the flag fails. > > That is essentially the variant you need for Xen,

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
Andrew Morton wrote: > y'know, I think it's been several years since I saw a report of an > honest to goodness, genuine SMP race in core kernel. We used to be > infested by them, but the term has fallen into disuse. Interesting, but > OT. > I was a bit surprised to find myself typing

race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
David's change 10a8d6ae4b3182d6588a5809a8366343bc295c20, "i386: add ptep_test_and_clear_{dirty,young}" has introduced an SMP race which affects the Xen pv-ops backend. In Xen, pagetables are normally kept RO so that the hypervisor can mediate all updates to them. If Xen sees a write to an active

race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
David's change 10a8d6ae4b3182d6588a5809a8366343bc295c20, i386: add ptep_test_and_clear_{dirty,young} has introduced an SMP race which affects the Xen pv-ops backend. In Xen, pagetables are normally kept RO so that the hypervisor can mediate all updates to them. If Xen sees a write to an active

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
Andrew Morton wrote: y'know, I think it's been several years since I saw a report of an honest to goodness, genuine SMP race in core kernel. We used to be infested by them, but the term has fallen into disuse. Interesting, but OT. I was a bit surprised to find myself typing it

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-04 Thread Jeremy Fitzhardinge
Rik van Riel wrote:
> Either of these two would work. Another alternative could be to let test_and_clear_pte_flags have an exception table entry, where we jump right to the next instruction if the instruction clearing the flag fails.
>
> That is the essentially variant you need for Xen, except

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote:
> I'm proposing that the existing bzImage format be retained, but that
> the payload of the decompressor (already a gzip file) simply be
> vmlinux.gz -- i.e. a gzip compressed ELF file, notes and all. A
> pointer in the header will point to the offset of the payload (this is

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote:
>> This series looks like a good start for Xen, but we still need to work
>> out where to stash the metadata which normally lives in ELF notes.
>> Using ELF is convenient for Xen because it lets a large chunk of domain
>> builder code be reused; on the other hand, loading a

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
Rusty Russell wrote:
> Hi all,
>
> Jeremy had some boot changes for bzImages, but buried in there was an
> update to the boot protocol to support Xen and lguest (and kvm-lite).
> I've copied those fairly simple patches, and if HPA is happy I'd like to
> push them for 2.6.24 (after correcting

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
at the Xen and VMI lazy mode implementations are much simpler; as would lguest's be. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Zach Amsden <[EMAIL PROTECTED]> Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Avi Kivity <[EMAIL PROTE

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
Rusty Russell wrote:
> That's good, but this code does lose on native because we no longer
> simply replace the entire thing with noops.
>
> Perhaps inverting this and having (inline) helpers is the way to go?
> I'm thinking that the overhead will be unmeasurably small, and its not really

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
Avi Kivity wrote:
> The code doesn't support having both lazy modes active at once. Maybe
> that's not an issue, but aren't the two modes orthogonal?
Hm, well, that's a good question. The initial semantics of the lazy mode calls were "what VMI wants", and they're still not really nailed down.

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
Avi Kivity wrote:
> The code doesn't support having both lazy modes active at once. Maybe that's not an issue, but aren't the two modes orthogonal?
Hm, well, that's a good question. The initial semantics of the lazy mode calls were "what VMI wants", and they're still not really nailed down. VMI

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
Rusty Russell wrote:
> That's good, but this code does lose on native because we no longer simply replace the entire thing with noops.
>
> Perhaps inverting this and having (inline) helpers is the way to go? I'm thinking that the overhead will be unmeasurably small, and its not really worth any

Re: [PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-02 Thread Jeremy Fitzhardinge
; as would lguest's be. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Zach Amsden <[EMAIL PROTECTED]> Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Avi Kivity <[EMAIL PROTECTED]> Cc: Anthony Liguory <[EMAIL PROTECTED]> Cc: Glauber de Oliveira Costa <[EMAIL PROTECTED]> Cc

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
Rusty Russell wrote:
> Hi all,
>
> Jeremy had some boot changes for bzImages, but buried in there was an update to the boot protocol to support Xen and lguest (and kvm-lite). I've copied those fairly simple patches, and if HPA is happy I'd like to push them for 2.6.24 (after correcting for

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote:
>> This series looks like a good start for Xen, but we still need to work out where to stash the metadata which normally lives in ELF notes. Using ELF is convenient for Xen because it lets a large chunk of domain builder code be reused; on the other hand, loading a plain

Re: [PATCH 0/5] Boot protocol changes

2007-10-02 Thread Jeremy Fitzhardinge
H. Peter Anvin wrote:
> I'm proposing that the existing bzImage format be retained, but that the payload of the decompressor (already a gzip file) simply be vmlinux.gz -- i.e. a gzip compressed ELF file, notes and all. A pointer in the header will point to the offset of the payload (this is

Re: vm86.c audit_syscall_exit() call trashes registers

2007-10-01 Thread Jeremy Fitzhardinge
William Cattey wrote:
> Thanks very much for responding.
>
> From your two replies, I crafted the attached patch.
> Alas, the EDID transfer comes up all zeros.
> I see two possible causes of this behavior:
>
> 1. I misunderstood how you intended the file to be modified.
> 2. The fix for my bug is

[PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-01 Thread Jeremy Fitzhardinge
sult is that the Xen and VMI lazy mode implementations are much simpler; as would lguest's be. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Zach Amsden <[EMAIL PROTECTED]> Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Avi Kivity <

[PATCH RFC] paravirt: cleanup lazy mode handling

2007-10-01 Thread Jeremy Fitzhardinge
and VMI lazy mode implementations are much simpler; as would lguest's be. Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]> Cc: Andi Kleen <[EMAIL PROTECTED]> Cc: Zach Amsden <[EMAIL PROTECTED]> Cc: Rusty Russell <[EMAIL PROTECTED]> Cc: Avi Kivity <[EMAIL PROTECTED]> Cc: Anthony Liguory <[EMAIL PROTECTED]> Cc

Re: vm86.c audit_syscall_exit() call trashes registers

2007-10-01 Thread Jeremy Fitzhardinge
William Cattey wrote:
> Thanks very much for responding.
>
> From your two replies, I crafted the attached patch. Alas, the EDID transfer comes up all zeros. I see two possible causes of this behavior:
>
> 1. I misunderstood how you intended the file to be modified.
> 2. The fix for my bug is NOT in
