Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-11 Thread Paul Brook
 On 03/10/2010 07:41 PM, Paul Brook wrote:
  You're much better off using a bulk-data transfer API that relaxes
  coherency requirements.  IOW, shared memory doesn't make sense for TCG
 
  Rather, tcg doesn't make sense for shared memory smp.  But we knew that
  already.
 
  I think TCG SMP is a hard but soluble problem, especially when you're
  running guests used to coping with NUMA.
 
 Do you mean by using a per-cpu tlb?  These kinds of solutions are
 generally slow, but tcg's slowness may mask this out.

Yes.

  TCG interacting with third parties via shared memory is probably never
  going to make sense.
 
 The third party in this case is qemu.

Maybe. But it's a different instance of qemu, and once this feature exists I 
bet people will use it for other things.

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-11 Thread malc
On Thu, 11 Mar 2010, Nick Piggin wrote:

 On Thu, Mar 11, 2010 at 03:10:47AM +, Jamie Lokier wrote:
  Paul Brook wrote:
 In a cross environment that becomes extremely hairy.  For example the x86
 architecture effectively has an implicit write barrier before every
 store, and an implicit read barrier before every load.

Btw, x86 doesn't have any implicit barriers due to ordinary loads.
Only stores and atomics have implicit barriers, afaik.
   
   As of March 2009[1] Intel guarantees that memory reads occur in
   order (they may only be reordered relative to writes). It appears
   AMD do not provide this guarantee, which could be an interesting
   problem for heterogeneous migration..
  
  (Summary: At least on AMD64, it does too, for normal accesses to
  naturally aligned addresses in write-back cacheable memory.)
  
  Oh, that's interesting.  Way back when I guess we knew writes were in
  order and it wasn't explicit that reads were, hence smp_rmb() using a
  locked atomic.
  
  Here is a post by Nick Piggin from 2007 with links to Intel _and_ AMD
  documents asserting that reads to cacheable memory are in program order:
  
  http://lkml.org/lkml/2007/9/28/212
  Subject: [patch] x86: improved memory barrier implementation
  
  Links to documents:
  
  http://developer.intel.com/products/processor/manuals/318147.pdf
  
  http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
  
  The Intel link doesn't work any more, but the AMD one does.
 
 It might have been merged into their development manual now.

It was (http://www.intel.com/products/processor/manuals/):

Intel® 64 Architecture Memory Ordering White Paper

This document has been merged into Volume 3A of Intel 64 and IA-32 
Architectures Software Developer's Manual.

[..snip..]

-- 
mailto:av1...@comtv.ru

Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Avi Kivity

On 03/09/2010 11:44 PM, Anthony Liguori wrote:
Ah yes.  For cross tcg environments you can map the memory using mmio 
callbacks instead of directly, and issue the appropriate barriers there.



Not good enough unless you want to severely restrict the use of shared 
memory within the guest.


For instance, it's going to be useful to assume that your atomic 
instructions remain atomic.  Crossing architecture boundaries here 
makes these assumptions invalid.  A barrier is not enough.


You could make the mmio callbacks flow to the shared memory server over 
the unix-domain socket, which would then serialize them.  Still need to 
keep RMWs as single operations.  When the host supports it, implement 
the operation locally (you can't render cmpxchg16b on i386, for example).
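
A rough sketch of what forwarding such an RMW to the server could look like. The message layout and helper below are made up for illustration; they are not part of the posted patch:

#include <stdint.h>
#include <unistd.h>

/* Hypothetical request sent to the shared-memory server over the
 * unix-domain socket when the host cannot execute the RMW natively. */
struct shm_rmw_req {
    uint64_t offset;      /* offset into the shared region */
    uint64_t expect[2];   /* expected value (two words for cmpxchg16b) */
    uint64_t desired[2];  /* value to store if the comparison succeeds */
    uint8_t  width;       /* operand width in bytes */
};

/* Ask the server to perform the compare-and-swap on our behalf; the
 * server applies requests one at a time, which is what keeps the RMW a
 * single operation.  Returns 0 and the old value on success. */
static int shm_remote_cmpxchg(int sock_fd, const struct shm_rmw_req *req,
                              uint64_t old[2])
{
    if (write(sock_fd, req, sizeof(*req)) != (ssize_t)sizeof(*req)) {
        return -1;
    }
    if (read(sock_fd, old, 2 * sizeof(uint64_t)) !=
        (ssize_t)(2 * sizeof(uint64_t))) {
        return -1;
    }
    return 0;
}

When the host does support the operand width natively, the device callback would skip the round trip and use a local atomic builtin instead.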


Shared memory only makes sense when using KVM.  In fact, we should 
actively disable the shared memory device when not using KVM.


Looks like that's the only practical choice.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Avi Kivity

On 03/10/2010 06:38 AM, Cam Macdonell wrote:

On Tue, Mar 9, 2010 at 5:03 PM, Paul Brook p...@codesourcery.com wrote:
   

In a cross environment that becomes extremely hairy.  For example the x86
architecture effectively has an implicit write barrier before every
store, and an implicit read barrier before every load.
 

Btw, x86 doesn't have any implicit barriers due to ordinary loads.
Only stores and atomics have implicit barriers, afaik.
   

As of March 2009[1] Intel guarantees that memory reads occur in order (they
may only be reordered relative to writes). It appears AMD do not provide this
guarantee, which could be an interesting problem for heterogeneous migration..

Paul

[*] The most recent docs I have handy. Up to and including Core-2 Duo.

 

Interesting, but what ordering would cause problems that AMD would do
but Intel wouldn't?  Wouldn't that ordering cause the same problems
for POSIX shared memory in general (regardless of Qemu) on AMD?
   


If some code was written for the Intel guarantees it would break if 
migrated to AMD.  Of course, it would also break if run on AMD in the 
first place.



I think shared memory breaks migration anyway.
   


Until someone implements distributed shared memory.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Paul Brook
  As of March 2009[1] Intel guarantees that memory reads occur in order
  (they may only be reordered relative to writes). It appears AMD do not
  provide this guarantee, which could be an interesting problem for
  heterogeneous migration..
 
  Interesting, but what ordering would cause problems that AMD would do
  but Intel wouldn't?  Wouldn't that ordering cause the same problems
  for POSIX shared memory in general (regardless of Qemu) on AMD?
 
 If some code was written for the Intel guarantees it would break if
 migrated to AMD.  Of course, it would also break if run on AMD in the
 first place.

Right. This is independent of shared memory, and is a case where reporting an 
Intel CPUID on an AMD host might get you into trouble.

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Anthony Liguori

On 03/10/2010 03:25 AM, Avi Kivity wrote:

On 03/09/2010 11:44 PM, Anthony Liguori wrote:
Ah yes.  For cross tcg environments you can map the memory using 
mmio callbacks instead of directly, and issue the appropriate 
barriers there.



Not good enough unless you want to severely restrict the use of 
shared memory within the guest.


For instance, it's going to be useful to assume that your atomic 
instructions remain atomic.  Crossing architecture boundaries here 
makes these assumptions invalid.  A barrier is not enough.


You could make the mmio callbacks flow to the shared memory server 
over the unix-domain socket, which would then serialize them.  Still 
need to keep RMWs as single operations.  When the host supports it, 
implement the operation locally (you can't render cmpxchg16b on i386, 
for example).


But now you have a requirement that the shmem server runs in lock-step 
with the guest VCPU, which has to happen for every single word of data 
transferred.


You're much better off using a bulk-data transfer API that relaxes 
coherency requirements.  IOW, shared memory doesn't make sense for TCG :-)
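
To make the contrast concrete, here is a minimal sketch of the kind of bulk-transfer interface being advocated: a single-producer/single-consumer ring placed in the shared region, with barriers only at the hand-off points, so ordering is paid per chunk rather than per guest memory access. The layout and names are illustrative, not from any posted code:

#include <stdint.h>
#include <string.h>

#define RING_SLOTS 64          /* must be a power of two */
#define SLOT_SIZE  4096

struct shm_ring {
    volatile uint32_t head;    /* written only by the producer */
    volatile uint32_t tail;    /* written only by the consumer */
    uint8_t slot[RING_SLOTS][SLOT_SIZE];
};

static int ring_put(struct shm_ring *r, const void *buf, size_t len)
{
    uint32_t head = r->head;

    if (len > SLOT_SIZE || head - r->tail == RING_SLOTS) {
        return -1;                 /* oversized or ring full */
    }
    memcpy(r->slot[head % RING_SLOTS], buf, len);
    __sync_synchronize();          /* publish the data ... */
    r->head = head + 1;            /* ... before publishing the index */
    return 0;
}

static int ring_get(struct shm_ring *r, void *buf, size_t len)
{
    uint32_t tail = r->tail;

    if (len > SLOT_SIZE || tail == r->head) {
        return -1;                 /* oversized or ring empty */
    }
    __sync_synchronize();          /* see the data the index points at */
    memcpy(buf, r->slot[tail % RING_SLOTS], len);
    __sync_synchronize();          /* finish the copy before freeing the slot */
    r->tail = tail + 1;
    return 0;
}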


Regards,

Anthony Liguori



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Avi Kivity

On 03/10/2010 07:13 PM, Anthony Liguori wrote:

On 03/10/2010 03:25 AM, Avi Kivity wrote:

On 03/09/2010 11:44 PM, Anthony Liguori wrote:
Ah yes.  For cross tcg environments you can map the memory using 
mmio callbacks instead of directly, and issue the appropriate 
barriers there.



Not good enough unless you want to severely restrict the use of 
shared memory within the guest.


For instance, it's going to be useful to assume that your atomic 
instructions remain atomic.  Crossing architecture boundaries here 
makes these assumptions invalid.  A barrier is not enough.


You could make the mmio callbacks flow to the shared memory server 
over the unix-domain socket, which would then serialize them.  Still 
need to keep RMWs as single operations.  When the host supports it, 
implement the operation locally (you can't render cmpxchg16b on i386, 
for example).


But now you have a requirement that the shmem server runs in lock-step 
with the guest VCPU, which has to happen for every single word of data 
transferred.




Alternative implementation: expose a futex in a shared memory object and 
use that to serialize access.  Now all accesses happen from vcpu 
context and, as long as there is no contention, it should be fast, at 
least relative to tcg.
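
A minimal sketch of that idea: a lock word placed inside the shared object, taken with the usual futex protocol by whichever process has it mapped. Illustrative only (roughly the classic three-state futex mutex), not code from the patch:

#include <stdint.h>
#include <unistd.h>
#include <linux/futex.h>
#include <sys/syscall.h>

static int futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* The lock word lives in the shared region:
 * 0 = free, 1 = held, 2 = held with waiters. */
static void shm_lock(uint32_t *lock)
{
    uint32_t c = __sync_val_compare_and_swap(lock, 0, 1);

    if (c != 0) {
        if (c != 2) {
            c = __sync_lock_test_and_set(lock, 2);   /* atomic exchange */
        }
        while (c != 0) {
            futex(lock, FUTEX_WAIT, 2);              /* sleep while contended */
            c = __sync_lock_test_and_set(lock, 2);
        }
    }
}

static void shm_unlock(uint32_t *lock)
{
    if (__sync_fetch_and_sub(lock, 1) != 1) {
        *lock = 0;                                   /* there were waiters */
        futex(lock, FUTEX_WAKE, 1);
    }
}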


You're much better off using a bulk-data transfer API that relaxes 
coherency requirements.  IOW, shared memory doesn't make sense for TCG 
:-)


Rather, tcg doesn't make sense for shared memory smp.  But we knew that 
already.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Paul Brook
  You're much better off using a bulk-data transfer API that relaxes
  coherency requirements.  IOW, shared memory doesn't make sense for TCG
 
 Rather, tcg doesn't make sense for shared memory smp.  But we knew that
 already.

I think TCG SMP is a hard but soluble problem, especially when you're 
running guests used to coping with NUMA.

TCG interacting with third parties via shared memory is probably never going 
to make sense.

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Jamie Lokier
Paul Brook wrote:
   In a cross environment that becomes extremely hairy.  For example the x86
   architecture effectively has an implicit write barrier before every
   store, and an implicit read barrier before every load.
  
  Btw, x86 doesn't have any implicit barriers due to ordinary loads.
  Only stores and atomics have implicit barriers, afaik.
 
 As of March 2009[1] Intel guarantees that memory reads occur in
 order (they may only be reordered relative to writes). It appears
 AMD do not provide this guarantee, which could be an interesting
 problem for heterogeneous migration..

(Summary: At least on AMD64, it does too, for normal accesses to
naturally aligned addresses in write-back cacheable memory.)

Oh, that's interesting.  Way back when I guess we knew writes were in
order and it wasn't explicit that reads were, hence smp_rmb() using a
locked atomic.
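
For context, the change being referred to looks roughly like this (illustrative, not verbatim kernel code): once loads are architecturally ordered, the SMP read barrier only needs to stop the compiler instead of executing a locked no-op on the stack.

/* old, conservative definition (32-bit style): a locked no-op acts as a
 * full barrier */
#define smp_rmb_old()  asm volatile("lock; addl $0,0(%%esp)" ::: "memory")

/* with in-order loads guaranteed, a compiler barrier is enough */
#define smp_rmb_new()  asm volatile("" ::: "memory")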

Here is a post by Nick Piggin from 2007 with links to Intel _and_ AMD
documents asserting that reads to cacheable memory are in program order:

http://lkml.org/lkml/2007/9/28/212
Subject: [patch] x86: improved memory barrier implementation

Links to documents:

http://developer.intel.com/products/processor/manuals/318147.pdf

http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf

The Intel link doesn't work any more, but the AMD one does.

Nick asserts both manufacturers are committed to in-order loads from
cacheable memory for the x86 architecture.

I have just read the AMD document, and it is in there (but not
completely obviously), in section 7.2.  The implicit load-load and
store-store barriers are only guaranteed for normal cacheable
accesses on naturally aligned boundaries to WB [write-back cacheable]
memory.  There are also implicit load-store barriers but not
store-load.
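
A message-passing example that relies on exactly those two implicit guarantees (store-store on the writer, load-load on the reader). Both variables are assumed to be naturally aligned and to live in ordinary write-back cacheable memory:

#include <stdint.h>

volatile uint32_t data;     /* payload */
volatile uint32_t ready;    /* flag published after the payload */

void writer(void)
{
    data  = 42;             /* store-store: becomes visible first */
    ready = 1;
}

uint32_t reader(void)
{
    while (ready == 0) {
        /* spin */
    }
    return data;            /* load-load: seeing ready==1 implies data==42 */
}

On an architecture without those implicit barriers, explicit ones would be needed between the two stores and between the two loads.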

Note that the document covers AMD64; it does not say anything about
their (now old) 32-bit processors.

 [*] The most recent docs I have handy. Up to and including Core-2 Duo.

Are you sure the read ordering applies to 32-bit Intel and AMD CPUs too?

Many years ago, before 64-bit x86 existed, I recall discussions on
LKML where it was made clear that stores were performed in program
order.  If it were known at the time that loads were performed in
program order on 32-bit x86s, I would have expected that to have been
mentioned by someone.

-- Jamie


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Nick Piggin
On Thu, Mar 11, 2010 at 03:10:47AM +, Jamie Lokier wrote:
 Paul Brook wrote:
 In a cross environment that becomes extremely hairy.  For example the x86
 architecture effectively has an implicit write barrier before every
 store, and an implicit read barrier before every load.
   
   Btw, x86 doesn't have any implicit barriers due to ordinary loads.
   Only stores and atomics have implicit barriers, afaik.
  
  As of March 2009[1] Intel guarantees that memory reads occur in
  order (they may only be reordered relative to writes). It appears
  AMD do not provide this guarantee, which could be an interesting
  problem for heterogeneous migration..
 
 (Summary: At least on AMD64, it does too, for normal accesses to
 naturally aligned addresses in write-back cacheable memory.)
 
 Oh, that's interesting.  Way back when I guess we knew writes were in
 order and it wasn't explicit that reads were, hence smp_rmb() using a
 locked atomic.
 
 Here is a post by Nick Piggin from 2007 with links to Intel _and_ AMD
 documents asserting that reads to cacheable memory are in program order:
 
 http://lkml.org/lkml/2007/9/28/212
 Subject: [patch] x86: improved memory barrier implementation
 
 Links to documents:
 
 http://developer.intel.com/products/processor/manuals/318147.pdf
 
 http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/24593.pdf
 
 The Intel link doesn't work any more, but the AMD one does.

It might have been merged into their development manual now.

 
 Nick asserts both manufacturers are committed to in-order loads from
 cacheable memory for the x86 architecture.

At the time we did ask Intel and AMD engineers. We talked with Andy
Glew from Intel I believe, but I can't recall the AMD contact.
Linus was involved in the discussions as well. We tried to do the
right thing with this.

 I have just read the AMD document, and it is in there (but not
 completely obviously), in section 7.2.  The implicit load-load and
 store-store barriers are only guaranteed for normal cacheable
 accesses on naturally aligned boundaries to WB [write-back cacheable]
 memory.  There are also implicit load-store barriers but not
 store-load.
 
 Note that the document covers AMD64; it does not say anything about
 their (now old) 32-bit processors.

Hmm. Well it couldn't hurt to ask again. We've never seen any
problems yet, so I'm rather sure we're in the clear.

 
  [*] The most recent docs I have handy. Up to and including Core-2 Duo.
 
 Are you sure the read ordering applies to 32-bit Intel and AMD CPUs too?
 
 Many years ago, before 64-bit x86 existed, I recall discussions on
 LKML where it was made clear that stores were performed in program
 order.  If it were known at the time that loads were performed in
 program order on 32-bit x86s, I would have expected that to have been
 mentioned by someone.

The way it was explained to us by the Intel engineer is that they
had implemented only visibly in-order loads, but they wanted to keep
their options open in the future, so they did not want to commit to
in-order loads as an ISA feature.

So when the whitepaper was released we got their blessing to
retroactively apply the rules to previous CPUs.



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-10 Thread Avi Kivity

On 03/10/2010 07:41 PM, Paul Brook wrote:

You're much better off using a bulk-data transfer API that relaxes
coherency requirements.  IOW, shared memory doesn't make sense for TCG
   

Rather, tcg doesn't make sense for shared memory smp.  But we knew that
already.
 

I think TCG SMP is a hard but soluble problem, especially when you're
running guests used to coping with NUMA.
   


Do you mean by using a per-cpu tlb?  These kinds of solutions are 
generally slow, but tcg's slowness may mask this out.



TCG interacting with third parties via shared memory is probably never going
to make sense.
   


The third party in this case is qemu.

--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Jamie Lokier
Paul Brook wrote:
  However, coherence could be made host-type-independent by the host
 mapping and unmapping pages, so that each page is only mapped into one
  guest (or guest CPU) at a time.  Just like some clustering filesystems
  do to maintain coherence.
 
 You're assuming that a TLB flush implies a write barrier, and a TLB miss 
 implies a read barrier.  I'd be surprised if this were true in general.

The host driver itself can issue full barriers at the same time as it
maps pages on TLB miss, and would probably have to interrupt the
guest's SMP KVM threads to insert a full barrier when broadcasting a
TLB flush on unmap.
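
A user-space sketch of that barrier placement, using mprotect() as a stand-in for the host's map/unmap mechanism; the helper names are hypothetical, and a real host driver would do this at the page-table level rather than with mprotect():

#include <sys/mman.h>

/* Called in the process giving the page up (the "TLB flush" side). */
static int shm_release_page(void *page, size_t pagesize)
{
    __sync_synchronize();       /* make our writes globally visible first */
    return mprotect(page, pagesize, PROT_NONE);
}

/* Called in the process taking the page over (the "TLB miss" side),
 * e.g. from a fault handler. */
static int shm_acquire_page(void *page, size_t pagesize)
{
    int ret = mprotect(page, pagesize, PROT_READ | PROT_WRITE);

    __sync_synchronize();       /* don't let our accesses be ordered before
                                   the acquisition */
    return ret;
}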

-- Jamie



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Jamie Lokier
Avi Kivity wrote:
 On 03/08/2010 03:03 PM, Paul Brook wrote:
 On 03/08/2010 12:53 AM, Paul Brook wrote:
  
 Support an inter-vm shared memory device that maps a shared-memory
 object as a PCI device in the guest.  This patch also supports
 interrupts between guest by communicating over a unix domain socket.
 This patch applies to the qemu-kvm repository.
  
 No. All new devices should be fully qdev based.
 
 I suspect you've also ignored a load of coherency issues, especially when
 not using KVM. As soon as you have shared memory in more than one host
 thread/process you have to worry about memory barriers.

 Shouldn't it be sufficient to require the guest to issue barriers (and
 to ensure tcg honours the barriers, if someone wants this with tcg)?.
  
 In a cross environment that becomes extremely hairy.  For example the x86
 architecture effectively has an implicit write barrier before every store, and
 an implicit read barrier before every load.

 
 Ah yes.  For cross tcg environments you can map the memory using mmio 
 callbacks instead of directly, and issue the appropriate barriers there.

That makes sense.  It will force an mmio callback for every access to
the shared memory, which is ok for correctness but vastly slower when
running in TCG compared with KVM.

But it's hard to see what else could be done - those implicit write
barriers on x86 have to be emulated somehow.  For TCG without inter-vm
shared memory, those barriers aren't a problem.

Non-random-corruption guest behaviour is paramount, so I hope the
inter-vm device will add those mmio callbacks for the cross-arch case
before it sees much action.  (Strictly, it isn't cross-arch, but
host-has-more-relaxed-implicit-memory-model-than-guest.  I'm assuming
TCG doesn't reorder memory instructions).

-- Jamie


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Jamie Lokier
Paul Brook wrote:
  On 03/08/2010 12:53 AM, Paul Brook wrote:
   Support an inter-vm shared memory device that maps a shared-memory
   object as a PCI device in the guest.  This patch also supports
   interrupts between guest by communicating over a unix domain socket. 
   This patch applies to the qemu-kvm repository.
  
   No. All new devices should be fully qdev based.
  
   I suspect you've also ignored a load of coherency issues, especially when
   not using KVM. As soon as you have shared memory in more than one host
   thread/process you have to worry about memory barriers.
  
  Shouldn't it be sufficient to require the guest to issue barriers (and
  to ensure tcg honours the barriers, if someone wants this with tcg)?.
 
 In a cross environment that becomes extremely hairy.  For example the x86
 architecture effectively has an implicit write barrier before every store, and
 an implicit read barrier before every load.

Btw, x86 doesn't have any implicit barriers due to ordinary loads.
Only stores and atomics have implicit barriers, afaik.

-- Jamie


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Anthony Liguori

On 03/08/2010 03:54 AM, Jamie Lokier wrote:

Alexander Graf wrote:
   

Or we could put in some code that tells the guest the host shm
architecture and only accept x86 on x86 for now. If anyone cares for
other combinations, they're free to implement them.

Seriously, we're looking at an interface designed for kvm here. Let's
please keep it as simple and fast as possible for the actual use case,
not some theoretically possible ones.
 

The concern is that a perfectly working guest image running on kvm,
the guest being some OS or app that uses this facility (_not_ a
kvm-only guest driver), is later run on qemu on a different host, and
then mostly works except for some silent data corruption.

That is not a theoretical scenario.
   


Hint: no matter what you do, shared memory is a hack that's going to 
lead to subtle failures one way or another.


It's useful to support because it has some interesting academic uses but 
it's not a mechanism that can ever be used for real world purposes.


It's impossible to support save/restore correctly.  It can never be made 
to work with TCG in a safe way.  That's why I've been advocating keeping 
this as simple as humanly possible.  It's just not worth trying to make 
this fancier than it needs to be because it will never be fully correct.


Regards,

Anthony Liguori


Well, the bit with this driver is theoretical, obviously :-)
But not the bit about moving to a different host.

-- Jamie


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Anthony Liguori

On 03/08/2010 07:16 AM, Avi Kivity wrote:

On 03/08/2010 03:03 PM, Paul Brook wrote:

On 03/08/2010 12:53 AM, Paul Brook wrote:

Support an inter-vm shared memory device that maps a shared-memory
object as a PCI device in the guest.  This patch also supports
interrupts between guest by communicating over a unix domain socket.
This patch applies to the qemu-kvm repository.

No. All new devices should be fully qdev based.

I suspect you've also ignored a load of coherency issues, especially when
not using KVM. As soon as you have shared memory in more than one host
thread/process you have to worry about memory barriers.

Shouldn't it be sufficient to require the guest to issue barriers (and
to ensure tcg honours the barriers, if someone wants this with tcg)?.
In a cross environment that becomes extremely hairy.  For example the x86
architecture effectively has an implicit write barrier before every
store, and an implicit read barrier before every load.


Ah yes.  For cross tcg environments you can map the memory using mmio 
callbacks instead of directly, and issue the appropriate barriers there.


Not good enough unless you want to severely restrict the use of shared 
memory within the guest.


For instance, it's going to be useful to assume that your atomic 
instructions remain atomic.  Crossing architecture boundaries here makes 
these assumptions invalid.  A barrier is not enough.


Shared memory only makes sense when using KVM.  In fact, we should 
actively disable the shared memory device when not using KVM.
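
A minimal sketch of such a check at device setup time. kvm_enabled() is the existing qemu predicate; everything else here is illustrative, not the code from the posted patch:

#include <stdio.h>

/* provided by qemu; declared here only so the sketch is self-contained */
extern int kvm_enabled(void);

/* call this from the device's init function and fail the init on error */
static int ivshmem_check_kvm(void)
{
    if (!kvm_enabled()) {
        fprintf(stderr, "ivshmem: shared memory device requires KVM; "
                        "refusing to load under TCG\n");
        return -1;
    }
    return 0;
}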


Regards,

Anthony Liguori




Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Paul Brook
  In a cross environment that becomes extremely hairy.  For example the x86
  architecture effectively has an implicit write barrier before every
  store, and an implicit read barrier before every load.
 
 Btw, x86 doesn't have any implicit barriers due to ordinary loads.
 Only stores and atomics have implicit barriers, afaik.

As of March 2009[1] Intel guarantees that memory reads occur in order (they 
may only be reordered relative to writes). It appears AMD do not provide this 
guarantee, which could be an interesting problem for heterogeneous migration..

Paul

[1] The most recent docs I have handy. Up to and including Core-2 Duo.


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-09 Thread Cam Macdonell
On Tue, Mar 9, 2010 at 5:03 PM, Paul Brook p...@codesourcery.com wrote:
  In a cross environment that becomes extremely hairy.  For example the x86
  architecture effectively has an implicit write barrier before every
  store, and an implicit read barrier before every load.

 Btw, x86 doesn't have any implicit barriers due to ordinary loads.
 Only stores and atomics have implicit barriers, afaik.

 As of March 2009[1] Intel guarantees that memory reads occur in order (they
 may only be reordered relative to writes). It appears AMD do not provide this
 guarantee, which could be an interesting problem for heterogeneous migration..

 Paul

 [*] The most recent docs I have handy. Up to and including Core-2 Duo.


Interesting, but what ordering would cause problems that AMD would do
but Intel wouldn't?  Wouldn't that ordering cause the same problems
for POSIX shared memory in general (regardless of Qemu) on AMD?

I think shared memory breaks migration anyway.

Cam


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Alexander Graf


On 08.03.2010 at 02:45, Jamie Lokier ja...@shareable.org wrote:


Paul Brook wrote:
Support an inter-vm shared memory device that maps a shared-memory object
as a PCI device in the guest.  This patch also supports interrupts between
guest by communicating over a unix domain socket.  This patch applies to
the qemu-kvm repository.


No. All new devices should be fully qdev based.

I suspect you've also ignored a load of coherency issues, especially when not
using KVM. As soon as you have shared memory in more than one host
thread/process you have to worry about memory barriers.


Yes. Guest-observable behaviour is likely to be quite different on
different hosts, especially between x86 and non-x86 hosts, which is not
good at all for emulation.

Memory barriers performed by the guest would help, but would not
remove the fact that behaviour would vary between different host types
if a guest doesn't call them.  I.e. you could accidentally have some
guests working fine for years on x86 hosts, which gain subtle
memory corruption as soon as you run them on a different host.

This is acceptable when recompiling code for different architectures,
but it's asking for trouble with binary guest images which aren't
supposed to depend on host architecture.

However, coherence could be made host-type-independent by the host
mapping and unmapping pages, so that each page is only mapped into one
guest (or guest CPU) at a time.  Just like some clustering filesystems
do to maintain coherence.


Or we could put in some code that tells the guest the host shm  
architecture and only accept x86 on x86 for now. If anyone cares for  
other combinations, they're free to implement them.


Seriously, we're looking at an interface designed for kvm here. Let's  
please keep it as simple and fast as possible for the actual use case,  
not some theoretically possible ones.



Alex


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Avi Kivity

On 03/08/2010 12:53 AM, Paul Brook wrote:

Support an inter-vm shared memory device that maps a shared-memory object
as a PCI device in the guest.  This patch also supports interrupts between
guest by communicating over a unix domain socket.  This patch applies to
  the qemu-kvm repository.
 

No. All new devices should be fully qdev based.

I suspect you've also ignored a load of coherency issues, especially when not
using KVM. As soon as you have shared memory in more than one host
thread/process you have to worry about memory barriers.
   


Shouldn't it be sufficient to require the guest to issue barriers (and 
to ensure tcg honours the barriers, if someone wants this with tcg)?.


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Jamie Lokier
Alexander Graf wrote:
 Or we could put in some code that tells the guest the host shm  
 architecture and only accept x86 on x86 for now. If anyone cares for  
 other combinations, they're free to implement them.
 
 Seriously, we're looking at an interface designed for kvm here. Let's  
 please keep it as simple and fast as possible for the actual use case,  
 not some theoretically possible ones.

The concern is that a perfectly working guest image running on kvm,
the guest being some OS or app that uses this facility (_not_ a
kvm-only guest driver), is later run on qemu on a different host, and
then mostly works except for some silent data corruption.

That is not a theoretical scenario.

Well, the bit with this driver is theoretical, obviously :-)
But not the bit about moving to a different host.

-- Jamie


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Alexander Graf
Jamie Lokier wrote:
 Alexander Graf wrote:
   
 Or we could put in some code that tells the guest the host shm  
 architecture and only accept x86 on x86 for now. If anyone cares for  
 other combinations, they're free to implement them.

 Seriously, we're looking at an interface designed for kvm here. Let's  
 please keep it as simple and fast as possible for the actual use case,  
 not some theoretically possible ones.
 

 The concern is that a perfectly working guest image running on kvm,
 the guest being some OS or app that uses this facility (_not_ a
 kvm-only guest driver), is later run on qemu on a different host, and
 then mostly works except for some silent data corruption.

 That is not a theoretical scenario.

 Well, the bit with this driver is theoretical, obviously :-)
 But not the bit about moving to a different host.
   

I agree. Hence there should be a safety check so people can't corrupt
their data silently.
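
One possible shape for such a check (entirely hypothetical register layout, not part of the posted patch): the device advertises the host's architecture and endianness, and the guest driver refuses to bind when they don't match its own.

#include <stdint.h>

#define SHM_ARCH_X86_64  1
#define SHM_ARCH_I386    2

/* read by the guest driver from a device register / config space */
struct shm_ident {
    uint8_t host_arch;          /* one of SHM_ARCH_* */
    uint8_t host_big_endian;    /* 0 = little-endian host */
};

static int shm_compatible(const struct shm_ident *id)
{
#if defined(__x86_64__)
    return id->host_arch == SHM_ARCH_X86_64 && !id->host_big_endian;
#elif defined(__i386__)
    return id->host_arch == SHM_ARCH_I386 && !id->host_big_endian;
#else
    return 0;   /* unknown combination: refuse rather than risk silent
                   corruption */
#endif
}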

Alex


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Paul Brook
 On 03/08/2010 12:53 AM, Paul Brook wrote:
  Support an inter-vm shared memory device that maps a shared-memory
  object as a PCI device in the guest.  This patch also supports
  interrupts between guest by communicating over a unix domain socket. 
  This patch applies to the qemu-kvm repository.
 
  No. All new devices should be fully qdev based.
 
  I suspect you've also ignored a load of coherency issues, especially when
  not using KVM. As soon as you have shared memory in more than one host
  thread/process you have to worry about memory barriers.
 
 Shouldn't it be sufficient to require the guest to issue barriers (and
 to ensure tcg honours the barriers, if someone wants this with tcg)?.

In a cross environment that becomes extremely hairy.  For example the x86 
architecture effectively has an implicit write barrier before every store, and 
an implicit read barrier before every load.
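
Concretely, emulating that on a host with a weaker memory model would mean something like the following around every emulated guest access (a hedged sketch using full barriers, which are stronger than strictly needed; this is not TCG code):

#include <stdint.h>

static inline void guest_store32(volatile uint32_t *p, uint32_t v)
{
    __sync_synchronize();   /* the "implicit write barrier before every store" */
    *p = v;
}

static inline uint32_t guest_load32(const volatile uint32_t *p)
{
    __sync_synchronize();   /* the "implicit read barrier before every load" */
    return *p;
}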

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Paul Brook
 However, coherence could be made host-type-independent by the host
 mapping and unmapping pages, so that each page is only mapped into one
 guest (or guest CPU) at a time.  Just like some clustering filesystems
 do to maintain coherence.

You're assuming that a TLB flush implies a write barrier, and a TLB miss 
implies a read barrier.  I'd be surprised if this were true in general.

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-08 Thread Avi Kivity

On 03/08/2010 03:03 PM, Paul Brook wrote:

On 03/08/2010 12:53 AM, Paul Brook wrote:
 

Support an inter-vm shared memory device that maps a shared-memory
object as a PCI device in the guest.  This patch also supports
interrupts between guest by communicating over a unix domain socket.
This patch applies to the qemu-kvm repository.
 

No. All new devices should be fully qdev based.

I suspect you've also ignored a load of coherency issues, especially when
not using KVM. As soon as you have shared memory in more than one host
thread/process you have to worry about memory barriers.
   

Shouldn't it be sufficient to require the guest to issue barriers (and
to ensure tcg honours the barriers, if someone wants this with tcg)?.
 

In a cross environment that becomes extremely hairy.  For example the x86
architecture effectively has an implicit write barrier before every store, and
an implicit read barrier before every load.
   


Ah yes.  For cross tcg environments you can map the memory using mmio 
callbacks instead of directly, and issue the appropriate barriers there.
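
A hedged sketch of that approach: every guest access to the shared region goes through read/write callbacks, and the callbacks bracket the real access with barriers. The signatures loosely mimic qemu's old-style mmio handlers but are illustrative only:

#include <stdint.h>

static void *shm_base;   /* host mapping of the shared region */

static uint32_t shm_mmio_readl(void *opaque, uint64_t addr)
{
    uint32_t val;

    (void)opaque;
    __sync_synchronize();   /* conservatively bracket the access */
    val = *(volatile uint32_t *)((char *)shm_base + addr);
    __sync_synchronize();
    return val;
}

static void shm_mmio_writel(void *opaque, uint64_t addr, uint32_t val)
{
    (void)opaque;
    __sync_synchronize();
    *(volatile uint32_t *)((char *)shm_base + addr) = val;
    __sync_synchronize();
}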


--
error compiling committee.c: too many arguments to function



Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-07 Thread Paul Brook
 Support an inter-vm shared memory device that maps a shared-memory object
 as a PCI device in the guest.  This patch also supports interrupts between
 guest by communicating over a unix domain socket.  This patch applies to
  the qemu-kvm repository.

No. All new devices should be fully qdev based.

I suspect you've also ignored a load of coherency issues, especially when not 
using KVM. As soon as you have shared memory in more than one host 
thread/process you have to worry about memory barriers.

Paul


Re: [Qemu-devel] [PATCH] Inter-VM shared memory PCI device

2010-03-07 Thread Jamie Lokier
Paul Brook wrote:
  Support an inter-vm shared memory device that maps a shared-memory object
  as a PCI device in the guest.  This patch also supports interrupts between
  guest by communicating over a unix domain socket.  This patch applies to
   the qemu-kvm repository.
 
 No. All new devices should be fully qdev based.
 
 I suspect you've also ignored a load of coherency issues, especially when not 
 using KVM. As soon as you have shared memory in more than one host 
 thread/process you have to worry about memory barriers.

Yes. Guest-observable behaviour is likely to be quite different on
different hosts, especially between x86 and non-x86 hosts, which is not
good at all for emulation.

Memory barriers performed by the guest would help, but would not
remove the fact that behaviour would vary between different host types
if a guest doesn't call them.  I.e. you could accidentally have some
guests working fine for years on x86 hosts, which gain subtle
memory corruption as soon as you run them on a different host.

This is acceptable when recompiling code for different architectures,
but it's asking for trouble with binary guest images which aren't
supposed to depend on host architecture.

However, coherence could be made host-type-independent by the host
mapping and unmapping pages, so that each page is only mapped into one
guest (or guest CPU) at a time.  Just like some clustering filesystems
do to maintain coherence.

-- Jamie