Re: [PATCH 00/18] CXL RAM and the 'Soft Reserved' => 'System RAM' default

2023-02-21 Thread Gregory Price
On Wed, Feb 15, 2023 at 10:03:27AM +, Jonathan Cameron wrote:
> On Tue, 14 Feb 2023 16:54:02 -0500
> Gregory Price  wrote:
> 
> > Just clarifying one thing:  Even with the patch, KVM blows up.
> > Disabling KVM fixes this entirely.  I haven't yet tested without KVM
> > but with the patch applied; I will do that now.
> 
> yup.  The patch only fixes TCG so that's expected behavior.
> 
> Fingers crossed on this 'working'.
> 
> I'm open to suggestions on how to work around the problem with KVM
> or indeed allow TCG to cache the instructions (right now it has
> to fetch and emulate each instruction on its own).
> 
> I can envision how we might do it for KVM with userspace page fault handling
> used to get a fault up to QEMU which can then stitch in a cache
> of the underlying memory as a stage 2 translation to the page (a little
> bit like how post migration copy works) though I've not prototyped
> anything...
> 

Just following up.  With the patch applied and KVM turned off, no crash.
I've been working with this for a while.

We should move the instruction alignment issue into a separate
discussion thread.
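
[Editor's note: for anyone reproducing the working setup above, both
--enable-kvm and accel=kvm have to go. The sketch below is illustrative
only; the machine/CXL topology options are placeholder assumptions, not
the exact command line used in this thread.]

```shell
# Illustrative sketch: force TCG by using accel=tcg (and no --enable-kvm).
# All other options are placeholders, not the command from this thread.
qemu-system-x86_64 \
    -machine q35,accel=tcg,cxl=on \
    -m 4G,maxmem=8G,slots=4 \
    ...                          # CXL fixed memory window, devices, etc.

# Inside the guest, after onlining the CXL region as NUMA node 1:
numactl --membind=1 python       # expect this to be very slow under TCG
```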



Re: [PATCH 00/18] CXL RAM and the 'Soft Reserved' => 'System RAM' default

2023-02-15 Thread Jonathan Cameron via
On Tue, 14 Feb 2023 16:54:02 -0500
Gregory Price  wrote:

> On Tue, Feb 14, 2023 at 04:51:53PM -0500, Gregory Price wrote:
> > On Tue, Feb 14, 2023 at 09:18:24PM +, Jonathan Cameron wrote:  
> > > On Tue, 14 Feb 2023 14:01:23 -0500
> > > Gregory Price  wrote:
> > > 
> > > Could you test it with TCG (just drop --enable-kvm)?  We have a known
> > > limitation with x86 instructions running out of CXL emulated memory
> > > (side effect of emulating the interleave).  You'll need a fix even on TCG
> > > for the corner case of an instruction bridging from normal ram to cxl 
> > > memory.
> > > https://lore.kernel.org/qemu-devel/20230206193809.1153124-1-richard.hender...@linaro.org/
> > > 
> > > Performance will be bad, but so far this is the only way we can do it 
> > > correctly.
> > > 
> > > Jonathan
> > >   
> > 
> > Siiiggghh... I had this patch applied and dropped --enable-kvm, but
> > forgot to drop "accel=kvm" from the -machine line.
> > 
> > This was the issue.
> > 
> > And let me tell you, if you run python under numactl --membind=1, it
> > is IMPRESSIVELY slow.  I wonder if it's even hitting a few hundred
> > thousand instructions a second.
> > 
> > 
> > This appears to be the issue.  When I get a bit more time, I'll try
> > to dive into the deep dark depths of QEMU memory regions to see how
> > difficult a non-MMIO fork might be, unless someone else is already
> > looking at it.
> > 
> > ~Gregory  
> 
> Just clarifying one thing:  Even with the patch, KVM blows up.
> Disabling KVM fixes this entirely.  I haven't yet tested without KVM
> but with the patch applied; I will do that now.

yup.  The patch only fixes TCG so that's expected behavior.

Fingers crossed on this 'working'.

I'm open to suggestions on how to work around the problem with KVM
or indeed allow TCG to cache the instructions (right now it has
to fetch and emulate each instruction on its own).

I can envision how we might do it for KVM with userspace page fault handling
used to get a fault up to QEMU which can then stitch in a cache
of the underlying memory as a stage 2 translation to the page (a little
bit like how post migration copy works) though I've not prototyped
anything...

I think it would be complex code that would be little used
so we may just have to cope with the emulation being slow.

The intent is very much to be able to test the kernel code etc., not
to test it quickly :)

Jonathan