Jonathan Cameron wrote:
> On Wed, 22 Jun 2022 22:40:48 -0700
> Dan Williams <[email protected]> wrote:
>
> > Jonathan Cameron wrote:
> > > ....
> > >
> > > > > Hi Ben,
> > > > >
> > > > > I finally got around to actually trying this out on top of Dan's
> > > > > recent fix set
> > > > > (I rebased it from the cxl/preview branch on kernel.org).
> > > > >
> > > > > I'm not having much luck actually bring up a region.
> > > > >
> > > > > The patch set refers to configuring the end point decoders, but all
> > > > > their
> > > > > sysfs attributes are read only. Am I missing a dependency somewhere
> > > > > or
> > > > > is the intent that this series is part of the solution only?
> > > > >
> > > > > I'm confused!
> > > >
> > > > There's a new series that's being reviewed internally before going to
> > > > the list:
> > > >
> > > > https://gitlab.com/bwidawsk/linux/-/tree/cxl_region-redux3
> > > >
> > > > Given the proximity to the merge window opening and the need to get
> > > > the "mem_enabled" series staged, I asked Ben to hold it back from the
> > > > list for now.
> > > >
> > > > There are some changes I am folding into it, but I hope to send it out
> > > > in the next few days after "mem_enabled" is finalized.
> > >
> > > Hi Dan,
> > >
> > > I switched from an earlier version of the region code over to a rebase of
> > > the tree.
> > > Two issues below you may already have fixed.
> > >
> > > The second is a carry over from an earlier set so I haven't tested
> > > without it but looks like it's still valid.
> > >
> > > Anyhow, thought it might save some cycles to preempt you sending
> > > out the series if these issues are still present.
> > >
> > > Minimal testing so far on these with 2 hb, 2 rp, 4 directly connected
> > > devices, but once you post I'll test more extensively. I've not
> > > really thought about the below much, so might not be best way to fix.
> > >
> > > Found a bug in QEMU code as well (missing write masks for the
> > > target list registers) - will post fix for that shortly.
> >
> > Hi Jonathan,
> >
> > Tomorrow I'll post the tranche to the list, but wanted to let you and
> > others watching that that the 'preview' branch [1] now has the proposed
> > initial region support. Once the bots give the thumbs up I'll send it
> > along.
> >
> > To date I've only tested it with cxl_test and an internal test vehicle.
> > The cxl_test script I used to setup and teardown a x8 interleave across
> > x2 host bridges and x4 switches is:
>
> Thanks. Trivial feedback from a very quick play (busy day).
>
> Bit odd that regionX/size is once write - get an error even if
> writing same value to it twice.

Ah true, that should just silently succeed.

> Also not debugged yet but on just got a null pointer dereference on
>
> echo decoder3.0 > target0
>
> Beyond a stacktrace pointing at store_targetN and dereference is of
> 0x00008 no idea yet.

The compiler unfortunately does a good job inlining the entirety of all
the leaf functions beneath store_targetN() so I have found myself
needing to sprinkle "noinline" to get better back traces.

> I was testing with a slightly modified version of a nasty script
> I was using to test with Ben's code previously. Might well be
> doing something wrong but obviously need to fix that crash anyway!

Most definitely.

> Will move to your nicer script below at somepoint as I've been lazy
> enough I'm still hand editing a few lines depending on number on
> a particular run.
>
> Should have some time tomorrow to debug, but definitely 'here be
> dragons' at the moment.

Yes. Even before this posting I had shaken out a few crash scenarios
just from moving from my old QEMU baseline to
"jic123/cxl-rework-draft-2" which did things like collide PCI MMIO with
cxl_test fake CXL ranges.

By the way, is there a "latest" tag I should be following to stay in
sync with what you are running for QEMU+CXL? If only to reproduce the
same crash scenarios.
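
Coming back to the regionX/size point above, the behaviour being asked
for is roughly the following. This is only a userspace sketch with
made-up names (region_model, region_model_set_size) and an assumed
-EBUSY for a conflicting rewrite. It is not the actual sysfs store code
in the series:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for a region; not the kernel structure. */
struct region_model {
	uint64_t size;
	bool size_set;
};

/*
 * Model of a "size" store: the first write sets the size, a repeat of
 * the same value is a silent no-op, and a different value is refused.
 */
static int region_model_set_size(struct region_model *r, uint64_t size)
{
	if (!r->size_set) {
		r->size = size;
		r->size_set = true;
		return 0;
	}
	if (r->size == size)
		return 0;	/* same value again: succeed quietly */
	return -EBUSY;		/* assumed errno for a conflicting rewrite */
}

/* Small self-check showing the intended behaviour. */
static void region_model_selfcheck(void)
{
	struct region_model r = { 0 };

	assert(region_model_set_size(&r, 0x10000000) == 0);
	assert(region_model_set_size(&r, 0x10000000) == 0);
	assert(region_model_set_size(&r, 0x20000000) == -EBUSY);
	assert(r.size == 0x10000000);
}
```

The point is just that a script which writes the same size twice keeps
working, while a genuinely different size is still rejected.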
