On Wed, 6 Sep 2017 16:27:20 +0800 Dong Jia Shi <bjsdj...@linux.vnet.ibm.com> wrote:

> * Halil Pasic <pa...@linux.vnet.ibm.com> [2017-09-05 19:20:43 +0200]:
>
> > On 09/05/2017 05:46 PM, Cornelia Huck wrote:
> > > On Tue, 5 Sep 2017 17:24:19 +0200
> > > Halil Pasic <pa...@linux.vnet.ibm.com> wrote:
> > >
> > >> My problem with a program check (indicated by SCSW word 2 bit 10) is
> > >> that, in my reading of the architecture, the semantic behind it is: The
> > >> channel subsystem (not the cu or device) has detected, that the
> > >> channel program (previously submitted as an ORB) is erroneous. Which
> > >> programs are erroneous is specified by the architecture. What we have
> > >> here does not qualify.
> > >>
> > >> My idea was to rather blame the virtual hardware (device) and put no
> > >> blame on the program nor the channel subsystem. This could be done
> > >> using device status (unit check with command reject, maybe unit
> > >> exception) or interface check. My train of thought was, the problem
> > >> is not consistent across a device type, so it has to be device
> > >> specific.
> > >
> > > Unit exception might be a better way to express what is happening here.
> > > At least, it moves us away from cc 1 and not towards cc 3 :)
> > >
> >
> > I will do a follow up patch pursuing device exception.
> >
> > >>
> > >> Of course blaming the device could mislead the person encountering the
> > >> problem, and make him believe it's a non-virtual hardware problem.
> > >>
> > >> About the misleading, I think the best we can do is log out a message
> > >> indicating what really happened.
> > >
> > > Just document it in the code? If it doesn't happen with Linux as a
> > > guest, it is highly unlikely to be seen in the wild.
> > >
> >
> > Well we have two problems here:
> > 1) Unit exception can be already defined by the device type for the
> > command (reference:
> > http://publibfp.dhe.ibm.com/cgi-bin/bookmgr/BOOKS/dz9ar110/2.6.10?DT=19920904110920).
> > I think this one is what you mean. And I agree that's best handled
> > with comment in code.
> Using unit check, with bit 3 byte 0 of the sense data set to 1, to
> indicate an 'Equipment check', sounds a bit more proper than unit
> exception.

I don't agree: Equipment check sounds a lot more dire (and seems to
imply a malfunction). I like unit exception better.

>
> > 2) The poor user/programmer is trying to figure out why things
> > don't work (why are we getting the unit exception)? I think that's
> > best remedied with producing something for the log (maybe a warning
> > with warn_report which states that the implementation vfio-ccw requires
> > the given flags).
> Fine with me.

With me as well.
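
To make the three options discussed above concrete, here is a minimal, self-contained sketch of how each one would show up in the status reported to the guest. It is not the vfio-ccw code and not a proposed patch: `struct toy_irb`, `enum reject_style`, `toy_reject()` and `toy_warn_unsupported_orb()` are hypothetical names invented for illustration, the warning goes to stderr via `fprintf` as a stand-in for QEMU's `warn_report`, and the message text is made up. The bit values follow the usual big-endian bit numbering (bit 0 is the most significant bit of the byte). Only the distinguishing bit is set in each case; a real status would typically carry further bits such as channel end and device end.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Device-status byte of the SCSW. */
#define DSTAT_UNIT_CHECK   0x02
#define DSTAT_UNIT_EXCEP   0x01
/* Subchannel-status byte of the SCSW. */
#define SCHS_PROG_CHECK    0x20
/* Sense byte 0. */
#define SENSE_CMD_REJECT   0x80  /* bit 0 */
#define SENSE_EQUIP_CHECK  0x10  /* bit 3 */

/* Hypothetical, heavily reduced stand-in for an interruption response block. */
struct toy_irb {
    uint8_t dstat;      /* device status */
    uint8_t cstat;      /* subchannel status */
    uint8_t sense[32];  /* sense data, only meaningful with unit check */
};

/* The three ways of rejecting the request that came up in the thread. */
enum reject_style {
    REJECT_PROG_CHECK,        /* blames the channel program */
    REJECT_UNIT_CHECK_EQUIP,  /* blames the device, implies a malfunction */
    REJECT_UNIT_EXCEPTION     /* blames the device, no malfunction implied */
};

/* Stand-in for a warn_report() call; the wording is an assumption. */
static void toy_warn_unsupported_orb(void)
{
    fprintf(stderr, "warning: vfio-ccw (sketch): ORB lacks flags "
                    "required by this implementation\n");
}

/* Fill in the status bits that distinguish the chosen rejection style. */
static void toy_reject(struct toy_irb *irb, enum reject_style style)
{
    assert(irb != NULL);
    memset(irb, 0, sizeof(*irb));

    switch (style) {
    case REJECT_PROG_CHECK:
        irb->cstat = SCHS_PROG_CHECK;
        break;
    case REJECT_UNIT_CHECK_EQUIP:
        irb->dstat = DSTAT_UNIT_CHECK;
        irb->sense[0] = SENSE_EQUIP_CHECK;
        break;
    case REJECT_UNIT_EXCEPTION:
        irb->dstat = DSTAT_UNIT_EXCEP;
        break;
    }

    /* Whatever is reported, tell the user what really happened. */
    toy_warn_unsupported_orb();
}
```

Calling `toy_reject(&irb, REJECT_UNIT_EXCEPTION)` leaves the subchannel status and the sense data at zero, which is the point of preferring unit exception here: no program check is reported and no equipment malfunction is implied, while the log message carries the real explanation.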