On 14.05.19 10:49, Cornelia Huck wrote:
> On Tue, 14 May 2019 10:37:32 +0200
> Christian Borntraeger <borntrae...@de.ibm.com> wrote:
>
>> On 14.05.19 09:28, David Hildenbrand wrote:
>>>>>> But that can be tested using the runability information if I am not
>>>>>> wrong.
>>>>>
>>>>> You mean the cpu level information, right?
>>>
>>> Yes, query-cpu-definition includes for each model runability information
>>> via "unavailable-features" (valid under the started QEMU machine).
>>>
>>>>>
>>>>>>
>>>>>>> and others that we have today.
>>>>>>>
>>>>>>> So yes, I think this would be acceptable.
>>>>>>
>>>>>> I guess it is acceptable yes. I doubt anybody uses that many CPUs in
>>>>>> production either way. But you never know.
>>>>>
>>>>> I think that using that many cpus is a more uncommon setup, but I still
>>>>> think that having to wait for actual failure
>>>>
>>>> That can happen all the time today. You can easily say z14 in the xml when
>>>> on a zEC12. Only at startup you get the error. The question is really:
>>>
>>> "-smp 248 -cpu host" will no longer work, while e.g. "-smp 248 -cpu z12"
>>> will work. Actually, even "-smp 248" will no longer work on affected
>>> machines.
>>>
>>> That is why wonder if it is better to disable the feature and print a
>>> warning. Similar to CMMA, where want want to tolerate when CMMA is not
>>> possible in the current environment (huge pages).
>>>
>>> "Diag318 will not be enabled because it is not compatible with more than
>>> 240 CPUs".
>>>
>>> However, I still think that implementing support for more than one SCLP
>>> response page is the best solution. Guests will need adaptions for > 240
>>> CPUs with Diag318, but who cares? Existing setups will continue to work.
>>>
>>> Implementing that SCLP thingy will avoid any warnings and any errors. It
>>> just works from the QEMU perspective.
>>>
>>> Is implementing this realistic?
>>
>> Yes it is but it will take time. I will try to get this rolling. To make
>> progress on the diag318 thing, can we error on startup now and simply
>> remove that check when when have implemented a larger sccb? If we would
>> now do all kinds of "change the max number games" would be harder to "fix".
>
> So, the idea right now is:
>
> - fail to start if you try to specify a diag318 device and more than
>   240 cpus (do we need a knob to turn off the device?)
> - in the future, support more than one SCLP response page
>
> I'm getting a bit lost in the discussion; but the above sounds
> reasonable to me.
>
We can

1. Fail to start with #cpus > 240 when diag318=on
2. Remove the error once we support more than one SCLP response page

Or

1. Allow to start with #cpus > 240 when diag318=on, but indicate only
   240 CPUs via SCLP
2. Print a warning
3. Remove the restriction and the warning once we support more than one
   SCLP response page

While I prefer the second approach (similar to defining zPCI devices
without zpci=on), I could also live with the first approach.

-- 

Thanks,

David / dhildenb
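
For illustration, the two approaches listed above could be sketched roughly
as follows. This is a Python sketch, not QEMU code: the constant
MAX_CPUS_ONE_SCLP_PAGE and the function names check_strict and
check_tolerant are hypothetical, as is the exact message text. Only the
240-CPU limit and the diag318=on condition are taken from the discussion.

```python
import warnings

# Limit taken from the thread: with diag318=on, a single SCLP response
# page can describe at most 240 CPUs. The name is hypothetical.
MAX_CPUS_ONE_SCLP_PAGE = 240


def check_strict(cpus: int, diag318: bool) -> int:
    """First approach: fail to start if the combination is unsupported.

    Returns the number of CPUs that would be indicated via SCLP.
    """
    if diag318 and cpus > MAX_CPUS_ONE_SCLP_PAGE:
        raise ValueError(
            f"diag318=on is not compatible with more than "
            f"{MAX_CPUS_ONE_SCLP_PAGE} CPUs (requested {cpus})"
        )
    return cpus


def check_tolerant(cpus: int, diag318: bool) -> int:
    """Second approach: start anyway, warn, and indicate only 240 CPUs.

    Returns the number of CPUs that would be indicated via SCLP.
    """
    if diag318 and cpus > MAX_CPUS_ONE_SCLP_PAGE:
        warnings.warn(
            f"diag318=on: only {MAX_CPUS_ONE_SCLP_PAGE} of {cpus} CPUs "
            f"will be indicated via SCLP"
        )
        return MAX_CPUS_ONE_SCLP_PAGE
    return cpus
```

In either variant, once more than one SCLP response page is supported, the
conditional branch would simply be removed and both functions would return
the requested CPU count unchanged.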