On 09/22/2017 12:08 PM, David Gibson wrote:
> On Thu, Sep 21, 2017 at 08:04:55AM +0200, Cédric Le Goater wrote:
>> On 09/21/2017 05:54 AM, Nikunj A Dadhania wrote:
>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>
>>>> On Wed, Sep 20, 2017 at 12:48:55PM +0530, Nikunj A Dadhania wrote:
>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>
>>>>>> On Wed, Sep 20, 2017 at 12:10:48PM +0530, Nikunj A Dadhania wrote:
>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>>>
>>>>>>>> On Wed, Sep 20, 2017 at 10:43:19AM +0530, Nikunj A Dadhania wrote:
>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>>>>>
>>>>>>>>>> On Wed, Sep 20, 2017 at 09:50:24AM +0530, Nikunj A Dadhania wrote:
>>>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Sep 15, 2017 at 02:39:16PM +0530, Nikunj A Dadhania wrote:
>>>>>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Sep 15, 2017 at 01:53:15PM +0530, Nikunj A Dadhania
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> David Gibson <da...@gibson.dropbear.id.au> writes:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I thought, I am doing the same here for PowerNV, number of
>>>>>>>>>>>>>>>>> online cores
>>>>>>>>>>>>>>>>> is equal to initial online vcpus / threads per core
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     int boot_cores_nr = smp_cpus / smp_threads;
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Only difference that I see in PowerNV is that we have
>>>>>>>>>>>>>>>>> multiple chips
>>>>>>>>>>>>>>>>> (max 2, at the moment)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>     cores_per_chip = smp_cpus / (smp_threads *
>>>>>>>>>>>>>>>>>                                  pnv->num_chips);
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This doesn't make sense to me. Cores per chip should *always*
>>>>>>>>>>>>>>>> equal
>>>>>>>>>>>>>>>> smp_cores, you shouldn't need another calculation for it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> And in case user has provided sane smp_cores, we use it.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If smp_cores isn't sane, you should simply reject it, not try
>>>>>>>>>>>>>>>> to fix
>>>>>>>>>>>>>>>> it. That's just asking for confusion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is the case where the user does not provide a
>>>>>>>>>>>>>>> topology(which is a
>>>>>>>>>>>>>>> valid scenario), not sure we should reject it. So qemu defaults
>>>>>>>>>>>>>>> smp_cores/smt_threads to 1. I think it makes sense to over-ride.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you can find a way to override it by altering smp_cores when
>>>>>>>>>>>>>> it's
>>>>>>>>>>>>>> not explicitly specified, then ok.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Should I change the global smp_cores here as well ?
>>>>>>>>>>>>
>>>>>>>>>>>> I'm pretty uneasy with that option.
>>>>>>>>>>>
>>>>>>>>>>> Me too.
>>>>>>>>>>>
>>>>>>>>>>>> It would take a fair bit of checking to ensure that changing
>>>>>>>>>>>> smp_cores
>>>>>>>>>>>> is safe here. An easier to verify option would be to make the
>>>>>>>>>>>> generic
>>>>>>>>>>>> logic which splits up an unspecified -smp N into cores and sockets
>>>>>>>>>>>> more flexible, possibly based on machine options for max values.
>>>>>>>>>>>>
>>>>>>>>>>>> That might still be more trouble than its worth.
>>>>>>>>>>>
>>>>>>>>>>> I think the current approach is the simplest and less intrusive, as
>>>>>>>>>>> we
>>>>>>>>>>> are handling a case where user has not bothered to provide a
>>>>>>>>>>> detailed
>>>>>>>>>>> topology, the best we can do is create single threaded cores equal
>>>>>>>>>>> to
>>>>>>>>>>> number of cores.
>>>>>>>>>>
>>>>>>>>>> No, sorry. Having smp_cores not correspond to the number of cores
>>>>>>>>>> per
>>>>>>>>>> chip in all cases is just not ok. Add an error message if the
>>>>>>>>>> topology isn't workable for powernv by all means. But users having
>>>>>>>>>> to
>>>>>>>>>> use a longer command line is better than breaking basic assumptions
>>>>>>>>>> about what numbers reflect what topology.
>>>>>>>>>
>>>>>>>>> Sorry to ask again, as I am still not convinced, we do similar
>>>>>>>>> adjustment in spapr where the user did not provide the number of
>>>>>>>>> cores,
>>>>>>>>> but qemu assumes them as single threaded cores and created
>>>>>>>>> cores(boot_cores_nr) that were not same as smp_cores ?
>>>>>>>>
>>>>>>>> What? boot_cores_nr has absolutely nothing to do with adjusting the
>>>>>>>> topology, and it certainly doesn't assume they're single threaded.
>>>>>>>
>>>>>>> When we start a TCG guest and user provides following commandline, e.g.
>>>>>>> "-smp 4", smt_threads is set to 1 by default in vl.c. So the guest boots
>>>>>>> with 4 cores, each having 1 thread.
>>>>>>
>>>>>> Ok.. and what's the problem with that behaviour on powernv?
>>>>>
>>>>> As smp_thread defaults to 1 in vl.c, similarly smp_cores also has the
>>>>> default value of 1 in vl.c. In powernv, we were setting nr-cores like
>>>>> this:
>>>>>
>>>>>     object_property_set_int(chip, smp_cores, "nr-cores",
>>>>>                             &error_fatal);
>>>>>
>>>>> Even when there were multiple cpus (-smp 4), when the guest boots up, we
>>>>> just get one core (i.e. smp_cores was 1) with single thread(smp_threads
>>>>> was 1), which is wrong as per the command-line that was provided.
>>>>
>>>> Right, so, -smp 4 defaults to 4 sockets, each with 1 core of 1
>>>> thread. If you can't supply 4 sockets you should error, but you
>>>> shouldn't go and change the number of cores per socket.
>>>
>>> OK, that makes sense now. And I do see that smp_cpus is 4 in the above
>>> case. Now looking more into it, i see that powernv has something called
>>> "num_chips", isnt this same as sockets ? Do we need num_chips separately?
>>
>> yes that would do for cpus, but how do we retrieve the number of
>> sockets ? I don't see a smp_sockets.
>
> # sockets = smp_cpus / smp_threads / smp_cores
>
> Or, if you want the maximum possible number of sockets (for a fully
> populated system)
> # sockets = max_cpus / smp_threads / smp_cores

ok.
that would do for a default setting. Sorry for the noise.
>> If we start looking at such issues, we should also take into account
>> memory distribution :
>>
>>   -numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]
>>
>> would allow us to define a set of cpus per node, cpus should be evenly
>> distributed on the nodes though, and also define memory per node, but
>> some nodes could be without memory.
>
> I don't really see what that has to do with anything. We already have
> ways to assign memory or cpus to specific nodes if we want.

We will see when the needs come if numa options fit the requirements.

C.
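For reference, a hypothetical invocation using the -numa syntax quoted above, spreading 8 vcpus and 4G evenly over two nodes. The machine type, sizes, and cpu ranges are illustrative only, and mem= on -numa node is the form shown in the thread (later QEMU versions prefer memdev= backends).

```shell
qemu-system-ppc64 -machine powernv \
    -smp 8,cores=4,threads=2 -m 4G \
    -numa node,nodeid=0,cpus=0-3,mem=2G \
    -numa node,nodeid=1,cpus=4-7,mem=2G
```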