Re: [U-Boot] driver model is not smp safe

Bin Meng Fri, 07 Aug 2015 17:50:20 -0700

Hi Simon,

On Sat, Aug 8, 2015 at 3:09 AM, Simon Glass <[email protected]> wrote:
> Hi Bin,
>
> On 5 August 2015 at 02:43, Bin Meng <[email protected]> wrote:
>> Hi Simon, Tom,
>>
>> On Tue, Aug 4, 2015 at 3:27 AM, Simon Glass <[email protected]> wrote:
>>> Hi Tom,
>>>
>>> On 3 August 2015 at 13:06, Tom Rini <[email protected]> wrote:
>>>> On Mon, Aug 03, 2015 at 12:52:19PM -0600, Simon Glass wrote:
>>>>> Hi Tom,
>>>>>
>>>>> On 31 July 2015 at 08:31, Tom Rini <[email protected]> wrote:
>>>>> > On Thu, Jul 30, 2015 at 12:12:03PM +0800, Bin Meng wrote:
>>>>> >
>>>>> >> Hi Simon,
>>>>> >>
>>>>> >> When adding x86 multi-cpu initialization on a board with 4 cores,
>>>>> >> I found:
>>>>> >>
>>>>> >> => cpu list
>>>>> >>   0: cpu@0               Genuine Intel(R) CPU         @ 1.58GHz
>>>>> >>   1: cpu@1               Genuine Intel(R) CPU         @ 1.58GHz
>>>>> >>   2: cpu@2               Genuine Intel(R) CPU         @ 1.58GHz
>>>>> >>   2: cpu@3               Genuine Intel(R) CPU         @ 1.58GHz
>>>>> >>
>>>>> >> cpu@2 and cpu@3 have the same sequence number, which indicates that
>>>>> >> the two APs were probed in parallel and were assigned the same
>>>>> >> number. The call chain on an AP is:
>>>>> >> mp_init_cpu() -> device_probe() -> uclass_resolve_seq().
>>>>> >>
>>>>> >> Note that so far all x86 boards on which we have enabled multi-cpu
>>>>> >> initialization have only 2 cores, which does not expose this issue
>>>>> >> because with a single AP there is no parallel execution among APs.
>>>>> >
>>>>> > So what exactly are we doing with these additional cores?  My
>>>>> > recollection of what we do on other arches when we even deal with other
>>>>> > cores is that we bring them "up" and then usually put them in a holding
>>>>> > pattern for the real OS to deal with _or_ it's one of those cases where
>>>>> > we have multiple OSes running and we do what we need to load and release
>>>>> > those other OSes.
>>>>>
>>>>> In this case they end up at stop_this_cpu() which is just a hlt
>>>>> instruction in each case.
>>>>
>>>> So do we really have to be doing anything here?  Or is this just
>>>> pre-emptive work for an async MP type setup down the road?  We could
>>>> probably live with this with a big comment noting why we know it's
>>>> misbehaving.
>>>
>>> I think we should fix it - I suggested some options above and Bin may
>>> have ideas also. Bin may be able to send a patch since he can repeat
>>> the problem.
>>>
>>
>> Yes we should fix it. But IMHO, just fixing the seq number only
>> resolves the surface problem. What concerns me is that multiple CPUs
>> are running the same piece of code (in this case, the DM core code)
>> without any protection. I have no idea whether the core structures
>> (like the device list) are still consistent from the DM core's
>> perspective. Although right now only the seq number issue is exposed,
>> we don't know whether there are other potential DM issues. Thus I was
>> thinking that fundamentally we are using the DM CPU uclass in the
>> wrong way.
>
> We don't add devices when running on the AP CPUs - we only scan lists.
> So long as the boot CPU creates all the devices and then waits for
> them to populate, we are OK. I don't see any fundamental problem.
>


OK, that makes me feel better, if we only need to resolve the seq
number issue. I will submit a patch for that.
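
To illustrate the kind of fix I have in mind, here is a minimal sketch,
not the actual patch. It assumes the sequence numbers can be tracked in a
small table guarded by a spinlock. The names claim_free_seq(),
release_seq(), seq_lock, seq_used and MAX_SEQ are all hypothetical and do
not exist in U-Boot; the sketch uses plain C11 atomics rather than any
U-Boot locking API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_SEQ 8	/* hypothetical upper bound on sequence numbers */

/* Hypothetical lock and table; U-Boot's DM core has neither of these. */
static atomic_flag seq_lock = ATOMIC_FLAG_INIT;
static bool seq_used[MAX_SEQ];

static void seq_lock_acquire(void)
{
	/* Spin until the flag was previously clear, i.e. we now own it. */
	while (atomic_flag_test_and_set_explicit(&seq_lock,
						 memory_order_acquire))
		;
}

static void seq_lock_release(void)
{
	atomic_flag_clear_explicit(&seq_lock, memory_order_release);
}

/*
 * Claim the lowest free sequence number, or return -1 if none is left.
 * The search and the claim happen under the lock, so two CPUs calling
 * this at the same time cannot both pick the same number.
 */
static int claim_free_seq(void)
{
	int seq = -1;

	seq_lock_acquire();
	for (int i = 0; i < MAX_SEQ; i++) {
		if (!seq_used[i]) {
			seq_used[i] = true;
			seq = i;
			break;
		}
	}
	seq_lock_release();

	return seq;
}

/* Give a previously claimed sequence number back. */
static void release_seq(int seq)
{
	assert(seq >= 0 && seq < MAX_SEQ);

	seq_lock_acquire();
	seq_used[seq] = false;
	seq_lock_release();
}
```

The point is only that "find a free number" and "mark it used" must be one
atomic step. Without the lock, two APs can both see the same slot as free,
which matches the duplicate "2:" in the cpu list output.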

Regards,
Bin
_______________________________________________
U-Boot mailing list
[email protected]
http://lists.denx.de/mailman/listinfo/u-boot
