Hi Bin,

On 5 August 2015 at 02:43, Bin Meng <[email protected]> wrote:
> Hi Simon, Tom,
>
> On Tue, Aug 4, 2015 at 3:27 AM, Simon Glass <[email protected]> wrote:
>> Hi Tom,
>>
>> On 3 August 2015 at 13:06, Tom Rini <[email protected]> wrote:
>>> On Mon, Aug 03, 2015 at 12:52:19PM -0600, Simon Glass wrote:
>>>> Hi Tom,
>>>>
>>>> On 31 July 2015 at 08:31, Tom Rini <[email protected]> wrote:
>>>> > On Thu, Jul 30, 2015 at 12:12:03PM +0800, Bin Meng wrote:
>>>> >
>>>> >> Hi Simon,
>>>> >>
>>>> >> When adding x86 multi-cpu initialization on a board with 4 cores, I
>>>> >> found:
>>>> >>
>>>> >> => cpu list
>>>> >> 0: cpu@0 Genuine Intel(R) CPU @ 1.58GHz
>>>> >> 1: cpu@1 Genuine Intel(R) CPU @ 1.58GHz
>>>> >> 2: cpu@2 Genuine Intel(R) CPU @ 1.58GHz
>>>> >> 2: cpu@3 Genuine Intel(R) CPU @ 1.58GHz
>>>> >>
>>>> >> cpu@2 and cpu@3 have the same sequence number, which indicates they
>>>> >> are running parallelly to get the same sequence number. The call chain
>>>> >> on an ap is: mp_init_cpu() -> device_probe() -> uclass_resolve_seq().
>>>> >> Apparently ap2 and ap3 are running at the same time to get the same
>>>> >> number.
>>>> >>
>>>> >> Note so far all x86 boards that we have enabled x86 multi-cpu
>>>> >> initialization on only have 2 cores, which will not expose such issue
>>>> >> as there is no parallel execution among aps.
>>>> >
>>>> > So what exactly are we doing with these additional cores? My
>>>> > recollection of what we do on other arches when we even deal with other
>>>> > cores is that we bring them "up" and then usually put them in a holding
>>>> > pattern for the real OS to deal with _or_ it's one of those cases where
>>>> > we have multiple OSes running and we do what we need to load and release
>>>> > those other OSes.
>>>>
>>>> In this case they end up at stop_this_cpu() which is just a hlt
>>>> instruction in each case.
>>>
>>> So do we really have to be doing anything here? Or is this just
>>> pre-emptive work for an async MP type setup down the road? We could
>>> probably live with this with a big comment noting why we know it's
>>> misbehaving.
>>
>> I think we should fix it - I suggested some options above and Bin may
>> have ideas also. Bin may be able to send a patch since he can repeat
>> the problem.
>>
>
> Yes we should fix it. But IMHO, just fixing the seq number only
> resolves the surface problem. What concerns me is that multiple cpu
> running the same piece of codes (in this case, the DM core codes)
> without any protection. I have no idea whether these core structures
> (like the device list) still look good from the DM core perspective.
> Although right now it seems that it only exposes the seq number issue,
> we don't know if there are other potential DM issues. Thus I was
> thinking fundamentally we are using DM CPU uclass in a wrong way.

We don't add devices when running on the AP CPUs - we only scan lists.
So long as the boot CPU creates all the devices and then waits for
them to populate, we are OK. I don't see any fundamental problem.

Regards,
Simon
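
To make the duplicate sequence number in the quoted report concrete,
here is a minimal sketch in C. It is a simplified model, not U-Boot
code: struct cpu_dev, resolve_seq_scan() and assign_seq_on_boot_cpu()
are hypothetical names, and the real uclass_resolve_seq() may work
differently. The assumption is only that a sequence number is chosen
by scanning for the first unused value, so two APs that scan before
either one stores its result will pick the same number. Letting the
boot CPU assign every number before the APs start is one possible way
to avoid that, shown here as an illustration rather than as the fix
agreed in this thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a CPU device; not U-Boot's struct udevice. */
struct cpu_dev {
	int seq;	/* -1 until a sequence number has been assigned */
};

/*
 * Return the lowest sequence number not used by any device in the
 * list, or -1 if all numbers below 'count' are taken. This only reads
 * the list. If two CPUs call it before either stores its result, both
 * receive the same number.
 */
int resolve_seq_scan(const struct cpu_dev *devs, int count)
{
	for (int seq = 0; seq < count; seq++) {
		bool used = false;

		for (int i = 0; i < count; i++) {
			if (devs[i].seq == seq)
				used = true;
		}
		if (!used)
			return seq;
	}
	return -1;
}

/*
 * Assumed mitigation: the boot CPU assigns a number to every device
 * that lacks one, one at a time, before any AP is released. The APs
 * then only read devs[i].seq and never need to scan for a free slot.
 */
void assign_seq_on_boot_cpu(struct cpu_dev *devs, int count)
{
	for (int i = 0; i < count; i++) {
		if (devs[i].seq < 0)
			devs[i].seq = resolve_seq_scan(devs, count);
	}
}
```

With four devices where only the first two already hold numbers 0 and
1, two back-to-back calls to resolve_seq_scan() that do not store
their result both return 2, which matches the "2: cpu@2" and
"2: cpu@3" lines in the report. Calling assign_seq_on_boot_cpu() on
the same list gives the remaining devices 2 and 3.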

