Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Aaron Durbin via coreboot
On Sat, May 5, 2018 at 4:26 PM, Kyösti Mälkki  wrote:
> On Sun, May 6, 2018 at 12:17 AM, Aaron Durbin  wrote:
>>> I find it particularly hard to be civil on your first question, so
>>> trying with sarcasm instead. After 5000 or so development hours and
>>> direct support from AMD, is the boot sequence for soc/stoneyridge
>>> prototypes equally bad, that is, AP CPUs execute through bootblock and
>>> verstage?
>>
>> They currently jump to where they are told at the moment. See
>> bootblock_c_entry() in src/soc/amd/stoneyridge/bootblock/bootblock.c.
>> There is no running through multiple stages within coreboot.
>
> I think CAR_GLOBAL equally fails with soc/amd/stoneyridge if you had
> SoC with multiple compute units where MTRRs are not shared, thus setup
> becomes asymmetric between some cores. Marshall's experiments support
> this happens.

Agreed. That's a concern w/ running APs through all the stages and not
having a symmetric view of the address space.

>
>> I think you are right that guarding things w/ boot_cpu() would work.
>> Is CONFIG_SQUELCH_EARLY_SMP set for all those types of boards? Or do
>> we have a weird mix? Or should we have a Kconfig that says 'CAR' is
>> not symmetric across cpus thus can't be used?
>
> I would rather grant AP CPUs access to CAR_GLOBALs by manipulating the
> MTRR setup.
>
> Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Kyösti Mälkki
On Sun, May 6, 2018 at 12:17 AM, Aaron Durbin  wrote:
>> I find it particularly hard to be civil on your first question, so
>> trying with sarcasm instead. After 5000 or so development hours and
>> direct support from AMD, is the boot sequence for soc/stoneyridge
>> prototypes equally bad, that is, AP CPUs execute through bootblock and
>> verstage?
>
> They currently jump to where they are told at the moment. See
> bootblock_c_entry() in src/soc/amd/stoneyridge/bootblock/bootblock.c.
> There is no running through multiple stages within coreboot.

I think CAR_GLOBAL equally fails with soc/amd/stoneyridge if you had
SoC with multiple compute units where MTRRs are not shared, thus setup
becomes asymmetric between some cores. Marshall's experiments support
this happens.

> I think you are right that guarding things w/ boot_cpu() would work.
> Is CONFIG_SQUELCH_EARLY_SMP set for all those types of boards? Or do
> we have a weird mix? Or should we have a Kconfig that says 'CAR' is
> not symmetric across cpus thus can't be used?

I would rather grant AP CPUs access to CAR_GLOBALs by manipulating the
MTRR setup.

Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Kyösti Mälkki
On Sat, May 5, 2018 at 6:27 PM, Marshall Dawson
 wrote:
>> So did you hit problems with CAR_GLOBAL in agesa_get_dispatcher(),
>> when experimeting CZ vs ST? I don't see why this code works even for
>> dual-core ST but I did not evaluate CAR layout. Documentation part for
>> fixed MTRRs in gcccar.inc appears to be same.
>
>
> I don't remember specifically where the condition caused a problem.  It kind
> of seems like it was when the third core tried to run printk, wanted a
> global pointer to a global variable, and got back 0x.  It's been too
> long to say for certain.  IIRC, cores 0 & 1 received fixed MTRR settings
> allowing 0x3-0x3, but 2 & 3 started at 0x4 and couldn't access
> the globals just above 0x3.  It's worth noting that cores in Family 15h
> share their MTRRs within a compute unit, so it was never possible for 0 & 1
> to be anything but identical.
>

Every discovered core adds WB regions to MSR. So if you say the tested
CZ had separate compute units, while ST only had one, that would
indeed explain it.

>> Now, does this MTRR change have potential to actually become _the_
>> fix? I don't see why it could not. It's pretty much all that we1 can do
>> with binaryPI, anything else touches the blobs.
>
>
>> Will you sent patch?
>

Yes.

>
> I don't think _the_ fix has been pursued yet.  It's certainly lower priority
> than other things I have going on right now.  But in my mind, it's still a
> little too hacky.  It assumes the hardcoded value of BSP_STACK_BASE_ADDR
> stays consistent (which needs to also equal DCACHE_RAM_BASE).  Maybe a
> better approach is to allow all cores to have access to 100% of the CAR
> region.  Maybe just their own plus the globals area?  I haven't given much
> thought to why cores received differing regions in AMD's CAR setup code.
> What I would really enjoy seeing is moving CAR above 1MB and skipping all
> the fixed MTRR setup, but that's a pretty big rewrite.

Keep in mind, this code ends up in your RO bootblock. Give it some
high priority on StoneyRidge buglist.

> Kyosti, do you want to submit a patch since you're able to test it on other
> platforms?  I'd be OK with having the 64K region at BSP_STACK_BASE_ADDR
> being WB for all cores until a more robust solution is available.  I don't
> think this is a characteristic of PI blob implementations, by the way.  I
> would expect all the native AGESA platforms to have the same limitation too.
>

That's how I have the patch [1] prepared.

Kyösti

[1] https://review.coreboot.org/#/c/coreboot/+/26115

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Aaron Durbin via coreboot
On Fri, May 4, 2018 at 8:49 PM, Kyösti Mälkki  wrote:
> On Fri, May 4, 2018 at 8:22 PM, Aaron Durbin  wrote:
>> On Fri, May 4, 2018 at 11:16 AM, Kyösti Mälkki  
>> wrote:
>>> On Fri, May 4, 2018 at 7:19 PM, Kyösti Mälkki  
>>> wrote:
 On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin  wrote:
>>
>> Any idea what can be result of such weird behavior?
>
>>>
>>> My current guess is AP CPUs do not see initialised memory for
>>> _car_region_start .. _end. That region is set up using fixed MTRRs in
>>> low memory and probably not synced between cores so early in romstage.
>>>
>>
>> Ugh.  Why do we allow the APs to run through all these stages? Is this
>> for parallel node memory training? Can we ring fence where the APs
>> actually run better?
>>
>
> I have to check closer, but for apu2 this would be in AMD_INIT_EARLY
> already, so AP CPUs run way before (the only) memory controller is
> configured. I believe there is microcode updates and some CPU
> registers that are also synced during AMD_INIT_EARLY. So it is doing
> more than just bringing an idle AP task engine.
>
> I find it particularly hard to be civil on your first question, so
> trying with sarcasm instead. After 5000 or so development hours and
> direct support from AMD, is the boot sequence for soc/stoneyridge
> prototypes equally bad, that is, AP CPUs execute through bootblock and
> verstage?

They currently jump to where they are told at the moment. See
bootblock_c_entry() in src/soc/amd/stoneyridge/bootblock/bootblock.c.
There is no running through multiple stages within coreboot.

I think you are right that guarding things w/ boot_cpu() would work.
Is CONFIG_SQUELCH_EARLY_SMP set for all those types of boards? Or do
we have a weird mix? Or should we have a Kconfig that says 'CAR' is
not symmetric across cpus thus can't be used?

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Marshall Dawson
>
> So did you hit problems with CAR_GLOBAL in agesa_get_dispatcher(),
> when experimeting CZ vs ST? I don't see why this code works even for
> dual-core ST but I did not evaluate CAR layout. Documentation part for
> fixed MTRRs in gcccar.inc appears to be same.
>

I don't remember specifically where the condition caused a problem.  It
kind of seems like it was when the third core tried to run printk, wanted a
global pointer to a global variable, and got back 0x.  It's been
too long to say for certain.  IIRC, cores 0 & 1 received fixed MTRR
settings allowing 0x3-0x3, but 2 & 3 started at 0x4 and
couldn't access the globals just above 0x3.  It's worth noting that
cores in Family 15h share their MTRRs within a compute unit, so it was
never possible for 0 & 1 to be anything but identical.

Now, does this MTRR change have potential to actually become _the_
> fix? I don't see why it could not. It's pretty much all that we can do
> with binaryPI, anything else touches the blobs.
>

Will you sent patch?
>

I don't think _the_ fix has been pursued yet.  It's certainly lower
priority than other things I have going on right now.  But in my mind, it's
still a little too hacky.  It assumes the hardcoded value of
BSP_STACK_BASE_ADDR stays consistent (which needs to also equal
DCACHE_RAM_BASE).  Maybe a better approach is to allow all cores to have
access to 100% of the CAR region.  Maybe just their own plus the globals
area?  I haven't given much thought to why cores received differing regions
in AMD's CAR setup code.  What I would really enjoy seeing is moving CAR
above 1MB and skipping all the fixed MTRR setup, but that's a pretty big
rewrite.

Kyosti, do you want to submit a patch since you're able to test it on other
platforms?  I'd be OK with having the 64K region at BSP_STACK_BASE_ADDR
being WB for all cores until a more robust solution is available.  I don't
think this is a characteristic of PI blob implementations, by the way.  I
would expect all the native AGESA platforms to have the same limitation too.

Thanks,
Marshall


On Sat, May 5, 2018 at 4:22 AM, Piotr Król  wrote:

> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
>
>
> On 05/05/2018 11:53 AM, Kyösti Mälkki wrote:
> > On Sat, May 5, 2018 at 3:06 AM, Marshall Dawson
> >  wrote:
> >>> My current guess is AP CPUs do not see initialised memory for
> >>> _car_region_start .. _end. That region is set up using fixed
> >>> MTRRs in low memory and probably not synced between cores so
> >>> early in romstage.
> >>
> >>
> >> Kyosti, I haven't kept a close watch on CAR changes in the other
> >> AMD systems, however in experimenting with CZ using ST source I
> >> found that 4 cores had a problem using CAR globals, so I think
> >> you're on the right track. AMD's CAR setup code assumes each core
> >> only needs fixed MTRRs configured for their own particular region
> >> of storage -- they all don't get access to 100% of the CAR area.
> >> Going from 2 cores (ST) to 4 cores (CZ), the higher two cores
> >> could no longer access the CAR globals at the bottom of the
> >> region. Try this hack and see if it improves for you:
> >>
> >
> > Hi Marshall
> >
> > Thanks, your suggested MTRR change seems to solve the regression.
> > It was definetely about CAR_GLOBALs failing for AP CPUs in general.
> > We just had not hit this case before for typical builds, as use of
> > POSTCAR_STAGE barely avoided the error from triggering.
> >
> > So did you hit problems with CAR_GLOBAL in agesa_get_dispatcher(),
> > when experimeting CZ vs ST? I don't see why this code works even
> > for dual-core ST but I did not evaluate CAR layout. Documentation
> > part for fixed MTRRs in gcccar.inc appears to be same.
> >
> > Now, does this MTRR change have potential to actually become _the_
> > fix? I don't see why it could not. It's pretty much all that we can
> > do with binaryPI, anything else touches the blobs.
>
> Hi Marshall, Kyösti,
> also confirm that this patch fix regression on recent master. Will you
> sent patch?
>
> Best Regards,
> - --
> Piotr Król
> Embedded Systems Consultant
> https://3mdeb.com | @3mdeb_com
> -BEGIN PGP SIGNATURE-
>
> iQIzBAEBCAAdFiEE4DCbLYWmfoRjKeNLsu5x6WeqnkwFAlrthdYACgkQsu5x6Weq
> nkwqPQ//QGpX6w23CWwN/lJQPMIB2Qpw0rdL5GTa2L1axsdL8GLWr+TgRkeleGi4
> q77x+jG9xPP5fBkoEiZfBOkZTwhfKIKygtSdwHybs0Qn7aammItCgdOyz1tJj9Gu
> /i6ncXK3dsWYwOj1r+qkMm0phGQDgq+uW17CnsmOUfXMIMm6ULIF6f/8t/aNvdGC
> N6gNPIjI7oDlkI73CRZjLV+bWIsFzfvoJWTZrilNz41zxI5ERCKpmGCNy93qPsMl
> p5GoGZvzE2rwniRqju8oDlw2P+o3OZ3ukAJWbQxid83dn7NHAo95ZFZ8cx7umLvN
> J+vFrD9n1CzwQRXtqWoSelYmGJkJWxx/p5Co8JK+x2yLDShrojJraV02NX32k4Yd
> Bf4RZW9nK4QnPqrH647FfBEOBZfX0ruizjTQzFLd7moTYDry8zIOwOuLc3IftT09
> edPrxnwlp6q6F8EopGXsnTbqFfbxGRRVzGICvd2/eFtSXZBULFQBfutKYrK8KVaS
> /VLXSpPQ6f07vZnWQMrCndaHIC0cShnHE1ekb2uY8kmkqzgXZetaKGsDkCXlJVRF
> vlJEoEu9qWmglqMYzxPF8MnHBx9DLAgFk/AMMmQk5yzv8Ki1m3kgJu9Fxdc0iJ3y
> 

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Piotr Król
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256



On 05/05/2018 11:53 AM, Kyösti Mälkki wrote:
> On Sat, May 5, 2018 at 3:06 AM, Marshall Dawson 
>  wrote:
>>> My current guess is AP CPUs do not see initialised memory for 
>>> _car_region_start .. _end. That region is set up using fixed
>>> MTRRs in low memory and probably not synced between cores so
>>> early in romstage.
>> 
>> 
>> Kyosti, I haven't kept a close watch on CAR changes in the other
>> AMD systems, however in experimenting with CZ using ST source I
>> found that 4 cores had a problem using CAR globals, so I think
>> you're on the right track. AMD's CAR setup code assumes each core
>> only needs fixed MTRRs configured for their own particular region
>> of storage -- they all don't get access to 100% of the CAR area.
>> Going from 2 cores (ST) to 4 cores (CZ), the higher two cores
>> could no longer access the CAR globals at the bottom of the
>> region. Try this hack and see if it improves for you:
>> 
> 
> Hi Marshall
> 
> Thanks, your suggested MTRR change seems to solve the regression.
> It was definetely about CAR_GLOBALs failing for AP CPUs in general.
> We just had not hit this case before for typical builds, as use of 
> POSTCAR_STAGE barely avoided the error from triggering.
> 
> So did you hit problems with CAR_GLOBAL in agesa_get_dispatcher(), 
> when experimeting CZ vs ST? I don't see why this code works even
> for dual-core ST but I did not evaluate CAR layout. Documentation
> part for fixed MTRRs in gcccar.inc appears to be same.
> 
> Now, does this MTRR change have potential to actually become _the_ 
> fix? I don't see why it could not. It's pretty much all that we can
> do with binaryPI, anything else touches the blobs.

Hi Marshall, Kyösti,
also confirm that this patch fix regression on recent master. Will you
sent patch?

Best Regards,
- -- 
Piotr Król
Embedded Systems Consultant
https://3mdeb.com | @3mdeb_com
-BEGIN PGP SIGNATURE-

iQIzBAEBCAAdFiEE4DCbLYWmfoRjKeNLsu5x6WeqnkwFAlrthdYACgkQsu5x6Weq
nkwqPQ//QGpX6w23CWwN/lJQPMIB2Qpw0rdL5GTa2L1axsdL8GLWr+TgRkeleGi4
q77x+jG9xPP5fBkoEiZfBOkZTwhfKIKygtSdwHybs0Qn7aammItCgdOyz1tJj9Gu
/i6ncXK3dsWYwOj1r+qkMm0phGQDgq+uW17CnsmOUfXMIMm6ULIF6f/8t/aNvdGC
N6gNPIjI7oDlkI73CRZjLV+bWIsFzfvoJWTZrilNz41zxI5ERCKpmGCNy93qPsMl
p5GoGZvzE2rwniRqju8oDlw2P+o3OZ3ukAJWbQxid83dn7NHAo95ZFZ8cx7umLvN
J+vFrD9n1CzwQRXtqWoSelYmGJkJWxx/p5Co8JK+x2yLDShrojJraV02NX32k4Yd
Bf4RZW9nK4QnPqrH647FfBEOBZfX0ruizjTQzFLd7moTYDry8zIOwOuLc3IftT09
edPrxnwlp6q6F8EopGXsnTbqFfbxGRRVzGICvd2/eFtSXZBULFQBfutKYrK8KVaS
/VLXSpPQ6f07vZnWQMrCndaHIC0cShnHE1ekb2uY8kmkqzgXZetaKGsDkCXlJVRF
vlJEoEu9qWmglqMYzxPF8MnHBx9DLAgFk/AMMmQk5yzv8Ki1m3kgJu9Fxdc0iJ3y
Ka9voYxzP2r+K8CaJgvtXYwSeHoIqDdERBAz0lN3KTqGuXtewSs=
=0Gj7
-END PGP SIGNATURE-

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-05 Thread Kyösti Mälkki
On Sat, May 5, 2018 at 3:06 AM, Marshall Dawson
 wrote:
>> My current guess is AP CPUs do not see initialised memory for
>> _car_region_start .. _end. That region is set up using fixed MTRRs in
>> low memory and probably not synced between cores so early in romstage.
>
>
> Kyosti, I haven't kept a close watch on CAR changes in the other AMD
> systems, however in experimenting with CZ using ST source I found that 4
> cores had a problem using CAR globals, so I think you're on the right track.
> AMD's CAR setup code assumes each core only needs fixed MTRRs configured for
> their own particular region of storage -- they all don't get access to 100%
> of the CAR area.  Going from 2 cores (ST) to 4 cores (CZ), the higher two
> cores could no longer access the CAR globals at the bottom of the region.
> Try this hack and see if it improves for you:
>

Hi Marshall

Thanks, your suggested MTRR change seems to solve the regression. It
was definetely about CAR_GLOBALs failing for AP CPUs in general. We
just had not hit this case before for typical builds, as use of
POSTCAR_STAGE barely avoided the error from triggering.

So did you hit problems with CAR_GLOBAL in agesa_get_dispatcher(),
when experimeting CZ vs ST? I don't see why this code works even for
dual-core ST but I did not evaluate CAR layout. Documentation part for
fixed MTRRs in gcccar.inc appears to be same.

Now, does this MTRR change have potential to actually become _the_
fix? I don't see why it could not. It's pretty much all that we can do
with binaryPI, anything else touches the blobs.

Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Kyösti Mälkki
On Fri, May 4, 2018 at 8:22 PM, Aaron Durbin  wrote:
> On Fri, May 4, 2018 at 11:16 AM, Kyösti Mälkki  
> wrote:
>> On Fri, May 4, 2018 at 7:19 PM, Kyösti Mälkki  
>> wrote:
>>> On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin  wrote:
>
> Any idea what can be result of such weird behavior?

>>
>> My current guess is AP CPUs do not see initialised memory for
>> _car_region_start .. _end. That region is set up using fixed MTRRs in
>> low memory and probably not synced between cores so early in romstage.
>>
>
> Ugh.  Why do we allow the APs to run through all these stages? Is this
> for parallel node memory training? Can we ring fence where the APs
> actually run better?
>

I have to check closer, but for apu2 this would be in AMD_INIT_EARLY
already, so AP CPUs run way before (the only) memory controller is
configured. I believe there is microcode updates and some CPU
registers that are also synced during AMD_INIT_EARLY. So it is doing
more than just bringing an idle AP task engine.

I find it particularly hard to be civil on your first question, so
trying with sarcasm instead. After 5000 or so development hours and
direct support from AMD, is the boot sequence for soc/stoneyridge
prototypes equally bad, that is, AP CPUs execute through bootblock and
verstage?

Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Marshall Dawson
>
> My current guess is AP CPUs do not see initialised memory for
> _car_region_start .. _end. That region is set up using fixed MTRRs in
> low memory and probably not synced between cores so early in romstage.
>

Kyosti, I haven't kept a close watch on CAR changes in the other AMD
systems, however in experimenting with CZ using ST source I found that 4
cores had a problem using CAR globals, so I think you're on the right
track.  AMD's CAR setup code assumes each core only needs fixed MTRRs
configured for their own particular region of storage -- they all don't get
access to 100% of the CAR area.  Going from 2 cores (ST) to 4 cores (CZ),
the higher two cores could no longer access the CAR globals at the bottom
of the region.  Try this hack and see if it improves for you:

diff --git a/src/vendorcode/amd/pi/00730F01/binaryPI/gcccar.inc
b/src/vendorcode/amd/pi/00730F01/binaryPI/gcccar.inc
index 5a4f7b9..fd3bf81 100644
--- a/src/vendorcode/amd/pi/00730F01/binaryPI/gcccar.inc
+++ b/src/vendorcode/amd/pi/00730F01/binaryPI/gcccar.inc
@@ -1151,6 +1151,7 @@ SetupStack:
 or  %edi, %edx#
 #.endif  #
 0:
+or $0x1e00, %eax# force 3-4 for cbmem
access
 _WRMSR  #

 # Enable MTRR defaults as UC type
-- 

Ugh.  Why do we allow the APs to run through all these stages? Is this
> for parallel node memory training? Can we ring fence where the APs
> actually run better?
>

Aaron, on Stoney Ridge all cores needed to be alive in order to do memory
training.  It wasn't that they were used in the algorithm (as far as I've
seen); they simply had to be there to prevent the northbridge from blowing
up.  I've long since forgotten what internal designs AMD used for Family
15h and 16h, but it seems likely the same limitation applies on Family 16h
too.  We occasionally discuss a different methodology of shutting down the
APs immediately when they're released, and not letting them run through to
InitEarly().  It seems like that ought to allow the northbridge to behave
as expected during training in InitPost().

Thanks,
Marshall



On Fri, May 4, 2018 at 11:22 AM, Aaron Durbin via coreboot <
coreboot@coreboot.org> wrote:

> On Fri, May 4, 2018 at 11:16 AM, Kyösti Mälkki 
> wrote:
> > On Fri, May 4, 2018 at 7:19 PM, Kyösti Mälkki 
> wrote:
> >> On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin 
> wrote:
> 
>  Any idea what can be result of such weird behavior?
> >>>
> >
> > My current guess is AP CPUs do not see initialised memory for
> > _car_region_start .. _end. That region is set up using fixed MTRRs in
> > low memory and probably not synced between cores so early in romstage.
> >
>
> Ugh.  Why do we allow the APs to run through all these stages? Is this
> for parallel node memory training? Can we ring fence where the APs
> actually run better?
>
> >
> > diff --git a/src/console/init.c b/src/console/init.c
> > index 8f71b09..4731e7e 100644
> > --- a/src/console/init.c
> > +++ b/src/console/init.c
> > @@ -35,7 +35,7 @@ static int console_loglevel = CONFIG_DEFAULT_CONSOLE_
> LOGLEVEL;
> >
> >  static inline int get_log_level(void)
> >  {
> > -   if (car_get_var(console_inited) == 0)
> > +   if (boot_cpu() && car_get_var(console_inited) == 0)
> > return -1;
> > if (CONSOLE_LEVEL_CONST)
> > return get_console_loglevel();
> >
> > Kyösti
>
> --
> coreboot mailing list: coreboot@coreboot.org
> https://mail.coreboot.org/mailman/listinfo/coreboot
>
-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Aaron Durbin via coreboot
On Fri, May 4, 2018 at 11:16 AM, Kyösti Mälkki  wrote:
> On Fri, May 4, 2018 at 7:19 PM, Kyösti Mälkki  wrote:
>> On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin  wrote:

 Any idea what can be result of such weird behavior?
>>>
>
> My current guess is AP CPUs do not see initialised memory for
> _car_region_start .. _end. That region is set up using fixed MTRRs in
> low memory and probably not synced between cores so early in romstage.
>

Ugh.  Why do we allow the APs to run through all these stages? Is this
for parallel node memory training? Can we ring fence where the APs
actually run better?

>
> diff --git a/src/console/init.c b/src/console/init.c
> index 8f71b09..4731e7e 100644
> --- a/src/console/init.c
> +++ b/src/console/init.c
> @@ -35,7 +35,7 @@ static int console_loglevel = 
> CONFIG_DEFAULT_CONSOLE_LOGLEVEL;
>
>  static inline int get_log_level(void)
>  {
> -   if (car_get_var(console_inited) == 0)
> +   if (boot_cpu() && car_get_var(console_inited) == 0)
> return -1;
> if (CONSOLE_LEVEL_CONST)
> return get_console_loglevel();
>
> Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Kyösti Mälkki
On Fri, May 4, 2018 at 7:19 PM, Kyösti Mälkki  wrote:
> On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin  wrote:
>>>
>>> Any idea what can be result of such weird behavior?
>>

My current guess is AP CPUs do not see initialised memory for
_car_region_start .. _end. That region is set up using fixed MTRRs in
low memory and probably not synced between cores so early in romstage.


diff --git a/src/console/init.c b/src/console/init.c
index 8f71b09..4731e7e 100644
--- a/src/console/init.c
+++ b/src/console/init.c
@@ -35,7 +35,7 @@ static int console_loglevel = CONFIG_DEFAULT_CONSOLE_LOGLEVEL;

 static inline int get_log_level(void)
 {
-   if (car_get_var(console_inited) == 0)
+   if (boot_cpu() && car_get_var(console_inited) == 0)
return -1;
if (CONSOLE_LEVEL_CONST)
return get_console_loglevel();

Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Kyösti Mälkki
On Fri, May 4, 2018 at 6:37 PM, Aaron Durbin  wrote:
> On Fri, May 4, 2018 at 9:23 AM, Piotr Król  wrote:
>> Hi Aaron,
>> I tried to boot PC Engines apu2 on recent master branch and discovered
>> that it not boot. Bisection points to:
>>
>> 60320182d011 console: only allow console messages after initialization
>>
>> It is hard to believe that this change cause AGESA reset loop, but I
>> checked 3 times. After applying above commit I end up with loop:
>>
>> coreboot-4.7-441-g60320182d0 Fri Mar  2 15:22:24 UTC 2018 romstage
>> starting...
>> BSP Family_Model: 00730f01
>> cpu_init_detectedx = 
>> agesawrapper_amdinitreset() entry
>> CBFS: 'Master Header Locator' located CBFS at [200:7fffc0)
>> CBFS: Locating 'AGESA'
>> CBFS: Found @ offset 5ffdc0 size 7b0e0
>> CBFS: 'Master Header Locator' located CBFS at [200:7fffc0)
>> CBFS: Locating 'AGESA'
>> CBFS: Found @ offset 5ffdc0 size 7b0e0
>> Fch OEM config in INIT RESET Done
>>
>> Any idea what can be result of such weird behavior?
>
> I think it's because we still have boards that utilize CAR_GLOBAL, but
> don't handle migration of CAR globals. Those two things combined
> really makes for situations that just don't work and ramstage loading
> is just lucky to work on those platforms. I was originally thinking
> this patch would work, but I don't think that's the case because when
> I build apu2 CONFIG_EARLY_CBMEM_INIT is already set. If you revert
> that patch what does your log look like? We could be recursively
> entering into car_get_var_ptr() through the printk() path, but that
> would require things not marked as CAR_GLOBAL being passed to that
> function. There is one way to fix all of these scenarios: remove CAR
> migration by transitioning everyone to postcar stub between romstage
> and ramstage. That's a lot of work, though. We can try and debug
> further because I'm not clear why things aren't working.
>

Well I am looking at this as well now.

I have not booted vanilla master on apu2 for sometime, being preparing
POSTCAR_STAGE for these binaryPI boards instead..

Kyösti

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot

Re: [coreboot] PC Engines apu2 boot regression

2018-05-04 Thread Aaron Durbin via coreboot
On Fri, May 4, 2018 at 9:23 AM, Piotr Król  wrote:
> Hi Aaron,
> I tried to boot PC Engines apu2 on recent master branch and discovered
> that it not boot. Bisection points to:
>
> 60320182d011 console: only allow console messages after initialization
>
> It is hard to believe that this change cause AGESA reset loop, but I
> checked 3 times. After applying above commit I end up with loop:
>
> coreboot-4.7-441-g60320182d0 Fri Mar  2 15:22:24 UTC 2018 romstage
> starting...
> BSP Family_Model: 00730f01
> cpu_init_detectedx = 
> agesawrapper_amdinitreset() entry
> CBFS: 'Master Header Locator' located CBFS at [200:7fffc0)
> CBFS: Locating 'AGESA'
> CBFS: Found @ offset 5ffdc0 size 7b0e0
> CBFS: 'Master Header Locator' located CBFS at [200:7fffc0)
> CBFS: Locating 'AGESA'
> CBFS: Found @ offset 5ffdc0 size 7b0e0
> Fch OEM config in INIT RESET Done
>
> Any idea what can be result of such weird behavior?

I think it's because we still have boards that utilize CAR_GLOBAL, but
don't handle migration of CAR globals. Those two things combined
really makes for situations that just don't work and ramstage loading
is just lucky to work on those platforms. I was originally thinking
this patch would work, but I don't think that's the case because when
I build apu2 CONFIG_EARLY_CBMEM_INIT is already set. If you revert
that patch what does your log look like? We could be recursively
entering into car_get_var_ptr() through the printk() path, but that
would require things not marked as CAR_GLOBAL being passed to that
function. There is one way to fix all of these scenarios: remove CAR
migration by transitioning everyone to postcar stub between romstage
and ramstage. That's a lot of work, though. We can try and debug
further because I'm not clear why things aren't working.

diff --git a/src/console/init.c b/src/console/init.c
index 8f71b09881..b5c2f792fa 100644
--- a/src/console/init.c
+++ b/src/console/init.c
@@ -35,7 +35,7 @@ static int console_loglevel = CONFIG_DEFAULT_CONSOLE_LOGLEVEL;

 static inline int get_log_level(void)
 {
-   if (car_get_var(console_inited) == 0)
+   if (IS_ENABLED(CONFIG_EARLY_CBMEM_INIT) && !car_get_var(console_inited))
return -1;
if (CONSOLE_LEVEL_CONST)
return get_console_loglevel();
@@ -78,7 +78,8 @@ asmlinkage void console_init(void)

console_hw_init();

-   car_set_var(console_inited, 1);
+   if (IS_ENABLED(CONFIG_EARLY_CBMEM_INIT))
+   car_set_var(console_inited, 1);

printk(BIOS_NOTICE, "\n\ncoreboot-%s%s %s " ENV_STRING " starting...\n",
   coreboot_version, coreboot_extra_version, coreboot_build);



>
> Best Regards,
> --
> Piotr Król
> Embedded Systems Consultant
> https://3mdeb.com | @3mdeb_com
>

-- 
coreboot mailing list: coreboot@coreboot.org
https://mail.coreboot.org/mailman/listinfo/coreboot