Carl-Daniel Hailfinger wrote:
Hi Yinghai,

can you please look at the problems below?

On 06.08.2008 18:54, Marc Jones wrote:
Carl-Daniel Hailfinger wrote:
I'm currently working to unify K8 and Fam10 CAR to use the same code at
runtime (as opposed to buildtime #ifdefs). While this may not be a goal
for v2, I definitely want to try it for v3.

A few questions/comments about the CAR code:
- Only Fam10 APs are treated specially. APs of older generations seem to
be unhandled. Did older generations treat each core as BSP (code seems
to suggest that) or were there other special provisions?
I don't know. I haven't used or worked on that code. YH would be the
better person to ask. For the fam10 code there are some settings that
can only be set from the AP cores.

Older BKDGs indicate that we should treat all APs specially. That would
be a missing feature in the old code.


- CAR goes from 0xC8000 to 0xCFFFF. Assuming GlobalVarSize=0 (untrue,
but easier to calculate), BSP stack will be from 0xCC000 to 0xCFFFF and
AP stacks will be below 0xCBFFF.
* With the current settings (32k CAR total, 1k per AP, 16k for the BSP)
the scheme will fall apart if the highest NodeID shifted by the number
of CoreID bits is 16 or higher. The BKDG indicates that the number of
CoreID bits is 2, so a NodeID of 4 or higher will break.
Yes. This was sufficient for the K8 and was not changed when I added
fam10. 8 dual core K8 was the most you could have. It could probably
be expanded into the rest of the shadow hole (up to FFFFF) if needed.
The reason to keep it in the hole is for memory eye finding that will
happen from 1MB to TOM.

We may need to revisit this for Family 10h. What's the maximum number of
cores in one Family 10h system?

8 quadcore cpus would be a large server setup.
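
The numbers discussed above can be checked with a small sketch. It assumes, purely for illustration, that each AP gets a 1k stack packed downward from the bottom of the BSP stack and indexed by (NodeID << CoreID bits) | CoreID, with GlobalVarSize=0 as in the simplified calculation. The function and macro names are hypothetical and are not taken from the coreboot tree.

```c
#include <assert.h>
#include <stdint.h>

/* Values quoted in the thread; GlobalVarSize is taken as 0. */
#define CAR_BASE      0xC8000u  /* lowest CAR address */
#define CAR_END       0xD0000u  /* one past 0xCFFFF */
#define BSP_STACK_SZ  0x4000u   /* 16k for the BSP */
#define AP_STACK_SZ   0x400u    /* 1k per AP */
#define COREID_BITS   2u        /* number of CoreID bits per the BKDG */

/* Hypothetical stack index: NodeID shifted by the CoreID bits, plus CoreID. */
static unsigned ap_index(unsigned node_id, unsigned core_id)
{
    assert(core_id < (1u << COREID_BITS));
    return (node_id << COREID_BITS) | core_id;
}

/* Lowest address of the stack for this index (assumed layout: stacks
 * packed downward starting right below the BSP stack at 0xCC000). */
static uint32_t ap_stack_bottom(unsigned index)
{
    uint32_t bsp_bottom = CAR_END - BSP_STACK_SZ;
    return bsp_bottom - (index + 1u) * AP_STACK_SZ;
}

/* Nonzero if the stack for this index still lies inside the CAR area. */
static int ap_stack_fits(unsigned index)
{
    return ap_stack_bottom(index) >= CAR_BASE;
}
```

Under these assumptions index 15 (NodeID 3, CoreID 3) is the last one that fits, and index 16 (NodeID 4, CoreID 0) already falls below 0xC8000. An 8-node quad-core system would need indexes up to 31, i.e. 32k of AP stacks alone.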


* There is no good place to store the printk() buffer in CAR. On Geode
and i586, the printk buffer runs from the lowest address of the CAR area
to the middle. Keeping that design will result in the AP stacks
colliding with the printk buffer. Limiting the size of the printk buffer
dynamically would work unless there are more than 15 cores in the
system, where even a printk buffer of zero size would clobber one AP
stack. The other alternative is to keep the printk buffer size fixed and
let the AP stacks eat into BSP stack space.
This was the problem I mentioned when you were doing the printk()
buffer. You are not guaranteed the use of the cache.

I think I can fix that. If APs are only started after the BSP has
initialized DRAM, there is no problem because the printk() buffer is
relocated directly after DRAM init.


APIC, fid/vid and other MSR init happen before memory init. See
wait_all_core0_started() and start_other_cores() for details.

- Is there any reason on any K8 or later processor supported by the
current CAR code not to use 64k CAR?
To leave room for APs? There may have been some concern about small
cache versions being introduced?

The various BKDGs state that L1 cache tag indexes of 00h - FFh are
reserved for memory training and recommend using exactly 48k CAR. I
intend to follow that advice.
By the way, the AP CAR areas in our code are inside the BSP CAR area.

I think that is ok. IIRC the APs use the BSP cache for their stacks. It also allows them to share the sysinfo struct.

- Is 1k enough stack for the APs, given some stack-heavy functions in
v3?
I don't know for sure but I would expect it to be ok.

- Can the K8 processors work reliably with 0x1e1e1e1e settings in the
fixed MTRR or can the Fam10 processors work with 0x06060606?
No.

OK, I will work on a generic code sequence for this problem. Does Family
11h need 0x1e or 0x06?


I think that Fam11 will have 0x1e cache type.
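
A generic runtime sequence for this could look roughly like the sketch below. It is only an illustration: the helper names are hypothetical, the Fam11 case follows the tentative answer above and is unconfirmed, and the reading of 0x1e as write-back (0x06) plus the RdMem (bit 4) and WrMem (bit 3) extension bits is taken from the AMD64 extended fixed-MTRR type-field format rather than from anything stated in this thread.

```c
#include <stdint.h>

#define MTRR_TYPE_WB  0x06u  /* write-back memory type */
#define MTRR_WRMEM    0x08u  /* bit 3 of each fixed-MTRR type byte */
#define MTRR_RDMEM    0x10u  /* bit 4 of each fixed-MTRR type byte */

/* Decode the display family from CPUID function 1 EAX:
 * base family in bits 11:8, extended family in bits 27:20,
 * the two are added only when the base family is 0xF. */
static unsigned cpu_family(uint32_t cpuid_eax)
{
    unsigned base = (cpuid_eax >> 8) & 0xfu;
    unsigned ext = (cpuid_eax >> 20) & 0xffu;
    return base == 0xfu ? base + ext : base;
}

/* 32-bit half of a fixed MTRR for the CAR range, chosen at runtime.
 * K8 (family 0x0f) gets 0x06 per byte; Fam10 gets 0x1e per byte.
 * Treating Fam11 like Fam10 is an assumption based on the reply above. */
static uint32_t car_fixed_mtrr_half(unsigned family)
{
    uint32_t type = MTRR_TYPE_WB;
    if (family >= 0x10u)
        type |= MTRR_RDMEM | MTRR_WRMEM;
    return type * 0x01010101u;  /* replicate the byte into all four fields */
}
```

The point of the sketch is that the choice between the two values can hang off a single CPUID read instead of a buildtime #ifdef.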


Marc

--
Marc Jones
Senior Firmware Engineer
(970) 226-9684 Office
mailto:[EMAIL PROTECTED]
http://www.amd.com/embeddedprocessors



--
coreboot mailing list
[email protected]
http://www.coreboot.org/mailman/listinfo/coreboot
