On Tue, Feb 16, 2010 at 4:40 PM, Stefan Reinauer <[email protected]>wrote:
> On 2/16/10 8:42 PM, Myles Watson wrote: > > > > On Tue, Feb 16, 2010 at 12:02 PM, Stefan Reinauer > <[email protected]>wrote: > >> On 2/16/10 5:11 AM, Timothy Pearson wrote: >> > Here is a cleaned up and tested version of the SMP APIC autodetect >> patch. >> > >> > Signed-off-by: Timothy Pearson <[email protected]> >> > > >> And, can you describe high level, what the patch changes? It looks to me >> as if you are recursing through the tree instead of just walking the >> "all_devices" list. >> So this implies that you don't catch all devices when running through >> all_devices. This sounds like a problem in the resource allocator and >> maybe it should be fixed there instead? >> > I don't understand why this would be a resource allocator problem. Aren't > we talking about the device tree? > > > Oh, sorry I didn't say more... What the old code did was the following: > > device_t cpu; > for(cpu = all_devices; cpu; cpu = cpu->next) { > .... > <- check if the device is an APIC and if it is, dump an APIC entry in > the mptable -> > .... > } > > But it seems that the code does not see all devices -- anymore -- not sure > if it ever did, but I think to remember that's what happened at some point, > or was at least intended. > I'm really surprised that it didn't see all devices. > So now the code looks l > > scan_for_apic_devices(parent) > { > for(c_it=0; c_it < parent->links; c_it++) { > for (child = parent->link[c_it].children; child; child = > child->sibling) { > .... > > if (child->path.type == DEVICE_PATH_APIC_CLUSTER) { > // Found an APIC cluster, scan it for APICs > smp_scan_for_apic_devices(child); > } > } > } > } > > So two more steps are necessary: > - check all the downwards links of a device instead of just walking devices > and checking their type. > - run recursively in a special case on APIC clusters. > > This sounds a whole lot like something changed in the way "all_devices" > works. And if "all_devices" does not mean "all devices" I am sure there are > more places in our code that need similar fixes. > I agree. I've never seen the problem where I couldn't run through all the devs with all_devices. > > Maybe the real problem is related to the memory corruption seen with > printk? > > Not completely impossible, but I figured it's easier to ask the guy who > rewrote the resource allocator if he knows something about how the intended > behavior of "all_devices" ;-) > As far as I know I didn't change that behavior. Does anyone have a boot log where they can see that the tree of devices has more devices than the list (at the same stage of enumeration)? > - Maybe the behavior is intended, then we just need to check in Timothy's > patch. > - Maybe the behavior is not intended, but that's how the code works right > now. Then we'd rather have to look at the resource allocator and decide if > we want "all_devices" to mean "all devices", or whether we rename the > variable, or redefine its meaning. > I think it should mean all devices. I changed resource allocation to go through the tree because children of disabled devices shouldn't have allocated resources. - Maybe none of the above applies, then we need to do nasty stack corruption > debugging. In this case it would be fundamentally wrong to touch either the > device tree code or the mp table creation code until we fix the corruption > in order to make sure we don't create funny special cases. > This is my vote, but I'm happy to be proven wrong. I don't see anywhere in the code where devices get added to the tree but not to the list of devices. I also don't remember a place where devices get removed from the all_devices list. > So, if you have hints enlightening any of the maybes, please share! > My suggestion would be to traverse the tree and the list and compare them. The problem is that printk was causing a hang for him. Thanks, Myles
-- coreboot mailing list: [email protected] http://www.coreboot.org/mailman/listinfo/coreboot

