Please excuse this blast. Here's the problem: CBFS is breaking something it can't break. If you turn on CBFS, then very early startup in the Opteron code fails. This is verified across several mainboards. Any wild ideas welcome. I can't even figure out where to start ...
ron

Forwarded conversation
Subject: s2892 + CBFS strange failure
------------------------

From: *Myles Watson* <[email protected]>
Date: Wed, Apr 22, 2009 at 9:05 AM
To: ron minnich <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>, Marc Jones <[email protected]>

On Wed, Apr 22, 2009 at 9:56 AM, ron minnich <[email protected]> wrote:
> can I bring in patrick and stephan and marcj? This is getting too weird.

:) Probably part of it is miscommunication on my part, but I'd be glad for any help.

Here's the summary:

With CONFIG_CBFS = 0 it works fine

With CONFIG_CBFS = 1 I get (warm reset):
INIT detected from --- { APICID = 00 NODEID = 00 COREID = 00} ---
Issuing SOFT_RESET...
Then nothing else. Post code 0xf0

With CONFIG_CBFS = 1 I get (cold reset):
Nothing. Post code 0xf0

I've been inserting post codes, and it always makes it to real_main. It just doesn't make it out of init_cpus. On a warm reset I get the serial output. Otherwise there is none.

We've tried using a different compiler. Same results.
We've tried no payload and no VGA ROM.

Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Wed, Apr 22, 2009 at 9:12 AM
To: Myles Watson <[email protected]>
Cc: Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>, Marc Jones <[email protected]>

Also, myles, this all works on serengeti, right?

ron

----------
From: *Myles Watson* <[email protected]>
Date: Wed, Apr 22, 2009 at 9:13 AM
To: ron minnich <[email protected]>
Cc: Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>, Marc Jones <[email protected]>

I was in the middle of writing that :)

I forgot an interesting point: The broken image works on SimNOW until it can't find the SMBUS. But it always gets far enough that there is some serial output.
Thanks,
Myles

----------
From: *Marc Jones* <[email protected]>
Date: Wed, Apr 22, 2009 at 10:28 AM
To: Myles Watson <[email protected]>
Cc: ron minnich <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

That is very strange. Can you attempt to track when it starts to fail? Does it have to boot all the way into Linux before the reset stops working, or does it happen before it loads any payloads? I can't think of anything that would cause that kind of problem. Can you narrow it down in cpu_init?

Marc

--
http://marcjonesconsulting.com

----------
From: *Myles Watson* <[email protected]>
Date: Wed, Apr 22, 2009 at 10:48 AM
To: Marc Jones <[email protected]>
Cc: ron minnich <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

Sorry I was unclear again. I'll try to explain better.

When I said it happens on warm reset, I meant from a working image.

1. boot a working image
2. switch to cbfs image
3. warm reset gives some output

It seems like it hangs on the first call to printk that it reaches. I tried moving console_init ahead of init_cpus in real_main, but it didn't change the behavior.

Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Wed, Apr 22, 2009 at 1:44 PM
To: Myles Watson <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

so cbfs works with qemu, kontron (yes or no? I think yes), and serengeti, and it fails with this board.

Are these older CPUs? What stepping?

I have to admit I'm stumped.

ron

----------
From: *Myles Watson* <[email protected]>
Date: Wed, Apr 22, 2009 at 1:47 PM
To: ron minnich <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

I may have been chasing the wrong thing here. When I was helping Samuel with the dl145 he said that somewhere after 4030 cold boot broke for him. He's bisecting now.
Thanks,
Myles

----------
From: *Myles Watson* <[email protected]>
Date: Thu, Apr 23, 2009 at 6:15 AM
To: ron minnich <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

Just to add a wrinkle, my onboard graphics died. That's why things were flaky yesterday. It just stopped responding to config reads and gets disabled by coreboot. I added a video card and I'm back up.

Cold boot works for me with 4193 (No CBFS), so the Config changes were fine.

It's still broken for CBFS for me. Unless someone has an idea of how to track it down I'm just going to not use CBFS for now, even though I like the CBFS option much better.

Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Thu, Apr 23, 2009 at 7:57 AM
To: Myles Watson <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

we really need to track this down because whatever this may be, it's unlikely to be cbfs. Not if you're not getting any prints at all.

It would still be interesting if you could try the very first version where cbfs was introduced.

ron

----------
From: *Myles Watson* <[email protected]>
Date: Thu, Apr 23, 2009 at 12:04 PM
To: ron minnich <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

There were some fixes put in pretty quickly. I just tried 4113 (the rename.) Which one would you suggest next?

Thanks,
Myles

----------
From: *Myles Watson* <[email protected]>
Date: Thu, Apr 23, 2009 at 3:09 PM
To: ron minnich <[email protected]>

4061 fails with CBFS but not without.

Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 7:23 AM
To: Myles Watson <[email protected]>

no serial output and SPEW?
ron

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 7:38 AM
To: ron minnich <[email protected]>

For a warm boot. Nothing from a cold boot. 4061 no CBFS works fine.

Thanks,
Myles

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 7:40 AM
To: ron minnich <[email protected]>

SPEW is definitely enabled for the working one.

Thanks,
Myles

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 8:18 AM
To: ron minnich <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>

4061 on my s2892 with SPEW:

On my s2895 I am having problems with warm reset, and a cold boot powers itself off quickly with post code 0xf0.

Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 9:47 AM
To: Myles Watson <[email protected]>
Cc: Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Patrick Georgi <[email protected]>, Ward Vandewege <[email protected]>

OK, this is nuts. CBFS is in the ram code. It can't affect the ROM code, can it? And this is really early!

Here are the only things I can think of:
1. CBFS changes layout somehow
2. Turning off ELFBOOT turned off something hidden
3. it's changing the way gcc works

I just don't know. Somehow we've got to find this.

I will set up my dbm board tonight.

Patrick, Stefan, have you tested CBFS with the kontron?

ron

----------
From: *Patrick Georgi* <[email protected]>
Date: Fri, Apr 24, 2009 at 9:49 AM
To: ron minnich <[email protected]>
Cc: Myles Watson <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

On 24.04.2009 18:47, ron minnich wrote:

That's where my lzma.c patch came from. I'm debugging the bounce buffer code right now; it seems to copy correctly into the buffer, but I'm not convinced yet that it correctly copies back.
Patrick

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 9:55 AM
To: Patrick Georgi <[email protected]>
Cc: Myles Watson <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

but Myles's failure is WAY before any of that. His machine dies in the very earliest C code.

I do not really like the bounce buffer ... it's just too fragile for my taste. If anything goes wrong, well, you're in assembly code with no way out.

ron

----------
From: *Patrick Georgi* <[email protected]>
Date: Fri, Apr 24, 2009 at 10:35 AM
To: ron minnich <[email protected]>
Cc: Myles Watson <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

> OK, this is nuts. CBFS is in the ram code. It can't affect the ROM

I'm not quite sure at which point in the boot process the last message before the reboot comes up, so this is just a guess.

Could it be that it tries to jump into the normal image? I'm not quite certain that we get that entirely correct (and the layout might change in that dark corner of the build system). Replacing that "jmp __normal_image" with "jmp __fallback_image" might help then (for testing).

Patrick

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 10:42 AM
To: Patrick Georgi <[email protected]>
Cc: Myles Watson <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

On Fri, Apr 24, 2009 at 10:35 AM, Patrick Georgi

now *that* is a pretty smart guess. Myles, were you running fallback/normal?

ron

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 10:42 AM
To: ron minnich <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

fallback only. I'll try it.
Thanks,
Myles

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 10:57 AM
To: Myles Watson <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

src/lib/cbfs.c
src/include/cbfs.h
src/devices/pci_rom.c
src/boot/selfboot.c
src/boot/hardwaremain.c

But none of these are involved in the early CAR code.

There is another possibility: are we somehow messing up the HT configuration space? That would explain why you die after init_cpus.

ron

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:11 AM
To: Myles Watson <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

well, this problem just became urgent.

I've got no idea where to start and no time right now to work on it :-(

And it doesn't break on simnow, right, myles?

Patrick, any progress on kontron?

oh, !...@#$@!...@!#$!@$#

ron

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:12 AM
To: ron minnich <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

And why there's only output on a warm reset. I don't know. I tried removing the normal image jump in cache_as_ram_auto.c. I guess I should have remembered that it got past there before init_cpus.

Myles

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:13 AM
To: ron minnich <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

Right. The serengeti image works fine, and the s2892 image runs until it notices it has a different chipset and dies with "SMBUS not found."
Myles

----------
From: *ron minnich* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:15 AM
To: Myles Watson <[email protected]>
Cc: Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>, Ward Vandewege <[email protected]>

Anyone mind if I just take this to the list?

ron

----------
From: *Ward Vandewege* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:16 AM
To: ron minnich <[email protected]>
Cc: Myles Watson <[email protected]>, Patrick Georgi <[email protected]>, Marc Jones <[email protected]>, Stefan Reinauer <[email protected]>

Please do.

Thanks,
Ward.

--
Ward Vandewege <[email protected]>
Free Software Foundation - Senior Systems Administrator

----------
From: *Myles Watson* <[email protected]>
Date: Fri, Apr 24, 2009 at 11:21 AM
To: ron minnich <[email protected]>

No problem here. We probably should have done it a while ago. I just didn't want to make too big of a stink if we could fix it quickly.

Thanks,
Myles
--
coreboot mailing list: [email protected]
http://www.coreboot.org/mailman/listinfo/coreboot

