On Mon, Dec 19, 2016 at 9:55 AM, Chauhan, Himanshu
<hschau...@nulltrace.org> wrote:
> On Mon, Dec 19, 2016 at 9:09 PM, Aaron Durbin <adur...@google.com> wrote:
>> On Sun, Dec 18, 2016 at 11:04 PM, Chauhan, Himanshu
>> <hschau...@nulltrace.org> wrote:
>>> On Mon, Dec 19, 2016 at 12:40 AM, Aaron Durbin <adur...@google.com> wrote:
>>>> On Sun, Dec 18, 2016 at 9:37 AM, Chauhan, Himanshu
>>>> <hschau...@nulltrace.org> wrote:
>>>>> Hi Aaron,
>>>>>
>>>>> I figured out the crash. It wasn't because of a wrong load of the ROM
>>>>> image (thanks to the nifty post_code, which I could trap on IO). I see
>>>>> that the page fault I am getting is in the following code:
>>>>> (gdb) list *(((0xfff81e41 - 0xfff80000)-200)+0x2000000)
>>>>
>>>> I'm curious about the 200 and 16MiB offset being applied.
>>>
>>> 0x2000000 is the new address where romstage is linked. Earlier
>>> (at least in 2014) the linked address used to be 0xfff80000. This is
>>> the same address (guest physical) where I map the ROM code. In the
>>> above calculation I am taking the offset from 0xfff80000 and adding it
>>> to the link address of romstage (0x2000000). The 0x200 is the difference
>>> I see to map the addresses correctly. This calculation seems fine to
>>> me because with it I am able to pinpoint all the earlier faults and
>>> the post_code trap rIP.
>>>
>>
>> If you provide 'cbfstool print -k' output, I could most likely provide
>> the exact offset mapping. Alternatively you could extract the
>> romstage.elf from the image using 'cbfstool extract -m x86', but it
>> won't have debug info. But it'd provide the information to compare
>> against the pre-relocated image for the correct mapping.
>>
> How exactly do I run it? It says unknown option -k (cbfstool in the build
> directory).

./coreboot-builds/sharedutils/cbfstool/cbfstool coreboot-builds/GOOGLE_REEF/coreboot.rom print -k

That's an example from building reef with abuild. How old is your
coreboot checkout?
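[Aside, for anyone following along: the by-hand translation in the
quoted gdb expression is a constant slide, sketched below. Note that gdb
parses the bare 200 as decimal, which is what actually produces the
0x2001d79 shown just below, even though the explanation says 0x200. The
function and variable names here are made up for illustration; the real
offset is the entry-point difference Aaron describes further down in
this mail.]

/*
 * Illustrative sketch, not coreboot code: cbfstool relocates romstage
 * for execute-in-place (XIP), so every runtime address differs from
 * the romstage.elf link-time address by one constant.
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t xip_to_link(uint32_t rip, uint32_t xip_entry,
                            uint32_t elf_entry)
{
        /* The relocation is a pure slide: one offset maps all addresses. */
        return rip - (xip_entry - elf_entry);
}

int main(void)
{
        /* Numbers from this thread: rIP 0xfff81e41, XIP base 0xfff80000
         * plus a slide of 200 (decimal), link base 0x2000000. */
        printf("%#x\n",
               (unsigned)xip_to_link(0xfff81e41, 0xfff80000 + 200, 0x2000000));
        /* Prints 0x2001d79, the imd_recover address gdb resolved. */
        return 0;
}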
>
>>>>
>>>>> 0x2001d79 is in imd_recover (src/lib/imd.c:139).
>>>>> 134
>>>>> 135 static void imdr_init(struct imdr *ir, void *upper_limit)
>>>>> 136 {
>>>>> 137         uintptr_t limit = (uintptr_t)upper_limit;
>>>>> 138         /* Upper limit is aligned down to 4KiB */
>>>>> 139         ir->limit = ALIGN_DOWN(limit, LIMIT_ALIGN);
>>>>> 140         ir->r = NULL;
>>>>> 141 }
>>>>> 142
>>>>> 143 static int imdr_create_empty(struct imdr *imdr, size_t root_size,
>>>>>
>>>>> I see that this function is being called multiple times (I added some
>>>>> more post_code calls and see them being trapped). I get a series of
>>>>> page faults, of which I am able to honour all but the last.
>>>>
>>>> I don't see how imdr_init would be faulting. That's just assigning
>>>> fields of a struct sitting on the stack. What's your stack pointer
>>>> value at the time of the faults?
>>>
>>> "ir" should be on the stack or on top of the RAM. Right now it looks
>>> like it's on top of the RAM. That area is not mapped initially. On a
>>> page fault, I map a 4K page. For reference, the following is the
>>> register dump of coreboot. RSP is 0x9fe54.
>>>
>>
>> The values should not be striding. That object is always on the stack.
>> Where the stack is located could be in low or high memory. I still
>> need to know what platform you are targeting for the image to provide
>> details. However, it would not be striding.
>
> I am building this for qemu i440-fx.

OK. What is your cmos emulation returning at addresses 0x34, 0x35,
0x5d, 0x5c and 0x5b? I also don't understand why we're adding 16MiB to
qemu_get_memory_size() unconditionally.
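[Aside: the probe in question lives in
src/mainboard/emulation/qemu-i440fx/memory.c. Below is a sketch of the
scheme from memory, not the verbatim source, with port I/O helpers in
coreboot's outb(value, port)/inb(port) style. The 16MiB add exists
because qemu's CMOS registers 0x34/0x35 only report the RAM *above*
16MiB (in 64KiB units), while 0x5b/0x5c/0x5d report RAM above 4GiB
(also in 64KiB units).]

#include <stdint.h>

/* The usual x86 port I/O, as declared in coreboot's arch/io.h. */
static inline void outb(uint8_t value, uint16_t port)
{
        __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}

static inline uint8_t inb(uint16_t port)
{
        uint8_t value;
        __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
        return value;
}

/* Read one CMOS/RTC register through the index/data ports 0x70/0x71. */
static uint8_t cmos_read(uint8_t addr)
{
        outb(addr, 0x70);
        return inb(0x71);
}

/* RAM below 4GiB, in KiB. CMOS 0x34/0x35 count 64KiB units above
 * 16MiB, hence the unconditional "+ 16MiB" being asked about. */
static unsigned long qemu_get_memory_size(void)
{
        unsigned long kib;

        kib  = cmos_read(0x34);
        kib |= (unsigned long)cmos_read(0x35) << 8;
        kib <<= 6;              /* 64KiB units -> KiB */
        kib += 16 * 1024;       /* first 16MiB is not counted in 0x34/0x35 */
        return kib;
}

/* RAM above 4GiB, in KiB: CMOS 0x5b/0x5c/0x5d, same 64KiB units. */
static unsigned long long qemu_get_high_memory_size(void)
{
        unsigned long long kib;

        kib  = cmos_read(0x5b);
        kib |= (unsigned long long)cmos_read(0x5c) << 8;
        kib |= (unsigned long long)cmos_read(0x5d) << 16;
        return kib << 6;
}

One arithmetic note: the one fault that cannot be honoured is at
0xfffffc, which is exactly 16MiB - 4. If the emulated 0x34/0x35 ever
read back as zero, this scheme would report a 16MiB RAM top, so the
values the cmos emulator returns are worth double-checking.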
>
>>
>>> GUEST guest0/vcpu0 dump state:
>>>
>>> RAX: 0x9fe80 RBX: 0xfffff8 RCX: 0x1b RDX: 0x53a11439
>>> R08: 0x0 R09: 0x0 R10: 0x0 R11: 0x0
>>> R12: 0x0 R13: 0x0 R14: 0x0 R15: 0x0
>>> RSP: 0x9fe54 RBP: 0xa0000 RDI: 0xfff801e4 RSI: 0x9fe80
>>> RIP: 0xfff81e41
>>>
>>> CR0: 0xe0000011 CR2: 0x0 CR3: 0xa23000 CR4: 0x0
>>> CS : Sel: 0x00000008 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 11)
>>> DS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>> ES : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>> SS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>> FS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>> GS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>> GDT : Sel: 0x00000000 Limit: 0x0000001f Base: 0xfff80200 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>> LDT : Sel: 0x00000000 Limit: 0x0000ffff Base: 0x00000000 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>> IDT : Sel: 0x00000000 Limit: 0x00000000 Base: 0x00000000 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>> TR : Sel: 0x00000000 Limit: 0x0000ffff Base: 0x00000000 (G: 1 DB: 0 L: 1 AVL: 1 P: 0 DPL: 0 S: 0 Type: 0)
>>> RFLAGS: 0xa [ ]
>>>
>>>
>>>>>
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7fffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7effc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7dffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7cffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7bffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f7affc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f79ffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f78ffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f77ffc (rIP: 00000000FFF81E41)
>>>>> (__handle_vm_exception:543) Guest fault: 0x7f76ffc (rIP: 00000000FFF81E41)
>>>>> <snip>
>>>>
>>>> Are those non-rIP addresses the page fault address?
>>>
>>> Guest fault: 0x7f7fffc is the address which I think is pointing to
>>> "ir". If you look, all the faulting addresses are 4K apart, which is my
>>> default page size for mapping all the guest pages. It also means that
>>> each of the multiple times "imdr_init" is called, it faults on a
>>> different address, hence the same rIP.
>>
>> I just don't see how we're using that much stack. That doesn't seem
>> right at all.
>>
>
> Yes. Something is terribly wrong. I had this working back in 2014.
> Please take a look at this video that I created at that time.
> https://www.youtube.com/watch?v=jPAzzLQ0NgU

I see you do have a serial port. It'd be interesting to get full logs
when the thing is booting to see where it goes off the rails.

>
> I couldn't work on it for quite some time and in the meantime coreboot
> changed a lot. I have one question. In earlier coreboot images,
> romstage was linked at 0xfff80000 and now it's at 0x2000000. Any reason?

It's just linked at CONFIG_ROMSTAGE_ADDR to avoid a double link step.
It's linked once and cbfstool relocates the image when placing it into
CBFS. It previously was linked at a specific address, then the xip
address was calculated by performing a pseudo CBFS add operation. Then
romstage was re-linked and added to CBFS. The offset for address
translation is the entry point difference between the 2 ELF files. You
can extract the one in coreboot.rom to get the entry point of the
romstage being run.
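[Aside: concretely, that comparison can be done along these lines. The
CBFS file name ("fallback/romstage") and the build-tree location of the
linked ELF are assumptions here and may differ per checkout:

  cbfstool coreboot.rom extract -m x86 -n fallback/romstage -f romstage-xip.elf
  readelf -h romstage-xip.elf
  readelf -h build/cbfs/fallback/romstage.elf

readelf -h prints each file's "Entry point address"; their difference
is the constant to subtract from a runtime rIP to land back in
romstage.elf's link-time address space, i.e. the slide from the sketch
near the top of this mail.]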
>
>>>
>>>>
>>>>> handle_guest_realmode_page_fault: offset: 0x3ffc fault: 0x1003ffc reg: 0x1000000
>>>>> handle_guest_realmode_page_fault: offset: 0x2ffc fault: 0x1002ffc reg: 0x1000000
>>>>> handle_guest_realmode_page_fault: offset: 0x1ffc fault: 0x1001ffc reg: 0x1000000
>>>>> handle_guest_realmode_page_fault: offset: 0xffc fault: 0x1000ffc reg: 0x1000000
>>>>
>>>> What is the above detailing? I'm not sure what the 'fault' value means.
>>>
>>> These are the same as the Guest fault lines above. You can disregard them.
>>>
>>>>
>>>>>
>>>>> (__handle_vm_exception:561) ERROR: No region mapped to guest physical: 0xfffffc
>>>>>
>>>>> I want to understand why imd_recover gets called multiple times,
>>>>> starting from the top of memory (128MB is what I have assigned to
>>>>> the guest) down to 16MB at the last (after which I can't honour the
>>>>> fault). There is something amiss in my understanding of the coreboot
>>>>> memory map.
>>>>>
>>>>> Could you please help?
>>>>
>>>> The imd library contains the implementation of cbmem. See
>>>> include/cbmem.h for more details, but how it works is that the
>>>> platform needs to supply the implementation of cbmem_top(), which
>>>> defines the exclusive upper boundary to start growing entries
>>>> downward from. There are large and small object sizes, with large
>>>> blocks being 4KiB in size and small blocks being 32 bytes. I don't
>>>> understand why the faulting addresses are offset from 128MiB by
>>>> 512KiB with a 4KiB stride.
>>>>
>>>> What platform are you targeting for your coreboot build? Are you
>>>> restarting the instruction that faults? I'm really curious about the
>>>> current fault patterns. It looks like things are faulting around
>>>> accessing the imd_root_pointer root_offset field. Are these faults
>>>> reads or writes? However, that's assuming cbmem_top() is returning
>>>> 128MiB-512KiB. And it doesn't explain the successive strides. Do
>>>> you have serial port emulation to get the console messages out?
>>>>
>>>> So in your platform code ensure 2 things are happening:
>>>>
>>>> 1. cbmem_top() returns the highest address in 'ram' of the guest once
>>>> it's online. 128MiB if that's your expectation. The value cbmem_top()
>>>> returns should never change across successive calls, aside from NULL
>>>> being returned when ram is not yet available.
>>>> 2. cbmem_initialize_empty() is called one time once the 'ram' is
>>>> online for use in the non-S3 resume path, and cbmem_initialize() in
>>>> the S3 resume path. If S3 isn't supported in your guest then just use
>>>> cbmem_initialize_empty().
>>>>
>>>
>>> I will look into it. I see that the RAM top is being provided by the
>>> CMOS emulator. I will look at cbmem_initialize_empty().
>>
>> If you could provide me the info on the platform you are targeting
>> coreboot builds with, it'd be easier to analyze. Where is this 'CMOS
>> emulator' and why is it needed?
>
> Coreboot calls on port 0x34/0x35 to get the amount of memory. The cmos
> emulator traps these (just like qemu) and provides that information to
> coreboot.
>

cbmem_recovery(0) is effectively cbmem_initialize_empty(). That's being
called in src/mainboard/emulation/qemu-i440fx/romstage.c. Your RSP
value of 0x9fe54 aligns with
src/mainboard/emulation/qemu-i440fx/cache_as_ram.inc using 0xa0000 as
the initial stack. So I don't think imd_recover() is your culprit. It
feels like something is changing the value of cbmem_top().
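[Aside: to make requirement 1 above concrete, here is a minimal
i440fx-style cbmem_top() sketch. This is not the actual mainboard code;
qemu_get_memory_size() is the CMOS-based probe sketched earlier in this
mail, and the caching exists purely to illustrate the "never changes"
property Aaron describes.]

#include <stdint.h>

/* From the earlier sketch: RAM below 4GiB, in KiB, derived from CMOS. */
unsigned long qemu_get_memory_size(void);

/* Exclusive upper boundary that cbmem/imd grows entries downward from.
 * Once RAM is online this value must be stable: if it wandered between
 * calls, imd would look for its root block under a different "top" each
 * time, which is one way to end up with a striding fault pattern like
 * the one in this thread. */
void *cbmem_top(void)
{
        static uintptr_t top;

        if (!top)
                top = (uintptr_t)qemu_get_memory_size() * 1024;

        /* For a 128MiB guest this is 0x8000000. */
        return (void *)top;
}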
>>
>>>
>>>>>
>>>>> Regards
>>>>> Himanshu
>>>>>
>>>>> On Wed, Dec 14, 2016 at 9:27 PM, Chauhan, Himanshu
>>>>> <hschau...@nulltrace.org> wrote:
>>>>>> Hi Aaron,
>>>>>>
>>>>>> Yes, I am mapping the memory where coreboot.rom is loaded to the
>>>>>> upper 4GiB. I create a fixed shadow page table entry for the reset
>>>>>> vector.
>>>>>>
>>>>>> Coreboot doesn't have a linked address at the RIP that I shared. I
>>>>>> think with the increase in size of coreboot (from the previous tag
>>>>>> I was using) the load address (guest physical) has changed. I used
>>>>>> to calculate the load address manually. I will check this and get
>>>>>> back.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> On Wed, Dec 14, 2016 at 8:17 PM, Aaron Durbin <adur...@google.com> wrote:
>>>>>>>
>>>>>>> On Wed, Dec 14, 2016 at 3:11 AM, Chauhan, Himanshu
>>>>>>> <hschau...@nulltrace.org> wrote:
>>>>>>> > Hi,
>>>>>>> >
>>>>>>> > I am working on a hypervisor and am using coreboot + FILO as the
>>>>>>> > guest BIOS. While things were fine a while back, it has stopped
>>>>>>> > working. I see that my hypervisor can't handle address 0xFFFFFC
>>>>>>> > while coreboot's RIP is at 0xfff81e41.
>>>>>>>
>>>>>>> How are you loading up coreboot.rom in the VM? Are you just memory
>>>>>>> mapping it at the top of the 4GiB address space? If so, what does
>>>>>>> 'cbfstool coreboot.rom print' show?
>>>>>>>
>>>>>>> > The exact register dump of the guest is as follows:
>>>>>>> >
>>>>>>> > [guest0/uart0] (__handle_vm_exception:558) ERROR: No region mapped to guest
>>>>>>> > physical: 0xfffffc
>>>>>>> >
>>>>>>> > GUEST guest0/vcpu0 dump state:
>>>>>>> >
>>>>>>> > RAX: 0x9fe80 RBX: 0xfffff8 RCX: 0x1b RDX: 0x53a11439
>>>>>>> > R08: 0x0 R09: 0x0 R10: 0x0 R11: 0x0
>>>>>>> > R12: 0x0 R13: 0x0 R14: 0x0 R15: 0x0
>>>>>>> > RSP: 0x9fe54 RBP: 0xa0000 RDI: 0xfff801e4 RSI: 0x9fe80
>>>>>>> > RIP: 0xfff81e41
>>>>>>> >
>>>>>>> > CR0: 0xe0000011 CR2: 0x0 CR3: 0xa23000 CR4: 0x0
>>>>>>> > CS : Sel: 0x00000008 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 11)
>>>>>>> > DS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>>>>>> > ES : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>>>>>> > SS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>>>>>> > FS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>>>>>> > GS : Sel: 0x00000010 Limit: 0xffffffff Base: 0x00000000 (G: 1 DB: 1 L: 0 AVL: 0 P: 1 DPL: 0 S: 1 Type: 3)
>>>>>>> > GDT : Sel: 0x00000000 Limit: 0x0000001f Base: 0xfff80200 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>>>>>> > LDT : Sel: 0x00000000 Limit: 0x0000ffff Base: 0x00000000 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>>>>>> > IDT : Sel: 0x00000000 Limit: 0x00000000 Base: 0x00000000 (G: 0 DB: 0 L: 0 AVL: 0 P: 0 DPL: 0 S: 0 Type: 0)
>>>>>>> > TR : Sel: 0x00000000 Limit: 0x0000ffff Base: 0x00000000 (G: 1 DB: 0 L: 1 AVL: 1 P: 0 DPL: 0 S: 0 Type: 0)
>>>>>>> > RFLAGS: 0xa [ ]
>>>>>>> >
>>>>>>> > I want to know which binary file (.o) I should disassemble to
>>>>>>> > look at the RIP.
>>>>>>> >
>>>>>>> > I was looking at
>>>>>>> > objdump -D -mi386 -Maddr16,data16 generated/ramstage.o
>>>>>>> >
>>>>>>> > but this is prior to linking and thus only has offsets.
>>>>>>> >
>>>>>>> > --
>>>>>>> >
>>>>>>> > Regards
>>>>>>> > [Himanshu Chauhan]

--
coreboot mailing list: coreboot@coreboot.org
https://www.coreboot.org/mailman/listinfo/coreboot