Hi all,
After dozens of days of struggling, I have implemented ACPI S3 with coreboot-v2
on the DBM690T. My notes are below; please help review them. Even though they
are quite preliminary and platform-specific, I would appreciate any comments.
Thank you in advance.
---------------------------------------------------------------------------------------------------------------------------------
Platform: DBM690T populated with 1G RAM, internal GFX
Fedora Core 6 shipping with the 2.6.18 kernel
The basic principle of ACPI S3 with coreboot-v2 on DBM690T:
Before the payload runs, two major memory regions are used by CAR and coreboot:
region 1: start address = 0x4000, length = 0x2ca0c (I think this only includes
code and constants); coreboot runs here after it is loaded.
region 2: start address = 0x1f8000, length = 0x8000; CAR uses this region as
the stack at its final stage.
Furthermore, some tables occupy other small memory regions, such as the IRQ
table, MP table, ACPI tables and coreboot table. When resuming, the contents of
all these regions must be protected from being overwritten by coreboot.
In the function post_cache_as_ram, just before the stack is switched from cache
to RAM, the lowest 2M of RAM is copied into the topmost 2M of RAM. On the
DBM690T, 1G of RAM is populated and the last 128M is reserved for the internal
GFX as video memory. Linux on FC6 runs in text mode on my DBM690T, so most of
the video memory is unused. The 2M of data can therefore be kept intact in the
topmost 2M of RAM, and no OS memory is destroyed by this move.
After all devices have been initialized, coreboot checks whether this is an S3
resume. If so, the 2M of data is copied back from the topmost 2M to the lowest
2M of RAM, and coreboot hands control to the OS by jumping to the waking vector
contained in the FACS table.
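The address arithmetic behind this scheme can be sketched as follows. This is only an illustration using the numbers from this note (1G RAM, 128M video memory, a 2M window); the macro and function names are made up for the example and do not exist in coreboot.

```c
#include <stdint.h>

/* Values taken from this note; they are specific to my DBM690T setup. */
#define RAM_SIZE   0x40000000u  /* 1G populated */
#define GFX_SIZE   0x08000000u  /* last 128M reserved for the internal GFX */
#define SAVE_SIZE  0x00200000u  /* lowest 2M that has to be preserved */

/* Start address of the topmost 2M, used as the backup area. */
static uint32_t backup_base(void)
{
    return RAM_SIZE - SAVE_SIZE;
}

/* Nonzero if [start, start + len) lies entirely inside the lowest 2M. */
static int in_saved_window(uint32_t start, uint32_t len)
{
    return start + len <= SAVE_SIZE;
}

/* Nonzero if the backup area lies inside the video memory. */
static int backup_in_video_memory(void)
{
    return backup_base() >= RAM_SIZE - GFX_SIZE;
}
```

With these numbers the backup area starts at 0x3fe00000, and both region 1 (0x4000 + 0x2ca0c) and region 2 (0x1f8000 + 0x8000, which ends exactly at 2M) fall inside the saved window.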
The source code is a big mess and full of tricks. Furthermore, it is very
platform-specific for now (DBM690T populated with 1G RAM, using the internal
GFX, and FC6 shipping with the 2.6.18 kernel) and so experimental that I cannot
check it in. The source code and comments are listed below. I hope they are
helpful to others.
STEP 1. Check the sleep state being resumed from and, in the function
post_cache_as_ram, move the lowest 2M of RAM to the topmost 2M of RAM:
/* BTDC: implement ACPI S3. */
sleep_type = inw(0x804); /* BTDC: hard-coded, modify it later. */
sleep_type = (sleep_type >> 10) & 0x07; /* BTDC: get the sleep type. */
print_debug_pcar("BTDC: sleep_type = \n", sleep_type);
if (3 == sleep_type)
{ /* BTDC: resuming from S3. */
    msr_t msr_tmp;
    u32 *pt;
    u32 temp;
    print_debug("BTDC: save the memory for OS resuming.\n");
    /* BTDC: at this point, only the fixed MTRR 0x269 for stack in cache
     * and the variable MTRR 0x202 for source code in FLASH are set, so if
     * I want to move the 2M data, I have to set some MTRRs myself.
     */
    __asm__ volatile (
        /* BTDC: disable the fixed MTRRs temporarily. */
        "movl $0xC0010010, %ecx\n\t"
        "rdmsr\n\t"
        "andl $(~(3<<18)), %eax\n\t"
        "wrmsr\n\t"
        /* BTDC: enable the default MTRR, so I can access the whole RAM. */
        "movl $0x2ff, %ecx\n\t"
        "xorl %edx, %edx\n\t"
        "movl $0x00000800, %eax\n\t"
        "wrmsr\n\t"
    );
    /* BTDC: save the memory for OS resuming. */
    pt = (u32 *)0;
    temp = *pt;
    print_debug_pcar("\nBTDC: the memory of 0 = ", temp);
    pt = (u32 *)(2*1024*1024 - 4);
    temp = *pt;
    print_debug_pcar("\nBTDC: the memory of 2M = ", temp);
    memcopy((void *)0x3fe00000, (void *)0, 2*1024*1024); /* BTDC: move the 2M data. */
    pt = (u32 *)0x3fe00000;
    temp = *pt;
    print_debug_pcar("\nBTDC: the memory of mirror 0 = ", temp);
    pt = (u32 *)(2*1024*1024 - 4 + 0x3fe00000);
    temp = *pt;
    print_debug_pcar("\nBTDC: the memory of mirror 2M = ", temp);
    /* BTDC: restore the MTRR previously modified. */
    __asm__ volatile (
        /* BTDC: enable the fixed MTRRs again. */
        "movl $0xC0010010, %ecx\n\t"
        "rdmsr\n\t"
        "orl $(3<<18), %eax\n\t"
        "wrmsr\n\t"
        /* BTDC: set both the MTRR enable and fixed-range enable bits again. */
        "movl $0x2ff, %ecx\n\t"
        "xorl %edx, %edx\n\t"
        "movl $0x00000c00, %eax\n\t"
        "wrmsr\n\t"
    );
    print_debug("BTDC: saving finished\n");
}
/* BTDC: just before the 2M data is cleared. */
set_init_ram_access(); /* So we can access RAM from [1M, CONFIG_LB_MEM_TOPK) */
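The sleep type check above boils down to extracting the 3-bit SLP_TYP field (bits 10 to 12) from the PM1 control register value. A minimal sketch of that decoding, separated from the port I/O so it can be tested on its own, might look like this. The helper names are hypothetical, and the value 3 for S3 is simply what my board reports; other platforms may encode S3 differently.

```c
#include <stdint.h>

#define SLP_TYP_SHIFT 10
#define SLP_TYP_MASK  0x07
#define SLP_TYP_S3    3    /* assumption: the S3 encoding seen on my DBM690T */

/* Extract the SLP_TYP field from a PM1 control register value. */
static unsigned int slp_typ(uint16_t pm1_cnt)
{
    return (pm1_cnt >> SLP_TYP_SHIFT) & SLP_TYP_MASK;
}

/* Nonzero if the register value indicates that we are resuming from S3. */
static int is_s3_resume(uint16_t pm1_cnt)
{
    return slp_typ(pm1_cnt) == SLP_TYP_S3;
}
```

In the code above the register value would come from inw(0x804), where 0x804 is the hard-coded PM1 control port on my board.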
STEP 2. If the system is resuming from S3, move the 2M of data from the topmost
2M of memory back into the lowest 2M in the function hardwaremain, and jump to
the waking vector in the FACS table instead of loading the payload.
First, I set up a GDT and two pseudo-descriptors, for the GDT and the IDT, at
0x3fdfffe8, 0x3fdfffe0 and 0x3fdfffd8 respectively, for switching from 32-bit
to 16-bit code.
memcpy((void *)(0x3fe00000-sizeof(real_mode_gdt_entries)),
real_mode_gdt_entries, sizeof(real_mode_gdt_entries));
memcpy((void *)0x3fdfffe0, real_mode_gdt, sizeof(real_mode_gdt));
memcpy((void *)0x3fdfffd8, real_mode_idt, sizeof(real_mode_idt));
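The three addresses follow from stacking the structures downwards from the start of the backup area. A small sketch of that layout, assuming a 3-entry GDT and 8-byte pseudo-descriptors as on a 32-bit build, is given here. The struct and helper names are invented for the example, and fixed-width types are used so that the sizes do not depend on the host.

```c
#include <stdint.h>

#define BACKUP_BASE 0x3fe00000u  /* start of the topmost 2M on my board */
#define GDT_ENTRIES 3

/* Same shape as a 32-bit GDT/IDT pseudo-descriptor plus padding: 8 bytes. */
struct pseudo_desc {
    uint16_t size;
    uint32_t address;
    uint16_t pad;
} __attribute__((packed));

/* The GDT copy sits directly below the backup area. */
static uint32_t gdt_copy_addr(void)
{
    return BACKUP_BASE - GDT_ENTRIES * (uint32_t)sizeof(uint64_t);
}

/* The GDT pseudo-descriptor sits directly below the GDT copy. */
static uint32_t gdt_desc_addr(void)
{
    return gdt_copy_addr() - (uint32_t)sizeof(struct pseudo_desc);
}

/* The IDT pseudo-descriptor sits directly below the GDT pseudo-descriptor. */
static uint32_t idt_desc_addr(void)
{
    return gdt_desc_addr() - (uint32_t)sizeof(struct pseudo_desc);
}
```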
The three tables are as below, thanks to Rudolf:
unsigned long long real_mode_gdt_entries [3] =
{
    0x0000000000000000ULL, /* Null descriptor */
    0x008f9b000000ffffULL, /* 16-bit code, base 0x00000000, limit 0xfffff with 4K granularity */
    0x008f93000000ffffULL  /* 16-bit data, base 0x00000000, limit 0xfffff with 4K granularity */
};
struct Xgt_desc_struct {
    unsigned short size;
    unsigned long address __attribute__((packed));
    unsigned short pad;
} __attribute__ ((packed));
struct Xgt_desc_struct real_mode_gdt = { sizeof (real_mode_gdt_entries) - 1,
    (long)real_mode_gdt_entries };
struct Xgt_desc_struct real_mode_idt = { 0x3ff, 0 };
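To double-check what the two non-null GDT entries actually describe, the 64-bit values can be decoded field by field. The sketch below follows the standard x86 segment descriptor layout; the struct and function names are hypothetical and only serve the illustration.

```c
#include <stdint.h>

struct seg_info {
    uint32_t base;      /* 32-bit segment base */
    uint32_t limit;     /* raw 20-bit limit field */
    uint8_t  access;    /* access byte (present, DPL, type) */
    int      granular;  /* G bit: limit counted in 4K units when set */
    int      is_32bit;  /* D/B bit: 0 means a 16-bit segment */
};

/* Split a raw 8-byte segment descriptor into its fields. */
static struct seg_info decode_descriptor(uint64_t d)
{
    struct seg_info s;

    s.limit    = (uint32_t)(d & 0xffff) | (uint32_t)((d >> 32) & 0x000f0000);
    s.base     = (uint32_t)((d >> 16) & 0x00ffffff) |
                 (uint32_t)((d >> 32) & 0xff000000);
    s.access   = (uint8_t)(d >> 40);
    s.granular = (int)((d >> 55) & 1);
    s.is_32bit = (int)((d >> 54) & 1);
    return s;
}
```

Decoding 0x008f9b000000ffff this way gives base 0, limit 0xfffff, G = 1 and D = 0, i.e. a 16-bit code segment that spans the whole 4G, which is what lets the stub keep running above 1M after the far jump. It may also be worth checking that the base stored in real_mode_gdt refers to the copy at 0x3fdfffe8 rather than to the original array, since the original presumably lives in the low memory that gets overwritten by the copy-back; this is an observation from reading the listing, not something tested.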
Then, a small piece of machine code is copied to memory at 0x3fdfff00:
unsigned char bincode[] = {
    /* BTDC: move 2M data back into the lowest 2M RAM. */
    0xbe, 0x00, 0x00, 0xe0, 0x3f,             /* mov esi, 3fe00000h */
    0xbf, 0x00, 0x00, 0x00, 0x00,             /* mov edi, 00000000h */
    0xb9, 0x00, 0x00, 0x08, 0x00,             /* mov ecx, 00080000h */
    0xfc,                                     /* cld */
    0xf3, 0xa5,                               /* rep movsd es:[edi], ds:[esi] */
    /* BTDC: load new GDT and IDT for 32bit-16bit opcode switching. */
    0xfa,                                     /* cli */
    0x0f, 0x01, 0x1d, 0xd8, 0xff, 0xdf, 0x3f, /* lidt [3fdfffd8h] */
    0x0f, 0x01, 0x15, 0xe0, 0xff, 0xdf, 0x3f, /* lgdt [3fdfffe0h] */
    0xb8, 0x10, 0x00, 0x00, 0x00,             /* mov eax, 00000010h */
    0x8e, 0xd8,                               /* mov ds, ax */
    0x8e, 0xc0,                               /* mov es, ax */
    0x8e, 0xe0,                               /* mov fs, ax */
    0x8e, 0xe8,                               /* mov gs, ax */
    0x8e, 0xd0,                               /* mov ss, ax */
    0xea, 0x37, 0xff, 0xdf, 0x3f, 0x08, 0x00, /* jmp $0x0008:$0x3fdfff37 */
    0x90,                                     /* nop */
    0x90,                                     /* nop */
    /* BTDC: enter the real mode. */
    0x0f, 0x20, 0xc0,                         /* movl %cr0, %eax */
    0x24, 0xfe,                               /* andb $0xfe, %al */
    0x0f, 0x22, 0xc0,                         /* movl %eax, %cr0 */
    /* BTDC: jump into the waking vector in FACS. */
    /* BTDC: hard-coded and platform-specific: FC6 with 2.6.18. modify it later. */
    0xea, 0x00, 0x00, 0x00, 0x02              /* jmp 0x200:0x00 */
};
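Because the stub is hand-assembled, the far jump target has to be kept in sync with the instruction lengths by hand. A small host-side check along the following lines can catch an off-by-one. It is only a sketch with made-up helper names, assuming the stub is loaded at 0x3fdfff00 and that the first far jump uses the 7-byte ptr16:32 encoding (opcode 0xea, 32-bit offset, 16-bit selector).

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian 32-bit value from a byte array. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Nonzero if the ptr16:32 far jump at stub[off] targets the instruction
 * that immediately follows it, given the address the stub is loaded at.
 */
static int far_jmp_hits_next(const unsigned char *stub, size_t off,
                             uint32_t load_base)
{
    if (stub[off] != 0xea)
        return 0;
    return le32(&stub[off + 1]) == load_base + (uint32_t)off + 7;
}

/* Linear address of a real-mode segment:offset pair. */
static uint32_t real_mode_linear(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

In bincode the first far jump starts at byte offset 48, so its target should be 0x3fdfff00 + 48 + 7 = 0x3fdfff37, which matches the listing. The final jump to 0x200:0x00 corresponds to linear address 0x2000.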
In my case, the FACS table is located at 0xf0620 and the waking vector is
0x2000; I hard-coded it just to make life easier. The OS then takes control
from coreboot, and Linux resumes from S3.
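Instead of hard-coding it, the waking vector could be read from the FACS at run time. The sketch below assumes the standard FACS layout (4-byte "FACS" signature, 4-byte length, 4-byte hardware signature, then the 32-bit firmware waking vector at offset 12) and the usual real-mode convention of segment = address >> 4, offset = address & 0xf. The function names are hypothetical, and locating the FACS itself is left out.

```c
#include <stdint.h>
#include <string.h>

#define FACS_WAKING_VECTOR_OFFSET 12  /* after signature, length, hw signature */

/*
 * Read the 32-bit firmware waking vector from a FACS image.
 * Returns 0 on success, -1 if the signature does not match.
 */
static int facs_waking_vector(const unsigned char *facs, uint32_t *vector)
{
    const unsigned char *p = facs + FACS_WAKING_VECTOR_OFFSET;

    if (memcmp(facs, "FACS", 4) != 0)
        return -1;
    *vector = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
              ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return 0;
}

/* Split a physical address below 1M into a real-mode segment:offset pair. */
static void vector_to_seg_off(uint32_t vector, uint16_t *seg, uint16_t *off)
{
    *seg = (uint16_t)(vector >> 4);
    *off = (uint16_t)(vector & 0xf);
}
```

For a waking vector of 0x2000 this yields 0x200:0x0, the same target as the hard-coded far jump at the end of bincode; the last four bytes of the stub would then have to be patched with these values before the copy.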
Now, two questions come to mind:
1. Is it really necessary for CAR to move the stack from 0xc8000 in cache to
0x1f8000 in RAM at its final stage? Given that the stack works well in cache,
why does CAR move it into RAM? To verify RAM, or for some other reason?
2. When resuming from S3, I initialize RAM again instead of exiting
self-refresh. Luckily, the RAM content is kept intact this way as well. I will
try exiting self-refresh later.
My first attempt was to jump to the waking vector in the function
post_cache_as_ram. At that point RAM is accessible, so I could get the waking
vector. However, many devices were not yet initialized and the system was very
unstable: I got a different trace every time, and the best one is shown below.
So, after being stuck for a couple of days, I gave up and followed Rudolf's
approach as above.
============================================================
usbdev5.1_ep81: PM: suspend 0->2, parent 5-0:1.0 already 1
usbdev4.1_ep81: PM: suspend 0->2, parent 4-0:1.0 already 1
usbdev3.1_ep81: PM: suspend 0->2, parent 3-0:1.0 already 1
usbdev2.1_ep81: PM: suspend 0->2, parent 2-0:1.0 already 1
BUG: sleeping function called from invalid context at kernel/rwsem.c:20
in_atomic():0, irqs_disabled():1
Call Trace:
[<ffffffff8026929b>] show_trace+0x34/0x47
[<ffffffff802692c0>] dump_stack+0x12/0x17
[<ffffffff8029dc68>] down_read+0x15/0x23
[<ffffffff80296254>] blocking_notifier_call_chain+0x13/0x36
[<ffffffff803fde58>] cpufreq_resume+0x129/0x14c
DWARF2 unwinder stuck at cpufreq_resume+0x129/0x14c
Leftover inexact backtrace:
[<ffffffff803a4350>] __sysdev_resume+0x2a/0x66
[<ffffffff803a44fe>] sysdev_resume+0x1d/0x63
[<ffffffff803a8efc>] device_power_up+0x9/0xf
[<ffffffff802a5f42>] suspend_enter+0x3e/0x47
[<ffffffff802a608e>] enter_state+0x143/0x19b
[<ffffffff802a6155>] state_store+0x5e/0x79
[<ffffffff802fb1ac>] sysfs_write_file+0xca/0xf9
[<ffffffff802162b6>] vfs_write+0xce/0x174
[<ffffffff80216b26>] sys_write+0x45/0x6e
[<ffffffff8025c00e>] system_call+0x7e/0x83
PCI: Enabling device 0000:00:13.0 (0000 -> 0002)
PCI: Enabling device 0000:00:13.1 (0000 -> 0002)
PCI: Enabling device 0000:00:13.2 (0000 -> 0002)
PCI: Enabling device 0000:00:13.3 (0000 -> 0002)
PCI: Enabling device 0000:00:13.4 (0000 -> 0002)
PCI: Enabling device 0000:00:13.5 (0000 -> 0002)
PCI: Enabling device 0000:00:14.2 (0000 -> 0002)
hda_intel: azx_get_response timeout, switching to single_cmd mode...
Restarting tasks... done
Enabling non-boot CPUs ...
================================================================
By the way, the Linux (2.6.18) resume sequence is as below, if you are
curious:
The waking vector points to wakeup_code in the file
arch/x86_64/kernel/acpi/wakeup.S.
Then the function restore_processor_state in the file
arch/x86_64/kernel/suspend.c is called.
Then the function __restore_processor_state in the same file is called.
At that point I got lost; I could not match the assembly with the C code any
more.
Best Regards
丰立波 Feng Libo @ AMD Ext: 20906
Mobile Phone: 13683249071
Office Phone: 0086-010-62801406
--
coreboot mailing list: [email protected]
http://www.coreboot.org/mailman/listinfo/coreboot