Hi,

While working on the TM support for CRIU, I hit a TM Bad Thing exception.

Digging further, I found that it is *easy* to raise it from user
space. I have attached below a simple program which raises it every time,
like this:

[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted  (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]>  CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0
PACATMSCRATCH: 900000014280f033
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8
[12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

The exception is raised when the kernel restores the TM SPRs from the
signal stack, an operation that is not allowed while in a transaction.

The sample test returns from the signal handler with a suspended
transaction still pending, while the signal itself was caught during a
transaction.

I can't see any straightforward way to fix this, other than clearing the
transactional state in the sigreturn path.

Please advise.

Cheers,
Laurent.
/*
 * !!! WARNING: THIS TEST MAY PANIC YOUR PPC64 SYSTEM !!!
 *
 * This sample program raises a kernel BUG in the sigreturn path when
 * sigreturn is called with the TM state suspended:

[12045.221359] Kernel BUG at c000000000050a40 [verbose debug info unavailable]
[12045.221470] Unexpected TM Bad Thing exception at c000000000050a40 (msr 0x201033)
[12045.221540] Oops: Unrecoverable exception, sig: 6 [#1]
[12045.221586] SMP NR_CPUS=2048 NUMA PowerNV
[12045.221634] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables kvm_hv kvm uio_pdrv_genirq ipmi_powernv uio powernv_rng ipmi_msghandler autofs4 ses enclosure scsi_transport_sas bnx2x ipr mdio libcrc32c
[12045.222167] CPU: 68 PID: 6178 Comm: sigreturnpanic Not tainted 4.7.0 #34
[12045.222224] task: c0000000fce38600 ti: c0000000fceb4000 task.ti: c0000000fceb4000
[12045.222293] NIP: c000000000050a40 LR: c0000000000163bc CTR: 0000000000000000
[12045.222361] REGS: c0000000fceb7ac0 TRAP: 0700   Not tainted  (4.7.0)
[12045.222418] MSR: 9000000300201033 <SF,HV,ME,IR,DR,RI,LE,TM[SE]>  CR: 28444280  XER: 20000000
[12045.222625] CFAR: c0000000000163b8 SOFTE: 0 
PACATMSCRATCH: 900000014280f033 
GPR00: 01100000b8000001 c0000000fceb7d40 c00000000139c100 c0000000fce390d0 
GPR04: 900000034280f033 0000000000000000 0000000000000000 0000000000000000 
GPR08: 0000000000000000 b000000000001033 0000000000000001 0000000000000000 
GPR12: 0000000000000000 c000000002926400 0000000000000000 0000000000000000 
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
GPR24: 0000000000000000 00003ffff98cadd0 00003ffff98cb470 0000000000000000 
GPR28: 900000034280f033 c0000000fceb7ea0 0000000000000001 c0000000fce390d0 
[12045.223535] NIP [c000000000050a40] tm_restore_sprs+0xc/0x1c
[12045.223584] LR [c0000000000163bc] tm_recheckpoint+0x5c/0xa0
[12045.223630] Call Trace:
[12045.223655] [c0000000fceb7d80] [c000000000026e74] sys_rt_sigreturn+0x494/0x6c0
[12045.223738] [c0000000fceb7e30] [c0000000000092e0] system_call+0x38/0x108
[12045.223806] Instruction dump:
[12045.223841] 7c800164 4e800020 7c0022a6 f80304a8 7c0222a6 f80304b0 7c0122a6 f80304b8 
[12045.223955] 4e800020 e80304a8 7c0023a6 e80304b0 <7c0223a6> e80304b8 7c0123a6 4e800020 
[12045.224074] ---[ end trace cb8002ee240bae76 ]---

 *
 */

#include <stdio.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <signal.h>


void handler(int sig)
{
    uint64_t ret;
    
    printf("Entering the signal handler\n");
    
    asm __volatile__(
	"li		3,1		;"
	"tbegin.			;"
	"beq		1f		;"
	"li		3,0		;"
	"tsuspend.			;"
	"1:				;"
	"std		3, %[ret]	;"
	: [ret]"=m"(ret)
	:
	: "memory", "3", "cr0");	/* tbegin. and tsuspend. update CR0 */

    if (ret) {
	printf("ERROR tbegin failed\n");
	exit(1);
    }
    
    /*
     * We return from the signal handler while in a suspended transaction.
     */
}

int main(void)
{
    struct sigaction sa;
    int ret;

    setbuf(stdout, NULL);
    
    memset(&sa, 0, sizeof(sa));

    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(SIGSEGV, &sa, NULL)) {
	perror("sigaction");
	exit(1);
    }

    /*
     * Raise the signal while in a TM transaction.
     * Note that we can't make a system call here, otherwise the
     * transaction is aborted.
     */
    asm __volatile__(
	"tbegin.			;"
	"beq		1f		;"
	"li		3,0		;"
	"std		3,0(3)		;" /* Oops! Store to address 0 */
	"1:				;"
	"tend.				;"
	:
	:
	: "memory", "3", "cr0");	/* r3 and CR0 are modified */

    printf("We should not get here!\n");
    exit(1);
}
