Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86
On Wed, Oct 12, 2016 at 06:30:09PM +0300, Andriy Gapon wrote:
> On 12/10/2016 16:45, Konstantin Belousov wrote:
> > On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
> >> On 04/10/2016 20:01, Konstantin Belousov wrote:
> >>> Author: kib
> >>> Date: Tue Oct 4 17:01:24 2016
> >>> New Revision: 306680
> >>> URL: https://svnweb.freebsd.org/changeset/base/306680
> >>>
> >>> Log:
> >>> Re-apply r306516 (by cem):
> >>>
> >>> Reduce the cost of TLB invalidation on x86 by using per-CPU completion
> >>> flags
> >>>
> >>> Reduce contention during TLB invalidation operations by using a per-CPU
> >>> completion flag, rather than a single atomically-updated variable.
> >>
> >> Kostik,
> >>
> >> could this commit cause a problem reported in the below links?
> >> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
> >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371
> >
> > If I am reading the report right, the problem appears on the
> > 11.0-RELEASE system. The patch you reference was only applied to HEAD a
> > week ago and was not merged even to stable/11.
>
> Sorry for the noise, then. Somehow I thought that this went into the release
> branch, but obviously there was too little time for that.
>
> > The examination must start with backtracing the thread which owns the
> > smp_ipi_mtx (shown on the screenshot).
>
> It looks like DDB is not in GENERIC in 11.0?

Yes, DDB is absent in GENERIC 11.0

> Not sure if the reporter would be able to configure a dump device and then
> save the dump given that the panic happens in the installer.
> If anyone could provide them with instructions that would be great.
>
> --
> Andriy Gapon
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"
Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86
On 12/10/2016 16:45, Konstantin Belousov wrote:
> On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
>> On 04/10/2016 20:01, Konstantin Belousov wrote:
>>> Author: kib
>>> Date: Tue Oct 4 17:01:24 2016
>>> New Revision: 306680
>>> URL: https://svnweb.freebsd.org/changeset/base/306680
>>>
>>> Log:
>>> Re-apply r306516 (by cem):
>>>
>>> Reduce the cost of TLB invalidation on x86 by using per-CPU completion
>>> flags
>>>
>>> Reduce contention during TLB invalidation operations by using a per-CPU
>>> completion flag, rather than a single atomically-updated variable.
>>
>> Kostik,
>>
>> could this commit cause a problem reported in the below links?
>> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371
>
> If I am reading the report right, the problem appears on the
> 11.0-RELEASE system. The patch you reference was only applied to HEAD a
> week ago and was not merged even to stable/11.

Sorry for the noise, then. Somehow I thought that this went into the release
branch, but obviously there was too little time for that.

> The examination must start with backtracing the thread which owns the
> smp_ipi_mtx (shown on the screenshot).

It looks like DDB is not in GENERIC in 11.0?
Not sure if the reporter would be able to configure a dump device and then save
the dump given that the panic happens in the installer.
If anyone could provide them with instructions that would be great.

--
Andriy Gapon
Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86
On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
> On 04/10/2016 20:01, Konstantin Belousov wrote:
> > Author: kib
> > Date: Tue Oct 4 17:01:24 2016
> > New Revision: 306680
> > URL: https://svnweb.freebsd.org/changeset/base/306680
> >
> > Log:
> > Re-apply r306516 (by cem):
> >
> > Reduce the cost of TLB invalidation on x86 by using per-CPU completion
> > flags
> >
> > Reduce contention during TLB invalidation operations by using a per-CPU
> > completion flag, rather than a single atomically-updated variable.
>
> Kostik,
>
> could this commit cause a problem reported in the below links?
> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371

If I am reading the report right, the problem appears on the
11.0-RELEASE system. The patch you reference was only applied to HEAD a
week ago and was not merged even to stable/11.

The examination must start with backtracing the thread which owns the
smp_ipi_mtx (shown on the screenshot).
Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86
On 04/10/2016 20:01, Konstantin Belousov wrote:
> Author: kib
> Date: Tue Oct 4 17:01:24 2016
> New Revision: 306680
> URL: https://svnweb.freebsd.org/changeset/base/306680
>
> Log:
> Re-apply r306516 (by cem):
>
> Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
>
> Reduce contention during TLB invalidation operations by using a per-CPU
> completion flag, rather than a single atomically-updated variable.

Kostik,

could this commit cause a problem reported in the below links?
https://bz-attachments.freebsd.org/attachment.cgi?id=175614
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371

> On a Westmere system (2 sockets x 4 cores x 1 threads), dtrace measurements
> show that smp_tlb_shootdown is about 50% faster with this patch; observations
> with VTune show that the percentage of time spent in invlrng_single_page on an
> interrupt (actually doing invalidation, rather than synchronization) increases
> from 31% with the old mechanism to 71% with the new one. (Running a basic file
> server workload.)
>
> Submitted by: Anton Rang
> Reviewed by: cem (earlier version)
> Sponsored by: Dell EMC Isilon
> Differential Revision: https://reviews.freebsd.org/D8041

--
Andriy Gapon
svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86
Author: kib
Date: Tue Oct 4 17:01:24 2016
New Revision: 306680
URL: https://svnweb.freebsd.org/changeset/base/306680

Log:
  Re-apply r306516 (by cem):

  Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags

  Reduce contention during TLB invalidation operations by using a per-CPU
  completion flag, rather than a single atomically-updated variable.

  On a Westmere system (2 sockets x 4 cores x 1 threads), dtrace measurements
  show that smp_tlb_shootdown is about 50% faster with this patch; observations
  with VTune show that the percentage of time spent in invlrng_single_page on an
  interrupt (actually doing invalidation, rather than synchronization) increases
  from 31% with the old mechanism to 71% with the new one. (Running a basic file
  server workload.)

  Submitted by: Anton Rang
  Reviewed by: cem (earlier version)
  Sponsored by: Dell EMC Isilon
  Differential Revision: https://reviews.freebsd.org/D8041

Modified:
  head/sys/amd64/amd64/mp_machdep.c
  head/sys/amd64/include/pcpu.h
  head/sys/i386/include/pcpu.h
  head/sys/x86/include/x86_smp.h
  head/sys/x86/x86/mp_x86.c

Modified: head/sys/amd64/amd64/mp_machdep.c
==============================================================================
--- head/sys/amd64/amd64/mp_machdep.c	Tue Oct 4 16:44:40 2016	(r306679)
+++ head/sys/amd64/amd64/mp_machdep.c	Tue Oct 4 17:01:24 2016	(r306680)
@@ -409,6 +409,7 @@ void
 invltlb_invpcid_handler(void)
 {
 	struct invpcid_descr d;
+	uint32_t generation;
 
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
@@ -417,17 +418,20 @@ invltlb_invpcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;
 	d.pcid = smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid;
 	d.pad = 0;
 	d.addr = 0;
 	invpcid(&d, smp_tlb_pmap == kernel_pmap ? INVPCID_CTXGLOB :
 	    INVPCID_CTX);
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 void
 invltlb_pcid_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
 #endif /* COUNT_XINVLTLB_HITS */
@@ -435,6 +439,7 @@ invltlb_pcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;	/* Overlap with serialization */
 	if (smp_tlb_pmap == kernel_pmap) {
 		invltlb_glob();
 	} else {
@@ -450,5 +455,5 @@ invltlb_pcid_handler(void)
 			    smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid);
 		}
 	}
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }

Modified: head/sys/amd64/include/pcpu.h
==============================================================================
--- head/sys/amd64/include/pcpu.h	Tue Oct 4 16:44:40 2016	(r306679)
+++ head/sys/amd64/include/pcpu.h	Tue Oct 4 17:01:24 2016	(r306680)
@@ -65,7 +65,8 @@
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	uint32_t pc_pcid_next;						\
 	uint32_t pc_pcid_gen;						\
-	char	__pad[149]	/* be divisor of PAGE_SIZE	\
+	uint32_t pc_smp_tlb_done; /* TLB op acknowledgement */	\
+	char	__pad[145]	/* be divisor of PAGE_SIZE	\
 					   after cache alignment */
 
 #define	PC_DBREG_CMD_NONE	0

Modified: head/sys/i386/include/pcpu.h
==============================================================================
--- head/sys/i386/include/pcpu.h	Tue Oct 4 16:44:40 2016	(r306679)
+++ head/sys/i386/include/pcpu.h	Tue Oct 4 17:01:24 2016	(r306680)
@@ -59,7 +59,8 @@
 	u_int	pc_cmci_mask;		/* MCx banks for CMCI */	\
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	vm_offset_t pc_qmap_addr;	/* KVA for temporary mappings */\
-	char	__pad[229]
+	uint32_t pc_smp_tlb_done;	/* TLB op acknowledgement */	\
+	char	__pad[225]
 
 #ifdef _KERNEL
 

Modified: head/sys/x86/include/x86_smp.h
==============================================================================
--- head/sys/x86/include/x86_smp.h	Tue Oct 4 16:44:40 2016	(r306679)
+++ head/sys/x86/include/x86_smp.h	Tue Oct 4 17:01:24 2016	(r306680)
@@ -35,7 +35,7 @@ extern volatile int aps_ready;
 extern struct mtx ap_boot_mtx;
 extern int cpu_logical;
 extern int cpu_cores;
-extern volatile int smp_tlb_wait;
+extern volatile uint32_t smp_tlb_generation;
 extern struct pmap *smp_tlb_pmap;
 extern u_int xhits_gbl[];
 extern u_int xhits_pg[];

Modified: head/sys/x86/x86/mp_x86.c
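
For readers following the log message, here is a minimal userland sketch of the per-CPU completion flag idea. It is not the kernel code: the initiator side lives in mp_x86.c, which is cut off above, so the initiator logic here is an assumption. All names (sketch_pcpu, sketch_generation, sketch_begin_shootdown, sketch_ipi_handler, sketch_all_done) and the 4-CPU count are hypothetical, and C11 atomics stand in for PCPU_SET and the volatile global. The handler mirrors what the diff does show: read the generation, do the invalidation, then store the generation into the CPU's own slot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NCPU 4	/* hypothetical CPU count for the sketch */

/*
 * Hypothetical stand-in for per-CPU data.  Each acknowledgement word is
 * aligned to its own 64-byte line, so CPUs do not contend on one
 * shared counter when they acknowledge.
 */
struct sketch_pcpu {
	_Alignas(64) _Atomic uint32_t tlb_done;
};

static struct sketch_pcpu sketch_cpus[SKETCH_NCPU];
static _Atomic uint32_t sketch_generation;

/* Initiator (assumed logic): open a new request and return its generation. */
static uint32_t
sketch_begin_shootdown(void)
{
	return (atomic_fetch_add(&sketch_generation, 1) + 1);
}

/* Target CPU: same ordering as the handlers in the diff above. */
static void
sketch_ipi_handler(int cpu)
{
	uint32_t generation;

	assert(cpu >= 0 && cpu < SKETCH_NCPU);
	generation = atomic_load(&sketch_generation);
	/* The real handler invalidates TLB entries at this point. */
	atomic_store(&sketch_cpus[cpu].tlb_done, generation);
}

/* Initiator (assumed logic): true once every other CPU has acknowledged. */
static bool
sketch_all_done(uint32_t generation, int self)
{
	for (int cpu = 0; cpu < SKETCH_NCPU; cpu++) {
		if (cpu == self)
			continue;
		if (atomic_load(&sketch_cpus[cpu].tlb_done) != generation)
			return (false);
	}
	return (true);
}
```

In this sketch the initiator would spin on sketch_all_done() after sending the IPIs. Each target writes only its own slot, where the old scheme had every CPU do an atomic add on the single smp_tlb_wait variable.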