Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

2016-10-12 Thread Slawa Olhovchenkov
On Wed, Oct 12, 2016 at 06:30:09PM +0300, Andriy Gapon wrote:

> On 12/10/2016 16:45, Konstantin Belousov wrote:
> > On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
> >> On 04/10/2016 20:01, Konstantin Belousov wrote:
> >>> Author: kib
> >>> Date: Tue Oct  4 17:01:24 2016
> >>> New Revision: 306680
> >>> URL: https://svnweb.freebsd.org/changeset/base/306680
> >>>
> >>> Log:
> >>>   Re-apply r306516 (by cem):
> >>>   
> >>>   Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
> >>>   
> >>>   Reduce contention during TLB invalidation operations by using a per-CPU
> >>>   completion flag, rather than a single atomically-updated variable.
> >>
> >> Kostik,
> >>
> >> could this commit cause a problem reported in the below links?
> >> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
> >> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371
> > 
> > If I am reading the report right, the problem appears on the
> > 11.0-RELEASE system. The patch you reference was only applied to HEAD a
> > week ago and was not merged even to stable/11.
> 
> Sorry for the noise, then.  Somehow I thought that this went into the release
> branch, but obviously there was too little time for that.
> 
> > The examination must start with backtracing the thread which owns the
> > smp_ipi_mtx (shown on the screenshot).
> 
> It looks like DDB is not in GENERIC in 11.0?

Yes, DDB is absent in GENERIC 11.0

> Not sure if the reporter would be able to configure a dump device and then save
> the dump given that the panic happens in the installer.
> If anyone could provide them with instructions that would be great.
> 
> -- 
> Andriy Gapon
___
svn-src-all@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-all
To unsubscribe, send any mail to "svn-src-all-unsubscr...@freebsd.org"


Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

2016-10-12 Thread Andriy Gapon
On 12/10/2016 16:45, Konstantin Belousov wrote:
> On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
>> On 04/10/2016 20:01, Konstantin Belousov wrote:
>>> Author: kib
>>> Date: Tue Oct  4 17:01:24 2016
>>> New Revision: 306680
>>> URL: https://svnweb.freebsd.org/changeset/base/306680
>>>
>>> Log:
>>>   Re-apply r306516 (by cem):
>>>   
>>>   Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
>>>   
>>>   Reduce contention during TLB invalidation operations by using a per-CPU
>>>   completion flag, rather than a single atomically-updated variable.
>>
>> Kostik,
>>
>> could this commit cause a problem reported in the below links?
>> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371
> 
> If I am reading the report right, the problem appears on the
> 11.0-RELEASE system. The patch you reference was only applied to HEAD a
> week ago and was not merged even to stable/11.

Sorry for the noise, then.  Somehow I thought that this went into the release
branch, but obviously there was too little time for that.

> The examination must start with backtracing the thread which owns the
> smp_ipi_mtx (shown on the screenshot).

It looks like DDB is not in GENERIC in 11.0?
Not sure if the reporter would be able to configure a dump device and then save
the dump given that the panic happens in the installer.
If anyone could provide them with instructions that would be great.

-- 
Andriy Gapon


Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

2016-10-12 Thread Konstantin Belousov
On Wed, Oct 12, 2016 at 04:25:00PM +0300, Andriy Gapon wrote:
> On 04/10/2016 20:01, Konstantin Belousov wrote:
> > Author: kib
> > Date: Tue Oct  4 17:01:24 2016
> > New Revision: 306680
> > URL: https://svnweb.freebsd.org/changeset/base/306680
> > 
> > Log:
> >   Re-apply r306516 (by cem):
> >   
> >   Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
> >   
> >   Reduce contention during TLB invalidation operations by using a per-CPU
> >   completion flag, rather than a single atomically-updated variable.
> 
> Kostik,
> 
> could this commit cause a problem reported in the below links?
> https://bz-attachments.freebsd.org/attachment.cgi?id=175614
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371

If I am reading the report right, the problem appears on the
11.0-RELEASE system. The patch you reference was only applied to HEAD a
week ago and was not merged even to stable/11.

The examination must start with backtracing the thread which owns the
smp_ipi_mtx (shown on the screenshot).


Re: svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

2016-10-12 Thread Andriy Gapon
On 04/10/2016 20:01, Konstantin Belousov wrote:
> Author: kib
> Date: Tue Oct  4 17:01:24 2016
> New Revision: 306680
> URL: https://svnweb.freebsd.org/changeset/base/306680
> 
> Log:
>   Re-apply r306516 (by cem):
>   
>   Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
>   
>   Reduce contention during TLB invalidation operations by using a per-CPU
>   completion flag, rather than a single atomically-updated variable.

Kostik,

could this commit cause a problem reported in the below links?
https://bz-attachments.freebsd.org/attachment.cgi?id=175614
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213371

>   On a Westmere system (2 sockets x 4 cores x 1 threads), dtrace measurements
>   show that smp_tlb_shootdown is about 50% faster with this patch; observations
>   with VTune show that the percentage of time spent in invlrng_single_page on an
>   interrupt (actually doing invalidation, rather than synchronization) increases
>   from 31% with the old mechanism to 71% with the new one.  (Running a basic file
>   server workload.)
>   
>   Submitted by:   Anton Rang 
>   Reviewed by:    cem (earlier version)
>   Sponsored by:   Dell EMC Isilon
>   Differential Revision:  https://reviews.freebsd.org/D8041

-- 
Andriy Gapon


svn commit: r306680 - in head/sys: amd64/amd64 amd64/include i386/include x86/include x86/x86

2016-10-04 Thread Konstantin Belousov
Author: kib
Date: Tue Oct  4 17:01:24 2016
New Revision: 306680
URL: https://svnweb.freebsd.org/changeset/base/306680

Log:
  Re-apply r306516 (by cem):
  
  Reduce the cost of TLB invalidation on x86 by using per-CPU completion flags
  
  Reduce contention during TLB invalidation operations by using a per-CPU
  completion flag, rather than a single atomically-updated variable.
  
  On a Westmere system (2 sockets x 4 cores x 1 threads), dtrace measurements
  show that smp_tlb_shootdown is about 50% faster with this patch; observations
  with VTune show that the percentage of time spent in invlrng_single_page on an
  interrupt (actually doing invalidation, rather than synchronization) increases
  from 31% with the old mechanism to 71% with the new one.  (Running a basic file
  server workload.)
  
  Submitted by: Anton Rang 
  Reviewed by:  cem (earlier version)
  Sponsored by: Dell EMC Isilon
  Differential Revision:  https://reviews.freebsd.org/D8041

Modified:
  head/sys/amd64/amd64/mp_machdep.c
  head/sys/amd64/include/pcpu.h
  head/sys/i386/include/pcpu.h
  head/sys/x86/include/x86_smp.h
  head/sys/x86/x86/mp_x86.c
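
The log above describes replacing one shared, atomically updated counter with a per-CPU completion flag. A minimal user-space sketch of that idea follows. Only the handler half mirrors what this commit shows (read the generation, invalidate, store it into the CPU's own slot). The initiator half lives in mp_x86.c, which is not reproduced in this message, so it is an assumption here. All `model_*` names, the CPU count and the 64-byte cache line are hypothetical and are not kernel identifiers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NCPU	 8	/* hypothetical CPU count for the model */
#define MODEL_CACHE_LINE 64	/* assumed cache-line size */

/* One acknowledgement slot per CPU, each on its own cache line. */
struct model_pcpu {
	_Alignas(MODEL_CACHE_LINE) _Atomic uint32_t tlb_done;
};
static_assert(sizeof(struct model_pcpu) == MODEL_CACHE_LINE,
    "each slot should occupy exactly one assumed cache line");

static struct model_pcpu model_pcpu[MODEL_NCPU];
static _Atomic uint32_t model_generation;

/* Initiator (assumed): open a new request and return its generation. */
static uint32_t
model_shootdown_begin(void)
{
	return (atomic_fetch_add(&model_generation, 1) + 1);
}

/* Target CPU: read the generation, invalidate, acknowledge in its own slot. */
static void
model_ipi_handler(int cpu)
{
	uint32_t generation = atomic_load(&model_generation);

	/* ... the real handlers invalidate TLB entries at this point ... */
	atomic_store(&model_pcpu[cpu].tlb_done, generation);
}

/* Initiator (assumed): true once every CPU in the mask has acknowledged. */
static bool
model_shootdown_done(uint32_t generation, uint32_t cpumask)
{
	for (int cpu = 0; cpu < MODEL_NCPU; cpu++) {
		if ((cpumask & (1u << cpu)) == 0)
			continue;
		if (atomic_load(&model_pcpu[cpu].tlb_done) != generation)
			return (false);
	}
	return (true);
}
```

In this model each target writes only to its own cache line, and the initiator only reads those lines, so the targets never contend with each other on a single counter. An initiator would poll `model_shootdown_done()` after signalling the targets.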

Modified: head/sys/amd64/amd64/mp_machdep.c
==============================================================================
--- head/sys/amd64/amd64/mp_machdep.c	Tue Oct  4 16:44:40 2016	(r306679)
+++ head/sys/amd64/amd64/mp_machdep.c	Tue Oct  4 17:01:24 2016	(r306680)
@@ -409,6 +409,7 @@ void
 invltlb_invpcid_handler(void)
 {
 	struct invpcid_descr d;
+	uint32_t generation;
 
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
@@ -417,17 +418,20 @@ invltlb_invpcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;
 	d.pcid = smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid;
 	d.pad = 0;
 	d.addr = 0;
 	invpcid(&d, smp_tlb_pmap == kernel_pmap ? INVPCID_CTXGLOB :
 	    INVPCID_CTX);
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }
 
 void
 invltlb_pcid_handler(void)
 {
+	uint32_t generation;
+
 #ifdef COUNT_XINVLTLB_HITS
 	xhits_gbl[PCPU_GET(cpuid)]++;
 #endif /* COUNT_XINVLTLB_HITS */
@@ -435,6 +439,7 @@ invltlb_pcid_handler(void)
 	(*ipi_invltlb_counts[PCPU_GET(cpuid)])++;
 #endif /* COUNT_IPIS */
 
+	generation = smp_tlb_generation;	/* Overlap with serialization */
 	if (smp_tlb_pmap == kernel_pmap) {
 		invltlb_glob();
 	} else {
@@ -450,5 +455,5 @@ invltlb_pcid_handler(void)
 			    smp_tlb_pmap->pm_pcids[PCPU_GET(cpuid)].pm_pcid);
 		}
 	}
-	atomic_add_int(&smp_tlb_wait, 1);
+	PCPU_SET(smp_tlb_done, generation);
 }

Modified: head/sys/amd64/include/pcpu.h
==============================================================================
--- head/sys/amd64/include/pcpu.h	Tue Oct  4 16:44:40 2016	(r306679)
+++ head/sys/amd64/include/pcpu.h	Tue Oct  4 17:01:24 2016	(r306680)
@@ -65,7 +65,8 @@
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	uint32_t pc_pcid_next;						\
 	uint32_t pc_pcid_gen;						\
-	char	__pad[149]		/* be divisor of PAGE_SIZE	\
+	uint32_t pc_smp_tlb_done;	/* TLB op acknowledgement */	\
+	char	__pad[145]		/* be divisor of PAGE_SIZE	\
 					   after cache alignment */
 
 #define	PC_DBREG_CMD_NONE	0

Modified: head/sys/i386/include/pcpu.h
==============================================================================
--- head/sys/i386/include/pcpu.h	Tue Oct  4 16:44:40 2016	(r306679)
+++ head/sys/i386/include/pcpu.h	Tue Oct  4 17:01:24 2016	(r306680)
@@ -59,7 +59,8 @@
 	u_int	pc_cmci_mask;		/* MCx banks for CMCI */	\
 	u_int	pc_vcpu_id;		/* Xen vCPU ID */		\
 	vm_offset_t pc_qmap_addr;	/* KVA for temporary mappings */\
-	char	__pad[229]
+	uint32_t pc_smp_tlb_done;	/* TLB op acknowledgement */	\
+	char	__pad[225]
 
 #ifdef _KERNEL
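
Both pcpu.h hunks above shrink `__pad` by exactly the size of the new `uint32_t` field (149 to 145 on amd64, 229 to 225 on i386), which keeps the overall structure size unchanged. The sketch below only illustrates that arithmetic. The `tail_old` and `tail_new` structs and the helper name are hypothetical stand-ins, not the real `struct pcpu` layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pad bytes left after carving a new field out of the old padding. */
static size_t
pad_after_new_field(size_t old_pad, size_t field_size)
{
	assert(field_size <= old_pad);
	return (old_pad - field_size);
}

/* Hypothetical structure tails modelled on the amd64 hunk. */
struct tail_old {
	uint32_t pcid_next;
	uint32_t pcid_gen;
	char	 pad[149];
};

struct tail_new {
	uint32_t pcid_next;
	uint32_t pcid_gen;
	uint32_t smp_tlb_done;	/* new acknowledgement field */
	char	 pad[145];
};

/* Adding the field and shrinking the pad by 4 leaves the size unchanged. */
static_assert(sizeof(struct tail_old) == sizeof(struct tail_new),
    "new field must be paid for out of the padding");
```

The compile-time assertion is the useful part: if someone adds a field without adjusting the pad, the build fails rather than the structure silently changing size.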
 

Modified: head/sys/x86/include/x86_smp.h
==============================================================================
--- head/sys/x86/include/x86_smp.h	Tue Oct  4 16:44:40 2016	(r306679)
+++ head/sys/x86/include/x86_smp.h	Tue Oct  4 17:01:24 2016	(r306680)
@@ -35,7 +35,7 @@ extern volatile int aps_ready;
 extern struct mtx ap_boot_mtx;
 extern int cpu_logical;
 extern int cpu_cores;
-extern volatile int smp_tlb_wait;
+extern volatile uint32_t smp_tlb_generation;
 extern struct pmap *smp_tlb_pmap;
 extern u_int xhits_gbl[];
 extern u_int xhits_pg[];

Modified: head/sys/x86/x86/mp_x86.c