Re: [PATCH] mempolicy: convert the shared_policy lock to a rwlock
On 11/17/2015 05:17 PM, Nathan Zimmer wrote:
> When running the SPECint_rate gcc on some very large boxes it was noticed
> that the system was spending lots of time in mpol_shared_policy_lookup.
> The gamess benchmark can also show it and is what I mostly used to chase
> down the issue since I found the setup for that easier.
>
> To be clear the binaries were on tmpfs because of disk I/O requirements.
> We then used text replication to avoid icache misses and having all the
> copies banging on the memory where the instruction code resides.
> This results in us hitting a bottleneck in mpol_shared_policy_lookup
> since lookup is serialised by the shared_policy lock.
>
> I have only reproduced this on very large (3k+ cores) boxes. The problem
> starts showing up at just a few hundred ranks, getting worse until it
> threatens to livelock once it gets large enough.
> For example on the gamess benchmark at 128 ranks this area consumes only
> ~1% of time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is
> over 90%.
>
> To alleviate the contention on this area I converted the spinlock to a
> rwlock. This allows the large number of lookups to happen simultaneously.
> The results were quite good, reducing the consumption at max ranks to
> around 2%.

Acked-by: David Rientjes
Acked-by: Vlastimil Babka

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] mempolicy: convert the shared_policy lock to a rwlock
On 11/18/2015 09:05 PM, Nathan Zimmer wrote:
> On 11/18/2015 07:50 AM, Vlastimil Babka wrote:
>> At first glance it seems that RCU would be a good fit here and achieve
>> even better lookup scalability, have you considered it?
>
> Originally that was my plan, but when I saw how good the results were with
> the rwlock, I chickened out and took the less error-prone route.
> I should also note that the remaining 2% of system time is not from this
> lookup but from another area.

Ah, I see, thanks!
Vlastimil

> Nate
Re: [PATCH] mempolicy: convert the shared_policy lock to a rwlock
On 11/17/2015 05:17 PM, Nathan Zimmer wrote:
> When running the SPECint_rate gcc on some very large boxes it was noticed
> that the system was spending lots of time in mpol_shared_policy_lookup.
> The gamess benchmark can also show it and is what I mostly used to chase
> down the issue since I found the setup for that easier.
>
> To be clear the binaries were on tmpfs because of disk I/O requirements.
> We then used text replication to avoid icache misses and having all the
> copies banging on the memory where the instruction code resides.
> This results in us hitting a bottleneck in mpol_shared_policy_lookup
> since lookup is serialised by the shared_policy lock.
>
> I have only reproduced this on very large (3k+ cores) boxes. The problem
> starts showing up at just a few hundred ranks, getting worse until it
> threatens to livelock once it gets large enough.
> For example on the gamess benchmark at 128 ranks this area consumes only
> ~1% of time, at 512 ranks it consumes nearly 13%, and at 2k ranks it is
> over 90%.
>
> To alleviate the contention on this area I converted the spinlock to a
> rwlock. This allows the large number of lookups to happen simultaneously.
> The results were quite good, reducing the consumption at max ranks to
> around 2%.

At first glance it seems that RCU would be a good fit here and achieve even
better lookup scalability, have you considered it?