Justin,

Thanks for the extensive reading list, very educative.

After reading
https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
I was thinking that it could be a NUMA/THP-related problem.
Turning off THP solved the "swap storm" problem. Some queries are even 40%
faster with THP off.
Then also turning off KSM improved performance by another 5%
I was seriously worried about this issue as we received today another
server with 144GB of RAM.

I will try to post a little summary of all the suggestion I received via
this thread later this week/early next week.

Thanks!

Charles

On Tue, Jul 18, 2017 at 8:01 PM, Justin Pryzby <[email protected]> wrote:

> On Tue, Jul 18, 2017 at 02:13:58PM -0300, Claudio Freire wrote:
> > On Tue, Jul 18, 2017 at 1:01 PM, Claudio Freire <[email protected]>
> wrote:
> > > On Tue, Jul 18, 2017 at 6:20 AM, Charles Nadeau
> > > <[email protected]> wrote:
> > >> Claudio,
> > >>
> > >> At one moment
> > >> during the query, there is a write storm to the swap drive (a bit
> like this
> > >> case:
> > >> https://www.postgresql.org/message-id/AANLkTi%
> 3Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com).
> > >> I can hardly explain it as there is plenty of memory on this server.
> > >
> > > That sounds a lot like NUMA zone_reclaim issues:
> > >
> > > https://www.postgresql.org/message-id/[email protected]
> >
> > I realize you have zone_reclaim_mode set to 0. Still, the symptoms are
> > eerily similar.
>
> Did you look at disabling KSM and/or THP ?
>
> sudo sh -c 'echo 2 >/sys/kernel/mm/ksm/run'
>
> https://www.postgresql.org/message-id/20170524155855.
> GH31097%40telsasoft.com
> https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi=
> [email protected]
> https://www.postgresql.org/message-id/CAHyXU0y9hviyKWvQZxX5UWfH9M2LY
> vwvAOPQ_DUPva2b71t12g%40mail.gmail.com
> https://www.postgresql.org/message-id/20130716195834.
> [email protected]
> https://www.postgresql.org/message-id/CAE_gQfW3dBiELcOppYN6v%3D8%2B%
> 2BpEeywD7iXGw-OT3doB8SXO4_A%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/1436268563235-
> 5856914.post%40n5.nabble.com#[email protected]
> https://www.postgresql.org/message-id/CAL_0b1tJOZCx3Lo3Eve1RqGaT%2BJJ_
> [email protected]
> https://www.postgresql.org/message-id/[email protected]
> https://www.postgresql.org/message-id/1415981309.90631.
> YahooMailNeo%40web133205.mail.ir2.yahoo.com
> https://www.postgresql.org/message-id/CAHyXU0yXYpCXN4%3D81ZDRQu-
> oGzrcq2qNAXDpyz4oiQPPAGk4ew%40mail.gmail.com
> https://www.pythian.com/blog/performance-tuning-hugepages-in-linux/
> http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-
> hadoop-workloads/
>
> Justin
>



-- 
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/

Reply via email to