Justin,

Thanks for the extensive reading list, very educative.

After reading
https://blog.jcole.us/2010/09/28/mysql-swap-insanity-and-the-numa-architecture/
I was thinking that it could be a NUMA/THP-related problem.
Turning off THP solved the "swap storm" problem. Some queries are even 40%
faster with THP off.
Then also turning off KSM improved performance by another 5%
I was seriously worried about this issue as we received today another
server with 144GB of RAM.

I will try to post a little summary of all the suggestion I received via
this thread later this week/early next week.

Thanks!

Charles

On Tue, Jul 18, 2017 at 8:01 PM, Justin Pryzby <pry...@telsasoft.com> wrote:

> On Tue, Jul 18, 2017 at 02:13:58PM -0300, Claudio Freire wrote:
> > On Tue, Jul 18, 2017 at 1:01 PM, Claudio Freire <klaussfre...@gmail.com>
> wrote:
> > > On Tue, Jul 18, 2017 at 6:20 AM, Charles Nadeau
> > > <charles.nad...@gmail.com> wrote:
> > >> Claudio,
> > >>
> > >> At one moment
> > >> during the query, there is a write storm to the swap drive (a bit
> like this
> > >> case:
> > >> https://www.postgresql.org/message-id/AANLkTi%
> 3Diw4fC2RgTxhw0aGpyXANhOT%3DXBnjLU1_v6PdA%40mail.gmail.com).
> > >> I can hardly explain it as there is plenty of memory on this server.
> > >
> > > That sounds a lot like NUMA zone_reclaim issues:
> > >
> > > https://www.postgresql.org/message-id/500616cb.3070...@2ndquadrant.com
> >
> > I realize you have zone_reclaim_mode set to 0. Still, the symptoms are
> > eerily similar.
>
> Did you look at disabling KSM and/or THP ?
>
> sudo sh -c 'echo 2 >/sys/kernel/mm/ksm/run'
>
> https://www.postgresql.org/message-id/20170524155855.
> GH31097%40telsasoft.com
> https://www.postgresql.org/message-id/CANQNgOrD02f8mR3Y8Pi=
> zfsol14rqnqa8hwz1r4rsndlr1b...@mail.gmail.com
> https://www.postgresql.org/message-id/CAHyXU0y9hviyKWvQZxX5UWfH9M2LY
> vwvAOPQ_DUPva2b71t12g%40mail.gmail.com
> https://www.postgresql.org/message-id/20130716195834.
> 8fe5c79249cb2ff0d4270...@yahoo.es
> https://www.postgresql.org/message-id/CAE_gQfW3dBiELcOppYN6v%3D8%2B%
> 2BpEeywD7iXGw-OT3doB8SXO4_A%40mail.gmail.com
> https://www.postgresql.org/message-id/flat/1436268563235-
> 5856914.post%40n5.nabble.com#1436268563235-5856914.p...@n5.nabble.com
> https://www.postgresql.org/message-id/CAL_0b1tJOZCx3Lo3Eve1RqGaT%2BJJ_
> q7w4pkj87wfwwxbtug...@mail.gmail.com
> https://www.postgresql.org/message-id/556e2068.7070...@vuole.me
> https://www.postgresql.org/message-id/1415981309.90631.
> YahooMailNeo%40web133205.mail.ir2.yahoo.com
> https://www.postgresql.org/message-id/CAHyXU0yXYpCXN4%3D81ZDRQu-
> oGzrcq2qNAXDpyz4oiQPPAGk4ew%40mail.gmail.com
> https://www.pythian.com/blog/performance-tuning-hugepages-in-linux/
> http://structureddata.org/2012/06/18/linux-6-transparent-huge-pages-and-
> hadoop-workloads/
>
> Justin
>



-- 
Charles Nadeau Ph.D.
http://charlesnadeau.blogspot.com/

Reply via email to