Matt, you might consider using multiple PostgreSQL replicas in hot standby mode with pgpool in front of them; it may help, but only a little. Because of the way Spacewalk does its queries, I doubt it will help much: the safeties in pgpool will assume just about every query may write, and send it to the master anyway.

A better but more maintenance-intensive option you may want to consider is table partitioning, to alleviate the locks:

https://www.postgresql.org/docs/9.1/static/ddl-partitioning.html
https://wiki.postgresql.org/wiki/Table_partitioning

This is a scaled-down version of what the commercial data-warehouse versions of PostgreSQL do. It's a bit of a pain, but it works well for alleviating locking issues; I've used it with multi-terabyte databases in the past with great success, though it takes a lot of planning and testing to get right. A sketch of the approach follows below.
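To make that concrete, here is a minimal sketch of the 9.1-era inheritance-based partitioning described in the linked docs. The "events" table is hypothetical; whichever large, contended Spacewalk table you actually partition would take the same shape:

    -- Parent table: queries and inserts target this name.
    CREATE TABLE events (
        id         bigserial,
        created_at timestamptz NOT NULL,
        payload    text
    );

    -- Monthly children: the CHECK constraints let the planner skip
    -- partitions that cannot match a query's WHERE clause
    -- (constraint_exclusion is 'partition' by default).
    CREATE TABLE events_2016_10 (
        CHECK (created_at >= '2016-10-01' AND created_at < '2016-11-01')
    ) INHERITS (events);
    CREATE TABLE events_2016_11 (
        CHECK (created_at >= '2016-11-01' AND created_at < '2016-12-01')
    ) INHERITS (events);

    -- Route inserts on the parent into the matching child.
    CREATE OR REPLACE FUNCTION events_insert_router() RETURNS trigger AS $$
    BEGIN
        IF NEW.created_at >= '2016-10-01' AND NEW.created_at < '2016-11-01' THEN
            INSERT INTO events_2016_10 VALUES (NEW.*);
        ELSIF NEW.created_at >= '2016-11-01' AND NEW.created_at < '2016-12-01' THEN
            INSERT INTO events_2016_11 VALUES (NEW.*);
        ELSE
            RAISE EXCEPTION 'no partition for %', NEW.created_at;
        END IF;
        RETURN NULL;  -- the row is already stored in a child table
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER events_insert_trg
        BEFORE INSERT ON events
        FOR EACH ROW EXECUTE PROCEDURE events_insert_router();

The payoff for locking is that maintenance (VACUUM, reindexing, bulk deletes) and hot inserts land on different child tables, so they stop contending for the same locks, and dropping a whole child replaces a huge locking DELETE.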
On Tue, Oct 11, 2016 at 10:04 PM, Matt Moldvan <m...@moldvan.com> wrote:
> Yeah, it was quite a trial to get it to scale to that degree, so I tried a lot of different things. I increased max_fds in the jabber configs and the ulimits for the jabber user, and it's pretty stable now, though even the back-end Python code wasn't written for multiple OSA dispatchers (one overwrites the other's database password in the rhnpushdispatcher table in the main database).
>
> As for proxies, we have 4 in each datacenter, but even that was bringing down the database, due to what I realized later was OSAD restarting on all systems when Puppet ran in the morning. That, coupled with the database locks being taken on the snapshot tables (because Spacewalk for some reason thought the base channels were updated daily), made the UI completely unresponsive and unusable for our operations folks trying to run patching for hundreds of systems at a time.
>
> My main struggle with the proxies was putting them behind F5 local and global load balancers. I signed the SSL certs with the name of the global WIP, and unfortunately, the way the jabber S2S component works, it tried to use that same name for all the proxies. I realized later that I was trying to fit a square peg into a round hole, so I fixed up the Puppet module, disabled snapshots, disabled Jabber on the proxies, and pointed the clients at the masters instead. That was after months of frustration and a growing chorus of complaints about the UI on the masters.
>
> Anyway, having the proxies helped reduce the load on the masters significantly, as they still function with the other components, like Apache httpd and Squid, for caching the RPMs.
>
> On Tue, Oct 11, 2016 at 5:02 PM Paul Robert Marino <prmari...@gmail.com> wrote:
>>
>> Matt, you have a lot of clients, so those numbers start to make sense for increasing the database connections. Also, OSAD isn't meant to scale that high; in fact, you should run out of file handles for it before it even gets to that many clients. Furthermore, I hope you are using Spacewalk proxies; if not, you may find they help you a lot.
>>
>> As for the original poster: look at this page, specifically work_mem: https://www.postgresql.org/docs/9.2/static/runtime-config-resource.html . I would be willing to bet that is your issue, because the default isn't enough for Spacewalk. The easiest way to confirm it is to look in the PostgreSQL base data directory (/var/lib/pgsql/data by default) for a directory called pgsql_tmp; if there are any files in it, you know queries are spilling to disk. You can also look for any queries that have been running for a long time and run EXPLAIN ANALYZE on them. I normally do not suggest this site or any like it, because they often give very wrong answers, but in this case it's not a bad one: http://dba.stackexchange.com/questions/112079/slow-query-performance-due-to-temporary-file . I just don't suggest going straight to 64MB.
>>
>> On Tue, Oct 11, 2016 at 8:37 AM, Allan Moraes <al...@allanmoraes.com.br> wrote:
>> > Thank you for the tips.
>> >
>> > In this case there is 6GB of memory available, and the high I/O occurs on the postgres disk; I/O on the other disks is normal. The system is not using swap, and there is 3GB of swap available.
>> >
>> > I will separate the postgres server and apply your tips.
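Paul's pgsql_tmp check can also be done from inside the database. A sketch, assuming PostgreSQL 9.2 or later (the temp_files/temp_bytes counters and the pid/state/query column names date from 9.2; 9.1 spells the last two procpid/current_query):

    -- How much has each database spilled to temp files since the stats were reset?
    SELECT datname, temp_files, pg_size_pretty(temp_bytes) AS temp_written
      FROM pg_stat_database
     ORDER BY temp_bytes DESC;

    -- Statements that have been running for a long time.
    SELECT pid, now() - query_start AS runtime, state, left(query, 60) AS query
      FROM pg_stat_activity
     WHERE state <> 'idle'
     ORDER BY runtime DESC NULLS LAST;

    -- Watch a sort spill, then stop spilling, as work_mem grows. With 1MB the
    -- plan reports "Sort Method: external merge  Disk: ..."; with 32MB it
    -- switches to an in-memory quicksort. SET is session-local, so you can
    -- test a value before touching postgresql.conf.
    SET work_mem = '1MB';
    EXPLAIN ANALYZE SELECT g FROM generate_series(1, 100000) AS g ORDER BY g;
    SET work_mem = '32MB';
    EXPLAIN ANALYZE SELECT g FROM generate_series(1, 100000) AS g ORDER BY g;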
>> >
>> > 2016-10-10 21:58 GMT-03:00 Matt Moldvan <m...@moldvan.com>:
>> >>
>> >> We have about 6,000 systems to manage, and it was unusable otherwise... I had way too much trouble trying to get OSAD to work through proxies and F5 load balancers, so I ended up pointing them all at two masters that still use the same Postgres database VM. I was also toying with having the database be the back end for OSAD, so with that in mind the number of concurrent clients would often reach higher-than-usual numbers... I tried a lot of different things to get Spacewalk stable, usable, and failing over properly, so I don't know that any of my recommendations or environment-specific settings will be a silver bullet for anyone else, but it can't hurt to try, and learn in the process.
>> >>
>> >> On Mon, Oct 10, 2016 at 6:23 PM Paul Robert Marino <prmari...@gmail.com> wrote:
>> >>>
>> >>> Tuning for 5,000 clients is nuts; that would hurt your performance. Try running pgtune for about 50 to maybe 500 clients max, and I would try the lower setting first.
>> >>>
>> >>> Now let's talk about the high I/O. That usually happens when you don't have enough working memory in PostgreSQL's configuration. When that happens, PostgreSQL creates temp files, which are slow and cause a lot of write I/O during read operations, because it has to spill the data out to the temp files. Note that setting the number of connections too high will exacerbate that issue if it's the root cause.
>> >>>
>> >>> By the way, I've managed up to 400 clients with Spacewalk and never had to disable the snapshots.
>> >>>
>> >>> On Mon, Oct 10, 2016 at 4:48 PM, Matt Moldvan <m...@moldvan.com> wrote:
>> >>> > I had similar issues and ended up first breaking out the database onto its own VM, then increasing the Postgres debug logging. I saw that there were a large number of operations running against the snapshot tables, with locks and so on being held for long periods. In /etc/rhn/rhn.conf, try disabling snapshots with:
>> >>> >
>> >>> > enable_snapshots = 0
>> >>> >
>> >>> > I also did quite a bit of Postgres tuning using pgtune, for 5,000 clients or so:
>> >>> >
>> >>> > pgtune -i data/postgresql.conf -o ./data/postgresql.conf.new -c 5000
>> >>> >
>> >>> > Another thing that may help is installing pgbadger to analyze your Postgres logs... it has some nice visualizations of the types of queries and tables involved, which may point you in the right direction if snapshots aren't the reason for the high utilization: https://github.com/dalibo/pgbadger
>> >>> >
>> >>> > Hope that helps.
>> >>> >
>> >>> > On Mon, Oct 10, 2016 at 4:06 PM Konstantin Raskoshnyi <konra...@gmail.com> wrote:
>> >>> >>
>> >>> >> Because all your systems request information from Spacewalk, and the default installation doesn't make any sense if you have more than 50 machines, so you need to tune Postgres, Tomcat, and Linux itself.
>> >>> >>
>> >>> >> On Mon, Oct 10, 2016 at 12:34 PM, Allan Moraes <al...@allanmoraes.com.br> wrote:
>> >>> >>>
>> >>> >>> Hi,
>> >>> >>> On my CentOS 7 server, Spacewalk 2.4 and PostgreSQL are installed from the default installation.
>> >>> >>> Via iotop, I can see PostgreSQL writing a lot of information all day long. Why does this occur?
>> >
>> > --
>> > Regards...
>> >
>> > Allan Moraes
>> > - Linux Consulting at Venda e Cia
>> > - Founder and Editor at MySQL Box
>> > - Linux System Administrator and MySQL DBA at Umbler
>> >
>> > Cel: (51) 81885480
>> > E-mail: al...@mysqlbox.com.br
>> > Skype: al...@allanmoraes.com.br
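Several replies above trace the unresponsive UI to locks held on the snapshot tables, and pg_locks will show that directly. A sketch, again assuming 9.2+ pg_stat_activity column names; the rhnsnapshot% pattern is an assumption about how the Spacewalk schema names its snapshot tables:

    -- Which snapshot tables are locked right now, and by which backends?
    SELECT c.relname, l.mode, l.granted, l.pid
      FROM pg_locks l
      JOIN pg_class c ON c.oid = l.relation
     WHERE c.relname ILIKE 'rhnsnapshot%';

    -- Who is blocking whom? (Simplified to relation-level locks.)
    SELECT blocked.pid        AS blocked_pid,
           blocked_act.query  AS blocked_query,
           blocking.pid       AS blocking_pid,
           blocking_act.query AS blocking_query
      FROM pg_locks blocked
      JOIN pg_locks blocking
        ON blocking.locktype = blocked.locktype
       AND blocking.database IS NOT DISTINCT FROM blocked.database
       AND blocking.relation IS NOT DISTINCT FROM blocked.relation
       AND blocking.pid <> blocked.pid
       AND blocking.granted
      JOIN pg_stat_activity blocked_act  ON blocked_act.pid  = blocked.pid
      JOIN pg_stat_activity blocking_act ON blocking_act.pid = blocking.pid
     WHERE NOT blocked.granted
       AND blocked.locktype = 'relation';

If the blocking queries turn out to be snapshot writes, that supports the enable_snapshots = 0 suggestion from earlier in the thread.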