Send netdisco-users mailing list submissions to
[email protected]
To subscribe or unsubscribe via the World Wide Web, visit
https://lists.sourceforge.net/lists/listinfo/netdisco-users
or, via email, send a message with subject or body 'help' to
[email protected]
You can reach the person managing the list at
[email protected]
When replying, please edit your Subject line so it is more specific
than "Re: Contents of netdisco-users digest..."
Today's Topics:
1. Slow Search Performance (Pavel Skovajsa)
2. Re: Slow Search Performance (Andy Ruhl)
3. Re: Slow Search Performance (Pavel Skovajsa)
--- Begin Message ---
hello,
This week I noticed that Netdisco performance is terrible. Even a simple
operation takes minutes. Last month there was a bug related to this, but
it was fixed in the latest release. To narrow it down, I looked into the
postgres logs and found a ton of messages about slow queries. They fall
into a number of groups, with examples below. I would appreciate any help
figuring this out. I tried restarting the DB, Netdisco itself, and the
whole server; it works fine for a couple of minutes and then gets slow
again. Thanks,
-pavel
Version info
===========
App::Netdisco 2.46.2
SNMP::Info 3.70
DB Schema 63
PostgreSQL 12.00.4
Perl 5.30.0
Size
====
10,791 devices with 31,121 IPs
424,072 interfaces of which 167,725 are up
15,022 layer 2 links between devices
306,409 nodes logged, of which 152,915 are active
120,878 IPs logged, of which 111,858 are active
Slow SQL Group 1 Example
=======
2020-08-09 00:00:46.997 EDT [537360] netdisco@netdisco LOG: duration:
10420.704 ms execute dbdpg_p537356_2:
SELECT me.mac, me.ip FROM ( SELECT ip, mac FROM device where mac = any
($1::macaddr[])
UNION
SELECT ip, mac FROM device_port dp where mac = any
($2::macaddr[])
) me GROUP BY mac, ip
2020-08-09 00:00:46.997 EDT [537360] netdisco@netdisco DETAIL: parameters:
$1 = '{00:05:9b:cf:1c:1a}', $2 = '{00:05:9b:cf:1c:1a}'
The one above is the simplest case; there are instances where this query
runs with hundreds of MACs in $1 and $2.
Slow SQL Group 2 Example
=======
2020-09-13 08:45:20.719 EDT [628253] netdisco@netdisco LOG: duration:
49434.509 ms parse <unnamed>: UPDATE device_port SET is_uplink = true,
manual_topo = false, remote_id = $1, remote_ip = $2, remote_port = $3,
remote_type = $4 WHERE ( ( ip = $5 AND port = $6 ) )
2020-09-13 08:45:20.720 EDT [627470] netdisco@netdisco LOG: duration:
49278.507 ms parse <unnamed>: UPDATE device_port SET is_uplink = true,
manual_topo = false, remote_id = $1, remote_ip = $2, remote_port = $3,
remote_type = $4 WHERE ( ( ip = $5 AND port = $6 ) )
2020-09-13 08:45:20.727 EDT [627759] netdisco@netdisco LOG: duration:
44158.919 ms parse <unnamed>: UPDATE device_port SET manual_topo = false
WHERE ( ip = $1 )
2020-09-13 08:45:59.705 EDT [628327] netdisco@netdisco LOG: duration:
39725.025 ms parse <unnamed>: UPDATE device_port SET manual_topo = false
WHERE ( ip = $1 )
2020-09-13 08:45:59.764 EDT [628350] netdisco@netdisco LOG: duration:
37645.211 ms parse <unnamed>: UPDATE device_port SET manual_topo = false
WHERE ( ip = $1 )
Slow SQL Group 3 Example
========
2020-09-13 08:43:48.151 EDT [628237] netdisco@netdisco LOG: duration:
14385.674 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:43:48.216 EDT [628268] netdisco@netdisco LOG: duration:
8381.777 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:44:30.865 EDT [627759] netdisco@netdisco LOG: duration:
36200.128 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.569 EDT [628291] netdisco@netdisco LOG: duration:
80475.532 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.636 EDT [628293] netdisco@netdisco LOG: duration:
78427.346 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.730 EDT [628322] netdisco@netdisco LOG: duration:
78249.120 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.790 EDT [628327] netdisco@netdisco LOG: duration:
77798.475 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.847 EDT [628313] netdisco@netdisco LOG: duration:
75407.956 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:19.903 EDT [628306] netdisco@netdisco LOG: duration:
74594.338 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
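A rough way to compare the three groups is to total the logged durations per statement type. The sketch below is only an illustration: the `summarize` helper and the `SAMPLE` text are made up for this example (three entries copied from the logs above), and the regular expression assumes the wrapped "LOG: duration: ... ms statement/parse/execute" layout shown here. For a LOCK TABLE statement, the logged duration is mostly time spent waiting for the lock, and ACCESS EXCLUSIVE conflicts with every other lock mode, plain SELECTs included. Long LOCK entries may therefore point to contention on device_port rather than to slow query plans, though the logs alone do not prove that.

```python
import re
from collections import defaultdict

# Matches one log entry even when it is wrapped across lines, e.g.
#   "... LOG: duration:\n14385.674 ms statement: LOCK TABLE ..."
# and captures the duration plus the first SQL keyword.
ENTRY = re.compile(
    r"duration:\s+([\d.]+)\s+ms\s+"
    r"(?:statement|parse\s+\S+|execute\s+\S+):\s+(\w+)"
)


def summarize(log_text):
    """Return {keyword: {"count", "total_ms", "max_ms"}} for a log excerpt."""
    stats = defaultdict(lambda: {"count": 0, "total_ms": 0.0, "max_ms": 0.0})
    for duration, verb in ENTRY.findall(log_text):
        ms = float(duration)
        entry = stats[verb.upper()]
        entry["count"] += 1
        entry["total_ms"] += ms
        entry["max_ms"] = max(entry["max_ms"], ms)
    return dict(stats)


# Three entries copied from the log excerpts above (hypothetical sample input).
SAMPLE = '''
2020-09-13 08:43:48.151 EDT [628237] netdisco@netdisco LOG: duration:
14385.674 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
2020-09-13 08:45:20.727 EDT [627759] netdisco@netdisco LOG: duration:
44158.919 ms parse <unnamed>: UPDATE device_port SET manual_topo = false
WHERE ( ip = $1 )
2020-09-13 08:45:19.569 EDT [628291] netdisco@netdisco LOG: duration:
80475.532 ms statement: LOCK TABLE "device_port" IN ACCESS EXCLUSIVE MODE
'''

if __name__ == "__main__":
    for verb, s in sorted(summarize(SAMPLE).items()):
        print(f"{verb}: n={s['count']} total={s['total_ms']:.0f} ms "
              f"max={s['max_ms']:.0f} ms")
```

Running the same function over the full postgres log file, read in as a single string, would show which statement type accounts for most of the waiting.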
--- End Message ---
--- Begin Message ---
On Wed, Sep 16, 2020 at 2:35 PM Pavel Skovajsa <[email protected]> wrote:
>
> hello,
>
> This week I noticed that Netdisco performance is terrible. Even a simple
> operation takes minutes. Last month there was a bug related to this, but
> it was fixed in the latest release. To narrow it down, I looked into the
> postgres logs and found a ton of messages about slow queries. They fall
> into a number of groups, with examples below. I would appreciate any help
> figuring this out. I tried restarting the DB, Netdisco itself, and the
> whole server; it works fine for a couple of minutes and then gets slow
> again. Thanks,
You've got a lot of devices and ports. There is a limit to everything.
Did you run pgtune to adjust your database server?
I just renamed my database and re-discovered all of my devices in a
clean database and the database size is about 1/3 what it used to be.
That's not really a solution though. I hope someone has a simpler
explanation. I haven't looked into it that much yet.
Andy
--- End Message ---
--- Begin Message ---
Andy,
Thanks for the response.
Yeah, we have quite a large scale, but it used to be 20% larger and Netdisco
ran quite well. Granted, we used to have issues with jobs running too late,
but the web GUI was always fast. This time the web GUI is super slow; a
simple search for a network node takes minutes.
I had run postgresqltuner.pl before, so I believe the DB is "tuned". Most
confusingly, looking at the CPU graphs, the server is not running hot at all.
It has 24GB of RAM and 18 Xeon CPUs.
I will try to lower the expire nodes interval (currently set to 1y) and see
if it helps.
In the meantime I was trying to dive into the SQL queries and get EXPLAIN
output for the query below, but I do not fully understand the EXPLAIN syntax
in psql: I can't figure out how to run EXPLAIN with the $1 and $2
parameters. I tried replacing $1 and $2 with their values inline, but that
did not work. $1 is an array variable, right? And "::macaddr[]" is a type
cast? Can anybody give me a hint?
-pavel
////////
SELECT me.mac, me.ip FROM ( SELECT ip, mac FROM device where mac = any
($1::macaddr[])
UNION
SELECT ip, mac FROM device_port dp where mac = any
($2::macaddr[])
) me GROUP BY mac, ip
2020-08-09 00:00:46.997 EDT [537360] netdisco@netdisco DETAIL: parameters:
$1 = '{00:05:9b:cf:1c:1a}', $2 = '{00:05:9b:cf:1c:1a}'
//////////
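One way to get a plan for the logged statement is sketched below, as an illustration rather than a confirmed Netdisco procedure. In the log, $1 and $2 are prepared-statement placeholders, and "::macaddr[]" casts the supplied text to an array of macaddr. Substituting each logged value as a single-quoted literal, with the cast left in place, should give a statement that psql accepts after EXPLAIN. The helper names `inline_params` and `explain_statement` are hypothetical. Note that EXPLAIN with the ANALYZE option actually executes the query.

```python
import re


def inline_params(sql, params):
    """Replace $1, $2, ... with single-quoted SQL string literals.

    `params` maps the placeholder number to the raw value from the
    "DETAIL: parameters:" log line (without the surrounding quotes).
    """
    def quote(match):
        value = params[int(match.group(1))]
        # Double embedded single quotes, as SQL string literals require.
        return "'" + value.replace("'", "''") + "'"

    # \d+ consumes the whole number, so $10 is never mistaken for $1.
    return re.sub(r"\$(\d+)", quote, sql)


def explain_statement(sql, params):
    """Build an EXPLAIN statement that can be pasted into psql."""
    return "EXPLAIN (ANALYZE, BUFFERS) " + inline_params(sql, params).strip() + ";"


# Query text and parameter values copied from the log excerpt above.
LOGGED_SQL = """
SELECT me.mac, me.ip FROM (
  SELECT ip, mac FROM device WHERE mac = ANY ($1::macaddr[])
  UNION
  SELECT ip, mac FROM device_port dp WHERE mac = ANY ($2::macaddr[])
) me GROUP BY mac, ip
"""
LOGGED_PARAMS = {1: "{00:05:9b:cf:1c:1a}", 2: "{00:05:9b:cf:1c:1a}"}

if __name__ == "__main__":
    print(explain_statement(LOGGED_SQL, LOGGED_PARAMS))
```

An alternative that keeps the placeholders is PREPARE with the two macaddr[] parameter types, followed by EXPLAIN EXECUTE with the same quoted array literals.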
On Thu, Sep 17, 2020 at 1:59 AM Andy Ruhl <[email protected]> wrote:
> You've got a lot of devices and ports. There is a limit to everything.
>
> Did you run pgtune to adjust your database server?
>
> I just renamed my database and re-discovered all of my devices in a
> clean database and the database size is about 1/3 what it used to be.
> That's not really a solution though. I hope someone has a simpler
> explanation. I haven't looked into it that much yet.
>
> Andy
>
--- End Message ---