Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Robins Tharakan
>
>
> Without having at least compared EXPLAIN outputs from the two boxes, you
> have no business jumping to that conclusion.
>
> If EXPLAIN does show different plans, my first instinct would be to wonder
> whether the pg_stats data is equally up-to-date on both boxes.
>
> regards, tom lane
>

Thanks. EXPLAIN plans were different but (don't have them now and) didn't
know system catalogs were so severely affected by outdated statistics as
well (which is why I was looking for any other reasons I might be missing). I
reckon an ANALYSE; should solve this? ... Would get back if I have
something else to offer.

-
robins
-- 

-
robins


Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Tom Lane
Robins Tharakan  writes:
> I completely agree. With 'irrelevant' I was only trying to imply that
> irrespective of the complexity of the query, a replicated box was seeing
> similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
> isn't to blame here...

Without having at least compared EXPLAIN outputs from the two boxes, you
have no business jumping to that conclusion.

If EXPLAIN does show different plans, my first instinct would be to wonder
whether the pg_stats data is equally up-to-date on both boxes.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-10 Thread Robins Tharakan
On Fri, 9 Sep 2016 at 09:39 Andres Freund  wrote:

> On 2016-09-07 23:37:31 +, Robins Tharakan wrote:
> > If someone asks for I could provide SQL + EXPLAIN, but it feels
> irrelevant
> > here.
>
> Why is that? information_schema are normal sql queries, and some of them
> far from trivial.
>
> Andres
>
Hi Andres,

I completely agree. With 'irrelevant' I was only trying to imply that
irrespective of the complexity of the query, a replicated box was seeing
similar slowness whereas a Restored DB wasn't. It felt that the SQL itself
isn't to blame here...

In effect, I was trying to ask if I am forgetting / missing something very
obvious / important that could cause such an observation.

As others recommended, I am unable to have direct access to the production
(master / slave) instances and so GDB / stack trace options are out of
bounds at this time. I'll revert if I am able to do that.

-
thanks
robins



-- 

-
robins


Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-08 Thread Andres Freund
On 2016-09-07 23:37:31 +, Robins Tharakan wrote:
> If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
> here.

Why is that? information_schema are normal sql queries, and some of them
far from trivial.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-08 Thread Jeff Janes
On Wed, Sep 7, 2016 at 4:37 PM, Robins Tharakan  wrote:

>

> Hi,
>
> An SQL (with only information_schema related JOINS) when triggered, runs
with max CPU (and never ends - killed after 2 days).
> - It runs similarly (very slow) on a replicated server that acts as a
read-only slave.
> - Top shows only postgres as hitting max CPU (nothing else). When query
killed, CPU near 0%.
> - When the DB is restored on a separate test server (with the exact
postgresql.conf) the same query works fine.
> - There is no concurrent usage on the replicated / test server (although
the primary is a Production server and has concurrent users).
>
> Questions:
> - If this was a postgres bug or a configuration issue, query on the
restored DB should have been slow too. Is there something very basic I am
missing here?
>
> If someone asks for I could provide SQL + EXPLAIN, but it feels
irrelevant here. I amn't looking for a specific solution but what else
should I be looking for here?

strace -ttt -T -y the process to see what system calls it is making.  If it
is not doing many systme calls, or they are uninformative, then attach the
gdb debugger to it and periodically interrupt the process (ctrl c) and take
a back trace (bt), then restart it (c) and repeat.  If all the stack traces
look similar, you will know where the time is going.

Cheers,

Jeff


Re: [HACKERS] High-CPU consumption on information_schema (only) query

2016-09-08 Thread Craig Ringer
On 8 Sep. 2016 7:38 am, "Robins Tharakan"  wrote:
>
> Hi,
>
> An SQL (with only information_schema related JOINS) when triggered, runs
with max CPU (and never ends - killed after 2 days).
> - It runs similarly (very slow) on a replicated server that acts as a
read-only slave.
> - Top shows only postgres as hitting max CPU (nothing else). When query
killed, CPU near 0%.
> - When the DB is restored on a separate test server (with the exact
postgresql.conf) the same query works fine.
> - There is no concurrent usage on the replicated / test server (although
the primary is a Production server and has concurrent users).
>
> Questions:
> - If this was a postgres bug or a configuration issue, query on the
restored DB should have been slow too. Is there something very basic I am
missing here?
>
> If someone asks for I could provide SQL + EXPLAIN, but it feels
irrelevant here. I amn't looking for a specific solution but what else
should I be looking for here?

Get a series of stack traces.

Perf with stack output would be good too.

You need debug info for both.


[HACKERS] High-CPU consumption on information_schema (only) query

2016-09-07 Thread Robins Tharakan
Hi,

An SQL (with only information_schema related JOINS) when triggered, runs
with max CPU (and never ends - killed after 2 days).
- It runs similarly (very slow) on a replicated server that acts as a
read-only slave.
- Top shows only postgres as hitting max CPU (nothing else). When query
killed, CPU near 0%.
- When the DB is restored on a separate test server (with the exact
postgresql.conf) the same query works fine.
- There is no concurrent usage on the replicated / test server (although
the primary is a Production server and has concurrent users).

Questions:
- If this was a postgres bug or a configuration issue, query on the
restored DB should have been slow too. Is there something very basic I am
missing here?

If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant
here. I amn't looking for a specific solution but what else should I be
looking for here?

p.s.: All postgres servers are running the v9.3.10

-
robins
-- 

-
robins