On Wed, Sep 7, 2016 at 4:37 PM, Robins Tharakan <thara...@gmail.com> wrote:
> Hi,
>
> An SQL (with only information_schema related JOINS) when triggered, runs with max CPU (and never ends - killed after 2 days).
> - It runs similarly (very slow) on a replicated server that acts as a read-only slave.
> - Top shows only postgres as hitting max CPU (nothing else). When query killed, CPU near 0%.
> - When the DB is restored on a separate test server (with the exact postgresql.conf) the same query works fine.
> - There is no concurrent usage on the replicated / test server (although the primary is a Production server and has concurrent users).
>
> Questions:
> - If this was a postgres bug or a configuration issue, query on the restored DB should have been slow too. Is there something very basic I am missing here?
>
> If someone asks for I could provide SQL + EXPLAIN, but it feels irrelevant here. I amn't looking for a specific solution but what else should I be looking for here?

Attach strace -ttt -T -y to the process to see what system calls it is making. If it is not making many system calls, or they are uninformative, then attach the gdb debugger to it, periodically interrupt the process (Ctrl-C), take a backtrace (bt), then resume it (c) and repeat. If all the stack traces look similar, you will know where the time is going.

Cheers,

Jeff
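
The interrupt / bt / continue cycle described above can also be run non-interactively. What follows is only a rough sketch under some assumptions: sample_stacks is a made-up helper name, the sample count and delay defaults are arbitrary, and you already know the PID of the busy backend (for instance from top). It relies on gdb's -p, -batch and -ex options, so each sample attaches, prints one backtrace, and detaches, letting the backend resume.

```shell
#!/bin/sh
# Sketch only: sample_stacks is a hypothetical helper, not a PostgreSQL tool.
# GDB may be overridden (e.g. GDB=echo for a dry run); defaults to the real gdb.
GDB=${GDB:-gdb}

# Usage: sample_stacks PID [COUNT] [DELAY_SECONDS]
sample_stacks() {
    pid=$1
    count=${2:-10}   # number of backtraces to take (arbitrary default)
    delay=${3:-5}    # seconds to wait between samples (arbitrary default)
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== sample $i of $count ==="
        # -batch runs the given command and exits, detaching from the process
        "$GDB" -p "$pid" -batch -ex bt 2>/dev/null
        sleep "$delay"
        i=$((i + 1))
    done
}
```

Something like "sample_stacks 12345 20 3 > stacks.txt" (12345 being a placeholder PID) would collect twenty backtraces three seconds apart. If the same few functions sit at the top of most samples, that is probably where the CPU time is going. Attaching normally needs to be done as the postgres OS user or root, and the function names are only readable if debug symbols are installed.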