Thanks, Rick - I will give it a try.

On Wed, Oct 12, 2022 at 6:12 AM Rick Otten <rottenwindf...@gmail.com> wrote:
> I like to use pgbadger to collect data on what is happening in RDS
> instances. You have to turn up a bunch of logging in RDS:
>
> 1. Turn on connection logging, duration logging, lock_waits, and anything
> else that you are interested in studying.
>
> 2. Then grab all of your postgresql logs from AWS. I wrote this little
> bash script to pull all of the logs for the current day. It will work if
> you have your aws credentials configured correctly and can run aws-cli
> commands.
>
> ```
> #!/usr/bin/env bash
>
> ## Return all of the postgresql log files saved by RDS since midnight.
> ## Save them in your current directory.
> ## This is so we can use cli tools like "grep".
> ## It is also really handy for feeding into pgbadger for deeper analysis.
>
> # aws requires the timestamp to be in milliseconds.
> # Unfortunately date will provide either seconds or nanoseconds, so we
> # have to do math.
> midnight_timestamp=$(date -d $(date -I) '+%s')
> midnight_timestamp_milliseconds=$(echo "${midnight_timestamp} * 1000" | bc)
>
> logfiles=$(aws rds describe-db-log-files \
>     --profile default \
>     --db-instance-identifier "some_rds_instance_name" \
>     --output json \
>     --file-last-written ${midnight_timestamp_milliseconds} \
>     | jq -r ".DescribeDBLogFiles[].LogFileName")
>
> for logfile in ${logfiles}
> do
>     # Remove the leading "error/" so we can use the name to save it.
>     logfile_save=$(echo "${logfile}" | awk -F\/ '{print $NF}')
>
>     tput bold; echo "${logfile}"; tput sgr0
>     aws rds download-db-log-file-portion \
>         --profile default \
>         --db-instance-identifier "some_rds_instance_name" \
>         --log-file-name ${logfile} \
>         --output text \
>         --no-paginate > ${logfile_save}
> done
> ```
>
> 3. Then run pgbadger:
>
> ```
> ~/src/pgbadger/pgbadger -f rds postgresql*
> ```
>
> 4. Open the `out.html` in your browser and poke around. There is a ton
> of stuff you can find in all the drop-down menus about what was happening
> in your database over the time window you collected the logs for. The html
> is generated as a standalone file by a perl script, of all things. It is
> pretty impressive.
>
> On Tue, Oct 11, 2022 at 7:07 AM Sengottaiyan T <techse...@gmail.com> wrote:
>
>> Hi All,
>>
>> I'm looking for suggestions:
>>
>> Environment: AWS PostgreSQL RDS instance - Version 14.3
>> Operations support gets intermittent alerts from the monitoring tool
>> through AWS CloudWatch metrics on Disk Queue Depth, CPU burst-credit & CPU
>> Utilization.
>> I would like to understand what is causing the spike - has the number of
>> logons increased, (or) the number of transactions per second increased, (or)
>> has a SQL execution picked a wrong plan so that a long-running (I/O, CPU or
>> memory intensive) SQL is increasing load on the server (cause and effect
>> scenario), etc.
>>
>> Due to the reactive nature of the issues, we rely on the metrics
>> gathered in AWS CloudWatch monitoring (for the underlying OS stats) and
>> Performance Insights (for the DB performance), and correlate SQL queries
>> with the pg_stat_statements view. But the data in the view is aggregated
>> stats, and I'm looking to see the deltas compared to normal runs.
>> How should I approach this and get to the root cause?
>>
>> AppDynamics is already configured for the RDS instance. Are there any
>> open source monitoring tools available which would help to capture and
>> visualize the deltas?
>>
>> Thanks,
>> Senko
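A minimal sketch of what step 1 in Rick's reply ("turn up a bunch of logging") might look like on RDS, where these settings go through a custom DB parameter group attached to the instance. The parameter group name (`my-custom-pg14`), the profile, and the 250 ms duration threshold are placeholders/assumptions, not values from the thread.

```
#!/usr/bin/env bash
# Hypothetical example: enable the logging mentioned in step 1 (connections,
# statement durations, lock waits) on an RDS PostgreSQL instance.
# "my-custom-pg14" is a placeholder parameter group name; 250 ms for
# log_min_duration_statement is an arbitrary starting point.
aws rds modify-db-parameter-group \
    --profile default \
    --db-parameter-group-name my-custom-pg14 \
    --parameters \
        "ParameterName=log_connections,ParameterValue=1,ApplyMethod=immediate" \
        "ParameterName=log_disconnections,ParameterValue=1,ApplyMethod=immediate" \
        "ParameterName=log_lock_waits,ParameterValue=1,ApplyMethod=immediate" \
        "ParameterName=log_min_duration_statement,ParameterValue=250,ApplyMethod=immediate"
```

All four of these parameters are dynamic, so `ApplyMethod=immediate` applies them without an instance reboot.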
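On Senko's point that pg_stat_statements only shows cumulative stats: one way to get the deltas "compared to normal runs" is to snapshot the view on a schedule and diff two snapshots. This is only a sketch under assumptions: the `pgss_snapshots` table, the connection details, and the PostgreSQL 14 column names (e.g. `total_exec_time`) are illustrative, not something established in the thread.

```
#!/usr/bin/env bash
# Hypothetical sketch: snapshot pg_stat_statements periodically (e.g. from
# cron) and diff two snapshots to see which statements accumulated the most
# calls and execution time in between. Connection details are placeholders.
PSQL="psql -h some-rds-endpoint -U some_user -d some_db"

# One-time setup: an empty table with the same columns plus a timestamp.
${PSQL} -c "CREATE TABLE IF NOT EXISTS pgss_snapshots AS
            SELECT now() AS snapped_at, * FROM pg_stat_statements WHERE false;"

# Take a snapshot (schedule this to run every few minutes).
${PSQL} -c "INSERT INTO pgss_snapshots SELECT now(), * FROM pg_stat_statements;"

# Diff the oldest and newest snapshots: top statements by added execution time.
${PSQL} <<'SQL'
SELECT b.queryid,
       left(b.query, 60)                                           AS query,
       b.calls - a.calls                                           AS delta_calls,
       round((b.total_exec_time - a.total_exec_time)::numeric, 1)  AS delta_exec_ms
FROM   pgss_snapshots a
JOIN   pgss_snapshots b USING (dbid, userid, queryid)
WHERE  a.snapped_at = (SELECT min(snapped_at) FROM pgss_snapshots)
AND    b.snapped_at = (SELECT max(snapped_at) FROM pgss_snapshots)
ORDER  BY delta_exec_ms DESC
LIMIT  20;
SQL
```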