Re: Operations GDPS health monitoring

2022-02-01 Thread Nigel Morton
Dave,

The last site I worked at used a heavily customised version of the GDPS
Status Display Facility (SDF) to give a single-screen GDPS overview of a
four-site configuration - so two sites with Metro Mirror within each site
and XRC between the sites. For each plex (three production and one
sandbox), there was a line showing the plex's status and the status of its
replication links, with the direction of each link shown as an arrow.
The SDF colour-coding of red/amber/green was used for each element, so if
a replication link was down, the arrow denoting that link was red. We'd also
see the markers for a system occasionally turn amber while a DS8000
rebuilt an array after a disk failure.

The primary thing we wanted to know was whether we were
HyperSwap-capable at each site, and whether the XRC links were working
between sites. As a secondary item, the arrows denoting the replication
direction would show if there'd been a HyperSwap that had, for some reason,
been missed.
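
As a rough illustration only, the roll-up described above could be sketched
as below. This is not how SDF or GDPS exposes its status; every name here
(Colour, PlexStatus, plex_colour, overview) is hypothetical. The choice of
red for a lost link or lost HyperSwap capability follows the description
above, while treating an unexpected replication direction as amber is purely
an assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class Colour(IntEnum):
    """Red/amber/green coding, ordered so max() picks the worst."""
    GREEN = 0
    AMBER = 1
    RED = 2


@dataclass
class PlexStatus:
    """Hypothetical per-plex record; field names are illustrative only."""
    name: str
    hyperswap_capable: bool
    xrc_link_up: bool
    expected_direction: str   # e.g. "A->B"
    actual_direction: str
    array_rebuilding: bool = False


def plex_colour(p: PlexStatus) -> Colour:
    # Primary checks: HyperSwap capability and the XRC link.
    if not p.hyperswap_capable or not p.xrc_link_up:
        return Colour.RED
    # Secondary checks (assumed amber): an unexpected replication
    # direction, or a disk array being rebuilt.
    if p.actual_direction != p.expected_direction or p.array_rebuilding:
        return Colour.AMBER
    return Colour.GREEN


def overview(plexes):
    """Return the worst colour overall plus the colour of each plex."""
    colours = {p.name: plex_colour(p) for p in plexes}
    worst = max(colours.values(), default=Colour.GREEN)
    return worst, colours
```

With something like this, operations would only need to watch the single
worst colour: red could mean page the on-call team, and amber could be left
for the next day. Where those thresholds sit is a site decision.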

This is from a few years ago so I've probably missed some important facets.

HTH



On Fri, 28 Jan 2022 at 20:09, Dave Jousma <
01a0403c5dc1-dmarc-requ...@listserv.ua.edu> wrote:

> All
>
> We have 5 sysplexes in an MGM4SITE configuration. We are nearly fully
> implemented and region-swap tested through all environments. We are
> looking at how to have our operations staff monitor GDPS for critical
> events. They wouldn't be expected to take any action, other than to page
> out the responsible team to address. Optimally, the critical errors (i.e.
> page out on-call) should be very limited, and anything else that can be
> deferred to the next day should be the bulk. It would be nice if critical
> errors could be rolled into a site scope or some other agnostic system
> monitoring tool. I am looking through the GDPS Planning and Implementation
> guides and do not really see any sort of monitoring methodology.
>
> I've asked IBM this question too, but am curious what other GDPS shops are
> doing for monitoring?
>
> Thanks, Dave

--
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN


Operations GDPS health monitoring

2022-01-28 Thread Dave Jousma
All

We have 5 sysplexes in an MGM4SITE configuration. We are nearly fully
implemented and region-swap tested through all environments. We are looking
at how to have our operations staff monitor GDPS for critical events. They
wouldn't be expected to take any action, other than to page out the responsible
team to address. Optimally, the critical errors (i.e. page out on-call) should
be very limited, and anything else that can be deferred to the next day should
be the bulk. It would be nice if critical errors could be rolled into a site
scope or some other agnostic system monitoring tool. I am looking through the
GDPS Planning and Implementation guides and do not really see any sort of
monitoring methodology.

I've asked IBM this question too, but am curious what other GDPS shops are 
doing for monitoring?

Thanks, Dave
