John has attached numbers to my previous opinion: synchronous PPRC is not a 
good choice for continuous mirroring over any significant distance. You may be 
able to tweak some option here or there, but it's a game of whack-a-mole: 
today's manageable problem will morph into yet another problem tomorrow in 
search of yet another tweak. I'm not sure what OP's management is expecting in 
the way of RPO. Our XRC usually reports very low consistency delay. At the 
moment, I'm seeing DATA EXPOSURE(00:00:02.76); less than 3 seconds. The 
business can survive that data loss.

As Ron Hawkins said earlier, the goal of zero RPO is unreasonable. OTOH real 
production performance problems abound when PPRC gets caught in a time warp. As 
the Hippocrates of IT says, whatever action you take to provide for disaster 
recovery, do no harm to production in the process. 

.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler 
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-543-6132 Office ⇐=== NEW
robin...@sce.com


-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf 
Of John Eells
Sent: Thursday, February 15, 2018 6:52 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: (External):Re: DASD problem

Disclaimer: I am not a performance expert, so take this with a large grain of 
salt.

I agree with what Ron wrote: it is not surprising that write response times for 
synchronously replicated disk I/O are longer than those for volumes that are not 
replicated.  For basic PPRC they will be higher to start with, and distance makes 
things worse.

For 30km separation, I get ~.2ms round trip time just to get there and back at 
the speed of light.  Multiply that by the reciprocal of the fiber's velocity 
factor.  Let's call that .7 (I don't know the actual number), which makes it 
about .3ms per PPRC exchange.  I don't know how many 
exchanges it takes to replicate something over PPRC, but you'd have to multiply 
the .3ms by that number and add data transfer time to see the effect of 
distance.  If anything else affects your PPRC replication traffic, these 
numbers can only go up.
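
For anyone who wants to redo the napkin math with their own distance, here is a 
rough Python sketch of the same arithmetic.  The function names are made up for 
illustration, the .7 velocity factor is the same guess as above, and the number 
of exchanges per replicated write is a pure placeholder, since that figure is 
not known here.  It covers propagation delay only.

```python
# Back-of-the-napkin estimate of distance-imposed latency for synchronous
# replication.  Propagation delay only: no switch, protocol, or controller time.

C_KM_PER_MS = 299_792.458 / 1000.0  # speed of light in vacuum, km per millisecond


def round_trip_ms(distance_km, velocity_factor=0.7):
    """Round-trip propagation delay in ms for a one-way distance in km.

    velocity_factor is the fraction of c at which the signal travels in the
    fiber.  The default of 0.7 is an assumed value, not a measured one.
    """
    return 2.0 * distance_km / (C_KM_PER_MS * velocity_factor)


def added_write_latency_ms(distance_km, exchanges, velocity_factor=0.7,
                           transfer_ms=0.0):
    """Latency added to one replicated write.

    exchanges is the number of round trips per write (hypothetical input: the
    real count for PPRC is not stated in this thread).  transfer_ms is the data
    transfer time, which has to be supplied separately.
    """
    return exchanges * round_trip_ms(distance_km, velocity_factor) + transfer_ms


if __name__ == "__main__":
    print(f"vacuum round trip, 30 km: {round_trip_ms(30, 1.0):.3f} ms")
    print(f"fiber round trip, 30 km:  {round_trip_ms(30):.3f} ms")
    # Two exchanges per write is only an example value.
    print(f"two exchanges, 30 km:     {added_write_latency_ms(30, 2):.3f} ms")
```

For 30km this gives roughly .2ms in a vacuum and just under .3ms in fiber per 
exchange, which matches the figures above.  Plug in the real exchange count and 
transfer time if you can get them from your vendor.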

You might want to see whether HyperWrite, which starts I/O to both disk 
subsystems at more or less the same time, would help you enough to consider 
using it.  There's an article about it here:

http://ibmsystemsmag.com/mainframe/administrator/db2/zhyperwrite-zip/

It will mostly eliminate the initial PPRC write delays, which might be a 
significant chunk of your replicated disk I/O response times.  But nothing can 
help you with the distance-imposed latency.  We know how to slow light down, 
but speeding it up is "more difficult."  This is one reason some people opt for 
a nonzero RPO when meaningful distances are involved.

All the above are "back of a napkin" numbers.  If I've got them wrong, I am 
sure someone will jump in.  I think the basic problems, on the other hand, are 
well-understood.  If you double-check the assumptions and math and the result 
explains what you are seeing within reasonable spitting distance, then there 
probably won't be much you can do (other than perhaps HyperWrite).

On the other hand, if the latency due to distance does not explain what you are 
seeing, then (as Ron pointed out) there are a number of things you can check on 
(about which I personally know little or nothing).

Tommy Tsui wrote:
> Hi,
> The distance is around 30km, do you know any settings on sysplex 
> environment such as GRS and JES2 checkpoint need to aware?
> Direct DASD via San switch to Dr site , 2GBPS interface , we check 
> with vendor, they didn't find any problem on San switch or DASD, I 
> suspect the system settings
<snip>


--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com


----------------------------------------------------------------------
For IBM-MAIN subscribe / signoff / archive access instructions,
send email to lists...@listserv.ua.edu with the message: INFO IBM-MAIN
