John has attached numbers to my previous opinion: synchronous PPRC is not a good choice for continuous mirroring over any significant distance. You may be able to tweak some option here or there, but it's a game of whack-a-mole: today's manageable problem will morph into yet another problem tomorrow in search of yet another tweak. I'm not sure what OP's management is expecting in the way of RPO. Our XRC usually reports very low consistency delay. At the moment, I'm seeing DATA EXPOSURE(00:00:02.76); less than 3 seconds. The business can survive that data loss.
As Ron Hawkins said earlier, the goal of zero RPO is unreasonable. OTOH real production performance problems abound when PPRC gets caught in a time warp. As the Hippocrates of IT says, whatever action you take to provide for disaster recovery, do no harm to production in the process.

.
.
J.O.Skip Robinson
Southern California Edison Company
Electric Dragon Team Paddler
SHARE MVS Program Co-Manager
323-715-0595 Mobile
626-543-6132 Office ⇐=== NEW
robin...@sce.com

-----Original Message-----
From: IBM Mainframe Discussion List [mailto:IBM-MAIN@LISTSERV.UA.EDU] On Behalf Of John Eells
Sent: Thursday, February 15, 2018 6:52 AM
To: IBM-MAIN@LISTSERV.UA.EDU
Subject: (External):Re: DASD problem

Disclaimer: I am not a performance expert, so take this with a large grain of salt.

I agree with what Ron wrote: That synchronously replicated disk I/O write response times are longer than those for volumes that are not replicated is not surprising. For basic PPRC it will be higher to start with, and distance makes things worse.

For 30km separation, I get ~.2ms round trip time just to get there and back at the speed of light. Multiply that by the reciprocal of the fiber's velocity factor. Let's call that .7 (I don't know the actual number) which makes it about .3ms per PPRC exchange. I don't know how many exchanges it takes to replicate something over PPRC, but you'd have to multiply the .3ms by that number and add data transfer time to see the effect of distance. If anything else affects your PPRC replication traffic, these numbers can only go up.

You might want to see whether HyperWrite, which starts I/O to both disk subsystems at more or less the same time, would help you enough to consider using it. There's an article about it here:

http://ibmsystemsmag.com/mainframe/administrator/db2/zhyperwrite-zip/

It will mostly eliminate the initial PPRC write delays, which might be a significant chunk of your replicated disk I/O response times.
But nothing can help you with the distance-imposed latency. We know how to slow light down, but speeding it up is "more difficult." This is one reason some people opt for a nonzero RPO when meaningful distances are involved.

All the above are "back of a napkin" numbers. If I've got them wrong, I am sure someone will jump in. I think the basic problems, on the other hand, are well-understood.

If you double-check the assumptions and math and the result explains what you are seeing within reasonable spitting distance, then there probably won't be much you can do (other than perhaps HyperWrite). On the other hand, if the latency due to distance does not explain what you are seeing, then (as Ron pointed out) there are a number of things you can check on (about which I personally know little or nothing).

Tommy Tsui wrote:
> Hi,
> The distance is around 30km, do you know any settings on sysplex
> environment such as GRS and JES2 checkpoint need to aware?
> Direct DASD via San switch to Dr site , 2GBPS interface , we check
> with vendor, they didn't find any problem on San switch or DASD, I
> suspect the system settings
<snip>

--
John Eells
IBM Poughkeepsie
ee...@us.ibm.com
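
For anyone who wants to rerun John's napkin math with their own distance, here is a minimal Python sketch of it. The function names are made up for illustration, the 0.7 velocity factor is John's guess rather than a measured value, and the exchange count of 2 is a pure placeholder, since the number of exchanges per replicated write is not given anywhere in this thread.

```python
# Back-of-a-napkin estimate of distance-imposed latency for synchronous
# replication. All names here are hypothetical helpers, not any product API.

SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # speed of light in vacuum


def round_trip_ms(distance_km, velocity_factor=0.7):
    """Round-trip propagation time in milliseconds over a fiber link.

    velocity_factor is the fraction of the vacuum speed of light at which
    the signal travels in the fiber; 0.7 is the guess used in the post.
    """
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor)
    return 2 * one_way_s * 1000.0


def added_write_latency_ms(distance_km, exchanges,
                           velocity_factor=0.7, transfer_ms=0.0):
    """Latency added per replicated write: one round trip per exchange,
    plus whatever data transfer time the caller supplies."""
    return exchanges * round_trip_ms(distance_km, velocity_factor) + transfer_ms


if __name__ == "__main__":
    # 30 km is the separation quoted by the original poster.
    print(f"vacuum round trip: {round_trip_ms(30, 1.0):.3f} ms")
    print(f"fiber round trip:  {round_trip_ms(30):.3f} ms")
    # exchanges=2 is an illustrative assumption only.
    print(f"2 exchanges:       {added_write_latency_ms(30, 2):.3f} ms")
```

This only models propagation delay. Switch hops, protocol overhead, queuing on the links, and the data transfer itself come on top, so measured response times should be expected to sit above whatever this prints.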