Hi, I have also been looking for solutions to improve sync. I have two clusters, 25 ms RTT, with RGW multi-site configured and all nodes running 12.2.12. There are three RGW nodes at each site, behind haproxy at each site. There is a 1G circuit between the sites and bandwidth usage averages 370 Mb/s. I can PUT (with Swift) to the remote site at wire speed.
Logs on the receiving site show plenty of:

heartbeat_map is_healthy 'RGWAsyncRadosProcessor::m_tp thread 0x7f16e022d700' had timed out after 600

...but it all works, albeit slowly. What should be my next move in researching a resolution for this?

peter

Peter Eisch
Senior Site Reliability Engineer

On 7/17/19, 8:44 AM, "ceph-users on behalf of Casey Bodley" <[email protected] on behalf of [email protected]> wrote:

On 7/17/19 8:04 AM, P. O. wrote:
> Hi,
> Is there any mechanism inside the rgw that can detect faulty endpoints
> for a configuration with multiple endpoints?

No, replication requests that fail just get retried using round robin until they succeed. If an endpoint isn't available, we assume it will come back eventually and keep trying.

> Is there any advantage related to the number of replication
> endpoints? Can I expect improved replication performance (the more
> synchronization rgws = the faster replication)?
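Casey's point above, that failed replication requests are simply retried round-robin across the configured endpoints, can be sketched roughly as follows. This is a Python illustration only, not RGW's actual C++ code: `fetch` is a hypothetical stand-in for an HTTP request against a peer zone, and the attempt cap exists only to keep the sketch bounded, whereas RGW itself keeps retrying indefinitely.

```python
from itertools import cycle

def sync_request(endpoints, fetch, max_attempts=12):
    """Round-robin over replication endpoints until a request succeeds.

    `fetch` is a hypothetical callable taking an endpoint URL and
    returning a result, raising OSError on failure. There is no health
    tracking: a down endpoint keeps getting its turn, matching
    "we assume it will come back eventually and keep trying".
    """
    last_err = None
    for _, endpoint in zip(range(max_attempts), cycle(endpoints)):
        try:
            return fetch(endpoint)
        except OSError as err:  # endpoint unavailable: move on to the next
            last_err = err
    raise last_err
```

Note what this implies for the original question: there is no blacklist of faulty endpoints, so a dead gateway in the list just costs a failed attempt on each pass.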
These endpoints act as the server side of replication, and handle GET requests from other zones to read replication logs and fetch objects. As long as the number of gateways on the client side of replication (i.e. gateways on other zones that have rgw_run_sync_thread enabled, which is on by default) scales along with these replication endpoints, you can expect a modest improvement in replication, though it is limited by the available bandwidth between sites. Spreading replication endpoints over several gateways also helps to limit the impact of replication on the local client workloads.

> On Wednesday, 17 July 2019, P. O. <[email protected]> wrote:
>
> Hi,
> Is there any mechanism inside the rgw that can detect faulty
> endpoints for a configuration with multiple endpoints? Is there
> any advantage related to the number of replication endpoints?
> Can I expect improved replication performance (the more
> synchronization rgws = the faster replication)?
>
> On Tuesday, 16 July 2019, Casey Bodley <[email protected]> wrote:
>
> We used to have issues when a load balancer was in front of
> the sync endpoints, because our http client didn't time out
> stalled connections. Those are resolved in luminous, but we
> still recommend using the radosgw addresses directly to avoid
> shoveling data through an extra proxy. Internally, sync is
> already doing a round robin over that list of endpoints. On
> the other hand, load balancers give you some extra
> flexibility, like adding/removing gateways without having to
> update the global multisite configuration.
>
> On 7/16/19 2:52 PM, P. O. wrote:
>
> Hi all,
> I have a multisite RGW setup with one zonegroup and two
> zones. Each zone has one endpoint configured like below:
>
> "zonegroups": [
>     {
>         ...
>         "is_master": "true",
>         "endpoints": ["http://192.168.100.1:80"],
>         "zones": [
>             {
>                 "name": "primary_1",
>                 "endpoints": ["http://192.168.100.1:80"],
>             },
>             {
>                 "name": "secondary_1",
>                 "endpoints": ["http://192.168.200.1:80"],
>             }
>         ],
>
> My question is: what is the best practice for configuring
> synchronization endpoints?
>
> 1) Should endpoints be behind a load balancer? For example,
>    two synchronization endpoints per zone, and only the load
>    balancer's address in the "endpoints" section?
> 2) Should endpoints be behind round-robin DNS?
> 3) Can I set the RGW addresses directly in the endpoints section?
> For example:
>
> "zones": [
>     {
>         "name": "primary_1",
>         "endpoints": ["http://192.168.100.1:80",
>                       "http://192.168.100.2:80"],
>     },
>     {
>         "name": "secondary_1",
>         "endpoints": ["http://192.168.200.1:80",
>                       "http://192.168.200.2:80"],
>     }
> ]
>
> Are there any advantages to the third option? I mean a speed-up
> of synchronization, for example.
>
> What recommendations do you have for the configuration of
> endpoints in prod environments?
>
> Best regards,
> Dun F.
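For the third option, the usual workflow is to dump the zonegroup document, add the extra gateway addresses to a zone's endpoint list, push it back, and commit the period (e.g. with `radosgw-admin zonegroup get`, `radosgw-admin zonegroup set`, and `radosgw-admin period update --commit`). A minimal sketch of the JSON edit, assuming a zonegroup document shaped like the one quoted above (the addresses are this thread's example IPs; `add_zone_endpoints` is a hypothetical helper, not a Ceph API):

```python
import json

def add_zone_endpoints(zonegroup_json, zone_name, new_endpoints):
    """Append gateway addresses to one zone's endpoint list in a
    zonegroup document (as produced by `radosgw-admin zonegroup get`).
    Existing entries are preserved; duplicates are skipped."""
    zonegroup = json.loads(zonegroup_json)
    for zone in zonegroup["zones"]:
        if zone["name"] == zone_name:
            for ep in new_endpoints:
                if ep not in zone["endpoints"]:
                    zone["endpoints"].append(ep)
    return json.dumps(zonegroup, indent=4)
```

The same result can be reached without editing JSON at all via `radosgw-admin zone modify --rgw-zone=<name> --endpoints=<comma-separated list>`, again followed by a period commit so all gateways pick up the change.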
_______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
