Re: [ceph-users] Multi-site replication speed

2019-04-19 Thread Brian Topping
Hi Casey,

I set up a completely fresh cluster on a new VM host, so everything is a clean 
install. I believe it installed cleanly, and because there is practically zero 
latency and effectively unlimited bandwidth between peer VMs, this is a better 
place to experiment. The behavior is the same as on the other cluster.

The realm is “example-test”, has a single zone group named “us”, and there are 
zones “left” and “right”. The master zone is “left” and I am trying to 
unidirectionally replicate to “right”. “left” is a two-node cluster and “right” 
is a single-node cluster. Both show “too few PGs per OSD” but are otherwise 
100% active+clean. Both clusters have been completely restarted to make sure 
there are no latent config issues, although only the RGW nodes should require 
that. 
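
For reference, the topology was built following the multisite docs, roughly as 
below; the endpoints are shown as I remember them and the system user’s keys are 
elided (a from-memory sketch, not a verbatim transcript):

    # on left01 (master zone side)
    radosgw-admin realm create --rgw-realm=example-test --default
    radosgw-admin zonegroup create --rgw-zonegroup=us \
        --endpoints=http://left01.example.com:7480 --master --default
    radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=left \
        --endpoints=http://left01.example.com:7480 --master --default
    radosgw-admin user create --uid=sync --display-name="Sync User" --system
    radosgw-admin zone modify --rgw-zone=left --access-key=<elided> --secret=<elided>
    radosgw-admin period update --commit
    # (restart the left RGWs)

    # on right01 (secondary zone side)
    radosgw-admin realm pull --url=http://left01.example.com:7480 \
        --access-key=<elided> --secret=<elided>
    radosgw-admin zone create --rgw-zonegroup=us --rgw-zone=right \
        --endpoints=http://right01.example.com:7480 \
        --access-key=<elided> --secret=<elided>
    # (for one-way replication the right zone can also be marked read-only
    #  with `radosgw-admin zone modify --rgw-zone=right --read-only`)
    radosgw-admin period update --commit
    # (restart the right RGW)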

The thread at [1] is the most involved engagement I’ve found with a staff 
member on the subject, so I checked and believe I attached all the logs that 
were requested there. They all appear to be consistent and are attached below.

To start: 
> [root@right01 ~]# radosgw-admin sync status
>           realm d5078dd2-6a6e-49f8-941e-55c02ad58af7 (example-test)
>       zonegroup de533461-2593-45d2-8975-99072d860bb2 (us)
>            zone 5dc80bbc-3d9d-46d5-8f3e-4611fbc17fbe (right)
>   metadata sync syncing
>                 full sync: 0/64 shards
>                 incremental sync: 64/64 shards
>                 metadata is caught up with master
>       data sync source: 479d3f20-d57d-4b37-995b-510ba10756bf (left)
>                         syncing
>                         full sync: 0/128 shards
>                         incremental sync: 128/128 shards
>                         data is caught up with source


I tried the information at [2] and do not see any ops in progress, just 
“linger_ops”. I don’t know what those are, but they probably explain the slow 
trickle of requests back and forth between the two RGW endpoints:
> [root@right01 ~]# ceph daemon client.rgw.right01.54395.94074682941968 
> objecter_requests
> {
> "ops": [],
> "linger_ops": [
> {
> "linger_id": 2,
> "pg": "2.16dafda0",
> "osd": 0,
> "object_id": "notify.1",
> "object_locator": "@2",
> "target_object_id": "notify.1",
> "target_object_locator": "@2",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
> },
> ...
> ],
> "pool_ops": [],
> "pool_stat_ops": [],
> "statfs_ops": [],
> "command_ops": []
> }
> 


The next thing I tried is `radosgw-admin data sync run --source-zone=left` from 
the right side. I get bursts of messages of the following form:
> 2019-04-19 21:46:34.281 7f1c006ad580  0 RGW-SYNC:data:sync:shard[1]: ERROR: 
> failed to read remote data log info: ret=-2
> 2019-04-19 21:46:34.281 7f1c006ad580  0 meta sync: ERROR: RGWBackoffControlCR 
> called coroutine returned -2


When I sorted and filtered the messages, each burst has one RGW-SYNC message per 
data log shard, identified by the number in “[]”; the numbers run from 0 to 127, 
matching the 128 data sync shards shown in the status above (and, coincidentally, 
the 128 PGs on the left side). The bursts happen about once every five seconds.
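
To see whether the master side actually has data log entries for those shards, I 
plan to look at the data log on left01 directly, along these lines (a sketch; the 
pool name assumes the default per-zone naming, and I haven’t confirmed this 
explains the 404s):

    # recent data log entries as radosgw-admin sees them
    radosgw-admin datalog list --max-entries=10

    # the data log is kept as data_log.<shard> objects in the zone's log pool
    rados -p left.rgw.log ls | grep '^data_log' | sort | head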

The packet traces between the nodes during the `data sync run` are mostly 
requests and responses of the following form:
> HTTP GET: 
> http://right01.example.com:7480/admin/log/?type=data=7=true=de533461-2593-45d2-8975-99072d860bb2
> HTTP 404 RESPONSE: 
> {"Code":"NoSuchKey","RequestId":"tx02a01-005cba9593-371d-right","HostId":"371d-right-us"}

When I stop the `data sync run`, these 404s stop, so the `data sync run` clearly 
isn’t changing any persistent state in the RGW; it is doing the work itself, 
synchronously. In the past I have run `data sync init`, but it doesn’t seem like 
repeating it would make any difference, so I haven’t done it again.

NEXT STEPS:

I am working on getting more verbose logging output from the daemons and hope to 
find something in there that helps. If I am lucky, I will find something and can 
report back so this thread is useful to others. If I have not written back, I 
probably haven’t found anything, and would be grateful for any leads.
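
Concretely, the plan is to raise the RGW debug levels and capture a verbose run, 
something like the following (a sketch; the admin socket name is the one shown 
above):

    # bump logging on the running gateway via its admin socket
    ceph daemon client.rgw.right01.54395.94074682941968 config set debug_rgw 20
    ceph daemon client.rgw.right01.54395.94074682941968 config set debug_ms 1

    # or run the one-off sync with verbose logging and capture the output
    radosgw-admin data sync run --source-zone=left \
        --debug-rgw=20 --debug-ms=1 2>&1 | tee data-sync-run.log

    # and check whether the sync machinery has recorded any errors
    radosgw-admin sync error list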

Kind regards and thank you!

Brian

[1] 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013188.html 

[2] 
http://docs.ceph.com/docs/master/radosgw/troubleshooting/?highlight=linger_ops#blocked-radosgw-requests
 


CONFIG DUMPS:

> [root@left01 ~]# radosgw-admin period get-current
> {
> "current_period": "cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c"
> }
> [root@left01 ~]# radosgw-admin period get cdc3d603-2bc8-493b-ba6a-c6a51c49cc0c
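
For cross-checking, the same queries on the right side should report the 
identical current period and zonegroup (a quick sanity check; output omitted):

    [root@right01 ~]# radosgw-admin period get-current
    [root@right01 ~]# radosgw-admin zonegroup get --rgw-zonegroup=us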

Re: [ceph-users] Multi-site replication speed

2019-04-18 Thread Brian Topping
Hi Casey, thanks for this info. It’s been doing something for 36 hours, but the 
status has not updated at all. So either “preparing for full sync” takes a really 
long time, or I’m doing something wrong. This is helpful information, but there 
are myriad states the system could be in. 
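
For my own notes, my reading of your advice boils down to something like this on 
the secondary (an untested sketch; the systemd unit name is just my guess for my 
hosts):

    radosgw-admin metadata sync init
    radosgw-admin data sync init --source-zone=master

    # the gateways only pick up the cleared status after a restart
    systemctl restart ceph-radosgw@rgw.secondary

    # then watch progress
    radosgw-admin sync status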

With that, I’m going to set up a lab rig and see if I can build a fully 
replicated state. At that point, I’ll have a better understanding of what a 
working system responds like and maybe I can at least ask better questions, 
hopefully figure it out myself. 

Thanks again! Brian

> On Apr 16, 2019, at 08:38, Casey Bodley  wrote:
> 
> Hi Brian,
> 
> On 4/16/19 1:57 AM, Brian Topping wrote:
>>> On Apr 15, 2019, at 5:18 PM, Brian Topping wrote:
>>> 
>>> If I am correct, how do I trigger the full sync?
>> 
>> Apologies for the noise on this thread. I came to discover the 
>> `radosgw-admin [meta]data sync init` command. That left me with 
>> something that looked like this for several hours:
>> 
>>> [root@master ~]# radosgw-admin  sync status
>>>   realm 54bb8477-f221-429a-bbf0-76678c767b5f (example)
>>>   zonegroup 8e33f5e9-02c8-4ab8-a0ab-c6a37c2bcf07 (us)
>>>zone b6e32bc8-f07e-4971-b825-299b5181a5f0 (secondary)
>>>   metadata sync preparing for full sync
>>> full sync: 64/64 shards
>>> full sync: 0 entries to sync
>>> incremental sync: 0/64 shards
>>> metadata is behind on 64 shards
>>> behind shards: 
>>> [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63]
>>>   data sync source: 35835cb0-4639-43f4-81fd-624d40c7dd6f (master)
>>> preparing for full sync
>>> full sync: 1/128 shards
>>> full sync: 0 buckets to sync
>>> incremental sync: 127/128 shards
>>> data is behind on 1 shards
>>> behind shards: [0]
>> 
>> I also had the data sync showing a list of “behind shards”, but both of them 
>> sat in “preparing for full sync” for several hours, so I tried 
>> `radosgw-admin [meta]data sync run`. My sense is that was a bad idea, but 
>> neither of the commands seems to be documented, and the thread I found them on 
>> indicated they wouldn’t damage the source data.
>> 
>> QUESTIONS at this point:
>> 
>> 1) What is the best sequence of commands to properly start the sync? Does 
>> init just set things up and do nothing until a run is started?
> The sync is always running. Each shard starts with full sync (where it lists 
> everything on the remote, and replicates each), then switches to incremental 
> sync (where it polls the replication logs for changes). The 'metadata sync 
> init' command clears the sync status, but this isn't synchronized with the 
> metadata sync process running in radosgw(s) - so the gateways need to restart 
> before they'll see the new status and restart the full sync. The same goes 
> for 'data sync init'.
>> 2) Are there commands I should run before that to clear out any previous bad 
>> runs?
> Just restart gateways, and you should see progress via 'sync status'.
>> 
>> Thanks very kindly for any assistance. As I didn’t really see any 
>> documentation outside of setting up the realms/zones/groups, it seems like 
>> this would be useful information for others that follow.
>> 
>> best, Brian
>> 


Re: [ceph-users] Multi-site replication speed

2019-04-16 Thread Casey Bodley

Hi Brian,

On 4/16/19 1:57 AM, Brian Topping wrote:
On Apr 15, 2019, at 5:18 PM, Brian Topping wrote:


If I am correct, how do I trigger the full sync?


Apologies for the noise on this thread. I came to discover the 
`radosgw-admin [meta]data sync init` command. That left me with 
something that looked like this for several hours:



[root@master ~]# radosgw-admin  sync status
          realm 54bb8477-f221-429a-bbf0-76678c767b5f (example)
      zonegroup 8e33f5e9-02c8-4ab8-a0ab-c6a37c2bcf07 (us)
           zone b6e32bc8-f07e-4971-b825-299b5181a5f0 (secondary)
  metadata sync preparing for full sync
                full sync: 64/64 shards
                full sync: 0 entries to sync
                incremental sync: 0/64 shards
                metadata is behind on 64 shards
                behind shards: 
[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63]

      data sync source: 35835cb0-4639-43f4-81fd-624d40c7dd6f (master)
                        preparing for full sync
                        full sync: 1/128 shards
                        full sync: 0 buckets to sync
                        incremental sync: 127/128 shards
                        data is behind on 1 shards
                        behind shards: [0]


I also had the data sync showing a list of “behind shards”, but both 
of them sat in “preparing for full sync” for several hours, so I tried 
`radosgw-admin [meta]data sync run`. My sense is that was a bad idea, 
but neither of the commands seems to be documented, and the thread I 
found them on indicated they wouldn’t damage the source data.


QUESTIONS at this point:

1) What is the best sequence of commands to properly start the sync? 
Does init just set things up and do nothing until a run is started?
The sync is always running. Each shard starts with full sync (where it 
lists everything on the remote, and replicates each), then switches to 
incremental sync (where it polls the replication logs for changes). The 
'metadata sync init' command clears the sync status, but this isn't 
synchronized with the metadata sync process running in radosgw(s) - so 
the gateways need to restart before they'll see the new status and 
restart the full sync. The same goes for 'data sync init'.
2) Are there commands I should run before that to clear out any 
previous bad runs?

Just restart gateways, and you should see progress via 'sync status'.


Thanks very kindly for any assistance. As I didn’t really see any 
documentation outside of setting up the realms/zones/groups, it seems 
like this would be useful information for others that follow.


best, Brian



Re: [ceph-users] Multi-site replication speed

2019-04-15 Thread Brian Topping
> On Apr 15, 2019, at 5:18 PM, Brian Topping  wrote:
> 
> If I am correct, how do I trigger the full sync?

Apologies for the noise on this thread. I came to discover the `radosgw-admin 
[meta]data sync init` command. That left me with something that looked like 
this for several hours:

> [root@master ~]# radosgw-admin  sync status
>   realm 54bb8477-f221-429a-bbf0-76678c767b5f (example)
>   zonegroup 8e33f5e9-02c8-4ab8-a0ab-c6a37c2bcf07 (us)
>zone b6e32bc8-f07e-4971-b825-299b5181a5f0 (secondary)
>   metadata sync preparing for full sync
> full sync: 64/64 shards
> full sync: 0 entries to sync
> incremental sync: 0/64 shards
> metadata is behind on 64 shards
> behind shards: 
> [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63]
>   data sync source: 35835cb0-4639-43f4-81fd-624d40c7dd6f (master)
> preparing for full sync
> full sync: 1/128 shards
> full sync: 0 buckets to sync
> incremental sync: 127/128 shards
> data is behind on 1 shards
> behind shards: [0]

I also had the data sync showing a list of “behind shards”, but both of them 
sat in “preparing for full sync” for several hours, so I tried `radosgw-admin 
[meta]data sync run`. My sense is that was a bad idea, but neither of the 
commands seems to be documented, and the thread I found them on indicated they 
wouldn’t damage the source data. 

QUESTIONS at this point:

1) What is the best sequence of commands to properly start the sync? Does init 
just set things up and do nothing until a run is started?
2) Are there commands I should run before that to clear out any previous bad 
runs?

Thanks very kindly for any assistance. As I didn’t really see any documentation 
outside of setting up the realms/zones/groups, it seems like this would be 
useful information for others that follow.

best, Brian


Re: [ceph-users] Multi-site replication speed

2019-04-15 Thread Brian Topping
I’m starting to wonder whether I actually do have things configured and working 
correctly, and the light traffic I am seeing is simply that of incremental 
replication. That would make sense: the cluster being replicated does not have a 
lot of traffic on it yet. Obviously, without the full replication, the 
incremental replication is pretty useless.

Here’s the status coming from the secondary side:

> [root@secondary ~]# radosgw-admin  sync status
>           realm 54bb8477-f221-429a-bbf0-76678c767b5f (example)
>       zonegroup 8e33f5e9-02c8-4ab8-a0ab-c6a37c2bcf07 (us)
>            zone b6e32bc8-f07e-4971-b825-299b5181a5f0 (secondary)
>   metadata sync syncing
>                 full sync: 0/64 shards
>                 incremental sync: 64/64 shards
>                 metadata is caught up with master
>       data sync source: 35835cb0-4639-43f4-81fd-624d40c7dd6f (master)
>                         syncing
>                         full sync: 0/128 shards
>                         incremental sync: 128/128 shards
>                         data is caught up with source


If I am correct, how do I trigger the full sync?

Thanks!! Brian


Re: [ceph-users] Multi-site replication speed

2019-04-14 Thread Brian Topping


> On Apr 14, 2019, at 2:08 PM, Brian Topping  wrote:
> 
> Every so often I might see the link running at 20 Mbits/sec, but it’s not 
> consistent. It’s probably going to take a very long time at this rate, if 
> ever. What can I do?

Correction: I was looking at statistics on an aggregate interface while my 
laptop was rebuilding a mailbox. The typical transfer is around 60 Kbits/sec, 
but as I said, iperf3 can easily push the link between the two points to 
>750 Mbits/sec. Also, both machines always show >90% idle...
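
(For what it’s worth, watching the specific replication interface rather than 
the aggregate avoids that mistake; a sketch, assuming the link is eth0 and 
sysstat is installed:)

    # per-interface throughput, sampled every second
    sar -n DEV 1 | grep eth0

    # or raw counters straight from the kernel
    ip -s link show dev eth0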



[ceph-users] Multi-site replication speed

2019-04-14 Thread Brian Topping
Hi all! I’m finally running with Ceph multi-site per 
http://docs.ceph.com/docs/nautilus/radosgw/multisite/, woo hoo!

I wanted to confirm that the process can be slow. It’s been a couple of hours 
since the sync started and `radosgw-admin sync status` does not report any 
errors, but the speeds are nowhere near link saturation. iperf3 reports 773 
Mbits/sec on the link in TCP mode, latency is about 5ms. 
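
For reference, the throughput and latency numbers come from plain TCP tests 
between the two gateway hosts, along these lines (a sketch; hostnames are 
illustrative):

    # on the secondary (server side)
    iperf3 -s

    # on the master (client side)
    iperf3 -c secondary.example.com -t 30
    ping -c 10 secondary.example.com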

Every so often I might see the link running at 20 Mbits/sec, but it’s not 
consistent. It’s probably going to take a very long time at this rate, if ever. 
What can I do?

I’m using civetweb without SSL on the gateway endpoints, only one 
master/mon/rgw for each end on Nautilus 14.2.0.
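
For context, the gateway frontend is the stock civetweb setup, i.e. something 
like this in ceph.conf (a sketch; the instance name is illustrative):

    [client.rgw.master]
        rgw frontends = civetweb port=7480
        rgw zone = master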

Apologies if I’ve missed some crucial tuning docs or archive messages somewhere 
on the subject.

Thanks! Brian