Hello,

(everything in context of S3)


I'm currently trying to better understand bucket sharding in combination with 
a multisite RGW setup, and its possible limitations.

At the moment my understanding is that a bucket has a bucket index, which is a 
list of the objects within the bucket.

There are also indexless buckets, but those are not usable for cases like a 
multisite RGW bucket, where you need a (delayed) consistent relation/state 
between bucket n [zone a] and bucket n [zone b].

Those bucket indexes are stored in "shards", and the shards get distributed 
across the whole cluster of a zone for scaling purposes.
Red Hat recommends a maximum of 102,400 objects per shard and recommends this 
formula to determine the right shard count for a cluster:

number of objects expected in a bucket / 100,000

The maximum number of supported shards (or the tested limit) is 7877 shards.

That results in a total limit of 787,700,000 objects, as long as you want to 
stay in known and tested waters.
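To make the arithmetic concrete, a quick sketch (the 50 million object count is just a made-up example):

```shell
# Hypothetical bucket expected to hold 50 million objects
expected_objects=50000000

# Rule of thumb from above: one shard per ~100,000 objects, rounded up
shards=$(( (expected_objects + 99999) / 100000 ))
echo "recommended shards: $shards"            # -> 500

# The tested limit of 7877 shards then gives the overall object ceiling
echo "object ceiling: $(( 7877 * 100000 ))"   # -> 787700000
```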

Now, some things I did not 100% understand:

= QUESTION 1 =

Does each bucket have its own shards? E.g. if bucket 1 reaches its shard limit 
at 7877 shards, can I then create other buckets which start with their own 
fresh sets of shards?
Or is it the other way around, meaning all buckets store their index in the 
same shards, and if I reach the shard limit I need to create a second cluster?

= QUESTION 2 =
How are these shards distributed across the cluster? I expect they are just 
objects in the rgw.buckets.index pool, is that correct?
So, these ones:
rados ls -p a.rgw.buckets.index 
.dir.3638e3a4-8dde-42ee-812a-f98e266548a4.274451.1
.dir.3638e3a4-8dde-42ee-812a-f98e266548a4.87683.1
.dir.3638e3a4-8dde-42ee-812a-f98e266548a4.64716.1
.dir.3638e3a4-8dde-42ee-812a-f98e266548a4.78046.2
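As far as I can tell, the actual index entries live in the omap key/value store of those .dir.* objects, so listing a shard's omap keys should show the object names held by that shard (reusing one object name from the output above; this of course needs a live cluster):

```shell
# List the index entries (object names) held by one index shard
rados -p a.rgw.buckets.index listomapkeys \
    .dir.3638e3a4-8dde-42ee-812a-f98e266548a4.274451.1
```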

= QUESTION 3 = 


Do these bucket index shards have any relation to the RGW sync shards in an 
RGW multisite setup?
E.g. if I have a ton of bucket index shards or buckets, does that have any 
impact on the sync shards?

radosgw-admin sync status
          realm f0019e09-c830-4fe8-a992-435e6f463b7c (mumu_1)
      zonegroup 307a1bb5-4d93-4a01-af21-0d8467b9bdfe (EU_1)
           zone 5a9c4d16-27a6-4721-aeda-b1a539b3d73a (b)
  metadata sync syncing
                full sync: 0/64 shards                    <= these ones I mean
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 3638e3a4-8dde-42ee-812a-f98e266548a4 (a)
                        syncing
                        full sync: 0/128 shards           <= and these ones
                        incremental sync: 128/128 shards  <= and these ones
                        data is caught up with source


= QUESTION 4 = 
(switching to sync shard related topics)


What is the exact function and purpose of the sync shards? Do they impose any 
limit? E.g. maybe a maximum number of object entries waiting for 
synchronization to zone b.


= QUESTION 5 = 
Are those sync shards processed in parallel or sequentially? And where are 
those shards stored?


= QUESTION 6 = 
As far as I have experienced it, the sync process pretty much works like this:

1.) The client sends an object or an operation to rados gateway A (RGW A).
2.) RGW A logs this operation into one of its sync shards and applies the 
operation to its local storage pool.
3.) RGW B checks via GET requests at a regular interval whether any new 
entries have appeared in RGW A's log.
4.) If a new entry exists, RGW B applies the operation to its local pool or 
pulls the new object from RGW A.
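If that is right, the four steps boil down to an append-only log on the source side plus a marker-based poll on the target side. A toy local simulation of that model, just to check my mental picture (plain files stand in for the log shard and the sync marker; nothing here is Ceph-specific):

```shell
workdir=$(mktemp -d)
log="$workdir/datalog"         # stands in for one of RGW A's log shards
echo 0 > "$workdir/marker"     # RGW B's position in that log

# Steps 1+2: RGW A applies client operations locally and records them in the log
echo "PUT bucket1/obj1" >> "$log"
echo "PUT bucket1/obj2" >> "$log"

# Steps 3+4: RGW B polls for entries past its marker and replays them
marker=$(cat "$workdir/marker")
tail -n +"$((marker + 1))" "$log" | while read -r op; do
  echo "zone B replays: $op"
done
wc -l < "$log" > "$workdir/marker"   # advance the marker

rm -rf "$workdir"
```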

Did I understand that correctly? (For my rough description of this 
functionality, I want to apologize to the developers, who surely invested 
much time and effort into designing and building that sync process.)

And if I understood it correctly, what would the exact strategy look like in 
a multisite setup to resync e.g. a single bucket where one zone's copy got 
corrupted and must be brought back into a synchronous state?
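I guess the per-bucket sync tooling in radosgw-admin is what is relevant here? Purely as a sketch of what I would expect to reach for (this needs a live cluster, and the bucket name is made up):

```shell
# Inspect the per-bucket sync state on the zone that fell behind
radosgw-admin bucket sync status --bucket=bucket1

# And, if I understand it, this marks the bucket for a fresh full sync?
radosgw-admin bucket sync init --bucket=bucket1
```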


I hope this is the correct place to ask such questions.

Best Regards,
Daly
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
