On Tue, Apr 1, 2014 at 7:34 PM, Shang Wu wrote:
> Hi all,
>
> I have some questions about the Ceph multi-site implementation.
>
> I am thinking of using Ceph as the storage solution across three internal
> sites. I think, with a good internet connection, multi-site object
> storage with RADOS (or RGW) might be a good fit here. Each site would
> have a MON node and many OSDs, and the sites would replicate data between
> each other. With this implementation, I hope users will be able to
> READ/WRITE from/to the local office while Ceph takes care of the
> replication.
>
> So my questions are:
>
> 1. How does Ceph know how to retrieve data from the nearest location? (Ceph
> usually calculates where the data is through CRUSH rather than by the
> nearest location for the user.) Will the data be distributed evenly
> across the three sites? If not, how can we let users access the _local
> copy_?

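The idea is to keep separate Ceph clusters, one for each zone. You'll
have rgw running for each zone, configured to contact the local Ceph
cluster. Users at each site then go through their local rgw, and CRUSH
only places data within that site's cluster.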
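
As a rough sketch of what that means on the client side: each office
simply talks to the rgw of its own zone. The site names and endpoint
URLs below are made up for illustration; they are not anything Ceph
defines.

```python
# Hypothetical mapping of office/site -> the rgw endpoint of the zone
# served by that site's own Ceph cluster. Names and URLs are placeholders.
LOCAL_RGW = {
    "site-a": "http://rgw.site-a.example.com",
    "site-b": "http://rgw.site-b.example.com",
    "site-c": "http://rgw.site-c.example.com",
}


def rgw_for_site(site):
    """Return the rgw endpoint a client located in `site` should use."""
    try:
        return LOCAL_RGW[site]
    except KeyError:
        raise ValueError("no rgw configured for site %r" % site)
```

A client in the second office would call rgw_for_site("site-b") and
point its S3 or Swift library at the returned URL.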

> 2. Is "Multi-site object storage with RADOS" a good fit for this
> implementation, i.e. to READ/WRITE data to/from the local site? If not,
> what is the best way to approach this?

It's definitely one approach. I'm not sure I know enough about the
requirements to say whether it's a good fit. Specifically, with this
approach the replication within each region is not bi-directional, so
there are some limitations.

> 3. Does Ceph use the same ID (object name?) for all its replicas? Can we
> access (read/write) these replicas directly?

Yes. You can access the replicas directly (depending on the
configuration, of course), and replicas are generally independent. But
writes should not go to the replicas, as there's no bi-directional sync
process. You can set it up so that reads go to the replicas, though.
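
To illustrate that read/write split, here is a minimal sketch of how a
client or proxy might pick an endpoint. It assumes "site-a" hosts the
master zone and the other two sites host replicas; that choice, the
endpoint URLs, and the endpoint_for() helper are all hypothetical and
not part of rgw itself.

```python
# Assumption for illustration: "site-a" hosts the master zone and the
# other sites host replica zones. Endpoint URLs are placeholders.
MASTER_SITE = "site-a"

ENDPOINTS = {
    "site-a": "http://rgw.site-a.example.com",
    "site-b": "http://rgw.site-b.example.com",
    "site-c": "http://rgw.site-c.example.com",
}

# HTTP methods that only read data and are safe to serve from a replica.
READ_METHODS = {"GET", "HEAD"}


def endpoint_for(method, client_site):
    """Send reads to the client's local zone and all writes to the master."""
    if method.upper() in READ_METHODS:
        return ENDPOINTS[client_site]
    return ENDPOINTS[MASTER_SITE]
```

With this routing, a GET from site-c is served locally, while a PUT
from site-c goes to the master in site-a. Keep in mind that a read from
a replica may return older data until the sync catches up.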

> 4. In this multi-site scenario, when a user writes data to Ceph, will it
> find the nearest OSD to put the data on? When a user reads data, does Ceph
> always respond from the primary data set (regardless of location) or
> from the nearest replica copy?
>

This is a bit moot, since with my proposed solution you'll have one
cluster per location rather than a single cluster overall. In any case,
at the moment rgw accesses the primary wherever it is.

Yehuda