Hi Pragya

Let me try to answer these.

1#  The decision is based on your use case (performance, reliability). If 
you need high performance out of your cluster, the deployer will create a pool 
on SSDs and assign this pool to applications which require higher I/O. For 
example, if you integrate OpenStack with Ceph, you can set the OpenStack 
configuration files to write data to a specific Ceph pool 
(http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance). Similarly, 
you can tell CephFS and RadosGW which pools to use for data storage.
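
As a rough illustration of the deployer side, here is a minimal Python sketch 
that only builds (it does not run) the ceph CLI commands for creating a pool 
and setting its replication level. The pool name "ssd-pool", the PG count 128 
and the ruleset id 1 are made-up values for the example, and the helper names 
are hypothetical. It assumes a CRUSH ruleset that selects the SSD hosts 
already exists; the pool setting was called crush_ruleset in releases of this 
era, and newer releases call it crush_rule.

```python
import shlex


def pool_setup_commands(pool, pg_num, size, crush_ruleset=None):
    """Build the ceph CLI commands (as argument lists) that create a pool
    and set its replica count. Nothing is executed here."""
    cmds = [
        # Create the pool with the given number of placement groups.
        ["ceph", "osd", "pool", "create", pool, str(pg_num)],
        # Set the replication level (number of copies of each object).
        ["ceph", "osd", "pool", "set", pool, "size", str(size)],
    ]
    if crush_ruleset is not None:
        # Assumption: a ruleset with this id already exists and maps to SSDs.
        cmds.append(
            ["ceph", "osd", "pool", "set", pool, "crush_ruleset", str(crush_ruleset)]
        )
    return cmds


def render(cmds):
    """Render the commands as shell lines for review before running them."""
    return "\n".join(" ".join(shlex.quote(arg) for arg in cmd) for cmd in cmds)


if __name__ == "__main__":
    # Hypothetical example values; adjust to your own cluster.
    print(render(pool_setup_commands("ssd-pool", 128, 3, crush_ruleset=1)))
```

Once such a pool exists, the application (Glance, Cinder, RadosGW etc.) is 
pointed at it by name in its own configuration.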

2#  Usually the end user (a client of the Ceph cluster) does not care where 
the data is stored, which pool it is using, or where the data physically 
lives. The end user will ask for specific performance, reliability and 
availability. It is the job of the Ceph admin to fulfil their storage 
requirements using Ceph features such as SSD-backed pools, erasure coding, 
replication level etc.


Block Device :- The end user tells the application (Qemu / KVM, OpenStack 
etc.) which pool it should use for data storage. rbd is the default pool for 
block devices.
CephFS :- The end user mounts CephFS as a filesystem and uses it from there. 
The default pools are data and metadata.
RadosGW :- The end user stores objects using the S3 or Swift API.



- Karan Singh -

On 15 Jul 2014, at 07:42, pragya jain <[email protected]> wrote:

> Thank you very much, Craig, for your clear answers to my questions. 
> 
> Now I am very clear about the concept of pools in ceph.
> 
> But I have two small questions:
> 1. How does the deployer decide that a particular type of information will be 
> stored in a particular pool? Are there any settings the deployer should make 
> when creating a pool to determine which type of data is stored in which 
> pool?
> 
> 2. How does an end user specify which pool his/her data will be stored in? 
> How can an end user find out which pools are stored on SSDs or on HDDs, and 
> what the properties of a particular pool are? 
> 
> Thanks again. Please help clear up these points as well. 
> 
> Regards
> Pragya Jain
> 
> 
> On Sunday, 13 July 2014 5:04 AM, Craig Lewis <[email protected]> 
> wrote:
> 
> 
> I'll answer out of order.
> 
> #2: rbd is used for RBD images.  data and metadata are used by CephFS.  
> RadosGW's default pools will be created the first time radosgw starts up.  If 
> you aren't using RBD or CephFS, you can ignore those pools.
> 
> #1: RadosGW will use several pools to segregate its data.  There are a 
> couple of pools for storing user/subuser information, as well as pools for 
> storing the actual data.  I'm using federation, and I have a total of 18 
> pools that RadosGW is using in some form.  Pools are a way to logically 
> separate your data, and pools can also have different replication/storage 
> settings.  For example, I could say that the .rgw.buckets.index pool needs 
> 4x replication and is only stored on SSDs, while .rgw.buckets is 3x 
> replication on HDDs.
> 
> #3: In addition to #1, you can set up different pools to actually store user 
> data in RadosGW.  For example, an end user may have some very important data 
> that you want replicated 4 times, and some other data that needs to be stored 
> on SSDs for low latency.  Using CRUSH, you would create some RADOS pools 
> with those specs.  Then you'd set up some placement targets in RadosGW that 
> use those pools.  A user that cares will specify a placement target when they 
> create a bucket.  That way they can decide what the storage requirements are. 
>  If they don't care, then they can just use the default.
> 
> Does that help?
> 
> 
> 
> On Thu, Jul 10, 2014 at 11:34 PM, pragya jain <[email protected]> wrote:
> hi all,
> 
> I have some very basic questions about pools in ceph.
> 
> According to the ceph documentation, when we deploy a ceph cluster with a 
> radosgw instance on top of it, ceph creates pools by default to store the 
> data, or the deployer can create pools as required.
> 
> Now, my question is:
> 1. what is the relevance of multiple pools in a cluster?
> i.e. why should a deployer create multiple pools in a cluster? what should be 
> the benefits of creating multiple pools?
> 
> 2. according to the docs, the default pools are data, metadata, and rbd.
> what is the difference among these three types of pools?
> 
> 3. When a system deployer has deployed a ceph cluster with a radosgw 
> interface and starts providing services to end users (for example, end users 
> can create accounts on the ceph cluster and store/retrieve their data 
> to/from the cluster), does the end user need to be concerned with the pools 
> created in the cluster?
> 
> Could somebody please help me clear up these points?
> 
> regards
> Pragya Jain
> 
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
