Hi Pragya,

Let me try to answer these.

1# The decision is based on your use case (performance, reliability). If you need high performance out of your cluster, the deployer will create a pool on SSDs and assign this pool to the applications that require higher I/O. For example, if you integrate OpenStack with Ceph, you can instruct the OpenStack configuration files to write data to a specific Ceph pool (http://ceph.com/docs/master/rbd/rbd-openstack/#configuring-glance). Similarly, you can tell CephFS and RadosGW which pool to use for data storage.

2# Usually the end user (the client of the Ceph cluster) does not care where the data is stored, which pool it is using, or what the real physical location of the data is. The end user will ask for specific performance, reliability and availability. It is the job of the Ceph admin to fulfil those storage requirements using Ceph features such as SSDs, erasure coding, replication level, etc.

Block Device :- The end user will tell the application (QEMU/KVM, OpenStack, etc.) which pool it should use for data storage. rbd is the default pool for block devices.

CephFS :- The end user will mount this pool as a filesystem and use it from there. The default pools are data and metadata.

RadosGW :- The end user will store objects using the S3 or Swift API.

- Karan Singh -

On 15 Jul 2014, at 07:42, pragya jain <[email protected]> wrote:

> thank you very much, Craig, for your clear explanation against my questions.
>
> Now I am very clear about the concept of pools in ceph.
>
> But I have two small questions:
> 1. How does the deployer decide that a particular type of information will be
> stored in a particular pool? Are there any settings at the time of creation
> of pool that a deployer should make to ensure that which type of data will be
> stored in which pool?
>
> 2. How does an end-user specify that his/her data will be stored in which
> pool? how can an end-user come to know which pools are stored on SSDs or on
> HDDs, what are the properties of a particular pool?
>
> Thanks again, Please help to clear these confusions also.
>
> Regards
> Pragya Jain
>
>
> On Sunday, 13 July 2014 5:04 AM, Craig Lewis <[email protected]> wrote:
>
> I'll answer out of order.
>
> #2: rdb is used for RDB images. data and metadata are used by CephFS.
> RadosGW's default pools will be created the first time radosgw starts up. If
> you aren't using RDB or CephFS, you can ignore those pools.
>
> #1: RadosGW will use several pools to segregate it's data. There are a
> couple pools for store user/subuser information, as well as pools for storing
> the actual data. I'm using federation, and I have a total of 18 pools that
> RadosGW is using in some form. Pools are a way to logically separate your
> data, and pools can also have different replication/storage settings. For
> example, I could say that the .rgw.buckets.index pool needs 4x replication
> and is only stored on SSDs, while .rgw.bucket is 3x replication on HDDs.
>
> #3: In addition to #1, you can setup different pools to actually store user
> data in RadosGW. For example, an end user may have some very important data
> that you want replicated 4 times, and some other data that needs to be stored
> on SSDs for low latency. Using CRUSH, you would create the some rados pools
> with those specs. Then you'd setup some placement targets in RadosGW that
> use those pools. A user that cares will specify a placement target when they
> create a bucket. That way they can decide what the storage requirements are.
> If they don't care, then they can just use the default.
>
> Does that help?
>
>
> On Thu, Jul 10, 2014 at 11:34 PM, pragya jain <[email protected]> wrote:
> hi all,
>
> I have some very basic questions about pools in ceph.
>
> According to ceph documentation, as we deploy a ceph cluster with radosgw
> instance over it, ceph creates pool by default to store the data or the
> deployer can also create pools according to the requirement.
>
> Now, my question is:
> 1. what is the relevance of multiple pools in a cluster?
> i.e. why should a deployer create multiple pools in a cluster? what should be
> the benefits of creating multiple pools?
>
> 2. according to the docs, the default pools are data, metadata, and rbd.
> what is the difference among these three types of pools?
>
> 3. when a system deployer has deployed a ceph cluster with radosgw interface
> and start providing services to the end-user, such as, end-user can create
> their account on the ceph cluster and can store/retrieve their data to/from
> the cluster, then Is the end user has any concern about the pools created in
> the cluster?
>
> Please somebody help me to clear these confusions.
>
> regards
> Pragya Jain
>
> _______________________________________________
> ceph-users mailing list
> [email protected]
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
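
To make answer 1# at the top of this reply a bit more concrete, here is a rough sketch of how a deployer might script the pool setup. It only builds and prints the ceph commands, it does not run anything. The pool names (ssd-volumes, hdd-images), the PG count of 128 and the CRUSH ruleset id 1 are made up for illustration, and the helper pool_commands is hypothetical. I am also assuming the ruleset option is called crush_ruleset, as in the releases current at the time of this thread (later releases call it crush_rule), so please check the docs for your version before using any of this.

```python
# Sketch only: builds the ceph CLI commands as argument lists and prints them.
# Nothing here is executed against a cluster.

def pool_commands(name, pg_num, replicas, crush_ruleset=None):
    """Return the commands a deployer might run to create and tune one pool."""
    cmds = [
        # Create the pool; pgp_num is set to the same value as pg_num.
        ["ceph", "osd", "pool", "create", name, str(pg_num), str(pg_num)],
        # Set the replication level for the pool.
        ["ceph", "osd", "pool", "set", name, "size", str(replicas)],
    ]
    if crush_ruleset is not None:
        # Assumption: "crush_ruleset" is the option name on this release
        # (newer releases use "crush_rule"). The ruleset itself, e.g. one
        # that selects only SSD-backed OSDs, must already exist in CRUSH.
        cmds.append(
            ["ceph", "osd", "pool", "set", name, "crush_ruleset", str(crush_ruleset)]
        )
    return cmds


# Hypothetical pools: names, PG counts and ruleset id are illustrative only.
POOLS = [
    {"name": "ssd-volumes", "pg_num": 128, "replicas": 3, "crush_ruleset": 1},
    {"name": "hdd-images", "pg_num": 128, "replicas": 3},
]

for pool in POOLS:
    for cmd in pool_commands(**pool):
        print(" ".join(cmd))
```

Once such pools exist, the application is simply pointed at the pool name in its own configuration, as described in the rbd-openstack link in answer 1#.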
