What kernel version and mds version are you running? I did
# ceph osd pool create foo 12
# ceph osd pool create bar 12
# ceph mds add_data_pool 3
# ceph mds add_data_pool 4
and from a kernel mount
# mkdir foo
# mkdir bar
# cephfs foo set_layout --pool 3
# cephfs bar set_layout --pool 4
# cephfs foo show_layout
layout.data_pool: 3
layout.object_size: 4194304
layout.stripe_unit: 4194304
layout.stripe_count: 1
# cephfs bar show_layout
layout.data_pool: 4
layout.object_size: 4194304
layout.stripe_unit: 4194304
layout.stripe_count: 1
This much you can test without playing with the crush map, btw.
Maybe there is some crazy bug when the set_layouts are pipelined? Try
without using & ?
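
A minimal sketch of that sequential test, assuming a POSIX shell and the
same cephfs tool as above. The helper names layout_pool and set_and_check
are made up for illustration. Each set_layout runs in the foreground and
finishes before its show_layout check, so nothing is pipelined:

```shell
#!/bin/sh
# Hypothetical helpers (not part of ceph): verify a directory layout
# one step at a time, with no backgrounding (&).

# Read `cephfs <dir> show_layout` output on stdin, print the data pool id.
layout_pool() {
    awk -F': *' '$1 == "layout.data_pool" { print $2; exit }'
}

# Set the pool on one directory, wait for the command to finish,
# then read the layout back and compare it with what was requested.
# Usage: set_and_check <dir> <poolid>
set_and_check() {
    dir=$1
    want=$2
    cephfs "$dir" set_layout --pool "$want" || return 1
    got=$(cephfs "$dir" show_layout | layout_pool)
    if [ "$got" = "$want" ]; then
        echo "$dir: ok (pool $got)"
    else
        echo "$dir: MISMATCH (wanted pool $want, got ${got:-none})" >&2
        return 1
    fi
}
```

For the two directories above that would be
"set_and_check foo 3 && set_and_check bar 4".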
sage
On Fri, 30 Nov 2012, hemant surale wrote:
> Hi Sage, Community,
> I am unable to use 2 directories to direct data to 2 different
> pools. I ran the following experiment.
>
> Created 2 pools, "host" & "ghost", to separate data placement.
> --------------------------------------------------//crushmap file
> -------------------------------------------------------
> # begin crush map
>
> # devices
> device 0 osd.0
> device 1 osd.1
> device 2 osd.2
> device 3 osd.3
>
> # types
> type 0 osd
> type 1 host
> type 2 rack
> type 3 row
> type 4 room
> type 5 datacenter
> type 6 pool
> type 7 ghost
>
> # buckets
> host hemantone-mirror-virtual-machine {
> id -6 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.2 weight 1.000
> }
> host hemantone-virtual-machine {
> id -7 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.1 weight 1.000
> }
> rack one {
> id -2 # do not change unnecessarily
> # weight 2.000
> alg straw
> hash 0 # rjenkins1
> item hemantone-mirror-virtual-machine weight 1.000
> item hemantone-virtual-machine weight 1.000
> }
> ghost hemant-virtual-machine {
> id -4 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.0 weight 1.000
> }
> ghost hemant-mirror-virtual-machine {
> id -5 # do not change unnecessarily
> # weight 1.000
> alg straw
> hash 0 # rjenkins1
> item osd.3 weight 1.000
> }
> rack two {
> id -3 # do not change unnecessarily
> # weight 2.000
> alg straw
> hash 0 # rjenkins1
> item hemant-virtual-machine weight 1.000
> item hemant-mirror-virtual-machine weight 1.000
> }
> pool default {
> id -1 # do not change unnecessarily
> # weight 4.000
> alg straw
> hash 0 # rjenkins1
> item one weight 2.000
> item two weight 2.000
> }
>
> # rules
> rule data {
> ruleset 0
> type replicated
> min_size 1
> max_size 10
> step take default
> step take one
> step chooseleaf firstn 0 type host
> step emit
> }
> rule metadata {
> ruleset 1
> type replicated
> min_size 1
> max_size 10
> step take default
> step take one
> step chooseleaf firstn 0 type host
> step emit
> }
> rule rbd {
> ruleset 2
> type replicated
> min_size 1
> max_size 10
> step take default
> step take one
> step chooseleaf firstn 0 type host
> step emit
> }
> rule forhost {
> ruleset 3
> type replicated
> min_size 1
> max_size 10
> step take default
> step take one
> step chooseleaf firstn 0 type host
> step emit
> }
> rule forghost {
> ruleset 4
> type replicated
> min_size 1
> max_size 10
> step take default
> step take two
> step chooseleaf firstn 0 type ghost
> step emit
> }
>
> # end crush map
> ------------------------------------------------------------------------------------------------------------------------
> 1) Set the replication factor to 2 and the crush ruleset accordingly
> (the "host" pool got crush_ruleset = 3 & the "ghost" pool got
> crush_ruleset = 4).
> 2) Then I mounted the filesystem using "mount.ceph 10.72.148.245:6789:/
> /home/hemant/x" & "mount.ceph 10.72.148.245:6789:/ /home/hemant/y".
> 3) Then "mds add_data_pool 5" & "mds add_data_pool 6" (here the pool ids
> are host = 5, ghost = 6).
> 4) "cephfs /home/hemant/x set_layout --pool 5 -c 1 -u 4194304 -s
> 4194304" & "cephfs /home/hemant/y set_layout --pool 6 -c 1 -u 4194304
> -s 4194304"
>
> PROBLEM:
> $ cephfs /home/hemant/x show_layout
> layout.data_pool: 6
> layout.object_size: 4194304
> layout.stripe_unit: 4194304
> layout.stripe_count: 1
> $ cephfs /home/hemant/y show_layout
> layout.data_pool: 6
> layout.object_size: 4194304
> layout.stripe_unit: 4194304
> layout.stripe_count: 1
>
> Both dirs are using the same pool to place data, even after I told them
> to use separate pools with the "cephfs" cmd.
> Please help me figure this out.
>
> -
> Hemant Surale.
>
>
> On Thu, Nov 29, 2012 at 3:45 PM, hemant surale <[email protected]>
> wrote:
> >>> does 'ceph mds dump' list pool 3 in the data_pools line?
> >
> > Yes. It lists the desired poolids I wanted to put data in.
> >
> >
> > ---------- Forwarded message ----------
> > From: hemant surale <[email protected]>
> > Date: Thu, Nov 29, 2012 at 2:59 PM
> > Subject: Re: OSD daemon changes port no
> > To: Sage Weil <[email protected]>
> >
> >
> > I used a slightly different form of the "cephfs" command: "cephfs
> > /home/hemant/a set_layout --pool 3 -c 1 -u 4194304 -s 4194304"
> > and "cephfs /home/hemant/b set_layout --pool 5 -c 1 -u 4194304 -s
> > 4194304".
> >
> >
> > Now the cmd didn't show any error, but when I put data into dirs "a" & "b"
> > it should ideally go to different pools, and that is not working as of now.
> > Is what I am trying to do possible (using 2 dirs pointing to 2
> > different pools for data placement)?
> >
> >
> >
> > -
> > Hemant Surale.
> >
> > On Tue, Nov 27, 2012 at 10:21 PM, Sage Weil <[email protected]> wrote:
> >> On Tue, 27 Nov 2012, hemant surale wrote:
> >>> I did "mkdir a", "chmod 777 a". So dir "a" is "/home/hemant/a".
> >>> then I used "mount.ceph 10.72.148.245:/ /ho
> >>>
> >>> root@hemantsec-virtual-machine:/home/hemant# cephfs /home/hemant/a
> >>> set_layout --pool 3
> >>> Error setting layout: Invalid argument
> >>
> >> does 'ceph mds dump' list pool 3 in the data_pools line?
> >>
> >> sage
> >>
> >>>
> >>> On Mon, Nov 26, 2012 at 9:56 PM, Sage Weil <[email protected]> wrote:
> >>> > On Mon, 26 Nov 2012, hemant surale wrote:
> >>> >> While I was using "cephfs" the following error was observed -
> >>> >> ------------------------------------------------------------------------------------------------
> >>> >> root@hemantsec-virtual-machine:~# cephfs /mnt/ceph/a --pool 3
> >>> >> invalid command
> >>> >
> >>> > Try
> >>> >
> >>> > cephfs /mnt/ceph/a set_layout --pool 3
> >>> >
> >>> > (set_layout is the command)
> >>> >
> >>> > sage
> >>> >
> >>> >> usage: cephfs path command [options]*
> >>> >> Commands:
> >>> >> show_layout -- view the layout information on a file or dir
> >>> >> set_layout -- set the layout on an empty file,
> >>> >> or the default layout on a directory
> >>> >> show_location -- view the location information on a file
> >>> >> Options:
> >>> >> Useful for setting layouts:
> >>> >> --stripe_unit, -u: set the size of each stripe
> >>> >> --stripe_count, -c: set the number of objects to stripe across
> >>> >> --object_size, -s: set the size of the objects to stripe across
> >>> >> --pool, -p: set the pool to use
> >>> >>
> >>> >> Useful for getting location data:
> >>> >> --offset, -l: the offset to retrieve location data for
> >>> >>
> >>> >> ------------------------------------------------------------------------------------------------
> >>> >> It may be a silly question but I am unable to figure it out.
> >>> >>
> >>> >> :(
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> On Wed, Nov 21, 2012 at 8:59 PM, Sage Weil <[email protected]> wrote:
> >>> >> > On Wed, 21 Nov 2012, hemant surale wrote:
> >>> >> >> > Oh I see. Generally speaking, the only way to guarantee separation is to
> >>> >> >> > put them in different pools and distribute the pools across different sets
> >>> >> >> > of OSDs.
> >>> >> >>
> >>> >> >> Yeah, that was the correct approach, but I found a problem doing so at an
> >>> >> >> abstract level, i.e. when I put a file inside the mounted dir
> >>> >> >> "/home/hemant/cephfs" (mounted using the "mount.ceph" cmd). At that
> >>> >> >> time ceph is going to use the default pool "data" to store files anyway
> >>> >> >> (here files were striped into different objects and then sent to the
> >>> >> >> appropriate osd).
> >>> >> >> So how do I tell ceph to use different pools in this case?
> >>> >> >>
> >>> >> >> Goal: separate read and write operations, where reads are done
> >>> >> >> from one group of OSDs and writes are done to another group of OSDs.
> >>> >> >
> >>> >> > First create the other pool,
> >>> >> >
> >>> >> > ceph osd pool create <name>
> >>> >> >
> >>> >> > and then adjust the CRUSH rule to distribute to a different set of OSDs
> >>> >> > for that pool.
> >>> >> >
> >>> >> > To allow cephfs to use it,
> >>> >> >
> >>> >> > ceph mds add_data_pool <poolid>
> >>> >> >
> >>> >> > and then:
> >>> >> >
> >>> >> > cephfs /mnt/ceph/foo --pool <poolid>
> >>> >> >
> >>> >> > will set the policy on the directory such that new files beneath that
> >>> >> > point will be stored in a different pool.
> >>> >> >
> >>> >> > Hope that helps!
> >>> >> > sage
> >>> >> >
> >>> >> >
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >> -
> >>> >> >> Hemant Surale.
> >>> >> >>
> >>> >> >>
> >>> >> >> On Wed, Nov 21, 2012 at 12:33 PM, Sage Weil <[email protected]>
> >>> >> >> wrote:
> >>> >> >> > On Wed, 21 Nov 2012, hemant surale wrote:
> >>> >> >> >> It's a slightly confusing question, I believe.
> >>> >> >> >>
> >>> >> >> >> Actually there are two files, X & Y. When I am reading X from its
> >>> >> >> >> primary, I want to make sure a simultaneous write of Y goes to
> >>> >> >> >> any OSD other than the primary OSD for X (the one currently
> >>> >> >> >> serving my read).
> >>> >> >> >
> >>> >> >> > Oh I see. Generally speaking, the only way to guarantee separation is to
> >>> >> >> > put them in different pools and distribute the pools across different sets
> >>> >> >> > of OSDs. Otherwise, it's all (pseudo)random and you never know. Usually,
> >>> >> >> > they will be different, particularly as the cluster size increases, but
> >>> >> >> > sometimes they will be the same.
> >>> >> >> >
> >>> >> >> > sage
> >>> >> >> >
> >>> >> >> >
> >>> >> >> >>
> >>> >> >> >>
> >>> >> >> >> -
> >>> >> >> >> Hemant Surale.
> >>> >> >> >>
> >>> >> >> >> On Wed, Nov 21, 2012 at 11:50 AM, Sage Weil <[email protected]>
> >>> >> >> >> wrote:
> >>> >> >> >> > On Wed, 21 Nov 2012, hemant surale wrote:
> >>> >> >> >> >> >> and one more thing how can it be possible to read from
> >>> >> >> >> >> >> one osd and
> >>> >> >> >> >> >> then simultaneous write to direct on other osd with
> >>> >> >> >> >> >> less/no traffic?
> >>> >> >> >> >> >
> >>> >> >> >> >> > I'm not sure I understand the question...
> >>> >> >> >> >>
> >>> >> >> >> >> Scenario:
> >>> >> >> >> >> I have written file X.txt to some osd which is the primary
> >>> >> >> >> >> for file X.txt (a direct write operation using the rados cmd).
> >>> >> >> >> >> Now, while a read of file X.txt is in progress, can I make sure
> >>> >> >> >> >> that a simultaneous write request is directed to another osd
> >>> >> >> >> >> using crushmaps or some other way?
> >>> >> >> >> >
> >>> >> >> >> > Nope. The object location is based on the name. Reads and writes go to
> >>> >> >> >> > the same location so that a single OSD can serialize requests. That means,
> >>> >> >> >> > for example, that a read that follows a write returns the just-written
> >>> >> >> >> > data.
> >>> >> >> >> >
> >>> >> >> >> > sage
> >>> >> >> >> >
> >>> >> >> >> >
> >>> >> >> >> >> Goal of task:
> >>> >> >> >> >> Trying to avoid read-write clashes as much as possible to
> >>> >> >> >> >> achieve faster I/O operations. Given that CRUSH selects the osd
> >>> >> >> >> >> for data placement with a pseudo-random function, is this possible?
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> -
> >>> >> >> >> >> Hemant Surale.
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >>
> >>> >> >> >> >> On Tue, Nov 20, 2012 at 10:15 PM, Sage Weil
> >>> >> >> >> >> <[email protected]> wrote:
> >>> >> >> >> >> > On Tue, 20 Nov 2012, hemant surale wrote:
> >>> >> >> >> >> >> Hi Community,
> >>> >> >> >> >> >> I have a question about the port number used by the ceph-osd
> >>> >> >> >> >> >> daemon. I observed traffic (inter-osd communication while data
> >>> >> >> >> >> >> ingest happened) on port 6802, and then when I ingested a second
> >>> >> >> >> >> >> file after some delay, port 6804 was used. Is there any specific
> >>> >> >> >> >> >> reason for the port number to change here?
> >>> >> >> >> >> >
> >>> >> >> >> >> > The ports are dynamic. Daemons bind to a random (6800-6900) port on
> >>> >> >> >> >> > startup and communicate on that. They discover each other via the
> >>> >> >> >> >> > addresses published in the osdmap when the daemon starts.
> >>> >> >> >> >> >
> >>> >> >> >> >> >> and one more thing how can it be possible to read from
> >>> >> >> >> >> >> one osd and
> >>> >> >> >> >> >> then simultaneous write to direct on other osd with
> >>> >> >> >> >> >> less/no traffic?
> >>> >> >> >> >> >
> >>> >> >> >> >> > I'm not sure I understand the question...
> >>> >> >> >> >> >
> >>> >> >> >> >> > sage
> >>> >> >> >> >> --
> >>> >> >> >> >> To unsubscribe from this list: send the line "unsubscribe
> >>> >> >> >> >> ceph-devel" in
> >>> >> >> >> >> the body of a message to [email protected]
> >>> >> >> >> >> More majordomo info at
> >>> >> >> >> >> http://vger.kernel.org/majordomo-info.html