Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
The increase in workload came from the client, and we have adjusted it now. However, the read statistics in ceph status still look wrong: they always show roughly 2 times more than the actual reads. We verified the disk reads across the cluster and they are in line with the client traffic, so the issue seems to be in the ceph status read statistics.

Thanks,
Muthu

On Wed, Feb 21, 2018 at 6:57 PM, Alfredo Deza wrote:
> If the issue is a performance regression in Luminous I wouldn't know :( I
> was trying to say that if you are seeing the same regression with
> previously deployed OSDs then it can't possibly be a thing we are doing
> incorrectly in ceph-volume.
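A rough way to reproduce the comparison Muthu describes (summed disk reads versus the ceph-reported figure) is sketched below; this is not from the thread, the device selection is left up to the reader, and the exact iostat column names depend on the sysstat version:

# 1. Note the read bandwidth that ceph itself reports (the "client: ... rd" line):
ceph status | grep 'client:'

# 2. On each OSD node, watch the per-disk read throughput (rMB/s column) for a while:
iostat -dmx 5

# 3. Sum the read throughput of the OSD disks across all 5 nodes and compare
#    with step 1; in this thread the summed disk reads matched the client
#    traffic, while ceph status reported roughly twice that figure.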
Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
On Tue, Feb 20, 2018 at 9:33 PM, nokia ceph wrote:
> Hi Alfredo Deza,
>
> I understand the point about lvm versus simple; however, we still see the
> issue. Was it introduced in Luminous? We use the same ceph config and
> workload from the client. The graphs I attached in the previous mail are
> from ceph-volume lvm OSDs.

If the issue is a performance regression in Luminous I wouldn't know :( I
was trying to say that if you are seeing the same regression with
previously deployed OSDs then it can't possibly be a thing we are doing
incorrectly in ceph-volume.

> In this case, does the 2x read happen only inside ceph? If we consider
> only the lvm-based system, is this high iops caused by a dm-cache created
> for each OSD?

Not sure again. Maybe someone else might be able to chime in on this.

> Meanwhile I will share some graphs once I have them.
>
> Thanks,
> Muthu
Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
Hi Alfredo Deza,

I understand the point about lvm versus simple; however, we still see the issue. Was it introduced in Luminous? We use the same ceph config and workload from the client. The graphs I attached in the previous mail are from ceph-volume lvm OSDs.

In this case, does the 2x read happen only inside ceph? If we consider only the lvm-based system, is this high iops caused by a dm-cache created for each OSD?

Meanwhile I will share some graphs once I have them.

Thanks,
Muthu

On Tuesday, February 20, 2018, Alfredo Deza wrote:
> If you have two platforms, one with `simple` and the other one with `lvm`,
> experiencing the same, then something else must be at fault here.
>
> The `simple` setup in ceph-volume basically keeps everything as it was
> before; it just captures details of what devices were being used so OSDs
> can be started. There is no interaction from ceph-volume in there that
> could cause something like this.
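On the dm-cache question: ceph-volume lvm normally creates plain logical volumes rather than dm-cache devices, and whether any dm-cache is in play can be checked directly on an OSD node. A minimal sketch, not from the thread; output formats vary by distribution:

# Show every device-mapper device with its target type; a dm-cache device
# would use the "cache" target, while plain LVs created for OSDs show "linear":
dmsetup table

# Show how the OSD block devices are stacked; a dm-cache setup would add an
# extra caching layer between the logical volume and the physical disk:
lsblk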
Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
On Mon, Feb 19, 2018 at 9:29 PM, nokia ceph wrote:
> Hi Alfredo Deza,
>
> We have 5-node platforms with lvm OSDs created from scratch and another
> 5-node platform migrated from Kraken, which uses ceph-volume simple. Both
> have the same issue. Both platforms have only HDDs for the OSDs.
>
> We also noticed about 2 times more disk iops compared to Kraken, which
> causes lower read performance. During rocksdb compaction the situation is
> worse.
>
> Meanwhile we are building another platform with OSDs created using
> ceph-disk and will analyse that.

If you have two platforms, one with `simple` and the other one with `lvm`,
experiencing the same, then something else must be at fault here.

The `simple` setup in ceph-volume basically keeps everything as it was
before; it just captures details of what devices were being used so OSDs
can be started. There is no interaction from ceph-volume in there that
could cause something like this.
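For readers unfamiliar with the two modes: `lvm` prepares a brand-new OSD on an LVM volume, while `simple` only records the devices of an OSD that already exists so it can be started again. Roughly, with the device path, OSD id and fsid below serving only as placeholders:

# lvm mode: create a new bluestore OSD from scratch on a raw device
ceph-volume lvm create --bluestore --data /dev/sdX

# simple mode: scan a running OSD's data directory, write its device details
# to a JSON file under /etc/ceph/osd/, then activate it from that metadata
ceph-volume simple scan /var/lib/ceph/osd/ceph-0
ceph-volume simple activate 0 <osd-fsid>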
Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
Hi Alfredo Deza,

We have 5-node platforms with lvm OSDs created from scratch and another 5-node platform migrated from Kraken, which uses ceph-volume simple. Both have the same issue. Both platforms have only HDDs for the OSDs.

We also noticed about 2 times more disk iops compared to Kraken, which causes lower read performance. During rocksdb compaction the situation is worse.

Meanwhile we are building another platform with OSDs created using ceph-disk and will analyse that.

Thanks,
Muthu

On Tuesday, February 20, 2018, Alfredo Deza wrote:
> You mentioned `ceph-volume simple` but here you say lvm. With LVM,
> ceph-volume will create the OSDs from scratch, while "simple" will keep
> whatever OSD was created before.
>
> Have you created the OSDs from scratch with ceph-volume, or is it just
> using "simple", managing a previously deployed OSD?
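One way to see whether the extra disk reads come from the OSD process itself (for example during rocksdb compaction) rather than from client I/O is to watch the disks and the OSD admin-socket counters at the same time. A rough sketch; osd.12 is an example id and the exact counter names differ between releases:

# Raw disk iops and read throughput on the OSD node:
iostat -dx 5

# Internal performance counters of one OSD; the bluefs and rocksdb sections
# give an idea of how much reading the OSD does on its own behalf:
ceph daemon osd.12 perf dump | less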
Re: [ceph-users] Luminous : performance degrade while read operations (ceph-volume)
On Mon, Feb 19, 2018 at 2:01 PM, nokia ceph wrote:
> Recently we migrated all our platforms to Luminous 12.2.2. All OSDs were
> migrated to the ceph-volume "simple" type, and on a few platforms ceph was
> installed using ceph-volume.
>
> Now we see two times more read traffic compared to client traffic on the
> migrated platforms and on the newly created ones. This was not the case in
> older releases, where the ceph status read B/W matched the client read
> traffic.
>
> Is this expected behavior in Luminous and ceph-volume lvm, or a bug?
> Wrong calculation in the ceph status read B/W?

You mentioned `ceph-volume simple` but here you say lvm. With LVM,
ceph-volume will create the OSDs from scratch, while "simple" will keep
whatever OSD was created before.

Have you created the OSDs from scratch with ceph-volume, or is it just
using "simple", managing a previously deployed OSD?
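If it is unclear how a particular OSD was deployed, a quick look on the node usually answers it. A sketch assuming default paths:

# OSDs taken over with "ceph-volume simple scan" leave a JSON description here:
ls /etc/ceph/osd/

# OSDs created with "ceph-volume lvm" sit on logical volumes tagged with
# ceph metadata such as the OSD id:
lvs -o lv_name,vg_name,lv_tags | grep ceph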
[ceph-users] Luminous : performance degrade while read operations (ceph-volume)
Hi All,

We have 5-node clusters with EC 4+1 and have used bluestore since last year, starting from Kraken.
Recently we migrated all our platforms to Luminous 12.2.2. All OSDs were migrated to the ceph-volume "simple" type, and on a few platforms ceph was installed using ceph-volume.

Now we see two times more read traffic compared to client traffic on the migrated platforms and on the newly created ones. This was not the case in older releases, where the ceph status read B/W matched the client read traffic.

Some network graphs:

Client network interface towards the ceph public interface: shows *4.3 Gbps* read
[image: Inline image 2]

Ceph node public interface: each node around 960 Mbps x 5 nodes = *4.6 Gbps* - this matches.
[image: Inline image 3]

Ceph status output: shows 1032 MB/s = *8.06 Gbps*

cn6.chn6us1c1.cdn ~# ceph status
  cluster:
    id:     abda22db-3658-4d33-9681-e3ff10690f88
    health: HEALTH_OK

  services:
    mon: 5 daemons, quorum cn6,cn7,cn8,cn9,cn10
    mgr: cn6(active), standbys: cn7, cn9, cn10, cn8
    osd: 340 osds: 340 up, 340 in

  data:
    pools:   1 pools, 8192 pgs
    objects: 270M objects, 426 TB
    usage:   581 TB used, 655 TB / 1237 TB avail
    pgs:     8160 active+clean
             32   active+clean+scrubbing

  io:
    client:  *1032 MB/s rd*, 168 MB/s wr, 1908 op/s rd, 1594 op/s wr

For write operations we don't see this issue; the client traffic and the ceph status figure match.
Is this expected behavior in Luminous and ceph-volume lvm, or a bug? Wrong calculation in the ceph status read B/W?

Please provide your feedback.

Thanks,
Muthu
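The comparison above can be redone at any time by putting the interface counters next to the ceph figure. A rough sketch, not from the original mail; eth0 stands in for the real public interface and sar comes from the sysstat package. For reference, the 8.06 Gbps figure appears to be 1032 MB/s x 8 / 1024.

# Read bandwidth as ceph reports it (the "client: ... rd" line):
ceph status | grep 'client:'

# Per-node traffic on the public interface, sampled every 5 seconds
# (reads served to clients show up as transmitted bytes on the OSD nodes):
sar -n DEV 5 | grep eth0

# Sum the per-node rate over the 5 nodes and compare with the ceph figure;
# in this thread the interface totals are roughly half of what ceph status
# reports for reads.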