On 26.02.2018 at 19:24, Gregory Farnum wrote:
> I don’t actually know this option, but based on your results it’s clear that
> “fast read” is telling the OSD it should issue reads to all k+m OSDs storing
> data and then reconstruct the data from the first k to respond. Without the
> fast read it simply asks the regular k data nodes to read it back straight
> and sends the reply back. This is a straight trade off of more bandwidth for
> lower long-tail latencies.
> -Greg
Many thanks, this certainly explains it!
Apparently I misunderstood how a "normal" read works - I thought that in any
case all shards would be requested, and the primary OSD would verify that the
erasure coding is still intact.
With the explanation that only the actual "k" data shards are read in the
"normal" case, it is fully clear to me that "fast_read" will be slower for us,
since we are limited by network bandwidth.
On a side note, activating fast_read also appears to increase CPU load a bit,
which is probably due to the EC decode calculations that have to be performed
whenever the "wrong" shards arrive at the primary OSD first.
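As a back-of-the-envelope check (a toy sketch, not Ceph code - it ignores
protocol overhead and only uses the traffic figures quoted further down in
this thread), the extra inter-OSD read traffic caused by fast_read should
scale roughly as (k+m)/k:

```python
# Crude model of the fast_read bandwidth trade-off for an EC pool.
# Normal read: the primary fetches only the k data shards.
# fast_read:   the primary fetches all k+m shards, uses the first k.

k, m = 4, 2  # EC profile of the pool discussed in this thread

traffic_factor = (k + m) / k
print(f"fast_read shard-fetch traffic factor: {traffic_factor:.2f}x")

# Incoming traffic to the OSD hosts measured with fast_read = 0:
incoming_normal_gbs = 2.3
predicted_fast_read = incoming_normal_gbs * traffic_factor
print(f"predicted incoming traffic with fast_read: {predicted_fast_read:.2f} GB/s")
# The thread reports ~3 GB/s measured, in the same ballpark as the
# ~3.45 GB/s this crude model predicts.
```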
I believe this also explains why an EC pool actually remaps in a k=4 m=2
pool with failure domain host if one of 6 hosts goes down:
namely, to have the "k" data shards available on "up" OSDs. This answers an
earlier question of mine.
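That reasoning can be illustrated with a toy counting sketch (again not Ceph
code; host and shard counts as in our setup, one shard per host due to the
host failure domain):

```python
# Toy illustration of shard availability in a k=4 m=2 pool spread
# over 6 hosts (failure domain "host" => one shard per host).
# After one host fails, 5 shards remain reachable; any k = 4 of them
# suffice to reconstruct the data, but only one spare shard is left,
# which is why remapping to restore full redundancy matters.

k, m = 4, 2
hosts = 6
hosts_down = 1

shards_up = min(hosts - hosts_down, k + m)  # one shard per up host
assert shards_up >= k, "data would be unreadable"
spare = shards_up - k
print(f"{shards_up} shards reachable, {k} needed -> readable, "
      f"{spare} spare shard(s) of redundancy left")
```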
Many thanks for clearing this up!
Cheers,
Oliver
> On Mon, Feb 26, 2018 at 3:57 AM Oliver Freyermuth
> <[email protected] <mailto:[email protected]>> wrote:
>
> Some additional information gathered from our monitoring:
> It seems fast_read does indeed become active immediately, but I do not
> understand the effect.
>
> With fast_read = 0, we see:
> ~ 5.2 GB/s total outgoing traffic from all 6 OSD hosts
> ~ 2.3 GB/s total incoming traffic to all 6 OSD hosts
>
> With fast_read = 1, we see:
> ~ 5.1 GB/s total outgoing traffic from all 6 OSD hosts
> ~ 3 GB/s total incoming traffic to all 6 OSD hosts
>
> I would have expected exactly the contrary to happen...
>
> Cheers,
> Oliver
>
> On 26.02.2018 at 12:51, Oliver Freyermuth wrote:
> > Dear Cephalopodians,
> >
> > in the few remaining days when we can still play at our will with
> parameters,
> > we just now tried to set:
> > ceph osd pool set cephfs_data fast_read 1
> > but did not notice any effect on sequential, large file read throughput
> on our k=4 m=2 EC pool.
> >
> > Should this become active immediately? Or do OSDs need a restart first?
> > Is the option already deemed safe?
> >
> > Or is it just that we should not expect any change on throughput, since
> our system (for large sequential reads)
> > is purely limited by the IPoIB throughput, and the shards are
> nevertheless requested by the primary OSD?
> > So the gain would not be in throughput, but the reply to the client
> would be slightly faster (before all shards have arrived)?
> > Then this option would be mainly of interest if the disk IO was
> congested (which does not happen for us as of yet)
> > and not help so much if the system is limited by network bandwidth.
> >
> > Cheers,
> > Oliver
> >
> >
> >
> > _______________________________________________
> > ceph-users mailing list
> > [email protected] <mailto:[email protected]>
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
>
>
