Hello,

On Tue, 12 Apr 2016 09:56:13 +0200 Udo Lembke wrote:

> Hi Sage,
Not Sage, but since he hasn't piped up yet...

> we run ext4 only on our 8-node cluster with 110 OSDs and are quite happy 
> with ext4.
> We started with XFS, but the latency was much higher compared to ext4...
> 
Welcome to the club. ^o^

> But we use RBD only, with "short" filenames like 
> rbd_data.335986e2ae8944a.00000000000761e1.
> If we can switch from Jewel to K* and, during that update, change each 
> OSD from FileStore to BlueStore, that will be OK for us.
I don't think K* will be a truly, fully stable BlueStore platform, but
I'll be happy to be proven wrong.
Also, would you really want to upgrade to a non-LTS version?
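
If you do end up staying on ext4 with RBD-only pools for a while, the
osd_max_object_name_len override Sage describes further down would
presumably look something like this in ceph.conf (untested on my end,
option name and value taken straight from his mail):

  [osd]
  # Assumption: RBD-only workload, no RGW or other long rados object names.
  # Lowering the limit lets ceph-osd start on ext4 despite its xattr
  # constraints, at your own risk, since upstream plans to stop testing ext4.
  osd_max_object_name_len = 64

You can check what a running OSD actually picked up via the admin socket,
e.g. "ceph daemon osd.0 config show | grep osd_max_object_name_len"
(replace osd.0 with whatever OSD id you have on that node).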

> I hope we will then get better performance with BlueStore?
That seems to be a given, after having read up on it last night.

> Will BlueStore be production-ready during the Jewel lifetime, so that we 
> can switch to BlueStore before the next big upgrade?
>
Again doubtful from my perspective. 
 
For example, cache-tiering was introduced in Firefly (and not as a
technology preview requiring "will eat your data" flags to be set in
ceph.conf).

It seemed to work well enough, but was broken in certain situations.
And in the latest Hammer release it is once again dangerously broken by a
backport from Infernalis/Jewel.

Christian
> 
> Udo
> 
> Am 11.04.2016 um 23:39 schrieb Sage Weil:
> > Hi,
> >
> > ext4 has never been recommended, but we did test it.  After Jewel is
> > out, we would like to explicitly recommend *against* ext4 and stop
> > testing it.
> >
> > Why:
> >
> > Recently we discovered an issue with the long object name handling
> > that is not fixable without rewriting a significant chunk of
> > FileStore's filename handling.  (There is a limit on the amount of
> > xattr data ext4 can store in the inode, which causes problems in
> > LFNIndex.)
> >
> > We *could* invest a ton of time rewriting this to fix it, but it only
> > affects ext4, which we never recommended, and we plan to deprecate
> > FileStore once BlueStore is stable anyway, so it seems like a waste of
> > time that would be better spent elsewhere.
> >
> > Also, by dropping ext4 test coverage in ceph-qa-suite, we can
> > significantly improve time/coverage for FileStore on XFS and on
> > BlueStore.
> >
> > The long file name handling is problematic anytime someone is storing
> > rados objects with long names.  The primary user that does this is RGW,
> > which means any RGW cluster using ext4 should recreate their OSDs to
> > use XFS.  Other librados users could be affected too, though, like
> > users with very long rbd image names (e.g., > 100 characters), or
> > custom librados users.
> >
> > How:
> >
> > To make this change as visible as possible, the plan is to make
> > ceph-osd refuse to start if the backend is unable to support the
> > configured max object name (osd_max_object_name_len).  The OSD will
> > complain that ext4 cannot store such an object and refuse to start.  A
> > user who is only using RBD might decide they don't need long file
> > names to work and can adjust the osd_max_object_name_len setting to
> > something small (say, 64) and run successfully.  They would be taking
> > a risk, though, because we would like to stop testing on ext4.
> >
> > Is this reasonable?  If there are significant ext4 users who are
> > unwilling to recreate their OSDs, now would be the time to speak up.
> >
> > Thanks!
> > sage
> >


-- 
Christian Balzer        Network/Systems Engineer                
ch...@gol.com           Global OnLine Japan/Rakuten Communications
http://www.gol.com/
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
