Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-07-05 Thread Lars Marowsky-Bree
On 2017-06-30T16:48:04, Sage Weil  wrote:

> > Simply disabling the tests while keeping the code in the distribution is
> > setting up users who happen to be using Btrfs for failure.
> 
> I don't think we can wait *another* cycle (year) to stop testing this.
> 
> We can, however,
> 
>  - prominently feature this in the luminous release notes, and
>  - require the 'enable experimental unrecoverable data corrupting features =
> btrfs' in order to use it, so that users are explicitly opting in to 
> luminous+btrfs territory.
> 
> The only good(ish) news is that we aren't touching FileStore if we can 
> help it, so it is less likely to regress than other things.  And we'll 
> continue testing filestore+btrfs on jewel for some time.

That makes sense. Though btrfs is something users really shouldn't run
unless they get a heavily debugged and supported version from somewhere.

I also wouldn't mind just dropping it outright, since I don't
believe any of our users run it; they're all on XFS and will upconvert
to BlueStore.

That might be a good reason to keep it, though: folks who upgrade should
be able to get their OSDs on btrfs up (if they still have any) and go
directly to BlueStore, without having to go via XFS first.




Regards,
Lars

-- 
SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-07-04 Thread Lionel Bouton
On 04/07/2017 at 19:00, Jack wrote:
> You can just upgrade to Luminous, then replace filestore with bluestore

You don't just "replace" filestore with bluestore on a production
cluster: you transition over several weeks/months from the first to the
second. Both must be rock stable and have predictable performance
characteristics to do that.
We took more than 6 months with Firefly to migrate from XFS to Btrfs and
studied/tuned the cluster along the way. Simply replacing one store with
another without any experience of the real-world behavior of the new one
is just playing with fire (and with a huge heap of customer data).

Best regards,

Lionel


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-07-04 Thread Jack
You can just upgrade to Luminous, then replace filestore with bluestore

Don't be scared, as Sage said:
> The only good(ish) news is that we aren't touching FileStore if we can 
> help it, so it is less likely to regress than other things.  And we'll 
> continue testing filestore+btrfs on jewel for some time.

In my opinion, it should be fine that way

On 04/07/2017 18:54, Lionel Bouton wrote:
> On 30/06/2017 at 18:48, Sage Weil wrote:
>> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
>>> Hi Sage,
>>>
>>> On 06/30/2017 05:21 AM, Sage Weil wrote:
>>>
>>>> The easiest thing is to
>>>>
>>>> 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
>>>> against btrfs for a long time and are moving toward bluestore anyway.
>>> Searching the documentation for "btrfs" does not really give a user any
>>> clue that the use of Btrfs is discouraged.
>>>
>>> Where exactly has this been recommended?
>>>
>>> The documentation currently states:
>>>
>>> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>>>
>>> "We recommend using the xfs file system or the btrfs file system when
>>> running mkfs."
>>>
>>> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>>>
>>> "btrfs is still supported and has a comparatively compelling set of
>>> features, but be mindful of its stability and support status in your
>>> Linux distribution."
>>>
>>> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>>>
>>> "If you use the btrfs file system with Ceph, we recommend using a recent
>>> Linux kernel (3.14 or later)."
>>>
>>> As an end user, none of these statements would really sound as
>>> recommendations *against* using Btrfs to me.
>>>
>>> I'm therefore concerned about just disabling the tests related to
>>> filestore on Btrfs while still including and shipping it. This has
>>> potential to introduce regressions that won't get caught and fixed.
>> Ah, crap.  This is what happens when devs don't read their own 
>> documentation.  I recommend against btrfs every time it ever comes up, the 
>> downstream distributions all support only xfs, but yes, it looks like the 
>> docs never got updated... despite the xfs focus being 5ish years old now.
>>
>> I'll submit a PR to clean this up, but
>>  
>>>> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
>>>> the occasional ENOSPC errors we see.  (They make the test runs noisy but 
>>>> are pretty easy to identify.)
>>>>
>>>> If we don't stop testing filestore on btrfs now, I'm not sure when we 
>>>> would ever be able to stop, and that's pretty clearly not sustainable.
>>>> Does that seem reasonable?  (Pretty please?)
>>> If you want to get rid of filestore on Btrfs, start a proper deprecation
>>> process and inform users that support for it is going to be removed in
>>> the near future. The documentation must be updated accordingly and it
>>> must be clearly emphasized in the release notes.
>>>
>>> Simply disabling the tests while keeping the code in the distribution is
>>> setting up users who happen to be using Btrfs for failure.
>> I don't think we can wait *another* cycle (year) to stop testing this.
>>
>> We can, however,
>>
>>  - prominently feature this in the luminous release notes, and
>>  - require the 'enable experimental unrecoverable data corrupting features =
>> btrfs' in order to use it, so that users are explicitly opting in to 
>> luminous+btrfs territory.
>>
>> The only good(ish) news is that we aren't touching FileStore if we can 
>> help it, so it is less likely to regress than other things.  And we'll 
>> continue testing filestore+btrfs on jewel for some time.
>>
>> Is that good enough?
> 
> Not sure how we will handle the transition. Is Bluestore considered
> stable in Jewel? If so, our current clusters (recently migrated from
> Firefly to Hammer) will have support for both BTRFS+Filestore and
> Bluestore when the next upgrade takes place. If Bluestore is only
> considered stable on Luminous, I don't see how we can manage the
> transition easily. The only path I see is to:
> - migrate to XFS+filestore with Jewel (which will not only take time but
> will also be a regression for us: it will cause performance and sizing
> problems on at least one of our clusters, and we will lose the silent
> corruption detection from BTRFS)
> - then upgrade to Luminous and migrate again to Bluestore.
> I was not expecting the transition from Btrfs+Filestore to Bluestore to
> be this convoluted (we were planning to add Bluestore OSDs one at a time
> and study the performance/stability for months before migrating the
> whole clusters). Is there any way to restrict your BTRFS tests to at
> least a given stable configuration? (BTRFS is known to have problems with
> the high rate of snapshot deletion Ceph generates by default, for example,
> and we use 'filestore btrfs snap = false'.)
> 
> Best regards,
> 
> Lionel
> 

Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-07-04 Thread Lionel Bouton
On 30/06/2017 at 18:48, Sage Weil wrote:
> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
>> Hi Sage,
>>
>> On 06/30/2017 05:21 AM, Sage Weil wrote:
>>
>>> The easiest thing is to
>>>
>>> 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
>>> against btrfs for a long time and are moving toward bluestore anyway.
>> Searching the documentation for "btrfs" does not really give a user any
>> clue that the use of Btrfs is discouraged.
>>
>> Where exactly has this been recommended?
>>
>> The documentation currently states:
>>
>> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>>
>> "We recommend using the xfs file system or the btrfs file system when
>> running mkfs."
>>
>> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>>
>> "btrfs is still supported and has a comparatively compelling set of
>> features, but be mindful of its stability and support status in your
>> Linux distribution."
>>
>> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>>
>> "If you use the btrfs file system with Ceph, we recommend using a recent
>> Linux kernel (3.14 or later)."
>>
>> As an end user, none of these statements would really sound as
>> recommendations *against* using Btrfs to me.
>>
>> I'm therefore concerned about just disabling the tests related to
>> filestore on Btrfs while still including and shipping it. This has
>> potential to introduce regressions that won't get caught and fixed.
> Ah, crap.  This is what happens when devs don't read their own 
> documentation.  I recommend against btrfs every time it ever comes up, the 
> downstream distributions all support only xfs, but yes, it looks like the 
> docs never got updated... despite the xfs focus being 5ish years old now.
>
> I'll submit a PR to clean this up, but
>  
>>> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
>>> the occasional ENOSPC errors we see.  (They make the test runs noisy but 
>>> are pretty easy to identify.)
>>>
>>> If we don't stop testing filestore on btrfs now, I'm not sure when we 
>>> would ever be able to stop, and that's pretty clearly not sustainable.
>>> Does that seem reasonable?  (Pretty please?)
>> If you want to get rid of filestore on Btrfs, start a proper deprecation
>> process and inform users that support for it is going to be removed in
>> the near future. The documentation must be updated accordingly and it
>> must be clearly emphasized in the release notes.
>>
>> Simply disabling the tests while keeping the code in the distribution is
>> setting up users who happen to be using Btrfs for failure.
> I don't think we can wait *another* cycle (year) to stop testing this.
>
> We can, however,
>
>  - prominently feature this in the luminous release notes, and
>  - require the 'enable experimental unrecoverable data corrupting features =
> btrfs' in order to use it, so that users are explicitly opting in to 
> luminous+btrfs territory.
>
> The only good(ish) news is that we aren't touching FileStore if we can 
> help it, so it is less likely to regress than other things.  And we'll 
> continue testing filestore+btrfs on jewel for some time.
>
> Is that good enough?

Not sure how we will handle the transition. Is Bluestore considered
stable in Jewel? If so, our current clusters (recently migrated from
Firefly to Hammer) will have support for both BTRFS+Filestore and
Bluestore when the next upgrade takes place. If Bluestore is only
considered stable on Luminous, I don't see how we can manage the
transition easily. The only path I see is to:
- migrate to XFS+filestore with Jewel (which will not only take time but
will also be a regression for us: it will cause performance and sizing
problems on at least one of our clusters, and we will lose the silent
corruption detection from BTRFS)
- then upgrade to Luminous and migrate again to Bluestore.
I was not expecting the transition from Btrfs+Filestore to Bluestore to
be this convoluted (we were planning to add Bluestore OSDs one at a time
and study the performance/stability for months before migrating the
whole clusters). Is there any way to restrict your BTRFS tests to at
least a given stable configuration? (BTRFS is known to have problems with
the high rate of snapshot deletion Ceph generates by default, for example,
and we use 'filestore btrfs snap = false'.)

Best regards,

Lionel


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Wido den Hollander

> On 30 June 2017 at 18:48, Sage Weil wrote:
> 
> 
> On Fri, 30 Jun 2017, Lenz Grimmer wrote:
> > Hi Sage,
> > 
> > On 06/30/2017 05:21 AM, Sage Weil wrote:
> > 
> > > The easiest thing is to
> > > 
> > > 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
> > > against btrfs for a long time and are moving toward bluestore anyway.
> > 
> > Searching the documentation for "btrfs" does not really give a user any
> > clue that the use of Btrfs is discouraged.
> > 
> > Where exactly has this been recommended?
> > 
> > The documentation currently states:
> >
> > http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
> > 
> > "We recommend using the xfs file system or the btrfs file system when
> > running mkfs."
> > 
> > http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
> > 
> > "btrfs is still supported and has a comparatively compelling set of
> > features, but be mindful of its stability and support status in your
> > Linux distribution."
> > 
> > http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
> > 
> > "If you use the btrfs file system with Ceph, we recommend using a recent
> > Linux kernel (3.14 or later)."
> > 
> > As an end user, none of these statements would really sound as
> > recommendations *against* using Btrfs to me.
> > 
> > I'm therefore concerned about just disabling the tests related to
> > filestore on Btrfs while still including and shipping it. This has
> > potential to introduce regressions that won't get caught and fixed.
> 
> Ah, crap.  This is what happens when devs don't read their own 
> documentation.  I recommend against btrfs every time it ever comes up, the 
> downstream distributions all support only xfs, but yes, it looks like the 
> docs never got updated... despite the xfs focus being 5ish years old now.
> 
> I'll submit a PR to clean this up, but
>  
> > > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
> > > the occasional ENOSPC errors we see.  (They make the test runs noisy but 
> > > are pretty easy to identify.)
> > > 
> > > If we don't stop testing filestore on btrfs now, I'm not sure when we 
> > > would ever be able to stop, and that's pretty clearly not sustainable.
> > > Does that seem reasonable?  (Pretty please?)
> > 
> > If you want to get rid of filestore on Btrfs, start a proper deprecation
> > process and inform users that support for it is going to be removed in
> > the near future. The documentation must be updated accordingly and it
> > must be clearly emphasized in the release notes.
> > 
> > Simply disabling the tests while keeping the code in the distribution is
> > setting up users who happen to be using Btrfs for failure.
> 
> I don't think we can wait *another* cycle (year) to stop testing this.
> 
> We can, however,
> 
>  - prominently feature this in the luminous release notes, and
>  - require the 'enable experimental unrecoverable data corrupting features =
> btrfs' in order to use it, so that users are explicitly opting in to 
> luminous+btrfs territory.
> 
> The only good(ish) news is that we aren't touching FileStore if we can 
> help it, so it is less likely to regress than other things.  And we'll 
> continue testing filestore+btrfs on jewel for some time.
> 
> Is that good enough?

Sounds good to me. Every cluster I run into runs XFS. People running btrfs did 
that deliberately and by adding that flag you encourage them to go to BlueStore.

Wido 

> 
> sage
> 


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread ceph
On 30/06/2017 18:48, Sage Weil wrote:
> We can, however,
> 
>  - prominently feature this in the luminous release notes, and
>  - require the 'enable experimental unrecoverable data corrupting features =
> btrfs' in order to use it, so that users are explicitly opting in to 
> luminous+btrfs territory.
> Is that good enough?
> 
> sage

This seems sane to me




Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Sage Weil
On Fri, 30 Jun 2017, Lenz Grimmer wrote:
> Hi Sage,
> 
> On 06/30/2017 05:21 AM, Sage Weil wrote:
> 
> > The easiest thing is to
> > 
> > 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
> > against btrfs for a long time and are moving toward bluestore anyway.
> 
> Searching the documentation for "btrfs" does not really give a user any
> clue that the use of Btrfs is discouraged.
> 
> Where exactly has this been recommended?
> 
> The documentation currently states:
>
> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
> 
> "We recommend using the xfs file system or the btrfs file system when
> running mkfs."
> 
> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
> 
> "btrfs is still supported and has a comparatively compelling set of
> features, but be mindful of its stability and support status in your
> Linux distribution."
> 
> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
> 
> "If you use the btrfs file system with Ceph, we recommend using a recent
> Linux kernel (3.14 or later)."
> 
> As an end user, none of these statements would really sound as
> recommendations *against* using Btrfs to me.
> 
> I'm therefore concerned about just disabling the tests related to
> filestore on Btrfs while still including and shipping it. This has
> potential to introduce regressions that won't get caught and fixed.

Ah, crap.  This is what happens when devs don't read their own 
documentation.  I recommend against btrfs every time it ever comes up, the 
downstream distributions all support only xfs, but yes, it looks like the 
docs never got updated... despite the xfs focus being 5ish years old now.

I'll submit a PR to clean this up, but
 
> > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
> > the occasional ENOSPC errors we see.  (They make the test runs noisy but 
> > are pretty easy to identify.)
> > 
> > If we don't stop testing filestore on btrfs now, I'm not sure when we 
> > would ever be able to stop, and that's pretty clearly not sustainable.
> > Does that seem reasonable?  (Pretty please?)
> 
> If you want to get rid of filestore on Btrfs, start a proper deprecation
> process and inform users that support for it is going to be removed in
> the near future. The documentation must be updated accordingly and it
> must be clearly emphasized in the release notes.
> 
> Simply disabling the tests while keeping the code in the distribution is
> setting up users who happen to be using Btrfs for failure.

I don't think we can wait *another* cycle (year) to stop testing this.

We can, however,

 - prominently feature this in the luminous release notes, and
 - require the 'enable experimental unrecoverable data corrupting features =
btrfs' in order to use it, so that users are explicitly opting in to 
luminous+btrfs territory.
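
To make the opt-in idea concrete, here is a minimal sketch of how a tool might check a ceph.conf for that flag before allowing a btrfs-backed OSD to start. The option name is the one quoted above; the helper names, the use of the [global] section, and the comma- or space-separated value format are assumptions for illustration, not Ceph's actual implementation.

```python
import configparser

# Option name as quoted in this thread. Spaces and underscores in the
# option name are treated as equivalent here, so normalise before comparing.
OPT_IN_KEY = "enable experimental unrecoverable data corrupting features"


def _normalise(name):
    """Lower-case the name and turn underscores/extra spaces into single spaces."""
    return " ".join(name.replace("_", " ").split()).lower()


def experimental_features(conf_text, section="global"):
    """Return the set of features opted into in the given ceph.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if not parser.has_section(section):
        return set()
    for key, value in parser.items(section):
        if _normalise(key) == OPT_IN_KEY:
            # Assumed format: comma- or space-separated feature names.
            return set(value.replace(",", " ").split())
    return set()


def btrfs_opted_in(conf_text):
    """True only if the operator explicitly listed btrfs in the opt-in option."""
    return "btrfs" in experimental_features(conf_text)
```

A caller would refuse to proceed (or print a loud warning) when btrfs_opted_in() returns False for a filestore OSD on btrfs.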

The only good(ish) news is that we aren't touching FileStore if we can 
help it, so it is less likely to regress than other things.  And we'll 
continue testing filestore+btrfs on jewel for some time.

Is that good enough?

sage



Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Christian Balzer
On Fri, 30 Jun 2017 16:29:43 + David Turner wrote:

> I actually don't see either of these as issues with just flat out saying
> that Btrfs will not be supported in Luminous.  It's a full new release and
> it sounds like it is no longer a relevant Filestore backend in Luminous.
> People can either plan to migrate their OSDs to Bluestore once they reach
> Luminous or just not upgrade to Luminous.  Upgrading is optional and not
> mandatory.
> 
You tell that to the people in charge when there's a critical bug in a
version that's no longer maintained. 
At the release cycle speed of Ceph this tends to be an option only for
those of us who are happy to freeze a cluster at a certain version until
it dies of natural causes.

That being said, anybody who deployed BTRFS within the last 1-2 years
should have seen the writing on the wall, but the ability to read
between the lines is indeed no substitute for a "proper deprecation".
And at this time that should probably be formally extended to ZFS.

Christian

> On Fri, Jun 30, 2017 at 11:47 AM Lenz Grimmer  wrote:
> 
> > Hi Sage,
> >
> > On 06/30/2017 05:21 AM, Sage Weil wrote:
> >  
> > > The easiest thing is to
> > >
> > > 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended
> > > against btrfs for a long time and are moving toward bluestore anyway.  
> >
> > Searching the documentation for "btrfs" does not really give a user any
> > clue that the use of Btrfs is discouraged.
> >
> > Where exactly has this been recommended?
> >
> > The documentation currently states:
> >
> >
> > http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
> >
> > "We recommend using the xfs file system or the btrfs file system when
> > running mkfs."
> >
> >
> > http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
> >
> > "btrfs is still supported and has a comparatively compelling set of
> > features, but be mindful of its stability and support status in your
> > Linux distribution."
> >
> >
> > http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
> >
> > "If you use the btrfs file system with Ceph, we recommend using a recent
> > Linux kernel (3.14 or later)."
> >
> > As an end user, none of these statements would really sound as
> > recommendations *against* using Btrfs to me.
> >
> > I'm therefore concerned about just disabling the tests related to
> > filestore on Btrfs while still including and shipping it. This has
> > potential to introduce regressions that won't get caught and fixed.
> >  
> > > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out
> > > the occasional ENOSPC errors we see.  (They make the test runs noisy but
> > > are pretty easy to identify.)
> > >
> > > If we don't stop testing filestore on btrfs now, I'm not sure when we
> > > would ever be able to stop, and that's pretty clearly not sustainable.
> > > Does that seem reasonable?  (Pretty please?)  
> >
> > If you want to get rid of filestore on Btrfs, start a proper deprecation
> > process and inform users that support for it is going to be removed in
> > the near future. The documentation must be updated accordingly and it
> > must be clearly emphasized in the release notes.
> >
> > Simply disabling the tests while keeping the code in the distribution is
> > setting up users who happen to be using Btrfs for failure.
> >
> > Just my 0.02€,
> >
> > Lenz
> >
> >  


-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com   Rakuten Communications


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread David Turner
I actually don't see either of these as issues with just flat out saying
that Btrfs will not be supported in Luminous.  It's a full new release and
it sounds like it is no longer a relevant Filestore backend in Luminous.
People can either plan to migrate their OSDs to Bluestore once they reach
Luminous or just not upgrade to Luminous.  Upgrading is optional and not
mandatory.

On Fri, Jun 30, 2017 at 11:47 AM Lenz Grimmer  wrote:

> Hi Sage,
>
> On 06/30/2017 05:21 AM, Sage Weil wrote:
>
> > The easiest thing is to
> >
> > 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended
> > against btrfs for a long time and are moving toward bluestore anyway.
>
> Searching the documentation for "btrfs" does not really give a user any
> clue that the use of Btrfs is discouraged.
>
> Where exactly has this been recommended?
>
> The documentation currently states:
>
>
> http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds
>
> "We recommend using the xfs file system or the btrfs file system when
> running mkfs."
>
>
> http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems
>
> "btrfs is still supported and has a comparatively compelling set of
> features, but be mindful of its stability and support status in your
> Linux distribution."
>
>
> http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies
>
> "If you use the btrfs file system with Ceph, we recommend using a recent
> Linux kernel (3.14 or later)."
>
> As an end user, none of these statements would really sound as
> recommendations *against* using Btrfs to me.
>
> I'm therefore concerned about just disabling the tests related to
> filestore on Btrfs while still including and shipping it. This has
> potential to introduce regressions that won't get caught and fixed.
>
> > 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out
> > the occasional ENOSPC errors we see.  (They make the test runs noisy but
> > are pretty easy to identify.)
> >
> > If we don't stop testing filestore on btrfs now, I'm not sure when we
> > would ever be able to stop, and that's pretty clearly not sustainable.
> > Does that seem reasonable?  (Pretty please?)
>
> If you want to get rid of filestore on Btrfs, start a proper deprecation
> process and inform users that support for it is going to be removed in
> the near future. The documentation must be updated accordingly and it
> must be clearly emphasized in the release notes.
>
> Simply disabling the tests while keeping the code in the distribution is
> setting up users who happen to be using Btrfs for failure.
>
> Just my 0.02€,
>
> Lenz
>


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Sean Purdy
On Fri, 30 Jun 2017, Lenz Grimmer said:
> > 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
> > against btrfs for a long time and are moving toward bluestore anyway.
> 
> Searching the documentation for "btrfs" does not really give a user any
> clue that the use of Btrfs is discouraged.
> 
> Where exactly has this been recommended?

As a new user, I certainly picked up on btrfs being discouraged, or not as 
stable as XFS.

e.g.
http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs

"We currently recommend XFS for production deployments.

We used to recommend btrfs for testing, development, and any non-critical 
deployments ..."


http://docs.ceph.com/docs/master/start/hardware-recommendations/?highlight=btrfs

"btrfs is not quite stable enough for production"
 

> If you want to get rid of filestore on Btrfs, start a proper deprecation
> process and inform users that support for it is going to be removed in
> the near future. The documentation must be updated accordingly and it
> must be clearly emphasized in the release notes.

But this sounds sane.


Sean Purdy
CV-Library Ltd


Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Lenz Grimmer
Hi Sage,

On 06/30/2017 05:21 AM, Sage Weil wrote:

> The easiest thing is to
> 
> 1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
> against btrfs for a long time and are moving toward bluestore anyway.

Searching the documentation for "btrfs" does not really give a user any
clue that the use of Btrfs is discouraged.

Where exactly has this been recommended?

The documentation currently states:

http://docs.ceph.com/docs/master/rados/configuration/ceph-conf/?highlight=btrfs#osds

"We recommend using the xfs file system or the btrfs file system when
running mkfs."

http://docs.ceph.com/docs/master/rados/configuration/filesystem-recommendations/?highlight=btrfs#filesystems

"btrfs is still supported and has a comparatively compelling set of
features, but be mindful of its stability and support status in your
Linux distribution."

http://docs.ceph.com/docs/master/start/os-recommendations/?highlight=btrfs#ceph-dependencies

"If you use the btrfs file system with Ceph, we recommend using a recent
Linux kernel (3.14 or later)."

As an end user, none of these statements would really sound as
recommendations *against* using Btrfs to me.

I'm therefore concerned about just disabling the tests related to
filestore on Btrfs while still including and shipping it. This has
potential to introduce regressions that won't get caught and fixed.

> 2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
> the occasional ENOSPC errors we see.  (They make the test runs noisy but 
> are pretty easy to identify.)
> 
> If we don't stop testing filestore on btrfs now, I'm not sure when we 
> would ever be able to stop, and that's pretty clearly not sustainable.
> Does that seem reasonable?  (Pretty please?)

If you want to get rid of filestore on Btrfs, start a proper deprecation
process and inform users that support for it is going to be removed in
the near future. The documentation must be updated accordingly and it
must be clearly emphasized in the release notes.

Simply disabling the tests while keeping the code in the distribution is
setting up users who happen to be using Btrfs for failure.

Just my 0.02€,

Lenz





Re: [ceph-users] dropping filestore+btrfs testing for luminous

2017-06-30 Thread Peter Maloney
On 06/30/17 05:21, Sage Weil wrote:
> We're having a series of problems with the valgrind included in xenial[1] 
> that have led us to restrict all valgrind tests to centos nodes.  At the 
> same time, we're also seeing spurious ENOSPC errors from btrfs on both 
> centos and xenial kernels[2], making trusty the only distro where btrfs 
> works reliably.
Do you guys know about balance filters and how to use them to prevent
ENOSPC?

see: https://btrfs.wiki.kernel.org/index.php/Balance_Filters

Basically it sometimes (when using snaps heavily) just has many
partially used chunks and so you rebalance the data inside them so it
can remove the fully empty ones and reuse the space. The above page says
to run commands like:

> btrfs balance start -dusage=50 /

where you start at 50 or so, then raise it and rerun if you want, until
you have reclaimed enough space.
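
A rough sketch of that "start at 50 and raise it" loop, in case it helps. Only the `btrfs balance start -dusage=N <mount>` command line comes from the page above; the threshold steps, the function names and the `enough_space` callback are made up for illustration, and the real command needs root and a mounted btrfs filesystem.

```python
import subprocess


def balance_commands(mount_point, thresholds=(50, 60, 70, 80)):
    """Build one `btrfs balance start -dusage=N` command per threshold."""
    return [
        ["btrfs", "balance", "start", f"-dusage={pct}", mount_point]
        for pct in thresholds
    ]


def run_balance(mount_point, thresholds=(50, 60, 70, 80),
                enough_space=None, run=subprocess.run):
    """Run balance passes with rising usage thresholds.

    Stops early once enough_space() (a hypothetical caller-supplied check)
    returns True. Returns the list of thresholds that were actually run.
    """
    used = []
    for pct, cmd in zip(thresholds, balance_commands(mount_point, thresholds)):
        run(cmd, check=True)  # raises CalledProcessError if balance fails
        used.append(pct)
        if enough_space is not None and enough_space():
            break
    return used
```

Passing `run` as a parameter keeps the loop testable without touching a real filesystem; a test harness could call run_balance() after every N unit tests or after the first ENOSPC.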

So, to make the automated tests eat less of your time, you could script
something that runs that after some number of unit tests, or the
btrfs filestore itself could do it after removing some number of
snapshots, or after an ENOSPC followed by a retry. I don't know what's
easy to implement; I'm just making sure you're aware of the option.



[ceph-users] dropping filestore+btrfs testing for luminous

2017-06-29 Thread Sage Weil
We're having a series of problems with the valgrind included in xenial[1] 
that have led us to restrict all valgrind tests to centos nodes.  At the 
same time, we're also seeing spurious ENOSPC errors from btrfs on both 
centos and xenial kernels[2], making trusty the only distro where btrfs 
works reliably.

Teuthology doesn't handle this well when it tries to put together the 
test matrix (we can't test filestore+btrfs+valgrind).

The easiest thing is to

1/ Stop testing filestore+btrfs for luminous onward.  We've recommended 
against btrfs for a long time and are moving toward bluestore anyway.

2/ Leave btrfs in the mix for jewel, and manually tolerate and filter out 
the occasional ENOSPC errors we see.  (They make the test runs noisy but 
are pretty easy to identify.)
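
For illustration only, the "tolerate and filter out" step in 2/ could look something like the sketch below: split a run's failure messages into ENOSPC noise and everything else. The function name and the sample messages are hypothetical; this is not how teuthology actually reports or classifies failures.

```python
import re

# Matches the errno name or the usual strerror text for ENOSPC.
ENOSPC_PATTERN = re.compile(r"\bENOSPC\b|No space left on device")


def split_failures(failures):
    """Split failure messages into (enospc_noise, needs_attention)."""
    noise, real = [], []
    for message in failures:
        if ENOSPC_PATTERN.search(message):
            noise.append(message)
        else:
            real.append(message)
    return noise, real


if __name__ == "__main__":
    # Hypothetical failure messages, invented for this example.
    sample = [
        "osd.3: write failed: (28) No space left on device",
        "assert failed in PG::activate",
    ]
    noise, real = split_failures(sample)
    print(len(noise), "ENOSPC failures filtered;", len(real), "left to review")
```

Anything left in the second list still needs a human to look at it; only the ENOSPC lines are treated as known noise.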

If we don't stop testing filestore on btrfs now, I'm not sure when we 
would ever be able to stop, and that's pretty clearly not sustainable.
Does that seem reasonable?  (Pretty please?)

sage


[1] http://tracker.ceph.com/issues/18126 and 
http://tracker.ceph.com/issues/20360
[2] http://tracker.ceph.com/issues/20169