Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 3:28 PM, Dan van der Ster  wrote:
> On Tue, Nov 7, 2017 at 4:15 PM, John Spray  wrote:
>> On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster  wrote:
>>> On Tue, Nov 7, 2017 at 12:57 PM, John Spray  wrote:
 On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
> My organization has a production  cluster primarily used for cephfs 
> upgraded
> from jewel to luminous. We would very much like to have snapshots on that
> filesystem, but understand that there are risks.
>
> What kind of work could cephfs admins do to help the devs stabilize this
> feature?

 If you have a disposable test system, then you could install the
 latest master branch of Ceph (which has a stream of snapshot fixes in
 it) and run a replica of your intended workload.  If you can find
 snapshot bugs (especially crashes) on master then they will certainly
 attract interest.
>>>
>>> Hi John. Are those fixes in master expected to make snapshots stable
>>> with rados namespaces (a la Manila) ?
>>
>> Was there a specific bug with snapshots+namespaces?  I may be having a
>> failure of memory or have not been paying enough attention :-)
>>
>
> I might be misremembering, but I recall a mail from Greg? a few months
> ago saying that snapshots would surely not work with multiple
> namespaces.
> Can't find the mail, though

I think that was about not being able to safely use snapshots when
multiple filesystems share the same data pool (and are separated by
only namespaces).

If you have a single filesystem using multiple namespaces within the
same data pool then I don't think there's an issue.
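
For illustration, a rough sketch of that single-filesystem case, assuming
a mount at /mnt/cephfs and made-up directory/namespace names:

  # place a directory's file data in its own RADOS namespace within the
  # one data pool (Manila-style layout)
  mkdir -p /mnt/cephfs/volumes/share_a
  setfattr -n ceph.dir.layout.pool_namespace -v share_a /mnt/cephfs/volumes/share_a

  # snapshots are still taken per directory as usual
  mkdir /mnt/cephfs/volumes/share_a/.snap/before-change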

John


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Dan van der Ster
On Tue, Nov 7, 2017 at 4:15 PM, John Spray  wrote:
> On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster  wrote:
>> On Tue, Nov 7, 2017 at 12:57 PM, John Spray  wrote:
>>> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
 My organization has a production  cluster primarily used for cephfs 
 upgraded
 from jewel to luminous. We would very much like to have snapshots on that
 filesystem, but understand that there are risks.

 What kind of work could cephfs admins do to help the devs stabilize this
 feature?
>>>
>>> If you have a disposable test system, then you could install the
>>> latest master branch of Ceph (which has a stream of snapshot fixes in
>>> it) and run a replica of your intended workload.  If you can find
>>> snapshot bugs (especially crashes) on master then they will certainly
>>> attract interest.
>>
>> Hi John. Are those fixes in master expected to make snapshots stable
>> with rados namespaces (a la Manila) ?
>
> Was there a specific bug with snapshots+namespaces?  I may be having a
> failure of memory or have not been paying enough attention :-)
>

I might be misremembering, but I recall a mail (from Greg, I think) a few
months ago saying that snapshots would surely not work with multiple
namespaces. I can't find that mail now, though.

-- dan


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster  wrote:
> On Tue, Nov 7, 2017 at 12:57 PM, John Spray  wrote:
>> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
>>> My organization has a production  cluster primarily used for cephfs upgraded
>>> from jewel to luminous. We would very much like to have snapshots on that
>>> filesystem, but understand that there are risks.
>>>
>>> What kind of work could cephfs admins do to help the devs stabilize this
>>> feature?
>>
>> If you have a disposable test system, then you could install the
>> latest master branch of Ceph (which has a stream of snapshot fixes in
>> it) and run a replica of your intended workload.  If you can find
>> snapshot bugs (especially crashes) on master then they will certainly
>> attract interest.
>
> Hi John. Are those fixes in master expected to make snapshots stable
> with rados namespaces (a la Manila) ?

Was there a specific bug with snapshots+namespaces?  I may be having a
failure of memory or have not been paying enough attention :-)

John


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Dan van der Ster
On Tue, Nov 7, 2017 at 12:57 PM, John Spray  wrote:
> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
>> My organization has a production  cluster primarily used for cephfs upgraded
>> from jewel to luminous. We would very much like to have snapshots on that
>> filesystem, but understand that there are risks.
>>
>> What kind of work could cephfs admins do to help the devs stabilize this
>> feature?
>
> If you have a disposable test system, then you could install the
> latest master branch of Ceph (which has a stream of snapshot fixes in
> it) and run a replica of your intended workload.  If you can find
> snapshot bugs (especially crashes) on master then they will certainly
> attract interest.

Hi John. Are those fixes in master expected to make snapshots stable
with rados namespaces (a la Manila)?

-- dan


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Tue, Nov 7, 2017 at 2:40 PM, Brady Deetz  wrote:
> Are there any existing fuzzing tools you'd recommend? I know about ceph osd
> thrash, which could be tested against, but what about on the client side? I
> could just use something pre-built for posix, but that wouldn't coordinate
> simulated failures on the storage side with actions against the fs. If there
> is not any current tooling for coordinating server and client simulation,
> maybe that's where I start.

We do have "thrasher" classes for randomly failing MDS daemons in the
main test suite, but those will only be useful for you if you're
working on automated tests that run inside that framework
(https://github.com/ceph/ceph/blob/master/qa/tasks/mds_thrash.py).

If you're working on a local cluster, then it's pretty simple (and
useful) to write a small shell script that e.g. sleeps a few minutes,
then does a "ceph mds fail" on a random rank.
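
Something along these lines, as a rough sketch (the rank range and sleep
interval are placeholders for your own setup):

  #!/bin/bash
  # naive MDS thrasher: every few minutes, fail a random active rank
  MAX_RANK=2        # set to max_mds - 1 for your filesystem
  while true; do
      sleep $(( (RANDOM % 11 + 5) * 60 ))   # 5-15 minutes
      rank=$(( RANDOM % (MAX_RANK + 1) ))
      echo "$(date) failing mds rank ${rank}"
      ceph mds fail ${rank}
  done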

Something more sophisticated that would include client failures is a
long-standing wishlist item for the automated testing.

John

>
> On Nov 7, 2017 5:57 AM, "John Spray"  wrote:
>>
>> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
>> > My organization has a production  cluster primarily used for cephfs
>> > upgraded
>> > from jewel to luminous. We would very much like to have snapshots on
>> > that
>> > filesystem, but understand that there are risks.
>> >
>> > What kind of work could cephfs admins do to help the devs stabilize this
>> > feature?
>>
>> If you have a disposable test system, then you could install the
>> latest master branch of Ceph (which has a stream of snapshot fixes in
>> it) and run a replica of your intended workload.  If you can find
>> snapshot bugs (especially crashes) on master then they will certainly
>> attract interest.
>>
>> John
>>


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread Brady Deetz
Are there any existing fuzzing tools you'd recommend? I know about ceph osd
thrash, which could be tested against, but what about on the client side? I
could just use something pre-built for POSIX, but that wouldn't coordinate
simulated failures on the storage side with actions against the filesystem.
If there isn't any existing tooling for coordinating server and client
simulation, maybe that's where I should start.

On Nov 7, 2017 5:57 AM, "John Spray"  wrote:

> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
> > My organization has a production  cluster primarily used for cephfs
> upgraded
> > from jewel to luminous. We would very much like to have snapshots on that
> > filesystem, but understand that there are risks.
> >
> > What kind of work could cephfs admins do to help the devs stabilize this
> > feature?
>
> If you have a disposable test system, then you could install the
> latest master branch of Ceph (which has a stream of snapshot fixes in
> it) and run a replica of your intended workload.  If you can find
> snapshot bugs (especially crashes) on master then they will certainly
> attract interest.
>
> John
>


Re: [ceph-users] Cephfs snapshot work

2017-11-07 Thread John Spray
On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz  wrote:
> My organization has a production  cluster primarily used for cephfs upgraded
> from jewel to luminous. We would very much like to have snapshots on that
> filesystem, but understand that there are risks.
>
> What kind of work could cephfs admins do to help the devs stabilize this
> feature?

If you have a disposable test system, then you could install the
latest master branch of Ceph (which has a stream of snapshot fixes in
it) and run a replica of your intended workload.  If you can find
snapshot bugs (especially crashes) on master then they will certainly
attract interest.
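
For example, once the test filesystem is up and mounted, something like
this rough sketch running alongside the replica workload will exercise the
snapshot paths (filesystem name and directory are placeholders; some
releases also want --yes-i-really-mean-it when enabling snapshots):

  # enable snapshots on the test filesystem
  ceph fs set cephfs allow_new_snaps true

  # repeatedly create and remove snapshots while the workload runs
  for i in $(seq 1 100); do
      mkdir /mnt/cephfs/testdir/.snap/snap-${i}
      sleep 60
      rmdir /mnt/cephfs/testdir/.snap/snap-${i}
  done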

John



Re: [ceph-users] Cephfs snapshot work

2017-11-06 Thread Vasu Kulkarni
On Sun, Nov 5, 2017 at 8:19 AM, Brady Deetz  wrote:
> My organization has a production  cluster primarily used for cephfs upgraded
> from jewel to luminous. We would very much like to have snapshots on that
> filesystem, but understand that there are risks.
>
> What kind of work could cephfs admins do to help the devs stabilize this
> feature?

From an admin's point of view, I would say: have an equivalent, smaller
non-production cluster with some decent workloads (or copy a subset of
your data onto it). You can then try the features you want on that
non-production cluster; it will have a smaller data set, but it will
still help characterize how things perform, and if you encounter issues
you can file them at http://tracker.ceph.com/ to get some attention
around them and learn more.
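
For instance, copying one representative tree from the production mount to
the test cluster's mount is often enough (paths here are made up):

  # copy a subset of production data onto the test filesystem
  rsync -a /mnt/cephfs-prod/projects/projectA/ /mnt/cephfs-test/projects/projectA/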
