Re: [ceph-users] mount cephfs on ceph servers

2019-03-13 Thread Zhenshi Zhou
I got a deadlock when mounting CephFS with kernel 4.12 and Ceph 12.2.5.

The Ceph servers had no write operations at all to the mounted directories,
but the clients still hung until I restarted the servers.

I have not encountered the same issue since then.

Paul Emmerich wrote on Wed, Mar 13, 2019 at 12:30 PM:

> On Tue, Mar 12, 2019 at 8:56 PM David C  wrote:
> >
> > Out of curiosity, are you guys re-exporting the fs to clients over
> something like nfs or running applications directly on the OSD nodes?
>
> Kernel NFS + kernel CephFS can fall apart and deadlock itself in
> exciting ways...
>
> nfs-ganesha is so much better.
>
> Paul
>
> >
> > On Tue, 12 Mar 2019, 18:28 Paul Emmerich, 
> wrote:
> >>
> >> Mounting kernel CephFS on an OSD node works fine with recent kernels
> >> (4.14+) and enough RAM in the servers.
> >>
> >> We did encounter problems with older kernels though
> >>
> >>
> >> Paul
> >>
> >> --
> >> Paul Emmerich
> >>
> >> Looking for help with your Ceph cluster? Contact us at https://croit.io
> >>
> >> croit GmbH
> >> Freseniusstr. 31h
> >> 81247 München
> >> www.croit.io
> >> Tel: +49 89 1896585 90
> >>
> >> On Tue, Mar 12, 2019 at 10:07 AM Hector Martin 
> wrote:
> >> >
> >> > It's worth noting that most containerized deployments can effectively
> >> > limit RAM for containers (cgroups), and the kernel has limits on how
> >> > many dirty pages it can keep around.
> >> >
> >> > In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most
> 20%
> >> > of your total RAM can be dirty FS pages. If you set up your containers
> >> > such that the cumulative memory usage is capped below, say, 70% of
> RAM,
> >> > then this might effectively guarantee that you will never hit this
> issue.
> >> >
> >> > On 08/03/2019 02:17, Tony Lill wrote:
> >> > > AFAIR the issue is that under memory pressure, the kernel will ask
> >> > > cephfs to flush pages, but that this in turn causes the osd (mds?)
> to
> >> > > require more memory to complete the flush (for network buffers,
> etc). As
> >> > > long as cephfs and the OSDs are feeding from the same kernel
> mempool,
> >> > > you are susceptible. Containers don't protect you, but a full VM,
> like
> >> > > xen or kvm? would.
> >> > >
> >> > > So if you don't hit the low memory situation, you will not see the
> >> > > deadlock, and you can run like this for years without a problem. I
> have.
> >> > > But you are most likely to run out of memory during recovery, so
> this
> >> > > could compound your problems.
> >> > >
> >> > > On 3/7/19 3:56 AM, Marc Roos wrote:
> >> > >>
> >> > >>
> >> > >> Container =  same kernel, problem is with processes using the same
> >> > >> kernel.
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >>
> >> > >> -Original Message-
> >> > >> From: Daniele Riccucci [mailto:devs...@posteo.net]
> >> > >> Sent: 07 March 2019 00:18
> >> > >> To: ceph-users@lists.ceph.com
> >> > >> Subject: Re: [ceph-users] mount cephfs on ceph servers
> >> > >>
> >> > >> Hello,
> >> > >> is the deadlock risk still an issue in containerized deployments?
> For
> >> > >> example with OSD daemons in containers and mounting the filesystem
> on
> >> > >> the host machine?
> >> > >> Thank you.
> >> > >>
> >> > >> Daniele
> >> > >>
> >> > >> On 06/03/19 16:40, Jake Grimmett wrote:
> >> > >>> Just to add "+1" on this datapoint, based on one month usage on
> Mimic
> >> > >>> 13.2.4 essentially "it works great for us"
> >> > >>>
> >> > >>> Prior to this, we had issues with the kernel driver on 12.2.2.
> This
> >> > >>> could have been due to limited RAM on the osd nodes (128GB / 45
> OSD),
> >> > >>> and an older kernel.
> >> > >>>
> >> > >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Paul Emmerich
On Tue, Mar 12, 2019 at 8:56 PM David C  wrote:
>
> Out of curiosity, are you guys re-exporting the fs to clients over something 
> like nfs or running applications directly on the OSD nodes?

Kernel NFS + kernel CephFS can fall apart and deadlock itself in
exciting ways...

nfs-ganesha is so much better.

Paul

>
> On Tue, 12 Mar 2019, 18:28 Paul Emmerich,  wrote:
>>
>> Mounting kernel CephFS on an OSD node works fine with recent kernels
>> (4.14+) and enough RAM in the servers.
>>
>> We did encounter problems with older kernels though
>>
>>
>> Paul
>>
>> --
>> Paul Emmerich
>>
>> Looking for help with your Ceph cluster? Contact us at https://croit.io
>>
>> croit GmbH
>> Freseniusstr. 31h
>> 81247 München
>> www.croit.io
>> Tel: +49 89 1896585 90
>>
>> On Tue, Mar 12, 2019 at 10:07 AM Hector Martin  wrote:
>> >
>> > It's worth noting that most containerized deployments can effectively
>> > limit RAM for containers (cgroups), and the kernel has limits on how
>> > many dirty pages it can keep around.
>> >
>> > In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20%
>> > of your total RAM can be dirty FS pages. If you set up your containers
>> > such that the cumulative memory usage is capped below, say, 70% of RAM,
>> > then this might effectively guarantee that you will never hit this issue.
>> >
>> > On 08/03/2019 02:17, Tony Lill wrote:
>> > > AFAIR the issue is that under memory pressure, the kernel will ask
>> > > cephfs to flush pages, but that this in turn causes the osd (mds?) to
>> > > require more memory to complete the flush (for network buffers, etc). As
>> > > long as cephfs and the OSDs are feeding from the same kernel mempool,
>> > > you are susceptible. Containers don't protect you, but a full VM, like
>> > > xen or kvm? would.
>> > >
>> > > So if you don't hit the low memory situation, you will not see the
>> > > deadlock, and you can run like this for years without a problem. I have.
>> > > But you are most likely to run out of memory during recovery, so this
>> > > could compound your problems.
>> > >
>> > > On 3/7/19 3:56 AM, Marc Roos wrote:
>> > >>
>> > >>
>> > >> Container =  same kernel, problem is with processes using the same
>> > >> kernel.
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >>
>> > >> -Original Message-
>> > >> From: Daniele Riccucci [mailto:devs...@posteo.net]
>> > >> Sent: 07 March 2019 00:18
>> > >> To: ceph-users@lists.ceph.com
>> > >> Subject: Re: [ceph-users] mount cephfs on ceph servers
>> > >>
>> > >> Hello,
>> > >> is the deadlock risk still an issue in containerized deployments? For
>> > >> example with OSD daemons in containers and mounting the filesystem on
>> > >> the host machine?
>> > >> Thank you.
>> > >>
>> > >> Daniele
>> > >>
>> > >> On 06/03/19 16:40, Jake Grimmett wrote:
>> > >>> Just to add "+1" on this datapoint, based on one month usage on Mimic
>> > >>> 13.2.4 essentially "it works great for us"
>> > >>>
>> > >>> Prior to this, we had issues with the kernel driver on 12.2.2. This
>> > >>> could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
>> > >>> and an older kernel.
>> > >>>
>> > >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
>> > >>> allowed us to reliably use the kernel driver.
>> > >>>
>> > >>> We keep 30 snapshots ( one per day), have one active metadata server,
>> > >>> and change several TB daily - it's much, *much* faster than with fuse.
>> > >>>
>> > >>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
>> > >>>
>> > >>> ta ta
>> > >>>
>> > >>> Jake
>> > >>>
>> > >>>
>> > >>>
>> > >>>
>> > >>> On 3/6/19 11:10 AM, Hector Martin wrote:
>> > >>>> On 06/03/2019 12:07, Zhenshi Zhou wrote:
>> > >>>>> Hi,

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
Both, in my case (on the same host, both local services and the NFS export
use the CephFS mount). I use the in-kernel NFS server (not nfs-ganesha).

On 13/03/2019 04.55, David C wrote:
> Out of curiosity, are you guys re-exporting the fs to clients over
> something like nfs or running applications directly on the OSD nodes? 
> 
> On Tue, 12 Mar 2019, 18:28 Paul Emmerich, <paul.emmer...@croit.io> wrote:
> 
> Mounting kernel CephFS on an OSD node works fine with recent kernels
> (4.14+) and enough RAM in the servers.
> 
> We did encounter problems with older kernels though
> 
> 
> Paul
> 
> -- 
> Paul Emmerich
> 
> Looking for help with your Ceph cluster? Contact us at https://croit.io
> 
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
> 
> On Tue, Mar 12, 2019 at 10:07 AM Hector Martin <hec...@marcansoft.com> wrote:
> >
> > It's worth noting that most containerized deployments can effectively
> > limit RAM for containers (cgroups), and the kernel has limits on how
> > many dirty pages it can keep around.
> >
> > In particular, /proc/sys/vm/dirty_ratio (default: 20) means at
> most 20%
> > of your total RAM can be dirty FS pages. If you set up your containers
> > such that the cumulative memory usage is capped below, say, 70% of
> RAM,
> > then this might effectively guarantee that you will never hit this
> issue.
> >
> > On 08/03/2019 02:17, Tony Lill wrote:
> > > AFAIR the issue is that under memory pressure, the kernel will ask
> > > cephfs to flush pages, but that this in turn causes the osd
> (mds?) to
> > > require more memory to complete the flush (for network buffers,
> etc). As
> > > long as cephfs and the OSDs are feeding from the same kernel
> mempool,
> > > you are susceptible. Containers don't protect you, but a full
> VM, like
> > > xen or kvm? would.
> > >
> > > So if you don't hit the low memory situation, you will not see the
> > > deadlock, and you can run like this for years without a problem.
> I have.
> > > But you are most likely to run out of memory during recovery, so
> this
> > > could compound your problems.
> > >
> > > On 3/7/19 3:56 AM, Marc Roos wrote:
> > >>
> > >>
> > >> Container =  same kernel, problem is with processes using the same
> > >> kernel.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> -Original Message-
> > >> From: Daniele Riccucci [mailto:devs...@posteo.net]
> > >> Sent: 07 March 2019 00:18
> > >> To: ceph-users@lists.ceph.com
> > >> Subject: Re: [ceph-users] mount cephfs on ceph servers
> > >>
> > >> Hello,
> > >> is the deadlock risk still an issue in containerized
> deployments? For
> > >> example with OSD daemons in containers and mounting the
> filesystem on
> > >> the host machine?
> > >> Thank you.
> > >>
> > >> Daniele
> > >>
> > >> On 06/03/19 16:40, Jake Grimmett wrote:
> > >>> Just to add "+1" on this datapoint, based on one month usage
> on Mimic
> > >>> 13.2.4 essentially "it works great for us"
> > >>>
> > >>> Prior to this, we had issues with the kernel driver on 12.2.2.
> This
> > >>> could have been due to limited RAM on the osd nodes (128GB /
> 45 OSD),
> > >>> and an older kernel.
> > >>>
> > >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
> > >>> allowed us to reliably use the kernel driver.
> > >>>
> > >>> We keep 30 snapshots ( one per day), have one active metadata
> server,
> > >>> and change several TB daily - it's much, *much* faster than
> with fuse.
> > >>>
> > >>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2
> coding.
> > >>>
> > >>> ta ta
> > >>>

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread David C
Out of curiosity, are you guys re-exporting the fs to clients over
something like nfs or running applications directly on the OSD nodes?

On Tue, 12 Mar 2019, 18:28 Paul Emmerich,  wrote:

> Mounting kernel CephFS on an OSD node works fine with recent kernels
> (4.14+) and enough RAM in the servers.
>
> We did encounter problems with older kernels though
>
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
> On Tue, Mar 12, 2019 at 10:07 AM Hector Martin 
> wrote:
> >
> > It's worth noting that most containerized deployments can effectively
> > limit RAM for containers (cgroups), and the kernel has limits on how
> > many dirty pages it can keep around.
> >
> > In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20%
> > of your total RAM can be dirty FS pages. If you set up your containers
> > such that the cumulative memory usage is capped below, say, 70% of RAM,
> > then this might effectively guarantee that you will never hit this issue.
> >
> > On 08/03/2019 02:17, Tony Lill wrote:
> > > AFAIR the issue is that under memory pressure, the kernel will ask
> > > cephfs to flush pages, but that this in turn causes the osd (mds?) to
> > > require more memory to complete the flush (for network buffers, etc).
> As
> > > long as cephfs and the OSDs are feeding from the same kernel mempool,
> > > you are susceptible. Containers don't protect you, but a full VM, like
> > > xen or kvm? would.
> > >
> > > So if you don't hit the low memory situation, you will not see the
> > > deadlock, and you can run like this for years without a problem. I
> have.
> > > But you are most likely to run out of memory during recovery, so this
> > > could compound your problems.
> > >
> > > On 3/7/19 3:56 AM, Marc Roos wrote:
> > >>
> > >>
> > >> Container =  same kernel, problem is with processes using the same
> > >> kernel.
> > >>
> > >>
> > >>
> > >>
> > >>
> > >>
> > >> -Original Message-
> > >> From: Daniele Riccucci [mailto:devs...@posteo.net]
> > >> Sent: 07 March 2019 00:18
> > >> To: ceph-users@lists.ceph.com
> > >> Subject: Re: [ceph-users] mount cephfs on ceph servers
> > >>
> > >> Hello,
> > >> is the deadlock risk still an issue in containerized deployments? For
> > >> example with OSD daemons in containers and mounting the filesystem on
> > >> the host machine?
> > >> Thank you.
> > >>
> > >> Daniele
> > >>
> > >> On 06/03/19 16:40, Jake Grimmett wrote:
> > >>> Just to add "+1" on this datapoint, based on one month usage on Mimic
> > >>> 13.2.4 essentially "it works great for us"
> > >>>
> > >>> Prior to this, we had issues with the kernel driver on 12.2.2. This
> > >>> could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
> > >>> and an older kernel.
> > >>>
> > >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
> > >>> allowed us to reliably use the kernel driver.
> > >>>
> > >>> We keep 30 snapshots ( one per day), have one active metadata server,
> > >>> and change several TB daily - it's much, *much* faster than with
> fuse.
> > >>>
> > >>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
> > >>>
> > >>> ta ta
> > >>>
> > >>> Jake
> > >>>
> > >>>
> > >>>
> > >>>
> > >>> On 3/6/19 11:10 AM, Hector Martin wrote:
> > >>>> On 06/03/2019 12:07, Zhenshi Zhou wrote:
> > >>>>> Hi,
> > >>>>>
> > >>>>> I'm gonna mount cephfs from my ceph servers for some reason,
> > >>>>> including monitors, metadata servers and osd servers. I know it's
> > >>>>> not a best practice. But what is the exact potential danger if I
> > >>>>> mount cephfs from its own server?
> > >>>>
> > >>>> As a datapoint, I have been doing this on two machines (single-host
> > >>>> Ceph
> > >>>> clusters) for mon

Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Paul Emmerich
Mounting kernel CephFS on an OSD node works fine with recent kernels
(4.14+) and enough RAM in the servers.

We did encounter problems with older kernels, though.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Tue, Mar 12, 2019 at 10:07 AM Hector Martin  wrote:
>
> It's worth noting that most containerized deployments can effectively
> limit RAM for containers (cgroups), and the kernel has limits on how
> many dirty pages it can keep around.
>
> In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20%
> of your total RAM can be dirty FS pages. If you set up your containers
> such that the cumulative memory usage is capped below, say, 70% of RAM,
> then this might effectively guarantee that you will never hit this issue.
>
> On 08/03/2019 02:17, Tony Lill wrote:
> > AFAIR the issue is that under memory pressure, the kernel will ask
> > cephfs to flush pages, but that this in turn causes the osd (mds?) to
> > require more memory to complete the flush (for network buffers, etc). As
> > long as cephfs and the OSDs are feeding from the same kernel mempool,
> > you are susceptible. Containers don't protect you, but a full VM, like
> > xen or kvm? would.
> >
> > So if you don't hit the low memory situation, you will not see the
> > deadlock, and you can run like this for years without a problem. I have.
> > But you are most likely to run out of memory during recovery, so this
> > could compound your problems.
> >
> > On 3/7/19 3:56 AM, Marc Roos wrote:
> >>
> >>
> >> Container =  same kernel, problem is with processes using the same
> >> kernel.
> >>
> >>
> >>
> >>
> >>
> >>
> >> -Original Message-
> >> From: Daniele Riccucci [mailto:devs...@posteo.net]
> >> Sent: 07 March 2019 00:18
> >> To: ceph-users@lists.ceph.com
> >> Subject: Re: [ceph-users] mount cephfs on ceph servers
> >>
> >> Hello,
> >> is the deadlock risk still an issue in containerized deployments? For
> >> example with OSD daemons in containers and mounting the filesystem on
> >> the host machine?
> >> Thank you.
> >>
> >> Daniele
> >>
> >> On 06/03/19 16:40, Jake Grimmett wrote:
> >>> Just to add "+1" on this datapoint, based on one month usage on Mimic
> >>> 13.2.4 essentially "it works great for us"
> >>>
> >>> Prior to this, we had issues with the kernel driver on 12.2.2. This
> >>> could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
> >>> and an older kernel.
> >>>
> >>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
> >>> allowed us to reliably use the kernel driver.
> >>>
> >>> We keep 30 snapshots ( one per day), have one active metadata server,
> >>> and change several TB daily - it's much, *much* faster than with fuse.
> >>>
> >>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
> >>>
> >>> ta ta
> >>>
> >>> Jake
> >>>
> >>>
> >>>
> >>>
> >>> On 3/6/19 11:10 AM, Hector Martin wrote:
> >>>> On 06/03/2019 12:07, Zhenshi Zhou wrote:
> >>>>> Hi,
> >>>>>
> >>>>> I'm gonna mount cephfs from my ceph servers for some reason,
> >>>>> including monitors, metadata servers and osd servers. I know it's
> >>>>> not a best practice. But what is the exact potential danger if I
> >>>>> mount cephfs from its own server?
> >>>>
> >>>> As a datapoint, I have been doing this on two machines (single-host
> >>>> Ceph
> >>>> clusters) for months with no ill effects. The FUSE client performs a
> >>>> lot worse than the kernel client, so I switched to the latter, and
> >>>> it's been working well with no deadlocks.
> >>>>
> >>> ___
> >>> ceph-users mailing list
> >>> ceph-users@lists.ceph.com
> >>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> >>
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >>
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-12 Thread Hector Martin
It's worth noting that most containerized deployments can effectively 
limit RAM for containers (cgroups), and the kernel has limits on how 
many dirty pages it can keep around.


In particular, /proc/sys/vm/dirty_ratio (default: 20) means at most 20% 
of your total RAM can be dirty FS pages. If you set up your containers 
such that the cumulative memory usage is capped below, say, 70% of RAM, 
then this might effectively guarantee that you will never hit this issue.
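
To put rough numbers on that, here is a minimal Python sketch (illustrative
only, not part of the original discussion; the 70% container cap is just the
example value above, and dirty_ratio is treated as a fraction of MemTotal,
which is a simplification of how the kernel really computes the threshold):

#!/usr/bin/env python3
# Rough headroom estimate for the scenario above: how much RAM could be tied
# up in dirty page cache at once, versus what the Ceph containers may consume.

def read_meminfo_kb(field):
    # /proc/meminfo lines look like "MemTotal:  263886012 kB"
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)

def read_dirty_ratio():
    with open("/proc/sys/vm/dirty_ratio") as f:
        return int(f.read().strip())

total_kb = read_meminfo_kb("MemTotal")
dirty_ratio = read_dirty_ratio()            # default: 20 (percent)
container_cap_kb = int(total_kb * 0.70)     # assumed cgroup cap for all Ceph containers

max_dirty_kb = total_kb * dirty_ratio // 100
headroom_kb = total_kb - container_cap_kb - max_dirty_kb

print(f"MemTotal:           {total_kb / 2**20:6.1f} GiB")
print(f"Max dirty pages:    {max_dirty_kb / 2**20:6.1f} GiB (dirty_ratio={dirty_ratio}%)")
print(f"Container cap:      {container_cap_kb / 2**20:6.1f} GiB (assumed 70%)")
print(f"Headroom for flush: {headroom_kb / 2**20:6.1f} GiB")

If the headroom comes out negative, the container cap alone is not giving you
the guarantee described above.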


On 08/03/2019 02:17, Tony Lill wrote:

AFAIR the issue is that under memory pressure, the kernel will ask
cephfs to flush pages, but that this in turn causes the osd (mds?) to
require more memory to complete the flush (for network buffers, etc). As
long as cephfs and the OSDs are feeding from the same kernel mempool,
you are susceptible. Containers don't protect you, but a full VM, like
xen or kvm? would.

So if you don't hit the low memory situation, you will not see the
deadlock, and you can run like this for years without a problem. I have.
But you are most likely to run out of memory during recovery, so this
could compound your problems.

On 3/7/19 3:56 AM, Marc Roos wrote:
  


Container =  same kernel, problem is with processes using the same
kernel.






-Original Message-
From: Daniele Riccucci [mailto:devs...@posteo.net]
Sent: 07 March 2019 00:18
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] mount cephfs on ceph servers

Hello,
is the deadlock risk still an issue in containerized deployments? For
example with OSD daemons in containers and mounting the filesystem on
the host machine?
Thank you.

Daniele

On 06/03/19 16:40, Jake Grimmett wrote:

Just to add "+1" on this datapoint, based on one month usage on Mimic
13.2.4 essentially "it works great for us"

Prior to this, we had issues with the kernel driver on 12.2.2. This
could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
and an older kernel.

Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
allowed us to reliably use the kernel driver.

We keep 30 snapshots ( one per day), have one active metadata server,
and change several TB daily - it's much, *much* faster than with fuse.

Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.

ta ta

Jake




On 3/6/19 11:10 AM, Hector Martin wrote:

On 06/03/2019 12:07, Zhenshi Zhou wrote:

Hi,

I'm gonna mount cephfs from my ceph servers for some reason,
including monitors, metadata servers and osd servers. I know it's
not a best practice. But what is the exact potential danger if I
mount cephfs from its own server?


As a datapoint, I have been doing this on two machines (single-host
Ceph
clusters) for months with no ill effects. The FUSE client performs a
lot worse than the kernel client, so I switched to the latter, and
it's been working well with no deadlocks.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-07 Thread Tony Lill
AFAIR the issue is that under memory pressure, the kernel will ask
cephfs to flush pages, but this in turn causes the OSD (MDS?) to
require more memory to complete the flush (for network buffers, etc.). As
long as cephfs and the OSDs are feeding from the same kernel mempool,
you are susceptible. Containers don't protect you, but a full VM, like
Xen or KVM, would.

So if you don't hit the low-memory situation, you will not see the
deadlock, and you can run like this for years without a problem. I have.
But you are most likely to run out of memory during recovery, so this
could compound your problems.
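
For anyone who wants to keep an eye on that condition, here is a minimal
watcher sketch in Python (illustrative only, not from this thread; the
warning thresholds are arbitrary assumptions you would tune per host):

#!/usr/bin/env python3
# Warn when dirty pages are still piling up while available memory is running
# out, i.e. the situation where the flush-needs-memory loop can bite.
import time

def meminfo_kb():
    values = {}
    with open("/proc/meminfo") as f:
        for line in f:
            key, rest = line.split(":", 1)
            values[key] = int(rest.split()[0])   # all values here are in kB
    return values

LOW_AVAILABLE_FRACTION = 0.10    # assumption: warn below 10% MemAvailable
DIRTY_WARN_KB = 256 * 1024       # assumption: warn if more than 256 MiB is dirty

while True:
    m = meminfo_kb()
    if (m["MemAvailable"] < m["MemTotal"] * LOW_AVAILABLE_FRACTION
            and m["Dirty"] > DIRTY_WARN_KB):
        print(f"WARNING: {m['Dirty'] // 1024} MiB dirty with only "
              f"{m['MemAvailable'] // 1024} MiB available; writeback may "
              f"stall if the OSDs on this host need memory to flush it")
    time.sleep(10)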

On 3/7/19 3:56 AM, Marc Roos wrote:
>  
> 
> Container =  same kernel, problem is with processes using the same 
> kernel. 
> 
> 
> 
> 
> 
> 
> -Original Message-
> From: Daniele Riccucci [mailto:devs...@posteo.net] 
> Sent: 07 March 2019 00:18
> To: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] mount cephfs on ceph servers
> 
> Hello,
> is the deadlock risk still an issue in containerized deployments? For 
> example with OSD daemons in containers and mounting the filesystem on 
> the host machine?
> Thank you.
> 
> Daniele
> 
> On 06/03/19 16:40, Jake Grimmett wrote:
>> Just to add "+1" on this datapoint, based on one month usage on Mimic
>> 13.2.4 essentially "it works great for us"
>>
>> Prior to this, we had issues with the kernel driver on 12.2.2. This 
>> could have been due to limited RAM on the osd nodes (128GB / 45 OSD), 
>> and an older kernel.
>>
>> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has 
>> allowed us to reliably use the kernel driver.
>>
>> We keep 30 snapshots ( one per day), have one active metadata server, 
>> and change several TB daily - it's much, *much* faster than with fuse.
>>
>> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
>>
>> ta ta
>>
>> Jake
>>
>>
>>
>>
>> On 3/6/19 11:10 AM, Hector Martin wrote:
>>> On 06/03/2019 12:07, Zhenshi Zhou wrote:
>>>> Hi,
>>>>
>>>> I'm gonna mount cephfs from my ceph servers for some reason, 
>>>> including monitors, metadata servers and osd servers. I know it's 
>>>> not a best practice. But what is the exact potential danger if I 
>>>> mount cephfs from its own server?
>>>
>>> As a datapoint, I have been doing this on two machines (single-host 
>>> Ceph
>>> clusters) for months with no ill effects. The FUSE client performs a 
>>> lot worse than the kernel client, so I switched to the latter, and 
>>> it's been working well with no deadlocks.
>>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 

-- 
Tony Lill, OCT,   ajl...@ajlc.waterloo.on.ca
President, A. J. Lill Consultants (519) 650 0660
539 Grand Valley Dr., Cambridge, Ont. N3H 2S2 (519) 241 2461
--- http://www.ajlc.waterloo.on.ca/ 





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-07 Thread Marc Roos
 

A container uses the same kernel; the problem is with processes sharing
that same kernel.
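
A quick way to see that point (illustrative only, not from the thread): the
kernel release reported inside any container is the host's, whereas a full VM
reports its own.

import os
# Run this on the host and inside an OSD container: both print the same
# release string, because containers share the host kernel (a VM does not).
print(os.uname().release)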






-Original Message-
From: Daniele Riccucci [mailto:devs...@posteo.net] 
Sent: 07 March 2019 00:18
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] mount cephfs on ceph servers

Hello,
is the deadlock risk still an issue in containerized deployments? For 
example with OSD daemons in containers and mounting the filesystem on 
the host machine?
Thank you.

Daniele

On 06/03/19 16:40, Jake Grimmett wrote:
> Just to add "+1" on this datapoint, based on one month usage on Mimic
> 13.2.4 essentially "it works great for us"
> 
> Prior to this, we had issues with the kernel driver on 12.2.2. This 
> could have been due to limited RAM on the osd nodes (128GB / 45 OSD), 
> and an older kernel.
> 
> Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has 
> allowed us to reliably use the kernel driver.
> 
> We keep 30 snapshots ( one per day), have one active metadata server, 
> and change several TB daily - it's much, *much* faster than with fuse.
> 
> Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.
> 
> ta ta
> 
> Jake
> 
> 
> 
> 
> On 3/6/19 11:10 AM, Hector Martin wrote:
>> On 06/03/2019 12:07, Zhenshi Zhou wrote:
>>> Hi,
>>>
>>> I'm gonna mount cephfs from my ceph servers for some reason, 
>>> including monitors, metadata servers and osd servers. I know it's 
>>> not a best practice. But what is the exact potential danger if I 
>>> mount cephfs from its own server?
>>
>> As a datapoint, I have been doing this on two machines (single-host 
>> Ceph
>> clusters) for months with no ill effects. The FUSE client performs a 
>> lot worse than the kernel client, so I switched to the latter, and 
>> it's been working well with no deadlocks.
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread Daniele Riccucci

Hello,
is the deadlock risk still an issue in containerized deployments? For 
example with OSD daemons in containers and mounting the filesystem on 
the host machine?

Thank you.

Daniele

On 06/03/19 16:40, Jake Grimmett wrote:

Just to add "+1" on this datapoint, based on one month usage on Mimic
13.2.4 essentially "it works great for us"

Prior to this, we had issues with the kernel driver on 12.2.2. This
could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
and an older kernel.

Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
allowed us to reliably use the kernel driver.

We keep 30 snapshots ( one per day), have one active metadata server,
and change several TB daily - it's much, *much* faster than with fuse.

Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.

ta ta

Jake




On 3/6/19 11:10 AM, Hector Martin wrote:

On 06/03/2019 12:07, Zhenshi Zhou wrote:

Hi,

I'm gonna mount cephfs from my ceph servers for some reason,
including monitors, metadata servers and osd servers. I know it's
not a best practice. But what is the exact potential danger if I mount
cephfs from its own server?


As a datapoint, I have been doing this on two machines (single-host Ceph
clusters) for months with no ill effects. The FUSE client performs a lot
worse than the kernel client, so I switched to the latter, and it's been
working well with no deadlocks.


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread Jake Grimmett
Just to add "+1" on this datapoint, based on one month usage on Mimic
13.2.4 essentially "it works great for us"

Prior to this, we had issues with the kernel driver on 12.2.2. This
could have been due to limited RAM on the osd nodes (128GB / 45 OSD),
and an older kernel.

Upgrading the RAM to 256GB and using a RHEL 7.6 derived kernel has
allowed us to reliably use the kernel driver.

We keep 30 snapshots ( one per day), have one active metadata server,
and change several TB daily - it's much, *much* faster than with fuse.

Cluster has 10 OSD nodes, currently storing 2PB, using ec 8:2 coding.

ta ta

Jake




On 3/6/19 11:10 AM, Hector Martin wrote:
> On 06/03/2019 12:07, Zhenshi Zhou wrote:
>> Hi,
>>
>> I'm gonna mount cephfs from my ceph servers for some reason,
>> including monitors, metadata servers and osd servers. I know it's
>> not a best practice. But what is the exact potential danger if I mount
>> cephfs from its own server?
> 
> As a datapoint, I have been doing this on two machines (single-host Ceph
> clusters) for months with no ill effects. The FUSE client performs a lot
> worse than the kernel client, so I switched to the latter, and it's been
> working well with no deadlocks.
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread Hector Martin

On 06/03/2019 12:07, Zhenshi Zhou wrote:

Hi,

I'm gonna mount cephfs from my ceph servers for some reason,
including monitors, metadata servers and osd servers. I know it's
not a best practice. But what is the exact potential danger if I mount
cephfs from its own server?


As a datapoint, I have been doing this on two machines (single-host Ceph 
clusters) for months with no ill effects. The FUSE client performs a lot 
worse than the kernel client, so I switched to the latter, and it's been 
working well with no deadlocks.


--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] mount cephfs on ceph servers

2019-03-06 Thread David C
The general advice has been not to use the kernel client on an OSD node, as
you may see a deadlock under certain conditions. Using the FUSE client
should be fine, or use the kernel client inside a VM.

On Wed, 6 Mar 2019, 03:07 Zhenshi Zhou,  wrote:

> Hi,
>
> I'm gonna mount cephfs from my ceph servers for some reason,
> including monitors, metadata servers and osd servers. I know it's
> not a best practice. But what is the exact potential danger if I mount
> cephfs from its own server?
>
> Thanks
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] mount cephfs on ceph servers

2019-03-05 Thread Zhenshi Zhou
Hi,

I'm going to mount CephFS from my Ceph servers (including the monitors,
metadata servers and OSD servers) for various reasons. I know it's
not a best practice, but what is the exact potential danger if I mount
CephFS from its own servers?

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com