Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Sat, 26 May 2018, Vladimir Davydov wrote: > > The reference counting is only implemented for root kmem_caches for > > simplicity. The reference of a root kmem_cache is elevated on sharing or > > while its memcg kmem_cache creation or deactivation request is in the > > fly and thus it is made sure that the root kmem_cache is not destroyed > > in the middle. As the reference of kmem_cache is elevated on sharing, > > the 'shared_count' does not need any locking protection as at worst it > > can be out-dated for a small window which is tolerable. > > I wonder if we could fix this problem without introducing reference > counting for kmem caches (which seems a bit of an overkill to me TBO), > e.g. by flushing memcg_kmem_cache_wq before root cache destruction? Would prefer that too but the whole memcg handling is something of a mystery to me.
Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Sat, 26 May 2018, Vladimir Davydov wrote: > > The reference counting is only implemented for root kmem_caches for > > simplicity. The reference of a root kmem_cache is elevated on sharing or > > while its memcg kmem_cache creation or deactivation request is in the > > fly and thus it is made sure that the root kmem_cache is not destroyed > > in the middle. As the reference of kmem_cache is elevated on sharing, > > the 'shared_count' does not need any locking protection as at worst it > > can be out-dated for a small window which is tolerable. > > I wonder if we could fix this problem without introducing reference > counting for kmem caches (which seems a bit of an overkill to me TBO), > e.g. by flushing memcg_kmem_cache_wq before root cache destruction? Would prefer that too but the whole memcg handling is something of a mystery to me.
Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Sat, May 26, 2018 at 11:58 AM, Vladimir Davydovwrote: > On Tue, May 22, 2018 at 01:13:36PM -0700, Shakeel Butt wrote: >> The memcg kmem cache creation and deactivation (SLUB only) is >> asynchronous. If a root kmem cache is destroyed whose memcg cache is in >> the process of creation or deactivation, the kernel may crash. >> >> Example of one such crash: >> general protection fault: [#1] SMP PTI >> CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp >> ... >> Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn >> RIP: 0010:has_cpu_slab >> ... >> Call Trace: >> ? on_each_cpu_cond >> __kmem_cache_shrink >> kmemcg_cache_deact_after_rcu >> kmemcg_deactivate_workfn >> process_one_work >> worker_thread >> kthread >> ret_from_fork+0x35/0x40 >> >> This issue is due to the lack of real reference counting for the root >> kmem_caches. Currently kmem_cache does have a field named refcount which >> has been used for multiple purposes i.e. shared count, reference count >> and noshare flag. Due to its conflated nature, it can not be used for >> reference counting by other subsystems. >> >> This patch decoupled the reference counting from shared count and >> noshare flag. The new field 'shared_count' represents the shared count >> and noshare flag while 'refcount' is converted into a real reference >> counter. >> >> The reference counting is only implemented for root kmem_caches for >> simplicity. The reference of a root kmem_cache is elevated on sharing or >> while its memcg kmem_cache creation or deactivation request is in the >> fly and thus it is made sure that the root kmem_cache is not destroyed >> in the middle. As the reference of kmem_cache is elevated on sharing, >> the 'shared_count' does not need any locking protection as at worst it >> can be out-dated for a small window which is tolerable. > > I wonder if we could fix this problem without introducing reference > counting for kmem caches (which seems a bit of an overkill to me TBO), > e.g. by flushing memcg_kmem_cache_wq before root cache destruction? Thanks I will look into workqueue flushing.
Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Sat, May 26, 2018 at 11:58 AM, Vladimir Davydov wrote: > On Tue, May 22, 2018 at 01:13:36PM -0700, Shakeel Butt wrote: >> The memcg kmem cache creation and deactivation (SLUB only) is >> asynchronous. If a root kmem cache is destroyed whose memcg cache is in >> the process of creation or deactivation, the kernel may crash. >> >> Example of one such crash: >> general protection fault: [#1] SMP PTI >> CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp >> ... >> Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn >> RIP: 0010:has_cpu_slab >> ... >> Call Trace: >> ? on_each_cpu_cond >> __kmem_cache_shrink >> kmemcg_cache_deact_after_rcu >> kmemcg_deactivate_workfn >> process_one_work >> worker_thread >> kthread >> ret_from_fork+0x35/0x40 >> >> This issue is due to the lack of real reference counting for the root >> kmem_caches. Currently kmem_cache does have a field named refcount which >> has been used for multiple purposes i.e. shared count, reference count >> and noshare flag. Due to its conflated nature, it can not be used for >> reference counting by other subsystems. >> >> This patch decoupled the reference counting from shared count and >> noshare flag. The new field 'shared_count' represents the shared count >> and noshare flag while 'refcount' is converted into a real reference >> counter. >> >> The reference counting is only implemented for root kmem_caches for >> simplicity. The reference of a root kmem_cache is elevated on sharing or >> while its memcg kmem_cache creation or deactivation request is in the >> fly and thus it is made sure that the root kmem_cache is not destroyed >> in the middle. As the reference of kmem_cache is elevated on sharing, >> the 'shared_count' does not need any locking protection as at worst it >> can be out-dated for a small window which is tolerable. > > I wonder if we could fix this problem without introducing reference > counting for kmem caches (which seems a bit of an overkill to me TBO), > e.g. by flushing memcg_kmem_cache_wq before root cache destruction? Thanks I will look into workqueue flushing.
Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Tue, May 22, 2018 at 01:13:36PM -0700, Shakeel Butt wrote: > The memcg kmem cache creation and deactivation (SLUB only) is > asynchronous. If a root kmem cache is destroyed whose memcg cache is in > the process of creation or deactivation, the kernel may crash. > > Example of one such crash: > general protection fault: [#1] SMP PTI > CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp > ... > Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn > RIP: 0010:has_cpu_slab > ... > Call Trace: > ? on_each_cpu_cond > __kmem_cache_shrink > kmemcg_cache_deact_after_rcu > kmemcg_deactivate_workfn > process_one_work > worker_thread > kthread > ret_from_fork+0x35/0x40 > > This issue is due to the lack of real reference counting for the root > kmem_caches. Currently kmem_cache does have a field named refcount which > has been used for multiple purposes i.e. shared count, reference count > and noshare flag. Due to its conflated nature, it can not be used for > reference counting by other subsystems. > > This patch decoupled the reference counting from shared count and > noshare flag. The new field 'shared_count' represents the shared count > and noshare flag while 'refcount' is converted into a real reference > counter. > > The reference counting is only implemented for root kmem_caches for > simplicity. The reference of a root kmem_cache is elevated on sharing or > while its memcg kmem_cache creation or deactivation request is in the > fly and thus it is made sure that the root kmem_cache is not destroyed > in the middle. As the reference of kmem_cache is elevated on sharing, > the 'shared_count' does not need any locking protection as at worst it > can be out-dated for a small window which is tolerable. I wonder if we could fix this problem without introducing reference counting for kmem caches (which seems a bit of an overkill to me TBO), e.g. by flushing memcg_kmem_cache_wq before root cache destruction?
Re: [PATCH v2] mm: fix race between kmem_cache destroy, create and deactivate
On Tue, May 22, 2018 at 01:13:36PM -0700, Shakeel Butt wrote: > The memcg kmem cache creation and deactivation (SLUB only) is > asynchronous. If a root kmem cache is destroyed whose memcg cache is in > the process of creation or deactivation, the kernel may crash. > > Example of one such crash: > general protection fault: [#1] SMP PTI > CPU: 1 PID: 1721 Comm: kworker/14:1 Not tainted 4.17.0-smp > ... > Workqueue: memcg_kmem_cache kmemcg_deactivate_workfn > RIP: 0010:has_cpu_slab > ... > Call Trace: > ? on_each_cpu_cond > __kmem_cache_shrink > kmemcg_cache_deact_after_rcu > kmemcg_deactivate_workfn > process_one_work > worker_thread > kthread > ret_from_fork+0x35/0x40 > > This issue is due to the lack of real reference counting for the root > kmem_caches. Currently kmem_cache does have a field named refcount which > has been used for multiple purposes i.e. shared count, reference count > and noshare flag. Due to its conflated nature, it can not be used for > reference counting by other subsystems. > > This patch decoupled the reference counting from shared count and > noshare flag. The new field 'shared_count' represents the shared count > and noshare flag while 'refcount' is converted into a real reference > counter. > > The reference counting is only implemented for root kmem_caches for > simplicity. The reference of a root kmem_cache is elevated on sharing or > while its memcg kmem_cache creation or deactivation request is in the > fly and thus it is made sure that the root kmem_cache is not destroyed > in the middle. As the reference of kmem_cache is elevated on sharing, > the 'shared_count' does not need any locking protection as at worst it > can be out-dated for a small window which is tolerable. I wonder if we could fix this problem without introducing reference counting for kmem caches (which seems a bit of an overkill to me TBO), e.g. by flushing memcg_kmem_cache_wq before root cache destruction?