On 4/16/18 5:58 PM, Guenter Roeck wrote:

On Mon, Apr 16, 2018 at 02:43:01PM +0200, Vitaly Wool wrote:
Hey Guenter,

On 04/13/2018 07:56 PM, Guenter Roeck wrote:

On Fri, Apr 13, 2018 at 05:40:18PM +0000, Vitaly Wool wrote:
On Fri, Apr 13, 2018, 7:35 PM Guenter Roeck <li...@roeck-us.net> wrote:

On Fri, Apr 13, 2018 at 05:21:02AM +0000, Vitaly Wool wrote:
Hi Guenter,


On Fri, Apr 13, 2018 at 00:01, Guenter Roeck <li...@roeck-us.net> wrote:

Hi all,
we are observing crashes with z3fold under memory pressure. The kernel version
used to reproduce the problem is v4.16-11827-g5d1365940a68, but the problem was
also seen with v4.14-based kernels.

Just before I dig into this, could you please try reproducing the errors
you see with https://patchwork.kernel.org/patch/10210459/ applied?

As mentioned above, I tested with v4.16-11827-g5d1365940a68, which already
includes this patch.

Bah. Sorry. Expect an update after the weekend.

NP; easy to miss. Thanks a lot for looking into it.

I wonder if the following patch would make a difference:

diff --git a/mm/z3fold.c b/mm/z3fold.c
index c0bca6153b95..5e547c2d5832 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -887,19 +887,21 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
                                goto next;
                }
  next:
-               spin_lock(&pool->lock);
                if (test_bit(PAGE_HEADLESS, &page->private)) {
                        if (ret == 0) {
-                               spin_unlock(&pool->lock);
                                free_z3fold_page(page);
                                return 0;
                        }
-               } else if (kref_put(&zhdr->refcount, release_z3fold_page)) {
-                       atomic64_dec(&pool->pages_nr);
-                       spin_unlock(&pool->lock);
-                       return 0;
+               } else {
+                       spin_lock(&zhdr->page_lock);
+                       if (kref_put(&zhdr->refcount, release_z3fold_page_locked)) {
+                               atomic64_dec(&pool->pages_nr);
+                               return 0;
+                       }
+                       spin_unlock(&zhdr->page_lock);
                }
+               spin_lock(&pool->lock);
                /*
                 * Add to the beginning of LRU.
                 * Pool lock has to be kept here to ensure the page has

No, it doesn't. Same crash.

BUG: MAX_LOCK_DEPTH too low!
turning off the locking correctness validator.
depth: 48  max: 48!
48 locks held by kswapd0/51:
  #0: 000000004d7a35a9 (&(&pool->lock)->rlock#3){+.+.}, at: z3fold_zpool_shrink+0x47/0x3e0
  #1: 000000007739f49e (&(&zhdr->page_lock)->rlock){+.+.}, at: z3fold_zpool_shrink+0xb7/0x3e0
  #2: 00000000ff6cd4c8 (&(&zhdr->page_lock)->rlock){+.+.}, at: z3fold_zpool_shrink+0xb7/0x3e0
  #3: 000000004cffc6cb (&(&zhdr->page_lock)->rlock){+.+.}, at: z3fold_zpool_shrink+0xb7/0x3e0
...
CPU: 0 PID: 51 Comm: kswapd0 Not tainted 4.17.0-rc1-yocto-standard+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
Call Trace:
  dump_stack+0x67/0x9b
  __lock_acquire+0x429/0x18f0
  ? __lock_acquire+0x2af/0x18f0
  ? __lock_acquire+0x2af/0x18f0
  ? lock_acquire+0x93/0x230
  lock_acquire+0x93/0x230
  ? z3fold_zpool_shrink+0xb7/0x3e0
  _raw_spin_trylock+0x65/0x80
  ? z3fold_zpool_shrink+0xb7/0x3e0
  ? z3fold_zpool_shrink+0x47/0x3e0
  z3fold_zpool_shrink+0xb7/0x3e0
  zswap_frontswap_store+0x180/0x7c0
...
BUG: sleeping function called from invalid context at mm/page_alloc.c:4320
in_atomic(): 1, irqs_disabled(): 0, pid: 51, name: kswapd0
INFO: lockdep is turned off.
Preemption disabled at:
[<0000000000000000>]           (null)
CPU: 0 PID: 51 Comm: kswapd0 Not tainted 4.17.0-rc1-yocto-standard+ #11
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.10.2-1 04/01/2014
Call Trace:
  dump_stack+0x67/0x9b
  ___might_sleep+0x16c/0x250
  __alloc_pages_nodemask+0x1e7/0x1490
  ? lock_acquire+0x93/0x230
  ? lock_acquire+0x93/0x230
  __read_swap_cache_async+0x14d/0x260
  zswap_writeback_entry+0xdb/0x340
  z3fold_zpool_shrink+0x2b1/0x3e0
  zswap_frontswap_store+0x180/0x7c0
  ? page_vma_mapped_walk+0x22/0x230
  __frontswap_store+0x6e/0xf0
  swap_writepage+0x49/0x70
...

This is with your patch applied on top of v4.17-rc1.

Guenter

Ugh. Could you please keep that patch and apply this on top:

diff --git a/mm/z3fold.c b/mm/z3fold.c
index c0bca6153b95..e8a80d044d9e 100644
--- a/mm/z3fold.c
+++ b/mm/z3fold.c
@@ -840,6 +840,7 @@ static int z3fold_reclaim_page(struct z3fold_pool *pool, unsigned int retries)
                        kref_get(&zhdr->refcount);
                        list_del_init(&zhdr->buddy);
                        zhdr->cpu = -1;
+                       break;
                }
                list_del_init(&page->lru);

Thanks,
   Vitaly
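
For readers following the thread: the lockdep report above lists dozens of
zhdr->page_lock instances held at once, all taken from the same trylock call
site in z3fold_zpool_shrink, and the follow-up patch adds a break once a page
has been picked. Below is a minimal userspace sketch of that pattern, assuming
the missing break is what lets the scan keep locking further pages. It is not
kernel code: a C11 atomic_flag stands in for the spinlock, and struct
fake_page, scan_and_count_held() and demo() are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical stand-in for a page header carrying its own lock. */
struct fake_page {
    atomic_flag lock;      /* set = locked, clear = unlocked */
    int locked_by_scan;    /* remembers which locks the scan took */
};

/*
 * Walk the pages, trylock each one, and return how many locks are still
 * held when the scan ends. With stop_at_first == 0 the loop keeps going
 * after a successful trylock (the assumed buggy shape); with
 * stop_at_first != 0 it breaks out, as the added "break" would.
 */
static int scan_and_count_held(struct fake_page *pages, size_t n,
                               int stop_at_first)
{
    int held = 0;

    assert(pages != NULL);
    for (size_t i = 0; i < n; i++) {
        /* test_and_set returns the old value: true means already locked. */
        if (atomic_flag_test_and_set(&pages[i].lock))
            continue;
        pages[i].locked_by_scan = 1;
        held++;
        if (stop_at_first)
            break;
    }
    return held;
}

/* Set up NPAGES unlocked pages, run one scan, then release everything. */
static int demo(int stop_at_first)
{
    struct fake_page pages[NPAGES];

    for (size_t i = 0; i < NPAGES; i++) {
        atomic_flag_clear(&pages[i].lock);
        pages[i].locked_by_scan = 0;
    }

    int held = scan_and_count_held(pages, NPAGES, stop_at_first);

    for (size_t i = 0; i < NPAGES; i++)
        if (pages[i].locked_by_scan)
            atomic_flag_clear(&pages[i].lock);
    return held;
}
```

In this sketch demo(0) ends the scan holding all NPAGES locks, while demo(1)
ends it holding exactly one.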
