On Sat,  8 Jun 2013 00:23:18 +0400 Artem Savkov <artem.sav...@gmail.com> wrote:

> There is no guarantee that page in end_swap_bio_read is in swapcache so we 
> need
> to check it before calling page_swap_info(). Otherwise kernel hits a bug on
> like the one below.
> Introduced in "mm: remove compressed copy from zram in-memory"
> 
> kernel BUG at mm/swapfile.c:2361!

Fair enough.

> --- a/mm/page_io.c
> +++ b/mm/page_io.c
>
> ...
>
> +             /*
> +              * There is no guarantee that the page is in swap cache, so
> +              * we need to check PG_swapcache before proceeding with this
> +              * optimization.
> +              */

My initial thought was "how the heck can this not be a swapcache page".
So let's add the important details to this comment.

> +             if (unlikely(PageSwapCache(page))) {

Surely this is "likely".

> +                     struct swap_info_struct *sis;
> +
> +                     sis = page_swap_info(page);
> +                     if (sis->flags & SWP_BLKDEV) {
> +                             /*
> +                              * The swap subsystem performs lazy swap slot 
> freeing,
> +                              * expecting that the page will be swapped out 
> again.
> +                              * So we can avoid an unnecessary write if the 
> page
> +                              * isn't redirtied.
> +                              * This is good for real swap storage because 
> we can
> +                              * reduce unnecessary I/O and enhance 
> wear-leveling
> +                              * if an SSD is used as the as swap device.
> +                              * But if in-memory swap device (eg zram) is 
> used,
> +                              * this causes a duplicated copy between 
> uncompressed
> +                              * data in VM-owned memory and compressed data 
> in
> +                              * zram-owned memory.  So let's free zram-owned 
> memory
> +                              * and make the VM-owned decompressed page 
> *dirty*,
> +                              * so the page should be swapped out somewhere 
> again if
> +                              * we again wish to reclaim it.
> +                              */

This comment now makes a horrid mess in an 80-col display.  I think we
may as well remove a tabstop here.

> +                             struct gendisk *disk = sis->bdev->bd_disk;
> +                             if (disk->fops->swap_slot_free_notify) {
> +                                     swp_entry_t entry;
> +                                     unsigned long offset;
> +
> +                                     entry.val = page_private(page);
> +                                     offset = swp_offset(entry);
> +
> +                                     SetPageDirty(page);
> +                                     
> disk->fops->swap_slot_free_notify(sis->bdev,
> +                                                     offset);
> +                             }
>                       }
>               }
>       }

Result:

--- a/mm/page_io.c~mm-remove-compressed-copy-from-zram-in-memory-fix-2-fix
+++ a/mm/page_io.c
@@ -81,51 +81,54 @@ void end_swap_bio_read(struct bio *bio,
                                imajor(bio->bi_bdev->bd_inode),
                                iminor(bio->bi_bdev->bd_inode),
                                (unsigned long long)bio->bi_sector);
-       } else {
-               SetPageUptodate(page);
+               goto out;
+       }
+
+       SetPageUptodate(page);
+
+       /*
+        * There is no guarantee that the page is in swap cache - the software
+        * suspend code (at least) uses end_swap_bio_read() against a non-
+        * swapcache page.  So we must check PG_swapcache before proceeding with
+        * this optimization.
+        */
+       if (likely(PageSwapCache(page))) {
+               struct swap_info_struct *sis;
+
+               sis = page_swap_info(page);
+               if (sis->flags & SWP_BLKDEV) {
+                       /*
+                        * The swap subsystem performs lazy swap slot freeing,
+                        * expecting that the page will be swapped out again.
+                        * So we can avoid an unnecessary write if the page
+                        * isn't redirtied.
+                        * This is good for real swap storage because we can
+                        * reduce unnecessary I/O and enhance wear-leveling
+                        * if an SSD is used as the as swap device.
+                        * But if in-memory swap device (eg zram) is used,
+                        * this causes a duplicated copy between uncompressed
+                        * data in VM-owned memory and compressed data in
+                        * zram-owned memory.  So let's free zram-owned memory
+                        * and make the VM-owned decompressed page *dirty*,
+                        * so the page should be swapped out somewhere again if
+                        * we again wish to reclaim it.
+                        */
+                       struct gendisk *disk = sis->bdev->bd_disk;
+                       if (disk->fops->swap_slot_free_notify) {
+                               swp_entry_t entry;
+                               unsigned long offset;
+
+                               entry.val = page_private(page);
+                               offset = swp_offset(entry);
 
-               /*
-                * There is no guarantee that the page is in swap cache, so
-                * we need to check PG_swapcache before proceeding with this
-                * optimization.
-                */
-               if (unlikely(PageSwapCache(page))) {
-                       struct swap_info_struct *sis;
-
-                       sis = page_swap_info(page);
-                       if (sis->flags & SWP_BLKDEV) {
-                               /*
-                                * The swap subsystem performs lazy swap slot 
freeing,
-                                * expecting that the page will be swapped out 
again.
-                                * So we can avoid an unnecessary write if the 
page
-                                * isn't redirtied.
-                                * This is good for real swap storage because 
we can
-                                * reduce unnecessary I/O and enhance 
wear-leveling
-                                * if an SSD is used as the as swap device.
-                                * But if in-memory swap device (eg zram) is 
used,
-                                * this causes a duplicated copy between 
uncompressed
-                                * data in VM-owned memory and compressed data 
in
-                                * zram-owned memory.  So let's free zram-owned 
memory
-                                * and make the VM-owned decompressed page 
*dirty*,
-                                * so the page should be swapped out somewhere 
again if
-                                * we again wish to reclaim it.
-                                */
-                               struct gendisk *disk = sis->bdev->bd_disk;
-                               if (disk->fops->swap_slot_free_notify) {
-                                       swp_entry_t entry;
-                                       unsigned long offset;
-
-                                       entry.val = page_private(page);
-                                       offset = swp_offset(entry);
-
-                                       SetPageDirty(page);
-                                       
disk->fops->swap_slot_free_notify(sis->bdev,
-                                                       offset);
-                               }
+                               SetPageDirty(page);
+                               disk->fops->swap_slot_free_notify(sis->bdev,
+                                               offset);
                        }
                }
        }
 
+out:
        unlock_page(page);
        bio_put(bio);
 }
_

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to