Re: [PATCH 2/4] mm: introduce memalloc_noreclaim_{save,restore}

2017-04-07 Thread Hillf Danton

On April 05, 2017 3:47 PM Vlastimil Babka wrote: 
> 
> The previous patch has shown that simply setting and clearing PF_MEMALLOC in
> current->flags can result in wrongly clearing a pre-existing PF_MEMALLOC flag
> and potentially lead to recursive reclaim. Let's introduce helpers that support
> proper nesting by saving the previous state of the flag, similar to the existing
> memalloc_noio_* and memalloc_nofs_* helpers. Convert existing setting/clearing
> of PF_MEMALLOC within mm to the new helpers.
> 
> There are no known issues with the converted code, but the change makes it more
> robust.
> 
> Suggested-by: Michal Hocko <mho...@suse.com>
> Signed-off-by: Vlastimil Babka <vba...@suse.cz>
> ---

Acked-by: Hillf Danton <hillf...@alibaba-inc.com>
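
For readers following along without the patch body: below is a minimal user-space sketch of what nesting-safe save/restore helpers of this kind might look like, modelled on the save/restore expression quoted in patch 1/4 of this series. It is not the kernel code itself. The struct task, the fake_task variable, the current macro and the PF_MEMALLOC value are stand-ins assumed for illustration, and nested_sections() is a hypothetical demo function.

```c
#include <assert.h>

/* Illustrative stand-ins for kernel definitions; not the real headers. */
#define PF_MEMALLOC 0x00000800u

struct task { unsigned int flags; };
static struct task fake_task;
#define current (&fake_task)

/* Set PF_MEMALLOC and return its previous state (0 or PF_MEMALLOC). */
static inline unsigned int memalloc_noreclaim_save(void)
{
	unsigned int flags = current->flags & PF_MEMALLOC;

	current->flags |= PF_MEMALLOC;
	return flags;
}

/* Put PF_MEMALLOC back to the state captured by the matching save. */
static inline void memalloc_noreclaim_restore(unsigned int flags)
{
	current->flags = (current->flags & ~PF_MEMALLOC) | flags;
}

/*
 * Hypothetical demo of nested sections. Assumes PF_MEMALLOC is clear
 * on entry; the inner restore must not clear the outer section's flag.
 */
void nested_sections(void)
{
	unsigned int outer = memalloc_noreclaim_save();
	unsigned int inner = memalloc_noreclaim_save();

	memalloc_noreclaim_restore(inner);
	assert(current->flags & PF_MEMALLOC);		/* outer still active */

	memalloc_noreclaim_restore(outer);
	assert(!(current->flags & PF_MEMALLOC));	/* initial state again */
}
```

Because each save returns only the previous PF_MEMALLOC bit, a restore can never disturb other bits in the flags word, and an inner section leaves the outer section's setting intact.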



Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC

2017-04-07 Thread Hillf Danton
On April 05, 2017 3:47 PM Vlastimil Babka wrote: 
> 
> The function __alloc_pages_direct_compact() sets PF_MEMALLOC to prevent
> deadlock during page migration by lock_page() (see the comment in
> __unmap_and_move()). Then it unconditionally clears the flag, which can clear a
> pre-existing PF_MEMALLOC flag and result in recursive reclaim. This was not a
> problem until commit a8161d1ed609 ("mm, page_alloc: restructure direct
> compaction handling in slowpath"), because direct compaction was called only
> after direct reclaim, which was skipped when the PF_MEMALLOC flag was set.
> 
> Even now it's only a theoretical issue, as the new callsite of
> __alloc_pages_direct_compact() is reached only for costly orders and when
> gfp_pfmemalloc_allowed() is true, which means either __GFP_NOMEMALLOC is in
> gfp_flags or in_interrupt() is true. There is no such known context, but let's
> play it safe and make __alloc_pages_direct_compact() robust for cases where
> PF_MEMALLOC is already set.
> 
> Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling in slowpath")
> Reported-by: Andrey Ryabinin <aryabi...@virtuozzo.com>
> Signed-off-by: Vlastimil Babka <vba...@suse.cz>
> Cc: <sta...@vger.kernel.org>
> ---
Acked-by: Hillf Danton <hillf...@alibaba-inc.com>

>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3589f8be53be..b84e6ffbe756 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3288,6 +3288,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>  		enum compact_priority prio, enum compact_result *compact_result)
>  {
>  	struct page *page;
> +	unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;
> 
>  	if (!order)
>  		return NULL;
> @@ -3295,7 +3296,7 @@ __alloc_pages_direct_compact(gfp_t gfp_mask, unsigned int order,
>  	current->flags |= PF_MEMALLOC;
>  	*compact_result = try_to_compact_pages(gfp_mask, order, alloc_flags, ac,
>  						prio);
> -	current->flags &= ~PF_MEMALLOC;
> +	current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
> 
>  	if (*compact_result <= COMPACT_INACTIVE)
>  		return NULL;
> --
> 2.12.2
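
To make the failure mode concrete, here is a small user-space sketch contrasting the pre-patch pattern with the patched one from the diff above. It only models the flag handling. The struct task, fake_task, the current macro, the PF_MEMALLOC value and fake_compact() are hypothetical stand-ins, not kernel definitions, and the function names are invented for the illustration.

```c
#include <assert.h>

/* Illustrative stand-ins for kernel definitions; not the real headers. */
#define PF_MEMALLOC 0x00000800u

struct task { unsigned int flags; };
static struct task fake_task;
#define current (&fake_task)

/* Hypothetical placeholder for try_to_compact_pages(); does nothing. */
static void fake_compact(void) { }

/* Pre-patch pattern: clears PF_MEMALLOC even if the caller had set it. */
void compact_unconditional_clear(void)
{
	current->flags |= PF_MEMALLOC;
	fake_compact();
	current->flags &= ~PF_MEMALLOC;
}

/* Patched pattern: remember the old bit and put it back afterwards. */
void compact_save_restore(void)
{
	unsigned int noreclaim_flag = current->flags & PF_MEMALLOC;

	current->flags |= PF_MEMALLOC;
	fake_compact();
	current->flags = (current->flags & ~PF_MEMALLOC) | noreclaim_flag;
}

/* Hypothetical check: run both variants with the flag already set. */
void compare_variants(void)
{
	current->flags = PF_MEMALLOC;
	compact_unconditional_clear();
	assert(!(current->flags & PF_MEMALLOC));	/* caller's flag lost */

	current->flags = PF_MEMALLOC;
	compact_save_restore();
	assert(current->flags & PF_MEMALLOC);		/* caller's flag kept */
}
```

When the flag is clear on entry, both variants behave identically; they differ only when a caller higher up has already set PF_MEMALLOC, which is the case the commit message describes as theoretical.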



Re: How to online remove an error scsi disk from the system?

2013-02-01 Thread Hillf Danton
On Fri, Feb 1, 2013 at 2:13 PM, Tao Ma <t...@tao.ma> wrote:
 Hi All,
 In our production system, we have several SATA disks attached to one
 machine. When one of the disks fails, jbd2 (yes, we use ext4) hangs
 forever and we get something like the messages below in /var/log/messages.
 It seems that the I/O sent to the SCSI layer is never returned with
 -EIO, which is a little surprising to me (there should be a timeout
 somewhere, right?). We have tried
 echo offline > /sys/block/sdl/device/state, but it doesn't work. So is
 there any way for us to make the SCSI device return all the I/O requests
 with EIO so that all the end_io callbacks can be called accordingly? Am I
 missing something here?

 Thanks,
 Tao


 sd 0:0:11:0: attempting task abort! scmd(88180e900580)
 sd 0:0:11:0: [sdl] CDB: Write(10): 2a 00 0d ca e0 3f 00 04 00 00
 target0:0:11: handle(0x0015), sas_address(0x500e004aaa0b), phy(11)
 target0:0:11: enclosure_logical_id(0x500e004aaa00), slot(11)
 INFO: task jbd2/sdl1-8:4629 blocked for more than 120 seconds.
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 jbd2/sdl1-8   D  0  4629  2 0x
  88180aa79ae0 0046 88180aa79aa8 
  88007ce0fe40 00015f40 8818102c0638 8818102c0080
  880a9184a100 8818102c0638 000105006028 0001
 Call Trace:
  [81236a15] ? cpumask_next_and+0x25/0x40
  [810122b6] ? read_tsc+0x16/0x40
  [81093cd9] ? ktime_get_ts+0xa9/0xe0
  [810122b6] ? read_tsc+0x16/0x40
  [81093cd9] ? ktime_get_ts+0xa9/0xe0
  [814a8a53] io_schedule+0x73/0xc0
  [811036a8] sync_page+0x38/0x50
  [814a927e] __wait_on_bit+0x5e/0x90
  [81103670] ? sync_page+0x0/0x50
  [81103845] wait_on_page_bit+0x75/0x80
  [81089320] ? wake_bit_function+0x0/0x40
  [811197c7] ? pagevec_lookup_tag+0x27/0x40
  [81118b55] write_cache_pages+0x1d5/0x440
  [811172f0] ? __writepage+0x0/0x40
  [81118de4] generic_writepages+0x24/0x30
  [a02dc719] jbd2_journal_commit_transaction+0x3e9/0x1490 [jbd2]
  [81074299] ? try_to_del_timer_sync+0x49/0xe0
  [a02e2734] kjournald2+0xb4/0x220 [jbd2]
  [810892e0] ? autoremove_wake_function+0x0/0x40
  [a02e2680] ? kjournald2+0x0/0x220 [jbd2]
  [81089166] kthread+0x96/0xa0
  [8100c08a] child_rip+0xa/0x20
  [810890d0] ? kthread+0x0/0xa0
  [8100c080] ? child_rip+0x0/0x20

Can you try upstream?