[PATCH 0/2] btrfs: fortification for GFP_NOFS allocations

2015-08-19 Thread mhocko
Hi, these two patches were sent as a part of a larger RFC which aims at allowing GFP_NOFS allocations to fail to help sort out memory reclaim issues bound to the current behavior (http://marc.info/?l=linux-mm&m=143876830616538&w=2). It is clear that move to the GFP_NOFS behavior change is a long

[PATCH 1/2] btrfs: Prevent from early transaction abort

2015-08-19 Thread mhocko
From: Michal Hocko mho...@suse.com Btrfs relies on GFP_NOFS allocation when committing the transaction but this allocation context is rather weak wrt. reclaim capabilities. The page allocator currently tries hard to not fail these allocations if they are small (<=PAGE_ALLOC_COSTLY_ORDER) so this

[PATCH 2/2] btrfs: use __GFP_NOFAIL in alloc_btrfs_bio

2015-08-19 Thread mhocko
From: Michal Hocko mho...@suse.com alloc_btrfs_bio relies on GFP_NOFS allocation when committing the transaction but this allocation context is rather weak wrt. reclaim capabilities. The page allocator currently tries hard to not fail these allocations if they are small (<=PAGE_ALLOC_COSTLY_ORDER)

[RFC 1/8] mm, oom: Give __GFP_NOFAIL allocations access to memory reserves

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com __GFP_NOFAIL is a big hammer used to ensure that the allocation request can never fail. This is a strong requirement and as such it also deserves a special treatment when the system is OOM. The primary problem here is that the allocation request might have come

[RFC 8/8] btrfs: use __GFP_NOFAIL in alloc_btrfs_bio

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com alloc_btrfs_bio relies on GFP_NOFS to allocate a bio, but since "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM" this is allowed to fail, which can lead to [ 37.928625] kernel BUG at fs/btrfs/extent_io.c:4045 This is clearly undesirable and the

[RFC 7/8] btrfs: Prevent from early transaction abort

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com Btrfs relies on GFP_NOFS allocation when committing the transaction, but since "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM" those allocations are allowed to fail, which can lead to a premature transaction abort: [ 55.328093] Call Trace: [

[RFC 4/8] jbd, jbd2: Do not fail journal because of frozen_buffer allocation failure

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com Journal transaction might fail prematurely because the frozen_buffer is allocated by GFP_NOFS request: [ 72.440013] do_get_write_access: OOM for frozen_buffer [ 72.440014] EXT4-fs: ext4_reserve_inode_write:4729: aborting transaction: Out of memory in

[RFC 6/8] ext3: Do not abort journal prematurely

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com journal_get_undo_access is relying on GFP_NOFS allocation yet it is essential for the journal transaction: [ 83.256914] journal_get_undo_access: No memory for committed data [ 83.258022] EXT3-fs: ext3_free_blocks_sb: aborting transaction: Out of memory in

[RFC 5/8] ext4: Do not fail journal due to block allocator

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com Since "mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM" the memory allocator doesn't endlessly loop to satisfy low-order allocations and instead fails them to allow callers to handle them gracefully. Some of the callers are not yet prepared for this

[RFC 3/8] mm: page_alloc: do not lock up GFP_NOFS allocations upon OOM

2015-08-05 Thread mhocko
From: Johannes Weiner han...@cmpxchg.org GFP_NOFS allocations are not allowed to invoke the OOM killer since their reclaim abilities are severely diminished. However, without the OOM killer available there is no hope of progress once the reclaimable pages have been exhausted. Don't risk hanging
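The reasoning in this snippet (no OOM killer means no progress once reclaim is exhausted, so looping forever cannot help) can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: it is not kernel code, the function name `alloc_slowpath` and all of its parameters are hypothetical, and "reclaim" is reduced to decrementing a counter.

```python
def alloc_slowpath(needed, reclaimable, may_oom_kill, fail_when_stuck,
                   max_loops=1000):
    """Toy model of an allocation retry loop (hypothetical, not kernel code).

    needed          -- pages the request still needs
    reclaimable     -- pages that reclaim can still free
    may_oom_kill    -- whether this context may invoke the OOM killer
    fail_when_stuck -- fail instead of retrying when nothing can be reclaimed
    """
    for _ in range(max_loops):
        if needed == 0:
            return "success"
        if reclaimable > 0:
            # Reclaim made progress: one page freed and consumed.
            reclaimable -= 1
            needed -= 1
            continue
        if may_oom_kill:
            # Progress is still possible by killing a task.
            return "oom-kill"
        if fail_when_stuck:
            # Let the caller handle the failure.
            return "fail"
        # Otherwise retry with nothing left to reclaim.
    # Loop budget exhausted; stands in for an endless retry loop.
    return "success" if needed == 0 else "lockup"
```

In this model a request that cannot invoke the OOM killer and is not allowed to fail ends in "lockup" as soon as the reclaimable pages run out, which is the situation the patch title refers to.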

[RFC 2/8] mm: Allow GFP_IOFS for page_cache_read page cache allocation

2015-08-05 Thread mhocko
From: Michal Hocko mho...@suse.com page_cache_read has historically used page_cache_alloc_cold to allocate a new page. This means that mapping_gfp_mask is used as the base for the gfp_mask. Many filesystems set this mask to GFP_NOFS to prevent fs recursion issues.
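The mask arithmetic behind this snippet can be sketched in user space. The bit values below are illustrative placeholders, not a claim about any particular kernel version; the compositions of GFP_NOFS, GFP_KERNEL and GFP_IOFS are assumed to follow the kernel definitions of that period; and `page_cache_read_mask` is a hypothetical helper showing one way the idea in the patch title could be expressed, not the patch itself.

```python
# Illustrative flag bits (assumed values, for demonstration only).
GFP_WAIT = 0x10  # caller may sleep and enter direct reclaim
GFP_IO = 0x40    # allocation may start physical I/O
GFP_FS = 0x80    # allocation may call back into the filesystem

# Assumed compositions of the composite masks.
GFP_NOFS = GFP_WAIT | GFP_IO
GFP_KERNEL = GFP_WAIT | GFP_IO | GFP_FS
GFP_IOFS = GFP_IO | GFP_FS


def page_cache_read_mask(mapping_gfp_mask: int) -> int:
    """Hypothetical helper: widen the mapping's base mask with GFP_IOFS."""
    return mapping_gfp_mask | GFP_IOFS


def may_recurse_into_fs(gfp_mask: int) -> bool:
    """True if the mask permits reclaim to call into filesystem code."""
    return bool(gfp_mask & GFP_FS)
```

Under these assumptions a mapping whose mask is GFP_NOFS cannot recurse into the filesystem, while the widened mask is equivalent to GFP_KERNEL and can.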

[RFC 0/8] Allow GFP_NOFS allocation to fail

2015-08-05 Thread mhocko
Hi, small GFP_NOFS allocations, like GFP_KERNEL ones, have traditionally not failed even though their reclaim capabilities are restricted because the VM code cannot recurse into filesystems to clean dirty pages. At the same time these allocation requests do not allow to trigger the OOM killer