Re: [PATCH 7/8] btrfs: be more explicit about allowed flush states
On 3.12.18 г. 17:24 ч., Josef Bacik wrote:
> For FLUSH_LIMIT flushers (think evict, truncate) we can deadlock when
> running delalloc because we may be holding a tree lock. We can also
> deadlock with delayed refs rsv's that are running via the committing
> mechanism. The only safe operations for FLUSH_LIMIT are to run the
> delayed operations and to allocate chunks, everything else has the
> potential to deadlock. Future proof this by explicitly specifying the
> states that FLUSH_LIMIT is allowed to use. This will keep us from
> introducing bugs later on when adding new flush states.
>
> Signed-off-by: Josef Bacik

Reviewed-by: Nikolay Borisov

> ---
>  fs/btrfs/extent-tree.c | 21 ++---
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 0e1a499035ac..ab9d915d9289 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5123,12 +5123,18 @@ void btrfs_init_async_reclaim_work(struct work_struct *work)
>  	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
>  }
>
> +static const enum btrfs_flush_state priority_flush_states[] = {
> +	FLUSH_DELAYED_ITEMS_NR,
> +	FLUSH_DELAYED_ITEMS,
> +	ALLOC_CHUNK,
> +};
> +
>  static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  					    struct btrfs_space_info *space_info,
>  					    struct reserve_ticket *ticket)
>  {
>  	u64 to_reclaim;
> -	int flush_state = FLUSH_DELAYED_ITEMS_NR;
> +	int flush_state = 0;
>
>  	spin_lock(&space_info->lock);
>  	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info,
> @@ -5140,7 +5146,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  	spin_unlock(&space_info->lock);
>
>  	do {
> -		flush_space(fs_info, space_info, to_reclaim, flush_state);
> +		flush_space(fs_info, space_info, to_reclaim,
> +			    priority_flush_states[flush_state]);
>  		flush_state++;
>  		spin_lock(&space_info->lock);
>  		if (ticket->bytes == 0) {
> @@ -5148,15 +5155,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  			return;
>  		}
>  		spin_unlock(&space_info->lock);
> -
> -		/*
> -		 * Priority flushers can't wait on delalloc without
> -		 * deadlocking.
> -		 */
> -		if (flush_state == FLUSH_DELALLOC ||
> -		    flush_state == FLUSH_DELALLOC_WAIT)
> -			flush_state = ALLOC_CHUNK;
> -	} while (flush_state < COMMIT_TRANS);
> +	} while (flush_state < ARRAY_SIZE(priority_flush_states));
>  }
>
>  static int wait_reserve_ticket(struct btrfs_fs_info *fs_info,
>
Re: [PATCH 7/8] btrfs: be more explicit about allowed flush states
On Mon, Dec 03, 2018 at 10:24:58AM -0500, Josef Bacik wrote:
> For FLUSH_LIMIT flushers (think evict, truncate) we can deadlock when
> running delalloc because we may be holding a tree lock. We can also
> deadlock with delayed refs rsv's that are running via the committing
> mechanism. The only safe operations for FLUSH_LIMIT are to run the
> delayed operations and to allocate chunks, everything else has the
> potential to deadlock. Future proof this by explicitly specifying the
> states that FLUSH_LIMIT is allowed to use. This will keep us from
> introducing bugs later on when adding new flush states.
>
> Signed-off-by: Josef Bacik

Reviewed-by: David Sterba
[PATCH 7/8] btrfs: be more explicit about allowed flush states
For FLUSH_LIMIT flushers (think evict, truncate) we can deadlock when
running delalloc because we may be holding a tree lock. We can also
deadlock with delayed refs rsv's that are running via the committing
mechanism. The only safe operations for FLUSH_LIMIT are to run the
delayed operations and to allocate chunks, everything else has the
potential to deadlock. Future proof this by explicitly specifying the
states that FLUSH_LIMIT is allowed to use. This will keep us from
introducing bugs later on when adding new flush states.

Signed-off-by: Josef Bacik
---
 fs/btrfs/extent-tree.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0e1a499035ac..ab9d915d9289 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5123,12 +5123,18 @@ void btrfs_init_async_reclaim_work(struct work_struct *work)
 	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
 }
 
+static const enum btrfs_flush_state priority_flush_states[] = {
+	FLUSH_DELAYED_ITEMS_NR,
+	FLUSH_DELAYED_ITEMS,
+	ALLOC_CHUNK,
+};
+
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 					    struct btrfs_space_info *space_info,
 					    struct reserve_ticket *ticket)
 {
 	u64 to_reclaim;
-	int flush_state = FLUSH_DELAYED_ITEMS_NR;
+	int flush_state = 0;
 
 	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info,
@@ -5140,7 +5146,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 	spin_unlock(&space_info->lock);
 
 	do {
-		flush_space(fs_info, space_info, to_reclaim, flush_state);
+		flush_space(fs_info, space_info, to_reclaim,
+			    priority_flush_states[flush_state]);
 		flush_state++;
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
@@ -5148,15 +5155,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 			return;
 		}
 		spin_unlock(&space_info->lock);
-
-		/*
-		 * Priority flushers can't wait on delalloc without
-		 * deadlocking.
-		 */
-		if (flush_state == FLUSH_DELALLOC ||
-		    flush_state == FLUSH_DELALLOC_WAIT)
-			flush_state = ALLOC_CHUNK;
-	} while (flush_state < COMMIT_TRANS);
+	} while (flush_state < ARRAY_SIZE(priority_flush_states));
 }
 
 static int wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.14.3
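For anyone who wants to poke at the control flow outside the kernel, here is a minimal userspace sketch of the table-driven loop from the patch above. It is only a model: the enum is a stand-in for btrfs_flush_state (the names come from the patch, the numeric values are an assumption), and model_priority_reclaim() and its satisfied_after parameter are hypothetical helpers that replace flush_space() and the ticket->bytes check.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/*
 * Stand-in for the kernel's enum btrfs_flush_state. The names are taken
 * from the patch; the values and ordering are assumptions.
 */
enum flush_state {
	FLUSH_DELAYED_ITEMS_NR = 1,
	FLUSH_DELAYED_ITEMS,
	FLUSH_DELALLOC,
	FLUSH_DELALLOC_WAIT,
	ALLOC_CHUNK,
	COMMIT_TRANS,
};

/* Same allow-list as in the patch. */
static const enum flush_state priority_flush_states[] = {
	FLUSH_DELAYED_ITEMS_NR,
	FLUSH_DELAYED_ITEMS,
	ALLOC_CHUNK,
};

/*
 * Hypothetical model of the new loop. Each state that would be handed to
 * flush_space() is recorded in visited[]. satisfied_after is the number of
 * flushes after which the ticket counts as satisfied (0 means never).
 * Returns the number of states visited.
 */
static size_t model_priority_reclaim(enum flush_state *visited, size_t cap,
				     size_t satisfied_after)
{
	size_t flush_state = 0;
	size_t n = 0;

	do {
		assert(n < cap);
		visited[n++] = priority_flush_states[flush_state];
		flush_state++;
		if (satisfied_after && n == satisfied_after)
			break;	/* models the ticket->bytes == 0 early return */
	} while (flush_state < ARRAY_SIZE(priority_flush_states));

	return n;
}
```

In this model, the only states the loop can ever reach are the ones listed in priority_flush_states[], no matter how the enum itself grows.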
Re: [PATCH 7/8] btrfs: be more explicit about allowed flush states
On 26.11.18 г. 14:41 ч., Nikolay Borisov wrote:
>
>
> On 21.11.18 г. 21:03 ч., Josef Bacik wrote:
>> For FLUSH_LIMIT flushers we really can only allocate chunks and flush
>> delayed inode items, everything else is problematic. I added a bunch of
>> new states and it led to weirdness in the FLUSH_LIMIT case because I
>> forgot about how it worked. So instead explicitly declare the states
>> that are ok for flushing with FLUSH_LIMIT and use that for our state
>> machine. Then as we add new things that are safe we can just add them
>> to this list.
>
>
> Code-wise it's ok but the changelog needs rewording. At the very least
> explain the weirdness. Also in the last sentence the word 'thing' is
> better substituted with "flush states".

Case in point, you yourself mention that you have forgotten how the
FLUSH_LIMIT case works. That's why we need good changelogs so that those
details can be quickly worked out from reading the changelog.

>
>>
>> Signed-off-by: Josef Bacik
>> ---
>>  fs/btrfs/extent-tree.c | 21 ++---
>>  1 file changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> index 0e9ba77e5316..e31980d451c2 100644
>> --- a/fs/btrfs/extent-tree.c
>> +++ b/fs/btrfs/extent-tree.c
>> @@ -5112,12 +5112,18 @@ void btrfs_init_async_reclaim_work(struct work_struct *work)
>>  	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
>>  }
>>
>> +static const enum btrfs_flush_state priority_flush_states[] = {
>> +	FLUSH_DELAYED_ITEMS_NR,
>> +	FLUSH_DELAYED_ITEMS,
>> +	ALLOC_CHUNK,
>> +};
>> +
>>  static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>>  					    struct btrfs_space_info *space_info,
>>  					    struct reserve_ticket *ticket)
>>  {
>>  	u64 to_reclaim;
>> -	int flush_state = FLUSH_DELAYED_ITEMS_NR;
>> +	int flush_state = 0;
>>
>>  	spin_lock(&space_info->lock);
>>  	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info,
>> @@ -5129,7 +5135,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>>  	spin_unlock(&space_info->lock);
>>
>>  	do {
>> -		flush_space(fs_info, space_info, to_reclaim, flush_state);
>> +		flush_space(fs_info, space_info, to_reclaim,
>> +			    priority_flush_states[flush_state]);
>>  		flush_state++;
>>  		spin_lock(&space_info->lock);
>>  		if (ticket->bytes == 0) {
>> @@ -5137,15 +5144,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>>  			return;
>>  		}
>>  		spin_unlock(&space_info->lock);
>> -
>> -		/*
>> -		 * Priority flushers can't wait on delalloc without
>> -		 * deadlocking.
>> -		 */
>> -		if (flush_state == FLUSH_DELALLOC ||
>> -		    flush_state == FLUSH_DELALLOC_WAIT)
>> -			flush_state = ALLOC_CHUNK;
>> -	} while (flush_state < COMMIT_TRANS);
>> +	} while (flush_state < ARRAY_SIZE(priority_flush_states));
>>  }
>>
>>  static int wait_reserve_ticket(struct btrfs_fs_info *fs_info,
>>
>
Re: [PATCH 7/8] btrfs: be more explicit about allowed flush states
On 21.11.18 г. 21:03 ч., Josef Bacik wrote:
> For FLUSH_LIMIT flushers we really can only allocate chunks and flush
> delayed inode items, everything else is problematic. I added a bunch of
> new states and it led to weirdness in the FLUSH_LIMIT case because I
> forgot about how it worked. So instead explicitly declare the states
> that are ok for flushing with FLUSH_LIMIT and use that for our state
> machine. Then as we add new things that are safe we can just add them
> to this list.

Code-wise it's ok but the changelog needs rewording. At the very least
explain the weirdness. Also in the last sentence the word 'thing' is
better substituted with "flush states".

>
> Signed-off-by: Josef Bacik
> ---
>  fs/btrfs/extent-tree.c | 21 ++---
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 0e9ba77e5316..e31980d451c2 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -5112,12 +5112,18 @@ void btrfs_init_async_reclaim_work(struct work_struct *work)
>  	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
>  }
>
> +static const enum btrfs_flush_state priority_flush_states[] = {
> +	FLUSH_DELAYED_ITEMS_NR,
> +	FLUSH_DELAYED_ITEMS,
> +	ALLOC_CHUNK,
> +};
> +
>  static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  					    struct btrfs_space_info *space_info,
>  					    struct reserve_ticket *ticket)
>  {
>  	u64 to_reclaim;
> -	int flush_state = FLUSH_DELAYED_ITEMS_NR;
> +	int flush_state = 0;
>
>  	spin_lock(&space_info->lock);
>  	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info,
> @@ -5129,7 +5135,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  	spin_unlock(&space_info->lock);
>
>  	do {
> -		flush_space(fs_info, space_info, to_reclaim, flush_state);
> +		flush_space(fs_info, space_info, to_reclaim,
> +			    priority_flush_states[flush_state]);
>  		flush_state++;
>  		spin_lock(&space_info->lock);
>  		if (ticket->bytes == 0) {
> @@ -5137,15 +5144,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
>  			return;
>  		}
>  		spin_unlock(&space_info->lock);
> -
> -		/*
> -		 * Priority flushers can't wait on delalloc without
> -		 * deadlocking.
> -		 */
> -		if (flush_state == FLUSH_DELALLOC ||
> -		    flush_state == FLUSH_DELALLOC_WAIT)
> -			flush_state = ALLOC_CHUNK;
> -	} while (flush_state < COMMIT_TRANS);
> +	} while (flush_state < ARRAY_SIZE(priority_flush_states));
>  }
>
>  static int wait_reserve_ticket(struct btrfs_fs_info *fs_info,
>
[PATCH 7/8] btrfs: be more explicit about allowed flush states
For FLUSH_LIMIT flushers we really can only allocate chunks and flush
delayed inode items, everything else is problematic. I added a bunch of
new states and it led to weirdness in the FLUSH_LIMIT case because I
forgot about how it worked. So instead explicitly declare the states
that are ok for flushing with FLUSH_LIMIT and use that for our state
machine. Then as we add new things that are safe we can just add them
to this list.

Signed-off-by: Josef Bacik
---
 fs/btrfs/extent-tree.c | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 0e9ba77e5316..e31980d451c2 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -5112,12 +5112,18 @@ void btrfs_init_async_reclaim_work(struct work_struct *work)
 	INIT_WORK(work, btrfs_async_reclaim_metadata_space);
 }
 
+static const enum btrfs_flush_state priority_flush_states[] = {
+	FLUSH_DELAYED_ITEMS_NR,
+	FLUSH_DELAYED_ITEMS,
+	ALLOC_CHUNK,
+};
+
 static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 					    struct btrfs_space_info *space_info,
 					    struct reserve_ticket *ticket)
 {
 	u64 to_reclaim;
-	int flush_state = FLUSH_DELAYED_ITEMS_NR;
+	int flush_state = 0;
 
 	spin_lock(&space_info->lock);
 	to_reclaim = btrfs_calc_reclaim_metadata_size(fs_info, space_info,
@@ -5129,7 +5135,8 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 	spin_unlock(&space_info->lock);
 
 	do {
-		flush_space(fs_info, space_info, to_reclaim, flush_state);
+		flush_space(fs_info, space_info, to_reclaim,
+			    priority_flush_states[flush_state]);
 		flush_state++;
 		spin_lock(&space_info->lock);
 		if (ticket->bytes == 0) {
@@ -5137,15 +5144,7 @@ static void priority_reclaim_metadata_space(struct btrfs_fs_info *fs_info,
 			return;
 		}
 		spin_unlock(&space_info->lock);
-
-		/*
-		 * Priority flushers can't wait on delalloc without
-		 * deadlocking.
-		 */
-		if (flush_state == FLUSH_DELALLOC ||
-		    flush_state == FLUSH_DELALLOC_WAIT)
-			flush_state = ALLOC_CHUNK;
-	} while (flush_state < COMMIT_TRANS);
+	} while (flush_state < ARRAY_SIZE(priority_flush_states));
 }
 
 static int wait_reserve_ticket(struct btrfs_fs_info *fs_info,
-- 
2.14.3
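The changelog does not spell out the "weirdness", so the following is only one plausible illustration, not a statement of what was actually hit. It models the old loop removed by the patch (start at FLUSH_DELAYED_ITEMS_NR, increment, skip the two delalloc states) against a stand-in enum that has gained a new entry. EXTRA_STATE, the enum values, and model_old_priority_loop() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for enum btrfs_flush_state. EXTRA_STATE is a hypothetical state
 * added later, placed between the delayed-item and delalloc states. The
 * values and ordering are assumptions.
 */
enum flush_state {
	FLUSH_DELAYED_ITEMS_NR = 1,
	FLUSH_DELAYED_ITEMS,
	EXTRA_STATE,
	FLUSH_DELALLOC,
	FLUSH_DELALLOC_WAIT,
	ALLOC_CHUNK,
	COMMIT_TRANS,
};

/*
 * Hypothetical model of the old loop: walk the enum by incrementing and
 * jump over the delalloc states. Records each state that would have been
 * passed to flush_space() and returns how many were visited.
 */
static size_t model_old_priority_loop(int *visited, size_t cap)
{
	int flush_state = FLUSH_DELAYED_ITEMS_NR;
	size_t n = 0;

	do {
		assert(n < cap);
		visited[n++] = flush_state;
		flush_state++;
		/* Same skip as the removed code: avoid waiting on delalloc. */
		if (flush_state == FLUSH_DELALLOC ||
		    flush_state == FLUSH_DELALLOC_WAIT)
			flush_state = ALLOC_CHUNK;
	} while (flush_state < COMMIT_TRANS);

	return n;
}
```

In this model the old loop picks up EXTRA_STATE without anyone having decided it is safe for FLUSH_LIMIT, simply because it sits inside the range being walked. With an explicit array, a new state is only run if someone adds it to the list.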