On Mon, Mar 13, 2017 at 12:27 PM, Goldwyn Rodrigues <rgold...@suse.de> wrote: > > > On 03/13/2017 07:14 AM, Filipe Manana wrote: >> On Sat, Mar 11, 2017 at 11:02 PM, Goldwyn Rodrigues <rgold...@suse.de> wrote: >>> From: Goldwyn Rodrigues <rgold...@suse.com> >>> >>> We are facing the same problem with EDQUOT which was experienced with >>> ENOSPC. Not sure if we require a full ticketing system such as ENOSPC, but >>> here is a fix. Let me know if it is too big a hammer. >>> >>> Quotas are reserved during the start of an operation, incrementing >>> qg->reserved. However, it is written to disk in a commit_transaction >>> which could take as long as commit_interval. In the meantime there >>> could be deletions which are not accounted for because deletions are >>> accounted for only while committed (free_refroot). So, when we get >>> a EDQUOT flush the data to disk and try again. >> >> So this doesn't explain why the problem happens, taking into account >> the reproducer below. >> You mention that deletions are only accounted at transaction commit >> time, but the reproducer does not do any deletions. > > Yes it does, after Cleanup. It is a loop. I suppose you got confused by > the indent.
Indeed, the indentation is screwed up. > >> So why does the reproducer gets EDQUOT errors? The second sentence >> "However, it is written to disk in a commit_transaction >> which could take as long as commit_interval." also does not explain >> why is that a problem, why it makes the reproducer get EDQUOT errors. >> >> I would like to see a clear and complete explanation in the changelog. > > I suppose this comment is irrelevant now? Yes. > >> >>> >>> I combined the conditions of rfer and excl to reduce code lines, though >>> the condition looks uglier. >> >> This sentence is outdated, it's no longer valid for v2 as it now calls >> qgroup_check_limits(). >> >>> >>> Here is a sample script which shows this issue. >>> >>> DEVICE=/dev/vdb >>> MOUNTPOINT=/mnt >>> TESTVOL=$MOUNTPOINT/tmp >>> QUOTA=5 >>> PROG=btrfs >>> DD_BS="4k" >>> DD_COUNT="256" >>> RUN_TIMES=5000 >>> >>> mkfs.btrfs -f $DEVICE >>> mount -o commit=240 $DEVICE $MOUNTPOINT >>> $PROG subvolume create $TESTVOL >>> $PROG quota enable $TESTVOL >>> $PROG qgroup limit ${QUOTA}G $TESTVOL >>> >>> typeset -i DD_RUN_GOOD >>> typeset -i QUOTA >>> >>> function _check_cmd() { >>> if [[ ${?} > 0 ]]; then >>> echo -n "$(date) E: Running previous command" >>> echo ${*} >>> echo "Without sync" >>> $PROG qgroup show -pcreFf ${TESTVOL} >>> echo "With sync" >>> $PROG qgroup show -pcreFf --sync ${TESTVOL} >>> exit 1 >>> fi >>> } >>> >>> while true; do >>> DD_RUN_GOOD=$RUN_TIMES >>> >>> while (( ${DD_RUN_GOOD} != 0 )); do >>> dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} bs=${DD_BS} >>> count=${DD_COUNT} >>> _check_cmd "dd if=/dev/zero of=${TESTVOL}/quotatest${DD_RUN_GOOD} >>> bs=${DD_BS} count=${DD_COUNT}" >>> DD_RUN_GOOD=(${DD_RUN_GOOD}-1) >>> done >>> >>> $PROG qgroup show -pcref $TESTVOL >>> echo "----------- Cleanup ---------- " >>> rm $TESTVOL/quotatest* > > Here is the deletion. > >>> >>> done > > And the infinite loop ending. > >> >> Please write a test case for fstests. > > You mean xfstests. will do. xfstests got renamed to fstests long time ago, and obviously it's xfstests as that's the test suite we all use. thanks > >> >>> >>> Changes since v1: >>> - Changed start_delalloc_roots() to start_delalloc_inode() to target >>> the root in question only to reduce the amount of flush to be done. >>> - Added wait_ordered_extents(). >> >> Listing what changed between versions is not meant to appear in the >> changelog. This should go after the --- below. You should also prefix >> the patch subject with "btrfs:" (or "btrfs-progs:"). See the examples >> from others or the wiki for intsructions >> (https://btrfs.wiki.kernel.org/index.php/Writing_patch_for_btrfs#Updating_patches). >> > > Ok, understood. Thanks. > >> thanks >> >>> >>> Signed-off-by: Goldwyn Rodrigues <rgold...@suse.com> >>> Reviewed-by: Qu Wenruo <quwen...@cn.fujitsu.com> >>> --- >>> fs/btrfs/qgroup.c | 15 +++++++++++++++ >>> 1 file changed, 15 insertions(+) >>> >>> diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c >>> index a5da750..6215264 100644 >>> --- a/fs/btrfs/qgroup.c >>> +++ b/fs/btrfs/qgroup.c >>> @@ -2367,6 +2367,7 @@ static int qgroup_reserve(struct btrfs_root *root, >>> u64 num_bytes, bool enforce) >>> struct btrfs_fs_info *fs_info = root->fs_info; >>> u64 ref_root = root->root_key.objectid; >>> int ret = 0; >>> + int retried = 0; >>> struct ulist_node *unode; >>> struct ulist_iterator uiter; >>> >>> @@ -2376,6 +2377,7 @@ static int qgroup_reserve(struct btrfs_root *root, >>> u64 num_bytes, bool enforce) >>> if (num_bytes == 0) >>> return 0; >>> >>> +retry: >>> spin_lock(&fs_info->qgroup_lock); >>> quota_root = fs_info->quota_root; >>> if (!quota_root) >>> @@ -2402,6 +2404,19 @@ static int qgroup_reserve(struct btrfs_root *root, >>> u64 num_bytes, bool enforce) >>> qg = unode_aux_to_qgroup(unode); >>> >>> if (enforce && !qgroup_check_limits(qg, num_bytes)) { >>> + if (!retried && qg->reserved > 0) { >>> + struct btrfs_trans_handle *trans; >>> + spin_unlock(&fs_info->qgroup_lock); >>> + btrfs_start_delalloc_inodes(root, 0); >>> + btrfs_wait_ordered_extents(root, -1, 0, >>> + (u64)-1); >>> + trans = btrfs_join_transaction(root); >>> + if (IS_ERR(trans)) >>> + return PTR_ERR(trans); >>> + btrfs_commit_transaction(trans); >>> + retried++; >>> + goto retry; >>> + } >>> ret = -EDQUOT; >>> goto out; >>> } >>> -- >>> 2.10.2 >>> > > -- > Goldwyn -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html