Hi,
Adding some more information.
A few months ago the data on the Lustre filesystem was migrated to new physical storage. After the migration completed successfully, the old OSTs were deactivated:
    lctl conf_param technion-OST0001.osc.active=0
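In case it helps, this is roughly how one of the old OSTs was deactivated, plus the checks I would run on a client to confirm the OSC is really gone; the get_param path is from memory, so treat it as an assumption and verify with lctl list_param:

    # on the MGS: mark the old OST permanently inactive
    lctl conf_param technion-OST0001.osc.active=0

    # on a client, after remounting
    lctl dl | grep OST0001                             # no output once the OSC is gone from the config
    lctl get_param osc.technion-OST0001-osc-*.active   # assumed path; 0 = deactivated, "not found" once removed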
Since then all the clients have been unmounted and remounted, and tunefs.lustre --writeconf was executed on the MGS/MDT and on all the OSTs. lctl dl no longer shows the old OSTs, but they still appear when querying quota. Since new users seem to be less affected by the "quota exceeded" problem (being blocked from writing while their quota is not filled), I suspect that the quota calculation is still summing values from the old OSTs:

*lfs quota -g -v md_kaplan /storage/*

Disk quotas for grp md_kaplan (gid 10028):
Filesystem kbytes quota limit grace files quota limit grace
/storage/ 4823987000 0 5368709120 - 143596 0 0 -
technion-MDT0000_UUID 37028 - 0 - 143596 - 0 -
quotactl ost0 failed.
quotactl ost1 failed.
quotactl ost2 failed.
quotactl ost3 failed.
quotactl ost4 failed.
quotactl ost5 failed.
quotactl ost6 failed.
quotactl ost7 failed.
quotactl ost8 failed.
quotactl ost9 failed.
quotactl ost10 failed.
quotactl ost11 failed.
quotactl ost12 failed.
quotactl ost13 failed.
quotactl ost14 failed.
quotactl ost15 failed.
quotactl ost16 failed.
quotactl ost17 failed.
quotactl ost18 failed.
quotactl ost19 failed.
quotactl ost20 failed.
technion-OST0015_UUID 114429464* - 114429464 - - - - -
technion-OST0016_UUID 92938588 - 92938592 - - - - -
technion-OST0017_UUID 128496468* - 128496468 - - - - -
technion-OST0018_UUID 191478704* - 191478704 - - - - -
technion-OST0019_UUID 107720552 - 107720560 - - - - -
technion-OST001a_UUID 165631952* - 165631952 - - - - -
technion-OST001b_UUID 460714156* - 460714156 - - - - -
technion-OST001c_UUID 157182900* - 157182900 - - - - -
technion-OST001d_UUID 102945952* - 102945952 - - - - -
technion-OST001e_UUID 175840980* - 175840980 - - - - -
technion-OST001f_UUID 142666872* - 142666872 - - - - -
technion-OST0020_UUID 188147548* - 188147548 - - - - -
technion-OST0021_UUID 125914240* - 125914240 - - - - -
technion-OST0022_UUID 186390800* - 186390800 - - - - -
technion-OST0023_UUID 115386876 - 115386884 - - - - -
technion-OST0024_UUID 127139556* - 127139556 - - - - -
technion-OST0025_UUID 179666580* - 179666580 - - - - -
technion-OST0026_UUID 147837348 - 147837356 - - - - -
technion-OST0027_UUID 129823528 - 129823536 - - - - -
technion-OST0028_UUID 158270776 - 158270784 - - - - -
technion-OST0029_UUID 168762120 - 168763104 - - - - -
technion-OST002a_UUID 164235684 - 164235688 - - - - -
technion-OST002b_UUID 147512200 - 147512204 - - - - -
technion-OST002c_UUID 158046652 - 158046668 - - - - -
technion-OST002d_UUID 199314048* - 199314048 - - - - -
technion-OST002e_UUID 209187196* - 209187196 - - - - -
technion-OST002f_UUID 162586732 - 162586764 - - - - -
technion-OST0030_UUID 131248812* - 131248812 - - - - -
technion-OST0031_UUID 134665176* - 134665176 - - - - -
technion-OST0032_UUID 149767512* - 149767512 - - - - -
Total allocated inode limit: 0, total allocated block limit: 4823951056
Some errors happened when getting quota info. Some devices may be not working or deactivated. The data in "[]" is inaccurate.

*lfs quota -g -h md_kaplan /storage/*

Disk quotas for grp md_kaplan (gid 10028):
Filesystem used quota limit grace files quota limit grace
/storage/ 4.493T 0k 5T - 143596 0 0 -
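To check whether the old OSTs still contribute to the total, a rough way is to sum the per-OST kbytes from the verbose output and compare it with the /storage/ line; this is only a sketch and assumes each OST UUID and its kbytes value appear on the same output line:

    lfs quota -g -v md_kaplan /storage/ 2>/dev/null | \
        awk '/OST[0-9a-f]+_UUID/ { gsub(/\*/, "", $2); sum += $2 }
             END { printf "sum over listed OSTs: %d kbytes\n", sum }'

If that sum already accounts for the 4823987000 kB reported for /storage/, the usage comes entirely from the OSTs shown above; if it falls short, the remainder is presumably still being carried for the removed OSTs.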
On Tue, Aug 11, 2020 at 7:35 AM David Cohen <cda...@physics.technion.ac.il> wrote:
> Hi,
> I'm running Lustre 2.10.5 on the OSS and MDS, and 2.10.7 on the clients.
> While inode quota on the MDT has worked fine for a while now:
> lctl conf_param technion.quota.mdt=ugp
> a few days ago I turned on quota on the OSTs:
> lctl conf_param technion.quota.ost=ugp
> and users started getting "Disk quota exceeded" error messages while their quota is not filled.
>
> Actions taken:
> Full e2fsck -f -y on the whole filesystem, MDT and OSTs.
> lctl lfsck_start -A -t all -o -e continue
> Turning quota to none and back.
>
> None of the above solved the problem.
>
> lctl lfsck_query
>
> layout_mdts_init: 0
> layout_mdts_scanning-phase1: 0
> layout_mdts_scanning-phase2: 0
> layout_mdts_completed: 0
> layout_mdts_failed: 0
> layout_mdts_stopped: 0
> layout_mdts_paused: 0
> layout_mdts_crashed: 0
> layout_mdts_partial: 1    # is that normal output?
> layout_mdts_co-failed: 0
> layout_mdts_co-stopped: 0
> layout_mdts_co-paused: 0
> layout_mdts_unknown: 0
> layout_osts_init: 0
> layout_osts_scanning-phase1: 0
> layout_osts_scanning-phase2: 0
> layout_osts_completed: 30
> layout_osts_failed: 0
> layout_osts_stopped: 0
> layout_osts_paused: 0
> layout_osts_crashed: 0
> layout_osts_partial: 0
> layout_osts_co-failed: 0
> layout_osts_co-stopped: 0
> layout_osts_co-paused: 0
> layout_osts_unknown: 0
> layout_repaired: 15
> namespace_mdts_init: 0
> namespace_mdts_scanning-phase1: 0
> namespace_mdts_scanning-phase2: 0
> namespace_mdts_completed: 1
> namespace_mdts_failed: 0
> namespace_mdts_stopped: 0
> namespace_mdts_paused: 0
> namespace_mdts_crashed: 0
> namespace_mdts_partial: 0
> namespace_mdts_co-failed: 0
> namespace_mdts_co-stopped: 0
> namespace_mdts_co-paused: 0
> namespace_mdts_unknown: 0
> namespace_osts_init: 0
> namespace_osts_scanning-phase1: 0
> namespace_osts_scanning-phase2: 0
> namespace_osts_completed: 0
> namespace_osts_failed: 0
> namespace_osts_stopped: 0
> namespace_osts_paused: 0
> namespace_osts_crashed: 0
> namespace_osts_partial: 0
> namespace_osts_co-failed: 0
> namespace_osts_co-stopped: 0
> namespace_osts_co-paused: 0
> namespace_osts_unknown: 0
> namespace_repaired: 99
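For what it's worth, the enforcement state described above can also be inspected directly on the servers; the parameter name below is the one I believe reports the quota slave status on 2.10, but please treat it as an assumption and confirm it with lctl list_param:

    # on the MDS and each OSS (assumed parameter name)
    lctl get_param osd-*.*.quota_slave.info   # should list the enabled quota types and whether each slave is up to date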