Hi,
*Question 1: Is it expected behavior that pools disappeared after the Lustre
upgrade?*
No, this should not happen. Pools disappear only if you run a --writeconf on
the targets. You should check the kernel logs to see what happened.
You may have some kind of configuration corruption or mismatch. You can dump
the MGS configuration logs with:
[root@mgs ~]# debugfs -c -R 'dump CONFIGS/cluster-client /tmp/cluster-client' /dev/<mgt_device> && llog_reader /tmp/cluster-client
[root@mgs ~]# debugfs -c -R 'dump CONFIGS/cluster-MDT0000 /tmp/cluster-MDT0000' /dev/<mgt_device> && llog_reader /tmp/cluster-MDT0000
...
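To repeat the dump for several configuration logs, the two commands above can be generated in a loop. This is only a minimal sketch: it prints the command lines instead of running them, the FSNAME and MGT_DEV variables and the dump_cmd helper are hypothetical names introduced here, and /dev/MGT_DEVICE is a placeholder that must be replaced with the real MGT device.

```shell
#!/bin/sh
# Sketch: print the debugfs + llog_reader command for each MGS config log.
# FSNAME, MGT_DEV and dump_cmd are illustrative names, not part of Lustre.
FSNAME=${FSNAME:-cluster}
MGT_DEV=${MGT_DEV:-/dev/MGT_DEVICE}   # placeholder, not a real device

dump_cmd() {
    # $1 = name of a config log stored under CONFIGS/ on the MGT
    printf "debugfs -c -R 'dump CONFIGS/%s /tmp/%s' %s && llog_reader /tmp/%s\n" \
        "$1" "$1" "$MGT_DEV" "$1"
}

# Extend this list with the other logs you want to inspect.
for log in "$FSNAME-client" "$FSNAME-MDT0000"; do
    dump_cmd "$log"
done
```

Review the printed lines first, then paste them into a root shell on the MGS (or pipe them to sh) once the device name is correct.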
*Question 2: Is it safe to ignore warning messages "OST ... not found in pool
..." when adding OSTs to the pool?*
Sometimes it takes a while to synchronize all clients and targets (because one
or several nodes are unresponsive). But this can also hide an MGS communication
issue. You should check the kernel messages.
You can verify the client state with "lfs pool_list cluster.ssd" and the MDT state
with "lctl pool_list cluster.ssd" (or "lctl get_param lov.cluster-*.pools.ssd" for both).
This kind of behavior will be fixed by patch 53202
<https://review.whamcloud.com/c/fs/lustre-release/+/53202>: LU-17308
<https://jira.whamcloud.com/browse/LU-17308> mgs: move pool_cmd check to the kernel.
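To compare what a client and the MDS report for the same pool, the output of the "lfs pool_list" and "lctl pool_list" commands mentioned above can be saved to files and diffed. The following is a rough sketch, not an official tool: pool_members and pool_diff are hypothetical helper names, the /tmp file names are examples, and it assumes the pool members are printed as tokens ending in _UUID (as in cluster-OST0000_UUID).

```shell
#!/bin/sh
# Sketch: compare the OSTs listed for one pool on a client and on the MDS.
# Save the listings first, for example:
#   client: lfs pool_list cluster.ssd  > /tmp/client.txt
#   MDS:    lctl pool_list cluster.ssd > /tmp/mds.txt

pool_members() {
    # Keep only the tokens ending in _UUID (assumed format), sorted and unique.
    grep -o '[A-Za-z0-9_-]*_UUID' "$1" | sort -u
}

pool_diff() {
    # Print the UUIDs that appear in only one of the two saved listings.
    a=$(mktemp) && b=$(mktemp) || return 2
    pool_members "$1" > "$a"
    pool_members "$2" > "$b"
    comm -3 "$a" "$b" | tr -d '\t'
    rm -f "$a" "$b"
}
```

Running "pool_diff /tmp/client.txt /tmp/mds.txt" prints nothing when both sides agree. Any UUID it prints is known to only one side, which is a hint to check the kernel messages on that node.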
Etienne
On 8/1/24 14:00, Pavlo Khmel via lustre-discuss wrote:
Hi,
I upgraded Lustre from 2.12.8 to 2.15.5 (servers and clients). After the upgrade, I
found that all Lustre pools had disappeared. There were 2 pools:
- cluster.ssd
- cluster.hdd
[root@mds1 ~]# lctl pool_list cluster
Pools from cluster:
[root@mds1 ~]# lctl pool_list cluster.ssd
Pool: cluster.ssd
lctl pool_list: cannot open /proc/fs/lustre/lov/cluster-MDT0000-mdtlov/pools/ssd: No such file or directory (2)
[root@mds1 ~]# ls -la /proc/fs/lustre/lov/cluster-MDT0000-mdtlov/pools/
total 0
dr-xr-xr-x 2 root root 0 Aug 1 10:49 .
dr-xr-xr-x 4 root root 0 Aug 1 10:49 ..
On the client side "lfs getstripe" still shows pool names.
[root@login2 ~]# lfs getstripe /cluster
. . .
/cluster/home
stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
pool: hdd
/cluster/apps
stripe_count: 1 stripe_size: 1048576 pattern: raid0 stripe_offset: -1
pool: ssd
I created pools again:
[root@mds1 ~]# lctl pool_new cluster.ssd
Pool cluster.ssd created
[root@mds1 ~]# lctl pool_add cluster.ssd OST[0-3]
Warning, OST cluster-OST0000_UUID not found in pool cluster.ssd
OST cluster-OST0001_UUID added to pool cluster.ssd
. . .
Adding OSTs to the pool sometimes shows a warning message "OST ... not found in
pool ...".
Question 1: Is it expected behavior that pools disappeared after the Lustre
upgrade?
Question 2: Is it safe to ignore warning messages "OST ... not found in pool
..." when adding OSTs to the pool?
Best regards,
Pavlo Khmel
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org