Re: [lustre-discuss] [EXTERNAL] Re: What is the meaning of these messages?

2023-12-08 Thread Mohr, Rick via lustre-discuss
It could mean that there are network issues with that one particular client.  
If the client loses connectivity to an ost for some reason (even if the problem 
is on the client side), requests would timeout and the client would assume the 
target ost is unavailable.  The client would then try to reconnect to the 
target on the failover node, but since the target is not available on the 
failover node (because no failover occurred), I believe that node would log a 
message like what you have seen.  The fact that you see errors on multiple  
servers from the same client makes me think the problem is on the client.  
Maybe the network connection is flapping up and down?

In the example you gave, is oss010 the failover node for target fs-OST00b0?
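
For reference, one way to check both halves of that (the device path and client NID below are just placeholders):

# on the OSS that normally serves fs-OST00b0, list the configured failover NIDs
tunefs.lustre --dryrun /dev/<ost_device> | grep -i failover

# from the servers, check basic LNet reachability of the suspect client
lctl ping <client_nid>@tcp1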

--Rick


On 12/8/23, 9:39 AM, "lustre-discuss on behalf of Backer via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


Hi All,


Just sending this again. 




On Tue, 5 Dec 2023 at 15:03, Backer <backer.k...@gmail.com> wrote:


Hi All,


From time to time, I see the following messages on multiple OSS about a particular 
client IP. What do they mean? All the OSS and OSTs are online and have been 
online in the past. 




Dec 4 18:05:27 oss010 kernel: LustreError: 137-5: fs-OST00b0_UUID: not 
available for connect from @tcp1 (no target). If you are running an 
HA pair check that the target is mounted on the other server.

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] No port 988?

2023-09-26 Thread Mohr, Rick via lustre-discuss
What error do you get when you run "modprobe lnet"?
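
While you're gathering that, one quick sanity check is whether the lnet module on 
disk was actually built for the kernel you are running (this assumes the install 
put modules under /lib/modules for some kernel):

uname -r                        # running kernel
modinfo -n lnet                 # path of the module modprobe would load
modinfo lnet | grep vermagic    # kernel the module was built against

If the vermagic doesn't match "uname -r", the build was done against the wrong 
kernel source.  (Note also that depmod's positional argument is a kernel version, 
not a module name, so "depmod lnet" will always complain about a bad version; 
plain "depmod -a" is what refreshes the module dependency files.)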

--Rick

On 9/26/23, 12:29 PM, "lustre-discuss on behalf of Jan Andersen" <j...@comind.io> wrote:


I have come a bit further with this problem - it seems the lnet module 
can't load:


[root@rocky8 lustre-release]# depmod lnet
depmod: ERROR: Bad version passed lnet


I deleted the VMs and reinstalled Rocky 8.8, then built lustre 2.15.3 
and installed it, everything without any error messages. I haven't been 
able to find any indication of what this message means through google, 
but I assume it would mean that the kernel source doesn't match the 
running kernel? But how well must they match? This is my running kernel:


[root@rocky8 lustre]# uname -r
4.18.0-477.10.1.el8_8.x86_64


And this is the kernel source:


[root@rocky8 lustre]# ll /usr/src/kernels
total 4
drwxr-xr-x. 23 root root 4096 Sep 26 12:34 4.18.0-477.27.1.el8_8.x86_64/


IOW, they diverge just after '477.' - is that the problem?


/jan


Hi,


I've built and installed lustre on two VirtualBoxes running Rocky 8.8 
and formatted one as the MGS/MDS and the other as OSS, following a 
presentation from Oak Ridge National Laboratory: "Creating a Lustre Test 
System from Source with Virtual Machines" (sorry, no link; it was a 
while ago I downloaded them).


I can mount the filesystems on the MDS, but when I try from the OSS, it 
just times out - from dmesg:


[root@oss1 log]# dmesg | grep -i lustre
[ 564.028680] Lustre: Lustre: Build Version: 2.15.58_42_ga54a206
[ 625.567672] LustreError: 15f-b: lustre-OST: cannot register this 
server with the MGS: rc = -110. Is the MGS running?
[ 625.567767] LustreError: 
1789:0:(tgt_mount.c:2216:server_fill_super()) Unable to start targets: -110
[ 625.567851] LustreError: 1789:0:(tgt_mount.c:1752:server_put_super()) 
no obd lustre-OST
[ 625.567894] LustreError: 
1789:0:(tgt_mount.c:132:server_deregister_mount()) lustre-OST not 
registered
[ 625.588244] Lustre: server umount lustre-OST complete
[ 625.588251] LustreError: 
1789:0:(tgt_mount.c:2365:lustre_tgt_fill_super()) Unable to mount (-110)


Both 'nmap' and 'netstat -nap' show that there is nothing listening on 
port 988:


[root@mds ~]# netstat -nap | grep -i listen
tcp 0 0 0.0.0.0:111 0.0.0.0:* LISTEN 1/systemd
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 806/sshd
tcp6 0 0 :::111 :::* LISTEN 1/systemd
tcp6 0 0 :::22 :::* LISTEN 806/sshd


What should be listening on 988?


/jan


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org 
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Cannot mount MDT after upgrading from Lustre 2.12.6 to 2.15.3

2023-09-26 Thread Mohr, Rick via lustre-discuss
Typically after an upgrade you do not need to perform a writeconf.  Did you 
perform the writeconf only on the MDT?  If so, that could be your problem.  
When you do a writeconf to regenerate the lustre logs, you need to follow the 
whole procedure listed in the lustre manual.  You can try that to see if it 
fixes your issue.
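
For reference, the full procedure is roughly the following (just a sketch based on 
the manual's "regenerating configuration logs" steps; device names are placeholders):

# 1. unmount all clients, then all OSTs, then the MDT(s)
# 2. regenerate the logs on every target, MGS/MDT first, then each OST
tunefs.lustre --writeconf /dev/md0            # combined MGS/MDT in your case
tunefs.lustre --writeconf /dev/<ost_device>   # on each OSS, for every OST
# 3. remount in order: MGS/MDT first, then the OSTs, then the clients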

--Rick

On 9/23/23, 2:22 PM, "lustre-discuss on behalf of Tung-Han Hsieh via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


Dear All,


Today we tried to upgrade Lustre file system from version 2.12.6 to 2.15.3. But 
after the work, we cannot mount MDT successfully. Our MDT is ldiskfs backend. 
The procedure of upgrade is




1. Install the new version of e2fsprogs-1.47.0
2. Install Lustre-2.15.3
3. After reboot, run: tunefs.lustre --writeconf /dev/md0




Then when mounting MDT, we got the error message in dmesg:




===
[11662.434724] LDISKFS-fs (md0): mounted filesystem with ordered data mode. 
Opts: user_xattr,errors=remount-ro,no_mbcache,nodelalloc
[11662.584593] Lustre: 3440:0:(scrub.c:189:scrub_file_load()) chome-MDT: 
reset scrub OI count for format change (LU-16655)
[11666.036253] Lustre: MGS: Logs for fs chome were removed by user request. All 
servers must be restarted in order to regenerate the logs: rc = 0
[11666.523144] Lustre: chome-MDT: Imperative Recovery not enabled, recovery 
window 300-900
[11666.594098] LustreError: 3440:0:(mdd_device.c:1355:mdd_prepare()) 
chome-MDD: get default LMV of root failed: rc = -2
[11666.594291] LustreError: 
3440:0:(obd_mount_server.c:2027:server_fill_super()) Unable to start targets: -2
[11666.594951] Lustre: Failing over chome-MDT
[11672.868438] Lustre: 3440:0:(client.c:2295:ptlrpc_expire_one_request()) @@@ 
Request sent has timed out for slow reply: [sent 1695492248/real 1695492248] 
req@5dfd9b53 x1777852464760768/t0(0) 
o251->MGC192.168.32.240@o2ib@0@lo:26/25 lens 224/224 e 0 to 1 dl 1695492254 ref 
2 fl Rpc:XNQr/0/ rc 0/-1 job:''
[11672.925905] Lustre: server umount chome-MDT complete
[11672.926036] LustreError: 3440:0:(super25.c:183:lustre_fill_super()) llite: 
Unable to mount : rc = -2
[11872.893970] LDISKFS-fs (md0): mounted filesystem with ordered data mode. 
Opts: (null)

Could anyone help to solve this problem ? Sorry that it is really urgent.




Thank you very much.




T.H.Hsieh





___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] getting without inodes

2023-08-15 Thread Mohr, Rick via lustre-discuss
Carlos,

One thing you should look at is the stripe counts for your files.  For T1, the 
mdt has used about 56M inodes and each of the 16 osts has used about 30M 
inodes.  This means that the average stripe count for a file is around 8.  If 
there are some very small files, it's possible that those files have inodes 
allocated on osts for objects that don’t contain any data.  I don't know what 
the space usage looks like for your osts, but if there is room, you could try 
using "lfs migrate" to restripe files to stripe count 1.  If you were able to 
do this for all files on T1, then each ost would only be using about 3M to 4M 
inodes instead of 30M.  I don't know how T1 is being used, what the file size 
distribution looks like, etc. so you will have to decide if that is a viable 
option.  T0 looks like it might be on a similar trajectory as T1.  There are 
about 1.5M inodes used on the mdt and about 1M used on each of the 22 osts.  
This means the average stripe count for a file looks to be around 14-15.  At 
that rate, there could be a similar imbalance in inode usage between the mdt 
and osts at some point.
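
If you do go that route, a rough sketch of the restriping (the stripe-count filter 
and path are only illustrative):

# restripe everything on T1 that is wider than one OST down to a single object
lfs find /lustre/t1 -type f --stripe-count +1 |
    while read f; do lfs migrate -c 1 "$f"; done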

Do your users customize the stripe counts for their files?  Or are they like 
most users and just use whatever default stripe count is set on the file 
system?  You might need to do some digging to see if files are being striped 
appropriately based on their size.  If you find lots of small files with large 
stripe counts, you could consider changing the default stripe count or maybe 
look into using PFL to define a default file layout that increases in stripe 
count as the file size increases.
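
For example, a PFL default layout along those lines might look like this (the 
extent boundaries and stripe counts are only illustrative and would need tuning 
for your workload):

# small files on 1 OST, medium files on 4, large files striped across 16
lfs setstripe -E 256M -c 1 -E 16G -c 4 -E -1 -c 16 /lustre/t1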

--Rick


On 8/10/23, 6:17 PM, "lustre-discuss on behalf of Carlos Adean via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


Hello experts,

We have a Lustre with two tiers T0(SSD) and T1(HDD), the first with 70TB and 
the second one with ~500TB.
I'm experiencing a problem where T1 has far fewer inodes than T0 and is running 
out of inodes on the OSTs, so I'd like to understand the source of this and how 
to fix it.

Thanks in advance.



=== T0

$ lfs df -i /lustre/t0
UUID Inodes IUsed IFree IUse% Mounted on
t0-MDT_UUID 390627328 1499300 389128028 1% /lustre/t0[MDT:0]
t0-OST_UUID 14651392 1097442 13553950 8% /lustre/t0[OST:0]
t0-OST0001_UUID 14651392 1097492 13553900 8% /lustre/t0[OST:1]
t0-OST0002_UUID 14651392 1097331 13554061 8% /lustre/t0[OST:2]
t0-OST0003_UUID 14651392 1097563 13553829 8% /lustre/t0[OST:3]
t0-OST0004_UUID 14651392 1097576 13553816 8% /lustre/t0[OST:4]
t0-OST0005_UUID 14651392 1097505 13553887 8% /lustre/t0[OST:5]
t0-OST0006_UUID 14651392 1097524 13553868 8% /lustre/t0[OST:6]
t0-OST0007_UUID 14651392 1097596 13553796 8% /lustre/t0[OST:7]
t0-OST0008_UUID 14651392 1097442 13553950 8% /lustre/t0[OST:8]
t0-OST0009_UUID 14651392 1097563 13553829 8% /lustre/t0[OST:9]
t0-OST000a_UUID 14651392 1097515 13553877 8% /lustre/t0[OST:10]
t0-OST000b_UUID 14651392 1096524 13554868 8% /lustre/t0[OST:11]
t0-OST000c_UUID 14651392 1096608 13554784 8% /lustre/t0[OST:12]
t0-OST000d_UUID 14651392 1096524 13554868 8% /lustre/t0[OST:13]
t0-OST000e_UUID 14651392 1096641 13554751 8% /lustre/t0[OST:14]
t0-OST000f_UUID 14651392 1096647 13554745 8% /lustre/t0[OST:15]
t0-OST0010_UUID 14651392 1096705 13554687 8% /lustre/t0[OST:16]
t0-OST0011_UUID 14651392 1096616 13554776 8% /lustre/t0[OST:17]
t0-OST0012_UUID 14651392 1096520 13554872 8% /lustre/t0[OST:18]
t0-OST0013_UUID 14651392 1096598 13554794 8% /lustre/t0[OST:19]
t0-OST0014_UUID 14651392 1096669 13554723 8% /lustre/t0[OST:20]
t0-OST0015_UUID 14651392 1096570 13554822 8% /lustre/t0[OST:21]


filesystem_summary: 299694753 1499300 298195453 1% /lustre/t0


=== T1
$ lfs df -i /lustre/t1
UUID Inodes IUsed IFree IUse% Mounted on
t1-MDT_UUID 1478721536 56448788 1422272748 4% /lustre/t1[MDT:0]
t1-OST_UUID 30492032 30491899 133 100% /lustre/t1[OST:0]
t1-OST0001_UUID 30492032 30491990 42 100% /lustre/t1[OST:1]
t1-OST0002_UUID 30492032 30491916 116 100% /lustre/t1[OST:2]
t1-OST0003_UUID 30492032 27471050 3020982 91% /lustre/t1[OST:3]
t1-OST0004_UUID 30492032 30491989 43 100% /lustre/t1[OST:4]
t1-OST0005_UUID 30492032 30491960 72 100% /lustre/t1[OST:5]
t1-OST0006_UUID 30492032 30491948 84 100% /lustre/t1[OST:6]
t1-OST0007_UUID 30492032 30491939 93 100% /lustre/t1[OST:7]
t1-OST0008_UUID 30492032 29811803 680229 98% /lustre/t1[OST:8]
t1-OST0009_UUID 30492032 29808261 683771 98% /lustre/t1[OST:9]
t1-OST000a_UUID 30492032 29809919 682113 98% /lustre/t1[OST:10]
t1-OST000b_UUID 30492032 29807585 684447 98% /lustre/t1[OST:11]
t1-OST000c_UUID 30492032 29809171 682861 98% /lustre/t1[OST:12]
t1-OST000d_UUID 30492032 29804206 687826 98% /lustre/t1[OST:13]
t1-OST000e_UUID 30492032 29806399 685633 98% /lustre/t1[OST:14]
t1-OST000f_UUID 30492032 29802857 689175 98% /lustre/t1[OST:15]


filesystem_summary: 64946408 56448788 8497620 87% /lustre/t1



Re: [lustre-discuss] [EXTERNAL] MDTs will only mount read only

2023-06-21 Thread Mohr, Rick via lustre-discuss
Mike,

On the off chance that the recovery process is causing the issue, you could try 
mounting the mdt with the "abort_recov" option and see if the behavior changes.
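
i.e. something along the lines of (device and mount point are placeholders):

mount -t lustre -o abort_recov /dev/<mdt_device> /mnt/mdt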

--Rick



On 6/21/23, 2:33 PM, "lustre-discuss on behalf of Jeff Johnson" <jeff.john...@aeoncomputing.com> wrote:


Maybe someone else in the list can add clarity but I don't believe a recovery 
process on mount would keep the MDS read-only or trigger that trace. Something 
else may be going on. 


I would start from the ground up. Bring your servers up, unmounted. Ensure lnet 
is loaded and configured properly. Test lnet using ping or lnet_selftest from 
your MDS to all of your OSS nodes. Then mount your combined MGS/MDT volume on 
the MDS and see what happens. 




Is your MDS in a high-availability pair? 
What version of Lustre are you running? 




...just a few things readers on the list might want to know.




--Jeff

On Wed, Jun 21, 2023 at 11:21 AM Mike Mosley <mike.mos...@charlotte.edu> wrote:


Jeff,


At this point we have the OSS shut down. We were coming back from a full outage 
and so we are trying to get the MDS up before starting to bring up the OSS.




Mike




On Wed, Jun 21, 2023 at 2:15 PM Jeff Johnson <jeff.john...@aeoncomputing.com> wrote:


Mike,


Have you made sure that the o2ib interfaces on all of your Lustre servers (MDS & 
OSS) are functioning properly? Are you able to `lctl ping x.x.x.x@o2ib` 
successfully between MDS and OSS nodes?




--Jeff

On Wed, Jun 21, 2023 at 10:08 AM Mike Mosley via lustre-discuss <lustre-discuss@lists.lustre.org> wrote:


Rick,

172.16.100.4 is the IB address of one of the OSS servers. I believe the mgt and 
mdt0 are the same target. My understanding is that we have a single instance of 
the MGT, which is on the first MDT server, i.e. it was created via a command 
similar to:




# mkfs.lustre --fsname=scratch --index=0 --mdt --mgs --replace /dev/sdb 






Does that make sense?

On Wed, Jun 21, 2023 at 12:55 PM Mohr, Rick <moh...@ornl.gov> wrote:


Which host is 172.16.100.4? Also, are the mgt and mdt0 on the same target or 
are they two separate targets just on the same host?


--Rick




On 6/21/23, 12:52 PM, "Mike Mosley" <mike.mos...@charlotte.edu> wrote:




Hi Rick,




The MGS/MDS are combined. The output I posted is from the primary.

Thanks,

Mike

On Wed, Jun 21, 2023 at 12:27 PM Mohr, Rick <moh...@ornl.gov> wrote:




Mike,




It looks like the mds server is having a problem contacting the mgs server. I'm 
guessing the mgs is a separate host? I would start by looking for possible 
network problems that might explain the LNet timeouts. You can try using "lctl 
ping" to test the LNet connection between nodes, and you can also try regular 
"ping" between the IP addresses on the IB interfaces.




--Rick

On 6/21/23, 11:35 AM, "lustre-discuss on behalf of Mike Mosley via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:

Greetings,

We have experienced some type of issue that is causing both of our MDS servers 
to only be able to mount the mdt device in read only mode. Here are some of the 
error messages we are seeing in the log files below. We lost our Lustre expert 
a while back and we are not sure how to proceed to troubleshoot this issue. Can 
anybody provide us guidance on how to proceed?

Thanks,

Mike

Jun 20 15:12:14 hyd-mds1 kernel: INFO: task mount.lustre:4123 blocked for more 
than 120 seconds.
Jun 20 15:12:14 hyd-mds1 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 15:12:14 hyd-mds1 kernel: mount.lustre D 9f27a3bc5230 0 4123 1 
0x0086
Jun 20 

Re: [lustre-discuss] [EXTERNAL] MDTs will only mount read only

2023-06-21 Thread Mohr, Rick via lustre-discuss
Mike,

It looks like the mds server is having a problem contacting the mgs server.  
I'm guessing the mgs is a separate host?  I would start by looking for possible 
network problems that might explain the LNet timeouts.  You can try using "lctl 
ping" to test the LNet connection between nodes, and you can also try regular 
"ping" between the IP addresses on the IB interfaces.

--Rick


On 6/21/23, 11:35 AM, "lustre-discuss on behalf of Mike Mosley via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


Greetings,


We have experienced some type of issue that is causing both of our MDS servers 
to only be able to mount the mdt device in read only mode. Here are some of the 
error messages we are seeing in the log files below. We lost our Lustre expert 
a while back and we are not sure how to proceed to troubleshoot this issue. Can 
anybody provide us guidance on how to proceed?




Thanks,




Mike




Jun 20 15:12:14 hyd-mds1 kernel: INFO: task mount.lustre:4123 blocked for more 
than 120 seconds.
Jun 20 15:12:14 hyd-mds1 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 15:12:14 hyd-mds1 kernel: mount.lustre D 9f27a3bc5230 0 4123 1 
0x0086
Jun 20 15:12:14 hyd-mds1 kernel: Call Trace:
Jun 20 15:12:14 hyd-mds1 kernel: [] schedule+0x29/0x70
Jun 20 15:12:14 hyd-mds1 kernel: [] 
schedule_timeout+0x221/0x2d0
Jun 20 15:12:14 hyd-mds1 kernel: [] ? tracing_is_on+0x15/0x30
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
tracing_record_cmdline+0x1d/0x120
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
probe_sched_wakeup+0x2b/0xa0
Jun 20 15:12:14 hyd-mds1 kernel: [] ? ttwu_do_wakeup+0xb5/0xe0
Jun 20 15:12:14 hyd-mds1 kernel: [] 
wait_for_completion+0xfd/0x140
Jun 20 15:12:14 hyd-mds1 kernel: [] ? wake_up_state+0x20/0x20
Jun 20 15:12:14 hyd-mds1 kernel: [] 
llog_process_or_fork+0x244/0x450 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] llog_process+0x14/0x20 
[obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
class_config_parse_llog+0x125/0x350 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
mgc_process_cfg_log+0x790/0xc40 [mgc]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
mgc_process_log+0x3dc/0x8f0 [mgc]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
config_recover_log_add+0x13f/0x280 [mgc]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
class_config_dump_handler+0x7e0/0x7e0 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
mgc_process_config+0x88b/0x13f0 [mgc]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
lustre_process_log+0x2d8/0xad0 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
libcfs_debug_msg+0x57/0x80 [libcfs]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
lprocfs_counter_add+0xf9/0x160 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
server_start_targets+0x13a4/0x2a20 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
lustre_start_mgc+0x260/0x2510 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
class_config_dump_handler+0x7e0/0x7e0 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
server_fill_super+0x10cc/0x1890 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] 
lustre_fill_super+0x468/0x960 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
lustre_common_put_super+0x270/0x270 [obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] mount_nodev+0x4f/0xb0
Jun 20 15:12:14 hyd-mds1 kernel: [] lustre_mount+0x38/0x60 
[obdclass]
Jun 20 15:12:14 hyd-mds1 kernel: [] mount_fs+0x3e/0x1b0
Jun 20 15:12:14 hyd-mds1 kernel: [] vfs_kern_mount+0x67/0x110
Jun 20 15:12:14 hyd-mds1 kernel: [] do_mount+0x1ef/0xd00
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
__check_object_size+0x1ca/0x250
Jun 20 15:12:14 hyd-mds1 kernel: [] ? 
kmem_cache_alloc_trace+0x3c/0x200
Jun 20 15:12:14 hyd-mds1 kernel: [] SyS_mount+0x83/0xd0
Jun 20 15:12:14 hyd-mds1 kernel: [] 
system_call_fastpath+0x25/0x2a
Jun 20 15:13:14 hyd-mds1 kernel: LNet: 
4458:0:(o2iblnd_cb.c:3397:kiblnd_check_conns()) Timed out tx for 
172.16.100.4@o2ib: 9 seconds
Jun 20 15:13:14 hyd-mds1 kernel: LNet: 
4458:0:(o2iblnd_cb.c:3397:kiblnd_check_conns()) Skipped 239 previous similar 
messages
Jun 20 15:14:14 hyd-mds1 kernel: INFO: task mount.lustre:4123 blocked for more 
than 120 seconds.
Jun 20 15:14:14 hyd-mds1 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 20 15:14:14 hyd-mds1 kernel: mount.lustre D 9f27a3bc5230 0 4123 1 
0x0086






dumpe2fs seems to show that the file systems are clean i.e.




dumpe2fs 1.45.6.wc1 (20-Mar-2020)
Filesystem volume name: hydra-MDT
Last mounted on: /
Filesystem UUID: 3ae09231-7f2a-43b3-a4ee-7f36080b5a66
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index filetype mmp 
flex_bg dirdata sparse_super large_file huge_file uninit_bg dir_nlink quota
Filesystem flags: signed_directory_hash 
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 2247671504
Block count: 

Re: [lustre-discuss] [EXTERNAL] I/O error on lctl ping although ibping successful

2023-06-20 Thread Mohr, Rick via lustre-discuss
Have you tried tcp pings on the IP addresses associated with the IB interfaces?

--Rick


On 6/20/23, 12:11 PM, "lustre-discuss on behalf of Youssef Eldakar via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


In a cluster having ~100 Lustre clients (compute nodes) connected together with 
the MDS and OSS over Intel True Scale InfiniBand (discontinued product), we 
started seeing certain nodes failing to mount the Lustre file system and giving 
I/O error on LNET (lctl) ping even though an ibping test to the MDS gives no 
errors. We tried rebooting the problematic nodes and even fresh-installing the 
OS and Lustre client, which did not help. However, rebooting the MDS seems to 
possibly momentarily help after the MDS starts up again, but the same set of 
problematic nodes seem to always eventually revert back to the state where they 
fail to ping the MDS over LNET.


Thank you for any pointers we may pursue.




Youssef Eldakar
Bibliotheca Alexandrina
www.bibalex.org
hpc.bibalex.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: ZFS zpool/filesystem operations while mounted with '-t lustre'

2023-05-19 Thread Mohr, Rick via lustre-discuss


On 5/18/23, 10:48 AM, "lustre-discuss on behalf of Peter Grandi via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:

> BTW I wonder whether I should bother to do the same for the MGT -- on 
> one hand it is tiny, on the other my guess is that its state is pretty 
> much irrelevant in case of restoring the MDT.

You should be able to recreate all the MGT state just by doing a writeconf on 
the file system, so I'm not sure it's worth going through the whole zfs 
send/receive process in order to create a backup.  There are some things that 
don't get recreated during a writeconf (ost pool definitions, lctl conf_param 
settings, etc.) so it would be worth just putting those commands in a script 
some place where you could easily rerun it.

--Rick

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: ZFS zpool/filesystem operations while mounted with '-t lustre'

2023-05-18 Thread Mohr, Rick via lustre-discuss


On 5/18/23, 10:15 AM, "lustre-discuss on behalf of Peter Grandi via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:

  I was indeed reading that but I was a bit hesitant because the 
  "zpool"/"zfs" operations are bracketed by 'service lustre stop 
  ...'/'service lustre start ...' commands which I hope to avoid.

You only need to stop lustre if you are planning to decommission the old 
hardware and switch over entirely to the new hardware.  In that case, stopping 
lustre is needed to ensure that no new content is created on the mdt during the 
final sync.  If you are just trying to keep an on-going backup of the mdt on 
another zpool, you can just keep doing incremental snapshots and don't have to 
stop lustre (with the knowledge that your backup of the mdt data will be 
slightly old).

--Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ZFS zpool/filesystem operations while mounted with '-t lustre'

2023-05-18 Thread Mohr, Rick via lustre-discuss
Peter,

You might want to take a look at this: 
https://www.opensfs.org/wp-content/uploads/2017/06/Wed06-CroweTom-lug17-ost_data_migration_using_ZFS.pdf

It's a few years old, but it shows how IU used zfs send/receive to copy data 
from osts.  I worked with someone several years ago to do basically the same 
procedure to make incremental copies of mdt data when we were attempting to 
switch over to a new mds server.  You don't need to unmount lustre in order to 
do the incremental backups. 
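
The basic pattern is something like this (pool/dataset names are placeholders; 
-R sends the dataset properties, including the Lustre ones, along with the data):

# initial full copy
zfs snapshot mdtpool/mdt@backup1
zfs send -R mdtpool/mdt@backup1 | zfs recv backuppool/mdt

# later, incremental updates
zfs snapshot mdtpool/mdt@backup2
zfs send -R -i @backup1 mdtpool/mdt@backup2 | zfs recv -F backuppool/mdt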

--Rick



On 5/18/23, 9:38 AM, "lustre-discuss on behalf of Peter Grandi via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


I have a Lustre 2.15.2 instance "temp01" on ZFS 2.1.5 (on EL8), and I 
just want to backup the MDT of the instance (I am mirroring the data on 
two separate "pools" of servers).


The "zpool" is called "temp01_mdt_000" and so is the filesystem, so the 
'/etc/fstab' mount line is (I have set legacy ZFS mount-points):


> temp01_mdt_000/temp01_mdt_000 /srv/temp01/temp01_mdt_000 lustre 
> defaults,noatime,auto 0 0


As usual '/srv/temp01/temp01_mdt_000' is opaque (no access permissions) 
when mounted as 'lustre', but anyhow I would like to backup a snapshot 
of it, using 'zfs send'.


If I use 'zpool list' and 'zfs list' I see the relevant details even if 
it is mounted as 'lustre'. If I un-mount it and re-mount it as 'zfs' it 
looks ordinary, but I would rather not do that.


In theory, however it is mounted, I should be able to (suitably preceded 
by 'lctl barrier_freeze') create a snapshot for it, and mount that as 
'zfs' to some other mount-point and then 'zfs send' that.


Before doing it on a live Lustre instance (I can't easily afford to 
setup a test Lustre instance in the short run) I would like some 
confirmation that this is meant to work ideally by someone who does it 
routinely :-).


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org 
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Mounting lustre on block device

2023-03-16 Thread Mohr, Rick via lustre-discuss
Are you asking if you can mount Lustre on a client so that it shows up as a 
block device?  If so, the answer to that is you can't.  Lustre does not appear 
as a block device to the clients.

-Rick



On 3/16/23, 3:44 PM, "lustre-discuss on behalf of Shambhu Raje via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


When we mount a Lustre file system on a client, it does not use a block device on 
the client side; instead it uses a virtual file system namespace. The mount point 
does not show up in 'lsblk', only in 'df -hT'.


How can we mount a Lustre file system on a block device, so that whatever we 
write with Lustre shows up on a block device? Can you share the command?

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] tunefs.lustre safe way to get config

2023-02-24 Thread Mohr, Rick via lustre-discuss
Sid,

As far as I know, it is safe to run tunefs.lustre on a live file system as long 
as you aren’t making changes.  The man page doesn't show a --print option for 
tunefs.lustre, but I usually just run "tunefs.lustre --dryrun " to 
print the info.  (Note: I have only done this with ldiskfs backends before, but 
I don't think zfs should be any different)
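
So for the two MDTs that would be something like (dataset names taken from your 
message):

tunefs.lustre --dryrun mdthome/home
tunefs.lustre --dryrun mdtlustre/lustre

and similarly for each of the OST datasets.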

--Rick


On 2/23/23, 8:07 PM, "lustre-discuss on behalf of Sid Young via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


G'Day all,


I need to review the IP's assigned during the initial mkfs.lustre on ten ZFS 
based OST's and two ZFS backed MDT's.




The ZFS disks are:
osthome0/ost0, osthome1/ost1, osthome2/ost2, osthome3/ost3,
ostlustre0/ost0, ostlustre1/ost1, ostlustre2/ost2, ostlustre3/ost3,
ostlustre4/ost4, ostlustre5/ost5

and

mdsthome/home
mdtlustre/lustre




A few questions




Is it safe to use tunefs.lustre on the running system to read back the 
parameters only? or do I have to shut everything down and read from the 
unmounted filesystems?




Are these the correct commands to use for the MDTs?




tunefs.lustre --print mdthome/home


tunefs.lustre --print mdtlustre/lustre

Sid Young

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ZFS Support for Lustre

2023-02-24 Thread Mohr, Rick via lustre-discuss
Nick,

You can run "rpm -ql zfs" to get a list of all the files in that package to see 
where the zfs module was installed.
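
For example (and since the module here comes from zfs-dkms, it also needs to have 
been built for the _lustre kernel, not just the stock distro kernel -- "dkms status" 
shows which kernels it was built for):

rpm -ql zfs zfs-dkms | grep -i 'zfs.ko'
dkms status
find /lib/modules/$(uname -r) -name 'zfs.ko*'

If dkms never built the module for the 4.18.0-425.3.1.el8_lustre kernel, something 
like "dkms autoinstall -k 4.18.0-425.3.1.el8_lustre.x86_64" (as root) should build 
and install it.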

-Rick


On 2/24/23, 5:56 AM, "lustre-discuss on behalf of Nick dan via lustre-discuss" <lustre-discuss@lists.lustre.org> wrote:


Hi


I have installed following packages of Lustre and ZFS




LUSTRE
[root@1u fs]# rpm -qa | grep lustre
kmod-lustre-tests-debuginfo-2.15.2-1.el8.x86_64
lustre-zfs-dkms-2.15.2-1.el8.noarch
kernel-debuginfo-common-x86_64-4.18.0-425.3.1.el8_lustre.x86_64
kernel-selftests-internal-4.18.0-425.3.1.el8_lustre.x86_64
kernel-core-4.18.0-425.3.1.el8_lustre.x86_64
kernel-devel-4.18.0-425.3.1.el8_lustre.x86_64
lustre-osd-zfs-mount-2.15.2-1.el8.x86_64
lustre-debuginfo-2.15.2-1.el8.x86_64
lustre-tests-debuginfo-2.15.2-1.el8.x86_64
kernel-modules-internal-4.18.0-425.3.1.el8_lustre.x86_64
kernel-debuginfo-4.18.0-425.3.1.el8_lustre.x86_64
lustre-iokit-2.15.2-1.el8.x86_64
kernel-ipaclones-internal-4.18.0-425.3.1.el8_lustre.x86_64
python3-perf-4.18.0-425.3.1.el8_lustre.x86_64
perf-4.18.0-425.3.1.el8_lustre.x86_64
kmod-lustre-osd-zfs-2.15.2-1.el8.x86_64
kmod-lustre-osd-zfs-debuginfo-2.15.2-1.el8.x86_64
kernel-headers-4.18.0-425.3.1.el8_lustre.x86_64
kernel-4.18.0-425.3.1.el8_lustre.x86_64
kmod-lustre-2.15.2-1.el8.x86_64
kernel-modules-extra-4.18.0-425.3.1.el8_lustre.x86_64
kmod-lustre-tests-2.15.2-1.el8.x86_64
lustre-debugsource-2.15.2-1.el8.x86_64
kernel-modules-4.18.0-425.3.1.el8_lustre.x86_64
kmod-lustre-debuginfo-2.15.2-1.el8.x86_64
perf-debuginfo-4.18.0-425.3.1.el8_lustre.x86_64
lustre-osd-ldiskfs-mount-2.15.2-1.el8.x86_64
lustre-osd-zfs-mount-debuginfo-2.15.2-1.el8.x86_64

ZFS
lustre-zfs-dkms-2.15.2-1.el8.noarch
libzfs5-devel-2.1.5-1.el8.x86_64
zfs-debuginfo-2.1.5-1.el8.x86_64
lustre-osd-zfs-mount-2.15.2-1.el8.x86_64
zfs-2.1.5-1.el8.x86_64
python3-pyzfs-2.1.5-1.el8.noarch
zfs-test-2.1.5-1.el8.x86_64
kmod-lustre-osd-zfs-2.15.2-1.el8.x86_64
kmod-lustre-osd-zfs-debuginfo-2.15.2-1.el8.x86_64
zfs-debugsource-2.1.5-1.el8.x86_64
libzfs5-debuginfo-2.1.5-1.el8.x86_64
libzfs5-2.1.5-1.el8.x86_64
zfs-dkms-2.1.5-1.el8.noarch
zfs-test-debuginfo-2.1.5-1.el8.x86_64
lustre-osd-zfs-mount-debuginfo-2.15.2-1.el8.x86_64

When I am trying to modprobe zfs this is the error I am getting
I have rebooted the server




[root@1u user]# modprobe zfs


modprobe: FATAL: Module zfs not found in directory 
/lib/modules/4.18.0-425.3.1.el8_lustre.x86_64

FYI
[root@1u user]# cd 
/lib/modules/4.18.0-425.3.1.el8_lustre.x86_64/extra/lustre-osd-zfs/fs
[root@1u fs]# ls
osd_zfs.ko






Can you help with the error?




Regards
Nick Dan

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Accessing files with bad PFL causing MDS kernel panics

2022-10-25 Thread Mohr, Rick via lustre-discuss
Nate,

For the example layout you attached, it looks like the file does not have any 
data in the components with the messed up extent_end value.  Have you tried 
using "lfs setstripe --component-del" to delete just those messed up components 
and see if you can then access the data?
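
Something along these lines (the component ID comes from the getstripe output; 
the path is a placeholder):

# list the components and their IDs (lcme_id) for the file
lfs getstripe -v /path/to/badfile

# delete the component with the bogus extent_end
lfs setstripe --component-del -I <lcme_id> /path/to/badfile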

--Rick


On 10/25/22, 4:43 PM, "lustre-discuss on behalf of Nathan Crawford" wrote:

Hi All,
  I'm looking for possible work-arounds to recover data from some 
mis-migrated files (as seen in  LU-16152). Basically, there's a bug in "lfs 
setstripe --yaml" where extent start/end values in the yaml file >= 2GiB 
overflow to 16 EiB - 2 GiB.

  Using lfs_migrate, I re-striped many files in directories with a default 
striping pattern containing these values.  I'm pretty sure that the data exists 
(was trying to purge an older OST, and disk usage on the other OSTs increased 
as the purged OST decreased), and an lfsck procedure happily returns after a 
day or so. Unfortunately, attempts to access or re-migrate the files triggers a 
kernel panic on the MDS with:

LustreError: 12576:0:(osd_io.c:311:kmem_to_page()) ASSERTION( !((unsigned 
long)addr & ~(~(((1UL) << 12)-1))) ) failed:
LustreError: 12576:0:(osd_io.c:311:kmem_to_page()) LBUG

Kernel panic - not syncing: LBUG


 The servers are lustre 2.12.8 on OpenZFS 0.8.5 on CentOS 7.9. The output 
from "lfs getstripe -v badfile" is attached.

  I can use lfs find to search for files with these bad extent endpoint 
values, then move them to a quarantine area on the same FS. This will allow the 
rest of the system to stay up (hopefully) but recovering the data is still 
needed.

Thanks!
Nate

-- 
Dr. Nathan Crawford  nathan.crawf...@uci.edu
Director of Scientific Computing
School of Physical Sciences
164 Rowland Hall Office: 152 Rowland Hall
University of California, Irvine  Phone: 949-824-1380
Irvine, CA 92697-2025, USA

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ran out of MDT inodes

2022-09-16 Thread Mohr, Rick via lustre-discuss
Liam,

I found the slides for the talk I was referring to:

https://www.opensfs.org/wp-content/uploads/2017/06/Wed06-CroweTom-lug17-ost_data_migration_using_ZFS.pdf

The example in the talk used the -R option (as Shane mentioned).

-Rick

On 9/16/22, 1:44 PM, "Nehring, Shane R [LAS]"  wrote:

You would probably also want to make sure you specify either -p or -b (or 
-R)
when you do the zfs send to make sure the properties are sent as well. 
Lustre
stores some information about the volume some custom properties. Forgetting 
that
has bitten me in the past.

Shane

On Fri, 2022-09-16 at 17:01 +0000, Mohr, Rick via lustre-discuss wrote:
> Liam,
> 
> As far as I know, all the data contained in ZFS would be transferred 
(Lustre
> or otherwise).  We were able to move an mdt to new storage on a different
> server with no known issues.  There was also a talk at LUG several years 
ago
> given by a site that would use ZFS snapshots as a safeguard during Lustre
> upgrades so that they could easily roll back if needed.  Perhaps that 
would be
> of use to you.  I can't remember which year it was, but it may have been 
the
> one hosted at IU.  Anyway, all the old LUG talks/slides should be online 
so if
> you dig around you should be able to find it. 
> 
> -Rick
> 
> On 9/16/22, 12:06 PM, "Liam Forbes"  wrote:
> 
> Morning Rick!
> 
> At first we were planning to do that. However, we weren't sure if that
> would capture all the internal Lustre data. Specifically the note in 
18.3.1
> about index backups made us think it wouldn't for version 2.10.X. Also the
> fact that ldiskfs backups and LVM snapshots are detailed in the manual 
but ZFS
> snapshots are not caused us to think they wouldn't work, at least with our
> older lustre version. Maybe we were taking things too literally. Would a 
ZFS
> snapshot contain all the Lustre data?
> 
> 
> 
> On Fri, Sep 16, 2022 at 7:45 AM Mohr, Rick  wrote:
> 
> 
> Liam,
> 
>  If you have another zpool configured somewhere, you could always 
take a
> snapshot of your mdt and then used send/received to copy that snapshot to
> another zpool.  I helped someone do this one time in order to move the 
mdt to
> new hardware.
> 
> -Rick
> 
> 
> 
> 
> -- 
> Regards,
> -liam
> 
> -There are uncountably more irrational fears than rational ones. -P. 
Dolan
> Liam Forbes   lofor...@alaska.edu   ph:
> 907.450.8618
> UAF GI Research Computing Systems Manager
> hxxps://calendly.com/ualoforbes/30min
> 
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> hxxp://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ran out of MDT inodes

2022-09-16 Thread Mohr, Rick via lustre-discuss
Liam,

As far as I know, all the data contained in ZFS would be transferred (Lustre or 
otherwise).  We were able to move an mdt to new storage on a different server 
with no known issues.  There was also a talk at LUG several years ago given by 
a site that would use ZFS snapshots as a safeguard during Lustre upgrades so 
that they could easily roll back if needed.  Perhaps that would be of use to 
you.  I can't remember which year it was, but it may have been the one hosted 
at IU.  Anyway, all the old LUG talks/slides should be online so if you dig 
around you should be able to find it. 

-Rick

On 9/16/22, 12:06 PM, "Liam Forbes"  wrote:

Morning Rick!

At first we were planning to do that. However, we weren't sure if that 
would capture all the internal Lustre data. Specifically the note in 18.3.1 
about index backups made us think it wouldn't for version 2.10.X. Also the fact 
that ldiskfs backups and LVM snapshots are detailed in the manual but ZFS 
snapshots are not caused us to think they wouldn't work, at least with our 
older lustre version. Maybe we were taking things too literally. Would a ZFS 
snapshot contain all the Lustre data?



On Fri, Sep 16, 2022 at 7:45 AM Mohr, Rick  wrote:


Liam,

 If you have another zpool configured somewhere, you could always take a 
snapshot of your mdt and then used send/received to copy that snapshot to 
another zpool.  I helped someone do this one time in order to move the mdt to 
new hardware.

-Rick




-- 
Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes   lofor...@alaska.edu   ph: 
907.450.8618
UAF GI Research Computing Systems Manager
hxxps://calendly.com/ualoforbes/30min

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ran out of MDT inodes

2022-09-16 Thread Mohr, Rick via lustre-discuss
Liam,

 If you have another zpool configured somewhere, you could always take a 
snapshot of your mdt and then used send/received to copy that snapshot to 
another zpool.  I helped someone do this one time in order to move the mdt to 
new hardware.

-Rick


On 9/14/22, 6:23 PM, "lustre-discuss on behalf of Liam Forbes via 
lustre-discuss"  wrote:

Today, in our lustre 2.10.3 filesystem, the MDT ran out of inodes. We are 
using ZFS as the backing filesystem.

[loforbes@mds02 ~]$ df -i -t lustre
FilesystemInodesIUsed IFree IUse% Mounted on
digdug-meta/lustre2-mgt-mdt 83703636 83703636 0  100% 
/mnt/lustre/local/lustre2-MDT

[loforbes@mds02 ~]$ sudo zpool list -v
NAME   SIZE  ALLOC   FREE  EXPANDSZ   FRAGCAP  DEDUP  HEALTH  ALTROOT
digdug-meta   744G   721G  23.2G -86%96%  1.00x  ONLINE  -
  mirror   372G   368G  4.25G -84%98%
scsi-35000c5003017156b  -  -  - -  -  -
scsi-35000c500301715e7  -  -  - -  -  -
  mirror   372G   353G  19.0G -88%94%
scsi-35000c5003017155f  -  -  - -  -  -
scsi-35000c500301715a7  -  -  - -  -  -

When we try to delete files, we get the error message:
  rm: cannot remove X: No space left on device

Is there a way to unlink files and free up inodes?

Is it possible to expand the existing zpool and filesystem for the MDT?

Is it possible to do a backup of just our MDT? If so, how?


-- 
Regards,
-liam

-There are uncountably more irrational fears than rational ones. -P. Dolan
Liam Forbes   lofor...@alaska.edu   ph: 
907.450.8618
UAF GI Research Computing Systems Manager
hxxps://calendly.com/ualoforbes/30min

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Configuring File Layout Questions

2021-07-13 Thread Mohr, Rick via lustre-discuss
Ellis,

I am not aware of a tool to run lfs commands without having the file system 
mounted.  A lot of those operations (maybe all) are handled via ioctls that 
expect to operate on a file.

A few times on smaller file systems, I have mounted them on one of the lustre 
servers temporarily without any ill effects.  This is not likely something you 
would want to have running in production all the time, but if you are just 
looking for a quick way to run a few "lfs setstripe" or other types of 
administrative commands then it should be fine.  There used to be a time when 
that sort of thing was a big no-no, but I think Lustre has become more 
resilient in recent versions.  I would just avoid doing any heavy IO (like 
benchmarking) using the servers as clients.
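
A minimal sketch of that temporary-mount approach, with placeholder names 
(<mgsnid>, <fsname>):

mount -t lustre <mgsnid>@tcp:/<fsname> /mnt/<fsname>
lfs setstripe -E 1G -c 1 -E -1 -c 4 /mnt/<fsname>/somedir   # or any other lfs command
umount /mnt/<fsname>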

-Rick


On 7/13/21, 3:48 PM, "lustre-discuss on behalf of Ellis Wilson via 
lustre-discuss"  wrote:

Hi Lustre folks,

A few questions around configuring file layouts, specifically 
progressive file layouts:

1. In a freshly stood-up Lustre cluster, if there are no clients yet 
mounted, are there any Lustre utilities (I’ve not found one) that allows one to 
perform the equivalent of “lfs setstripe” without an active mount point (say, 
from the MDS node)?

2. If not, is there a reasonable API against which such a utility could be 
constructed, or is this request at odds with the architecture?

3. In the absence of a separate client to mount the filesystem to perform 
normal “lfs” commands, can one safely mount the cluster directly from an MDS or 
some other node within the Lustre FS proper?  My understanding is that it is 
not safe, but that’s based on hearsay, so I’d love to get a more authoritative 
answer.

Thanks to anybody who can help answer one or more of these!

ellis

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: OST not being used

2021-07-12 Thread Mohr, Rick via lustre-discuss
Alastair,

Sorry this response is a bit late, but I thought I would add a bit of info in 
case you ever run across the problem again.

Since the mds assigns osts for new files, you may want to check if the mds can 
contact the oss server.  If the mds thinks those osts are unavailable, it won't 
assign them to new files (even if the client has no problem accessing those 
osts).   Also, is it possible that an admin marked those osts as inactive on 
the mds?  Sometimes that is done to prevent new data from going to certain osts 
(for instance, if they were getting very full), but the clients still see those 
osts as active which allows them the continue reading from existing files that 
already reside on those osts.
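
For what it's worth, on the MDS something like the following shows how the MDT 
sees each OST (the osp device names follow the <fsname>-OST<index>-osc-MDT<index> 
pattern; the index below is just an example):

lctl dl | grep osp                # list the MDT's OST connections and their state
lctl get_param osp.*.active       # 1 = active, 0 = deactivated by an admin

# re-activate a deactivated OST, if that turns out to be the cause
lctl set_param osp.<fsname>-OST0004-osc-MDT0000.active=1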

Just a couple of possibilities you might want to check if you run into this 
problem again.

-Rick


On 6/23/21, 10:46 AM, "lustre-discuss on behalf of Alastair Basden" 
 
wrote:

Hi Megan,

Thanks - yes, lctl ping responds.

In the end, we did a writeconf, and this seems to have fixed the problem, 
so probably some previous transient.  I would however have expected it to 
heal whilst online - taking the filesystem down and doing a writeconf 
seems a bit drastic!

Cheers,
Alastair.

On Wed, 23 Jun 2021, Ms. Megan Larko via lustre-discuss wrote:

> [EXTERNAL EMAIL]
> Hi!
>
> Does the NIC on the OSS that serves OST 4-7 respond to an lctl ping? 
> You indicated that it does respond to regular ping, ssh, etc.  I would 
> review my /etc/lnet.conf file for the behavior of a NIC that times out. 
> Does the conf allow for asymmetrical routing?  (Is that what you wish?) 
> Is there only one path to those OSTs or is there a way failover NIC 
> address that did not work in this even for some reason?
>
> The Lustre Operations Manual Section 9.1 on lnetctl command shows how you 
can get more info on the NIC ( lnetctl show...)
>
> Good luck.
> megan
>
___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Converting MGS to ZFS - HA Config Question

2021-05-28 Thread Mohr, Rick via lustre-discuss
Sid,

The --failnode option is the sort of the "old way" of configuring failover.  It 
is assumed that the target will always be mounted for the first time on the 
primary server, and so the --failnode option only needs to list the secondary 
server NID (because the primary server NID is implied by whatever host mounts 
the target first).

The --servicenode option is the preferred way of configuring failover now.  It 
does not make any assumptions about primary/secondary nodes, but as a result, 
you need to specify the NIDs of all nodes that could possibly mount the target. 
 The easiest way to do this is to specify the "--servicenode " option 
multiple times (once for each node).

And as you have seen, the two options are not compatible with each other.
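
So for your MGT that would look roughly like this (NIDs taken from your message -- 
double-check them against your hosts):

mkfs.lustre --reformat --mgs --backfstype=zfs \
    --servicenode=10.140.93.41@o2ib --servicenode=10.140.93.42@o2ib \
    mgspool/mgt mirror d3710M0 d3710M1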

-Rick


On 5/27/21, 11:53 PM, "lustre-discuss on behalf of Sid Young via 
lustre-discuss"  wrote:

Hi, 
I am in the process of converting my pre-production cluster to use ZFS, and 
I have a question regarding HA config parameters. The storage node has 24 
disks, I've sliced off two disks in HBA mode to act as a 960G mirror. the 
command is:
# mkfs.lustre --reformat --mgs  --failnode 10.140.93.41@o2ib 
--backfstype=zfs mgspool/mgt mirror d3710M0 d3710M1
This runs successfully and I get the output below, however I want to make 
sure the second MDS node can be failed over too using Pacemaker, so if the 
server I am on now is 10.140.93.42 and the other MDS is 10.140.93.41, do I need 
to specify the host its on now (.42) anywhere in the config? I tried the 
servicenode parameter but it refuses to have servicenode and failnode in the 
command:

   Permanent disk data:
Target: MGS
Index:  unassigned
Lustre FS:
Mount type: zfs
Flags:  0x64
  (MGS first_time update )
Persistent mount opts:
Parameters: failover.node=10.140.93.41@o2ib
mkfs_cmd = zpool create -f -O canmount=off mgspool mirror d3710M0 d3710M1
mkfs_cmd = zfs create -o canmount=off  mgspool/mgt
  xattr=sa
  dnodesize=auto
Writing mgspool/mgt properties
  lustre:failover.node=10.140.93.41@o2ib
  lustre:version=1
  lustre:flags=100
  lustre:index=65535
  lustre:svname=MGS
[root@hpc-mds-02]#


]# zfs list
NAME  USED  AVAIL  REFER  MOUNTPOINT
mgspool   468K   860G96K  /mgspool
mgspool/mgt96K   860G96K  /mgspool/mgt
[root@hpc-mds-02 by-id]# zpool status
  pool: mgspool
 state: ONLINE
  scan: none requested
config:

NAME STATE READ WRITE CKSUM
mgspool  ONLINE   0 0 0
  mirror-0   ONLINE   0 0 0
d3710M0  ONLINE   0 0 0
d3710M1  ONLINE   0 0 0

errors: No known data errors
[root@hpc-mds-02#




Sid Young

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: OST mount issue

2021-04-30 Thread Mohr, Rick via lustre-discuss
One thing you could do would be to verify that all the kernel modules are 
identical.  You can try running 'lsmod' to check that the servers have loaded 
the same set of modules, run 'modinfo' to verify the path to the module that 
was loaded, and then compute a checksum of the kernel module to compare.
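
For example, run the following on each OSS and compare the output (osd_zfs is just 
one example; repeat for lustre, lnet, etc.):

lsmod | grep -E 'lustre|lnet|osd_zfs'
modinfo -n osd_zfs
sha256sum "$(modinfo -n osd_zfs)"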

-Rick

On 4/26/21, 12:27 PM, "lustre-discuss on behalf of Steve Thompson" 
 wrote:

Yes, I believe that something must be different; I just cannot find it. I 
now have six OST systems. All were installed the same way; two work fine 
and four do not. The rpm list:

# rpm -qa | grep lustre
lustre-osd-zfs-mount-2.12.6-1.el7.x86_64
lustre-2.12.6-1.el7.x86_64
lustre-zfs-dkms-2.12.6-1.el7.noarch

# the mount command example:
# grep lustre /etc/fstab
fs1/ost1    /mnt/fs1/ost1   lustre  defaults,_netdev  0 0

and all are the same on all six systems. I currently have ZFS 0.8.5 
installed, but I have tried with ZFS 0.7.13, and the results are
the same.

Steve
-- 

Steve Thompson E-mail:  smt AT vgersoft DOT com
Voyager Software LLC   Web: http://www DOT vgersoft DOT com
3901 N Charles St  VSW Support: support AT vgersoft DOT com
Baltimore MD 21218
   "186,282 miles per second: it's not just a good idea, it's the law"

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] ZFS and OST Space Difference

2021-04-06 Thread Mohr, Rick via lustre-discuss
Makia,

The drive sizes are 7.6 TB which translates to about 6.9 TiB (which is the unit 
that zpool uses for "T").  So the zpool sizes as just 10 x 6.9T = 69T since 
zpool shows the total amount of disk space available to the pool.  The usable 
space (which is what df is reporting) should be more like 0.8 x 69T = 55T.  I 
am not sure about the discrepancy of 3T.  Maybe that is due to some ZFS and/or 
Lustre overhead?

--Rick

On 4/6/21, 3:49 PM, "lustre-discuss on behalf of Makia Minich" 
 wrote:

I believe this was discussed a while ago, but I was unable to find clear 
answers, so I’ll re-ask in hopefully a slightly different way.
On an OST, I have 30 drives, each at 7.6TB. I create 3 raidz2 zpools of 10 
devices (ashift=12):

[root@lustre47b ~]# zpool list
NAMESIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAGCAP  DEDUP  
  HEALTH  ALTROOT
oss55-0  69.9T  37.3M  69.9T- - 0% 0%  1.00x
ONLINE  -
oss55-1  69.9T  37.3M  69.9T- - 0% 0%  1.00x
ONLINE  -
oss55-2  69.9T  37.4M  69.9T- - 0% 0%  1.00x
ONLINE  -
[root@lustre47b ~]#


Running a mkfs.lustre against these (and the lustre mount) and I see:

[root@lustre47b ~]# df -h | grep ost
oss55-0/ost165 52T   27M   52T   1% /lustre/ost165
oss55-1/ost166 52T   27M   52T   1% /lustre/ost166
oss55-2/ost167 52T   27M   52T   1% /lustre/ost167
[root@lustre47b ~]#


Basically, we’re seeing a pretty dramatic loss in capacity (156TB vs 
209.7TB, so a loss of about 50TB). Is there any insight on where this capacity 
is disappearing to? If there some mkfs.lustre or zpool option I missed in 
creating this? Is something just reporting slightly off and that space really 
is there?

Thanks.

—


Makia Minich

Chief Architect

System Fabric Works
"Fabric Computing that Works”

"Oh, I don't know. I think everything is just as it should be, y'know?”
- Frank Fairfield







___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] Re: Restricting sub directory mounts/access

2021-03-30 Thread Mohr, Rick via lustre-discuss
Amit,

I don't think that lustre will do exactly what you want in this case.  If you 
mount the entire file system, then you could restrict access to a directory 
based on normal uid/gid permission or even ACLs.  But those restrictions would 
then apply to every lustre client that mounted the file system.  I don't know 
of any way to allow directory to be visible in lustre and also prevent access 
to that directory based just on the node that mounted it.

I don't know if it is possible in your case, but you could consider organizing 
the directory layout in such a way that subdirectory mounts would accomplish 
what you want.  For example, if your file system is normally mounted under 
"/lustre" on the client, then you could create two directories in the file 
system called "restricted/" and "normal/".  (These names are just for 
illustrative purposed.  You'll likely want to choose something better.). Most 
of your clients would then see /lustre/normal, /lustre/restricted, etc.  On the 
login nodes, you would just create the mount point /lustre/normal and only 
mount that subdirectory.  Then /lustre/restricted would not even be visible.
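
On the login nodes the mount would then look something like this (MGS NID and 
fsname are placeholders; this is the fileset/subdirectory-mount syntax):

mount -t lustre <mgsnid>@o2ib:/<fsname>/normal /lustre/normal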

As a personal preference, I like to avoid putting any "real data" at the root 
of my lustre file system.  The only things I create there are subdirectories 
that organize files into logical groups (/lustre/projects, /lustre/users, 
/lustre/admin, etc.).  I feel that it gives me more control in situations like 
these if I want to only mount certain subdirectories or even apply things like 
project quotas.  I wouldn't call it a "best practice", but over the years I 
have found that approach to be very useful/practical.

-Rick


On 3/30/21, 4:25 PM, "lustre-discuss on behalf of Kumar, Amit" 
 
wrote:

Hi David,

Thank you for your reply. Yes I would like to use the isolation mentioned 
in the link you shared, but a bit differently. I did a bit of reading but it 
appears to me, that Isolation provided by filesets feature allows me to mount 
sub-directory in isolation of the root directory, and using nodemap allows me 
to squash or map uid/gid on a set of clients. Based on my understanding this 
would not help me, I hope I am wrong. 

Here is what I am trying: I still want the entire namespace mounted on all 
clients, but exclude access to one of the sub-directory from the namespace on a 
handful of clients. Rational: we have some datasets that resides in a 
sub-directory, and given lustre namespace is mounted on login servers which are 
not setup behind a 2FA authentication system, the entity providing us the data 
set has raised concerns and hence we are trying to look for options around 
this. We do have a place to put the data elsewhere at the moment, but I would 
like to explore options not all our file systems are as large as Lustre and it 
could benefit when the need arises. 

Best Regards,
Amit



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] OST mount with failover MDS

2021-03-23 Thread Mohr, Rick via lustre-discuss
Could this be due to the fact that the IP for the primary host is 10.20.3.0?  I 
don't know what your netmask is, so I don't know if that is a valid host IP 
address.  Or maybe since it looks like a network address, Lustre is not using 
it as a host address?

-Rick

On 3/17/21, 7:57 AM, "lustre-discuss on behalf of Thomas Roth via 
lustre-discuss"  wrote:

Hi all,

I wonder if I am seeing signs of network problems when mounting an OST:


tunefs.lustre --dryrun tells me (what I know from my own format command)
 >Parameters: mgsnode=10.20.3.0@o2ib5:10.20.3.1@o2ib5

These are the nids for our MGS+MDT0, there are two more pairs for MDT1 and 
MDT2.

I went step-by-step, modprobing lnet and lustre, and checking LNET by 'lnet 
ping' to the active MDTs, 
which worked fine.

However, mounting such an OST (e.g. after a crash) at first prints a number 
of
 > LNet: 19444:0:(o2iblnd_cb.c:3397:kiblnd_check_conns()) Timed out tx for 
10.20.3.1@o2ib5: 0 seconds

and similarly for the failover partners of the other two MDS.

Should it do that?


Imho, LNET to a failover node _must_ fail, because LNET should not be up on 
the failover node, right?

If I started LNET there, and some client did not get an answer quickly 
enough from the acting MDS, it would try the failover node - where LNET would 
answer but Lustre would not - and that doesn't sound right.


Regards,
Thomas

-- 

Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



Re: [lustre-discuss] [EXTERNAL] tgt_grant.c:571:tgt_grant_incoming

2021-03-11 Thread Mohr, Rick via lustre-discuss
Maybe this bug?

https://jira.whamcloud.com/browse/LU-11939

--Rick

On 3/11/21, 3:01 AM, "lustre-discuss on behalf of bkmz via lustre-discuss" 
 wrote:

Hello, please help :)
Periodically I get this error in dmesg:

Mar  9 17:42:32 oss05 kernel: LustreError: 
14715:0:(tgt_grant.c:571:tgt_grant_incoming()) scratch-OST001c: cli 
dd4a4653-12d7-4/96b92f789800 dirty 28672 pend 0 grant -741

Mar  9 17:42:32 oss05 kernel: LustreError: 
14715:0:(tgt_grant.c:573:tgt_grant_incoming()) LBUG



Package information: 
lustre-2.12.2-1.el7.x86_64
Name: lustre
Version : 2.12.2
Release : 1.el7
Architecture: x86_64
Install Date: Tue 19 Jan 2021 06:55:34 PM MSK
Group   : System Environment/Kernel
Size: 2586107
License : GPL
Signature   : (none)
Source RPM  : lustre-2.12.2-1.el7.src.rpm
Build Date  : Mon 27 May 2019 01:11:16 AM MSK
Build Host  : trevis-310-el7-x8664-4.trevis.whamcloud.com
Relocations : (not relocatable)
URL : https://wiki.whamcloud.com/
Summary : Lustre File System

System information: CentOS Linux release 7.6.1810 (Core)
Linux oss06 3.10.0-957.el7.x86_64 #1 SMP Thu Nov 8 23:39:32 UTC 2018 x86_64 
x86_64 x86_64 GNU/Linux
OFED.4.6.0.4.1.46101.x86_64

Line 560 in tgt_grant.c:

if (ted->ted_dirty < 0 || ted->ted_grant < 0 || ted->ted_pending < 0) {

but I don't understand the reason :(   Why would ted->ted_grant be < 0?

Best regards,
Фатеев Илья 

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] [EXTERNAL] MDT mount stuck

2021-03-11 Thread Mohr, Rick via lustre-discuss
Thomas,

Is the behavior any different if you mount with the "-o abort_recov" option to 
avoid the recovery phase?
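
Something along these lines (the device path and mount point below are 
placeholders):

    mount -t lustre -o abort_recov /dev/mdt0_device /mnt/lustre/mdt0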

--Rick

On 3/11/21, 11:48 AM, "lustre-discuss on behalf of Thomas Roth via 
lustre-discuss"  wrote:

Hi all,

after not getting out of the ldlm_lockd situation, we are trying a 
shutdown plus restart.
This does not work at all; the very first mount of the restart is MGS + MDT0, 
of course.

It is quite busy writing traces to the log


Mar 11 17:21:17 lxmds19.gsi.de kernel: INFO: task mount.lustre:2948 blocked 
for more than 120 seconds.
Mar 11 17:21:17 lxmds19.gsi.de kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables 
this message.
Mar 11 17:21:17 lxmds19.gsi.de kernel: mount.lustreD 9616ffc5acc0   
  0  2948   2947 0x0082
Mar 11 17:21:17 lxmds19.gsi.de kernel: Call Trace:
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
schedule+0x29/0x70
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
schedule_timeout+0x221/0x2d0
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
select_task_rq_fair+0x5a6/0x760
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
wait_for_completion+0xfd/0x140
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
wake_up_state+0x20/0x20
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
llog_process_or_fork+0x244/0x450 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
llog_process+0x14/0x20 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
class_config_parse_llog+0x125/0x350 
[obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
mgc_process_cfg_log+0x790/0xc40 [mgc]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
mgc_process_log+0x3dc/0x8f0 [mgc]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
config_recover_log_add+0x13f/0x280 [mgc]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
class_config_dump_handler+0x7e0/0x7e0 
[obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
mgc_process_config+0x88b/0x13f0 [mgc]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
lustre_process_log+0x2d8/0xad0 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
libcfs_debug_msg+0x57/0x80 [libcfs]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
lprocfs_counter_add+0xf9/0x160 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
server_start_targets+0x13a4/0x2a20 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
lustre_start_mgc+0x260/0x2510 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
class_config_dump_handler+0x7e0/0x7e0 
[obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
server_fill_super+0x10cc/0x1890 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
lustre_fill_super+0x468/0x960 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
lustre_common_put_super+0x270/0x270 
[obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
mount_nodev+0x4f/0xb0
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
lustre_mount+0x38/0x60 [obdclass]
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
mount_fs+0x3e/0x1b0
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
vfs_kern_mount+0x67/0x110
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
do_mount+0x1ef/0xd00
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
__check_object_size+0x1ca/0x250
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] ? 
kmem_cache_alloc_trace+0x3c/0x200
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
SyS_mount+0x83/0xd0
Mar 11 17:21:17 lxmds19.gsi.de kernel:  [] 
system_call_fastpath+0x25/0x2a




Other than that, nothing is happening.

The Lustre processes have started, but e.g. recovery_status = Inactive.
OK, perhaps that is because there is nothing out there to recover besides this 
MDS; all other Lustre servers+clients are still stopped.


Still, on previous occasions the mount would not block in this way. The 
device would be mounted - now it does not even make it into /proc/mounts.

Btw, the disk device can be mounted as type ldiskfs. So it exists, and it 
definitely looks like a Lustre MDT on the inside.


Best,
Thomas

-- 

Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Dr. Ulrich Breuer, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org

Re: [lustre-discuss] Stray files after failed lfs_migrate

2021-03-03 Thread Mohr, Rick via lustre-discuss
Angelos,

If a file still existed on the MDS but its data on the OST had somehow been 
removed, then you might see symptoms like those you described.  (stat fails 
because info can't be retrieved from the ost, but lfs getstripe can still query 
layout info from the mds.)  But if that is the case, I can't really say how it 
might have happened in the first place.

Have you tried running lfsck to look for consistency problems?
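
For example, something along these lines on the MDS (substitute your actual 
fsname):

    lctl lfsck_start -M <fsname>-MDT0000 -t all
    lctl get_param -n mdd.<fsname>-MDT0000.lfsck_layout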

--Rick


On 3/2/21, 5:24 AM, "lustre-discuss on behalf of Angelos Ching via 
lustre-discuss"  wrote:

Dear all,

I was dealing with some OST migration using lfs_migrate and things went 
mostly fine, except for a few files that might have been in use during 
the migration:

> # ls
> ls: cannot access ibleTHWm: No such file or directory
> ls: cannot access ib7rP0qy: No such file or directory
> ls: cannot access ib3AQ9vK: No such file or directory
> ls: cannot access ib30N1p9: No such file or directory
> ib30N1p9  ib3AQ9vK  ib7rP0qy  ibleTHWm
> # stat ib30N1p9
> stat: cannot stat ‘ib30N1p9’: No such file or directory
> # lfs getstripe ib30N1p9
> ib30N1p9
> lmm_stripe_count:  1
> lmm_stripe_size:   1048576
> lmm_pattern:   raid0
> lmm_layout_gen:0
> lmm_stripe_offset: 1
> obdidx objid objid group
>  1  719094380x449403e 0
The files couldn't be stat'ed but still return layout information from lfs getstripe.

The same error appears on all clients and I've tried unmounting and 
remounting the MDT on the server side already.

Any idea what might have been corrupted and what could be the fix?

Cheers,

-- 
Angelos Ching
ClusterTech Limited

Tel : +852-2655-6138
Fax : +852-2994-2101
Address : Unit 211-213, Lakeside 1, 8 Science Park West Ave., Shatin, 
Hong Kong

Got praises or room for improvements? http://bit.ly/TellAngelos



The information contained in this e-mail and its attachments is 
confidential and
intended solely for the specified addressees. If you have received this 
email in
error, please do not read, copy, distribute, disclose or use any 
information of
this email in any way and please immediately notify the sender and delete 
this
email. Thank you for your cooperation.



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



[lustre-discuss] Attempting to recover zfs ost after file corruption

2021-03-03 Thread Mohr, Rick via lustre-discuss
I have a file system running Lustre 2.10.4 on CentOS 7.5 with zfs 0.7.9 that I 
am attempting to keep functional until we can move data to a new Lustre file 
system.  We recently had a couple of osts suffer from some data corruption, and 
after getting them imported and running a scrub, it seems the errors may be 
confined to two directories on the ost's underlying zfs file system: CONFIGS/ 
and oi.10/.

Is it possible to simply remove these files and have them automatically get 
rebuilt when the ost is remounted?  My hope is that any files under CONFIGS/ 
would get repopulated when it connected to the mgs.  But if needed, I can 
always extract files directly from the mgt.  The one thing that I am not sure 
about is how to handle the oi.10/ directory.

I reviewed the procedure in the Lustre manual for restoring an ost from a 
file-level backup.  Since it looks like all the user files are still intact, my 
thought was that I could avoid the actual file restoration step and just 
proceed with the steps to remove CATALOGS, oi.*, LFSCK, etc.  The main 
difference is that since I am not reformatting the ost, I wouldn't be able to 
add the "--replace" flag which sounds like it is used to trigger some of the 
recovery steps.
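
In rough terms, the procedure I have in mind looks like this (pool/dataset 
names and mount point are placeholders, and I am assuming the ost dataset can 
be mounted directly, e.g. with a legacy mountpoint):

    mount -t zfs ostpool/ost0 /mnt/ost
    rm /mnt/ost/CATALOGS
    rm -rf /mnt/ost/oi.*
    rm -rf /mnt/ost/LFSCK
    umount /mnt/ost

and then remount the ost as type lustre and see whether those objects get 
rebuilt.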

Any help is greatly appreciated.

--Rick

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org