Hi,
I am mounting the Lustre file system (as a client) on the OSS. Some of the
OSTs are locally attached to this OSS.
The failover IP for the OST is "10.99.100.152", which is a local LNet NID on
the OSS. However, when the client mounts the file system, the import
automatically changes to 0@lo. That is undesirable here: when this OST fails
over to another server, the client keeps trying to connect to 0@lo even
though the OST is no longer on the same host, so the client's mount hangs
forever.
The failover here is designed so that the IP address moves (fails over) with
the OST and becomes active on the other server.
How can I make the import point to the real IP rather than the loopback, so
that the failover works?
[oss000 ~]$ lfs df
UUID                 1K-blocks      Used  Available Use% Mounted on
fs-MDT0000_UUID       29068444     25692   26422344   1% /mnt/fs[MDT:0]
fs-OST0000_UUID       50541812  30160292   17743696  63% /mnt/fs[OST:0]
fs-OST0001_UUID       50541812  29301740   18602248  62% /mnt/fs[OST:1]
fs-OST0002_UUID       50541812  29356508   18547480  62% /mnt/fs[OST:2]
fs-OST0003_UUID       50541812   8822980   39081008  19% /mnt/fs[OST:3]

filesystem_summary:  202167248  97641520   93974432  51% /mnt/fs
[oss000 ~]$ df -h
Filesystem                  Size  Used Avail Use% Mounted on
devtmpfs                     30G     0   30G   0% /dev
tmpfs                        30G  8.1M   30G   1% /dev/shm
tmpfs                        30G   25M   30G   1% /run
tmpfs                        30G     0   30G   0% /sys/fs/cgroup
/dev/mapper/ocivolume-root   36G   17G   19G  48% /
/dev/sdc2                  1014M  637M  378M  63% /boot
/dev/mapper/ocivolume-oled   10G  2.5G  7.6G  25% /var/oled
/dev/sdc1                   100M  5.1M   95M   6% /boot/efi
tmpfs                       5.9G     0  5.9G   0% /run/user/987
tmpfs                       5.9G     0  5.9G   0% /run/user/0
/dev/sdb                     49G   28G   18G  62% /fs-OST0001
/dev/sda                     49G   29G   17G  63% /fs-OST0000
tmpfs                       5.9G     0  5.9G   0% /run/user/1000
10.99.100.221@tcp1:/fs      193G   94G   90G  51% /mnt/fs
[oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda
checking for existing Lustre data: found
Read previous values:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1
Permanent disk data:
Target:     fs-OST0000
Index:      0
Lustre FS:  fs
Mount type: ldiskfs
Flags:      0x1002
              (OST no_primnode )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1

exiting before disk write.
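For reference, the on-disk failover parameters above look self-consistent, so this is only a hedged sketch of how such parameters are normally rewritten with tunefs.lustre, not a confirmed fix for the 0@lo import. The partner-server NID 10.99.100.153@tcp1 below is a made-up placeholder; substitute the real second OSS NID.

```shell
# Hedged sketch: rewrite the failover/service NIDs on the OST device.
# 10.99.100.153@tcp1 is a placeholder for the partner OSS's NID.
# NOTE: --writeconf regenerates the configuration logs; review the
# Lustre manual's writeconf procedure before running this on a live
# system. The target must be unmounted first.
umount /fs-OST0000
tunefs.lustre --erase-params \
              --mgsnode=10.99.100.221@tcp1 \
              --servicenode=10.99.100.152@tcp1 \
              --servicenode=10.99.100.153@tcp1 \
              --writeconf /dev/sda
```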
[oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import
import:
    name: fs-OST0000-osc-ffff89c57672e000
    target: fs-OST0000_UUID
    state: IDLE
    connect_flags: [ write_grant, server_lock, version, request_portal,
       max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize,
       alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink,
       full20, layout_lock, 64bithash, object_max_bytes, jobstats, einprogress,
       grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags,
       lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]
    connect_data:
       flags: 0xa0425af2e3440078
       instance: 39
       target_version: 2.15.3.0
       initial_grant: 8437760
       max_brw_size: 4194304
       grant_block_size: 4096
       grant_inode_size: 32
       grant_max_extent_size: 67108864
       grant_extent_tax: 24576
       cksum_types: 0xf7
       max_object_bytes: 17592186040320
    import_flags: [ replayable, pingable, connect_tried ]
    connection:
       failover_nids: [ 0@lo, 0@lo ]
       current_connection: 0@lo
       connection_attempts: 1
       generation: 1
       in-progress_invalidations: 0
       idle: 36 sec
    rpcs:
       inflight: 0
       unregistering: 0
       timeouts: 0
       avg_waittime: 2627 usec
    service_estimates:
       services: 1 sec
       network: 1 sec
    transactions:
       last_replay: 0
       peer_committed: 0
       last_checked: 0
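As a side note, the loopback substitution is easy to spot mechanically. Here is a small sketch that pulls the active NID out of import text like the dump above; in practice the input would come from the /proc file shown, or equivalently from `lctl get_param osc.*.import`:

```shell
# Sample fragment copied from the import dump above; in practice read
# /proc/fs/lustre/osc/<target>/import or use `lctl get_param osc.*.import`.
sample='connection:
    failover_nids: [ 0@lo, 0@lo ]
    current_connection: 0@lo'

# Extract the NID the import is actually connected to.
cur=$(printf '%s\n' "$sample" | awk -F': ' '/current_connection/ {print $2}')
echo "$cur"   # prints 0@lo
```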
_______________________________________________
lustre-discuss mailing list
[email protected]
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org