Hi Chris,

Thank you for looking into this. I agree. In cloud and in some on-prem networks, a floating IP is a real mechanism for providing HA, and I am attempting to make it work here. Since the IP move happens in sub-seconds in these environments, the failover completes within a few seconds and clients barely notice any delay. This loopback optimization is undesirable in such an environment. If no parameter already exists to change the behavior, how can I make it work in this environment? I wonder if it requires a code change? If so, I could look into it if someone can help with some pointers.
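One place I can start is checking whether an existing client-side tunable already covers this before assuming a code change. This is only a sketch; the grep patterns are illustrative, not known parameter names:

  # scan client-side tunables for anything connection/import related
  lctl list_param -R osc | grep -iE 'conn|import'
  lctl get_param osc.*.import | grep -E 'failover_nids|current_connection'
  # LNet and ptlrpc module parameters are also worth scanning
  ls /sys/module/lnet/parameters/ /sys/module/ptlrpc/parameters/

If nothing relevant shows up there, that would suggest a code change really is needed.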
Regards,
Aboo

On Wed, Nov 6, 2024 at 11:05 AM Horn, Chris <[email protected]> wrote:

> Here the failover is designed in such a way that the IP address moves
> (fails over) with the OST and becomes active on the other server.
>
> This is probably the source of your problem. I would suggest assigning
> unique IP addresses to each OSS.
>
> Chris Horn
>
> From: lustre-discuss <[email protected]> on behalf of Backer <[email protected]>
> Date: Tuesday, November 5, 2024 at 10:19 PM
> To: Backer via lustre-discuss <[email protected]>, [email protected] <[email protected]>
> Subject: Re: [lustre-discuss] Lustre switching to loop back lnet interface when it is not desired
>
> Any ideas on how to avoid using 0@lo as failover_nids? Please see below.
>
> On Tue, 5 Nov 2024 at 12:34, Backer <[email protected]> wrote:
>
> Hi,
>
> Mounting the Lustre file system on the OSS. Some of the OSTs are
> locally attached to the OSS.
>
> The failover IP on the OST is "10.99.100.152". It is a local lnet on the
> OSS. However, when the client mounts it, the import automatically changes
> to 0@lo. It is undesirable here because when this OST fails over to
> another server, the client is still trying to connect to 0@lo while it is
> no longer on the same host. This makes the client fs mount hang forever.
> Here the failover is designed in such a way that the IP address moves
> (fails over) with the OST and becomes active on the other server.
> How can I make the import point to the real IP and not the loopback?
> (so that the failover works)
>
> [oss000 ~]$ lfs df
> UUID                  1K-blocks       Used  Available Use% Mounted on
> fs-MDT0000_UUID        29068444      25692   26422344   1% /mnt/fs[MDT:0]
> fs-OST0000_UUID        50541812   30160292   17743696  63% /mnt/fs[OST:0]
> fs-OST0001_UUID        50541812   29301740   18602248  62% /mnt/fs[OST:1]
> fs-OST0002_UUID        50541812   29356508   18547480  62% /mnt/fs[OST:2]
> fs-OST0003_UUID        50541812    8822980   39081008  19% /mnt/fs[OST:3]
>
> filesystem_summary:   202167248   97641520   93974432  51% /mnt/fs
>
> [oss000 ~]$ df -h
> Filesystem                  Size  Used Avail Use% Mounted on
> devtmpfs                     30G     0   30G   0% /dev
> tmpfs                        30G  8.1M   30G   1% /dev/shm
> tmpfs                        30G   25M   30G   1% /run
> tmpfs                        30G     0   30G   0% /sys/fs/cgroup
> /dev/mapper/ocivolume-root   36G   17G   19G  48% /
> /dev/sdc2                  1014M  637M  378M  63% /boot
> /dev/mapper/ocivolume-oled   10G  2.5G  7.6G  25% /var/oled
> /dev/sdc1                   100M  5.1M   95M   6% /boot/efi
> tmpfs                       5.9G     0  5.9G   0% /run/user/987
> tmpfs                       5.9G     0  5.9G   0% /run/user/0
> /dev/sdb                     49G   28G   18G  62% /fs-OST0001
> /dev/sda                     49G   29G   17G  63% /fs-OST0000
> tmpfs                       5.9G     0  5.9G   0% /run/user/1000
> 10.99.100.221@tcp1:/fs      193G   94G   90G  51% /mnt/fs
>
> [oss000 ~]$ sudo tunefs.lustre --dryrun /dev/sda
> checking for existing Lustre data: found
>
> Read previous values:
> Target:              fs-OST0000
> Index:               0
> Lustre FS:           fs
> Mount type:          ldiskfs
> Flags:               0x1002
>                      (OST no_primnode )
> Persistent mount opts: ,errors=remount-ro
> Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1
>
> Permanent disk data:
> Target:              fs-OST0000
> Index:               0
> Lustre FS:           fs
> Mount type:          ldiskfs
> Flags:               0x1002
>                      (OST no_primnode )
> Persistent mount opts: ,errors=remount-ro
> Parameters: mgsnode=10.99.100.221@tcp1 failover.node=10.99.100.152@tcp1,10.99.100.152@tcp1
>
> exiting before disk write.
>
> [oss000 proc]# cat /proc/fs/lustre/osc/fs-OST0000-osc-ffff89c57672e000/import
> import:
>     name: fs-OST0000-osc-ffff89c57672e000
>     target: fs-OST0000_UUID
>     state: IDLE
>     connect_flags: [ write_grant, server_lock, version, request_portal,
>        max_byte_per_rpc, early_lock_cancel, adaptive_timeouts, lru_resize,
>        alt_checksum_algorithm, fid_is_enabled, version_recovery, grant_shrink,
>        full20, layout_lock, 64bithash, object_max_bytes, jobstats, einprogress,
>        grant_param, lvb_type, short_io, lfsck, bulk_mbits, second_flags,
>        lockaheadv2, increasing_xid, client_encryption, lseek, reply_mbits ]
>     connect_data:
>        flags: 0xa0425af2e3440078
>        instance: 39
>        target_version: 2.15.3.0
>        initial_grant: 8437760
>        max_brw_size: 4194304
>        grant_block_size: 4096
>        grant_inode_size: 32
>        grant_max_extent_size: 67108864
>        grant_extent_tax: 24576
>        cksum_types: 0xf7
>        max_object_bytes: 17592186040320
>     import_flags: [ replayable, pingable, connect_tried ]
>     connection:
>        failover_nids: [ 0@lo, 0@lo ]
>        current_connection: 0@lo
>        connection_attempts: 1
>        generation: 1
>        in-progress_invalidations: 0
>        idle: 36 sec
>     rpcs:
>        inflight: 0
>        unregistering: 0
>        timeouts: 0
>        avg_waittime: 2627 usec
>     service_estimates:
>        services: 1 sec
>        network: 1 sec
>     transactions:
>        last_replay: 0
>        peer_committed: 0
>        last_checked: 0
>
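For completeness, this is how I understand the unique-IP approach you suggest would look: register one static NID per OSS as a service node instead of the shared floating address. Just a sketch; the NIDs 10.99.100.153@tcp1 and 10.99.100.154@tcp1 below are placeholders for the two servers' own addresses, not values from this cluster, the OST has to be unmounted first, and a --writeconf regeneration of the config logs may also be needed after changing NIDs.

  umount /fs-OST0000
  tunefs.lustre --erase-params \
      --mgsnode=10.99.100.221@tcp1 \
      --servicenode=10.99.100.153@tcp1 \
      --servicenode=10.99.100.154@tcp1 \
      --dryrun /dev/sda        # drop --dryrun once the output looks correct
  mount -t lustre /dev/sda /fs-OST0000

For anyone reproducing the 0@lo symptom, the same information as the /proc file above is also available through lctl and lnetctl:

  lctl get_param osc.*.import | grep -E 'failover_nids|current_connection'
  lnetctl net show    # 10.99.100.152@tcp1 is a local NID on oss000, hence 0@lo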
