[lustre-discuss] BCP for High Availability?

2023-01-15 Thread Andrew Elwell via lustre-discuss
Hi Folks,

I'm just rebuilding my testbed and have got to the "sort out all the
pacemaker stuff" part. What's the best current practice for the
current LTS (2.15.x) release tree?

I've always done this as multiple individual HA clusters covering each
pair of servers with common dual-connected drive array(s), but I
remember seeing a talk some years ago where one of the US labs was
using 'pacemaker-remote' and bringing them all up from a central node.
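
(For concreteness, each of those per-pair clusters is just the stock
resource-agent pattern - a rough sketch only, with the device paths,
resource names and scores made up for illustration, using the
ocf:lustre:Lustre agent shipped in lustre-resource-agents:

pcs resource create fs1-OST0000 ocf:lustre:Lustre \
    target=/dev/mapper/fs1-OST0000 mountpoint=/lustre/fs1-OST0000
pcs constraint location fs1-OST0000 prefers oss01=100 oss02=50

plus per-pair fencing, repeated across every server pair - hence the
interest in whether pacemaker-remote from a central node is a saner way
to scale it.)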

I note there are a few (old) crib notes on the wiki - referenced from
the Lustre manual - but nothing updated in the last couple of years.

What are people out there doing?


Many thanks

Andrew


[lustre-discuss] 2.15.x with ConnectX-3 cards

2022-12-10 Thread Andrew Elwell via lustre-discuss
Hi Gang,

I've just gone and reimaged a test system in prep for doing an upgrade
to Rocky 8 + 2.15.1 (what are the bets 2.15.2 comes out the night I push
to prod?). However, the 2.15.1-ib release uses MOFED 5.6
... which no longer supports CX-3 cards. (Yeah, it's olde hardware...)

Having been badly bitten (see posts passim) by using non-MOFED (in-kernel)
2.10/2.12 on this hardware (it needs to talk ib_srp to an SFA7700X),
what's the advice for going to a 2.15 server? Stick with in-kernel
drivers, or some rebuild against MOFED 4.9?
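
(The rebuild route I have in mind is the usual configure-against-MOFED
dance from a checked-out lustre-release tree, roughly - a sketch only,
with the MOFED 4.9 install path assumed to be the default
/usr/src/ofa_kernel/default:

cd lustre-release
./configure --with-linux=/usr/src/kernels/$(uname -r) \
            --with-o2ib=/usr/src/ofa_kernel/default
make rpms

but I'd rather not carry hand-rolled server RPMs if there's a better
answer.)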

Many thanks

Andrew


[lustre-discuss] Version interoperability

2022-11-08 Thread Andrew Elwell via lustre-discuss
Hi folks,

We're faced with a seriously large gap in versions (short term,
thankfully - measured in months, not years) between our existing
clients (2.7.5) and new hardware clients (2.15.0) that will be
mounting the same file system.

It's currently on 2.10.8-ib (ldiskfs) with ConnectX-5 cards, and I
have a maintenance window coming up where I have the opportunity to
upgrade it.

Which is likely to cause less breakage:
* stick with 2.10.8 on server and the annoyances with multi-rail /
discovery when talking to our new system
* upgrade to 2.12.9, sticking with the same OS major version
* upgrade to 2.15.1 including a jump to RHEL 8 / rocky 8 (depending on
licencing as we seem to have lost our HA add-on)

I can't upgrade the old 2.7.5 clients, as this system is already on
the decommissioning roadmap for next year.

Many thanks

Andrew


[lustre-discuss] 2.12.9-ib release?

2022-06-24 Thread Andrew Elwell via lustre-discuss
Hi folks,

I see the 2.12.9/ release tree on
https://downloads.whamcloud.com/public/lustre/, but I don't see the
accompanying 2.12.9-ib/ one.

ISTR someone needed to poke a build process last time to get this
public - can they do the same this time please?

Many thanks

Andrew


[lustre-discuss] unclear language in Operations manual

2022-06-15 Thread Andrew Elwell via lustre-discuss
Hi folks,

I've recently come across this snippet in the ops manual (section
13.8. Running Multiple Lustre File Systems, page 111 in the current
pdf)

> Note
> If a client(s) will be mounted on several file systems, add the following 
> line to /etc/ xattr.conf file to avoid problems when files are moved between 
> the file systems: lustre.* skip

Is this describing the case where a single client mounts more than one
Lustre filesystem simultaneously? i.e. mount -t lustre mgsnode:/foo
/mnt/foo AND mount -t lustre mgsnode:/bar /mnt/bar?
I suspect I should file a LUDOC ticket if so, as the language doesn't flow.

As (ahem) we've never done this, what's it likely to screw up when a
user's copying from /mnt/foo to /mnt/bar?
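
(For reference, the actual change being asked for looks to be a
one-liner per client - assuming the stock "pattern action" format of
/etc/xattr.conf:

echo 'lustre.*  skip' >> /etc/xattr.conf

i.e. so that tools which copy extended attributes between the two mounts
skip the lustre.* ones rather than trying to replay them on the
destination filesystem.)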

Many thanks

Andrew


[lustre-discuss] jobstats

2022-05-27 Thread Andrew Elwell via lustre-discuss
Hi folks,

I've finally started to re-investigate pushing jobstats to our central
dashboards and realised there's a dearth of scripts / tooling to
actually gather the job_stats files and push them to $whatever. I have
seen the Telegraf one, and the DDN fork of collectd seems somewhat
abandoned. Hence at this stage I'm back to rolling another Python
script to feed InfluxDB. Yes, I know all the cool kids are using
Prometheus, but I'm not one of them.
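
To give an idea of the shape of it, the gathering side is not much more
than this (shown as shell rather than the eventual Python, and with the
measurement name, InfluxDB host and database all made up for
illustration; it only pulls write_bytes sums from the OSTs):

HOST=$(hostname -s)
lctl get_param -n obdfilter.*.job_stats | awk -v host="$HOST" '
  /job_id:/      { job = $3 }
  /write_bytes:/ { sum = $0; sub(/.*sum: */, "", sum); sub(/[ }].*/, "", sum)
                   printf "lustre_jobstats,host=%s,jobid=%s write_bytes=%si\n", host, job, sum }
' | curl -s -XPOST 'http://influxdb:8086/write?db=lustre' --data-binary @-

cron that on each OSS and you have something, if not something elegant.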

However, while rummaging I came across LU-11407 (Improve stats data) -
Andreas commented[1] that he was hoping to add start_time and
elapsed_time fields, but are these targeted at an upcoming release (the
ticket still shows 'open')? It's also referred to in LU-15826 - is that
likely to make a point release of 2.15, or will it be targeted at the
next major release? It would be handy to save me correlating with Slurm
job start times, especially if the user job does $other_stuff before
actually hitting the disks.


Many thanks

Andrew




[1]
https://jira.whamcloud.com/browse/LU-11407?focusedCommentId=234830&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-234830


Re: [lustre-discuss] Corrupted? MDT not mounting

2022-05-10 Thread Andrew Elwell via lustre-discuss
On Wed, 11 May 2022 at 04:37, Laura Hild  wrote:
> The non-dummy SRP module is in the kmod-srp package, which isn't included in 
> the Lustre repository...

Thanks Laura,
Yeah, I realised that earlier in the week, and have rebuilt the srp
module from source via mlnxofedinstall, and sure enough installing
srp-4.9-OFED.4.9.4.1.6.1.kver.3.10.0_1160.49.1.el7_lustre.x86_64.x86_64.rpm
(gotta love those short names) gives me working srp again.

Hat tip to a DDN contact here (we owe him even more beers now) for
some extra tuning parameters:
options ib_srp cmd_sg_entries=255 indirect_sg_entries=2048
allow_ext_sg=1 ch_count=1 use_imm_data=0
and I'm pleased to say that it _seems_ to be working much better. I'd
done one half of the HA pairs earlier in the week, lfsck completed, a
full Robinhood scan done (dropped the DB and rescanned from fresh), and
I'm just bringing the other half of the pairs up to the same software
stack now.
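
(For anyone wanting to replicate: we're just persisting those options
via modprobe.d - the file name is arbitrary, and whether you need to
rebuild the initramfs depends on whether ib_srp loads that early on
your boxes:

cat > /etc/modprobe.d/ib_srp.conf <<'EOF'
options ib_srp cmd_sg_entries=255 indirect_sg_entries=2048 allow_ext_sg=1 ch_count=1 use_imm_data=0
EOF

then picked up on the next ib_srp reload or reboot.)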

Couple of pointers for anyone caught in the same boat, which apparently
we did correctly:
* upgrade your e2fsprogs to the latest - if you're fsck'ing disks, make
sure you're not introducing more problems with a buggy old e2fsck
* tunefs.lustre --writeconf isn't too destructive (see the warnings;
you'll lose pool info, but in our case that wasn't critical)
* monitoring is good, but tbh the rate of change and that it happened
out of hours means we likely couldn't have intervened
* so quotas are better.

Thanks to those who replied on and off-list - I'm just grateful we
only had the pair of MDTs, not the 40 (!!!) that Origin's getting
(yeah, I was watching the LUG talk last night) - service isn't quite
back to users but we're getting there!

Andrew


Re: [lustre-discuss] Corrupted? MDT not mounting

2022-05-08 Thread Andrew Elwell via lustre-discuss
On Fri, 6 May 2022 at 20:04, Andreas Dilger  wrote:
> MOFED is usually preferred over in-kernel OFED, it is just tested and fixed a 
> lot more.

Fair enough. However, is the 2.12.8-ib tree built with all the features?
Specifically
https://downloads.whamcloud.com/public/lustre/lustre-2.12.8-ib/MOFED-4.9-4.1.7.0/el7/server/

If I compare the ib_srp module from 2.12 in-kernel

[root@astrofs-oss3 ~]# find /lib/modules/`uname -r` -name ib_srp.ko.xz
/lib/modules/3.10.0-1160.49.1.el7_lustre.x86_64/kernel/drivers/infiniband/ulp/srp/ib_srp.ko.xz
[root@astrofs-oss3 ~]# rpm -qf
/lib/modules/3.10.0-1160.49.1.el7_lustre.x86_64/kernel/drivers/infiniband/ulp/srp/ib_srp.ko.xz
kernel-3.10.0-1160.49.1.el7_lustre.x86_64
[root@astrofs-oss3 ~]# modinfo ib_srp
filename:
/lib/modules/3.10.0-1160.49.1.el7_lustre.x86_64/kernel/drivers/infiniband/ulp/srp/ib_srp.ko.xz
license:Dual BSD/GPL
description:InfiniBand SCSI RDMA Protocol initiator
author: Roland Dreier
retpoline:  Y
rhelversion:7.9
srcversion: 1FB80E3A962EE7F39AD3959
depends:ib_core,scsi_transport_srp,ib_cm,rdma_cm
intree: Y
vermagic:   3.10.0-1160.49.1.el7_lustre.x86_64 SMP mod_unload modversions
signer: CentOS Linux kernel signing key
sig_key:FA:A3:27:4B:D9:17:36:F0:FD:43:6A:42:1B:6A:A4:FA:FE:D0:AC:FA
sig_hashalgo:   sha256
parm:   srp_sg_tablesize:Deprecated name for cmd_sg_entries (uint)
parm:   cmd_sg_entries:Default number of gather/scatter
entries in the SRP command (default is 12, max 255) (uint)
parm:   indirect_sg_entries:Default max number of
gather/scatter entries (default is 12, max is 2048) (uint)
parm:   allow_ext_sg:Default behavior when there are more than
cmd_sg_entries S/G entries after mapping; fails the request when false
(default false) (bool)
parm:   topspin_workarounds:Enable workarounds for
Topspin/Cisco SRP target bugs if != 0 (int)
parm:   prefer_fr:Whether to use fast registration if both FMR
and fast registration are supported (bool)
parm:   register_always:Use memory registration even for
contiguous memory regions (bool)
parm:   never_register:Never register memory (bool)
parm:   reconnect_delay:Time between successive reconnect attempts
parm:   fast_io_fail_tmo:Number of seconds between the
observation of a transport layer error and failing all I/O. "off"
means that this functionality is disabled.
parm:   dev_loss_tmo:Maximum number of seconds that the SRP
transport should insulate transport layer errors. After this time has
been exceeded the SCSI host is removed. Should be between 1 and
SCSI_DEVICE_BLOCK_MAX_TIMEOUT if fast_io_fail_tmo has not been set.
"off" means that this functionality is disabled.
parm:   ch_count:Number of RDMA channels to use for
communication with an SRP target. Using more than one channel improves
performance if the HCA supports multiple completion vectors. The
default value is the minimum of four times the number of online CPU
sockets and the number of completion vectors supported by the HCA.
(uint)
parm:   use_blk_mq:Use blk-mq for SRP (bool)
[root@astrofs-oss3 ~]#

... it all looks normal and capable of mounting our ExaScaler LUNs.

cf. the one from 2.12.8-ib:

=
 PackageArch
  Version
Repository   Size
=
Installing:
 kernel x86_64
  3.10.0-1160.49.1.el7_lustre
lustre-2.12-mofed50 M
 kmod-lustre-osd-ldiskfsx86_64
  2.12.8_6_g5457c37-1.el7
lustre-2.12-mofed   469 k
 lustre x86_64
  2.12.8_6_g5457c37-1.el7
lustre-2.12-mofed   805 k
Installing for dependencies:
 kmod-lustrex86_64
  2.12.8_6_g5457c37-1.el7
lustre-2.12-mofed   3.9 M
 kmod-mlnx-ofa_kernel   x86_64
  4.9-OFED.4.9.4.1.7.1
lustre-2.12-mofed   1.3 M
 lustre-osd-ldiskfs-mount   x86_64
  2.12.8_6_g5457c37-1.el7
lustre-2.12-mofed15 k
 mlnx-ofa_kernelx86_64
  4.9-OFED.4.9.4.1.7.1
lustre-2.12-mofed   108 k

[root@astrofs-oss1 ~]# find /lib/modules/`uname -r` -name ib_srp.ko.xz

Re: [lustre-discuss] Corrupted? MDT not mounting

2022-05-05 Thread Andrew Elwell via lustre-discuss
> It's looking more like something filled up our space - I'm just
> copying the files out as a backup (mounted as ldiskfs just now) -

Ahem. Inode quotas are a good idea. Turns out that a user creating
about 130 million directories rapidly is more than a small MDT volume
can take.

An update on recovery progress: upgrading the MDS to 2.12 got us over
the issue in LU-12674 enough to recover, and I've migrated half (one
of the HA pairs) of the OSSs to RHEL 7.9 / Lustre 2.12.8 too.

It needed a set of writeconfs doing before they'd mount, and e2fsck
has run over any suspect LUNs. The filesystem "works", in that under
light testing I can read/write OK, but as soon as it gets stressed the
OSSs are falling over:

[ 1226.864430] BUG: unable to handle kernel NULL pointer dereference
at   (null)
[ 1226.872281] IP: [] __list_add+0x1b/0xc0
[ 1226.877699] PGD 1ffba0d067 PUD 1ffa48e067 PMD 0
[ 1226.882360] Oops:  [#1] SMP
[ 1226.885619] Modules linked in: osp(OE) ofd(OE) lfsck(OE) ost(OE)
mgc(OE) osd_ldiskfs(OE) ldiskfs(OE) lquota(OE) fid(OE) fld(OE)
ko2iblnd(OE) ptlrpc(OE) obdclass(OE) lnet(OE) libcfs(OE)
dm_round_robin ib_srp scsi_transport_srp scsi_tgt tcp_diag inet_diag
ib_isert iscsi_target_mod target_core_mod rpcrdma rdma_ucm ib_iser
ib_umad bonding rdma_cm ib_ipoib iw_cm libiscsi scsi_transport_iscsi
ib_cm mlx4_ib ib_uverbs ib_core sunrpc ext4 mbcache jbd2 sb_edac
intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel iTCO_wdt kvm
iTCO_vendor_support irqbypass crc32_pclmul ghash_clmulni_intel
aesni_intel lrw gf128mul glue_helper ablk_helper cryptd pcspkr
i2c_i801 lpc_ich mei_me joydev mei sg ioatdma wmi ipmi_si ipmi_devintf
ipmi_msghandler dm_multipath acpi_pad acpi_power_meter dm_mod
ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mlx4_en
ast drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm
drm igb ahci libahci mpt2sas mlx4_core ptp crct10dif_pclmul
crct10dif_common libata crc32c_intel pps_core dca raid_class devlink
i2c_algo_bit drm_panel_orientation_quirks scsi_transport_sas nfit
libnvdimm [last unloaded: scsi_tgt]
[ 1226.987670] CPU: 6 PID: 366 Comm: kworker/u24:6 Kdump: loaded
Tainted: G   OE  
3.10.0-1160.49.1.el7_lustre.x86_64 #1
[ 1227.000168] Hardware name: SGI.COM CH-C1104-GP6/X10SRW-F, BIOS 3.1 06/06/2018
[ 1227.007310] Workqueue: rdma_cm cma_work_handler [rdma_cm]
[ 1227.012725] task: 934839f0b180 ti: 934836c2 task.ti:
934836c2
[ 1227.020195] RIP: 0010:[]  []
__list_add+0x1b/0xc0
[ 1227.028036] RSP: 0018:934836c23d68  EFLAGS: 00010246
[ 1227.09] RAX:  RBX: 934836c23d90 RCX: 
[ 1227.040463] RDX: 932fa518e680 RSI:  RDI: 934836c23d90
[ 1227.047587] RBP: 934836c23d80 R08:  R09: b2df8c1b3dcb3100
[ 1227.054712] R10: b2df8c1b3dcb3100 R11: 00ff R12: 932fa518e680
[ 1227.061835] R13:  R14:  R15: 932fa518e680
[ 1227.068958] FS:  () GS:93483f38()
knlGS:
[ 1227.077034] CS:  0010 DS:  ES:  CR0: 80050033
[ 1227.082772] CR2:  CR3: 001fe47a8000 CR4: 003607e0
[ 1227.089895] DR0:  DR1:  DR2: 
[ 1227.097020] DR3:  DR6: fffe0ff0 DR7: 0400
[ 1227.104142] Call Trace:
[ 1227.106593]  [] __mutex_lock_slowpath+0xa6/0x1d0
[ 1227.112770]  [] ? __switch_to+0xce/0x580
[ 1227.118255]  [] mutex_lock+0x1f/0x2f
[ 1227.123399]  [] cma_work_handler+0x25/0xa0 [rdma_cm]
[ 1227.129922]  [] process_one_work+0x17f/0x440
[ 1227.135752]  [] worker_thread+0x126/0x3c0
[ 1227.141324]  [] ? manage_workers.isra.26+0x2a0/0x2a0
[ 1227.147849]  [] kthread+0xd1/0xe0
[ 1227.152729]  [] ? insert_kthread_work+0x40/0x40
[ 1227.158822]  [] ret_from_fork_nospec_begin+0x7/0x21
[ 1227.165260]  [] ? insert_kthread_work+0x40/0x40
[ 1227.171348] Code: ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00
55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 4c 8b 42 08 48 89 fb 49
39 f0 75 2a <4d> 8b 45 00 4d 39 c4 75 68 4c 39 e3 74 3e 4c 39 eb 74 39
49 89
[ 1227.191295] RIP  [] __list_add+0x1b/0xc0
[ 1227.196798]  RSP 
[ 1227.200284] CR2: 


and I'm able to reproduce this on multiple servers :-/

I can see a few mentions (https://access.redhat.com/solutions/4969471
for example) that seem to hint it's triggered by low memory, but they
also say it's fixed in the Red Hat 7.9 kernel (and we're running the
2.12.8 stock 3.10.0-1160.49.1.el7_lustre.x86_64).

I've got a case open with the vendor to see if there are any firmware
updates - but I'm not hopeful. These are 6-core single-socket
Broadwells with 128G of RAM; storage disks are mounted over SRP from
a DDN appliance. Would jumping to MOFED make a difference? Otherwise
I'm open to suggestions, as it's getting very tiring wrangling servers
back to life.

[root@astrofs-oss1 ~]# ls -l /var/crash/ | grep 2022
drwxr-xr-x 2 root root 44 

Re: [lustre-discuss] Corrupted? MDT not mounting

2022-04-20 Thread Andrew Elwell via lustre-discuss
Thanks Stéphane,

It's looking more like something filled up our space - I'm just
copying the files out as a backup (mounted as ldiskfs just now) -
we're running DNE (MDT0000 plus this one, MDT0001) but I don't
understand why so much space is being taken up in REMOTE_PARENT_DIR -
we seem to have actual user data stashed in there:


[root@astrofs-mds2 SSINS_uvfits]# pwd
/mnt/REMOTE_PARENT_DIR/0xa40002340:0x1:0x0/MWA/data/1061313128/SSINS_uvfits
[root@astrofs-mds2 SSINS_uvfits]# ls -l
total 0
-rw-rw-r--+ 1 redacted redacted 67153694400 Oct  9  2018
1061313128_noavg_noflag_00.uvfits
-rw-rw-r--+ 1 redacted redacted   0 Oct  9  2018
1061313128_noavg_noflag_01.uvfits
[root@astrofs-mds2 SSINS_uvfits]#

and although this one was noticeably large, it's not the only non-zero
sized file under REMOTE_PARENT_DIR:
[root@astrofs-mds2 1061314832]# ls -l | head
total 116
-rw-rw-r--+ 1 redacted redacted7338240 Nov 14  2017 1061314832_01.mwaf
-rw-rw-r--+ 1 redacted redacted7338240 Nov 14  2017 1061314832_02.mwaf
-rw-rw-r--+ 1 redacted redacted7404480 Nov 14  2017 1061314832_03.mwaf
-rw-rw-r--+ 1 redacted redacted7404480 Nov 14  2017 1061314832_04.mwaf
-rw-rw-r--+ 1 redacted redacted7338240 Nov 14  2017 1061314832_05.mwaf
-rw-rw-r--+ 1 redacted redacted7338240 Nov 14  2017 1061314832_06.mwaf
-rw-rw-r--+ 1 redacted redacted7404480 Nov 14  2017 1061314832_07.mwaf
-rw-rw-r--+ 1 redacted redacted7404480 Nov 14  2017 1061314832_08.mwaf
-rw-rw-r--+ 1 redacted redacted7404480 Nov 14  2017 1061314832_09.mwaf
[root@astrofs-mds2 1061314832]# pwd
/mnt/REMOTE_PARENT_DIR/0xa40002340:0x1:0x0/MWA/data/1061314832

Suggestions for how to clean up and recover, anyone?

Andrew


[lustre-discuss] Corrupted? MDT not mounting

2022-04-19 Thread Andrew Elwell via lustre-discuss
Hi Folks,

One of our filesystems seemed to fail over the holiday weekend - we're
running DNE and MDT0001 won't mount. At first it looked like we'd run
out of space (rc = -28), but then we were seeing this:

mount.lustre: mount /dev/mapper/MDT0001 at /lustre/astrofs-MDT0001
failed: File exists retries left: 0
mount.lustre: mount /dev/mapper/MDT0001 at /lustre/astrofs-MDT0001
failed: File exists

possibly due to:
kernel: LustreError: 13921:0:(genops.c:478:class_register_device())
astrofs-OST-osc-MDT0001: already exists, won't add

lustre_rmmod wouldn't remove everything cleanly (osc in use) and so
after a reboot everything *seemed* to start OK

[root@astrofs-mds1 ~]# mount -t lustre
/dev/mapper/MGS on /lustre/MGS type lustre (ro)
/dev/mapper/MDT on /lustre/astrofs-MDT type lustre (ro)
/dev/mapper/MDT0001 on /lustre/astrofs-MDT0001 type lustre (ro)

... but not for long

 kernel: LustreError: 12355:0:(osp_sync.c:343:osp_sync_declare_add())
ASSERTION( ctxt ) failed:
 kernel: LustreError: 12355:0:(osp_sync.c:343:osp_sync_declare_add()) LBUG

possibly corrupt llog?

I see LU-12674, which looks like our problem, but it's only backported
to the 2.12 branch (these servers are still on 2.10.8).

Piecing together what *might* have happened: a user possibly ran out
of inodes and then did a rm -r before the system stopped responding.

Mounting just now I'm getting:
[ 1985.078422] LustreError: 10953:0:(llog.c:654:llog_process_thread())
astrofs-OST0001-osc-MDT0001: Local llog found corrupted #0x7ede0:1:0
plain index 35518 count 2
[ 1985.095129] LustreError:
10959:0:(llog_osd.c:961:llog_osd_next_block()) astrofs-MDT0001-osd:
invalid llog tail at log id [0x7ef40:0x1:0x0]:0 offset 577536 bytes
4096
[ 1985.109892] LustreError:
10959:0:(osp_sync.c:1242:osp_sync_thread())
astrofs-OST0004-osc-MDT0001: llog process with osp_sync_process_queues
failed: -22
[ 1985.126797] LustreError:
10973:0:(llog_cat.c:269:llog_cat_id2handle())
astrofs-OST000b-osc-MDT0001: error opening log id [0x7ef76:0x1:0x0]:0:
rc = -2
[ 1985.140169] LustreError:
10973:0:(llog_cat.c:823:llog_cat_process_cb())
astrofs-OST000b-osc-MDT0001: cannot find handle for llog
[0x7ef76:0x1:0x0]: rc = -2
[ 1985.155321] Lustre: astrofs-MDT0001: Imperative Recovery enabled,
recovery window shrunk from 300-900 down to 150-900
[ 1985.169404] Lustre: astrofs-MDT0001: in recovery but waiting for
the first client to connect
[ 1985.177869] Lustre: astrofs-MDT0001: Will be in recovery for at
least 2:30, or until 1508 clients reconnect
[ 1985.187612] Lustre: astrofs-MDT0001: Connection restored to
a5e41149-73fc-b60a-30b1-da096a5c2527 (at 1170@gni1)
[ 2017.251374] Lustre: astrofs-MDT0001: Connection restored to
7a388f58-bc16-6bd7-e0c8-4ffa7c0dd305 (at 400@gni1)
[ 2017.261374] Lustre: Skipped 1275 previous similar messages
[ 2081.458117] Lustre: astrofs-MDT0001: Connection restored to
10.10.36.143@o2ib4 (at 10.10.36.143@o2ib4)
[ 2081.467419] Lustre: Skipped 277 previous similar messages
[ 2082.324547] Lustre: astrofs-MDT0001: Recovery over after 1:37, of
1508 clients 1508 recovered and 0 were evicted.

Message from syslogd@astrofs-mds2 at Apr 19 17:32:49 ...
 kernel: LustreError: 11082:0:(osp_sync.c:343:osp_sync_declare_add())
ASSERTION( ctxt ) failed:

Message from syslogd@astrofs-mds2 at Apr 19 17:32:49 ...
 kernel: LustreError: 11082:0:(osp_sync.c:343:osp_sync_declare_add()) LBUG
[ 2082.392381] LustreError:
11082:0:(osp_sync.c:343:osp_sync_declare_add()) ASSERTION( ctxt )
failed:
[ 2082.401422] LustreError: 11082:0:(osp_sync.c:343:osp_sync_declare_add()) LBUG
[ 2082.408558] Pid: 11082, comm: orph_cleanup_as
3.10.0-957.1.3.el7_lustre.x86_64 #1 SMP Mon May 27 03:45:37 UTC 2019
[ 2082.418891] Call Trace:
[ 2082.421340]  [] libcfs_call_trace+0x8c/0xc0 [libcfs]
[ 2082.427890]  [] lbug_with_loc+0x4c/0xa0 [libcfs]
[ 2082.434077]  [] osp_sync_declare_add+0x3a9/0x3e0 [osp]
[ 2082.440797]  [] osp_declare_destroy+0xc9/0x1c0 [osp]
[ 2082.447338]  [] lod_sub_declare_destroy+0xce/0x2d0 [lod]
[ 2082.454237]  [] lod_obj_stripe_destroy_cb+0x85/0x90 [lod]
[ 2082.461213]  [] lod_obj_for_each_stripe+0xb6/0x230 [lod]
[ 2082.468104]  [] lod_declare_destroy+0x43b/0x5c0 [lod]
[ 2082.474736]  [] orph_key_test_and_del+0x5f6/0xd30 [mdd]
[ 2082.481538]  [] __mdd_orphan_cleanup+0x5b7/0x840 [mdd]
[ 2082.488250]  [] kthread+0xd1/0xe0
[ 2082.493147]  [] ret_from_fork_nospec_begin+0x7/0x21
[ 2082.499601]  [] 0x
[ 2082.504585] Kernel panic - not syncing: LBUG

e2fsck when mounted as ldiskfs seems to be clean, but is there a way I
can get it mounted enough to run lfsck?
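
(For reference, the plan once it does stay mounted is roughly the
following, with the target name taken from the logs above and the
options from memory, so worth checking against the lctl lfsck_start
documentation:

lctl lfsck_start -M astrofs-MDT0001 -A -t all
lctl get_param mdd.astrofs-MDT0001.lfsck_namespace    # watch progress

assuming I can keep the MDS up long enough for it to run.)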

Alternatively, can I upgrade the MDSs to 2.12.x while having the OSSs
still on 2.10? Yes, I know this isn't ideal, but I wasn't planning a
large upgrade at zero notice to our users (also, we still have a
legacy system accessing it with a 2.7 client - its replacement
arrived last Sept but still hasn't been handed over to us yet, so I
really don't want to get too out of step).

Many thanks

Andrew

[lustre-discuss] Hardware advice for homelab

2021-07-19 Thread Andrew Elwell via lustre-discuss
Hi folks,

Given my homelab testing for Lustre tends to be contained within
VirtualBox on a laptop ($work has a physical hardware test bed once
mucking around gets serious), I'm considering expanding to some real
hardware at home for testing. My MythTV days are over, but I'd ideally
like an aarch64 client that can run on a Raspberry Pi, in case I ever
poke at Kodi.

What server hardware would people advise that fulfils:

* low running cost (it's my electricity bill!)
* fairly cheap to buy (own budget)
* if I'm buying a cased 'nuc' type thing, it must be able to fit in a
3.5" SATA drive (as I have some old ones that fell off the back of a
rack)
* not full of screaming fans

Given it's not planned for production use 24/7, I don't care about HA
with multi-tailed drives, but would quite like the ability to add more
OSSs as required.
Cable sprawl / mounting isn't that much of an issue, providing it can
live in the shed.

Any suggestions?

Andrew


[lustre-discuss] Determining server version from client

2021-01-18 Thread Andrew Elwell
Hi All,

Is there a trivial command to determine the server-side version of
Lustre (in my case, trying to confirm what types of quotas are allowed:
project - 2.10+, default - 2.12+)?

I was hoping there'd be something in lfs, such as lfs getname
--version which would ideally spit out something like

$ lfs getname --version
fs1-9920dde7d000 /fs1 2.10.4
testfs-992073597800 /testfs 2.12.5

but that's wishful thinking :-) as lfs --version merely gives me the
client version as expected

Is this something that's fairly trivial, such that I should open a Jira
ticket for the request? I know the version is known at mount time, as
the kernel can log:
kernel: Lustre: Server MGS version (2.5.1.0) is much older than
client. Consider upgrading server (2.12.5)
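
(The nearest thing I've found so far is grubbing around in the import
files, which on newer servers seem to carry the peer's version - a
sketch, with the field name from memory and older servers simply not
reporting it:

lctl get_param mdc.*.import | egrep 'target:|target_version'

which is close, but hardly the tidy per-filesystem one-liner I was
hoping for.)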

Many thanks

Andrew


Re: [lustre-discuss] quotas not being enforced

2021-01-13 Thread Andrew Elwell
On Thu, 14 Jan 2021 at 17:12, Andrew Elwell  wrote:
> I'm struggling to debug quota enforcement (or more worryingly, lack
> of) in recentish LTS releases.
>
> [root@pgfs-mds3 ~]# lctl conf_param testfs.quota.ost=g
> ... time passes
> [root@pgfs-mds4 ~]# lctl get_param osd-*.*.quota_slave.info | egrep
> '(name|enabled)'
> target name:testfs-OST
> quota enabled:  none
> target name:testfs-OST0001
> quota enabled:  none
>
> doesn't seem to be rippling out.

Not sure I've done the right thing, but poking via
lctl set_param osd-ldiskfs.<target>.quota_slave.enabled=<flags>
*seems* to have done the trick!

[root@pgfs-mds3 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.testfs-MDT.quota_slave.info=
target name:testfs-MDT
pool ID:0
type:   md
quota enabled:  ug
conn to master: setup
space acct: ug
user uptodate:  glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
project uptodate: glb[0],slv[0],reint[0]
[root@pgfs-mds3 ~]#  lctl get_param osd-ldiskfs.*.quota_slave.enabled
osd-ldiskfs.testfs-MDT.quota_slave.enabled=ug
[root@pgfs-mds3 ~]# lctl set_param
osd-ldiskfs.testfs-MDT.quota_slave.enabled=g
osd-ldiskfs.testfs-MDT.quota_slave.enabled=g
[root@pgfs-mds3 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.testfs-MDT.quota_slave.info=
target name:testfs-MDT
pool ID:0
type:   md
quota enabled:  g
conn to master: setup
space acct: ug
user uptodate:  glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
project uptodate: glb[0],slv[0],reint[0]

Another Q - if I use the 2.12+ default quota feature (e.g. sudo lfs
setquota -G -B 10m -I 1025 /testfs) and then read/write with an old
client, will it enforce the quotas OK (even if I can't manipulate /
display them cleanly), or is Something Bad (tm) going to happen
behind the scenes? It _seemed_ to be behaving as expected from the
client side (2.7.mumble).

Many thanks
Andrew


[lustre-discuss] quotas not being enforced

2021-01-13 Thread Andrew Elwell
Hi folks,
I'm struggling to debug quota enforcement (or, more worryingly, the
lack of it) in recent-ish LTS releases.

Our test system (2 servers with shared SAS disks between them) is running
lustre-2.12.6-1.el7.x86_64
e2fsprogs-1.45.6.wc3-0.el7.x86_64
kernel-3.10.0-1160.2.1.el7_lustre.x86_64

but the storage LUNs have been upgraded from 2.7 onwards (maybe a
reformat at 2.10? - it's a test system so gets a hard life).

Couple of things:
1) In the Lustre manual (snapshot as at 2021-01-13), section 25.8,
Lustre Quota Statistics
-- are these obsolete, as in pre-2.4 versions?

[root@pgfs-mds4 ~]# lctl get_param lquota.testfs-OST.stats
error: get_param: param_path 'lquota/testfs-OST/stats': No such
file or directory

i.e. is this what's referred to in LUDOC-362?


2) How long should I have to wait after changing enforcement on the MGS
before it rattles out onto the MDTs/OSTs?

[root@pgfs-mds3 ~]# lctl get_param osd-*.*.quota_slave.info | egrep
'(name|enabled)'
target name:testfs-MDT
quota enabled:  ug
[root@pgfs-mds3 ~]# lctl conf_param testfs.quota.mdt=g
[root@pgfs-mds3 ~]# mount -t lustre
/dev/mapper/TEST_MGT on /lustre/testfs-MGT type lustre
(ro,svname=MGS,nosvc,mgs,osd=osd-ldiskfs,user_xattr,errors=remount-ro)
/dev/mapper/TEST_MDT on /lustre/testfs-MDT type lustre
(ro,svname=testfs-MDT,mgsnode=10.10.36.145@o2ib4:10.10.36.145@o2ib4,osd=osd-ldiskfs,user_xattr,errors=remount-ro)
[root@pgfs-mds3 ~]# lctl get_param osd-*.*.quota_slave.info
osd-ldiskfs.testfs-MDT.quota_slave.info=
target name:testfs-MDT
pool ID:0
type:   md
quota enabled:  ug
conn to master: setup
space acct: ug
user uptodate:  glb[1],slv[1],reint[0]
group uptodate: glb[1],slv[1],reint[0]
project uptodate: glb[0],slv[0],reint[0]
[root@pgfs-mds3 ~]#

i.e. still showing user enforcement.


similarly
[root@pgfs-mds3 ~]# lctl conf_param testfs.quota.ost=g
... time passes
[root@pgfs-mds4 ~]# lctl get_param osd-*.*.quota_slave.info | egrep
'(name|enabled)'
target name:testfs-OST
quota enabled:  none
target name:testfs-OST0001
quota enabled:  none

doesn't seem to be rippling out.

Do I need to umount / tunefs --writeconf / e2fsck / whatever them? -
where do I look for debug info on what's (not) happening?
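
(The places I know to poke are the MGS config logs and the per-target
slave state - commands from memory, so treat this as a sketch:

# on the MGS: did the conf_param actually land in a config log?
lctl --device MGS llog_catlist
lctl --device MGS llog_print <one of the testfs-* logs listed above> | grep -i quota
# on each server: what the slave thinks is enabled right now
lctl get_param osd-*.*.quota_slave.enabled

but pointers to anything better are very welcome.)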

Many thanks
Andrew


[lustre-discuss] CentOS / LTS plans

2020-12-10 Thread Andrew Elwell
Hi All,

I'm guessing most of you have heard of the recent roadmap for CentOS
(discussion of which isn't on topic for this list), but can we have a
vague indication of what the plans for the upcoming releases are likely
to be? (Happy for it to be at the "at this point we're thinking about X,
but we haven't really decided" level.)

Thanks for the 2.12.6 update the other day - that's on this
afternoon's plan to get it on our testbed and I see from Peter's mail
that 2.12.7 will be the next LTS release. Will this likely be using
RHEL 7.x for server again?

Are the remaining 2.12.x LTS releases likely to stick with RHEL 7 for server?

Is the "next big branch" LTS release (whatever that may be) likely to
be based on RHEL 8 for server?



Many thanks

Andrew (who's trying to work out what licence purchases we're likely
to need to include in storage plans)


[lustre-discuss] status of HSM copytools?

2020-08-22 Thread Andrew Elwell
Hi folks,
I'm looking round to see what's current / 'supported' / working in the
copytool space - ideally one that can migrate to/from object stores
(Ceph or S3). The GitHub repo for Lemur
(https://github.com/whamcloud/lemur/commits/master) doesn't seem to
have had any substantial work since it left Intel - unlucky timing
with the owner shift?
I've seen another from Compute Canada
(https://github.com/ComputeCanada/lustre-obj-copytool) but that too
hasn't been touched for years.

Anyone care to comment on some working ones? Horror stories? Ones to avoid?
hey, I'm even (I'll probably regret this) open to _email_ from
salesdroids if you have a working product and can point me to some
users (but don't try and phone me or make me sit through a webinar).

Many thanks


Andrew


[lustre-discuss] Commvault lustre backup / archive

2020-08-10 Thread Andrew Elwell
Hi folks,

I see from their release notes that Commvault should be able to act on
changelogs for backup. Anyone here doing so? Any gotchas to worry about? Is
it better than scanning (ugh) and making the MDS unhappy?
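
(For context, the changelog plumbing I'm hoping it uses rather than a
full scan is the standard consumer registration - a sketch only, with
the filesystem name made up:

lctl --device fsname-MDT0000 changelog_register     # returns a consumer id, e.g. cl1
lfs changelog fsname-MDT0000                        # read records
lfs changelog_clear fsname-MDT0000 cl1 0            # acknowledge what's been processed

and it's the clear/acknowledge discipline I'd want to know Commvault
gets right, since unconsumed changelogs just pile up on the MDT.)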

Similarly, how good is the archive functionality? Does it play well with
the Lustre HSM design?

(Feel free to contact me off list if you'd rather) - I tried to get info
out of our local reseller without much success...

Andrew


Re: [lustre-discuss] Pacemaker resource Agents

2020-07-13 Thread Andrew Elwell
> I've been trying to locate the Lustre specific Pacemaker resource agents but 
> I've had no luck at github where they were meant to be hosted, maybe I am 
> looking at the wrong project?
> Has anyone recently implemented a HA lustre cluster using pacemaker and did 
> you use lustre specific RA's?

I just grabbed them from the repo

https://downloads.whamcloud.com/public/lustre/latest-2.10-release/el7/server/RPMS/x86_64/lustre-resource-agents-2.10.8-1.el7.x86_64.rpm

(yum install lustre-resource-agents)

Andrew


Re: [lustre-discuss] Jobstats harvesting

2020-02-17 Thread Andrew Elwell
On Mon., 17 Feb. 2020, 18:06 Andreas Dilger,  wrote:

> You don't mention which Lustre release you are using, but newer
> releases allow "complex JobIDs" that can contain both the SLURMJobID
> as well as other constant strings (e.g. cluster name), hostname, UID, GID,
> and process name.
>

Yeah, I twigged that once I'd sent the mail: we're still 2.10.8 in
production, so having the option of the more complex jobid string is
another reason for upgrading.
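
For the archives, my understanding of the 2.12-era syntax is along
these lines (untested here since we're still on 2.10, and the format
codes plus the exact interplay of jobid_var and jobid_name are worth
double-checking against the manual):

lctl set_param -P jobid_var=SLURM_JOB_ID
lctl set_param -P jobid_name="cluster1.%j.%u"

with %j expanding to the value of whatever environment variable
jobid_var names.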

Related: I've found the DDN fork of collectd, and I see the lustre2.c
plugin is GPL2, but are there any plans to get it merged upstream?

Andrew
(Also, who's mad enough to be running MythTV on Lustre, judging from the
examples?)



[lustre-discuss] Jobstats harvesting

2020-02-14 Thread Andrew Elwell
Hi folks,

I've finally got round to enabling jobstats on a test system. As we're
a Slurm shop, setting this to jobid_var=SLURM_JOB_ID works OK, but is
it possible to use a combination of variables?
e.g. ${PAWSEY_CLUSTER}-${SLURM_JOB_ID} (or even SLURM_CLUSTER_NAME,
which is the same as $PAWSEY_CLUSTER)? If so, what's the syntax? (Yes, I
know that setting it to federated would jump up the JobId namespace to
include a cluster identifier, but that's not happening for now.)

However, the main reason for this mail is to find out what people use to
harvest the stats off the MDTs/OSTs. I'm aware of Roland Laifer's
LAD15 presentation (sadly his tarball misses a sample config file out,
so it's taken me a bit of iteration over the Perl scripts to recreate
the syntax), which saves to a file-based structure, and I've seen others
using Prometheus (via https://grafana.com/grafana/dashboards/9671).

We've got InfluxDB (lnet / mds / ost stats gathered as well as regular
collectd output) and MariaDB (slurmdbd and Robinhood) DBs available,
so I'd rather go with something that feeds into those.
We're not doing serious high throughput (financial style) but more
traditional HPC with a lot (sigh) of single-node jobs over 4
production filesystems (of which 3 are non-appliance LTS releases
maintained by us).

Hopefully the discussion here will lead to some updated content at
http://wiki.lustre.org/Lustre_Monitoring_and_Statistics_Guide (hat tip
to Scott for a great start)

Many thanks

Andrew


Re: [lustre-discuss] Slow mount on clients

2020-02-04 Thread Andrew Elwell
> HA / MGS running on second node in fstab
:-) that was one of the first things we checked, and I've tried
manually mounting it but no change

10.10.36.224@o2ib4:10.10.36.225@o2ib4:/askapfs1  3.7P  3.0P  507T  86%
/askapbuffer

hpc-admin2:~ # lctl ping 10.10.36.224@o2ib4
12345-0@lo
12345-10.10.36.224@o2ib4
hpc-admin2:~ # lctl ping 10.10.36.225@o2ib4
12345-0@lo
12345-10.10.36.225@o2ib4
hpc-admin2:~ # umount /askapbuffer
hpc-admin2:~ # time mount /askapbuffer/

real 1m15.099s
user 0m0.012s
sys 0m0.021s
hpc-admin2:~ #

and on the server:

[root@askap-fs1-mds01 ~]# mount -t lustre
/dev/mapper/array00_2 on /lustre/MGS type lustre (ro)
/dev/mapper/array00_1 on /lustre/askapfs1-MDT0001 type lustre (ro)
[root@askap-fs1-mds01 ~]# lctl list_nids
10.10.36.224@o2ib4
[root@askap-fs1-mds01 ~]# tunefs.lustre --dryrun /dev/mapper/array00_2
checking for existing Lustre data: found
Reading CONFIGS/mountdata

   Read previous values:
Target: MGS
Index:  unassigned
Lustre FS:  askapfs1
Mount type: ldiskfs
Flags:  0x1004
  (MGS no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.10.36.224@o2ib4:10.10.36.225@o2ib4


   Permanent disk data:
Target: MGS
Index:  unassigned
Lustre FS:  askapfs1
Mount type: ldiskfs
Flags:  0x1004
  (MGS no_primnode )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: failover.node=10.10.36.224@o2ib4:10.10.36.225@o2ib4

exiting before disk write.
[root@askap-fs1-mds01 ~]#


(MDT is mounted on the other node at this time).


Re: [lustre-discuss] Running an older Lustre server (2.5) with a newer client (2.11)

2019-08-30 Thread Andrew Elwell
On Fri., 30 Aug. 2019, 09:01 Kirill Lozinskiy,  wrote:

> Is there anyone out there running Lustre server version 2.5.x with a
> Lustre client version 2.11.x? I'm curious if you are running this
> combination and whether or not you saw and gains or losses when you went to
> the newer Lustre client.
>

Not quite that new (we're still SLES 12 based), but we still have a 2.5
based filesystem (Neo 2.0) and happily mount it on 2.10.8 clients
(together with our 2.10 LTS filesystems).

Given that 2.11 was chosen by the same supplier as their 2.5 based system,
I suspect you'd have a good case for support if you hit any issues...

Andrew



Re: [lustre-discuss] Wanted: multipath.conf for dell ME4 series arrays

2019-08-21 Thread Andrew Elwell
Hi Jeff,

On Wed, 21 Aug 2019 at 17:34, Jeff Johnson
 wrote:
> What underlying Lustre target filesystem? (assuming ldiskfs with a hardware 
> RAID array)
correct - ldiskfs, using 8x RAID6 LUNs per ME4084

> What does your current multipath.conf look like?
we just had blacklist, WWNs and mappings, we were missing any ME4
specific device {} settings, however I've since found the magic
incantation from
https://downloads.dell.com/manuals/common/powervault-me4-series-linux-dell-emc-2018-3924-bp-l_wp_en-us.pdf,
notably

device {
   vendor "DellEMC"
   product "ME4"
   path_grouping_policy "group_by_prio"
   path_checker "tur"
   hardware_handler "1 alua"
   prio "alua"
   failback immediate
   rr_weight "uniform"
   path_selector "service-time 0"
   }
and it seems to be working a whole lot better :-)
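
(For anyone following along, the stanza can be picked up without a
reboot - the usual reload-and-check, assuming the stock multipathd
unit:

systemctl reload multipathd
multipath -ll | head -20     # confirm the paths now group by ALUA priority

before declaring victory.)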




[lustre-discuss] Wanted: multipath.conf for dell ME4 series arrays

2019-08-21 Thread Andrew Elwell
Hi folks,

We're seeing MMP reluctance to hand over the (unmounted) OSTs to the
partner pair on our shiny new ME4084 arrays.

Does anyone have the device {} settings they'd be willing to share?
My gut feel is we've not defined path failover properly and some
timeouts need tweaking.


(4x ME4084s per pair of 740 servers with SAS cabling, Lustre 2.10.8 and CentOS 7.x)

Many thanks


Andrew


[lustre-discuss] State of arm client?

2019-04-24 Thread Andrew Elwell
Hi folks,

I remember seeing a press release from DDN/Whamcloud last November that
they were going to support ARM, but can anyone point me to the current
state of the client?

I'd like to deploy it onto a Raspberry Pi cluster (only 4-5 nodes),
ideally on Raspbian, for demo / training purposes. (Yes, I know it won't
*quite* be InfiniBand performance, but as it's hitting a VM-based set of
Lustre servers, that's the least of my worries.) Ideally 2.10.x, but I'd
take a 2.12 client if it can talk to 2.10.x servers.
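
In case there's nothing packaged, the fallback I'm assuming is a
client-only build from source on the Pi itself - a sketch, with the
branch name and kernel header path needing checking against whatever
Raspbian kernel is current:

git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release && git checkout b2_10
sh autogen.sh
./configure --disable-server --with-linux=/lib/modules/$(uname -r)/build
make -j4

but prebuilt aarch64 packages would obviously be nicer.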


Many thanks
Andrew


[lustre-discuss] lfs check *, change of behaviour from 2.7 to 2.10?

2019-04-09 Thread Andrew Elwell
I've just noticed that 'lfs check mds' / 'lfs check servers' no longer
works (2.10.0 or greater clients) for unprivileged users, yet it worked
for 2.7.x clients.

Is this by design?
(lfs quota thankfully still works as a normal user though)


Andrew


Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-28 Thread Andrew Elwell
On Tue, 26 Feb 2019 at 23:25, Andreas Dilger  wrote:
> I agree that having an option that creates the OSTs as inactive might be 
> helpful, though I wouldn't want that to be the default as I'd imagine it 
> would also cause problems for the majority users that wouldn't know that they 
> need to enable the OSTs after they are mounted.

> Could you file a feature request for this in Jira?
Done https://jira.whamcloud.com/browse/LU-12036


Re: [lustre-discuss] Command line tool to monitor Lustre I/O ?

2018-12-21 Thread Andrew Elwell
On Fri., 21 Dec. 2018, 01:05 Laifer, Roland (SCC) wrote:
> Dear Lustre administrators,
>
> what is a good command line tool to monitor current Lustre metadata and
> throughput operations on the local client or server?
>

I wrote a small Python script to parse lctl get_param output and inject
it straight into our InfluxDB server. As I was dropping this onto a
Sonexion (as well as our newer systems, which had collectd installed) I
didn't want to require any software not already installed on the system.

My plan [one of these days in my spare time] is to wrap it properly as a
collectd Python plugin - would people be interested? I'll probably see
if I can find some time to work on it over Christmas.

Once it's in influx, we can then just plot it with our normal tooling
(grafana) - some pictures in the pptx at
https://www.dropbox.com/s/rck1lm73wlwlg6v/monitoring.pptx?dl=0 (near end)

Andrew



[lustre-discuss] Openstack Manila + Lustre?

2018-11-20 Thread Andrew Elwell
Hi All,

Is there anyone on list exporting Lustre filesystems to (private)
cloud services - possibly via Manila?
I can see 
https://www.openstack.org/assets/science/CrossroadofCloudandHPC-Print.pdf
and Simon's talk from LUG2016
(https://nci.org.au/wp-content/uploads/2016/07/LUG-2016-sjjfowler-hpc-data-in-the-cloud.pdf)
which seems to be pretty much what I'm after.

Does anyone else have updated notes / success / horror stories they'd
be willing to share?

Many thanks


Andrew
(no prizes for guessing who's been asked to look at integrating our
HPC storage with cloud...)


[lustre-discuss] rsync target for https://downloads.whamcloud.com/public/?

2018-09-26 Thread Andrew Elwell
Hi folks,

Is there an rsync (or other easily mirrorable) target for
downloads.whamcloud.com?

I'm trying to pull e2fsprogs/latest/el7/ and
lustre/latest-release/el7/server/ locally to reinstall a bunch of
machines.
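
(Right now I'm falling back to a recursive wget, which works but feels
clumsy - roughly this, with the --cut-dirs depth needing adjustment to
taste:

wget -r -np -nH --cut-dirs=3 -R 'index.html*' \
    https://downloads.whamcloud.com/public/lustre/latest-release/el7/server/

hence the question about a proper rsync target.)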


Many thanks,
Andrew


Re: [lustre-discuss] Does lustre 2.10 client support 2.5 server ?

2017-11-09 Thread Andrew Elwell
> My Lustre server is running the version 2.5 and I want to use 2.10 client.
> Is this combination supported ? Is there anything that I need to be aware of

Two of our storage appliances (Sonexion 1600 based) run 2.5.1; I've
mounted this fine on InfiniBand clients with 2.10.0 and 2.10.1,
but a colleague has since had to downgrade some of our clients to
2.9.0 on OPA / KNL hosts as we were seeing strange issues (can't
remember the ticket details).

We do see the warnings at startup:
Lustre: Server MGS version (2.5.1.0) is much older than client.
Consider upgrading server (2.10.0)

Andrew


Re: [lustre-discuss] 1 MDS and 1 OSS

2017-10-30 Thread Andrew Elwell
On 31 Oct. 2017 07:20, "Dilger, Andreas"  wrote:


Having a larger MDT isn't bad if you plan future expansion.  That said, you
would get better performance over FDR if you used SSDs for the MDT rather
than HDDs (if you aren't already planning this), and for a single OSS you
probably don't need the extra MDT capacity.  With both ldiskfs+LVM and ZFS
you can also expand the MDT size in the future if you need more capacity.


Can someone with wiki editing rights summarise the advantages of different
hardware combinations? For example I remember Daniel @ NCI had some nice
comments about which components (MDS v OSS) benefited from faster cores
over thread count and where more RAM was important.

I feel this would be useful for people building small test systems and
comparing vendor responses for large tenders.

Many thanks,
Andrew


[lustre-discuss] Point release updates

2017-09-08 Thread Andrew Elwell
Hi Folks,

We currently have a couple of storage systems based on IEEL 3.0:

[root@pgfs-oss1 ~]# cat /proc/fs/lustre/version
lustre: 2.7.16.8
kernel: patchless_client
build:  
jenkins-arch=x86_64,build_type=client,distro=el7,ib_stack=inkernel-15--PRISTINE-3.10.0-327.36.1.el7_lustre.x86_64
[root@pgfs-oss1 ~]#

[root@astrofs-oss1 ~]# cat /proc/fs/lustre/version
lustre: 2.7.19.8
kernel: patchless_client
build:  
jenkins-arch=x86_64,build_type=server,distro=el7,ib_stack=inkernel-165--PRISTINE-3.10.0-514.2.2.el7_lustre.x86_64
[root@astrofs-oss1 ~]#

and we'd like to update these -- preferred choice will be to 2.10.1
once it's out, but happy to go for an intermediate 2.7 release.
(mainly because we're seeing lots of these in the logs:
astrofs-MDT0001-osd: FID [whatever] != self_fid [whatever] - which
seem to be https://jira.hpdd.intel.com/browse/LU-8532 / LU-8319 which
apparently has a backport to 2.7).

However - where do we get "blessed" point release updates from?
https://downloads.hpdd.intel.com/public/lustre/ doesn't seem to have
any.

Many thanks


Andrew