Re: [Lustre-discuss] how do I deactivate a very wonky OST

2015-01-22 Thread Thomas Roth
ns hang (ls, df, etc). > > Brian Andrus > ITACS/Research Computing > Naval Postgraduate School > Monterey, California

Re: [lustre-discuss] Unable to deactivate a device

2015-05-01 Thread Thomas Roth
Hi Brian, I think I have seen the persistent "UP", too. But I managed to deactivate my OST nevertheless, by > root@MDS:~# lctl set_param osc.WORK-OST008-osc*.active=0 And this has to be done on the MDS - to take effect everywhere. Cheers, Thomas On 05/01/2015 05:42 PM, Andrus, Brian Contractor
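A minimal sketch of that sequence (the OST name is taken verbatim from the message; substitute your own target):
> lctl set_param osc.WORK-OST008-osc*.active=0   # on the MDS: stop new allocations to this OST
> lctl dl | grep OST008                          # check the device list afterwards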

Re: [lustre-discuss] quotacheck on a live fs (pre 2.4)

2016-03-20 Thread Thomas Roth
My guess is that the trouble is caused by the bytes added or moved while the quotacheck-scan is running. That gives you an estimate of the size of the inaccuracy. We gave the users/groups quota in the order of 100 - 1000 terabytes. During quotacheck, people would not write more than a few Teraby

[lustre-discuss] MDT quota problem / MDS crash 2.5.3

2016-07-12 Thread Thomas Roth
had the same kbytes used...) Regards, Thomas

Re: [lustre-discuss] MDT quota problem / MDS crash 2.5.3

2016-07-14 Thread Thomas Roth
re-enabling the quota support (via 'tune2fs -O ^quota' and a subsequent 'tunefs.lustre --quota') on the MDT, we were able to repair it. Maybe that helps in your case, too... On Tue, 12 Jul 2016, Thomas Roth wrote: Hi all, we are running Lustre 2.5.3 on ou
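The repair sequence referenced here, reconstructed as a sketch ('/dev/mdt' and the mount point are placeholders; the MDT must be unmounted while the quota feature is toggled):
> umount /mnt/mdt                 # stop the MDT
> tune2fs -O ^quota /dev/mdt      # drop the ldiskfs quota feature
> tunefs.lustre --quota /dev/mdt  # re-enable it; quota files are rebuilt
> mount -t lustre /dev/mdt /mnt/mdt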

Re: [lustre-discuss] luster client mount issues

2016-07-17 Thread Thomas Roth

Re: [lustre-discuss] MDT quota problem / MDS crash 2.5.3

2016-07-19 Thread Thomas Roth
e appreciated. Thanks. Gary. On 19/07/16 06:24 AM, Dilger, Andreas wrote: On Jul 14, 2016, at 04:13, Thomas Roth wrote: Hi Guido, thanks for the tip, that was successful, with the exact same commands, tune2fs -O ^quota /dev/mdt (took about ~3min) tunefs.lustre --quota /dev/mdt (took

[lustre-discuss] lnet peer credits

2016-08-01 Thread Thomas Roth
8 -419 0 (The last line, the only peer that is "up", is an LNET-router) Something to worry about? Cheers, Thomas

[lustre-discuss] client server communication half-lost, read-out?

2016-08-01 Thread Thomas Roth
wireshark et al.? Cheers, Thomas

Re: [lustre-discuss] lnet peer credits

2016-08-02 Thread Thomas Roth
Thanks, Chris, something less to worry about ;-) Thomas On 08/01/2016 08:16 PM, Christopher J. Morrone wrote: On 08/01/2016 06:33 AM, Thomas Roth wrote: Hi all, is there a kind of a rule of thumb for the "min" number in /proc/sys/lnet/peers? No, there is no rule of thumb. It depe

[lustre-discuss] quota on zfs wrong

2016-08-02 Thread Thomas Roth
is not a per-OST ZFS quota? Regards, Thomas

[lustre-discuss] ZFS not freeing disk space

2016-08-10 Thread Thomas Roth
symptom of a broken OST? I think I have seen this behavior before, and the "df" result shrank to an expected value after the server had been rebooted. In that case, this seems more like an overly persistent caching effect? Cheers, Thomas

Re: [lustre-discuss] ZFS not freeing disk space

2016-08-11 Thread Thomas Roth
oot of the MDS. Out of the question, of course, in a production environment. Cheers, Thomas

[lustre-discuss] RDMA too fragmented, OSTs unavailable (permanently)

2016-09-10 Thread Thomas Roth
re are some routers which connect to an older cluster, but of course the old (1.8) clients never show any of these errors. Cheers, Thomas

Re: [lustre-discuss] RDMA too fragmented, OSTs unavailable (permanently)

2016-10-14 Thread Thomas Roth
g On Sep 10, 2016, at 12:38 AM, Thomas Roth wrote: Hi all, we are running Lustre 2.5.3 on Infiniband. We have massive problems with clients being unable to communicate with any number of OSTs, rendering the entire cluster quite unusable. Clients show LNetError: 1399:0:(o2iblnd

[lustre-discuss] lost files on ZFS

2016-10-30 Thread Thomas Roth
this be possible so easily? Regards, Thomas

[lustre-discuss] "getting" ldlm_enqueue_min

2017-03-29 Thread Thomas Roth
missing a parameter? Regards, Thomas

[lustre-discuss] "getting" ldlm_enqueue_min

2017-03-29 Thread Thomas Roth
or are we missing a parameter? Regards, Thomas -- ---- Thomas Roth GSI Helmholtzzentrum für Schwerionenforschung GmbH Planckstraße 1 64291 Darmstadt www.gsi.de Gesellschaft mit beschränkter Haftung Sitz der Gesellschaft: Darmstadt Handelsregister:

Re: [lustre-discuss] "getting" ldlm_enqueue_min

2017-03-30 Thread Thomas Roth
400 options ptlrpc at_min=40 options ptlrpc ldlm_enqueue_min=260 Malcolm. On 30/3/17, 1:39 am, "lustre-discuss on behalf of Thomas Roth" wrote: Hi all, I found that I can set 'ldlm_enqueue_min', but not read it. At least > lctl set_param -P ldlm_enqueue_
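Malcolm's settings quoted above belong in a modprobe configuration file; a sketch (values as quoted, not recommendations):
# /etc/modprobe.d/lustre.conf on the server
options ptlrpc at_min=40
options ptlrpc ldlm_enqueue_min=260
The thread's actual finding stands: 'lctl set_param -P ldlm_enqueue_min=...' is accepted on the MGS, but there is no matching get_param to read the value back.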

Re: [lustre-discuss] Lustre/ZFS space accounting

2017-06-08 Thread Thomas Roth
debug this? Cheers, Hans Henrik

[lustre-discuss] Funny message when mounting ZFS OSTs

2017-08-08 Thread Thomas Roth
Hi all, while installing some servers with ZFS and Lustre 2.10, for the first OSS I created the kmod-lustre-osd-zfs with zfs-0.6.5 (I think). The second OSS already got zfs-0.7, so I created new Lustre modules. Both went fine. Of course I was curious about what happens when updating ZFS-benea

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-26 Thread Thomas Roth

[lustre-discuss] ZFS-OST layout, number of OSTs

2017-10-22 Thread Thomas Roth
Hi all, I have done some "fio" benchmarking, amongst other things to test the proposition that to get more iops, the number of disks per raidz should be less. I was happy I could reproduce that: one server with 30 disks in one raidz2 (=one zpool = one OST) is indeed slower than one with 30 disks
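A sketch of a small random-read job of the kind such comparisons typically use (not the author's actual job file; path, size, and job count are placeholders):
# randread.fio - run with: fio randread.fio
[global]
directory=/mnt/ost_test
direct=1
ioengine=libaio
iodepth=16
bs=4k
runtime=60
time_based

[randread]
rw=randread
size=4g
numjobs=8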

Re: [lustre-discuss] ZFS-OST layout, number of OSTs

2017-10-22 Thread Thomas Roth
the clients. I don’t have a good sense of how serious this is in practice, but I know some larger sites worry about it. - Patrick From: lustre-discuss on behalf of Thomas Roth Sent: Sunday, October 22, 2017 9:04:35 AM To: Lustre Discuss Subject: [lustre

[lustre-discuss] new client - failover mds: no connection

2017-10-24 Thread Thomas Roth
Regards, Thomas

Re: [lustre-discuss] new client - failover mds: no connection

2017-10-24 Thread Thomas Roth
:26 PM, Thomas Roth wrote: Hi all, in a Lustre 2.10, CentOS 7.4 test system, I have a pair of MDS, format command was > mkfs.lustre --mgs --mdt --fsname=test --index=0 --servicenode=10.20.1.198@o2ib5 --servicenode=10.20.1.199@o2ib5     --mgsnode=10.20.1.198@o2ib5 --mgsnode=10.20.1.
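A client mount that lists both service nodes from the mkfs command above would look roughly like this (the mount point is a placeholder):
> mount -t lustre 10.20.1.198@o2ib5:10.20.1.199@o2ib5:/test /mnt/test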

Re: [lustre-discuss] ZFS-OST layout, number of OSTs

2017-10-26 Thread Thomas Roth
On the other hand if we gather three or four raidz2s into one zpool/OST, loss of one raidz means loss of a 120-160TB OST. Around here, this is usually the deciding argument. (Even temporarily taking down one OST for whatever repairs would take more data offline). How is the general experience

[lustre-discuss] IML no zfs?

2017-10-26 Thread Thomas Roth
Hi all, wanted to give IML (4.0.0.0) a try, as a monitor for an existing test system, not to create it from scratch. This works only partially: the "Detect File Systems" fails for my three OSS, all of which use ZFS on disk. The log window says modprobe osd_ldiskfs: 1 modprobe: FATAL: Module

[lustre-discuss] quota: space accounting isn't functional

2017-11-17 Thread Thomas Roth
Hi all, I have this test system where the OSS are CentOS 7.4, ZFS 0.7.1, the MDS uses ldiskfs. Lustre version = 2.10 When I check the quota of some user - "lfs quota -u troth /lustre/hebetest" - I'm told by the client that the data may be inaccurate, log entry is > LustreError: 10006:0:(osc_

Re: [lustre-discuss] quota: space accounting isn't functional

2017-12-05 Thread Thomas Roth
- Total allocated inode limit: 0, total allocated block limit: 0 (and of course the group quota also works). Cheers, Thomas On 11/17/2017 03:51 PM, Thomas Roth wrote: Hi all, I have this test system where the OSS are CentOS 7.4, ZFS 0.7.1, the MDS uses ldiskfs. Lustre

[lustre-discuss] size of MDT, inode count, inode size

2018-01-26 Thread Thomas Roth
o the manual and various Jira/Ludocs the size should be 2k nowadays? Actually, the command within mkfs.lustre reads mke2fs -j -b 4096 -L test0:MDT -J size=4096 -I 1024 -i 2560 -F /dev/sdb 241699072 -i 2560 ? Cheers, Thomas

Re: [lustre-discuss] size of MDT, inode count, inode size

2018-01-26 Thread Thomas Roth
should better be such that it leads to "-I 1024 -i 2048"? Regards, Thomas On 01/26/2018 03:10 PM, Thomas Roth wrote: Hi all, what is the relation between raw device size and size of a formatted MDT? Size of inodes + free space = raw size? The example: MDT device has 922 GB i
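For reference, the inode settings under discussion are passed via --mkfsoptions; a hypothetical example (device and fsname are placeholders, and whether '-i 2048' fits depends on the expected file count):
> mkfs.lustre --mgs --mdt --fsname=test0 --index=0 --mkfsoptions="-I 1024 -i 2048" /dev/sdb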

[lustre-discuss] mgsnode notation in mkfs and tunefs

2018-03-02 Thread Thomas Roth
Hi all, (we are now on Lustre 2.10.2.) It seems there is still a difference in how to declare --mgsnode between mkfs.lustre and tunefs.lustre. For an OST, I did: > mkfs.lustre --ost --backfstype=zfs --mgsnode=10.20.3.0@o2ib5:10.20.3.1@o2ib5 --... osspool0/ost0 This OST mounts, is usable, all
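Side by side, the two notations in question (NIDs copied from the message; which form each tool accepts is exactly the point raised here):
> mkfs.lustre --ost --backfstype=zfs --mgsnode=10.20.3.0@o2ib5:10.20.3.1@o2ib5 ... osspool0/ost0   # colon-joined, accepted by mkfs.lustre
> tunefs.lustre --mgsnode=10.20.3.0@o2ib5 --mgsnode=10.20.3.1@o2ib5 osspool0/ost0                  # repeated option, a hypothetical tunefs form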

[lustre-discuss] MDT LBUG after every restart

2018-03-12 Thread Thomas Roth

Re: [lustre-discuss] MDT LBUG after every restart

2018-03-26 Thread Thomas Roth
/2018 09:54 AM, Thomas Roth wrote: Hi all, our production system running Lustre 2.5.3 has broken down, and I'm quite clueless. The second (of two) MDTs crashed and after reboot + recovery LBUGs again with: Mar 11 20:02:37 lxmds15 kernel: Lustre: nyx-MDT0001: Recovery over after 1:36, o

[lustre-discuss] Upgrade to 2.11: unrecognized mount option

2018-04-06 Thread Thomas Roth
Hi all, (don't know if it isn't a bit early to complain yet, but) I have upgraded an OSS and MDS von 2.10.2 to 2.11.0, just installing the downloaded rpms - no issues here, except when mounting the MDS: > LDISKFS-fs (drbd0): Unrecognized mount option "context="unconfined_u:object_r:user_tmp_t:

Re: [lustre-discuss] OSTffff created :-(

2018-05-25 Thread Thomas Roth
ion/LeENDmtfhfsnsBfJnpimw0EmABFKDmABFKDmxSGKDmABFKDmhbxd6n/qq2': > Cannot send after transport endpoint shutdown > > same is true for "lctl --device XX deactivate". > > > So we are looking for ways now to: > > 1.) set the OST read-only but ke

[lustre-discuss] Jira? no access / disappeared?

2018-06-21 Thread Thomas Roth
.intel.com/browse/LU-10794 Is it just me? my browser? Did I miss something? Cheers, Thomas

[lustre-discuss] lnetctl net add does not alway work

2018-06-23 Thread Thomas Roth
tion nodes with add: - net: errno: -22 descr: "cannot add network: Invalid argument" Other than the busy lnet on the production nodes I cannot see any difference. What did I miss? Regards, Thomas

Re: [lustre-discuss] lnetctl net add does not alway work

2018-06-24 Thread Thomas Roth
loads all modules and starts all processes; works fine for everything but "lnetctl", which is capricious and wants the "network up" explicitly. Cheers, Thomas On 23.06.2018 21:46, Thomas Roth wrote: Hi all, I have a test node and some active batch nodes, all running Lustre 2.10.
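The explicit "network up" step that lnetctl wants is its own configure call; a sketch (the interface name ib0 is a placeholder):
> lnetctl lnet configure                 # bring LNet itself up first
> lnetctl net add --net o2ib5 --if ib0
> lnetctl net show                       # confirm the new network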

[lustre-discuss] Lustre log messages and log files

2018-08-13 Thread Thomas Roth
stre: haven't heard from client...", obviously I have not cut off Lustre completely. What did I do wrong? Many regards, Thomas

Re: [lustre-discuss] Lustre log messages and log files

2018-08-13 Thread Thomas Roth
ot what you need > > regards, > J. > > On 08/13/2018 02:00 PM, Thomas Roth wrote: >> Hi all, >> >> we have this rather rare phenomenon of too few Lustre log entries - it would >> seem. >> This is a cluster running Lustre 2.10.4 on CentOS 7.4 >>

[lustre-discuss] different quota counts between different Lustre versions?

2018-09-13 Thread Thomas Roth
between those Lustre versions? However, some copied directories have much larger count-differences, but some have identical numbers. Cheers, Thomas

Re: [lustre-discuss] different quota counts between different Lustre versions?

2018-09-13 Thread Thomas Roth
e accounting > didn't immediately change. > After stopping Lustre, exporting/importing the pools again, then things > showed correctly and the > userobj_accounting went from "enabled" to "active". > > Cameron > > > On 09/13/2018 05:33 AM, Thomas Rot
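The userobj_accounting mentioned in the reply is a ZFS pool feature flag; enabling it on an upgraded pool would look roughly like this (pool name is a placeholder; per the reply, it only goes from "enabled" to "active" after the pool is exported and re-imported):
> zpool set feature@userobj_accounting=enabled ostpool
> zpool get feature@userobj_accounting ostpool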

Re: [lustre-discuss] new mounted client shows lower disk space

2018-11-14 Thread Thomas Roth
Hi, your error messages are all well known - the one on the OSS will show up as soon as the Lustre modules are loaded, provided you have some clients asking for the OSTs (and your MDT, which should be up by then, is also looking for the OSTs). The kiblnd_check_conns message I have also seen quit

[lustre-discuss] projects and project quota

2018-12-10 Thread Thomas Roth
manual nor via Google. Any instructions anywhere? Thanks, Thomas

Re: [lustre-discuss] projects and project quota

2018-12-11 Thread Thomas Roth
Thanks, Daniel. So we will wait a little longer. Regards, Thomas On 12/10/18 5:49 PM, Daniel Kobras wrote: > Hi! > > On 10.12.18 at 17:38, Thomas Roth wrote: >> Why do I get an error when enabling project quota? >> >> The system ("hebe") is running Lu

[Lustre-discuss] error e2fsck run for lfsck

2010-09-10 Thread Thomas Roth
Hi all, on a 1.8.4 test system, I tried prepare for lfsck and got an error from e2fsck: mds:~# e2fsck -n -v --mdsdb /tmp/mdsdb /dev/sdb2 e2fsck 1.41.10.sun2 (24-Feb-2010) lustre-MDT lustre database creation, check forced. Pass 1: Checking inodes, blocks, and sizes MDS: ost_idx 0 max_id 190010
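For context, the 1.8-era lfsck needed database-building e2fsck passes on the MDT and on every OST before the actual check; a sketch of that documented sequence (devices and db paths are placeholders):
> e2fsck -n -v --mdsdb /tmp/mdsdb /dev/sdb2                       # on the MDS, as above
> e2fsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /dev/ostdev  # on each OSS
> lfsck -n -v --mdsdb /tmp/mdsdb --ostdb /tmp/ostdb /lustre       # on a client, read-only pass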

Re: [Lustre-discuss] error e2fsck run for lfsck

2010-09-18 Thread Thomas Roth
Thanks, Daniel. I have tried on another test system without pools, and there it worked indeed. Regards, Thomas On 09/10/2010 08:48 PM, Daniel Kobras wrote: > Hi Thomas! > > On Fri, Sep 10, 2010 at 08:16:57PM +0200, Thomas Roth wrote: >> on a 1.8.4 test system, I tried prepare fo

[Lustre-discuss] Question about adaptive timeouts, not sending early reply

2010-09-18 Thread Thomas Roth
Hi all, I'm trying to understand MDT logs and adaptive timeouts. After upgrade to 1.8.4 and while users believed Lustre to be still in maintenance (= no activity), the MDT log just shows Lustre: 19823:0:(service.c:808:ptlrpc_at_send_early_reply()) @@@ Couldn't add any time (42/30), not sending ea

[Lustre-discuss] no failover with failover MDS

2010-09-18 Thread Thomas Roth
Hi all, we have two servers A, B as a failover MGS/MDT pair, with IPs A=10.12.112.28 and B=10.12.115.120 over tcp. When server B crashes, MGS and MDT are mounted on A. Recovery times out with only one out of 445 clients recovered. Afterwards, the MDT lists all its OSTs as UP and in the logs of

Re: [Lustre-discuss] ls does not work on ram disk for normal user

2010-09-22 Thread Thomas Roth

[Lustre-discuss] mkfs.lustre fails, ldiskfs: ext4 or ext3 ?

2010-11-03 Thread Thomas Roth
a magic switch for mkfs.lustre? Cheers, Thomas

[Lustre-discuss] robinhood error messages

2010-11-23 Thread Thomas Roth
Hi all, we are running robinhood (v2.2.1) on our 1.8.4 cluster (basically to find out where and who the big space consumers are - no purging). Robinhood sends me lots and lots of messages (~100/day) of the type > = FS scan is blocked (/lustre) = > Date: 2010/11/23 20:05:22 > Program:

Re: [Lustre-discuss] [robinhood-support] robinhood error messages

2010-11-24 Thread Thomas Roth
od daemon when everything is fixed. > > Best regards, > Thomas LEIBOVICI > CEA/DAM

Re: [Lustre-discuss] [robinhood-support] robinhood error messages

2010-11-24 Thread Thomas Roth
On 24.11.2010 15:17, LEIBOVICI Thomas wrote: > Thomas Roth wrote: > > > ListMgr | DB query failed in ListMgr_Insert line 340... > > and assorted messages, which seem to indicate that the new robinhood > > scan tries to put something into the DB that is already there, a

[Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-21 Thread Thomas Roth
Hi all, we have gotten new MDS hardware, and I've got two questions: What are the recommendations for the RAID configuration and formatting options? I was following the recent discussion about these aspects on an OST: chunk size, strip size, stride-size, stripe-width etc. in the light of the 1

Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-22 Thread Thomas Roth
s could have a different numbering. Very well, we'll stick to one MGS, then. Thomas >> -Original Message- >> From: lustre-discuss-boun...@lists.lustre.org on behalf of Thomas Roth >> Sent: Fri 1/21/2011 6:43 AM >> To: lustre-discuss@lists.lustre.org >> Subject: [Lu

Re: [Lustre-discuss] MDT raid parameters, multiple MGSes

2011-01-22 Thread Thomas Roth
have size 935M. So for some reason, one has a metadata entry that appears as a huge sparse file, the other does not. Is there a reason, or is this just an illness of our installation? Cheers, Thomas On 01/21/2011 09:31 PM, Cliff White wrote: > > > On Fri, Jan 21, 2011 at 3

[Lustre-discuss] llverfs outcome

2011-01-27 Thread Thomas Roth
Hi all, I have run llverfs (lustre-utils 1.8.4) on an OST partition as "llverfs -w -v /srv/OST0002". That went smoothly until all 9759209724 kB were written, terminating with: write File name: /srv/OST0002/dir00072/file022 write complete llverfs: writing /srv/OST0002/llverfs.filecount failed :N

Re: [Lustre-discuss] llverfs outcome

2011-01-31 Thread Thomas Roth
but it doesn't perform differently with llverfs. So I'm still in the dark as to whether we should use these larger partitions.. Cheers, Thomas On 27.01.2011 20:06, Andreas Dilger wrote: > On 2011-01-27, at 04:56, Thomas Roth wrote: > > I have run llverfs (lustre-utils 1.8.4

Re: [Lustre-discuss] Migrating MDT volume to a new location

2011-02-03 Thread Thomas Roth
On 02.02.2011 18:15, Bob Ball wrote: > Is there a recommended way to migrate an MDT (MGS is separate) volume > from one location to another on the same server? This uses iSCSI volumes. > > Lustre 1.8.4 > We'll try the copy (DRBD) + resize variant soon. I've tried that with a backup copy of the M

Re: [Lustre-discuss] Migrating MDT volume to a new location

2011-02-03 Thread Thomas Roth
resize. >> >> Good luck, >> Frederik

[Lustre-discuss] MDT extremely slow after restart

2011-04-02 Thread Thomas Roth
Hi all, we are suffering from a severe metadata performance degradation on our 1.8.4 cluster and are pretty clueless. - We moved the MDT to new hardware, since the old one was failing - We increased the size of the MDT with 'resize2fs' (+ mounted it and saw all the files) - We found the perform

Re: [Lustre-discuss] MDT extremely slow after restart

2011-04-04 Thread Thomas Roth
ying disk, did that hardware/RAID config change > when you switched hardware? > The 'still busy' message is a bug, may be fixed in 1.8.5 > cliffw > > > On Sat, Apr 2, 2011 at 1:01 AM, Thomas Roth <mailto:t.r...@gsi.de>> wrote: > > Hi all, > >

Re: [Lustre-discuss] aacraid kernel panic caused failover

2011-04-06 Thread Thomas Roth

Re: [Lustre-discuss] aacraid kernel panic caused failover

2011-04-06 Thread Thomas Roth
dition. >> >> Not the best controller for an OSS. >> >> --Jeff

[Lustre-discuss] high OSS load - readcache_max_filesize

2011-05-05 Thread Thomas Roth
Hi all, a recent posting here (which I can't find atm) has pointed me to http://jira.whamcloud.com/browse/LU-15, where an issue is discussed that we seem to see as well: some OSS really get overloaded, and the log says slow journal start 36s due to heavy IO load slow commitrw commit 36s due to

[Lustre-discuss] MDT error messages

2011-06-07 Thread Thomas Roth
Hi all, there are some "new" error messages on our MDT, haven't seen these before and according to Google nobody else has... The usual question: what does it mean? Something to worry about? > Jun 7 06:23:53 lxmds kernel: [4565451.097596] LustreError: 9998:0:(obd.h:1372:lsm_op_find()) Cannot r

[Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
Hi all, I'd like to mount two Lustre filesystems on one client. Issues with more than one MGS set aside, the point here is that one of them is an Infiniband-cluster, the other is ethernet-based. And my client is on the ethernet. I have managed to mount the o2ib-fs by setting up an LNET router, b

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
create route to tcp via Gateway-IP@tcp Cheers, Thomas On 06/14/2011 07:00 PM, Michael Shuey wrote: > Is your ethernet FS in tcp1, or tcp0? Your config bits indicate the > client is in tcp1 - do the servers agree? > > -- > Mike Shuey > > > > On Tue, Jun 14, 2011 at 12:23 PM,

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib

2011-06-14 Thread Thomas Roth
;re using lnet routers. Without that > parameter, the lustre client will automatically mark a router as > failed when it's unavailable but will not check to see if it ever > comes back. With this param, it checks every 300 seconds (and > re-enables it if found). > > Hope this

Re: [Lustre-discuss] Mount 2 clusters, different networks - LNET tcp1-tcp2-o2ib - solved?

2011-06-14 Thread Thomas Roth
endeavor fails). The routes=statement seems to say: "If you have data for tcp, use the Default-Router-IP and go via the interface that is on network tcp1". Oh well, I should probably take some networking lectures... Regards, Thomas On 06/14/2011 06:23 PM, Thomas Roth wrote: > Hi al
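The resulting client-side configuration, condensed into a sketch (the router IP is a placeholder, and the route shown targets the Infiniband side the thread set out to reach; the interval option matches the 300-second router re-check quoted earlier in the thread):
# /etc/modprobe.d/lustre.conf on the ethernet-only client
options lnet networks="tcp1(eth0)" routes="o2ib <Router-IP>@tcp1"
options lnet dead_router_check_interval=300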

[Lustre-discuss] Emptied OSTs not empty

2011-06-27 Thread Thomas Roth
Hi all, I am currently moving files off a number of OSTs - some in a machine with a predicted hardware failure, some for decommissioning old hardware etc. I'm deactivating the OSTs on the MDS, then "lfs find --obd OST_UUID /dir" to create a list of files to migrate. When finished, the O
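The drain procedure maps onto commands like these (fsname, OST index, and path are placeholders; the find-pipe-migrate usage follows the lfs_migrate manual):
> lctl dl | grep OST0004                                        # on the MDS: find the osc device number
> lctl --device <devno> deactivate                              # stop new allocations to the OST
> lfs find --obd lustre-OST0004_UUID /lustre | lfs_migrate -y   # on a client: move remaining files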

Re: [Lustre-discuss] problem with clients and multiple transports

2011-11-09 Thread Thomas Roth
Your clients have both ib and tcp nids? Because I encountered a strange behavior trying to mount an ib based FS and a tcp based FS on the same (ethernet-only) client. To connect to the ib MDS it had to go through a lnet router, of course. Experimentally, I found > options lnet networks=tcp0(et

Re: [Lustre-discuss] Problems with lustre router setup IB <-> TCP

2011-12-23 Thread Thomas Roth

Re: [Lustre-discuss] removing ost

2012-03-24 Thread Thomas Roth
> How can I see what stays on home-OST0004?

Re: [Lustre-discuss] Lustre on Debian

2012-04-06 Thread Thomas Roth
Hi Marinho, no problem for Lustre 1.8. All the necessary packages are here: http://pkg-lustre.alioth.debian.org/backports/lustre-1.8.7-wc1-squeeze/ We've been running Lustre on Debian since version 1.5.9 (aka beta for 1.6, on Sarge! ;-)). Now we are at 3.5 PB, on 200+ servers. No Debian-specific

Re: [Lustre-discuss] Problems getting Lustre started with ZFS

2013-10-26 Thread Thomas Roth
stre:svname lustre:OST local > lustre4.calthrop.com > NAME PROPERTY VALUE SOURCE > lustre-ost0 lustre:svname - - > lustre-ost0/ost0 lustre:svname lustre:OST local

[lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-04-24 Thread Thomas Roth
connection to data-on-mdt which we don't use. Any suggestions? Regards, Thomas

Re: [lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-05-06 Thread Thomas Roth
t and are put into a blocking mode. I guess the hard part would be to > re-negotiate all the state after the upgrade, which is hard enough for > regular replays. > > Cheers, > Hans Henrik > >>> On Wednesday, April 24, 2019 3:54:09 AM, Thomas Roth wrote: >>>&g

Re: [lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-05-07 Thread Thomas Roth

[lustre-discuss] quota distribution over OSTs

2019-06-21 Thread Thomas Roth
OSTs like "OST0004_UUID 28k* - 28k", but writing still works. Any clues to clear my confusion? Best regards Thomas -- -------- Thomas Roth Department: Informationstechnologie GSI Helmholtzzentrum für Schwerionenf

[lustre-discuss] Error when mounting additional MDT

2019-07-04 Thread Thomas Roth
file put into that directory shows # lfs getstripe -M /lustre/test2/testfile 2 Does this mean that the log for hebe-MDT0002 has been written? Or should we do the big writeconf? Regards, Thomas

Re: [lustre-discuss] RV: Lustre quota issues

2019-07-08 Thread Thomas Roth

Re: [lustre-discuss] RV: Lustre quota issues

2019-07-10 Thread Thomas Roth
> As you can see, I have some OSTs deactivated, because I will remove them. > I have set quotas without "quota" (soft with -b), only setting "limit" (hard with -B),

[lustre-discuss] mdt: unhealthy - healthy

2019-07-26 Thread Thomas Roth
some action? Regards, Thomas

Re: [lustre-discuss] Lustre 2.12.3 released

2019-10-28 Thread Thomas Roth
Hi, on downloads.whamcloud.com, there is still a directory lustre-2.12.3-ib (/MOFED-4.7-1.0.0.1/) I went for the "-ib" because before, that helped with LU-10736. However, if MOFED-4.7 is the default now, the lustre-2.12.3-ib directory just contains the mlnx-ofa_kernel packages and libraries i

[lustre-discuss] MDT deadlocks LU-10697

2019-11-13 Thread Thomas Roth
Regards, Thomas

Re: [lustre-discuss] MDT deadlocks LU-10697

2019-11-13 Thread Thomas Roth
es in both browsers ... ? Because here at home I find LU-12136 immediately ;-) Anyhow, thanks Nathan. So, LU-12018 could be covered by our planned upgrade to 2.12, very good. Regards, Thomas On 13.11.19 17:24, Nathan Dauchy - NOAA Affiliate wrote: On Wed, Nov 13, 2019 at 4:28 AM Thomas Roth

Re: [lustre-discuss] MDT deadlocks LU-10697

2019-11-13 Thread Thomas Roth
MDS runs kernel 3.10.0-957.el7_lustre, from downloads.whamcloud -> lustre-2.10.6-ib, on CentOS 7.5 On 13.11.19 18:20, Colin Faber wrote: Which kernel are you running? https://access.redhat.com/solutions/3393611 On Wed, Nov 13, 2019 at 4:28 AM Thomas Roth wrote: Hi all, we keep hitt

[lustre-discuss] OSS read cache disappeared?

2019-12-11 Thread Thomas Roth
t parameter). Regards, Thomas

[lustre-discuss] MDT restart: WAITING non-ready MDTs

2020-01-20 Thread Thomas Roth
of multiple MDTs: I thought they were independent of each other - except that MDT 0 has the root of the filesystem, of course. But the others, waiting for everybody to be online? Regards, Thomas

Re: [lustre-discuss] MDT restart: WAITING non-ready MDTs

2020-01-20 Thread Thomas Roth
starting 0 - 1 - 2 ? Regards, Thomas On 20/01/2020 14.00, Thomas Roth wrote: Hi all, I had to restart our MDTs 1 and 2. No. 2 is still doing a file system check, no. 1 is mounted again and should be in recovery, however: :~# cat recovery_status status: WAITING non-ready MDTs:  0002 reco

[lustre-discuss] No read_cache on OSS

2020-01-22 Thread Thomas Roth
ustre due to the abysmal read performance, the possible misconfiguration here is a rather pressing matter. Any ideas? Regards, Thomas

[lustre-discuss] read performance and inode_permission

2020-01-22 Thread Thomas Roth
ples [regs] getxattr_hits 1 samples [regs] inode_permission 3960 samples [regs] Basically the only counter that increases quickly is the inode_permission value. Is this the expected behavior? Cheers, Thomas

[lustre-discuss] Read performance bad, telepathy in Lustre

2020-01-23 Thread Thomas Roth
ing on server-1? Curioser and curioser, Thomas

Re: [lustre-discuss] LUSTRE - Installation on DEBIAN 10.x ????

2020-02-06 Thread Thomas Roth

Re: [lustre-discuss] problem after upgrading 2.10.4 to 2.12.4

2020-06-25 Thread Thomas Roth
y >> Lustre 2.4/2.5 to a format compatible with 2.12. >> >>> Can I downgrade from 2.12.4 to 2.10.8 without destroying the FS? >> We've done this successfully, but again - no guarantees. >> >>> Has the error described in https://jira.whamcloud.com/browse/LU-13392 >>> been

[lustre-discuss] changelogs stop working

2020-12-08 Thread Thomas Roth
