Re: [lustre-discuss] Stop writes for users

2019-05-14 Thread Mohr Jr, Richard Frank (Rick Mohr)
clients to treat all the targets as read-only. But if there is such a parameter, I am not familiar with it. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discus

Re: [lustre-discuss] PFL not working on 2.10 client

2019-05-01 Thread Mohr Jr, Richard Frank (Rick Mohr)
I don’t think we need to have PFL working immediately, and since we have plans to upgrade the client at some point, I will just wait and see what happens after the upgrade. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

[lustre-discuss] PFL not working on 2.10 client

2019-04-22 Thread Mohr Jr, Richard Frank (Rick Mohr)
lcme_flags: 0 lcme_extent.e_start: 4194304 lcme_extent.e_end: 67108864 lmm_stripe_count: 4 lmm_stripe_size: 1048576 lmm_pattern: 1 lmm_layout_gen:65535 lmm_stripe_offset: -1 -- Rick Mohr Senior HPC System Administrator National Institute

Re: [lustre-discuss] lfsck repair quota

2019-04-17 Thread Mohr Jr, Richard Frank (Rick Mohr)
hould be 1. Fix? yes > > In fact I think that the e2fsc ran so slow due that all the mdt inodes were > corrected. You may already be doing this, but just in case, make sure that you are using the latest version of Whamcloud’s e2fsprogs (https://downloads.whamcloud.com/public/e2fsprogs

Re: [lustre-discuss] unable to install lustre clients on Centos 7.6 with MLNX_OFED_LINUX-4.5-1.0.1.0

2019-04-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
use the RPMs found alongside the lustre client RPM? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Apr 16, 2019, at 11:27 AM, Pharthiphan Asokan wrote: > > > Hello, > > unable to install lustre

Re: [lustre-discuss] lfsck repair quota

2019-04-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
ects for that file. This is necessary to ensure that quota information reported by Lustre is accurate, but I don’t believe it is meant to fix any corruption in the quota files themselves. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.

Re: [lustre-discuss] lfsck repair quota

2019-04-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
sure if/how you can regenerate quota info for ZFS.) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://list

Re: [lustre-discuss] inodes not adding up

2019-04-15 Thread Mohr Jr, Richard Frank (Rick Mohr)
10157 26% /share/lfs02 Again, there are already 19,222,318 files on the file system, so IUsed=19222318. All the OSTs together only have 18,175,092 + 18,300,779 + 18,134,286 = 54,610,157 inodes available, so IFree=54610157. And Inodes = IUsed + IFree = 73832475. -- Rick Mohr Senior HPC
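The arithmetic in this message can be checked with a short sketch. The figures are the ones quoted above; the variable names are only illustrative.

```python
# Inode figures quoted in the message above.
files_in_use = 19_222_318  # IUsed: files already on the file system
ost_free_inodes = [18_175_092, 18_300_779, 18_134_286]  # free inodes on each OST

# IFree is limited by what the OSTs can still hold.
ifree = sum(ost_free_inodes)
# The total reported to the client is simply used plus free.
inodes = files_in_use + ifree

print(f"IUsed={files_in_use} IFree={ifree} Inodes={inodes}")
```

The total is derived from the free count rather than fixed in advance, which is why it can change as OST inodes are consumed.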

Re: [lustre-discuss] how to erase lustre filesystem

2019-04-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
there is some kind of bad data or corruption in the config logs on the MGS (so you use the writeconf process to blow away the bad config logs and regenerate them). -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Apr 4, 2019, a

Re: [lustre-discuss] Tools for backing up a ZFS MDT

2019-03-29 Thread Mohr Jr, Richard Frank (Rick Mohr)
to migrate a MDT to new hardware. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Mar 29, 2019, at 6:03 PM, Hans Henrik Happe wrote: > > Hi Kurt, > > Haven't got much experience with the comple

[lustre-discuss] Using lfs migrate to move files between MDTs

2019-03-29 Thread Mohr Jr, Richard Frank (Rick Mohr)
be appreciated. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss

Re: [lustre-discuss] Error with project quotas on 2.10.6

2019-03-26 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Mar 20, 2019, at 1:24 PM, Peter Jones wrote: > > If it's not in the manual then it should be. Could you please open an LUDOC > ticket to track getting this corrected if need be? Done. https://jira.whamcloud.com/browse/LUDOC-435 -- Rick Mohr Senior HPC System Administra

Re: [lustre-discuss] Error with project quotas on 2.10.6

2019-03-18 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Mar 18, 2019, at 5:31 PM, Peter Jones wrote: > > You need the patched kernel for that feature I suppose that should be documented in the manual somewhere. I thought project quota support was determined based on ldiskfs vs zfs, and not patched vs unpatched. -- Rick Mohr S

[lustre-discuss] Error with project quotas on 2.10.6

2019-03-18 Thread Mohr Jr, Richard Frank (Rick Mohr)
in the documentation? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] Migrating files doesn't free space on the OST

2019-01-17 Thread Mohr Jr, Richard Frank (Rick Mohr)
t.lfsckadmin -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] index is already in use problem

2019-01-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Jan 16, 2019, at 4:18 AM, Jae-Hyuck Kwak wrote: > > How can I force --writeconf option? It seems that mkfs.lustre doesn't support > --writeconf option. You will need to use the tunefs.lustre command to do a writeconf. -- Rick Mohr Senior HPC System Administrator Nation

Re: [lustre-discuss] Odd client behavior with mixed Lustre versions

2019-01-11 Thread Mohr Jr, Richard Frank (Rick Mohr)
Is it possible you have some incompatible ko2iblnd module parameters between the 2.8 servers and the 2.10 clients? If there was something causing LNet issues, that could possibly explain some of the symptoms you are seeing. -- Rick Mohr Senior HPC System Administrator National Institute

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Mohr Jr, Richard Frank (Rick Mohr)
lfs” command has a built-in “lfs migrate” subcommand which supports a “--block” option to prevent file access while the migration is happening. So it might be safe to use. Perhaps someone else on the list with more experience using this command could chime in. -- Rick Mohr Senior HPC Syste

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Mohr Jr, Richard Frank (Rick Mohr)
out what the "lctl > set_param osp..max_create_count=0” command would do? The Lustre manual has a section on removing MDTs/OSTs: http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.deactivating_mdt_ost -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences htt

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Mohr Jr, Richard Frank (Rick Mohr)
"lctl conf_param .osc.active=0”. This will notify all Lustre clients to deactivate the OST, which I believe causes the hangs you were seeing when any client tries to remove or stat a file on that OST. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sci

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-06 Thread Mohr Jr, Richard Frank (Rick Mohr)
om allocating any new files to the OST, but still allow clients to read and delete files on that OST. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss

Re: [lustre-discuss] Usage for lfs setstripe -o ost_indices

2018-11-09 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Nov 9, 2018, at 11:28 AM, Mohr Jr, Richard Frank (Rick Mohr) > wrote: > > >> On Nov 8, 2018, at 11:44 AM, Ms. Megan Larko wrote: >> >> I have been attempting this command on a directory on a Lustre-2.10.4 >> storage from a Lustre 2.10.1 client a

Re: [lustre-discuss] Usage for lfs setstripe -o ost_indices

2018-11-09 Thread Mohr Jr, Richard Frank (Rick Mohr)
r on ioctl 0x4008669a for 'custTest' (3): Invalid argument > error: setstripe: create striped file 'custTest' filed: Invalid argument Do you get the same error if you try to run this on a file instead of a directory? Also, don’t you typically need to add the “-d” option when setting stripe parameters f

Re: [lustre-discuss] migrating MDS to different infrastructure

2018-10-29 Thread Mohr Jr, Richard Frank (Rick Mohr)
nd zfs receive it on mds2 > • zfs send the MGT partition from mds1 and zfs receive it on mds2 > • mount lustre on mds2 > should it work ? I think that should work, except that in the first step you don’t need to create a lustre FS on the new pools. -- Rick Mohr Se

Re: [lustre-discuss] lustre 2.10.5 or 2.11.0

2018-10-19 Thread Mohr Jr, Richard Frank (Rick Mohr)
isable ZIL, change the > redundant_metadata to "most" atime off. > > I could send you a list of parameters that in my case work well. Riccardo, Would you mind sharing your ZFS parameters with the mailing list? I would be interested to see which options you have changed. -- Rick

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-19 Thread Mohr Jr, Richard Frank (Rick Mohr)
with most of the applications continuing without issues. Sometimes there are a few jobs that abort, but overall this is better than having to stop all jobs and remount lustre on all the compute nodes. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences

Re: [lustre-discuss] Multihoming Lustre server

2018-10-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
> servernodes to support both LNET nids after mounting the OSTs, the command > succeeds, but the file system is not mountable from the client. You can’t use mkfs.lustre to update service node NIDs once the file system is formatted. You would need to perform a writeconf or use the “lctl

Re: [lustre-discuss] Experience with resizing MDT

2018-09-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
good idea to do this. So that is why I was thinking that resizing the MDT might be the simplest approach. Of course, I might be misunderstanding something about DNE2, and if that is the case, someone can correct me. Or if there are options I am not considering, I would welcome those too. --

[lustre-discuss] Experience with resizing MDT

2018-09-19 Thread Mohr Jr, Richard Frank (Rick Mohr)
remember is not available for ZFS at the moment.) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http

Re: [lustre-discuss] lustre client not able to lctl ping or mount

2018-09-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
th ko2iblnd-opa are intended for OmniPath hardware. Since you are using IB, you will want to just set your options like this: options ko2iblnd peer_credits=…, etc. Have you verified that the firewall is not running? It’s possible a firewall might be allowing ping traffic but blocking the port n

Re: [lustre-discuss] migrating MDS to different infrastructure

2018-08-23 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Aug 22, 2018, at 8:10 PM, Riccardo Veraldi > wrote: > > On 8/22/18 3:13 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote: >>> On Aug 22, 2018, at 3:31 PM, Riccardo Veraldi >>> wrote: >>> I would like to migrate this virtual machine to another infras

Re: [lustre-discuss] migrating MDS to different infrastructure

2018-08-22 Thread Mohr Jr, Richard Frank (Rick Mohr)
corruption of data ? > May I simply use zfs send and zfs receive thru SSH ? > what is the best way to move a MDS based virtual machine ? I don’t have much experience with VMs, but I have used zfs send/receive to migrate a MDT from one server to another. It worked quite well. -- Ric

Re: [lustre-discuss] Lustre Size Variation after formatiing

2018-08-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
e size discrepancy. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] Lustre 2.10.4 failover

2018-08-13 Thread Mohr Jr, Richard Frank (Rick Mohr)
over config or not. (Maybe it doesn’t matter.) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] Upgrading ZFS version for Lustre

2018-07-27 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Jul 27, 2018, at 1:56 PM, Andreas Dilger wrote: > >> On Jul 27, 2018, at 10:24, Mohr Jr, Richard Frank (Rick Mohr) >> wrote: >> >> I am working on upgrading some Lustre servers. The servers currently run >> lustre 2.8.0 with zfs 0.6.5, and I am

[lustre-discuss] Upgrading ZFS version for Lustre

2018-07-27 Thread Mohr Jr, Richard Frank (Rick Mohr)
need to run “zfs upgrade” on the underlying pools before upgrading the lustre version? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-28 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On Jun 27, 2018, at 4:44 PM, Mohr Jr, Richard Frank (Rick Mohr) > wrote: > > >> On Jun 27, 2018, at 3:12 AM, yu sun wrote: >> >> client: >> root@ml-gpu-ser200.nmg01:~$ mount -t lustre >> node28@o2ib1:node29@o2ib1:/project /mnt/lustre_data >>

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Mohr Jr, Richard Frank (Rick Mohr)
something change in the meantime? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-26 Thread Mohr Jr, Richard Frank (Rick Mohr)
d line : lnetctl lnet configure --all to make my static lnet > configuration take effect. but i still can't ping node28 from my client > ml-gpu-ser200.nmg01. I can mount as well as access lustre on client > ml-gpu-ser200.nmg01. What options did you use when mounting the file

Re: [lustre-discuss] Lustre on native ZFS encryption

2018-05-02 Thread Mohr Jr, Richard Frank (Rick Mohr)
is info about mount options stored in the file system that gets retrieved with one of the e2fsprogs tools (maybe debugfs) which is then used when performing the actual mount. So I could easily see something trying to query a ZFS attribute to retrieve similar information before doing the moun

Re: [lustre-discuss] varying sequential read performance.

2018-04-05 Thread Mohr Jr, Richard Frank (Rick Mohr)
ity to modify the vm.zone_reclaim_mode parameter (or have an admin do it for you), then it might be worth looking at. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss m

Re: [lustre-discuss] varying sequential read performance.

2018-04-05 Thread Mohr Jr, Richard Frank (Rick Mohr)
your 4 OSTs, but it might explain why the cache for some OSTs decrease when others increase. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Apr 2, 2018, at 8:06 PM, John Bauer <bau...@iodoctors.com> wrote:

Re: [lustre-discuss] Adding a servicenode (failnode) to existing OSTs

2018-04-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
.lustre --param="failover.node= /dev/ The first one is the preferred method. Keep in mind that the “--servicenode=nid,nid” syntax is intended for specifying multiple nids that belong to the same host. To specify multiple hosts for failover, you will want to add a --servicenode option for e
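A minimal sketch of the rule described in this message: NIDs belonging to one host are joined with commas inside a single option, and each failover host gets its own option. The helper name and the NIDs are hypothetical, and the sketch only builds the option strings; it does not run any Lustre command.

```python
def servicenode_options(hosts):
    """Return one --servicenode option per failover host.

    `hosts` is a list of hosts; each host is a list of that host's NIDs.
    """
    return ["--servicenode=" + ",".join(nids) for nids in hosts]


# Hypothetical NIDs: two servers, each reachable over two networks.
hosts = [
    ["10.0.0.1@tcp", "192.168.0.1@o2ib"],
    ["10.0.0.2@tcp", "192.168.0.2@o2ib"],
]

print(" ".join(servicenode_options(hosts)))
```

Keeping the per-host grouping explicit in the data structure makes it harder to accidentally list two different hosts inside one comma-separated option.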

[lustre-discuss] Question about lctl changelog_deregister

2018-01-26 Thread Mohr Jr, Richard Frank (Rick Mohr)
waited a while in case it just took some time to clear the entries, but after several hours, they were still there. Am I misunderstanding what is supposed to happen when a userid is deregistered? Or did I mess up a command somewhere? Or is this a bug? -- Rick Mohr Senior HPC System Administrator

Re: [lustre-discuss] Designing a new Lustre system

2017-12-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
able to use zfs send/receive to move data using incremental snapshots. This was much easier than trying to tar up the contents of a ldiskfs-backed MDT and untar it to the new storage. -- Rick Mohr Senior HPC System Administrator National Instit

Re: [lustre-discuss] Lustre compilation error

2017-11-30 Thread Mohr Jr, Richard Frank (Rick Mohr)
’t provide a patch. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] Lustre compilation error

2017-11-29 Thread Mohr Jr, Richard Frank (Rick Mohr)
got error messages like these: make[3]: *** No rule to make target `fld.ko', needed by `all-am'. Stop. When I removed the “--disable-client” option, the error went away. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___

Re: [lustre-discuss] mdt mounting error

2017-11-01 Thread Mohr Jr, Richard Frank (Rick Mohr)
If there are firewall rules blocking any traffic, that could cause a problem. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@l

Re: [lustre-discuss] 1 MDS and 1 OSS

2017-10-30 Thread Mohr Jr, Richard Frank (Rick Mohr)
bably depends on what the primary usage will be. If the applications create lots of small files (like some biomed programs), then a larger MDT would result in more inodes allowing more Lustre files to be created. -- Rick Mohr Senior HPC System Administrator National Institute for Comput

Re: [lustre-discuss] Lustre routing help needed

2017-10-30 Thread Mohr Jr, Richard Frank (Rick Mohr)
the files under /sys/module/ko2iblnd/parameters. It might be worthwhile to compare those values on the lnet routers to the values on the servers to see if maybe there is a difference that could affect the behavior. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sc

Re: [lustre-discuss] Linux users are not able to access lustre folders

2017-10-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
can cd /home/luser6 manually and create files or folders. Are you using automount for /home? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing l

Re: [lustre-discuss] Linux users are not able to access lustre folders

2017-10-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
er accounts are visible to the lustre servers. You could also use LDAP or even just /etc/passwd. You’ll probably just want to choose whatever mechanism is used on your other systems. For the purposes of testing, you could always just create the luser1 locally on each lustre server to see if things star

[lustre-discuss] OSTs remounting read-only after ldiskfs journal error

2017-10-19 Thread Mohr Jr, Richard Frank (Rick Mohr)
have some experience with this so they can share their wisdom with me. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss

Re: [lustre-discuss] Lustre poor performance

2017-08-23 Thread Mohr Jr, Richard Frank (Rick Mohr)
line to ko2iblnd.conf. Or just do what I did and comment out all the lines in ko2iblnd.conf and add your own lines. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre

Re: [lustre-discuss] Lustre Quotas

2017-08-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
in Lustre. > What is the performance impact, if any, from using quotas? The last time we did performance testing, I think we only saw a performance hit of around 10%. But this was several years ago (i.e. - Lustre 1.8 days), so I don’t know how much things have changed since then. -- Rick M

Re: [lustre-discuss] Spiking OSS load?

2017-08-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
user restripe a file can dramatically reduce load. So in summary: Q: Is it a problem to have a high load on my OSS servers? A: It depends…. (Wish it could be a little more clear cut than that) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http:

Re: [lustre-discuss] New Lustre Installation

2017-05-22 Thread Mohr Jr, Richard Frank (Rick Mohr)
You might want to start by looking at these online tutorials: http://lustre.ornl.gov/lustre101-courses/ -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On May 21, 2017, at 6:19 AM, Ravi Konila <ravibh...@gma

Re: [lustre-discuss] Lustre 2.8.0 - MDT/MGT failing to mount

2017-05-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
> On May 4, 2017, at 11:03 AM, Steve Barnet <bar...@icecube.wisc.edu> wrote: > > On 5/4/17 10:01 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote: >> Did you try doing a writeconf to regenerate the config logs for the file >> system? > > > Not yet, but quick en

Re: [lustre-discuss] building of lustre-client fails

2017-05-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
ine I used to build: > > rpmbuild --without servers --without lustre-tests --with zfs --with > lustre_modules -bb lustre-2.9.0.spec This probably has nothing to do with the errors you are seeing, but just for reference, you shouldn’t need to specify --with-zfs for the client. This is

Re: [lustre-discuss] operation ldlm_queue failed with -11

2017-05-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
> that rather than repeat it here. > > https://jira.hpdd.intel.com/browse/LU-8658 Ah, that is good to know. Thanks for the explanation. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu _

Re: [lustre-discuss] operation ldlm_queue failed with -11

2017-05-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
complaining about errors to the same MDS server, then my first guess would be that there is something wrong on the server side of things. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On May 2, 2017, at 4:52 AM, Lydia Heck <l

Re: [lustre-discuss] client fails to mount

2017-04-24 Thread Mohr Jr, Richard Frank (Rick Mohr)
This might be a long shot, but have you checked for possible firewall rules that might be causing the issue? I’m wondering if there is a chance that some rules were added after the nodes were up to allow Lustre access, and when a node got rebooted, it lost the rules. -- Rick Mohr Senior HPC

Re: [lustre-discuss] Lustre [2.8.0] flock Functionality

2017-03-28 Thread Mohr Jr, Richard Frank (Rick Mohr)
icular to look out for? I have enabled flock on all my Lustre file systems (2.4.3 and 2.8), and I have not yet encountered any issues. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu _

Re: [lustre-discuss] Odd quota behavior with Lustre/ZFS

2017-02-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
Alex, Were you ever able to get more details about this problem? Thanks. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Feb 9, 2017, at 10:27 PM, Alexander I Kulyavtsev <a...@fnal.gov> wrote: > >

Re: [lustre-discuss] Virtual servers

2017-02-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
hand > easily verified. Have you tried mounting the file system on different nodes? This could help determine if the problem is always the same or if it might be affected by the type of node (MDS vs OSS) that is being used for the client. -- Rick Mohr Senior HPC System Adm

[lustre-discuss] Odd quota behavior with Lustre/ZFS

2017-02-09 Thread Mohr Jr, Richard Frank (Rick Mohr)
- Has anyone else encountered this “off by 21” problem before? I didn’t see anything online, but perhaps I missed something. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu

Re: [lustre-discuss] Lustre Client hanging on mount

2017-01-12 Thread Mohr Jr, Richard Frank (Rick Mohr)
I noticed that you appear to have formatted the MDT with the file system name “mgsZFS” while the OST was formatted with the file system name “ossZFS”. The same name needs to be used on all MDTs/OSTs in the same file system. Until that is fixed, your file system won’t work properly. -- Rick
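The consistency requirement described here can be expressed as a small check. This is only a sketch: the mapping of targets to file system names is typed in by hand from the names mentioned in the message, and the helper name is hypothetical.

```python
def mismatched_fsnames(targets):
    """Return the set of distinct fsnames if the targets disagree, else an empty set."""
    names = set(targets.values())
    return names if len(names) > 1 else set()


# File system names as reported in the message above.
targets = {"MDT": "mgsZFS", "OST": "ossZFS"}

problem = mismatched_fsnames(targets)
if problem:
    print("targets were formatted with different fsnames:", sorted(problem))
```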

Re: [lustre-discuss] MGS failover problem

2017-01-11 Thread Mohr Jr, Richard Frank (Rick Mohr)
m still persists. > > Yes, but is there any reason why you are choosing IB over Ethernet? I think > I'd prefer to try over the Ethernet if we are going to pick one. I just figured that if you had Infiniband, then you would prefer to run with the higher performance interconnect. But

Re: [lustre-discuss] MGS failover problem

2017-01-11 Thread Mohr Jr, Richard Frank (Rick Mohr)
red? If you could set up the file system to only use Infiniband, then that would eliminate any complications from having two fabrics active at the same time. Then you could see if the problem still persists. -- Rick Mohr Senior HPC System Administrator National Institute for Computati

Re: [lustre-discuss] Lustre with Hadoop (Hortonworks Data Platform)

2017-01-09 Thread Mohr Jr, Richard Frank (Rick Mohr)
es. Sometimes sites will use more specialized storage hardware (like DDN or NetApp), but that is not required for Lustre. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre

Re: [lustre-discuss] MGS failover problem

2017-01-09 Thread Mohr Jr, Richard Frank (Rick Mohr)
--servicenode options if you wanted. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Jan 8, 2017, at 11:58 PM, Vicker, Darby (JSC-EG311) > <darby.vicke...@nasa.gov> wrote: > > We have a new

Re: [lustre-discuss] Round robin allocation (in general and in buggy 2.5.3)

2016-12-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
of my head). If the OST usage drops, then you can use “lctl enable” to reenable it. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing lis

Re: [lustre-discuss] [UNTRUSTED] Re: Check clients connected?

2016-12-15 Thread Mohr Jr, Richard Frank (Rick Mohr)
n the new > centos 7.2 server, which I installed from RPMs I suspect there is one that I > have not installed :) It looks like that command may have been removed in more recent versions of Lustre. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sc

Re: [lustre-discuss] Lustre newbie problems: formatting disk or partition with Lustre filesystem fails

2016-11-28 Thread Mohr Jr, Richard Frank (Rick Mohr)
more additional servers then mount the OSTs (these are referred to as the OSS nodes). However, it is possible to have a single server that mounts the MDT/MGT as well as the OSTs. If you are interested in some entry-level Lustre tutorials, check out http://lustre.ornl.gov/lustre101-cour

Re: [lustre-discuss] Quick ZFS pool question?

2016-10-13 Thread Mohr Jr, Richard Frank (Rick Mohr)
the OSTs. However, ZFS has features (like snapshots) that are useful for the MDT so some folks are willing to accept a performance hit in order to take advantage of those features. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.

Re: [lustre-discuss] Still having problems Lustre 2.8 Centos 7.2

2016-09-28 Thread Mohr Jr, Richard Frank (Rick Mohr)
Did you check to make sure there are no firewalls running that could be blocking traffic? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Sep 27, 2016, at 10:12 AM, Phill Harvey-Smith > <p.harvey-sm...@warw

Re: [lustre-discuss] More problems setting things up....

2016-09-21 Thread Mohr Jr, Richard Frank (Rick Mohr)
el: osd_zfs: Unknown symbol sa_spill_rele (err -22) > Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol > zap_curs: Oftentimes those types of error messages indicate some sort of version mismatch between kernel modules. Did you just download the lustre RPMs from the web s

Re: [lustre-discuss] Mount lustre client with MDS/MGS backup

2016-09-14 Thread Mohr Jr, Richard Frank (Rick Mohr)
Alfonso, Are you still having problems with this, or were you able to get it resolved? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Sep 1, 2016, at 12:43 PM, Pardo Diaz, Alfonso <alfonso.pa...@ciemat.es>

Re: [lustre-discuss] Mount lustre client with MDS/MGS backup

2016-08-31 Thread Mohr Jr, Richard Frank (Rick Mohr)
ddress. I am guessing that this corresponds to mds1, so when it is down, there is no second host for the client to try. Try specifying IP addresses instead of hostnames and see if that makes a difference. -- Rick Mohr Senior HPC System Administrator National Institute for Computa

Re: [lustre-discuss] Does an updated version exist?

2016-08-16 Thread Mohr Jr, Richard Frank (Rick Mohr)
re how > relevant it still is….. Some of the information should still be relevant. In my opinion, it is still worthwhile reading to get a better idea on what is happening inside of Lustre (even if some of the details are out of date). -- Rick Mohr Senior HPC System Administrator Nation

Re: [lustre-discuss] proper procedure after MDT kernel panic

2016-08-11 Thread Mohr Jr, Richard Frank (Rick Mohr)
f the OSS was still up, I don’t think there should be any problem with the OSTs that would require a fsck. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discus

Re: [lustre-discuss] tune2fs being blocked by MMP

2016-08-04 Thread Mohr Jr, Richard Frank (Rick Mohr)
-level details, but the server that has the OST mounted writes a bit of info to the disk periodically to indicate that it is in use. If another host tries to mount the OST, it looks at the MMP info. If it is recent, it assumes the OST is in use and won’t mount it. I suspect that tune2fs is doing s
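The idea described in this message can be modeled very roughly as a heartbeat freshness check. This is a simplified sketch of the concept only, not the actual ldiskfs MMP implementation; the function name, the timestamps, and the `max_age` threshold are all assumptions made for illustration.

```python
def looks_in_use(last_update, now, max_age):
    """Treat the device as busy if its heartbeat was written within `max_age` seconds."""
    return (now - last_update) <= max_age


# Hypothetical values: heartbeat written 3 seconds ago, 10-second tolerance.
print(looks_in_use(last_update=1000.0, now=1003.0, max_age=10.0))

# A heartbeat that stopped a minute ago would no longer block a second host.
print(looks_in_use(last_update=1000.0, now=1060.0, max_age=10.0))
```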

Re: [lustre-discuss] poor performance on reading small files

2016-08-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
I haven’t done so for newer versions of Lustre. So it’s possible I am mistaken about the defaults. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing l

Re: [lustre-discuss] poor performance on reading small files

2016-08-03 Thread Mohr Jr, Richard Frank (Rick Mohr)
bottleneck as Oliver suggested.) Do your OSS nodes have a lot of memory? Do you know what your typical memory usage is on the OSS nodes? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On Jul 28, 2016, at 10:19 PM, Ricca

Re: [lustre-discuss] tune2fs being blocked by MMP

2016-08-02 Thread Mohr Jr, Richard Frank (Rick Mohr)
f-1-5.ad.cirrus.com device: dm-19 > > 0 edi-vf-1-5:~# Is the device currently mounted? If so, that would explain the error. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___

Re: [lustre-discuss] luster client mount issues

2016-08-01 Thread Mohr Jr, Richard Frank (Rick Mohr)
for LNet routes, so I don’t know if that could be used to prefer one interface over another.) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailin

Re: [lustre-discuss] lnet router lustre rpm compatibility

2016-06-20 Thread Mohr Jr, Richard Frank (Rick Mohr)
et routers while you are at it. The clients can then be upgraded later like you listed in your plan. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss m

Re: [lustre-discuss] stripe count recommendation, and proposal for auto-stripe tool

2016-05-19 Thread Mohr Jr, Richard Frank (Rick Mohr)
> anyway. That’s true. If your restriping will mostly be done by a script, then you don’t necessarily need a simple formula. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lus

Re: [lustre-discuss] stripe count recommendation, and proposal for auto-stripe tool

2016-05-18 Thread Mohr Jr, Richard Frank (Rick Mohr)
high end, I would expect to see several OSTs at 81%, a few at 82%, and maybe one or two at 83%. Instead, I see two OSTs at 85% and 86% which fall outside the norm. Since the default stripe count for my file system is 2, this is an excellent indication that someone has a misstriped file
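The reasoning in this message (look for OSTs whose fill level sits above the normal spread) could be automated roughly as follows. This is a minimal sketch, not a tool from the thread: the function name, the OST labels, and the 2-point margin are hypothetical, and the usage figures would in practice come from something like `lfs df`.

```python
from statistics import median


def flag_outlier_osts(usage_pct, margin=2.0):
    """Return the names of OSTs whose usage exceeds the median by more than `margin` points."""
    mid = median(usage_pct.values())
    return sorted(name for name, pct in usage_pct.items() if pct - mid > margin)


# Hypothetical usage figures shaped like the distribution described in the message.
usage = {
    "OST0000": 81, "OST0001": 81, "OST0002": 81,
    "OST0003": 82, "OST0004": 82, "OST0005": 83,
    "OST0006": 85, "OST0007": 86,
}
suspects = flag_outlier_osts(usage)
```

With these numbers the median is 82, so only the two OSTs at 85% and 86% are flagged. The margin is a judgment call and would need tuning to the normal spread on a given file system.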

Re: [lustre-discuss] Lustre filesystem suddenly not allowing *new* mounts, but exciting mounts continue working.

2016-05-17 Thread Mohr Jr, Richard Frank (Rick Mohr)
Have you tried doing a writeconf to regenerate the config logs? -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu > On May 17, 2016, at 12:08 PM, Randall Radmer <rad...@slac.stanford.edu> wrote: > > We'

Re: [lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

2016-04-13 Thread Mohr Jr, Richard Frank (Rick Mohr)
d exciting ways. IIRC, SDSC also had an issue with LU-5726 but the symptoms they saw were not identical to mine. So maybe you are seeing the same problem manifest itself in a different way. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Scien

Re: [lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

2016-04-13 Thread Mohr Jr, Richard Frank (Rick Mohr)
t I started doing as well). -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

Re: [lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

2016-04-13 Thread Mohr Jr, Richard Frank (Rick Mohr)
to spin, > "perf top" showing nearly all time spent in spinlock_irq. iirc.) > > might your system have had a *lot* of memory? ours tend to be fairly modest > (32-64G, dual-socket intel.) I have 64 GB on my servers. -- Rick Mohr Senior HPC System Administrator Nationa

Re: [lustre-discuss] MDS crashing: unable to handle kernel paging request at 00000000deadbeef (iam_container_init+0x18/0x70)

2016-04-12 Thread Mohr Jr, Richard Frank (Rick Mohr)
abled quotas right after > the 500M file thing, and were thinking that inconsistent > quota records might cause this sort of crash. Have you set vm.zone_reclaim_mode=0 yet? I had an issue with this on my file system a while back when it was set to 1. -- Rick Mohr Senior HPC System Ad
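A quick way to check the setting mentioned here is to read it from procfs. The sketch below assumes a Linux host where `/proc/sys/vm/zone_reclaim_mode` exists; the helper names are hypothetical, and the check only reports the value, it does not change it.

```python
from pathlib import Path

# Standard procfs location for this sysctl on Linux.
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")


def zone_reclaim_enabled(text):
    """Interpret the file contents: any non-zero value means zone reclaim is on."""
    return int(text.strip()) != 0


def check_zone_reclaim(path=ZONE_RECLAIM_PATH):
    """Return True/False for the current setting, or None if the file is absent."""
    if not path.exists():
        return None
    return zone_reclaim_enabled(path.read_text())
```

If the check returns True, the value can be changed at runtime with `sysctl -w vm.zone_reclaim_mode=0`; making it persistent depends on the distribution's sysctl configuration.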

Re: [lustre-discuss] Problems with quota after adding a new OST

2016-04-12 Thread Mohr Jr, Richard Frank (Rick Mohr)
at you were able to resolve your problem by re-enabling quota enforcement. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-di

Re: [lustre-discuss] Problems with quota after adding a new OST

2016-04-12 Thread Mohr Jr, Richard Frank (Rick Mohr)
Based on the output of your “lfs quota” command, I am not sure that I see a problem with it. Both active OSTs (OST and OST0002) and the MDT seem to be returning quota information. Can you explain what you were expecting to see and how the output differs from your expectations? -- Rick

Re: [lustre-discuss] Problems with quota after adding a new OST

2016-04-12 Thread Mohr Jr, Richard Frank (Rick Mohr)
system, quotas are enabled but the value of quota_slave.enabled for all of my OSTs is “none”. So this probably isn’t the source of your problem. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___

Re: [lustre-discuss] fetching a histogram of idx counts

2016-03-01 Thread Mohr Jr, Richard Frank (Rick Mohr)
onably close. (I guess it depends on just how accurate you need the number to be.) -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discu

Re: [lustre-discuss] issue in lnet con

2016-02-08 Thread Mohr Jr, Richard Frank (Rick Mohr)
same version of OFED that is running on the system. -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-discuss@lists.lustre.org http://lists.lu

Re: [lustre-discuss] Large directory Feature in lustre 2.5.3

2016-01-29 Thread Mohr Jr, Richard Frank (Rick Mohr)
looks like this might be ongoing work: https://jira.hpdd.intel.com/browse/LU-896 -- Rick Mohr Senior HPC System Administrator National Institute for Computational Sciences http://www.nics.tennessee.edu ___ lustre-discuss mailing list lustre-dis
