Re: [lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-05-07 Thread Andreas Dilger
On May 3, 2019, at 15:35, Hans Henrik Happe wrote: > > On 03/05/2019 22.41, Andreas Dilger wrote: >> On May 3, 2019, at 14:33, Patrick Farrell wrote: >>> >>> Thomas, >>> >>> As a general rule, Lustre only supports mixing versions on serve

Re: [lustre-discuss] Limit client side caching?

2019-05-07 Thread Andreas Dilger
amount of cached (dirty+clean) data for the filesystem. By default this is 3/4 of RAM. Cheers, Andreas -- Andreas Dilger Principal Lustre Architect Whamcloud
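The parameter is not named in the truncated preview; assuming it is the usual client-side limit llite.*.max_cached_mb, a minimal sketch of checking and lowering it on a client:

# Show the current per-client cache limit (defaults to 3/4 of RAM):
lctl get_param llite.*.max_cached_mb
# Lower it to 8 GiB; this is per client and not persistent across remounts:
lctl set_param llite.*.max_cached_mb=8192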

Re: [lustre-discuss] Setting infinite grace period with soft quotas

2019-05-06 Thread Andreas Dilger
I don't think anyone has tried this, but it also seems like something that could be tested quite easily? Cheers, Andreas -- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] inotify

2019-05-06 Thread Andreas Dilger
which need full inotify functionality could be added as changelog consumers, and Changelog records would be mapped to inotify events, but I think there would be a very significant overhead if a large number of clients were all trying to be notified of every event in the whole filesystem... Cheers, Andreas

Re: [lustre-discuss] stat

2019-05-06 Thread Andreas Dilger
It would be useful to add an llapi_ function for this. In connection with LSOM the client will also be able to get the approximate file size, once https://jira.whamcloud.com/browse/LU-11367 is landed. Cheers, Andreas On May 1, 2019, at 09:35, Nathaniel Clark wrote

Re: [lustre-discuss] 2.10 <-> 2.12 interoperability?

2019-05-03 Thread Andreas Dilger
a >> certain parameter. >> LU-10175 indicates that the ibits have some connection to data-on-mdt which >> we don't use. >> >> Any suggestions? >> >> >> Regards, >> Thomas -- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] State of arm client?

2019-04-25 Thread Andreas Dilger
The older Pi boards are also 64-bit CPUs, but the problem is that Raspbian is only compiled with 32-bit kernels. I was recently testing this, and for Raspbian you will need at least the tip of b2_12, or 2.12.1 in order to compile. I compiled 2.12.1-rc on my 32-bit Raspbian. This mostly works, b

Re: [lustre-discuss] PFL not working on 2.10 client

2019-04-23 Thread Andreas Dilger
Rick, Does this still fail with 2.10.1 or a later client? It may just be a bug in "lfs" or the client, not an interop problem per se. If it doesn't fail with a newer client then it probably isn't worthwhile to track down. If you _really_ need to get this working with the 2.10.0 client you coul

Re: [lustre-discuss] inodes not adding up

2019-04-18 Thread Andreas Dilger
Thanks to Rick for the good explanation here. One thing to add is that it appears that the /lfs01 filesystem has a default stripe_count=2, since there are 46560885 inodes used on MDT and 91572739 total objects used on the four OSTs, and 91572739/46560885 = 1.96 OST objects per MDT inode. I
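A quick way to confirm that arithmetic on a live system, assuming /lfs01 is the client mount point from the thread:

# Default layout at the filesystem root; a default stripe_count of 2 matches the ratio above:
lfs getstripe -d /lfs01
# Per-target inode usage used for the calculation (91572739 / 46560885 ≈ 2 objects per inode):
lfs df -i /lfs01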

Re: [lustre-discuss] lfsck repair quota

2019-04-17 Thread Andreas Dilger
elona >> Phone: (+34) 93 230 96 35 >> >> >>> El 16 abr 2019, a las 15:34, Mohr Jr, Richard Frank (Rick Mohr) >>> escribió: >>> >>> >>>> On Apr 15, 2019, at 10:54 AM, Fernando Perez

Re: [lustre-discuss] lfsck repair quota

2019-04-16 Thread Andreas Dilger
the same time, except in the case your filesystem is corrupted, in which case you'd want e2fsck to repair the filesystem anyway. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] lfsck repair quota

2019-04-16 Thread Andreas Dilger
In that regard the quota usage should be indirectly repaired by an LFSCK run. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] LNET Conf Advise and Rearchitecting

2019-04-04 Thread Andreas Dilger
I can share with you our current configs but given that I'm happy > to throw them out I'm good with just scrapping all I have to start from > scratch and do it right. Also happy to send a diagram if that would be > helpful. > > Thanks for your help in advance!

Re: [lustre-discuss] EINVAL error when writing to a PFL file (lustre 2.12.0)

2019-03-29 Thread Andreas Dilger
sh: echo: write error: Invalid argument >> >> # strace indicates that write() gets the error: >> >> write(1, "qsdkjqslkdjkj\n", 14) = -1 EINVAL (Invalid argument) >> >> * no error in case of an open/truncate: >> >> [root

Re: [lustre-discuss] How often Log file get Updated

2019-03-25 Thread Andreas Dilger
s, so if it is not being updated, then there are a few options that are possible: - you are looking into the wrong stats file (e.g. different OST), as there are many different ones - there is a bug in the code that prevents the "write_bytes" value from being updated. What version of Lustre

Re: [lustre-discuss] Disaster recover files from ZFS OSTs

2019-03-24 Thread Andreas Dilger
performance declines when a directory gets very full and is then emptied. This isn't really a problem as the object directories are continually used. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] Data migration from one OST to anther

2019-03-10 Thread Andreas Dilger
Note that the "max_create_count=0" feature only works with newer versions of Lustre - 2.10 and later. It is recommended to upgrade to a newer release than 2.5 in any case. Cheers, Andreas > On Mar 5, 2019, at 10:33, Tung-Han Hsieh > wrote: > > Dear All, > > We have found the answer. S
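A minimal sketch of the drain-and-migrate procedure the reply refers to, assuming Lustre 2.10+; the fsname "testfs", index OST0004, and mount point are placeholders:

# On the MDS: stop new object allocation on the OST being drained
lctl set_param osp.testfs-OST0004-osc-MDT0000.max_create_count=0
# On a client: move existing objects off that OST
lfs find --ost testfs-OST0004_UUID /mnt/testfs | lfs_migrate -y
# Re-enable creates afterwards (20000 is the usual default)
lctl set_param osp.testfs-OST0004-osc-MDT0000.max_create_count=20000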

Re: [lustre-discuss] Lustre Monitoring metrics

2019-02-26 Thread Andreas Dilger
everything from scratch", and rather "improve some existing code", and becoming adept at understanding existing code and improving it is valuable in the industry. Cheers, Andreas > On Sun, Feb 24, 2019 at 11:06 AM Andreas Dilger wrote: >> Probably for a new user it doesn

Re: [lustre-discuss] Lustre Monitoring metrics

2019-02-24 Thread Andreas Dilger
Probably for a new user it doesn't make sense to use the Lustre stats in /proc directly. There are a number of different tools that present these stats in a more useful manner, such as IML (GUI Web front end), LMT, lltop, etc. Cheers, Andreas On Feb 24, 2019, at 02:09, Masudul Hasan Masud Bhuiy

Re: [lustre-discuss] Suspended jobs and rebooting lustre servers

2019-02-22 Thread Andreas Dilger
This is not really correct. Lustre clients can handle the addition of OSTs to a running filesystem. The MGS will register the new OSTs, and the clients will be notified by the MGS that the OSTs have been added, so no need to unmount the clients during this process. Cheers, Andreas On Feb 21,
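For reference, a hedged sketch of adding an OST to a running filesystem; fsname, index, NID, and devices are placeholders:

# On the new OSS:
mkfs.lustre --ost --fsname=testfs --index=12 --mgsnode=192.168.1.1@tcp /dev/sdb
mount -t lustre /dev/sdb /mnt/testfs-ost12
# On an already-mounted client the new target appears without a remount:
lfs df /mnt/testfs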

Re: [lustre-discuss] Migrate MGS to ZFS

2019-02-19 Thread Andreas Dilger
PS: it is always a good idea to make a backup of your MDT, since it is relatively small compared to the rest of the filesystem. A full-device "dd" copy doesn't take too long and is the most accurate backup for ldiskfs. Cheers, Andreas > On Feb 19, 2019, at 19:31, And
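A minimal sketch of such a backup, assuming the MDT is unmounted (or snapshotted) so the copy is consistent; device and output paths are placeholders:

umount /mnt/mdt
dd if=/dev/mapper/mdtdev of=/backup/mdt-backup.img bs=4M conv=sparse
mount -t lustre /dev/mapper/mdtdev /mnt/mdt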

Re: [lustre-discuss] Migrate MGS to ZFS

2019-02-19 Thread Andreas Dilger
Yes, it is possible to migrate the MGS files to another device as you propose. I don't think there is any particular difference if you move it to a separate ldiskfs or ZFS target. One caveat is that we don't test combined ZFS and ldiskfs targets on the same node, though in theory it would work

Re: [lustre-discuss] Command line tool to monitor Lustre I/O ?

2019-02-15 Thread Andreas Dilger
is lltop, which >> has already been mentioned a couple of times and that's what came to my >> mind as well when I read your question. >> >> best regards, >> Martin

Re: [lustre-discuss] MDS/MGS has a block storage device mounted and it does not have any permissions (no read , no write, no execute)

2019-02-06 Thread Andreas Dilger
below > commands: > > > mkfs.lustre --ost --fsname=lustrewt --index=0 --mgsnode=10.0.2.4@tcp /dev/sdb > mkdir -p /ostoss_mount > mount -t lustre /dev/sdb /ostoss_mount > > > Client node > 1 client node. The setup to upd

Re: [lustre-discuss] Disable identity_upcall and ACL

2019-01-13 Thread Andreas Dilger
On Jan 10, 2019, at 04:52, Degremont, Aurelien wrote: > > > On 09/01/2019 at 21:39, "Andreas Dilger" wrote: > >> If admins completely trust the client nodes (e.g. they are on a secure >> network) or they completely _distrust_ them (e.g. subdirectory mounts >

Re: [lustre-discuss] Kernel Module Build

2019-01-12 Thread Andreas Dilger
if I could see an example of a correct line in > autoMakefile. > > Andrew Tauferner > 1-952-562-4944 (office) > > -Original Message- > From: Andreas Dilger [mailto:adil...@whamcloud.com] > Sent: Thursday, January 10, 2019 10:19 PM > To: Tauferner, Andrew T

Re: [lustre-discuss] Kernel Module Build

2019-01-10 Thread Andreas Dilger
. > I wouldn't suspect any strange RPM macros as I'm able to build other RPMs > (the kernel RPM, for example) on this system. > I don't have a different system on which to build this. Sorry, I've never used SLES, though we definitely build it in our build farm. C

Re: [lustre-discuss] Kernel Module Build

2019-01-10 Thread Andreas Dilger
rpmbuilddir="$rpmbuilddir" rpm-local || exit 1; \ > cp ./rpm/* .; \ > /usr/bin/rpmbuild \ >--define "_tmppath $rpmbuilddir/TMP" \ >--define "_topdir $rpmbuilddir" \ >--define "dist %{nil}" \ > -ts lustre-2.12.0.

Re: [lustre-discuss] Kernel Module Build

2019-01-10 Thread Andreas Dilger
tory '/nfshome/attaufer/lustre-release/lustre/mdc' > (cd lmv && make top_distdir=../../lustre-2.12.0 > distdir=../../lustre-2.12.0/lustre/lmv \ > am__remove_distdir=: am__skip_length_check=: am__skip_mode_fix=: distdir) > make[3]: Entering directory '/nf

Re: [lustre-discuss] Disable identity_upcall and ACL

2019-01-09 Thread Andreas Dilger
for simple usage modes. I guess the other question is why you want to get rid of it, or what issue you are seeing with it enabled? Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] Kernel Module Build

2019-01-09 Thread Andreas Dilger
From: Degremont, Aurelien [mailto:degre...@amazon.com] > Sent: Wednesday, January 9, 2019 11:49 AM > To: Tauferner, Andrew T ; Andreas Dilger > > Cc: lustre-discuss@lists.lustre.org > Subject: Re: [lustre-discuss] Kernel Module Build > > 2.10.6 does not support Linux 4.14. > The

Re: [lustre-discuss] Kernel Module Build

2019-01-08 Thread Andreas Dilger
stom x86_64 >>> kernel. Can somebody point me to the proper place for source and build >>> instructions? Thanks. >>> Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] lustre-discuss Digest, Vol 154, Issue 3

2019-01-04 Thread Andreas Dilger

Re: [lustre-discuss] Making a copy of an OST

2018-12-04 Thread Andreas Dilger
ld/artifact/lustre_manual.html#dbdoclet.backup_device Using "dd" is fine, as long as the target device is at least as large as the source. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] Errors when starting Lustre on CentOS 6.5

2018-11-28 Thread Andreas Dilger
nqueue+0x129/0x9d0 [ptlrpc] > Nov 28 10:52:49 localhost kernel: [] > ldlm_handle_enqueue0+0x51b/0x13f0 [ptlrpc] > > > Does anyone know how to solve that problem? > > Build version:

Re: [lustre-discuss] lfs find with time queries

2018-11-28 Thread Andreas Dilger
mtime will be kept totally up to date on the MDS (normally mtime is only kept on the OST objects), but that seems like a relatively small change to also update mtime on close on the MDS. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] replicated 3+ file system

2018-11-13 Thread Andreas Dilger
OST. That said, it would probably be a lot easier to just have 3 separate Lustre filesystems and use a higher-level tool to do resync between the sites. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] Any documentation regarding caching on RAM?

2018-11-11 Thread Andreas Dilger
Could you please explain your problem/configuration more? Lustre will use RAM for cache on both the clients and servers. It does impose limits (tunable) on the client RAM usage so that it doesn't interfere with applications. Cheers, Andreas > On Nov 11, 2018, at 09:55, shirshak bajgain wrot

Re: [lustre-discuss] Usage for lfs setstripe -o ost_indices

2018-11-09 Thread Andreas Dilger
This is https://jira.whamcloud.com/browse/LU-8417 "setstripe -o does not work on directories", which has not been implemented yet. That said, setting the default striping to specific OSTs on a directory is usually not the right thing to do. That will result in OST imbalance. Equivalent mechanis
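The reply is truncated; assuming the equivalent mechanism it goes on to describe is OST pools, a sketch with placeholder fsname, pool, and OST names:

# On the MGS:
lctl pool_new testfs.fast
lctl pool_add testfs.fast testfs-OST0000 testfs-OST0001
# On a client: allocate from the pool instead of pinning specific OST indices
lfs setstripe -p fast -c 2 /mnt/testfs/fastdir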

Re: [lustre-discuss] lctl set_param not setting values permanently

2018-11-08 Thread Andreas Dilger
Jira is your friend. This is a known bug and fixed in 2.12. LU-10906. Cheers, Andreas > On Nov 8, 2018, at 09:25, Riccardo Veraldi > wrote: > > Hello, > > I did set a bunch of params from the MDS so that they can be taken up by the > Lustre clients > > lctl set_param -P osc.*.checksums=0

Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O)

2018-10-30 Thread Andreas Dilger
> • Patrick > > From: lustre-discuss on behalf of > 김형근 > Date: Sunday, October 28, 2018 at 11:40 PM > To: Andreas Dilger > Cc: "lustre-discuss@lists.lustre.org" > Subject: Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O) > >

Re: [lustre-discuss] dd oflag=direct error (512 byte Direct I/O)

2018-10-25 Thread Andreas Dilger
to change this limitation, so it is not a high priority to change on our side, especially since applications will have to deal with 4096-byte sectors in any case. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-22 Thread Andreas Dilger
lustre on all the compute nodes. >> -- >> Rick Mohr >> Senior HPC System Administrator >> National Institute for Computational Sciences >> http://www.nics.tennessee.edu Cheers, Andreas --- Andreas Dilger CTO Whamcloud

Re: [lustre-discuss] lustre 2.10.5 or 2.11.0

2018-10-21 Thread Andreas Dilger
lctl set_param at_min=250 > lctl set_param at_max=600 > > ### > > Also I run this script at boot time to redefine IRQ assignments for hard > drives spanned across all CPUs, not needed for kernel > 4.4 > > #!/bin/sh > # numa_smp.sh > device=$1 > cpu1=$2

Re: [lustre-discuss] LU-11465 OSS/MDS deadlock in 2.10.5

2018-10-19 Thread Andreas Dilger
-the-MDS (or OSS) situation? >>> Rebooting the storage servers does not clear the hang-up, as upon reboot >>> the MDS quickly ends up with the same number of D-state threads (around the >>> same number as we have clients). It seems to me l

Re: [lustre-discuss] Error while building Lustre on CentOS 7

2018-10-18 Thread Andreas Dilger
have been a few recent ZFS releases that have serious bugs, so we've been keeping a bit behind on the ZFS updates until it has seen a few weeks of outside usage. Cheers, Andreas > On Sat, Oct 13, 2018 at 4:40 AM Andreas Dilger wrote: > There is a build breakage with Lustre and ZFS

Re: [lustre-discuss] Error while building Lustre on CentOS 7

2018-10-13 Thread Andreas Dilger
There is a build breakage with Lustre and ZFS 0.7.10/0.7.11. There is a patch in Gerrit that fixed this build issue for master, and there is a patch landed in ZFS Git that also fixes this issue. That said, both ZFS 0.7.10 and 0.7.11 have serious bugs and should not be used. Also note that Lust

Re: [lustre-discuss] zpool features, recordsize, post-upgrade?

2018-10-11 Thread Andreas Dilger
ht" compression can also be useful (e.g. lz4), if you have enough spare CPU cycles, files are not modified very often, and are compressible (otherwise it just adds overhead). In many systems compression can reduce space usage by 40% or more, and in some cases even improve read performance if

Re: [lustre-discuss] Writing to a single big file is slower

2018-10-10 Thread Andreas Dilger
s do not have much overhead, but very large files can use the full IO bandwidth. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] lustre write wrong data under postgresql benchmark tool test(concurrent access same data file with primary node write and standy node read)

2018-10-05 Thread Andreas Dilger
efficient than small ones. The important point is that Lustre can only keep the data consistent on the kernel side of the filesystem, it can't do anything once the data is in userspace buffers, which I think is what the original poster's problem relates to. Cheers, Andreas > On Sat, Sep

Re: [lustre-discuss] Errors large directory feature ldiskfs

2018-10-04 Thread Andreas Dilger
s? >>>> >>>> I can't see references to this feature in the lustre documentation. Is it >>>> related with the LU-1365? >>>> >>>> Regards. Cheers, Andreas --- Andreas Dilger CTO Whamcloud

Re: [lustre-discuss] Updating kernel will require recompilation of lustre kernel modules?

2018-10-02 Thread Andreas Dilger
client, do we need to > recompile lustre against the updated kernel version ? Newer RHEL kernel modules are built with weak symbols, so if the kernel update is minor then the modules will likely work with the new kernel. Cheers, Andreas --- Andr

Re: [lustre-discuss] Limit to the number of "--servicenode="

2018-09-29 Thread Andreas Dilger
I haven't checked the code recently, but I believe that there can be up to 32 NIDs assigned per target. I've heard of at least some sites that are configuring four OSS nodes per OST, with three OSTs/OSS to allow failover of each OST to a different OSS. That will only increase load on an OSS t

Re: [lustre-discuss] lustre write wrong data under postgresql benchmark tool test(concurrent access same data file with primary node write and standy node read)

2018-09-29 Thread Andreas Dilger
Is PG using O_DIRECT or buffered read/write? Is it caching the pages in userspace? Lustre will definitely keep pages consistent between clients, but if the application is caching the pages in userspace, and does not have any protocol between the nodes to invalidate cached pages when they are m

Re: [lustre-discuss] lustre 2.10.* and zfs record size

2018-09-28 Thread Andreas Dilger
ed question. The documentation says nothing about the recommended recordsize for MDTs with ZFS backend. Are there any recommendation? Or is data on MDTs stored in a way that the recordsize does not matter? Thanks, Robert On 09/28/2018 12:54 AM, Andreas Dilger wrote: Firstly, we don't test anyt

Re: [lustre-discuss] lustre 2.10.* and zfs record size

2018-09-27 Thread Andreas Dilger
Firstly, we don't test anything larger than 1MB. Secondly, the best recordsize is up to the application IO pattern. If it is streaming writes, it might be OK. If it is random write then probably not. Cheers, Andreas On Sep 27, 2018, at 19:04, Riccardo Veraldi wrote
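A sketch of checking and changing the dataset recordsize, assuming a ZFS OST dataset; pool/dataset names are placeholders:

zfs get recordsize ostpool/ost0
zfs set recordsize=1M ostpool/ost0   # values above 128K need the pool's large_blocks feature
# Per the reply above, larger records mainly help streaming writes; random writes may suffer.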

Re: [lustre-discuss] Experience with resizing MDT

2018-09-27 Thread Andreas Dilger
ues. It looks like the presence of enable_remote_dir is not strictly needed, and enable_remote_dir_gid is controlling access. Setting it to a specific group number (e.g. "wheel" or "admin") will allow that group to create remote/striped directories, while "-1" will

Re: [lustre-discuss] Experience with resizing MDT

2018-09-27 Thread Andreas Dilger
s. As far as I can see, "mdt_remote_dir" checks still exist in the master code. Cheers, Andreas > On 9/21/18, 11:28 PM, "Andreas Dilger" wrote: > >On Sep 20, 2018, at 16:38, Mohr Jr, Richard Frank (Rick Mohr) > wrote: >> >> >>> On Sep 19

Re: [lustre-discuss] rsync target for https://downloads.whamcloud.com/public/?

2018-09-26 Thread Andreas Dilger
I don't think there is an rsync target, but all of the Jenkins builds are also Yum repos, so you could point at latest-release and do "yum update". Cheers, Andreas > On Sep 26, 2018, at 16:20, Andrew Elwell wrote: > > Hi folks, > > Is there an rsync (or other easily mirrorable) target for > d
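A sketch of what "point at latest-release" could look like as a yum repo; the exact baseurl layout under downloads.whamcloud.com is an assumption and should be verified before use:

# /etc/yum.repos.d/lustre-client.repo
[lustre-client]
name=Lustre client (latest release)
baseurl=https://downloads.whamcloud.com/public/lustre/latest-release/el7/client/
enabled=1
gpgcheck=0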

Re: [lustre-discuss] Understanding MDT getxattr stats

2018-09-25 Thread Andreas Dilger
> OST000e:read_bytes : 8.02 MB > OST000e:write_bytes: 7.48 GB > OST000e:setattr: 1 > OST000d:write_bytes: 1.21 GB > OST000f:write_bytes: 2.88 GB

Re: [lustre-discuss] Lustre-2.10.5 problem

2018-09-25 Thread Andreas Dilger
blnd-opa ko2iblnd >>>options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 >>> concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 >>> fmr_flush_trigger=512 fmr_cache=1 >>>install ko2iblnd /usr/sbin/ko2iblnd-probe

Re: [lustre-discuss] Lustre-2.10.5 problem

2018-09-24 Thread Andreas Dilger
Don't use 0.7.10, it has a serious bug. Use 0.7.12 instead. Cheers, Andreas > On Sep 24, 2018, at 21:39, Riccardo Veraldi > wrote: > > as for me Lustre 2.10.5 is not building on ZFS 0.7.10 > of course it builds fine with ZFS 0.7.9 > > CC:gcc > LD:/usr/bin/ld -m elf_x8

Re: [lustre-discuss] Experience with resizing MDT

2018-09-21 Thread Andreas Dilger
roach. Of course, I might be misunderstanding something about DNE2, and > if that is the case, someone can correct me. Or if there are options I am > not considering, I would welcome those too. Yes, if you are not pushing the limits of MDT size, then resizing the MDT is a reasonable

Re: [lustre-discuss] Second read or write performance

2018-09-21 Thread Andreas Dilger
ting a single file of 300TB in size, so that is definitely going to skew the space allocation. Cheers, Andreas > > On Thu, Sep 20, 2018 at 10:57 PM Andreas Dilger wrote: > On Sep 20, 2018, at 03:07, fırat yılmaz wrote: > > > > Hi all, > > > > OS=Redhat 7.4 &g

Re: [lustre-discuss] Experience with resizing MDT

2018-09-20 Thread Andreas Dilger
a good idea to have an MDT backup (every few days if possible) since it is a relatively small amount of space to store an MDT backup, which may avoid a large amount of data loss/restore. Even a "dd" backup of the live MDT is likely usable after e2fsck, and better than a broken MDT

Re: [lustre-discuss] Second read or write performance

2018-09-20 Thread Andreas Dilger
Could you please describe what command you are using for testing. Lustre is already using round-robin OST allocation by default, so the second job should use the next set of 36 OSTs, unless the file layout has been specified e.g. to start on OST or the space usage of the OSTs is very imbalanced

Re: [lustre-discuss] One MGS for two different MDT - Can't mount the second fs

2018-09-18 Thread Andreas Dilger
ión > Chilean Virtual Observatory > Ayudante Coordinador CSJ, IWI-131 > Universidad Técnica Federico Santa María > camilo.nun...@sansano.usm.cl

Re: [lustre-discuss] How to find the performance degrade reason

2018-09-16 Thread Andreas Dilger
result around 20MB/sec before it was above 600 MB. >>>> >>>> i checked zfs status -xv and (all pool healths are ok) >>>> >>>> 1) How to check which OSTs are involved during data write operation ? >>>> 2) How to check Meta data (read

Re: [lustre-discuss] lustre-discuss Digest, Vol 150, Issue 14

2018-09-11 Thread Andreas Dilger
med >> while the snapshot remains mounted (which is for us typically several hours)? >> Is there already an LU-ticket about this issue? >> >> Thanks! >> Robert >> -- >> Dr. Robert Redl >> Scientific Programmer, "Waves to Weather" (SFB/TRR165)

Re: [lustre-discuss] 2.10.5 compiler versions

2018-09-06 Thread Andreas Dilger
dependent on earlier ones, but that is probably the minority of cases. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] separate SSD only filesystem including HDD

2018-08-31 Thread Andreas Dilger
performance of ZFS is "good enough", and whether the features (checksums, online scrub, drive management, etc) outweigh the performance impact? > On 8/31/18, 3:20 AM, "Andreas Dilger" wrote: > >>Just to confirm, there is only a single NVMe device in each server node,

Re: [lustre-discuss] separate SSD only filesystem including HDD

2018-08-31 Thread Andreas Dilger
>>>>> Issue 1: is that when I combined SSDs in stripe mode using zfs we are >>>>> not linearly scaling in terms of performance. For e.g. single SSD write >>>>> speed is 1.3GB/sec,

Re: [lustre-discuss] 2.10.5 compiler versions

2018-08-30 Thread Andreas Dilger
kernel module compile each time, which is slow. If you have time to investigate and optimize, that would be much appreciated. Cheers, Andreas --- Andreas Dilger CTO Whamcloud

Re: [lustre-discuss] Future of Lustre in upstream?

2018-08-30 Thread Andreas Dilger
steady stream of patches for that tree. I don't have the URL for that Git tree handy, but I'm sure they are happy to get some more testing and usage of their code. Cheers, Andreas --- Andreas Dilger CTO Whamcloud

Re: [lustre-discuss] Lustre/ZFS snapshots mount error

2018-08-27 Thread Andreas Dilger
t; [1353498.974525] LustreError: 25582:0:(mdd_device.c:1061:mdd_prepare()) > 36ca26b-MDD: failed to initialize changelog: rc = -30 > [1353498.976229] LustreError: > 25582:0:(obd_mount_server.c:1879:server_fill_super()) Unable to start > targets: -30 > [1353499.072002] Lu

Re: [lustre-discuss] Lustre Size Variation after formatiing

2018-08-20 Thread Andreas Dilger
On an ldiskfs MDT, by default 1/2 of all the space is consumed by inodes. That is fine (good even) because inodes is (or at least was) most of what the MDT stores. All of the file data is stored on the OSTs. Cheers, Andreas > On Aug 19, 2018, at 23:59, ANS wrote: > > Dear Team, > > I am tryi

Re: [lustre-discuss] oldest lustre deployment?

2018-08-15 Thread Andreas Dilger
such an MPICH library is run against a newer version of Lustre, it will return some approximate values via the old interface, which may not be ideal but are no worse than running an older MPICH library. Cheers, Andreas --- Andreas Dilger Principal Lustre

Re: [lustre-discuss] lustre vs. lustre-client

2018-08-10 Thread Andreas Dilger
only a handful of modules would be different between the client and server). Having a patched server kernel isn't needed for ZFS, and while it works for ldiskfs as well, there are still a few kernel patches that improve ldiskfs server performance/functionality that are not in RHEL7 (e.g. pro

Re: [lustre-discuss] Using lctl lfsck syntax issues

2018-08-09 Thread Andreas Dilger
The lctl commands need to be run on the MDS. Cheers, Andreas > On Aug 9, 2018, at 11:49, Ms. Megan Larko wrote: > > Howdy List! > > I am checking Lustre-2.10.4 (kernel 3.10.0-693 on CentOS 7.3.1611). > I am having trouble using lctl lfsck. I believe I am not using the proper > syntax. The
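A minimal sketch of the syntax, run on the MDS; the target name testfs-MDT0000 is a placeholder:

lctl lfsck_start -M testfs-MDT0000 -t namespace,layout
lctl get_param mdd.testfs-MDT0000.lfsck_namespace   # check progress/status
lctl lfsck_stop -M testfs-MDT0000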

Re: [lustre-discuss] Upgrading ZFS version for Lustre

2018-07-27 Thread Andreas Dilger
On Jul 27, 2018, at 12:38, Mohr Jr, Richard Frank (Rick Mohr) wrote: > >> >> On Jul 27, 2018, at 1:56 PM, Andreas Dilger wrote: >> >>> On Jul 27, 2018, at 10:24, Mohr Jr, Richard Frank (Rick Mohr) >>> wrote: >>> >>> I am working on

Re: [lustre-discuss] lfs find to locate files with specific permissions

2018-07-27 Thread Andreas Dilger
r similar, and then modifying it to compare the mode. The tricky part is getting the semantics correct with +/-mode, and converting symbolic modes to octal (the find(1) man page goes into depth on this issue). Starting with only "[+-]octal" may be enoug

Re: [lustre-discuss] Upgrading ZFS version for Lustre

2018-07-27 Thread Andreas Dilger
nting=enabled" when you are ready to use these features. Cheers, Andreas --- Andreas Dilger CTO Whamcloud

Re: [lustre-discuss] Recommended minimal amount of free space to keep on a lustre filesystem

2018-07-23 Thread Andreas Dilger
the drive read/write heads. Keeping the filesystem less full uses the higher block numbers less. Cheers, Andreas --- Andreas Dilger Principal Lustre Architect Whamcloud

Re: [lustre-discuss] changing the lnet IP addresses

2018-07-13 Thread Andreas Dilger
There is an "lctl replace_nids" command that does this. I believe it is documented in the lctl(8) man page as well as the user manual. Cheers, Andreas > On Jul 13, 2018, at 09:40, Lydia Heck wrote: > > > Dear all, > > we are in the unfortunate position that we will have to change the lustre
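A hedged sketch of the procedure: stop all targets and clients, mount only the MGS, then run replace_nids for each target; names and NIDs are placeholders:

lctl replace_nids testfs-MDT0000 192.168.2.10@tcp
lctl replace_nids testfs-OST0000 192.168.2.20@tcp
# Alternative: run tunefs.lustre --writeconf on every target and remount the servers
# in order (MGS/MDT first, then OSTs) to regenerate the configuration logs.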

Re: [lustre-discuss] Lustre 2.10.4 server with 2.5.32 clients

2018-07-03 Thread Andreas Dilger
Lustre has a fairly robust network protocol negotiation at mount time between the client and server, and we try hard to avoid any protocol changes that are not covered by the connect-time negotiation. However, we aren't able to test all possible combinations of versions, so the published list is wh

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-28 Thread Andreas Dilger
on the same IP subnet. > > Have you tried running a regular “ping ” command between > clients and servers to make sure that part is working? > > -- > Rick Mohr > Senior HPC System Administrator > National Institute for Computational Sciences > http://www.nics.tennesse

Re: [lustre-discuss] Not able to load lustre modules on Lustre client

2018-06-28 Thread Andreas Dilger
kernel the modules were built for. This is also stored in every module, and can be seen with "modinfo lnet" and "modinfo libcfs" in the "vermagic" field. This should match the currently running kernel version reported by "uname -r". Cheers, Andreas > On J

Re: [lustre-discuss] Not able to load lustre modules on Luster client

2018-06-28 Thread Andreas Dilger
It would be useful to include the actual error messages, in particular which module symbols it is complaining about. Cheers, Andreas On Jun 28, 2018, at 22:01, vaibhav pol wrote: > > Hi, >I have installed the Lustre client RPMS (Version 2.11.0) on CentOS 7.4 > > Whenever I tried

Re: [lustre-discuss] SSK configuration

2018-06-28 Thread Andreas Dilger
7: revoked > Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: > [22251]:TRACE:lgss_release_cred(): releasing sk cred 0x1ecc2e0 > Jun 22 19:22:02 ip-172-31-46-245 lgss_keyring: [22254]:TRACE:do_nego_rpc(): > do_nego_rpc: to parse reply > Jun 2

Re: [lustre-discuss] what is fsname used for ? and how to get role based security ?

2018-06-28 Thread Andreas Dilger
For #3 you should look at "nodemap" and "subdirectory mount" in the manual. I agree that simple user permissions should be the starting point, but if you need more complete isolation (eg. if users are in charge of VM images), then the following presentation will be useful: http://wiki.lustre.org/i
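A rough sketch of combining nodemap with a subdirectory mount for per-tenant isolation; names, NID range, and fileset are placeholders:

# On the MGS:
lctl nodemap_add tenant1
lctl nodemap_add_range --name tenant1 --range 192.168.10.[1-50]@tcp
lctl nodemap_modify --name tenant1 --property admin --value 0
lctl nodemap_modify --name tenant1 --property trusted --value 0
lctl nodemap_set_fileset --name tenant1 --fileset /tenant1
lctl nodemap_activate 1
# On a tenant client, mount only its subtree:
mount -t lustre mgsnode@tcp:/testfs/tenant1 /mnt/testfs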

Re: [lustre-discuss] lctl ping node28@o2ib report Input/output error

2018-06-27 Thread Andreas Dilger
01. I can mount as well as access lustre on > > client ml-gpu-ser200.nmg01. > > What options did you use when mounting the file system? > > -- > Rick Mohr > Senior HPC System Administrator > National Institute for Computational Sciences > http://www

Re: [lustre-discuss] MDT size smaller than expected

2018-06-26 Thread Andreas Dilger
On Jun 26, 2018, at 14:21, Steve Barnet wrote: > > Hi Andreas, > > > On 6/25/18 5:47 PM, Andreas Dilger wrote: >> On Jun 25, 2018, at 20:39, Steve Barnet wrote: >>> >>> Hi all, >>> >>> I'm setting up a new lustre filesystem wit

Re: [lustre-discuss] MDT size smaller than expected

2018-06-25 Thread Andreas Dilger
ir_nlink,quota,huge_file,flex_bg -E > lazy_journal_init -F /dev/mapper/md3420-1-vd-0 583172096 > Writing CONFIGS/mountdata

Re: [lustre-discuss] Lustre 2.11 File Level Replication

2018-06-25 Thread Andreas Dilger
lcme_extent.e_start: 0 lcme_extent.e_end: EOF lmm_stripe_count: 1 lmm_stripe_size: 1048576 lmm_pattern: raid0 lmm_layout_gen:0 lmm_stripe_offset: 0 lmm_objects: - 0: { l_ost_idx: 0, l_fid: [0x1:0x6:0x0] } B.R., Emol

Re: [lustre-discuss] Documentation link dead

2018-06-25 Thread Andreas Dilger
You can also find the manual at: http://lustre.org/documentation/ Cheers, Andreas On Jun 25, 2018, at 13:19, Zeeshan Ali Shah wrote: Lustre manual link https://wiki.whamcloud.com/display/PUB/Documentation Lustre Manual, PDF, HTML, epub all are not working.. is d

Re: [lustre-discuss] LUG 2018

2018-06-20 Thread Andreas Dilger
18 at 12:20 PM > To: Lustre discussion > Subject: [lustre-discuss] LUG 2018 > > Hi all, > Are the talks online yet? > Thanks, > Eli

Re: [lustre-discuss] dealing with maybe dead OST

2018-06-20 Thread Andreas Dilger
uld dump them all into $MNT/.lustre/lost+found. > we can also take an lvm snapshot of the MDT and refer to that later I > suppose, but I'm not sure how that might help us. It should be possible to copy the unlinked files from the backup MDT to the current MDT (via ldiskfs), along wi

Re: [lustre-discuss] Moving files out of MDT ROOT/ directory

2016-12-22 Thread Andreas Dilger
If you move them to CORRUPTED (either inside the namespace or outside) then Lustre shouldn't do anything further to them. That would also prevent the blocks from being reallocated. The other option would be to run "e2fsck -l" and feed it the bad block numbers and have those stored in the ext4
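A sketch of the e2fsck step described above, assuming the target is unmounted; the device and block-list file are placeholders (one block number per line):

e2fsck -f -l /tmp/bad_blocks.txt /dev/mapper/mdtdev
# -l adds the listed blocks to the ext4 bad-block inode so they will not be reallocated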

Re: [Lustre-discuss] RE : Lustre-2.4 VMs (EL6.4)

2014-08-19 Thread Andreas Dilger
Often this problem is because the hostname in /etc/hosts is actually mapped to localhost on the node itself. Unfortunately, this is how some systems are set up by default. Cheers, Andreas > On Aug 19, 2014, at 12:39, "Abhay Dandekar" wrote: > > I came across a similar situation. > > Below
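An illustration of the misconfiguration being described (hostname and address are placeholders):

# Problematic /etc/hosts: the node's own hostname resolves to loopback
127.0.0.1   localhost oss01
# Fixed: keep the hostname only on its real interface address
127.0.0.1   localhost
10.0.0.21   oss01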

Re: [Lustre-discuss] mds-survey on Lustre-1.8

2012-11-15 Thread Andreas Dilger
differently? That's a clear advantage, and being a standard Lustre tool gives some confidence too. Chris Cheers, Andreas -- Andreas Dilger
