[lustre-discuss] problem building lustre 2.12.9

2022-09-21 Thread Riccardo Veraldi

Hello,

I am trying to build the Lustre client 2.12.9 on RHEL 8:

rpmbuild --rebuild --without servers 
/root/rpmbuild/SRPMS/lustre-client-dkms-2.12.9-1.el8.src.rpm


configure: error:
You seem to have an OFED installed but have not installed it's devel 
package.
If you still want to build Lustre for your OFED I/B stack, you need to 
install its devel headers RPM.
Instead, if you want to build Lustre for your kernel's built-in I/B 
stack rather than your installed OFED stack, either remove the OFED 
package(s) or use --with-o2ib=no.
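
The error offers two ways forward; a hedged sketch of both (the devel package name and the OFED source path are assumptions and depend on the MLNX_OFED version and how it was installed):

# Option 1: install the OFED devel headers so configure can find them
# (on Mellanox OFED this is typically mlnx-ofa_kernel-devel; verify the name)
yum install mlnx-ofa_kernel-devel
# the matching OFED sources then usually live under /usr/src/ofa_kernel/default

# Option 2: build against the kernel's in-tree IB stack instead, e.g. when
# configuring from the unpacked source tree:
./configure --disable-server --with-o2ib=no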


I am using Mellanox OFED.

Any hints on how I could get this fixed?
thank you

Riccardo




Re: [lustre-discuss] Anyone know why lustre-zfs-dkms-2.12.8_6_g5457c37-1.el7.noarch.rpm won't install?

2022-05-05 Thread Riccardo Veraldi
If you look in the thread mentioned below, I wrote a patch for it. It was a long
time ago; I thought somebody would have fixed it already.

Here is my post about this issue 

http://lists.lustre.org/pipermail/lustre-discuss-lustre.org/2021-October/017813.html



> On May 3, 2022, at 5:16 PM, Mannthey, Keith via lustre-discuss 
>  wrote:
> 
> 
> Hello,
>   It has been a while since I installed Lustre ZFS and I am working to get 
> 2.12.8 Lustre ZFS going on a basic system.
>  
> The lib* spl* and zfs* packages all install with DKMS with no issue.
>  
> lustre-zfs-dkms-2.12.8_6_g5457c37-1.el7.noarch.rpm does not want to install:
>    The root issue seems to be that a zfs-devel package is missing from 
> https://downloads.whamcloud.com/public/lustre/latest-2.12-release/el7/server/RPMS/x86_64/
>   ?
>Any ideas?
>  
> The Lustre ZFS DKMS build errors out with:
> “
> checking if kernel has 64-bit quota limits support... yes
> checking if Linux kernel was built with CONFIG_QUOTA in or as module... yes
> checking whether to build ldiskfs... no
> checking whether to enable zfs... yes
> checking zfs source directory... /usr/src/zfs-3.10.0-1160.62.1.el7.x86_64
> checking zfs build directory... 
> /var/lib/dkms/zfs/3.10.0-1160.62.1.el7.x86_64/3.10.0-1160.62.1.el7.x86_64/x86_64
> checking user provided zfs devel headers...
> checking zfs devel headers... -I/usr/include/libspl -I /usr/include/libzfs
> ./configure: line 33341: test: zfs: integer expression expected
> configure: error:
>  
> Required zfs osd cannot be built due to missing zfs development headers.
>  
> Support for zfs can be enabled by downloading the required packages for your
> distribution. See http://zfsonlinux.org/ to determine is zfs is supported by
> your distribution.
>  
> configure error, check 
> /var/lib/dkms/lustre-zfs/2.12.8_6_g5457c37/build/config.log
>  
> Building module:
> cleaning build area...(bad exit status: 2)
> make -j72 KERNELRELEASE=3.10.0-1160.62.1.el7.x86_64...(bad exit status: 2)
> Error! Bad return status for module build on kernel: 
> 3.10.0-1160.62.1.el7.x86_64 (x86_64)
> Consult /var/lib/dkms/lustre-zfs/2.12.8_6_g5457c37/build/make.log for more 
> information.
> “
>  
> Thanks,
> Keith


Re: [lustre-discuss] dkms-2.8.6 breaks installation of lustre-zfs-dkms-2.12.7-1.el7.noarch

2021-10-21 Thread Riccardo Veraldi

Thanks a lot, and thanks for the corrections.

Anyway, I never use more than one DKMS module built for different kernel
versions, and I usually build the Lustre DKMS module against the currently
running kernel; but yes, your fixes correctly address the issues you mentioned
and are a more general approach.



On 10/21/21 11:38 AM, Franke, Knut wrote:

Hi,

On Wednesday, 13.10.2021 at 16:06 -0700, Riccardo Veraldi wrote:

This is my patch to make things work and build the lustre-dkms rpm

Thank you! I just ran into the exact same problem. Two comments on the
patch:


-   ZFS_VERSION=$(dkms status -m zfs -k $3 -a $5 | awk -F', ' '{print $2; exit 0}' | grep -v ': added$')
+   ZFS_VERSION=$(dkms status -m zfs | awk ' { print $1 } ' | sed -e 's/zfs\///' -e 's/,//')

This produces an incorrect result if the dkms module is already built
for multiple kernel versions. I would suggest picking the largest zfs
version for simplicity's sake:

+   ZFS_VERSION=$(dkms status -m zfs | awk ' { print $1 } ' | sed -e 's/zfs\///' -e 's/,//' | sort -V | tail -n1)
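
For illustration, a hedged sketch of what that pipeline yields, assuming the dkms 2.8.x output format of "module/version, kernel, arch: status" (the versions below are made up):

$ dkms status -m zfs
zfs/0.7.13, 3.10.0-1160.25.1.el7.x86_64, x86_64: installed
zfs/0.8.6, 3.10.0-1160.62.1.el7.x86_64, x86_64: installed
$ dkms status -m zfs | awk ' { print $1 } ' | sed -e 's/zfs\///' -e 's/,//' | sort -V | tail -n1
0.8.6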

Secondly,


+   SERVER="--enable-server $LDISKFS \
+-  --with-linux=$4 --with-linux-obj=$4 \
+-  --with-spl=$6/spl-${ZFS_VERSION} \
+-  --with-spl-obj=$7/spl/${ZFS_VERSION}/$3/$5 \
+-  --with-zfs=$6/zfs-${ZFS_VERSION} \
+-  --with-zfs-obj=$7/zfs/${ZFS_VERSION}/$3/$5"
++  --with-zfs=/usr/src/zfs-${ZFS_VERSION} \
++  --with-zfs-obj=/var/lib/dkms/zfs/${ZFS_VERSION}/$3/$5"

This fails if we're building for a newly installed kernel we haven't
rebooted to yet (or rather, any kernel version other than the one
that's currently booted). Also, we might want to keep open the
possibility of building for non-x86 that was present in the original
(though I don't know whether Lustre even supports non-x86). So:

+   --with-zfs-obj=/var/lib/dkms/zfs/${ZFS_VERSION}/$3/$5"

To be honest, I don't understand why this second block of changes is
necessary at all; but I currently don't have the time to do any more
experiments.

Cheers,
Knut




Re: [lustre-discuss] dkms-2.8.6 breaks installation of lustre-zfs-dkms-2.12.7-1.el7.noarch

2021-10-13 Thread Riccardo Veraldi
Yes, same problem for me. I addressed this a few weeks ago and I think I
reported it to the mailing list.


This is my patch to make things work and build the lustre-dkms RPM:


diff -ru lustre-2.12.7/lustre-dkms_pre-build.sh lustre-2.12.7-dkms-pcds/lustre-dkms_pre-build.sh
--- lustre-2.12.7/lustre-dkms_pre-build.sh    2021-07-14 22:06:05.0 -0700
+++ lustre-2.12.7-dkms-pcds/lustre-dkms_pre-build.sh    2021-09-26 08:30:54.09600 -0700
@@ -20,18 +20,16 @@
 fi

 # ZFS and SPL are version locked
-    ZFS_VERSION=$(dkms status -m zfs -k $3 -a $5 | awk -F', ' '{print $2; exit 0}' | grep -v ': added$')
+    ZFS_VERSION=$(dkms status -m zfs | awk ' { print $1 } ' | sed -e 's/zfs\///' -e 's/,//')
+
 if [ -z $ZFS_VERSION ] ; then
     echo "zfs-dkms package must already be installed and built under DKMS control"
     exit 1
 fi

 SERVER="--enable-server $LDISKFS \
-        --with-linux=$4 --with-linux-obj=$4 \
-        --with-spl=$6/spl-${ZFS_VERSION} \
-        --with-spl-obj=$7/spl/${ZFS_VERSION}/$3/$5 \
-        --with-zfs=$6/zfs-${ZFS_VERSION} \
-        --with-zfs-obj=$7/zfs/${ZFS_VERSION}/$3/$5"
+        --with-zfs=/usr/src/zfs-${ZFS_VERSION} \
+        --with-zfs-obj=/var/lib/dkms/zfs/${ZFS_VERSION}/$(uname -r)/x86_64"

 KERNEL_STUFF="--with-linux=$4 --with-linux-obj=$4"
 ;;




On 10/13/21 2:30 PM, Fredrik Nyström via lustre-discuss wrote:

dkms was recently updated to version 2.8.6 in epel/7.

After this update installation of lustre-zfs-dkms-2.12.7-1.el7.noarch
fails with following error:

./configure: line 33341: test: zfs: integer expression expected
configure: error:


Breakage seems to be caused by following dkms commit:
https://github.com/dell/dkms/commit/f83b758b6fb8ca67b1ab65df9e3d2a1e994eb483


configure line 33341:
if test x$enable_modules = xyes && test $ZFS_MAJOR -eq 0 && test $ZFS_MINOR -lt 8; then :

Not sure exactly how but it ends up with ZFS_MAJOR=zfs, ZFS_MINOR=zfs
instead of: ZFS_MAJOR=0, ZFS_MINOR=7
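
A hedged illustration of why that test then blows up; the exact parsing in the Lustre configure macro may differ, but it amounts to something like this:

# old dkms output yields ZFS_VERSION="0.7.13" -> ZFS_MAJOR=0, ZFS_MINOR=7
# with dkms 2.8.6 the parsed field degenerates to the literal string "zfs":
ZFS_VERSION="zfs"
ZFS_MAJOR=$(echo "$ZFS_VERSION" | cut -d. -f1)   # -> "zfs"
ZFS_MINOR=$(echo "$ZFS_VERSION" | cut -d. -f2)   # -> "zfs"
test "$ZFS_MAJOR" -eq 0                          # "integer expression expected"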


Downgrading to older dkms or manually reverting the commit mentioned
above solved this problem for me.


Regards / Fredrik N.


[lustre-discuss] how to optimize write performances

2021-09-30 Thread Riccardo Veraldi



Hello,

I wanted to ask for some hints on how I might increase single-process
sequential write performance on Lustre.


I am using Lustre 2.12.7 on RHEL 7.9

I have a number of OSSes with SAS SSDs in raidz: 3 OSTs per OSS, and each
OST is made of 8 SSDs in raidz.


On a local test with multiple writes I can write and read from the zpool 
at 7GB/s per OSS.


With the Lustre/ZFS backend I can reach peak writes of 5.5 GB/s per OSS, which
is OK.


This only happens with several concurrent writes to the filesystem, though.


A single write cannot do more than 800 MB/s to 1 GB/s.

Changing the underlying hardware and moving to NVMe improves single-write
performance, but only slightly.


What is preventing a single write pattern from performing better? They are
XTC files.


Each single SSD has a 500 MB/s write capability by factory specs, so it seems
that with a single write it is not possible to take advantage of the zpool
parallelism. I also tried striping, but that does not really help much.
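
For reference, striping is configured per directory or per file with lfs setstripe; the stripe count, stripe size and paths below are just examples. It mostly helps a single stream if the writer issues large, deep I/O so that several OSTs stay busy at once:

# new files created in this directory get striped over 4 OSTs, 4 MiB stripes
lfs setstripe -c 4 -S 4M /lustre/mydir
# inspect the layout of an existing file
lfs getstripe /lustre/mydir/file.xtc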

Any hint is really appreciated.

Best

Riccardo





Re: [lustre-discuss] how to enforce traffic to OSS on o2ib1 only ?

2021-09-28 Thread Riccardo Veraldi
You are more than right. The IB interface 172.21.164.116 is not 
registered, only the TCP one is


- { index: 234, event: add_uuid, nid: 172.21.156.102@tcp1(0x20001ac159c66), node: 172.21.156.102@tcp1 }
- { index: 240, event: add_uuid, nid: 172.21.156.102@tcp1(0x20001ac159c66), node: 172.21.156.102@tcp1 }
- { index: 246, event: add_uuid, nid: 172.21.156.102@tcp1(0x20001ac159c66), node: 172.21.156.102@tcp1 }


Do you know how I can register it with the o2ib1 interface?

I already reformatted the OSTs, but that did not fix the problem.
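
If the NIDs recorded on the MGS are stale, the usual way to regenerate the configuration llogs is a writeconf; a hedged sketch only (pool/dataset names are placeholders, and the whole filesystem has to be stopped first):

# with all targets unmounted, on each server:
tunefs.lustre --writeconf <mdtpool>/<mdt-dataset>
tunefs.lustre --writeconf <ostpool>/<ost-dataset>
# remount the MGS/MDT first, then the OSTs; each target re-registers its
# current NIDs with the MGS on first mount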

Thanks

Riccardo



On 9/28/21 11:56 AM, Stephane Thiell wrote:

Hi Riccardo,

I would check if the OSTs on this OSS have been registered with the correct 
NIDs (o2ib1) on the MGS:

$ lctl --device MGS llog_print -client

and look for the NIDs in setup/add_conn for the OSTs in question.

Best,

Stephane




On Sep 28, 2021, at 9:52 AM, Riccardo Veraldi  
wrote:

Hello.

I have a lustre setup where the MDS (172.21.156.112)  is on tcp1 while the 
OSSes are on o2ib1.

I am using Lustre 2.12.7 on RHEL 7.9

All the clients can see the MDS correctly as a tcp1 peer:

peer:
 - primary nid: 172.21.156.112@tcp1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.156.112@tcp1
   state: NA


This is by design because the MDS has no IB interface. So the MDS to OSSes 
traffic and MDS to Clients traffic is on tcp1, while clients to OSSes traffic 
is meant to be on o2ib1.

I have 1 MDS (tcp1)  And 12 OSSes (tcp1, o2ib1) and a bunch of 20 clients 
(tcp1, o2ib1).

All is fine but not for one of the OSSes (172.21.164.116@o2ib1, 
172.21.156.102@tcp1).

Even though it is configured the same as all the other ones, traffic only goes 
through tcp1 and not o2ib1.

Even if I force the peer settings to use o2ib, it ignores it and the tcp1 peer 
is added anyway

this is lnet.conf on the MDS

p2nets:
  - net-spec: o2ib1
interfaces:
   0: ib0
  - net-spec: tcp1
interfaces:
   0: eno1
global:
 discovery: 0



this is lnet.conf on OSSes

ip2nets:
  - net-spec: o2ib1
interfaces:
   0: ib0
  - net-spec: tcp1
interfaces:
   0: enp1s0f0
global:
 discovery: 0



I also tried this on the lustre clients side:

peer:
 - primary nid: 172.21.164.116@o2ib1
   Multi-Rail: False
   peer ni:
 - nid: 172.21.164.116@o2ib1

enforcing the peer settings to o2ib1.

This is ignored and the peer is added by its tcp1 LNET interface.

 - primary nid: 172.21.156.102@tcp1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.156.102@tcp1
   state: NA

All of the hosts involved have discovery set to 0.

Nevertheless the peer setting for that specific OSS is using tcp1 and not o2ib.

This is disrupting because traffic goes to tcp1 for that specific OSS and it is 
of course slower than IB.

I had to deactivate the OSTs on that specific OSS.

How may I fix this issue?

Here is the complete peer list from the lustre client side and as you can see 
there is that specific OSS included as tcp1 peer.

even if I do  "lnetctl peer del --nid 172.21.156.102@tcp1 --prim_nid 
172.21.156.102@tcp1" the entry is added automatically after a while.

lnetctl peer show
peer:
 - primary nid: 172.21.156.112@tcp1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.156.112@tcp1
   state: NA
 - primary nid: 172.21.164.111@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.111@o2ib1
   state: NA
 - primary nid: 172.21.164.117@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.117@o2ib1
   state: NA
 - primary nid: 172.21.164.112@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.112@o2ib1
   state: NA
 - primary nid: 172.21.164.119@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.119@o2ib1
   state: NA
 - primary nid: 172.21.164.114@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.114@o2ib1
   state: NA
 - primary nid: 172.21.164.120@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.120@o2ib1
   state: NA
 - primary nid: 172.21.156.102@tcp1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.156.102@tcp1
   state: NA
 - primary nid: 172.21.164.116@o2ib1
   Multi-Rail: False
   peer ni:
 - nid: 172.21.164.116@o2ib1
   state: NA
 - primary nid: 172.21.164.110@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.110@o2ib1
   state: NA
 - primary nid: 172.21.164.115@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.115@o2ib1
   state: NA
 - primary nid: 172.21.164.118@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.118@o2ib1
   state: NA
 - primary nid: 172.21.164.113@o2ib1
   Multi-Rail: True
   peer ni:
 - nid: 172.21.164.113@o2ib1
  

[lustre-discuss] how to enforce traffic to OSS on o2ib1 only ?

2021-09-28 Thread Riccardo Veraldi

Hello.

I have a lustre setup where the MDS (172.21.156.112)  is on tcp1 while 
the OSSes are on o2ib1.


I am using Lustre 2.12.7 on RHEL 7.9

All the clients can see the MDS correctly as a tcp1 peer:

peer:
    - primary nid: 172.21.156.112@tcp1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.156.112@tcp1
  state: NA

This is by design because the MDS has no IB interface. So the MDS to 
OSSes traffic and MDS to Clients traffic is on tcp1, while clients to 
OSSes traffic is meant to be on o2ib1.


I have 1 MDS (tcp1)  And 12 OSSes (tcp1, o2ib1) and a bunch of 20 
clients (tcp1, o2ib1).


All is fine but not for one of the OSSes (172.21.164.116@o2ib1, 
172.21.156.102@tcp1).


Even though it is configured the same as all the other ones, traffic 
only goes through tcp1 and not o2ib1.


Even if I force the peer settings to use o2ib, it ignores it and the 
tcp1 peer is added anyway


this is lnet.conf on the MDS

p2nets:
 - net-spec: o2ib1
   interfaces:
  0: ib0
 - net-spec: tcp1
   interfaces:
  0: eno1
global:
    discovery: 0


this is lnet.conf on OSSes

ip2nets:
 - net-spec: o2ib1
   interfaces:
  0: ib0
 - net-spec: tcp1
   interfaces:
  0: enp1s0f0
global:
    discovery: 0


I also tried this on the lustre clients side:

peer:
    - primary nid: 172.21.164.116@o2ib1
  Multi-Rail: False
  peer ni:
    - nid: 172.21.164.116@o2ib1

enforcing the peer settings to o2ib1.

This is ignored and the peer is added by its tcp1 LNET interface.

    - primary nid: 172.21.156.102@tcp1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.156.102@tcp1
  state: NA

All of the hosts involved have discovery set to 0.

Nevertheless the peer setting for that specific OSS is using tcp1 and 
not o2ib.


This is disrupting because traffic goes to tcp1 for that specific OSS 
and it is of course slower than IB.


I had to deactivate the OSTs on that specific OSS.

How may I fix this issue?

Here is the complete peer list from the lustre client side and as you 
can see there is that specific OSS included as tcp1 peer.


even if I do  "lnetctl peer del --nid 172.21.156.102@tcp1 --prim_nid 
172.21.156.102@tcp1" the entry is added automatically after a while.


lnetctl peer show
peer:
    - primary nid: 172.21.156.112@tcp1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.156.112@tcp1
  state: NA
    - primary nid: 172.21.164.111@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.111@o2ib1
  state: NA
    - primary nid: 172.21.164.117@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.117@o2ib1
  state: NA
    - primary nid: 172.21.164.112@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.112@o2ib1
  state: NA
    - primary nid: 172.21.164.119@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.119@o2ib1
  state: NA
    - primary nid: 172.21.164.114@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.114@o2ib1
  state: NA
    - primary nid: 172.21.164.120@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.120@o2ib1
  state: NA
    - primary nid: 172.21.156.102@tcp1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.156.102@tcp1
  state: NA
    - primary nid: 172.21.164.116@o2ib1
  Multi-Rail: False
  peer ni:
    - nid: 172.21.164.116@o2ib1
  state: NA
    - primary nid: 172.21.164.110@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.110@o2ib1
  state: NA
    - primary nid: 172.21.164.115@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.115@o2ib1
  state: NA
    - primary nid: 172.21.164.118@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.118@o2ib1
  state: NA
    - primary nid: 172.21.164.113@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.113@o2ib1
  state: NA
    - primary nid: 172.21.164.121@o2ib1
  Multi-Rail: True
  peer ni:
    - nid: 172.21.164.121@o2ib1
  state: NA

thanks for looking at this.

Rick








Re: [lustre-discuss] can't install Lustre 2.12.7

2021-09-27 Thread Riccardo Veraldi

Hello,

after debugging the lustre-zfs DKMS build I realized that, for some reason,
the Lustre pre-build script does not have the correct parameters,


or the correct parameters are not passed to it, so that the include and
object directories for building the DKMS modules are wrong and the
configure script then fails.


My solution was to patch lustre-dkms_pre-build.sh in a way that it would
work, at least for me.


This way lustre-dkms builds. Also, I did not need SPL at all since I am
using ZFS 0.8.6.



--- lustre-2.12.7/lustre-dkms_pre-build.sh    2021-07-14 22:06:05.0 -0700
+++ lustre-2.12.7-dkms-pcds/lustre-dkms_pre-build.sh    2021-09-26 08:30:54.09600 -0700
@@ -20,18 +20,16 @@
 fi

 # ZFS and SPL are version locked
-    ZFS_VERSION=$(dkms status -m zfs -k $3 -a $5 | awk -F', ' '{print $2; exit 0}' | grep -v ': added$')
+    ZFS_VERSION=$(dkms status -m zfs | awk ' { print $1 } ' | sed -e 's/zfs\///' -e 's/,//')
+
 if [ -z $ZFS_VERSION ] ; then
     echo "zfs-dkms package must already be installed and built under DKMS control"
     exit 1
 fi

 SERVER="--enable-server $LDISKFS \
-        --with-linux=$4 --with-linux-obj=$4 \
-        --with-spl=$6/spl-${ZFS_VERSION} \
-        --with-spl-obj=$7/spl/${ZFS_VERSION}/$3/$5 \
-        --with-zfs=$6/zfs-${ZFS_VERSION} \
-        --with-zfs-obj=$7/zfs/${ZFS_VERSION}/$3/$5"
+        --with-zfs=/usr/src/zfs-${ZFS_VERSION} \
+        --with-zfs-obj=/var/lib/dkms/zfs/${ZFS_VERSION}/$(uname -r)/x86_64"

 KERNEL_STUFF="--with-linux=$4 --with-linux-obj=$4"
 ;;

On 9/27/21 7:05 AM, Thomas Roth wrote:

Hi Riccardo

(no solution for your problem here)
out of curiosity, I have just upgraded a test server to centos 7.9 and 
Lustre 2.12.7.


Kernel is 3.10.0-1160.42.2.el7.x86_64

But I installed kmod-lustre, lustre, kmod-lustre-osd-zfs. The Lustre 
modules went to /lib/modules/3.10.0-1160.25.1.el7.x86_64/extra, with 
symlinks to the weak-updates of the previous kernel 
(3.10.0-1160.2.1.el7.x86_64).


I symlinked those directories in 3.10.0-1160.25.1.el7.x86_64/extra to 
3.10.0-1160.42.2.el7.x86_64/extra and mounted my OSTs.
= Dirty workaround, if you quickly need 2.12.7, but of course not 
sustainable - at some point the kernel version will have deviated too 
much and dkms is needed again.
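
The workaround amounts to something like this (kernel versions as in the message above; it assumes the new kernel's extra/ directory does not already contain the Lustre modules):

ln -s /lib/modules/3.10.0-1160.25.1.el7.x86_64/extra/lustre* \
      /lib/modules/3.10.0-1160.42.2.el7.x86_64/extra/
depmod -a 3.10.0-1160.42.2.el7.x86_64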


Regards
Thomas

On 9/24/21 03:13, Riccardo Veraldi wrote:

Hello,

I am not successful installing Lustre 2.12.7; I run into a problem
with DKMS on RHEL 7.9.


kernel 3.10.0-1160.42.2.el7.x86_64

I am using rpm from 
https://downloads.whamcloud.com/public/lustre/lustre-2.12.7/el7.9.2009/server/RPMS/x86_64/


the Lustre DKMS module fails to build; it seems like something is
missing or there is a wrong path somewhere so that the proper headers
are not found, but I could not figure out what. I tried both zfs-0.7.13
and zfs-0.8.6 with the same result, so I am missing something.


Any hints ? I am stuck.

ZFS is:

libnvpair1-0.7.13-1.el7.x86_64
spl-0.7.13-1.el7.x86_64
libzfs2-0.7.13-1.el7.x86_64
spl-dkms-0.7.13-1.el7.noarch
libuutil1-0.7.13-1.el7.x86_64
zfs-dkms-0.7.13-1.el7.noarch
libzfs2-devel-0.7.13-1.el7.x86_64
libzpool2-0.7.13-1.el7.x86_64
zfs-0.7.13-1.el7.x86_64


yum install lustre-dkms
Loaded plugins: langpacks
Resolving Dependencies
--> Running transaction check
---> Package lustre-zfs-dkms.noarch 0:2.12.7-1.el7 will be installed
--> Processing Dependency: lustre-osd-zfs-mount for package: 
lustre-zfs-dkms-2.12.7-1.el7.noarch

--> Running transaction check
---> Package lustre-osd-zfs-mount.x86_64 0:2.12.7-1.el7 will be 
installed

--> Finished Dependency Resolution

Dependencies Resolved

 

  Package  Arch 
Version   Repository  Size
 


Installing:
  lustre-zfs-dkms  noarch 
2.12.7-1.el7  lustre  12 M

Installing for dependencies:
  lustre-osd-zfs-mount x86_64 
2.12.7-1.el7  lustre  12 k


Transaction Summary
 


Install  1 Package (+1 Dependent package)

Total download size: 12 M
Installed size: 38 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): lustre-osd-zfs-mount-2.12.7-1.el7.x86_64.rpm |  12 kB 00:00:00
(2/2): lustre-zfs-dkms-2.12.7-1.el7.noarch.rpm |  12 MB 00:00:00
 


Total 26 MB/s |  12 MB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
   Installing : lustr

[lustre-discuss] can't install Lustre 2.12.7

2021-09-23 Thread Riccardo Veraldi

Hello,

I am not successful installing Lustre 2.12.7; I run into a problem with
DKMS on RHEL 7.9.


kernel 3.10.0-1160.42.2.el7.x86_64

I am using rpm from 
https://downloads.whamcloud.com/public/lustre/lustre-2.12.7/el7.9.2009/server/RPMS/x86_64/


The Lustre DKMS module fails to build; it seems like something is missing
or there is a wrong path somewhere so that the proper headers are not
found, but I could not figure out what. I tried both zfs-0.7.13 and
zfs-0.8.6 with the same result, so I am missing something.


Any hints ? I am stuck.

ZFS is:

libnvpair1-0.7.13-1.el7.x86_64
spl-0.7.13-1.el7.x86_64
libzfs2-0.7.13-1.el7.x86_64
spl-dkms-0.7.13-1.el7.noarch
libuutil1-0.7.13-1.el7.x86_64
zfs-dkms-0.7.13-1.el7.noarch
libzfs2-devel-0.7.13-1.el7.x86_64
libzpool2-0.7.13-1.el7.x86_64
zfs-0.7.13-1.el7.x86_64


yum install lustre-dkms
Loaded plugins: langpacks
Resolving Dependencies
--> Running transaction check
---> Package lustre-zfs-dkms.noarch 0:2.12.7-1.el7 will be installed
--> Processing Dependency: lustre-osd-zfs-mount for package: 
lustre-zfs-dkms-2.12.7-1.el7.noarch

--> Running transaction check
---> Package lustre-osd-zfs-mount.x86_64 0:2.12.7-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved


 Package  Arch 
Version   Repository  Size


Installing:
 lustre-zfs-dkms  noarch 
2.12.7-1.el7  lustre  12 M

Installing for dependencies:
 lustre-osd-zfs-mount x86_64 
2.12.7-1.el7  lustre  12 k


Transaction Summary

Install  1 Package (+1 Dependent package)

Total download size: 12 M
Installed size: 38 M
Is this ok [y/d/N]: y
Downloading packages:
(1/2): lustre-osd-zfs-mount-2.12.7-1.el7.x86_64.rpm |  12 kB  00:00:00
(2/2): lustre-zfs-dkms-2.12.7-1.el7.noarch.rpm |  12 MB  00:00:00

Total 26 MB/s |  12 MB  00:00:00
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing : lustre-osd-zfs-mount-2.12.7-1.el7.x86_64 1/2
  Installing : lustre-zfs-dkms-2.12.7-1.el7.noarch 2/2
Loading new lustre-zfs-2.12.7 DKMS files...
Building for 3.10.0-1160.42.2.el7.x86_64
Building initial module for 3.10.0-1160.42.2.el7.x86_64
configure: WARNING:

Disabling ldiskfs support because complete ext4 source does not exist.

If you are building using kernel-devel packages and require ldiskfs
server support then ensure that the matching kernel-debuginfo-common
and kernel-debuginfo-common- packages are installed.

./configure: line 33341: test: zfs: integer expression expected
configure: error:

Required zfs osd cannot be built due to missing zfs development headers.

Support for zfs can be enabled by downloading the required packages for your
distribution.  See http://zfsonlinux.org/ to determine is zfs is 
supported by

your distribution.

Error! Bad return status for module build on kernel: 
3.10.0-1160.42.2.el7.x86_64 (x86_64)

Consult /var/lib/dkms/lustre-zfs/2.12.7/build/make.log for more information.
warning: %post(lustre-zfs-dkms-2.12.7-1.el7.noarch) scriptlet failed, 
exit status 10
Non-fatal POSTIN scriptlet failure in rpm package 
lustre-zfs-dkms-2.12.7-1.el7.noarch

  Verifying  : lustre-osd-zfs-mount-2.12.7-1.el7.x86_64 1/2
  Verifying  : lustre-zfs-dkms-2.12.7-1.el7.noarch 2/2

Installed:
  lustre-zfs-dkms.noarch 0:2.12.7-1.el7

Dependency Installed:
  lustre-osd-zfs-mount.x86_64 0:2.12.7-1.el7


cat /var/lib/dkms/lustre-zfs/2.12.7/build/make.log
DKMS make.log for lustre-zfs-2.12.7 for kernel 
3.10.0-1160.42.2.el7.x86_64 (x86_64)

Thu Sep 23 17:42:12 PDT 2021
make: *** No targets specified and no makefile found.  Stop.



[lustre-discuss] Correct ZoL version matching Lustre 2.12.7 ?

2021-09-14 Thread Riccardo Veraldi

Hello,

I am about to deploy a new Lustre 2.12.7 systems.

Which ZoL version should I choose for my Lustre/ZFS system?

0.7.13, 0.8.6, 2.0.5, 2.1.0 ?

Thanks

Rick



Re: [lustre-discuss] Disabling multi-rail dynamic discovery

2021-09-13 Thread Riccardo Veraldi

I suppose you removed /etc/modprobe.d/lustre.conf completely.

I only have the lnet service enabled at startup; I do not start any lustre3
service. But I am running Lustre 2.12.0, sorry, not 2.14,


so something might be different.

Did you start over with a clean configuration ?

Did you reboot your system to make sure it picks up the new config ? At 
least for me sometimes the lnet module does not unload correctly.


Also, I have to mention that in my setup I disabled discovery on the OSSes as
well, not only on the client side.


Generally it is not advisable to disable Multi-rail unless you have 
backward compatibility issues with older lustre peers.


But disabling discovery will also disable Multi-rail.

You can try with

lnetctl set discovery 0

as  you already did,

then you do

lnetctl -b export > /etc/lnet.conf

check that discovery is set to 0 in the file; if not, edit it and set it to 0.

Reboot and see if things change.

In any case, if you did not define any tcp interface in lnet.conf you should
not see any tcp peers.
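
A quick way to verify after the reboot (a sketch; if lnetctl global show is not available on your release, grep the export output as below):

lnetctl global show                                   # should report discovery: 0
lnetctl net show                                      # only the nets from /etc/lnet.conf
lnetctl export | grep -e Multi -e discover | sort -u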



On 9/13/21 2:59 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, 
Inc.] wrote:


Thanks Rick.  I removed my lnet modprobe options and adapted my 
lnet.conf file to:


# cat /etc/lnet.conf

ip2nets:

- net-spec: o2ib1

   interfaces:

  0: ib0

global:

    discovery: 0

#

Now "lnetctl export" doesn't have any reference to NIDs on the other 
networks, so that's good.  However, I'm still seeing some values that 
concern me:


# lnetctl export | grep -e Multi -e discover | sort -u

    discovery: 1

Multi-Rail: True

#

Any idea why discovery is still 1 if I'm specifying that to 0 in the 
lnet.conf file?  I'm a little concerned that with Multi-Rail still 
True and discovery on, the client could still find its way back to the 
TCP route.


From: Riccardo Veraldi
Date: Monday, September 13, 2021 at 3:16 PM
To: "Vicker, Darby J. (JSC-EG111)[Jacobs Technology, Inc.]", "lustre-discuss@lists.lustre.org"
Subject: [EXTERNAL] Re: [lustre-discuss] Disabling multi-rail dynamic discovery


I would use configuration on /etc/lnet.conf and I would not use 
anymore the older style configuration in


/etc/modprobe.d/lustre.conf

for example in my /etc/lnet.conf configuration I have:

ip2nets:
 - net-spec: o2ib
   interfaces:
  0: ib0
 - net-spec: tcp
   interfaces:
  0: enp24s0f0
global:
    discovery: 0

As I disabled the auto discovery.

Regarding ko2ib you can just use /etc/modprobe.d/ko2iblnd.conf

Mine looks like this:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4


Hope it helps.

Rick

On 9/13/21 1:53 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, 
Inc.] via lustre-discuss wrote:


Hello,

I would like to know how to turn off auto discovery of peers on a
client.  This seems like it should be straight forward but we
can't get it to work. Please fill me in on what I'm missing.

We recently upgraded our servers to 2.14.  Our servers are
multi-homed (1 tcp network and 2 separate IB networks) but we want
them to be single rail.  On one of our clusters we are still using
the 2.12.6 client and it uses one of the IB networks for lustre. 
The modprobe file from one of the client nodes:

# cat /etc/modprobe.d/lustre.conf

options lnet networks=o2ib1(ib0)

options ko2iblnd map_on_demand=32

#

The client does have a route to the TCP network.  This is intended
to allow jobs on the compute nodes to access license servers, not
for any serious I/O.  We recently discovered that due to some
instability in the IB fabric, the client was trying to fail over
to tcp:

# dmesg | grep Lustre

[  250.205912] Lustre: Lustre: Build Version: 2.12.6

[  255.886086] Lustre: Mounted scratch-client

[  287.247547] Lustre:
3472:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request
sent has timed out for sent delay: [sent 1630699139/real 0] 
req@98deb9358480 x1709911947878336/t0(0)
o9->hpfs-fsl-OST0001-osc-9880cfb8@192.52.98.33@tcp:28/4
lens 224/224 e 0 to 1 dl 1630699145 ref 2 fl Rpc:XN/0/ rc 0/-1

[  739.832744] Lustre:
3526:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request
sent has timed out for sent delay: [sent 1630699591/real 0] 
req@98deb935da00 x1709911947883520/t0(0)
o400->scratch-MDT-mdc-98b0f1fc0800@192.52.98.31@tcp:12/10
lens 224/224 e 0 to 1 dl 1630699598 ref 2 fl Rpc:XN/0/ rc 0/-1

[  739.832755] Lustre:
3526:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 5
previous similar messages

[  739.832762] LustreError: 166-1: MGC10.150.100.30@o2ib1:
Connection

Re: [lustre-discuss] Disabling multi-rail dynamic discovery

2021-09-13 Thread Riccardo Veraldi
I would use configuration on /etc/lnet.conf and I would not use anymore 
the older style configuration in


/etc/modprobe.d/lustre.conf

for example in my /etc/lnet.conf configuration I have:

ip2nets:
 - net-spec: o2ib
   interfaces:
  0: ib0
 - net-spec: tcp
   interfaces:
  0: enp24s0f0
global:
    discovery: 0

As I disabled the auto discovery.

Regarding ko2ib you can just use /etc/modprobe.d/ko2iblnd.conf

Mine looks like this:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 ntx=2048 map_on_demand=256 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4


Hope it helps.

Rick


On 9/13/21 1:53 PM, Vicker, Darby J. (JSC-EG111)[Jacobs Technology, 
Inc.] via lustre-discuss wrote:


Hello,

I would like to know how to turn off auto discovery of peers on a 
client.  This seems like it should be straight forward but we can't 
get it to work. Please fill me in on what I'm missing.


We recently upgraded our servers to 2.14.  Our servers are multi-homed 
(1 tcp network and 2 separate IB networks) but we want them to be 
single rail.  On one of our clusters we are still using the 2.12.6 
client and it uses one of the IB networks for lustre.  The modprobe 
file from one of the client nodes:


# cat /etc/modprobe.d/lustre.conf

options lnet networks=o2ib1(ib0)

options ko2iblnd map_on_demand=32

#

The client does have a route to the TCP network.  This is intended to 
allow jobs on the compute nodes to access license servers, not for 
any serious I/O.  We recently discovered that due to some instability 
in the IB fabric, the client was trying to fail over to tcp:


# dmesg | grep Lustre

[ 250.205912] Lustre: Lustre: Build Version: 2.12.6

[ 255.886086] Lustre: Mounted scratch-client

[ 287.247547] Lustre: 
3472:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent 
has timed out for sent delay: [sent 1630699139/real 0]  
req@98deb9358480 x1709911947878336/t0(0) 
o9->hpfs-fsl-OST0001-osc-9880cfb8@192.52.98.33@tcp:28/4 lens 
224/224 e 0 to 1 dl 1630699145 ref 2 fl Rpc:XN/0/ rc 0/-1


[ 739.832744] Lustre: 
3526:0:(client.c:2146:ptlrpc_expire_one_request()) @@@ Request sent 
has timed out for sent delay: [sent 1630699591/real 0]  
req@98deb935da00 x1709911947883520/t0(0) 
o400->scratch-MDT-mdc-98b0f1fc0800@192.52.98.31@tcp:12/10 lens 
224/224 e 0 to 1 dl 1630699598 ref 2 fl Rpc:XN/0/ rc 0/-1


[ 739.832755] Lustre: 
3526:0:(client.c:2146:ptlrpc_expire_one_request()) Skipped 5 previous 
similar messages


[ 739.832762] LustreError: 166-1: MGC10.150.100.30@o2ib1: Connection 
to MGS (at 192.52.98.30@tcp) was lost; in progress operations using 
this service will fail


[ 739.832769] Lustre: hpfs-fsl-MDT-mdc-9880cfb8: 
Connection to hpfs-fsl-MDT (at 192.52.98.30@tcp) was lost; in 
progress operations using this service will wait for recovery to complete


[ 1090.978619] LustreError: 167-0: 
scratch-MDT-mdc-98b0f1fc0800: This client was evicted by 
scratch-MDT; in progress operations using this service will fail.


I'm pretty sure this is due to the auto discovery.  Again, from a client:

# lnetctl export | grep -e Multi -e discover | sort -u
     discovery: 0
   Multi-Rail: True
#

We want to restrict Lustre to only the IB NID, but it's not clear exactly how
to do that.


Here is one attempt:


[root@r1i1n18 lnet]# service lustre3 stop

Shutting down lustre mounts

Lustre modules successfully unloaded

[root@r1i1n18 lnet]# lsmod | grep lnet

[root@r1i1n18 lnet]# cat /etc/lnet.conf

global:

    discovery: 0

[root@r1i1n18 lnet]# service lustre3 start

Mounting /ephemeral... done.

Mounting /nobackup... done.

[root@r1i1n18 lnet]# lnetctl export | grep -e Multi -e discover | sort -u

    discovery: 1

Multi-Rail: True

[root@r1i1n18 lnet]#

And a similar attempt (same lnet.conf file), but trying to turn off 
the discovery before doing the mounts:


[root@r1i1n18 lnet]# service lustre3 stop
Shutting down lustre mounts
Lustre modules successfully unloaded
[root@r1i1n18 lnet]# modprobe lnet
[root@r1i1n18 lnet]# lnetctl set discovery 0
[root@r1i1n18 lnet]# service lustre3 start
Mounting /ephemeral... done.
Mounting /nobackup... done.
[root@r1i1n18 lnet]# lnetctl export | grep -e Multi -e discover | sort -u
     discovery: 0
   Multi-Rail: True
[root@r1i1n18 lnet]#

If someone can point me in the right direction, I'd appreciate it.

Thanks,

Darby




[lustre-discuss] Lustre MDS as a Virtual Machine

2020-11-29 Thread Riccardo Veraldi
Hello,
I wanted to ask if anybody has experienced running MDS as a virtual
machine while OSSes are physical machines. The benefit would be to have
a kind of intrinsic level of high availability if the underlying
hypervisor/storage infrastructure is a HA cluster, but I was wondering
what do you think about it performance wise and if you consider it as
acceptable for deployment on a production system.

Best

Rick




[lustre-discuss] Lustre as VM backend

2020-02-19 Thread Riccardo Veraldi
Hello,
I wanted to ask if anybody is using Lustre as a filesystem backend for virtual
machines. I am thinking of environments like OpenStack or oVirt, where each VM
is inside a single qcow2 file and libvirt is used to access the underlying
filesystem where the VMs are stored.
Is anyone using Lustre for this, and if so, are there any best practices for
this specific use of Lustre (as a libvirt storage backend)?
Thank you

Rick



Re: [lustre-discuss] ZFS and multipathing for OSTs

2019-04-26 Thread Riccardo Veraldi
In my experience multipathd+ZFS works well, and it has usually worked well for
me. I just remove the broken disk when it fails, replace it, the new multipathd
device is added once the disk is replaced, and then I start resilvering.
However, I found out this does not always work with some JBOD disk
array/firmware versions.
Some Proware controllers I had did not recognize that a disk was replaced, but
this is not a multipathd problem in my case.
So my hint is to try it out with your hardware and see how it behaves.
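
As a starting point for the vdev_id.conf question in this thread, a minimal sketch (the aliases and device names are made up; with multipath you either enable multipath mode or point aliases at the /dev/mapper names):

# /etc/zfs/vdev_id.conf -- illustrative only
multipath  yes
# or explicit aliases onto the multipath devices:
alias  d00  /dev/mapper/mpatha
alias  d01  /dev/mapper/mpathb
# then run "udevadm trigger" and build the pool from /dev/disk/by-vdev names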

On 26/04/2019 16:57, Kurt Strosahl wrote:
>
> Hey, thanks!
>
>
> I tried the multipathing part you had down there and I couldn't get it
> to work... I did find that this worked though
>
>
> #I pick a victim device
> multipath -ll
> ...
> mpathax (35000cca2680a8194) dm-49 HGST    ,HUH721010AL5200 
> size=9.1T features='0' hwhandler='0' wp=rw
> `-+- policy='service-time 0' prio=1 status=enabled
>   |- 1:0:10:0   sdj     8:144   active ready running
>   `- 11:0:9:0   sddy    128:0   active ready running
> #then I remove the device
> multipath -f mpathax
> #and verify that it is gone
> multipath -ll | grep mpathax
> #then I run the following, which seems to rescan for devices.
> multipath -v2
> Apr 26 10:49:06 | sdj: No SAS end device for 'end_device-1:1'
> Apr 26 10:49:06 | sddy: No SAS end device for 'end_device-11:1'
> create: mpathax (35000cca2680a8194) undef HGST    ,HUH721010AL5200 
> size=9.1T features='0' hwhandler='0' wp=undef
> `-+- policy='service-time 0' prio=1 status=undef
>   |- 1:0:10:0   sdj     8:144   undef ready running
>   `- 11:0:9:0   sddy    128:0   undef ready running
> #then its back
> multipath -ll mpathax
> mpathax (35000cca2680a8194) dm-49 HGST    ,HUH721010AL5200 
> size=9.1T features='0' hwhandler='0' wp=rw
> `-+- policy='service-time 0' prio=1 status=enabled
>   |- 1:0:10:0   sdj     8:144   active ready running
>   `- 11:0:9:0   sddy    128:0   active ready running
>
> I still need to test it fully once I get the whole stack up and
> running, but this seems to be a step in the right direction.
>
>
> w/r,
> Kurt
>
> 
> From: Jongwoo Han
> Sent: Friday, April 26, 2019 6:28 AM
> To: Kurt Strosahl
> Cc: lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] ZFS and multipathing for OSTs
>  
> Disk replacement with multipathd + zfs is somewhat not convenient.
>
> step1: mark offline the disk you should replace with zpool command
> step2: remove disk from multipathd table with multipath -f 
> step3: replace disk
> step4: add disk to multipath table with multipath -ll 
> step5:  replace disk in zpool with zpool replace
>
> try this in your test environment and tell us if you have found
> anything interesting in the syslog.
> In my case replacing single disk in multipathd+zfs pool triggerd
> massive udevd partition scan. 
>
> Thanks
> Jongwoo Han
>
> On Fri, Apr 26, 2019 at 3:44 AM, Kurt Strosahl wrote:
>
> Good Afternoon,
>
>
>     As part of a new lustre deployment I've now got two disk
> shelves connected redundantly to two servers.  Since each disk has
> two paths to the server I'd like to use multipathing for both
> redundancy and improved performance.  I haven't found examples or
> discussion about such a setup, and was wondering if there are any
> resources out there that I could consult.
>
>
> Of particular interest would be examples of the
> /etc/zfs/vdev_id.conf and any tuning that was done.  I'm also
> wondering about extra steps that may have to be taken when doing a
> disk replacement to account for the multipathing.  I've got plenty
> of time to experiment with this process, but I'd rather not
> reinvent the wheel if I don't have to.
>
>
> w/r,
>
> Kurt J. Strosahl
> System Administrator: Lustre, HPC
> Scientific Computing Group, Thomas Jefferson National Accelerator
> Facility
>
>
>
>
> -- 
> Jongwoo Han
> +82-505-227-6108
>




Re: [lustre-discuss] ZFS tuning for MDT/MGS

2019-03-20 Thread Riccardo Veraldi

On 3/19/19 11:46 AM, Degremont, Aurelien wrote:


Also, if you’re not using Lustre 2.11 or 2.12, do not forget 
dnodesize=auto and recordsize=1M for OST


zfs set dnodesize=auto mdt0

zfs set dnodesize=auto ostX

https://jira.whamcloud.com/browse/LU-8342


good point, thank you



(useful for 2.10 LTS. Automatically done by Lustre for 2.11+)

From: lustre-discuss on behalf of "Carlson, Timothy S"
Date: Wednesday, March 13, 2019 at 23:07
To: Riccardo Veraldi, Kurt Strosahl, "lustre-discuss@lists.lustre.org"
Subject: Re: [lustre-discuss] ZFS tuning for MDT/MGS

+1 on

options zfs zfs_prefetch_disable=1


Might not be as critical now, but that was a must-have on Lustre 2.5.x

Tim

From: lustre-discuss On Behalf Of Riccardo Veraldi
Sent: Wednesday, March 13, 2019 3:00 PM
To: Kurt Strosahl; lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] ZFS tuning for MDT/MGS

these are the zfs settings I use on my MDSes


 zfs set mountpoint=none mdt0
 zfs set sync=disabled mdt0
 zfs set atime=off mdt0
 zfs set redundant_metadata=most mdt0
 zfs set xattr=sa mdt0

if your MDT partition is on a 4KB-sector disk then you can use 
ashift=12 when you create the filesystem, but ZFS is pretty smart and 
in my case it recognized it and used ashift=12 automatically.


Also, here are the ZFS kernel module parameters I use to get better 
performance. I use them on both MDS and OSSes:


options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#options zfs zfs_vdev_sync_write_min_active=64
#options zfs zfs_vdev_sync_write_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32

Some people may disagree with me; anyway, after years of trying 
different options I reached this stable configuration.


then there are a bunch of other important Lustre level optimizations 
that you can do if you are looking for performance increase.


Cheers

Rick

On 3/13/19 11:44 AM, Kurt Strosahl wrote:

Good Afternoon,

    I'm reviewing the zfs parameters for a new metadata system and
I was looking to see if anyone had examples (good or bad) of zfs
parameters?  I'm assuming that the MDT won't benefit from a
recordsize of 1MB, and I've already set the ashift to 12.  I'm
using an MDT/MGS made up of a stripe across mirrored ssds.

w/r,

Kurt











Re: [lustre-discuss] ZFS tuning for MDT/MGS

2019-03-13 Thread Riccardo Veraldi

these are the zfs settings I use on my MDSes

 zfs set mountpoint=none mdt0
 zfs set sync=disabled mdt0
 zfs set atime=off mdt0
 zfs set redundant_metadata=most mdt0
 zfs set xattr=sa mdt0

If your MDT partition is on a 4KB-sector disk then you can use ashift=12 when
you create the filesystem, but ZFS is pretty smart and in my case it recognized
it and used ashift=12 automatically.
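
If you do want to force it, ashift is fixed at pool creation time; a sketch with made-up device names:

zpool create -o ashift=12 mdt0 mirror /dev/disk/by-id/ssd0 /dev/disk/by-id/ssd1
zdb -C mdt0 | grep ashift     # verify what the pool actually uses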


Also, here are the ZFS kernel module parameters I use to get better
performance. I use them on both MDS and OSSes:


options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#options zfs zfs_vdev_sync_write_min_active=64
#options zfs zfs_vdev_sync_write_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32
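
For reference, module parameters like these normally go into a modprobe config so they survive reboots; a hedged sketch (the file name is the usual convention, not mandated):

# put the "options zfs ..." lines above into /etc/modprobe.d/zfs.conf;
# they are read when the zfs module loads. If zfs is loaded from the
# initramfs, rebuild it so the file is picked up:
dracut -f --kver $(uname -r)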

Some people may disagree with me; anyway, after years of trying different
options I reached this stable configuration.


then there are a bunch of other important Lustre level optimizations 
that you can do if you are looking for performance increase.


Cheers

Rick

On 3/13/19 11:44 AM, Kurt Strosahl wrote:


Good Afternoon,


    I'm reviewing the zfs parameters for a new metadata system and I 
was looking to see if anyone had examples (good or bad) of zfs 
parameters? I'm assuming that the MDT won't benefit from a recordsize 
of 1MB, and I've already set the ashift to 12.  I'm using an MDT/MGS 
made up of a stripe across mirrored ssds.



w/r,

Kurt







Re: [lustre-discuss] Lustre 2.12.0 and locking problems

2019-03-06 Thread Riccardo Veraldi

Hello Amir, I answer inline.

On 3/5/19 3:42 PM, Amir Shehata wrote:
It looks like the ping is passing. Did you try it several times to 
make sure it always pings successfully?


The way it works is the MDS (2.12) discovers all the interfaces on the 
peer. There is a concept of the primary NID for the peer. That's the 
first interface configured on the peer. In your case it's the o2ib 
NID. So when you do lnetctl net show you'll see Primary NID: @o2ib.


    - primary nid: 172.21.52.88@o2ib
   Multi-Rail: True
   peer ni:
 - nid: 172.21.48.250@tcp
   state: NA
 - nid: 172.21.52.88@o2ib
   state: NA
 - nid: 172.21.48.250@tcp1
   state: NA
 - nid: 172.21.48.250@tcp2
   state: NA

On the MDS it uses the primary_nid to identify the peer. So you can 
ping using the Primary NID. LNet will resolve the Primary NID to the 
tcp NID. As you can see in the logs, it never actually talks over 
o2ib. It ends up talking to the peer on its TCP NID, which is what you 
want to do.


I think the problem you're seeing is caused by the combination of 2.12 
and 2.10.x.

From what I understand your servers are 2.12 and your clients are 2.10.x.
My clients are 2.10.5, but this problem also arises with one 2.12.0 client;
anyway, the combination of 2.10.x clients and 2.12.0 servers is not
working right.


Can you try disabling dynamic discovery on your servers:
lnetctl set discovery 0


I did this on the MDS and OSS. I did not disable discovery on the client 
side.


now on the MDS side lnetctl peer show looks right.

Anyway, on the client side, where I have both IB and tcp, if I write to the
Lustre filesystem (OSS) what happens is that the write operation is
split/load-balanced between IB and tcp (Ethernet), and I do not want this.
I would like only IB to be used when the client writes data to the OSS, but
both peer NIs (o2ib, tcp) are seen from the 2.12.0 client and traffic goes to
both of them, thus reducing performance because IB is not fully used. This
does not happen with a 2.10.5 client writing to the same 2.12.0 OSS.





Do that as part of the initial bring up to make sure 2.12 nodes don't 
try to discover peers. Let me know if that resolves your issue?


On Tue, 5 Mar 2019 at 15:09, Riccardo Veraldi wrote:


it is not exactly this problem.

here is my setup

  * MDS is on tcp0
  * client is on tcp0 and o2ib0
  * OSS is on tcp0 and o2ib0

The problem is that the MDS is discovering both the lustre client
and the OSS as well over o2ib and it should not because the MDS
has only one ethernet interface. I can see this from lnetctl peer
show. This did not happen prior to upgrading to Lustre 2.12.0 from
2.10.5

so I tried to debug with lctl ping from the MDS to the lustre client

[root@psmdsana1501 ~]# lctl ping 172.21.48.250@tcp
12345-0@lo
12345-172.21.52.88@o2ib
12345-172.21.48.250@tcp

172.21.52.88 is the ib interface of the lustre client.

so I did

[root@psmdsana1501 ~]# lctl ping 172.21.52.88@o2ib
12345-0@lo
12345-172.21.52.88@o2ib
12345-172.21.48.250@tcp


this is the mds lnet.conf

ip2nets:
 - net-spec: tcp
   interfaces:
  0: eth0

then I did as you suggested on the MDS:

lctl set_param debug=+"net neterror"

lctl ping 172.21.48.250
12345-0@lo
12345-172.21.52.88@o2ib
12345-172.21.48.250@tcp


LOG file:


0800:0200:3.0F:1551827094.376143:0:17197:0:(socklnd.c:195:ksocknal_find_peer_locked())
got peer_ni [8e0f3a312100] -> 12345-172.21.49.100@tcp (4)

0800:0200:3.0:1551827094.376155:0:17197:0:(socklnd_cb.c:757:ksocknal_queue_tx_locked())
Sending to 12345-172.21.49.100@tcp ip 172.21.49.100:1021

0800:0200:3.0:1551827094.376158:0:17197:0:(socklnd_cb.c:776:ksocknal_queue_tx_locked())
Packet 8e0f3d32d800 type 192, nob 24 niov 1 nkiov 0

0800:0200:3.0:1551827094.376312:0:17200:0:(socklnd_cb.c:549:ksocknal_process_transmit())
send(0) 0

0800:0200:3.0:1551827097.376102:0:17197:0:(socklnd.c:195:ksocknal_find_peer_locked())
got peer_ni [8e0f346f9400] -> 12345-172.21.49.110@tcp (4)

0800:0200:3.0:1551827097.376110:0:17197:0:(socklnd_cb.c:757:ksocknal_queue_tx_locked())
Sending to 12345-172.21.49.110@tcp ip 172.21.49.110:1021

0800:0200:3.0:1551827097.376114:0:17197:0:(socklnd_cb.c:776:ksocknal_queue_tx_locked())
Packet 8e0f3d32d800 type 192, nob 24 niov 1 nkiov 0

0800:0200:3.0:1551827097.376135:0:17197:0:(socklnd.c:195:ksocknal_find_peer_locked())
got peer_ni [8e0f3d32d000] -> 12345-172.21.48.69@tcp (4)

0800:000

Re: [lustre-discuss] Lustre 2.12.0 and locking problems

2019-03-05 Thread Riccardo Veraldi
in the peer show.

Multi-Rail doesn't enable o2ib. It just sees it. If the node doing
the discovery has only tcp, then it should never try to connect
over the o2ib.

Are you able to do a "lnetctl ping 172.21.48.250@tcp" from the MDS
multiple times? Do you see the ping failing intermittently?

What should happen is that when the MDS (running 2.12) tries to
talk to the peer you have identified, then it'll discover its
interfaces. But then should realize that it can only reach it on
the tcp network, since that's the only network configured on the MDS.

It might help, if you just configure LNet only, on the MDS and the
peer and run a simple
lctl set_param debug=+"net neterror"
lnetctl ping <>
lctl dk >log

If you can share the debug output, it'll help to pinpoint the problem.

thanks
amir

On Tue, 5 Mar 2019 at 12:30, Riccardo Veraldi wrote:

I think I figured out the problem.
My problem is related to Lnet Network Health feature:
https://jira.whamcloud.com/browse/LU-9120
the lustre MDS and the lustre client having same version 2.12.0
negotiate a Multi-rail peer connection while this does not
happen with
the other clients (2.10.5). So what happens is that both IB
and tcp are
being used during transfers.
tcp is only for connecting to the MDS, IB only to connect to
the OSS
anyway Multi-rail is enabled by default between the MDS,OSS
and client.
This messes up the situation. the MDS has only one TCP
interface and
cannot communicate by IB but in the "lnetctl peer show" a NID
@o2ib
shows up and it should not. At this point the MDS tries to
connect to
the client using IB and it will never work because there is no
IB on the
MDS.
MDS Lnet configuration:

net:
 - net type: lo
   local NI(s):
 - nid: 0@lo
   status: up
 - net type: tcp
   local NI(s):
 - nid: 172.21.49.233@tcp
   status: up
   interfaces:
   0: eth0

but if I look at lnetctl peer show I See

    - primary nid: 172.21.52.88@o2ib
   Multi-Rail: True
   peer ni:
 - nid: 172.21.48.250@tcp
   state: NA
 - nid: 172.21.52.88@o2ib
   state: NA
 - nid: 172.21.48.250@tcp1
   state: NA
 - nid: 172.21.48.250@tcp2
   state: NA

there should be no o2ib nid but Multi-rail for some reason
enables it.
I do not have problems with the other clients (non 2.12.0)

How can I disable Multi-rail on 2.12.0 ??

thank you



On 3/5/19 12:14 PM, Patrick Farrell wrote:
> Riccardo,
>
> Since 2.12 is still a relatively new maintenance release, it
would be helpful if you could open an LU and provide more
detail there - Such as what clients were doing, if you were
using any new features (like DoM or FLR), and full dmesg from
the clients and servers involved in these evictions.
>
> - Patrick
>
> On 3/5/19, 11:50 AM, "lustre-discuss on behalf of Riccardo Veraldi" wrote:
>
>      Hello,
>
>      I have quite a big issue on my Lustre 2.12.0 MDS/MDT.
>
>      Clients moving data to the OSS occur into a locking
problem I never met
>      before.
>
>      The clients are mostly 2.10.5 except for one which is
2.12.0 but
>      regardless the client version the problem is still there.
>
>      So these are the errors I see on hte MDS/MDT. When this
happens
>      everything just hangs. If I reboot the MDS everything
is back to
>      normality but it happened already 2 times in 3 days and
it is disrupting.
>
>      Any hints ?
>
>      Is it feasible to downgrade from 2.12.0 to 2.10.6 ?
>
>      thanks
>
>      Mar  5 11:10:33 psmdsana1501 kernel: Lustre:
> 7898:0:(client.c:2132:ptlrpc_expire_one_request()) @@@
Request sent has
>      failed due to network error: [sent 1551813033/real
1551813033]
>      req@9fdcbecd0300 x1626845000210688/t0(0)
>      o104->ana15-MD

Re: [lustre-discuss] Lustre 2.12.0 and locking problems

2019-03-05 Thread Riccardo Veraldi

I think I figured out the problem.
My problem is related to the LNet Network Health feature:
https://jira.whamcloud.com/browse/LU-9120
The Lustre MDS and the Lustre client, both running version 2.12.0, negotiate a
Multi-Rail peer connection, while this does not happen with the other clients
(2.10.5). So what happens is that both IB and tcp are being used during
transfers.
tcp is only for connecting to the MDS and IB only for connecting to the OSS;
anyway, Multi-Rail is enabled by default between the MDS, OSS and client.
This messes up the situation. The MDS has only one TCP interface and cannot
communicate over IB, but in "lnetctl peer show" a NID @o2ib shows up and it
should not. At this point the MDS tries to connect to the client using IB and
it will never work because there is no IB on the MDS.

MDS Lnet configuration:

net:
    - net type: lo
  local NI(s):
    - nid: 0@lo
  status: up
    - net type: tcp
  local NI(s):
    - nid: 172.21.49.233@tcp
  status: up
  interfaces:
  0: eth0

but if I look at lnetctl peer show I See

   - primary nid: 172.21.52.88@o2ib
  Multi-Rail: True
  peer ni:
    - nid: 172.21.48.250@tcp
  state: NA
    - nid: 172.21.52.88@o2ib
  state: NA
    - nid: 172.21.48.250@tcp1
  state: NA
    - nid: 172.21.48.250@tcp2
  state: NA

There should be no o2ib NID, but Multi-Rail for some reason adds it.
I do not have this problem with the other (non-2.12.0) clients.

How can I disable Multi-rail on 2.12.0 ??
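
One thing I am going to try, in case it helps anyone else (a sketch, assuming
Lustre 2.12 where dynamic peer discovery is what negotiates Multi-Rail; not
verified on other releases): turn discovery off on the MDS so it stops
learning the clients' o2ib NIDs:

lnetctl set discovery 0
# or persistently, at lnet module load time (in /etc/modprobe.d/lnet.conf):
# options lnet lnet_peer_discovery_disabled=1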

thank you



On 3/5/19 12:14 PM, Patrick Farrell wrote:

Riccardo,

Since 2.12 is still a relatively new maintenance release, it would be helpful 
if you could open an LU and provide more detail there - Such as what clients 
were doing, if you were using any new features (like DoM or FLR), and full 
dmesg from the clients and servers involved in these evictions.

- Patrick

On 3/5/19, 11:50 AM, "lustre-discuss on behalf of Riccardo Veraldi" 
 
wrote:

 Hello,
 
 I have quite a big issue on my Lustre 2.12.0 MDS/MDT.
 
 Clients moving data to the OSSes run into a locking problem I have never
 seen before.
 
 The clients are mostly 2.10.5 except for one which is 2.12.0 but

 regardless the client version the problem is still there.
 
 So these are the errors I see on the MDS/MDT. When this happens

 everything just hangs. If I reboot the MDS everything is back to
 normality but it happened already 2 times in 3 days and it is disrupting.
 
 Any hints ?
 
 Is it feasible to downgrade from 2.12.0 to 2.10.6 ?
 
 thanks
 
 Mar  5 11:10:33 psmdsana1501 kernel: Lustre:

 7898:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has
 failed due to network error: [sent 1551813033/real 1551813033]
 req@9fdcbecd0300 x1626845000210688/t0(0)
 o104->ana15-MDT@172.21.52.87@o2ib:15/16 lens 296/224 e 0 to 1 dl
 1551813044 ref 1 fl Rpc:eX/0/ rc 0/-1
 Mar  5 11:10:33 psmdsana1501 kernel: Lustre:
 7898:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 50552576
 previous similar messages
 Mar  5 11:13:03 psmdsana1501 kernel: LustreError:
 7898:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid
 172.21.52.87@o2ib) failed to reply to blocking AST (req@9fdcbecd0300
 x1626845000210688 status 0 rc -110), evict it ns: mdt-ana15-MDT_UUID
 lock: 9fde9b6873c0/0x9824623d2148ef38 lrc: 4/0,0 mode: PR/PR res:
 [0x213a9:0x1d347:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT flags:
 0x6020040020 nid: 172.21.52.87@o2ib remote: 0xd8efecd6e7621e63
 expref: 8 pid: 7898 timeout: 333081 lvb_type: 0
 Mar  5 11:13:03 psmdsana1501 kernel: LustreError: 138-a: ana15-MDT:
 A client on nid 172.21.52.87@o2ib was evicted due to a lock blocking
 callback time out: rc -110
 Mar  5 11:13:03 psmdsana1501 kernel: LustreError:
 5321:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer
 expired after 150s: evicting client at 172.21.52.87@o2ib ns:
 mdt-ana15-MDT_UUID lock: 9fde9b6873c0/0x9824623d2148ef38 lrc:
 3/0,0 mode: PR/PR res: [0x213a9:0x1d347:0x0].0x0 bits 0x13/0x0 rrc:
 5 type: IBT flags: 0x6020040020 nid: 172.21.52.87@o2ib remote:
 0xd8efecd6e7621e63 expref: 9 pid: 7898 timeout: 0 lvb_type: 0
 Mar  5 11:13:04 psmdsana1501 kernel: Lustre: ana15-MDT: Connection
 restored to 59c5a826-f4e9-0dd0-8d4f-08c204f25941 (at 172.21.52.87@o2ib)
 Mar  5 11:15:34 psmdsana1501 kernel: LustreError:
 7898:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid
 172.21.52.142@o2ib) failed to reply to blocking AST
 (req@9fde2d393600 x1626845000213776 status 0 rc -110), evict it ns:
 mdt-ana15-MDT_UUID lock: 9fde9b6858c0/0x9824623d2148efee lrc:
 4/0,0 mode: PR/PR res: [0x213ac:0x1:0x0].0x0 bits 0x13/0x0 rrc:

[lustre-discuss] Lustre 2.12.0 and locking problems

2019-03-05 Thread Riccardo Veraldi

Hello,

I have quite a big issue on my Lustre 2.12.0 MDS/MDT.

Clients moving data to the OSSes run into a locking problem I have never 
seen before.


The clients are mostly 2.10.5 except for one which is 2.12.0 but 
regardless the client version the problem is still there.


So these are the errors I see on the MDS/MDT. When this happens 
everything just hangs. If I reboot the MDS everything is back to 
normal, but it has already happened twice in 3 days and it is disrupting.


Any hints ?

Is it feasible to downgrade from 2.12.0 to 2.10.6 ?

thanks

Mar  5 11:10:33 psmdsana1501 kernel: Lustre: 
7898:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has 
failed due to network error: [sent 1551813033/real 1551813033] 
req@9fdcbecd0300 x1626845000210688/t0(0) 
o104->ana15-MDT@172.21.52.87@o2ib:15/16 lens 296/224 e 0 to 1 dl 
1551813044 ref 1 fl Rpc:eX/0/ rc 0/-1
Mar  5 11:10:33 psmdsana1501 kernel: Lustre: 
7898:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 50552576 
previous similar messages
Mar  5 11:13:03 psmdsana1501 kernel: LustreError: 
7898:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 
172.21.52.87@o2ib) failed to reply to blocking AST (req@9fdcbecd0300 
x1626845000210688 status 0 rc -110), evict it ns: mdt-ana15-MDT_UUID 
lock: 9fde9b6873c0/0x9824623d2148ef38 lrc: 4/0,0 mode: PR/PR res: 
[0x213a9:0x1d347:0x0].0x0 bits 0x13/0x0 rrc: 5 type: IBT flags: 
0x6020040020 nid: 172.21.52.87@o2ib remote: 0xd8efecd6e7621e63 
expref: 8 pid: 7898 timeout: 333081 lvb_type: 0
Mar  5 11:13:03 psmdsana1501 kernel: LustreError: 138-a: ana15-MDT: 
A client on nid 172.21.52.87@o2ib was evicted due to a lock blocking 
callback time out: rc -110
Mar  5 11:13:03 psmdsana1501 kernel: LustreError: 
5321:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer 
expired after 150s: evicting client at 172.21.52.87@o2ib ns: 
mdt-ana15-MDT_UUID lock: 9fde9b6873c0/0x9824623d2148ef38 lrc: 
3/0,0 mode: PR/PR res: [0x213a9:0x1d347:0x0].0x0 bits 0x13/0x0 rrc: 
5 type: IBT flags: 0x6020040020 nid: 172.21.52.87@o2ib remote: 
0xd8efecd6e7621e63 expref: 9 pid: 7898 timeout: 0 lvb_type: 0
Mar  5 11:13:04 psmdsana1501 kernel: Lustre: ana15-MDT: Connection 
restored to 59c5a826-f4e9-0dd0-8d4f-08c204f25941 (at 172.21.52.87@o2ib)
Mar  5 11:15:34 psmdsana1501 kernel: LustreError: 
7898:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 
172.21.52.142@o2ib) failed to reply to blocking AST 
(req@9fde2d393600 x1626845000213776 status 0 rc -110), evict it ns: 
mdt-ana15-MDT_UUID lock: 9fde9b6858c0/0x9824623d2148efee lrc: 
4/0,0 mode: PR/PR res: [0x213ac:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 
type: IBT flags: 0x6020040020 nid: 172.21.52.142@o2ib remote: 
0xbb35541ea6663082 expref: 9 pid: 7898 timeout: 333232 lvb_type: 0
Mar  5 11:15:34 psmdsana1501 kernel: LustreError: 138-a: ana15-MDT: 
A client on nid 172.21.52.142@o2ib was evicted due to a lock blocking 
callback time out: rc -110
Mar  5 11:15:34 psmdsana1501 kernel: LustreError: 
5321:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer 
expired after 151s: evicting client at 172.21.52.142@o2ib ns: 
mdt-ana15-MDT_UUID lock: 9fde9b6858c0/0x9824623d2148efee lrc: 
3/0,0 mode: PR/PR res: [0x213ac:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 
type: IBT flags: 0x6020040020 nid: 172.21.52.142@o2ib remote: 
0xbb35541ea6663082 expref: 10 pid: 7898 timeout: 0 lvb_type: 0
Mar  5 11:15:34 psmdsana1501 kernel: Lustre: ana15-MDT: Connection 
restored to 9d49a115-646b-c006-fd85-000a4b90019a (at 172.21.52.142@o2ib)
Mar  5 11:20:33 psmdsana1501 kernel: Lustre: 
7898:0:(client.c:2132:ptlrpc_expire_one_request()) @@@ Request sent has 
failed due to network error: [sent 1551813633/real 1551813633] 
req@9fdcc2a95100 x1626845000222624/t0(0) 
o104->ana15-MDT@172.21.52.87@o2ib:15/16 lens 296/224 e 0 to 1 dl 
1551813644 ref 1 fl Rpc:eX/2/ rc 0/-1
Mar  5 11:20:33 psmdsana1501 kernel: Lustre: 
7898:0:(client.c:2132:ptlrpc_expire_one_request()) Skipped 23570550 
previous similar messages
Mar  5 11:22:46 psmdsana1501 kernel: LustreError: 
7898:0:(ldlm_lockd.c:682:ldlm_handle_ast_error()) ### client (nid 
172.21.52.87@o2ib) failed to reply to blocking AST (req@9fdcc2a95100 
x1626845000222624 status 0 rc -110), evict it ns: mdt-ana15-MDT_UUID 
lock: 9fde86ffdf80/0x9824623d2148f23a lrc: 4/0,0 mode: PR/PR res: 
[0x213ae:0x1:0x0].0x0 bits 0x13/0x0 rrc: 3 type: IBT flags: 
0x6020040020 nid: 172.21.52.87@o2ib remote: 0xd8efecd6e7621eb7 
expref: 9 pid: 7898 timeout: 333665 lvb_type: 0
Mar  5 11:22:46 psmdsana1501 kernel: LustreError: 138-a: ana15-MDT: 
A client on nid 172.21.52.87@o2ib was evicted due to a lock blocking 
callback time out: rc -110
Mar  5 11:22:46 psmdsana1501 kernel: LustreError: 
5321:0:(ldlm_lockd.c:256:expired_lock_main()) ### lock callback timer 
expired after 150s: evicting client at 172.21.52.87@o2ib ns: 

[lustre-discuss] problem on the MDT after upgrading lustre version

2019-03-01 Thread Riccardo Veraldi

Hello,

Yesterday I upgraded one of my filesystems from Lustre 2.10.5 to Lustre 
2.12.0


everything apparently went well. I also upgraded from zfs 0.7.9 to zfs 
0.7.12.


I also have another cluster with a clean 2.12.0 install and it works 
well and performs well.


Today, after yesterday's upgrade, I noticed that the clients could no longer 
mount the filesystem, and they showed this error:


Mar  1 14:41:15 psexport07 kernel: LustreError: 
1355:0:(lmv_obd.c:1412:lmv_statfs()) can't stat MDS #0 
(ana15-MDT-mdc-9cf3b03e), error -11
Mar  1 14:41:16 psexport07 kernel: LustreError: 
1355:0:(lov_obd.c:831:lov_cleanup()) ana15-clilov-9cf3b03e: lov 
tgt 1 not cleaned! deathrow=0, lovrc=1



mount operation would just hang with no error on the MDS or on the OSS

After rebooting the MDS/MDT server everything is apparently fixed.

I wonder what could be the cause of this. I have checked the networking 
and LNet configuration and everything is properly configured.


lctl ping works and anyway nothing has been changed in the configuration 
from 2.10.5 to 2.12.0


Any hints are very appreciated.

thanks

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] upgrade from 2.10.5 to 2.12.0

2019-02-28 Thread Riccardo Veraldi

Hello,

I am planning a Lustre upgrade from 2.10.5/ZFS  to 2.12.0/ZFS

any particular caveat with this procedure ?

Can I simply upgrade the Lustre package and mount the filesystem ?
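
What I had in mind is roughly the following (just a sketch with placeholder
pool/dataset and mount point names; I would like to know if this is enough):

umount /lustre/ana15                          # on every client
umount /mnt/lustre/ost*                       # on every OSS
umount /mnt/lustre/mdt /mnt/lustre/mgs        # on the MDS, last
yum update "lustre*" "zfs*" "spl*"            # on all servers and clients
mount -t lustre mgspool/mgs /mnt/lustre/mgs   # then MDT, OSTs, clients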

thank you

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] peculiar LNet behavior on 2.12.0

2019-02-25 Thread Riccardo Veraldi

Hello,

I have a new Lustre cluster 2.12.0 with 8 OSSes and 24 clients on RHEL 7.6.

I have a problem with lnet behavior.

even though I configured lnet.conf in this way on every client:

ip2nets:
 - net-spec: o2ib0
   interfaces:
  0: ib0

when I check the lnet status on each client, sometimes it is like this, as 
it should be:


lnetctl net show
net:
    - net type: lo
  local NI(s):
    - nid: 0@lo
  status: up
    - net type: o2ib
  local NI(s):
    - nid: 172.21.52.145@o2ib
  status: up
  interfaces:
  0: ib0


but other times it shows a tcp instance that I did not configure in 
lnet.conf


lnetctl net show

net:
    - net type: lo
  local NI(s):
    - nid: 0@lo
  status: up
    - net type: tcp
  local NI(s):
    - nid: 172.21.42.245@tcp
  status: up
    - net type: o2ib
  local NI(s):
    - nid: 172.21.52.145@o2ib
  status: up
  interfaces:
  0: ib0


I configured the clients to connect to the MDS through o2ib and not tcp, 
but somehow LNet automatically, from time to time, adds the tcp net instance.


This does not happen on the OSS side but only on the clients side.

I thought I could just use o2ib without a net type tcp instance but LNet 
likes to add it automatically eventually.


How to prevent this ?
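
The only workaround I can think of so far (a sketch, assuming the clients
only ever need the IB interface) is to check that nothing else configures a
tcp network at module load time, and to pin the networks in modprobe instead
of relying only on lnet.conf:

grep -r lnet /etc/modprobe.d/
# and, if nothing conflicting is there, pin the client to IB only:
# options lnet networks="o2ib0(ib0)"    (in /etc/modprobe.d/lustre.conf)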

thank you


Riccardo




___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Which release to use?

2019-02-22 Thread Riccardo Veraldi
I am using Lustre 2.12.0 and it seems to be working pretty well; anyway I built 
it against zfs 0.7.12 libraries... was that a mistake ?

what's the zfs release that Lustre 2.12.0 is built/tested on ?


On 2/22/19 12:18 PM, Peter Jones wrote:


Nathan

Yes 2.12 is an LTS branch. We’re planning on putting out both 2.10.7 
and 2.12.1 this quarter but have been focusing on the former first to 
allow for more time to receive feedback from early adopters on 2.12.0. 
You can see the patches that will land starting to accumulate here - 
https://review.whamcloud.com/#/q/status:open+project:fs/lustre-release+branch:b2_12 
. I guess what I am trying to say is “be patient” ☺


Peter

*From: *Nathan R Crawford 
*Reply-To: *"nathan.crawf...@uci.edu" 
*Date: *Friday, February 22, 2019 at 11:31 AM
*To: *Peter Jones 
*Cc: *"lustre-discuss@lists.lustre.org" 
*Subject: *Re: [lustre-discuss] Which release to use?

Hi Peter,

  Somewhat related: where should we be looking for the commits leading 
up to 2.12.1? The b2_12 branch 
(https://git.whamcloud.com/?p=fs/lustre-release.git;a=shortlog;h=refs/heads/b2_12) 
has no activity since 2.12.0 was released. I assumed that if 2.12 is a 
LTS branch like 2.10, there would be something by now. Commits started 
appearing on b2_10 after a week or so.


  "Be patient" is an acceptable response :)

-Nate

On Fri, Feb 22, 2019 at 10:51 AM Peter Jones wrote:


2.12.0 is relatively new. It does have some improvements over
2.10.x (notably Data on MDT) but if those are not an immediate
requirement then using 2.10.6 would be a proven and more
conservative option. 2.12.51 is an interim development build for
2.13 and should absolutely not be used for production purposes.

On 2019-02-22, 10:07 AM, "lustre-discuss on behalf of Bernd
Melchers" mailto:lustre-discuss-boun...@lists.lustre.org> on behalf of
melch...@zedat.fu-berlin.de >
wrote:

    Hi all,
    in the git repository I find v2.10.6, v2.12.0 and v2.12.51. Which version
    should I compile and use for my production CentOS 7.6 system?

    Kind regards
    Bernd Melchers

    --
    Archiv- und Backup-Service | fab-serv...@zedat.fu-berlin.de

    Freie Universität Berlin   | Tel. +49-30-838-55905
    ___
    lustre-discuss mailing list
lustre-discuss@lists.lustre.org

http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org

http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


--

Dr. Nathan Crawford    nathan.crawf...@uci.edu
Modeling Facility Director
Department of Chemistry
1102 Natural Sciences II Office: 2101 Natural Sciences II
University of California, Irvine  Phone: 949-824-4508
Irvine, CA 92697-2025, USA

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] 2.10.6 or 2.12.0 ?

2019-02-04 Thread Riccardo Veraldi

Hello,

I have to build a bunch of new big Lustre filesystems.

I was wondering if I should go for 2.12.0, so that in the future it will be 
simpler to keep it up to date within the 2.12.* family, or if it is better 
to opt for 2.10.6 now and upgrade to 2.12.* later.


Any hints ?

thank you

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lnet yaml error

2019-02-01 Thread Riccardo Veraldi
 
 

 Make sure not to have an old lustre.conf file in modprobe.d which may load legacy 
lnet settings.
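
A quick way to check (sketch):

grep -ril lnet /etc/modprobe.d/
# then remove or comment out any stale "options lnet ..." lines and restart lnet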
 

 
 

 
 
>  
> On Feb 1, 2019 at 12:32 PM, mdidomeni...@gmail.com wrote:
>  
>  
>  
>  yes. turns out there must have been something futzy in the system, i 
> did an lctl net down, a lustre_rmmod, and then systemctl restart 
> lnet. things seemed to work after that. seems a strange failure 
> scenario though 
>
> i can't mount the filesystem still, but i think that's a separate issue 
>
> On Fri, Feb 1, 2019 at 3:29 PM Riccardo Veraldi 
>   wrote: 
> >  
> >  Did you install yaml 
> >  and yaml-devel ? 
> >  
> >  
> >  On Feb 1, 2019 at 12:20 PM,wrote: 
> >  
> >  i'm trying to start an lnet client, but lnet kicks out the config with 
> >  
> >  a yaml error 
> >  
> >  
> >  
> >  yaml: 
> >  
> >  - builder: 
> >  
> >  errno: -1 
> >  
> >  descr: failed to handle token:0 [state=3, rc=-5] 
> >  
> >  
> >  
> >  i've compared the lnet.conf on the server i'm testing with another 
> >  
> >  that is working, they appear identical (except for some ip addresses), 
> >  
> >  but i can't locate a problem. 
> >  
> >  
> >  
> >  is there something i can run that might be a little more vocal about 
> >  
> >  what's broke? 
> >  
> >  ___ 
> >  
> >  lustre-discuss mailing list 
> >  
> >  lustre-discuss@lists.lustre.org 
> >  
> >  http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org 
> >  
>  ___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lnet yaml error

2019-02-01 Thread Riccardo Veraldi
 
 

 Did you install yaml and yaml-devel ?
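
(On RHEL/CentOS the packages are usually libyaml and libyaml-devel, e.g.
yum install -y libyaml libyaml-devel -- package names may differ on other
distributions.)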
 

 
 

 
 
>  
> On Feb 1, 2019 at 12:20 PM, mdidomeni...@gmail.com wrote:
>  
>  
>  
>  i'm trying to start an lnet client, but lnet kicks out the config with 
> a yaml error 
>
> yaml: 
> - builder: 
> errno: -1 
> descr: failed to handle token:0 [state=3, rc=-5] 
>
> i've compared the lnet.conf on the server i'm testing with another 
> that is working, they appear identical (except for some ip addresses), 
> but i can't locate a problem. 
>
> is there something i can run that might be a little more vocal about 
> what's broke? 
> ___ 
> lustre-discuss mailing list 
> lustre-discuss@lists.lustre.org 
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org 
>  ___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] ko2iblnd optimizations for EDR

2018-11-08 Thread Riccardo Veraldi

Thanks for sharing your experience.
I am running a peer_credits value of 63, as it was for mlx4, and I did not 
notice problems with these settings. Anyway, to saturate the EDR bandwidth 
when writing and reading with Lustre I have to do it from 2 different 
clients, so that I can reach up to 9GB/s per OSS. Otherwise from a single 
client I cannot reach more than 5GB/s; it is not clear to me why.


On 11/8/18 4:58 AM, Martin Hecht wrote:

On 11/7/18 9:44 PM, Riccardo Veraldi wrote:

Anyway I Was wondering if something different is needed for mlx5 and
what are the suggested values in that case ?

Anyone has experience with mlx5 LNET performance tunings ?

Hi Riccardo,

We have recently integrated mlx5 nodes into our fabric, and we had to
reduce the values to

peer_credits = 16
concurrent_sends = 16

because mlx5 doesn't support larger values for some reason. The peer_credits 
must have the same value in all connected lnets, even across routers (at least 
it used to be like this. I believe we are currently running some Lustre 2.5.x 
derivates on the server side, and newer versions on the various clients).

kind regards,
Martin


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] lctl set_param not setting values permanently

2018-11-08 Thread Riccardo Veraldi

Hello,

I did set a bunch of params from the MDS so that they can be taken up by 
the Lustre clients


lctl set_param -P osc.*.checksums=0
lctl set_param -P timeout=600
lctl set_param -P at_min=250
lctl set_param -P at_max=600
lctl set_param -P ldlm.namespaces.*.lru_size=2000
lctl set_param -P osc.*.max_rpcs_in_flight=64
lctl set_param -P osc.*.max_dirty_mb=1024
lctl set_param -P llite.*.max_read_ahead_mb=1024
lctl set_param -P llite.*.max_cached_mb=81920
lctl set_param -P llite.*.max_read_ahead_per_file_mb=1024
lctl set_param -P subsystem_debug=0

it works, but not for osc.*.checksums=0

every time a client reboots and the filesystem is remounted I have to run 
lctl set_param -P osc.*.checksums=0 on the MDS again so that


all the clients pick up this setting again. I do not know why this 
happens. Any clue ?


I am using Lustre 2.10.5  Centos 7 with kernel 3.10.0-862.14.4.el7.x86_64
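
In the meantime this is how I check and re-apply it (a sketch):

lctl get_param osc.*.checksums        # on a client: should print 0 for every OSC
lctl set_param -P osc.*.checksums=0   # on the MDS: re-push the persistent setting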

thanks

Riccardo


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] ko2iblnd optimizations for EDR

2018-11-07 Thread Riccardo Veraldi

Hello,

I found that for FDR Infiniband, if I set the ko2iblnd parameters as 
below, I get quite an increase in performance using mlx4:


options ko2iblnd timeout=100 peer_credits=63 credits=2560 
concurrent_sends=63  fmr_pool_size=1280 fmr_flush_trigger=1024 ntx=5120


These suggested values come from an ORNL presentation about advanced LNET 
configuration.


Anyway I was wondering if something different is needed for mlx5, and 
what the suggested values are in that case ?


Does anyone have experience with mlx5 LNET performance tunings ?
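
For comparing setups, the values actually in use can be read back from sysfs
once the module is loaded (a sketch):

for p in peer_credits concurrent_sends credits ntx; do
    echo -n "$p = "; cat /sys/module/ko2iblnd/parameters/$p
done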

thanks a lot

Riccardo


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre 2.10.5 or 2.11.0

2018-10-30 Thread Riccardo Veraldi

Sorry for replying late; I answered in-line.

On 10/21/18 6:00 AM, Andreas Dilger wrote:

It would be useful to post information like this on wiki.lustre.org so they can 
be found more easily by others.  There are already some ZFS tunings there (I 
don't have the URL handy, just on a plane), so it might be useful to include 
some information about the hardware and workload to give context to what this 
is tuned for.

Even more interesting would be to see if there is a general set of tunings that 
people agree should be made the default?  It is even better when new users 
don't have to seek out the various tuning parameters, and instead get good 
performance out of the box.

A few comments inline...

On Oct 19, 2018, at 17:52, Riccardo Veraldi  
wrote:

On 10/19/18 12:37 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:

On Oct 17, 2018, at 7:30 PM, Riccardo Veraldi  
wrote:

anyway especially regarding the OSSes you may eventually need some ZFS module parameters 
optimizations regarding vdev_write and vdev_read max to increase those values higher than 
default. You may also disable ZIL, change the redundant_metadata to "most"  
atime off.

I could send you a list of parameters that in my case work well.

Riccardo,

Would you mind sharing your ZFS parameters with the mailing list?  I would be 
interested to see which options you have changed.


this worked for me on my high performance cluster

options zfs zfs_prefetch_disable=1

This matches what I've seen in the past - at high bandwidth under concurrent 
client load the prefetched data on the server is lost, and just causes needless 
disk IO that is discarded.


options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30

This is interesting.  Is this actually setting the maximum TXG age up to 30s?


yes, I think the default is 5 seconds.





options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32

##

these the zfs attributes that I changed on the OSSes:

zfs set mountpoint=none $ostpool
zfs set sync=disabled $ostpool
zfs set atime=off $ostpool
zfs set redundant_metadata=most $ostpool
zfs set xattr=sa $ostpool
zfs set recordsize=1M $ostpool

The recordsize=1M is already the default for Lustre OSTs.

Did you disable multimount, or just not include it here?  That is fairly
important for any multi-homed ZFS storage, to prevent multiple imports.


#


these the ko2iblnd parameters for FDR Mellanox IB interfaces

options ko2iblnd timeout=100 peer_credits=63 credits=2560 concurrent_sends=63 
ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024 ntx=5120

You have ntx= in there twice...


yes, it is a mistake, I typed it twice





If this provides a significant improvement for FDR, it might make sense to add in 
machinery to lustre/conf/{ko2iblnd-probe,ko2iblnd.conf} to have a new alias 
"ko2iblnd-fdr" set these values on Mellanox FDB IB cards by default?


I found that it works better with FDR.

Anyway most of the tunings I did were taken here and there reading what 
other people did. So mostly from here:


 * https://lustre.ornl.gov/lustre101-courses/content/C1/L5/LustreTuning.pdf
 * https://www.eofs.eu/_media/events/lad15/15_chris_horn_lad_2015_lnet.pdf
 * https://lustre.ornl.gov/ecosystem-2015/documents/LustreEco2015-Tutorial2.pdf

And by the way, the most effective tweaks came after reading Rick Mohr's 
advice in LustreTuning.pdf. Thanks Rick!







these the ksocklnd paramaters

options ksocklnd sock_timeout=100 credits=2560 peer_credits=63

##

these other parameters that I did tweak

echo 32 > /sys/module/ptlrpc/parameters/max_ptlrpcds
echo 3 > /sys/module/ptlrpc/parameters/ptlrpcd_bind_policy

This parameter is marked as obsolete in the code.


Yes I should fix my configuration and use the new parameters



lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600

###

Also I run this script at boot time to redefine IRQ assignments for hard drives 
spanned across all CPUs, not needed for kernel > 4.4

#!/bin/sh
# numa_smp.sh
device=$1
cpu1=$2
cpu2=$3
cpu=$cpu1
grep $1 /proc/interrupts|awk '{print $1}'|sed 's/://'|while read int
do
  echo $cpu > /proc/irq/$int/smp_affinity_list
  echo "echo CPU $cpu > /proc/irq/$a/smp_affinity_list"
  if [ $cpu = $cpu2 ]
  then
 cpu=$cpu1
  else
 ((cpu=$cpu+1))
  fi
done

Re: [lustre-discuss] Lustre OSS kernel panic after mounting OSTs

2018-10-30 Thread Riccardo Veraldi
Thank you Fernando for the hint, I did it right now. I am 
running e2fsck again.

Anyway my problem was this:

https://jira.whamcloud.com/browse/LU-5040

thank you

On 10/30/18 5:28 AM, Fernando Perez wrote:

Dear Riccardo.

Have you tried to upgrade e2fsprogs packages before perform the e2fsck?

Regards.

=
Fernando Pérez
Institut de Ciències del Mar (CSIC)
Departament Oceanografía Física i Tecnològica
Passeig Marítim de la Barceloneta,37-49
08003 Barcelona
Phone:  (+34) 93 230 96 35
=

On 10/30/2018 01:05 PM, Riccardo Veraldi wrote:

Hello,

I have quite a critical problem.

One of my OSSes goes into a kernel panic when trying to mount the OSTs.

After mounting 11 of the 12 OSTs it goes into a kernel panic. It does not 
matter in which order they are mounted.


Any clues or hints ?

I cannot really recover it and I have important data on it.

I already performed an e2fsck. Anyway, it did not fix the problem; it found 
a few inode count inconsistencies before.


kernel is 2.6.32-431.23.3.el6_lustre.x86_64

Red Hat Enterprise Linux Server release 6.7 (Santiago)

lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64


Oct 30 04:58:52 psanaoss231 kernel: INFO: task tgt_recov:4569 blocked 
for more than 120 seconds.


Oct 30 04:58:52 psanaoss231 kernel:  Not tainted 
2.6.32-431.23.3.el6_lustre.x86_64 #1
Oct 30 04:58:52 psanaoss231 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 30 04:58:52 psanaoss231 kernel: tgt_recov D 
0003 0  4569  2 0x0080
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1da0 0046 
 0003
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1d30 81059096 
880bf2ae1d40 880bf2a1d500
Oct 30 04:58:52 psanaoss231 kernel: 880bf2b01ab8 880bf2ae1fd8 
fbc8 880bf2b01ab8

Oct 30 04:58:52 psanaoss231 kernel: Call Trace:
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
enqueue_task+0x66/0x80
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
check_for_clients+0x0/0x70 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_overseer+0x9d/0x230 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
exp_connect_healthy+0x0/0x20 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
autoremove_wake_function+0x0/0x40
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_thread+0x540/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
default_wake_function+0x12/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
kthread+0x96/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] 
child_rip+0xa/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
kthread+0x0/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
child_rip+0x0/0x20
Oct 30 04:59:02 psanaoss231 kernel: Lustre: ana13-OST0004: Recovery 
over after 3:05, of 147 clients 146 recovered and 1 was evicted.
Oct 30 04:59:03 psanaoss231 kernel: Lustre: ana13-OST0004: Client 
89ba817f-45c3-5e64-99a8-b472651bbe45 (at 172.21.52.213@o2ib) 
reconnecting
Oct 30 04:59:03 psanaoss231 kernel: Lustre: Skipped 94 previous 
similar messages
Oct 30 04:59:21 psanaoss231 kernel: LustreError: 
4569:0:(ost_handler.c:1123:ost_brw_write()) Dropping timed-out write 
from 12345-172.21.49.129@tcp because locking object 0x0:14198730 took 
153 seconds (limit was 30).
Oct 30 04:59:21 psanaoss231 kernel: Lustre: ana13-OST0005: Bulk IO 
write error with 3a71df2f-16e7-d507-2495-ab60364d8e7c (at 
172.21.49.129@tcp), client will retry: rc -110

Oct 30 04:59:52 psanaoss231 kernel: [ cut here ]
Oct 30 04:59:52 psanaoss231 kernel: kernel BUG at 
fs/jbd2/transaction.c:1033!

Oct 30 04:59:52 psanaoss231 kernel: invalid opcode:  [#1] SMP
Oct 30 04:59:52 psanaoss231 kernel: last sysfs file: 
/sys/devices/system/cpu/online

Oct 30 04:59:52 psanaoss231 kernel: CPU 10
Oct 30 04:59:52 psanaoss231 kernel: Modules linked in: osp(U) ofd(U) 
lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) 
ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) 
ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic 
sha256_generic crc32c_intel libcfs(U) nfs lockd fscache auth_rpcgss 
nfs_acl mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase 
autofs4 sunrpc ipt_REDIRECT iptable_nat nf_nat nf_conntrack_ipv4 
nf_conntrack nf_defrag_ipv4 ip_tables ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 microcode 
power_meter iTCO_wdt iTCO_vendor_support dcdbas ipmi_devintf sb_edac 
edac_core lpc_ich mfd_core shpchp igb i2c_algo_bit i2c_core ses 
enclosure sg ixgbe dca ptp pps_core mdio ext4 jbd2 mbcache raid1 
sd_mod crc_t10dif ahci wmi mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [l

Re: [lustre-discuss] Lustre OSS kernel panic after mounting OSTs

2018-10-30 Thread Riccardo Veraldi

I could mount the OSTs; the only way, though, was to mount with abort_recov,

thanks to this old ticket

https://jira.whamcloud.com/browse/LU-5040
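
For the record, the mount was along these lines (a sketch, the device and
mount point are placeholders for my real OSTs):

mount -t lustre -o abort_recov /dev/sdX /mnt/lustre/ost00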




On 10/30/18 5:05 AM, Riccardo Veraldi wrote:

Hello,

I have quite a critical problem.

One of my OSSes goes into a kernel panic when trying to mount the OSTs.

After mounting 11 of the 12 OSTs it goes into a kernel panic. It does not 
matter in which order they are mounted.


Any clues or hints ?

I cannot really recover it and I have important data on it.

I already performed an e2fsck. Anyway, it did not fix the problem; it found 
a few inode count inconsistencies before.


kernel is 2.6.32-431.23.3.el6_lustre.x86_64

Red Hat Enterprise Linux Server release 6.7 (Santiago)

lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64


Oct 30 04:58:52 psanaoss231 kernel: INFO: task tgt_recov:4569 blocked 
for more than 120 seconds.


Oct 30 04:58:52 psanaoss231 kernel:  Not tainted 
2.6.32-431.23.3.el6_lustre.x86_64 #1
Oct 30 04:58:52 psanaoss231 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 30 04:58:52 psanaoss231 kernel: tgt_recov D 
0003 0  4569  2 0x0080
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1da0 0046 
 0003
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1d30 81059096 
880bf2ae1d40 880bf2a1d500
Oct 30 04:58:52 psanaoss231 kernel: 880bf2b01ab8 880bf2ae1fd8 
fbc8 880bf2b01ab8

Oct 30 04:58:52 psanaoss231 kernel: Call Trace:
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
enqueue_task+0x66/0x80
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
check_for_clients+0x0/0x70 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_overseer+0x9d/0x230 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
exp_connect_healthy+0x0/0x20 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
autoremove_wake_function+0x0/0x40
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_thread+0x540/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
default_wake_function+0x12/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
kthread+0x96/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] 
child_rip+0xa/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
kthread+0x0/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
child_rip+0x0/0x20
Oct 30 04:59:02 psanaoss231 kernel: Lustre: ana13-OST0004: Recovery 
over after 3:05, of 147 clients 146 recovered and 1 was evicted.
Oct 30 04:59:03 psanaoss231 kernel: Lustre: ana13-OST0004: Client 
89ba817f-45c3-5e64-99a8-b472651bbe45 (at 172.21.52.213@o2ib) reconnecting
Oct 30 04:59:03 psanaoss231 kernel: Lustre: Skipped 94 previous 
similar messages
Oct 30 04:59:21 psanaoss231 kernel: LustreError: 
4569:0:(ost_handler.c:1123:ost_brw_write()) Dropping timed-out write 
from 12345-172.21.49.129@tcp because locking object 0x0:14198730 took 
153 seconds (limit was 30).
Oct 30 04:59:21 psanaoss231 kernel: Lustre: ana13-OST0005: Bulk IO 
write error with 3a71df2f-16e7-d507-2495-ab60364d8e7c (at 
172.21.49.129@tcp), client will retry: rc -110

Oct 30 04:59:52 psanaoss231 kernel: [ cut here ]
Oct 30 04:59:52 psanaoss231 kernel: kernel BUG at 
fs/jbd2/transaction.c:1033!

Oct 30 04:59:52 psanaoss231 kernel: invalid opcode:  [#1] SMP
Oct 30 04:59:52 psanaoss231 kernel: last sysfs file: 
/sys/devices/system/cpu/online

Oct 30 04:59:52 psanaoss231 kernel: CPU 10
Oct 30 04:59:52 psanaoss231 kernel: Modules linked in: osp(U) ofd(U) 
lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) 
ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) 
ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic 
sha256_generic crc32c_intel libcfs(U) nfs lockd fscache auth_rpcgss 
nfs_acl mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase 
autofs4 sunrpc ipt_REDIRECT iptable_nat nf_nat nf_conntrack_ipv4 
nf_conntrack nf_defrag_ipv4 ip_tables ib_ipoib rdma_ucm ib_ucm 
ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 microcode 
power_meter iTCO_wdt iTCO_vendor_support dcdbas ipmi_devintf sb_edac 
edac_core lpc_ich mfd_core shpchp igb i2c_algo_bit i2c_core ses 
enclosure sg ixgbe dca ptp pps_core mdio ext4 jbd2 mbcache raid1 
sd_mod crc_t10dif ahci wmi mlx4_ib ib_sa ib_mad ib_core mlx4_en 
mlx4_core megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last 
unloaded: speedstep_lib]

Oct 30 04:59:52 psanaoss231 kernel:
Oct 30 04:59:52 psanaoss231 kernel: Pid: 4272, comm: ll_ost01_007 Not 
tainted 2.6.32-431.23.3.el6_lustre.x86_64 #1 Dell Inc. PowerEdge 
R620/0PXXHP
Oct 30 04:59:52 psanaoss231 kernel: RIP: 0010:[]  
[] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
Oct 30 04:59:52 psanaoss231 kernel: RSP: 0018:880c058437d0 EFLAGS: 
00010246
Oct 30 04:59:52 psanaoss231 kernel: RAX: fff

[lustre-discuss] Lustre OSS kernel panic after mounting OSTs

2018-10-30 Thread Riccardo Veraldi

Hello,

I have quite a critical problem.

One of my OSSes goes into a kernel panic when trying to mount the OSTs.

After mounting 11 of the 12 OSTs it goes into a kernel panic. It does not 
matter in which order they are mounted.


Any clues or hints ?

I cannot really recover it and I have important data on it.

I already performed an e2fsck. Anyway, it did not fix the problem; it found 
a few inode count inconsistencies before.


kernel is 2.6.32-431.23.3.el6_lustre.x86_64

Red Hat Enterprise Linux Server release 6.7 (Santiago)

lustre-2.5.3-2.6.32_431.23.3.el6_lustre.x86_64.x86_64


Oct 30 04:58:52 psanaoss231 kernel: INFO: task tgt_recov:4569 blocked 
for more than 120 seconds.


Oct 30 04:58:52 psanaoss231 kernel:  Not tainted 
2.6.32-431.23.3.el6_lustre.x86_64 #1
Oct 30 04:58:52 psanaoss231 kernel: "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 30 04:58:52 psanaoss231 kernel: tgt_recov D 0003 
0  4569  2 0x0080
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1da0 0046 
 0003
Oct 30 04:58:52 psanaoss231 kernel: 880bf2ae1d30 81059096 
880bf2ae1d40 880bf2a1d500
Oct 30 04:58:52 psanaoss231 kernel: 880bf2b01ab8 880bf2ae1fd8 
fbc8 880bf2b01ab8

Oct 30 04:58:52 psanaoss231 kernel: Call Trace:
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
enqueue_task+0x66/0x80
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
check_for_clients+0x0/0x70 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_overseer+0x9d/0x230 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
exp_connect_healthy+0x0/0x20 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
autoremove_wake_function+0x0/0x40
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] 
target_recovery_thread+0x540/0x1920 [ptlrpc]
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
default_wake_function+0x12/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
target_recovery_thread+0x0/0x1920 [ptlrpc]

Oct 30 04:58:52 psanaoss231 kernel: [] kthread+0x96/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] child_rip+0xa/0x20
Oct 30 04:58:52 psanaoss231 kernel: [] ? kthread+0x0/0xa0
Oct 30 04:58:52 psanaoss231 kernel: [] ? 
child_rip+0x0/0x20
Oct 30 04:59:02 psanaoss231 kernel: Lustre: ana13-OST0004: Recovery over 
after 3:05, of 147 clients 146 recovered and 1 was evicted.
Oct 30 04:59:03 psanaoss231 kernel: Lustre: ana13-OST0004: Client 
89ba817f-45c3-5e64-99a8-b472651bbe45 (at 172.21.52.213@o2ib) reconnecting
Oct 30 04:59:03 psanaoss231 kernel: Lustre: Skipped 94 previous similar 
messages
Oct 30 04:59:21 psanaoss231 kernel: LustreError: 
4569:0:(ost_handler.c:1123:ost_brw_write()) Dropping timed-out write 
from 12345-172.21.49.129@tcp because locking object 0x0:14198730 took 
153 seconds (limit was 30).
Oct 30 04:59:21 psanaoss231 kernel: Lustre: ana13-OST0005: Bulk IO write 
error with 3a71df2f-16e7-d507-2495-ab60364d8e7c (at 172.21.49.129@tcp), 
client will retry: rc -110

Oct 30 04:59:52 psanaoss231 kernel: [ cut here ]
Oct 30 04:59:52 psanaoss231 kernel: kernel BUG at 
fs/jbd2/transaction.c:1033!

Oct 30 04:59:52 psanaoss231 kernel: invalid opcode:  [#1] SMP
Oct 30 04:59:52 psanaoss231 kernel: last sysfs file: 
/sys/devices/system/cpu/online

Oct 30 04:59:52 psanaoss231 kernel: CPU 10
Oct 30 04:59:52 psanaoss231 kernel: Modules linked in: osp(U) ofd(U) 
lfsck(U) ost(U) mgc(U) fsfilt_ldiskfs(U) osd_ldiskfs(U) lquota(U) 
ldiskfs(U) lustre(U) lov(U) osc(U) mdc(U) fid(U) fld(U) ksocklnd(U) 
ko2iblnd(U) ptlrpc(U) obdclass(U) lnet(U) lvfs(U) sha512_generic 
sha256_generic crc32c_intel libcfs(U) nfs lockd fscache auth_rpcgss 
nfs_acl mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase 
autofs4 sunrpc ipt_REDIRECT iptable_nat nf_nat nf_conntrack_ipv4 
nf_conntrack nf_defrag_ipv4 ip_tables ib_ipoib rdma_ucm ib_ucm ib_uverbs 
ib_umad rdma_cm ib_cm iw_cm ib_addr ipv6 microcode power_meter iTCO_wdt 
iTCO_vendor_support dcdbas ipmi_devintf sb_edac edac_core lpc_ich 
mfd_core shpchp igb i2c_algo_bit i2c_core ses enclosure sg ixgbe dca ptp 
pps_core mdio ext4 jbd2 mbcache raid1 sd_mod crc_t10dif ahci wmi mlx4_ib 
ib_sa ib_mad ib_core mlx4_en mlx4_core megaraid_sas dm_mirror 
dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]

Oct 30 04:59:52 psanaoss231 kernel:
Oct 30 04:59:52 psanaoss231 kernel: Pid: 4272, comm: ll_ost01_007 Not 
tainted 2.6.32-431.23.3.el6_lustre.x86_64 #1 Dell Inc. PowerEdge R620/0PXXHP
Oct 30 04:59:52 psanaoss231 kernel: RIP: 0010:[]  
[] jbd2_journal_dirty_metadata+0x10d/0x150 [jbd2]
Oct 30 04:59:52 psanaoss231 kernel: RSP: 0018:880c058437d0 EFLAGS: 
00010246
Oct 30 04:59:52 psanaoss231 kernel: RAX: 880c05573dc0 RBX: 
880c043b8d08 RCX: 88175b0fedc8
Oct 30 04:59:52 psanaoss231 kernel: RDX:  RSI: 
88175b0fedc8 RDI: 
Oct 30 04:59:52 psanaoss231 kernel: RBP: 

Re: [lustre-discuss] migrating MDS to different infrastructure

2018-10-28 Thread Riccardo Veraldi

It is time for me to move my MDS to a different HW infrastructure.
So I was wondering if the following procedure can work.
I have mds1 (old mds) and mds2 (new mds). On the old mds I have a zfs 
MGS partition and a zfs MDT partition.


 * create a new ZFS  MGS and MDT partition on mds2 and create a lustre
   FS on it
 * umount lustre on mds2
 * stop all the OSSes belonging to mds1 and stop and umount lustre on mds1
 * zfs send the MGS partition from mds1 and zfs receive it on mds2
 * zfs send the MDT partition from mds1 and zfs receive it on mds2
 * mount lustre on mds2

should it work ?
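
For the two send/receive steps I would do something like this (a sketch with
placeholder pool/dataset names, assuming ssh access from mds1 to mds2 and
lustre stopped on both sides):

zfs snapshot mgspool/mgs@migrate
zfs snapshot mdtpool/mdt0@migrate
zfs send -R mgspool/mgs@migrate  | ssh mds2 zfs receive -F newpool/mgs
zfs send -R mdtpool/mdt0@migrate | ssh mds2 zfs receive -F newpool/mdt0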

thanks

Rick


On 8/23/18 2:40 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:

On Aug 22, 2018, at 8:10 PM, Riccardo Veraldi  
wrote:

On 8/22/18 3:13 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:

On Aug 22, 2018, at 3:31 PM, Riccardo Veraldi  
wrote:
I would like to migrate this virtual machine to another infrastructure. it is 
not simple because the other infrastructure is vmware.
what is the best way to migrate those partitions without incurring into any 
corruption of data ?
May I simply use zfs send and zfs receive thru SSH ?
what is the best way to move a MDS based virtual machine ?

I don’t have much experience with VMs, but I have used zfs send/receive to 
migrate a MDT from one server to another.  It worked quite well.

that's encouraging. it should work for VM too then regardless.
I have many MDSes as virtual machines. I found it to be a good way for high 
availability, they perform well enough.

so what  you do is to shut down lustre on OSSes and MDS. At this point you 
simply zfs send and zfs receive on the new MDS ?
once the operation is terminated the new MDS is just ready to be used ?

It is possible to take snapshots on the current MDT and do incremental 
send/receives to the new MDT while the system is still up and running.  When 
you are ready to switch over, then you shutdown the file system, do a final 
send/receive, and then start the new MDS.  The new MDS will need to have the 
same configuration (NID, etc.) as the old MDS (unless you want to perform a 
writeconf).

There was a talk at LUG in 2017 from someone who did something similar with 
OSTs:

http://cdn.opensfs.org/wp-content/uploads/2017/06/Wed06-CroweTom-lug17-ost_data_migration_using_ZFS.pdf

That should give you a good idea of how to proceed.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre 2.10.5 or 2.11.0

2018-10-19 Thread Riccardo Veraldi

On 10/19/18 12:37 PM, Mohr Jr, Richard Frank (Rick Mohr) wrote:

On Oct 17, 2018, at 7:30 PM, Riccardo Veraldi  
wrote:

anyway especially regarding the OSSes you may eventually need some ZFS module parameters 
optimizations regarding vdev_write and vdev_read max to increase those values higher than 
default. You may also disable ZIL, change the redundant_metadata to "most"  
atime off.

I could send you a list of parameters that in my case work well.

Riccardo,

Would you mind sharing your ZFS parameters with the mailing list?  I would be 
interested to see which options you have changed.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


this worked for me on my high performance cluster

options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32

##

these the zfs attributes that I changed on the OSSes:

zfs set mountpoint=none $ostpool

zfs set sync=disabled $ostpool

zfs set atime=off $ostpool

zfs set redundant_metadata=most $ostpool

zfs set xattr=sa $ostpool

zfs set recordsize=1M $ostpool

#


these the ko2iblnd parameters for FDR Mellanox IB interfaces

options ko2iblnd timeout=100 peer_credits=63 credits=2560 
concurrent_sends=63 ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024 
ntx=5120




these the ksocklnd paramaters

options ksocklnd sock_timeout=100 credits=2560 peer_credits=63

##

these other parameters that I did tweak

echo 32 > /sys/module/ptlrpc/parameters/max_ptlrpcds
echo 3 > /sys/module/ptlrpc/parameters/ptlrpcd_bind_policy

lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600

###

Also I run this script at boot time to redefine IRQ assignments for hard 
drives spanned across all CPUs, not needed for kernel > 4.4


#!/bin/sh
# numa_smp.sh
device=$1
cpu1=$2
cpu2=$3
cpu=$cpu1
grep $1 /proc/interrupts|awk '{print $1}'|sed 's/://'|while read int
do
  echo $cpu > /proc/irq/$int/smp_affinity_list
  echo "echo CPU $cpu > /proc/irq/$a/smp_affinity_list"
  if [ $cpu = $cpu2 ]
  then
 cpu=$cpu1
  else
 ((cpu=$cpu+1))
  fi
done

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre 2.10.5 or 2.11.0

2018-10-17 Thread Riccardo Veraldi

On 10/17/18 1:20 PM, Kurt Strosahl wrote:


Good Afternoon,


I believe 2.10.* is the long term support branch.

I am happy with 2.10.5 on my standard performance cluster, but for a 
very high performance cluster I built 5 months ago, where 6GB/s per OSS


were required in read and write transfers, I had some problems with file 
locking using Lustre 2.10.3 when having many writers.


This issue did not show up with Lustre 2.11.0, so I went for it.



    I'm in the early planning stages of a lustre upgrade.  We are 
going to be moving from 2.5 to either 2.10 or 2.11, possibly by 
standing up a new lustre file system alongside the existing one and 
migrating the data over.  I'm wondering if anyone has had specific 
experiences (either positive or negative) with either of these versions.



I was also looking for, but couldn't seem to find, a mapping of zfs 
versions to lustre versions.  Am I correct in my assumption that zfs 
0.7.9 will work with either version?



Finally, does anyone have experience with zfs as the backing for the 
MDS/MDT?


I have several MDS/MDT on ZFS and they are also virtual machines and I 
never had issues with them.



Anyway, especially regarding the OSSes, you may need some ZFS module 
parameter optimizations for the vdev_write and vdev_read max values, to 
increase them above the defaults. You may also disable the ZIL, change 
redundant_metadata to "most", and set atime off.



I could send you a list of parameters that in my case work well.




w/r,

Kurt J. Strosahl
System Administrator: HPC, Lustre
Scientific Computing Group, Thomas Jefferson National Accelerator Facility


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] slow write performance from client to server

2018-10-16 Thread Riccardo Veraldi

On 10/16/18 5:54 AM, Peter Jones wrote:

Have you tried running 2.10.5 on a RHEL6 client?


hello, sorry I did not do that but I think it is my next option.



On 2018-10-16, 2:13 AM, "lustre-discuss on behalf of Riccardo Veraldi" 
 
wrote:

 On 10/16/18 2:08 AM, George Melikov wrote:
 > Please post your benchmark method.
 
 my benchmark method is by using bbcp.
 
 lustre clients 2.10.5 on RHEL7  are fast (1.1GB/s) while RHEL6 clients

 using 2.8.0 perform very poorly (2MB/s)
 
 
 >

 > 
 > Sincerely,
 > George Melikov,
 > Tel. 7-915-278-39-36
 > Skype: georgemelikov
 >
 >
 > 16.10.2018, 12:03, "Riccardo Veraldi" :
 >> On 10/15/18 4:59 PM, Alexander I Kulyavtsev wrote:
 >>>   You can do a quick check with 2.10.5 client by mounting lustre on 
MDS if you do not have free node to install 2.10.5 client.
 >>>
 >>>   Do you have lnet configured with IB or 10GE? LNet defaults to tcp if 
not set. Can it be you are connected through slow management network?
 >>>
 >>>   Alex.
 >> hi, my lnet is configured with both IB and 10GE. it is using IB I
 >> verified it and anyway performance is very slow even if it where just
 >> using tcp on 10GE
 >>
 >> since I only get 2MB/s
 >>
 >> thanks
 >>
 >>>   On 10/15/18, 6:41 PM, "lustre-discuss on behalf of Riccardo Veraldi" 
 wrote:
 >>>
 >>>Hello,
 >>>
 >>>I have a new Lustre FS version 2.10.5. 18 OSTs 18TB each on 3 
OSSes.
 >>>
 >>>I noticed very slow performances couple of MB/sec when RHEL6 
Lustre
 >>>clients 2.8.0 are writing to the filesystem.
 >>>
 >>>Could it be a Lustre version problem server vs client ?
 >>>
 >>>I have no errors either on server or client side that can debug 
it
 >>>further...
 >>>
 >>>thanks
 >>>
 >>>Rick
 >>>
 >>>___
 >>>lustre-discuss mailing list
 >>>lustre-discuss@lists.lustre.org
 >>>
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
 >> ___
 >> lustre-discuss mailing list
 >> lustre-discuss@lists.lustre.org
 >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
 
 
 ___

 lustre-discuss mailing list
 lustre-discuss@lists.lustre.org
 http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
 



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] slow write performance from client to server

2018-10-16 Thread Riccardo Veraldi

On 10/16/18 2:08 AM, George Melikov wrote:

Please post your benchmark method.


my benchmark method is by using bbcp.

lustre clients 2.10.5 on RHEL7  are fast (1.1GB/s) while RHEL6 clients 
using 2.8.0 perform very poorly (2MB/s)
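
A cross-check independent of bbcp could be a plain sequential write from
each client (a sketch, the target path is a placeholder):

dd if=/dev/zero of=/lustre/ana15/ddtest bs=1M count=4096 oflag=direct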






Sincerely,
George Melikov,
Tel. 7-915-278-39-36
Skype: georgemelikov


16.10.2018, 12:03, "Riccardo Veraldi" :

On 10/15/18 4:59 PM, Alexander I Kulyavtsev wrote:

  You can do a quick check with 2.10.5 client by mounting lustre on MDS if you 
do not have free node to install 2.10.5 client.

  Do you have lnet configured with IB or 10GE? LNet defaults to tcp if not set. 
Can it be you are connected through slow management network?

  Alex.

hi, my lnet is configured with both IB and 10GE. it is using IB I
verified it and anyway performance is very slow even if it where just
using tcp on 10GE

since I only get 2MB/s

thanks


  On 10/15/18, 6:41 PM, "lustre-discuss on behalf of Riccardo Veraldi" 
 
wrote:

   Hello,

   I have a new Lustre FS version 2.10.5. 18 OSTs 18TB each on 3 OSSes.

   I noticed very slow performances couple of MB/sec when RHEL6 Lustre
   clients 2.8.0 are writing to the filesystem.

   Could it be a Lustre version problem server vs client ?

   I have no errors either on server or client side that can debug it
   further...

   thanks

   Rick

   ___
   lustre-discuss mailing list
   lustre-discuss@lists.lustre.org
   
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] slow write performance from client to server

2018-10-16 Thread Riccardo Veraldi

On 10/15/18 4:59 PM, Alexander I Kulyavtsev wrote:

You can do a quick check with 2.10.5 client by mounting lustre on MDS if you do 
not have free node to install 2.10.5 client.

Do you have lnet configured with IB or 10GE? LNet defaults to tcp if not set. 
Can it be you are connected through slow management network?

Alex.


hi, my lnet is configured with both IB and 10GE. It is using IB, I 
verified it, and anyway performance is very slow even if it were just 
using tcp on 10GE,


since I only get 2MB/s

thanks





On 10/15/18, 6:41 PM, "lustre-discuss on behalf of Riccardo Veraldi" 
 
wrote:

 Hello,
 
 I have a new Lustre FS version 2.10.5. 18 OSTs 18TB each on 3 OSSes.
 
 I noticed very slow performances couple of MB/sec when RHEL6 Lustre

 clients  2.8.0 are writing to the filesystem.
 
 Could it be a Lustre version problem server vs client ?
 
 I have no errors either on server or client side  that can debug it

 further...
 
 thanks
 
 Rick
 
 
 ___

 lustre-discuss mailing list
 lustre-discuss@lists.lustre.org
 
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
 



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Updating kernel will require recompilation of lustre kernel modules?

2018-10-02 Thread Riccardo Veraldi

I suppose you want to say 2.10.*

If you use the lustre-client-dkms rpm (or build it from sources), the only 
thing you need to do on your clients is to remove the currently installed 
lustre-client-dkms rpm package and reinstall it after you upgrade the 
kernel. In this way the lustre modules will be automatically rebuilt for 
your new kernel (you must have the proper kernel headers though).
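
A rough sketch of the sequence on a RHEL/CentOS client (package names are
the usual ones, adjust to your repo layout):

yum remove lustre-client-dkms
yum update kernel kernel-devel kernel-headers
reboot
yum install lustre-client-dkms lustre-client   # dkms rebuilds the modules for the new kernel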



On 9/29/18 2:49 AM, Amjad Syed wrote:

Hello
We have an HPC running RHEL 7.4. We are using lustre 2.0
Red Hat last week released an advisory to update the kernel to fix the 
Mutagen Astronomy bug.


Now the question is: if we upgrade the kernel on the MDS/OSS and the Linux 
clients, do we need to recompile lustre against the updated kernel version ?


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre-2.10.5 problem

2018-09-24 Thread Riccardo Veraldi

as for me Lustre 2.10.5 is not building on ZFS 0.7.10
of course it builds fine with ZFS 0.7.9

CC:    gcc
LD:    /usr/bin/ld -m elf_x86_64
CPPFLAGS:  -include /root/rpmbuild/BUILD/lustre-2.10.5/undef.h 
-include /root/rpmbuild/BUILD/lustre-2.10.5/config.h 
-I/root/rpmbuild/BUILD/lustre-2.10.5/libcfs/include 
-I/root/rpmbuild/BUILD/lustre-2.10.5/lnet/include 
-I/root/rpmbuild/BUILD/lustre-2.10.5/lustre/include 
-I/root/rpmbuild/BUILD/lustre-2.10.5/lustre/include/uapi

CFLAGS:    -g -O2 -Werror -Wall -Werror
EXTRA_KCFLAGS: -include /root/rpmbuild/BUILD/lustre-2.10.5/undef.h 
-include /root/rpmbuild/BUILD/lustre-2.10.5/config.h  -g 
-I/root/rpmbuild/BUILD/lustre-2.10.5/libcfs/include 
-I/root/rpmbuild/BUILD/lustre-2.10.5/lnet/include 
-I/root/rpmbuild/BUILD/lustre-2.10.5/lustre/include


Type 'make' to build Lustre.
+ make -j2 -s
Making all in .
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c: In 
function '__osd_attr_init':
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:1292:2: 
error: unknown type name 'timestruc_t'

  timestruc_t  now;
  ^
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:1302:2: 
error: passing argument 1 of 'gethrestime' from incompatible pointer 
type [-Werror]

  gethrestime();
  ^
In file included from 
/var/lib/dkms/spl/0.7.10/source/include/sys/condvar.h:34:0,
 from 
/var/lib/dkms/spl/0.7.10/source/include/sys/t_lock.h:31,
 from 
/var/lib/dkms/zfs/0.7.10/source/include/sys/zfs_context.h:35,

 from /var/lib/dkms/zfs/0.7.10/source/include/sys/arc.h:30,
 from 
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_internal.h:49,
 from 
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:50:
/var/lib/dkms/spl/0.7.10/source/include/sys/time.h:70:1: note: expected 
'struct inode_timespec_t *' but argument is of type 'int *'

 gethrestime(inode_timespec_t *ts)
 ^
In file included from 
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_internal.h:51:0,
 from 
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:50:
/var/lib/dkms/zfs/0.7.10/source/include/sys/zfs_znode.h:278:28: error: 
request for member 'tv_sec' in something not a structure or union

  (stmp)[0] = (uint64_t)(tp)->tv_sec; \
    ^
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:1303:2: 
note: in expansion of macro 'ZFS_TIME_ENCODE'

  ZFS_TIME_ENCODE(, crtime);
  ^
/var/lib/dkms/zfs/0.7.10/source/include/sys/zfs_znode.h:279:28: error: 
request for member 'tv_nsec' in something not a structure or union

  (stmp)[1] = (uint64_t)(tp)->tv_nsec; \
    ^
/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.c:1303:2: 
note: in expansion of macro 'ZFS_TIME_ENCODE'

  ZFS_TIME_ENCODE(, crtime);
  ^
cc1: all warnings being treated as errors
make[6]: *** 
[/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs/osd_object.o] Error 1

make[5]: *** [/root/rpmbuild/BUILD/lustre-2.10.5/lustre/osd-zfs] Error 2
make[5]: *** Waiting for unfinished jobs
make[4]: *** [/root/rpmbuild/BUILD/lustre-2.10.5/lustre] Error 2
make[3]: *** [_module_/root/rpmbuild/BUILD/lustre-2.10.5] Error 2
make[2]: *** [modules] Error 2
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2
error: Bad exit status from /var/tmp/rpm-tmp.IeEkol (%build)



On 9/24/18 9:14 AM, Tung-Han Hsieh wrote:

Dear Nathaniel,

Thank you very much for your kind reply. Indeed I modified the
lustre-2.10.5 codes:

 lustre/osd-zfs/osd_object.c
 lustre/osd-zfs/osd_xattr.c

for the declaration:

 inode_timespec_t now;

Similar to what you have done in your patch. So I can compile
lustre-2.10.5 cleanly with zfs-0.7.11. Sorry I forgot to mention.
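
(A minimal sketch of that change in lustre/osd-zfs/osd_object.c, and similarly
osd_xattr.c, assuming only the declaration flagged in the build error above
needs to switch type; the complete, reviewed fix is the LU-11393 patch
mentioned below:

-	timestruc_t		 now;
+	inode_timespec_t	 now;
)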

But my problem is still there. Actually I just tried:

1. Applying your patch to the original lustre-2.10.5 code, and
recompile with spl-0.7.11 and zfs-0.7.11. But loading "lustre"
module still gives "no such device" error.

2. I recompile the original lustre-2.10.5 with spl-0.7.9 and
zfs-0.7.9. They can be compiled cleanly. But again I got the
"no such device" error when loading "lustre" module.

I am wondering whether I overlooked a trivial step, something
like one (or some) of the utilities in /opt/lustre/sbin/* needing
to be linked into /sbin/ or /usr/sbin/

Any suggestions are very appreciated.

Thank you very much.


T.H.Hsieh


On Mon, Sep 24, 2018 at 01:21:19PM +, Nathaniel Clark wrote:

Hello Tung-Han,

ZFS 0.7.11 doesn’t compile cleanly with Lustre, yet.

There’s a ticket for adding ZFS 0.7.11 support to lustre:
https://jira.whamcloud.com/browse/LU-11393

It has patches for master (pre-2.12) and a separate patch for 2.10.

—
Nathaniel Clark <ncl...@whamcloud.com>
Senior Engineer
Whamcloud / DDN

On Sep 24, 2018, at 2:15 PM, Tung-Han Hsieh <thhs...@twcp1.phys.ntu.edu.tw> wrote:

Dear All,

I am trying to install Lustre version 

[lustre-discuss] Lustre on RHEL 7 with Kernel 4

2018-09-21 Thread Riccardo Veraldi

Hello,

I am running Lustre 2.10.5 on RHEL 7.5 using the kernel 4.4.157 from elrepo.

Everything seems to be working fine. I am asking whether anyone else is running kernel 4 
on CentOS with Lustre, and whether this configuration is unsupported 
or not recommended for some reason.


I had a hard time with the latest RHEL 7.5 kernel because InfiniBand is 
not working right


https://bugs.centos.org/view.php?id=15193

So just asking if anyone deployed the Lustre server on Kernel 4 and is 
happy with it or if I should not do that.


thanks

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] help ldev.conf usage clarification

2018-09-12 Thread Riccardo Veraldi

Hello,

I wanted to ask some clarification on ldev.conf usage and features.

I am using ldev.conf only on my ZFS lustre OSSes and MDS.

Anyway I have a doubt about what should go in that file.

I have seen people keep only the metadata configuration in it, like for 
example:


mds01 - mgs zfs:lustre01-mgs/mgs
mds01 - mdt0 zfs:lustre01-mdt0/mdt0

and people filling the file with both the mgs settings and a listing of all 
the OSSes/OSTs, then spreading the same ldev.conf file over all the OSSes, 
like in this example with


3 OSSes where each one has one OST:


mds01 - mgs zfs:lustre01-mgs/mgs
mds01 - mdt0 zfs:lustre01-mdt0/mdt0
#
drp-tst-ffb01 - OST01 zfs:lustre01-ost01/ost01
drp-tst-ffb02 - OST02 zfs:lustre01-ost02/ost02
drp-tst-ffb03 - OST03 zfs:lustre01-ost03/ost03

is this correct, or should only the metadata information stay in ldev.conf ?

Also, can ldev.conf be used with an ldiskfs-based cluster ? On ldiskfs-based 
clusters I usually mount the metadata partition and OSS partitions in 
fstab and my ldev.conf is empty.
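
(A hedged sketch only: if the device column of ldev.conf accepts a plain block
device path in place of a zfs: dataset, as the ldev format suggests, an
ldiskfs entry might look like the line below — label and device path are
hypothetical:

oss01 - lustre01-OST0001 /dev/mapper/ost0001
)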


thanks

Rick



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre error when trying to mount

2018-09-11 Thread Riccardo Veraldi

Here is the reason: it's a CentOS 7.5 kernel bug

https://bugs.centos.org/view.php?id=15193

On 9/10/18 11:05 PM, Riccardo Veraldi wrote:


hello,

I installed a new Lustre system where MDS and OSSes are version 2.10.5

the lustre clients are running 2.10.1 and 2.9.0

when I try to mount the filesystem it fails with these errors:

OSS:

Sep 10 22:39:46 psananehoss01 kernel: LNetError: 
10055:0:(o2iblnd_cb.c:2513:kiblnd_passive_connect()) Can't accept 
172.21.52.33@o2ib2: -22
Sep 10 22:39:46 psananehoss01 kernel: LNet: 
10055:0:(o2iblnd_cb.c:2212:kiblnd_reject()) Error -22 sending reject


Client:

Sep 10 22:41:26 psana101 kernel: LNetError: 
336:0:(o2iblnd_cb.c:2726:kiblnd_rejected()) 172.21.52.90@o2ib2 
rejected: consumer defined fatal error



I am afraid this is the consequence of a mixed configuration.

on the client side Lustre is configured in /etc/modprobe.d/lustre.conf

options lnet networks=o2ib2(ib0),tcp0(enp6s0),tcp1(enp6s0),tcp2(enp6s0)

on the OSS side I am using lnet.conf

ip2nets:
 - net-spec: o2ib2
   interfaces:
  0: ib0
 - net-spec: tcp2
   interfaces:
  0: enp8s0f0

I supposed that peers could be discovered automatically and added 
automatically to lnet


Should I revert back to static lustre.conf  on the OSS side too ?

I have several lustre clients; I cannot add all of them to a peers 
section inside lnet.conf on the OSS side


any hints are very welcomed.

thank you


Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] lustre error when trying to mount

2018-09-11 Thread Riccardo Veraldi


hello,

I installed a new Lustre system where MDS and OSSes are version 2.10.5

the lustre clients are running 2.10.1 and 2.9.0

when I try to mount the filesystem it fails with these errors:

OSS:

Sep 10 22:39:46 psananehoss01 kernel: LNetError: 
10055:0:(o2iblnd_cb.c:2513:kiblnd_passive_connect()) Can't accept 
172.21.52.33@o2ib2: -22
Sep 10 22:39:46 psananehoss01 kernel: LNet: 
10055:0:(o2iblnd_cb.c:2212:kiblnd_reject()) Error -22 sending reject


Client:

Sep 10 22:41:26 psana101 kernel: LNetError: 
336:0:(o2iblnd_cb.c:2726:kiblnd_rejected()) 172.21.52.90@o2ib2 rejected: 
consumer defined fatal error



I am afraid this is the consequence of a mixed configuration.

on the client side Lustre is configured in /etc/modprobe.d/lustre.conf

options lnet networks=o2ib2(ib0),tcp0(enp6s0),tcp1(enp6s0),tcp2(enp6s0)

on the OSS side I am using lnet.conf

ip2nets:
 - net-spec: o2ib2
   interfaces:
  0: ib0
 - net-spec: tcp2
   interfaces:
  0: enp8s0f0

I supposed that peers could be discovered automatically and added 
automatically to lnet


Should I revert back to static lustre.conf  on the OSS side too ?

I have several lustre clients; I cannot add all of them to a peers 
section inside lnet.conf on the OSS side


any hints are very welcomed.

thank you


Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] problem with resource-agents rpm

2018-08-30 Thread Riccardo Veraldi

Lustre 2.10.5

it seems that lustre-resource-agents has a dependency problem

yum localinstall -y lustre-resource-agents-2.10.5-1.el7.x86_64.rpm
Loaded plugins: langpacks
Examining lustre-resource-agents-2.10.5-1.el7.x86_64.rpm: 
lustre-resource-agents-2.10.5-1.el7.x86_64

Marking lustre-resource-agents-2.10.5-1.el7.x86_64.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package lustre-resource-agents.x86_64 0:2.10.5-1.el7 will be installed
--> Processing Dependency: resource-agents for package: 
lustre-resource-agents-2.10.5-1.el7.x86_64

--> Finished Dependency Resolution
Error: Package: lustre-resource-agents-2.10.5-1.el7.x86_64 
(/lustre-resource-agents-2.10.5-1.el7.x86_64)

   Requires: resource-agents
 You could try using --skip-broken to work around the problem

Anyone had this issue ?
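
(A hedged note: the missing dependency appears to be the standard EL7
resource-agents package, so installing it first from the distribution
repositories — base on CentOS, the HA add-on on RHEL — would presumably
satisfy yum. Sketch only:

yum install resource-agents
yum localinstall -y lustre-resource-agents-2.10.5-1.el7.x86_64.rpm
)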

thanks

Rick



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] migrating MDS to different infrastructure

2018-08-22 Thread Riccardo Veraldi

Hello,

I have a virtual machine running on oVirt which is an MDS. It has an mgs 
and an mdt partition.


ffb11-mgs/mgs   100283136 4096 
100276992   1% /lustre/local/mgs
ffb11-mdt0/mdt0 100269952   558976 
99708928   1% /lustre/local/mdt0



NAME  USED  AVAIL  REFER  MOUNTPOINT
ffb11-mdt0    559M  95.8G    19K  /ffb11-mdt0
ffb11-mdt0/mdt0   546M  95.8G   546M  /ffb11-mdt0/mdt0
ffb11-mgs    4.13M  96.4G    19K  /ffb11-mgs
ffb11-mgs/mgs    3.99M  96.4G  3.99M  /ffb11-mgs/mgs


I would like to migrate this virtual machine to another infrastructure. 
It is not simple because the other infrastructure is VMware.
What is the best way to migrate those partitions without incurring 
any data corruption ?

May I simply use zfs send and zfs receive through SSH ?
What is the best way to move an MDS-based virtual machine ?
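
(A hedged sketch of what the zfs send/receive path might look like, using the
pool names from the listing above; "newmds" is a hypothetical target host, and
the Lustre targets should be unmounted before taking the final snapshots:

zfs snapshot -r ffb11-mgs@migrate
zfs snapshot -r ffb11-mdt0@migrate
zfs send -R ffb11-mgs@migrate  | ssh newmds zfs receive -F ffb11-mgs
zfs send -R ffb11-mdt0@migrate | ssh newmds zfs receive -F ffb11-mdt0
)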

thank you

Rick




___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre 2.9.0 server won't start anymore

2018-07-31 Thread Riccardo Veraldi

This got fixed; the problem was a corrupt ldev.conf file.


On 7/31/18 12:41 AM, Riccardo Veraldi wrote:

Hello,
my lustre server 2.9.0 suddenly won't mount its lustre partitions anymore 
after a power outage.
The ZFS pool is active (resilvering one disk). Anyway, systemctl start 
lustre is not working.
I do not see any error message; it just does not mount my Lustre OSS 
partitions.


NAME    USED  AVAIL  REFER  MOUNTPOINT
ana02-ost21    17.7T  2.25T   205K  none
ana02-ost21/ost21  17.7T  2.25T  17.7T  none
ana02-ost22    17.9T  2.09T   205K  none
ana02-ost22/ost22  17.9T  2.09T  17.9T  none
ana02-ost23    18.4T  1.57T   205K  none
ana02-ost23/ost23  18.4T  1.57T  18.4T  none
ana02-ost24    18.4T  1.54T   205K  none
ana02-ost24/ost24  18.4T  1.54T  18.4T  none
ana02-ost25    17.9T  2.07T   205K  none
ana02-ost25/ost25  17.9T  2.07T  17.9T  none
ana02-ost26    18.5T  1.49T   205K  none
ana02-ost26/ost26  18.5T  1.49T  18.5T  none
ana02-ost27    18.3T  1.67T   205K  none
ana02-ost27/ost27  18.3T  1.67T  18.3T  none
ana02-ost28    18.3T  1.71T   205K  none
ana02-ost28/ost28  18.3T  1.71T  18.3T  none
ana02-ost29    17.6T  2.38T   205K  none
ana02-ost29/ost29  17.6T  2.38T  17.6T  none
ana02-ost30    17.9T  2.06T   205K  none
ana02-ost30/ost30  17.9T  2.06T  17.9T  none
ana02-ost31    18.0T  1.98T   205K  none
ana02-ost31/ost31  18.0T  1.98T  18.0T  none
ana02-ost32    18.4T  1.59T   205K  none
ana02-ost32/ost32  18.4T  1.59T  18.4T  none
ana02-ost33    18.3T  1.69T   205K  none
ana02-ost33/ost33  18.3T  1.69T  18.3T  none
ana02-ost34    18.4T  1.61T   205K  none
ana02-ost34/ost34  18.4T  1.61T  18.4T  none
ana02-ost35    18.3T  1.70T   205K  none
ana02-ost35/ost35  18.3T  1.70T  18.3T  none
ana02-ost36    18.5T  1.52T   205K  none
ana02-ost36/ost36  18.5T  1.52T  18.5T  none
ana02-ost37    18.1T  1.89T   205K  none
ana02-ost37/ost37  18.1T  1.89T  18.1T  none
ana02-ost38    17.8T  2.14T   205K  none
ana02-ost38/ost38  17.8T  2.14T  17.8T  none
ana02-ost39    17.7T  2.27T   205K  none
ana02-ost39/ost39  17.7T  2.27T  17.7T  none
ana02-ost40    17.8T  2.21T   205K  none
ana02-ost40/ost40  17.8T  2.21T  17.8T  none

I do not know how to recover from this situation and I have important 
data on the OSTs.

Any idea on how I can mount Lustre on the underlying ZFS filesystem ?

This is my ldev.conf. OSTs 21 to 40 won't mount locally anymore.

psanamds121.pcdsn - mgs zfs:ana02-mgs/mgs
psanamds121.pcdsn - mdt0 zfs:ana02-mdt0/mdt0
#
psanaoss121.pcdsn - ANA02-OST01 zfs:ana02-ost01/ost01
psanaoss121.pcdsn - ANA02-OST02 zfs:ana02-ost02/ost02
psanaoss121.pcdsn - ANA02-OST03 zfs:ana02-ost03/ost03
psanaoss121.pcdsn - ANA02-OST04 zfs:ana02-ost04/ost04
psanaoss121.pcdsn - ANA02-OST05 zfs:ana02-ost05/ost05
psanaoss121.pcdsn - ANA02-OST06 zfs:ana02-ost06/ost06
psanaoss121.pcdsn - ANA02-OST07 zfs:ana02-ost07/ost07
psanaoss121.pcdsn - ANA02-OST08 zfs:ana02-ost08/ost08
psanaoss121.pcdsn - ANA02-OST09 zfs:ana02-ost09/ost09
psanaoss121.pcdsn - ANA02-OST10 zfs:ana02-ost10/ost10
psanaoss121.pcdsn - ANA02-OST11 zfs:ana02-ost11/ost11
psanaoss121.pcdsn - ANA02-OST12 zfs:ana02-ost12/ost12
psanaoss121.pcdsn - ANA02-OST13 zfs:ana02-ost13/ost13
psanaoss121.pcdsn - ANA02-OST14 zfs:ana02-ost14/ost14
psanaoss121.pcdsn - ANA02-OST15 zfs:ana02-ost15/ost15
psanaoss121.pcdsn - ANA02-OST16 zfs:ana02-ost16/ost16
psanaoss121.pcdsn - ANA02-OST17 zfs:ana02-ost17/ost17
psanaoss121.pcdsn - ANA02-OST18 zfs:ana02-ost18/ost18
psanaoss121.pcdsn - ANA02-OST19 zfs:ana02-ost19/ost19
psanaoss121.pcdsn - ANA02-OST20 zfs:ana02-ost20/ost20
#
psanaoss122.pcdsn - ANA02-OST21 zfs:ana02-ost21/ost21
psanaoss122.pcdsn - ANA02-OST22 zfs:ana02-ost22/ost22
psanaoss122.pcdsn - ANA02-OST23 zfs:ana02-ost23/ost23
psanaoss122.pcdsn - ANA02-OST24 zfs:ana02-ost24/ost24
psanaoss122.pcdsn - ANA02-OST25 zfs:ana02-ost25/ost25
psanaoss122.pcdsn - ANA02-OST26 zfs:ana02-ost26/ost26
psanaoss122.pcdsn - ANA02-OST27 zfs:ana02-ost27/ost27
psanaoss122.pcdsn - ANA02-OST28 zfs:ana02-ost28/ost28
psanaoss122.pcdsn - ANA02-OST29 zfs:ana02-ost29/ost29
psanaoss122.pcdsn - ANA02-OST30 zfs:ana02-ost30/ost30
psanaoss122.pcdsn - ANA02-OST31 zfs:ana02-ost31/ost31
psanaoss122.pcdsn - ANA02-OST32 zfs:ana02-ost32/ost32
psanaoss122.pcdsn - ANA02-OST33 zfs:ana02-ost33/ost33
psanaoss122.pcdsn - ANA02-OST34 zfs:ana02-ost34/ost34
psanaoss122.pcdsn - ANA02-OST35 zfs:ana02-ost35/ost35
psanaoss122.pcdsn - ANA02-OST36 zfs:ana02-ost36/ost36
psanaoss122.pcdsn - ANA02-OST37 zfs:ana02-ost37/ost37
psanaoss122.pcdsn - ANA02-OST38 zfs:ana02-ost38/ost38
psanaoss122.pcdsn - ANA02-OST39 zfs:ana02-ost39/ost39
psanaoss122.pcdsn - ANA02-OST40 zfs:ana02-ost40/ost40
#

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss

[lustre-discuss] Lustre 2.9.0 server won't start anymore

2018-07-31 Thread Riccardo Veraldi

Hello,
my lustre server 2.9.0 suddenly won't mount its lustre partitions anymore 
after a power outage.
The ZFS pool is active (resilvering one disk). Anyway, systemctl start lustre 
is not working.
I do not see any error message; it just does not mount my Lustre OSS 
partitions.


NAME    USED  AVAIL  REFER  MOUNTPOINT
ana02-ost21    17.7T  2.25T   205K  none
ana02-ost21/ost21  17.7T  2.25T  17.7T  none
ana02-ost22    17.9T  2.09T   205K  none
ana02-ost22/ost22  17.9T  2.09T  17.9T  none
ana02-ost23    18.4T  1.57T   205K  none
ana02-ost23/ost23  18.4T  1.57T  18.4T  none
ana02-ost24    18.4T  1.54T   205K  none
ana02-ost24/ost24  18.4T  1.54T  18.4T  none
ana02-ost25    17.9T  2.07T   205K  none
ana02-ost25/ost25  17.9T  2.07T  17.9T  none
ana02-ost26    18.5T  1.49T   205K  none
ana02-ost26/ost26  18.5T  1.49T  18.5T  none
ana02-ost27    18.3T  1.67T   205K  none
ana02-ost27/ost27  18.3T  1.67T  18.3T  none
ana02-ost28    18.3T  1.71T   205K  none
ana02-ost28/ost28  18.3T  1.71T  18.3T  none
ana02-ost29    17.6T  2.38T   205K  none
ana02-ost29/ost29  17.6T  2.38T  17.6T  none
ana02-ost30    17.9T  2.06T   205K  none
ana02-ost30/ost30  17.9T  2.06T  17.9T  none
ana02-ost31    18.0T  1.98T   205K  none
ana02-ost31/ost31  18.0T  1.98T  18.0T  none
ana02-ost32    18.4T  1.59T   205K  none
ana02-ost32/ost32  18.4T  1.59T  18.4T  none
ana02-ost33    18.3T  1.69T   205K  none
ana02-ost33/ost33  18.3T  1.69T  18.3T  none
ana02-ost34    18.4T  1.61T   205K  none
ana02-ost34/ost34  18.4T  1.61T  18.4T  none
ana02-ost35    18.3T  1.70T   205K  none
ana02-ost35/ost35  18.3T  1.70T  18.3T  none
ana02-ost36    18.5T  1.52T   205K  none
ana02-ost36/ost36  18.5T  1.52T  18.5T  none
ana02-ost37    18.1T  1.89T   205K  none
ana02-ost37/ost37  18.1T  1.89T  18.1T  none
ana02-ost38    17.8T  2.14T   205K  none
ana02-ost38/ost38  17.8T  2.14T  17.8T  none
ana02-ost39    17.7T  2.27T   205K  none
ana02-ost39/ost39  17.7T  2.27T  17.7T  none
ana02-ost40    17.8T  2.21T   205K  none
ana02-ost40/ost40  17.8T  2.21T  17.8T  none

I do not know how to recover from this situation and I have important 
data on the OSTs.

Any idea on how I can mount Lustre on the underlying ZFS filesystem ?

This is my ldev.conf. OSTs 21 to 40 won't mount locally anymore.

psanamds121.pcdsn - mgs zfs:ana02-mgs/mgs
psanamds121.pcdsn - mdt0 zfs:ana02-mdt0/mdt0
#
psanaoss121.pcdsn - ANA02-OST01 zfs:ana02-ost01/ost01
psanaoss121.pcdsn - ANA02-OST02 zfs:ana02-ost02/ost02
psanaoss121.pcdsn - ANA02-OST03 zfs:ana02-ost03/ost03
psanaoss121.pcdsn - ANA02-OST04 zfs:ana02-ost04/ost04
psanaoss121.pcdsn - ANA02-OST05 zfs:ana02-ost05/ost05
psanaoss121.pcdsn - ANA02-OST06 zfs:ana02-ost06/ost06
psanaoss121.pcdsn - ANA02-OST07 zfs:ana02-ost07/ost07
psanaoss121.pcdsn - ANA02-OST08 zfs:ana02-ost08/ost08
psanaoss121.pcdsn - ANA02-OST09 zfs:ana02-ost09/ost09
psanaoss121.pcdsn - ANA02-OST10 zfs:ana02-ost10/ost10
psanaoss121.pcdsn - ANA02-OST11 zfs:ana02-ost11/ost11
psanaoss121.pcdsn - ANA02-OST12 zfs:ana02-ost12/ost12
psanaoss121.pcdsn - ANA02-OST13 zfs:ana02-ost13/ost13
psanaoss121.pcdsn - ANA02-OST14 zfs:ana02-ost14/ost14
psanaoss121.pcdsn - ANA02-OST15 zfs:ana02-ost15/ost15
psanaoss121.pcdsn - ANA02-OST16 zfs:ana02-ost16/ost16
psanaoss121.pcdsn - ANA02-OST17 zfs:ana02-ost17/ost17
psanaoss121.pcdsn - ANA02-OST18 zfs:ana02-ost18/ost18
psanaoss121.pcdsn - ANA02-OST19 zfs:ana02-ost19/ost19
psanaoss121.pcdsn - ANA02-OST20 zfs:ana02-ost20/ost20
#
psanaoss122.pcdsn - ANA02-OST21 zfs:ana02-ost21/ost21
psanaoss122.pcdsn - ANA02-OST22 zfs:ana02-ost22/ost22
psanaoss122.pcdsn - ANA02-OST23 zfs:ana02-ost23/ost23
psanaoss122.pcdsn - ANA02-OST24 zfs:ana02-ost24/ost24
psanaoss122.pcdsn - ANA02-OST25 zfs:ana02-ost25/ost25
psanaoss122.pcdsn - ANA02-OST26 zfs:ana02-ost26/ost26
psanaoss122.pcdsn - ANA02-OST27 zfs:ana02-ost27/ost27
psanaoss122.pcdsn - ANA02-OST28 zfs:ana02-ost28/ost28
psanaoss122.pcdsn - ANA02-OST29 zfs:ana02-ost29/ost29
psanaoss122.pcdsn - ANA02-OST30 zfs:ana02-ost30/ost30
psanaoss122.pcdsn - ANA02-OST31 zfs:ana02-ost31/ost31
psanaoss122.pcdsn - ANA02-OST32 zfs:ana02-ost32/ost32
psanaoss122.pcdsn - ANA02-OST33 zfs:ana02-ost33/ost33
psanaoss122.pcdsn - ANA02-OST34 zfs:ana02-ost34/ost34
psanaoss122.pcdsn - ANA02-OST35 zfs:ana02-ost35/ost35
psanaoss122.pcdsn - ANA02-OST36 zfs:ana02-ost36/ost36
psanaoss122.pcdsn - ANA02-OST37 zfs:ana02-ost37/ost37
psanaoss122.pcdsn - ANA02-OST38 zfs:ana02-ost38/ost38
psanaoss122.pcdsn - ANA02-OST39 zfs:ana02-ost39/ost39
psanaoss122.pcdsn - ANA02-OST40 zfs:ana02-ost40/ost40
#

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] OST doomed after e2fsck

2018-05-31 Thread Riccardo Veraldi
thanks a lot! it worked.
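
(A short hedged sketch of the recovery Fernando describes below, assuming the
newer e2fsprogs build distributed for Lustre and a hypothetical OST device
/dev/sdX:

yum update e2fsprogs e2fsprogs-libs   # pick up the newer Lustre e2fsprogs build
e2fsck -fp /dev/sdX                   # re-run the check on the affected OST
)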

On 5/31/18 12:14 AM, Fernando Pérez wrote:
> I had the same problem in the past with 2.4 release.
>
> I solved the problem by upgrading e2fsprogs to its latest release
> and running again e2fsck in the failed OST.
>
> Regards.
>
> 
> Fernando Pérez
> Institut de Ciències del Mar (CMIMA-CSIC)
> Departament Oceanografía Física i Tecnològica
> Passeig Marítim de la Barceloneta,37-49
> 08003 Barcelona
> Phone:  (+34) 93 230 96 35 
> ====
>
> On 31 May 2018, at 5:36, Riccardo Veraldi
> <riccardo.vera...@cnaf.infn.it>
> wrote:
>
>> Hello,
>>
>> after a power outage I had one of my OSTs (total of 60) in an unhappy
>> state.
>>
>> Lustre version 2.4.1
>>
>> I ran then a FS check and here follows:
>>
>> e2fsck 1.42.7.wc1 (12-Apr-2013)
>> Pass 1: Checking inodes, blocks, and sizes
>> Pass 2: Checking directory structure
>> Pass 3: Checking directory connectivity
>> Pass 4: Checking reference counts
>> Unattached inode 25793
>> Connect to /lost+found? yes
>> Inode 25793 ref count is 2, should be 1.  Fix? yes
>> Unattached inode 29096
>> Connect to /lost+found? yes
>> Inode 29096 ref count is 2, should be 1.  Fix? yes
>> Unattached inode 29745
>> Connect to /lost+found? yes
>> Inode 29745 ref count is 2, should be 1.  Fix? yes
>> Unattached inode 29821
>> Connect to /lost+found? yes
>> Inode 29821 ref count is 2, should be 1.  Fix? yes
>> yPass 5: Checking group summary information
>> Inode bitmap differences:  +23902 +29082 +29096 +29130 +29459 +29497
>> -29530 +29552 +29566 +29596 +(29643--29644) +29655 +29668 +29675 +29696
>> +29701 +29720 +29736 +29739 +29745 +29751 +29778 +29787 -29795 +29808
>> +29821
>> Fix? yes
>> Free inodes count wrong for group #70 (1, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #76 (1, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #90 (1, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #93 (3, counted=2).
>> Fix? yes
>> Free inodes count wrong for group #100 (2, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #101 (1, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #113 (5, counted=2).
>> Fix? yes
>> Free inodes count wrong for group #114 (1, counted=0).
>> Fix? yes
>> Free inodes count wrong for group #115 (13, counted=4).
>> Fix? yes
>> Free inodes count wrong for group #116 (149, counted=140).
>> Fix? yes
>> Free inodes count wrong (30493545, counted=30493516).
>> Fix? yes
>> [QUOTA WARNING] Usage inconsistent for ID 0:actual (2083721216, 841) !=
>> expected (2082398208, 678)
>> [QUOTA WARNING] Usage inconsistent for ID 9997:actual (1095815659520,
>> 19800) != expected (664375967744, 19791)
>> [QUOTA WARNING] Usage inconsistent for ID -1597706240:actual (0, 0) !=
>> expected (90112, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -1428439040:actual (0, 0) !=
>> expected (126976, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -1936064512:actual (0, 0) !=
>> expected (12288, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -1684783104:actual (0, 0) !=
>> expected (28672, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -2131947520:actual (0, 0) !=
>> expected (4096, 1)
>> [QUOTA WARNING] Usage inconsistent for ID 963263424:actual (957718528,
>> 49) != expected (957628416, 48)
>> [QUOTA WARNING] Usage inconsistent for ID 987173056:actual (1364516864,
>> 158) != expected (1364426752, 157)
>> [QUOTA WARNING] Usage inconsistent for ID -1537871872:actual (0, 0) !=
>> expected (73728, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -2105077760:actual (0, 0) !=
>> expected (49152, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -2145202176:actual (0, 0) !=
>> expected (24576, 1)
>> [QUOTA WARNING] Usage inconsistent for ID -1422704640:actual (0, 0) !=
>> expected (65536, 1)
>> Update quota info for quota type 0? yes
>> [ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
>> other block (0) than it should (472).
>> [ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
>> other block (0) than it should (507).
>> [ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
>> other block (0) than it should (170).
>> [ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
>> other block (0) than it should (435).
>> [ERROR] quotaio_tree.

[lustre-discuss] OST doomed after e2fsck

2018-05-30 Thread Riccardo Veraldi
Hello,

after a power outage I had one of my OSTs (total of 60) in an unhappy state.

Lustre version 2.4.1

I ran then a FS check and here follows:

e2fsck 1.42.7.wc1 (12-Apr-2013)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Unattached inode 25793
Connect to /lost+found? yes
Inode 25793 ref count is 2, should be 1.  Fix? yes
Unattached inode 29096
Connect to /lost+found? yes
Inode 29096 ref count is 2, should be 1.  Fix? yes
Unattached inode 29745
Connect to /lost+found? yes
Inode 29745 ref count is 2, should be 1.  Fix? yes
Unattached inode 29821
Connect to /lost+found? yes
Inode 29821 ref count is 2, should be 1.  Fix? yes
yPass 5: Checking group summary information
Inode bitmap differences:  +23902 +29082 +29096 +29130 +29459 +29497
-29530 +29552 +29566 +29596 +(29643--29644) +29655 +29668 +29675 +29696
+29701 +29720 +29736 +29739 +29745 +29751 +29778 +29787 -29795 +29808
+29821
Fix? yes
Free inodes count wrong for group #70 (1, counted=0).
Fix? yes
Free inodes count wrong for group #76 (1, counted=0).
Fix? yes
Free inodes count wrong for group #90 (1, counted=0).
Fix? yes
Free inodes count wrong for group #93 (3, counted=2).
Fix? yes
Free inodes count wrong for group #100 (2, counted=0).
Fix? yes
Free inodes count wrong for group #101 (1, counted=0).
Fix? yes
Free inodes count wrong for group #113 (5, counted=2).
Fix? yes
Free inodes count wrong for group #114 (1, counted=0).
Fix? yes
Free inodes count wrong for group #115 (13, counted=4).
Fix? yes
Free inodes count wrong for group #116 (149, counted=140).
Fix? yes
Free inodes count wrong (30493545, counted=30493516).
Fix? yes
[QUOTA WARNING] Usage inconsistent for ID 0:actual (2083721216, 841) !=
expected (2082398208, 678)
[QUOTA WARNING] Usage inconsistent for ID 9997:actual (1095815659520,
19800) != expected (664375967744, 19791)
[QUOTA WARNING] Usage inconsistent for ID -1597706240:actual (0, 0) !=
expected (90112, 1)
[QUOTA WARNING] Usage inconsistent for ID -1428439040:actual (0, 0) !=
expected (126976, 1)
[QUOTA WARNING] Usage inconsistent for ID -1936064512:actual (0, 0) !=
expected (12288, 1)
[QUOTA WARNING] Usage inconsistent for ID -1684783104:actual (0, 0) !=
expected (28672, 1)
[QUOTA WARNING] Usage inconsistent for ID -2131947520:actual (0, 0) !=
expected (4096, 1)
[QUOTA WARNING] Usage inconsistent for ID 963263424:actual (957718528,
49) != expected (957628416, 48)
[QUOTA WARNING] Usage inconsistent for ID 987173056:actual (1364516864,
158) != expected (1364426752, 157)
[QUOTA WARNING] Usage inconsistent for ID -1537871872:actual (0, 0) !=
expected (73728, 1)
[QUOTA WARNING] Usage inconsistent for ID -2105077760:actual (0, 0) !=
expected (49152, 1)
[QUOTA WARNING] Usage inconsistent for ID -2145202176:actual (0, 0) !=
expected (24576, 1)
[QUOTA WARNING] Usage inconsistent for ID -1422704640:actual (0, 0) !=
expected (65536, 1)
Update quota info for quota type 0? yes
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (472).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (507).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (170).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (435).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (89).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (5).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (130).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (435).
[ERROR] quotaio_tree.c:357:free_dqentry:: Quota structure has offset to
other block (0) than it should (251).
[QUOTA WARNING] Usage inconsistent for ID 0:actual (8301957120, 843) !=
expected (5880315904, 677)
[QUOTA WARNING] Usage inconsistent for ID 2279:actual (14819280969728,
21842) != expected (14298746671104, 21705)
Update quota info for quota type 1? yes

ana01-OST000e: * FILE SYSTEM WAS MODIFIED *
ana01-OST000e: 29876/30523392 files (22.3% non-contiguous),
3670668872/3906963456 blocks


After this, trying to mount the OST again makes the lustre
kernel module hang and the Linux kernel panics.
It is reproducible every time I try to mount the OST fixed with e2fsck.
So basically I lost all data on the OST.
Any hints on how I could recover it ?
thank you.

Rick





___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] set_param permanent on client side ?

2018-05-28 Thread Riccardo Veraldi
The problem is that some of these parameters seem not to exist on
the MDS side

lctl get_param osc.*.checksums
error: get_param: param_path 'osc/*/checksums': No such file or directory

lctl get_param osc.*.max_pages_per_rpc
error: get_param: param_path 'osc/*/max_pages_per_rpc': No such file or
directory

lctl get_param llite.*.max_cached_mb

only some of them are on the MDS side

lctl get_param osc.*.max_rpcs_in_flight
osc.drpffb-OST0001-osc-MDT.max_rpcs_in_flight=64
osc.drpffb-OST0002-osc-MDT.max_rpcs_in_flight=64
osc.drpffb-OST0003-osc-MDT.max_rpcs_in_flight=64




On 5/23/18 1:15 AM, Artem Blagodarenko wrote:
> Hello Riccardo,
>
> There is an “lctl set_param -P” command that sets a parameter permanently. It needs 
> to be executed on the MGS server (and only the MGS must be mounted), but the parameter is 
> applied to the given target (or client). From your example:
>
> lctl set_param -P osc.*.checksums=0   
>
> Will execute “set_param osc.*.checksums=0” on all targets.
>
> Best regards,
> Artem Blagodarenko.
>
>> On 23 May 2018, at 00:11, Riccardo Veraldi  
>> wrote:
>>
>> Hello,
>>
>> how do I set_param in a persistent way on the lustre client side so that
>> it does not have to be set again after every reboot ?
>>
>> Not all of these parameters can be set on the MDS, for example the osc.* :
>>
>> lctl set_param osc.*.checksums=0
>> lctl set_param timeout=600
>> lctl set_param at_min=250
>> lctl set_param at_max=600
>> lctl set_param ldlm.namespaces.*.lru_size=2000
>> lctl set_param osc.*.max_rpcs_in_flight=64
>> lctl set_param osc.*.max_dirty_mb=1024
>> lctl set_param llite.*.max_read_ahead_mb=1024
>> lctl set_param llite.*.max_cached_mb=81920
>> lctl set_param llite.*.max_read_ahead_per_file_mb=1024
>> lctl set_param subsystem_debug=0
>>
>> thank you
>>
>>
>> Rick
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] ptlrpcd parameters

2018-05-28 Thread Riccardo Veraldi
On 5/22/18 11:44 PM, Dilger, Andreas wrote:
> On May 22, 2018, at 15:15, Riccardo Veraldi  
> wrote:
>> hello,
>>
>> how do I set ptlrpcd parameters at boot time ?
>>
>> instead of
>>
>> echo 32 > /sys/module/ptlrpc/parameters/max_ptlrpcds
>> echo 3 > /sys/module/ptlrpc/parameters/ptlrpcd_bind_policy
>>
>> I tried to load from /etc/modprobe.d/ptlrpc.conf
>>
>> options ptlrpcd max_ptlrpcds=32
>> options ptlrpcd ptlrpcd_bind_policy=3
>>
>> but this way does not work. the file in modprobe.d is ignored.
> The module is not named "ptlrpcd", it is named "ptlrpc".
>
>> also these 2 parameters are labeled as obsoleted but I found benefits
>> setting ptlrpcd max_ptlrpcds to 32
> The "max_ptlrpcds" and "ptlrpcd_bind_policy" parameters are obsolete, you
> should use "ptlrpcd_per_cpt_max" and "ptlrpcd_partner_group_size" instead.
great thank you very much
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>
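
(A hedged sketch of what a working modprobe file might look like, using the
module name and replacement parameters Andreas points out above; the values
are only illustrative, not recommendations:

# /etc/modprobe.d/ptlrpc.conf
options ptlrpc ptlrpcd_per_cpt_max=32 ptlrpcd_partner_group_size=2
)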

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] ptlrpcd parameters

2018-05-22 Thread Riccardo Veraldi
hello,

how do I set ptlrpcd parameters at boot time ?

instead of

echo 32 > /sys/module/ptlrpc/parameters/max_ptlrpcds
echo 3 > /sys/module/ptlrpc/parameters/ptlrpcd_bind_policy

I tried to load from /etc/modprobe.d/ptlrpc.conf

options ptlrpcd max_ptlrpcds=32
options ptlrpcd ptlrpcd_bind_policy=3

but this does not work; the file in modprobe.d is ignored.

also, these 2 parameters are labeled as obsolete, but I found benefits
from setting max_ptlrpcds to 32


thanks

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] set_param permanent on client side ?

2018-05-22 Thread Riccardo Veraldi
Hello,

how do I set_param in a persistent way on the lustre client side so that
it does not have to be set again after every reboot ?

Not all of these parameters can be set on the MDS, for example the osc.* :

lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*.max_rpcs_in_flight=64
lctl set_param osc.*.max_dirty_mb=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_cached_mb=81920
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
lctl set_param subsystem_debug=0

thank you


Rick



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] parallel write/reads problem

2018-05-10 Thread Riccardo Veraldi
Hello,
So far I am not able to solve this problem on my Lustre setup.
I can reach very good performance with multi-threaded writes or reads,
that is, sequential writes and sequential reads run at different times.
I can saturate InfiniBand FDR, reaching 6GB/s.
The problem arises when, while writing, I also start reading the same file
or even a different file.
In our I/O model there are writers and readers, and the readers start
reading files a while after they have begun being written. In this case read
performance drops dramatically. Writes go up to 6GB/s but reads hit a
barrier and won't go above 3GB/s.
I tried all kinds of optimizations. ZFS performs very well by itself,
but when Lustre is on top of it I have this problem.
InfiniBand works at full speed and the LNet test also runs at full speed.
So I do not understand why read performance goes down when there are
concurrent writes/reads.

I also tweaked the ko2iblnd parameters to gain more parallelism:

options ko2iblnd timeout=100 peer_credits=63 credits=2560
concurrent_sends=63 ntx=2048 fmr_pool_size=1280 fmr_flush_trigger=1024
ntx=5120

then on OSS side:

lctl set_param timeout=600
lctl set_param ldlm_timeout=200
lctl set_param at_min=250
lctl set_param at_max=600

on client side:

lctl set_param osc.*.checksums=0
lctl set_param timeout=600
lctl set_param at_min=250
lctl set_param at_max=600
lctl set_param ldlm.namespaces.*.lru_size=2000
lctl set_param osc.*.max_rpcs_in_flight=64
lctl set_param osc.*.max_dirty_mb=1024
lctl set_param llite.*.max_read_ahead_mb=1024
lctl set_param llite.*.max_cached_mb=81920
lctl set_param llite.*.max_read_ahead_per_file_mb=1024
lctl set_param subsystem_debug=0

I tried to set

lctl set_param osc.*.max_pages_per_rpc=1024

but it is not allowed...

 lctl set_param osc.*.max_pages_per_rpc=1024
error: set_param: setting
/proc/fs/lustre/osc/drplu-OST0001-osc-881ed6b05800/max_pages_per_rpc=1024:
Numerical result out of range
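
(A hedged aside: on 2.10 servers the per-RPC page limit is normally bounded by
the OST brw_size, so raising that first is usually the prerequisite. A sketch
only, assuming obdfilter.*.brw_size is available on these OSTs and that the
clients reconnect afterwards:

lctl set_param obdfilter.*.brw_size=4        # on each OSS, value in MB
lctl set_param osc.*.max_pages_per_rpc=1024  # then on remounted clients
)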


any other idea on what I may work on to get better performance on
concurrent writes/reads ?

thank you


Rick



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] /proc/sys/lnet gone ?

2018-05-08 Thread Riccardo Veraldi
ok thank you

On 5/8/18 8:08 PM, Dilger, Andreas wrote:
> Please use "lctl get_param peers" or "lctl get_param nis". This will work 
> with any version of lustre, since we have to move files from /proc to /sys to 
> make upstream kernel folks happy. 
>
> Cheers, Andreas
>
>> On May 8, 2018, at 18:24, Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> 
>> wrote:
>>
>> Hello,
>> on my lustre 2.11.0 tuning testbed I can no longer find /proc/sys/lnet
>> it was very handy to look at /proc/sys/lnet/peers and /proc/sys/lnet/nis
>> has this been moved somewhere else ?
>> thank you
>>
>>
>> Rick
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] problems with lnet peers on lustre 2.11.0

2018-05-08 Thread Riccardo Veraldi
Hello,
I have problems with my lnet configuration on lustre 2.11.0.
Everything starts just fine, but after a while lnet auto-discovers peers
and adds the tcp network interfaces of my OSSes and clients,
so that clients start writing to the lustre partition using tcp and no
longer o2ib.
I use and need tcp just to contact the MDS, and o2ib for contacting the
OSSes. This configuration has always worked with Lustre 2.10.*.

I tried to switch off auto peer discovery but it did not work.
I also tried not to use lnet.conf at all and just to use
/etc/modprobe.d/lustre.conf with

options lnet networks=o2ib(ib0),tcp(eth0)

but it seems lustre 2.11.0 does not like it anymore.

So I went back to lnet.conf, but I can't make it stop auto-discovering
tcp interfaces.

After a while the tcp interfaces start to appear even though I did not
configure them to, and they take over from o2ib.

How can I prevent the usage of tcp interfaces on the OSS and client side,
giving priority to the o2ib interface ?


 lnetctl export | grep tcp
  tcp bonding: 0
    - net type: tcp
    - nid: 172.21.42.211@tcp
  tcp bonding: 0
  tcp bonding: 0
*    - primary nid: 172.21.42.202@tcp**
**    - nid: 172.21.42.202@tcp*
    - primary nid: 172.21.42.213@tcp
    - nid: 172.21.42.213@tcp

so 172.21.42.202@tcp is used instead of the infiniband interface, and
this is discovered automatically.

This is the configuration on my OSS where 172.21.42.213 is the MDS.

net:
    - net type: tcp
  local NI(s):
    - nid: 172.21.42.211@tcp
  status: up
  interfaces:
  0: enp1s0f0
    - net type: o2ib
  local NI(s):
    - nid: 172.21.52.86@o2ib
  status: up
  interfaces:
  0: ib0
peer:
    - primary nid: 172.21.42.213@tcp
  Multi-Rail: False
  peer ni:
    - nid: 172.21.42.213@tcp
  state: NA
    - primary nid:  172.21.52.126@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.126@o2ib
  state: NA
    - primary nid:  172.21.52.127@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.127@o2ib
  state: NA
    - primary nid:  172.21.52.128@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.128@o2ib
  state: NA
    - primary nid:  172.21.52.129@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.129@o2ib
  state: NA
    - primary nid:  172.21.52.130@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.130@o2ib
  state: NA
    - primary nid:  172.21.52.131@o2ib
  Multi-Rail: False
  peer ni:
    - nid: 172.21.52.131@o2ib
  state: NA
global:
    numa_range: 0
    discovery: 0


thanks a lot


Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre ldlm_lockd errors

2018-04-25 Thread Riccardo Veraldi
Hello,
I am having a quite serious problem with the lock manager.
First of all, we are using Lustre 2.10.3 on both the server and client side
on RHEL7.
The only difference between servers and clients is that the lustre OSSes run
kernel 4.4.126 while the clients run the stock RHEL7 kernel.
We have NVMe disks on the OSSes, and kernel 4.4 manages IRQ balancing for
NVMe disks much better.
It is possible to reproduce the problem. I get this error during
simultaneous reads and writes. If I run the
writer and reader sequentially, the problem does not occur and
everything performs really well.
Unfortunately we need to write a file and have several threads reading
from it too.
So one big file is written, and after a while multiple reader threads
access the file to read data (experimental data). This is the model of
our DAQ.
The specific failures are occurring in the read threads when they ask
for the file size (the call to os.stat() in python).
This is both to delay the start of the readers until the file exists and
to keep the reader from deadlocking the writer by repeatedly asking for
the data at the end of the file.
I do not know if there is a way to fix this. It seems that
writing one file and having a bunch of threads reading from the same
file makes the lock manager unhappy in some way.
Any hints would be very much appreciated. Thank you.

Errors OSS side:

Apr 25 10:31:19 drp-tst-ffb01 kernel: LustreError:
0:0:(ldlm_lockd.c:334:waiting_locks_callback()) ### lock callback timer
expired after 101s: evicting client at 172.21.52.131@o2ib  ns:
filter-drpffb-OST0001_UUID lock: 88202010b600/0x5be7c3e66a45b63f
lrc: 3/0,0 mode: PR/PR res: [0x4ad:0x0:0x0].0x0 rrc: 4397 type: EXT
[0->18446744073709551615] (req 0->18446744073709551615) flags:
0x6400010020 nid: 172.21.52.131@o2ib remote: 0xc0c93433d781fff9
expref: 5 pid: 10804 timeout: 4774735450 lvb_type: 1
Apr 25 10:31:20 drp-tst-ffb01 kernel: LustreError:
9524:0:(ldlm_lockd.c:2365:ldlm_cancel_handler()) ldlm_cancel from
172.21.52.127@o2ib arrived at 1524677480 with bad export cookie
6622477171464070609
Apr 25 10:31:20 drp-tst-ffb01 kernel: Lustre: drpffb-OST0001: Connection
restored to 23bffb9d-10bd-0603-76f6-e2173f99e3c6 (at 172.21.52.127@o2ib)
Apr 25 10:31:20 drp-tst-ffb01 kernel: Lustre: Skipped 65 previous
similar messages


Errors client side:

Apr 25 10:31:19 drp-tst-acc06 kernel: Lustre:
drpffb-OST0002-osc-880167fda800: Connection to drpffb-OST0002 (at
172.21.52.84@o2ib) was lost; in progress operations using this service
will wait for recovery to complete
Apr 25 10:31:19 drp-tst-acc06 kernel: Lustre: Skipped 1 previous similar
message
Apr 25 10:31:19 drp-tst-acc06 kernel: LustreError: 167-0:
drpffb-OST0002-osc-880167fda800: This client was evicted by
drpffb-OST0002; in progress operations using this service will fail.
Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError: 11-0:
drpffb-OST0001-osc-880167fda800: operation ost_statfs to node
172.21.52.83@o2ib failed: rc = -107
Apr 25 10:31:22 drp-tst-acc06 kernel: Lustre:
drpffb-OST0001-osc-880167fda800: Connection to drpffb-OST0001 (at
172.21.52.83@o2ib) was lost; in progress operations using this service
will wait for recovery to complete
Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError: 167-0:
drpffb-OST0001-osc-880167fda800: This client was evicted by
drpffb-OST0001; in progress operations using this service will fail.
Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
59702:0:(ldlm_resource.c:1100:ldlm_resource_complain())
drpffb-OST0001-osc-880167fda800: namespace resource
[0x4ad:0x0:0x0].0x0 (881004af6e40) refcount nonzero (1) after lock
cleanup; forcing cleanup.
Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
59702:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
[0x4ad:0x0:0x0].0x0 (881004af6e40) refcount = 2
Apr 25 10:31:22 drp-tst-acc06 kernel: LustreError:
59702:0:(ldlm_resource.c:1682:ldlm_resource_dump()) --- Resource:
[0x4ad:0x0:0x0].0x0 (881004af6e40) refcount = 2


some other info that can be useful:

# lctl get_param  llite.*.max_cached_mb
llite.drpffb-880167fda800.max_cached_mb=
users: 5
max_cached_mb: 64189
used_mb: 9592
unused_mb: 54597
reclaim_count: 0
llite.drplu-881fe1f99000.max_cached_mb=
users: 8
max_cached_mb: 64189
used_mb: 0
unused_mb: 64189
reclaim_count: 0

# lctl get_param ldlm.namespaces.*.lru_size
ldlm.namespaces.MGC172.21.42.159@tcp.lru_size=1600
ldlm.namespaces.MGC172.21.42.213@tcp.lru_size=1600
ldlm.namespaces.drpffb-MDT-mdc-880167fda800.lru_size=3
ldlm.namespaces.drpffb-OST0001-osc-880167fda800.lru_size=0
ldlm.namespaces.drpffb-OST0002-osc-880167fda800.lru_size=2
ldlm.namespaces.drpffb-OST0003-osc-880167fda800.lru_size=0
ldlm.namespaces.drplu-MDT-mdc-881fe1f99000.lru_size=0
ldlm.namespaces.drplu-OST0001-osc-881fe1f99000.lru_size=0
ldlm.namespaces.drplu-OST0002-osc-881fe1f99000.lru_size=0
ldlm.namespaces.drplu-OST0003-osc-881fe1f99000.lru_size=0

[lustre-discuss] lustre for home directories

2018-04-25 Thread Riccardo Veraldi
Hello,
just wondering whether those using lustre for home directories with several
users are happy with it or not.
I am considering moving home directories from NFS to Lustre/ZFS.
It is quite easy to get the NFS server into trouble with just a few
users copying files around.
What special tuning is needed to optimize Lustre usage for small files ?
I guess a 1M record size would not be a good choice anymore.
thanks

Rick

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] problems mounting 2.9.0 server and older

2018-04-24 Thread Riccardo Veraldi
On 4/24/18 2:11 AM, Jones, Peter A wrote:
> We’ll be adding support for RHEL 7.5 in the upcoming 2.10.4 release. Note 
> that 2.9.59 is not a fully qualified release and is just a testing tag during 
> the 2.10 release cycle (that is the case for any version where y>50 for 
> version x.y.z)
yes, I was using 2.9.59 because it fixed a data corruption bug.
thanks

>
>
>
> On 2018-04-23, 7:08 PM, "lustre-discuss on behalf of Riccardo Veraldi" 
> <lustre-discuss-boun...@lists.lustre.org on behalf of 
> riccardo.vera...@cnaf.infn.it> wrote:
>
>> I tried older client 2.9.59 and still have the same problem.
>> I think there may be a problem with RHEL75.
>> Anyone is using Lustre client 2.10.* on RHEL75 ?
>> thank you
>>
>> Rick
>>
>>
>> On 4/23/18 6:39 PM, Riccardo Veraldi wrote:
>>> Hello,
>>>
>>> I upgraded some of my clients to RHEL75 and Lustre 2.10.3
>>>
>>> Now I can mount all my lustre FS which are 2.9.0 and older  (down to 2.4
>>> ) but I cannot see the directories.
>>>
>>> ls: reading directory /lfs01:  Not a direcory
>>> total 0
>>>
>>> on another client with Lustre 2.10.1  I can mount and use the filesytem
>>> from server 2.9.0 and older.
>>>
>>> Am I reaching a compatibility issue trying to mount 2.9.0 and older from
>>> a 2.10.3 client ?
>>>
>>> thank you
>>>
>>>
>>> Rick
>>>
>>>
>>> ___
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] problems mounting 2.9.0 server and older

2018-04-23 Thread Riccardo Veraldi
I tried an older client, 2.9.59, and still have the same problem.
I think there may be a problem with RHEL 7.5.
Is anyone using the Lustre 2.10.* client on RHEL 7.5 ?
thank you

Rick


On 4/23/18 6:39 PM, Riccardo Veraldi wrote:
> Hello,
>
> I upgraded some of my clients to RHEL75 and Lustre 2.10.3
>
> Now I can mount all my lustre FS which are 2.9.0 and older  (down to 2.4
> ) but I cannot see the directories.
>
> ls: reading directory /lfs01:  Not a direcory
> total 0
>
> on another client with Lustre 2.10.1  I can mount and use the filesytem
> from server 2.9.0 and older.
>
> Am I reaching a compatibility issue trying to mount 2.9.0 and older from
> a 2.10.3 client ?
>
> thank you
>
>
> Rick
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] problems mounting 2.9.0 server and older

2018-04-23 Thread Riccardo Veraldi
Hello,

I upgraded some of my clients to RHEL75 and Lustre 2.10.3

Now I can mount all my lustre FS which are 2.9.0 and older (down to 2.4),
but I cannot see the directories.

ls: reading directory /lfs01: Not a directory
total 0

On another client with Lustre 2.10.1 I can mount and use the filesystem
from servers 2.9.0 and older.

Am I hitting a compatibility issue when trying to mount 2.9.0 and older from
a 2.10.3 client ?

thank you


Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lnet configuration messed up when clients mount lustre

2018-04-20 Thread Riccardo Veraldi
I figured out that the problem was caused by a messed-up mgs partition on my
MDS.
thanks

On 4/19/18 7:18 PM, Riccardo Veraldi wrote:
> Hello,
> On my OSSes and on my clients the lnet configuration is loaded at
> boot time from lnet.conf.
> I define local interfaces and peers.
> What happens is that when the lustre filesystems are mounted by the
> clients lnet is modified both on client and OSS side  and tcp peers are
> added at the end
> of the lnet configuration and this has as a consequence that all traffic
> starts to go through TCP and not infiniband.
> I am using RHEL 7.4 and Lustre 2.10.3; my configuration is a bit uncommon
> because I use kernel 4.4 on the servers while all the
> clients run the stock RHEL 7.4 kernel.
>
> Follows Lnet yaml configuration before client mounting lustre and after
> client mounting lustre partitions.
>
> It seems that auto peer discovery is overriding ib and using just tcp.
> Is there a way to stop peer auto discovery, or a way to tell that ib takes
> precedence over tcp ?
>
> lnet configuread at boot:
>
> net:
>     - net type: lo
>   local NI(s):
>     - nid: 0@lo
>   status: up
>   statistics:
>   send_count: 0
>   recv_count: 0
>   drop_count: 0
>   tunables:
>   peer_timeout: 0
>   peer_credits: 0
>   peer_buffer_credits: 0
>   credits: 0
>   lnd tunables:
>   tcp bonding: 0
>   dev cpt: 0
>   CPT: "[0,1]"
>     - net type: o2ib
>   local NI(s):
>     - nid: 172.21.52.84@o2ib
>   status: up
>   interfaces:
>   0: ib0
>   statistics:
>   send_count: 96252389
>   recv_count: 61558248
>   drop_count: 0
>   tunables:
>   peer_timeout: 180
>   peer_credits: 128
>   peer_buffer_credits: 0
>   credits: 1024
>   lnd tunables:
>   peercredits_hiw: 64
>   map_on_demand: 32
>   concurrent_sends: 256
>   fmr_pool_size: 2048
>   fmr_flush_trigger: 512
>   fmr_cache: 1
>   ntx: 2048
>   conns_per_peer: 4
>   tcp bonding: 0
>   dev cpt: 1
>   CPT: "[0,1]"
>     - nid: 172.21.52.116@o2ib
>   status: up
>   interfaces:
>   0: ib1
>   statistics:
>   send_count: 96253070
>   recv_count: 61558217
>   drop_count: 0
>   tunables:
>   peer_timeout: 180
>   peer_credits: 128
>   peer_buffer_credits: 0
>   credits: 1024
>   lnd tunables:
>   peercredits_hiw: 64
>   map_on_demand: 32
>   concurrent_sends: 256
>   fmr_pool_size: 2048
>   fmr_flush_trigger: 512
>   fmr_cache: 1
>   ntx: 2048
>   conns_per_peer: 4
>   tcp bonding: 0
>   dev cpt: 1
>   CPT: "[0,1]"
>     - net type: tcp
>   local NI(s):
>     - nid: 172.21.42.207@tcp
>   status: up
>   interfaces:
>   0: enp1s0f0
>   statistics:
>   send_count: 380697
>   recv_count: 380352
>   drop_count: 0
>   tunables:
>   peer_timeout: 180
>   peer_credits: 8
>   peer_buffer_credits: 0
>   credits: 256
>   lnd tunables:
>   tcp bonding: 0
>   dev cpt: 0
>   CPT: "[0,1]"
> peer:
>     - primary nid: 172.21.42.159@tcp
>   Multi-Rail: True
>   peer ni:
>     - nid: 172.21.42.159@tcp
>   state: NA
>   max_ni_tx_credits: 8
>   available_tx_credits: 8
>   min_tx_credits: 0
>   tx_q_num_of_buf: 0
>   available_rtr_credits: 8
>   min_rtr_credits: 8
>   send_count: 380697
>   recv_count: 380352
>   drop_count: 0
>   refcount: 1
>     - primary nid: 172.21.52.126@o2ib
>   Multi-Rail: True
>   peer ni:
>     - nid: 172.21.52.126@o2ib
>   state: NA
>   max_ni_tx_credits: 128
>   available_tx_credits: 128
>   min_tx_credits: -7
>   tx_q_num_of_buf: 0
>   available_rtr_credits: 128
>   min_rtr_credits: 128
>   send_count: 28134533
>   recv_count: 8553649
>   drop_count: 

[lustre-discuss] lnet configuration messed up when clients mount lustre

2018-04-19 Thread Riccardo Veraldi
Hello,
On my OSSes and on my clients the lnet configuration is loaded at
boot time from lnet.conf;
I define local interfaces and peers.
What happens is that when the lustre filesystems are mounted by the
clients, lnet is modified on both the client and OSS side and tcp peers are
added at the end
of the lnet configuration; as a consequence all traffic
starts to go through TCP and not InfiniBand.
I am using RHEL 7.4 and Lustre 2.10.3; my configuration is a bit uncommon
because I use kernel 4.4 on the servers while all the
clients run the stock RHEL 7.4 kernel.

The LNet yaml configuration follows, before and after the clients mount the
lustre partitions.

It seems that auto peer discovery is overriding ib and using just tcp.
Is there a way to stop peer auto discovery, or a way to tell that ib takes
precedence over tcp ?

lnet configuread at boot:

net:
    - net type: lo
  local NI(s):
    - nid: 0@lo
  status: up
  statistics:
  send_count: 0
  recv_count: 0
  drop_count: 0
  tunables:
  peer_timeout: 0
  peer_credits: 0
  peer_buffer_credits: 0
  credits: 0
  lnd tunables:
  tcp bonding: 0
  dev cpt: 0
  CPT: "[0,1]"
    - net type: o2ib
  local NI(s):
    - nid: 172.21.52.84@o2ib
  status: up
  interfaces:
  0: ib0
  statistics:
  send_count: 96252389
  recv_count: 61558248
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 128
  peer_buffer_credits: 0
  credits: 1024
  lnd tunables:
  peercredits_hiw: 64
  map_on_demand: 32
  concurrent_sends: 256
  fmr_pool_size: 2048
  fmr_flush_trigger: 512
  fmr_cache: 1
  ntx: 2048
  conns_per_peer: 4
  tcp bonding: 0
  dev cpt: 1
  CPT: "[0,1]"
    - nid: 172.21.52.116@o2ib
  status: up
  interfaces:
  0: ib1
  statistics:
  send_count: 96253070
  recv_count: 61558217
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 128
  peer_buffer_credits: 0
  credits: 1024
  lnd tunables:
  peercredits_hiw: 64
  map_on_demand: 32
  concurrent_sends: 256
  fmr_pool_size: 2048
  fmr_flush_trigger: 512
  fmr_cache: 1
  ntx: 2048
  conns_per_peer: 4
  tcp bonding: 0
  dev cpt: 1
  CPT: "[0,1]"
    - net type: tcp
  local NI(s):
    - nid: 172.21.42.207@tcp
  status: up
  interfaces:
  0: enp1s0f0
  statistics:
  send_count: 380697
  recv_count: 380352
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 8
  peer_buffer_credits: 0
  credits: 256
  lnd tunables:
  tcp bonding: 0
  dev cpt: 0
  CPT: "[0,1]"
peer:
    - primary nid: 172.21.42.159@tcp
  Multi-Rail: True
  peer ni:
    - nid: 172.21.42.159@tcp
  state: NA
  max_ni_tx_credits: 8
  available_tx_credits: 8
  min_tx_credits: 0
  tx_q_num_of_buf: 0
  available_rtr_credits: 8
  min_rtr_credits: 8
  send_count: 380697
  recv_count: 380352
  drop_count: 0
  refcount: 1
    - primary nid: 172.21.52.126@o2ib
  Multi-Rail: True
  peer ni:
    - nid: 172.21.52.126@o2ib
  state: NA
  max_ni_tx_credits: 128
  available_tx_credits: 128
  min_tx_credits: -7
  tx_q_num_of_buf: 0
  available_rtr_credits: 128
  min_rtr_credits: 128
  send_count: 28134533
  recv_count: 8553649
  drop_count: 0
  refcount: 1
    - primary nid: 172.21.52.127@o2ib
  Multi-Rail: True
  peer ni:
    - nid: 172.21.52.127@o2ib
  state: NA
  max_ni_tx_credits: 128
  available_tx_credits: 128
  min_tx_credits: 97
  tx_q_num_of_buf: 0
  available_rtr_credits: 128
  min_rtr_credits: 128
  send_count: 13505518
  recv_count: 6106498
  drop_count: 0
  refcount: 1
    - primary nid: 172.21.52.128@o2ib
  Multi-Rail: True
  peer ni:
    - nid: 172.21.52.128@o2ib
  state: NA
  max_ni_tx_credits: 128
  available_tx_credits: 128
  min_tx_credits: -751
  tx_q_num_of_buf: 0
  available_rtr_credits: 128
  min_rtr_credits: 128
  send_count: 17672565
  recv_count: 13195155
  drop_count: 0
 

Re: [lustre-discuss] luster 2.10.3 lnetctl configurations not persisting through reboot

2018-04-19 Thread Riccardo Veraldi
this is what I do.

lnetctl export > /etc/lnet.conf

systemctl enable lnet
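
In full, starting from an unconfigured node, the sequence is roughly the
following (a sketch; the net names and interfaces are the ones from your
message, and it assumes the lnet systemd service imports /etc/lnet.conf
when it starts):

lnetctl lnet configure
lnetctl net add --net o2ib0 --if ib1
lnetctl net add --net o2ib1 --if ib0
lnetctl set routing 1
lnetctl export > /etc/lnet.conf
systemctl enable lnet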



On 4/17/18 1:37 PM, Kurt Strosahl wrote:
> I configured an lnet router today with luster 2.10.3 as the lustre software.  
> I then connfigured the lnet router using the following lnetctl commands
>
>
> lnetctl lnet configure
> lnetctl net add --net o2ib0 --if ib1
> lnetctl net add --net o2ib1 --if ib0
> lnetctl set routing 1
>
> When I rebooted the router the configuration didn't stick.  Is there a way to 
> make this persist through a reboot?
>
> I also notices that when I do an export of the lnetctl configuration it 
> contains
>
> - net type: o2ib1
>   local NI(s):
> - nid: @o2ib1
>   status: up
>   interfaces:
>   0: ib0
>   statistics:
>   send_count: 2958318
>   recv_count: 2948077
>   drop_count: 0
>   tunables:
>   peer_timeout: 180
>   peer_credits: 8
>   peer_buffer_credits: 0
>   credits: 256
>   lnd tunables:
>   peercredits_hiw: 4
>   map_on_demand: 256
>   concurrent_sends: 8
>   fmr_pool_size: 512
>   fmr_flush_trigger: 384
>   fmr_cache: 1
>   ntx: 512
>   conns_per_peer: 1
>   tcp bonding: 0
>   dev cpt: 0
>   CPT: "[0,1]"
>
> Is this expected behavior?
>
> w/r,
> Kurt J. Strosahl
> System Administrator: Lustre, HPC
> Scientific Computing Group, Thomas Jefferson National Accelerator Facility
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] LNET Multi-rail

2018-04-16 Thread Riccardo Veraldi
On 4/10/18 7:15 AM, Hans Henrik Happe wrote:
> Thanks for the info. A few observations I found so far:
>
> - I think LU-10297 has solved my stability issues.
> - lustre.conf does work with comma separation of interfaces. I.e.
> o2ib(ib0,ib1). However, peers need to be configured with ldev.conf or
> lnetctl.
Myself, I just use lnet.conf (via lnetctl) and no longer use lustre.conf;
I define all my interfaces and peers there and they are loaded from lnet.conf.

> - Defining peering ('lnetctl peer add' and ARP settings) on the client
> only, seems to make  multi-rail work both ways.
>
> I'm a bit puzzled by the last observation. I expected that both ends
> needed to define peers? The client NID does not show as multi-rail
> (lnetctl peer show) on the server.
>
> Cheers,
> Hans Henrik
>
> On 14-03-2018 03:00, Riccardo Veraldi wrote:
>> it works for me but you have to set up correctly lnet.conf either
>> manually or using  lnetctl to add peers. Then you export your
>> configuration in lnet.conf
>> and it will be loaded at reboot. I had to add my peers manually, I think
>> peer auto discovery is not yet operational on 2.10.3.
>> I suppose you are not using anymore lustre.conf to configure interfaces
>> (ib,tcp) and that you are using the new Lustre DLC style:
>>
>> http://wiki.lustre.org/Dynamic_LNET_Configuration
>>
>> Also I do not know if you did this yet but you should configure ARP
>> settings and also rt_tables for your ib interfaces if you use
>> multi-rail.
>> Here is an example. I had to do that to have things working properly:
>>
>> https://wiki.hpdd.intel.com/display/LNet/MR+Cluster+Setup
>>
>> You may also want to check that your IB interfaces (if you have a dual
>> port infiniband like I have) can really double the performance when you
>> enable both of them.
>> The infiniband PCIe card bandwidth has to be capable of feeding enough
>> traffic to both dual ports or it will just be useful as a fail over
>> device,
>> without improving the speed as you may want to.
>>
>> In my configuration fail over is working. If I disconnect one port, the
>> other will still work. Of course if you disconnect it when traffic is
>> going through
>> you may have a problem with that stream of data. But new traffic will be
>> handled correctly. I do not know if there is a way to avoid this, I am
>> just talking about my experience and as I said I Am more interested in
>> performance than fail over.
>>
>>
>> Riccardo
>>
>>
>> On 3/13/18 8:05 AM, Hans Henrik Happe wrote:
>>> Hi,
>>>
>>> I'm testing LNET multi-rail with 2.10.3 and I ran into some questions
>>> that I couldn't find in the documentation or elsewhere.
>>>
>>> As I understand the design document "Dynamic peer discovery" will make
>>> it possible to discover multi-rail peer without adding them manually?
>>> Is that functionality in 2.10.3?
>>>
>>> Will failover work without doing anything special? I've tested with
>>> two IB ports and unplugging resulted in no I/O from client and
>>> replugging didn't resolve it.
>>>
>>> How do I make and active/passive setup? One example I would really
>>> like to see in the documentation, is the obvious o2ib-tcp combination,
>>> where tcp is used if o2ib is down and fails back if it comes op again.
>>>
>>> Anyone using MR in production? Done at bit of testing with dual ib on
>>> both server and client and had a few crashes.
>>>
>>> Cheers,
>>> Hans Henrik
>>> ___
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>
>>
>>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] bad performance with Lustre/ZFS on NVMe SSD

2018-04-12 Thread Riccardo Veraldi
Yes, I tested every single disk and also the disks in a raidz pool
without Lustre.
The disks perform to spec, 1.2TB each and up to 6GB/s in the zpool.
When using Lustre, the zpool performs really badly, no more than 1.5GB/s.

I then configured one OST per disk without any raidz (6 OSTs total).
I can scale performance up by distributing processes across OSTs in
this way, but if I use striping across all OSTs
instead of manually binding processes to a specific OST, the performance
decreases.
Also, running a single process on a single OST I can never get more than
700MB/s, while I can reach 1.2GB/s using at least 4 processes on the same
OST.

I did a test using obdfilter-survey; this is what I got:

ost  1 sz 524288000K rsz 1024K obj    4 thr    4 write 4872.92 [1525.83,
6120.75]

I did run Lnet selftest and I got 6GB/s using FDR.
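
For reference, a selftest run of this kind can be driven with the lst
utility along these lines (a sketch; the NIDs are examples taken from
this setup and the 1M bulk size is illustrative):

modprobe lnet_selftest
export LST_SESSION=$$
lst new_session bulk_rw
lst add_group servers 172.21.52.84@o2ib
lst add_group clients 172.21.52.126@o2ib
lst add_batch bulk
lst add_test --batch bulk --from clients --to servers brw write size=1M
lst run bulk
lst stat clients servers
lst end_session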

But when I write from the client side the performance drops
dramatically, especially when using Lustre on raidz.

So I was wondering if there is any RPC parameter setting that I need to
tune to get better performance out of Lustre ?

thank you

On 4/9/18 4:15 PM, Dilger, Andreas wrote:
> On Apr 6, 2018, at 23:04, Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> 
> wrote:
>> So I'm struggling since months with these low performances on Lsutre/ZFS.
>>
>> Looking for hints.
>>
>> 3 OSSes, RHEL 74  Lustre 2.10.3 and zfs 0.7.6
>>
>> each OSS has one  OST raidz
>>
>>   pool: drpffb-ost01
>>  state: ONLINE
>>   scan: none requested
>>   trim: completed on Fri Apr  6 21:53:04 2018 (after 0h3m)
>> config:
>>
>> NAME  STATE READ WRITE CKSUM
>> drpffb-ost01  ONLINE   0 0 0
>>   raidz1-0ONLINE   0 0 0
>> nvme0n1   ONLINE   0 0 0
>> nvme1n1   ONLINE   0 0 0
>> nvme2n1   ONLINE   0 0 0
>> nvme3n1   ONLINE   0 0 0
>> nvme4n1   ONLINE   0 0 0
>> nvme5n1   ONLINE   0 0 0
>>
>> while the raidz without Lustre perform well at 6GB/s (1GB/s per disk),
>> with Lustre on top of it performances are really poor.
>> most of all they are not stable at all and go up and down between
>> 1.5GB/s and 6GB/s. I Tested with obfilter-survey
>> LNET is ok and working at 6GB/s (using infiniband FDR)
>>
>> What could be the cause of OST performance going up and down like a
>> roller coaster ?
> Riccardo,
> to take a step back for a minute, have you tested all of the devices
> individually, and also concurrently with some low-level tool like
> sgpdd or vdbench?  After that is known to be working, have you tested
> with obdfilter-survey locally on the OSS, then remotely on the client(s)
> so that we can isolate where the bottleneck is being hit.
>
> Cheers, Andreas
>
>
>> for reference here are few considerations:
>>
>> filesystem parameters:
>>
>> zfs set mountpoint=none drpffb-ost01
>> zfs set sync=disabled drpffb-ost01
>> zfs set atime=off drpffb-ost01
>> zfs set redundant_metadata=most drpffb-ost01
>> zfs set xattr=sa drpffb-ost01
>> zfs set recordsize=1M drpffb-ost01
>>
>> NVMe SSD are  4KB/sector
>>
>> ashift=12
>>
>>
>> ZFS module parameters
>>
>> options zfs zfs_prefetch_disable=1
>> options zfs zfs_txg_history=120
>> options zfs metaslab_debug_unload=1
>> #
>> options zfs zfs_vdev_scheduler=deadline
>> options zfs zfs_vdev_async_write_active_min_dirty_percent=20
>> #
>> options zfs zfs_vdev_scrub_min_active=48
>> options zfs zfs_vdev_scrub_max_active=128
>> #options zfs zfs_vdev_sync_write_min_active=64
>> #options zfs zfs_vdev_sync_write_max_active=128
>> #
>> options zfs zfs_vdev_sync_write_min_active=8
>> options zfs zfs_vdev_sync_write_max_active=32
>> options zfs zfs_vdev_sync_read_min_active=8
>> options zfs zfs_vdev_sync_read_max_active=32
>> options zfs zfs_vdev_async_read_min_active=8
>> options zfs zfs_vdev_async_read_max_active=32
>> options zfs zfs_top_maxinflight=320
>> options zfs zfs_txg_timeout=30
>> options zfs zfs_dirty_data_max_percent=40
>> options zfs zfs_vdev_scheduler=deadline
>> options zfs zfs_vdev_async_write_min_active=8
>> options zfs zfs_vdev_async_write_max_active=32
>>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] latest kernel version supported by Lustre ?

2018-04-09 Thread Riccardo Veraldi

thank you.
Yes I could build lustre 2.10.3 on Kernel 4.4 without any trouble.
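
In case it is useful to others, a build against an elrepo kernel can be
done along these lines (a sketch; the kernel and zfs source paths are
examples, and --disable-ldiskfs is used here because the backend is ZFS):

sh ./autogen.sh
./configure --with-linux=/usr/src/kernels/4.4.120-1.el7.elrepo.x86_64 \
            --with-zfs=/usr/src/zfs-0.7.6 \
            --disable-ldiskfs
make rpms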


On 4/9/18 6:52 AM, Jones, Peter A wrote:
> Ah sorry about that– I was reading Andreas’s response rather than
> Riccardo’s original request. Thanks for pointing out my error Patrick.
> In that case we are tracking this less closely and the latest that I
> am aware of is Ubuntu 16 with a 4.4 kernel should work for 2.11. This
> was not an officially supported option but I know that some people
> have used this.
>
> On 2018-04-09, 6:11 AM, "Patrick Farrell" <p...@cray.com
> <mailto:p...@cray.com>> wrote:
>
> Peter,
>
> Unfortunately, Riccardo was asking about server support.
>
> - Patrick
>
> 
> *From:* lustre-discuss <lustre-discuss-boun...@lists.lustre.org
> <mailto:lustre-discuss-boun...@lists.lustre.org>> on behalf of
> Jones, Peter A <peter.a.jo...@intel.com
> <mailto:peter.a.jo...@intel.com>>
> *Sent:* Monday, April 9, 2018 7:24:52 AM
> *To:* Dilger, Andreas; Riccardo Veraldi
> *Cc:* lustre-discuss@lists.lustre.org
> <mailto:lustre-discuss@lists.lustre.org>
> *Subject:* Re: [lustre-discuss] latest kernel version supported by
> Lustre ?
>  
> We’re more up to date than that - we have 4.12 client support in
> 2.10.3 and 2.11 (see LU-9558). We’re tracking 4.14 client support
> under LU-10560 and the last couple of patches just missed out on
> 2.11 but should land to master in the coming days. Work to track
> 4.15 is underway under LU-10805. James Simmons may well elaborate.
>
>
>
>
> On 2018-04-08, 12:17 PM, "lustre-discuss on behalf of Dilger,
> Andreas" <lustre-discuss-boun...@lists.lustre.org
> <mailto:lustre-discuss-boun...@lists.lustre.org> on behalf of
> andreas.dil...@intel.com <mailto:andreas.dil...@intel.com>> wrote:
>
> >What version of Lustre?  I think 2.11 clients work with something
> like 4.8? kernels, while 2.10 works with 4.4?  Sorry, I can't
> check the specifics right now.
> >
> >If you need a specific kernel, the best thing to do is try the
> configure/build step for Lustre with that kernel, and then check
>     Jira/Gerrit for tickets for each build failure you hit.
> >
> >It may be that there are some unlanded patches that can get you a
> running client.
> >
> >Cheers, Andreas
> >
> >> On Apr 7, 2018, at 09:48, Riccardo Veraldi
> <riccardo.vera...@cnaf.infn.it
> <mailto:riccardo.vera...@cnaf.infn.it>> wrote:
> >>
> >> Hello,
> >>
> >> if I would like to use kernel 4.* from elrepo on RHEL74 for the
> lustre
> >> OSSes what is the latest supported kernel 4 version  by Lustre ?
> >>
> >> thank you
> >>
> >>
> >> Rick
> >>
> >> ___
> >> lustre-discuss mailing list
> >> lustre-discuss@lists.lustre.org
> <mailto:lustre-discuss@lists.lustre.org>
> >> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> >___
> >lustre-discuss mailing list
> >lustre-discuss@lists.lustre.org
> <mailto:lustre-discuss@lists.lustre.org>
> >http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> <mailto:lustre-discuss@lists.lustre.org>
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] latest kernel version supported by Lustre ?

2018-04-09 Thread Riccardo Veraldi

The Lustre dkms build fails on both 4.9 and 4.12,
so I went back to 4.4.


On 4/9/18 5:24 AM, Jones, Peter A wrote:
> We’re more up to date than that - we have 4.12 client support in 2.10.3 and 
> 2.11 (see LU-9558). We’re tracking 4.14 client support under LU-10560 and the 
> last couple of patches just missed out on 2.11 but should land to master in 
> the coming days. Work to track 4.15 is underway under LU-10805. James Simmons 
> may well elaborate.
>
>
>
>
> On 2018-04-08, 12:17 PM, "lustre-discuss on behalf of Dilger, Andreas" 
> <lustre-discuss-boun...@lists.lustre.org on behalf of 
> andreas.dil...@intel.com> wrote:
>
>> What version of Lustre?  I think 2.11 clients work with something like 4.8? 
>> kernels, while 2.10 works with 4.4?  Sorry, I can't check the specifics 
>> right now. 
>>
>> If you need a specific kernel, the best thing to do is try the 
>> configure/build step for Lustre with that kernel, and then check Jira/Gerrit 
>> for tickets for each build failure you hit. 
>>
>> It may be that there are some unlanded patches that can get you a running 
>> client. 
>>
>> Cheers, Andreas
>>
>>> On Apr 7, 2018, at 09:48, Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> 
>>> wrote:
>>>
>>> Hello,
>>>
>>> if I would like to use kernel 4.* from elrepo on RHEL74 for the lustre
>>> OSSes what is the latest supported kernel 4 version  by Lustre ?
>>>
>>> thank you
>>>
>>>
>>> Rick
>>>
>>> ___
>>> lustre-discuss mailing list
>>> lustre-discuss@lists.lustre.org
>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] latest kernel version supported by Lustre ?

2018-04-07 Thread Riccardo Veraldi
Hello,

if I wanted to use a kernel 4.* from elrepo on RHEL74 for the Lustre
OSSes, what is the latest kernel 4 version supported by Lustre ?

thank you


Rick

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] bad performance with Lustre/ZFS on NVMe SSD

2018-04-06 Thread Riccardo Veraldi
I have been struggling for months with these low performances on Lustre/ZFS.

Looking for hints.

3 OSSes, RHEL 74  Lustre 2.10.3 and zfs 0.7.6

each OSS has one  OST raidz

  pool: drpffb-ost01
 state: ONLINE
  scan: none requested
  trim: completed on Fri Apr  6 21:53:04 2018 (after 0h3m)
config:

    NAME  STATE READ WRITE CKSUM
    drpffb-ost01  ONLINE   0 0 0
      raidz1-0    ONLINE   0 0 0
        nvme0n1   ONLINE   0 0 0
        nvme1n1   ONLINE   0 0 0
        nvme2n1   ONLINE   0 0 0
        nvme3n1   ONLINE   0 0 0
        nvme4n1   ONLINE   0 0 0
        nvme5n1   ONLINE   0 0 0

While the raidz without Lustre performs well at 6GB/s (1GB/s per disk),
with Lustre on top of it performance is really poor.
Worse, it is not stable at all and goes up and down between
1.5GB/s and 6GB/s (tested with obdfilter-survey).
LNET is ok and working at 6GB/s (using InfiniBand FDR).

What could be the cause of the OST performance going up and down like a
roller coaster ?
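
The obdfilter-survey pass is typically driven like this from the OSS (a
sketch; the OST target name, size and thread/object ranges below are only
examples):

targets="drpffb-OST0001" size=512000 nobjlo=1 nobjhi=4 thrlo=1 thrhi=4 \
    case=disk obdfilter-survey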

For reference, here are a few configuration details:

filesystem parameters:

zfs set mountpoint=none drpffb-ost01
zfs set sync=disabled drpffb-ost01
zfs set atime=off drpffb-ost01
zfs set redundant_metadata=most drpffb-ost01
zfs set xattr=sa drpffb-ost01
zfs set recordsize=1M drpffb-ost01

NVMe SSD are  4KB/sector

ashift=12


ZFS module parameters

options zfs zfs_prefetch_disable=1
options zfs zfs_txg_history=120
options zfs metaslab_debug_unload=1
#
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_active_min_dirty_percent=20
#
options zfs zfs_vdev_scrub_min_active=48
options zfs zfs_vdev_scrub_max_active=128
#options zfs zfs_vdev_sync_write_min_active=64
#options zfs zfs_vdev_sync_write_max_active=128
#
options zfs zfs_vdev_sync_write_min_active=8
options zfs zfs_vdev_sync_write_max_active=32
options zfs zfs_vdev_sync_read_min_active=8
options zfs zfs_vdev_sync_read_max_active=32
options zfs zfs_vdev_async_read_min_active=8
options zfs zfs_vdev_async_read_max_active=32
options zfs zfs_top_maxinflight=320
options zfs zfs_txg_timeout=30
options zfs zfs_dirty_data_max_percent=40
options zfs zfs_vdev_scheduler=deadline
options zfs zfs_vdev_async_write_min_active=8
options zfs zfs_vdev_async_write_max_active=32


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] LNET Multi-rail

2018-03-13 Thread Riccardo Veraldi
It works for me, but you have to set up lnet.conf correctly, either
manually or by using lnetctl to add the peers. Then you export your
configuration to lnet.conf
and it will be loaded at reboot. I had to add my peers manually; I think
peer auto discovery is not yet operational on 2.10.3.
I suppose you are not using anymore lustre.conf to configure interfaces
(ib,tcp) and that you are using the new Lustre DLC style:

http://wiki.lustre.org/Dynamic_LNET_Configuration
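
Concretely, adding a multi-rail peer and persisting it boils down to
something like this (a sketch; the NIDs are examples of a client with two
IB interfaces on the o2ib net):

lnetctl peer add --prim_nid 172.21.52.124@o2ib \
        --nid 172.21.52.124@o2ib,172.21.52.125@o2ib
lnetctl export > /etc/lnet.conf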

Also, I do not know if you did this yet, but you should configure the ARP
settings and also rt_tables for your ib interfaces if you use multi-rail;
I had to do that to have things working properly. Here is an example:

https://wiki.hpdd.intel.com/display/LNet/MR+Cluster+Setup
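
In outline it amounts to per-interface ARP tuning plus source-based
routing rules (a sketch only; the exact sysctl values come from that page,
and the addresses below are illustrative, assuming ib0/ib1 both sit on
172.21.52.0/24):

sysctl -w net.ipv4.conf.ib0.arp_ignore=1
sysctl -w net.ipv4.conf.ib1.arp_ignore=1
sysctl -w net.ipv4.conf.all.rp_filter=0
# one routing table per interface, with the names registered in
# /etc/iproute2/rt_tables
ip route add 172.21.52.0/24 dev ib0 proto kernel scope link \
    src 172.21.52.124 table ib0tab
ip rule add from 172.21.52.124 table ib0tab
ip route add 172.21.52.0/24 dev ib1 proto kernel scope link \
    src 172.21.52.125 table ib1tab
ip rule add from 172.21.52.125 table ib1tab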

You may also want to check that your IB interfaces (if you have a dual
port InfiniBand card like I have) can really double the performance when
you enable both of them.
The card's PCIe bandwidth has to be capable of feeding enough traffic to
both ports, otherwise the second port is only useful as a failover device
and will not improve the speed as you might want.

In my configuration failover is working. If I disconnect one port, the
other will still work. Of course, if you disconnect it while traffic is
going through,
you may have a problem with that stream of data, but new traffic will be
handled correctly. I do not know if there is a way to avoid this; I am
just talking about my experience and, as I said, I am more interested in
performance than failover.


Riccardo


On 3/13/18 8:05 AM, Hans Henrik Happe wrote:
> Hi,
>
> I'm testing LNET multi-rail with 2.10.3 and I ran into some questions
> that I couldn't find in the documentation or elsewhere.
>
> As I understand the design document "Dynamic peer discovery" will make
> it possible to discover multi-rail peer without adding them manually?
> Is that functionality in 2.10.3?
>
> Will failover work without doing anything special? I've tested with
> two IB ports and unplugging resulted in no I/O from client and
> replugging didn't resolve it.
>
> How do I make and active/passive setup? One example I would really
> like to see in the documentation, is the obvious o2ib-tcp combination,
> where tcp is used if o2ib is down and fails back if it comes op again.
>
> Anyone using MR in production? Done at bit of testing with dual ib on
> both server and client and had a few crashes.
>
> Cheers,
> Hans Henrik
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre-dkms problems

2018-02-21 Thread Riccardo Veraldi
On 2/21/18 6:04 AM, Faccini, Bruno wrote:
> Hello Ricardo,
>
>> [root@psludev02 SPECS]# rpmbuild --with zfs --without ldiskfs -bb 
>> lustre-dkms.spec
> from where/which RPM did this lustre-dkms.spec file come from?

https://downloads.hpdd.intel.com/public/lustre/latest-release/el7.4.1708/server/SRPMS/lustre-dkms-2.10.3-1.el7.src.rpm

the problem is also in the default distributed ready to install RPM

https://downloads.hpdd.intel.com/public/lustre/latest-release/el7.4.1708/server/RPMS/x86_64/lustre-dkms-2.10.3-1.el7.noarch.rpm

it won't build the kernel module


>
> Bruno.
>
>
>> On Feb 21, 2018, at 3:11 AM, Riccardo Veraldi 
>> <riccardo.vera...@cnaf.infn.it> wrote:
>>
>> Hello.
>>
>> I have problems installing the lustre-dkms package for Lustre 2.10.3
>> after building it from SRPMS.
>>
>> the same problem occurs with lustre-dkms-2.10.3-1.el7.noarch.rpm
>> downloaded from the official Lustre repo.
>>
>> There is an error and the lustre module is not built.
>>
>> RHEL74 Linux 3.10.0-693.17.1.el7.x86_64
>>
>> [root@psludev02 noarch]# yum localinstall
>> lustre-dkms-2.10.3-1.el7.noarch.rpm
>> Loaded plugins: product-id, search-disabled-repos
>> Examining lustre-dkms-2.10.3-1.el7.noarch.rpm:
>> lustre-dkms-2.10.3-1.el7.noarch
>> Marking lustre-dkms-2.10.3-1.el7.noarch.rpm to be installed
>> Resolving Dependencies
>> --> Running transaction check
>> ---> Package lustre-dkms.noarch 0:2.10.3-1.el7 will be installed
>> --> Finished Dependency Resolution
>>
>> Dependencies Resolved
>>
>> 
>>  Package   Arch Version   
>> Repository  Size
>> 
>> Installing:
>>  lustre-dkms   noarch   2.10.3-1.el7  
>> /lustre-dkms-2.10.3-1.el7.noarch34 M
>>
>> Transaction Summary
>> 
>> Install  1 Package
>>
>> Total size: 34 M
>> Installed size: 34 M
>> Is this ok [y/d/N]: y
>> Downloading packages:
>> Running transaction check
>> Running transaction test
>> Transaction test succeeded
>> Running transaction
>>   Installing :
>> lustre-dkms-2.10.3-1.el7.noarch  
>> 
>> 1/1
>> Loading new lustre-2.10.3 DKMS files...
>> Building for 3.10.0-693.17.1.el7.x86_64
>> Building initial module for 3.10.0-693.17.1.el7.x86_64
>> Error! Bad return status for module build on kernel:
>> 3.10.0-693.17.1.el7.x86_64 (x86_64)
>> Consult /var/lib/dkms/lustre/2.10.3/build/make.log for more information.
>> warning: %post(lustre-dkms-2.10.3-1.el7.noarch) scriptlet failed, exit
>> status 10
>> Non-fatal POSTIN scriptlet failure in rpm package
>> lustre-dkms-2.10.3-1.el7.noarch
>>   Verifying  :
>> lustre-dkms-2.10.3-1.el7.noarch  
>> 
>> 1/1
>>
>> Installed:
>>   lustre-dkms.noarch
>> 0:2.10.3-1.el7   
>>   
>>
>>
>> Complete!
>>
>> the log says
>>
>> DKMS make.log for lustre-2.10.3 for kernel 3.10.0-693.17.1.el7.x86_64
>> (x86_64)
>> Tue Feb 20 17:58:56 PST 2018
>> make: *** No targets specified and no makefile found.  Stop.
>>
>> and in fact in /var/lib/dkms/lustre/2.10.3/build/ there is no Makefile
>> but only Makefile.in
>>
>> seems like autogen is not called.
>>
>> this is how I built the lustre-dkms rpm
>>
>> [root@psludev02 SPECS]# rpmbuild --with zfs --without ldiskfs -bb 
>> lustre-dkms.spec
>> error: Macro %mkconf_options has empty body
>> error: Macro %mkconf_options has empty body
>> Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.3Jwnia
>> + umask 022
>> + cd /root/rpmbuild/BUILD
>> + cd /root/rpmbuild/BUILD
>> + rm -rf lustre-2.10.3
>> + /usr/bin/gzip -dc /root/rpmbuild/SOURCES/lustre-2.10.3.tar.gz
>> + /usr/bin/tar -xf -
>> + STATUS=0
>> + '[' 0 -ne 0 ']'
>> + cd lustre-2.10.3
>> + /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
>> + exit 0
>> Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.K47nWk
>> + umask 022
>> + cd /root/rpmbuild/BUILD
>> + cd lustre-2.10.3
>> + lustre/scripts/dkms.mkconf -n lustre -v 2.10.3 -f dkms.conf
>> '%{mkconf_options}'
>> + exit 0

[lustre-discuss] lustre-dkms problems

2018-02-20 Thread Riccardo Veraldi
Hello.

I have problems installing the lustre-dkms package for Lustre 2.10.3
after building it from SRPMS.

the same problem occurs with lustre-dkms-2.10.3-1.el7.noarch.rpm
downloaded from the official Lustre repo.

There is an error and the lustre module is not built.

RHEL74 Linux 3.10.0-693.17.1.el7.x86_64

[root@psludev02 noarch]# yum localinstall
lustre-dkms-2.10.3-1.el7.noarch.rpm
Loaded plugins: product-id, search-disabled-repos
Examining lustre-dkms-2.10.3-1.el7.noarch.rpm:
lustre-dkms-2.10.3-1.el7.noarch
Marking lustre-dkms-2.10.3-1.el7.noarch.rpm to be installed
Resolving Dependencies
--> Running transaction check
---> Package lustre-dkms.noarch 0:2.10.3-1.el7 will be installed
--> Finished Dependency Resolution

Dependencies Resolved


 Package   Arch Version   
Repository  Size

Installing:
 lustre-dkms   noarch   2.10.3-1.el7  
/lustre-dkms-2.10.3-1.el7.noarch    34 M

Transaction Summary

Install  1 Package

Total size: 34 M
Installed size: 34 M
Is this ok [y/d/N]: y
Downloading packages:
Running transaction check
Running transaction test
Transaction test succeeded
Running transaction
  Installing :
lustre-dkms-2.10.3-1.el7.noarch 
 
1/1
Loading new lustre-2.10.3 DKMS files...
Building for 3.10.0-693.17.1.el7.x86_64
Building initial module for 3.10.0-693.17.1.el7.x86_64
Error! Bad return status for module build on kernel:
3.10.0-693.17.1.el7.x86_64 (x86_64)
Consult /var/lib/dkms/lustre/2.10.3/build/make.log for more information.
warning: %post(lustre-dkms-2.10.3-1.el7.noarch) scriptlet failed, exit
status 10
Non-fatal POSTIN scriptlet failure in rpm package
lustre-dkms-2.10.3-1.el7.noarch
  Verifying  :
lustre-dkms-2.10.3-1.el7.noarch 
 
1/1

Installed:
  lustre-dkms.noarch
0:2.10.3-1.el7  
   


Complete!

the log says

DKMS make.log for lustre-2.10.3 for kernel 3.10.0-693.17.1.el7.x86_64
(x86_64)
Tue Feb 20 17:58:56 PST 2018
make: *** No targets specified and no makefile found.  Stop.

and in fact in /var/lib/dkms/lustre/2.10.3/build/ there is no Makefile
but only Makefile.in

It seems autogen is not called.
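
One way to test that theory is to generate the configure script by hand in
the DKMS source tree and rebuild the module (an untested sketch; the
/usr/src path is where the dkms package installs the source):

cd /usr/src/lustre-2.10.3
sh ./autogen.sh
dkms build -m lustre -v 2.10.3 -k $(uname -r)
dkms install -m lustre -v 2.10.3 -k $(uname -r)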

this is how I built the lustre-dkms rpm

[root@psludev02 SPECS]# rpmbuild --with zfs --without ldiskfs -bb 
lustre-dkms.spec
error: Macro %mkconf_options has empty body
error: Macro %mkconf_options has empty body
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.3Jwnia
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd /root/rpmbuild/BUILD
+ rm -rf lustre-2.10.3
+ /usr/bin/gzip -dc /root/rpmbuild/SOURCES/lustre-2.10.3.tar.gz
+ /usr/bin/tar -xf -
+ STATUS=0
+ '[' 0 -ne 0 ']'
+ cd lustre-2.10.3
+ /usr/bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.K47nWk
+ umask 022
+ cd /root/rpmbuild/BUILD
+ cd lustre-2.10.3
+ lustre/scripts/dkms.mkconf -n lustre -v 2.10.3 -f dkms.conf
'%{mkconf_options}'
+ exit 0
Executing(%install): /bin/sh -e /var/tmp/rpm-tmp.OSRfCv
+ umask 022
+ cd /root/rpmbuild/BUILD
+ '[' /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64 '!=' / ']'
+ rm -rf /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64
++ dirname /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64
+ mkdir -p /root/rpmbuild/BUILDROOT
+ mkdir /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64
+ cd lustre-2.10.3
+ '[' /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64 '!=' / ']'
+ rm -rf /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64
+ mkdir -p /root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64/usr/src/
+ cp -rfp /root/rpmbuild/BUILD/lustre-2.10.3
/root/rpmbuild/BUILDROOT/lustre-dkms-2.10.3-1.el7.x86_64/usr/src/
+ /usr/lib/rpm/find-debuginfo.sh --strict-build-id -m --run-dwz
--dwz-low-mem-die-limit 1000 --dwz-max-die-limit 11000
/root/rpmbuild/BUILD/lustre-2.10.3
/usr/lib/rpm/sepdebugcrcfix: Updated 0 CRC32s, 0 CRC32s did match.
+ /usr/lib/rpm/check-buildroot
+ /usr/lib/rpm/redhat/brp-compress
+ /usr/lib/rpm/redhat/brp-strip-static-archive /usr/bin/strip
+ /usr/lib/rpm/brp-python-bytecompile /usr/bin/python 1
+ /usr/lib/rpm/redhat/brp-python-hardlink
+ /usr/lib/rpm/redhat/brp-java-repack-jars
Processing files: lustre-dkms-2.10.3-1.el7.noarch
Provides: kmod-lustre = 2.10.3 lustre-dkms = 2.10.3-1.el7 lustre-modules
= 2.10.3 lustre-osd lustre-osd-zfs = 2.10.3
Requires(interp): /bin/sh /bin/sh
Requires(rpmlib): rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1 rpmlib(PartialHardlinkSets) <= 4.0.4-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
Requires(post): /bin/sh
Requires(preun): /bin/sh
Requires: /bin/bash /bin/sh /usr/bin/env 

[lustre-discuss] Project quota Lustre/ZFS

2018-02-15 Thread Riccardo Veraldi
Hello,
Is project quota supported on Lustre 2.10.3/ZFS?
Apparently LU-7991 adds it as a feature.
Can anyone confirm that it works ?

thank you

Riccardo


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Issues compiling lustre-client 2.8 on CentOS 7.4

2018-01-08 Thread Riccardo Veraldi
On 1/8/18 12:26 AM, Scott Wood wrote:
>
> Thanks for the feedback, Riccardo.  I understand not all versions are
> certified compatible but knowing that some folks have had success
> helps build some confidence.  I tried building 2.8.0, the latest from
> the 2.8 branch, the latest from the 2.9 branch, 2.10.2, and the latest
> from master (2.10.56_85_g76afb10-1). Only the latter two succeeded.
>
When you build the client it should not matter whether you use ldiskfs or
ZFS on the server side.
In my case the 2.10.1 client worked on an older 2.5 Lustre server (ldiskfs)
and on a newer 2.9 server (ZFS backend).
I have not tried the 2.10.2 client yet, though.
>
>
> I'll run some tests and hold off to see if other chime in with known
> successes or known issues.  I'm ldiskfs, not zfs.  Tcp only, not
> infiniband or RDMA.  No lnet routers. Independent MGT and MDT, rather
> than combined.  48 OSTs and about 70 clients.  Pretty basic config. 
> Fingers crossed on more similar success stories.
>
>
> Cheers,
>
> Scott
>
> ----
> *From:* Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>
> *Sent:* Monday, 8 January 2018 5:28:42 PM
> *To:* Scott Wood; lustre-discuss@lists.lustre.org
> *Subject:* Re: [lustre-discuss] Issues compiling lustre-client 2.8 on
> CentOS 7.4
>  
> I am running at the moment 2.10.1 clients with any server version down
> to 2.5 without troubles. I know that there is no warranty of full
> interoperability bot so far I did not have problems.
> Not sure if you can run 2.8 on Centos 7.4. You can try to git clone
> the latest source code from 2.8.* and see if it builds on Centos 7.4
>
> On 1/7/18 8:10 PM, Scott Wood wrote:
>>
>> Afternoon, folks,
>>
>>
>> In the interest of patching kernels to mitigate Meltdown security
>> issues on user accessible systems, we're trying to build lustre
>> client rpms for the latest released Centos 7.4
>> kernel, 3.10.0-693.11.6.el7.x86_64.  We're running in to issues
>> compiling though.  As I understand from the docs, as our servers are
>> CentOS6 and running the Intel distributed 2.7.0 server binaries, the
>> newest "officially" supported client versions are lustre 2.8.  
>>
>>
>> Has anyone run the 2.10.2 (or 2.10.x) clients connecting to 2.7.0
>> servers (as we have successfully built all client rpms from a current
>> git checkout, and from a 2.10.2 checkout)?  Alternatively, is there a
>> known 2.8.x tag that builds successfully on CentOS7.4?  Is there a
>> third option that folks would propose?
>>
>>
>> build errors visible at https://pastebin.com/izF3bXg3
>>
>>
>> Cheers
>>
>> Scott
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org <mailto:lustre-discuss@lists.lustre.org>
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Issues compiling lustre-client 2.8 on CentOS 7.4

2018-01-07 Thread Riccardo Veraldi
At the moment I am running 2.10.1 clients with any server version down
to 2.5 without trouble. I know that there is no guarantee of full
interoperability, but so far I have not had problems.
I am not sure you can run 2.8 on CentOS 7.4. You can try to git clone the
latest source code from the 2.8.* branch and see if it builds on CentOS 7.4.
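
A client-only build from a tag can be done roughly like this (a sketch;
the tag name and configure options are examples, adjust them to the
release you want):

git clone git://git.whamcloud.com/fs/lustre-release.git
cd lustre-release
git checkout v2_10_2
sh ./autogen.sh
./configure --disable-server
make rpms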

On 1/7/18 8:10 PM, Scott Wood wrote:
>
> Afternoon, folks,
>
>
> In the interest of patching kernels to mitigate Meltdown security
> issues on user accessible systems, we're trying to build lustre client
> rpms for the latest released Centos 7.4
> kernel, 3.10.0-693.11.6.el7.x86_64.  We're running in to issues
> compiling though.  As I understand from the docs, as our servers are
> CentOS6 and running the Intel distributed 2.7.0 server binaries, the
> newest "officially" supported client versions are lustre 2.8.  
>
>
> Has anyone run the 2.10.2 (or 2.10.x) clients connecting to 2.7.0
> servers (as we have successfully built all client rpms from a current
> git checkout, and from a 2.10.2 checkout)?  Alternatively, is there a
> known 2.8.x tag that builds successfully on CentOS7.4?  Is there a
> third option that folks would propose?
>
>
> build errors visible at https://pastebin.com/izF3bXg3
>
>
> Cheers
>
> Scott
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre 2.10.1 + RHEL7 Lock Callback Timer Expired

2017-12-03 Thread Riccardo Veraldi
Hello,
are you using InfiniBand ?
If so, what are the peer credit settings ?

 cat /proc/sys/lnet/nis
 cat /proc/sys/lnet/peers


On 12/3/17 8:38 AM, E.S. Rosenberg wrote:
> Did you find the problem? Were there any useful suggestions off-list?
>
> On Wed, Nov 29, 2017 at 1:34 PM, Charles A Taylor  > wrote:
>
>
> We have a genomics pipeline app (supernova) that fails
> consistently due to the client being evicted on the OSSs with a 
> “lock callback timer expired”.  I doubled “nlm_enqueue_min” across
> the cluster but then the timer simply expired after 200s rather
> than 100s so I don’t think that is the answer.   The syslog/dmesg
> on the client shows no signs of distress and it is a “bigmem”
> machine with 1TB of RAM.
>
> The eviction appears to come while the application is processing a
> large number (~300) of data “chunks” (i.e. files) which occur in
> pairs.
>
> -rw-r--r-- 1 chasman ufhpc 24 Nov 28 23:31
> 
> ./Tdtest915/ASSEMBLER_CS/_ASSEMBLER/_ASM_SN/SHARD_ASM/fork0/join/files/chunk233.sedge_bcs
> -rw-r--r-- 1 chasman ufhpc 34M Nov 28 23:31
> 
> ./Tdtest915/ASSEMBLER_CS/_ASSEMBLER/_ASM_SN/SHARD_ASM/fork0/join/files/chunk233.sedge_asm
>
> I assume the 24-byte file is metadata (an index or some such) and
> the 34M file is the actual data but I’m just guessing since I’m
> completely unfamiliar with the application.
>
> The write error is,
>
>     #define ENOTCONN        107     /* Transport endpoint is not
> connected */
>
> which occurs after the OSS eviction.  This was reproducible under
> 2.5.3.90 as well.  We hoped that upgrading to 2.10.1 would resolve
> the issue but it has not.
>
> This is the first application (in 10 years) we have encountered
> that consistently and reliably fails when run over Lustre.  I’m
> not sure at this point whether this is a bug or tuning issue.
> If others have encountered and overcome something like this, we’d
> be grateful to hear from you.
>
> Regards,
>
> Charles Taylor
> UF Research Computing
>
> OSS:
> --
> Nov 28 23:41:41 ufrcoss28 kernel: LustreError:
> 0:0:(ldlm_lockd.c:334:waiting_locks_callback()) ### lock callback
> timer expired after 201s: evicing client at 10.13.136.74@o2ib  ns:
> filter-testfs-OST002e_UUID lock:
> 880041717400/0x9bd23c8dc69323a1 lrc: 3/0,0 mode: PW/PW res:
> [0x7ef2:0x0:0x0].0x0 rrc: 3 type: EXT [0->18446744073709551615]
> (req 4096->1802239) flags: 0x6400010020 nid: 10.13.136.74@o2ib
> remote: 0xe54f26957f2ac591 expref: 45 pid: 6836 timeout:
> 6488120506 lvb_type: 0
>
> Client:
> ———
> Nov 28 23:41:42 s5a-s23 kernel: LustreError: 11-0:
> testfs-OST002e-osc-88c053fe3800: operation ost_write to node
> 10.13.136.30@o2ib failed: rc = -107
> Nov 28 23:41:42 s5a-s23 kernel: Lustre:
> testfs-OST002e-osc-88c053fe3800: Connection to testfs-OST002e
> (at 10.13.136.30@o2ib) was lost; in progress operations using this
> service will wait for recovery to complete
> Nov 28 23:41:42 s5a-s23 kernel: LustreError: 167-0:
> testfs-OST002e-osc-88c053fe3800: This client was evicted by
> testfs-OST002e; in progress operations using this service will fail.
> Nov 28 23:41:42 s5a-s23 kernel: LustreError: 11-0:
> testfs-OST002c-osc-88c053fe3800: operation ost_punch to node
> 10.13.136.30@o2ib failed: rc = -107
> Nov 28 23:41:42 s5a-s23 kernel: Lustre:
> testfs-OST002c-osc-88c053fe3800: Connection to testfs-OST002c
> (at 10.13.136.30@o2ib) was lost; in progress operations using this
> service will wait for recovery to complete
> Nov 28 23:41:42 s5a-s23 kernel: LustreError: 167-0:
> testfs-OST002c-osc-88c053fe3800: This client was evicted by
> testfs-OST002c; in progress operations using this service will fail.
> Nov 28 23:41:47 s5a-s23 kernel: LustreError: 11-0:
> testfs-OST-osc-88c053fe3800: operation ost_statfs to node
> 10.13.136.23@o2ib failed: rc = -107
> Nov 28 23:41:47 s5a-s23 kernel: Lustre:
> testfs-OST-osc-88c053fe3800: Connection to testfs-OST
> (at 10.13.136.23@o2ib) was lost; in progress operations using this
> service will wait for recovery to complete
> Nov 28 23:41:47 s5a-s23 kernel: LustreError: 167-0:
> testfs-OST0004-osc-88c053fe3800: This client was evicted by
> testfs-OST0004; in progress operations using this service will fail.
> Nov 28 23:43:11 s5a-s23 kernel: Lustre:
> testfs-OST0006-osc-88c053fe3800: Connection restored to
> 10.13.136.24@o2ib (at 10.13.136.24@o2ib)
> Nov 28 23:43:38 s5a-s23 kernel: Lustre:
> testfs-OST002c-osc-88c053fe3800: Connection restored to
> 10.13.136.30@o2ib (at 10.13.136.30@o2ib)
> Nov 28 23:43:45 s5a-s23 kernel: Lustre:
> testfs-OST-osc-88c053fe3800: 

Re: [lustre-discuss] Does lustre 2.10 client support 2.5 server ?

2017-11-27 Thread Riccardo Veraldi
I had a problem between 2.9.0 servers and 2.10.1 clients.
From the 2.10.1 clients I can list directories and files on my
Lustre FS OSTs, but I am not able to access any file for reading or writing.
I solved the problem by decreasing the number of peer credits in ko2iblnd
from 128 to 8 on the client side (see the sketch after the log below).
This worked, but I do not know exactly what is causing the problem.
These are the errors I was getting:

Nov 21 11:23:35 psana101 kernel: Lustre:
2287:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has
failed due to network error: [sent 1511292215/real 1511292215] 
req@8801e8f08000 x1584330612541264/t0(0)
o3->ffb11-OST000e-osc-8806320a3000@172.21.52.66@o2ib2:6/4 lens
608/432 e 0 to 1 dl 1511292227 ref 2 fl Rpc:eX/2/ rc 0/-1
Nov 21 11:23:35 psana101 kernel: Lustre:
2287:0:(client.c:2114:ptlrpc_expire_one_request()) Skipped 366 previous
similar messages
Nov 21 11:23:35 psana101 kernel: Lustre:
ffb11-OST000e-osc-8806320a3000: Connection to ffb11-OST000e (at
172.21.52.66@o2ib2) was lost; in progress operations using this service
will wait for recovery to complete
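
For reference, the client-side change amounts to a ko2iblnd module option
(a sketch; the file name is arbitrary, only peer_credits was changed, and
the module has to be reloaded for it to take effect):

# /etc/modprobe.d/ko2iblnd.conf
options ko2iblnd peer_credits=8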



On 11/9/17 3:13 PM, Dilger, Andreas wrote:
> On Nov 9, 2017, at 05:27, Andrew Elwell  wrote:
>>> My Lustre server is running the version 2.5 and I want to use 2.10 client.
>>> Is this combination supported ? Is there anything that I need to be aware of
>> 2 of our storage appliances (sonnexion 1600 based) run 2.5.1, I've
>> mounted this OK on infiniband clients fine with 2.10.0 and 2.10.1 OK,
>> but a colleague has since had to downgrade some of our clients to
>> 2.9.0 on OPA / KNL hosts as we were seeing strange issues (can't
>> remember the ticket details)
> If people are having problems like this, it would be useful to know the
> details.  If you are using a non-Intel release, you should go through your
> support provider, since they know the most details about what patches are
> in their release.
>
>> We do see the warnings at startup:
>> Lustre: Server MGS version (2.5.1.0) is much older than client.
>> Consider upgrading server (2.10.0)
> This is a standard message if your client/server versions are more than 0.4 
> releases apart.  While Lustre clients and servers negotiate the supported 
> features between them at connection time, so there should be broad 
> interoperability between releases, we only test between the latest major 
> releases (e.g. 2.5.x and 2.7.y, or 2.7.y and 2.10.z), it isn't possible to 
> test interoperability between every release.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Multi-Rail anyone ?

2017-10-09 Thread Riccardo Veraldi
Just curious: is anyone else using Lustre in a multi-rail
configuration ?

thanks

Rick


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre client can't moutn after configuring LNET with lnetctl

2017-09-29 Thread Riccardo Veraldi
On 9/28/17 8:29 PM, Dilger, Andreas wrote:
> Riccardo,
> I'm not an LNet expert, but a number of LNet multi-rail fixes are landed or 
> being worked on for Lustre 2.10.1.  You might try testing the current b2_10 
> to see if that resolves your problems.
You are right, I might end up doing that. Sorry, but I did not understand
whether 2.10.1 is officially out or still a release candidate.
thanks
>
> Cheers, Andreas
>
> On Sep 27, 2017, at 21:22, Riccardo Veraldi <riccardo.vera...@cnaf.infn.it> 
> wrote:
>> Hello.
>>
>> I configure Multi-rail on my lustre environment.
>>
>> MDS: 172.21.42.213@tcp
>> OSS: 172.21.52.118@o2ib
>> 172.21.52.86@o2ib
>> Client: 172.21.52.124@o2ib
>> 172.21.52.125@o2ib
>>
>>  
>> [root@drp-tst-oss10:~]# cat /proc/sys/lnet/peers
>> nid  refs state  last   max   rtr   mintx   min
>> queue
>> 172.21.52.124@o2ib  1NA-1   128   128   128   128   128 0
>> 172.21.52.125@o2ib  1NA-1   128   128   128   128   128 0
>> 172.21.42.213@tcp   1NA-1 8 8 8 8 6 0
>>
>> after configuring multi-rail I can see both infiniband interfaces peers on 
>> the OSS and on the client side. 
>> Anyway before multi-rail lustre client could mount the lustre FS without 
>> problems.
>> Now after multi-rail is set up the client cannot mount anymore the 
>> filesystem.
>>
>> When I mount lustre from the client (fstab entry):
>>
>> 172.21.42.213@tcp:/drplu /drplu lustre noauto,lazystatfs,flock, 0 0
>>
>> the file system cannot be mounted and I got these errors
>>
>> Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.842861] Lustre:
>> 2490:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has
>> failed due to network error: [sent 1506562126/real 1506562126] 
>> req@8808326b2a00 x1579744801849904/t0(0)
>> o400->
>> drplu-OST0001-osc-88085d134800@172.21.52.86@o2ib:28/4
>>  lens
>> 224/224 e 0 to 1 dl 1506562133 ref 1 fl Rpc:eXN/0/ rc 0/-1
>> Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.842872] Lustre:
>> drplu-OST0001-osc-88085d134800: Connection to drplu-OST0001 (at
>> 172.21.52.86@o2ib) was lost; in progress operations using this service
>> will wait for recovery to complete
>> Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.843306] Lustre:
>> drplu-OST0001-osc-88085d134800: Connection restored to
>> 172.21.52.86@o2ib (at 172.21.52.86@o2ib)
>>
>>
>> the mount point appears and disappears every few seconds from "df"
>>
>> I do not have a clue on how to fix. The multi rail capability is important 
>> for me.
>>
>> I have Lustre 2.10.0 both client side and server side.
>> here is my lnet.conf on the lustre client side. The one OSS side is
>> similar just swapped peers for o2ib net.
>>
>> net:
>> - net type: lo
>>   local NI(s):
>> - nid: 0@lo
>>   status: up
>>   statistics:
>>   send_count: 0
>>   recv_count: 0
>>   drop_count: 0
>>   tunables:
>>   peer_timeout: 0
>>   peer_credits: 0
>>   peer_buffer_credits: 0
>>   credits: 0
>>   lnd tunables:
>>   tcp bonding: 0
>>   dev cpt: 0
>>   CPT: "[0]"
>> - net type: o2ib
>>   local NI(s):
>> - nid: 172.21.52.124@o2ib
>>   status: up
>>   interfaces:
>>   0: ib0
>>   statistics:
>>   send_count: 7
>>   recv_count: 7
>>   drop_count: 0
>>   tunables:
>>   peer_timeout: 180
>>   peer_credits: 128
>>   peer_buffer_credits: 0
>>   credits: 1024
>>   lnd tunables:
>>   peercredits_hiw: 64
>>   map_on_demand: 32
>>   concurrent_sends: 256
>>   fmr_pool_size: 2048
>>   fmr_flush_trigger: 512
>>   fmr_cache: 1
>>   ntx: 2048
>>   conns_per_peer: 4
>>   tcp bonding: 0
>>   dev cpt: -1
>>   CPT: "[0]"
>> - nid: 172.21.52.125@o2ib
>>   status: up
>>   interfaces:
>>   0: ib1
>>   statistics:
>>   send_count: 5
>>   recv_count: 5
>> 

Re: [lustre-discuss] E5-2667 or E5-2697A for MDS

2017-09-28 Thread Riccardo Veraldi
Just out of curiosity, how advisable is it to run an MDS on a virtual
machine (oVirt) ?
Are there any performance comparisons/tests available ?
thanks

On 9/28/17 9:49 AM, Dilger, Andreas wrote:
> On Sep 28, 2017, at 04:54, forrest.wc.l...@dell.com wrote:
>> Hello :   
>>  
>> Our customer is going to configure Lustre FS, which will have a lot of small 
>> files to be accessed.
>>  
>> We are to configure 1TB Memory for MDS.
>>  
>> Regarding to CPU configuration , can we propose E5-2667 or E5-2697A for MDS 
>> for good performance ?
>>  
>> E5-2667 v4 : 3.2GHz, 8 Cores
>> E5-2697A v4: 2.6GHz, 16 cores
> There is a good presentation showing CPU speed vs. cores vs. MDS performance:
>
> https://www.eofs.eu/_media/events/lad14/03_shuichi_ihara_lustre_metadata_lad14.pdf
>
> Normally, higher GHz is good for the MDS, but if it reduces the number of
> cores by half, it may not be worthwhile.  It also depends on whether your
> workloads are mostly parallel (in which case more cores * GHz is better),
> or more serial (in which case a higher GHz is better).
>
> In this case, cores * GHz is 3.2GHz * 8 = 25.6GHz, and 2.6GHz * 16 = 41.6GHz,
> so you would probably get better aggregate performance from the E5-2697A as
> long as you have sufficient client parallelism to drive the system heavily.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] lustre client can't moutn after configuring LNET with lnetctl

2017-09-28 Thread Riccardo Veraldi
Hello.

I have configured Multi-Rail in my Lustre environment.

MDS: 172.21.42.213@tcp
OSS: 172.21.52.118@o2ib
    172.21.52.86@o2ib
Client: 172.21.52.124@o2ib
    172.21.52.125@o2ib

 
[root@drp-tst-oss10:~]# cat /proc/sys/lnet/peers
nid  refs state  last   max   rtr   min    tx   min
queue
172.21.52.124@o2ib  1    NA    -1   128   128   128   128   128 0
172.21.52.125@o2ib  1    NA    -1   128   128   128   128   128 0
172.21.42.213@tcp   1    NA    -1 8 8 8 8 6 0

After configuring multi-rail I can see the peers for both InfiniBand
interfaces on the OSS and on the client side.
Before multi-rail, the Lustre client could mount the Lustre FS without
problems.
Now that multi-rail is set up, the client can no longer mount the filesystem.

When I mount lustre from the client (fstab entry):

172.21.42.213@tcp:/drplu /drplu lustre noauto,lazystatfs,flock, 0 0

the file system cannot be mounted and I got these errors

Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.842861] Lustre:
2490:0:(client.c:2114:ptlrpc_expire_one_request()) @@@ Request sent has
failed due to network error: [sent 1506562126/real 1506562126] 
req@8808326b2a00 x1579744801849904/t0(0)
o400->drplu-OST0001-osc-88085d134800@172.21.52.86@o2ib:28/4 lens
224/224 e 0 to 1 dl 1506562133 ref 1 fl Rpc:eXN/0/ rc 0/-1
Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.842872] Lustre:
drplu-OST0001-osc-88085d134800: Connection to drplu-OST0001 (at
172.21.52.86@o2ib) was lost; in progress operations using this service
will wait for recovery to complete
Sep 27 18:28:46 drp-tst-lu10 kernel: [  596.843306] Lustre:
drplu-OST0001-osc-88085d134800: Connection restored to
172.21.52.86@o2ib (at 172.21.52.86@o2ib)


the mount point appears and disappears every few seconds from "df"

I do not have a clue how to fix this. The multi-rail capability is
important to me.
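
For what it is worth, plain LNet reachability from the client to each
server NID can be checked with lctl ping (the NIDs below are the ones from
this setup):

lctl ping 172.21.52.118@o2ib
lctl ping 172.21.52.86@o2ib
lctl ping 172.21.42.213@tcp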

I have Lustre 2.10.0 both client side and server side.
Here is my lnet.conf on the Lustre client side. The one on the OSS side is
similar, just with the peers swapped for the o2ib net.

net:
    - net type: lo
  local NI(s):
    - nid: 0@lo
  status: up
  statistics:
  send_count: 0
  recv_count: 0
  drop_count: 0
  tunables:
  peer_timeout: 0
  peer_credits: 0
  peer_buffer_credits: 0
  credits: 0
  lnd tunables:
  tcp bonding: 0
  dev cpt: 0
  CPT: "[0]"
    - net type: o2ib
  local NI(s):
    - nid: 172.21.52.124@o2ib
  status: up
  interfaces:
  0: ib0
  statistics:
  send_count: 7
  recv_count: 7
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 128
  peer_buffer_credits: 0
  credits: 1024
  lnd tunables:
  peercredits_hiw: 64
  map_on_demand: 32
  concurrent_sends: 256
  fmr_pool_size: 2048
  fmr_flush_trigger: 512
  fmr_cache: 1
  ntx: 2048
  conns_per_peer: 4
  tcp bonding: 0
  dev cpt: -1
  CPT: "[0]"
    - nid: 172.21.52.125@o2ib
  status: up
  interfaces:
  0: ib1
  statistics:
  send_count: 5
  recv_count: 5
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 128
  peer_buffer_credits: 0
  credits: 1024
  lnd tunables:
  peercredits_hiw: 64
  map_on_demand: 32
  concurrent_sends: 256
  fmr_pool_size: 2048
  fmr_flush_trigger: 512
  fmr_cache: 1
  ntx: 2048
  conns_per_peer: 4
  tcp bonding: 0
  dev cpt: -1
  CPT: "[0]"
    - net type: tcp
  local NI(s):
    - nid: 172.21.42.195@tcp
  status: up
  interfaces:
  0: enp7s0f0
  statistics:
  send_count: 51
  recv_count: 51
  drop_count: 0
  tunables:
  peer_timeout: 180
  peer_credits: 8
  peer_buffer_credits: 0
  credits: 256
  lnd tunables:
  tcp bonding: 0
  dev cpt: -1
  CPT: "[0]"
peer:
    - primary nid: 172.21.42.213@tcp
  Multi-Rail: False
  peer ni:
    - nid: 172.21.42.213@tcp
  state: NA
  max_ni_tx_credits: 8
  available_tx_credits: 8
  min_tx_credits: 6
  tx_q_num_of_buf: 0
  available_rtr_credits: 8
  min_rtr_credits: 8
  send_count: 0
  recv_count: 0
  drop_count: 0
  refcount: 1
    - primary nid: 172.21.52.86@o2ib
  Multi-Rail: True
  peer ni:
    - nid: 172.21.52.86@o2ib
  state: NA
  

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-26 Thread Riccardo Veraldi
ah ok the answer is already in your message sorry :)

On 9/26/17 4:02 PM, Riccardo Veraldi wrote:
> On 9/26/17 3:12 PM, Thomas Roth wrote:
>> I don't know about RHEL74, but got it to work on CentOS 7.4 (kernel
>> 3.10.0-693.el7).
>>
>> After ZFS was working on the server, I followed
>> https://lists.01.org/pipermail/hpdd-discuss/2016-December/003044.html
>> quite religiously.
>> Basically that means
>>> rpmbuild --rebuild --with zfs --without ldiskfs lustre-2.10.0-1.src.rpm
>> (the "--without ldiskfs" I added after some complaints of the
>> configure/build process).
>>
> thank you for pointing me at that document.
> Did you build 2.10.0 or the latest git from source code ?
>
> thanks
>
> Rick
>
>
>> The kernel-abi-whitelists is not installed there.
>>
>> Regards,
>> Thomas
>>
>> On 06.09.2017 04:02, Riccardo Veraldi wrote:
>>> I have the kabi whitelist package on hte system:
>>>
>>> kernel-abi-whitelists-3.10.0-693.1.1.el7.noarch
>>>
>>> I also built the rpm without ldiskfs support because I Am using ZFS.
>>>
>>> the problem seems to be with  kmod-lustre-osd-zfs
>>>
>>>
>>> yum localinstall lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
>>> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
>>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm
>>> Loaded plugins: kabi, product-id, search-disabled-repos
>>> Loading support for Red Hat kernel ABI
>>> Examining lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>>> lustre-2.10.52_75_ge1679d0.el7.x86_64
>>> Marking lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
>>> Examining kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>>> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64
>>> Marking kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
>>> Examining kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> Marking kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm to be
>>> installed
>>> Resolving Dependencies
>>> --> Running transaction check
>>> ---> Package kmod-lustre.x86_64 0:2.10.52_75_ge1679d0.el7 will be
>>> installed
>>> ---> Package kmod-lustre-osd-zfs.x86_64 0:2.10.52_75_ge1679d0.el7 will
>>> be installed
>>> --> Processing Dependency: lustre-osd-zfs-mount = 2.10.52_75_ge1679d0
>>> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(__cv_broadcast) = 0xb75ecbeb for
>>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(arc_add_prune_callback) = 0x23573478 for
>>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(arc_buf_size) = 0xb5c5f0b4 for package:
>>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(arc_remove_prune_callback) = 0x6f8b923b
>>> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(dbuf_create_bonus) = 0x294914aa for
>>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(dbuf_read) = 0x2a4a4dec for package:
>>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(dmu_assign_arcbuf) = 0x704f47e3 for
>>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> --> Processing Dependency: ksym(dmu_bonus_hold) = 0xacd57ac1 for
>>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>>> ...
>>> ..
>>> WARNING: possible kABI issue with package: kmod-lustre-osd-zfs
>>> WARNING: possible kABI issue with package: kmod-lustre
>>>
>>>
>>> ZFS is installed
>>>
>>> ls -la /lib/modules/3.10.0-693.1.1.el7.x86_64/extra/
>>> total 4516
>>> drwxr-xr-x 2 root root 141 Sep  5 17:17 .
>>> drwxr-xr-x 7 root root    4096 Sep  5 17:16 ..
>>> -rw-r--r-- 1 root root  331904 Sep  5 13:59 icp.ko
>>> -rw-r--r-- 1 root root  306208 Sep  5 13:56 splat.ko
>>> -rw-r--r-- 1 root root  187824 Sep  5 13:56 spl.ko
>>> -rw-r--r-- 1 root root   14016 Sep  5 13:59 zavl.ko
>>> -rw-r--r-- 1 root root  109552 Sep  5 13:59 zcommon.ko
>>> -rw-r--r-- 1 root root 3156744 Sep  5 13:59 zfs.ko
>>> -rw-r--r-- 1 root root  132488 Sep  5 13:59 znvpair.ko
>>> -rw-r--r-- 1 root root   35160 Sep  5 13:59 zpios.ko
>>>

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-26 Thread Riccardo Veraldi
On 9/26/17 3:12 PM, Thomas Roth wrote:
> I don't know about RHEL74, but got it to work on CentOS 7.4 (kernel
> 3.10.0-693.el7).
>
> After ZFS was working on the server, I followed
> https://lists.01.org/pipermail/hpdd-discuss/2016-December/003044.html
> quite religiously.
> Basically that means
> > rpmbuild --rebuild --with zfs --without ldiskfs lustre-2.10.0-1.src.rpm
> (the "--without ldiskfs" I added after some complaints of the
> configure/build process).
>
thank you for pointing me at that document.
Did you build 2.10.0 or the latest git from source code ?

thanks

Rick


> The kernel-abi-whitelists is not installed there.
>
> Regards,
> Thomas
>
> On 06.09.2017 04:02, Riccardo Veraldi wrote:
>> I have the kabi whitelist package on hte system:
>>
>> kernel-abi-whitelists-3.10.0-693.1.1.el7.noarch
>>
>> I also built the rpm without ldiskfs support because I Am using ZFS.
>>
>> the problem seems to be with  kmod-lustre-osd-zfs
>>
>>
>> yum localinstall lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
>> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm
>> Loaded plugins: kabi, product-id, search-disabled-repos
>> Loading support for Red Hat kernel ABI
>> Examining lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>> lustre-2.10.52_75_ge1679d0.el7.x86_64
>> Marking lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
>> Examining kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64
>> Marking kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
>> Examining kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm:
>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> Marking kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm to be
>> installed
>> Resolving Dependencies
>> --> Running transaction check
>> ---> Package kmod-lustre.x86_64 0:2.10.52_75_ge1679d0.el7 will be
>> installed
>> ---> Package kmod-lustre-osd-zfs.x86_64 0:2.10.52_75_ge1679d0.el7 will
>> be installed
>> --> Processing Dependency: lustre-osd-zfs-mount = 2.10.52_75_ge1679d0
>> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(__cv_broadcast) = 0xb75ecbeb for
>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(arc_add_prune_callback) = 0x23573478 for
>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(arc_buf_size) = 0xb5c5f0b4 for package:
>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(arc_remove_prune_callback) = 0x6f8b923b
>> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(dbuf_create_bonus) = 0x294914aa for
>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(dbuf_read) = 0x2a4a4dec for package:
>> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(dmu_assign_arcbuf) = 0x704f47e3 for
>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> --> Processing Dependency: ksym(dmu_bonus_hold) = 0xacd57ac1 for
>> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
>> ...
>> ..
>> WARNING: possible kABI issue with package: kmod-lustre-osd-zfs
>> WARNING: possible kABI issue with package: kmod-lustre
>>
>>
>> ZFS is installed
>>
>> ls -la /lib/modules/3.10.0-693.1.1.el7.x86_64/extra/
>> total 4516
>> drwxr-xr-x 2 root root 141 Sep  5 17:17 .
>> drwxr-xr-x 7 root root    4096 Sep  5 17:16 ..
>> -rw-r--r-- 1 root root  331904 Sep  5 13:59 icp.ko
>> -rw-r--r-- 1 root root  306208 Sep  5 13:56 splat.ko
>> -rw-r--r-- 1 root root  187824 Sep  5 13:56 spl.ko
>> -rw-r--r-- 1 root root   14016 Sep  5 13:59 zavl.ko
>> -rw-r--r-- 1 root root  109552 Sep  5 13:59 zcommon.ko
>> -rw-r--r-- 1 root root 3156744 Sep  5 13:59 zfs.ko
>> -rw-r--r-- 1 root root  132488 Sep  5 13:59 znvpair.ko
>> -rw-r--r-- 1 root root   35160 Sep  5 13:59 zpios.ko
>> -rw-r--r-- 1 root root  330952 Sep  5 13:59 zunicode.ko
>>
>>
>>
>>
>>
>> On 9/5/17 6:13 PM, Cowe, Malcolm J wrote:
>>> One possibility is that the kernel-abi-whitelists.noarch package is
>>> not installed – although I’ve certainly compiled Lustre without this
>>> package in the past on RHEL 7.3.
>>>
>>> I believe that the project quota patches for LDISKFS break KABI
>>> compatibility, so

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-05 Thread Riccardo Veraldi
I add these are the ZFS packages which are on the system

libzfs2-0.7.1-1.el7_4.x86_64
zfs-release-1-5.el7_4.noarch
zfs-0.7.1-1.el7_4.x86_64
zfs-dkms-0.7.1-1.el7_4.noarch
libzfs2-devel-0.7.1-1.el7_4.x86_64


On 9/5/17 7:02 PM, Riccardo Veraldi wrote:
> I have the kabi whitelist package on hte system:
>
> kernel-abi-whitelists-3.10.0-693.1.1.el7.noarch
>
> I also built the rpm without ldiskfs support because I Am using ZFS.
>
> the problem seems to be with  kmod-lustre-osd-zfs
>
>
> yum localinstall lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm
> Loaded plugins: kabi, product-id, search-disabled-repos
> Loading support for Red Hat kernel ABI
> Examining lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
> lustre-2.10.52_75_ge1679d0.el7.x86_64
> Marking lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
> Examining kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
> kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64
> Marking kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
> Examining kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm:
> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> Marking kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm to be
> installed
> Resolving Dependencies
> --> Running transaction check
> ---> Package kmod-lustre.x86_64 0:2.10.52_75_ge1679d0.el7 will be installed
> ---> Package kmod-lustre-osd-zfs.x86_64 0:2.10.52_75_ge1679d0.el7 will
> be installed
> --> Processing Dependency: lustre-osd-zfs-mount = 2.10.52_75_ge1679d0
> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(__cv_broadcast) = 0xb75ecbeb for
> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(arc_add_prune_callback) = 0x23573478 for
> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(arc_buf_size) = 0xb5c5f0b4 for package:
> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(arc_remove_prune_callback) = 0x6f8b923b
> for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(dbuf_create_bonus) = 0x294914aa for
> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(dbuf_read) = 0x2a4a4dec for package:
> kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(dmu_assign_arcbuf) = 0x704f47e3 for
> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> --> Processing Dependency: ksym(dmu_bonus_hold) = 0xacd57ac1 for
> package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
> ...
> ..
> WARNING: possible kABI issue with package: kmod-lustre-osd-zfs
> WARNING: possible kABI issue with package: kmod-lustre
>
>
> ZFS is installed
>
> ls -la /lib/modules/3.10.0-693.1.1.el7.x86_64/extra/
> total 4516
> drwxr-xr-x 2 root root 141 Sep  5 17:17 .
> drwxr-xr-x 7 root root    4096 Sep  5 17:16 ..
> -rw-r--r-- 1 root root  331904 Sep  5 13:59 icp.ko
> -rw-r--r-- 1 root root  306208 Sep  5 13:56 splat.ko
> -rw-r--r-- 1 root root  187824 Sep  5 13:56 spl.ko
> -rw-r--r-- 1 root root   14016 Sep  5 13:59 zavl.ko
> -rw-r--r-- 1 root root  109552 Sep  5 13:59 zcommon.ko
> -rw-r--r-- 1 root root 3156744 Sep  5 13:59 zfs.ko
> -rw-r--r-- 1 root root  132488 Sep  5 13:59 znvpair.ko
> -rw-r--r-- 1 root root   35160 Sep  5 13:59 zpios.ko
> -rw-r--r-- 1 root root  330952 Sep  5 13:59 zunicode.ko
>
>
>
>
>
> On 9/5/17 6:13 PM, Cowe, Malcolm J wrote:
>> One possibility is that the kernel-abi-whitelists.noarch package is not 
>> installed – although I’ve certainly compiled Lustre without this package in 
>> the past on RHEL 7.3.
>>
>> I believe that the project quota patches for LDISKFS break KABI 
>> compatibility, so it is possible this is what is causing the build to fail. 
>> If so, then you can either remove the “vfs-project-quotas-rhel7.patch” from 
>> the patch series for the server kernel (which will remove project quota 
>> support), or disable the kabi check when compiling the kernel. For example:
>>
>> _TOPDIR=`rpm --eval %{_topdir}`
>> rpmbuild -ba --with firmware --with baseonly \
>> --without kabichk \
>> --define "buildid _lustre" \
>> --target x86_64 \
>> $_TOPDIR/SPECS/kernel.spec
>>
>> Malcolm.
>>
>> On 6/9/17, 10:30 am, "lustre-discuss on behalf of Riccardo Veraldi" 
>> <lustre-discuss-boun...@lists.lustre.org on behalf of 
>> riccardo.vera...@cnaf.infn.it> w

Re: [lustre-discuss] Lustre 2.10 and RHEL74

2017-09-05 Thread Riccardo Veraldi
I have the kabi whitelist package on the system:

kernel-abi-whitelists-3.10.0-693.1.1.el7.noarch

I also built the rpm without ldiskfs support because I am using ZFS.

the problem seems to be with  kmod-lustre-osd-zfs


yum localinstall lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm
kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm
Loaded plugins: kabi, product-id, search-disabled-repos
Loading support for Red Hat kernel ABI
Examining lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
lustre-2.10.52_75_ge1679d0.el7.x86_64
Marking lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
Examining kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm:
kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64
Marking kmod-lustre-2.10.52_75_ge1679d0.el7.x86_64.rpm to be installed
Examining kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm:
kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
Marking kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64.rpm to be
installed
Resolving Dependencies
--> Running transaction check
---> Package kmod-lustre.x86_64 0:2.10.52_75_ge1679d0.el7 will be installed
---> Package kmod-lustre-osd-zfs.x86_64 0:2.10.52_75_ge1679d0.el7 will
be installed
--> Processing Dependency: lustre-osd-zfs-mount = 2.10.52_75_ge1679d0
for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(__cv_broadcast) = 0xb75ecbeb for
package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(arc_add_prune_callback) = 0x23573478 for
package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(arc_buf_size) = 0xb5c5f0b4 for package:
kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(arc_remove_prune_callback) = 0x6f8b923b
for package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(dbuf_create_bonus) = 0x294914aa for
package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(dbuf_read) = 0x2a4a4dec for package:
kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(dmu_assign_arcbuf) = 0x704f47e3 for
package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
--> Processing Dependency: ksym(dmu_bonus_hold) = 0xacd57ac1 for
package: kmod-lustre-osd-zfs-2.10.52_75_ge1679d0.el7.x86_64
...
..
WARNING: possible kABI issue with package: kmod-lustre-osd-zfs
WARNING: possible kABI issue with package: kmod-lustre


ZFS is installed

ls -la /lib/modules/3.10.0-693.1.1.el7.x86_64/extra/
total 4516
drwxr-xr-x 2 root root 141 Sep  5 17:17 .
drwxr-xr-x 7 root root    4096 Sep  5 17:16 ..
-rw-r--r-- 1 root root  331904 Sep  5 13:59 icp.ko
-rw-r--r-- 1 root root  306208 Sep  5 13:56 splat.ko
-rw-r--r-- 1 root root  187824 Sep  5 13:56 spl.ko
-rw-r--r-- 1 root root   14016 Sep  5 13:59 zavl.ko
-rw-r--r-- 1 root root  109552 Sep  5 13:59 zcommon.ko
-rw-r--r-- 1 root root 3156744 Sep  5 13:59 zfs.ko
-rw-r--r-- 1 root root  132488 Sep  5 13:59 znvpair.ko
-rw-r--r-- 1 root root   35160 Sep  5 13:59 zpios.ko
-rw-r--r-- 1 root root  330952 Sep  5 13:59 zunicode.ko





On 9/5/17 6:13 PM, Cowe, Malcolm J wrote:
> One possibility is that the kernel-abi-whitelists.noarch package is not 
> installed – although I’ve certainly compiled Lustre without this package in 
> the past on RHEL 7.3.
>
> I believe that the project quota patches for LDISKFS break KABI 
> compatibility, so it is possible this is what is causing the build to fail. 
> If so, then you can either remove the “vfs-project-quotas-rhel7.patch” from 
> the patch series for the server kernel (which will remove project quota 
> support), or disable the kabi check when compiling the kernel. For example:
>
> _TOPDIR=`rpm --eval %{_topdir}`
> rpmbuild -ba --with firmware --with baseonly \
> --without kabichk \
> --define "buildid _lustre" \
> --target x86_64 \
> $_TOPDIR/SPECS/kernel.spec
>
> Malcolm.
>
> On 6/9/17, 10:30 am, "lustre-discuss on behalf of Riccardo Veraldi" 
> <lustre-discuss-boun...@lists.lustre.org on behalf of 
> riccardo.vera...@cnaf.infn.it> wrote:
>
> Hello,
> is it foreseen that Lustre 2.10.*  will be compatible with RHEL74 ?
> I tried lustre 2.10.52 but it complains about kABI.
> 
> thank you
> 
> Rick
> 
> 
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
> 
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre 2.10.0 multi rail configuration

2017-08-28 Thread Riccardo Veraldi
I tried to follow this Intel document

https://www.eofs.eu/_media/events/lad16/12_multirail_lnet_for_lustre_weber.pdf

anyways lnetctl fails with the error:

add:
    - ip2nets:
          errno: -22
          descr: "cannot add network: Invalid argument"

and this is my lnet.conf

ip2nets:
  - net-spec: o2ib5
 interfaces:
 0: ib0[0]
 1: ib1[1]


here my lustre.conf

options lnet networks=o2ib5,tcp5(enp1s0f0)
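
A rough alternative, assuming the lnet module is loaded without a static
networks= option (a networks= line in lustre.conf can conflict with dynamic
configuration), is to add each interface with lnetctl instead of importing
an ip2nets block:

modprobe lnet
lnetctl lnet configure
lnetctl net add --net o2ib5 --if ib0
lnetctl net add --net o2ib5 --if ib1
lnetctl net add --net tcp5 --if enp1s0f0
lnetctl net show --verbose
lnetctl export > /etc/lnet.conf     # save the working configuration for reuse

The interface names and the o2ib5/tcp5 labels above are the ones from this
thread; everything else is only a sketch.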

thanks


Rick



On 8/28/17 4:07 PM, Chris Horn wrote:
>
> Dynamic LNet configuration (DLC) must be used to configure multi-rail.
> Lustre 2.10 contains an “lnet.conf” file that has a sample multi-rail
> configuration. I’ve copied it below for your convenience.
>
>  
>
> > # lnet.conf - configuration file for lnet routes to be imported by lnetctl
>
> > #
>
> > # This configuration file is formatted as YAML and can be imported
>
> > # by lnetctl.
>
> > #
>
> > # net:
>
> > # - net type: o2ib1
>
> > #   local NI(s):
>
> > # - nid: 172.16.1.4@o2ib1
>
> > #   interfaces:
>
> > #   0: ib0
>
> > #   tunables:
>
> > #   peer_timeout: 180
>
> > #   peer_credits: 128
>
> > #   peer_buffer_credits: 0
>
> > #   credits: 1024
>
> > #   lnd tunables:
>
> > #   peercredits_hiw: 64
>
> > #   map_on_demand: 32
>
> > #   concurrent_sends: 256
>
> > #   fmr_pool_size: 2048
>
> > #   fmr_flush_trigger: 512
>
> > #   fmr_cache: 1
>
> > #   CPT: "[0,1]"
>
> > # - nid: 172.16.2.4@o2ib1
>
> > #   interfaces:
>
> > #   0: ib1
>
> > #   tunables:
>
> > #   peer_timeout: 180
>
> > #   peer_credits: 128
>
> > #   peer_buffer_credits: 0
>
> > #   credits: 1024
>
> > #   lnd tunables:
>
> > #   peercredits_hiw: 64
>
> > #   map_on_demand: 32
>
> > #   concurrent_sends: 256
>
> > #   fmr_pool_size: 2048
>
> > #   fmr_flush_trigger: 512
>
> > #   fmr_cache: 1
>
> > #   CPT: "[0,1]"
>
> > # route:
>
> > # - net: o2ib
>
> > #   gateway: 172.16.1.1@o2ib1
>
> > #   hop: -1
>
> > #   priority: 0
>
> > # peer:
>
> > # - primary nid: 192.168.1.2@o2ib
>
> > #   Multi-Rail: True
>
> > #   peer ni:
>
> > # - nid: 192.168.1.2@o2ib
>
> > # - nid: 192.168.2.2@o2ib
>
> > # - primary nid: 172.16.1.1@o2ib1
>
> > #   Multi-Rail: True
>
> > #   peer ni:
>
> > # - nid: 172.16.1.1@o2ib1
>
> > # - nid: 172.16.2.1@o2ib1 <mailto:172.16.2.1@o2ib1>
>
>  
>
> Chris Horn
>
>  
>
> *From: *lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on
> behalf of Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>
> *Date: *Monday, August 28, 2017 at 5:49 PM
> *To: *"lustre-discuss@lists.lustre.org" <lustre-discuss@lists.lustre.org>
> *Subject: *[lustre-discuss] Lustre 2.10.0 multi rail configuration
>
>  
>
> Hello,
> I am trying to deploy a multi rail configuration on Lustre 2.10.0 on
> RHEL73.
> My goal is to use both the IB interfaces on OSSes and client.
> I have one client and two OSSes and 1 MDS
> My LNet network is labelled o2ib5 and tcp5 just for my own
> convenience. What I did is to modify the configuration of lustre.conf
>
> options lnet networks=o2ib5(ib0,ib1),tcp5(enp1s0f0)
>
> lctl list_nids on either hte OSSes or the client shows me both local
> IB interfaces:
>
> 172.21.52.86@o2ib5
> 172.21.52.118@o2ib5
> 172.21.42.211@tcp5
>
> anyway I can't run a LNet selftest using the new nids, it fails.
>
> Seems like they are unused.
> Any hint on the multi-rail configuration needed?
> What I'd like to do is use both InfiniBand cards (ib0,ib1)  on my two
> OSSes and on my client to leverage more bandwidth usage
> since with only one InfiniBand I cannot saturate the disk performance.
> thank you
>
>  
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Lustre 2.10.0 multi rail configuration

2017-08-28 Thread Riccardo Veraldi
Hello,
I am trying to deploy a multi rail configuration on Lustre 2.10.0 on RHEL73.
My goal is to use both the IB interfaces on OSSes and client.
I have one client and two OSSes and 1 MDS
My LNet network is labelled o2ib5 and tcp5 just for my own convenience.
What I did is to modify the configuration of lustre.conf

options lnet networks=o2ib5(ib0,ib1),tcp5(enp1s0f0)

lctl list_nids on either the OSSes or the client shows me both local IB
interfaces:

172.21.52.86@o2ib5
172.21.52.118@o2ib5
172.21.42.211@tcp5

anyway I can't run a LNet selftest using the new nids, it fails.

Seems like they are unused.
Any hint on the multi-rail configuration needed?
What I'd like to do is use both InfiniBand cards (ib0,ib1)  on my two
OSSes and on my client to leverage more bandwidth usage
since with only one InfiniBand I cannot saturate the disk performance.
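
As noted elsewhere in this thread, listing both NIDs with lctl list_nids only
shows that LNet brought both interfaces up; on 2.10 the peers also have to be
configured for Multi-Rail through Dynamic LNet Configuration. A quick check of
what is actually configured, assuming the 2.10 lnetctl utility is installed:

lnetctl net show --verbose     # both ib0 and ib1 should appear under o2ib5
lnetctl peer show              # remote peers should list both NIDs with Multi-Rail: True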
thank you


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lustre-dkms-2.10.0 fails to install on kernel 4.4

2017-08-28 Thread Riccardo Veraldi
I tend to use dkms; I always did with RHEL 7.2 and 7.3.
I found it much handier than building a kmod module every time you
have to upgrade the kernel.
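
For reference, a typical dkms-based flow looks roughly like the following; the
exact rpm file name is illustrative, the rest is standard dkms usage:

yum localinstall lustre-dkms-2.10.0-1.el7.noarch.rpm   # dkms builds the modules for the running kernel
dkms status                                            # should list the lustre module as installed for that kernel
dkms autoinstall                                       # rebuild after a kernel upgrade, if needed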

On 8/20/17 3:49 AM, E.S. Rosenberg wrote:
> Shouldn't you use kmod with RHEL? I may be wrong but that is how we
> installed it
>
> On Thu, Aug 17, 2017 at 1:58 AM, Riccardo Veraldi
> <riccardo.vera...@cnaf.infn.it> wrote:
>> hello.
>>
>> I am installing lustre-dkms-2.10.0 on RHEL74  running kernel
>> 4.4.82-1.el7.elrepo.x86_64
>>
>> dkms.conf: Error! Directive 'DEST_MODULE_LOCATION' does not begin with
>> '/kernel', '/updates', or '/extra' in record #24.
>> dkms.conf: Error! Directive 'DEST_MODULE_LOCATION' does not begin with
>> '/kernel', '/updates', or '/extra' in record #25.
>> dkms.conf: Error! Directive 'DEST_MODULE_LOCATION' does not begin with
>> '/kernel', '/updates', or '/extra' in record #26.
>> Error! Bad conf file.
>> File:
>> does not represent a valid dkms.conf file.
>>
>> this looks similar to LU-8630 that was fixed.
>>
>> how can I Solve this ?
>>
>> thank you
>>
>> Rick
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre poor performance

2017-08-28 Thread Riccardo Veraldi
For QLogic the script works, but then there is another parameter to
change in the peer credits value, otherwise Lustre will complain and it
will not work.
At least this is the case for my old QLogic QDR cards.
I do not know whether this applies to newer QLogic cards too.

I'll write a patch to the script that will work for Mellanox cards
(ConnectX-3 family).
I can't speak for ConnectX-4 because I have no experience with those right
now.
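
A rough sketch of what such a patch could look like, based only on the
description of ko2iblnd-probe given elsewhere in this thread (the real script
layout may differ, and the "mlx" profile name is purely illustrative): extend
the device-name match in /usr/sbin/ko2iblnd-probe, e.g.

for dev in /sys/class/infiniband/*; do
    case $(basename $dev) in
        hfi*|qib*) profile=opa ;;
        mlx*)      profile=mlx ;;    # hypothetical new case for Mellanox HCAs
    esac
done

and add a matching stanza to /etc/modprobe.d/ko2iblnd.conf with the values
discussed in this thread:

alias ko2iblnd-mlx ko2iblnd
options ko2iblnd-mlx peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1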
 


On 8/23/17 4:36 PM, Dilger, Andreas wrote:
> On Aug 23, 2017, at 08:39, Mohr Jr, Richard Frank (Rick Mohr) <rm...@utk.edu> 
> wrote:
>>
>>> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi 
>>> <riccardo.vera...@cnaf.infn.it> wrote:
>>>
>>> On 8/22/17 9:22 AM, Mannthey, Keith wrote:
>>>> You may want to file a jira ticket if ko2iblnd-opa setting were being
>>>> automatically used on your Mellanox setup.  That is not expected.
>>>>
>>> yes they are automatically used on my Mellanox and the script 
>>> ko2iblnd-probe seems like not working properly.
>> The ko2iblnd-probe script looks in /sys/class/infiniband for device names 
>> starting with “hfi” or “qib”.  If it detects those, it decides that the 
>> “profile” it should use is “opa” so then it basically invokes the 
>> ko2iblnd-opa modprobe line.  But the script has no logic to detect other 
>> types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are 
>> used and you end up with the default module parameters being used.
>>
>> If you want to use the script, you will need to modify ko2iblnd-probe to add 
>> a new case for your brand of HCA and then add an appropriate 
>> ko2iblnd- line to ko2iblnd.conf.
>>
>> Or just do what I did and comment out all the lines in ko2iblnd.conf and add 
>> your own lines.
> If there are significantly different options needed for newer Mellanox HCAs 
> (e.g. as between Qlogic/OPA and MLX) it would be great to get a patch to 
> ko2iblnd-probe and ko2iblnd.conf that adds those options as the default for 
> the new type of card, so that Lustre works better out of the box.  That helps 
> transfer the experience of veteran IB users to users that may not have the 
> background to get the best LNet IB performance.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Lustre Principal Architect
> Intel Corporation
>
>
>
>
>
>
>
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre poor performance

2017-08-28 Thread Riccardo Veraldi
On 8/23/17 7:39 AM, Mohr Jr, Richard Frank (Rick Mohr) wrote:
>> On Aug 22, 2017, at 7:14 PM, Riccardo Veraldi 
>> <riccardo.vera...@cnaf.infn.it> wrote:
>>
>> On 8/22/17 9:22 AM, Mannthey, Keith wrote:
>>> You may want to file a jira ticket if ko2iblnd-opa setting were being
>>> automatically used on your Mellanox setup.  That is not expected.
>>>
>> yes they are automatically used on my Mellanox and the script ko2iblnd-probe 
>> seems like not working properly.
> The ko2iblnd-probe script looks in /sys/class/infiniband for device names 
> starting with “hfi” or “qib”.  If it detects those, it decides that the 
> “profile” it should use is “opa” so then it basically invokes the 
> ko2iblnd-opa modprobe line.  But the script has no logic to detect other 
> types of card (i.e. - mellanox), so in those cases, no ko2iblnd options are 
> used and you end up with the default module parameters being used.
>
> If you want to use the script, you will need to modify ko2iblnd-probe to add 
> a new case for your brand of HCA and then add an appropriate 
> ko2iblnd- line to ko2iblnd.conf.
>
> Or just do what I did and comment out all the lines in ko2iblnd.conf and add 
> your own lines.
yes what I did was to disable the module alias and just

options ko2iblnd ...
install ko2iblnd ...

and it worked.
I may modify the script as well as you mentioned.
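
For reference, the de-aliased form of the file that ended up working (the
same lines posted in the 2017-08-19 messages of this thread) was along these
lines:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
install ko2iblnd /usr/sbin/ko2iblnd-probe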

thank you.
>
> --
> Rick Mohr
> Senior HPC System Administrator
> National Institute for Computational Sciences
> http://www.nics.tennessee.edu
>
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Lustre poor performance

2017-08-22 Thread Riccardo Veraldi
On 8/22/17 9:22 AM, Mannthey, Keith wrote:
>
> You may want to file a jira ticket if ko2iblnd-opa setting were being
> automatically used on your Mellanox setup.  That is not expected.
>
Yes, they are automatically used on my Mellanox setup, and the
ko2iblnd-probe script does not seem to be working properly.
>
>  
>
> On another note:  As you note, your NVMe backend is much faster than QDR
> link speed.  You may want to look at using the new Multi-Rail LNet
> feature to boost network bandwidth.  You can add a 2nd QDR HCA/port
> and get more LNet bandwidth from your OSS server.   It is a new feature
> that is a bit of work to use, but if you are chasing bandwidth it might
> be worth the effort.
>
I have a dual-port InfiniBand card, so I was thinking of bonding the ports
to get more bandwidth. Is this what you mean when you talk about the
Multi-Rail feature boost?

thanks

Rick


>  
>
> Thanks,
>
> Keith
>
>  
>
> *From:*lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org]
> *On Behalf Of *Chris Horn
> *Sent:* Monday, August 21, 2017 12:40 PM
> *To:* Riccardo Veraldi <riccardo.vera...@cnaf.infn.it>; Arman
> Khalatyan <arm2...@gmail.com>
> *Cc:* lustre-discuss@lists.lustre.org
> *Subject:* Re: [lustre-discuss] Lustre poor performance
>
>  
>
> The ko2iblnd-opa settings are tuned specifically for Intel OmniPath.
> Take a look at the /usr/sbin/ko2iblnd-probe script to see how OPA
> hardware is detected and the “ko2iblnd-opa” settings get used.
>
>  
>
> Chris Horn
>
>  
>
> *From: *lustre-discuss <lustre-discuss-boun...@lists.lustre.org
> <mailto:lustre-discuss-boun...@lists.lustre.org>> on behalf of
> Riccardo Veraldi <riccardo.vera...@cnaf.infn.it
> <mailto:riccardo.vera...@cnaf.infn.it>>
> *Date: *Saturday, August 19, 2017 at 5:00 PM
> *To: *Arman Khalatyan <arm2...@gmail.com <mailto:arm2...@gmail.com>>
> *Cc: *"lustre-discuss@lists.lustre.org
> <mailto:lustre-discuss@lists.lustre.org>"
> <lustre-discuss@lists.lustre.org <mailto:lustre-discuss@lists.lustre.org>>
> *Subject: *Re: [lustre-discuss] Lustre poor performance
>
>  
>
> I ran again my Lnet self test and  this time adding --concurrency=16 
> I can use all of the IB bandwith (3.5GB/sec).
>
> the only thing I do not understand is why ko2iblnd.conf is not loaded
> properly and I had to remove the alias in the config file to allow
> the proper peer_credit settings to be loaded.
>
> thanks to everyone for helping
>
> Riccardo
>
> On 8/19/17 8:54 AM, Riccardo Veraldi wrote:
>
>
> I found out that ko2iblnd is not getting settings from
> /etc/modprobe/ko2iblnd.conf
> alias ko2iblnd-opa ko2iblnd
> options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64
> credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32
> fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
> install ko2iblnd /usr/sbin/ko2iblnd-probe
>
> but if I modify ko2iblnd.conf like this, then settings are loaded:
>
> options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024
> concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
> fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
> install ko2iblnd /usr/sbin/ko2iblnd-probe
>
> Lnet tests show better behaviour but still I Would expect more
> than this.
> Is it possible to tune parameters in /etc/modprobe/ko2iblnd.conf
> so that Mellanox ConnectX-3 will work more efficiently ?
>
> [LNet Rates of servers]
> [R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s
> [W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 625.23   MiB/s Min: 0.00 MiB/s Max: 1250.46  MiB/s
> [W] Avg: 1035.85  MiB/s Min: 0.00 MiB/s Max: 2071.69  MiB/s
> [LNet Rates of servers]
> [R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s
> [W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 625.55   MiB/s Min: 0.00 MiB/s Max: 1251.11  MiB/s
> [W] Avg: 1035.05  MiB/s Min: 0.00 MiB/s Max: 2070.11  MiB/s
> [LNet Rates of servers]
> [R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s
> [W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 626.55   MiB/s Min: 0.00 MiB/s Max: 1253.11  MiB/s
> [W] Avg: 1038.05  MiB/s Min: 0.00 MiB/s Max: 2076.11  MiB/s
> session is ended
> ./lnet_test.sh: line 17: 23394 Terminated  lst stat
> servers
>
>
>
>
> On 8/19/17 4:

Re: [lustre-discuss] Lustre poor performance

2017-08-19 Thread Riccardo Veraldi
I ran my LNet self-test again, this time adding --concurrency=16, and I
can use all of the IB bandwidth (3.5 GB/sec).
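
Concretely, the change amounts to adding a --concurrency argument to the
add_test lines of the self-test script quoted further down in this thread,
e.g.:

lst add_test --batch bulk_rw --concurrency=16 --from readers --to servers \
    brw read check=simple size=1M
lst add_test --batch bulk_rw --concurrency=16 --from writers --to servers \
    brw write check=full size=1M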

The only thing I do not understand is why ko2iblnd.conf is not loaded
properly; I had to remove the alias in the config file to allow
the proper peer_credits settings to be loaded.

thanks to everyone for helping

Riccardo

On 8/19/17 8:54 AM, Riccardo Veraldi wrote:
>
> I found out that ko2iblnd is not getting settings from
> /etc/modprobe/ko2iblnd.conf
> alias ko2iblnd-opa ko2iblnd
> options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024
> concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
> fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
> install ko2iblnd /usr/sbin/ko2iblnd-probe
>
> but if I modify ko2iblnd.conf like this, then settings are loaded:
>
> options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024
> concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
> fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
>
> install ko2iblnd /usr/sbin/ko2iblnd-probe
>
> Lnet tests show better behaviour but still I Would expect more than this.
> Is it possible to tune parameters in /etc/modprobe/ko2iblnd.conf so
> that Mellanox ConnectX-3 will work more efficiently ?
>
> [LNet Rates of servers]
> [R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s
> [W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 625.23   MiB/s Min: 0.00 MiB/s Max: 1250.46  MiB/s
> [W] Avg: 1035.85  MiB/s Min: 0.00 MiB/s Max: 2071.69  MiB/s
> [LNet Rates of servers]
> [R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s
> [W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 625.55   MiB/s Min: 0.00 MiB/s Max: 1251.11  MiB/s
> [W] Avg: 1035.05  MiB/s Min: 0.00 MiB/s Max: 2070.11  MiB/s
> [LNet Rates of servers]
> [R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s
> [W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s
> [LNet Bandwidth of servers]
> [R] Avg: 626.55   MiB/s Min: 0.00 MiB/s Max: 1253.11  MiB/s
> [W] Avg: 1038.05  MiB/s Min: 0.00 MiB/s Max: 2076.11  MiB/s
> session is ended
> ./lnet_test.sh: line 17: 23394 Terminated  lst stat servers
>
>
>
>
> On 8/19/17 4:20 AM, Arman Khalatyan wrote:
>> just minor comment,
>> you should push up performance of your nodes,they are not running in
>> the max cpu frequencies.Al tests might be inconsistent. in order to
>> get most of ib run following:
>> tuned-adm profile latency-performance
>> for more options use:
>> tuned-adm list
>>
>> It will be interesting to see the difference.
>>
>> Am 19.08.2017 3:57 vorm. schrieb "Riccardo Veraldi"
>> <riccardo.vera...@cnaf.infn.it <mailto:riccardo.vera...@cnaf.infn.it>>:
>>
>> Hello Keith and Dennis, these are the test I ran.
>>
>>   * obdfilter-survey, shows that I Can saturate disk performance,
>> the NVMe/ZFS backend is performing very well and it is faster
>> then my Infiniband network
>>
>> pool  alloc   free   read  write   read  write
>>   -  -  -  -  -  -
>> drpffb-ost01  3.31T  3.19T  3  35.7K  16.0K  7.03G
>>   raidz1  3.31T  3.19T  3  35.7K  16.0K  7.03G
>> nvme0n1   -  -  1  5.95K  7.99K  1.17G
>> nvme1n1   -  -  0  6.01K  0  1.18G
>> nvme2n1   -  -  0  5.93K  0  1.17G
>> nvme3n1   -  -  0  5.88K  0  1.16G
>> nvme4n1   -  -  1  5.95K  7.99K  1.17G
>> nvme5n1   -  -  0  5.96K  0  1.17G
>>   -  -  -  -  -  -
>>
>> this are the tests results
>>
>> Fri Aug 18 16:54:48 PDT 2017 Obdfilter-survey for case=disk from
>> drp-tst-ffb01
>> ost  1 sz 10485760K rsz 1024K obj1 thr1
>> write 7633.08   SHORT rewrite 7558.78
>> SHORT read 3205.24 [3213.70, 3226.78]
>> ost  1 sz 10485760K rsz 1024K obj1 thr2
>> write 7996.89 SHORT rewrite 7903.42
>> SHORT read 5264.70 SHORT
>> ost  1 sz 10485760K rsz 1024K obj2 thr2 write
>> 7718.94 SHORT rewrite 7977.84 SHORT
>> read 5802.17 SHORT
>>
>>   * Lnet self test, and here I see the problems. For reference
>> 172.21.52.[83,84] are th

Re: [lustre-discuss] Lustre poor performance

2017-08-19 Thread Riccardo Veraldi

I found out that ko2iblnd is not getting its settings from
/etc/modprobe.d/ko2iblnd.conf:
alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024
concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4

install ko2iblnd /usr/sbin/ko2iblnd-probe

but if I modify ko2iblnd.conf like this, then settings are loaded:

options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024
concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048
fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4

install ko2iblnd /usr/sbin/ko2iblnd-probe

LNet tests show better behaviour, but I would still expect more than this.
Is it possible to tune parameters in /etc/modprobe.d/ko2iblnd.conf so that
Mellanox ConnectX-3 will work more efficiently?
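
One quick sanity check, independent of Lustre itself, is to read back the
values the driver actually picked up from sysfs after the module is loaded:

for p in peer_credits peer_credits_hiw credits concurrent_sends map_on_demand conns_per_peer; do
    echo -n "$p = "; cat /sys/module/ko2iblnd/parameters/$p
done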

[LNet Rates of servers]
[R] Avg: 2286 RPC/s Min: 0RPC/s Max: 4572 RPC/s
[W] Avg: 3322 RPC/s Min: 0RPC/s Max: 6643 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 625.23   MiB/s Min: 0.00 MiB/s Max: 1250.46  MiB/s
[W] Avg: 1035.85  MiB/s Min: 0.00 MiB/s Max: 2071.69  MiB/s
[LNet Rates of servers]
[R] Avg: 2286 RPC/s Min: 1RPC/s Max: 4571 RPC/s
[W] Avg: 3321 RPC/s Min: 1RPC/s Max: 6641 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 625.55   MiB/s Min: 0.00 MiB/s Max: 1251.11  MiB/s
[W] Avg: 1035.05  MiB/s Min: 0.00 MiB/s Max: 2070.11  MiB/s
[LNet Rates of servers]
[R] Avg: 2291 RPC/s Min: 0RPC/s Max: 4581 RPC/s
[W] Avg: 3329 RPC/s Min: 0RPC/s Max: 6657 RPC/s
[LNet Bandwidth of servers]
[R] Avg: 626.55   MiB/s Min: 0.00 MiB/s Max: 1253.11  MiB/s
[W] Avg: 1038.05  MiB/s Min: 0.00 MiB/s Max: 2076.11  MiB/s
session is ended
./lnet_test.sh: line 17: 23394 Terminated  lst stat servers




On 8/19/17 4:20 AM, Arman Khalatyan wrote:
> just minor comment,
> you should push up performance of your nodes,they are not running in
> the max cpu frequencies.Al tests might be inconsistent. in order to
> get most of ib run following:
> tuned-adm profile latency-performance
> for more options use:
> tuned-adm list
>
> It will be interesting to see the difference.
>
> Am 19.08.2017 3:57 vorm. schrieb "Riccardo Veraldi"
> <riccardo.vera...@cnaf.infn.it <mailto:riccardo.vera...@cnaf.infn.it>>:
>
> Hello Keith and Dennis, these are the test I ran.
>
>   * obdfilter-survey, shows that I Can saturate disk performance,
> the NVMe/ZFS backend is performing very well and it is faster
> then my Infiniband network
>
> pool  alloc   free   read  write   read  write
>   -  -  -  -  -  -
> drpffb-ost01  3.31T  3.19T  3  35.7K  16.0K  7.03G
>   raidz1  3.31T  3.19T  3  35.7K  16.0K  7.03G
> nvme0n1   -  -  1  5.95K  7.99K  1.17G
> nvme1n1   -  -  0  6.01K  0  1.18G
> nvme2n1   -  -  0  5.93K  0  1.17G
> nvme3n1   -  -  0  5.88K  0  1.16G
> nvme4n1   -  -  1  5.95K  7.99K  1.17G
> nvme5n1   -  -  0  5.96K  0  1.17G
>   -  -  -  -  -  -
>
> this are the tests results
>
> Fri Aug 18 16:54:48 PDT 2017 Obdfilter-survey for case=disk from
> drp-tst-ffb01
> ost  1 sz 10485760K rsz 1024K obj1 thr1
> write 7633.08   SHORT rewrite 7558.78 SHORT
> read 3205.24 [3213.70, 3226.78]
> ost  1 sz 10485760K rsz 1024K obj1 thr2
> write 7996.89 SHORT rewrite 7903.42 SHORT
> read 5264.70 SHORT
> ost  1 sz 10485760K rsz 1024K obj2 thr2 write
> 7718.94 SHORT rewrite 7977.84 SHORT read
> 5802.17 SHORT
>
>   * Lnet self test, and here I see the problems. For reference
> 172.21.52.[83,84] are the two OSSes 172.21.52.86 is the
> reader/writer. Here is the script that I ran
>
> #!/bin/bash
> export LST_SESSION=$$
> lst new_session read_write
> lst add_group servers 172.21.52.[83,84]@o2ib5
> lst add_group readers 172.21.52.86@o2ib5
> lst add_group writers 172.21.52.86@o2ib5
> lst add_batch bulk_rw
> lst add_test --batch bulk_rw --from readers --to servers \
> brw read check=simple size=1M
> lst add_test --batch bulk_rw --from writers --to servers \
> brw write check=full size=1M
> # start running
> lst run bulk_rw
> # display server stats for 30 seconds
> lst stat servers & sleep 30; kill $!
> # tear down
> lst end_session
>
>
> 

Re: [lustre-discuss] Lustre poor performance

2017-08-18 Thread Riccardo Veraldi
On 8/18/17 7:05 PM, Dennis Nelson wrote:
> If all four servers are identical and all have IB, why are you
> specifying tcp when mounting the client?
because the MDS does not have InfiniBand, only an Ethernet connection.
Only the OSSes have InfiniBand, on the ib0 interface.

this is my ldev.conf

psdrp-tst-mds01 - mgs zfs:drpffb-mgs/mgs
psdrp-tst-mds01 - mdt0 zfs:drpffb-mdt0/mdt0
#
drp-tst-ffb01 - OST01 zfs:drpffb-ost01/ost01
drp-tst-ffb02 - OST02 zfs:drpffb-ost02/ost02
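
For context, OSTs like these are normally created with mkfs.lustre on top of a
ZFS pool. The sketch below uses the pool layout shown elsewhere in this thread,
but the fsname, index and MGS NID are placeholders rather than values taken
from this setup:

mkfs.lustre --ost --backfstype=zfs --index=1 --fsname=drpffb \
    --mgsnode=172.21.42.XXX@tcp5 \
    drpffb-ost01/ost01 raidz1 nvme0n1 nvme1n1 nvme2n1 nvme3n1 nvme4n1 nvme5n1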

this is my lustre.conf on the OSSes and Lustre client

options lnet networks=o2ib5(ib0),tcp5(enp1s0f0)

this is my lustre.conf on the MDS

options lnet networks=tcp5(eth0)
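
With this split (InfiniBand between client and OSSes, TCP to the MDS), a quick
way to confirm that both paths are reachable from the client is lctl ping; the
MDS address below is a placeholder:

lctl ping 172.21.52.83@o2ib5     # OSS 1 over InfiniBand
lctl ping 172.21.52.84@o2ib5     # OSS 2 over InfiniBand
lctl ping 172.21.42.XXX@tcp5     # MDS over Ethernet (placeholder address)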






>
> Sent from my iPhone
>
> On Aug 18, 2017, at 8:57 PM, Riccardo Veraldi
> <riccardo.vera...@cnaf.infn.it <mailto:riccardo.vera...@cnaf.infn.it>>
> wrote:
>
>> Hello Keith and Dennis, these are the test I ran.
>>
>>   * obdfilter-survey, shows that I Can saturate disk performance, the
>> NVMe/ZFS backend is performing very well and it is faster then my
>> Infiniband network
>>
>> pool  alloc   free   read  write   read  write
>>   -  -  -  -  -  -
>> drpffb-ost01  3.31T  3.19T  3  35.7K  16.0K  7.03G
>>   raidz1  3.31T  3.19T  3  35.7K  16.0K  7.03G
>> nvme0n1   -  -  1  5.95K  7.99K  1.17G
>> nvme1n1   -  -  0  6.01K  0  1.18G
>> nvme2n1   -  -  0  5.93K  0  1.17G
>> nvme3n1   -  -  0  5.88K  0  1.16G
>> nvme4n1   -  -  1  5.95K  7.99K  1.17G
>> nvme5n1   -  -  0  5.96K  0  1.17G
>>   -  -  -  -  -  -
>>
>> this are the tests results
>>
>> Fri Aug 18 16:54:48 PDT 2017 Obdfilter-survey for case=disk from
>> drp-tst-ffb01
>> ost  1 sz 10485760K rsz 1024K obj1 thr1
>> write 7633.08   SHORT rewrite 7558.78 SHORT
>> read 3205.24 [3213.70, 3226.78]
>> ost  1 sz 10485760K rsz 1024K obj1 thr2
>> write 7996.89 SHORT rewrite 7903.42 SHORT
>> read 5264.70 SHORT
>> ost  1 sz 10485760K rsz 1024K obj2 thr2 write
>> 7718.94 SHORT rewrite 7977.84 SHORT read
>> 5802.17 SHORT
>>
>>   * Lnet self test, and here I see the problems. For reference
>> 172.21.52.[83,84] are the two OSSes 172.21.52.86 is the
>> reader/writer. Here is the script that I ran
>>
>> #!/bin/bash
>> export LST_SESSION=$$
>> lst new_session read_write
>> lst add_group servers 172.21.52.[83,84]@o2ib5
>> lst add_group readers 172.21.52.86@o2ib5
>> lst add_group writers 172.21.52.86@o2ib5
>> lst add_batch bulk_rw
>> lst add_test --batch bulk_rw --from readers --to servers \
>> brw read check=simple size=1M
>> lst add_test --batch bulk_rw --from writers --to servers \
>> brw write check=full size=1M
>> # start running
>> lst run bulk_rw
>> # display server stats for 30 seconds
>> lst stat servers & sleep 30; kill $!
>> # tear down
>> lst end_session
>>
>>
>> here the results
>>
>> SESSION: read_write FEATURES: 1 TIMEOUT: 300 FORCE: No
>> 172.21.52.[83,84]@o2ib5 are added to session
>> 172.21.52.86@o2ib5 are added to session
>> 172.21.52.86@o2ib5 are added to session
>> Test was added successfully
>> Test was added successfully
>> bulk_rw is running now
>> [LNet Rates of servers]
>> [R] Avg: 1751 RPC/s Min: 0RPC/s Max: 3502 RPC/s
>> [W] Avg: 2525 RPC/s Min: 0RPC/s Max: 5050 RPC/s
>> [LNet Bandwidth of servers]
>> [R] Avg: 488.79   MiB/s Min: 0.00 MiB/s Max: 977.59   MiB/s
>> [W] Avg: 773.99   MiB/s Min: 0.00 MiB/s Max: 1547.99  MiB/s
>> [LNet Rates of servers]
>> [R] Avg: 1718 RPC/s Min: 0RPC/s Max: 3435 RPC/s
>> [W] Avg: 2479 RPC/s Min: 0RPC/s Max: 4958 RPC/s
>> [LNet Bandwidth of servers]
>> [R] Avg: 478.19   MiB/s Min: 0.00 MiB/s Max: 956.39   MiB/s
>> [W] Avg: 761.74   MiB/s Min: 0.00 MiB/s Max: 1523.47  MiB/s
>> [LNet Rates of servers]
>> [R] Avg: 1734 RPC/s Min: 0RPC/s Max: 3467 RPC/s
>> [W] Avg: 2506 RPC/s Min: 0RPC/s Max: 5012 RPC/s
>> [LNet Bandwidth of servers]
>> [R] Avg: 480.79   MiB/s Min: 0.00 MiB/s Max: 961.58   MiB/s
>> [W] Avg: 772.49   MiB/s Min: 0.00 MiB/s Max: 1544.98  MiB/s
>> [LNet Rates of servers]
>> [R] Avg: 1722 RPC/s Min: 0RPC/s Max: 3444 RPC/s
>

Re: [lustre-discuss] Lustre poor performance

2017-08-18 Thread Riccardo Veraldi
> I would suggest you a few other tests to help isolate where the issue might 
> be.  
>
> 1. What is the single thread "DD" write speed?
>  
> 2. LNet self-test:  Please see "Chapter 28. Testing Lustre Network 
> Performance (LNet Self-Test)" in the Lustre manual if this is a new test for 
> you. 
> This will help show how much LNet bandwidth you have from your single client.  
> There are tunables in the LNet layer that can affect things.  Which QDR HCA 
> are you using?
>
> 3. OBDFilter_survey :  Please see "29.3. Testing OST Performance 
> (obdfilter-survey)" in the Lustre manual.  This test will help demonstrate 
> what the backend NVMe/ZFS setup can do at the OBD layer in Lustre.  
>
> Thanks,
>  Keith 
> -Original Message-----
> From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On 
> Behalf Of Riccardo Veraldi
> Sent: Thursday, August 17, 2017 10:48 PM
> To: Dennis Nelson <dnel...@ddn.com>; lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Lustre poor performance
>
> this is my lustre.conf
>
> [drp-tst-ffb01:~]$ cat /etc/modprobe.d/lustre.conf options lnet 
> networks=o2ib5(ib0),tcp5(enp1s0f0)
>
> data transfer is over infiniband
>
> ib0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 65520
> inet 172.21.52.83  netmask 255.255.252.0  broadcast 172.21.55.255
>
>
> On 8/17/17 10:45 PM, Riccardo Veraldi wrote:
>> On 8/17/17 9:22 PM, Dennis Nelson wrote:
>>> It appears that you are running iozone on a single client?  What kind of 
>>> network is tcp5?  Have you looked at the network to make sure it is not the 
>>> bottleneck?
>>>
>> yes the data transfer is on ib0 interface and I did a memory to memory 
>> test through InfiniBand QDR  resulting in 3.7GB/sec.
>> tcp is used to connect to the MDS. It is tcp5 to differentiate it from 
>> my other many Lustre clusters. I could have called it tcp but it does 
>> not make any difference performance wise.
>> I ran the test from one single node yes, I ran the same test also 
>> locally on a zpool identical to the one on the Lustre OSS.
>> I have 4 identical servers, each of them with the same nvme disks:
>>
>> server1: OSS - OST1 Lustre/ZFS  raidz1
>>
>> server2: OSS - OST2 Lustre/ZFS  raidz1
>>
>> server3: local ZFS raidz1
>>
>> server4: Lustre client
>>
>>
>>
>> ___
>> lustre-discuss mailing list
>> lustre-discuss@lists.lustre.org
>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>
> ___
> lustre-discuss mailing list
> lustre-discuss@lists.lustre.org
> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


  1   2   >