Re: [lustre-discuss] More problems setting things up....

2016-09-21 Thread Mohr Jr, Richard Frank (Rick Mohr)

> On Sep 21, 2016, at 5:08 AM, Phill Harvey-Smith wrote:
> 
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dsl_prop_register
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dsl_prop_register (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_cursor_serialize
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol zap_cursor_serialize (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_remove
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol zap_remove (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dmu_tx_hold_write
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dmu_tx_hold_write (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dsl_prop_unregister
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dsl_prop_unregister (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol sa_spill_rele
> Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol sa_spill_rele (err -22)
> Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_curs:

Oftentimes those types of error messages indicate a version mismatch between
kernel modules (here, between osd_zfs and the ZFS modules that export those
symbols).  Did you just download the Lustre RPMs from the web site?
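
As a rough aid for this kind of check, here is a minimal Python sketch (not part of the original report) that pulls the module name and the symbol names out of "disagrees about version of symbol" lines. The function name mismatched_symbols is hypothetical, and the sketch assumes the log text is unquoted, i.e. without the "> " reply prefix.

```python
import re
from collections import defaultdict

# Matches the kernel's "disagrees about version of symbol" message.
# \s+ before the symbol name tolerates a line wrap added by a mail client.
_MISMATCH = re.compile(r"(\w+): disagrees about version of symbol\s+(\w+)")


def mismatched_symbols(log_text):
    """Return {module: sorted list of symbols it disagrees about}."""
    found = defaultdict(set)
    for module, symbol in _MISMATCH.findall(log_text):
        found[module].add(symbol)
    return {module: sorted(symbols) for module, symbols in found.items()}


# Shortened sample taken from the log lines quoted in this message.
sample = """\
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dsl_prop_register
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dsl_prop_register (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol
zap_remove
"""

print(mismatched_symbols(sample))
```

The symbols in the quoted log (dsl_prop_register, zap_remove, dmu_tx_hold_write and so on) are ZFS functions, so a reasonable next step would be to compare the installed ZFS module version (for example with modinfo zfs) against the version the Lustre osd_zfs module was built for.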

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] "Not on preferred path" error

2016-09-21 Thread Ben Evans
That's the way multipath is showing it, yes. However, back in the 1.8 days
we used LSI's proprietary multipathing kernel modules, called MPP. MPP
presented both paths to the device driver layer as a single device, so the
multipath view would show a single path.

I no longer have any of my notes on this sort of thing, and I don't know if
there are any old-school LSI/NetApp/Engenio people on here who would have
a better chance of diagnosing it.

-Ben Evans

On 9/21/16, 1:37 PM, "Tao, Zhiqi"  wrote:

>It appears that there is only one SAS path to the back-end storage, which
>explains why some of the LUNs show up on a non-preferred path.
>
>Typically we recommend two SAS connections from each OSS to the storage,
>one to the upper controller and one to the lower controller, with the LUNs
>distributed between the two controllers. In the event of a SAS connection
>failure, all LUNs fail over to one controller, and the LUNs that used to go
>through the other controller are reported as not being on the preferred
>path. Because this kind of failover happens at the multipath layer, it is
>transparent to Lustre, and the file system continues to run, as you
>observed.
>
>Best Regards,
>Zhiqi
>
>-----Original Message-----
>From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On
>Behalf Of Lewis Hyatt
>Sent: Tuesday, September 20, 2016 12:53 PM
>To: Ben Evans ; lustre-discuss@lists.lustre.org
>Subject: Re: [lustre-discuss] "Not on preferred path" error
>
>I see, thanks. This is what we see from running the multipath commands. I
>don't see anything that means anything to me, but FWIW it looks the same as
>on our other OSS, which is working OK.
>
>$multipath -ll
>map03 (360080e50002ee510023f50092c6c) dm-13 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:1:3 sdk 8:160 [active][ready]
>map02 (360080e50002ee410024250092c11) dm-12 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:1:2 sdj 8:144 [active][ready]
>map01 (360080e50002ee510023b50092c4c) dm-11 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:1:1 sdi 8:128 [active][ready]
>map00 (360080e50002ee410023e50092bf2) dm-10 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:1:0 sdh 8:112 [active][ready]
>map09 (360080e50002ee4dc02f250092c62) dm-7 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:0:3 sde 8:64  [active][ready]
>map11 (360080e50002ee4dc02f650092c84) dm-9 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:0:5 sdg 8:96  [active][ready]
>map08 (360080e50002ec89002e550092a07) dm-6 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:0:2 sdd 8:48  [active][ready]
>map10 (360080e50002ec89002e950092a27) dm-8 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:0:4 sdf 8:80  [active][ready]
>map07 (360080e50002ee4dc02ee50092c44) dm-5 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:0:1 sdc 8:32  [active][ready]
>map06 (360080e50002ec89002e1500929e9) dm-4 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][active]
>  \_ 3:0:0:0 sdb 8:16  [active][ready]
>map05 (360080e50002ee510024350092c8c) dm-15 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:1:5 sdm 8:192 [active][ready]
>map04 (360080e50002ee410024650092c31) dm-14 LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][rw]
>\_ round-robin 0 [prio=1][enabled]
>  \_ 3:0:1:4 sdl 8:176 [active][ready]
>
>===
>
>$multipath -r
>reload: map06 (360080e50002ec89002e1500929e9)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]
>  \_ 3:0:0:0 sdb 8:16  [active][ready]
>reload: map07 (360080e50002ee4dc02ee50092c44)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]
>  \_ 3:0:0:1 sdc 8:32  [active][ready]
>reload: map08 (360080e50002ec89002e550092a07)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]
>  \_ 3:0:0:2 sdd 8:48  [active][ready]
>reload: map09 (360080e50002ee4dc02f250092c62)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]
>  \_ 3:0:0:3 sde 8:64  [active][ready]
>reload: map10 (360080e50002ec89002e950092a27)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]
>  \_ 3:0:0:4 sdf 8:80  [active][ready]
>reload: map11 (360080e50002ee4dc02f650092c84)  LSI,VirtualDisk
>[size=15T][features=0][hwhandler=0][n/a]
>\_ round-robin 0 [prio=1][undef]

Re: [lustre-discuss] "Not on preferred path" error

2016-09-21 Thread Tao, Zhiqi
It appears that there is only one SAS path to the back-end storage, which
explains why some of the LUNs show up on a non-preferred path.

Typically we recommend two SAS connections from each OSS to the storage, one
to the upper controller and one to the lower controller, with the LUNs
distributed between the two controllers. In the event of a SAS connection
failure, all LUNs fail over to one controller, and the LUNs that used to go
through the other controller are reported as not being on the preferred path.
Because this kind of failover happens at the multipath layer, it is
transparent to Lustre, and the file system continues to run, as you observed.
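
To check the single-path observation against a listing like the "multipath -ll" output quoted further down, a minimal Python sketch along these lines could count the paths under each map. It assumes the older multipath-tools output layout pasted in this thread (newer versions print a different layout), and the function names paths_per_map and single_path_maps are hypothetical.

```python
import re

# Map header in the older multipath-tools layout quoted in this thread, e.g.
# "map03 (360080e5...) dm-13 LSI,VirtualDisk" (optionally prefixed "reload: ").
_MAP = re.compile(r"^(?:reload: )?(\S+) \([0-9a-f]+\)")
# Path line, e.g. "  \_ 3:0:1:3 sdk 8:160 [active][ready]".
_PATH = re.compile(r"^\s*\\_ (\d+:\d+:\d+:\d+) (\S+) ")


def paths_per_map(output):
    """Return {map name: [block devices]} parsed from multipath listing text."""
    maps = {}
    current = None
    for line in output.splitlines():
        header = _MAP.match(line)
        if header:
            current = header.group(1)
            maps[current] = []
            continue
        path = _PATH.match(line)
        if path and current is not None:
            maps[current].append(path.group(2))
    return maps


def single_path_maps(output):
    """Sorted names of maps that have fewer than two paths."""
    return sorted(n for n, devs in paths_per_map(output).items() if len(devs) < 2)


# First two maps from the listing quoted in this thread.
sample = r"""map03 (360080e50002ee510023f50092c6c) dm-13 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:3 sdk 8:160 [active][ready]
map02 (360080e50002ee410024250092c11) dm-12 LSI,VirtualDisk
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:2 sdj 8:144 [active][ready]
"""

print(paths_per_map(sample))
print(single_path_maps(sample))
```

With two SAS connections per OSS, each map would be expected to list two paths, so any map reported by single_path_maps has no second path to fail over to.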

Best Regards,
Zhiqi

-----Original Message-----
From: lustre-discuss [mailto:lustre-discuss-boun...@lists.lustre.org] On Behalf 
Of Lewis Hyatt
Sent: Tuesday, September 20, 2016 12:53 PM
To: Ben Evans ; lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] "Not on preferred path" error

I see, thanks. This is what we see from running the multipath commands. I
don't see anything that means anything to me, but FWIW it looks the same as on
our other OSS, which is working OK.

$multipath -ll
map03 (360080e50002ee510023f50092c6c) dm-13 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:3 sdk 8:160 [active][ready]
map02 (360080e50002ee410024250092c11) dm-12 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:1:2 sdj 8:144 [active][ready]
map01 (360080e50002ee510023b50092c4c) dm-11 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:1 sdi 8:128 [active][ready]
map00 (360080e50002ee410023e50092bf2) dm-10 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:0 sdh 8:112 [active][ready]
map09 (360080e50002ee4dc02f250092c62) dm-7 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:0:3 sde 8:64  [active][ready]
map11 (360080e50002ee4dc02f650092c84) dm-9 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:5 sdg 8:96  [active][ready]
map08 (360080e50002ec89002e550092a07) dm-6 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:0:2 sdd 8:48  [active][ready]
map10 (360080e50002ec89002e950092a27) dm-8 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:4 sdf 8:80  [active][ready]
map07 (360080e50002ee4dc02ee50092c44) dm-5 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:1 sdc 8:32  [active][ready]
map06 (360080e50002ec89002e1500929e9) dm-4 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][active]
  \_ 3:0:0:0 sdb 8:16  [active][ready]
map05 (360080e50002ee510024350092c8c) dm-15 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:5 sdm 8:192 [active][ready]
map04 (360080e50002ee410024650092c31) dm-14 LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][rw]
\_ round-robin 0 [prio=1][enabled]
  \_ 3:0:1:4 sdl 8:176 [active][ready]

===

$multipath -r
reload: map06 (360080e50002ec89002e1500929e9)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:0 sdb 8:16  [active][ready]
reload: map07 (360080e50002ee4dc02ee50092c44)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:1 sdc 8:32  [active][ready]
reload: map08 (360080e50002ec89002e550092a07)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:2 sdd 8:48  [active][ready]
reload: map09 (360080e50002ee4dc02f250092c62)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:3 sde 8:64  [active][ready]
reload: map10 (360080e50002ec89002e950092a27)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:4 sdf 8:80  [active][ready]
reload: map11 (360080e50002ee4dc02f650092c84)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:0:5 sdg 8:96  [active][ready]
reload: map00 (360080e50002ee410023e50092bf2)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:0 sdh 8:112 [active][ready]
reload: map01 (360080e50002ee510023b50092c4c)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:1 sdi 8:128 [active][ready]
reload: map02 (360080e50002ee410024250092c11)  LSI,VirtualDisk 
[size=15T][features=0][hwhandler=0][n/a]
\_ round-robin 0 [prio=1][undef]
  \_ 3:0:1:2 sdj 8:144 [active][ready]
reload: map03 

Re: [lustre-discuss] will lustre client 2.8 work with lustre server 2.5.4x

2016-09-21 Thread Patrick Farrell
Lydia,


A Lustre 2.8 client will work fine with a Lustre 2.5.x server.  You will not be
able to use some of the new features in 2.8 without a 2.8 server, of course.


- Patrick


From: lustre-discuss on behalf of Lydia Heck
Sent: Wednesday, September 21, 2016 5:27:05 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] will lustre client 2.8 work with lustre server 2.5.4x


Dear list,

Will a Lustre 2.8 client work with a Lustre 2.5.4x server?

I am planning two Lustre server setups, one fixed at version 2.5.4x and
the other running 2.8.x.

If I install the Lustre 2.8 client on the cluster nodes, will they be able to
communicate effectively with the 2.5.4x server?

Best wishes,
Lydia




[lustre-discuss] will lustre client 2.8 work with lustre server 2.5.4x

2016-09-21 Thread Lydia Heck


Dear list,

Will a Lustre 2.8 client work with a Lustre 2.5.4x server?

I am planning two Lustre server setups, one fixed at version 2.5.4x and
the other running 2.8.x.


If I install the Lustre 2.8 client on the cluster nodes, will they be able to
communicate effectively with the 2.5.4x server?


Best wishes,
Lydia




[lustre-discuss] More problems setting things up....

2016-09-21 Thread Phill Harvey-Smith

Hi all,

I've hopefully set up a test MGT/MGS on one of our test servers. However,
when I try to start Lustre with "service lustre start", it reports [OK],
but the following errors are logged in /var/log/messages:


Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dsl_prop_register
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dsl_prop_register (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_cursor_serialize
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol zap_cursor_serialize (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_remove
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol zap_remove (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dmu_tx_hold_write
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dmu_tx_hold_write (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol dsl_prop_unregister
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol dsl_prop_unregister (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol sa_spill_rele
Sep 21 09:44:29 oric kernel: osd_zfs: Unknown symbol sa_spill_rele (err -22)
Sep 21 09:44:29 oric kernel: osd_zfs: disagrees about version of symbol zap_curs:


Running CentOS 7.2 with Lustre 2.8.

Any clues as to what the problem is? Is the wrong version of Lustre installed?

Cheers.

Phill.


[lustre-discuss] FID error showing on MDS

2016-09-21 Thread Jérôme BECOT

Hello,

We migrated the server software from 2.6 to 2.7 last month, and now the logs
are filled with lines like this:


 lustre-MDT-osd: FID [0x25edc:0x3b91:0x0] != self_fid [0x25edc:0x3be5:0x0]


Is there anything we can do?

Thanks

--
Jérome BECOT

Systems and Network Administrator

Molécules à visée Thérapeutique par des approches in Silico (MTi)
Univ Paris Diderot, UMRS973 Inserm
Case 013
Bât. Lamarck A, porte 412
35, rue Hélène Brion 75205 Paris Cedex 13
France

Tel : 01 57 27 83 82
