Re: [lustre-discuss] design to enable kernel updates

2017-02-10 Thread Vicker, Darby (JSC-EG311)
Yeah, since this is our first experience using failover with lustre, we are just 
doing manual failover for now. But we may implement the 
corosync/pacemaker/stonith setup in the future.
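
A minimal sketch of what a manual failover looks like with a ZFS-backed target of
this kind (the pool name, target label, and mount point below are assumptions drawn
from the ldev.conf later in the thread, not the exact procedure used here):

# On the surviving partner (e.g. hpfs-fsl-oss01), take over oss00's target:
zpool import -f oss00-0
mkdir -p /mnt/lustre/foreign/hpfs-fsl-OST0000    # mount point is an assumption
mount -t lustre oss00-0/ost-fsl /mnt/lustre/foreign/hpfs-fsl-OST0000

# Fail back once hpfs-fsl-oss00 is healthy again:
umount /mnt/lustre/foreign/hpfs-fsl-OST0000
zpool export oss00-0    # then import and mount on oss00 as usual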

Re: [lustre-discuss] design to enable kernel updates

2017-02-10 Thread Jeff Johnson
You're also leaving out the corosync/pacemaker/stonith configuration. That is, 
unless you are doing manual export/import of pools.
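
For reference, a rough sketch of what the corosync/pacemaker/stonith side could look
like for one ZFS-backed OST pair, using pcs (the fence device, resource-agent choices,
and all names here are illustrative assumptions, not a configuration from this thread):

# Fencing first -- pacemaker will not fail anything over without working stonith.
pcs stonith create fence-oss00 fence_ipmilan \
    ipaddr=oss00-ipmi login=admin passwd=secret pcmk_host_list=hpfs-fsl-oss00

# Import/export the ZFS pool as a cluster resource.
pcs resource create oss00-pool ocf:heartbeat:ZFS pool=oss00-0

# Mount the Lustre target on whichever node currently holds the pool.
pcs resource create oss00-ost ocf:heartbeat:Filesystem \
    device=oss00-0/ost-fsl directory=/mnt/lustre/hpfs-fsl-OST0000 fstype=lustre

# Keep them together and ordered: pool before mount, both on the same node.
pcs constraint colocation add oss00-ost with oss00-pool INFINITY
pcs constraint order oss00-pool then oss00-ost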

Re: [lustre-discuss] design to enable kernel updates

2017-02-10 Thread Vicker, Darby (JSC-EG311)
Sure.  Our hardware is very similar to this:

https://www.supermicro.com/solutions/Lustre.cfm

We are using twin servers instead of two single-chassis servers as shown there, but 
functionally this is the same; we can just fit more into a single rack with the twin 
servers.  We are using a single JBOD per twin server, as shown in one of the 
configurations on the above page, and ZFS as the backend.  All servers are dual-homed 
on both Ethernet and IB.  The combined MGS/MDS is at 10.148.0.30 on IB and X.X.98.30 
on Ethernet; the secondary MGS/MDS is on the .31 addresses for both networks.  With 
the combined MGS/MDS, they both fail over together.  This did require a patch from 
LU-8397 to get MGS failover working properly, so we are using 2.9.0 with the LU-8397 
patch and compiling our own server rpms.  But this is pretty simple with ZFS since 
you don't need a patched kernel.  The lustre formatting and configuration bits are 
below.  I'm leaving out the ZFS pool creation, but I think you get the idea.  

I hope that helps. 

Darby



if [[ $HOSTNAME == *mds* ]] ; then

    mkfs.lustre \
        --fsname=hpfs-fsl \
        --backfstype=zfs \
        --reformat \
        --verbose \
        --mgs --mdt --index=0 \
        --servicenode=${LUSTRE_LOCAL_TCP_IP}@tcp0,${LUSTRE_LOCAL_IB_IP}@o2ib0 \
        --servicenode=${LUSTRE_PEER_TCP_IP}@tcp0,${LUSTRE_PEER_IB_IP}@o2ib0 \
        metadata/meta-fsl    # dataset name as referenced in ldev.conf below

elif [[ $HOSTNAME == *oss* ]] ; then

    # Derive the OST index from the hostname, e.g. hpfs-fsl-oss03 -> 3
    num=`hostname --short | sed 's/hpfs-fsl-//' | sed 's/oss//'`
    num=`printf '%g' $num`    # strips leading zeros

    # $pool is set during the ZFS pool creation (omitted here), e.g. oss00-0
    mkfs.lustre \
        --mgsnode=X.X.98.30@tcp0,10.148.0.30@o2ib0 \
        --mgsnode=X.X.98.31@tcp0,10.148.0.31@o2ib0 \
        --fsname=hpfs-fsl \
        --backfstype=zfs \
        --reformat \
        --verbose \
        --ost --index=$num \
        --servicenode=${LUSTRE_LOCAL_TCP_IP}@tcp0,${LUSTRE_LOCAL_IB_IP}@o2ib0 \
        --servicenode=${LUSTRE_PEER_TCP_IP}@tcp0,${LUSTRE_PEER_IB_IP}@o2ib0 \
        $pool/ost-fsl
fi
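
Not from the thread, but for completeness: with the MGS reachable at either service
node, a client mounts against the failover pair by separating failover nodes with ':'
and a single node's NIDs with ',' (the client mount point is just an example):

mount -t lustre \
    X.X.98.30@tcp0,10.148.0.30@o2ib0:X.X.98.31@tcp0,10.148.0.31@o2ib0:/hpfs-fsl \
    /mnt/hpfs-fsl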




/etc/ldev.conf:

#local  foreign/-  label   [md|zfs:]device-path   [journal-path]/- [raidtab]

hpfs-fsl-mds0  hpfs-fsl-mds1  hpfs-fsl-MDT0000  zfs:metadata/meta-fsl

hpfs-fsl-oss00 hpfs-fsl-oss01 hpfs-fsl-OST0000  zfs:oss00-0/ost-fsl
hpfs-fsl-oss01 hpfs-fsl-oss00 hpfs-fsl-OST0001  zfs:oss01-0/ost-fsl
hpfs-fsl-oss02 hpfs-fsl-oss03 hpfs-fsl-OST0002  zfs:oss02-0/ost-fsl
hpfs-fsl-oss03 hpfs-fsl-oss02 hpfs-fsl-OST0003  zfs:oss03-0/ost-fsl
hpfs-fsl-oss04 hpfs-fsl-oss05 hpfs-fsl-OST0004  zfs:oss04-0/ost-fsl
hpfs-fsl-oss05 hpfs-fsl-oss04 hpfs-fsl-OST0005  zfs:oss05-0/ost-fsl
hpfs-fsl-oss06 hpfs-fsl-oss07 hpfs-fsl-OST0006  zfs:oss06-0/ost-fsl
hpfs-fsl-oss07 hpfs-fsl-oss06 hpfs-fsl-OST0007  zfs:oss07-0/ost-fsl
hpfs-fsl-oss08 hpfs-fsl-oss09 hpfs-fsl-OST0008  zfs:oss08-0/ost-fsl
hpfs-fsl-oss09 hpfs-fsl-oss08 hpfs-fsl-OST0009  zfs:oss09-0/ost-fsl
hpfs-fsl-oss10 hpfs-fsl-oss11 hpfs-fsl-OST000a  zfs:oss10-0/ost-fsl
hpfs-fsl-oss11 hpfs-fsl-oss10 hpfs-fsl-OST000b  zfs:oss11-0/ost-fsl




/etc/modprobe.d/lustre.conf:

options lnet networks=tcp0(enp4s0),o2ib0(ib1)
options ko2iblnd map_on_demand=32
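
A quick sanity check of the LNet configuration above (standard lctl usage, nothing
site-specific):

modprobe lustre    # loads lustre and its lnet dependencies
lctl network up    # brings up the networks from 'options lnet networks=...'
lctl list_nids     # should print this node's @tcp0 and @o2ib0 NIDs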

Re: [lustre-discuss] design to enable kernel updates

2017-02-09 Thread Brian Andrus

Darby,

Do you mind if I inquire about the setup for your lustre systems?

I'm trying to understand how the MGS/MGT is setup for high availability.
I understand with OSTs and MDTs where all I really need is to have the 
failnode set when I do the mkfs.lustre
However, as I understand it, you have to use something like pacemaker 
and drbd to deal with the MGS/MGT. Is this how you approached it?


Brian Andrus




Re: [lustre-discuss] design to enable kernel updates

2017-02-06 Thread Vicker, Darby (JSC-EG311)
Agreed.  We are just about to go into production on our next LFS with the 
setup described.  We had to get past a bug in the MGS failover for 
dual-homed servers but as of last week that is done and everything is 
working great (see "MGS failover problem" thread on this mailing list from
this month and last).  We are in the process of syncing our existing LFS
to this new one and I've failed over/rebooted/upgraded the new LFS servers
many times now to make sure we can do this in practice when the new LFS goes
into production.  It's working beautifully.  

Many thanks to the lustre developers for their continued efforts.  We have 
been using and have been fans of lustre for quite some time now and it 
just keeps getting better.  
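
Two generic checks that are handy during that kind of failover testing (standard
commands, not necessarily what was used here):

# On a client, after a failover, confirm every MDT/OST is reachable again:
lfs check servers
# On the server that took over, watch recovery finish:
lctl get_param mdt.*.recovery_status obdfilter.*.recovery_status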



Re: [lustre-discuss] design to enable kernel updates

2017-02-06 Thread Ben Evans
It's certainly possible.  When I've done that sort of thing, you upgrade
the OS on all the servers first, then boot half of them (the A side) into the
new image; all of their targets fail over to the B servers.  Once the A side
is up, reboot the B half into the new OS.  Finally, fail back to the
"normal" running state.

At least when I've done it, you'll want to do the failovers manually so
the HA infrastructure doesn't surprise you for any reason.

-Ben
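
Spelled out as a rough shell outline (the host names and export step are assumptions;
the point is only the ordering described above):

# 1. Install the new OS/kernel image on every server; no reboots yet.
# 2. Move the A-side targets to their B-side partners, then reboot the A side:
for a in hpfs-fsl-oss00 hpfs-fsl-oss02 ; do    # example A-side hosts
    ssh $a 'umount -a -t lustre && zpool export -a'
    # ...import the pools and mount the targets on the B-side partner...
done
# ...reboot the A side into the new image and wait for it to rejoin...
# 3. Repeat in the other direction to upgrade the B side.
# 4. Fail everything back to its primary node ("normal" running state).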



[lustre-discuss] design to enable kernel updates

2017-02-06 Thread Brian Andrus

All,

I have been contemplating how lustre could be configured such that I 
could update the kernel on each server without downtime.


It seems this is _almost_ possible when you have a SAN system so you 
have failover for OSTs and MDTs. BUT the MGS/MGT seems to be the 
problematic one, since rebooting it seems to cause downtime that cannot 
be avoided.


If you have a system where the disks are physically part of the OSS 
hardware, you are out of luck. The hypothetical scenario I am using is 
if someone had a VM that was a qcow image on a lustre mount (basically 
an active, open file being read/written to continuously). How could 
lustre be built to ensure anyone on the VM would not notice a kernel 
upgrade to the underlying lustre servers?



Could such a setup be done? It seems that would be a better use case for 
something like GPFS or Gluster, but being a die-hard lustre enthusiast, 
I want to at least show it could be done.



Thanks in advance,

Brian Andrus
