What can I expect to happen to the jobs that are suspended during the filesystem restart? Will processes holding open file handles die when I unsuspend them after the filesystem restart?
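For what it's worth, one way to check afterwards whether the clients were actually evicted is to look at each client's import state. This is only a sketch and assumes lctl is available on the clients; nothing below is taken from our system:

  # On a client: list each MDT/OST import and its state.
  # "state: FULL" means the import reconnected cleanly; "EVICTED" or
  # "DISCONNECTED" means the client lost its cached state for that target.
  lctl get_param mdc.*.import osc.*.import | grep -E 'target|state'

  # Eviction messages, if any, also show up in the client kernel log:
  dmesg | grep -i evict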
Thanks!
-Raj

On Thu, Feb 21, 2019 at 12:52 PM Colin Faber <cfa...@gmail.com> wrote:

> Ah yes,
>
> If you're adding to an existing OSS, then you will need to reconfigure the
> file system, which requires a writeconf event.
>
> On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam <ans...@gmail.com> wrote:
>
>> The new OSTs will be added to the existing file system (the OSS nodes
>> are already part of the filesystem); I will have to re-configure the
>> current HA resource configuration to tell it about the 4 new OSTs.
>> Our ExaScaler's HA monitors the individual OSTs, and I need to
>> re-configure the HA on the existing filesystem.
>>
>> Our vendor support has confirmed that we would have to restart the
>> filesystem if we want to regenerate the HA configs to include the new OSTs.
>>
>> Thanks,
>> -Raj
>>
>> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber <cfa...@gmail.com> wrote:
>>
>>> It seems to me that steps may still be missing?
>>>
>>> You're going to rack/stack and provision the OSS nodes with new OSTs.
>>>
>>> Then you're going to introduce failover options somewhere? New OSTs?
>>> The existing system? Etc.?
>>>
>>> If you're introducing failover with the new OSTs and leaving the
>>> existing system in place, you should be able to accomplish this without
>>> bringing the system offline.
>>>
>>> If you're going to be introducing failover to your existing system, then
>>> you will need to reconfigure the file system to accommodate the new
>>> failover settings (failover nodes, etc.).
>>>
>>> -cf
>>>
>>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam <ans...@gmail.com> wrote:
>>>
>>>> Our upgrade strategy is as follows:
>>>>
>>>> 1) Load all disks into the storage array.
>>>> 2) Create RAID pools and virtual disks.
>>>> 3) Format the new OSTs using the mkfs.lustre command. (I still have
>>>> to figure out all the parameters used on the existing OSTs.)
>>>> 4) Create mount points on all OSSs.
>>>> 5) Mount the Lustre OSTs.
>>>> 6) Maybe rebalance the filesystem.
>>>>
>>>> My understanding is that the above can be done without bringing the
>>>> filesystem down. I also want to create the HA configuration (corosync
>>>> and pacemaker) for the new OSTs, and that step requires the filesystem
>>>> to be down. I want to know what would happen to the suspended processes
>>>> across the cluster when I bring the filesystem down to re-generate the
>>>> HA configs.
>>>>
>>>> Thanks,
>>>> -Raj
>>>>
>>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber <cfa...@gmail.com> wrote:
>>>>
>>>>> Can you provide more details on your upgrade strategy? In some cases
>>>>> expanding your storage shouldn't impact client / job activity at all.
>>>>>
>>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam <ans...@gmail.com> wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We are planning on expanding our storage by adding more OSTs to our
>>>>>> Lustre file system. It looks like it would be easier to expand if we
>>>>>> bring the filesystem down and perform the necessary operations. We are
>>>>>> planning to suspend all the jobs running on the cluster. We originally
>>>>>> planned to add the new OSTs to the live filesystem.
>>>>>>
>>>>>> We are trying to determine the potential impact to the suspended jobs
>>>>>> if we bring down the filesystem for the upgrade.
>>>>>> One of the questions we have is: what would happen to the suspended
>>>>>> processes that hold an open file handle in the Lustre file system when
>>>>>> the filesystem is brought down for the upgrade?
>>>>>> Will they recover from the client eviction?
>>>>>>
>>>>>> We do have vendor support and have engaged them. I wanted to ask the
>>>>>> community and get some feedback.
>>>>>>
>>>>>> Thanks,
>>>>>> -Raj
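For the archives, here is a rough sketch of the server-side steps discussed above. It assumes a generic Lustre setup; the fsname, OST index, device paths and NIDs are placeholders, not values from this system:

  # 1) Read the parameters of an existing OST without changing anything:
  tunefs.lustre --dryrun /dev/mapper/ost0000

  # 2) Format a new OST to match (the index, MGS NID and failover partner
  #    below are made-up examples):
  mkfs.lustre --ost --fsname=lustre0 --index=4 \
      --mgsnode=10.0.0.1@o2ib \
      --servicenode=10.0.0.11@o2ib --servicenode=10.0.0.12@o2ib \
      /dev/mapper/ost0004

  # 3) If failover settings on existing targets change, regenerate the
  #    configuration logs (this is the step that needs the filesystem down):
  #    unmount clients, then the MDT(s), then the OSTs, and on each server run
  tunefs.lustre --writeconf /dev/mapper/<target>
  #    then remount the MGS/MDT first, the OSTs next, and the clients last.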
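Similarly, a generic corosync/pacemaker and rebalancing sketch. The ExaScaler tooling regenerates its own HA configuration, so the resource agent, resource names, devices and node names below are only illustrative:

  # Define the new OST mount as a pacemaker resource (stock Filesystem agent):
  pcs resource create lustre-OST0004 ocf:heartbeat:Filesystem \
      device=/dev/mapper/ost0004 directory=/mnt/ost0004 fstype=lustre \
      op monitor interval=120s timeout=300s
  # Prefer the primary OSS, allow failover to its partner:
  pcs constraint location lustre-OST0004 prefers oss03=100
  pcs constraint location lustre-OST0004 prefers oss04=50

  # Optional rebalance afterwards, run from a client: migrate some large
  # files off an old OST so the new, empty OSTs pick up data (avoid files
  # that are actively being written):
  lfs find /mnt/lustre --ost 0 -size +1G | lfs_migrate -y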