Ah yes, if you're adding OSTs to an existing OSS, then you will need to reconfigure the file system, which requires a writeconf.
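
For reference, a rough dry-run sketch of what a writeconf pass looks like. It only builds and prints the `tunefs.lustre --writeconf` command lines, MDT first and then each OST, which is the order the Lustre manual describes as far as I recall. The helper name and the device paths are made up for illustration, and it assumes all clients and targets are already unmounted.

```python
import shlex


def writeconf_plan(mdt_device, ost_devices):
    """Return tunefs.lustre --writeconf commands, MDT first, then each OST.

    Hypothetical helper: device paths are placeholders supplied by the
    caller, and every client and target is assumed to be unmounted already.
    """
    devices = [mdt_device] + list(ost_devices)
    return [["tunefs.lustre", "--writeconf", dev] for dev in devices]


if __name__ == "__main__":
    # Placeholder device names, not taken from any real system.
    plan = writeconf_plan("/dev/mapper/mdt0",
                          ["/dev/mapper/ost0", "/dev/mapper/ost1"])
    for argv in plan:
        # Dry run: print the command line only, nothing is executed.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

After the writeconf the targets are remounted (MGS/MDT first, then OSTs, then clients) so the configuration logs get regenerated.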
On Thu, Feb 21, 2019 at 10:00 AM Raj Ayyampalayam <ans...@gmail.com> wrote:

> The new OST's will be added to the existing file system (the OSS nodes are
> already part of the filesystem), I will have to re-configure the current HA
> resource configuration to tell it about the 4 new OST's.
> Our exascaler's HA monitors the individual OST and I need to re-configure
> the HA on the existing filesystem.
>
> Our vendor support has confirmed that we would have to restart the
> filesystem if we want to regenerate the HA configs to include the new OST's.
>
> Thanks,
> -Raj
>
>
> On Thu, Feb 21, 2019 at 11:23 AM Colin Faber <cfa...@gmail.com> wrote:
>
>> It seems to me that steps may still be missing?
>>
>> You're going to rack/stack and provision the OSS nodes with new OSTs'.
>>
>> Then you're going to introduce failover options somewhere? new osts?
>> existing system? etc?
>>
>> If you're introducing failover with the new OST's and leaving the
>> existing system in place, you should be able to accomplish this without
>> bringing the system offline.
>>
>> If you're going to be introducing failover to your existing system then
>> you will need to reconfigure the file system to accommodate the new
>> failover settings (failover nides, etc.)
>>
>> -cf
>>
>>
>> On Thu, Feb 21, 2019 at 9:13 AM Raj Ayyampalayam <ans...@gmail.com>
>> wrote:
>>
>>> Our upgrade strategy is as follows:
>>>
>>> 1) Load all disks into the storage array.
>>> 2) Create RAID pools and virtual disks.
>>> 3) Create lustre file system using mkfs.lustre command. (I still have to
>>> figure out all the parameters used on the existing OSTs).
>>> 4) Create mount points on all OSSs.
>>> 5) Mount the lustre OSTs.
>>> 6) Maybe rebalance the filesystem.
>>> My understanding is that the above can be done without bringing the
>>> filesystem down. I want to create the HA configuration (corosync and
>>> pacemaker) for the new OSTs. This step requires the filesystem to be down.
>>> I want to know what would happen to the suspended processes across the
>>> cluster when I bring the filesystem down to re-generate the HA configs.
>>>
>>> Thanks,
>>> -Raj
>>>
>>> On Thu, Feb 21, 2019 at 12:59 AM Colin Faber <cfa...@gmail.com> wrote:
>>>
>>>> Can you provide more details on your upgrade strategy? In some cases
>>>> expanding your storage shouldn't impact client / job activity at all.
>>>>
>>>> On Wed, Feb 20, 2019, 11:09 AM Raj Ayyampalayam <ans...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> We are planning on expanding our storage by adding more OSTs to our
>>>>> lustre file system. It looks like it would be easier to expand if we bring
>>>>> the filesystem down and perform the necessary operations. We are planning
>>>>> to suspend all the jobs running on the cluster. We originally planned to
>>>>> add new OSTs to the live filesystem.
>>>>>
>>>>> We are trying to determine the potential impact to the suspended jobs
>>>>> if we bring down the filesystem for the upgrade.
>>>>> One of the questions we have is what would happen to the suspended
>>>>> processes that hold an open file handle in the lustre file system when the
>>>>> filesystem is brought down for the upgrade?
>>>>> Will they recover from the client eviction?
>>>>>
>>>>> We do have vendor support and have engaged them. I wanted to ask the
>>>>> community and get some feedback.
>>>>>
>>>>> Thanks,
>>>>> -Raj
>>>>>
>>>>> _______________________________________________
>>>>> lustre-discuss mailing list
>>>>> lustre-discuss@lists.lustre.org
>>>>> http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org
>>>>>
>>>>
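
Steps 3 to 5 of the upgrade strategy quoted above (format the OST, create the mount point, mount it) can be sketched roughly as follows. This is a dry run that only prints the commands. The function name, fsname, index, MGS NID, device path, and mount point are all placeholders for illustration; the real values have to match the existing file system, and any failover options (for example `--servicenode`) are left out here.

```python
import shlex


def add_ost_plan(fsname, index, mgs_nid, device, mount_point):
    """Return the commands to format and mount one new OST.

    Hypothetical helper: every argument is a placeholder that must be
    replaced with values matching the existing file system.
    """
    return [
        # Step 3: format the new target as an OST of the existing file system.
        ["mkfs.lustre", "--ost", f"--fsname={fsname}", f"--index={index}",
         f"--mgsnode={mgs_nid}", device],
        # Step 4: create the mount point on the OSS.
        ["mkdir", "-p", mount_point],
        # Step 5: mount the OST, which registers it with the MGS.
        ["mount", "-t", "lustre", device, mount_point],
    ]


if __name__ == "__main__":
    # All values below are illustrative only.
    plan = add_ost_plan("lfs01", 12, "10.0.0.1@o2ib",
                        "/dev/mapper/ost12", "/mnt/lfs01/ost12")
    for argv in plan:
        # Dry run: print the command line only, nothing is executed.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

To find the parameters used on an existing OST, `tunefs.lustre --dryrun <device>` prints the target's current configuration without modifying it.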