Hello Labs,

Many of you may recall that, until some point in late 2013, one of the features of the labs file server was that it provided time-travel snapshots (you could see a consistent view of the filesystem as it existed 1h, 2h, 3h, 1d, 2d, 3d and 1 week ago).

This was disabled at that time - despite being generally considered valuable - because it was suspected to be (part of) the cause of the stability problems the NFS server suffered at the time. This turned out not to have been the case, and we could turn it back on now.

Indeed, doing so is a prerequisite to the planned replication of the filesystem to the new datacenter, where a redundant Labs installation is slated to be deployed[1].

The issue is that turning that feature back on requires changing the way the disk space is currently allocated at a low level[2], and necessitates a fairly long period of partial downtime during which data is copied from one part of the disk subsystem to the other. In practice, this would require the primary partitions (/home and /data/project) to be set readonly for a period on the order of a day (24-30 hours).

That downtime is pretty much unavoidable eventually, as it is a requirement for expanding labs and improving data resilience and reliability, but the /timing/ of it is flexible. I wanted to "poll" labs users as to when the possibility of disruption is minimized, and give everyone plenty of time to make contingency plans and/or notify their end users of the expected period of reduced availability.

Provided there is a good consensus that a weekday is a better time than the weekend (I am guessing here that volunteer coders and users are more active during the weekend), I would suggest starting the operation on Tuesday, January 13 at 18:00 UTC. The downtime is expected to last until January 14, 18:00 UTC, but may extend a few hours beyond that.

The expected impacts are:

* Starting at the beginning of the window, /home and /data/project will switch to readonly mode; any attempt to write to files in those trees will result in EROFS errors being thrown. Reading from those filesystems will still work as expected, as will writing to other filesystems;
* Read performance may degrade noticeably, as the disk subsystem will be loaded to capacity;
* It will not be possible to manipulate the gridengine queue - specifically, starting or stopping jobs will not work; and
* At the end of the window, when the operation is complete, the "old" filesystem will go away and be replaced by the new one - this will cause any access to files or directories that were previously opened (including working directories) on the affected filesystems to error out with ESTALE. Reopening files by name will access the new copy, identical to the old one as of the time the filesystems became readonly (see the sketch after this list).
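
To illustrate that last point (the tool name and path below are made up, not anything specific to your setup), a long-running process that keeps a file handle open across the switch-over would see something like the following, and can recover by reopening the file by name:

    import errno

    # Hypothetical path on one of the affected filesystems.
    PATH = "/data/project/mytool/state.txt"

    fh = open(PATH)          # handle opened before the maintenance window
    # ... the filesystem is replaced underneath us during the window ...
    try:
        data = fh.read()     # the old handle now points at the stale filesystem
    except OSError as exc:
        if exc.errno != errno.ESTALE:
            raise
        fh.close()
        fh = open(PATH)      # reopening by name reaches the new copy
        data = fh.read()

Most tools will find it simpler to restart after the window than to handle ESTALE explicitly.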

In practice, this latter impact means that most running programs will be unable to continue unless they have special handling for this situation, and most gridengine jobs will no longer be able to log output. It may be a good idea to restart any continuously running tool at that point. All webservices that were running at the start of the maintenance window will be restarted at that time.

If you have tools or other processes running that do not rely on being able to write to /data/project, they may be able to continue running during the downtime without interruption. Jobs that only access the network (for instance, the MediaWiki API) or the databases are not likely to be affected. Because of this, no automatic or forcible restart of running (non-webservice) jobs will be performed.

In particular, if you have a tool whose continued operation is important, temporarily modifying it so that it works from /data/scratch may be a good workaround.
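
For instance (a purely illustrative sketch; the tool name and environment variable are invented), a tool that writes its working files under a configurable base directory can be pointed at /data/scratch for the duration of the window and switched back afterwards:

    import os

    # Hypothetical override: default to the tool's project directory,
    # but allow it to be redirected to scratch during the maintenance.
    DATA_DIR = os.environ.get("MYTOOL_DATA_DIR", "/data/project/mytool")

    with open(os.path.join(DATA_DIR, "results.txt"), "a") as out:
        out.write("still running\n")

Running the job with MYTOOL_DATA_DIR=/data/scratch/mytool set in its environment would then keep it writable throughout.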

Finally, in order to avoid the risk of the filesystem move taking longer than expected and increasing downtime significantly, LOG FILES OVER 1G WILL NOT BE COPIED. If you have critical files that are not simple log files but whose names end in .log, .err or .out, then you MUST compress those files if you absolutely require them to survive the transition. Alternatively, truncating them to some size comfortably smaller than 1G will work if the file must remain uncompressed.
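
As a rough sketch of how one might do that (the tool directory below is a placeholder, and the threshold deliberately leaves some headroom under 1G), compressing or truncating oversized .log/.err/.out files could look like this:

    import gzip
    import os
    import shutil

    # Hypothetical tool directory; point this at your own tree.
    TOOL_DIR = "/data/project/mytool"
    LIMIT = 900 * 1024 ** 2  # stay comfortably under 1G

    for root, _dirs, files in os.walk(TOOL_DIR):
        for name in files:
            if not name.endswith((".log", ".err", ".out")):
                continue
            path = os.path.join(root, name)
            if os.path.getsize(path) <= LIMIT:
                continue
            # Compress the file so that its contents survive the copy...
            with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            os.remove(path)
            # ...or, if the file must remain uncompressed, truncate it
            # instead of compressing:  os.truncate(path, LIMIT)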

The speed and reliability of the maintenance process depend on the total amount of data to copy. If you can clean extraneous files out of both your home and project directories, you'll help the process greatly. :-)

Thanks all,

-- Marc
