On 1/18/22 10:43 AM, Martin Urbanec wrote:
> Hello Andrew,
>
> Will there be any impact on Toolforge users using scratch?

Yes, all of the things listed will affect tools as well as servers. Affected tools will most likely get restarted and un-stuck as a consequence of exec/k8s node reboots.

> Martin
>
> On Tue, Jan 18, 2022 at 18:00, Andrew Bogott <[email protected]> wrote:

    Since no one expressed concerns about this, I'm going to go ahead and
    roll this out tomorrow morning at 16:00 UTC. Here's what to expect:

    1) If your VM mounts secondary-scratch but doesn't actually use it,
    nothing much will happen.
    2) If your VM or tool has an open file on that volume when the
    switchover happens, it will probably freeze up. I will reboot VMs that
    this happens to. (A quick way to check for open files is sketched
    below this list.)
    3) If you had files on the scratch volume before this change, they
    will be gone after the change. Precious files will be recoverable
    after the fact for a few weeks.
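
    If you want to check ahead of time whether you fall into category 2,
    here's a minimal sketch (my assumptions: a Linux host with /proc
    available, and the usual mount path) that lists processes holding
    files open under the scratch mount, similar to what 'lsof' reports:

        # Sketch: find processes with open files under the scratch mount.
        # Assumes Linux /proc; run as root to see other users' processes.
        import os

        MOUNT = "/mnt/nfs/secondary-scratch"

        for pid in filter(str.isdigit, os.listdir("/proc")):
            fd_dir = "/proc/%s/fd" % pid
            try:
                fds = os.listdir(fd_dir)
            except OSError:  # process exited, or permission denied
                continue
            for fd in fds:
                try:
                    target = os.readlink(os.path.join(fd_dir, fd))
                except OSError:
                    continue
                if target == MOUNT or target.startswith(MOUNT + "/"):
                    print(pid, target)

    If that prints nothing, you're most likely in category 1 and can
    ignore the switchover.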

    -Andrew


    On 1/14/22 2:06 PM, Andrew Bogott wrote:
    > Hello, all!
    >
    > We are in the process of re-engineering and virtualizing[0] the NFS
    > service provided to Toolforge and VMs. The transition will be rocky
    > and involve some service interruption... I'm still running tests to
    > determine exactly how much disruption will be required.
    >
    > The first volume that I'd like to replace is 'scratch,' typically
    > mounted as /mnt/nfs/secondary-scratch. I'm seeking feedback about
    > how vital scratch uptime is to your current workflow, and how
    > disruptive it would be to lose data there.
    >
    > If you have a project or tool that uses scratch, please respond
    > with your thoughts! My preference would be to wipe out all existing
    > data on scratch and also subject users to several unannounced
    > periods of downtime, but I also don't want anyone to suffer. If you
    > have important/persistent data on that volume then the WMCS team
    > will work with you to migrate that data somewhere safer, and if you
    > have an important workflow that will break due to scratch downtime
    > then I'll work harder on devising a low-impact roll-out.
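    >
    > If you do have data on scratch that's worth keeping, the simplest
    > safeguard is to copy it somewhere else yourself. A minimal sketch
    > (the source and destination paths here are made-up examples; adjust
    > them for your own tool or project):
    >
    >     # Sketch: copy a directory off scratch. Paths are examples only.
    >     import shutil
    >
    >     SRC = "/mnt/nfs/secondary-scratch/my-tool"    # example source
    >     DST = "/data/project/my-tool/scratch-backup"  # example target
    >
    >     # dirs_exist_ok needs Python 3.8+; copy2 preserves metadata
    >     shutil.copytree(SRC, DST, copy_function=shutil.copy2,
    >                     dirs_exist_ok=True)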
    >
    > Thank you!
    >
    > -Andrew
    >
    > [0] https://phabricator.wikimedia.org/T291405
    >
    _______________________________________________
    Cloud-announce mailing list -- [email protected]
    List information:
    https://lists.wikimedia.org/postorius/lists/cloud-announce.lists.wikimedia.org/
    _______________________________________________
    Cloud mailing list -- [email protected]
    List information:
    https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/


_______________________________________________
Cloud mailing list -- [email protected]
List information: 
https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
