Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2022-01-06 Thread Daniela Martín
Hi Valentyn,

We decided to include the Java 17 installation in the image that we are
creating for the Ubuntu upgrade (BEAM-12621). We are using the latest image,
*jenkins-worker-boot-image-20211029*, that the Jenkins workers are currently
using, so the remaining changes in this new image would be the ones that
were made yesterday in the *jenkins-worker-boot-image-20220105* image.

We will create the new image later today, including the Ubuntu upgrade and
the Java SDK 17 installation (which were previously implemented in
*jenkins-worker-boot-image-20211214*), and let you know.

Thank you.

Regards,

On Thu, Jan 6, 2022 at 10:01 AM Valentyn Tymofieiev 
wrote:

> Thanks, Daniela. I am happy to spot-check the new image you are building
> for issues I am aware of.
>
> I made my changes to the latest VM image, building on top of latest
> jenkins-worker-boot-image-20211214, and replicated those changes on the
> running workers.
>
> I noticed that current Jenkins workers (at least some of them) are still
> running on boot disks from older image jenkins-worker-boot-image-20211029,
> and not the newest available image, jenkins-worker-boot-image-20211214.
> Image comment for the latter image says: Installed Java SDK 17. See
> BEAM-12313.
>
> I was wondering - is there a reason we did not reload Jenkins workers to
> pick up this latest image? Or did you decide to upgrade to the new Ubuntu
> version instead, which would also include Java 17?
>
> If jenkins-worker-boot-image-20211214 is known to work and needed for
> BEAM-12313 ~now, I can do this update, and we can continue to work in
> parallel on BEAM-12621.
>
> Thanks,
> Valentyn
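
To make the "are the workers on the newest image?" question concrete, here is a minimal sketch that picks the most recent image name from a list by its date suffix. It assumes the names end in -YYYYMMDD, as the ones mentioned in this thread do; `newest_image` is a hypothetical helper, not an existing Beam or gcloud command.

```shell
#!/usr/bin/env bash
# Sketch: print the image name with the latest YYYYMMDD suffix.
# newest_image is a hypothetical helper; names without a date suffix are ignored.
newest_image() {
  # Prefix each name with its 8-digit date suffix, sort on it, keep the last,
  # then strip the prefix again. Sorting on the suffix (not the whole name)
  # keeps the result correct even if the name prefix was changed at some point.
  printf '%s\n' "$@" \
    | sed -n 's/^.*-\([0-9]\{8\}\)$/\1 &/p' \
    | sort \
    | tail -n 1 \
    | cut -d ' ' -f 2
}
```

Comparing its output with the image a worker's boot disk was created from would show whether that worker is lagging behind.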
>




Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2022-01-05 Thread Robert Burke
Ack. Thanks for the headsup.

As long as Jenkins ends up with at least go1.16, the Go targets should be
fine.
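
A quick way to check the go1.16 floor on a worker is sketched below. It is only an illustration: `go_version_at_least` is a hypothetical helper (not part of Beam's tooling), it assumes the usual `go version goX.Y[.Z] os/arch` output format, and it relies on GNU `sort -V` for version ordering.

```shell
#!/usr/bin/env bash
# Sketch: succeed if a `go version` output line reports at least MIN (e.g. 1.16).
# go_version_at_least is a hypothetical helper name used for illustration.
go_version_at_least() {
  local version_line="$1" min="$2"
  local found
  # Extract "1.17.5" from e.g. "go version go1.17.5 linux/amd64".
  found="$(printf '%s\n' "$version_line" \
    | sed -n 's/^go version go\([0-9][0-9.]*\).*/\1/p')"
  # Empty or unexpected input (for example a "command not found" message) fails.
  [ -n "$found" ] || return 1
  # sort -V orders version strings; if MIN sorts first, then found >= MIN.
  [ "$(printf '%s\n%s\n' "$min" "$found" | sort -V | head -n 1)" = "$min" ]
}
```

For example, `go_version_at_least "$(go version)" 1.16 && echo ok`, run as the same user the Jenkins agent uses, would also surface a missing `go` command, since empty input makes the check fail.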

On Wed, Jan 5, 2022, 6:08 PM Daniela Martín 
wrote:

> Hi Valentyn,
>
> Giomar and I are working on the upgrade of Jenkins VMs to a modern Ubuntu
> version (BEAM-12621).
> We are very close to finishing it and will reach out for the review.
>
> Thank you.
>
> Regards,
>
> On Wed, Jan 5, 2022 at 7:57 PM Valentyn Tymofieiev 
> wrote:
>
>> Heads-up, I am planning a Jenkins image upgrade with a minor change to
>> clean up some unwanted log4j artifacts from gradle caches to silence some
>> alerts I received. Hopefully no one else is currently doing an upgrade;
>> otherwise, please reach out.
>>
>> I will take care of the PATH issue discussed here (if it is still an
>> issue).
>>
>>

Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2022-01-05 Thread Daniela Martín
Hi Valentyn,

Giomar and I are working on the upgrade of Jenkins VMs to a modern Ubuntu
version (BEAM-12621). We are very close to finishing it and will reach out
for the review.

Thank you.

Regards,

On Wed, Jan 5, 2022 at 7:57 PM Valentyn Tymofieiev 
wrote:

> Heads-up, I am planning a Jenkins image upgrade with a minor change to
> clean up some unwanted log4j artifacts from gradle caches to silence some
> alerts I received. Hopefully no one else is currently doing an upgrade;
> otherwise, please reach out.
>
> I will take care of the PATH issue discussed here (if it is still an
> issue).
>
>

Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2021-11-02 Thread Robert Burke
TIL as well. Sounds like the right location. Thanks Valentyn!

On Tue, Nov 2, 2021, 11:00 AM Valentyn Tymofieiev 
wrote:

> Yeah, .profile is only sourced by login shells. Adding the PATH in
> .bashrc can be a workaround, but since .bashrc is executed every time a new
> shell runs, the PATH variable will grow with every shell subprocess, so
> several sources recommend .profile instead, which does not always work.
> We should be able to fix this by updating /etc/environment instead (TIL).
>
> This is the current content:
> cat /etc/environment
>
> PATH="/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games"
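
As a rough sketch of the /etc/environment approach suggested above: the helper below appends a directory to the PATH="..." line of a file only if it is not already listed, so it can be rerun safely. `add_dir_to_env_path` is a hypothetical name, the quoted single-line PATH format is assumed from the content shown above, and `sed -i` assumes GNU sed.

```shell
#!/usr/bin/env bash
# Sketch: append DIR to the PATH="..." line of ENV_FILE unless already listed.
# add_dir_to_env_path is a hypothetical helper name used for illustration.
add_dir_to_env_path() {
  local env_file="$1" dir="$2"
  local current
  # Read the value between the quotes of the PATH="..." line.
  current="$(sed -n 's/^PATH="\(.*\)"$/\1/p' "$env_file")"
  # Fail if the file has no PATH line in the assumed quoted format.
  [ -n "$current" ] || return 1
  case ":$current:" in
    *":$dir:"*) return 0 ;;  # already present, so repeated runs change nothing
  esac
  # Rewrite only the PATH line; '|' is the sed delimiter because paths contain '/'.
  sed -i "s|^PATH=\"\(.*\)\"\$|PATH=\"\1:$dir\"|" "$env_file"
}
```

Trying it on a copy first (for example `cp /etc/environment /tmp/environment && add_dir_to_env_path /tmp/environment /snap/bin`) seems prudent; editing the real file needs root, and the change is normally only picked up by new login sessions.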

Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2021-11-01 Thread Robert Burke
Looks like while .profile was edited to add in a PATH section pointing to
/snap/bin (where go is now installed), it doesn't seem like .profile is
executed by the jenkins login shells.



On Fri, Oct 29, 2021, 6:23 PM Valentyn Tymofieiev 
wrote:

>
>
> On Wed, Oct 20, 2021 at 11:16 AM Valentyn Tymofieiev 
> wrote:
>
>>
>>
>> On Wed, Oct 20, 2021 at 11:12 AM Pablo Estrada 
>> wrote:
>>
>>> Thanks everyone for investigating and documenting this. I'll use it
>>> today : )
>>>
>> Dan may be also in the middle of doing this, please coordinate.
>>
>>>
>>> ahem - maybe we should rename the image name/image family names
>>> to jenkins-worker-boot-image ? Does anyone foresee issues if we do that?
>>> Does jenkins depend on these names in some undocumented way?
>>>
>> +1. it should 'just work', need to update the wiki after the change.
>> Jenkins also did a terminology adjustment.
>>
> I had to reimage Jenkins workers again, took care of the rename and
> changed the instructions.
>
> I am not sure what the status of the Go Postcommit problem is, but I noticed
> that jenkins worker #1 had a different boot disk. I reimaged all workers
> building on top of the latest image from the image family. If Go tests
> start failing, we may need to get help from Dan again.
>
>
>>
>>> On Tue, Oct 19, 2021 at 1:43 PM Daniel Oliveira 
>>> wrote:
>>>
>>>> I'm ok with deciding to avoid the "lite" update option, feel free to
>>>> revise the instructions as seems appropriate. As for the issue, I fixed
>>>> it with a workaround that should work until we need to add a new image to
>>>> the agents, and I'm currently investigating the root cause and preparing a
>>>> fixed image.
>>>>
>>>> That said, I think this issue would have still happened even if we
>>>> didn't perform the "lite" update. I'm still trying to figure out the exact
>>>> problem, but it looks to be a PATH issue that wasn't effectively caught by
>>>> the current process. I won't get into details too much in this thread (see
>>>> the Jira for that), but essentially everything works in my environment when
>>>> I SSH into the VMs, but because the location of the "go" command changed in
>>>> the PATH, it seems to have stopped working for every other user, including
>>>> the Jenkins agents. I actually did notice that would happen when I was
>>>> working on the image, but the solution seemed to be to reboot the machine,
>>>> which I assumed happened already since I shut down the VM to image it.

Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2021-10-19 Thread Robert Burke
+1 to only having one way to do things. The Lite option seems liable to
cause more problems since it means its changes can be blown away if a new
image isn't prepared anyway.
I don't think we are changing the images often enough for it. Perhaps call
it the option to test changes, if anything?

On Tue, Oct 19, 2021, 11:55 AM Valentyn Tymofieiev 
wrote:

> All workers were updated to use jenkins-slave-boot-image-20211011, which
> should have had a go command, but it appears slightly misconfigured. I
> reopened BEAM-13037 [1] and added some details there.
>
> I also added instructions to wiki [2] on how to perform an image swap and
> it is actually very straightforward. I think a lesson here is that making
> 'lite' upgrades is brittle as misconfigurations could resurface down the
> road when the context of the lite upgrade is no longer fresh in our memory.
>
> I suggest we revise the instructions to keep only image swap commands and
> remove the 'lite' update option. +Daniel Oliveira, WDYT? In the meantime,
> we should also prepare an image that fixes the misconfiguration. Would you
> be able to help with that? Thank you.
>
> [1] https://issues.apache.org/jira/browse/BEAM-13037
> [2]
> https://cwiki.apache.org/confluence/display/BEAM/Jenkins+Tips#JenkinsTips-HowtoinstallandupgradesoftwareonJenkinsworkers
>
>

Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2021-10-19 Thread Robert Burke
FYI it looks like all the Go tests are now failing because it can't find
the Go command at all.
Did a Jenkins image without Go (v1.16+) pre-installed get pushed?

On Mon, Oct 18, 2021, 1:45 PM Valentyn Tymofieiev 
wrote:

> Thanks Daniel,
>
> I can recreate the VMs on new disks.
>
> We currently have a set of stopped jenkins workers (named:
> apache-beam-jenkins-##) and running workers (named:
> apache-ci-beam-jenkins-##)
>
> Are there any concerns about deleting the stopped group of workers?
>
>
>
> On Mon, Oct 18, 2021 at 11:19 AM Ahmet Altay  wrote:
>
>> Thank you Daniel, Valentyn!
>>
>> On Mon, Oct 18, 2021 at 8:02 AM Daniel Oliveira 
>> wrote:
>>
>>> I performed a light update of both Go and Python (from Valentyn's
>>> update) on each worker VM over the weekend. I also added additional
>>> instructions for the light update to Confluence (as an alternative to the
>>> current instructions).
>>>
>>> There is still reason to perform a full update at some point: Valentyn
>>> updated the VM image from 500 GB to 1000 GB of storage, which requires a
>>> full update to actually take effect.
>>>
>>> On Tue, Oct 12, 2021 at 10:32 AM Valentyn Tymofieiev <
>>> valen...@google.com> wrote:
>>>
>>>> > 3. SSH into the agent and perform the update.
>>>> So, this would be a 'lite' version of the update, where we make changes
>>>> to the live worker without recreating the worker VM with a new image? We
>>>> could perhaps document both options, and also make it clear that producing
>>>> a VM image that has the necessary updates is mandatory even if we perform
>>>> 'lite' updates without recreating the worker.
>>>> Also, for a lite update, marking the Jenkins worker offline may be
>>>> optional, as some updates might not be disruptive (such as installing some
>>>> software that will not be used immediately).



Re: Best practices for upgrading installed dependencies on Jenkins VMs?

2021-10-11 Thread Robert Burke
SGTM. Thank you very much Daniel!

On Mon, Oct 11, 2021, 7:51 PM Ahmet Altay  wrote:

> Thank you Daniel. Could you please update the wiki once you are done with
> the process?
>
> On Mon, Oct 11, 2021 at 6:22 PM Daniel Oliveira 
> wrote:
>
>> Took me a bit to get to this, sorry. I finally figured out an approach
>> for updating Go and did so and will be updating the image momentarily.
>>
>> I think a more important note is that I tried what Valentyn was
>> considering, which is SSHing into workers and updating the dependency. I'll
>> describe the process below, but the summary is that I did it on one worker
>> with Go so far, saw no problems over the weekend, and would like to
>> continue updating the rest of the workers if there are no objections.
>>
>> Here's a step-by-step of what I did. If we decide to stick with this
>> approach, these instructions can be added to Confluence:
>>
>> 1. Go to the page for the Jenkins agent you want to update [1] and click
>> "Mark this node temporarily offline", leaving a reason such as "Updating X
>> dependency."
>> 2. Wait until there are no more tests running in that agent (under "Build
>> Executor Status" on the left of the page).
>> 3. SSH into the agent and perform the update.
>> 4. Mark the node as online again.
>> 5. Repeat for every worker.
>>
>> And these are some additional steps if you want to immediately run a test
>> suite to check that the update worked correctly. For example in my case, I
>> wanted to check against the Go Postcommit, and it was a good thing I did,
>> because it actually failed the first time and I had to go back in to fix a
>> small oversight I made. So doing this after you update your first worker is
>> probably a good idea before updating the rest:
>>
>> 1. Go to the page for the job you want to run (for example: [2]).
>> 2. Click "Configure" on the left menu.
>> 3. Find the checkmark "Restrict where this project can be run" and change
>> the restriction from "beam" to the specific name of the agent (ex.
>> "apache-beam-jenkins-1").
>> 4. Save and apply that change.
>> 5. Back on the page for the job, click "Build with Parameters" on the
>> left menu.
>> 6. Run the build on "master".
>> 7. Once you're done checking the results, change the restriction for the
>> job back to "beam". (This also gets reset once every 24 hours in case you
>> forget.)
>>
>> I did that on one agent (apache-beam-jenkins-2) on Friday evening when it
>> wasn't too busy, and got Go updated and working. I checked that agent's
>> execution history again today just in case, and it was healthy over
>> the weekend, with no Go-related problems as far as I could see. If there are
>> no objections I'd like to go ahead and continue updating the rest of the
>> workers (I'll do this late at night or over the weekend to avoid disrupting
>> dev work).
>>
>> [1] https://ci-beam.apache.org/computer/apache-beam-jenkins-1/
>> [2] https://ci-beam.apache.org/job/beam_PostCommit_Go/
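
The per-worker steps above can be sketched as a dry-run checklist. This is only an illustration of the ordering: `print_lite_update_plan` is a hypothetical helper that prints the manual steps for each worker and does not contact Jenkins or any VM. Other messages in this thread lean toward full image swaps instead, so treat it as a reminder of the manual procedure rather than a recommended tool.

```shell
#!/usr/bin/env bash
# Sketch: print the manual "lite" update steps, in order, for each worker.
# print_lite_update_plan is a hypothetical helper; it performs no actions itself.
print_lite_update_plan() {
  local update_desc="$1"
  shift
  local worker
  for worker in "$@"; do
    printf '%s: mark node temporarily offline (reason: %s)\n' "$worker" "$update_desc"
    printf '%s: wait until Build Executor Status shows no running tests\n' "$worker"
    printf '%s: SSH in and perform the update\n' "$worker"
    printf '%s: mark node online again\n' "$worker"
  done
}
```

Calling `print_lite_update_plan "Updating Go" apache-beam-jenkins-1 apache-beam-jenkins-2` lists four steps per worker, finishing one worker before starting the next.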
>>
>> On Mon, Oct 4, 2021 at 6:14 PM Valentyn Tymofieiev 
>> wrote:
>>
>>> I updated the image in [1], but did not change the workers to pick up
>>> the new image yet. We can do this once we add Go changes on top of it.
>>>
>>> I am also considering SSHing into every worker and running a one-line
>>> command that adds the dependency that was missing. It seems to be low risk,
>>> and there is a fall-back plan to restart the worker using the saved image
>>> - both new and old images are saved and available in Cloud Console.
>>>
>>> Ideally, we should find a way to do a rolling upgrade that a PMC or
>>> committer could trigger without logging into every machine.
>>>
>>> [1]
>>> https://issues.apache.org/jira/browse/BEAM-8152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424228#comment-17424228
>>>
>>>
>>> On Wed, Sep 22, 2021 at 3:28 PM Daniel Oliveira 
>>> wrote:
>>>
>>>> @Brian Hulette That button seems like exactly
>>>> what we'd need. Doing it manually would be a pain, but it's probably still
>>>> preferable to causing a bunch of aborted tests.
>>>>
>>>> @Valentyn Tymofieiev Collaborating to do both
>>>> updates at once is a great idea! I'll message you directly about it.
>>>>
>>>> On Wed, Sep 22, 2021 at 2:44 PM Valentyn Tymofieiev <
>>>> valen...@google.com> wrote:
>>>>
>>>>> I am also interested in updating the version of Python on the VMs; I need
>>>>> to install Python 3.9. Thanks for looking into this. We can coordinate
>>>>> to make one update instead of two.
>>>>>
>>>>> On Wed, Sep 22, 2021 at 2:40 PM Brian Hulette
>>>>> wrote:
>>>>>
>>>>>> I'm not sure about best practices here. Out of curiosity I just poked
>>>>>> around in the Jenkins UI (e.g. [1]) and it looks like you can manually
>>>>>> "Mark node temporarily offline" when logged in (if you're a committer).
>>>>>> According to [2] this will prevent it from picking up new jobs after it's
>>>>>> finished the currently executing ones. Doing that manually for every worker
>>>>>> could be a pain