Re: Update

Evans Ye Mon, 09 Nov 2020 08:00:27 -0800

OK. I got it now.
So the newly created volumes are currently attached to slave06_2 and
slave07_2, respectively.
However, they're standard HDD, not GP2 SSD. I think we can take this chance
to recreate those 2 slaves and do an overhaul of our infrastructure.


Kengo, would you like to take this, or do you need help?

Evans

Olaf Flebbe <[email protected]> wrote on Fri, Nov 6, 2020 at 2:40 AM:

> Hi,
>
> OMG. I think I did it.
>
> A few years ago two of the instances had hardware problems and did not
> reboot any more; the filesystem was corrupted, and so on. That was at the
> time of the Spectre vulnerability discovery (2018). At that time AWS had
> major instabilities, since firmware updates seem to have failed for some
> classes of hardware.
>
> I tried to recreate them as closely as possible, but I may have
> accidentally left the volumes around. Please, let's delete them.
>
> Olaf
>
> > On 05.11.2020 at 14:44, Konstantin Boudnik <[email protected]> wrote:
> >
> > Thanks Evans!
> >
> > It's great you found the details: they are definitely accurate as I am
> > recalling now. Kengo, do you think splitting the volumes would help us
> > for a while? Or perhaps we shall try to expand the resource pool (which
> > might take a while)?
> >
> > Thanks!
> >  Cos
> >
> > On Thu, Nov 05, 2020 at 12:32PM, Evans Ye wrote:
> >> In fact, the original deal for our resources is as follows:
> >>
> >>> 1 m3.2xlarge for CI
> >>> 4 m3.xlarge for CI and demo
> >>> 3 1TB EBS volumes
> >>> 5 elastic IP addresses
> >>
> >> So technically we should not use those 2 additional 1TB volumes
> >> (created in 2018).
> >> Instead, I think what we can do is split up one of the existing 1TB
> >> volumes (e.g. the one attached to slave07) into smaller volumes for
> >> slave02 and 03.
> >>
> >>
> >> Konstantin Boudnik <[email protected]> wrote on Wed, Nov 4, 2020 at 2:28 PM:
> >>
> >>> Kengo,
> >>>
> >>> We had an agreement with the EMR folks that we are using the resources
> >>> available to us and that it is included in their budget (or something
> >>> to that effect). If you see some of the resources available under our
> >>> account, I don't see why we can't use them.
> >>>
> >>> If for whatever reason we need to expand the pool, that would require a
> >>> separate conversation with nice folks from that team, I imagine. Please
> >>> let me know if I can help with this going forward.
> >>>
> >>> Thanks!
> >>>  Cos
> >>>
> >>> On Wed, Nov 04, 2020 at 11:11AM, Kengo Seki wrote:
> >>>> Thanks for the comment, Cos! I was able to start the docker service on
> >>>> docker-slave-02 without replacing it and am running some Jenkins jobs
> >>>> on it now, so I'll replace it in the near future.
> >>>> I have a few things that I'd like to ask additionally:
> >>>>
> >>>> * docker-slave-02 and 03 have a gp2 storage as a root volume that has
> >>>> only 8GiB capacity, and they sometimes run short and stop the CI.
> >>>>  May I increase them to 20 or 30 GiB when I replace those instances?
> >>>> (I'm not sure what our budget is)
> >>>>
> >>>> * They use an instance store with 30GiB to put docker images into it,
> >>>> and they also sometimes run short.
> >>>>  It seems there are two unused volumes with 1TiB (vol-ae71114e and
> >>>> vol-4efa69ae) on AWS console.
> >>>>  May I attach them to 02 and 03 instead of instance stores, or are
> >>>> they backups or something?
> >>>>
> >>>> Kengo Seki <[email protected]>
> >>>>
> >>>> On Mon, Nov 2, 2020 at 6:41 PM Konstantin Boudnik <[email protected]> wrote:
> >>>>>
> >>>>> I'd say let's replace the broken one. I don't think there's any
> >>>>> sentimental value attached ;)
> >>>>>
> >>>>> --
> >>>>> With regards,
> >>>>>   Cos
> >>>>>
> >>>>> On 02.11.2020 08:16, Kengo Seki wrote:
> >>>>>> Thanks for updating, Olaf! I've just noticed the Jenkins UI became
> >>>>>> cool :)
> >>>>>> Regarding docker-slave-02, I'll try to replace it after waiting for a
> >>>>>> while to make sure there's no objection.
> >>>>>>
> >>>>>> Kengo Seki <[email protected]>
> >>>>>>
> >>>>>> On Mon, Nov 2, 2020 at 1:39 PM Jun HE <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Thanks a lot for the update, Olaf!
> >>>>>>>
> >>>>>>> Olaf Flebbe <[email protected]> wrote on Sat, Oct 31, 2020 at 3:24 AM:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> All machines are patched. Jenkins and its plugins are updated.
> >>>>>>>>
> >>>>>>>> Things to be noted:
> >>>>>>>>
> >>>>>>>> * Slave 2 seems to be in serious trouble. The disk image seems to be
> >>>>>>>> corrupt, I would say.
> >>>>>>>> One of the problems: docker does not start any more.
> >>>>>>>> Is there anything important on it? If yes, please contact me. I
> >>>>>>>> would recommend setting up slave2 from scratch again.
> >>>>>>>>
> >>>>>>>> * There was a warning regarding the Copy Artifacts Plugin. It now
> >>>>>>>> imposes stricter rules. Not sure if there is a job depending on it.
> >>>>>>>>
> >>>>>>>> * I removed the CVS plugin.
> >>>>>>>>
> >>>>>>>> Everything else seems to be working as usual.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Olaf
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On 30.10.2020 at 19:09, Olaf Flebbe <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I am doing an update of the machines in CI. It seems a couple of
> >>>>>>>>> security fixes are to be applied.
> >>>>>>>>>
> >>>>>>>>> Olaf
> >>>>>>>>
> >>>>>>>>
> >>>
>
>
