OK. I got it now. So the newly created volumes are currently attached to slave06_2 and slave07_2, respectively. However, they're standard HDD, not GP2 SSD. I think we can take this chance to recreate those 2 slaves and do an overhaul of our infrastructure.
Kengo, would you like to take this, or do you need help?

Evans

Olaf Flebbe <[email protected]> wrote on Fri, Nov 6, 2020 at 2:40 AM:

> Hi,
>
> OMG. I think I did it.
>
> A few years ago two of the instances had hardware problems and did not
> reboot any more, the filesystem was corrupted and so on. That was at the
> time of the Spectre vulnerability discovery (2018). At that time AWS had
> major instabilities since updating firmware seems to have failed for some
> classes of hardware.
>
> I tried to recreate them as closely as possible but I may have
> accidentally left the volumes around. Please let's delete them.
>
> Olaf
>
> > On 05.11.2020 at 14:44, Konstantin Boudnik <[email protected]> wrote:
> >
> > Thanks Evans!
> >
> > It's great you found the details: they are definitely accurate as I am
> > recalling now. Kengo, do you think splitting the volumes would help us
> > for a while? Or perhaps we shall try to expand the resource pool (which
> > might take a while)?
> >
> > Thanks!
> > Cos
> >
> > On Thu, Nov 05, 2020 at 12:32PM, Evans Ye wrote:
> >> In fact, the original deal of our resource is as follows:
> >>
> >>> 1 m3.2xlarge for CI
> >>> 4 m3.xlarge for CI and demo
> >>> 3 1TB EBS volumes
> >>> 5 elastic IP addresses
> >>
> >> So technically we should not use those 2 additional 1TB volumes
> >> (created in 2018).
> >> Instead, I think what we can do is to split up one of the existing 1TB
> >> volumes (e.g. the one attached to slave07) into smaller volumes for
> >> slave02 and 03.
> >>
> >>
> >> Konstantin Boudnik <[email protected]> wrote on Wed, Nov 4, 2020 at 2:28 PM:
> >>
> >>> Kengo,
> >>>
> >>> We had an agreement with EMR folks that we are using the resources
> >>> available to us and it is included in their budget (or something to
> >>> this extent). If you see some of the resources available under our
> >>> account - I don't see why we can't use them.
> >>>
> >>> If for whatever reason we need to expand the pool, that would require a
> >>> separate conversation with nice folks from that team, I imagine. Please
> >>> let me know if I can help with this going forward.
> >>>
> >>> Thanks!
> >>> Cos
> >>>
> >>> On Wed, Nov 04, 2020 at 11:11AM, Kengo Seki wrote:
> >>>> Thanks for the comment, Cos! I was able to start docker service on
> >>>> docker-slave-02 without replacing and am running some Jenkins jobs on
> >>>> it now, so I'll replace it in the near future.
> >>>> I have a few things that I'd like to ask additionally:
> >>>>
> >>>> * docker-slave-02 and 03 have a gp2 storage as a root volume that has
> >>>>   only 8GiB capacity, and they sometimes run short and stop the CI.
> >>>>   May I increase them to 20 or 30 GiB when I replace those instances?
> >>>>   (I'm not sure what our budget is)
> >>>>
> >>>> * They use an instance store with 30GiB to put docker images into it,
> >>>>   and they also sometimes run short.
> >>>>   It seems there are two unused volumes with 1TiB (vol-ae71114e and
> >>>>   vol-4efa69ae) on AWS console.
> >>>>   May I attach them to 02 and 03 instead of instance stores, or are
> >>>>   they backups or something?
> >>>>
> >>>> Kengo Seki <[email protected]>
> >>>>
> >>>> On Mon, Nov 2, 2020 at 6:41 PM Konstantin Boudnik <[email protected]> wrote:
> >>>>>
> >>>>> I'd say let's replace the broken one. I don't think there's a
> >>>>> sentimental value attached ;)
> >>>>>
> >>>>> --
> >>>>> With regards,
> >>>>> Cos
> >>>>>
> >>>>> On 02.11.2020 08:16, Kengo Seki wrote:
> >>>>>> Thanks for updating Olaf! I've just noticed the Jenkins UI became cool :)
> >>>>>> Regarding docker-slave-02, I'll try to replace it after waiting for a
> >>>>>> while to make sure there's no objection.
> >>>>>>
> >>>>>> Kengo Seki <[email protected]>
> >>>>>>
> >>>>>> On Mon, Nov 2, 2020 at 1:39 PM Jun HE <[email protected]> wrote:
> >>>>>>>
> >>>>>>> Thanks a lot for the update, Olaf!
> >>>>>>>
> >>>>>>> Olaf Flebbe <[email protected]> wrote on Sat, Oct 31, 2020 at 3:24 AM:
> >>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> All machines patched. Jenkins and its plugins are updated.
> >>>>>>>>
> >>>>>>>> Things to be noted:
> >>>>>>>>
> >>>>>>>> * Slave 2 seems to be in serious trouble. The disk image seems to be
> >>>>>>>>   corrupt, I would say.
> >>>>>>>>   One of the problems: docker does not start any more.
> >>>>>>>>   Is there anything important on it? If yes, please contact me. I
> >>>>>>>>   would recommend setting up slave2 from scratch again.
> >>>>>>>>
> >>>>>>>> * There was a warning regarding Copy Artifacts Plugin. It now imposes
> >>>>>>>>   stricter rules. Not sure if there is a job depending on it.
> >>>>>>>>
> >>>>>>>> * I removed the CVS plugin.
> >>>>>>>>
> >>>>>>>> Everything else seems to be working as usual.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Olaf
> >>>>>>>>
> >>>>>>>>> On 30.10.2020 at 19:09, Olaf Flebbe <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I am doing an update of the machines in CI. It seems a couple of
> >>>>>>>>> security fixes are to be applied.
> >>>>>>>>>
> >>>>>>>>> Olaf
