Re: Update

Olaf Flebbe Tue, 10 Nov 2020 12:00:14 -0800

hi,

Fully supporting Evans:
the unconnected disks do not contain anything valuable, please remove them. It
might even make sense to recreate the current disks on SSD, a bit larger than
before if needed.


olaf

> On 10.11.2020 at 08:09, Evans Ye <[email protected]> wrote:
> 
> Yes I think overall your plan is good.
> What's the purpose of using an EBS snapshot? Is it to back up what we have
> before the migration?
> Except for the master node (which has Jenkins settings stored on disk), all
> the slaves can be wiped directly.
> 
> 
> 
> On Tue, Nov 10, 2020 at 2:42 PM Kengo Seki <[email protected]> wrote:
> 
>> Thanks everyone for the information! Now I understand our circumstances.
>> So we're going to split two 1TB volumes attached to slave06 and 07
>> into four 500GB volumes (and change their type to gp2), reattach them
>> to 02, 03, 06 and 07, and remove the two currently unused 1TB volumes,
>> right?
>> 
>>> Kengo, would you like to take this, or do you need help?
>> 
>> I think I can do this somehow (maybe using EBS snapshots?), but let me
>> ask for your help if I get stuck. :)
>> 
>> Kengo Seki <[email protected]>
>> 
>> On Tue, Nov 10, 2020 at 1:00 AM Evans Ye <[email protected]> wrote:
>>> 
>>> OK. I got it now.
>>> So the newly created volumes are currently attached to slave06_2 and
>>> slave07_2, respectively.
>>> However, they're standard HDD, not GP2 SSD. I think we can take this
>>> chance to recreate those 2 slaves and do an overhaul of our infrastructure.
>>> 
>>> Kengo, would you like to take this, or do you need help?
>>> 
>>> Evans
>>> 
>>> On Fri, Nov 6, 2020 at 2:40 AM Olaf Flebbe <[email protected]> wrote:
>>> 
>>>> Hi,
>>>> 
>>>> OMG. I think I did it.
>>>> 
>>>> A few years ago two of the instances had hardware problems and did not
>>>> reboot any more; the filesystem was corrupted and so on. That was at the
>>>> time of the Spectre vulnerability discovery (2018). At that time AWS had
>>>> major instabilities, since updating firmware seems to have failed for some
>>>> classes of hardware.
>>>> 
>>>> I tried to recreate them as closely as possible, but I may have
>>>> accidentally left the volumes around. Please let's delete them.
>>>> 
>>>> Olaf
>>>> 
>>>>> On 05.11.2020 at 14:44, Konstantin Boudnik <[email protected]> wrote:
>>>>> 
>>>>> Thanks Evans!
>>>>> 
>>>>> It's great you found the details: they are definitely accurate, as I
>>>>> recall now. Kengo, do you think splitting the volumes would help us for
>>>>> a while? Or perhaps we should try to expand the resource pool (which
>>>>> might take a while)?
>>>>> 
>>>>> Thanks!
>>>>> Cos
>>>>> 
>>>>> On Thu, Nov 05, 2020 at 12:32PM, Evans Ye wrote:
>>>>>> In fact, the original deal for our resources is as follows:
>>>>>> 
>>>>>>> 1 m3.2xlarge for CI
>>>>>>> 4 m3.xlarge for CI and demo
>>>>>>> 3 1TB EBS volumes
>>>>>>> 5 elastic IP addresses
>>>>>> 
>>>>>> So technically we should not use those 2 additional 1TB volumes
>>>>>> (created in 2018).
>>>>>> Instead, I think what we can do is split up one of the existing 1TB
>>>>>> volumes (e.g. the one attached to slave07) into smaller volumes for
>>>>>> slave02 and 03.
>>>>>> 
>>>>>> 
>>>>>> On Wed, Nov 4, 2020 at 2:28 PM Konstantin Boudnik <[email protected]> wrote:
>>>>>> 
>>>>>>> Kengo,
>>>>>>> 
>>>>>>> We had an agreement with the EMR folks that we use the resources
>>>>>>> available to us and that this is included in their budget (or
>>>>>>> something to this extent). If you see some resources available under
>>>>>>> our account, I don't see why we can't use them.
>>>>>>> 
>>>>>>> If for whatever reason we need to expand the pool, that would require
>>>>>>> a separate conversation with the nice folks from that team, I imagine.
>>>>>>> Please let me know if I can help with this going forward.
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> Cos
>>>>>>> 
>>>>>>> On Wed, Nov 04, 2020 at 11:11AM, Kengo Seki wrote:
>>>>>>>> Thanks for the comment, Cos! I was able to start the docker service on
>>>>>>>> docker-slave-02 without replacing it and am running some Jenkins jobs
>>>>>>>> on it now, so I'll replace it in the near future.
>>>>>>>> I have a few additional things I'd like to ask:
>>>>>>>> 
>>>>>>>> * docker-slave-02 and 03 have gp2 storage as a root volume with only
>>>>>>>> 8GiB capacity, and they sometimes run short and stop the CI.
>>>>>>>> May I increase them to 20 or 30 GiB when I replace those instances?
>>>>>>>> (I'm not sure what our budget is)
>>>>>>>> 
>>>>>>>> * They use a 30GiB instance store to hold docker images, and it also
>>>>>>>> sometimes runs short.
>>>>>>>> There seem to be two unused 1TiB volumes (vol-ae71114e and
>>>>>>>> vol-4efa69ae) on the AWS console.
>>>>>>>> May I attach them to 02 and 03 instead of the instance stores, or are
>>>>>>>> they backups or something?
>>>>>>>> 
>>>>>>>> Kengo Seki <[email protected]>
>>>>>>>> 
>>>>>>>> On Mon, Nov 2, 2020 at 6:41 PM Konstantin Boudnik <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>> I'd say let's replace the broken one. I don't think there's any
>>>>>>>>> sentimental value attached ;)
>>>>>>>>> 
>>>>>>>>> --
>>>>>>>>> With regards,
>>>>>>>>>  Cos
>>>>>>>>> 
>>>>>>>>> On 02.11.2020 08:16, Kengo Seki wrote:
>>>>>>>>>> Thanks for updating, Olaf! I've just noticed the Jenkins UI became
>>>>>>>>>> cool :)
>>>>>>>>>> Regarding docker-slave-02, I'll try to replace it after waiting for
>>>>>>>>>> a while to make sure there's no objection.
>>>>>>>>>> 
>>>>>>>>>> Kengo Seki <[email protected]>
>>>>>>>>>> 
>>>>>>>>>> On Mon, Nov 2, 2020 at 1:39 PM Jun HE <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Thanks a lot for the update, Olaf!
>>>>>>>>>>> 
>>>>>>>>>>> On Sat, Oct 31, 2020 at 3:24 AM Olaf Flebbe <[email protected]> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi,
>>>>>>>>>>>> 
>>>>>>>>>>>> All machines are patched. Jenkins and its plugins are updated.
>>>>>>>>>>>> 
>>>>>>>>>>>> Things to be noted:
>>>>>>>>>>>> 
>>>>>>>>>>>> * Slave 2 seems to be in serious trouble. The disk image seems to
>>>>>>>>>>>> be corrupt, I would say.
>>>>>>>>>>>> One of the problems: docker does not start any more.
>>>>>>>>>>>> Is there anything important on it? If yes, please contact me. I
>>>>>>>>>>>> would recommend setting up slave2 from scratch again.
>>>>>>>>>>>> 
>>>>>>>>>>>> * There was a warning regarding the Copy Artifacts Plugin. It now
>>>>>>>>>>>> imposes stricter rules. Not sure if there is a job depending on it.
>>>>>>>>>>>> 
>>>>>>>>>>>> * I removed the CVS plugin.
>>>>>>>>>>>> 
>>>>>>>>>>>> Everything else seems to be working as usual.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Olaf
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> On 30.10.2020 at 19:09, Olaf Flebbe <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I am doing an update of the machines in CI. It seems a couple of
>>>>>>>>>>>>> security fixes need to be applied.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Olaf
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>> 
>>>> 
>>>> 
>>