[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
On 07/13/19 17:50, Molly Miller wrote:
> All,
>
> I have a couple of things I'd like to say in response to both the
> original post and some of the points raised in the replies by Kiyoshi
> and Luis. I've been using Scaleway's virtualised services for nearly two
> years now (first with a pair of small x86_64 machines for personal use,
> and more recently an aarch64 machine for Adélie development work), so I
> have some operational experience which I'd like to share, and I also
> broadly agree with the points regarding shared tenancy of physical
> hardware.
>
> Please excuse the thread-breaking, but I wanted to make sure this gets
> copied to adelie-infra properly. I'm also a bit over-caffeinated as I'm
> writing this, so apologies if it gets a bit rambly or hard to follow.

Thank you for the in-depth response, and sharing your operational experience!

> -
>
> First, the OP:
>
> On 2019-07-13 10:56, A. Wilcox wrote:
>> Scaleway offers a similar level of reliability, and has a higher level
>> of availability based on our current account with them. They
>> additionally offer servers that are not based on the x86 architecture,
>> so we are still protected from the numerous issues that plague x86.
>
> Scaleway's business focus has shifted over the past year or so; they are
> no longer pushing ARM cloud services anywhere near as much as they used
> to, and now seem to be more interested in providing managed services in
> a similar vein to AWS. As such, there's limited capacity on their ARM
> cloud (which I believe they are no longer expanding), which could
> potentially cause issues in securing the resources necessary for a
> migration.

The only limited capacity I saw was on the really big systems (128/256 GB RAM), but I'll admit I don't know what capacity might look like in the future if we needed to expand.

>> The network has never suffered any outages, either.
>> Since the Scaleway
>> cloud features ARM servers, we would additionally still be able to avoid
>> the x86 architecture and all of its failings.
>
> I can't recall any particular dates, but from offhand experience
> Scaleway's network (at least the virtual machine estate in Amsterdam)
> has *definitely* suffered availability issues. (I can't speak for
> Scaleway's Paris network, which I'm assuming is more substantial.) I
> must stress, however, that these issues are nowhere near as bad as
> Integricloud's -- just don't expect perfection, because you'll be
> disappointed. For what it's worth, in my experience any of the brief
> network issues have disrupted IPv6 connectivity more than they have
> disrupted IPv4 connectivity.

It's a lot better, at this point, to have hiccups on v6 than total outages. Integricloud's total outages have killed our productivity.

>> We have continually been limited by our lack of IPv4 space at
>> Integricloud. Currently, we "proxy" every server via athdheise, a
>> virtual server on our Integricloud dedicated system that has both an
>> IPv4 and IPv6 address.
>
> (This is an aside, but I've worked in an environment which has
> successfully operated services from an IPv6-only network, with a
> dual-stacked reverse proxy at the network border to handle connections
> from IPv4-only clients. The border gateway ran HAProxy, which is capable
> of selecting backends based on server name indication in TLS handshakes;
> as the SNI is sent before any key exchange is performed, the gateway
> machine did not need access to any private key material, and could be
> used for any protocol which runs over TLS and uses SNI.)

I hate this networking setup so much. Everything should just be native (or native-ish). It's needless complexity. It irritates me.

>> If we use Scaleway virtual servers, every system gets its own dedicated
>> IPv4 address, which drastically simplifies our administration.
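
On the SNI-routing aside above: the reason the border gateway needs no private keys is that the server name travels unencrypted in the TLS ClientHello, so it can be read before any key exchange happens. Below is a minimal Python sketch of that idea, not HAProxy's implementation (HAProxy exposes the same information through its `req.ssl_sni` fetch). It assumes the whole ClientHello arrives in a single TLS record; the `BACKENDS` table, its host names, and its addresses are hypothetical placeholders.

```python
import struct
from typing import Optional, Tuple

# Hypothetical routing table: SNI host name -> (backend IPv6 address, port).
BACKENDS = {
    "www.example.org": ("2001:db8::10", 443),
    "bts.example.org": ("2001:db8::20", 443),
}


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from one TLS ClientHello record, or None."""
    try:
        # Record type 0x16 = handshake; handshake type 0x01 = ClientHello.
        if record[0] != 0x16 or record[5] != 0x01:
            return None
        # Skip record header (5), handshake header (4),
        # client version (2) and client random (32).
        pos = 5 + 4 + 2 + 32
        pos += 1 + record[pos]                      # session id
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len                       # cipher suites
        pos += 1 + record[pos]                      # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:                       # server_name extension
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5 : pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None


def pick_backend(record: bytes) -> Optional[Tuple[str, int]]:
    """Choose a backend purely from the unencrypted SNI; no key material needed."""
    return BACKENDS.get(extract_sni(record))
```

A real gateway would then splice the client's TCP stream to the chosen backend unchanged, so TLS still terminates on the IPv6-only server itself.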
>
> Scaleway's network configuration is weird for virtual machines -- I'll
> get to that in my operational experience spiel in a bit.
>
>> Additionally, we would receive a lot more RAM per virtual server.
>
> More RAM is always better -- the RAM which our Integricloud machines
> currently have is eye-wateringly small.
>
>> Finally, we would save a dramatic amount of money. We currently pay
>> 225$/mo pre-tax for Integricloud.
>
> Saving money is also good.
>
>> The current systems we run on Integricloud are:
>>
>
> I agree strongly with Kiyoshi here -- though I'm not so keen on having
> personal resources under the adelielinux.org banner, I won't object if
> they're made available for use by other contributors.

Refer to my response there.

> -
>
> I have some points to add to what Luis said:
>
> On 2019-07-13 16:58, Luis Ressel wrote:
>> I strongly agree with Aerdan here. In my opinion, the risks of moving to
>> VPSes on hardware shared with other tenants outweigh all (perceived or
>> real) benefits of using aarch64 instead of x86.
>
> The place I
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
On 07/13/19 11:28, Max Rees wrote:
>> My understanding is that we were planning on retiring the wiki. This
>> would be an excellent time to do so.
>
> Agreed. We are already moving some stuff to GitLab wikis and that seems
> appropriate. For other cases we should move forward on previous
> proposals to provision project-specific static websites (perhaps via
> GitLab static site generation).

As noted, it's quicker to just migrate them and move them off after. Unless someone else wants to pay for an August at Integricloud :/

>> I think that bts should be retired and merged with gitlab, or
>> alternatively it can be on the same server if retiring it is
>> contra-indicated (e.g. due to gitlab being unable to provide
>> bug-tracking without a git repo associated).
>
> No. GitLab issues are not sufficient for the needs of proper bug tracking.

Heavily +1.

>> The mailing lists should be on hosting separate from our other
>> infrastructure, since it can and should be usable regardless of the
>> rest of our infrastructure's dispositions.
>
> This is an important point and I agree.

Consensus seems to be building on this point. I will look into alternative hosting for the mailing lists, and perhaps write a (much smaller!) proposal with the options I find.

Best,
--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org

___
Adélie Open Governance mailing list -- adelie-project@lists.adelielinux.org
To unsubscribe send an email to adelie-project-le...@lists.adelielinux.org
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
On 07/13/19 10:58, Luis Ressel wrote:
> Hi *,
>
> I strongly agree with Aerdan here. In my opinion, the risks of moving to
> VPSes on hardware shared with other tenants outweigh all (perceived or
> real) benefits of using aarch64 instead of x86.

I suppose I can look into more dedicated hosting. Honestly, at our level of project income, I wasn't even considering it.

> Furthermore, I believe our current x86 infra has more than enough
> capacity to accommodate all services we require, so forgoing the use of
> scw aarch64 vpses would save us money too.

See my response to Kiyoshi for why I feel this is a Very Bad Idea™.

> However, I am in favour of migrating away from Integricloud regardless
> of the destination to which we'd migrate, be it aarch64 vpses, our
> already existing x86 infra, colo'ed x86 or ppc servers, or a cluster of
> raspis in someone's basement.

I do have a static /27 on AT&T, if everyone can stomach 5 Mbit/s egress. (That's not a typo, I have 30 down and 5 up at my office until the bonded pair is healed. ETA: whenever AT&T gets their fingers out of their ---)

> The idea of Integricloud is neat, but their unwillingness to set up a
> reliable, redundant uplink even after several outages that could've been
> easily avoided, combined with their completely ludicrous prices, makes
> them an extremely unattractive option in practice.

+1.

Best,
--arw

--
A. Wilcox (awilfox)
Project Lead, Adélie Linux
https://www.adelielinux.org
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
On 07/13/19 10:10, Kiyoshi Aman wrote:
> I think that, before we can make a good decision regarding migration, we
> should look at the resources we already have available to us. We have a
> dedicated server in Finland with 32GB RAM and a fair amount of disk
> space available. We have two dedicated servers generously provided for
> our use in Pennsylvania, albeit one earmarked for buildbot service and
> the other as yet provisioned (by us) for service.

There are a number of reasons that I didn't want to consider running everything on the Finland server, nor the Pennsylvania one.

1. They are sponsored, i.e. they are not paid directly by us. From experience, they don't last forever. See what happened with Rack911. Do I think either of them would do that to us? Probably not. However, we don't know what the future will hold, and we have no control over the futures of orgs that aren't us.

2. Putting all of our eggs in one Integricloud basket has already proven to be a TERRIBLE idea. If we pile all-in to the Leuhta or ionfish boxes, we're making the same mistake. (Moving GitLab to Finland and moving its SQL with it was one attempt at making that component independent from all the others.)

3. Hetzner specifically has garbage-tier ratings for DNSBL / running MXes. Our mailing lists would be classed as spam.

> In light of that, I want to take the server list below in reverse order.
>
> A. Wilcox wrote:
>> The current systems we run on Integricloud are:
>>
>> enfys (postgresql)              768 MB RAM    30 GB disk
>> rarity (these mailing lists)   1536 MB RAM    30 GB disk
>> mirrormaster                    256 MB RAM     1 TB disk
>> bts (Bugzilla issue tracking)   512 MB RAM     8 GB disk
>> athdheise (Web server/proxy)    256 MB RAM     4 GB disk
>> wiki                            512 MB RAM     8 GB disk
>> annwyn (Nextcloud)              512 MB RAM   100 GB disk
>> chatterbox (Quassel IRC)        512 MB RAM    40 GB disk
>
> At the moment, both chatterbox and annwyn are personal resources.
> Leaving aside the discussion as to whether they belong on project
> infrastructure, they should be migrated to personal infrastructure
> unless they are intended to be made more widely available to Adélie
> contributors (even if only to the core and/or infrastructure teams).

Yes, the intention was opening up Quassel to other project members. I was piloting Nextcloud for Adélie on annwyn and I wanted to run one for all contributors to rectify our current reliance on e.g. bpaste and imgur. I haven't finalised how this will look yet, hence why there is no proposal on -project yet :)

> athdheise only exists because IntegriCloud was not able to provide IPv4
> addresses at a price we were able to pay. It would be retired regardless.

athdheise is our main web server (www., help., support.) in addition to being the proxy. It won't be retired. It just won't be overtaxed.

> My understanding is that we were planning on retiring the wiki. This
> would be an excellent time to do so.

I didn't want to delay the migration off Integricloud until we can save all the content of the wiki to other places. I wanted to focus on "getting off of a 250$ per month money sink that is also going down a lot" before focusing on the process improvements around deprecating the wiki.

Deprecating the wiki also means we need to set up the GitLab page runners, which is blocking on me actually setting the thing up because I'm too busy dealing with Integricloud outages.

Migrating the wiki would take less than three minutes of time (add a pg_hba auth line, then `scp -pr /srv/www new-wiki:`). It would probably take a few days to a week before we could retire it.

> I think that bts should be retired and merged with gitlab, or
> alternatively it can be on the same server if retiring it is
> contra-indicated (e.g. due to gitlab being unable to provide
> bug-tracking without a git repo associated).

Retiring bts is a hard NAK from me. I can't stand the workflow nor the UI that GitLab Issues provides.
> mirrormaster would need to be migrated to the Finland server regardless,
> since no VPS provider provides block storage in the quantities we need
> at a rate we can live with.

Agreed.

> The mailing lists should be on hosting separate from our other
> infrastructure, since it can and should be usable regardless of the rest
> of our infrastructure's dispositions.

It was originally on Vultr, though I'd prefer not to do that again. Not only is the cost high, but enabling MX is a per-VM thing, so I'd have to get it approved again, which would delay the move. Also, it's x86 hardware.

> That leaves the postgresql server, which should be co-located with gitlab.

No. N. No. We are not putting the PostgreSQL server in Finland. I'm barely comfortable with it going in Norway for GDPR reasons, but Finland specifically has DRACONIAN regulations on handling PII, which
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
All,

I have a couple of things I'd like to say in response to both the original post and some of the points raised in the replies by Kiyoshi and Luis. I've been using Scaleway's virtualised services for nearly two years now (first with a pair of small x86_64 machines for personal use, and more recently an aarch64 machine for Adélie development work), so I have some operational experience which I'd like to share, and I also broadly agree with the points regarding shared tenancy of physical hardware.

Please excuse the thread-breaking, but I wanted to make sure this gets copied to adelie-infra properly. I'm also a bit over-caffeinated as I'm writing this, so apologies if it gets a bit rambly or hard to follow.

-

First, the OP:

On 2019-07-13 10:56, A. Wilcox wrote:
> Scaleway offers a similar level of reliability, and has a higher level of availability based on our current account with them. They additionally offer servers that are not based on the x86 architecture, so we are still protected from the numerous issues that plague x86.

Scaleway's business focus has shifted over the past year or so; they are no longer pushing ARM cloud services anywhere near as much as they used to, and now seem to be more interested in providing managed services in a similar vein to AWS. As such, there's limited capacity on their ARM cloud (which I believe they are no longer expanding), which could potentially cause issues in securing the resources necessary for a migration.

> We have had a working relationship with Scaleway for almost a year and a half. We launched our 32-bit ARM builder on the Scaleway ARM cloud in March 2018, and have had no downtime in that time:

I assume this is one of the dedicated ARMv7 machines (as all of Scaleway's other ARM services are ARMv8 from what I know). I haven't ever used any of those machines, but I'm pretty sure they are very different beasts from the virtualised ARMv8 services.

> The network has never suffered any outages, either.
> Since the Scaleway cloud features ARM servers, we would additionally still be able to avoid the x86 architecture and all of its failings.

I can't recall any particular dates, but from offhand experience Scaleway's network (at least the virtual machine estate in Amsterdam) has *definitely* suffered availability issues. (I can't speak for Scaleway's Paris network, which I'm assuming is more substantial.) I must stress, however, that these issues are nowhere near as bad as Integricloud's -- just don't expect perfection, because you'll be disappointed. For what it's worth, in my experience any of the brief network issues have disrupted IPv6 connectivity more than they have disrupted IPv4 connectivity.

> We have continually been limited by our lack of IPv4 space at Integricloud. Currently, we "proxy" every server via athdheise, a virtual server on our Integricloud dedicated system that has both an IPv4 and IPv6 address.

(This is an aside, but I've worked in an environment which has successfully operated services from an IPv6-only network, with a dual-stacked reverse proxy at the network border to handle connections from IPv4-only clients. The border gateway ran HAProxy, which is capable of selecting backends based on server name indication in TLS handshakes; as the SNI is sent before any key exchange is performed, the gateway machine did not need access to any private key material, and could be used for any protocol which runs over TLS and uses SNI.)

> If we use Scaleway virtual servers, every system gets its own dedicated IPv4 address, which drastically simplifies our administration.

Scaleway's network configuration is weird for virtual machines -- I'll get to that in my operational experience spiel in a bit.

> Additionally, we would receive a lot more RAM per virtual server.

More RAM is always better -- the RAM which our Integricloud machines currently have is eye-wateringly small.

> Finally, we would save a dramatic amount of money.
> We currently pay 225$/mo pre-tax for Integricloud.

Saving money is also good.

> The current systems we run on Integricloud are:

I agree strongly with Kiyoshi here -- though I'm not so keen on having personal resources under the adelielinux.org banner, I won't object if they're made available for use by other contributors.

-

I have some points to add to what Luis said:

On 2019-07-13 16:58, Luis Ressel wrote:
> I strongly agree with Aerdan here. In my opinion, the risks of moving to VPSes on hardware shared with other tenants outweigh all (perceived or real) benefits of using aarch64 instead of x86.

The place I mentioned above with the IPv4-to-IPv6 gateway also provided virtualised hosting services, and I interacted with their systems for provisioning and managing customer VMs on a number of occasions. It's *really easy* for the party which controls the host to reboot a guest into a rescue
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
Apologies if this email is malformed, I had to send it from my phone.

On Jul 13, 2019, at 11:10 AM, Kiyoshi Aman wrote:
> My understanding is that we were planning on retiring the wiki. This would be
> an excellent time to do so.

Agreed. We are already moving some stuff to GitLab wikis and that seems appropriate. For other cases we should move forward on previous proposals to provision project-specific static websites (perhaps via GitLab static site generation).

> I think that bts should be retired and merged with gitlab, or alternatively
> it can be on the same server if retiring it is contra-indicated (e.g. due to
> gitlab being unable to provide bug-tracking without a git repo associated).

No. GitLab issues are not sufficient for the needs of proper bug tracking.

> The mailing lists should be on hosting separate from our other
> infrastructure, since it can and should be usable regardless of the rest of
> our infrastructure's dispositions.

This is an important point and I agree.

Max
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
Hi *,

I strongly agree with Aerdan here. In my opinion, the risks of moving to VPSes on hardware shared with other tenants outweigh all (perceived or real) benefits of using aarch64 instead of x86.

Furthermore, I believe our current x86 infra has more than enough capacity to accommodate all services we require, so forgoing the use of scw aarch64 vpses would save us money too.

However, I am in favour of migrating away from Integricloud regardless of the destination to which we'd migrate, be it aarch64 vpses, our already existing x86 infra, colo'ed x86 or ppc servers, or a cluster of raspis in someone's basement.

The idea of Integricloud is neat, but their unwillingness to set up a reliable, redundant uplink even after several outages that could've been easily avoided, combined with their completely ludicrous prices, makes them an extremely unattractive option in practice.

Cheers,
Luis
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
I think that, before we can make a good decision regarding migration, we should look at the resources we already have available to us. We have a dedicated server in Finland with 32GB RAM and a fair amount of disk space available. We have two dedicated servers generously provided for our use in Pennsylvania, albeit one earmarked for buildbot service and the other as yet provisioned (by us) for service.

In light of that, I want to take the server list below in reverse order.

A. Wilcox wrote:
> The current systems we run on Integricloud are:
>
> enfys (postgresql)              768 MB RAM    30 GB disk
> rarity (these mailing lists)   1536 MB RAM    30 GB disk
> mirrormaster                    256 MB RAM     1 TB disk
> bts (Bugzilla issue tracking)   512 MB RAM     8 GB disk
> athdheise (Web server/proxy)    256 MB RAM     4 GB disk
> wiki                            512 MB RAM     8 GB disk
> annwyn (Nextcloud)              512 MB RAM   100 GB disk
> chatterbox (Quassel IRC)        512 MB RAM    40 GB disk

At the moment, both chatterbox and annwyn are personal resources. Leaving aside the discussion as to whether they belong on project infrastructure, they should be migrated to personal infrastructure unless they are intended to be made more widely available to Adélie contributors (even if only to the core and/or infrastructure teams).

athdheise only exists because IntegriCloud was not able to provide IPv4 addresses at a price we were able to pay. It would be retired regardless.

My understanding is that we were planning on retiring the wiki. This would be an excellent time to do so.

I think that bts should be retired and merged with gitlab, or alternatively it can be on the same server if retiring it is contra-indicated (e.g. due to gitlab being unable to provide bug-tracking without a git repo associated).

mirrormaster would need to be migrated to the Finland server regardless, since no VPS provider provides block storage in the quantities we need at a rate we can live with.
The mailing lists should be on hosting separate from our other infrastructure, since it can and should be usable regardless of the rest of our infrastructure's dispositions.

That leaves the postgresql server, which should be co-located with gitlab.

I understand that one of our goals is for our infrastructure to not be subject to architectural issues with x86_64. In principle I agree, but migrating the majority of our infrastructure from VPSes on a single dedicated server to VPSes on an unknown number of servers, especially in today's security environment, carries more risk than ensuring our infrastructure sits on hardware we know is used only by us.
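
For a rough sense of scale in the capacity discussion, the server list quoted in this message can be totalled. The following is a small Python sketch that only adds up the figures as listed; it treats 1 TB as 1000 GB (an assumption) and says nothing about what the services would need on different hardware. The function name `total_resources` is made up for illustration.

```python
# (name, RAM in MB, disk in GB), copied from the quoted Integricloud list.
SERVERS = [
    ("enfys", 768, 30),
    ("rarity", 1536, 30),
    ("mirrormaster", 256, 1000),  # listed as 1 TB; taken as 1000 GB here
    ("bts", 512, 8),
    ("athdheise", 256, 4),
    ("wiki", 512, 8),
    ("annwyn", 512, 100),
    ("chatterbox", 512, 40),
]


def total_resources(servers, exclude=()):
    """Return (total RAM in MB, total disk in GB), skipping excluded names."""
    kept = [s for s in servers if s[0] not in exclude]
    return sum(ram for _, ram, _ in kept), sum(disk for _, _, disk in kept)


if __name__ == "__main__":
    ram, disk = total_resources(SERVERS)
    print(f"all systems: {ram} MB RAM, {disk} GB disk")

    # Without the two systems described as personal resources.
    ram, disk = total_resources(SERVERS, exclude={"annwyn", "chatterbox"})
    print(f"project-only: {ram} MB RAM, {disk} GB disk")
```

By this count the allocated RAM across all eight systems is under 5 GB, against the 32GB quoted for the Finland server.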
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
This sounds reasonable. I approve of this plan. +1.
[adelie-project] Re: Proposal: Replacing Integricloud with Scaleway
This sounds reasonable. I approve of this plan.

Sent from my iPad

> On Jul 13, 2019, at 05:56, A. Wilcox wrote:
>
> == Table of Contents ==
>
> * Executive Summary
> * How Did We Get Here?
> * Reliability Is Not Availability
> * Enter Scaleway
> * Hard Numbers
> * Conclusion
> * References
>
>
> == Executive Summary ==
>
> This is a formal proposal to retire the dedicated server we have with
> Integricloud and replace it with a set of virtual servers from Scaleway.
>
> We originally chose Integricloud's dedicated server offering primarily
> for reliability and security. While it has proven secure, and the
> hardware itself is reliable, its availability leaves something to be
> desired.
>
> Scaleway offers a similar level of reliability, and has a higher level
> of availability based on our current account with them. They
> additionally offer servers that are not based on the x86 architecture,
> so we are still protected from the numerous issues that plague x86.
>
> This will also reduce our hosting costs by almost 90%, and should reduce
> downtime by nearly 100%.
>
>
> == How Did We Get Here? ==
>
> In early January 2019, we were notified that both of our dedicated
> servers at Rack911 were being retired, with very little notice. For
> some additional information, reference adelie-devel@ post with message
> ID
> (archived at [1]).
>
> After our sponsorship was pulled in October 2018, we had done a bit of
> investigation into replacement hosting providers in the event that this
> would happen. Our requirements at the time were:
>
> * non-x86 based (due to the plethora of x86 bugs being discovered)
> * at least 8 GB RAM minimum
> * dedicated hardware preferred
> * at least 3 IPv4 addresses
>
> We evaluated Packet.net for ARM64 based systems[2] and Integricloud for
> PPC64 based systems[3]. We found Integricloud to be approximately 60%
> of the cost of Packet.net[4].
> Additionally, we had a professional
> working relationship with their parent company, Raptor Engineering, who
> make the Talos and Blackbird family of computers. In fact, the
> Integricloud system we were offered was to be a rack-mounted Talos II.
> Since we already had a Talos II in use as a build server, we felt this
> would be close to ideal, as any hardware oddities have already been
> worked out.
>
> We chose their 4-core (16-thread) PowerPC system with 8 GB RAM and 2 x 1
> TB NVMe disk storage. One 1 TB NVMe disk is dedicated to
> mirrormaster.adelielinux.org. The other 1 TB NVMe disk is an LVM group,
> shared between the various KVM-based virtual servers run on it.
>
>
> == Reliability Is Not Availability ==
>
> The Integricloud dedicated server, chloe.adelielinux.org, has had no
> hardware issues in over eight months of service. The hardware itself
> has been fast, stable, and very reliable. However, there have been
> multiple issues regarding availability.
>
> Integricloud has a single homed fibre infrastructure; per a public
> looking glass, it is run via Mediacom[5]. This has caused an unforeseen
> and consistent issue regarding availability.
>
> 2019-04-16 13:17  down
> 2019-04-16 22:24  9 hours, 7 minutes
>
> 2019-04-17 00:10  down
> 2019-04-17 12:29  12 hours, 19 minutes
>
> 2019-07-09 06:25  down
> 2019-07-09 20:01  13 hours, 37 minutes
>
> 2019-07-10 15:14  down
> 2019-07-10 15:39  25 minutes
>
> 2019-07-12 16:35  down
> 2019-07-12 16:43  8 minutes
>
> This has resulted in a 97% uptime for April, and a 98% uptime for July -
> and we are only 13 days into July, so this number could go down further.
>
> Additionally, many ISPs are not accepting Mediacom's IPv6 route
> announcements. This has caused mirrormaster to be inaccessible to many
> of our users, and even one of the members of our own Infra Team[6].
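
The uptime percentages quoted above can be reproduced from the outage table. Here is a minimal Python sketch under two assumptions of mine: each outage is counted in the month it starts in, and uptime is measured against the full calendar month. With those assumptions the figures come out at roughly 97% for April and 98% for July; measured against only the days of July elapsed so far, the July figure would be closer to 95%. The timestamps are given to the minute, so a computed duration can differ by a minute from the one listed.

```python
import calendar
from datetime import datetime, timedelta

# Outage windows (start, end) copied from the table in the proposal.
OUTAGES = [
    ("2019-04-16 13:17", "2019-04-16 22:24"),
    ("2019-04-17 00:10", "2019-04-17 12:29"),
    ("2019-07-09 06:25", "2019-07-09 20:01"),
    ("2019-07-10 15:14", "2019-07-10 15:39"),
    ("2019-07-12 16:35", "2019-07-12 16:43"),
]
FMT = "%Y-%m-%d %H:%M"


def downtime_by_month(outages):
    """Sum outage durations per (year, month) of the outage start."""
    totals = {}
    for start, end in outages:
        began = datetime.strptime(start, FMT)
        ended = datetime.strptime(end, FMT)
        key = (began.year, began.month)
        totals[key] = totals.get(key, timedelta()) + (ended - began)
    return totals


def monthly_uptime_percent(year, month, downtime):
    """Uptime as a percentage of the whole calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return 100.0 * (1 - downtime / timedelta(days=days_in_month))


if __name__ == "__main__":
    for (year, month), down in sorted(downtime_by_month(OUTAGES).items()):
        pct = monthly_uptime_percent(year, month, down)
        print(f"{year}-{month:02d}: down {down}, uptime {pct:.2f}%")
```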
>
> Finally, while yours truly was trying to show an Adélie Web page to
> someone while on public Wi-Fi at a well-known place in Broken Arrow, OK,
> I was greeted with an error page[7]:
>
>     Sonicwall Network Security Appliance
>     This site has been blocked by the network administrator.
>     Block reason: Gateway GEO-IP Filter Alert
>     IP address: 23.155.224.64
>     Connection initiated towards country: Unknown
>
> If a car dealership's firewall is blocking us, who knows what other
> firewalls are blocking us. How many people are unable to discover us, and
> how many corporate sponsors are we missing out on, because they can't
> even connect to our Web site? And why can they not connect to our Web
> site? It could be the IPv6 peering issue, or a firewall blocking our
> IPv4 space, or because Mediacom has suffered another "fibre cut".
>
>
> == Enter Scaleway ==
>
> We have had a working relationship with Scaleway for almost a year and a
> half. We launched our 32-bit ARM builder on the Scaleway ARM cloud in
> March 2018, and have had no downtime in that time:
>
> awilcox on erin [pts/0 Sat 13 9:33] ~: uptime
> 09:33:02 up 489 days, 5:59,