Re: New mirrorlist server implementation
Thanks a lot. I will change the container to be based on this instead of the binary I have built locally. Perfect! Adrian On Tue, Nov 19, 2019 at 07:54:34AM +0100, Igor Gnatenko wrote: > And I finally got this done. The update for F31 is here: > https://bodhi.fedoraproject.org/updates/FEDORA-2019-48a3fc7eb5 > > On Tue, Oct 8, 2019 at 8:53 AM Igor Gnatenko > wrote: > > > > I'll try to package new Rust implementation in Fedora :) > > > > And thanks for working on this! > > > > On Tue, Oct 8, 2019 at 8:52 AM Adrian Reber wrote: > > > > > > > > > Fedora's complete MirrorManager setup is still running on Python2. The > > > code has been ported to Python3 probably over two years ago but we have > > > not switched yet. One of the reasons is that the backend is running on > > > RHEL7 which means we are not in a hurry to deploy the Python3 version. > > > > > > The mirrorlist server which is answering the actual dnf/yum queries for > > > a mirrorlist/metalink is, however, running in a Fedora 29 container. > > > This container also still uses Python2 and it actually cannot use the > > > Python3 version. > > > > > > One of MirrorManager's design points is that the mirrorlist servers, > > > which are answering around 27 000 000 requests per day, are not directly > > > accessing the database. The backend creates a snapshot of the relevant > > > data (113MB) and the mirrorlist servers are using this snapshot to > > > answer client requests. > > > > > > This data exchange is based on Python's pickle format and that does not > > > seem to work with Python3 if it is generated using Python2. > > > > > > Having used protobuf before, I added code to also export the data for the > > > mirrorlist servers based on protobuf. > > > > > > The good news with protobuf is, that the resulting file is only 66MB > > > instead of 113MB. The bad news is, that loading it from Python requires > > > 3.5 times the amount of memory during runtime (3.5GB instead of 1GB). 
> > > > > > In addition to the data exchange problems between backend and > > > mirrorlist servers the architecture of the mirrorlist server does not > > > really make sense today. 12 years ago it made a lot of sense as it could > > > be easily integrated into httpd and it could be easily reloaded without > > > stopping the service. Today the mirrorlist server and httpd is all part > > > of a container which is then behind haproxy. So there is a lot of > > > infrastructure in the container which is not really useful. > > > > > > To get rid of the pickle format and to have a simpler architecture I > > > reimplemented the mirrorlist-server in Rust. This was brought up some > > > time ago on a ticket and with the protobuf problems I was seeing in > > > Python it made sense to try it out. > > > > > > My code currently can be found at > > > https://github.com/adrianreber/mirrorlist-server > > > and so far the results from the new mirrorlist server are the same as > > > from the Python based mirrorlist server. > > > > > > It requires less than 700MB instead of the 1GB in Python with production > > > based data and seems really fast. > > > > > > I have set up a test instance with the mirror data from Sunday at: > > > > > > https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64 > > > https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64 > > > > > > The instance is based on the container I pushed to quay.io: > > > > > > $ podman run quay.io/adrianreber/mirrorlist-server:latest -h > > > > > > With this change the mirrorlist server would also finally switch to > > > geoip2. The currently running mirrorlist server still uses the legacy > > > geoip database. > > > > > > After the Fedora 31 freeze I would like to introduce this new mirrorlist > > > server implementation on the proxies. I already verified that I can run > > > this mirrorlist container rootless. 
This new container can be a drop-in > > > replacement for the current container and no infrastructure around it > > > needs to be changed. > > > > > > The main changes to get it into production is to change > > > mirrorlist1.service > > > and mirrorlist2.service to include a line "User=mirrormanager" and > > > replace the current container name w
Re: New mirrorlist server implementation
And I finally got this done. The update for F31 is here: https://bodhi.fedoraproject.org/updates/FEDORA-2019-48a3fc7eb5 On Tue, Oct 8, 2019 at 8:53 AM Igor Gnatenko wrote: > > I'll try to package new Rust implementation in Fedora :) > > And thanks for working on this! > > On Tue, Oct 8, 2019 at 8:52 AM Adrian Reber wrote: > > > > > > Fedora's complete MirrorManager setup is still running on Python2. The > > code has been ported to Python3 probably over two years ago but we have > > not switched yet. One of the reasons is that the backend is running on > > RHEL7 which means we are not in a hurry to deploy the Python3 version. > > > > The mirrorlist server which is answering the actual dnf/yum queries for > > a mirrorlist/metalink is, however, running in a Fedora 29 container. > > This container also still uses Python2 and it actually cannot use the > > Python3 version. > > > > One of MirrorManager's design points is that the mirrorlist servers, > > which are answering around 27 000 000 requests per day, are not directly > > accessing the database. The backend creates a snapshot of the relevant > > data (113MB) and the mirrorlist servers are using this snapshot to > > answer client requests. > > > > This data exchange is based on Python's pickle format and that does not > > seem to work with Python3 if it is generated using Python2. > > > > Having used protobuf before, I added code to also export the data for the > > mirrorlist servers based on protobuf. > > > > The good news with protobuf is, that the resulting file is only 66MB > > instead of 113MB. The bad news is, that loading it from Python requires > > 3.5 times the amount of memory during runtime (3.5GB instead of 1GB). > > > > In addition to the data exchange problems between backend and > > mirrorlist servers the architecture of the mirrorlist server does not > > really make sense today. 
12 years ago it made a lot of sense as it could > > be easily integrated into httpd and it could be easily reloaded without > > stopping the service. Today the mirrorlist server and httpd is all part > > of a container which is then behind haproxy. So there is a lot of > > infrastructure in the container which is not really useful. > > > > To get rid of the pickle format and to have a simpler architecture I > > reimplemented the mirrorlist-server in Rust. This was brought up some > > time ago on a ticket and with the protobuf problems I was seeing in > > Python it made sense to try it out. > > > > My code currently can be found at > > https://github.com/adrianreber/mirrorlist-server > > and so far the results from the new mirrorlist server are the same as > > from the Python based mirrorlist server. > > > > It requires less than 700MB instead of the 1GB in Python with production > > based data and seems really fast. > > > > I have set up a test instance with the mirror data from Sunday at: > > > > https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64 > > https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64 > > > > The instance is based on the container I pushed to quay.io: > > > > $ podman run quay.io/adrianreber/mirrorlist-server:latest -h > > > > With this change the mirrorlist server would also finally switch to > > geoip2. The currently running mirrorlist server still uses the legacy > > geoip database. > > > > After the Fedora 31 freeze I would like to introduce this new mirrorlist > > server implementation on the proxies. I already verified that I can run > > this mirrorlist container rootless. This new container can be a drop-in > > replacement for the current container and no infrastructure around it > > needs to be changed. > > > > The main changes to get it into production is to change mirrorlist1.service > > and mirrorlist2.service to include a line "User=mirrormanager" and > > replace the current container name with new container. 
> >
> > Adrian

___
infrastructure mailing list -- infrastructure@lists.fedoraproject.org
To unsubscribe send an email to infrastructure-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/infrastructure@lists.fedoraproject.org
Re: New mirrorlist server implementation
On Mon, Oct 14, 2019 at 07:42:30AM -0700, Kevin Fenzi wrote:
> On Mon, Oct 14, 2019 at 08:53:26AM -0400, Stephen John Smoogen wrote:
> > On Mon, 14 Oct 2019 at 03:39, Adrian Reber wrote:
> >
> > I would say proxy14. It should have ipv6 and is not an OMG we broke
> > koji/everything else like proxy01,10,110,101 can be.
>
> +1

I will prepare the ansible changes and post them here as a freeze break
request.

Adrian
Re: New mirrorlist server implementation
On Mon, Oct 14, 2019 at 08:53:26AM -0400, Stephen John Smoogen wrote:
> On Mon, 14 Oct 2019 at 03:39, Adrian Reber wrote:
>
> I would say proxy14. It should have ipv6 and is not an OMG we broke
> koji/everything else like proxy01,10,110,101 can be.

+1

kevin
Re: New mirrorlist server implementation
On Mon, 14 Oct 2019 at 03:39, Adrian Reber wrote: > > On Sun, Oct 13, 2019 at 02:02:41PM -0700, Kevin Fenzi wrote: > > On Fri, Oct 11, 2019 at 08:06:20AM +0200, Adrian Reber wrote: > > > On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote: > > > > On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote: > > > > > After the Fedora 31 freeze I would like to introduce this new > > > > > mirrorlist > > > > > server implementation on the proxies. I already verified that I can > > > > > run > > > > > this mirrorlist container rootless. This new container can be a > > > > > drop-in > > > > > replacement for the current container and no infrastructure around it > > > > > needs to be changed. > > > > > > > > > > The main changes to get it into production is to change > > > > > mirrorlist1.service > > > > > and mirrorlist2.service to include a line "User=mirrormanager" and > > > > > replace the current container name with new container. > > > > > > > > Awesome. > > > > > > > > How about we get this deployed in stg soonish so we can test it out > > > > more. > > > > > > Thanks to smooge's help we were able to switch staging to the new > > > mirrorlist. All changes in ansible are staging only. > > > > > > So you should get almost the same result (modulo randomness) from: > > > > > > https://mirrors.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > > > https://mirrors.stg.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > > > > > > My plan was to wait until after the freeze to switch prod to the new > > > setup, but there was also the proposal to switch one production proxy > > > server to the new setup earlier to see if it also works under real load. > > > > > > To test the new setup on one production proxy sounds like a good idea to > > > me, especially if we can try it before the actual Fedora 31 release. > > > > > > If it does not work, we can easily revert to the old setup. If it works, > > > however, I am not sure yet what this would mean. 
Running only one proxy
> > > with the new code and everything else with the current code does not
> > > seem like a good idea. So if the test on one production proxy is
> > > successful it would mean to switch everything to the new setup during
> > > the freeze?? Maybe also not the best idea.
> > >
> > > I like the idea of trying it in prod, but I am unsure what to do with
> > > result of that try.
> > >
> > > Any further comments about trying this during the freeze?
> >
> > I'd be +1 on upgrading one production proxy during the freeze.
> >
> > However, at that point I would say we wait until after the freeze to do
> > anything further. If the one prod proxy did well, we look at upgrading
> > everything. If it broke we look at fixing it before we upgrade. :)
> >
> > Seem reasonable? That would let us get a lot of traffic processed before
> > we moved everything to production and would let us still release with
> > all the other proxies if something happened to it.
>
> Sounds good to me. If you tell me which proxy, I can adapt all the
> conditionals in ansible to include that proxy in addition to the staging
> environment.

I would say proxy14. It should have ipv6 and is not an OMG we broke
koji/everything else like proxy01,10,110,101 can be.

--
Stephen J Smoogen.
Re: New mirrorlist server implementation
On Sun, Oct 13, 2019 at 02:02:41PM -0700, Kevin Fenzi wrote: > On Fri, Oct 11, 2019 at 08:06:20AM +0200, Adrian Reber wrote: > > On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote: > > > On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote: > > > > After the Fedora 31 freeze I would like to introduce this new mirrorlist > > > > server implementation on the proxies. I already verified that I can run > > > > this mirrorlist container rootless. This new container can be a drop-in > > > > replacement for the current container and no infrastructure around it > > > > needs to be changed. > > > > > > > > The main changes to get it into production is to change > > > > mirrorlist1.service > > > > and mirrorlist2.service to include a line "User=mirrormanager" and > > > > replace the current container name with new container. > > > > > > Awesome. > > > > > > How about we get this deployed in stg soonish so we can test it out > > > more. > > > > Thanks to smooge's help we were able to switch staging to the new > > mirrorlist. All changes in ansible are staging only. > > > > So you should get almost the same result (modulo randomness) from: > > > > https://mirrors.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > > https://mirrors.stg.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > > > > My plan was to wait until after the freeze to switch prod to the new > > setup, but there was also the proposal to switch one production proxy > > server to the new setup earlier to see if it also works under real load. > > > > To test the new setup on one production proxy sounds like a good idea to > > me, especially if we can try it before the actual Fedora 31 release. > > > > If it does not work, we can easily revert to the old setup. If it works, > > however, I am not sure yet what this would mean. Running only one proxy > > with the new code and everything else with the current code does not > > seem like a good idea. 
So if the test on one production proxy is
> > successful it would mean to switch everything to the new setup during
> > the freeze?? Maybe also not the best idea.
> >
> > I like the idea of trying it in prod, but I am unsure what to do with
> > result of that try.
> >
> > Any further comments about trying this during the freeze?
>
> I'd be +1 on upgrading one production proxy during the freeze.
>
> However, at that point I would say we wait until after the freeze to do
> anything further. If the one prod proxy did well, we look at upgrading
> everything. If it broke we look at fixing it before we upgrade. :)
>
> Seem reasonable? That would let us get a lot of traffic processed before
> we moved everything to production and would let us still release with
> all the other proxies if something happened to it.

Sounds good to me. If you tell me which proxy, I can adapt all the
conditionals in ansible to include that proxy in addition to the staging
environment.

Adrian
Re: New mirrorlist server implementation
On Fri, Oct 11, 2019 at 08:06:20AM +0200, Adrian Reber wrote: > On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote: > > On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote: > > > After the Fedora 31 freeze I would like to introduce this new mirrorlist > > > server implementation on the proxies. I already verified that I can run > > > this mirrorlist container rootless. This new container can be a drop-in > > > replacement for the current container and no infrastructure around it > > > needs to be changed. > > > > > > The main changes to get it into production is to change > > > mirrorlist1.service > > > and mirrorlist2.service to include a line "User=mirrormanager" and > > > replace the current container name with new container. > > > > Awesome. > > > > How about we get this deployed in stg soonish so we can test it out > > more. > > Thanks to smooge's help we were able to switch staging to the new > mirrorlist. All changes in ansible are staging only. > > So you should get almost the same result (modulo randomness) from: > > https://mirrors.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > https://mirrors.stg.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64 > > My plan was to wait until after the freeze to switch prod to the new > setup, but there was also the proposal to switch one production proxy > server to the new setup earlier to see if it also works under real load. > > To test the new setup on one production proxy sounds like a good idea to > me, especially if we can try it before the actual Fedora 31 release. > > If it does not work, we can easily revert to the old setup. If it works, > however, I am not sure yet what this would mean. Running only one proxy > with the new code and everything else with the current code does not > seem like a good idea. So if the test on one production proxy is > successful it would mean to switch everything to the new setup during > the freeze?? Maybe also not the best idea. 
>
> I like the idea of trying it in prod, but I am unsure what to do with
> result of that try.
>
> Any further comments about trying this during the freeze?

I'd be +1 on upgrading one production proxy during the freeze.

However, at that point I would say we wait until after the freeze to do
anything further. If the one prod proxy did well, we look at upgrading
everything. If it broke we look at fixing it before we upgrade. :)

Seem reasonable? That would let us get a lot of traffic processed before
we moved everything to production and would let us still release with
all the other proxies if something happened to it.

kevin
Re: New mirrorlist server implementation
On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote:
> On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote:
> > After the Fedora 31 freeze I would like to introduce this new mirrorlist
> > server implementation on the proxies. I already verified that I can run
> > this mirrorlist container rootless. This new container can be a drop-in
> > replacement for the current container and no infrastructure around it
> > needs to be changed.
> >
> > The main changes to get it into production is to change mirrorlist1.service
> > and mirrorlist2.service to include a line "User=mirrormanager" and
> > replace the current container name with new container.
>
> Awesome.
>
> How about we get this deployed in stg soonish so we can test it out
> more.

Thanks to smooge's help we were able to switch staging to the new
mirrorlist. All changes in ansible are staging only.

So you should get almost the same result (modulo randomness) from:

https://mirrors.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64
https://mirrors.stg.fedoraproject.org/mirrorlist?repo=epel-7&arch=x86_64

My plan was to wait until after the freeze to switch prod to the new
setup, but there was also the proposal to switch one production proxy
server to the new setup earlier to see if it also works under real load.

To test the new setup on one production proxy sounds like a good idea to
me, especially if we can try it before the actual Fedora 31 release.

If it does not work, we can easily revert to the old setup. If it works,
however, I am not sure yet what this would mean. Running only one proxy
with the new code and everything else with the current code does not
seem like a good idea. So if the test on one production proxy is
successful it would mean to switch everything to the new setup during
the freeze?? Maybe also not the best idea.

I like the idea of trying it in prod, but I am unsure what to do with
the result of that try.

Any further comments about trying this during the freeze?

Adrian
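A rough way to check "almost the same result (modulo randomness)" is to compare the two responses as unordered sets of URLs instead of line by line. The sketch below is only an illustration, not part of MirrorManager: the function names are made up, and it assumes that a mirrorlist response is plain text with one mirror URL per line and that lines starting with "#" are comments. Two answers can still legitimately differ in which mirrors they contain, so a non-empty difference is a hint to look closer, not proof of a bug.

```python
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download a mirrorlist response body as text (helper, not called here)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_mirrorlist(text):
    """Return the set of mirror URLs in a mirrorlist response body.

    Assumption: one URL per line, '#' lines are comments, blank lines ignored.
    """
    urls = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        urls.add(line)
    return urls


def compare_mirrorlists(old_text, new_text):
    """Compare two responses while ignoring the order of the mirrors."""
    old = parse_mirrorlist(old_text)
    new = parse_mirrorlist(new_text)
    return {
        "only_old": sorted(old - new),   # mirrors only the old server returned
        "only_new": sorted(new - old),   # mirrors only the new server returned
        "common": len(old & new),        # mirrors both servers returned
    }
```

To use it against the two URLs above, pass `fetch(prod_url)` and `fetch(stg_url)` to `compare_mirrorlists` and inspect the `only_old` and `only_new` lists.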
Re: New mirrorlist server implementation
On Tue, Oct 08, 2019 at 05:57:45PM +0200, Adrian Reber wrote:
> On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote:
> > On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote:
> > ...snip...
> > >
> > > After the Fedora 31 freeze I would like to introduce this new mirrorlist
> > > server implementation on the proxies. I already verified that I can run
> > > this mirrorlist container rootless. This new container can be a drop-in
> > > replacement for the current container and no infrastructure around it
> > > needs to be changed.
> > >
> > > The main changes to get it into production is to change mirrorlist1.service
> > > and mirrorlist2.service to include a line "User=mirrormanager" and
> > > replace the current container name with new container.
> >
> > Awesome.
> >
> > How about we get this deployed in stg soonish so we can test it out
> > more.
>
> I was not aware, that the mirrorlist containers are also running in stg,
> but they are, good.

Yeah, it should be all setup anyhow...

> Assuming we want to run the containers rootless as the mirrormanager
> user I would need the necessary entries in /etc/subuid and /etc/subgid.
> Grepping through the ansible repository this does not seem to be used
> yet. If someone can setup subuid and subgid I can do everything else.

Currently we have been running them as root (but of course the httpd in
the container as apache, etc). So, if you just want to pick those and get
it working non root, that would be great.

kevin
Re: New mirrorlist server implementation
On Tue, Oct 08, 2019 at 08:38:18AM -0700, Kevin Fenzi wrote:
> On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote:
> ...snip...
> >
> > After the Fedora 31 freeze I would like to introduce this new mirrorlist
> > server implementation on the proxies. I already verified that I can run
> > this mirrorlist container rootless. This new container can be a drop-in
> > replacement for the current container and no infrastructure around it
> > needs to be changed.
> >
> > The main changes to get it into production is to change mirrorlist1.service
> > and mirrorlist2.service to include a line "User=mirrormanager" and
> > replace the current container name with new container.
>
> Awesome.
>
> How about we get this deployed in stg soonish so we can test it out
> more.

I was not aware that the mirrorlist containers are also running in stg,
but they are, good.

Assuming we want to run the containers rootless as the mirrormanager
user, I would need the necessary entries in /etc/subuid and /etc/subgid.
Grepping through the ansible repository, this does not seem to be used
yet. If someone can set up subuid and subgid, I can do everything else.

Adrian
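For anyone wanting to check whether a host already has the entries mentioned above, here is a small sketch. It assumes the usual "name:start:count" line format of /etc/subuid and /etc/subgid; the function names and the sample values in the usage note are hypothetical, and nothing here comes from the Fedora ansible repository.

```python
def subid_ranges(text, user):
    """Return the (start, count) ranges assigned to `user`.

    `text` is the content of a subuid/subgid style file, where each line is
    expected to look like "name:start:count". Blank lines, lines starting
    with '#', and malformed lines are skipped defensively.
    """
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 3:
            continue
        name, start, count = fields
        if name != user:
            continue
        try:
            ranges.append((int(start), int(count)))
        except ValueError:
            continue  # non-numeric start or count
    return ranges


def has_subids(user, paths=("/etc/subuid", "/etc/subgid")):
    """Return True if `user` has at least one range in every listed file."""
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                if not subid_ranges(handle.read(), user):
                    return False
        except FileNotFoundError:
            return False
    return True
```

For example, `subid_ranges("mirrormanager:100000:65536\n", "mirrormanager")` yields one range starting at 100000 with 65536 IDs; those numbers are only illustrative, the real range is whatever the admins allocate.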
Re: New mirrorlist server implementation
On Tue, Oct 08, 2019 at 08:42:13AM +0200, Adrian Reber wrote:
...snip...
>
> After the Fedora 31 freeze I would like to introduce this new mirrorlist
> server implementation on the proxies. I already verified that I can run
> this mirrorlist container rootless. This new container can be a drop-in
> replacement for the current container and no infrastructure around it
> needs to be changed.
>
> The main changes to get it into production is to change mirrorlist1.service
> and mirrorlist2.service to include a line "User=mirrormanager" and
> replace the current container name with new container.

Awesome.

How about we get this deployed in stg soonish so we can test it out
more.

Thanks so much for working on this.

kevin
Re: New mirrorlist server implementation
On Tue, 8 Oct 2019 at 02:42, Adrian Reber wrote: > > > Fedora's complete MirrorManager setup is still running on Python2. The > code has been ported to Python3 probably over two years ago but we have > not switched yet. One of the reasons is that the backend is running on > RHEL7 which means we are not in a hurry to deploy the Python3 version. > > The mirrorlist server which is answering the actual dnf/yum queries for > a mirrorlist/metalink is, however, running in a Fedora 29 container. > This container also still uses Python2 and it actually cannot use the > Python3 version. > > One of MirrorManager's design points is that the mirrorlist servers, > which are answering around 27 000 000 requests per day, are not directly > accessing the database. The backend creates a snapshot of the relevant > data (113MB) and the mirrorlist servers are using this snapshot to > answer client requests. > > This data exchange is based on Python's pickle format and that does not > seem to work with Python3 if it is generated using Python2. > > Having used protobuf before, I added code to also export the data for the > mirrorlist servers based on protobuf. > > The good news with protobuf is, that the resulting file is only 66MB > instead of 113MB. The bad news is, that loading it from Python requires > 3.5 times the amount of memory during runtime (3.5GB instead of 1GB). > > In addition to the data exchange problems between backend and > mirrorlist servers the architecture of the mirrorlist server does not > really make sense today. 12 years ago it made a lot of sense as it could > be easily integrated into httpd and it could be easily reloaded without > stopping the service. Today the mirrorlist server and httpd is all part > of a container which is then behind haproxy. So there is a lot of > infrastructure in the container which is not really useful. > > To get rid of the pickle format and to have a simpler architecture I > reimplemented the mirrorlist-server in Rust. 
This was brought up some
> time ago on a ticket and with the protobuf problems I was seeing in
> Python it made sense to try it out.
>
> My code currently can be found at
> https://github.com/adrianreber/mirrorlist-server
> and so far the results from the new mirrorlist server are the same as
> from the Python based mirrorlist server.
>
> It requires less than 700MB instead of the 1GB in Python with production
> based data and seems really fast.
>
> I have set up a test instance with the mirror data from Sunday at:
>
> https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64
> https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64
>

Nice. Very very nice. Mirror management software is hard. Thank you
for doing all this work.

--
Stephen J Smoogen.
Re: New mirrorlist server implementation
I'll try to package new Rust implementation in Fedora :)

And thanks for working on this!

On Tue, Oct 8, 2019 at 8:52 AM Adrian Reber wrote:
> [...]
New mirrorlist server implementation
Fedora's complete MirrorManager setup is still running on Python2. The code was ported to Python3 probably over two years ago, but we have not switched yet. One of the reasons is that the backend is running on RHEL7, which means we are not in a hurry to deploy the Python3 version.

The mirrorlist server, which answers the actual dnf/yum queries for a mirrorlist/metalink, is, however, running in a Fedora 29 container. This container also still uses Python2, and it actually cannot use the Python3 version.

One of MirrorManager's design points is that the mirrorlist servers, which answer around 27 000 000 requests per day, do not access the database directly. The backend creates a snapshot of the relevant data (113MB) and the mirrorlist servers use this snapshot to answer client requests.

This data exchange is based on Python's pickle format, and a snapshot generated using Python2 does not seem to work with Python3.

Having used protobuf before, I added code to also export the data for the mirrorlist servers based on protobuf.

The good news with protobuf is that the resulting file is only 66MB instead of 113MB. The bad news is that loading it from Python requires 3.5 times the amount of memory during runtime (3.5GB instead of 1GB).

In addition to the data exchange problems between the backend and the mirrorlist servers, the architecture of the mirrorlist server does not really make sense today. 12 years ago it made a lot of sense, as it could be easily integrated into httpd and it could be easily reloaded without stopping the service. Today the mirrorlist server and httpd are both part of a container which is then behind haproxy. So there is a lot of infrastructure in the container which is not really useful.

To get rid of the pickle format and to have a simpler architecture, I reimplemented the mirrorlist-server in Rust. This was brought up some time ago on a ticket, and with the protobuf problems I was seeing in Python it made sense to try it out.
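The mail does not say why the Python2-generated pickle fails under Python3. One well-known cause of this kind of failure, which may or may not be the one hit here, is that Python3 decodes Python2 `str` objects as ASCII by default. The sketch below uses a small hand-written pickle (illustrative data, not a MirrorManager snapshot) and two hypothetical helper functions to show the failure and the usual workaround with the `encoding` argument of `pickle.loads`:

```python
import pickle

# Hand-written protocol-2 pickle of the byte string 'caf\xe9', laid out the
# way Python 2 serialises a `str` (SHORT_BINSTRING opcode "U").
# Illustrative data only, not taken from a MirrorManager snapshot.
PY2_PICKLE = b"\x80\x02U\x04caf\xe9q\x00."


def default_load_fails(data):
    """Return True if Python 3's default (ASCII) decoding rejects the data."""
    try:
        pickle.loads(data)
    except UnicodeDecodeError:
        return True
    return False


def load_py2_pickle(data, encoding="latin1"):
    """Load a Python 2 pickle under Python 3.

    encoding="latin1" maps every byte to a character, so decoding cannot
    fail; encoding="bytes" keeps old `str` objects as `bytes` instead.
    """
    return pickle.loads(data, encoding=encoding)


if __name__ == "__main__":
    print(default_load_fails(PY2_PICKLE))                  # True
    print(load_py2_pickle(PY2_PICKLE))                     # café
    print(load_py2_pickle(PY2_PICKLE, encoding="bytes"))   # b'caf\xe9'
```

Even where this workaround applies, every consumer has to agree on the encoding, which is one argument for a language-neutral format such as protobuf.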
My code can currently be found at https://github.com/adrianreber/mirrorlist-server and so far the results from the new mirrorlist server are the same as from the Python-based mirrorlist server.

It requires less than 700MB instead of the 1GB in Python with production-based data and seems really fast.

I have set up a test instance with the mirror data from Sunday at:

  https://lisas.de/metalink?repo=updates-testing-f31&arch=x86_64
  https://lisas.de/mirrorlist?repo=updates-testing-f31&arch=x86_64

The instance is based on the container I pushed to quay.io:

  $ podman run quay.io/adrianreber/mirrorlist-server:latest -h

With this change the mirrorlist server would also finally switch to geoip2. The currently running mirrorlist server still uses the legacy geoip database.

After the Fedora 31 freeze I would like to introduce this new mirrorlist server implementation on the proxies. I already verified that I can run this mirrorlist container rootless. This new container can be a drop-in replacement for the current container and no infrastructure around it needs to be changed.

The main changes to get it into production are to change mirrorlist1.service and mirrorlist2.service to include a line "User=mirrormanager" and to replace the current container name with the new container.

Adrian
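For anyone comparing the old and new servers, the two test URLs in the mail differ only in the endpoint name and share the `repo` and `arch` query parameters. A minimal sketch of building such URLs follows; the helper name `build_query_url` is hypothetical, the host is the test instance named in the mail and may no longer be reachable, and the code only constructs the URL without sending a request:

```python
from urllib.parse import urlencode

# Host of the test instance mentioned in the mail; it may be gone by now.
TEST_HOST = "https://lisas.de"


def build_query_url(endpoint, repo, arch, host=TEST_HOST):
    """Build a mirrorlist or metalink query URL.

    `endpoint` is "mirrorlist" or "metalink"; `repo` and `arch` are the
    query parameters shown in the mail.
    """
    if endpoint not in ("mirrorlist", "metalink"):
        raise ValueError("endpoint must be 'mirrorlist' or 'metalink'")
    query = urlencode({"repo": repo, "arch": arch})
    return f"{host}/{endpoint}?{query}"


if __name__ == "__main__":
    for endpoint in ("metalink", "mirrorlist"):
        print(build_query_url(endpoint, "updates-testing-f31", "x86_64"))
```

Pointing `host` at the Python-based server and at the new one, then fetching the same query from both, gives a simple way to check that the results match.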