On Mon, 29 Jun 2020, Baptiste BEAUPLAT wrote:
> > Indeed, creating a dedicated service for this does not seem a good idea.
>
> I would love to have this feature integrated directly with
> distro-tracker. However, I'm wondering about the load that would cause
> for the service.

Network requests do not generate much "load"; such processes spend the
bulk of their time waiting on the network.

> The duck worker has to process around 460000 urls (only counting
> Homepage) in less than 24h.

How do you get to that figure? We don't have that many source packages,
and even if you count multiple URLs for each source package due to
changes over time (in multiple releases), that makes way too many URLs
per source package.

> I'm not sure that can be done properly using
> the distro-tracker tasks (parallel workers are needed to work around
> timeout). Obviously that can be optimized (different check delay for
> different results) but that's still bulk network related tasks.

Nothing forbids parallel workers and in any case, I welcome any
improvement to the task mechanism to make that kind of parallelism
easier to handle. There are other tasks that could benefit from this
(and in general I want to merge more of such features in distro-tracker
to make them available to derivatives too).

Cheers,
-- 
⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog <hert...@debian.org>
⣾⠁⢠⠒⠀⣿⡁
⢿⡄⠘⠷⠚⠋ The Debian Handbook: https://debian-handbook.info/get/
⠈⠳⣄⠀⠀⠀⠀ Debian Long Term Support: https://deb.li/LTS
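
To illustrate the point about network-bound checks and parallel workers, here is a minimal sketch, not code from distro-tracker or duck. It assumes a hypothetical `check_url` helper that simulates network latency with a sleep instead of making a real request; `check_all`, `required_rate` and the worker count are likewise illustrative names and values. The 460000 URLs and 24h window are the figures quoted in the thread.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Figures quoted in the thread: about 460000 URLs to check within 24 hours.
URL_COUNT = 460_000
WINDOW_SECONDS = 24 * 60 * 60


def required_rate(url_count, window_seconds):
    """Average number of checks per second needed to finish within the window."""
    return url_count / window_seconds


def check_url(url, latency=0.05):
    """Hypothetical stand-in for an HTTP check.

    It sleeps instead of touching the network, which mimics a worker that
    spends nearly all of its time waiting rather than using the CPU.
    """
    time.sleep(latency)
    return url, True


def check_all(urls, max_workers=10):
    """Run the checks on a thread pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(check_url, urls))


if __name__ == "__main__":
    rate = required_rate(URL_COUNT, WINDOW_SECONDS)
    print(f"average rate needed: {rate:.2f} checks per second")

    urls = [f"https://example.org/{i}" for i in range(20)]
    start = time.monotonic()
    results = check_all(urls)
    elapsed = time.monotonic() - start
    print(f"{len(results)} simulated checks in {elapsed:.2f}s")
```

Under these assumptions the quoted figure works out to roughly 5.3 checks per second on average, and because each simulated check only waits, ten threads finish twenty checks in far less time than running them one after another. Real checks would also need timeouts and error handling, which this sketch leaves out.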