Re: rpmbuild is very slow with large files
On 13.07.22 at 16:30, John Reiser wrote:
> On 7/11/22 Marius Schwarz wrote:
>> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>>
>> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.
>>
>> Multicore use would also be helpful, i.e. while packing the files.
>>
>> Any counter-arguments?
>
> If you give the complete package name and URL of the repo, then more people may be likely to help investigate. Specifying a reproducible example is always good.

All issues solved so far.

Just to give you (all) an impression, here are the source and the result in my test repo:

3,2G    /usr/share/pva/vosk-model-de-0.21

[vosk-model-de-0.21]# du -sh *
100M    am
12K     conf
685M    graph
8,2M    ivector
4,0K    README
2,1G    rescore
281M    rnnlm

[rescore]# ll
total 2171812
-rw-r--r-- 1 root root 2115929988 14. Sep 2021 G.carpa  <-
-rw-r--r-- 1 root root  107992138 14. Sep 2021 G.fst

So compressing this 2+ GB file (and others) was slowing down the process, because compression uses only one core by default.

Building this now takes just ~4-5 minutes on 8 cores, on a system doing other things in parallel, resulting in a 1.7 GB rpm:

-rw-r--r-- 1 root root 1758210157 14. Jul 09:44 /home/linux-am-dienstagde/repo/x86_64/fedora/35/pva-vosk-model-de-large-1-2.x86_64.rpm

Luckily, most vosk language models do not change frequently and are not that big, but some are. If this ever makes it into the Fedora repo, it will take a lot of space and tie up resources on builds ;)

@BCotton: No idea if you remember, but when I said it would waste 100 GB plus updates: in the last year there were only a few updates to the language models, which reduces the expected space needed over time a lot.

Best regards, Marius ___ devel mailing list -- devel@lists.fedoraproject.org To unsubscribe send an email to devel-le...@lists.fedoraproject.org Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/ List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org Do not reply to spam on the list, report it: https://pagure.io/fedora-infrastructure
Re: rpmbuild is very slow with large files
On 7/11/22 Marius Schwarz wrote:
> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>
> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.
>
> Multicore use would also be helpful, i.e. while packing the files.
>
> Any counter-arguments?

If you give the complete package name and URL of the repo, then more people may be likely to help investigate. Specifying a reproducible example is always good.

If you know "strace -p $PID" then please learn "perf record -p $PID".

If the size of the package is in gigabytes, then upstream bears some responsibility for investigating and documenting the use of data compression with the package. What does upstream say?

In the few samples of "read(" from the output of strace, I see text similar to JSON or XML tags. A large dataset that contains zillions of repetitions of only a few dozen tags creates O(n**2) work for deflation. Finding many matches of any particular tag is quick, but which match can be extended the most, considering the exact context of prefixes and suffixes? A "looser" compression such as "gzip -3" or lzo might be much faster with only slightly larger output. A software implementation of a hardware technique such as WK, or even "ancient" modem compression MNP5 or MNP10, might also be a good choice.
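To get a rough feel for the speed/size trade-off between a "looser" level and a tighter one, one could time zlib (the deflate library behind gzip) at a few levels on tag-heavy text. This is only a sketch: the synthetic sample and the names `make_sample` and `compare_levels` are invented for illustration, and real model files may behave quite differently.

```python
import time
import zlib


def make_sample(repeats: int = 20000) -> bytes:
    """Build synthetic text with a few tags repeated many times (illustrative only)."""
    tags = [b"_I", b"_E", b"_B", b"_S"]
    parts = []
    for i in range(repeats):
        parts.append(b"w%d %s %s\n" % (i % 997, tags[i % 4], tags[(i * 7) % 4]))
    return b"".join(parts)


def compare_levels(data: bytes, levels=(1, 3, 6, 9)) -> dict:
    """Return {level: (compressed_size, seconds)} for each zlib level."""
    results = {}
    for level in levels:
        start = time.perf_counter()
        packed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Sanity check: every level must round-trip to the same input.
        assert zlib.decompress(packed) == data
        results[level] = (len(packed), elapsed)
    return results


if __name__ == "__main__":
    sample = make_sample()
    for level, (size, secs) in compare_levels(sample).items():
        print(f"level {level}: {size} bytes in {secs:.4f}s")
```

Running the same comparison on a slice of the real payload, rather than on synthetic data, would be the more meaningful measurement.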
Re: rpmbuild is very slow with large files
On Mon, 11 Jul 2022 at 18:52, Marius Schwarz wrote:
> Hi,
>
> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.

At that size, is RPM actually a good fit for the data inside it? Building it is going to be slow, and so are installing and upgrading it. Downloads from mirrors are going to be problematic, going by the complaints we get from users about various large rpms taking too long to download, timing out, or breaking something else.

I realize RPM is the cardboard box we are comfortable using for a lot of things, but this seems like trying to use it as a shipping container across the ocean.

--
Stephen Smoogen, Red Hat Automotive
Let us be kind to one another, for most of us are fighting a hard battle. -- Ian MacClaren
Re: rpmbuild is very slow with large files
The only thing in common is "large files"; the rest is different, but it may be worth checking if you are talking about rawhide:
https://pagure.io/copr/copr/issue/2241

On Tue, 2022-07-12 at 00:52 +0200, Marius Schwarz wrote:
> Hi,
>
> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>
> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.
>
> Multicore use would also be helpful, i.e. while packing the files.
>
> Any counter-arguments?
>
> strace example:
>
> [...]
>
> And I don't think this task needs to read the clock that often either.

--
Sérgio M. B.
Re: rpmbuild is very slow with large files
* Miroslav Lichvar:

> On Tue, Jul 12, 2022 at 09:26:14AM +0200, Florian Weimer wrote:
>> * Marius Schwarz:
>> > And I don't think this task needs to read the clock that often either.
>>
>> strace shouldn't see a system call here because clock_gettime should be handled in the vDSO. This suggests something is wrong with the system (unless it's some obscure variant that really doesn't have vDSO support).
>
> It doesn't necessarily have to be something wrong with the system. The vDSO clock_gettime() works only with specific clocksources, typically TSC on x86_64. On some older HW it's not reliable enough to be selected by the kernel, or it could be a VM which doesn't have one that would work with migrations, etc.

True, but I suspect a Xeon E5-2620 v4 is recent enough to have a stable TSC (I see nonstop_tsc and constant_tsc in /proc/cpuinfo for some random lab system, for a start). So I suspect that virtualization is masking it, or otherwise interfering with the TSC.

Thanks,
Florian
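The /proc/cpuinfo check mentioned above could be scripted roughly as follows. This is a minimal sketch; the helper names `tsc_flags_from_text` and `tsc_flags` are invented for illustration, and the flags alone do not show which clocksource the kernel actually selected, especially inside a VM.

```python
from pathlib import Path

# CPU flags named in the message above.
TSC_FLAGS = ("constant_tsc", "nonstop_tsc")


def tsc_flags_from_text(cpuinfo_text: str) -> dict:
    """Report which TSC-related flags appear in the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags") and ":" in line:
            present = set(line.split(":", 1)[1].split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line at all (for example on a non-x86 system).
    return {flag: False for flag in TSC_FLAGS}


def tsc_flags(path: str = "/proc/cpuinfo") -> dict:
    """Read the given cpuinfo file; report all flags as absent if unreadable."""
    try:
        text = Path(path).read_text()
    except OSError:
        return {flag: False for flag in TSC_FLAGS}
    return tsc_flags_from_text(text)


if __name__ == "__main__":
    for flag, found in tsc_flags().items():
        print(f"{flag}: {'present' if found else 'missing'}")
```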
Re: rpmbuild is very slow with large files
On 7/12/22 11:02, Marius Schwarz wrote:
> On 12.07.22 at 10:55, Marius Schwarz wrote:
>>
>> The rpmbuild process for this one rpm was single-threaded. With an lsof loop, I could see bytes getting appended to the resulting file at an awfully slow rate, which is very frustrating to see on an 8-core system.
>>
>> The thing is, I do test builds of VOSK with language model, code etc. on one of my servers. If this project ever reaches the Fedora build farm, we can expect a very long build time if nothing is changed in rpmbuild. Is there maybe a hidden parallel compression option somewhere?
>>
>
> Looks like someone else had the same problem with rpmbuild's single-core XZ compression:
>
> https://insujang.github.io/2020-11-07/accelerating-ceph-rpm-packaging-using-multithreaded-compression/
>
> --define "_binary_payload w2T16.xzdio"
>
> Question: Is the resulting compression format suitable for the Fedora repo, or is it against a policy?

Yeah, we did quite some work upstream to get builds to run in parallel at various stages and levels. RPM does support multithreaded compression where the compression libraries support it, but it needs to be enabled. As the results are not the same as with single-threaded compression, this may have an impact on the viability of deltarpms. But IIRC at least zstd, while producing different results, would at least produce reproducible ones. But I am not the one who actually checked and decided against threaded compression.

Florian
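Why threaded output can differ from single-threaded output, yet still be reproducible, can be illustrated with a toy model: split the input into fixed-size chunks, compress each chunk independently in a thread pool, and concatenate the results. This is only a sketch using zlib; it is not a description of how rpm, xz or zstd implement threading, and the chunk size and function names are assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1 << 20  # 1 MiB per independently compressed chunk (arbitrary choice)


def compress_single(data: bytes, level: int = 6) -> bytes:
    """One zlib stream over the whole input."""
    return zlib.compress(data, level)


def compress_chunked(data: bytes, workers: int = 4, level: int = 6) -> bytes:
    """Compress fixed-size chunks in parallel and concatenate the streams."""
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so the output does not depend on
        # which worker happens to finish first.
        packed = list(pool.map(lambda c: zlib.compress(c, level), chunks))
    return b"".join(packed)


def decompress_concatenated(blob: bytes) -> bytes:
    """Decompress a sequence of back-to-back zlib streams."""
    out = []
    while blob:
        d = zlib.decompressobj()
        out.append(d.decompress(blob))
        out.append(d.flush())
        blob = d.unused_data  # bytes following the end of this stream
    return b"".join(out)


if __name__ == "__main__":
    data = b"austun austun s_I t_E\n" * 150000
    single = compress_single(data)
    chunked = compress_chunked(data)
    print("same bytes as single stream:", chunked == single)
    print("round trip ok:", decompress_concatenated(chunked) == data)
```

In this model the chunk boundaries depend only on the chunk size, not on the number of workers, which is why the output stays stable across runs while still differing from the single-stream result.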
Re: rpmbuild is very slow with large files
On 12.07.22 at 10:55, Marius Schwarz wrote:
>
> The rpmbuild process for this one rpm was single-threaded. With an lsof loop, I could see bytes getting appended to the resulting file at an awfully slow rate, which is very frustrating to see on an 8-core system.
>
> The thing is, I do test builds of VOSK with language model, code etc. on one of my servers. If this project ever reaches the Fedora build farm, we can expect a very long build time if nothing is changed in rpmbuild. Is there maybe a hidden parallel compression option somewhere?

Looks like someone else had the same problem with rpmbuild's single-core XZ compression:

https://insujang.github.io/2020-11-07/accelerating-ceph-rpm-packaging-using-multithreaded-compression/

--define "_binary_payload w2T16.xzdio"

Question: Is the resulting compression format suitable for the Fedora repo, or is it against a policy?

Best regards,
Marius
Re: rpmbuild is very slow with large files
On 12.07.22 at 09:47, Kamil Dudka wrote:
> On Tuesday, July 12, 2022 12:52:13 AM CEST Marius Schwarz wrote:
>> Multicore use would also be helpful, i.e. while packing the files.
>
> What do you mean by packing? Creation of the resulting RPM packages? I believe this phase already runs in parallel in case multiple RPM packages are being created out of a single source RPM package.

"Packing" aka "compression".

The rpmbuild process for this one rpm was single-threaded. With an lsof loop, I could see bytes getting appended to the resulting file at an awfully slow rate, which is very frustrating to see on an 8-core system.

The thing is, I do test builds of VOSK with language model, code etc. on one of my servers. If this project ever reaches the Fedora build farm, we can expect a very long build time if nothing is changed in rpmbuild. Is there maybe a hidden parallel compression option somewhere?

Best regards,
Marius
Re: rpmbuild is very slow with large files
On 12.07.22 at 10:30, Miroslav Lichvar wrote:
> be selected by the kernel, or it could be a VM which doesn't have one that would work with migrations, etc.

I think you're right, it's a VM running on Xen.

Best regards,
Marius
Re: rpmbuild is very slow with large files
On 12.07.22 at 09:26, Florian Weimer wrote:
> * Marius Schwarz:
>> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>>
>> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.
>
> That's unlikely. 32K is not a small buffer size. It's more likely that time is spent during compression.

In this case, pigz, pbzip2 or another parallel compression mode would be helpful.

> strace shouldn't see a system call here because clock_gettime should be handled in the vDSO. This suggests something is wrong with the system (unless it's some obscure variant that really doesn't have vDSO support).
>
> Thanks,
> Florian

It's a "normal" (desktopless) F35 on a Xeon E5-2620 v4.

Best regards,
Marius Schwarz
Re: rpmbuild is very slow with large files
On Tue, Jul 12, 2022 at 09:26:14AM +0200, Florian Weimer wrote:
> * Marius Schwarz:
> > And I don't think this task needs to read the clock that often either.
>
> strace shouldn't see a system call here because clock_gettime should be handled in the vDSO. This suggests something is wrong with the system (unless it's some obscure variant that really doesn't have vDSO support).

It doesn't necessarily have to be something wrong with the system. The vDSO clock_gettime() works only with specific clocksources, typically TSC on x86_64. On some older HW it's not reliable enough to be selected by the kernel, or it could be a VM which doesn't have one that would work with migrations, etc.

--
Miroslav Lichvar
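Which clocksource the kernel selected can be read from sysfs on Linux. The following is a minimal sketch under that assumption; the function names are invented for illustration, and whether a given clocksource is served by the vDSO still depends on the kernel and the platform.

```python
from pathlib import Path
from typing import Optional

# Standard Linux sysfs location for the system clocksource.
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_sysfs_value(path: str) -> Optional[str]:
    """Return the stripped file content, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def clocksource_report(base: str = CLOCKSOURCE_DIR) -> dict:
    """Collect the current and the available clocksources."""
    current = read_sysfs_value(f"{base}/current_clocksource")
    available = read_sysfs_value(f"{base}/available_clocksource")
    return {
        "current": current,
        "available": available.split() if available else [],
    }


if __name__ == "__main__":
    report = clocksource_report()
    print("current:  ", report["current"])
    print("available:", ", ".join(report["available"]))
```

A value other than "tsc" in the "current" field on x86_64 would be consistent with the clock_gettime() calls showing up in strace.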
Re: rpmbuild is very slow with large files
On Tuesday, July 12, 2022 12:52:13 AM CEST Marius Schwarz wrote:
> Hi,
>
> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>
> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.
>
> Multicore use would also be helpful, i.e. while packing the files.

What do you mean by packing? Creation of the resulting RPM packages? I believe this phase already runs in parallel in case multiple RPM packages are being created out of a single source RPM package.

Kamil

> Any counter-arguments?
>
> strace example:
>
> [...]
>
> And I don't think this task needs to read the clock that often either.
Re: rpmbuild is very slow with large files
* Marius Schwarz:

> I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.
>
> IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.

That's unlikely. 32K is not a small buffer size. It's more likely that time is spent during compression.

> [pid 2604060] read(5, "B s_I ts_I u:_I g_I R_I E_I n_I "..., 32768) = 32768
> [pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480654716}) = 0
> [pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480763606}) = 0
>
> And I don't think this task needs to read the clock that often either.

strace shouldn't see a system call here because clock_gettime should be handled in the vDSO. This suggests something is wrong with the system (unless it's some obscure variant that really doesn't have vDSO support).

Thanks,
Florian
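One way to test the claim that the time goes into compression rather than into 32K reads is to time both on the same data. The sketch below is an assumption-laden illustration, not rpmbuild's actual code path: it reads a stream in 32768-byte blocks, once discarding the data and once feeding it to Python's xz compressor, and the function names and the preset are made up for the example.

```python
import io
import lzma
import time

BLOCK = 32768  # same read size as in the strace output


def time_read_only(stream) -> tuple:
    """Read the stream in BLOCK-sized pieces, discard them, return (bytes, seconds)."""
    total = 0
    start = time.perf_counter()
    while True:
        block = stream.read(BLOCK)
        if not block:
            break
        total += len(block)
    return total, time.perf_counter() - start


def time_read_and_compress(stream, preset: int = 2) -> tuple:
    """Same read loop, but xz-compress each block; return (bytes, packed, seconds)."""
    total = 0
    packed = 0
    compressor = lzma.LZMACompressor(preset=preset)
    start = time.perf_counter()
    while True:
        block = stream.read(BLOCK)
        if not block:
            break
        total += len(block)
        packed += len(compressor.compress(block))
    packed += len(compressor.flush())
    return total, packed, time.perf_counter() - start


if __name__ == "__main__":
    data = b"s_I v_I u:_I k_I s_I @_I s_E\n" * 200000
    n, t_read = time_read_only(io.BytesIO(data))
    n, packed, t_xz = time_read_and_compress(io.BytesIO(data))
    print(f"read only:     {n} bytes in {t_read:.4f}s")
    print(f"read + xz:     {n} -> {packed} bytes in {t_xz:.4f}s")
```

For a real measurement the in-memory stream would be replaced by a file opened in binary mode; note that the page cache can make repeated reads look faster than a cold read.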
rpmbuild is very slow with large files
Hi,

I have just created (well, not finished yet; it started 15 minutes ago) a ~2.5 GB rpm and found that rpmbuild is an extreme bottleneck.

IMHO, this is caused by a file-read function which reads files in 32k blocks, which is very slow and extremely IO intensive. The result is a task permanently running one core at 100%. With a change to larger chunks, we could speed up many build tasks on the farm.

Multicore use would also be helpful, i.e. while packing the files.

Any counter-arguments?

strace example:

[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=477601377}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=477685727}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=477892054}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=47876}) = 0
[pid 2604060] read(5, "_I @_I s_I t_E\nauss\303\244het auss\303\244h"..., 32768) = 32768
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478212651}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478301347}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478409015}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478505273}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478701366}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478784826}) = 0
[pid 2604060] read(5, " Y_I k_I t_E\naustun austun 'aU_B"..., 32768) = 32768
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=478962539}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479045029}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479130924}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479213446}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479407336}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479489832}) = 0
[pid 2604060] read(5, "s_I v_I u:_I k_I s_I @_I s_E\naus"..., 32768) = 32768
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479720335}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479803090}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=479950309}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480067186}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480305924}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480417985}) = 0
[pid 2604060] read(5, "B s_I ts_I u:_I g_I R_I E_I n_I "..., 32768) = 32768
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480654716}) = 0
[pid 2604060] clock_gettime(CLOCK_REALTIME, {tv_sec=1657579222, tv_nsec=480763606}) = 0

And I don't think this task needs to read the clock that often either.