Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)
> Filed https://bugzilla.gnome.org/show_bug.cgi?id=796227

Actually, that bug report is bad: ulimit might have been lowered during the runtime of a process, so using rlim_max is more correct and safe. For the same reason I have added a second commit, https://github.com/rpm-software-management/rpm/pull/444/commits/aa6cc049cc43cc0ae7d7c2d40fe0c9f7d8dcb6f8, to this PR.

--
You are receiving this because you are subscribed to this thread.
Reply to this email directly or view it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390060247
___
Rpm-maint mailing list
Rpm-maint@lists.rpm.org
http://lists.rpm.org/mailman/listinfo/rpm-maint
Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)
> One other thing is, they call getrlimit(RLIMIT_NOFILE, ) and use
> rl.rlim_max which seems to be a mistake -- rlim_cur should be used.

Filed https://bugzilla.gnome.org/show_bug.cgi?id=796227

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390055067
Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)
@cgwalters I took a look at the glib implementation: they make use of `fdwalk()` if available (a function unique to SunOS/Solaris AFAICS), and reimplement it using /proc/self/fd if not (which is the case on Linux). This seems to be an unnecessary complication (unless we care much about Solaris).

One other thing is, they call `getrlimit(RLIMIT_NOFILE, )` and use `rl.rlim_max`, which seems to be a mistake -- `rlim_cur` should be used. But since this code is only used when /proc is not available, it's probably nothing.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390049589
Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)
OK, patch updated to take all the review comments into account (thanks @Saur2000 for your suggestions), so it now also fixes `doScriptExec()` in `lib/rpmscript.c`.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390041464
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
For the reference, here's my quick-n-dirty benchmarking code (header names and the usage strings were eaten by the HTML archiver; restored here):

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <dirent.h>
#include <sys/types.h>

static void set_cloexec(int fd)
{
	int flags = fcntl(fd, F_GETFD);

	if (flags == -1 || (flags & FD_CLOEXEC))
		return;

	fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

static void set_cloexec_all(int open_max, char method)
{
	const int min_fd = 3; // don't touch stdin/out/err
	int fd;

	if (method == 'i')
		goto fallback;

	// iterate over fds obtained from /proc
	DIR *dir = opendir("/proc/self/fd");
	if (dir == NULL) {
		goto fallback;
	}

	struct dirent *entry;
	while ((entry = readdir(dir)) != NULL) {
		fd = atoi(entry->d_name);
		if (fd >= min_fd)
			set_cloexec(fd);
	}
	closedir(dir);

	return;

fallback:
	// iterate over all possible fds
	for (fd = min_fd; fd < open_max; fd++)
		set_cloexec(fd);
}

int main(int argc, char **argv)
{
	if (argc < 4) {
		fprintf(stderr, "Usage: %s <open_max> <iterations> <method>\n", argv[0]);
		fprintf(stderr, " <method> is either i (iterate) or p (use proc)\n");
		return 1;
	}

	int open_max = atoi(argv[1]);
	int iter = atoi(argv[2]);
	char m = argv[3][0];

	while (iter--)
		set_cloexec_all(open_max, m);

	return 0;
}
```

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390021766
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
```
kir@kd:~/git/rpm$ time ./a 64 100 p

real	0m3.413s
user	0m0.416s
sys	0m2.994s

kir@kd:~/git/rpm$ time ./a 64 100 i

real	0m3.661s
user	0m1.421s
sys	0m2.233s
```

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390021860
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
I did some more benchmarks comparing the new /proc way with the old one. It turns out that the old one is only faster when `open_max < 60`, which should almost never be the case. Based on that, and the suggestion from @Saur2000, I'm dropping the conditional switch to the old method completely.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390020844
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
kolyshkin commented on this pull request.

> +	fcntl(fd, F_SETFD, FD_CLOEXEC);
> +}
> +
> +static void set_cloexec_all(void)
> +{
> +	const int min_fd = 3; // don't touch stdin/out/err
> +	const int default_open_max = 1024;
> +	int fd, open_max;
> +
> +	open_max = sysconf(_SC_OPEN_MAX);
> +	if (open_max == -1) {
> +		open_max = default_open_max;
> +		goto fallback;
> +	}
> +
> +	if (open_max <= default_open_max) {

I wanted to be conservative for the sole reason of minimizing the change to the current behavior. What you say makes sense though; let me update the patch.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#discussion_r189106419
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
kolyshkin commented on this pull request.

> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L)	/** mkfifo(path) */
>  }
>
> +static void set_cloexec(int fd)
> +{
> +	int flag = fcntl(fd, F_GETFD);
> +
> +	if (flag == -1 || (flag & FD_CLOEXEC))
> +		return;
> +
> +	fcntl(fd, F_SETFD, FD_CLOEXEC);

This I just copied from the old code (and the only known flag for now is CLOEXEC). Anyway, will do.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#discussion_r189106021
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
kolyshkin commented on this pull request.

> +	struct dirent *entry;
> +	while ((entry = readdir(dir)) != NULL) {
> +		fd = atoi(entry->d_name);
> +		if (fd >= min_fd)
> +			set_cloexec(fd);
> +	}
> +	closedir(dir);
> +
> +	return;
> +
> +fallback:
> +	// iterate over all possible fds
> +	for (fd = min_fd; fd < open_max; fd++)
> +		set_cloexec(fd);
> +}
> +
>  static int Pexec(lua_State *L)		/** exec(path,[args]) */

Yes, ideally those two places need to be unified.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#discussion_r189105974
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
The CI failure on rawhide is unrelated; it's a dnf error complaining about repo metadata:

> raise ValueError("The supplied metadata version isn't supported")

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-390016012
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
Saur2000 requested changes on this pull request.

> +	struct dirent *entry;
> +	while ((entry = readdir(dir)) != NULL) {
> +		fd = atoi(entry->d_name);
> +		if (fd >= min_fd)
> +			set_cloexec(fd);
> +	}
> +	closedir(dir);
> +
> +	return;
> +
> +fallback:
> +	// iterate over all possible fds
> +	for (fd = min_fd; fd < open_max; fd++)
> +		set_cloexec(fd);
> +}
> +
>  static int Pexec(lua_State *L)		/** exec(path,[args]) */

You need to do the same change for doScriptExec() in lib/rpmscript.c as well. At least in our case, that is where all the time is spent.

> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L)	/** mkfifo(path) */
>  }
>
> +static void set_cloexec(int fd)
> +{
> +	int flag = fcntl(fd, F_GETFD);

Change "flag" to "flags". Even if FD_CLOEXEC is currently the only defined flag, we may as well make the code forward compatible.

> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L)	/** mkfifo(path) */
>  }
>
> +static void set_cloexec(int fd)
> +{
> +	int flag = fcntl(fd, F_GETFD);
> +
> +	if (flag == -1 || (flag & FD_CLOEXEC))
> +		return;
> +
> +	fcntl(fd, F_SETFD, FD_CLOEXEC);

Change "FD_CLOEXEC" to "flags | FD_CLOEXEC".

> +	if (flag == -1 || (flag & FD_CLOEXEC))
> +		return;
> +
> +	fcntl(fd, F_SETFD, FD_CLOEXEC);
> +}
> +
> +static void set_cloexec_all(void)
> +{
> +	const int min_fd = 3; // don't touch stdin/out/err
> +	const int default_open_max = 1024;
> +	int fd, open_max;
> +
> +	open_max = sysconf(_SC_OPEN_MAX);
> +	if (open_max == -1) {
> +		open_max = default_open_max;
> +		goto fallback;

I see no reason to goto fallback here.

> +	fcntl(fd, F_SETFD, FD_CLOEXEC);
> +}
> +
> +static void set_cloexec_all(void)
> +{
> +	const int min_fd = 3; // don't touch stdin/out/err
> +	const int default_open_max = 1024;
> +	int fd, open_max;
> +
> +	open_max = sysconf(_SC_OPEN_MAX);
> +	if (open_max == -1) {
> +		open_max = default_open_max;
> +		goto fallback;
> +	}
> +
> +	if (open_max <= default_open_max) {

I doubt you need to be this conservative. I don't have any measurements to support it, but I'd be very surprised if it is not a win to remove this if statement and always use the code that only traverses the open file descriptors.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#pullrequestreview-121202950
Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
At a quick glance anyways, your code looks fine to me.

View it on GitHub:
https://github.com/rpm-software-management/rpm/pull/444#issuecomment-389980176
[Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)
When the maximum-open-files limit is set too high, Pexec() spends way too much time trying to set FD_CLOEXEC on all those file descriptors, resulting in a severe increase in the time it takes to execute rpm or dnf. For example, this happens while running under Docker, because:

> $ docker run fedora ulimit -n
> 1048576

One obvious fix: when open_max is too big, use procfs to get the actual list of open file descriptors and iterate over those. My quick-n-dirty benchmark shows the /proc approach is about 10x faster than iterating through a list of 1024 fds.

Note that this commit is rather conservative, meaning that the old code is still used:
- if open_max <= 1024;
- if /proc is not available.

This should fix:
- https://github.com/moby/moby/issues/23137
- https://bugzilla.redhat.com/show_bug.cgi?id=1537564

You can view, comment on, or merge this pull request online at:
https://github.com/rpm-software-management/rpm/pull/444

-- Commit Summary --
* luaext/Pexec: optimize setting CLOEXEC

-- File Changes --
M luaext/lposix.c (60)

-- Patch Links --
https://github.com/rpm-software-management/rpm/pull/444.patch
https://github.com/rpm-software-management/rpm/pull/444.diff
[Rpm-maint] Proposal for rpmbuild patch
Hi all,

[I posted a similar proposal under the thread "How to generate GNU make dependency file for a SPEC file automatically?" but I'm reposting here with a better subject.]

Proposal: add a new option to rpmbuild, something like --generate-md5sum=, that writes the md5sum (or whatever other hash: sha1sum, etc.) of all files placed inside the RPM package. The hash would of course be computed on each file after all other rpm processing (e.g., stripping of debug info), just before cpio compression.

Rationale: with some simple post-processing, such a new option would make it possible, in a way transparent to the build system employed by the user (automake, cmake, etc.), to determine whether a .rpm needs to be regenerated because some of its packaged files have changed.

Would RPM maintainers accept such a patch?

Thanks for comments!
Francesco