Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)

2018-05-17 Thread Kirill Kolyshkin
> Filed https://bugzilla.gnome.org/show_bug.cgi?id=796227

Actually, that suggestion is bad: the ulimit might have been changed during the
runtime of the process (e.g. lowered after some descriptors were already opened),
so using rlim_max is the more correct and safe choice. For the same reason I have
added the second commit,
https://github.com/rpm-software-management/rpm/pull/444/commits/aa6cc049cc43cc0ae7d7c2d40fe0c9f7d8dcb6f8,
to this PR.
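
To illustrate the point, here is a minimal sketch (the helper name is mine, not
part of the PR) of picking the upper bound for a brute-force fallback from the
hard limit rather than the soft one:

```c
#include <sys/resource.h>
#include <unistd.h>

/* Illustrative only: choose an upper bound for a "set CLOEXEC on all fds"
 * loop.  rlim_cur (the soft limit) can be lowered at runtime after
 * descriptors above it were already opened, so the hard limit rlim_max is
 * the safer bound for a brute-force walk. */
static long max_fd_upper_bound(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) == 0 && rl.rlim_max != RLIM_INFINITY)
        return (long)rl.rlim_max;

    /* fall back to sysconf() if the hard limit is unknown or unlimited */
    return sysconf(_SC_OPEN_MAX);
}
```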



Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)

2018-05-17 Thread Kirill Kolyshkin
> One other thing is, they call getrlimit(RLIMIT_NOFILE, &rl) and use
> rl.rlim_max, which seems to be a mistake -- rlim_cur should be used.

Filed https://bugzilla.gnome.org/show_bug.cgi?id=796227



Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)

2018-05-17 Thread Kirill Kolyshkin
@cgwalters I took a look at the glib implementation: they use `fdwalk()` if it
is available (a function unique to SunOS/Solaris AFAICS) and reimplement it on
top of /proc/self/fd otherwise (which is the case on Linux). This seems like an
unnecessary complication (unless we care much about Solaris).

One other thing: they call `getrlimit(RLIMIT_NOFILE, &rl)` and use
`rl.rlim_max`, which seems to be a mistake -- `rlim_cur` should be used. But
since that code path is only taken when /proc is not available, it's probably
nothing.
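
For reference, the reimplementation pattern looks roughly like this (a sketch
only; the function name and error handling are mine, not glib's actual
internals):

```c
#include <dirent.h>
#include <stdlib.h>

/* Linux stand-in for Solaris fdwalk(): invoke cb for every open descriptor
 * listed in /proc/self/fd.  Returns -1 if /proc is unavailable, so the
 * caller can fall back to a plain 0..open_max loop. */
static int fdwalk_linux(int (*cb)(void *data, int fd), void *data)
{
    DIR *dir = opendir("/proc/self/fd");
    struct dirent *entry;
    int res = 0;

    if (dir == NULL)
        return -1;

    while (res == 0 && (entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;                 /* skip "." and ".." */
        int fd = atoi(entry->d_name);
        if (fd != dirfd(dir))         /* skip the directory's own fd */
            res = cb(data, fd);
    }
    closedir(dir);
    return res;
}
```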



Re: [Rpm-maint] [rpm-software-management/rpm] Optimize and unite setting CLOEXEC on fds (#444)

2018-05-17 Thread Kirill Kolyshkin
OK, the patch has been updated to take all the review comments into account
(thanks @Saur2000 for your suggestions), so it now also fixes `doScriptExec()`
in `lib/rpmscript.c`.



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
For reference, here's my quick-n-dirty benchmarking code:
```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <dirent.h>

static void set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    if (flags == -1 || (flags & FD_CLOEXEC))
        return;

    fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

static void set_cloexec_all(int open_max, char method)
{
    const int min_fd = 3; // don't touch stdin/out/err
    int fd;

    if (method == 'i')
        goto fallback;

    // iterate over fds obtained from /proc
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL) {
        goto fallback;
    }

    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        fd = atoi(entry->d_name);
        if (fd >= min_fd)
            set_cloexec(fd);
    }
    closedir(dir);

    return;

fallback:
    // iterate over all possible fds
    for (fd = min_fd; fd < open_max; fd++)
        set_cloexec(fd);
}

int main(int argc, char **argv) {
    if (argc < 4) {
        fprintf(stderr, "Usage: %s <open_max> <iterations> <method>\n",
                argv[0]);
        fprintf(stderr, "<method> is either i (iterate) or p (use proc)\n");
        return 1;
    }

    int open_max = atoi(argv[1]);
    int iter = atoi(argv[2]);
    char m = argv[3][0];

    while (iter--)
        set_cloexec_all(open_max, m);

    return 0;
}
```




Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
```
kir@kd:~/git/rpm$ time ./a 64 100 p

real    0m3.413s
user    0m0.416s
sys     0m2.994s
kir@kd:~/git/rpm$ time ./a 64 100 i

real    0m3.661s
user    0m1.421s
sys     0m2.233s
```



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
I did some more benchmarking comparing the new /proc approach with the old one.
It seems the old one is only faster when `open_max < 60`, which should almost
never be the case. Based on that and the suggestion from @Saur2000, I'm dropping
the conditional switch to the old method completely.
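
In other words, the function ends up shaped roughly like this (a sketch rather
than the exact patch; it assumes the `set_cloexec()` helper discussed in the
review comments below plus the usual `<dirent.h>`, `<stdlib.h>` and `<unistd.h>`
includes):

```c
static void set_cloexec_all(void)
{
    const int min_fd = 3; /* don't touch stdin/out/err */
    int fd, open_max;
    DIR *dir = opendir("/proc/self/fd");

    if (dir != NULL) {
        /* common case: walk only the fds that are actually open */
        struct dirent *entry;
        while ((entry = readdir(dir)) != NULL) {
            fd = atoi(entry->d_name);
            if (fd >= min_fd)
                set_cloexec(fd);
        }
        closedir(dir);
        return;
    }

    /* fallback when /proc is not available: iterate over all possible fds */
    open_max = sysconf(_SC_OPEN_MAX);
    if (open_max == -1)
        open_max = 1024;
    for (fd = min_fd; fd < open_max; fd++)
        set_cloexec(fd);
}
```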



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
kolyshkin commented on this pull request.



> + fcntl(fd, F_SETFD, FD_CLOEXEC);
+}
+
+static void set_cloexec_all(void)
+{
+   const int min_fd = 3; // don't touch stdin/out/err
+   const int default_open_max = 1024;
+   int fd, open_max;
+
+   open_max = sysconf(_SC_OPEN_MAX);
+   if (open_max == -1) {
+   open_max = default_open_max;
+   goto fallback;
+   }
+
+   if (open_max <= default_open_max) {

I wanted to be conservative for the sole reason of minimizing the change to the
current behavior. What you say makes sense though; let me update the patch.



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
kolyshkin commented on this pull request.



> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L) /** mkfifo(path) */
 }
 
 
+static void set_cloexec(int fd)
+{
+   int flag = fcntl(fd, F_GETFD);
+
+   if (flag == -1 || (flag & FD_CLOEXEC))
+   return;
+
+   fcntl(fd, F_SETFD, FD_CLOEXEC);

This I just copied from the old code (and the only flag defined for now is
FD_CLOEXEC).

Anyway, will do.
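
For concreteness, with both review suggestions applied the helper would look
roughly like this (a sketch, assuming `<fcntl.h>` is included; not necessarily
the exact final patch):

```c
static void set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);

    /* skip fds we cannot query and fds that already have the flag set */
    if (flags == -1 || (flags & FD_CLOEXEC))
        return;

    /* OR in FD_CLOEXEC so that any other descriptor flags survive */
    fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```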



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
kolyshkin commented on this pull request.



> + struct dirent *entry;
+   while ((entry = readdir(dir)) != NULL) {
+   fd = atoi(entry->d_name);
+   if (fd >= min_fd)
+   set_cloexec(fd);
+   }
+   closedir(dir);
+
+   return;
+
+fallback:
+   // iterate over all possible fds
+   for (fd = min_fd; fd < open_max; fd++)
+   set_cloexec(fd);
+}
+
 static int Pexec(lua_State *L) /** exec(path,[args]) */

Yes, ideally those two places need to be unified.



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
The CI failure on rawhide is unrelated; it's a dnf error complaining about repo
metadata:

> raise ValueError("The supplied metadata version isn't supported")




Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Peter Kjellerstedt
Saur2000 requested changes on this pull request.



> + struct dirent *entry;
+   while ((entry = readdir(dir)) != NULL) {
+   fd = atoi(entry->d_name);
+   if (fd >= min_fd)
+   set_cloexec(fd);
+   }
+   closedir(dir);
+
+   return;
+
+fallback:
+   // iterate over all possible fds
+   for (fd = min_fd; fd < open_max; fd++)
+   set_cloexec(fd);
+}
+
 static int Pexec(lua_State *L) /** exec(path,[args]) */

You need to do the same change for doScriptExec() in lib/rpmscript.c as well. 
At least in our case, that is where all the time is spent.

> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L) /** mkfifo(path) */
 }
 
 
+static void set_cloexec(int fd)
+{
+   int flag = fcntl(fd, F_GETFD);

Change "flag" to "flags". Even if FD_CLOEXEC is currently the only defined 
flag, we may as well make the code forward compatible,

> @@ -330,26 +330,64 @@ static int Pmkfifo(lua_State *L) /** mkfifo(path) */
 }
 
 
+static void set_cloexec(int fd)
+{
+   int flag = fcntl(fd, F_GETFD);
+
+   if (flag == -1 || (flag & FD_CLOEXEC))
+   return;
+
+   fcntl(fd, F_SETFD, FD_CLOEXEC);

Change "FD_CLOEXEC" to "flags | FD_CLOEXEC".


> + if (flag == -1 || (flag & FD_CLOEXEC))
+   return;
+
+   fcntl(fd, F_SETFD, FD_CLOEXEC);
+}
+
+static void set_cloexec_all(void)
+{
+   const int min_fd = 3; // don't touch stdin/out/err
+   const int default_open_max = 1024;
+   int fd, open_max;
+
+   open_max = sysconf(_SC_OPEN_MAX);
+   if (open_max == -1) {
+   open_max = default_open_max;
+   goto fallback;

I see no reason to `goto fallback` here.

> + fcntl(fd, F_SETFD, FD_CLOEXEC);
+}
+
+static void set_cloexec_all(void)
+{
+   const int min_fd = 3; // don't touch stdin/out/err
+   const int default_open_max = 1024;
+   int fd, open_max;
+
+   open_max = sysconf(_SC_OPEN_MAX);
+   if (open_max == -1) {
+   open_max = default_open_max;
+   goto fallback;
+   }
+
+   if (open_max <= default_open_max) {

I doubt you need to be this conservative. I don't have any measurements to 
support it, but I'd be very surprised if it is not a win to remove this if 
statement and always use the code that only traverses the open file descriptors.



Re: [Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Colin Walters
At a quick glance anyway, your code looks fine to me.



[Rpm-maint] [rpm-software-management/rpm] luaext/Pexec: optimize setting CLOEXEC (#444)

2018-05-17 Thread Kirill Kolyshkin
When the maximum-open-files limit is set very high, Pexec() spends way too much
time trying to set FD_CLOEXEC on all those file descriptors, resulting in a
severe increase in the time it takes to execute rpm or dnf. For example, this
happens when running under Docker because:

> $ docker run fedora ulimit -n
> 1048576

One obvious fix: when open_max is too big, use procfs to get the actual list of
open file descriptors and iterate over those only.

My quick-n-dirty benchmark shows the /proc approach is about 10x faster than 
iterating through a list of 1024 fds.

Note that this commit is rather conservative, meaning that the old code is used:
 - if open_max <= 1024;
 - if /proc is not available.

This should fix:
- https://github.com/moby/moby/issues/23137
- https://bugzilla.redhat.com/show_bug.cgi?id=1537564
You can view, comment on, or merge this pull request online at:

  https://github.com/rpm-software-management/rpm/pull/444

-- Commit Summary --

  * luaext/Pexec: optimize setting CLOEXEC

-- File Changes --

M luaext/lposix.c (60)

-- Patch Links --

https://github.com/rpm-software-management/rpm/pull/444.patch
https://github.com/rpm-software-management/rpm/pull/444.diff



[Rpm-maint] Proposal for rpmbuild patch

2018-05-17 Thread Francesco
 Hi all,
[I posted a similar proposal under the thread "How to generate GNU make
dependency file for a SPEC file automatically?" but I'm reposting here with
a better subject]

Proposal: add a new option to rpmbuild, something like
--generate-md5sum=<output file>, that writes the md5sum (or whatever other
hash: sha1sum, etc.) of every file placed inside the RPM package. Of course,
the hash would be the one associated with the file prior to cpio compression,
i.e. after all other rpm processing (e.g. stripping of debug info), just before
it gets compressed.

Rationale: with some simple post-processing, such an option would make it
possible, in a way that is transparent to the build system employed by the user
(automake, cmake, etc.), to determine whether a .rpm needs to be regenerated
(because some of its packaged files have changed) or not.
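
To make the post-processing idea concrete, here is a hypothetical sketch; the
manifest file names and the "one hash per line" format are assumptions of mine,
not part of the proposal:

```c
#include <stdio.h>
#include <string.h>

/* Compare the hash manifest from the previous build with the one the
 * proposed rpmbuild option just wrote; any difference (or a missing
 * manifest) means the .rpm should be regenerated. */
static int manifests_differ(const char *old_path, const char *new_path)
{
    FILE *a = fopen(old_path, "r");
    FILE *b = fopen(new_path, "r");
    int differ = 1;            /* missing manifest counts as "rebuild" */

    if (a && b) {
        char la[4096], lb[4096];
        differ = 0;
        for (;;) {
            char *ra = fgets(la, sizeof(la), a);
            char *rb = fgets(lb, sizeof(lb), b);
            if (!ra || !rb) {
                differ = (ra != rb);   /* one manifest is longer */
                break;
            }
            if (strcmp(la, lb) != 0) {
                differ = 1;
                break;
            }
        }
    }
    if (a) fclose(a);
    if (b) fclose(b);
    return differ;
}

int main(void)
{
    if (manifests_differ("previous.md5sum", "current.md5sum"))
        puts("packaged files changed: regenerate the .rpm");
    else
        puts("packaged files unchanged: skip regeneration");
    return 0;
}
```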

Would RPM maintainers accept such a patch?

Thanks for comments!
Francesco