Re: Workers for multicore processing (Question and RFC)
Hi Petr,

> I believe modern Linux systems have much higher than 1024 file
> descriptor per process limits.

Right. However, Joh-Tob was talking about the number of child processes. There is a limit in PicoLisp on how many child processes can be spawned with (fork), caused by the size of the 'fd_set' structure used for the select() system call: (fork) creates two pipes to the parent for each child process.

This limitation can be overcome with http://software-lab.de/doc/refD.html#detach for child processes that do not depend on interprocess communication.

♪♫ Alex

--
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: Workers for multicore processing (Question and RFC)
Am 28.02.2017 um 12:04 schrieb Petr Gladkikh:
> I believe modern Linux systems have much higher than 1024 file
> descriptor per process limits. E.g. on my system
>
>    $ uname -srv
>    Linux 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017
>    $ cat /proc/sys/fs/file-max
>    2013532
>
> And you can adjust this limit to your requirements.

Sorry to jump in here out of my closet, but I think this fact needs some clarification. My research came up with the following:

/proc/sys/fs/file-max is not the per-process limit, but rather the overall limit of the system. It can also be controlled by writing into /proc/sys/fs/file-max.

To get the per-process hard limit, use 'ulimit -Hn'; the soft limit is shown via 'ulimit -Sn'. As the ulimit manpage states, the hard limit acts as a ceiling for the soft limit, which can be adjusted.

So we have the per-process limitations, but as soon as we fork() and receive a new PID, we can have yet another set of fds, right?

cheers
rob
Re: Workers for multicore processing (Question and RFC)
I believe modern Linux systems have much higher than 1024 file descriptor per process limits. E.g. on my system

   $ uname -srv
   Linux 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017
   $ cat /proc/sys/fs/file-max
   2013532

And you can adjust this limit to your requirements.

2017-02-25 12:26 GMT+01:00 Joh-Tob Schäg:
> There are limits on how many processes can exist at the same time,
> having more 'later processes than cores is wasteful, and Linux only
> allows for 1024 file descriptors per process.
>
>    ? (for N 1 (later (cons) (* N N)))
>    !? (pipe (pr (prog (* N N
>    Pipe error: Too many open files
>    ? N
>    -> 339
>
> Is there some "worker" implementation which keeps a queue of tasks to
> do and creates N worker processes which take tasks from the queue when
> they are finished?
>
> Ideas:
> A task would be a copy of the environment + a function + the values to
> apply the function to. There would be some optimization potential in
> case the function and environment are static, or are reset each time to
> a constant value, but it should also be possible to fetch a copy of the
> current environment.
> Such a pool of workers is best represented by a symbol on which certain
> functions are applied, which may be stored inside the symbol. This
> could be achieved, with some overhead compared to a manual
> implementation, with the PicoLisp object system. Since the worker queue
> should be accessible fast, putting it in the VAL of the symbol would be
> the easiest way to achieve that.
>
> Further thoughts:
> It might be wise to give the possibility to register a callback on the
> return value. This would allow to make things more efficient.
> For example:
>
>    (prog1  # Parallel background calculation of square numbers
>       (mapcan '((N) (later (cons) (* N N))) (range 1 100))
>       (wait NIL (full @)) )
>
> could be:
>
>    (pool "squares.db")
>    (with-workerpool W '((workers . 2) (callback . '((R) (put *DB (car R) (cadr R)))))
>       (mapcar '((N) (add_task 'W '((N) (cons N (* N N))) N)) (range 1 100)) ]
>    # Normally the with-workerpool should not finish until the queue is empty
>
> However, it should also be possible to do something like this:
>
>    (with-workerpool W '((callback . NIL) (return . NIL) (output . "+file") (wait . NIL))
>       (for X 1000
>          (add_task 'W '((N) (ifn (= N (apply '* (prime-factors N))) (print N))) X ]
>    # Workers run in background, the main thread can continue
>
> This usage may seem overly complex at first, and the proposed interface
> is horrible. However, I think that such a library could be a valuable
> addition to PicoLisp.
>
> Let's discuss the idea itself, your proposals for the interface, and
> the implementation.

-- Petr Gladkikh
Re: Workers for multicore processing (Question and RFC)
> Is there some "worker" implementation which keeps a queue of tasks to
> do and creates N worker processes which take tasks from the queue when
> they are finished?

You have to implement everything by yourself. As a starting point:

https://bitbucket.org/mihailp/tankfeeder/src/9de46f9e807786fdbf4a86604aca20dd25f0c19e/map-reduce.l?at=default&fileviewer=file-view-default
https://bitbucket.org/mihailp/tankfeeder/src/9de46f9e807786fdbf4a86604aca20dd25f0c19e/pow.l?at=default&fileviewer=file-view-default
Workers for multicore processing (Question and RFC)
There are limits on how many processes can exist at the same time, having more 'later processes than cores is wasteful, and Linux only allows for 1024 file descriptors per process.

   ? (for N 1 (later (cons) (* N N)))
   !? (pipe (pr (prog (* N N
   Pipe error: Too many open files
   ? N
   -> 339

Is there some "worker" implementation which keeps a queue of tasks to do and creates N worker processes which take tasks from the queue when they are finished?

Ideas:
A task would be a copy of the environment + a function + the values to apply the function to. There would be some optimization potential in case the function and environment are static, or are reset each time to a constant value, but it should also be possible to fetch a copy of the current environment.
Such a pool of workers is best represented by a symbol on which certain functions are applied, which may be stored inside the symbol. This could be achieved, with some overhead compared to a manual implementation, with the PicoLisp object system. Since the worker queue should be accessible fast, putting it in the VAL of the symbol would be the easiest way to achieve that.

Further thoughts:
It might be wise to give the possibility to register a callback on the return value. This would allow to make things more efficient.
For example:

   (prog1  # Parallel background calculation of square numbers
      (mapcan '((N) (later (cons) (* N N))) (range 1 100))
      (wait NIL (full @)) )

could be:

   (pool "squares.db")
   (with-workerpool W '((workers . 2) (callback . '((R) (put *DB (car R) (cadr R)))))
      (mapcar '((N) (add_task 'W '((N) (cons N (* N N))) N)) (range 1 100)) ]
   # Normally the with-workerpool should not finish until the queue is empty

However, it should also be possible to do something like this:

   (with-workerpool W '((callback . NIL) (return . NIL) (output . "+file") (wait . NIL))
      (for X 1000
         (add_task 'W '((N) (ifn (= N (apply '* (prime-factors N))) (print N))) X ]
   # Workers run in background, the main thread can continue

This usage may seem overly complex at first, and the proposed interface is horrible. However, I think that such a library could be a valuable addition to PicoLisp.

Let's discuss the idea itself, your proposals for the interface, and the implementation.