Re: Workers for multicore processing (Question and RFC)

2017-02-28 Thread Alexander Burger
Hi Petr,

> I believe modern Linux systems have much higher than 1024 file descriptor
> per process limits.

Right. However, Joh-Tob was talking about the number of child processes.

There is a limit in PicoLisp on how many child processes can be spawned with
(fork), caused by the fixed size of the 'fd_set' structure used by the select()
system call.

(fork) creates two pipes between the parent and each child process. The
limitation can be overcome with http://software-lab.de/doc/refD.html#detach for
child processes that do not depend on interprocess communication.

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: Workers for multicore processing (Question and RFC)

2017-02-28 Thread Robert Wörle

On 28.02.2017 at 12:04, Petr Gladkikh wrote:
> I believe modern Linux systems have much higher than 1024 file
> descriptor per process limits. E.g. on my system
>
> $ uname -srv
> Linux 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017
> $ cat /proc/sys/fs/file-max
> 2013532
>
> And you can adjust this limit to your requirements.



Sorry to jump in here out of my closet, but I think this fact needs
some clarification. My research came up with the following:


/proc/sys/fs/file-max is not the per-process limit, but rather the
overall limit of the system.

It can be adjusted by writing to /proc/sys/fs/file-max.

To see the per-process hard limit, use

ulimit -Hn

The soft limit is shown via

ulimit -Sn

As the ulimit man page states, the hard limit acts as a ceiling for the
soft limit, which the process itself can adjust.



So we have the per-process limitations, but as soon as we fork() and
receive a new PID, we can have yet another set of fds, right?


cheers rob



Re: Workers for multicore processing (Question and RFC)

2017-02-28 Thread Petr Gladkikh
I believe modern Linux systems have per-process file descriptor limits much
higher than 1024. E.g. on my system
$ uname -srv
Linux 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017
$ cat /proc/sys/fs/file-max
2013532
And you can adjust this limit to your requirements.

2017-02-25 12:26 GMT+01:00 Joh-Tob Schäg:

> There are limits on how many processes can exist at the same time and
> having more 'later processes than cores is wasteful and Linux only allows
> for 1024 file descriptors per process.
> ? (for N 1 (later (cons) (* N N)))
> !? (pipe (pr (prog (* N N
> Pipe error: Too many open files
> ? N
> -> 339
> [...]


-- 
Petr Gladkikh


Re: Workers for multicore processing (Question and RFC)

2017-02-25 Thread Mike Pechkin
> Is there some "worker" implementation which keeps a queue of task to do
> and creates N worker processes which take tasks from the queue when they
> are finished?
You have to implement everything by yourself.
As a starting point:
https://bitbucket.org/mihailp/tankfeeder/src/9de46f9e807786fdbf4a86604aca20dd25f0c19e/map-reduce.l?at=default=file-view-default

https://bitbucket.org/mihailp/tankfeeder/src/9de46f9e807786fdbf4a86604aca20dd25f0c19e/pow.l?at=default=file-view-default


Workers for multicore processing (Question and RFC)

2017-02-25 Thread Joh-Tob Schäg
There are limits on how many processes can exist at the same time, having
more 'later processes than cores is wasteful, and Linux only allows 1024
file descriptors per process.
? (for N 1 (later (cons) (* N N)))
!? (pipe (pr (prog (* N N
Pipe error: Too many open files
? N
-> 339

Is there some "worker" implementation which keeps a queue of tasks to do and
creates N worker processes that take tasks from the queue as they finish?

Ideas:
A task would be a copy of the environment + a function + the values to apply
the function to. There would be some optimization potential in case the
function and environment are static, or are reset each time to a constant
value, but it should also be possible to fetch a copy of the current
environment.
Such a pool of workers is best represented as a symbol to which certain
functions are applied, which may be stored inside the symbol. Compared to a
manual implementation, this could be achieved with some overhead using the
PicoLisp object system, since the worker queue should be fast to access, and
putting it in the symbol's value would be the easiest way to achieve that.

Further thoughts:
It might be wise to offer the possibility to register a callback on the
return value. This would allow things to be made more efficient.
For example:
For example:

(prog1  # Parallel background calculation of square numbers
   (mapcan '((N) (later (cons) (* N N))) (range 1 100))
   (wait NIL (full @)) )

could be:

(pool "squares.db")
(with-workerpool W '((workers . 2) (callback . ((R) (put *DB (car R) (cadr R)))))
   (mapcar '((N) (add_task 'W '((N) (cons N (* N N))) N)) (range 1 100)) ]

# Normally, with-workerpool should not return until the queue is empty

However, it should also be possible to do something like this:


(with-workerpool W '((callback . NIL) (return . NIL) (output . "+file") (wait . NIL))
   (for X 1000
      (add_task 'W
         '((N) (ifn (= N (apply '* (prime-factors N)))
                  (print N) ))
         X ]  # Workers run in the background; the main thread can continue

This usage may seem overly complex at first, and the proposed interface
is horrible. However, I think that such a library could be a valuable
addition to PicoLisp.

Let's discuss the idea itself, your proposals for the interface, and the
implementation.