branch: externals/el-job
commit 934327f011e139d9723bf4fcb8ba1550c5b7392d
Author: Martin Edström <meedst...@runbox.eu>
Commit: Martin Edström <meedst...@runbox.eu>

    Readme
---
 README.org  |  58 +++++++++++--
 el-job.texi | 267 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 317 insertions(+), 8 deletions(-)

diff --git a/README.org b/README.org
index 8d5d17d6a7..19b9950ee3 100644
--- a/README.org
+++ b/README.org
@@ -1,9 +1,11 @@
-# Copying and distribution of this file, with or without modification,
-# are permitted in any medium without royalty provided the copyright
-# notice and this notice are preserved.  This file is offered as-is,
-# without any warranty.
-#+HTML: <a href="https://repology.org/project/emacs%3Ael-job/versions"><img src="https://repology.org/badge/vertical-allrepos/emacs%3Ael-job.svg" alt="Packaging status"></a>
-* el-job
+#+TITLE: el-job
+#+AUTHOR: Martin Edström
+#+EMAIL: meedst...@runbox.eu
+#+EXPORT_FILE_NAME: el-job
+#+TEXINFO_DIR_TITLE: El-Job: (el-job).
+#+TEXINFO_DIR_DESC: Async multicore mapcar
+#+TEXINFO_DIR_CATEGORY: Emacs
+#+HTML: <a href="https://repology.org/project/emacs%3Ael-job/versions"><img src="https://repology.org/badge/vertical-allrepos/emacs%3Ael-job.svg" alt="Packaging status" align="right"></a>
 
 Imagine you have a function you'd like to run on a long list of inputs.  You 
could run =(mapcar #'FN INPUTS)=, but that hangs Emacs until done.
 
@@ -13,10 +15,50 @@ In the meantime, current Emacs does not hang at all.
 
 Best of all, it completes /faster/ than =(mapcar #'FN INPUTS)=, owing to the 
use of all CPU cores!
 
-For real-world usage, search for =el-job= in the source of 
[[https://github.com/meedstrom/org-mem/blob/main/org-mem.el][org-mem.el]].
+That's it in a nutshell.  You can look at real-world usage by searching for 
"el-job" in these packages:
+- 
[[https://raw.githubusercontent.com/meedstrom/org-mem/refs/heads/main/org-mem.el][org-mem.el]]
+- 
[[https://raw.githubusercontent.com/meedstrom/org-roam-async/refs/heads/main/org-roam-async.el][org-roam-async.el]]
 
+* New (el-job-ng)
+
+Since 2.5.0, released [2025-10-07 Tue], this repo comes with a variant library 
"el-job-ng".
+
+I find it simpler and easier to reason about.  400 lines of code instead of 
800.
+
+Some differences:
+
+- Does /not/ ever keep a process alive
+- Does /not/ merge the subprocesses' outputs in a bespoke way, just uses 
=append=.
+- Does /not/ look up =load-history= to try hard to find an .eln variant of 
your libraries, instead just shares the whole =load-path= to let =require= do 
its thing.
+- Removed argument =:if-busy=, you can manage this yourself with 
=el-job-ng-busy-p= and =el-job-ng-kill=.
+- Argument =:funcall-per-input= takes a function of two arguments, not one.
+- Added an optional argument =:eval=.
+
+Feel free to file an issue or email.  The design is always changing in 
response to my needs, so hearing about other people's needs is instructive as 
well!
+
+** Future work
+
+I may write more variants.  Something that came with experience is that it's 
best to hard-code an entire variant library for a narrow use-case rather than 
complicate the same library with different code flows, as when it comes to this 
type of library, you really want to keep it easy to reason about!
+
+Ideas as of [2025-10-06 Mon]:
+
+- File IPC ::
+  Sending input and output by writing and reading files, instead of through 
the pipe connection.
+
+  Theory: performance can sometimes be a lot better, with large inputs and 
outputs.  I would guess it depends a lot on the machine.
+
+  Drawback: if the data sent is sensitive, those files probably should be 
encrypted, and that could negate any performance benefit.
+
+- Worker daemons ::
+  Keeping subprocesses alive forever, so they are available at beck and call 
-- think worker daemons.
+
+  That's basically implemented in "el-job-old", and partly why it got so 
hairy, but it could be remade to put this usage front-and-center, with a "happy 
path" UX.
+
+  At this time, I do not have a demanding use-case to experiment with, to 
discover what that happy path should be, or what the affordances should be.
+
+* Old (el-job-old)
 ** Design rationale
-I want to shorten the round-trip as much as possible, *between the start of an 
async task and having the results*.
+I wanted to shorten the round-trip as much as possible, *between the start of 
an async task and having the results*.
 
 For example, say you have some lisp that collects completion candidates, and 
you want to run it asynchronously because the lisp you wrote isn't always fast 
enough to avoid the user's notice, but you'd still like it to return as soon as 
possible.
 
diff --git a/el-job.texi b/el-job.texi
new file mode 100644
index 0000000000..113d205a8b
--- /dev/null
+++ b/el-job.texi
@@ -0,0 +1,267 @@
+\input texinfo    @c -*- texinfo -*-
+@c %**start of header
+@setfilename el-job.info
+@settitle el-job
+@documentencoding UTF-8
+@documentlanguage en
+@c %**end of header
+
+@dircategory Emacs
+@direntry
+* El-Job: (el-job).     Async multicore mapcar.
+@end direntry
+
+@finalout
+@titlepage
+@title el-job
+@author Martin Edström
+@end titlepage
+
+@ifnottex
+@node Top
+@top el-job
+
+Imagine you have a function you'd like to run on a long list of inputs.  You 
could run @samp{(mapcar #'FN INPUTS)}@comma{} but that hangs Emacs until done.
+
+This library lets you run the same function in many subprocesses (one per CPU 
core)@comma{} each with their own split of the @samp{INPUTS} list@comma{} then 
merge their outputs and pass it back to the current Emacs.
+
+In the meantime@comma{} current Emacs does not hang at all.
+
+Best of all@comma{} it completes @emph{faster} than @samp{(mapcar #'FN 
INPUTS)}@comma{} owing to the use of all CPU cores!
+
+That's it in a nutshell.  You can look at real-world usage by searching for 
"el-job" in these packages:
+@itemize
+@item
+@uref{https://raw.githubusercontent.com/meedstrom/org-mem/refs/heads/main/org-mem.el,
 org-mem.el}
+@item
+@uref{https://raw.githubusercontent.com/meedstrom/org-roam-async/refs/heads/main/org-roam-async.el,
 org-roam-async.el}
+@end itemize
+
+@end ifnottex
+
+@menu
+* New (el-job-ng)::
+* Old (el-job-old)::
+
+@detailmenu
+--- The Detailed Node Listing ---
+
+New (el-job-ng)
+
+* Future work::
+
+Old (el-job-old)
+
+* Design rationale::
+* News 2.4: News 24. 
+* News 2.3: News 23. 
+* News 2.1: News 21. 
+* News 2.0: News 20. 
+* News 1.1: News 11. 
+* News 1.0: News 10. 
+* Limitations::
+
+Design rationale
+
+* Processes stay alive::
+* Emacs 30 @samp{fast-read-process-output}::
+
+@end detailmenu
+@end menu
+
+@node New (el-job-ng)
+@chapter New (el-job-ng)
+
+Since 2.5.0@comma{} released @emph{<2025-Oct-07>}@comma{} this repo comes with 
a variant library "el-job-ng".
+
+I find it simpler and easier to reason about.  400 lines of code instead of 
800.
+
+Some differences:
+
+@itemize
+@item
+Does @emph{not} ever keep a process alive
+@item
+Does @emph{not} merge the subprocesses' outputs in a bespoke way@comma{} just 
uses @samp{append}.
+@item
+Does @emph{not} look up @samp{load-history} to try hard to find an .eln 
variant of your libraries@comma{} instead just shares the whole 
@samp{load-path} to let @samp{require} do its thing.
+@item
+Removed argument @samp{:if-busy}@comma{} you can manage this yourself with 
@samp{el-job-ng-busy-p} and @samp{el-job-ng-kill}.
+@item
+Argument @samp{:funcall-per-input} takes a function of two arguments@comma{} 
not one.
+@item
+Added an optional argument @samp{:eval}.
+@end itemize
+
+Feel free to file an issue or email.  The design is always changing in 
response to my needs@comma{} so hearing about other people's needs is 
instructive as well!
+
+@menu
+* Future work::
+@end menu
+
+@node Future work
+@section Future work
+
+I may write more variants.  Something that came with experience is that it's 
best to hard-code an entire variant library for a narrow use-case rather than 
complicate the same library with different code flows@comma{} as when it comes 
to this type of library@comma{} you really want to keep it easy to reason about!
+
+Ideas as of @emph{<2025-Oct-06>}:
+
+@table @asis
+@item File IPC
+Sending input and output by writing and reading files@comma{} instead of 
through the pipe connection.
+
+Theory: performance can sometimes be a lot better@comma{} with large inputs 
and outputs.  I would guess it depends a lot on the machine.
+
+Drawback: if the data sent is sensitive@comma{} those files probably should be 
encrypted@comma{} and that could negate any performance benefit.
+
+@item Worker daemons
+Keeping subprocesses alive forever@comma{} so they are available at beck and 
call -- think worker daemons.
+
+That's basically implemented in "el-job-old"@comma{} and partly why it got so 
hairy@comma{} but it could be remade to put this usage front-and-center@comma{} 
with a "happy path" UX@.
+
+At this time@comma{} I do not have a demanding use-case to experiment 
with@comma{} to discover what that happy path should be@comma{} or what the 
affordances should be.
+@end table
+
+@node Old (el-job-old)
+@chapter Old (el-job-old)
+
+@menu
+* Design rationale::
+* News 2.4: News 24. 
+* News 2.3: News 23. 
+* News 2.1: News 21. 
+* News 2.0: News 20. 
+* News 1.1: News 11. 
+* News 1.0: News 10. 
+* Limitations::
+@end menu
+
+@node Design rationale
+@section Design rationale
+
+I wanted to shorten the round-trip as much as possible@comma{} @strong{between 
the start of an async task and having the results}.
+
+For example@comma{} say you have some lisp that collects completion 
candidates@comma{} and you want to run it asynchronously because the lisp you 
wrote isn't always fast enough to avoid the user's notice@comma{} but you'd 
still like it to return as soon as possible.
+
+@menu
+* Processes stay alive::
+* Emacs 30 @samp{fast-read-process-output}::
+@end menu
+
+@node Processes stay alive
+@subsection Processes stay alive
+
+In the above example@comma{} a user might only delay a fraction of a second 
between opening the minibuffer and beginning to type@comma{} so there's scant 
room for overhead like spinning up subprocesses that load a bunch of libraries 
before getting to work.
+
+Thus@comma{} el-job keeps idle subprocesses for up to 30 seconds after a job 
finishes@comma{} awaiting more input.
+
+An aesthetic drawback is cluttering your task manager with many processes 
named "emacs".
+
+Users who tend to run system commands such as @samp{pkill emacs} may find that 
the command occasionally "does not work"@comma{} because it actually killed an 
el-job subprocess@comma{} instead of the Emacs they see on screen.
+
+@node Emacs 30 @samp{fast-read-process-output}
+@subsection Emacs 30 @samp{fast-read-process-output}
+
+Some other libraries@comma{} like the popular 
@uref{https://github.com/jwiegley/emacs-async/, async.el}@comma{} are designed 
around a custom process filter.
+
+Since Emacs 30@comma{} it's a good idea to instead use the @emph{built-in} 
process filter when performance is critical@comma{} and el-job does so.  
Quoting @uref{https://github.com/emacs-mirror/emacs/blob/master/etc/NEWS.30, 
NEWS.30}:
+
+@example
+** The default process filter was rewritten in native code.
+The round-trip through the Lisp function
+'internal-default-process-filter' is skipped when the process filter is
+the default one.  It is reimplemented in native code@comma{} reducing GC churn.
+To undo this change@comma{} set 'fast-read-process-output' to nil.
+@end example
+
+@node News 24
+@section News 2.4
+
+@itemize
+@item
+Jobs must now have @samp{:inputs}.  If @samp{:inputs} is nil and there was 
nothing queued@comma{} @samp{el-job-launch} will no-op and return the symbol 
@samp{inputs-were-empty}.
+@end itemize
+
+@node News 23
+@section News 2.3
+
+@itemize
+@item
+Some renames to follow Elisp convention
+@itemize
+@item
+@samp{el-job:timestamps} and friends now @samp{el-job-timestamps}.
+@end itemize
+@end itemize
+
+@node News 21
+@section News 2.1
+
+@itemize
+@item
+DROP SUPPORT Emacs 28
+@itemize
+@item
+It likely has not been working for a while anyway.  Maybe works on the 
@uref{https://github.com/meedstrom/el-job/tree/v0.3, v0.3 branch}@comma{} from 
0.3.26+.
+@end itemize
+@end itemize
+
+@node News 20
+@section News 2.0
+
+@itemize
+@item
+Jobs must now have @samp{:id} (no more anonymous jobs).
+@item
+Pruned many code paths.
+@end itemize
+
+@node News 11
+@section News 1.1
+
+@itemize
+@item
+Changed internals so that all builds of Emacs can be expected to perform 
similarly well.
+@end itemize
+
+@node News 10
+@section News 1.0
+
+@itemize
+@item
+No longer keeps processes alive forever.  All jobs are kept alive for up to 30 
seconds of disuse@comma{} then reaped.
+@item
+Pruned many code paths.
+@item
+Many arguments changed@comma{} and a few were removed.  Consult the docstring 
of @samp{el-job-launch} again.
+@end itemize
+
+@node Limitations
+@section Limitations
+
+@enumerate
+@item
+The return value from the @samp{:funcall-per-input} function must always be a 
list with a fixed length@comma{} where the elements are also lists.
+
+For example@comma{} org-mem passes @samp{:funcall-per-input 
#'org-mem-parser--parse-file} to el-job@comma{} and if you look in 
@uref{https://github.com/meedstrom/org-mem/blob/main/org-mem-parser.el, 
org-mem-parser.el} for the defun of @samp{org-mem-parser--parse-file}@comma{} 
it always returns a list of 5 items:
+
+@lisp
+(list (if missing-file (list missing-file)) ; List of 0 or 1 item
+      (if file-mtime (list file-mtime))     ; List of 0 or 1 item
+      found-entries                         ; List of many items
+      org-node-parser--found-links          ; List of many items
+      (if problem (list problem)))          ; List of 0 or 1 item
+@end lisp
+
+It may look clunky to return sub-lists of only one item@comma{} but you could 
consider it a minor expense in exchange for simpler library code.
+
+@item
+Some data types cannot be exchanged with the children: those whose printed 
form looks like @samp{#<...>}.  For example@comma{} @samp{#<buffer 
notes.org>}@comma{} @samp{#<obarray n=94311>}@comma{} @samp{#<marker at 3102 in 
README.org>}.
+
+IIUC@comma{} this sort of data only has meaning within the current process -- 
so even if you could send it@comma{} it would not be usable by the recipient 
anyway.
+
+@item
+For now@comma{} this library tends to be applicable only to a narrow set of 
use-cases@comma{} since you can only pass one @samp{:inputs} list which would 
tend to contain a single kind of thing@comma{} e.g. it could be a list of files 
to visit@comma{} to be split between child processes.  In many potential 
use-cases@comma{} you'd actually want multiple input lists and split them 
differently@comma{} and that's not supported yet.
+@end enumerate
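+
+The second limitation can be seen with built-in Emacs functions alone.  The 
following is a minimal sketch and not part of el-job; the buffer name 
@samp{el-job-demo} is a made-up example.  It prints a buffer object to a 
string@comma{} then asks the Lisp reader to read that string back@comma{} which 
is roughly what a child process would have to do with data sent to it.
+
+@lisp
+;; Illustration only: uses no el-job functions.
+(let ((printed (with-temp-buffer
+                 (rename-buffer "el-job-demo" t)
+                 ;; A buffer object prints in the unreadable #<...> form.
+                 (prin1-to-string (current-buffer)))))
+  (condition-case err
+      (read printed)
+    ;; The reader rejects "#<" by signaling `invalid-read-syntax'.
+    (invalid-read-syntax (list 'unreadable (car err)))))
+@end lisp
+
+The reader signals an error instead of returning a buffer@comma{} so objects 
like this have to be converted to plain data (for example a file name string) 
before they go into @samp{:inputs} or come back as output.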
+
+@bye
