On Friday, August 14, 2020 at 1:44:38 PM UTC+2 david....@gmail.com wrote:

> Same here on a fresh clone with macOS 10.15.16
>
>
> [dochtml] Building en/constructions.
> [dochtml] 
> [dochtml] [construct] building [html]: targets for 16 source files that are out of date
> [dochtml] [construct] updating environment: [new config] 16 added, 0 changed, 0 removed
> ^C[dochtml] Error building the documentation.
> [dochtml] Traceback (most recent call last):
> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 193, in _run_module_as_main
> [dochtml]     "__main__", mod_spec)
> [dochtml]   File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.7/lib/python3.7/runpy.py", line 85, in _run_code
> [dochtml]     exec(code, run_globals)
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__main__.py", line 2, in <module>
> [dochtml]     main()
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 1721, in main
> [dochtml]     builder()
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 337, in _wrapper
> [dochtml]     build_many(build_other_doc, L)
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/__init__.py", line 281, in build_many
> [dochtml]     _build_many(target, args, processes=NUM_THREADS)
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 263, in build_many
> [dochtml]     waited_pid, waited_exitcode = wait_for_one()
> [dochtml]   File "/Users/dcoudert/sage/local/lib/python3.7/site-packages/sage_setup/docbuild/utils.py", line 179, in wait_for_one
> [dochtml]     pid, sts = os.wait()
> [dochtml]   File "src/cysignals/signals.pyx", line 320, in cysignals.signals.python_check_interrupt
> [dochtml] KeyboardInterrupt
> [dochtml] 
> [dochtml]     Note: incremental documentation builds sometimes cause spurious
> [dochtml]     error messages. To be certain that these are real errors, run
> [dochtml]     "make doc-clean" first and try again.
> make[3]: *** [doc-html] Error 1
> make[2]: *** [all-start] Interrupt: 2
> make[1]: *** [all-start] Interrupt: 2
> make: *** [all] Interrupt: 2
>
>
>
As I noted at [1], this implies that one or more of the docbuilds is 
running some code that hangs forever, so it would help to narrow it down 
by finding out which code is running when it hangs.  In docbuilds the most 
likely candidate is plotting code, so you can take the plot code from the 
relevant documentation pages, run it, and see if it hangs.  Chances are 
that if you run it in a single process it will *not* hang; frequently this 
happens only in a forked subprocess.  So you can try something like:

from multiprocessing import Process
# plot_func: a function you define that wraps the plotting code
p = Process(target=plot_func)
p.start()
p.join()  # blocks forever if the code hangs in the subprocess

and see if it hangs. I've often found this to be the case with calls to 
np.dot(), which matplotlib sometimes uses to perform various 
transformations, and which in turn often results in a call to a 
multi-threaded OpenBLAS function; those are sometimes buggy.
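
To make that check less tedious, here is a minimal sketch of a helper that 
runs a function in a forked child and gives up after a timeout.  The names 
hangs_in_fork, plot_example and never_returns are hypothetical, not part of 
Sage or matplotlib.  It asks for the "fork" start method explicitly, since 
newer Python versions on macOS default to "spawn", which would not 
reproduce a fork-only hang.

```python
import multiprocessing
import time


def hangs_in_fork(func, timeout=60.0):
    """Run func in a forked child; return True if it outlives the timeout."""
    # Assumption: the "fork" start method is available (Linux, macOS).
    ctx = multiprocessing.get_context("fork")
    p = ctx.Process(target=func)
    p.start()
    p.join(timeout)          # wait at most `timeout` seconds
    if p.is_alive():         # still running: treat it as hung
        p.terminate()
        p.join()
        return True
    return False


def plot_example():
    # Hypothetical stand-in: paste the plotting code from the doc page here.
    pass


def never_returns():
    # Simulates a hang so the helper itself can be checked.
    while True:
        time.sleep(1)


if __name__ == "__main__":
    print("plot_example hung:", hangs_in_fork(plot_example, timeout=30))
    print("never_returns hung:", hangs_in_fork(never_returns, timeout=2))
```

If a snippet only hangs this way, it may also be worth rerunning it with 
the environment variable OPENBLAS_NUM_THREADS=1 set, to see whether the 
multi-threaded OpenBLAS code path is involved.  Whether the parent process 
has already used the library before forking can matter too, so this is a 
starting point rather than a guaranteed reproducer.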

As an aside, I would like to make the parallel doc build code more robust 
to this kind of hang, but it's hard to say exactly what the right solution 
is (I know more or less what the technical solution is; I mean more the 
UX solution).  As far as the docbuild code is concerned, it doesn't know 
that the process is "hung": the process is just taking however long it 
needs, and the code waits for it to finish.  Perhaps we could implement 
some kind of timeout to kill docbuild processes that are taking too long 
(but what should the timeout be?).  Or we could report back to the user 
exactly which docbuilds are still running (say, a log message at some 
interval) so that the user can decide whether or not to take action.
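
To illustrate the two options, here is a rough sketch of a wait loop that 
logs which workers are still running at a fixed interval and optionally 
kills them after a timeout.  This is not the existing docbuild code; 
wait_with_reports, _work, demo and the "fast"/"slow" labels are made up for 
the example, and the interval and timeout values are arbitrary.

```python
import multiprocessing
import time


def wait_with_reports(procs, report_every=30.0, timeout=None, log=print):
    """Wait for started processes, reporting the ones still running.

    procs maps a label (e.g. a document name) to a started Process.
    Returns the labels that were terminated because of the timeout.
    """
    start = time.monotonic()
    next_report = start + report_every
    running = dict(procs)
    killed = []
    while running:
        for name, p in list(running.items()):
            p.join(0.1)                      # short poll per process
            if not p.is_alive():
                del running[name]
        now = time.monotonic()
        if timeout is not None and now - start > timeout:
            for name, p in running.items():  # give up on the stragglers
                p.terminate()
                p.join()
                killed.append(name)
            break
        if running and now >= next_report:
            log("still running after %.0fs: %s"
                % (now - start, ", ".join(sorted(running))))
            next_report = now + report_every
    return killed


def _work(seconds):
    # Stand-in for a docbuild worker.
    time.sleep(seconds)


def demo():
    # Assumption: the "fork" start method is available (Linux, macOS).
    ctx = multiprocessing.get_context("fork")
    procs = {"fast": ctx.Process(target=_work, args=(0.1,)),
             "slow": ctx.Process(target=_work, args=(60,))}
    for p in procs.values():
        p.start()
    messages = []
    killed = wait_with_reports(procs, report_every=0.5, timeout=1.5,
                               log=messages.append)
    return killed, messages


if __name__ == "__main__":
    print(demo())
```

With timeout=None the loop only reports and never kills anything, which 
sidesteps the question of what the timeout should be and leaves the 
decision to the user.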

[1] https://trac.sagemath.org/ticket/30351#comment:40 
