Re: [fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-20 Thread Qian Yun

The patch deals with the beginning stage, transferring data
from fricas to view2/3D, that code deals with the ending stage,
writing data from view2/3D to disk.

Also the problem raised by "THEMOS" is mitigated by the "Fix" part.

But!!!  The mitigation was invalided by my "OBEY"->"|run_program|"
change.  Because |waitForViewport| uses a lowercase "obey" and
I missed it.

I was going to submit the following patch anyway.
There is a "sleep" to prevent spawning lots of processes.

- Qian

diff --git a/src/interp/util.lisp b/src/interp/util.lisp
index a3da8695..6f1ee95f 100644
--- a/src/interp/util.lisp
+++ b/src/interp/util.lisp
@@ -372,11 +372,11 @@ After this function is called the image is clean 
and can be saved.

 (defun |waitForViewport| ()
   (progn
(do ()
-   ((not (zerop (obey
+   ((not (zerop (|run_shell_command|
 (concat
  "ps "
  |$ViewportProcessToWatch|
- " > /dev/null")
+ " > /dev/null && sleep 0.05")
())
(|sockSendInt| |$MenuServer| 1)
(|setIOindex| (- |$IOindex| 3))


On 4/20/24 18:28, Ralf Hemmecke wrote:

Maybe this comment is connected to the sleep stuff...

https://github.com/fricas/fricas/blob/master/src/hyper/htinp.c#L351

==
THEMOS says: There is a problem here in that we issue the (|close|) and
then go on. If this is the last command, we will soon send a SIGTERM and
the whole thing will collapse maybe BEFORE the writing out has finished.
Fix: Call a Lisp function that checks (with \spadop{key} ps and grep) 
the health of the viewport. We do this after the (|close|).

==

Ralf



--
You received this message because you are subscribed to the Google Groups "FriCAS - 
computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/6fd18a14-bcc5-4a11-a6e9-a9621b2d724f%40gmail.com.


Re: [fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-20 Thread Ralf Hemmecke

Maybe this comment is connected to the sleep stuff...

https://github.com/fricas/fricas/blob/master/src/hyper/htinp.c#L351

==
THEMOS says: There is a problem here in that we issue the (|close|) and
then go on. If this is the last command, we will soon send a SIGTERM and
the whole thing will collapse maybe BEFORE the writing out has finished.
Fix: Call a Lisp function that checks (with \spadop{key} ps and grep) 
the health of the viewport. We do this after the (|close|).

==

Ralf

--
You received this message because you are subscribed to the Google Groups "FriCAS - 
computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/8591317d-3061-411c-9354-d516aec8417e%40hemmecke.org.


Re: [fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-19 Thread Waldek Hebisch
On Fri, Apr 19, 2024 at 09:02:59PM +0800, Qian Yun wrote:
> 
> 
> On 4/19/24 20:41, Waldek Hebisch wrote:
> > 
> > Long ago I looked at communication protocol between various
> > processes that we use and my conclusion was that it is
> > inherently racy: there are parallel chanels of communication
> > and both ends assume that data comes in right order.  I added
> > a little piece of code to detect lost race, that mitigates
> > worst things.  Machines now are faster than in the past,
> > so lost races probably are quite rare given 1s delay.
> > It is quite possible that original authors after realizing
> > that there are races put delays in places that are not necessary
> > (and wrogly placed delay could even make races worse).
> > 
> > To explain more: most of C code uses sockets and this alone should
> > be OK.  However, data from FRICASsys sends textual output on
> > standard output, which is captured by 'sman'.  But "event indicators"
> > are sent via sockets, so we depend on data on FRICASsys standard
> > output and data coming via sockets to arrive in order in which
> > it was sent.
> 
> In this particular case, it is not socket, but pipe.
> 
> > >   The only reason to put a "sleep" here,
> > > I presume it is a workaround for a bug in viewAlone:
> > > 
> > > See the "printf" I removed bellow: it writes to stdout instead of
> > > stderr, causing the parent process function "readViewport" to return
> > > early and make parent process exits, and the data passed to child
> > > process via pipe is lost.
> > 
> > IIRC 'viewAlone' is started from HyperDoc when you click on a image.
> > Normal graphics uses 'view2D' and 'view3D' via 'viewman'.
> > 
> > Anyway, I think it would be reasonably safe to use smaller delay,
> > say 50 or 100 milliseconds (I use such delay during startup).
> > Quite possible that we can elliminate the delays, but that IMO
> > would require deeper analysis and more testing.  As I wrote we
> > have several channels of communication and code assumes certain
> > ordering contraints.  Without identifying contraints (and some
> > could be far from obvious) and analysing them it is hard to say
> > more than "there are races".
> 
> This case is rather straight forward I think, viewman/viewAlone
> forks and passes data to child process view2D/view3D via pipe,
> then child process writes a value after receiving all data,
> then viewAlone exits while viewman sends it via sock to fricas
> and continues to run.
> 
> I can accept the 100ms sleep as well.  But could you give this
> another review based on the new information I provided?

_Locally_ this should work fine without delay.  But the question
is how this affects timing in the whole constelation of processes
that we use.

I do not say that the change is bad.  Simply, on fast unloaded
machine in fixed configuration timings typically are resonably
repeateble and it may be very hard to reproduce a problem
(if there are any).  OTOH in different configuration timing
problem may appear.  From my point of view testing is not
enough to prove absence to troubles, and since there are
races simple theoretical argument also can not work.  Pragmatically
we can accept some risk of breakge, but I prefer to be
conservative here, accepting changes that risk breakage can
have cumulative effect and significantly degrade code quality.
And to say the truth I much more willig to accept some
deterministic risk, as this once discoverd can be reproduced
and fixed.  Non deterministic breakge can live long...

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/ZiKAOi--bW1sHGl_%40fricas.org.


Re: [fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-19 Thread Qian Yun




On 4/19/24 20:41, Waldek Hebisch wrote:


Long ago I looked at communication protocol between various
processes that we use and my conclusion was that it is
inherently racy: there are parallel chanels of communication
and both ends assume that data comes in right order.  I added
a little piece of code to detect lost race, that mitigates
worst things.  Machines now are faster than in the past,
so lost races probably are quite rare given 1s delay.
It is quite possible that original authors after realizing
that there are races put delays in places that are not necessary
(and wrogly placed delay could even make races worse).

To explain more: most of C code uses sockets and this alone should
be OK.  However, data from FRICASsys sends textual output on
standard output, which is captured by 'sman'.  But "event indicators"
are sent via sockets, so we depend on data on FRICASsys standard
output and data coming via sockets to arrive in order in which
it was sent.


In this particular case, it is not socket, but pipe.


  The only reason to put a "sleep" here,
I presume it is a workaround for a bug in viewAlone:

See the "printf" I removed bellow: it writes to stdout instead of
stderr, causing the parent process function "readViewport" to return
early and make parent process exits, and the data passed to child
process via pipe is lost.


IIRC 'viewAlone' is started from HyperDoc when you click on a image.
Normal graphics uses 'view2D' and 'view3D' via 'viewman'.

Anyway, I think it would be reasonably safe to use smaller delay,
say 50 or 100 milliseconds (I use such delay during startup).
Quite possible that we can elliminate the delays, but that IMO
would require deeper analysis and more testing.  As I wrote we
have several channels of communication and code assumes certain
ordering contraints.  Without identifying contraints (and some
could be far from obvious) and analysing them it is hard to say
more than "there are races".


This case is rather straight forward I think, viewman/viewAlone
forks and passes data to child process view2D/view3D via pipe,
then child process writes a value after receiving all data,
then viewAlone exits while viewman sends it via sock to fricas
and continues to run.

I can accept the 100ms sleep as well.  But could you give this
another review based on the new information I provided?

- Qian


BTW: My inital motivation for work on streams was to pass all
FRICASsys (actually it was called AXIOMsys at that time) output
to sockets, which would allow messages with confirmation.



--
You received this message because you are subscribed to the Google Groups "FriCAS - 
computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/83664668-d3a4-40b1-b9d7-ef5d26fdc696%40gmail.com.


Re: [fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-19 Thread Waldek Hebisch
On Fri, Apr 19, 2024 at 07:12:06PM +0800, Qian Yun wrote:
> You can notice a "pause" when you draw a picture like:
> 
> (1) -> draw(x,x=1..2)
>Compiling function %C with type DoubleFloat -> DoubleFloat
>Graph data being transmitted to the viewport manager...
>FriCAS2D data being transmitted to the viewport manager...
> <<<>>>
>(1)  TwoDimensionalViewport: "x"
> 
> It is also very noticeable when you generate the viewports.
> 
> The following numbers are build with 8 cores.
> 
> This is build without X11.
> 
> real1m52.533s
> user6m47.266s
> sys 0m58.385s
> 
> 
> This is build with X11.  So it takes 11s more CPU time
> to build hyperdoc/graphics binary and run them to
> generate hyperpages and viewports.
> 
> But the user have to wait 1 more minute to see the build
> finish.  Most time is wasted on this "sleep".
> 
> real2m52.534s
> user6m58.368s
> sys 0m55.921s
> 
> 
> This is build with X11 and this patch.
> This takes 14s more CPU time and 9s real time.
> 
> real2m1.381s
> user7m0.911s
> sys 0m56.194s
> 
> 
> The "readViewport" and "send_int" are synchronized IO,
> so no race condition here.

Long ago I looked at communication protocol between various
processes that we use and my conclusion was that it is
inherently racy: there are parallel chanels of communication
and both ends assume that data comes in right order.  I added
a little piece of code to detect lost race, that mitigates
worst things.  Machines now are faster than in the past,
so lost races probably are quite rare given 1s delay.
It is quite possible that original authors after realizing
that there are races put delays in places that are not necessary
(and wrogly placed delay could even make races worse).

To explain more: most of C code uses sockets and this alone should
be OK.  However, data from FRICASsys sends textual output on
standard output, which is captured by 'sman'.  But "event indicators"
are sent via sockets, so we depend on data on FRICASsys standard
output and data coming via sockets to arrive in order in which
it was sent.

>  The only reason to put a "sleep" here,
> I presume it is a workaround for a bug in viewAlone:
> 
> See the "printf" I removed bellow: it writes to stdout instead of
> stderr, causing the parent process function "readViewport" to return
> early and make parent process exits, and the data passed to child
> process via pipe is lost.

IIRC 'viewAlone' is started from HyperDoc when you click on a image.
Normal graphics uses 'view2D' and 'view3D' via 'viewman'.

Anyway, I think it would be reasonably safe to use smaller delay,
say 50 or 100 milliseconds (I use such delay during startup).
Quite possible that we can elliminate the delays, but that IMO
would require deeper analysis and more testing.  As I wrote we
have several channels of communication and code assumes certain
ordering contraints.  Without identifying contraints (and some
could be far from obvious) and analysing them it is hard to say
more than "there are races".

BTW: My inital motivation for work on streams was to pass all
FRICASsys (actually it was called AXIOMsys at that time) output
to sockets, which would allow messages with confirmation.

-- 
  Waldek Hebisch

-- 
You received this message because you are subscribed to the Google Groups 
"FriCAS - computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/ZiJmXz8JYw2GfIRC%40fricas.org.


[fricas-devel] [PATCH] remove "sleep" to speed up "draw"

2024-04-19 Thread Qian Yun

You can notice a "pause" when you draw a picture like:

(1) -> draw(x,x=1..2)
   Compiling function %C with type DoubleFloat -> DoubleFloat
   Graph data being transmitted to the viewport manager...
   FriCAS2D data being transmitted to the viewport manager...
<<<>>>
   (1)  TwoDimensionalViewport: "x"

It is also very noticeable when you generate the viewports.

The following numbers are build with 8 cores.

This is build without X11.

real1m52.533s
user6m47.266s
sys 0m58.385s


This is build with X11.  So it takes 11s more CPU time
to build hyperdoc/graphics binary and run them to
generate hyperpages and viewports.

But the user have to wait 1 more minute to see the build
finish.  Most time is wasted on this "sleep".

real2m52.534s
user6m58.368s
sys 0m55.921s


This is build with X11 and this patch.
This takes 14s more CPU time and 9s real time.

real2m1.381s
user7m0.911s
sys 0m56.194s


The "readViewport" and "send_int" are synchronized IO,
so no race condition here.  The only reason to put a "sleep" here,
I presume it is a workaround for a bug in viewAlone:

See the "printf" I removed bellow: it writes to stdout instead of
stderr, causing the parent process function "readViewport" to return
early and make parent process exits, and the data passed to child
process via pipe is lost.

- Qian

diff --git a/src/graph/viewAlone/spoon2D.c b/src/graph/viewAlone/spoon2D.c
index 480dab6b..0bc94a43 100644
--- a/src/graph/viewAlone/spoon2D.c
+++ b/src/graph/viewAlone/spoon2D.c
@@ -82,7 +82,6 @@ spoonView2D(void)
 close(pipe0[1]);
 close(pipe1[0]);
 close(pipe1[1]);
-printf("(spoon2D child) start the TwoDimensionalViewport process\n");
 fricas_sprintf_to_buf1(errorStr, "%s",
 "(viewAlone) execution of the TwoDimensionalViewport process");
 env_fricas = getenv("FRICAS");
@@ -150,7 +149,6 @@ spoonView2D(void)
 /*** get acknowledge from viewport */

 code = read(viewP.viewIn,&(viewP.viewWindow),sizeof(Window));
-sleep(1);  /* wait a second...*/
 exit(0);

   }   /* switch */
diff --git a/src/graph/viewAlone/spoonComp.c 
b/src/graph/viewAlone/spoonComp.c

index ed0ce619..dc0786cb 100644
--- a/src/graph/viewAlone/spoonComp.c
+++ b/src/graph/viewAlone/spoonComp.c
@@ -169,7 +169,6 @@ spoonView3D(int type)

 /*** get acknowledge from viewport */
 code = read(viewP.viewIn,&(viewP.viewWindow),sizeof(Window));
-sleep(1);  /* wait a second...*/
 exit(0);

   }   /* switch */
diff --git a/src/graph/viewman/fun2D.c b/src/graph/viewman/fun2D.c
index 4b55a963..87ec3350 100644
--- a/src/graph/viewman/fun2D.c
+++ b/src/graph/viewman/fun2D.c
@@ -294,7 +294,6 @@ forkView2D(void)
  /*** get acknowledge from viewport */

 code = readViewport(viewport,&(viewport->viewWindow),sizeof(Window));
-sleep(1);  /* wait a second...*/
 send_int(spadSock,viewport->PID);  /* acknowledge to spad */

   }   /* switch */
diff --git a/src/graph/viewman/fun3D.c b/src/graph/viewman/fun3D.c
index 795b23d3..53ab32c9 100644
--- a/src/graph/viewman/fun3D.c
+++ b/src/graph/viewman/fun3D.c
@@ -405,7 +405,6 @@ forkView3D(int typeOfViewport)

  /*** get acknowledge from viewport */
 code = readViewport(viewport,&(viewport->viewWindow),sizeof(Window));
-sleep(1);  /* wait a second...*/
 send_int(spadSock,viewport->PID);  /* acknowledge to spad */

   }   /* switch */

--
You received this message because you are subscribed to the Google Groups "FriCAS - 
computer algebra system" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to fricas-devel+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/fricas-devel/0962a538-2ad2-45c5-b051-d4a71dc25c82%40gmail.com.