On Fri, May 31, 2024 at 09:47:18AM +0800, Qian Yun wrote:
> 
> 
> On 5/31/24 00:38, Waldek Hebisch wrote:
> > On Thu, May 30, 2024 at 07:43:30PM +0800, Qian Yun wrote:
> > > 
> > > 
> > > On 5/29/24 22:51, Waldek Hebisch wrote:
> > > > 
> > > > first build went fine, but a few later ones failed.  Actually, it looks
> > > > worse than the previous version, where the probability of success looked
> > > > higher.
> > > > 
> > > > Yes, the problem is that some .tex files are truncated.  In one run it
> > > > was 'ug10.tex', in a few cases it was 'SEGBIND.tex'.  In other cases
> > > > I did not check the files but the LaTeX error message indicated truncation.
> > > 
> > > Can you try my patch from yesterday and see if this one helps.
> > 
> > I could, but it takes time.  And such tweaking is risky: while it
> > may solve the problem on my machine, we risk breaking others.  I would
> > prefer to get closer to the reasons so that we can be confident that
> > the book build really works.
> > 
> 
> I'll explain more.  The output is truncated because some subprocesses
> of sman are killed before their socket buffers have been written out.
> 
> The direction of IO goes like this:
> 
>  FRICASsys <=> (forked child) sman <=> session <=> spadclient <=> stdio
> 
> When FRICASsys quits, the SpadServer socket and pty close,
> sman detects that and quits, causing SessionIOServer to be closed,
> session detects that and quits, causing SessionServer to be closed,
> spadclient detects that and quits.
> (I ignored hypertex in this picture, it should also quit properly.)
> 
> Each process quits before processing all of its IO, so the output
                     ^^^^^^

Do you mean "after"?

> will not be truncated.
> 
> The core idea is to detect socket shutdown, from "man recv":
> 
>      When a stream socket peer has performed an orderly shutdown,
>      the return value will be 0.
> 
> I removed my previous workaround and applied this patch,
> and I no longer have this truncation issue.
> 
> If this patch works on your side as well, I can improve the
> details of this patch and upstream it.

I tried 3 times and it worked without failure.  I will do more tests,
but it seems to solve the truncation problem.
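
For illustration only, here is a minimal sketch of the "recv returns 0"
idea from the man page quote above.  It is not the patch itself: sman is
C code, while this sketch uses Python, where an orderly shutdown shows up
as recv() returning an empty bytes object.  The names drain and demo are
hypothetical.

```python
import socket


def drain(sock, bufsize=4096):
    """Read from sock until the peer performs an orderly shutdown."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:
            # Empty result: the peer shut down its side and everything
            # it sent before that has already been delivered.
            break
        chunks.append(data)
    return b"".join(chunks)


def demo():
    # A connected socket pair stands in for two cooperating processes.
    writer, reader = socket.socketpair()
    with writer, reader:
        writer.sendall(b"line 1\nline 2\n")
        # Orderly shutdown of the sending side; queued data stays readable.
        writer.shutdown(socket.SHUT_WR)
        return drain(reader)


if __name__ == "__main__":
    print(demo())
```

The point is that the reader decides when to stop based on end of stream,
not on a signal, so nothing queued before the shutdown is lost.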

> > Concerning the book, I did a few trials with the version in the trunk, and
> > it worked fine on each trial.  That is too little to be sure,
> > but is a strong indication that the trouble is due to recent changes.
> > 
> 
> The trunk version uses "FRICASsys" in the pipe, that is fine.
> My version uses "sman with FRICASsys" in the pipe, which causes the problem,
> but I think it existed in the past as well.

Well, I agree that "sman in the pipe" almost surely was buggy.

> > > The "sman -paste" invocation of "hypertex" does not have this problem
> > > because it uses only socket IO to FRICASsys, it's purely sequential.
> > 
> > Yes, socket I/O is free from the worst races.  Pure use of stdio
> > should also be good.
> > 
> > > While the "usage of sman/FRICASsys in pipe" is more complex, it involves
> > > both socket IO and stdio, so the race happens there.
> > 
> > Well, clearly we should try to limit simultaneous use of socket IO
> > and stdio.
> > 
> 
> A minor correction: socket IO and stdio are sequential, but the signal
> to kill the process is parallel, causing the race problem.
> As I explained, this patch makes each process exit normally, instead
> of being killed by sman simultaneously.

Yes, avoiding signals is good.
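
To make the difference concrete, here is a rough sketch of a chain of
stages that exit on end of stream instead of being killed.  It is an
assumption-laden toy, not the sman code: threads and socket pairs stand
in for the FRICASsys, sman, session and spadclient processes, and the
names relay and run_chain are hypothetical.

```python
import socket
import threading


def relay(upstream, downstream, bufsize=4096):
    """Forward bytes until upstream reports EOF, then pass EOF on."""
    with upstream, downstream:
        while True:
            data = upstream.recv(bufsize)
            if not data:
                break
            downstream.sendall(data)
        # Only after everything has been forwarded do we signal EOF.
        downstream.shutdown(socket.SHUT_WR)


def produce(sock, payload):
    sock.sendall(payload)
    sock.shutdown(socket.SHUT_WR)


def run_chain(payload, stages=3):
    """Send payload through `stages` relays and return what arrives."""
    head, upstream = socket.socketpair()
    threads = [threading.Thread(target=produce, args=(head, payload))]
    for _ in range(stages):
        left, right = socket.socketpair()
        threads.append(threading.Thread(target=relay, args=(upstream, left)))
        upstream = right
    for t in threads:
        t.start()
    received = []
    while True:
        data = upstream.recv(4096)
        if not data:
            break
        received.append(data)
    for t in threads:
        t.join()
    head.close()
    upstream.close()
    return b"".join(received)


if __name__ == "__main__":
    payload = b"x" * 200000
    print(run_chain(payload) == payload)
```

Each stage exits only after it has seen EOF from its upstream and
forwarded everything, so shutdown propagates in order along the chain.
Killing the stages in parallel would break exactly this ordering.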

-- 
                              Waldek Hebisch
