Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-13 Thread Sean Whitton
Hello,

On Sat 12 Nov 2022 at 02:55AM +01, Vincent Lefevre wrote:

> Hi,
>
> On 2022-11-11 11:32:33 -0700, Sean Whitton wrote:
>> On Thu 10 Nov 2022 at 11:23AM +01, Vincent Lefevre wrote:
>> > On 2022-11-08 12:44:08 -0700, Sean Whitton wrote:
>> >> Are you able to test the patch?  Let me know if you need help getting an
>> >> installable .deb.  Thanks.
>> >
>> > Sorry, I couldn't test it yet, first because of an uninstallable
>> > package needed for the build because I couldn't upgrade libc6 yet
>> > and I couldn't get the previous version from snapshot.debian.org
>> > (bug 1023540). Now that I could upgrade libc6, I'll be able to
>> > test when I have some time, but perhaps not before the week-end.
>>
>> Okay, do let me know if I can help -- this is blocking Emacs from migrating.
>
> I've rebuilt the packages with the patch and couldn't reproduce
> the bug yet. So it may be the correct fix.

Many thanks for testing, and Eli and Paul for the patch.

-- 
Sean Whitton




Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-11 Thread Vincent Lefevre
Hi,

On 2022-11-11 11:32:33 -0700, Sean Whitton wrote:
> On Thu 10 Nov 2022 at 11:23AM +01, Vincent Lefevre wrote:
> > On 2022-11-08 12:44:08 -0700, Sean Whitton wrote:
> >> Are you able to test the patch?  Let me know if you need help getting an
> >> installable .deb.  Thanks.
> >
> > Sorry, I couldn't test it yet, first because of an uninstallable
> > package needed for the build because I couldn't upgrade libc6 yet
> > and I couldn't get the previous version from snapshot.debian.org
> > (bug 1023540). Now that I could upgrade libc6, I'll be able to
> > test when I have some time, but perhaps not before the week-end.
> 
> Okay, do let me know if I can help -- this is blocking Emacs from migrating.

I've rebuilt the packages with the patch and couldn't reproduce
the bug yet. So it may be the correct fix.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-11 Thread Sean Whitton
Hello,

On Thu 10 Nov 2022 at 11:23AM +01, Vincent Lefevre wrote:

> On 2022-11-08 12:44:08 -0700, Sean Whitton wrote:
>> Are you able to test the patch?  Let me know if you need help getting an
>> installable .deb.  Thanks.
>
> Sorry, I couldn't test it yet, first because of an uninstallable
> package needed for the build because I couldn't upgrade libc6 yet
> and I couldn't get the previous version from snapshot.debian.org
> (bug 1023540). Now that I could upgrade libc6, I'll be able to
> test when I have some time, but perhaps not before the week-end.

Okay, do let me know if I can help -- this is blocking Emacs from migrating.

-- 
Sean Whitton




Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-10 Thread Vincent Lefevre
On 2022-11-08 12:44:08 -0700, Sean Whitton wrote:
> Are you able to test the patch?  Let me know if you need help getting an
> installable .deb.  Thanks.

Sorry, I couldn't test it yet, first because of an uninstallable
package needed for the build because I couldn't upgrade libc6 yet
and I couldn't get the previous version from snapshot.debian.org
(bug 1023540). Now that I could upgrade libc6, I'll be able to
test when I have some time, but perhaps not before the week-end.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-10 Thread Eli Zaretskii
> Date: Sat, 5 Nov 2022 13:54:54 -0700
> Cc: vinc...@vinc17.net, spwhit...@spwhitton.name, 58...@debbugs.gnu.org,
>  1017...@bugs.debian.org
> From: Paul Eggert 
> 
> On 2022-11-04 00:00, Eli Zaretskii wrote:
> > We need to establish what is the
> > source of SIGHUP in these cases.  "These cases" mean, AFAIU, the
> > situations where Emacs launched an async subprocess to do native
> > compilation (which is another Emacs process in a --batch session), and
> > the parent Emacs session is terminated by the user before the async
> > compilation runs to completion.  Would the child Emacs process get
> > SIGHUP in this scenario?
> 
> Hard for me to say. It's a messy area, with kernels (and Emacs itself) 
> sending SIGHUP on various whims.
> 
> Does the attached patch fix things? It builds on your commit 
> 190a6853708ab22072437f6ebd93beb3ec1a9ce6 dated 2020-12-04; I don't know 
> why that earlier patch was installed, but it would seem to apply to 
> SIGHUP and SIGTERM as well as it applies to SIGINT.

No further comments, so I've now installed this on the master branch,
and I'm marking this bug done.

Thanks.



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-08 Thread Sean Whitton
Hello Vincent,

Are you able to test the patch?  Let me know if you need help getting an
installable .deb.  Thanks.

-- 
Sean Whitton




Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-06 Thread Eli Zaretskii
> Date: Sun, 6 Nov 2022 11:44:43 -0800
> Cc: a...@sdf.org, vinc...@vinc17.net, spwhit...@spwhitton.name,
>  58...@debbugs.gnu.org, 1017...@bugs.debian.org
> From: Paul Eggert 
> 
> On 2022-11-06 11:32, Eli Zaretskii wrote:
> > My question was whether in this scenario, since the parent Emacs
> > exits, the child Emacs can get SIGHUP, simply because its parent
> > exited and the read end of the PTY no longer exists.
> 
> Yes, my sense from the few experiments I tried, is that it's a plausible 
> scenario, though I never observed it actually happening for Emacs doing 
> a subprocess compile.

OK, thanks.  So I hope your suggested patch will solve this issue.



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-06 Thread Paul Eggert

On 2022-11-06 11:32, Eli Zaretskii wrote:

> My question was whether in this scenario, since the parent Emacs
> exits, the child Emacs can get SIGHUP, simply because its parent
> exited and the read end of the PTY no longer exists.


Yes, my sense from the few experiments I tried, is that it's a plausible 
scenario, though I never observed it actually happening for Emacs doing 
a subprocess compile.




Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-06 Thread Eli Zaretskii
> Date: Sun, 6 Nov 2022 11:18:03 -0800
> Cc: a...@sdf.org, vinc...@vinc17.net, spwhit...@spwhitton.name,
>  58...@debbugs.gnu.org, 1017...@bugs.debian.org
> From: Paul Eggert 
> 
> On 2022-11-05 22:51, Eli Zaretskii wrote:
> 
> > But is it possible for a program like Emacs to get SIGHUP in such a
> > situation, or is that highly improbable?  We have standard streams of
> > the inferior Emacs process connected via PTYs to the parent process, I
> > believe -- does that deliver SIGHUP or SIGPIPE when the parent exits?
> 
> It depends on the OS and the app that invokes Emacs and how that app 
> itself was invoked. It's a hairy area.
> 
> On a POSIX platform it's certainly *possible* for Emacs to get SIGHUP in 
> that situation, because a user can invoke the shell command 'kill -s HUP 
> P', where P is the process ID of the inferior Emacs. Whether it's 
> *likely* is a bit harder to say. I ran a few little experiments on 
> Fedora 36 and Ubuntu 22.10 and found SIGHUP being sent in a few 
> situations and not others and didn't have the time or patience to suss 
> out exactly why or when.

Thanks.  The scenario that is of primary interest in this case is the
following:

 . user starts Emacs
 . Emacs loads some Lisp package and as a result starts a subordinate
   Emacs process in batch mode to native-compile the loaded Lisp
 . user exits Emacs

My question was whether in this scenario, since the parent Emacs
exits, the child Emacs can get SIGHUP, simply because its parent
exited and the read end of the PTY no longer exists.
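
The scenario above can be tried outside Emacs with a small experiment.
What follows is a minimal sketch in Python, not Emacs code: it assumes
the child runs as a session leader on a fresh PTY (which is what
`pty.fork` arranges), and how Emacs actually sets up its
native-compilation subprocess may differ.  The function name
`child_sees_sighup` is hypothetical, and the answer it returns depends on
the OS, as noted elsewhere in this thread.

```python
import os
import pty
import select
import signal
import time


def child_sees_sighup(timeout=5.0):
    """Fork a child on a fresh PTY, close the master side, and report
    whether the child received SIGHUP within `timeout` seconds."""
    r, w = os.pipe()
    pid, master = pty.fork()  # child: new session, PTY slave as controlling tty
    if pid == 0:
        try:
            os.close(r)

            def on_hup(signum, frame):
                os.write(w, b"HUP")  # report back over the pipe
                os._exit(0)

            signal.signal(signal.SIGHUP, on_hup)
            os.write(w, b"R")        # tell the parent the handler is installed
            time.sleep(timeout)      # interrupted if SIGHUP arrives
        finally:
            os._exit(1)              # no SIGHUP arrived in time (or an error)

    os.close(w)
    os.read(r, 1)                    # wait for the child's "ready" byte
    os.close(master)                 # stands in for the parent going away
    ready, _, _ = select.select([r], [], [], timeout + 1)
    got = os.read(r, 3) if ready else b""
    os.close(r)
    os.waitpid(pid, 0)
    return got == b"HUP"


if __name__ == "__main__":
    print("child received SIGHUP:", child_sees_sighup())
```

A pipe is used for reporting because the child's standard streams are
the PTY itself, which is gone by the time the signal would arrive.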



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-06 Thread Paul Eggert

On 2022-11-05 22:51, Eli Zaretskii wrote:


> But is it possible for a program like Emacs to get SIGHUP in such a
> situation, or is that highly improbable?  We have standard streams of
> the inferior Emacs process connected via PTYs to the parent process, I
> believe -- does that deliver SIGHUP or SIGPIPE when the parent exits?


It depends on the OS and the app that invokes Emacs and how that app 
itself was invoked. It's a hairy area.


On a POSIX platform it's certainly *possible* for Emacs to get SIGHUP in 
that situation, because a user can invoke the shell command 'kill -s HUP 
P', where P is the process ID of the inferior Emacs. Whether it's 
*likely* is a bit harder to say. I ran a few little experiments on 
Fedora 36 and Ubuntu 22.10 and found SIGHUP being sent in a few 
situations and not others and didn't have the time or patience to suss 
out exactly why or when.




Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-05 Thread Eli Zaretskii
> Date: Sat, 5 Nov 2022 13:54:54 -0700
> Cc: vinc...@vinc17.net, spwhit...@spwhitton.name, 58...@debbugs.gnu.org,
>  1017...@bugs.debian.org
> From: Paul Eggert 
> 
> On 2022-11-04 00:00, Eli Zaretskii wrote:
> > We need to establish what is the
> > source of SIGHUP in these cases.  "These cases" mean, AFAIU, the
> > situations where Emacs launched an async subprocess to do native
> > compilation (which is another Emacs process in a --batch session), and
> > the parent Emacs session is terminated by the user before the async
> > compilation runs to completion.  Would the child Emacs process get
> > SIGHUP in this scenario?
> 
> Hard for me to say. It's a messy area, with kernels (and Emacs itself) 
> sending SIGHUP on various whims.

But is it possible for a program like Emacs to get SIGHUP in such a
situation, or is that highly improbable?  We have standard streams of
the inferior Emacs process connected via PTYs to the parent process, I
believe -- does that deliver SIGHUP or SIGPIPE when the parent exits?

> Does the attached patch fix things? It builds on your commit 
> 190a6853708ab22072437f6ebd93beb3ec1a9ce6 dated 2020-12-04; I don't know 
> why that earlier patch was installed, but it would seem to apply to 
> SIGHUP and SIGTERM as well as it applies to SIGINT.

I was trying to be conservative, that's all.  I'm okay with doing the
same for SIGHUP.  Vincent, can you try this patch, please?



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-05 Thread Paul Eggert

On 2022-11-04 00:00, Eli Zaretskii wrote:

> We need to establish what is the
> source of SIGHUP in these cases.  "These cases" mean, AFAIU, the
> situations where Emacs launched an async subprocess to do native
> compilation (which is another Emacs process in a --batch session), and
> the parent Emacs session is terminated by the user before the async
> compilation runs to completion.  Would the child Emacs process get
> SIGHUP in this scenario?


Hard for me to say. It's a messy area, with kernels (and Emacs itself) 
sending SIGHUP on various whims.


Does the attached patch fix things? It builds on your commit 
190a6853708ab22072437f6ebd93beb3ec1a9ce6 dated 2020-12-04; I don't know 
why that earlier patch was installed, but it would seem to apply to 
SIGHUP and SIGTERM as well as it applies to SIGINT.

diff --git a/src/emacs.c b/src/emacs.c
index 1b2aa9442b..92e2299a04 100644
--- a/src/emacs.c
+++ b/src/emacs.c
@@ -432,9 +432,9 @@ terminate_due_to_signal (int sig, int backtrace_limit)
   if (sig == SIGTERM || sig == SIGHUP || sig == SIGINT)
 	{
 	  /* Avoid abort in shut_down_emacs if we were interrupted
-		 by SIGINT in noninteractive usage, as in that case we
+		 in noninteractive usage, as in that case we
 		 don't care about the message stack.  */
-	  if (sig == SIGINT && noninteractive)
+	  if (noninteractive)
 		clear_message_stack ();
 	  Fkill_emacs (make_fixnum (sig), Qnil);
 	}


Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-04 Thread Andrea Corallo
Eli Zaretskii  writes:

>> From: Andrea Corallo 
>> Cc: Vincent Lefevre , spwhit...@spwhitton.name,
>> 58...@debbugs.gnu.org, 1017...@bugs.debian.org
>> Date: Thu, 03 Nov 2022 21:25:08 +
>> 
>> AFAIU the Emacs subprocess we use to compile should behave like a
>> regular Emacs.
>
> Basically, you are saying that if the sub-process that runs
> async-compilation gets SIGHUP, it should abort and dump core, like a
> normal Emacs session does, right?
>
> The backtrace posted to the Debian bug tracker, here:
>
>   
> https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1017711;filename=gdb.txt;msg=5
>
> indicates that Emacs was in the middle of comp-copy-insn which was
> called from comp-fwprop.  Then Emacs performed GC, and SIGHUP was
> received during GC.  IOW, we were in our Lisp code, not in a libgccjit
> code, when the signal arrived.
>
> Another backtrace, posted here:
>
>   
> https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1017711;filename=gdb.txt;msg=45
>
> tells a somewhat different story: it doesn't show Emacs in the middle
> of a native compilation, but just inside substitute-command-keys that
> was called from command-line.

Sorry I missed those traces.  Okay, so in both cases, if libgccjit is not
involved, the behaviour of Emacs here is just the plain one and should
not be related to native compilation.  It's just that native compilation
makes it more likely that this condition is identified.

>> Now, the only option that comes to my mind is that libgccjit (being
>> strictly derived from the GCC codebase) might be registering a signal
>> handler of some kind that alters the behaviour we expect.  But if this
>> is the case we should find a trace of it in the strace output, or we
>> can use gdb, setting a breakpoint on 'signal', to check as well.
>> 
>> Indeed if this theory is true I think it should be classified as a
>> libgccjit bug.
>
> I don't think it's true, see above.
>
> Paul, can you help here, please?  We need to establish what is the
> source of SIGHUP in these cases.  "These cases" mean, AFAIU, the
> situations where Emacs launched an async subprocess to do native
> compilation (which is another Emacs process in a --batch session), and
> the parent Emacs session is terminated by the user before the async
> compilation runs to completion.  Would the child Emacs process get
> SIGHUP in this scenario?  If yes, then I think we should treat SIGHUP
> differently in non-interactive invocations: instead of dumping core,
> we should catch the signal and exit with a non-zero exit status.
>
> Does this make sense?

To me yes.

> Andrea, if we do the above as I suggest, is there any cleanup that we
> need to do before exiting?  For example, what if the subprocess that
> does the async compilation already started writing the .eln file when
> the signal arrives?  What do we do today when the parent interactive
> Emacs is terminated by the user?

I think we have no special handling for this case, so yeah we might
leave some traces of the compilation.  Other than the .eln we should
also remove the Lisp file we write to be loaded by the async compilation
process.  I'm not sure where and how it would be best to handle all of
this, though.

Best Regards

  Andrea
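
One common way to avoid leaving a partially written output file behind
is to write to a temporary file in the same directory and rename it into
place only once it is complete.  The sketch below shows that pattern in
Python.  It is an illustration only, not a description of how Emacs
writes .eln files, and `write_atomically` is a hypothetical helper.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write `data` (bytes) to `path` so that `path` is either absent
    or complete, never half-written."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomic within one filesystem on POSIX
    except BaseException:
        # Also covers SystemExit or KeyboardInterrupt raised from a
        # signal handler while the write is in progress.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
```

If the process is killed by a signal it does not handle, the temporary
file is still left behind, so a separate sweep for stale temporary files
would be needed to cover that case.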



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-04 Thread Eli Zaretskii
> From: Andrea Corallo 
> Cc: Vincent Lefevre , spwhit...@spwhitton.name,
> 58...@debbugs.gnu.org, 1017...@bugs.debian.org
> Date: Thu, 03 Nov 2022 21:25:08 +
> 
> AFAIU the Emacs subprocess we use to compile should behave like a
> regular Emacs.

Basically, you are saying that if the sub-process that runs
async-compilation gets SIGHUP, it should abort and dump core, like a
normal Emacs session does, right?

The backtrace posted to the Debian bug tracker, here:

  
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1017711;filename=gdb.txt;msg=5

indicates that Emacs was in the middle of comp-copy-insn which was
called from comp-fwprop.  Then Emacs performed GC, and SIGHUP was
received during GC.  IOW, we were in our Lisp code, not in a libgccjit
code, when the signal arrived.

Another backtrace, posted here:

  
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=1017711;filename=gdb.txt;msg=45

tells a somewhat different story: it doesn't show Emacs in the middle
of a native compilation, but just inside substitute-command-keys that
was called from command-line.

> Now, the only option that comes to my mind is that libgccjit (being
> strictly derived from the GCC codebase) might be registering a signal
> handler of some kind that alters the behaviour we expect.  But if this
> is the case we should find a trace of it in the strace output, or we
> can use gdb, setting a breakpoint on 'signal', to check as well.
> 
> Indeed if this theory is true I think it should be classified as a
> libgccjit bug.

I don't think it's true, see above.

Paul, can you help here, please?  We need to establish what is the
source of SIGHUP in these cases.  "These cases" mean, AFAIU, the
situations where Emacs launched an async subprocess to do native
compilation (which is another Emacs process in a --batch session), and
the parent Emacs session is terminated by the user before the async
compilation runs to completion.  Would the child Emacs process get
SIGHUP in this scenario?  If yes, then I think we should treat SIGHUP
differently in non-interactive invocations: instead of dumping core,
we should catch the signal and exit with a non-zero exit status.

Does this make sense?

Andrea, if we do the above as I suggest, is there any cleanup that we
need to do before exiting?  For example, what if the subprocess that
does the async compilation already started writing the .eln file when
the signal arrives?  What do we do today when the parent interactive
Emacs is terminated by the user?
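
The proposed behaviour, catching the signal in a non-interactive process
and exiting with a non-zero status instead of dumping core, can be
illustrated with a small stand-alone sketch.  This is Python, not the
Emacs C code.  The worker script, the name `run_worker_and_hang_up`, and
the exit status 128 + signal number (a shell convention) are assumptions
made for the example.

```python
import signal
import subprocess
import sys
import textwrap

# Hypothetical stand-in for a batch worker; not Emacs code.
CHILD = textwrap.dedent("""
    import signal, sys, time

    def exit_on_signal(signum, frame):
        # Exit through the normal path with a non-zero status.
        sys.exit(128 + signum)

    for sig in (signal.SIGHUP, signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, exit_on_signal)

    print("ready", flush=True)
    time.sleep(30)
""")


def run_worker_and_hang_up():
    """Start the worker, send it SIGHUP, and return its exit status."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()           # wait until the handlers are installed
    proc.send_signal(signal.SIGHUP)
    status = proc.wait(timeout=10)
    proc.stdout.close()
    return status


if __name__ == "__main__":
    print("worker exit status:", run_worker_and_hang_up())
```

Without the handler, `subprocess` would report a negative return code,
meaning the worker was killed by the signal rather than exiting on its
own.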



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-03 Thread Andrea Corallo
Eli Zaretskii  writes:

>> Date: Thu, 3 Nov 2022 11:13:08 +0100
>> From: Vincent Lefevre 
>> Cc: spwhit...@spwhitton.name, 58...@debbugs.gnu.org,
>>  1017...@bugs.debian.org
>> 
>> On 2022-11-03 08:47:06 +0200, Eli Zaretskii wrote:
>> > > On 2022-11-02 14:24:51 +0200, Eli Zaretskii wrote:
>> > > > Signal 1 is SIGHUP, AFAIU.  Why should Emacs receive SIGHUP in the
>> > > > middle of GC, I have no idea.  Maybe ask the user what was he doing at
>> > > > that time.  E.g., could that be a remote Emacs session?
>> > > 
>> > > No, it is on my local machine.
>> > 
>> > So how come Emacs gets a SIGHUP?  This is the crucial detail that is
>> > missing here.  Basically, if SIGHUP is delivered to Emacs, Emacs is
>> > supposed to die a violent death.
>> 
>> I suspect the SIGHUP comes from Emacs itself. According to strace
>> output, the only processes started by Emacs are "/usr/bin/emacs"
>> (there are many of them). I don't see what other process could be
>> aware of the situation. Unfortunately, I couldn't reproduce the
>> issue with strace (I suspect some race condition).
>> 
>> > > I run emacs, and quit it immediately. The generation of the core dump
>> > > is almost 100% reproducible. Ditto with "emacs -nw".
>> > 
>> > Wait, you mean the crash is during exiting Emacs?
>> 
>> For this test, yes. In general, I don't know.
>> 
>> > That could mean Emacs receives some input event when it's half-way
>> > through the shutdown process, and the input descriptor is already
>> > closed.
>> 
>> Note that the process that crashes is not the Emacs I started,
>> but a subprocess run by Emacs itself, since it has arguments like
>> "-no-comp-spawn --batch -l /tmp/emacs-async-comp-url.el-FGov4z.el".
>
> Andrea, could you please look into this?  The SIGHUP could be because
> the parent process exits, but that shouldn't cause a crash in the
> sub-process that performs native compilation?

Hi Eli,

AFAIU the Emacs subprocess we use to compile should behave like a
regular Emacs.

Now, the only option that comes to my mind is that libgccjit (being
strictly derived from the GCC codebase) might be registering a signal
handler of some kind that alters the behaviour we expect.  But if this
is the case we should find a trace of it in the strace output, or we
can use gdb, setting a breakpoint on 'signal', to check as well.

Indeed if this theory is true I think it should be classified as a
libgccjit bug.

  Andrea
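
Besides strace and gdb, on Linux the set of signals a running process
currently catches can be read from the `SigCgt` line of
`/proc/PID/status`, where bit n-1 of the hexadecimal mask stands for
signal n.  The sketch below decodes that mask in Python.  It is a
different technique from the strace/gdb check suggested above, it is
Linux-only, and the function names are hypothetical.

```python
def decode_sigmask(hex_mask):
    """Turn a /proc status signal mask (hex string) into a set of
    signal numbers; bit n-1 set means signal n is in the mask."""
    mask = int(hex_mask, 16)
    return {n for n in range(1, 65) if mask & (1 << (n - 1))}


def caught_signals(pid="self"):
    """Return the signal numbers for which the process has installed
    handlers, according to /proc/<pid>/status (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("SigCgt:"):
                return decode_sigmask(line.split()[1])
    raise RuntimeError("no SigCgt line found")


if __name__ == "__main__":
    import signal
    print("SIGHUP caught by this process:",
          int(signal.SIGHUP) in caught_signals())
```

Comparing the result for a batch Emacs subprocess before and after
libgccjit is loaded would show whether a handler for SIGHUP appears,
though it would not say which library installed it.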



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-03 Thread Eli Zaretskii
> Date: Thu, 3 Nov 2022 11:13:08 +0100
> From: Vincent Lefevre 
> Cc: spwhit...@spwhitton.name, 58...@debbugs.gnu.org,
>   1017...@bugs.debian.org
> 
> On 2022-11-03 08:47:06 +0200, Eli Zaretskii wrote:
> > > On 2022-11-02 14:24:51 +0200, Eli Zaretskii wrote:
> > > > Signal 1 is SIGHUP, AFAIU.  Why should Emacs receive SIGHUP in the
> > > > middle of GC, I have no idea.  Maybe ask the user what was he doing at
> > > > that time.  E.g., could that be a remote Emacs session?
> > > 
> > > No, it is on my local machine.
> > 
> > So how come Emacs gets a SIGHUP?  This is the crucial detail that is
> > missing here.  Basically, if SIGHUP is delivered to Emacs, Emacs is
> > supposed to die a violent death.
> 
> I suspect the SIGHUP comes from Emacs itself. According to strace
> output, the only processes started by Emacs are "/usr/bin/emacs"
> (there are many of them). I don't see what other process could be
> aware of the situation. Unfortunately, I couldn't reproduce the
> issue with strace (I suspect some race condition).
> 
> > > I run emacs, and quit it immediately. The generation of the core dump
> > > is almost 100% reproducible. Ditto with "emacs -nw".
> > 
> > Wait, you mean the crash is during exiting Emacs?
> 
> For this test, yes. In general, I don't know.
> 
> > That could mean Emacs receives some input event when it's half-way
> > through the shutdown process, and the input descriptor is already
> > closed.
> 
> Note that the process that crashes is not the Emacs I started,
> but a subprocess run by Emacs itself, since it has arguments like
> "-no-comp-spawn --batch -l /tmp/emacs-async-comp-url.el-FGov4z.el".

Andrea, could you please look into this?  The SIGHUP could be because
the parent process exits, but that shouldn't cause a crash in the
sub-process that performs native compilation?

Thanks.



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-03 Thread Vincent Lefevre
On 2022-11-03 08:47:06 +0200, Eli Zaretskii wrote:
> > On 2022-11-02 14:24:51 +0200, Eli Zaretskii wrote:
> > > Signal 1 is SIGHUP, AFAIU.  Why should Emacs receive SIGHUP in the
> > > middle of GC, I have no idea.  Maybe ask the user what was he doing at
> > > that time.  E.g., could that be a remote Emacs session?
> > 
> > No, it is on my local machine.
> 
> So how come Emacs gets a SIGHUP?  This is the crucial detail that is
> missing here.  Basically, if SIGHUP is delivered to Emacs, Emacs is
> supposed to die a violent death.

I suspect the SIGHUP comes from Emacs itself. According to strace
output, the only processes started by Emacs are "/usr/bin/emacs"
(there are many of them). I don't see what other process could be
aware of the situation. Unfortunately, I couldn't reproduce the
issue with strace (I suspect some race condition).

> > I run emacs, and quit it immediately. The generation of the core dump
> > is almost 100% reproducible. Ditto with "emacs -nw".
> 
> Wait, you mean the crash is during exiting Emacs?

For this test, yes. In general, I don't know.

> That could mean Emacs receives some input event when it's half-way
> through the shutdown process, and the input descriptor is already
> closed.

Note that the process that crashes is not the Emacs I started,
but a subprocess run by Emacs itself, since it has arguments like
"-no-comp-spawn --batch -l /tmp/emacs-async-comp-url.el-FGov4z.el".
However, it also happened that the Emacs I started immediately
crashed (this occurred only once, though).

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-03 Thread Eli Zaretskii
> Cc: 58...@debbugs.gnu.org, 1017...@bugs.debian.org
> Date: Thu, 3 Nov 2022 04:00:46 +0100
> From: Vincent Lefevre 
> 
> On 2022-11-02 14:24:51 +0200, Eli Zaretskii wrote:
> > Signal 1 is SIGHUP, AFAIU.  Why should Emacs receive SIGHUP in the
> > middle of GC, I have no idea.  Maybe ask the user what was he doing at
> > that time.  E.g., could that be a remote Emacs session?
> 
> No, it is on my local machine.

So how come Emacs gets a SIGHUP?  This is the crucial detail that is
missing here.  Basically, if SIGHUP is delivered to Emacs, Emacs is
supposed to die a violent death.

> I run emacs, and quit it immediately. The generation of the core dump
> is almost 100% reproducible. Ditto with "emacs -nw".

Wait, you mean the crash is during exiting Emacs?  That could mean
Emacs receives some input event when it's half-way through the
shutdown process, and the input descriptor is already closed.

But the backtrace you posted shows SIGHUP during GC, which is AFAIU a
very different case.



Bug#1017711: bug#58956: mark_object, mark_objects(?) crash

2022-11-02 Thread Vincent Lefevre
On 2022-11-02 14:24:51 +0200, Eli Zaretskii wrote:
> Signal 1 is SIGHUP, AFAIU.  Why should Emacs receive SIGHUP in the
> middle of GC, I have no idea.  Maybe ask the user what was he doing at
> that time.  E.g., could that be a remote Emacs session?

No, it is on my local machine.

On 2022-11-02 14:43:41 -0700, Sean Whitton wrote:
> Upstream says there isn't enough information in the backtrace to say
> anything helpful about this.  Could you take a look at
>  and consider supplying more information
> over there, please?
> 
> Also, are you able to reproduce this with 'emacs -q' (not -Q)?

This is not reproducible with "emacs -q".

I can reproduce it in a firejail private directory[*] (so that the
behavior doesn't depend on my own config files), where there is no
.emacs file. There is a .emacs.d directory with just a eln-cache
subdirectory:

zira% ls -la .emacs.d 
total 12
drwx------ 3 vinc17 vinc17 4096 2022-11-01 00:40:05 .
drwx------ 4 vinc17 vinc17 4096 2022-11-03 03:53:23 ..
drwxr-xr-x 3 vinc17 vinc17 4096 2022-11-01 00:40:05 eln-cache
zira% ls -la .emacs.d/eln-cache 
total 12
drwxr-xr-x 3 vinc17 vinc17 4096 2022-11-01 00:40:05 .
drwx------ 3 vinc17 vinc17 4096 2022-11-01 00:40:05 ..
drwxr-xr-x 2 vinc17 vinc17 4096 2022-11-01 00:40:05 28.2-43f520ab
zira% ls -la .emacs.d/eln-cache/28.2-43f520ab 
total 8
drwxr-xr-x 2 vinc17 vinc17 4096 2022-11-01 00:40:05 .
drwxr-xr-x 3 vinc17 vinc17 4096 2022-11-01 00:40:05 ..
zira% 

[*] firejail --ignore=read-only --ignore='noexec ${HOME}' 
--noblacklist='${HOME}/*' --private=fj-dir zsh

I run emacs, and quit it immediately. The generation of the core dump
is almost 100% reproducible. Ditto with "emacs -nw".

But note that the bug is also reproducible without firejail, but
harder to reproduce.

-- 
Vincent Lefèvre  - Web: 
100% accessible validated (X)HTML - Blog: 
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)