Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-21 Thread Vincent MAILHOL
Hi Richard and Lersek,

Thanks for your help on this issue and thanks for picking up my patch
and applying it upstream!

On Tue. 20 June 2023 at 17:10, Richard W.M. Jones wrote:
> I think you've solved the problem now, but for future reference you
> can run:
>
>   $ virt-rescue

Perfect! This last comment was what I needed for my final investigation.

The UID 65534 problem showed up again. Within the qemu VM, the active
user is indeed root.

However,

  $ ls -al /bin/mount
  -rwsr-xr-x 1 65534 65534 55528 May 30 15:42 /bin/mount

Where 65534 corresponds to the user "nobody" and the group "nogroup".
So the root cause was that the bazel sandbox created an environment in
which SUID programs had a different UID and GID than expected. The
guestfs-tools would just copy those IDs when creating the qemu rootfs.
Even if /bin/mount was executed by root, the SUID bit makes it run with
the effective UID of its owner, nobody. This is kind of comical: it is
the first time that I have seen a SUID bit result in a drop of
privilege ¯\_(ツ)_/¯.
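The effect described above can be checked mechanically. The following is a minimal sketch in Python, not part of guestfs-tools: the helper name `setuid_effective_uid` is hypothetical, and the mode and owner values are taken from the `ls -al /bin/mount` listing quoted earlier in this message. It deliberately ignores nosuid mounts and capabilities.

```python
import stat

def setuid_effective_uid(mode, owner_uid, caller_uid):
    """Return the effective UID a program would run with.

    If the set-user-ID bit is set on a regular file, the effective UID
    becomes the file's owner; otherwise the caller's UID is kept.
    Simplified model: nosuid mounts and capabilities are ignored.
    """
    if stat.S_ISREG(mode) and mode & stat.S_ISUID:
        return owner_uid
    return caller_uid

# Regular file with permissions rwsr-xr-x, as in the listing.
MOUNT_MODE = 0o104755
assert stat.filemode(MOUNT_MODE) == "-rwsr-xr-x"

# root (UID 0) executes a SUID binary owned by 65534 (nobody).
print(setuid_effective_uid(MOUNT_MODE, 65534, 0))  # prints 65534
```

On a real system the same check could be fed from `os.stat("/bin/mount")`, passing `st_mode` and `st_uid`, to spot SUID binaries that are not owned by root.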

At this point, I just gave up on using the bazel sandbox for the
particular target in which I need guestfs-tools.

For the record, and in case anyone has the same issue as I did and
finds this thread, the sandbox can be disabled for a particular target
by using the "no-sandbox" tag. Example:

  genrule(
      name = "rootfs",
      srcs = ["rootfs.tar"],
      outs = ["rootfs.ext4"],
      tags = ["no-sandbox"],
      cmd = "virt-make-fs --format=raw --type=ext4 --size=+500M $< $@",
  )


Yours sincerely,
Vincent Mailhol

___
Libguestfs mailing list
Libguestfs@redhat.com
https://listman.redhat.com/mailman/listinfo/libguestfs


Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-20 Thread Richard W.M. Jones
On Mon, Jun 19, 2023 at 08:18:20PM +0900, Vincent MAILHOL wrote:
> On Fri. 16 June 2023 at 16:34, Richard W.M. Jones wrote:
> (...)
> > > Last thing, the segfault on ldmtool [1] still seems a valid issue.
> > > Even if I now do have a workaround for my problem, that segfault might
> > > be worth a bit more investigation.
> >
> > Yes that does look like a real problem.  Does it crash if you just run
> > ldmtool as a normal command, nothing to do with libguestfs?  Might be
> > a good idea to try to get a stack trace of the crash.
> 
> The fact is that it only crashes with the UUID 65534 in the qemu VM. I
> am not sure what command line is passed to ldmtool for this crash to
> occur.
> 
> I can help to gather information, but my biggest issue is that I do
> not know how to interact with the VM under /tmp/.guestfs-1001/

I think you've solved the problem now, but for future reference you
can run:

  $ virt-rescue

(there are various options, see the manual).  This will create a
virtual machine with the appliance and drop you into a shell.

Rich.

>   [0.777352] ldmtool[164]: segfault at 0 ip 563a225cd6a5 sp
> 7ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
>  ^^^
> This smells like a NULL pointer dereference. The instruction pointer
> being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:
> 
>   addr2line -e /usr/bin/ldmtool 564a892506a5
> 
> Results:
> 
>   ??:0
> 
> Without conviction, I also tried in GDB:
> 
>   $ gdb /usr/bin/ldmtool
>   (...)
>   Reading symbols from /usr/bin/ldmtool...
>   Reading symbols from
> /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
>   (gdb) info line *0x564a892506a5
>   No line number information available for address 0x564a892506a5
> 
> Debug symbols are correctly installed but impossible to convert that
> instruction pointer into a line number. It is as if the ldmtool on my
> host and the ldmtool in the qemu VM were from a different build. I
> tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
> did not contain ldmtool.
> 
> I am not sure how to generate a stack trace or a core dump within that
> qemu VM. If you can tell me how to get an interactive prompt (or any
> other guidance) I can try to collect more information.
> 
> 
> Yours sincerely,
> Vincent Mailhol

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html



Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-19 Thread Laszlo Ersek
On 6/19/23 16:32, Vincent Mailhol wrote:
> On Mon 19 June 2023 at 21:16, Laszlo Ersek  wrote:
>> On 6/19/23 13:18, Vincent MAILHOL wrote:
>>> On Fri. 16 June 2023 at 16:34, Richard W.M. Jones wrote:
>>> (...)
>>>>> Last thing, the segfault on ldmtool [1] still seems a valid issue.
>>>>> Even if I now do have a workaround for my problem, that segfault might
>>>>> be worth a bit more investigation.
>>>>
>>>> Yes that does look like a real problem.  Does it crash if you just run
>>>> ldmtool as a normal command, nothing to do with libguestfs?  Might be
>>>> a good idea to try to get a stack trace of the crash.
>>>
>>> The fact is that it only crashes with the UUID 65534 in the qemu VM. I
>>> am not sure what command line is passed to ldmtool for this crash to
>>> occur.
>>>
>>> I can help to gather information, but my biggest issue is that I do
>>> not know how to interact with the VM under /tmp/.guestfs-1001/
>>>
>>>   [0.777352] ldmtool[164]: segfault at 0 ip 563a225cd6a5 sp 
>>> 7ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
>>>  ^^^
>>> This smells like a NULL pointer dereference.
>>
>> ... Hey this is actually my line from an email I started writing earlier
>> today :) , but I then decided not to send it.
>>
>> It certainly looks like a null pointer dereference, and if you
>> disassemble the instruction byte stream dump (the "Code:" line from the
>> kernel log) with (e.g.) ndisasm, that confirms it. You get something like
>>
>> 00000025  E8DBFDFFFF        call 0xfffffffffffffe05
>> 0000002A  4C8B20            mov r12,[rax]            <-- crash
>> 0000002D  4889442408        mov [rsp+0x8],rax
>> 00000032  4C89E7            mov rdi,r12
>> 00000035  E80BE1FFFF        call 0xffffffffffffe145
>>
>> with the "mov r12,[rax]" instruction faulting (with the previously
>> called function presumably having returned 0 in rax). See the "<4c> 8b
>> 20" substring in the "Code:" line -- the angle brackets point at the
>> first byte of the crashing instruction.
>>
>> I didn't send the email ultimately because your email included a link
>> [1] pointing at a particular line number:
>>
>> https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164
>>
>> and so I assumed you actually traced the crash to that line.
>>
>> Is that the case?
>>
>> Or did you perhaps mistake *PID* 164 (from the kernel log) for the line
>> number?
> 
> Yes, two messages back, I misinterpreted the PID (164) as a line
> number. Because that particular line manipulates the result of a
> g_array_index(), it looked consistent with the potential NULL pointer
> dereference. Realizing my mistake, I then started a deeper
> addr2line investigation in the previous message. Sorry.
> 
>>> The instruction pointer
>>> being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:
>>>
>>>   addr2line -e /usr/bin/ldmtool 564a892506a5
> 
> 
> Reading my previous message, I do not know where this 564a892506a5
> comes from. I meant 563a225cd6a5 here (and below in gdb).
> 
>>> Results:
>>>
>>>   ??:0
>>>
>>> Without conviction, I also tried in GDB:
>>>
>>>   $ gdb /usr/bin/ldmtool
>>>   (...)
>>>   Reading symbols from /usr/bin/ldmtool...
>>>   Reading symbols from
>>> /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
>>>   (gdb) info line *0x564a892506a5
>>>   No line number information available for address 0x564a892506a5
>>>
>>> Debug symbols are correctly installed but impossible to convert that
>>> instruction pointer into a line number. It is as if the ldmtool on my
>>> host and the ldmtool in the qemu VM were from a different build. I
>>> tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
>>> did not contain ldmtool.
>>>
>>> I am not sure how to generate a stack trace or a core dump within that
>>> qemu VM. If you can tell me how to get an interactive prompt (or any
>>> other guidance) I can try to collect more information.
>>
>> The IP where the crash occurs is 563a225cd6a5. The ldmtool binary
>> (as opposed to a shared object / library) is mapped into the process's
>> address space at 563a225cb000, for a length of 0x3000 bytes. So the
>> offending instruction is supposed to be 563a225cd6a5 - 563a225cb000
>> = 26A5.
> 
> Thanks for the explanation.
> 
>> With the debug symbols installed, can you attach the output of
>>
>>   objdump --headers --wide -S /usr/bin/ldmtool
>>
>> ?
> 
> Results attached at the bottom of the e-mail.
> 
>> Can you try
>>
>>   addr2line -p -i -f -e /usr/bin/ldmtool 26A5
>>
>> ?
> 
> Unfortunately:
> 
>   $ addr2line -p -i -f -e /usr/bin/ldmtool 26a5
>   ?? ??:0
> 
>> (This still may not be good enough; we might have to offset the
>> difference 0x26A5 with some address related to the .text section... The
>> objdump output should help us experiment.)
> 
> For what it is worth, I loaded the program in GDB:
> 
>   (gdb) break main
>   Breakpoint 

Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-19 Thread Laszlo Ersek
On 6/19/23 13:18, Vincent MAILHOL wrote:
> On Fri. 16 June 2023 at 16:34, Richard W.M. Jones wrote:
> (...)
>>> Last thing, the segfault on ldmtool [1] still seems a valid issue.
>>> Even if I now do have a workaround for my problem, that segfault might
>>> be worth a bit more investigation.
>>
>> Yes that does look like a real problem.  Does it crash if you just run
>> ldmtool as a normal command, nothing to do with libguestfs?  Might be
>> a good idea to try to get a stack trace of the crash.
> 
> The fact is that it only crashes with the UUID 65534 in the qemu VM. I
> am not sure what command line is passed to ldmtool for this crash to
> occur.
> 
> I can help to gather information, but my biggest issue is that I do
> not know how to interact with the VM under /tmp/.guestfs-1001/
> 
>   [0.777352] ldmtool[164]: segfault at 0 ip 563a225cd6a5 sp
> 7ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
>  ^^^
> This smells like a NULL pointer dereference.

... Hey this is actually my line from an email I started writing earlier
today :) , but I then decided not to send it.

It certainly looks like a null pointer dereference, and if you
disassemble the instruction byte stream dump (the "Code:" line from the
kernel log) with (e.g.) ndisasm, that confirms it. You get something like

00000025  E8DBFDFFFF        call 0xfffffffffffffe05
0000002A  4C8B20            mov r12,[rax]            <-- crash
0000002D  4889442408        mov [rsp+0x8],rax
00000032  4C89E7            mov rdi,r12
00000035  E80BE1FFFF        call 0xffffffffffffe145

with the "mov r12,[rax]" instruction faulting (with the previously
called function presumably having returned 0 in rax). See the "<4c> 8b
20" substring in the "Code:" line -- the angle brackets point at the
first byte of the crashing instruction.
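The position of the bracketed byte can also be recovered without a disassembler. The following Python sketch (the helper name `parse_code_line` is hypothetical) parses the "Code:" dump from the original report and locates the marked byte; its index, 0x2a, matches the offset of the faulting "mov r12,[rax]" in the listing above.

```python
def parse_code_line(code):
    """Parse the hex dump from a kernel "Code:" line.

    Returns (data, fault_index): data holds the dumped bytes and
    fault_index is the position of the byte wrapped in angle brackets.
    """
    data = bytearray()
    fault_index = None
    for token in code.split():
        if token.startswith("<") and token.endswith(">"):
            fault_index = len(data)
            token = token[1:-1]
        data.append(int(token, 16))
    if fault_index is None:
        raise ValueError("no <..> marker found in Code: line")
    return bytes(data), fault_index

# Byte dump copied from the kernel log in the original report.
CODE = (
    "18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 41 5c 41 5d "
    "41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> "
    "8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1"
)

data, fault_index = parse_code_line(CODE)
# Index of the marked byte and the three bytes of the faulting instruction.
print(hex(fault_index), data[fault_index:fault_index + 3].hex())  # prints 0x2a 4c8b20
```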

I didn't send the email ultimately because your email included a link
[1] pointing at a particular line number:

https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

and so I assumed you actually traced the crash to that line.

Is that the case?

Or did you perhaps mistake *PID* 164 (from the kernel log) for the line
number?

> The instruction pointer
> being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:
> 
>   addr2line -e /usr/bin/ldmtool 564a892506a5
> 
> Results:
> 
>   ??:0
> 
> Without conviction, I also tried in GDB:
> 
>   $ gdb /usr/bin/ldmtool
>   (...)
>   Reading symbols from /usr/bin/ldmtool...
>   Reading symbols from
> /usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
>   (gdb) info line *0x564a892506a5
>   No line number information available for address 0x564a892506a5
> 
> Debug symbols are correctly installed but impossible to convert that
> instruction pointer into a line number. It is as if the ldmtool on my
> host and the ldmtool in the qemu VM were from a different build. I
> tried to mount /tmp/.guestfs-1001/appliance.d/root but that disk image
> did not contain ldmtool.
> 
> I am not sure how to generate a stack trace or a core dump within that
> qemu VM. If you can tell me how to get an interactive prompt (or any
> other guidance) I can try to collect more information.

The IP where the crash occurs is 563a225cd6a5. The ldmtool binary
(as opposed to a shared object / library) is mapped into the process's
address space at 563a225cb000, for a length of 0x3000 bytes. So the
offending instruction is supposed to be 563a225cd6a5 - 563a225cb000
= 26A5.
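The subtraction can be scripted so that it is easy to repeat for other log lines. This is a minimal Python sketch, assuming the kernel message has been unwrapped onto a single line; the function name `fault_offset` and the regular expression are illustrative, not taken from any existing tool.

```python
import re

# Matches the x86 "segfault at ... in name[base+size]" kernel message.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<error>[0-9a-f]+) "
    r"in (?P<name>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def fault_offset(line):
    """Return the instruction pointer's offset from the mapping base."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the mapping")
    return offset

# The two log lines reported in this thread, unwrapped.
LOG_JUNE_15 = ("[0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 "
               "sp 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]")
LOG_JUNE_19 = ("[0.777352] ldmtool[164]: segfault at 0 ip 563a225cd6a5 "
               "sp 7ffe54965a60 error 4 in ldmtool[563a225cb000+3000]")

for line in (LOG_JUNE_15, LOG_JUNE_19):
    print(hex(fault_offset(line)))  # prints 0x26a5 for both lines
```

Both reports resolve to the same offset even though the load addresses differ between runs, which suggests that the same instruction faults each time. The offset is relative to the mapping named in the log, so it may still need adjusting before it is passed to addr2line.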

With the debug symbols installed, can you attach the output of

  objdump --headers --wide -S /usr/bin/ldmtool

?

Can you try

  addr2line -p -i -f -e /usr/bin/ldmtool 26A5

?

(This still may not be good enough; we might have to offset the
difference 0x26A5 with some address related to the .text section... The
objdump output should help us experiment.)

Laszlo



Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-19 Thread Vincent MAILHOL
On Fri. 16 June 2023 at 16:34, Richard W.M. Jones wrote:
(...)
> > Last thing, the segfault on ldmtool [1] still seems a valid issue.
> > Even if I now do have a workaround for my problem, that segfault might
> > be worth a bit more investigation.
>
> Yes that does look like a real problem.  Does it crash if you just run
> ldmtool as a normal command, nothing to do with libguestfs?  Might be
> a good idea to try to get a stack trace of the crash.

The fact is that it only crashes with the UID 65534 in the qemu VM. I
am not sure what command line is passed to ldmtool for this crash to
occur.

I can help to gather information, but my biggest issue is that I do
not know how to interact with the VM under /tmp/.guestfs-1001/

  [0.777352] ldmtool[164]: segfault at 0 ip 563a225cd6a5 sp
7ffe54965a60 error 4 in ldmtool[563a225cb000+3000]
 ^^^
This smells like a NULL pointer dereference. The instruction pointer
being 563a225cd6a5, I installed libguestfs-tools-dbgsym and tried a:

  addr2line -e /usr/bin/ldmtool 564a892506a5

Results:

  ??:0

Without conviction, I also tried in GDB:

  $ gdb /usr/bin/ldmtool
  (...)
  Reading symbols from /usr/bin/ldmtool...
  Reading symbols from
/usr/lib/debug/.build-id/21/37b4a64903ebe427c242be08b8d496ba570583.debug...
  (gdb) info line *0x564a892506a5
  No line number information available for address 0x564a892506a5

The debug symbols are correctly installed, but it is impossible to
convert that instruction pointer into a line number. It is as if the
ldmtool on my host and the ldmtool in the qemu VM were from different
builds. I tried to mount /tmp/.guestfs-1001/appliance.d/root but that
disk image did not contain ldmtool.

I am not sure how to generate a stack trace or a core dump within that
qemu VM. If you can tell me how to get an interactive prompt (or any
other guidance) I can try to collect more information.


Yours sincerely,
Vincent Mailhol




Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-16 Thread Richard W.M. Jones
On Fri, Jun 16, 2023 at 11:17:21AM +0900, Vincent MAILHOL wrote:
> Hi Richard,
> 
> On Fri. 16 Jun. 2023 at 03:08, Richard W.M. Jones wrote:
> > On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> > > Hello,
> > >
> > > I am using libguestfs in a Bazel's linux-sandbox environment[1].
> > >
> > > When executing in that sandbox environment, I got frequent crashes.
> > >
> > > Please find attached below the results of libguestfs-test-tool when
> > > run into that linux-sandbox environment. The most relevant part seems
> > > to be:
> > >
> > >   [0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 sp 
> > > 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> > >   [0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 
> > > 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff 
> > > ff <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> > >   /init: line 154:   164 Segmentation fault  ldmtool create all
> > >
> > > So the root cause seems to be around libldm. This mailing list seems
> > > to cover both libguestfs and libldm, so hopefully, I am at the right
> > > place to ask :)
> > >
> > > Needless to say, when run outside of the sandbox environment, no crash
> > > were observed.
> > >
> > > [1] linux-sandbox.cc
> > > Link: 
> > > https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> > >
> > > ---
> > ...
> > > supermin: picked /sys/block/sdb/dev (8:16) as root device
> > > supermin: creating /dev/root as block special 8:16
> > > supermin: mounting new root on /root
> > > [0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 
> > > subsystem
> > > [0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . 
> > > Quota mode: none.
> > > supermin: deleting initramfs files
> > > supermin: chroot
> > > Starting /init script ...
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > /init: line 38: /proc/cmdline: No such file or directory
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--types" option (effective UID is 65534)
> > > mount: only root can use "--options" option (effective UID is 65534)
> >
> > It really goes wrong from here, where apparently it's not running as
> > root (instead UID 65534), even though we're supposed to be running
> > inside a Linux appliance virtual machine.
> >
> > Any idea why that would be?
> >
> > I looked at the sandbox and that would run the qemu process as UID
> > "nobody" (which might be 65534).  However I don't understand why that
> > would affect anything running on the new kernel inside the appliance.
> 
> And you were right. It was a fact that I got a crash in the sandbox
> but did not outside of it and I jumped to the conclusion that the root
> cause was linked to the sandbox.
> 
> I continued the analysis and looked at all the differences between a
> successful libguestfs-test-tool log and the failed one. It turned out
> that the sandbox was not the cause. The culprit turns out to be the
> first line of the log: TMPDIR=/tmp.
> 
> If I force TMPDIR=/var/tmp, the problem disappears !!
> 
> This gave me a minimal reproducer:
> 
>   TMPDIR=/tmp/ libguestfs-test-tool
> 
> That one crashed outside the sandbox. Next, my attention went to this line:
> 
>   libguestfs: checking for previously cached test results of
> /usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001
> 
> I did a:
> 
>   rm -rf /tmp/.guestfs-1001
> 
> and that solved my issue \o/
> 
> I still do not understand how I could get the issue of running of UID
> 65534 instead of root in the first place. I did other qemu
> experimentation, so not sure how, but I somehow got a corrupted
> environment under /tmp/.guestfs-1001.

We will cache the appliance under $TMPDIR/.guestfs-$UID/ (e.g. have a
look at appliance.d/root in that directory).
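As an illustration of that naming scheme, here is a small Python sketch that computes the expected cache directory. The function name `appliance_cache_dir` is hypothetical, and the fallback to /var/tmp when TMPDIR is unset is an assumption; libguestfs itself may consult other settings first.

```python
import os
from pathlib import Path

def appliance_cache_dir(environ=None, uid=None):
    """Sketch of where the cached appliance is expected to live."""
    environ = os.environ if environ is None else environ
    uid = os.geteuid() if uid is None else uid
    # Assumption: fall back to /var/tmp when TMPDIR is unset or empty.
    tmpdir = environ.get("TMPDIR") or "/var/tmp"
    return Path(tmpdir) / f".guestfs-{uid}"

# The reproducer from this thread: TMPDIR=/tmp/ and UID 1001.
print(appliance_cache_dir({"TMPDIR": "/tmp/"}, 1001))  # prints /tmp/.guestfs-1001
```

Removing that directory forces the appliance to be rebuilt on the next launch, which is what the `rm -rf /tmp/.guestfs-1001` quoted above achieved.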

We rebuild it if the distro changes, so most of the time we don't have
to rebuild it when launching libguestfs (although there was a
long-standing bug which I fixed recently:
https://github.com/libguestfs/supermin/commit/8c38641042e274a713a18daf7fc85584ca0fc9bb).

> Last thing, the segfault on ldmtool [1] still seems a valid issue.
> Even if I now do have a workaround for my problem, that segfault might
> be worth a bit more investigation.

Yes that does look like a real problem.  Does it crash if you just run
ldmtool as a normal command, nothing to do with libguestfs?  Might be
a good idea to try to get a stack trace of the crash.

Rich.

> Regardless, thanks a lot for your quick answer, that helped me to
> continue the troubleshooting.
> 
> [1] ldmtool line 164
> Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones

Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-15 Thread Vincent MAILHOL
Hi Richard,

On Fri. 16 Jun. 2023 at 03:08, Richard W.M. Jones wrote:
> On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> > Hello,
> >
> > I am using libguestfs in a Bazel's linux-sandbox environment[1].
> >
> > When executing in that sandbox environment, I got frequent crashes.
> >
> > Please find attached below the results of libguestfs-test-tool when
> > run into that linux-sandbox environment. The most relevant part seems
> > to be:
> >
> >   [0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 sp 
> > 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
> >   [0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 
> > 5d 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff 
> > <4c> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
> >   /init: line 154:   164 Segmentation fault  ldmtool create all
> >
> > So the root cause seems to be around libldm. This mailing list seems
> > to cover both libguestfs and libldm, so hopefully, I am at the right
> > place to ask :)
> >
> > Needless to say, when run outside of the sandbox environment, no crash
> > were observed.
> >
> > [1] linux-sandbox.cc
> > Link: 
> > https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> >
> > ---
> ...
> > supermin: picked /sys/block/sdb/dev (8:16) as root device
> > supermin: creating /dev/root as block special 8:16
> > supermin: mounting new root on /root
> > [0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 
> > subsystem
> > [0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . 
> > Quota mode: none.
> > supermin: deleting initramfs files
> > supermin: chroot
> > Starting /init script ...
> > mount: only root can use "--types" option (effective UID is 65534)
> > /init: line 38: /proc/cmdline: No such file or directory
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--options" option (effective UID is 65534)
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--types" option (effective UID is 65534)
> > mount: only root can use "--options" option (effective UID is 65534)
>
> It really goes wrong from here, where apparently it's not running as
> root (instead UID 65534), even though we're supposed to be running
> inside a Linux appliance virtual machine.
>
> Any idea why that would be?
>
> I looked at the sandbox and that would run the qemu process as UID
> "nobody" (which might be 65534).  However I don't understand why that
> would affect anything running on the new kernel inside the appliance.

And you were right. I got a crash in the sandbox but not outside of
it, so I jumped to the conclusion that the root cause was linked to the
sandbox.

I continued the analysis and looked at all the differences between a
successful libguestfs-test-tool log and the failed one. It turned out
that the sandbox was not the cause. The culprit turned out to be the
first line of the log: TMPDIR=/tmp.

If I force TMPDIR=/var/tmp, the problem disappears!

This gave me a minimal reproducer:

  TMPDIR=/tmp/ libguestfs-test-tool

That one crashed outside the sandbox. Next, my attention went to this line:

  libguestfs: checking for previously cached test results of
/usr/bin/qemu-system-x86_64, in /tmp/.guestfs-1001

I did a:

  rm -rf /tmp/.guestfs-1001

and that solved my issue \o/

I still do not understand how I ended up running as UID 65534 instead
of root in the first place. I did other qemu experiments, so I am not
sure how, but I somehow got a corrupted environment under
/tmp/.guestfs-1001.

Last thing, the segfault on ldmtool [1] still seems a valid issue.
Even if I now do have a workaround for my problem, that segfault might
be worth a bit more investigation.

Regardless, thanks a lot for your quick answer, that helped me to
continue the troubleshooting.

[1] ldmtool line 164
Link: https://github.com/mdbooth/libldm/blob/master/src/ldmtool.c#L164



Re: [Libguestfs] libldm crashes in a linux-sandbox context

2023-06-15 Thread Richard W.M. Jones


On Thu, Jun 15, 2023 at 09:18:38PM +0900, Vincent Mailhol wrote:
> Hello,
> 
> I am using libguestfs in Bazel's linux-sandbox environment[1].
> 
> When executing in that sandbox environment, I got frequent crashes.
> 
> Please find attached below the results of libguestfs-test-tool when
> run in that linux-sandbox environment. The most relevant part seems
> to be:
> 
>   [0.797233] ldmtool[164]: segfault at 0 ip 564a892506a5 sp 
> 7fff8ee5b900 error 4 in ldmtool[564a8924e000+3000]
>   [0.798117] Code: 18 64 48 33 1c 25 28 00 00 00 75 5e 48 83 c4 28 5b 5d 
> 41 5c 41 5d 41 5e 41 5f c3 66 2e 0f 1f 84 00 00 00 00 00 e8 db fd ff ff <4c> 
> 8b 20 48 89 44 24 08 4c 89 e7 e8 0b e1 ff ff 45 31 c0 4c 89 e1
>   /init: line 154:   164 Segmentation fault  ldmtool create all
> 
> So the root cause seems to be around libldm. This mailing list seems
> to cover both libguestfs and libldm, so hopefully, I am at the right
> place to ask :)
> 
> Needless to say, when run outside of the sandbox environment, no
> crashes were observed.
> 
> [1] linux-sandbox.cc
> Link: 
> https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc
> 
> ---
...
> supermin: picked /sys/block/sdb/dev (8:16) as root device
> supermin: creating /dev/root as block special 8:16
> supermin: mounting new root on /root
> [0.678248] EXT4-fs (sdb): mounting ext2 file system using the ext4 
> subsystem
> [0.679832] EXT4-fs (sdb): mounted filesystem without journal. Opts: . 
> Quota mode: none.
> supermin: deleting initramfs files
> supermin: chroot
> Starting /init script ...
> mount: only root can use "--types" option (effective UID is 65534)
> /init: line 38: /proc/cmdline: No such file or directory
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--types" option (effective UID is 65534)
> mount: only root can use "--options" option (effective UID is 65534)

It really goes wrong from here, where apparently it's not running as
root (instead UID 65534), even though we're supposed to be running
inside a Linux appliance virtual machine.

Any idea why that would be?

I looked at the sandbox and that would run the qemu process as UID
"nobody" (which might be 65534).  However I don't understand why that
would affect anything running on the new kernel inside the appliance.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org