Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato
On 3/7/2022 5:07 PM, Jason Andryuk wrote:
> On Mon, Mar 7, 2022 at 10:00 AM Andrea Stevanato
>  wrote:
>> (XEN) XSM Framework v1.0.0 initialized
>> (XEN) Initialising XSM SILO mode
> 
> Yes, SILO mode is running.
> 
>> # cat /boot/xen-4.14.3-pre.config | grep XSM
>> CONFIG_XSM=y
>> CONFIG_XSM_FLASK=y
>> CONFIG_XSM_FLASK_AVC_STATS=y
>> # CONFIG_XSM_FLASK_POLICY is not set
>> CONFIG_XSM_SILO=y
>> # CONFIG_XSM_DUMMY_DEFAULT is not set
>> # CONFIG_XSM_FLASK_DEFAULT is not set
>> CONFIG_XSM_SILO_DEFAULT=y
>>
>> This is the default configuration shipped with PetaLinux. From the
>> menuconfig help, it seems that XSM SILO denies communication
>> between unprivileged VMs.
> 
> You could try adding xsm=dummy to your hypervisor command line to turn
> off SILO and allow the guests to communicate.

I changed it to FLASK by adding flask=late to the hypervisor command line.
Which one should I choose: SILO + xsm=dummy or FLASK + flask=late/disabled?
What are the differences?

Cheers,
Andrea
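
To check which XSM module a given build selects by default, the .config
lines quoted above can be inspected mechanically. The following is a minimal
sketch in Python, not a Xen tool: the names default_xsm_module and
THREAD_CONFIG are made up for illustration, and it only looks at the
CONFIG_XSM_*_DEFAULT symbols that appear in this thread. A boot-time override
such as xsm=dummy or flask=late is not visible in the .config and would have
to be checked separately (for example in the `xl dmesg` banner).

```python
import re

# The .config excerpt quoted in this thread (xen-4.14.3-pre.config).
THREAD_CONFIG = """\
CONFIG_XSM=y
CONFIG_XSM_FLASK=y
CONFIG_XSM_FLASK_AVC_STATS=y
# CONFIG_XSM_FLASK_POLICY is not set
CONFIG_XSM_SILO=y
# CONFIG_XSM_DUMMY_DEFAULT is not set
# CONFIG_XSM_FLASK_DEFAULT is not set
CONFIG_XSM_SILO_DEFAULT=y
"""


def default_xsm_module(config_text):
    """Return 'silo', 'flask' or 'dummy' for the build-time default,
    or None if XSM is not enabled or no default symbol is set."""
    # Collect every symbol that is set to 'y'; '# ... is not set' lines
    # start with '#' and therefore never match.
    enabled = set(re.findall(r"^CONFIG_(\w+)=y$", config_text, flags=re.MULTILINE))
    if "XSM" not in enabled:
        return None
    for name in ("silo", "flask", "dummy"):
        if "XSM_%s_DEFAULT" % name.upper() in enabled:
            return name
    return None


print(default_xsm_module(THREAD_CONFIG))  # prints: silo
```

For the configuration above this reports SILO as the default, which matches
the "Initialising XSM SILO mode" line in the hypervisor log.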



Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato
On 3/7/2022 3:56 PM, Jan Beulich wrote:
> On 07.03.2022 15:52, Roger Pau Monné wrote:
>> On Mon, Mar 07, 2022 at 03:20:22PM +0100, Andrea Stevanato wrote:
>>> On 3/7/2022 12:46 PM, Roger Pau Monné wrote:
>>>> On Mon, Mar 07, 2022 at 12:39:22PM +0100, Andrea Stevanato wrote:
>>>>> /local/domain/2 = ""   (n0,r2)
>>>>> /local/domain/2/vm = "/vm/f6dca20a-54bb-43af-9a62-67c55cb75708"   (n0,r2)
>>>>> /local/domain/2/name = "guest1"   (n0,r2)
>>>>> /local/domain/2/cpu = ""   (n0,r2)
>>>>> /local/domain/2/cpu/0 = ""   (n0,r2)
>>>>> /local/domain/2/cpu/0/availability = "online"   (n0,r2)
>>>>> /local/domain/2/cpu/1 = ""   (n0,r2)
>>>>> /local/domain/2/cpu/1/availability = "online"   (n0,r2)
>>>>> /local/domain/2/memory = ""   (n0,r2)
>>>>> /local/domain/2/memory/static-max = "1048576"   (n0,r2)
>>>>> /local/domain/2/memory/target = "1048577"   (n0,r2)
>>>>> /local/domain/2/memory/videoram = "-1"   (n0,r2)
>>>>> /local/domain/2/device = ""   (n0,r2)
>>>>> /local/domain/2/device/suspend = ""   (n0,r2)
>>>>> /local/domain/2/device/suspend/event-channel = ""   (n2)
>>>>> /local/domain/2/device/vif = ""   (n0,r2)
>>>>> /local/domain/2/device/vif/0 = ""   (n2,r1)
>>>>> /local/domain/2/device/vif/0/backend = "/local/domain/1/backend/vif/2/0"
>>>>> (n2,r1)
>>>>> /local/domain/2/device/vif/0/backend-id = "1"   (n2,r1)
>>>>> /local/domain/2/device/vif/0/state = "6"   (n2,r1)
>>>>> /local/domain/2/device/vif/0/handle = "0"   (n2,r1)
>>>>> /local/domain/2/device/vif/0/mac = "00:16:3e:07:df:91"   (n2,r1)
>>>>> /local/domain/2/device/vif/0/xdp-headroom = "0"   (n2,r1)
>>>>> /local/domain/2/control = ""   (n0,r2)
>>>>> /local/domain/2/control/shutdown = ""   (n2)
>>>>> /local/domain/2/control/feature-poweroff = "1"   (n2)
>>>>> /local/domain/2/control/feature-reboot = "1"   (n2)
>>>>> /local/domain/2/control/feature-suspend = ""   (n2)
>>>>> /local/domain/2/control/sysrq = ""   (n2)
>>>>> /local/domain/2/control/platform-feature-multiprocessor-suspend = "1"
>>>>> (n0,r2)
>>>>> /local/domain/2/control/platform-feature-xs_reset_watches = "1"   (n0,r2)
>>>>> /local/domain/2/data = ""   (n2)
>>>>> /local/domain/2/drivers = ""   (n2)
>>>>> /local/domain/2/feature = ""   (n2)
>>>>> /local/domain/2/attr = ""   (n2)
>>>>> /local/domain/2/error = ""   (n2)
>>>>> /local/domain/2/error/device = ""   (n2)
>>>>> /local/domain/2/error/device/vif = ""   (n2)
>>>>> /local/domain/2/error/device/vif/0 = ""   (n2)
>>>>> /local/domain/2/error/device/vif/0/error = "1 allocating event channel"
>>>>> (n2)
>>>>
>>>> That's the real error. Your guest netfront fails to allocate the event
>>>> channel. Do you get any messages in the guest dmesg after trying to
>>>> attach the network interface?
>>>
>>> Just these two lines:
>>>
>>> [  389.453390] vif vif-0: 1 allocating event channel
>>> [  389.804135] vif vif-0: 1 allocating event channel
>>
>> Are you perhaps using some kind of flask/xsm policy different from the
>> defaults?
> 
> Or SILO mode.

It turns out that this was the problem. I changed it to FLASK, added
flask=late to the bootloader cmd and now it works fine (at least for
now).

> Jan

Forgive me for bothering you so much. As soon as I can, I will update
the wiki with all the information that I have discovered!
Thank you all!

Cheers,
Andrea
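
The annotations in parentheses in the xenstore listing above are node
permissions. As far as I understand the xenstore convention, the first entry
names the owning domain and its letter gives the access for all domains not
listed, while later entries grant access to specific domains (n = none,
r = read, w = write, b = both). A minimal Python sketch of that reading;
decode_perms is a hypothetical helper name, not part of any Xen tool:

```python
# Meaning of the permission letters, as assumed in the paragraph above.
ACCESS = {"n": "none", "r": "read", "w": "write", "b": "read/write"}


def decode_perms(spec):
    """Decode a permission annotation such as '(n0,r2)' from the listing."""
    entries = spec.strip("() ").split(",")
    # First entry: owner domid, plus the default access for unlisted domains.
    owner = entries[0]
    result = {"owner": int(owner[1:]), "others": ACCESS[owner[0]], "acl": {}}
    # Remaining entries: explicit per-domain access.
    for entry in entries[1:]:
        result["acl"][int(entry[1:])] = ACCESS[entry[0]]
    return result


print(decode_perms("(n2,r1)"))
```

Read this way, "(n2,r1)" on /local/domain/2/device/vif/0 means the node is
owned by domain 2 (guest1) and readable by domain 1 (the backend domain).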



Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato
On 3/7/2022 3:50 PM, Andrew Cooper wrote:
> On 07/03/2022 14:43, Andrea Stevanato wrote:
>> On 3/7/2022 3:36 PM, Jan Beulich wrote:
>>> On 07.03.2022 15:20, Andrea Stevanato wrote:
>>>> On 3/7/2022 12:46 PM, Roger Pau Monné wrote:
>>>>> On Mon, Mar 07, 2022 at 12:39:22PM +0100, Andrea Stevanato wrote:
>>>>>> /local/domain/2 = ""   (n0,r2)
>>>>>> /local/domain/2/vm = "/vm/f6dca20a-54bb-43af-9a62-67c55cb75708"   (n0,r2)
>>>>>> /local/domain/2/name = "guest1"   (n0,r2)
>>>>>> /local/domain/2/cpu = ""   (n0,r2)
>>>>>> /local/domain/2/cpu/0 = ""   (n0,r2)
>>>>>> /local/domain/2/cpu/0/availability = "online"   (n0,r2)
>>>>>> /local/domain/2/cpu/1 = ""   (n0,r2)
>>>>>> /local/domain/2/cpu/1/availability = "online"   (n0,r2)
>>>>>> /local/domain/2/memory = ""   (n0,r2)
>>>>>> /local/domain/2/memory/static-max = "1048576"   (n0,r2)
>>>>>> /local/domain/2/memory/target = "1048577"   (n0,r2)
>>>>>> /local/domain/2/memory/videoram = "-1"   (n0,r2)
>>>>>> /local/domain/2/device = ""   (n0,r2)
>>>>>> /local/domain/2/device/suspend = ""   (n0,r2)
>>>>>> /local/domain/2/device/suspend/event-channel = ""   (n2)
>>>>>> /local/domain/2/device/vif = ""   (n0,r2)
>>>>>> /local/domain/2/device/vif/0 = ""   (n2,r1)
>>>>>> /local/domain/2/device/vif/0/backend = "/local/domain/1/backend/vif/2/0"
>>>>>> (n2,r1)
>>>>>> /local/domain/2/device/vif/0/backend-id = "1"   (n2,r1)
>>>>>> /local/domain/2/device/vif/0/state = "6"   (n2,r1)
>>>>>> /local/domain/2/device/vif/0/handle = "0"   (n2,r1)
>>>>>> /local/domain/2/device/vif/0/mac = "00:16:3e:07:df:91"   (n2,r1)
>>>>>> /local/domain/2/device/vif/0/xdp-headroom = "0"   (n2,r1)
>>>>>> /local/domain/2/control = ""   (n0,r2)
>>>>>> /local/domain/2/control/shutdown = ""   (n2)
>>>>>> /local/domain/2/control/feature-poweroff = "1"   (n2)
>>>>>> /local/domain/2/control/feature-reboot = "1"   (n2)
>>>>>> /local/domain/2/control/feature-suspend = ""   (n2)
>>>>>> /local/domain/2/control/sysrq = ""   (n2)
>>>>>> /local/domain/2/control/platform-feature-multiprocessor-suspend = "1"
>>>>>> (n0,r2)
>>>>>> /local/domain/2/control/platform-feature-xs_reset_watches = "1"   (n0,r2)
>>>>>> /local/domain/2/data = ""   (n2)
>>>>>> /local/domain/2/drivers = ""   (n2)
>>>>>> /local/domain/2/feature = ""   (n2)
>>>>>> /local/domain/2/attr = ""   (n2)
>>>>>> /local/domain/2/error = ""   (n2)
>>>>>> /local/domain/2/error/device = ""   (n2)
>>>>>> /local/domain/2/error/device/vif = ""   (n2)
>>>>>> /local/domain/2/error/device/vif/0 = ""   (n2)
>>>>>> /local/domain/2/error/device/vif/0/error = "1 allocating event channel"
>>>>>> (n2)
>>>>> That's the real error. Your guest netfront fails to allocate the event
>>>>> channel. Do you get any messages in the guest dmesg after trying to
>>>>> attach the network interface?
>>>> Just these two lines:
>>>>
>>>> [  389.453390] vif vif-0: 1 allocating event channel
>>>> [  389.804135] vif vif-0: 1 allocating event channel
>>> Well, these are the error messages, from xenbus_alloc_evtchn().
>>> What's a little odd is that the error code is positive, but that's
>>> how -EPERM is logged. Is there perhaps a strange or broken XSM
>>> policy in use? I ask because evtchn_alloc_unbound() itself
>>> wouldn't return -EPERM afaics.
>> As you can see I'm pretty new to Xen. Furthermore, this is the first
>> time that I have heard about XSM, and since I did not change anything I
>> do not know what to answer!
> 
> Please can you attach the full output of `xl dmesg`, which will help
> answer this question.

# xl dmesg
(XEN) Checking for initrd in /chosen
(XEN) RAM:  - 7fef
(XEN) RAM: 0008

Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato
On 3/7/2022 3:36 PM, Jan Beulich wrote:
> On 07.03.2022 15:20, Andrea Stevanato wrote:
>> On 3/7/2022 12:46 PM, Roger Pau Monné wrote:
>>> On Mon, Mar 07, 2022 at 12:39:22PM +0100, Andrea Stevanato wrote:
>>>> /local/domain/2 = ""   (n0,r2)
>>>> /local/domain/2/vm = "/vm/f6dca20a-54bb-43af-9a62-67c55cb75708"   (n0,r2)
>>>> /local/domain/2/name = "guest1"   (n0,r2)
>>>> /local/domain/2/cpu = ""   (n0,r2)
>>>> /local/domain/2/cpu/0 = ""   (n0,r2)
>>>> /local/domain/2/cpu/0/availability = "online"   (n0,r2)
>>>> /local/domain/2/cpu/1 = ""   (n0,r2)
>>>> /local/domain/2/cpu/1/availability = "online"   (n0,r2)
>>>> /local/domain/2/memory = ""   (n0,r2)
>>>> /local/domain/2/memory/static-max = "1048576"   (n0,r2)
>>>> /local/domain/2/memory/target = "1048577"   (n0,r2)
>>>> /local/domain/2/memory/videoram = "-1"   (n0,r2)
>>>> /local/domain/2/device = ""   (n0,r2)
>>>> /local/domain/2/device/suspend = ""   (n0,r2)
>>>> /local/domain/2/device/suspend/event-channel = ""   (n2)
>>>> /local/domain/2/device/vif = ""   (n0,r2)
>>>> /local/domain/2/device/vif/0 = ""   (n2,r1)
>>>> /local/domain/2/device/vif/0/backend = "/local/domain/1/backend/vif/2/0"
>>>> (n2,r1)
>>>> /local/domain/2/device/vif/0/backend-id = "1"   (n2,r1)
>>>> /local/domain/2/device/vif/0/state = "6"   (n2,r1)
>>>> /local/domain/2/device/vif/0/handle = "0"   (n2,r1)
>>>> /local/domain/2/device/vif/0/mac = "00:16:3e:07:df:91"   (n2,r1)
>>>> /local/domain/2/device/vif/0/xdp-headroom = "0"   (n2,r1)
>>>> /local/domain/2/control = ""   (n0,r2)
>>>> /local/domain/2/control/shutdown = ""   (n2)
>>>> /local/domain/2/control/feature-poweroff = "1"   (n2)
>>>> /local/domain/2/control/feature-reboot = "1"   (n2)
>>>> /local/domain/2/control/feature-suspend = ""   (n2)
>>>> /local/domain/2/control/sysrq = ""   (n2)
>>>> /local/domain/2/control/platform-feature-multiprocessor-suspend = "1"
>>>> (n0,r2)
>>>> /local/domain/2/control/platform-feature-xs_reset_watches = "1"   (n0,r2)
>>>> /local/domain/2/data = ""   (n2)
>>>> /local/domain/2/drivers = ""   (n2)
>>>> /local/domain/2/feature = ""   (n2)
>>>> /local/domain/2/attr = ""   (n2)
>>>> /local/domain/2/error = ""   (n2)
>>>> /local/domain/2/error/device = ""   (n2)
>>>> /local/domain/2/error/device/vif = ""   (n2)
>>>> /local/domain/2/error/device/vif/0 = ""   (n2)
>>>> /local/domain/2/error/device/vif/0/error = "1 allocating event channel"
>>>> (n2)
>>>
>>> That's the real error. Your guest netfront fails to allocate the event
>>> channel. Do you get any messages in the guest dmesg after trying to
>>> attach the network interface?
>>
>> Just these two lines:
>>
>> [  389.453390] vif vif-0: 1 allocating event channel
>> [  389.804135] vif vif-0: 1 allocating event channel
> 
> Well, these are the error messages, from xenbus_alloc_evtchn().
> What's a little odd is that the error code is positive, but that's
> how -EPERM is logged. Is there perhaps a strange or broken XSM
> policy in use? I ask because evtchn_alloc_unbound() itself
> wouldn't return -EPERM afaics.

As you can see I'm pretty new to Xen. Furthermore, this is the first
time that I have heard about XSM, and since I did not change anything I
do not know what to answer! The only thing that I can tell is that
for both dom0 and guests I'm using the exact same kernel and rootfs.
 
> Jan

Cheers,
Andrea
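
As noted above, the leading number in "1 allocating event channel" is an
errno value logged without its sign. A minimal Python sketch of how to turn
such a string back into a symbolic name; decode_xenbus_error is a
hypothetical helper, and the mapping comes from Python's errno table on the
machine running the script, which is assumed to match the guest's:

```python
import errno
import re


def decode_xenbus_error(message):
    """Split '<code> <text>' and map the positive code to an errno name.

    Returns (errno_name, text), or None if the string has no leading number.
    """
    match = re.match(r"\s*(\d+)\s+(.*)", message)
    if match is None:
        return None
    code = int(match.group(1))
    return errno.errorcode.get(code, "unknown"), match.group(2)


print(decode_xenbus_error("1 allocating event channel"))
# prints: ('EPERM', 'allocating event channel')
```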



Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato
On 3/7/2022 12:46 PM, Roger Pau Monné wrote:
> On Mon, Mar 07, 2022 at 12:39:22PM +0100, Andrea Stevanato wrote:
>> /local/domain/2 = ""   (n0,r2)
>> /local/domain/2/vm = "/vm/f6dca20a-54bb-43af-9a62-67c55cb75708"   (n0,r2)
>> /local/domain/2/name = "guest1"   (n0,r2)
>> /local/domain/2/cpu = ""   (n0,r2)
>> /local/domain/2/cpu/0 = ""   (n0,r2)
>> /local/domain/2/cpu/0/availability = "online"   (n0,r2)
>> /local/domain/2/cpu/1 = ""   (n0,r2)
>> /local/domain/2/cpu/1/availability = "online"   (n0,r2)
>> /local/domain/2/memory = ""   (n0,r2)
>> /local/domain/2/memory/static-max = "1048576"   (n0,r2)
>> /local/domain/2/memory/target = "1048577"   (n0,r2)
>> /local/domain/2/memory/videoram = "-1"   (n0,r2)
>> /local/domain/2/device = ""   (n0,r2)
>> /local/domain/2/device/suspend = ""   (n0,r2)
>> /local/domain/2/device/suspend/event-channel = ""   (n2)
>> /local/domain/2/device/vif = ""   (n0,r2)
>> /local/domain/2/device/vif/0 = ""   (n2,r1)
>> /local/domain/2/device/vif/0/backend = "/local/domain/1/backend/vif/2/0"
>> (n2,r1)
>> /local/domain/2/device/vif/0/backend-id = "1"   (n2,r1)
>> /local/domain/2/device/vif/0/state = "6"   (n2,r1)
>> /local/domain/2/device/vif/0/handle = "0"   (n2,r1)
>> /local/domain/2/device/vif/0/mac = "00:16:3e:07:df:91"   (n2,r1)
>> /local/domain/2/device/vif/0/xdp-headroom = "0"   (n2,r1)
>> /local/domain/2/control = ""   (n0,r2)
>> /local/domain/2/control/shutdown = ""   (n2)
>> /local/domain/2/control/feature-poweroff = "1"   (n2)
>> /local/domain/2/control/feature-reboot = "1"   (n2)
>> /local/domain/2/control/feature-suspend = ""   (n2)
>> /local/domain/2/control/sysrq = ""   (n2)
>> /local/domain/2/control/platform-feature-multiprocessor-suspend = "1"
>> (n0,r2)
>> /local/domain/2/control/platform-feature-xs_reset_watches = "1"   (n0,r2)
>> /local/domain/2/data = ""   (n2)
>> /local/domain/2/drivers = ""   (n2)
>> /local/domain/2/feature = ""   (n2)
>> /local/domain/2/attr = ""   (n2)
>> /local/domain/2/error = ""   (n2)
>> /local/domain/2/error/device = ""   (n2)
>> /local/domain/2/error/device/vif = ""   (n2)
>> /local/domain/2/error/device/vif/0 = ""   (n2)
>> /local/domain/2/error/device/vif/0/error = "1 allocating event channel"
>> (n2)
> 
> That's the real error. Your guest netfront fails to allocate the event
> channel. Do you get any messages in the guest dmesg after trying to
> attach the network interface?

Just these two lines:

[  389.453390] vif vif-0: 1 allocating event channel
[  389.804135] vif vif-0: 1 allocating event channel
 
> Does the same happen if you don't use a driver domain and run the
> backend in dom0?

No, it does not. On dom0 everything is set up correctly. Here is the final
part of the xl -vvv devd -F output on dom0, which differs from the
output on guest0:

libxl: debug: libxl_event.c:1052:devstate_callback: backend 
/local/domain/0/backend/vif/1/0/state wanted state 2 ok
libxl: debug: libxl_event.c:850:libxl__ev_xswatch_deregister: watch 
w=0xca342470 wpath=/local/domain/0/backend/vif/1/0/state token=1/2: 
deregister slotnum=1
libxl: debug: libxl_device.c:1090:device_backend_callback: Domain 1:calling 
device_backend_cleanup
libxl: debug: libxl_event.c:864:libxl__ev_xswatch_deregister: watch 
w=0xca342470: deregister unregistered
libxl: debug: libxl_device.c:1191:device_hotplug: Domain 1:calling hotplug 
script: /etc/xen/scripts/vif-bridge online
libxl: debug: libxl_device.c:1192:device_hotplug: Domain 1:extra args:
libxl: debug: libxl_device.c:1198:device_hotplug: Domain 1: type_if=vif
libxl: debug: libxl_device.c:1200:device_hotplug: Domain 1:env:
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: script: 
/etc/xen/scripts/vif-bridge
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: XENBUS_TYPE: vif
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: XENBUS_PATH: 
backend/vif/1/0
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: 
XENBUS_BASE_PATH: backend
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: netdev:
libxl: debug: libxl_device.c:1207:device_hotplug: Domain 1: vif: vif1.0
libxl: debug: libxl_aoutils.c:593:libxl__async_exec_start: forking to execute: 
/etc/xen/scripts/vif-bridge online

> 
> Regards, Roger.

Cheers,
Andrea.
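
For reading the state values in the dumps above (the frontend's
state = "6" and the "wanted state 2" in the dom0 log), here is a minimal
Python sketch of the XenbusState numbering as I recall it from Xen's public
io/xenbus.h header. The Python names are my own spelling of those constants,
and describe_state is a hypothetical helper:

```python
from enum import IntEnum


class XenbusState(IntEnum):
    # Numbering assumed to follow enum xenbus_state in io/xenbus.h.
    UNKNOWN = 0
    INITIALISING = 1
    INIT_WAIT = 2
    INITIALISED = 3
    CONNECTED = 4
    CLOSING = 5
    CLOSED = 6
    RECONFIGURING = 7
    RECONFIGURED = 8


def describe_state(value):
    """Map a xenstore 'state' string such as "6" to its symbolic name."""
    return XenbusState(int(value)).name


print(describe_state("6"), describe_state("2"))  # prints: CLOSED INIT_WAIT
```

Under that numbering, the frontend in the failing case has gone to Closed,
while the dom0 backend in the working case reaches InitWait before the
hotplug script runs.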



Re: Network driver domain broken

2022-03-07 Thread Andrea Stevanato

On 3/7/22 12:22, Roger Pau Monné wrote:

On Fri, Mar 04, 2022 at 02:46:37PM +0100, Andrea Stevanato wrote:

On 3/4/2022 1:27 PM, Roger Pau Monné wrote:

On Fri, Mar 04, 2022 at 01:05:55PM +0100, Andrea Stevanato wrote:

On 3/4/2022 12:52 PM, Roger Pau Monné wrote:

On Thu, Mar 03, 2022 at 01:08:31PM -0500, Jason Andryuk wrote:

On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  wrote:


On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

According to the conversation that I had with royger, aa67b97ed34 broke
driver domain support.

What I'm trying to do is set up networking between guests using a driver
domain. Therefore, the driver guest was started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned a static IP and started udhcpd on the
xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]
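
The vif entry above is a comma-separated list of key=value settings. A
minimal Python sketch of how such a string breaks down; parse_vif_spec is a
hypothetical helper, not part of xl, and it only handles the plain key=value
form used here:

```python
def parse_vif_spec(spec):
    """Split a vif spec like 'bridge=xenbr0, backend=guest0' into a dict."""
    fields = {}
    for item in spec.split(","):
        # Tolerate the spaces after commas seen in the cfg above.
        key, _, value = item.strip().partition("=")
        if key:
            fields[key] = value
    return fields


print(parse_vif_spec("bridge=xenbr0, backend=guest0"))
# prints: {'bridge': 'xenbr0', 'backend': 'guest0'}
```

Here backend=guest0 is what points the interface at the driver domain
instead of dom0.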

Here is the result of strace xl devd:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0



ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM (Operation 
not permitted)
write(2, "libxl: ", 7libxl: )  = 7
write(2, "error: ", 7error: )  = 7
write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of 
cpus) = 87
write(2, "\n", 1
)   = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x9ee7a0e0) = 814
wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, si_uid=0, 
si_status=0, si_utime=2, si_stime=2} ---


xl devd is daemonizing, but strace is only following the first
process.  Use `strace xl devd -F` to prevent the daemonizing (or
`strace -f xl devd` to follow children).


Or as a first step try to see what kind of messages you get from `xl
devd -F` when trying to attach a device using the driver domain.


Nothing has changed. On guest0 (the driver domain):

# xl devd -F
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
[  696.805619] xenbr0: port 1(vif2.0) entered blocking state
[  696.810334] xenbr0: port 1(vif2.0) entered disabled state
[  696.824518] device vif2.0 entered promiscuous mode


Can you use `xl -vvv devd -F` here?


# xl -vvv devd -F
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: debug: libxl_device.c:1749:libxl_device_events_handler: ao
0xece52130: create: how=(nil) callback=(nil) poller=0xece52430
libxl: debug: libxl_event.c:813:libxl__ev_xswatch_register: watch
w=0xe628caf8 wpath=/local/domain/1/backend token=3/0: register slotnum=3
libxl: debug: libxl_device.c:1806:libxl_device_events_handler: ao
0xece52130: inprogress: poller=0xece52430, flags=i
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8
wpath=/local/domain/1/backend token=3/0: event epath=/local/domain/1/backend
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 0xece51b90:
nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xece51b90:
destroy
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8
wpath=/local/domain/1/backend token=3/0: event
epath=/local/domain/1/backend/vif/2/0
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 0xece4e7b0:
nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xece4e7b0:
destroy
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8
wpath=/local/domain/1/backend token=3/0: event
epath=/local/domain/1/backend/vif/2
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 0xece4e990:
nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xaaa

Re: Network driver domain broken

2022-03-04 Thread Andrea Stevanato
On 04/03/2022 14:46, Andrea Stevanato wrote:
> On 3/4/2022 1:27 PM, Roger Pau Monné wrote:
>> On Fri, Mar 04, 2022 at 01:05:55PM +0100, Andrea Stevanato wrote:
>>> On 3/4/2022 12:52 PM, Roger Pau Monné wrote:
>>>> On Thu, Mar 03, 2022 at 01:08:31PM -0500, Jason Andryuk wrote:
>>>>> On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  
>>>>> wrote:
>>>>>>
>>>>>> On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:
>>>>>>> On 03/03/2022 15:54, Andrea Stevanato wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> According to the conversation that I had with royger, aa67b97ed34
>>>>>>>> broke driver domain support.
>>>>>>>>
>>>>>>>> What I'm trying to do is set up networking between guests using a
>>>>>>>> driver domain. Therefore, the driver guest was started with the
>>>>>>>> following cfg:
>>>>>>>>
>>>>>>>> name    = "guest0"
>>>>>>>> kernel  = "/media/sd-mmcblk0p1/Image"
>>>>>>>> ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
>>>>>>>> extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
>>>>>>>> memory  = 1024
>>>>>>>> vcpus   = 2
>>>>>>>> driver_domain = 1
>>>>>>>>
>>>>>>>> On guest0 I created the bridge, assigned a static IP and started
>>>>>>>> udhcpd on the xenbr0 interface.
>>>>>>>> The second guest was started with the following cfg:
>>>>>>>>
>>>>>>>> name    = "guest1"
>>>>>>>> kernel  = "/media/sd-mmcblk0p1/Image"
>>>>>>>> ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
>>>>>>>> extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
>>>>>>>> memory  = 1024
>>>>>>>> vcpus   = 2
>>>>>>>> vif = [ 'bridge=xenbr0, backend=guest0' ]
>>>>>>>>
>>>>>>>> Here is the result of strace xl devd:
>>>>>>>>
>>>>>>>> # strace xl devd
>>>>>>>> execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0
>>>>>
>>>>>>>> ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM 
>>>>>>>> (Operation not permitted)
>>>>>>>> write(2, "libxl: ", 7libxl: )  = 7
>>>>>>>> write(2, "error: ", 7error: )  = 7
>>>>>>>> write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
>>>>>>>> 87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the 
>>>>>>>> maximum number of cpus) = 87
>>>>>>>> write(2, "\n", 1
>>>>>>>> )   = 1
>>>>>>>> clone(child_stack=NULL, 
>>>>>>>> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
>>>>>>>> child_tidptr=0x9ee7a0e0) = 814
>>>>>>>> wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
>>>>>>>> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, 
>>>>>>>> si_uid=0, si_status=0, si_utime=2, si_stime=2} ---
>>>>>
>>>>> xl devd is daemonizing, but strace is only following the first
>>>>> process.  Use `strace xl devd -F` to prevent the daemonizing (or
>>>>> `strace -f xl devd` to follow children).
>>>>
>>>> Or as a first step try to see what kind of messages you get from `xl
>>>> devd -F` when trying to attach a device using the driver domain.
>>>
>>> Nothing has changed. On guest0 (the driver domain):
>>>
>>> # xl devd -F
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
>>> the maximum number of cpus
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
>>> the maximum number of cpus
>>> libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
>>> the maximum number of cpus

Re: Network driver domain broken

2022-03-04 Thread Andrea Stevanato

On 3/4/2022 1:27 PM, Roger Pau Monné wrote:

On Fri, Mar 04, 2022 at 01:05:55PM +0100, Andrea Stevanato wrote:

On 3/4/2022 12:52 PM, Roger Pau Monné wrote:

On Thu, Mar 03, 2022 at 01:08:31PM -0500, Jason Andryuk wrote:

On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  wrote:


On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

According to the conversation that I had with royger, aa67b97ed34 broke
driver domain support.

What I'm trying to do is set up networking between guests using a driver
domain. Therefore, the driver guest was started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned a static IP and started udhcpd on the
xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]

Here is the result of strace xl devd:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0



ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM (Operation 
not permitted)
write(2, "libxl: ", 7libxl: )  = 7
write(2, "error: ", 7error: )  = 7
write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of 
cpus) = 87
write(2, "\n", 1
)   = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x9ee7a0e0) = 814
wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, si_uid=0, 
si_status=0, si_utime=2, si_stime=2} ---


xl devd is daemonizing, but strace is only following the first
process.  Use `strace xl devd -F` to prevent the daemonizing (or
`strace -f xl devd` to follow children).


Or as a first step try to see what kind of messages you get from `xl
devd -F` when trying to attach a device using the driver domain.


Nothing has changed. On guest0 (the driver domain):

# xl devd -F
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve
the maximum number of cpus
[  696.805619] xenbr0: port 1(vif2.0) entered blocking state
[  696.810334] xenbr0: port 1(vif2.0) entered disabled state
[  696.824518] device vif2.0 entered promiscuous mode


Can you use `xl -vvv devd -F` here?


# xl -vvv devd -F
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus
libxl: debug: libxl_device.c:1749:libxl_device_events_handler: ao 
0xece52130: create: how=(nil) callback=(nil) poller=0xece52430
libxl: debug: libxl_event.c:813:libxl__ev_xswatch_register: watch 
w=0xe628caf8 wpath=/local/domain/1/backend token=3/0: register slotnum=3
libxl: debug: libxl_device.c:1806:libxl_device_events_handler: ao 
0xece52130: inprogress: poller=0xece52430, flags=i
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8 
wpath=/local/domain/1/backend token=3/0: event epath=/local/domain/1/backend
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 
0xece51b90: nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xece51b90: 
destroy
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8 
wpath=/local/domain/1/backend token=3/0: event 
epath=/local/domain/1/backend/vif/2/0
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 
0xece4e7b0: nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xece4e7b0: 
destroy
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf8 
wpath=/local/domain/1/backend token=3/0: event 
epath=/local/domain/1/backend/vif/2
libxl: debug: libxl_event.c:2445:libxl__nested_ao_create: ao 
0xece4e990: nested ao, parent 0xece52130
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xece4e990: 
destroy
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xe628caf

Re: Network driver domain broken

2022-03-04 Thread Andrea Stevanato

On 3/4/2022 12:52 PM, Roger Pau Monné wrote:

On Thu, Mar 03, 2022 at 01:08:31PM -0500, Jason Andryuk wrote:

On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  wrote:


On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

According to the conversation that I had with royger, aa67b97ed34 broke
driver domain support.

What I'm trying to do is set up networking between guests using a driver
domain. Therefore, the driver guest was started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned a static IP and started udhcpd on the
xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]

Here is the result of strace xl devd:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0



ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM (Operation 
not permitted)
write(2, "libxl: ", 7libxl: )  = 7
write(2, "error: ", 7error: )  = 7
write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of 
cpus) = 87
write(2, "\n", 1
)   = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x9ee7a0e0) = 814
wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, si_uid=0, 
si_status=0, si_utime=2, si_stime=2} ---


xl devd is daemonizing, but strace is only following the first
process.  Use `strace xl devd -F` to prevent the daemonizing (or
`strace -f xl devd` to follow children).


Or as a first step try to see what kind of messages you get from `xl
devd -F` when trying to attach a device using the driver domain.


Nothing has changed. On guest0 (the driver domain):

# xl devd -F
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus
libxl: error: libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to 
retrieve the maximum number of cpus

[  696.805619] xenbr0: port 1(vif2.0) entered blocking state
[  696.810334] xenbr0: port 1(vif2.0) entered disabled state
[  696.824518] device vif2.0 entered promiscuous mode

While on dom0:

# xl network-list guest1
Idx BE Mac Addr.         handle state evt-ch tx-/rx-ring-ref BE-path
0   1  00:16:3e:18:52:ac      0     6     -1           -1/-1 /local/domain/1/backend/vif/2/0


Running the same command under strace gives the following output:

# strace xl devd -F
execve("/usr/sbin/xl", ["xl", "devd", "-F"], 0xeed242a0 /* 13 vars 
*/) = 0

brk(NULL)   = 0xaaab092a8000
faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = -1 ENOENT (No such 
file or directory)

openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=7840, ...}) = 0
mmap(NULL, 7840, PROT_READ, MAP_PRIVATE, 3, 0) = 0x986e2000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxlutil.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0200\0\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=68168, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x986e
mmap(NULL, 131784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x98694000

mprotect(0x986a3000, 65536, PROT_NONE) = 0
mmap(0x986b3000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x986b3000

close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxenlight.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0`\16\2\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=861848, ...}) = 0
mmap(NULL, 925752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x985b1000

mprotect(0x9867e000, 61440, PROT_NONE) = 0
mmap(0x9868d000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xcc000) = 0x9868d000
mmap(0x98693000, 56, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXE

Re: Network driver domain broken

2022-03-04 Thread Andrea Stevanato

On 3/3/2022 7:08 PM, Jason Andryuk wrote:

On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  wrote:


On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

according to a conversation I had with royger, commit aa67b97ed34 broke 
driver domain support.

What I'm trying to do is set up networking between guests using a driver 
domain. The driver guest was therefore started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned it a static IP, and started udhcpd 
on the xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]

The result of strace xl devd follows:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0



ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM (Operation 
not permitted)
write(2, "libxl: ", 7libxl: )  = 7
write(2, "error: ", 7error: )  = 7
write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of 
cpus) = 87
write(2, "\n", 1
)   = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x9ee7a0e0) = 814
wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, si_uid=0, 
si_status=0, si_utime=2, si_stime=2} ---


xl devd is daemonizing, but strace is only following the first
process.  Use `strace xl devd -F` to prevent the daemonizing (or
`strace -f xl devd` to follow children).


Sorry, I had not read this part.

# strace xl devd -F
execve("/usr/sbin/xl", ["xl", "devd", "-F"], 0xc53b6e50 /* 13 vars 
*/) = 0

brk(NULL)   = 0xaaab058a
faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = -1 ENOENT (No such 
file or directory)

openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=7840, ...}) = 0
mmap(NULL, 7840, PROT_READ, MAP_PRIVATE, 3, 0) = 0x833c7000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxlutil.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0200\0\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=68168, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) 
= 0x833c5000
mmap(NULL, 131784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x83379000

mprotect(0x83388000, 65536, PROT_NONE) = 0
mmap(0x83398000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x83398000

close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxenlight.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0`\16\2\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=861848, ...}) = 0
mmap(NULL, 925752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x83296000

mprotect(0x83363000, 61440, PROT_NONE) = 0
mmap(0x83372000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xcc000) = 0x83372000
mmap(0x83378000, 56, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x83378000

close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxentoollog.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0P\r\0\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=10368, ...}) = 0
mmap(NULL, 73904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x83283000

mprotect(0x83285000, 61440, PROT_NONE) = 0
mmap(0x83294000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x83294000

close(3)= 0
openat(AT_FDCWD, "/usr/lib/libyajl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\320\22\0\0\0\0\0\0"..., 
832) = 832

fstat(3, {st_mode=S_IFREG|0755, st_size=38728, ...}) = 0
mmap(NULL, 102416, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) 
= 0x83269000

mprotect(0x83272000, 61

Re: Network driver domain broken

2022-03-04 Thread Andrea Stevanato

On 03/03/2022 19:08, Jason Andryuk wrote:

On Thu, Mar 3, 2022 at 11:34 AM Roger Pau Monné  wrote:


On Thu, Mar 03, 2022 at 05:01:23PM +0100, Andrea Stevanato wrote:

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

according to a conversation I had with royger, commit aa67b97ed34 broke 
driver domain support.

What I'm trying to do is set up networking between guests using a driver 
domain. The driver guest was therefore started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned it a static IP, and started udhcpd 
on the xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]

The result of strace xl devd follows:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0



ioctl(5, _IOC(_IOC_NONE, 0x50, 0, 0x30), 0xe6e41b40) = -1 EPERM (Operation 
not permitted)
write(2, "libxl: ", 7libxl: )  = 7
write(2, "error: ", 7error: )  = 7
write(2, "libxl_utils.c:820:libxl_cpu_bitm"..., 
87libxl_utils.c:820:libxl_cpu_bitmap_alloc: failed to retrieve the maximum number of 
cpus) = 87
write(2, "\n", 1
)   = 1
clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, 
child_tidptr=0x9ee7a0e0) = 814
wait4(814, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 814
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=814, si_uid=0, 
si_status=0, si_utime=2, si_stime=2} ---


xl devd is daemonizing, but strace is only following the first
process.  Use `strace xl devd -F` to prevent the daemonizing (or
`strace -f xl devd` to follow children).


close(6)= 0
close(5)= 0
munmap(0x9f45f000, 4096)= 0
close(7)= 0
close(10)   = 0
close(9)= 0
close(8)= 0
close(11)   = 0
close(3)= 0
close(4)= 0
exit_group(0)   = ?
+++ exited with 0 +++

royger told me that this is a bug and not an issue with my setup, so here 
I am.


Just a bit more context: AFAICT the calls to libxl_cpu_bitmap_alloc in
parse_global_config will prevent xl from being usable on anything
other than the control domain (because the underlying sysctl is only
available to privileged domains). This is an issue for 'xl devd', as it
won't start anymore.


These look non-fatal at first glance?

Regards,
Jason


Well, actually, this prevents me from creating network driver domains 
for inter-guest networking (no passthrough is required, since the guests 
do not need to reach the outside).


Cheers,
Andrea



Re: Network driver domain broken

2022-03-03 Thread Andrea Stevanato

On 03/03/2022 15:54, Andrea Stevanato wrote:

Hi all,

according to a conversation I had with royger, commit aa67b97ed34 broke 
driver domain support.

What I'm trying to do is set up networking between guests using a driver 
domain. The driver guest was therefore started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned it a static IP, and started udhcpd 
on the xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]

The result of strace xl devd follows:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0
brk(NULL)   = 0xeaf3b000
faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or 
directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=7840, ...}) = 0
mmap(NULL, 7840, PROT_READ, MAP_PRIVATE, 3, 0) = 0x9f45e000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxlutil.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0200\0\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=68168, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x9f45c000
mmap(NULL, 131784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f41
mprotect(0x9f41f000, 65536, PROT_NONE) = 0
mmap(0x9f42f000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x9f42f000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxenlight.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0`\16\2\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=861848, ...}) = 0
mmap(NULL, 925752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f32d000
mprotect(0x9f3fa000, 61440, PROT_NONE) = 0
mmap(0x9f409000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xcc000) = 0x9f409000
mmap(0x9f40f000, 56, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x9f40f000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxentoollog.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0P\r\0\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=10368, ...}) = 0
mmap(NULL, 73904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f31a000
mprotect(0x9f31c000, 61440, PROT_NONE) = 0
mmap(0x9f32b000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x9f32b000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libyajl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\320\22\0\0\0\0\0\0"..., 832) 
= 832
fstat(3, {st_mode=S_IFREG|0755, st_size=38728, ...}) = 0
mmap(NULL, 102416, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f30
mprotect(0x9f309000, 61440, PROT_NONE) = 0
mmap(0x9f318000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x8000) = 0x9f318000
close(3)= 0
openat(AT_FDCWD, "/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\300j\0\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=113184, ...}) = 0
mmap(NULL, 192872, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f2d
mprotect(0x9f2ea000, 65536, PROT_NONE) = 0
mmap(0x9f2fa000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a000) = 0x9f2fa000
mmap(0x9f2fc000, 12648, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x9f2fc000
close(3)= 0
openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\320I\2\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=1428872, ...}) = 0
mmap(NULL, 1502000, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f161000
mprotect(0x9f2b8000, 61440, PROT_NONE) = 0
mmap(0x9f2c7000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_F

Network driver domain broken

2022-03-03 Thread Andrea Stevanato
Hi all,

according to a conversation I had with royger, commit aa67b97ed34 broke 
driver domain support.

What I'm trying to do is set up networking between guests using a driver 
domain. The driver guest was therefore started with the following cfg:

name    = "guest0"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
driver_domain = 1

On guest0 I created the bridge, assigned it a static IP, and started udhcpd 
on the xenbr0 interface.
The second guest was started with the following cfg:

name    = "guest1"
kernel  = "/media/sd-mmcblk0p1/Image"
ramdisk = "/media/sd-mmcblk0p1/rootfs.cpio.gz"
extra   = "console=hvc0 rdinit=/sbin/init root=/dev/ram0"
memory  = 1024
vcpus   = 2
vif = [ 'bridge=xenbr0, backend=guest0' ]
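For reference, the bridge setup on guest0 described above might look roughly like the sketch below. Only the bridge name `xenbr0` comes from this thread; the address, the udhcpd config path, and the helper names `run` and `setup_bridge` are assumptions for illustration. By default the script only prints the commands; set `APPLY=1` to actually run them as root.

```shell
#!/bin/sh
# Sketch of the bridge setup on guest0. BRIDGE matches the thread; the
# address and the udhcpd config path are assumptions for illustration.
BRIDGE=${BRIDGE:-xenbr0}
BRIDGE_ADDR=${BRIDGE_ADDR:-192.168.100.1/24}
UDHCPD_CONF=${UDHCPD_CONF:-/etc/udhcpd.conf}

# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = 1 ]; then "$@"; else echo "$*"; fi
}

setup_bridge() {
    run ip link add name "$BRIDGE" type bridge
    run ip addr add "$BRIDGE_ADDR" dev "$BRIDGE"
    run ip link set "$BRIDGE" up
    # The udhcpd config file is expected to name the bridge in its
    # "interface" line and to define the lease range.
    run udhcpd "$UDHCPD_CONF"
}

setup_bridge
```

If the `ip` applet on the image cannot create bridges, `brctl addbr` is the usual alternative for the first step.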

The result of strace xl devd follows:

# strace xl devd
execve("/usr/sbin/xl", ["xl", "devd"], 0xdf0420c8 /* 13 vars */) = 0
brk(NULL)   = 0xeaf3b000
faccessat(AT_FDCWD, "/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or 
directory)
openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=7840, ...}) = 0
mmap(NULL, 7840, PROT_READ, MAP_PRIVATE, 3, 0) = 0x9f45e000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxlutil.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\0200\0\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=68168, ...}) = 0
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 
0x9f45c000
mmap(NULL, 131784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f41
mprotect(0x9f41f000, 65536, PROT_NONE) = 0
mmap(0x9f42f000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xf000) = 0x9f42f000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxenlight.so.4.14", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0`\16\2\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=861848, ...}) = 0
mmap(NULL, 925752, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f32d000
mprotect(0x9f3fa000, 61440, PROT_NONE) = 0
mmap(0x9f409000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0xcc000) = 0x9f409000
mmap(0x9f40f000, 56, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x9f40f000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxentoollog.so.1", O_RDONLY|O_CLOEXEC) = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0P\r\0\0\0\0\0\0"..., 
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=10368, ...}) = 0
mmap(NULL, 73904, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f31a000
mprotect(0x9f31c000, 61440, PROT_NONE) = 0
mmap(0x9f32b000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1000) = 0x9f32b000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libyajl.so.2", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\320\22\0\0\0\0\0\0"..., 832) 
= 832
fstat(3, {st_mode=S_IFREG|0755, st_size=38728, ...}) = 0
mmap(NULL, 102416, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f30
mprotect(0x9f309000, 61440, PROT_NONE) = 0
mmap(0x9f318000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x8000) = 0x9f318000
close(3)= 0
openat(AT_FDCWD, "/lib/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\300j\0\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=113184, ...}) = 0
mmap(NULL, 192872, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f2d
mprotect(0x9f2ea000, 65536, PROT_NONE) = 0
mmap(0x9f2fa000, 8192, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1a000) = 0x9f2fa000
mmap(0x9f2fc000, 12648, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x9f2fc000
close(3)= 0
openat(AT_FDCWD, "/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
read(3, 
"\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0\267\0\1\0\0\0\320I\2\0\0\0\0\0"..., 832) = 
832
fstat(3, {st_mode=S_IFREG|0755, st_size=1428872, ...}) = 0
mmap(NULL, 1502000, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 
0x9f161000
mprotect(0x9f2b8000, 61440, PROT_NONE) = 0
mmap(0x9f2c7000, 24576, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x156000) = 0x9f2c7000
mmap(0x9f2cd000, 11056, PROT_READ|PROT_WRITE, 
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x9f2cd000
close(3)= 0
openat(AT_FDCWD, "/usr/lib/libxenevtchn.so.1", O_RDONLY|O_CLOEXEC) = 3