On Mon, 1 Aug 2016 16:18:20 +0000
chris thompson <[email protected]> wrote:

>  
> From: [email protected]
> To: [email protected]
> Date: Mon, 1 Aug 2016 16:16:00 +0000
> CC: [email protected]
> Subject: Re: [vfio-users] Cannot register eventfd with MSI/MSI-X interrupts
> 
> 
> 
> 
>  
> > Date: Mon, 1 Aug 2016 07:06:18 -0600
> > From: [email protected]
> > To: [email protected]
> > CC: [email protected]
> > Subject: Re: [vfio-users] Cannot register eventfd with MSI/MSI-X interrupts
> > 
> > On Mon, 1 Aug 2016 10:24:52 +0000
> > chris thompson <[email protected]> wrote:
> >   
> > >  Hi,
> > >  
> > > I'm trying to talk to my PCI-e card (USB3 host controller) from userland, 
> > > and have been successful accessing the PCI Config registers and device 
> > > memory, and able to register eventfd file descriptors for legacy 
> > > interrupts (IRQx) and Error interrupts - including stimulating these via 
> > > the ioctl calls. Unfortunately the MSI and MSI-X interrupts return EINVAL 
> > > (invalid argument) when trying to register them.
> > >  
> > > Sample from the code:
> > > printf("\nInterrupts:\n");
> > > int * irq_fds[VFIO_PCI_NUM_IRQS];
> > > int num_irqs[VFIO_PCI_NUM_IRQS];
> > > for (i = 0; i < device_info.num_irqs; i++) {
> > >   struct vfio_irq_info irq = { .argsz = sizeof(irq) };
> > >   irq.index = i;
> > >   ioctl(device, VFIO_DEVICE_GET_IRQ_INFO, &irq);
> > >   printf("argsz 0x%x, flags 0x%x, index 0x%x, count 0x%x,\n", irq.argsz, 
> > > irq.flags, irq.index, irq.count);
> > >   num_irqs[i] = irq.count;
> > >   irq_fds[i]=(int*)malloc(irq.count * sizeof(int));
> > >   int j;
> > >   for(j=0; j<irq.count; j++)
> > >   {
> > >     irq_fds[i][j] = eventfd(0, 0);
> > >     printf("config IRQ %u : %u\n", i, j);
> > >     struct vfio_irq_set *irq_setup;
> > >     unsigned int size = sizeof(struct vfio_irq_set) + sizeof(int);
> > >     irq_setup = malloc(size);
> > >     irq_setup->argsz = size;
> > >     irq_setup->flags = VFIO_IRQ_SET_DATA_EVENTFD | 
> > > VFIO_IRQ_SET_ACTION_TRIGGER ;
> > >     irq_setup->index = i;
> > >     irq_setup->start = j;
> > >     irq_setup->count = 1;
> > >     *(int*)(irq_setup->data)=irq_fds[i][j];
> > >     int ret = ioctl(device, VFIO_DEVICE_SET_IRQS, irq_setup);
> > >     if(ret){ printf("failed to register interrupt %u : %u, error %i 
> > > %s\n", i, j, ret, strerror(errno)); }
> > >     free(irq_setup);
> > >   }
> > > }
> > >  
> > > Which gives the output:
> > >  
> > > Interrupts:
> > > argsz 0x10, flags 0x7, index 0x0, count 0x1,
> > > config IRQ 0 : 0
> > > argsz 0x10, flags 0x9, index 0x1, count 0x40,
> > > config IRQ 1 : 0
> > > failed to register interrupt 1 : 0, error -1 Invalid argument
> > > config IRQ 1 : 1
> > > failed to register interrupt 1 : 1, error -1 Invalid argument
> > > config IRQ 1 : 2
> > > ...
> > > 
> > > Does anyone know a likely reason that I can't register these? From what 
> > > I've read my IOMMUs all support IRQ remapping:
> > > [    0.016443] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap 
> > > c0000020e60262 ecap f0101a
> > > [    0.016450] dmar: IOMMU 1: reg_base_addr fed91000 ver 1:0 cap 
> > > c9008020660262 ecap f0105a
> > > 
> > > The hardware is an Intel i5 2500 in an HP 8200 Elite small form-factor 
> > > PC, running Ubuntu 14.04 on the 3.13 kernel.  
> > 
> > 
> > include/uapi/linux/vfio.h:
> >  * The NORESIZE flag indicates that the interrupt lines within the index
> >  * are setup as a set and new subindexes cannot be enabled without first
> >  * disabling the entire index.  This is used for interrupts like PCI MSI
> >  * and MSI-X where the driver may only use a subset of the available
> >  * indexes, but VFIO needs to enable a specific number of vectors
> >  * upfront.  In the case of MSI-X, where the user can enable MSI-X and
> >  * then add and unmask vectors, it's up to userspace to make the decision
> >  * whether to allocate the maximum supported number of vectors or tear
> >  * down setup and incrementally increase the vectors as each is enabled.
> > 
> > struct vfio_irq_info {
> >         __u32   argsz;
> >         __u32   flags;
> > #define VFIO_IRQ_INFO_EVENTFD           (1 << 0)
> > #define VFIO_IRQ_INFO_MASKABLE          (1 << 1)
> > #define VFIO_IRQ_INFO_AUTOMASKED        (1 << 2)
> > #define VFIO_IRQ_INFO_NORESIZE          (1 << 3)   <----
> >         __u32   index;          /* IRQ index */
> >         __u32   count;          /* Number of IRQs within this index */
> > };  
>  Hi Alex, Thanks for your suggestions, based on the same documentation I 
> should be able to disable the interrupts for the set using:"... or the index 
> can be disabled as a whole with: flags = (DATA_NONE|ACTION_TRIGGER), count = 
> 0."and then bind them to eventfd as a group (I was originally trying to bind 
> them as a group with the same results). However: printf("\nInterrupts:\n");
> int * irq_fds[VFIO_PCI_NUM_IRQS];
> int num_irqs[VFIO_PCI_NUM_IRQS];
> for (i = 0; i < device_info.num_irqs; i++) {
>     struct vfio_irq_info irq = { .argsz = sizeof(irq) };    irq.index = i;    
> ioctl(device, VFIO_DEVICE_GET_IRQ_INFO, &irq);
>     printf("argsz 0x%x, flags 0x%x, index 0x%x, count 0x%x,\n", irq.argsz, 
> irq.flags, irq.index, irq.count);    num_irqs[i] = irq.count;
>     irq_fds[i]=(int*)malloc(irq.count * sizeof(int));
>     int j;
>     for(j=0; j<irq.count; j++) irq_fds[i][j] = eventfd(0, 0);    
> printf("config IRQ set %u\n", i);
>     struct vfio_irq_set *irq_setup;
>     unsigned int size = sizeof(struct vfio_irq_set) + (irq.count*sizeof(int));
>     irq_setup = malloc(size);
>     irq_setup->argsz = size;
>     irq_setup->flags = VFIO_IRQ_SET_DATA_NONE | VFIO_IRQ_SET_ACTION_TRIGGER ;
>     irq_setup->index = i;
>     irq_setup->start = 0;
>     irq_setup->count = irq.count;
>     int ret = ioctl(device, VFIO_DEVICE_SET_IRQS, irq_setup);

If this was an attempt to disable, then irq_setup->count should have
been set to 0 here.  This results in the -EINVAL return here.

>     if(ret){ printf("failed to unregister register interrupt set %u error %i 
> %s\n", i, ret, strerror(errno)); }    irq_setup->flags = 
> VFIO_IRQ_SET_DATA_EVENTFD | VFIO_IRQ_SET_ACTION_TRIGGER ;
>     irq_setup->index = i;
>     irq_setup->start = 0;
>     irq_setup->count = irq.count;
>     int j;
>     for(j=0; j<irq.count; j++) ((int*)(irq_setup->data))[j]=irq_fds[i][j];    
> ret = ioctl(device, VFIO_DEVICE_SET_IRQS, irq_setup);
>     if(ret){ printf("failed to register interrupts %u : %u-%u, error %i 
> %s\n", i, 0, irq.count-1, ret, strerror(errno)); }
>     free(irq_setup);
> } 
> Results in:
>  
> Interrupts:
> argsz 0x10, flags 0x7, index 0x0, count 0x1,
> config IRQ 0 : 0
> failed to unregister register interrupt set 0 error -1 Invalid argument

Here the first ioctl failed because count != 0, the second ioctl
worked, so the device is running in INTx mode.
 
> argsz 0x10, flags 0x9, index 0x1, count 0x40,
> config IRQ 1 : 0
> failed to unregister register interrupt set 1 error -1 Invalid argument

This fails because count != 0

> failed to register interrupts 1 : 0-63, error -1 Invalid argument

This fails because the device is running in INTx mode and multiple
interrupt modes cannot be simultaneously enabled.

> argsz 0x10, flags 0x9, index 0x2, count 0x8,
> config IRQ 2 : 0
> failed to unregister register interrupt set 2 error -1 Invalid argument
> failed to register interrupts 2 : 0-7, error -1 Invalid argument

Same as the MSI index.
 
> argsz 0x10, flags 0x9, index 0x3, count 0x1,
> config IRQ 3 : 0
> Killed
>  
> This Killed at the IOCtl to the error IRQ line is concerning, and a second 
> attempt results in:
>  
> Interrupts:
> argsz 0x10, flags 0x7, index 0x0, count 0x1,
> config IRQ 0 : 0
> 
>      
> then the process hangs in the unregister ioctl. If I force kill the process 
> then I can't open the device again without rebooting my machine.
>  
> Any ideas? I've probably done something obviously wrong.
>  
> Thanks and regards,
> Chris
> P.S. Missed something important from dmesg too:  
> [ 1334.286951] VFIO - User Level meta-driver version: 0.3
> [ 1383.701974] xhci_hcd 0000:02:00.0: remove, state 4
> [ 1383.701984] usb usb4: USB disconnect, device number 1
> [ 1383.705976] xhci_hcd 0000:02:00.0: USB bus 4 deregistered
> [ 1383.705984] xhci_hcd 0000:02:00.0: remove, state 4
> [ 1383.705989] usb usb3: USB disconnect, device number 1
> [ 1383.712050] xhci_hcd 0000:02:00.0: USB bus 3 deregistered
> [ 1498.085560] vfio_pin_pages: RLIMIT_MEMLOCK (65536) exceeded
> [ 1499.112900] BUG: unable to handle kernel NULL pointer dereference at       
>     (null)
> [ 1499.112954] IP: [<ffffffffa071b6f9>] vfio_pci_set_err_trigger+0x19/0x130 
> [vfio_pci]
> [ 1499.113000] PGD 362e7067 PUD 35172067 PMD 0
> [ 1499.113028] Oops: 0000 [#1] SMP
> [ 1499.113049] Modules linked in: vfio_iommu_type1 vfio_pci vfio nfsv3 nfsv4 
> autofs4 cuse dm_crypt bnep rfcomm bluetooth gpio_ich hp_wmi sparse_keymap 
> intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi 
> snd_hda_codec_realtek kvm_intel kvm crct10dif_pclmul crc32_pclmul 
> snd_hda_intel snd_hda_codec snd_hwdep aesni_intel snd_pcm aes_x86_64 lrw 
> snd_page_alloc gf128mul glue_helper ablk_helper snd_seq_midi cryptd 
> snd_seq_midi_event snd_rawmidi snd_seq serio_raw snd_seq_device snd_timer 
> lpc_ich shpchp snd nfsd tpm_infineon wmi mac_hid auth_rpcgss mei_me nfs_acl 
> mei soundcore parport_pc nfs ppdev lockd sunrpc lp fscache parport 
> binfmt_misc btrfs xor raid6_pq libcrc32c i915 video i2c_algo_bit e1000e 
> drm_kms_helper psmouse ahci drm libahci ptp pps_core
> [ 1499.113526] CPU: 3 PID: 2229 Comm: vfio_test Not tainted 3.13.0-87-generic 
> #133-Ubuntu
> [ 1499.113568] Hardware name: Hewlett-Packard HP Compaq 8200 Elite SFF 
> PC/1495, BIOS J01 v02.15 11/10/2011
> [ 1499.113618] task: ffff880222964800 ti: ffff88003630c000 task.ti: 
> ffff88003630c000
> [ 1499.113657] RIP: 0010:[<ffffffffa071b6f9>]  [<ffffffffa071b6f9>] 
> vfio_pci_set_err_trigger+0x19/0x130 [vfio_pci]
> [ 1499.113713] RSP: 0018:ffff88003630ddd0  EFLAGS: 00010246
> [ 1499.113742] RAX: 00000000ffffffea RBX: ffff8800ca409800 RCX: 
> 0000000000000001
> [ 1499.113779] RDX: 0000000000000000 RSI: 0000000000000003 RDI: 
> ffff8800ca409800
> [ 1499.113816] RBP: ffff88003630dde8 R08: 0000000000000021 R09: 
> 0000000000000000
> [ 1499.113854] R10: 0000000000000021 R11: ffffffffa071b6e0 R12: 
> 0000000000000000
> [ 1499.113891] R13: ffff8800ca409858 R14: 00000000007a81d0 R15: 
> 0000000000000000
> [ 1499.113929] FS:  00007f3f57f07740(0000) GS:ffff88022e380000(0000) 
> knlGS:0000000000000000
> [ 1499.113971] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1499.114002] CR2: 0000000000000000 CR3: 00000000353c2000 CR4: 
> 00000000000407e0
> [ 1499.114039] Stack:
> [ 1499.114050]  ffff8800ca409800 0000000000000000 ffff8800ca409858 
> ffff88003630ddf8
> [ 1499.114093]  ffffffffa071c9ec ffff88003630deb0 ffffffffa071ae7d 
> ffff88003630de98
> [ 1499.114135]  000000018145452f 0000000000000004 ffff88021f7fba10 
> 0000000000000202
> [ 1499.114178] Call Trace:
> [ 1499.114196]  [<ffffffffa071c9ec>] vfio_pci_set_irqs_ioctl+0x8c/0xa0 
> [vfio_pci]
> [ 1499.114236]  [<ffffffffa071ae7d>] vfio_pci_ioctl+0x31d/0xa30 [vfio_pci]
> [ 1499.114277]  [<ffffffff810ad624>] ? __wake_up+0x44/0x50
> [ 1499.114308]  [<ffffffff812035e1>] ? fsnotify+0x241/0x320
> [ 1499.114340]  [<ffffffffa070e193>] vfio_device_fops_unl_ioctl+0x23/0x30 
> [vfio]
> [ 1499.114379]  [<ffffffff811d46e0>] do_vfs_ioctl+0x2e0/0x4c0
> [ 1499.114412]  [<ffffffff8172d854>] ? __schedule+0x384/0x7f0
> [ 1499.114442]  [<ffffffff811d4941>] SyS_ioctl+0x81/0xa0
> [ 1499.114472]  [<ffffffff8173a3dd>] system_call_fastpath+0x1a/0x1f
> [ 1499.114503] Code: 48 89 e5 e8 8a 21 af e0 b8 01 00 00 00 5d c3 0f 1f 00 66 
> 66 66 66 90 55 83 fe 03 b8 ea ff ff ff 48 89 e5 41 55 41 54 53 48 89 fb <41> 
> 8b 39 4c 8b 23 75 64 41 f6 c0 07 74 5e 41 f6 c0 01 75 63 41
> [ 1499.114673] RIP  [<ffffffffa071b6f9>] vfio_pci_set_err_trigger+0x19/0x130 
> [vfio_pci]
> [ 1499.114717]  RSP <ffff88003630ddd0>
> [ 1499.114735] CR2: 0000000000000000
> [ 1499.258683] ---[ end trace 978702c604a203ce ]---

This is of course a bug, users shouldn't be able to trigger this.  I'll
sort it out and post a patch.

So your code is useful for exposing this bug, but what are you actually
trying to accomplish?  PCI devices have three different types of
interrupts, INTx, MSI, and MSI-X.  They can't all be configured
simultaneously.  The error and request interrupts are not tied to a
physical interrupt, they're just communication channels to signal an
error and release the device respectively.  Your code is great as a
unit test, but not so much if you're actually trying to write a
userspace device driver.  Thanks,

Alex

_______________________________________________
vfio-users mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/vfio-users

Reply via email to