Re: [PATCH v5 09/44] drm: handle HAS_IOPORT dependencies

2023-05-22 Thread Arnd Bergmann
On Mon, May 22, 2023, at 14:38, Thomas Zimmermann wrote:
> Am 22.05.23 um 12:50 schrieb Niklas Schnelle:

>> There is also a direct and hard coded use in cirrus.c which according to
>> the comment is only necessary during resume.  Let's just skip this as
>> for example s390 which doesn't have I/O port support also doesen't
>> support suspend/resume.
>
> I think we should consider making cirrus depend on HAS_IOPORT. The 
> driver is only for qemu's cirrus emulation, which IIRC can only be 
> enabled for i586. And it has all been deprecated long ago.

I tried it out in arm64 debvm, and both the kernel and Xorg drivers
are detected just fine according to the log. I only get a black
screen, which could be the result of a missing VGA BIOS or the missing
unblank Port I/O in the driver (I doubt the VGA ports are hooked up
on arm64 qemu), but there is a good chance that it's currently
working for someone else.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/20] x86: address -Wmissing-prototype warnings

2023-05-19 Thread Arnd Bergmann
On Thu, May 18, 2023, at 23:56, Dave Hansen wrote:
> On 5/16/23 12:35, Arnd Bergmann wrote:
>> 
>> All of the warnings have to be addressed in some form before the warning
>> can be enabled by default.
>
> I picked up the ones that were blatantly obvious, but left out 03, 04,
> 10, 12 and 19 for the moment.

Ok, thanks!

I've already sent a fixed version of patch 10, let me know if you
need anything else for the other ones.

> BTW, I think the i386 allyesconfig is getting pretty lightly tested
> these days.  I think you and I hit the same mlx4 __bad_copy_from()
> compile issue.

I did all my testing on randconfig builds, so I probably caught a lot
of the more obscure corner cases, but it doesn't always hit everything
that is in allyesconfig/allmodconfig.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 10/20] x86: xen: add missing prototypes

2023-05-19 Thread Arnd Bergmann
On Thu, May 18, 2023, at 19:28, Dave Hansen wrote:
> On 5/16/23 12:35, Arnd Bergmann wrote:
>> 
>> arch/x86/xen/enlighten_pv.c:1233:34: error: no previous prototype for 
>> 'xen_start_kernel' [-Werror=missing-prototypes]
>> arch/x86/xen/irq.c:22:14: error: no previous prototype for 
>> 'xen_force_evtchn_callback' [-Werror=missing-prototypes]
>> arch/x86/xen/mmu_pv.c:358:20: error: no previous prototype for 'xen_pte_val' 
>> [-Werror=missing-prototypes]
>> arch/x86/xen/mmu_pv.c:366:20: error: no previous prototype for 'xen_pgd_val' 
>> [-Werror=missing-prototypes]
>
> What's the deal with this one?
>
> The patch is doing a bunch functions on top of the ones from the commit
> message.  Were you just showing a snippet of what the actual set of
> warnings is?

I missed this one going through the changelogs before sending them out,
I thought I had added a proper text to each one, but it fell through the
cracks. I've followed up with a v2 patch that has a proper changelog
now.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 07/41] drm: handle HAS_IOPORT dependencies

2023-05-16 Thread Arnd Bergmann
On Tue, May 16, 2023, at 19:13, Thomas Zimmermann wrote:
>
> Am 16.05.23 um 13:00 schrieb Niklas Schnelle:
>> In a future patch HAS_IOPORT=n will result in inb()/outb() and friends
>> not being declared. We thus need to add HAS_IOPORT as dependency for
>> those drivers using them. In the bochs driver there is optional MMIO
>> support detected at runtime, warn if this isn't taken when
>> HAS_IOPORT is not defined.
>> 
>> There is also a direct and hard coded use in cirrus.c which according to
>> the comment is only necessary during resume.  Let's just skip this as
>> for example s390 which doesn't have I/O port support also doesen't
>> support suspend/resume.
>> 
>> Co-developed-by: Arnd Bergmann 
>> Signed-off-by: Arnd Bergmann 
>> Signed-off-by: Niklas Schnelle 
>> ---
>> Note: The HAS_IOPORT Kconfig option was added in v6.4-rc1 so
>>per-subsystem patches may be applied independently
>> 
>>   drivers/gpu/drm/qxl/Kconfig   |  1 +
>>   drivers/gpu/drm/tiny/bochs.c  | 17 +
>>   drivers/gpu/drm/tiny/cirrus.c |  2 ++
>
> There are more invocations in gma500. See[1]
>
> [1] 
> https://elixir.bootlin.com/linux/v6.3/source/drivers/gpu/drm/gma500/cdv_device.c#L30

GMA500 already has "depends on X86", so I don't think
any changes are needed there -- x86 is already highly dependent
on I/O ports for a number of reasons.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 3/3 v6] virtio: vdpa: new SolidNET DPU driver.

2022-12-20 Thread Arnd Bergmann
On Tue, Dec 20, 2022, at 17:46, Alvaro Karsz wrote:
> Hi Nathan,
>
>> This does not appear to be a false positive but what was the intent
>> here? Should the local name variables increase their length or should
>> the buffer length be reduced?
>
> You're right, the local name variables and snprintf argument don't match.
> Thanks for noticing.
> I think that we should increase the name variables  to be
> SNET_NAME_SIZE bytes long.

If you can show that the string fits into the current length, it
would be better to keep the stack usage low and just adapt the
length to be sizeof(string) instead of SNET_NAME_SIZE.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Arnd Bergmann
On Thu, Oct 13, 2022, at 12:08 AM, Michael S. Tsirkin wrote:
> On Wed, Oct 12, 2022 at 11:06:54PM +0200, Arnd Bergmann wrote:
>> On Wed, Oct 12, 2022, at 7:22 PM, Linus Torvalds wrote:
>> >
>> > The NO_IRQ thing is mainly actually defined by a few drivers that just
>> > never got converted to the proper world order, and even then you can
>> > see the confusion (ie some drivers use "-1", others use "0", and yet
>> > others use "((unsigned int)(-1)".
>> 
>> The last time I looked at removing it for arch/arm/, one problem was
>> that there were a number of platforms using IRQ 0 as a valid number.
>> We have converted most of them in the meantime, leaving now only
>> mach-rpc and mach-footbridge. For the other platforms, we just
>> renumbered all interrupts to add one, but footbridge apparently
>> relies on hardcoded ISA interrupts in device drivers. For rpc,
>> it looks like IRQ 0 (printer) already wouldn't work, and it
>> looks like there was never a driver referencing it either.
>
> Do these two boxes even have pci?

Footbridge/netwinder has PCI and PC-style ISA on-board devices
(floppy, ps2 mouse/keyboard, parport, soundblaster, ...), RiscPC
has neither.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [GIT PULL] virtio: fixes, features

2022-10-12 Thread Arnd Bergmann
On Wed, Oct 12, 2022, at 7:22 PM, Linus Torvalds wrote:
>
> The NO_IRQ thing is mainly actually defined by a few drivers that just
> never got converted to the proper world order, and even then you can
> see the confusion (ie some drivers use "-1", others use "0", and yet
> others use "((unsigned int)(-1)".

The last time I looked at removing it for arch/arm/, one problem was
that there were a number of platforms using IRQ 0 as a valid number.
We have converted most of them in the meantime, leaving now only
mach-rpc and mach-footbridge. For the other platforms, we just
renumbered all interrupts to add one, but footbridge apparently
relies on hardcoded ISA interrupts in device drivers. For rpc,
it looks like IRQ 0 (printer) already wouldn't work, and it
looks like there was never a driver referencing it either.

I see that openrisc and parisc also still define NO_IRQ to -1, but at
least openrisc already relies on 0 being the invalid IRQ (from
CONFIG_IRQ_DOMAIN), probably parisc as well.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 20/36] arch/idle: Change arch_cpu_idle() IRQ behaviour

2022-06-08 Thread Arnd Bergmann
On Wed, Jun 8, 2022 at 4:27 PM Peter Zijlstra  wrote:
>
> Current arch_cpu_idle() is called with IRQs disabled, but will return
> with IRQs enabled.
>
> However, the very first thing the generic code does after calling
> arch_cpu_idle() is raw_local_irq_disable(). This means that
> architectures that can idle with IRQs disabled end up doing a
> pointless 'enable-disable' dance.
>
> Therefore, push this IRQ disabling into the idle function, meaning
> that those architectures can avoid the pointless IRQ state flipping.
>
> Signed-off-by: Peter Zijlstra (Intel) 

I think you now need to add the a raw_local_irq_disable(); in loongarch
as well.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 33/36] cpuidle,omap3: Use WFI for omap3_pm_idle()

2022-06-08 Thread Arnd Bergmann
On Wed, Jun 8, 2022 at 4:27 PM Peter Zijlstra  wrote:
>
> arch_cpu_idle() is a very simple idle interface and exposes only a
> single idle state and is expected to not require RCU and not do any
> tracing/instrumentation.
>
> As such, omap_sram_idle() is not a valid implementation. Replace it
> with the simple (shallow) omap3_do_wfi() call. Leaving the more
> complicated idle states for the cpuidle driver.
>
> Signed-off-by: Peter Zijlstra (Intel) 

I see similar code in omap2:

omap2_pm_idle()
 -> omap2_enter_full_retention()
 -> omap2_sram_suspend()

Is that code path safe to use without RCU or does it need a similar change?

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [linux-next:master] BUILD REGRESSION 8cb8311e95e3bb58bd84d6350365f14a718faa6d

2022-05-26 Thread Arnd Bergmann
On Wed, May 25, 2022 at 11:35 PM kernel test robot  wrote:
> .__mulsi3.o.cmd: No such file or directory
> Makefile:686: arch/h8300/Makefile: No such file or directory
> Makefile:765: arch/h8300/Makefile: No such file or directory
> arch/Kconfig:10: can't open file "arch/h8300/Kconfig"

Please stop building h8300  after the asm-generic tree is merged, the
architecture is getting removed.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 5/7] dt-bindings: Add xen,dev-domid property description for xen-grant DMA ops

2022-05-18 Thread Arnd Bergmann
On Wed, May 18, 2022 at 5:06 PM Oleksandr  wrote:
> On 18.05.22 17:32, Arnd Bergmann wrote:
> > On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko  
> > wrote:
>
> >   This would mean having a device
> > node for the grant-table mechanism that can be referred to using the 
> > 'iommus'
> > phandle property, with the domid as an additional argument.
>
> I assume, you are speaking about something like the following?
>
>
> xen_dummy_iommu {
> compatible = "xen,dummy-iommu";
> #iommu-cells = <1>;
> };
>
> virtio@3000 {
> compatible = "virtio,mmio";
> reg = <0x3000 0x100>;
> interrupts = <41>;
>
> /* The device is located in Xen domain with ID 1 */
> iommus = <&xen_dummy_iommu 1>;
> };

Right, that's that's the idea, except I would not call it a 'dummy'.
>From the perspective of the DT, this behaves just like an IOMMU,
even if the exact mechanism is different from most hardware IOMMU
implementations.

> > It does not quite fit the model that Linux currently uses for iommus,
> > as that has an allocator for dma_addr_t space
>
> yes (# 3/7 adds grant-table based allocator)
>
>
> > , but it would think it's
> > conceptually close enough that it makes sense for the binding.
>
> Interesting idea. I am wondering, do we need an extra actions for this
> to work in Linux guest (dummy IOMMU driver, etc)?

It depends on how closely the guest implementation can be made to
resemble a normal iommu. If you do allocate dma_addr_t addresses,
it may actually be close enough that you can just turn the grant-table
code into a normal iommu driver and change nothing else.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 5/7] dt-bindings: Add xen,dev-domid property description for xen-grant DMA ops

2022-05-18 Thread Arnd Bergmann
On Sat, May 7, 2022 at 7:19 PM Oleksandr Tyshchenko  wrote:
>
> diff --git a/Documentation/devicetree/bindings/virtio/mmio.yaml 
> b/Documentation/devicetree/bindings/virtio/mmio.yaml
> index 10c22b5..29a0932 100644
> --- a/Documentation/devicetree/bindings/virtio/mmio.yaml
> +++ b/Documentation/devicetree/bindings/virtio/mmio.yaml
> @@ -13,6 +13,9 @@ description:
>See https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=virtio for
>more details.
>
> +allOf:
> +  - $ref: /schemas/arm/xen,dev-domid.yaml#
> +
>  properties:
>compatible:
>  const: virtio,mmio
> @@ -33,6 +36,10 @@ properties:
>  description: Required for devices making accesses thru an IOMMU.
>  maxItems: 1
>
> +  xen,dev-domid:
> +description: Required when Xen grant mappings need to be enabled for 
> device.
> +$ref: /schemas/types.yaml#/definitions/uint32
> +
>  required:
>- compatible
>- reg

Sorry for joining the discussion late. Have you considered using the
generic iommu
binding here instead of a custom property? This would mean having a device
node for the grant-table mechanism that can be referred to using the 'iommus'
phandle property, with the domid as an additional argument.

It does not quite fit the model that Linux currently uses for iommus,
as that has an allocator for dma_addr_t space, but it would think it's
conceptually close enough that it makes sense for the binding.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] VMCI: Add support for ARM64

2022-05-11 Thread Arnd Bergmann
On Wed, May 11, 2022 at 9:54 PM Vishnu Dasa  wrote:
>
>
> > FWIW, it seems you're doing three things at once, better split this into
> > a 3-patch series.
>
> Thanks for the feedback.  I was debating between the two ways of doing
> it and ultimately did it one way.  It is a bit late now to change it as it has
> made its way to linux-next already.

It's really ok either way here. While there are clearly multiple changes in the
source file, they are all trivial, and the Kconfig change doesn't make sense
without the other changes, so having a combined patch seems totally
reasonable as well.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-04 Thread Arnd Bergmann
On Tue, Apr 5, 2022 at 7:35 AM Christoph Hellwig  wrote:
>
> On Mon, Apr 04, 2022 at 10:04:02AM +0200, Arnd Bergmann wrote:
> > The header is shared between kernel and other projects using virtio, such as
> > qemu and any boot loaders booting from virtio devices. It's not technically 
> > a
> > /kernel/ ABI, but it is an ABI and for practical reasons the kernel version 
> > is
> > maintained as the master copy if I understand it correctly.
>
> Besides that fact that as you correctly states these are not a UAPI at
> all, qemu and bootloades are not specific to Linux and can't require a
> specific kernel version.  So the same thing we do for file system
> formats or network protocols applies here:  just copy the damn header.
> And as stated above any reasonably portable userspace needs to have a
> copy anyway.

I think the users all have their own copies, at least the ones I could
find on codesearch.debian.org. However, there are 27 virtio_*.h
files in include/uapi/linux that probably should stay together for
the purpose of defining the virtio protocol, and some others might
be uapi relevant.

I see that at least include/uapi/linux/vhost.h has ioctl() definitions
in it, and includes the virtio_ring.h header indirectly.

Adding the virtio maintainers to Cc to see if they can provide
more background on this.

> If it is just as a "master copy" it can live in drivers/virtio/, just
> like we do for other formats.

It has to be in include/linux/ at least because it's used by a number
of drivers outside of drivers/virtio/.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/17] Add memberof(), split some headers, and slightly simplify code

2021-11-19 Thread Arnd Bergmann
On Fri, Nov 19, 2021 at 5:12 PM Alejandro Colomar (man-pages)
 wrote:
>
> On 11/19/21 16:57, Arnd Bergmann wrote:
> >
> > From what I can tell, linux/stddef.h is tiny, I don't think it's really
> > worth optimizing this part. I have spent some time last year
> > trying to untangle some of the more interesting headers, but ended
> > up not completing this as there are some really hard problems
> > once you start getting to the interesting bits.
>
> In this case it was not about being worth it or not,
> but that the fact that adding memberof() would break,
> unless I use 0 instead of NULL for the implementation of memberof(),
> which I'm against, or I split stddef.
>
> If I don't do either of those,
> I'm creating a circular dependency,
> and it doesn't compile.

Sorry for missing the background here, but I don't see what that
dependency is. If memberof() is a macro, then including the definition
should not require having the NULL definition first, you just need to
have both at the time you use it.

> > The main issue here is that user space code should not
> > include anything outside of include/uapi/ and arch/*/include/uapi/
>
> Okay.  That's good to know.
>
> So everything can use uapi code,
> and uapi code can only use uapi code,
> right?

Correct.

> > offsetof() is defined in include/linux/stddef.h, so this is by
> > definition not accessible here. It appears that there is also
> > an include/uapi/linux/stddef.h that is really strange because
> > it includes linux/compiler_types.h, which in turn is outside
> > of uapi/. This should probably be fixed.
>
> I see.
> Then,
> perhaps it would be better to define offsetof() _only_ inside uapi/,
> and use that definition from everywhere else,
> and therefore remove the non-uapi version,
> right?

No, because the user-space  provided by the compiler
also includes an offsetof() definition. In the uapi/ namespace, the
kernel must only provide definitions that do not clash with anything
in user space.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/17] Add memberof(), split some headers, and slightly simplify code

2021-11-19 Thread Arnd Bergmann
On Fri, Nov 19, 2021 at 5:22 PM Alejandro Colomar (man-pages)
 wrote:
> On 11/19/21 17:18, Arnd Bergmann wrote:
> > On Fri, Nov 19, 2021 at 5:10 PM Andy Shevchenko
> >  wrote:
> >> On Fri, Nov 19, 2021 at 04:57:46PM +0100, Arnd Bergmann wrote:
> >
> >>> The main problem with this approach is that as soon as you start
> >>> actually reducing the unneeded indirect includes, you end up with
> >>> countless .c files that no longer build because they are missing a
> >>> direct include for something that was always included somewhere
> >>> deep underneath, so I needed a second set of scripts to add
> >>> direct includes to every .c file.
> >>
> >> Can't it be done with cocci support?
> >
> > There are many ways of doing it, but they all tend to suffer from the
> > problem of identifying which headers are actually needed based on
> > the contents of a file, and also figuring out where to put the extra
> > #include if there are complex #ifdefs.
> >
> > For reference, see below for the naive pattern matching I tried.
> > This is obviously incomplete and partially wrong.
>
> FYI, if you may not know the tool,
> theres include-what-you-use(1) (a.k.a. iwyu(1))[1],
> although it is still not mature,
> and I'm helping improve it a bit.

Yes, I know that one, I tried using it as well, but it did not really
scale to the size of the kernel as it requires having all files to use
the correct set of #include, and to know about all the definitions.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/17] Add memberof(), split some headers, and slightly simplify code

2021-11-19 Thread Arnd Bergmann
On Fri, Nov 19, 2021 at 5:10 PM Andy Shevchenko
 wrote:
> On Fri, Nov 19, 2021 at 04:57:46PM +0100, Arnd Bergmann wrote:

> > The main problem with this approach is that as soon as you start
> > actually reducing the unneeded indirect includes, you end up with
> > countless .c files that no longer build because they are missing a
> > direct include for something that was always included somewhere
> > deep underneath, so I needed a second set of scripts to add
> > direct includes to every .c file.
>
> Can't it be done with cocci support?

There are many ways of doing it, but they all tend to suffer from the
problem of identifying which headers are actually needed based on
the contents of a file, and also figuring out where to put the extra
#include if there are complex #ifdefs.

For reference, see below for the naive pattern matching I tried.
This is obviously incomplete and partially wrong.

Arnd

---
#!/bin/bash

GITARGS="$@"
declare HEADER
declare SEARCH
declare FILES
declare OTHERHEADERS

# insert  alphabetically after the nearest 
insertafter() {
MYFILES=`git grep -l "#include.*<${HEADER%/*}/.*>" $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
MYFILES=`grep -wL $HEADER $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
echo after $HEADER $MYFILES
echo $OTHERHEADERS | tr '[:space:]' '\n' | sort -ru | grep
^${HEADER%/*} | grep -A1 ^$HEADER | grep -v ^$HEADER | tr / . |
while read i ; do

if [ -z "$MYFILES" ] ; then return ; fi
echo AFTER $i
echo $MYFILES | xargs sed -i '/include.*<'$i'>/ a #include '"<$HEADER>"
MYFILES=`grep -wL $HEADER $MYFILES`
done
}

# insert  alphabetically after the nearest 
insertbefore() {
MYFILES=`git grep -l "#include.*<${HEADER%/*}/.*>" $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
MYFILES=`grep -wL $HEADER $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
echo before $HEADER $MYFILES
echo $OTHERHEADERS | tr '[:space:]' '\n' | sort -u | grep
^${HEADER%/*} | grep -A1 ^$HEADER | grep -v ^$HEADER | tr / . |
while read i ; do
if [ -z "$MYFILES" ] ; then return ; fi
echo BEFORE $i
echo $MYFILES | xargs sed -i '/include.*<'$i'>/ i #include '"<$HEADER>"
MYFILES=`grep -wL $HEADER $MYFILES`
done
}

# insert  before first 
insertcategory() {
MYFILES=`git grep -l "#include.*<.*/.*>" $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
MYFILES=`grep -wL $HEADER $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
for f in $MYFILES ; do
sed -i -e "/^#include.*<.*\/.*>/ { i #include <$HEADER>" -e " ; :L; n;
bL; } " $f
done
}

insertafterlocal() {
MYFILES=`git grep -l "#include.*\".*\"" $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
MYFILES=`grep -wL $HEADER $FILES`
if [ -z "$MYFILES" ] ; then return ; fi
for f in $MYFILES ; do
sed -i -e "/^#include.*\".*\/.*\"/ { a #include <$HEADER>" -e " ; :L;
n; bL; } " $f
done
}

x() {
HEADER="$1"
SEARCH="$2"
echo $HEADER
FILES=`git grep -wl "$SEARCH" | grep -v
"^\(Documentation\|include\|tools\|arch/.*/include\|arch/.*/boot/dts\|scripts\|samples\|arch/*/\(kernel/\|vdso\)\|lib/vdso\)\|classmap.h$"
| grep -v "\.S\>"`
if [ -z "$FILES" ] ; then return ; fi
FILES=`echo $FILES | xargs grep $HEADER -L `
if [ -z "$FILES" ] ; then return ; fi
OTHERHEADERS=`echo $FILES | xargs grep -h  "include.*\<.*/.*\>" | cut
-f 2 -d\< | cut -f 1 -d \> | grep ^[-_a-zA-Z0-9]*/ ; echo $HEADER`

insertafter
insertbefore
insertcategory
# insertafterlocal
}

# error: implicit declaration of function 'skb_tunnel_rx'
[-Werror,-Wimplicit-function-declaration]
#x linux/debug_locks.h "\(__\|\)debug_locks_off"
#x linux/stat.h
"S_I\(R\|W\|X\)\(USR\|GRP\|OTH\|UGO\)\|S_IRWX\(UGO\|U\|G\|O\)\|S_IALLUGO"
#x linux/stat.h "[A-Z0-9_]*ATTR_\(RW\|RO\|WO\)"
#x linux/dev_printk.h "dev_\(debug\|info\|warn\|err\|notice\|alert\)"
x net/scheduler.h "dev_init_scheduler\|dev_activate"
x linux/sysfs.h
"sysfs_\(create\|remove\|update\|merge\|add_file_to\)_\(\(bin_\|\)file\|file_self\|group\|link\|mount_point\)\(\|s\|_ns\|\)\|sysfs_\(ops\|notify\|chmod_file\|get_dirent\|notify_dirent\|put\)"
x linux/kref.h "kref_get_unless_zero\|kref_get\|kref_init\|kref_read\|kref_put"
x linux/skb_alloc.h
"\(dev\|netdev\|__dev\|__netdev\)_alloc_skb\(\|_ip_align\)\|skb_frag_must_loop\|skb_fill_page_desc\|skb_queue_purge\|skb_rbtree_purge\|netdev_alloc_frag\|skb_frag_address_safe"
x linux/mutex.h
"DEFINE_MUTEX\|\(device\|mutex\|tty\)_\(lock\|unlock\|is_locked\|is_writelocked\)\|usb_\(un\|\)lock_device\|BLOCKING_INIT_N

Re: [PATCH 00/17] Add memberof(), split some headers, and slightly simplify code

2021-11-19 Thread Arnd Bergmann
On Fri, Nov 19, 2021 at 4:06 PM Alejandro Colomar (man-pages)
 wrote:
> On 11/19/21 15:47, Arnd Bergmann wrote:
> > On Fri, Nov 19, 2021 at 12:36 PM Alejandro Colomar
>
> Yes, I would like to untangle the dependencies.
>
> The main reason I started doing this splitting
> is because I wouldn't be able to include
>  in some headers,
> because it pulled too much stuff that broke unrelated things.
>
> So that's why I started from there.
>
> I for example would like to get NULL in memberof()
> without puling anything else,
> so  makes sense for that.
>
> It's clear that every .c wants NULL,
> but it's not so clear that every .c wants
> everything that  pulls indirectly.

>From what I can tell, linux/stddef.h is tiny, I don't think it's really
worth optimizing this part. I have spent some time last year
trying to untangle some of the more interesting headers, but ended
up not completing this as there are some really hard problems
once you start getting to the interesting bits.

The approach I tried was roughly:

- For each header in the kernel, create a preprocessed version
  that includes all the indirect includes, from that start a set
  of lookup tables that record which header is eventually included
  by which ones, and the size of each preprocessed header in
  bytes

- For a given kernel configuration (e.g. defconfig or allmodconfig)
  that I'm most interested in, look at which files are built, and what
  the direct includes are in the source files.

- Sort the headers by the product of the number of direct includes
  and the preprocessed size: the largest ones are those that are
  worth looking at first.

- use graphviz to visualize the directed graph showing the includes
  between the top 100 headers in that list. You get something like
  I had in [1], or the version afterwards at [2].

- split out unneeded indirect includes from the headers in the center
  of that graph, typically by splitting out struct definitions.

- repeat.

The main problem with this approach is that as soon as you start
actually reducing the unneeded indirect includes, you end up with
countless .c files that no longer build because they are missing a
direct include for something that was always included somewhere
deep underneath, so I needed a second set of scripts to add
direct includes to every .c file.

On the plus side, I did see something on the order of a 30%
compile speed improvement with clang, which is insane
given that this only removed dead definitions.

> But I'll note that linux/fs.h, linux/sched.h, linux/mm.h are
> interesting headers for further splitting.
>
>
> BTW, I also have a longstanding doubt about
> how header files are organized in the kernel,
> and which headers can and cannot be included
> from which other files.
>
> For example I see that files in samples or scripts or tools,
> that redefine many things such as offsetof() or ARRAY_SIZE(),
> and I don't know if there's a good reason for that,
> or if I should simply remove all that stuff and
> include  everywhere I see offsetof() being used.

The main issue here is that user space code should not
include anything outside of include/uapi/ and arch/*/include/uapi/

offsetof() is defined in include/linux/stddef.h, so this is by
definition not accessible here. It appears that there is also
an include/uapi/linux/stddef.h that is really strange because
it includes linux/compiler_types.h, which in turn is outside
of uapi/. This should probably be fixed.

  Arnd

[1] 
https://drive.google.com/file/d/14IKifYDadg2W5fMsefxr4373jizo9bLl/view?usp=sharing
[2] 
https://drive.google.com/file/d/1pWQcv3_ZXGqZB8ogV-JOfoV-WJN2UNnd/view?usp=sharing
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/17] Add memberof(), split some headers, and slightly simplify code

2021-11-19 Thread Arnd Bergmann
On Fri, Nov 19, 2021 at 12:36 PM Alejandro Colomar
 wrote:
>
> Alejandro Colomar (17):
>   linux/container_of.h: Add memberof(T, m)
>   Use memberof(T, m) instead of explicit NULL dereference
>   Replace some uses of memberof() by its wrappers
>   linux/memberof.h: Move memberof() to separate header
>   linux/typeof_member.h: Move typeof_member() to a separate header
>   Simplify sizeof(typeof_member()) to sizeof_field()
>   linux/NULL.h: Move NULL to a separate header
>   linux/offsetof.h: Move offsetof(T, m) to a separate header
>   linux/offsetof.h: Implement offsetof() in terms of memberof()
>   linux/container_of.h: Implement container_of_safe() in terms of
> container_of()
>   linux/container_of.h: Cosmetic
>   linux/container_of.h: Remove unnecessary cast to (void *)

My feeling is that this takes the separation too far: by having this many header
files that end up being included from practically every single .c file
in the kernel,
I think you end up making compile speed worse overall.

If your goal is to avoid having to recompile as much of the kernel
after touching
a header, I think a better approach is to help untangle the dependencies, e.g.
by splitting out type definitions from headers with inline functions (most
indirect header dependencies are on type definitions) and by focusing on
linux/fs.h, linux/sched.h, linux/mm.h and how they interact with the rest of the
headers. At the moment, these are included in most .c files and they in turn
include a ton of other headers.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()

2021-08-12 Thread Arnd Bergmann
On Thu, Aug 12, 2021 at 10:08 AM Xianting TIan
 wrote:
> 在 2021/8/6 下午10:51, Arnd Bergmann 写道:
> > On Fri, Aug 6, 2021 at 5:01 AM Xianting Tian
> >> +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long
> > I think you need a higher alignment for DMA buffers, instead of 
> > sizeof(long),
> > I would suggest ARCH_DMA_MINALIGN.
>
> As some ARCH(eg, x86, riscv) doesn't define ARCH_DMA_MINALIG, so i think
> it 's better remain the code unchanged,
>
> I will send v5 patch soon.

I think you could just use "L1_CACHE_BYTES" as the alignment in this case.
This will make the structure slightly larger for architectures that do not have
alignment constraints on DMA buffers, but using a smaller alignment is
clearly wrong. Another option would be to use ARCH_KMALLOC_MINALIGN.

Note that there is a patch to add ARCH_DMA_MINALIGN to riscv already,
as some implementations do not have coherent DMA. I had failed to
realized though that on x86 you do not get an ARCH_DMA_MINALIGN
definition.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v4 1/2] tty: hvc: pass DMA capable memory to put_chars()

2021-08-06 Thread Arnd Bergmann
On Fri, Aug 6, 2021 at 5:01 AM Xianting Tian
 wrote:
> @@ -163,6 +155,13 @@ static void hvc_console_print(struct console *co, const 
> char *b,
> if (vtermnos[index] == -1)
> return;
>
> +   list_for_each_entry(hp, &hvc_structs, next)
> +   if (hp->vtermno == vtermnos[index])
> +   break;
> +
> +   c = hp->c;
> +
> +   spin_lock_irqsave(&hp->c_lock, flags);

The loop looks like it might race against changes to the list. It seems strange
that the print function has to actually search for the structure here.

It may be better to have yet another array for the buffer pointers next to
the cons_ops[] and vtermnos[] arrays.

> +/*
> + * These sizes are most efficient for vio, because they are the
> + * native transfer size. We could make them selectable in the
> + * future to better deal with backends that want other buffer sizes.
> + */
> +#define N_OUTBUF   16
> +#define N_INBUF16
> +
> +#define __ALIGNED__ __attribute__((__aligned__(sizeof(long

I think you need a higher alignment for DMA buffers, instead of sizeof(long),
I would suggest ARCH_DMA_MINALIGN.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 1/2] tty: hvc: pass DMA capable memory to put_chars()

2021-08-01 Thread Arnd Bergmann
On Sun, Aug 1, 2021 at 7:16 AM Xianting Tian
 wrote:

> Considering lock competition of hp->outbuf and the complicated logic in
> hvc_console_print(), I didn’t use hp->outbuf, just allocate additional
> memory(length N_OUTBUF) and append it to hp->outbuf.
> For the issue in hvc_poll_put_char(), I use a static char to replace
> the char in stack.

While this may work, it sounds rather obscure to me, I don't think
it's a good idea
to append the buffer at the back.

If you need a separate field besides hp->outbuf, I would make that part of the
structure itself, and give it the correct alignment constraints to ensure it is
in a cache line by itself. The size of this field is a compile-time
constant, so I
don't see a need to play tricks with pointer arithmetic.

I'm not sure about the locking either: Is it possible for two CPUs to enter
hvc_console_print() at the same time, or is there locking at a higher level
already? It would be good to document this in the structure definition next
to the field.

> @@ -878,6 +885,7 @@ static void hvc_poll_put_char(struct tty_driver *driver, 
> int line, char ch)
> struct tty_struct *tty = driver->ttys[0];
> struct hvc_struct *hp = tty->driver_data;
> int n;
> +   static char ch = ch;
>
> do {
> n = hp->ops->put_chars(hp->vtermno, &ch, 1);

This does not compile, and it's neither thread-safe nor dma safe if you get it
to build by renaming the variable.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-console: avoid DMA from vmalloc area

2021-07-28 Thread Arnd Bergmann
On Wed, Jul 28, 2021 at 10:28 AM Xianting Tian
 wrote:
> 在 2021/7/28 下午3:25, Arnd Bergmann 写道:
>
> I checked several hvc backends, like drivers/tty/hvc/hvc_riscv_sbi.c,
> drivers/tty/hvc/hvc_iucv.c, drivers/tty/hvc/hvc_rtas.c, they don't use dma.
>
> I not finished all hvc backends check yet. But I think even if all hvc
> backends don't use dma currently, it is still possible that the hvc
> backend using dma will be added in the furture.
>
> So I agree with you it should better be fixed in the hvc framework,
> solve the issue in the first place.

Ok, sounds good to me, no need to check more backends then.
I see the hvc-console driver is listed as 'Odd Fixes' in the maintainer
list, with nobody assigned other than the ppc kernel list (added to Cc).

Once you come up with a fix in hvc_console.c, please send that to the
tty maintainers, the ppc list and me, and I'll review it.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-console: avoid DMA from vmalloc area

2021-07-28 Thread Arnd Bergmann
On Wed, Jul 28, 2021 at 4:59 AM Xianting Tian
 wrote:
>
> Arnd, thanks for your quick reply,
>
> As we know put_chars() of virtio-console is registered to hvc framework.
> I go throughed the code, actually there are totally three places that
> put_chars() is called in hvc driver,  but only 1 has issue which is
> fixed by commit c4baad5029.

Ah, good. Knowing what the callers are definitely helps. ;-)

> So I think the scenario that the buf is from "ioremap(), kmap_atomic() ,
> fixmap, loadable module" doesn't exist for virtio-console.
> If there is something wrong about above description, please correct me,
> thanks.

The description is good then.

> Three places that put_chars() is called in hvc driver:
> 1, it is on stack buf,  it is not ok for dma
>  hvc_console_print():
>  char c[N_OUTBUF] __ALIGNED__;
>  cons_ops[index]->put_chars(vtermnos[index], c, i);
>
> 2, just one byte, no issue for dma
>  static void hvc_poll_put_char(struct tty_driver *driver, int line,
> char ch)
>  {
>  struct tty_struct *tty = driver->ttys[0];
>  struct hvc_struct *hp = tty->driver_data;
>  int n;
>
>  do {
>  n = hp->ops->put_chars(hp->vtermno, &ch, 1);
>  } while (n <= 0);
>  }

This is actually the same as the first, taking the address of a
function argument forces it onto the stack.

> 3,  hp->outbuf is allocated in hvc_alloc() via kzalloc(), no issue for dma
>  static int hvc_push(struct hvc_struct *hp)
>  {
>  int n;
>
>  n = hp->ops->put_chars(hp->vtermno, hp->outbuf, hp->n_outbuf);
>  …
>  }

ok.

I have a new question then: are there any other hvc backends that do
DMA, or is the virtio-console driver the only one? If there are any others,
I think this should better be fixed in the hvc framework, by changing it
to never pass stack data into the put_chars() function in the first place.

It may be possible to just use the 'hp->n_outbuf' buffer in all three cases.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH] virtio-console: avoid DMA from vmalloc area

2021-07-27 Thread Arnd Bergmann
On Tue, Jul 27, 2021 at 3:13 PM Xianting Tian
 wrote:
> @@ -1127,13 +1128,18 @@ static int put_chars(u32 vtermno, const char *buf, 
> int count)
> if (!port)
> return -EPIPE;
>
> -   data = kmemdup(buf, count, GFP_ATOMIC);
> -   if (!data)
> -   return -ENOMEM;
> +   if (is_vmalloc_addr(buf)) {
> +   data = kmemdup(buf, count, GFP_ATOMIC);

What about buffers in .data? If those are in a loadable module, I guess you have
the same problem as with vmalloc() and vmap().

is_vmalloc_or_module_addr() would take care of both, not sure if there are
other examples that don't work. In theory it could be ioremap(), kmap_atomic()
or fixmap as well, but those seem less likely to matter here.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v15] i2c: virtio: add a virtio i2c frontend driver

2021-07-23 Thread Arnd Bergmann
On Fri, Jul 23, 2021 at 7:44 AM Jie Deng  wrote:

> +
> +   ret = virtio_i2c_setup_vqs(vi);
> +   if (ret)
> +   return ret;
> +
> +   vi->adap.owner = THIS_MODULE;
> +   snprintf(vi->adap.name, sizeof(vi->adap.name),
> +"i2c_virtio at virtio bus %d", vdev->index);
> +   vi->adap.algo = &virtio_algorithm;
> +   vi->adap.quirks = &virtio_i2c_quirks;
> +   vi->adap.dev.parent = &vdev->dev;
> +   i2c_set_adapdata(&vi->adap, vi);
> +
> +   /*
> +* Setup ACPI node for controlled devices which will be probed through
> +* ACPI.
> +*/
> +   ACPI_COMPANION_SET(&vi->adap.dev, ACPI_COMPANION(pdev));

Since there is now a generic way for virtio drivers to link up with OF
device nodes, maybe this should be handled the same way in the
virtio core rather than the driver?

> index 70a8057a..99aa27b 100644
> --- a/include/uapi/linux/virtio_ids.h
> +++ b/include/uapi/linux/virtio_ids.h
> @@ -55,6 +55,7 @@
>  #define VIRTIO_ID_FS   26 /* virtio filesystem */
>  #define VIRTIO_ID_PMEM 27 /* virtio pmem */
>  #define VIRTIO_ID_MAC80211_HWSIM   29 /* virtio mac80211-hwsim */
> +#define VIRTIO_ID_I2C_ADAPTER  34 /* virtio i2c adapter */
>  #define VIRTIO_ID_BT   40 /* virtio bluetooth */

This will now conflict with Viresh's patch that adds all the other IDs.
Not sure if there is anything to be done about that.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-06-30 Thread Arnd Bergmann
On Wed, Jun 30, 2021 at 10:09 AM Andy Shevchenko
 wrote:
>
> On Wed, Jun 30, 2021 at 09:55:49AM +0200, Arnd Bergmann wrote:
> > On Wed, Jun 30, 2021 at 9:51 AM Jie Deng  wrote:
>
> ...
>
> > On a related note, we are apparently still missing the bit in the virtio bus
> > layer that fills in the dev->of_node pointer of the virtio device. Without
> > this, it is not actually possible to automatically probe i2c devices 
> > connected
> > to a virtio-i2c bus. The same problem came up again with the virtio-gpio
> > driver that suffers from the same issue.
>
> Don't we need to take care about fwnode handle as well?

I'm fairly sure this gets set up automatically on DT based systems, based
on the dev->of_node of the virtio device, with no changes to the i2c
core core.

If you want to automatically probe i2c devices on a virtio-i2c controller
with ACPI, I have no idea if that would require changes to both
i2c-core-acpi.c as well as the virtio core, or just one of them.
So far, my assumption was that this would not be needed with ACPI.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-06-30 Thread Arnd Bergmann
On Wed, Jun 30, 2021 at 9:51 AM Jie Deng  wrote:
> On 2021/6/30 15:32, Wolfram Sang wrote:
>  +  snprintf(vi->adap.name, sizeof(vi->adap.name), "Virtio I2C Adapter");
> >>> Is there something to add so you can distinguish multiple instances?
> >>> Most people want that.
> >>
> >> I find the I2C core will set a device name "i2c-%d" for this purpose, 
> >> right?
> >>
> >> I think this name can be used to distinguish the adapter types while
> >> "i2c-%d" can be used to
> >>
> >> distinguish instances. Does it make sense ?
> > That alone does not help. See the 'i2cdetect -l' output of my Renesas
> > board here:
> >
> > i2c-4 i2c e66d8000.i2cI2C adapter
> > i2c-2 i2c e651.i2cI2C adapter
> > i2c-7 i2c e60b.i2cI2C adapter
> >
> > Notice that the third column carries the base address, so you know which
> > i2c-%d is which physical bus. I don't know if it makes sense in your
> > "virtual" case, but so far it would always print "Virtio I2C Adapter".
> > Maybe it makes sense to add some parent device name, too?
> >
> > And if this is not reasonable, just skip it. As I said, it can be
> > helpful at times, but it is definately not a show stopper.
>
>
> OK. I will add the virtio_device index for this purpose.
> which indicates the unique position on the virtio bus.

Is that position stable across kernel versions? We do have stable naming
for PCI devices and for platform devices that are the parent of a virtio
device, but I would expect the virtio device to be numbered in probe
order instead.

On a related note, we are apparently still missing the bit in the virtio bus
layer that fills in the dev->of_node pointer of the virtio device. Without
this, it is not actually possible to automatically probe i2c devices connected
to a virtio-i2c bus. The same problem came up again with the virtio-gpio
driver that suffers from the same issue.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-06-28 Thread Arnd Bergmann
On Mon, Jun 28, 2021 at 12:13 PM Wolfram Sang  wrote:
>
>
> > Ok, that's what I thought. There is one corner case that I've struggled
> > with though: Suppose the host has an SMBus-only driver, and the
> > proposed driver exposes this as an I2C device to the guest, which
> > makes it available to guest user space (or a guest kernel driver)
> > using the emulated smbus command set. Is it always possible for
> > the host user space to turn the I2C transaction back into the
> > expected SMBus transaction on the host?
>
> If an SMBus commands gets converted to I2C messages, it can be converted
> back to an SMBus command. I don't see anything preventing that right
> now. However, the mapping-back code does look a bit clumsy, now that I
> envision it. Maybe it is better, after all, to support I2C_SMBUS
> directly and pass SMBus transactions as is. It should be a tad more
> effiecient, too.

You can fine Viresh's vhost-user implementation at
https://lore.kernel.org/qemu-devel/cover.1617278395.git.viresh.ku...@linaro.org/t/#m3b5044bad9769b170f505e63bd081eb27cef8db2

As you say, it does get a bit clumsy, but I think there is also a good argument
to be made that the clumsiness is based on the host Linux user interface
more than the on the requirements of the physical interface,
and that should not have to be reflected in the virtio specification.

> Speaking of it, I recall another gory detail: SMBus has transfers named
> "read block data" and "block process call". These also need special
> support from I2C host controllers before they can be emulated because
> the length of the read needs to be adjusted in flight. These commands
> are rare and not hard to implement. However, it makes exposing what is
> supported a little more difficult.

Right, this one has come up before as well: the preliminary result
was to assume that this probably won't be needed, but would be easy
enough to add later if necessary.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-06-28 Thread Arnd Bergmann
On Mon, Jun 28, 2021 at 11:25 AM Wolfram Sang  wrote:
> > As far as I understand me (please clarify), implementing only the smbus
> > subset would mean that we cannot communicate with all client devices,
> > while implementing both would add more complexity than the lower-level
> > protocol.
>
> Correct. You need to pick I2C if you want to support all devices. SMBus
> will give you only a lot of devices.

Ok, that's what I thought. There is one corner case that I've struggled
with though: Suppose the host has an SMBus-only driver, and the
proposed driver exposes this as an I2C device to the guest, which
makes it available to guest user space (or a guest kernel driver)
using the emulated smbus command set. Is it always possible for
the host user space to turn the I2C transaction back into the
expected SMBus transaction on the host?

> > > Also, as I read it, a whole bus is para-virtualized to the guest, or?
> > > Wouldn't it be better to allow just specific devices on a bus? Again, I
> > > am kinda new to this, so I may have overlooked things.
> >
> > Do you mean just allowing a single device per bus (as opposed to
> > having multiple devices as on a real bus), or just allowing
> > a particular set of client devices that can be identified using
> > virtio specific configuration (as opposed to relying on device
> > tree or similar for probing). Both of these are valid questions that
> > have been discussed before, but that could be revisited.
>
> I am just thinking that a physical bus on the host may have devices that
> should be shared with the guest while other devices on the same bus
> should not be shared (PMIC!). I'd think this is a minimal requirement
> for security. I also wonder if it is possible to have a paravirtualized
> bus in the guest which has multiple devices from the host sitting on
> different physical busses there. But that may be the cherry on the top.

This is certainly possible, but is independent of the implementation of
the guest driver. It's up to the host to provision the devices that
are actually passed down to the guest, and this could in theory
be any combination of emulated devices with devices connected to
any of the host's physical buses. The host may also decide to remap
the addresses of the devices during passthrough.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-06-28 Thread Arnd Bergmann
On Mon, Jun 28, 2021 at 10:39 AM Wolfram Sang  wrote:
>
> sorry for the long delay. I am not familiar with VFIO, so I had to dive
> into the topic a little first. I am still not seeing through it
> completely, so I have very high-level questions first.

You probably know this already, but just in case for clarification
these are two different things:

VFIO: kernel feature to make raw (usually PCI) devices available
   to user space drivers and virtual machines from a kernel
   running on bare metal.

virtio: transport protocol for implementing arbitrary paravirtualized
  drivers in (usually) a virtual machine guest without giving the
  guest access to hardware registers.

Both can be used for letting a KVM guest talk to the outside world,
but usually you have one or the other, not both.

> > The device specification can be found on
> > https://lists.oasis-open.org/archives/virtio-comment/202101/msg8.html.
>
> I think we need to start here:
>
> ===
>
> If ``length of \field{read_buf}''=0 and ``length of \field{write_buf}''>0,
> the request is called write request.
>
> If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''=0,
> the request is called read request.
>
> If ``length of \field{read_buf}''>0 and ``length of \field{write_buf}''>0,
> the request is called write-read request. It means an I2C write segment 
> followed
> by a read segment. Usually, the write segment provides the number of an I2C
> controlled device register to be read.
>
> ===
>
> I2C transactions can have an arbitrary number of messages which can
> arbitrarily be read or write. As I understand the above, only one read,
> write or read-write transaction is supported. If that is the case, it
> would be not very much I2C but more SMBus. If my assumptions are true,
> we first need to decide if you want to go the I2C way or SMBus subset.

This has come up in previous reviews already. I think it comes down
to the requirement that the virtio i2c protocol should allow passthrough
access to any client devices connected to a physical i2c bus on the host,
and this should ideally be independent of whether the host driver
exposes I2C_RDWR or I2C_SMBUS ioctl interface, or both.

This can be done either by having both interface types in the transport,
or picking one of the two, and translating to the host interface type
in software.

As far as I understand me (please clarify), implementing only the smbus
subset would mean that we cannot communicate with all client devices,
while implementing both would add more complexity than the lower-level
protocol.

> ===
>
> The case when ``length of \field{write_buf}''=0, and at the same time,
> ``length of \field{read_buf}''=0 doesn't make any sense.
>
> ===
>
> Oh, it does. That's a legal transfer, both in SMBus and I2C. It is used
> to e.g. discover devices. I think it should be supported, even though
> not all bus master drivers on the host can support it. Is it possible?
>
> Also, as I read it, a whole bus is para-virtualized to the guest, or?
> Wouldn't it be better to allow just specific devices on a bus? Again, I
> am kinda new to this, so I may have overlooked things.

Do you mean just allowing a single device per bus (as opposed to
having multiple devices as on a real bus), or just allowing
a particular set of client devices that can be identified using
virtio specific configuration (as opposed to relying on device
tree or similar for probing). Both of these are valid questions that
have been discussed before, but that could be revisited.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v10] i2c: virtio: add a virtio i2c frontend driver

2021-03-23 Thread Arnd Bergmann
On Tue, Mar 23, 2021 at 9:33 AM Jie Deng  wrote:
>
> On 2021/3/23 15:27, Viresh Kumar wrote:
>
> > On 23-03-21, 22:19, Jie Deng wrote:
> >> +static int __maybe_unused virtio_i2c_freeze(struct virtio_device *vdev)
> >> +{
> >> +virtio_i2c_del_vqs(vdev);
> >> +return 0;
> >> +}
> >> +
> >> +static int __maybe_unused virtio_i2c_restore(struct virtio_device *vdev)
> >> +{
> >> +return virtio_i2c_setup_vqs(vdev->priv);
> >> +}
> > Sorry for not looking at this earlier, but shouldn't we enclose the above 
> > two
> > within #ifdef CONFIG_PM_SLEEP instead and drop the __maybe_unused ?
>
>
> I remembered I was suggested to use "__maybe_unused" instead of "#ifdef".
>
> You may check this https://lore.kernel.org/patchwork/patch/732981/
>
> The reason may be something like that.

I usually recommend the use of __maybe_unused for the suspend/resume
callbacks for drivers that use SIMPLE_DEV_PM_OPS() or similar helpers
that hide the exact conditions under which the functions get called.

In this driver, there is an explicit #ifdef in the reference to the
functions, so
it would make sense to use the same #ifdef around the definition.

A better question to ask is whether you could use the helpers instead,
and drop the other #ifdef.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v8] i2c: virtio: add a virtio i2c frontend driver

2021-03-19 Thread Arnd Bergmann
On Fri, Mar 19, 2021 at 7:35 AM Viresh Kumar  wrote:
>
> On 19-03-21, 14:29, Jie Deng wrote:
> > I also see example drivers/i2c/busses/i2c-xiic.c. Some people might think
> > this way is more clearer than
> >
> > updating each member in probe. Basically, I think it's just a matter of
> > personal preference which doesn't
>
> Memory used by one instance of struct i2c_adapter (on arm64):
>
> struct i2c_adapter {
...
> };
>
> So, this extra instance that i2c-xiic.c is creating (and that you want to
> create) is going to waste 1KB of memory and it will never be used.
>
> This is bad coding practice IMHO and it is not really about personal 
> preference.

Agreed. At the minimum, it should have been written as an explicit
memcpy() in the few drivers that have that pattern instead of a benign
looking struct assignment, but even then there is nothing good about it
really. Notably, the largest member by far is the 'struct device', and
that by itself should be a red flag as a device is never meant to be
allocated statically (this used to be common in pre-DT days, but
even then was considered bad style).

I suppose the i2c_add_adapter()/i2c_add_numbered_adapter()
interface could be redesigned to handle this better, since every
driver needs to set the same few fields. That however requires finding
someone to spend the effort on coming up with a nice design and
converting lots of drivers over.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v8] i2c: virtio: add a virtio i2c frontend driver

2021-03-18 Thread Arnd Bergmann
On Thu, Mar 18, 2021 at 3:42 PM Enrico Weigelt, metux IT consult
 wrote:
>
> On 16.03.21 08:44, Viresh Kumar wrote:
>
> > FWIW, this limits this driver to support a single device ever. We
> > can't bind multiple devices to this driver now. Yeah, perhaps we will
> > never be required to do so, but who knows.
>
> Actually, I believe multiple devices really should be possible.
>
> The major benefit of virtio-i2c is either bridging certan real bus'es
> into a confined workload, or creating virtual hw testbeds w/o having to
> write a complete emulation (in this case, for dozens of different i2c
> controllers) - and having multiple i2c interfaces in one machine isn't
> exactly rare.

Allowing multiple virtio-i2c controllers in one system, and multiple i2c
devices attached to each controller is clearly something that has to work.

I don't actually see a limitation though. Viresh, what is the problem
you see for having multiple controllers?

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v7] i2c: virtio: add a virtio i2c frontend driver

2021-03-15 Thread Arnd Bergmann
On Mon, Mar 15, 2021 at 6:54 AM Jie Deng  wrote:
> On 2021/3/15 11:13, Jason Wang wrote:
> > On 2021/3/15 9:14 上午, Jie Deng wrote:
> >> On 2021/3/12 16:58, Arnd Bergmann wrote:
> >
> Then do you think it is necessary to mark the virtio bufs with
> cacheline_aligned ?

I think so, yes.

> I haven't seen any virtio interface being marked yet. If this is a
> problem, I believe it should  be common for all virtio devices, right ?

Yes, but it's not a problem if the buffers are allocated separately
because kmalloc provinces a cachelinen aligned buffer on architectures
that need it.

It's only a problem here because there is a single allocation for three
objects that have different ownership states during the DMA (device
owned to-device, cpu-owned, device owned to-cpu).

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v7] i2c: virtio: add a virtio i2c frontend driver

2021-03-12 Thread Arnd Bergmann
On Fri, Mar 12, 2021 at 2:33 PM Jie Deng  wrote:

> +
> +/**
> + * struct virtio_i2c_req - the virtio I2C request structure
> + * @out_hdr: the OUT header of the virtio I2C message
> + * @buf: the buffer into which data is read, or from which it's written
> + * @in_hdr: the IN header of the virtio I2C message
> + */
> +struct virtio_i2c_req {
> +   struct virtio_i2c_out_hdr out_hdr;
> +   uint8_t *buf;
> +   struct virtio_i2c_in_hdr in_hdr;
> +};

The simpler request structure clearly looks better than the previous version,
but I think I found another problem here, at least a theoretical one:

When you map the headers into the DMA address space, they should
be in separate cache lines, to allow the DMA mapping interfaces to
perform cache management on each one without accidentally clobbering
another member.

So far I think there is an assumption that virtio buffers are always
on cache-coherent devices, but if you ever have a virtio-i2c device
backend on a physical interconnect that is not cache coherent (e.g. a
microcontroller that shares the memory bus), this breaks down.

You could avoid this by either allocating arrays of each type separately,
or by marking each member that you pass to the device as
cacheline_aligned.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v6] i2c: virtio: add a virtio i2c frontend driver

2021-03-10 Thread Arnd Bergmann
On Wed, Mar 10, 2021 at 4:59 AM Jason Wang  wrote:
> On 2021/3/10 10:22 上午, Jie Deng wrote:
> > On 2021/3/4 17:15, Jason Wang wrote:
> >>
> >>
> >>> +}
> >>> +
> >>> +if (msgs[i].flags & I2C_M_RD)
> >>> +memcpy(msgs[i].buf, req->buf, msgs[i].len);
> >>
> >>
> >> Sorry if I had asked this before but any rason not to use msg[i].buf
> >> directly?
> >>
> >>
> > The msg[i].buf is passed by the I2C core. I just noticed that these
> > bufs are not
> > always allocated by kmalloc. They may come from the stack, which may
> > cause
> > the check "sg_init_one -> sg_set_buf -> virt_addr_valid"  to fail.
> > Therefore the
> > msg[i].buf is not suitable for direct use here.
>
> Right, stack is virtually mapped.

Maybe there is (or should be) a way to let the i2c core code handle
the bounce buffering in this case. This is surely not a problem that
is unique to this driver, and I'm sure it has come up many times in
the past.

I see that there is a i2c_get_dma_safe_msg_buf() helper for this
purpose, but it has to be called by the driver rather than the core,
so the driver still needs to keep track of each address when it
sends multiple i2c_msg at once, but maybe it can all be done
inside the sg_table instead of yet another structure.

At least this one avoids copying data that is marked with the
I2C_M_DMA_SAFE flag.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v5] i2c: virtio: add a virtio i2c frontend driver

2021-03-02 Thread Arnd Bergmann
On Tue, Mar 2, 2021 at 10:51 AM Stefan Hajnoczi  wrote:
> On Tue, Mar 02, 2021 at 10:42:06AM +0800, Jie Deng wrote:
> > > > +/*
> > > > + * Definitions for virtio I2C Adpter
> > > > + *
> > > > + * Copyright (c) 2021 Intel Corporation. All rights reserved.
> > > > + */
> > > > +
> > > > +#ifndef _UAPI_LINUX_VIRTIO_I2C_H
> > > > +#define _UAPI_LINUX_VIRTIO_I2C_H
> > > Why is this a uapi header? Can't this all be moved into the driver
> > > itself?
>
> Linux VIRTIO drivers provide a uapi header with structs and constants
> that describe the device interface. This allows other software like
> QEMU, other operating systems, etc to reuse these headers instead of
> redefining everything.
>
> These files should contain:
> 1. Device-specific feature bits (VIRTIO__F_)
> 2. VIRTIO Configuration Space layout (struct virtio__config)
> 3. Virtqueue request layout (struct virtio__)
>
> For examples, see  and .

Ok, makes sense. So it's not strictly uapi but just helpful for
virtio applications to reference these. I suppose it is slightly harder
when building qemu on other operating systems though, how does
it get these headers on BSD or Windows?

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v5] i2c: virtio: add a virtio i2c frontend driver

2021-03-01 Thread Arnd Bergmann
On Mon, Mar 1, 2021 at 1:10 PM Andy Shevchenko
 wrote:
> On Mon, Mar 01, 2021 at 02:09:25PM +0200, Andy Shevchenko wrote:
> > On Mon, Mar 01, 2021 at 05:24:41PM +0530, Viresh Kumar wrote:
> > > On 01-03-21, 14:41, Jie Deng wrote:
> > > > +/**
> > > > + * struct virtio_i2c_req - the virtio I2C request structure
> > > > + * @out_hdr: the OUT header of the virtio I2C message
> > > > + * @write_buf: contains one I2C segment being written to the device
> > > > + * @read_buf: contains one I2C segment being read from the device
> > > > + * @in_hdr: the IN header of the virtio I2C message
> > > > + */
> > > > +struct virtio_i2c_req {
> > > > + struct virtio_i2c_out_hdr out_hdr;
> > > > + u8 *write_buf;
> > > > + u8 *read_buf;
> > > > + struct virtio_i2c_in_hdr in_hdr;
> > > > +};
> > >
> > > I am not able to appreciate the use of write/read bufs here as we
> > > aren't trying to read/write data in the same transaction. Why do we
> > > have two bufs here as well as in specs ?
> >
> > I涎 and SMBus support bidirectional transfers as well. I think two buffers is
> > the right thing to do.
>
> Strictly speaking "half duplex".

But the driver does not support this at all: the sglist always has three
members as Viresh says: outhdr, msgbuf and inhdr. It then uses a
bounce buffer for the actual data transfer, and this always goes either
one way or the other.

I think the more important question is: does this driver actually need
the bounce buffer? It doesn't have to worry about adjacent stack
data being clobbered by cache maintenance because virtio is always
cache coherent and, so I suspect the bounce buffer can be left out.

It might actually be simpler to just have a fixed-length array of
headers on the stack and at most the corresponding number of
transfers for one virtqueue_kick().

Is there a reasonable limit on how many transfers we would
expect to handle at once? I see that most callers of i2c_transfer()
hardcode the number to '1' or '2', rarely '3' or '4', while the proposed
implementation seems to be optimized for much larger numbers.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v5] i2c: virtio: add a virtio i2c frontend driver

2021-03-01 Thread Arnd Bergmann
On Mon, Mar 1, 2021 at 7:41 AM Jie Deng  wrote:

> --- /dev/null
> +++ b/include/uapi/linux/virtio_i2c.h
> @@ -0,0 +1,56 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later WITH Linux-syscall-note */
> +/*
> + * Definitions for virtio I2C Adpter
> + *
> + * Copyright (c) 2021 Intel Corporation. All rights reserved.
> + */
> +
> +#ifndef _UAPI_LINUX_VIRTIO_I2C_H
> +#define _UAPI_LINUX_VIRTIO_I2C_H

Why is this a uapi header? Can't this all be moved into the driver
itself?

> +/**
> + * struct virtio_i2c_req - the virtio I2C request structure
> + * @out_hdr: the OUT header of the virtio I2C message
> + * @write_buf: contains one I2C segment being written to the device
> + * @read_buf: contains one I2C segment being read from the device
> + * @in_hdr: the IN header of the virtio I2C message
> + */
> +struct virtio_i2c_req {
> +   struct virtio_i2c_out_hdr out_hdr;
> +   u8 *write_buf;
> +   u8 *read_buf;
> +   struct virtio_i2c_in_hdr in_hdr;
> +};

In particular, this structure looks like it is only ever usable between
the transfer functions in the driver itself, it is shared with neither
user space nor the virtio host side.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 04/18] alpha: Override READ_ONCE() with barriered implementation

2020-07-02 Thread Arnd Bergmann
On Thu, Jul 2, 2020 at 1:18 PM Will Deacon  wrote:
> On Thu, Jul 02, 2020 at 12:08:41PM +0200, Arnd Bergmann wrote:
> > On Thu, Jul 2, 2020 at 11:48 AM Will Deacon  wrote:
> > > On Thu, Jul 02, 2020 at 10:32:39AM +0100, Mark Rutland wrote:

> Not sure I follow you here, but I can confirm that what you're worried
> about doesn't happen for the usual case of a pointer-to-volatile scalar.
>
> For example, ignoring dependency ordering:
>
> unsigned long foo(volatile unsigned long *p)
> {
> return smp_load_acquire(p) + 1;
> }
>
> Ends up looking like:
>
> unsigned long ___p1 = *(const volatile unsigned long *)p;
> smp_mb();
> (volatile unsigned long)___p1;
>
> My understanding is that casting a non-pointer type to volatile doesn't
> do anything, so we're good.

Right, I mixed up the correct

(typeof(*p))___p;

with the incorrect

   *typeof(p)&___p;

which would dereference a volatile pointer and cause the
problem.

The code is all fine then.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 04/18] alpha: Override READ_ONCE() with barriered implementation

2020-07-02 Thread Arnd Bergmann
On Thu, Jul 2, 2020 at 11:48 AM Will Deacon  wrote:
> On Thu, Jul 02, 2020 at 10:32:39AM +0100, Mark Rutland wrote:
> > On Tue, Jun 30, 2020 at 06:37:20PM +0100, Will Deacon wrote:
> > > -#define read_barrier_depends() __asm__ __volatile__("mb": : :"memory")
> > > +#define __smp_load_acquire(p)
> > >   \
> > > +({ \
> > > +   __unqual_scalar_typeof(*p) ___p1 =  \
> > > +   (*(volatile typeof(___p1) *)(p));   \
> > > +   compiletime_assert_atomic_type(*p); \
> > > +   ___p1;  \
> > > +})
> >
> > Sorry if I'm being thick, but doesn't this need a barrier after the
> > volatile access to provide the acquire semantic?
> >
> > IIUC prior to this commit alpha would have used the asm-generic
> > __smp_load_acquire, i.e.
> >
> > | #ifndef __smp_load_acquire
> > | #define __smp_load_acquire(p)   \
> > | ({  \
> > | __unqual_scalar_typeof(*p) ___p1 = READ_ONCE(*p);   \
> > | compiletime_assert_atomic_type(*p); \
> > | __smp_mb(); \
> > | (typeof(*p))___p1;  \
> > | })
> > | #endif

I also have a question that I didn't dare ask when the same
code came up before (I guess it's also what's in the kernel today):

With the cast to 'typeof(*p)' at the end, doesn't that mean we
lose the effect of __unqual_scalar_typeof() again, so any "volatile"
pointer passed into __READ_ONCE_SCALAR() or
__smp_load_acquire() still leads to a volatile load of the original
variable, plus another volatile access to ___p1 after
spilling it to the stack as a non-volatile variable?

I hope I'm missing something obvious here.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 0/5] drivers/char: Constify static variables

2020-07-02 Thread Arnd Bergmann
On Wed, Jul 1, 2020 at 11:48 PM Rikard Falkeborn
 wrote:
>
> Constify some static variables (mostly structs) that are not modified.
>
> Rikard Falkeborn (5):
>   hwrng: bcm2835 - Constify bcm2835_rng_devtype[]
>   hwrng: nomadik - Constify nmk_rng_ids[]
>   hwrng: virtio - Constify id_table[]
>   ipmi: watchdog: Constify ident
>   virtio_console: Constify some static variables

I just realized it was a series rather than a single patch I received. They
all look correct, so

Acked-by: Arnd Bergmann 

but if you do more of those, I would suggest not including the 'size'
output for the small variables as that is not the main point here.
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 02/18] compiler.h: Split {READ, WRITE}_ONCE definitions out into rwonce.h

2020-07-01 Thread Arnd Bergmann
On Wed, Jul 1, 2020 at 12:16 PM Will Deacon  wrote:
> On Tue, Jun 30, 2020 at 09:11:32PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 30, 2020 at 7:37 PM Will Deacon  wrote:
> > >
> > > In preparation for allowing architectures to define their own
> > > implementation of the READ_ONCE() macro, move the generic
> > > {READ,WRITE}_ONCE() definitions out of the unwieldy 'linux/compiler.h'
> > > file and into a new 'rwonce.h' header under 'asm-generic'.
> > >
> > > Acked-by: Paul E. McKenney 
> > > Signed-off-by: Will Deacon 
> > > ---
> > >  include/asm-generic/Kbuild   |  1 +
> > >  include/asm-generic/rwonce.h | 91 
> > >  include/linux/compiler.h | 83 +---
> >
> > Very nice, this has the added benefit of allowing us to stop including
> > asm/barrier.h once linux/compiler.h gets changed to not include
> > asm/rwonce.h.
>
> Yeah, with this series linux/compiler.h _does_ include asm/rwonce.h because
> otherwise there are many callers to fix up, but that could be addressed
> subsequently, I suppose.

Right, I didn't mean you should change that right away. I actually
have a work-in-progress patch series that removes a ton of
indirect header inclusions to improve build speed, and a script to
add the explicit includes into .c files where needed.

> > The asm/barrier.h header has a circular dependency, pulling in
> > linux/compiler.h itself.
>
> Hmm. Once smp_read_barrier_depends() disappears, I could actually remove
> the include of  from asm-generic/rwonce.h. It would have to
> remain for arch/alpha/, however, since we need the barrier definitions to
> implement READ_ONCE(). I can probably also replace the include of
>  in asm-generic/barrier.h with  too (so it's
> still circular, but at least a lot simpler).
>
> I'll have a play...

I think this would require the same kind of fixup for anything that depends on
asm/barrier.h being included implicitly at the moment.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y

2020-07-01 Thread Arnd Bergmann
On Wed, Jul 1, 2020 at 12:19 PM Will Deacon  wrote:
> On Tue, Jun 30, 2020 at 09:25:03PM +0200, Arnd Bergmann wrote:
> > On Tue, Jun 30, 2020 at 7:39 PM Will Deacon  wrote:
> > Once we make gcc-4.9 the minimum version,
> > this could be further improved to
> >
> >__auto_type __x = &(x);
>
> Is anybody working on moving to 4.9? I've seen the mails from Linus
> championing it, but I thought there was a RHEL in support that people
> might care about?

I don't think there was a serious discussion about it so far, and
we only just moved to gcc-4.8.

I think moving to gnu11 (gcc-4.9 or clang) instead of gnu99 has other
benefits as well, so we may well want to do it anyway when something
else comes up.

For __auto_type(), we could do it like

#if (clang or gcc-4.9+)
#define auto_typeof(x) __auto_type
#else
#define auto_typeof(x) typeof(x)
#endif

which could be used in a lot of macros.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 18/18] arm64: lto: Strengthen READ_ONCE() to acquire when CLANG_LTO=y

2020-06-30 Thread Arnd Bergmann
On Tue, Jun 30, 2020 at 7:39 PM Will Deacon  wrote:
> +#define __READ_ONCE(x) \
> +({ \
> +   int atomic = 1; \
> +   union { __unqual_scalar_typeof(x) __val; char __c[1]; } __u;\
> +   typeof(&(x)) __x = &(x);\
> +   switch (sizeof(x)) {\
...
> +   atomic ? (typeof(x))__u.__val : (*(volatile typeof(x) *)__x);   \
> +})

This expands (x) nine times (five in __unqual_scala_typeof()), which can
lead to significant code bloat after preprocessing if something passes a
compound expression into READ_ONCE().
The compiler works it out eventually, but we've seen an actual slowdown
in compile speed from this recently, especially on clang.

I think if you move the

typeof(&(x)) __x = &(x);

line first, all other instances can use typeof(*__x) instead of typeof(x)
and avoid this problem. Once we make gcc-4.9 the minimum version,
this could be further improved to

   __auto_type __x = &(x);

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 02/18] compiler.h: Split {READ, WRITE}_ONCE definitions out into rwonce.h

2020-06-30 Thread Arnd Bergmann
On Tue, Jun 30, 2020 at 7:37 PM Will Deacon  wrote:
>
> In preparation for allowing architectures to define their own
> implementation of the READ_ONCE() macro, move the generic
> {READ,WRITE}_ONCE() definitions out of the unwieldy 'linux/compiler.h'
> file and into a new 'rwonce.h' header under 'asm-generic'.
>
> Acked-by: Paul E. McKenney 
> Signed-off-by: Will Deacon 
> ---
>  include/asm-generic/Kbuild   |  1 +
>  include/asm-generic/rwonce.h | 91 
>  include/linux/compiler.h | 83 +---

Very nice, this has the added benefit of allowing us to stop including
asm/barrier.h once linux/compiler.h gets changed to not include
asm/rwonce.h.

The asm/barrier.h header has a circular dependency, pulling in
linux/compiler.h itself.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] vhost: fix default for vhost_iotlb

2020-04-29 Thread Arnd Bergmann
During randconfig build testing, I ran into a configuration that has
CONFIG_VHOST=m, CONFIG_VHOST_IOTLB=m and CONFIG_VHOST_RING=y, which
makes the iotlb implementation left out from vhost_ring, and in turn
leads to a link failure of the vdpa_sim module:

ERROR: modpost: "vringh_set_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!
ERROR: modpost: "vringh_init_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!
ERROR: modpost: "vringh_iov_push_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!
ERROR: modpost: "vringh_iov_pull_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!
ERROR: modpost: "vringh_complete_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!
ERROR: modpost: "vringh_getdesc_iotlb" [drivers/vdpa/vdpa_sim/vdpa_sim.ko] 
undefined!

Work around it by setting the default for VHOST_IOTLB to avoid this
configuration.

Fixes: e6faeaa12841 ("vhost: drop vring dependency on iotlb")
Signed-off-by: Arnd Bergmann 
---
I fixed this a while ago locally but never got around to sending the
fix. If the problem has been addressed differently in the meantime,
please ignore this one.
---
 drivers/vhost/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/vhost/Kconfig b/drivers/vhost/Kconfig
index 2c75d164b827..ee5f85761024 100644
--- a/drivers/vhost/Kconfig
+++ b/drivers/vhost/Kconfig
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 config VHOST_IOTLB
tristate
+   default y if VHOST=m && VHOST_RING=y
help
  Generic IOTLB implementation for vhost and vringh.
  This option is selected by any driver which needs to support
-- 
2.26.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V9 9/9] virtio: Intel IFC VF driver for VDPA

2020-04-09 Thread Arnd Bergmann
On Thu, Mar 26, 2020 at 3:08 PM Jason Wang  wrote:
>
> From: Zhu Lingshan 
>
> This commit introduced two layers to drive IFC VF:
>
> (1) ifcvf_base layer, which handles IFC VF NIC hardware operations and
> configurations.
>
> (2) ifcvf_main layer, which complies to VDPA bus framework,
> implemented device operations for VDPA bus, handles device probe,
> bus attaching, vring operations, etc.
>
> Signed-off-by: Zhu Lingshan 
> Signed-off-by: Bie Tiwei 
> Signed-off-by: Wang Xiao 
> Signed-off-by: Jason Wang 

> +
> +#define IFCVF_QUEUE_ALIGNMENT  PAGE_SIZE
> +#define IFCVF_QUEUE_MAX32768
> +static u16 ifcvf_vdpa_get_vq_align(struct vdpa_device *vdpa_dev)
> +{
> +   return IFCVF_QUEUE_ALIGNMENT;
> +}

This fails to build on arm64 with 64kb page size (found in linux-next):

/drivers/vdpa/ifcvf/ifcvf_main.c: In function 'ifcvf_vdpa_get_vq_align':
arch/arm64/include/asm/page-def.h:17:20: error: conversion from 'long
unsigned int' to 'u16' {aka 'short unsigned int'} changes value from
'65536' to '0' [-Werror=overflow]
   17 | #define PAGE_SIZE  (_AC(1, UL) << PAGE_SHIFT)
  |^
drivers/vdpa/ifcvf/ifcvf_base.h:37:31: note: in expansion of macro 'PAGE_SIZE'
   37 | #define IFCVF_QUEUE_ALIGNMENT PAGE_SIZE
  |   ^
drivers/vdpa/ifcvf/ifcvf_main.c:231:9: note: in expansion of macro
'IFCVF_QUEUE_ALIGNMENT'
  231 |  return IFCVF_QUEUE_ALIGNMENT;
  | ^

It's probably good enough to just not allow the driver to be built in that
configuration as it's fairly rare but unfortunately there is no simple Kconfig
symbol for it.

In a similar driver, we did

config VMXNET3
tristate "VMware VMXNET3 ethernet driver"
depends on PCI && INET
depends on !(PAGE_SIZE_64KB || ARM64_64K_PAGES || \
 IA64_PAGE_SIZE_64KB || MICROBLAZE_64K_PAGES || \
 PARISC_PAGE_SIZE_64KB || PPC_64K_PAGES)

I think we should probably make PAGE_SIZE_64KB a global symbol
in arch/Kconfig and have it selected by the other symbols so drivers
like yours can add a dependency for it.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] vhost: disable for OABI

2020-04-06 Thread Arnd Bergmann
On Mon, Apr 6, 2020 at 3:02 PM Michael S. Tsirkin  wrote:
>
> On Mon, Apr 06, 2020 at 02:50:32PM +0200, Arnd Bergmann wrote:
> > On Mon, Apr 6, 2020 at 2:12 PM Michael S. Tsirkin  wrote:
> >
> > >
> > > +config VHOST_DPN
> > > +   bool "VHOST dependencies"
> > > +   depends on !ARM || AEABI
> > > +   default y
> > > +   help
> > > + Anything selecting VHOST or VHOST_RING must depend on VHOST_DPN.
> > > + This excludes the deprecated ARM ABI since that forces a 4 byte
> > > + alignment on all structs - incompatible with virtio spec 
> > > requirements.
> > > +
> >
> > This should not be a user-visible option, so just make this 'def_bool
> > !ARM || AEABI'
> >
>
> I like keeping some kind of hint around for when one tries to understand
> why is a specific symbol visible.

I meant you should remove the "VHOST dependencies" prompt, not the
help text, which is certainly useful here. You can also use the three lines

 bool
 depends on !ARM || AEABI
 default y

in front of the help text, but those are equivalent to the one-line version
I suggested.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] vhost: disable for OABI

2020-04-06 Thread Arnd Bergmann
On Mon, Apr 6, 2020 at 2:12 PM Michael S. Tsirkin  wrote:

>
> +config VHOST_DPN
> +   bool "VHOST dependencies"
> +   depends on !ARM || AEABI
> +   default y
> +   help
> + Anything selecting VHOST or VHOST_RING must depend on VHOST_DPN.
> + This excludes the deprecated ARM ABI since that forces a 4 byte
> + alignment on all structs - incompatible with virtio spec 
> requirements.
> +

This should not be a user-visible option, so just make this 'def_bool
!ARM || AEABI'

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 13/22] compat_ioctl: scsi: move ioctl handling into drivers

2020-02-12 Thread Arnd Bergmann
On Wed, Feb 12, 2020 at 10:15 PM Johannes Hirte
 wrote:
>
> On 2020 Jan 02, Arnd Bergmann wrote:

>
> Error in getting drive hardware properties
> Error in getting drive reading properties
> Error in getting drive writing properties
> __
>
> Disc mode is listed as: CD-DA
> ++ WARN: error in ioctl CDROMREADTOCHDR: Bad address
>
> cd-info: Can't get first track number. I give up.

Right, there was also a report about breaking the Fedora installer,
see https://bugzilla.redhat.com/show_bug.cgi?id=1801353

There is a preliminary patch that should fix this, I'll post a
version with more references tomorrow:
https://www.happyassassin.net/temp/0001-Replace-.ioctl-with-.compat_ioctl-in-three-appropria.patch

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 1/9] iomap: Constify ioreadX() iomem argument (as in generic implementation)

2020-01-08 Thread Arnd Bergmann
On Wed, Jan 8, 2020 at 9:05 PM Krzysztof Kozlowski  wrote:
>
> The ioreadX() and ioreadX_rep() helpers have inconsistent interface.  On
> some architectures void *__iomem address argument is a pointer to const,
> on some not.
>
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
>
> Suggested-by: Geert Uytterhoeven 
> Signed-off-by: Krzysztof Kozlowski 
> Reviewed-by: Geert Uytterhoeven 

Thanks for getting this done!

Reviewed-by: Arnd Bergmann 

> Changes since v1:
> 1. Constify also ioreadX_rep() and mmio_insX(),
> 2. Squash lib+alpha+powerpc+parisc+sh into one patch for bisectability,

The alpha and parisc versions should be independent, I was thinking
you leave them as separate patches, but this work for me too.

I have no real opinion on the other 8 patches, I would have left
them out completely, but they don't hurt either.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFT 00/13] iomap: Constify ioreadX() iomem argument

2020-01-08 Thread Arnd Bergmann
On Wed, Jan 8, 2020 at 10:15 AM Krzysztof Kozlowski  wrote:

> > The __force-cast that removes the __iomem here also means that
> > the 'volatile' keyword could be dropped from the argument list,
> > as it has no real effect any more, but then there are a few drivers
> > that mark their iomem pointers as either 'volatile void __iomem*' or
> > (worse) 'volatile void *', so we keep it in the argument list to not
> > add warnings for those drivers.
> >
> > It may be time to change these drivers to not use volatile for __iomem
> > pointers, but that seems out of scope for what Krzysztof is trying
> > to do. Ideally we would be consistent here though, either using volatile
> > all the time or never.
>
> Indeed. I guess there are no objections around const so let me send v2
> for const only.

Ok, sounds good. Maybe mention in the changelog then that the
'volatile' in the interface is intentionally left out, and that only users
of readl/writel still have it to deal with existing drivers.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFT 00/13] iomap: Constify ioreadX() iomem argument

2020-01-08 Thread Arnd Bergmann
On Wed, Jan 8, 2020 at 9:36 AM Christophe Leroy  wrote:
> Le 08/01/2020 à 09:18, Krzysztof Kozlowski a écrit :
> > On Wed, 8 Jan 2020 at 09:13, Geert Uytterhoeven  
> > wrote:
> > I'll add to this one also changes to ioreadX_rep() and add another
> > patch for volatile for reads and writes. I guess your review will be
> > appreciated once more because of ioreadX_rep()
> >
>
> volatile should really only be used where deemed necessary:
>
> https://www.kernel.org/doc/html/latest/process/volatile-considered-harmful.html
>
> It is said: " ...  accessor functions might use volatile on
> architectures where direct I/O memory access does work. Essentially,
> each accessor call becomes a little critical section on its own and
> ensures that the access happens as expected by the programmer."

The I/O accessors are one of the few places in which 'volatile' generally
makes sense, at least for the implementations that do a plain pointer
dereference (probably none of the ones in question here).

In case of readl/writel, this is what we do in asm-generic:

static inline u32 __raw_readl(const volatile void __iomem *addr)
{
return *(const volatile u32 __force *)addr;
}

The __force-cast that removes the __iomem here also means that
the 'volatile' keyword could be dropped from the argument list,
as it has no real effect any more, but then there are a few drivers
that mark their iomem pointers as either 'volatile void __iomem*' or
(worse) 'volatile void *', so we keep it in the argument list to not
add warnings for those drivers.

It may be time to change these drivers to not use volatile for __iomem
pointers, but that seems out of scope for what Krzysztof is trying
to do. Ideally we would be consistent here though, either using volatile
all the time or never.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFT 06/13] arc: Constify ioreadX() iomem argument (as in generic implementation)

2020-01-07 Thread Arnd Bergmann
On Tue, Jan 7, 2020 at 5:54 PM Krzysztof Kozlowski  wrote:
>
> The ioreadX() helpers have inconsistent interface.  On some architectures
> void *__iomem address argument is a pointer to const, on some not.
>
> Implementations of ioreadX() do not modify the memory under the
> address so they can be converted to a "const" version for const-safety
> and consistency among architectures.
>
> Signed-off-by: Krzysztof Kozlowski 

The patch looks correct, but I would not bother here, as it has no
effect on the compiler or its output.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFT 03/13] sh: Constify ioreadX() iomem argument (as in generic implementation)

2020-01-07 Thread Arnd Bergmann
On Tue, Jan 7, 2020 at 5:54 PM Krzysztof Kozlowski  wrote:
>
> The ioreadX() helpers have inconsistent interface.  On some architectures
> void *__iomem address argument is a pointer to const, on some not.
>
> Implementations of ioreadX() do not modify the memory under the address
> so they can be converted to a "const" version for const-safety and
> consistency among architectures.
>
> Signed-off-by: Krzysztof Kozlowski 

The patch looks good, but I think this has to be done together with the powerpc
version and the header that declares the function, for bisectibility.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v3 13/22] compat_ioctl: scsi: move ioctl handling into drivers

2020-01-02 Thread Arnd Bergmann
Each driver calling scsi_ioctl() gets an equivalent compat_ioctl()
handler that implements the same commands by calling scsi_compat_ioctl().

The scsi_cmd_ioctl() and scsi_cmd_blk_ioctl() functions are compatible
at this point, so any driver that calls those can do so for both native
and compat mode, with the argument passed through compat_ptr().

With this, we can remove the entries from fs/compat_ioctl.c.  The new
code is larger, but should be easier to maintain and keep updated with
newly added commands.

Signed-off-by: Arnd Bergmann 
---
 drivers/block/virtio_blk.c |   3 +
 drivers/scsi/ch.c  |   9 ++-
 drivers/scsi/sd.c  |  50 ++
 drivers/scsi/sg.c  |  44 -
 drivers/scsi/sr.c  |  57 ++--
 drivers/scsi/st.c  |  51 --
 fs/compat_ioctl.c  | 132 +
 7 files changed, 142 insertions(+), 204 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7ffd719d89de..fbbf18ac1d5d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -405,6 +405,9 @@ static int virtblk_getgeo(struct block_device *bd, struct 
hd_geometry *geo)
 
 static const struct block_device_operations virtblk_fops = {
.ioctl  = virtblk_ioctl,
+#ifdef CONFIG_COMPAT
+   .compat_ioctl = blkdev_compat_ptr_ioctl,
+#endif
.owner  = THIS_MODULE,
.getgeo = virtblk_getgeo,
 };
diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 76751d6c7f0d..ed5f4a6ae270 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -872,6 +872,10 @@ static long ch_ioctl_compat(struct file * file,
unsigned int cmd, unsigned long arg)
 {
scsi_changer *ch = file->private_data;
+   int retval = scsi_ioctl_block_when_processing_errors(ch->device, cmd,
+   file->f_flags & 
O_NDELAY);
+   if (retval)
+   return retval;
 
switch (cmd) {
case CHIOGPARAMS:
@@ -883,7 +887,7 @@ static long ch_ioctl_compat(struct file * file,
case CHIOINITELEM:
case CHIOSVOLTAG:
/* compatible */
-   return ch_ioctl(file, cmd, arg);
+   return ch_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
case CHIOGSTATUS32:
{
struct changer_element_status32 ces32;
@@ -898,8 +902,7 @@ static long ch_ioctl_compat(struct file * file,
return ch_gstatus(ch, ces32.ces_type, data);
}
default:
-   // return scsi_ioctl_compat(ch->device, cmd, (void*)arg);
-   return -ENOIOCTLCMD;
+   return scsi_compat_ioctl(ch->device, cmd, compat_ptr(arg));
 
}
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index cea625906440..5afb0046b12a 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1465,13 +1465,12 @@ static int sd_getgeo(struct block_device *bdev, struct 
hd_geometry *geo)
  * Note: most ioctls are forward onto the block subsystem or further
  * down in the scsi subsystem.
  **/
-static int sd_ioctl(struct block_device *bdev, fmode_t mode,
-   unsigned int cmd, unsigned long arg)
+static int sd_ioctl_common(struct block_device *bdev, fmode_t mode,
+  unsigned int cmd, void __user *p)
 {
struct gendisk *disk = bdev->bd_disk;
struct scsi_disk *sdkp = scsi_disk(disk);
struct scsi_device *sdp = sdkp->device;
-   void __user *p = (void __user *)arg;
int error;
 
SCSI_LOG_IOCTL(1, sd_printk(KERN_INFO, sdkp, "sd_ioctl: disk=%s, "
@@ -1507,9 +1506,6 @@ static int sd_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
default:
error = scsi_cmd_blk_ioctl(bdev, mode, cmd, p);
-   if (error != -ENOTTY)
-   break;
-   error = scsi_ioctl(sdp, cmd, p);
break;
}
 out:
@@ -1691,39 +1687,31 @@ static void sd_rescan(struct device *dev)
revalidate_disk(sdkp->disk);
 }
 
+static int sd_ioctl(struct block_device *bdev, fmode_t mode,
+   unsigned int cmd, unsigned long arg)
+{
+   void __user *p = (void __user *)arg;
+   int ret;
+
+   ret = sd_ioctl_common(bdev, mode, cmd, p);
+   if (ret != -ENOTTY)
+   return ret;
+
+   return scsi_ioctl(scsi_disk(bdev->bd_disk)->device, cmd, p);
+}
 
 #ifdef CONFIG_COMPAT
-/* 
- * This gets directly called from VFS. When the ioctl 
- * is not recognized we go back to the other translation paths. 
- */
 static int sd_compat_ioctl(struct block_device *bdev, fmode_t mode,
   unsigned int cmd, unsigned long arg)
 {
-   struct gendisk *disk = bdev->bd_disk;
-   struct scsi_disk *sdkp = scsi_disk(disk);

Re: [PATCH 15/24] compat_ioctl: scsi: move ioctl handling into drivers

2019-12-12 Thread Arnd Bergmann
On Thu, Dec 12, 2019 at 1:28 AM Paolo Bonzini  wrote:
> On 12/12/19 00:05, Michael S. Tsirkin wrote:
> >> @@ -405,6 +405,9 @@ static int virtblk_getgeo(struct block_device *bd, 
> >> struct hd_geometry *geo)
> >>
> >>  static const struct block_device_operations virtblk_fops = {
> >>  .ioctl  = virtblk_ioctl,
> >> +#ifdef CONFIG_COMPAT
> >> +.compat_ioctl = blkdev_compat_ptr_ioctl,
> >> +#endif
> >>  .owner  = THIS_MODULE,
> >>  .getgeo = virtblk_getgeo,
> >>  };
> > Hmm - is virtio blk lumped in with scsi things intentionally?
>
> I think it's because the only ioctl for virtio-blk is SG_IO.  It makes
> sense to lump it in with scsi, but I wouldn't mind getting rid of
> CONFIG_VIRTIO_BLK_SCSI altogether.

It currently calls scsi_cmd_blk_ioctl(), which implements a bunch of ioctl
commands, including some that are unrelated to SG_IO:

case SG_GET_VERSION_NUM:
case SCSI_IOCTL_GET_IDLUN:
case SCSI_IOCTL_GET_BUS_NUMBER:
case SG_SET_TIMEOUT:
case SG_GET_TIMEOUT:
case SG_GET_RESERVED_SIZE:
case SG_SET_RESERVED_SIZE:
case SG_EMULATED_HOST:
case SG_IO: {
case CDROM_SEND_PACKET:
case SCSI_IOCTL_SEND_COMMAND:
case CDROMCLOSETRAY:
case CDROMEJECT:

My patch changes all callers of this function, and the idea is
to preserve the existing behavior through my series, so I think
it makes sense to keep my patch as is.

I would assume that calling scsi_cmd_blk_ioctl() is harmless
here, but if you want to remove it or limit the set of supported
commands, that should be independent of my change.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH 15/24] compat_ioctl: scsi: move ioctl handling into drivers

2019-12-11 Thread Arnd Bergmann
Each driver calling scsi_ioctl() gets an equivalent compat_ioctl()
handler that implements the same commands by calling scsi_compat_ioctl().

The scsi_cmd_ioctl() and scsi_cmd_blk_ioctl() functions are compatible
at this point, so any driver that calls those can do so for both native
and compat mode, with the argument passed through compat_ptr().

With this, we can remove the entries from fs/compat_ioctl.c.  The new
code is larger, but should be easier to maintain and keep updated with
newly added commands.

Signed-off-by: Arnd Bergmann 
---
 drivers/block/virtio_blk.c |   3 +
 drivers/scsi/ch.c  |   9 ++-
 drivers/scsi/sd.c  |  50 ++
 drivers/scsi/sg.c  |  44 -
 drivers/scsi/sr.c  |  57 ++--
 drivers/scsi/st.c  |  51 --
 fs/compat_ioctl.c  | 132 +
 7 files changed, 142 insertions(+), 204 deletions(-)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 7ffd719d89de..fbbf18ac1d5d 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -405,6 +405,9 @@ static int virtblk_getgeo(struct block_device *bd, struct 
hd_geometry *geo)
 
 static const struct block_device_operations virtblk_fops = {
.ioctl  = virtblk_ioctl,
+#ifdef CONFIG_COMPAT
+   .compat_ioctl = blkdev_compat_ptr_ioctl,
+#endif
.owner  = THIS_MODULE,
.getgeo = virtblk_getgeo,
 };
diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c
index 76751d6c7f0d..ed5f4a6ae270 100644
--- a/drivers/scsi/ch.c
+++ b/drivers/scsi/ch.c
@@ -872,6 +872,10 @@ static long ch_ioctl_compat(struct file * file,
unsigned int cmd, unsigned long arg)
 {
scsi_changer *ch = file->private_data;
+   int retval = scsi_ioctl_block_when_processing_errors(ch->device, cmd,
+   file->f_flags & 
O_NDELAY);
+   if (retval)
+   return retval;
 
switch (cmd) {
case CHIOGPARAMS:
@@ -883,7 +887,7 @@ static long ch_ioctl_compat(struct file * file,
case CHIOINITELEM:
case CHIOSVOLTAG:
/* compatible */
-   return ch_ioctl(file, cmd, arg);
+   return ch_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
case CHIOGSTATUS32:
{
struct changer_element_status32 ces32;
@@ -898,8 +902,7 @@ static long ch_ioctl_compat(struct file * file,
return ch_gstatus(ch, ces32.ces_type, data);
}
default:
-   // return scsi_ioctl_compat(ch->device, cmd, (void*)arg);
-   return -ENOIOCTLCMD;
+   return scsi_compat_ioctl(ch->device, cmd, compat_ptr(arg));
 
}
 }
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index cea625906440..5afb0046b12a 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1465,13 +1465,12 @@ static int sd_getgeo(struct block_device *bdev, struct 
hd_geometry *geo)
  * Note: most ioctls are forward onto the block subsystem or further
  * down in the scsi subsystem.
  **/
-static int sd_ioctl(struct block_device *bdev, fmode_t mode,
-   unsigned int cmd, unsigned long arg)
+static int sd_ioctl_common(struct block_device *bdev, fmode_t mode,
+  unsigned int cmd, void __user *p)
 {
struct gendisk *disk = bdev->bd_disk;
struct scsi_disk *sdkp = scsi_disk(disk);
struct scsi_device *sdp = sdkp->device;
-   void __user *p = (void __user *)arg;
int error;
 
SCSI_LOG_IOCTL(1, sd_printk(KERN_INFO, sdkp, "sd_ioctl: disk=%s, "
@@ -1507,9 +1506,6 @@ static int sd_ioctl(struct block_device *bdev, fmode_t 
mode,
break;
default:
error = scsi_cmd_blk_ioctl(bdev, mode, cmd, p);
-   if (error != -ENOTTY)
-   break;
-   error = scsi_ioctl(sdp, cmd, p);
break;
}
 out:
@@ -1691,39 +1687,31 @@ static void sd_rescan(struct device *dev)
revalidate_disk(sdkp->disk);
 }
 
+static int sd_ioctl(struct block_device *bdev, fmode_t mode,
+   unsigned int cmd, unsigned long arg)
+{
+   void __user *p = (void __user *)arg;
+   int ret;
+
+   ret = sd_ioctl_common(bdev, mode, cmd, p);
+   if (ret != -ENOTTY)
+   return ret;
+
+   return scsi_ioctl(scsi_disk(bdev->bd_disk)->device, cmd, p);
+}
 
 #ifdef CONFIG_COMPAT
-/* 
- * This gets directly called from VFS. When the ioctl 
- * is not recognized we go back to the other translation paths. 
- */
 static int sd_compat_ioctl(struct block_device *bdev, fmode_t mode,
   unsigned int cmd, unsigned long arg)
 {
-   struct gendisk *disk = bdev->bd_disk;
-   struct scsi_disk *sdkp = scsi_disk(disk);

[PATCH 00/24] block, scsi: final compat_ioctl cleanup

2019-12-11 Thread Arnd Bergmann
Hi Jens, James and Martin,

This series concludes the work I did for linux-5.5 on the compat_ioctl()
cleanup, killing off fs/compat_ioctl.c and block/compat_ioctl.c by moving
everything into drivers.

Overall this would be a reduction both in complexity and line count, but
as I'm also adding documentation the overall number of lines increases
in the end.

My plan was originally to keep the SCSI and block parts separate.
This did not work easily because of interdependencies: I cannot
do the final SCSI cleanup in a good way without first addressing the
CDROM ioctls, so this is one series that I hope could be merged through
either the block or the scsi git trees, or possibly both if you can
pull in the same branch.

The series comes in these steps:

1. clean up the sg v3 interface as suggested by Linus. I have
   talked about this with Doug Gilbert as well, and he would
   rebase his sg v4 patches on top of "compat: scsi: sg: fix v3
   compat read/write interface"

2. Four patches for missing block compat_ioctl handlers, to be
   backported into stable kernels. Separate patches because they
   are needed in different stable versions.

3. Actually moving handlers out of block/compat_ioctl.c and
   block/scsi_ioctl.c into drivers, mixed in with cleanup
   patches

4. Document how to do this right. I keep getting asked about this,
   and it helps to point to some documentation file.

The series is avaialable for testing at [1].

   Arnd

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground.git/log/?h=compat-ioctl-endgame

Arnd Bergmann (24):
  compat: ARM64: always include asm-generic/compat.h
  compat: scsi: sg: fix v3 compat read/write interface
  compat_ioctl: block: handle BLKREPORTZONE/BLKRESETZONE
  compat_ioctl: block: handle BLKGETZONESZ/BLKGETNRZONES
  compat_ioctl: block: handle add zone open, close and finish ioctl
  compat_ioctl: block: handle Persistent Reservations
  compaT_ioctl: ubd, aoe: use blkdev_compat_ptr_ioctl
  compat_ioctl: move CDROM_SEND_PACKET handling into scsi
  compat_ioctl: move CDROMREADADIO to cdrom.c
  compat_ioctl: cdrom: handle CDROM_LAST_WRITTEN
  compat_ioctl: block: handle cdrom compat ioctl in non-cdrom drivers
  compat_ioctl: add scsi_compat_ioctl
  compat_ioctl: bsg: add handler
  compat_ioctl: ide: floppy: add handler
  compat_ioctl: scsi: move ioctl handling into drivers
  compat_ioctl: move sys_compat_ioctl() to ioctl.c
  compat_ioctl: simplify the implementation
  compat_ioctl: move cdrom commands into cdrom.c
  compat_ioctl: scsi: handle HDIO commands from drivers
  compat_ioctl: move HDIO ioctl handling into drivers/ide
  compat_ioctl: block: move blkdev_compat_ioctl() into ioctl.c
  compat_ioctl: block: simplify compat_blkpg_ioctl()
  compat_ioctl: simplify up block/ioctl.c
  Documentation: document ioctl interfaces better

 Documentation/core-api/index.rst   |   1 +
 Documentation/core-api/ioctl.rst   | 250 +++
 arch/arm64/include/asm/compat.h|   5 +-
 arch/um/drivers/ubd_kern.c |   1 +
 block/Makefile |   1 -
 block/bsg.c|   1 +
 block/compat_ioctl.c   | 411 -
 block/ioctl.c  | 319 +++
 block/scsi_ioctl.c | 214 -
 drivers/ata/libata-scsi.c  |   9 +
 drivers/block/aoe/aoeblk.c |   1 +
 drivers/block/floppy.c |   3 +
 drivers/block/paride/pcd.c |   3 +
 drivers/block/paride/pd.c  |   1 +
 drivers/block/paride/pf.c  |   1 +
 drivers/block/pktcdvd.c|  26 +-
 drivers/block/sunvdc.c |   1 +
 drivers/block/virtio_blk.c |   3 +
 drivers/block/xen-blkfront.c   |   1 +
 drivers/cdrom/cdrom.c  |  35 ++-
 drivers/cdrom/gdrom.c  |   3 +
 drivers/ide/ide-cd.c   |  40 +++
 drivers/ide/ide-disk.c |   3 +
 drivers/ide/ide-floppy.c   |   4 +
 drivers/ide/ide-floppy.h   |   2 +
 drivers/ide/ide-floppy_ioctl.c |  35 +++
 drivers/ide/ide-gd.c   |  14 +
 drivers/ide/ide-ioctls.c   |  47 ++-
 drivers/ide/ide-tape.c |  14 +
 drivers/scsi/aic94xx/aic94xx_init.c|   3 +
 drivers/scsi/ch.c  |   9 +-
 drivers/scsi/hisi_sas/hisi_sas_v1_hw.c |   3 +
 drivers/scsi/hisi_sas/hisi_sas_v2_hw.c |   3 +
 drivers/scsi/hisi_sas/hisi_sas_v3_hw.c |   3 +
 drivers/scsi/ipr.c |   3 +
 drivers/scsi/isci/init.c   |   3 +
 drivers/scsi/mvsas/mv_init.c   |   3 +
 drivers/scsi/pm8001/pm8001_init.c  |   3 +
 drivers/scsi/scsi_ioctl.c  |  54 +++-
 drivers/scsi/sd.c  |  50 ++-
 drivers/scsi/sg.c  | 169 +-
 drivers/scsi/sr.c  |  53 +++-
 drivers/scsi/st.c

Re: [PATCH 01/13] compiler.h: Split {READ, WRITE}_ONCE definitions out into rwonce.h

2019-11-11 Thread Arnd Bergmann
On Mon, Nov 11, 2019 at 9:10 AM Christian Borntraeger
 wrote:
> On 08.11.19 20:57, Arnd Bergmann wrote:
> > On Fri, Nov 8, 2019 at 6:01 PM Will Deacon  wrote:
> >>
> >> In preparation for allowing architectures to define their own
> >> implementation of the 'READ_ONCE()' macro, move the generic
> >> '{READ,WRITE}_ONCE()' definitions out of the unwieldy 'linux/compiler.h'
> >> and into a new 'rwonce.h' header under 'asm-generic'.
> >
> > Adding Christian Bornträger to Cc, he originally added the
> > READ_ONCE()/WRITE_ONCE()
> > code.
> >
> > I wonder if it would be appropriate now to revert back to a much simpler 
> > version
> > of these helpers for any modern compiler. As I understand, only gcc-4.6 and
> > gcc4.7 actually need the song-and-dance version with the union and 
> > switch/case,
> > while for others, we can might be able back to a macro doing a volatile 
> > access.
>
> As far as I know this particular issue with  volatile access on aggregate 
> types
> was fixed in gcc 4.8. On the other hand we know that the current construct 
> will
> work on all compilers. Not so sure about the orignal ACCESS_ONCE 
> implementation.

I've seen problems with clang on the current version, leading to unnecessary
temporaries being spilled to the stack in some cases, so I think it would still
help to simplify it.

We probably don't want the exact ACCESS_ONCE() implementation back
that existed before, but rather something that implements the stricter
READ_ONCE() and WRITE_ONCE(). I'd probably also want to avoid the
__builtin_memcpy() exception for odd-sized accesses and instead have
a separate way to do those.

  Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH 01/13] compiler.h: Split {READ, WRITE}_ONCE definitions out into rwonce.h

2019-11-08 Thread Arnd Bergmann
On Fri, Nov 8, 2019 at 6:01 PM Will Deacon  wrote:
>
> In preparation for allowing architectures to define their own
> implementation of the 'READ_ONCE()' macro, move the generic
> '{READ,WRITE}_ONCE()' definitions out of the unwieldy 'linux/compiler.h'
> and into a new 'rwonce.h' header under 'asm-generic'.

Adding Christian Bornträger to Cc, he originally added the
READ_ONCE()/WRITE_ONCE()
code.

I wonder if it would be appropriate now to revert back to a much simpler version
of these helpers for any modern compiler. As I understand, only gcc-4.6 and
gcc4.7 actually need the song-and-dance version with the union and switch/case,
while for others, we can might be able back to a macro doing a volatile access.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

[PATCH v5 12/29] compat_ioctl: move drivers to compat_ptr_ioctl

2019-07-30 Thread Arnd Bergmann
Each of these drivers has a copy of the same trivial helper function to
convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.

Acked-by: Greg Kroah-Hartman 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Jarkko Sakkinen 
Reviewed-by: Jason Gunthorpe 
Reviewed-by: Jiri Kosina 
Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Arnd Bergmann 
---
 drivers/char/ppdev.c  | 12 +-
 drivers/char/tpm/tpm_vtpm_proxy.c | 12 +-
 drivers/firewire/core-cdev.c  | 12 +-
 drivers/hid/usbhid/hiddev.c   | 11 +
 drivers/hwtracing/stm/core.c  | 12 +-
 drivers/misc/mei/main.c   | 22 +
 drivers/mtd/ubi/cdev.c| 36 +++-
 drivers/net/tap.c | 12 +-
 drivers/staging/pi433/pi433_if.c  | 12 +-
 drivers/usb/core/devio.c  | 16 +
 drivers/vfio/vfio.c   | 39 +++
 drivers/vhost/net.c   | 12 +-
 drivers/vhost/scsi.c  | 12 +-
 drivers/vhost/test.c  | 12 +-
 drivers/vhost/vsock.c | 12 +-
 fs/ceph/dir.c |  2 +-
 fs/ceph/file.c|  2 +-
 fs/ceph/super.h   |  9 ---
 fs/fat/file.c | 13 +--
 19 files changed, 22 insertions(+), 248 deletions(-)

diff --git a/drivers/char/ppdev.c b/drivers/char/ppdev.c
index f0a8adca1eee..c4d5cc4a1d3e 100644
--- a/drivers/char/ppdev.c
+++ b/drivers/char/ppdev.c
@@ -670,14 +670,6 @@ static long pp_ioctl(struct file *file, unsigned int cmd, 
unsigned long arg)
return ret;
 }
 
-#ifdef CONFIG_COMPAT
-static long pp_compat_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
-{
-   return pp_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static int pp_open(struct inode *inode, struct file *file)
 {
unsigned int minor = iminor(inode);
@@ -786,9 +778,7 @@ static const struct file_operations pp_fops = {
.write  = pp_write,
.poll   = pp_poll,
.unlocked_ioctl = pp_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = pp_compat_ioctl,
-#endif
+   .compat_ioctl   = compat_ptr_ioctl,
.open   = pp_open,
.release= pp_release,
 };
diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c 
b/drivers/char/tpm/tpm_vtpm_proxy.c
index 2f6e087ec496..91c772e38bb5 100644
--- a/drivers/char/tpm/tpm_vtpm_proxy.c
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -670,20 +670,10 @@ static long vtpmx_fops_ioctl(struct file *f, unsigned int 
ioctl,
}
 }
 
-#ifdef CONFIG_COMPAT
-static long vtpmx_fops_compat_ioctl(struct file *f, unsigned int ioctl,
- unsigned long arg)
-{
-   return vtpmx_fops_ioctl(f, ioctl, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations vtpmx_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = vtpmx_fops_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl = vtpmx_fops_compat_ioctl,
-#endif
+   .compat_ioctl = compat_ptr_ioctl,
.llseek = noop_llseek,
 };
 
diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index 1da7ba18d399..c777088f5828 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -1646,14 +1646,6 @@ static long fw_device_op_ioctl(struct file *file,
return dispatch_ioctl(file->private_data, cmd, (void __user *)arg);
 }
 
-#ifdef CONFIG_COMPAT
-static long fw_device_op_compat_ioctl(struct file *file,
- unsigned int cmd, unsigned long arg)
-{
-   return dispatch_ioctl(file->private_data, cmd, compat_ptr(arg));
-}
-#endif
-
 static int fw_device_op_mmap(struct file *file, struct vm_area_struct *vma)
 {
struct client *client = file->private_data;
@@ -1795,7 +1787,5 @@ const struct file_operations fw_device_ops = {
.mmap   = fw_device_op_mmap,
.release= fw_device_op_release,
.poll   = fw_device_op_poll,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = fw_device_op_compat_ioctl,
-#endif
+   .compat_ioctl   = compat_ptr_ioctl,
 };
diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index 55b72573066b..70009bd76ac1 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -842,13 +842,6 @@ static long hiddev_ioctl(struct file *file, unsigned int 
cmd, unsigned long arg)
return r;
 }
 
-#ifdef CONFIG_COMPAT
-static long hiddev_compat_ioctl(struct file *file, unsigned int cmd, unsigned 
long arg)
-{
-   return hiddev_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations hiddev_fops = {
.owner =THIS_MODULE,
.read = hiddev_read,
@@ -858,9 +851,7 @@ static con

[PATCH v3 09/26] compat_ioctl: move drivers to compat_ptr_ioctl

2019-04-16 Thread Arnd Bergmann
Each of these drivers has a copy of the same trivial helper function to
convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.

Acked-by: Greg Kroah-Hartman 
Reviewed-by: Jarkko Sakkinen 
Reviewed-by: Jason Gunthorpe 
Signed-off-by: Arnd Bergmann 
---
 drivers/char/ppdev.c  | 12 +-
 drivers/char/tpm/tpm_vtpm_proxy.c | 12 +-
 drivers/firewire/core-cdev.c  | 12 +-
 drivers/hid/usbhid/hiddev.c   | 11 +
 drivers/hwtracing/stm/core.c  | 12 +-
 drivers/misc/mei/main.c   | 22 +
 drivers/mtd/ubi/cdev.c| 36 +++-
 drivers/net/tap.c | 12 +-
 drivers/staging/pi433/pi433_if.c  | 12 +-
 drivers/usb/core/devio.c  | 16 +
 drivers/vfio/vfio.c   | 39 +++
 drivers/vhost/net.c   | 12 +-
 drivers/vhost/scsi.c  | 12 +-
 drivers/vhost/test.c  | 12 +-
 drivers/vhost/vsock.c | 12 +-
 fs/fat/file.c | 13 +--
 16 files changed, 20 insertions(+), 237 deletions(-)

diff --git a/drivers/char/ppdev.c b/drivers/char/ppdev.c
index 1ae77b41050a..e96c8d9623e0 100644
--- a/drivers/char/ppdev.c
+++ b/drivers/char/ppdev.c
@@ -674,14 +674,6 @@ static long pp_ioctl(struct file *file, unsigned int cmd, 
unsigned long arg)
return ret;
 }
 
-#ifdef CONFIG_COMPAT
-static long pp_compat_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
-{
-   return pp_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static int pp_open(struct inode *inode, struct file *file)
 {
unsigned int minor = iminor(inode);
@@ -790,9 +782,7 @@ static const struct file_operations pp_fops = {
.write  = pp_write,
.poll   = pp_poll,
.unlocked_ioctl = pp_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = pp_compat_ioctl,
-#endif
+   .compat_ioctl   = compat_ptr_ioctl,
.open   = pp_open,
.release= pp_release,
 };
diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c 
b/drivers/char/tpm/tpm_vtpm_proxy.c
index d74f3de74ae6..fb845f0a430b 100644
--- a/drivers/char/tpm/tpm_vtpm_proxy.c
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -675,20 +675,10 @@ static long vtpmx_fops_ioctl(struct file *f, unsigned int 
ioctl,
}
 }
 
-#ifdef CONFIG_COMPAT
-static long vtpmx_fops_compat_ioctl(struct file *f, unsigned int ioctl,
- unsigned long arg)
-{
-   return vtpmx_fops_ioctl(f, ioctl, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations vtpmx_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = vtpmx_fops_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl = vtpmx_fops_compat_ioctl,
-#endif
+   .compat_ioctl = compat_ptr_ioctl,
.llseek = noop_llseek,
 };
 
diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index 16a7045736a9..fb934680fdd3 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -1659,14 +1659,6 @@ static long fw_device_op_ioctl(struct file *file,
return dispatch_ioctl(file->private_data, cmd, (void __user *)arg);
 }
 
-#ifdef CONFIG_COMPAT
-static long fw_device_op_compat_ioctl(struct file *file,
- unsigned int cmd, unsigned long arg)
-{
-   return dispatch_ioctl(file->private_data, cmd, compat_ptr(arg));
-}
-#endif
-
 static int fw_device_op_mmap(struct file *file, struct vm_area_struct *vma)
 {
struct client *client = file->private_data;
@@ -1808,7 +1800,5 @@ const struct file_operations fw_device_ops = {
.mmap   = fw_device_op_mmap,
.release= fw_device_op_release,
.poll   = fw_device_op_poll,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = fw_device_op_compat_ioctl,
-#endif
+   .compat_ioctl   = compat_ptr_ioctl,
 };
diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index a746017fac17..ef4a1cd389d6 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -855,13 +855,6 @@ static long hiddev_ioctl(struct file *file, unsigned int 
cmd, unsigned long arg)
return r;
 }
 
-#ifdef CONFIG_COMPAT
-static long hiddev_compat_ioctl(struct file *file, unsigned int cmd, unsigned 
long arg)
-{
-   return hiddev_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations hiddev_fops = {
.owner =THIS_MODULE,
.read = hiddev_read,
@@ -871,9 +864,7 @@ static const struct file_operations hiddev_fops = {
.release =  hiddev_release,
.unlocked_ioctl =   hiddev_ioctl,
.fasync =   hiddev_fasync,
-#ifdef CONFIG_COMPAT
-   .com

[PATCH] vhost: silence an unused-variable warning

2019-03-06 Thread Arnd Bergmann
On some architectures, the MMU can be disabled, leading to access_ok()
becoming an empty macro that does not evaluate its size argument,
which in turn produces an unused-variable warning:

drivers/vhost/vhost.c:1191:9: error: unused variable 's' 
[-Werror,-Wunused-variable]
size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;

Mark the variable as __maybe_unused to shut up that warning.

Signed-off-by: Arnd Bergmann 
---
 drivers/vhost/vhost.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index a2e5dc7716e2..5ace833de746 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -1188,7 +1188,7 @@ static bool vq_access_ok(struct vhost_virtqueue *vq, 
unsigned int num,
 struct vring_used __user *used)
 
 {
-   size_t s = vhost_has_feature(vq, VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
+   size_t s __maybe_unused = vhost_has_feature(vq, 
VIRTIO_RING_F_EVENT_IDX) ? 2 : 0;
 
return access_ok(desc, num * sizeof *desc) &&
   access_ok(avail,
-- 
2.20.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support

2018-09-24 Thread Arnd Bergmann
On Mon, Sep 17, 2018 at 3:00 PM Thomas Gleixner  wrote:
>
> On Fri, 14 Sep 2018, Arnd Bergmann wrote:
> > On Fri, Sep 14, 2018 at 2:52 PM Thomas Gleixner  wrote:
> > A couple of architectures (s390, ia64, riscv, powerpc, arm64)
> > implement the vdso as assembler code at the moment, so they
> > won't be as easy to consolidate (other than outright replacing all
> > the code).
> >
> > The other five:
> > arch/x86/entry/vdso/vclock_gettime.c
> > arch/sparc/vdso/vclock_gettime.c
> > arch/nds32/kernel/vdso/gettimeofday.c
> > arch/mips/vdso/gettimeofday.c
> > arch/arm/vdso/vgettimeofday.c
> >
> > are basically all minor variations of the same code base and could be
> > consolidated to some degree.
> > Any suggestions here? Should we plan to do that consolitdation based on
> > your new version, or just add clock_gettime64 in arm32 and x86-32, and then
> > be done with it? The other ones will obviously still be fast for 32-bit 
> > time_t
> > and will have a working non-vdso sys_clock_getttime64().
>
> In principle consolidating all those implementations should be possible to
> some extent and probably worthwhile. What's arch specific are the actual
> accessors to the hardware clocks.

Ok.

> > I also wonder about clock_getres(): half the architectures seem to implement
> > it in vdso, but notably arm32 and x86 don't, and I had not expected it to be
> > performance critical given that the result is easily cached in user space.
>
> getres() is not really performance critical, but adding it does not create
> a huge problem either.

Right, I'd just not add a getres_time64() to the vdso then.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [patch 00/11] x86/vdso: Cleanups, simmplifications and CLOCK_TAI support

2018-09-14 Thread Arnd Bergmann
On Fri, Sep 14, 2018 at 2:52 PM Thomas Gleixner  wrote:
>
> Matt attempted to add CLOCK_TAI support to the VDSO clock_gettime()
> implementation, which extended the clockid switch case and added yet
> another slightly different copy of the same code.
>
> Especially the extended switch case is problematic as the compiler tends to
> generate a jump table which then requires to use retpolines. If jump tables
> are disabled it adds yet another conditional to the existing maze.
>
> This series takes a different approach by consolidating the almost
> identical functions into one implementation for high resolution clocks and
> one for the coarse grained clock ids by storing the base data for each
> clock id in an array which is indexed by the clock id.
>
> This completely eliminates the switch case and allows further
> simplifications of the code base, which at the end all together gain a few
> cycles performance or at least stay on par with todays code. The resulting
> performance depends heavily on the micro architecture and the compiler.

No objections from my side, just a few general remarks: Deepa
and I have discussed the vdso in the past for 2038. I have started
a patch that I'll have to redo on top of yours, but that is fine, your
cleanup is only going to help here.

More generally speaking, Deepa said that she would like to see
some generalization on the vdso before adding the time64_t
based variants. Your series probably makes x86 diverge more
from the others, which makes it harder to consolidate them again,
but we might just as well use your new implementation to base the
generic one on, and just move the other ones over to that.

A couple of architectures (s390, ia64, riscv, powerpc, arm64)
implement the vdso as assembler code at the moment, so they
won't be as easy to consolidate (other than outright replacing all
the code).

The other five:
arch/x86/entry/vdso/vclock_gettime.c
arch/sparc/vdso/vclock_gettime.c
arch/nds32/kernel/vdso/gettimeofday.c
arch/mips/vdso/gettimeofday.c
arch/arm/vdso/vgettimeofday.c

are basically all minor variations of the same code base and could be
consolidated to some degree.
Any suggestions here? Should we plan to do that consolitdation based on
your new version, or just add clock_gettime64 in arm32 and x86-32, and then
be done with it? The other ones will obviously still be fast for 32-bit time_t
and will have a working non-vdso sys_clock_getttime64().

I also wonder about clock_getres(): half the architectures seem to implement
it in vdso, but notably arm32 and x86 don't, and I had not expected it to be
performance critical given that the result is easily cached in user space.

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 02/17] compat_ioctl: move drivers to generic_compat_ioctl_ptrarg

2018-09-12 Thread Arnd Bergmann
On Wed, Sep 12, 2018 at 5:33 PM Jason Gunthorpe  wrote:
>
> On Wed, Sep 12, 2018 at 05:01:03PM +0200, Arnd Bergmann wrote:
> > Each of these drivers has a copy of the same trivial helper function to
> > convert the pointer argument and then call the native ioctl handler.
> >
> > We now have a generic implementation of that, so use it.
> >
>
> For vtpm:
>
> Reviewed-by: Jason Gunthorpe 
>
> Arnd, would you consider including a patch as part of/after this
> series to make compat_ioctl in drivers/infiniband/core/uverbs_main.c
> use this as well?  Looks like a bug too?

That should be included in patch 5 in this series. I may have skipped
some Cc there since it had too many recipients (sent only to the
mailing lists instead).

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v2 02/17] compat_ioctl: move drivers to generic_compat_ioctl_ptrarg

2018-09-12 Thread Arnd Bergmann
Each of these drivers has a copy of the same trivial helper function to
convert the pointer argument and then call the native ioctl handler.

We now have a generic implementation of that, so use it.

Signed-off-by: Arnd Bergmann 
---
 drivers/char/ppdev.c  | 12 +-
 drivers/char/tpm/tpm_vtpm_proxy.c | 12 +-
 drivers/firewire/core-cdev.c  | 12 +-
 drivers/hid/usbhid/hiddev.c   | 11 +
 drivers/hwtracing/stm/core.c  | 12 +-
 drivers/misc/mei/main.c   | 22 +
 drivers/mtd/ubi/cdev.c| 36 +++-
 drivers/net/tap.c | 12 +-
 drivers/staging/pi433/pi433_if.c  | 12 +-
 drivers/usb/core/devio.c  | 16 +
 drivers/vfio/vfio.c   | 39 +++
 drivers/vhost/net.c   | 12 +-
 drivers/vhost/scsi.c  | 12 +-
 drivers/vhost/test.c  | 12 +-
 drivers/vhost/vsock.c | 12 +-
 fs/fat/file.c | 13 +--
 16 files changed, 20 insertions(+), 237 deletions(-)

diff --git a/drivers/char/ppdev.c b/drivers/char/ppdev.c
index 1ae77b41050a..c38a62457cf0 100644
--- a/drivers/char/ppdev.c
+++ b/drivers/char/ppdev.c
@@ -674,14 +674,6 @@ static long pp_ioctl(struct file *file, unsigned int cmd, 
unsigned long arg)
return ret;
 }
 
-#ifdef CONFIG_COMPAT
-static long pp_compat_ioctl(struct file *file, unsigned int cmd,
-   unsigned long arg)
-{
-   return pp_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static int pp_open(struct inode *inode, struct file *file)
 {
unsigned int minor = iminor(inode);
@@ -790,9 +782,7 @@ static const struct file_operations pp_fops = {
.write  = pp_write,
.poll   = pp_poll,
.unlocked_ioctl = pp_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = pp_compat_ioctl,
-#endif
+   .compat_ioctl   = generic_compat_ioctl_ptrarg,
.open   = pp_open,
.release= pp_release,
 };
diff --git a/drivers/char/tpm/tpm_vtpm_proxy.c 
b/drivers/char/tpm/tpm_vtpm_proxy.c
index 87a0ce47f201..a170f5ca7416 100644
--- a/drivers/char/tpm/tpm_vtpm_proxy.c
+++ b/drivers/char/tpm/tpm_vtpm_proxy.c
@@ -678,20 +678,10 @@ static long vtpmx_fops_ioctl(struct file *f, unsigned int 
ioctl,
}
 }
 
-#ifdef CONFIG_COMPAT
-static long vtpmx_fops_compat_ioctl(struct file *f, unsigned int ioctl,
- unsigned long arg)
-{
-   return vtpmx_fops_ioctl(f, ioctl, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations vtpmx_fops = {
.owner = THIS_MODULE,
.unlocked_ioctl = vtpmx_fops_ioctl,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl = vtpmx_fops_compat_ioctl,
-#endif
+   .compat_ioctl = generic_compat_ioctl_ptrarg,
.llseek = noop_llseek,
 };
 
diff --git a/drivers/firewire/core-cdev.c b/drivers/firewire/core-cdev.c
index d8e185582642..2acc0c9ddf94 100644
--- a/drivers/firewire/core-cdev.c
+++ b/drivers/firewire/core-cdev.c
@@ -1659,14 +1659,6 @@ static long fw_device_op_ioctl(struct file *file,
return dispatch_ioctl(file->private_data, cmd, (void __user *)arg);
 }
 
-#ifdef CONFIG_COMPAT
-static long fw_device_op_compat_ioctl(struct file *file,
- unsigned int cmd, unsigned long arg)
-{
-   return dispatch_ioctl(file->private_data, cmd, compat_ptr(arg));
-}
-#endif
-
 static int fw_device_op_mmap(struct file *file, struct vm_area_struct *vma)
 {
struct client *client = file->private_data;
@@ -1808,7 +1800,5 @@ const struct file_operations fw_device_ops = {
.mmap   = fw_device_op_mmap,
.release= fw_device_op_release,
.poll   = fw_device_op_poll,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = fw_device_op_compat_ioctl,
-#endif
+   .compat_ioctl   = generic_compat_ioctl_ptrarg,
 };
diff --git a/drivers/hid/usbhid/hiddev.c b/drivers/hid/usbhid/hiddev.c
index 23872d08308c..73a168f97024 100644
--- a/drivers/hid/usbhid/hiddev.c
+++ b/drivers/hid/usbhid/hiddev.c
@@ -845,13 +845,6 @@ static long hiddev_ioctl(struct file *file, unsigned int 
cmd, unsigned long arg)
return r;
 }
 
-#ifdef CONFIG_COMPAT
-static long hiddev_compat_ioctl(struct file *file, unsigned int cmd, unsigned 
long arg)
-{
-   return hiddev_ioctl(file, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
 static const struct file_operations hiddev_fops = {
.owner =THIS_MODULE,
.read = hiddev_read,
@@ -861,9 +854,7 @@ static const struct file_operations hiddev_fops = {
.release =  hiddev_release,
.unlocked_ioctl =   hiddev_ioctl,
.fasync =   hiddev_fasync,
-#ifdef CONFIG_COMPAT
-   .compat_ioctl   = hiddev_compat_ioctl,
-#endif
+   .com

Re: [PATCH v4 1/3] compiler-gcc.h: add gnu_inline to all inline declarations

2018-06-08 Thread Arnd Bergmann
On Thu, Jun 7, 2018 at 10:49 PM, Nick Desaulniers
 wrote:
> Functions marked extern inline do not emit an externally visible
> function when the gnu89 C standard is used. Some KBUILD Makefiles
> overwrite KBUILD_CFLAGS. This is an issue for GCC 5.1+ users as without
> an explicit C standard specified, the default is gnu11. Since c99, the
> semantics of extern inline have changed such that an externally visible
> function is always emitted. This can lead to multiple definition errors
> of extern inline functions at link time of compilation units whose build
> files have removed an explicit C standard compiler flag for users of GCC
> 5.1+ or Clang.
>
> Signed-off-by: Nick Desaulniers 
> Suggested-by: H. Peter Anvin 
> Suggested-by: Joe Perches 

I suspect this will break Geert's gcc-4.1.2, which I think doesn't have that
attribute yet (4.1.3 or higher have it according to the documentation.

It wouldn't be hard to work around that if we want to keep that version
working, or we could decide that it's time to officially stop supporting
that version, but we should probably decide on one or the other.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] x86: paravirt: make native_save_fl extern inline

2018-06-05 Thread Arnd Bergmann
On Tue, Jun 5, 2018 at 11:28 PM, Arnd Bergmann  wrote:
> On Tue, Jun 5, 2018 at 7:05 PM, Nick Desaulniers
>  wrote:
>>
>> The semantics of extern inline has changed since gnu89. This means that
>> folks using GCC versions >= 5.1 may see symbol redefinition errors at
>> link time for subdirs that override KBUILD_CFLAGS (making the C standard
>> used implicit) regardless of this patch. This has been cleaned up
>> earlier in the patch set, but is left as a note in the commit message
>> for future travelers.
>
> I think the keyword you are missing is
>
> __attribute__((gnu_inline))
>
> which forces the gnu89 behavior on all compiler versions. It's been supported
> since gcc-4.2, so it should not cause problems on any compiler that is able
> to build an x86 kernel.

Nevermind, I just saw you already posted that.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] x86: paravirt: make native_save_fl extern inline

2018-06-05 Thread Arnd Bergmann
On Tue, Jun 5, 2018 at 7:05 PM, Nick Desaulniers
 wrote:
>
> The semantics of extern inline has changed since gnu89. This means that
> folks using GCC versions >= 5.1 may see symbol redefinition errors at
> link time for subdirs that override KBUILD_CFLAGS (making the C standard
> used implicit) regardless of this patch. This has been cleaned up
> earlier in the patch set, but is left as a note in the commit message
> for future travelers.

I think the keyword you are missing is

__attribute__((gnu_inline))

which forces the gnu89 behavior on all compiler versions. It's been supported
since gcc-4.2, so it should not cause problems on any compiler that is able
to build an x86 kernel.

   Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[4.4-stable 12/22] virtio_balloon: prevent uninitialized variable use

2018-02-20 Thread Arnd Bergmann
commit f0bb2d50dfcc519f06f901aac88502be6ff1df2c upstream.

The latest gcc-7.0.1 snapshot reports a new warning:

virtio/virtio_balloon.c: In function 'update_balloon_stats':
virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in 
this function [-Werror=uninitialized]

This seems absolutely right, so we should add an extra check to
prevent copying uninitialized stack data into the statistics.
>From all I can tell, this has been broken since the statistics code
was originally added in 2.6.34.

Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon 
driver (V4)")
Signed-off-by: Arnd Bergmann 
Signed-off-by: Ladi Prosek 
Signed-off-by: Michael S. Tsirkin 
[arnd: backported to 4.4]
Signed-off-by: Arnd Bergmann 
---
 drivers/virtio/virtio_balloon.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 01d15dca940e..26d0dff069f0 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -239,12 +239,15 @@ static void update_balloon_stats(struct virtio_balloon 
*vb)
all_vm_events(events);
si_meminfo(&i);
 
+
+#ifdef CONFIG_VM_EVENT_COUNTERS
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_IN,
pages_to_bytes(events[PSWPIN]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
pages_to_bytes(events[PSWPOUT]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]);
update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]);
+#endif
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE,
pages_to_bytes(i.freeram));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMTOT,
-- 
2.9.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH net-next] virtio-net: mark PM functions as __maybe_unused

2017-07-25 Thread Arnd Bergmann
After removing the reset function, the freeze and restore functions
are now unused when CONFIG_PM_SLEEP is disabled:

drivers/net/virtio_net.c:1881:12: error: 'virtnet_restore_up' defined but not 
used [-Werror=unused-function]
 static int virtnet_restore_up(struct virtio_device *vdev)
drivers/net/virtio_net.c:1859:13: error: 'virtnet_freeze_down' defined but not 
used [-Werror=unused-function]
 static void virtnet_freeze_down(struct virtio_device *vdev)

A more robust way to do this is to remove the #ifdef around the callers
and instead mark them as __maybe_unused. The compiler will now just
silently drop the unused code.

Fixes: 4941d472bf95 ("virtio-net: do not reset during XDP set")
Signed-off-by: Arnd Bergmann 
---
 drivers/net/virtio_net.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index d4751ce23b4f..1902701e15a9 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2702,8 +2702,7 @@ static void virtnet_remove(struct virtio_device *vdev)
free_netdev(vi->dev);
 }
 
-#ifdef CONFIG_PM_SLEEP
-static int virtnet_freeze(struct virtio_device *vdev)
+static __maybe_unused int virtnet_freeze(struct virtio_device *vdev)
 {
struct virtnet_info *vi = vdev->priv;
 
@@ -2714,7 +2713,7 @@ static int virtnet_freeze(struct virtio_device *vdev)
return 0;
 }
 
-static int virtnet_restore(struct virtio_device *vdev)
+static __maybe_unused int virtnet_restore(struct virtio_device *vdev)
 {
struct virtnet_info *vi = vdev->priv;
int err;
@@ -2730,7 +2729,6 @@ static int virtnet_restore(struct virtio_device *vdev)
 
return 0;
 }
-#endif
 
 static struct virtio_device_id id_table[] = {
{ VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID },
-- 
2.9.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v3 2/3] virtio-balloon: use actual number of stats for stats queue buffers

2017-03-28 Thread Arnd Bergmann
On Tue, Mar 28, 2017 at 6:46 PM, Ladi Prosek  wrote:
> The virtio balloon driver contained a not-so-obvious invariant that
> update_balloon_stats has to update exactly VIRTIO_BALLOON_S_NR counters
> in order to send valid stats to the host. This commit fixes it by having
> update_balloon_stats return the actual number of counters, and its
> callers use it when pushing buffers to the stats virtqueue.
>
> Note that it is still out of spec to change the number of counters
> at run-time. "Driver MUST supply the same subset of statistics in all
> buffers submitted to the statsq."
>
> Suggested-by: Arnd Bergmann 
> Signed-off-by: Ladi Prosek 

>
Acked-by: Arnd Bergmann 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_balloon: prevent uninitialized variable use

2017-03-24 Thread Arnd Bergmann
On Fri, Mar 24, 2017 at 9:11 PM, Ladi Prosek  wrote:
> On Fri, Mar 24, 2017 at 7:38 PM, David Hildenbrand  wrote:
>> On 23.03.2017 16:17, Arnd Bergmann wrote:
>>> The latest gcc-7.0.1 snapshot reports a new warning:
>>>
>>> virtio/virtio_balloon.c: In function 'update_balloon_stats':
>>> virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in 
>>> this function [-Werror=uninitialized]
>>> virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in 
>>> this function [-Werror=uninitialized]
>>> virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized 
>>> in this function [-Werror=uninitialized]
>>> virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized 
>>> in this function [-Werror=uninitialized]
>>>
>>> This seems absolutely right, so we should add an extra check to
>>> prevent copying uninitialized stack data into the statistics.
>>> From all I can tell, this has been broken since the statistics code
>>> was originally added in 2.6.34.
>>>
>>> Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the 
>>> balloon driver (V4)")
>>> Signed-off-by: Arnd Bergmann 
>>> ---
>>>  drivers/virtio/virtio_balloon.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/virtio/virtio_balloon.c 
>>> b/drivers/virtio/virtio_balloon.c
>>> index 4e1191508228..cd5c54e2003d 100644
>>> --- a/drivers/virtio/virtio_balloon.c
>>> +++ b/drivers/virtio/virtio_balloon.c
>>> @@ -254,12 +254,14 @@ static void update_balloon_stats(struct 
>>> virtio_balloon *vb)
>>>
>>>   available = si_mem_available();
>>>
>>> +#ifdef CONFIG_VM_EVENT_COUNTERS
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_IN,
>>>   pages_to_bytes(events[PSWPIN]));
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
>>>   pages_to_bytes(events[PSWPOUT]));
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]);
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]);
>>> +#endif
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE,
>>>   pages_to_bytes(i.freeram));
>>>   update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMTOT,
>
> This will leave four uninitialized slots in vb->stats if
> CONFIG_VM_EVENT_COUNTERS is not defined. update_balloon_stats should
> have
>
>   BUG_ON(idx < VIRTIO_BALLOON_S_NR);
>
> at the end.
>
> You need to make sure that vb->stats is smaller, either by using
> something else than VIRTIO_BALLOON_S_NR for its size or something else
> than sizeof(vb->stats) as the last argument to sg_init_one in this
> file.

Ah, got it. Would one of you create a fixed patch for this, or should I?

An easy way to solve it would be to preinitialize the events array
and return zeroes to the host, but I don't know if that's the right
solution.

Another option would be to return 'idx' from update_balloon_stats,
and use that for the size:

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index cd5c54e2003d..bea3cfb88e53 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -242,7 +242,7 @@ static inline void update_stat(struct
virtio_balloon *vb, int idx,

 #define pages_to_bytes(x) ((u64)(x) << PAGE_SHIFT)

-static void update_balloon_stats(struct virtio_balloon *vb)
+static unsigned int update_balloon_stats(struct virtio_balloon *vb)
 {
  unsigned long events[NR_VM_EVENT_ITEMS];
  struct sysinfo i;
@@ -268,6 +268,8 @@ static void update_balloon_stats(struct virtio_balloon *vb)
  pages_to_bytes(i.totalram));
  update_stat(vb, idx++, VIRTIO_BALLOON_S_AVAIL,
  pages_to_bytes(available));
+
+ return idx;
 }

 /*
@@ -294,13 +296,14 @@ static void stats_handle_request(struct
virtio_balloon *vb)
  struct virtqueue *vq;
  struct scatterlist sg;
  unsigned int len;
+ unsigned int num_stats;

- update_balloon_stats(vb);
+ num_stats = update_balloon_stats(vb);

  vq = vb->stats_vq;
  if (!virtqueue_get_buf(vq, &len))
  return;
- sg_init_one(&sg, vb->stats, sizeof(vb->stats));
+ sg_init_one(&sg, vb->stats, sizeof(vb->stats[0] * num_stats));
  virtqueue_add_outbuf(vq, &sg, 1, vb, GFP_KERNEL);
  virtqueue_kick(vq);
 }

 Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] virtio_balloon: prevent uninitialized variable use

2017-03-23 Thread Arnd Bergmann
The latest gcc-7.0.1 snapshot reports a new warning:

virtio/virtio_balloon.c: In function 'update_balloon_stats':
virtio/virtio_balloon.c:258:26: error: 'events[2]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:260:26: error: 'events[3]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:261:56: error: 'events[18]' is used uninitialized in 
this function [-Werror=uninitialized]
virtio/virtio_balloon.c:262:56: error: 'events[17]' is used uninitialized in 
this function [-Werror=uninitialized]

This seems absolutely right, so we should add an extra check to
prevent copying uninitialized stack data into the statistics.
>From all I can tell, this has been broken since the statistics code
was originally added in 2.6.34.

Fixes: 9564e138b1f6 ("virtio: Add memory statistics reporting to the balloon 
driver (V4)")
Signed-off-by: Arnd Bergmann 
---
 drivers/virtio/virtio_balloon.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 4e1191508228..cd5c54e2003d 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -254,12 +254,14 @@ static void update_balloon_stats(struct virtio_balloon 
*vb)
 
available = si_mem_available();
 
+#ifdef CONFIG_VM_EVENT_COUNTERS
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_IN,
pages_to_bytes(events[PSWPIN]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_SWAP_OUT,
pages_to_bytes(events[PSWPOUT]));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MAJFLT, events[PGMAJFAULT]);
update_stat(vb, idx++, VIRTIO_BALLOON_S_MINFLT, events[PGFAULT]);
+#endif
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMFREE,
pages_to_bytes(i.freeram));
update_stat(vb, idx++, VIRTIO_BALLOON_S_MEMTOT,
-- 
2.9.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_mmio: Set DMA masks appropriately

2017-01-10 Thread Arnd Bergmann
On Tuesday, January 10, 2017 1:44:37 PM CET Robin Murphy wrote:
> On 10/01/17 13:15, Arnd Bergmann wrote:
> > On Tuesday, January 10, 2017 12:26:01 PM CET Robin Murphy wrote:
> >> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device 
> >> *pdev)
> >> if (vm_dev->version == 1)
> >> writel(PAGE_SIZE, vm_dev->base + 
> >> VIRTIO_MMIO_GUEST_PAGE_SIZE);
> >>  
> >> +   rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
> >> +   if (rc)
> >> +   rc = dma_set_mask_and_coherent(&pdev->dev, 
> >> DMA_BIT_MASK(32));
> > 
> > You don't seem to do anything different when 64-bit DMA is unsupported.
> > How do you prevent the use of kernel buffers that are above the first 4G
> > here?
> 
> That's the token "give up and rely on SWIOTLB/IOMMU" point, which as we
> already know won't necessarily work very well (because it's already the
> situation without this patch), but is still arguably better than
> nothing. As I've just replied elsewhere, I personally hate this idiom,
> but it's the done thing given the current DMA mask API.

Ah, I think I understand now. This is actually unrelated to SWIOTLB, which
should always succeed, but it is used on some architectures (x86, and
sparc in particular) to control whether the iommu is allowed to hand
out IOVA above 32-bit. Sorry for assuming in my earlier mail that all
IOMMUs only do 32-bit VA addressing in the dma-mapping API.

The related case on PowerPC is that DMA_BIT_MASK(64) has a special
meaning of bypassing the IOMMU entirely to optimize for performance.
Again that should not fail, but it will switch the dma_map_ops between
iommu and direct, using the IOMMU only when the device can't address
the entire space.

> >> +   else if (vm_dev->version == 1)
> >> +   dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + 
> >> PAGE_SHIFT));
> > 
> > Why is this limitation only for the coherent mask?
> 
> AIUI, the "32-bit pointers to pages" limitation of legacy virtio only
> applies to the location of the vring itself, which is allocated via
> dma_alloc_coherent - the descriptors themselves hold full 64-bit
> addresses pointing at the actual data, which is mapped using streaming
> DMA. It relies on the API guarantee that if we've managed to set a
> 64-bit streaming mask, then setting a smaller coherent mask cannot fail
> (DMA-API-HOWTO.txt:257)
> 
> This is merely an amalgamation of the logic already in place for
> virtio-pci, I just skimped on duplicating all the rationale (I know
> there's a mail thread somewhere I could probably dig up).

Ok, got it.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio_mmio: Set DMA masks appropriately

2017-01-10 Thread Arnd Bergmann
On Tuesday, January 10, 2017 12:26:01 PM CET Robin Murphy wrote:
> @@ -548,6 +550,14 @@ static int virtio_mmio_probe(struct platform_device 
> *pdev)
> if (vm_dev->version == 1)
> writel(PAGE_SIZE, vm_dev->base + VIRTIO_MMIO_GUEST_PAGE_SIZE);
>  
> +   rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64));
> +   if (rc)
> +   rc = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));

You don't seem to do anything different when 64-bit DMA is unsupported.
How do you prevent the use of kernel buffers that are above the first 4G
here?

> +   else if (vm_dev->version == 1)
> +   dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32 + 
> PAGE_SHIFT));

Why is this limitation only for the coherent mask?

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] x86/paravirt: hide unused patch_default label

2016-12-16 Thread Arnd Bergmann
On Friday, December 16, 2016 10:51:50 AM CET Peter Zijlstra wrote:
> -patch_default:
> +patch_default: __maybe_unused
> ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
> break;
> 

Ah, nice, I didn't know you could do that. Yes, please do this instead.

Arnd

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] x86/paravirt: hide unused patch_default label

2016-12-16 Thread Arnd Bergmann
A bugfix introduced a harmless warning:

arch/x86/kernel/paravirt_patch_32.c: In function 'native_patch':
arch/x86/kernel/paravirt_patch_32.c:71:1: error: label 'patch_default' defined 
but not used [-Werror=unused-label]

This puts it in the same #ifdef as its caller.

Fixes: 45dbea5f55c0 ("x86/paravirt: Fix native_patch()")
Signed-off-by: Arnd Bergmann 
---
 arch/x86/kernel/paravirt_patch_32.c | 2 ++
 arch/x86/kernel/paravirt_patch_64.c | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/kernel/paravirt_patch_32.c 
b/arch/x86/kernel/paravirt_patch_32.c
index d33ef165b1f8..2b729b1df391 100644
--- a/arch/x86/kernel/paravirt_patch_32.c
+++ b/arch/x86/kernel/paravirt_patch_32.c
@@ -68,7 +68,9 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
 #endif
 
default:
+#if defined(CONFIG_PARAVIRT_SPINLOCKS)
 patch_default:
+#endif
ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
break;
 
diff --git a/arch/x86/kernel/paravirt_patch_64.c 
b/arch/x86/kernel/paravirt_patch_64.c
index f4fcf26c9fce..b5bff128949a 100644
--- a/arch/x86/kernel/paravirt_patch_64.c
+++ b/arch/x86/kernel/paravirt_patch_64.c
@@ -80,7 +80,9 @@ unsigned native_patch(u8 type, u16 clobbers, void *ibuf,
 #endif
 
default:
+#if defined(CONFIG_PARAVIRT_SPINLOCKS)
 patch_default:
+#endif
ret = paravirt_patch_default(type, clobbers, ibuf, addr, len);
break;
 
-- 
2.9.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] virtio: mark vring_dma_dev() static

2016-09-01 Thread Arnd Bergmann
On Thursday, September 1, 2016 7:02:57 PM CEST Baoyou Xie wrote:
> We get 1 warning when building kernel with W=1:
> drivers/virtio/virtio_ring.c:170:16: warning: no previous prototype for 
> 'vring_dma_dev' [-Wmissing-prototypes]
> 
> In fact, this function is only used in the file in which it is
> declared and don't need a declaration, but can be made static.
> so this patch marks this function with 'static'.
> 
> Signed-off-by: Baoyou Xie 
> 

Acked-by: Arnd Bergmann 
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] drm/virtio: fix building without CONFIG_FBDEV

2016-08-02 Thread Arnd Bergmann
Removing the build-time dependency on DRM_KMS_FB_HELPER means
we can now build with CONFIG_FB disabled or as a loadable module,
leading to a link error:

ERROR: "remove_conflicting_framebuffers" [drivers/gpu/drm/virtio/virtio-gpu.ko] 
undefined!

There is no need to call remove_conflicting_framebuffers() if
CONFIG_FB is disabled, or if the virtio-gpu driver is built-in
and CONFIG_FB=m, as there won't be any prior consoles that
are registered at that point.

This adds a compile-time check in the driver to ensure we only
attempt to call that function if either CONFIG_FB=y or
both subsystems are configured as loadable modules.

Signed-off-by: Arnd Bergmann 
Fixes: 0b6320dfdfea ("drm/virtio: make fbdev support really optional")
---
 drivers/gpu/drm/virtio/virtgpu_drm_bus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_drm_bus.c 
b/drivers/gpu/drm/virtio/virtgpu_drm_bus.c
index 7f0e93f87a55..2087569ae8d7 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drm_bus.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drm_bus.c
@@ -65,7 +65,7 @@ int drm_virtio_init(struct drm_driver *driver, struct 
virtio_device *vdev)
DRM_INFO("pci: %s detected\n",
 vga ? "virtio-vga" : "virtio-gpu-pci");
dev->pdev = pdev;
-   if (vga)
+   if (IS_REACHABLE(CONFIG_FB) && vga)
virtio_pci_kick_out_firmware_fb(pdev);
}
 
-- 
2.9.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: Build regressions/improvements in v4.7-rc6

2016-07-04 Thread Arnd Bergmann
On Monday, July 4, 2016 10:21:45 AM CEST Geert Uytterhoeven wrote:
> On Mon, Jul 4, 2016 at 10:12 AM, Geert Uytterhoeven
>  wrote:
> > JFYI, when comparing v4.7-rc6[1] to v4.7-rc5[3], the summaries are:
> >   - build errors: +3/-2
> 
>   + /home/kisskb/slave/src/drivers/vhost/vhost.c: error: call to
> '__compiletime_assert_844' declared with attribute error: BUILD_BUG_ON
> failed: __alignof__ *vq->avail > VRING_AVAIL_ALIGN_SIZE:  => 844:3
> 
> arm-randconfig
> 
> > [1] http://kisskb.ellerman.id.au/kisskb/head/10562/ (260 out of 263 configs)
> > [3] http://kisskb.ellerman.id.au/kisskb/head/10532/ (260 out of 263 configs)

I don't see any changes in the code in this time frame, but this is the
code causing it:


struct vring_avail {
__virtio16 flags;
__virtio16 idx;
__virtio16 ring[];
};  

/* The virtqueue structure describes a queue attached to a device. */
struct vhost_virtqueue {
struct vhost_dev *dev;

/* The actual ring of buffers. */
struct mutex mutex; 
unsigned int num; 
struct vring_desc __user *desc;
struct vring_avail __user *avail;
struct vring_used __user *used;
...
};

struct vhost_virtqueue *vq;
   BUILD_BUG_ON(__alignof__ *vq->avail > VRING_AVAIL_ALIGN_SIZE);


The alignment of the *vq->avail should be '2' on all architectures,
however an ARM OABI compiler will have a padded structure with alignment '4'.

Looking at the build logs, I find it only in a single randconfig build
at http://kisskb.ellerman.id.au/kisskb/buildresult/12735927/ which apparently
enabled the vhost driver in combination with ARM_AEABI=n.

In my own randconfig builds I am forcing ARM_AEABI=y because there are a
couple of other problems with OABI.

If we want to avoid this one, we could make the inclusion of
drivers/vhost/Kconfig from arch/arm/kvm/Kconfig depend on CONFIG_AEABI,
or perhaps go further force-enable CONFIG_AEABI for ARMv6k and higher
(cmpxchg64() is broken on OABI too), and only include vhost if KVM
is enabled (KVM in turn requires ARMv7).

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/34] arch: barrier cleanup + __smp_xxx barriers for virt

2015-12-30 Thread Arnd Bergmann
On Wednesday 30 December 2015 23:45:41 Michael S. Tsirkin wrote:
> On Wed, Dec 30, 2015 at 02:49:53PM +0100, Arnd Bergmann wrote:
> > On Wednesday 30 December 2015 15:24:12 Michael S. Tsirkin wrote:
> > > This is really trying to cleanup some virt code, as suggested by Peter, 
> > > who
> > > said
> > > > You could of course go fix that instead of mutilating things into
> > > > sort-of functional state.
> > > 
> > > This work is needed for virtio, so it's probably easiest to
> > > merge it through my tree - is this fine by everyone?
> > > Arnd, if you agree, could you ack this please?
> > 
> > I've looked over all patches now, and everything seems fine (I had one
> > small comment about a duplicate patch). It's a very nice cleanup, both
> > for using the asm-generic file, and for the virtualization users.
> > 
> > Please add my "Acked-by: Arnd Bergmann " when merging the
> > series through your tree.
> > 
> >   Arnd
> 
> To all patches in the series?

Your choice, either all of them, or just the asm-generic ones.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 08/34] asm-generic: smp_store_mb should use smp_mb

2015-12-30 Thread Arnd Bergmann
On Wednesday 30 December 2015 22:30:38 Michael S. Tsirkin wrote:
> On Wed, Dec 30, 2015 at 02:44:21PM +0100, Arnd Bergmann wrote:
> > On Wednesday 30 December 2015 15:24:47 Michael S. Tsirkin wrote:

> > >  #ifndef smp_store_mb
> > > -#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); mb(); } 
> > > while (0)
> > > +#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); smp_mb(); 
> > > } while (0)
> > >  #endif
> > >  
> > >  #ifndef smp_mb__before_atomic
> > > 
> > 
> > The same patch is already in the tip tree scheduled for 4.5 as d5a73cadf3fd
> > ("lcoking/barriers, arch: Use smp barriers in smp_store_release()").
> 
> Sorry which tree do you mean exactly?

$ git log --ancestry-path --oneline --merges d5a73cadf3fd..next/master | tail 
-n 17
cb17a685bed6 Merge remote-tracking branch 'tip/auto-latest'
f29c2e03f0b3 Merge branch 'x86/urgent'
8cd6990bf71d Merge branch 'x86/platform'
0541d92a5eb4 Merge branch 'x86/mm'
aa7c8013c8c0 Merge branch 'x86/fpu'
fcc9a1bd013c Merge branch 'x86/efi'
e74ef3f60886 Merge branch 'x86/cpu'
44a4f0063508 Merge branch 'x86/cleanups'
28c814578fcf Merge branch 'x86/cache'
d74ff99dada8 Merge branch 'x86/boot'
db3c55380b10 Merge branch 'x86/asm'
7cd91b91da20 Merge branch 'x86/apic'
7bfc343947e6 Merge branch 'timers/core'
1720bbcb66d1 Merge branch 'sched/core'
af9a59f26764 Merge branch 'ras/core'
984b85eca78d Merge branch 'perf/core'
d2b22d438aab Merge branch 'locking/core'

$ grep auto-latest Next/Trees 
tip git 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git#auto-latest

So locking/core of tip.git has the patch and gets merged into linux-next
through auto-latest in tip.git.

> > I think you can drop your version.
> > 
> >   arnd
> 
> Will drop mine, thanks.
> I kind of dislike that if I just drop it, some arches will temporarily
> regress to a slower implementation.
> I think I can just cherry-pick d5a73cadf3fd into my tree: git
> normally figures such duplicates out nicely.
> Does this sound good?

I don't think there is a perfect solution, you can either cherry-pick
it and get a duplicate commit in the git history, or you merge in the
whole locking/core branch from tip.

I'd say ask Ingo/PeterZ/Davidlohr which way they prefer.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 00/34] arch: barrier cleanup + __smp_XXX barriers for virt

2015-12-30 Thread Arnd Bergmann
On Wednesday 30 December 2015 15:24:12 Michael S. Tsirkin wrote:
> This is really trying to cleanup some virt code, as suggested by Peter, who
> said
> > You could of course go fix that instead of mutilating things into
> > sort-of functional state.
> 
> This work is needed for virtio, so it's probably easiest to
> merge it through my tree - is this fine by everyone?
> Arnd, if you agree, could you ack this please?

I've looked over all patches now, and everything seems fine (I had one
small comment about a duplicate patch). It's a very nice cleanup, both
for using the asm-generic file, and for the virtualization users.

Please add my "Acked-by: Arnd Bergmann " when merging the
series through your tree.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH 08/34] asm-generic: smp_store_mb should use smp_mb

2015-12-30 Thread Arnd Bergmann
On Wednesday 30 December 2015 15:24:47 Michael S. Tsirkin wrote:
> asm-generic variant of smp_store_mb() calls mb() which is stronger
> than implied by both the name and the documentation.
> 
> smp_store_mb is only used by core kernel code at the moment, so
> we know no one mis-uses it for an MMIO barrier.
> Make it call smp_mb consistently before some arch-specific
> code uses it as such by mistake.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  include/asm-generic/barrier.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
> index 538f8d1..987b2e0 100644
> --- a/include/asm-generic/barrier.h
> +++ b/include/asm-generic/barrier.h
> @@ -93,7 +93,7 @@
>  #endif   /* CONFIG_SMP */
>  
>  #ifndef smp_store_mb
> -#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); mb(); } while 
> (0)
> +#define smp_store_mb(var, value)  do { WRITE_ONCE(var, value); smp_mb(); } 
> while (0)
>  #endif
>  
>  #ifndef smp_mb__before_atomic
> 

The same patch is already in the tip tree scheduled for 4.5 as d5a73cadf3fd
("lcoking/barriers, arch: Use smp barriers in smp_store_release()").

I think you can drop your version.

arnd

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] vhost: vsock: select CONFIG_VHOST

2015-12-08 Thread Arnd Bergmann
When building the new vsock code without vhost, we get a build error:

drivers/built-in.o: In function `vhost_vsock_flush':
:(.text+0x24d29c): undefined reference to `vhost_poll_flush'

This adds an explicit 'select' like we have for the other vhost
drivers.

Signed-off-by: Arnd Bergmann 
---
 drivers/vhost/Kconfig.vsock | 2 ++
 1 file changed, 2 insertions(+)

The patch causing the problem is currently in net-next, so the fix should be
applied on top of that.

diff --git a/drivers/vhost/Kconfig.vsock b/drivers/vhost/Kconfig.vsock
index 3491865d3eb9..bfb9edc4b5d6 100644
--- a/drivers/vhost/Kconfig.vsock
+++ b/drivers/vhost/Kconfig.vsock
@@ -2,6 +2,8 @@ config VHOST_VSOCK
tristate "vhost virtio-vsock driver"
depends on VSOCKETS && EVENTFD
select VIRTIO_VSOCKETS_COMMON
+   select VHOST
+   select VHOST_RING
default n
---help---
Say M here to enable the vhost-vsock for virtio-vsock guests
-- 
2.1.0.rc2


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/virtio: use %llu format string form atomic64_t

2015-10-19 Thread Arnd Bergmann
On Monday 19 October 2015 09:34:15 Geert Uytterhoeven wrote:
> On Wed, Oct 7, 2015 at 1:23 PM, Arnd Bergmann  wrote:
> > static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
> >
> > which truncates the result to 32 bit.
> 
> Woops.
> 
> See also my unanswered question in "atomic64 on 32-bit vs 64-bit (was:
> Re: Add virtio gpu driver.)", which is still valid:
> https://lkml.org/lkml/2015/6/28/18
> 

Regarding your question of

> Instead of sprinkling casts, is there any good reason why atomic64_read()
> and atomic64_t aren't "long long" everywhere, cfr. u64?


I assume the answer is that some (all?) 64-bit architectures intentionally
return 'long' here, in order for atomic_long_read() to return 'long' on
all architectures, given the definitions from
include/asm-generic/atomic-long.h

We would have to either change those, or we have to pick between
atomic_long_* or atomic64_* to have a consistent return type.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/virtio: use %llu format string form atomic64_t

2015-10-19 Thread Arnd Bergmann
On Monday 19 October 2015 11:37:00 Ralf Baechle wrote:
> On Wed, Oct 07, 2015 at 01:23:07PM +0200, Arnd Bergmann wrote:
> 
> > > I haven't checked all architectures, but I assume what happens is that
> > > 64-bit ones just #define atomic64_t atomic_long_t, so they don't have
> > > to provide three sets of functions.
> > 
> > scratch that, I just looked at all the architectures and found that it's
> > just completely arbitrary, even within one architecture you get a mix
> > of 'long' and 'long long', plus this gem from MIPS:
> > 
> > static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)
> > 
> > which truncates the result to 32 bit.
> 
> Eh...  The result is 0/1 so nothing is truncated.  Alpha, MIPS,
> PARISC and PowerPC are using the same prototype and x86 only differs
> in the use of inline instead __inline__.  And anyway, that function on
> MIPS is only built for CONFIG_64BIT.

Ah, got it. Sorry about that.

> What's wrong on MIPS is the comment describing the function's return value
> which was changed by f24219b4e90cf70ec4a211b17fbabc725a0ddf3c (atomic: move
> atomic_add_unless to generic code) and I've queued up a patch to fix that
> since a few days.  I guess that was a cut and paste error from
> __atomic_add_unless which indeed does return the old value.

Thanks!

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/virtio: use %llu format string form atomic64_t

2015-10-07 Thread Arnd Bergmann
On Wednesday 07 October 2015 13:04:06 Arnd Bergmann wrote:
> On Wednesday 07 October 2015 11:45:02 Russell King - ARM Linux wrote:
> > On Wed, Oct 07, 2015 at 12:41:21PM +0200, Arnd Bergmann wrote:
> > > The virtgpu driver prints the last_seq variable using the %ld or
> > > %lu format string, which does not work correctly on all architectures
> > > and causes this compiler warning on ARM:
> > > 
> > > drivers/gpu/drm/virtio/virtgpu_fence.c: In function 
> > > 'virtio_timeline_value_str':
> > > drivers/gpu/drm/virtio/virtgpu_fence.c:64:22: warning: format '%lu' 
> > > expects argument of type 'long unsigned int', but argument 4 has type 
> > > 'long long int' [-Wformat=]
> > >   snprintf(str, size, "%lu", atomic64_read(&fence->drv->last_seq));
> > >   ^
> > > drivers/gpu/drm/virtio/virtgpu_debugfs.c: In function 
> > > 'virtio_gpu_debugfs_irq_info':
> > > drivers/gpu/drm/virtio/virtgpu_debugfs.c:37:16: warning: format '%ld' 
> > > expects argument of type 'long int', but argument 3 has type 'long long 
> > > int' [-Wformat=]
> > >   seq_printf(m, "fence %ld %lld\n",
> > > ^
> > > 
> > > In order to avoid the warnings, this changes the format strings to %llu
> > > and adds a cast to u64, which makes it work the same way everywhere.
> > 
> > You have to wonder why atomic64_* functions do not use u64 types.
> > If they're not reliant on manipulating 64-bit quantities, then what's
> > the point of calling them atomic _64_.
> 
> I haven't checked all architectures, but I assume what happens is that
> 64-bit ones just #define atomic64_t atomic_long_t, so they don't have
> to provide three sets of functions.

scratch that, I just looked at all the architectures and found that it's
just completely arbitrary, even within one architecture you get a mix
of 'long' and 'long long', plus this gem from MIPS:

static __inline__ int atomic64_add_unless(atomic64_t *v, long a, long u)

which truncates the result to 32 bit.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH] drm/virtio: use %llu format string form atomic64_t

2015-10-07 Thread Arnd Bergmann
On Wednesday 07 October 2015 11:45:02 Russell King - ARM Linux wrote:
> On Wed, Oct 07, 2015 at 12:41:21PM +0200, Arnd Bergmann wrote:
> > The virtgpu driver prints the last_seq variable using the %ld or
> > %lu format string, which does not work correctly on all architectures
> > and causes this compiler warning on ARM:
> > 
> > drivers/gpu/drm/virtio/virtgpu_fence.c: In function 
> > 'virtio_timeline_value_str':
> > drivers/gpu/drm/virtio/virtgpu_fence.c:64:22: warning: format '%lu' expects 
> > argument of type 'long unsigned int', but argument 4 has type 'long long 
> > int' [-Wformat=]
> >   snprintf(str, size, "%lu", atomic64_read(&fence->drv->last_seq));
> >   ^
> > drivers/gpu/drm/virtio/virtgpu_debugfs.c: In function 
> > 'virtio_gpu_debugfs_irq_info':
> > drivers/gpu/drm/virtio/virtgpu_debugfs.c:37:16: warning: format '%ld' 
> > expects argument of type 'long int', but argument 3 has type 'long long 
> > int' [-Wformat=]
> >   seq_printf(m, "fence %ld %lld\n",
> > ^
> > 
> > In order to avoid the warnings, this changes the format strings to %llu
> > and adds a cast to u64, which makes it work the same way everywhere.
> 
> You have to wonder why atomic64_* functions do not use u64 types.
> If they're not reliant on manipulating 64-bit quantities, then what's
> the point of calling them atomic _64_.

I haven't checked all architectures, but I assume what happens is that
64-bit ones just #define atomic64_t atomic_long_t, so they don't have
to provide three sets of functions.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH] drm/virtio: use %llu format string form atomic64_t

2015-10-07 Thread Arnd Bergmann
The virtgpu driver prints the last_seq variable using the %ld or
%lu format string, which does not work correctly on all architectures
and causes this compiler warning on ARM:

drivers/gpu/drm/virtio/virtgpu_fence.c: In function 'virtio_timeline_value_str':
drivers/gpu/drm/virtio/virtgpu_fence.c:64:22: warning: format '%lu' expects 
argument of type 'long unsigned int', but argument 4 has type 'long long int' 
[-Wformat=]
  snprintf(str, size, "%lu", atomic64_read(&fence->drv->last_seq));
  ^
drivers/gpu/drm/virtio/virtgpu_debugfs.c: In function 
'virtio_gpu_debugfs_irq_info':
drivers/gpu/drm/virtio/virtgpu_debugfs.c:37:16: warning: format '%ld' expects 
argument of type 'long int', but argument 3 has type 'long long int' [-Wformat=]
  seq_printf(m, "fence %ld %lld\n",
^

In order to avoid the warnings, this changes the format strings to %llu
and adds a cast to u64, which makes it work the same way everywhere.

Signed-off-by: Arnd Bergmann 

diff --git a/drivers/gpu/drm/virtio/virtgpu_debugfs.c 
b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
index db8b49101a8b..512263919282 100644
--- a/drivers/gpu/drm/virtio/virtgpu_debugfs.c
+++ b/drivers/gpu/drm/virtio/virtgpu_debugfs.c
@@ -34,8 +34,8 @@ virtio_gpu_debugfs_irq_info(struct seq_file *m, void *data)
struct drm_info_node *node = (struct drm_info_node *) m->private;
struct virtio_gpu_device *vgdev = node->minor->dev->dev_private;
 
-   seq_printf(m, "fence %ld %lld\n",
-  atomic64_read(&vgdev->fence_drv.last_seq),
+   seq_printf(m, "fence %llu %lld\n",
+  (u64)atomic64_read(&vgdev->fence_drv.last_seq),
   vgdev->fence_drv.sync_seq);
return 0;
 }
diff --git a/drivers/gpu/drm/virtio/virtgpu_fence.c 
b/drivers/gpu/drm/virtio/virtgpu_fence.c
index 1da632631dac..67097c9ce9c1 100644
--- a/drivers/gpu/drm/virtio/virtgpu_fence.c
+++ b/drivers/gpu/drm/virtio/virtgpu_fence.c
@@ -61,7 +61,7 @@ static void virtio_timeline_value_str(struct fence *f, char 
*str, int size)
 {
struct virtio_gpu_fence *fence = to_virtio_fence(f);
 
-   snprintf(str, size, "%lu", atomic64_read(&fence->drv->last_seq));
+   snprintf(str, size, "%llu", (u64)atomic64_read(&fence->drv->last_seq));
 }
 
 static const struct fence_ops virtio_fence_ops = {

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PULL] uaccess: fix sparse warning on get/put_user for bitwise types

2015-01-14 Thread Arnd Bergmann
On Wednesday 14 January 2015 19:36:18 Michael S. Tsirkin wrote:
> As you asked, here's a pull request.
> This has been in linux-next apparently with no ill effects.
> 
> The following changes since commit 99975cc6ada0d5f2675e83abecae05aba5f437d2:
> 
>   vhost/net: length miscalculation (2015-01-07 12:22:00 +0200)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git 
> tags/uaccess_for_upstream
> 
> for you to fetch changes up to 0795cb1b46e7938ed679ccd48f933e75272b30e3:
> 

Pulled into my asm-generic now, thanks!

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH RFC 3/5] pci: add pci_iomap_range

2014-12-11 Thread Arnd Bergmann
On Thursday 11 December 2014 21:37:34 Michael S. Tsirkin wrote:
> if (flags & IORESOURCE_MEM) {
> -   if (flags & IORESOURCE_CACHEABLE)
> +   if (!force_nocache && (flags & IORESOURCE_CACHEABLE))
> return ioremap(start, len);
> return ioremap_nocache(start, len);
> }
> 

ioremap is the same as ioremap_nocache, so this doesn't really make any
sense. IORESOURCE_CACHEABLE is practically only set for ROM bars, which
we rarely map.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device

2014-08-04 Thread Arnd Bergmann
On Monday 04 August 2014, Yijing Wang wrote:
> On 2014/8/1 21:52, Arnd Bergmann wrote:
> > On Wednesday 30 July 2014, Yijing Wang wrote:
> >> On 2014/7/29 22:08, Arnd Bergmann wrote:
> >>> The other part I'm not completely sure about is how you want to
> >>> have MSIs map into normal IRQ descriptors. At the moment, all
> >>> MSI users are based on IRQ numbers, but this has known scalability 
> >>> problems.
> >>
> >> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> >> I'm sorry, I don't understand you meaning.
> >> What are the scalability problems you mentioned ?
> >> For device drivers, they always process interrupt in two steps.
> >> If irq is the legacy interrupt, drivers will first
> >> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the 
> >> IRQ number.
> >> Then drivers will call the request_irq() to register the interrupt handler.
> >> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and 
> >> then call
> >> request_irq() to register interrupt handler.
> > 
> > The method you describe here makes sense for PCI devices that are required 
> > to support
> > legacy interrupts and may or may not support MSI on a given system, but not 
> > so much
> > for platform devices for which we know exactly whether we want to use MSI
> > or legacy interrupts.
> > 
> > In particular if you have a device that can only do MSI, the entire 
> > pci_enable_msi
> > step is pointless: all we need to do is program the correct MSI target 
> > address/message
> > pair into the device and register the handler.
> 
> Yes, I almost agree if we won't change the existing hundreds of drivers, what
> I worried about is some drivers may want to know the IRQ numbers, and use the 
> IRQ
> number to process different things, as I mentioned in another reply.
> But we can also provide the interface which integrate MSI configuration and 
> request_irq(),
> if most drivers don't care the IRQ number.

The driver would still have the option of getting the IRQ number for now: With
the interface I imagine, you would get a 'struct msi_desc' pointer, from which
you can look up the 'struct irq_desc' pointer (either embedded in msi_desc,
or using a pointer from a member of msi_desc), and you can already get the
interrupt number from the irq_desc.

My point was that a well-written driver already does not care about the 
interrupt
number: the only information a driver needs in the interrupt handler is a 
pointer
to its own context, which we already derive from the irq_desc.

The main interface that currently requires the irq number is free_irq(), but
I would argue that we can just add a wrapper that takes the msi_desc pointer
as its first argument so the driver does not have to worry about it.

We can add additional wrappers like that as needed.

> >>> What I'd envision as the API from the device driver perspective is 
> >>> something
> >>> as simple like this:
> >>>
> >>> struct msi_desc *msi_request(struct msi_chip *chip, irq_handler_t handler,
> >>>   unsigned long flags, const char *name, struct device 
> >>> *dev);
> >>>
> >>> which would get an msi descriptor that is valid for this device (dev)
> >>> connected to a particular msi_chip, and associate a handler function
> >>> with it. The device driver can call that function and retrieve the
> >>> address/message pair from the msi_desc in order to store it in its own
> >>> device specific registers. The request_irq() can be handled internally
> >>> to msi_request().
> >>
> >> This is a huge change for device drivers, and some device drivers don't 
> >> know which msi_chip
> >> their MSI irq deliver to. I'm reworking the msi_chip, and try to use 
> >> msi_chip to eliminate
> >> all arch_msi_xxx() under every arch in kernel. And the important point is 
> >> how to create the
> >> binding for the MSI device to the target msi_chip.
> > 
> > Which drivers are you thinking of? Again, I wouldn't expect to change any 
> > PCI drivers,
> > but only platform drivers that do native MSI, so we only have to change 
> > drivers that
> > do not support any MSI at all yet and that need to be changed anyway in 
> > order to add
> > support.
> 
> I mean platform device drivers, because we can find the target msi_chip by 
> some platform
> interfaces(like the existing of_pci_find_msi_chip_by_node()). So 

Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device

2014-08-04 Thread Arnd Bergmann
On Monday 04 August 2014, Yijing Wang wrote:
> I have another question is some drivers will request more than one
> MSI/MSI-X IRQ, and the driver will use them to process different things.
> Eg. network driver generally uses one of them to process trivial network 
> thins,
> and others to transmit/receive data.
> 
> So, in this case, it seems to driver need to touch the IRQ numbers.
> 
> wr-linux:~ # cat /proc/interrupts
> CPU0   CPU1   CPU2   CPU17  CPU18  
> CPU19  CPU20  CPU21  CPU22  CPU23
>  ..
>  100:  0  0  0   0  0  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0
>  101:  2  0  0   0  0  0  
> 302830488  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-0
>  102:110  0  0   0  0  360675897  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-1
>  103:109  0  0   0  0  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-2
>  104:107  0  0 9678933  0  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-3
>  105:107  0  0   0  357838258  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-4
>  106:115  0  0   0  0  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-5
>  107:114  0  0   0  0  0  
> 0  337866096  0  0  IR-PCI-MSI-edge  eth0-TxRx-6
>  108:  373801199  0  0   0  0  0  
> 0  0  0  0  IR-PCI-MSI-edge  eth0-TxRx-7
> 

I think in this example, you just need to request eight interrupts, and pass a
different data pointer each time, pointing to the napi_struct of each of the
NIC queues. The driver has no need to deal with the IRQ number at all,
and I would be surprised if it cared today.

Arnd
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH 00/11] Refactor MSI to support Non-PCI device

2014-08-01 Thread Arnd Bergmann
On Wednesday 30 July 2014, Yijing Wang wrote:
> On 2014/7/29 22:08, Arnd Bergmann wrote:
> > On Saturday 26 July 2014 11:08:37 Yijing Wang wrote:
> >>
> >> The new data struct for generic MSI driver.
> >> struct msi_irqs {
> >> u8 msi_enabled:1; /* Enable flag */
> >> u8 msix_enabled:1;
> >> struct list_head msi_list; /* MSI desc list */
> >> void *data; /* help to find the MSI device */
> >> struct msi_ops *ops; /* MSI device specific hook */
> >> };
> >> struct msi_irqs is used to manage MSI related informations. Every device 
> >> supports
> >> MSI should contain this data struct and allocate it.
> > 
> > I think you should have a stronger association with the 'struct
> > device' here. Can you replace the 'void *data' with 'struct device *dev'?
> 
> Actually, I used the struct device *dev in my first draft, finally, I replaced
> it with void *data, because some MSI devices don't have a struct device *dev,
> like the existing hpet device, dmar msi device, and OF device, like the ARM 
> consolidator.
> 
> Of course, we can make the MSI devices have their own struct device, and 
> register to
> device tree, eg, add a class device named MSI_DEV. But I'm not sure whether 
> it is appropriate.

It doesn't have to be in the (OF) device tree, but I think it absolutely makes
sense to use the 'struct device' infrastructure here, as almost everything uses
a device, and the ones that don't do that today can be easily changed.

> > The other part I'm not completely sure about is how you want to
> > have MSIs map into normal IRQ descriptors. At the moment, all
> > MSI users are based on IRQ numbers, but this has known scalability problems.
> 
> Hmmm, I still use the IRQ number to map the MSIs to IRQ description.
> I'm sorry, I don't understand you meaning.
> What are the scalability problems you mentioned ?
> For device drivers, they always process interrupt in two steps.
> If irq is the legacy interrupt, drivers will first
> use the irq_of_parse_and_map() or pci_enable_device() to parse and get the 
> IRQ number.
> Then drivers will call the request_irq() to register the interrupt handler.
> If irq is MSIs, first call pci_enable_msi/x() to get the IRQ number and then 
> call
> request_irq() to register interrupt handler.

The method you describe here makes sense for PCI devices that are required to 
support
legacy interrupts and may or may not support MSI on a given system, but not so 
much
for platform devices for which we know exactly whether we want to use MSI
or legacy interrupts.

In particular if you have a device that can only do MSI, the entire 
pci_enable_msi
step is pointless: all we need to do is program the correct MSI target 
address/message
pair into the device and register the handler.

> > I wonder if we can do the interface in a way that
> > hides the interrupt number from generic device drivers and just
> > passes a 'struct irq_desc'. Note that there are long-term plans to
> > get rid of IRQ numbers entirely, but those plans have existed for
> > a long time already without anybody seriously addressing the device
> > driver interfaces so far, so it might never really happen.
> > 
> 
> Maybe this is a huge work, now hundreds drivers use the IRQ number, so maybe 
> we can consider
> this in a separate title.

Sorry for being unclear here: I did suggest changing all drivers now. What I 
meant
is that we use a different API for non-PCI devices that works without IRQ 
numbers.
I don't think we should touch the PCI interfaces at this point.

> > With the other operations, I think they should all take a 'struct device *'
> > as the first argument for convenience and consistency. I don't think you 
> > actually
> > need msi_read_message(), and we could avoid msi_write_message() by doing it
> > the other way round.
> > 
> 
> There only two functions use the read_msi_msg(), because every msi_desc has
> a struct msi_msg, and it caches the msi address and data. I will consider to
> retrieve the msg from cached msi_msg, then we can avoid the 
> msi_read_message().
> But msi_write_message() maybe necessary, some xxx_set_affinity() functions and
> restore functions need the msi_write_message() to rewrite the address and 
> data.

Makes sense. I'd have to think about it more, but I think you are right
about the affinity APIs needing this.

> > What I'd envision as the API from the device driver perspective is something
> > as simple like this:
> > 
> > struct msi_desc *msi_request(struct msi_chip *chip, i

  1   2   3   >