I.T. X plan:
* Karmic Koala boot style, high-res text, no image / initrd.
* Enlightenment, Right Corner Bar/Launch Menu
* Fair Pay, lexically organized commercial directory. Com:|Top|Category|Subcategory|1m km2 zone|23.000 km2 zone|Person|Groupings - no unnecessary logins, easy exposure, and changing / to | symbolizing fair pay structure, and also making / available in filenames, which is a common thing.
* 0.33 ms latency Renoise, suggesting optimized paths for this.
* 72.7 (3x pass) Doom 3 (low jitter config) - will probably be great for Direct 3D 12.
* Readied for €-money integration. EU optimally symbolically located for this.
* Calibri font for easy cursive Islamic integration, and bold chan integration, supporting all developments back to Adams Tablet, source of fair job principles.

Hail Jagod!

Serene Greetings,
Ywe Cærlyn
https://www.youtube.com/channel/UCR3gmLVjHS5A702wo4bol_Q
Re: [GIT PULL] sh: remove sh5 support
On 5/28/20 7:46 AM, Christoph Hellwig wrote:
> [adding Linus]
>
> On Thu, May 07, 2020 at 07:35:52AM -0700, Christoph Hellwig wrote:
>> Any progress on this? I plan to resend the sh dma-mapping I've been
>> trying to get upstream for a year again, and they would conflict,
>> so I could look into rebasing them first.
>
> So for years now it has been close to and in the end impossible to
> provoke sh maintainer action. At the same point hardware is pretty much
> long gone for the real commercial variants, and never took off for the
> open hardware nommu variant.
>
> Linus, would you ok with a 5.8 pull request to just kill off arch/sh/?

We're maintaining SH in Debian so I'm interested in keeping arch/sh, but I'm also let down that SH maintainers aren't that active at the moment.

I do know that Yoshinori Sato has a tree where he takes patches and sends PRs from time to time, but I have no idea what is going on.

Adrian

-- 
 .''`.  John Paul Adrian Glaubitz
: :' :  Debian Developer - glaub...@debian.org
`. `'   Freie Universitaet Berlin - glaub...@physik.fu-berlin.de
  `-    GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Re: [PATCH v2] sctp: check assoc before SCTP_ADDR_{MADE_PRIM,ADDED} event
On Wed, May 27, 2020 at 5:57 PM Jonas Falkevik wrote:
>
> Make sure SCTP_ADDR_{MADE_PRIM,ADDED} are sent only for associations
> that have been established.
>
> These events are described in rfc6458#section-6.1
> SCTP_PEER_ADDR_CHANGE:
> This tag indicates that an address that is
> part of an existing association has experienced a change of
> state (e.g., a failure or return to service of the reachability
> of an endpoint via a specific transport address).
>
> Signed-off-by: Jonas Falkevik

Reviewed-by: Xin Long

> ---
> Changes in v2:
>  - Check asoc state to be at least established.
>    Instead of associd being SCTP_FUTURE_ASSOC.
>  - Common check for all peer addr change event
>
>  net/sctp/ulpevent.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/net/sctp/ulpevent.c b/net/sctp/ulpevent.c
> index c82dbdcf13f2..77d5c36a8991 100644
> --- a/net/sctp/ulpevent.c
> +++ b/net/sctp/ulpevent.c
> @@ -343,6 +343,9 @@ void sctp_ulpevent_nofity_peer_addr_change(struct sctp_transport *transport,
>  	struct sockaddr_storage addr;
>  	struct sctp_ulpevent *event;
>
> +	if (asoc->state < SCTP_STATE_ESTABLISHED)
> +		return;
> +
>  	memset(&addr, 0, sizeof(struct sockaddr_storage));
>  	memcpy(&addr, &transport->ipaddr,
>  	       transport->af_specific->sockaddr_len);
>
> --
> 2.25.4
>
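For readers following along, the guard added by the patch can be modeled in userspace roughly as below. This is only a sketch: the enum and the function name are hypothetical stand-ins, not the kernel's SCTP state definitions, and the only property it relies on is that states before "established" compare lower.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the association states; only the
 * relative ordering matters for this sketch. */
enum assoc_state {
	ASSOC_CLOSED,
	ASSOC_COOKIE_WAIT,
	ASSOC_COOKIE_ECHOED,
	ASSOC_ESTABLISHED,
	ASSOC_SHUTDOWN_PENDING,
};

/* The guard depends on the handshake states sorting before "established". */
static_assert(ASSOC_COOKIE_ECHOED < ASSOC_ESTABLISHED,
	      "handshake states must sort before ESTABLISHED");

/* Returns true when a peer-address-change event may be delivered. */
static bool may_notify_peer_addr_change(enum assoc_state state)
{
	return state >= ASSOC_ESTABLISHED;
}
```

An event for an association still in the handshake would be dropped, which seems to be the intent of the v2 change.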
Re: [PATCH 5.6 086/126] virtio-balloon: Revert "virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM"
On 26. 05. 20, 20:53, Greg Kroah-Hartman wrote:
> From: Michael S. Tsirkin
>
> [ Upstream commit 835a6a649d0dd1b1f46759eb60fff2f63ed253a7 ]
>
> This reverts commit 5a6b4cc5b7a1892a8d7f63d6cbac6e0ae2a9d031.
>
> It has been queued properly in the akpm tree, this version is just
> creating conflicts.

Should this be applied to stable trees at all? To me, it appears to be a revert to avoid conflicts, not to fix something?

> Signed-off-by: Michael S. Tsirkin
> Signed-off-by: Sasha Levin

thanks,
-- 
js
suse labs
Re: [PATCH net-next 2/4] vmxnet3: add support to get/set rx flow hash
On Wed, May 27, 2020 at 07:07:04PM -0700, Ronak Doshi wrote: > With vmxnet3 version 4, the emulation supports multiqueue(RSS) for > UDP and ESP traffic. A guest can enable/disable RSS for UDP/ESP over > IPv4/IPv6 by issuing commands introduced in this patch. ESP ipv6 is > not yet supported in this patch. > > This patch implements get_rss_hash_opts and set_rss_hash_opts > methods to allow querying and configuring different Rx flow hash > configurations. > > Signed-off-by: Ronak Doshi > --- [...] > diff --git a/drivers/net/vmxnet3/vmxnet3_ethtool.c > b/drivers/net/vmxnet3/vmxnet3_ethtool.c > index 1163eca7aba5..ceedf63020cb 100644 > --- a/drivers/net/vmxnet3/vmxnet3_ethtool.c > +++ b/drivers/net/vmxnet3/vmxnet3_ethtool.c > @@ -665,18 +665,236 @@ vmxnet3_set_ringparam(struct net_device *netdev, > return err; > } > > +static int > +vmxnet3_get_rss_hash_opts(struct vmxnet3_adapter *adapter, > + struct ethtool_rxnfc *info) > +{ > + enum Vmxnet3_RSSField rss_fields; > + > + if (netif_running(adapter->netdev)) { > + unsigned long flags; > + > + spin_lock_irqsave(>cmd_lock, flags); > + > + VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD, > +VMXNET3_CMD_GET_RSS_FIELDS); > + rss_fields = VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_CMD); > + spin_unlock_irqrestore(>cmd_lock, flags); > + } else { > + rss_fields = adapter->rss_fields; > + } > + > + info->data = 0; > + > + /* Report default options for RSS on vmxnet3 */ > + switch (info->flow_type) { > + case TCP_V4_FLOW: > + if (rss_fields & VMXNET3_RSS_FIELDS_TCPIP4) > + info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3 | > + RXH_IP_SRC | RXH_IP_DST; > + break; > + case UDP_V4_FLOW: > + if (rss_fields & VMXNET3_RSS_FIELDS_UDPIP4) > + info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3 | > + RXH_IP_SRC | RXH_IP_DST; > + break; In both cases above (and also in the two for IPv6 below) you set info->data to either 0 or all four bits, depending on the value of corresponding flag in rss_fields. 
But in vmxnet3_set_rss_hash_opt() you have different mapping: - for TCP, you only accept all four bits (no other value) and don't touch rss_fields at all - for UDP, you allow either all four bits (and set the flag) or the two IP related bits (and clear the flag) The UDPv4/UDPv6 behaviour of vmxnet3_set_rss_hash_opt() seems to be the correct one but you should be consistent between get and set handlers. > + case AH_ESP_V4_FLOW: > + case AH_V4_FLOW: > + case ESP_V4_FLOW: > + if (rss_fields & VMXNET3_RSS_FIELDS_ESPIP4) > + info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3; If this fallthrough is intentional (it seems to be), it should be marked. Michal > + case SCTP_V4_FLOW: > + case IPV4_FLOW: > + info->data |= RXH_IP_SRC | RXH_IP_DST; > + break; > + case TCP_V6_FLOW: > + if (rss_fields & VMXNET3_RSS_FIELDS_TCPIP6) > + info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3 | > + RXH_IP_SRC | RXH_IP_DST; > + break; > + case UDP_V6_FLOW: > + if (rss_fields & VMXNET3_RSS_FIELDS_UDPIP6) > + info->data |= RXH_L4_B_0_1 | RXH_L4_B_2_3 | > + RXH_IP_SRC | RXH_IP_DST; > + break; > + case AH_ESP_V6_FLOW: > + case AH_V6_FLOW: > + case ESP_V6_FLOW: > + case SCTP_V6_FLOW: > + case IPV6_FLOW: > + info->data |= RXH_IP_SRC | RXH_IP_DST; > + break; > + default: > + return -EINVAL; > + } > + > + return 0; > +} > + > +static int > +vmxnet3_set_rss_hash_opt(struct net_device *netdev, > + struct vmxnet3_adapter *adapter, > + struct ethtool_rxnfc *nfc) > +{ > + enum Vmxnet3_RSSField rss_fields = adapter->rss_fields; > + > + /* RSS does not support anything other than hashing > + * to queues on src and dst IPs and ports > + */ > + if (nfc->data & ~(RXH_IP_SRC | RXH_IP_DST | > + RXH_L4_B_0_1 | RXH_L4_B_2_3)) > + return -EINVAL; > + > + switch (nfc->flow_type) { > + case TCP_V4_FLOW: > + case TCP_V6_FLOW: > + if (!(nfc->data & RXH_IP_SRC) || > + !(nfc->data & RXH_IP_DST) || > + !(nfc->data & RXH_L4_B_0_1) || > + !(nfc->data & RXH_L4_B_2_3)) > + return -EINVAL; > + break; > + case UDP_V4_FLOW: > + if (!(nfc->data 
& RXH_IP_SRC) || > + !(nfc->data & RXH_IP_DST)) > + return -EINVAL; > + switch (nfc->data & (RXH_L4_B_0_1 | RXH_L4_B_2_3)) { > +
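To illustrate the consistency point raised in the review, here is a minimal userspace sketch in which the UDP "get" handler can only report values that the "set" handler accepts, and vice versa. All names and bit values are hypothetical stand-ins for the ethtool RXH_* bits and the device's RSS field flag; this is not the driver code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the hash option bits; values are illustrative. */
#define HASH_IP_SRC   (1u << 0)
#define HASH_IP_DST   (1u << 1)
#define HASH_L4_SRC   (1u << 2)
#define HASH_L4_DST   (1u << 3)
#define HASH_IP       (HASH_IP_SRC | HASH_IP_DST)
#define HASH_L4       (HASH_L4_SRC | HASH_L4_DST)
#define RSS_FIELD_UDP (1u << 0) /* hypothetical per-protocol device flag */

/* get: flag set -> 4-tuple hashing, flag clear -> IP-only hashing. */
static uint32_t udp_hash_get(uint32_t rss_fields)
{
	uint32_t data = HASH_IP;

	if (rss_fields & RSS_FIELD_UDP)
		data |= HASH_L4;
	return data;
}

/* set: accepts exactly the two values that udp_hash_get() can report. */
static int udp_hash_set(uint32_t *rss_fields, uint32_t data)
{
	assert(rss_fields != NULL);

	if (data == (HASH_IP | HASH_L4))
		*rss_fields |= RSS_FIELD_UDP;
	else if (data == HASH_IP)
		*rss_fields &= ~RSS_FIELD_UDP;
	else
		return -EINVAL;
	return 0;
}
```

With this shape, whatever a user reads back can be written again unchanged, which is the property the get and set handlers in the patch do not currently share.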
Re: [PATCHv2] media: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size
On 27.05.2020 10:23, Tomi Valkeinen wrote: > Commit 9495b7e92f716ab2bd6814fab5e97ab4a39adfdd ("driver core: platform: > Initialize dma_parms for platform devices") in v5.7-rc5 causes > vb2_dma_contig_clear_max_seg_size() to kfree memory that was not > allocated by vb2_dma_contig_set_max_seg_size(). > > The assumption in vb2_dma_contig_set_max_seg_size() seems to be that > dev->dma_parms is always NULL when the driver is probed, and the case > where dev->dma_parms has bee initialized by someone else than the driver > (by calling vb2_dma_contig_set_max_seg_size) will cause a failure. > > All the current users of these functions are platform devices, which now > always have dma_parms set by the driver core. To fix the issue for v5.7, > make vb2_dma_contig_set_max_seg_size() return an error if dma_parms is > NULL to be on the safe side, and remove the kfree code from > vb2_dma_contig_clear_max_seg_size(). > > For v5.8 we should remove the two functions and move the > dma_set_max_seg_size() calls into the drivers. 
> > Signed-off-by: Tomi Valkeinen > Fixes: 9495b7e92f71 ("driver core: platform: Initialize dma_parms for > platform devices") > Cc: sta...@vger.kernel.org Acked-by: Marek Szyprowski > --- > > Changes in v2: > * vb2_dma_contig_clear_max_seg_size to empty static inline > * Added cc: stable and fixes tag > > .../common/videobuf2/videobuf2-dma-contig.c | 20 ++- > include/media/videobuf2-dma-contig.h | 2 +- > 2 files changed, 3 insertions(+), 19 deletions(-) > > diff --git a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > index d3a3ee5b597b..f4b4a7c135eb 100644 > --- a/drivers/media/common/videobuf2/videobuf2-dma-contig.c > +++ b/drivers/media/common/videobuf2/videobuf2-dma-contig.c > @@ -726,9 +726,8 @@ EXPORT_SYMBOL_GPL(vb2_dma_contig_memops); > int vb2_dma_contig_set_max_seg_size(struct device *dev, unsigned int size) > { > if (!dev->dma_parms) { > - dev->dma_parms = kzalloc(sizeof(*dev->dma_parms), GFP_KERNEL); > - if (!dev->dma_parms) > - return -ENOMEM; > + dev_err(dev, "Failed to set max_seg_size: dma_parms is NULL\n"); > + return -ENODEV; > } > if (dma_get_max_seg_size(dev) < size) > return dma_set_max_seg_size(dev, size); > @@ -737,21 +736,6 @@ int vb2_dma_contig_set_max_seg_size(struct device *dev, > unsigned int size) > } > EXPORT_SYMBOL_GPL(vb2_dma_contig_set_max_seg_size); > > -/* > - * vb2_dma_contig_clear_max_seg_size() - release resources for DMA parameters > - * @dev: device for configuring DMA parameters > - * > - * This function releases resources allocated to configure DMA parameters > - * (see vb2_dma_contig_set_max_seg_size() function). It should be called from > - * device drivers on driver remove. 
> - */ > -void vb2_dma_contig_clear_max_seg_size(struct device *dev) > -{ > - kfree(dev->dma_parms); > - dev->dma_parms = NULL; > -} > -EXPORT_SYMBOL_GPL(vb2_dma_contig_clear_max_seg_size); > - > MODULE_DESCRIPTION("DMA-contig memory handling routines for videobuf2"); > MODULE_AUTHOR("Pawel Osciak "); > MODULE_LICENSE("GPL"); > diff --git a/include/media/videobuf2-dma-contig.h > b/include/media/videobuf2-dma-contig.h > index 5604818d137e..5be313cbf7d7 100644 > --- a/include/media/videobuf2-dma-contig.h > +++ b/include/media/videobuf2-dma-contig.h > @@ -25,7 +25,7 @@ vb2_dma_contig_plane_dma_addr(struct vb2_buffer *vb, > unsigned int plane_no) > } > > int vb2_dma_contig_set_max_seg_size(struct device *dev, unsigned int size); > -void vb2_dma_contig_clear_max_seg_size(struct device *dev); > +static inline void vb2_dma_contig_clear_max_seg_size(struct device *dev) { } > > extern const struct vb2_mem_ops vb2_dma_contig_memops; > Best regards -- Marek Szyprowski, PhD Samsung R Institute Poland
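The ownership rule behind the fix can be sketched in plain C as follows. This is a simplified userspace model under stated assumptions: struct device and struct dma_parms here are hypothetical stand-ins, and the helper names are shortened. The point is only that a helper which never allocates dma_parms has nothing to free later.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct dma_parms { unsigned int max_seg_size; };
struct device { struct dma_parms *dma_parms; };

/* After the fix: never allocate dma_parms here; a missing dma_parms is
 * reported as an error instead of being papered over. */
static int set_max_seg_size(struct device *dev, unsigned int size)
{
	assert(dev != NULL);

	if (!dev->dma_parms)
		return -ENODEV;
	if (dev->dma_parms->max_seg_size < size)
		dev->dma_parms->max_seg_size = size;
	return 0;
}

/* Kept only so existing callers still compile; owns nothing, frees nothing. */
static inline void clear_max_seg_size(struct device *dev)
{
	(void)dev;
}
```

In this model the dma_parms object belongs to whoever created it (the driver core, in the patch description), so the clear helper leaves the pointer untouched.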
Re: [GIT PULL] sh: remove sh5 support
[adding Linus] On Thu, May 07, 2020 at 07:35:52AM -0700, Christoph Hellwig wrote: > Any progress on this? I plan to resend the sh dma-mapping I've been > trying to get upstream for a year again, and they would conflict, > so I could look into rebasing them first. So for years now it has been close to and in the end impossible to provoke sh maintainer action. At the same point hardware is pretty much long gone for the real commercial variants, and never took off for the open hardware nommu variant. Linus, would you ok with a 5.8 pull request to just kill off arch/sh/? > > On Sat, Apr 25, 2020 at 12:19:47AM +0200, Arnd Bergmann wrote: > > The following changes since commit > > ae83d0b416db002fe95601e7f97f64b59514d936: > > > > Linux 5.7-rc2 (2020-04-19 14:35:30 -0700) > > > > are available in the Git repository at: > > > > git://git.kernel.org:/pub/scm/linux/kernel/git/arnd/playground.git > > tags/sh5-remove > > > > for you to fetch changes up to 29e36fbee3be4c13ff6881a275c86d5f68acfa23: > > > > sh: remove sh5 support (2020-04-24 22:20:55 +0200) > > > > > > sh: remove sh5 support > > > > At long last, this is the removal of the 64-bit sh5 port > > that never went into production. 
> > > > Signed-off-by: Arnd Bergmann > > > > > > > > v2: I should have fixed all the missing changes that Geert pointed out, > > this time sending it as a pull request as the removal patch is > > too big for the mailing lists, and a 'git format-patch -D' patch > > is unreliable > > > > Arnd Bergmann (1): > > sh: remove sh5 support > > > > arch/sh/Kconfig | 62 +- > > arch/sh/Kconfig.cpu |9 - > > arch/sh/Kconfig.debug | 13 +- > > arch/sh/Makefile | 29 +- > > arch/sh/boot/compressed/Makefile | 12 +- > > arch/sh/boot/compressed/misc.c|8 - > > arch/sh/drivers/pci/Makefile |1 - > > arch/sh/drivers/pci/ops-sh5.c | 65 - > > arch/sh/drivers/pci/pci-sh5.c | 217 --- > > arch/sh/drivers/pci/pci-sh5.h | 108 -- > > arch/sh/include/asm/barrier.h |4 +- > > arch/sh/include/asm/bitops.h | 26 - > > arch/sh/include/asm/bl_bit.h | 11 +- > > arch/sh/include/asm/bl_bit_64.h | 37 - > > arch/sh/include/asm/bugs.h|4 - > > arch/sh/include/asm/cache_insns.h | 12 +- > > arch/sh/include/asm/cache_insns_64.h | 20 - > > arch/sh/include/asm/checksum.h|6 +- > > arch/sh/include/asm/elf.h | 23 - > > arch/sh/include/asm/extable.h |4 - > > arch/sh/include/asm/fixmap.h |4 - > > arch/sh/include/asm/io.h |4 - > > arch/sh/include/asm/irq.h |3 - > > arch/sh/include/asm/mmu_context.h | 12 - > > arch/sh/include/asm/mmu_context_64.h | 75 - > > arch/sh/include/asm/module.h |4 - > > arch/sh/include/asm/page.h| 21 +- > > arch/sh/include/asm/pgtable.h | 17 - > > arch/sh/include/asm/pgtable_64.h | 307 > > arch/sh/include/asm/posix_types.h |6 +- > > arch/sh/include/asm/processor.h | 14 +- > > arch/sh/include/asm/processor_64.h| 212 --- > > arch/sh/include/asm/ptrace_64.h | 14 - > > arch/sh/include/asm/string.h |6 +- > > arch/sh/include/asm/string_64.h | 21 - > > arch/sh/include/asm/switch_to.h | 11 +- > > arch/sh/include/asm/switch_to_64.h| 32 - > > arch/sh/include/asm/syscall.h |6 +- > > arch/sh/include/asm/syscall_64.h | 75 - > > arch/sh/include/asm/syscalls.h|9 +- > > arch/sh/include/asm/syscalls_64.h | 18 - > > 
arch/sh/include/asm/thread_info.h |4 +- > > arch/sh/include/asm/tlb.h |6 +- > > arch/sh/include/asm/tlb_64.h | 68 - > > arch/sh/include/asm/traps.h |4 - > > arch/sh/include/asm/traps_64.h| 35 - > > arch/sh/include/asm/types.h |5 - > > arch/sh/include/asm/uaccess.h |4 - > > arch/sh/include/asm/uaccess_64.h | 85 - > > arch/sh/include/asm/unistd.h |6 +- > > arch/sh/include/asm/user.h|7 - > > arch/sh/include/asm/vmlinux.lds.h |8 - > > arch/sh/include/cpu-sh5/cpu/addrspace.h | 12 - > > arch/sh/include/cpu-sh5/cpu/cache.h | 94 - > > arch/sh/include/cpu-sh5/cpu/irq.h | 113 -- > > arch/sh/include/cpu-sh5/cpu/mmu_context.h | 22 - > > arch/sh/include/cpu-sh5/cpu/registers.h | 103 -- > > arch/sh/include/cpu-sh5/cpu/rtc.h |9 - > >
Re: [PATCH v1 2/2] Add PWM driver for LGM
On 27/5/2020 5:15 pm, Andy Shevchenko wrote:
> On Wed, May 27, 2020 at 02:28:53PM +0800, Tanwar, Rahul wrote:
>> On 22/5/2020 4:56 pm, Uwe Kleine-König wrote:
>>> On Fri, May 22, 2020 at 03:41:59PM +0800, Rahul Tanwar wrote:
> ...
>
>>> I'm a unhappy to have this in the PWM driver. The PWM driver is supposed
>>> to be generic and I think this belongs into a dedicated driver.
>> Well noted about all other review concerns. I will rework the driver in v2.
>> However, i am not very sure about the above point - of having a separate
>> dedicated driver for tach_work because its logic is tightly coupled with
>> this driver.
> Actually I agree with Uwe.
> Here is layering violation, i.e. provider and consumer in the same pot. It's
> not good from design perspective.
>

Just to clarify, the PWM controller in our SoC serves just one purpose, which is to control the fan. It's actually named the PWM Fan Controller. There is no other generic usage or any other consumer for this PWM driver, so separating out this part seems redundant to me.

Also, if we separate it out as a dedicated driver, this will end up as a very small daemon which is going to be very hard to justify while upstreaming.

Regards,
Rahul
Re: [PATCH 3/3] perf jvmti: Fix demangling Java symbols
On 05/28/20 06:34 AM, Arnaldo Carvalho de Melo wrote:
>>
>> This is in my tmp.perf/core branch pending a round of testing, after
>> that it'll move to perf/core on its way to 5.8, thanks.
>
> All tests passed, moved to perf/core.
>

Great, thank you!

--
Nick
[PATCH 01/14] cachefiles: switch to kernel_write
__kernel_write doesn't take a sb_writers reference, which we need here.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells
---
 fs/cachefiles/rdwr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/cachefiles/rdwr.c b/fs/cachefiles/rdwr.c
index e7726f5f1241c..3080cda9e8245 100644
--- a/fs/cachefiles/rdwr.c
+++ b/fs/cachefiles/rdwr.c
@@ -937,7 +937,7 @@ int cachefiles_write_page(struct fscache_storage *op, struct page *page)
 	}
 
 	data = kmap(page);
-	ret = __kernel_write(file, data, len, &pos);
+	ret = kernel_write(file, data, len, &pos);
 	kunmap(page);
 	fput(file);
 	if (ret != len)
-- 
2.26.2
[PATCH 10/14] fs: add a __kernel_read helper
This is the counterpart to __kernel_write, and skips the rw_verify_area call compared to kernel_read.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c    | 21 +++++++++++++++++++++
 include/linux/fs.h |  1 +
 2 files changed, 22 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 8cfca5f8fc3ce..bd12af8a895c8 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -430,6 +430,27 @@ ssize_t __vfs_read(struct file *file, char __user *buf, size_t count,
 	return -EINVAL;
 }
 
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
+{
+	mm_segment_t old_fs = get_fs();
+	ssize_t ret;
+
+	if (!(file->f_mode & FMODE_CAN_READ))
+		return -EINVAL;
+
+	if (count > MAX_RW_COUNT)
+		count = MAX_RW_COUNT;
+	set_fs(KERNEL_DS);
+	ret = __vfs_read(file, (void __user *)buf, count, pos);
+	set_fs(old_fs);
+	if (ret > 0) {
+		fsnotify_access(file);
+		add_rchar(current, ret);
+	}
+	inc_syscr(current);
+	return ret;
+}
+
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 21f126957c2cf..6441aaa25f8f2 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3011,6 +3011,7 @@ extern int kernel_read_file_from_path_initns(const char *, void **, loff_t *, lo
 extern int kernel_read_file_from_fd(int, void **, loff_t *, loff_t,
 				    enum kernel_read_file_id);
 extern ssize_t kernel_read(struct file *, void *, size_t, loff_t *);
+ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos);
 extern ssize_t kernel_write(struct file *, const void *, size_t, loff_t *);
 extern ssize_t __kernel_write(struct file *, const void *, size_t, loff_t *);
 extern struct file * open_exec(const char *);
-- 
2.26.2
[PATCH 07/14] fs: implement kernel_write using __kernel_write
Consolidate the two in-kernel write helpers to make upcoming changes easier. The only differences are the call to rw_verify_area, which kernel_write now makes itself, and an access_ok check that doesn't make sense for kernel buffers to start with.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index f0768313ea010..abb84391cfbc5 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -499,6 +499,7 @@ static ssize_t __vfs_write(struct file *file, const char __user *p,
 	return -EINVAL;
 }
 
+/* caller is responsible for file_start_write/file_end_write */
 ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos)
 {
 	mm_segment_t old_fs;
@@ -528,16 +529,16 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t res;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	res = vfs_write(file, (__force const char __user *)buf, count, pos);
-	set_fs(old_fs);
+	ret = rw_verify_area(WRITE, file, pos, count);
+	if (ret)
+		return ret;
 
-	return res;
+	file_start_write(file);
+	ret = __kernel_write(file, buf, count, pos);
+	file_end_write(file);
+	return ret;
 }
 EXPORT_SYMBOL(kernel_write);
 
-- 
2.26.2
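The layering in this patch (a checked wrapper that validates the range and brackets an unchecked core helper) can be modeled in userspace along these lines. Everything here is a hypothetical stand-in: struct model_file, the writers counter (playing the role of the sb_writers reference) and the function names are assumptions made for the sketch, not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical userspace model of a file; not the kernel's struct file. */
struct model_file {
	char data[64];
	int writers; /* stands in for the sb_writers reference count */
};

/* Low-level helper: the caller must already hold a writer reference. */
static ssize_t model_write_unlocked(struct model_file *f, const void *buf,
				    size_t count, off_t *pos)
{
	assert(f->writers > 0);
	memcpy(f->data + *pos, buf, count);
	*pos += (off_t)count;
	return (ssize_t)count;
}

/* Checked wrapper: validate the range first, then bracket the write. */
static ssize_t model_write(struct model_file *f, const void *buf,
			   size_t count, off_t *pos)
{
	ssize_t ret;

	if (*pos < 0 || (size_t)*pos > sizeof(f->data) ||
	    count > sizeof(f->data) - (size_t)*pos)
		return -EINVAL; /* plays the role of rw_verify_area() */

	f->writers++; /* file_start_write() in the patch */
	ret = model_write_unlocked(f, buf, count, pos);
	f->writers--; /* file_end_write() in the patch */
	return ret;
}
```

Callers that already hold the reference would use the unlocked variant directly; everyone else goes through the wrapper and cannot forget the bracketing.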
[PATCH 06/14] fs: remove the call_{read,write}_iter functions
Just open coding the methods calls is a lot easier to follow. Signed-off-by: Christoph Hellwig --- drivers/block/loop.c | 4 ++-- drivers/target/target_core_file.c | 4 ++-- fs/aio.c | 4 ++-- fs/io_uring.c | 4 ++-- fs/read_write.c | 12 ++-- fs/splice.c | 2 +- include/linux/fs.h| 12 7 files changed, 15 insertions(+), 27 deletions(-) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index da693e6a834e5..ad167050a4ec4 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -572,9 +572,9 @@ static int lo_rw_aio(struct loop_device *lo, struct loop_cmd *cmd, kthread_associate_blkcg(cmd->css); if (rw == WRITE) - ret = call_write_iter(file, >iocb, ); + ret = file->f_op->write_iter(>iocb, ); else - ret = call_read_iter(file, >iocb, ); + ret = file->f_op->read_iter(>iocb, ); lo_rw_aio_do_completion(cmd); kthread_associate_blkcg(NULL); diff --git a/drivers/target/target_core_file.c b/drivers/target/target_core_file.c index 7143d03f0e027..79f0707877917 100644 --- a/drivers/target/target_core_file.c +++ b/drivers/target/target_core_file.c @@ -303,9 +303,9 @@ fd_execute_rw_aio(struct se_cmd *cmd, struct scatterlist *sgl, u32 sgl_nents, aio_cmd->iocb.ki_flags |= IOCB_DSYNC; if (is_write) - ret = call_write_iter(file, _cmd->iocb, ); + ret = file->f_op->write_iter(_cmd->iocb, ); else - ret = call_read_iter(file, _cmd->iocb, ); + ret = file->f_op->read_iter(_cmd->iocb, ); kfree(bvec); diff --git a/fs/aio.c b/fs/aio.c index 5f3d3d8149287..1ccc0efdc357d 100644 --- a/fs/aio.c +++ b/fs/aio.c @@ -1540,7 +1540,7 @@ static int aio_read(struct kiocb *req, const struct iocb *iocb, return ret; ret = rw_verify_area(READ, file, >ki_pos, iov_iter_count()); if (!ret) - aio_rw_done(req, call_read_iter(file, req, )); + aio_rw_done(req, file->f_op->read_iter(req, )); kfree(iovec); return ret; } @@ -1580,7 +1580,7 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb, __sb_writers_release(file_inode(file)->i_sb, SB_FREEZE_WRITE); } req->ki_flags |= IOCB_WRITE; - 
aio_rw_done(req, call_write_iter(file, req, )); + aio_rw_done(req, file->f_op->write_iter(req, )); } kfree(iovec); return ret; diff --git a/fs/io_uring.c b/fs/io_uring.c index bb25e3997d418..f4b808231af0b 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2579,7 +2579,7 @@ static int io_read(struct io_kiocb *req, bool force_nonblock) ssize_t ret2; if (req->file->f_op->read_iter) - ret2 = call_read_iter(req->file, kiocb, ); + ret2 = req->file->f_op->read_iter(kiocb, ); else ret2 = loop_rw_iter(READ, req->file, kiocb, ); @@ -2694,7 +2694,7 @@ static int io_write(struct io_kiocb *req, bool force_nonblock) current->signal->rlim[RLIMIT_FSIZE].rlim_cur = req->fsize; if (req->file->f_op->write_iter) - ret2 = call_write_iter(req->file, kiocb, ); + ret2 = req->file->f_op->write_iter(kiocb, ); else ret2 = loop_rw_iter(WRITE, req->file, kiocb, ); diff --git a/fs/read_write.c b/fs/read_write.c index 76be155ad9824..f0768313ea010 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -412,7 +412,7 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo kiocb.ki_pos = (ppos ? *ppos : 0); iov_iter_init(, READ, , 1, len); - ret = call_read_iter(filp, , ); + ret = filp->f_op->read_iter(, ); BUG_ON(ret == -EIOCBQUEUED); if (ppos) *ppos = kiocb.ki_pos; @@ -481,7 +481,7 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t kiocb.ki_pos = (ppos ? *ppos : 0); iov_iter_init(, WRITE, , 1, len); - ret = call_write_iter(filp, , ); + ret = filp->f_op->write_iter(, ); BUG_ON(ret == -EIOCBQUEUED); if (ret > 0 && ppos) *ppos = kiocb.ki_pos; @@ -690,9 +690,9 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter, kiocb.ki_pos = (ppos ? *ppos : 0); if (type == READ) - ret = call_read_iter(filp, , iter); + ret = filp->f_op->read_iter(, iter); else - ret = call_write_iter(filp, , iter); + ret = filp->f_op->write_iter(, iter); BUG_ON(ret == -EIOCBQUEUED); if (ppos) *ppos = kiocb.ki_pos; @@
[PATCH 14/14] fs: don't change the address limit for ->read_iter in __kernel_read
If we read from a file that implements ->read_iter there is no need to change the address limit if we send a kvec down. Implement that case, and prefer it over using plain ->read with a changed address limit if available.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 46ddfce17e839..c93acbd8bf5a3 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -421,7 +421,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo
 
 ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs = get_fs();
 	ssize_t ret;
 
 	if (!(file->f_mode & FMODE_CAN_READ))
@@ -429,14 +428,25 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 	if (count > MAX_RW_COUNT)
 		count = MAX_RW_COUNT;
-	set_fs(KERNEL_DS);
-	if (file->f_op->read)
+	if (file->f_op->read_iter) {
+		struct kvec iov = { .iov_base = buf, .iov_len = count };
+		struct kiocb kiocb;
+		struct iov_iter iter;
+
+		init_sync_kiocb(&kiocb, file);
+		kiocb.ki_pos = *pos;
+		iov_iter_kvec(&iter, READ, &iov, 1, count);
+		ret = file->f_op->read_iter(&kiocb, &iter);
+		*pos = kiocb.ki_pos;
+	} else if (file->f_op->read) {
+		mm_segment_t old_fs = get_fs();
+
+		set_fs(KERNEL_DS);
 		ret = file->f_op->read(file, (void __user *)buf, count, pos);
-	else if (file->f_op->read_iter)
-		ret = new_sync_read(file, (void __user *)buf, count, pos);
-	else
+		set_fs(old_fs);
+	} else {
 		ret = -EINVAL;
-	set_fs(old_fs);
+	}
 	if (ret > 0) {
 		fsnotify_access(file);
 		add_rchar(current, ret);
-- 
2.26.2
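The dispatch order this patch introduces (prefer the iterator-based method, fall back to the plain one, otherwise fail) can be sketched in userspace as below. The struct and function names are hypothetical; struct iovec from <sys/uio.h> merely plays the role of the kernel's struct kvec, and the plain_reads counter exists only to make the fallback observable.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical model of a file with two optional read methods. */
struct model_file;
struct model_ops {
	ssize_t (*read)(struct model_file *f, void *buf, size_t count);
	ssize_t (*read_iter)(struct model_file *f, const struct iovec *iov,
			     int iovcnt);
};
struct model_file {
	const struct model_ops *ops;
	const char *content;
	int plain_reads; /* how often the ->read fallback was taken */
};

/* Sample plain method: copy at most count bytes of the content. */
static ssize_t copy_out(struct model_file *f, void *buf, size_t count)
{
	size_t n = strlen(f->content);

	if (n > count)
		n = count;
	memcpy(buf, f->content, n);
	return (ssize_t)n;
}

/* Sample iterator method: handles a single segment only. */
static ssize_t iter_out(struct model_file *f, const struct iovec *iov,
			int iovcnt)
{
	assert(iovcnt == 1);
	return copy_out(f, iov[0].iov_base, iov[0].iov_len);
}

/* Prefer ->read_iter, fall back to ->read, else report -EINVAL. */
static ssize_t model_read(struct model_file *f, void *buf, size_t count)
{
	if (f->ops->read_iter) {
		struct iovec iov = { .iov_base = buf, .iov_len = count };

		return f->ops->read_iter(f, &iov, 1);
	}
	if (f->ops->read) {
		f->plain_reads++;
		return f->ops->read(f, buf, count);
	}
	return -EINVAL;
}
```

In the kernel the motivation is that the iterator path carries a kernel-buffer descriptor, so no address-limit change is needed; the model only shows the ordering of the three cases.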
[PATCH 11/14] integrity/ima: switch to using __kernel_read
__kernel_read has a bunch of additional sanity checks, and this moves the set_fs out of non-core code.

Signed-off-by: Christoph Hellwig
---
 security/integrity/iint.c | 14 +-------------
 1 file changed, 1 insertion(+), 13 deletions(-)

diff --git a/security/integrity/iint.c b/security/integrity/iint.c
index e12c4900510f6..1d20003243c3f 100644
--- a/security/integrity/iint.c
+++ b/security/integrity/iint.c
@@ -188,19 +188,7 @@ DEFINE_LSM(integrity) = {
 int integrity_kernel_read(struct file *file, loff_t offset,
 			  void *addr, unsigned long count)
 {
-	mm_segment_t old_fs;
-	char __user *buf = (char __user *)addr;
-	ssize_t ret;
-
-	if (!(file->f_mode & FMODE_READ))
-		return -EBADF;
-
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	ret = __vfs_read(file, buf, count, &offset);
-	set_fs(old_fs);
-
-	return ret;
+	return __kernel_read(file, addr, count, &offset);
 }
 
 /*
-- 
2.26.2
[PATCH v2] perf jvmti: Remove redundant jitdump line table entries
For each PC/BCI pair in the JVMTI compiler inlining record table, the jitdump plugin emits debug line table entries for every source line in the method preceding that BCI. Instead only emit one source line per PC/BCI pair. Reported by Ian Rogers. This reduces the .dump size for SPECjbb from ~230MB to ~40MB. Signed-off-by: Nick Gasson --- Changes in v2: - Split the unrelated DWARF debug fix into a separate patch - Added a comment about the use of c->methods tools/perf/jvmti/libjvmti.c | 78 - 1 file changed, 33 insertions(+), 45 deletions(-) diff --git a/tools/perf/jvmti/libjvmti.c b/tools/perf/jvmti/libjvmti.c index c5d30834a64c..fcca275e5bf9 100644 --- a/tools/perf/jvmti/libjvmti.c +++ b/tools/perf/jvmti/libjvmti.c @@ -32,38 +32,41 @@ static void print_error(jvmtiEnv *jvmti, const char *msg, jvmtiError ret) #ifdef HAVE_JVMTI_CMLR static jvmtiError -do_get_line_numbers(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci, - jvmti_line_info_t *tab, jint *nr) +do_get_line_number(jvmtiEnv *jvmti, void *pc, jmethodID m, jint bci, + jvmti_line_info_t *tab) { - jint i, lines = 0; - jint nr_lines = 0; + jint i, nr_lines = 0; jvmtiLineNumberEntry *loc_tab = NULL; jvmtiError ret; + jint src_line = -1; ret = (*jvmti)->GetLineNumberTable(jvmti, m, _lines, _tab); if (ret == JVMTI_ERROR_ABSENT_INFORMATION || ret == JVMTI_ERROR_NATIVE_METHOD) { /* No debug information for this method */ - *nr = 0; - return JVMTI_ERROR_NONE; + return ret; } else if (ret != JVMTI_ERROR_NONE) { print_error(jvmti, "GetLineNumberTable", ret); return ret; } - for (i = 0; i < nr_lines; i++) { - if (loc_tab[i].start_location < bci) { - tab[lines].pc = (unsigned long)pc; - tab[lines].line_number = loc_tab[i].line_number; - tab[lines].discrim = 0; /* not yet used */ - tab[lines].methodID = m; - lines++; - } else { - break; - } + for (i = 0; i < nr_lines && loc_tab[i].start_location <= bci; i++) { + src_line = i; + } + + if (src_line != -1) { + tab->pc = (unsigned long)pc; + tab->line_number = 
loc_tab[src_line].line_number; + tab->discrim = 0; /* not yet used */ + tab->methodID = m; + + ret = JVMTI_ERROR_NONE; + } else { + ret = JVMTI_ERROR_ABSENT_INFORMATION; } + (*jvmti)->Deallocate(jvmti, (unsigned char *)loc_tab); - *nr = lines; - return JVMTI_ERROR_NONE; + + return ret; } static jvmtiError @@ -71,9 +74,8 @@ get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t ** { const jvmtiCompiledMethodLoadRecordHeader *hdr; jvmtiCompiledMethodLoadInlineRecord *rec; - jvmtiLineNumberEntry *lne = NULL; PCStackInfo *c; - jint nr, ret; + jint ret; int nr_total = 0; int i, lines_total = 0; @@ -86,24 +88,7 @@ get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t ** for (hdr = compile_info; hdr != NULL; hdr = hdr->next) { if (hdr->kind == JVMTI_CMLR_INLINE_INFO) { rec = (jvmtiCompiledMethodLoadInlineRecord *)hdr; - for (i = 0; i < rec->numpcs; i++) { - c = rec->pcinfo + i; - nr = 0; - /* -* unfortunately, need a tab to get the number of lines! -*/ - ret = (*jvmti)->GetLineNumberTable(jvmti, c->methods[0], , ); - if (ret == JVMTI_ERROR_NONE) { - /* free what was allocated for nothing */ - (*jvmti)->Deallocate(jvmti, (unsigned char *)lne); - nr_total += (int)nr; - } else if (ret == JVMTI_ERROR_ABSENT_INFORMATION || - ret == JVMTI_ERROR_NATIVE_METHOD) { - /* No debug information for this method */ - } else { - print_error(jvmti, "GetLineNumberTable", ret); - } - } + nr_total += rec->numpcs; } } @@ -122,14 +107,17 @@ get_line_numbers(jvmtiEnv *jvmti, const void *compile_info, jvmti_line_info_t ** rec = (jvmtiCompiledMethodLoadInlineRecord *)hdr; for (i = 0; i <
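The lookup that replaces the per-line loop boils down to "find the last table entry whose start location is at or before the BCI". A self-contained sketch of that search follows; struct line_entry is a hypothetical, simplified version of a JVMTI line number entry, and the sketch assumes (as the patch appears to) that the table is sorted by start location.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified version of a line number table entry. */
struct line_entry {
	long start_location; /* first bytecode index covered by the line */
	int line_number;
};

/* Return the source line covering bci, or -1 if the table (sorted by
 * start_location) has no entry at or before bci. */
static int line_for_bci(const struct line_entry *tab, size_t n, long bci)
{
	int line = -1;
	size_t i;

	assert(tab != NULL || n == 0);

	for (i = 0; i < n && tab[i].start_location <= bci; i++)
		line = tab[i].line_number;
	return line;
}
```

Each PC/BCI pair thus yields at most one line, rather than one entry per preceding source line, which is where the size reduction reported in the commit message would come from.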
[PATCH 13/14] fs: remove __vfs_read
Fold it into the two callers. Signed-off-by: Christoph Hellwig --- fs/read_write.c| 43 +-- include/linux/fs.h | 1 - 2 files changed, 21 insertions(+), 23 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 4e19152a7efe0..46ddfce17e839 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -419,17 +419,6 @@ static ssize_t new_sync_read(struct file *filp, char __user *buf, size_t len, lo return ret; } -ssize_t __vfs_read(struct file *file, char __user *buf, size_t count, - loff_t *pos) -{ - if (file->f_op->read) - return file->f_op->read(file, buf, count, pos); - else if (file->f_op->read_iter) - return new_sync_read(file, buf, count, pos); - else - return -EINVAL; -} - ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos) { mm_segment_t old_fs = get_fs(); @@ -441,7 +430,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos) if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; set_fs(KERNEL_DS); - ret = __vfs_read(file, (void __user *)buf, count, pos); + if (file->f_op->read) + ret = file->f_op->read(file, (void __user *)buf, count, pos); + else if (file->f_op->read_iter) + ret = new_sync_read(file, (void __user *)buf, count, pos); + else + ret = -EINVAL; set_fs(old_fs); if (ret > 0) { fsnotify_access(file); @@ -474,17 +468,22 @@ ssize_t vfs_read(struct file *file, char __user *buf, size_t count, loff_t *pos) return -EFAULT; ret = rw_verify_area(READ, file, pos, count); - if (!ret) { - if (count > MAX_RW_COUNT) - count = MAX_RW_COUNT; - ret = __vfs_read(file, buf, count, pos); - if (ret > 0) { - fsnotify_access(file); - add_rchar(current, ret); - } - inc_syscr(current); - } + if (ret) + return ret; + if (count > MAX_RW_COUNT) + count = MAX_RW_COUNT; + if (file->f_op->read) + ret = file->f_op->read(file, buf, count, pos); + else if (file->f_op->read_iter) + ret = new_sync_read(file, buf, count, pos); + else + ret = -EINVAL; + if (ret > 0) { + fsnotify_access(file); + add_rchar(current, ret); + } + 
inc_syscr(current); return ret; } diff --git a/include/linux/fs.h b/include/linux/fs.h index 6441aaa25f8f2..4c10a07a36178 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -1905,7 +1905,6 @@ ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector, struct iovec *fast_pointer, struct iovec **ret_pointer); -extern ssize_t __vfs_read(struct file *, char __user *, size_t, loff_t *); extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *); extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *); extern ssize_t vfs_readv(struct file *, const struct iovec __user *, -- 2.26.2
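The dispatch that this patch open-codes in both callers can be sketched in userspace as below. This is only an illustration of the "use ->read if present, else ->read_iter, else fail with -EINVAL" pattern; the mock_ types and functions are hypothetical stand-ins and not the kernel's struct file or file_operations, and the mock calls ->read_iter directly where the kernel goes through new_sync_read().

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical stand-ins for struct file / struct file_operations. */
struct mock_file;

struct mock_file_ops {
	ssize_t (*read)(struct mock_file *f, char *buf, size_t count,
			long long *pos);
	ssize_t (*read_iter)(struct mock_file *f, char *buf, size_t count,
			     long long *pos);
};

struct mock_file {
	const struct mock_file_ops *f_op;
	const char *data;	/* backing bytes of the mock file */
	size_t size;		/* number of valid bytes in data */
};

/* Simple backend: copy out of the in-memory buffer and advance *pos. */
ssize_t mock_read(struct mock_file *f, char *buf, size_t count,
		  long long *pos)
{
	size_t off = (size_t)*pos;

	if (off >= f->size)
		return 0;
	if (count > f->size - off)
		count = f->size - off;
	memcpy(buf, f->data + off, count);
	*pos += (long long)count;
	return (ssize_t)count;
}

/* Open-coded dispatch, mirroring the shape of the folded-in code. */
ssize_t mock_vfs_read(struct mock_file *f, char *buf, size_t count,
		      long long *pos)
{
	ssize_t ret;

	if (f->f_op->read)
		ret = f->f_op->read(f, buf, count, pos);
	else if (f->f_op->read_iter)
		ret = f->f_op->read_iter(f, buf, count, pos);
	else
		ret = -EINVAL;
	return ret;
}
```

A file whose ops table sets neither method gets -EINVAL, which is the case the removed helper also handled.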
[PATCH 12/14] fs: implement kernel_read using __kernel_read
Consolidate the two in-kernel read helpers to make upcoming changes
easier. The only differences are the missing call to rw_verify_area in
kernel_read, and an access_ok check that doesn't make sense for kernel
buffers to start with.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index bd12af8a895c8..4e19152a7efe0 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -453,15 +453,12 @@ ssize_t __kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 
 ssize_t kernel_read(struct file *file, void *buf, size_t count, loff_t *pos)
 {
-	mm_segment_t old_fs;
-	ssize_t result;
+	ssize_t ret;
 
-	old_fs = get_fs();
-	set_fs(KERNEL_DS);
-	/* The cast to a user pointer is valid due to the set_fs() */
-	result = vfs_read(file, (void __user *)buf, count, pos);
-	set_fs(old_fs);
-	return result;
+	ret = rw_verify_area(READ, file, pos, count);
+	if (ret)
+		return ret;
+	return __kernel_read(file, buf, count, pos);
 }
 EXPORT_SYMBOL(kernel_read);
 
--
2.26.2
[PATCH 05/14] fs: check FMODE_WRITE in __kernel_write
We still need to check if the file is open for write, even for the
low-level helper.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index 2c601d853ff3d..76be155ad9824 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -505,6 +505,8 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	const char __user *p;
 	ssize_t ret;
 
+	if (!(file->f_mode & FMODE_WRITE))
+		return -EBADF;
 	if (!(file->f_mode & FMODE_CAN_WRITE))
 		return -EINVAL;
 
--
2.26.2
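The two checks in the hunk above return different errors for different conditions: -EBADF when the file was not opened for writing, -EINVAL when it has no usable write method. A minimal userspace sketch of that ordering follows; the MOCK_ flag names and their numeric values are illustrative assumptions, not the kernel's definitions.

```c
#include <errno.h>

/* Illustrative flag values only; treat them as assumptions. */
#define MOCK_FMODE_WRITE	0x2u		/* opened for writing */
#define MOCK_FMODE_CAN_WRITE	0x40000u	/* has a write method */

/* Returns 0 if a write may proceed, or a negative errno otherwise. */
int mock_write_allowed(unsigned int f_mode)
{
	if (!(f_mode & MOCK_FMODE_WRITE))
		return -EBADF;
	if (!(f_mode & MOCK_FMODE_CAN_WRITE))
		return -EINVAL;
	return 0;
}
```

Because the open-mode test comes first, a file that is both read-only and lacking a write method reports -EBADF rather than -EINVAL.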
[PATCH 03/14] bpfilter: switch to kernel_write
While pipes don't really need sb_writers protection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.

Signed-off-by: Christoph Hellwig
---
 net/bpfilter/bpfilter_kern.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/bpfilter/bpfilter_kern.c b/net/bpfilter/bpfilter_kern.c
index c0f0990f30b60..1905e01c3aa9a 100644
--- a/net/bpfilter/bpfilter_kern.c
+++ b/net/bpfilter/bpfilter_kern.c
@@ -50,7 +50,7 @@ static int __bpfilter_process_sockopt(struct sock *sk, int optname,
 	req.len = optlen;
 	if (!bpfilter_ops.info.pid)
 		goto out;
-	n = __kernel_write(bpfilter_ops.info.pipe_to_umh, , sizeof(req),
+	n = kernel_write(bpfilter_ops.info.pipe_to_umh, , sizeof(req),
 			   );
 	if (n != sizeof(req)) {
 		pr_err("write fail %zd\n", n);
--
2.26.2
[PATCH 09/14] fs: don't change the address limit for ->write_iter in __kernel_write
If we write to a file that implements ->write_iter there is no need to change the address limit if we send a kvec down. Implement that case, and prefer it over using plain ->write with a changed address limit if available. Signed-off-by: Christoph Hellwig --- fs/read_write.c | 34 ++ 1 file changed, 22 insertions(+), 12 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index 3bcb084f160de..8cfca5f8fc3ce 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -489,10 +489,9 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t } /* caller is responsible for file_start_write/file_end_write */ -ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) +ssize_t __kernel_write(struct file *file, const void *buf, size_t count, + loff_t *pos) { - mm_segment_t old_fs; - const char __user *p; ssize_t ret; if (!(file->f_mode & FMODE_WRITE)) @@ -500,18 +499,29 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; - old_fs = get_fs(); - set_fs(KERNEL_DS); - p = (__force const char __user *)buf; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; - if (file->f_op->write) - ret = file->f_op->write(file, p, count, pos); - else if (file->f_op->write_iter) - ret = new_sync_write(file, p, count, pos); - else + if (file->f_op->write_iter) { + struct kvec iov = { .iov_base = (void *)buf, .iov_len = count }; + struct kiocb kiocb; + struct iov_iter iter; + + init_sync_kiocb(, file); + kiocb.ki_pos = *pos; + iov_iter_kvec(, WRITE, , 1, count); + ret = file->f_op->write_iter(, ); + if (ret > 0) + *pos = kiocb.ki_pos; + } else if (file->f_op->write) { + mm_segment_t old_fs = get_fs(); + + set_fs(KERNEL_DS); + ret = file->f_op->write(file, (__force const char __user *)buf, + count, pos); + set_fs(old_fs); + } else { ret = -EINVAL; - set_fs(old_fs); + } if (ret > 0) { fsnotify_modify(file); add_wchar(current, ret); -- 2.26.2
[PATCH 08/14] fs: remove __vfs_write
Fold it into the two callers. Signed-off-by: Christoph Hellwig --- fs/read_write.c | 46 ++ 1 file changed, 22 insertions(+), 24 deletions(-) diff --git a/fs/read_write.c b/fs/read_write.c index abb84391cfbc5..3bcb084f160de 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -488,17 +488,6 @@ static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t return ret; } -static ssize_t __vfs_write(struct file *file, const char __user *p, - size_t count, loff_t *pos) -{ - if (file->f_op->write) - return file->f_op->write(file, p, count, pos); - else if (file->f_op->write_iter) - return new_sync_write(file, p, count, pos); - else - return -EINVAL; -} - /* caller is responsible for file_start_write/file_end_write */ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t *pos) { @@ -516,7 +505,12 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t p = (__force const char __user *)buf; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; - ret = __vfs_write(file, p, count, pos); + if (file->f_op->write) + ret = file->f_op->write(file, p, count, pos); + else if (file->f_op->write_iter) + ret = new_sync_write(file, p, count, pos); + else + ret = -EINVAL; set_fs(old_fs); if (ret > 0) { fsnotify_modify(file); @@ -554,19 +548,23 @@ ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_ return -EFAULT; ret = rw_verify_area(WRITE, file, pos, count); - if (!ret) { - if (count > MAX_RW_COUNT) - count = MAX_RW_COUNT; - file_start_write(file); - ret = __vfs_write(file, buf, count, pos); - if (ret > 0) { - fsnotify_modify(file); - add_wchar(current, ret); - } - inc_syscw(current); - file_end_write(file); + if (ret) + return ret; + if (count > MAX_RW_COUNT) + count = MAX_RW_COUNT; + file_start_write(file); + if (file->f_op->write) + ret = file->f_op->write(file, buf, count, pos); + else if (file->f_op->write_iter) + ret = new_sync_write(file, buf, count, pos); + else + ret = -EINVAL; + if 
(ret > 0) { + fsnotify_modify(file); + add_wchar(current, ret); } - + inc_syscw(current); + file_end_write(file); return ret; } -- 2.26.2
[PATCH 04/14] fs: unexport __kernel_write
This is a very special interface that skips sb_writers protection, and is
not used by modules anymore.

Signed-off-by: Christoph Hellwig
---
 fs/read_write.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index bbfa9b12b15eb..2c601d853ff3d 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -522,7 +522,6 @@ ssize_t __kernel_write(struct file *file, const void *buf, size_t count, loff_t
 	inc_syscw(current);
 	return ret;
 }
-EXPORT_SYMBOL(__kernel_write);
 
 ssize_t kernel_write(struct file *file, const void *buf, size_t count,
 			    loff_t *pos)
--
2.26.2
clean up kernel_{read,write} & friends v2
Hi Al,

this series fixes a few issues and cleans up the helpers that read from
or write to kernel space buffers, and ensures that we don't change the
address limit if we are using the ->read_iter and ->write_iter methods
that don't need the changed address limit.

Changes since v2:
 - picked up a few ACKs

Changes since v1:
 - __kernel_write must not take sb_writers
 - unexport __kernel_write
[PATCH 02/14] autofs: switch to kernel_write
While pipes don't really need sb_writers protection, __kernel_write is an
interface better kept private, and the additional rw_verify_area does not
hurt here.

Signed-off-by: Christoph Hellwig
Acked-by: Ian Kent
---
 fs/autofs/waitq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index b04c528b19d34..74c886f7c51cb 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -53,7 +53,7 @@ static int autofs_write(struct autofs_sb_info *sbi,
 
 	mutex_lock(>pipe_mutex);
 	while (bytes) {
-		wr = __kernel_write(file, data, bytes, >f_pos);
+		wr = kernel_write(file, data, bytes, >f_pos);
 		if (wr <= 0)
 			break;
 		data += wr;
--
2.26.2
[tip:WIP.core/rcu] BUILD SUCCESS 07325d4a90d2d84de45cc07b134fd0f023dbb971
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git  WIP.core/rcu
branch HEAD: 07325d4a90d2d84de45cc07b134fd0f023dbb971  rcu: Provide rcu_irq_exit_check_preempt()

elapsed time: 2186m

configs tested: 97
configs skipped: 1

The following configs have been built successfully.
More configs may be tested in the coming days.

arm          defconfig
arm          allyesconfig
arm          allmodconfig
arm          allnoconfig
arm64        allyesconfig
arm64        defconfig
arm64        allmodconfig
arm64        allnoconfig
i386         allnoconfig
i386         allyesconfig
i386         defconfig
i386         debian-10.3
ia64         allmodconfig
ia64         defconfig
ia64         allnoconfig
ia64         allyesconfig
m68k         allmodconfig
m68k         allnoconfig
m68k         sun3_defconfig
m68k         defconfig
m68k         allyesconfig
nds32        defconfig
nds32        allnoconfig
csky         allyesconfig
csky         defconfig
alpha        defconfig
alpha        allyesconfig
xtensa       allyesconfig
h8300        allyesconfig
h8300        allmodconfig
xtensa       defconfig
arc          defconfig
arc          allyesconfig
sh           allmodconfig
sh           allnoconfig
microblaze   allnoconfig
nios2        defconfig
nios2        allyesconfig
openrisc     defconfig
c6x          allyesconfig
c6x          allnoconfig
openrisc     allyesconfig
mips         allyesconfig
mips         allnoconfig
mips         allmodconfig
parisc       allnoconfig
parisc       defconfig
parisc       allyesconfig
parisc       allmodconfig
powerpc      allyesconfig
powerpc      rhel-kconfig
powerpc      allmodconfig
powerpc      allnoconfig
powerpc      defconfig
i386         randconfig-a001-20200527
i386         randconfig-a004-20200527
i386         randconfig-a003-20200527
i386         randconfig-a006-20200527
i386         randconfig-a002-20200527
i386         randconfig-a005-20200527
x86_64       randconfig-a006-20200527
x86_64       randconfig-a002-20200527
x86_64       randconfig-a005-20200527
x86_64       randconfig-a003-20200527
x86_64       randconfig-a004-20200527
x86_64       randconfig-a001-20200527
i386         randconfig-a013-20200527
i386         randconfig-a015-20200527
i386         randconfig-a012-20200527
i386         randconfig-a011-20200527
i386         randconfig-a016-20200527
i386         randconfig-a014-20200527
riscv        allyesconfig
riscv        allnoconfig
riscv        defconfig
riscv        allmodconfig
s390         allyesconfig
s390         allnoconfig
s390         allmodconfig
s390         defconfig
sparc        allyesconfig
sparc        defconfig
sparc64      defconfig
sparc64      allnoconfig
sparc64      allyesconfig
sparc64      allmodconfig
um           allnoconfig
um           defconfig
um           allmodconfig
um           allyesconfig
x86_64       rhel
x86_64       rhel-7.6
x86_64       rhel-7.6-kselftests
x86_64       rhel-7.2-clear
x86_64       lkp
x86_64       fedora-25
x86_64       kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
[PATCH v4] bluetooth: hci_qca: Fix qca6390 enable failure after warm reboot
Warm reboot can not restore qca6390 controller baudrate to default due to lack of controllable BT_EN pin or power supply, so fails to download firmware after warm reboot. Fixed by sending EDL_SOC_RESET VSC to reset controller within added device shutdown implementation. Signed-off-by: Zijun Hu --- drivers/bluetooth/hci_qca.c | 33 + 1 file changed, 33 insertions(+) diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c index e4a6823..8e03bfe 100644 --- a/drivers/bluetooth/hci_qca.c +++ b/drivers/bluetooth/hci_qca.c @@ -1975,6 +1975,38 @@ static void qca_serdev_remove(struct serdev_device *serdev) hci_uart_unregister_device(>serdev_hu); } +static void qca_serdev_shutdown(struct device *dev) +{ + int ret; + int timeout = msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS); + struct serdev_device *serdev = to_serdev_device(dev); + struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); + const u8 ibs_wake_cmd[] = { 0xFD }; + const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; + + if (qcadev->btsoc_type == QCA_QCA6390) { + serdev_device_write_flush(serdev); + ret = serdev_device_write_buf(serdev, + ibs_wake_cmd, sizeof(ibs_wake_cmd)); + if (ret < 0) { + BT_ERR("QCA send IBS_WAKE_IND error: %d", ret); + return; + } + serdev_device_wait_until_sent(serdev, timeout); + usleep_range(8000, 1); + + serdev_device_write_flush(serdev); + ret = serdev_device_write_buf(serdev, + edl_reset_soc_cmd, sizeof(edl_reset_soc_cmd)); + if (ret < 0) { + BT_ERR("QCA send EDL_RESET_REQ error: %d", ret); + return; + } + serdev_device_wait_until_sent(serdev, timeout); + usleep_range(8000, 1); + } +} + static int __maybe_unused qca_suspend(struct device *dev) { struct hci_dev *hdev = container_of(dev, struct hci_dev, dev); @@ -2100,6 +2132,7 @@ static struct serdev_device_driver qca_serdev_driver = { .name = "hci_uart_qca", .of_match_table = of_match_ptr(qca_bluetooth_of_match), .acpi_match_table = ACPI_PTR(qca_bluetooth_acpi_match), + .shutdown = qca_serdev_shutdown, .pm = 
_pm_ops, }, }; -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [Nouveau] [PATCH] nouveau: add fbdev dependency
On Thu, 28 May 2020 at 00:36, Arnd Bergmann wrote: > > On Wed, May 27, 2020 at 4:05 PM Ilia Mirkin wrote: > > > > Isn't this already fixed by > > > > https://cgit.freedesktop.org/drm/drm/commit/?id=7dbbdd37f2ae7dd4175ba3f86f4335c463b18403 > > Ok, I see that fixes the link error, but I when I created my fix, that did > not seem like the correct solution because it reverts part of the original > patch without reverting the rest of it. Unfortunately there was no > changelog text in the first patch to explain why this is safe. No it doesn't, I think you missed the pci in API name. The initial behaviour doesn't use the pci version of the API, the replacement did, and the fix used the drm wrapper around the pci one. So this patch isn't necessary now that I've fixed it the other way, Thanks, Dave.
Re: [PATCH v30 07/20] x86/sgx: Enumerate and track EPC sections
On Thu, May 28, 2020 at 08:25:43AM +0300, Jarkko Sakkinen wrote:
> On Tue, May 26, 2020 at 08:56:14PM -0700, Sean Christopherson wrote:
> > On Mon, May 25, 2020 at 11:23:04AM +0200, Borislav Petkov wrote:
> > > On Fri, May 15, 2020 at 03:43:57AM +0300, Jarkko Sakkinen wrote:
> > > > +struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS];
> > > > +int sgx_nr_epc_sections;
> > >
> > > We have become very averse against global stuff. What is going to use
> > > those, only sgx code I assume...?
> >
> > Yes, only SGX code. The reclaim/swap code needs access to the sections,
> > and that code is in a different file, reclaim.c. I don't have a super
> > strong objection to sucking reclaim.c into main.c, but I'm somewhat
> > indifferent on code organization as a whole. Jarkko likely has a stronger
> > opinion.
>
> I'll change it.
>
> It's not quite as easy as just "sucking the file in". All the commits
> that touch the file need to be reworked:
>
> $ git --no-pager log --format="%H %s" arch/x86/kernel/cpu/sgx/reclaim.c
> 5aeca6dabf767e9350ee3188ba25ceb21f3162b4 x86/sgx: Add a page reclaimer
> de9b1088959f36ffdaf43a49bfea1c7f9f81cac7 x86/sgx: Linux Enclave Driver
> 08d8fcb74fe268059ee58fcc2a0833b244e1f22a x86/sgx: Enumerate and track EPC sections

Not that I haven't done this a lot in the last few years. A proven
approach is to do it in two "git rebase -i mainline/master" sweeps:

1. For each commit, remove the reclaim.c entry from the Makefile and
   import the reclaim.c contents into main.c.
2. For each commit, delete reclaim.c.

I've tried quite a few different angles and this is what I've converged
on. It makes it very hard to run into messy merge conflicts.

/Jarkko
Re: [PATCH v3 5/6] bus: Add Baikal-T1 APB-bus driver
Hi Serge, I love your patch! Yet something to improve: [auto build test ERROR on robh/for-next] [also build test ERROR on char-misc/char-misc-testing staging/staging-testing linus/master v5.7-rc7 next-20200526] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Serge-Semin/bus-memory-Add-Baikal-T1-SoC-APB-AXI-L2-drivers/20200526-210837 base: https://git.kernel.org/pub/scm/linux/kernel/git/robh/linux.git for-next config: sparc-allyesconfig (attached as .config) compiler: sparc64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=sparc If you fix the issue, kindly add following tag as appropriate Reported-by: kbuild test robot All error/warnings (new ones prefixed by >>, old ones prefixed by <<): drivers/bus/bt1-apb.c: In function 'inject_error_store': drivers/bus/bt1-apb.c:329:3: error: implicit declaration of function 'readl' [-Werror=implicit-function-declaration] 329 | readl(apb->res); | ^ In file included from include/linux/kobject.h:20, from include/linux/module.h:20, from drivers/bus/bt1-apb.c:12: drivers/bus/bt1-apb.c: At top level: >> drivers/bus/bt1-apb.c:338:23: error: initialization of 'ssize_t (*)(struct >> device *, struct device_attribute *, char *)' {aka 'long int (*)(struct >> device *, struct device_attribute *, char *)'} from incompatible pointer >> type 'int (*)(struct device *, struct device_attribute *, char *)' >> [-Werror=incompatible-pointer-types] 338 | static DEVICE_ATTR_RW(inject_error); | ^~~~ include/linux/sysfs.h:104:10: note: in definition of macro '__ATTR' 104 | 
.show = _show, | ^ include/linux/device.h:130:45: note: in expansion of macro '__ATTR_RW' 130 | struct device_attribute dev_attr_##_name = __ATTR_RW(_name) | ^ >> drivers/bus/bt1-apb.c:338:8: note: in expansion of macro 'DEVICE_ATTR_RW' 338 | static DEVICE_ATTR_RW(inject_error); |^~ drivers/bus/bt1-apb.c:338:23: note: (near initialization for 'dev_attr_inject_error.show') 338 | static DEVICE_ATTR_RW(inject_error); | ^~~~ include/linux/sysfs.h:104:10: note: in definition of macro '__ATTR' 104 | .show = _show, | ^ include/linux/device.h:130:45: note: in expansion of macro '__ATTR_RW' 130 | struct device_attribute dev_attr_##_name = __ATTR_RW(_name) | ^ >> drivers/bus/bt1-apb.c:338:8: note: in expansion of macro 'DEVICE_ATTR_RW' 338 | static DEVICE_ATTR_RW(inject_error); |^~ >> drivers/bus/bt1-apb.c:338:23: error: initialization of 'ssize_t (*)(struct >> device *, struct device_attribute *, const char *, size_t)' {aka 'long int >> (*)(struct device *, struct device_attribute *, const char *, long unsigned >> int)'} from incompatible pointer type 'int (*)(struct device *, struct >> device_attribute *, const char *, size_t)' {aka 'int (*)(struct device *, >> struct device_attribute *, const char *, long unsigned int)'} >> [-Werror=incompatible-pointer-types] 338 | static DEVICE_ATTR_RW(inject_error); | ^~~~ include/linux/sysfs.h:105:11: note: in definition of macro '__ATTR' 105 | .store = _store, | ^~ include/linux/device.h:130:45: note: in expansion of macro '__ATTR_RW' 130 | struct device_attribute dev_attr_##_name = __ATTR_RW(_name) | ^ >> drivers/bus/bt1-apb.c:338:8: note: in expansion of macro 'DEVICE_ATTR_RW' 338 | static DEVICE_ATTR_RW(inject_error); |^~ drivers/bus/bt1-apb.c:338:23: note: (near initialization for 'dev_attr_inject_error.store') 338 | static DEVICE_ATTR_RW(inject_error); | ^~~~ include/linux/sysfs.h:105:11: note: in definition of macro '__ATTR' 105 | .store = _store, | ^~ include/linux/device.h:130:45: note: in expansion of macro '__ATTR_RW' 130 | 
struct device_attribute dev_attr_##_name = __ATTR_RW(_name) | ^ >> drivers/bus/bt1-apb.c:338:8: note: in expansion of macro 'DEVICE_ATTR_RW' 338 | static DEVICE_ATTR_RW(inject_error); |^~ cc1: some warnings being treated as errors vim +338 drivers/bus/bt1-apb.c 317 318 static int inject_error_store(struct device *dev, 319struct device_attribute *attr, 320
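The two incompatible-pointer-type errors in the log above come from show/store callbacks that return int where the attribute structure expects ssize_t; on a 64-bit target such as sparc64 these are distinct types. The usual fix is to declare both callbacks with an ssize_t return type. A minimal userspace sketch of the corrected signatures follows; the mock_ types and MOCK_BUF_SIZE are hypothetical stand-ins for the kernel's struct device, struct device_attribute and PAGE_SIZE, and the callback bodies are invented for illustration only. (The separate 'readl' error typically means the file is missing an include of <linux/io.h>.)

```c
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical stand-ins for struct device and struct device_attribute. */
struct mock_device {
	int inject_error;
};

struct mock_device_attribute {
	ssize_t (*show)(struct mock_device *dev,
			struct mock_device_attribute *attr, char *buf);
	ssize_t (*store)(struct mock_device *dev,
			 struct mock_device_attribute *attr,
			 const char *buf, size_t count);
};

#define MOCK_BUF_SIZE 16	/* stand-in for the sysfs page-sized buffer */

/* Return type is ssize_t, matching the function pointer in the struct. */
ssize_t inject_error_show(struct mock_device *dev,
			  struct mock_device_attribute *attr, char *buf)
{
	(void)attr;
	return (ssize_t)snprintf(buf, MOCK_BUF_SIZE, "%d\n", dev->inject_error);
}

/* Likewise ssize_t here; a successful store reports the bytes consumed. */
ssize_t inject_error_store(struct mock_device *dev,
			   struct mock_device_attribute *attr,
			   const char *buf, size_t count)
{
	(void)attr;
	dev->inject_error = (count > 0 && buf[0] == '1');
	return (ssize_t)count;
}

/* With matching types the initializer compiles without a diagnostic. */
struct mock_device_attribute mock_attr_inject_error = {
	.show = inject_error_show,
	.store = inject_error_store,
};
```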
Re: [PATCH -V3] swap: Reduce lock contention on swap cache from swap slots allocation
Daniel Jordan writes:

> On Mon, May 25, 2020 at 08:26:48AM +0800, Huang Ying wrote:
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 423c234aca15..0abd93d2a4fc 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -615,7 +615,8 @@ static bool scan_swap_map_try_ssd_cluster(struct swap_info_struct *si,
>>  * discarding, do discard now and reclaim them
>>  */
>> swap_do_scheduled_discard(si);
>> -*scan_base = *offset = si->cluster_next;
>> +*scan_base = this_cpu_read(*si->cluster_next_cpu);
>> +*offset = *scan_base;
>> goto new_cluster;
>
> Why is this done?  As far as I can tell, the values always get overwritten at
> the end of the function with tmp and tmp isn't derived from them.  Seems
> ebc2a1a69111 moved some logic that used to make sense but doesn't have any
> effect now.

If we fail to allocate from the cluster, "scan_base" and "offset" will
not be overwritten. And "cluster_next" or "cluster_next_cpu" may be
changed in swap_do_scheduled_discard(), because the lock is released and
re-acquired there.

The code may not have much value, and you may think that it's better to
remove it. But that should be done in another patch.

>> } else
>> return false;
>> @@ -721,6 +722,34 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
>> }
>> }
>>
>> +static void set_cluster_next(struct swap_info_struct *si, unsigned long next)
>> +{
>> +unsigned long prev;
>> +
>> +if (!(si->flags & SWP_SOLIDSTATE)) {
>> +si->cluster_next = next;
>> +return;
>> +}
>> +
>> +prev = this_cpu_read(*si->cluster_next_cpu);
>> +/*
>> + * Cross the swap address space size aligned trunk, choose
>> + * another trunk randomly to avoid lock contention on swap
>> + * address space if possible.
>> + */
>> +if ((prev >> SWAP_ADDRESS_SPACE_SHIFT) !=
>> +(next >> SWAP_ADDRESS_SPACE_SHIFT)) {
>> +/* No free swap slots available */
>> +if (si->highest_bit <= si->lowest_bit)
>> +return;
>> +next = si->lowest_bit +
>> +prandom_u32_max(si->highest_bit - si->lowest_bit + 1);
>> +next = ALIGN(next, SWAP_ADDRESS_SPACE_PAGES);
>> +next = max_t(unsigned int, next, si->lowest_bit);
>
> next is always greater than lowest_bit because it's aligned up.  I think the
> intent of the max_t line is to handle when next is aligned outside the valid
> range, so it'd have to be ALIGN_DOWN instead?

Oops. I misunderstood "ALIGN()" here. Yes, we should use ALIGN_DOWN()
instead. Thanks for pointing this out!

>
> These aside, patch looks good to me.

Thanks for your review! It really helps me to improve the quality of the
patch. Can I add your "Reviewed-by" in the next version?

Best Regards,
Huang, Ying
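The ALIGN() versus ALIGN_DOWN() point raised in the review can be checked with a small userspace sketch. The MOCK_ macros below mimic the usual power-of-two round-up and round-down idioms and are assumptions for illustration, not copies of the kernel headers; the function names are hypothetical as well.

```c
/* Round x up / down to a multiple of a, where a is a power of two. */
#define MOCK_ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))
#define MOCK_ALIGN_DOWN(x, a)	((x) & ~((a) - 1))

/*
 * Variant from the posted patch: align up, then clamp to lowest.
 * For next >= lowest the clamp never changes the result, and the
 * aligned value may exceed the highest valid slot.
 */
unsigned long pick_align_up(unsigned long next, unsigned long lowest,
			    unsigned long a)
{
	next = MOCK_ALIGN(next, a);
	return next > lowest ? next : lowest;
}

/*
 * Variant suggested in review: align down, then clamp to lowest.
 * The result never exceeds the input, and the clamp handles the
 * case where rounding down falls below the lowest valid slot.
 */
unsigned long pick_align_down(unsigned long next, unsigned long lowest,
			      unsigned long a)
{
	next = MOCK_ALIGN_DOWN(next, a);
	return next > lowest ? next : lowest;
}
```

For example, with a valid range of 100..20000 and an alignment of 16384 (an assumed value), an input of 17000 rounds up to 32768, which is outside the range, while rounding down gives 16384; an input of 5000 rounds down to 0 and is then clamped to 100.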
RE: [PATCH 1/4] exfat: redefine PBR as boot_sector
> Aggregate PBR related definitions and redefine as "boot_sector" to comply
> with the exFAT specification.
> And, rename variable names including 'pbr'.
>
> Signed-off-by: Tetsuhiro Kohada
> ---
>  fs/exfat/exfat_fs.h  |  2 +-
>  fs/exfat/exfat_raw.h | 79 +++--
>  fs/exfat/super.c     | 84 ++--
>  3 files changed, 72 insertions(+), 93 deletions(-)
>
[snip]
> +/* EXFAT: Main and Backup Boot Sector (512 bytes) */
> +struct boot_sector {
> +	__u8	jmp_boot[BOOTSEC_JUMP_BOOT_LEN];
> +	__u8	oem_name[BOOTSEC_OEM_NAME_LEN];

According to the exFAT specification, fs_name and BOOTSEC_FS_NAME_LEN look
better.

> +	__u8	must_be_zero[BOOTSEC_OLDBPB_LEN];
> +	__le64	partition_offset;
> +	__le64	vol_length;
> +	__le32	fat_offset;
> +	__le32	fat_length;
> +	__le32	clu_offset;
> +	__le32	clu_count;
> +	__le32	root_cluster;
> +	__le32	vol_serial;
> +	__u8	fs_revision[2];
> +	__le16	vol_flags;
> +	__u8	sect_size_bits;
> +	__u8	sect_per_clus_bits;
> +	__u8	num_fats;
> +	__u8	drv_sel;
> +	__u8	percent_in_use;
> +	__u8	reserved[7];
> +	__u8	boot_code[390];
> +	__le16	signature;
>  } __packed;
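As a sanity check on the layout quoted above, the structure can be rebuilt in userspace and its size verified at compile time. This is a sketch under stated assumptions: the three length constants (3, 8 and 53) are taken from the exFAT specification's JumpBoot, FileSystemName and MustBeZero fields, the field is named fs_name as suggested in the review, fixed-width integer types stand in for __u8/__le16/__le32/__le64 (so no endianness conversion is shown), and the packed attribute is the GCC/Clang spelling.

```c
#include <stddef.h>
#include <stdint.h>

/* Field lengths assumed from the exFAT boot sector layout. */
#define BOOTSEC_JUMP_BOOT_LEN	3
#define BOOTSEC_FS_NAME_LEN	8
#define BOOTSEC_OLDBPB_LEN	53

/* Userspace mirror of the quoted struct; on-disk values are little-endian. */
struct boot_sector_sketch {
	uint8_t  jmp_boot[BOOTSEC_JUMP_BOOT_LEN];
	uint8_t  fs_name[BOOTSEC_FS_NAME_LEN];
	uint8_t  must_be_zero[BOOTSEC_OLDBPB_LEN];
	uint64_t partition_offset;
	uint64_t vol_length;
	uint32_t fat_offset;
	uint32_t fat_length;
	uint32_t clu_offset;
	uint32_t clu_count;
	uint32_t root_cluster;
	uint32_t vol_serial;
	uint8_t  fs_revision[2];
	uint16_t vol_flags;
	uint8_t  sect_size_bits;
	uint8_t  sect_per_clus_bits;
	uint8_t  num_fats;
	uint8_t  drv_sel;
	uint8_t  percent_in_use;
	uint8_t  reserved[7];
	uint8_t  boot_code[390];
	uint16_t signature;
} __attribute__((packed));

/* The comment in the patch promises 512 bytes; enforce it at build time. */
_Static_assert(sizeof(struct boot_sector_sketch) == 512,
	       "boot sector must be exactly 512 bytes");
```

With these lengths the first three fields occupy 64 bytes, so partition_offset starts at offset 64 and the signature lands at offset 510.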
Re: [PATCH v30 07/20] x86/sgx: Enumerate and track EPC sections
On Tue, May 26, 2020 at 08:56:14PM -0700, Sean Christopherson wrote: > On Mon, May 25, 2020 at 11:23:04AM +0200, Borislav Petkov wrote: > > On Fri, May 15, 2020 at 03:43:57AM +0300, Jarkko Sakkinen wrote: > > > +struct sgx_epc_section sgx_epc_sections[SGX_MAX_EPC_SECTIONS]; > > > +int sgx_nr_epc_sections; > > > > We have become very averse against global stuff. What is going to use > > those, only sgx code I assume...? > > Yes, only SGX code. The reclaim/swap code needs access to the sections, > and that code is in a different file, reclaim.c. I don't have a super > strong objection to sucking reclaim.c into main.c, but I'm somewhat > indifferent on code organization as a whole. Jarkko likely has a stronger > opinion. I'll change it. It's not quite as easy as just "sucking the file in". All the commits that touch the file need to be reworked: $ git --no-pager log --format="%H %s" arch/x86/kernel/cpu/sgx/reclaim.c 5aeca6dabf767e9350ee3188ba25ceb21f3162b4 x86/sgx: Add a page reclaimer de9b1088959f36ffdaf43a49bfea1c7f9f81cac7 x86/sgx: Linux Enclave Driver 08d8fcb74fe268059ee58fcc2a0833b244e1f22a x86/sgx: Enumerate and track EPC sections /Jarkko
[PATCH v9 2/2] mtd: rawnand: Add NAND controller support on Intel LGM SoC
From: Ramuthevar Vadivel Murugan This patch adds the new IP of Nand Flash Controller(NFC) support on Intel's Lightning Mountain(LGM) SoC. DMA is used for burst data transfer operation, also DMA HW supports aligned 32bit memory address and aligned data access by default. DMA burst of 8 supported. Data register used to support the read/write operation from/to device. NAND controller driver implements ->exec_op() to replace legacy hooks, these specific call-back method to execute NAND operations. Signed-off-by: Ramuthevar Vadivel Murugan --- drivers/mtd/nand/raw/Kconfig | 8 + drivers/mtd/nand/raw/Makefile| 1 + drivers/mtd/nand/raw/intel-nand-controller.c | 747 +++ 3 files changed, 756 insertions(+) create mode 100644 drivers/mtd/nand/raw/intel-nand-controller.c diff --git a/drivers/mtd/nand/raw/Kconfig b/drivers/mtd/nand/raw/Kconfig index a80a46bb5b8b..75ab2afb78cf 100644 --- a/drivers/mtd/nand/raw/Kconfig +++ b/drivers/mtd/nand/raw/Kconfig @@ -457,6 +457,14 @@ config MTD_NAND_CADENCE Enable the driver for NAND flash on platforms using a Cadence NAND controller. +config MTD_NAND_INTEL_LGM + tristate "Support for NAND controller on Intel LGM SoC" + depends on OF || COMPILE_TEST + depends on HAS_IOMEM + help + Enables support for NAND Flash chips on Intel's LGM SoC. + NAND flash controller interfaced through the External Bus Unit. 
+ comment "Misc" config MTD_SM_COMMON diff --git a/drivers/mtd/nand/raw/Makefile b/drivers/mtd/nand/raw/Makefile index 2d136b158fb7..bfc8fe4d2cb0 100644 --- a/drivers/mtd/nand/raw/Makefile +++ b/drivers/mtd/nand/raw/Makefile @@ -58,6 +58,7 @@ obj-$(CONFIG_MTD_NAND_TEGRA) += tegra_nand.o obj-$(CONFIG_MTD_NAND_STM32_FMC2) += stm32_fmc2_nand.o obj-$(CONFIG_MTD_NAND_MESON) += meson_nand.o obj-$(CONFIG_MTD_NAND_CADENCE) += cadence-nand-controller.o +obj-$(CONFIG_MTD_NAND_INTEL_LGM) += intel-nand-controller.o nand-objs := nand_base.o nand_legacy.o nand_bbt.o nand_timings.o nand_ids.o nand-objs += nand_onfi.o diff --git a/drivers/mtd/nand/raw/intel-nand-controller.c b/drivers/mtd/nand/raw/intel-nand-controller.c new file mode 100644 index ..564d28978943 --- /dev/null +++ b/drivers/mtd/nand/raw/intel-nand-controller.c @@ -0,0 +1,747 @@ +// SPDX-License-Identifier: GPL-2.0+ +/* Copyright (c) 2020 Intel Corporation. */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define EBU_CLC0x000 +#define EBU_CLC_RST0xu + +#define EBU_ADDR_SEL(n)(0x20 + (n) * 4) +/* 5 bits 26:22 included for comparison in the ADDR_SELx */ +#define EBU_ADDR_MASK(x) ((x) << 4) +#define EBU_ADDR_SEL_REGEN 0x1 + +#define EBU_BUSCON(n) (0x60 + (n) * 4) +#define EBU_BUSCON_CMULT_V40x1 +#define EBU_BUSCON_RECOVC(n) ((n) << 2) +#define EBU_BUSCON_HOLDC(n)((n) << 4) +#define EBU_BUSCON_WAITRDC(n) ((n) << 6) +#define EBU_BUSCON_WAITWRC(n) ((n) << 8) +#define EBU_BUSCON_BCGEN_CS0x0 +#define EBU_BUSCON_SETUP_ENBIT(22) +#define EBU_BUSCON_ALEC0xC000 + +#define EBU_CON0x0B0 +#define EBU_CON_NANDM_EN BIT(0) +#define EBU_CON_NANDM_DIS 0x0 +#define EBU_CON_CSMUX_E_EN BIT(1) +#define EBU_CON_ALE_P_LOW BIT(2) +#define EBU_CON_CLE_P_LOW BIT(3) +#define EBU_CON_CS_P_LOW BIT(4) +#define EBU_CON_SE_P_LOW BIT(5) +#define EBU_CON_WP_P_LOW BIT(6) +#define 
EBU_CON_PRE_P_LOW BIT(7) +#define EBU_CON_IN_CS_S(n) ((n) << 8) +#define EBU_CON_OUT_CS_S(n)((n) << 10) +#define EBU_CON_LAT_EN_CS_P((0x3D) << 18) + +#define EBU_WAIT 0x0B4 +#define EBU_WAIT_RDBY BIT(0) +#define EBU_WAIT_WR_C BIT(3) + +#define HSNAND_CTL10x110 +#define HSNAND_CTL1_ADDR_SHIFT 24 + +#define HSNAND_CTL20x114 +#define HSNAND_CTL2_ADDR_SHIFT 8 +#define HSNAND_CTL2_CYC_N_V5 (0x2 << 16) + +#define HSNAND_INT_MSK_CTL 0x124 +#define HSNAND_INT_MSK_CTL_WR_CBIT(4) + +#define HSNAND_INT_STA 0x128 +#define HSNAND_INT_STA_WR_CBIT(4) + +#define HSNAND_CTL 0x130 +#define HSNAND_CTL_ENABLE_ECC BIT(0) +#define HSNAND_CTL_GO BIT(2) +#define HSNAND_CTL_CE_SEL_CS(n)BIT(3 + (n)) +#define HSNAND_CTL_RW_READ 0x0 +#define HSNAND_CTL_RW_WRITEBIT(10) +#define HSNAND_CTL_ECC_OFF_V8THBIT(11) +#define HSNAND_CTL_CKFF_EN 0x0 +#define HSNAND_CTL_MSG_EN BIT(17) + +#define HSNAND_PARA0 0x13c +#define HSNAND_PARA0_PAGE_V81920x3 +#define HSNAND_PARA0_PIB_V256 (0x3 << 4) +#define HSNAND_PARA0_BYP_EN_NP 0x0 +#define HSNAND_PARA0_BYP_DEC_NP0x0 +#define HSNAND_PARA0_TYPE_ONFI BIT(18)
Re: [PATCH v3 0/7] Statsfs: a new ram-based file system for Linux kernel statistics
On 28/05/20 00:21, David Ahern wrote:
> On 5/27/20 3:07 PM, Paolo Bonzini wrote:
>> I see what you meant now.  statsfs can also be used to enumerate objects
>> if one is so inclined (with the prototype in patch 7, for example, each
>> network interface becomes a directory).
>
> there are many use cases that have 100's to 1000's have network devices.
> Having a sysfs entry per device already bloats memory usage for these
> use cases; another filesystem with an entry per device makes that worse.
> Really the wrong direction for large scale systems.

Hi David,

IMO the important part for now is having a flexible kernel API for
exposing statistics across multiple subsystems, so that they can be
harvested in an efficient way. The userspace API is secondary, and
multiple APIs can be added to cater for different usecases.

For example, as of the first five patches the memory usage is the same as
what is now in the mainline kernel, since all the patchset does is take
existing debugfs inodes and move them to statsfs. I agree that, if the
concept is extended to the whole kernel, scalability and memory usage
become an issue; and indeed, the long-term plan is to support a binary
format that is actually _more_ efficient than the status quo for large
scale systems.

In the meanwhile, the new filesystem can be disabled (see the difference
between "STATS_FS" and "STATS_FS_API") if it imposes undesirable
overhead.

Thanks,

Paolo
Re: [PATCH 8/8] blk-mq: drain I/O when all CPUs in a hctx are offline
On Wed, May 27, 2020 at 08:33:48PM -0700, Bart Van Assche wrote:
> On 2020-05-27 18:46, Ming Lei wrote:
> > On Wed, May 27, 2020 at 04:09:19PM -0700, Bart Van Assche wrote:
> >> On 2020-05-27 11:06, Christoph Hellwig wrote:
> >>> --- a/block/blk-mq-tag.c
> >>> +++ b/block/blk-mq-tag.c
> >>> @@ -180,6 +180,14 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
> >>>  	sbitmap_finish_wait(bt, ws, &wait);
> >>>
> >>>  found_tag:
> >>> +	/*
> >>> +	 * Give up this allocation if the hctx is inactive.  The caller will
> >>> +	 * retry on an active hctx.
> >>> +	 */
> >>> +	if (unlikely(test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state))) {
> >>> +		blk_mq_put_tag(tags, data->ctx, tag + tag_offset);
> >>> +		return -1;
> >>> +	}
> >>>  	return tag + tag_offset;
> >>>  }
> >>
> >> The code that has been added in blk_mq_hctx_notify_offline() will only
> >> work correctly if blk_mq_get_tag() tests BLK_MQ_S_INACTIVE after the
> >> store instructions involved in the tag allocation happened. Does this
> >> mean that a memory barrier should be added in the above function before
> >> the test_bit() call?
> >
> > Please see comment in blk_mq_hctx_notify_offline():
> >
> > +	/*
> > +	 * Prevent new request from being allocated on the current hctx.
> > +	 *
> > +	 * The smp_mb__after_atomic() Pairs with the implied barrier in
> > +	 * test_and_set_bit_lock in sbitmap_get().  Ensures the inactive flag is
> > +	 * seen once we return from the tag allocator.
> > +	 */
> > +	set_bit(BLK_MQ_S_INACTIVE, &hctx->state);
>
> From Documentation/atomic_bitops.txt: "Except for a successful
> test_and_set_bit_lock() which has ACQUIRE semantics and
> clear_bit_unlock() which has RELEASE semantics."

test_bit(BLK_MQ_S_INACTIVE, &data->hctx->state) is called exactly after one tag is allocated, that means test_and_set_bit_lock is successful before the test_bit(). The ACQUIRE semantics guarantees that test_bit(BLK_MQ_S_INACTIVE) is always done after successful test_and_set_bit_lock(), so tag bit is always set before testing BLK_MQ_S_INACTIVE.

See Documentation/memory-barriers.txt:

 (5) ACQUIRE operations.

     This acts as a one-way permeable barrier.  It guarantees that all memory
     operations after the ACQUIRE operation will appear to happen after the
     ACQUIRE operation with respect to the other components of the system.
     ACQUIRE operations include LOCK operations and both smp_load_acquire()
     and smp_cond_load_acquire() operations.

>
> My understanding is that operations that have acquire semantics pair
> with operations that have release semantics. I haven't been able to find
> any documentation that shows that smp_mb__after_atomic() has release
> semantics. So I looked up its definition. This is what I found:
>
> $ git grep -nH 'define __smp_mb__after_atomic'
> arch/ia64/include/asm/barrier.h:49:#define __smp_mb__after_atomic()	barrier()
> arch/mips/include/asm/barrier.h:133:#define __smp_mb__after_atomic()	smp_llsc_mb()
> arch/s390/include/asm/barrier.h:50:#define __smp_mb__after_atomic()	barrier()
> arch/sparc/include/asm/barrier_64.h:57:#define __smp_mb__after_atomic()	barrier()
> arch/x86/include/asm/barrier.h:83:#define __smp_mb__after_atomic()	do { } while (0)
> arch/xtensa/include/asm/barrier.h:20:#define __smp_mb__after_atomic()	barrier()
> include/asm-generic/barrier.h:116:#define __smp_mb__after_atomic()	__smp_mb()
>
> My interpretation of the above is that not all smp_mb__after_atomic()
> implementations have release semantics. Do you agree with this conclusion?

I understand smp_mb__after_atomic() orders set_bit(BLK_MQ_S_INACTIVE) and reading the tag bit which is done in blk_mq_all_tag_iter().

So the two pair of OPs are ordered:

1) if one request(tag bit) is allocated before setting BLK_MQ_S_INACTIVE, the tag bit will be observed in blk_mq_all_tag_iter() from blk_mq_hctx_has_requests(), so the request will be drained.

OR

2) if one request(tag bit) is allocated after setting BLK_MQ_S_INACTIVE, the request(tag bit) will be released and retried on another CPU finally, see __blk_mq_alloc_request().

Cc Paul and linux-kernel list.

Thanks,
Ming
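The two cases listed above can be sketched in userspace with C11 atomics. This is only an illustration under stated assumptions: tag_bits, hctx_inactive, alloc_tag() and mark_inactive_and_check() are hypothetical stand-ins for the sbitmap, BLK_MQ_S_INACTIVE, blk_mq_get_tag() and blk_mq_hctx_notify_offline(). The sketch uses sequentially consistent operations on both sides, which is stronger than the ACQUIRE / smp_mb__after_atomic() pairing debated in this thread, so it shows the intended outcome and does not settle whether that pairing is sufficient.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not kernel APIs. */
static atomic_uint tag_bits;		/* one bit per allocated tag (tags 0..31) */
static atomic_bool hctx_inactive;	/* plays the role of BLK_MQ_S_INACTIVE */

/*
 * Allocation side: publish the tag bit first, then look at the inactive
 * flag. Returns the tag, or -1 if the tag is busy or the queue is inactive.
 */
static int alloc_tag(unsigned int tag)
{
	unsigned int mask = 1u << tag;

	/* seq_cst read-modify-write: the tag bit is set before the load below */
	if (atomic_fetch_or(&tag_bits, mask) & mask)
		return -1;		/* tag was already taken */

	if (atomic_load(&hctx_inactive)) {
		/* give the tag back so the caller can retry elsewhere */
		atomic_fetch_and(&tag_bits, ~mask);
		return -1;
	}
	return (int)tag;
}

/*
 * Offline side: set the flag first, then scan the tag bits. Returns true
 * if some tag is still allocated and therefore has to be drained.
 */
static bool mark_inactive_and_check(void)
{
	atomic_store(&hctx_inactive, true);
	return atomic_load(&tag_bits) != 0;
}
```

With both sides sequentially consistent, at least one side observes the other's store: either the allocator sees the inactive flag and gives the tag back (case 2), or the offline side sees the tag bit and drains the request (case 1).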
[PATCH] perf jit: Fix inaccurate DWARF line table
Fix an issue where addresses in the DWARF line table are offset by -0x40 (GEN_ELF_TEXT_OFFSET). This can be seen with `objdump -S` on the ELF files after perf inject.

Signed-off-by: Nick Gasson
---
 tools/perf/util/genelf_debug.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/perf/util/genelf_debug.c b/tools/perf/util/genelf_debug.c
index 30e9f618f6cd..dd40683bd4c0 100644
--- a/tools/perf/util/genelf_debug.c
+++ b/tools/perf/util/genelf_debug.c
@@ -342,7 +342,7 @@ static void emit_lineno_info(struct buffer_ext *be,
 	 */
 
 	/* start state of the state machine we take care of */
-	unsigned long last_vma = code_addr;
+	unsigned long last_vma = 0;
 	char const *cur_filename = NULL;
 	unsigned long cur_file_idx = 0;
 	int last_line = 1;
@@ -473,7 +473,7 @@ jit_process_debug_info(uint64_t code_addr,
 		ent = debug_entry_next(ent);
 	}
 	add_compilation_unit(di, buffer_ext_size(dl));
-	add_debug_line(dl, debug, nr_debug_entries, 0);
+	add_debug_line(dl, debug, nr_debug_entries, GEN_ELF_TEXT_OFFSET);
 	add_debug_abbrev(da);
 	if (0) buffer_ext_dump(da, "abbrev");
 
-- 
2.26.2
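The effect described in the commit message can be illustrated with plain address arithmetic. The sketch below assumes that line-table entries hold offsets from the start of the JIT-compiled code and that this code sits GEN_ELF_TEXT_OFFSET (0x40, per the message above) bytes into the generated ELF. line_row_addr() is a hypothetical helper for illustration, not a function in genelf_debug.c.

```c
#include <stdint.h>

#define GEN_ELF_TEXT_OFFSET 0x40	/* value quoted in the commit message */

/*
 * Hypothetical helper: address recorded in the line table for an entry
 * that sits `offset` bytes into the JIT-compiled code, given the base
 * address the line program starts from.
 */
static uint64_t line_row_addr(uint64_t base, uint64_t offset)
{
	return base + offset;
}
```

Under these assumptions, a base of 0 records an instruction at code offset 0x10 as address 0x10, which is 0x40 below its position in the ELF, while a base of GEN_ELF_TEXT_OFFSET records it as 0x50.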
[PATCH 02/28] net: add sock_no_linger
Add a helper to directly set the SO_LINGER sockopt from kernel space with onoff set to true and a linger time of 0 without going through a fake uaccess. Signed-off-by: Christoph Hellwig Acked-by: Sagi Grimberg --- drivers/nvme/host/tcp.c | 9 + drivers/nvme/target/tcp.c | 6 +- include/net/sock.h| 1 + net/core/sock.c | 9 + net/rds/tcp.h | 1 - net/rds/tcp_connect.c | 2 +- net/rds/tcp_listen.c | 13 + net/sunrpc/svcsock.c | 12 ++-- 8 files changed, 16 insertions(+), 37 deletions(-) diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index c15a92163c1f7..e72d87482eb78 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1313,7 +1313,6 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, { struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); struct nvme_tcp_queue *queue = >queues[qid]; - struct linger sol = { .l_onoff = 1, .l_linger = 0 }; int ret, opt, rcv_pdu_size; queue->ctrl = ctrl; @@ -1361,13 +1360,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, * close. This is done to prevent stale data from being sent should * the network connection be restored before TCP times out. */ - ret = kernel_setsockopt(queue->sock, SOL_SOCKET, SO_LINGER, - (char *), sizeof(sol)); - if (ret) { - dev_err(nctrl->device, - "failed to set SO_LINGER sock opt %d\n", ret); - goto err_sock; - } + sock_no_linger(queue->sock->sk); if (so_priority > 0) { ret = kernel_setsockopt(queue->sock, SOL_SOCKET, SO_PRIORITY, diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 40757a63f4553..e0801494b097f 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1429,7 +1429,6 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue) { struct socket *sock = queue->sock; struct inet_sock *inet = inet_sk(sock->sk); - struct linger sol = { .l_onoff = 1, .l_linger = 0 }; int ret; ret = kernel_getsockname(sock, @@ -1447,10 +1446,7 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue) * close. 
This is done to prevent stale data from being sent should * the network connection be restored before TCP times out. */ - ret = kernel_setsockopt(sock, SOL_SOCKET, SO_LINGER, - (char *), sizeof(sol)); - if (ret) - return ret; + sock_no_linger(sock->sk); if (so_priority > 0) { ret = kernel_setsockopt(sock, SOL_SOCKET, SO_PRIORITY, diff --git a/include/net/sock.h b/include/net/sock.h index 2ec085044790c..6ed00bf009bbe 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2688,6 +2688,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif) void sock_def_readable(struct sock *sk); +void sock_no_linger(struct sock *sk); void sock_set_reuseaddr(struct sock *sk); #endif /* _SOCK_H */ diff --git a/net/core/sock.c b/net/core/sock.c index 18eb84fdf5fbe..f0f09524911c8 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -720,6 +720,15 @@ void sock_set_reuseaddr(struct sock *sk) } EXPORT_SYMBOL(sock_set_reuseaddr); +void sock_no_linger(struct sock *sk) +{ + lock_sock(sk); + sk->sk_lingertime = 0; + sock_set_flag(sk, SOCK_LINGER); + release_sock(sk); +} +EXPORT_SYMBOL(sock_no_linger); + /* * This is meant for all protocols to use and covers goings on * at the socket level. Everything here is generic. 
diff --git a/net/rds/tcp.h b/net/rds/tcp.h index 3c69361d21c73..d640e210b97b6 100644 --- a/net/rds/tcp.h +++ b/net/rds/tcp.h @@ -73,7 +73,6 @@ void rds_tcp_listen_data_ready(struct sock *sk); int rds_tcp_accept_one(struct socket *sock); int rds_tcp_keepalive(struct socket *sock); void *rds_tcp_listen_sock_def_readable(struct net *net); -void rds_tcp_set_linger(struct socket *sock); /* tcp_recv.c */ int rds_tcp_recv_init(void); diff --git a/net/rds/tcp_connect.c b/net/rds/tcp_connect.c index 008f50fb25dd2..4e64598176b05 100644 --- a/net/rds/tcp_connect.c +++ b/net/rds/tcp_connect.c @@ -207,7 +207,7 @@ void rds_tcp_conn_path_shutdown(struct rds_conn_path *cp) if (sock) { if (rds_destroy_pending(cp->cp_conn)) - rds_tcp_set_linger(sock); + sock_no_linger(sock->sk); sock->ops->shutdown(sock, RCV_SHUTDOWN | SEND_SHUTDOWN); lock_sock(sock->sk); rds_tcp_restore_callbacks(sock, tc); /* tc->tc_sock = NULL */ diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c index 810a3a49e9474..bbb31b9c0b391 100644 --- a/net/rds/tcp_listen.c +++ b/net/rds/tcp_listen.c @@ -111,17 +111,6 @@ struct rds_tcp_connection *rds_tcp_accept_one_path(struct rds_connection
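For readers more familiar with the userspace API, the setting that sock_no_linger() applies ("onoff set to true and a linger time of 0") corresponds to the SO_LINGER call sketched below. This is a userspace illustration only; set_no_linger() is a hypothetical name and is not part of the patch.

```c
#include <sys/socket.h>

/*
 * Userspace counterpart of the "no linger" setting:
 * l_onoff = 1 with l_linger = 0.
 * Returns 0 on success, -1 on error (errno is set by setsockopt()).
 */
static int set_no_linger(int fd)
{
	struct linger sol = { .l_onoff = 1, .l_linger = 0 };

	return setsockopt(fd, SOL_SOCKET, SO_LINGER, &sol, sizeof(sol));
}
```

The kernel helper has no return value, which is why the error-handling branches in the callers above disappear; the userspace call can still fail, for example on an invalid descriptor.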
remove most callers of kernel_setsockopt v3
Hi Dave,

this series removes most callers of the kernel_setsockopt functions, and instead switches their users to small functions that implement setting a sockopt directly using a normal kernel function call with type safety and all the other benefits of not having a function call.

In some cases these functions seem pretty heavy handed as they do a lock_sock even for just setting a single variable, but this mirrors the real setsockopt implementation unlike a few drivers that just set the fields directly.

Changes since v2:
 - drop the separately merged kernel_getopt_removal
 - drop the sctp patches, as there is conflicting cleanup going on
 - add an additional ACK for the rxrpc changes

Changes since v1:
 - use ->getname for sctp sockets in dlm
 - add a new ->bind_add struct proto method for dlm/sctp
 - switch the ipv6 and remaining sctp helpers to inline function so that
   the ipv6 and sctp modules are not pulled in by any module that could
   potentially use ipv6 or sctp connections
 - remove arguments to various sock_* helpers that are always used with
   the same constant arguments
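The type-safety point can be sketched with a toy example. Everything below (toy_sock, toy_setsockopt(), toy_sock_set_priority(), TOY_OPT_PRIORITY) is hypothetical and only mimics the shape of the two calling styles; it is not code from the series.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; none of these names exist in the kernel. */
enum toy_opt { TOY_OPT_PRIORITY = 1 };

struct toy_sock {
	int priority;
};

/* Multiplexed style: the option value travels as an untyped buffer. */
static int toy_setsockopt(struct toy_sock *sk, enum toy_opt opt,
			  const void *optval, size_t optlen)
{
	int val;

	if (optlen != sizeof(val))
		return -EINVAL;		/* size mistakes only show up at run time */
	memcpy(&val, optval, sizeof(val));

	switch (opt) {
	case TOY_OPT_PRIORITY:
		sk->priority = val;
		return 0;
	}
	return -ENOPROTOOPT;
}

/* Direct style: the compiler checks the argument and nothing can fail. */
static void toy_sock_set_priority(struct toy_sock *sk, int priority)
{
	sk->priority = priority;
}
```

In the multiplexed style a wrongly sized argument is rejected only when the call runs; in the direct style the same mistake is a compile-time type mismatch, and callers no longer need an error path.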
[PATCH 08/28] net: add sock_set_rcvbuf
Add a helper to directly set the SO_RCVBUFFORCE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig --- fs/dlm/lowcomms.c | 7 +- include/net/sock.h | 1 + net/core/sock.c| 59 +- 3 files changed, 34 insertions(+), 33 deletions(-) diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index 138009c6a2ee1..45c37f572c9d2 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1297,7 +1297,6 @@ static int sctp_listen_for_all(void) struct socket *sock = NULL; int result = -EINVAL; struct connection *con = nodeid2con(0, GFP_NOFS); - int bufsize = NEEDED_RMEM; int one = 1; if (!con) @@ -1312,11 +1311,7 @@ static int sctp_listen_for_all(void) goto out; } - result = kernel_setsockopt(sock, SOL_SOCKET, SO_RCVBUFFORCE, -(char *), sizeof(bufsize)); - if (result) - log_print("Error increasing buffer space on socket %d", result); - + sock_set_rcvbuf(sock->sk, NEEDED_RMEM); result = kernel_setsockopt(sock, SOL_SCTP, SCTP_NODELAY, (char *), sizeof(one)); if (result < 0) diff --git a/include/net/sock.h b/include/net/sock.h index dc08c176238fd..c997289aabbf9 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2693,6 +2693,7 @@ void sock_enable_timestamps(struct sock *sk); void sock_no_linger(struct sock *sk); void sock_set_keepalive(struct sock *sk); void sock_set_priority(struct sock *sk, u32 priority); +void sock_set_rcvbuf(struct sock *sk, int val); void sock_set_reuseaddr(struct sock *sk); void sock_set_sndtimeo(struct sock *sk, s64 secs); diff --git a/net/core/sock.c b/net/core/sock.c index 728f5fb156a0c..3c6ebf952e9ad 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -789,6 +789,35 @@ void sock_set_keepalive(struct sock *sk) } EXPORT_SYMBOL(sock_set_keepalive); +static void __sock_set_rcvbuf(struct sock *sk, int val) +{ + /* Ensure val * 2 fits into an int, to prevent max_t() from treating it +* as a negative value. 
+*/ + val = min_t(int, val, INT_MAX / 2); + sk->sk_userlocks |= SOCK_RCVBUF_LOCK; + + /* We double it on the way in to account for "struct sk_buff" etc. +* overhead. Applications assume that the SO_RCVBUF setting they make +* will allow that much actual data to be received on that socket. +* +* Applications are unaware that "struct sk_buff" and other overheads +* allocate from the receive buffer during socket buffer allocation. +* +* And after considering the possible alternatives, returning the value +* we actually used in getsockopt is the most desirable behavior. +*/ + WRITE_ONCE(sk->sk_rcvbuf, max_t(int, val * 2, SOCK_MIN_RCVBUF)); +} + +void sock_set_rcvbuf(struct sock *sk, int val) +{ + lock_sock(sk); + __sock_set_rcvbuf(sk, val); + release_sock(sk); +} +EXPORT_SYMBOL(sock_set_rcvbuf); + /* * This is meant for all protocols to use and covers goings on * at the socket level. Everything here is generic. @@ -885,30 +914,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname, * play 'guess the biggest size' games. RCVBUF/SNDBUF * are treated in BSD as hints */ - val = min_t(u32, val, sysctl_rmem_max); -set_rcvbuf: - /* Ensure val * 2 fits into an int, to prevent max_t() -* from treating it as a negative value. -*/ - val = min_t(int, val, INT_MAX / 2); - sk->sk_userlocks |= SOCK_RCVBUF_LOCK; - /* -* We double it on the way in to account for -* "struct sk_buff" etc. overhead. Applications -* assume that the SO_RCVBUF setting they make will -* allow that much actual data to be received on that -* socket. -* -* Applications are unaware that "struct sk_buff" and -* other overheads allocate from the receive buffer -* during socket buffer allocation. -* -* And after considering the possible alternatives, -* returning the value we actually used in getsockopt -* is the most desirable behavior. 
-*/ - WRITE_ONCE(sk->sk_rcvbuf, - max_t(int, val * 2, SOCK_MIN_RCVBUF)); + __sock_set_rcvbuf(sk, min_t(u32, val, sysctl_rmem_max)); break; case SO_RCVBUFFORCE: @@ -920,9 +926,8 @@ int sock_setsockopt(struct socket *sock, int level, int optname, /* No negative values (to prevent underflow, as val will be * multiplied by 2). */ - if (val
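The clamp-then-double arithmetic that the patch moves into __sock_set_rcvbuf() can be tried out in isolation. In this sketch effective_rcvbuf() is a hypothetical name, and min_rcvbuf is a parameter standing in for SOCK_MIN_RCVBUF, whose real value depends on the kernel build; non-negative input is assumed.

```c
#include <limits.h>

/*
 * Mirrors the arithmetic shown in __sock_set_rcvbuf() above.
 * Assumes val >= 0. `min_rcvbuf` stands in for SOCK_MIN_RCVBUF.
 */
static int effective_rcvbuf(int val, int min_rcvbuf)
{
	/* clamp so that val * 2 cannot overflow an int */
	if (val > INT_MAX / 2)
		val = INT_MAX / 2;

	/* double to account for bookkeeping overhead, then apply the floor */
	val *= 2;
	return val > min_rcvbuf ? val : min_rcvbuf;
}
```

For example, with an assumed floor of 2304 a request of 1000 yields 2304, a request of 4096 yields 8192, and a request of INT_MAX is clamped so that the doubled result still fits in an int.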
[PATCH 09/28] net: add sock_set_reuseport
Add a helper to directly set the SO_REUSEPORT sockopt from kernel space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/sock.h    |  1 +
 net/core/sock.c       |  8
 net/sunrpc/xprtsock.c | 17 +
 3 files changed, 10 insertions(+), 16 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c997289aabbf9..d994daa418ec2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2695,6 +2695,7 @@ void sock_set_keepalive(struct sock *sk);
 void sock_set_priority(struct sock *sk, u32 priority);
 void sock_set_rcvbuf(struct sock *sk, int val);
 void sock_set_reuseaddr(struct sock *sk);
+void sock_set_reuseport(struct sock *sk);
 void sock_set_sndtimeo(struct sock *sk, s64 secs);
 
 #endif	/* _SOCK_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index 3c6ebf952e9ad..2ca3425b519c0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -729,6 +729,14 @@ void sock_set_reuseaddr(struct sock *sk)
 }
 EXPORT_SYMBOL(sock_set_reuseaddr);
 
+void sock_set_reuseport(struct sock *sk)
+{
+	lock_sock(sk);
+	sk->sk_reuseport = true;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(sock_set_reuseport);
+
 void sock_no_linger(struct sock *sk)
 {
 	lock_sock(sk);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 30082cd039960..399848c2bcb29 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -1594,21 +1594,6 @@ static int xs_get_random_port(void)
 	return rand + min;
 }
 
-/**
- * xs_set_reuseaddr_port - set the socket's port and address reuse options
- * @sock: socket
- *
- * Note that this function has to be called on all sockets that share the
- * same port, and it must be called before binding.
- */
-static void xs_sock_set_reuseport(struct socket *sock)
-{
-	int opt = 1;
-
-	kernel_setsockopt(sock, SOL_SOCKET, SO_REUSEPORT,
-			(char *)&opt, sizeof(opt));
-}
-
 static unsigned short xs_sock_getport(struct socket *sock)
 {
 	struct sockaddr_storage buf;
@@ -1801,7 +1786,7 @@ static struct socket *xs_create_sock(struct rpc_xprt *xprt,
 	xs_reclassify_socket(family, sock);
 
 	if (reuseport)
-		xs_sock_set_reuseport(sock);
+		sock_set_reuseport(sock->sk);
 
 	err = xs_bind(transport, sock);
 	if (err) {
-- 
2.26.2
Re: [PATCH] ASoC: AMD: Use mixer control to switch between DMICs
On 5/27/2020 4:57 PM, Mark Brown wrote:
> On Wed, May 27, 2020 at 07:10:16AM +0530, Akshu Agrawal wrote:
>
>> +	SOC_SINGLE_BOOL_EXT("Front Mic", 0, front_mic_get, front_mic_set),
>
> This should probably be a mux with two labelled options, or if it's a
> boolean control it should end in Switch. A mux definitely seems like a
> better option though.

Actually it's a dmic switch, so will change it to boolean control named "DMIC switch". Front or rear mic might change with variants.

Thanks,
Akshu
[PATCH 11/28] tcp: add tcp_sock_set_nodelay
Add a helper to directly set the TCP_NODELAY sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig Acked-by: Sagi Grimberg Acked-by: Jason Gunthorpe --- drivers/block/drbd/drbd_int.h | 7 drivers/block/drbd/drbd_main.c| 2 +- drivers/block/drbd/drbd_receiver.c| 4 +-- drivers/infiniband/sw/siw/siw_cm.c| 24 +++--- drivers/nvme/host/tcp.c | 9 +- drivers/nvme/target/tcp.c | 12 ++- drivers/target/iscsi/iscsi_target_login.c | 15 ++--- fs/cifs/connect.c | 10 ++ fs/dlm/lowcomms.c | 8 ++--- fs/ocfs2/cluster/tcp.c| 20 ++-- include/linux/tcp.h | 1 + net/ceph/messenger.c | 11 ++- net/ipv4/tcp.c| 39 +++ net/rds/tcp.c | 11 +-- net/rds/tcp.h | 1 - net/rds/tcp_listen.c | 2 +- 16 files changed, 49 insertions(+), 127 deletions(-) diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index 3550adc93c68b..e24bba87c8e02 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -1570,13 +1570,6 @@ extern void drbd_set_recv_tcq(struct drbd_device *device, int tcq_enabled); extern void _drbd_clear_done_ee(struct drbd_device *device, struct list_head *to_be_freed); extern int drbd_connected(struct drbd_peer_device *); -static inline void drbd_tcp_nodelay(struct socket *sock) -{ - int val = 1; - (void) kernel_setsockopt(sock, SOL_TCP, TCP_NODELAY, - (char*), sizeof(val)); -} - static inline void drbd_tcp_quickack(struct socket *sock) { int val = 2; diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index c094c3c2c5d4d..45fbd526c453b 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -660,7 +660,7 @@ static int __send_command(struct drbd_connection *connection, int vnr, /* DRBD protocol "pings" are latency critical. 
* This is supposed to trigger tcp_push_pending_frames() */ if (!err && (cmd == P_PING || cmd == P_PING_ACK)) - drbd_tcp_nodelay(sock->socket); + tcp_sock_set_nodelay(sock->socket->sk); return err; } diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 55ea907ad33cb..20a5e94494acd 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -1051,8 +1051,8 @@ static int conn_connect(struct drbd_connection *connection) /* we don't want delays. * we use TCP_CORK where appropriate, though */ - drbd_tcp_nodelay(sock.socket); - drbd_tcp_nodelay(msock.socket); + tcp_sock_set_nodelay(sock.socket->sk); + tcp_sock_set_nodelay(msock.socket->sk); connection->data.socket = sock.socket; connection->meta.socket = msock.socket; diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c index d1860f3e87401..1662216be66df 100644 --- a/drivers/infiniband/sw/siw/siw_cm.c +++ b/drivers/infiniband/sw/siw/siw_cm.c @@ -947,16 +947,8 @@ static void siw_accept_newconn(struct siw_cep *cep) siw_cep_get(new_cep); new_s->sk->sk_user_data = new_cep; - if (siw_tcp_nagle == false) { - int val = 1; - - rv = kernel_setsockopt(new_s, SOL_TCP, TCP_NODELAY, - (char *), sizeof(val)); - if (rv) { - siw_dbg_cep(cep, "setsockopt NODELAY error: %d\n", rv); - goto error; - } - } + if (siw_tcp_nagle == false) + tcp_sock_set_nodelay(new_s->sk); new_cep->state = SIW_EPSTATE_AWAIT_MPAREQ; rv = siw_cm_queue_work(new_cep, SIW_CM_WORK_MPATIMEOUT); @@ -1386,16 +1378,8 @@ int siw_connect(struct iw_cm_id *id, struct iw_cm_conn_param *params) siw_dbg_qp(qp, "kernel_bindconnect: error %d\n", rv); goto error; } - if (siw_tcp_nagle == false) { - int val = 1; - - rv = kernel_setsockopt(s, SOL_TCP, TCP_NODELAY, (char *), - sizeof(val)); - if (rv) { - siw_dbg_qp(qp, "setsockopt NODELAY error: %d\n", rv); - goto error; - } - } + if (siw_tcp_nagle == false) + tcp_sock_set_nodelay(s->sk); cep = siw_cep_alloc(sdev); if (!cep) { rv = 
-ENOMEM; diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index a307972d33a02..4e4a750ecdb97 100644 ---
Re: [PATCH v30 07/20] x86/sgx: Enumerate and track EPC sections
On Mon, May 25, 2020 at 11:23:04AM +0200, Borislav Petkov wrote:
> Enabling this gives:
>
> In file included from arch/x86/kernel/cpu/sgx/main.c:11:
> arch/x86/kernel/cpu/sgx/encls.h:189:51: warning: ‘struct sgx_einittoken’ declared inside parameter list will not be visible outside of this definition or declaration
>   189 | static inline int __einit(void *sigstruct, struct sgx_einittoken *einittoken,
>       |                                                   ^~
> In file included from arch/x86/kernel/cpu/sgx/reclaim.c:12:
> arch/x86/kernel/cpu/sgx/encls.h:189:51: warning: ‘struct sgx_einittoken’ declared inside parameter list will not be visible outside of this definition or declaration
>   189 | static inline int __einit(void *sigstruct, struct sgx_einittoken *einittoken,
>       |
>
> You need a forward declaration somewhere.

It is a left-over from v28 and should be "void *". To backtrack what happened it looks that I squashed the change that does this to "x86/sgx: Linux Enclave Driver". This is fixed now in my tree.

/Jarkko
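As background on the warning itself, here is a minimal sketch of the forward-declaration approach Borislav mentions. The names struct token and token_is_missing() are hypothetical. The fix described in the reply above is different: it changes the parameter to "void *".

```c
#include <stddef.h>

/*
 * Forward declaration at file scope. Without this line, `struct token`
 * would first appear inside the parameter list below and be scoped to
 * that declaration only, which is what the GCC warning is about.
 */
struct token;

/* Hypothetical function; only a pointer is used, so the type may stay incomplete. */
static inline int token_is_missing(void *sigstruct, struct token *tok)
{
	(void)sigstruct;
	return tok == NULL;
}
```

Because only a pointer to the structure is passed around, the full definition is not needed at this point, and either approach silences the warning.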
[PATCH 07/28] net: add sock_set_keepalive
Add a helper to directly set the SO_KEEPALIVE sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig --- fs/dlm/lowcomms.c | 6 +- include/net/sock.h| 1 + net/core/sock.c | 10 ++ net/rds/tcp_listen.c | 6 +- net/sunrpc/xprtsock.c | 4 +--- 5 files changed, 14 insertions(+), 13 deletions(-) diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c index b4d491122814b..138009c6a2ee1 100644 --- a/fs/dlm/lowcomms.c +++ b/fs/dlm/lowcomms.c @@ -1259,11 +1259,7 @@ static struct socket *tcp_create_listen_sock(struct connection *con, con->sock = NULL; goto create_out; } - result = kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, -(char *), sizeof(one)); - if (result < 0) { - log_print("Set keepalive failed: %d", result); - } + sock_set_keepalive(sock->sk); result = sock->ops->listen(sock, 5); if (result < 0) { diff --git a/include/net/sock.h b/include/net/sock.h index 99ef43508d2b5..dc08c176238fd 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2691,6 +2691,7 @@ void sock_def_readable(struct sock *sk); int sock_bindtoindex(struct sock *sk, int ifindex); void sock_enable_timestamps(struct sock *sk); void sock_no_linger(struct sock *sk); +void sock_set_keepalive(struct sock *sk); void sock_set_priority(struct sock *sk, u32 priority); void sock_set_reuseaddr(struct sock *sk); void sock_set_sndtimeo(struct sock *sk, s64 secs); diff --git a/net/core/sock.c b/net/core/sock.c index e4a4dd2b3d8b3..728f5fb156a0c 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -779,6 +779,16 @@ void sock_enable_timestamps(struct sock *sk) } EXPORT_SYMBOL(sock_enable_timestamps); +void sock_set_keepalive(struct sock *sk) +{ + lock_sock(sk); + if (sk->sk_prot->keepalive) + sk->sk_prot->keepalive(sk, true); + sock_valbool_flag(sk, SOCK_KEEPOPEN, true); + release_sock(sk); +} +EXPORT_SYMBOL(sock_set_keepalive); + /* * This is meant for all protocols to use and covers goings on * at the socket level. Everything here is generic. 
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c index bbb31b9c0b391..d8bd132769594 100644 --- a/net/rds/tcp_listen.c +++ b/net/rds/tcp_listen.c @@ -43,13 +43,9 @@ int rds_tcp_keepalive(struct socket *sock) /* values below based on xs_udp_default_timeout */ int keepidle = 5; /* send a probe 'keepidle' secs after last data */ int keepcnt = 5; /* number of unack'ed probes before declaring dead */ - int keepalive = 1; int ret = 0; - ret = kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, - (char *), sizeof(keepalive)); - if (ret < 0) - goto bail; + sock_set_keepalive(sock->sk); ret = kernel_setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, (char *), sizeof(keepcnt)); diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 845d0be805ece..30082cd039960 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -2110,7 +2110,6 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt, struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt); unsigned int keepidle; unsigned int keepcnt; - unsigned int opt_on = 1; unsigned int timeo; spin_lock(>transport_lock); @@ -2122,8 +2121,7 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt, spin_unlock(>transport_lock); /* TCP Keepalive options */ - kernel_setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, - (char *)_on, sizeof(opt_on)); + sock_set_keepalive(sock->sk); kernel_setsockopt(sock, SOL_TCP, TCP_KEEPIDLE, (char *), sizeof(keepidle)); kernel_setsockopt(sock, SOL_TCP, TCP_KEEPINTVL, -- 2.26.2
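A userspace sketch of the same sequence that rds_tcp_keepalive() follows (switch keepalive on, then tune the probe parameters) might look as follows. enable_keepalive() is a hypothetical name; TCP_KEEPIDLE and TCP_KEEPCNT are Linux-specific option names, so this assumes a Linux system.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable keepalive on a TCP socket and tune the probe parameters.
 * Returns 0 on success, -1 on the first failing setsockopt().
 */
static int enable_keepalive(int fd, int idle_secs, int probe_cnt)
{
	int on = 1;

	/* the option that sock_set_keepalive() replaces on the kernel side */
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	/* seconds of idle time before the first probe is sent */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0)
		return -1;
	/* number of unacknowledged probes before the peer is declared dead */
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probe_cnt, sizeof(probe_cnt)) < 0)
		return -1;
	return 0;
}
```

In the patch only the first of these calls becomes a direct helper; the TCP-level options in the hunks above still go through kernel_setsockopt().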
[PATCH 05/28] net: add sock_bindtoindex
Add a helper to directly set the SO_BINDTOIFINDEX sockopt from kernel space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/sock.h        |  1 +
 net/core/sock.c           | 21 +++--
 net/ipv4/udp_tunnel.c     |  4 +---
 net/ipv6/ip6_udp_tunnel.c |  4 +---
 4 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 9a7b9e98685ac..cdec7bc055d5b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2688,6 +2688,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
 
 void sock_def_readable(struct sock *sk);
 
+int sock_bindtoindex(struct sock *sk, int ifindex);
 void sock_no_linger(struct sock *sk);
 void sock_set_priority(struct sock *sk, u32 priority);
 void sock_set_reuseaddr(struct sock *sk);
diff --git a/net/core/sock.c b/net/core/sock.c
index d3b1d61e4f768..23f80880fbb2c 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -566,7 +566,7 @@ struct dst_entry *sk_dst_check(struct sock *sk, u32 cookie)
 }
 EXPORT_SYMBOL(sk_dst_check);
 
-static int sock_setbindtodevice_locked(struct sock *sk, int ifindex)
+static int sock_bindtoindex_locked(struct sock *sk, int ifindex)
 {
 	int ret = -ENOPROTOOPT;
 #ifdef CONFIG_NETDEVICES
@@ -594,6 +594,18 @@ static int sock_setbindtodevice_locked(struct sock *sk, int ifindex)
 	return ret;
 }
 
+int sock_bindtoindex(struct sock *sk, int ifindex)
+{
+	int ret;
+
+	lock_sock(sk);
+	ret = sock_bindtoindex_locked(sk, ifindex);
+	release_sock(sk);
+
+	return ret;
+}
+EXPORT_SYMBOL(sock_bindtoindex);
+
 static int sock_setbindtodevice(struct sock *sk, char __user *optval,
 				int optlen)
 {
@@ -634,10 +646,7 @@ static int sock_setbindtodevice(struct sock *sk, char __user *optval,
 		goto out;
 	}
 
-	lock_sock(sk);
-	ret = sock_setbindtodevice_locked(sk, index);
-	release_sock(sk);
-
+	return sock_bindtoindex(sk, index);
 out:
 #endif
 
@@ -1216,7 +1225,7 @@ int sock_setsockopt(struct socket *sock, int level, int optname,
 		break;
 
 	case SO_BINDTOIFINDEX:
-		ret = sock_setbindtodevice_locked(sk, val);
+		ret = sock_bindtoindex_locked(sk, val);
 		break;
 
 	default:
diff --git a/net/ipv4/udp_tunnel.c b/net/ipv4/udp_tunnel.c
index 150e6f0fdbf59..2158e8bddf41c 100644
--- a/net/ipv4/udp_tunnel.c
+++ b/net/ipv4/udp_tunnel.c
@@ -22,9 +22,7 @@ int udp_sock_create4(struct net *net, struct udp_port_cfg *cfg,
 		goto error;
 
 	if (cfg->bind_ifindex) {
-		err = kernel_setsockopt(sock, SOL_SOCKET, SO_BINDTOIFINDEX,
-					(void *)&cfg->bind_ifindex,
-					sizeof(cfg->bind_ifindex));
+		err = sock_bindtoindex(sock->sk, cfg->bind_ifindex);
 		if (err < 0)
 			goto error;
 	}
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index 58956a6b66a21..6523609516d25 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -33,9 +33,7 @@ int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
 			goto error;
 	}
 	if (cfg->bind_ifindex) {
-		err = kernel_setsockopt(sock, SOL_SOCKET, SO_BINDTOIFINDEX,
-					(void *)&cfg->bind_ifindex,
-					sizeof(cfg->bind_ifindex));
+		err = sock_bindtoindex(sock->sk, cfg->bind_ifindex);
 		if (err < 0)
 			goto error;
 	}
-- 
2.26.2
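The patch follows a common split: a "_locked" variant for callers that already hold the lock (here the sock_setsockopt() path) and a thin wrapper that takes the lock for everyone else. A userspace sketch of that pattern with a pthread mutex, using hypothetical names (toy_obj, toy_bind_locked(), toy_bind()), could look like this:

```c
#include <pthread.h>

/* Hypothetical object; stands in for a socket with its own lock. */
struct toy_obj {
	pthread_mutex_t lock;
	int bound_index;
};

/* Caller must hold obj->lock (compare sock_bindtoindex_locked()). */
static int toy_bind_locked(struct toy_obj *obj, int index)
{
	if (index < 0)
		return -1;
	obj->bound_index = index;
	return 0;
}

/* Takes the lock itself (compare sock_bindtoindex()). */
static int toy_bind(struct toy_obj *obj, int index)
{
	int ret;

	pthread_mutex_lock(&obj->lock);
	ret = toy_bind_locked(obj, index);
	pthread_mutex_unlock(&obj->lock);

	return ret;
}
```

Keeping the real work in the "_locked" function means the locking wrapper stays trivial and the two entry points cannot drift apart.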
Re: [RFC PATCH] iommu/arm-smmu: Add module parameter to set msi iova address
On Wed, May 27, 2020 at 11:00 PM Robin Murphy wrote:

Thanks Robin for your quick response.

> On 2020-05-27 17:03, Srinath Mannam wrote:
> > This patch gives the provision to change default value of MSI IOVA base
> > to platform's suitable IOVA using module parameter. The present
> > hardcoded MSI IOVA base may not be the accessible IOVA ranges of platform.
>
> That in itself doesn't seem entirely unreasonable; IIRC the current
> address is just an arbitrary choice to fit nicely into Qemu's memory
> map, and there was always the possibility that it wouldn't suit everything.
>
> > Since commit aadad097cd46 ("iommu/dma: Reserve IOVA for PCIe inaccessible
> > DMA address"), inaccessible IOVA address ranges parsed from dma-ranges
> > property are reserved.
>
> That, however, doesn't seem to fit here; iommu-dma maps MSI doorbells
> dynamically, so they aren't affected by reserved regions any more than
> regular DMA pages are. In fact, it explicitly ignores the software MSI
> region, since as the comment says, it *is* the software that manages those.

Yes you are right, we don't see any issues with kernel drivers(PCI EP) because MSI IOVA allocated dynamically by honouring reserved regions same as DMA pages.

>
> The MSI_IOVA_BASE region exists for VFIO, precisely because in that case
> the kernel *doesn't* control the address space, but still needs some way
> to steal a bit of it for MSIs that the guest doesn't necessarily know
> about, and give userspace a fighting chance of knowing what it's taken.
> I think at the time we discussed the idea of adding something to the
> VFIO uapi such that userspace could move this around if it wanted or
> needed to, but decided we could live without that initially. Perhaps now
> the time has come?

Yes, we see issues only with user-space drivers(DPDK) in which MSI_IOVA_BASE region is considered to map MSI registers. This patch helps us to fix the issue.

Thanks,
Srinath.

>
> Robin.
>
> > If any platform has the limitaion to access default MSI IOVA, then it can
> > be changed using "arm-smmu.msi_iova_base=0xa000" command line argument.
> >
> > Signed-off-by: Srinath Mannam
> > ---
> >  drivers/iommu/arm-smmu.c | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> > index 4f1a350..5e59c9d 100644
> > --- a/drivers/iommu/arm-smmu.c
> > +++ b/drivers/iommu/arm-smmu.c
> > @@ -72,6 +72,9 @@ static bool disable_bypass =
> >  module_param(disable_bypass, bool, S_IRUGO);
> >  MODULE_PARM_DESC(disable_bypass,
> >  	"Disable bypass streams such that incoming transactions from devices that are not attached to an iommu domain will report an abort back to the device and will not be allowed to pass through the SMMU.");
> > +static unsigned long msi_iova_base = MSI_IOVA_BASE;
> > +module_param(msi_iova_base, ulong, S_IRUGO);
> > +MODULE_PARM_DESC(msi_iova_base, "msi iova base address.");
> >
> >  struct arm_smmu_s2cr {
> >  	struct iommu_group *group;
> > @@ -1566,7 +1569,7 @@ static void arm_smmu_get_resv_regions(struct device *dev,
> >  	struct iommu_resv_region *region;
> >  	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> >
> > -	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > +	region = iommu_alloc_resv_region(msi_iova_base, MSI_IOVA_LENGTH,
> >  					 prot, IOMMU_RESV_SW_MSI);
> >  	if (!region)
> >  		return;
> >
[PATCH 20/28] ipv4: add ip_sock_set_recverr
Add a helper to directly set the IP_RECVERR sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells
---
 include/net/ip.h         | 1 +
 net/ipv4/ip_sockglue.c   | 8
 net/rxrpc/local_object.c | 8 +---
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index 5f5d8226b6abc..f063a491b9063 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -766,6 +766,7 @@ static inline bool inetdev_valid_mtu(unsigned int mtu)
 }
 
 void ip_sock_set_freebind(struct sock *sk);
+void ip_sock_set_recverr(struct sock *sk);
 void ip_sock_set_tos(struct sock *sk, int val);
 
 #endif	/* _IP_H */
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 767838d2030d8..aca6b81da9bae 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -589,6 +589,14 @@ void ip_sock_set_freebind(struct sock *sk)
 }
 EXPORT_SYMBOL(ip_sock_set_freebind);
 
+void ip_sock_set_recverr(struct sock *sk)
+{
+	lock_sock(sk);
+	inet_sk(sk)->recverr = true;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(ip_sock_set_recverr);
+
 /*
  * Socket option code for IP. This is the end of the line after any
  * TCP,UDP etc options on an IP socket.
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c
index 5ea2bd01fdd59..4c0e8fe5ec1fb 100644
--- a/net/rxrpc/local_object.c
+++ b/net/rxrpc/local_object.c
@@ -171,13 +171,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 		/* Fall through */
 	case AF_INET:
 		/* we want to receive ICMP errors */
-		opt = 1;
-		ret = kernel_setsockopt(local->socket, SOL_IP, IP_RECVERR,
-					(char *) &opt, sizeof(opt));
-		if (ret < 0) {
-			_debug("setsockopt failed");
-			goto error;
-		}
+		ip_sock_set_recverr(local->socket->sk);
 
 		/* we want to set the don't fragment bit */
 		opt = IP_PMTUDISC_DO;
-- 
2.26.2
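For readers more familiar with the userspace side, the option that ip_sock_set_recverr() sets is the same one an application enables with setsockopt(). The sketch below is only an illustration and is not part of the patch: enable_ip_recverr() and ip_recverr_enabled() are hypothetical names, and it assumes a Linux system where <netinet/in.h> exposes IP_RECVERR.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Userspace counterpart of the in-kernel helper: ask the stack to queue
 * ICMP errors on the socket error queue. Hypothetical helper name.
 */
int enable_ip_recverr(int fd)
{
	int on = 1;

	assert(fd >= 0);
	return setsockopt(fd, IPPROTO_IP, IP_RECVERR, &on, sizeof(on));
}

/* Read the flag back: returns 1 if set, 0 if clear, -1 on error. */
int ip_recverr_enabled(int fd)
{
	int val = 0;
	socklen_t len = sizeof(val);

	assert(fd >= 0);
	if (getsockopt(fd, IPPROTO_IP, IP_RECVERR, &val, &len) < 0)
		return -1;
	return val != 0;
}
```

The kernel helper skips the option parsing entirely and writes the flag under the socket lock, which is why it can return void.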
[PATCH 19/28] ipv4: add ip_sock_set_freebind
Add a helper to directly set the IP_FREEBIND sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 drivers/target/iscsi/iscsi_target_login.c | 13 +++--
 include/net/ip.h                          | 1 +
 net/ipv4/ip_sockglue.c                    | 8
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c
index b561b07a869a0..85748e3388582 100644
--- a/drivers/target/iscsi/iscsi_target_login.c
+++ b/drivers/target/iscsi/iscsi_target_login.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include /* TCP_NODELAY */
+#include
 #include /* ipv6_addr_v4mapped() */
 #include
 #include
@@ -855,7 +856,7 @@ int iscsit_setup_np(
 	struct sockaddr_storage *sockaddr)
 {
 	struct socket *sock = NULL;
-	int backlog = ISCSIT_TCP_BACKLOG, ret, opt = 0, len;
+	int backlog = ISCSIT_TCP_BACKLOG, ret, len;
 
 	switch (np->np_network_transport) {
 	case ISCSI_TCP:
@@ -900,15 +901,7 @@ int iscsit_setup_np(
 	if (np->np_network_transport == ISCSI_TCP)
 		tcp_sock_set_nodelay(sock->sk);
 	sock_set_reuseaddr(sock->sk);
-
-	opt = 1;
-	ret = kernel_setsockopt(sock, IPPROTO_IP, IP_FREEBIND,
-			(char *)&opt, sizeof(opt));
-	if (ret < 0) {
-		pr_err("kernel_setsockopt() for IP_FREEBIND"
-			" failed\n");
-		goto fail;
-	}
+	ip_sock_set_freebind(sock->sk);
 
 	ret = kernel_bind(sock, (struct sockaddr *)&np->np_sockaddr, len);
 	if (ret < 0) {
diff --git a/include/net/ip.h b/include/net/ip.h
index 2fc52e26fa88b..5f5d8226b6abc 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -765,6 +765,7 @@ static inline bool inetdev_valid_mtu(unsigned int mtu)
 	return likely(mtu >= IPV4_MIN_MTU);
 }
 
+void ip_sock_set_freebind(struct sock *sk);
 void ip_sock_set_tos(struct sock *sk, int val);
 
 #endif	/* _IP_H */
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index b43a29e11f4a5..767838d2030d8 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -581,6 +581,14 @@ void ip_sock_set_tos(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(ip_sock_set_tos);
 
+void ip_sock_set_freebind(struct sock *sk)
+{
+	lock_sock(sk);
+	inet_sk(sk)->freebind = true;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(ip_sock_set_freebind);
+
 /*
  * Socket option code for IP. This is the end of the line after any
  * TCP,UDP etc options on an IP socket.
-- 
2.26.2
[PATCH 22/28] ipv4: add ip_sock_set_pktinfo
Add a helper to directly set the IP_PKTINFO sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/ip.h       | 1 +
 net/ipv4/ip_sockglue.c | 8
 net/sunrpc/svcsock.c   | 5 ++---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index d3649c49dd333..04ebe7bf54c6a 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -767,6 +767,7 @@ static inline bool inetdev_valid_mtu(unsigned int mtu)
 
 void ip_sock_set_freebind(struct sock *sk);
 int ip_sock_set_mtu_discover(struct sock *sk, int val);
+void ip_sock_set_pktinfo(struct sock *sk);
 void ip_sock_set_recverr(struct sock *sk);
 void ip_sock_set_tos(struct sock *sk, int val);
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index aa115be11dcfb..84ec3703c9091 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -608,6 +608,14 @@ int ip_sock_set_mtu_discover(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(ip_sock_set_mtu_discover);
 
+void ip_sock_set_pktinfo(struct sock *sk)
+{
+	lock_sock(sk);
+	inet_sk(sk)->cmsg_flags |= IP_CMSG_PKTINFO;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(ip_sock_set_pktinfo);
+
 /*
  * Socket option code for IP. This is the end of the line after any
  * TCP,UDP etc options on an IP socket.
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 6773dacc64d8e..7a805d165689c 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -616,9 +616,8 @@ static void svc_udp_init(struct svc_sock *svsk, struct svc_serv *serv)
 	/* make sure we get destination address info */
 	switch (svsk->sk_sk->sk_family) {
 	case AF_INET:
-		level = SOL_IP;
-		optname = IP_PKTINFO;
-		break;
+		ip_sock_set_pktinfo(svsk->sk_sock->sk);
+		return;
 	case AF_INET6:
 		level = SOL_IPV6;
 		optname = IPV6_RECVPKTINFO;
-- 
2.26.2
[PATCH 21/28] ipv4: add ip_sock_set_mtu_discover
Add a helper to directly set the IP_MTU_DISCOVER sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells [rxrpc bits]
---
 include/net/ip.h         | 1 +
 net/ipv4/ip_sockglue.c   | 11 +++
 net/rxrpc/local_object.c | 8 +---
 net/rxrpc/output.c       | 14 +-
 4 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index f063a491b9063..d3649c49dd333 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -766,6 +766,7 @@ static inline bool inetdev_valid_mtu(unsigned int mtu)
 }
 
 void ip_sock_set_freebind(struct sock *sk);
+int ip_sock_set_mtu_discover(struct sock *sk, int val);
 void ip_sock_set_recverr(struct sock *sk);
 void ip_sock_set_tos(struct sock *sk, int val);
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index aca6b81da9bae..aa115be11dcfb 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -597,6 +597,17 @@ void ip_sock_set_recverr(struct sock *sk)
 }
 EXPORT_SYMBOL(ip_sock_set_recverr);
 
+int ip_sock_set_mtu_discover(struct sock *sk, int val)
+{
+	if (val < IP_PMTUDISC_DONT || val > IP_PMTUDISC_OMIT)
+		return -EINVAL;
+	lock_sock(sk);
+	inet_sk(sk)->pmtudisc = val;
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(ip_sock_set_mtu_discover);
+
 /*
  * Socket option code for IP. This is the end of the line after any
  * TCP,UDP etc options on an IP socket.
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c
index 4c0e8fe5ec1fb..6f4e6b4817cf2 100644
--- a/net/rxrpc/local_object.c
+++ b/net/rxrpc/local_object.c
@@ -174,13 +174,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 		ip_sock_set_recverr(local->socket->sk);
 
 		/* we want to set the don't fragment bit */
-		opt = IP_PMTUDISC_DO;
-		ret = kernel_setsockopt(local->socket, SOL_IP, IP_MTU_DISCOVER,
-					(char *) &opt, sizeof(opt));
-		if (ret < 0) {
-			_debug("setsockopt failed");
-			goto error;
-		}
+		ip_sock_set_mtu_discover(local->socket->sk, IP_PMTUDISC_DO);
 
 		/* We want receive timestamps. */
 		sock_enable_timestamps(local->socket->sk);
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index f8b632a5c6197..1ba43c3df4adb 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -321,7 +321,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb,
 	struct kvec iov[2];
 	rxrpc_serial_t serial;
 	size_t len;
-	int ret, opt;
+	int ret;
 
 	_enter(",{%d}", skb->len);
 
@@ -473,18 +473,14 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct sk_buff *skb,
 		switch (conn->params.local->srx.transport.family) {
 		case AF_INET6:
 		case AF_INET:
-			opt = IP_PMTUDISC_DONT;
-			kernel_setsockopt(conn->params.local->socket,
-					  SOL_IP, IP_MTU_DISCOVER,
-					  (char *)&opt, sizeof(opt));
+			ip_sock_set_mtu_discover(conn->params.local->socket->sk,
+					IP_PMTUDISC_DONT);
 			ret = kernel_sendmsg(conn->params.local->socket, &msg,
 					     iov, 2, len);
 			conn->params.peer->last_tx_at = ktime_get_seconds();
 
-			opt = IP_PMTUDISC_DO;
-			kernel_setsockopt(conn->params.local->socket,
-					  SOL_IP, IP_MTU_DISCOVER,
-					  (char *)&opt, sizeof(opt));
+			ip_sock_set_mtu_discover(conn->params.local->socket->sk,
+					IP_PMTUDISC_DO);
 			break;
 
 		default:
-- 
2.26.2
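The shape of ip_sock_set_mtu_discover() (validate the value first, then store it under the socket lock) can be tried out in userspace with a small model. This is a sketch under stated assumptions, not kernel code: struct model_sock, the MODEL_* constants and the lock-depth counter are all hypothetical stand-ins, with the constant values chosen to mirror the Linux IP_PMTUDISC_* range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Values mirror the Linux uapi IP_PMTUDISC_* constants (assumption). */
enum {
	MODEL_PMTUDISC_DONT = 0,
	MODEL_PMTUDISC_WANT = 1,
	MODEL_PMTUDISC_DO = 2,
	MODEL_PMTUDISC_PROBE = 3,
	MODEL_PMTUDISC_INTERFACE = 4,
	MODEL_PMTUDISC_OMIT = 5,
};

/* Hypothetical stand-in for struct sock plus its inet state. */
struct model_sock {
	int lock_depth;	/* plays the role of the socket lock */
	int pmtudisc;
};

void model_lock_sock(struct model_sock *sk)
{
	assert(sk->lock_depth == 0);	/* not already held */
	sk->lock_depth++;
}

void model_release_sock(struct model_sock *sk)
{
	assert(sk->lock_depth == 1);	/* must be held by us */
	sk->lock_depth--;
}

/* Same control flow as the helper: range check, lock, store, unlock. */
int model_set_mtu_discover(struct model_sock *sk, int val)
{
	assert(sk != NULL);
	if (val < MODEL_PMTUDISC_DONT || val > MODEL_PMTUDISC_OMIT)
		return -EINVAL;	/* rejected before the lock is taken */

	model_lock_sock(sk);
	sk->pmtudisc = val;
	model_release_sock(sk);
	return 0;
}
```

The rxrpc callers converted in this patch pass the constants IP_PMTUDISC_DO and IP_PMTUDISC_DONT, which lie inside the accepted range, so the return value they discard cannot be -EINVAL.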
[PATCH 16/28] tcp: add tcp_sock_set_keepintvl
Add a helper to directly set the TCP_KEEPINTVL sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/linux/tcp.h   | 1 +
 net/ipv4/tcp.c        | 12
 net/rds/tcp_listen.c  | 4 +---
 net/sunrpc/xprtsock.c | 3 +--
 4 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 5724dd84a85ed..1f9bada00faab 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -499,6 +499,7 @@ int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount,
 
 void tcp_sock_set_cork(struct sock *sk, bool on);
 int tcp_sock_set_keepidle(struct sock *sk, int val);
+int tcp_sock_set_keepintvl(struct sock *sk, int val);
 void tcp_sock_set_nodelay(struct sock *sk);
 void tcp_sock_set_quickack(struct sock *sk, int val);
 int tcp_sock_set_syncnt(struct sock *sk, int val);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index bdf0ff9333514..7eb083e09786a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2934,6 +2934,18 @@ int tcp_sock_set_keepidle(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(tcp_sock_set_keepidle);
 
+int tcp_sock_set_keepintvl(struct sock *sk, int val)
+{
+	if (val < 1 || val > MAX_TCP_KEEPINTVL)
+		return -EINVAL;
+
+	lock_sock(sk);
+	tcp_sk(sk)->keepalive_intvl = val * HZ;
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(tcp_sock_set_keepintvl);
+
 /*
  * Socket option code for TCP.
  */
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 79f9adc008114..9ad555c48d15d 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -53,12 +53,10 @@ int rds_tcp_keepalive(struct socket *sock)
 		goto bail;
 
 	tcp_sock_set_keepidle(sock->sk, keepidle);
-
 	/* KEEPINTVL is the interval between successive probes. We follow
 	 * the model in xs_tcp_finish_connecting() and re-use keepidle.
 	 */
-	ret = kernel_setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL,
-				(char *)&keepidle, sizeof(keepidle));
+	tcp_sock_set_keepintvl(sock->sk, keepidle);
 bail:
 	return ret;
 }
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 473290f7c5c0a..5ca64e12af0c5 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2108,8 +2108,7 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
 	/* TCP Keepalive options */
 	sock_set_keepalive(sock->sk);
 	tcp_sock_set_keepidle(sock->sk, keepidle);
-	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPINTVL,
-			(char *)&keepidle, sizeof(keepidle));
+	tcp_sock_set_keepintvl(sock->sk, keepidle);
 	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPCNT,
 			(char *)&keepcnt, sizeof(keepcnt));
 
-- 
2.26.2
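One detail worth spelling out in tcp_sock_set_keepintvl() is the unit change: callers pass seconds, while the value stored in keepalive_intvl is in jiffies (val * HZ). A minimal userspace sketch of that conversion follows. MODEL_HZ is an assumption (a build with CONFIG_HZ=250), MODEL_MAX_KEEPINTVL mirrors the kernel's MAX_TCP_KEEPINTVL of 32767, and the function name is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MODEL_HZ		250	/* assumption: CONFIG_HZ=250 */
#define MODEL_MAX_KEEPINTVL	32767	/* mirrors MAX_TCP_KEEPINTVL */

/*
 * Validate an interval given in seconds and convert it to jiffies,
 * following the same bounds check as the kernel helper. On error the
 * output is left untouched.
 */
int model_keepintvl_to_jiffies(int secs, unsigned int *jiffies_out)
{
	assert(jiffies_out != NULL);
	if (secs < 1 || secs > MODEL_MAX_KEEPINTVL)
		return -EINVAL;
	*jiffies_out = (unsigned int)secs * MODEL_HZ;
	return 0;
}
```

With these assumed values the five-second interval used by the rds and sunrpc callers becomes 1250 jiffies; on a kernel built with a different HZ the stored number changes but the interval in seconds does not.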
[PATCH 27/28] rxrpc: add rxrpc_sock_set_min_security_level
Add a helper to directly set the RXRPC_MIN_SECURITY_LEVEL sockopt from
kernel space without going through a fake uaccess.

Thanks to David Howells for the documentation updates.

Signed-off-by: Christoph Hellwig
Acked-by: David Howells
---
 Documentation/networking/rxrpc.rst | 13 +++--
 fs/afs/rxrpc.c                     | 6 ++
 include/net/af_rxrpc.h             | 2 ++
 net/rxrpc/af_rxrpc.c               | 13 +
 4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/Documentation/networking/rxrpc.rst b/Documentation/networking/rxrpc.rst
index 5ad35113d0f46..68552b92dc442 100644
--- a/Documentation/networking/rxrpc.rst
+++ b/Documentation/networking/rxrpc.rst
@@ -477,7 +477,7 @@ AF_RXRPC sockets support a few socket options at the SOL_RXRPC level:
 	 Encrypted checksum plus packet padded and first eight bytes of packet
 	 encrypted - which includes the actual packet length.
 
-     (c) RXRPC_SECURITY_ENCRYPTED
+     (c) RXRPC_SECURITY_ENCRYPT
 
 	 Encrypted checksum plus entire packet padded and encrypted, including
 	 actual packet length.
@@ -578,7 +578,7 @@ A client would issue an operation by:
      This issues a request_key() to get the key representing the security
      context. The minimum security level can be set::
 
-	unsigned int sec = RXRPC_SECURITY_ENCRYPTED;
+	unsigned int sec = RXRPC_SECURITY_ENCRYPT;
 	setsockopt(client, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
 		   &sec, sizeof(sec));
 
@@ -1090,6 +1090,15 @@ The kernel interface functions are as follows:
      jiffies). In the event of the timeout occurring, the call will be aborted
      and -ETIME or -ETIMEDOUT will be returned.
 
+ (#) Apply the RXRPC_MIN_SECURITY_LEVEL sockopt to a socket from within the
+     kernel::
+
+       int rxrpc_sock_set_min_security_level(struct sock *sk,
+					     unsigned int val);
+
+     This specifies the minimum security level required for calls on this
+     socket.
+
 
 Configurable Parameters
 =======================
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 1ecc67da6c1a4..e313dae01674f 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -37,7 +37,6 @@ int afs_open_socket(struct afs_net *net)
 {
 	struct sockaddr_rxrpc srx;
 	struct socket *socket;
-	unsigned int min_level;
 	int ret;
 
 	_enter("");
@@ -57,9 +56,8 @@ int afs_open_socket(struct afs_net *net)
 	srx.transport.sin6.sin6_family = AF_INET6;
 	srx.transport.sin6.sin6_port= htons(AFS_CM_PORT);
 
-	min_level = RXRPC_SECURITY_ENCRYPT;
-	ret = kernel_setsockopt(socket, SOL_RXRPC, RXRPC_MIN_SECURITY_LEVEL,
-				(void *)&min_level, sizeof(min_level));
+	ret = rxrpc_sock_set_min_security_level(socket->sk,
+			RXRPC_SECURITY_ENCRYPT);
 	if (ret < 0)
 		goto error_2;
 
diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h
index ab988940bf045..91eacbdcf33d2 100644
--- a/include/net/af_rxrpc.h
+++ b/include/net/af_rxrpc.h
@@ -72,4 +72,6 @@ bool rxrpc_kernel_call_is_complete(struct rxrpc_call *);
 void rxrpc_kernel_set_max_life(struct socket *, struct rxrpc_call *,
 			       unsigned long);
 
+int rxrpc_sock_set_min_security_level(struct sock *sk, unsigned int val);
+
 #endif /* _NET_RXRPC_H */
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 15ee92d795815..394189b81849f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -571,6 +571,19 @@ static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
 	return ret;
 }
 
+int rxrpc_sock_set_min_security_level(struct sock *sk, unsigned int val)
+{
+	if (sk->sk_state != RXRPC_UNBOUND)
+		return -EISCONN;
+	if (val > RXRPC_SECURITY_MAX)
+		return -EINVAL;
+	lock_sock(sk);
+	rxrpc_sk(sk)->min_sec_level = val;
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(rxrpc_sock_set_min_security_level);
+
 /*
  * set RxRPC socket options
  */
-- 
2.26.2
[PATCH 26/28] ipv6: add ip6_sock_set_recvpktinfo
Add a helper to directly set the IPV6_RECVPKTINFO sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/ipv6.h   | 7 +++
 net/sunrpc/svcsock.c | 10 ++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 9a90759830162..5e65bf2fd32d0 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1262,4 +1262,11 @@ static inline int ip6_sock_set_addr_preferences(struct sock *sk, bool val)
 	return ret;
 }
 
+static inline void ip6_sock_set_recvpktinfo(struct sock *sk)
+{
+	lock_sock(sk);
+	inet6_sk(sk)->rxopt.bits.rxinfo = true;
+	release_sock(sk);
+}
+
 #endif /* _NET_IPV6_H */
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a391892977cd2..e7a0037d9b56c 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -595,8 +595,6 @@ static struct svc_xprt_class svc_udp_class = {
 
 static void svc_udp_init(struct svc_sock *svsk, struct svc_serv *serv)
 {
-	int err, level, optname, one = 1;
-
 	svc_xprt_init(sock_net(svsk->sk_sock->sk), &svc_udp_class,
 		      &svsk->sk_xprt, serv);
 	clear_bit(XPT_CACHE_AUTH, &svsk->sk_xprt.xpt_flags);
@@ -617,17 +615,13 @@ static void svc_udp_init(struct svc_sock *svsk, struct svc_serv *serv)
 	switch (svsk->sk_sk->sk_family) {
 	case AF_INET:
 		ip_sock_set_pktinfo(svsk->sk_sock->sk);
-		return;
+		break;
 	case AF_INET6:
-		level = SOL_IPV6;
-		optname = IPV6_RECVPKTINFO;
+		ip6_sock_set_recvpktinfo(svsk->sk_sock->sk);
 		break;
 	default:
 		BUG();
 	}
-	err = kernel_setsockopt(svsk->sk_sock, level, optname,
-					(char *)&one, sizeof(one));
-	dprintk("svc: kernel_setsockopt returned %d\n", err);
 }
 
 /*
-- 
2.26.2
[PATCH 28/28] tipc: call tsk_set_importance from tipc_topsrv_create_listener
Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can
just use the internal helper. The only change needed is to pass a struct
sock instead of tipc_sock, which is private to socket.c

Signed-off-by: Christoph Hellwig
---
 net/tipc/socket.c | 18 +-
 net/tipc/socket.h | 2 ++
 net/tipc/topsrv.c | 6 +++---
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index d6b67d07d22ec..3734cdbedc9cc 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -196,17 +196,17 @@ static int tsk_importance(struct tipc_sock *tsk)
 	return msg_importance(&tsk->phdr);
 }
 
-static int tsk_set_importance(struct tipc_sock *tsk, int imp)
+static struct tipc_sock *tipc_sk(const struct sock *sk)
 {
-	if (imp > TIPC_CRITICAL_IMPORTANCE)
-		return -EINVAL;
-	msg_set_importance(&tsk->phdr, (u32)imp);
-	return 0;
+	return container_of(sk, struct tipc_sock, sk);
 }
 
-static struct tipc_sock *tipc_sk(const struct sock *sk)
+int tsk_set_importance(struct sock *sk, int imp)
 {
-	return container_of(sk, struct tipc_sock, sk);
+	if (imp > TIPC_CRITICAL_IMPORTANCE)
+		return -EINVAL;
+	msg_set_importance(&tipc_sk(sk)->phdr, (u32)imp);
+	return 0;
 }
 
 static bool tsk_conn_cong(struct tipc_sock *tsk)
@@ -2721,7 +2721,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags,
 	/* Connect new socket to it's peer */
 	tipc_sk_finish_conn(new_tsock, msg_origport(msg), msg_orignode(msg));
 
-	tsk_set_importance(new_tsock, msg_importance(msg));
+	tsk_set_importance(new_sk, msg_importance(msg));
 	if (msg_named(msg)) {
 		new_tsock->conn_type = msg_nametype(msg);
 		new_tsock->conn_instance = msg_nameinst(msg);
@@ -3139,7 +3139,7 @@ static int tipc_setsockopt(struct socket *sock, int lvl, int opt,
 
 	switch (opt) {
 	case TIPC_IMPORTANCE:
-		res = tsk_set_importance(tsk, value);
+		res = tsk_set_importance(sk, value);
 		break;
 	case TIPC_SRC_DROPPABLE:
 		if (sock->type != SOCK_STREAM)
diff --git a/net/tipc/socket.h b/net/tipc/socket.h
index 235b9679acee4..b11575afc66fe 100644
--- a/net/tipc/socket.h
+++ b/net/tipc/socket.h
@@ -75,4 +75,6 @@ u32 tipc_sock_get_portid(struct sock *sk);
 bool tipc_sk_overlimit1(struct sock *sk, struct sk_buff *skb);
 bool tipc_sk_overlimit2(struct sock *sk, struct sk_buff *skb);
 
+int tsk_set_importance(struct sock *sk, int imp);
+
 #endif
diff --git a/net/tipc/topsrv.c b/net/tipc/topsrv.c
index 446af7bbd13e6..1489cfb941d8e 100644
--- a/net/tipc/topsrv.c
+++ b/net/tipc/topsrv.c
@@ -497,7 +497,6 @@ static void tipc_topsrv_listener_data_ready(struct sock *sk)
 
 static int tipc_topsrv_create_listener(struct tipc_topsrv *srv)
 {
-	int imp = TIPC_CRITICAL_IMPORTANCE;
 	struct socket *lsock = NULL;
 	struct sockaddr_tipc saddr;
 	struct sock *sk;
@@ -514,8 +513,9 @@ static int tipc_topsrv_create_listener(struct tipc_topsrv *srv)
 	sk->sk_user_data = srv;
 	write_unlock_bh(&sk->sk_callback_lock);
 
-	rc = kernel_setsockopt(lsock, SOL_TIPC, TIPC_IMPORTANCE,
-			       (char *)&imp, sizeof(imp));
+	lock_sock(sk);
+	rc = tsk_set_importance(sk, TIPC_CRITICAL_IMPORTANCE);
+	release_sock(sk);
 	if (rc < 0)
 		goto err;
 
-- 
2.26.2
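The reason tsk_set_importance() can take a plain struct sock and still reach the TIPC-private state is the container_of() pattern behind tipc_sk(): the generic socket is embedded in the protocol socket, so a pointer to the member can be converted back to the enclosing structure. A userspace sketch of that idea follows; every model_* name is hypothetical, and MODEL_CRITICAL_IMPORTANCE is assumed to mirror TIPC_CRITICAL_IMPORTANCE (3).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() (sketch). */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define MODEL_CRITICAL_IMPORTANCE 3	/* mirrors TIPC_CRITICAL_IMPORTANCE */

/* Hypothetical generic socket. */
struct model_sock {
	int sk_state;
};

/* Hypothetical protocol socket that embeds the generic one. */
struct model_tipc_sock {
	struct model_sock sk;
	unsigned int importance;
};

/* Recover the enclosing protocol socket from the embedded member. */
struct model_tipc_sock *model_tipc_sk(struct model_sock *sk)
{
	return model_container_of(sk, struct model_tipc_sock, sk);
}

/* Same interface shape as the reworked helper: takes the generic socket. */
int model_set_importance(struct model_sock *sk, int imp)
{
	assert(sk != NULL);
	if (imp > MODEL_CRITICAL_IMPORTANCE)
		return -EINVAL;
	model_tipc_sk(sk)->importance = (unsigned int)imp;
	return 0;
}
```

This keeps struct tipc_sock private to socket.c while letting topsrv.c call the helper with the struct sock it already holds.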
[PATCH 24/28] ipv6: add ip6_sock_set_recverr
Add a helper to directly set the IPV6_RECVERR sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Reviewed-by: David Howells
---
 include/net/ipv6.h       | 7 +++
 net/rxrpc/local_object.c | 10 ++
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 9b91188c9a74c..49c4abf991489 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1188,4 +1188,11 @@ static inline int ip6_sock_set_v6only(struct sock *sk)
 	return 0;
 }
 
+static inline void ip6_sock_set_recverr(struct sock *sk)
+{
+	lock_sock(sk);
+	inet6_sk(sk)->recverr = true;
+	release_sock(sk);
+}
+
 #endif /* _NET_IPV6_H */
diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c
index 6f4e6b4817cf2..c8b2097f499c0 100644
--- a/net/rxrpc/local_object.c
+++ b/net/rxrpc/local_object.c
@@ -107,7 +107,7 @@ static struct rxrpc_local *rxrpc_alloc_local(struct rxrpc_net *rxnet,
 static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 {
 	struct sock *usk;
-	int ret, opt;
+	int ret;
 
 	_enter("%p{%d,%d}",
 	       local, local->srx.transport_type, local->srx.transport.family);
@@ -157,13 +157,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net)
 	switch (local->srx.transport.family) {
 	case AF_INET6:
 		/* we want to receive ICMPv6 errors */
-		opt = 1;
-		ret = kernel_setsockopt(local->socket, SOL_IPV6, IPV6_RECVERR,
-					(char *) &opt, sizeof(opt));
-		if (ret < 0) {
-			_debug("setsockopt failed");
-			goto error;
-		}
+		ip6_sock_set_recverr(local->socket->sk);
 
 		/* Fall through and set IPv4 options too otherwise we don't get
 		 * errors from IPv4 packets sent through the IPv6 socket.
-- 
2.26.2
[PATCH 17/28] tcp: add tcp_sock_set_keepcnt
Add a helper to directly set the TCP_KEEPCNT sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/linux/tcp.h   | 1 +
 net/ipv4/tcp.c        | 12
 net/rds/tcp.h         | 2 +-
 net/rds/tcp_listen.c  | 17 +++--
 net/sunrpc/xprtsock.c | 3 +--
 5 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 1f9bada00faab..9aac824c523cf 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -498,6 +498,7 @@ int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount,
 		  int shiftlen);
 
 void tcp_sock_set_cork(struct sock *sk, bool on);
+int tcp_sock_set_keepcnt(struct sock *sk, int val);
 int tcp_sock_set_keepidle(struct sock *sk, int val);
 int tcp_sock_set_keepintvl(struct sock *sk, int val);
 void tcp_sock_set_nodelay(struct sock *sk);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7eb083e09786a..15d47d5e79510 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2946,6 +2946,18 @@ int tcp_sock_set_keepintvl(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(tcp_sock_set_keepintvl);
 
+int tcp_sock_set_keepcnt(struct sock *sk, int val)
+{
+	if (val < 1 || val > MAX_TCP_KEEPCNT)
+		return -EINVAL;
+
+	lock_sock(sk);
+	tcp_sk(sk)->keepalive_probes = val;
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(tcp_sock_set_keepcnt);
+
 /*
  * Socket option code for TCP.
  */
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index f6d75d8cb167a..bad9cf49d5657 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -70,7 +70,7 @@ struct socket *rds_tcp_listen_init(struct net *net, bool isv6);
 void rds_tcp_listen_stop(struct socket *sock, struct work_struct *acceptor);
 void rds_tcp_listen_data_ready(struct sock *sk);
 int rds_tcp_accept_one(struct socket *sock);
-int rds_tcp_keepalive(struct socket *sock);
+void rds_tcp_keepalive(struct socket *sock);
 void *rds_tcp_listen_sock_def_readable(struct net *net);
 
 /* tcp_recv.c */
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 9ad555c48d15d..101cf14215a0b 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -38,27 +38,19 @@
 #include "rds.h"
 #include "tcp.h"
 
-int rds_tcp_keepalive(struct socket *sock)
+void rds_tcp_keepalive(struct socket *sock)
 {
 	/* values below based on xs_udp_default_timeout */
 	int keepidle = 5; /* send a probe 'keepidle' secs after last data */
 	int keepcnt = 5; /* number of unack'ed probes before declaring dead */
-	int ret = 0;
 
 	sock_set_keepalive(sock->sk);
-
-	ret = kernel_setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT,
-				(char *)&keepcnt, sizeof(keepcnt));
-	if (ret < 0)
-		goto bail;
-
+	tcp_sock_set_keepcnt(sock->sk, keepcnt);
 	tcp_sock_set_keepidle(sock->sk, keepidle);
 	/* KEEPINTVL is the interval between successive probes. We follow
 	 * the model in xs_tcp_finish_connecting() and re-use keepidle.
 	 */
 	tcp_sock_set_keepintvl(sock->sk, keepidle);
-bail:
-	return ret;
 }
 
 /* rds_tcp_accept_one_path(): if accepting on cp_index > 0, make sure the
@@ -140,10 +132,7 @@ int rds_tcp_accept_one(struct socket *sock)
 	new_sock->ops = sock->ops;
 	__module_get(new_sock->ops->owner);
 
-	ret = rds_tcp_keepalive(new_sock);
-	if (ret < 0)
-		goto out;
-
+	rds_tcp_keepalive(new_sock);
 	rds_tcp_tune(new_sock);
 
 	inet = inet_sk(new_sock->sk);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 5ca64e12af0c5..0d3ec055bc12f 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2109,8 +2109,7 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
 	sock_set_keepalive(sock->sk);
 	tcp_sock_set_keepidle(sock->sk, keepidle);
 	tcp_sock_set_keepintvl(sock->sk, keepidle);
-	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPCNT,
-			(char *)&keepcnt, sizeof(keepcnt));
+	tcp_sock_set_keepcnt(sock->sk, keepcnt);
 
 	/* TCP user timeout (see RFC5482) */
 	tcp_sock_set_user_timeout(sock->sk, timeo);
-- 
2.26.2
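With this patch, rds_tcp_keepalive() configures keepalive entirely through typed helpers. For comparison, the equivalent userspace sequence is sketched below. It is only an illustration with hypothetical function names, and it assumes a Linux system where <netinet/tcp.h> provides TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable keepalive and set idle time, probe interval (both in seconds)
 * and probe count on a TCP socket. Returns 0 on success, -1 on failure.
 */
int configure_keepalive(int fd, int idle, int intvl, int cnt)
{
	int on = 1;

	assert(fd >= 0);
	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl, sizeof(intvl)) < 0)
		return -1;
	if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0)
		return -1;
	return 0;
}

/* Read back one integer TCP-level option; returns -1 on error. */
int get_tcp_int_opt(int fd, int optname)
{
	int val = 0;
	socklen_t len = sizeof(val);

	if (getsockopt(fd, IPPROTO_TCP, optname, &val, &len) < 0)
		return -1;
	return val;
}
```

A caller mirroring the values in rds_tcp_keepalive() would use configure_keepalive(fd, 5, 5, 5).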
[PATCH 25/28] ipv6: add ip6_sock_set_addr_preferences
Add a helper to directly set the IPV6_ADDR_PREFERENCES sockopt from kernel
space without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/ipv6.h       | 67
 net/ipv6/ipv6_sockglue.c | 59 +--
 net/sunrpc/xprtsock.c    | 7 +++--
 3 files changed, 72 insertions(+), 61 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 49c4abf991489..9a90759830162 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1195,4 +1195,71 @@ static inline void ip6_sock_set_recverr(struct sock *sk)
 	release_sock(sk);
 }
 
+static inline int __ip6_sock_set_addr_preferences(struct sock *sk, int val)
+{
+	unsigned int pref = 0;
+	unsigned int prefmask = ~0;
+
+	/* check PUBLIC/TMP/PUBTMP_DEFAULT conflicts */
+	switch (val & (IPV6_PREFER_SRC_PUBLIC |
+		       IPV6_PREFER_SRC_TMP |
+		       IPV6_PREFER_SRC_PUBTMP_DEFAULT)) {
+	case IPV6_PREFER_SRC_PUBLIC:
+		pref |= IPV6_PREFER_SRC_PUBLIC;
+		prefmask &= ~(IPV6_PREFER_SRC_PUBLIC |
+			      IPV6_PREFER_SRC_TMP);
+		break;
+	case IPV6_PREFER_SRC_TMP:
+		pref |= IPV6_PREFER_SRC_TMP;
+		prefmask &= ~(IPV6_PREFER_SRC_PUBLIC |
+			      IPV6_PREFER_SRC_TMP);
+		break;
+	case IPV6_PREFER_SRC_PUBTMP_DEFAULT:
+		prefmask &= ~(IPV6_PREFER_SRC_PUBLIC |
+			      IPV6_PREFER_SRC_TMP);
+		break;
+	case 0:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* check HOME/COA conflicts */
+	switch (val & (IPV6_PREFER_SRC_HOME | IPV6_PREFER_SRC_COA)) {
+	case IPV6_PREFER_SRC_HOME:
+		prefmask &= ~IPV6_PREFER_SRC_COA;
+		break;
+	case IPV6_PREFER_SRC_COA:
+		pref |= IPV6_PREFER_SRC_COA;
+		break;
+	case 0:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* check CGA/NONCGA conflicts */
+	switch (val & (IPV6_PREFER_SRC_CGA|IPV6_PREFER_SRC_NONCGA)) {
+	case IPV6_PREFER_SRC_CGA:
+	case IPV6_PREFER_SRC_NONCGA:
+	case 0:
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	inet6_sk(sk)->srcprefs = (inet6_sk(sk)->srcprefs & prefmask) | pref;
+	return 0;
+}
+
+static inline int ip6_sock_set_addr_preferences(struct sock *sk, int val)
+{
+	int ret;
+
+	lock_sock(sk);
+	ret = __ip6_sock_set_addr_preferences(sk, val);
+	release_sock(sk);
+	return ret;
+}
+
 #endif /* _NET_IPV6_H */
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index e10258c2210e8..adbfed6adf11c 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -845,67 +845,10 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_ADDR_PREFERENCES:
-	    {
-		unsigned int pref = 0;
-		unsigned int prefmask = ~0;
-
 		if (optlen < sizeof(int))
 			goto e_inval;
-
-		retv = -EINVAL;
-
-		/* check PUBLIC/TMP/PUBTMP_DEFAULT conflicts */
-		switch (val & (IPV6_PREFER_SRC_PUBLIC|
-			       IPV6_PREFER_SRC_TMP|
-			       IPV6_PREFER_SRC_PUBTMP_DEFAULT)) {
-		case IPV6_PREFER_SRC_PUBLIC:
-			pref |= IPV6_PREFER_SRC_PUBLIC;
-			break;
-		case IPV6_PREFER_SRC_TMP:
-			pref |= IPV6_PREFER_SRC_TMP;
-			break;
-		case IPV6_PREFER_SRC_PUBTMP_DEFAULT:
-			break;
-		case 0:
-			goto pref_skip_pubtmp;
-		default:
-			goto e_inval;
-		}
-
-		prefmask &= ~(IPV6_PREFER_SRC_PUBLIC|
-			      IPV6_PREFER_SRC_TMP);
-pref_skip_pubtmp:
-
-		/* check HOME/COA conflicts */
-		switch (val & (IPV6_PREFER_SRC_HOME|IPV6_PREFER_SRC_COA)) {
-		case IPV6_PREFER_SRC_HOME:
-			break;
-		case IPV6_PREFER_SRC_COA:
-			pref |= IPV6_PREFER_SRC_COA;
-		case 0:
-			goto pref_skip_coa;
-		default:
-			goto e_inval;
-		}
-
-		prefmask &= ~IPV6_PREFER_SRC_COA;
-pref_skip_coa:
-
-		/* check CGA/NONCGA conflicts */
-		switch (val & (IPV6_PREFER_SRC_CGA|IPV6_PREFER_SRC_NONCGA)) {
-		case IPV6_PREFER_SRC_CGA:
-		case IPV6_PREFER_SRC_NONCGA:
-		case 0:
-			break;
-		default:
-			goto e_inval;
-
[PATCH 18/28] ipv4: add ip_sock_set_tos
Add a helper to directly set the IP_TOS sockopt from kernel space without
going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Acked-by: Sagi Grimberg
---
 drivers/nvme/host/tcp.c   | 14 +++---
 drivers/nvme/target/tcp.c | 10 ++
 include/net/ip.h          | 2 ++
 net/ipv4/ip_sockglue.c    | 30 +-
 4 files changed, 28 insertions(+), 28 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 2872584f52f63..4c972d8abf317 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1313,7 +1313,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl,
 {
 	struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
 	struct nvme_tcp_queue *queue = &ctrl->queues[qid];
-	int ret, opt, rcv_pdu_size;
+	int ret, rcv_pdu_size;
 
 	queue->ctrl = ctrl;
 	INIT_LIST_HEAD(&queue->send_list);
@@ -1352,16 +1352,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl,
 		sock_set_priority(queue->sock->sk, so_priority);
 
 	/* Set socket type of service */
-	if (nctrl->opts->tos >= 0) {
-		opt = nctrl->opts->tos;
-		ret = kernel_setsockopt(queue->sock, SOL_IP, IP_TOS,
-				(char *)&opt, sizeof(opt));
-		if (ret) {
-			dev_err(nctrl->device,
-				"failed to set IP_TOS sock opt %d\n", ret);
-			goto err_sock;
-		}
-	}
+	if (nctrl->opts->tos >= 0)
+		ip_sock_set_tos(queue->sock->sk, nctrl->opts->tos);
 
 	queue->sock->sk->sk_allocation = GFP_ATOMIC;
 	nvme_tcp_set_queue_io_cpu(queue);
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 55bc4c3c0a74a..4546049a96b37 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1452,14 +1452,8 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
 		sock_set_priority(sock->sk, so_priority);
 
 	/* Set socket type of service */
-	if (inet->rcv_tos > 0) {
-		int tos = inet->rcv_tos;
-
-		ret = kernel_setsockopt(sock, SOL_IP, IP_TOS,
-				(char *)&tos, sizeof(tos));
-		if (ret)
-			return ret;
-	}
+	if (inet->rcv_tos > 0)
+		ip_sock_set_tos(sock->sk, inet->rcv_tos);
 
 	write_lock_bh(&sock->sk->sk_callback_lock);
 	sock->sk->sk_user_data = queue;
diff --git a/include/net/ip.h b/include/net/ip.h
index 5b317c9f4470a..2fc52e26fa88b 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -765,4 +765,6 @@ static inline bool inetdev_valid_mtu(unsigned int mtu)
 	return likely(mtu >= IPV4_MIN_MTU);
 }
 
+void ip_sock_set_tos(struct sock *sk, int val);
+
 #endif	/* _IP_H */
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index f43d5f12aa86a..b43a29e11f4a5 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -560,6 +560,26 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 	return err;
 }
 
+static void __ip_sock_set_tos(struct sock *sk, int val)
+{
+	if (sk->sk_type == SOCK_STREAM) {
+		val &= ~INET_ECN_MASK;
+		val |= inet_sk(sk)->tos & INET_ECN_MASK;
+	}
+	if (inet_sk(sk)->tos != val) {
+		inet_sk(sk)->tos = val;
+		sk->sk_priority = rt_tos2priority(val);
+		sk_dst_reset(sk);
+	}
+}
+
+void ip_sock_set_tos(struct sock *sk, int val)
+{
+	lock_sock(sk);
+	__ip_sock_set_tos(sk, val);
+	release_sock(sk);
+}
+EXPORT_SYMBOL(ip_sock_set_tos);
 /*
  * Socket option code for IP. This is the end of the line after any
  * TCP,UDP etc options on an IP socket.
@@ -823,15 +843,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 			inet->cmsg_flags &= ~IP_CMSG_RECVFRAGSIZE;
 		break;
 	case IP_TOS:	/* This sets both TOS and Precedence */
-		if (sk->sk_type == SOCK_STREAM) {
-			val &= ~INET_ECN_MASK;
-			val |= inet->tos & INET_ECN_MASK;
-		}
-		if (inet->tos != val) {
-			inet->tos = val;
-			sk->sk_priority = rt_tos2priority(val);
-			sk_dst_reset(sk);
-		}
+		__ip_sock_set_tos(sk, val);
 		break;
 	case IP_TTL:
 		if (optlen < 1)
-- 
2.26.2
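The bit manipulation in __ip_sock_set_tos() is easy to misread: for stream sockets the two ECN bits of the requested value are discarded and the socket's current ECN bits are kept, so a caller can change the DSCP part of the TOS byte but not the ECN part. A pure-function sketch of that masking follows. The function name is hypothetical, and MODEL_ECN_MASK is assumed to mirror the kernel's INET_ECN_MASK (the low two bits).

```c
#include <assert.h>

#define MODEL_ECN_MASK 3	/* mirrors INET_ECN_MASK: low two bits */

/*
 * Compute the TOS value a stream socket ends up with: upper six bits
 * from the request, ECN bits from the current setting.
 */
int model_stream_tos(int current_tos, int requested_tos)
{
	assert(current_tos >= 0 && current_tos <= 0xff);
	requested_tos &= ~MODEL_ECN_MASK;		/* drop requested ECN bits */
	requested_tos |= current_tos & MODEL_ECN_MASK;	/* keep current ECN bits */
	return requested_tos;
}
```

For example, requesting 0x11 on a socket whose current TOS is 0x02 yields 0x12: the 0x10 comes from the request and the 0x02 ECN bits come from the socket.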
[PATCH 23/28] ipv6: add ip6_sock_set_v6only
Add a helper to directly set the IPV6_V6ONLY sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/net/ipv6.h        | 11 +++++++++++
 net/ipv6/ip6_udp_tunnel.c |  5 +----
 net/sunrpc/svcsock.c      |  6 +-----
 3 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 39a00d3ef5e22..9b91188c9a74c 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1177,4 +1177,15 @@ int ipv6_sock_mc_join_ssm(struct sock *sk, int ifindex,
 			  const struct in6_addr *addr, unsigned int mode);
 int ipv6_sock_mc_drop(struct sock *sk, int ifindex,
 		      const struct in6_addr *addr);
+
+static inline int ip6_sock_set_v6only(struct sock *sk)
+{
+	if (inet_sk(sk)->inet_num)
+		return -EINVAL;
+	lock_sock(sk);
+	sk->sk_ipv6only = true;
+	release_sock(sk);
+	return 0;
+}
+
 #endif /* _NET_IPV6_H */
diff --git a/net/ipv6/ip6_udp_tunnel.c b/net/ipv6/ip6_udp_tunnel.c
index 6523609516d25..2e0ad1bc84a83 100644
--- a/net/ipv6/ip6_udp_tunnel.c
+++ b/net/ipv6/ip6_udp_tunnel.c
@@ -25,10 +25,7 @@ int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
 		goto error;
 
 	if (cfg->ipv6_v6only) {
-		int val = 1;
-
-		err = kernel_setsockopt(sock, IPPROTO_IPV6, IPV6_V6ONLY,
-					(char *) &val, sizeof(val));
+		err = ip6_sock_set_v6only(sock->sk);
 		if (err < 0)
 			goto error;
 	}
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 7a805d165689c..a391892977cd2 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1328,7 +1328,6 @@ static struct svc_xprt *svc_create_socket(struct svc_serv *serv,
 	struct sockaddr *newsin = (struct sockaddr *)&addr;
 	int		newlen;
 	int		family;
-	int		val;
 	RPC_IFDEBUG(char buf[RPC_MAX_ADDRBUFLEN]);
 
 	dprintk("svc: svc_create_socket(%s, %d, %s)\n",
@@ -1364,11 +1363,8 @@ static struct svc_xprt *svc_create_socket(struct svc_serv *serv,
 	 * getting requests from IPv4 remotes.  Those should
 	 * be shunted to a PF_INET listener via rpcbind.
 	 */
-	val = 1;
 	if (family == PF_INET6)
-		kernel_setsockopt(sock, SOL_IPV6, IPV6_V6ONLY,
-					(char *)&val, sizeof(val));
-
+		ip6_sock_set_v6only(sock->sk);
 	if (type == SOCK_STREAM)
 		sock->sk->sk_reuse = SK_CAN_REUSE; /* allow address reuse */
 	error = kernel_bind(sock, sin, len);
-- 
2.26.2
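For readers skimming the helper: ip6_sock_set_v6only() refuses to change the flag once the socket has a local port (inet_num is non-zero), which is why both callers invoke it before binding. Below is a small userspace sketch of that rule only. The struct sketch_sock type and its field names are hypothetical stand-ins, not kernel types, and locking is left out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two socket fields the helper looks at. */
struct sketch_sock {
	unsigned short bound_port;	/* plays the role of inet_num */
	bool v6only;			/* plays the role of sk_ipv6only */
};

/* Set the v6only flag, but only while the socket is still unbound. */
static int sketch_set_v6only(struct sketch_sock *sk)
{
	if (sk->bound_port)
		return -EINVAL;
	sk->v6only = true;
	return 0;
}
```

Note that the udp tunnel caller checks the return value while the sunrpc caller ignores it, matching what each did with kernel_setsockopt() before.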
[PATCH 15/28] tcp: add tcp_sock_set_keepidle
Add a helper to directly set the TCP_KEEPIDLE sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
---
 include/linux/tcp.h   |  1 +
 net/ipv4/tcp.c        | 49 ++++++++++++++++++++++++++++++++++---------------
 net/rds/tcp_listen.c  |  5 +----
 net/sunrpc/xprtsock.c |  3 +--
 4 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index de682143efe4d..5724dd84a85ed 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -498,6 +498,7 @@ int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount,
 		  int shiftlen);
 
 void tcp_sock_set_cork(struct sock *sk, bool on);
+int tcp_sock_set_keepidle(struct sock *sk, int val);
 void tcp_sock_set_nodelay(struct sock *sk);
 void tcp_sock_set_quickack(struct sock *sk, int val);
 int tcp_sock_set_syncnt(struct sock *sk, int val);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 0004bd9ae7b0a..bdf0ff9333514 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2901,6 +2901,39 @@ void tcp_sock_set_user_timeout(struct sock *sk, u32 val)
 }
 EXPORT_SYMBOL(tcp_sock_set_user_timeout);
 
+static int __tcp_sock_set_keepidle(struct sock *sk, int val)
+{
+	struct tcp_sock *tp = tcp_sk(sk);
+
+	if (val < 1 || val > MAX_TCP_KEEPIDLE)
+		return -EINVAL;
+
+	tp->keepalive_time = val * HZ;
+	if (sock_flag(sk, SOCK_KEEPOPEN) &&
+	    !((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN))) {
+		u32 elapsed = keepalive_time_elapsed(tp);
+
+		if (tp->keepalive_time > elapsed)
+			elapsed = tp->keepalive_time - elapsed;
+		else
+			elapsed = 0;
+		inet_csk_reset_keepalive_timer(sk, elapsed);
+	}
+
+	return 0;
+}
+
+int tcp_sock_set_keepidle(struct sock *sk, int val)
+{
+	int err;
+
+	lock_sock(sk);
+	err = __tcp_sock_set_keepidle(sk, val);
+	release_sock(sk);
+	return err;
+}
+EXPORT_SYMBOL(tcp_sock_set_keepidle);
+
 /*
  *	Socket option code for TCP.
  */
@@ -3070,21 +3103,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		break;
 
 	case TCP_KEEPIDLE:
-		if (val < 1 || val > MAX_TCP_KEEPIDLE)
-			err = -EINVAL;
-		else {
-			tp->keepalive_time = val * HZ;
-			if (sock_flag(sk, SOCK_KEEPOPEN) &&
-			    !((1 << sk->sk_state) &
-			      (TCPF_CLOSE | TCPF_LISTEN))) {
-				u32 elapsed = keepalive_time_elapsed(tp);
-				if (tp->keepalive_time > elapsed)
-					elapsed = tp->keepalive_time - elapsed;
-				else
-					elapsed = 0;
-				inet_csk_reset_keepalive_timer(sk, elapsed);
-			}
-		}
+		err = __tcp_sock_set_keepidle(sk, val);
 		break;
 	case TCP_KEEPINTVL:
 		if (val < 1 || val > MAX_TCP_KEEPINTVL)
diff --git a/net/rds/tcp_listen.c b/net/rds/tcp_listen.c
index 6f90ea077adcd..79f9adc008114 100644
--- a/net/rds/tcp_listen.c
+++ b/net/rds/tcp_listen.c
@@ -52,10 +52,7 @@ int rds_tcp_keepalive(struct socket *sock)
 	if (ret < 0)
 		goto bail;
 
-	ret = kernel_setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
-				(char *)&keepidle, sizeof(keepidle));
-	if (ret < 0)
-		goto bail;
+	tcp_sock_set_keepidle(sock->sk, keepidle);
 
 	/* KEEPINTVL is the interval between successive probes. We follow
 	 * the model in xs_tcp_finish_connecting() and re-use keepidle.
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 231fd6162f68d..473290f7c5c0a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2107,8 +2107,7 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt,
 
 	/* TCP Keepalive options */
 	sock_set_keepalive(sock->sk);
-	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPIDLE,
-			(char *)&keepidle, sizeof(keepidle));
+	tcp_sock_set_keepidle(sock->sk, keepidle);
 	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPINTVL,
 			(char *)&keepidle, sizeof(keepidle));
 	kernel_setsockopt(sock, SOL_TCP, TCP_KEEPCNT,
-- 
2.26.2
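The two decisions inside __tcp_sock_set_keepidle() are the range check on the requested idle time and the computation of how long the keepalive timer still has to run. Here is a userspace sketch of just those two steps. The function names are invented for illustration, and the upper bound 32767 is assumed to match MAX_TCP_KEEPIDLE; units are whatever the caller uses (seconds for the check, ticks for the remainder).

```c
#include <errno.h>
#include <stdint.h>

/* Assumption: same bound as the kernel's MAX_TCP_KEEPIDLE. */
#define SKETCH_MAX_KEEPIDLE 32767

/* Reject idle times outside 1..SKETCH_MAX_KEEPIDLE seconds. */
static int sketch_keepidle_check(int secs)
{
	if (secs < 1 || secs > SKETCH_MAX_KEEPIDLE)
		return -EINVAL;
	return 0;
}

/*
 * Time left before the first keepalive probe: the new idle time minus
 * what has already elapsed, clamped at zero so the timer fires at once
 * when the connection has already been idle for longer than that.
 */
static uint32_t sketch_keepalive_remaining(uint32_t keepalive_time, uint32_t elapsed)
{
	if (keepalive_time > elapsed)
		return keepalive_time - elapsed;
	return 0;
}
```

In the patch the timer is only re-armed when SOCK_KEEPOPEN is set and the socket is neither closed nor listening; the sketch does not model that condition.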
[PATCH 14/28] tcp: add tcp_sock_set_user_timeout
Add a helper to directly set the TCP_USER_TIMEOUT sockopt from kernel space without going through a fake uaccess. Signed-off-by: Christoph Hellwig --- fs/ocfs2/cluster/tcp.c | 22 ++ include/linux/tcp.h| 1 + net/ipv4/tcp.c | 8 net/sunrpc/xprtsock.c | 3 +-- 4 files changed, 12 insertions(+), 22 deletions(-) diff --git a/fs/ocfs2/cluster/tcp.c b/fs/ocfs2/cluster/tcp.c index 4c70fe9d19ab2..79a2317194600 100644 --- a/fs/ocfs2/cluster/tcp.c +++ b/fs/ocfs2/cluster/tcp.c @@ -1441,14 +1441,6 @@ static void o2net_rx_until_empty(struct work_struct *work) sc_put(sc); } -static int o2net_set_usertimeout(struct socket *sock) -{ - int user_timeout = O2NET_TCP_USER_TIMEOUT; - - return kernel_setsockopt(sock, SOL_TCP, TCP_USER_TIMEOUT, - (void *)_timeout, sizeof(user_timeout)); -} - static void o2net_initialize_handshake(void) { o2net_hand->o2hb_heartbeat_timeout_ms = cpu_to_be32( @@ -1629,12 +1621,7 @@ static void o2net_start_connect(struct work_struct *work) } tcp_sock_set_nodelay(sc->sc_sock->sk); - - ret = o2net_set_usertimeout(sock); - if (ret) { - mlog(ML_ERROR, "set TCP_USER_TIMEOUT failed with %d\n", ret); - goto out; - } + tcp_sock_set_user_timeout(sock->sk, O2NET_TCP_USER_TIMEOUT); o2net_register_callbacks(sc->sc_sock->sk, sc); @@ -1821,12 +1808,7 @@ static int o2net_accept_one(struct socket *sock, int *more) new_sock->sk->sk_allocation = GFP_ATOMIC; tcp_sock_set_nodelay(new_sock->sk); - - ret = o2net_set_usertimeout(new_sock); - if (ret) { - mlog(ML_ERROR, "set TCP_USER_TIMEOUT failed with %d\n", ret); - goto out; - } + tcp_sock_set_user_timeout(new_sock->sk, O2NET_TCP_USER_TIMEOUT); ret = new_sock->ops->getname(new_sock, (struct sockaddr *) , 1); if (ret < 0) diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 6aa4ae5ebf3d5..de682143efe4d 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -501,5 +501,6 @@ void tcp_sock_set_cork(struct sock *sk, bool on); void tcp_sock_set_nodelay(struct sock *sk); void tcp_sock_set_quickack(struct sock *sk, int val); 
int tcp_sock_set_syncnt(struct sock *sk, int val); +void tcp_sock_set_user_timeout(struct sock *sk, u32 val); #endif /* _LINUX_TCP_H */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index d2c67ae1da07a..0004bd9ae7b0a 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2893,6 +2893,14 @@ int tcp_sock_set_syncnt(struct sock *sk, int val) } EXPORT_SYMBOL(tcp_sock_set_syncnt); +void tcp_sock_set_user_timeout(struct sock *sk, u32 val) +{ + lock_sock(sk); + inet_csk(sk)->icsk_user_timeout = val; + release_sock(sk); +} +EXPORT_SYMBOL(tcp_sock_set_user_timeout); + /* * Socket option code for TCP. */ diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index 399848c2bcb29..231fd6162f68d 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -2115,8 +2115,7 @@ static void xs_tcp_set_socket_timeouts(struct rpc_xprt *xprt, (char *), sizeof(keepcnt)); /* TCP user timeout (see RFC5482) */ - kernel_setsockopt(sock, SOL_TCP, TCP_USER_TIMEOUT, - (char *), sizeof(timeo)); + tcp_sock_set_user_timeout(sock->sk, timeo); } static void xs_tcp_set_connect_timeout(struct rpc_xprt *xprt, -- 2.26.2
[PATCH v9 1/2] dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC
From: Ramuthevar Vadivel Murugan Add YAML file for dt-bindings to support NAND Flash Controller on Intel's Lightning Mountain SoC. Signed-off-by: Ramuthevar Vadivel Murugan --- .../devicetree/bindings/mtd/intel,lgm-nand.yaml| 93 ++ 1 file changed, 93 insertions(+) create mode 100644 Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml diff --git a/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml b/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml new file mode 100644 index ..8672d03b4e6a --- /dev/null +++ b/Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml @@ -0,0 +1,93 @@ +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/mtd/intel,lgm-nand.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Intel LGM SoC NAND Controller Device Tree Bindings + +allOf: + - $ref: "nand-controller.yaml" + +maintainers: + - Ramuthevar Vadivel Murugan + +properties: + compatible: +const: intel,lgm-nand + + reg: +maxItems: 6 + + reg-names: +items: + - const: ebunand + - const: hsnand + - const: nand_cs0 + - const: nand_cs1 + - const: addr_sel0 + - const: addr_sel1 + + clocks: +maxItems: 1 + + dmas: +maxItems: 2 + + dma-names: +items: + - const: tx + - const: rx + +patternProperties: + "^nand@[a-f0-9]+$": +type: object +properties: + reg: +minimum: 0 +maximum: 7 + + nand-ecc-mode: true + + nand-ecc-algo: +const: hw + +additionalProperties: false + +required: + - compatible + - reg + - reg-names + - clocks + - dmas + - dma-names + +additionalProperties: false + +examples: + - | +nand-controller@e0f0 { + compatible = "intel,lgm-nand"; + reg = <0xe0f0 0x100>, +<0xe100 0x300>, +<0xe140 0x8000>, +<0xe1c0 0x1000>, +<0x1740 0x4>, +<0x17c0 0x4>; + reg-names = "ebunand", "hsnand", "nand_cs0", "nand_cs1", +"addr_sel0", "addr_sel1"; + clocks = < 125>; + dmas = < 8>, < 9>; + dma-names = "tx", "rx"; + #address-cells = <1>; + #size-cells = <0>; + + nand@0 { +reg = <0>; +nand-on-flash-bbt; 
+#address-cells = <1>; +#size-cells = <1>; + }; +}; + +... -- 2.11.0
[PATCH 06/28] net: add sock_enable_timestamps
Add a helper to directly enable timestamps instead of setting the SO_TIMESTAMP* sockopts from kernel space and going through a fake uaccess. Signed-off-by: Christoph Hellwig --- include/net/sock.h | 1 + net/core/sock.c | 47 +--- net/rxrpc/local_object.c | 8 +-- 3 files changed, 31 insertions(+), 25 deletions(-) diff --git a/include/net/sock.h b/include/net/sock.h index cdec7bc055d5b..99ef43508d2b5 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2689,6 +2689,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif) void sock_def_readable(struct sock *sk); int sock_bindtoindex(struct sock *sk, int ifindex); +void sock_enable_timestamps(struct sock *sk); void sock_no_linger(struct sock *sk); void sock_set_priority(struct sock *sk, u32 priority); void sock_set_reuseaddr(struct sock *sk); diff --git a/net/core/sock.c b/net/core/sock.c index 23f80880fbb2c..e4a4dd2b3d8b3 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -757,6 +757,28 @@ void sock_set_sndtimeo(struct sock *sk, s64 secs) } EXPORT_SYMBOL(sock_set_sndtimeo); +static void __sock_set_timestamps(struct sock *sk, bool val, bool new, bool ns) +{ + if (val) { + sock_valbool_flag(sk, SOCK_TSTAMP_NEW, new); + sock_valbool_flag(sk, SOCK_RCVTSTAMPNS, ns); + sock_set_flag(sk, SOCK_RCVTSTAMP); + sock_enable_timestamp(sk, SOCK_TIMESTAMP); + } else { + sock_reset_flag(sk, SOCK_RCVTSTAMP); + sock_reset_flag(sk, SOCK_RCVTSTAMPNS); + sock_reset_flag(sk, SOCK_TSTAMP_NEW); + } +} + +void sock_enable_timestamps(struct sock *sk) +{ + lock_sock(sk); + __sock_set_timestamps(sk, true, false, true); + release_sock(sk); +} +EXPORT_SYMBOL(sock_enable_timestamps); + /* * This is meant for all protocols to use and covers goings on * at the socket level. Everything here is generic. 
@@ -948,28 +970,17 @@ int sock_setsockopt(struct socket *sock, int level, int optname, break; case SO_TIMESTAMP_OLD: + __sock_set_timestamps(sk, valbool, false, false); + break; case SO_TIMESTAMP_NEW: + __sock_set_timestamps(sk, valbool, true, false); + break; case SO_TIMESTAMPNS_OLD: + __sock_set_timestamps(sk, valbool, false, true); + break; case SO_TIMESTAMPNS_NEW: - if (valbool) { - if (optname == SO_TIMESTAMP_NEW || optname == SO_TIMESTAMPNS_NEW) - sock_set_flag(sk, SOCK_TSTAMP_NEW); - else - sock_reset_flag(sk, SOCK_TSTAMP_NEW); - - if (optname == SO_TIMESTAMP_OLD || optname == SO_TIMESTAMP_NEW) - sock_reset_flag(sk, SOCK_RCVTSTAMPNS); - else - sock_set_flag(sk, SOCK_RCVTSTAMPNS); - sock_set_flag(sk, SOCK_RCVTSTAMP); - sock_enable_timestamp(sk, SOCK_TIMESTAMP); - } else { - sock_reset_flag(sk, SOCK_RCVTSTAMP); - sock_reset_flag(sk, SOCK_RCVTSTAMPNS); - sock_reset_flag(sk, SOCK_TSTAMP_NEW); - } + __sock_set_timestamps(sk, valbool, true, true); break; - case SO_TIMESTAMPING_NEW: sock_set_flag(sk, SOCK_TSTAMP_NEW); /* fall through */ diff --git a/net/rxrpc/local_object.c b/net/rxrpc/local_object.c index 01135e54d95d2..5ea2bd01fdd59 100644 --- a/net/rxrpc/local_object.c +++ b/net/rxrpc/local_object.c @@ -189,13 +189,7 @@ static int rxrpc_open_socket(struct rxrpc_local *local, struct net *net) } /* We want receive timestamps. */ - opt = 1; - ret = kernel_setsockopt(local->socket, SOL_SOCKET, SO_TIMESTAMPNS_OLD, - (char *), sizeof(opt)); - if (ret < 0) { - _debug("setsockopt failed"); - goto error; - } + sock_enable_timestamps(local->socket->sk); break; default: -- 2.26.2
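The refactoring in sock_setsockopt() boils down to a small table: each of the four SO_TIMESTAMP* options picks a (new, ns) pair and hands it to __sock_set_timestamps(). The sketch below restates that mapping in userspace C. The enum, struct and function names are hypothetical, the flags are returned as a fresh value instead of being updated on a socket, and the sock_enable_timestamp() call from the patch is not modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the four SO_TIMESTAMP* socket options. */
enum sketch_tsopt {
	SKETCH_TIMESTAMP_OLD,
	SKETCH_TIMESTAMP_NEW,
	SKETCH_TIMESTAMPNS_OLD,
	SKETCH_TIMESTAMPNS_NEW,
};

/* Hypothetical stand-in for the three socket flags the patch touches. */
struct sketch_tsflags {
	bool rcvtstamp;		/* SOCK_RCVTSTAMP */
	bool rcvtstampns;	/* SOCK_RCVTSTAMPNS */
	bool tstamp_new;	/* SOCK_TSTAMP_NEW */
};

/* Enabling sets the flags from (is_new, ns); disabling clears all three. */
static struct sketch_tsflags sketch_set_timestamps(bool val, bool is_new, bool ns)
{
	struct sketch_tsflags f = { false, false, false };

	if (val) {
		f.tstamp_new = is_new;
		f.rcvtstampns = ns;
		f.rcvtstamp = true;
	}
	return f;
}

/* Map each option to its (is_new, ns) pair, as the four case labels do. */
static struct sketch_tsflags sketch_apply_opt(enum sketch_tsopt opt, bool val)
{
	switch (opt) {
	case SKETCH_TIMESTAMP_OLD:
		return sketch_set_timestamps(val, false, false);
	case SKETCH_TIMESTAMP_NEW:
		return sketch_set_timestamps(val, true, false);
	case SKETCH_TIMESTAMPNS_OLD:
		return sketch_set_timestamps(val, false, true);
	case SKETCH_TIMESTAMPNS_NEW:
	default:
		return sketch_set_timestamps(val, true, true);
	}
}
```

sock_enable_timestamps() corresponds to the SO_TIMESTAMPNS_OLD row with val set to true, which is the option rxrpc was setting before.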
[PATCH 10/28] tcp: add tcp_sock_set_cork
Add a helper to directly set the TCP_CORK sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig --- drivers/block/drbd/drbd_int.h | 14 drivers/block/drbd/drbd_receiver.c | 4 +-- drivers/block/drbd/drbd_worker.c | 6 ++-- fs/cifs/transport.c| 8 ++--- include/linux/tcp.h| 2 ++ net/ipv4/tcp.c | 51 +++--- net/rds/tcp_send.c | 9 ++ 7 files changed, 43 insertions(+), 51 deletions(-) diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index aae99a2d7bd40..3550adc93c68b 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -1570,20 +1570,6 @@ extern void drbd_set_recv_tcq(struct drbd_device *device, int tcq_enabled); extern void _drbd_clear_done_ee(struct drbd_device *device, struct list_head *to_be_freed); extern int drbd_connected(struct drbd_peer_device *); -static inline void drbd_tcp_cork(struct socket *sock) -{ - int val = 1; - (void) kernel_setsockopt(sock, SOL_TCP, TCP_CORK, - (char*), sizeof(val)); -} - -static inline void drbd_tcp_uncork(struct socket *sock) -{ - int val = 0; - (void) kernel_setsockopt(sock, SOL_TCP, TCP_CORK, - (char*), sizeof(val)); -} - static inline void drbd_tcp_nodelay(struct socket *sock) { int val = 1; diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index c15e7083b13a6..55ea907ad33cb 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -6162,7 +6162,7 @@ void drbd_send_acks_wf(struct work_struct *ws) rcu_read_unlock(); if (tcp_cork) - drbd_tcp_cork(connection->meta.socket); + tcp_sock_set_cork(connection->meta.socket->sk, true); err = drbd_finish_peer_reqs(device); kref_put(>kref, drbd_destroy_device); @@ -6175,7 +6175,7 @@ void drbd_send_acks_wf(struct work_struct *ws) } if (tcp_cork) - drbd_tcp_uncork(connection->meta.socket); + tcp_sock_set_cork(connection->meta.socket->sk, false); 
return; } diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c index 0dc019da1f8d0..2b89c9f2ca707 100644 --- a/drivers/block/drbd/drbd_worker.c +++ b/drivers/block/drbd/drbd_worker.c @@ -2098,7 +2098,7 @@ static void wait_for_work(struct drbd_connection *connection, struct list_head * if (uncork) { mutex_lock(>data.mutex); if (connection->data.socket) - drbd_tcp_uncork(connection->data.socket); + tcp_sock_set_cork(connection->data.socket->sk, false); mutex_unlock(>data.mutex); } @@ -2153,9 +2153,9 @@ static void wait_for_work(struct drbd_connection *connection, struct list_head * mutex_lock(>data.mutex); if (connection->data.socket) { if (cork) - drbd_tcp_cork(connection->data.socket); + tcp_sock_set_cork(connection->data.socket->sk, true); else if (!uncork) - drbd_tcp_uncork(connection->data.socket); + tcp_sock_set_cork(connection->data.socket->sk, false); } mutex_unlock(>data.mutex); } diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c index c97570eb2c180..99760063e0006 100644 --- a/fs/cifs/transport.c +++ b/fs/cifs/transport.c @@ -325,7 +325,6 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, size_t total_len = 0, sent, size; struct socket *ssocket = server->ssocket; struct msghdr smb_msg; - int val = 1; __be32 rfc1002_marker; if (cifs_rdma_enabled(server)) { @@ -345,8 +344,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, } /* cork the socket */ - kernel_setsockopt(ssocket, SOL_TCP, TCP_CORK, - (char *), sizeof(val)); + tcp_sock_set_cork(ssocket->sk, true); for (j = 0; j < num_rqst; j++) send_length += smb_rqst_len(server, [j]); @@ -435,9 +433,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, int num_rqst, } /* uncork it */ - val = 0; - kernel_setsockopt(ssocket, SOL_TCP, TCP_CORK, - (char *), sizeof(val)); + tcp_sock_set_cork(ssocket->sk, false); if ((total_len > 0) && (total_len != send_length)) { cifs_dbg(FYI, "partial send (wanted=%u sent=%zu): terminating session\n", diff --git 
a/include/linux/tcp.h b/include/linux/tcp.h index bf44e85d709dc..889eeb2256c2d 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -497,4 +497,6 @@ static inline u16
[PATCH v9 0/2] mtd: rawnand: Add NAND controller support on Intel LGM SoC
This series adds support for the new NAND Flash Controller (NFC) IP on
Intel's Lightning Mountain (LGM) SoC.

DMA is used for burst data transfers. By default the DMA hardware supports
32-bit aligned memory addresses and aligned data access. DMA bursts of 8
are supported. A data register is used for read/write operations from/to
the device.

The NAND controller also has a built-in hardware ECC engine.

The NAND controller driver implements ->exec_op() to replace the legacy
hooks; this callback is the method used to execute NAND operations.

Thanks to Boris, Andy, Arnd and Rob for the review comments and suggestions.
---
v9:
  No change
v8:
  - fix the kbuild bot warnings
  - correct the typos
v7:
  - indentation issue is fixed
  - add error check when retrieving the resource from DT
v6:
  - update the EBU_ADDR_SELx register base value, building it from DT
  - add tabs in Kconfig
v5:
  - replace by 'HSNAND_CLE_OFFS | HSNAND_CS_OFFS' to NAND_WRITE_CMD and
    NAND_WRITE_ADDR
  - remove the unused macros
  - update EBU_ADDR_MASK(x) macro
  - update the EBU_ADDR_SELx register values to be written
v4:
  - add ebu_nand_cs structure for multiple-CS support
  - mask/offset encoding for 0x51 value
  - update macro HSNAND_CTL_ENABLE_ECC
  - drop the op argument and unused macros
  - update the datatype and macros
  - add function to disable the NAND module
  - remove ebu_host->dma_rx = NULL;
  - rename MMIO address range variables to ebu and hsnand
  - implement ->setup_data_interface()
  - update labels err_cleanup_nand and err_cleanup_dma
  - add return value check in the nand_remove function
  - add/remove tabs and spaces as per coding standard
  - encode CS ids by reg property
v3:
  - add depends on MACRO in Kconfig
  - file name update in Makefile
  - file name update to intel-nand-controller
  - modification of MACRO divided like EBU, HSNAND and NAND
  - add NAND_ALE_OFFS, NAND_CLE_OFFS and NAND_CS_OFFS
  - rename lgm_ to ebu_ and remove the _va suffix in the whole file
  - rename structure and variables as per review comments
  - remove the unused functions lgm_read_byte(), lgm_dev_ready() and
    cmd_ctrl()
  - update exec_op() as per review comments
  - rename function lgm_dma_exit() to lgm_dma_cleanup()
  - replace hardcoded magic values for base and offset by defined MACROs
  - mtd_device_unregister() + nand_cleanup() instead of nand_release()
v2:
  - implement ->exec_op() to replace the legacy hook-up
  - update the commit message
  - add MIPS maintainers and xway_nand driver author in CC
v1:
  - initial version

dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC
---
v9:
  - Rob's review comments addressed
  - dual licensed
  - compatible change
  - add reg-names
  - drop clock-names and clock-cells
  - correct typos
v8:
  No change
v7:
  - Rob's review comments addressed
  - dt-schema build issue fixed with upgraded dt-schema
v6:
  - Rob's review comments addressed in YAML file
  - add addr_sel0 and addr_sel1 reg-names in YAML example
v5:
  - add the example in YAML file
v4:
  - No change
v3:
  - No change
v2:
  - YAML compatible string update to intel,lgm-nand-controller
v1:
  - initial version

Ramuthevar Vadivel Murugan (2):
  dt-bindings: mtd: Add Nand Flash Controller support for Intel LGM SoC
  mtd: rawnand: Add NAND controller support on Intel LGM SoC

 .../devicetree/bindings/mtd/intel,lgm-nand.yaml |  93 +++
 drivers/mtd/nand/raw/Kconfig                    |   8 +
 drivers/mtd/nand/raw/Makefile                   |   1 +
 drivers/mtd/nand/raw/intel-nand-controller.c    | 747 +
 4 files changed, 849 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mtd/intel,lgm-nand.yaml
 create mode 100644 drivers/mtd/nand/raw/intel-nand-controller.c

-- 
2.11.0
[PATCH 12/28] tcp: add tcp_sock_set_quickack
Add a helper to directly set the TCP_QUICKACK sockopt from kernel space without going through a fake uaccess. Cleanup the callers to avoid pointless wrappers now that this is a simple function call. Signed-off-by: Christoph Hellwig --- drivers/block/drbd/drbd_int.h | 7 -- drivers/block/drbd/drbd_receiver.c | 5 ++-- include/linux/tcp.h| 1 + net/ipv4/tcp.c | 39 -- 4 files changed, 29 insertions(+), 23 deletions(-) diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h index e24bba87c8e02..14345a87c7cc5 100644 --- a/drivers/block/drbd/drbd_int.h +++ b/drivers/block/drbd/drbd_int.h @@ -1570,13 +1570,6 @@ extern void drbd_set_recv_tcq(struct drbd_device *device, int tcq_enabled); extern void _drbd_clear_done_ee(struct drbd_device *device, struct list_head *to_be_freed); extern int drbd_connected(struct drbd_peer_device *); -static inline void drbd_tcp_quickack(struct socket *sock) -{ - int val = 2; - (void) kernel_setsockopt(sock, SOL_TCP, TCP_QUICKACK, - (char*), sizeof(val)); -} - /* sets the number of 512 byte sectors of our virtual device */ void drbd_set_my_capacity(struct drbd_device *device, sector_t size); diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c index 20a5e94494acd..3a3f2b6a821f3 100644 --- a/drivers/block/drbd/drbd_receiver.c +++ b/drivers/block/drbd/drbd_receiver.c @@ -1223,7 +1223,7 @@ static int drbd_recv_header_maybe_unplug(struct drbd_connection *connection, str * quickly as possible, and let remote TCP know what we have * received so far. 
*/ if (err == -EAGAIN) { - drbd_tcp_quickack(connection->data.socket); + tcp_sock_set_quickack(connection->data.socket->sk, 2); drbd_unplug_all_devices(connection); } if (err > 0) { @@ -4959,8 +4959,7 @@ static int receive_UnplugRemote(struct drbd_connection *connection, struct packe { /* Make sure we've acked all the TCP data associated * with the data requests being unplugged */ - drbd_tcp_quickack(connection->data.socket); - + tcp_sock_set_quickack(connection->data.socket->sk, 2); return 0; } diff --git a/include/linux/tcp.h b/include/linux/tcp.h index 9e42c7fe50a8b..2eaf8320b9db0 100644 --- a/include/linux/tcp.h +++ b/include/linux/tcp.h @@ -499,5 +499,6 @@ int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount, void tcp_sock_set_cork(struct sock *sk, bool on); void tcp_sock_set_nodelay(struct sock *sk); +void tcp_sock_set_quickack(struct sock *sk, int val); #endif /* _LINUX_TCP_H */ diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index a65f293a19fac..27b5e7a4e2ef9 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -2856,6 +2856,31 @@ void tcp_sock_set_nodelay(struct sock *sk) } EXPORT_SYMBOL(tcp_sock_set_nodelay); +static void __tcp_sock_set_quickack(struct sock *sk, int val) +{ + if (!val) { + inet_csk_enter_pingpong_mode(sk); + return; + } + + inet_csk_exit_pingpong_mode(sk); + if ((1 << sk->sk_state) & (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT) && + inet_csk_ack_scheduled(sk)) { + inet_csk(sk)->icsk_ack.pending |= ICSK_ACK_PUSHED; + tcp_cleanup_rbuf(sk, 1); + if (!(val & 1)) + inet_csk_enter_pingpong_mode(sk); + } +} + +void tcp_sock_set_quickack(struct sock *sk, int val) +{ + lock_sock(sk); + __tcp_sock_set_quickack(sk, val); + release_sock(sk); +} +EXPORT_SYMBOL(tcp_sock_set_quickack); + /* * Socket option code for TCP. 
*/ @@ -3096,19 +3121,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level, break; case TCP_QUICKACK: - if (!val) { - inet_csk_enter_pingpong_mode(sk); - } else { - inet_csk_exit_pingpong_mode(sk); - if ((1 << sk->sk_state) & - (TCPF_ESTABLISHED | TCPF_CLOSE_WAIT) && - inet_csk_ack_scheduled(sk)) { - icsk->icsk_ack.pending |= ICSK_ACK_PUSHED; - tcp_cleanup_rbuf(sk, 1); - if (!(val & 1)) - inet_csk_enter_pingpong_mode(sk); - } - } + __tcp_sock_set_quickack(sk, val); break; #ifdef CONFIG_TCP_MD5SIG -- 2.26.2
[PATCH 04/28] net: add sock_set_sndtimeo
Add a helper to directly set the SO_SNDTIMEO_NEW sockopt from kernel
space without going through a fake uaccess. The interface is
simplified to only pass the seconds value, as that is the only
thing needed at the moment.

Signed-off-by: Christoph Hellwig
---
 fs/dlm/lowcomms.c  |  8 ++------
 include/net/sock.h |  1 +
 net/core/sock.c    | 11 +++++++++++
 3 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index b801e77e3e596..b4d491122814b 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1035,7 +1035,6 @@ static void sctp_connect_to_sock(struct connection *con)
 	int result;
 	int addr_len;
 	struct socket *sock;
-	struct __kernel_sock_timeval tv = { .tv_sec = 5, .tv_usec = 0 };
 
 	if (con->nodeid == 0) {
 		log_print("attempt to connect sock 0 foiled");
@@ -1087,13 +1086,10 @@ static void sctp_connect_to_sock(struct connection *con)
 	 * since O_NONBLOCK argument in connect() function does not work here,
 	 * then, we should restore the default value of this attribute.
 	 */
-	kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO_NEW, (char *)&tv,
-			  sizeof(tv));
+	sock_set_sndtimeo(sock->sk, 5);
 	result = sock->ops->connect(sock, (struct sockaddr *)&daddr, addr_len,
 				   0);
-	memset(&tv, 0, sizeof(tv));
-	kernel_setsockopt(sock, SOL_SOCKET, SO_SNDTIMEO_NEW, (char *)&tv,
-			  sizeof(tv));
+	sock_set_sndtimeo(sock->sk, 0);
 
 	if (result == -EINPROGRESS)
 		result = 0;
diff --git a/include/net/sock.h b/include/net/sock.h
index a3a43141a4be2..9a7b9e98685ac 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2691,5 +2691,6 @@ void sock_def_readable(struct sock *sk);
 void sock_no_linger(struct sock *sk);
 void sock_set_priority(struct sock *sk, u32 priority);
 void sock_set_reuseaddr(struct sock *sk);
+void sock_set_sndtimeo(struct sock *sk, s64 secs);
 
 #endif /* _SOCK_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index ceda1a9248b3e..d3b1d61e4f768 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -737,6 +737,17 @@ void sock_set_priority(struct sock *sk, u32 priority)
 }
 EXPORT_SYMBOL(sock_set_priority);
 
+void sock_set_sndtimeo(struct sock *sk, s64 secs)
+{
+	lock_sock(sk);
+	if (secs && secs < MAX_SCHEDULE_TIMEOUT / HZ - 1)
+		sk->sk_sndtimeo = secs * HZ;
+	else
+		sk->sk_sndtimeo = MAX_SCHEDULE_TIMEOUT;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(sock_set_sndtimeo);
+
 /*
  * This is meant for all protocols to use and covers goings on
  * at the socket level. Everything here is generic.
-- 
2.26.2
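The seconds-to-timeout conversion in sock_set_sndtimeo() has two special cases worth spelling out: a value of 0 means "no timeout" (the maximum), and values too large to convert without overflow are also clamped to the maximum. A userspace sketch of that rule follows. SKETCH_HZ and SKETCH_MAX_TIMEOUT are stand-ins chosen for illustration (the real HZ is a build-time setting and the real limit is MAX_SCHEDULE_TIMEOUT), and the function name is hypothetical.

```c
#include <limits.h>

/* Assumptions: an arbitrary tick rate and LONG_MAX as the "forever" value. */
#define SKETCH_HZ 250
#define SKETCH_MAX_TIMEOUT LONG_MAX

/*
 * Convert a send timeout in seconds to ticks. Zero selects the maximum
 * timeout, and so does any value that would not fit after multiplying
 * by the tick rate.
 */
static long sketch_sndtimeo_ticks(long long secs)
{
	if (secs && secs < SKETCH_MAX_TIMEOUT / SKETCH_HZ - 1)
		return (long)(secs * SKETCH_HZ);
	return SKETCH_MAX_TIMEOUT;
}
```

This is why the dlm caller can pass 5 before connect() and 0 afterwards to restore the default behaviour.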
[PATCH 13/28] tcp: add tcp_sock_set_syncnt
Add a helper to directly set the TCP_SYNCNT sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Acked-by: Sagi Grimberg
---
 drivers/nvme/host/tcp.c |  9 +--------
 include/linux/tcp.h     |  1 +
 net/ipv4/tcp.c          | 12 ++++++++++++
 3 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 4e4a750ecdb97..2872584f52f63 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1336,14 +1336,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl,
 	}
 
 	/* Single syn retry */
-	opt = 1;
-	ret = kernel_setsockopt(queue->sock, IPPROTO_TCP, TCP_SYNCNT,
-			(char *)&opt, sizeof(opt));
-	if (ret) {
-		dev_err(nctrl->device,
-			"failed to set TCP_SYNCNT sock opt %d\n", ret);
-		goto err_sock;
-	}
+	tcp_sock_set_syncnt(queue->sock->sk, 1);
 
 	/* Set TCP no delay */
 	tcp_sock_set_nodelay(queue->sock->sk);
diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 2eaf8320b9db0..6aa4ae5ebf3d5 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -500,5 +500,6 @@ int tcp_skb_shift(struct sk_buff *to, struct sk_buff *from, int pcount,
 void tcp_sock_set_cork(struct sock *sk, bool on);
 void tcp_sock_set_nodelay(struct sock *sk);
 void tcp_sock_set_quickack(struct sock *sk, int val);
+int tcp_sock_set_syncnt(struct sock *sk, int val);
 
 #endif /* _LINUX_TCP_H */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 27b5e7a4e2ef9..d2c67ae1da07a 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2881,6 +2881,18 @@ void tcp_sock_set_quickack(struct sock *sk, int val)
 }
 EXPORT_SYMBOL(tcp_sock_set_quickack);
 
+int tcp_sock_set_syncnt(struct sock *sk, int val)
+{
+	if (val < 1 || val > MAX_TCP_SYNCNT)
+		return -EINVAL;
+
+	lock_sock(sk);
+	inet_csk(sk)->icsk_syn_retries = val;
+	release_sock(sk);
+	return 0;
+}
+EXPORT_SYMBOL(tcp_sock_set_syncnt);
+
 /*
  *	Socket option code for TCP.
  */
-- 
2.26.2
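One detail of tcp_sock_set_syncnt() that is easy to miss: the range check runs before the socket is touched, so an out-of-range value leaves the current retry count unchanged. A userspace sketch of that behaviour is below. The struct and function names are made up for illustration, the bound 127 is assumed to match MAX_TCP_SYNCNT, and locking is omitted.

```c
#include <errno.h>

/* Assumption: same bound as the kernel's MAX_TCP_SYNCNT. */
#define SKETCH_MAX_SYNCNT 127

/* Hypothetical stand-in for the one field the helper writes. */
struct sketch_conn {
	int syn_retries;	/* plays the role of icsk_syn_retries */
};

/* Validate first, then store; invalid values do not modify the state. */
static int sketch_set_syncnt(struct sketch_conn *c, int val)
{
	if (val < 1 || val > SKETCH_MAX_SYNCNT)
		return -EINVAL;
	c->syn_retries = val;
	return 0;
}
```

The nvme host caller passes the constant 1, which always passes the check, so dropping the error handling there does not hide a reachable failure.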
[PATCH 03/28] net: add sock_set_priority
Add a helper to directly set the SO_PRIORITY sockopt from kernel space
without going through a fake uaccess.

Signed-off-by: Christoph Hellwig
Acked-by: Sagi Grimberg
---
 drivers/nvme/host/tcp.c   | 12 ++----------
 drivers/nvme/target/tcp.c | 18 ++++--------------
 include/net/sock.h        |  1 +
 net/core/sock.c           |  8 ++++++++
 4 files changed, 15 insertions(+), 24 deletions(-)

diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index e72d87482eb78..a307972d33a02 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1362,16 +1362,8 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl,
 	 */
 	sock_no_linger(queue->sock->sk);
 
-	if (so_priority > 0) {
-		ret = kernel_setsockopt(queue->sock, SOL_SOCKET, SO_PRIORITY,
-				(char *)&so_priority, sizeof(so_priority));
-		if (ret) {
-			dev_err(ctrl->ctrl.device,
-				"failed to set SO_PRIORITY sock opt, ret %d\n",
-				ret);
-			goto err_sock;
-		}
-	}
+	if (so_priority > 0)
+		sock_set_priority(queue->sock->sk, so_priority);
 
 	/* Set socket type of service */
 	if (nctrl->opts->tos >= 0) {
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index e0801494b097f..f3088156d01da 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1448,12 +1448,8 @@ static int nvmet_tcp_set_queue_sock(struct nvmet_tcp_queue *queue)
 	 */
 	sock_no_linger(sock->sk);
 
-	if (so_priority > 0) {
-		ret = kernel_setsockopt(sock, SOL_SOCKET, SO_PRIORITY,
-				(char *)&so_priority, sizeof(so_priority));
-		if (ret)
-			return ret;
-	}
+	if (so_priority > 0)
+		sock_set_priority(sock->sk, so_priority);
 
 	/* Set socket type of service */
 	if (inet->rcv_tos > 0) {
@@ -1638,14 +1634,8 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
 		goto err_sock;
 	}
 
-	if (so_priority > 0) {
-		ret = kernel_setsockopt(port->sock, SOL_SOCKET, SO_PRIORITY,
-				(char *)&so_priority, sizeof(so_priority));
-		if (ret) {
-			pr_err("failed to set SO_PRIORITY sock opt %d\n", ret);
-			goto err_sock;
-		}
-	}
+	if (so_priority > 0)
+		sock_set_priority(port->sock->sk, so_priority);
 
 	ret = kernel_bind(port->sock, (struct sockaddr *)&port->addr,
 			sizeof(port->addr));
diff --git a/include/net/sock.h b/include/net/sock.h
index 6ed00bf009bbe..a3a43141a4be2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2689,6 +2689,7 @@ static inline bool sk_dev_equal_l3scope(struct sock *sk, int dif)
 
 void sock_def_readable(struct sock *sk);
 void sock_no_linger(struct sock *sk);
+void sock_set_priority(struct sock *sk, u32 priority);
 void sock_set_reuseaddr(struct sock *sk);
 
 #endif	/* _SOCK_H */
diff --git a/net/core/sock.c b/net/core/sock.c
index f0f09524911c8..ceda1a9248b3e 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -729,6 +729,14 @@ void sock_no_linger(struct sock *sk)
 }
 EXPORT_SYMBOL(sock_no_linger);
 
+void sock_set_priority(struct sock *sk, u32 priority)
+{
+	lock_sock(sk);
+	sk->sk_priority = priority;
+	release_sock(sk);
+}
+EXPORT_SYMBOL(sock_set_priority);
+
 /*
  * This is meant for all protocols to use and covers goings on
  * at the socket level. Everything here is generic.
-- 
2.26.2
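Editorial sketch, not part of the posted patch: the shape of the new helper (take the socket lock, store the value, drop the lock, return nothing) can be tried out in userspace. Everything below is an assumption made for illustration: `struct mock_sock`, `mock_sock_init()`, `mock_sock_set_priority()` and `mock_apply_priority()` are hypothetical names, and a pthread mutex merely stands in for lock_sock()/release_sock().

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct sock; not the kernel type. */
struct mock_sock {
	pthread_mutex_t lock;	/* plays the role of lock_sock()/release_sock() */
	uint32_t priority;	/* plays the role of sk->sk_priority */
};

static void mock_sock_init(struct mock_sock *sk)
{
	pthread_mutex_init(&sk->lock, NULL);
	sk->priority = 0;
}

/* Same shape as the helper in the patch: lock, store, unlock, no return value. */
static void mock_sock_set_priority(struct mock_sock *sk, uint32_t priority)
{
	assert(sk != NULL);
	pthread_mutex_lock(&sk->lock);
	sk->priority = priority;
	pthread_mutex_unlock(&sk->lock);
}

/* Caller side, as in the nvme hunks: only act on a positive priority. */
static void mock_apply_priority(struct mock_sock *sk, int so_priority)
{
	if (so_priority > 0)
		mock_sock_set_priority(sk, (uint32_t)so_priority);
}
```

Because the helper cannot fail, the callers in the patch lose their error paths; the sketch mirrors that by returning void.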
[PATCH 01/28] net: add sock_set_reuseaddr
Add a helper to directly set the SO_REUSEADDR sockopt from kernel space without going through a fake uaccess. For this the iscsi target now has to formally depend on inet to avoid a mostly theoretical compile failure. For actual operation it already did depend on having ipv4 or ipv6 support. Signed-off-by: Christoph Hellwig Acked-by: Sagi Grimberg --- drivers/infiniband/sw/siw/siw_cm.c| 18 +- drivers/nvme/target/tcp.c | 8 +--- drivers/target/iscsi/Kconfig | 2 +- drivers/target/iscsi/iscsi_target_login.c | 9 + fs/dlm/lowcomms.c | 6 +- include/net/sock.h| 2 ++ net/core/sock.c | 8 7 files changed, 19 insertions(+), 34 deletions(-) diff --git a/drivers/infiniband/sw/siw/siw_cm.c b/drivers/infiniband/sw/siw/siw_cm.c index 559e5fd3bad8b..d1860f3e87401 100644 --- a/drivers/infiniband/sw/siw/siw_cm.c +++ b/drivers/infiniband/sw/siw/siw_cm.c @@ -1312,17 +1312,14 @@ static void siw_cm_llp_state_change(struct sock *sk) static int kernel_bindconnect(struct socket *s, struct sockaddr *laddr, struct sockaddr *raddr) { - int rv, flags = 0, s_val = 1; + int rv, flags = 0; size_t size = laddr->sa_family == AF_INET ? sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6); /* * Make address available again asap. */ - rv = kernel_setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *)_val, - sizeof(s_val)); - if (rv < 0) - return rv; + sock_set_reuseaddr(s->sk); rv = s->ops->bind(s, laddr, size); if (rv < 0) @@ -1781,7 +1778,7 @@ int siw_create_listen(struct iw_cm_id *id, int backlog) struct siw_cep *cep = NULL; struct siw_device *sdev = to_siw_dev(id->device); int addr_family = id->local_addr.ss_family; - int rv = 0, s_val; + int rv = 0; if (addr_family != AF_INET && addr_family != AF_INET6) return -EAFNOSUPPORT; @@ -1793,13 +1790,8 @@ int siw_create_listen(struct iw_cm_id *id, int backlog) /* * Allow binding local port when still in TIME_WAIT from last close. 
*/ - s_val = 1; - rv = kernel_setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *)_val, - sizeof(s_val)); - if (rv) { - siw_dbg(id->device, "setsockopt error: %d\n", rv); - goto error; - } + sock_set_reuseaddr(s->sk); + if (addr_family == AF_INET) { struct sockaddr_in *laddr = _sockaddr_in(id->local_addr); diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index f0da04e960f40..40757a63f4553 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1632,6 +1632,7 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport) port->sock->sk->sk_user_data = port; port->data_ready = port->sock->sk->sk_data_ready; port->sock->sk->sk_data_ready = nvmet_tcp_listen_data_ready; + sock_set_reuseaddr(port->sock->sk); opt = 1; ret = kernel_setsockopt(port->sock, IPPROTO_TCP, @@ -1641,13 +1642,6 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport) goto err_sock; } - ret = kernel_setsockopt(port->sock, SOL_SOCKET, SO_REUSEADDR, - (char *), sizeof(opt)); - if (ret) { - pr_err("failed to set SO_REUSEADDR sock opt %d\n", ret); - goto err_sock; - } - if (so_priority > 0) { ret = kernel_setsockopt(port->sock, SOL_SOCKET, SO_PRIORITY, (char *)_priority, sizeof(so_priority)); diff --git a/drivers/target/iscsi/Kconfig b/drivers/target/iscsi/Kconfig index 1f93ea3813536..922484ea4e304 100644 --- a/drivers/target/iscsi/Kconfig +++ b/drivers/target/iscsi/Kconfig @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0-only config ISCSI_TARGET tristate "Linux-iSCSI.org iSCSI Target Mode Stack" - depends on NET + depends on INET select CRYPTO select CRYPTO_CRC32C select CRYPTO_CRC32C_INTEL if X86 diff --git a/drivers/target/iscsi/iscsi_target_login.c b/drivers/target/iscsi/iscsi_target_login.c index 731ee67fe914b..91acb3f07b4cc 100644 --- a/drivers/target/iscsi/iscsi_target_login.c +++ b/drivers/target/iscsi/iscsi_target_login.c @@ -909,14 +909,7 @@ int iscsit_setup_np( } } - /* FIXME: Someone please explain why this is endian-safe */ - ret = kernel_setsockopt(sock, 
SOL_SOCKET, SO_REUSEADDR, - (char *), sizeof(opt)); - if (ret < 0) { - pr_err("kernel_setsockopt() for SO_REUSEADDR" - " failed\n"); - goto fail; - } + sock_set_reuseaddr(sock->sk); ret =
RE: [PATCH] exfat: optimize dir-cache
> >>> > In order to prevent illegal accesses to bh and dentries, it
> >>> would be better to check validation for num and bh.
> >>>
> >>> There is no new error checking for same reason as above.
> >>>
> >>> I'll try to add error checking to this v2 patch.
> >>> Or is it better to add error checking in another patch?
> >> The latter:)
> >> Thanks!
> >
> > Yes, the latter looks better.
>
> I will do so.
>
> I will post additional patches for error checking, after this patch is merged
> into tree.
> OK?

Okay.
Re: [PATCH] x86: drop deprecated DISCONTIGMEM support for 32-bit
Gentle ping... On Sun, Feb 23, 2020 at 11:43:22AM +0200, Mike Rapoport wrote: > From: Mike Rapoport > > The DISCONTIGMEM support was marked as deprecated in v5.2 and since there > were no complaints about it for almost 5 releases it can be completely > removed. > > Signed-off-by: Mike Rapoport > --- > arch/x86/Kconfig | 9 --- > arch/x86/include/asm/mmzone_32.h | 39 --- > arch/x86/include/asm/pgtable_32.h | 3 +-- > arch/x86/mm/numa_32.c | 34 --- > 4 files changed, 1 insertion(+), 84 deletions(-) > > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig > index beea77046f9b..e3fc3aa80f97 100644 > --- a/arch/x86/Kconfig > +++ b/arch/x86/Kconfig > @@ -1613,19 +1613,10 @@ config NODES_SHIFT > Specify the maximum number of NUMA Nodes available on the target > system. Increases memory reserved to accommodate various tables. > > -config ARCH_HAVE_MEMORY_PRESENT > - def_bool y > - depends on X86_32 && DISCONTIGMEM > - > config ARCH_FLATMEM_ENABLE > def_bool y > depends on X86_32 && !NUMA > > -config ARCH_DISCONTIGMEM_ENABLE > - def_bool n > - depends on NUMA && X86_32 > - depends on BROKEN > - > config ARCH_SPARSEMEM_ENABLE > def_bool y > depends on X86_64 || NUMA || X86_32 || X86_32_NON_STANDARD > diff --git a/arch/x86/include/asm/mmzone_32.h > b/arch/x86/include/asm/mmzone_32.h > index 73d8dd14dda2..2d4515e8b7df 100644 > --- a/arch/x86/include/asm/mmzone_32.h > +++ b/arch/x86/include/asm/mmzone_32.h > @@ -14,43 +14,4 @@ extern struct pglist_data *node_data[]; > #define NODE_DATA(nid) (node_data[nid]) > #endif /* CONFIG_NUMA */ > > -#ifdef CONFIG_DISCONTIGMEM > - > -/* > - * generic node memory support, the following assumptions apply: > - * > - * 1) memory comes in 64Mb contiguous chunks which are either present or not > - * 2) we will not have more than 64Gb in total > - * > - * for now assume that 64Gb is max amount of RAM for whole system > - *64Gb / 4096bytes/page = 16777216 pages > - */ > -#define MAX_NR_PAGES 16777216 > -#define MAX_SECTIONS 1024 > -#define 
PAGES_PER_SECTION (MAX_NR_PAGES/MAX_SECTIONS) > - > -extern s8 physnode_map[]; > - > -static inline int pfn_to_nid(unsigned long pfn) > -{ > -#ifdef CONFIG_NUMA > - return((int) physnode_map[(pfn) / PAGES_PER_SECTION]); > -#else > - return 0; > -#endif > -} > - > -static inline int pfn_valid(int pfn) > -{ > - int nid = pfn_to_nid(pfn); > - > - if (nid >= 0) > - return (pfn < node_end_pfn(nid)); > - return 0; > -} > - > -#define early_pfn_valid(pfn) pfn_valid((pfn)) > - > -#endif /* CONFIG_DISCONTIGMEM */ > - > #endif /* _ASM_X86_MMZONE_32_H */ > diff --git a/arch/x86/include/asm/pgtable_32.h > b/arch/x86/include/asm/pgtable_32.h > index 0dca7f7aeff2..be7b19646897 100644 > --- a/arch/x86/include/asm/pgtable_32.h > +++ b/arch/x86/include/asm/pgtable_32.h > @@ -66,8 +66,7 @@ do {\ > #endif /* !__ASSEMBLY__ */ > > /* > - * kern_addr_valid() is (1) for FLATMEM and (0) for > - * SPARSEMEM and DISCONTIGMEM > + * kern_addr_valid() is (1) for FLATMEM and (0) for SPARSEMEM > */ > #ifdef CONFIG_FLATMEM > #define kern_addr_valid(addr)(1) > diff --git a/arch/x86/mm/numa_32.c b/arch/x86/mm/numa_32.c > index f2bd3d61e16b..104544359d69 100644 > --- a/arch/x86/mm/numa_32.c > +++ b/arch/x86/mm/numa_32.c > @@ -27,40 +27,6 @@ > > #include "numa_internal.h" > > -#ifdef CONFIG_DISCONTIGMEM > -/* > - * 4) physnode_map - the mapping between a pfn and owning node > - * physnode_map keeps track of the physical memory layout of a generic > - * numa node on a 64Mb break (each element of the array will > - * represent 64Mb of memory and will be marked by the node id. so, > - * if the first gig is on node 0, and the second gig is on node 1 > - * physnode_map will contain: > - * > - * physnode_map[0-15] = 0; > - * physnode_map[16-31] = 1; > - * physnode_map[32- ] = -1; > - */ > -s8 physnode_map[MAX_SECTIONS] __read_mostly = { [0 ... 
(MAX_SECTIONS - 1)] = > -1}; > -EXPORT_SYMBOL(physnode_map); > - > -void memory_present(int nid, unsigned long start, unsigned long end) > -{ > - unsigned long pfn; > - > - printk(KERN_INFO "Node: %d, start_pfn: %lx, end_pfn: %lx\n", > - nid, start, end); > - printk(KERN_DEBUG " Setting physnode_map array to node %d for > pfns:\n", nid); > - printk(KERN_DEBUG " "); > - start = round_down(start, PAGES_PER_SECTION); > - end = round_up(end, PAGES_PER_SECTION); > - for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) { > - physnode_map[pfn / PAGES_PER_SECTION] = nid; > - printk(KERN_CONT "%lx ", pfn); > - } > - printk(KERN_CONT "\n"); > -} > -#endif > - > extern unsigned long highend_pfn, highstart_pfn; > > void __init initmem_init(void) > -- > 2.24.0 > -- Sincerely yours,
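Editorial sketch, not part of the quoted patch: the physnode_map comment in the removed code describes one array slot per 64 MiB of memory. The userspace program fragment below reproduces that arithmetic under the constants quoted above (4096-byte pages, 1024 sections). The `demo_` names are hypothetical; only the constants and the rounding logic are taken from the quoted hunks.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NR_PAGES		16777216UL	/* 64 GiB / 4096 bytes per page */
#define MAX_SECTIONS		1024UL
#define PAGES_PER_SECTION	(MAX_NR_PAGES / MAX_SECTIONS)	/* 16384 pages = 64 MiB */

/* Hypothetical stand-in for physnode_map[]: one node id per 64 MiB section. */
static int8_t demo_physnode_map[MAX_SECTIONS];

/* Mark every section as "no node", like the { [0 ... N] = -1 } initializer. */
static void demo_map_init(void)
{
	for (unsigned long i = 0; i < MAX_SECTIONS; i++)
		demo_physnode_map[i] = -1;
}

/* Record that pfns in [start, end) belong to node nid, in whole sections. */
static void demo_memory_present(int nid, unsigned long start, unsigned long end)
{
	unsigned long pfn;

	start -= start % PAGES_PER_SECTION;			/* round down */
	end = (end + PAGES_PER_SECTION - 1) / PAGES_PER_SECTION
		* PAGES_PER_SECTION;				/* round up */
	assert(end <= MAX_NR_PAGES);

	for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION)
		demo_physnode_map[pfn / PAGES_PER_SECTION] = (int8_t)nid;
}

/* Look up the owning node of a page frame number; -1 means not present. */
static int demo_pfn_to_nid(unsigned long pfn)
{
	assert(pfn < MAX_NR_PAGES);
	return demo_physnode_map[pfn / PAGES_PER_SECTION];
}
```

With the first GiB (pfns 0 to 262143) on node 0 and the second GiB on node 1, slots 0-15 hold 0, slots 16-31 hold 1, and the rest stay at -1, which matches the example in the removed comment.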
Re: [PATCH v3] bluetooth: hci_qca: Fix qca6390 enable failure after warm reboot
On 5/28/2020 12:48 AM, Matthias Kaehlcke wrote: > Hi Zijun, > > On Wed, May 27, 2020 at 10:32:39AM +0800, Zijun Hu wrote: >> Warm reboot can not restore qca6390 controller baudrate >> to default due to lack of controllable BT_EN pin or power >> supply, so fails to download firmware after warm reboot. >> >> Fixed by sending EDL_SOC_RESET VSC to reset controller >> within added device shutdown implementation. >> >> Signed-off-by: Zijun Hu >> --- >> drivers/bluetooth/hci_qca.c | 29 + >> 1 file changed, 29 insertions(+) >> >> diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c >> index e4a6823..4b6f8b6 100644 >> --- a/drivers/bluetooth/hci_qca.c >> +++ b/drivers/bluetooth/hci_qca.c >> @@ -1975,6 +1975,34 @@ static void qca_serdev_remove(struct serdev_device >> *serdev) >> hci_uart_unregister_device(>serdev_hu); >> } >> >> +static void qca_serdev_shutdown(struct device *dev) >> +{ >> +int ret; >> +int timeout = msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS); >> +struct serdev_device *serdev = to_serdev_device(dev); >> +struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev); >> +const u8 ibs_wake_cmd[] = { 0xFD }; >> +const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 }; >> + >> +if (qcadev->btsoc_type == QCA_QCA6390) { >> +serdev_device_write_flush(serdev); >> +serdev_device_write_buf(serdev, >> +ibs_wake_cmd, sizeof(ibs_wake_cmd)); >> +serdev_device_wait_until_sent(serdev, timeout); > > Why no check of the return value of serdev_device_write_buf() here, > does it make sense to continue if sending the wakeup command failed? > i will correct it at v4 patch > Couldn't serdev_device_write() be used instead of the _write_buf() + > _wait_until_sent() combo? > i don't think so, serdev_device_write() is not appropriate at here. serdev_device_write_wakeup() should be used to release completion hold by serdev_device_write(), however @hci_serdev_client_ops doesn't use serdev_device_write_wakeup() to implement its write_wakeup operation. 
we don't want to touch common hci_serdev.c code. >> +usleep_range(8000, 1); >> + >> +serdev_device_write_flush(serdev); > > I suppose the flush is done because _wait_until_sent() could have timed out. > Another reason to use _device_write() (if suitable), since it returns > -ETIMEDOUT in that case? > flush is prefixed at write operation to speed up shutdown procedure in case of unexpected data injected during waiting for controller wakeup. the combo have been used and i just follow it>> + ret = serdev_device_write_buf(serdev, >> +edl_reset_soc_cmd, sizeof(edl_reset_soc_cmd)); >> +if (ret < 0) { >> +BT_ERR("QCA send EDL_RESET_REQ error: %d", ret); >> +return; >> +} >> +serdev_device_wait_until_sent(serdev, timeout); >> +usleep_range(8000, 1); >> +} >> +} >> + >> static int __maybe_unused qca_suspend(struct device *dev) >> { >> struct hci_dev *hdev = container_of(dev, struct hci_dev, dev); >> @@ -2100,6 +2128,7 @@ static struct serdev_device_driver qca_serdev_driver = >> { >> .name = "hci_uart_qca", >> .of_match_table = of_match_ptr(qca_bluetooth_of_match), >> .acpi_match_table = ACPI_PTR(qca_bluetooth_acpi_match), >> +.shutdown = qca_serdev_shutdown, >> .pm = _pm_ops, >> }, >> }; >> -- >> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a >> Linux Foundation Collaborative Project >> -- The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
arch/mips/vdso/vdso-image.c:13:35: sparse: sparse: incorrect type in assignment (different address spaces)
tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master head: b0c3ba31be3e45a130e13b278cf3b90f69bda6f6 commit: ad1df95419cc46b4a832cbb537716e3da9a98881 mips/vdso: Support mremap() for vDSO date: 4 months ago config: mips-randconfig-s032-20200527 (attached as .config) compiler: mips64-linux-gcc (GCC) 9.3.0 reproduce: # apt-get install sparse # sparse version: v0.6.1-240-gf0fe1cd9-dirty git checkout ad1df95419cc46b4a832cbb537716e3da9a98881 # save the attached .config to linux build tree make W=1 C=1 ARCH=mips CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' If you fix the issue, kindly add following tag as appropriate Reported-by: kbuild test robot sparse warnings: (new ones prefixed by >>) >> arch/mips/vdso/vdso-image.c:13:35: sparse: sparse: incorrect type in >> assignment (different address spaces) @@ expected void *[usertype] vdso >> @@ got void [noderef] * @@ arch/mips/vdso/vdso-image.c:13:35: sparse: expected void *[usertype] vdso arch/mips/vdso/vdso-image.c:13:35: sparse: got void [noderef] * -- >> arch/mips/vdso/vdso-n32-image.c:13:35: sparse: sparse: incorrect type in >> assignment (different address spaces) @@ expected void *[usertype] vdso >> @@ got void [noderef] * @@ arch/mips/vdso/vdso-n32-image.c:13:35: sparse: expected void *[usertype] vdso arch/mips/vdso/vdso-n32-image.c:13:35: sparse: got void [noderef] * --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
RE: [PATCH 4/4] exfat: standardize checksum calculation
> >> I tried applying patch to dev-tree (4c4dbb6ad8e8).
> >> - The .patch file I sent
> >> - mbox file downloaded from archive
> >> But I can't reproduce the error. (Both succeed) How do you reproduce
> >> the error?
> > I tried to apply your patches in the following order.
> > 1. [PATCH] exfat: optimize dir-cache
> > 2. [PATCH 1/4] exfat: redefine PBR as boot_sector
> > 3. [PATCH 2/4] exfat: separate the boot sector analysis
> > 4. [PATCH 3/4] exfat: add boot region verification
> > 5. [PATCH 4/4] exfat: standardize checksum calculation
>
> I was able to reproduce it.
>
> The dir-cache patch was created based on the HEAD of dev-tree.
> The 4 patches for boot_sector were also created based on the HEAD of dev-tree.
> (at physically separated place)
>
> I'm sorry I didn't check any conflicts with these patches.
>
> I'll repost the patch, based on the dir-cache patched dev-tree.
> If dir-cache patch will merge into dev-tree, should I wait until then?

I will apply them at once after testing if you send the updated 5 patches again.
Thanks!

>
> BR
Re: [PATCH] gpiolib: split character device into gpiolib-cdev
Hi Kent, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on v5.7-rc5] [cannot apply to gpio/for-next linus/master linux/master v5.7-rc7 v5.7-rc6 next-20200526] [if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982] url: https://github.com/0day-ci/linux/commits/Kent-Gibson/gpiolib-split-character-device-into-gpiolib-cdev/20200528-35 base:2ef96a5bb12be62ef75b5828c0aab838ebb29cb8 config: riscv-allyesconfig (attached as .config) compiler: riscv64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv If you fix the issue, kindly add following tag as appropriate Reported-by: kbuild test robot All warnings (new ones prefixed by >>, old ones prefixed by <<): drivers/gpio/gpiolib-cdev.c:1092:5: warning: no previous prototype for 'gpiolib_cdev_register' [-Wmissing-prototypes] 1092 | int gpiolib_cdev_register(struct gpio_device *gdev, dev_t devt) | ^ drivers/gpio/gpiolib-cdev.c:1110:6: warning: no previous prototype for 'gpiolib_cdev_unregister' [-Wmissing-prototypes] 1110 | void gpiolib_cdev_unregister(struct gpio_device *gdev) | ^~~ drivers/gpio/gpiolib-cdev.c: In function 'gpio_desc_to_lineinfo': >> drivers/gpio/gpiolib-cdev.c:779:3: warning: 'strncpy' specified bound 32 >> equals destination size [-Wstringop-truncation] 779 | strncpy(info->name, desc->name, sizeof(info->name)); | ^~~ vim +/strncpy +779 drivers/gpio/gpiolib-cdev.c 769 770 static void gpio_desc_to_lineinfo(struct gpio_desc *desc, 771struct gpioline_info *info) 772 { 773 struct gpio_chip *gc = desc->gdev->chip; 774 unsigned long flags; 775 776 
spin_lock_irqsave(_lock, flags); 777 778 if (desc->name) { > 779 strncpy(info->name, desc->name, sizeof(info->name)); 780 info->name[sizeof(info->name) - 1] = '\0'; 781 } else { 782 info->name[0] = '\0'; 783 } 784 785 if (desc->label) { 786 strncpy(info->consumer, desc->label, sizeof(info->consumer)); 787 info->consumer[sizeof(info->consumer) - 1] = '\0'; 788 } else { 789 info->consumer[0] = '\0'; 790 } 791 792 /* 793 * Userspace only need to know that the kernel is using this GPIO so 794 * it can't use it. 795 */ 796 info->flags = 0; 797 if (test_bit(FLAG_REQUESTED, >flags) || 798 test_bit(FLAG_IS_HOGGED, >flags) || 799 test_bit(FLAG_USED_AS_IRQ, >flags) || 800 test_bit(FLAG_EXPORT, >flags) || 801 test_bit(FLAG_SYSFS, >flags) || 802 !pinctrl_gpio_can_use_line(gc->base + info->line_offset)) 803 info->flags |= GPIOLINE_FLAG_KERNEL; 804 if (test_bit(FLAG_IS_OUT, >flags)) 805 info->flags |= GPIOLINE_FLAG_IS_OUT; 806 if (test_bit(FLAG_ACTIVE_LOW, >flags)) 807 info->flags |= GPIOLINE_FLAG_ACTIVE_LOW; 808 if (test_bit(FLAG_OPEN_DRAIN, >flags)) 809 info->flags |= (GPIOLINE_FLAG_OPEN_DRAIN | 810 GPIOLINE_FLAG_IS_OUT); 811 if (test_bit(FLAG_OPEN_SOURCE, >flags)) 812 info->flags |= (GPIOLINE_FLAG_OPEN_SOURCE | 813 GPIOLINE_FLAG_IS_OUT); 814 if (test_bit(FLAG_BIAS_DISABLE, >flags)) 815 info->flags |= GPIOLINE_FLAG_BIAS_DISABLE; 816 if (test_bit(FLAG_PULL_DOWN, >flags)) 817 info->flags |= GPIOLINE_FLAG_BIAS_PULL_DOWN; 818 if (test_bit(FLAG_PULL_UP, >flags)) 819 info->flags |= GPIOLINE_FLAG_BIAS_PULL_UP; 820 821 spin_unlock_irqrestore(_lock, flags); 822 } 823 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org .config.gz Description: application/gzip
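Editorial sketch, not part of the robot report: -Wstringop-truncation fires above because strncpy() with a bound equal to the destination size leaves the buffer without a terminating NUL whenever the source is at least that long. One common way to avoid the situation is to copy at most size - 1 bytes and then terminate explicitly. `DEMO_NAME_LEN` and `demo_copy_name()` below are hypothetical names; this is not presented as the fix that was applied to gpiolib-cdev.c.

```c
#include <assert.h>
#include <string.h>

#define DEMO_NAME_LEN 32	/* same size as the 32-byte field in the warning */

/* Copy src into a fixed-size field, always leaving it NUL-terminated. */
static void demo_copy_name(char dst[DEMO_NAME_LEN], const char *src)
{
	if (src) {
		/* Bound is one less than the field, so the last byte is ours. */
		strncpy(dst, src, DEMO_NAME_LEN - 1);
		dst[DEMO_NAME_LEN - 1] = '\0';
	} else {
		dst[0] = '\0';
	}
}
```

A source longer than 31 characters is silently truncated to 31 characters plus the terminator; callers that need to detect truncation would have to compare lengths themselves.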
Re: [PATCH 06/12] PM / devfreq: Add cpu based scaling support to passive_governor
Hi Andrew-sh.Cheng, Thanks for your posting. I like this approach absolutely. I think that it is necessary. When I developed the embedded product, I needed this feature always. I add the comments on below. On 5/20/20 12:43 PM, Andrew-sh.Cheng wrote: > From: Saravana Kannan > > Many CPU architectures have caches that can scale independent of the > CPUs. Frequency scaling of the caches is necessary to make sure that the > cache is not a performance bottleneck that leads to poor performance and > power. The same idea applies for RAM/DDR. > > To achieve this, this patch adds support for cpu based scaling to the > passive governor. This is accomplished by taking the current frequency > of each CPU frequency domain and then adjust the frequency of the cache > (or any devfreq device) based on the frequency of the CPUs. It listens > to CPU frequency transition notifiers to keep itself up to date on the > current CPU frequency. > > To decide the frequency of the device, the governor does one of the > following: > * Derives the optimal devfreq device opp from required-opps property of > the parent cpu opp_table. > > * Scales the device frequency in proportion to the CPU frequency. So, if > the CPUs are running at their max frequency, the device runs at its > max frequency. If the CPUs are running at their min frequency, the > device runs at its min frequency. It is interpolated for frequencies > in between. 
> > Andrew-sh.Cheng change > dev_pm_opp_xlate_opp to dev_pm_opp_xlate_required_opp devfreq->max_freq > to devfreq->user_min_freq_req.data.freq.qos->min_freq.target_value > for kernel-5.7 > > Signed-off-by: Saravana Kannan > [Sibi: Integrated cpu-freqmap governor into passive_governor] > Signed-off-by: Sibi Sankar > Signed-off-by: Andrew-sh.Cheng > --- > drivers/devfreq/Kconfig| 2 + > drivers/devfreq/governor_passive.c | 278 > ++--- > include/linux/devfreq.h| 40 +- > 3 files changed, 299 insertions(+), 21 deletions(-) > > diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig > index 0b1df12e0f21..d9067950af6a 100644 > --- a/drivers/devfreq/Kconfig > +++ b/drivers/devfreq/Kconfig > @@ -73,6 +73,8 @@ config DEVFREQ_GOV_PASSIVE > device. This governor does not change the frequency by itself > through sysfs entries. The passive governor recommends that > devfreq device uses the OPP table to get the frequency/voltage. > + Alternatively the governor can also be chosen to scale based on > + the online CPUs current frequency. > > comment "DEVFREQ Drivers" > > diff --git a/drivers/devfreq/governor_passive.c > b/drivers/devfreq/governor_passive.c > index 2d67d6c12dce..7dcda02a5bb7 100644 > --- a/drivers/devfreq/governor_passive.c > +++ b/drivers/devfreq/governor_passive.c > @@ -8,11 +8,89 @@ > */ > > #include > +#include > +#include > +#include > #include > #include > +#include > #include "governor.h" > > -static int devfreq_passive_get_target_freq(struct devfreq *devfreq, > +static unsigned int xlate_cpufreq_to_devfreq(struct devfreq_passive_data > *data, Need to change 'unsigned int' to 'unsigned long'. > + unsigned int cpu) > +{ > + unsigned int cpu_min, cpu_max, dev_min, dev_max, cpu_percent, max_state; Better to define them separately as following and then need to rename the variable. Usually, use the 'min_freq' and 'max_freq' word for the minimum/maximum frequency. 
unsigned int cpu_min_freq, cpu_max_freq, cpu_curr_freq, cpu_percent; unsigned long dev_min_freq, dev_max_freq, dev_max_state, The devfreq used 'unsigned long'. The cpufreq used 'unsigned long' and 'unsigned int'. You need to handle them properly. > + struct devfreq_cpu_state *cpu_state = data->cpu_state[cpu]; > + struct devfreq *devfreq = (struct devfreq *)data->this; > + unsigned long *freq_table = devfreq->profile->freq_table; In this function, use 'cpu' work for cpufreq and use 'dev' for devfreq. So, I think 'dev_freq_table' is proper name instead of 'freq_table' for the readability. freq_table -> dev_freq_table > + struct dev_pm_opp *opp = NULL, *cpu_opp = NULL; In the get_target_freq_with_devfreq(), use 'p_opp' indicating the OPP of parent device. For the consistency, I think that use 'p_opp' instead of 'cpu_opp'. > + unsigned long cpu_freq, freq; Define the 'cpu_freq' on above with cpu_min_freq/cpu_max_freq definition. cpu_freq -> cpu_curr_freq. > + > + if (!cpu_state || cpu_state->first_cpu != cpu || > + !cpu_state->opp_table || !devfreq->opp_table) > + return 0; > + > + cpu_freq = cpu_state->freq * 1000; > + cpu_opp = devfreq_recommended_opp(cpu_state->dev, _freq, 0); > + if (IS_ERR(cpu_opp)) > + return 0; > + > + opp = dev_pm_opp_xlate_required_opp(cpu_state->opp_table, > +
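Editorial sketch, not part of the review: the second option in the commit message (scale the device frequency in proportion to the CPU frequency, interpolating between the minimum and maximum) amounts to a linear map between two ranges. The function below is a hypothetical illustration of that arithmetic only. `demo_scale_dev_freq()` and its parameter names are assumptions, the units are whatever the caller uses consistently, and the patch itself may compute this differently (for example via a percentage).

```c
#include <assert.h>

/*
 * Hypothetical helper: map a CPU frequency onto a device frequency range
 * by linear interpolation. CPU min maps to device min, CPU max to device max.
 */
static unsigned long demo_scale_dev_freq(unsigned long cpu_curr_freq,
					 unsigned long cpu_min_freq,
					 unsigned long cpu_max_freq,
					 unsigned long dev_min_freq,
					 unsigned long dev_max_freq)
{
	unsigned long long span, offset;

	assert(cpu_max_freq >= cpu_min_freq);
	assert(dev_max_freq >= dev_min_freq);

	/* Clamp at both ends; this also covers cpu_min_freq == cpu_max_freq. */
	if (cpu_curr_freq >= cpu_max_freq)
		return dev_max_freq;
	if (cpu_curr_freq <= cpu_min_freq)
		return dev_min_freq;

	/* Here cpu_min_freq < cpu_curr_freq < cpu_max_freq, so no divide by zero. */
	span = dev_max_freq - dev_min_freq;
	offset = cpu_curr_freq - cpu_min_freq;

	/* 64-bit intermediate so the product cannot overflow on 32-bit longs. */
	return dev_min_freq +
	       (unsigned long)(span * offset / (cpu_max_freq - cpu_min_freq));
}
```

For instance, a CPU one third of the way from 500000 to 2000000 maps a device range of 200000000 to 800000000 onto 400000000.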
linux-next: manual merge of the devicetree tree with the nand tree
Hi all, Today's linux-next merge of the devicetree tree got a conflict in: Documentation/devicetree/bindings/mtd/nand-controller.yaml between commit: 1777341d9335 ("dt-bindings: mtd: Deprecate OOB_FIRST mode") from the nand tree and commit: 3d21a4609335 ("dt-bindings: Remove cases of 'allOf' containing a '$ref'") from the devicetree tree. I fixed it up (see below) and can carry the fix as necessary. This is now fixed as far as linux-next is concerned, but any non trivial conflicts should be mentioned to your upstream maintainer when your tree is submitted for merging. You may also want to consider cooperating with the maintainer of the conflicting tree to minimise any particularly complex conflicts. -- Cheers, Stephen Rothwell diff --cc Documentation/devicetree/bindings/mtd/nand-controller.yaml index d529f8587ba6,cde7c4d79efe.. --- a/Documentation/devicetree/bindings/mtd/nand-controller.yaml +++ b/Documentation/devicetree/bindings/mtd/nand-controller.yaml @@@ -55,21 -52,21 +52,21 @@@ patternProperties embedded in the NAND controller) or software correction (Linux will handle the calculations). soft_bch is deprecated and should be replaced by soft and nand-ecc-algo. + $ref: /schemas/types.yaml#/definitions/string -enum: [none, soft, hw, hw_syndrome, hw_oob_first, on-die] ++enum: [none, soft, hw, hw_syndrome, on-die] nand-ecc-algo: - allOf: - - $ref: /schemas/types.yaml#/definitions/string - - enum: [ hamming, bch, rs ] description: Desired ECC algorithm. + $ref: /schemas/types.yaml#/definitions/string + enum: [hamming, bch, rs] nand-bus-width: - allOf: - - $ref: /schemas/types.yaml#/definitions/uint32 - - enum: [ 8, 16 ] - - default: 8 description: Bus width to the NAND chip + $ref: /schemas/types.yaml#/definitions/uint32 + enum: [8, 16] + default: 8 nand-on-flash-bbt: $ref: /schemas/types.yaml#/definitions/flag pgpAZsQtKxkFA.pgp Description: OpenPGP digital signature
[git pull] Input updates for v5.7-rc7
Hi Linus,

Please pull from:

	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git for-linus

to receive updates for the input subsystem. Just a few random driver
fixups.

Changelog:
---------

Brendan Shanks (1):
      Input: evdev - call input_flush_device() on release(), not flush()

Christophe JAILLET (1):
      Input: dlink-dir685-touchkeys - fix a typo in driver name

Dennis Kadioglu (1):
      Input: synaptics - add a second working PNP_ID for Lenovo T470s

Dmitry Torokhov (1):
      Revert "Input: i8042 - add ThinkPad S230u to i8042 nomux list"

Enric Balletbo i Serra (1):
      Input: cros_ec_keyb - use cros_ec_cmd_xfer_status helper

Evan Green (1):
      Input: synaptics-rmi4 - really fix attn_data use-after-free

Gustavo A. R. Silva (1):
      Input: applespi - replace zero-length array with flexible-array

Hans de Goede (1):
      Input: axp20x-pek - always register interrupt handlers

James Hilliard (1):
      Input: usbtouchscreen - add support for BonXeon TP

Johnny Chuang (1):
      Input: elants_i2c - support palm detection

Kevin Locke (2):
      Input: i8042 - add ThinkPad S230u to i8042 nomux list
      Input: i8042 - add ThinkPad S230u to i8042 reset list

Stephan Gerhold (1):
      Input: mms114 - fix handling of mms345l

Wei Yongjun (1):
      Input: synaptics-rmi4 - fix error return code in rmi_driver_probe()

Wolfram Sang (1):
      Input: lm8333 - update contact email

Łukasz Patron (1):
      Input: xpad - add custom init packet for Xbox One S controllers

Diffstat:

 drivers/input/evdev.c                           | 19 ++-
 drivers/input/joystick/xpad.c                   | 12 +
 drivers/input/keyboard/applespi.c               |  2 +-
 drivers/input/keyboard/cros_ec_keyb.c           | 14 ++---
 drivers/input/keyboard/dlink-dir685-touchkeys.c |  2 +-
 drivers/input/misc/axp20x-pek.c                 | 72 +
 drivers/input/mouse/synaptics.c                 |  1 +
 drivers/input/rmi4/rmi_driver.c                 |  5 +-
 drivers/input/serio/i8042-x86ia64io.h           |  7 +++
 drivers/input/touchscreen/elants_i2c.c          | 11 +++-
 drivers/input/touchscreen/mms114.c              | 12 ++---
 drivers/input/touchscreen/usbtouchscreen.c      |  1 +
 include/linux/input/lm8333.h                    |  2 +-
 13 files changed, 88 insertions(+), 72 deletions(-)

Thanks.

-- 
Dmitry
RE: [PATCH] exfat: optimize dir-cache
> +struct exfat_dentry *exfat_get_dentry_cached(
> +	struct exfat_entry_set_cache *es, int num);

You used a single tab for the continuing line of the prototype here. We
usually use two tabs for this.

>  struct exfat_entry_set_cache *exfat_get_dentry_set(struct super_block *sb,
> -		struct exfat_chain *p_dir, int entry, unsigned int type,
> -		struct exfat_dentry **file_ep);
> +		struct exfat_chain *p_dir, int entry, unsigned int type);
> +void exfat_free_dentry_set(struct exfat_entry_set_cache *es, int sync);
>  int exfat_count_dir_entries(struct super_block *sb, struct exfat_chain *p_dir);
>
Re: [PATCH] tpm: Revert "tpm: fix invalid locking in NONBLOCKING mode"
On 5/27/20 5:30 PM, Jarkko Sakkinen wrote:
>> This won't help if the message is read by an async tcti. If the problem lies
>> in the chip get locality code, perhaps this could help to debug the
>> root-cause instead of masking it out in the upper layer code:
> What is TCTI and async TCTI? Not following.

TPM Command Transmission Interface (TCTI) as defined by TCG in
https://trustedcomputinggroup.org/resource/tss-tcti-specification/

The reason we added the O_NONBLOCK mode was to satisfy the TCG spec for
async TCTI.

Thanks,
Tadeusz
Re: [PATCH 12/23] bpf: handle the compat string in bpf_trace_copy_string better
On Wed, May 27, 2020 at 07:26:30PM -0700, Yonghong Song wrote:
>> --- a/kernel/trace/bpf_trace.c~xxx
>> +++ a/kernel/trace/bpf_trace.c
>> @@ -588,15 +588,22 @@ BPF_CALL_5(bpf_seq_printf, struct seq_fi
>>  		}
>>  		if (fmt[i] == 's') {
>> +			void *unsafe_ptr;
>> +
>>  			/* try our best to copy */
>>  			if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
>>  				err = -E2BIG;
>>  				goto out;
>>  			}
>> -			err = strncpy_from_unsafe(bufs->buf[memcpy_cnt],
>> -						  (void *) (long) args[fmt_cnt],
>> -						  MAX_SEQ_PRINTF_STR_LEN);
>> +			unsafe_ptr = (void *)(long)args[fmt_cnt];
>> +			if ((unsigned long)unsafe_ptr < TASK_SIZE) {
>> +				err = strncpy_from_user_nofault(
>> +					bufs->buf[memcpy_cnt], unsafe_ptr,
>> +					MAX_SEQ_PRINTF_STR_LEN);
>> +			} else {
>> +				err = -EFAULT;
>> +			}
>
> This probably not right.
> The pointer stored at args[fmt_cnt] is a kernel pointer,
> but it could be an invalid address and we do not want to fault.
> Not sure whether it exists or not, we should use
> strncpy_from_kernel_nofault()?

If you know it is a kernel pointer with this series it should be
strncpy_from_kernel_nofault. But even before the series it should have
been strncpy_from_unsafe_strict.
Re: [PATCH] perf jvmti: remove redundant jitdump line table entries
On 05/28/20 02:08 AM, Ian Rogers wrote:
>>
>> I noticed it loses information when the Hotspot code cache is
>> resized. I've been working around that by setting
>> -XX:InitialCodeCacheSize and -XX:ReservedCodeCacheSize to large
>> values. Does this help in your case?
>
> Thanks, I tried and also with Steve's patch:
> https://lore.kernel.org/lkml/1590544271-125795-1-git-send-email-steve.macl...@linux.microsoft.com/

Thanks for the reference! That patch fixes the problem I had with code
cache resizing so the workaround above is no longer necessary.

>
> Trying something very basic like just the -version command with compile only:
> /tmp/perf/perf record -k 1 -e cycles:u -F 6500 -o /tmp/perf.data java
> -agentpath:/tmp/perf/libperf-jvmti.so -XX:+PreserveFramePointer
> -XX:InitialCodeCacheSize=2G -XX:ReservedCodeCacheSize=2G
> -XX:CompileOnly=1 -version
> /tmp/perf/perf inject -i /tmp/perf.data -o /tmp/perf-jit.data -j
> /tmp/perf/perf report -i /tmp/perf-jit.data
>
> I don't see any of the JDK classes but 35 unknown symbols out of 272.
> The JDK classes are stripped to some degree iirc, but we should be
> able to give a symbol name as we don't care about local variables and
> like.
>

I tried this with latest perf/core and JDK 11 but I don't see any
[unknown] from jitted-*.so. All the events are in "Interpreter":

I think the options you want are -Xcomp -Xbatch rather than
-XX:CompileOnly=1? The latter restricts compilation to the named
method/package.

There was a bug where no jitdump debug info was written for classes
compiled without line tables. That was fixed by d3ea46da3 ("perf jvmti:
Fix jitdump for methods without debug info").

--
Nick
Re: [PATCH v3 3/3] mm/memory.c: Add memory read privilege before filling PTE entry
On 05/28/2020 04:55 AM, Hugh Dickins wrote: > On Tue, 19 May 2020, maobibo wrote: >> On 05/19/2020 04:57 AM, Andrew Morton wrote: >>> On Mon, 18 May 2020 13:08:49 +0800 Bibo Mao wrote: >>> On mips platform, hw PTE entry valid bit is set in pte_mkyoung function, it is used to set physical page with readable privilege. >>> >>> pte_mkyoung() seems to be a strange place to set the pte's valid bit. >>> Why is it done there? Can it be done within mips's mk_pte()? >> On MIPS system hardware cannot set PAGE_ACCESS bit when accessing the page, >> software sets PAGE_ACCESS software bit and PAGE_VALID hw bit together during >> page >> fault stage. >> >> If mk_pte is called in page fault flow, it is ok to set both bits. If it is >> not >> called in page fault, PAGE_ACCESS is set however there is no actual memory >> accessing. > > Sorry for joining in so late, but would you please explain that some more: > preferably in the final commit message, if not here. > > I still don't understand why this is not done in the same way as on other > architectures - those that care (I just checked x86, powerpc, arm, arm64, > but not all of them) make sure that all the bits they want are there in > mm/mmap.c's protection_map[16], which then feeds into vma->vm_page_prot, > and so into mk_pte() as Andrew indicated. > > And I can see that arch/mips/mm/cache.c has a setup_protection_map() > to do that: why does it not set the additional bits that you want? > including the valid bit and the accessed (young) bit, as others do. > Are you saying that there are circumstances in which it is wrong > for mk_pte() to set the additional bits? MIPS is actually strange here, _PAGE_ACCESSED is not set in protection_map. I do not understand history of mips neither. On x86/aarch/powerpc system, _PAGE_ACCESSED bit is set in the beginning. How does software track memory page accessing frequency? 
Does software not care about the current state of the _PAGE_ACCESSED bit, and just call madvise_cold to clear this bit and then watch whether it changes or not? regards bibo,mao > > I'm afraid that generic mm developers will have no clue as to whether > or not to add a pte_sw_mkyoung() after a mk_pte(); and generic source > will be the cleaner if it turns out not to be needed (but thank you > for making sure that it does nothing on the other architectures). > > Hugh >
Re: [PATCH 1/2] seccomp: notify user trap about unused filter
On Thu, May 28, 2020 at 3:59 AM Kees Cook wrote: > On Thu, May 28, 2020 at 01:16:46AM +0200, Christian Brauner wrote: > > I'm also starting to think this isn't even possible or currently doable > > safely. > > The fdtable in the kernel would end up with a dangling pointer, I would > > think. Unless you backtrack all fds that still have a reference into the > > fdtable and refer to that file and close them all in the kernel which I > > don't think is possible and also sounds very dodgy. This also really > > seems like we would be breaking a major contract, namely that fds stay > > valid until userspace calls close, execve(), or exits. > > Right, I think I was just using the wrong words? I was looking at it > like a pipe, or a socket, where you still have an fd, but reads return > 0, you might get SIGPIPE, etc. The VFS clearly knows what a > "disconnected" fd is, and I had assumed there was general logic for it > to indicate "I'm not here any more". Nope. For example, pipes have manual checks based on pipe->readers and pipe->writers, and manually send SIGPIPE and stuff from inside fs/pipe.c. And pipes are not actually permanently "disconnected" - someone can e.g. open a pipe that previously had no readers in read mode, and suddenly you can write to it again.
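The pipe behaviour described above can be observed from userspace. The following is a minimal sketch, not taken from the thread: the helper names are made up for illustration, and SIGPIPE is ignored process-wide so that the failing write() reports EPIPE instead of killing the process.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write one byte to a pipe whose read end is already closed.
 * Returns the errno set by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created.
 * (Hypothetical helper, for illustration only.)
 */
int write_to_readerless_pipe(void)
{
	int fds[2];
	int err = 0;

	/* The default SIGPIPE action would terminate the process. */
	signal(SIGPIPE, SIG_IGN);

	if (pipe(fds) < 0)
		return -1;
	close(fds[0]);			/* no readers left on this pipe */
	if (write(fds[1], "x", 1) < 0)
		err = errno;
	close(fds[1]);
	return err;
}

/*
 * Read from a pipe whose write end is already closed.
 * Returns the result of read(): 0 signals end-of-file.
 * (Hypothetical helper, for illustration only.)
 */
long read_from_writerless_pipe(void)
{
	int fds[2];
	char c;
	long n;

	if (pipe(fds) < 0)
		return -1;
	close(fds[1]);			/* no writers left on this pipe */
	n = read(fds[0], &c, 1);
	close(fds[0]);
	return n;
}
```

Both results come from checks specific to the pipe implementation, which is the point made above: there is no generic VFS notion of a "disconnected" fd behind them.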
Re: [PATCH] gpiolib: split character device into gpiolib-cdev
Hi Kent,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.7-rc5]
[cannot apply to gpio/for-next linus/master linux/master v5.7-rc7 v5.7-rc6 next-20200526]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system. BTW, we also suggest to use '--base' option to specify the base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url: https://github.com/0day-ci/linux/commits/Kent-Gibson/gpiolib-split-character-device-into-gpiolib-cdev/20200528-35
base: 2ef96a5bb12be62ef75b5828c0aab838ebb29cb8
config: nios2-allmodconfig (attached as .config)
compiler: nios2-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nios2

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> drivers/gpio/gpiolib-cdev.c:1092:5: warning: no previous prototype for 'gpiolib_cdev_register' [-Wmissing-prototypes]
    1092 | int gpiolib_cdev_register(struct gpio_device *gdev, dev_t devt)
         |     ^
>> drivers/gpio/gpiolib-cdev.c:1110:6: warning: no previous prototype for 'gpiolib_cdev_unregister' [-Wmissing-prototypes]
    1110 | void gpiolib_cdev_unregister(struct gpio_device *gdev)
         |      ^~~

vim +/gpiolib_cdev_register +1092 drivers/gpio/gpiolib-cdev.c

  1091
> 1092	int gpiolib_cdev_register(struct gpio_device *gdev, dev_t devt)
  1093	{
  1094		int ret;
  1095
  1096		cdev_init(&gdev->chrdev, &gpio_fileops);
  1097		gdev->chrdev.owner = THIS_MODULE;
  1098		gdev->dev.devt = MKDEV(MAJOR(devt), gdev->id);
  1099
  1100		ret = cdev_device_add(&gdev->chrdev, &gdev->dev);
  1101		if (ret)
  1102			return ret;
  1103
  1104		chip_dbg(gdev->chip, "added GPIO chardev (%d:%d)\n",
  1105			 MAJOR(devt), gdev->id);
  1106
  1107		return 0;
  1108	}
  1109
> 1110	void gpiolib_cdev_unregister(struct gpio_device *gdev)
  1111	{
  1112		cdev_device_del(&gdev->chrdev, &gdev->dev);
  1113	}
  1114

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
Re: [PATCH v4 2/2] gpio: add a reusable generic gpio_chip using regmap
Hi, Am 2020-05-28 02:31, schrieb Pierre-Louis Bossart: Hi Michael, +struct gpio_regmap_config { + struct device *parent; + struct regmap *regmap; + + const char *label; + int ngpio; could we add a .names field for the gpio_chip, I found this useful for PCM512x GPIO support, e.g. Sure, I have the names in the device tree. But I'd prefer that you'd do a patch on top of this (assuming it is applied soon), because you can actually test it and there might be missing more. I am happy to report that this gpio-regmap worked like a charm for me, after I applied the minor diff below (complete code at https://github.com/plbossart/sound/tree/fix/regmap-gpios). I worked around my previous comments by forcing the GPIO internal routing directly in regmap, and that allowed me to only play with the _set and _dir bases. I see the LEDs and clock selected as before, quite nice indeed. The chip->label test is probably wrong, since the gpio_chip structure is zeroed out if(!chip->label) is always true so the label is always set to the device name. I don't know what the intent was so just removed that test - maybe the correct test should be if (!config->label) ? yes, that was a typo. should have been if (!config->label). I've send a v5 with that fix and your names property. I added the names support as well, and btw I don't understand how one would get them through device tree? gpio-line-names property, see Documentation/devicetree/bindings/gpio/gpio.txt. I still have a series of odd warnings I didn't have before: [ 101.400263] WARNING: CPU: 3 PID: 1129 at drivers/gpio/gpiolib.c:4084 gpiod_set_value+0x3f/0x50 This seems to come from /* Should be using gpiod_set_value_cansleep() */ WARN_ON(desc->gdev->chip->can_sleep); Right now, gpio-regmap hardcodes can_sleep to true. But the only regmap which don't sleep is regmap-mmio. The PCM512x seems to be either I2C or SPI, which can both sleep. 
So this warning is actually correct and wherever this gpio is set should do it by calling the _cansleep() version. so maybe we need an option here as well? Or use a different function? Anyways, that gpio-regmap does simplify my code a great deal so thanks for this work, much appreciated. Glad to see that there are more users for it ;) -michael
Re: [PATCH 1/2] xen-pciback: Use dev_printk() when possible
On Wed, 2020-05-27 at 15:34 -0700, Boris Ostrovsky wrote: > On 5/27/20 1:43 PM, Bjorn Helgaas wrote: > > @@ -155,8 +157,8 @@ int xen_pcibk_config_read(struct pci_dev *dev, int > > offset, int size, > > u32 value = 0, tmp_val; > > > > if (unlikely(verbose_request)) > > - printk(KERN_DEBUG DRV_NAME ": %s: read %d bytes at 0x%x\n", > > - pci_name(dev), size, offset); > > + dev_printk(KERN_DEBUG, >dev, "read %d bytes at 0x%x\n", > > + size, offset); > > Maybe then dev_dbg() ? It likely would be better to remove verbose_request altogether and just use dynamic debugging and dev_dbg for all the output. $ git grep -w -A3 verbose_request drivers/pci/xen-pcifront.c:static int verbose_request; drivers/pci/xen-pcifront.c:module_param(verbose_request, int, 0644); drivers/pci/xen-pcifront.c- drivers/pci/xen-pcifront.c-static int errno_to_pcibios_err(int errno) drivers/pci/xen-pcifront.c-{ -- drivers/pci/xen-pcifront.c: if (verbose_request) drivers/pci/xen-pcifront.c- dev_info(>xdev->dev, drivers/pci/xen-pcifront.c- "read dev=%04x:%02x:%02x.%d - offset %x size %d\n", drivers/pci/xen-pcifront.c- pci_domain_nr(bus), bus->number, PCI_SLOT(devfn), -- drivers/pci/xen-pcifront.c: if (verbose_request) drivers/pci/xen-pcifront.c- dev_info(>xdev->dev, "read got back value %x\n", drivers/pci/xen-pcifront.c- op.value); drivers/pci/xen-pcifront.c- -- drivers/pci/xen-pcifront.c: if (verbose_request) drivers/pci/xen-pcifront.c- dev_info(>xdev->dev, drivers/pci/xen-pcifront.c- "write dev=%04x:%02x:%02x.%d - " drivers/pci/xen-pcifront.c- "offset %x size %d val %x\n", -- drivers/xen/xen-pciback/conf_space.c: if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space.c- printk(KERN_DEBUG DRV_NAME ": %s: read %d bytes at 0x%x\n", drivers/xen/xen-pciback/conf_space.c- pci_name(dev), size, offset); drivers/xen/xen-pciback/conf_space.c- -- drivers/xen/xen-pciback/conf_space.c: if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space.c- printk(KERN_DEBUG DRV_NAME ": %s: read %d bytes at 
0x%x = %x\n", drivers/xen/xen-pciback/conf_space.c- pci_name(dev), size, offset, value); drivers/xen/xen-pciback/conf_space.c- -- drivers/xen/xen-pciback/conf_space.c: if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space.c- printk(KERN_DEBUG drivers/xen/xen-pciback/conf_space.c- DRV_NAME ": %s: write request %d bytes at 0x%x = %x\n", drivers/xen/xen-pciback/conf_space.c- pci_name(dev), size, offset, value); -- drivers/xen/xen-pciback/conf_space_header.c:if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG DRV_NAME ": %s: enable\n", drivers/xen/xen-pciback/conf_space_header.c- pci_name(dev)); drivers/xen/xen-pciback/conf_space_header.c-err = pci_enable_device(dev); -- drivers/xen/xen-pciback/conf_space_header.c:if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG DRV_NAME ": %s: disable\n", drivers/xen/xen-pciback/conf_space_header.c- pci_name(dev)); drivers/xen/xen-pciback/conf_space_header.c-pci_disable_device(dev); -- drivers/xen/xen-pciback/conf_space_header.c:if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG DRV_NAME ": %s: set bus master\n", drivers/xen/xen-pciback/conf_space_header.c- pci_name(dev)); drivers/xen/xen-pciback/conf_space_header.c-pci_set_master(dev); -- drivers/xen/xen-pciback/conf_space_header.c:if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG DRV_NAME ": %s: clear bus master\n", drivers/xen/xen-pciback/conf_space_header.c- pci_name(dev)); drivers/xen/xen-pciback/conf_space_header.c-pci_clear_master(dev); -- drivers/xen/xen-pciback/conf_space_header.c:if (unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG drivers/xen/xen-pciback/conf_space_header.c- DRV_NAME ": %s: enable memory-write-invalidate\n", drivers/xen/xen-pciback/conf_space_header.c- pci_name(dev)); -- drivers/xen/xen-pciback/conf_space_header.c:if 
(unlikely(verbose_request)) drivers/xen/xen-pciback/conf_space_header.c- printk(KERN_DEBUG drivers/xen/xen-pciback/conf_space_header.c- DRV_NAME ": %s: disable memory-write-invalidate\n",
Re: [PATCH 1/2] seccomp: notify user trap about unused filter
On Wed, May 27, 2020 at 1:19 PM Christian Brauner wrote:
> We've been making heavy use of the seccomp notifier to intercept and
> handle certain syscalls for containers. This patch allows a syscall
> supervisor listening on a given notifier to be notified when a seccomp
> filter has become unused.
[...]
> To fix this, we introduce a new "live" reference counter that tracks the
> live tasks making use of a given filter and when a notifier is
> registered waiting tasks will be notified that the filter is now empty
> by receiving a (E)POLLHUP event.
> The concept in this patch introduces is the same as for signal_struct,
> i.e. reference counting for life-cycle management is decoupled from
> reference counting live tasks using the object.
[...]
> + * @live: tasks that actually use this filter, only to be altered
> + * during fork(), exit()/free_task(), and filter installation

This comment is a bit off. Actually, @live counts the number of tasks that use the filter directly plus the number of dependent filters that have non-zero @live.

[...]
> +void seccomp_filter_notify(const struct task_struct *tsk)
> +{
> +	struct seccomp_filter *orig = tsk->seccomp.filter;
> +
> +	while (orig && refcount_dec_and_test(&orig->live)) {
> +		if (waitqueue_active(&orig->wqh))
> +			wake_up_poll(&orig->wqh, EPOLLHUP);
> +		orig = orig->prev;
> +	}
> +}

/me fetches the paint bucket

Maybe name this seccomp_filter_unuse() or seccomp_filter_unuse_notify() or something like that? The current name isn't very descriptive.
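The counting rule described above (direct users plus dependent filters that are still live) can be modelled in plain userspace C. This is only a sketch under stated assumptions: struct filter_model and filter_model_unuse() are hypothetical stand-ins, a plain int replaces the kernel's refcount_t, and a flag replaces the EPOLLHUP wakeup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct seccomp_filter. */
struct filter_model {
	int live;			/* direct users + dependent filters with live > 0 */
	int hup_sent;			/* set where the kernel would wake pollers with EPOLLHUP */
	struct filter_model *prev;	/* filter this one was stacked on */
};

/*
 * Model of the loop in the quoted patch: drop one "live" reference on
 * the task's filter; every filter whose count reaches zero is flagged
 * and the drop is propagated to its predecessor.
 */
void filter_model_unuse(struct filter_model *f)
{
	while (f && --f->live == 0) {
		f->hup_sent = 1;
		f = f->prev;
	}
}
```

For example, take a filter P used directly by task A and additionally carrying a dependent filter C that task B installed on top of it, so P starts with a count of 2 and C with a count of 1. When B exits, C drops to zero and is flagged, and P drops to 1. P is flagged only once A exits as well.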
[GIT PULL 1/2] soc: TI drivers updates for v5.8
The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136: Linux 5.7-rc1 (2020-04-12 12:35:55 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone.git tags/drivers_soc_for_5.8 for you to fetch changes up to b8b38a8e3cae100f292d756e32c78ab288db8a7d: drivers: soc: ti: knav_qmss_queue: Make knav_gp_range_ops static (2020-05-27 20:39:14 -0700) soc: ARM TI update for v5.8 - Platform chipid driver support and associated dts doc update - Sparse warning fix in Navigator driver Grygorii Strashko (2): dt-bindings: soc: ti: add binding for k3 platforms chipid module soc: ti: add k3 platforms chipid module driver Samuel Zou (1): drivers: soc: ti: knav_qmss_queue: Make knav_gp_range_ops static .../devicetree/bindings/soc/ti/k3-socinfo.yaml | 40 ++ drivers/soc/ti/Kconfig | 10 ++ drivers/soc/ti/Makefile| 1 + drivers/soc/ti/k3-socinfo.c| 152 + drivers/soc/ti/knav_qmss_queue.c | 2 +- 5 files changed, 204 insertions(+), 1 deletion(-) create mode 100644 Documentation/devicetree/bindings/soc/ti/k3-socinfo.yaml create mode 100644 drivers/soc/ti/k3-socinfo.c
[GIT PULL 2/2] ARM: DTS: Keystone update for v5.8
The following changes since commit 8f3d9f354286745c751374f5f1fcafee6b3f3136: Linux 5.7-rc1 (2020-04-12 12:35:55 -0700) are available in the git repository at: git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone.git tags/keystone_dts_for_5.8 for you to fetch changes up to 644c5a582261ecdf1df41b11d05d10a10a66: ARM: dts: keystone: Rename "msmram" node to "sram" (2020-05-27 20:36:32 -0700) ARM: dts: Keystone update for v5.8 - Rename "msmram" node to "sram" Krzysztof Kozlowski (1): ARM: dts: keystone: Rename "msmram" node to "sram" arch/arm/boot/dts/keystone-k2e.dtsi | 4 ++-- arch/arm/boot/dts/keystone-k2g.dtsi | 4 ++-- arch/arm/boot/dts/keystone-k2hk.dtsi | 4 ++-- arch/arm/boot/dts/keystone-k2l.dtsi | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-)
[PATCH v5 2/2] gpio: add a reusable generic gpio_chip using regmap
There are quite a lot of simple GPIO controllers that use regmap to access the hardware. This driver tries to be a base to unify existing code into one place. This won't cover everything but it should be a good starting point. It does not implement its own irq_chip because there is already a generic one for regmap based devices. Instead, the irq_chip will be instantiated in the parent driver and its irq domain will be associated with this driver. For now it consists of the usual registers, like set (and an optional clear) data register, an input register and direction registers. Out of the box, it supports consecutive register mappings and mappings where the registers have gaps between them with a linear mapping between GPIO offset and bit position. For weirder mappings the user can register its own .xlate(). Signed-off-by: Michael Walle --- drivers/gpio/Kconfig| 4 + drivers/gpio/Makefile | 1 + drivers/gpio/gpio-regmap.c | 352 include/linux/gpio-regmap.h | 70 +++ 4 files changed, 427 insertions(+) create mode 100644 drivers/gpio/gpio-regmap.c create mode 100644 include/linux/gpio-regmap.h diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig index 7d077be10a0f..bcacd9c74aa8 100644 --- a/drivers/gpio/Kconfig +++ b/drivers/gpio/Kconfig @@ -73,6 +73,10 @@ config GPIO_GENERIC depends on HAS_IOMEM # Only for IOMEM drivers tristate +config GPIO_REGMAP + depends on REGMAP + tristate + # put drivers in the right section, in alphabetical order # This symbol is selected by both I2C and SPI expanders diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile index 65bf3940e33c..1e4894e0bf0f 100644 --- a/drivers/gpio/Makefile +++ b/drivers/gpio/Makefile @@ -12,6 +12,7 @@ obj-$(CONFIG_GPIO_SYSFS) += gpiolib-sysfs.o obj-$(CONFIG_GPIO_ACPI)+= gpiolib-acpi.o # Device drivers.
Generally keep list sorted alphabetically +obj-$(CONFIG_GPIO_REGMAP) += gpio-regmap.o obj-$(CONFIG_GPIO_GENERIC) += gpio-generic.o # directly supported by gpio-generic diff --git a/drivers/gpio/gpio-regmap.c b/drivers/gpio/gpio-regmap.c new file mode 100644 index ..5060ca865276 --- /dev/null +++ b/drivers/gpio/gpio-regmap.c @@ -0,0 +1,352 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * regmap based generic GPIO driver + * + * Copyright 2019 Michael Walle + */ + +#include +#include +#include +#include +#include + +struct gpio_regmap { + struct device *parent; + struct regmap *regmap; + struct gpio_chip gpio_chip; + + int reg_stride; + int ngpio_per_reg; + unsigned int reg_dat_base; + unsigned int reg_set_base; + unsigned int reg_clr_base; + unsigned int reg_dir_in_base; + unsigned int reg_dir_out_base; + + int (*reg_mask_xlate)(struct gpio_regmap *gpio, unsigned int base, + unsigned int offset, unsigned int *reg, + unsigned int *mask); + + void *driver_data; +}; + +static unsigned int gpio_regmap_addr(unsigned int addr) +{ + return (addr == GPIO_REGMAP_ADDR_ZERO) ? 0 : addr; +} + +/** + * gpio_regmap_simple_xlate() - translate base/offset to reg/mask + * + * Use a simple linear mapping to translate the offset to the bitmask. 
+ */
+static int gpio_regmap_simple_xlate(struct gpio_regmap *gpio,
+				    unsigned int base, unsigned int offset,
+				    unsigned int *reg, unsigned int *mask)
+{
+	unsigned int line = offset % gpio->ngpio_per_reg;
+	unsigned int stride = offset / gpio->ngpio_per_reg;
+
+	*reg = base + stride * gpio->reg_stride;
+	*mask = BIT(line);
+
+	return 0;
+}
+
+static int gpio_regmap_get(struct gpio_chip *chip, unsigned int offset)
+{
+	struct gpio_regmap *gpio = gpiochip_get_data(chip);
+	unsigned int base, val, reg, mask;
+	int ret;
+
+	/* we might not have an output register if we are input only */
+	if (gpio->reg_dat_base)
+		base = gpio_regmap_addr(gpio->reg_dat_base);
+	else
+		base = gpio_regmap_addr(gpio->reg_set_base);
+
+	ret = gpio->reg_mask_xlate(gpio, base, offset, &reg, &mask);
+	if (ret)
+		return ret;
+
+	ret = regmap_read(gpio->regmap, reg, &val);
+	if (ret)
+		return ret;
+
+	return (val & mask) ? 1 : 0;
+}
+
+static void gpio_regmap_set(struct gpio_chip *chip, unsigned int offset,
+			    int val)
+{
+	struct gpio_regmap *gpio = gpiochip_get_data(chip);
+	unsigned int base = gpio_regmap_addr(gpio->reg_set_base);
+	unsigned int reg, mask;
+
+	gpio->reg_mask_xlate(gpio, base, offset, &reg, &mask);
+	if (val)
+		regmap_update_bits(gpio->regmap, reg, mask, mask);
+	else
+		regmap_update_bits(gpio->regmap, reg, mask, 0);
+}
+
+static void
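The linear translation performed by gpio_regmap_simple_xlate() in the patch above can be checked with a small userspace sketch. The struct and function names below are hypothetical reductions of the patch's types, kept to the two fields the arithmetic needs. With 8 GPIOs per register, a register stride of 4 and a base of 0x10, GPIO offset 10 should land in register 0x14 (second register) with mask 0x04 (bit 2).

```c
#include <assert.h>

#define BIT(n) (1U << (n))

/* Hypothetical reduced form of struct gpio_regmap from the patch. */
struct xlate_params {
	unsigned int reg_stride;	/* address distance between registers */
	unsigned int ngpio_per_reg;	/* GPIO lines handled by one register */
};

/* Same arithmetic as gpio_regmap_simple_xlate(), without kernel types. */
void simple_xlate(const struct xlate_params *p, unsigned int base,
		  unsigned int offset, unsigned int *reg, unsigned int *mask)
{
	unsigned int line = offset % p->ngpio_per_reg;	 /* bit within the register */
	unsigned int stride = offset / p->ngpio_per_reg; /* index of the register */

	*reg = base + stride * p->reg_stride;
	*mask = BIT(line);
}
```

A controller whose registers are not laid out this regularly would supply its own translation callback instead, as the commit message notes.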
[PATCH v5 0/2] gpio: generic regmap implementation
This series is a split off of the sl28cpld series: https://lore.kernel.org/linux-gpio/20200423174543.17161-1-mich...@walle.cc/ I wasn't sure if I should also include the gpiochip_irqchip_add_domain() patch here. So feel free to skip it. OTOH if you use interrupts with gpio-regmap it is quite handy. For an actual user see the patch 11/16 ("gpio: add support for the sl28cpld GPIO controller") of the series above. Changes since v4: - add comment about can_sleep - fix config->label typo - add config->names property Changes since v3: - set reg_dat_base, that was actually broken - fix typo - fix swapped reg_in_dir/reg_out_dir documentation - use "goto err" in error path in gpio_regmap_register() Changes since v2: See changelog in the former patch series. Michael Walle (2): gpiolib: Introduce gpiochip_irqchip_add_domain() gpio: add a reusable generic gpio_chip using regmap drivers/gpio/Kconfig| 4 + drivers/gpio/Makefile | 1 + drivers/gpio/gpio-regmap.c | 352 drivers/gpio/gpiolib.c | 20 ++ include/linux/gpio-regmap.h | 70 +++ include/linux/gpio/driver.h | 3 + 6 files changed, 450 insertions(+) create mode 100644 drivers/gpio/gpio-regmap.c create mode 100644 include/linux/gpio-regmap.h -- 2.20.1
[PATCH v5 1/2] gpiolib: Introduce gpiochip_irqchip_add_domain()
The function connects an IRQ domain to a gpiochip and reuses gpiochip_to_irq() which is provided by gpiolib. gpiochip_irqchip_* and regmap_irq partially provide the same functionality. This function will help to connect just the minimal functionality of the gpiochip_irqchip which is needed to work together with regmap-irq. Signed-off-by: Michael Walle --- drivers/gpio/gpiolib.c | 20 include/linux/gpio/driver.h | 3 +++ 2 files changed, 23 insertions(+) diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index eaa0e209188d..d07f763c9c0b 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -2756,6 +2756,26 @@ int gpiochip_irqchip_add_key(struct gpio_chip *gc, } EXPORT_SYMBOL_GPL(gpiochip_irqchip_add_key); +/** + * gpiochip_irqchip_add_domain() - adds an irqdomain to a gpiochip + * @gc: the gpiochip to add the irqchip to + * @domain: the irqdomain to add to the gpiochip + * + * This function adds an IRQ domain to the gpiochip. + */ +int gpiochip_irqchip_add_domain(struct gpio_chip *gc, + struct irq_domain *domain) +{ + if (!domain) + return -EINVAL; + + gc->to_irq = gpiochip_to_irq; + gc->irq.domain = domain; + + return 0; +} +EXPORT_SYMBOL_GPL(gpiochip_irqchip_add_domain); + #else /* CONFIG_GPIOLIB_IRQCHIP */ static inline int gpiochip_add_irqchip(struct gpio_chip *gc, diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h index 8c41ae41b6bb..ee30065b6f61 100644 --- a/include/linux/gpio/driver.h +++ b/include/linux/gpio/driver.h @@ -599,6 +599,9 @@ int gpiochip_irqchip_add_key(struct gpio_chip *gc, bool gpiochip_irqchip_irq_valid(const struct gpio_chip *gc, unsigned int offset); +int gpiochip_irqchip_add_domain(struct gpio_chip *gc, + struct irq_domain *domain); + #ifdef CONFIG_LOCKDEP /* -- 2.20.1
linux-next: build warning after merge of the sound-asoc tree
Hi all,

After merging the sound-asoc tree, today's linux-next build (x86_64 allmodconfig) produced this warning:

sound/soc/sof/intel/byt.c:464:12: warning: 'byt_remove' defined but not used [-Wunused-function]
  464 | static int byt_remove(struct snd_sof_dev *sdev)
      |            ^~
sound/soc/sof/intel/byt.c:454:12: warning: 'byt_resume' defined but not used [-Wunused-function]
  454 | static int byt_resume(struct snd_sof_dev *sdev)
      |            ^~
sound/soc/sof/intel/byt.c:447:12: warning: 'byt_suspend' defined but not used [-Wunused-function]
  447 | static int byt_suspend(struct snd_sof_dev *sdev, u32 target_state)
      |            ^~~

Introduced by commits

  ddcccd543f5d ("ASoC: SOF: Intel: byt: Add PM callbacks")
  c691f0c6e267 ("ASoC: SOF: Intel: BYT: add .remove op")

-- 
Cheers,
Stephen Rothwell
Re: [PATCH v1 1/1] PCI/ERR: Handle fatal error recovery for non-hotplug capable devices
On 5/26/20 11:41 PM, Yicong Yang wrote: We should do slot reset if driver required, but it's different from the `slot reset` in pci_bus_error_reset(). Previously we don't do a slot reset and call ->slot_reset() directly, I don't know the certain reason. IIUC, your concern is whether it is correct to trigger reset for pci_channel_io_normal case right ? Please correct me if my assumption is incorrect. right. If its true, then why would report_error_detected() will return PCI_ERS_*_NEED_RESET for pci_channel_io_normal case ? If report_error_detected() requests reset in pci_channel_io_normal case then I think we should give preference to it. If we get PCI_ERS_*_NEED_RESET, we should do slot reset, no matter it's a hotpluggable slot or not. pci_slot_reset() function itself has dependency on hotplug ops. So what kind of slot reset is needed for non-hotplug case? static int pci_slot_reset(struct pci_slot *slot, int probe) { int rc; if (!slot || !pci_slot_resetable(slot)) return -ENOTTY; if (!probe) pci_slot_lock(slot); might_sleep(); rc = pci_reset_hotplug_slot(slot->hotplug, probe); if (!probe) pci_slot_unlock(slot); return rc; } static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, int probe) { int rc = -ENOTTY; if (!hotplug || !try_module_get(hotplug->owner)) return rc; if (hotplug->ops->reset_slot) rc = hotplug->ops->reset_slot(hotplug, probe); module_put(hotplug->owner); return rc; } And we shouldn't do it here in reset_link(), that's two separate things. The `slot reset` done in aer_root_reset() is only for *link reset*, as there may have some side effects to perform secondary bus reset directly for hotpluggable slot, as mentioned in commit c4eed62a2143, so it use slot reset to do the reset link things. As for slot reset required by the driver, we should perform it later just before the ->slot_reset(). I noticed the TODO comments there and we should implement it if it's necessary. I agree. 
It lies at line 183 of drivers/pci/pcie/err.c:

	if (status == PCI_ERS_RESULT_NEED_RESET) {
		/*
		 * TODO: Should call platform-specific
		 * functions to reset slot before calling
		 * drivers' slot_reset callbacks?
		 */
		status = PCI_ERS_RESULT_RECOVERED;
		pci_dbg(dev, "broadcast slot_reset message\n");
		pci_walk_bus(bus, report_slot_reset, &status);
	}
Re: Cache flush issue with page_mapping_file() and swap back shmem page ?
Hi Jerome, On Wed, 27 May 2020, Jerome Glisse wrote: > So any arch code which uses page_mapping_file() might get the wrong > answer, this function will return NULL for a swap backed page which > can be a shmem pages. But shmem pages can still be shared among > multiple process (and possibly at different virtual addresses if > mremap was use). > > Attached is a patch that changes page_mapping_file() to return the > shmem mapping for swap backed shmem page. I have not tested it (no > way for me to test all those architecture) and i spotted this while > working on something else. So i hope someone can take a closer look. I'm certainly no expert on flush_dcache_page() and friends, but I'd be very surprised if such a problem exists, yet has gone unnoticed for so long. page_mapping_file() itself is fairly new, added when a risk of crashing on a race with swapoff came in: but the previous use of page_mapping() would have suffered equally if there were such a cache flushinhg problem here. And I'm afraid your patch won't do anything to help if there is a problem: very soon after shmem calls add_to_swap_cache(), it calls shmem_delete_from_page_cache(), which sets page->mapping to NULL. But I can assure you that a shmem page (unlike an anon page) is never put into swap cache while it is mapped into userspace, and never mapped into userspace while it is still in swap cache: does that help? Hugh > This might be a shmem page that is in a sense a file that > can be mapped multiple times in different processes at > possibly different virtual addresses (fork + mremap). So > return the shmem mapping that will allow any arch code to > find all mappings of the page. > > Note that even if page is not anonymous then the page might > have a NULL page->mapping field if it is being truncated, > but then it is fine as each pte poiting to the page will be > remove and cache flushing should be handled properly by that > part of the code. 
> > Signed-off-by: Jerome Glisse > Cc: "Huang, Ying" > Cc: Michal Hocko > Cc: Mel Gorman > Cc: Russell King > Cc: Andrew Morton > Cc: Mike Rapoport > Cc: "David S. Miller" > Cc: "James E.J. Bottomley" > --- > mm/util.c | 18 +- > 1 file changed, 17 insertions(+), 1 deletion(-) > > diff --git a/mm/util.c b/mm/util.c > index 988d11e6c17c..ec8739ab0cc3 100644 > --- a/mm/util.c > +++ b/mm/util.c > @@ -685,8 +685,24 @@ EXPORT_SYMBOL(page_mapping); > */ > struct address_space *page_mapping_file(struct page *page) > { > - if (unlikely(PageSwapCache(page))) > + if (unlikely(PageSwapCache(page))) { > + /* > + * This might be a shmem page that is in a sense a file that > + * can be mapped multiple times in different processes at > + * possibly different virtual addresses (fork + mremap). So > + * return the shmem mapping that will allow any arch code to > + * find all mappings of the page. > + * > + * Note that even if page is not anonymous then the page might > + * have a NULL page->mapping field if it is being truncated, > + * but then it is fine as each pte poiting to the page will be > + * remove and cache flushing should be handled properly by that > + * part of the code. > + */ > + if (!PageAnon(page)) > + return page->mapping; > return NULL; > + } > return page_mapping(page); > } > > -- > 2.26.2
[RFC] decrease tsk->signal->live before profile_task_exit
I want to determine, in profile_task_exit(), which thread is the last one to enter do_exit(). But when a lot of threads exit, tsk->signal->live is not yet correct at that point, because it is only decremented after profile_task_exit().

Signed-off-by: liuchao
---
 kernel/exit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index ce2a75bc0ade..1693764bc356 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -708,6 +708,7 @@ void __noreturn do_exit(long code)
 	struct task_struct *tsk = current;
 	int group_dead;
 
+	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	profile_task_exit(tsk);
 	kcov_task_exit(tsk);
 
@@ -755,7 +756,6 @@ void __noreturn do_exit(long code)
 	if (tsk->mm)
 		sync_mm_rss(tsk->mm);
 	acct_update_integrals(tsk);
-	group_dead = atomic_dec_and_test(&tsk->signal->live);
 	if (group_dead) {
 		/*
 		 * If the last thread of global init has exited, panic
-- 
2.19.1
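The reason a decrement-and-test can single out the last exiting thread is that exactly one caller sees the counter reach zero, however the decrements interleave. The following userspace sketch shows the idea with C11 atomics; struct group_model and group_model_exit() are hypothetical names, not kernel API, and the kernel's atomic_dec_and_test() is approximated with atomic_fetch_sub().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for signal_struct: only the live-thread counter. */
struct group_model {
	atomic_int live;
};

/*
 * Returns true only for the caller whose decrement takes the counter
 * to zero, which is what group_dead records in do_exit().
 */
bool group_model_exit(struct group_model *g)
{
	/* atomic_fetch_sub() returns the previous value: 1 means we hit zero. */
	return atomic_fetch_sub(&g->live, 1) == 1;
}
```

Any code that runs before the decrement can only read the counter, and with several threads exiting concurrently more than one of them may observe the same value, so the "last thread" question has no reliable answer until the decrement itself has been performed.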
Re: [PATCH v2] bluetooth: hci_qca: Fix QCA6390 memdump failure
Hi Zijun, On Tue, May 26, 2020 at 8:37 PM Zijun Hu wrote: > > QCA6390 memdump VSE sometimes come to bluetooth driver > with wrong sequence number as illustrated as follows: > frame # in DEC: frame data in HEX > 1396: ff fd 01 08 74 05 00 37 8f 14 > 1397: ff fd 01 08 75 05 00 ff bf 38 > 1414: ff fd 01 08 86 05 00 fb 5e 4b > 1399: ff fd 01 08 77 05 00 f3 44 0a > 1400: ff fd 01 08 78 05 00 ca f7 41 > it is mistook for controller missing packets, so results > in page fault after overwriting memdump buffer allocated. > > it is fixed by ignoring QCA6390 sequence number error > and checking buffer space before writing. > > Signed-off-by: Zijun Hu > --- > drivers/bluetooth/hci_qca.c | 45 > ++--- > 1 file changed, 38 insertions(+), 7 deletions(-) > > diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c > index e4a6823..388fe01b 100644 > --- a/drivers/bluetooth/hci_qca.c > +++ b/drivers/bluetooth/hci_qca.c > @@ -114,6 +114,7 @@ struct qca_memdump_data { > char *memdump_buf_tail; > u32 current_seq_no; > u32 received_dump; > + u32 ram_dump_size; > }; > > struct qca_memdump_event_hdr { > @@ -976,6 +977,8 @@ static void qca_controller_memdump(struct work_struct > *work) > char nullBuff[QCA_DUMP_PACKET_SIZE] = { 0 }; > u16 seq_no; > u32 dump_size; > + u32 rx_size; > + enum qca_btsoc_type soc_type = qca_soc_type(hu); > > while ((skb = skb_dequeue(>rx_memdump_q))) { > > @@ -1029,6 +1032,7 @@ static void qca_controller_memdump(struct work_struct > *work) > > skb_pull(skb, sizeof(dump_size)); > memdump_buf = vmalloc(dump_size); > + qca_memdump->ram_dump_size = dump_size; > qca_memdump->memdump_buf_head = memdump_buf; > qca_memdump->memdump_buf_tail = memdump_buf; > } > @@ -1052,25 +1056,52 @@ static void qca_controller_memdump(struct work_struct > *work) > * packets in the buffer. > */ > while ((seq_no > qca_memdump->current_seq_no + 1) && > + (soc_type != QCA_QCA6390) && This probably shouldn't be SOC specific. 
> seq_no != QCA_LAST_SEQUENCE_NUM) { > bt_dev_err(hu->hdev, "QCA controller missed > packet:%d", >qca_memdump->current_seq_no); > + rx_size = qca_memdump->received_dump; > + rx_size += QCA_DUMP_PACKET_SIZE; > + if (rx_size > qca_memdump->ram_dump_size) { > + bt_dev_err(hu->hdev, > + "QCA memdump received %d, no > space for missed packet", > + qca_memdump->received_dump); > + break; > + } > memcpy(memdump_buf, nullBuff, QCA_DUMP_PACKET_SIZE); > memdump_buf = memdump_buf + QCA_DUMP_PACKET_SIZE; > qca_memdump->received_dump += QCA_DUMP_PACKET_SIZE; > qca_memdump->current_seq_no++; > } You can replace this loop with a memset(memdump_buf, 0, (seq_no - qca_memdump->current_seq_no) * QCA_DUMP_PACKET_SIZE). This simplifies the ram_dump_size check as well because it won't zero fill until the end anymore (meaning a single bad seq_no doesn't make the rest of the dump incorrect). > > - memcpy(memdump_buf, (unsigned char *) skb->data, skb->len); > - memdump_buf = memdump_buf + skb->len; > - qca_memdump->memdump_buf_tail = memdump_buf; > - qca_memdump->current_seq_no = seq_no + 1; > - qca_memdump->received_dump += skb->len; > + rx_size = qca_memdump->received_dump + skb->len; > + if (rx_size <= qca_memdump->ram_dump_size) { > + if ((seq_no != QCA_LAST_SEQUENCE_NUM) && > + (seq_no != > qca_memdump->current_seq_no)) > + bt_dev_err(hu->hdev, > + "QCA memdump unexpected > packet %d", > + seq_no); > + bt_dev_dbg(hu->hdev, > + "QCA memdump packet %d with length > %d", > + seq_no, skb->len); > + memcpy(memdump_buf, (unsigned char *)skb->data, > + skb->len); > + memdump_buf = memdump_buf + skb->len; > + qca_memdump->memdump_buf_tail = memdump_buf; > + qca_memdump->current_seq_no = seq_no + 1; > + qca_memdump->received_dump += skb->len; > +