[PATCH AUTOSEL for 4.14 13/51] spi: spi-axi: fix potential use-after-free after deregistration
From: Johan Hovold

[ Upstream commit 4d5e0689dc9d5640ad46cdfbe1896b74d8df1661 ]

Take an extra reference to the controller before deregistering it to
prevent use-after-free in the interrupt handler in case an interrupt
fires before the line is disabled.

Fixes: b1353d1c1d45 ("spi: Add Analog Devices AXI SPI Engine controller support")
Acked-by: Lars-Peter Clausen
Signed-off-by: Johan Hovold
Signed-off-by: Mark Brown
Signed-off-by: Sasha Levin
---
 drivers/spi/spi-axi-spi-engine.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/spi/spi-axi-spi-engine.c b/drivers/spi/spi-axi-spi-engine.c
index 6ab4c7700228..68cfc351b47f 100644
--- a/drivers/spi/spi-axi-spi-engine.c
+++ b/drivers/spi/spi-axi-spi-engine.c
@@ -553,7 +553,7 @@ static int spi_engine_probe(struct platform_device *pdev)

 static int spi_engine_remove(struct platform_device *pdev)
 {
-	struct spi_master *master = platform_get_drvdata(pdev);
+	struct spi_master *master = spi_master_get(platform_get_drvdata(pdev));
 	struct spi_engine *spi_engine = spi_master_get_devdata(master);
 	int irq = platform_get_irq(pdev, 0);

@@ -561,6 +561,8 @@ static int spi_engine_remove(struct platform_device *pdev)

 	free_irq(irq, master);

+	spi_master_put(master);
+
 	writel_relaxed(0xff, spi_engine->base + SPI_ENGINE_REG_INT_PENDING);
 	writel_relaxed(0x00, spi_engine->base + SPI_ENGINE_REG_INT_ENABLE);
 	writel_relaxed(0x01, spi_engine->base + SPI_ENGINE_REG_RESET);
--
2.11.0
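The ordering problem this patch addresses can be illustrated with a small userspace model. This is only a sketch: `struct controller`, `controller_get()`, `controller_put()` and the other helpers are hypothetical stand-ins invented for illustration, not the kernel SPI API, and "freeing" is modelled with a flag so the outcome can be inspected.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted controller (not struct spi_master). */
struct controller {
	int refcount;
	bool freed;	/* set instead of calling free() so callers can inspect it */
};

static struct controller *controller_get(struct controller *c)
{
	assert(c && !c->freed);
	c->refcount++;
	return c;
}

static void controller_put(struct controller *c)
{
	assert(c->refcount > 0);
	if (--c->refcount == 0)
		c->freed = true;
}

/* Models deregistration dropping the reference held by the registration. */
static void controller_unregister(struct controller *c)
{
	controller_put(c);
}

/* Models an interrupt handler touching the controller; true means "safe". */
static bool irq_handler_touch(const struct controller *c)
{
	return !c->freed;
}

/* Patched ordering: hold an extra reference until the IRQ is released. */
static bool remove_with_extra_ref(struct controller *c)
{
	bool safe;

	controller_get(c);
	controller_unregister(c);
	safe = irq_handler_touch(c);	/* a late interrupt before free_irq() */
	/* free_irq() would happen here */
	controller_put(c);
	return safe;
}

/* Unpatched ordering: the late interrupt sees a freed controller. */
static bool remove_without_extra_ref(struct controller *c)
{
	controller_unregister(c);
	return irq_handler_touch(c);
}
```

In this model the extra reference keeps the object alive across the window between deregistration and the point where the interrupt line is released; the final put then frees it.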
[PATCH AUTOSEL for 4.14 07/51] staging: greybus: loopback: Fix iteration count on async path
From: Bryan O'Donoghue

[ Upstream commit 44b02da39210e6dd67e39ff1f48d30c56d384240 ]

Commit 12927835d211 ("greybus: loopback: Add asynchronous bi-directional
support") does what it says on the tin - namely, adds support for
asynchronous bi-directional loopback operations.

What it neglects to do though is increment the per-connection
gb->iteration_count on an asynchronous operation error. This patch fixes
that omission.

Fixes: 12927835d211 ("greybus: loopback: Add asynchronous bi-directional support")
Signed-off-by: Bryan O'Donoghue
Reported-by: Mitch Tasman
Reviewed-by: Johan Hovold
Cc: Alex Elder
Cc: Mitch Tasman
Cc: greybus-...@lists.linaro.org
Cc: de...@driverdev.osuosl.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Sasha Levin
---
 drivers/staging/greybus/loopback.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/greybus/loopback.c b/drivers/staging/greybus/loopback.c
index 08e255884206..93e86798ec1c 100644
--- a/drivers/staging/greybus/loopback.c
+++ b/drivers/staging/greybus/loopback.c
@@ -1042,8 +1042,10 @@ static int gb_loopback_fn(void *data)
 			else if (type == GB_LOOPBACK_TYPE_SINK)
 				error = gb_loopback_async_sink(gb, size);

-			if (error)
+			if (error) {
 				gb->error++;
+				gb->iteration_count++;
+			}
 		} else {
 			/* We are effectively single threaded here */
 			if (type == GB_LOOPBACK_TYPE_PING)
--
2.11.0
[PATCH AUTOSEL for 4.14 06/51] selftests/x86/ldt_gdt: Robustify against set_thread_area() and LAR oddities
From: Andy Lutomirski

[ Upstream commit d60ad744c9741586010d4bea286f09a063a90fbd ]

Bits 19:16 of LAR's result are undefined, and some upcoming
improvements to the test case seem to trigger this. Mask off those
bits to avoid spurious failures.

commit 5b781c7e317f ("x86/tls: Forcibly set the accessed bit in TLS
segments") adds a valid case in which LAR's output doesn't quite
agree with set_thread_area()'s input. This isn't triggered in the
test as is, but it will be if we start calling set_thread_area()
with the accessed bit clear. Work around this discrepancy.

I've added a Fixes tag so that -stable can pick this up if
necessary.

Signed-off-by: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Fixes: 5b781c7e317f ("x86/tls: Forcibly set the accessed bit in TLS segments")
Link: http://lkml.kernel.org/r/b82f3f89c034b53580970ac865139fd8863f44e2.1509794321.git.l...@kernel.org
Signed-off-by: Ingo Molnar
Signed-off-by: Sasha Levin
---
 tools/testing/selftests/x86/ldt_gdt.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
index b1779da72428..2afc41a3730f 100644
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -115,7 +115,15 @@ static void check_valid_segment(uint16_t index, int ldt,
 		return;
 	}

-	if (ar != expected_ar) {
+	/* The SDM says "bits 19:16 are undefined".  Thanks. */
+	ar &= ~0xF0000;
+
+	/*
+	 * NB: Different Linux versions do different things with the
+	 * accessed bit in set_thread_area().
+	 */
+	if (ar != expected_ar &&
+	    (ldt || ar != (expected_ar | AR_ACCESSED))) {
 		printf("[FAIL]\t%s entry %hu has AR 0x%08X but expected 0x%08X\n",
 		       (ldt ? "LDT" : "GDT"), index, ar, expected_ar);
 		nerrs++;
--
2.11.0
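The comparison rule described in this commit message can be written as a standalone predicate. This is a sketch, not the selftest's code: `ar_matches()` and `AR_UNDEFINED_MASK` are names made up here, and the value of `AR_ACCESSED` (bit 8) is an assumption about how the selftest defines it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define AR_UNDEFINED_MASK 0x000F0000u	/* LAR result bits 19:16 */
#define AR_ACCESSED       (1u << 8)	/* assumed position of the accessed bit */

/*
 * Returns true when the access rights read back via LAR are acceptable
 * for the access rights that were requested.
 */
static bool ar_matches(uint32_t ar, uint32_t expected_ar, bool ldt)
{
	/* Expected values must not rely on the undefined bits. */
	assert((expected_ar & AR_UNDEFINED_MASK) == 0);

	ar &= ~AR_UNDEFINED_MASK;
	if (ar == expected_ar)
		return true;

	/* GDT (TLS) entries may come back with the accessed bit forced on. */
	return !ldt && ar == (expected_ar | AR_ACCESSED);
}
```

Note that the mask has to cover bits 19:16 (`0xF0000`); masking only the low nibble would leave the undefined bits in place and keep the spurious failures.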
[PATCH AUTOSEL for 4.14 05/51] selftests/x86/ldt_get: Add a few additional tests for limits
From: Andy Lutomirski

[ Upstream commit fec8f5ae1715a01c72ad52cb2ecd8aacaf142302 ]

We weren't testing the .limit and .limit_in_pages fields very well.
Add more tests.

This addition seems to trigger the "bits 16:19 are undefined" issue
that was fixed in an earlier patch. I think that, at least on my
CPU, the high nibble of the limit ends in LAR bits 16:19.

Signed-off-by: Andy Lutomirski
Cc: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/5601c15ea9b3113d288953fd2838b18bedf6bc67.1509794321.git.l...@kernel.org
Signed-off-by: Ingo Molnar
Signed-off-by: Sasha Levin
---
 tools/testing/selftests/x86/ldt_gdt.c | 17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/x86/ldt_gdt.c b/tools/testing/selftests/x86/ldt_gdt.c
index 961e3ee26c27..b1779da72428 100644
--- a/tools/testing/selftests/x86/ldt_gdt.c
+++ b/tools/testing/selftests/x86/ldt_gdt.c
@@ -367,9 +367,24 @@ static void do_simple_tests(void)
 	install_invalid(&desc, false);

 	desc.seg_not_present = 0;
-	desc.read_exec_only = 0;
 	desc.seg_32bit = 1;
+	desc.read_exec_only = 0;
+	desc.limit = 0xfffff;
+
 	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P | AR_DB);
+
+	desc.limit_in_pages = 1;
+
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA | AR_S | AR_P | AR_DB | AR_G);
+	desc.read_exec_only = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA | AR_S | AR_P | AR_DB | AR_G);
+	desc.contents = 1;
+	desc.read_exec_only = 0;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RWDATA_EXPDOWN | AR_S | AR_P | AR_DB | AR_G);
+	desc.read_exec_only = 1;
+	install_valid(&desc, AR_DPL3 | AR_TYPE_RODATA_EXPDOWN | AR_S | AR_P | AR_DB | AR_G);
+
+	desc.limit = 0;
 	install_invalid(&desc, true);
 }

--
2.11.0
[PATCH AUTOSEL for 4.9 24/54] KVM: arm/arm64: Fix occasional warning from the timer work function
From: Christoffer Dall

[ Upstream commit 63e41226afc3f7a044b70325566fa86ac3142538 ]

When a VCPU blocks (WFI) and has programmed the vtimer, we program a
soft timer to expire in the future to wake up the vcpu thread when
appropriate. Because such a wake up involves a vcpu kick, and the
timer expire function can get called from interrupt context, and the
kick may sleep, we have to schedule the kick in the work function.

The work function currently has a warning that gets raised if it turns
out that the timer shouldn't fire when it's run, which was added because
the idea was that in that case the work should never have been cancelled.

However, it turns out that this whole thing is racy and we can get
spurious warnings. The problem is that we clear the armed flag in the
work function, which may run in parallel with the
kvm_timer_unschedule->timer_disarm() call. This results in a possible
situation where the timer_disarm() call does not call
cancel_work_sync(), which effectively synchronizes the completion of the
work function with running the VCPU. As a result, the VCPU thread
proceeds before the work function completes, causing changes to the
timer state such that kvm_timer_should_fire(vcpu) returns false in the
work function.

All we do in the work function is to kick the VCPU, and an occasional
rare extra kick never harmed anyone. Since the race above is extremely
rare, we don't bother checking if the race happens but simply remove the
check and the clearing of the armed flag from the work function.

Reported-by: Matthias Brugger
Reviewed-by: Marc Zyngier
Signed-off-by: Christoffer Dall
Signed-off-by: Marc Zyngier
Signed-off-by: Sasha Levin
---
 virt/kvm/arm/arch_timer.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/virt/kvm/arm/arch_timer.c b/virt/kvm/arm/arch_timer.c
index 27a1f6341d41..7b49a1378c90 100644
--- a/virt/kvm/arm/arch_timer.c
+++ b/virt/kvm/arm/arch_timer.c
@@ -89,9 +89,6 @@ static void kvm_timer_inject_irq_work(struct work_struct *work)
 	struct kvm_vcpu *vcpu;

 	vcpu = container_of(work, struct kvm_vcpu, arch.timer_cpu.expired);
-	vcpu->arch.timer_cpu.armed = false;
-
-	WARN_ON(!kvm_timer_should_fire(vcpu));

 	/*
 	 * If the vcpu is blocked we want to wake it up so that it will see
--
2.11.0
[PATCH AUTOSEL for 4.9 16/54] libfs: Modify mount_pseudo_xattr to be clear it is not a userspace mount
From: "Eric W. Biederman"

[ Upstream commit 75422726b0f717d67db3283c2eb5bc14fa2619c5 ]

Add MS_KERNMOUNT to the flags that are passed.
Use sget_userns and force &init_user_ns instead of calling sget so that
even if called from a weird context the internal filesystem will be
considered to be in the initial user namespace.

Luis Ressel reported that the failure to pass MS_KERNMOUNT into
mount_pseudo broke his in-development graphics driver that uses the
generic drm infrastructure. I am not certain the driver was bug free in
its usage of that infrastructure, but since mount_pseudo_xattr can never
be triggered by userspace it is clearer, less error prone, and less
problematic for the code to be explicit.

Reported-by: Luis Ressel
Tested-by: Luis Ressel
Acked-by: Al Viro
Signed-off-by: "Eric W. Biederman"
Signed-off-by: Sasha Levin
---
 fs/libfs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/libfs.c b/fs/libfs.c
index 48826d4da189..9588780ad43e 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -245,7 +245,8 @@ struct dentry *mount_pseudo_xattr(struct file_system_type *fs_type, char *name,
 	struct inode *root;
 	struct qstr d_name = QSTR_INIT(name, strlen(name));

-	s = sget(fs_type, NULL, set_anon_super, MS_NOUSER, NULL);
+	s = sget_userns(fs_type, NULL, set_anon_super, MS_KERNMOUNT|MS_NOUSER,
+			&init_user_ns, NULL);
 	if (IS_ERR(s))
 		return ERR_CAST(s);

--
2.11.0
[PATCH AUTOSEL for 4.9 14/54] be2net: fix unicast list filling
From: Ivan Vecera

[ Upstream commit 6052cd1af86f9833b6b0b60d5d4787c4a06d65ea ]

The adapter->pmac_id[0] item is used for primary MAC address but
this is not true for adapter->uc_list[0] as is assumed in
be_set_uc_list(). There are N UC addresses copied first from net_device
to adapter->uc_list[1..N] and then N UC addresses from
adapter->uc_list[0..N-1] are sent to HW. So the last UC address is never
stored into HW and address 00:00:00:00:00:00 (from uc_list[0]) is used
instead.

Cc: Sathya Perla
Cc: Ajit Khaparde
Cc: Sriharsha Basavapatna
Cc: Somnath Kotur
Fixes: b717241 be2net: replace polling with sleeping in the FW completion path
Signed-off-by: Ivan Vecera
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 drivers/net/ethernet/emulex/benet/be_main.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c82f22b244c1..fbb396b08495 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -1719,9 +1719,8 @@ static void be_set_uc_list(struct be_adapter *adapter)
 	}

 	if (adapter->update_uc_list) {
-		i = 1; /* First slot is claimed by the Primary MAC */
-
 		/* cache the uc-list in adapter array */
+		i = 0;
 		netdev_for_each_uc_addr(ha, netdev) {
 			ether_addr_copy(adapter->uc_list[i].mac, ha->addr);
 			i++;
--
2.11.0
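The off-by-one described above is easy to reproduce in isolation. The sketch below is a userspace model under stated assumptions: `struct mac`, `cache_uc_list()` and `programmed_count()` are hypothetical names, and "programming the hardware" is modelled as reading slots `0..n-1` of the cached array, as the commit message describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ETH_ALEN 6
#define MAX_UC   4

struct mac {
	unsigned char addr[ETH_ALEN];
};

/* Cache n source addresses into list[], starting at slot 'first'. */
static void cache_uc_list(struct mac *list, const struct mac *src,
			  size_t n, size_t first)
{
	assert(first + n <= MAX_UC);
	for (size_t k = 0; k < n; k++)
		list[first + k] = src[k];
}

/*
 * The hardware is programmed from list[0..n-1]; count how many of the
 * requested addresses actually appear in that range.
 */
static size_t programmed_count(const struct mac *list, const struct mac *src,
			       size_t n)
{
	size_t hits = 0;

	for (size_t k = 0; k < n; k++) {
		for (size_t j = 0; j < n; j++) {
			if (memcmp(list[j].addr, src[k].addr, ETH_ALEN) == 0) {
				hits++;
				break;
			}
		}
	}
	return hits;
}
```

Starting the copy at slot 1 loses the last address and programs the all-zero slot 0 instead; starting at slot 0 programs every requested address.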
[PATCH AUTOSEL for 4.9 23/54] drm/exynos/decon5433: set STANDALONE_UPDATE_F also if planes are disabled
From: Andrzej Hajda

[ Upstream commit 821b40b79db7dedbfe15ab330dfd181e661a533f ]

STANDALONE_UPDATE_F should be set if something changed in plane
configurations, including plane disable.
The patch fixes page-fault bugs, caused by decon still using framebuffers
of disabled planes.

v2: fixed clear-bit code (Thx Marek)
v3: use test_and_clear_bit (Thx Joonyoung)

Signed-off-by: Andrzej Hajda
Tested-by: Joonyoung Shim
Signed-off-by: Inki Dae
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/exynos/exynos5433_drm_decon.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
index ef7fcb5f044b..09e8cc36948e 100644
--- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
@@ -46,7 +46,8 @@ enum decon_flag_bits {
 	BIT_CLKS_ENABLED,
 	BIT_IRQS_ENABLED,
 	BIT_WIN_UPDATED,
-	BIT_SUSPENDED
+	BIT_SUSPENDED,
+	BIT_REQUEST_UPDATE
 };

 struct decon_context {
@@ -313,6 +314,7 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,

 	/* window enable */
 	decon_set_bits(ctx, DECON_WINCONx(win), WINCONx_ENWIN_F, ~0);
+	set_bit(BIT_REQUEST_UPDATE, &ctx->flags);
 }

 static void decon_disable_plane(struct exynos_drm_crtc *crtc,
@@ -325,6 +327,7 @@ static void decon_disable_plane(struct exynos_drm_crtc *crtc,
 		return;

 	decon_set_bits(ctx, DECON_WINCONx(win), WINCONx_ENWIN_F, 0);
+	set_bit(BIT_REQUEST_UPDATE, &ctx->flags);
 }

 static void decon_atomic_flush(struct exynos_drm_crtc *crtc)
@@ -338,8 +341,7 @@ static void decon_atomic_flush(struct exynos_drm_crtc *crtc)
 	for (i = ctx->first_win; i < WINDOWS_NR; i++)
 		decon_shadow_protect_win(ctx, i, false);

-	/* update iff there are active windows */
-	if (crtc->base.state->plane_mask)
+	if (test_and_clear_bit(BIT_REQUEST_UPDATE, &ctx->flags))
 		decon_set_bits(ctx, DECON_UPDATE, STANDALONE_UPDATE_F, ~0);

 	if (ctx->out_type & IFTYPE_I80)
--
2.11.0
[PATCH AUTOSEL for 4.9 03/56] RDS: RDMA: return appropriate error on rdma map failures
From: Santosh Shilimkar

[ Upstream commit 584a8279a44a800dea5a5c1e9d53a002e03016b4 ]

The first message to a remote node should prompt a new connection even
if it is an RDMA operation. For an RDMA operation the MR mapping can fail
because the connection is not yet up.

Since the connection establishment is asynchronous, we make sure that a
map failure caused by an unavailable connection reaches the user with an
appropriate error code. Before returning to the user, let's trigger the
connection so that it's ready for the next retry.

Signed-off-by: Santosh Shilimkar
Signed-off-by: Sasha Levin
---
 net/rds/send.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/net/rds/send.c b/net/rds/send.c
index 310d57928405..ad247dc71ebb 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -946,6 +946,11 @@ static int rds_cmsg_send(struct rds_sock *rs, struct rds_message *rm,
 			ret = rds_cmsg_rdma_map(rs, rm, cmsg);
 			if (!ret)
 				*allocated_mr = 1;
+			else if (ret == -ENODEV)
+				/* Accommodate the get_mr() case which can fail
+				 * if connection isn't established yet.
+				 */
+				ret = -EAGAIN;
 			break;
 		case RDS_CMSG_ATOMIC_CSWP:
 		case RDS_CMSG_ATOMIC_FADD:
@@ -1114,8 +1119,12 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)

 	/* Parse any control messages the user may have included. */
 	ret = rds_cmsg_send(rs, rm, msg, &allocated_mr);
-	if (ret)
+	if (ret) {
+		/* Trigger connection so that its ready for the next retry */
+		if (ret == -EAGAIN)
+			rds_conn_connect_if_down(conn);
 		goto out;
+	}

 	if (rm->rdma.op_active && !conn->c_trans->xmit_rdma) {
 		printk_ratelimited(KERN_NOTICE "rdma_op %p conn xmit_rdma %p\n",
--
2.11.0
Re: [PATCH] hwmon: (pmbus/lm25066) Swap low/high current coefficients for LM5066(i)
On Wed, Nov 22, 2017 at 02:07:28PM -0800, Robert Lippert wrote:
> The _L low-current mode coefficient values should reference the
> datasheet rows with CL=VDD but it seems were mistakenly pulled from
> the rows with CL=GND.
>
> This causes the current/power to be reported as approximately double
> the actual value when CL=GND and half the actual value when CL=VDD.
> This would affect all chips supported by this driver.

Hmm, and I was sure I tested this. I'll have to dig out my hardware and confirm.

The code currently only uses bit 4 of the DEVICE_SETUP (D9h) command to
determine which current limit setting to use. Looking into the datasheet,
it looks like it also has to evaluate bit 2, and I wonder if there is a means
to determine CL if bit 2 = 0. Any idea ? Does bit 4 report the CL pin value
if bit 2 = 0 ?

Thanks,
Guenter

> Signed-off-by: Robert Lippert
> ---
>  drivers/hwmon/pmbus/lm25066.c | 24 ++++++++++++------------
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/hwmon/pmbus/lm25066.c b/drivers/hwmon/pmbus/lm25066.c
> index 10d17fb8f283..aa052f4449a9 100644
> --- a/drivers/hwmon/pmbus/lm25066.c
> +++ b/drivers/hwmon/pmbus/lm25066.c
> @@ -191,19 +191,19 @@ static struct __coeff lm25066_coeff[6][PSC_NUM_CLASSES
> + 2] = {
>  			.R = -2,
>  		},
>  		[PSC_CURRENT_IN] = {
> -			.m = 10753,
> +			.m = 5405,
>  			.R = -2,
>  		},
>  		[PSC_CURRENT_IN_L] = {
> -			.m = 5405,
> +			.m = 10753,
>  			.R = -2,
>  		},
>  		[PSC_POWER] = {
> -			.m = 1204,
> +			.m = 605,
>  			.R = -3,
>  		},
>  		[PSC_POWER_L] = {
> -			.m = 605,
> +			.m = 1204,
>  			.R = -3,
>  		},
>  		[PSC_TEMPERATURE] = {
> @@ -222,23 +222,23 @@ static struct __coeff lm25066_coeff[6][PSC_NUM_CLASSES
> + 2] = {
>  			.R = -2,
>  		},
>  		[PSC_CURRENT_IN] = {
> -			.m = 15076,
> -			.b = -504,
> +			.m = 7645,
> +			.b = 100,
>  			.R = -2,
>  		},
>  		[PSC_CURRENT_IN_L] = {
> -			.m = 7645,
> -			.b = 100,
> +			.m = 15076,
> +			.b = -504,
>  			.R = -2,
>  		},
>  		[PSC_POWER] = {
> -			.m = 1701,
> -			.b = -4000,
> +			.m = 861,
> +			.b = -965,
>  			.R = -3,
>  		},
>  		[PSC_POWER_L] = {
> -			.m = 861,
> -			.b = -965,
> +			.m = 1701,
> +			.b = -4000,
>  			.R = -3,
>  		},
>  		[PSC_TEMPERATURE] = {
> --
> 2.15.0.448.gf294e3d99a-goog
>
Re: [PATCH 1/2] ALSA: pcm: add SNDRV_PCM_FORMAT_{S, U}20_4
On Nov 23 2017 04:17, Maciej S. Szmigiero wrote:
> This format is similar to existing SNDRV_PCM_FORMAT_{S,U}20_3 that keep
> 20-bit PCM samples in 3 bytes, however i.MX6 platform SSI FIFO does not
> allow 3-byte accesses (including DMA) so a 4-byte format is needed for it.
>
> Signed-off-by: Maciej S. Szmigiero
> ---
>  include/sound/pcm.h         |  8 ++++++++
>  include/sound/soc-dai.h     |  2 ++
>  include/uapi/sound/asound.h | 10 +++++++++-
>  sound/core/pcm_misc.c       | 16 ++++++++++++++++
>  4 files changed, 35 insertions(+), 1 deletion(-)
> ...
> diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
> index c227ccba60ae..69b661816491 100644
> --- a/include/uapi/sound/asound.h
> +++ b/include/uapi/sound/asound.h
> @@ -236,7 +236,11 @@ typedef int __bitwise snd_pcm_format_t;
>  #define SNDRV_PCM_FORMAT_DSD_U32_LE ((__force snd_pcm_format_t) 50) /* DSD, 4-byte samples DSD (x32), little endian */
>  #define SNDRV_PCM_FORMAT_DSD_U16_BE ((__force snd_pcm_format_t) 51) /* DSD, 2-byte samples DSD (x16), big endian */
>  #define SNDRV_PCM_FORMAT_DSD_U32_BE ((__force snd_pcm_format_t) 52) /* DSD, 4-byte samples DSD (x32), big endian */
> -#define	SNDRV_PCM_FORMAT_LAST		SNDRV_PCM_FORMAT_DSD_U32_BE
> +#define	SNDRV_PCM_FORMAT_S20_4LE	((__force snd_pcm_format_t) 53) /* in four bytes */
> +#define	SNDRV_PCM_FORMAT_S20_4BE	((__force snd_pcm_format_t) 54) /* in four bytes */
> +#define	SNDRV_PCM_FORMAT_U20_4LE	((__force snd_pcm_format_t) 55) /* in four bytes */
> +#define	SNDRV_PCM_FORMAT_U20_4BE	((__force snd_pcm_format_t) 56) /* in four bytes */
> +#define	SNDRV_PCM_FORMAT_LAST		SNDRV_PCM_FORMAT_U20_4BE

In my opinion, for this type of definition, it's better to declare
whether the samples are left- or right-adjusted, or which side carries
the padding. (Of course, the silence definition is already a hint,
however the lack of information forces developers to be careful when
handling entries on the list.)

(I note that in the current ALSA PCM interface there's no way to deliver
MSB/LSB-first information about the sample format.)

Additionally, alsa-lib includes some code related to the definition[1].
If you'd like things to go well outside the ALSA SoC part, it's better to
submit changes to the library as well.

[1] http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm_misc.c;h=5420b1895713a3aec3624a5218794a7b49baf167;hb=HEAD

Regards

Takashi Sakamoto
[PATCH AUTOSEL for 4.9 05/54] dmaengine: stm32-dma: Fix null pointer dereference in stm32_dma_tx_status
From: M'boumba Cedric Madianga

[ Upstream commit 57b5a32135c813f2ab669039fb4ec16b30cb3305 ]

chan->desc is always set to NULL when a DMA transfer is complete.
As a DMA transfer could be complete during the call of
stm32_dma_tx_status, we need to be sure that chan->desc is not NULL before
using this variable to avoid a null pointer dereference issue.

Signed-off-by: M'boumba Cedric Madianga
Reviewed-by: Ludovic BARRE
Signed-off-by: Vinod Koul
Signed-off-by: Sasha Levin
---
 drivers/dma/stm32-dma.c | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/dma/stm32-dma.c b/drivers/dma/stm32-dma.c
index bd758e67843c..ae3f60be7759 100644
--- a/drivers/dma/stm32-dma.c
+++ b/drivers/dma/stm32-dma.c
@@ -884,7 +884,7 @@ static enum dma_status stm32_dma_tx_status(struct dma_chan *c,
 	struct virt_dma_desc *vdesc;
 	enum dma_status status;
 	unsigned long flags;
-	u32 residue;
+	u32 residue = 0;

 	status = dma_cookie_status(c, cookie, state);
 	if ((status == DMA_COMPLETE) || (!state))
@@ -892,16 +892,12 @@ static enum dma_status stm32_dma_tx_status(struct dma_chan *c,

 	spin_lock_irqsave(&chan->vchan.lock, flags);
 	vdesc = vchan_find_desc(&chan->vchan, cookie);
-	if (cookie == chan->desc->vdesc.tx.cookie) {
+	if (chan->desc && cookie == chan->desc->vdesc.tx.cookie)
 		residue = stm32_dma_desc_residue(chan, chan->desc,
 						 chan->next_sg);
-	} else if (vdesc) {
+	else if (vdesc)
 		residue = stm32_dma_desc_residue(chan,
 						 to_stm32_dma_desc(vdesc), 0);
-	} else {
-		residue = 0;
-	}
-
 	dma_set_residue(state, residue);

 	spin_unlock_irqrestore(&chan->vchan.lock, flags);
--
2.11.0
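The shape of the fix (default the result to zero, and test the pointer before comparing through it) can be shown with a reduced model. The types and the function name below are hypothetical; they only mirror the structure of the patched branch, not the stm32-dma driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor: a cookie plus the number of bytes still pending. */
struct desc {
	int cookie;
	uint32_t bytes_left;
};

/* 'active' becomes NULL as soon as the running transfer completes. */
struct channel {
	struct desc *active;
};

static uint32_t residue_for(const struct channel *chan, int cookie,
			    const struct desc *queued)
{
	uint32_t residue = 0;	/* nothing pending unless proven otherwise */

	assert(chan != NULL);

	if (chan->active && cookie == chan->active->cookie)
		residue = chan->active->bytes_left;	/* transfer in flight */
	else if (queued)
		residue = queued->bytes_left;		/* still waiting in the queue */

	return residue;
}
```

With the check in place, a transfer that completes just before the status query is reported as having no residue instead of dereferencing a NULL descriptor.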
[PATCH AUTOSEL for 4.9 07/54] libcxgb: fix error check for ip6_route_output()
From: Varun Prakash

[ Upstream commit a9a8cdb368d99bb655b5cdabea560446db0527cc ]

ip6_route_output() never returns NULL so
check dst->error instead of !dst.

Signed-off-by: Varun Prakash
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
---
 drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c b/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
index 0f0de5b63622..d04a6c163445 100644
--- a/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
+++ b/drivers/net/ethernet/chelsio/libcxgb/libcxgb_cm.c
@@ -133,17 +133,15 @@ cxgb_find_route6(struct cxgb4_lld_info *lldi,
 		if (ipv6_addr_type(&fl6.daddr) & IPV6_ADDR_LINKLOCAL)
 			fl6.flowi6_oif = sin6_scope_id;
 		dst = ip6_route_output(&init_net, NULL, &fl6);
-		if (!dst)
-			goto out;
-		if (!cxgb_our_interface(lldi, get_real_dev,
-					ip6_dst_idev(dst)->dev) &&
-		    !(ip6_dst_idev(dst)->dev->flags & IFF_LOOPBACK)) {
+		if (dst->error ||
+		    (!cxgb_our_interface(lldi, get_real_dev,
+					 ip6_dst_idev(dst)->dev) &&
+		     !(ip6_dst_idev(dst)->dev->flags & IFF_LOOPBACK))) {
 			dst_release(dst);
-			dst = NULL;
+			return NULL;
 		}
 	}

-out:
 	return dst;
 }
 EXPORT_SYMBOL(cxgb_find_route6);
--
2.11.0
[PATCH AUTOSEL for 4.9 06/56] drm/sun4i: Fix a return value in case of error
From: Christophe JAILLET

[ Upstream commit 0f0861e31e3c59ca4bc1ec59d99260cfca79740e ]

If 'sun4i_backend_drm_format_to_layer()' does not return 0, then 'val' is
left unmodified.
As it is not initialized either, the return value can be anything.

It is likely that returning the error code was expected here.

As the only caller of 'sun4i_backend_update_layer_formats()' does not check
the return value, this fix is purely theoretical.

Signed-off-by: Christophe JAILLET
Signed-off-by: Maxime Ripard
Signed-off-by: Sasha Levin
---
 drivers/gpu/drm/sun4i/sun4i_backend.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index 6e6c59a661b6..223944a3ba18 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -172,7 +172,7 @@ int sun4i_backend_update_layer_formats(struct sun4i_backend *backend,
 	ret = sun4i_backend_drm_format_to_layer(plane, fb->pixel_format, &val);
 	if (ret) {
 		DRM_DEBUG_DRIVER("Invalid format\n");
-		return val;
+		return ret;
 	}

 	regmap_update_bits(backend->regs, SUN4I_BACKEND_ATTCTL_REG1(layer),
--
2.11.0
[PATCH AUTOSEL for 4.9 05/56] PCI: Apply _HPX settings only to relevant devices
From: Bjorn Helgaas

[ Upstream commit 977509f7c5c6fb992ffcdf4291051af343b91645 ]

Previously we didn't check the type of device before trying to apply Type 1
(PCI-X) or Type 2 (PCIe) Setting Records from _HPX.

We don't support PCI-X Setting Records, so this was harmless, but the
warning was useless.

We do support PCIe Setting Records, and we didn't check whether a device
was PCIe before applying settings. I don't think anything bad happened on
non-PCIe devices because pcie_capability_clear_and_set_word(),
pcie_cap_has_lnkctl(), etc., would fail before doing any harm. But it's
ugly to depend on those internals.

Check the device type before attempting to apply Type 1 and Type 2 Setting
Records (Type 0 records are applicable to PCI, PCI-X, and PCIe devices).

A side benefit is that this prevents useless "not supported" warnings when
a BIOS supplies a Type 1 (PCI-X) Setting Record and we try to apply it to
every single device:

  pci 0000:00:00.0: PCI-X settings not supported

After this patch, we'll get the warning only when a BIOS supplies a Type 1
record and we have a PCI-X device to which it should be applied.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=187731
Signed-off-by: Bjorn Helgaas
Signed-off-by: Sasha Levin
---
 drivers/pci/probe.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index d266d800f246..60bada90cd75 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1438,8 +1438,16 @@ static void program_hpp_type0(struct pci_dev *dev, struct hpp_type0 *hpp)

 static void program_hpp_type1(struct pci_dev *dev, struct hpp_type1 *hpp)
 {
-	if (hpp)
-		dev_warn(&dev->dev, "PCI-X settings not supported\n");
+	int pos;
+
+	if (!hpp)
+		return;
+
+	pos = pci_find_capability(dev, PCI_CAP_ID_PCIX);
+	if (!pos)
+		return;
+
+	dev_warn(&dev->dev, "PCI-X settings not supported\n");
 }

 static bool pcie_root_rcb_set(struct pci_dev *dev)
@@ -1465,6 +1473,9 @@ static void program_hpp_type2(struct pci_dev *dev, struct hpp_type2 *hpp)
 	if (!hpp)
 		return;

+	if (!pci_is_pcie(dev))
+		return;
+
 	if (hpp->revision > 1) {
 		dev_warn(&dev->dev, "PCIe settings rev %d not supported\n",
 			 hpp->revision);
--
2.11.0
[PATCH AUTOSEL for 4.9 06/54] usb: gadget: f_fs: Fix ExtCompat descriptor validation
From: Vincent Pelletier

[ Upstream commit 354bc45bf329494ef6051f3229ef50b9e2a7ea2a ]

Reserved1 is documented as expected to be set to 0, but this test fails
when it is set to 0. Reverse the condition.

Signed-off-by: Vincent Pelletier
Signed-off-by: Felipe Balbi
Signed-off-by: Sasha Levin
---
 drivers/usb/gadget/function/f_fs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 273320fa30ae..4fce83266926 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -2263,7 +2263,7 @@ static int __ffs_data_do_os_desc(enum ffs_os_desc_type type,

 		if (len < sizeof(*d) ||
 		    d->bFirstInterfaceNumber >= ffs->interfaces_count ||
-		    !d->Reserved1)
+		    d->Reserved1)
 			return -EINVAL;
 		for (i = 0; i < ARRAY_SIZE(d->Reserved2); ++i)
 			if (d->Reserved2[i])
--
2.11.0
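The validation rule as this commit message states it (reserved fields must be zero, so a non-zero value is the error case) can be sketched as a standalone check. The struct layout and names below are hypothetical simplifications, not the f_fs definitions, and the rule itself is taken from this commit message only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified descriptor layout for illustration only. */
struct ext_compat_desc {
	uint8_t first_interface;
	uint8_t reserved1;
	uint8_t compatible_id[8];
	uint8_t sub_compatible_id[8];
	uint8_t reserved2[6];
};

/* Returns 0 if the descriptor is acceptable, -EINVAL otherwise. */
static int validate_ext_compat(const struct ext_compat_desc *d, size_t len,
			       unsigned int interfaces_count)
{
	assert(d != NULL);

	if (len < sizeof(*d) ||
	    d->first_interface >= interfaces_count ||
	    d->reserved1)		/* reject NON-zero, accept zero */
		return -EINVAL;

	for (size_t i = 0; i < sizeof(d->reserved2); i++)
		if (d->reserved2[i])
			return -EINVAL;

	return 0;
}
```

The original condition had the sense inverted for the first reserved field only, so a descriptor that followed the documentation was the one being rejected.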
[GIT PULL] platform-drivers-x86 for 4.15-2
Hi Linus,

No merge-specific notes for this pull request, content described below in the
tag.

The following changes since commit aaa40965d2342137d756121993c395e2a7463a8d:

  platform/x86: silead_dmi: Add silead, home-button property to some tablets (2017-11-18 19:28:58 +0200)

are available in the git repository at:

  git://git.infradead.org/linux-platform-drivers-x86.git tags/platform-drivers-x86-v4.15-2

for you to fetch changes up to c6f9288ee460565b94994aaf3261318199c2a674:

  platform/x86: dell-laptop: fix error return code in dell_init() (2017-11-21 20:11:44 +0200)

Thanks,

Darren Hart
VMware Open Source Technology Center

platform-drivers-x86 for v4.15-2

Fix two issues resulting from the dell-smbios refactoring and introduction
of the dell-smbios-wmi dispatcher. The first ensures a proper error code is
returned when kzalloc fails. The second avoids an issue in older Dell BIOS
implementations which would fail if the more complex calls were made by
limiting those platforms to the simple calls such as those used by the
existing dell-laptop and dell-wmi drivers, preserving their functionality
prior to the addition of the dell-smbios-wmi dispatcher.

The following is an automated git shortlog grouped by driver:

dell-laptop:
 - Fix error return code in dell_init()

dell-smbios-wmi:
 - Disable userspace interface if missing hotfix

Mario Limonciello (1):
      platform/x86: dell-smbios-wmi: Disable userspace interface if missing hotfix

weiyongjun (A) (1):
      platform/x86: dell-laptop: fix error return code in dell_init()

 drivers/platform/x86/dell-laptop.c         |  4 +++-
 drivers/platform/x86/dell-smbios-wmi.c     | 13 +++++++++++++
 drivers/platform/x86/dell-wmi-descriptor.c | 26 ++++++++++++++++++++++++--
 drivers/platform/x86/dell-wmi-descriptor.h |  1 +
 4 files changed, 41 insertions(+), 3 deletions(-)

--
Darren Hart
VMware Open Source Technology Center
Re: [PATCH v1] PCI: Remove unused HyperTransport interrupt support
Bjorn Helgaaswrites: > From: Bjorn Helgaas > > There are no in-tree callers of ht_create_irq(), the driver interface for > HyperTransport interrupts. Remove the unused entry point and all the > supporting code. > > See 8b955b0dddb3 ("[PATCH] Initial generic hypertransport interrupt > support"). This support has been in use until comparatively recently. But now that the ipath driver has been removed from the kernel, and apparently no other native hypertransport cards it does seem reasonable to remove this support. 6f9b38903c06 ("IB/ipath: Deprecate ipath driver and move to staging.") b85d9905a7ca ("staging/rdma: remove deprecated ipath driver") Acked-by: "Eric W. Biederman" > Signed-off-by: Bjorn Helgaas > Cc: Thomas Gleixner > Cc: Ingo Molnar > Cc: H. Peter Anvin > Cc: Eric W. Biederman > Cc: Benjamin Herrenschmidt > Cc: Andi Kleen > --- > arch/x86/include/asm/hw_irq.h |8 - > arch/x86/include/asm/hypertransport.h | 46 > arch/x86/include/asm/irqdomain.h |6 - > arch/x86/kernel/apic/Makefile |1 > arch/x86/kernel/apic/htirq.c | 198 > - > arch/x86/kernel/apic/vector.c |5 - > drivers/pci/Kconfig |9 -- > drivers/pci/Makefile |3 - > drivers/pci/htirq.c | 135 --- > include/linux/htirq.h | 39 --- > include/linux/pci.h |6 - > 11 files changed, 2 insertions(+), 454 deletions(-) > delete mode 100644 arch/x86/include/asm/hypertransport.h > delete mode 100644 arch/x86/kernel/apic/htirq.c > delete mode 100644 drivers/pci/htirq.c > delete mode 100644 include/linux/htirq.h > > diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h > index b80e46733909..2851077b6051 100644 > --- a/arch/x86/include/asm/hw_irq.h > +++ b/arch/x86/include/asm/hw_irq.h > @@ -99,14 +99,6 @@ struct irq_alloc_info { > void*dmar_data; > }; > #endif > -#ifdef CONFIG_HT_IRQ > - struct { > - int ht_pos; > - int ht_idx; > - struct pci_dev *ht_dev; > - void*ht_update; > - }; > -#endif > #ifdef CONFIG_X86_UV > struct { > int uv_limit; > diff --git a/arch/x86/include/asm/hypertransport.h > 
b/arch/x86/include/asm/hypertransport.h > deleted file mode 100644 > index 5d55df352879.. > --- a/arch/x86/include/asm/hypertransport.h > +++ /dev/null > @@ -1,46 +0,0 @@ > -/* SPDX-License-Identifier: GPL-2.0 */ > -#ifndef _ASM_X86_HYPERTRANSPORT_H > -#define _ASM_X86_HYPERTRANSPORT_H > - > -/* > - * Constants for x86 Hypertransport Interrupts. > - */ > - > -#define HT_IRQ_LOW_BASE 0xf800 > - > -#define HT_IRQ_LOW_VECTOR_SHIFT 16 > -#define HT_IRQ_LOW_VECTOR_MASK 0x00ff > -#define HT_IRQ_LOW_VECTOR(v) \ > - (((v) << HT_IRQ_LOW_VECTOR_SHIFT) & HT_IRQ_LOW_VECTOR_MASK) > - > -#define HT_IRQ_LOW_DEST_ID_SHIFT 8 > -#define HT_IRQ_LOW_DEST_ID_MASK 0xff00 > -#define HT_IRQ_LOW_DEST_ID(v) > \ > - (((v) << HT_IRQ_LOW_DEST_ID_SHIFT) & HT_IRQ_LOW_DEST_ID_MASK) > - > -#define HT_IRQ_LOW_DM_PHYSICAL 0x000 > -#define HT_IRQ_LOW_DM_LOGICAL0x040 > - > -#define HT_IRQ_LOW_RQEOI_EDGE0x000 > -#define HT_IRQ_LOW_RQEOI_LEVEL 0x020 > - > - > -#define HT_IRQ_LOW_MT_FIXED 0x000 > -#define HT_IRQ_LOW_MT_ARBITRATED 0x004 > -#define HT_IRQ_LOW_MT_SMI0x008 > -#define HT_IRQ_LOW_MT_NMI0x00c > -#define HT_IRQ_LOW_MT_INIT 0x010 > -#define HT_IRQ_LOW_MT_STARTUP0x014 > -#define HT_IRQ_LOW_MT_EXTINT 0x018 > -#define HT_IRQ_LOW_MT_LINT1 0x08c > -#define HT_IRQ_LOW_MT_LINT0 0x098 > - > -#define HT_IRQ_LOW_IRQ_MASKED0x001 > - > - > -#define HT_IRQ_HIGH_DEST_ID_SHIFT0 > -#define HT_IRQ_HIGH_DEST_ID_MASK 0x00ff > -#define HT_IRQ_HIGH_DEST_ID(v) > \ > - v) >> 8) << HT_IRQ_HIGH_DEST_ID_SHIFT) & HT_IRQ_HIGH_DEST_ID_MASK) > - > -#endif /* _ASM_X86_HYPERTRANSPORT_H */ > diff --git a/arch/x86/include/asm/irqdomain.h > b/arch/x86/include/asm/irqdomain.h > index f695cc6b8e1f..139feef467f7 100644 > --- a/arch/x86/include/asm/irqdomain.h > +++ b/arch/x86/include/asm/irqdomain.h > @@ -56,10 +56,4 @@ extern void
Clang patch stacks for LTS kernels and status update
This is a follow-up on my earlier post on clang patch stacks for LTS
kernels (https://lkml.org/lkml/2017/8/22/912). In the meantime v4.14
(LTS) has been released, which includes almost all changes for basic
clang support, also a few issues have been fixed in clang.

Status of v4.14:

- all archs
  - it is strongly recommended to use clang 5 (or higher), additional
    patches are needed for clang 4 (the patches can be found in the
    '_ext' patch stacks referenced below)
  - allyesconfig/allmodconfig
    - 4 extra patches are needed, all landed in Linus' tree for 4.15
    - other architecture specific patches may be needed (see below)
    - need to disable
      - CONFIG_EXOFS_FS
        - extensive usage of VLAIS
          (http://elixir.free-electrons.com/linux/v4.14/source/fs/exofs/ore_raid.c#L74)
      - CONFIG_KASAN
        - clang support is WIP

- x86
  - defconfig
    - builds without extra patches
  - allyesconfig/allmodconfig
    - see 'all archs'
  - allnoconfig
    - (at least) errors on inline assembly constraints in percpu macros

- arm64
  - all configs
    - 3 extra patches are needed, two of them have landed in the kbuild
      tree, the other is a workaround for a clang issue that has been
      fixed, however no release with that fix is available at this time
  - defconfig
    - builds with the 3 common patches
  - allyesconfig/allmodconfig
    - see 'all archs'
    - need to disable
      - CONFIG_CPU_BIG_ENDIAN
        - clang 5 does not support target aarch64_be_linux
      - CONFIG_ARM64_LSE_ATOMICS
        - clang 5 does not support -ffixed-REG and -fcall-saved-REG
          (https://bugs.llvm.org/show_bug.cgi?id=9457)
      - CONFIG_VIDEO_QCOM_CAMSS
        - "undefined reference to `__compiletime_assert_N
        - fixed in upstream clang, no release with the fix available yet
    - still fails with "arch/arm64/kernel/entry.o:(.altinstr_replacement+0x0):
      relocation truncated to fit: R_AARCH64_JUMP26 against `.entry.text"
  - allnoconfig
    - builds with the 3 common patches

Patch stacks for LTS kernels v4.4, v4.9 and v4.14 can be found here:
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.4
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.4_ext
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.9
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.9_ext
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.14
https://chromium.googlesource.com/chromiumos/third_party/kernel/+log/sandbox/mka/llvm/v4.14_ext

The 4.x stacks have all necessary patches to build arm64 and x86
defconfig with clang 5, the '_ext' stacks include additional patches
for building with clang 4, fixes for warnings or issues with other
configurations like 'allyesconfig'.

The OBSOLETE tag used in earlier versions of the stack has been
replaced with CLANGx tags, indicating that a patch is only needed with
clang version <= x and can be dropped otherwise. A brief description of
the other tags can be found here: https://lkml.org/lkml/2017/8/22/912

To retrieve the patches (v4.14):

  git fetch https://chromium.googlesource.com/chromiumos/third_party/kernel refs/sandbox/mka/llvm/v4.14
  git checkout -b llvm_v4.14 FETCH_HEAD

To build the kernel with clang (x86/native):

  make CC=clang

or cross compilation

  make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- CC=clang

In case you wonder if kernels built with clang are actually used on
real systems, the answer is 'yes': Google Pixel 2 phones ship with a
v4.4 kernel built with clang, for several Chromebooks (both x86 and
arm64) clang built kernels (also v4.4) are currently distributed
through the 'beta' channel and will be deployed to most users of these
devices in December.

For more information on this topic:

https://lwn.net/Articles/734071/
https://www.linuxplumbersconf.org/2017/ocw/system/presentations/4799/original/LPC%202017-%20Clang%20built%20kernels.pdf
https://llvm.org/devmtg/2017-10/#talk21

Thanks

Matthias
[Patch v8 09/16] CIFS: SMBD: Implement function to send data via RDMA send
From: Long LiThe transport doesn't maintain send buffers or send queue for transferring payload via RDMA send. There is no data copy in the transport on send. Signed-off-by: Long Li --- fs/cifs/smbdirect.c | 246 fs/cifs/smbdirect.h | 5 ++ 2 files changed, 251 insertions(+) diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c index 6cff234..cb062e2 100644 --- a/fs/cifs/smbdirect.c +++ b/fs/cifs/smbdirect.c @@ -41,6 +41,12 @@ static int smbd_post_recv( struct smbd_response *response); static int smbd_post_send_empty(struct smbd_connection *info); +static int smbd_post_send_data( + struct smbd_connection *info, + struct kvec *iov, int n_vec, int remaining_data_length); +static int smbd_post_send_page(struct smbd_connection *info, + struct page *page, unsigned long offset, + size_t size, int remaining_data_length); /* SMBD version number */ #define SMBD_V10x0100 @@ -177,6 +183,10 @@ static void smbd_destroy_rdma_work(struct work_struct *work) log_rdma_event(INFO, "cancelling send immediate work\n"); cancel_delayed_work_sync(>send_immediate_work); + log_rdma_event(INFO, "wait for all send to finish\n"); + wait_event(info->wait_smbd_send_pending, + info->smbd_send_pending == 0); + log_rdma_event(INFO, "wait for all recv to finish\n"); wake_up_interruptible(>wait_reassembly_queue); wait_event(info->wait_smbd_recv_pending, @@ -1078,6 +1088,24 @@ static int smbd_post_send_sgl(struct smbd_connection *info, } /* + * Send a page + * page: the page to send + * offset: offset in the page to send + * size: length in the page to send + * remaining_data_length: remaining data to send in this payload + */ +static int smbd_post_send_page(struct smbd_connection *info, struct page *page, + unsigned long offset, size_t size, int remaining_data_length) +{ + struct scatterlist sgl; + + sg_init_table(, 1); + sg_set_page(, page, size, offset); + + return smbd_post_send_sgl(info, , size, remaining_data_length); +} + +/* * Send an empty message * Empty message is used to extend credits to 
peer to for keep live * while there is no upper layer payload to send at the time @@ -1089,6 +1117,35 @@ static int smbd_post_send_empty(struct smbd_connection *info) } /* + * Send a data buffer + * iov: the iov array describing the data buffers + * n_vec: number of iov array + * remaining_data_length: remaining data to send following this packet + * in segmented SMBD packet + */ +static int smbd_post_send_data( + struct smbd_connection *info, struct kvec *iov, int n_vec, + int remaining_data_length) +{ + int i; + u32 data_length = 0; + struct scatterlist sgl[SMBDIRECT_MAX_SGE]; + + if (n_vec > SMBDIRECT_MAX_SGE) { + cifs_dbg(VFS, "Can't fit data to SGL, n_vec=%d\n", n_vec); + return -ENOMEM; + } + + sg_init_table(sgl, n_vec); + for (i = 0; i < n_vec; i++) { + data_length += iov[i].iov_len; + sg_set_buf([i], iov[i].iov_base, iov[i].iov_len); + } + + return smbd_post_send_sgl(info, sgl, data_length, remaining_data_length); +} + +/* * Post a receive request to the transport * The remote peer can only send data when a receive request is posted * The interaction is controlled by send/receive credit system @@ -1652,6 +1709,9 @@ struct smbd_connection *_smbd_get_connection( queue_delayed_work(info->workqueue, >idle_timer_work, info->keep_alive_interval*HZ); + init_waitqueue_head(>wait_smbd_send_pending); + info->smbd_send_pending = 0; + init_waitqueue_head(>wait_smbd_recv_pending); info->smbd_recv_pending = 0; @@ -1943,3 +2003,189 @@ int smbd_recv(struct smbd_connection *info, struct msghdr *msg) msg->msg_iter.count = 0; return rc; } + +/* + * Send data to transport + * Each rqst is transported as a SMBDirect payload + * rqst: the data to write + * return value: 0 if successfully write, otherwise error code + */ +int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst) +{ + struct kvec vec; + int nvecs; + int size; + int buflen = 0, remaining_data_length; + int start, i, j; + int max_iov_size = + info->max_send_size - sizeof(struct smbd_data_transfer); + struct 
kvec iov[SMBDIRECT_MAX_SGE]; + int rc; + + info->smbd_send_pending++; + if (info->transport_status != SMBD_CONNECTED) { + rc = -ENODEV; + goto done; + } + + /* +* This usually means a configuration error +* We use RDMA read/write for packet size > rdma_readwrite_threshold +* as long as it's properly configured we should never get
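The length accounting done by smbd_post_send_data() in the patch above can be sketched in plain userspace C. This is only an illustration: `sketch_kvec`, `sketch_total_length` and the value of `SKETCH_MAX_SGE` are made-up stand-ins for the kernel's `struct kvec` and `SMBDIRECT_MAX_SGE`, and no scatterlist is actually built.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed limit for illustration; stands in for SMBDIRECT_MAX_SGE */
#define SKETCH_MAX_SGE 16

/* Hypothetical minimal stand-in for the kernel's struct kvec */
struct sketch_kvec {
	void *iov_base;
	size_t iov_len;
};

/*
 * Return the total payload length described by iov[0..n_vec-1],
 * or -ENOMEM when the vectors would not fit into one scatterlist.
 */
static long sketch_total_length(const struct sketch_kvec *iov, int n_vec)
{
	long total = 0;
	int i;

	/* Reject requests with more vectors than scatter/gather entries */
	if (n_vec > SKETCH_MAX_SGE)
		return -ENOMEM;

	/* Sum the per-vector lengths, as the patch does for data_length */
	for (i = 0; i < n_vec; i++)
		total += (long)iov[i].iov_len;

	return total;
}
```

The point of checking n_vec first is that the fixed-size scatterlist array is never indexed past its end.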
[Patch v8 12/16] CIFS: SMBD: Upper layer performs SMB write via RDMA read through memory registration
From: Long LiWhen sending I/O, if size is larger than rdma_readwrite_threshold we prepare to send SMB write packet for a RDMA read via memory registration. The actual I/O is done by remote peer through local RDMA hardware. Modify the relevant fields in the packet accordingly, and append a smbd_buffer_descriptor_v1 to the end of the SMB write packet. On write I/O finish, deregister the memory region if this was for a RDMA read. If remote invalidation is not used, the call to smbd_deregister_mr will do local invalidation and possibly wait. Memory region is normally deregistered in MID callback as soon as it's used. There are situations where the MID may not be created on I/O failure, under which memory region is deregistered when write data context is released. Signed-off-by: Long Li --- fs/cifs/cifsglob.h | 3 +++ fs/cifs/cifssmb.c | 7 ++ fs/cifs/smb2pdu.c | 65 +++--- 3 files changed, 72 insertions(+), 3 deletions(-) diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h index 3fb1a2f..22bfda0 100644 --- a/fs/cifs/cifsglob.h +++ b/fs/cifs/cifsglob.h @@ -1174,6 +1174,9 @@ struct cifs_writedata { pid_t pid; unsigned intbytes; int result; +#ifdef CONFIG_CIFS_SMB_DIRECT + struct smbd_mr *mr; +#endif unsigned intpagesz; unsigned inttailsz; unsigned intcredits; diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 35dc5bf..66d1ebf 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -43,6 +43,7 @@ #include "cifs_unicode.h" #include "cifs_debug.h" #include "fscache.h" +#include "smbdirect.h" #ifdef CONFIG_CIFS_POSIX static struct { @@ -1923,6 +1924,12 @@ cifs_writedata_release(struct kref *refcount) { struct cifs_writedata *wdata = container_of(refcount, struct cifs_writedata, refcount); +#ifdef CONFIG_CIFS_SMB_DIRECT + if (wdata->mr) { + smbd_deregister_mr(wdata->mr); + wdata->mr = NULL; + } +#endif if (wdata->cfile) cifsFileInfo_put(wdata->cfile); diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index c0dc049..908d777 100644 --- a/fs/cifs/smb2pdu.c +++ 
b/fs/cifs/smb2pdu.c @@ -48,6 +48,7 @@ #include "smb2glob.h" #include "cifspdu.h" #include "cifs_spnego.h" +#include "smbdirect.h" /* * The following table defines the expected "StructureSize" of SMB2 requests @@ -2728,7 +2729,19 @@ smb2_writev_callback(struct mid_q_entry *mid) wdata->result = -EIO; break; } - +#ifdef CONFIG_CIFS_SMB_DIRECT + /* +* If this wdata has a memory registered, the MR can be freed +* The number of MRs available is limited, it's important to recover +* used MR as soon as I/O is finished. Hold MR longer in the later +* I/O process can possibly result in I/O deadlock due to lack of MR +* to send request on I/O retry +*/ + if (wdata->mr) { + smbd_deregister_mr(wdata->mr); + wdata->mr = NULL; + } +#endif if (wdata->result) cifs_stats_fail_inc(tcon, SMB2_WRITE_HE); @@ -2780,7 +2793,42 @@ smb2_async_writev(struct cifs_writedata *wdata, req->DataOffset = cpu_to_le16( offsetof(struct smb2_write_req, Buffer)); req->RemainingBytes = 0; - +#ifdef CONFIG_CIFS_SMB_DIRECT + /* +* If we want to do a server RDMA read, fill in and append +* smbd_buffer_descriptor_v1 to the end of write request +*/ + if (server->rdma && wdata->bytes >= + server->smbd_conn->rdma_readwrite_threshold) { + + struct smbd_buffer_descriptor_v1 *v1; + bool need_invalidate = server->dialect == SMB30_PROT_ID; + + wdata->mr = smbd_register_mr( + server->smbd_conn, wdata->pages, + wdata->nr_pages, wdata->tailsz, + false, need_invalidate); + if (!wdata->mr) { + rc = -ENOBUFS; + goto async_writev_out; + } + req->Length = 0; + req->DataOffset = 0; + req->RemainingBytes = + (wdata->nr_pages-1)*PAGE_SIZE + wdata->tailsz; + req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE; + if (need_invalidate) + req->Channel = SMB2_CHANNEL_RDMA_V1; + req->WriteChannelInfoOffset = + offsetof(struct smb2_write_req, Buffer); + req->WriteChannelInfoLength = + sizeof(struct smbd_buffer_descriptor_v1); + v1 = (struct
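In the RDMA branch above, req->RemainingBytes is set to the full payload length: every page except the last is whole, and the last page contributes tailsz bytes. A minimal userspace sketch of that arithmetic follows; `SKETCH_PAGE_SIZE` and `sketch_rdma_write_length` are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size for illustration; the kernel uses PAGE_SIZE */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Total number of bytes covered by nr_pages pages where the last
 * page holds only tailsz bytes.
 */
static size_t sketch_rdma_write_length(unsigned int nr_pages,
				       unsigned int tailsz)
{
	/* At least one page, and the tail must fit inside a page */
	assert(nr_pages > 0);
	assert(tailsz > 0 && tailsz <= SKETCH_PAGE_SIZE);

	return (size_t)(nr_pages - 1) * SKETCH_PAGE_SIZE + tailsz;
}
```

For example, three pages with a 512-byte tail describe 2 * 4096 + 512 = 8704 bytes.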
[Patch v8 11/16] CIFS: SMBD: Implement RDMA memory registration
From: Long LiMemory registration is used for transferring payload via RDMA read or write. After I/O is done, memory registrations are recovered and reused. This process can be time consuming and is done in a work queue. Signed-off-by: Long Li --- fs/cifs/smbdirect.c | 421 fs/cifs/smbdirect.h | 53 +++ 2 files changed, 474 insertions(+) diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c index cb062e2..238e310 100644 --- a/fs/cifs/smbdirect.c +++ b/fs/cifs/smbdirect.c @@ -48,6 +48,9 @@ static int smbd_post_send_page(struct smbd_connection *info, struct page *page, unsigned long offset, size_t size, int remaining_data_length); +static void destroy_mr_list(struct smbd_connection *info); +static int allocate_mr_list(struct smbd_connection *info); + /* SMBD version number */ #define SMBD_V10x0100 @@ -198,6 +201,12 @@ static void smbd_destroy_rdma_work(struct work_struct *work) wait_event(info->wait_send_payload_pending, atomic_read(>send_payload_pending) == 0); + log_rdma_event(INFO, "freeing mr list\n"); + wake_up_interruptible_all(>wait_mr); + wait_event(info->wait_for_mr_cleanup, + atomic_read(>mr_used_count) == 0); + destroy_mr_list(info); + /* It's not posssible for upper layer to get to reassembly */ log_rdma_event(INFO, "drain the reassembly queue\n"); do { @@ -453,6 +462,16 @@ static bool process_negotiation_response( } info->max_fragmented_send_size = le32_to_cpu(packet->max_fragmented_size); + info->rdma_readwrite_threshold = + rdma_readwrite_threshold > info->max_fragmented_send_size ? 
+ info->max_fragmented_send_size : + rdma_readwrite_threshold; + + + info->max_readwrite_size = min_t(u32, + le32_to_cpu(packet->max_readwrite_size), + info->max_frmr_depth * PAGE_SIZE); + info->max_frmr_depth = info->max_readwrite_size / PAGE_SIZE; return true; } @@ -748,6 +767,12 @@ static int smbd_ia_open( rc = -EPROTONOSUPPORT; goto out2; } + info->max_frmr_depth = min_t(int, + smbd_max_frmr_depth, + info->id->device->attrs.max_fast_reg_page_list_len); + info->mr_type = IB_MR_TYPE_MEM_REG; + if (info->id->device->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG) + info->mr_type = IB_MR_TYPE_SG_GAPS; info->pd = ib_alloc_pd(info->id->device, 0); if (IS_ERR(info->pd)) { @@ -1582,6 +1607,8 @@ struct smbd_connection *_smbd_get_connection( struct rdma_conn_param conn_param; struct ib_qp_init_attr qp_attr; struct sockaddr_in *addr_in = (struct sockaddr_in *) dstaddr; + struct ib_port_immutable port_immutable; + u32 ird_ord_hdr[2]; info = kzalloc(sizeof(struct smbd_connection), GFP_KERNEL); if (!info) @@ -1670,6 +1697,28 @@ struct smbd_connection *_smbd_get_connection( memset(_param, 0, sizeof(conn_param)); conn_param.initiator_depth = 0; + conn_param.responder_resources = + info->id->device->attrs.max_qp_rd_atom + < SMBD_CM_RESPONDER_RESOURCES ? 
+ info->id->device->attrs.max_qp_rd_atom : + SMBD_CM_RESPONDER_RESOURCES; + info->responder_resources = conn_param.responder_resources; + log_rdma_mr(INFO, "responder_resources=%d\n", + info->responder_resources); + + /* Need to send IRD/ORD in private data for iWARP */ + info->id->device->get_port_immutable( + info->id->device, info->id->port_num, _immutable); + if (port_immutable.core_cap_flags & RDMA_CORE_PORT_IWARP) { + ird_ord_hdr[0] = info->responder_resources; + ird_ord_hdr[1] = 1; + conn_param.private_data = ird_ord_hdr; + conn_param.private_data_len = sizeof(ird_ord_hdr); + } else { + conn_param.private_data = NULL; + conn_param.private_data_len = 0; + } + conn_param.retry_count = SMBD_CM_RETRY; conn_param.rnr_retry_count = SMBD_CM_RNR_RETRY; conn_param.flow_control = 0; @@ -1734,8 +1783,19 @@ struct smbd_connection *_smbd_get_connection( goto negotiation_failed; } + rc = allocate_mr_list(info); + if (rc) { + log_rdma_mr(ERR, "memory registration allocation failed\n"); + goto allocate_mr_failed; + } + return info; +allocate_mr_failed: + /* At this point, need to a full transport shutdown */ + smbd_destroy(info); + return NULL; + negotiation_failed: cancel_delayed_work_sync(>idle_timer_work); destroy_caches_and_workqueue(info); @@ -2189,3
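The limit negotiation added to process_negotiation_response() above amounts to a few clamps. The following userspace sketch mirrors that logic under stated assumptions: the struct and function names are invented for illustration, the page size is assumed to be 4096, and the depth is assumed small enough that depth * page size does not overflow 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for illustration; the kernel uses PAGE_SIZE */
#define SKETCH_PAGE_SIZE 4096u

/* Hypothetical container for the three negotiated values */
struct sketch_limits {
	uint32_t rdma_readwrite_threshold;
	uint32_t max_readwrite_size;
	uint32_t max_frmr_depth;
};

static uint32_t sketch_min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

static struct sketch_limits sketch_negotiate(uint32_t configured_threshold,
					     uint32_t max_fragmented_send_size,
					     uint32_t peer_max_readwrite_size,
					     uint32_t local_frmr_depth)
{
	struct sketch_limits l;

	/* The threshold can never exceed what a fragmented send can carry */
	l.rdma_readwrite_threshold =
		sketch_min_u32(configured_threshold, max_fragmented_send_size);

	/* One registration covers at most local_frmr_depth pages */
	l.max_readwrite_size =
		sketch_min_u32(peer_max_readwrite_size,
			       local_frmr_depth * SKETCH_PAGE_SIZE);

	/* Shrink the depth again if the peer asked for less */
	l.max_frmr_depth = l.max_readwrite_size / SKETCH_PAGE_SIZE;

	return l;
}
```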
[PATCH 08/23] x86, kaiser: map cpu entry area
From: Dave HansenThere is now a special 'struct cpu_entry' area that contains all of the data needed to enter the kernel. It's mapped in the fixmap area and contains: * The GDT (hardware segment descriptor) * The TSS (thread information structure that points the hardware to the various stacks, and contains the entry stack). * The entry trampoline code itself * The exception stacks (aka IRQ stacks) Signed-off-by: Dave Hansen Cc: Moritz Lipp Cc: Daniel Gruss Cc: Michael Schwarz Cc: Richard Fellner Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Kees Cook Cc: Hugh Dickins Cc: x...@kernel.org --- b/arch/x86/include/asm/kaiser.h |6 ++ b/arch/x86/kernel/cpu/common.c |4 b/arch/x86/mm/kaiser.c | 31 +++ b/include/linux/kaiser.h|3 +++ 4 files changed, 44 insertions(+) diff -puN arch/x86/include/asm/kaiser.h~kaiser-user-map-cpu-entry-structure arch/x86/include/asm/kaiser.h --- a/arch/x86/include/asm/kaiser.h~kaiser-user-map-cpu-entry-structure 2017-11-22 15:45:48.447619740 -0800 +++ b/arch/x86/include/asm/kaiser.h 2017-11-22 15:45:48.456619740 -0800 @@ -34,6 +34,12 @@ extern int kaiser_add_mapping(unsigned l unsigned long flags); /** + * kaiser_add_mapping_cpu_entry - map the cpu entry area + * @cpu: the CPU for which the entry area is being mapped + */ +extern void kaiser_add_mapping_cpu_entry(int cpu); + +/** * kaiser_remove_mapping - remove a kernel mapping from the userpage tables * @addr: the start address of the range * @size: the size of the range diff -puN arch/x86/kernel/cpu/common.c~kaiser-user-map-cpu-entry-structure arch/x86/kernel/cpu/common.c --- a/arch/x86/kernel/cpu/common.c~kaiser-user-map-cpu-entry-structure 2017-11-22 15:45:48.449619740 -0800 +++ b/arch/x86/kernel/cpu/common.c 2017-11-22 15:45:48.457619740 -0800 @@ -4,6 +4,7 @@ #include #include #include +#include #include #include #include @@ -587,6 +588,9 @@ static inline void setup_cpu_entry_area( __set_fixmap(get_cpu_entry_area_index(cpu, entry_trampoline), __pa_symbol(_entry_trampoline), PAGE_KERNEL_RX); 
#endif + /* CPU 0's mapping is done in kaiser_init() */ + if (cpu) + kaiser_add_mapping_cpu_entry(cpu); } /* Load the original GDT from the per-cpu structure */ diff -puN arch/x86/mm/kaiser.c~kaiser-user-map-cpu-entry-structure arch/x86/mm/kaiser.c --- a/arch/x86/mm/kaiser.c~kaiser-user-map-cpu-entry-structure 2017-11-22 15:45:48.451619740 -0800 +++ b/arch/x86/mm/kaiser.c 2017-11-22 15:45:48.457619740 -0800 @@ -353,6 +353,26 @@ static void __init kaiser_init_all_pgds( WARN_ON(__ret); \ } while (0) +void kaiser_add_mapping_cpu_entry(int cpu) +{ + kaiser_add_user_map_early(get_cpu_gdt_ro(cpu), PAGE_SIZE, + __PAGE_KERNEL_RO); + + /* includes the entry stack */ + kaiser_add_user_map_early(_cpu_entry_area(cpu)->tss, + sizeof(get_cpu_entry_area(cpu)->tss), + __PAGE_KERNEL | _PAGE_GLOBAL); + + /* Entry code, so needs to be EXEC */ + kaiser_add_user_map_early(_cpu_entry_area(cpu)->entry_trampoline, + sizeof(get_cpu_entry_area(cpu)->entry_trampoline), + __PAGE_KERNEL_EXEC | _PAGE_GLOBAL); + + kaiser_add_user_map_early(_cpu_entry_area(cpu)->exception_stacks, + sizeof(get_cpu_entry_area(cpu)->exception_stacks), +__PAGE_KERNEL | _PAGE_GLOBAL); +} + extern char __per_cpu_user_mapped_start[], __per_cpu_user_mapped_end[]; /* * If anything in here fails, we will likely die on one of the @@ -390,6 +410,17 @@ void __init kaiser_init(void) kaiser_add_user_map_early((void *)idt_descr.address, sizeof(gate_desc) * NR_VECTORS, __PAGE_KERNEL_RO | _PAGE_GLOBAL); + + /* +* We delay CPU 0's mappings because these structures are +* created before the page allocator is up. Deferring it +* until here lets us use the plain page allocator +* unconditionally in the page table code above. +* +* This is OK because kaiser_init() is called long before +* we ever run userspace and need the KAISER mappings. +*/ + kaiser_add_mapping_cpu_entry(0); } int kaiser_add_mapping(unsigned long addr, unsigned long size, diff -puN include/linux/kaiser.h~kaiser-user-map-cpu-entry-structure include/linux/kaiser.h ---
[PATCH 10/23] x86, kaiser: map espfix structures
From: Dave HansenThere is some rather arcane code to help when an IRET returns to 16-bit segments. It is referred to as the "espfix" code. This consists of a few per-cpu variables: espfix_stack: tells us where the stack is allocated (the bottom) espfix_waddr: tells us to where %rsp may be pointed (the top) These are in addition to the stack itself. All three things must be mapped for the espfix code to function. Note: the espfix code runs with a kernel GSBASE, but user (shadow) page tables. A switch to the kernel page tables could be performed instead of mapping these structures, but mapping them is simpler and less likely to break the assembly. To switch over to the kernel copy, additional temporary storage would be required which is in short supply in this context. The original KAISER patch missed this case. Signed-off-by: Dave Hansen Cc: Moritz Lipp Cc: Daniel Gruss Cc: Michael Schwarz Cc: Richard Fellner Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Kees Cook Cc: Hugh Dickins Cc: x...@kernel.org --- b/arch/x86/kernel/espfix_64.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff -puN arch/x86/kernel/espfix_64.c~kaiser-user-map-espfix arch/x86/kernel/espfix_64.c --- a/arch/x86/kernel/espfix_64.c~kaiser-user-map-espfix2017-11-22 15:45:49.592619738 -0800 +++ b/arch/x86/kernel/espfix_64.c 2017-11-22 15:45:49.596619738 -0800 @@ -33,6 +33,7 @@ #include #include +#include #include #include #include @@ -41,7 +42,6 @@ #include #include #include -#include /* * Note: we only need 6*8 = 48 bytes for the espfix stack, but round @@ -61,8 +61,8 @@ #define PGALLOC_GFP (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO) /* This contains the *bottom* address of the espfix stack */ -DEFINE_PER_CPU_READ_MOSTLY(unsigned long, espfix_stack); -DEFINE_PER_CPU_READ_MOSTLY(unsigned long, espfix_waddr); +DEFINE_PER_CPU_USER_MAPPED(unsigned long, espfix_stack); +DEFINE_PER_CPU_USER_MAPPED(unsigned long, espfix_waddr); /* Initialization mutex - should this be a spinlock? 
*/ static DEFINE_MUTEX(espfix_init_mutex); @@ -225,4 +225,10 @@ done: per_cpu(espfix_stack, cpu) = addr; per_cpu(espfix_waddr, cpu) = (unsigned long)stack_page + (addr & ~PAGE_MASK); + /* +* _PAGE_GLOBAL is not really required. This is not a hot +* path, but we do it here for consistency. +*/ + kaiser_add_mapping((unsigned long)stack_page, PAGE_SIZE, + __PAGE_KERNEL | _PAGE_GLOBAL); } _
[PATCH 09/23] x86, kaiser: map dynamically-allocated LDTs
From: Dave HansenNormally, a process has a NULL mm->context.ldt. But, there is a syscall for a process to set a new one. If a process does that, the LDT be mapped into the user page tables, just like the default copy. The original KAISER patch missed this case. Signed-off-by: Dave Hansen Cc: Moritz Lipp Cc: Daniel Gruss Cc: Michael Schwarz Cc: Richard Fellner Cc: Andy Lutomirski Cc: Linus Torvalds Cc: Kees Cook Cc: Hugh Dickins Cc: x...@kernel.org --- b/arch/x86/kernel/ldt.c | 25 - 1 file changed, 20 insertions(+), 5 deletions(-) diff -puN arch/x86/kernel/ldt.c~kaiser-user-map-new-ldts arch/x86/kernel/ldt.c --- a/arch/x86/kernel/ldt.c~kaiser-user-map-new-ldts2017-11-22 15:45:49.059619739 -0800 +++ b/arch/x86/kernel/ldt.c 2017-11-22 15:45:49.062619739 -0800 @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -57,11 +58,21 @@ static void flush_ldt(void *__mm) refresh_ldt_segments(); } +static void __free_ldt_struct(struct ldt_struct *ldt) +{ + if (ldt->nr_entries * LDT_ENTRY_SIZE > PAGE_SIZE) + vfree_atomic(ldt->entries); + else + free_page((unsigned long)ldt->entries); + kfree(ldt); +} + /* The caller must call finalize_ldt_struct on the result. LDT starts zeroed. 
*/ static struct ldt_struct *alloc_ldt_struct(unsigned int num_entries) { struct ldt_struct *new_ldt; unsigned int alloc_size; + int ret; if (num_entries > LDT_ENTRIES) return NULL; @@ -89,6 +100,12 @@ static struct ldt_struct *alloc_ldt_stru return NULL; } + ret = kaiser_add_mapping((unsigned long)new_ldt->entries, alloc_size, +__PAGE_KERNEL | _PAGE_GLOBAL); + if (ret) { + __free_ldt_struct(new_ldt); + return NULL; + } new_ldt->nr_entries = num_entries; return new_ldt; } @@ -115,12 +132,10 @@ static void free_ldt_struct(struct ldt_s if (likely(!ldt)) return; + kaiser_remove_mapping((unsigned long)ldt->entries, + ldt->nr_entries * LDT_ENTRY_SIZE); paravirt_free_ldt(ldt->entries, ldt->nr_entries); - if (ldt->nr_entries * LDT_ENTRY_SIZE > PAGE_SIZE) - vfree_atomic(ldt->entries); - else - free_page((unsigned long)ldt->entries); - kfree(ldt); + __free_ldt_struct(ldt); } /* _
[PATCH 06/23] x86, kaiser: allow NX poison to be set in p4d/pgd
From: Dave Hansen

The user portion of the kernel page tables uses the NX bit to
poison them for userspace. But, that trips the p4d/pgd_bad()
checks. Make sure it does not do that.

Signed-off-by: Dave Hansen
Cc: Moritz Lipp
Cc: Daniel Gruss
Cc: Michael Schwarz
Cc: Richard Fellner
Cc: Andy Lutomirski
Cc: Linus Torvalds
Cc: Kees Cook
Cc: Hugh Dickins
Cc: x...@kernel.org
---

 b/arch/x86/include/asm/pgtable.h |   14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff -puN arch/x86/include/asm/pgtable.h~kaiser-p4d-allow-nx arch/x86/include/asm/pgtable.h
--- a/arch/x86/include/asm/pgtable.h~kaiser-p4d-allow-nx	2017-11-22 15:45:47.382619743 -0800
+++ b/arch/x86/include/asm/pgtable.h	2017-11-22 15:45:47.386619743 -0800
@@ -846,7 +846,12 @@ static inline pud_t *pud_offset(p4d_t *p
 
 static inline int p4d_bad(p4d_t p4d)
 {
-	return (p4d_flags(p4d) & ~(_KERNPG_TABLE | _PAGE_USER)) != 0;
+	unsigned long ignore_flags = _KERNPG_TABLE | _PAGE_USER;
+
+	if (IS_ENABLED(CONFIG_KAISER))
+		ignore_flags |= _PAGE_NX;
+
+	return (p4d_flags(p4d) & ~ignore_flags) != 0;
 }
 #endif  /* CONFIG_PGTABLE_LEVELS > 3 */
 
@@ -880,7 +885,12 @@ static inline p4d_t *p4d_offset(pgd_t *p
 
 static inline int pgd_bad(pgd_t pgd)
 {
-	return (pgd_flags(pgd) & ~_PAGE_USER) != _KERNPG_TABLE;
+	unsigned long ignore_flags = _PAGE_USER;
+
+	if (IS_ENABLED(CONFIG_KAISER))
+		ignore_flags |= _PAGE_NX;
+
+	return (pgd_flags(pgd) & ~ignore_flags) != _KERNPG_TABLE;
 }
 
 static inline int pgd_none(pgd_t pgd)
_
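The effect of the two changed checks can be tried out in userspace with a small sketch. The `SK_` flag values below are illustrative placeholders rather than the kernel's definitions, and the `kaiser_enabled` parameter stands in for IS_ENABLED(CONFIG_KAISER); the function names are made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag bits for illustration only */
#define SK_PAGE_PRESENT   (1ULL << 0)
#define SK_PAGE_RW        (1ULL << 1)
#define SK_PAGE_USER      (1ULL << 2)
#define SK_PAGE_ACCESSED  (1ULL << 5)
#define SK_PAGE_DIRTY     (1ULL << 6)
#define SK_PAGE_NX        (1ULL << 63)
#define SK_KERNPG_TABLE   (SK_PAGE_PRESENT | SK_PAGE_RW | \
			   SK_PAGE_ACCESSED | SK_PAGE_DIRTY)

/* Mirrors the patched pgd_bad(): NX is ignored only with KAISER */
static bool sketch_pgd_bad(uint64_t flags, bool kaiser_enabled)
{
	uint64_t ignore_flags = SK_PAGE_USER;

	if (kaiser_enabled)
		ignore_flags |= SK_PAGE_NX;

	return (flags & ~ignore_flags) != SK_KERNPG_TABLE;
}

/* Mirrors the patched p4d_bad(): any bit outside the ignored set is bad */
static bool sketch_p4d_bad(uint64_t flags, bool kaiser_enabled)
{
	uint64_t ignore_flags = SK_KERNPG_TABLE | SK_PAGE_USER;

	if (kaiser_enabled)
		ignore_flags |= SK_PAGE_NX;

	return (flags & ~ignore_flags) != 0;
}
```

With these definitions an NX-poisoned entry is reported as bad when the KAISER flag is off and as good when it is on, which is the behaviour the commit message asks for.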
[Patch v8 08/16] CIFS: SMBD: Upper layer receives data via RDMA receive
From: Long Li

With SMB Direct connected, use it for receiving data via RDMA receive.

Signed-off-by: Long Li
---
 fs/cifs/connect.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index d8bfa89..1677401 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -542,8 +542,10 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
 
 		if (server_unresponsive(server))
 			return -ECONNABORTED;
-
-		length = sock_recvmsg(server->ssocket, smb_msg, 0);
+		if (cifs_rdma_enabled(server) && server->smbd_conn)
+			length = smbd_recv(server->smbd_conn, smb_msg);
+		else
+			length = sock_recvmsg(server->ssocket, smb_msg, 0);
 
 		if (server->tcpStatus == CifsExiting)
 			return -ESHUTDOWN;
-- 
2.7.4
[Patch v8 02/16] CIFS: SMBD: Implement function to reconnect to a SMB Direct transport
From: Long Li

Add function to implement a reconnect to SMB Direct. This involves tearing down
the current connection and establishing/negotiating a new connection.

Signed-off-by: Long Li
---
 fs/cifs/smbdirect.c | 36 ++++++++++++++++++++++++++++++++++++
 fs/cifs/smbdirect.h |  4 ++++
 2 files changed, 40 insertions(+)

diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
index 862cdf9..a96058a 100644
--- a/fs/cifs/smbdirect.c
+++ b/fs/cifs/smbdirect.c
@@ -1387,6 +1387,42 @@ static void idle_connection_timer(struct work_struct *work)
 			info->keep_alive_interval*HZ);
 }
 
+/*
+ * Reconnect this SMBD connection, called from upper layer
+ * return value: 0 on success, or actual error code
+ */
+int smbd_reconnect(struct TCP_Server_Info *server)
+{
+	log_rdma_event(INFO, "reconnecting rdma session\n");
+
+	if (!server->smbd_conn) {
+		log_rdma_event(ERR, "rdma session already destroyed\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * This is possible if transport is disconnected and we haven't received
+	 * notification from RDMA, but upper layer has detected timeout
+	 */
+	if (server->smbd_conn->transport_status == SMBD_CONNECTED) {
+		log_rdma_event(INFO, "disconnecting transport\n");
+		smbd_disconnect_rdma_connection(server->smbd_conn);
+	}
+
+	/* wait until the transport is destroyed */
+	wait_event(server->smbd_conn->wait_destroy,
+		server->smbd_conn->transport_status == SMBD_DESTROYED);
+
+	destroy_workqueue(server->smbd_conn->workqueue);
+	kfree(server->smbd_conn);
+
+	log_rdma_event(INFO, "creating rdma session\n");
+	server->smbd_conn = smbd_get_connection(
+		server, (struct sockaddr *) &server->dstaddr);
+
+	return server->smbd_conn ? 0 : -ENOENT;
+}
+
 static void destroy_caches_and_workqueue(struct smbd_connection *info)
 {
 	destroy_receive_buffers(info);
diff --git a/fs/cifs/smbdirect.h b/fs/cifs/smbdirect.h
index 25b3782..8948f06 100644
--- a/fs/cifs/smbdirect.h
+++ b/fs/cifs/smbdirect.h
@@ -247,11 +247,15 @@ struct smbd_response {
 struct smbd_connection *smbd_get_connection(
 	struct TCP_Server_Info *server, struct sockaddr *dstaddr);
 
+/* Reconnect SMBDirect session */
+int smbd_reconnect(struct TCP_Server_Info *server);
+
 #else
 #define cifs_rdma_enabled(server)	0
 struct smbd_connection {};
 static inline void *smbd_get_connection(
 	struct TCP_Server_Info *server, struct sockaddr *dstaddr) {return NULL;}
+static inline int smbd_reconnect(struct TCP_Server_Info *server) {return -1;}
 #endif
 
 #endif
-- 
2.7.4
[Patch v8 03/16] CIFS: SMBD: Upper layer reconnects to SMB Direct session
From: Long Li

Do a reconnect on SMB Direct when it is used as the connection. Reconnect can
happen for many reasons and it's mostly the decision of SMB2 upper layer.

Signed-off-by: Long Li
---
 fs/cifs/connect.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index fafaecb..fc46066 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -406,7 +406,10 @@ cifs_reconnect(struct TCP_Server_Info *server)
 
 		/* we should try only the port we connected to before */
 		mutex_lock(&server->srv_mutex);
-		rc = generic_ip_connect(server);
+		if (cifs_rdma_enabled(server))
+			rc = smbd_reconnect(server);
+		else
+			rc = generic_ip_connect(server);
 		if (rc) {
 			cifs_dbg(FYI, "reconnect error %d\n", rc);
 			mutex_unlock(&server->srv_mutex);
-- 
2.7.4
[Patch v8 05/16] CIFS: SMBD: Upper layer destroys SMB Direct session on shutdown or umount
From: Long Li

When upper layer wants to umount, make it call shutdown on transport when
SMB Direct is used.

Signed-off-by: Long Li
---
 fs/cifs/connect.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index fc46066..d8bfa89 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -704,7 +704,10 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server)
 	wake_up_all(&server->request_q);
 	/* give those requests time to exit */
 	msleep(125);
-
+	if (cifs_rdma_enabled(server) && server->smbd_conn) {
+		smbd_destroy(server->smbd_conn);
+		server->smbd_conn = NULL;
+	}
 	if (server->ssocket) {
 		sock_release(server->ssocket);
 		server->ssocket = NULL;
-- 
2.7.4
Re: [PATCH v7 0/4] Add the ability to do BPF directed error injection
On Wed, Nov 22, 2017 at 04:23:29PM -0500, Josef Bacik wrote: > This is hopefully the final version, I've addressed the comment by Igno and > added his Acks. > > v6->v7: > - moved the opt-in macro to bpf.h out of kprobes.h. Thanks Josef! All patches look great to me. We'll probably take them all into bpf-next.git to start testing together with other bpf changes and when net-next reopens will send them to Dave. Then optionally can send pull-req for the first patch only to tip if Ingo thinks that there can be conflicts with the work happening in parallel on kprobe/x86 bits. This way hopefully there will be no conflicts during the next merge window. Makes sense? > v5->v6: > - add BPF_ALLOW_ERROR_INJECTION() tagging for functions that will support this > feature. This way only functions that opt-in will be allowed to be > overridden. > - added a btrfs patch to allow error injection for open_ctree() so that the > bpf > sample actually works. > > v4->v5: > - disallow kprobe_override programs from being put in the prog map array so we > don't tail call into something we didn't check. This allows us to make the > normal path still fast without a bunch of percpu operations. > > v3->v4: > - fix a build error found by kbuild test bot (I didn't wait long enough > apparently.) > - Added a warning message as per Daniels suggestion. > > v2->v3: > - added a ->kprobe_override flag to bpf_prog. > - added some sanity checks to disallow attaching bpf progs that have > ->kprobe_override set that aren't for ftrace kprobes. > - added the trace_kprobe_ftrace helper to check if the trace_event_call is a > ftrace kprobe. > - renamed bpf_kprobe_state to bpf_kprobe_override, fixed it so we only read > this > value in the kprobe path, and thus only write to it if we're overriding or > clearing the override. > > v1->v2: > - moved things around to make sure that bpf_override_return could really only > be > used for an ftrace kprobe. > - killed the special return values from trace_call_bpf. 
> - renamed pc_modified to bpf_kprobe_state so bpf_override_return could tell if
>   it was being called from an ftrace kprobe context.
> - reworked the logic in kprobe_perf_func to take advantage of bpf_kprobe_state.
> - updated the test as per Alexei's review.
>
> - Original message -
>
> A lot of our error paths are not well tested because we have no good way of
> injecting errors generically. Some subsystems (block, memory) have ways to
> inject errors, but they are random so it's hard to get reproducible results.
>
> With BPF we can add determinism to our error injection. We can use kprobes and
> other things to verify we are injecting errors at the exact case we are trying
> to test. This patch gives us the tool to actually do the error injection part.
> It is very simple, we just set the return value of the pt_regs we're given to
> whatever we provide, and then override the PC with a dummy function that simply
> returns.
>
> Right now this only works on x86, but it would be simple enough to expand to
> other architectures. Thanks,
>
> Josef
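
The mechanism described in the original message (store an injected return value in the saved registers, then redirect execution to a stub that simply returns) can be modelled in plain user-space C. This is only a rough sketch under stated assumptions: every name below (sketch_regs, sketch_override_return, sketch_call_probed and so on) is hypothetical and is not part of the patch set or of any kernel API; the real helper works on pt_regs from a kprobe and is not shown here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the saved register state of a probed call. */
struct sketch_regs {
	long ret;        /* models the return-value register */
	bool redirected; /* models "PC now points at a stub that just returns" */
};

typedef long (*sketch_fn)(void);
typedef void (*sketch_hook)(struct sketch_regs *regs);

/* Models the override step: store the injected value and redirect. */
void sketch_override_return(struct sketch_regs *regs, long rc)
{
	regs->ret = rc;
	regs->redirected = true;
}

/* Models the probe site: run the hook first, then either skip or call fn. */
long sketch_call_probed(sketch_fn fn, sketch_hook hook)
{
	struct sketch_regs regs = { .ret = 0, .redirected = false };

	if (hook)
		hook(&regs);
	if (regs.redirected)
		return regs.ret; /* the function body never runs */
	return fn();
}

/* A function that normally succeeds (hypothetical). */
long sketch_mount_step(void)
{
	return 0;
}

/* A hook that deterministically injects -ENOMEM (hypothetical). */
void sketch_inject_enomem(struct sketch_regs *regs)
{
	sketch_override_return(regs, -ENOMEM);
}

/* Self-check: no hook means success, with the hook the error appears. */
void sketch_selfcheck(void)
{
	assert(sketch_call_probed(sketch_mount_step, NULL) == 0);
	assert(sketch_call_probed(sketch_mount_step, sketch_inject_enomem) == -ENOMEM);
}
```

Calling sketch_selfcheck() from a main() exercises both paths. The point of the model is that the error is injected at one exact call site rather than at random, which is the determinism the cover letter is after.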
[lkp-robot] [fw_cfg] d5daa79dd1: BUG:unable_to_handle_kernel
FYI, we noticed the following commit (built with gcc-5): commit: d5daa79dd1c013fb9dbec70c7e371eed1feb09db ("fw_cfg: do DMA read operation") https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master in testcase: boot on test machine: qemu-system-x86_64 -enable-kvm -cpu Nehalem -smp 2 -m 512M caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace): +--+++ | | 2199115e4b | d5daa79dd1 | +--+++ | boot_successes | 16 | 4 | | boot_failures| 0 | 10 | | BUG:unable_to_handle_kernel | 0 | 10 | | Oops:#[##] | 0 | 10 | | RIP:nommu_map_page | 0 | 10 | | Kernel_panic-not_syncing:Fatal_exception | 0 | 10 | +--+++ [ 64.571579] BUG: unable to handle kernel paging request at 008000c0 [ 64.572878] IP: nommu_map_page+0x5/0x70 [ 64.573627] PGD 1f3c8067 P4D 1f3c8067 PUD 1f3c9067 PMD 0 [ 64.580011] Oops: [#1] SMP [ 64.580011] Modules linked in: qemu_fw_cfg(+) [ 64.580011] CPU: 1 PID: 185 Comm: udevd Not tainted 4.14.0-9-gd5daa79 #1 [ 64.580011] task: 880015c58200 task.stack: 88001ef2 [ 64.580011] RIP: 0010:nommu_map_page+0x5/0x70 [ 64.580011] RSP: 0018:88001ef23a60 EFLAGS: 00010206 [ 64.580011] RAX: 88001ffda080 RBX: 0400 RCX: 0010 [ 64.580011] RDX: 06c0 RSI: 008000c0 RDI: 88001d697810 [ 64.580011] RBP: 88001ef23a68 R08: R09: [ 64.580011] R10: 8101e940 R11: d9b5e5fe R12: 0004 [ 64.580011] R13: R14: R15: 0001 [ 64.580011] FS: 7fdd303f1780() GS:88001e70() knlGS: [ 64.580011] CS: 0010 DS: ES: CR0: 80050033 [ 64.580011] CR2: 008000c0 CR3: 1f3c7000 CR4: 06e0 [ 64.580011] Call Trace: [ 64.580011] fw_cfg_dma_transfer+0x1a1/0x350 [qemu_fw_cfg] [ 64.580011] fw_cfg_read_blob+0xa5/0x180 [qemu_fw_cfg] [ 64.580011] fw_cfg_sysfs_probe+0x25a/0x1550 [qemu_fw_cfg] [ 64.580011] ? acpi_device_wakeup_disable+0x4d/0x50 [ 64.580011] platform_drv_probe+0x36/0x90 [ 64.580011] driver_probe_device+0x199/0x380 [ 64.580011] __driver_attach+0x9a/0xa0 [ 64.580011] ? 
driver_probe_device+0x380/0x380 [ 64.580011] bus_for_each_dev+0x61/0xa0 [ 64.580011] driver_attach+0x19/0x20 [ 64.580011] bus_add_driver+0x1a1/0x210 [ 64.580011] ? 0xa0006000 [ 64.580011] driver_register+0x5b/0xd0 [ 64.580011] ? 0xa0006000 [ 64.580011] __platform_driver_register+0x31/0x40 [ 64.580011] fw_cfg_sysfs_init+0x3e/0x1000 [qemu_fw_cfg] [ 64.580011] ? 0xa0006000 [ 64.580011] do_one_initcall+0x3f/0x164 [ 64.580011] ? __might_sleep+0x45/0x80 [ 64.580011] do_init_module+0x78/0x3d9 [ 64.580011] load_module+0x2267/0x2710 [ 64.580011] SYSC_finit_module+0xba/0xc0 [ 64.580011] ? SYSC_finit_module+0xba/0xc0 [ 64.580011] SyS_finit_module+0x9/0x10 [ 64.580011] do_syscall_64+0x74/0x1f0 [ 64.580011] entry_SYSCALL64_slow_path+0x25/0x25 [ 64.580011] RIP: 0033:0x7fdd2fac64a9 [ 64.580011] RSP: 002b:7ffcecc0b848 EFLAGS: 0206 ORIG_RAX: 0139 [ 64.580011] RAX: ffda RBX: 00653380 RCX: 7fdd2fac64a9 [ 64.580011] RDX: RSI: 7fdd2fd920aa RDI: 0007 [ 64.580011] RBP: 7fdd2fd920aa R08: R09: 00653380 [ 64.580011] R10: 0007 R11: 0206 R12: [ 64.580011] R13: 0002 R14: R15: 00653380 [ 64.580011] Code: 49 89 c6 74 12 49 8b 16 48 83 e2 fc 75 83 0f 0b 0f ff e9 66 ff ff ff 44 89 e0 5b 41 5c 41 5d 41 5e 5d c3 0f 1f 00 55 48 89 e5 53 <4c> 8b 06 4c 89 c0 49 c1 e8 34 4e 8b 04 c5 00 e7 a3 82 48 c1 e8 [ 64.580011] RIP: nommu_map_page+0x5/0x70 RSP: 88001ef23a60 [ 64.580011] CR2: 008000c0 [ 64.813564] ---[ end trace c6675425e1ab9b4d ]--- To reproduce: git clone https://github.com/intel/lkp-tests.git cd lkp-tests bin/lkp qemu -k job-script # job-script is attached in this email Thanks, Xiaolong # # Automatically generated file; DO NOT EDIT. # Linux/x86_64 4.14.0 Kernel Configuration # CONFIG_64BIT=y CONFIG_X86_64=y CONFIG_X86=y CONFIG_INSTRUCTION_DECODER=y CONFIG_OUTPUT_FORMAT="elf64-x86-64" CONFIG_ARCH_DEFCONFIG="arch/x86/configs/x86_64_defconfig" CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_MMU=y CONFIG_ARCH_MMAP_RND_BITS_MIN=28
Re: [PATCH 3/5] media: i2c: Add TDA1997x HDMI receiver driver
On Wed, Nov 15, 2017 at 8:30 PM, Rob Herringwrote: > On Wed, Nov 15, 2017 at 10:31:14AM -0800, Tim Harvey wrote: >> On Wed, Nov 15, 2017 at 7:52 AM, Rob Herring wrote: >> > On Thu, Nov 09, 2017 at 10:45:34AM -0800, Tim Harvey wrote: >> >> Add support for the TDA1997x HDMI receivers. >> >> >> >> Cc: Hans Verkuil >> >> Signed-off-by: Tim Harvey >> >> --- >> >> v3: >> >> - use V4L2_DV_BT_FRAME_WIDTH/HEIGHT macros >> >> - fixed missing break >> >> - use only hdmi_infoframe_log for infoframe logging >> >> - simplify tda1997x_s_stream error handling >> >> - add delayed work proc to handle hotplug enable/disable >> >> - fix set_edid (disable HPD before writing, enable after) >> >> - remove enabling edid by default >> >> - initialize timings >> >> - take quant range into account in colorspace conversion >> >> - remove vendor/product tracking (we provide this in log_status via >> >> infoframes) >> >> - add v4l_controls >> >> - add more detail to log_status >> >> - calculate vhref generator timings >> >> - timing detection fixes (rounding errors, hswidth errors) >> >> - rename configure_input/configure_conv functions >> >> >> >> v2: >> >> - implement dv timings enum/cap >> >> - remove deprecated g_mbus_config op >> >> - fix dv_query_timings >> >> - add EDID get/set handling >> >> - remove max-pixel-rate support >> >> - add audio codec DAI support >> >> - change audio bindings >> >> --- >> >> drivers/media/i2c/Kconfig|9 + >> >> drivers/media/i2c/Makefile |1 + >> >> drivers/media/i2c/tda1997x.c | 3485 >> >> ++ >> >> include/dt-bindings/media/tda1997x.h | 78 + >> > >> > This belongs with the binding documentation patch. >> > >> >> Rob, >> >> Thanks - missed that. I will move it for v4. >> >> Regarding your previous comment to the v2 series: >> > The rest of the binding looks fine, but I have some reservations about >> > this. I think this should be common probably. There's been a few >> > bindings for display recently that deal with the interface format. 
Maybe >> > some vendor property is needed here to map a standard interface format >> > back to pin configuration. >> >> I take it this is not an 'Ack' for the bindings? >> >> Which did you feel should be made common? I admit I was surprised >> there wasn't a common binding for audio bus format (i2s|spdif) but if >> you were referring to the video data that would probably be much more >> complicated. > > The video data. Either you have to try to come up with some way to map > color components to signals/pins (and even cycles) or you just enumerate > the formats and keep adding to them when new ones appear. There's h/w > that allows the former, but in the end you have to interoperate, so > enumerating the formats is probably enough. > >> I was hoping one of the media/driver maintainers would respond to your >> comment with thoughts as I'm not familiar with a very wide variety of >> receivers. > > I am hoping, too. > > Rob Hans, Do you have any comment here regarding Rob's hope that there could be some generic properties created for video port bindings? Anyone else you know of who should chime in here? The TDA1997x allows mapping its internal video output bus to its physical pin in a fairly flexible way. I don't know how unique this is to other chips. Regards, Tim
Re: [PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored
On Thu, Nov 23 2017, Ian Kent wrote: > On 23/11/17 10:21, NeilBrown wrote: >> On Thu, Nov 23 2017, Ian Kent wrote: >> >>> >>> Hey Neil, I'm looking at this again because RH QE have complained about >>> a regression test failing with a kernel that has this change. >>> >>> Maybe I'm just dumb but I though a "find " >>> would, well, just look at the contents below but an >>> strace shows that it reads and calls fstatat() on "every entry in the >>> mount table" regardless of the path. >> >> weird ... I can only get find to look at the mount table if given the >> -fstyp option, and even then it doesn't fstatat anything that isn't in >> the tree it is searching. > > It's probably the -xautofs (exclude autofs fs'es) that was used in > the test that requires reading the mount table to get info about > excluding autofs mounts but the fstatat() on all the entries, > regardless of path, that was a surprise to me. > > find did use AT_SYMLINK_NOFOLLOW which historically behaved like > AT_NO_AUTOMOUNT. > >> >> >>> >>> And with the move of userspace to use /proc based mount tables (one >>> example being the symlink of /etc/mtab into /proc) even modest sized >>> direct mount maps will be a problem with every entry getting mounted. >> >> But the patch in question is only about indirect mount maps, isn't it? >> How is it relevant to direct mount maps? > > The change here will cause fstatat() to trigger direct mounts on access > if it doesn't use AT_NO_AUTOMOUNT. Ahhh... light dawns. This is about this bit of the patch: static inline int vfs_fstatat(int dfd, const char __user *filename, struct kstat *stat, int flags) { - return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT, -stat, STATX_BASIC_STATS); + return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS); } I hadn't paid much attention to that. 
Before this patch:
  stat and lstat act as you would expect AT_NO_AUTOMOUNT to act on
    direct mount and browseable indirect mount, but not on unbrowseable
    indirect mounts
  fstatat appeared to accept the AT_NO_AUTOMOUNT flag, but actually
    assumed it was always set, but acted like stat and lstat
  xstatat actually accepted the AT_NO_AUTOMOUNT flag, but it had no
    effect on unbrowseable indirect mounts.

After the patch, the distinction between direct and indirect was gone,
and fstatat now handles AT_NO_AUTOMOUNT the same as xstatat.
So:
  stat and lstat now don't trigger automounts even on indirect, but
    this is a mixed blessing as they don't even trigger the mkdir
  fstatat without AT_NO_AUTOMOUNT now always triggers an automount
    This is a problematic regression that you have noticed and
    likely needs to be reverted. Maybe we can assume
    AT_NO_AUTOMOUNT when AT_SYMLINK_NOFOLLOW is set, and require
    people to use xstatat if they need to set the flags separately
  xstatat now correctly honours AT_NO_AUTOMOUNT for indirect mounts
    but is otherwise unchanged.

What would you think of changing the above to

 static inline int vfs_fstatat(int dfd, const char __user *filename,
 			      struct kstat *stat, int flags)
 {
-	return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT,
-			 stat, STATX_BASIC_STATS);
+	return vfs_statx(dfd, filename,
+			 (flags & AT_SYMLINK_NOFOLLOW) ? (flags |
+			 AT_NO_AUTOMOUNT) : flags,
+			 stat, STATX_BASIC_STATS);
 }

??

Thanks,
NeilBrown

>
> It's not a problem for browse indirect mounts because they are plain
> directories within the autofs mount point not individual autofs mount
> triggers.
>
>>
>>>
>>> Systems will cope with this fine but larger systems not so much.
>>>
>>> If find does this then the user space changes needed to accommodate
>>> this sort of change are almost certainly far more than I expected.
>>>
>>> I think this is an example of the larger problem I'm faced with and
>>> this change was meant to be a starting point for resolution.
>>> >>> The most obvious symptom of the problem is auto-mounts no longer able >>> to be expired due to being re-mounted immediately after expire. Another >>> symptom is unwanted (by the user) accesses causing unexpected auto-mount >>> attempts. >>> >>> I believe this monitoring of the mount table is what leads to excessive >>> CPU consumption I've seen, usually around six processes, under heavy >>> mount activity. And following this, when the mount table is large and >>> there is "no mount activity" two of the six processes continue to consume >>> excessive CPU, until the mount table shrinks. >>> >>> So now I'm coming around to the idea of reverting this change . and >>> going back to the drawing board. >> >> I can well imaging that a large mount table could cause problems for >> applications that are written to expect one, and I can imagine that >> autofs could cause extra issues for such a
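
The flag handling proposed in the reply above (assume AT_NO_AUTOMOUNT whenever AT_SYMLINK_NOFOLLOW is set, otherwise pass the flags through) can be checked in isolation. The following is a user-space sketch only: sketch_fstatat_flags is a hypothetical helper, not a kernel function, and the two constants are local copies of the Linux uapi values so the example builds without kernel headers.

```c
#include <assert.h>

/* Local copies of the flag bits; the values match the Linux uapi
 * definitions of AT_SYMLINK_NOFOLLOW and AT_NO_AUTOMOUNT. */
#define SKETCH_AT_SYMLINK_NOFOLLOW 0x100
#define SKETCH_AT_NO_AUTOMOUNT     0x800

/* Hypothetical helper: compute the flags that would be handed to
 * vfs_statx() under the proposed rule. */
int sketch_fstatat_flags(int flags)
{
	if (flags & SKETCH_AT_SYMLINK_NOFOLLOW)
		return flags | SKETCH_AT_NO_AUTOMOUNT;
	return flags;
}

/* Self-check covering the three interesting inputs. */
void sketch_selfcheck(void)
{
	/* Plain fstatat(): flags pass through, so an automount may trigger. */
	assert(sketch_fstatat_flags(0) == 0);
	/* lstat-like callers (as find uses) get AT_NO_AUTOMOUNT implied. */
	assert(sketch_fstatat_flags(SKETCH_AT_SYMLINK_NOFOLLOW) ==
	       (SKETCH_AT_SYMLINK_NOFOLLOW | SKETCH_AT_NO_AUTOMOUNT));
	/* An explicit AT_NO_AUTOMOUNT is left unchanged. */
	assert(sketch_fstatat_flags(SKETCH_AT_NO_AUTOMOUNT) == SKETCH_AT_NO_AUTOMOUNT);
}
```

Under this rule a caller that passes AT_SYMLINK_NOFOLLOW cannot ask for an automount through fstatat, which is why the reply suggests that such callers set the flags separately through the other interface.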
[PATCH 2/2] scripts: leaking_addresses: help screen updates
The current leaking_addresses.pl script only supports showing "leaked" 64-bit kernel virtual addresses. This patch modifies the "help" screen in the following manner: - the '--raw', '--suppress-dmesg', '--squash-by-path' and '--squash-by-filename' option switches are only meaningful when the '--input-raw=' option switch is used. So, indent the 'Help' screen lines to reflect the fact. - an additional example demonstrating usage of the new '--page-offset' parameter. Feedback welcome.. Signed-off-by: Kaiwan N Billimoria--- diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl index 7ca218221486..3832abb743d7 100755 --- a/scripts/leaking_addresses.pl +++ b/scripts/leaking_addresses.pl @@ -105,10 +105,10 @@ Options: -o, --output-raw= Save results for future processing. -i, --input-raw= Read results from file instead of scanning. - --rawShow raw results (default). - --suppress-dmesg Do not show dmesg results. - --squash-by-path Show one result per unique path. - --squash-by-filename Show one result per unique filename. + --rawShow raw results (default). + --suppress-dmesg Do not show dmesg results. + --squash-by-path Show one result per unique path. + --squash-by-filename Show one result per unique filename. --page-offset= PAGE_OFFSET value (for 32-bit kernels). -d, --debug Display debugging output. -h, --help, --versionDisplay this help and exit. @@ -124,6 +124,10 @@ Examples: # View summary report. $0 --input-raw scan.out --squash-by-filename + # (On a 32-bit system with a 2GB:2GB VMSPLIT), pass PAGE_OFFSET value + # as a parameter + $0 --page-offset=0x8000 + Scans the running (32 or 64 bit) kernel for potential leaking addresses. EOM
[PATCH] ALSA: hda: Add Raven PCI ID
This commit adds PCI ID for Raven platform

Signed-off-by: Vijendar Mukunda
---
 sound/pci/hda/hda_intel.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/sound/pci/hda/hda_intel.c b/sound/pci/hda/hda_intel.c
index 01eb1dc..9c7d479 100644
--- a/sound/pci/hda/hda_intel.c
+++ b/sound/pci/hda/hda_intel.c
@@ -2454,6 +2454,8 @@ static const struct pci_device_id azx_ids[] = {
 	/* AMD Hudson */
 	{ PCI_DEVICE(0x1022, 0x780d),
 	  .driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_SB },
+	{ PCI_DEVICE(0x1022, 0x15e3),
+	  .driver_data = AZX_DRIVER_GENERIC | AZX_DCAPS_PRESET_ATI_SB },
 	/* ATI HDMI */
 	{ PCI_DEVICE(0x1002, 0x0002),
 	  .driver_data = AZX_DRIVER_ATIHDMI_NS | AZX_DCAPS_PRESET_ATI_HDMI_NS },
-- 
2.7.4
Re: [PATCH v7 07/13] slimbus: Add support for 'clock-pause' feature
On Wed, Nov 15, 2017 at 02:10:37PM +, srinivas.kandaga...@linaro.org wrote:
> From: Sagar Dharia
>
> Per slimbus specification, a reconfiguration sequence known as
> 'clock pause' needs to be broadcast over the bus while entering low-
> power mode. Clock-pause is initiated by the controller driver.
> To exit clock-pause, controller typically wakes up the framer device.
> Since wakeup procedure is controller-specific, framework calls it via
> controller's function pointer to invoke it.
>
> Signed-off-by: Sagar Dharia
> Signed-off-by: Srinivas Kandagatla
> ---

> +/**
> + * struct slim_sched: Framework uses this structure internally for scheduling.

Missing kernel doc for clkgear here.

> + * @clk_state: Controller's clock state from enum slim_clk_state
> + * @pause_comp: Signals completion of clock pause sequence. This is useful when
> + *	client tries to call slimbus transaction when controller is entering
> + *	clock pause.
> + * @m_reconf: This mutex is held until current reconfiguration (data channel
> + *	scheduling, message bandwidth reservation) is done. Message APIs can
> + *	use the bus concurrently when this mutex is held since elemental access
> + *	messages can be sent on the bus when reconfiguration is in progress.
> + */
> +struct slim_sched {
> +	int			clkgear;
> +	enum slim_clk_state	clk_state;
> +	struct completion	pause_comp;
> +	struct mutex		m_reconf;
> +};

Thanks,
Charles
Re: [PATCH 00/23] [v4] KAISER: unmap most of the kernel from userspace page tables
32-bit x86 defconfig still doesn't build:

 arch/x86/events/intel/ds.c: In function ‘dsalloc’:
 arch/x86/events/intel/ds.c:296:6: error: implicit declaration of function ‘kaiser_add_mapping’; did you mean ‘kgid_has_mapping’? [-Werror=implicit-function-declaration]

Also, could you please use proper subsystem tags, instead of:

 Subject: x86, kaiser: Disable global pages by default with KAISER

Please do something like:

 Subject: x86/mm/kaiser: Disable global pages by default with KAISER

Thanks,

	Ingo
Re: Add fine grained sampled metrics for perf script
On Fri, Nov 17, 2017 at 01:42:57PM -0800, Andi Kleen wrote: SNIP > TopDown: > > Note TopDown requires disabling SMT if you have it enabled (e.g. by offlining > the extra CPUs), because SMT would require sampling per core, which is not > supported. > > $ perf record -e '{ref-cycles,topdown-fetch-bubbles,topdown-recovery-bubbles,\ > topdown-slots-retired,topdown-total-slots,topdown-slots-issued}:S' -a sleep 1 > $ perf script --header -I -F cpu,ip,sym,event,metric,period > ... > [000] 121108 ref-cycles: 8165222e > copy_user_enhanced_fast_string > [000] 190350topdown-fetch-bubbles: 8165222e > copy_user_enhanced_fast_string > [000] 2055 topdown-recovery-bubbles: 8165222e > copy_user_enhanced_fast_string > [000] 148729topdown-slots-retired: 8165222e > copy_user_enhanced_fast_string > [000] 144324 topdown-total-slots: 8165222e > copy_user_enhanced_fast_string > [000] 160852 topdown-slots-issued: 8165222e > copy_user_enhanced_fast_string > [000] metric: 33.0% frontend bound > [000] metric: 3.5% bad speculation > [000] metric: 25.8% retiring > [000] metric: 37.7% backend bound > [000] 112112 ref-cycles: 8165aec8 > _raw_spin_lock_irqsave > [000] 357222topdown-fetch-bubbles: 8165aec8 > _raw_spin_lock_irqsave > [000] 3325 topdown-recovery-bubbles: 8165aec8 > _raw_spin_lock_irqsave > [000] 323553topdown-slots-retired: 8165aec8 > _raw_spin_lock_irqsave > [000] 270507 topdown-total-slots: 8165aec8 > _raw_spin_lock_irqsave > [000] 341226 topdown-slots-issued: 8165aec8 > _raw_spin_lock_irqsave > [000] metric: 33.0% frontend bound > [000] metric: 2.9% bad speculation > [000] metric: 29.9% retiring > [000] metric: 34.2% backend bound > > > Git tree: > git://git.kernel.org/pub/scm/limux/kernel/git/ak/linux-misc.git > perf/script-metric-3 > > > v1: Initial post > v2: > Remove already merged patches. > Use evsel->priv for new fields > Port to new base line, support fp output. 
> Handle stats in ->stats, not ->priv
> Minor cleanups
> v3:
> Enable EVENT_UPDATE in perf record, and record unit/scale/cpu map/thread map
> Drop the previous zero cpu map hack.

Acked-by: Jiri Olsa

thanks,
jirka
[PATCH 5/6] ARM: dts: keystone-k2g-ice: Add DT nodes for few peripherals
Add DT nodes for QSPI, on board LEDS, MMC, I2C, PCA IO expander, gpio-decoder and regulators on K2G ICE board. Thanks to Franklin S Cooper Jrfor initial work on few peripherals. Signed-off-by: Franklin S Cooper Jr Signed-off-by: Vignesh R --- arch/arm/boot/dts/keystone-k2g-ice.dts | 336 + 1 file changed, 336 insertions(+) diff --git a/arch/arm/boot/dts/keystone-k2g-ice.dts b/arch/arm/boot/dts/keystone-k2g-ice.dts index 78692745e0af..1736eb53ad83 100644 --- a/arch/arm/boot/dts/keystone-k2g-ice.dts +++ b/arch/arm/boot/dts/keystone-k2g-ice.dts @@ -30,6 +30,191 @@ status = "okay"; }; }; + + vmain: fixedregulator-vmain { + compatible = "regulator-fixed"; + regulator-name = "vmain_fixed"; + regulator-min-microvolt = <2400>; + regulator-max-microvolt = <2400>; + regulator-always-on; + }; + + v5_0: fixedregulator-v5_0 { + /* TPS54531 */ + compatible = "regulator-fixed"; + regulator-name = "v5_0_fixed"; + regulator-min-microvolt = <500>; + regulator-max-microvolt = <500>; + vin-supply = <>; + regulator-always-on; + }; + + vdd_3v3: fixedregulator-vdd_3v3 { + /* TLV62084 */ + compatible = "regulator-fixed"; + regulator-name = "vdd_3v3_fixed"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + vin-supply = <_0>; + regulator-always-on; + }; + + vdd_1v8: fixedregulator-vdd_1v8 { + /* TLV62084 */ + compatible = "regulator-fixed"; + regulator-name = "vdd_1v8_fixed"; + regulator-min-microvolt = <180>; + regulator-max-microvolt = <180>; + vin-supply = <_0>; + regulator-always-on; + }; + + vdds_ddr: fixedregulator-vdds_ddr { + /* TLV62080 */ + compatible = "regulator-fixed"; + regulator-name = "vdds_ddr_fixed"; + regulator-min-microvolt = <135>; + regulator-max-microvolt = <135>; + vin-supply = <_0>; + regulator-always-on; + }; + + vref_ddr: fixedregulator-vref_ddr { + /* LP2996A */ + compatible = "regulator-fixed"; + regulator-name = "vref_ddr_fixed"; + regulator-min-microvolt = <675000>; + regulator-max-microvolt = <675000>; + vin-supply = <_3v3>; + 
regulator-always-on; + }; + + vtt_ddr: fixedregulator-vtt_ddr { + /* LP2996A */ + compatible = "regulator-fixed"; + regulator-name = "vtt_ddr_fixed"; + regulator-min-microvolt = <675000>; + regulator-max-microvolt = <675000>; + vin-supply = <_3v3>; + regulator-always-on; + }; + + vdd_0v9: fixedregulator-vdd_0v9 { + /* TPS62180 */ + compatible = "regulator-fixed"; + regulator-name = "vdd_0v9_fixed"; + regulator-min-microvolt = <90>; + regulator-max-microvolt = <90>; + vin-supply = <_0>; + regulator-always-on; + }; + + vddb: fixedregulator-vddb { + /* TPS22945 */ + compatible = "regulator-fixed"; + regulator-name = "vddb_fixed"; + regulator-min-microvolt = <330>; + regulator-max-microvolt = <330>; + + gpio = < 53 GPIO_ACTIVE_HIGH>; + enable-active-high; + }; + + gpio-decoder { + compatible = "gpio-decoder"; + gpios = < 3 GPIO_ACTIVE_HIGH>, + < 2 GPIO_ACTIVE_HIGH>, + < 1 GPIO_ACTIVE_HIGH>, + < 0 GPIO_ACTIVE_HIGH>; + linux,axis = <0>; /* ABS_X */ + decoder-max-value = <9>; + }; + + leds1 { + compatible = "gpio-leds"; + pinctrl-names = "default"; + pinctrl-0 = <_leds>; + + led0 { + label = "status0:red:cpu0"; + gpios = < 11 GPIO_ACTIVE_HIGH>; + default-state = "off"; + linux,default-trigger = "cpu0"; + }; + + led1 { + label = "status0:green:usr"; + gpios = < 12 GPIO_ACTIVE_HIGH>; + default-state = "off"; + }; + + led2 { + label = "status0:yellow:usr"; + gpios = < 13 GPIO_ACTIVE_HIGH>;
RE: [PATCH v18 0/6] drm/i915/gvt: Dma-buf support for GVT-g
> -Original Message- > From: intel-gvt-dev [mailto:intel-gvt-dev-boun...@lists.freedesktop.org] On > Behalf Of Zhenyu Wang > Sent: Thursday, November 23, 2017 2:13 PM > To: Gerd Hoffmann> Cc: Tian, Kevin ; alex.william...@redhat.com; intel- > g...@lists.freedesktop.org; joonas.lahti...@linux.intel.com; Wang, Zhi A > ; linux-kernel@vger.kernel.org; > zhen...@linux.intel.com; Zhang, Tina ; > kwankh...@nvidia.com; Lv, Zhiyuan ; dan...@ffwll.ch; > ch...@chris-wilson.co.uk; intel-gvt-...@lists.freedesktop.org; Yuan, Hang > > Subject: Re: [PATCH v18 0/6] drm/i915/gvt: Dma-buf support for GVT-g > > On 2017.11.15 11:49:00 +0100, Gerd Hoffmann wrote: > > On Wed, Nov 15, 2017 at 05:11:49PM +0800, Tina Zhang wrote: > > > v17->v18: > > > 1) unmap vgpu's opregion when destroying vgpu. > > > 2) update comments for VFIO_DEVICE_GET_GFX_DMABUF. (Alex) > > > > > This patch set adds the dma-buf support for intel GVT-g. > > > > > > dma-buf is an uniform mechanism to share DMA buffers across > > > different devices and subsystems. dma-buf for intel GVT-g is mainly > > > used to share the vgpu's framebuffer to userspace to leverage > > > userspace graphics stacks to render the framebuffer to the display > > > monitor. > > > > > > The main idea is that we create a gem object and set vgpu's > > > framebuffer as its backing storage. Then, export a dma-buf associated with > this gem object. > > > With the fd of this dma-buf, userspace can directly handle this buffer. > > > > > > This patch set can be tried with the following example: > > > git://git.kraxel.org/qemu branch: work/intel-vgpu > > > > > > A topic branch with the latest patch set is: > > > https://github.com/intel/gvt-linux.git branch: topic/dmabuf > > > > Tested-by: Gerd Hoffmann > > > > After debugging with Tina on one left race that fixed by > https://lists.freedesktop.org/archives/intel-gvt-dev/2017- > November/002505.html The next version of this patch set will include this patch. Thanks. 
BR, Tina > > I still need below qemu fix for proper cursor handling, otherwise qemu just > crashed when I click in my terminal program which hides cursor then. > > diff --git a/hw/vfio/display.c b/hw/vfio/display.c index > e500ec2cb1..d9a044b080 100644 > --- a/hw/vfio/display.c > +++ b/hw/vfio/display.c > @@ -169,8 +169,9 @@ static void vfio_display_dmabuf_update(void *opaque) > cursor = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_CURSOR); > if (vdev->cursor != cursor) { > vdev->cursor = cursor; > -dpy_gl_cursor_dmabuf(vdev->display_con, > - >buf); > +if (cursor) > +dpy_gl_cursor_dmabuf(vdev->display_con, > + >buf); > free_bufs = true; > } > if (cursor != NULL) { > > And with these it seems pretty fine now that I'll queue them up for -next > pull. > > thanks > > -- > Open Source Technology Center, Intel ltd. > > $gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
[PATCH 3/6] ARM: dts: keystone-k2g: Move ti,non-removable property to board dts
On the 66AK2G EVM, mmc1 is connected to eMMC, whereas the 66AK2G ICE board has an SD card slot connected to mmc1. Therefore move the eMMC-specific ti,non-removable property from the SoC file to the EVM's dts file.

Signed-off-by: Vignesh R
---
 arch/arm/boot/dts/keystone-k2g-evm.dts | 1 +
 arch/arm/boot/dts/keystone-k2g.dtsi    | 1 -
 2 files changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/keystone-k2g-evm.dts b/arch/arm/boot/dts/keystone-k2g-evm.dts
index 298a50555e46..03b3e7c5dc8e 100644
--- a/arch/arm/boot/dts/keystone-k2g-evm.dts
+++ b/arch/arm/boot/dts/keystone-k2g-evm.dts
@@ -127,6 +127,7 @@
 	pinctrl-names = "default";
 	pinctrl-0 = <_pins>;
 	vmmc-supply = <_dcin_reg>; /* VCC3V3_EMMC is connected to VCC3V3_DCIN */
+	ti,non-removable;
 	status = "okay";
 };
 
diff --git a/arch/arm/boot/dts/keystone-k2g.dtsi b/arch/arm/boot/dts/keystone-k2g.dtsi
index 01d29320b04c..ef82c0a6e607 100644
--- a/arch/arm/boot/dts/keystone-k2g.dtsi
+++ b/arch/arm/boot/dts/keystone-k2g.dtsi
@@ -372,7 +372,6 @@
 			dma-names = "tx", "rx";
 			bus-width = <8>;
 			ti,needs-special-reset;
-			ti,non-removable;
 			max-frequency = <9600>;
 			power-domains = <_pds 0xc>;
 			clocks = <_clks 0xc 1>, <_clks 0xc 2>;
-- 
2.15.0
[PATCH 4/6] ARM: dts: keystone-k2g-evm: Add QSPI DT node.
66AK2G EVM has a s25fl512s flash connected to QSPI. Add DT nodes for the same. Signed-off-by: Vignesh R--- arch/arm/boot/dts/keystone-k2g-evm.dts | 59 ++ 1 file changed, 59 insertions(+) diff --git a/arch/arm/boot/dts/keystone-k2g-evm.dts b/arch/arm/boot/dts/keystone-k2g-evm.dts index 03b3e7c5dc8e..8d100217e38f 100644 --- a/arch/arm/boot/dts/keystone-k2g-evm.dts +++ b/arch/arm/boot/dts/keystone-k2g-evm.dts @@ -103,6 +103,18 @@ K2G_CORE_IOPAD(0x11b4) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* spi1_mosi.spi1_mosi */ >; }; + + qspi_pins: pinmux_qspi_pins { + pinctrl-single,pins = < + K2G_CORE_IOPAD(0x1204) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_clk.qspi_clk */ + K2G_CORE_IOPAD(0x1208) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_rclk.qspi_rclk */ + K2G_CORE_IOPAD(0x120c) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_d0.qspi_d0 */ + K2G_CORE_IOPAD(0x1210) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_d1.qspi_d1 */ + K2G_CORE_IOPAD(0x1214) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_d2.qspi_d2 */ + K2G_CORE_IOPAD(0x1218) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_d3.qspi_d3 */ + K2G_CORE_IOPAD(0x121c) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0) /* qspi_csn0.qspi_csn0 */ + >; + }; }; { @@ -204,3 +216,50 @@ }; }; }; + + { + status = "okay"; + pinctrl-names = "default"; + pinctrl-0 = <_pins>; + cdns,rclk-en; + + flash0: m25p80@0 { + compatible = "s25fl512s", "jedec,spi-nor"; + reg = <0>; + spi-tx-bus-width = <1>; + spi-rx-bus-width = <4>; + spi-max-frequency = <9600>; + #address-cells = <1>; + #size-cells = <1>; + cdns,read-delay = <5>; + cdns,tshsl-ns = <500>; + cdns,tsd2d-ns = <500>; + cdns,tchsh-ns = <119>; + cdns,tslch-ns = <119>; + + partition@0 { + label = "QSPI.u-boot-spl-os"; + reg = <0x 0x0010>; + }; + partition@1 { + label = "QSPI.u-boot-env"; + reg = <0x0010 0x0004>; + }; + partition@2 { + label = "QSPI.skern"; + reg = <0x0014 0x004>; + }; + partition@3 { + label = "QSPI.pmmc-firmware"; + reg = <0x0018 0x004>; 
+ }; + partition@4 { + label = "QSPI.kernel"; + reg = <0x001C 0x080>; + }; + partition@5 { + label = "QSPI.file-system"; + reg = <0x009C 0x364>; + }; + }; +}; -- 2.15.0
[PATCH 2/6] ARM: dts: keystone-k2g-evm: Fix botched up merge
spi1 and ecap0 pinmuxes ended up under root node instead of k2g_pinctrl node. Fix this by moving them under k2g_pinctrl node.

Signed-off-by: Vignesh R
---
 arch/arm/boot/dts/keystone-k2g-evm.dts | 30 ++
 1 file changed, 14 insertions(+), 16 deletions(-)

diff --git a/arch/arm/boot/dts/keystone-k2g-evm.dts b/arch/arm/boot/dts/keystone-k2g-evm.dts
index 656af194a518..298a50555e46 100644
--- a/arch/arm/boot/dts/keystone-k2g-evm.dts
+++ b/arch/arm/boot/dts/keystone-k2g-evm.dts
@@ -45,22 +45,6 @@
 		regulator-max-microvolt = <330>;
 		regulator-always-on;
 	};
-
-	ecap0_pins: ecap0_pins {
-		pinctrl-single,pins = <
-			K2G_CORE_IOPAD(0x1374) (BUFFER_CLASS_B | MUX_MODE4)	/* pr1_mdio_data.ecap0_in_apwm0_out */
-		>;
-	};
-
-	spi1_pins: pinmux_spi1_pins {
-		pinctrl-single,pins = <
-			K2G_CORE_IOPAD(0x11a4) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_scs0.spi1_scs0 */
-			K2G_CORE_IOPAD(0x11ac) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_clk.spi1_clk */
-			K2G_CORE_IOPAD(0x11b0) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_miso.spi1_miso */
-			K2G_CORE_IOPAD(0x11b4) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_mosi.spi1_mosi */
-		>;
-	};
-
 };
 
 _pinctrl {
@@ -105,6 +89,20 @@
 		>;
 	};
 
+	ecap0_pins: ecap0_pins {
+		pinctrl-single,pins = <
+			K2G_CORE_IOPAD(0x1374) (BUFFER_CLASS_B | MUX_MODE4)	/* pr1_mdio_data.ecap0_in_apwm0_out */
+		>;
+	};
+
+	spi1_pins: pinmux_spi1_pins {
+		pinctrl-single,pins = <
+			K2G_CORE_IOPAD(0x11a4) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_scs0.spi1_scs0 */
+			K2G_CORE_IOPAD(0x11ac) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_clk.spi1_clk */
+			K2G_CORE_IOPAD(0x11b0) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_miso.spi1_miso */
+			K2G_CORE_IOPAD(0x11b4) (BUFFER_CLASS_B | PULL_DISABLE | MUX_MODE0)	/* spi1_mosi.spi1_mosi */
+		>;
+	};
 };
 
  {
-- 
2.15.0
[PATCH] drm/i915: Avoid enum conversion warning
Fixes the following enum conversion warning:

drivers/gpu/drm/i915/intel_ddi.c:1481:30: error: implicit conversion from enumeration type 'enum port' to different enumeration type 'enum intel_dpll_id' [-Werror,-Wenum-conversion]
        enum intel_dpll_id pll_id = port;
                           ~~       ^~~~

Signed-off-by: Nick Desaulniers
---
 drivers/gpu/drm/i915/intel_ddi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index 933c18fd4258..f9de45316901 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1478,7 +1478,7 @@ static void bxt_ddi_clock_get(struct intel_encoder *encoder,
 {
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	enum port port = intel_ddi_get_encoder_port(encoder);
-	enum intel_dpll_id pll_id = port;
+	uint32_t pll_id = port;
 
 	pipe_config->port_clock = bxt_calc_pll_link(dev_priv, pll_id);
 
-- 
2.14.1
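
For readers unfamiliar with the diagnostic, the following stand-alone sketch shows the same kind of enum-to-enum assignment with an explicit cast, which is another common way to make the conversion visible to the compiler. It is not what the patch does (the patch changes the variable type to uint32_t instead), and all names here are hypothetical stand-ins, not the i915 definitions. The sketch assumes the two enumerations line up value for value.

```c
#include <assert.h>

/* Hypothetical enumerations standing in for a port and a PLL id. */
enum sketch_port { SKETCH_PORT_A, SKETCH_PORT_B, SKETCH_PORT_C };
enum sketch_dpll_id { SKETCH_DPLL_0, SKETCH_DPLL_1, SKETCH_DPLL_2 };

/* Assumption: port N uses PLL N, so the numeric value carries over.
 * The explicit cast states that intent; an implicit assignment between
 * the two enum types is what -Wenum-conversion reports. */
enum sketch_dpll_id sketch_port_to_dpll(enum sketch_port port)
{
	return (enum sketch_dpll_id)port;
}

/* Self-check of the one-to-one mapping assumed above. */
void sketch_selfcheck(void)
{
	assert(sketch_port_to_dpll(SKETCH_PORT_A) == SKETCH_DPLL_0);
	assert(sketch_port_to_dpll(SKETCH_PORT_B) == SKETCH_DPLL_1);
	assert(sketch_port_to_dpll(SKETCH_PORT_C) == SKETCH_DPLL_2);
}
```

Keeping the mapping in a small named helper has the side benefit that the one-to-one assumption lives in a single place if the enumerations ever diverge.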
[PATCH 6/6] ARM: configs: keystone_defconfig: Enable few peripheral drivers
Enable drivers for QSPI, LEDS, gpio-decoder that are present on 66AK2G EVM and 66AK2G ICE boards. Signed-off-by: Vignesh R--- arch/arm/configs/keystone_defconfig | 7 +++ 1 file changed, 7 insertions(+) diff --git a/arch/arm/configs/keystone_defconfig b/arch/arm/configs/keystone_defconfig index f710c192b33a..2536c231eea1 100644 --- a/arch/arm/configs/keystone_defconfig +++ b/arch/arm/configs/keystone_defconfig @@ -228,3 +228,10 @@ CONFIG_CRYPTO_DES=y CONFIG_CRYPTO_ANSI_CPRNG=y CONFIG_CRYPTO_USER_API_HASH=y CONFIG_CRYPTO_USER_API_SKCIPHER=y +CONFIG_SPI_CADENCE_QUADSPI=y +CONFIG_INPUT_MISC=y +CONFIG_INPUT_EVDEV=m +CONFIG_INPUT_GPIO_DECODER=m +CONFIG_GPIO_PCA953X=m +CONFIG_LEDS_TRIGGER_ACTIVITY=y +CONFIG_LEDS_TRIGGER_CPU=y -- 2.15.0
[PATCH 1/6] ARM: dts: keystone-k2g: Add QSPI DT entry
Add DT node for Cadence QSPI IP present in 66AK2G SoC. Signed-off-by: Vignesh R--- arch/arm/boot/dts/keystone-k2g.dtsi | 14 ++ 1 file changed, 14 insertions(+) diff --git a/arch/arm/boot/dts/keystone-k2g.dtsi b/arch/arm/boot/dts/keystone-k2g.dtsi index 8f313ff406b9..01d29320b04c 100644 --- a/arch/arm/boot/dts/keystone-k2g.dtsi +++ b/arch/arm/boot/dts/keystone-k2g.dtsi @@ -377,6 +377,20 @@ power-domains = <_pds 0xc>; clocks = <_clks 0xc 1>, <_clks 0xc 2>; clock-names = "fck", "mmchsdb_fck"; + }; + + qspi: qspi@294 { + compatible = "ti,k2g-qspi", "cdns,qspi-nor"; + #address-cells = <1>; + #size-cells = <0>; + reg = <0x0294 0x1000>, + <0x2400 0x400>; + interrupts = ; + cdns,fifo-depth = <256>; + cdns,fifo-width = <4>; + cdns,trigger-address = <0x2400>; + clocks = <_clks 0x43 0x0>; + power-domains = <_pds 0x43>; status = "disabled"; }; -- 2.15.0
[PATCH 0/6] 66AK2G: Add DT nodes for few peripherals
This patch series adds DT nodes for a bunch of peripherals on 66AK2G EVM and 66AK2G ICE boards. Tested on 66AK2G EVM and ICE boards. Vignesh R (6): ARM: dts: keystone-k2g: Add QSPI DT entry ARM: dts: keystone-k2g-evm: Fix botched up merge ARM: dts: keystone-k2g: Move ti,non-removable property to board dts ARM: dts: keystone-k2g-evm: Add QSPI DT node. ARM: dts: keystone-k2g-ice: Add DT nodes for few peripherals ARM: configs: keystone_defconfig: Enable few peripheral drivers arch/arm/boot/dts/keystone-k2g-evm.dts | 90 +++-- arch/arm/boot/dts/keystone-k2g-ice.dts | 336 + arch/arm/boot/dts/keystone-k2g.dtsi| 15 +- arch/arm/configs/keystone_defconfig| 7 + 4 files changed, 431 insertions(+), 17 deletions(-) -- 2.15.0
Re: [PATCH v6 27/37] tracing: Add 'onmax' hist trigger action support
On Fri, Nov 17, 2017 at 02:33:06PM -0600, Tom Zanussi wrote: > Add an 'onmax(var).save(field,...)' hist trigger action which is > invoked whenever an event exceeds the current maximum. > > The end result is that the trace event fields or variables specified > as the onmax.save() params will be saved if 'var' exceeds the current > maximum for that hist trigger entry. This allows context from the > event that exhibited the new maximum to be saved for later reference. > When the histogram is displayed, additional fields displaying the > saved values will be printed. > > As an example the below defines a couple of hist triggers, one for > sched_wakeup and another for sched_switch, keyed on pid. Whenever a > sched_wakeup occurs, the timestamp is saved in the entry corresponding > to the current pid, and when the scheduler switches back to that pid, > the timestamp difference is calculated. If the resulting latency > exceeds the current maximum latency, the specified save() values are > saved: > > # echo 'hist:keys=pid:ts0=$common_timestamp.usecs \ > if comm=="cyclictest"' >> \ > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger > > # echo 'hist:keys=next_pid:\ > wakeup_lat=$common_timestamp.usecs-$ts0:\ > onmax($wakeup_lat).save(next_comm,prev_pid,prev_prio,prev_comm) \ > if next_comm=="cyclictest"' >> \ > /sys/kernel/debug/tracing/events/sched/sched_switch/trigger > > When the histogram is displayed, the max value and the saved values > corresponding to the max are displayed following the rest of the > fields: > > # cat /sys/kernel/debug/tracing/events/sched/sched_switch/hist > > { next_pid: 3728 } hitcount:199 \ > max:123 next_comm: cyclictest prev_pid: 0 \ > prev_prio:120 prev_comm: swapper/3 > { next_pid: 3730 } hitcount: 1321 \ > max: 15 next_comm: cyclictest prev_pid: 0 \ > prev_prio:120 prev_comm: swapper/1 > { next_pid: 3729 } hitcount: 1973\ > max: 25 next_comm: cyclictest prev_pid: 0 \ > prev_prio:120 prev_comm: swapper/0 > > Totals: > Hits: 3493 > 
Entries: 3 > Dropped: 0 > > Signed-off-by: Tom Zanussi> --- [SNIP] > +static int onmax_create(struct hist_trigger_data *hist_data, > + struct action_data *data) > +{ > + struct trace_event_call *call = hist_data->event_file->event_call; > + struct trace_event_file *file = hist_data->event_file; > + struct hist_field *var_field, *ref_field, *max_var; > + unsigned int var_ref_idx = hist_data->n_var_refs; > + struct field_var *field_var; > + char *onmax_var_str, *param; > + const char *event_name; > + unsigned long flags; > + unsigned int i; > + int ret = 0; > + > + onmax_var_str = data->onmax.var_str; > + if (onmax_var_str[0] != '$') > + return -EINVAL; > + onmax_var_str++; > + > + event_name = trace_event_name(call); It seems not used. > + var_field = find_target_event_var(hist_data, NULL, NULL, onmax_var_str); > + if (!var_field) > + return -EINVAL; > + > + flags = HIST_FIELD_FL_VAR_REF; > + ref_field = create_hist_field(hist_data, NULL, flags, NULL); > + if (!ref_field) > + return -ENOMEM; > + > + if (init_var_ref(ref_field, var_field, NULL, NULL)) { > + destroy_hist_field(ref_field, 0); > + ret = -ENOMEM; > + goto out; > + } > + hist_data->var_refs[hist_data->n_var_refs] = ref_field; > + ref_field->var_ref_idx = hist_data->n_var_refs++; > + data->onmax.var = ref_field; I was confused that this could create a reference to self-variable which would prevent a hist from being freed. IIUC it tries to avoid such a self reference by using local field variable and disallowing variable of expression, right? But it seems that's not the case since the reference is saved in other place than hist_data->fields which is used in find_var_ref(). 
> + > + data->fn = onmax_save; > + data->onmax.max_var_ref_idx = var_ref_idx; > + max_var = create_var(hist_data, file, "max", sizeof(u64), "u64"); > + if (IS_ERR(max_var)) { > + ret = PTR_ERR(max_var); > + goto out; > + } > + data->onmax.max_var = max_var; > + > + for (i = 0; i < data->n_params; i++) { > + param = kstrdup(data->params[i], GFP_KERNEL); > + if (!param) { > + ret = -ENOMEM; > + goto out; > + } > + > + field_var = create_target_field_var(hist_data, NULL, NULL, > param); > + if (IS_ERR(field_var)) { > + ret = PTR_ERR(field_var); > + kfree(param); > + goto out; > + } > + > +
[PATCH RT 01/10] timer/hrtimer: check properly for a running timer
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej Siewior hrtimer_callback_running() checks only whether a timer is running on a CPU in hardirq-context. This is okay for !RT. For RT environment we move most timers to the timer-softirq and therefore we need to check if the timer is running in the softirq context. Cc: stable...@vger.kernel.org Reported-by: Alexander Gerasiov Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- include/linux/hrtimer.h | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h index 8fbcdfa5dc77..ff317006d3e8 100644 --- a/include/linux/hrtimer.h +++ b/include/linux/hrtimer.h @@ -455,7 +455,13 @@ static inline int hrtimer_is_queued(struct hrtimer *timer) */ static inline int hrtimer_callback_running(const struct hrtimer *timer) { - return timer->base->cpu_base->running == timer; + if (timer->base->cpu_base->running == timer) + return 1; +#ifdef CONFIG_PREEMPT_RT_BASE + if (timer->base->cpu_base->running_soft == timer) + return 1; +#endif + return 0; } /* Forward a hrtimer so it expires after now: */ -- 2.13.2
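The two-slot check in the patch above can be modelled in plain C. This is a simplified sketch under stated assumptions: `demo_timer`, `demo_cpu_base` and the `rt_enabled` parameter are hypothetical, with the runtime flag standing in for the `CONFIG_PREEMPT_RT_BASE` compile-time switch. It only illustrates why a timer running in the softirq slot is missed when the hardirq slot alone is checked.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct demo_timer {
	int id;
};

struct demo_cpu_base {
	const struct demo_timer *running;	/* callback executing in hardirq context */
	const struct demo_timer *running_soft;	/* callback executing in the timer softirq */
};

/*
 * Returns 1 if the timer's callback is executing in either context.
 * rt_enabled models CONFIG_PREEMPT_RT_BASE as a runtime flag.
 */
static inline int demo_callback_running(const struct demo_cpu_base *base,
					const struct demo_timer *timer,
					int rt_enabled)
{
	if (base->running == timer)
		return 1;
	if (rt_enabled && base->running_soft == timer)
		return 1;
	return 0;
}
```

With a timer recorded only in `running_soft`, the helper reports "not running" when `rt_enabled` is 0 and "running" when it is 1, which is the behaviour difference the patch description refers to.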
[PATCH RT 05/10] sched: Remove TASK_ALL
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Peter ZijlstraIt's unused: $ git grep "\ " | wc -l 1 And dangerous, kill the bugger. Cc: stable...@vger.kernel.org Acked-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- include/linux/sched.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index b7b001e26509..56ccd0a3dd49 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -234,7 +234,6 @@ extern char ___assert_task_state[1 - 2*!!( /* Convenience macros for the sake of wake_up */ #define TASK_NORMAL(TASK_INTERRUPTIBLE | TASK_UNINTERRUPTIBLE) -#define TASK_ALL (TASK_NORMAL | __TASK_STOPPED | __TASK_TRACED) /* get_task_state() */ #define TASK_REPORT(TASK_RUNNING | TASK_INTERRUPTIBLE | \ -- 2.13.2
[PATCH RT 02/10] rtmutex: Make lock_killable work
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Thomas Gleixner Locking an rt mutex killable does not work because signal handling is restricted to TASK_INTERRUPTIBLE. Use signal_pending_state() unconditionally. Cc: rt-sta...@vger.kernel.org Signed-off-by: Thomas Gleixner Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- kernel/locking/rtmutex.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c index 0e9a6260441d..552dc6dd3a79 100644 --- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -1672,18 +1672,13 @@ __rt_mutex_slowlock(struct rt_mutex *lock, int state, if (try_to_take_rt_mutex(lock, current, waiter)) break; - /* -* TASK_INTERRUPTIBLE checks for signals and -* timeout. Ignored otherwise. -*/ - if (unlikely(state == TASK_INTERRUPTIBLE)) { - /* Signal pending? */ - if (signal_pending(current)) - ret = -EINTR; - if (timeout && !timeout->task) - ret = -ETIMEDOUT; - if (ret) - break; + if (timeout && !timeout->task) { + ret = -ETIMEDOUT; + break; + } + if (signal_pending_state(state, current)) { + ret = -EINTR; + break; } if (ww_ctx && ww_ctx->acquired > 0) { -- 2.13.2
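Why the old `state == TASK_INTERRUPTIBLE` test breaks killable waits can be shown with a small userspace model. This is a sketch, not kernel code: the `DEMO_*` flag values and `struct demo_task` are assumptions made for illustration, and `demo_signal_pending_state()` mirrors the logic of the kernel helper as I understand it (interruptible sleeps react to any signal, killable sleeps only to fatal ones).

```c
#include <stdbool.h>

/* Illustrative flag values; treat them as assumptions, not kernel ABI. */
#define DEMO_TASK_INTERRUPTIBLE		0x01
#define DEMO_TASK_UNINTERRUPTIBLE	0x02
#define DEMO_TASK_WAKEKILL		0x80
#define DEMO_TASK_KILLABLE	(DEMO_TASK_WAKEKILL | DEMO_TASK_UNINTERRUPTIBLE)

/* Hypothetical reduction of a task to its two signal predicates. */
struct demo_task {
	bool signal_pending;
	bool fatal_signal_pending;
};

/* Model of the state-aware check used after the patch. */
static inline bool demo_signal_pending_state(long state,
					     const struct demo_task *t)
{
	if (!(state & (DEMO_TASK_INTERRUPTIBLE | DEMO_TASK_WAKEKILL)))
		return false;	/* uninterruptible sleep: ignore signals */
	if (!t->signal_pending)
		return false;
	return (state & DEMO_TASK_INTERRUPTIBLE) || t->fatal_signal_pending;
}

/* Model of the check removed by the patch. */
static inline bool demo_old_check(long state, const struct demo_task *t)
{
	return state == DEMO_TASK_INTERRUPTIBLE && t->signal_pending;
}
```

For a task with a fatal signal pending, the old check never fires in the killable state while the state-aware check does, which matches the commit's claim that killable locking did not work.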
[PATCH RT 06/10] sched/migrate disable: handle updated task-mask mg-dis section
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorIf task's cpumask changes while in the task is in a migrate_disable() section then we don't react on it after a migrate_enable(). It matters however if current CPU is no longer part of the cpumask. We also miss the ->set_cpus_allowed() callback. This patch fixes it by setting task->migrate_disable_update once we this "delayed" hook. This bug was introduced while fixing unrelated issue in migrate_disable() in v4.4-rt3 (update_migrate_disable() got removed during that). Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- include/linux/sched.h | 1 + kernel/sched/core.c | 59 +-- 2 files changed, 54 insertions(+), 6 deletions(-) diff --git a/include/linux/sched.h b/include/linux/sched.h index 56ccd0a3dd49..331cdbfc6431 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1438,6 +1438,7 @@ struct task_struct { unsigned int policy; #ifdef CONFIG_PREEMPT_RT_FULL int migrate_disable; + int migrate_disable_update; # ifdef CONFIG_SCHED_DEBUG int migrate_disable_atomic; # endif diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 970b893a1d15..bea476417297 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -1212,18 +1212,14 @@ void set_cpus_allowed_common(struct task_struct *p, const struct cpumask *new_ma p->nr_cpus_allowed = cpumask_weight(new_mask); } -void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) +static void __do_set_cpus_allowed_tail(struct task_struct *p, + const struct cpumask *new_mask) { struct rq *rq = task_rq(p); bool queued, running; lockdep_assert_held(>pi_lock); - if (__migrate_disabled(p)) { - cpumask_copy(>cpus_allowed, new_mask); - return; - } - queued = task_on_rq_queued(p); running = task_current(rq, p); @@ -1246,6 +1242,20 @@ void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) 
enqueue_task(rq, p, ENQUEUE_RESTORE); } +void do_set_cpus_allowed(struct task_struct *p, const struct cpumask *new_mask) +{ + if (__migrate_disabled(p)) { + lockdep_assert_held(>pi_lock); + + cpumask_copy(>cpus_allowed, new_mask); +#if defined(CONFIG_PREEMPT_RT_FULL) && defined(CONFIG_SMP) + p->migrate_disable_update = 1; +#endif + return; + } + __do_set_cpus_allowed_tail(p, new_mask); +} + static DEFINE_PER_CPU(struct cpumask, sched_cpumasks); static DEFINE_MUTEX(sched_down_mutex); static cpumask_t sched_down_cpumask; @@ -3231,6 +3241,43 @@ void migrate_enable(void) */ p->migrate_disable = 0; + if (p->migrate_disable_update) { + unsigned long flags; + struct rq *rq; + + rq = task_rq_lock(p, ); + update_rq_clock(rq); + + __do_set_cpus_allowed_tail(p, >cpus_allowed); + task_rq_unlock(rq, p, ); + + p->migrate_disable_update = 0; + + WARN_ON(smp_processor_id() != task_cpu(p)); + if (!cpumask_test_cpu(task_cpu(p), >cpus_allowed)) { + const struct cpumask *cpu_valid_mask = cpu_active_mask; + struct migration_arg arg; + unsigned int dest_cpu; + + if (p->flags & PF_KTHREAD) { + /* +* Kernel threads are allowed on online && !active CPUs +*/ + cpu_valid_mask = cpu_online_mask; + } + dest_cpu = cpumask_any_and(cpu_valid_mask, >cpus_allowed); + arg.task = p; + arg.dest_cpu = dest_cpu; + + unpin_current_cpu(); + preempt_lazy_enable(); + preempt_enable(); + stop_one_cpu(task_cpu(p), migration_cpu_stop, ); + tlb_migrate_finish(p->mm); + return; + } + } + unpin_current_cpu(); preempt_enable(); preempt_lazy_enable(); -- 2.13.2
[PATCH RT 08/10] fs: convert two more BH_Uptodate_Lock related bitspinlocks
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej SiewiorWe convert all BH_Uptodate_Lock based bit-spinlocks to use bh_uptodate_lock_irqsave() instead. Those two were introduced after the initial change in -RT and were not noticed before. Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- fs/ext4/page-io.c | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c index 6ca56f5f72b5..9e145fe7cae0 100644 --- a/fs/ext4/page-io.c +++ b/fs/ext4/page-io.c @@ -97,8 +97,7 @@ static void ext4_finish_bio(struct bio *bio) * We check all buffers in the page under BH_Uptodate_Lock * to avoid races with other end io clearing async_write flags */ - local_irq_save(flags); - bit_spin_lock(BH_Uptodate_Lock, >b_state); + flags = bh_uptodate_lock_irqsave(head); do { if (bh_offset(bh) < bio_start || bh_offset(bh) + bh->b_size > bio_end) { @@ -110,8 +109,7 @@ static void ext4_finish_bio(struct bio *bio) if (bio->bi_error) buffer_io_error(bh); } while ((bh = bh->b_this_page) != head); - bit_spin_unlock(BH_Uptodate_Lock, >b_state); - local_irq_restore(flags); + bh_uptodate_unlock_irqrestore(head, flags); if (!under_io) { #ifdef CONFIG_EXT4_FS_ENCRYPTION if (ctx) -- 2.13.2
[PATCH RT 00/10] Linux 4.4.97-rt111-rc2
Dear RT Folks, This is the RT stable review cycle of patch 4.4.97-rt111-rc2. Please scream at me if I messed something up. Please test the patches too. The -rc release will be uploaded to kernel.org and will be deleted when the final release is out. This is just a review release (or release candidate). The pre-releases will not be pushed to the git repository, only the final release is. If all goes well, this patch will be converted to the next main release on 11/27/2017. Note, I realized I only backported from v4.13.10-rt3 and not v4.13.13-rt5. Added a few more patches. Enjoy, -- Steve To build 4.4.97-rt111-rc2 directly, the following patches should be applied: http://www.kernel.org/pub/linux/kernel/v4.x/linux-4.4.tar.xz http://www.kernel.org/pub/linux/kernel/v4.x/patch-4.4.97.xz http://www.kernel.org/pub/linux/kernel/projects/rt/4.4/patch-4.4.97-rt111-rc2.patch.xz You can also build from 4.4.97-rt110 by applying the incremental patch: http://www.kernel.org/pub/linux/kernel/projects/rt/4.4/incr/patch-4.4.97-rt110-rt111-rc2.patch.xz Changes from 4.4.97-rt110: --- Peter Zijlstra (1): sched: Remove TASK_ALL Sebastian Andrzej Siewior (6): timer/hrtimer: check properly for a running timer random: avoid preempt_disable()ed section sched/migrate disable: handle updated task-mask mg-dis section kernel/locking: use an exclusive wait_q for sleepers fs: convert two more BH_Uptodate_Lock related bitspinlocks md/raid5: do not disable interrupts Steven Rostedt (VMware) (1): Linux 4.4.97-rt111-rc2 Thomas Gleixner (2): rtmutex: Make lock_killable work sched: Prevent task state corruption by spurious lock wakeup drivers/char/random.c| 10 +++--- drivers/md/raid5.c | 4 +-- fs/ext4/page-io.c| 6 ++-- include/linux/hrtimer.h | 8 - include/linux/sched.h| 19 ++-- kernel/fork.c| 1 + kernel/locking/rtmutex.c | 21 + kernel/sched/core.c | 81 +--- localversion-rt | 2 +- 9 files changed, 113 insertions(+), 39 deletions(-)
[PATCH RT 10/10] Linux 4.4.97-rt111-rc2
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: "Steven Rostedt (VMware)"--- localversion-rt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/localversion-rt b/localversion-rt index b3e668a8fb94..757d33fb65a3 100644 --- a/localversion-rt +++ b/localversion-rt @@ -1 +1 @@ --rt110 +-rt111-rc2 -- 2.13.2
[PATCH RT 03/10] random: avoid preempt_disable()ed section
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej Siewiorextract_crng() will use sleeping locks while in a preempt_disable() section due to get_cpu_var(). Work around it with local_locks. Cc: stable...@vger.kernel.org # where it applies to Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- drivers/char/random.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/drivers/char/random.c b/drivers/char/random.c index fecc40a69df8..b41745c5962c 100644 --- a/drivers/char/random.c +++ b/drivers/char/random.c @@ -260,6 +260,7 @@ #include #include #include +#include #include #include @@ -1796,6 +1797,7 @@ int random_int_secret_init(void) static DEFINE_PER_CPU(__u32 [MD5_DIGEST_WORDS], get_random_int_hash) __aligned(sizeof(unsigned long)); +static DEFINE_LOCAL_IRQ_LOCK(hash_entropy_int_lock); /* * Get a random word for internal kernel use only. Similar to urandom but @@ -1811,12 +1813,12 @@ unsigned int get_random_int(void) if (arch_get_random_int()) return ret; - hash = get_cpu_var(get_random_int_hash); + hash = _locked_var(hash_entropy_int_lock, get_random_int_hash); hash[0] += current->pid + jiffies + random_get_entropy(); md5_transform(hash, random_int_secret); ret = hash[0]; - put_cpu_var(get_random_int_hash); + put_locked_var(hash_entropy_int_lock, get_random_int_hash); return ret; } @@ -1833,12 +1835,12 @@ unsigned long get_random_long(void) if (arch_get_random_long()) return ret; - hash = get_cpu_var(get_random_int_hash); + hash = _locked_var(hash_entropy_int_lock, get_random_int_hash); hash[0] += current->pid + jiffies + random_get_entropy(); md5_transform(hash, random_int_secret); ret = *(unsigned long *)hash; - put_cpu_var(get_random_int_hash); + put_locked_var(hash_entropy_int_lock, get_random_int_hash); return ret; } -- 2.13.2
Re: [PATCH v18 0/6] drm/i915/gvt: Dma-buf support for GVT-g
On 2017.11.15 11:49:00 +0100, Gerd Hoffmann wrote: > On Wed, Nov 15, 2017 at 05:11:49PM +0800, Tina Zhang wrote: > > v17->v18: > > 1) unmap vgpu's opregion when destroying vgpu. > > 2) update comments for VFIO_DEVICE_GET_GFX_DMABUF. (Alex) > > > This patch set adds the dma-buf support for intel GVT-g. > > > > dma-buf is an uniform mechanism to share DMA buffers across different > > devices and subsystems. dma-buf for intel GVT-g is mainly used to share > > the vgpu's framebuffer to userspace to leverage userspace graphics stacks > > to render the framebuffer to the display monitor. > > > > The main idea is that we create a gem object and set vgpu's framebuffer as > > its backing storage. Then, export a dma-buf associated with this gem object. > > With the fd of this dma-buf, userspace can directly handle this buffer. > > > > This patch set can be tried with the following example: > > git://git.kraxel.org/qemu branch: work/intel-vgpu > > > > A topic branch with the latest patch set is: > > https://github.com/intel/gvt-linux.git branch: topic/dmabuf > > Tested-by: Gerd Hoffmann> After debugging with Tina on one left race that fixed by https://lists.freedesktop.org/archives/intel-gvt-dev/2017-November/002505.html I still need below qemu fix for proper cursor handling, otherwise qemu just crashed when I click in my terminal program which hides cursor then. diff --git a/hw/vfio/display.c b/hw/vfio/display.c index e500ec2cb1..d9a044b080 100644 --- a/hw/vfio/display.c +++ b/hw/vfio/display.c @@ -169,8 +169,9 @@ static void vfio_display_dmabuf_update(void *opaque) cursor = vfio_display_get_dmabuf(vdev, DRM_PLANE_TYPE_CURSOR); if (vdev->cursor != cursor) { vdev->cursor = cursor; -dpy_gl_cursor_dmabuf(vdev->display_con, - >buf); +if (cursor) +dpy_gl_cursor_dmabuf(vdev->display_con, + >buf); free_bufs = true; } if (cursor != NULL) { And with these it seems pretty fine now that I'll queue them up for -next pull. thanks -- Open Source Technology Center, Intel ltd. 
$gpg --keyserver wwwkeys.pgp.net --recv-keys 4D781827
Re: [PATCH] arch, mm: introduce arch_tlb_gather_mmu_lazy
On Wed, Nov 22, 2017 at 07:30:50PM +, Will Deacon wrote: > Hi Michal, > > On Mon, Nov 20, 2017 at 05:04:22PM +0100, Michal Hocko wrote: > > On Mon 20-11-17 14:24:44, Will Deacon wrote: > > > On Thu, Nov 16, 2017 at 10:20:42AM +0100, Michal Hocko wrote: > > > > On Wed 15-11-17 17:33:32, Will Deacon wrote: > > [...] > > > > > > diff --git a/arch/arm64/include/asm/tlb.h > > > > > > b/arch/arm64/include/asm/tlb.h > > > > > > index ffdaea7954bb..7adde19b2bcc 100644 > > > > > > --- a/arch/arm64/include/asm/tlb.h > > > > > > +++ b/arch/arm64/include/asm/tlb.h > > > > > > @@ -43,7 +43,7 @@ static inline void tlb_flush(struct mmu_gather > > > > > > *tlb) > > > > > > * The ASID allocator will either invalidate the ASID or mark > > > > > > * it as used. > > > > > > */ > > > > > > - if (tlb->fullmm) > > > > > > + if (tlb->lazy) > > > > > > return; > > > > > > > > > > This looks like the right idea, but I'd rather make this check: > > > > > > > > > > if (tlb->fullmm && tlb->lazy) > > > > > > > > > > since the optimisation doesn't work for anything than tearing down the > > > > > entire address space. > > > > > > > > OK, that makes sense. > > > > > > > > > Alternatively, I could actually go check MMF_UNSTABLE in tlb->mm, > > > > > which > > > > > would save you having to add an extra flag in the first place, e.g.: > > > > > > > > > > if (tlb->fullmm && !test_bit(MMF_UNSTABLE, >mm->flags)) > > > > > > > > > > which is a nice one-liner. > > > > > > > > But that would make it oom_reaper specific. What about the softdirty > > > > case Minchan has mentioned earlier? > > > > > > We don't (yet) support that on arm64, so we're ok for now. If we do grow > > > support for it, then I agree that we want a flag to identify the case > > > where > > > the address space is going away and only elide the invalidation then. > > > > What do you think about the following patch instead? 
I have to confess > > I do not really understand the fullmm semantic so I might introduce some > > duplication by this flag. If you think this is a good idea, I will post > > it in a separate thread. > > > Please do! My only suggestion would be s/lazy/exit/, since I don't think the > optimisation works in any other situation than the address space going away > for good. Yes, the address space is going away. That's why I wanted to add an additional check that the address space is going away, without adding new flags. http://lkml.kernel.org/r/<20171113002833.GA18301@bbox> However, if you guys would rather add a new flag to distinguish the cases, I prefer "exit" to "lazy". It would also be better to add a WARN_ON to catch potential future misuse, such as by the OOM reaper. Anyway, I don't feel strongly about it, so it's up to you, Michal. WARN_ON_ONCE(exit == true && atomic_read(>mm_users) > 0);
Re: [PATCH v3 0/4] vm: add a syscall to map a process memory into a pipe
On Wed, Nov 22, 2017 at 09:43:31PM +0100, Michael Kerrisk (man-pages) wrote: > Hi Mike, > > On 22 November 2017 at 20:36, Mike Rapoport wrote: > > Hi, > > > > This patch set introduces a new process_vmsplice system call that combines > > the functionality of process_vm_read and vmsplice. > > > > It allows mapping the memory of another process into a pipe, similarly to > > what vmsplice does for its own address space. > > > > The patch 2/4 ("vm: add a syscall to map a process memory into a pipe") > > actually adds the new system call and provides its elaborate description. > > Where is the man page for this new syscall? It's still WIP, I'll send it out soon. > Cheers, > > Michael > > > The patchset is against -mm tree. > > > > v3: minor refactoring to reduce code duplication > > v2: move this syscall under CONFIG_CROSS_MEMORY_ATTACH > > give correct flags to get_user_pages_remote() > > > > Andrei Vagin (3): > > vm: add a syscall to map a process memory into a pipe > > x86: wire up the process_vmsplice syscall > > test: add a test for the process_vmsplice syscall > > > > Mike Rapoport (1): > > fs/splice: introduce pages_to_pipe helper > > > > arch/x86/entry/syscalls/syscall_32.tbl | 1 + > > arch/x86/entry/syscalls/syscall_64.tbl | 2 + > > fs/splice.c| 262 > > +++-- > > include/linux/compat.h | 3 + > > include/linux/syscalls.h | 4 + > > include/uapi/asm-generic/unistd.h | 5 +- > > kernel/sys_ni.c| 2 + > > tools/testing/selftests/process_vmsplice/Makefile | 5 + > > .../process_vmsplice/process_vmsplice_test.c | 188 +++ > > 9 files changed, 450 insertions(+), 22 deletions(-) > > create mode 100644 tools/testing/selftests/process_vmsplice/Makefile > > create mode 100644 > > tools/testing/selftests/process_vmsplice/process_vmsplice_test.c > > > > -- > > 2.7.4 > > > > -- > Michael Kerrisk > Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/ > Linux/UNIX System Programming Training: http://man7.org/training/ > -- Sincerely yours, Mike.
[PATCH] fat: Fix sb_rdonly() change
commit bc98a42c1f7d0f886c0c1b75a92a004976a46d9f introduced bug. Signed-off-by: OGAWA Hirofumi--- fs/fat/inode.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/fat/inode.c~fat-fix-sb_rdonly-fix fs/fat/inode.c --- linux/fs/fat/inode.c~fat-fix-sb_rdonly-fix 2017-11-23 15:03:47.371667601 +0900 +++ linux-hirofumi/fs/fat/inode.c 2017-11-23 15:04:07.654616440 +0900 @@ -779,7 +779,7 @@ static void __exit fat_destroy_inodecach static int fat_remount(struct super_block *sb, int *flags, char *data) { - int new_rdonly; + bool new_rdonly; struct msdos_sb_info *sbi = MSDOS_SB(sb); *flags |= MS_NODIRATIME | (sbi->options.isvfat ? 0 : MS_NOATIME); _ -- OGAWA Hirofumi
Re: [PATCH v6 36/37] tracing: Add inter-event blurb to HIST_TRIGGERS config option
On Fri, Nov 17, 2017 at 02:33:15PM -0600, Tom Zanussi wrote: > So that users know that inter-event tracing is supported as part of > the HIST_TRIGGERS option, include text to that effect in the help > text. > > Signed-off-by: Tom Zanussi> --- > kernel/trace/Kconfig | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig > index b8395a0..b01536d 100644 > --- a/kernel/trace/Kconfig > +++ b/kernel/trace/Kconfig > @@ -596,6 +596,9 @@ config HIST_TRIGGERS > event activity as an initial guide for further investigation > using more advanced tools. > > + Inter-event tracing of quantities such as latencies is also > + supported using hist triggers under this option. > + > See Documentation/trace/events.txt. Unrelated, but the doc was renamed. :) Thanks, Namhyung > If in doubt, say N. > > -- > 1.9.3 >
Re: [PATCH v2 00/18] Entry stack switching
* Ingo Molnar wrote: > > * Ingo Molnar wrote: > > > > Anyway, I booted your config (more or less -- I munged it through > > > virtme-configkernel --update first) with 17 vCPUs and it seems fine. > > > Is the issue reliable enough to bisect? > > > > Ok, it should be bisectable, will try to bisect it. > > The latest entry-stack code appears to be working fine though. > > So one of the below fixes from yesterday appears to have done the trick. > > I'll re-test today to make sure: maybe it's more sporadic than I thought, in > one > of the bootups I got the do_IRQ warning only once, in half a day of uptime. I re-tested and it all seems fine now. I suspect it got fixed by: ca37e57bbe0c: x86/entry/64: Add missing irqflags tracing to native_load_gs_index() Still, it is weird, because I boot that system with latest -tip on a daily basis, and don't remember having seen that warning. Do you have any theory for why the entry stack changes would uncover this bug? Thanks, Ingo
[ANNOUNCE] 3.18.82-rt88
Dear RT Folks, I'm pleased to announce the 3.18.82-rt88 stable release. This release is just an update to the new stable 3.18.82 version and no RT specific changes have been made. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v3.18-rt Head SHA1: c3c324a4e15b687b3cfbc2d28ca0aa37bb747448 Or to build 3.18.82-rt88 directly, the following patches should be applied: http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.18.tar.xz http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.18.82.xz http://www.kernel.org/pub/linux/kernel/projects/rt/3.18/patch-3.18.82-rt88.patch.xz Enjoy, -- Steve
[PATCH] arm64: dts: Hi3660: Fix state id for 'CPU_NAP' state
Thanks a lot to Vincent Guittot for his careful work in finding the bug in the 'CPU_NAP' idle state. From the ftrace log we can observe that CA73 CPUs are easily woken up from the 'CPU_NAP' state, but the woken CPUs don't handle anything and go to sleep again; so there are tons of trace events for CA73 CPUs entering and exiting the idle state. On Hi3660, CA73 has the retention state 'CPU_NAP' for CPU idle. We set its psci parameter to '0x001', and from this parameter the state id is calculated as 1. Unfortunately ARM trusted firmware (ARM-TF) treats 1 as an invalid value for the state id, so the CPU cannot enter the idle state and bails out directly to the kernel. This commit changes the psci parameter to '0x' for state id = 0; this id is accepted by ARM trusted firmware and the CPU can finally stay properly in the 'CPU_NAP' state. Cc: Vincent Guittot Cc: Daniel Lezcano Signed-off-by: Leo Yan --- arch/arm64/boot/dts/hisilicon/hi3660.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi index ab0b95b..5666d29 100644 --- a/arch/arm64/boot/dts/hisilicon/hi3660.dtsi +++ b/arch/arm64/boot/dts/hisilicon/hi3660.dtsi @@ -147,7 +147,7 @@ CPU_NAP: cpu-nap { compatible = "arm,idle-state"; - arm,psci-suspend-param = <0x001>; + arm,psci-suspend-param = <0x000>; entry-latency-us = <7>; exit-latency-us = <2>; min-residency-us = <15>; -- 2.7.4
Re: [PATCH v6 31/37] tracing: Add 'last error' error facility for hist triggers
On Fri, Nov 17, 2017 at 02:33:10PM -0600, Tom Zanussi wrote: > With the addition of variables and actions, it's become necessary to > provide more detailed error information to users about syntax errors. > > Add a 'last error' facility accessible via the erroring event's 'hist' > file. Reading the hist file after an error will display more detailed > information about what went wrong, if information is available. This > extended error information will be available until the next hist > trigger command for that event. > > # echo xxx > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger > echo: write error: Invalid argument > > # cat /sys/kernel/debug/tracing/events/sched/sched_wakeup/hist > > ERROR: Couldn't yyy: zzz > Last command: xxx > > Also add specific error messages for variable and action errors. > > Signed-off-by: Tom Zanussi> --- > @@ -2271,9 +2333,18 @@ static struct hist_field *create_var_ref(struct > hist_field *var_field, > return ref_field; > } > > +static bool is_common_field(char *var_name) > +{ > + if (strncmp(var_name, "$common_timestamp", strlen("$common_timestamp")) > == 0) > + return true; > + > + return false; > +} > + > static bool is_var_ref(char *var_name) > { > - if (!var_name || strlen(var_name) < 2 || var_name[0] != '$') > + if (!var_name || strlen(var_name) < 2 || var_name[0] != '$' || > + is_common_field(var_name)) Looks like it's not a part of this change. Thanks, Namhyung > return false; > > return true;
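The prefix test quoted in the hunk above is easy to try in userspace. The sketch below copies the shape of the two helpers under `demo_` names (hypothetical; this is not the tracing code itself) so the effect of the `strncmp()` prefix match can be observed: anything that merely starts with `$common_timestamp`, such as `$common_timestamp.usecs`, is treated as a common field rather than a variable reference.

```c
#include <stdbool.h>
#include <string.h>

/* Prefix match: true for "$common_timestamp" and anything starting with it. */
static inline bool demo_is_common_field(const char *var_name)
{
	static const char prefix[] = "$common_timestamp";

	return strncmp(var_name, prefix, strlen(prefix)) == 0;
}

/* A variable reference is "$name" with a non-empty name that is not a common field. */
static inline bool demo_is_var_ref(const char *var_name)
{
	if (!var_name || strlen(var_name) < 2 || var_name[0] != '$' ||
	    demo_is_common_field(var_name))
		return false;

	return true;
}
```

One consequence of matching on the prefix only is that a user variable whose name happens to begin with `common_timestamp` would also be rejected as a variable reference; whether that matters for the hist trigger syntax is not something this sketch can answer.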
Re: [PATCH 3/3] autofs - fix AT_NO_AUTOMOUNT not being honored
On 23/11/17 12:49, NeilBrown wrote: > On Thu, Nov 23 2017, Ian Kent wrote: > >> On 23/11/17 10:21, NeilBrown wrote: >>> On Thu, Nov 23 2017, Ian Kent wrote: >>> Hey Neil, I'm looking at this again because RH QE have complained about a regression test failing with a kernel that has this change. Maybe I'm just dumb but I thought a "find " would, well, just look at the contents below but an strace shows that it reads and calls fstatat() on "every entry in the mount table" regardless of the path. >>> >>> weird ... I can only get find to look at the mount table if given the >>> -fstyp option, and even then it doesn't fstatat anything that isn't in >>> the tree it is searching. >> >> It's probably the -xautofs (exclude autofs fs'es) that was used in >> the test that requires reading the mount table to get info about >> excluding autofs mounts but the fstatat() on all the entries, >> regardless of path, that was a surprise to me. >> >> find did use AT_SYMLINK_NOFOLLOW which historically behaved like >> AT_NO_AUTOMOUNT. >> >>> >>> And with the move of userspace to use /proc based mount tables (one example being the symlink of /etc/mtab into /proc) even modest sized direct mount maps will be a problem with every entry getting mounted. >>> >>> But the patch in question is only about indirect mount maps, isn't it? >>> How is it relevant to direct mount maps? >> >> The change here will cause fstatat() to trigger direct mounts on access >> if it doesn't use AT_NO_AUTOMOUNT. > > Ahhh... light dawns. > This is about this bit of the patch: > > static inline int vfs_fstatat(int dfd, const char __user *filename, > struct kstat *stat, int flags) > { > - return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT, > -stat, STATX_BASIC_STATS); > + return vfs_statx(dfd, filename, flags, stat, STATX_BASIC_STATS); > } > > I hadn't paid much attention to that. 
> > I before this patch: > stat and lstat act as you would expect AT_NO_AUTOMOUNT to act on > direct mount and browseable indirect mount, but not on unbrowseable > indirect mounts Yep, because of the fall thru for negative dentries at: if (!(nd->flags & (LOOKUP_PARENT | LOOKUP_DIRECTORY | LOOKUP_OPEN | LOOKUP_CREATE | LOOKUP_AUTOMOUNT)) && path->dentry->d_inode) return -EISDIR; which is missing a LOOKUP_FOLLOW check if we wanted to catch AT_SYMLINK_NOFOLLOW. > fstatat appeared to accept the AT_NO_AUTOMOUNT flag, but actually > assumed it was always set, but acted like stat and lstat Yep, always set AT_NO_AUTOMOUNT so that it behaved the same as > xstatat actually accepted the AT_NO_AUTOMOUNT flag, but it had no > effect on unbrowseable indirect mounts. Yep. > > after the patch, the distinction between direct and indirect was gone, > and fstatat now handles AT_NO_AUTOMOUNT the same as xstatat. > So: > stat and lstat now don't trigger automounts even on indirect, but > this is a mixed blessing as they don't even trigger the mkdir Yep. > fstatat without AT_NO_AUTOMOUNT now always triggers an automount > This is a problematic regression that you have noticed and > likely needs to be reverted. Maybe we can assume > AT_NO_AUTOMOUNT when AT_SYMLINK_NOFOLLOW is set, and require > people to use xstatat if they need to set the flags separately Yep. The introduction of AT_NO_AUTOMOUNT (and the introduction of follow_managed() and friends) was meant to do away with the misleading use of AT_SYMLINK_NOFOLLOW (at the time the automount mechanism abused the ->follow_link() method because it had similar semantics to symlinks). To catch the older usage pattern re-adding an AT_SYMLINK_NOFOLLOW check would be helpful. The reality is there are still the same old problems of unintended mounting (mount storms) and AT_NO_AUTOMOUNT not being properly handled. 
Certainly the implementation we have now is much better but these niggling problems remain and user space steps on them way too often, and it feels like it's much more so lately. > xstatat now correctly honours AT_NO_AUTOMOUNT for indirect mounts > but is otherwise unchanged. Yep, assuming we accept the ENOENT return as sensible for the AT_NO_AUTOMOUNT no browse indirect mount case. statx() being a new system call it would be ideal to get the semantics of this call right now before it becomes well used. > > What would you think of changing the above to > > static inline int vfs_fstatat(int dfd, const char __user *filename, > struct kstat *stat, int flags) > { > - return vfs_statx(dfd, filename, flags | AT_NO_AUTOMOUNT, > -stat, STATX_BASIC_STATS); > + return vfs_statx(dfd, filename, > +(flags & AT_SYMLINK_NOFOLLOW) ? (flags | > + AT_NO_AUTOMOUNT) :
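The flag handling proposed in the quoted snippet can be sketched in isolation as below. This is only an illustration: the helper name fstatat_effective_flags() is made up here, and because the quote is cut off after the ':', the else-branch is an assumption (flags passed through unchanged). The fallback values for the two constants match the Linux uapi definitions.

```c
/* Fallback definitions for builds without the Linux uapi headers. */
#ifndef AT_SYMLINK_NOFOLLOW
#define AT_SYMLINK_NOFOLLOW 0x100
#endif
#ifndef AT_NO_AUTOMOUNT
#define AT_NO_AUTOMOUNT 0x800
#endif

/*
 * Hypothetical helper: compute the flags that would be handed on to
 * vfs_statx() under the proposal. AT_SYMLINK_NOFOLLOW implies
 * AT_NO_AUTOMOUNT; otherwise the caller's flags are used as given
 * (the latter is an assumption, the quoted hunk is truncated there).
 */
static int fstatat_effective_flags(int flags)
{
	return (flags & AT_SYMLINK_NOFOLLOW) ? (flags | AT_NO_AUTOMOUNT) : flags;
}
```

Under this reading an lstat()-style call never triggers an automount, while a plain fstatat() call without either flag still can.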
Re: usb/media/em28xx: use-after-free in dvb_unregister_frontend
Am 21.11.2017 um 14:51 schrieb Andrey Konovalov: > Hi! > Hi Andrey, > I've got the following report while fuzzing the kernel with syzkaller. > > On commit e1d1ea549b57790a3d8cf6300e6ef86118d692a3 (4.15-rc1). > > em28xx 1-1:9.0: Disconnecting > tc90522 1-0015: Toshiba TC90522 attached. > qm1d1c0042 2-0061: Sharp QM1D1C0042 attached. > dvbdev: DVB: registering new adapter (1-1:9.0) > em28xx 1-1:9.0: DVB: registering adapter 0 frontend 0 (Toshiba TC90522 > ISDB-S module)... > dvbdev: dvb_create_media_entity: media entity 'Toshiba TC90522 ISDB-S > module' registered. > dvbdev: dvb_create_media_entity: media entity 'dvb-demux' registered. > em28xx 1-1:9.0: DVB extension successfully initialized > em28xx 1-1:9.0: Remote control support is not available for this card. > em28xx 1-1:9.0: Closing DVB extension > == > BUG: KASAN: use-after-free in dvb_unregister_frontend+0x8f/0xa0 > Read of size 8 at addr 880067853628 by task kworker/0:3/3182 > > CPU: 0 PID: 3182 Comm: kworker/0:3 Not tainted 4.14.0-57501-g9284d204d604 #119 > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 > Workqueue: usb_hub_wq hub_event > Call Trace: > __dump_stack lib/dump_stack.c:17 > dump_stack+0xe1/0x157 lib/dump_stack.c:53 > print_address_description+0x71/0x234 mm/kasan/report.c:252 > kasan_report_error mm/kasan/report.c:351 > kasan_report+0x173/0x270 mm/kasan/report.c:409 > __asan_report_load8_noabort+0x19/0x20 mm/kasan/report.c:430 > dvb_unregister_frontend+0x8f/0xa0 drivers/media/dvb-core/dvb_frontend.c:2768 > em28xx_unregister_dvb drivers/media/usb/em28xx/em28xx-dvb.c:1122 > em28xx_dvb_fini+0x62d/0x8e0 drivers/media/usb/em28xx/em28xx-dvb.c:2129 > em28xx_close_extension+0x71/0x220 drivers/media/usb/em28xx/em28xx-core.c:1122 > em28xx_usb_disconnect+0xd7/0x130 drivers/media/usb/em28xx/em28xx-cards.c:3763 > usb_unbind_interface+0x1b6/0x950 drivers/usb/core/driver.c:423 > __device_release_driver drivers/base/dd.c:870 > device_release_driver_internal+0x563/0x630 
drivers/base/dd.c:903 > device_release_driver+0x1e/0x30 drivers/base/dd.c:928 > bus_remove_device+0x2fc/0x4b0 drivers/base/bus.c:565 > device_del+0x39f/0xa70 drivers/base/core.c:1984 > usb_disable_device+0x223/0x710 drivers/usb/core/message.c:1205 > usb_disconnect+0x285/0x7f0 drivers/usb/core/hub.c:2205 > hub_port_connect drivers/usb/core/hub.c:4851 > hub_port_connect_change drivers/usb/core/hub.c:5106 > port_event drivers/usb/core/hub.c:5212 > hub_event_impl+0x10f0/0x3440 drivers/usb/core/hub.c:5324 > hub_event+0x38/0x50 drivers/usb/core/hub.c:5222 > process_one_work+0x944/0x15f0 kernel/workqueue.c:2112 > worker_thread+0xef/0x10d0 kernel/workqueue.c:2246 > kthread+0x367/0x420 kernel/kthread.c:238 > ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:437 > this looks similar to the oops fixed by this patch: https://patchwork.linuxtv.org/patch/45219/ Could you try if it fixes your case also? Regards Matthias
Re: [PATCH 00/23] [v4] KAISER: unmap most of the kernel from userspace page tables
* Dave Hansen wrote: > Thanks, everyone for all the reviews thus far. I hope I managed to > address all the feedback given so far, except for the TODOs of > course. This is a pretty minor update compared to v1->v2. > > These patches are all on this tip branch: > > > https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/log/?h=WIP.x86/mm Note that on top of latest -tip the bzImage build fails with: arch/x86/boot/compressed/pagetable.o: In function `kernel_ident_mapping_init': pagetable.c:(.text+0x31b): undefined reference to `kaiser_enabled' arch/x86/boot/compressed/Makefile:109: recipe for target 'arch/x86/boot/compressed/vmlinux' failed that's I think because the early boot code shares some code via kernel_ident_mapping_init() et al, and that code grew a new KAISER runtime variable which isn't present in the special early-boot environment. I.e. something like the (totally untested) patch below should do the trick. Thanks, Ingo --- arch/x86/boot/compressed/pagetable.c |6 ++ 1 file changed, 6 insertions(+) Index: tip/arch/x86/boot/compressed/pagetable.c === --- tip.orig/arch/x86/boot/compressed/pagetable.c +++ tip/arch/x86/boot/compressed/pagetable.c @@ -36,6 +36,12 @@ /* Used by pgtable.h asm code to force instruction serialization. */ unsigned long __force_order; +/* + * We share the kernel_ident_mapping_init(), but the early boot version does not need + * the Kaiser-logic: + */ +int kaiser_enabled = 0; + /* Used to track our page table allocation area. */ struct alloc_pgt_data { unsigned char *pgt_buf;
enum conversion warnings
pulling down tot, I'm seeing: CC [M] drivers/gpu/drm/i915/intel_ddi.o drivers/gpu/drm/i915/intel_ddi.c:1481:30: error: implicit conversion from enumeration type 'enum port' to different enumeration type 'enum intel_dpll_id' [-Werror,-Wenum-conversion] enum intel_dpll_id pll_id = port; ~~ ^~~~ seems to be coming from commit 2952cd6fb4cc9 "drm/i915: Let's use more enum intel_dpll_id pll_id." That commit seems to be using enums instead of uints. I think maybe the final 2 hunks of that patch should be reverted?
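For anyone unfamiliar with the diagnostic, here is a minimal stand-alone sketch of what clang's -Wenum-conversion objects to. The two enums are cut-down, hypothetical stand-ins for the i915 types (only the type names are taken from the error message), and the explicit cast is just one generic way to silence the warning; whether a cast, a revert, or a real mapping is right for the driver is a separate question.

```c
/* Hypothetical miniature stand-ins for the two i915 enum types. */
enum port { PORT_A, PORT_B, PORT_C };
enum intel_dpll_id { DPLL_ID_0, DPLL_ID_1, DPLL_ID_2 };

/*
 * Writing "enum intel_dpll_id pll_id = port;" converts implicitly
 * between two different enumeration types, which clang reports under
 * -Wenum-conversion (an error with -Werror). An explicit cast states
 * the intent and keeps the numeric value unchanged.
 */
static enum intel_dpll_id port_to_dpll_id(enum port port)
{
	return (enum intel_dpll_id)port;
}
```

The cast only documents that the numeric value is reused on purpose; it does not check that every port value is a valid PLL id.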
Re: [PATCH v7 02/13] dt-bindings: Add SLIMbus bindings
On Wed, Nov 15, 2017 at 11:15:00PM -0600, Rob Herring wrote: > On Wed, Nov 15, 2017 at 02:10:32PM +, srinivas.kandaga...@linaro.org > wrote: > > From: Sagar Dharia > > > > SLIMbus (Serial Low Power Interchip Media Bus) is a specification > > developed by MIPI (Mobile Industry Processor Interface) alliance. > > SLIMbus is a 2-wire implementation, which is used to communicate with > > peripheral components like audio-codec. > > > > This patch adds device tree bindings for the slimbus. > > > > Signed-off-by: Sagar Dharia > > Signed-off-by: Srinivas Kandagatla > > --- > > Documentation/devicetree/bindings/slimbus/bus.txt | 50 > > +++ > > 1 file changed, 50 insertions(+) > > create mode 100644 Documentation/devicetree/bindings/slimbus/bus.txt > > Reviewed-by: Rob Herring I still have some reservations about putting the MID and PID in the compatible string, are we sure this is what we want to do? As has been discussed previously, the discoverability of SLIMbus is really theoretical since you really always need to bind a driver to power on the device, since power is not part of the bus itself. Many devices (ours included) will support SLIMbus and other interfaces, this means we will need a different compatible string between SLIMbus and I2C/SPI, which feels a little icky. Additionally it does make the compatible strings really unreadable, which is a little annoying when looking at device trees as you can't easily see what things are. Thanks, Charles
Re: [PATCH 0/4] staging: lustre: fixed some signedness warns from sparse
On Wed, Nov 22, 2017 at 08:38:27PM +0100, Stefano Manni wrote: > Fixed some signedness warnings from sparse on lustre. > > Stefano Manni (4): > staging: lustre: fixed signedness of some socklnd params > staging: lustre: fixed signedness of llite > staging: lustre: fixed signedness of lov > staging: lustre: fixed signedness of obdclass You may like to use imperative mood for your git log brief descriptions Stefano. s/fixed/fix/ For justification see Documentation/process/submitting-patches.rst. Specifically section 2 of that document. Hope this helps, Tobin. > drivers/staging/lustre/lnet/klnds/socklnd/socklnd.h| 4 ++-- > .../staging/lustre/lnet/klnds/socklnd/socklnd_modparams.c | 2 +- > drivers/staging/lustre/lustre/llite/dir.c | 3 ++- > drivers/staging/lustre/lustre/llite/llite_lib.c| 9 ++--- > drivers/staging/lustre/lustre/llite/lproc_llite.c | 14 > ++ > drivers/staging/lustre/lustre/lov/lov_obd.c| 2 +- > drivers/staging/lustre/lustre/lov/lov_offset.c | 11 +++ > drivers/staging/lustre/lustre/obdclass/obd_mount.c | 2 +- > 8 files changed, 26 insertions(+), 21 deletions(-) > > -- > 2.5.5 > > ___ > devel mailing list > de...@linuxdriverproject.org > http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel
[PATCH RT 3/7] tpm_tis: fix stall after iowrite*()s
4.9.61-rt52-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Haris Okanovic ioread8() operations to TPM MMIO addresses can stall the cpu when immediately following a sequence of iowrite*()'s to the same region. For example, cyclictest measures ~400us latency spikes when a non-RT usermode application communicates with an SPI-based TPM chip (Intel Atom E3940 system, PREEMPT_RT_FULL kernel). The spikes are caused by a stalling ioread8() operation following a sequence of 30+ iowrite8()s to the same address. I believe this happens because the write sequence is buffered (in cpu or somewhere along the bus), and gets flushed on the first LOAD instruction (ioread*()) that follows. The enclosed change appears to fix this issue: read the TPM chip's access register (status code) after every iowrite*() operation to amortize the cost of flushing data to chip across multiple instructions. Signed-off-by: Haris Okanovic Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- drivers/char/tpm/tpm_tis.c | 29 +++-- 1 file changed, 27 insertions(+), 2 deletions(-) diff --git a/drivers/char/tpm/tpm_tis.c b/drivers/char/tpm/tpm_tis.c index 8022bea27fed..247330efd310 100644 --- a/drivers/char/tpm/tpm_tis.c +++ b/drivers/char/tpm/tpm_tis.c @@ -50,6 +50,31 @@ static inline struct tpm_tis_tcg_phy *to_tpm_tis_tcg_phy(struct tpm_tis_data *da return container_of(data, struct tpm_tis_tcg_phy, priv); } +#ifdef CONFIG_PREEMPT_RT_FULL +/* + * Flushes previous write operations to chip so that a subsequent + * ioread*()s won't stall a cpu. 
+ */ +static inline void tpm_tis_flush(void __iomem *iobase) +{ + ioread8(iobase + TPM_ACCESS(0)); +} +#else +#define tpm_tis_flush(iobase) do { } while (0) +#endif + +static inline void tpm_tis_iowrite8(u8 b, void __iomem *iobase, u32 addr) +{ + iowrite8(b, iobase + addr); + tpm_tis_flush(iobase); +} + +static inline void tpm_tis_iowrite32(u32 b, void __iomem *iobase, u32 addr) +{ + iowrite32(b, iobase + addr); + tpm_tis_flush(iobase); +} + static bool interrupts = true; module_param(interrupts, bool, 0444); MODULE_PARM_DESC(interrupts, "Enable interrupts"); @@ -103,7 +128,7 @@ static int tpm_tcg_write_bytes(struct tpm_tis_data *data, u32 addr, u16 len, struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data); while (len--) - iowrite8(*value++, phy->iobase + addr); + tpm_tis_iowrite8(*value++, phy->iobase, addr); return 0; } @@ -127,7 +152,7 @@ static int tpm_tcg_write32(struct tpm_tis_data *data, u32 addr, u32 value) { struct tpm_tis_tcg_phy *phy = to_tpm_tis_tcg_phy(data); - iowrite32(value, phy->iobase + addr); + tpm_tis_iowrite32(value, phy->iobase, addr); return 0; } -- 2.13.2
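The write-then-read-back pattern in this patch can be illustrated in user space with a small simulation. This is only a sketch under stated assumptions: struct fake_bus, the fake_*() helpers and FAKE_ACCESS_REG are all made up for illustration, a plain array stands in for the MMIO window, and nothing here reproduces the actual bus buffering. It merely shows that every write is followed by one read of the access register, so the flush cost is paid per write rather than all at once on a later read.

```c
#include <stdint.h>

/* Hypothetical offset standing in for TPM_ACCESS(0). */
#define FAKE_ACCESS_REG 0

/* Simulated register window that counts accesses. */
struct fake_bus {
	volatile uint8_t regs[16];
	unsigned int reads;
	unsigned int writes;
};

static uint8_t fake_read8(struct fake_bus *bus, uint32_t addr)
{
	bus->reads++;
	return bus->regs[addr];
}

static void fake_write8(struct fake_bus *bus, uint8_t b, uint32_t addr)
{
	bus->writes++;
	bus->regs[addr] = b;
}

/* Analogue of tpm_tis_flush(): a dummy read of the access register. */
static void fake_flush(struct fake_bus *bus)
{
	(void)fake_read8(bus, FAKE_ACCESS_REG);
}

/* Analogue of tpm_tis_iowrite8(): write, then flush immediately. */
static void fake_write8_flushed(struct fake_bus *bus, uint8_t b, uint32_t addr)
{
	fake_write8(bus, b, addr);
	fake_flush(bus);
}

/* Analogue of tpm_tcg_write_bytes(): all bytes go to the same address. */
static void fake_write_bytes(struct fake_bus *bus, uint32_t addr,
			     uint16_t len, const uint8_t *value)
{
	while (len--)
		fake_write8_flushed(bus, *value++, addr);
}
```

Writing three bytes through fake_write_bytes() therefore produces three writes and three flushing reads, which is the one-read-per-write behaviour the commit message describes.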
Re: [PATCH v6 29/37] tracing: Add cpu field for hist triggers
On Fri, Nov 17, 2017 at 02:33:08PM -0600, Tom Zanussi wrote: > A common key to use in a histogram is the cpuid - add a new cpu > 'synthetic' field for that purpose. This field is named cpu rather > than $cpu or $common_cpu because 'cpu' already exists as a special > filter field and it makes more sense to match that rather than add > another name for the same thing. > > Signed-off-by: Tom Zanussi > --- > Documentation/trace/histogram.txt | 17 + > kernel/trace/trace_events_hist.c | 28 +++- > 2 files changed, 44 insertions(+), 1 deletion(-) > > diff --git a/Documentation/trace/histogram.txt > b/Documentation/trace/histogram.txt > index d1d92ed..cd3ec00 100644 > --- a/Documentation/trace/histogram.txt > +++ b/Documentation/trace/histogram.txt > @@ -172,6 +172,23 @@ >The examples below provide a more concrete illustration of the >concepts and typical usage patterns discussed above. > > + 'special' event fields > + > + > + There are a number of 'special event fields' available for use as > + keys or values in a hist trigger. These look like and behave as if > + they were actual event fields, but aren't really part of the event's > + field definition or format file. They are however available for any > + event, and can be used anywhere an actual event field could be. > + 'Special' field names are always prefixed with a '$' character to > + indicate that they're not normal fields (with the exception of > + 'cpu', for compatibility with existing filter usage): But it also could make a confusion to variables. How about removing '$' character at all? > + > +$common_timestamp u64 - timestamp (from ring buffer) associated > + with the event, in nanoseconds. May be > + modified by .usecs to have timestamps > + interpreted as microseconds. > +cpu int - the cpu on which the event occurred. 
> > 6.2 'hist' trigger examples > --- > diff --git a/kernel/trace/trace_events_hist.c > b/kernel/trace/trace_events_hist.c > index 121f7ef..afbfa9c 100644 > --- a/kernel/trace/trace_events_hist.c > +++ b/kernel/trace/trace_events_hist.c > @@ -227,6 +227,7 @@ enum hist_field_flags { > HIST_FIELD_FL_VAR = 1 << 12, > HIST_FIELD_FL_EXPR = 1 << 13, > HIST_FIELD_FL_VAR_REF = 1 << 14, > + HIST_FIELD_FL_CPU = 1 << 15, > }; > > struct var_defs { > @@ -1177,6 +1178,16 @@ static u64 hist_field_timestamp(struct hist_field > *hist_field, > return ts; > } > > +static u64 hist_field_cpu(struct hist_field *hist_field, > + struct tracing_map_elt *elt, > + struct ring_buffer_event *rbe, > + void *event) > +{ > + int cpu = smp_processor_id(); > + > + return cpu; > +} > + > static struct hist_field * > check_field_for_var_ref(struct hist_field *hist_field, > struct hist_trigger_data *var_data, > @@ -1622,6 +1633,8 @@ static const char *hist_field_name(struct hist_field > *field, > field_name = hist_field_name(field->operands[0], ++level); > else if (field->flags & HIST_FIELD_FL_TIMESTAMP) > field_name = "$common_timestamp"; > + else if (field->flags & HIST_FIELD_FL_CPU) > + field_name = "cpu"; > else if (field->flags & HIST_FIELD_FL_EXPR || >field->flags & HIST_FIELD_FL_VAR_REF) { > if (field->system) { > @@ -2125,6 +2138,15 @@ static struct hist_field *create_hist_field(struct > hist_trigger_data *hist_data, > goto out; > } > > + if (flags & HIST_FIELD_FL_CPU) { > + hist_field->fn = hist_field_cpu; > + hist_field->size = sizeof(int); > + hist_field->type = kstrdup("int", GFP_KERNEL); Is it unsigned? 
Thanks, Namhyung > + if (!hist_field->type) > + goto free; > + goto out; > + } > + > if (WARN_ON_ONCE(!field)) > goto out; > > @@ -2343,7 +2365,9 @@ static struct hist_field *parse_var_ref(struct > hist_trigger_data *hist_data, > hist_data->enable_timestamps = true; > if (*flags & HIST_FIELD_FL_TIMESTAMP_USECS) > hist_data->attrs->ts_in_usecs = true; > - } else { > + } else if (strcmp(field_name, "cpu") == 0) > + *flags |= HIST_FIELD_FL_CPU; > + else { > field = trace_find_event_field(file->event_call, field_name); > if (!field || !field->size) { > field = ERR_PTR(-EINVAL); > @@ -4612,6 +4636,8 @@ static void hist_field_print(struct seq_file *m, struct > hist_field *hist_field) > > if (hist_field->flags & HIST_FIELD_FL_TIMESTAMP) >
[PATCH RT 1/7] drivers/zram: fix zcomp_stream_get() smp_processor_id() use in preemptible code
4.9.61-rt52-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Mike Galbraith Use get_local_ptr() instead of this_cpu_ptr() to avoid a warning regarding smp_processor_id() in preemptible code. raw_cpu_ptr() would be fine, too, because the per-CPU data structure is protected with a spin lock so it does not matter much if we take the other one. Cc: stable...@vger.kernel.org Signed-off-by: Mike Galbraith Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- drivers/block/zram/zcomp.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c index fa8329ad79fd..8c93ee150ee8 100644 --- a/drivers/block/zram/zcomp.c +++ b/drivers/block/zram/zcomp.c @@ -120,7 +120,7 @@ struct zcomp_strm *zcomp_stream_get(struct zcomp *comp) { struct zcomp_strm *zstrm; - zstrm = *this_cpu_ptr(comp->stream); + zstrm = *get_local_ptr(comp->stream); spin_lock(&zstrm->zcomp_lock); return zstrm; } @@ -131,6 +131,7 @@ void zcomp_stream_put(struct zcomp *comp) zstrm = *this_cpu_ptr(comp->stream); spin_unlock(&zstrm->zcomp_lock); + put_local_ptr(zstrm); } int zcomp_compress(struct zcomp_strm *zstrm, -- 2.13.2
Re: [PATCH 1/8] drm/mediatek: Use regmap for register access
Hi, Matthias: On Tue, 2017-11-14 at 22:41 +0100, Matthias Brugger wrote: > The mmsys memory space is shared between the drm and the > clk driver. Use regmap to access it. > > Signed-off-by: Matthias Brugger Acked-by: CK Hu > --- > drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 4 ++-- > drivers/gpu/drm/mediatek/mtk_drm_ddp.c | 30 +- > drivers/gpu/drm/mediatek/mtk_drm_ddp.h | 4 ++-- > drivers/gpu/drm/mediatek/mtk_drm_drv.c | 13 - > drivers/gpu/drm/mediatek/mtk_drm_drv.h | 2 +- > 5 files changed, 26 insertions(+), 27 deletions(-) > > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c > b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c > index 658b8dd45b83..4c65873b4867 100644 > --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c > +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c > @@ -33,7 +33,7 @@ > * @enabled: records whether crtc_enable succeeded > * @planes: array of 4 drm_plane structures, one for each overlay plane > * @pending_planes: whether any plane has pending changes to be applied > - * @config_regs: memory mapped mmsys configuration register space > + * @config_regs: regmap mapped mmsys configuration register space > * @mutex: handle to one of the ten disp_mutex streams > * @ddp_comp_nr: number of components in ddp_comp > * @ddp_comp: array of pointers the mtk_ddp_comp structures used by this crtc > @@ -48,7 +48,7 @@ struct mtk_drm_crtc { > struct drm_plane planes[OVL_LAYER_NR]; > bool pending_planes; > > - void __iomem *config_regs; > + struct regmap *config_regs; > struct mtk_disp_mutex *mutex; > unsigned int ddp_comp_nr; > struct mtk_ddp_comp **ddp_comp; > diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c > b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c > index 8130f3dab661..1227d6db07da 100644 > --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp.c > +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp.c > @@ -185,16 +185,16 @@ static unsigned int mtk_ddp_sel_in(enum mtk_ddp_comp_id > cur, > return value; > } > > -static void mtk_ddp_sout_sel(void __iomem *config_regs, > +static void 
mtk_ddp_sout_sel(struct regmap *config_regs, >enum mtk_ddp_comp_id cur, >enum mtk_ddp_comp_id next) > { > if (cur == DDP_COMPONENT_BLS && next == DDP_COMPONENT_DSI0) > - writel_relaxed(BLS_TO_DSI_RDMA1_TO_DPI1, > -config_regs + DISP_REG_CONFIG_OUT_SEL); > + regmap_write(config_regs, DISP_REG_CONFIG_OUT_SEL, > + BLS_TO_DSI_RDMA1_TO_DPI1); > } > > -void mtk_ddp_add_comp_to_path(void __iomem *config_regs, > +void mtk_ddp_add_comp_to_path(struct regmap *config_regs, > enum mtk_ddp_comp_id cur, > enum mtk_ddp_comp_id next) > { > @@ -202,20 +202,22 @@ void mtk_ddp_add_comp_to_path(void __iomem *config_regs, > > value = mtk_ddp_mout_en(cur, next, &addr); > if (value) { > - reg = readl_relaxed(config_regs + addr) | value; > - writel_relaxed(reg, config_regs + addr); > + regmap_read(config_regs, addr, &reg); > + reg |= value; > + regmap_write(config_regs, addr, reg); > } > > mtk_ddp_sout_sel(config_regs, cur, next); > > value = mtk_ddp_sel_in(cur, next, &addr); > if (value) { > - reg = readl_relaxed(config_regs + addr) | value; > - writel_relaxed(reg, config_regs + addr); > + regmap_read(config_regs, addr, &reg); > + reg |= value; > + regmap_write(config_regs, addr, reg); > } > } > > -void mtk_ddp_remove_comp_from_path(void __iomem *config_regs, > +void mtk_ddp_remove_comp_from_path(struct regmap *config_regs, > enum mtk_ddp_comp_id cur, > enum mtk_ddp_comp_id next) > { > @@ -223,14 +225,16 @@ void mtk_ddp_remove_comp_from_path(void __iomem > *config_regs, > > value = mtk_ddp_mout_en(cur, next, &addr); > if (value) { > - reg = readl_relaxed(config_regs + addr) & ~value; > - writel_relaxed(reg, config_regs + addr); > + regmap_read(config_regs, addr, &reg); > + reg &= ~value; > + regmap_write(config_regs, addr, reg); > } > > value = mtk_ddp_sel_in(cur, next, &addr); > if (value) { > - reg = readl_relaxed(config_regs + addr) & ~value; > - writel_relaxed(reg, config_regs + addr); > + regmap_read(config_regs, addr, &reg); > + reg &= ~value; > + regmap_write(config_regs, 
addr, reg); > } > } > > diff --git 
a/drivers/gpu/drm/mediatek/mtk_drm_ddp.h > b/drivers/gpu/drm/mediatek/mtk_drm_ddp.h > index f9a799168077..32e12f33b76a 100644 >
[ANNOUNCE] 3.10.108-rt123
Dear RT Folks, I'm pleased to announce the 3.10.108-rt123 stable release. This release is just an update to the new stable 3.10.108 version and no RT specific changes have been made. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v3.10-rt Head SHA1: 5709dbd5bedd92382c8a63f2cdcff76b9e220b75 Or to build 3.10.108-rt123 directly, the following patches should be applied: http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.10.tar.xz http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.10.108.xz http://www.kernel.org/pub/linux/kernel/projects/rt/3.10/patch-3.10.108-rt123.patch.xz Enjoy, -- Steve
Re: [PATCH v2 00/18] Entry stack switching
* Andy Lutomirski wrote: > On Tue, Nov 21, 2017 at 10:22 PM, Ingo Molnar wrote: > > > > * Andy Lutomirski wrote: > > > >> This sets up stack switching, including for SYSCALL. I think it's > >> in decent shape. > >> > >> Known issues: > >> - I think we're going to want a way to turn the stack switching on and > >>off either at boot time or at runtime. It should be fairly > >> straightforward > >>to make it work. > >> > >> - I think the ORC unwinder isn't so good at dealing with stack overflows. > >>It bails too early (I think), resulting in lots of ? entries. This > >>isn't a regression with this series -- it's just something that could > >>be improved. > >> > >> Ingo, patch 1 may be tip/urgent material. It fixes what I think is > >> a bug in Xen. I'm having a hard time testing because it's being > >> masked by a bigger unrelated bug that's keeping Xen from booting > >> when configured to hit the bug I'm fixing. (The latter bug goes at > >> least back to v4.13, I think I know roughly what's wrong, and I've > >> reported it to the maintainers.) > > > > Hm, with this series the previous IRQ vector bug appears again: > > > > [ 51.156370] do_IRQ: 16.34 No irq handler for vector > > [ 57.511030] do_IRQ: 16.34 No irq handler for vector > > [ 57.528335] do_IRQ: 16.34 No irq handler for vector > > [ 57.533256] do_IRQ: 16.34 No irq handler for vector > > [ 63.991913] do_IRQ: 16.34 No irq handler for vector > > [ 63.996810] do_IRQ: 16.34 No irq handler for vector > > > > I've attached the reproducer config. Note that the system appears to be > > working to > > a certain extent (I could ssh to it and extract its config), but produces > > these > > warnings sporadically. > > I'll try to reproduce this, but this is weird. This is vector 34, > which is, or could be, a genuine IRQ vector. The only way I can think > of that my series would have caused this is if I very severely broke > common_interrupt, but I don't see how that could have happened without > breaking everything. 
It's also weird that you're seeing this only on > CPU 16. Maybe it's worth adding a WARN_ON to that warning to get a > stack trace just in case. > > Thomas, any insight here? > > > but don't get the IRQ vector warnings. > > Ingo, are you saying that you only get the IRQ vector warnings with > the SYSCALL hwframe fix applied? That's bizarre. Correct. I assume it's because lockdep is working fine with that fix applied, but that also means that different irq-tracing code paths are taken. The lockdep error disables lockdep globally and immediately. > Anyway, I booted your config (more or less -- I munged it through > virtme-configkernel --update first) with 17 vCPUs and it seems fine. > Is the issue reliable enough to bisect? Ok, it should be bisectable, will try to bisect it. I think it's a key aspect that the CPU is AMD - a similar config on Intel seems to be working fine (modulo the unwinder warning). Thanks, Ingo
Re: [PATCH v2 00/18] Entry stack switching
* Ingo Molnar wrote: > > Anyway, I booted your config (more or less -- I munged it through > > virtme-configkernel --update first) with 17 vCPUs and it seems fine. > > Is the issue reliable enough to bisect? > > Ok, it should be bisectable, will try to bisect it. The latest entry-stack code appears to be working fine though. So one of the below fixes from yesterday appears to have done the trick. I'll re-test today to make sure: maybe it's more sporadic than I thought, in one of the bootups I got the do_IRQ warning only once, in half a day of uptime. Thanks, Ingo => diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S index f1cef194dfba..3d404f8d0443 100644 --- a/arch/x86/entry/entry_64.S +++ b/arch/x86/entry/entry_64.S @@ -51,15 +51,19 @@ ENTRY(native_usergs_sysret64) END(native_usergs_sysret64) #endif /* CONFIG_PARAVIRT */ -.macro TRACE_IRQS_IRETQ +.macro TRACE_IRQS_FLAGS flags:req #ifdef CONFIG_TRACE_IRQFLAGS - bt $9, EFLAGS(%rsp) /* interrupts off? */ + bt $9, \flags /* interrupts off? */ jnc 1f TRACE_IRQS_ON 1: #endif .endm +.macro TRACE_IRQS_IRETQ + TRACE_IRQS_FLAGS EFLAGS(%rsp) +.endm + /* * When dynamic function tracer is enabled it will add a breakpoint * to all locations that it is about to modify, sync CPUs, update @@ -1069,11 +1073,13 @@ ENTRY(native_load_gs_index) FRAME_BEGIN pushfq DISABLE_INTERRUPTS(CLBR_ANY & ~CLBR_RDI) + TRACE_IRQS_OFF SWAPGS .Lgs_change: movl %edi, %gs 2: ALTERNATIVE "", "mfence", X86_BUG_SWAPGS_FENCE SWAPGS + TRACE_IRQS_FLAGS (%rsp) popfq FRAME_END ret diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h index 8562356213cd..15cf010225c9 100644 --- a/arch/x86/include/asm/fixmap.h +++ b/arch/x86/include/asm/fixmap.h @@ -47,10 +47,9 @@ extern unsigned long __FIXADDR_TOP; /* * cpu_entry_area is a percpu region in the fixmap that contains things * needed by the CPU and early entry/exit code. Real types aren't used - * for all fields here to about circular header dependencies. 
+ * for all fields here to avoid circular header dependencies. */ -struct cpu_entry_area -{ +struct cpu_entry_area { char gdt[PAGE_SIZE]; /* @@ -232,8 +231,7 @@ static inline unsigned int __get_cpu_entry_area_page_index(int cpu, int page) static inline struct cpu_entry_area *get_cpu_entry_area(int cpu) { - return (struct cpu_entry_area *) - __fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); + return (struct cpu_entry_area *)__fix_to_virt(__get_cpu_entry_area_page_index(cpu, 0)); } #endif /* !__ASSEMBLY__ */
[PATCH] fat: Fix sb_rdonly() change
Ouch, forgot to add stable@ -- commit bc98a42c1f7d0f886c0c1b75a92a004976a46d9f introduced a bug. Cc: Signed-off-by: OGAWA Hirofumi --- fs/fat/inode.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff -puN fs/fat/inode.c~fat-fix-sb_rdonly-fix fs/fat/inode.c --- linux/fs/fat/inode.c~fat-fix-sb_rdonly-fix 2017-11-23 15:03:47.371667601 +0900 +++ linux-hirofumi/fs/fat/inode.c 2017-11-23 15:04:07.654616440 +0900 @@ -779,7 +779,7 @@ static void __exit fat_destroy_inodecach static int fat_remount(struct super_block *sb, int *flags, char *data) { - int new_rdonly; + bool new_rdonly; struct msdos_sb_info *sbi = MSDOS_SB(sb); *flags |= MS_NODIRATIME | (sbi->options.isvfat ? 0 : MS_NOATIME); _ -- OGAWA Hirofumi
Re: [Outreachy kernel] Re: [PATCH] net: usb: hso.c: remove unneeded DRIVER_LICENSE #define
On Thu, 23 Nov 2017, Greg Kroah-Hartman wrote: > On Wed, Nov 22, 2017 at 10:20:49PM +0100, Julia Lawall wrote: > > > > > > On Wed, 22 Nov 2017, Joe Perches wrote: > > > > > On Fri, 2017-11-17 at 15:19 +0100, Greg Kroah-Hartman wrote: > > > > There is no need to #define the license of the driver, just put it in > > > > the MODULE_LICENSE() line directly as a text string. > > > > > > > > This allows tools that check that the module license matches the source > > > > code license to work properly, as there is no need to unwind the > > > > unneeded dereference. > > > [] > > > > diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c > > > [] > > > > @@ -76,7 +76,6 @@ > > > > > > > > #define MOD_AUTHOR "Option Wireless" > > > > #define MOD_DESCRIPTION"USB High Speed Option > > > > driver" > > > > -#define MOD_LICENSE"GPL" > > > > > > > > #define HSO_MAX_NET_DEVICES10 > > > > #define HSO__MAX_MTU 2048 > > > > @@ -3288,7 +3287,7 @@ module_exit(hso_exit); > > > > > > > > MODULE_AUTHOR(MOD_AUTHOR); > > > > MODULE_DESCRIPTION(MOD_DESCRIPTION); > > > > -MODULE_LICENSE(MOD_LICENSE); > > > > +MODULE_LICENSE("GPL"); > > > > > > Probably all of these MODULE_(MOD_) uses could be > > > simplified as well. > > > > > > Perhaps there's utility in a (cocci?) script that looks for > > > used-once > > > macro #defines in various types of macros. 
> > > > What about module_version, eg: > > > > diff -u -p a/drivers/ata/pata_pdc202xx_old.c > > b/drivers/ata/pata_pdc202xx_old.c > > --- a/drivers/ata/pata_pdc202xx_old.c > > +++ b/drivers/ata/pata_pdc202xx_old.c > > @@ -21,7 +21,6 @@ > > #include > > > > #define DRV_NAME "pata_pdc202xx_old" > > -#define DRV_VERSION "0.4.3" > > > > static int pdc2026x_cable_detect(struct ata_port *ap) > > { > > @@ -389,4 +388,4 @@ MODULE_AUTHOR("Alan Cox"); > > MODULE_DESCRIPTION("low-level driver for Promise 2024x and 20262-20267"); > > MODULE_LICENSE("GPL"); > > MODULE_DEVICE_TABLE(pci, pdc202xx); > > -MODULE_VERSION(DRV_VERSION); > > +MODULE_VERSION("0.4.3"); > > I've just deleted MODULE_VERSION() entirely from some subsystems, as > once the driver is in the kernel source tree, the "version" makes almost > no sense at all. > > But I know some companies love incrementing it (some network and scsi > drivers specifically), so those might want to keep it around for some > odd reason. OK, that seems like a simple solution. Thanks. julia
Re: [PATCH 1/2] ALSA: pcm: add SNDRV_PCM_FORMAT_{S, U}20_4
On Nov 23 2017 08:44, Maciej S. Szmigiero wrote: On 23.11.2017 00:27, Takashi Sakamoto wrote: On Nov 23 2017 04:17, Maciej S. Szmigiero wrote: (..) --- a/include/uapi/sound/asound.h +++ b/include/uapi/sound/asound.h @@ -236,7 +236,11 @@ typedef int __bitwise snd_pcm_format_t; #define SNDRV_PCM_FORMAT_DSD_U32_LE ((__force snd_pcm_format_t) 50) /* DSD, 4-byte samples DSD (x32), little endian */ #define SNDRV_PCM_FORMAT_DSD_U16_BE ((__force snd_pcm_format_t) 51) /* DSD, 2-byte samples DSD (x16), big endian */ #define SNDRV_PCM_FORMAT_DSD_U32_BE ((__force snd_pcm_format_t) 52) /* DSD, 4-byte samples DSD (x32), big endian */ -#define SNDRV_PCM_FORMAT_LAST SNDRV_PCM_FORMAT_DSD_U32_BE +#define SNDRV_PCM_FORMAT_S20_4LE ((__force snd_pcm_format_t) 53) /* in four bytes */ +#define SNDRV_PCM_FORMAT_S20_4BE ((__force snd_pcm_format_t) 54) /* in four bytes */ +#define SNDRV_PCM_FORMAT_U20_4LE ((__force snd_pcm_format_t) 55) /* in four bytes */ +#define SNDRV_PCM_FORMAT_U20_4BE ((__force snd_pcm_format_t) 56) /* in four bytes */ +#define SNDRV_PCM_FORMAT_LAST SNDRV_PCM_FORMAT_U20_4BE In my opinion, for this type of definition, it's better to declare left/right-adjusted or padding side. (Of course, silence definition is already a hint, however the lack of information forces developers to have a careful behaviour to handle entries on the list. (I note that in current ALSA PCM interface there's no way to deliver MSB/LSB-first information about sample format.) No other sound format includes this information in its name You overlook comments in 'SNDRV_PCM_FORMAT_[U|S]24_[LE|BE]'. 
Let me refer to them [1]: 198 #define SNDRV_PCM_FORMAT_S24_LE ((__force snd_pcm_format_t) 6) /* low three bytes */ 199 #define SNDRV_PCM_FORMAT_S24_BE ((__force snd_pcm_format_t) 7) /* low three bytes */ 200 #define SNDRV_PCM_FORMAT_U24_LE ((__force snd_pcm_format_t) 8) /* low three bytes */ 201 #define SNDRV_PCM_FORMAT_U24_BE ((__force snd_pcm_format_t) 9) /* low three bytes */ In your way, these types of format can be represented by 'SNDRV_PCM_FORMAT_[U|S]24_4[LE|BE]', thus for playback direction they mean: ``` #include #include uint32_t *buf; uint32_t sample; snd_pcm_format_t format; sample = generate_a_sample(); (sample & ~0x00ff) /* invalid bits as sample */ if (format == SNDRV_PCM_FORMAT_[U|S]24_LE) { buf[0] = htole32(sample); else buf[0] = htobe32(sample); /* transfer content of the buf via ALSA kernel stuffs. */ ``` The comments are good enough for application developers in an aspect of a position for padding. In general, studying from the past is preferable behaviour to be genius, however accumulated history includes mistakes and defects. Just pretending the past is not so genius, without further consideration. Actually additions of the rest of entries for PCM format were done without enough cares of what information they give to application developers. Adding new entries is easier than fixing and improving them once exposed. It's a reason that they're left what they're. I wish you had enough care to assist applications developers. Without applications, drivers are worthless and just waste of code base. so if we name these formats SNDRV_PCM_FORMAT_{S, U}20LSB_4 they are going to have it inconsistent with every other one (I assume you meant to include such information in a format name?). But information about whether this format is MSB or LSB justified can be added in a comment so the situation is clear for other developers from the definition without needing to read the actual processing code. 
For consistency of the other entries, this is not so preferable, in my opinion. So I didn't suggest it and just noted. Additionally, alsa-lib includes some codes related to the definition[1]. If you'd like to thing goes well out of ALSA SoC part, it's better to submit changes to the library as well. [1] http://git.alsa-project.org/?p=alsa-lib.git;a=blob;f=src/pcm/pcm_misc.c;h=5420b1895713a3aec3624a5218794a7b49baf167;hb=HEAD I have alsa-lib changes ready for these formats - they were needed to test these patches, will post them when this is merged on the kernel side (in case some changes are needed which affect both). Please pay enough care when writing patch comment. Silence means nothing, at least for reviewers, even if you have good preparations. [1] https://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git/tree/include/uapi/sound/asound.h?h=sound-4.15-rc1#n198 Regards Takashi Sakamoto
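As a side note to the "low three bytes" comments quoted above, the following is a minimal sketch of what that layout means for a 4-byte little-endian container. The helper names pack_s24_le() and unpack_s24_le() are hypothetical and not part of ALSA, and zeroing the padding byte is an assumption made for illustration only.

```c
#include <stdint.h>

/* Hypothetical helper, not an ALSA API: store a signed 24-bit sample in
 * the low three bytes of a 4-byte little-endian container. The fourth
 * byte is padding; it is zeroed here as an assumption. */
void pack_s24_le(int32_t sample, uint8_t out[4])
{
	uint32_t v = (uint32_t)sample & 0x00ffffffu; /* keep the 24 valid bits */

	out[0] = (uint8_t)(v & 0xff);         /* least significant byte first */
	out[1] = (uint8_t)((v >> 8) & 0xff);
	out[2] = (uint8_t)((v >> 16) & 0xff);
	out[3] = 0;                           /* padding byte */
}

/* Hypothetical inverse: rebuild the sample and sign-extend from bit 23.
 * The final cast relies on the usual two's-complement behaviour. */
int32_t unpack_s24_le(const uint8_t in[4])
{
	uint32_t v = (uint32_t)in[0] |
		     ((uint32_t)in[1] << 8) |
		     ((uint32_t)in[2] << 16);

	if (v & 0x00800000u)
		v |= 0xff000000u;
	return (int32_t)v;
}
```

A format whose valid bits sit at the other end of the container would need a different mask and shift, which is why the position of the padding is worth stating in the definition's comment.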
Re: [PATCH 4.14 00/18] 4.14.2-stable review
On Wed, Nov 22, 2017 at 01:34:13PM -0800, Guenter Roeck wrote: > On Wed, Nov 22, 2017 at 11:12:24AM +0100, Greg Kroah-Hartman wrote: > > This is the start of the stable review cycle for the 4.14.2 release. > > There are 18 patches in this series, all will be posted as a response > > to this one. If anyone has any issues with these being applied, please > > let me know. > > > > Responses should be made by Fri Nov 24 10:11:38 UTC 2017. > > Anything received after that time might be too late. > > > > Build results: > total: 145 pass: 145 fail: 0 > Qemu test results: > total: 123 pass: 123 fail: 0 > > Details are available at http://kerneltests.org/builders. Wonderful, thanks for testing all of these and letting me know. greg k-h
[ANNOUNCE] 3.2.95-rt133
Dear RT Folks, I'm pleased to announce the 3.2.95-rt133 stable release. This release is just an update to the new stable 3.2.95 version and no RT specific changes have been made. You can get this release via the git tree at: git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-stable-rt.git branch: v3.2-rt Head SHA1: 6456225eef4c956db441af54b72ad497b797a343 Or to build 3.2.95-rt133 directly, the following patches should be applied: http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.2.tar.xz http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.2.95.xz http://www.kernel.org/pub/linux/kernel/projects/rt/3.2/patch-3.2.95-rt133.patch.xz Enjoy, -- Steve
[PATCH] r8152: disable rx checksum offload on Dell TB dock
r8153 on Dell TB dock corrupts rx packets. The root cause is not found yet, but disabling rx checksumming can workaround the issue. We can use this connection to decide if it's a Dell TB dock: Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller BugLink: https://bugs.launchpad.net/bugs/1729674 Cc: Mario LimoncielloSigned-off-by: Kai-Heng Feng --- drivers/net/usb/r8152.c | 33 - 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index d51d9abf7986..58b80b5e7803 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -27,6 +27,8 @@ #include #include #include +#include +#include /* Information for net-next */ #define NETNEXT_VERSION"09" @@ -5135,6 +5137,35 @@ static u8 rtl_get_version(struct usb_interface *intf) return version; } +/* Ethernet on Dell TB 15/16 dock is connected this way: + * Realtek r8153 <-> SMSC hub <-> ASMedia XHCI controller + * We use this connection to make sure r8153 is on the Dell TB dock. + */ +static bool check_dell_tb_dock(struct usb_device *udev) +{ + struct usb_device *hub = udev->parent; + struct usb_device *root_hub; + struct pci_dev *controller; + + if (!hub) + return false; + + if (!(le16_to_cpu(hub->descriptor.idVendor) == 0x0424 && + le16_to_cpu(hub->descriptor.idProduct) == 0x5537)) + return false; + + root_hub = hub->parent; + if (!root_hub || root_hub->parent) + return false; + + controller = to_pci_dev(bus_to_hcd(root_hub->bus)->self.controller); + + if (controller->vendor == 0x1b21 && controller->device == 0x1142) + return true; + + return false; +} + static int rtl8152_probe(struct usb_interface *intf, const struct usb_device_id *id) { @@ -5202,7 +5233,7 @@ static int rtl8152_probe(struct usb_interface *intf, NETIF_F_HIGHDMA | NETIF_F_FRAGLIST | NETIF_F_IPV6_CSUM | NETIF_F_TSO6; - if (tp->version == RTL_VER_01) { + if (tp->version == RTL_VER_01 || check_dell_tb_dock(udev)) { netdev->features &= ~NETIF_F_RXCSUM; netdev->hw_features &= ~NETIF_F_RXCSUM; } 
-- 2.14.1
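The topology test in the patch above can be sketched outside the kernel as follows. The toy_usb_dev and toy_pci_dev structures and the function name are hypothetical stand-ins for struct usb_device and struct pci_dev; only the vendor/product IDs and the order of the checks are taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct usb_device. */
struct toy_usb_dev {
	uint16_t vendor;
	uint16_t product;
	const struct toy_usb_dev *parent; /* NULL for a root hub */
};

/* Hypothetical, simplified stand-in for struct pci_dev. */
struct toy_pci_dev {
	uint16_t vendor;
	uint16_t device;
};

/* Mirrors the order of checks in check_dell_tb_dock(): the NIC must sit
 * directly behind the SMSC hub (0x0424:0x5537), that hub must sit directly
 * on a root hub, and the host controller must be the ASMedia one
 * (0x1b21:0x1142). */
bool toy_is_dell_tb_dock(const struct toy_usb_dev *nic,
			 const struct toy_pci_dev *controller)
{
	const struct toy_usb_dev *hub = nic->parent;
	const struct toy_usb_dev *root_hub;

	if (!hub)
		return false;
	if (hub->vendor != 0x0424 || hub->product != 0x5537)
		return false;

	root_hub = hub->parent;
	if (!root_hub || root_hub->parent)
		return false;

	return controller->vendor == 0x1b21 && controller->device == 0x1142;
}
```

The sketch takes the controller as a separate argument because it does not model the bus_to_hcd() lookup that the real driver performs.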
Re: [PATCH 00/23] [v4] KAISER: unmap most of the kernel from userspace page tables
* Ingo Molnarwrote: > > 32-bit x86 defconfig still doesn't build: > > arch/x86/events/intel/ds.c: In function ‘dsalloc’: > arch/x86/events/intel/ds.c:296:6: error: implicit declaration of function > ‘kaiser_add_mapping’; did you mean ‘kgid_has_mapping’? > [-Werror=implicit-function-declaration] The patch below should cure this one - only build tested. Thanks, Ingo arch/x86/events/intel/ds.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c index c9f44d7ce838..61388b01962d 100644 --- a/arch/x86/events/intel/ds.c +++ b/arch/x86/events/intel/ds.c @@ -3,7 +3,7 @@ #include #include -#include +#include #include #include
constant conversion warning in umip
Pulling down ToT, I'm seeing the following warning when building with clang: CC arch/x86/lib/insn.o arch/x86/lib/insn-eval.c:780:10: error: implicit conversion from 'int' to 'char' changes value from 132 to -124 [-Werror,-Wconstant-conversion] return INSN_CODE_SEG_PARAMS(4, 8); ~~ ^~ ./arch/x86/include/asm/insn-eval.h:16:57: note: expanded from macro 'INSN_CODE_SEG_PARAMS' #define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) (oper_sz | (addr_sz << 4)) ^~~~ seems to be coming from commit 4efea85fb56fa "x86/insn-eval: Add function to get default params of code segment". Should this be an unsigned char (as well as seg_defs in arch/x86/kernel/umip.c? That might be an issue for returning -EINVAL, maybe an in/out parameter would be better?
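For reference, the arithmetic behind the warning can be reproduced in a few lines of user-space C. This is only a sketch: the macro is copied from the diagnostic (with extra parentheses), the three function names are hypothetical, and the signed result assumes the usual two's-complement wrap that gcc and clang implement on x86.

```c
/* Same expression as the macro in the clang diagnostic, parenthesized. */
#define INSN_CODE_SEG_PARAMS(oper_sz, addr_sz) ((oper_sz) | ((addr_sz) << 4))

/* 4 | (8 << 4) is 132, which does not fit a signed 8-bit type. The
 * explicit cast makes the truncation visible instead of implicit. */
signed char params_as_signed_char(int oper_sz, int addr_sz)
{
	return (signed char)INSN_CODE_SEG_PARAMS(oper_sz, addr_sz);
}

/* An unsigned 8-bit type holds 132 but cannot carry a negative errno. */
unsigned char params_as_unsigned_char(int oper_sz, int addr_sz)
{
	return (unsigned char)INSN_CODE_SEG_PARAMS(oper_sz, addr_sz);
}

/* An int return holds both 132 and negative error codes. */
int params_as_int(int oper_sz, int addr_sz)
{
	return INSN_CODE_SEG_PARAMS(oper_sz, addr_sz);
}
```

Which of these the function should use is the open question in the message above; the sketch only shows why the value changes from 132 to -124 when the return type is a signed char.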
Re: mm/percpu.c: use smarter memory allocation for struct pcpu_alloc_info (crisv32 hang)
On Wed, Nov 22, 2017 at 03:17:00PM -0500, Nicolas Pitre wrote: > On Wed, 22 Nov 2017, Jesper Nilsson wrote: > > > On Mon, Nov 20, 2017 at 10:50:46PM -0500, Nicolas Pitre wrote: > > > On Mon, 20 Nov 2017, Guenter Roeck wrote: > > > > On Mon, Nov 20, 2017 at 07:28:21PM -0500, Nicolas Pitre wrote: > > > > > On Mon, 20 Nov 2017, Guenter Roeck wrote: > > > > > > > > > > > bdata->node_min_pfn=6 PFN_PHYS(bdata->node_min_pfn)=c000 > > > > > > start_off=536000 region=c0536000 > > > > > > > > > > If PFN_PHYS(bdata->node_min_pfn)=c000 and > > > > > region=c0536000 that means phys_to_virt() is a no-op. > > > > > > > > > No, it is |= 0x8000 > > > > > > Then the bootmem registration looks very fishy. If you have: > > > > > > > I think the problem is the 0x6 in bdata->node_min_pfn. It is shifted > > > > left by PFN_PHYS, making it 0xc000, which in my understanding is > > > > a virtual address. > > > > > > Exact. > > > > > > #define __pa(x) ((unsigned long)(x) & 0x7fff) > > > #define __va(x) ((void *)((unsigned long)(x) | > > > 0x8000)) > > > > > > With that, the only possible physical address range you may have is > > > 0x4000 - 0x7fff, and it better start at 0x4000. If that's > > > not where your RAM is then something is wrong. > > > > > > This is in fact a very bad idea to define __va() and __pa() using > > > bitwise operations as this hides mistakes like defining physical RAM > > > address at 0xc000. Instead, it should look like: > > > > > > #define __pa(x) ((unsigned long)(x) - 0x8000) > > > #define __va(x) ((void *)((unsigned long)(x) + > > > 0x8000)) > > > > > > This way, bad physical RAM address definitions will be caught > > > immediately. > > > > > > > That doesn't seem to be easy to fix. It seems there is a mixup of > > > > physical > > > > and virtual addresses in the architecture. > > > > > > Well... I don't think there is much else to say other than this needs > > > fixing. 
> > > > The memory map for the ETRAX FS has the SDRAM mapped at both > > 0x4000-0x7fff > > and 0xc000-0x, and the difference is cached and non-cached. > > That is actively (ab)used in the port, unfortunately, allthough I'm > > uncertain if this is the problem in this case. > > It certainly is a problem. If your cached RAM is physically mapped at > 0xc000 and you want it to be virtually mapped at 0xc000 then you > should have: > > #define __pa(x) ((unsigned long)(x)) > #define __va(x) ((void *)(x)) > > i.e. no translation. Sorry, it's the other way around, cached memory is at 0x4000 and non-cached is at 0xc000, so the translation is right, even if as you pointed out earlier, it should be performed differently. > For non-cached RAM access, there are specific > interfaces for that. For example, you could have dma_alloc_coherent() > take advantage of the fact that memory with the top bit cleared becomes > uncached. But __pa() is the wrong interface for obtaining uncached > memory. > > Nicolas /^JN - Jesper Nilsson -- Jesper Nilsson -- jesper.nils...@axis.com
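Nicolas's point about bitwise versus arithmetic translation can be illustrated with a small user-space sketch. The constants below are assumptions chosen for a 32-bit layout (an offset of 0x80000000 and RAM at 0x40000000); the addresses in the archived thread appear truncated, so these values are illustrative rather than quoted, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed translation offset for the sketch. */
#define TOY_OFFSET 0x80000000u

/* Bitwise variants, in the style criticised in the thread. */
uint32_t toy_va_bitwise(uint32_t pa) { return pa | TOY_OFFSET; }
uint32_t toy_pa_bitwise(uint32_t va) { return va & ~TOY_OFFSET; }

/* Arithmetic variants, in the style suggested in the thread. */
uint32_t toy_va_arith(uint32_t pa) { return pa + TOY_OFFSET; }
uint32_t toy_pa_arith(uint32_t va) { return va - TOY_OFFSET; }
```

With a correct physical address of 0x40000000 both styles yield 0xc0000000. With a mistaken "physical" address of 0xc0000000, the bitwise form still yields 0xc0000000 and so hides the mistake, while the arithmetic form wraps to 0x40000000, an address that is plainly not in the expected virtual range.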
Re: [PATCH 1/2] serial: 8250_fintek: Return -EINVAL on invalid configuration
On Wed, Nov 22, 2017 at 11:30:39PM +0100, Ricardo Ribalda Delgado wrote: > ping? Both patches went into 4.15. Thanks, Lukas
[PATCH 1/2] scripts: leaking_addresses: add support for 32-bit kernel addresses
The current leaking_addresses.pl script only supports showing "leaked" 64-bit kernel virtual addresses. This patch adds support for showing "leaked" 32-bit kernel virtual addresses. It also takes into account Tobin's feedback on the previous iteration. (Note: this patch is meant to apply on the 'leaks' branch of Tobin's tree). Briefly, the way it works: once it detects we're running on an i'x'86 platform, (where x=3|4|5|6), it takes this arch into account for checking. The essential rationale: if virt-addr >= PAGE_OFFSET => it's a kernel virtual address. This version programmatically queries and sets PAGE_OFFSET based on the /boot/config-$(uname -r) content. If, for any reason, this file cannot be used, we fall back to requesting the user to pass PAGE_OFFSET as a parameter. Pending/TODO: - support for ARM-32 Feedback welcome. Signed-off-by: Kaiwan N Billimoria --- diff --git a/scripts/leaking_addresses.pl b/scripts/leaking_addresses.pl index 865c07649dff..0566f8055ec5 100755 --- a/scripts/leaking_addresses.pl +++ b/scripts/leaking_addresses.pl @@ -2,10 +2,10 @@ # # (c) 2017 Tobin C. Harding # (c) 2017 Kaiwan N Billimoria (ix86 support) - + # Licensed under the terms of the GNU GPL License version 2 # -# leaking_addresses.pl: Scan 64 bit kernel for potential leaking addresses. +# leaking_addresses.pl: Scan 32/64 bit kernel for potential leaking addresses. # - Scans dmesg output. # - Walks directory tree and parses each file (for each directory in @DIRS). # @@ -14,7 +14,7 @@ # # You may like to set kptr_restrict=2 before running script # (see Documentation/sysctl/kernel.txt). - +# use warnings; use strict; use POSIX; @@ -37,7 +37,7 @@ my $TIMEOUT = 10; # Script can only grep for kernel addresses on the following architectures. If # your architecture is not listed here and has a grep'able kernel address please # consider submitting a patch. 
-my @SUPPORTED_ARCHITECTURES = ('x86_64', 'ppc64'); +my @SUPPORTED_ARCHITECTURES = ('x86_64', 'ppc64', 'i[3456]86'); # Command line options. my $help = 0; @@ -49,6 +49,12 @@ my $input_raw = ""; # Read raw results from file instead of scanning. my $suppress_dmesg = 0;# Don't show dmesg in output. my $squash_by_path = 0;# Summary report grouped by absolute path. my $squash_by_filename = 0;# Summary report grouped by filename. +my $page_offset_param = 0; # 32-bit: overrides value of PAGE_OFFSET_32BIT + +my $bit_size = 64; # Check 64-bit kernel addresses by default +my $kconfig_file = '/boot/config-'.`uname -r`; +$kconfig_file =~ s/\R*//g; +my $PAGE_OFFSET_32BIT = 0xc000; # Do not parse these files (absolute path). my @skip_parse_files_abs = ('/proc/kmsg', @@ -99,10 +105,11 @@ Options: -o, --output-raw= Save results for future processing. -i, --input-raw= Read results from file instead of scanning. - --rawShow raw results (default). - --suppress-dmesg Do not show dmesg results. - --squash-by-path Show one result per unique path. - --squash-by-filename Show one result per unique filename. + --rawShow raw results (default). + --suppress-dmesg Do not show dmesg results. + --squash-by-path Show one result per unique path. + --squash-by-filename Show one result per unique filename. + --page-offset= PAGE_OFFSET value (for 32-bit kernels). -d, --debug Display debugging output. -h, --help, --versionDisplay this help and exit. @@ -117,7 +124,7 @@ Examples: # View summary report. $0 --input-raw scan.out --squash-by-filename -Scans the running (64 bit) kernel for potential leaking addresses. +Scans the running (32 or 64 bit) kernel for potential leaking addresses. 
EOM exit($exitcode); @@ -133,10 +140,16 @@ GetOptions( 'squash-by-path'=> \$squash_by_path, 'squash-by-filename'=> \$squash_by_filename, 'raw' => \$raw, + 'page-offset=o' => \$page_offset_param, ) or help(1); help(0) if ($help); +sub dprint +{ + printf(STDERR @_) if $debug; +} + if ($input_raw) { format_output($input_raw); exit(0); @@ -162,6 +175,24 @@ if (!is_supported_architecture()) { exit(129); } +dprint "Detected arch : $bit_size bits\n"; + +if ($bit_size == 32) { + # Parameter --page-offset passed? if Y, override with it + if ($page_offset_param != 0) { + $PAGE_OFFSET_32BIT = $page_offset_param; + } else { + $PAGE_OFFSET_32BIT = eval parse_kconfig($kconfig_file, "CONFIG_PAGE_OFFSET"); + if ($PAGE_OFFSET_32BIT == 0) { + printf "$P: Fatal Error :: couldn't parse CONFIG_PAGE_OFFSET, aborting...\n"; + printf " [Detail ::
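The rationale stated in the patch description (an address at or above PAGE_OFFSET is treated as a kernel virtual address) reduces to a single comparison. The sketch below is in C rather than the script's Perl, the function name is hypothetical, and 0xc0000000 is used only as the common 3G/1G split on 32-bit x86; other splits would pass a different value, as the --page-offset option allows.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: classify a 32-bit address against a given
 * PAGE_OFFSET, following the rule "virt-addr >= PAGE_OFFSET". */
bool toy_is_kernel_vaddr32(uint32_t addr, uint32_t page_offset)
{
	return addr >= page_offset;
}
```

For example, with a PAGE_OFFSET of 0xc0000000 the address 0xc1000000 is classified as a kernel address and 0x08048000 is not.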
Re: [dm-devel] new patchset to eliminate DM's use of BIOSET_NEED_RESCUER
On Wed, Nov 22 2017, Mikulas Patocka wrote: > On Wed, 22 Nov 2017, NeilBrown wrote: > >> On Tue, Nov 21 2017, Mikulas Patocka wrote: >> >> > On Tue, 21 Nov 2017, Mike Snitzer wrote: >> > >> >> On Tue, Nov 21 2017 at 4:23pm -0500, >> >> Mikulas Patockawrote: >> >> >> >> > This is not correct: >> >> > >> >> >2206 static void dm_wq_work(struct work_struct *work) >> >> >2207 { >> >> >2208 struct mapped_device *md = container_of(work, struct >> >> > mapped_device, work); >> >> >2209 struct bio *bio; >> >> >2210 int srcu_idx; >> >> >2211 struct dm_table *map; >> >> >2212 >> >> >2213 if (!bio_list_empty(>rescued)) { >> >> >2214 struct bio_list list; >> >> >2215 spin_lock_irq(>deferred_lock); >> >> >2216 list = md->rescued; >> >> >2217 bio_list_init(>rescued); >> >> >2218 spin_unlock_irq(>deferred_lock); >> >> >2219 while ((bio = bio_list_pop())) >> >> >2220 generic_make_request(bio); >> >> >2221 } >> >> > >> >> >2223 map = dm_get_live_table(md, _idx); >> >> >2224 >> >> >2225 while (!test_bit(DMF_BLOCK_IO_FOR_SUSPEND, >flags)) >> >> > { >> >> >2226 spin_lock_irq(>deferred_lock); >> >> >2227 bio = bio_list_pop(>deferred); >> >> >2228 spin_unlock_irq(>deferred_lock); >> >> >2229 >> >> >2230 if (!bio) >> >> >2231 break; >> >> >2232 >> >> >2233 if (dm_request_based(md)) >> >> >2234 generic_make_request(bio); >> >> >2235 else >> >> >2236 __split_and_process_bio(md, map, bio); >> >> >2237 } >> >> >2238 >> >> >2239 dm_put_live_table(md, srcu_idx); >> >> >2240 } >> >> > >> >> > You can see that if we are in dm_wq_work in __split_and_process_bio, we >> >> > will not process md->rescued list. >> >> >> >> Can you elaborate further? We cannot be "in dm_wq_work in >> >> __split_and_process_bio" simultaneously. Do you mean as a side-effect >> >> of scheduling away from __split_and_process_bio? >> >> >> >> The more detail you can share the better. 
>> > >> > Suppose this scenario: >> > >> > * dm_wq_work calls __split_and_process_bio >> > * __split_and_process_bio eventually reaches the function snapshot_map >> > * snapshot_map attempts to take the snapshot lock >> > >> > * the snapshot lock could be released only if some bios submitted by the >> > snapshot driver to the underlying device complete >> > * the bios submitted to the underlying device were already offloaded by >> > some other task and they are waiting on the list md->rescued >> > * the bios waiting on md->rescued are not processed, because dm_wq_work is >> > blocked in snapshot_map (called from __split_and_process_bio) >> >> Yes, I think you are right. >> >> I think the solution is to get rid of the dm_offload() infrastructure >> and make it not necessary. >> i.e. discard my patches >> dm: prepare to discontinue use of BIOSET_NEED_RESCUER >> and >> dm: revise 'rescue' strategy for bio-based bioset allocations >> >> And build on "dm: ensure bio submission follows a depth-first tree walk" >> which was written after those and already makes dm_offload() less >> important. >> >> Since that "depth-first" patch, every request to the dm device, after >> the initial splitting, allocates just one dm_target_io structure, and >> makes just one __map_bio() call, and so will behave exactly the way >> generic_make_request() expects and copes with - thus avoiding awkward >> dependencies and deadlocks. Except >> >> a/ If any target defines ->num_write_bios() to return >1, >>__clone_and_map_data_bio() will make multiple calls to alloc_tio() >>and __map_bio(), which might need rescuing. >>But no target defines num_write_bios, and none have since it was >>removed from dm-cache 4.5 years ago. >>Can we discard num_write_bios?? >> >> b/ If any target sets any of num_{flush,discard,write_same,write_zeroes}_bios >>to a value > 1, then __send_duplicate_bios() will also make multiple >>calls to alloc_tio() and __map_bio(). >>Some do. 
>> dm-cache-target: flush=2 >> dm-snap: flush=2 >> dm-stripe: discard, write_same, write_zeroes all set to 'stripes'. >> >> These will only be a problem if the second (or subsequent) alloc_tio() >> blocks waiting for an earlier allocation to complete. This will only >> be a problem if multiple threads are each trying to allocate multiple >> dm_target_io from the same bioset at the same time. >> This is rare and should be easier to address than the current >> dm_offload() approach. >> One possibility would be to copy the approach taken by >>
[PATCH RT 04/10] sched: Prevent task state corruption by spurious lock wakeup
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know.

--

From: Thomas Gleixner

Mathias and others reported GDB failures on RT. The following scenario leads to task state corruption:

CPU0                                    CPU1

T1->state = TASK_XXX;
spin_lock()
  rt_spin_lock_slowlock(>rtmutex)
    raw_spin_lock(>wait_lock);
    T1->saved_state = current->state;
    T1->state = TASK_UNINTERRUPTIBLE;
                                        spin_unlock()
    task_blocks_on_rt_mutex(rtm)          rt_spin_lock_slowunlock(>rtmutex)
      queue_waiter(rtm)                     raw_spin_lock(>wait_lock);
      pi_chain_walk(rtm)
        raw_spin_unlock(>wait_lock);
                                            wake_top_waiter(T1)

        raw_spin_lock(>wait_lock);

    for (;;) {
      if (__try_to_take_rt_mutex())  <- Succeeds
        break;
      ...
    }

    T1->state = T1->saved_state;
                                              try_to_wake_up(T1)
                                                ttwu_do_wakeup(T1)
                                                  T1->state = TASK_RUNNING;

In most cases this is harmless because waiting for some event, which is the usual reason for TASK_[UN]INTERRUPTIBLE has to be safe against other forms of spurious wakeups anyway.

But in case of TASK_TRACED this is actually fatal, because the task loses the TASK_TRACED state. In consequence it fails to consume SIGSTOP which was sent from the debugger and actually delivers SIGSTOP to the task which breaks the ptrace mechanics and brings the debugger into an unexpected state.

The TASK_TRACED state should prevent getting there due to the state matching logic in try_to_wake_up(). But that's not true because wake_up_lock_sleeper() uses TASK_ALL as state mask. That's bogus because lock sleepers always use TASK_UNINTERRUPTIBLE, so the wakeup should use that as well.

The cure is way simpler as figuring it out: Change the mask used in wake_up_lock_sleeper() from TASK_ALL to TASK_UNINTERRUPTIBLE.
Cc: stable...@vger.kernel.org Reported-by: Mathias Koehrer Reported-by: David Hauck Signed-off-by: Thomas Gleixner Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- kernel/sched/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index ed9550c87f66..970b893a1d15 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -2212,7 +2212,7 @@ EXPORT_SYMBOL(wake_up_process); */ int wake_up_lock_sleeper(struct task_struct *p) { - return try_to_wake_up(p, TASK_ALL, WF_LOCK_SLEEPER); + return try_to_wake_up(p, TASK_UNINTERRUPTIBLE, WF_LOCK_SLEEPER); } int wake_up_state(struct task_struct *p, unsigned int state) -- 2.13.2
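The effect of the state mask can be shown with a toy model of the matching step. Everything here is a simplification: the TOY_ constants and toy_try_to_wake_up() are hypothetical, the bit values are chosen for illustration, and the model ignores saved_state, locking and the WF_LOCK_SLEEPER handling of the real try_to_wake_up().

```c
#include <stdbool.h>

/* Illustrative state bits, assumed for this sketch only. */
#define TOY_TASK_RUNNING         0x0u
#define TOY_TASK_INTERRUPTIBLE   0x1u
#define TOY_TASK_UNINTERRUPTIBLE 0x2u
#define TOY_TASK_STOPPED         0x4u
#define TOY_TASK_TRACED          0x8u
#define TOY_TASK_ALL (TOY_TASK_INTERRUPTIBLE | TOY_TASK_UNINTERRUPTIBLE | \
		      TOY_TASK_STOPPED | TOY_TASK_TRACED)

struct toy_task {
	unsigned int state;
};

/* Wake the task only if its current state matches the caller's mask. */
bool toy_try_to_wake_up(struct toy_task *p, unsigned int mask)
{
	if (!(p->state & mask))
		return false;       /* state does not match: leave it alone */
	p->state = TOY_TASK_RUNNING;
	return true;
}
```

In this model a task that has already restored a traced state is overwritten to running when the waker passes the all-states mask, and is left untouched when the waker passes only the uninterruptible bit, which is the behaviour the one-line change aims for.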
[PATCH RT 09/10] md/raid5: do not disable interrupts
4.4.97-rt111-rc2 stable review patch. If anyone has any objections, please let me know. -- From: Sebastian Andrzej Siewior|BUG: sleeping function called from invalid context at kernel/locking/rtmutex.c:974 |in_atomic(): 0, irqs_disabled(): 1, pid: 2992, name: lvm |CPU: 2 PID: 2992 Comm: lvm Not tainted 4.13.10-rt3+ #54 |Call Trace: | dump_stack+0x4f/0x65 | ___might_sleep+0xfc/0x150 | atomic_dec_and_spin_lock+0x3c/0x80 | raid5_release_stripe+0x73/0x110 | grow_one_stripe+0xce/0xf0 | setup_conf+0x841/0xaa0 | raid5_run+0x7e7/0xa40 | md_run+0x515/0xaf0 | raid_ctr+0x147d/0x25e0 | dm_table_add_target+0x155/0x320 | table_load+0x103/0x320 | ctl_ioctl+0x1d9/0x510 | dm_ctl_ioctl+0x9/0x10 | do_vfs_ioctl+0x8e/0x670 | SyS_ioctl+0x3c/0x70 | entry_SYSCALL_64_fastpath+0x17/0x98 The interrupts were disabled because ->device_lock is taken with interrupts disabled. Cc: stable...@vger.kernel.org Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Steven Rostedt (VMware) --- drivers/md/raid5.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c index 9b1aedb8e5df..8b236d622889 100644 --- a/drivers/md/raid5.c +++ b/drivers/md/raid5.c @@ -429,7 +429,7 @@ void raid5_release_stripe(struct stripe_head *sh) md_wakeup_thread(conf->mddev->thread); return; slow_path: - local_irq_save(flags); + local_irq_save_nort(flags); /* we are ok here if STRIPE_ON_RELEASE_LIST is set or not */ if (atomic_dec_and_lock(>count, >device_lock)) { INIT_LIST_HEAD(); @@ -438,7 +438,7 @@ slow_path: spin_unlock(>device_lock); release_inactive_stripe_list(conf, , hash); } - local_irq_restore(flags); + local_irq_restore_nort(flags); } static inline void remove_hash(struct stripe_head *sh) -- 2.13.2
[PATCH] gfs2: Fix wrong error handling in init_gfs2_fs()
init_gfs2_fs() is calling e.g. unregister_shrinker() without register_shrinker() when an error occurred during initialization. Rename goto labels and call appropriate undo function. Signed-off-by: Tetsuo Handa --- fs/gfs2/main.c | 90 -- 1 file changed, 44 insertions(+), 46 deletions(-) diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c index 0a89e6f..2d55e2c 100644 --- a/fs/gfs2/main.c +++ b/fs/gfs2/main.c @@ -93,7 +93,7 @@ static int __init init_gfs2_fs(void) error = gfs2_glock_init(); if (error) - goto fail; + goto fail_glock; error = -ENOMEM; gfs2_glock_cachep = kmem_cache_create("gfs2_glock", @@ -101,7 +101,7 @@ static int __init init_gfs2_fs(void) 0, 0, gfs2_init_glock_once); if (!gfs2_glock_cachep) - goto fail; + goto fail_cachep1; gfs2_glock_aspace_cachep = kmem_cache_create("gfs2_glock(aspace)", sizeof(struct gfs2_glock) + @@ -109,7 +109,7 @@ static int __init init_gfs2_fs(void) 0, 0, gfs2_init_gl_aspace_once); if (!gfs2_glock_aspace_cachep) - goto fail; + goto fail_cachep2; gfs2_inode_cachep = kmem_cache_create("gfs2_inode", sizeof(struct gfs2_inode), @@ -118,107 +118,105 @@ static int __init init_gfs2_fs(void) SLAB_ACCOUNT, gfs2_init_inode_once); if (!gfs2_inode_cachep) - goto fail; + goto fail_cachep3; gfs2_bufdata_cachep = kmem_cache_create("gfs2_bufdata", sizeof(struct gfs2_bufdata), 0, 0, NULL); if (!gfs2_bufdata_cachep) - goto fail; + goto fail_cachep4; gfs2_rgrpd_cachep = kmem_cache_create("gfs2_rgrpd", sizeof(struct gfs2_rgrpd), 0, 0, NULL); if (!gfs2_rgrpd_cachep) - goto fail; + goto fail_cachep5; gfs2_quotad_cachep = kmem_cache_create("gfs2_quotad", sizeof(struct gfs2_quota_data), 0, 0, NULL); if (!gfs2_quotad_cachep) - goto fail; + goto fail_cachep6; gfs2_qadata_cachep = kmem_cache_create("gfs2_qadata", sizeof(struct gfs2_qadata), 0, 0, NULL); if (!gfs2_qadata_cachep) - goto fail_cachep7; error = register_shrinker(_qd_shrinker); if (error) - goto fail; + goto fail_shrinker; error = register_filesystem(_fs_type); if (error) - goto 
fail; + goto fail_fs1; error = register_filesystem(_fs_type); if (error) - goto fail_unregister; + goto fail_fs2; error = -ENOMEM; gfs_recovery_wq = alloc_workqueue("gfs_recovery", WQ_MEM_RECLAIM | WQ_FREEZABLE, 0); if (!gfs_recovery_wq) - goto fail_wq; + goto fail_wq1; gfs2_control_wq = alloc_workqueue("gfs2_control", WQ_UNBOUND | WQ_FREEZABLE, 0); if (!gfs2_control_wq) - goto fail_recovery; + goto fail_wq2; gfs2_freeze_wq = alloc_workqueue("freeze_workqueue", 0, 0); if (!gfs2_freeze_wq) - goto fail_control; + goto fail_wq3; gfs2_page_pool = mempool_create_page_pool(64, 0); if (!gfs2_page_pool) - goto fail_freeze; + goto fail_mempool; - gfs2_register_debugfs(); + error = gfs2_register_debugfs(); + if (error) + goto fail_debugfs; pr_info("GFS2 installed\n"); return 0; -fail_freeze: +fail_debugfs: + mempool_destroy(gfs2_page_pool); +fail_mempool: destroy_workqueue(gfs2_freeze_wq); -fail_control: +fail_wq3: destroy_workqueue(gfs2_control_wq); -fail_recovery: +fail_wq2: destroy_workqueue(gfs_recovery_wq); -fail_wq: +fail_wq1: unregister_filesystem(_fs_type); -fail_unregister: +fail_fs2: unregister_filesystem(_fs_type); -fail: - list_lru_destroy(_qd_lru); -fail_lru: +fail_fs1: unregister_shrinker(_qd_shrinker); +fail_shrinker: + kmem_cache_destroy(gfs2_qadata_cachep);
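The label scheme used in the patch above, where each label is named after the step that failed and control falls through the undo calls for all earlier steps, can be sketched in isolation. The toy_ names, the three-step sequence and the bookkeeping counter are hypothetical; they exist only so the ordering can be checked from user space.

```c
#include <stdbool.h>

enum { TOY_STEPS = 3 };

bool toy_active[TOY_STEPS];  /* which steps are currently set up */
int toy_bad_undo_calls;      /* undo calls made on a step never set up */

/* Hypothetical setup step; fails when step == fail_at. */
static int toy_setup(int step, int fail_at)
{
	if (step == fail_at)
		return -1;
	toy_active[step] = true;
	return 0;
}

/* Hypothetical undo step; records undoing something never set up. */
static void toy_teardown(int step)
{
	if (!toy_active[step])
		toy_bad_undo_calls++;
	toy_active[step] = false;
}

/* Each label is named after the step that failed; falling through the
 * labels undoes only the earlier steps, in reverse order. */
int toy_init(int fail_at)
{
	int error;

	error = toy_setup(0, fail_at);
	if (error)
		goto fail_step0;
	error = toy_setup(1, fail_at);
	if (error)
		goto fail_step1;
	error = toy_setup(2, fail_at);
	if (error)
		goto fail_step2;
	return 0;

fail_step2:
	toy_teardown(1);
fail_step1:
	toy_teardown(0);
fail_step0:
	return error;
}
```

With a single shared label, a failure in the first step would run every undo call, including those for steps that never ran; that is the class of bug the patch description refers to.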
Re: [PATCH] ASoC: wm0010: Delete an error message for a failed memory allocation in wm0010_boot()
On Wed, Nov 22, 2017 at 05:27:11PM +0100, SF Markus Elfring wrote: > From: Markus Elfring> Date: Wed, 22 Nov 2017 17:17:48 +0100 > > Omit an extra message for a memory allocation failure in this function. > > This issue was detected by using the Coccinelle software. > > Signed-off-by: Markus Elfring > --- Acked-by: Charles Keepax Thanks, Charles
RE: [PATCH AUTOSEL for 4.9 37/54] RDMA/qedr: Fix RDMA CM loopback
> From: Ram Amrani
>
> [ Upstream commit af2b14b8b8ae21b0047a52c767ac8b44f435a280 ]
>
> The loopback logic in RDMA CM packets compares Ethernet addresses and
> was accidently inverse.
>
> Signed-off-by: Ram Amrani
> Signed-off-by: Ariel Elior
> Signed-off-by: Doug Ledford
> Signed-off-by: Sasha Levin
> ---
>  drivers/infiniband/hw/qedr/qedr_cm.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qedr/qedr_cm.c
> b/drivers/infiniband/hw/qedr/qedr_cm.c
> index 63890ebb72bd..eccf7039aaca 100644
> --- a/drivers/infiniband/hw/qedr/qedr_cm.c
> +++ b/drivers/infiniband/hw/qedr/qedr_cm.c
> @@ -404,9 +404,9 @@ static inline int qedr_gsi_build_packet(struct qedr_dev
> *dev,
>  	}
>
>  	if (ether_addr_equal(udh.eth.smac_h, udh.eth.dmac_h))
> -		packet->tx_dest = QED_ROCE_LL2_TX_DEST_NW;
> -	else
>  		packet->tx_dest = QED_ROCE_LL2_TX_DEST_LB;
> +	else
> +		packet->tx_dest = QED_ROCE_LL2_TX_DEST_NW;
>
>  	packet->roce_mode = roce_mode;
>  	memcpy(packet->header.vaddr, ud_header_buffer, header_size);
> --
> 2.11.0

Thanks!

Acked-by: Ram Amrani
[GIT PULL] final round of SCSI updates for the 4.14+ merge window
Two basic fixes: one for the sparse problem with the blacklist flags
and another for a hang forever in bnx2i.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-fixes

The short changelog is:

Chad Dupuis (1):
      scsi: bnx2fc: Fix hung task messages when a cleanup response is not received during abort

Hannes Reinecke (1):
      scsi: Use 'blist_flags_t' for scsi_devinfo flags

And the diffstat:

 drivers/scsi/bnx2fc/bnx2fc_io.c | 40 ++---
 drivers/scsi/scsi_devinfo.c     | 18 +++
 drivers/scsi/scsi_priv.h        | 15 +++--
 drivers/scsi/scsi_scan.c        |  2 +-
 include/scsi/scsi_device.h      |  4 +++-
 include/scsi/scsi_devinfo.h     | 50 -
 6 files changed, 78 insertions(+), 51 deletions(-)

With full diff below.

James

---

diff --git a/drivers/scsi/bnx2fc/bnx2fc_io.c b/drivers/scsi/bnx2fc/bnx2fc_io.c
index 5b6153f23f01..8e2f767147cb 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_io.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_io.c
@@ -1084,24 +1084,35 @@ static int bnx2fc_abts_cleanup(struct bnx2fc_cmd *io_req)
 {
 	struct bnx2fc_rport *tgt = io_req->tgt;
 	int rc = SUCCESS;
+	unsigned int time_left;
 
 	io_req->wait_for_comp = 1;
 	bnx2fc_initiate_cleanup(io_req);
 
 	spin_unlock_bh(&tgt->tgt_lock);
 
-	wait_for_completion(&io_req->tm_done);
-
+	/*
+	 * Can't wait forever on cleanup response lest we let the SCSI error
+	 * handler wait forever
+	 */
+	time_left = wait_for_completion_timeout(&io_req->tm_done,
+						BNX2FC_FW_TIMEOUT);
 	io_req->wait_for_comp = 0;
+	if (!time_left)
+		BNX2FC_IO_DBG(io_req, "%s(): Wait for cleanup timed out.\n",
+			      __func__);
+
 	/*
-	 * release the reference taken in eh_abort to allow the
-	 * target to re-login after flushing IOs
+	 * Release reference held by SCSI command the cleanup completion
+	 * hits the BNX2FC_CLEANUP case in bnx2fc_process_cq_compl() and
+	 * thus the SCSI command is not returnedi by bnx2fc_scsi_done().
 	 */
 	kref_put(&io_req->refcount, bnx2fc_cmd_release);
 
 	spin_lock_bh(&tgt->tgt_lock);
 	return rc;
 }
+
 /**
  * bnx2fc_eh_abort - eh_abort_handler api to abort an outstanding
  *			SCSI command
@@ -1118,6 +1129,7 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 	struct fc_lport *lport;
 	struct bnx2fc_rport *tgt;
 	int rc;
+	unsigned int time_left;
 
 	rc = fc_block_scsi_eh(sc_cmd);
 	if (rc)
@@ -1194,6 +1206,11 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 		if (cancel_delayed_work(&io_req->timeout_work))
 			kref_put(&io_req->refcount,
 				 bnx2fc_cmd_release); /* drop timer hold */
+		/*
+		 * We don't want to hold off the upper layer timer so simply
+		 * cleanup the command and return that I/O was successfully
+		 * aborted.
+		 */
 		rc = bnx2fc_abts_cleanup(io_req);
 		/* This only occurs when an task abort was requested while ABTS
 		   is in progress. Setting the IO_CLEANUP flag will skip the
@@ -1201,7 +1218,7 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 		   was a result from the ABTS request rather than the CLEANUP
 		   request */
 		set_bit(BNX2FC_FLAG_IO_CLEANUP, &io_req->req_flags);
-		goto out;
+		goto done;
 	}
 
 	/* Cancel the current timer running on this io_req */
@@ -1221,7 +1238,11 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 	}
 	spin_unlock_bh(&tgt->tgt_lock);
 
-	wait_for_completion(&io_req->tm_done);
+	/* Wait 2 * RA_TOV + 1 to be sure timeout function hasn't fired */
+	time_left = wait_for_completion_timeout(&io_req->tm_done,
+	    (2 * rp->r_a_tov + 1) * HZ);
+	if (time_left)
+		BNX2FC_IO_DBG(io_req, "Timed out in eh_abort waiting for tm_done");
 
 	spin_lock_bh(&tgt->tgt_lock);
 	io_req->wait_for_comp = 0;
@@ -1233,8 +1254,12 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 		/* Let the scsi-ml try to recover this command */
 		printk(KERN_ERR PFX "abort failed, xid = 0x%x\n",
 		       io_req->xid);
+		/*
+		 * Cleanup firmware residuals before returning control back
+		 * to SCSI ML.
+		 */
 		rc = bnx2fc_abts_cleanup(io_req);
-		goto out;
+		goto done;
 	} else {
 		/*
 		 * We come here even when there was a race condition
@@ -1249,7 +1274,6 @@ int bnx2fc_eh_abort(struct scsi_cmnd *sc_cmd)
 done:
 	/* release the reference taken in eh_abort */
Re: [v4,2/3] drivers: hwmon: Add W83773G driver
On Sun, Nov 19, 2017 at 2:45 AM, Guenter Roeck wrote:
> On Mon, Nov 13, 2017 at 11:27:33AM +0800, Lei YU wrote:
>> Nuvoton W83773G is a hardware monitor IC providing one local
>> temperature and two remote temperature sensors.
>>
>> Signed-off-by: Lei YU
>
> Applied to hwmon-next.

Where does hwmon-next live? I was looking on kernel.org and I can't
seem to find it.

Cheers,

Joel
Re: [PATCH 00/18] arm64: Unmap the kernel whilst running in userspace (KAISER)
> On 22 Nov 2017, at 23:37, Pavel Machek wrote:
>
> Hi!
>
>>>>> If I'm willing to do timing attacks to defeat KASLR... what prevents
>>>>> me from using CPU caches to do that?
>>>>
>>>> Because it is impossible to get a cache hit on an access to an
>>>> unmapped address?
>>>
>>> Um, no, I don't need to be able to directly access kernel addresses. I
>>> just put some data in _same place in cache where kernel data would
>>> go_, then do syscall and look if my data are still cached. Caches
>>> don't have infinite associativity.
>>>
>>
>> Ah ok. Interesting.
>>
>> But how does that leak address bits that are covered by the tag?
>
> Same as leaking any other address bits? Caches are "virtually
> indexed",

Not on arm64, although I don’t see how that is relevant if you are
trying to defeat kaslr.

> and tag does not come into play...
>

Well, I must be missing something then, because I don’t see how
knowledge about which userland address shares a cache way with a kernel
address can leak anything beyond the bits that make up the index (i.e.,
which cache way is being shared)

> Maybe this explains it?
>

No not really. It explains how cache timing can be used as a side
channel, not how it defeats kaslr.

Thanks,
Ard.
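
The index-versus-tag distinction debated above can be made concrete with a small sketch. It assumes an illustrative cache geometry (64-byte lines, 128 sets) that is not taken from the thread or from any particular CPU, and it ignores address translation entirely; the function names are hypothetical. It only shows that two addresses collide in the cache when their index bits match, whatever their tag bits are.

```c
#include <stdint.h>

/* Illustrative geometry only: 64-byte lines, 128 sets. Not a specific CPU. */
#define LINE_BITS 6u
#define SET_BITS  7u

/* Bits [6, 13) of the address select the set the line is placed in. */
static uint64_t cache_set_index(uint64_t addr)
{
	return (addr >> LINE_BITS) & ((1ull << SET_BITS) - 1);
}

/* Everything above the index is the tag, compared only after set lookup. */
static uint64_t cache_tag(uint64_t addr)
{
	return addr >> (LINE_BITS + SET_BITS);
}

/* Two addresses contend for the same set iff their index bits are equal. */
static int same_set(uint64_t a, uint64_t b)
{
	return cache_set_index(a) == cache_set_index(b);
}
```

With this geometry, `same_set(0x401040, 0xffff000008081040)` holds although the two tags differ, so an eviction observed from userspace would reveal only the seven index bits of the other address. Whether that is enough to defeat KASLR is the open question in the exchange; the sketch does not settle it.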
Creating cyclecounter and lock member in timecounter structure [ Was Re: [RFC 1/4] drm/i915/perf: Add support to correlate GPU timestamp with system time]
Hi,

We need input on a possible optimization of the timecounter/cyclecounter
structures and their usage. This mail is in response to the review of
patch https://patchwork.freedesktop.org/patch/188448/.

As Chris observes below, about a dozen timecounter users in the kernel
define the following members individually:

	spinlock_t lock;
	struct cyclecounter cc;
	struct timecounter tc;

Can we move lock and cc into tc? That would be more convenient, and it
would also allow unifying the locking and overflow watchdog handling
across all drivers.

Please suggest.

Thanks
Sagar

On 11/15/2017 5:55 PM, Chris Wilson wrote:
> Quoting Sagar Arun Kamble (2017-11-15 12:13:51)
>>  #include
>>  #include
>> @@ -2149,6 +2150,14 @@ struct i915_perf_stream {
>>  	 * @oa_config: The OA configuration used by the stream.
>>  	 */
>>  	struct i915_oa_config *oa_config;
>> +
>> +	/**
>> +	 * System time correlation variables.
>> +	 */
>> +	struct cyclecounter cc;
>> +	spinlock_t systime_lock;
>> +	struct timespec64 start_systime;
>> +	struct timecounter tc;
>
> This pattern is repeated a lot by struct timecounter users. (I'm still
> trying to understand why the common case is not catered for by a
> convenience timecounter api.)
>
>>  };
>>
>>  /**
>> diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
>> index 00be015..72ddc34 100644
>> --- a/drivers/gpu/drm/i915/i915_perf.c
>> +++ b/drivers/gpu/drm/i915/i915_perf.c
>> @@ -192,6 +192,7 @@
>>   */
>>
>>  #include
>> +#include
>>  #include
>>  #include
>>
>> @@ -2391,6 +2392,56 @@ static unsigned int i915_perf_poll(struct file *file, poll_table *wait)
>>  }
>>
>>  /**
>> + * i915_cyclecounter_read - read raw cycle/timestamp counter
>> + * @cc: cyclecounter structure
>> + */
>> +static u64 i915_cyclecounter_read(const struct cyclecounter *cc)
>> +{
>> +	struct i915_perf_stream *stream = container_of(cc, typeof(*stream), cc);
>> +	struct drm_i915_private *dev_priv = stream->dev_priv;
>> +	u64 ts_count;
>> +
>> +	intel_runtime_pm_get(dev_priv);
>> +	ts_count = I915_READ64_2x32(GEN4_TIMESTAMP,
>> +				    GEN7_TIMESTAMP_UDW);
>> +	intel_runtime_pm_put(dev_priv);
>> +
>> +	return ts_count;
>> +}
>> +
>> +static void i915_perf_init_cyclecounter(struct i915_perf_stream *stream)
>> +{
>> +	struct drm_i915_private *dev_priv = stream->dev_priv;
>> +	int cs_ts_freq = dev_priv->perf.oa.timestamp_frequency;
>> +	struct cyclecounter *cc = &stream->cc;
>> +	u32 maxsec;
>> +
>> +	cc->read = i915_cyclecounter_read;
>> +	cc->mask = CYCLECOUNTER_MASK(CS_TIMESTAMP_WIDTH(dev_priv));
>> +	maxsec = cc->mask / cs_ts_freq;
>> +
>> +	clocks_calc_mult_shift(&cc->mult, &cc->shift, cs_ts_freq,
>> +			       NSEC_PER_SEC, maxsec);
>> +}
>> +
>> +static void i915_perf_init_timecounter(struct i915_perf_stream *stream)
>> +{
>> +#define SYSTIME_START_OFFSET	35 /* Counter read takes about 350us */
>> +	unsigned long flags;
>> +	u64 ns;
>> +
>> +	i915_perf_init_cyclecounter(stream);
>> +	spin_lock_init(&stream->systime_lock);
>> +
>> +	getnstimeofday64(&stream->start_systime);
>> +	ns = timespec64_to_ns(&stream->start_systime) + SYSTIME_START_OFFSET;
>
> Use ktime directly. Or else Arnd will be back with a patch to fix it.
> (All non-ktime interfaces are effectively deprecated; obsolete for
> drivers.)
>
>> +	spin_lock_irqsave(&stream->systime_lock, flags);
>> +	timecounter_init(&stream->tc, &stream->cc, ns);
>> +	spin_unlock_irqrestore(&stream->systime_lock, flags);
>> +}
>> +
>> +/**
>>   * i915_perf_enable_locked - handle `I915_PERF_IOCTL_ENABLE` ioctl
>>   * @stream: A disabled i915 perf stream
>>   *
>> @@ -2408,6 +2459,8 @@ static void i915_perf_enable_locked(struct i915_perf_stream *stream)
>>  	/* Allow stream->ops->enable() to refer to this */
>>  	stream->enabled = true;
>>
>> +	i915_perf_init_timecounter(stream);
>> +
>>  	if (stream->ops->enable)
>>  		stream->ops->enable(stream);
>>  }
>> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
>> index cfdf4f8..e7e6966 100644
>> --- a/drivers/gpu/drm/i915/i915_reg.h
>> +++ b/drivers/gpu/drm/i915/i915_reg.h
>> @@ -8882,6 +8882,12 @@ enum skl_power_gate {
>>
>>  /* Gen4+ Timestamp and Pipe Frame time stamp registers */
>>  #define GEN4_TIMESTAMP		_MMIO(0x2358)
>> +#define GEN7_TIMESTAMP_UDW	_MMIO(0x235C)
>> +#define PRE_GEN7_TIMESTAMP_WIDTH	32
>> +#define GEN7_TIMESTAMP_WIDTH		36
>> +#define CS_TIMESTAMP_WIDTH(dev_priv) \
>> +	(INTEL_GEN(dev_priv) < 7 ? PRE_GEN7_TIMESTAMP_WIDTH : \
>> +				   GEN7_TIMESTAMP_WIDTH)
>
> s/PRE_GEN7/GEN4/ would be consistent.
> If you really want to add support for earlier, I9XX_.
>
> Ok. I can accept the justification, and we are not the only ones who do
> the cyclecounter -> timecounter correction like this.
> -Chris
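
To make the proposal at the top of this mail easier to discuss, here is a rough userspace model of what "lock and cc inside tc" could look like. Everything in it is an assumption for illustration: `struct locked_timecounter`, the `ltc_*` helpers, and the `fake_read` counter are hypothetical names, an `atomic_flag` stands in for `spinlock_t`, and the fractional-nanosecond handling of the real `struct timecounter` is left out. It is not the kernel API and not a proposed patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace model; NOT the kernel's cyclecounter/timecounter. */
struct cyclecounter_model {
	uint64_t (*read)(const struct cyclecounter_model *cc);
	uint64_t mask;		/* valid bits of the hardware counter */
	uint32_t mult;		/* ns = (cycles * mult) >> shift */
	uint32_t shift;
};

struct locked_timecounter {
	atomic_flag lock;		/* stands in for spinlock_t */
	struct cyclecounter_model cc;	/* embedded instead of pointed to */
	uint64_t cycle_last;
	uint64_t nsec;
};

/* Stand-in for a hardware timestamp register, settable by the caller. */
static uint64_t fake_cycles;
static uint64_t fake_read(const struct cyclecounter_model *cc)
{
	(void)cc;
	return fake_cycles;
}

/* Caller fills tc->cc first, then seeds the time base. */
static void ltc_init(struct locked_timecounter *tc, uint64_t start_ns)
{
	atomic_flag_clear(&tc->lock);
	tc->cycle_last = tc->cc.read(&tc->cc);
	tc->nsec = start_ns;
}

/* Locking lives in the helper, so drivers no longer open-code it. */
static uint64_t ltc_read(struct locked_timecounter *tc)
{
	uint64_t now, delta, ns;

	while (atomic_flag_test_and_set_explicit(&tc->lock, memory_order_acquire))
		;	/* spin */
	now = tc->cc.read(&tc->cc);
	delta = (now - tc->cycle_last) & tc->cc.mask;	/* wrap-safe */
	tc->cycle_last = now;
	tc->nsec += (delta * tc->cc.mult) >> tc->cc.shift;
	ns = tc->nsec;
	atomic_flag_clear_explicit(&tc->lock, memory_order_release);
	return ns;
}
```

One trade-off of embedding the lock is that every user pays for it, including users that already serialize access by other means, so a real API change would probably need an unlocked variant as well.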