from:"ian"

[PATCH][next] dax: remove redundant assignment to variable rc

2024-04-15 Thread Colin Ian King

The variable rc is being assigned an value and then is being re-assigned
a new value in the next statement. The assignment is redundant and can
be removed.

Cleans up clang scan build warning:
drivers/dax/bus.c:1207:2: warning: Value stored to 'rc' is never
read [deadcode.DeadStores]

Signed-off-by: Colin Ian King 
---
 drivers/dax/bus.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 797e1ebff299..f758afbf8f09 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1204,7 +1204,6 @@ static ssize_t mapping_store(struct device *dev, struct 
device_attribute *attr,
if (rc)
return rc;
 
-   rc = -ENXIO;
rc = down_write_killable(_region_rwsem);
if (rc)
return rc;
-- 
2.39.2

Re: [RFC PATCH 0/4] perf: Correlating user process data to samples

2024-04-11 Thread Ian Rogers

On Thu, Apr 11, 2024 at 5:17 PM Beau Belgrave  wrote:
>
> In the Open Telemetry profiling SIG [1], we are trying to find a way to
> grab a tracing association quickly on a per-sample basis. The team at
> Elastic has a bespoke way to do this [2], however, I'd like to see a
> more general way to achieve this. The folks I've been talking with seem
> open to the idea of just having a TLS value for this we could capture

Presumably TLS == Thread Local Storage.

> upon each sample. We could then just state, Open Telemetry SDKs should
> have a TLS value for span correlation. However, we need a way to sample
> the TLS or other value(s) when a sampling event is generated. This is
> supported today on Windows via EventActivityIdControl() [3]. Since
> Open Telemetry works on both Windows and Linux, ideally we can do
> something as efficient for Linux based workloads.
>
> This series is to explore how it would be best possible to collect
> supporting data from a user process when a profile sample is collected.
> Having a value stored in TLS makes a lot of sense for this however
> there are other ways to explore. Whatever is chosen, kernel samples
> taken in process context should be able to get this supporting data.
> In these patches on X64 the fsbase and gsbase are used for this.
>
> An option to explore suggested by Mathieu Desnoyers is to utilize rseq
> for processes to register a value location that can be included when
> profiling if desired. This would allow a tighter contract between user
> processes and a profiler.  It would allow better labeling/categorizing
> the correlation values.

It is hard to understand this idea. Are you saying stash a cookie in
TLS for samples to capture to indicate an activity? Restartable
sequences are about preemption on a CPU not of a thread, so at least
my intuition is that they feel different. You could stash information
like this today by changing the thread name which generates comm
events. I've wondered about having similar information in some form of
reserved for profiling stack slot, for example, to stash a pointer to
the name of a function being interpreted. Snapshotting all of a stack
is bad performance wise and for security. A stack slot would be able
to deal with nesting.

> An idea flow would look like this:
> User Task   Profile
> do_work();  sample() -> IP + No activity
> ...
> set_activity(123);
> ...
> do_work();  sample() -> IP + activity (123)
> ...
> set_activity(124);
> ...
> do_work();  sample() -> IP + activity (124)
>
> Ideally, the set_activity() method would not be a syscall. It needs to
> be very cheap as this should not bottleneck work. Ideally this is just
> a memcpy of 16-20 bytes as it is on Windows via EventActivityIdControl()
> using EVENT_ACTIVITY_CTRL_SET_ID.
>
> For those not aware, Open Telemetry allows collecting data from multiple
> machines and show where time was spent. The tracing context is already
> available for logs, but not for profiling samples. The idea is to show
> where slowdowns occur and have profile samples to explain why they
> slowed down. This must be possible without having to track context
> switches to do this correlation. This is because the profiling rates
> are typically 20hz - 1Khz, while the context switching rates are much
> higher. We do not want to have to consume high context switch rates
> just to know a correlation for a 20hz signal. Often these 20hz signals
> are always enabled in some environments.
>
> Regardless if TLS, rseq, or other source is used I believe we will need
> a way for perf_events to include it within a sample. The changes in this
> series show how it could be done with TLS. There is some factoring work
> under perf to make it easier to add more dump types using the existing
> ABI. This is mostly to make the patches clearer, certainly the refactor
> parts could get dropped and we could have duplicated/specialized paths.

fs and gs may be used for more than just the C runtime's TLS. For
example, they may be used by emulators or managed runtimes. I'm not
clear why this specific case couldn't be handled through BPF.

Thanks,
Ian

> 1. https://opentelemetry.io/blog/2024/profiling/
> 2. 
> https://www.elastic.co/blog/continuous-profiling-distributed-tracing-correlation
> 3. 
> https://learn.microsoft.com/en-us/windows/win32/api/evntprov/nf-evntprov-eventactivityidcontrol
>
> Beau Belgrave (4):
>   perf/core: Introduce perf_prepare_dump_data()
>   perf: Introduce PERF_SAMPLE_TLS_USER sample type
>   perf/core: Factor perf_output_sample_udump()
>   perf/x86/core: Add tls dump support
>
>  arch/Kconfig  |   7 ++
>  arch/x86/Kconfig  |   1 +
>  arch/x86/events/core.c|  14

Re: [PATCH v1 0/4] perf parse-regs: Cleanup config and building

2024-02-14 Thread Ian Rogers

On Wed, Feb 14, 2024 at 3:40 AM Leo Yan  wrote:
>
> Currently, the perf building enables register parsing based on the
> target architecture has supported register feature.
>
> Furthermore, the perf building system needs to maintain a variable
> 'NO_PERF_REGS' and defines macro 'HAVE_PERF_REGS_SUPPORT' for statically
> compiling the tool.
>
> As a result, the perf has no flexibilty for parsing register if an
> architecture doesn't support it. And the source files use the macro
> 'HAVE_PERF_REGS_SUPPORT' to switch on and off the register parsing
> related code, which is not a good practice.
>
> This series is to remove the static building for register parsing. In
> theory, we should can dynamically detect if an arch has support this
> feature and functions can return errors when the feature is not
> supported.
>
> The first patch is to remove unused build configuration
> CONFIG_PERF_REGS.
>
> The second patch is to build perf register functions, without using the
> macro 'HAVE_PERF_REGS_SUPPORT' to statically turn on or off code.
>
> The third patch is to introduce a weak function arch__sample_reg_masks(),
> this function can allow the target arch to return its sample register
> list.  With this change, we can totally remove the macro
> 'HAVE_PERF_REGS_SUPPORT' in the source file.
>
> The forth patch is to clean up the Makefile for removing relevant
> configuration and macro definition, as they are not useful anymore.
>
> I tested this patch set on Arm64 and x86 for building and did a cross
> register parsing ('perf record' on Arm64 and 'perf report' on x86).
>
>
> Leo Yan (4):
>   perf build: Remove unused CONFIG_PERF_REGS
>   perf parse-regs: Always build perf register functions
>   perf parse-regs: Introduce a weak function arch__sample_reg_masks()
>   perf build: Cleanup perf register configuration

Thanks Leo, this is great cleanup! Series:
Reviewed-by: Ian Rogers 

Ian

>  tools/perf/Makefile.config| 25 --
>  tools/perf/arch/arm/util/perf_regs.c  |  7 +++-
>  tools/perf/arch/arm64/util/machine.c  |  2 ++
>  tools/perf/arch/arm64/util/perf_regs.c|  7 +++-
>  tools/perf/arch/csky/util/perf_regs.c |  7 +++-
>  tools/perf/arch/loongarch/util/perf_regs.c|  7 +++-
>  tools/perf/arch/mips/util/perf_regs.c |  7 +++-
>  tools/perf/arch/powerpc/util/perf_regs.c  |  7 +++-
>  tools/perf/arch/riscv/util/perf_regs.c|  7 +++-
>  tools/perf/arch/s390/util/perf_regs.c |  7 +++-
>  tools/perf/arch/x86/util/perf_regs.c  |  7 +++-
>  tools/perf/util/parse-regs-options.c  |  8 ++---
>  .../util/perf-regs-arch/perf_regs_aarch64.c   |  4 ---
>  .../perf/util/perf-regs-arch/perf_regs_arm.c  |  4 ---
>  .../perf/util/perf-regs-arch/perf_regs_csky.c |  4 ---
>  .../util/perf-regs-arch/perf_regs_loongarch.c |  4 ---
>  .../perf/util/perf-regs-arch/perf_regs_mips.c |  4 ---
>  .../util/perf-regs-arch/perf_regs_powerpc.c   |  4 ---
>  .../util/perf-regs-arch/perf_regs_riscv.c |  4 ---
>  .../perf/util/perf-regs-arch/perf_regs_s390.c |  4 ---
>  .../perf/util/perf-regs-arch/perf_regs_x86.c  |  4 ---
>  tools/perf/util/perf_regs.c   | 11 --
>  tools/perf/util/perf_regs.h   | 34 +--
>  23 files changed, 67 insertions(+), 112 deletions(-)
>
> --
> 2.34.1
>

[PATCH][next] fsdax: remove redundant variable 'error'

2023-06-21 Thread Colin Ian King

The variable 'error' is being assigned a value that is never read,
the assignment and the variable and redundant and can be removed.
Cleans up clang scan build warning:

fs/dax.c:1880:10: warning: Although the value stored to 'error' is
used in the enclosing expression, the value is never actually read
from 'error' [deadcode.DeadStores]

Signed-off-by: Colin Ian King 
---
 fs/dax.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 2ababb89918d..cb36c6746fc4 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1830,7 +1830,6 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault 
*vmf, pfn_t *pfnp,
vm_fault_t ret = VM_FAULT_FALLBACK;
pgoff_t max_pgoff;
void *entry;
-   int error;
 
if (vmf->flags & FAULT_FLAG_WRITE)
iter.flags |= IOMAP_WRITE;
@@ -1877,7 +1876,7 @@ static vm_fault_t dax_iomap_pmd_fault(struct vm_fault 
*vmf, pfn_t *pfnp,
}
 
iter.pos = (loff_t)xas.xa_index << PAGE_SHIFT;
-   while ((error = iomap_iter(, ops)) > 0) {
+   while (iomap_iter(, ops) > 0) {
if (iomap_length() < PMD_SIZE)
continue; /* actually breaks out of the loop */
 
-- 
2.39.2

[PATCH] nvdimm/region: Fix spelling mistake "memergion" -> "memregion"

2022-12-05 Thread Colin Ian King

There is a spelling mistake in a dev_warn message. Fix it.

Signed-off-by: Colin Ian King 
---
 drivers/nvdimm/region_devs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvdimm/region_devs.c b/drivers/nvdimm/region_devs.c
index 83dbf398ea84..8f5274b04348 100644
--- a/drivers/nvdimm/region_devs.c
+++ b/drivers/nvdimm/region_devs.c
@@ -80,7 +80,7 @@ static int nd_region_invalidate_memregion(struct nd_region 
*nd_region)
if (IS_ENABLED(CONFIG_NVDIMM_SECURITY_TEST)) {
dev_warn(
_region->dev,
-   "Bypassing cpu_cache_invalidate_memergion() for 
testing!\n");
+   "Bypassing cpu_cache_invalidate_memregion() for 
testing!\n");
goto out;
} else {
dev_err(_region->dev,
-- 
2.38.1

Re: [PATCH] perf/x86: Fix integer overflow when left shifting an integer more than 32 bits

2021-04-20 Thread Colin Ian King

On 20/04/2021 16:31, Peter Zijlstra wrote:
> On Tue, Apr 20, 2021 at 05:03:03PM +0200, Peter Zijlstra wrote:
>> On Tue, Apr 20, 2021 at 03:29:07PM +0100, Colin King wrote:
>>> From: Colin Ian King 
>>>
>>> The 64 bit value read from MSR_ARCH_PERFMON_FIXED_CTR_CTRL is being
>>> bit-wise masked with the value (0x03 << i*4). However, the shifted value
>>> is evaluated using 32 bit arithmetic, so will overflow when i > 8.
>>> Fix this by making 0x03 a ULL so that the shift is performed using
>>> 64 bit arithmetic.
>>>
>>> Addresses-Coverity: ("Unintentional integer overflow")
>>
>> Strange tag that, also inaccurate, wide shifts are UB and don't behave
>> consistently.
>>
>> As is, we've not had hardware with that many fixed counters, but yes,
>> worth fixing I suppose.
> 
> Patch now reads:
> 
> ---
> Subject: perf/x86: Allow for 8 From: Colin Ian King 
> Date: Tue, 20 Apr 2021 15:29:07 +0100
> 
> From: Colin Ian King 
> 
> The 64 bit value read from MSR_ARCH_PERFMON_FIXED_CTR_CTRL is being
> bit-wise masked with the value (0x03 << i*4). However, the shifted value
> is evaluated using 32 bit arithmetic, so will UB when i > 8. Fix this
> by making 0x03 a ULL so that the shift is performed using 64 bit
> arithmetic.
> 
> This makes the arithmetic internally consistent and preparers for the
> day when hardware provides 8 
> Signed-off-by: Colin Ian King 
> Signed-off-by: Peter Zijlstra (Intel) 
> Link: 
> https://lkml.kernel.org/r/20210420142907.382417-1-colin.k...@canonical.com
> ---
>  arch/x86/events/core.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -261,7 +261,7 @@ static bool check_hw_exists(void)
>   for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
>   if (fixed_counter_disabled(i))
>   continue;
> - if (val & (0x03 << i*4)) {
> + if (val & (0x03ULL << i*4)) {
>   bios_fail = 1;
>   val_fail = val;
>   reg_fail = reg;
>

re: phy: nxp-c45: add driver for tja1103

2021-04-20 Thread Colin Ian King

Hi,

Static analysis with Coverity on linux-next has found a potential issue
in drivers/net/phy/nxp-c45-tja11xx.c, function nxp_c45_get_phase_shift.
The analysis by Coverity is as follows:

350 static u64 nxp_c45_get_phase_shift(u64 phase_offset_raw)
351 {
352/* The delay in degree phase is 73.8 + phase_offset_raw * 0.9.
353 * To avoid floating point operations we'll multiply by 10
354 * and get 1 decimal point precision.
355 */
356phase_offset_raw *= 10;

Operands don't affect result (CONSTANT_EXPRESSION_RESULT)
result_independent_of_operands: phase_offset_raw is always assigned 0.

Did you intend to negate the value of phase_offset_raw instead of
assigning it 0? This occurs as the value assigned by "-".

357phase_offset_raw -= phase_offset_raw;
358return div_u64(phase_offset_raw, 9);
359 }

phase_offset_raw -= phase_offset_raw results in phase_offset_raw being
zero, I don't think that was the intent.

Colin

Re: [PATCH v3 0/4] kernfs: proposed locking and concurrency improvement

2021-04-19 Thread Ian Kent

On Mon, 2021-04-19 at 15:56 +0800, Fox Chen wrote:
> On Fri, Apr 9, 2021 at 9:14 AM Ian Kent  wrote:
> > There have been a few instances of contention on the kernfs_mutex
> > during
> > path walks, a case on very large IBM systems seen by myself, a
> > report by
> > Brice Goglin and followed up by Fox Chen, and I've since seen a
> > couple
> > of other reports by CoreOS users.
> > 
> > The common thread is a large number of kernfs path walks leading to
> > slowness of path walks due to kernfs_mutex contention.
> > 
> > The problem being that changes to the VFS over some time have
> > increased
> > it's concurrency capabilities to an extent that kernfs's use of a
> > mutex
> > is no longer appropriate. There's also an issue of walks for non-
> > existent
> > paths causing contention if there are quite a few of them which is
> > a less
> > common problem.
> > 
> > This patch series is relatively straight forward.
> > 
> > All it does is add the ability to take advantage of VFS negative
> > dentry
> > caching to avoid needless dentry alloc/free cycles for lookups of
> > paths
> > that don't exit and change the kernfs_mutex to a read/write
> > semaphore.
> > 
> > The patch that tried to stay in VFS rcu-walk mode during path walks
> > has
> > been dropped for two reasons. First, it doesn't actually give very
> > much
> > improvement and, second, if there's a place where mistakes could go
> > unnoticed it would be in that path. This makes the patch series
> > simpler
> > to review and reduces the likelihood of problems going unnoticed
> > and
> > popping up later.
> > 
> > The patch to use a revision to identify if a directory has changed
> > has
> > also been dropped. If the directory has changed the dentry revision
> > needs to be updated to avoid subsequent rb tree searches and after
> > changing to use a read/write semaphore the update also requires a
> > lock.
> > But the d_lock is the only lock available at this point which might
> > itself be contended.
> > 
> > Changes since v2:
> > - actually fix the inode attribute update locking.
> > - drop the patch that tried to stay in rcu-walk mode.
> > - drop the use a revision to identify if a directory has changed
> > patch.
> > 
> > Changes since v1:
> > - fix locking in .permission() and .getattr() by re-factoring the
> > attribute
> >   handling code.
> > 
> > ---
> > 
> > Ian Kent (4):
> >   kernfs: move revalidate to be near lookup
> >   kernfs: use VFS negative dentry caching
> >   kernfs: switch kernfs to use an rwsem
> >   kernfs: use i_lock to protect concurrent inode updates
> > 
> > 
> >  fs/kernfs/dir.c |  240 +++--
> > --
> >  fs/kernfs/file.c|4 -
> >  fs/kernfs/inode.c   |   18 ++-
> >  fs/kernfs/kernfs-internal.h |5 +
> >  fs/kernfs/mount.c   |   12 +-
> >  fs/kernfs/symlink.c |4 -
> >  include/linux/kernfs.h  |2
> >  7 files changed, 155 insertions(+), 130 deletions(-)
> > 
> > --
> > 
> 
> Hi Ian,
> 
> I tested this patchset with my
> benchmark(https://github.com/foxhlchen/sysfs_benchmark) on a 96 CPUs
> (aws c5) machine.
> 
> The result was promising:
> Before, one open+read+close cycle took 500us without much variation.
> With this patch, the fastest one only takes 30us, though the slowest
> is still around 100us(due to the spinlock). perf report shows no more
> significant mutex contention.

Thanks for this Fox.
I'll have a look through the data a bit later.

For now, I'd like to keep the series as simple as possible.

But there shouldn't be a problem reading and comparing those
attributes between the kernfs node and the inode without taking
the additional lock. So a check could be done and the lock only
taken if an update is needed.

That may well improve that worst case quite a bit, but as I say,
it would need to be a follow up change.

Ian

[PATCH] perf arm64: Fix off-by-one directory paths.

2021-04-16 Thread Ian Rogers

Relative path include works in the regular build due to -I paths but may
break in other situations.

Signed-off-by: Ian Rogers 
---
 tools/perf/arch/arm64/util/kvm-stat.c | 4 ++--
 tools/perf/arch/arm64/util/pmu.c  | 4 ++--
 tools/perf/arch/arm64/util/unwind-libunwind.c | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/arch/arm64/util/kvm-stat.c 
b/tools/perf/arch/arm64/util/kvm-stat.c
index 50376b9062c1..2303256b7d05 100644
--- a/tools/perf/arch/arm64/util/kvm-stat.c
+++ b/tools/perf/arch/arm64/util/kvm-stat.c
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
 #include 
-#include "../../util/evsel.h"
-#include "../../util/kvm-stat.h"
+#include "../../../util/evsel.h"
+#include "../../../util/kvm-stat.h"
 #include "arm64_exception_types.h"
 #include "debug.h"
 
diff --git a/tools/perf/arch/arm64/util/pmu.c b/tools/perf/arch/arm64/util/pmu.c
index d3259d61ca75..2234fbd0a912 100644
--- a/tools/perf/arch/arm64/util/pmu.c
+++ b/tools/perf/arch/arm64/util/pmu.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#include "../../util/cpumap.h"
-#include "../../util/pmu.h"
+#include "../../../util/cpumap.h"
+#include "../../../util/pmu.h"
 
 struct pmu_events_map *pmu_events_map__find(void)
 {
diff --git a/tools/perf/arch/arm64/util/unwind-libunwind.c 
b/tools/perf/arch/arm64/util/unwind-libunwind.c
index 1495a9523a23..5aecf88e3de6 100644
--- a/tools/perf/arch/arm64/util/unwind-libunwind.c
+++ b/tools/perf/arch/arm64/util/unwind-libunwind.c
@@ -4,9 +4,9 @@
 #ifndef REMOTE_UNWIND_LIBUNWIND
 #include 
 #include "perf_regs.h"
-#include "../../util/unwind.h"
+#include "../../../util/unwind.h"
 #endif
-#include "../../util/debug.h"
+#include "../../../util/debug.h"
 
 int LIBUNWIND__ARCH_REG_ID(int regnum)
 {
-- 
2.31.1.368.gbe11c130af-goog

Re: [PATCH 55/57] staging: comedi: drivers: ni_mio_common: Move 'range_ni_E_ao_ext' to where it is used

2021-04-15 Thread Ian Abbott


On 14/04/2021 19:11, Lee Jones wrote:

... and mark it as __maybe_unused since not all users of the
header file reference it.

Fixes the following W=1 kernel build warning(s):

  drivers/staging/comedi/drivers/ni_mio_common.c:163:35: warning: 
‘range_ni_E_ao_ext’ defined but not used [-Wunused-const-variable=]

Cc: Ian Abbott 
Cc: H Hartley Sweeten 
Cc: Greg Kroah-Hartman 
Cc: Thierry Reding 
Cc: "Uwe Kleine-König" 
Cc: Lee Jones 
Cc: "David A. Schleef" 
Cc: Mori Hess 
Cc: Truxton Fulton 
Cc: linux-stag...@lists.linux.dev
Cc: linux-...@vger.kernel.org
Signed-off-by: Lee Jones 
---
  drivers/staging/comedi/drivers/ni_mio_common.c | 9 -
  drivers/staging/comedi/drivers/ni_stc.h| 9 -
  2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/staging/comedi/drivers/ni_mio_common.c 
b/drivers/staging/comedi/drivers/ni_mio_common.c
index 4f80a4991f953..37615b4e2c10d 100644
--- a/drivers/staging/comedi/drivers/ni_mio_common.c
+++ b/drivers/staging/comedi/drivers/ni_mio_common.c
@@ -160,15 +160,6 @@ static const struct comedi_lrange range_ni_M_ai_628x = {
}
  };
  
-static const struct comedi_lrange range_ni_E_ao_ext = {

-   4, {
-   BIP_RANGE(10),
-   UNI_RANGE(10),
-   RANGE_ext(-1, 1),
-   RANGE_ext(0, 1)
-   }
-};
-
  static const struct comedi_lrange *const ni_range_lkup[] = {
[ai_gain_16] = _ni_E_ai,
[ai_gain_8] = _ni_E_ai_limited,
diff --git a/drivers/staging/comedi/drivers/ni_stc.h 
b/drivers/staging/comedi/drivers/ni_stc.h
index fbc0b753a0f59..0822e65f709dd 100644
--- a/drivers/staging/comedi/drivers/ni_stc.h
+++ b/drivers/staging/comedi/drivers/ni_stc.h
@@ -1137,6 +1137,13 @@ struct ni_private {
u8 rgout0_usage;
  };
  
-static const struct comedi_lrange range_ni_E_ao_ext;

+static const struct comedi_lrange __maybe_unused range_ni_E_ao_ext = {
+   4, {
+   BIP_RANGE(10),
+   UNI_RANGE(10),
+   RANGE_ext(-1, 1),
+   RANGE_ext(0, 1)
+   }
+};
  
  #endif /* _COMEDI_NI_STC_H */




I think it is better where it is for now with its fellow struct 
comedi_lrange variables, but feel free to mark it as __maybe_unused.


(Really, the #include "ni_mio_common.c" mess needs sorting out sometime.)

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH 48/57] staging: comedi: drivers: jr3_pci: Remove set but unused variable 'min_full_scale'

2021-04-15 Thread Ian Abbott


On 14/04/2021 19:11, Lee Jones wrote:

Fixes the following W=1 kernel build warning(s):

  drivers/staging/comedi/drivers/jr3_pci.c: In function 
‘jr3_pci_poll_subdevice’:
  drivers/staging/comedi/drivers/jr3_pci.c:507:22: warning: variable 
‘min_full_scale’ set but not used [-Wunused-but-set-variable]

Cc: Ian Abbott 
Cc: H Hartley Sweeten 
Cc: Greg Kroah-Hartman 
Cc: "Alexander A. Klimov" 
Cc: Anders Blomdell 
Cc: linux-stag...@lists.linux.dev
Signed-off-by: Lee Jones 
---
  drivers/staging/comedi/drivers/jr3_pci.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/staging/comedi/drivers/jr3_pci.c 
b/drivers/staging/comedi/drivers/jr3_pci.c
index 7a02c4fa3cda8..afa2f8d5c8c0c 100644
--- a/drivers/staging/comedi/drivers/jr3_pci.c
+++ b/drivers/staging/comedi/drivers/jr3_pci.c
@@ -504,10 +504,9 @@ jr3_pci_poll_subdevice(struct comedi_subdevice *s)
result = poll_delay_min_max(20, 100);
} else {
/* Set full scale */
-   struct six_axis_t min_full_scale;
struct six_axis_t max_full_scale;
  
-			min_full_scale = get_min_full_scales(sensor);

+   get_min_full_scales(sensor);
max_full_scale = get_max_full_scales(sensor);
set_full_scales(sensor, max_full_scale);
  



get_min_full_scales() could be removed altogether.  It was only being 
called originally so the driver could printk the values, but those 
printks have since been removed.


--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH][next] can: etas_es58x: Fix potential null pointer dereference on pointer cf

2021-04-15 Thread Colin Ian King

On 15/04/2021 10:03, Marc Kleine-Budde wrote:
> On 15.04.2021 09:55:35, Colin King wrote:
>> From: Colin Ian King 
>>
>> The pointer cf is being null checked earlier in the code, however the
>> update of the rx_bytes statistics is dereferencing cf without null
>> checking cf.  Fix this by moving the statement into the following code
>> block that has a null cf check.
>>
>> Addresses-Coverity: ("Dereference after null check")
>> Fixes: 8537257874e9 ("can: etas_es58x: add core support for ETAS ES58X CAN 
>> USB interfaces")
>> Signed-off-by: Colin Ian King 
> 
> A somewhat different fix is already in net-next/master
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=e2b1e4b532abdd39bfb7313146153815e370d60c

+1 on that

> 
> Marc
>

Re: [RFC] Improve workload error in 'perf record'

2021-04-14 Thread Ian Rogers

On Wed, Apr 14, 2021 at 6:16 AM Arnaldo Carvalho de Melo
 wrote:
>
> Hi,
>
> Please take a look,
>
> Best regards,

Acked-by: Ian Rogers 

Having been confused by this for a case in the past, thanks! It'd be
nice for code coverage's sake to have a shell test on this.

Thanks,
Ian

> - Arnaldo
>
> Arnaldo Carvalho de Melo (2):
>   perf evlist: Add a method to return the list of evsels as a string
>   perf record: Improve 'Workload failed' message printing events + what
> was exec'ed
>
>  tools/perf/builtin-record.c |  8 ++--
>  tools/perf/util/evlist.c| 19 +++
>  tools/perf/util/evlist.h|  2 ++
>  3 files changed, 27 insertions(+), 2 deletions(-)
>
> --
> 2.26.2
>

Re: mmotm 2021-04-11-20-47 uploaded (ni_routes_test.c)

2021-04-13 Thread Ian Abbott

: ni_routes_test.c:(.text+0xb3d): undefined reference to `ni_find_route_set'
ld: ni_routes_test.c:(.text+0xb74): undefined reference to `ni_find_route_set'
ld: ni_routes_test.c:(.text+0xbb2): undefined reference to `ni_find_route_set'
ld: ni_routes_test.c:(.text+0xbfa): undefined reference to `ni_find_route_set'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`test_ni_assign_device_routes':
ni_routes_test.c:(.text+0xc6c): undefined reference to `ni_assign_device_routes'
ld: ni_routes_test.c:(.text+0xeb6): undefined reference to 
`ni_assign_device_routes'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`test_ni_count_valid_routes':
ni_routes_test.c:(.text+0xf9c): undefined reference to `ni_count_valid_routes'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`ni_get_reg_value_roffs.constprop.7':
ni_routes_test.c:(.text+0xfef): undefined reference to `ni_find_route_source'
ld: ni_routes_test.c:(.text+0x1015): undefined reference to 
`ni_route_to_register'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`test_route_is_valid':
ni_routes_test.c:(.text+0x1074): undefined reference to `ni_route_to_register'
ld: ni_routes_test.c:(.text+0x10ab): undefined reference to 
`ni_route_to_register'
ld: ni_routes_test.c:(.text+0x10e2): undefined reference to 
`ni_route_to_register'
ld: ni_routes_test.c:(.text+0x1119): undefined reference to 
`ni_route_to_register'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`test_route_register_is_valid':
ni_routes_test.c:(.text+0x115a): undefined reference to `ni_find_route_source'
ld: ni_routes_test.c:(.text+0x118e): undefined reference to 
`ni_find_route_source'
ld: ni_routes_test.c:(.text+0x11c5): undefined reference to 
`ni_find_route_source'
ld: ni_routes_test.c:(.text+0x11fc): undefined reference to 
`ni_find_route_source'
ld: drivers/staging/comedi/drivers/tests/ni_routes_test.o: in function 
`test_ni_sort_device_routes':
ni_routes_test.c:(.text+0x16ef): undefined reference to `ni_sort_device_routes'


Full randconfig file is attached.


Hi all,

That should be fixed by commit e7442ffe1cc5 ("staging: comedi: Kconfig: 
Fix COMEDI_TESTS_NI_ROUTES selections") in linux-next master.


Sorry for the trouble!

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH][next] KEYS: trusted: Fix missing null return from kzalloc call

2021-04-12 Thread Colin Ian King

On 12/04/2021 17:48, James Bottomley wrote:
> On Mon, 2021-04-12 at 17:01 +0100, Colin King wrote:
>> From: Colin Ian King 
>>
>> The kzalloc call can return null with the GFP_KERNEL flag so
>> add a null check and exit via a new error exit label. Use the
>> same exit error label for another error path too.
>>
>> Addresses-Coverity: ("Dereference null return value")
>> Fixes: 830027e2cb55 ("KEYS: trusted: Add generic trusted keys
>> framework")
>> Signed-off-by: Colin Ian King 
>> ---
>>  security/keys/trusted-keys/trusted_core.c | 6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/security/keys/trusted-keys/trusted_core.c
>> b/security/keys/trusted-keys/trusted_core.c
>> index ec3a066a4b42..90774793f0b1 100644
>> --- a/security/keys/trusted-keys/trusted_core.c
>> +++ b/security/keys/trusted-keys/trusted_core.c
>> @@ -116,11 +116,13 @@ static struct trusted_key_payload
>> *trusted_payload_alloc(struct key *key)
>>  
>>  ret = key_payload_reserve(key, sizeof(*p));
>>  if (ret < 0)
>> -return p;
>> +goto err;
>>  p = kzalloc(sizeof(*p), GFP_KERNEL);
>> +if (!p)
>> +goto err;
>>  
>>  p->migratable = migratable;
>> -
>> +err:
>>  return p;
> 
> This is clearly a code migration bug in 
> 
> commit 251c85bd106099e6f388a89e88e12d14de2c9cda
> Author: Sumit Garg 
> Date:   Mon Mar 1 18:41:24 2021 +0530
> 
> KEYS: trusted: Add generic trusted keys framework
> 
> Which has for addition to trusted_core.c:
> 
> +static struct trusted_key_payload *trusted_payload_alloc(struct key
> *key)
> +{
> +   struct trusted_key_payload *p = NULL;
> +   int ret;
> +
> +   ret = key_payload_reserve(key, sizeof(*p));
> +   if (ret < 0)
> +   return p;
> +   p = kzalloc(sizeof(*p), GFP_KERNEL);
> +
> +   p->migratable = migratable;
> +
> +   return p;
> +}
> 
> And for trusted_tpm1.c:
> 
> -static struct trusted_key_payload *trusted_payload_alloc(struct key
> *key)
> -{
> -   struct trusted_key_payload *p = NULL;
> -   int ret;
> -
> -   ret = key_payload_reserve(key, sizeof *p);
> -   if (ret < 0)
> -   return p;
> -   p = kzalloc(sizeof *p, GFP_KERNEL);
> -   if (p)
> -   p->migratable = 1; /* migratable by default */
> -   return p;
> -}
> 
> The trusted_tpm1.c code was correct and we got this bug introduced by
> what should have been a simple cut and paste ... how did that happen? 
> And therefore, how safe is the rest of the extraction into
> trusted_core.c?
> 

fortunately it gets caught by static analysis, but it does make me also
concerned about what else has changed and how this gets through review.

> James
> 
>

Re: [PATCH] Staging: Remove line to fix checkpatch error

2021-04-12 Thread Ian Abbott


On 11/04/2021 21:49, tawahpeggy wrote:

remove one empty line.CHECK: Please don't use multiple blank lines

Signed-off-by: tawahpeggy 

---
  drivers/staging/comedi/comedi_pcmcia.mod.c | 1 -
  1 file changed, 0 insertion(+), 1 deletion(-)
  create mode 100644 drivers/staging/comedi/comedi_pcmcia.mod.c

diff --git a/drivers/staging/comedi/comedi_pcmcia.mod.c 
b/drivers/staging/comedi/comedi_pcmcia.mod.c
index 0904b8765afs96..3984db1a39c8
--- /dev/null
+++ b/drivers/staging/comedi/comedi_pcmcia.mod.c


The .mod.c files are not really part of the Linux kernel source code. 
They are generated during the kernel build process.  There is no point 
checking them with checkpatch.pl.  If you are adding them to your git 
repository, then you are doing something wrong.


--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] xfs: fix return of uninitialized value in variable error

2021-04-09 Thread Colin Ian King

On 09/04/2021 15:28, Brian Foster wrote:
> On Fri, Apr 09, 2021 at 03:18:34PM +0100, Colin King wrote:
>> From: Colin Ian King 
>>
>> A previous commit removed a call to xfs_attr3_leaf_read that
>> assigned an error return code to variable error. We now have
>> a few early error return paths to label 'out' that return
>> error if error is set; however error now is uninitialized
>> so potentially garbage is being returned.  Fix this by setting
>> error to zero to restore the original behaviour where error
>> was zero at the label 'restart'.
>>
>> Addresses-Coverity: ("Uninitialized scalar variable")
>> Fixes: 07120f1abdff ("xfs: Add xfs_has_attr and subroutines")
>> Signed-off-by: Colin Ian King 
>> ---
>>  fs/xfs/libxfs/xfs_attr.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 472b3039eabb..902e5f7e6642 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -928,6 +928,7 @@ xfs_attr_node_addname(
>>   * Search to see if name already exists, and get back a pointer
>>   * to where it should go.
>>   */
>> +error = 0;
>>  retval = xfs_attr_node_hasname(args, );
>>  if (retval != -ENOATTR && retval != -EEXIST)
>>  goto out;
> 
> I think it would be nicer to initialize at the top of the function as
> opposed to try and "preserve" historical behavior, but that nit aside:

I did think about that, but this fix does ensure it's zero'd for each
iteration rather than just the once, so it should catch any code changes
later on that may loop back to this point were error is non-zero.

> 
> Reviewed-by: Brian Foster 
> 
>> -- 
>> 2.30.2
>>
>

cnic: issue with double assignment to ictx->xstorm_st_context.common.flags

2021-04-09 Thread Colin Ian King

Hi,

Analysis of linux with Coverity has detected an issue with the
assignment of ictx->xstorm_st_context.common.fla in
drivers/net/ethernet/broadcom/cnic.c in function cnic_setup_bnx2x_ctx.

This was introduced a while back with the following commit:

commit 619c5cb6885b936c44ae1422ef805b69c6291485
Author: Vlad Zolotarov 
Date:   Tue Jun 14 14:33:44 2011 +0300

New 7.0 FW: bnx2x, cnic, bnx2i, bnx2fc

The static analysis is as follows:

1761ictx->xstorm_st_context.common.flags =

Unused value (UNUSED_VALUE)assigned_value: Assigning value 1 to
ictx->xstorm_st_context.common.flags here, but that stored value is
overwritten before it can be used.

17621 <<
XSTORM_COMMON_CONTEXT_SECTION_PHYSQ_INITIALIZED_SHIFT;
1763ictx->xstorm_st_context.common.flags =

value_overwrite: Overwriting previous write to
ictx->xstorm_st_context.common.flags with value from port << 1.

1764port << XSTORM_COMMON_CONTEXT_SECTION_PBF_PORT_SHIFT;
1765
1766ictx->tstorm_st_context.iscsi.hdr_bytes_2_fetch =
ISCSI_HEADER_SIZE;

The re-assignment of ictx->xstorm_st_context.common.flags in line 1763
is overwriting the value assigned on line 1761.  Should the = operator
on the re-assignment be an |= operator instead?

Colin

Re: [PATCH v3 2/4] kernfs: use VFS negative dentry caching

2021-04-09 Thread Ian Kent

On Fri, 2021-04-09 at 16:26 +0800, Ian Kent wrote:
> On Fri, 2021-04-09 at 01:35 +, Al Viro wrote:
> > On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote:
> > > + parent = kernfs_dentry_node(dentry->d_parent);
> > > + if (parent) {
> > > + const void *ns = NULL;
> > > +
> > > + if (kernfs_ns_enabled(parent))
> > > + ns = kernfs_info(dentry->d_parent-
> > > > d_sb)->ns;
> > 
> > For any dentry d, we have d->d_parent->d_sb == d->d_sb.  All
> > the time.
> > If you ever run into the case where that would not be true, you've
> > found
> > a critical bug.
> 
> Right, yes.
> 
> > > + kn = kernfs_find_ns(parent, dentry-
> > > > d_name.name, ns);
> > > + if (kn)
> > > + goto out_bad;
> > > + }
> > 
> > Umm...  What's to prevent a race with successful rename(2)?  IOW,
> > what's
> > there to stabilize ->d_parent and ->d_name while we are in that
> > function?
> 
> Indeed, glad you looked at this.
> 
> Now I'm wondering how kerfs_iop_rename() protects itself from
> concurrent kernfs_rename_ns() ... 

As I thought ... I haven't done an exhaustive search but I can't find
any file system that doesn't call back into kernfs from
kernfs_syscall_ops (if provided at kernfs root creation).

I don't see anything that uses kernfs that defines a .rename() op
but if there was one it would be expected to call back into kernfs
at which point it would block on kernfs_mutex (kernfs_rwsem) until
it's released.

So I don't think there can be changes in this case due to the lock
taken just above the code your questioning.

I need to think a bit about whether the dentry being negative (ie.
not having kernfs node) could allow bad things to happen ...

Or am I misunderstanding the race your pointing out here?

Ian

Re: [PATCH][next] mlxsw: spectrum_router: remove redundant initialization of variable force

2021-04-09 Thread Colin Ian King

On 29/03/2021 08:13, Ido Schimmel wrote:
> On Sat, Mar 27, 2021 at 10:33:34PM +, Colin King wrote:
>> From: Colin Ian King 
>>
>> The variable force is being initialized with a value that is
>> never read and it is being updated later with a new value. The
>> initialization is redundant and can be removed.
>>
>> Addresses-Coverity: ("Unused value")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
>> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> index 6ccaa194733b..ff240e3c29f7 100644
>> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
>> @@ -5059,7 +5059,7 @@ mlxsw_sp_nexthop_obj_bucket_adj_update(struct mlxsw_sp 
>> *mlxsw_sp,
>>  {
>>  u16 bucket_index = info->nh_res_bucket->bucket_index;
>>  struct netlink_ext_ack *extack = info->extack;
>> -bool force = info->nh_res_bucket->force;
>> +bool force;
> 
> Actually, there is a bug to be fixed here:
> 
> ```
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c 
> b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> index 6ccaa194733b..41259c0004d1 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
> @@ -5068,8 +5068,9 @@ mlxsw_sp_nexthop_obj_bucket_adj_update(struct mlxsw_sp 
> *mlxsw_sp,
> /* No point in trying an atomic replacement if the idle timer interval
>  * is smaller than the interval in which we query and clear activity.
>  */
> -   force = info->nh_res_bucket->idle_timer_ms <
> -   MLXSW_SP_NH_GRP_ACTIVITY_UPDATE_INTERVAL;
> +   if (!force && info->nh_res_bucket->idle_timer_ms <
> +   MLXSW_SP_NH_GRP_ACTIVITY_UPDATE_INTERVAL)
> +   force = true;
>  
> adj_index = nh->nhgi->adj_index + bucket_index;
> err = mlxsw_sp_nexthop_update(mlxsw_sp, adj_index, nh, force, 
> ratr_pl);
> ```
> 
> We should only fallback to a non-atomic replacement when the current
> replacement is atomic and the idle timer is too short.
> 
> We currently ignore the value of 'force'. This means that a non-atomic
> replacement ('force' is true) can be made atomic if idle timer is larger
> than 1 second.
> 
> Colin, do you mind if I submit it formally as a fix later this week? I
> want to run it through our usual process. Will mention you in
> Reported-by, obviously.

Sure. Good idea.

> 
>>  char ratr_pl_new[MLXSW_REG_RATR_LEN];
>>  char ratr_pl[MLXSW_REG_RATR_LEN];
>>  u32 adj_index;
>> -- 
>> 2.30.2
>>

Re: [PATCH] clk: uniphier: Fix potential infinite loop

2021-04-09 Thread Colin Ian King

On 09/04/2021 07:46, Masahiro Yamada wrote:
> On Thu, Apr 8, 2021 at 12:25 AM Colin King  wrote:
>>
>> From: Colin Ian King 
>>
>> The for-loop iterates with a u8 loop counter i and compares this
>> with the loop upper limit of num_parents that is an int type.
>> There is a potential infinite loop if num_parents is larger than
>> the u8 loop counter. Fix this by making the loop counter the same
>> type as num_parents.
>>
>> Addresses-Coverity: ("Infinite loop")
>> Fixes: 734d82f4a678 ("clk: uniphier: add core support code for UniPhier 
>> clock driver")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/clk/uniphier/clk-uniphier-mux.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/clk/uniphier/clk-uniphier-mux.c 
>> b/drivers/clk/uniphier/clk-uniphier-mux.c
>> index 462c84321b2d..ce219e0d2a85 100644
>> --- a/drivers/clk/uniphier/clk-uniphier-mux.c
>> +++ b/drivers/clk/uniphier/clk-uniphier-mux.c
>> @@ -34,7 +34,7 @@ static u8 uniphier_clk_mux_get_parent(struct clk_hw *hw)
>> int num_parents = clk_hw_get_num_parents(hw);
>> int ret;
>> unsigned int val;
>> -   u8 i;
>> +   int i;
>>
>> ret = regmap_read(mux->regmap, mux->reg, );
>> if (ret)
>> --
>> 2.30.2
>>
> 
> clk_hw_get_num_parents() returns 'unsigned int', so
> I think 'num_parents' should also have been 'unsigned int'.
> 
> Maybe, the loop counter 'i' also should be 'unsigned int' then?
> 
> 
Good point. I'll send a V2.

Re: [PATCH v3 2/4] kernfs: use VFS negative dentry caching

2021-04-09 Thread Ian Kent

On Fri, 2021-04-09 at 01:35 +, Al Viro wrote:
> On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote:
> > +   parent = kernfs_dentry_node(dentry->d_parent);
> > +   if (parent) {
> > +   const void *ns = NULL;
> > +
> > +   if (kernfs_ns_enabled(parent))
> > +   ns = kernfs_info(dentry->d_parent-
> > >d_sb)->ns;
> 
>   For any dentry d, we have d->d_parent->d_sb == d->d_sb.  All
> the time.
> If you ever run into the case where that would not be true, you've
> found
> a critical bug.

Right, yes.

> 
> > +   kn = kernfs_find_ns(parent, dentry-
> > >d_name.name, ns);
> > +   if (kn)
> > +   goto out_bad;
> > +   }
> 
> Umm...  What's to prevent a race with successful rename(2)?  IOW,
> what's
> there to stabilize ->d_parent and ->d_name while we are in that
> function?

Indeed, glad you looked at this.

Now I'm wondering how kerfs_iop_rename() protects itself from
concurrent kernfs_rename_ns() ...

[PATCH v3 4/4] kernfs: use i_lock to protect concurrent inode updates

2021-04-08 Thread Ian Kent

The inode operations .permission() and .getattr() use the kernfs node
write lock but all that's needed is to keep the rb tree stable while
updating the inode attributes as well as protecting the update itself
against concurrent changes.

And .permission() is called frequently during path walks and can cause
quite a bit of contention between kernfs node operations and path
walks when the number of concurrent walks is high.

To change kernfs_iop_getattr() and kernfs_iop_permission() to take
the rw sem read lock instead of the write lock an additional lock is
needed to protect against multiple processes concurrently updating
the inode attributes and link count in kernfs_refresh_inode().

The inode i_lock seems like the sensible thing to use to protect these
inode attribute updates so use it in kernfs_refresh_inode().

Signed-off-by: Ian Kent 
---
 fs/kernfs/inode.c |   10 ++
 fs/kernfs/mount.c |4 ++--
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index 3b01e9e61f14e..6728ecd81eb37 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -172,6 +172,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, 
struct inode *inode)
 {
struct kernfs_iattrs *attrs = kn->iattr;
 
+   spin_lock(>i_lock);
inode->i_mode = kn->mode;
if (attrs)
/*
@@ -182,6 +183,7 @@ static void kernfs_refresh_inode(struct kernfs_node *kn, 
struct inode *inode)
 
if (kernfs_type(kn) == KERNFS_DIR)
set_nlink(inode, kn->dir.subdirs + 2);
+   spin_unlock(>i_lock);
 }
 
 int kernfs_iop_getattr(struct user_namespace *mnt_userns,
@@ -191,9 +193,9 @@ int kernfs_iop_getattr(struct user_namespace *mnt_userns,
struct inode *inode = d_inode(path->dentry);
struct kernfs_node *kn = inode->i_private;
 
-   down_write(_rwsem);
+   down_read(_rwsem);
kernfs_refresh_inode(kn, inode);
-   up_write(_rwsem);
+   up_read(_rwsem);
 
generic_fillattr(_user_ns, inode, stat);
return 0;
@@ -284,9 +286,9 @@ int kernfs_iop_permission(struct user_namespace *mnt_userns,
 
kn = inode->i_private;
 
-   down_write(_rwsem);
+   down_read(_rwsem);
kernfs_refresh_inode(kn, inode);
-   up_write(_rwsem);
+   up_read(_rwsem);
 
return generic_permission(_user_ns, inode, mask);
 }
diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index baa4155ba2edf..f2f909d09f522 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -255,9 +255,9 @@ static int kernfs_fill_super(struct super_block *sb, struct 
kernfs_fs_context *k
sb->s_shrink.seeks = 0;
 
/* get root inode, initialize and unlock it */
-   down_write(_rwsem);
+   down_read(_rwsem);
inode = kernfs_get_inode(sb, info->root->kn);
-   up_write(_rwsem);
+   up_read(_rwsem);
if (!inode) {
pr_debug("kernfs: could not get root inode\n");
return -ENOMEM;

[PATCH v3 3/4] kernfs: switch kernfs to use an rwsem

2021-04-08 Thread Ian Kent

The kernfs global lock restricts the ability to perform kernfs node
lookup operations in parallel during path walks.

Change the kernfs mutex to an rwsem so that, when opportunity arises,
node searches can be done in parallel with path walk lookups.

Signed-off-by: Ian Kent 
---
 fs/kernfs/dir.c |  117 ---
 fs/kernfs/file.c|4 +
 fs/kernfs/inode.c   |   16 +++---
 fs/kernfs/kernfs-internal.h |5 +-
 fs/kernfs/mount.c   |   12 ++--
 fs/kernfs/symlink.c |4 +
 include/linux/kernfs.h  |2 -
 7 files changed, 86 insertions(+), 74 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index edfeee1bf38ec..9bea235f2ec66 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -17,7 +17,7 @@
 
 #include "kernfs-internal.h"
 
-DEFINE_MUTEX(kernfs_mutex);
+DECLARE_RWSEM(kernfs_rwsem);
 static DEFINE_SPINLOCK(kernfs_rename_lock);/* kn->parent and ->name */
 static char kernfs_pr_cont_buf[PATH_MAX];  /* protected by rename_lock */
 static DEFINE_SPINLOCK(kernfs_idr_lock);   /* root->ino_idr */
@@ -26,10 +26,21 @@ static DEFINE_SPINLOCK(kernfs_idr_lock);/* 
root->ino_idr */
 
 static bool kernfs_active(struct kernfs_node *kn)
 {
-   lockdep_assert_held(_mutex);
return atomic_read(>active) >= 0;
 }
 
+static bool kernfs_active_write(struct kernfs_node *kn)
+{
+   lockdep_assert_held_write(_rwsem);
+   return kernfs_active(kn);
+}
+
+static bool kernfs_active_read(struct kernfs_node *kn)
+{
+   lockdep_assert_held_read(_rwsem);
+   return kernfs_active(kn);
+}
+
 static bool kernfs_lockdep(struct kernfs_node *kn)
 {
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
@@ -340,7 +351,7 @@ static int kernfs_sd_compare(const struct kernfs_node *left,
  * @kn->parent->dir.children.
  *
  * Locking:
- * mutex_lock(kernfs_mutex)
+ * kernfs_rwsem held exclusive
  *
  * RETURNS:
  * 0 on susccess -EEXIST on failure.
@@ -385,7 +396,7 @@ static int kernfs_link_sibling(struct kernfs_node *kn)
  * removed, %false if @kn wasn't on the rbtree.
  *
  * Locking:
- * mutex_lock(kernfs_mutex)
+ * kernfs_rwsem held exclusive
  */
 static bool kernfs_unlink_sibling(struct kernfs_node *kn)
 {
@@ -455,14 +466,14 @@ void kernfs_put_active(struct kernfs_node *kn)
  * return after draining is complete.
  */
 static void kernfs_drain(struct kernfs_node *kn)
-   __releases(_mutex) __acquires(_mutex)
+   __releases(_rwsem) __acquires(_rwsem)
 {
struct kernfs_root *root = kernfs_root(kn);
 
-   lockdep_assert_held(_mutex);
+   lockdep_assert_held_write(_rwsem);
WARN_ON_ONCE(kernfs_active(kn));
 
-   mutex_unlock(_mutex);
+   up_write(_rwsem);
 
if (kernfs_lockdep(kn)) {
rwsem_acquire(>dep_map, 0, 0, _RET_IP_);
@@ -481,7 +492,7 @@ static void kernfs_drain(struct kernfs_node *kn)
 
kernfs_drain_open_files(kn);
 
-   mutex_lock(_mutex);
+   down_write(_rwsem);
 }
 
 /**
@@ -720,7 +731,7 @@ int kernfs_add_one(struct kernfs_node *kn)
bool has_ns;
int ret;
 
-   mutex_lock(_mutex);
+   down_write(_rwsem);
 
ret = -EINVAL;
has_ns = kernfs_ns_enabled(parent);
@@ -735,7 +746,7 @@ int kernfs_add_one(struct kernfs_node *kn)
if (parent->flags & KERNFS_EMPTY_DIR)
goto out_unlock;
 
-   if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active(parent))
+   if ((parent->flags & KERNFS_ACTIVATED) && !kernfs_active_write(parent))
goto out_unlock;
 
kn->hash = kernfs_name_hash(kn->name, kn->ns);
@@ -751,7 +762,7 @@ int kernfs_add_one(struct kernfs_node *kn)
ps_iattr->ia_mtime = ps_iattr->ia_ctime;
}
 
-   mutex_unlock(_mutex);
+   up_write(_rwsem);
 
/*
 * Activate the new node unless CREATE_DEACTIVATED is requested.
@@ -765,7 +776,7 @@ int kernfs_add_one(struct kernfs_node *kn)
return 0;
 
 out_unlock:
-   mutex_unlock(_mutex);
+   up_write(_rwsem);
return ret;
 }
 
@@ -786,7 +797,7 @@ static struct kernfs_node *kernfs_find_ns(struct 
kernfs_node *parent,
bool has_ns = kernfs_ns_enabled(parent);
unsigned int hash;
 
-   lockdep_assert_held(_mutex);
+   lockdep_assert_held(_rwsem);
 
if (has_ns != (bool)ns) {
WARN(1, KERN_WARNING "kernfs: ns %s in '%s' for '%s'\n",
@@ -818,7 +829,7 @@ static struct kernfs_node *kernfs_walk_ns(struct 
kernfs_node *parent,
size_t len;
char *p, *name;
 
-   lockdep_assert_held(_mutex);
+   lockdep_assert_held_read(_rwsem);
 
/* grab kernfs_rename_lock to piggy back on kernfs_pr_cont_buf */
spin_lock_irq(_rename_lock);
@@ -858,10 +869,10 @@ struct kernfs_node *kernfs_find_and_get_ns(struct 
kernfs_node *parent,
 {
struct kernfs_no

[PATCH v3 2/4] kernfs: use VFS negative dentry caching

2021-04-08 Thread Ian Kent

If there are many lookups for non-existent paths these negative lookups
can lead to a lot of overhead during path walks.

The VFS allows dentries to be created as negative and hashed, and caches
them so they can be used to reduce the fairly high overhead alloc/free
cycle that occurs during these lookups.

Signed-off-by: Ian Kent 
---
 fs/kernfs/dir.c |   55 +--
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 4c69e2af82dac..edfeee1bf38ec 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1037,12 +1037,33 @@ static int kernfs_dop_revalidate(struct dentry *dentry, 
unsigned int flags)
if (flags & LOOKUP_RCU)
return -ECHILD;
 
-   /* Always perform fresh lookup for negatives */
-   if (d_really_is_negative(dentry))
-   goto out_bad_unlocked;
+   mutex_lock(_mutex);
 
kn = kernfs_dentry_node(dentry);
-   mutex_lock(_mutex);
+
+   /* Negative hashed dentry? */
+   if (!kn) {
+   struct kernfs_node *parent;
+
+   /* If the kernfs node can be found this is a stale negative
+* hashed dentry so it must be discarded and the lookup redone.
+*/
+   parent = kernfs_dentry_node(dentry->d_parent);
+   if (parent) {
+   const void *ns = NULL;
+
+   if (kernfs_ns_enabled(parent))
+   ns = kernfs_info(dentry->d_parent->d_sb)->ns;
+   kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
+   if (kn)
+   goto out_bad;
+   }
+
+   /* The kernfs node doesn't exist, leave the dentry negative
+* and return success.
+*/
+   goto out;
+   }
 
/* The kernfs node has been deactivated */
if (!kernfs_active_read(kn))
@@ -1060,12 +1081,11 @@ static int kernfs_dop_revalidate(struct dentry *dentry, 
unsigned int flags)
if (kn->parent && kernfs_ns_enabled(kn->parent) &&
kernfs_info(dentry->d_sb)->ns != kn->ns)
goto out_bad;
-
+out:
mutex_unlock(_mutex);
return 1;
 out_bad:
mutex_unlock(_mutex);
-out_bad_unlocked:
return 0;
 }
 
@@ -1080,33 +1100,24 @@ static struct dentry *kernfs_iop_lookup(struct inode 
*dir,
struct dentry *ret;
struct kernfs_node *parent = dir->i_private;
struct kernfs_node *kn;
-   struct inode *inode;
+   struct inode *inode = NULL;
const void *ns = NULL;
 
mutex_lock(_mutex);
-
if (kernfs_ns_enabled(parent))
ns = kernfs_info(dir->i_sb)->ns;
 
kn = kernfs_find_ns(parent, dentry->d_name.name, ns);
-
-   /* no such entry */
-   if (!kn || !kernfs_active(kn)) {
-   ret = NULL;
-   goto out_unlock;
-   }
-
/* attach dentry and inode */
-   inode = kernfs_get_inode(dir->i_sb, kn);
-   if (!inode) {
-   ret = ERR_PTR(-ENOMEM);
-   goto out_unlock;
+   if (kn && kernfs_active(kn)) {
+   inode = kernfs_get_inode(dir->i_sb, kn);
+   if (!inode)
+   inode = ERR_PTR(-ENOMEM);
}
-
-   /* instantiate and hash dentry */
+   /* instantiate and hash (possibly negative) dentry */
ret = d_splice_alias(inode, dentry);
- out_unlock:
mutex_unlock(_mutex);
+
return ret;
 }

[PATCH v3 1/4] kernfs: move revalidate to be near lookup

2021-04-08 Thread Ian Kent

While the dentry operation kernfs_dop_revalidate() is grouped with
dentry type functions it also has a strong affinity to the inode
operation ->lookup().

In order to take advantage of the VFS negative dentry caching that
can be used to reduce path lookup overhead on non-existent paths it
will need to call kernfs_find_ns(). So, to avoid a forward declaration,
move it to be near kernfs_iop_lookup().

There's no functional change from this patch.

Signed-off-by: Ian Kent 
---
 fs/kernfs/dir.c |   86 ---
 1 file changed, 43 insertions(+), 43 deletions(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 7e0e62deab53c..4c69e2af82dac 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -548,49 +548,6 @@ void kernfs_put(struct kernfs_node *kn)
 }
 EXPORT_SYMBOL_GPL(kernfs_put);
 
-static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
-{
-   struct kernfs_node *kn;
-
-   if (flags & LOOKUP_RCU)
-   return -ECHILD;
-
-   /* Always perform fresh lookup for negatives */
-   if (d_really_is_negative(dentry))
-   goto out_bad_unlocked;
-
-   kn = kernfs_dentry_node(dentry);
-   mutex_lock(_mutex);
-
-   /* The kernfs node has been deactivated */
-   if (!kernfs_active(kn))
-   goto out_bad;
-
-   /* The kernfs node has been moved? */
-   if (kernfs_dentry_node(dentry->d_parent) != kn->parent)
-   goto out_bad;
-
-   /* The kernfs node has been renamed */
-   if (strcmp(dentry->d_name.name, kn->name) != 0)
-   goto out_bad;
-
-   /* The kernfs node has been moved to a different namespace */
-   if (kn->parent && kernfs_ns_enabled(kn->parent) &&
-   kernfs_info(dentry->d_sb)->ns != kn->ns)
-   goto out_bad;
-
-   mutex_unlock(_mutex);
-   return 1;
-out_bad:
-   mutex_unlock(_mutex);
-out_bad_unlocked:
-   return 0;
-}
-
-const struct dentry_operations kernfs_dops = {
-   .d_revalidate   = kernfs_dop_revalidate,
-};
-
 /**
  * kernfs_node_from_dentry - determine kernfs_node associated with a dentry
  * @dentry: the dentry in question
@@ -1073,6 +1030,49 @@ struct kernfs_node *kernfs_create_empty_dir(struct 
kernfs_node *parent,
return ERR_PTR(rc);
 }
 
+static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags)
+{
+   struct kernfs_node *kn;
+
+   if (flags & LOOKUP_RCU)
+   return -ECHILD;
+
+   /* Always perform fresh lookup for negatives */
+   if (d_really_is_negative(dentry))
+   goto out_bad_unlocked;
+
+   kn = kernfs_dentry_node(dentry);
+   mutex_lock(_mutex);
+
+   /* The kernfs node has been deactivated */
+   if (!kernfs_active_read(kn))
+   goto out_bad;
+
+   /* The kernfs node has been moved? */
+   if (kernfs_dentry_node(dentry->d_parent) != kn->parent)
+   goto out_bad;
+
+   /* The kernfs node has been renamed */
+   if (strcmp(dentry->d_name.name, kn->name) != 0)
+   goto out_bad;
+
+   /* The kernfs node has been moved to a different namespace */
+   if (kn->parent && kernfs_ns_enabled(kn->parent) &&
+   kernfs_info(dentry->d_sb)->ns != kn->ns)
+   goto out_bad;
+
+   mutex_unlock(_mutex);
+   return 1;
+out_bad:
+   mutex_unlock(_mutex);
+out_bad_unlocked:
+   return 0;
+}
+
+const struct dentry_operations kernfs_dops = {
+   .d_revalidate   = kernfs_dop_revalidate,
+};
+
 static struct dentry *kernfs_iop_lookup(struct inode *dir,
struct dentry *dentry,
unsigned int flags)

[PATCH v3 0/4] kernfs: proposed locking and concurrency improvement

2021-04-08 Thread Ian Kent

There have been a few instances of contention on the kernfs_mutex during
path walks, a case on very large IBM systems seen by myself, a report by
Brice Goglin and followed up by Fox Chen, and I've since seen a couple
of other reports by CoreOS users.

The common thread is a large number of kernfs path walks leading to
slowness of path walks due to kernfs_mutex contention.

The problem being that changes to the VFS over some time have increased
it's concurrency capabilities to an extent that kernfs's use of a mutex
is no longer appropriate. There's also an issue of walks for non-existent
paths causing contention if there are quite a few of them which is a less
common problem.

This patch series is relatively straight forward.

All it does is add the ability to take advantage of VFS negative dentry
caching to avoid needless dentry alloc/free cycles for lookups of paths
that don't exit and change the kernfs_mutex to a read/write semaphore.

The patch that tried to stay in VFS rcu-walk mode during path walks has
been dropped for two reasons. First, it doesn't actually give very much
improvement and, second, if there's a place where mistakes could go
unnoticed it would be in that path. This makes the patch series simpler
to review and reduces the likelihood of problems going unnoticed and
popping up later.

The patch to use a revision to identify if a directory has changed has
also been dropped. If the directory has changed the dentry revision
needs to be updated to avoid subsequent rb tree searches and after
changing to use a read/write semaphore the update also requires a lock.
But the d_lock is the only lock available at this point which might
itself be contended.

Changes since v2:
- actually fix the inode attribute update locking.
- drop the patch that tried to stay in rcu-walk mode.
- drop the use a revision to identify if a directory has changed patch.

Changes since v1:
- fix locking in .permission() and .getattr() by re-factoring the attribute
  handling code.

---

Ian Kent (4):
  kernfs: move revalidate to be near lookup
  kernfs: use VFS negative dentry caching
  kernfs: switch kernfs to use an rwsem
  kernfs: use i_lock to protect concurrent inode updates


 fs/kernfs/dir.c |  240 +++
 fs/kernfs/file.c|4 -
 fs/kernfs/inode.c   |   18 ++-
 fs/kernfs/kernfs-internal.h |5 +
 fs/kernfs/mount.c   |   12 +-
 fs/kernfs/symlink.c |4 -
 include/linux/kernfs.h  |2 
 7 files changed, 155 insertions(+), 130 deletions(-)

--

[PATCH] perf arm-spe: Avoid potential buffer overrun.

2021-04-07 Thread Ian Rogers

SPE extended headers are >1 byte so ensure the buffer contains at
least this before reading. This issue was detected by fuzzing.

Signed-off-by: Ian Rogers 
---
 tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c 
b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
index f3ac9d40cebf..2e5eff4f8f03 100644
--- a/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
+++ b/tools/perf/util/arm-spe-decoder/arm-spe-pkt-decoder.c
@@ -210,8 +210,10 @@ static int arm_spe_do_get_packet(const unsigned char *buf, 
size_t len,
 
if ((hdr & SPE_HEADER0_MASK2) == SPE_HEADER0_EXTENDED) {
/* 16-bit extended format header */
-   ext_hdr = 1;
+   if (len == 1)
+   return ARM_SPE_BAD_PACKET;
 
+   ext_hdr = 1;
hdr = buf[1];
if (hdr == SPE_HEADER1_ALIGNMENT)
return arm_spe_get_alignment(buf, len, packet);
-- 
2.31.0.208.g409f899ff0-goog

Re: [PATCH v2 1/6] perf metricgroup: Make find_metric() public with name change

2021-04-01 Thread Ian Rogers

On Thu, Mar 25, 2021 at 3:38 AM John Garry  wrote:
>
> Function find_metric() is required for the metric processing in the
> pmu-events testcase, so make it public. Also change the name to include
> "metricgroup".

Would it make more sense as "pmu_events_map__find_metric" ?

Thanks,
Ian

> Signed-off-by: John Garry 
> ---
>  tools/perf/util/metricgroup.c | 5 +++--
>  tools/perf/util/metricgroup.h | 3 ++-
>  2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
> index 6acb44ad439b..71a13406e0bd 100644
> --- a/tools/perf/util/metricgroup.c
> +++ b/tools/perf/util/metricgroup.c
> @@ -900,7 +900,8 @@ static int __add_metric(struct list_head *metric_list,
> (match_metric(__pe->metric_group, __metric) ||  \
>  match_metric(__pe->metric_name, __metric)))
>
> -static struct pmu_event *find_metric(const char *metric, struct 
> pmu_events_map *map)
> +struct pmu_event *metrcgroup_find_metric(const char *metric,
> +struct pmu_events_map *map)
>  {
> struct pmu_event *pe;
> int i;
> @@ -985,7 +986,7 @@ static int __resolve_metric(struct metric *m,
> struct expr_id *parent;
> struct pmu_event *pe;
>
> -   pe = find_metric(cur->key, map);
> +   pe = metrcgroup_find_metric(cur->key, map);
> if (!pe)
> continue;
>
> diff --git a/tools/perf/util/metricgroup.h b/tools/perf/util/metricgroup.h
> index ed1b9392e624..1674c6a36d74 100644
> --- a/tools/perf/util/metricgroup.h
> +++ b/tools/perf/util/metricgroup.h
> @@ -44,7 +44,8 @@ int metricgroup__parse_groups(const struct option *opt,
>   bool metric_no_group,
>   bool metric_no_merge,
>   struct rblist *metric_events);
> -
> +struct pmu_event *metrcgroup_find_metric(const char *metric,
> +struct pmu_events_map *map);
>  int metricgroup__parse_groups_test(struct evlist *evlist,
>struct pmu_events_map *map,
>const char *str,
> --
> 2.26.2
>

re: drm/i915/selftests: Prepare gtt tests for obj->mm.lock removal

2021-04-01 Thread Colin Ian King

Hi,

Static analysis with Coverity on Linux-next has detected a potential
issue with the following commit:

commit 480ae79537b28f30ef6e07b7de69a9ae2599daa7
Author: Maarten Lankhorst 
Date:   Tue Mar 23 16:50:49 2021 +0100

drm/i915/selftests: Prepare gtt tests for obj->mm.lock removal


The analysis by Coverity is as follows:

145 static int igt_ppgtt_alloc(void *arg)
146 {
147struct drm_i915_private *dev_priv = arg;
148struct i915_ppgtt *ppgtt;
   1. var_decl: Declaring variable ww without initializer.
149struct i915_gem_ww_ctx ww;
150u64 size, last, limit;
151int err = 0;
152
153/* Allocate a ppggt and try to fill the entire range */
154
   2. Condition !(dev_priv->__info.ppgtt_type != INTEL_PPGTT_NONE),
taking false branch.
155if (!HAS_PPGTT(dev_priv))
156return 0;
157
158ppgtt = i915_ppgtt_create(_priv->gt);
   3. Condition IS_ERR(ppgtt), taking false branch.
159if (IS_ERR(ppgtt))
160return PTR_ERR(ppgtt);
161
   4. Condition !ppgtt->vm.allocate_va_range, taking true branch.
162if (!ppgtt->vm.allocate_va_range)
   5. Jumping to label err_ppgtt_cleanup.
163goto err_ppgtt_cleanup;
164
165/*
166 * While we only allocate the page tables here and so we could
167 * address a much larger GTT than we could actually fit into
168 * RAM, a practical limit is the amount of physical pages in
the system.
169 * This should ensure that we do not run into the oomkiller
during
170 * the test and take down the machine wilfully.
171 */
172limit = totalram_pages() << PAGE_SHIFT;
173limit = min(ppgtt->vm.total, limit);
174
175i915_gem_ww_ctx_init(, false);
176retry:
177err = i915_vm_lock_objects(>vm, );
178if (err)
179goto err_ppgtt_cleanup;
180
181/* Check we can allocate the entire range */
182for (size = 4096; size <= limit; size <<= 2) {
183struct i915_vm_pt_stash stash = {};
184
185err = i915_vm_alloc_pt_stash(>vm, , size);
186if (err)
187goto err_ppgtt_cleanup;
188
189err = i915_vm_pin_pt_stash(>vm, );
190if (err) {
191i915_vm_free_pt_stash(>vm, );
192goto err_ppgtt_cleanup;
193}
194
195ppgtt->vm.allocate_va_range(>vm, , 0, size);
196cond_resched();
197
198ppgtt->vm.clear_range(>vm, 0, size);
199
200i915_vm_free_pt_stash(>vm, );
201}
202
203/* Check we can incrementally allocate the entire range */
204for (last = 0, size = 4096; size <= limit; last = size, size
<<= 2) {
205struct i915_vm_pt_stash stash = {};
206
207err = i915_vm_alloc_pt_stash(>vm, , size
- last);
208if (err)
209goto err_ppgtt_cleanup;
210
211err = i915_vm_pin_pt_stash(>vm, );
212if (err) {
213i915_vm_free_pt_stash(>vm, );
214goto err_ppgtt_cleanup;
215}
216
217ppgtt->vm.allocate_va_range(>vm, ,
218last, size - last);
219cond_resched();
220
221i915_vm_free_pt_stash(>vm, );
222}
223
224 err_ppgtt_cleanup:
   6. Condition err == -35, taking false branch.
225if (err == -EDEADLK) {
226err = i915_gem_ww_ctx_backoff();
227if (!err)
228goto retry;
229}
   7. uninit_use_in_call: Using uninitialized value ww.contended when
calling i915_gem_ww_ctx_fini.
   Uninitialized pointer read (UNINIT)
   8. uninit_use_in_call: Using uninitialized value ww.ctx.acquired when
calling i915_gem_ww_ctx_fini.
230i915_gem_ww_ctx_fini();
231
232i915_vm_put(>vm);
233return err;
234 }

Coverity is reporting use of uninitialized values in (lines 230.  Not
sure what the best fix is for this, so I'm reporting this as a potential
issue.

Colin

re: ALSA: control - add layer registration routines

2021-03-31 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity has detected a potential
issue in the following commit:

commit 3f0638a0333bfdd0549985aa620f2ab69737af47
Author: Jaroslav Kysela 
Date:   Wed Mar 17 18:29:41 2021 +0100

ALSA: control - add layer registration routines

The static analysis is as follows:

2072 void snd_ctl_disconnect_layer(struct snd_ctl_layer_ops *lops)
2073 {
2074struct snd_ctl_layer_ops *lops2, *prev_lops2;
2075
2076down_write(_ctl_layer_rwsem);

assignment: Assigning: prev_lops2 = NULL.

2077for (lops2 = snd_ctl_layer, prev_lops2 = NULL; lops2; lops2
= lops2->next)
2078if (lops2 == lops) {

null: At condition prev_lops2, the value of prev_lops2 must be NULL.
dead_error_condition: The condition !prev_lops2 must be true.

2079if (!prev_lops2)
2080snd_ctl_layer = lops->next;
2081else

'Constant' variable guards dead code (DEADCODE) dead_error_line:
Execution cannot reach this statement: prev_lops2->next = lops->next;.
Local variable prev_lops2 is assigned only once, to a constant
value, making it effectively constant throughout its scope. If this is
not the intent, examine the logic to see if there is a missing
assignment that would make prev_lops2 not remain constant.

2082prev_lops2->next = lops->next;
2083break;
2084}
2085up_write(_ctl_layer_rwsem);
2086 }

I couldn't quite figure out the original intent of the prev_lops use, so
I'd thought I'd report this issue as the code does look incorrect.

Colin

Re: [PATCH] mm/page_alloc: Add a bulk page allocator -fix -fix

2021-03-30 Thread Colin Ian King

On 30/03/2021 12:48, Mel Gorman wrote:
> Colin Ian King reported the following problem (slightly edited)
> 
>   Author: Mel Gorman 
>   Date:   Mon Mar 29 11:12:24 2021 +1100
> 
>   mm/page_alloc: add a bulk page allocator
> 
>   ...
> 
>   Static analysis on linux-next with Coverity has found a potential
>   uninitialized variable issue in function __alloc_pages_bulk with
>   the following commit:
> 
>   ...
> 
>   Uninitialized scalar variable (UNINIT)
>   15. uninit_use_in_call: Using uninitialized value alloc_flags when
>   calling prepare_alloc_pages.
> 
>   5056if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask,
>   , _gfp, _flags))
> 
> The problem is that prepare_alloc_flags only updates alloc_flags
> which must have a valid initial value. The appropriate initial value is
> ALLOC_WMARK_LOW to avoid the bulk allocator pushing a zone below the low
> watermark without waking kswapd assuming the GFP mask allows kswapd to
> be woken.
> 
> This is a second fix to the mmotm patch
> mm-page_alloc-add-a-bulk-page-allocator.patch . It will cause a mild conflict
> with a later patch due to renaming of an adjacent variable that is trivially
> resolved. I can post a full series with the fixes merged if that is preferred.
> 
> Signed-off-by: Mel Gorman 
> ---
>  mm/page_alloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 92d55f80c289..dabef0b910c9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4990,7 +4990,7 @@ unsigned long __alloc_pages_bulk(gfp_t gfp, int 
> preferred_nid,
>   struct list_head *pcp_list;
>   struct alloc_context ac;
>   gfp_t alloc_gfp;
> - unsigned int alloc_flags;
> + unsigned int alloc_flags = ALLOC_WMARK_LOW;
>   int allocated = 0;
>  
>   if (WARN_ON_ONCE(nr_pages <= 0))
> 

Thanks Mel, that definitely fixes the issue.

Reviewed-by: Colin Ian King

re: mm/page_alloc: add a bulk page allocator

2021-03-29 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity has found a potential
uninitialized variable issue in function __alloc_pages_bulk with the
following commit:

commit b0e0a469733fa571ddd8fe147247c9561b51b2da
Author: Mel Gorman 
Date:   Mon Mar 29 11:12:24 2021 +1100

mm/page_alloc: add a bulk page allocator

The analysis is as follows:

5023 unsigned long __alloc_pages_bulk(gfp_t gfp, int preferred_nid,
5024nodemask_t *nodemask, int nr_pages,
5025struct list_head *page_list,
5026struct page **page_array)
5027 {
5028struct page *page;
5029unsigned long flags;
5030struct zone *zone;
5031struct zoneref *z;
5032struct per_cpu_pages *pcp;
5033struct list_head *pcp_list;
5034struct alloc_context ac;
5035gfp_t alloc_gfp;
1. var_decl: Declaring variable alloc_flags without initializer.
5036unsigned int alloc_flags;
5037int nr_populated = 0;
5038
2. Condition !!(nr_pages <= 0), taking false branch.
5039if (unlikely(nr_pages <= 0))
5040return 0;
5041
5042/*
5043 * Skip populated array elements to determine if any pages need
5044 * to be allocated before disabling IRQs.
5045 */
3. Condition page_array, taking true branch.
4. Condition page_array[nr_populated], taking true branch.
5. Condition nr_populated < nr_pages, taking true branch.
7. Condition page_array, taking true branch.
8. Condition page_array[nr_populated], taking true branch.
9. Condition nr_populated < nr_pages, taking true branch.
11. Condition page_array, taking true branch.
12. Condition page_array[nr_populated], taking true branch.
13. Condition nr_populated < nr_pages, taking false branch.
5046while (page_array && page_array[nr_populated] &&
nr_populated < nr_pages)
6. Jumping back to the beginning of the loop.
10. Jumping back to the beginning of the loop.
5047nr_populated++;
5048
5049/* Use the single page allocator for one page. */
14. Condition nr_pages - nr_populated == 1, taking false branch.
5050if (nr_pages - nr_populated == 1)
5051goto failed;
5052
5053/* May set ALLOC_NOFRAGMENT, fragmentation will return 1
page. */
5054gfp &= gfp_allowed_mask;
5055alloc_gfp = gfp;

Uninitialized scalar variable (UNINIT)
15. uninit_use_in_call: Using uninitialized value alloc_flags when
calling prepare_alloc_pages.

5056if (!prepare_alloc_pages(gfp, 0, preferred_nid, nodemask,
, _gfp, _flags))
5057return 0;

And in prepare_alloc_pages():

4957 static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int
order,
4958int preferred_nid, nodemask_t *nodemask,
4959struct alloc_context *ac, gfp_t *alloc_gfp,
4960unsigned int *alloc_flags)
4961 {
4962ac->highest_zoneidx = gfp_zone(gfp_mask);
4963ac->zonelist = node_zonelist(preferred_nid, gfp_mask);
4964ac->nodemask = nodemask;
4965ac->migratetype = gfp_migratetype(gfp_mask);
4966

1. Condition cpusets_enabled(), taking false branch.

4967if (cpusets_enabled()) {
4968*alloc_gfp |= __GFP_HARDWALL;
4969/*
4970 * When we are in the interrupt context, it is
irrelevant
4971 * to the current task context. It means that any
node ok.
4972 */
4973if (!in_interrupt() && !ac->nodemask)
4974ac->nodemask = _current_mems_allowed;
4975else
4976*alloc_flags |= ALLOC_CPUSET;
4977}
4978
4979fs_reclaim_acquire(gfp_mask);
4980fs_reclaim_release(gfp_mask);
4981
2. Condition gfp_mask & 1024U /* (gfp_t)1024U */, taking true branch.
4982might_sleep_if(gfp_mask & __GFP_DIRECT_RECLAIM);
4983
3. Condition should_fail_alloc_page(gfp_mask, order), taking false
branch.
4984if (should_fail_alloc_page(gfp_mask, order))
4985return false;
4986
4. read_value: Reading value *alloc_flags when calling
gfp_to_alloc_flags_cma.
4987*alloc_flags = gfp_to_alloc_flags_cma(gfp_mask, *alloc_flags);

And in call gfp_to_alloc_flags_cma():

in /mm/page_alloc.c

3853 static inline unsigned int gfp_to_alloc_flags_cma(gfp_t gfp_mask,
3854  unsigned int
alloc_flags)
3855 {
3856#ifdef CONFIG_CMA
1. Condition gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE, taking
true branch.
3857if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
2. read_value: Reading value alloc_flags.
3858alloc_flags |= ALLOC_CMA;
3859#endif
3860return alloc_flags;
3861 }

So alloc_flags in gfp_to_alloc_flags_cma is being updated with the |=
operator and we managed to get to this path with uninitialized

Periodic locking IO is causing server to stop responding

2021-03-29 Thread Ian Coetzee


Hi All,

We have run into a slight mishap here on one of our servers, which I am 
hoping you could help narrow down to a cause.


One of our servers locks up every so often, seemingly because of a disk 
IO lockup. Symptoms include high load average (106) stemming from the 
processor waiting around 97-99% for the disk. When this occurs any new 
ssh sessions is met with a connection timeout.


Kernel version: 5.10.24-uls #1 SMP Fri Mar 19 11:31:52 SAST 2021 x86_64 
Intel(R) Xeon(R) CPU E5-2603 v3 @ 1.60GHz GenuineIntel GNU/Linux


The following log entries appeared in dmesg around the time the last 
lock up started occurring.


[Tue Mar 23 06:09:32 2021] scsi target0:0:8: handle(0x0012), 
sas_address(0x500304800175f088), phy(8)
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: enclosure logical 
id(0x500304800175f0bf), slot(8)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: No reference found at driver, 
assuming scmd(0x29a7ef73) might have completed
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: task abort: SUCCESS 
scmd(0x29a7ef73)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: attempting task 
abort!scmd(0xde97f273), outstanding for 184620 ms & timeout 
18 ms
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: [sdi] tag#1993 CDB: Write(16) 
8a 00 00 00 00 00 34 d5 2c 00 00 00 04 00 00 00
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: handle(0x0012), 
sas_address(0x500304800175f088), phy(8)
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: enclosure logical 
id(0x500304800175f0bf), slot(8)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: No reference found at driver, 
assuming scmd(0xde97f273) might have completed
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: task abort: SUCCESS 
scmd(0xde97f273)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: attempting task 
abort!scmd(0xe4cfbc75), outstanding for 184600 ms & timeout 
18 ms
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: [sdi] tag#1987 CDB: Write(16) 
8a 00 00 00 00 00 34 d5 96 a8 00 00 01 58 00 00
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: handle(0x0012), 
sas_address(0x500304800175f088), phy(8)
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: enclosure logical 
id(0x500304800175f0bf), slot(8)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: No reference found at driver, 
assuming scmd(0xe4cfbc75) might have completed
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: task abort: SUCCESS 
scmd(0xe4cfbc75)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: attempting task 
abort!scmd(0x2282f27d), outstanding for 184620 ms & timeout 
18 ms
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: [sdi] tag#1986 CDB: Write(16) 
8a 00 00 00 00 00 34 d5 3c 00 00 00 04 00 00 00
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: handle(0x0012), 
sas_address(0x500304800175f088), phy(8)
[Tue Mar 23 06:09:32 2021] scsi target0:0:8: enclosure logical 
id(0x500304800175f0bf), slot(8)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: No reference found at driver, 
assuming scmd(0x2282f27d) might have completed
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: task abort: SUCCESS 
scmd(0x2282f27d)
[Tue Mar 23 06:09:32 2021] sd 0:0:8:0: device_unblock and setting to 
running, handle(0x0012)

[Tue Mar 23 06:09:33 2021] sd 0:0:8:0: Power-on or device reset occurred


We are running a bank of drives on software raid, all on controller


   *-storage
    description: Serial Attached SCSI controller
    product: SAS3008 PCI-Express Fusion-MPT SAS-3
    vendor: Broadcom / LSI
    physical id: 0
    bus info: pci@:01:00.0
    logical name: scsi0
    version: 02
    width: 64 bits
    clock: 33MHz
    capabilities: storage pm pciexpress vpd msi msix 
bus_master cap_list rom

    configuration: driver=mpt3sas latency=0
    resources: irq:24 ioport:e000(size=256) 
memory:fb20-fb20 memory:fb10-fb1f


So far we have not seen this on any of other servers on the same kernel 
version.


Please let me know if I can provide anymore information, we have since 
restarted the server.


Kind regards
Ian Coetzee

re: drm/ttm: switch to per device LRU lock

2021-03-25 Thread Colin Ian King

Hi,

Static analysis with Coverity in linux-next has detected an issue in
drivers/gpu/drm/ttm/ttm_bo.c with the follow commit:

commit a1f091f8ef2b680a5184db065527612247cb4cae
Author: Christian König 
Date:   Tue Oct 6 17:26:42 2020 +0200

drm/ttm: switch to per device LRU lock

Instead of having a global lock for potentially less contention.


The analysis is as follows:

617 int ttm_mem_evict_first(struct ttm_device *bdev,
618struct ttm_resource_manager *man,
619const struct ttm_place *place,
620struct ttm_operation_ctx *ctx,
621struct ww_acquire_ctx *ticket)
622 {
   1. assign_zero: Assigning: bo = NULL.

623struct ttm_buffer_object *bo = NULL, *busy_bo = NULL;
624bool locked = false;
625unsigned i;
626int ret;
627

   Explicit null dereferenced (FORWARD_NULL)2. var_deref_op:
Dereferencing null pointer bo.

628spin_lock(>bdev->lru_lock);
629for (i = 0; i < TTM_MAX_BO_PRIORITY; ++i) {

The spin_lock on bo is dereferencing a null bo pointer.

Colin

Re: [PATCH v3 04/21] x86/insn: Add an insn_decode() API

2021-03-24 Thread Ian Rogers

On Wed, Mar 24, 2021 at 6:54 AM Borislav Petkov  wrote:
>
> On Wed, Mar 24, 2021 at 10:43:20AM -0300, Arnaldo Carvalho de Melo wrote:
> > Borislav, was this addressed? Ian?
>
> Yap:
>
> https://git.kernel.org/tip/0705ef64d1ff52b817e278ca6e28095585ff31e1

Tested on PPC and ARM64 fwiw. Thanks,
Ian

> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette

Re: [PATCH] x86/kprobes: Remove dead code

2021-03-24 Thread Colin Ian King

On 24/03/2021 17:36, Muhammad Usama Anjum wrote:
> The condition in switch statement `opcode & 0xf0` cannot evaluate to
> 0xff. So this case statement will never execute. Remove it.
> 
> Fixes: 6256e668b7 ("x86/kprobes: Use int3 instead of debug trap for 
> single-step")
> Signed-off-by: Muhammad Usama Anjum 
> ---
>  arch/x86/kernel/kprobes/core.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
> index 89d9f26785c7..3b7bcc077020 100644
> --- a/arch/x86/kernel/kprobes/core.c
> +++ b/arch/x86/kernel/kprobes/core.c
> @@ -177,9 +177,6 @@ int can_boost(struct insn *insn, void *addr)
>   case 0xf0:
>   /* clear and set flags are boostable */
>   return (opcode == 0xf5 || (0xf7 < opcode && opcode < 0xfe));
> - case 0xff:
> - /* indirect jmp is boostable */
> - return X86_MODRM_REG(insn->modrm.bytes[0]) == 4;
>   default:
>   /* CS override prefix and call are not boostable */
>   return (opcode != 0x2e && opcode != 0x9a);
> 

The 0xff case was added with some form of intention to be executed so I
suspect removing it is not an appropriate fix.

Re: [PATCH][next] staging: rtl8188eu: Fix null pointer dereference on free_netdev call

2021-03-24 Thread Colin Ian King

On 24/03/2021 16:11, Martin Kaiser wrote:
> Hello Colin,
> 
> Thus wrote Colin King (colin.k...@canonical.com):
> 
>> From: Colin Ian King 
> 
>> An unregister_netdev call checks if pnetdev is null, hence a later
>> call to free_netdev can potentially be passing a null pointer, causing
>> a null pointer dereference. Avoid this by adding a null pointer check
>> on pnetdev before calling free_netdev.
> 
>> Fixes: 1665c8fdffbb ("staging: rtl8188eu: use netdev routines for private 
>> data")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/staging/rtl8188eu/os_dep/usb_intf.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
>> diff --git a/drivers/staging/rtl8188eu/os_dep/usb_intf.c 
>> b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
>> index 518e9feb3f46..91a3d34a1050 100644
>> --- a/drivers/staging/rtl8188eu/os_dep/usb_intf.c
>> +++ b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
>> @@ -446,7 +446,8 @@ static void rtw_usb_if1_deinit(struct adapter *if1)
>>  pr_debug("+r871xu_dev_remove, hw_init_completed=%d\n",
>>   if1->hw_init_completed);
>>  rtw_free_drv_sw(if1);
>> -free_netdev(pnetdev);
>> +if (pnetdev)
>> +free_netdev(pnetdev);
>>  }
> 
>>  static int rtw_drv_init(struct usb_interface *pusb_intf, const struct 
>> usb_device_id *pdid)
>> -- 
>> 2.30.2
> 
> you're right. I removed the NULL check that was part of rtw_free_netdev.
> Sorry for the mistake and thanks for your fix.

Thank static analysis :-)

> 
> Reviewed-by: Martin Kaiser 
> 
> Best regards,
> Martin
>

re: x86/kprobes: Use int3 instead of debug trap for single-step

2021-03-24 Thread Colin Ian King

Hi,

Static analysis on linux-next using Coverity has detected an issue in
the following commit:

commit 6256e668b7af9d81472e03c6a171630c08f8858a
Author: Masami Hiramatsu 
Date:   Wed Mar 3 00:25:46 2021 +0900

x86/kprobes: Use int3 instead of debug trap for single-step

The analysis is as follows:

160switch (opcode & 0xf0) {
161case 0x60:
162/* can't boost "bound" */
163return (opcode != 0x62);
164case 0x70:
165return 0; /* can't boost conditional jump */
166case 0x90:
167return opcode != 0x9a;  /* can't boost call far */
168case 0xc0:
169/* can't boost software-interruptions */
170return (0xc1 < opcode && opcode < 0xcc) || opcode ==
0xcf;
171case 0xd0:
172/* can boost AA* and XLAT */
173return (opcode == 0xd4 || opcode == 0xd5 || opcode ==
0xd7);
174case 0xe0:
175/* can boost in/out and absolute jmps */
176return ((opcode & 0x04) || opcode == 0xea);
177case 0xf0:
178/* clear and set flags are boostable */
179return (opcode == 0xf5 || (0xf7 < opcode && opcode <
0xfe));

   dead_error_condition: The switch governing value opcode & 0xf0 cannot
be 255.
  undefined (#1 of 1): Logically dead code (DEADCODE)
  dead_error_begin: Execution cannot reach this statement: case 255:

180case 0xff:
181/* indirect jmp is boostable */
182   return X86_MODRM_REG(insn->modrm.bytes[0]) == 4;

the case 0xff statement can never be reached because the switch
statement is acting on opcode & 0xf0.

Colin

Re: [RFC PATCH] autofs: find_autofs_mount overmounted parent support

2021-03-23 Thread Ian Kent

On Tue, 2021-03-09 at 13:43 +0300, Alexander Mikhalitsyn wrote:
> On Sat, 06 Mar 2021 17:13:32 +0800
> Ian Kent  wrote:
> 
> > On Fri, 2021-03-05 at 14:55 +0300, Alexander Mikhalitsyn wrote:
> > > On Fri, 05 Mar 2021 18:10:02 +0800
> > > Ian Kent  wrote:
> > > 
> > > > On Thu, 2021-03-04 at 13:11 +0300, Alexander Mikhalitsyn wrote:
> > > > > On Thu, 04 Mar 2021 14:54:11 +0800
> > > > > Ian Kent  wrote:
> > > > > 
> > > > > > On Wed, 2021-03-03 at 18:28 +0300, Alexander Mikhalitsyn
> > > > > > wrote:
> > > > > > > It was discovered that find_autofs_mount() function
> > > > > > > in autofs not support cases when autofs mount
> > > > > > > parent is overmounted. In this case this function will
> > > > > > > always return -ENOENT.
> > > > > > 
> > > > > > Ok, I get this shouldn't happen.
> > > > > > 
> > > > > > > Real-life reproducer is fairly simple.
> > > > > > > Consider the following mounts on root mntns:
> > > > > > > --
> > > > > > > 35 24 0:36 / /proc/sys/fs/binfmt_misc ... shared:16 -
> > > > > > > autofs
> > > > > > > systemd-
> > > > > > > 1 ...
> > > > > > > 654 35 0:57 / /proc/sys/fs/binfmt_misc ... shared:322 -
> > > > > > > binfmt_misc
> > > > > > > ...
> > > > > > > --
> > > > > > > and some process which calls
> > > > > > > ioctl(AUTOFS_DEV_IOCTL_OPENMOUNT)
> > > > > > > $ unshare -m -p --fork --mount-proc ./process-bin
> > > > > > > 
> > > > > > > Due to "mount-proc" /proc will be overmounted and
> > > > > > > ioctl() will fail with -ENOENT
> > > > > > 
> > > > > > I think I need a better explanation ...
> > > > > 
> > > > > Thank you for the quick reply, Ian.
> > > > > I'm sorry If my patch description was not sufficiently clear
> > > > > and
> > > > > detailed.
> > > > > 
> > > > > That problem connected with CRIU (Checkpoint-Restore in
> > > > > Userspace)
> > > > > project.
> > > > > In CRIU we have support of autofs mounts C/R. To acheive that
> > > > > we
> > > > > need
> > > > > to use
> > > > > ioctl's from /dev/autofs to get data about mounts, restore
> > > > > mount
> > > > > as
> > > > > catatonic
> > > > > (if needed), change pipe fd and so on. But the problem is
> > > > > that
> > > > > during
> > > > > CRIU
> > > > > dump we may meet situation when VFS subtree where autofs
> > > > > mount
> > > > > present was
> > > > > overmounted as whole.
> > > > > 
> > > > > Simpliest example is /proc/sys/fs/binfmt_misc. This mount
> > > > > present
> > > > > on
> > > > > most
> > > > > GNU/Linux distributions by default. For instance on my Fedora
> > > > > 33:
> > > > 
> > > > Yes, I don't know why systemd uses this direct mount, there
> > > > must
> > > > have been a reason for it.
> > > > 
> > > > > trigger automount of binfmt_misc
> > > > > $ ls /proc/sys/fs/binfmt_misc
> > > > > 
> > > > > $ cat /proc/1/mountinfo | grep binfmt
> > > > > 35 24 0:36 / /proc/sys/fs/binfmt_misc rw,relatime shared:16 -
> > > > > autofs
> > > > > systemd-1 rw,...,direct,pipe_ino=223
> > > > > 632 35 0:56 / /proc/sys/fs/binfmt_misc rw,...,relatime
> > > > > shared:315
> > > > > -
> > > > > binfmt_misc binfmt_misc rw
> > > > 
> > > > Yes, I think this looks normal.
> > > > 
> > > > > $ sudo unshare -m -p --fork --mount-proc sh
> > > > > # cat /proc/self/mountinfo | grep "/proc"
> > > > > 828 809 0:23 / /proc rw,nosuid,nodev,noexec,relatime - proc
> > > > > proc
> > > > > rw
> > > > > 829 828 0:36 / /proc/sys/fs/binfmt_misc rw,relatime - autofs
> > > > > systemd-
> > > > > 1 r

Re: [PATCH RFC v2 8/8] selftests/perf: Add kselftest for remove_on_exec

2021-03-22 Thread Ian Rogers

On Mon, Mar 22, 2021 at 6:24 AM Marco Elver  wrote:
>
> On Wed, Mar 10, 2021 at 11:41AM +0100, Marco Elver wrote:
> > Add kselftest to test that remove_on_exec removes inherited events from
> > child tasks.
> >
> > Signed-off-by: Marco Elver 
>
> To make compatible with more recent libc, we'll need to fixup the tests
> with the below.
>
> Also, I've seen that tools/perf/tests exists, however it seems to be
> primarily about perf-tool related tests. Is this correct?
>
> I'd propose to keep these purely kernel ABI related tests separate, and
> that way we can also make use of the kselftests framework which will
> also integrate into various CI systems such as kernelci.org.

Perhaps there is a way to have both? Having the perf tool spot an
errant kernel feels like a feature. There are also
tools/lib/perf/tests and Vince Weaver's tests [1]. It is possible to
run standalone tests from within perf test by having them be executed
by a shell test.

Thanks,
Ian

[1] https://github.com/deater/perf_event_tests

> Thanks,
> -- Marco
>
> -- >8 --
>
> diff --git a/tools/testing/selftests/perf_events/remove_on_exec.c 
> b/tools/testing/selftests/perf_events/remove_on_exec.c
> index e176b3a74d55..f89d0cfdb81e 100644
> --- a/tools/testing/selftests/perf_events/remove_on_exec.c
> +++ b/tools/testing/selftests/perf_events/remove_on_exec.c
> @@ -13,6 +13,11 @@
>  #define __have_siginfo_t 1
>  #define __have_sigval_t 1
>  #define __have_sigevent_t 1
> +#define __siginfo_t_defined
> +#define __sigval_t_defined
> +#define __sigevent_t_defined
> +#define _BITS_SIGINFO_CONSTS_H 1
> +#define _BITS_SIGEVENT_CONSTS_H 1
>
>  #include 
>  #include 
> diff --git a/tools/testing/selftests/perf_events/sigtrap_threads.c 
> b/tools/testing/selftests/perf_events/sigtrap_threads.c
> index 7ebb9bb34c2e..b9a7d4b64b3c 100644
> --- a/tools/testing/selftests/perf_events/sigtrap_threads.c
> +++ b/tools/testing/selftests/perf_events/sigtrap_threads.c
> @@ -13,6 +13,11 @@
>  #define __have_siginfo_t 1
>  #define __have_sigval_t 1
>  #define __have_sigevent_t 1
> +#define __siginfo_t_defined
> +#define __sigval_t_defined
> +#define __sigevent_t_defined
> +#define _BITS_SIGINFO_CONSTS_H 1
> +#define _BITS_SIGEVENT_CONSTS_H 1
>
>  #include 
>  #include

Re: [PATCH][next] loop: Fix missing max_active argument in alloc_workqueue call

2021-03-19 Thread Colin Ian King

On 19/03/2021 15:54, Dan Schatzberg wrote:
> On Thu, Mar 18, 2021 at 03:16:26PM +, Colin King wrote:
>> From: Colin Ian King 
>>
>> The 3rd argument to alloc_workqueue should be the max_active count,
>> however currently it is the lo->lo_number that is intended for the
>> loop%d number. Fix this by adding in the missing max_active count.
>>
> 
> Thanks for catching this Colin. I'm fairly new to kernel development.
> Is there some tool I could have run locally to catch this?
> 
I'm using Coverity to find these issues. There is a free version [1],
but the one I use is licensed and has the scan level turned up quite
high to catch more issues than the free service.

Refs: [1] https://scan.coverity.com/projects/linux-next-weekly-scan

Colin

Re: [PATCH][next] loop: Fix missing max_active argument in alloc_workqueue call

2021-03-18 Thread Colin Ian King

On 18/03/2021 20:12, Jens Axboe wrote:
> On 3/18/21 9:16 AM, Colin King wrote:
>> From: Colin Ian King 
>>
>> The 3rd argument to alloc_workqueue should be the max_active count,
>> however currently it is the lo->lo_number that is intended for the
>> loop%d number. Fix this by adding in the missing max_active count.
> 
> Dan, please fold this (or something similar) in when you're redoing the
> series.
> 
Appreciate this fix being picked up. Are we going to lose the SoB?

Colin

Re: [PATCH][next] soc: xilinx: vcu: remove deadcode on null divider check

2021-03-18 Thread Colin Ian King

On 11/02/2021 19:05, Stephen Boyd wrote:
> Quoting Michael Tretter (2021-02-10 23:39:06)
>> On Wed, 10 Feb 2021 19:28:18 -0800, Stephen Boyd wrote:
>>> Quoting Colin King (2021-02-10 10:49:38)
>>>> From: Colin Ian King 
>>>>
>>>> The pointer 'divider' has previously been null checked followed by
>>>> a return, hence the subsequent null check is redundant deadcode
>>>> that can be removed.  Clean up the code and remove it.
>>>>
>>>> Fixes: 9c789deea206 ("soc: xilinx: vcu: implement clock provider for 
>>>> output clocks")
>>>> Signed-off-by: Colin Ian King 
>>>> ---
>>>>  drivers/clk/xilinx/xlnx_vcu.c | 3 ---
>>>>  1 file changed, 3 deletions(-)
>>>>
>>>> diff --git a/drivers/clk/xilinx/xlnx_vcu.c b/drivers/clk/xilinx/xlnx_vcu.c
>>>> index d66b1315114e..607936d7a413 100644
>>>> --- a/drivers/clk/xilinx/xlnx_vcu.c
>>>> +++ b/drivers/clk/xilinx/xlnx_vcu.c
>>>> @@ -512,9 +512,6 @@ static void xvcu_clk_hw_unregister_leaf(struct clk_hw 
>>>> *hw)
>>>>  
>>>> mux = clk_hw_get_parent(divider);
>>>> clk_hw_unregister_mux(mux);
>>>> -   if (!divider)
>>>> -   return;
>>>> -
>>>
>>> This code is pretty confusing. Waiting for m.tret...@pengutronix.de to
>>> reply
>>
>> Can you elaborate what you find confusing about this code. I would gladly try
>> to clarify and improve the code.
> 
> The fact that pointers are being checked and then bailing out of the
> function early, vs. doing something if the pointer is non-NULL.
> 
>>
>> What happens here is that the driver registers a mux -> divider -> gate chain
>> for each output clock, but only stores the gate clock. When unregistering the
>> clocks, the driver starts at the gate and walks up to the mux while
>> unregistering the clocks.
>>

OK, so I think I understand this better, should the order of
unregisteration be as follows:

diff --git a/drivers/clk/xilinx/xlnx_vcu.c b/drivers/clk/xilinx/xlnx_vcu.c
index d66b1315114e..66bac8421460 100644
--- a/drivers/clk/xilinx/xlnx_vcu.c
+++ b/drivers/clk/xilinx/xlnx_vcu.c
@@ -511,11 +511,11 @@ static void xvcu_clk_hw_unregister_leaf(struct
clk_hw *hw)
return;

mux = clk_hw_get_parent(divider);
-   clk_hw_unregister_mux(mux);
-   if (!divider)
+   clk_hw_unregister_mux(divider);
+   if (!mux)
return;

-   clk_hw_unregister_divider(divider);
+   clk_hw_unregister_divider(mux);

Re: linux-next: build failure after merge of the tip tree

2021-03-17 Thread Ian Rogers

On Wed, Mar 17, 2021 at 3:54 AM Borislav Petkov  wrote:
>
> + Ian.
>
> On Wed, Mar 17, 2021 at 03:08:58PM +1100, Stephen Rothwell wrote:
> > Hi all,
> >
> > After merging the tip tree, today's linux-next build (native perf)
> > failed like this:
> >
> > In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:15:
> > util/intel-pt-decoder/../../../arch/x86/lib/insn.c:14:10: fatal error: 
> > asm/inat.h: No such file or directory
> >14 | #include  /*__ignore_sync_check__ */
> >   |  ^~~~
> >
> > This is a powerpc build of perf.  I can't see what caused this failure,
> > so I have used the version of the tip tree from next-20210316 for today.
>
> Yah, this has come up in the past during review but the wrong version
> somehow snuck in, sorry. ;-\
>
> Can you guys verify this fixes the build issue? I don't have a ppc build
> setup.
>
> Thx.

The  path also needs fixing. With the following
I was able to build for arm64 and powerpc.

Thanks,
Ian

diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
index cd4dedde3265..968360bf2150 100644
--- a/tools/arch/x86/lib/insn.c
+++ b/tools/arch/x86/lib/insn.c
@@ -11,13 +11,13 @@
 #else
 #include 
 #endif
-#include  /*__ignore_sync_check__ */
-#include  /* __ignore_sync_check__ */
+#include "../include/asm/inat.h" /*__ignore_sync_check__ */
+#include "../include/asm/insn.h" /* __ignore_sync_check__ */

 #include 
 #include 

-#include  /* __ignore_sync_check__ */
+#include "../include/asm/emulate_prefix.h" /* __ignore_sync_check__ */

 #define leXX_to_cpu(t, r)  \
 ({ \

> ---
> From 50d91054fc421e2a90923706d5ca79e941e28300 Mon Sep 17 00:00:00 2001
> From: Borislav Petkov 
> Date: Wed, 17 Mar 2021 11:33:04 +0100
> Subject: [PATCH] tools/insn: Restore the relative include paths for cross
>  building
>
> Building perf on ppc causes:
>
>   In file included from util/intel-pt-decoder/intel-pt-insn-decoder.c:15:
>   util/intel-pt-decoder/../../../arch/x86/lib/insn.c:14:10: fatal error: 
> asm/inat.h: No such file or directory
>  14 | #include  /*__ignore_sync_check__ */
> |  ^~~~
>
> Restore the relative include paths so that the compiler can find the
> headers.
>
> Fixes: 93281c4a9657 ("x86/insn: Add an insn_decode() API")
> Reported-by: Ian Rogers 
> Reported-by: Stephen Rothwell 
> NOT-Signed-off-by: Borislav Petkov 
> ---
>  tools/arch/x86/lib/insn.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
> index cd4dedde3265..999fbd4c9bea 100644
> --- a/tools/arch/x86/lib/insn.c
> +++ b/tools/arch/x86/lib/insn.c
> @@ -11,8 +11,8 @@
>  #else
>  #include 
>  #endif
> -#include  /*__ignore_sync_check__ */
> -#include  /* __ignore_sync_check__ */
> +#include "../include/asm/inat.h" /* __ignore_sync_check__ */
> +#include "../include/asm/insn.h" /* __ignore_sync_check__ */
>
>  #include 
>  #include 
> --
> 2.29.2
>
> --
> Regards/Gruss,
> Boris.
>
> SUSE Software Solutions Germany GmbH, GF: Felix Imendörffer, HRB 36809, AG 
> Nürnberg

Re: [PATCH] staging: comedi: das800: fix request_irq() warn

2021-03-17 Thread Ian Abbott


On 16/03/2021 22:42, Tong Zhang wrote:

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created
since the .name might be used by userspace and we don't want to break
userspace, so we are changing the parameters passed to request_irq()

Suggested-by: Ian Abbott 
Signed-off-by: Tong Zhang 
---
  drivers/staging/comedi/drivers/das800.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/das800.c 
b/drivers/staging/comedi/drivers/das800.c
index 2881808d6606..bc08324f422f 100644
--- a/drivers/staging/comedi/drivers/das800.c
+++ b/drivers/staging/comedi/drivers/das800.c
@@ -668,7 +668,7 @@ static int das800_attach(struct comedi_device *dev, struct 
comedi_devconfig *it)
dev->board_name = board->name;
  
  	if (irq > 1 && irq <= 7) {

-   ret = request_irq(irq, das800_interrupt, 0, dev->board_name,
+   ret = request_irq(irq, das800_interrupt, 0, "das800",
  dev);
if (ret == 0)
dev->irq = irq;



Looks good (apart from the minor spelling niggle spotted by Dan 
Carpenter), thanks!


Reviewed-by: Ian Abbott 

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH v3 04/21] x86/insn: Add an insn_decode() API

2021-03-16 Thread Ian Rogers

/**
> + * insn_decode() - Decode an x86 instruction
> + * @insn:   insn to be initialized
> + * @kaddr: address (in kernel memory) of instruction (or copy thereof)
> + * @buf_len:   length of the insn buffer at @kaddr
> + * @m: insn mode, see enum insn_mode
> + *
> + * Returns:
> + * 0: if decoding succeeded
> + * < 0: otherwise.
> + */
> +int insn_decode(struct insn *insn, const void *kaddr, int buf_len, enum 
> insn_mode m)
> +{
> +   int ret;
> +
> +/* #define INSN_MODE_KERN  -1 __ignore_sync_check__ mode is only valid 
> in the kernel */
> +
> +   if (m == INSN_MODE_KERN)
> +   insn_init(insn, kaddr, buf_len, IS_ENABLED(CONFIG_X86_64));
> +   else
> +   insn_init(insn, kaddr, buf_len, m == INSN_MODE_64);
> +
> +   ret = insn_get_length(insn);
> +   if (ret)
> +   return ret;
> +
> +   if (insn_complete(insn))
> +   return 0;
> +
> +   return -EINVAL;
>  }
> diff --git a/tools/arch/x86/include/asm/insn.h 
> b/tools/arch/x86/include/asm/insn.h
> index 7e1239aa7dd9..2fe19b1a8765 100644
> --- a/tools/arch/x86/include/asm/insn.h
> +++ b/tools/arch/x86/include/asm/insn.h
> @@ -132,13 +132,23 @@ struct insn {
>  #define X86_VEX_M_MAX  0x1f/* VEX3.M Maximum value */
>
>  extern void insn_init(struct insn *insn, const void *kaddr, int buf_len, int 
> x86_64);
> -extern void insn_get_prefixes(struct insn *insn);
> -extern void insn_get_opcode(struct insn *insn);
> -extern void insn_get_modrm(struct insn *insn);
> -extern void insn_get_sib(struct insn *insn);
> -extern void insn_get_displacement(struct insn *insn);
> -extern void insn_get_immediate(struct insn *insn);
> -extern void insn_get_length(struct insn *insn);
> +extern int insn_get_prefixes(struct insn *insn);
> +extern int insn_get_opcode(struct insn *insn);
> +extern int insn_get_modrm(struct insn *insn);
> +extern int insn_get_sib(struct insn *insn);
> +extern int insn_get_displacement(struct insn *insn);
> +extern int insn_get_immediate(struct insn *insn);
> +extern int insn_get_length(struct insn *insn);
> +
> +enum insn_mode {
> +   INSN_MODE_32,
> +   INSN_MODE_64,
> +   /* Mode is determined by the current kernel build. */
> +   INSN_MODE_KERN,
> +   INSN_NUM_MODES,
> +};
> +
> +extern int insn_decode(struct insn *insn, const void *kaddr, int buf_len, 
> enum insn_mode m);
>
>  /* Attribute will be determined after getting ModRM (for opcode groups) */
>  static inline void insn_get_attribute(struct insn *insn)
> diff --git a/tools/arch/x86/lib/insn.c b/tools/arch/x86/lib/insn.c
> index 1174c43b3944..be2b057f91d4 100644
> --- a/tools/arch/x86/lib/insn.c
> +++ b/tools/arch/x86/lib/insn.c
> @@ -11,10 +11,13 @@
>  #else
>  #include 
>  #endif
> -#include "../include/asm/inat.h" /* __ignore_sync_check__ */
> -#include "../include/asm/insn.h" /* __ignore_sync_check__ */
> +#include  /*__ignore_sync_check__ */
> +#include  /* __ignore_sync_check__ */

Hi, this change is breaking non-x86 builds of perf for me in
tip.git/master. The reason being that non-x86 builds compile the
intel-pt-decoder, which includes this file, but don't have their
include paths set to find tools/arch/x86. I think we want to keep the
relative paths.

Thanks,
Ian

> -#include "../include/asm/emulate_prefix.h" /* __ignore_sync_check__ */
> +#include 
> +#include 
> +
> +#include  /* __ignore_sync_check__ */
>
>  #define leXX_to_cpu(t, r)  \
>  ({ \
> @@ -112,8 +115,12 @@ static void insn_get_emulate_prefix(struct insn *insn)
>   * Populates the @insn->prefixes bitmap, and updates @insn->next_byte
>   * to point to the (first) opcode.  No effect if @insn->prefixes.got
>   * is already set.
> + *
> + * * Returns:
> + * 0:  on success
> + * < 0: on error
>   */
> -void insn_get_prefixes(struct insn *insn)
> +int insn_get_prefixes(struct insn *insn)
>  {
> struct insn_field *prefixes = >prefixes;
> insn_attr_t attr;
> @@ -121,7 +128,7 @@ void insn_get_prefixes(struct insn *insn)
> int i, nb;
>
> if (prefixes->got)
> -   return;
> +   return 0;
>
> insn_get_emulate_prefix(insn);
>
> @@ -231,8 +238,10 @@ void insn_get_prefixes(struct insn *insn)
>
> prefixes->got = 1;
>
> +   return 0;
> +
>  err_out:
> -   return;
> +   return -ENODATA;
>  }
>
>  /**
> @@ -244,16 +253,25 @@ void insn_get_prefixes(struct insn *insn)
>   * If nec

[PATCH v2 1/3] perf test: Remove unused argument

2021-03-16 Thread Ian Rogers

Remove unused argument from daemon_exit.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index 5ad3ca8d681b..66ad56b4e0a5 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -115,8 +115,7 @@ daemon_start()
 
 daemon_exit()
 {
-   local base=$1
-   local config=$2
+   local config=$1
 
local line=`perf daemon --config ${config} -x: | head -1`
local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
@@ -171,7 +170,7 @@ EOF
 ${base}/session-time/ack "0"
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -288,7 +287,7 @@ EOF
done
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -333,7 +332,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
# check that sessions are gone
if [ -d "/proc/${pid_size}" ]; then
@@ -374,7 +373,7 @@ EOF
perf daemon signal --config ${config}
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
# count is 2 perf.data for signals and 1 for perf record finished
count=`ls ${base}/session-test/ | grep perf.data | wc -l`
@@ -420,7 +419,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -457,7 +456,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
-- 
2.31.0.rc2.261.g7f71774620-goog

[PATCH v2 2/3] perf test: Cleanup daemon if test is interrupted.

2021-03-16 Thread Ian Rogers

Reorder daemon_start and daemon_exit as the trap handler is added in
daemon_start referencing daemon_exit.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 34 +++-
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index 66ad56b4e0a5..61d13c4c64b8 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -98,6 +98,23 @@ check_line_other()
fi
 }
 
+daemon_exit()
+{
+   local config=$1
+
+   local line=`perf daemon --config ${config} -x: | head -1`
+   local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
+
+   # Reset trap handler.
+   trap - SIGINT SIGTERM
+
+   # stop daemon
+   perf daemon stop --config ${config}
+
+   # ... and wait for the pid to go away
+   tail --pid=${pid} -f /dev/null
+}
+
 daemon_start()
 {
local config=$1
@@ -105,6 +122,9 @@ daemon_start()
 
perf daemon start --config ${config}
 
+   # Clean up daemon if interrupted.
+   trap "echo 'FAILED: Signal caught'; daemon_exit ${config}; exit 1" 
SIGINT SIGTERM
+
# wait for the session to ping
local state="FAIL"
while [ "${state}" != "OK" ]; do
@@ -113,20 +133,6 @@ daemon_start()
done
 }
 
-daemon_exit()
-{
-   local config=$1
-
-   local line=`perf daemon --config ${config} -x: | head -1`
-   local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
-
-   # stop daemon
-   perf daemon stop --config ${config}
-
-   # ... and wait for the pid to go away
-   tail --pid=${pid} -f /dev/null
-}
-
 test_list()
 {
echo "test daemon list"
-- 
2.31.0.rc2.261.g7f71774620-goog

[PATCH v2 3/3] perf test: Add 30s timeout for wait for daemon start.

2021-03-16 Thread Ian Rogers

Retry the ping loop upto 600 times, or approximately 30 seconds, to make
sure the test does hang at start up.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index 61d13c4c64b8..ee4a30ca3f57 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -127,9 +127,16 @@ daemon_start()
 
# wait for the session to ping
local state="FAIL"
+   local retries=0
while [ "${state}" != "OK" ]; do
state=`perf daemon ping --config ${config} --session ${session} 
| awk '{ print $1 }'`
sleep 0.05
+   retries=$((${retries} +1))
+   if [ ${retries} -ge 600 ]; then
+   echo "FAILED: Timeout waiting for daemon to ping"
+   daemon_exit ${config}
+   exit 1
+   fi
done
 }
 
-- 
2.31.0.rc2.261.g7f71774620-goog

Re: [PATCH] staging: comedi: replace slash in name

2021-03-16 Thread Ian Abbott


On 15/03/2021 20:00, Tong Zhang wrote:

Thanks Ian,
I have submitted a v2 patch based on your suggestions.
Thanks,
- Tong


Thanks.  I think the only other Comedi driver with the same problem is 
"drivers/staging/comedi/drivers/das800.c".  It passes dev->board_name as 
the name argument of request_irq(), but that is "cio-das802/16" for one 
of the boards supported by the driver.



On Mon, Mar 15, 2021 at 6:48 AM Ian Abbott  wrote:


On 15/03/2021 10:44, Ian Abbott wrote:

On 14/03/2021 03:57, Tong Zhang wrote:

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created

[1.565966] name 'pci-das6402/16'
[1.566149] WARNING: CPU: 0 PID: 184 at fs/proc/generic.c:180 
__xlate_proc_name+0x93/0xb0
[1.568923] RIP: 0010:__xlate_proc_name+0x93/0xb0
[1.574200] Call Trace:
[1.574722]  proc_mkdir+0x18/0x20
[1.576629]  request_threaded_irq+0xfe/0x160
[1.576859]  auto_attach+0x60a/0xc40 [cb_pcidas64]

Signed-off-by: Tong Zhang 

[snip]

Userspace applications can use these strings to determine the board
type, so changing the strings would break those applications.

I suggest passing the comedi driver name "cb_pcidas" to request_irq()
for now.


Oops, I meant "cb_pcidas64".  But you could reach that via
dev->driver->driver_name if you want (where dev is the struct
comedi_device * parameter).


--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH v2] staging: comedi: cb_pcidas64: fix request_irq() warn

2021-03-16 Thread Ian Abbott


On 15/03/2021 19:58, Tong Zhang wrote:

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created
since the .name might be used by userspace and we don't want to break
userspace, so we are changing the parameters passed to request_irq()

[1.565966] name 'pci-das6402/16'
[1.566149] WARNING: CPU: 0 PID: 184 at fs/proc/generic.c:180 
__xlate_proc_name+0x93/0xb0
[1.568923] RIP: 0010:__xlate_proc_name+0x93/0xb0
[1.574200] Call Trace:
[1.574722]  proc_mkdir+0x18/0x20
[1.576629]  request_threaded_irq+0xfe/0x160
[1.576859]  auto_attach+0x60a/0xc40 [cb_pcidas64]

Suggested-by: Ian Abbott 
Signed-off-by: Tong Zhang 
---
v2: revert changes to .name field so that we dont break userspace

  drivers/staging/comedi/drivers/cb_pcidas64.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/cb_pcidas64.c 
b/drivers/staging/comedi/drivers/cb_pcidas64.c
index fa987bb0e7cd..6d3ba399a7f0 100644
--- a/drivers/staging/comedi/drivers/cb_pcidas64.c
+++ b/drivers/staging/comedi/drivers/cb_pcidas64.c
@@ -4035,7 +4035,7 @@ static int auto_attach(struct comedi_device *dev,
init_stc_registers(dev);
  
  	retval = request_irq(pcidev->irq, handle_interrupt, IRQF_SHARED,

-dev->board_name, dev);
+"cb_pcidas64", dev);
if (retval) {
dev_dbg(dev->class_dev, "unable to allocate irq %u\n",
pcidev->irq);



Looks good.

Reviewed-by: Ian Abbott 

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH v2] staging: comedi: cb_pcidas: fix request_irq() warn

2021-03-16 Thread Ian Abbott


On 15/03/2021 19:59, Tong Zhang wrote:

request_irq() wont accept a name which contains slash so we need to
repalce it with something else -- otherwise it will trigger a warning
and the entry in /proc/irq/ will not be created
since the .name might be used by userspace and we don't want to break
userspace, so we are changing the parameters passed to request_irq()

[1.630764] name 'pci-das1602/16'
[1.630950] WARNING: CPU: 0 PID: 181 at fs/proc/generic.c:180 
__xlate_proc_name+0x93/0xb0
[1.634009] RIP: 0010:__xlate_proc_name+0x93/0xb0
[1.639441] Call Trace:
[1.639976]  proc_mkdir+0x18/0x20
[1.641946]  request_threaded_irq+0xfe/0x160
[1.642186]  cb_pcidas_auto_attach+0xf4/0x610 [cb_pcidas]

Suggested-by: Ian Abbott 
Signed-off-by: Tong Zhang 
---
v2: revert changes to .name field so that we dont break userspace

  drivers/staging/comedi/drivers/cb_pcidas.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/drivers/cb_pcidas.c 
b/drivers/staging/comedi/drivers/cb_pcidas.c
index d740c4782775..2f20bd56ec6c 100644
--- a/drivers/staging/comedi/drivers/cb_pcidas.c
+++ b/drivers/staging/comedi/drivers/cb_pcidas.c
@@ -1281,7 +1281,7 @@ static int cb_pcidas_auto_attach(struct comedi_device 
*dev,
 devpriv->amcc + AMCC_OP_REG_INTCSR);
  
  	ret = request_irq(pcidev->irq, cb_pcidas_interrupt, IRQF_SHARED,

- dev->board_name, dev);
+ "cb_pcidas", dev);
if (ret) {
dev_dbg(dev->class_dev, "unable to allocate irq %d\n",
pcidev->irq);



Looks good.

Reviewed-by: Ian Abbott 

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] staging: comedi: cb_pcidas: replace slash in name

2021-03-15 Thread Ian Abbott

On 14/03/2021 04:02, Tong Zhang wrote:
> request_irq() wont accept a name which contains slash so we need to
> repalce it with something else -- otherwise it will trigger a warning
> and the entry in /proc/irq/ will not be created
> 
> [1.630764] name 'pci-das1602/16'
> [1.630950] WARNING: CPU: 0 PID: 181 at fs/proc/generic.c:180 
> __xlate_proc_name+0x93/0xb0
> [1.634009] RIP: 0010:__xlate_proc_name+0x93/0xb0
> [1.639441] Call Trace:
> [1.639976]  proc_mkdir+0x18/0x20
> [1.641946]  request_threaded_irq+0xfe/0x160
> [1.642186]  cb_pcidas_auto_attach+0xf4/0x610 [cb_pcidas]
> 
> Signed-off-by: Tong Zhang 
> ---
>  drivers/staging/comedi/drivers/cb_pcidas.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/staging/comedi/drivers/cb_pcidas.c 
> b/drivers/staging/comedi/drivers/cb_pcidas.c
> index d740c4782775..df0960d41cff 100644
> --- a/drivers/staging/comedi/drivers/cb_pcidas.c
> +++ b/drivers/staging/comedi/drivers/cb_pcidas.c
> @@ -230,7 +230,7 @@ struct cb_pcidas_board {
>  
>  static const struct cb_pcidas_board cb_pcidas_boards[] = {
>   [BOARD_PCIDAS1602_16] = {
> - .name   = "pci-das1602/16",
> + .name   = "pci-das1602-16",
>   .ai_speed   = 5000,
>   .ao_scan_speed  = 1,
>   .fifo_size  = 512,
> @@ -248,7 +248,7 @@ static const struct cb_pcidas_board cb_pcidas_boards[] = {
>   .has_ao = 1,
>   },
>   [BOARD_PCIDAS1602_12] = {
> - .name   = "pci-das1602/12",
> + .name   = "pci-das1602-12",
>   .ai_speed   = 3200,
>   .ao_scan_speed  = 4000,
>   .fifo_size  = 1024,
> @@ -257,12 +257,12 @@ static const struct cb_pcidas_board cb_pcidas_boards[] 
> = {
>   .is_1602= 1,
>   },
>   [BOARD_PCIDAS1200_JR] = {
> - .name   = "pci-das1200/jr",
> + .name   = "pci-das1200-jr",
>   .ai_speed   = 3200,
>   .fifo_size  = 1024,
>   },
>   [BOARD_PCIDAS1602_16_JR] = {
> - .name   = "pci-das1602/16/jr",
> + .name   = "pci-das1602-16-jr",
>   .ai_speed   = 5000,
>   .fifo_size  = 512,
>   .is_16bit   = 1,
> 

As for cb_pcidas64, the board name may be used by user-space
applications, so this may break those applications.

I suggest passing the comedi driver name "cb_pcidas" to request_irq()
for now.  (It can also be reached via dev->driver->driver_name .)

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] staging: comedi: replace slash in name

2021-03-15 Thread Ian Abbott

On 15/03/2021 10:44, Ian Abbott wrote:
> On 14/03/2021 03:57, Tong Zhang wrote:
>> request_irq() wont accept a name which contains slash so we need to
>> repalce it with something else -- otherwise it will trigger a warning
>> and the entry in /proc/irq/ will not be created
>>
>> [1.565966] name 'pci-das6402/16'
>> [1.566149] WARNING: CPU: 0 PID: 184 at fs/proc/generic.c:180 
>> __xlate_proc_name+0x93/0xb0
>> [1.568923] RIP: 0010:__xlate_proc_name+0x93/0xb0
>> [1.574200] Call Trace:
>> [1.574722]  proc_mkdir+0x18/0x20
>> [1.576629]  request_threaded_irq+0xfe/0x160
>> [1.576859]  auto_attach+0x60a/0xc40 [cb_pcidas64]
>>
>> Signed-off-by: Tong Zhang 
[snip]
> Userspace applications can use these strings to determine the board
> type, so changing the strings would break those applications.
> 
> I suggest passing the comedi driver name "cb_pcidas" to request_irq()
> for now.

Oops, I meant "cb_pcidas64".  But you could reach that via
dev->driver->driver_name if you want (where dev is the struct
comedi_device * parameter).

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] staging: comedi: replace slash in name

2021-03-15 Thread Ian Abbott

On 14/03/2021 03:57, Tong Zhang wrote:
> request_irq() wont accept a name which contains slash so we need to
> repalce it with something else -- otherwise it will trigger a warning
> and the entry in /proc/irq/ will not be created
> 
> [1.565966] name 'pci-das6402/16'
> [1.566149] WARNING: CPU: 0 PID: 184 at fs/proc/generic.c:180 
> __xlate_proc_name+0x93/0xb0
> [1.568923] RIP: 0010:__xlate_proc_name+0x93/0xb0
> [1.574200] Call Trace:
> [1.574722]  proc_mkdir+0x18/0x20
> [1.576629]  request_threaded_irq+0xfe/0x160
> [1.576859]  auto_attach+0x60a/0xc40 [cb_pcidas64]
> 
> Signed-off-by: Tong Zhang 
> ---
>  drivers/staging/comedi/drivers/cb_pcidas64.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/staging/comedi/drivers/cb_pcidas64.c 
> b/drivers/staging/comedi/drivers/cb_pcidas64.c
> index fa987bb0e7cd..662d6ffb8f60 100644
> --- a/drivers/staging/comedi/drivers/cb_pcidas64.c
> +++ b/drivers/staging/comedi/drivers/cb_pcidas64.c
> @@ -677,7 +677,7 @@ static const int bytes_in_sample = 2;
>  
>  static const struct pcidas64_board pcidas64_boards[] = {
>   [BOARD_PCIDAS6402_16] = {
> - .name   = "pci-das6402/16",
> + .name   = "pci-das6402-16",
>   .ai_se_chans= 64,
>   .ai_bits= 16,
>   .ai_speed   = 5000,
> @@ -693,7 +693,7 @@ static const struct pcidas64_board pcidas64_boards[] = {
>   .has_8255   = 1,
>   },
>   [BOARD_PCIDAS6402_12] = {
> - .name   = "pci-das6402/12", /* XXX check */
> + .name   = "pci-das6402-12", /* XXX check */
>   .ai_se_chans= 64,
>   .ai_bits= 12,
>   .ai_speed   = 5000,
> @@ -709,7 +709,7 @@ static const struct pcidas64_board pcidas64_boards[] = {
>   .has_8255   = 1,
>   },
>   [BOARD_PCIDAS64_M1_16] = {
> - .name   = "pci-das64/m1/16",
> + .name   = "pci-das64-m1-16",
>   .ai_se_chans= 64,
>   .ai_bits= 16,
>   .ai_speed   = 1000,
> @@ -725,7 +725,7 @@ static const struct pcidas64_board pcidas64_boards[] = {
>   .has_8255   = 1,
>   },
>   [BOARD_PCIDAS64_M2_16] = {
> - .name = "pci-das64/m2/16",
> + .name = "pci-das64-m2-16",
>   .ai_se_chans= 64,
>   .ai_bits= 16,
>   .ai_speed   = 500,
> @@ -741,7 +741,7 @@ static const struct pcidas64_board pcidas64_boards[] = {
>   .has_8255   = 1,
>   },
>   [BOARD_PCIDAS64_M3_16] = {
> - .name   = "pci-das64/m3/16",
> + .name   = "pci-das64-m3-16",
>   .ai_se_chans= 64,
>   .ai_bits= 16,
>   .ai_speed   = 333,
> @@ -984,7 +984,7 @@ static const struct pcidas64_board pcidas64_boards[] = {
>   .has_8255   = 0,
>   },
>   [BOARD_PCIDAS4020_12] = {
> - .name   = "pci-das4020/12",
> + .name   = "pci-das4020-12",
>   .ai_se_chans= 4,
>   .ai_bits= 12,
>   .ai_speed   = 50,
> 

Userspace applications can use these strings to determine the board
type, so changing the strings would break those applications.

I suggest passing the comedi driver name "cb_pcidas" to request_irq()
for now.

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: pinctrl: core: Handling pinmux and pinconf separately

2021-03-12 Thread Colin Ian King

On 12/03/2021 12:45, Andy Shevchenko wrote:
> On Thu, Mar 11, 2021 at 1:26 PM Colin Ian King  
> wrote:
>> On 11/03/2021 11:16, Michal Simek wrote:
>>> On 3/11/21 11:57 AM, Colin Ian King wrote:
> 
>>>> For the PIN_MAP_TYPE_CONFIGS_PIN and PIN_MAP_TYPE_CONFIGS_GROUP
>>>> setting->type cases the loop can break out with ret not being set. Since
>>>> ret has not been initialized it the ret < 0 check is checking against an
>>>> uninitialized value.
>>>>
>>>> I was not sure if the PIN_MAP_TYPE_CONFIGS_PIN and
>>>> PIN_MAP_TYPE_CONFIGS_GROUP cases should be setting ret and if so, what
>>>> the value of ret should be set to (is it an error condition or not?). Or
>>>> should ret be initialized to 0 or a default error value at the start of
>>>> the function.
>>>>
>>>> Hence I'm reporting this issue.
>>>
>>> What about this? Is this passing static analysis?
>>
>> It will take me 2 hours to re-run the analysis, but from eyeballing the
>> code I think the assignments will fix this.
> 
> It surprises me that tools in the 21st century can't run on a subset
> of the data.
> 
> Had you filed a bug to the Coverity team that they will provide a way
> to rerun analysis on a subset of the data?

It can. However I need to keep copies of the entire build to do this and
I build many different kernels (hence lots of storage required) and
rarely do minor change + rebuilds, so I don't cater for this in my test
build environment.

> 
>

Re: [PATCH] pinctrl: core: Set ret to 0 when group is skipped

2021-03-12 Thread Colin Ian King

On 12/03/2021 07:31, Michal Simek wrote:
> Static analyzer tool found that the ret variable is not initialized but
> code expects ret value >=0 when pinconf is skipped in the first pinmux
> loop. The same expectation is for pinmux in a pinconf loop.
> That's why initialize ret to 0 to avoid uninitialized ret value in first
> loop or reusing ret value from first loop in second.
> 
> Addresses-Coverity: ("Uninitialized variables")
> Signed-off-by: Michal Simek 
> CC: Colin Ian King 
> CC: Dan Carpenter 
> ---
> 
>  drivers/pinctrl/core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c
> index f5c32d2a3c91..136c323d855e 100644
> --- a/drivers/pinctrl/core.c
> +++ b/drivers/pinctrl/core.c
> @@ -1266,6 +1266,7 @@ static int pinctrl_commit_state(struct pinctrl *p, 
> struct pinctrl_state *state)
>   break;
>   case PIN_MAP_TYPE_CONFIGS_PIN:
>   case PIN_MAP_TYPE_CONFIGS_GROUP:
> + ret = 0;
>   break;
>   default:
>   ret = -EINVAL;
> @@ -1284,6 +1285,7 @@ static int pinctrl_commit_state(struct pinctrl *p, 
> struct pinctrl_state *state)
>   list_for_each_entry(setting, >settings, node) {
>   switch (setting->type) {
>   case PIN_MAP_TYPE_MUX_GROUP:
> + ret = 0;
>   break;
>       case PIN_MAP_TYPE_CONFIGS_PIN:
>   case PIN_MAP_TYPE_CONFIGS_GROUP:
> 

Thanks Michal,

Reviewed-by: Colin Ian King

Re: [PATCH][next] nvmem: core: Fix unintentional sign extension issue

2021-03-11 Thread Colin Ian King

On 11/03/2021 17:12, Doug Anderson wrote:
> Hi,
> 
> On Thu, Mar 11, 2021 at 1:53 AM Colin King  wrote:
>>
>> From: Colin Ian King 
>>
>> The shifting of the u8 integer buf[3] by 24 bits to the left will
>> be promoted to a 32 bit signed int and then sign-extended to a
>> u64. In the event that the top bit of buf[3] is set then all
>> then all the upper 32 bits of the u64 end up as also being set
>> because of the sign-extension. Fix this by casting buf[i] to
>> a u64 before the shift.
>>
>> Addresses-Coverity: ("Unintended sign extension")
>> Fixes: 097eb1136ebb ("nvmem: core: Add functions to make number reading 
>> easy")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/nvmem/core.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> Thanks! I had only tested the "u64" version to read smaller data and
> store it in a u64. From my understanding of C rules, without your
> patch it would have been even worse than just a sign extension though,
> right? Shifting "buf[i]" by more than 32 bits would just not have
> worked right.

yep, that's correct, I forgot to mention that issue too :-/

> 
> In any case:
> 
> Reviewed-by: Douglas Anderson 
>

Re: [PATCH 2/3] perf test: Cleanup daemon if test is interrupted.

2021-03-11 Thread Ian Rogers

On Thu, Mar 11, 2021 at 2:38 AM Jiri Olsa  wrote:
>
> On Wed, Mar 10, 2021 at 12:41:17PM -0800, Ian Rogers wrote:
> > Reorder daemon_start and daemon_exit as the trap handler is added in
> > daemon_start referencing daemon_exit.
>
> makes sense, minor comments below
>
> >
> > Signed-off-by: Ian Rogers 
> > ---
> >  tools/perf/tests/shell/daemon.sh | 34 +++-
> >  1 file changed, 20 insertions(+), 14 deletions(-)
> >
> > diff --git a/tools/perf/tests/shell/daemon.sh 
> > b/tools/perf/tests/shell/daemon.sh
> > index 66ad56b4e0a5..a02cedc76de6 100755
> > --- a/tools/perf/tests/shell/daemon.sh
> > +++ b/tools/perf/tests/shell/daemon.sh
> > @@ -98,6 +98,23 @@ check_line_other()
> >   fi
> >  }
> >
> > +daemon_exit()
> > +{
> > + local config=$1
> > +
> > + local line=`perf daemon --config ${config} -x: | head -1`
> > + local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
> > +
> > +# Reset trap handler.
> > +trap - SIGINT SIGTERM
>
> please keep the 'tabs' in here
>
> > +
> > + # stop daemon
> > + perf daemon stop --config ${config}
> > +
> > + # ... and wait for the pid to go away
> > + tail --pid=${pid} -f /dev/null
> > +}
> > +
> >  daemon_start()
> >  {
> >   local config=$1
> > @@ -105,6 +122,9 @@ daemon_start()
> >
> >   perf daemon start --config ${config}
> >
> > +# Clean up daemon if interrupted.
> > +trap "daemon_exit ${config}; exit 4" SIGINT SIGTERM
>
> ditto, plus let's document exit values somewhere in comments,
> I think the next patch is adding exit 62, so that one as well

Thanks, it might be easier to just have these as exit 1 - I pulled
values from the errno list. Do you have any preferences?

Ian

> thanks,
> jirka
>
> > +
> >   # wait for the session to ping
> >   local state="FAIL"
> >   while [ "${state}" != "OK" ]; do
> > @@ -113,20 +133,6 @@ daemon_start()
> >   done
> >  }
> >
> > -daemon_exit()
> > -{
> > - local config=$1
> > -
> > - local line=`perf daemon --config ${config} -x: | head -1`
> > - local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
> > -
> > - # stop daemon
> > - perf daemon stop --config ${config}
> > -
> > - # ... and wait for the pid to go away
> > - tail --pid=${pid} -f /dev/null
> > -}
> > -
> >  test_list()
> >  {
> >   echo "test daemon list"
> > --
> > 2.30.1.766.gb4fecdf3b7-goog
> >
>

Re: pinctrl: core: Handling pinmux and pinconf separately

2021-03-11 Thread Colin Ian King

On 11/03/2021 11:28, Michal Simek wrote:
> 
> 
> On 3/11/21 12:24 PM, Colin Ian King wrote:
>> On 11/03/2021 11:16, Michal Simek wrote:
>>>
>>>
>>> On 3/11/21 11:57 AM, Colin Ian King wrote:
>>>> Hi,
>>>>
>>>> Static analysis on linux-next with Coverity has found a potential issue
>>>> in drivers/pinctrl/core.c with the following commit:
>>>>
>>>> commit 0952b7ec1614abf232e921aac0cc2bca8e60e162
>>>> Author: Michal Simek 
>>>> Date:   Wed Mar 10 09:16:54 2021 +0100
>>>>
>>>> pinctrl: core: Handling pinmux and pinconf separately
>>>>
>>>> The analysis is as follows:
>>>>
>>>> 1234 /**
>>>> 1235  * pinctrl_commit_state() - select/activate/program a pinctrl state
>>>> to HW
>>>> 1236  * @p: the pinctrl handle for the device that requests configuration
>>>> 1237  * @state: the state handle to select/activate/program
>>>> 1238  */
>>>> 1239 static int pinctrl_commit_state(struct pinctrl *p, struct
>>>> pinctrl_state *state)
>>>> 1240 {
>>>> 1241struct pinctrl_setting *setting, *setting2;
>>>> 1242struct pinctrl_state *old_state = p->state;
>>>>
>>>> 1. var_decl: Declaring variable ret without initializer.
>>>>
>>>> 1243int ret;
>>>> 1244
>>>>
>>>> 2. Condition p->state, taking true branch.
>>>>
>>>> 1245if (p->state) {
>>>> 1246/*
>>>> 1247 * For each pinmux setting in the old state, forget
>>>> SW's record
>>>> 1248 * of mux owner for that pingroup. Any pingroups
>>>> which are
>>>> 1249 * still owned by the new state will be re-acquired
>>>> by the call
>>>> 1250 * to pinmux_enable_setting() in the loop below.
>>>> 1251 */
>>>>
>>>> 3. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>>>> !__builtin_types_compatible_p()) */, taking false branch.
>>>> 4. Condition !(>node == >state->settings), taking true
>>>> branch.
>>>> 7. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>>>> !__builtin_types_compatible_p()) */, taking false branch.
>>>> 8. Condition !(>node == >state->settings), taking true
>>>> branch.
>>>> 11. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>>>> !__builtin_types_compatible_p()) */, taking false branch.
>>>> 12. Condition !(>node == >state->settings), taking false
>>>> branch.
>>>>
>>>> 1252list_for_each_entry(setting, >state->settings,
>>>> node) {
>>>>
>>>> 5. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
>>>> branch.
>>>> 9. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
>>>> branch.
>>>> 1253if (setting->type != PIN_MAP_TYPE_MUX_GROUP)
>>>> 6. Continuing loop.
>>>> 10. Continuing loop.
>>>>
>>>> 1254continue;
>>>> 1255pinmux_disable_setting(setting);
>>>> 1256}
>>>> 1257}
>>>> 1258
>>>> 1259p->state = NULL;
>>>> 1260
>>>> 1261/* Apply all the settings for the new state - pinmux first */
>>>>
>>>> 13. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>>>> !__builtin_types_compatible_p()) */, taking false branch.
>>>> 14. Condition !(>node == >settings), taking true 
>>>> branch.
>>>> 1262list_for_each_entry(setting, >settings, node) {
>>>> 15. Switch case value PIN_MAP_TYPE_CONFIGS_PIN.
>>>>
>>>> 1263switch (setting->type) {
>>>> 1264case PIN_MAP_TYPE_MUX_GROUP:
>>>> 1265ret = pinmux_enable_setting(setting);
>>>> 1266break;
>>>> 1267case PIN_MAP_TYPE_CONFIGS_PIN:
>>>> 1268case PIN_MAP_TYPE_CONFIGS_GROUP:
>>>>
>>>> 16. Breaking from switch.
>>>>
>>>> 1269

Re: pinctrl: core: Handling pinmux and pinconf separately

2021-03-11 Thread Colin Ian King

On 11/03/2021 11:16, Michal Simek wrote:
> 
> 
> On 3/11/21 11:57 AM, Colin Ian King wrote:
>> Hi,
>>
>> Static analysis on linux-next with Coverity has found a potential issue
>> in drivers/pinctrl/core.c with the following commit:
>>
>> commit 0952b7ec1614abf232e921aac0cc2bca8e60e162
>> Author: Michal Simek 
>> Date:   Wed Mar 10 09:16:54 2021 +0100
>>
>> pinctrl: core: Handling pinmux and pinconf separately
>>
>> The analysis is as follows:
>>
>> 1234 /**
>> 1235  * pinctrl_commit_state() - select/activate/program a pinctrl state
>> to HW
>> 1236  * @p: the pinctrl handle for the device that requests configuration
>> 1237  * @state: the state handle to select/activate/program
>> 1238  */
>> 1239 static int pinctrl_commit_state(struct pinctrl *p, struct
>> pinctrl_state *state)
>> 1240 {
>> 1241struct pinctrl_setting *setting, *setting2;
>> 1242struct pinctrl_state *old_state = p->state;
>>
>> 1. var_decl: Declaring variable ret without initializer.
>>
>> 1243int ret;
>> 1244
>>
>> 2. Condition p->state, taking true branch.
>>
>> 1245if (p->state) {
>> 1246/*
>> 1247 * For each pinmux setting in the old state, forget
>> SW's record
>> 1248 * of mux owner for that pingroup. Any pingroups
>> which are
>> 1249 * still owned by the new state will be re-acquired
>> by the call
>> 1250 * to pinmux_enable_setting() in the loop below.
>> 1251 */
>>
>> 3. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>> !__builtin_types_compatible_p()) */, taking false branch.
>> 4. Condition !(>node == >state->settings), taking true
>> branch.
>> 7. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>> !__builtin_types_compatible_p()) */, taking false branch.
>> 8. Condition !(>node == >state->settings), taking true
>> branch.
>> 11. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>> !__builtin_types_compatible_p()) */, taking false branch.
>> 12. Condition !(>node == >state->settings), taking false
>> branch.
>>
>> 1252list_for_each_entry(setting, >state->settings,
>> node) {
>>
>> 5. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
>> branch.
>> 9. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
>> branch.
>> 1253if (setting->type != PIN_MAP_TYPE_MUX_GROUP)
>> 6. Continuing loop.
>> 10. Continuing loop.
>>
>> 1254continue;
>> 1255pinmux_disable_setting(setting);
>> 1256}
>> 1257}
>> 1258
>> 1259p->state = NULL;
>> 1260
>> 1261/* Apply all the settings for the new state - pinmux first */
>>
>> 13. Condition 0 /* !!(!__builtin_types_compatible_p() &&
>> !__builtin_types_compatible_p()) */, taking false branch.
>> 14. Condition !(>node == >settings), taking true branch.
>> 1262list_for_each_entry(setting, >settings, node) {
>> 15. Switch case value PIN_MAP_TYPE_CONFIGS_PIN.
>>
>> 1263switch (setting->type) {
>> 1264case PIN_MAP_TYPE_MUX_GROUP:
>> 1265ret = pinmux_enable_setting(setting);
>> 1266break;
>> 1267case PIN_MAP_TYPE_CONFIGS_PIN:
>> 1268case PIN_MAP_TYPE_CONFIGS_GROUP:
>>
>> 16. Breaking from switch.
>>
>> 1269break;
>> 1270default:
>> 1271ret = -EINVAL;
>> 1272break;
>> 1273}
>> 1274
>>
>> Uninitialized scalar variable (UNINIT)
>> 17. uninit_use: Using uninitialized value ret.
>>
>> 1275if (ret < 0)
>> 1276goto unapply_new_state;
>>
>> For the PIN_MAP_TYPE_CONFIGS_PIN and PIN_MAP_TYPE_CONFIGS_GROUP
>> setting->type cases the loop can break out with ret not being set. Since
>> ret has not been initialized it the ret < 0 check is checking against an
>> uninitialized value.
>>
>> I was not sure if the PIN_MAP_TYPE_CONFIGS_PIN and
>> PIN_MAP_TYPE_CONFIGS_GROUP cases should be setting ret and if so, what
>> th

re: scsi: sg: Replace sg_allow_access()

2021-03-11 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity has detected an issue in
drivers/scsi/sg.c in function sg_remove_sfp_usercontext with the
following recent commit:

commit 0c32296d73ec5dec64729eb555f1a29ded8a7272
Author: Douglas Gilbert 
Date:   Fri Feb 19 21:00:28 2021 -0500

scsi: sg: Replace sg_allow_access()

The analysis is as follows:

3913if (unlikely(sfp != e_sfp))
3914SG_LOG(1, sfp, "%s: xa_erase() return unexpected\n",
3915   __func__);

deref_ptr_in_call: Dereferencing pointer sdp.

3916o_count = atomic_dec_return(>open_cnt);
3917SG_LOG(3, sfp, "%s: dev o_count after=%d: sfp=0x%pK --\n",
__func__,
3918   o_count, sfp);
3919kfree(sfp);
3920

Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking sdp suggests that it may be null,
but it has already been dereferenced on all paths leading to the check.

3921if (sdp) {
3922scsi_device_put(sdp->device);
3923kref_put(>d_ref, sg_device_destroy);
3924}

Line 3916 dereferences pointer sdp with >open_cnt, however later on
in line 3921 sdp is being null checked.  Either the null check is
redundant if sdp is never null or there is a potential null pointer
dereference on line 3916.

Colin

re: pinctrl: core: Handling pinmux and pinconf separately

2021-03-11 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity has found a potential issue
in drivers/pinctrl/core.c with the following commit:

commit 0952b7ec1614abf232e921aac0cc2bca8e60e162
Author: Michal Simek 
Date:   Wed Mar 10 09:16:54 2021 +0100

pinctrl: core: Handling pinmux and pinconf separately

The analysis is as follows:

1234 /**
1235  * pinctrl_commit_state() - select/activate/program a pinctrl state
to HW
1236  * @p: the pinctrl handle for the device that requests configuration
1237  * @state: the state handle to select/activate/program
1238  */
1239 static int pinctrl_commit_state(struct pinctrl *p, struct
pinctrl_state *state)
1240 {
1241struct pinctrl_setting *setting, *setting2;
1242struct pinctrl_state *old_state = p->state;

1. var_decl: Declaring variable ret without initializer.

1243int ret;
1244

2. Condition p->state, taking true branch.

1245if (p->state) {
1246/*
1247 * For each pinmux setting in the old state, forget
SW's record
1248 * of mux owner for that pingroup. Any pingroups
which are
1249 * still owned by the new state will be re-acquired
by the call
1250 * to pinmux_enable_setting() in the loop below.
1251 */

3. Condition 0 /* !!(!__builtin_types_compatible_p() &&
!__builtin_types_compatible_p()) */, taking false branch.
4. Condition !(>node == >state->settings), taking true
branch.
7. Condition 0 /* !!(!__builtin_types_compatible_p() &&
!__builtin_types_compatible_p()) */, taking false branch.
8. Condition !(>node == >state->settings), taking true
branch.
11. Condition 0 /* !!(!__builtin_types_compatible_p() &&
!__builtin_types_compatible_p()) */, taking false branch.
12. Condition !(>node == >state->settings), taking false
branch.

1252list_for_each_entry(setting, >state->settings,
node) {

5. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
branch.
9. Condition setting->type != PIN_MAP_TYPE_MUX_GROUP, taking true
branch.
1253if (setting->type != PIN_MAP_TYPE_MUX_GROUP)
6. Continuing loop.
10. Continuing loop.

1254continue;
1255pinmux_disable_setting(setting);
1256}
1257}
1258
1259p->state = NULL;
1260
1261/* Apply all the settings for the new state - pinmux first */

13. Condition 0 /* !!(!__builtin_types_compatible_p() &&
!__builtin_types_compatible_p()) */, taking false branch.
14. Condition !(>node == >settings), taking true branch.
1262list_for_each_entry(setting, >settings, node) {
15. Switch case value PIN_MAP_TYPE_CONFIGS_PIN.

1263switch (setting->type) {
1264case PIN_MAP_TYPE_MUX_GROUP:
1265ret = pinmux_enable_setting(setting);
1266break;
1267case PIN_MAP_TYPE_CONFIGS_PIN:
1268case PIN_MAP_TYPE_CONFIGS_GROUP:

16. Breaking from switch.

1269break;
1270default:
1271ret = -EINVAL;
1272break;
1273}
1274

Uninitialized scalar variable (UNINIT)
17. uninit_use: Using uninitialized value ret.

1275if (ret < 0)
1276goto unapply_new_state;

For the PIN_MAP_TYPE_CONFIGS_PIN and PIN_MAP_TYPE_CONFIGS_GROUP
setting->type cases the loop can break out with ret not being set. Since
ret has not been initialized it the ret < 0 check is checking against an
uninitialized value.

I was not sure if the PIN_MAP_TYPE_CONFIGS_PIN and
PIN_MAP_TYPE_CONFIGS_GROUP cases should be setting ret and if so, what
the value of ret should be set to (is it an error condition or not?). Or
should ret be initialized to 0 or a default error value at the start of
the function.

Hence I'm reporting this issue.

Colin

re: scsi: sg: NO_DXFER move to/from kernel buffers

2021-03-11 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity has detected an issue in
drivers/scsi/sg.c with the following recent commit:

commit b32ac463cb59e758b4560260fd168a2b4ea6e81a
Author: Douglas Gilbert 
Date:   Fri Feb 19 21:00:54 2021 -0500

scsi: sg: NO_DXFER move to/from kernel buffers

The analysis is as follows:

2973 sg_rq_map_kern(struct sg_request *srp, struct request_queue *q,
struct request *rqq, int rw_ind)
2974 {
2975struct sg_scatter_hold *schp = >sgat_h;
2976struct bio *bio;

1. var_decl: Declaring variable k without initializer.

2977int k, ln;
2978int op_flags = 0;
2979int num_sgat = schp->num_sgat;
2980int dlen = schp->dlen;
2981int pg_sz = 1 << (PAGE_SHIFT + schp->page_order);
2982int num_segs = (1 << schp->page_order) * num_sgat;
2983int res = 0;
2984

2. Condition _sdp, taking true branch.
3. Condition _sdp->disk, taking true branch.
4. Condition !!(_sdp && _sdp->disk), taking true branch.
5. Condition !!(((scsi_logging_level >> 3) & 7U /* (1 << 3) - 1 */)
> 4), taking true branch.
6. Condition !!(((scsi_logging_level >> 3) & 7U /* (1 << 3) - 1 */)
> 4), taking true branch.
7. Falling through to end of if statement.

2985SG_LOG(4, srp->parentfp, "%s: dlen=%d, pg_sz=%d\n",
__func__, dlen, pg_sz);

8. Condition num_sgat <= 0, taking false branch.

2986if (num_sgat <= 0)
2987return 0;

9. Condition rw_ind == 1, taking true branch.

2988if (rw_ind == WRITE)
2989op_flags = REQ_SYNC | REQ_IDLE;
Uninitialized scalar variable
10. uninit_use: Using uninitialized value k.

2990bio = sg_mk_kern_bio(num_sgat - k);
2991if (!bio)

Variable k is not initialized, however it is being read when it contains
a garbage value.

Colin

[PATCH 3/3] perf test: Add 30s timeout for wait for daemon start.

2021-03-10 Thread Ian Rogers

Retry the ping loop upto 600 times, or approximately 30 seconds, to make
sure the test does hang at start up.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index a02cedc76de6..cb831ff2c185 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -127,9 +127,16 @@ daemon_start()
 
# wait for the session to ping
local state="FAIL"
+   local retries=0
while [ "${state}" != "OK" ]; do
state=`perf daemon ping --config ${config} --session ${session} 
| awk '{ print $1 }'`
sleep 0.05
+   retries=$((${retries} +1))
+   if [ ${retries} -ge 600 ]; then
+   echo "Timeout waiting for daemon to ping"
+   daemon_exit ${config}
+   exit 62
+   fi
done
 }
 
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH 1/3] perf test: Remove unused argument

2021-03-10 Thread Ian Rogers

Remove unused argument from daemon_exit.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index 5ad3ca8d681b..66ad56b4e0a5 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -115,8 +115,7 @@ daemon_start()
 
 daemon_exit()
 {
-   local base=$1
-   local config=$2
+   local config=$1
 
local line=`perf daemon --config ${config} -x: | head -1`
local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
@@ -171,7 +170,7 @@ EOF
 ${base}/session-time/ack "0"
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -288,7 +287,7 @@ EOF
done
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -333,7 +332,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
# check that sessions are gone
if [ -d "/proc/${pid_size}" ]; then
@@ -374,7 +373,7 @@ EOF
perf daemon signal --config ${config}
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
# count is 2 perf.data for signals and 1 for perf record finished
count=`ls ${base}/session-test/ | grep perf.data | wc -l`
@@ -420,7 +419,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
@@ -457,7 +456,7 @@ EOF
fi
 
# stop daemon
-   daemon_exit ${base} ${config}
+   daemon_exit ${config}
 
rm -rf ${base}
rm -f ${config}
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH 2/3] perf test: Cleanup daemon if test is interrupted.

2021-03-10 Thread Ian Rogers

Reorder daemon_start and daemon_exit as the trap handler is added in
daemon_start referencing daemon_exit.

Signed-off-by: Ian Rogers 
---
 tools/perf/tests/shell/daemon.sh | 34 +++-
 1 file changed, 20 insertions(+), 14 deletions(-)

diff --git a/tools/perf/tests/shell/daemon.sh b/tools/perf/tests/shell/daemon.sh
index 66ad56b4e0a5..a02cedc76de6 100755
--- a/tools/perf/tests/shell/daemon.sh
+++ b/tools/perf/tests/shell/daemon.sh
@@ -98,6 +98,23 @@ check_line_other()
fi
 }
 
+daemon_exit()
+{
+   local config=$1
+
+   local line=`perf daemon --config ${config} -x: | head -1`
+   local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
+
+# Reset trap handler.
+trap - SIGINT SIGTERM
+
+   # stop daemon
+   perf daemon stop --config ${config}
+
+   # ... and wait for the pid to go away
+   tail --pid=${pid} -f /dev/null
+}
+
 daemon_start()
 {
local config=$1
@@ -105,6 +122,9 @@ daemon_start()
 
perf daemon start --config ${config}
 
+# Clean up daemon if interrupted.
+trap "daemon_exit ${config}; exit 4" SIGINT SIGTERM
+
# wait for the session to ping
local state="FAIL"
while [ "${state}" != "OK" ]; do
@@ -113,20 +133,6 @@ daemon_start()
done
 }
 
-daemon_exit()
-{
-   local config=$1
-
-   local line=`perf daemon --config ${config} -x: | head -1`
-   local pid=`echo "${line}" | awk 'BEGIN { FS = ":" } ; { print $1 }'`
-
-   # stop daemon
-   perf daemon stop --config ${config}
-
-   # ... and wait for the pid to go away
-   tail --pid=${pid} -f /dev/null
-}
-
 test_list()
 {
echo "test daemon list"
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH] perf synthetic events: Avoid write of uninitialized memory.

2021-03-09 Thread Ian Rogers

Account for alignment bytes in the zero-ing memset.

Signed-off-by: Ian Rogers 
---
 tools/perf/util/synthetic-events.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/tools/perf/util/synthetic-events.c 
b/tools/perf/util/synthetic-events.c
index b698046ec2db..31bf3dd6a1e0 100644
--- a/tools/perf/util/synthetic-events.c
+++ b/tools/perf/util/synthetic-events.c
@@ -424,7 +424,7 @@ int perf_event__synthesize_mmap_events(struct perf_tool 
*tool,
 
while (!io.eof) {
static const char anonstr[] = "//anon";
-   size_t size;
+   size_t size, aligned_size;
 
/* ensure null termination since stack will be reused. */
event->mmap2.filename[0] = '\0';
@@ -484,11 +484,12 @@ int perf_event__synthesize_mmap_events(struct perf_tool 
*tool,
}
 
size = strlen(event->mmap2.filename) + 1;
-   size = PERF_ALIGN(size, sizeof(u64));
+   aligned_size = PERF_ALIGN(size, sizeof(u64));
event->mmap2.len -= event->mmap.start;
event->mmap2.header.size = (sizeof(event->mmap2) -
-   (sizeof(event->mmap2.filename) - size));
-   memset(event->mmap2.filename + size, 0, machine->id_hdr_size);
+   (sizeof(event->mmap2.filename) - 
aligned_size));
+   memset(event->mmap2.filename + size, 0, machine->id_hdr_size +
+   (aligned_size - size));
event->mmap2.header.size += machine->id_hdr_size;
event->mmap2.pid = tgid;
event->mmap2.tid = pid;
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH] tools include: Add sum16 and wsum definitions.

2021-03-07 Thread Ian Rogers

This adds definitions available in the uapi version.

Explanation:
In the kernel include of types.h the uapi version is included.
In tools the uapi/linux/types.h and linux/types.h are distinct.
For BPF programs a definition of __wsum is needed by the generated
bpf_helpers.h. The definition comes either from a generated vmlinux.h or
from  that may be transitively included from bpf.h. The
perf build prefers linux/types.h over uapi/linux/types.h for
*. To allow tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
to compile with the same include path used for perf then these
definitions are necessary.

There is likely a wider conversation about exactly how types.h should be
specified and the include order used by the perf build - it is somewhat
confusing that tools/include/uapi/linux/bpf.h is using the non-uapi
types.h.

*see tools/perf/Makefile.config:
...
INC_FLAGS += -I$(srctree)/tools/include/
INC_FLAGS += -I$(srctree)/tools/arch/$(SRCARCH)/include/uapi
INC_FLAGS += -I$(srctree)/tools/include/uapi
...
The include directories are scanned from left-to-right:
https://gcc.gnu.org/onlinedocs/gcc/Directory-Options.html
As tools/include/linux/types.h appears before
tools/include/uapi/linux/types.h then I say it is preferred.

Signed-off-by: Ian Rogers 
---
 tools/include/linux/types.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/tools/include/linux/types.h b/tools/include/linux/types.h
index e9c5a215837d..6e14a533ab4e 100644
--- a/tools/include/linux/types.h
+++ b/tools/include/linux/types.h
@@ -61,6 +61,9 @@ typedef __u32 __bitwise __be32;
 typedef __u64 __bitwise __le64;
 typedef __u64 __bitwise __be64;
 
+typedef __u16 __bitwise __sum16;
+typedef __u32 __bitwise __wsum;
+
 typedef struct {
int counter;
 } atomic_t;
-- 
2.30.1.766.gb4fecdf3b7-goog

Re: [RFC PATCH] autofs: find_autofs_mount overmounted parent support

2021-03-06 Thread Ian Kent

On Fri, 2021-03-05 at 14:55 +0300, Alexander Mikhalitsyn wrote:
> On Fri, 05 Mar 2021 18:10:02 +0800
> Ian Kent  wrote:
> 
> > On Thu, 2021-03-04 at 13:11 +0300, Alexander Mikhalitsyn wrote:
> > > On Thu, 04 Mar 2021 14:54:11 +0800
> > > Ian Kent  wrote:
> > > 
> > > > On Wed, 2021-03-03 at 18:28 +0300, Alexander Mikhalitsyn wrote:
> > > > > It was discovered that find_autofs_mount() function
> > > > > in autofs not support cases when autofs mount
> > > > > parent is overmounted. In this case this function will
> > > > > always return -ENOENT.
> > > > 
> > > > Ok, I get this shouldn't happen.
> > > > 
> > > > > Real-life reproducer is fairly simple.
> > > > > Consider the following mounts on root mntns:
> > > > > --
> > > > > 35 24 0:36 / /proc/sys/fs/binfmt_misc ... shared:16 - autofs
> > > > > systemd-
> > > > > 1 ...
> > > > > 654 35 0:57 / /proc/sys/fs/binfmt_misc ... shared:322 -
> > > > > binfmt_misc
> > > > > ...
> > > > > --
> > > > > and some process which calls
> > > > > ioctl(AUTOFS_DEV_IOCTL_OPENMOUNT)
> > > > > $ unshare -m -p --fork --mount-proc ./process-bin
> > > > > 
> > > > > Due to "mount-proc" /proc will be overmounted and
> > > > > ioctl() will fail with -ENOENT
> > > > 
> > > > I think I need a better explanation ...
> > > 
> > > Thank you for the quick reply, Ian.
> > > I'm sorry If my patch description was not sufficiently clear and
> > > detailed.
> > > 
> > > That problem connected with CRIU (Checkpoint-Restore in
> > > Userspace)
> > > project.
> > > In CRIU we have support of autofs mounts C/R. To acheive that we
> > > need
> > > to use
> > > ioctl's from /dev/autofs to get data about mounts, restore mount
> > > as
> > > catatonic
> > > (if needed), change pipe fd and so on. But the problem is that
> > > during
> > > CRIU
> > > dump we may meet situation when VFS subtree where autofs mount
> > > present was
> > > overmounted as whole.
> > > 
> > > Simpliest example is /proc/sys/fs/binfmt_misc. This mount present
> > > on
> > > most
> > > GNU/Linux distributions by default. For instance on my Fedora 33:
> > 
> > Yes, I don't know why systemd uses this direct mount, there must
> > have been a reason for it.
> > 
> > > trigger automount of binfmt_misc
> > > $ ls /proc/sys/fs/binfmt_misc
> > > 
> > > $ cat /proc/1/mountinfo | grep binfmt
> > > 35 24 0:36 / /proc/sys/fs/binfmt_misc rw,relatime shared:16 -
> > > autofs
> > > systemd-1 rw,...,direct,pipe_ino=223
> > > 632 35 0:56 / /proc/sys/fs/binfmt_misc rw,...,relatime shared:315
> > > -
> > > binfmt_misc binfmt_misc rw
> > 
> > Yes, I think this looks normal.
> > 
> > > $ sudo unshare -m -p --fork --mount-proc sh
> > > # cat /proc/self/mountinfo | grep "/proc"
> > > 828 809 0:23 / /proc rw,nosuid,nodev,noexec,relatime - proc proc
> > > rw
> > > 829 828 0:36 / /proc/sys/fs/binfmt_misc rw,relatime - autofs
> > > systemd-
> > > 1 rw,...,direct,pipe_ino=223
> > > 943 829 0:56 / /proc/sys/fs/binfmt_misc rw,...,relatime -
> > > binfmt_misc
> > > binfmt_misc rw
> > > 949 828 0:57 / /proc rw...,relatime - proc proc rw
> > 
> > Isn't this screwed up, /proc is on top of the binfmt_misc mount ...
> > 
> > Is this what's seen from the root namespace?
> 
> No-no, after issuing
> $ sudo unshare -m -p --fork --mount-proc sh
> 
> we enter to the pid+mount namespace and:
> 
> # cat /proc/self/mountinfo | grep "/proc"
> 
> So, it's picture from inside namespaces.

Ok, so potentially some of those have been propagated from the
original mount namespace.

It seems to me the sensible thing would be those mounts would
not propagate when a new proc has been requested. It doesn't
make sense to me to carry around mounts that are not accessible
because of something requested by the mount namespace creator.

But that's nothing new and isn't likely to change any time soon.

> 
> > > As we can see now autofs mount /proc/sys/fs/binfmt_misc is
> > > inaccessible.
> > > If we do something like:
> > > 
> > > struct autofs_dev_ioctl *param;
> > > param = malloc(...);
> >

[PATCH 1/3] perf skel: Remove some unused variables.

2021-03-06 Thread Ian Rogers

Fixes -Wall warnings.

Signed-off-by: Ian Rogers 
---
 tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c 
b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
index c7cec92d0236..ab12b4c4ece2 100644
--- a/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
+++ b/tools/perf/util/bpf_skel/bpf_prog_profiler.bpf.c
@@ -52,7 +52,7 @@ int BPF_PROG(fentry_XXX)
 static inline void
 fexit_update_maps(struct bpf_perf_event_value *after)
 {
-   struct bpf_perf_event_value *before, diff, *accum;
+   struct bpf_perf_event_value *before, diff;
__u32 zero = 0;
 
before = bpf_map_lookup_elem(_readings, );
@@ -78,7 +78,6 @@ int BPF_PROG(fexit_XXX)
 {
struct bpf_perf_event_value reading;
__u32 cpu = bpf_get_smp_processor_id();
-   __u32 one = 1, zero = 0;
int err;
 
/* read all events before updating the maps, to reduce error */
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH 2/3] perf tool: Enable warnings when compiling BPF programs

2021-03-06 Thread Ian Rogers

Add -Wall -Werror when compiling BPF skeletons.

Signed-off-by: Ian Rogers 
---
 tools/perf/Makefile.perf | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 5345ac70cd83..f43d2551f3de 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -1029,7 +1029,7 @@ $(BPFTOOL): | $(SKEL_TMP_OUT)
OUTPUT=$(SKEL_TMP_OUT)/ bootstrap
 
 $(SKEL_TMP_OUT)/%.bpf.o: util/bpf_skel/%.bpf.c $(LIBBPF) | $(SKEL_TMP_OUT)
-   $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf $(BPF_INCLUDE) \
+   $(QUIET_CLANG)$(CLANG) -g -O2 -target bpf -Wall -Werror $(BPF_INCLUDE) \
  -c $(filter util/bpf_skel/%.bpf.c,$^) -o $@ && $(LLVM_STRIP) -g $@
 
 $(SKEL_OUT)/%.skel.h: $(SKEL_TMP_OUT)/%.bpf.o | $(BPFTOOL)
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH 3/3] perf bpf: Minor whitespace cleanup.

2021-03-06 Thread Ian Rogers

Missed space after #include.

Signed-off-by: Ian Rogers 
---
 tools/perf/util/bpf_counter.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/bpf_counter.h b/tools/perf/util/bpf_counter.h
index 2eca210e5dc1..cb9c532e0a07 100644
--- a/tools/perf/util/bpf_counter.h
+++ b/tools/perf/util/bpf_counter.h
@@ -38,7 +38,7 @@ int bpf_counter__install_pe(struct evsel *evsel, int cpu, int 
fd);
 
 #else /* HAVE_BPF_SKEL */
 
-#include
+#include 
 
 static inline int bpf_counter__load(struct evsel *evsel __maybe_unused,
struct target *target __maybe_unused)
-- 
2.30.1.766.gb4fecdf3b7-goog

Re: [RFC PATCH] autofs: find_autofs_mount overmounted parent support

2021-03-05 Thread Ian Kent

On Thu, 2021-03-04 at 13:11 +0300, Alexander Mikhalitsyn wrote:
> On Thu, 04 Mar 2021 14:54:11 +0800
> Ian Kent  wrote:
> 
> > On Wed, 2021-03-03 at 18:28 +0300, Alexander Mikhalitsyn wrote:
> > > It was discovered that find_autofs_mount() function
> > > in autofs not support cases when autofs mount
> > > parent is overmounted. In this case this function will
> > > always return -ENOENT.
> > 
> > Ok, I get this shouldn't happen.
> > 
> > > Real-life reproducer is fairly simple.
> > > Consider the following mounts on root mntns:
> > > --
> > > 35 24 0:36 / /proc/sys/fs/binfmt_misc ... shared:16 - autofs
> > > systemd-
> > > 1 ...
> > > 654 35 0:57 / /proc/sys/fs/binfmt_misc ... shared:322 -
> > > binfmt_misc
> > > ...
> > > --
> > > and some process which calls ioctl(AUTOFS_DEV_IOCTL_OPENMOUNT)
> > > $ unshare -m -p --fork --mount-proc ./process-bin
> > > 
> > > Due to "mount-proc" /proc will be overmounted and
> > > ioctl() will fail with -ENOENT
> > 
> > I think I need a better explanation ...
> 
> Thank you for the quick reply, Ian.
> I'm sorry If my patch description was not sufficiently clear and
> detailed.
> 
> That problem connected with CRIU (Checkpoint-Restore in Userspace)
> project.
> In CRIU we have support of autofs mounts C/R. To acheive that we need
> to use
> ioctl's from /dev/autofs to get data about mounts, restore mount as
> catatonic
> (if needed), change pipe fd and so on. But the problem is that during
> CRIU
> dump we may meet situation when VFS subtree where autofs mount
> present was
> overmounted as whole.
> 
> Simpliest example is /proc/sys/fs/binfmt_misc. This mount present on
> most
> GNU/Linux distributions by default. For instance on my Fedora 33:

Yes, I don't know why systemd uses this direct mount, there must
have been a reason for it.

> 
> trigger automount of binfmt_misc
> $ ls /proc/sys/fs/binfmt_misc
> 
> $ cat /proc/1/mountinfo | grep binfmt
> 35 24 0:36 / /proc/sys/fs/binfmt_misc rw,relatime shared:16 - autofs
> systemd-1 rw,...,direct,pipe_ino=223
> 632 35 0:56 / /proc/sys/fs/binfmt_misc rw,...,relatime shared:315 -
> binfmt_misc binfmt_misc rw

Yes, I think this looks normal.

> 
> $ sudo unshare -m -p --fork --mount-proc sh
> # cat /proc/self/mountinfo | grep "/proc"
> 828 809 0:23 / /proc rw,nosuid,nodev,noexec,relatime - proc proc rw
> 829 828 0:36 / /proc/sys/fs/binfmt_misc rw,relatime - autofs systemd-
> 1 rw,...,direct,pipe_ino=223
> 943 829 0:56 / /proc/sys/fs/binfmt_misc rw,...,relatime - binfmt_misc
> binfmt_misc rw
> 949 828 0:57 / /proc rw...,relatime - proc proc rw

Isn't this screwed up, /proc is on top of the binfmt_misc mount ...

Is this what's seen from the root namespace?

> 
> As we can see now autofs mount /proc/sys/fs/binfmt_misc is
> inaccessible.
> If we do something like:
> 
> struct autofs_dev_ioctl *param;
> param = malloc(...);
> devfd = open("/dev/autofs", O_RDONLY);
> init_autofs_dev_ioctl(param);
> param->size = size;
> strcpy(param->path, "/proc/sys/fs/binfmt_misc");
> param->openmount.devid = 36;
> err = ioctl(devfd, AUTOFS_DEV_IOCTL_OPENMOUNT, param)
> 
> now we get err = -ENOENT.

Maybe that should be EINVAL, not sure about cases though.

> 
> > What's being said here?
> > 
> > For a start your talking about direct mounts, I'm pretty sure this
> > use case can't occur with indirect mounts in the sense that the
> > indirect mount base should/must never be over mounted and IIRC that
> > base can't be /proc (but maybe that's just mounts inside proc ...),
> > can't remember now but from a common sense POV an indirect mount
> > won't/can't be on /proc.
> > 
> > And why is this ioctl be called?
> 
> We call this ioctl during criu dump stage to open fd from autofs
> mount dentry. This fd is used later to call
> ioctl(AUTOFS_IOC_CATATONIC)
> (we do that on criu dump if we see that control process of autofs
> mount
> is dead or pipe is dead).

Right so your usage "is" the way it's intended, ;)

> 
> > If the mount is over mounted should that prevent expiration of the
> > over mounted /proc anyway, so maybe the return is correct ... or
> > not ...
> 
> I agree that case with overmounted subtree with autofs mount is weird
> case.
> But it may be easily created by user and we in CRIU try to handle
> that.

I'm not yet ready to make a call on how I think this this should
be done.

Since you seem to be clear on what this should be used for I'll
need to look more closely at the patch.

But, at fir

net: mscc: ocelot: issue with uninitialized pointer read in ocelot_flower_parse_key

2021-03-04 Thread Colin Ian King

Hi,

Static analysis with Coverity had detected an uninitialized pointer read
in function ocelot_flower_parse_key in
drivers/net/ethernet/mscc/ocelot_flower.c introduced by commit:

commit 75944fda1dfe836fdd406bef6cb3cc8a80f7af83
Author: Xiaoliang Yang 
Date:   Fri Oct 2 15:02:23 2020 +0300

net: mscc: ocelot: offload ingress skbedit and vlan actions to VCAP IS1

The analysis is as follows:

531

   10. Condition flow_rule_match_key(rule,
FLOW_DISSECTOR_KEY_IPV4_ADDRS), taking true branch.
   11. Condition proto == 2048, taking true branch.

532if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS) &&
533proto == ETH_P_IP) {

   12. var_decl: Declaring variable match without initializer.

534struct flow_match_ipv4_addrs match;
535u8 *tmp;
536

   13. Condition filter->block_id == VCAP_ES0, taking false branch.

537if (filter->block_id == VCAP_ES0) {
538NL_SET_ERR_MSG_MOD(extack,
539   "VCAP ES0 cannot match on
IP address");
540return -EOPNOTSUPP;
541}
542

   14. Condition filter->block_id == VCAP_IS1, taking true branch.
   Uninitialized pointer read (UNINIT)
   15. uninit_use: Using uninitialized value match.mask.

543if (filter->block_id == VCAP_IS1 && *(u32
*)>dst) {
544NL_SET_ERR_MSG_MOD(extack,
545   "Key type S1_NORMAL cannot
match on destination IP");
546return -EOPNOTSUPP;
547}

match is declared in line 534 and is not initialized and the
uninitialized match.mask is being dereferenced on line 543. Not sure
what intent was on this and how to fix, hence I'm reporting this issue.

Colin

Re: [RFC PATCH] autofs: find_autofs_mount overmounted parent support

2021-03-03 Thread Ian Kent

On Wed, 2021-03-03 at 18:28 +0300, Alexander Mikhalitsyn wrote:
> It was discovered that find_autofs_mount() function
> in autofs not support cases when autofs mount
> parent is overmounted. In this case this function will
> always return -ENOENT.

Ok, I get this shouldn't happen.

> 
> Real-life reproducer is fairly simple.
> Consider the following mounts on root mntns:
> --
> 35 24 0:36 / /proc/sys/fs/binfmt_misc ... shared:16 - autofs systemd-
> 1 ...
> 654 35 0:57 / /proc/sys/fs/binfmt_misc ... shared:322 - binfmt_misc
> ...
> --
> and some process which calls ioctl(AUTOFS_DEV_IOCTL_OPENMOUNT)
> $ unshare -m -p --fork --mount-proc ./process-bin
> 
> Due to "mount-proc" /proc will be overmounted and
> ioctl() will fail with -ENOENT

I think I need a better explanation ...

What's being said here?

For a start your talking about direct mounts, I'm pretty sure this
use case can't occur with indirect mounts in the sense that the
indirect mount base should/must never be over mounted and IIRC that
base can't be /proc (but maybe that's just mounts inside proc ...),
can't remember now but from a common sense POV an indirect mount
won't/can't be on /proc.

And why is this ioctl be called?

If the mount is over mounted should that prevent expiration of the
over mounted /proc anyway, so maybe the return is correct ... or
not ...

I get that the mount namespaces should be independent and intuitively
this is a bug but what is the actual use and expected result.

But anyway, aren't you saying that the VFS path walk isn't handling
mount namespaces properly or are you saying that a process outside
this new mount namespace becomes broken because of it?

Either way the solution looks more complicated than I'd expect so
some explanation along these lines would be good.

Ian
> 
> Cc: Matthew Wilcox 
> Cc: Al Viro 
> Cc: Pavel Tikhomirov 
> Cc: Kirill Tkhai 
> Cc: aut...@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Alexander Mikhalitsyn <
> alexander.mikhalit...@virtuozzo.com>
> ---
>  fs/autofs/dev-ioctl.c | 127 +---
> --
>  fs/namespace.c|  44 +++
>  include/linux/mount.h |   5 ++
>  3 files changed, 162 insertions(+), 14 deletions(-)
> 
> diff --git a/fs/autofs/dev-ioctl.c b/fs/autofs/dev-ioctl.c
> index 5bf781ea6d67..55edd3eba8ce 100644
> --- a/fs/autofs/dev-ioctl.c
> +++ b/fs/autofs/dev-ioctl.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include "autofs_i.h"
>  
> @@ -179,32 +180,130 @@ static int autofs_dev_ioctl_protosubver(struct
> file *fp,
>   return 0;
>  }
>  
> +struct filter_autofs_data {
> + char *pathbuf;
> + const char *fpathname;
> + int (*test)(const struct path *path, void *data);
> + void *data;
> +};
> +
> +static int filter_autofs(const struct path *path, void *p)
> +{
> + struct filter_autofs_data *data = p;
> + char *name;
> + int err;
> +
> + if (path->mnt->mnt_sb->s_magic != AUTOFS_SUPER_MAGIC)
> + return 0;
> +
> + name = d_path(path, data->pathbuf, PATH_MAX);
> + if (IS_ERR(name)) {
> + err = PTR_ERR(name);
> + pr_err("d_path failed, errno %d\n", err);
> + return 0;
> + }
> +
> + if (strncmp(data->fpathname, name, PATH_MAX))
> + return 0;
> +
> + if (!data->test(path, data->data))
> + return 0;
> +
> + return 1;
> +}
> +
>  /* Find the topmost mount satisfying test() */
>  static int find_autofs_mount(const char *pathname,
>struct path *res,
>int test(const struct path *path, void
> *data),
>void *data)
>  {
> - struct path path;
> + struct filter_autofs_data mdata = {
> + .pathbuf = NULL,
> + .test = test,
> + .data = data,
> + };
> + struct mnt_namespace *mnt_ns = current->nsproxy->mnt_ns;
> + struct path path = {};
> + char *fpathbuf = NULL;
>   int err;
>  
> + /*
> +  * In most cases user will provide full path to autofs mount
> point
> +  * as it is in /proc/X/mountinfo. But if not, then we need to
> +  * open provided relative path and calculate full path.
> +  * It will not work in case when parent mount of autofs mount
> +  * is overmounted:
> +  * cd /root
> +  * ./autofs_mount /root/autofs_yard/mnt
> +  * mount -t tmpfs tmpfs /root/autofs_yard/mnt
> +  * mount -t tmpfs tmpfs /root/autofs_yard
> +  * ./call_ioctl /root/autofs_yard/

Re: [PATCH] usb: dwc3: Fix dereferencing of null dwc->usb_psy

2021-03-03 Thread Colin Ian King

On 03/03/2021 21:29, Heiko Thiery wrote:
> Hi all,
> 
>> On Wed, Mar 3, 2021 at 6:00 PM Colin King  wrote:
>>>
>>> From: Colin Ian King 
>>>
>>> Currently the null check logic on dwc->usb_psy is inverted as it allows
>>> calls to power_supply_put with a null dwc->usb_psy causing a null
>>> pointer dereference. Fix this by removing the ! operator.
>>>
>>> Addresses-Coverity: ("Dereference after null check")
>>> Fixes: 59fa3def35de ("usb: dwc3: add a power supply for current control")
>>
>> Acked-by: Ray Chi 
>>
>>> Signed-off-by: Colin Ian King 
> 
> Tested-by: Heiko Thiery 

Thanks for testing. Much appreciated.

Colin
> 
>>> ---
>>>  drivers/usb/dwc3/core.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
>>> index d15f065849cd..94fdbe502ce9 100644
>>> --- a/drivers/usb/dwc3/core.c
>>> +++ b/drivers/usb/dwc3/core.c
>>> @@ -1628,7 +1628,7 @@ static int dwc3_probe(struct platform_device *pdev)
>>>  assert_reset:
>>> reset_control_assert(dwc->reset);
>>>
>>> -   if (!dwc->usb_psy)
>>> +   if (dwc->usb_psy)
>>> power_supply_put(dwc->usb_psy);
>>>
>>> return ret;
>>> @@ -1653,7 +1653,7 @@ static int dwc3_remove(struct platform_device *pdev)
>>> dwc3_free_event_buffers(dwc);
>>> dwc3_free_scratch_buffers(dwc);
>>>
>>> -   if (!dwc->usb_psy)
>>> +   if (dwc->usb_psy)
>>> power_supply_put(dwc->usb_psy);
>>>
>>> return 0;
>>> --
>>> 2.30.0
>>>
> 
> Thank you.
>

Re: f2fs_convert_inline_inode causing rebalance based on random uninitialized value in dn.node_changed

2021-03-03 Thread Colin Ian King

On 03/03/2021 19:44, Jaegeuk Kim wrote:
> On 03/02, Colin Ian King wrote:
>> Hi,
>>
>> Static analysis on linux-next detected a potential uninitialized
>> variable dn.node_changed that does not get set when a call to
>> f2fs_get_node_page() fails.  This uninitialized value gets used in the
>> call to f2fs_balance_fs() that may or not may not balances dirty node
>> and dentry pages depending on the uninitialized state of the variable.
>>
>> I believe the issue was introduced by commit:
>>
>> commit 2a3407607028f7c780f1c20faa4e922bf631d340
>> Author: Jaegeuk Kim 
>> Date:   Tue Dec 22 13:23:35 2015 -0800
>>
>> f2fs: call f2fs_balance_fs only when node was changed
>>
>>
>> The analysis is a follows:
>>
>> 184 int f2fs_convert_inline_inode(struct inode *inode)
>> 185 {
>> 186struct f2fs_sb_info *sbi = F2FS_I_SB(inode);
>>
>>1. var_decl: Declaring variable dn without initializer.
>>
>> 187struct dnode_of_data dn;
>>
>>NOTE dn is not initialized here.
>>
>> 188struct page *ipage, *page;
>> 189int err = 0;
>> 190
>>
>>2. Condition !f2fs_has_inline_data(inode), taking false branch.
>>3. Condition f2fs_hw_is_readonly(sbi), taking false branch.
>>4. Condition f2fs_readonly(sbi->sb), taking false branch.
>>
>> 191if (!f2fs_has_inline_data(inode) ||
>> 192f2fs_hw_is_readonly(sbi) ||
>> f2fs_readonly(sbi->sb))
>> 193return 0;
>> 194
>> 195err = dquot_initialize(inode);
>>
>>5. Condition err, taking false branch.
>>
>> 196if (err)
>> 197return err;
>> 198
>> 199page = f2fs_grab_cache_page(inode->i_mapping, 0, false);
>>
>>6. Condition !page, taking false branch.
>>
>> 200if (!page)
>> 201return -ENOMEM;
>> 202
>> 203f2fs_lock_op(sbi);
>> 204
>> 205ipage = f2fs_get_node_page(sbi, inode->i_ino);
>>
>>7. Condition IS_ERR(ipage), taking true branch.
>>
>> 206if (IS_ERR(ipage)) {
>> 207err = PTR_ERR(ipage);
>>
>>8. Jumping to label out.
>>
>> 208goto out;
>> 209}
>> 210
>>
>>NOTE: set_new_dnode memset's dn so sets the flag to false, but we
>> don't get to this memset if IS_ERR(ipage) above is true.
>>
>> 211set_new_dnode(, inode, ipage, ipage, 0);
>> 212
>> 213if (f2fs_has_inline_data(inode))
>> 214err = f2fs_convert_inline_page(, page);
>> 215
>> 216f2fs_put_dnode();
>> 217 out:
>> 218f2fs_unlock_op(sbi);
>> 219
>> 220f2fs_put_page(page, 1);
>> 221
>>
>> Uninitialized scalar variable:
>>
>>9. uninit_use_in_call: Using uninitialized value dn.node_changed when
>> calling f2fs_balance_fs.
>>
>> 222f2fs_balance_fs(sbi, dn.node_changed);
>> 223
>> 224return err;
>> 225 }
>>
>> I think a suitable fix will be to set dn.node_changed to false on in
>> line 207-208 but I'm concerned if I'm missing something subtle to the
>> rebalancing if I do this.
>>
>> Comments?
> 
> Thank you for the report. Yes, it seems that's a right call and we need to
> check the error to decide calling f2fs_balance_fs() in line 222, since
> set_new_dnode() is used to set all the fields in dnode_of_data. So, if you
> don't mind, could you please post a patch?

Just to clarify, just setting dn.node_changes to false is enough?

I'm not entirely sure what you meant when you wrote "and we need to
check the error to decide calling f2fs_balance_fs() in line 222".

Colin

> 
> Thanks,
> 
>>
>> Colin
>>

re: drm/amd/display: Implement dmub trace event

2021-03-03 Thread Colin Ian King

Hi,

Static analysis on linux-next wit Coverity has found a potential null
pointer dereference in commit:

commit 70732504c53b2d3aae2cebc457515a304672d5bb
Author: Yongqiang Sun 
Date:   Fri Feb 19 14:50:23 2021 -0500

drm/amd/display: Implement dmub trace event

The analysis is as follows:

400 enum dmub_status dmub_srv_hw_init(struct dmub_srv *dmub,
401  const struct dmub_srv_hw_params
*params)
402 {
403struct dmub_fb *inst_fb = params->fb[DMUB_WINDOW_0_INST_CONST];
404struct dmub_fb *stack_fb = params->fb[DMUB_WINDOW_1_STACK];
405struct dmub_fb *data_fb = params->fb[DMUB_WINDOW_2_BSS_DATA];
406struct dmub_fb *bios_fb = params->fb[DMUB_WINDOW_3_VBIOS];
407struct dmub_fb *mail_fb = params->fb[DMUB_WINDOW_4_MAILBOX];
408struct dmub_fb *tracebuff_fb =
params->fb[DMUB_WINDOW_5_TRACEBUFF];
409struct dmub_fb *fw_state_fb = params->fb[DMUB_WINDOW_6_FW_STATE];
410struct dmub_fb *scratch_mem_fb =
params->fb[DMUB_WINDOW_7_SCRATCH_MEM];
411
412struct dmub_rb_init_params rb_params, outbox0_rb_params;
413struct dmub_window cw0, cw1, cw2, cw3, cw4, cw5, cw6;
414struct dmub_region inbox1, outbox1, outbox0;
415

   1. Condition !dmub->sw_init, taking false branch.

416if (!dmub->sw_init)
417return DMUB_STATUS_INVALID;
418
419dmub->fb_base = params->fb_base;
420dmub->fb_offset = params->fb_offset;
421dmub->psp_version = params->psp_version;
422

   2. Condition dmub->hw_funcs.reset, taking true branch.

423if (dmub->hw_funcs.reset)
424dmub->hw_funcs.reset(dmub);
425

   3. Condition inst_fb, taking true branch.
   4. Condition data_fb, taking true branch.

426if (inst_fb && data_fb) {
427cw0.offset.quad_part = inst_fb->gpu_addr;
428cw0.region.base = DMUB_CW0_BASE;
429cw0.region.top = cw0.region.base + inst_fb->size - 1;
430
431cw1.offset.quad_part = stack_fb->gpu_addr;
432cw1.region.base = DMUB_CW1_BASE;
433cw1.region.top = cw1.region.base + stack_fb->size - 1;
434

   5. Condition params->load_inst_const, taking true branch.
   6. Condition dmub->hw_funcs.backdoor_load, taking true branch.

435if (params->load_inst_const &&
dmub->hw_funcs.backdoor_load) {
436/**
437 * Read back all the instruction memory so we
don't hang the
438 * DMCUB when backdoor loading if the write from
x86 hasn't been
439 * flushed yet. This only occurs in backdoor loading.
440 */
441dmub_flush_buffer_mem(inst_fb);
442dmub->hw_funcs.backdoor_load(dmub, , );
443}
444
445}
446

   7. Condition inst_fb, taking true branch.
   8. Condition data_fb, taking true branch.
   9. Condition bios_fb, taking true branch.
   10. Condition mail_fb, taking true branch.
   11. Condition tracebuff_fb, taking false branch.
   12. var_compare_op: Comparing tracebuff_fb to null implies that
tracebuff_fb might be null.

447if (inst_fb && data_fb && bios_fb && mail_fb && tracebuff_fb &&
448fw_state_fb && scratch_mem_fb) {
449cw2.offset.quad_part = data_fb->gpu_addr;
450cw2.region.base = DMUB_CW0_BASE + inst_fb->size;
451cw2.region.top = cw2.region.base + data_fb->size;
452
453cw3.offset.quad_part = bios_fb->gpu_addr;
454cw3.region.base = DMUB_CW3_BASE;
455cw3.region.top = cw3.region.base + bios_fb->size;
456
457cw4.offset.quad_part = mail_fb->gpu_addr;
458cw4.region.base = DMUB_CW4_BASE;
459cw4.region.top = cw4.region.base + mail_fb->size;
460
461/**
462 * Doubled the mailbox region to accomodate inbox and
outbox.
463 * Note: Currently, currently total mailbox size is
16KB. It is split
464 * equally into 8KB between inbox and outbox. If this
config is
465 * changed, then uncached base address configuration
of outbox1
466 * has to be updated in funcs->setup_out_mailbox.
467 */
468inbox1.base = cw4.region.base;
469inbox1.top = cw4.region.base + DMUB_RB_SIZE;
470outbox1.base = inbox1.top;
471outbox1.top = cw4.region.top;
472
473cw5.offset.quad_part = tracebuff_fb->gpu_addr;
474cw5.region.base = DMUB_CW5_BASE;
475cw5.region.top = cw5.region.base + tracebuff_fb->size;
476
477outbox0.base = DMUB_REGION5_BASE +
TRACE_BUFFER_ENTRY_OFFSET;
478outbox0.top = outbox0.base + sizeof(struct
dmcub_trace_buf_entry) * PERF_TRACE_MAX_ENTRY;
479
480
481cw6.offset.quad_part =

Re: [PATCH][next] mtd: nand: Fix error handling in nand_prog_page_op

2021-03-03 Thread Colin Ian King

On 03/03/2021 09:46, Miquel Raynal wrote:
> Hi Colin,
> 
> Colin King  wrote on Wed,  3 Mar 2021
> 09:42:46 +:
> 
>> From: Colin Ian King 
>>
>> The less than zero comparison with status is always false because status
>> is a u8. Fix this by using ret as the return check for the call to
>> chip->legacy.waitfunc() and checking on this and assigning status to ret
>> if it is OK.
>>
>> Addresses-Coverity: ("Unsigned compared against 0")
>> Fixes: 52f67def97f1 ("mtd: nand: fix error handling in nand_prog_page_op() 
>> #1")
>> Signed-off-by: Colin Ian King 
> 
> Thanks for the fix, but this has been handled just an hour ago :)

Ah, missed that. Glad to know it's fixed.

> 
> Cheers,
> Miquèl
>

f2fs_convert_inline_inode causing rebalance based on random uninitialized value in dn.node_changed

2021-03-02 Thread Colin Ian King

Hi,

Static analysis on linux-next detected a potential uninitialized
variable dn.node_changed that does not get set when a call to
f2fs_get_node_page() fails.  This uninitialized value gets used in the
call to f2fs_balance_fs() that may or not may not balances dirty node
and dentry pages depending on the uninitialized state of the variable.

I believe the issue was introduced by commit:

commit 2a3407607028f7c780f1c20faa4e922bf631d340
Author: Jaegeuk Kim 
Date:   Tue Dec 22 13:23:35 2015 -0800

f2fs: call f2fs_balance_fs only when node was changed


The analysis is a follows:

184 int f2fs_convert_inline_inode(struct inode *inode)
185 {
186struct f2fs_sb_info *sbi = F2FS_I_SB(inode);

   1. var_decl: Declaring variable dn without initializer.

187struct dnode_of_data dn;

   NOTE dn is not initialized here.

188struct page *ipage, *page;
189int err = 0;
190

   2. Condition !f2fs_has_inline_data(inode), taking false branch.
   3. Condition f2fs_hw_is_readonly(sbi), taking false branch.
   4. Condition f2fs_readonly(sbi->sb), taking false branch.

191if (!f2fs_has_inline_data(inode) ||
192f2fs_hw_is_readonly(sbi) ||
f2fs_readonly(sbi->sb))
193return 0;
194
195err = dquot_initialize(inode);

   5. Condition err, taking false branch.

196if (err)
197return err;
198
199page = f2fs_grab_cache_page(inode->i_mapping, 0, false);

   6. Condition !page, taking false branch.

200if (!page)
201return -ENOMEM;
202
203f2fs_lock_op(sbi);
204
205ipage = f2fs_get_node_page(sbi, inode->i_ino);

   7. Condition IS_ERR(ipage), taking true branch.

206if (IS_ERR(ipage)) {
207err = PTR_ERR(ipage);

   8. Jumping to label out.

208goto out;
209}
210

   NOTE: set_new_dnode memset's dn so sets the flag to false, but we
don't get to this memset if IS_ERR(ipage) above is true.

211set_new_dnode(, inode, ipage, ipage, 0);
212
213if (f2fs_has_inline_data(inode))
214err = f2fs_convert_inline_page(, page);
215
216f2fs_put_dnode();
217 out:
218f2fs_unlock_op(sbi);
219
220f2fs_put_page(page, 1);
221

Uninitialized scalar variable:

   9. uninit_use_in_call: Using uninitialized value dn.node_changed when
calling f2fs_balance_fs.

222f2fs_balance_fs(sbi, dn.node_changed);
223
224return err;
225 }

I think a suitable fix will be to set dn.node_changed to false on in
line 207-208 but I'm concerned if I'm missing something subtle to the
rebalancing if I do this.

Comments?

Colin

Re: [PATCH] memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]

2021-03-02 Thread Colin Ian King

On 02/03/2021 08:44, Krzysztof Kozlowski wrote:
> On Tue, Feb 23, 2021 at 07:38:21PM +, Colin King wrote:
>> From: Colin Ian King 
>>
>> Currently the array gpmc_cs is indexed by cs before it cs is range checked
>> and the pointer read from this out-of-index read is dereferenced. Fix this
>> by performing the range check on cs before the read and the following
>> pointer dereference.
>>
>> Addresses-Coverity: ("Negative array index read")
>> Fixes: 186401937927 ("memory: gpmc: Move omap gpmc code to live under 
>> drivers")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/memory/omap-gpmc.c | 7 +--
> 
> Thanks, applied with Tony's ack and changed Fixes to 9ed7a776eb50 ("ARM:
> OMAP2+: Fix support for multiple devices on a GPMC chip select"). 


Thanks for correcting the Fixes line

> 
> Best regards,
> Krzysztof
>

[PATCH] perf trace: Ensure read cmdlines are null terminated.

2021-02-26 Thread Ian Rogers

Issue detected by address sanitizer.

Fixes: cd4ceb63438e (perf util: Save pid-cmdline mapping into tracing header)
Signed-off-by: Ian Rogers 
---
 tools/perf/util/trace-event-read.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/perf/util/trace-event-read.c 
b/tools/perf/util/trace-event-read.c
index f507dff713c9..8a01af783310 100644
--- a/tools/perf/util/trace-event-read.c
+++ b/tools/perf/util/trace-event-read.c
@@ -361,6 +361,7 @@ static int read_saved_cmdline(struct tep_handle *pevent)
pr_debug("error reading saved cmdlines\n");
goto out;
}
+   buf[ret] = '\0';
 
parse_saved_cmdline(pevent, buf, size);
ret = 0;
-- 
2.30.1.766.gb4fecdf3b7-goog

[PATCH] perf tools: Fix documentation of verbose options

2021-02-26 Thread Ian Rogers

Option doesn't take a value, make sure the man pages agree. For example:

$ perf evlist --verbose=1
 Error: option `verbose' takes no value

Signed-off-by: Ian Rogers 
---
 tools/perf/Documentation/perf-evlist.txt   | 2 +-
 tools/perf/Documentation/perf-ftrace.txt   | 4 ++--
 tools/perf/Documentation/perf-kallsyms.txt | 2 +-
 tools/perf/Documentation/perf-trace.txt| 4 ++--
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/tools/perf/Documentation/perf-evlist.txt 
b/tools/perf/Documentation/perf-evlist.txt
index c0a66400a960..9af8b8dfb7b6 100644
--- a/tools/perf/Documentation/perf-evlist.txt
+++ b/tools/perf/Documentation/perf-evlist.txt
@@ -29,7 +29,7 @@ OPTIONS
Show just the sample frequency used for each event.
 
 -v::
---verbose=::
+--verbose::
Show all fields.
 
 -g::
diff --git a/tools/perf/Documentation/perf-ftrace.txt 
b/tools/perf/Documentation/perf-ftrace.txt
index 1e91121bac0f..6e82b7cc0bf0 100644
--- a/tools/perf/Documentation/perf-ftrace.txt
+++ b/tools/perf/Documentation/perf-ftrace.txt
@@ -28,8 +28,8 @@ OPTIONS
specified: function_graph or function.
 
 -v::
---verbose=::
-Verbosity level.
+--verbose::
+Increase the verbosity level.
 
 -F::
 --funcs::
diff --git a/tools/perf/Documentation/perf-kallsyms.txt 
b/tools/perf/Documentation/perf-kallsyms.txt
index f3c620951f6e..c97527df8ecd 100644
--- a/tools/perf/Documentation/perf-kallsyms.txt
+++ b/tools/perf/Documentation/perf-kallsyms.txt
@@ -20,5 +20,5 @@ modules).
 OPTIONS
 ---
 -v::
---verbose=::
+--verbose::
Increase verbosity level, showing details about symbol table loading, 
etc.
diff --git a/tools/perf/Documentation/perf-trace.txt 
b/tools/perf/Documentation/perf-trace.txt
index abc9b5d83312..f0da8cf63e9a 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -97,8 +97,8 @@ filter out the startup phase of the program, which is often 
very different.
Filter out events for these pids and for 'trace' itself (comma 
separated list).
 
 -v::
---verbose=::
-Verbosity level.
+--verbose::
+Increase the verbosity level.
 
 --no-inherit::
Child tasks do not inherit counters.
-- 
2.30.1.766.gb4fecdf3b7-goog

re: cifs: Retain old ACEs when converting between mode bits and ACL.

2021-02-24 Thread Colin Ian King

Hi,

Static analysis on linux-next with Coverity had detected a potential
null pointer dereference with the following commit:

commit f5065508897a922327f32223082325d10b069ebc
Author: Shyam Prasad N 
Date:   Fri Feb 12 04:38:43 2021 -0800

cifs: Retain old ACEs when converting between mode bits and ACL.

The analysis is as follows:

1258 /* Convert permission bits from mode to equivalent CIFS ACL */
1259 static int build_sec_desc(struct cifs_ntsd *pntsd, struct cifs_ntsd
*pnntsd,
1260__u32 secdesclen, __u32 *pnsecdesclen, __u64 *pnmode, kuid_t
uid, kgid_t gid,
1261bool mode_from_sid, bool id_from_sid, int *aclflag)
1262 {
1263int rc = 0;
1264__u32 dacloffset;
1265__u32 ndacloffset;
1266__u32 sidsoffset;
1267struct cifs_sid *owner_sid_ptr, *group_sid_ptr;
1268struct cifs_sid *nowner_sid_ptr = NULL, *ngroup_sid_ptr = NULL;

1. assign_zero: Assigning: dacl_ptr = NULL.

1269struct cifs_acl *dacl_ptr = NULL;  /* no need for SACL ptr */
1270struct cifs_acl *ndacl_ptr = NULL; /* no need for SACL ptr */
1271char *end_of_acl = ((char *)pntsd) + secdesclen;
1272u16 size = 0;
1273
1274dacloffset = le32_to_cpu(pntsd->dacloffset);

2. Condition dacloffset, taking false branch.

1275if (dacloffset) {
1276dacl_ptr = (struct cifs_acl *)((char *)pntsd +
dacloffset);
1277if (end_of_acl < (char *)dacl_ptr +
le16_to_cpu(dacl_ptr->size)) {
1278cifs_dbg(VFS, "Existing ACL size is wrong.
Discarding old ACL\n");
1279dacl_ptr = NULL;

NOTE: dacl_ptr is set to NULL and dacloffset is true

1280}
1281}
1282
1283owner_sid_ptr = (struct cifs_sid *)((char *)pntsd +
1284le32_to_cpu(pntsd->osidoffset));
1285group_sid_ptr = (struct cifs_sid *)((char *)pntsd +
1286le32_to_cpu(pntsd->gsidoffset));
1287

3. Condition pnmode, taking true branch.
4. Condition *pnmode != 18446744073709551615ULL, taking false
branch.

1288if (pnmode && *pnmode != NO_CHANGE_64) { /* chmod */
1289ndacloffset = sizeof(struct cifs_ntsd);
1290ndacl_ptr = (struct cifs_acl *)((char *)pnntsd +
ndacloffset);
1291ndacl_ptr->revision =
1292dacloffset ? dacl_ptr->revision :
cpu_to_le16(ACL_REVISION);
1293
1294ndacl_ptr->size = cpu_to_le16(0);
1295ndacl_ptr->num_aces = cpu_to_le32(0);
1296
1297rc = set_chmod_dacl(dacl_ptr, ndacl_ptr,
owner_sid_ptr, group_sid_ptr,
1298pnmode, mode_from_sid);
1299
1300sidsoffset = ndacloffset + le16_to_cpu(ndacl_ptr->size);
1301/* copy the non-dacl portion of secdesc */
1302*pnsecdesclen = copy_sec_desc(pntsd, pnntsd, sidsoffset,
1303NULL, NULL);
1304
1305*aclflag |= CIFS_ACL_DACL;
1306} else {
1307ndacloffset = sizeof(struct cifs_ntsd);
1308ndacl_ptr = (struct cifs_acl *)((char *)pnntsd +
ndacloffset);

5. Condition dacloffset, taking false branch.

1309ndacl_ptr->revision =
1310dacloffset ? dacl_ptr->revision :
cpu_to_le16(ACL_REVISION);

Explicit null dereferenced (FORWARD_NULL)

6. var_deref_op: Dereferencing null pointer dacl_ptr.

1311ndacl_ptr->num_aces = dacl_ptr->num_aces;


Line 1309..1311, when dacloffset and dacl_ptr is null we hit a null ptr
dereference on dacl_ptr.

Re: [PATCH] memory: gpmc: fix out of bounds read and dereference on gpmc_cs[]

2021-02-24 Thread Colin Ian King

On 24/02/2021 07:55, Dan Carpenter wrote:
> On Tue, Feb 23, 2021 at 07:38:21PM +, Colin King wrote:
>> From: Colin Ian King 
>>
>> Currently the array gpmc_cs is indexed by cs before it cs is range checked
>> and the pointer read from this out-of-index read is dereferenced. Fix this
>> by performing the range check on cs before the read and the following
>> pointer dereference.
>>
>> Addresses-Coverity: ("Negative array index read")
>> Fixes: 186401937927 ("memory: gpmc: Move omap gpmc code to live under 
>> drivers")
>> Signed-off-by: Colin Ian King 
>> ---
>>  drivers/memory/omap-gpmc.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c
>> index cfa730cfd145..f80c2ea39ca4 100644
>> --- a/drivers/memory/omap-gpmc.c
>> +++ b/drivers/memory/omap-gpmc.c
>> @@ -1009,8 +1009,8 @@ EXPORT_SYMBOL(gpmc_cs_request);
>>  
>>  void gpmc_cs_free(int cs)
>>  {
>> -struct gpmc_cs_data *gpmc = _cs[cs];
>> -struct resource *res = >mem;
> > There is no actual dereferencing going on here, it just taking the
> addresses.  But the patch is also harmless and improves readability.

Plus compilers are getting smarter with static analysis so some day in
the future they will warn about this.


> 
> regards,
> dan carpenter
>

Re: [PATCH v2] perf docs: Add man pages to see also

2021-02-22 Thread Ian Rogers

On Thu, Nov 19, 2020 at 10:30 PM Ian Rogers  wrote:
>
> Add all other man pages to the "see also" list except for
> perf-script-perl and perf-script-python that are linked to from
> perf-script.
>
> v2. Fix accidentally listing perf-top twice.
>
> Signed-off-by: Ian Rogers 

Ping. I think this might have gotten lost.

Thanks,
Ian

> ---
>  tools/perf/Documentation/perf.txt | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/tools/perf/Documentation/perf.txt 
> b/tools/perf/Documentation/perf.txt
> index c130a3c46a90..9c330cdfa973 100644
> --- a/tools/perf/Documentation/perf.txt
> +++ b/tools/perf/Documentation/perf.txt
> @@ -76,3 +76,15 @@ SEE ALSO
>  linkperf:perf-stat[1], linkperf:perf-top[1],
>  linkperf:perf-record[1], linkperf:perf-report[1],
>  linkperf:perf-list[1]
> +
> +linkperf:perf-annotate[1],linkperf:perf-archive[1],
> +linkperf:perf-bench[1], linkperf:perf-buildid-cache[1],
> +linkperf:perf-buildid-list[1], linkperf:perf-c2c[1],
> +linkperf:perf-config[1], linkperf:perf-data[1], linkperf:perf-diff[1],
> +linkperf:perf-evlist[1], linkperf:perf-ftrace[1],
> +linkperf:perf-help[1], linkperf:perf-inject[1],
> +linkperf:perf-intel-pt[1], linkperf:perf-kallsyms[1],
> +linkperf:perf-kmem[1], linkperf:perf-kvm[1], linkperf:perf-lock[1],
> +linkperf:perf-mem[1], linkperf:perf-probe[1], linkperf:perf-sched[1],
> +linkperf:perf-script[1], linkperf:perf-test[1],
> +linkperf:perf-trace[1], linkperf:perf-version[1]
> --
> 2.29.2.454.gaff20da3a2-goog
>

Re: [PATCH -next] staging: comedi dt2814: Removed unused variables

2021-02-22 Thread Ian Abbott

On 21/02/2021 20:28, Fatih Yildirim wrote:
> Removed unused variables.
> 
> Signed-off-by: Fatih Yildirim 
> ---
>  drivers/staging/comedi/drivers/dt2814.c | 13 -
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/staging/comedi/drivers/dt2814.c 
> b/drivers/staging/comedi/drivers/dt2814.c
> index bcf4d5444faf..bd329d7b4893 100644
> --- a/drivers/staging/comedi/drivers/dt2814.c
> +++ b/drivers/staging/comedi/drivers/dt2814.c
> @@ -186,21 +186,17 @@ static int dt2814_ai_cmd(struct comedi_device *dev, 
> struct comedi_subdevice *s)
>  
>  static irqreturn_t dt2814_interrupt(int irq, void *d)
>  {
> - int lo, hi;
>   struct comedi_device *dev = d;
>   struct dt2814_private *devpriv = dev->private;
>   struct comedi_subdevice *s = dev->read_subdev;
> - int data;
>  
>   if (!dev->attached) {
>   dev_err(dev->class_dev, "spurious interrupt\n");
>   return IRQ_HANDLED;
>   }
>  
> - hi = inb(dev->iobase + DT2814_DATA);
> - lo = inb(dev->iobase + DT2814_DATA);
> -
> - data = (hi << 4) | (lo >> 4);
> + inb(dev->iobase + DT2814_DATA);
> + inb(dev->iobase + DT2814_DATA);
>  
>   if (!(--devpriv->ntrig)) {
>   int i;
> @@ -229,7 +225,6 @@ static int dt2814_attach(struct comedi_device *dev, 
> struct comedi_devconfig *it)
>   struct dt2814_private *devpriv;
>   struct comedi_subdevice *s;
>   int ret;
> - int i;
>  
>   ret = comedi_request_region(dev, it->options[0], 0x2);
>   if (ret)
> @@ -241,8 +236,8 @@ static int dt2814_attach(struct comedi_device *dev, 
> struct comedi_devconfig *it)
>   dev_err(dev->class_dev, "reset error (fatal)\n");
>   return -EIO;
>   }
> - i = inb(dev->iobase + DT2814_DATA);
> - i = inb(dev->iobase + DT2814_DATA);
> + inb(dev->iobase + DT2814_DATA);
> + inb(dev->iobase + DT2814_DATA);
>  
>   if (it->options[1]) {
>   ret = request_irq(it->options[1], dt2814_interrupt, 0,
> 

I've no objection to this patch.  The interrupt handling to support
Comedi asynchronous commands in this driver has always been broken.  I'm
tempted to remove the code for handling asynchronous commands in this
driver altogether for that reason.  (The naive fix of writing the data
to the Comedi buffer is insufficient without an additional check that
the command is running.  The end-of-conversion interrupt occurs
regardless of any command being active.)

Acked-by: Ian Abbott 

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] staging: comedi: cast to (unsigned int *)

2021-02-19 Thread Ian Abbott

On 19/02/2021 09:03, David Laight wrote:
>> It's kind of moot anyway because the patch is outdated.  But the reason
>> for the ___force is that the same `struct comedi_cmd` is used in both
>> user and kernel contexts.  In user contexts, the `chanlist` member
>> points to user memory and in kernel contexts it points to kernel memory
>> (copied from userspace).
> 
> Can't you use a union of the user and kernel pointers?
> (Possibly even anonymous?)
> Although, ideally, keeping them in separate fields is better.
> 8 bytes for a pointer isn't going make a fat lot of difference.

This is for a UAPI header (eventually), so cannot add a new field.  For
an anonymous union, one tagged with __user and one not, the __user tag
would be removed during conversion from UAPI headers to
/usr/include/linux headers, leaving a union of two identically typed
members, which would look a bit odd.  The union also kind of hides the
problem.

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] Staging: comedi: Replaced strlcpy to strscpy

2021-02-18 Thread Ian Abbott

On 18/02/2021 14:31, chakravarthikulkarni wrote:
> Warning found by checkpath.pl script.
> 
> Signed-off-by: chakravarthikulkarni 
> ---
>  drivers/staging/comedi/comedi_fops.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/staging/comedi/comedi_fops.c 
> b/drivers/staging/comedi/comedi_fops.c
> index 80d74cce2a01..df77b6bf5c64 100644
> --- a/drivers/staging/comedi/comedi_fops.c
> +++ b/drivers/staging/comedi/comedi_fops.c
> @@ -939,8 +939,8 @@ static int do_devinfo_ioctl(struct comedi_device *dev,
>   /* fill devinfo structure */
>   devinfo.version_code = COMEDI_VERSION_CODE;
>   devinfo.n_subdevs = dev->n_subdevices;
> - strlcpy(devinfo.driver_name, dev->driver->driver_name, COMEDI_NAMELEN);
> - strlcpy(devinfo.board_name, dev->board_name, COMEDI_NAMELEN);
> + strscpy(devinfo.driver_name, dev->driver->driver_name, COMEDI_NAMELEN);
> + strscpy(devinfo.board_name, dev->board_name, COMEDI_NAMELEN);
>  
>   s = comedi_file_read_subdevice(file);
>   if (s)
> 

Thanks, but you are too late.  It has already been fixed in linux-next.

-- 
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] staging: comedi: cast to (unsigned int *)

2021-02-18 Thread Ian Abbott


On 17/02/2021 18:26, Greg KH wrote:

On Wed, Feb 17, 2021 at 11:40:00PM +0530, Atul Gopinathan wrote:

On Wed, Feb 17, 2021 at 06:35:15PM +0100, Greg KH wrote:

On Wed, Feb 17, 2021 at 10:29:08PM +0530, Atul Gopinathan wrote:

Resolve the following warning generated by sparse:

drivers/staging//comedi/comedi_fops.c:2956:23: warning: incorrect type in 
assignment (different address spaces)
drivers/staging//comedi/comedi_fops.c:2956:23:expected unsigned int 
*chanlist
drivers/staging//comedi/comedi_fops.c:2956:23:got void [noderef]  *

compat_ptr() has a return type of "void __user *"
as defined in "include/linux/compat.h"

cmd->chanlist is of type "unsigned int *" as defined
in drivers/staging/comedi/comedi.h" in struct
comedi_cmd.

Signed-off-by: Atul Gopinathan 
---
  drivers/staging/comedi/comedi_fops.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/comedi_fops.c 
b/drivers/staging/comedi/comedi_fops.c
index e85a99b68f31..fc4ec38012b4 100644
--- a/drivers/staging/comedi/comedi_fops.c
+++ b/drivers/staging/comedi/comedi_fops.c
@@ -2953,7 +2953,7 @@ static int get_compat_cmd(struct comedi_cmd *cmd,
cmd->scan_end_arg = v32.scan_end_arg;
cmd->stop_src = v32.stop_src;
cmd->stop_arg = v32.stop_arg;
-   cmd->chanlist = compat_ptr(v32.chanlist);
+   cmd->chanlist = (unsigned int __force *)compat_ptr(v32.chanlist);


__force?  That feels wrong, something is odd if that is ever needed.

Are you _sure_ this is correct?


The same file has instances of "(usigned int __force *)" cast being
used on the same "cmd->chanlist". For reference:

At line 1797 of comedi_fops.c:
1796 /* restore chanlist pointer before copying back */
1797 cmd->chanlist = (unsigned int __force *)user_chanlist;
1798 cmd->data = NULL;

At line 1880:
1879 /* restore chanlist pointer before copying back */
1880 cmd->chanlist = (unsigned int __force *)user_chanlist;
1881 *copy = true;

Here "user_chanlist" is of type "unsigned int __user *".


Or perhaps, I shouldn't be relying on them?


I don't know, it still feels wrong.

Ian, any thoughts?


It's kind of moot anyway because the patch is outdated.  But the reason 
for the ___force is that the same `struct comedi_cmd` is used in both 
user and kernel contexts.  In user contexts, the `chanlist` member 
points to user memory and in kernel contexts it points to kernel memory 
(copied from userspace).


The sparse tagging of this member has flip-flopped a bit over the years:

* commit 92d0127c9d24 ("Staging: comedi: __user markup on 
comedi_fops.c") (May 2010) tagged it as `__user`.


* commit 9be56c643263 ("staging: comedi: comedi.h: remove __user tag 
from chanlist") (Sep 2012) removed the `__user` tag.


It is mostly used in a kernel context, for example all the low-level 
drivers with `do_cmd` and `do_cmdtest` handlers use it in kernel context.


The alternative would be to have a separate kernel version of this 
struct, but it would be mostly identical to the user version apart from 
the sparse tagging of this member and perhaps the removal of the unused 
`data` and `data_len` members (which need to be kept in the user version 
of the struct for compatibility reasons).


--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH v2 1/2] staging: comedi: cast function output to assigned variable type

2021-02-18 Thread Ian Abbott


On 18/02/2021 08:44, Atul Gopinathan wrote:

Fix the following warning generated by sparse:

drivers/staging//comedi/comedi_fops.c:2956:23: warning: incorrect type in 
assignment (different address spaces)
drivers/staging//comedi/comedi_fops.c:2956:23:expected unsigned int 
*chanlist
drivers/staging//comedi/comedi_fops.c:2956:23:got void [noderef]  *

compat_ptr() has a return type of "void __user *"
as defined in "include/linux/compat.h"

cmd->chanlist is of type "unsigned int *" as defined
in drivers/staging/comedi/comedi.h" in struct
comedi_cmd.

Signed-off-by: Atul Gopinathan 
---
  drivers/staging/comedi/comedi_fops.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/comedi/comedi_fops.c 
b/drivers/staging/comedi/comedi_fops.c
index e85a99b68f31..fc4ec38012b4 100644
--- a/drivers/staging/comedi/comedi_fops.c
+++ b/drivers/staging/comedi/comedi_fops.c
@@ -2953,7 +2953,7 @@ static int get_compat_cmd(struct comedi_cmd *cmd,
cmd->scan_end_arg = v32.scan_end_arg;
cmd->stop_src = v32.stop_src;
cmd->stop_arg = v32.stop_arg;
-   cmd->chanlist = compat_ptr(v32.chanlist);
+   cmd->chanlist = (unsigned int __force *)compat_ptr(v32.chanlist);
cmd->chanlist_len = v32.chanlist_len;
cmd->data = compat_ptr(v32.data);
cmd->data_len = v32.data_len;



This patch and the other one in your series clash with commit 
9d5d041eebe3 ("staging: comedi: comedi_fops.c: added casts to get rid of 
sparse warnings") by B K Karthik.


https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9d5d041eebe3dcf7591ff7004896c329eb841ca6

--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

Re: [PATCH] drivers: staging: comedi: Fixed side effects from macro definition.

2021-02-18 Thread Ian Abbott


On 17/02/2021 14:20, chakravarthikulkarni wrote:

Warning found by checkpatch.pl script.

Signed-off-by: chakravarthikulkarni 
---
  drivers/staging/comedi/comedi.h | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/staging/comedi/comedi.h b/drivers/staging/comedi/comedi.h
index b5d00a006dbb..b2af6a88d389 100644
--- a/drivers/staging/comedi/comedi.h
+++ b/drivers/staging/comedi/comedi.h
@@ -1103,9 +1103,12 @@ enum ni_common_signal_names {
  
  /* *** END GLOBALLY-NAMED NI TERMINALS/SIGNALS *** */
  
-#define NI_USUAL_PFI_SELECT(x)	(((x) < 10) ? (0x1 + (x)) : (0xb + (x)))

-#define NI_USUAL_RTSI_SELECT(x)(((x) < 7) ? (0xb + (x)) : 0x1b)
-
+#define NI_USUAL_PFI_SELECT(x) \
+   ({ typeof(x) _x = x; \
+(((_x) < 10) ? (0x1 + (_x)) : (0xb + (_x))); })
+#define NI_USUAL_RTSI_SELECT(x)\
+   ({ typeof(x) _x = x; \
+(((_x) < 7) ? (0xb + (_x)) : 0x1b); })
  /*
   * mode bits for NI general-purpose counters, set with
   * INSN_CONFIG_SET_COUNTER_MODE



I'd rather not do that because this is intended to be a userspace 
header.  This change adds GCC extensions and prohibits the use of the 
macros in constant expressions.


--
-=( Ian Abbott  || MEV Ltd. is a company  )=-
-=( registered in England & Wales.  Regd. number: 02862268.  )=-
-=( Regd. addr.: S11 & 12 Building 67, Europa Business Park, )=-
-=( Bird Hall Lane, STOCKPORT, SK3 0XA, UK. || www.mev.co.uk )=-

NACK: [PATCH][next] tracing/tools: fix spelling mistake "functionph" -> "graph"

2021-02-16 Thread Colin Ian King

On 16/02/2021 14:01, Colin King wrote:
> From: Colin Ian King 
> 
> There is a spelling mistake in the -g help option, I believe
> it should be "graph". Fix it.
> 
> Signed-off-by: Colin Ian King 
> ---
>  tools/tracing/latency/latency-collector.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/tracing/latency/latency-collector.c 
> b/tools/tracing/latency/latency-collector.c
> index 57b20802e71b..8d28234cd6fb 100644
> --- a/tools/tracing/latency/latency-collector.c
> +++ b/tools/tracing/latency/latency-collector.c
> @@ -1711,7 +1711,7 @@ static void show_usage(void)
>  "\t\t\tbeginning, end, and backtrace.\n\n"
>  
>  "-g, --graph\t\tEnable the display-graph option in trace_option. This\n"
> -"\t\t\toption causes ftrace to show the functionph of how\n"
> +"\t\t\toption causes ftrace to show the graph of how\n"
>  "\t\t\tfunctions are calling other functions.\n\n"
>  
>  "-c, --policy POL\tRun the program with scheduling policy POL. POL can be\n"
> 

Found another spelling mistake, sending a V2 soon.

re: octeontx2-af: cn10k: MAC internal loopback support

2021-02-15 Thread Colin Ian King

Hi,

Static analysis on linux-next today using Coverity found an issue in the
following commit:

commit 3ad3f8f93c81f81d6e28b2e286b03669cc1fb3b0
Author: Hariprasad Kelam 
Date:   Thu Feb 11 21:28:34 2021 +0530

octeontx2-af: cn10k: MAC internal loopback support

The analysis is as follows:

723 static int rvu_cgx_config_intlbk(struct rvu *rvu, u16 pcifunc, bool en)
724 {
725struct mac_ops *mac_ops;

   1. var_decl: Declaring variable lmac_id without initializer.

726u8 cgx_id, lmac_id;
727

   2. Condition !is_cgx_config_permitted(rvu, pcifunc), taking false branch.

728if (!is_cgx_config_permitted(rvu, pcifunc))
729return -EPERM;
730

Uninitialized scalar variable (UNINIT)

731mac_ops = get_mac_ops(rvu_cgx_pdata(cgx_id, rvu));
732

   Uninitialized scalar variable (UNINIT)
   3. uninit_use_in_call: Using uninitialized value lmac_id when calling
*mac_ops->mac_lmac_intl_lbk.

733return mac_ops->mac_lmac_intl_lbk(rvu_cgx_pdata(cgx_id, rvu),
734  lmac_id, en);
735 }

Variables cgx_id and lmac_id are no longer being initialized and garbage
values are being passed into function calls.  Originally, these
variables were being initialized with a call to rvu_get_cgx_lmac_id()
but that has now been removed.

Colin

Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-14 Thread Ian Lance Taylor

On Sun, Feb 14, 2021 at 4:38 PM Dave Chinner  wrote:
>
> On Fri, Feb 12, 2021 at 03:54:48PM -0800, Darrick J. Wong wrote:
> > On Sat, Feb 13, 2021 at 10:27:26AM +1100, Dave Chinner wrote:
> >
> > > If you can't tell from userspace that a file has data in it other
> > > than by calling read() on it, then you can't use cfr on it.
> >
> > I don't know how to do that, Dave. :)
>
> If stat returns a non-zero size, then userspace knows it has at
> least that much data in it, whether it be zeros or previously
> written data. cfr will copy that data. The special zero length
> regular pipe files fail this simple "how much data is there to copy
> in this file" check...

This suggests that if the Go standard library sees that
copy_file_range reads zero bytes, we should assume that it failed, and
should use the fallback path as though copy_file_range returned
EOPNOTSUPP or EINVAL.  This will cause an extra system call for an
empty file, but as long as copy_file_range does not return an error
for cases that it does not support we are going to need an extra
system call anyhow.

Does that seem like a possible mitigation?  That is, are there cases
where copy_file_range will fail to do a correct copy, and will return
success, and will not return zero?

Ian

Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor

On Fri, Feb 12, 2021 at 3:03 PM Dave Chinner  wrote:
>
> On Fri, Feb 12, 2021 at 04:45:41PM +0100, Greg KH wrote:
> > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > On Fri, Feb 12, 2021 at 12:38 AM Greg KH  
> > > wrote:
> > > >
> > > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > > files in the first place?  They can not seek (well most can not), so
> > > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > > problem that userspace should not do.
> > >
> > > This may have been covered elsewhere, but it's not that people are
> > > saying "let's use copy_file_range on files in /proc."  It's that the
> > > Go language standard library provides an interface to operating system
> > > files.  When Go code uses the standard library function io.Copy to
> > > copy the contents of one open file to another open file, then on Linux
> > > kernels 5.3 and greater the Go standard library will use the
> > > copy_file_range system call.  That seems to be exactly what
> > > copy_file_range is intended for.  Unfortunately it appears that when
> > > people writing Go code open a file in /proc and use io.Copy the
> > > contents to another open file, copy_file_range does nothing and
> > > reports success.  There isn't anything on the copy_file_range man page
> > > explaining this limitation, and there isn't any documented way to know
> > > that the Go standard library should not use copy_file_range on certain
> > > files.
> >
> > But, is this a bug in the kernel in that the syscall being made is not
> > working properly, or a bug in that Go decided to do this for all types
> > of files not knowing that some types of files can not handle this?
> >
> > If the kernel has always worked this way, I would say that Go is doing
> > the wrong thing here.  If the kernel used to work properly, and then
> > changed, then it's a regression on the kernel side.
> >
> > So which is it?
>
> Both Al Viro and myself have said "copy file range is not a generic
> method for copying data between two file descriptors". It is a
> targetted solution for *regular files only* on filesystems that store
> persistent data and can accelerate the data copy in some way (e.g.
> clone, server side offload, hardware offlead, etc). It is not
> intended as a copy mechanism for copying data from one random file
> descriptor to another.
>
> The use of it as a general file copy mechanism in the Go system
> library is incorrect and wrong. It is a userspace bug.  Userspace
> has done the wrong thing, userspace needs to be fixed.

OK, we'll take it out.

I'll just make one last plea that I think that copy_file_range could
be much more useful if there were some way that a program could know
whether it would work or not.  It's pretty unfortunate that we can't
use it in the Go standard library, or, indeed, in any general purpose
code, in any language, that is intended to support arbitrary file
names.  To be pedantically clear, I'm not saying that copy_file_range
should work on all file systems.  I'm only saying that on file systems
for which it doesn't work it should fail rather than silently
returning success without doing anything.

Ian

Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor

On Fri, Feb 12, 2021 at 8:28 AM Greg KH  wrote:
>
> On Fri, Feb 12, 2021 at 07:59:04AM -0800, Ian Lance Taylor wrote:
> > On Fri, Feb 12, 2021 at 7:45 AM Greg KH  wrote:
> > >
> > > On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > > > On Fri, Feb 12, 2021 at 12:38 AM Greg KH  
> > > > wrote:
> > > > >
> > > > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > > > files in the first place?  They can not seek (well most can not), so
> > > > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > > > problem that userspace should not do.
> > > >
> > > > This may have been covered elsewhere, but it's not that people are
> > > > saying "let's use copy_file_range on files in /proc."  It's that the
> > > > Go language standard library provides an interface to operating system
> > > > files.  When Go code uses the standard library function io.Copy to
> > > > copy the contents of one open file to another open file, then on Linux
> > > > kernels 5.3 and greater the Go standard library will use the
> > > > copy_file_range system call.  That seems to be exactly what
> > > > copy_file_range is intended for.  Unfortunately it appears that when
> > > > people writing Go code open a file in /proc and use io.Copy the
> > > > contents to another open file, copy_file_range does nothing and
> > > > reports success.  There isn't anything on the copy_file_range man page
> > > > explaining this limitation, and there isn't any documented way to know
> > > > that the Go standard library should not use copy_file_range on certain
> > > > files.
> > >
> > > But, is this a bug in the kernel in that the syscall being made is not
> > > working properly, or a bug in that Go decided to do this for all types
> > > of files not knowing that some types of files can not handle this?
> > >
> > > If the kernel has always worked this way, I would say that Go is doing
> > > the wrong thing here.  If the kernel used to work properly, and then
> > > changed, then it's a regression on the kernel side.
> > >
> > > So which is it?
> >
> > I don't work on the kernel, so I can't tell you which it is.  You will
> > have to decide.
>
> As you have the userspace code, it should be easier for you to test this
> on an older kernel.  I don't have your userspace code...

Sorry, I'm not sure what you are asking.

I've attached a sample Go program.  On kernel version 2.6.32 this
program exits 0.  On kernel version 5.7.17 it prints

got "" want "./foo\x00"

and exits with status 1.

This program hardcodes the string "/proc/self/cmdline" for
convenience, but of course the same results would happen if this were
a generic copy program that somebody invoked with /proc/self/cmdline
as a command line option.

I could write the same program in C easily enough, by explicitly
calling copy_file_range.  Would it help to see a sample C program?


> > From my perspective, as a kernel user rather than a kernel developer,
> > a system call that silently fails for certain files and that provides
> > no way to determine either 1) ahead of time that the system call will
> > fail, or 2) after the call that the system call did fail, is a useless
> > system call.
>
> Great, then don't use copy_file_range() yet as it seems like it fits
> that category at the moment :)

That seems like an unfortunate result, but if that is the determining
opinion then I guess that is what we will have to do in the Go
standard library.

Ian
package main

import (
	"bytes"
	"fmt"
	"io"
	"io/ioutil"
	"os"
)

func main() {
	tmpfile, err := ioutil.TempFile("", "copy_file_range")
	if err != nil {
		fmt.Fprint(os.Stderr, err)
		os.Exit(1)
	}
	status := copy(tmpfile)
	os.Remove(tmpfile.Name())
	os.Exit(status)
}

func copy(tmpfile *os.File) int {
	cmdline, err := os.Open("/proc/self/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	defer cmdline.Close()
	if _, err := io.Copy(tmpfile, cmdline); err != nil {
		fmt.Fprintf(os.Stderr, "copy failed: %v\n", err)
		return 1
	}
	if err := tmpfile.Close(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	old, err := ioutil.ReadFile("/proc/self/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	new, err := ioutil.ReadFile(tmpfile.Name())
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return 1
	}
	if !bytes.Equal(old, new) {
		fmt.Fprintf(os.Stderr, "got %q want %q\n", new, old)
		return 1
	}
	return 0
}

Re: [PATCH 1/6] fs: Add flag to file_system_type to indicate content is generated

2021-02-12 Thread Ian Lance Taylor

On Fri, Feb 12, 2021 at 7:45 AM Greg KH  wrote:
>
> On Fri, Feb 12, 2021 at 07:33:57AM -0800, Ian Lance Taylor wrote:
> > On Fri, Feb 12, 2021 at 12:38 AM Greg KH  wrote:
> > >
> > > Why are people trying to use copy_file_range on simple /proc and /sys
> > > files in the first place?  They can not seek (well most can not), so
> > > that feels like a "oh look, a new syscall, let's use it everywhere!"
> > > problem that userspace should not do.
> >
> > This may have been covered elsewhere, but it's not that people are
> > saying "let's use copy_file_range on files in /proc."  It's that the
> > Go language standard library provides an interface to operating system
> > files.  When Go code uses the standard library function io.Copy to
> > copy the contents of one open file to another open file, then on Linux
> > kernels 5.3 and greater the Go standard library will use the
> > copy_file_range system call.  That seems to be exactly what
> > copy_file_range is intended for.  Unfortunately it appears that when
> > people writing Go code open a file in /proc and use io.Copy the
> > contents to another open file, copy_file_range does nothing and
> > reports success.  There isn't anything on the copy_file_range man page
> > explaining this limitation, and there isn't any documented way to know
> > that the Go standard library should not use copy_file_range on certain
> > files.
>
> But, is this a bug in the kernel in that the syscall being made is not
> working properly, or a bug in that Go decided to do this for all types
> of files not knowing that some types of files can not handle this?
>
> If the kernel has always worked this way, I would say that Go is doing
> the wrong thing here.  If the kernel used to work properly, and then
> changed, then it's a regression on the kernel side.
>
> So which is it?

I don't work on the kernel, so I can't tell you which it is.  You will
have to decide.

>From my perspective, as a kernel user rather than a kernel developer,
a system call that silently fails for certain files and that provides
no way to determine either 1) ahead of time that the system call will
fail, or 2) after the call that the system call did fail, is a useless
system call.  I can never use that system call, because I don't know
whether or not it will work.  So as a kernel user I would say that you
should fix the system call to report failure, or document some way to
know whether the system call will fail, or you should remove the
system call.  But I'm not a kernel developer, I don't have all the
information, and it's obviously your call.

I'll note that to the best of my knowledge this failure started
happening with the 5.3 kernel, as before 5.3 the problematic calls
would report a failure (EXDEV).  Since 5.3 isn't all that old I
personally wouldn't say that the kernel "has always worked this way."
But I may be mistaken about this.


> > So ideally the kernel will report EOPNOTSUPP or EINVAL when using
> > copy_file_range on a file in /proc or some other file system that
> > fails (and, minor side note, the copy_file_range man page should
> > document that it can return EOPNOTSUPP or EINVAL in some cases, which
> > does already happen on at least some kernel versions using at least
> > some file systems).
>
> Documentation is good, but what the kernel does is the true "definition"
> of what is going right or wrong here.

Sure.  The documentation comment was just a side note.  I hope that we
can all agree that accurate man pages are better than inaccurate ones.

Ian

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 7811 matches

Mail list logo