RE: arm: ERROR: modpost: "__aeabi_uldivmod" [drivers/gpu/drm/sun4i/sun4i-drm-hdmi.ko] undefined!

2024-03-09 Thread David Laight
From: Maxime Ripard
> Sent: 04 March 2024 11:46
> 
> On Mon, Mar 04, 2024 at 12:11:36PM +0100, Arnd Bergmann wrote:
> > On Mon, Mar 4, 2024, at 09:07, Naresh Kamboju wrote:
> > > The arm defconfig builds failed on today's Linux next tag next-20240304.
> > >
> > > Build log:
> > > -
> > > ERROR: modpost: "__aeabi_uldivmod"
> > > [drivers/gpu/drm/sun4i/sun4i-drm-hdmi.ko] undefined!
> > >
> >
> > Apparently caused by the 64-bit division in 358e76fd613a
> > ("drm/sun4i: hdmi: Consolidate atomic_check and mode_valid"):
> >
> >
> > +static enum drm_mode_status
> > +sun4i_hdmi_connector_clock_valid(const struct drm_connector *connector,
> > +const struct drm_display_mode *mode,
> > +unsigned long long clock)
> >  {
> > -   struct sun4i_hdmi *hdmi = drm_encoder_to_sun4i_hdmi(encoder);
> > -   unsigned long rate = mode->clock * 1000;
> > -   unsigned long diff = rate / 200; /* +-0.5% allowed by HDMI spec */
> > +   const struct sun4i_hdmi *hdmi = 
> > drm_connector_to_sun4i_hdmi(connector);
> > +   unsigned long diff = clock / 200; /* +-0.5% allowed by HDMI spec */
> > long rounded_rate;
> >
> > This used to be a 32-bit division. If the rate is never more than
> > 4.2GHz, clock could be turned back into 'unsigned long' to avoid
> > the expensive div_u64().
> 
> I sent a fix for it this morning:
> https://lore.kernel.org/r/20240304091225.366325-1-mrip...@kernel.org
> 
> The framework will pass an unsigned long long because HDMI character
> rates can go up to 5.9GHz.

You could do:
/* The max clock is 5.9GHz, split the divide */
u32 diff = (u32)(clock / 8) / (200/8);

The code should really use u32 and u64.
Otherwise the sizes are different on 32bit.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next v2 03/11] minmax: Simplify signedness check

2024-02-27 Thread David Laight
From: kernel test robot
> Sent: 27 February 2024 01:34
> 
> kernel test robot noticed the following build warnings:
> 
> [auto build test WARNING on drm-misc/drm-misc-next]
> [also build test WARNING on linux/master mkl-can-next/testing kdave/for-next 
> akpm-mm/mm-nonmm-unstable
> axboe-block/for-next linus/master v6.8-rc6 next-20240226]
> [cannot apply to next-20240223 dtor-input/next dtor-input/for-linus 
> horms-ipvs/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch#_base_tree_information]
> 
> url:
> https://github.com/intel-lab-lkp/linux/commits/David-Laight/minmax-Put-all-the-clamp-
> definitions-together/20240226-005902
> base:   git://anongit.freedesktop.org/drm/drm-misc drm-misc-next
> patch link:
> https://lore.kernel.org/r/8657dd5c2264456f8a005520a3b90e2b%40AcuMS.aculab.com
> patch subject: [PATCH next v2 03/11] minmax: Simplify signedness check
> config: alpha-defconfig 
> (https://download.01.org/0day-ci/archive/20240227/202402270937.9kmO5PFt-
> l...@intel.com/config)
> compiler: alpha-linux-gcc (GCC) 13.2.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-
> ci/archive/20240227/202402270937.9kmo5pft-...@intel.com/reproduce)
> 
> If you fix the issue in a separate patch/commit (i.e. not just a new version 
> of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot 
> | Closes: 
> https://lore.kernel.org/oe-kbuild-all/202402270937.9kmo5pft-...@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
>In file included from include/linux/kernel.h:28,
> from include/linux/cpumask.h:10,
> from include/linux/smp.h:13,
> from include/linux/lockdep.h:14,
> from include/linux/spinlock.h:63,
> from include/linux/swait.h:7,
> from include/linux/completion.h:12,
> from include/linux/crypto.h:15,
> from include/crypto/aead.h:13,
> from include/crypto/internal/aead.h:11,
> from crypto/skcipher.c:12:
>crypto/skcipher.c: In function 'skcipher_get_spot':
> >> include/linux/minmax.h:31:70: warning: ordered comparison of pointer with 
> >> integer zero [-Wextra]
>   31 | (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 
> 0 >= 0 : 0))

Hmmm -Wextra isn't normally set.
But I do wish the compiler would do dead code elimination before
these warnings.

Apart from stopping code using min()/max() for pointer types
(all the type checking is pointless) I think that __is_constextr()
can be implemented using _Generic (instead of sizeof(type)) and then the
true/false return values can be specified and need not be the same types.
That test can then be:
(__if_constexpr(x, x, -1) >= 0)
(The '+ 0' is there to convert bool to int and won't be needed
for non-constant bool.)

I may drop the last few patches until MIN/MAX have been removed
from everywhere else to free up the names.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next v2 11/11] minmax: min() and max() don't need to return constant expressions

2024-02-26 Thread David Laight
From: kernel test robot 
> Sent: 26 February 2024 09:42

> If you fix the issue in a separate patch/commit (i.e. not just a new version 
> of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot 
> | Closes: 
> https://lore.kernel.org/oe-kbuild-all/202402261720.eamc0ehm-...@intel.com/
> 
> All warnings (new ones prefixed by >>):
> 
> >> arch/x86/mm/pgtable.c:437:14: warning: variable length array used [-Wvla]
>  437 | pmd_t *pmds[MAX_PREALLOCATED_PMDS];

Not surprisingly I missed X86_CONFIG_PAE builds :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next v2 02/11] minmax: Use _Static_assert() instead of static_assert()

2024-02-26 Thread David Laight
From: Jani Nikula
> Sent: 26 February 2024 09:28
> 
> On Sun, 25 Feb 2024, David Laight  wrote:
> > The wrapper just adds two more lines of error output when the test fails.
> 
> There are only a handful of places in kernel code that use
> _Static_assert() directly. Nearly 900 instances of static_assert().

How many of those supply an error message?

> Are we now saying it's fine to use _Static_assert() directly all over
> the place? People will copy-paste and cargo cult.

Is that actually a problem?
The wrapper allows the error message to be omitted and substitutes
the text of the conditional.
But it isn't 'free'.
As well as slightly slowing down the compilation, the error messages
from the compiler get more difficult to interpret.

Most of the static_assert() will probably never generate an error.
But the ones in min()/max() will so it is best to make them as
readable as possible.
(Don't even look as the mess clang makes)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next v2 08/11] minmax: Add min_const() and max_const()

2024-02-25 Thread David Laight
...
> Yes, yes, that may end up requiring getting rid of some current users of
> 
>   #define MIN(a,b) ((a)<(b) ? (a):(b))
> 
> but dammit, we don't actually have _that_ many of them, and why should
> we have random drivers doing that anyway?

They look like they could be changed to min().
It is even likely the header gets pulled in somewhere.

I'm not sure about the ones in drivers/gpu/drm/amd/display/*..*/*.c, but it
wouldn't surprise me if that code doesn't use any standard kernel headers.
Isn't that also the code that manages to pass 42 integer parameters
to functions?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH next v2 08/11] minmax: Add min_const() and max_const()

2024-02-25 Thread David Laight
From: Linus Torvalds
> Sent: 25 February 2024 17:14
> 
> On Sun, 25 Feb 2024 at 08:53, David Laight  wrote:
> >
> > The expansions of min() and max() contain statement expressions so are
> > not valid for static intialisers.
> > min_const() and max_const() are expressions so can be used for static
> > initialisers.
> 
> I hate the name.

Picking name is always hard...

> Naming shouldn't be about an implementation detail, particularly not
> an esoteric one like the "C constant expression" rule. That can be
> useful for some internal helper functions or macros, but not for
> something that random people are supposed to USE.
> 
> Telling some random developer that inside an array size declaration or
> a static initializer you need to use "max_const()" because it needs to
> syntactically be a constant expression, and our regular "max()"
> function isn't that, is just *horrid*.
> 
> No, please just use the traditional C model of just using ALL CAPS for
> macro names that don't act like a function.
> 
> Yes, yes, that may end up requiring getting rid of some current users of
> 
>   #define MIN(a,b) ((a)<(b) ? (a):(b))
> 
> but dammit, we don't actually have _that_ many of them, and why should
> we have random drivers doing that anyway?

I'll have a look at what is there.
It might take a three part patch though.
Unless you apply it as a tree-wide patch?

One option is to add as max_const(), then change any existing MAX()
to be max() or max_const() and then finally rename to MAX()?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


[PATCH next v2 11/11] minmax: min() and max() don't need to return constant expressions

2024-02-25 Thread David Laight
After changing the handful of places max() was used to size an on-stack
array to use max_const() it is no longer necessary for min() and max()
to return constant expressions from constant inputs.
Remove the associated logic to reduce the expanded text.

Remove the 'hack' that allowed max(bool, bool).

Fixup the initial block comment to match current reality.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c08916588425..5e65c98ff256 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,13 +8,10 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
- * - Retain result as a constant expressions when called with only
- *   constant expressions (to avoid tripping VLA warnings in stack
- *   allocation usage).
  * - Perform signed v unsigned type-checking (to generate compile
  *   errors instead of nasty runtime surprises).
  * - Unsigned char/short are always promoted to signed int and can be
@@ -22,13 +19,19 @@
  * - Unsigned arguments can be compared against non-negative signed constants.
  * - Comparison of a signed argument against an unsigned constant fails
  *   even if the constant is below __INT_MAX__ and could be cast to int.
+ *
+ * The return value of min()/max() is not a constant expression for
+ * constant parameters - so will trigger a VLA warging if used to size
+ * an on-stack array.
+ * Instead use min_const() or max_const() which do generate constant
+ * expressions and are also valid for static initialisers.
  */
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
 /* Allow unsigned compares against non-negative signed constants. */
 #define __is_ok_unsigned(x) \
-   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) >= 0 : 0))
 
 /* Check for signed after promoting unsigned char/short to int */
 #define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
@@ -53,12 +56,10 @@
typeof(y) __y_##uniq = (y); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y, uniq)  \
-   __builtin_choose_expr(__is_constexpr((x) - (y)),\
-   __cmp(op, x, y),\
-   ({ _Static_assert(__types_ok(x, y), \
-   #op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
-   __cmp_once(op, x, y, uniq); }))
+#define __careful_cmp(op, x, y, uniq) ({   \
+   _Static_assert(__types_ok(x, y),\
+   #op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); })
 
 #define __careful_cmp_const(op, x, y)  \
(BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 10/11] block: Use a boolean expression instead of max() on booleans

2024-02-25 Thread David Laight
blk_stack_limits() contains:
t->zoned = max(t->zoned, b->zoned);
These are bool, so can be replaced by bitwise or.
However it generates:
error: comparison of constant '0' with boolean expression is always true 
[-Werror=bool-compare]
inside the signedness check that max() does unless a '+ 0' is added.
It is a shame the compiler generates this warning for code that will
be optimised away.

Change so that the extra '+ 0' can be removed.

Signed-off-by: David Laight 
---
 block/blk-settings.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 06ea91e51b8b..9ca21fea039d 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -688,7 +688,7 @@ int blk_stack_limits(struct queue_limits *t, struct 
queue_limits *b,
   b->max_secure_erase_sectors);
t->zone_write_granularity = max(t->zone_write_granularity,
b->zone_write_granularity);
-   t->zoned = max(t->zoned, b->zoned);
+   t->zoned = t->zoned | b->zoned;
return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 09/11] tree-wide: minmax: Replace all the uses of max() for array sizes with max_const()

2024-02-25 Thread David Laight
These are the only uses of max() that require a constant value
from constant parameters.
There don't seem to be any similar uses of min().

Replacing the max() by max_const() lets min()/max() be simplified
speeding up compilation.

max_const() will convert enums to int (or unsigned int) so that the
casts added by max_t() are no longer needed.

Signed-off-by: David Laight 
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
 drivers/gpu/drm/drm_color_mgmt.c   | 4 ++--
 drivers/input/touchscreen/cyttsp4_core.c   | 2 +-
 drivers/net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
 fs/btrfs/tree-checker.c| 2 +-
 lib/vsprintf.c | 4 ++--
 net/ipv4/proc.c| 2 +-
 net/ipv6/proc.c| 2 +-
 8 files changed, 10 insertions(+), 10 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 00cd615bbcdc..935fb4014f7c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -708,7 +708,7 @@ static const char *smu_get_feature_name(struct smu_context 
*smu,
 size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
   char *buf)
 {
-   int8_t sort_feature[max(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
+   int8_t sort_feature[max_const(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
uint64_t feature_mask;
int i, feature_index;
uint32_t count = 0;
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index d021497841b8..43a6bd0ca960 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -532,8 +532,8 @@ int drm_plane_create_color_properties(struct drm_plane 
*plane,
 {
struct drm_device *dev = plane->dev;
struct drm_property *prop;
-   struct drm_prop_enum_list enum_list[max_t(int, DRM_COLOR_ENCODING_MAX,
-  DRM_COLOR_RANGE_MAX)];
+   struct drm_prop_enum_list enum_list[max_const(DRM_COLOR_ENCODING_MAX,
+ DRM_COLOR_RANGE_MAX)];
int i, len;
 
if (WARN_ON(supported_encodings == 0 ||
diff --git a/drivers/input/touchscreen/cyttsp4_core.c 
b/drivers/input/touchscreen/cyttsp4_core.c
index 7cb26929dc73..c6884c3c3fca 100644
--- a/drivers/input/touchscreen/cyttsp4_core.c
+++ b/drivers/input/touchscreen/cyttsp4_core.c
@@ -871,7 +871,7 @@ static void cyttsp4_get_mt_touches(struct cyttsp4_mt_data 
*md, int num_cur_tch)
struct cyttsp4_touch tch;
int sig;
int i, j, t = 0;
-   int ids[max(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
+   int ids[max_const(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
 
memset(ids, 0, si->si_ofs.tch_abs[CY_TCH_T].max * sizeof(int));
for (i = 0; i < num_cur_tch; i++) {
diff --git a/drivers/net/can/usb/etas_es58x/es58x_devlink.c 
b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
index 635edeb8f68c..28fa87668bf8 100644
--- a/drivers/net/can/usb/etas_es58x/es58x_devlink.c
+++ b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
@@ -215,7 +215,7 @@ static int es58x_devlink_info_get(struct devlink *devlink,
struct es58x_sw_version *fw_ver = _dev->firmware_version;
struct es58x_sw_version *bl_ver = _dev->bootloader_version;
struct es58x_hw_revision *hw_rev = _dev->hardware_revision;
-   char buf[max(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
+   char buf[max_const(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
int ret = 0;
 
if (es58x_sw_version_is_valid(fw_ver)) {
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 6eccf8496486..aec4729a9a82 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -615,7 +615,7 @@ static int check_dir_item(struct extent_buffer *leaf,
 */
if (key->type == BTRFS_DIR_ITEM_KEY ||
key->type == BTRFS_XATTR_ITEM_KEY) {
-   char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+   char namebuf[max_const(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
 
read_extent_buffer(leaf, namebuf,
(unsigned long)(di + 1), name_len);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 552738f14275..6c3c319afd86 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1080,8 +1080,8 @@ char *resource_string(char *buf, char *end, struct 
resource *res,
 #define FLAG_BUF_SIZE  (2 * sizeof(res->flags))
 #define DECODED_BUF_SIZE   sizeof("[mem - 64bit pref window disabled]")
 #define RAW_BUF_SIZE   sizeof("[mem - flags 0x]")
-   char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,

[PATCH next v2 08/11] minmax: Add min_const() and max_const()

2024-02-25 Thread David Laight
The expansions of min() and max() contain statement expressions so are
not valid for static intialisers.
min_const() and max_const() are expressions so can be used for static
initialisers.
The arguments are checked for being constant and for negative signed
values being converted to large unsigned values.

Using these to size on-stack arrays lets min/max be simplified.
Zero is added before the compare to convert enum values to integers
avoinding the need for casts when enums have been used for constants.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 15 +++
 1 file changed, 15 insertions(+)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 278a390b8a4c..c08916588425 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -60,19 +60,34 @@
#op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
__cmp_once(op, x, y, uniq); }))
 
+#define __careful_cmp_const(op, x, y)  \
+   (BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
+   BUILD_BUG_ON_ZERO(!__types_ok(x, y)) +  \
+   __cmp(op, (x) + 0, (y) + 0))
+
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * min_const(@x, @y) is a constant expression for constant inputs.
  */
 #define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
+#define min_const(x, y)__careful_cmp_const(min, x, y)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * max_const(@x, @y) is a constant expression for constant inputs.
  */
 #define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
+#define max_const(x, y)__careful_cmp_const(max, x, y)
 
 /**
  * umin - return minimum of two non-negative values
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 07/11] minmax: minmax: Add __types_ok3() and optimise defines with 3 arguments

2024-02-25 Thread David Laight
min3() and max3() were added to optimise nested min(x, min(y, z))
sequences, but only moved where the expansion was requiested.

Add a separate implementation for 3 argument calls.
These are never required to generate constant expressions to
remove that logic.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 5c7fce76abe5..278a390b8a4c 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -38,6 +38,11 @@
((__is_ok_signed(x) && __is_ok_signed(y)) ||\
 (__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
+/* Check three values for min3(), max3() and clamp() */
+#define __types_ok3(x, y, z)   
\
+   ((__is_ok_signed(x) && __is_ok_signed(y) && __is_ok_signed(z)) ||   
\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y) && __is_ok_unsigned(z)))
+
 #define __cmp_op_min <
 #define __cmp_op_max >
 
@@ -87,13 +92,24 @@
 #define umax(x, y) \
__careful_cmp(max, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
+#define __cmp_once3(op, x, y, z, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(x) __y_##uniq = (y); \
+   typeof(x) __z_##uniq = (z); \
+   __cmp(op, __cmp(op, __x_##uniq, __y_##uniq), __z_##uniq); })
+
+#define __careful_cmp3(op, x, y, z, uniq) ({   \
+   static_assert(__types_ok3(x, y, z), \
+   #op "3(" #x ", " #y ", " #z ") signedness error");  \
+   __cmp_once3(op, x, y, z, uniq); })
+
 /**
  * min3 - return minimum of three values
  * @x: first value
  * @y: second value
  * @z: third value
  */
-#define min3(x, y, z) min((typeof(x))min(x, y), z)
+#define min3(x, y, z) __careful_cmp3(min, x, y, z, __COUNTER__)
 
 /**
  * max3 - return maximum of three values
@@ -101,7 +117,7 @@
  * @y: second value
  * @z: third value
  */
-#define max3(x, y, z) max((typeof(x))max(x, y), z)
+#define max3(x, y, z) __careful_cmp3(max, x, y, z, __COUNTER__)
 
 /**
  * min_t - return minimum of two values, using the specified type
@@ -142,8 +158,7 @@
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
 #define __careful_clamp(val, lo, hi, uniq) ({  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   _Static_assert(__types_ok3(val, lo, hi), "clamp() signedness error");   
\
__clamp_once(val, lo, hi, uniq); })
 
 /**
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 06/11] minmax: Remove 'constexpr' check from __careful_clamp()

2024-02-25 Thread David Laight
Nothing requires that clamp() return a constant expression.
The logic to do so significantly increases the .i file.
Remove the check and directly expand __clamp_once() from clamp_t()
since the type check can't fail.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 111c52a14fe5..5c7fce76abe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -141,12 +141,10 @@
"clamp() low limit " #lo " greater than high limit " #hi);  
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) \
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness 
error");\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness 
error");   \
-   __clamp_once(val, lo, hi, uniq); }))
+#define __careful_clamp(val, lo, hi, uniq) ({  
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   __clamp_once(val, lo, hi, uniq); })
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
@@ -168,7 +166,9 @@
  * This macro does no typechecking and uses temporary variables of type
  * @type to make all the comparisons.
  */
-#define clamp_t(type, val, lo, hi) clamp((type)(val), (type)(lo), (type)(hi))
+#define __clamp_t(type, val, lo, hi, uniq) \
+   __clamp_once((type)(val), (type)(lo), (type)(hi), uniq)
+#define clamp_t(type, val, lo, hi) __clamp_t(type, val, lo, hi, __COUNTER__)
 
 /**
  * clamp_val - return a value clamped to a given range using val's type
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 05/11] minmax: Move the signedness check out of __cmp_once() and __clamp_once()

2024-02-25 Thread David Laight
There is no need to do the signedness/type check when the arguments
are being cast to a fixed type.
So move the check out of __xxx_once() into __careful_xxx().

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 8ee003d8abaf..111c52a14fe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -46,14 +46,14 @@
 #define __cmp_once(op, x, y, uniq) ({  \
typeof(x) __x_##uniq = (x); \
typeof(y) __y_##uniq = (y); \
-   _Static_assert(__types_ok(x, y),\
-   #op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
 #define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, uniq))
+   ({ _Static_assert(__types_ok(x, y), \
+   #op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); }))
 
 /**
  * min - return minimum of two values of the same or compatible types
@@ -139,14 +139,14 @@
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) ({  \
+#define __careful_clamp(val, lo, hi, uniq) \
__builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
__clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, uniq)); })
+   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness 
error");\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness 
error");   \
+   __clamp_once(val, lo, hi, uniq); }))
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 04/11] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__

2024-02-25 Thread David Laight
Provided __COUNTER__ is passed through an extra #define it can be pasted
onto multiple local variables to give unique names.
This saves having 3 __UNIQUE_ID() for #defines with three locals and
look less messy in general.

Stop the umin()/umax() lines being overlong by factoring out the
zero-extension logic.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 48 +-
 1 file changed, 24 insertions(+), 24 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c32b4b40ce01..8ee003d8abaf 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish several things:
+ * min()/max()/clamp() macros must accomplish three things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -43,31 +43,31 @@
 
 #define __cmp(op, x, y)((x) __cmp_op_##op (y) ? (x) : (y))
 
-#define __cmp_once(op, x, y, unique_x, unique_y) ({\
-   typeof(x) unique_x = (x);   \
-   typeof(y) unique_y = (y);   \
-   _Static_assert(__types_ok(x, y),\
+#define __cmp_once(op, x, y, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(y) __y_##uniq = (y); \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
-   __cmp(op, unique_x, unique_y); })
+   __cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y)\
+#define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
+   __cmp_once(op, x, y, uniq))
 
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define min(x, y)  __careful_cmp(min, x, y)
+#define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define max(x, y)  __careful_cmp(max, x, y)
+#define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
 
 /**
  * umin - return minimum of two non-negative values
@@ -75,8 +75,9 @@
  * @x: first value
  * @y: second value
  */
+#define __zero_extend(x) ((x) + 0u + 0ul + 0ull)
 #define umin(x, y) \
-   __careful_cmp(min, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(min, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
 /**
  * umax - return maximum of two non-negative values
@@ -84,7 +85,7 @@
  * @y: second value
  */
 #define umax(x, y) \
-   __careful_cmp(max, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(max, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
 /**
  * min3 - return minimum of three values
@@ -108,7 +109,7 @@
  * @x: first value
  * @y: second value
  */
-#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y), 
__COUNTER__)
 
 /**
  * max_t - return maximum of two values, using the specified type
@@ -116,7 +117,7 @@
  * @x: first value
  * @y: second value
  */
-#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y), 
__COUNTER__)
 
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
@@ -131,22 +132,21 @@
 #define __clamp(val, lo, hi)   \
((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
 
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
+#define __clamp_once(val, lo, hi, uniq) ({ 
\
+   typeof(val) __val_##uniq = (val);   
\
+   typeof(lo) __lo_##uniq = (lo);  
\
+   typeof(hi) __hi_##uniq = (hi);  
\
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high lim

[PATCH next v2 03/11] minmax: Simplify signedness check

2024-02-25 Thread David Laight
It is enough to check that both 'x' and 'y' are valid for either
a signed compare or an unsigned compare.
For unsigned they must be an unsigned type or a positive constant.
For signed they must be signed after unsigned char/short are promoted.

The predicate for _Static_assert() only needs to be a compile-time
constant not a constant integeger expression.
In particular the short-circuit evaluation of || && ?: can be used
to avoid the non-constantness of (pointer_type)1 in is_signed_type().

The '+ 0' in '(x) + 0 > = 0' is there to convert 'bool' to 'int'
and avoid a compiler warning because max() gets used for 'bool'
in one place (a very expensive 'or').
(The code is optimised away by two earlier checks - but the compiler
still bleats.)

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 900eec7a28e5..c32b4b40ce01 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -26,19 +26,17 @@
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
-/* is_signed_type() isn't a constexpr for pointer types */
-#define __is_signed(x) 
\
-   __builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))),
\
-   is_signed_type(typeof(x)), 0)
+/* Allow unsigned compares against non-negative signed constants. */
+#define __is_ok_unsigned(x) \
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
 
-/* True for a non-negative signed int constant */
-#define __is_noneg_int(x)  \
-   (__builtin_choose_expr(__is_constexpr(x) && __is_signed(x), x, -1) >= 0)
+/* Check for signed after promoting unsigned char/short to int */
+#define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
 
-#define __types_ok(x, y)   \
-   (__is_signed(x) == __is_signed(y) ||\
-   __is_signed((x) + 0) == __is_signed((y) + 0) || \
-   __is_noneg_int(x) || __is_noneg_int(y))
+/* Allow if both x and y are valid for either signed or unsigned compares. */
+#define __types_ok(x, y)   \
+   ((__is_ok_signed(x) && __is_ok_signed(y)) ||\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
 #define __cmp_op_min <
 #define __cmp_op_max >
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 02/11] minmax: Use _Static_assert() instead of static_assert()

2024-02-25 Thread David Laight
The wrapper just adds two more lines of error output when the test fails.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 63c45865b48a..900eec7a28e5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -48,7 +48,7 @@
 #define __cmp_once(op, x, y, unique_x, unique_y) ({\
typeof(x) unique_x = (x);   \
typeof(y) unique_y = (y);   \
-   static_assert(__types_ok(x, y), \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
__cmp(op, unique_x, unique_y); })
 
@@ -137,11 +137,11 @@
typeof(val) unique_val = (val); 
\
typeof(lo) unique_lo = (lo);
\
typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   _Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(unique_val, unique_lo, unique_hi); })
 
 #define __careful_clamp(val, lo, hi) ({
\
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next v2 01/11] minmax: Put all the clamp() definitions together

2024-02-25 Thread David Laight
The defines for clamp() have got separated, move togther for readability.
Update description of signedness check.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 120 +++--
 1 file changed, 56 insertions(+), 64 deletions(-)

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 2ec559284a9f..63c45865b48a 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -57,26 +57,6 @@
__cmp(op, x, y),\
__cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
 
-#define __clamp(val, lo, hi)   \
-   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
-
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
-   (lo) <= (hi), true),
\
-   "clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
-   __clamp(unique_val, unique_lo, unique_hi); })
-
-#define __careful_clamp(val, lo, hi) ({
\
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
-__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
-
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
@@ -124,6 +104,22 @@
  */
 #define max3(x, y, z) max((typeof(x))max(x, y), z)
 
+/**
+ * min_t - return minimum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+
+/**
+ * max_t - return maximum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
  * @x: value1
@@ -134,39 +130,60 @@
typeof(y) __y = (y);\
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
 
+#define __clamp(val, lo, hi)   \
+   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
+
+#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
+   typeof(val) unique_val = (val); 
\
+   typeof(lo) unique_lo = (lo);
\
+   typeof(hi) unique_hi = (hi);
\
+   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   (lo) <= (hi), true),
\
+   "clamp() low limit " #lo " greater than high limit " #hi);  
\
+   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
+   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   __clamp(unique_val, unique_lo, unique_hi); })
+
+#define __careful_clamp(val, lo, hi) ({
\
+   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
+   __clamp(val, lo, hi),   \
+   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
+__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
+
 /**
  * clamp - return a value clamped to a given range with strict typechecking
  * @val: current value
  * @lo: lowest allowable value
  * @hi: highest allowable value
  *
- * This macro does strict typechecking of @lo/@hi to make sure they are of the
- * same type as @val.  See the unnecessary pointer comparisons.
+ * This macro checks that @val, @lo and @hi have the same signedness.
  */
 #define clamp(val, lo, hi) __careful_clamp(val, lo, hi)
 
-/*
- * ..and if you can't take the strict
- * types, you can specify one yourself.
- *
- * Or not use min/max/clamp at all, of course.
- */
-
 /**
- * min_t - return minimum of two values, using the specified type
- * @type: data type to use
- * @x: first value
- * @y: second value
+ * clamp_t - return a value clamped to

[PATCH next v2 00/11] minmax: Optimise to reduce .i line length

2024-02-25 Thread David Laight
The changes to minmax.h that changed the type check to a signedness
check significantly increased the length of the expansion.
In some cases it has also significantly increased compile type.
This is particularly noticable for nested expansions.

The fact that _Static_assert() only requires a compile time constant
not a constant expression allows a lot of simplification.

The other thing that compilicates the expansion is the necessity of
returning a constant expression from constant arguments (for VLA).
I can only find a handful of places this is done.
Penalising most of the code for these few cases seems 'suboptimal'.
Instead I've added min_const() and max_const() for VLA and static
initialisers, these check the arguments are constant to avoid misuse.

Patch [9] is dependant on the earlier patches.
Patch [10] isn't dependant on them.
Patch [11] depends on both 9 and 10.

Changes for v2:
- Typographical and spelling corrections to the commit messages.
  Patches unchanged.

David Laight (11):
  [1] minmax: Put all the clamp() definitions together
  [2] minmax: Use _Static_assert() instead of static_assert()
  [3] minmax: Simplify signedness check
  [4] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__
  [5] minmax: Move the signedness check out of __cmp_once() and
__clamp_once()
  [6] minmax: Remove 'constexpr' check from __careful_clamp()
  [7] minmax: minmax: Add __types_ok3() and optimise defines with 3
arguments
  [8] minmax: Add min_const() and max_const()
  [9] tree-wide: minmax: Replace all the uses of max() for array sizes with
max_const().
  [10] block: Use a boolean expression instead of max() on booleans
  [11] minmax: min() and max() don't need to return constant expressions

 block/blk-settings.c  |   2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c|   2 +-
 drivers/gpu/drm/drm_color_mgmt.c  |   4 +-
 drivers/input/touchscreen/cyttsp4_core.c  |   2 +-
 .../net/can/usb/etas_es58x/es58x_devlink.c|   2 +-
 fs/btrfs/tree-checker.c   |   2 +-
 include/linux/minmax.h| 211 ++
 lib/vsprintf.c|   4 +-
 net/ipv4/proc.c   |   2 +-
 net/ipv6/proc.c   |   2 +-
 10 files changed, 127 insertions(+), 106 deletions(-)

-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: Re: [PATCH v3 2/3] bits: Introduce fixed-type BIT

2024-02-10 Thread David Laight
...
> >> +#define BIT_U8(b) ((u8)(BIT_INPUT_CHECK(u8, b) + BIT(b)))
> >> +#define BIT_U16(b)((u16)(BIT_INPUT_CHECK(u16, b) + 
> >> BIT(b)))
> >> +#define BIT_U32(b)((u32)(BIT_INPUT_CHECK(u32, b) + 
> >> BIT(b)))
> >> +#define BIT_U64(b)((u64)(BIT_INPUT_CHECK(u64, b) + 
> >> BIT(b)))
> >
> >considering that BIT defines are always referred to unsigned
> >types, I would just call them

Except that pretty much as soon as you breath on them
the u8 and u16 types get converted to int.
If you want them to be an unsigned type then you need
to cast them to (unsigned int).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-29 Thread David Laight
From: Jani Nikula
> Sent: 29 January 2024 09:08
> 
> On Sun, 28 Jan 2024, David Laight  wrote:
> > blk_stack_limits() contains:
> > t->zoned = max(t->zoned, b->zoned);
> > These are bool, so it is just a bitwise or.
> 
> Should be a logical or, really. And || in code.

Not really, bitwise is fine for bool (especially for 'or')
and generates better code.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread David Laight
From: Linus Torvalds
> Sent: 28 January 2024 19:59
> 
> On Sun, 28 Jan 2024 at 11:36, David Laight  wrote:
> >
> > However it generates:
> > error: comparison of constant ‘0’ with boolean expression is always 
> > true [-Werror=bool-compare]
> > inside the signedness check that max() does unless a '+ 0' is added.
> 
> Please fix your locale. You have random garbage characters there,
> presumably because you have some incorrect locale setting somewhere in
> your toolchain.

H blame gcc :-)
The error message displays as '0' but is e2:80:98 30 e2:80:99
I HATE UTF-8, it wouldn't be as bad if it were a bijection.

Lets see if adding 'LANG=C' in the shell script I use to
do kernel builds is enough.

I also managed to send parts 1 to 6 without deleting the RE:
(I have to cut from wordpad into a 'reply-all' of the first
message I send. Work uses mimecast and it has started bouncing
my copy of every message I send to the lists.)

Maybe I should start using telnet to send raw SMTP :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


[PATCH next 11/11] minmax: min() and max() don't need to return constant expressions

2024-01-28 Thread David Laight
After changing the handful of places max() was used to size an on-stack
array to use max_const() it is no longer necessary for min() and max()
to return constant expressions from constant inputs.
Remove the associated logic to reduce the expanded text.

Remove the 'hack' that allowed max(bool, bool).

Fixup the initial block comment to match current reality.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 ---
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c08916588425..5e65c98ff256 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,13 +8,10 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
- * - Retain result as a constant expressions when called with only
- *   constant expressions (to avoid tripping VLA warnings in stack
- *   allocation usage).
  * - Perform signed v unsigned type-checking (to generate compile
  *   errors instead of nasty runtime surprises).
  * - Unsigned char/short are always promoted to signed int and can be
@@ -22,13 +19,19 @@
  * - Unsigned arguments can be compared against non-negative signed constants.
  * - Comparison of a signed argument against an unsigned constant fails
  *   even if the constant is below __INT_MAX__ and could be cast to int.
+ *
+ * The return value of min()/max() is not a constant expression for
+ * constant parameters - so will trigger a VLA warging if used to size
+ * an on-stack array.
+ * Instead use min_const() or max_const() which do generate constant
+ * expressions and are also valid for static initialisers.
  */
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
 /* Allow unsigned compares against non-negative signed constants. */
 #define __is_ok_unsigned(x) \
-   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) >= 0 : 0))
 
 /* Check for signed after promoting unsigned char/short to int */
 #define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
@@ -53,12 +56,10 @@
typeof(y) __y_##uniq = (y); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y, uniq)  \
-   __builtin_choose_expr(__is_constexpr((x) - (y)),\
-   __cmp(op, x, y),\
-   ({ _Static_assert(__types_ok(x, y), \
-   #op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
-   __cmp_once(op, x, y, uniq); }))
+#define __careful_cmp(op, x, y, uniq) ({   \
+   _Static_assert(__types_ok(x, y),\
+   #op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); })
 
 #define __careful_cmp_const(op, x, y)  \
(BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



[PATCH next 10/11] block: Use a boolean expression instead of max() on booleans

2024-01-28 Thread David Laight
blk_stack_limits() contains:
t->zoned = max(t->zoned, b->zoned);
These are bool, so it is just a bitwise or.
However it generates:
error: comparison of constant ‘0’ with boolean expression is always true 
[-Werror=bool-compare]
inside the signedness check that max() does unless a '+ 0' is added.
It is a shame the compiler generates this warning for code that will
be optimised away.

Change so that the extra '+ 0' can be removed.

Signed-off-by: David Laight 
---
 block/blk-settings.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 06ea91e51b8b..9ca21fea039d 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -688,7 +688,7 @@ int blk_stack_limits(struct queue_limits *t, struct 
queue_limits *b,
   b->max_secure_erase_sectors);
t->zone_write_granularity = max(t->zone_write_granularity,
b->zone_write_granularity);
-   t->zoned = max(t->zoned, b->zoned);
+   t->zoned = t->zoned | b->zoned;
return ret;
 }
 EXPORT_SYMBOL(blk_stack_limits);
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


[PATCH next 09/11] tree-wide: minmax: Replace all the uses of max() for array sizes with max_const()

2024-01-28 Thread David Laight
These are the only uses of max() that require a constant value
from constant parameters.
There don't seem to be any similar uses of min().

Replacing the max() by max_const() lets min()/max() be simplified
speeding up compilation.

max_const() will convert enums to int (or unsigned int) so that the
casts added by max_t() are no longer needed.

Signed-off-by: David Laight 
---
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c | 2 +-
 drivers/gpu/drm/drm_color_mgmt.c   | 4 ++--
 drivers/input/touchscreen/cyttsp4_core.c   | 2 +-
 drivers/net/can/usb/etas_es58x/es58x_devlink.c | 2 +-
 fs/btrfs/tree-checker.c| 2 +-
 lib/vsprintf.c | 4 ++--
 net/ipv4/proc.c| 2 +-
 net/ipv6/proc.c| 2 +-
 8 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c 
b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
index 00cd615bbcdc..935fb4014f7c 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
@@ -708,7 +708,7 @@ static const char *smu_get_feature_name(struct smu_context 
*smu,
 size_t smu_cmn_get_pp_feature_mask(struct smu_context *smu,
   char *buf)
 {
-   int8_t sort_feature[max(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
+   int8_t sort_feature[max_const(SMU_FEATURE_COUNT, SMU_FEATURE_MAX)];
uint64_t feature_mask;
int i, feature_index;
uint32_t count = 0;
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index d021497841b8..43a6bd0ca960 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -532,8 +532,8 @@ int drm_plane_create_color_properties(struct drm_plane 
*plane,
 {
struct drm_device *dev = plane->dev;
struct drm_property *prop;
-   struct drm_prop_enum_list enum_list[max_t(int, DRM_COLOR_ENCODING_MAX,
-  DRM_COLOR_RANGE_MAX)];
+   struct drm_prop_enum_list enum_list[max_const(DRM_COLOR_ENCODING_MAX,
+ DRM_COLOR_RANGE_MAX)];
int i, len;
 
if (WARN_ON(supported_encodings == 0 ||
diff --git a/drivers/input/touchscreen/cyttsp4_core.c 
b/drivers/input/touchscreen/cyttsp4_core.c
index 7cb26929dc73..c6884c3c3fca 100644
--- a/drivers/input/touchscreen/cyttsp4_core.c
+++ b/drivers/input/touchscreen/cyttsp4_core.c
@@ -871,7 +871,7 @@ static void cyttsp4_get_mt_touches(struct cyttsp4_mt_data 
*md, int num_cur_tch)
struct cyttsp4_touch tch;
int sig;
int i, j, t = 0;
-   int ids[max(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
+   int ids[max_const(CY_TMA1036_MAX_TCH, CY_TMA4XX_MAX_TCH)];
 
memset(ids, 0, si->si_ofs.tch_abs[CY_TCH_T].max * sizeof(int));
for (i = 0; i < num_cur_tch; i++) {
diff --git a/drivers/net/can/usb/etas_es58x/es58x_devlink.c 
b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
index 635edeb8f68c..28fa87668bf8 100644
--- a/drivers/net/can/usb/etas_es58x/es58x_devlink.c
+++ b/drivers/net/can/usb/etas_es58x/es58x_devlink.c
@@ -215,7 +215,7 @@ static int es58x_devlink_info_get(struct devlink *devlink,
struct es58x_sw_version *fw_ver = _dev->firmware_version;
struct es58x_sw_version *bl_ver = _dev->bootloader_version;
struct es58x_hw_revision *hw_rev = _dev->hardware_revision;
-   char buf[max(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
+   char buf[max_const(sizeof("xx.xx.xx"), sizeof("axxx/xxx"))];
int ret = 0;
 
if (es58x_sw_version_is_valid(fw_ver)) {
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index 6eccf8496486..aec4729a9a82 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -615,7 +615,7 @@ static int check_dir_item(struct extent_buffer *leaf,
 */
if (key->type == BTRFS_DIR_ITEM_KEY ||
key->type == BTRFS_XATTR_ITEM_KEY) {
-   char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+   char namebuf[max_const(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
 
read_extent_buffer(leaf, namebuf,
(unsigned long)(di + 1), name_len);
diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 552738f14275..6c3c319afd86 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1080,8 +1080,8 @@ char *resource_string(char *buf, char *end, struct 
resource *res,
 #define FLAG_BUF_SIZE  (2 * sizeof(res->flags))
 #define DECODED_BUF_SIZE   sizeof("[mem - 64bit pref window disabled]")
 #define RAW_BUF_SIZE   sizeof("[mem - flags 0x]")
-   char sym[max(2*RSRC_BUF_SIZE + DECODED_BUF_SIZE,
-2*RSRC_BUF_SIZE + FLAG_BUF_SIZE + RAW_BUF_SIZE)];
+   char sym[max_const(2*RSRC_

[PATCH next 08/11 minmax: Add min_const() and max_const()

2024-01-28 Thread David Laight
The expansions of min() and max() contain statement expressions so are
not valid for static intialisers.
min_const() and max_const() are expressions so can be used for static
initialisers.
The arguments are checked for being constant and for negative signed
values being converted to large unsigned values.

Using these to size on-stack arrays lets min/max be simplified.
Zero is added before the compare to convert enum values to integers
avoinding the need for casts when enums have been used for constants.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 278a390b8a4c..c08916588425 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -60,19 +60,34 @@
#op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
__cmp_once(op, x, y, uniq); }))
 
+#define __careful_cmp_const(op, x, y)  \
+   (BUILD_BUG_ON_ZERO(!__is_constexpr((x) - (y))) +\
+   BUILD_BUG_ON_ZERO(!__types_ok(x, y)) +  \
+   __cmp(op, (x) + 0, (y) + 0))
+
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * min_const(@x, @y) is a constant expression for constant inputs.
  */
 #define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
+#define min_const(x, y)__careful_cmp_const(min, x, y)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
+ *
+ * If @x and @y are constants the return value is constant, but not 'constant
+ * enough' for things like static initialisers.
+ * max_const(@x, @y) is a constant expression for constant inputs.
  */
 #define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
+#define max_const(x, y)__careful_cmp_const(max, x, y)
 
 /**
  * umin - return minimum of two non-negative values
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 02/11] minmax: Use _Static_assert() instead of static_assert()

2024-01-28 Thread David Laight
The wrapper just adds two more lines of error output when the test fails.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 63c45865b48a..900eec7a28e5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -48,7 +48,7 @@
 #define __cmp_once(op, x, y, unique_x, unique_y) ({\
typeof(x) unique_x = (x);   \
typeof(y) unique_y = (y);   \
-   static_assert(__types_ok(x, y), \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
__cmp(op, unique_x, unique_y); })
 
@@ -137,11 +137,11 @@
typeof(val) unique_val = (val); 
\
typeof(lo) unique_lo = (lo);
\
typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   _Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(unique_val, unique_lo, unique_hi); })
 
 #define __careful_clamp(val, lo, hi) ({
\
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 01/11] minmax: Put all the clamp() definitions together

2024-01-28 Thread David Laight
The defines for clamp() have got separated, move togther for readability.
Update description of signedness check.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 120 +++--
 1 file changed, 56 insertions(+), 64 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 2ec559284a9f..63c45865b48a 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -57,26 +57,6 @@
__cmp(op, x, y),\
__cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
 
-#define __clamp(val, lo, hi)   \
-   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
-
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
-   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
-   (lo) <= (hi), true),
\
-   "clamp() low limit " #lo " greater than high limit " #hi);  
\
-   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
-   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
-   __clamp(unique_val, unique_lo, unique_hi); })
-
-#define __careful_clamp(val, lo, hi) ({
\
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
-__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
-
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
@@ -124,6 +104,22 @@
  */
 #define max3(x, y, z) max((typeof(x))max(x, y), z)
 
+/**
+ * min_t - return minimum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+
+/**
+ * max_t - return maximum of two values, using the specified type
+ * @type: data type to use
+ * @x: first value
+ * @y: second value
+ */
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
  * @x: value1
@@ -134,39 +130,60 @@
typeof(y) __y = (y);\
__x == 0 ? __y : ((__y == 0) ? __x : min(__x, __y)); })
 
+#define __clamp(val, lo, hi)   \
+   ((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
+
+#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
+   typeof(val) unique_val = (val); 
\
+   typeof(lo) unique_lo = (lo);
\
+   typeof(hi) unique_hi = (hi);
\
+   static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),
\
+   (lo) <= (hi), true),
\
+   "clamp() low limit " #lo " greater than high limit " #hi);  
\
+   static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");
\
+   static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");
\
+   __clamp(unique_val, unique_lo, unique_hi); })
+
+#define __careful_clamp(val, lo, hi) ({
\
+   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
+   __clamp(val, lo, hi),   \
+   __clamp_once(val, lo, hi, __UNIQUE_ID(__val),   \
+__UNIQUE_ID(__lo), __UNIQUE_ID(__hi))); })
+
 /**
  * clamp - return a value clamped to a given range with strict typechecking
  * @val: current value
  * @lo: lowest allowable value
  * @hi: highest allowable value
  *
- * This macro does strict typechecking of @lo/@hi to make sure they are of the
- * same type as @val.  See the unnecessary pointer comparisons.
+ * This macro checks that @val, @lo and @hi have the same signedness.
  */
 #define clamp(val, lo, hi) __careful_clamp(val, lo, hi)
 
-/*
- * ..and if you can't take the strict
- * types, you can specify one yourself.
- *
- * Or not use min/max/clamp at all, of course.
- */
-
 /**
- * min_t - return minimum of two values, using the specified type
- * @type: data type to use
- * @x: first value
- * @y: second value
+ * clamp_t - return a value clamped to a given range using a given type
+ * @type: the type of variable to use
+ * @val: current value
+

[PATCH next 00/11] minmax: Optimise to reduce .i line length.

2024-01-28 Thread David Laight
The changes to minmax.h that changed the type check to a signedness
check significantly increased the length of the expansion.
In some cases it has also significantly increased compile type.
This is particularly noticeable for nested expansions.

The fact that _Static_assert() only requires a compile time constant
not a constant expression allows a lot of simplification.

The other thing that complicates the expansion is the necessity of
returning a constant expression from constant arguments (for VLA).
I can only find a handful of places this is done.
Penalising most of the code for these few cases seems 'suboptimal'.
Instead I've added min_const() and max_const() for VLA and static
initialisers, these check the arguments are constant to avoid misuse.

Patch [9] is dependent on the earlier patches.
Patch [10] isn't dependant on them.
Patch [11] depends on both 9 and 10.

David Laight (11):
  [1] minmax: Put all the clamp() definitions together
  [2] minmax: Use _Static_assert() instead of static_assert()
  [3] minmax: Simplify signedness check
  [4] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__
  [5] minmax: Move the signedness check out of __cmp_once() and
__clamp_once()
  [6] minmax: Remove 'constexpr' check from __careful_clamp()
  [7] minmax: minmax: Add __types_ok3() and optimise defines with 3
arguments
  [8] minmax: Add min_const() and max_const()
  [9] tree-wide: minmax: Replace all the uses of max() for array sizes with
max_const().
  [10] block: Use a boolean expression instead of max() on booleans
  [11] minmax: min() and max() don't need to return constant expressions

 block/blk-settings.c  |   2 +-
 drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c|   2 +-
 drivers/gpu/drm/drm_color_mgmt.c  |   4 +-
 drivers/input/touchscreen/cyttsp4_core.c  |   2 +-
 .../net/can/usb/etas_es58x/es58x_devlink.c|   2 +-
 fs/btrfs/tree-checker.c   |   2 +-
 include/linux/minmax.h| 211 ++
 lib/vsprintf.c|   4 +-
 net/ipv4/proc.c   |   2 +-
 net/ipv6/proc.c   |   2 +-
 10 files changed, 127 insertions(+), 106 deletions(-)

-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 00611] minmax: Remove 'constexpr' check from __careful_clamp()

2024-01-28 Thread David Laight
Nothing requires that clamp() return a constant expression.
The logic to do so significantly increases the .i file.
Remove the check and directly expand __clamp_once() from clamp_t()
since the type check can't fail.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 111c52a14fe5..5c7fce76abe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -141,12 +141,10 @@
"clamp() low limit " #lo " greater than high limit " #hi);  
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) \
-   __builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
-   __clamp(val, lo, hi),   \
-   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness 
error");\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness 
error");   \
-   __clamp_once(val, lo, hi, uniq); }))
+#define __careful_clamp(val, lo, hi, uniq) ({  
\
+   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   __clamp_once(val, lo, hi, uniq); })
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
@@ -168,7 +166,9 @@
  * This macro does no typechecking and uses temporary variables of type
  * @type to make all the comparisons.
  */
-#define clamp_t(type, val, lo, hi) clamp((type)(val), (type)(lo), (type)(hi))
+#define __clamp_t(type, val, lo, hi, uniq) \
+   __clamp_once((type)(val), (type)(lo), (type)(hi), uniq)
+#define clamp_t(type, val, lo, hi) __clamp_t(type, val, lo, hi, __COUNTER__)
 
 /**
  * clamp_val - return a value clamped to a given range using val's type
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 0711] minmax: minmax: Add __types_ok3() and optimise defines with 3 arguments

2024-01-28 Thread David Laight
min3() and max3() were added to optimise nested min(x, min(y, z))
sequences, bit only moved where the expansion was requiested.

Add a separate implementation for 3 argument calls.
These are never required to generate constant expressiions to
remove that logic.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 23 +++
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 5c7fce76abe5..278a390b8a4c 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -38,6 +38,11 @@
((__is_ok_signed(x) && __is_ok_signed(y)) ||\
 (__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
+/* Check three values for min3(), max3() and clamp() */
+#define __types_ok3(x, y, z)   
\
+   ((__is_ok_signed(x) && __is_ok_signed(y) && __is_ok_signed(z)) ||   
\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y) && __is_ok_unsigned(z)))
+
 #define __cmp_op_min <
 #define __cmp_op_max >
 
@@ -87,13 +92,24 @@
 #define umax(x, y) \
__careful_cmp(max, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
+#define __cmp_once3(op, x, y, z, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(x) __y_##uniq = (y); \
+   typeof(x) __z_##uniq = (z); \
+   __cmp(op, __cmp(op, __x_##uniq, __y_##uniq), __z_##uniq); })
+
+#define __careful_cmp3(op, x, y, z, uniq) ({   \
+   static_assert(__types_ok3(x, y, z), \
+   #op "3(" #x ", " #y ", " #z ") signedness error");  \
+   __cmp_once3(op, x, y, z, uniq); })
+
 /**
  * min3 - return minimum of three values
  * @x: first value
  * @y: second value
  * @z: third value
  */
-#define min3(x, y, z) min((typeof(x))min(x, y), z)
+#define min3(x, y, z) __careful_cmp3(min, x, y, z, __COUNTER__)
 
 /**
  * max3 - return maximum of three values
@@ -101,7 +117,7 @@
  * @y: second value
  * @z: third value
  */
-#define max3(x, y, z) max((typeof(x))max(x, y), z)
+#define max3(x, y, z) __careful_cmp3(max, x, y, z, __COUNTER__)
 
 /**
  * min_t - return minimum of two values, using the specified type
@@ -142,8 +158,7 @@
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
 #define __careful_clamp(val, lo, hi, uniq) ({  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
+   _Static_assert(__types_ok3(val, lo, hi), "clamp() signedness error");   
\
__clamp_once(val, lo, hi, uniq); })
 
 /**
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 04/11] minmax: Replace multiple __UNIQUE_ID() by directly using __COUNTER__

2024-01-28 Thread David Laight
Provided __COUNTER__ is passed through an extra #define it can be pasted
onto multiple local variables to give unique names.
This saves having 3 __UNIQUE_ID() for #defines with three locals and
look less messy in general.

Stop the umin()/umax() lines being overlong by factoring out the
zero-extension logic.

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 48 +-
 1 file changed, 24 insertions(+), 24 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index c32b4b40ce01..8ee003d8abaf 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish several things:
+ * min()/max()/clamp() macros must accomplish three things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -43,31 +43,31 @@
 
 #define __cmp(op, x, y)((x) __cmp_op_##op (y) ? (x) : (y))
 
-#define __cmp_once(op, x, y, unique_x, unique_y) ({\
-   typeof(x) unique_x = (x);   \
-   typeof(y) unique_y = (y);   \
-   _Static_assert(__types_ok(x, y),\
+#define __cmp_once(op, x, y, uniq) ({  \
+   typeof(x) __x_##uniq = (x); \
+   typeof(y) __y_##uniq = (y); \
+   _Static_assert(__types_ok(x, y),\
#op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
-   __cmp(op, unique_x, unique_y); })
+   __cmp(op, __x_##uniq, __y_##uniq); })
 
-#define __careful_cmp(op, x, y)\
+#define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, __UNIQUE_ID(__x), __UNIQUE_ID(__y)))
+   __cmp_once(op, x, y, uniq))
 
 /**
  * min - return minimum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define min(x, y)  __careful_cmp(min, x, y)
+#define min(x, y)  __careful_cmp(min, x, y, __COUNTER__)
 
 /**
  * max - return maximum of two values of the same or compatible types
  * @x: first value
  * @y: second value
  */
-#define max(x, y)  __careful_cmp(max, x, y)
+#define max(x, y)  __careful_cmp(max, x, y, __COUNTER__)
 
 /**
  * umin - return minimum of two non-negative values
@@ -75,8 +75,9 @@
  * @x: first value
  * @y: second value
  */
+#define __zero_extend(x) ((x) + 0u + 0ul + 0ull)
 #define umin(x, y) \
-   __careful_cmp(min, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(min, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
 /**
  * umax - return maximum of two non-negative values
@@ -84,7 +85,7 @@
  * @y: second value
  */
 #define umax(x, y) \
-   __careful_cmp(max, (x) + 0u + 0ul + 0ull, (y) + 0u + 0ul + 0ull)
+   __careful_cmp(max, __zero_extend(x), _zero_extend(y), __COUNTER__)
 
 /**
  * min3 - return minimum of three values
@@ -108,7 +109,7 @@
  * @x: first value
  * @y: second value
  */
-#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y))
+#define min_t(type, x, y)  __careful_cmp(min, (type)(x), (type)(y), 
__COUNTER__)
 
 /**
  * max_t - return maximum of two values, using the specified type
@@ -116,7 +117,7 @@
  * @x: first value
  * @y: second value
  */
-#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y))
+#define max_t(type, x, y)  __careful_cmp(max, (type)(x), (type)(y), 
__COUNTER__)
 
 /**
  * min_not_zero - return the minimum that is _not_ zero, unless both are zero
@@ -131,22 +132,21 @@
 #define __clamp(val, lo, hi)   \
((val) >= (hi) ? (hi) : ((val) <= (lo) ? (lo) : (val)))
 
-#define __clamp_once(val, lo, hi, unique_val, unique_lo, unique_hi) ({ 
\
-   typeof(val) unique_val = (val); 
\
-   typeof(lo) unique_lo = (lo);
\
-   typeof(hi) unique_hi = (hi);
\
+#define __clamp_once(val, lo, hi, uniq) ({ 
\
+   typeof(val) __val_##uniq = (val);   
\
+   typeof(lo) __lo_##uniq = (lo);  
\
+   typeof(hi) __hi_##uniq = (hi);  
\
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
_Static_assert(__types_ok(val, lo), "clamp() 'lo' si

RE: [PATCH next 05/11] minmax: Move the signedness check out of __cmp_once() and __clamp_once()

2024-01-28 Thread David Laight
There is no need to do the signedness/type check when the arguments
are being cast to a fixed type.
So move the check out of __xxx_once() into __careful_xxx().

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 8ee003d8abaf..111c52a14fe5 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -46,14 +46,14 @@
 #define __cmp_once(op, x, y, uniq) ({  \
typeof(x) __x_##uniq = (x); \
typeof(y) __y_##uniq = (y); \
-   _Static_assert(__types_ok(x, y),\
-   #op "(" #x ", " #y ") signedness error, fix types or consider 
u" #op "() before " #op "_t()"); \
__cmp(op, __x_##uniq, __y_##uniq); })
 
 #define __careful_cmp(op, x, y, uniq)  \
__builtin_choose_expr(__is_constexpr((x) - (y)),\
__cmp(op, x, y),\
-   __cmp_once(op, x, y, uniq))
+   ({ _Static_assert(__types_ok(x, y), \
+   #op "(" #x ", " #y ") signedness error, fix types or 
consider u" #op "() before " #op "_t()"); \
+   __cmp_once(op, x, y, uniq); }))
 
 /**
  * min - return minimum of two values of the same or compatible types
@@ -139,14 +139,14 @@
_Static_assert(__builtin_choose_expr(__is_constexpr((lo) > (hi)),   
\
(lo) <= (hi), true),
\
"clamp() low limit " #lo " greater than high limit " #hi);  
\
-   _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness error");   
\
-   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness error");   
\
__clamp(__val_##uniq, __lo_##uniq, __hi_##uniq); })
 
-#define __careful_clamp(val, lo, hi, uniq) ({  \
+#define __careful_clamp(val, lo, hi, uniq) \
__builtin_choose_expr(__is_constexpr((val) - (lo) + (hi)),  \
__clamp(val, lo, hi),   \
-   __clamp_once(val, lo, hi, uniq)); })
+   ({ _Static_assert(__types_ok(val, lo), "clamp() 'lo' signedness 
error");\
+   _Static_assert(__types_ok(val, hi), "clamp() 'hi' signedness 
error");   \
+   __clamp_once(val, lo, hi, uniq); }))
 
 /**
  * clamp - return a value clamped to a given range with strict typechecking
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH next 03/11] minmax: Simplify signedness check

2024-01-28 Thread David Laight
It is enough to check that both 'x' and 'y' are valid for either
a signed compare or an unsigned compare.
For unsigned they must be an unsigned type or a positive constant.
For signed they must be signed after unsigned char/short are promoted.

The predicate for _Static_assert() only needs to be a compile-time
constant not a constant integeger expression.
In particular the short-circuit evaluation of || && ?: can be used
to avoid the non-constantness of (pointer_type)1 in is_signed_type().

The '+ 0' in '(x) + 0 > = 0' is there to convert 'bool' to 'int'
and avoid a compiler warning because max() gets used for 'bool'
in one place (a very expensive 'or').
(The code is optimised away by two earlier checks - but the compiler
still bleats.)

Signed-off-by: David Laight 
---
 include/linux/minmax.h | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/include/linux/minmax.h b/include/linux/minmax.h
index 900eec7a28e5..c32b4b40ce01 100644
--- a/include/linux/minmax.h
+++ b/include/linux/minmax.h
@@ -8,7 +8,7 @@
 #include 
 
 /*
- * min()/max()/clamp() macros must accomplish three things:
+ * min()/max()/clamp() macros must accomplish several things:
  *
  * - Avoid multiple evaluations of the arguments (so side-effects like
  *   "x++" happen only once) when non-constant.
@@ -26,19 +26,17 @@
 #define __typecheck(x, y) \
(!!(sizeof((typeof(x) *)1 == (typeof(y) *)1)))
 
-/* is_signed_type() isn't a constexpr for pointer types */
-#define __is_signed(x) 
\
-   __builtin_choose_expr(__is_constexpr(is_signed_type(typeof(x))),
\
-   is_signed_type(typeof(x)), 0)
+/* Allow unsigned compares against non-negative signed constants. */
+#define __is_ok_unsigned(x) \
+   (is_unsigned_type(typeof(x)) || (__is_constexpr(x) ? (x) + 0 >= 0 : 0))
 
-/* True for a non-negative signed int constant */
-#define __is_noneg_int(x)  \
-   (__builtin_choose_expr(__is_constexpr(x) && __is_signed(x), x, -1) >= 0)
+/* Check for signed after promoting unsigned char/short to int */
+#define __is_ok_signed(x) is_signed_type(typeof((x) + 0))
 
-#define __types_ok(x, y)   \
-   (__is_signed(x) == __is_signed(y) ||\
-   __is_signed((x) + 0) == __is_signed((y) + 0) || \
-   __is_noneg_int(x) || __is_noneg_int(y))
+/* Allow if both x and y are valid for either signed or unsigned compares. */
+#define __types_ok(x, y)   \
+   ((__is_ok_signed(x) && __is_ok_signed(y)) ||\
+(__is_ok_unsigned(x) && __is_ok_unsigned(y)))
 
 #define __cmp_op_min <
 #define __cmp_op_max >
-- 
2.17.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 00/27] sparc32: sunset sun4m and sun4d

2023-12-20 Thread David Laight
From: Arnd Bergmann
> Sent: 20 December 2023 08:37
> 
> On Tue, Dec 19, 2023, at 22:03, Sam Ravnborg via B4 Relay wrote:
> > TODO before this can be applied:
> > - Ack from davem - as he is the principal sparc maintainer
> > - Tested-by: preferably on a target or QEMU (see above)
> >   I expect bugs as there are some involved changes!
> >
> > Ideas for the future
> > - Apply the most relevant downstream Gaisler patches
> >   - The ones introducing CAS should have preference as we then
> > can drop the cmpxchg emulation
> 
> One note about the CAS -- as far as I can tell, the absence
> of the futex() syscall on sparc32 kernels means that no glibc
> from the past decade can work correctly as it now requires futex
> for its internal locking, though it does work on sparc64 kernels
> in compat32 mode as well as the LEON3 kernel that adds futex
> support.

Does the glibc mutex 'fast path' also require a CAS instruction?

Presumably having CAS also removes the 'really horrid (tm)' code
required to make all the bitmap operations atomic?

Seems a shame to lose old sparc32 support when (I think) the sun3
in my cupboard would still work - if only it had a working psu.
(The 110/220V switch wasn't connected and the FET wasn't rated
for 450V. UK mains can be 240+10% if you are near a substation.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [RFC PATCH v3 09/12] net: add support for skbs with unreadable frags

2023-11-08 Thread David Laight
From: Mina Almasry
> Sent: 06 November 2023 02:44
> 
> For device memory TCP, we expect the skb headers to be available in host
> memory for access, and we expect the skb frags to be in device memory
> and unaccessible to the host. We expect there to be no mixing and
> matching of device memory frags (unaccessible) with host memory frags
> (accessible) in the same skb.
> 
> Add a skb->devmem flag which indicates whether the frags in this skb
> are device memory frags or not.
> 
...
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 1fae276c1353..8fb468ff8115 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -805,6 +805,8 @@ typedef unsigned char *sk_buff_data_t;
>   *   @csum_level: indicates the number of consecutive checksums found in
>   *   the packet minus one that have been verified as
>   *   CHECKSUM_UNNECESSARY (max 3)
> + *   @devmem: indicates that all the fragments in this skb are backed by
> + *   device memory.
>   *   @dst_pending_confirm: need to confirm neighbour
>   *   @decrypted: Decrypted SKB
>   *   @slow_gro: state present at GRO time, slower prepare step required
> @@ -991,7 +993,7 @@ struct sk_buff {
>  #if IS_ENABLED(CONFIG_IP_SCTP)
>   __u8csum_not_inet:1;
>  #endif
> -
> + __u8devmem:1;
>  #if defined(CONFIG_NET_SCHED) || defined(CONFIG_NET_XGRESS)
>   __u16   tc_index;   /* traffic control index */
>  #endif
> @@ -1766,6 +1768,12 @@ static inline void skb_zcopy_downgrade_managed(struct 
> sk_buff *skb)
>   __skb_zcopy_downgrade_managed(skb);
>  }

Doesn't that bloat struct sk_buff?
I'm not sure there are any spare bits available.
Although CONFIG_NET_SWITCHDEV and CONFIG_NET_SCHED seem to
already add padding.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [RFC PATCH v2 00/11] Device Memory TCP

2023-08-15 Thread David Laight
From: Mina Almasry
> Sent: 10 August 2023 02:58
...
> * TL;DR:
> 
> Device memory TCP (devmem TCP) is a proposal for transferring data to and/or
> from device memory efficiently, without bouncing the data to a host memory
> buffer.

Doesn't that really require peer-to-peer PCIe transfers?
IIRC these aren't supported by many root hubs and have
fundamental flow control and/or TLP credit problems.

I'd guess they are also pretty incompatible with IOMMU?

I can see how you might manage to transmit frames from
some external memory (eg after encryption) but surely
processing receive data that way needs the packets
be filtered by both IP addresses and port numbers before
being redirected to the (presumably limited) external
memory.

OTOH isn't the kernel going to need to run code before
the packet is actually sent and just after it is received?
So all you might gain is a bit of latency?
And a bit less utilisation of host memory??
But if your system is really limited by cpu-memory bandwidth
you need more cache :-)

So how much benefit is there over efficient use of host
memory bounce buffers??

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v4 1/1] drm/i915: Move abs_diff() to math.h

2023-08-04 Thread David Laight
From: Andrew Morton
> Sent: 03 August 2023 18:25
> 
> On Thu,  3 Aug 2023 16:19:18 +0300 Andy Shevchenko 
>  wrote:
> 
> > abs_diff() belongs to math.h. Move it there.
> > This will allow others to use it.
> >
> > ...
> >
> > --- a/include/linux/math.h
> > +++ b/include/linux/math.h
> > @@ -155,6 +155,13 @@ __STRUCT_FRACT(u32)
> > __builtin_types_compatible_p(typeof(x), unsigned type), \
> > ({ signed type __x = (x); __x < 0 ? -__x : __x; }), other)
> >
> > +#define abs_diff(a, b) ({  \
> > +   typeof(a) __a = (a);\
> > +   typeof(b) __b = (b);\
> > +   (void)(&__a == &__b);   \
> > +   __a > __b ? (__a - __b) : (__b - __a);  \
> > +})
> 
> Can we document it please?
> 
> Also, the open-coded type comparison could be replaced with __typecheck()?
> 
> And why the heck isn't __typecheck() in typecheck.h, to be included by
> minmax.h.

And why would you want to use __typecheck() anyway?
It pretty much isn't the test you are looking for.
If you are trying to explicitly avoid converting negative value
to large positive unsigned ones then you want something like:
is_signed_type(typeof(a)) == is_signed_type(typeof(b))
but it isn't even that simple :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH v3 3/4] drm/mediatek: Add casting before assign

2023-07-17 Thread David Laight
From: Jason-JH Lin
> Sent: 14 July 2023 07:46
> 
> Hi CK,
> 
> Thanks for the reviews.
> 
> On Fri, 2023-07-14 at 05:45 +, CK Hu (胡俊光) wrote:
> > Hi, Jason:
> >
> > On Wed, 2023-06-21 at 18:22 +0800, Jason-JH.Lin wrote:
> > > 1. Add casting before assign to avoid the unintentional integer
> > >overflow or unintended sign extension.
> > > 2. Add a int varriable for multiplier calculation instead of
> > > calculating
> > >different types multiplier with dma_addr_t varriable directly.
> >
> > I agree with these modification, but the title does not match the
> > modification.
> >
> > Regards,
> > CK
> 
> I'll change the title and commit msg at the next version below:
> 
> Fix unintentional integer overflow in multiplying different types
> 
> 1. Instead of multiplying 2 variable of different types. Change to
> assign a value of one variable and then multiply the other variable.
> 
> 2. Add a int variable for multiplier calculation instead of calculating
> different types multiplier with dma_addr_t variable directly.

I'm pretty sure the patch makes absolutely no difference.
In C all arithmetic is done with char/short (inc. unsigned)
promoted to int.

So the only likely overflow is if the values exceed 2^31.
Since the temporaries you are using are 'int' this isn't true.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] udmabuf: revert 'Add support for mapping hugepages (v4)'

2023-06-13 Thread David Laight
From: Kasireddy, Vivek 
> Sent: 13 June 2023 09:26
...
> > Is my understanding correct, that we can effectively long-term pin
> > (worse than mlock) 64 MiB per UDMABUF_CREATE, allowing eventually !root
> > users
>
> The 64 MiB limit is the theoretical upper bound that we have not seen hit in
> practice. Typically, for a 1920x1080 resolution (commonly used in Guests),
> the size of the FB is ~8 MB (1920x1080x4). And, most modern Graphics
> compositors flip between two FBs.

What code does and what potentially malicious code might do
are entirely different things.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] drm/drm_vblank.c: avoid unsigned int to signed int cast

2023-05-23 Thread David Laight
From: 15330273...@189.cn <15330273...@189.cn>
> Sent: 23 May 2023 05:27
> 
> On 2023/5/22 19:29, Jani Nikula wrote:
> > In general, do not use unsigned types in arithmethic to avoid negative
> > values, because most people will be tripped over by integer promotion
> > rules, and you'll get negative values anyway.
> 
> 
> Here I'm sure about this,
> 
> but there are plenty unsigned types arithmetic in the kernel.

The real problem is (attempted) arithmetic on types smaller than int.
Regardless of whether they are signed or unsigned.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] drm/drm_vblank.c: avoid unsigned int to signed int cast

2023-05-22 Thread David Laight
From: 15330273...@189.cn <15330273...@189.cn>
> Sent: 22 May 2023 12:56
...
> > I'll bet most people will be surprised to see what this prints:
> >
> > #include 
> > #include 
> >
> > int main(void)
> > {
> > uint16_t x = 0x;
> > uint16_t y = 0x;
> > uint64_t z = x * y;
> >
> > printf("0x%016lx\n", z);
> > printf("%ld\n", z);
> 
> Here, please replace the "%ld\n" with the "%lu\n", then you will see the
> difference.
> 
> you are casting the variable 'z' to signed value,  "%d" is for printing
> signed value, and "%u" is for printing unsigned value.

That makes very little difference on 2's compliment systems.
They both display the contents of the variable.

> Your simple code explained exactly why you are still in confusion,
> 
> that is u16 * u16  can yield a negative value if you use the int as the
> return type. Because it overflowed.

There is no 'return type', the type of 'u16 * u16' is signed int.
When 'signed int' is promoted/cast to u64 it is first sign extended
to 64 bits.

You can get what you want/expect by either forcing an unsigned multiply
or by explicitly casting the result of the multiply to u32.
So the product in 'z = (x + 0u) * y' is 'unsigned int' it gets
promoted to int64_t (ie a signed type) and then converted to
unsigned.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] drm/drm_vblank.c: avoid unsigned int to signed int cast

2023-05-17 Thread David Laight
From: 15330273...@189.cn
> Sent: 16 May 2023 18:30
> 
> From: Sui Jingfeng 
> 
> Both mode->crtc_htotal and mode->crtc_vtotal are u16 type,
> mode->crtc_htotal * mode->crtc_vtotal will results a unsigned type.

Nope, u16 gets promoted to 'signed int' and the result of the
multiply is also signed.

> Using a u32 is enough to store the result, but considering that the
> result will be casted to u64 soon after. We use a u64 type directly.
> So there no need to cast it to signed type and cast back then.

> - int frame_size = mode->crtc_htotal * mode->crtc_vtotal;
> + u64 frame_size = mode->crtc_htotal * mode->crtc_vtotal;
...
> - framedur_ns = div_u64((u64) frame_size * 100, dotclock);
> + framedur_ns = div_u64(frame_size * 100, dotclock);

The (u64) cast is there to extend the value to 64bits, not
because the original type is signed.
The compiler will detect that the old code is a 32x32 multiply
where a 64bit result is needed, that may not be true for the
changed code (it would need to track back as far as the u16s).

It is not uncommon to force a 64bit result from a multiply
by making the constant 64bit. As in:
div_u64(frame_size * 100ULL, dotclock);

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH v7 1/7] fbdev/hitfb: Cast I/O offset to address

2023-05-12 Thread David Laight
From: Thomas Zimmermann
> Sent: 12 May 2023 11:25
> 
> Cast I/O offsets to pointers to use them with I/O functions. The I/O
> functions expect pointers of type 'volatile void __iomem *', but the
> offsets are plain integers. Build warnings are
> 
>   ../drivers/video/fbdev/hitfb.c: In function 'hitfb_accel_wait':
>   ../arch/x86/include/asm/hd64461.h:18:33: warning: passing argument 1 of 
> 'fb_readw' makes pointer
> from integer without a cast [-Wint-conversion]
>18 | #define HD64461_IO_OFFSET(x)(HD64461_IOBASE + (x))
>   | ^~
...
>52 | static inline u16 fb_readw(const volatile void __iomem *addr)
>   |~^~~~
> 
> This patch only fixes the build warnings. It's not clear if the I/O
> offsets can legally be passed to the I/O helpers. It was apparently
> broken in 2007 when custom inw()/outw() helpers got removed by
> commit 34a780a0afeb ("sh: hp6xx pata_platform support."). Fixing the
> driver would require setting the I/O base address.

Did you try changing the definition of HD64461_IOBASE to include
a (volatile void __iomem *) cast?
A lot less churn...

I'm guessing that 'sh' deosn't have in/out instructions so this
is something that is always mapped at a fixed kernel virtual address?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 0/4] log2: make is_power_of_2() more generic

2023-03-31 Thread David Laight
From: Andrew Morton
> Sent: 30 March 2023 23:19
> 
> On Thu, 30 Mar 2023 21:53:03 +0000 David Laight  
> wrote:
> 
> > > But wouldn't all these issues be addressed by simply doing
> > >
> > > #define is_power_of_2(n) (n != 0 && ((n & (n - 1)) == 0))
> > >
> > > ?
> > >
> > > (With suitable tweaks to avoid evaluating `n' more than once)
> >
> > I think you need to use the 'horrid tricks' from min() to get
> > a constant expression from constant inputs.
> 
> This
> 
> --- a/include/linux/log2.h~a
> +++ a/include/linux/log2.h
> @@ -41,11 +41,11 @@ int __ilog2_u64(u64 n)
>   * *not* considered a power of two.
>   * Return: true if @n is a power of 2, otherwise false.
>   */
> -static inline __attribute__((const))
> -bool is_power_of_2(unsigned long n)
> -{
> - return (n != 0 && ((n & (n - 1)) == 0));
> -}
> +#define is_power_of_2(_n)\
> + ({  \
> + typeof(_n) n = (_n);\
> + n != 0 && ((n & (n - 1)) == 0); \
> + })
> 
>  /**
>   * __roundup_pow_of_two() - round up to nearest power of two
> _
> 
> worked for me in a simple test.
> 
> --- a/fs/open.c~b
> +++ a/fs/open.c
> @@ -1564,3 +1564,10 @@ int stream_open(struct inode *inode, str
>  }
> 
>  EXPORT_SYMBOL(stream_open);
> +
> +#include 
> +
> +int foo(void)
> +{
> + return is_power_of_2(43);
> +}
> _
> 
> 
> foo:
> # fs/open.c:1573: }
>   xorl%eax, %eax  #
>   ret
> 
> 
> Is there some more tricky situation where it breaks?

Try:
static int x = is_power_of_2(43);

I suspect that some (all?) of the compile-time assert checks won't
like ({...}) either.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 0/4] log2: make is_power_of_2() more generic

2023-03-30 Thread David Laight
From: Andrew Morton
> Sent: 30 March 2023 20:51
> 
> On Thu, 30 Mar 2023 13:42:39 +0300 Jani Nikula  wrote:
> 
> > is_power_of_2() only works for types <= sizeof(unsigned long) and it's
> > also not a constant expression. There are a number of places in kernel
> > where is_power_of_2() is called on u64, which fails on 32-bit
> > builds. Try to remedy that. While at it, make it a constant expression
> > when possible.
> 
> Yes, the current `is_power_of_2(unsigned long n)' isn't very general.
> 
> But wouldn't all these issues be addressed by simply doing
> 
> #define is_power_of_2(n) (n != 0 && ((n & (n - 1)) == 0))
> 
> ?
> 
> (With suitable tweaks to avoid evaluating `n' more than once)

I think you need to use the 'horrid tricks' from min() to get
a constant expression from constant inputs.

For non-constants this looks ok (see https://godbolt.org/z/G73MTr9jn)

David

static inline int lisp2(unsigned long n)
{
return n && !(n & (n - 1));
}

static inline int llisp2(unsigned long long lln)
{
#if 0  // I think this looks worse, esp. for gcc on x86
return lln && !(lln & (lln - 1));
#else
unsigned long n = lln;
if (lln >= 1ull << 32) {
if (n)
return 0;
n = lln >> 32; 
}
return lisp2(n);
#endif
}

#define isp2(n) (sizeof ((n)+0) == sizeof (long) ? lisp2(n) : llisp2(n))

int is12(unsigned long long  i)
{
return isp2(i);
}

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH] drm/ssd130x: Silence a `dubious: x & !y` warning

2023-01-23 Thread David Laight
From: Javier Martinez Canillas
> Sent: 21 January 2023 19:10
> 
> The sparse tool complains with the following warning:
> 
> $ make M=drivers/gpu/drm/solomon/ C=2
>   CC [M]  drivers/gpu/drm/solomon/ssd130x.o
>   CHECK   drivers/gpu/drm/solomon/ssd130x.c
> drivers/gpu/drm/solomon/ssd130x.c:363:21: warning: dubious: x & !y
> 
> This seems to be a false positive in my opinion but still we can silence
> the tool while making the code easier to read. Let's also add a comment,
> to explain why the "com_seq" logical not is used rather than its value.
> 
> Reported-by: Thomas Zimmermann 
> Signed-off-by: Javier Martinez Canillas 
> Reviewed-by: Thomas Zimmermann 
> ---
> 
>  drivers/gpu/drm/solomon/ssd130x.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/solomon/ssd130x.c 
> b/drivers/gpu/drm/solomon/ssd130x.c
> index c3bf3a18302e..b16330a8b624 100644
> --- a/drivers/gpu/drm/solomon/ssd130x.c
> +++ b/drivers/gpu/drm/solomon/ssd130x.c
> @@ -81,7 +81,7 @@
>  #define SSD130X_SET_PRECHARGE_PERIOD2_MASK   GENMASK(7, 4)
>  #define SSD130X_SET_PRECHARGE_PERIOD2_SET(val)   
> FIELD_PREP(SSD130X_SET_PRECHARGE_PERIOD2_MASK, (val))
>  #define SSD130X_SET_COM_PINS_CONFIG1_MASKGENMASK(4, 4)
> -#define SSD130X_SET_COM_PINS_CONFIG1_SET(val)
> FIELD_PREP(SSD130X_SET_COM_PINS_CONFIG1_MASK, !(val))
> +#define SSD130X_SET_COM_PINS_CONFIG1_SET(val)
> FIELD_PREP(SSD130X_SET_COM_PINS_CONFIG1_MASK, (val))

How about just changing !(val) to (val) ? 0 : 1
It should shut the compiler up and is probably more descriptive.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [Intel-gfx] [PATCH 1/5] linux/minmax.h: add non-atomic version of xchg

2023-01-05 Thread David Laight
From: Daniel Vetter
> Sent: 05 January 2023 14:13
...
> > > So here we are, with Andrzej looking to add the common helper. And the
> > > same concerns crop up. What should it be called to make it clear that
> > > it's not atomic? Is that possible?
> >
> > old_value = read_write(variable, new_value);
> >
> > But two statements are much clearer.
> 
> Yeah this is my point for fetch_and_zero or any of the other proposals.
> We're essentially replacing these two lines:
> 
>   var = some->pointer->chase;
>   some->pointer->chase = NULL;
> 
> with a macro. C is verbose, and sometimes painfully so,

Try ADA or VHDL :-)

> if the pointer
> chase is really to onerous then I think that should be refactored with a
> meaningfully locally name variable, not fancy macros wrapped around to
> golf a few characters away.

Provided 'var' is a local the compiler is pretty likely to only do the
'pointer chase' once.
You can also do:
var = NULL;
swap(some->pointer->chase, var);
and get pretty much the same object code.

> But what about swap() you ask? That one needs a temp variable, and it does
> make sense to hide that in a ({}) block in a macro.

Sometimes, but not enough for the 'missed opportunity for swap()'
message. 

> But for the above two
> lines I really don't see a point outside of obfuscated C contexts.

Indeed.

Isn't the suggested __xchg() in one of the 'reserved for implementation'
namespaces - so shouldn't be a function that might be expected to be
actually used.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [Intel-gfx] [PATCH 1/5] linux/minmax.h: add non-atomic version of xchg

2023-01-05 Thread David Laight
From: Jani Nikula
> Sent: 05 January 2023 13:28
> 
> On Thu, 05 Jan 2023, Daniel Vetter  wrote:
> > On Mon, Dec 12, 2022 at 09:38:12AM +0000, David Laight wrote:
> >> From: Andrzej Hajda 
> >> > Sent: 09 December 2022 15:49
> >> >
> >> > The pattern of setting variable with new value and returning old
> >> > one is very common in kernel. Usually atomicity of the operation
> >> > is not required, so xchg seems to be suboptimal and confusing in
> >> > such cases. Since name xchg is already in use and __xchg is used
> >> > in architecture code, proposition is to name the macro exchange.
> >>
> >> Dunno, if it is non-atomic then two separate assignment statements
> >> is decidedly more obvious and needs less brain cells to process.
> >> Otherwise someone will assume 'something clever' is going on
> >> and the operation is atomic.
> >
> > Yes, this also my take. The i915 code that uses this to excess is decidely
> > unreadable imo, and the macro should simply be replaced by open-coded
> > versions.
> >
> > Not moved into shared headers where even more people can play funny games
> > with it.
> 
> My stand in i915 has been that the local fetch_and_zero() needs to
> go. Either replaced by a common helper in core kernel headers, or open
> coded, I personally don't care, but the local version can't stay.
> 
> My rationale has been that fetch_and_zero() looks atomic and looks like
> it comes from shared headers, but it's neither. It's deceptive. It
> started small and harmless, but things like this just proliferate and
> get copy-pasted all over the place.
> 
> So here we are, with Andrzej looking to add the common helper. And the
> same concerns crop up. What should it be called to make it clear that
> it's not atomic? Is that possible?

old_value = read_write(variable, new_value);

But two statements are much clearer.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 1/5] linux/minmax.h: add non-atomic version of xchg

2022-12-12 Thread David Laight
From: Andrzej Hajda 
> Sent: 09 December 2022 15:49
> 
> The pattern of setting variable with new value and returning old
> one is very common in kernel. Usually atomicity of the operation
> is not required, so xchg seems to be suboptimal and confusing in
> such cases. Since name xchg is already in use and __xchg is used
> in architecture code, proposition is to name the macro exchange.

Dunno, if it is non-atomic then two separate assignment statements
is decidedly more obvious and needs less brain cells to process.
Otherwise someone will assume 'something clever' is going on
and the operation is atomic.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH] video: fbdev: sis: use explicitly signed char

2022-10-25 Thread David Laight
From: Jason A. Donenfeld
> Sent: 24 October 2022 17:29
> To: linux-ker...@vger.kernel.org
> 
> With char becoming unsigned by default, and with `char` alone being
> ambiguous and based on architecture, signed chars need to be marked
> explicitly as such. This fixes warnings like:
> 
...
> ---
>  drivers/usb/misc/sisusbvga/sisusb_struct.h | 2 +-
>  drivers/video/fbdev/sis/vstruct.h  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/usb/misc/sisusbvga/sisusb_struct.h 
> b/drivers/usb/misc/sisusbvga/sisusb_struct.h
> index 3df64d2a9d43..a86032a26d36 100644
> --- a/drivers/usb/misc/sisusbvga/sisusb_struct.h
> +++ b/drivers/usb/misc/sisusbvga/sisusb_struct.h
> @@ -91,7 +91,7 @@ struct SiS_Ext {
>   unsigned char VB_ExtTVYFilterIndex;
>   unsigned char VB_ExtTVYFilterIndexROM661;
>   unsigned char REFindex;
> - char ROMMODEIDX661;
> + signed char ROMMODEIDX661;

Isn't the correct fix to use u8 and s8 ?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH v1 3/5] treewide: use get_random_u32() when possible

2022-10-12 Thread David Laight
From: Joe Perches
> Sent: 12 October 2022 20:17
> 
> On Wed, 2022-10-05 at 23:48 +0200, Jason A. Donenfeld wrote:
> > The prandom_u32() function has been a deprecated inline wrapper around
> > get_random_u32() for several releases now, and compiles down to the
> > exact same code. Replace the deprecated wrapper with a direct call to
> > the real function.
> []
> > diff --git a/drivers/infiniband/hw/cxgb4/cm.c 
> > b/drivers/infiniband/hw/cxgb4/cm.c
> []
> > @@ -734,7 +734,7 @@ static int send_connect(struct c4iw_ep *ep)
> >>com.remote_addr;
> > int ret;
> > enum chip_type adapter_type = ep->com.dev->rdev.lldi.adapter_type;
> > -   u32 isn = (prandom_u32() & ~7UL) - 1;
> > +   u32 isn = (get_random_u32() & ~7UL) - 1;
> 
> trivia:
> 
> There are somewhat odd size mismatches here.
> 
> I had to think a tiny bit if random() returned a value from 0 to 7
> and was promoted to a 64 bit value then truncated to 32 bit.
> 
> Perhaps these would be clearer as ~7U and not ~7UL

That makes no difference - the compiler will generate the same code.

The real question is WTF is the code doing?
The '& ~7u' clears the bottom 3 bits.
The '- 1' then sets the bottom 3 bits and decrements the
(random) high bits.

So is the same as get_random_u32() | 7.
But I bet the coder had something else in mind.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH v4 4/6] treewide: use get_random_u32() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 19:01
> 
> The prandom_u32() function has been a deprecated inline wrapper around
> get_random_u32() for several releases now, and compiles down to the
> exact same code. Replace the deprecated wrapper with a direct call to
> the real function. The same also applies to get_random_int(), which is
> just a wrapper around get_random_u32().
> 
...
> diff --git a/net/802/garp.c b/net/802/garp.c
> index f6012f8e59f0..c1bb67e25430 100644
> --- a/net/802/garp.c
> +++ b/net/802/garp.c
> @@ -407,7 +407,7 @@ static void garp_join_timer_arm(struct garp_applicant 
> *app)
>  {
>   unsigned long delay;
> 
> - delay = (u64)msecs_to_jiffies(garp_join_time) * prandom_u32() >> 32;
> + delay = (u64)msecs_to_jiffies(garp_join_time) * get_random_u32() >> 32;
>   mod_timer(>join_timer, jiffies + delay);
>  }
> 
> diff --git a/net/802/mrp.c b/net/802/mrp.c
> index 35e04cc5390c..3e9fe9f5d9bf 100644
> --- a/net/802/mrp.c
> +++ b/net/802/mrp.c
> @@ -592,7 +592,7 @@ static void mrp_join_timer_arm(struct mrp_applicant *app)
>  {
>   unsigned long delay;
> 
> - delay = (u64)msecs_to_jiffies(mrp_join_time) * prandom_u32() >> 32;
> + delay = (u64)msecs_to_jiffies(mrp_join_time) * get_random_u32() >> 32;
>   mod_timer(>join_timer, jiffies + delay);
>  }
> 

Aren't those:
delay = prandom_u32_max(msecs_to_jiffies(xxx_join_time));

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v4 2/6] treewide: use prandom_u32_max() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 19:01
> 
> Rather than incurring a division or requesting too many random bytes for
> the given range, use the prandom_u32_max() function, which only takes
> the minimum required bytes from the RNG and avoids divisions.
> 
...
> --- a/lib/cmdline_kunit.c
> +++ b/lib/cmdline_kunit.c
> @@ -76,7 +76,7 @@ static void cmdline_test_lead_int(struct kunit *test)
>   int rc = cmdline_test_values[i];
>   int offset;
> 
> - sprintf(in, "%u%s", prandom_u32_max(256), str);
> + sprintf(in, "%u%s", get_random_int() % 256, str);
>   /* Only first '-' after the number will advance the pointer */
>   offset = strlen(in) - strlen(str) + !!(rc == 2);
>   cmdline_do_one_test(test, in, rc, offset);
> @@ -94,7 +94,7 @@ static void cmdline_test_tail_int(struct kunit *test)
>   int rc = strcmp(str, "") ? (strcmp(str, "-") ? 0 : 1) : 1;
>   int offset;
> 
> - sprintf(in, "%s%u", str, prandom_u32_max(256));
> + sprintf(in, "%s%u", str, get_random_int() % 256);
>   /*
>* Only first and leading '-' not followed by integer
>* will advance the pointer.

Something has gone backwards here
And get_random_u8() looks a better fit.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v3 3/5] treewide: use get_random_u32() when possible

2022-10-08 Thread David Laight
From: Jason A. Donenfeld
> Sent: 07 October 2022 18:56
...
> > Given these kinds of less mechanical changes, it may make sense to split
> > these from the "trivial" conversions in a treewide patch. The chance of
> > needing a revert from the simple 1:1 conversions is much lower than the
> > need to revert by-hand changes.
> >
> > The Cocci script I suggested in my v1 review gets 80% of the first
> > patch, for example.
> 
> I'll split things up into a mechanical step and a non-mechanical step.
> Good idea.

I'd also do something about the 'get_random_int() & 3' cases.
(ie remainder by 2^n-1)
These can be converted to 'get_random_u8() & 3' (etc).
So they only need one random byte (not 4) and no multiply.

Possibly something based on (the quickly typed, and not C):
#define get_random_below(val) [
if (builtin_constant(val))
BUILD_BUG_ON(!val || val > 0x1ull)
if (!(val & (val - 1)) {
if (val <= 0x100)
return get_random_u8() & (val - 1);
if (val <= 0x1)
return get_random_u16() & (val - 1);
return get_random_u32() & (val - 1);
}
}
BUILD_BUG_ON(sizeof (val) > 4);
return ((u64)get_random_u32() * val) >> 32;
}

get_random_below() is a much better name than prandom_u32_max().

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v3 3/5] treewide: use get_random_u32() when possible

2022-10-07 Thread David Laight
From: Christophe Leroy
> Sent: 06 October 2022 18:43
...
> But taking into account that sp must remain 16 bytes aligned, would it
> be better to do something like ?
> 
>   sp -= prandom_u32_max(PAGE_SIZE >> 4) << 4;

That makes me think...
If prandom_u32_max() is passed a (constant) power of 2 it doesn't
need to do the multiply, it can just do a shift right.

Doesn't it also always get a 32bit random value?
So actually get_random_u32() & PAGE_MASK & ~0xf is faster!

When PAGE_SIZE is 4k, PAGE_SIZE >> 4 is 256 so it could use:
get_ramdom_u8() << 4

You also seem to have removed prandom_u32() in favour of
get_random_u32() but have added more prandom_() functions.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] drm/amd/display: fix i386 frame size warning

2022-08-24 Thread David Laight
From: Hamza Mahfooz
> Sent: 18 August 2022 17:49
> 
> Addresses the following warning:
> drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3596:6:
>  error: stack frame
> size (2092) exceeds limit (2048) in 
> 'dml30_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-
> larger-than]
> void dml30_ModeSupportAndSystemConfigurationFull(struct display_mode_lib 
> *mode_lib)
>  ^
> 
> UseMinimumDCFCLK() is eating away at
> dml30_ModeSupportAndSystemConfigurationFull()'s stack space, so use a
> pointer to struct vba_vars_st instead of passing lots of large arrays
> as parameters by value.
> 
> Signed-off-by: Hamza Mahfooz 
> ---
>  .../dc/dml/dcn30/display_mode_vba_30.c| 295 --
>  1 file changed, 63 insertions(+), 232 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> index 876b321b30ca..b7fa003ffe06 100644
> --- a/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn30/display_mode_vba_30.c
> @@ -396,64 +396,10 @@ static void CalculateUrgentBurstFactor(
> 
>  static void UseMinimumDCFCLK(
>   struct display_mode_lib *mode_lib,
> - int MaxInterDCNTileRepeaters,
> + struct vba_vars_st *v,

You should probably add 'const' in there.
Thinks will likely break if v->xxx gets changed.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: mainline build failure for x86_64 allmodconfig with clang

2022-08-07 Thread David Laight
From: Arnd Bergmann
> Sent: 05 August 2022 20:32
...
> One thing to try would be to condense a function call like
> 
> dml32_CalculateWatermarksMALLUseAndDRAMSpeedChangeSupport(
> 
...
>  /* more arguments */);
> 
> into calling conventions that take a pointer to 'mode_lib->vba' and another
> one to 'v', so these are no longer passed on the stack individually.

Or, if it is only called once (I can't find the source)
force it to be inlined.

Or just shoot the software engineer who thinks 100 arguments
is sane. :-)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: mainline build failure for x86_64 allmodconfig with clang

2022-08-05 Thread David Laight
...
>  * NOTE:
>  *   This file is gcc-parsable HW gospel, coming straight from HW engineers.

I never trust hardware engineers to write code :-)
(Although at the moment they trust me to write VHDL...)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [Intel-gfx] [PATCH v2 01/21] drm/i915/gt: Ignore TLB invalidations on idle engines

2022-07-19 Thread David Laight
From: Tvrtko Ursulin
> Sent: 19 July 2022 08:25
...
> > It's not only the TLB flushes that cause grief.
> >
> > There is a loop that forces a write-back of all the frame buffer pages.
> > With a large display and some cpu (like my Ivy bridge one) that
> > takes long enough with pre-emption disabled that wakeup of RT processes
> > (and any pinned to the cpu) takes far longer than one might have
> > wished for.
> >
> > Since some X servers request a flush every few seconds this makes
> > the system unusable for some workloads.
> 
> Ok TLB invalidations as discussed in this patch does not apply to
> Ivybridge. But what is the write back loop you mention which is causing
> you grief? What size frame buffers are we talking about here? If they
> don't fit in the mappable area recently we merged a patch* which
> improves things in that situation but not sure you are hitting exactly that.

I found the old email:

What I've found is that the Intel i915 graphics driver uses the 'events_unbound'
kernel worker thread to periodically execute drm_cflush_sg().
(see https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/drm_cache.c)

I'm guessing this is to ensure that any writes to graphics memory become
visible is a semi-timely manner.

This loop takes about 1us per iteration split fairly evenly between whatever is 
in
for_each_sg_page() and drm_cflush_page().
With a 2560x1440 display the loop count is 3600 (4 bytes/pixel) and the whole
function takes around 3.3ms.

IIRC the first few page flushes are quick (I bet they go into a fifo)
and then they all get slow.
The flushes are actually requested from userspace.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [Intel-gfx] [PATCH v2 01/21] drm/i915/gt: Ignore TLB invalidations on idle engines

2022-07-18 Thread David Laight
From: Mauro Carvalho Chehab
> Sent: 18 July 2022 15:54
> 
> On Mon, 18 Jul 2022 14:16:10 +0100
> Tvrtko Ursulin  wrote:
> 
> > On 14/07/2022 13:06, Mauro Carvalho Chehab wrote:
> > > From: Chris Wilson 
> > >
> > > Check if the device is powered down prior to any engine activity,
> > > as, on such cases, all the TLBs were already invalidated, so an
> > > explicit TLB invalidation is not needed, thus reducing the
> > > performance regression impact due to it.
> > >
> > > This becomes more significant with GuC, as it can only do so when
> > > the connection to the GuC is awake.
> > >
> > > Cc: sta...@vger.kernel.org
> > > Fixes: 7938d61591d3 ("drm/i915: Flush TLBs before releasing backing 
> > > store")
> >
> > Patch itself looks fine but I don't think we closed on the issue of
> > stable/fixes on this patch?
> 
> No, because TLB cache invalidation takes time and causes time outs, which
> in turn affects applications and produce Kernel warnings.

It's not only the TLB flushes that cause grief.

There is a loop that forces a write-back of all the frame buffer pages.
With a large display and some cpu (like my Ivy bridge one) that
takes long enough with pre-emption disabled that wakeup of RT processes
(and any pinned to the cpu) takes far longer than one might have
wished for.

Since some X servers request a flush every few seconds this makes
the system unusable for some workloads.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH 2/2] procfs: Add 'path' to /proc//fdinfo/

2022-06-01 Thread David Laight
From: Kalesh Singh
> Sent: 31 May 2022 23:30
...
> > File paths can contain fun characters like newlines or colons, which
> > could make parsing out filenames in this text file... fun. How would your
> > userspace parsing logic handle "/home/stephen/filename\nsize:\t4096"? The
> > readlink(2) API makes that easy already.
> 
> I think since we have escaped the "\n" (seq_file_path(m, file, "\n")),
> then user space might parse this line like:
> 
> if (strncmp(line, "path:\t", 6) == 0)
> char* path = line + 6;

The real annoyance is other things doing scans of the filesystem
that accidentally 'bump into' strange names.

While anything serious probably gets it right how many times
Do you run 'find' to quickly search for something?

Spaces in filenames (popularised by some other os) are a PITA.
Not to mention leading and trailing spaces!
Anyone using filenames that only contain spaces does need shooting.

Deliberately adding non-printables isn't really a good idea.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH v2 1/2] module: update dependencies at try_module_get()

2022-05-01 Thread David Laight
From: Mauro Carvalho Chehab
> Sent: 30 April 2022 14:38
> 
> Em Sat, 30 Apr 2022 14:04:59 +0200
> Greg KH  escreveu:
> 
> > On Sat, Apr 30, 2022 at 11:30:58AM +0100, Mauro Carvalho Chehab wrote:
> 
> > Did you run checkpatch on this?  Please do :)
> >
> > > +
> > > + if (mod == this)
> > > + return 0;
> >
> > How can this happen?
> > When people mistakenly call try_module_get(THIS_MODULE)?
> 
> Yes. There are lots of place where this is happening:
> 
>   $ git grep try_module_get\(THIS_MODULE|wc -l
>   82
> 
> > We should
> > throw up a big warning when that happens anyway as that's always wrong.
> >
> > But that's a different issue from this change, sorry for the noise.
> 
> It sounds very weird to use try_module_get(THIS_MODULE).
> 
> We could add a WARN_ON() there - or something similar - but I would do it
> on a separate patch.

You could add a compile-time check.
But a run-time one seems unnecessary.
Clearly try_module_get(THIS_MODULE) usually succeeds.

I think I can invent a case where it can fail:
The module count must be zero, and a module unload in progress.
The thread doing the unload is blocked somewhere.
Another thread makes a callback into the module for some request
that (for instance) would need to create a kernel thread.
It tries to get a reference for the thread.
So try_module_get(THIS_MODULE) is the right call - and will fail here.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [RFC v3 7/8] binder: use __kernel_pid_t and __kernel_uid_t for userspace

2022-03-15 Thread David Laight
From: T.J. Mercier
> Sent: 14 March 2022 23:45
> 
> On Thu, Mar 10, 2022 at 11:33 AM Todd Kjos  wrote:
> >
> > On Wed, Mar 9, 2022 at 8:52 AM T.J. Mercier  wrote:
> > >
> > > The kernel interface should use types that the kernel defines instead of
> > > pid_t and uid_t, whose definiton is owned by libc. This fixes the header
> > > so that it can be included without first including sys/types.h.
> > >
> > > Signed-off-by: T.J. Mercier 
> > > ---
> > >  include/uapi/linux/android/binder.h | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/uapi/linux/android/binder.h 
> > > b/include/uapi/linux/android/binder.h
> > > index 169fd5069a1a..aa28454dbca3 100644
> > > --- a/include/uapi/linux/android/binder.h
> > > +++ b/include/uapi/linux/android/binder.h
> > > @@ -289,8 +289,8 @@ struct binder_transaction_data {
> > >
> > > /* General information about the transaction. */
> > > __u32   flags;
> > > -   pid_t   sender_pid;
> > > -   uid_t   sender_euid;
> > > +   __kernel_pid_t  sender_pid;
> > > +   __kernel_uid_t  sender_euid;
> >
> > Are we guaranteed that this does not affect the UAPI at all? Userspace
> > code using this definition will have to run with kernels using the old
> > definition and visa-versa.
> 
> A standards compliant userspace should be expecting a signed integer
> type here. So the only way I can think userspace would be affected is
> if:
> 1) pid_t is a long AND
> 2) sizeof(long) > sizeof(int) AND
> 3) Consumers of the pid_t definition actually attempt to mutate the
> result to make use of extra bits in the variable (which are not there)

Or the userspace headers have a 16bit pid_t.

I can't help feeling that uapi headers should only use explicit
fixed sized types.
There is no point indirecting the type names - the sizes still
can't be changes.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH RFC 0/3] MAP_POPULATE for device memory

2022-03-07 Thread David Laight
From: Christoph Hellwig
> Sent: 07 March 2022 15:57
> 
> On Mon, Mar 07, 2022 at 03:29:35PM +0200, Jarkko Sakkinen wrote:
> > So what would you suggest to sort out the issue? I'm happy to go with
> > ioctl if nothing else is acceptable.
> 
> PLenty of drivers treat all mmaps as if MAP_POPULATE was specified,
> typically by using (io_)remap_pfn_range.  If there any reason to only
> optionally have the pre-fault semantics for sgx?  If not this should
> be really simple.  And if we have a real need for it to be optional
> we'll just need to find a sane way to pass that information to ->mmap.

Is there any space in vma->vm_flags ?

That would be better than an extra argument or function.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 0/6] Remove usage of list iterator past the loop body

2022-03-07 Thread David Laight
From: Dan Carpenter
> Sent: 07 March 2022 15:01
> 
> Updating this API is risky because some places rely on the old behavior
> and not all of them have been updated.  Here are some additional places
> you might want to change.

I really can't help thinking that trying to merge this patch is
actually impossible.
It affects far too many different parts of the tree.

Since (I believe) this is a doubly linked list with forwards and
backwards pointers that point to a 'node' (not that there is a
nice comment to that effect in the header - and there are lots of
ways to do linked lists) the 'head' pretty much has to be a 'node'.

I'd write the following new defines (but I might be using
the old names here):

list_first(head, field) First item, NULL if empty.
list_last(head, field) Last item NULL if empty.
list_next(head, item, field) Item after 'item', NULL if last.
list_prev(head, item. field) Item before 'item', NULL if first.

You get (something like):
#define list_first(head, field) \
head->next ==  ? NULL : list_item(head->next, field)
(probably needs typeof(item) from somewhere).

The iterator loop is then just:
#define loop_iterate(item, head, field) \
for (item = list_first(head, field); item; \
item = list_next(head, item, field)

I'm not sure, but making the 'head' be a structure that contains
a single member that is a 'node' might help type checking.

Then all the code that uses the current defines can slowly be
moved over (probably a couple of releases) before the existing
defines are deleted.

That should simplify all the open-coded search loops that are
just as likely to be buggy (possibly more so).

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH RFC 0/3] MAP_POPULATE for device memory

2022-03-06 Thread David Laight
From: Jarkko Sakkinen
> Sent: 06 March 2022 05:32
> 
> For device memory (aka VM_IO | VM_PFNMAP) MAP_POPULATE does nothing. Allow
> to use that for initializing the device memory by providing a new callback
> f_ops->populate() for the purpose.
> 
> SGX patches are provided to show the callback in context.
> 
> An obvious alternative is a ioctl but it is less elegant and requires
> two syscalls (mmap + ioctl) per memory range, instead of just one
> (mmap).

Is this all about trying to stop the vm_operations_struct.fault()
function being called?

It is pretty easy to ensure the mappings are setup in the driver's
mmap() function.
Then the fault() function can just return -VM_FAULT_SIGBUS;

If it is actually device memory you just need to call vm_iomap_memory()
That quite nicely mmap()s PCIe memory space into a user process.

Mapping driver memory is slightly more difficult.
For buffers allocated with dma_alloc_coherent() you can
probably use dma_mmap_coherent().
But I have a loop calling remap_pfn_range() because the
buffer area is made of multiple 16kB kernel buffers that
need to be mapped to contiguous user pages.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-03 Thread David Laight
From: Xiaomeng Tong
> Sent: 03 March 2022 07:27
> 
> On Thu, 3 Mar 2022 04:58:23 +0000, David Laight wrote:
> > on 3 Mar 2022 10:27:29 +0800, Xiaomeng Tong wrote:
> > > The problem is the mis-use of iterator outside the loop on exit, and
> > > the iterator will be the HEAD's container_of pointer which pointers
> > > to a type-confused struct. Sidenote: The *mis-use* here refers to
> > > mistakely access to other members of the struct, instead of the
> > > list_head member which acutally is the valid HEAD.
> >
> > The problem is that the HEAD's container_of pointer should never
> > be calculated at all.
> > This is what is fundamentally broken about the current definition.
> 
> Yes, the rule is "the HEAD's container_of pointer should never be
> calculated at all outside the loop", but how do you make sure everyone
> follows this rule?
> Everyone makes mistakes, but we can eliminate them all from the beginning
> with the help of compiler which can catch such use-after-loop things.
> 
> > > IOW, you would dereference a (NULL + offset_of_member) address here.
> >
> >Where?
> 
> In the case where a developer do not follows the above rule, and mistakely
> access a non-list-head member of the HEAD's container_of pointer outside
> the loop. For example:
> struct req{
>   int a;
>   struct list_head h;
> }
> struct req *r;
> list_for_each_entry(r, HEAD, h) {
>   if (r->a == 0x10)
> break;
> }
> // the developer made a mistake: he didn't take this situation into
> // account where all entries in the list are *r->a != 0x10*, and now
> // the r is the HEAD's container_of pointer.
> r->a = 0x20;
> Thus the "r->a = 0x20" would dereference a (NULL + offset_of_member)
> address here.

That is just a bug.
No different to failing to check anything else might 'return'
a NULL pointer.
Because it is a NULL dereference you find out pretty quickly.
The existing loop leaves you with a valid pointer to something
that isn't a list item.

> > > Please remind me if i missed something, thanks.
> > >
> > > Can you share your "alternative definitions" details? thanks!
> >
> > The loop should probably use as extra variable that points
> > to the 'list node' in the next structure.
> > Something like:
> > for (xxx *iter = head->next;
> > iter ==  ? ((item = NULL),0) : ((item = 
> > list_item(iter),1));
> > iter = item->member->next) {
> >...
> > With a bit of casting you can use 'item' to hold 'iter'.
> 
> you still can not make sure everyone follows this rule:
> "do not use iterator outside the loop" without the help of compiler,
> because item is declared outside the loop.

That one has 'iter' defined in the loop.

> BTW, to avoid ambiguity,the "alternative definitions" here i asked is
> something from you in this context:
> "OTOH there may be alternative definitions that can be used to get
> the compiler (or other compiler-like tools) to detect broken code.
> Even if the definition can't possibly generate a working kerrnel."

I was thinking of something like:
if ((pos = list_first)), 1) pos = NULL else
so that unchecked dereferences after the loop will be detectable
as NULL pointer offsets - but that in itself isn't enough to avoid
other warnings.

> > > The "list_for_each_entry_inside(pos, type, head, member)" way makes
> > > the iterator invisiable outside the loop, and would be catched by
> > > compiler if use-after-loop things happened.
> 
> > It is also a compete PITA for anything doing a search.
> 
> You mean it would be a burden on search? can you show me some examples?

The whole business of having to save the pointer to the located item
before breaking the loop, remembering to have set it to NULL earlier etc.

It is so much better if you can just do:
if (found)
break;

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 03 March 2022 02:27
> 
> On Wed, 2 Mar 2022 14:04:06 +0000, David Laight
>  wrote:
> > I think that it would be better to make any alternate loop macro
> > just set the variable to NULL on the loop exit.
> > That is easier to code for and the compiler might be persuaded to
> > not redo the test.
> 
> No, that would lead to a NULL dereference.

Why, it would make it b ethe same as the 'easy to use':
for (item = head; item; item = item->next) {
...
if (...)
break;
...
}
if (!item)
return;
 
> The problem is the mis-use of iterator outside the loop on exit, and
> the iterator will be the HEAD's container_of pointer which pointers
> to a type-confused struct. Sidenote: The *mis-use* here refers to
> mistakely access to other members of the struct, instead of the
> list_head member which acutally is the valid HEAD.

The problem is that the HEAD's container_of pointer should never
be calculated at all.
This is what is fundamentally broken about the current definition.

> IOW, you would dereference a (NULL + offset_of_member) address here.

Where?

> Please remind me if i missed something, thanks.
>
> Can you share your "alternative definitions" details? thanks!

The loop should probably use as extra variable that points
to the 'list node' in the next structure.
Something like:
for (xxx *iter = head->next;
iter ==  ? ((item = NULL),0) : ((item = 
list_item(iter),1));
iter = item->member->next) {
   ...
With a bit of casting you can use 'item' to hold 'iter'.

> 
> > OTOH there may be alternative definitions that can be used to get
> > the compiler (or other compiler-like tools) to detect broken code.
> > Even if the definition can't possibly generate a working kerrnel.
> 
> The "list_for_each_entry_inside(pos, type, head, member)" way makes
> the iterator invisiable outside the loop, and would be catched by
> compiler if use-after-loop things happened.

It is also a compete PITA for anything doing a search.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-02 Thread David Laight
From: Xiaomeng Tong
> Sent: 02 March 2022 09:31
> 
> On Mon, 28 Feb 2022 16:41:04 -0800, Linus Torvalds
>  wrote:
> >
> > But basically to _me_, the important part is that the end result is
> > maintainable longer-term.
> 
> I couldn't agree more. And because of that, I stick with the following
> approach because it's maintainable longer-term than "type(pos) pos" one:
>  Implements a new macro for each list_for_each_entry* with _inside suffix.
>   #define list_for_each_entry_inside(pos, type, head, member)

I think that it would be better to make any alternate loop macro
just set the variable to NULL on the loop exit.
That is easier to code for and the compiler might be persuaded to
not redo the test.

It also doesn't need an extra variable defined in the for() statement
so can be back-ported to older kernels without required declaration
in the middle of blocks.

OTOH there may be alternative definitions that can be used to get
the compiler (or other compiler-like tools) to detect broken code.
Even if the definition can't possibly generate a working kerrnel.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread David Laight
From: Linus Torvalds
> Sent: 01 March 2022 23:03
> 
> On Tue, Mar 1, 2022 at 2:58 PM David Laight  wrote:
> >
> > Can it be resolved by making:
> > #define list_entry_is_head(pos, head, member) ((pos) == NULL)
> > and double-checking that it isn't used anywhere else (except in
> > the list macros themselves).
> 
> Well, yes, except for the fact that then the name is entirely misleading...
> 
> And somebody possibly uses it together with list_first_entry() etc, so
> it really is completely broken to mix that change with the list
> traversal change.

Probably true :-(

Actually adding list_entry_not_found() as a synonym for
list_entry_is_head() and changing the 25ish places that
use it after a loop might work.

Once that is done the loop can be changed at the same time
as list_entry_not_found().
That won't affect the in-tree callers.
(and my out of tree modules don't use those lists - so I
don't care about that!)

Having said that there are so few users of list_entry_is_head()
it is reasonable to generate two new names.
One for use after list_for_each_entry() and one for list_next_entry().
Then the change all the call sites.
After that list_entry_is_head() can be deleted - breaking out of
tree compiles.
Finally list_for_each_entry() can be rewritten to set NULL
at the end of the list.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-03-01 Thread David Laight
From: Linus Torvalds
> Sent: 01 March 2022 19:07
> On Mon, Feb 28, 2022 at 2:29 PM James Bottomley
>  wrote:
> >
> > However, if the desire is really to poison the loop variable then we
> > can do
> >
> > #define list_for_each_entry(pos, head, member)  \
> > for (pos = list_first_entry(head, typeof(*pos), member);\
> >  !list_entry_is_head(pos, head, member) && ((pos = NULL) == 
> > NULL;   \
> >  pos = list_next_entry(pos, member))
> >
> > Which would at least set pos to NULL when the loop completes.
> 
> That would actually have been excellent if we had done that
> originally. It would not only avoid the stale and incorrectly typed
> head entry left-over turd, it would also have made it very easy to
> test for "did I find an entry in the loop".
> 
> But I don't much like it in the situation we are now.
> 
> Why? Mainly because it basically changes the semantics of the loop
> _without_ any warnings about it.  And we don't actually get the
> advantage of the nicer semantics, because we can't actually make code
> do
> 
> list_for_each_entry(entry, ) {
> ..
> }
> if (!entry)
> return -ESRCH;
> .. use the entry we found ..
> 
> because that would be a disaster for back-porting, plus it would be a
> flag-day issue (ie we'd have to change the semantics of the loop at
> the same time we change every single user).
> 
> So instead of that simple "if (!entry)", we'd effectively have to
> continue to use something that still works with the old world order
> (ie that "if (list_entry_is_head())" model).
> 
> So we couldn't really take _advantage_ of the nicer semantics, and
> we'd not even get a warning if somebody does it wrong - the code would
> just silently do the wrong thing.
> 
> IOW: I don't think you are wrong about that patch: it would solve the
> problem that Jakob wants to solve, and it would have absolutely been
> much better if we had done this from the beginning. But I think that
> in our current situation, it's actually a really fragile solution to
> the "don't do that then" problem we have.

Can it be resolved by making:
#define list_entry_is_head(pos, head, member) ((pos) == NULL)
and double-checking that it isn't used anywhere else (except in
the list macros themselves).

The odd ones I just found are fs/locks.c mm/page_reporting.c
security/apparmor/apparmorfs.c (3 times)

net/xfrm/xfrm_ipcomp.c#L244 is buggy.
(There is a WARN_ON() then it just carries on regardless!)

There are only about 25 uses of list_entry_is_head().

There are a lot more places where these lists seem to be scanned by hand.
I bet a few of those aren't actually right either.

(Oh at 3am this morning I thought it was a different list type
that could have much the same problem!)

Another plausible solution is a variant of list_foreach_entry()
that does set the 'entry' to NULL at the end.
Then code can be moved over in stages.
I'd reorder the arguments as well as changing the name!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] devcoredump: increase the device delete timeout to 10 mins

2022-02-28 Thread David Laight
From: Abhinav Kumar
> Sent: 28 February 2022 21:38
...
> We also did some profiling around how much increasing the block size
> helps and here is the data:
> 
> Block sizecost
> 
> 4KB   229s
> 8KB86s

You must have an O(n^2) operation in there - find it.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread David Laight
From: Matthew Wilcox
> Sent: 28 February 2022 20:16
> 
> On Mon, Feb 28, 2022 at 12:10:24PM -0800, Linus Torvalds wrote:
> > We can do
> >
> > typeof(pos) pos
> >
> > in the 'for ()' loop, and never use __iter at all.
> >
> > That means that inside the for-loop, we use a _different_ 'pos' than 
> > outside.
> 
> Then we can never use -Wshadow ;-(  I'd love to be able to turn it on;
> it catches real bugs.
> 
> > +#define list_for_each_entry(pos, head, member) 
> > \
> > +   for (typeof(pos) pos = list_first_entry(head, typeof(*pos), member);
> > \
> > +!list_entry_is_head(pos, head, member);\
> >  pos = list_next_entry(pos, member))

Actually can't you use 'pos' to temporarily hold the address of 'member'.
Something like:
for (pos = (void *)head; \
pos ? ((pos = (void *)pos - offsetof(member)), 1) : 0; \
pos = (void *)pos->next)
So that 'pos' is NULL if the loop terminates.
No pointers outside structures are generated.
Probably need to kill list_entry_is_head() - or it just checks for NULL.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 2/6] treewide: remove using list iterator after loop body as a ptr

2022-02-28 Thread David Laight
From: Linus Torvalds
> Sent: 28 February 2022 19:56
> On Mon, Feb 28, 2022 at 4:19 AM Christian König
>  wrote:
> >
> > I don't think that using the extra variable makes the code in any way
> > more reliable or easier to read.
> 
> So I think the next step is to do the attached patch (which requires
> that "-std=gnu11" that was discussed in the original thread).
> 
> That will guarantee that the 'pos' parameter of list_for_each_entry()
> is only updated INSIDE the for_each_list_entry() loop, and can never
> point to the (wrongly typed) head entry.
> 
> And I would actually hope that it should actually cause compiler
> warnings about possibly uninitialized variables if people then use the
> 'pos' pointer outside the loop. Except
> 
>  (a) that code in sgx/encl.c currently initializes 'tmp' to NULL for
> inexplicable reasons - possibly because it already expected this
> behavior
> 
>  (b) when I remove that NULL initializer, I still don't get a warning,
> because we've disabled -Wno-maybe-uninitialized since it results in so
> many false positives.
> 
> Oh well.
> 
> Anyway, give this patch a look, and at least if it's expanded to do
> "(pos) = NULL" in the entry statement for the for-loop, it will avoid
> the HEAD type confusion that Jakob is working on. And I think in a
> cleaner way than the horrid games he plays.
> 
> (But it won't avoid possible CPU speculation of such type confusion.
> That, in my opinion, is a completely different issue)
> 
> I do wish we could actually poison the 'pos' value after the loop
> somehow - but clearly the "might be uninitialized" I was hoping for
> isn't the way to do it.
> 
> Anybody have any ideas?
> 
> Linus
diff --git a/include/linux/list.h b/include/linux/list.h
index dd6c2041d09c..bab995596aaa 100644
--- a/include/linux/list.h
+++ b/include/linux/list.h
@@ -634,10 +634,10 @@ static inline void list_splice_tail_init(struct list_head 
*list,
  * @head:  the head for your list.
  * @member:the name of the list_head within the struct.
  */
-#define list_for_each_entry(pos, head, member) \
-   for (pos = list_first_entry(head, typeof(*pos), member);\
-!list_entry_is_head(pos, head, member);\
-pos = list_next_entry(pos, member))
+#define list_for_each_entry(pos, head, member) 
\
+   for (typeof(pos) __iter = list_first_entry(head, typeof(*pos), member); 
\
+!list_entry_is_head(__iter, head, member) && (((pos)=__iter),1);   
\
+__iter = list_next_entry(__iter, member))
 
 /**
  * list_for_each_entry_reverse - iterate backwards over list of given type.

I think you actually want:
!list_entry_is_head(__iter, head, member) ? (((pos)=__iter),1) : 
(((pos) = NULL),0);

Which can be done in the original by:
!list_entry_is_head(pos, head, member) ? 1 : (((pos) = NULL), 0);

Although it would be safer to have (without looking up the actual name):
for (item *__item = head; \
__item ? (((pos) = list_item(__item, member)), 1) : (((pos) = 
NULL), 0);
__item = (pos)->member)

The local does need 'fixing' to avoid shadowing for nested loops.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH] drm/amdgpu: Fix realloc of ptr

2022-02-26 Thread David Laight
From: t...@redhat.com
> Sent: 26 February 2022 15:59
> 
> From: Tom Rix 
> 
> Clang static analysis reports this error
> amdgpu_debugfs.c:1690:9: warning: 1st function call
>   argument is an uninitialized value
>   tmp = krealloc_array(tmp, i + 1,
> ^~~
> 
> realloc will free tmp, so tmp can not be garbage.
> And the return needs to be checked.

Are you sure?
A quick check seems to show that krealloc() behaves the same
way as libc realloc() and the pointer isn't freed on failure.

David

> Fixes: 5ce5a584cb82 ("drm/amdgpu: add debugfs for reset registers list")
> Signed-off-by: Tom Rix 
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> index 9eb9b440bd438..159b97c0b4ebc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
> @@ -1676,7 +1676,7 @@ static ssize_t 
> amdgpu_reset_dump_register_list_write(struct file *f,
>  {
>   struct amdgpu_device *adev = (struct amdgpu_device 
> *)file_inode(f)->i_private;
>   char reg_offset[11];
> - uint32_t *tmp;
> + uint32_t *tmp = NULL;
>   int ret, i = 0, len = 0;
> 
>   do {
> @@ -1688,6 +1688,10 @@ static ssize_t 
> amdgpu_reset_dump_register_list_write(struct file *f,
>   }
> 
>   tmp = krealloc_array(tmp, i + 1, sizeof(uint32_t), GFP_KERNEL);
> + if (!tmp) {
> + ret = -ENOMEM;
> + goto error_free;
> + }
>   if (sscanf(reg_offset, "%X %n", [i], ) != 1) {
>   ret = -EINVAL;
>   goto error_free;
> --
> 2.26.3

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH v2] fbdev: fbmem: Fix the implicit type casting

2022-02-02 Thread David Laight
From: Yizhuo Zhai
> Sent: 01 February 2022 02:36
> 
> In function do_fb_ioctl(), the "arg" is the type of unsigned long,
> and in "case FBIOBLANK:" this argument is casted into an int before
> passig to fb_blank(). In fb_blank(), the comparision
> if (blank > FB_BLANK_POWERDOWN) would be bypass if the original
> "arg" is a large number, which is possible because it comes from
> the user input. Fix this by adding the check before the function
> call.

Doesn't this convert invalid values (> FB_BLANK_POWERDOWN)
that should generate errors into valid requests?

David

> 
> Signed-off-by: Yizhuo Zhai 
> ---
>  drivers/video/fbdev/core/fbmem.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/video/fbdev/core/fbmem.c 
> b/drivers/video/fbdev/core/fbmem.c
> index 0fa7ede94fa6..f08326efff54 100644
> --- a/drivers/video/fbdev/core/fbmem.c
> +++ b/drivers/video/fbdev/core/fbmem.c
> @@ -1162,6 +1162,8 @@ static long do_fb_ioctl(struct fb_info *info, unsigned 
> int cmd,
>   case FBIOBLANK:
>   console_lock();
>   lock_fb_info(info);
> + if (blank > FB_BLANK_POWERDOWN)
> + blank = FB_BLANK_POWERDOWN;
>   ret = fb_blank(info, arg);
>   /* might again call into fb_blank */
>   fbcon_fb_blanked(info, arg);
> --
> 2.25.1

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 0/3] lib/string_helpers: Add a few string helpers

2022-01-20 Thread David Laight
...
> Yeah, and I am sorry for bikeshedding. Honestly, I do not know what is
> better. This is why I do not want to block this series when others
> like this.
> 
> My main motivation is to point out that:
> 
> enabledisable(enable)
> 
> might be, for some people, more eye bleeding than
> 
> enable ? "enable" : "disable"

Indeed - you need to look the former up, wasting brain time.

> The problem is not that visible with yesno() and onoff(). But as you said,
> onoff() confliscts with variable names. And enabledisable() sucks.
> As a result, there is a non-trivial risk of two mass changes:
> 
> now:
> 
> - contition ? "yes" : "no"
> + yesno(condition)
> 
> a few moths later:
> 
> - yesno(condition)
> + str_yes_no(condition)

Followed by:
- str_yes_no(x)
+ no_yes_str(x)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation

2022-01-19 Thread David Laight
> > except '"no\0\0yes" + v * 4' works a bit better.
> 
> Is it a C code obfuscation contest?

That would be:
return &(v * 3)["no\0yes"];

:-)

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation

2022-01-19 Thread David Laight
> > > > +static inline const char *yesno(bool v) { return v ? "yes" : "no"; }
> 
>   return "yes\0no" + v * 4;
> 
> :-)

except '"no\0\0yes" + v * 4' works a bit better.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)


RE: [PATCH 1/3] lib/string_helpers: Consolidate yesno() implementation

2022-01-19 Thread David Laight
> > > +static inline const char *yesno(bool v) { return v ? "yes" : "no"; }

return "yes\0no" + v * 4;

:-)

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)



RE: [PATCH 02/11] x86: tboot: avoid Wstringop-overread-warning

2021-03-24 Thread David Laight
From: David Laight
> Sent: 24 March 2021 09:12
> 
> From: Martin Sebor
> > Sent: 22 March 2021 22:08
> ...
> > In GCC 11, all access warnings expect objects to be either declared
> > or allocated.  Pointers with constant values are taken to point to
> > nothing valid (as Arnd mentioned above, this is to detect invalid
> > accesses to members of structs at address zero).
> >
> > One possible solution to the known address problem is to extend GCC
> > attributes address and io that pin an object to a hardwired address
> > to all targets (at the moment they're supported on just one or two
> > targets).  I'm not sure this can still happen before GCC 11 releases
> > sometime in April or May.
> 
> A different solution is to define a normal C external data item
> and then assign a fixed address with an asm statement or in
> the linker script.

Or stop gcc tracking the value by using:
struct foo *foo = (void *)x;
asm ("", "+r" (foo));

If the address is used more than once forcing it into
a register is also likely to generate better code.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH 02/11] x86: tboot: avoid Wstringop-overread-warning

2021-03-24 Thread David Laight
From: Martin Sebor
> Sent: 22 March 2021 22:08
...
> In GCC 11, all access warnings expect objects to be either declared
> or allocated.  Pointers with constant values are taken to point to
> nothing valid (as Arnd mentioned above, this is to detect invalid
> accesses to members of structs at address zero).
> 
> One possible solution to the known address problem is to extend GCC
> attributes address and io that pin an object to a hardwired address
> to all targets (at the moment they're supported on just one or two
> targets).  I'm not sure this can still happen before GCC 11 releases
> sometime in April or May.

A different solution is to define a normal C external data item
and then assign a fixed address with an asm statement or in
the linker script.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: DMA-buf and uncached system memory

2021-02-15 Thread David Laight
From: Christian König
> Sent: 15 February 2021 12:05
...
> Snooping the CPU caches introduces some extra latency, so what can
> happen is that the response to the PCIe read comes to late for the
> scanout. The result is an underflow and flickering whenever something is
> in the cache which needs to be flushed first.

Aren't you going to get the same problem if any other endpoints are
doing memory reads?
Possibly even ones that don't require a cache snoop and flush.

What about just the cpu doing a real memory transfer?

Or a combination of the two above happening just before your request.

If you don't have a big enough fifo you'll lose.

I did 'fix' a similar(ish) issue with video DMA latency on an embedded
system based the on SA1100/SA1101 by significantly reducing the clock
to the VGA panel whenever the cpu was doing 'slow io'.
(Interleaving an uncached cpu DRAM write between the slow io cycles
also fixed it.)
But the video was the only DMA device and that was an embedded system.
Given the application note about video latency didn't mention what was
actually happening, I'm not sure how many people actually got it working!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH 02/13] module: add a module_loaded helper

2021-01-22 Thread David Laight
> 
> On Thu, Jan 21, 2021 at 11:00:20AM +0100, Christophe Leroy wrote:
> > > +bool module_loaded(const char *name);
> >
> > Maybe module_is_loaded() would be a better name.
> 
> Fine with me.

It does look like both callers aren't 'unsafe'.
But it is a bit strange returning a stale value.

It did make be look a bit more deeply at try_module_get().
For THIS_MODULE (eg to get an extra reference for a kthread)
I doubt it can ever fail.

But the other cases require a 'module' structure be passed in.
ISTM that unloading the module could invalidate the pointer
that has just been read from some other structure.

I'm guessing that some other lock must always be held.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [patch 14/30] drm/i915/pmu: Replace open coded kstat_irqs() copy

2020-12-14 Thread David Laight
From: Thomas Gleixner
> Sent: 11 December 2020 21:11
> 
> On Fri, Dec 11 2020 at 14:19, David Laight wrote:
> > From: Thomas Gleixner
> >> You can't catch that. If this really becomes an issue you need a
> >> sequence counter around it.
> >
> > Or just two copies of the high word.
> > Provided the accesses are sequenced:
> > writer:
> > load high:low
> > add small_value,high:low
> > store high
> > store low
> > store high_copy
> > reader:
> > load high_copy
> > load low
> > load high
> > if (high != high_copy)
> > low = 0;
> 
> And low = 0 is solving what? You need to loop back and retry until it's
> consistent and then it's nothing else than an open coded sequence count.

If it is a counter or timestamp then the high:0 value happened
some time between when you started trying to read the value and
when you finished trying to read it.

As such it is a perfectly reasonable return value.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [patch 14/30] drm/i915/pmu: Replace open coded kstat_irqs() copy

2020-12-14 Thread David Laight
From: Thomas Gleixner
> Sent: 11 December 2020 12:58
..
> > After my failed hasty sketch from last night I had a different one which
> > was kind of heuristics based (re-reading the upper dword and retrying if
> > it changed on 32-bit).
> 
> The problem is that there will be two seperate modifications for the low
> and high word. Several ways how the compiler can translate this, but the
> problem is the same for all of them:
> 
> CPU 0   CPU 1
> load low
> load high
> add  low, 1
> addc high, 0
> store low   load high
> --> NMI load low
> load high and compare
> store high
> 
> You can't catch that. If this really becomes an issue you need a
> sequence counter around it.

Or just two copies of the high word.
Provided the accesses are sequenced:
writer:
load high:low
add small_value,high:low
store high
store low
store high_copy
reader:
load high_copy
load low
load high
if (high != high_copy)
low = 0;

The read value is always stale, so it probably doesn't
matter that the value you have is one that is between the
value when you started and that when you finished.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-25 Thread David Laight
From: David Laight
> Sent: 20 November 2020 15:39
> 
> From: Thomas Zimmermann
> > Sent: 20 November 2020 13:42
> ...
> > I did a diff from v5.10-rc4 to drm-tip to look for suspicious changes.
> > Some candidates are
> >
> >8e3784dfef8a ("drm/ast: Reload gamma LUT after changing primary
> > plane's color format")
> 
> Ok, that one fixes the screen colours (etc).
> So 8e3784dfef8a was good and then HEAD^ was bad.
> 
> I might try to bisect the breakage.
> 
> The stack splat is entirely different.
> I'll try to bisect that on Linus's tree.

The good news is I'm not getting the stack splat on rc5.
I'm not sure I can be bothered to find out when :-)

Applying 8e3784dfef8a to rc5 by hand also fixes the display colours.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-21 Thread David Laight
From: Thomas Zimmermann
> Sent: 20 November 2020 13:42
...
> I did a diff from v5.10-rc4 to drm-tip to look for suspicious changes.
> Some candidates are
> 
>8e3784dfef8a ("drm/ast: Reload gamma LUT after changing primary
> plane's color format")

Ok, that one fixes the screen colours (etc).
So 8e3784dfef8a was good and then HEAD^ was bad.

I might try to bisect the breakage.

The stack splat is entirely different.
I'll try to bisect that on Linus's tree.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-21 Thread David Laight
From: Thomas Zimmermann
> Sent: 20 November 2020 10:14
...
> > Is there any way to bisect through the parts of the
> > drm merge patch into v5.10-rc1 ?
> >
> > That ought to be quicker (and less error prone) than
> > the bisect builds I was doing.
> >
> > Note that the stack 'splat' is due to a later change.
> > It is separate from the broken pixel alignment.
> >
> > I actually saw the vga text go 'funny' while the boot
> > was outputting all the [OK] messages (from systemd?)
> > before the graphic login stole tty1 (bloody stupid
> > to use tty1).
> >
> > I don't need to use the failing system today, I'll
> > have another go at isolating the failure.
> 
> You can use drm-tip for testing, where many of the DRM patches go through.
> 
>https://cgit.freedesktop.org/drm/drm-tip/
> 
> It's fairly up-to-date.

Any idea of tags either side of the 5.10 merge?

> I have two systems with AST chips and neither shows any of the symptoms
> you describe; nor do we have such reports about drivers that use a
> similar stack (hibmc, bochs). Could you provide the output of
> 
>dmesg | grep drm

[2.112303] fb0: switching to astdrmfb from EFI VGA
[2.120222] ast :02:00.0: [drm] Using P2A bridge for configuration
[2.120233] ast :02:00.0: [drm] AST 2400 detected
[2.120247] ast :02:00.0: [drm] Analog VGA only
[2.120257] ast :02:00.0: [drm] dram MCLK=408 Mhz type=1 bus_width=16
[2.121121] [drm] Initialized ast 0.1.0 20120228 for :02:00.0 on minor 0
[2.125838] fbcon: astdrmfb (fb0) is primary device
[2.152179] ast :02:00.0: [drm] fb0: astdrmfb frame buffer device
[6.061034] systemd[1]: Condition check resulted in Load Kernel Module drm 
being skipped.

The output is the same for both good and bad kernels.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-21 Thread David Laight
From: Thomas Zimmermann
> Sent: 20 November 2020 11:27
...
> >> You can use drm-tip for testing, where many of the DRM patches go through.
> >>
> >> https://cgit.freedesktop.org/drm/drm-tip/
> >>
> >> It's fairly up-to-date.
> >
> > Any idea of tags either side of the 5.10 merge?
> 
> The final commit before v5.9 appears to be
> 
>Fixes: 33c8256b3bcc ("drm/amd/display: Change ABM config init interface")
> 
> I'd try this as a good commit. For the bad commit, just try HEAD.

HEAD off that tree works.
Colours ok and no stack backtrace.

Ideas??

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-21 Thread David Laight
> Hi David
> 
> Am 18.11.20 um 23:01 schrieb David Laight:
...
> Did you try Daniel's suggestion of testing with the direct parent commit?
(I was on holiday yesterday and didn't want to spend a sunny
afternoon doing bisects.)

I've just done that and it is bad.

Is there any way to bisect through the parts of the
drm merge patch into v5.10-rc1 ?

That ought to be quicker (and less error prone) than
the bisect builds I was doing.

Note that the stack 'splat' is due to a later change.
It is separate from the broken pixel alignment.

I actually saw the vga text go 'funny' while the boot
was outputting all the [OK] messages (from systemd?)
before the graphic login stole tty1 (bloody stupid
to use tty1).

I don't need to use the failing system today, I'll
have another go at isolating the failure.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4; graphics alignment

2020-11-21 Thread David Laight
From: Thomas Zimmermann
> Sent: 20 November 2020 12:30
> 
> Am 20.11.20 um 12:45 schrieb David Laight:
> > From: Thomas Zimmermann
> >> Sent: 20 November 2020 11:27
> > ...
> >>>> You can use drm-tip for testing, where many of the DRM patches go 
> >>>> through.
> >>>>
> >>>>  https://cgit.freedesktop.org/drm/drm-tip/
> >>>>
> >>>> It's fairly up-to-date.
> >>>
> >>> Any idea of tags either side of the 5.10 merge?
> >>
> >> The final commit before v5.9 appears to be
> >>
> >> Fixes: 33c8256b3bcc ("drm/amd/display: Change ABM config init 
> >> interface")
> >>
> >> I'd try this as a good commit. For the bad commit, just try HEAD.
> >
> > HEAD off that tree works.
> > Colours ok and no stack backtrace.
> >
> > Ideas??
> 
> The good news is that it's been fixed. All you have to do is wait for
> the fix to hit upstream.
> 
> Did you try the patch that Dave linked? 

That patch makes no difference to my system.
The condition is false so it doesn't corrupt the flags.
(I printed the values to see.)

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4

2020-11-19 Thread David Laight
From: Dave Airlie
> Sent: 19 November 2020 01:16
> 
> On Thu, 19 Nov 2020 at 08:25, Dave Airlie  wrote:
> >
> > On Thu, 19 Nov 2020 at 08:15, Daniel Vetter  wrote:
> > >
> > > On Wed, Nov 18, 2020 at 11:01 PM David Laight  
> > > wrote:
> > > >
> > > > From: Thomas Zimmermann
> > > > > Sent: 18 November 2020 19:37
> > > > >
> > > > > Hi
> > > > >
> > > > > Am 18.11.20 um 19:10 schrieb Linus Torvalds:
> > > > > > On Wed, Nov 18, 2020 at 4:12 AM David Laight 
> > > > > >  wrote:
> > > > > >>
> > > > > >> I've got the 'splat' below during boot.
> > > > > >> This is an 8-core C2758 Atom cpu using the on-board/cpu graphics.
> > > > > >> User space is Ubuntu 20.04.
> > > > > >>
> > > > > >> Additionally the X display has all the colours and alignment 
> > > > > >> slightly
> > > > > >> messed up.
> > > > > >> 5.9.0 was ok.
> > > > > >> I'm just guessing the two issues are related.
> > > > > >
> > > > > > Sounds likely.  But it would be lovely if you could bisect when
> > > > > > exactly the problem(s) started to both verify that, and just to
> > > > > > pinpoint the exact change..
> > > >
> > > > I don't quite understand what 'git bisect' did.
> > > > I was bisecting between v5.9 and v5.10-rc1 but it suddenly started
> > > > generating v5.9.0-rc5+ kernels.
> > >
> > > We queue up patches for -rc1 way before the previous kernel is
> > > released, so this is normal.
> > >
> > > > The identified commit was 13a8f46d803 drm/ttm: move ghost object 
> > > > created.
> > > > (retyped - hope it is right).
> > > > But the diff to that last 'good' commit is massive.
> > >
> > > Yeah that's also normal for non-linear history. If you want to
> > > double-check, re-test the parent of that commit (which is 2ee476f77ffe
> > > ("drm/ttm: add a simple assign mem to bo wrapper")), which should
> > > work, and then the bad commit.
> > >
> > > Also is this the first bad commit for both the splat and the screen
> > > corruption issues?
> > >
> > > > So I don't know if that is anywhere near right.
> > >
> > > Thomas guessed it could be a ttm change, you hit one, and it looks
> > > like it could be the culprit. Now I guess it's up to Dave. Also adding
> > > Christian, in case he has an idea.
> >
> > I'd be mildly surprised if it's that commit, since it just refactors
> > what looks to me to be two identical code pieces into one instance
> > (within the scope of me screwing that up, but reading it I can't see
> > it).
> >
> > I'll dig into this today.
> 
> https://patchwork.freedesktop.org/patch/401559/
> 
> should fix it.

Nope, and probably not relevant.
pl_flags is 2 or 3 and it is testing for 4.

The oldest kernel doesn't generate the 'splat' either.
Just the f*cked up display output.

I'll put a screenshot (photo) into another email.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: Linux 5.10-rc4

2020-11-19 Thread David Laight
From: Thomas Zimmermann
> Sent: 18 November 2020 19:37
> 
> Hi
> 
> Am 18.11.20 um 19:10 schrieb Linus Torvalds:
> > On Wed, Nov 18, 2020 at 4:12 AM David Laight  
> > wrote:
> >>
> >> I've got the 'splat' below during boot.
> >> This is an 8-core C2758 Atom cpu using the on-board/cpu graphics.
> >> User space is Ubuntu 20.04.
> >>
> >> Additionally the X display has all the colours and alignment slightly
> >> messed up.
> >> 5.9.0 was ok.
> >> I'm just guessing the two issues are related.
> >
> > Sounds likely.  But it would be lovely if you could bisect when
> > exactly the problem(s) started to both verify that, and just to
> > pinpoint the exact change..

I don't quite understand what 'git bisect' did.
I was bisecting between v5.9 and v5.10-rc1 but it suddenly started
generating v5.9.0-rc5+ kernels.

The identified commit was 13a8f46d803 drm/ttm: move ghost object created.
(retyped - hope it is right).
But the diff to that last 'good' commit is massive.

So I don't know if that is anywhere near right.

David

> >
> > I'm adding Thomas Zimmermann to the cc, because he did that "drm/ast:
> > Program display mode in CRTC's atomic_enable" which looks relevant in
> > that it's right in that call-chain.
> >
> > Did some initialization perhaps get overlooked?
> >
> > And Dave and Daniel and the drm list cc'd as well..
> >
> > Full splat left quoted below for new people and list.
> >
> >  Linus
> >
> >> [   20.809891] WARNING: CPU: 0 PID: 973 at 
> >> drivers/gpu/drm/drm_gem_vram_helper.c:284
> drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
> 
> That line is at [1], which comes from
> 
>   46642a7d4d80 ("drm/vram-helper: don't use ttm bo->offset v4")
> 
> But the patch was merged in 5.9-rc1, so it's probably something else.
> 
> We've had a lot of TTM-related changes recently, so my best guess is
> that it's something in TTM with BO initialization.
> 
>  From some grepping, it looks like we have to call ttm_bo_mem_space() to
> fill mm_node (i.e., the pointer that causes the warning). But I cannot
> find where vram helpers do this. Maybe that's a good starting point.
> 
> I'm adding the TTM devs to cc.
> 
> Best regards
> Thomas
> 
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/gpu/drm/drm_gem_vram_h
> elper.c?h=v5.10-rc4#n284
> 
> 
> >> [   20.821543] Modules linked in: nls_iso8859_1 dm_multipath scsi_dh_rdac 
> >> scsi_dh_emc scsi_dh_alua
> ipmi_ssif intel_powerclamp coretemp kvm_intel kvm joydev input_leds ipmi_si 
> intel_cstate ipmi_devintf
> ipmi_msghandler mac_hid sch_fq_codel parport_pc ppdev lp parport ip_tables 
> x_tables autofs4 btrfs
> blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy 
> async_pq async_xor
> async_tx libcrc32c xor raid6_pq raid1 raid0 multipath linear ast 
> drm_vram_helper drm_kms_helper
> syscopyarea sysfillrect sysimgblt fb_sys_fops cec drm_ttm_helper ttm 
> crct10dif_pclmul crc32_pclmul
> ghash_clmulni_intel gpio_ich drm aesni_intel hid_generic glue_helper 
> crypto_simd igb usbhid cryptd
> ahci i2c_i801 hid libahci i2c_smbus lpc_ich dca i2c_ismt i2c_algo_bit
> >> [   20.887477] CPU: 0 PID: 973 Comm: gnome-shell Not tainted 5.10.0-rc4+ 
> >> #78
> >> [   20.894274] Hardware name: Supermicro A1SAi/A1SRi, BIOS 1.1a 08/27/2015
> >> [   20.900896] RIP: 0010:drm_gem_vram_offset+0x35/0x40 [drm_vram_helper]
> >> [   20.907342] Code: 00 48 89 e5 85 c0 74 17 48 83 bf 78 01 00 00 00 74 18 
> >> 48 8b 87 80 01 00 00 5d
> 48 c1 e0 0c c3 0f 0b 48 c7 c0 ed ff ff ff 5d c3 <0f> 0b 31 c0 5d c3 0f 1f 44 
> 00 00 0f 1f 44 00 00 55
> 48 8b 87 18 06
> >> [   20.926100] RSP: 0018:9f59811d3a68 EFLAGS: 00010246
> >> [   20.931339] RAX: 0002 RBX: 8b46861e20c0 RCX: 
> >> c032d600
> >> [   20.938479] RDX: 8b468f47a000 RSI: 8b46861e2000 RDI: 
> >> 8b468f9acc00
> >> [   20.945622] RBP: 9f59811d3a68 R08: 0040 R09: 
> >> 8b46864ce288
> >> [   20.952769] R10:  R11: 0001 R12: 
> >> 8b468f47a000
> >> [   20.959915] R13:  R14:  R15: 
> >> 8b468ad2bf00
> >> [   20.967057] FS:  7f5b37ac5cc0() GS:8b49efc0() 
> >> knlGS:
> >> [   20.975149] CS:  0010 DS:  ES:  CR0: 80050033
> >> [   20.980904] CR2: 7f5b3d093f00 CR3: 000103438000 CR4: 
> >> 0

RE: Linux 5.10-rc4

2020-11-19 Thread David Laight
From: Linus Torvalds
> Sent: 18 November 2020 18:11
> 
> On Wed, Nov 18, 2020 at 4:12 AM David Laight  wrote:
> >
> > I've got the 'splat' below during boot.
> > This is an 8-core C2758 Atom cpu using the on-board/cpu graphics.
> > User space is Ubuntu 20.04.
> >
> > Additionally the X display has all the colours and alignment slightly
> > messed up.
> > 5.9.0 was ok.
> > I'm just guessing the two issues are related.
> 
> Sounds likely.  But it would be lovely if you could bisect when
> exactly the problem(s) started to both verify that, and just to
> pinpoint the exact change..

I'm working on it - have been all afternoon.
(I'm on holiday and it is raining...)

5.10-rc1 fails, so it is something in the merge window.
I suspect I'll just hit the pull of the drm changes.
The bisect suddenly build a 5.9-rc5+ kernel!
So I'm retesting a good/bad pair with likely dates and will restart it.

Annoyingly the test system defaults to booting the highest version
kernel - not the one I've just build; I may have given it a wrong answer.
The builds also all take 20 minutes; so the bisect is slow.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] drm/amdgpu: do not initialise global variables to 0 or NULL

2020-11-03 Thread David Laight
From: Greg KH
> Sent: 02 November 2020 20:11
> 
> On Mon, Nov 02, 2020 at 02:43:45PM -0500, Alex Deucher wrote:
> > On Mon, Nov 2, 2020 at 1:42 PM Deepak R Varma  wrote:
> > >
> > > Initializing global variable to 0 or NULL is not necessary and should
> > > be avoided. Issue reported by checkpatch script as:
> > > ERROR: do not initialise globals to 0 (or NULL).
> >
> > I agree that this is technically correct, but a lot of people don't
> > seem to know that so we get a lot of comments about this code for the
> > variables that are not explicitly set.  Seems less confusing to
> > initialize them even if it not necessary.  I don't have a particularly
> > strong opinion on it however.
> 
> The kernel coding style is to do it this way.  You even save space and
> time by doing it as well :)

Uninitialised globals end up as 'named common' (variables that are
their own code section - from FORTRAN) until the final link puts
them into the .bss.
Globals initialised to 0 go into the .bss of the object file
being created.

So both end up in the final .bss.

If the code goes into a module you aren't allowed 'common' data
in a module to every global must be initialised.

I'm surprised checkpatch complains.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH 0/3] Prevent out-of-bounds access for built-in font data buffers

2020-09-25 Thread David Laight
> On Thu, Sep 24, 2020 at 09:38:22AM -0400, Peilin Ye wrote:
> > Hi all,
> >
> > syzbot has reported [1] a global out-of-bounds read issue in
> > fbcon_get_font(). A malicious user may resize `vc_font.height` to a large
> > value in vt_ioctl(), causing fbcon_get_font() to overflow our built-in
> > font data buffers, declared in lib/fonts/font_*.c:
...
> > (drivers/video/fbdev/core/fbcon.c)
> > if (font->width <= 8) {
> > j = vc->vc_font.height;
> > +   if (font->charcount * j > FNTSIZE(fontdata))
> > +   return -EINVAL;

Can that still go wrong because the multiply wraps?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [Linux-kernel-mentees] [PATCH v2] drm/bufs: Prevent kernel-infoleak in copy_one_buf()

2020-07-28 Thread David Laight
From: Peilin Ye
> Sent: 28 July 2020 12:52
> Currently `struct drm_buf_desc` is defined as follows:
> 
> struct drm_buf_desc {
>   int count;
>   int size;
>   int low_mark;
>   int high_mark;
>   enum {
>   _DRM_PAGE_ALIGN = 0x01,
>   _DRM_AGP_BUFFER = 0x02,
>   _DRM_SG_BUFFER = 0x04,
>   _DRM_FB_BUFFER = 0x08,
>   _DRM_PCI_BUFFER_RO = 0x10
>   } flags;
>   unsigned long agp_start;
> };
> 
> copy_one_buf() is potentially copying uninitialized kernel stack memory
> to userspace, since the compiler may leave such "holes" (around `.flags`
> and `.agp_start` fields) in this statically allocated structure. Prevent
> it by initializing `v` with memset().
> 
> Cc: sta...@vger.kernel.org
> Fixes: 5c7640ab6258 ("switch compat_drm_infobufs() to drm_ioctl_kernel()")
> Suggested-by: Dan Carpenter 
> Acked-by: Arnd Bergmann 
> Signed-off-by: Peilin Ye 
> ---
> Change in v2:
> - Improve commit description. (Suggested by Arnd Bergmann
>   )
> 
>  drivers/gpu/drm/drm_bufs.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_bufs.c b/drivers/gpu/drm/drm_bufs.c
> index a0735fbc144b..f99cd4a3f951 100644
> --- a/drivers/gpu/drm/drm_bufs.c
> +++ b/drivers/gpu/drm/drm_bufs.c
> @@ -1349,10 +1349,14 @@ static int copy_one_buf(void *data, int count, struct 
> drm_buf_entry *from)
>  {
>   struct drm_buf_info *request = data;
>   struct drm_buf_desc __user *to = >list[count];
> - struct drm_buf_desc v = {.count = from->buf_count,
> -  .size = from->buf_size,
> -  .low_mark = from->low_mark,
> -  .high_mark = from->high_mark};
> + struct drm_buf_desc v;
> +
> + memset(, 0, sizeof(v));
> +
> + v.count = from->buf_count;
> + v.size = from->buf_size;
> + v.low_mark = from->low_mark;
> + v.high_mark = from->high_mark;
> 
>   if (copy_to_user(to, , offsetof(struct drm_buf_desc, flags)))
>   return -EFAULT;

The memset() isn't needed.
The copy_to_user() stops after the 4 'int' values so no 'random'
kernel stack can get copied.

Quite why it is 'right' to leave the remaining part of each
userspace structure unchanged is another matter.

David.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] dmabuf: use spinlock to access dmabuf->name

2020-06-18 Thread David Laight
From: Charan Teja Kalla
> Sent: 17 June 2020 07:29
...
> >> If name is freed you will copy garbage, but the only way
> >> for that to happen is that _set_name or _release have to be called
> >> at just the right time.
> >>
> >> And the above would probably only be an issue if the set_name
> >> was called, so you will get NULL or a real name.
> 
> And there exists a use-after-free to avoid which requires the lock. Say
> that memcpy() in dmabuffs_dname is in progress and in parallel _set_name
> will free the same buffer that memcpy is operating on.

If the name is being looked at while the item is being freed
you almost certainly have much bigger problems that just
the name being a 'junk' pointer.

David.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


RE: [PATCH] drm/i915: check to see if SIMD registers are available before using SIMD

2020-05-05 Thread David Laight
From: Christoph Hellwig
> Sent: 04 May 2020 17:03
> 
> On Sun, May 03, 2020 at 09:20:19PM +0100, Chris Wilson wrote:
> > > Err, why does i915 implements its own uncached memcpy instead of relying
> > > on core functionality to start with?
> >
> > What is this core functionality that provides movntqda?
> 
> A sensible name might be memcpy_uncached or mempcy_nontemporal.
> But the important point is that this should be arch code with a common
> fallback rather than hacking it up in drivers.

More the point, you are trying to do a copy where:
1) The kernel isn't expected to read the data - so can bypass the cache.
and maybe:
2) The data needs flushing from the cache to actual memory.
and maybe:
3) The cache lines need invalidating.

The fallbacks depend on the required behaviour.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, 
UK
Registration No: 1397386 (Wales)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


  1   2   >