Re: [PATCH v2 1/2] lib/vsprintf.c: Simplify uuid_string()
Andy Shevchenko wrote (Sun, 05 Jun 2016 17:22:56 +0300):
> On Sat, 2016-06-04 at 01:14 -0400, George Spelvin wrote:
>> -		if (uc)
>> -			p = hex_byte_pack_upper(p, addr[index[i]]);
>> -		else
>> -			p = hex_byte_pack(p, addr[index[i]]);
>> +		u8 byte = addr[index[i]];
>> +
>> +		*p++ = hex[byte >> 4];
>> +		*p++ = hex[byte & 0x0f];

> And what prevents you from assigning hex_byte_pack()/hex_byte_pack_upper()
> and doing one call here?

Because they're inline functions, so there's no compiled copy to take
the address of.

Since they're declared "static", if you take their addresses anyway,
gcc will helpfully compile private versions for the use of this source
file, which will be bigger and slower than the simple inline expansion
I used here.

It's not even any *simpler*.  Remembering what "hex_byte_pack()" means
is as much mental effort as interpreting those two very simple lines.

There's no strong reason to *avoid* using the hex_asc[] arrays directly.
It's done in several other places in the kernel, including earlier in
lib/vsprintf.c (search for "hex_asc_upper" in number()).

If they were intended to be "off limits", they would have been given
_-prefixed names.
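
For readers following along, the two alternatives argued over above can be
put side by side in a small userspace sketch. This is not kernel code:
pack_lower() and pack_upper() are hypothetical stand-ins for
hex_byte_pack() and hex_byte_pack_upper(), hex_lower[] and hex_upper[]
stand in for hex_asc[] and hex_asc_upper[], and encode_via_fn() and
encode_via_table() are names invented purely for illustration. The first
function selects a helper through a function pointer (the suggestion in
the question), the second selects the lookup table once (what the patch
does).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel's hex_asc[] and hex_asc_upper[] tables. */
static const char hex_lower[] = "0123456789abcdef";
static const char hex_upper[] = "0123456789ABCDEF";

/* Hypothetical stand-in for hex_byte_pack(): writes two lowercase digits. */
static inline char *pack_lower(char *buf, unsigned char byte)
{
	*buf++ = hex_lower[byte >> 4];
	*buf++ = hex_lower[byte & 0x0f];
	return buf;
}

/* Hypothetical stand-in for hex_byte_pack_upper(). */
static inline char *pack_upper(char *buf, unsigned char byte)
{
	*buf++ = hex_upper[byte >> 4];
	*buf++ = hex_upper[byte & 0x0f];
	return buf;
}

/*
 * Variant from the question: pick a helper once, call it through a
 * pointer. Taking the address of a static inline function means the
 * compiler needs an addressable copy of it in this translation unit.
 */
static void encode_via_fn(char *out, const unsigned char *in, size_t n,
			  int upper)
{
	char *(*pack)(char *, unsigned char) = upper ? pack_upper : pack_lower;
	size_t i;

	assert(out && in);
	for (i = 0; i < n; i++)
		out = pack(out, in[i]);
	*out = '\0';
}

/* Variant from the patch: pick the table once, index it in the loop. */
static void encode_via_table(char *out, const unsigned char *in, size_t n,
			     int upper)
{
	const char *hex = upper ? hex_upper : hex_lower;
	size_t i;

	assert(out && in);
	for (i = 0; i < n; i++) {
		unsigned char byte = in[i];

		*out++ = hex[byte >> 4];
		*out++ = hex[byte & 0x0f];
	}
	*out = '\0';
}
```

Both functions expect an output buffer of at least 2 * n + 1 bytes and
produce identical strings; whether the function-pointer variant is really
bigger or slower depends on the compiler and flags, so compare the
generated code before relying on that.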
Re: [PATCH v2 1/2] lib/vsprintf.c: Simplify uuid_string()
On Sat, 2016-06-04 at 01:14 -0400, George Spelvin wrote:
> Rather than have a second pass to upcase the buffer, just make the
> hex lookup table a variable.
>
> Removing the conditional branch from the inner loop is also a
> speedup, but since this is not hot code, the important factor is
> that it shrinks both source and compiled forms:
>
>            Before   After   Delta   Percentage
> x86-32        245     199     -46       -18.8%
> x86-64        246     186     -60       -24.4%
> arm           292     264     -28        -9.6%
> thumb         220     160     -60       -27.3%
> arm64         296     244     -52       -17.6%
>
> Signed-off-by: George Spelvin
> ---
>  lib/vsprintf.c | 16 ++++++++--------
>  1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/lib/vsprintf.c b/lib/vsprintf.c
> index 7332a5d7..4ee07e89 100644
> --- a/lib/vsprintf.c
> +++ b/lib/vsprintf.c
> @@ -1316,24 +1316,24 @@ char *uuid_string(char *buf, char *end, const
> u8 *addr,
>  	char *p = uuid;
>  	int i;
>  	const u8 *index = uuid_be_index;
> -	bool uc = false;
> +	const char *hex = hex_asc;
>  
> -	switch (*(++fmt)) {
> +	switch (fmt[1]) {
>  	case 'L':
> -		uc = true;	/* fall-through */
> +		hex = hex_asc_upper;	/* fall-through */
>  	case 'l':
>  		index = uuid_le_index;
>  		break;
>  	case 'B':
> -		uc = true;
> +		hex = hex_asc_upper;
>  		break;
>  	}
>  
>  	for (i = 0; i < 16; i++) {
> -		if (uc)
> -			p = hex_byte_pack_upper(p, addr[index[i]]);
> -		else
> -			p = hex_byte_pack(p, addr[index[i]]);
> +		u8 byte = addr[index[i]];
> +
> +		*p++ = hex[byte >> 4];
> +		*p++ = hex[byte & 0x0f];

And what prevents you from assigning hex_byte_pack()/hex_byte_pack_upper()
and doing one call here?

>  		switch (i) {
>  		case 3:
>  		case 5:

-- 
Andy Shevchenko
Intel Finland Oy
[PATCH v2 1/2] lib/vsprintf.c: Simplify uuid_string()
Rather than have a second pass to upcase the buffer, just make the
hex lookup table a variable.

Removing the conditional branch from the inner loop is also a
speedup, but since this is not hot code, the important factor is
that it shrinks both source and compiled forms:

           Before   After   Delta   Percentage
x86-32        245     199     -46       -18.8%
x86-64        246     186     -60       -24.4%
arm           292     264     -28        -9.6%
thumb         220     160     -60       -27.3%
arm64         296     244     -52       -17.6%

Signed-off-by: George Spelvin
---
 lib/vsprintf.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7332a5d7..4ee07e89 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1316,24 +1316,24 @@ char *uuid_string(char *buf, char *end, const u8 *addr,
 	char *p = uuid;
 	int i;
 	const u8 *index = uuid_be_index;
-	bool uc = false;
+	const char *hex = hex_asc;
 
-	switch (*(++fmt)) {
+	switch (fmt[1]) {
 	case 'L':
-		uc = true;	/* fall-through */
+		hex = hex_asc_upper;	/* fall-through */
 	case 'l':
 		index = uuid_le_index;
 		break;
 	case 'B':
-		uc = true;
+		hex = hex_asc_upper;
 		break;
 	}
 
 	for (i = 0; i < 16; i++) {
-		if (uc)
-			p = hex_byte_pack_upper(p, addr[index[i]]);
-		else
-			p = hex_byte_pack(p, addr[index[i]]);
+		u8 byte = addr[index[i]];
+
+		*p++ = hex[byte >> 4];
+		*p++ = hex[byte & 0x0f];
 		switch (i) {
 		case 3:
 		case 5:
-- 
2.8.1
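
To try the patched logic outside the kernel, the hunk above can be
approximated by the following userspace sketch. Several things here are
assumptions rather than part of the patch: the function name
uuid_to_string() and its signature are invented, the contents of the two
index tables and the hyphens after bytes 7 and 9 are filled in from the
usual 8-4-4-4-12 UUID layout (the hunk only shows "case 3:" and
"case 5:"), and the result is written straight to the caller's buffer
instead of going through the kernel's string() helper.

```c
#include <assert.h>
#include <stdint.h>

#define UUID_STRING_LEN 36

/* Userspace copies of the lookup tables the patch selects between. */
static const char hex_asc[] = "0123456789abcdef";
static const char hex_asc_upper[] = "0123456789ABCDEF";

/* Byte orderings; values assumed, they are not shown in the patch. */
static const uint8_t uuid_be_index[16] = {
	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};
static const uint8_t uuid_le_index[16] = {
	3, 2, 1, 0, 5, 4, 7, 6, 8, 9, 10, 11, 12, 13, 14, 15
};

/*
 * Hypothetical helper: fmt points at the 'U' of "Ub", "UB", "Ul" or
 * "UL", so the modifier is read as fmt[1], as in the patched code.
 */
static void uuid_to_string(char out[UUID_STRING_LEN + 1],
			   const uint8_t addr[16], const char *fmt)
{
	char *p = out;
	const uint8_t *index = uuid_be_index;
	const char *hex = hex_asc;
	int i;

	assert(fmt && fmt[0] == 'U');

	switch (fmt[1]) {
	case 'L':
		hex = hex_asc_upper;
		/* fall through */
	case 'l':
		index = uuid_le_index;
		break;
	case 'B':
		hex = hex_asc_upper;
		break;
	}

	for (i = 0; i < 16; i++) {
		uint8_t byte = addr[index[i]];

		/* One table, chosen before the loop; no branch on case. */
		*p++ = hex[byte >> 4];
		*p++ = hex[byte & 0x0f];
		if (i == 3 || i == 5 || i == 7 || i == 9)
			*p++ = '-';
	}
	*p = '\0';
}
```

With input bytes 0x00 through 0x0f, "Ub" should give
00010203-0405-0607-0809-0a0b0c0d0e0f and "UL" should give
03020100-0504-0706-0809-0A0B0C0D0E0F, which exercises both the table
selection and the fall-through into the little-endian index.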