Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On Thu, 2020-03-26 at 08:26 -0500, Eric Blake wrote:
> On 3/25/20 9:09 PM, Hu, Robert wrote:
> > (Don't know why my Linux-Evolution missed this mail.)
> > > -----Original Message-----
> > > Long line; it's nice to wrap commit messages around column 70 or
> > > so (because reading 'git log' in an 80-column window adds
> > > indentation).
> > >
> >
> > [Hu, Robert]
> > I think I set my vim on wrap. This probably escaped by paste.
> > I ran checkpatch.pl on the patches before sending. It escaped check
> > but didn't escaped your eagle eye Thank you.
>
> checkpatch doesn't flag commit message long lines. Maybe it could be
> patched to do so, but it's not at the top of my list to write that
> patch.
>
> >
> > > > I just fix a boudary case on his original patch.
> > >
> > > boundary
> >
> > [Hu, Robert]
> > Emm... again spell error. Usually I would paste descriptions into
> > some editors with spell check, but forgot this time.
> > Vim doesn't have spell check I think. What editor would you suggest
> > me to integrate with git editing?
>
> I'm an emacs user, so I have no suggestions for vim, but I'd be very
> surprised if there were not some vim expert online that could figure
> out how to wire in a spell-checker to vim. Google quickly finds:
>
> https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/
>
nice, thanks:)
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 3/25/20 9:09 PM, Hu, Robert wrote:
> (Don't know why my Linux-Evolution missed this mail.)
> > -----Original Message-----
> > Long line; it's nice to wrap commit messages around column 70 or so
> > (because reading 'git log' in an 80-column window adds indentation).
> >
>
> [Hu, Robert]
> I think I set my vim on wrap. This probably escaped by paste.
> I ran checkpatch.pl on the patches before sending. It escaped check but
> didn't escaped your eagle eye Thank you.

checkpatch doesn't flag commit message long lines. Maybe it could be
patched to do so, but it's not at the top of my list to write that patch.

>
> > > I just fix a boudary case on his original patch.
> >
> > boundary
>
> [Hu, Robert]
> Emm... again spell error. Usually I would paste descriptions into some
> editors with spell check, but forgot this time.
> Vim doesn't have spell check I think. What editor would you suggest me
> to integrate with git editing?

I'm an emacs user, so I have no suggestions for vim, but I'd be very
surprised if there were not some vim expert online that could figure out
how to wire in a spell-checker to vim. Google quickly finds:

https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 26/03/20 03:09, Hu, Robert wrote:
> BTW, do I need to resend these 2 patches?

No, thanks! I have queued them.

Paolo
RE: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
(Don't know why my Linux-Evolution missed this mail.)
> -----Original Message-----
> From: Eric Blake
> Sent: Wednesday, March 25, 2020 20:54
> To: Robert Hoo; qemu-devel@nongnu.org;
> pbonz...@redhat.com; richard.hender...@linaro.org
> Cc: Hu, Robert
> Subject: Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
>
> On 3/25/20 1:50 AM, Robert Hoo wrote:
> > By increasing avx2 length_to_accel to 128, we can simplify its logic
> > and reduce a branch.
> >
> > The authorship of this patch actually belongs to Richard Henderson,
>
> Long line; it's nice to wrap commit messages around column 70 or so (because
> reading 'git log' in an 80-column window adds indentation).
>
[Hu, Robert]
I think I set my vim on wrap. This probably escaped by paste.
I ran checkpatch.pl on the patches before sending. It escaped check but didn't
escaped your eagle eye Thank you.

> > I just fix a boudary case on his original patch.
>
> boundary
[Hu, Robert]
Emm... again spell error. Usually I would paste descriptions into some editors
with spell check, but forgot this time.
Vim doesn't have spell check I think. What editor would you suggest me to
integrate with git editing?
BTW, do I need to resend these 2 patches?

> >
> > Suggested-by: Richard Henderson
> > Signed-off-by: Robert Hoo
> > ---
> >  util/bufferiszero.c | 26 +-
> >  1 file changed, 9 insertions(+), 17 deletions(-)
> >
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
On 3/25/20 1:50 AM, Robert Hoo wrote:
> By increasing avx2 length_to_accel to 128, we can simplify its logic and
> reduce a branch.
>
> The authorship of this patch actually belongs to Richard Henderson,

Long line; it's nice to wrap commit messages around column 70 or so
(because reading 'git log' in an 80-column window adds indentation).

> I just fix a boudary case on his original patch.

boundary

> Suggested-by: Richard Henderson
> Signed-off-by: Robert Hoo
> ---
>  util/bufferiszero.c | 26 +-
>  1 file changed, 9 insertions(+), 17 deletions(-)

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
[PATCH 2/2] util/bufferiszero: improve avx2 accelerator
By increasing avx2 length_to_accel to 128, we can simplify its logic and
reduce a branch.

The authorship of this patch actually belongs to Richard Henderson,
I just fix a boudary case on his original patch.

Suggested-by: Richard Henderson
Signed-off-by: Robert Hoo
---
 util/bufferiszero.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b801253..695bb4c 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128. */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128. */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    } ;
 
     /* Finish the last block of 128 unaligned. */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
--
1.8.3.1
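
The commit message says that raising length_to_accel to 128 removes a branch. A rough way to see why: once every buffer handed to the AVX2 routine is at least 128 bytes long, the four unaligned tail loads can never start before buf, so the old else branch and the last2 label are no longer needed. The sketch below is a portable scalar model of that control flow, not the QEMU code. It assumes the caller only dispatches buffers of at least length_to_accel bytes. The names block_is_zero, model_buffer_zero, BLOCK and MIN_LEN are invented for illustration, memcmp stands in for the 32-byte vector loads, and each block is tested immediately rather than accumulated in t as the real code does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK   32           /* bytes in one __m256i */
#define MIN_LEN (4 * BLOCK)  /* models length_to_accel = 128 */

/* True if the BLOCK bytes starting at p are all zero. Hypothetical helper
 * standing in for one 32-byte vector load. */
static bool block_is_zero(const unsigned char *p)
{
    static const unsigned char zero[BLOCK];
    return memcmp(p, zero, BLOCK) == 0;
}

/* Hypothetical scalar model of the simplified buffer_zero_avx2() flow. */
static bool model_buffer_zero(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    uintptr_t base = (uintptr_t)buf;
    /* Same address arithmetic as the patch, kept as integers. */
    uintptr_t p = (base + 5 * BLOCK) & ~(uintptr_t)(BLOCK - 1);
    uintptr_t e = (base + len) & ~(uintptr_t)(BLOCK - 1);

    /* Assumed caller-side guarantee that makes the tail loads safe. */
    assert(len >= MIN_LEN);

    /* Unaligned head of 32 bytes. */
    if (!block_is_zero(b)) {
        return false;
    }

    /* Aligned blocks of 128; runs zero times for short buffers. */
    while (p <= e) {
        for (int i = 4; i >= 1; i--) {
            if (!block_is_zero(b + (p - base) - i * BLOCK)) {
                return false;
            }
        }
        p += 4 * BLOCK;
    }

    /* Unaligned tail: the last 128 bytes. With len >= 128 the first of
     * these reads starts at or after buf, so no special case is needed. */
    for (int i = 4; i >= 1; i--) {
        if (!block_is_zero(b + len - i * BLOCK)) {
            return false;
        }
    }
    return true;
}
```

With len exactly 128 the loop may be skipped entirely, and the tail reads then start at buf itself, so the head and tail checks alone cover the whole buffer.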