Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-26 Thread Robert Hoo
On Thu, 2020-03-26 at 08:26 -0500, Eric Blake wrote:
> On 3/25/20 9:09 PM, Hu, Robert wrote:
> > (Don't know why my Linux-Evolution missed this mail.)
> > > -----Original Message-----
> > > Long line; it's nice to wrap commit messages around column 70 or
> > > so (because
> > > reading 'git log' in an 80-column window adds indentation).
> > > 
> > 
> > [Hu, Robert]
> > I think I set my vim to wrap; this line probably slipped through
> > when pasting.
> > I ran checkpatch.pl on the patches before sending. It escaped the
> > check but didn't escape your eagle eye. Thank you.
> 
> checkpatch doesn't flag commit message long lines.  Maybe it could
> be 
> patched to do so, but it's not at the top of my list to write that
> patch.
> 
> > 
> > > > I just fix a boudary case on his original patch.
> > > 
> > > boundary
> > 
> > [Hu, Robert]
> > Emm... again a spelling error. Usually I paste descriptions into
> > an editor with spell check, but I forgot this time.
> > Vim doesn't have spell check, I think. What editor would you
> > suggest I integrate with git for editing?
> 
> I'm an emacs user, so I have no suggestions for vim, but I'd be very 
> surprised if there were not some vim expert online that could figure
> out 
> how to wire in a spell-checker to vim.  Google quickly finds: 
> https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/
> 
nice, thanks:)




Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-26 Thread Eric Blake

On 3/25/20 9:09 PM, Hu, Robert wrote:

(Don't know why my Linux-Evolution missed this mail.)

-----Original Message-----



Long line; it's nice to wrap commit messages around column 70 or so (because
reading 'git log' in an 80-column window adds indentation).


[Hu, Robert]
I think I set my vim to wrap; this line probably slipped through when pasting.
I ran checkpatch.pl on the patches before sending. It escaped the check but
didn't escape your eagle eye. Thank you.


checkpatch doesn't flag commit message long lines.  Maybe it could be 
patched to do so, but it's not at the top of my list to write that patch.





I just fix a boudary case on his original patch.


boundary

[Hu, Robert]
Emm... again a spelling error. Usually I paste descriptions into an editor
with spell check, but I forgot this time.
Vim doesn't have spell check, I think. What editor would you suggest I
integrate with git for editing?


I'm an emacs user, so I have no suggestions for vim, but I'd be very 
surprised if there were not some vim expert online that could figure out 
how to wire in a spell-checker to vim.  Google quickly finds: 
https://www.ostechnix.com/use-spell-check-feature-vim-text-editor/


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-26 Thread Paolo Bonzini
On 26/03/20 03:09, Hu, Robert wrote:
> BTW, do I need to resend these 2 patches?

No, thanks!  I have queued them.

Paolo




RE: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-25 Thread Hu, Robert
(Don't know why my Linux-Evolution missed this mail.)
> -----Original Message-----
> From: Eric Blake 
> Sent: Wednesday, March 25, 2020 20:54
> To: Robert Hoo ; qemu-devel@nongnu.org;
> pbonz...@redhat.com; richard.hender...@linaro.org
> Cc: Hu, Robert 
> Subject: Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator
> 
> On 3/25/20 1:50 AM, Robert Hoo wrote:
> > By increasing avx2 length_to_accel to 128, we can simplify its logic
> > and reduce a branch.
> >
> > The authorship of this patch actually belongs to Richard Henderson,
> 
> Long line; it's nice to wrap commit messages around column 70 or so (because
> reading 'git log' in an 80-column window adds indentation).
> 
[Hu, Robert] 
I think I set my vim to wrap; this line probably slipped through when pasting.
I ran checkpatch.pl on the patches before sending. It escaped the check but
didn't escape your eagle eye. Thank you.

> > I just fix a boudary case on his original patch.
> 
> boundary
[Hu, Robert] 
Emm... again a spelling error. Usually I paste descriptions into an editor
with spell check, but I forgot this time.
Vim doesn't have spell check, I think. What editor would you suggest I
integrate with git for editing?

BTW, do I need to resend these 2 patches?
> 
> >
> > Suggested-by: Richard Henderson 
> > Signed-off-by: Robert Hoo 
> > ---
> >   util/bufferiszero.c | 26 +++++++++-----------------
> >   1 file changed, 9 insertions(+), 17 deletions(-)
> >
> 
> 
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3226
> Virtualization:  qemu.org | libvirt.org



Re: [PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-25 Thread Eric Blake

On 3/25/20 1:50 AM, Robert Hoo wrote:

By increasing avx2 length_to_accel to 128, we can simplify its logic and
reduce a branch.

The authorship of this patch actually belongs to Richard Henderson,


Long line; it's nice to wrap commit messages around column 70 or so 
(because reading 'git log' in an 80-column window adds indentation).



I just fix a boudary case on his original patch.


boundary



Suggested-by: Richard Henderson 
Signed-off-by: Robert Hoo 
---
  util/bufferiszero.c | 26 +++++++++-----------------
  1 file changed, 9 insertions(+), 17 deletions(-)




--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




[PATCH 2/2] util/bufferiszero: improve avx2 accelerator

2020-03-25 Thread Robert Hoo
By increasing avx2 length_to_accel to 128, we can simplify its logic and
reduce a branch.

The authorship of this patch actually belongs to Richard Henderson,
I just fix a boudary case on his original patch.

Suggested-by: Richard Henderson 
Signed-off-by: Robert Hoo 
---
 util/bufferiszero.c | 26 +++++++++-----------------
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/util/bufferiszero.c b/util/bufferiszero.c
index b801253..695bb4c 100644
--- a/util/bufferiszero.c
+++ b/util/bufferiszero.c
@@ -158,27 +158,19 @@ buffer_zero_avx2(const void *buf, size_t len)
     __m256i *p = (__m256i *)(((uintptr_t)buf + 5 * 32) & -32);
     __m256i *e = (__m256i *)(((uintptr_t)buf + len) & -32);
 
-    if (likely(p <= e)) {
-        /* Loop over 32-byte aligned blocks of 128.  */
-        do {
-            __builtin_prefetch(p);
-            if (unlikely(!_mm256_testz_si256(t, t))) {
-                return false;
-            }
-            t = p[-4] | p[-3] | p[-2] | p[-1];
-            p += 4;
-        } while (p <= e);
-    } else {
-        t |= _mm256_loadu_si256(buf + 32);
-        if (len <= 128) {
-            goto last2;
+    /* Loop over 32-byte aligned blocks of 128.  */
+    while (p <= e) {
+        __builtin_prefetch(p);
+        if (unlikely(!_mm256_testz_si256(t, t))) {
+            return false;
         }
-    }
+        t = p[-4] | p[-3] | p[-2] | p[-1];
+        p += 4;
+    } ;
 
     /* Finish the last block of 128 unaligned.  */
     t |= _mm256_loadu_si256(buf + len - 4 * 32);
     t |= _mm256_loadu_si256(buf + len - 3 * 32);
- last2:
     t |= _mm256_loadu_si256(buf + len - 2 * 32);
     t |= _mm256_loadu_si256(buf + len - 1 * 32);
 
@@ -263,7 +255,7 @@ static void init_accel(unsigned cache)
     }
     if (cache & CACHE_AVX2) {
         fn = buffer_zero_avx2;
-        length_to_accel = 64;
+        length_to_accel = 128;
     }
 #endif
 #ifdef CONFIG_AVX512F_OPT
-- 
1.8.3.1
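
For readers following the thread, the control flow this patch arrives at
can be modelled without AVX2 intrinsics. The sketch below is only an
illustration, not the code in util/bufferiszero.c: buffer_zero_sketch,
load_chunk and CHUNK are hypothetical names, and each 256-bit load is
replaced by a memcpy of 32 bytes that is OR-reduced to a 64-bit word. It
assumes, as the patch does by raising length_to_accel to 128, that the
caller never passes fewer than 128 bytes. That guarantee is what should
make the four unaligned tail loads safe without the old else branch and
the "goto last2".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK 32 /* bytes per model "vector" (one 256-bit register) */

/* OR the 32 bytes at p into one word; nonzero iff any byte is nonzero. */
static uint64_t load_chunk(const unsigned char *p)
{
    uint64_t w[CHUNK / sizeof(uint64_t)];

    memcpy(w, p, CHUNK); /* well-defined even when p is unaligned */
    return w[0] | w[1] | w[2] | w[3];
}

/* Hypothetical scalar model of the simplified loop; needs len >= 128. */
static bool buffer_zero_sketch(const void *buf, size_t len)
{
    const unsigned char *b = buf;
    const uintptr_t addr = (uintptr_t)b;
    const uintptr_t mask = ~(uintptr_t)(CHUNK - 1);
    /* Offsets from b of the 32-byte boundaries the patch calls p and e. */
    size_t p = (size_t)(((addr + 5 * CHUNK) & mask) - addr);
    size_t e = (size_t)(((addr + len) & mask) - addr);
    uint64_t t;

    assert(len >= 4 * CHUNK); /* the caller-side length_to_accel guarantee */

    /* Begin with an unaligned head of 32 bytes. */
    t = load_chunk(b);

    /* Loop over blocks of 128 bytes that end on a 32-byte boundary. */
    while (p <= e) {
        if (t != 0) {
            return false;
        }
        t = load_chunk(b + p - 4 * CHUNK) | load_chunk(b + p - 3 * CHUNK)
          | load_chunk(b + p - 2 * CHUNK) | load_chunk(b + p - 1 * CHUNK);
        p += 4 * CHUNK;
    }

    /* Finish the last block of 128 unaligned; in bounds since len >= 128. */
    t |= load_chunk(b + len - 4 * CHUNK);
    t |= load_chunk(b + len - 3 * CHUNK);
    t |= load_chunk(b + len - 2 * CHUNK);
    t |= load_chunk(b + len - 1 * CHUNK);

    return t == 0;
}
```

Called on an all-zero buffer of 128 bytes or more, the sketch returns
true; setting any single byte inside the range makes it return false.
When the loop body never runs (only possible for lengths below 160), the
head load and the four tail loads together still cover the whole buffer.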