On 25.03.2013 at 15:34, Paolo Bonzini pbonz...@redhat.com wrote:
Hmm, right. What about just processing the first few longs twice, i.e.
the above followed by for (i = 0; i < len / sizeof(VECTYPE); i
+= BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR)?
I tested this version as v3:
size_t
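A minimal sketch of the idea quoted above: check the first few words with plain scalar accesses, then run the unrolled loop from index 0 again, so those words are simply processed twice. This is not the v3 patch itself. The names find_nonzero_offset_sketch, chunk_t, UNROLL_FACTOR and PRECHECK_WORDS are hypothetical stand-ins, and uint64_t replaces the real vector type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for VECTYPE and
 * BUFFER_FIND_NONZERO_OFFSET_UNROLL_FACTOR from the thread. */
typedef uint64_t chunk_t;
#define UNROLL_FACTOR 8
#define PRECHECK_WORDS 4

/* Returns a byte offset at or before the first nonzero byte, or len if
 * the buffer is all zero. Assumes buf is aligned for chunk_t and len is
 * a multiple of UNROLL_FACTOR * sizeof(chunk_t). */
size_t find_nonzero_offset_sketch(const void *buf, size_t len)
{
    const chunk_t *p = buf;
    size_t n = len / sizeof(chunk_t);
    size_t i, j;

    assert(len % (UNROLL_FACTOR * sizeof(chunk_t)) == 0);

    /* Cheap scalar pre-check of the first few words. */
    for (i = 0; i < PRECHECK_WORDS && i < n; i++) {
        if (p[i]) {
            return i * sizeof(chunk_t);
        }
    }

    /* Main loop starts at 0 again, re-reading the words checked above. */
    for (i = 0; i < n; i += UNROLL_FACTOR) {
        chunk_t acc = 0;
        for (j = 0; j < UNROLL_FACTOR; j++) {
            acc |= p[i + j];
        }
        if (acc) {
            break;
        }
    }
    return i * sizeof(chunk_t);
}
```

Restarting the main loop at 0 keeps its bounds simple at the cost of re-reading a few words that are already known to be zero.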
On 26/03/2013 09:14, Peter Lieven wrote:
If no one objects I would use is_zero_page_2 and continue with v5 of
the patch set, as I am out of office for the next 8 days from tomorrow. I
prefer v3 as it has better performance if the non-zeroness is within
the 8*sizeof(VECTYPE) bytes and not in the
On 22.03.2013 22:24, Paolo Bonzini wrote:
On 22/03/2013 20:20, Peter Lieven wrote:
I think patch 4 is a bit overengineered. I would prefer the simple
patch you had using three/four non-vectorized accesses. The setup cost
of the vectorized buffer_is_zero is quite high, and 64 bits are
ubuntu 12.04 LTS 64-bit desktop with 1G memory shortly after boot:
histogram: 31.7% 32.9% [...] 36.4% 100.0%
---
opensuse 11.1 64-bit with 24GB ram (busy server)
histogram: 97.5% 97.9% [...] 99.5% 100.0%
---
windows server 2008 R2 with 8G ram running for 3 days:
histogram: 20.9%
On 25.03.2013 at 11:53, Paolo Bonzini pbonz...@redhat.com wrote:
ubuntu 12.04 LTS 64-bit desktop with 1G memory shortly after boot:
histogram: 31.7% 32.9% [...] 36.4% 100.0%
---
opensuse 11.1 64-bit with 24GB ram (busy server)
histogram: 97.5% 97.9% [...] 99.5% 100.0%
---
windows
Maybe I should have explained the output in more detail. The percentages
are cumulative. 35.8% in the second-to-last column means that
35.8% have a return value that is less than TARGET_PAGE_SIZE.
This was meant to illustrate how many 64-bit chunks you have
to look at to grab a certain percentage of
On 25.03.2013 at 14:02, Paolo Bonzini pbonz...@redhat.com wrote:
Maybe I should have explained the output in more detail. The percentages
are cumulative. 35.8% in the second-to-last column means that
35.8% have a return value that is less than TARGET_PAGE_SIZE.
This was meant to illustrate how
On 25.03.2013 at 14:23, Peter Lieven p...@kamp.de wrote:
On 25.03.2013 at 14:02, Paolo Bonzini pbonz...@redhat.com wrote:
Maybe I should have explained the output in more detail. The percentages
are cumulative. 35.8% in the second-to-last column means that
35.8% have a return value that is less
On 25/03/2013 14:32, Peter Lieven wrote:
On 25.03.2013 at 14:23, Peter Lieven p...@kamp.de wrote:
On 25.03.2013 at 14:02, Paolo Bonzini pbonz...@redhat.com wrote:
Maybe I should have explained the output in more detail. The percentages
are cumulative. 35.8% in the second-to-last column
On 25.03.2013 at 15:34, Paolo Bonzini pbonz...@redhat.com wrote:
On 25/03/2013 14:32, Peter Lieven wrote:
On 25.03.2013 at 14:23, Peter Lieven p...@kamp.de wrote:
On 25.03.2013 at 14:02, Paolo Bonzini pbonz...@redhat.com wrote:
Maybe I should have explained the output in more
On 22.03.2013 at 22:24, Paolo Bonzini pbonz...@redhat.com wrote:
On 22/03/2013 20:20, Peter Lieven wrote:
I think patch 4 is a bit overengineered. I would prefer the simple
patch you had using three/four non-vectorized accesses. The setup cost
of the vectorized buffer_is_zero is
this is v4 of my patch series with various optimizations in
zero buffer checking and migration tweaks.
thanks especially to Eric Blake for reviewing.
v4:
- do not inline buffer_find_nonzero_offset()
- inline can_use_buffer_find_nonzero_offset() correctly
- re-add asserts in
On 22/03/2013 13:46, Peter Lieven wrote:
this is v4 of my patch series with various optimizations in
zero buffer checking and migration tweaks.
thanks especially to Eric Blake for reviewing.
v4:
- do not inline buffer_find_nonzero_offset()
- inline
On 22.03.2013 18:25, Paolo Bonzini wrote:
On 22/03/2013 13:46, Peter Lieven wrote:
this is v4 of my patch series with various optimizations in
zero buffer checking and migration tweaks.
thanks especially to Eric Blake for reviewing.
v4:
- do not inline buffer_find_nonzero_offset()
On 22/03/2013 20:20, Peter Lieven wrote:
I think patch 4 is a bit overengineered. I would prefer the simple
patch you had using three/four non-vectorized accesses. The setup cost
of the vectorized buffer_is_zero is quite high, and 64 bits are just
256k RAM; if the host doesn't touch