On 05/30/2014 01:07 PM, Tim Chen wrote:
On Fri, 2014-05-30 at 12:38 -0700, Dirk Brandewie wrote:
Dirk,
Thanks for checking things out.
I tested on a Haswell system, and I see that the frequency
can dip below the max even when I set the min_perf_pct to 100.
Let me know if you want to log on
On Fri, 2014-05-30 at 12:38 -0700, Dirk Brandewie wrote:
> > Dirk,
> >
> > Thanks for checking things out.
> >
> > I tested on a Haswell system, and I see that the frequency
> > can dip below the max even when I set the min_perf_pct to 100.
> > Let me know if you want to log on to my system and
On 05/30/2014 12:32 PM, Tim Chen wrote:
On Fri, 2014-05-30 at 11:45 -0700, Dirk Brandewie wrote:
With turbostat from rc7.
[root@echolake turbostat]# ./turbostat
Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 CPU%c3
CPU%c6 CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6
On Fri, 2014-05-30 at 11:45 -0700, Dirk Brandewie wrote:
>
> With turbostat from rc7.
> [root@echolake turbostat]# ./turbostat
> Core CPU Avg_MHz %Busy Bzy_MHz TSC_MHz SMI CPU%c1 CPU%c3
> CPU%c6 CPU%c7 CoreTmp PkgTmp Pkg%pc2 Pkg%pc3 Pkg%pc6 Pkg%pc7 PkgWatt
> CorWatt GFXWatt
On 05/30/2014 10:56 AM, Tim Chen wrote:
> On Thu, 2014-05-29 at 21:16 -0400, Dave Jones wrote:
>> On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
>> > On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
>> > > Sorry for the delay; my Ivy Bridge test machine isn't in my
>> > >
On Thu, 2014-05-29 at 21:16 -0400, Dave Jones wrote:
> On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
> > On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
> > > Sorry for the delay; my Ivy Bridge test machine isn't in my
> > > office and getting to the console to tweak the
On Fri, 2014-05-30 at 12:52 -0400, George Spelvin wrote:
> > That's very small (less than 0.2%) so I think it's acceptable.
>
> Thank you! May I take this as an Acked-by: ?
Yes, with the caveat that you still have a v3 of this patch
that reorganizes the K table to rodata.
Tim
>
> I'll work on
> That's very small (less than 0.2%) so I think it's acceptable.
Thank you! May I take this as an Acked-by: ?
I'll work on some performance improvements, but they probably
won't be ready for the 3.16 merge window.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the
On Fri, 2014-05-30 at 01:25 -0400, George Spelvin wrote:
>
> Averaging the 8K bytes per update, I do see an average of 3.2 cycles per
> operation (that is, per 8K of data processed) lost, or about 1 cycle per
> (3K or less) block processed. I'm hoping the reduced D-cache pollution
> makes it up
On Fri, 2014-05-30 at 12:52 -0400, George Spelvin wrote:
That's very small (less than 0.2%) so I think it's acceptable.
Thank you! May I take this as an Acked-by: ?
Yes, with the caveat that you still have a v3 of this patch
that reorganizes the K table to rodata.
Tim
I'll work on some
On Thu, 2014-05-29 at 21:16 -0400, Dave Jones wrote:
On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
Sorry for the delay; my Ivy Bridge test machine isn't in my
office and getting to the console to tweak the BIOS is a
On 05/30/2014 10:56 AM, Tim Chen wrote:
On Thu, 2014-05-29 at 21:16 -0400, Dave Jones wrote:
On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
Sorry for the delay; my Ivy Bridge test machine isn't in my
office and
On Fri, 2014-05-30 at 12:38 -0700, Dirk Brandewie wrote:
Dirk,
Thanks for checking things out.
I tested on a Haswell system, and I see that the frequency
can dip below the max even when I set the min_perf_pct to 100.
Let me know if you want to log on to my system and check if
Okay, recompiled with the acpi-cpufreq driver, so the performance governor
actually works, pegging the frequency at 3900 MHz.
Existing (old) code:
[ 455.641397]
[ 455.641397] testing speed of crc32c
[ 455.641403] test 0 ( 16 byte blocks, 16 bytes per update, 1 updates):
73
> This is odd. On my Ivy Bridge system the CPU speed from /proc/cpuinfo
> is at max freq once I set the performance governor.
> The numbers above almost look like
> the cpu frequency is fluctuating and an average is taken.
> What version of the kernel are you running? Is
>
On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
> On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
> > Sorry for the delay; my Ivy Bridge test machine isn't in my
> > office and getting to the console to tweak the BIOS is a
> > bit of a bother.
> >
> > Anyway, i7-4930K,
On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
> Sorry for the delay; my Ivy Bridge test machine isn't in my
> office and getting to the console to tweak the BIOS is a
> bit of a bother.
>
> Anyway, i7-4930K, turbo boost & hyperthreading disabled,
> $ cat
Sorry for the delay; my Ivy Bridge test machine isn't in my
office and getting to the console to tweak the BIOS is a
bit of a bother.
Anyway, i7-4930K, turbo boost & hyperthreading disabled,
$ cat /sys/devices/system/cpu/cpu?/cpufreq/scaling_governor
performance
performance
performance
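The governor check quoted above can also be scripted. What follows is only a minimal sketch, not anything posted in the thread: the function names read_governors and all_performance are hypothetical, and the sketch assumes the usual sysfs layout of cpuN/cpufreq/scaling_governor under /sys/devices/system/cpu.

```python
import glob
import os


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each cpuN directory name to its scaling governor string.

    cpu_root is a parameter so the function can be pointed at a test
    directory; the default is the standard sysfs location.
    """
    pattern = os.path.join(cpu_root, "cpu[0-9]*", "cpufreq", "scaling_governor")
    governors = {}
    for path in sorted(glob.glob(pattern)):
        cpu = path.split(os.sep)[-3]  # .../cpuN/cpufreq/scaling_governor
        with open(path) as f:
            governors[cpu] = f.read().strip()
    return governors


def all_performance(governors):
    """True only if at least one CPU was found and all use 'performance'."""
    return bool(governors) and all(g == "performance" for g in governors.values())
```

Calling all_performance(read_governors()) before a benchmark run gives a quick sanity check, although, as the thread shows, a governor setting alone does not prove the frequency is actually pinned.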
>>> "George Spelvin" 05/28/14 11:47 PM >>>
>Jan Beulich wrote:
>> "George Spelvin" 05/28/14 4:40 PM
>>> Jan: Is support for SLE10's pre-2.18 binutils still required?
>>> Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
>
>> I'd much appreciate if I would be able to build
George Spelvin li...@horizon.com 05/28/14 11:47 PM
Jan Beulich jbeul...@suse.com wrote:
George Spelvin li...@horizon.com 05/28/14 4:40 PM
Jan: Is support for SLE10's pre-2.18 binutils still required?
Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
I'd much appreciate
Sorry for the delay; my Ivy Bridge test machine isn't in my
office and getting to the console to tweak the BIOS is a
bit of a bother.
Anyway, i7-4930K, turbo boost & hyperthreading disabled,
$ cat /sys/devices/system/cpu/cpu?/cpufreq/scaling_governor
performance
performance
performance
performance
On Thu, May 29, 2014 at 06:07:16PM -0700, Tim Chen wrote:
On Thu, 2014-05-29 at 19:54 -0400, George Spelvin wrote:
Sorry for the delay; my Ivy Bridge test machine isn't in my
office and getting to the console to tweak the BIOS is a
bit of a bother.
Anyway, i7-4930K, turbo boost
On Wed, 2014-05-28 at 19:01 -0400, George Spelvin wrote:
> Thanks for the reply!
>
> > Changing from the aligned move (movdqa) to unaligned move and zeroing
> > (pmovzxdq), is going to make things slower. If the table is aligned
> > on 8 byte boundary, some of the table can span 2 cache lines,
Thanks for the reply!
> Changing from the aligned move (movdqa) to unaligned move and zeroing
> (pmovzxdq), is going to make things slower. If the table is aligned
> on 8 byte boundary, some of the table can span 2 cache lines, which
> can slow things further.
Um, two notes:
1) This load is
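The cache-line concern quoted above comes down to simple arithmetic, which can be checked with a short sketch. This is an illustration only and does not reproduce the truncated reply: it assumes a 64-byte cache line, and the helper names are hypothetical.

```python
CACHE_LINE = 64  # assumed cache-line size in bytes


def crosses_cache_line(addr, size, line=CACHE_LINE):
    """Return True if a load of `size` bytes at `addr` touches two lines."""
    return (addr % line) + size > line


def crossing_offsets(size, align, span=4096, line=CACHE_LINE):
    """List the `align`-aligned offsets below `span` where a load of
    `size` bytes would straddle a cache-line boundary."""
    return [a for a in range(0, span, align) if crosses_cache_line(a, size, line)]
```

Under these assumptions, an 8-byte load from an 8-byte-aligned address never straddles a line, while a 16-byte load from an 8-byte-aligned address can (for example at offset 56).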
On Wed, 2014-05-28 at 10:40 -0400, George Spelvin wrote:
> While following a number of tangents in the code (I was figuring out
> how to edit lib/Kconfig; don't ask), I came across a table of 256 64-bit
> words, all of which had the high half set to zero.
>
> Since the code depends on both
Jan Beulich wrote:
> "George Spelvin" 05/28/14 4:40 PM
>> Jan: Is support for SLE10's pre-2.18 binutils still required?
>> Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
> I'd much appreciate if I would be able to build the kernel that way for
> another while.
Does it
>>> "George Spelvin" 05/28/14 4:40 PM >>>
>Jan: Is support for SLE10's pre-2.18 binutils still required?
>Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
I'd much appreciate if I would be able to build the kernel that way for another
while.
>Two other minor additional
Um, yeah, I just noticed the problem with that patch: half of the numbers
in that table are 33 bits, and cause a pile of warnings (not errors,
unfortunately!) from gas that scrolled by when I wasn't looking.
Logically, there should be no need for 33-bit values; they should all be
reducible modulo
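A check of the kind that would have caught this before assembling can be sketched as follows. It is only an illustration: the function name is hypothetical and the table values are supplied by the caller, not taken from the kernel source.

```python
def entries_wider_than(table, bits=32):
    """Return (index, value) pairs for entries that need more than `bits` bits."""
    limit = 1 << bits
    return [(i, v) for i, v in enumerate(table) if v >= limit]
```

An empty result means every entry fits in the narrower storage; any 33-bit constant shows up with its index.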
While following a number of tangents in the code (I was figuring out
how to edit lib/Kconfig; don't ask), I came across a table of 256 64-bit
words, all of which had the high half set to zero.
Since the code depends on both pclmulq and crc32, SSE 4.1 is obviously
present, so it could use pmovzxdq
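The idea can be modelled in a few lines: store each constant in 32 bits and widen on load, which is what pmovzxdq does for two packed 32-bit values at a time. The sketch below is a plain-Python model under stated assumptions, not the kernel code: the function names are hypothetical, and how the real routine indexes its table is not shown in this snippet.

```python
import struct


def pack_table_32(values):
    """Pack constants as little-endian 32-bit words.

    struct.pack raises struct.error if any value does not fit in
    32 bits, so a 33-bit constant is rejected rather than truncated.
    """
    return struct.pack("<%dI" % len(values), *values)


def load_zero_extended_pair(packed, pair_index):
    """Model of a pmovzxdq load: read two adjacent 32-bit words and
    return them as two zero-extended 64-bit lanes (16 bytes)."""
    lo, hi = struct.unpack_from("<2I", packed, pair_index * 8)
    return struct.pack("<2Q", lo, hi)
```

Packed this way, a table of 256 entries takes 1024 bytes instead of 2048, and each load yields the same 16-byte register image that the wide table would have provided.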
George Spelvin li...@horizon.com 05/28/14 4:40 PM
Jan: Is support for SLE10's pre-2.18 binutils still required?
Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
I'd much appreciate if I would be able to build the kernel that way for another
while.
Two other minor
Jan Beulich jbeul...@suse.com wrote:
George Spelvin li...@horizon.com 05/28/14 4:40 PM
Jan: Is support for SLE10's pre-2.18 binutils still required?
Your PEXTRD fix was only a year ago, so I expect, but I wanted to ask.
I'd much appreciate if I would be able to build the kernel that way for
On Wed, 2014-05-28 at 10:40 -0400, George Spelvin wrote:
While following a number of tangents in the code (I was figuring out
how to edit lib/Kconfig; don't ask), I came across a table of 256 64-bit
words, all of which had the high half set to zero.
Since the code depends on both pclmulq and
On Wed, 2014-05-28 at 19:01 -0400, George Spelvin wrote:
Thanks for the reply!
Changing from the aligned move (movdqa) to unaligned move and zeroing
(pmovzxdq), is going to make things slower. If the table is aligned
on 8 byte boundary, some of the table can span 2 cache lines, which