[block, blk] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
Hi Tejun,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc.git review-mq-percpu_ref
commit c924ec35e72ce0d6c289b858d323f7eb3f5076a5 ("block, blk-mq: draining can't be skipped even if bypass_depth was non-zero")

+------------------------------------------------------+------------+------------+
|                                                      | ea854572ee | c924ec35e7 |
+------------------------------------------------------+------------+------------+
| boot_successes                                       | 26         | 10         |
| early-boot-hang                                      | 1          |            |
| boot_failures                                        | 0          | 16         |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference | 0          | 16         |
| Oops                                                 | 0          | 16         |
| RIP:blk_throtl_drain                                 | 0          | 16         |
| kernel_BUG_at_arch/x86/mm/pageattr.c                 | 0          | 7          |
| invalid_opcode                                       | 0          | 7          |
| RIP:change_page_attr_set_clr                         | 0          | 7          |
| Kernel_panic-not_syncing:Fatal_exception             | 0          | 16         |
| backtrace:scsi_debug_exit                            | 0          | 6          |
| backtrace:SyS_delete_module                          | 0          | 6          |
| backtrace:do_vfs_ioctl                               | 0          | 10         |
| backtrace:SyS_ioctl                                  | 0          | 10         |
+------------------------------------------------------+------------+------------+

[  522.186410] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[  522.368967] EXT4-fs (dm-0): recovery complete
[  522.415305] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[  523.030685] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[  523.031682] IP: [813cea30] blk_throtl_drain+0x30/0x150
[  523.031682] PGD a8d1c067 PUD a71fd067 PMD 0
[  523.031682] Oops: [#1] SMP
[  523.031682] Modules linked in: dm_flakey dm_mod fuse sg sr_mod cdrom ata_generic pata_acpi cirrus syscopyarea snd_pcm sysfillrect snd_timer sysimgblt floppy snd ttm soundcore parport_pc drm_kms_helper parport drm pcspkr i2c_piix4 ata_piix libata
[  523.031682] CPU: 0 PID: 30028 Comm: dmsetup Not tainted 3.16.0-rc1-01463-g94b6452 #1
[  523.031682] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[  523.031682] task: 88011560bb20 ti: 8800a6c7c000 task.ti: 8800a6c7c000
[  523.031682] RIP: 0010:[813cea30] [813cea30] blk_throtl_drain+0x30/0x150
[  523.031682] RSP: 0018:8800a6c7fb58  EFLAGS: 00010046
[  523.031682] RAX:  RBX: 88011503be40 RCX: 7fff
[  523.031682] RDX: 0016 RSI:  RDI:
[  523.031682] RBP: 8800a6c7fb70 R08:  R09: 0046
[  523.031682] R10: 8800a6c7fb70 R11: 813dcbb1 R12: 88011503be40
[  523.031682] R13: 8800d50a7700 R14: 88011503c498 R15:
[  523.031682] FS:  7fa84cf11800() GS:88011fc0() knlGS:
[  523.031682] CS:  0010 DS:  ES:  CR0: 8005003b
[  523.031682] CR2: 0028 CR3: 7ed12000 CR4: 06f0
[  523.031682] Stack:
[  523.031682]  88011503be40 88011503c4a8 8800a6c7fb80
[  523.031682]  813cba6e 8800a6c7fbb0 813b0b6c 88011503be40
[  523.031682]  81cf3920 88011503be40 8800aad17a00 8800a6c7fbc8
[  523.031682] Call Trace:
[  523.031682]  [813cba6e] blkcg_drain_queue+0xe/0x10
[  523.031682]  [813b0b6c] __blk_drain_queue+0x7c/0x180
[  523.031682]  [813b0cfe] blk_queue_bypass_start+0x8e/0xd0
[  523.031682]  [813cac18] blkcg_deactivate_policy+0x38/0x140
[  523.031682]  [813cec84] blk_throtl_exit+0x34/0x50
[  523.031682]  [813cbab8] blkcg_exit_queue+0x48/0x70
[  523.031682]  [813b43c6] blk_release_queue+0x26/0x100
[  523.031682]  [813dcb97] kobject_cleanup+0x77/0x1b0
[  523.031682]  [813dca48] kobject_put+0x28/0x60
[  523.031682]  [813b0e59] blk_cleanup_queue+0x119/0x1c0
[  523.031682]  [a019c243] __dm_destroy+0x1f3/0x280 [dm_mod]
[  523.031682]  [a019d083] dm_destroy+0x13/0x20 [dm_mod]
[  523.031682]  [a01a27de] dev_remove+0x11e/0x180 [dm_mod]
[  523.031682]  [a01a26c0] ? dev_suspend+0x250/0x250 [dm_mod]
[  523.031682]  [a01a2ea9] ctl_ioctl+0x269/0x500 [dm_mod]
[  523.031682]  [814c5c4b] ? extract_buf+0xbb/0x130
[  523.031682]  [a01a3153] dm_ctl_ioctl+0x13/0x20 [dm_mod]
[  523.031682]  [811da350] do_vfs_ioctl+0x300/0x520
[  523.031682]  [] ? file_has_perm+0x86/0xa0
[  523.031682]  [] SyS_ioctl+0x81/0xa0
[  523.031682]  [] system_call_fastpath+0x16/0x1b
[  523.031682] Code: 55 65 ff 04 25 a0 c7 00 00 48 89 e5 41 55 41 54 49 89 fc 53 4c 8b af 40 07 00 00 49 8b 85 a0 00 00 00 31 ff 48 8b 80 c8 05 00 00 <48> 8b 70 28 e8 37 8d d2 ff 48 85 c0 48 89 c3
[kernel/watchdog.c] ed235875e2c: -14.2% will-it-scale.scalability
Hi Aaron,

FYI, we noticed the below changes on

commit ed235875e2ca983197831337a986f0517074e1a0 ("kernel/watchdog.c: print traces for all cpus on lockup detection")

test case: lkp-snb01/will-it-scale/signal1

f3aca3d09525f87  ed235875e2ca983197831337a
---------------  -------------------------
      0.12 ~ 0%     -14.2%       0.10 ~ 0%  TOTAL will-it-scale.scalability
    506146 ~ 0%      -4.4%     484004 ~ 0%  TOTAL will-it-scale.per_process_ops
     12193 ~ 4%     +12.6%      13726 ~ 6%  TOTAL slabinfo.kmalloc-256.active_objs
     12921 ~ 4%     +12.3%      14516 ~ 5%  TOTAL slabinfo.kmalloc-256.num_objs
    123094 ~ 3%      -6.5%     115117 ~ 3%  TOTAL meminfo.Committed_AS

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[figure: will-it-scale.per_process_ops across runs; bisect-good samples (*) cluster around 500000-510000 ops, bisect-bad samples (O) around 477000-487000 ops]

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
./runtest.py signal1 25 1 8 16 24 32
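The one-echo-per-CPU reproduce script above can be expressed as a single loop; a behavior-equivalent sketch (the function name and the overridable root are additions for illustration, so the loop can also be exercised against a fake sysfs tree):

```shell
#!/bin/sh
# Set every CPU's cpufreq governor to "performance", as the reproduce
# script above does one cpu at a time. The optional argument overrides
# /sys so the loop can be tested against a fake directory tree.
set_performance_governor() {
	root="${1:-/sys}"
	for gov in "$root"/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
		# Skip the literal glob when nothing matches, and
		# non-writable files when not running as root.
		[ -w "$gov" ] && echo performance > "$gov"
	done
}
# set_performance_governor   # uncomment to apply to the real /sys
```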
[mempolicy] 5507231dd04: -18.2% vm-scalability.migrate_mbps
Hi Naoya,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/balbi/usb.git am437x-starterkit
commit 5507231dd04d3d68796bafe83e6a20c985a0ef68 ("mempolicy: apply page table walker on queue_pages_range()")

test case: ivb44/vm-scalability/300s-migrate

8c81f3eeb336567  5507231dd04d3d68796bafe83
---------------  -------------------------
    347258 ~ 0%     -18.2%     284195 ~ 0%  TOTAL vm-scalability.migrate_mbps
      0.00           +Inf%       0.94 ~ 7%  TOTAL perf-profile.cpu-cycles._raw_spin_lock.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node
     11.49 ~ 1%    -100.0%       0.00 ~ 0%  TOTAL perf-profile.cpu-cycles.vm_normal_page.queue_pages_range.migrate_to_node.do_migrate_pages.SYSC_migrate_pages
     69.40 ~ 0%    -100.0%       0.00 ~ 0%  TOTAL perf-profile.cpu-cycles.queue_pages_range.migrate_to_node.do_migrate_pages.SYSC_migrate_pages.sys_migrate_pages
      3.68 ~ 3%    -100.0%       0.00 ~ 0%  TOTAL perf-profile.cpu-cycles.vm_normal_page.migrate_to_node.do_migrate_pages.SYSC_migrate_pages.sys_migrate_pages
      0.00           +Inf%       4.51 ~ 2%  TOTAL perf-profile.cpu-cycles.vm_normal_page.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node
      0.00           +Inf%       8.36 ~ 1%  TOTAL perf-profile.cpu-cycles.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node.do_migrate_pages
      1.17 ~ 4%    -100.0%       0.00 ~ 0%  TOTAL perf-profile.cpu-cycles._raw_spin_lock.queue_pages_range.migrate_to_node.do_migrate_pages.SYSC_migrate_pages
      0.00           +Inf%       9.30 ~ 2%  TOTAL perf-profile.cpu-cycles.vm_normal_page.queue_pages_pte.__walk_page_range.walk_page_range.queue_pages_range
      0.00           +Inf%      63.92 ~ 1%  TOTAL perf-profile.cpu-cycles.queue_pages_pte.__walk_page_range.walk_page_range.queue_pages_range.migrate_to_node
        61 ~32%    +363.8%        286 ~10%  TOTAL numa-vmstat.node0.nr_unevictable
       257 ~30%    +345.5%       1147 ~10%  TOTAL numa-meminfo.node0.Unevictable
      1133 ~ 8%    +129.0%       2596 ~ 0%  TOTAL meminfo.Unevictable
       282 ~ 8%    +129.1%        647 ~ 0%  TOTAL proc-vmstat.nr_unevictable
     93913 ~ 7%     -49.8%      47172 ~ 3%  TOTAL softirqs.RCU
    113808 ~ 1%     -45.4%      62087 ~ 0%  TOTAL softirqs.SCHED
    362197 ~ 0%     -32.9%     243163 ~ 0%  TOTAL cpuidle.C6-IVT.usage
      1.49 ~ 3%     -19.6%       1.20 ~ 4%  TOTAL perf-profile.cpu-cycles.intel_idle.cpuidle_enter_state.cpuidle_enter.cpu_startup_entry.start_secondary
    743815 ~ 2%     -20.3%     592628 ~ 6%  TOTAL proc-vmstat.pgmigrate_fail
       310 ~ 6%     +16.6%        362 ~ 8%  TOTAL numa-vmstat.node1.nr_unevictable
      1243 ~ 6%     +16.5%       1448 ~ 8%  TOTAL numa-meminfo.node1.Unevictable
      1230 ~ 6%     +16.6%       1434 ~ 8%  TOTAL numa-meminfo.node1.Mlocked
       307 ~ 6%     +16.7%        358 ~ 8%  TOTAL numa-vmstat.node1.nr_mlock
   3943910 ~ 0%     -12.3%    3459206 ~ 0%  TOTAL proc-vmstat.pgfault
      4402 ~ 3%     -13.4%       3812 ~ 5%  TOTAL numa-meminfo.node1.KernelStack
     15303 ~ 7%     -17.5%      12621 ~ 9%  TOTAL slabinfo.kmalloc-192.num_objs
     15301 ~ 7%     -17.5%      12621 ~ 9%  TOTAL slabinfo.kmalloc-192.active_objs
     30438 ~ 0%     +91.0%      58142 ~ 0%  TOTAL time.involuntary_context_switches
       162 ~ 3%     +81.9%        296 ~ 0%  TOTAL time.system_time
        53 ~ 3%     +81.1%         96 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
   2586283 ~ 0%     -18.5%    2107842 ~ 0%  TOTAL time.minor_page_faults
     48619 ~ 0%     -18.1%      39800 ~ 0%  TOTAL time.voluntary_context_switches
      2037 ~ 0%     -17.7%       1677 ~ 0%  TOTAL vmstat.system.in
      2206 ~ 0%      -4.7%       2101 ~ 0%  TOTAL vmstat.system.cs
           ~ 1%      -3.6%            ~ 1%  TOTAL turbostat.Cor_W
           ~ 1%      -2.2%            ~ 1%  TOTAL turbostat.Pkg_W
      2.17 ~ 0%      -1.4%       2.14 ~ 0%  TOTAL turbostat.%c0

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[figure: time.system_time across runs; bisect-bad samples (O) sit near 300 s, well above the base commit; plot truncated in the original message]
Re: [memcontrol] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked()
> - the memsw counter is not accounted, but then unaccounted.
>
> Andrew, can you please put this in to fix the uncharge rewrite patch
> mentioned above?
>
> ---
> From 29bcfcf54494467008aaf9d4e37771d3b2e2c2c7 Mon Sep 17 00:00:00 2001
> From: Johannes Weiner
> Date: Fri, 20 Jun 2014 11:09:14 -0400
> Subject: [patch] mm: memcontrol: rewrite uncharge API fix
>
> It's not entirely clear whether do_swap_account or PCG_MEMSW is the
> authoritative answer to whether a page is swap-accounted or not. This
> currently leads to the following memsw counter underflow when swap
> accounting is disabled:
>
> [    2.753355] WARNING: CPU: 0 PID: 1 at kernel/res_counter.c:28 res_counter_uncharge_locked+0x48/0x74()
> [    2.753355] CPU: 0 PID: 1 Comm: init Not tainted 3.16.0-rc1-00238-gddc5bfe #1
> [    2.753355] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
> [    2.753355]  880012073c50 81a23b9d 880012073c88
> [    2.753355]  810bc765 8111fac8 1000 88001200fa50
> [    2.753355]  0001 88001200fa01 880012073c98 810bc84b
> [    2.753355] Call Trace:
> [    2.753355]  [81a23b9d] dump_stack+0x19/0x1b
> [    2.753355]  [810bc765] warn_slowpath_common+0x73/0x8c
> [    2.753355]  [8111fac8] ? res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [810bc84b] warn_slowpath_null+0x1a/0x1c
> [    2.753355]  [8111fac8] res_counter_uncharge_locked+0x48/0x74
> [    2.753355]  [8111fd02] res_counter_uncharge_until+0x4e/0xa9
> [    2.753355]  [8111fd70] res_counter_uncharge+0x13/0x15
> [    2.753355]  [8119499c] mem_cgroup_uncharge_end+0x73/0x8d
> [    2.753355]  [8115735e] release_pages+0x1f2/0x20d
> [    2.753355]  [8116cc3a] tlb_flush_mmu_free+0x28/0x43
> [    2.753355]  [8116d5e5] tlb_flush_mmu+0x20/0x23
> [    2.753355]  [8116d5fc] tlb_finish_mmu+0x14/0x39
> [    2.753355]  [811730c1] unmap_region+0xcd/0xdf
> [    2.753355]  [81172b0e] ? vma_gap_callbacks_propagate+0x18/0x33
> [    2.753355]  [81174bf1] do_munmap+0x252/0x2e0
> [    2.753355]  [81174cc3] vm_munmap+0x44/0x5c
> [    2.753355]  [81174cfe] SyS_munmap+0x23/0x29
> [    2.753355]  [81a31567] system_call_fastpath+0x16/0x1b
> [    2.753355] ---[ end trace cfeb07101f6fbdfb ]---
>
> Don't set PCG_MEMSW when swap accounting is disabled, so that
> uncharging only has to look at this per-page flag.
>
> mem_cgroup_swapout() could also fully rely on this flag, but as it can
> bail out before even looking up the page_cgroup, check do_swap_account
> as a performance optimization and only sanity test for PCG_MEMSW.
>
> Signed-off-by: Johannes Weiner
> ---
>  mm/memcontrol.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 94d7c40b9f26..d6a20935f9c4 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2740,7 +2740,7 @@ static void commit_charge(struct page *page, struct mem_cgroup *memcg,
>  	 * have the page locked
>  	 */
>  	pc->mem_cgroup = memcg;
> -	pc->flags = PCG_USED | PCG_MEM | PCG_MEMSW;
> +	pc->flags = PCG_USED | PCG_MEM | (do_swap_account ? PCG_MEMSW : 0);
>
>  	if (lrucare) {
>  		if (was_on_lru) {
> @@ -6598,7 +6598,7 @@ void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
>  		return;
>
>  	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEM), oldpage);
> -	VM_BUG_ON_PAGE(!(pc->flags & PCG_MEMSW), oldpage);
> +	VM_BUG_ON_PAGE(do_swap_account && !(pc->flags & PCG_MEMSW), oldpage);
>  	pc->flags &= ~(PCG_MEM | PCG_MEMSW);
>
>  	if (PageTransHuge(oldpage)) {

Johannes, your patch fixes the problem.

Tested-by: Jet Chen

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[x86_64,vsyscall] 21d4ab4881a: -11.1% will-it-scale.per_process_ops
Hi Andy,

we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git x86/vsyscall
commit 21d4ab4881ad9b257bec75d04480105dad4336e1 ("x86_64,vsyscall: Move all of the gate_area code to vsyscall_64.c")

test case: lkp-wsx01/will-it-scale/signal1

a7781f1035319a7  21d4ab4881ad9b257bec75d04
---------------  -------------------------
    259032 ~ 1%     -11.1%     230288 ~ 0%  TOTAL will-it-scale.per_process_ops
      0.04 ~ 9%   +4276.2%       1.84 ~ 1%  TOTAL perf-profile.cpu-cycles.map_id_up.do_tkill.sys_tgkill.system_call_fastpath.raise
      2.36 ~ 0%     -63.8%       0.85 ~ 2%  TOTAL perf-profile.cpu-cycles._atomic_dec_and_lock.free_uid.__sigqueue_free.__dequeue_signal.dequeue_signal
      2.25 ~14%     -55.2%       1.01 ~ 1%  TOTAL perf-profile.cpu-cycles.raise
     42.41 ~ 0%     +34.5%      57.04 ~ 0%  TOTAL perf-profile.cpu-cycles.__sigqueue_alloc.__send_signal.send_signal.do_send_sig_info.do_send_specific
     40.70 ~ 0%     -23.9%      30.96 ~ 0%  TOTAL perf-profile.cpu-cycles.__sigqueue_free.part.11.__dequeue_signal.dequeue_signal.get_signal_to_deliver.do_signal
       252 ~11%     -18.8%        204 ~ 9%  TOTAL numa-vmstat.node1.nr_page_table_pages
      1012 ~11%     -18.3%        827 ~ 9%  TOTAL numa-meminfo.node1.PageTables
       520 ~ 7%     -17.1%        431 ~ 5%  TOTAL cpuidle.C1-NHM.usage

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[figure: will-it-scale.per_process_ops across runs; bisect-good samples (*) hold steady while bisect-bad samples (O) sit visibly lower]

Disclaimer:
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
echo performance >
[tracing] 939c7a4f04f: -46.4% cpuidle.C3-IVT.time
Hi Yoshihiro,

FYI, we noticed the below changes on

commit 939c7a4f04fcd2162109744e8bf88194948a6e65 ("tracing: Introduce saved_cmdlines_size file")

test case: brickland3/aim7/3000-brk_test

beba4bb096201ce  939c7a4f04fcd2162109744e8
---------------  -------------------------
    352643 ~40%     -46.4%     189136 ~15%  TOTAL cpuidle.C3-IVT.time
       750 ~ 2%     -13.3%        650 ~ 2%  TOTAL vmstat.procs.b
    362016 ~ 3%      -6.0%     340296 ~ 3%  TOTAL proc-vmstat.pgmigrate_success
    362016 ~ 3%      -6.0%     340296 ~ 3%  TOTAL proc-vmstat.numa_pages_migrated
   1642328 ~ 3%     +13.3%    1860721 ~ 2%  TOTAL softirqs.RCU
   2278829 ~ 2%     -14.0%    1959647 ~ 2%  TOTAL time.voluntary_context_switches
       207 ~ 0%     -10.4%        185 ~ 0%  TOTAL time.user_time
           ~ 0%      +8.5%            ~ 0%  TOTAL turbostat.RAM_W
   4418246 ~ 3%      -6.4%    4136447 ~ 1%  TOTAL time.minor_page_faults
           ~ 0%      +3.4%            ~ 0%  TOTAL turbostat.Cor_W
           ~ 0%      +3.0%            ~ 0%  TOTAL turbostat.Pkg_W
  11354889 ~ 0%      +2.0%   11579897 ~ 0%  TOTAL time.involuntary_context_switches

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[figures: time.voluntary_context_switches, turbostat.Pkg_W, turbostat.Cor_W and turbostat.RAM_W across runs; bisect-good (*) vs bisect-bad (O) samples, with voluntary context switches dropping from roughly 2.2e+06 to 1.9e+06 on the bad side]
[net] b58537a1f56: +89.2% netperf.Throughput_Mbps
Hi Daniel,

FYI, we noticed the below changes on commit b58537a1f5629bdc98a8b9dc2051ce0e952f6b4b ("net: sctp: fix permissions for rto_alpha and rto_beta knobs")

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
       950 ~ 1%     +93.7%       1841 ~ 0%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
       750 ~ 1%     +83.4%       1375 ~ 0%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
      1700 ~ 1%     +89.2%       3217 ~ 0%  TOTAL netperf.Throughput_Mbps

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      7841 ~ 0%     +50.4%      11792 ~ 0%  lkp-nex04/netperf/300s-200%-SCTP_RR
      7603 ~ 0%     +37.6%      10463 ~ 0%  lkp-nex05/netperf/300s-200%-SCTP_RR
     15445 ~ 0%     +44.1%      22256 ~ 0%  TOTAL netperf.Throughput_tps

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      5.93 ~ 1%     -99.0%       0.06 ~ 0%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      6.53 ~ 1%     -99.1%       0.06 ~12%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     12.46 ~ 1%     -99.1%       0.12 ~ 6%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_cmp_addr_exact.sctp_assoc_lookup_paddr.sctp_endpoint_lookup_assoc.sctp_sendmsg

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      6.35 ~ 0%     -98.3%       0.11 ~ 3%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      2.95 ~ 5%    -100.0%       0.00 ~ 0%  lkp-nex04/netperf/300s-200%-SCTP_RR
      2.26 ~ 6%    -100.0%       0.00 ~ 0%  lkp-nex05/netperf/300s-200%-SCTP_RR
      6.68 ~ 0%     -98.5%       0.10 ~10%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     18.23 ~ 1%     -98.9%       0.21 ~ 7%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_chunk_iif.sctp_ulpevent_make_rcvmsg.sctp_ulpq_tail_data.sctp_cmd_interpreter

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      3.07 ~ 0%     -99.0%       0.03 ~12%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      4.84 ~ 0%    -100.0%       0.00 ~ 0%  lkp-nex04/netperf/300s-200%-SCTP_RR
      3.92 ~ 5%    -100.0%       0.00 ~ 0%  lkp-nex05/netperf/300s-200%-SCTP_RR
      3.17 ~ 0%     -99.0%       0.03 ~12%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     15.01 ~ 1%     -99.6%       0.06 ~12%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_cmp_addr_exact.sctp_assoc_lookup_paddr.sctp_assoc_is_match.__sctp_lookup_association

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      1.92 ~ 1%     -97.9%       0.04 ~ 0%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      3.78 ~ 5%     -99.3%       0.03 ~18%  lkp-nex04/netperf/300s-200%-SCTP_RR
      2.91 ~ 5%     -99.0%       0.03 ~14%  lkp-nex05/netperf/300s-200%-SCTP_RR
      2.00 ~ 1%     -98.2%       0.04 ~13%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     10.61 ~ 3%     -98.8%       0.13 ~10%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_eat_data.sctp_sf_eat_data_6_2.sctp_do_sm.sctp_assoc_bh_rcv

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      3.36 ~ 0%     -97.1%       0.10 ~ 4%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      6.16 ~ 0%     -99.3%       0.04 ~23%  lkp-nex04/netperf/300s-200%-SCTP_RR
      4.91 ~ 5%     -98.9%       0.05 ~22%  lkp-nex05/netperf/300s-200%-SCTP_RR
      3.47 ~ 1%     -97.4%       0.09 ~ 7%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     17.90 ~ 2%     -98.4%       0.28 ~11%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_rcv.ip_local_deliver_finish.ip_local_deliver.ip_rcv_finish

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      6.94 ~ 0%     -97.3%       0.18 ~ 4%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
      7.54 ~ 0%     -97.6%       0.18 ~10%  lkp-wsx02/netperf/300s-200%-10K-SCTP_STREAM_MANY
     14.48 ~ 0%     -97.5%       0.36 ~ 7%  TOTAL perf-profile.cpu-cycles.sctp_get_af_specific.sctp_sockaddr_af.sctp_sendmsg.inet_sendmsg.sock_sendmsg

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
      6207 ~ 0%    +273.8%      23202 ~36%  lkp-nex05/netperf/300s-200%-SCTP_RR
      6207 ~ 0%    +273.8%      23202 ~36%  TOTAL numa-vmstat.node3.nr_active_file

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
     24831 ~ 0%    +273.8%      92810 ~36%  lkp-nex05/netperf/300s-200%-SCTP_RR
     24831 ~ 0%    +273.8%      92810 ~36%  TOTAL numa-meminfo.node3.Active(file)

e4f7ae930afafd4  b58537a1f5629bdc98a8b9dc2
---------------  -------------------------
     27053 ~ 3%    +237.7%      91368 ~10%  lkp-nex04/netperf/300s-200%-10K-SCTP_STREAM_MANY
       606 ~26%     -68.2%        192 ~11%  lkp-nex05/netperf/300s-200%-SCTP_RR
     83164 ~14%    +322.9%     351725 ~26%
[x86_64,vsyscall] 21d4ab4881a: -11.1% will-it-scale.per_process_ops
Hi Andy,

we noticed the below changes on git://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git x86/vsyscall
commit 21d4ab4881ad9b257bec75d04480105dad4336e1 ("x86_64,vsyscall: Move all of the gate_area code to vsyscall_64.c")

test case: lkp-wsx01/will-it-scale/signal1

a7781f1035319a7  21d4ab4881ad9b257bec75d04
---------------  -------------------------
    259032 ~ 1%     -11.1%     230288 ~ 0%  TOTAL will-it-scale.per_process_ops
      0.04 ~ 9%   +4276.2%       1.84 ~ 1%  TOTAL perf-profile.cpu-cycles.map_id_up.do_tkill.sys_tgkill.system_call_fastpath.raise
      2.36 ~ 0%     -63.8%       0.85 ~ 2%  TOTAL perf-profile.cpu-cycles._atomic_dec_and_lock.free_uid.__sigqueue_free.__dequeue_signal.dequeue_signal
      2.25 ~14%     -55.2%       1.01 ~ 1%  TOTAL perf-profile.cpu-cycles.raise
     42.41 ~ 0%     +34.5%      57.04 ~ 0%  TOTAL perf-profile.cpu-cycles.__sigqueue_alloc.__send_signal.send_signal.do_send_sig_info.do_send_specific
     40.70 ~ 0%     -23.9%      30.96 ~ 0%  TOTAL perf-profile.cpu-cycles.__sigqueue_free.part.11.__dequeue_signal.dequeue_signal.get_signal_to_deliver.do_signal
       252 ~11%     -18.8%        204 ~ 9%  TOTAL numa-vmstat.node1.nr_page_table_pages
      1012 ~11%     -18.3%        827 ~ 9%  TOTAL numa-meminfo.node1.PageTables
       520 ~ 7%     -17.1%        431 ~ 5%  TOTAL cpuidle.C1-NHM.usage

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[ASCII plot: will-it-scale.per_process_ops]

[*] bisect-good sample
[O] bisect-bad sample

Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
Thanks, Jet

# reproduce step: set every CPU's cpufreq governor to performance
for cpu in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
do
	echo performance > $cpu
done
Re: [net/ipvs] BUG: unable to handle kernel NULL pointer dereference at 00000004
On 06/11/2014 01:59 PM, Julian Anastasov wrote:
>
> 	Hello,
>
> On Wed, 11 Jun 2014, Jet Chen wrote:
>
>> Hi Wensong,
>>
>> 0day kernel testing robot got the below dmesg.
>>
>> +-------------------------------------------------------+----+
>> | boot_successes                                        | 26 |
>> | boot_failures                                         | 4  |
>> | BUG:unable_to_handle_kernel_NULL_pointer_dereference  | 4  |
>> | Oops                                                  | 4  |
>> | EIP_is_at_ip_vs_stop_estimator                        | 4  |
>> | Kernel_panic-not_syncing:Fatal_exception_in_interrupt | 4  |
>> | backtrace:cleanup_net                                 | 4  |
>> +-------------------------------------------------------+----+
>>
>> [child0:2725] process_vm_readv (347) returned ENOSYS, marking as inactive.
>> [child0:2725] uid changed! Was: 0, now -788547075
>> Bailing main loop. Exit reason: UID changed.
>> [   12.182233] BUG: unable to handle kernel NULL pointer dereference at 0004
>> [   12.183011] IP: [<4c2f6567>] ip_vs_stop_estimator+0x20/0x3e
>> [   12.183011] *pdpt = *pde = f000ff53f000ff53
>> [   12.183011] Oops: 0002 [#1] DEBUG_PAGEALLOC
>> [   12.183011] Modules linked in:
>> [   12.183011] CPU: 0 PID: 57 Comm: kworker/u2:1 Not tainted 3.15.0-rc8 #1
>> [   12.183011] Workqueue: netns cleanup_net
>> [   12.183011] task: 528773f0 ti: 52878000 task.ti: 52878000
>> [   12.183011] EIP: 0060:[<4c2f6567>] EFLAGS: 00010206 CPU: 0
>> [   12.183011] EIP is at ip_vs_stop_estimator+0x20/0x3e
>> [   12.183011] EAX: EBX: 51c39a54 ECX: EDX:
>
> 	ip_vs_stop_estimator fails at list_del(&est->list)
> on the mov %eax,0x4(%edx) instruction and EDX is 0. It means
> this estimator was never started (initialized with
> INIT_LIST_HEAD in ip_vs_start_estimator) or was stopped
> before with the same list_del.
>
> 	At first look, it is strange but I think the reason
> is the missing CONFIG_SYSCTL. ip_vs_control_net_cleanup
> fails at ip_vs_stop_estimator(net, &ipvs->tot_stats)
> because it is called not depending on CONFIG_SYSCTL but
> without CONFIG_SYSCTL ip_vs_start_estimator was never
> called.
>
> 	Can you test such patch?

Julian, your patch works. Thanks.
Tested-by: Jet Chen

> ipvs: stop tot_stats estimator only under CONFIG_SYSCTL
>
> The tot_stats estimator is started only when CONFIG_SYSCTL
> is defined. But it is stopped without checking CONFIG_SYSCTL.
> Fix the crash by moving ip_vs_stop_estimator into
> ip_vs_control_net_cleanup_sysctl.
>
> Signed-off-by: Julian Anastasov
> ---
>  net/netfilter/ipvs/ip_vs_ctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_ctl.c b/net/netfilter/ipvs/ip_vs_ctl.c
> index c42e83d..581a658 100644
> --- a/net/netfilter/ipvs/ip_vs_ctl.c
> +++ b/net/netfilter/ipvs/ip_vs_ctl.c
> @@ -3778,6 +3778,7 @@ static void __net_exit ip_vs_control_net_cleanup_sysctl(struct net *net)
>  	cancel_delayed_work_sync(&ipvs->defense_work);
>  	cancel_work_sync(&ipvs->defense_work.work);
>  	unregister_net_sysctl_table(ipvs->sysctl_hdr);
> +	ip_vs_stop_estimator(net, &ipvs->tot_stats);
>  }
>
>  #else
> @@ -3840,7 +3841,6 @@ void __net_exit ip_vs_control_net_cleanup(struct net *net)
>  {
>  	struct netns_ipvs *ipvs = net_ipvs(net);
>
>  	ip_vs_trash_cleanup(net);
> -	ip_vs_stop_estimator(net, &ipvs->tot_stats);
>  	ip_vs_control_net_cleanup_sysctl(net);
>  	remove_proc_entry("ip_vs_stats_percpu", net->proc_net);
>  	remove_proc_entry("ip_vs_stats", net->proc_net);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
[raid5] cf170f3fa45: +4.8% vmstat.io.bo
Hi Eivind,

FYI, we noticed the below changes on git://neil.brown.name/md for-next
commit cf170f3fa451350e431314e1a0a52014fda4b2d6 ("raid5: avoid release list until last reference of the stripe")

test case: lkp-st02/dd-write/11HDD-RAID5-cfq-xfs-10dd

8b32bf5e37328c0  cf170f3fa451350e431314e1a
---------------  -------------------------
    486996 ~ 0%      +4.8%     510428 ~ 0%  TOTAL vmstat.io.bo
     17643 ~ 1%     -17.3%      14599 ~ 0%  TOTAL vmstat.system.in
     11633 ~ 4%     -56.7%       5039 ~ 0%  TOTAL vmstat.system.cs
       109 ~ 1%      +6.5%        116 ~ 1%  TOTAL iostat.sdb.rrqm/s
       109 ~ 2%      +5.1%        114 ~ 1%  TOTAL iostat.sdc.rrqm/s
       110 ~ 2%      +5.5%        117 ~ 0%  TOTAL iostat.sdj.rrqm/s
     12077 ~ 0%      +4.8%      12660 ~ 0%  TOTAL iostat.sde.wrqm/s
     48775 ~ 0%      +4.8%      51125 ~ 0%  TOTAL iostat.sde.wkB/s
     12077 ~ 0%      +4.8%      12659 ~ 0%  TOTAL iostat.sdb.wrqm/s
     12076 ~ 0%      +4.8%      12659 ~ 0%  TOTAL iostat.sdd.wrqm/s
     12077 ~ 0%      +4.8%      12660 ~ 0%  TOTAL iostat.sdf.wrqm/s
     48775 ~ 0%      +4.8%      51121 ~ 0%  TOTAL iostat.sdb.wkB/s
     12078 ~ 0%      +4.8%      12659 ~ 0%  TOTAL iostat.sdj.wrqm/s
     12078 ~ 0%      +4.8%      12660 ~ 0%  TOTAL iostat.sdi.wrqm/s
     12076 ~ 0%      +4.8%      12658 ~ 0%  TOTAL iostat.sdg.wrqm/s
     48774 ~ 0%      +4.8%      51122 ~ 0%  TOTAL iostat.sdd.wkB/s
     48776 ~ 0%      +4.8%      51128 ~ 0%  TOTAL iostat.sdf.wkB/s
     48780 ~ 0%      +4.8%      51121 ~ 0%  TOTAL iostat.sdj.wkB/s
     48779 ~ 0%      +4.8%      51128 ~ 0%  TOTAL iostat.sdi.wkB/s
     48773 ~ 0%      +4.8%      51119 ~ 0%  TOTAL iostat.sdg.wkB/s
    486971 ~ 0%      +4.8%     510409 ~ 0%  TOTAL iostat.md0.wkB/s
     12076 ~ 0%      +4.8%      12657 ~ 0%  TOTAL iostat.sdc.wrqm/s
     12077 ~ 0%      +4.8%      12659 ~ 0%  TOTAL iostat.sdh.wrqm/s
      1910 ~ 0%      +4.8%       2001 ~ 0%  TOTAL iostat.md0.w/s
       110 ~ 2%      +6.5%        117 ~ 1%  TOTAL iostat.sdk.rrqm/s
     12077 ~ 0%      +4.8%      12659 ~ 0%  TOTAL iostat.sdk.wrqm/s
     48772 ~ 0%      +4.8%      51115 ~ 0%  TOTAL iostat.sdc.wkB/s
     48776 ~ 0%      +4.8%      51121 ~ 0%  TOTAL iostat.sdh.wkB/s
     48777 ~ 0%      +4.8%      51121 ~ 0%  TOTAL iostat.sdk.wkB/s
       109 ~ 2%      +3.3%        113 ~ 1%  TOTAL iostat.sde.rrqm/s
  4.28e+09 ~ 0%      -4.1%  4.104e+09 ~ 0%  TOTAL perf-stat.cache-misses
 8.654e+10 ~ 0%      +4.7%  9.058e+10 ~ 0%  TOTAL perf-stat.L1-dcache-store-misses
 3.549e+09 ~ 1%      +3.7%  3.682e+09 ~ 0%  TOTAL perf-stat.L1-dcache-prefetches
 6.764e+11 ~ 0%      +3.7%  7.011e+11 ~ 0%  TOTAL perf-stat.dTLB-stores
 6.759e+11 ~ 0%      +3.7%  7.011e+11 ~ 0%  TOTAL perf-stat.L1-dcache-stores
 4.731e+10 ~ 0%      +3.6%  4.903e+10 ~ 0%  TOTAL perf-stat.L1-dcache-load-misses
 3.017e+12 ~ 0%      +3.5%  3.121e+12 ~ 0%  TOTAL perf-stat.instructions
 1.118e+12 ~ 0%      +3.3%  1.156e+12 ~ 0%  TOTAL perf-stat.dTLB-loads
 1.117e+12 ~ 0%      +3.2%  1.152e+12 ~ 0%  TOTAL perf-stat.L1-dcache-loads
 3.022e+12 ~ 0%      +3.2%  3.119e+12 ~ 0%  TOTAL perf-stat.iTLB-loads
 5.613e+11 ~ 0%      +3.2%  5.794e+11 ~ 0%  TOTAL perf-stat.branch-instructions
  5.62e+11 ~ 0%      +3.1%  5.793e+11 ~ 0%  TOTAL perf-stat.branch-loads
 1.343e+09 ~ 0%      +2.6%  1.378e+09 ~ 0%  TOTAL perf-stat.LLC-store-misses
 2.073e+10 ~ 0%      +2.9%  2.133e+10 ~ 1%  TOTAL perf-stat.LLC-loads
 4.854e+10 ~ 0%      +1.6%  4.931e+10 ~ 0%  TOTAL perf-stat.cache-references
 1.167e+10 ~ 0%      +1.4%  1.183e+10 ~ 0%  TOTAL perf-stat.L1-icache-load-misses
   7068624 ~ 4%     -56.4%    3078966 ~ 0%  TOTAL perf-stat.context-switches
 2.214e+09 ~ 1%      -7.8%  2.041e+09 ~ 1%  TOTAL perf-stat.LLC-load-misses
    131433 ~ 0%     -18.9%     106597 ~ 1%  TOTAL perf-stat.cpu-migrations

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
Thanks, Jet

mdadm -q --create /dev/md0 --chunk=256 --level=raid5 --raid-devices=11 --force --assume-clean /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1
echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable
mkfs -t xfs /dev/md0
mount -t xfs -o nobarrier,inode64 /dev/md0 /fs/md0
dd if=/dev/zero of=/fs/md0/zero-1 status=none &
dd if=/dev/zero of=/fs/md0/zero-2 status=none &
dd if=/dev/zero of=/fs/md0/zero-3 status=none &
dd if=/dev/zero of=/fs/md0/zero-4 status=none &
dd if=/dev/zero of=/fs/md0/zero-5 status=none &
dd if=/dev/zero
[rcu] 5057f55e543: -23.5% qperf.udp.recv_bw
Hi Paul,

FYI, we noticed the below changes on git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/fixes
commit 5057f55e543b7859cfd26bc281291795eac93f8a ("rcu: Bind RCU grace-period kthreads if NO_HZ_FULL")

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.127e+09 ~ 0%     -23.5%  1.628e+09 ~ 4%  bens/qperf/600s
 2.127e+09 ~ 0%     -23.5%  1.628e+09 ~ 4%  TOTAL qperf.udp.recv_bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.128e+09 ~ 0%     -23.3%  1.633e+09 ~ 4%  bens/qperf/600s
 2.128e+09 ~ 0%     -23.3%  1.633e+09 ~ 4%  TOTAL qperf.udp.send_bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  bens/iperf/300s-tcp
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  TOTAL iperf.tcp.sender.bps

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  bens/iperf/300s-tcp
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  TOTAL iperf.tcp.receiver.bps

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 1.331e+09 ~ 2%      -5.8%  1.255e+09 ~ 2%  bens/qperf/600s
   2.4e+09 ~ 6%     -30.4%  1.671e+09 ~12%  brickland3/qperf/600s
 2.384e+09 ~ 7%     -12.1%  2.096e+09 ~ 3%  lkp-sb03/qperf/600s
 6.115e+09 ~ 5%     -17.9%  5.022e+09 ~ 6%  TOTAL qperf.sctp.bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  2.83e+09 ~ 1%     -12.5%  2.476e+09 ~ 3%  bens/qperf/600s
  2.83e+09 ~ 1%     -12.5%  2.476e+09 ~ 3%  TOTAL qperf.tcp.bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.272e+08 ~ 1%     -13.3%   1.97e+08 ~ 2%  bens/qperf/600s
 2.272e+08 ~ 1%     -13.3%   1.97e+08 ~ 2%  TOTAL proc-vmstat.pgalloc_dma32

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     53062 ~ 2%     -35.1%      34464 ~ 3%  bens/qperf/600s
    109531 ~13%     +46.9%     160928 ~ 5%  brickland3/qperf/600s
     67902 ~ 1%     +13.8%      77302 ~ 3%  lkp-sb03/qperf/600s
    230496 ~ 7%     +18.3%     272694 ~ 4%  TOTAL softirqs.RCU

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     80344 ~ 1%     -26.2%      59325 ~ 2%  bens/qperf/600s
     80344 ~ 1%     -26.2%      59325 ~ 2%  TOTAL softirqs.SCHED

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
      1036 ~ 4%     -17.6%        853 ~ 4%  brickland3/qperf/600s
      1036 ~ 4%     -17.6%        853 ~ 4%  TOTAL proc-vmstat.nr_page_table_pages

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     48.12 ~ 0%     -11.7%      42.46 ~ 6%  brickland3/qperf/600s
     48.12 ~ 0%     -11.7%      42.46 ~ 6%  TOTAL turbostat.%pc2

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  74689352 ~ 1%     -13.3%   64771743 ~ 2%  bens/qperf/600s
  74689352 ~ 1%     -13.3%   64771743 ~ 2%  TOTAL proc-vmstat.pgalloc_normal

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 3.019e+08 ~ 1%     -13.3%  2.618e+08 ~ 2%  bens/qperf/600s
 3.019e+08 ~ 1%     -13.3%  2.618e+08 ~ 2%  TOTAL proc-vmstat.pgfree

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  bens/qperf/600s
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  TOTAL proc-vmstat.numa_local

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  bens/qperf/600s
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  TOTAL proc-vmstat.numa_hit

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     12789 ~ 1%     -10.9%      11391 ~ 2%  bens/qperf/600s
     12789 ~ 1%     -10.9%      11391 ~ 2%  TOTAL softirqs.HRTIMER

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
    481253 ~ 0%      -8.9%     438624 ~ 0%  bens/qperf/600s
    481253 ~ 0%      -8.9%     438624 ~ 0%  TOTAL softirqs.TIMER

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
      1297 ~33%    +565.9%       8640 ~ 7%  bens/iperf/300s-tcp
      2788 ~ 3%    +588.8%      19204 ~ 4%  bens/qperf/600s
      1191 ~ 5%   +1200.9%      15493 ~ 4%  brickland3/qperf/600s
      1135 ~26%   +1195.9%      14709 ~ 4%  lkp-sb03/qperf/600s
      6411 ~13%    +805.3%      58047 ~ 4%  TOTAL time.involuntary_context_switches

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     72398 ~ 1%      -5.4%      68503 ~ 0%  bens/qperf/600s
      8789 ~ 4%     +22.3%      10749 ~15%  lkp-sb03/qperf/600s
     81187 ~ 1%      -2.4%      79253 ~ 2%  TOTAL vmstat.system.in

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
    141174 ~ 1%
[rcu] 5057f55e543: -23.5% qperf.udp.recv_bw
Hi Paul,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git rcu/fixes
commit 5057f55e543b7859cfd26bc281291795eac93f8a ("rcu: Bind RCU grace-period kthreads if NO_HZ_FULL")

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.127e+09 ~ 0%     -23.5%  1.628e+09 ~ 4%  bens/qperf/600s
 2.127e+09 ~ 0%     -23.5%  1.628e+09 ~ 4%  TOTAL qperf.udp.recv_bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.128e+09 ~ 0%     -23.3%  1.633e+09 ~ 4%  bens/qperf/600s
 2.128e+09 ~ 0%     -23.3%  1.633e+09 ~ 4%  TOTAL qperf.udp.send_bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  bens/iperf/300s-tcp
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  TOTAL iperf.tcp.sender.bps

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  bens/iperf/300s-tcp
 2.101e+10 ~ 2%     -18.7%  1.707e+10 ~ 2%  TOTAL iperf.tcp.receiver.bps

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 1.331e+09 ~ 2%      -5.8%  1.255e+09 ~ 2%  bens/qperf/600s
 2.4e+09   ~ 6%     -30.4%  1.671e+09 ~12%  brickland3/qperf/600s
 2.384e+09 ~ 7%     -12.1%  2.096e+09 ~ 3%  lkp-sb03/qperf/600s
 6.115e+09 ~ 5%     -17.9%  5.022e+09 ~ 6%  TOTAL qperf.sctp.bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.83e+09  ~ 1%     -12.5%  2.476e+09 ~ 3%  bens/qperf/600s
 2.83e+09  ~ 1%     -12.5%  2.476e+09 ~ 3%  TOTAL qperf.tcp.bw

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 2.272e+08 ~ 1%     -13.3%  1.97e+08  ~ 2%  bens/qperf/600s
 2.272e+08 ~ 1%     -13.3%  1.97e+08  ~ 2%  TOTAL proc-vmstat.pgalloc_dma32

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     53062 ~ 2%     -35.1%      34464 ~ 3%  bens/qperf/600s
    109531 ~13%     +46.9%     160928 ~ 5%  brickland3/qperf/600s
     67902 ~ 1%     +13.8%      77302 ~ 3%  lkp-sb03/qperf/600s
    230496 ~ 7%     +18.3%     272694 ~ 4%  TOTAL softirqs.RCU

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     80344 ~ 1%     -26.2%      59325 ~ 2%  bens/qperf/600s
     80344 ~ 1%     -26.2%      59325 ~ 2%  TOTAL softirqs.SCHED

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
      1036 ~ 4%     -17.6%        853 ~ 4%  brickland3/qperf/600s
      1036 ~ 4%     -17.6%        853 ~ 4%  TOTAL proc-vmstat.nr_page_table_pages

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     48.12 ~ 0%     -11.7%      42.46 ~ 6%  brickland3/qperf/600s
     48.12 ~ 0%     -11.7%      42.46 ~ 6%  TOTAL turbostat.%pc2

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  74689352 ~ 1%     -13.3%   64771743 ~ 2%  bens/qperf/600s
  74689352 ~ 1%     -13.3%   64771743 ~ 2%  TOTAL proc-vmstat.pgalloc_normal

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
 3.019e+08 ~ 1%     -13.3%  2.618e+08 ~ 2%  bens/qperf/600s
 3.019e+08 ~ 1%     -13.3%  2.618e+08 ~ 2%  TOTAL proc-vmstat.pgfree

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  bens/qperf/600s
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  TOTAL proc-vmstat.numa_local

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  bens/qperf/600s
  23538414 ~ 0%     -12.9%   20506157 ~ 2%  TOTAL proc-vmstat.numa_hit

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     12789 ~ 1%     -10.9%      11391 ~ 2%  bens/qperf/600s
     12789 ~ 1%     -10.9%      11391 ~ 2%  TOTAL softirqs.HRTIMER

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
    481253 ~ 0%      -8.9%     438624 ~ 0%  bens/qperf/600s
    481253 ~ 0%      -8.9%     438624 ~ 0%  TOTAL softirqs.TIMER

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
      1297 ~33%    +565.9%       8640 ~ 7%  bens/iperf/300s-tcp
      2788 ~ 3%    +588.8%      19204 ~ 4%  bens/qperf/600s
      1191 ~ 5%   +1200.9%      15493 ~ 4%  brickland3/qperf/600s
      1135 ~26%   +1195.9%      14709 ~ 4%  lkp-sb03/qperf/600s
      6411 ~13%    +805.3%      58047 ~ 4%  TOTAL time.involuntary_context_switches

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
     72398 ~ 1%      -5.4%      68503 ~ 0%  bens/qperf/600s
      8789 ~ 4%     +22.3%      10749 ~15%  lkp-sb03/qperf/600s
     81187 ~ 1%      -2.4%      79253 ~ 2%  TOTAL vmstat.system.in

71a9b26963f8c2d  5057f55e543b7859cfd26bc28
---------------  -------------------------
    141174 ~ 1%
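The percentage column in the comparison tables above is the plain relative change between the parent commit (left column) and the tested commit (right column). A quick sanity check on the first row, as a minimal sketch (the helper name is illustrative, not part of the LKP tooling):

```c
#include <assert.h>
#include <math.h>

/* Relative change as shown in the comparison tables above:
 * 100 * (new - old) / old, where old is the parent commit's value
 * and new is the tested commit's value. */
static double pct_change(double old_val, double new_val)
{
    return 100.0 * (new_val - old_val) / old_val;
}
```

For the first row, qperf.udp.recv_bw going from 2.127e+09 to 1.628e+09 gives about -23.46%, which the report rounds to -23.5%.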
Re: [PATCH] bio: decrease bi_iter.bi_size by len in the fail path
On 05/29/2014 01:44 AM, Ming Lei wrote:
> On Thu, May 29, 2014 at 1:21 AM, Maurizio Lombardi wrote:
>> Hi Ming,
>>
>> On Thu, May 29, 2014 at 12:59:19AM +0800, Ming Lei wrote:
>>> Actually, the correct thing may be like what did in the attached
>>> patch, as Maurizio discussed with me[1].
>>>
>>> Very interestingly, I have reproduced the problem one time with
>>> ext4/271 ext4/301 ext4/305, but won't with the attached patch after
>>> running it for 3 rounds.
>>>
>>> [tom@localhost xfstests]$ sudo ./check ext4/271 ext4/301 ext4/305
>>> FSTYP         -- ext4
>>> PLATFORM      -- Linux/x86_64 localhost 3.15.0-rc7-next-20140527+
>>> MKFS_OPTIONS  -- /dev/vdc
>>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /mnt/scratch
>>>
>>> ext4/271 1s ... 1s
>>> ext4/301 31s ... 32s
>>> ext4/305 181s ... 180s
>>> Ran: ext4/271 ext4/301 ext4/305
>>> Passed all 3 tests
>>>
>>> Jet, could you test the attached patch?
>>>
>>> [1], https://lkml.org/lkml/2014/5/27/327
>>
>> There is a little mistake in your patch, you removed
>>
>> 	bio->bi_iter.bi_size += len;
>>
>> after the "done" label, but be careful that at line 747 there is a
>> "goto done"... bi_size should be incremented before jumping there.
>
> Good catch, thanks Maurizio. Jet, please test the attached patch in
> this mail and ignore previous one.
>
> The story behind the patch should be like below:
>
> - one page is added in __bio_add_page() 'successfully', and
>   bio->bi_phys_segments is equal to queue_max_segments(q), but it
>   should have been rejected since the last vector isn't covered
>
> - next time, __bio_add_page() is called to add one page, but this time
>   blk_recount_segments() can figure out the actual physical segments
>   and find it is more than max segments, so failure is triggered, but
>   the bio->bi_phys_segments is updated with max segments plus one
>
> - the oops is triggered and reported by Jet, :-)
>
> Thanks,

This patch works, thanks.
Tested-by: Jet Chen

diff --git a/block/bio.c b/block/bio.c
index 0443694..f9bae56 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -744,6 +744,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 			}
 		}
+		bio->bi_iter.bi_size += len;
 		goto done;
 	}
 }
@@ -761,6 +762,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 	bvec->bv_offset = offset;
 	bio->bi_vcnt++;
 	bio->bi_phys_segments++;
+	bio->bi_iter.bi_size += len;

 	/*
 	 * Perform a recount if the number of segments is greater
@@ -802,7 +804,6 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 	bio->bi_flags &= ~(1 << BIO_SEG_VALID);

 done:
-	bio->bi_iter.bi_size += len;
 	return len;

 failed:
@@ -810,6 +811,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
 	bvec->bv_len = 0;
 	bvec->bv_offset = 0;
 	bio->bi_vcnt--;
+	bio->bi_iter.bi_size -= len;
 	blk_recount_segments(q, bio);
 	return 0;
 }
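The invariant the patch above establishes can be modeled in a few lines of userspace C: a failed add must leave every counter, including the byte count (`bi_iter.bi_size` in the real code), exactly as it found it. This is a sketch only; the struct and function names are illustrative, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>

struct bio_model {
    int vcnt;            /* models bio->bi_vcnt */
    unsigned int size;   /* models bio->bi_iter.bi_size */
};

/* 'fits' stands in for the max-segments check in __bio_add_page(). */
static bool bio_model_add(struct bio_model *b, unsigned int len, bool fits)
{
    b->vcnt++;           /* tentatively set up the new entry */
    b->size += len;      /* ...and account its bytes */
    if (!fits) {
        b->vcnt--;       /* failure: undo the entry */
        b->size -= len;  /* and the bytes (the line the patch adds) */
        return false;
    }
    return true;
}
```

A failed `bio_model_add()` leaves `vcnt` and `size` untouched, mirroring how the patch moves the `bi_size += len` updates onto the success paths and adds the matching `-= len` to the fail path.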
Re: [PATCH] bio: decrease bi_iter.bi_size by len in the fail path
On 05/29/2014 12:13 PM, Ming Lei wrote:
> On Thu, May 29, 2014 at 11:35 AM, Jet Chen wrote:
>> On 05/29/2014 12:59 AM, Ming Lei wrote:
>>> On Wed, May 28, 2014 at 11:42 PM, Ming Lei wrote:
>>>> Hi Dongsu,
>>>>
>>>> On Wed, May 28, 2014 at 11:09 PM, Dongsu Park wrote:
>>>>> From: Dongsu Park
>>>>>
>>>>> Commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d ("bio-modify-
>>>>> __bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3")
>>>>> introduced a regression as reported by Jet Chen.
>>>>> That results in a kernel BUG at drivers/block/virtio_blk.c:166.
>>>>>
>>>>> To fix that, bi_iter.bi_size must be decreased by len, before
>>>>> recounting the number of physical segments.
>>>>>
>>>>> Tested on with kernel 3.15.0-rc7-next-20140527 on qemu guest,
>>>>> by running xfstests/ext4/271.
>>>>>
>>>>> Cc: Jens Axboe
>>>>> Cc: Jet Chen
>>>>> Cc: Maurizio Lombardi
>>>>> Signed-off-by: Dongsu Park
>>>>> ---
>>>>>  block/bio.c | 1 +
>>>>>  1 file changed, 1 insertion(+)
>>>>>
>>>>> diff --git a/block/bio.c b/block/bio.c
>>>>> index 0443694ccbb4..67d7cba1e5fd 100644
>>>>> --- a/block/bio.c
>>>>> +++ b/block/bio.c
>>>>> @@ -810,6 +810,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>>>>  	bvec->bv_len = 0;
>>>>>  	bvec->bv_offset = 0;
>>>>>  	bio->bi_vcnt--;
>>>>> +	bio->bi_iter.bi_size -= len;
>>>>
>>>> Would you mind explaining why bi_iter.bi_size need to be
>>>> decreased by 'len'? In the failure path, it wasn't added by
>>>> 'len', was it?
>>>
>>> Actually, the correct thing may be like what did in the attached
>>> patch, as Maurizio discussed with me[1].
>>>
>>> Very interestingly, I have reproduced the problem one time with
>>> ext4/271 ext4/301 ext4/305, but won't with the attached patch after
>>> running it for 3 rounds.
>>>
>>> [tom@localhost xfstests]$ sudo ./check ext4/271 ext4/301 ext4/305
>>> FSTYP         -- ext4
>>> PLATFORM      -- Linux/x86_64 localhost 3.15.0-rc7-next-20140527+
>>> MKFS_OPTIONS  -- /dev/vdc
>>> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /mnt/scratch
>>>
>>> ext4/271 1s ... 1s
>>> ext4/301 31s ... 32s
>>> ext4/305 181s ... 180s
>>> Ran: ext4/271 ext4/301 ext4/305
>>> Passed all 3 tests
>>>
>>> Jet, could you test the attached patch?
>>
>> sorry, could you specify which patch need me to test ? actually I got
>> confused. I only find
>
> Firstly, dongsu's patch is wrong, and it doesn't make sense to test
> that.
>
> Secondly, it is the patch attached in my last email, and the name is
> 'fix_compute_segments.patch'. Please let me know if you can find the
> patch, if you still can't, I may resend to you.

Just got your email which attached that patch, thanks. I guess there is
some network problem on my side which leads to some latency.

Will test it out.

Thanks,
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] bio: decrease bi_iter.bi_size by len in the fail path
On 05/29/2014 12:59 AM, Ming Lei wrote:
> On Wed, May 28, 2014 at 11:42 PM, Ming Lei wrote:
>> Hi Dongsu,
>>
>> On Wed, May 28, 2014 at 11:09 PM, Dongsu Park wrote:
>>> From: Dongsu Park
>>>
>>> Commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d ("bio-modify-
>>> __bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3")
>>> introduced a regression as reported by Jet Chen.
>>> That results in a kernel BUG at drivers/block/virtio_blk.c:166.
>>>
>>> To fix that, bi_iter.bi_size must be decreased by len, before
>>> recounting the number of physical segments.
>>>
>>> Tested on with kernel 3.15.0-rc7-next-20140527 on qemu guest,
>>> by running xfstests/ext4/271.
>>>
>>> Cc: Jens Axboe
>>> Cc: Jet Chen
>>> Cc: Maurizio Lombardi
>>> Signed-off-by: Dongsu Park
>>> ---
>>>  block/bio.c | 1 +
>>>  1 file changed, 1 insertion(+)
>>>
>>> diff --git a/block/bio.c b/block/bio.c
>>> index 0443694ccbb4..67d7cba1e5fd 100644
>>> --- a/block/bio.c
>>> +++ b/block/bio.c
>>> @@ -810,6 +810,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>>>  	bvec->bv_len = 0;
>>>  	bvec->bv_offset = 0;
>>>  	bio->bi_vcnt--;
>>> +	bio->bi_iter.bi_size -= len;
>>
>> Would you mind explaining why bi_iter.bi_size need to be
>> decreased by 'len'? In the failure path, it wasn't added by
>> 'len', was it?
>
> Actually, the correct thing may be like what did in the
> attached patch, as Maurizio discussed with me[1].
>
> Very interestingly, I have reproduced the problem one time
> with ext4/271 ext4/301 ext4/305, but won't with the attached
> patch after running it for 3 rounds.
>
> [tom@localhost xfstests]$ sudo ./check ext4/271 ext4/301 ext4/305
> FSTYP         -- ext4
> PLATFORM      -- Linux/x86_64 localhost 3.15.0-rc7-next-20140527+
> MKFS_OPTIONS  -- /dev/vdc
> MOUNT_OPTIONS -- -o acl,user_xattr /dev/vdc /mnt/scratch
>
> ext4/271 1s ... 1s
> ext4/301 31s ... 32s
> ext4/305 181s ... 180s
> Ran: ext4/271 ext4/301 ext4/305
> Passed all 3 tests
>
> Jet, could you test the attached patch?

sorry, could you specify which patch need me to test ? actually I got
confused. I only find

  [PATCH V3] bio: modify __bio_add_page() to accept pages that don't
  start a new segment

in this mail thread. is it need to be tested ?

on next/master branch,

commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d
Author: Maurizio Lombardi
Date:   Sat May 17 23:17:30 2014 +1000

    bio-modify-__bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3

    Changes in V3:
    In case of error, V2 restored the previous number of segments but
    left the BIO_SEG_FLAG set. To avoid problems, after the page is
    removed from the bio vec, V3 performs a recount of the segments in
    the error code path.

    Signed-off-by: Maurizio Lombardi
    Cc: Al Viro
    Cc: Christoph Hellwig
    Cc: Kent Overstreet
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton

commit fceb38f36f4fecabf9ca33aa44a3f943f133cb78
Author: Maurizio Lombardi
Date:   Sat May 17 23:17:30 2014 +1000

    bio: modify __bio_add_page() to accept pages that don't start a new
    segment

    The original behaviour is to refuse to add a new page if the maximum
    number of segments has been reached, regardless of the fact the page we

3979ef4dcf3d1de55a560a3a4016c30a835df44d is the first bad commit.

> [1], https://lkml.org/lkml/2014/5/27/327
>
> Thanks,
Re: [PATCH] bio: decrease bi_iter.bi_size by len in the fail path
On 05/28/2014 11:09 PM, Dongsu Park wrote:
> From: Dongsu Park
>
> Commit 3979ef4dcf3d1de55a560a3a4016c30a835df44d ("bio-modify-
> __bio_add_page-to-accept-pages-that-dont-start-a-new-segment-v3")
> introduced a regression as reported by Jet Chen.
> That results in a kernel BUG at drivers/block/virtio_blk.c:166.
>
> To fix that, bi_iter.bi_size must be decreased by len, before
> recounting the number of physical segments.
>
> Tested on with kernel 3.15.0-rc7-next-20140527 on qemu guest,
> by running xfstests/ext4/271.
>
> Cc: Jens Axboe
> Cc: Jet Chen
> Cc: Maurizio Lombardi
> Signed-off-by: Dongsu Park
> ---
>  block/bio.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index 0443694ccbb4..67d7cba1e5fd 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -810,6 +810,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>  	bvec->bv_len = 0;
>  	bvec->bv_offset = 0;
>  	bio->bi_vcnt--;
> +	bio->bi_iter.bi_size -= len;
>  	blk_recount_segments(q, bio);
>  	return 0;
>  }

This patch fixes the problem.

Tested-by: Jet Chen
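The commit message above argues that `bi_iter.bi_size` must be rolled back before the recount. The intuition: the recount trusts the bio's own accounting, so a stale byte count makes it derive one segment too many. A toy stand-in (illustrative only; the real `blk_recount_segments()` walks the bio_vec array, but it likewise depends on the bio's state being consistent):

```c
#include <assert.h>

#define PAGE_SZ 4096u

/* Toy stand-in for a segment recount: derive the segment count from
 * the byte count alone. Not the kernel algorithm, just the dependency
 * the patch description points at. */
static unsigned int recount_segments(unsigned int size)
{
    return (size + PAGE_SZ - 1) / PAGE_SZ;
}
```

With three pages legitimately in the bio plus one rejected page still accounted, the recount reports 4 segments; after the rollback it reports the correct 3.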
Re: [jet.c...@intel.com: [bio] kernel BUG at drivers/block/virtio_blk.c:166!]
On 05/27/2014 07:24 PM, Maurizio Lombardi wrote:
> On Tue, May 27, 2014 at 10:43:59AM +0200, Maurizio Lombardi wrote:
>>
>> But now I'm suspicious of this part of commit 3979ef4dcf:
>>
>> failed:
>> 	bvec->bv_page = NULL;
>> 	bvec->bv_len = 0;
>> 	bvec->bv_offset = 0;
>> 	bio->bi_vcnt--;        <
>> 	blk_recount_segments(q, bio);
>> 	return 0;
>>
>> Is decreasing bi_vcnt sufficient to guarantee that
>> blk_recount_segments() recalculates the correct number of physical
>> segments? Looking at the __blk_recalc_rq_segments() it appears it may
>> not be the case.
>>
>> The question is how can we restore the correct number of physical
>> segments in case of failure without breaking anything...
>
> If my hypothesis is correct, the following patch should trigger a
> kernel panic, Jet Chen, can you try it and let me know whether the
> BUG_ON is hit or not?

Sorry for late respond. Dongsu has sent a patch for this issue.
message-id <1401289778-9840-1-git-send-email-dongsu.p...@profitbricks.com>

Do you still need me to test the following patch ?

> diff --git a/block/bio.c b/block/bio.c
> index 0443694..763868f 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -701,6 +701,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>  				unsigned int max_sectors)
>  {
>  	int retried_segments = 0;
> +	unsigned int phys_segments_orig;
>  	struct bio_vec *bvec;
>
>  	/*
> @@ -751,6 +752,9 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>  	if (bio->bi_vcnt >= bio->bi_max_vecs)
>  		return 0;
>
> +	blk_recount_segments(q, bio);
> +	phys_segments_orig = bio->bi_phys_segments;
> +
>  	/*
>  	 * setup the new entry, we might clear it again later if we
>  	 * cannot add the page
> @@ -811,6 +815,7 @@ static int __bio_add_page(struct request_queue *q, struct bio *bio, struct page
>  	bvec->bv_offset = 0;
>  	bio->bi_vcnt--;
>  	blk_recount_segments(q, bio);
> +	BUG_ON(phys_segments_orig != bio->bi_phys_segments);
>  	return 0;
>  }
>
> Regards,
> Maurizio Lombardi
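Maurizio's debug patch above is an instance of a general diagnostic pattern: snapshot a derived counter before a tentative operation, then assert the rollback restored it. A miniature userspace version (names are illustrative; `recount()` uses a toy rule, not the kernel's segment logic):

```c
#include <assert.h>

struct q_model {
    int vcnt;           /* entries currently in the model bio */
    int phys_segments;  /* derived counter, recomputed on demand */
};

/* Toy recount rule: one segment per entry. */
static void recount(struct q_model *m)
{
    m->phys_segments = m->vcnt;
}

/* Tentatively add an entry; 'fail' simulates the rejection path. */
static int try_add(struct q_model *m, int fail)
{
    int phys_segments_orig;

    recount(m);
    phys_segments_orig = m->phys_segments;  /* snapshot, as in the patch */

    m->vcnt++;                              /* tentative new entry */
    if (fail) {
        m->vcnt--;                          /* rollback */
        recount(m);
        /* the patch's BUG_ON: the rollback must be exact */
        assert(phys_segments_orig == m->phys_segments);
        return 0;
    }
    recount(m);
    return 1;
}
```

In the real code the assertion fired because `bi_iter.bi_size` was not part of the rollback, so the post-failure recount disagreed with the snapshot; in this minimal model the rollback is complete and the assertion holds.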
[vfs] 662aa027bda: -7.2% will-it-scale.scalability
Hi Miklos,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs.v22
commit 662aa027bdaa082aa3dd21886830c97a1dd4c184 ("vfs: add RENAME_WHITEOUT")

test case: lkp-wsx01/will-it-scale/lseek2

9f498c77b4332b4  662aa027bdaa082aa3dd21886
---------------  -------------------------
      0.63 ~ 0%      -7.2%       0.58 ~ 0%  TOTAL will-it-scale.scalability
  12861563 ~ 0%      -1.1%   12715565 ~ 0%  TOTAL will-it-scale.per_process_ops
       958 ~11%     -16.9%        796 ~18%  TOTAL numa-meminfo.node3.PageTables
      3013 ~11%     +20.9%       3643 ~ 4%  TOTAL numa-vmstat.node2.nr_slab_reclaimable
     12055 ~11%     +20.9%      14574 ~ 4%  TOTAL numa-meminfo.node2.SReclaimable
      1734 ~ 5%     -10.6%       1550 ~ 4%  TOTAL slabinfo.sock_inode_cache.active_objs
      1734 ~ 5%     -10.6%       1550 ~ 4%  TOTAL slabinfo.sock_inode_cache.num_objs

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

[will-it-scale.scalability chart: bisect-good samples (*) fluctuate
around 0.62-0.63 across the parent commits; bisect-bad samples (O)
cluster around 0.58]

[*] bisect-good sample
[O] bisect-bad sample

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
echo performance >
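The reproduce script above repeats one command per CPU; the only varying part is the CPU index in the sysfs path. A small C sketch of that path construction (illustrative helper, assuming the cpufreq sysfs layout shown in the script):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build the sysfs governor path for one CPU, as the script above does
 * per line; looping over CPU indices and writing "performance" to each
 * path replaces the long run of echo commands. */
static int governor_path(char *buf, size_t n, int cpu)
{
    return snprintf(buf, n,
        "/sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor", cpu);
}
```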
[vfs] 662aa027bda: -7.2% will-it-scale.scalability
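The reproduce scripts in these reports set the scaling governor one CPU at a time. The same commands can be generated with a small loop; the sketch below only prints them (the `emit_governor_cmds` helper name is invented here), so it can be reviewed before being piped to `sh` as root on a cpufreq-capable machine.

```shell
# Print (not execute) the per-CPU governor writes used by the reproduce
# scripts above. emit_governor_cmds is a hypothetical helper; pipe its
# output to sh as root to actually apply the setting.
emit_governor_cmds() {
    i=0
    while [ "$i" -lt "$1" ]; do
        printf 'echo performance > /sys/devices/system/cpu/cpu%d/cpufreq/scaling_governor\n' "$i"
        i=$((i + 1))
    done
}

emit_governor_cmds 4
```

Printing rather than writing keeps the sketch safe to run anywhere; on a test box one would run `emit_governor_cmds "$(nproc)" | sh`.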
Re: [staging: r8192ee] WARNING: CPU: 0 PID: 1 at net/mac80211/rate.c:43 ieee80211_rate_control_register()
On 05/27/2014 01:07 AM, Larry Finger wrote:
> On 05/26/2014 09:55 AM, Jet Chen wrote:
>
> Jet,
>
>> 0day kernel testing robot got the below dmesg and the first bad commit is
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git staging-next
>> commit 0629f3b8c33899140b48d5897259eab8ebae78ca
>> Author:     Larry Finger
>> AuthorDate: Wed May 21 16:25:36 2014 -0500
>> Commit:     Greg Kroah-Hartman
>> CommitDate: Fri May 23 11:33:56 2014 +0900
>>
>>     staging: r8192ee: Turn on build of the new driver
>>
>>     In addition, this commit contains a TODO file for this driver
>>
>>     Signed-off-by: Larry Finger
>>     Signed-off-by: Greg Kroah-Hartman
>
> The splat comes from the driver trying to register a rate-control algorithm
> that is already registered. That could happen because a driver is calling the
> registration routine twice, or because more than one driver is using the same
> routine name. As I think there is code to prevent the former, and I have never
> seen that splat here, I suspect that more than one driver has used the name. On
> my real hardware, I can only have one of these devices in the machine at a time.
>
> Is it possible to test the attached patch to see if it fixes the problem? If
> not, could you get a listing of the loaded modules at the time of the splat? If
> this is not the fix, I will try to duplicate it here.
>
> Thanks,
> Larry

That patch fixes the problem.

Tested-by: Jet Chen

Thanks,
Jet

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [net] WARNING: CPU: 1 PID: 1 at net/batman-adv/hard-interface.c:92 batadv_is_on_batman_iface()
On 05/22/2014 02:12 PM, Cong Wang wrote:
> On Wed, May 21, 2014 at 9:42 PM, Jet Chen wrote:
>> Hi Steffen,
>>
>> 0day kernel testing robot got the below dmesg and the first bad commit is
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master
>> commit 78ff4be45a4c51d8fb21ad92e4fabb467c6c3eeb
>> Author:     Steffen Klassert
>> AuthorDate: Mon May 19 11:36:56 2014 +0200
>> Commit:     David S. Miller
>> CommitDate: Wed May 21 02:08:32 2014 -0400
>>
>>     ip_tunnel: Initialize the fallback device properly
>>
>>     We need to initialize the fallback device to have a correct mtu set on
>>     this device. Otherwise the mtu is set to null and the device is unusable.
>>
>>     Fixes: fd58156e456d ("IPIP: Use ip-tunneling code.")
>>     Cc: Pravin B Shelar
>>     Signed-off-by: Steffen Klassert
>>     Signed-off-by: David S. Miller
>>
>> +--------------------------------------------------------------------------------+------------+------------+
>> |                                                                                | d8d33c3b8a | 78ff4be45a |
>> +--------------------------------------------------------------------------------+------------+------------+
>> | boot_successes                                                                 | 60         | 0          |
>> | boot_failures                                                                  | 0          | 20         |
>> | WARNING:CPU:PID:at_net/batman-adv/hard-interface.c:batadv_is_on_batman_iface() | 0          | 20         |
>> | backtrace:register_netdevice_notifier                                          | 0          | 20         |
>> | backtrace:batadv_init                                                          | 0          | 20         |
>> | backtrace:kernel_init_freeable                                                 | 0          | 20         |
>> +--------------------------------------------------------------------------------+------------+------------+
>
> batman needs to fix:
>
> diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
> index b851cc5..fbda6b5 100644
> --- a/net/batman-adv/hard-interface.c
> +++ b/net/batman-adv/hard-interface.c
> @@ -83,7 +83,7 @@ static bool batadv_is_on_batman_iface(const struct net_device *net_dev)
>  		return true;
>
>  	/* no more parents..stop recursion */
> -	if (net_dev->iflink == net_dev->ifindex)
> +	if (net_dev->iflink == 0 || net_dev->iflink == net_dev->ifindex)
>  		return false;
>
>  	/* recurse over the parent device */

Your patch fixes that issue.

Tested-by: Jet Chen

Thanks,
Jet
[x86, vdso] cfda7bb9ecb: +14.7% will-it-scale.per_thread_ops
Hi Andy,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/vdso
commit cfda7bb9ecbf9d96264bb5bade33a842966d1062 ("x86, vdso: Move syscall and sysenter setup into kernel/cpu/common.c")

test case: nhm4/will-it-scale/sched_yield

3d7ee969bffcc98  cfda7bb9ecbf9d96264bb5bad
---------------  -------------------------
   5497021 ~ 0%     +14.7%    6303424 ~ 0%  TOTAL will-it-scale.per_thread_ops
      0.54 ~ 0%      +5.6%       0.57 ~ 0%  TOTAL will-it-scale.scalability
   6209483 ~ 0%      +1.6%    6305917 ~ 0%  TOTAL will-it-scale.per_process_ops
      2455 ~ 5%     +16.9%       2870 ~ 5%  TOTAL cpuidle.C1-NHM.usage
      8829 ~ 7%     +15.2%      10169 ~10%  TOTAL slabinfo.kmalloc-64.active_objs
     24.13 ~12%     +48.9%      35.93 ~14%  TOTAL time.user_time
       393 ~ 0%      -3.0%        382 ~ 1%  TOTAL time.system_time

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
./runtest.py sched_yield 32 1 4 6 8
Re: [goldfish] genirq: Flags mismatch irq 4. 00000000 (serial) vs. 00000080 (goldfish_pdev_bus)
forward to alan@intel.com

On 05/19/2014 10:46 AM, Jet Chen wrote:

Hi Alan,

0day kernel testing robot got the below dmesg and the first bad commit is

git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git staging-next
commit 9b17aeec232a5f0a61ce3952c2e728a0eeddda8b
Author:     Alan
AuthorDate: Mon May 12 16:55:35 2014 +0100
Commit:     Greg Kroah-Hartman
CommitDate: Thu May 15 13:19:01 2014 -0700

    goldfish: Allow 64bit builds

    We can now enable the 64bit option for the Goldfish 64bit emulator.

    Signed-off-by: Alan Cox
    Signed-off-by: Greg Kroah-Hartman

+----------------------------------------------------------+------------+------------+
|                                                          | b8658bc810 | 9b17aeec23 |
+----------------------------------------------------------+------------+------------+
| boot_successes                                           | 60         | 0          |
| boot_failures                                            | 0          | 20         |
| genirq:Flags_mismatch_irq.(serial)vs.(goldfish_pdev_bus) | 0          | 20         |
+----------------------------------------------------------+------------+------------+

[8.668148] ALSA device list:
[8.668419] #0: Loopback 1
[8.668677] #1: MTPAV on parallel port at 0x378
[8.669432] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.671901] Freeing unused kernel memory: 1196K (82bf - 82d1b000)
[8.672607] Write protecting the kernel read-only data: 18432k
[8.673681] Freeing unused kernel memory: 80K (880001bec000 - 880001c0)
[8.677081] Freeing unused kernel memory: 1076K (8800020f3000 - 88000220)
[8.678569] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.679593] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.680682] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.681640] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.682609] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.684017] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.685239] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.692514] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.693570] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.694499] genirq: Flags mismatch irq 4. (serial) vs. 0080 (goldfish_pdev_bus)
[8.765477] gfs2: path_lookup on rootfs returned error -2
[ 60.373367] spin_lock-torture: Writes: Total: 4 Max/Min: 0/0 Fail: 0
[ 69.215881] random: nonblocking pool is initialized

git bisect start 0856ad1ef175c45dc6a0bc629ca355fdfbb1001e v3.14 --
git bisect good 200bde278d78318cbd10cd865f94016924e76c56  # 16:16  20+  0  Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 9ae8df66e6742271b49a1518c7bf0debdfc523cf  # 16:32  20+  0  Merge remote-tracking branch 'devicetree-current/devicetree/merge'
git bisect good 77645accb67841c8557483afe029ec80a240d636  # 16:39  20+  0  Merge remote-tracking branch 'drm/drm-next'
git bisect good 21dd17170d63646851debe993194e46811dbdb1b  # 16:47  20+  0  Merge remote-tracking branch 'ftrace/for-next'
git bisect good 2fa6c75b02c032252eaf3c0b8885ddb1b9ad4840  # 16:55  20+  0  Merge remote-tracking branch 'usb-gadget/next'
git bisect  bad 7dc48a5c223dddebc2bb263d6e77edb557c11f6f  # 17:00   0-  7  Merge remote-tracking branch 'dma-mapping/dma-mapping-next'
git bisect  bad a75b5bfa061c14e14261aed3771e2b174df72450  # 17:09   0-  8  Merge remote-tracking branch 'cgroup/for-next'
git bisect  bad 47bb948ff65de22da21d235cc768bba7802ad45f  # 17:15   0- 12  Merge remote-tracking branch 'staging/staging-next'
git bisect good d5cef008e95ffdaf457e38254e67d145a927df96  # 17:25  20+  0  Merge tag 'iio-for-3.16a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-next
git bisect good a1f3b3fdba2ab124c1e8c2b6aa17cfcf5c99d756  # 17:34  20+  0  staging: rtl8188eu: Remove 'u32 ref_cnt' from struct recv_buf
git bisect good be4c50604d22f0fd2c65853af18292f44d784b86  # 17:45  20+  0  staging: rtl8723au: Eliminate RTW_STATUS_CODE23a()
git bisect  bad 2b46be68a1b67f40df4f288b90e79181c387eda6  # 17:53   0-  8  staging: dgnc: cleanup dgnc_finalize_board_init()
git bisect good c7fff4e43ced0584bcbcef8a5f93bd9a813aac1b  # 18:01  20+  0  staging: rtl8723au: struct phy_info and struct odm_phy_info are identical
git bisect  bad f6279717bbb20bf90ec414af17d2a31d843f5eb5  # 18:12   0- 10  goldfish: clean up staging ifdefs
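The hand-driven good/bad log above is what `git bisect run` automates when given a scriptable pass/fail test (for 0day, a boot test). Below is a scaled-down, self-contained sketch: a throwaway five-commit repository in which a fake "boot test" fails from commit 4 onward, so bisect converges on commit 4. The `state` file, the user identity, and the predicate are all invented for this demo.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Five commits; the tracked file records which commit we are at.
for i in 1 2 3 4 5; do
    echo "$i" > state
    git add state
    git -c user.name=demo -c user.email=demo@example.com commit -qm "commit $i"
done
# HEAD (commit 5) is known bad, HEAD~4 (commit 1) known good.
git bisect start HEAD HEAD~4 > /dev/null
# The fake boot test: exit 0 (good) while state < 4, non-zero (bad) after.
git bisect run sh -c 'test "$(cat state)" -lt 4' > /dev/null
# refs/bisect/bad now names the first bad commit.
git show -s --format=%s refs/bisect/bad
```

With the predicate above, bisect needs only O(log n) test runs to isolate the first bad commit, which is why the robot can bisect thousands of commits with a few dozen boots.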
[cgroup] a0f9ec1f181: -4.3% will-it-scale.per_thread_ops
Hi Tejun,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex
commit a0f9ec1f181534694cb5bf40b7b56515b8cabef9 ("cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods")

Test case: lkp-nex05/will-it-scale/writeseek

2074b6e38668e62  a0f9ec1f181534694cb5bf40b
---------------  -------------------------
   1027273 ~ 0%      -4.3%     982732 ~ 0%  TOTAL will-it-scale.per_thread_ops
       136 ~ 3%     -43.1%         77 ~43%  TOTAL proc-vmstat.nr_dirtied
      0.51 ~ 3%     +98.0%       1.01 ~ 4%  TOTAL perf-profile.cpu-cycles.shmem_write_end.generic_perform_write.__generic_file_aio_write.generic_file_aio_write.do_sync_write
      1078 ~ 9%     -16.3%        903 ~11%  TOTAL numa-meminfo.node0.Unevictable
       269 ~ 9%     -16.2%        225 ~11%  TOTAL numa-vmstat.node0.nr_unevictable
      1.64 ~ 1%     -14.3%       1.41 ~ 4%  TOTAL perf-profile.cpu-cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_aio_write
      1.62 ~ 2%     +14.1%       1.84 ~ 1%  TOTAL perf-profile.cpu-cycles.lseek64

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
echo
[tracing] b1169cc69ba: +10060.4% proc-vmstat.numa_pte_updates
Hi Steven,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git ftrace/core
commit b1169cc69ba96b124df820904a6d3eb775491d7f ("tracing: Remove mock up poll wait function")

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
       191 ~15%  +10060.4%      19487 ~ 8%  lkp-ne04/dd-write/1HDD-ext4-2dd.4k
       191 ~15%  +10060.4%      19487 ~ 8%  TOTAL proc-vmstat.numa_pte_updates

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
   2419721 ~ 0%     +88.0%    4549654 ~ 2%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
   2419721 ~ 0%     +88.0%    4549654 ~ 2%  TOTAL perf-stat.context-switches

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
       954 ~ 0%     +52.7%       1457 ~ 0%  lkp-ne04/dd-write/1HDD-ext4-2dd.4k
      3932 ~ 0%     +89.1%       7438 ~ 2%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
      4887 ~ 0%     +82.0%       8895 ~ 2%  TOTAL vmstat.system.cs

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
    163477 ~ 0%     +60.1%     261769 ~ 0%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
    163477 ~ 0%     +60.1%     261769 ~ 0%  TOTAL perf-stat.cpu-migrations

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
     28742 ~ 0%      +6.4%      30574 ~ 0%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
     28742 ~ 0%      +6.4%      30574 ~ 0%  TOTAL vmstat.system.in

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
mkfs -t ext4 -q /dev/sda2
echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable
mount -t ext4 /dev/sda2 /fs/sda2
dd bs=4k if=/dev/zero of=/fs/sda2/zero-1 status=none &
dd bs=4k if=/dev/zero of=/fs/sda2/zero-2 status=none &
sleep 600
killall -9 dd
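For local experimentation, the dd portion of the reproduce script can be scaled down: two concurrent 4k-block writers into a temporary directory instead of /fs/sda2, bounded by `count=` instead of the 600-second sleep plus `killall`. Only the dd invocation style matches the script above; the directory and sizes here are chosen for the sketch.

```shell
# Two concurrent sequential writers, 256 x 4k blocks = 1 MiB each,
# into a throwaway directory (no raw device or mount needed).
dir=$(mktemp -d)
dd bs=4k count=256 if=/dev/zero of="$dir/zero-1" status=none &
dd bs=4k count=256 if=/dev/zero of="$dir/zero-2" status=none &
wait
```

Because the writers are bounded, `wait` replaces the sleep/killall pair and the run is deterministic.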
[uprobes/x86] 8ad8e9d3fd6: -7.5% aim7.2000.jobs-per-min, -45.7% turbostat.%c1
Hi Oleg,

we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit 8ad8e9d3fd64f101eed6652964670672d699e563 ("uprobes/x86: Introduce uprobe_xol_ops and arch_uprobe->ops")

Test case: lkp-snb01/aim7/signal_test

34e7317d6ae8f61  8ad8e9d3fd64f101eed665296
---------------  -------------------------
    230689 ~ 0%      -7.5%     213485 ~ 0%  TOTAL aim7.2000.jobs-per-min
      0.51 ~30%     -45.7%       0.28 ~10%  TOTAL turbostat.%c1
       430 ~17%     +27.8%        549 ~15%  TOTAL vmstat.procs.r
      0.83 ~14%     +38.1%       1.15 ~12%  TOTAL perf-profile.cpu-cycles.copy_pte_range.copy_page_range.copy_process.do_fork.sys_clone
    106076 ~ 4%     +22.4%     129816 ~ 3%  TOTAL softirqs.RCU
     12117 ~ 6%      -9.9%      10914 ~ 6%  TOTAL slabinfo.kmalloc-256.active_objs
     32163 ~17%     +39.4%      44824 ~12%  TOTAL time.voluntary_context_switches
    276487 ~ 1%     +16.5%     322091 ~ 1%  TOTAL time.involuntary_context_switches
     83.14 ~ 0%     +13.3%      94.21 ~ 0%  TOTAL time.user_time
    108800 ~ 2%      +9.4%     119014 ~ 3%  TOTAL time.minor_page_faults
      1255 ~ 0%      +9.9%       1379 ~ 0%  TOTAL time.system_time
      6774 ~ 1%      +8.8%       7373 ~ 1%  TOTAL vmstat.system.cs
     52.15 ~ 0%      +8.0%      56.34 ~ 0%  TOTAL time.elapsed_time
     25185 ~ 0%      +2.4%      25784 ~ 0%  TOTAL vmstat.system.in
     79.23 ~ 0%      +2.0%      80.80 ~ 0%  TOTAL turbostat.%c0
      2567 ~ 0%      +1.9%       2615 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
       112 ~ 0%      +1.3%        113 ~ 0%  TOTAL turbostat.Cor_W
       138 ~ 0%      +1.0%        139 ~ 0%  TOTAL turbostat.Pkg_W

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
[tracing] b1169cc69ba: +10060.4% proc-vmstat.numa_pte_updates
Hi Steven,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git ftrace/core
commit b1169cc69ba96b124df820904a6d3eb775491d7f ("tracing: Remove mock up poll wait function")

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
       191 ~15%  +10060.4%      19487 ~ 8%  lkp-ne04/dd-write/1HDD-ext4-2dd.4k
       191 ~15%  +10060.4%      19487 ~ 8%  TOTAL proc-vmstat.numa_pte_updates

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
   2419721 ~ 0%     +88.0%    4549654 ~ 2%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
   2419721 ~ 0%     +88.0%    4549654 ~ 2%  TOTAL perf-stat.context-switches

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
       954 ~ 0%     +52.7%       1457 ~ 0%  lkp-ne04/dd-write/1HDD-ext4-2dd.4k
      3932 ~ 0%     +89.1%       7438 ~ 2%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
      4887 ~ 0%     +82.0%       8895 ~ 2%  TOTAL vmstat.system.cs

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
    163477 ~ 0%     +60.1%     261769 ~ 0%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
    163477 ~ 0%     +60.1%     261769 ~ 0%  TOTAL perf-stat.cpu-migrations

f4874261049e3ab  b1169cc69ba96b124df820904
---------------  -------------------------
     28742 ~ 0%      +6.4%      30574 ~ 0%  lkp-ws02/dd-write/11HDD-JBOD-cfq-xfs-1dd
     28742 ~ 0%      +6.4%      30574 ~ 0%  TOTAL vmstat.system.in

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
mkfs -t ext4 -q /dev/sda2
echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable
mount -t ext4 /dev/sda2 /fs/sda2
dd bs=4k if=/dev/zero of=/fs/sda2/zero-1 status=none &
dd bs=4k if=/dev/zero of=/fs/sda2/zero-2 status=none &
sleep 600
killall -9 dd
[cgroup] a0f9ec1f181: -4.3% will-it-scale.per_thread_ops
Hi Tejun,

FYI, we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git review-kill-tree_mutex
commit a0f9ec1f181534694cb5bf40b7b56515b8cabef9 ("cgroup: use cgroup_kn_lock_live() in other cgroup kernfs methods")

Test case: lkp-nex05/will-it-scale/writeseek

2074b6e38668e62  a0f9ec1f181534694cb5bf40b
---------------  -------------------------
   1027273 ~ 0%      -4.3%     982732 ~ 0%  TOTAL will-it-scale.per_thread_ops
       136 ~ 3%     -43.1%         77 ~43%  TOTAL proc-vmstat.nr_dirtied
      0.51 ~ 3%     +98.0%       1.01 ~ 4%  TOTAL perf-profile.cpu-cycles.shmem_write_end.generic_perform_write.__generic_file_aio_write.generic_file_aio_write.do_sync_write
      1078 ~ 9%     -16.3%        903 ~11%  TOTAL numa-meminfo.node0.Unevictable
       269 ~ 9%     -16.2%        225 ~11%  TOTAL numa-vmstat.node0.nr_unevictable
      1.64 ~ 1%     -14.3%       1.41 ~ 4%  TOTAL perf-profile.cpu-cycles.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_aio_write
      1.62 ~ 2%     +14.1%       1.84 ~ 1%  TOTAL perf-profile.cpu-cycles.lseek64

Legend:
	~XX%    - stddev percent
	[+-]XX% - change percent

Thanks,
Jet

echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu16/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu17/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu18/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu19/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu20/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu21/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu22/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu23/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu24/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu25/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu26/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu27/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu28/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu29/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu30/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu31/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu32/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu33/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu34/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu35/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu36/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu37/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu38/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu39/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu40/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu41/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu42/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu43/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu44/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu45/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu46/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu47/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu48/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu49/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu50/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu51/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu52/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu53/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu54/cpufreq/scaling_governor
echo performance
Re: [sched] BUG: unable to handle kernel paging request at 093cd001
On 05/12/2014 10:43 PM, Vincent Guittot wrote:
> Hi,
>
> Sorry the previous patch, that i sent, becomes wrong after optimizing it
> Could you please try this one ?

Hi Vincent, this patch works. Thanks.

Tested-by: Jet Chen <jet.c...@intel.com>

> Regards,
> Vincent
> ---
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4ea7b3f..205fa17 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6234,7 +6234,7 @@ static void sched_init_numa(void)
>  	/* Compute default topology size */
>  	for (i = 0; sched_domain_topology[i].mask; i++);
>
> -	tl = kzalloc((i + level) *
> +	tl = kzalloc((i + level + 1) *
>  			sizeof(struct sched_domain_topology_level), GFP_KERNEL);
>  	if (!tl)
>  		return;

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [sched] BUG: unable to handle kernel paging request at 093cd001
On 05/12/2014 05:52 PM, Vincent Guittot wrote:
> Hi,
>
> Does this patch solve your issue ?

Sorry, this patch doesn't work. The issue is still there.

[    0.036000] x86: Booting SMP configuration:
[    0.036000] node #0, CPUs: #1
[    0.004000] Initializing CPU#1
[    0.008000] kvm-clock: cpu 1, msr 0:13ffb081, secondary cpu clock
[    0.008000] masked ExtINT on CPU#1
[    0.008000] numa_add_cpu cpu 1 node 0: mask now 0-1
[    0.052085] x86: Booted up 1 node, 2 CPUs
[    0.052023] KVM setup async PF for cpu 1
[    0.052023] kvm-stealtime: cpu 1, msr 13851980
[    0.053075] smpboot: Total of 2 processors activated (10774.44 BogoMIPS)
[    0.053878] BUG: unable to handle kernel paging request at 02f63001
[    0.054403] IP: [<d026eba3>] build_sched_domains+0x252/0x1545
[    0.054863] *pdpt = *pde = f000ff53f000ff53
[    0.055337] Oops: [#1] SMP
[    0.055616] Modules linked in:
[    0.055871] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.15.0-rc2-00066-g7a15434 #1
[    0.056000] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[    0.056000] task: d305 ti: d304c000 task.ti: d304c000
[    0.056000] EIP: 0060:[<d026eba3>] EFLAGS: 00010202 CPU: 0
[    0.056000] EIP is at build_sched_domains+0x252/0x1545
[    0.056000] EAX: 0001 EBX: d30031f0 ECX: 001e EDX: 02f63000
[    0.056000] ESI: EDI: d30031f0 EBP: d304df44 ESP: d304dee0
[    0.056000] DS: 007b ES: 007b FS: 00d8 GS: SS: 0068
[    0.056000] CR0: 8005003b CR2: 02f63001 CR3: 108ed000 CR4: 06f0
[    0.056000] Stack:
[    0.056000]  0002 d026fd5f 00d0 d30066d0 0002 d03b9d70
[    0.056000]  d3003180 d3003200 d30066c0 d3006360 d30031d0 d30066c8
[    0.056000]  d30031c0 d3003200 d08e6d14 d3041000 d30066c0 0008
[    0.056000] Call Trace:
[    0.056000]  [<d026fd5f>] ? build_sched_domains+0x140e/0x1545
[    0.056000]  [<d03b9d70>] ? alloc_cpumask_var_node+0x1f/0x77
[    0.056000]  [<d074ec9a>] sched_init_smp+0x350/0x3c9
[    0.056000]  [<d0737b85>] kernel_init_freeable+0x6e/0x187
[    0.056000]  [<d0266d9d>] ? finish_task_switch+0x3e/0xfa
[    0.056000]  [<d04fda4a>] kernel_init+0xb/0xcc
[    0.056000]  [<d05120c1>] ret_from_kernel_thread+0x21/0x30
[    0.056000]  [<d04fda3f>] ? rest_init+0xbf/0xbf
[    0.056000] Code: 00 31 73 d0 8b 0c 11 85 c9 74 0a f6 41 3d 20 0f 85 b9 00 00 00 8b 04 02 e8 80 5f 0b 00 8b 43 04 85 c0 74 0f 8b 14 b5 00 31 73 d0 <8b> 04 10 e8 6a 5f 0b 00 8b 43 08 85 c0 74 0f 8b 14 b5 00 31 73
[    0.056000] EIP: [<d026eba3>] build_sched_domains+0x252/0x1545 SS:ESP 0068:d304dee0
[    0.056000] CR2: 02f63001
[    0.056000] ---[ end trace 528da1c27c66605c ]---
[    0.056000] Kernel panic - not syncing: Fatal exception

> A null line is missing at the end of the array for NUMA case.
> My test was passed thanks to a null data after the allocated array
>
> regards,
> Vincent
> ---
>  kernel/sched/core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 4ea7b3f..941da33 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6232,7 +6232,7 @@ static void sched_init_numa(void)
>  	}
>
>  	/* Compute default topology size */
> -	for (i = 0; sched_domain_topology[i].mask; i++);
> +	for (i = 1; sched_domain_topology[i].mask; i++);
>
>  	tl = kzalloc((i + level) *
>  			sizeof(struct sched_domain_topology_level), GFP_KERNEL);
Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
On 05/09/2014 12:13 AM, Jiang, Dave wrote:
> On Thu, 2014-05-08 at 08:57 -0700, Alexander Duyck wrote:
>> On 05/08/2014 08:28 AM, Jet Chen wrote:
>>> I agree with your opinion that it is a real BIOS bug and it puts pressure on
>>> the BIOS guys to get this fixed. However, this warning message interferes
>>> with our kernel booting tests and kernel performance tests. We have to
>>> disable CONFIG_INTEL_IOATDMA in kconfig until this issue gets fixed. Before
>>> that, code behind CONFIG_INTEL_IOATDMA will not be validated in our testing
>>> system :(.
>>> Hope this issue could get fixed soon.
>>>
>>> Thanks,
>>> Jet
>>
>> First I would recommend updating your BIOS. If the updated BIOS also
>> has the issue I would recommend taking this feedback to whoever provided
>> the BIOS for your platform so that they can implement the fix.
>>
>> If I am not mistaken some BIOSes have the option to disable DCA and/or
>> IOATDMA. You might want to check yours to see if you can just disable
>> DCA on your platform until the issue can be resolved.
>
> Disabling DCA is the preferred option. IOATDMA is functional without
> DCA.
>
> Jet,
> What exactly are you attempting to test with IOATDMA? The only two
> consumers of this DMA driver I know of are MDRAID and NTB. But support
> for XOR/PQ ops on Xeon platforms have been removed due to various
> reasons recently so it really is just NTB at the moment in the latest
> kernels.

We are running LKP to test kernel boot and performance. More information can be found at https://01.org/lkp/

This issue shows up in our boot testing with certain kconfigs and impacts many of our test machines. It is difficult to update the BIOS on all test boxes, and we are not yet sure that a workable BIOS version exists for every test box model. We will consider disabling DCA as you suggest.

Thanks,
Jet

>> Thanks,
>>
>> Alex
Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
I agree with your opinion that it is a real BIOS bug and it puts pressure on the BIOS guys to get this fixed. However, this warning message interferes with our kernel booting tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in kconfig until this issue gets fixed. Before that, code behind CONFIG_INTEL_IOATDMA will not be validated in our testing system :(.
Hope this issue could get fixed soon.

Thanks,
Jet

On 05/08/2014 11:04 PM, Alexander Duyck wrote:
> I actually went to a bit of trouble to get this message added as many
> BIOSes have the annoying quality of getting this wrong, and then
> products are shipped and labeled as having the DCA feature when they
> actually don't. One easy way to get rid of the message is to disable
> either DCA or IOAT in the BIOS since it is broken anyway. By adding
> this we at least have some visibility and it puts pressure on the BIOS
> guys to get this fixed if we want to claim the platform does DCA.
>
> I consider this to be a real BIOS bug as the DCA feature is crippled
> without it. Also moving this to just a debug message is going to make
> it very difficult for us to debug this when a performance issue comes up
> on a customer platform as we will have to get them to perform extra
> steps in order to actually figure out what is going on with DCA.
>
> Thanks,
>
> Alex
>
> On 05/08/2014 02:42 AM, Jet Chen wrote:
>> We keep seeing such dmesg messages on boxes
>>
>> [   16.596610] WARNING: CPU: 0 PID: 457 at drivers/dma/ioat/dca.c:697 ioat3_dca_init+0x19c/0x1b0 [ioatdma]()
>> [   16.609614] ioatdma 0000:00:04.0: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA
>> ...
>> [   16.892058]  [<ffffffff8172807e>] dump_stack+0x4d/0x66
>> [   16.892061]  [<ffffffff81067f7d>] warn_slowpath_common+0x7d/0xa0
>> [   16.892064]  [<ffffffff81068034>] warn_slowpath_fmt_taint+0x44/0x50
>> [   16.892065]  [<ffffffffa00228bc>] ioat3_dca_init+0x19c/0x1b0 [ioatdma]
>> [   16.892069]  [<ffffffffa0021cd6>] ioat3_dma_probe+0x386/0x3e0 [ioatdma]
>> [   16.892071]  [<ffffffffa001a192>] ioat_pci_probe+0x122/0x1b0 [ioatdma]
>> [   16.892074]  [<ffffffff81329385>] local_pci_probe+0x45/0xa0
>> [   16.892076]  [<ffffffff81080d34>] work_for_cpu_fn+0x14/0x20
>> [   16.892077]  [<ffffffff81083c33>] process_one_work+0x183/0x490
>> [   16.892079]  [<ffffffff81084bd3>] worker_thread+0x2a3/0x410
>> [   16.892080]  [<ffffffff81084930>] ? rescuer_thread+0x410/0x410
>> [   16.892081]  [<ffffffff8108b852>] kthread+0xd2/0xf0
>> [   16.892083]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
>> [   16.892085]  [<ffffffff817396bc>] ret_from_fork+0x7c/0xb0
>> [   16.892091] fbcon: mgadrmfb (fb0) is primary device
>> [   16.892092]  [<ffffffff8108b780>] ? kthread_create_on_node+0x180/0x180
>>
>> There is no need to use WARN_TAINT_ONCE to generate such big noise if this is
>> not a critical error for the kernel. The DCA driver could print a debug
>> message and quit quietly.
>>
>> If this is a real BIOS bug, please ignore this patch. Let's transfer this
>> issue to the BIOS guys.
>>
>> Signed-off-by: Jet Chen <jet.c...@intel.com>
>> ---
>>  drivers/dma/ioat/dca.c | 10 ++--------
>>  1 file changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/dma/ioat/dca.c b/drivers/dma/ioat/dca.c
>> index 9e84d5b..c0f7971 100644
>> --- a/drivers/dma/ioat/dca.c
>> +++ b/drivers/dma/ioat/dca.c
>> @@ -470,10 +470,7 @@ struct dca_provider *ioat2_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>>  	}
>>
>>  	if (!dca2_tag_map_valid(ioatdca->tag_map)) {
>> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
>> -			"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
>> -			dev_driver_string(&pdev->dev),
>> -			dev_name(&pdev->dev));
>> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>>  		free_dca_provider(dca);
>>  		return NULL;
>>  	}
>> @@ -691,10 +688,7 @@ struct dca_provider *ioat3_dca_init(struct pci_dev *pdev, void __iomem *iobase)
>>  	}
>>
>>  	if (dca3_tag_map_invalid(ioatdca->tag_map)) {
>> -		WARN_TAINT_ONCE(1, TAINT_FIRMWARE_WORKAROUND,
>> -			"%s %s: APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n",
>> -			dev_driver_string(&pdev->dev),
>> -			dev_name(&pdev->dev));
>> +		dev_dbg(&pdev->dev, "APICID_TAG_MAP set incorrectly by BIOS, disabling DCA\n");
>>  		free_dca_provider(dca);
>>  		return NULL;
>>  	}
Re: [PATCH] DCA: fix over-warning in ioat3_dca_init
On 05/09/2014 12:13 AM, Jiang, Dave wrote:
> On Thu, 2014-05-08 at 08:57 -0700, Alexander Duyck wrote:
>> On 05/08/2014 08:28 AM, Jet Chen wrote:
>>> I agree with your opinion that it is a real BIOS bug and it puts pressure on the BIOS guys to get this fixed. However, this warning message interferes with our kernel booting tests and kernel performance tests. We have to disable CONFIG_INTEL_IOATDMA in kconfig until this issue gets fixed. Before that, the CONFIG_INTEL_IOATDMA code will not be validated in our testing system :(. Hope this issue can get fixed soon.
>>>
>>> Thanks, Jet
>>
>> First I would recommend updating your BIOS. If the updated BIOS also has the issue I would recommend taking this feedback to whoever provided the BIOS for your platform so that they can implement the fix. If I am not mistaken some BIOSes have the option to disable DCA and/or IOATDMA. You might want to check yours to see if you can just disable DCA on your platform until the issue can be resolved.
>>
>> Thanks, Alex
>
> Disabling DCA is the preferred option. IOATDMA is functional without DCA.
>
> Jet, what exactly are you attempting to test with IOATDMA? The only two consumers of this DMA driver I know of are MDRAID and NTB. But support for XOR/PQ ops on Xeon platforms has been removed recently due to various reasons, so it really is just NTB at the moment in the latest kernels.

We are running LKP to test kernel boot and performance. More information can be found at https://01.org/lkp/

This issue shows up in our boot testing with certain kconfigs and impacts many of our test machines. It is difficult to update the BIOS on all test boxes. Besides, we cannot be sure that every test box model has a workable BIOS version. We will consider disabling DCA as you suggest.

Thanks, Jet
Re: [KVM] BUG: unable to handle kernel NULL pointer dereference at 00000000000002b0
On 04/28/2014 07:34 PM, Paolo Bonzini wrote:
> On 28/04/2014 11:54, Jet Chen wrote:
>>>>>> We noticed the below kernel BUG on
>>>>>>
>>>>>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>>>
>>>> What commit?
>>>>
>> This one,
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>>
>> commit 93c4adc7afedf9b0ec190066d45b6d67db5270da
>> Author: Paolo Bonzini
>> AuthorDate: Wed Mar 5 23:19:52 2014 +0100
>> Commit: Paolo Bonzini
>> CommitDate: Mon Mar 17 12:21:39 2014 +0100
>>
>>     KVM: x86: handle missing MPX in nested virtualization
>>
>> BTW, the same issue has been reported by Fengguang last month.
>>
>> https://lkml.org/lkml/2014/3/26/200
>
> It should have been fixed already by commit 920c83778569 (KVM: vmx: fix MPX detection, 2014-03-26). That's why I was confused, I thought it was for a recent commit on Linus's master branch.

You're right, Paolo. It has already been fixed by commit 920c83778569. Sorry for the noisy duplicate report. We first found this commit on the git://git.kernel.org/pub/scm/virt/kvm/kvm.git tree. When the commit entered mainline, we tested it again and I reported it by mistake.

> Paolo
Re: [KVM] BUG: unable to handle kernel NULL pointer dereference at 00000000000002b0
On 04/28/2014 05:33 PM, Paolo Bonzini wrote:
> On 14/04/2014 09:49, Jet Chen wrote:
>> Hi Paolo,
>>
>> We noticed the below kernel BUG on
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> What commit?

This one,

git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

commit 93c4adc7afedf9b0ec190066d45b6d67db5270da
Author: Paolo Bonzini
AuthorDate: Wed Mar 5 23:19:52 2014 +0100
Commit: Paolo Bonzini
CommitDate: Mon Mar 17 12:21:39 2014 +0100

    KVM: x86: handle missing MPX in nested virtualization

BTW, the same issue has been reported by Fengguang last month.

https://lkml.org/lkml/2014/3/26/200

Thanks, Jet

> Paolo
Re: [libata/ahci] 8a4aeec8d2d: +138.4% perf-stat.dTLB-store-misses, +37.2% perf-stat.dTLB-load-misses
On 04/23/2014 01:11 AM, Dan Williams wrote:
> On Mon, Apr 21, 2014 at 12:29 AM, Jet Chen wrote:
>> Hi Dan,
>>
>> we noticed the below changes on
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata for-next
>> commit 8a4aeec8d2d6a3edeffbdfae451cdf05cbf0fefd ("libata/ahci: accommodate tag ordered controllers")
>
> Hi, was this on simulated hardware or a real AHCI controller and disk?

Testing was on a physical machine with a real AHCI controller.

root@bay ~# lspci | grep AHCI
00:1f.2 SATA controller: Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode] (rev 02)

> It does appear this test noticed increased throughput:
>
>     203893 ~ 0%   +3.7%   211474 ~ 0%  TOTAL iostat.sda.wkB/s
>
> I wonder if ap->last_tag can be moved to a hotter cacheline, but if throughput goes up I can imagine it throws off the cpu statistics quite a bit.
Re: [sched,rcu] 9234566d3a3: +1.6% will-it-scale.scalability, +1302.6% time.involuntary_context_switches
On 04/22/2014 09:59 AM, Paul E. McKenney wrote:
> On Mon, Apr 21, 2014 at 02:28:21PM +0800, Jet Chen wrote:
>> Hi Paul,
>>
>> we noticed the below changes on
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git torture.2014.04.18a
>> commit 9234566d3a36c0aead8852e3c2ca94cd8ebfe219 ("sched,rcu: Make cond_resched() report RCU quiescent states")
>
> My impression of these statistics is that this commit results in huge numbers of additional context switches and interrupts, but has a slightly positive effect on performance and a larger negative effect on scalability. Is this a reasonable interpretation?

Yes, you're right.

> Thanx, Paul

>> Comparison 1 - parent commit of 9234566d3a36c0aead8852e3c2ca94cd8ebfe219 vs 9234566d3a36c0aead8852e3c2ca94cd8ebfe219
>>
>> e119454e74a852f 9234566d3a36c0aead8852e3c
>>   1035948 ~ 0%     +1.6%  1052990 ~ 0%  TOTAL will-it-scale.per_thread_ops
>>   1271322 ~ 0%     +1.8%  1294004 ~ 0%  TOTAL will-it-scale.per_process_ops
>>      0.63 ~ 0%     -5.2%     0.60 ~ 0%  TOTAL will-it-scale.scalability
>>     22470 ~ 2%  +1302.6%   315168 ~ 2%  TOTAL time.involuntary_context_switches
>>     84265 ~ 5%  +1047.1%   966581 ~ 1%  TOTAL interrupts.IWI
>>      1828 ~44%   +189.6%     5295 ~13%  TOTAL time.voluntary_context_switches
>>      5337 ~ 1%    +82.1%     9720 ~ 1%  TOTAL vmstat.system.cs
>>    118599 ~ 0%    -30.4%    82545 ~ 0%  TOTAL interrupts.0:IO-APIC-edge.timer
>>    224021 ~ 4%    +34.7%   301858 ~ 2%  TOTAL interrupts.RES
>>     25148 ~ 0%     +7.0%    26917 ~ 0%  TOTAL vmstat.system.in
>>   7063439 ~ 0%     -5.2%  6694536 ~ 0%  TOTAL interrupts.LOC
>>    188866 ~ 0%     -3.1%   183008 ~ 0%  TOTAL interrupts.NMI
>>    188866 ~ 0%     -3.1%   183008 ~ 0%  TOTAL interrupts.PMI
>>      3720 ~ 0%     -1.5%     3665 ~ 0%  TOTAL time.system_time
>>      1215 ~ 0%     -1.4%     1198 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
>>
>> Comparison 2 - b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 vs 9234566d3a36c0aead8852e3c2ca94cd8ebfe219
>>
>> Fengguang has reported stats changes about b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 to you days ago.
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2014.04.14a
>> commit b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 ("sched,rcu: Make cond_resched() report RCU quiescent states")
>>
>> Let's have a compare here.
>>
>> b84c4e08143c98d 9234566d3a36c0aead8852e3c
>>    809309 ~ 0%      -2.6%   788400 ~ 0%  TOTAL will-it-scale.per_process_ops
>>      0.61 ~ 0%      -1.9%     0.60 ~ 0%  TOTAL will-it-scale.scalability
>>    434080 ~ 0%      -1.5%   427643 ~ 0%  TOTAL will-it-scale.per_thread_ops
>>         4 ~11%  +1.2e+05%     5249 ~ 2%  TOTAL interrupts.IWI
>>       607 ~ 7%     +28.0%      778 ~14%  TOTAL interrupts.47:PCI-MSI-edge.eth0
>>     12349 ~ 2%     -14.6%    10548 ~ 1%  TOTAL interrupts.0:IO-APIC-edge.timer
>>      3078 ~ 3%     +20.9%     3722 ~ 6%  TOTAL interrupts.RES
>>
>> Comparison 3 - parent commit of b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 vs b84c4e08143c98dad4b4d139f08db0b98b0d3ec4
>>
>> Duplicated with Fengguang's report. Help you to see these info in a page :)
>>
>> ad86a04266f9b49 b84c4e08143c98dad4b4d139f
>>    676264 ~ 0%    +3.3%   698461 ~ 0%  TOTAL will-it-scale.per_thread_ops
>>   1174547 ~ 0%    +3.0%  1209307 ~ 0%  TOTAL will-it-scale.per_process_ops
>>      1.67 ~ 0%    -2.3%     1.63 ~ 0%  TOTAL will-it-scale.scalability
>>     10522 ~ 2%  +921.2%   107463 ~ 1%  TOTAL time.involuntary_context_switches
>>     77671 ~ 3%   +67.0%   129688 ~ 3%  TOTAL interrupts.RES
>>     99502 ~ 0%   -27.8%    71813 ~ 0%  TOTAL interrupts.0:IO-APIC-edge.timer
>>      2554 ~ 0%   +49.1%     3809 ~ 1%  TOTAL vmstat.system.cs
>>     11524 ~ 0%    -2.3%    11259 ~ 0%  TOTAL vmstat.system.in
>>       213 ~ 0%    -4.3%      204 ~ 0%  TOTAL time.system_time
>>        74 ~ 0%    -4.1%       71 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
>>   3495099 ~ 0%    -3.1%  3387173 ~ 0%  TOTAL interrupts.LOC
>>
>> Thanks, Jet
>>
>> ./runtest.py open2 32 1 4 6 8
[libata/ahci] 8a4aeec8d2d: +138.4% perf-stat.dTLB-store-misses, +37.2% perf-stat.dTLB-load-misses
Hi Dan,

we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata for-next
commit 8a4aeec8d2d6a3edeffbdfae451cdf05cbf0fefd ("libata/ahci: accommodate tag ordered controllers")

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
  88694337 ~39%  +138.4%  2.115e+08 ~46%  TOTAL perf-stat.dTLB-store-misses
    217057 ~ 0%   -31.3%     149221 ~ 3%  TOTAL interrupts.46:PCI-MSI-edge.ahci
 6.995e+08 ~20%   +37.2%  9.598e+08 ~25%  TOTAL perf-stat.dTLB-load-misses
    110302 ~ 0%   -28.9%      78402 ~ 2%  TOTAL interrupts.CAL
 3.168e+08 ~ 9%   +14.5%  3.627e+08 ~10%  TOTAL perf-stat.L1-dcache-prefetches
 2.553e+09 ~12%   +26.5%  3.228e+09 ~11%  TOTAL perf-stat.LLC-loads
 5.815e+08 ~ 6%   +27.3%  7.403e+08 ~11%  TOTAL perf-stat.LLC-stores
 3.662e+09 ~11%   +22.9%  4.501e+09 ~10%  TOTAL perf-stat.L1-dcache-load-misses
 2.155e+10 ~ 1%    +8.3%  2.333e+10 ~ 1%  TOTAL perf-stat.L1-dcache-store-misses
 3.619e+10 ~ 1%    +5.9%  3.832e+10 ~ 2%  TOTAL perf-stat.cache-references
 1.605e+10 ~ 1%    +4.3%  1.674e+10 ~ 1%  TOTAL perf-stat.L1-icache-load-misses
    239691 ~ 7%    -8.4%     219537 ~ 1%  TOTAL interrupts.RES
      3483 ~ 0%    -5.4%       3297 ~ 0%  TOTAL vmstat.system.in
 2.748e+08 ~ 1%    +4.3%  2.865e+08 ~ 0%  TOTAL perf-stat.cache-misses
  98935369 ~ 0%    +4.9%  1.038e+08 ~ 0%  TOTAL perf-stat.LLC-store-misses
       699 ~ 1%    -3.7%        673 ~ 1%  TOTAL iostat.sda.w_await
       698 ~ 1%    -3.7%        672 ~ 1%  TOTAL iostat.sda.await
    203893 ~ 0%    +3.7%     211474 ~ 0%  TOTAL iostat.sda.wkB/s
    203972 ~ 0%    +3.7%     211488 ~ 0%  TOTAL vmstat.io.bo
    618082 ~ 4%    -4.6%     589619 ~ 1%  TOTAL perf-stat.context-switches
 1.432e+12 ~ 1%    +3.0%  1.475e+12 ~ 0%  TOTAL perf-stat.L1-icache-loads
  3.35e+11 ~ 0%    +3.2%  3.456e+11 ~ 0%  TOTAL perf-stat.L1-dcache-stores
 1.486e+12 ~ 0%    +2.8%  1.527e+12 ~ 0%  TOTAL perf-stat.iTLB-loads
 3.006e+11 ~ 0%    +2.6%  3.084e+11 ~ 0%  TOTAL perf-stat.branch-instructions
 1.793e+12 ~ 0%    +2.8%  1.843e+12 ~ 0%  TOTAL perf-stat.cpu-cycles
 3.352e+11 ~ 1%    +2.9%  3.451e+11 ~ 0%  TOTAL perf-stat.dTLB-stores
 2.994e+11 ~ 1%    +3.1%  3.087e+11 ~ 0%  TOTAL perf-stat.branch-loads
  1.49e+12 ~ 0%    +2.9%  1.533e+12 ~ 0%  TOTAL perf-stat.instructions
  5.48e+11 ~ 0%    +2.8%  5.633e+11 ~ 0%  TOTAL perf-stat.dTLB-loads
 2.028e+11 ~ 1%    +2.9%  2.086e+11 ~ 1%  TOTAL perf-stat.bus-cycles
 5.484e+11 ~ 0%    +2.9%  5.644e+11 ~ 0%  TOTAL perf-stat.L1-dcache-loads
 1.829e+12 ~ 0%    +2.7%  1.877e+12 ~ 1%  TOTAL perf-stat.ref-cycles

Legend:
  ~XX%    - stddev percent
  [+-]XX% - change percent

Attach full stats changes entries for reference.

Thanks, Jet

mkfs -t ext4 -q /dev/sda1
echo 1 > /sys/kernel/debug/tracing/events/writeback/balance_dirty_pages/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/bdi_dirty_ratelimit/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/global_dirty_state/enable
echo 1 > /sys/kernel/debug/tracing/events/writeback/writeback_single_inode/enable
mount -t ext4 /dev/sda1 /fs/sda1
dd if=/dev/zero of=/fs/sda1/zero-1 status=none &
sleep 600
killall -9 dd

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
      1.23 ~ 8%   -30.0%       0.86 ~15%  bay/micro/dd-write/1HDD-cfq-ext4-1dd
      1.23 ~ 8%   -30.0%       0.86 ~15%  TOTAL perf-profile.cpu-cycles.jbd2_journal_add_journal_head.jbd2_journal_get_write_access.__ext4_journal_get_write_access.ext4_reserve_inode_write.ext4_mark_inode_dirty

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
     56347 ~ 0%   -26.3%      41535 ~ 5%  bay/micro/dd-write/1HDD-cfq-ext4-1dd
     56543 ~ 0%   -32.9%      37934 ~ 0%  bay/micro/dd-write/1HDD-cfq-xfs-1dd
    112890 ~ 0%   -29.6%      79469 ~ 2%  TOTAL softirqs.BLOCK

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
      0.95 ~12%   -26.0%       0.70 ~ 7%  bay/micro/dd-write/1HDD-cfq-ext4-1dd
      0.95 ~12%   -26.0%       0.70 ~ 7%  TOTAL perf-profile.cpu-cycles.jbd2_journal_put_journal_head.__ext4_handle_dirty_metadata.ext4_mark_iloc_dirty.ext4_mark_inode_dirty.ext4_dirty_inode

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
      0.95 ~ 5%   -18.2%       0.77 ~24%  bay/micro/dd-write/1HDD-cfq-ext4-1dd
      0.95 ~ 5%   -18.2%       0.77 ~24%  TOTAL perf-profile.cpu-cycles.generic_file_aio_write.ext4_file_write.do_sync_write.vfs_write.sys_write

2cf532f5e67c0cf 8a4aeec8d2d6a3edeffbdfae4
      2468 ~ 3%   +19.5%       2949 ~ 6%  bay/micro/dd-write/1HDD-cfq-ext4-1dd
      2468 ~ 3%   +19.5%       2949 ~ 6%  TOTAL proc-vmstat.kswapd_high_wmark_hit_quickly

2cf532f5e67c0cf
[sched,rcu] 9234566d3a3: +1.6% will-it-scale.scalability, +1302.6% time.involuntary_context_switches
Hi Paul,

we noticed the below changes on

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git torture.2014.04.18a
commit 9234566d3a36c0aead8852e3c2ca94cd8ebfe219 ("sched,rcu: Make cond_resched() report RCU quiescent states")

Comparison 1 - parent commit of 9234566d3a36c0aead8852e3c2ca94cd8ebfe219 vs 9234566d3a36c0aead8852e3c2ca94cd8ebfe219

e119454e74a852f 9234566d3a36c0aead8852e3c
  1035948 ~ 0%     +1.6%  1052990 ~ 0%  TOTAL will-it-scale.per_thread_ops
  1271322 ~ 0%     +1.8%  1294004 ~ 0%  TOTAL will-it-scale.per_process_ops
     0.63 ~ 0%     -5.2%     0.60 ~ 0%  TOTAL will-it-scale.scalability
    22470 ~ 2%  +1302.6%   315168 ~ 2%  TOTAL time.involuntary_context_switches
    84265 ~ 5%  +1047.1%   966581 ~ 1%  TOTAL interrupts.IWI
     1828 ~44%   +189.6%     5295 ~13%  TOTAL time.voluntary_context_switches
     5337 ~ 1%    +82.1%     9720 ~ 1%  TOTAL vmstat.system.cs
   118599 ~ 0%    -30.4%    82545 ~ 0%  TOTAL interrupts.0:IO-APIC-edge.timer
   224021 ~ 4%    +34.7%   301858 ~ 2%  TOTAL interrupts.RES
    25148 ~ 0%     +7.0%    26917 ~ 0%  TOTAL vmstat.system.in
  7063439 ~ 0%     -5.2%  6694536 ~ 0%  TOTAL interrupts.LOC
   188866 ~ 0%     -3.1%   183008 ~ 0%  TOTAL interrupts.NMI
   188866 ~ 0%     -3.1%   183008 ~ 0%  TOTAL interrupts.PMI
     3720 ~ 0%     -1.5%     3665 ~ 0%  TOTAL time.system_time
     1215 ~ 0%     -1.4%     1198 ~ 0%  TOTAL time.percent_of_cpu_this_job_got

Comparison 2 - b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 vs 9234566d3a36c0aead8852e3c2ca94cd8ebfe219

Fengguang has reported stats changes about b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 to you days ago.

git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev.2014.04.14a
commit b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 ("sched,rcu: Make cond_resched() report RCU quiescent states")

Let's have a compare here.

b84c4e08143c98d 9234566d3a36c0aead8852e3c
   809309 ~ 0%      -2.6%   788400 ~ 0%  TOTAL will-it-scale.per_process_ops
     0.61 ~ 0%      -1.9%     0.60 ~ 0%  TOTAL will-it-scale.scalability
   434080 ~ 0%      -1.5%   427643 ~ 0%  TOTAL will-it-scale.per_thread_ops
        4 ~11%  +1.2e+05%     5249 ~ 2%  TOTAL interrupts.IWI
      607 ~ 7%     +28.0%      778 ~14%  TOTAL interrupts.47:PCI-MSI-edge.eth0
    12349 ~ 2%     -14.6%    10548 ~ 1%  TOTAL interrupts.0:IO-APIC-edge.timer
     3078 ~ 3%     +20.9%     3722 ~ 6%  TOTAL interrupts.RES

Comparison 3 - parent commit of b84c4e08143c98dad4b4d139f08db0b98b0d3ec4 vs b84c4e08143c98dad4b4d139f08db0b98b0d3ec4

Duplicated with Fengguang's report. Help you to see these info in a page :)

ad86a04266f9b49 b84c4e08143c98dad4b4d139f
   676264 ~ 0%    +3.3%   698461 ~ 0%  TOTAL will-it-scale.per_thread_ops
  1174547 ~ 0%    +3.0%  1209307 ~ 0%  TOTAL will-it-scale.per_process_ops
     1.67 ~ 0%    -2.3%     1.63 ~ 0%  TOTAL will-it-scale.scalability
    10522 ~ 2%  +921.2%   107463 ~ 1%  TOTAL time.involuntary_context_switches
    77671 ~ 3%   +67.0%   129688 ~ 3%  TOTAL interrupts.RES
    99502 ~ 0%   -27.8%    71813 ~ 0%  TOTAL interrupts.0:IO-APIC-edge.timer
     2554 ~ 0%   +49.1%     3809 ~ 1%  TOTAL vmstat.system.cs
    11524 ~ 0%    -2.3%    11259 ~ 0%  TOTAL vmstat.system.in
      213 ~ 0%    -4.3%      204 ~ 0%  TOTAL time.system_time
       74 ~ 0%    -4.1%       71 ~ 0%  TOTAL time.percent_of_cpu_this_job_got
  3495099 ~ 0%    -3.1%  3387173 ~ 0%  TOTAL interrupts.LOC

Thanks, Jet

./runtest.py open2 32 1 4 6 8
Re: [visorchipset] invalid opcode: 0000 [#1] PREEMPT SMP
On 04/12/2014 12:33 AM, H. Peter Anvin wrote: > On 04/11/2014 06:51 AM, Romer, Benjamin M wrote: >> >>> I'm still confused where KVM comes into the picture. Are you actually >>> using KVM (and thus talking about nested virtualization) or are you >>> using Qemu in JIT mode and running another hypervisor underneath? >> >> The test that Fengguang used to find the problem was running the linux >> kernel directly using KVM. When the kernel was run with "-cpu Haswell, >> +smep,+smap" set, the vmcall failed with invalid op, but when the kernel >> is run with "-cpu qemu64", the vmcall causes a vmexit, as it should. > > As far as I know, Fengguang's test doesn't use KVM at all, it runs Qemu > as a JIT. Completely different thing. In that case Qemu probably > should *not* set the hypervisor bit. However, the only thing that the > hypervisor bit means is that you can look for specific hypervisor APIs > in CPUID level 0x4000+. > >> My point is, the vmcall was made because the hypervisor bit was set. If >> this bit had been turned off, as it would be on a real processor, the >> vmcall wouldn't have happened. > > And my point is that that is a bug. In the driver. A very serious one. > You cannot call VMCALL until you know *which* hypervisor API(s) you > have available, period. > >>> The hypervisor bit is a complete red herring. If the guest CPU is >>> running in VT-x mode, then VMCALL should VMEXIT inside the guest >>> (invoking the guest root VT-x), >> >> The CPU is running in VT-X. That was my point, the kernel is running in >> the KVM guest, and KVM is setting the CPU feature bits such that bit 31 >> is enabled. > > Which it is because it wants to export the KVM hypercall interface. > However, keying VMCALL *only* on the HYPERVISOR bit is wrong in the extreme. 
>> I don't think it's a red herring because the kernel uses this bit
>> elsewhere - it is reported as X86_FEATURE_HYPERVISOR in the CPU
>> features, and can be checked with the cpu_has_hypervisor macro (which
>> was not used by the original author of the code in the driver, but
>> should have been). VMWare and KVM support in the kernel also check for
>> this bit before checking their hypervisor leaves for an ID. If it's not
>> properly set it affects more than just the s-Par drivers.
>>
>>> but the fact still remains that you should never, ever, invoke VMCALL
>>> unless you know what hypervisor you have underneath.
>>
>> From the standpoint of the s-Par drivers, yes, I agree (as I already
>> said). However, VMCALL is not a privileged instruction, so anyone could
>> use it from user space and go right past the OS straight to the
>> hypervisor. IMHO, making it *lethal* to the guest is a bad idea, since
>> any user could hard-stop the guest with a couple of lines of C.
>
> Typically the hypervisor wants to generate a #UD inside of the guest for
> that case. The guest OS will intercept it and SIGILL the user space
> process.
>
> -hpa

Hi Ben, I re-tested this case with/without the -enable-kvm option:

qemu-system-x86_64 -cpu Haswell,+smep,+smap              invalid op
qemu-system-x86_64 -cpu kvm64                            invalid op
qemu-system-x86_64 -cpu Haswell,+smep,+smap -enable-kvm  everything OK
qemu-system-x86_64 -cpu kvm64 -enable-kvm                everything OK

I think this is probably a bug in QEMU. Sorry for misleading you. I am not experienced with QEMU; I didn't realize I needed to try this case with different options until I read Peter's reply. As Peter said, QEMU probably should *not* set the hypervisor bit. But based on my testing, I think KVM works properly in this case.
Thanks, Jet
Re: [FMC] BUG: scheduling while atomic: swapper/1/0x10000002
On 04/09/2014 01:08 PM, Alessandro Rubini wrote:
> Hello.
> Thank you for the report.
>
> I'm at a conference and I fear I won't be able to test myself in the
> next days, but I think this is already fixed (it is part of
> the "misc_register" call path, so it's the same problem).
>
> The fix is commit v3.11-rc2-11-g783c2fb
>
> 783c2fb FMC: fix locking in sample chardev driver
>
> This commit, however, is not part of v3.11 and I think this is why you
> are finding the problem in the v3.10..v3.11 interval.
>
> thank you again
> /alessandro

Alessandro, your commit 783c2fb ("FMC: fix locking in sample chardev driver") fixes the issue.

Tested-by: Jet Chen

Thanks,
Jet
Re: [FMC] BUG: scheduling while atomic: swapper/1/0x10000002
On 04/09/2014 01:28 PM, Fengguang Wu wrote:
> On Wed, Apr 09, 2014 at 07:08:43AM +0200, Alessandro Rubini wrote:
>> Hello.
>> Thank you for the report.
>>
>> I'm at a conference and I fear I won't be able to test myself in the
>> next days, but I think this is already fixed (it is part of
>> the "misc_register" call path, so it's the same problem).
>>
>> The fix is commit v3.11-rc2-11-g783c2fb
>>
>> 783c2fb FMC: fix locking in sample chardev driver
>>
>> This commit, however, is not part of v3.11 and I think this is why you
>> are finding the problem in the v3.10..v3.11 interval.
>
> Alessandro, you are right. There are no more "scheduling while
> atomic" bugs in v3.12 and v3.13.
>
> Our bisect log shows
>
> git bisect bad 38dbfb59d1175ef458d006556061adeaa8751b72 # 10:03 0- 345 Linus 3.14-rc1
>
> However that happened to be caused by an independent "scheduling while
> atomic" bug:

Alessandro, Fengguang & I confirmed that the dmesg below is also caused by over-locking in fc_probe(); it is not a newly introduced bug.

> [ 20.038125] Fixing recursive fault but reboot is needed!
> [ 20.038125] BUG: scheduling while atomic: kworker/0:1H/77/0x0005
> [ 20.038125] INFO: lockdep is turned off.
> [ 20.038125] irq event stamp: 758 > [ 20.038125] hardirqs last enabled at (757): [] > _raw_spin_unlock_irq+0x22/0x30 > [ 20.038125] hardirqs last disabled at (758): [] > _raw_spin_lock_irq+0x14/0x73 > [ 20.038125] softirqs last enabled at (302): [] > __do_softirq+0x186/0x1d2 > [ 20.038125] softirqs last disabled at (295): [] > do_softirq_own_stack+0x2f/0x35 > [ 20.038125] CPU: 0 PID: 77 Comm: kworker/0:1H Tainted: G D W > 3.14.0-rc1 #1 > [ 20.038125] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 > [ 20.038125] c0420610 c0420610 c0449a38 c1c1f562 c0449a54 c1c1b59c > c1f91661 c0420938 > [ 20.038125] 004d 0005 c0420610 c0449acc c1c2e4e2 c105fff8 > 01449a7c 04af > [ 20.038125] c0420610 002c 0001 c0449a7c c0420610 c0449ab4 > c106001c > [ 20.038125] Call Trace: > [ 20.038125] [] dump_stack+0x16/0x18 > [ 20.038125] [] __schedule_bug+0x5d/0x6f > [ 20.038125] [] __schedule+0x45/0x55f > [ 20.038125] [] ? vprintk_emit+0x367/0x3a4 > [ 20.038125] [] ? vprintk_emit+0x38b/0x3a4 > [ 20.038125] [] ? trace_hardirqs_off+0xb/0xd > [ 20.038125] [] ? printk+0x38/0x3a > [ 20.038125] [] schedule+0x5d/0x5f > [ 20.038125] [] do_exit+0xcc/0x75d > [ 20.038125] [] ? kmsg_dump+0x184/0x191 > [ 20.038125] [] ? kmsg_dump+0x1c/0x191 > [ 20.038125] [] oops_end+0x7e/0x83 > [ 20.038125] [] no_context+0x1ba/0x1c2 > [ 20.038125] [] __bad_area_nosemaphore+0x137/0x13f > [ 20.038125] [] ? pte_offset_kernel+0x13/0x2a > [ 20.038125] [] ? spurious_fault+0x75/0xd5 > [ 20.038125] [] bad_area_nosemaphore+0x12/0x14 > > Thanks, > Fengguang > -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [FMC] BUG: scheduling while atomic: swapper/1/0x10000002
On 04/09/2014 01:08 PM, Alessandro Rubini wrote:
> Hello.
> Thank you for the report.
>
> I'm at a conference and I fear I won't be able to test myself in the
> next days, but I think this is already fixed (it is part of
> the "misc_register" call path, so it's the same problem).
>
> The fix is commit v3.11-rc2-11-g783c2fb
>
> 783c2fb FMC: fix locking in sample chardev driver
>
> This commit, however, is not part of v3.11 and I think this is why you
> are finding the problem in the v3.10..v3.11 interval.
>
> thank you again
> /alessandro

I found commit 783c2fb ("FMC: fix locking in sample chardev driver") and will help to test it.
Re: [clocksource] INFO: possible irq lock inversion dependency detected
On 04/09/2014 12:40 PM, Viresh Kumar wrote:
> On 9 April 2014 10:04, Jet Chen wrote:
>> How did you get this in the cc list?
>> "abd38155f8293923de5953cc063f9e2d7ecb3f04.1396679170.git.viresh.ku...@linaro.org"
>> I got it from the patch you sent to me before; I attach it again.
>> Apologies if it's improper to cc this list.
>
> There is no list like this :), it's just the message id generated by git
> while sending my patch.

Oh, I see. I'm supposed to in-reply-to that message id. I guess I just did a simple "reply all", so my email client put it in the TO list.
Re: [clocksource] INFO: possible irq lock inversion dependency detected
On 04/09/2014 12:25 PM, Viresh Kumar wrote: On 9 April 2014 06:51, Jet Chen wrote: spin_lock_irqsave() does fix this issue. Tested-by: Jet Chen Thanks a lot :) Welcome. How did you got this in cc list ? "abd38155f8293923de5953cc063f9e2d7ecb3f04.1396679170.git.viresh.ku...@linaro.org" I got it from the patch you sent to me before. attach it again. Apologizes if it's improper to cc this list. >From abd38155f8293923de5953cc063f9e2d7ecb3f04 Mon Sep 17 00:00:00 2001 Message-Id: From: Viresh Kumar Date: Sat, 5 Apr 2014 11:43:25 +0530 Subject: [PATCH] clocksource: register cpu notifier to remove timer from dying CPU clocksource core is using add_timer_on() to run clocksource_watchdog() on all CPUs one by one. But when a core is brought down, clocksource core doesn't remove this timer from the dying CPU. And in this case timer core gives this (Gives this only with unmerged code, anyway in the current code as well timer core is migrating a pinned timer to other CPUs, which is also wrong: http://www.gossamer-threads.com/lists/linux/kernel/1898117) migrate_timer_list: can't migrate pinned timer: 81f06a60, timer->function: 810d7010,deactivating it Modules linked in: CPU: 0 PID: 1932 Comm: 01-cpu-hotplug Not tainted 3.14.0-rc1-00088-gab3c4fd #4 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 0009 88001d407c38 817237bd 88001d407c80 88001d407c70 8106a1dd 0010 81f06a60 88001e04d040 81e3d4c0 88001e04d030 88001d407cd0 Call Trace: [] dump_stack+0x4d/0x66 [] warn_slowpath_common+0x7d/0xa0 [] warn_slowpath_fmt+0x4c/0x50 [] ? __internal_add_timer+0x113/0x130 [] ? 
clocksource_watchdog_kthread+0x40/0x40 [] migrate_timer_list+0xdb/0xf0 [] timer_cpu_notify+0xfc/0x1f0 [] notifier_call_chain+0x4c/0x70 [] __raw_notifier_call_chain+0xe/0x10 [] cpu_notify+0x23/0x50 [] cpu_notify_nofail+0xe/0x20 [] _cpu_down+0x1ad/0x2e0 [] cpu_down+0x34/0x50 [] cpu_subsys_offline+0x14/0x20 [] device_offline+0x95/0xc0 [] online_store+0x40/0x90 [] dev_attr_store+0x18/0x30 [] sysfs_kf_write+0x3d/0x50 This patch tries to fix this by registering cpu notifiers from clocksource core, only when we start clocksource-watchdog. And if on the CPU_DEAD notification it is found that dying CPU was the CPU on which this timer is queued on, then it is removed from that CPU and queued to next CPU. Reported-by: Jet Chen Reported-by: Fengguang Wu Signed-off-by: Viresh Kumar --- kernel/time/clocksource.c | 64 +++ 1 file changed, 53 insertions(+), 11 deletions(-) diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c index ba3e502..9e96853 100644 --- a/kernel/time/clocksource.c +++ b/kernel/time/clocksource.c @@ -23,16 +23,21 @@ * o Allow clocksource drivers to be unregistered */ +#include #include #include #include #include +#include #include /* for spin_unlock_irq() using preempt_count() m68k */ #include #include #include "tick-internal.h" +/* Tracks next CPU to queue watchdog timer on */ +static int timer_cpu; + void timecounter_init(struct timecounter *tc, const struct cyclecounter *cc, u64 start_tstamp) @@ -246,12 +251,25 @@ void clocksource_mark_unstable(struct clocksource *cs) spin_unlock_irqrestore(_lock, flags); } +void queue_timer_on_next_cpu(void) +{ + /* + * Cycle through CPUs to check if the CPUs stay synchronized to each + * other. 
+ */ + timer_cpu = cpumask_next(timer_cpu, cpu_online_mask); + if (timer_cpu >= nr_cpu_ids) + timer_cpu = cpumask_first(cpu_online_mask); + watchdog_timer.expires = jiffies + WATCHDOG_INTERVAL; + add_timer_on(_timer, timer_cpu); +} + static void clocksource_watchdog(unsigned long data) { struct clocksource *cs; cycle_t csnow, wdnow; int64_t wd_nsec, cs_nsec; - int next_cpu, reset_pending; + int reset_pending; spin_lock(_lock); if (!watchdog_running) @@ -336,27 +354,50 @@ static void clocksource_watchdog(unsigned long data) if (reset_pending) atomic_dec(_reset_pending); - /* - * Cycle through CPUs to check if the CPUs stay synchronized - * to each other. - */ - next_cpu = cpumask_next(raw_smp_processor_id(), cpu_online_mask); - if (next_cpu >= nr_cpu_ids) - next_cpu = cpumask_first(cpu_online_mask); - watchdog_timer.expires += WATCHDOG_INTERVAL; - add_timer_on(_timer, next_cpu); + queue_timer_on_next_cpu(); out: spin_unlock(_lock); } +static int clocksource_cpu_notify(struct notifier_block *self, +unsigned long action, void *hcpu) +{ + long cpu = (long)hcpu; + + spin_lock(_lock); + if (!watchdog_running) + goto notify_out; + + switch (action) { + case CPU_DEAD: + case CPU_DEAD_FROZEN: + if (cpu != timer_cpu) + break; + del_timer(_timer); + queue_timer_on_next_cpu(); + break; + } + +notify_out: + spin_unlock(_lock); + return NOTIFY_OK; +} + +static struct notifier_block clocksource_nb = { + .notifi
Re: [clocksource] INFO: possible irq lock inversion dependency detected
On 04/08/2014 01:21 PM, Viresh Kumar wrote:
> On 8 April 2014 09:29, Jet Chen wrote:
>> (Sorry for the previous badly formatted email.) Your patch on my testing
>> branch in the LKP system:
>> git://bee.sh.intel.com/git/tchen37/linux.git timer_debug3
>> got the below dmesg. FYI, I applied your patch on top of commit
>> 6378cb51af5f4743db0dcb3cbcf862eac5908754 - timer: don't migrate pinned timers.
>
> Hi Jet,
>
> Thanks for your efforts. It looks like we must use spin_lock_irqsave()
> here. And that's all we need to fix this issue..

spin_lock_irqsave() does fix this issue.

Tested-by: Jet Chen

> To get the right order in which patches must be applied (obviously with
> the updates I have), please pick this branch:
> git://git.linaro.org/people/viresh.kumar/linux.git isolate-cpusets
>
> I hope this fixes the issues you were getting.
Re: WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list()
On 04/05/2014 02:26 PM, Viresh Kumar wrote:
> On 5 April 2014 10:00, Jet Chen wrote:
>> vmlinuz from our build system doesn't have debug information. It is
>> hard to use objdump to identify which routine timer->function is.
>
> I see...
>
>> But after several trials, I got the dmesg messages below. It is clear
>> that the address of "timer->function" is 0x810d7010. The call stack
>> shows "[] ? clocksource_watchdog_kthread+0x40/0x40". So I guess
>> timer->function is clocksource_watchdog_kthread.
>
> Hmm.. not exactly this function as this isn't timer->function for any
> timer. But I think I have found the right function with this hint:
> clocksource_watchdog()
>
> Can you please try to test the attached patch, which must fix it.
> Untested.

Your patch fixes it!

> I will then post it with your Tested-by :)

Thank you

> -- viresh
Re: WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list()
On 04/04/2014 03:52 PM, Viresh Kumar wrote: On 4 April 2014 13:16, Jet Chen wrote:

Hi Viresh, I changed your print message as you suggested:

diff --git a/kernel/timer.c b/kernel/timer.c
index 6c3a371..193101d 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1617,8 +1617,8 @@ static void migrate_timer_list(struct tvec_base *new_base, struct list_head *hea
 		/* Check if CPU still has pinned timers */
 		if (unlikely(WARN(is_pinned,
-				  "%s: can't migrate pinned timer: %p, deactivating it\n",
-				  __func__, timer)))
+				  "%s: can't migrate pinned timer: %p, timer->function: %p, deactivating it\n",
+				  __func__, timer, timer->function)))
 			continue;

Then I reproduced the issue and got this dmesg output:

[ 37.918406] migrate_timer_list: can't migrate pinned timer: 81f06a60, timer->function: 810d7010, deactivating it

We have reproduced this issue several times in our LKP system. The address of the timer, 81f06a60, is very constant, and so, I believe, is timer->function. Hope this information helps; please feel free to tell me what else I can do to help you.

Hi Jet, thanks a lot. Yes, that's pretty helpful, but I need some more help from you: I don't have any idea which function has this address in your kernel: 810d7010 :) Can you please debug that a bit more? You need to find which function this address belongs to. You can try that using objdump on your vmlinux; some help can be found in Documentation/BUG-HUNTING. Thanks in advance.

vmlinuz from our build system doesn't have debug information, so it is hard to use objdump to identify which routine timer->function is. But after several trials, I got the below dmesg messages. It is clear that the address of "timer->function" is 0x810d7010, and the call stack contains " [] ? clocksource_watchdog_kthread+0x40/0x40 ". So I guess timer->function is clocksource_watchdog_kthread.
I manually disabled CONFIG_CLOCKSOURCE_WATCHDOG, and then I never saw this oops again (but I see other oopses, for other reasons :( )

[ 37.918345] WARNING: CPU: 0 PID: 1932 at kernel/timer.c:1621 migrate_timer_list+0xdb/0xf0()
[ 37.918406] migrate_timer_list: can't migrate pinned timer: 81f06a60, timer->function: 810d7010, deactivating it
[ 37.918406] Modules linked in:
[ 37.918406] CPU: 0 PID: 1932 Comm: 01-cpu-hotplug Not tainted 3.14.0-rc1-00088-gab3c4fd #4
[ 37.918406] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
[ 37.918406] 0009 88001d407c38 817237bd 88001d407c80
[ 37.918406] 88001d407c70 8106a1dd 0010 81f06a60
[ 37.918406] 88001e04d040 81e3d4c0 88001e04d030 88001d407cd0
[ 37.918406] Call Trace:
[ 37.918406] [817237bd] dump_stack+0x4d/0x66
[ 37.918406] [8106a1dd] warn_slowpath_common+0x7d/0xa0
[ 37.918406] [8106a24c] warn_slowpath_fmt+0x4c/0x50
[ 37.918406] [810761c3] ? __internal_add_timer+0x113/0x130
[ 37.918406] [810d7010] ? clocksource_watchdog_kthread+0x40/0x40
[ 37.918406] [8107753b] migrate_timer_list+0xdb/0xf0
[ 37.918406] [810782dc] timer_cpu_notify+0xfc/0x1f0
[ 37.918406] [8173046c] notifier_call_chain+0x4c/0x70
[ 37.918406] [8109340e] __raw_notifier_call_chain+0xe/0x10
[ 37.918406] [8106a3f3] cpu_notify+0x23/0x50
[ 37.918406] [8106a44e] cpu_notify_nofail+0xe/0x20
[ 37.918406] [81712a5d] _cpu_down+0x1ad/0x2e0
[ 37.918406] [81712bc4] cpu_down+0x34/0x50
[ 37.918406] [813fec54] cpu_subsys_offline+0x14/0x20
[ 37.918406] [813f9f65] device_offline+0x95/0xc0
[ 37.918406] [813fa060] online_store+0x40/0x90
[ 37.918406] [813f75d8] dev_attr_store+0x18/0x30
[ 37.918406] [8123309d] sysfs_kf_write+0x3d/0x50

early console in setup code Probing EDD (edd=off to disable)... ok early console in decompress_kernel Decompressing Linux... Parsing ELF... done. Booting the kernel.
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Linux version 3.14.0-rc1-00088-gab3c4fd (kbuild@xian) (gcc version 4.8.2 (Debian 4.8.2-18) ) #4 SMP Fri Apr 4 14:46:57 CST 2014
[0.00] Command line: hung_task_panic=1 earlyprintk=ttyS0,115200 debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 panic=10 softlockup_panic=1 nmi_watchdog=panic load_ramdisk=2 prompt_ramdisk=0 console=ttyS0,115200 console=tty0 vga=normal ip=nfsroot-lkp-ib04-40::dhcp nfsroot=192.168.1.1:/nfsroot/wfg,tcp,v3,nocto,actimeo=600,nolock,rsize=524288,wsize=524288 rw link=/kernel-tests/run-queue/kvm/x86_64-rhel/tchen:vireshk_test/.vmlinuz-ab3c4fdd657432f23ac1ede2845392c4d4bdb947-20140404144932-2-lkp-ib04 branch=tchen/vireshk_test BOOT_IMAGE=/kernel/x86_64-rhel/ab3c4fdd657432f23ac1ede2845392c4d4bdb947/vmlinuz-3.14.0-rc1-00088-gab3c4fd drbd.minor_count=8
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0
Re: WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list()
Hi Viresh, I changed your print message as you suggested:

diff --git a/kernel/timer.c b/kernel/timer.c
index 6c3a371..193101d 100644
--- a/kernel/timer.c
+++ b/kernel/timer.c
@@ -1617,8 +1617,8 @@ static void migrate_timer_list(struct tvec_base *new_base, struct list_head *hea
 		/* Check if CPU still has pinned timers */
 		if (unlikely(WARN(is_pinned,
-				  "%s: can't migrate pinned timer: %p, deactivating it\n",
-				  __func__, timer)))
+				  "%s: can't migrate pinned timer: %p, timer->function: %p, deactivating it\n",
+				  __func__, timer, timer->function)))
 			continue;

Then I reproduced the issue and got this dmesg output:

[ 37.918406] migrate_timer_list: can't migrate pinned timer: 81f06a60, timer->function: 810d7010, deactivating it

We have reproduced this issue several times in our LKP system. The address of the timer, 81f06a60, is very constant, and so, I believe, is timer->function. Hope this information helps; please feel free to tell me what else I can do to help you.

Thanks, Jet

On 04/04/2014 12:56 PM, Viresh Kumar wrote: Thanks Fengguang, On 4 April 2014 08:49, Fengguang Wu wrote:

Greetings, I got the below dmesg, and the first bad commit is

git://git.linaro.org/people/vireshk/linux timer-cleanup-for-tglx
commit 6378cb51af5f4743db0dcb3cbcf862eac5908754
Author:     Viresh Kumar
AuthorDate: Thu Mar 20 14:29:02 2014 +0530
Commit:     Viresh Kumar
CommitDate: Wed Apr 2 14:54:57 2014 +0530

    timer: don't migrate pinned timers

    migrate_timer() is called when a CPU goes down and its timers are
    required to be migrated to some other CPU. It's the responsibility of
    the users of the timer to remove it before control reaches
    migrate_timers(). As these were the pinned timers, the best we can do
    is: don't migrate these, and report it to the user as well. That's all
    this patch does.

    Signed-off-by: Viresh Kumar

=== PARENT COMMIT NOT CLEAN. LOOK OUT FOR WRONG BISECT! ===

Attached one more dmesg for the NULL pointer bug in the parent commit.
+--------------------------------------------------------+------------+------------+------------+
|                                                        | 5a8530b7c3 | 6378cb51af | 7caf71f403 |
+--------------------------------------------------------+------------+------------+------------+
| boot_successes                                         | 103        | 14         | 10         |
| boot_failures                                          | 17         | 18         | 13         |
| BUG:unable_to_handle_kernel_NULL_pointer_dereference   | 16         |            |            |
| Oops:SMP                                               | 16         |            |            |
| Kernel_panic-not_syncing:Fatal_exception               | 16         |            |            |
| backtrace:vfs_read                                     | 16         |            |            |
| backtrace:SyS_read                                     | 16         |            |            |
| BUG:kernel_test_crashed                                | 1          |            |            |
| WARNING:CPU:PID:at_kernel/timer.c:migrate_timer_list() | 0          | 17         | 12         |
| backtrace:vfs_write                                    | 0          | 17         | 12         |
| backtrace:SyS_write                                    | 0          | 17         | 12         |
| BUG:kernel_early_hang_without_any_printk_output        | 0          | 1          | 1          |
+--------------------------------------------------------+------------+------------+------------+

[ 74.242293] Unregister pv shared memory for cpu 1
[ 74.273280] smpboot: CPU 1 is now offline
[ 74.274685] [ cut here ]
[ 74.275524] WARNING: CPU: 0 PID: 1935 at kernel/timer.c:1621 migrate_timer_list+0xd6/0xf0()
[ 74.275524] migrate_timer_list: can't migrate pinned timer: 81f06a60, deactivating it

Hmm, nice. So my patch hasn't created a bug, but just highlighted one. I have added this piece of code while migrating timers away:

if (unlikely(WARN(is_pinned,
		  "%s: can't migrate pinned timer: %p, deactivating it\n",
		  __func__, timer)))

which means: migrate all timers to other CPUs when a CPU is going down. But obviously we can't migrate the pinned timers, and it looks like we actually were doing that before this commit and things went unnoticed. Just due to this print, we are highlighting an existing issue here. @Thomas: So, in a sense, my patch is doing some good work now :) Now we need to fix the code which queued this pinned timer. @Fengguang: As I don't have the facilities to reproduce this, can you help me debugging
Re: [mtd/sbc_gxx] kernel BUG at include/linux/mtd/map.h:148!
Hi Michal, your patch fixes the problem.

Tested-by: Jet Chen

Thanks, -Jet

On 03/31/2014 10:35 PM, Michal Marek wrote: On Mon, Mar 31, 2014 at 07:34:12PM +0800, Fengguang Wu wrote: CC Michal and kbuild list. On Thu, Mar 27, 2014 at 04:51:53PM -0600, Bjorn Helgaas wrote: [+cc David, Brian] On Thu, Mar 27, 2014 at 8:01 AM, Fengguang Wu wrote:

FYI, here is a very old warning, too old to be bisected.

[5.282127] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, supports DPO and FUA
[5.286079] SBC-GXx flash: IO:0x258-0x259 MEM:0xdc000-0xd
[5.288723] [ cut here ]
[5.289649] kernel BUG at include/linux/mtd/map.h:148!

I think the problem is that your randconfig happens to have none of CONFIG_MTD_MAP_BANK_WIDTH_* set (you should have played the lottery today!), and the default implementation of map_bankwidth() in that case is just "BUG()":

$ grep MTD_MAP config-3.14.0-rc8-wl-03045-gdf16ea4
# CONFIG_MTD_MAP_BANK_WIDTH_1 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_2 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_4 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_8 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_16 is not set
# CONFIG_MTD_MAP_BANK_WIDTH_32 is not set

I don't know enough Kconfig or MTD to fix this. Michal, the problem we run into is: how do we select one of the above CONFIG_MTD_MAP_BANK_WIDTH_xx options in "make randconfig"?

You can't enforce this in Kconfig. What you can do is make the logic more fool-proof and fall back to some sensible default if none of the CONFIG_MTD_MAP_BANK_WIDTH_* options is enabled. Like this:

From 76e66ceea7e2ffbb1d39c01af2eaf6f2b66c2be3 Mon Sep 17 00:00:00 2001
From: Michal Marek
Date: Mon, 31 Mar 2014 16:25:32 +0200
Subject: [PATCH] mtd: Fall back to MTD_MAP_BANK_WIDTH_1 if none is specified

This is mainly to fix make randconfig errors.
Signed-off-by: Michal Marek
---
 drivers/mtd/chips/Kconfig | 8 ++++++++
 include/linux/mtd/map.h   | 4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/mtd/chips/Kconfig b/drivers/mtd/chips/Kconfig
index e4696b3..219de49 100644
--- a/drivers/mtd/chips/Kconfig
+++ b/drivers/mtd/chips/Kconfig
@@ -115,6 +115,14 @@ config MTD_MAP_BANK_WIDTH_32
 	  If you wish to support CFI devices on a physical bus which is
 	  256 bits wide, say 'Y'.
 
+config HAVE_MTD_MAP_BANK_WIDTH_1
+	bool
+	default MTD_MAP_BANK_WIDTH_1 || (!MTD_MAP_BANK_WIDTH_2 && !MTD_MAP_BANK_WIDTH_4 && !MTD_MAP_BANK_WIDTH_8 && !MTD_MAP_BANK_WIDTH_16 && !MTD_MAP_BANK_WIDTH_32)
+
+if HAVE_MTD_MAP_BANK_WIDTH_1 && !MTD_MAP_BANK_WIDTH_1
+comment "no buswidth selected, using 8-bit as a fallback"
+endif
+
 config MTD_CFI_I1
 	bool "Support 1-chip flash interleave" if MTD_CFI_GEOMETRY
 	default y

diff --git a/include/linux/mtd/map.h b/include/linux/mtd/map.h
index 5f487d7..d4f4f49 100644
--- a/include/linux/mtd/map.h
+++ b/include/linux/mtd/map.h
@@ -32,7 +32,7 @@
 #include <asm/io.h>
 #include <asm/barrier.h>
 
-#ifdef CONFIG_MTD_MAP_BANK_WIDTH_1
+#ifdef CONFIG_HAVE_MTD_MAP_BANK_WIDTH_1
 #define map_bankwidth(map) 1
 #define map_bankwidth_is_1(map) (map_bankwidth(map) == 1)
 #define map_bankwidth_is_large(map) (0)
@@ -156,7 +156,7 @@ static inline int map_bankwidth(void *map)
 static inline int map_bankwidth_supported(int w)
 {
 	switch (w) {
-#ifdef CONFIG_MTD_MAP_BANK_WIDTH_1
+#ifdef CONFIG_HAVE_MTD_MAP_BANK_WIDTH_1
 	case 1:
 #endif
 #ifdef CONFIG_MTD_MAP_BANK_WIDTH_2
Re: [sched/idle] c365c292d05: ltp.sched_rr_get_interval02.1.TFAIL
Hi Peter,

Does http://lkml.kernel.org/r/20140301191838.d15d03112b2598a671dac...@gmail.com fix it?

Both patches fix the LTP regression. We applied each of Juri's and Kirill's patches on top of Thomas' commit; here are the comparisons:

test case: ltp/syscalls

        Thomas'                     Juri's
  c365c292d05908c  ffc78ca31d51f90c516bb50cf
  ---------------  -------------------------
        1 ~ 0%      -100.0%          0 ~ 0%  TOTAL ltp.sched_rr_get_interval02.1.TFAIL
    73790 ~ 5%       -14.1%      63363 ~ 6%  TOTAL interrupts.IWI
     3040 ~ 4%        -6.7%       2836 ~ 5%  TOTAL slabinfo.kmalloc-128.num_objs

        Thomas'                   Kirill's
  c365c292d05908c  f4262311faf6f326bf27fb1f4
  ---------------  -------------------------
        1 ~ 0%      -100.0%          0 ~ 0%  TOTAL ltp.sched_rr_get_interval02.1.TFAIL
     3618 ~ 6%       -13.2%       3141 ~ 6%  TOTAL slabinfo.anon_vma.active_objs
     3671 ~ 5%       -11.3%       3258 ~ 4%  TOTAL slabinfo.anon_vma.num_objs
     8.41 ~ 2%        -5.1%       7.98 ~ 3%  TOTAL boottime.dhcp
    15.90 ~ 2%        -3.2%      15.40 ~ 2%  TOTAL boottime.boot
    55.18 ~ 1%        -2.6%      53.73 ~ 1%  TOTAL boottime.idle

Thanks, -Jet