from:"Tetsuo Handa"

Re: [PATCH] x86/microcode_intel_early.c: Get 32-bit physical address by __pa_nodebug()

2013-03-19 Thread Tetsuo Handa

Fenghua Yu wrote:
> From: Fenghua Yu 
> 
> In 32-bit, __pa_symbol() in CONFIG_DEBUG_VIRTUAL accesses kernel data (e.g.
> max_low_pfn) that haven't been setup yet in such early boot phase. To fix the
> issue, __pa_nodebug() replaces __pa_symbol() to get a global symbol's physical
> address.
> 
> Signed-off-by: Fenghua Yu 
> ---
>  arch/x86/kernel/microcode_intel_early.c | 26 +-
>  1 file changed, 13 insertions(+), 13 deletions(-)
> 
This patch fixes my problem. Thank you.
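
For readers following the thread, the difference between the two translations
can be pictured with a toy user-space model. This is only a sketch under
assumptions: PAGE_OFFSET, max_low_pfn as a plain variable, pa_nodebug_model()
and pa_checked_model() are stand-ins invented here, and the check is a
simplification of what CONFIG_DEBUG_VIRTUAL does, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 32-bit layout: kernel virtual addresses start at PAGE_OFFSET. */
#define PAGE_OFFSET 0xC0000000u
#define PAGE_SHIFT  12

/* Stand-in for a kernel global that is only initialised later in boot. */
static uint32_t max_low_pfn; /* stays 0 until "memory setup" has run */

/* Model of an unchecked translation: pure arithmetic, reads no globals. */
static uint32_t pa_nodebug_model(uint32_t vaddr)
{
	return vaddr - PAGE_OFFSET;
}

/*
 * Model of a checked translation: it consults max_low_pfn, so its answer
 * depends on state that may not be set up yet. Returns 0 on success and
 * -1 where a debug build would trip an assertion.
 */
static int pa_checked_model(uint32_t vaddr, uint32_t *paddr)
{
	uint32_t phys = vaddr - PAGE_OFFSET;

	if ((phys >> PAGE_SHIFT) > max_low_pfn)
		return -1;
	*paddr = phys;
	return 0;
}
```

In this model the unchecked helper gives the same answer at any point in boot,
while the checked helper fails until max_low_pfn has been assigned.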
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[linux-next-20130205] Bug in bootup code or debug code?

2013-02-05 Thread Tetsuo Handa

Hello.

I can boot linux-next-20130205 using kernel config at
http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.

Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
kernel config generated by "make allnoconfig", I guess something is wrong with
code which is executed at very early stage of bootup.

Any clue?

Regards.

Re: [3.8-rc4 arm] SCSI_SYM53C8XX_2 module cannot register IRQ

2013-01-26 Thread Tetsuo Handa

Tetsuo Handa wrote:
> > I'm sorry if this is a bit complicated but bisection down several
> > regressions is one of the more advanced forms of regression
> > bug hunt... You're quite cool if you can handle it :-)
> 
> Sorry, I'm a beginner about git. I already know that 07c9249f is the cause of
> "no console messages" and 5c49985c is the fix for "no console messages".
> How can I setup a snapshot with 07c9249f and 5c49985c applied (I tried
> 
>   $ git reset --hard 07c9249f
>   $ git format-patch --stdout 5c49985c^..5c49985c | patch -p1
> 
> but patch does not apply cleanly) so that I can find the commit causing
> "sym0: request irq 27 failure" which is between 07c9249f and 5c49985c?

I did a blind git bisection (i.e. starting

  $ qemu-system-arm -M versatilepb -hda hda.img -kernel arch/arm/boot/zImage \
      -append "root=/dev/sda1 init=/bin/sh" -nographic

and watching "top" for %CPU usage of qemu-system-arm , assuming that it goes to
100% only if detection of block device for / partition failed and kernel called
panic(), goes to 0% otherwise) in two patterns.

  $ git bisect start HEAD b1112249 v3.7 v3.6 v3.5 v3.4 v3.3 v3.2 v3.1 v3.0 -- \
      arch/arm drivers/scsi/sym53c8xx_2/ drivers/scsi/*.[ch]

  $ git bisect start v3.8-rc1 95e629b7 b8db6b8 810883f0 b10bca0b 14318efb \
      414a6750e b1112249 v3.7 v3.6 v3.5 v3.4 v3.3 v3.2 v3.1 v3.0

Both patterns identified commit 07c9249f
"ARM: 7554/1: VIC: use irq_domain_add_simple()" as the cause of the following
messages:

  PCI: enabling device :00:0c.0 (0100 -> 0103)
  sym0: <895a> rev 0x0 at pci :00:0c.0 irq 27
  sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking
  sym0: request irq 27 failure
  sym0: giving up ...

Would you have a look?

Regards.

Re: [3.8-rc4 arm] SCSI_SYM53C8XX_2 module cannot register IRQ

2013-01-26 Thread Tetsuo Handa

Linus Walleij wrote:
> I'm trying to reproduce this, but how do you reconfigure the kernel to
> get PCI, SCSI and such stuff enabled?
> 
> The stock versatile_defconfig does not even have SCSI enabled...

I'm using a customized config for qemu. I've just updated the config to
http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc1-arm .

Regards.

Re: [PATCH v2] ARM: versatile: fix the PCI IRQ regression

2013-01-28 Thread Tetsuo Handa

Linus Walleij wrote:
> ---
> ChangeLog v1->v2:
> - Got the PIC/SIC valid mask wrong again, take a deep breath
>   and use the foolproof method of defining each bit so that
>   nobody will get caught by this again. Didn't see this first
>   because the SCSI layer takes for ever to time out.
> ---
>  arch/arm/mach-versatile/core.c | 15 ++-
>  arch/arm/mach-versatile/pci.c  | 11 ++-
>  2 files changed, 20 insertions(+), 6 deletions(-)
> 

This patch fixes my problem. Please send to 3.8. Thank you!

Re: [3.9-rc1] Bug in bootup code or debug code?

2013-03-13 Thread Tetsuo Handa

Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Hello.
> > 
> > I can boot linux-next-20130205 using kernel config at
> > http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> > But I get VMware's virtual machine kernel stack fault (hardware reset) as
> > soon as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config
> > above.
> > 
> > Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> > kernel config generated by "make allnoconfig", I guess something is wrong
> > with code which is executed at very early stage of bootup.
> > 
> > Any clue?
> > 
> > Regards.
> > 
> 
> This bug is not yet fixed as of 3.9-rc1.
> Should I run git bisect?
> 
> Regards.
> 

I found what triggers the "hardware reset".

It is the __pa_symbol(&boot_params) call: I don't encounter the "hardware
reset" if I remove the "//" from the debug patch below.

This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .

--- a/arch/x86/kernel/microcode_intel_early.c
+++ b/arch/x86/kernel/microcode_intel_early.c
@@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
 #ifdef CONFIG_X86_32
 	struct boot_params *boot_params_p;

+	//while (1);
 	boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
+	while (1);
 	ramdisk_image = boot_params_p->hdr.ramdisk_image;
 	ramdisk_size  = boot_params_p->hdr.ramdisk_size;
 	initrd_start_early = ramdisk_image;

Re: [3.9-rc1] Bug in bootup code or debug code?

2013-03-13 Thread Tetsuo Handa

H. Peter Anvin wrote:
> On 03/13/2013 08:22 AM, Yu, Fenghua wrote:
> >>
> >> I found the location of "hardware reset" trigger.
> >>
> >> It is __pa_symbol(&boot_params) call, for I don't encounter "hardware
> >> reset" if
> >> I remove the "//" from below debug patch.
> >>
> >> This bug is not yet fixed as of 3.9.0-rc2-00188-g6c23cbb .
> >>
> >> --- a/arch/x86/kernel/microcode_intel_early.c
> >> +++ b/arch/x86/kernel/microcode_intel_early.c
> >> @@ -741,7 +741,9 @@ load_ucode_intel_bsp(void)
> >>  #ifdef CONFIG_X86_32
> >> struct boot_params *boot_params_p;
> >>
> >> +   //while (1);
> >> boot_params_p = (struct boot_params *)__pa_symbol(&boot_params);
> >> +   while (1);
> >> ramdisk_image = boot_params_p->hdr.ramdisk_image;
> >> ramdisk_size  = boot_params_p->hdr.ramdisk_size;
> >> initrd_start_early = ramdisk_image;
> > 
> > Tetsuo and Dave,
> > 
> > That's the place where we suspected to cause the problem.
> > 
> > My question is: how to access global variable in linear mode in
> > virtualization? __pa_symbol() is not a problem for native.
> > 
> 
> What kind of virtualization are we talking about here?  We should not be
> running this code under any paravirtualized code path -- this is the
> hypervisor's job to take care of this.  For HVM, this should just work
> the same way.
> 
>   -hpa
> 
H. Peter Anvin wrote:
> This is a CONFIG_DEBUG_VIRTUAL configuration, isn't it?

Yes. CONFIG_MICROCODE_INTEL_EARLY=y && CONFIG_64BIT=n && CONFIG_DEBUG_VIRTUAL=y
on VMware Workstation/Player environment.

[2.6.24-mm1] TCP/IPv6 connect() oopses at twothirdsMD4Transform()

2008-02-04 Thread Tetsuo Handa

Hello.

Kernel config is at http://I-love.SAKURA.ne.jp/tmp/config-2.6.24-mm1

2.6.24 works fine.

Regards.
--
BUG: unable to handle kernel paging request at 25476bec
IP: [] twothirdsMD4Transform+0x78/0x37c
*pde =  
Oops:  [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci:00/:00:10.0/host0/target0:0:1/0:0:1:0/type
Modules linked in: nfsd lockd sunrpc exportfs pcnet32

Pid: 2148, comm: a.out Not tainted (2.6.24-mm1 #1)
EIP: 0060:[] EFLAGS: 00010286 CPU: 0
EIP is at twothirdsMD4Transform+0x78/0x37c
EAX: 00084000 EBX: 0800 ECX: 8000 EDX: db45ddec
ESI:  EDI: 52806380 EBP: db45dddc ESP: db45ddc8
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
Process a.out (pid: 2148, ti=db45d000 task=deaf9250 task.ti=db45d000)
Stack: 8000 def6ef9c 6380 c0759d60 db45de1c db45de28 c0211fd2 0040 
   c0759d40    0100 52806380 1f2e00ba fffa249f 
   5a696b37 8dbe1970 cf7579d0 3b0cc350 a54b10a8 def6e9a0  def6ef8c 
Call Trace:
 [] ? secure_tcpv6_sequence_number+0x58/0x7a
 [] ? tcp_v6_connect+0x46d/0x4e3
 [] ? lock_sock_nested+0x56/0x5e
 [] ? inet_stream_connect+0x1c/0x163
 [] ? inet_stream_connect+0x92/0x163
 [] ? sys_connect+0x72/0x98
 [] ? lock_release_holdtime+0x4e/0x54
 [] ? do_page_fault+0x1c5/0x3fc
 [] ? __lock_release+0x4b/0x51
 [] ? do_page_fault+0x1c5/0x3fc
 [] ? sys_socketcall+0x6f/0x15e
 [] ? restore_nocheck+0x12/0x15
 [] ? syscall_call+0x7/0xb
 ===
Code: 31 c1 03 0c ba 8b 7a 0c 01 ce 8b 4d ec c1 c6 0b 31 d9 21 f1 31 d9 03 0c 
ba 8b 7a 10 01 c8 8b 4d ec c1 c0 13 31 f1 21 c1 33 4d ec <03> 0c ba 8b 7a 14 01 
cb 89 c1 c1 c3 03 31 f1 21 d9 31 f1 03 0c 
EIP: [] twothirdsMD4Transform+0x78/0x37c SS:ESP 0068:db45ddc8
---[ end trace 160518059a282c77 ]---

What id does "current->pid" indicate?

2008-02-04 Thread Tetsuo Handa

Hello.

I found that there are "current->pid", "task_pid_vnr(current)"
and "task_pid_nr(current)" cases in kernel 2.6.24 .

According to include/linux/pid.h ,
"task_pid_nr()" is global id and "task_xid_vnr()" is virtual id.
But what id does "current->pid" indicate?
Is "current->pid" equivalent to "task_pid_nr(current)" ?

Regards.
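
A small user-space model may help picture the distinction being asked about.
It is a sketch under assumptions: struct task_model, MAX_NS_LEVEL and the
*_model() helpers are invented for illustration. As far as I recall,
task_pid_nr() in 2.6.24 simply returns tsk->pid, which would make the two
equivalent, but please check include/linux/sched.h rather than rely on this.

```c
#include <assert.h>

#define MAX_NS_LEVEL 4 /* hypothetical depth limit for this sketch */

/* One task with one id per pid namespace level it is visible in. */
struct task_model {
	int level;            /* depth of the namespace the task lives in */
	int nr[MAX_NS_LEVEL]; /* nr[0] is the id in the initial namespace */
	int pid;              /* models tsk->pid, kept equal to nr[0] */
};

/* Global id: the id as seen from the initial namespace. */
static int task_pid_nr_model(const struct task_model *t)
{
	return t->pid;
}

/* Id as seen from a namespace at the given level, 0 if not visible. */
static int pid_nr_ns_model(const struct task_model *t, int ns_level)
{
	if (ns_level < 0 || ns_level > t->level)
		return 0;
	return t->nr[ns_level];
}

/* Virtual id of the running task: the id inside its own namespace. */
static int task_pid_vnr_model(const struct task_model *current_task)
{
	return pid_nr_ns_model(current_task, current_task->level);
}
```

For a task in the initial namespace both helpers return the same number; they
differ only once the task lives in a nested namespace.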

Re: [2.6.24-mm1] TCP/IPv6 connect() oopses at twothirdsMD4Transform()

2008-02-04 Thread Tetsuo Handa

Hello.

> random: revert braindamage that snuck into checkpatch cleanup
> 
> Signed-off-by: Matt Mackall <[EMAIL PROTECTED]>

Yes. It solved the oops.

Thank you.

Re: [PATCH] exec: do not leave bprm->interp on stack

2012-10-25 Thread Tetsuo Handa

P J P wrote:
> 
>   Hello Kees,
> 
> +-- On Wed, 24 Oct 2012, Kees Cook wrote --+
> | What should the code here _actually_ be doing? The _script and _misc 
> | handlers expect to rewrite the bprm contents and recurse, but the module 
> | loader want to try again. It's not clear to me what the binfmt module 
> | handler is even there for; I don't see any binfmt- aliases in the tree. 
> | If nothing uses it, should we just rip it out? That would solve it too.
> 
> I've been following this issue and updated versions of HDs patch. Below is a 
> small patch to search_binary_handler() routine, which attempts to make the 
> request_module call before calling load_script routine.
> 
> Besides fixing the stack disclosure issue it also helps to *simplify* the 
> search_binary_handler routine by removing the -for (try=0;try<2;try++)- loop.
> 
> I'd really appreciate any comments/suggestions you may have.

Excuse me, but why do you change the definition of printable(c)?
It looks like a regression.

Wouldn't your patch trigger a call to request_module() whenever a script
starting with "#!/bin/sh" is executed?

And if you meant

	if (!(printable(bprm->buf[0]) && printable(bprm->buf[1])
	      && printable(bprm->buf[2]) && printable(bprm->buf[3])))

then, wouldn't that trigger request_module() recursion?
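
To make the concern concrete, here is a minimal user-space sketch of the
four-byte check. The printable() macro is written from my memory of fs/exec.c
and should be treated as an assumption; first_four_printable() and
would_request_module() are hypothetical helpers, not kernel functions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed definition: tab, newline, or a visible ASCII character. */
#define printable(c) \
	(((c) == '\t') || ((c) == '\n') || (0x20 <= (c) && (c) <= 0x7e))

/* True when the first four bytes of the file header all look like text. */
static bool first_four_printable(const char *buf)
{
	return printable(buf[0]) && printable(buf[1]) &&
	       printable(buf[2]) && printable(buf[3]);
}

/*
 * Hypothetical model of the decision: a module lookup is attempted only
 * when the header does not look like text.
 */
static bool would_request_module(const char *buf)
{
	return !first_four_printable(buf);
}
```

With this definition a script beginning with "#!/bin/sh" never reaches the
module lookup, so any change that widens the lookup to text headers would fire
on every script execution.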

[linux-next-20130822] module: broken module versions?

2013-08-26 Thread Tetsuo Handa

Hello.

I'm seeing errors that several symbols have bad header (bad CRC ?)
as of linux-next-20130822.

  binfmt_misc: disagrees about version of symbol sys_close
  binfmt_misc: Unknown symbol sys_close (err -22)
  binfmt_misc: disagrees about version of symbol current_fs_time
  binfmt_misc: Unknown symbol current_fs_time (err -22)
  ipv6: disagrees about version of symbol sock_register
  ipv6: Unknown symbol sock_register (err -22)
  ipv6: disagrees about version of symbol ns_capable
  ipv6: Unknown symbol ns_capable (err -22)

next-20130815 worked OK.
I guess that the culprit commit touches some module-related code, but I'm
unable to run a full bisection because 3.11-rc5 is unbootable. Any clue?

Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.11-rc6-next-20130822 .

--
# bad: [66a01bae29d11916c09f9f5a937cafe7d402e4a5] Add linux-next specific files for 20130819
# good: [49dfe76261e427f5521b40321fbc3d947350165d] Documentation: add networking/netdev-FAQ.txt
# good: [cf3c4c03060b688cbc389ebc5065ebcce5653e96] 8139cp: Add dma_mapping_error checking
# good: [093aba10e6a88a28fdc7d6ee8ec69aebd9767a22] Merge branch 'for-3.12/upstream' into for-next
# good: [d010e5769a5ab2ae8d2bcb36e77b98172c24d80c] PCI / ACPI: Use dev_dbg() instead of dev_info() in acpi_pci_set_power_state()
# good: [569935db80fd5338005d977ffc3428d43aad84ba] Merge branches 'cma', 'cxgb3', 'cxgb4', 'ipoib', 'misc', 'mlx4', 'mlx5', 'nes', 'ocrdma' and 'qib' into for-next
# good: [e729eac6f65e11c5f03b09adcc84bd5bcb230467] udf: Refuse RW mount of the filesystem instead of making it RO
# good: [3fef7f795fff7ccc58d55a28315ca73305515884] Merge remote-tracking branch 'asoc/fix/wm0010' into asoc-linus
# good: [11a45820d02ee78ad22bb95d5abb94950a355d8d] Merge tag 'nfc-fixes-3.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/nfc-fixes
# good: [8fe120b5a665fc869c23f86e4964b801f6e53486] ASoC: omap-abe-twl6040: Remove support for pdata (legacy boot)
# good: [66ffd113f5d81e951b0379acfd0a1df0771d8828] cifs: set sb->s_d_op before calling d_make_root()
# good: [6a3fc31f35025a583499c0d3b1c6fc5dcf6e48ec] spi/ep93xx: Fix format specifier warning
# good: [2de024b766bb9e31c357f70c6344d1107f38ce1a] spi/atmel: Fix format specifier warnings
# good: [e56566699ca64ec44dd134ec5310a3585ffacfec] regulator: pfuze100: Simplify pfuze100_set_ramp_delay implementation
# good: [6878ea72a5d1aa6caae86449975a50b7fe9abed5] aio: be defensive to ensure request batching is non-zero instead of BUG_ON()
# good: [70263cb474853c116f80713d468f3c17d805921c] ASoC: rcar: fix return value check in rsnd_gen1_probe()
# good: [3f1a91aa25579ba5e7268a47a73d2a83e4802c62] ASoC: fsl: Fix module build
# good: [db5ff9541b61ef8394bad0fb05508921b8c5b17b] ASoC: spdif: Add S20_3LE and S24_LE support for dummy codec drivers
# good: [01c6a6afd50f07dfd66b2891fd194c4b789fca48] x86 / tboot / ACPI: Fail extended mode reduced hardware sleep
# good: [21aa521727a004ce4a4c46e9a5cb7e34619b4d16] ARM: dts: Add G2D support to exynos5250
# good: [7b4e0c4ac1809eab6fcfe6818ec8b70be79b41bc] ACPI / PM: Remove redundant power manageable check from acpi_bus_set_power()
# good: [d29cae9f8f2a056138d1bfc55de6e5aa9b9c6a93] Merge branch 'for-3.12' into for-next
# good: [b36f4be3de1b123d8601de062e7dbfc904f305fb] Linux 3.11-rc6
# good: [d4e4ab86bcba5a72779c43dc1459f71fea3d89c8] Linux 3.11-rc5
# good: [c095ba7224d8edc71dcef0d655911399a8bd4a3f] Linux 3.11-rc4
# good: [5ae90d8e467e625e447000cb4335c4db973b1095] Linux 3.11-rc3
# good: [3b2f64d00c46e1e4e9bd0bb9bb12619adac27a4b] Linux 3.11-rc2
# good: [ad81f0545ef01ea651886dddac4bef6cec930092] Linux 3.11-rc1
# good: [8bb495e3f02401ee6f76d1b1d77f3ac9f079e376] Linux 3.10
# good: [c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1] Linux 3.9
# good: [19f949f52599ba7c3f67a5897ac6be14bfcb1200] Linux 3.8
# good: [29594404d7fe73cd80eaa4ee8c43dcc53970c60e] Linux 3.7
# good: [a0d271cbfed1dd50278c6b06bead3d00ba0a88f9] Linux 3.6
# good: [28a33cbc24e4256c143dce96c7d93bf423229f92] Linux 3.5
# good: [76e10d158efb6d4516018846f60c2ab5501900bc] Linux 3.4
# good: [c16fa4f2ad19908a47c63d8fa436a1178438c7e7] Linux 3.3
# good: [805a6af8dba5dfdd35ec35dc52ec0122400b2610] Linux 3.2
# good: [c3b92c8787367a8bb53d57d9789b558f1295cc96] Linux 3.1
# good: [02f8c6aee8df3cdc935e9bdd4f2d020306035dbe] Linux 3.0
git bisect start '66a01bae29d11916c09f9f5a937cafe7d402e4a5' 
'49dfe76261e427f5521b40321fbc3d947350165d' 
'cf3c4c03060b688cbc389ebc5065ebcce5653e96' \
'093aba10e6a88a28fdc7d6ee8ec69aebd9767a22' 
'd010e5769a5ab2ae8d2bcb36e77b98172c24d80c' 
'569935db80fd5338005d977ffc3428d43aad84ba' \
'e729eac6f65e11c5f03b09adcc84bd5bcb230467' 
'3fef7f795fff7ccc58d55a28315ca73305515884' 
'11a45820d02ee78ad22bb95d5abb94950a355d8d' \
'8fe120b5a665fc869c23f86e4964b801f6e53486' 
'66ffd113f5d81e951b0379acfd0a1df0771d8828' 
'6a3fc31f35025a583499c0d3b1c6fc5dcf6e48ec' \
'2de024b766bb9e31c357f70c6344d1107f38ce1a' 
'e56566699ca64ec44dd134ec5310a3585ffacfec' 
'6878ea72a5d1aa6caae86449975a50b7fe9abed5' \
'70263cb474853c116

Re: [linux-next-20130822] module: broken module versions?

2013-08-27 Thread Tetsuo Handa

I noticed that symbols which cause "disagrees about version of symbol" messages
have crc == 0.

-- scripts/mod/modpost.c --
 	/* CRC'd symbol */
 	if (strncmp(symname, CRC_PFX, strlen(CRC_PFX)) == 0) {
 		crc = (unsigned int) sym->st_value;
+		if (!crc)
+			fprintf(stderr, "symbol %s has crc=0 sym->st_shndx=%d\n",
+				symname, sym->st_shndx);
 		sym_update_crc(symname + strlen(CRC_PFX), mod, crc,
 				export);
 	}
-- scripts/mod/modpost.c --
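
The same check can be tried outside of modpost with a small stand-alone
sketch. Everything here is an assumption for illustration: struct sym_model,
sample_syms and count_zero_crc() are invented names, the "__crc_" prefix is
taken from the messages in this thread, and the non-zero value in the sample
table is made up.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CRC_PFX "__crc_" /* prefix assumed from the build messages */

/* Simplified stand-in for an ELF symbol: just a name and a value. */
struct sym_model {
	const char *name;
	unsigned int value;
};

/* Hypothetical sample table; the non-zero CRC value is invented. */
static const struct sym_model sample_syms[] = {
	{ "__crc_sys_close", 0 },
	{ "__crc_example_ok", 0x1234abcdu },
	{ "sys_close", 0 },
};

/* Count symbols that carry the CRC prefix but whose value is zero. */
static size_t count_zero_crc(const struct sym_model *syms, size_t n)
{
	size_t i, zero = 0;

	for (i = 0; i < n; i++) {
		if (strncmp(syms[i].name, CRC_PFX, strlen(CRC_PFX)) != 0)
			continue; /* not a CRC symbol */
		if (syms[i].value == 0)
			zero++;
	}
	return zero;
}
```

Only the first entry is counted: the second has a non-zero value and the third
lacks the prefix.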

I got below messages upon build.

  symbol __crc_sys_close has crc=0 sym->st_shndx=0
  symbol __crc_path_is_under has crc=0 sym->st_shndx=0
  symbol __crc_task_nice has crc=0 sym->st_shndx=0
  symbol __crc_vfs_fsync_range has crc=0 sym->st_shndx=0
  symbol __crc___symbol_put has crc=0 sym->st_shndx=0
  symbol __crc_iov_shorten has crc=0 sym->st_shndx=0
  symbol __crc_inode_add_bytes has crc=0 sym->st_shndx=0
  symbol __crc_sock_register has crc=0 sym->st_shndx=0
  symbol __crc_vm_brk has crc=0 sym->st_shndx=0
  symbol __crc_kern_mount_data has crc=0 sym->st_shndx=0
  symbol __crc_schedule_timeout has crc=0 sym->st_shndx=0
  symbol __crc_generic_getxattr has crc=0 sym->st_shndx=0
  symbol __crc_in_group_p has crc=0 sym->st_shndx=0
  symbol __crc_finish_open has crc=0 sym->st_shndx=0
  symbol __crc_get_unmapped_area has crc=0 sym->st_shndx=0
  symbol __crc_generic_write_sync has crc=0 sym->st_shndx=0
  symbol __crc_filp_close has crc=0 sym->st_shndx=0
  symbol __crc_d_tmpfile has crc=0 sym->st_shndx=0
  symbol __crc_iterate_fd has crc=0 sym->st_shndx=0
  symbol __crc_register_exec_domain has crc=0 sym->st_shndx=0
  symbol __crc_ns_capable has crc=0 sym->st_shndx=0
  symbol __crc___page_file_mapping has crc=0 sym->st_shndx=0
  symbol __crc_mnt_set_expiry has crc=0 sym->st_shndx=0
  symbol __crc_do_sync_read has crc=0 sym->st_shndx=0
  symbol __crc_vfs_test_lock has crc=0 sym->st_shndx=0
  symbol __crc_perf_event_create_kernel_counter has crc=0 sym->st_shndx=0
  symbol __crc_current_fs_time has crc=0 sym->st_shndx=0
  symbol __crc_softirq_work_list has crc=0 sym->st_shndx=0
  symbol __crc_sys_close has crc=0 sym->st_shndx=0
  symbol __crc_path_is_under has crc=0 sym->st_shndx=0
  symbol __crc_task_nice has crc=0 sym->st_shndx=0
  symbol __crc_vfs_fsync_range has crc=0 sym->st_shndx=0
  symbol __crc___symbol_put has crc=0 sym->st_shndx=0
  symbol __crc_iov_shorten has crc=0 sym->st_shndx=0
  symbol __crc_inode_add_bytes has crc=0 sym->st_shndx=0
  symbol __crc_sock_register has crc=0 sym->st_shndx=0
  symbol __crc_vm_brk has crc=0 sym->st_shndx=0
  symbol __crc_kern_mount_data has crc=0 sym->st_shndx=0
  symbol __crc_schedule_timeout has crc=0 sym->st_shndx=0
  symbol __crc_generic_getxattr has crc=0 sym->st_shndx=0
  symbol __crc_in_group_p has crc=0 sym->st_shndx=0
  symbol __crc_finish_open has crc=0 sym->st_shndx=0
  symbol __crc_get_unmapped_area has crc=0 sym->st_shndx=0
  symbol __crc_generic_write_sync has crc=0 sym->st_shndx=0
  symbol __crc_filp_close has crc=0 sym->st_shndx=0
  symbol __crc_d_tmpfile has crc=0 sym->st_shndx=0
  symbol __crc_iterate_fd has crc=0 sym->st_shndx=0
  symbol __crc_register_exec_domain has crc=0 sym->st_shndx=0
  symbol __crc_ns_capable has crc=0 sym->st_shndx=0
  symbol __crc___page_file_mapping has crc=0 sym->st_shndx=0
  symbol __crc_mnt_set_expiry has crc=0 sym->st_shndx=0
  symbol __crc_do_sync_read has crc=0 sym->st_shndx=0
  symbol __crc_vfs_test_lock has crc=0 sym->st_shndx=0
  symbol __crc_perf_event_create_kernel_counter has crc=0 sym->st_shndx=0
  symbol __crc_current_fs_time has crc=0 sym->st_shndx=0
  symbol __crc_softirq_work_list has crc=0 sym->st_shndx=0

I continued bisection in this way and found that commit 5c019369 "syscalls.h:
use gcc alias instead of assembler aliases for syscalls" started showing
below messages upon build.

  WARNING: "ns_capable" [net/ipv6/sit.ko] has no CRC!
  WARNING: "ns_capable" [net/ipv6/ipv6.ko] has no CRC!
  WARNING: "sock_register" [net/ipv6/ipv6.ko] has no CRC!
  WARNING: "ns_capable" [net/ipv4/ip_tunnel.ko] has no CRC!
  WARNING: "inode_add_bytes" [fs/udf/udf.ko] has no CRC!
  WARNING: "current_fs_time" [fs/udf/udf.ko] has no CRC!
  WARNING: "do_sync_read" [fs/udf/udf.ko] has no CRC!
  WARNING: "d_tmpfile" [fs/udf/udf.ko] has no CRC!
  WARNING: "sys_close" [fs/binfmt_misc.ko] has no CRC!
  WARNING: "current_fs_time" [fs/binfmt_misc.ko] has no CRC!
  WARNING: "vm_brk" [fs/binfmt_aout.ko] has no CRC!
  WARNING: "sys_close" [fs/autofs4/autofs4.ko] has no CRC!
  WARNING: "schedule_timeout" [drivers/hid/usbhid/usbhid.ko] has no CRC!

Reverting commit 5c019369 from linux-next-20130822 solved this problem.

Andi, would you check?

Regards.

Re: kthread: Make kthread_create() killable.

2013-10-01 Thread Tetsuo Handa

David Rientjes wrote:
> On Sat, 28 Sep 2013, Tetsuo Handa wrote:
> 
> > Some of enterprise users might prefer "kernel panic followed by kdump and
> > automatic reboot" to "a system is not responding for unpredictable period",
> > for the panic helps getting information for analyzing what process caused
> > the freeze. Well, can they use "Panic (Reboot) On Soft Lockups" option?
> > 
> 
> Or, when the system doesn't respond for a long period of time you do 
> sysrq+T and you find the TIF_MEMDIE bit set on a process that makes no 
> progress exiting.

In enterprise systems, an operator is not always sitting in front of the server
to press sysrq keys (nor keeping an ssh session open to issue sysrq via
procfs). The operator likely notices the problem many hours after the system
froze, finds that he/she can't log in, and presses the power reset button.
Rather than wasting many hours, an unattended automatic reboot might be
preferred.

> These instances _should_ be very rare since we don't 
> have any other reports of it (and the oom killer hasn't differed in this 
> regard for over three years).  It used to be much more common for 
> mm->mmap_sem dependencies that were fixed.
> 

Such reports in the real world might be rare, but I care about potential bugs
which can affect availability of the server.

If local unprivileged users can execute their own programs, they can easily
freeze the server. Therefore, I test whether such a freeze can happen using DoS
attacking programs executed by local unprivileged users. I confirmed that
request_module() can easily freeze the server, and the request_module() case
was fixed as CVE-2012-4398. I confirmed that kthread_create() can also freeze
the server (not easy to trigger, but it can happen by chance) and posted a
patch in this thread.

> > Currently the OOM killer kills a process after
> > 
> >   blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> > 
> > in out_of_memory() released all reclaimable memory.
> 
> The oom notifiers usually don't do any good on x86.
> 
> > This call helps reducing
> > the chance to kill a process if the bad process no longer asks for more
> > memory.
> 
> The "bad process" could be anything, it's simply the process that is 
> allocating memory when all memory is exhausted.
> 

I'm using "bad process" in the same sense as you.

> > But if the bad process continues asking for more memory and the chosen task
> > is in TASK_UNINTERRUPTIBLE state, this call helps the OOM killer to be
> > disabled for unpredictable period. Therefore, releasing all reclaimable
> > memory before the OOM killer kills a process might be considered bad.
> > 
> 
> I don't follow this statement, could you reword it?
> 
> If current calls the oom killer and the oom notifiers don't free any 
> memory (which is very likely), then choosing an uninterruptible process is 
> possible and has always been possible.

Yes, this does happen if a local unprivileged user who can execute his/her own
program consumes a lot of memory.

> If sending SIGKILL and giving that 
> process access to memory reserves does not allow it to exit in a short 
> amount of time, then it must be waiting on another process that also 
> cannot make forward process.

Yes. kthread_create(), do_coredump() and call_usermodehelper_keys() are
examples of such cases where I think I can trigger a deadlock using DoS
attacking programs executed by local unprivileged users.

>   We must identify these cases (which is 
> easily doable as described above) and fix them.
> 

I don't expect that we can identify all possible cases, because any blocking
function which waits in TASK_UNINTERRUPTIBLE is a candidate for such a case.
This is as problematic as GFP_NOFS allocation functions calling other functions
which do GFP_KERNEL allocation.

> > Then, what about an approach described below?
> > 
> > (1) Introduce a kernel thread which reserves (e.g.) 1 percent of kernel
> > memory (this amount should be configurable via sysctl) upon startup.
> > 
> 
> We don't need kernel threads, this is what per-zone memory reserves are 
> intended to provide for GFP_ATOMIC and TIF_MEMDIE allocations (or 
> PF_MEMALLOC for reclaimers).
> 

I know some amount of memory is reserved for GFP_ATOMIC/TIF_MEMDIE allocations.
What I'm talking about is GFP_KERNEL allocating processes which are preventing
TIF_MEMDIE process from terminating due to TASK_UNINTERRUPTIBLE wait.

> > (2) The kernel thread sleeps using wait_event(memory_reservoir_wait) and
> >

Re: kthread: Make kthread_create() killable.

2013-10-02 Thread Tetsuo Handa

79829] Swap cache stats: add 120624, delete 120624, find 23686/24787
[  416.081544] Free swap  = 0kB
[  416.083515] Total swap = 0kB
[  416.085827] 131071 pages RAM
[  416.086644] 0 pages HighMem
[  416.087686] 2948 pages reserved
[  416.088581] 523856 pages shared
[  416.089421] 126585 pages non-shared
[  416.090338] [ pid ]   uid  tgid total_vm  rss nr_ptes swapents oom_score_adj name
[  416.092365] [ 2943] 0  2943  625   94   20 -1000 udevd
[  416.095624] [ 3542] 0  3542 2732   45   26 -1000 auditd
[  416.097695] [ 3790] 0  3790 2171  122   40 -1000 sshd
[  416.099690] [ 3888] 0  3888  624   95   20 -1000 udevd
[  416.101750] [ 3889] 0  3889  624   93   20 -1000 udevd
[  416.104906] [ 4074] 0  4074  993  106   40 0 login
[  416.106926] [ 4075] 0  4075  513   14   30 0 mingetty
[  416.109019] [ 4076] 0  4076  513   13   30 0 mingetty
[  416.111072] [ 4077] 0  4077  513   14   30 0 mingetty
[  416.113162] [ 4078] 0  4078  513   13   30 0 mingetty
[  416.116363] [ 4079] 0  4079  993  107   30 0 login
[  416.118357] [ 4080] 0  4080 1611   70   40 0 bash
[  416.120354] [ 4093]   500  4093 1611   60   40 0 bash
[  416.122327] [ 4110] 0  4110   117665   117200 1170 0 a.out
[  416.125436] [ 4111]   500  4111  478   10   30 0 memeater
[  416.127509] Out of memory: Kill process 4110 (a.out) score 885 or sacrifice child
[  416.129458] Killed process 4110 (a.out) total-vm:470660kB, anon-rss:468800kB, file-rss:0kB
[  417.096035] * Calling call_usermodehelper_exec()
[  600.976438] INFO: task a.out:4110 blocked for more than 120 seconds.
[  600.978185]   Not tainted 3.11.0-10050-g3711d86-dirty #119
[  600.979655] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  600.980561] a.out   D bc00 0  4110   4080 0x00100084
[  600.982320]  de6a9d20 0082  bc00 000a 1cf02990 df1dab10 df097510
[  600.985850]  df1dab10 c07b8180 c07b2000 c07b8180 df1dab10 df1dab50  c0568220
[  600.988293]  dffea180 c0568220 de6a9cf0 c015c29a dffea180 df0974d0 dffea180 de6a9d04
[  600.990597] Call Trace:
[  600.993308]  [] ? check_preempt_curr+0x6a/0x80
[  600.994658]  [] ? ttwu_do_wakeup+0x1a/0xd0
[  600.995931]  [] ? ttwu_do_activate.clone.1+0x3a/0x50
[  600.996651]  [] schedule+0x1e/0x50
[  600.997749]  [] schedule_timeout+0x135/0x180
[  600.999069]  [] ? wake_up_process+0x1a/0x30
[  601.000470]  [] ? wake_up_worker+0x19/0x20
[  601.001733]  [] ? insert_work+0x53/0x90
[  601.002941]  [] wait_for_common+0x94/0x120
[  601.005966]  [] ? try_to_wake_up+0x1f0/0x1f0
[  601.007258]  [] wait_for_completion+0x12/0x20
[  601.008590]  [] call_usermodehelper_exec+0xd7/0x100
[  601.010026]  [] do_coredump+0x994/0xca0
[  601.011215]  [] ? do_coredump+0xca0/0xca0
[  601.012540]  [] ? do_proc_dointvec_ms_jiffies_conv+0x70/0x80
[  601.014169]  [] ? __dequeue_signal+0xd2/0x140
[  601.015500]  [] get_signal_to_deliver+0x19a/0x4e0
[  601.016609]  [] do_signal+0x37/0x960
[  601.017747]  [] ? irq_exit+0x51/0x90
[  601.020353]  [] ? do_IRQ+0x46/0xb0
[  601.021455]  [] ? init_fpu+0x3d/0xa0
[  601.022602]  [] ? __do_page_fault+0x500/0x500
[  601.023915]  [] do_notify_resume+0x55/0x70
[  601.025088]  [] work_notifysig+0x24/0x29
[  601.026304]  [] ? __do_page_fault+0x500/0x500
-- dmesg end --

Below is the patch to fix this problem.
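
The idea of a killable wait can be pictured with a toy state model before
reading the patch. This is a sketch under assumptions, not kernel code:
struct waiter_model, enum wait_outcome and wait_step() are invented here, and
the model only captures the decision a waiter makes each time it is woken.

```c
#include <assert.h>
#include <stdbool.h>

/* What a blocked waiter can do when it re-checks its condition. */
enum wait_outcome {
	WAIT_COMPLETED,     /* the other side signalled completion */
	WAIT_INTERRUPTED,   /* gave up because a fatal signal is pending */
	WAIT_STILL_BLOCKED  /* keeps sleeping */
};

struct waiter_model {
	bool done;            /* set when the completion has been signalled */
	bool sigkill_pending; /* set once the OOM killer has sent SIGKILL */
};

/*
 * One re-check of the wait condition. An uninterruptible waiter ignores
 * the pending SIGKILL; a killable waiter returns so that it can exit and
 * release its memory.
 */
static enum wait_outcome wait_step(const struct waiter_model *w, bool killable)
{
	if (w->done)
		return WAIT_COMPLETED;
	if (killable && w->sigkill_pending)
		return WAIT_INTERRUPTED;
	return WAIT_STILL_BLOCKED;
}
```

In this model an OOM victim whose completion never arrives stays blocked
forever in the uninterruptible case, and leaves the wait in the killable case.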


From 052fedc920b735354b618e23c0b74c7b88ecd3c6 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Wed, 2 Oct 2013 22:11:00 +0900
Subject: [PATCH v2] coredump: Make startup of coredump to pipe killable.

Any user process that calls wait_for_completion(), except the global init
process, might be chosen by the OOM killer while it waits for a complete() call
by some other process which is allocating memory.

When such a process is chosen by the OOM killer while it is waiting for
completion in TASK_UNINTERRUPTIBLE, the system stays under memory pressure
because the OOM killer cannot kill it.

call_usermodehelper() without the UMH_KILLABLE flag is one such caller. This
patch fixes the problem for do_coredump() by making the startup of a coredump
to a pipe killable.

Signed-off-by: Tetsuo Handa 
---
 fs/coredump.c   |  124 +++
 include/linux/binfmts.h |2 +
 2 files changed, 84 insertions(+), 42 deletions(-)

diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index e8112ae..b0cb384 100644
--- a/includ

[PATCH for 3.12-rcX] mutex: Avoid gcc version dependent __builtin_constant_p() usage.

2013-10-03 Thread Tetsuo Handa

Peter Zijlstra wrote:
> On Mon, Sep 09, 2013 at 08:56:53PM +0900, Tetsuo Handa wrote:
> > From: Tetsuo Handa 
> > Date: Mon, 9 Sep 2013 20:48:13 +0900
> > Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() usage.
> > 
> > Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
> > "!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such expression
> > correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.
> > 
> > Fix it by explicitly passing a bool which tells whether p != NULL or not.
> > 
> > Signed-off-by: Tetsuo Handa 
> > Cc:  [3.11+]
> > ---
> >  kernel/mutex.c |   32 
> >  1 files changed, 16 insertions(+), 16 deletions(-)
> > 
> > diff --git a/kernel/mutex.c b/kernel/mutex.c
> > index a52ee7bb..a2b80f1 100644
> > --- a/kernel/mutex.c
> > +++ b/kernel/mutex.c
> > @@ -408,7 +408,7 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock,
> >  static __always_inline int __sched
> >  __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> > struct lockdep_map *nest_lock, unsigned long ip,
> > -   struct ww_acquire_ctx *ww_ctx)
> > +   struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
> >  {
> > struct task_struct *task = current;
> > struct mutex_waiter waiter;
> > @@ -448,7 +448,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> > struct task_struct *owner;
> > struct mspin_node  node;
> >  
> > -   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
> > +   if (use_ww_ctx && ww_ctx->acquired > 0) {
> > struct ww_mutex *ww;
> >  
> > ww = container_of(lock, struct ww_mutex, base);
> > @@ -478,7 +478,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
> > if ((atomic_read(&lock->count) == 1) &&
> > (atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
> > lock_acquired(&lock->dep_map, ip);
> > -   if (!__builtin_constant_p(ww_ctx == NULL)) {
> > +   if (use_ww_ctx) {
> > struct ww_mutex *ww;
> > ww = container_of(lock, struct ww_mutex, base);
> >  
> > @@ -548,7 +548,7 @@ slowpath:
> > goto err;
> > }
> >  
> > -   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
> > +   if (use_ww_ctx && ww_ctx->acquired > 0) {
> > ret = __mutex_lock_check_stamp(lock, ww_ctx);
> > if (ret)
> > goto err;
> > @@ -568,7 +568,7 @@ done:
> > mutex_remove_waiter(lock, &waiter, current_thread_info());
> > mutex_set_owner(lock);
> >  
> > -   if (!__builtin_constant_p(ww_ctx == NULL)) {
> > +   if (use_ww_ctx) {
> > struct ww_mutex *ww = container_of(lock,
> >   struct ww_mutex,
> >   base);
> > @@ -618,7 +618,7 @@ mutex_lock_nested(struct mutex *lock, unsigned int subclass)
> >  {
> > might_sleep();
> > __mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
> > -   subclass, NULL, _RET_IP_, NULL);
> > +   subclass, NULL, _RET_IP_, NULL, 0);
> >  }
> >  
> >  EXPORT_SYMBOL_GPL(mutex_lock_nested);
> 
> This is a sad patch, but provided it actually generates similar code I
> suppose it's the best we can do bar wholesale deprecating gcc-3.
> 

Can the patch below go to 3.12-rcX (and the patch above to 3.11-stable which
does the same thing)?

Regards.
--
From a1b01c858143c2c2c92b17e7df096042bfe0df6b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 24 Sep 2013 23:44:17 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Fix it by explicitly passing a bool which tells whether p != NULL or not.

Signed-off-by: Tetsuo Handa 
---
 kernel/mutex.c |   32 -
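
The idea behind the fix, passing an explicit constant flag instead of asking
the compiler whether the pointer comparison is a constant, can be sketched
outside the kernel. This is only a minimal illustration under assumptions:
struct ctx, lock_common(), lock_plain() and lock_with_ctx() are hypothetical
names, and the "refuse when acquired > 0" behaviour merely stands in for the
real ww_mutex checks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ww_acquire_ctx; only one field is modelled. */
struct ctx {
    unsigned int acquired;
};

/*
 * Common helper. use_ctx is a literal constant at every call site, so after
 * inlining the compiler can drop the ctx branch on its own, without any
 * __builtin_constant_p(ctx == NULL) test.
 */
static inline int lock_common(int *count, struct ctx *ctx, const bool use_ctx)
{
    if (use_ctx && ctx->acquired > 0)
        return -1;  /* stand-in for the extra ww_mutex checks */
    *count = 0;     /* take the lock */
    return 0;
}

/* Plain variant: ctx is never dereferenced because use_ctx is false. */
static int lock_plain(int *count)
{
    return lock_common(count, NULL, false);
}

/* Context-aware variant: the caller must pass a non-NULL ctx. */
static int lock_with_ctx(int *count, struct ctx *ctx)
{
    return lock_common(count, ctx, true);
}
```

Because the flag is written out at each call site, the result does not depend
on how a particular gcc version evaluates __builtin_constant_p().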

Re: [PATCH] LSM: ModPin LSM for module loading restrictions

2013-10-03 Thread Tetsuo Handa

Kees Cook wrote:
> +static int modpin_load_module(struct file *file)
> +{
> +   struct dentry *module_root;
> +
> +   if (!file) {
> +   if (!modpin_enforced) {
> +   report_load_module(NULL, "old-api-pinning-ignored");
> +   return 0;
> +   }
> +
> +   report_load_module(NULL, "old-api-denied");
> +   return -EPERM;
> +   }
> +
> +   module_root = file->f_path.mnt->mnt_root;
> +
> +   /* First loaded module defines the root for all others. */
> +   spin_lock(&pinned_root_spinlock);
> +   if (!pinned_root) {
> +   pinned_root = dget(module_root);
> +   /*
> +* Unlock now since it's only pinned_root we care about.
> +* In the worst case, we will (correctly) report pinning
> +* failures before we have announced that pinning is
> +* enabled. This would be purely cosmetic.
> +*/
> +   spin_unlock(&pinned_root_spinlock);
> +   check_pinning_enforcement();
> +   report_load_module(&file->f_path, "pinned");
> +   return 0;
> +   }
> +   spin_unlock(&pinned_root_spinlock);

The first module loaded is usually in the initramfs, whereas subsequently loaded
modules are usually on a hard disk partition.

This module is not meant for PC servers, is it?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH] vsprintf: drop comment claiming %n is ignored

2013-09-13 Thread Tetsuo Handa

Kees Cook wrote:
> 3- some callers of seq_printf (incorrectly) use the return value as a
> length indication

Are there really?

Is somebody using the return value from seq_printf() like

  pos = snprintf(buf, sizeof(buf) - 1, "%s", foo);
  snprintf(buf + pos, sizeof(buf) - 1 - pos, "%s", bar);

? Since the caller cannot pass the return value from seq_printf() like

  pos = seq_printf(m, "%s", foo);
  seq_printf(m + pos, "%s", bar);

, I wonder who would interpret the return value as a length indication.

Even if the code is bad and its failure case has never been tested, the authors
should already know that seq_printf() returns 0 on success.

I think that

pos += seq_printf(m, "%s", foo);
pos += seq_printf(m, "%s", bar);

is used as the equivalent to

if (seq_printf(m, "%s", foo))
pos = -1;
if (seq_printf(m, "%s", bar))
pos = -1;

.

Joe Perches wrote:
> @@ -174,8 +171,8 @@ static int dbg_show_state(struct seq_file *s, void *p)
>   int pos = 0;
>  
>   /* basic device status */
> - pos += seq_printf(s, "DMA engine status\n");
> - pos += seq_printf(s, "\tChannel number: %d\n", num_dma_channels);
> + seq_puts(s, "DMA engine status\n");
> + seq_printf(s, "\tChannel number: %d\n", num_dma_channels);
>  
>   return pos;
>  }

As I described above, I think this change breaks the functionality.
We need to change like

  - pos += seq_printf(s, "DMA engine status\n");
  - pos += seq_printf(s, "\tChannel number: %d\n", num_dma_channels);
  + pos |= seq_puts(s, "DMA engine status\n");
  + pos |= seq_printf(s, "\tChannel number: %d\n", num_dma_channels);

or

  - pos += seq_printf(s, "DMA engine status\n");
  - pos += seq_printf(s, "\tChannel number: %d\n", num_dma_channels);
  + seq_puts(s, "DMA engine status\n");
  + seq_printf(s, "\tChannel number: %d\n", num_dma_channels);
   
  - return pos;
  + return seq_overflow(s) ? -1 : 0;

for keeping the functionality.
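
The "accumulate a status, not a length" reading can be sketched in userspace.
This is a minimal sketch under assumptions: struct buf, buf_puts() and show()
are hypothetical stand-ins for struct seq_file, seq_puts() and a ->show()
callback, and buf_puts() is assumed to return 0 on success and -1 on overflow
as seq_printf() did at the time.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: a fixed buffer plus a fill count. */
struct buf {
    char data[16];
    size_t count;
};

/* Returns 0 when the text fits, -1 when it does not (never a length). */
static int buf_puts(struct buf *b, const char *s)
{
    size_t len = strlen(s);

    if (b->count + len > sizeof(b->data)) {
        b->count = sizeof(b->data);  /* mark the buffer as overflowed */
        return -1;
    }
    memcpy(b->data + b->count, s, len);
    b->count += len;
    return 0;
}

/* Returns 0 if every write fit and nonzero otherwise. */
static int show(struct buf *b, const char *first, const char *second)
{
    int err = 0;

    err |= buf_puts(b, first);
    err |= buf_puts(b, second);
    return err;
}
```

With "|=" the callback still reports failure to its caller, which is the
behaviour that dropping the return values would lose.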

Re: [PATCH v2] SCSI: buslogic: Added check for DMA mapping errors (was Re: [BusLogic] DMA-API: device driver failed to check map error)

2013-09-13 Thread Tetsuo Handa

Khalid Aziz wrote:
> Added check for DMA mapping errors for request sense data
> buffer. Checking for mapping error can avoid potential wild
> writes. This patch was prompted by the warning from
> dma_unmap when kernel is compiled with CONFIG_DMA_API_DEBUG.

This patch looks OK and works OK for me.

Thank you.

[PATCH] argv_split: Return NULL if argument contains no non-whitespace.

2013-09-14 Thread Tetsuo Handa

From 210f917f3b535bc0d4dcbb20ca4395709e913104 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Sat, 14 Sep 2013 16:24:07 +0900
Subject: [PATCH] argv_split: Return NULL if argument contains no non-whitespace.

I tried

  # echo '|' > /proc/sys/kernel/core_pattern

and got

  BUG: unable to handle kernel NULL pointer dereference at   (null)

upon core dump because helper_argv[0] == NULL at

  helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);
  call_usermodehelper_setup(helper_argv[0], ...);

if cn.corename == "".

How to check this bug:

  # echo '|' > /proc/sys/kernel/core_pattern
  $ echo 'int main(int argc, char *argv[]) { return *(char *) 0; }' | gcc -x c - -o die
  $ ulimit -c unlimited
  $ ./die

This bug seems to have existed since 2.6.19 (the version in which core dump to
pipe was added). Depending on kernel version and config, some side effects might
follow immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6).

Assuming that nobody is expecting that argv_split() returns an array with
argv[0] == NULL, this patch fixes this bug by changing argv_split() to return
NULL if argument contains no non-whitespace.

Signed-off-by: Tetsuo Handa 
---
 lib/argv_split.c |6 +-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/lib/argv_split.c b/lib/argv_split.c
index e927ed0..5b828d9 100644
--- a/lib/argv_split.c
+++ b/lib/argv_split.c
@@ -50,7 +50,7 @@ EXPORT_SYMBOL(argv_free);
  * quote processing is performed.  Multiple whitespace characters are
  * considered to be a single argument separator.  The returned array
  * is always NULL-terminated.  Returns NULL on memory allocation
- * failure.
+ * failure or @str being empty or @str containing only white-space.
  *
  * The source string at `str' may be undergoing concurrent alteration via
  * userspace sysctl activity (at least).  The argv_split() implementation
@@ -68,6 +68,10 @@ char **argv_split(gfp_t gfp, const char *str, int *argcp)
return NULL;
 
argc = count_argc(argv_str);
+   if (!argc) {
+   kfree(argv_str);
+   return NULL;
+   }
argv = kmalloc(sizeof(*argv) * (argc + 2), gfp);
if (!argv) {
kfree(argv_str);
-- 
1.7.1
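
The behaviour this patch proposes can be sketched in userspace. This is a
minimal sketch under assumptions: count_words(), split_words() and
split_free() are hypothetical names that only mirror the shape of
count_argc(), argv_split() and argv_free(), with malloc() in place of
kmalloc().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Count whitespace-separated words, in the spirit of count_argc(). */
static int count_words(const char *s)
{
    int n = 0;
    int in_word = 0;

    for (; *s; s++) {
        if (isspace((unsigned char)*s)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            n++;
        }
    }
    return n;
}

/*
 * Split s into a NULL-terminated array of words. Returns NULL on allocation
 * failure, and also when s is empty or contains only whitespace, so callers
 * never see an array whose first element is NULL.
 */
static char **split_words(const char *s)
{
    int argc = count_words(s);
    size_t len = strlen(s) + 1;
    char *copy, *p;
    char **argv;
    int i = 0;

    if (!argc)
        return NULL;
    copy = malloc(len);
    argv = malloc(sizeof(*argv) * (argc + 2));
    if (!copy || !argv) {
        free(copy);
        free(argv);
        return NULL;
    }
    memcpy(copy, s, len);
    *argv++ = copy;  /* hidden slot, consumed by split_free() */
    for (p = copy; *p; ) {
        while (isspace((unsigned char)*p))
            p++;
        if (!*p)
            break;
        argv[i++] = p;
        while (*p && !isspace((unsigned char)*p))
            p++;
        if (*p)
            *p++ = '\0';
    }
    argv[i] = NULL;
    return argv;
}

/* Release an array returned by split_words(); NULL is accepted. */
static void split_free(char **argv)
{
    if (!argv)
        return;
    argv--;
    free(argv[0]);
    free(argv);
}
```

A caller then only needs the usual "if (!argv)" check before using argv[0].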

Re: [PATCH] kthread: Make kthread_create() killable.

2013-09-14 Thread Tetsuo Handa

Oleg Nesterov wrote:
> I am wondering if this can be simplified...
> 
> At least you can move create->done from kthread_create_info to the
> stack, and turn create->owner into the pointer to that completion.

Use of DECLARE_COMPLETION_ONSTACK() looks harmful to me because current thread
needs to be able to terminate as soon as possible if SIGKILLed (especially when
SIGKILLed by OOM killer). If we move something from kmalloc()ed zone to stack,
current thread cannot be terminated until that something is guaranteed to no
longer be used.

I think we need to convert from on-stack objects to kmalloc()ed objects so that
current thread acquires ability to terminate as soon as possible.

Re: [PATCH] argv_split: Return NULL if argument contains no non-whitespace.

2013-09-14 Thread Tetsuo Handa

Oleg Nesterov wrote:
> > upon core dump because helper_argv[0] == NULL at
> >
> >   helper_argv = argv_split(GFP_KERNEL, cn.corename, NULL);
> >   call_usermodehelper_setup(helper_argv[0], ...);
> 
> Are you sure? See above.
> 

Yes, I'm sure. execve(NULL) from user space is safe, but
do_execve(NULL) from kernel space is not safe.

-- patch start --
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -763,6 +763,9 @@ struct file *open_exec(const char *name)
.lookup_flags = LOOKUP_FOLLOW,
};
 
+   if (WARN_ON(!name))
+   return ERR_PTR(-EINVAL);
+
file = do_filp_open(AT_FDCWD, &tmp, &open_exec_flags);
if (IS_ERR(file))
goto out;
-- patch end --

-- dmesg start --
die[3924]: segfault at 0 ip 0804839c sp bf9d3c78 error 4 in die[8048000+1000]
------------[ cut here ]------------
WARNING: CPU: 1 PID: 3925 at fs/exec.c:766 open_exec+0xfd/0x110()
Modules linked in: ipv6 binfmt_misc
CPU: 1 PID: 3925 Comm: kworker/u4:0 Not tainted 3.11.0-10050-g3711d86-dirty #111
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
  c0557e9e c068909b c0139756 c067af64 0001 0f55 c068909b
 02fe c01da24d c01da24d  de754a80 df120b50  c013979b
 0009  c01da24d de6626c0 c0158c68 df120b50  
Call Trace:
 [] ? dump_stack+0x3e/0x50
 [] ? warn_slowpath_common+0x86/0xb0
 [] ? open_exec+0xfd/0x110
 [] ? open_exec+0xfd/0x110
 [] ? warn_slowpath_null+0x1b/0x20
 [] ? open_exec+0xfd/0x110
 [] ? prepare_creds+0x88/0xb0
 [] ? do_execve+0x18c/0x560
 [] ? call_usermodehelper+0xbc/0xe0
 [] ? ret_from_kernel_thread+0x1b/0x28
 [] ? call_usermodehelper+0xe0/0xe0
---[ end trace 63bb92bc8d58b0c2 ]---
Core dump to | pipe failed
-- dmesg end --

> Perhaps
> 
>   --- x/kernel/kmod.c
>   +++ x/kernel/kmod.c
>   @@ -571,6 +571,9 @@ int call_usermodehelper_exec(struct subp
>   DECLARE_COMPLETION_ONSTACK(done);
>   int retval = 0;
>
>   +   if (!sub_info->path)
>   +   return -EXXX;
>   +
>   helper_lock();
>   if (!khelper_wq || usermodehelper_disabled) {
>   retval = -EBUSY;
> 
> ?
> 

I'm OK with that.

[PATCH] kmod: Check for NULL at call_usermodehelper_exec().

2013-09-15 Thread Tetsuo Handa

From fe6723ba2816b42e26697472a3f2a3045614032b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Sun, 15 Sep 2013 23:17:15 +0900
Subject: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

If /proc/sys/kernel/core_pattern contains only "|", NULL pointer dereference
happens upon core dump because argv_split("") returns argv[0] == NULL.

This bug seems to have existed since 2.6.19 (the version in which core dump to
pipe was added). Depending on kernel version and config, some side effects might
happen immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa 
---
 kernel/kmod.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index fb32636..3b59f6e 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -572,6 +572,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
int retval = 0;
 
helper_lock();
+   if (!sub_info->path) {
+   retval = -ENOENT;
+   goto out;
+   }
if (!khelper_wq || usermodehelper_disabled) {
retval = -EBUSY;
goto out;
-- 
1.7.1

Re: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

2013-09-15 Thread Tetsuo Handa

Oleg Nesterov wrote:
> It looks a bit ugly to check ->path under helper_lock(), just add
> 
>   if (!sub_info->path)
>   retval = -ENOENT;
> 
> at the start. Otherwise the code looks as if there is a subtle
> reason to take the lock before this check.

Did you mean this?

DECLARE_COMPLETION_ONSTACK(done);
int retval = 0;
 
+   if (!sub_info->path) {
+   call_usermodehelper_freeinfo(sub_info);
+   return -ENOENT;
+   }
helper_lock();
if (!khelper_wq || usermodehelper_disabled) {
retval = -EBUSY;
goto out;
}

Re: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

2013-09-15 Thread Tetsuo Handa

From d6ff218545060c5f8b75b15d5b34bffcf625508f Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 16 Sep 2013 02:19:10 +0900
Subject: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

If /proc/sys/kernel/core_pattern contains only "|", NULL pointer dereference
happens upon core dump because argv_split("") returns argv[0] == NULL.

This bug was once fixed by commit 264b83c0 "usermodehelper: check
subprocess_info->path != NULL" but was reintroduced by mistake by commit
7f57cfa4 "usermodehelper: kill the sub_info->path[0] check".

This bug seems to have existed since 2.6.19 (the version in which core dump to
pipe was added). Depending on kernel version and config, some side effects might
happen immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa 
Acked-by: Oleg Nesterov 
---
 kernel/kmod.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index fb32636..a962470 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -571,6 +571,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
DECLARE_COMPLETION_ONSTACK(done);
int retval = 0;

+   if (!sub_info->path) {
+   call_usermodehelper_freeinfo(sub_info);
+   return -ENOENT;
+   }
helper_lock();
if (!khelper_wq || usermodehelper_disabled) {
retval = -EBUSY;
-- 
1.7.1

Re: [PATCH v2] kthread: Make kthread_create() killable.

2013-09-15 Thread Tetsuo Handa

Oleg Nesterov wrote:
> Please look at call_usermodehelper_exec() which does this trick. The
> logic is the same, just you need to xchg(create->completion) instead
> of create->owner.

OK. I understood that we can use the same logic.
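
The handoff being discussed can be sketched in userspace with C11 atomics.
This is a minimal sketch under assumptions: struct request, request_new(),
worker_finish() and waiter_abandon() are hypothetical names, atomic_exchange()
stands in for the kernel's xchg(), and a plain int flag stands in for
struct completion.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical heap-allocated request shared by a waiter and a worker. */
struct request {
    _Atomic(int *) done;  /* the waiter's on-stack flag, or NULL once claimed */
    int result;
};

static struct request *request_new(int *done_flag)
{
    struct request *req = malloc(sizeof(*req));

    if (!req)
        return NULL;
    atomic_init(&req->done, done_flag);
    req->result = 0;
    return req;
}

/*
 * Worker side. Whoever swaps the pointer out first wins. Returns true if the
 * waiter was still there (the waiter keeps owning req), false if the waiter
 * had already given up (req is freed here and the flag is never touched).
 */
static bool worker_finish(struct request *req, int result)
{
    int *done = atomic_exchange(&req->done, (int *)NULL);

    if (!done) {
        free(req);
        return false;
    }
    req->result = result;
    *done = 1;  /* stands in for complete(done) */
    return true;
}

/*
 * Waiter side, called after the wait was interrupted. Returns true if the
 * waiter got there first (the worker will free req later). Returns false if
 * the worker already claimed the flag; in that case the waiter must still
 * wait for the flag to be set, because the worker is about to write to it.
 */
static bool waiter_abandon(struct request *req)
{
    return atomic_exchange(&req->done, (int *)NULL) != NULL;
}
```

The request itself lives on the heap so that it can outlive a killed waiter;
only the completion flag stays on the waiter's stack.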

From 87373d6938f045abffe8d9b4910bd132036eccaa Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 16 Sep 2013 09:39:17 +0900
Subject: [PATCH v2] kthread: Make kthread_create() killable.

Any user of wait_for_completion() might be chosen by the OOM killer while
waiting for a complete() call by some process which does memory allocation.
kthread_create() is one such user.

When such a user is chosen by the OOM killer while it is waiting for
complete() in TASK_UNINTERRUPTIBLE, a problem similar to CVE-2012-4398
"kernel: request_module() OOM local DoS" can happen.

This patch makes kthread_create() killable, using the same approach used for
fixing CVE-2012-4398.

Signed-off-by: Tetsuo Handa 
---
 kernel/kthread.c |   73 -
 1 files changed, 55 insertions(+), 18 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 760e86d..b5ae3ee 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -33,7 +33,7 @@ struct kthread_create_info
 
/* Result passed back to kthread_create() from kthreadd. */
struct task_struct *result;
-   struct completion done;
+   struct completion *done;
 
struct list_head list;
 };
@@ -178,6 +178,7 @@ static int kthread(void *_create)
struct kthread_create_info *create = _create;
int (*threadfn)(void *data) = create->threadfn;
void *data = create->data;
+   struct completion *done;
struct kthread self;
int ret;
 
@@ -187,10 +188,16 @@ static int kthread(void *_create)
init_completion(&self.parked);
current->vfork_done = &self.exited;
 
+   /* If user was SIGKILLed, I release the structure. */
+   done = xchg(&create->done, NULL);
+   if (!done) {
+   kfree(create);
+   do_exit(-EINTR);
+   }
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
create->result = current;
-   complete(&create->done);
+   complete(done);
schedule();
 
ret = -EINTR;
@@ -223,8 +230,15 @@ static void create_kthread(struct kthread_create_info *create)
/* We want our own signal handler (we take no signals by default). */
pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
if (pid < 0) {
+   /* If user was SIGKILLed, I release the structure. */
+   struct completion *done = xchg(&create->done, NULL);
+
+   if (!done) {
+   kfree(create);
+   return;
+   }
create->result = ERR_PTR(pid);
-   complete(&create->done);
+   complete(done);
}
 }
 
@@ -255,36 +269,59 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
   const char namefmt[],
   ...)
 {
-   struct kthread_create_info create;
-
-   create.threadfn = threadfn;
-   create.data = data;
-   create.node = node;
-   init_completion(&create.done);
+   DECLARE_COMPLETION_ONSTACK(done);
+   struct task_struct *task;
+   struct kthread_create_info *create = kmalloc(sizeof(*create),
+GFP_KERNEL);
+
+   if (!create)
+   return ERR_PTR(-ENOMEM);
+   create->threadfn = threadfn;
+   create->data = data;
+   create->node = node;
+   create->done = &done;
 
spin_lock(&kthread_create_lock);
-   list_add_tail(&create.list, &kthread_create_list);
+   list_add_tail(&create->list, &kthread_create_list);
spin_unlock(&kthread_create_lock);
 
wake_up_process(kthreadd_task);
-   wait_for_completion(&create.done);
-
-   if (!IS_ERR(create.result)) {
+   /*
+* Wait for completion in killable state, for I might be chosen by
+* the OOM killer while kthreadd is trying to allocate memory for
+* new kernel thread.
+*/
+   if (unlikely(wait_for_completion_killable(&done))) {
+   /*
+* If I was SIGKILLed before kthreadd (or new kernel thread)
+* calls complete(), leave the cleanup of this structure to
+* that thread.
+*/
+   if (xchg(&create->done, NULL))
+   return ERR_PTR(-ENOMEM);
+   /*
+* kthreadd (or new kernel thread) will call complete()
+

[PATCH] coredump: Make startup of coredump to pipe killable.

2013-09-16 Thread Tetsuo Handa

Oleg Nesterov wrote:
> Hi Tetsuo,
> 
> please do not start the off-list discussions ;)

Sorry. Although I think and hope that there is no easy way to trigger this bug,
it might be assigned a CVE if someone finds one. That is why I started off-list.
I assume you also think that there is no easy way to trigger this bug.

> > Do we want to change from call_usermodehelper_exec(UMH_WAIT_EXEC) to
> > call_usermodehelper_exec(UMH_WAIT_EXEC | UMH_WAIT_KILLABLE)
> 
> To me, this makes sense in any case. And this matches other recent
> "make coredump killable" changes.

OK, this patch is for do_coredump(). But I might be missing something. (e.g.
Is it safe to terminate current process while file descriptor table of current
process is planned to be updated by kthread later? Are there other resources
which have to be kept valid until kthread starts coredump process?)

From 6b81b9956df284564112a95c941bf390c15f4f06 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 16 Sep 2013 15:59:05 +0900
Subject: [PATCH] coredump: Make startup of coredump to pipe killable.

Any user of wait_for_completion() might be chosen by the OOM killer while
waiting for a complete() call by some process which does memory allocation.
call_usermodehelper() without the UMH_KILLABLE flag is one such user.

When such a user is chosen by the OOM killer while it is waiting for
complete() in TASK_UNINTERRUPTIBLE, a problem similar to CVE-2012-4398
"kernel: request_module() OOM local DoS" can happen.

This patch makes the call_usermodehelper_exec() call in do_coredump() killable,
using an approach similar to the one used for fixing CVE-2012-4398.

Signed-off-by: Tetsuo Handa 
---
 fs/coredump.c   |  113 ++-
 include/linux/binfmts.h |2 +
 2 files changed, 74 insertions(+), 41 deletions(-)

diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index e8112ae..547690f 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -61,6 +61,8 @@ struct coredump_params {
struct file *file;
unsigned long limit;
unsigned long mm_flags;
+   char **argv; /* Maybe NULL. Used by only fs/coredump.c */
+   bool in_use; /* Used by only fs/coredump.c */
 };
 
 /*
diff --git a/fs/coredump.c b/fs/coredump.c
index 9bdeca1..45abf0b 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -485,6 +485,23 @@ static int umh_pipe_setup(struct subprocess_info *info, struct cred *new)
return err;
 }
 
+/*
+ * umh_pipe_cleanup - Clean up resources as needed.
+ *
+ * @info: Pointer to "struct subprocess_info".
+ */
+static void umh_pipe_cleanup(struct subprocess_info *info)
+{
+   /* If user was SIGKILLed, I release the structure. */
+   struct coredump_params *cprm = (struct coredump_params *)info->data;
+
+   if (!xchg(&cprm->in_use, false)) {
+   if (cprm->argv)
+   argv_free(cprm->argv);
+   kfree(cprm);
+   }
+}
+
 void do_coredump(siginfo_t *siginfo)
 {
struct core_state core_state;
@@ -500,24 +517,26 @@ void do_coredump(siginfo_t *siginfo)
bool need_nonrelative = false;
bool core_dumped = false;
static atomic_t core_dump_count = ATOMIC_INIT(0);
-   struct coredump_params cprm = {
-   .siginfo = siginfo,
-   .regs = signal_pt_regs(),
-   .limit = rlimit(RLIMIT_CORE),
-   /*
-* We must use the same mm->flags while dumping core to avoid
-* inconsistency of bit flags, since this flag is not protected
-* by any locks.
-*/
-   .mm_flags = mm->flags,
-   };
+   struct coredump_params *cprm = kzalloc(sizeof(*cprm), GFP_KERNEL);
 
audit_core_dumps(siginfo->si_signo);
 
+   if (!cprm)
+   return;
+   cprm->siginfo = siginfo;
+   cprm->regs = signal_pt_regs();
+   cprm->limit = rlimit(RLIMIT_CORE);
+   /*
+* We must use the same mm->flags while dumping core to avoid
+* inconsistency of bit flags, since this flag is not protected
+* by any locks.
+*/
+   cprm->mm_flags = mm->flags;
+
binfmt = mm->binfmt;
if (!binfmt || !binfmt->core_dump)
goto fail;
-   if (!__get_dumpable(cprm.mm_flags))
+   if (!__get_dumpable(cprm->mm_flags))
goto fail;
 
cred = prepare_creds();
@@ -529,7 +548,7 @@ void do_coredump(siginfo_t *siginfo)
 * so we dump it as root in mode 2, and only into a controlled
 * environment (pipe handler or fully qualified path).
 */
-   if (__get_dumpable(cprm.mm_flags) == SUID_DUMP_ROOT) {
+   if (__get_dumpable(cprm->mm_flags) == SUID_DUMP_ROOT) {
/* Setuid core dump mode */
flag = O_EXCL;  /* Stop

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-16 Thread Tetsuo Handa

Kees Cook wrote:
> - seq_printf(m, "%s%d%n", con->name, con->index, &len);
> - len = 21 - len;
> + len = m->count;
> + seq_printf(m, "%s%d", con->name, con->index);
> + len = 21 - (m->count - len);

Why not create a new function which returns the number of bytes written?
The new function would not need to return a negative value to indicate errors.
-- patch start --
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 4e32edc..c889cf1 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -91,6 +91,7 @@ int seq_write(struct seq_file *seq, const void *data, size_t len);
 
 __printf(2, 3) int seq_printf(struct seq_file *, const char *, ...);
 __printf(2, 0) int seq_vprintf(struct seq_file *, const char *, va_list args);
+__printf(2, 3) int seq_new_printf(struct seq_file *m, const char *f, ...);
 
 int seq_path(struct seq_file *, const struct path *, const char *);
 int seq_dentry(struct seq_file *, struct dentry *, const char *);
diff --git a/fs/seq_file.c b/fs/seq_file.c
index 3135c25..7af75ec 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -419,6 +419,27 @@ int seq_printf(struct seq_file *m, const char *f, ...)
 EXPORT_SYMBOL(seq_printf);
 
 /**
+ * seq_new_printf - seq_printf() which returns bytes written.
+ * @m: target buffer
+ * @f: format
+ *
+ *  Returns bytes written to @m.
+ */
+int seq_new_printf(struct seq_file *m, const char *f, ...)
+{
+   const int count = m->count;
+   int ret;
+   va_list args;
+
+   va_start(args, f);
+   ret = seq_vprintf(m, f, args);
+   va_end(args);
+
+   return ret ? 0 : m->count - count;
+}
+EXPORT_SYMBOL(seq_new_printf);
+
+/**
  * mangle_path -   mangle and copy path to buffer beginning
  * @s: buffer start
  * @p: beginning of path in above buffer
-- patch end --
With the new function, we can do:

-   len = m->count;
-   seq_printf(m, "%s%d", con->name, con->index);
-   len = 21 - (m->count - len);
+   len = 21 - seq_new_printf(m, "%s%d", con->name, con->index);
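
The "return bytes written, 0 on overflow" contract can be tried in userspace.
This is a minimal sketch under assumptions: struct out and out_printf() are
hypothetical stand-ins for struct seq_file and the proposed seq_new_printf(),
and the overflow handling only imitates what seq_vprintf() is understood to do.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct seq_file. */
struct out {
    char data[64];
    size_t count;
};

/* Append formatted text; returns bytes appended, or 0 if it did not fit. */
static int out_printf(struct out *o, const char *fmt, ...)
{
    size_t room = sizeof(o->data) - o->count;
    va_list args;
    int len;

    va_start(args, fmt);
    len = vsnprintf(o->data + o->count, room, fmt, args);
    va_end(args);

    if (len < 0 || (size_t)len >= room) {
        o->count = sizeof(o->data);  /* mark the buffer as overflowed */
        return 0;
    }
    o->count += (size_t)len;
    return len;
}
```

With such a helper the padding width becomes a single expression, for example
"pad = 21 - out_printf(&o, "%s%d", name, index);".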

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-17 Thread Tetsuo Handa

Kees Cook wrote:
> On Mon, Sep 16, 2013 at 1:09 AM, Geert Uytterhoeven
>  wrote:
> > On Mon, Sep 16, 2013 at 9:43 AM, Kees Cook  wrote:
> >> All users of %n are calculating padding size when using seq_file, so
> >> instead use the new last_len member for discovering the length of the
> >> written strings.
> >
> > Would it make sense to provide a seq_pad(...) function instead, to avoid
> > exposing more seq_file internals to its callers?
> 
> We'd still need to track how much to pad.

If we add "size_t pad_until;" to "struct seq_file", we can do

void seq_setwidth(struct seq_file *m, size_t size)
{
m->pad_until = m->count + size;
}

void seq_pad(struct seq_file *m, char c)
{
int size = m->pad_until - m->count;
if (size > 0)
seq_printf(m, "%*s", size, "");
if (c)
seq_putc(m, c);
}

and use like

  seq_setwidth(m, 21);
  seq_printf(m, "%s%d", con->name, con->index);
  seq_pad(m, '\n');

.
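
The setwidth/pad pair can be exercised in userspace as well. This is a minimal
sketch under assumptions: struct out, out_puts(), out_setwidth() and out_pad()
are hypothetical stand-ins for struct seq_file, seq_puts() and the proposed
seq_setwidth() and seq_pad(), and text that does not fit is simply dropped.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file with the proposed pad_until field. */
struct out {
    char data[64];
    size_t count;
    size_t pad_until;
};

/* Append a string, keeping data NUL-terminated; drops text that does not fit. */
static void out_puts(struct out *o, const char *s)
{
    size_t len = strlen(s);

    if (o->count + len >= sizeof(o->data))
        return;
    memcpy(o->data + o->count, s, len + 1);
    o->count += len;
}

/* Remember the column that the next out_pad() call should pad up to. */
static void out_setwidth(struct out *o, size_t size)
{
    o->pad_until = o->count + size;
}

/* Pad with spaces up to the remembered column, then append c unless it is 0. */
static void out_pad(struct out *o, char c)
{
    while (o->count < o->pad_until && o->count + 1 < sizeof(o->data))
        o->data[o->count++] = ' ';
    if (c && o->count + 1 < sizeof(o->data))
        o->data[o->count++] = c;
    o->data[o->count] = '\0';
}
```

Callers never see a length at all: they set a width, write the field, and pad,
which keeps the buffer position internal to the output object.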

Re: [PATCH 0/2] vsprintf: ignore %n again

2013-09-18 Thread Tetsuo Handa

Kees Cook wrote:
> > Consider, e.g. introducing __vsnprint(), with vsnprintf(s, n, fmt, ...)
> > expanding to __vsnprintf(1, s, n, fmt, ...) if fmt is a string literal
> > and __vsnprintf(0, s, n, fmt, ...) otherwise.  Now,
> > int __sprintf(int safe, char *buf, const char *fmt, ...)
> > {
> > va_list args;
> > int i;
> >
> > va_start(args, fmt);
> > i = __vsnprintf(safe, buf, INT_MAX, fmt, args);
> > va_end(args);
> >
> > return i;
> > }
> 
> Unless I've misunderstood, I think we'd already get very close to this
> with the gcc options instead. This patch set is what I've been using
> to generate the format string fixes over the last few months, with 7
> sent this last round:

Can we utilize __builtin_constant_p() ?

-- source start --
#include <stdio.h>

#define func(fmt)   \
if (__builtin_constant_p(fmt))  \
printf("const : %s\n", fmt);\
else\
printf("not const : %s\n", fmt);\


int main(int argc, char *argv[])
{
const char *fmt1 = "const char *";
const char fmt2[] = "const char []";
const char * const fmt3 = "const char * const";
func("literal");
func(fmt1);
func(fmt2);
func(fmt3);
return 0;
}
-- source end --

-- output start --
const : literal
not const : const char *
not const : const char []
const : const char * const
-- output end --

__builtin_constant_p() seems to enforce use of either "literal" or "* const".

An example change

-- patch start --
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -120,8 +120,9 @@ asmlinkage int printk_emit(int facility, int level,
   const char *dict, size_t dictlen,
   const char *fmt, ...);
 
-asmlinkage __printf(1, 2) __cold
-int printk(const char *fmt, ...);
+//asmlinkage __printf(1, 2) __cold
+//int printk(const char *fmt, ...);
+#define printk(fmt, ...) compiletime_assert(__builtin_constant_p(fmt), "Non-constant format string")
 
 /*
  * Special printk facility for scheduler use only, _DO_NOT_USE_ !
-- patch end --

caught errors like below.

  CC [M]  drivers/scsi/esas2r/esas2r_log.o
drivers/scsi/esas2r/esas2r_log.c: In function 'esas2r_log_master':
drivers/scsi/esas2r/esas2r_log.c:174: error: call to '__compiletime_assert_174' 
declared with attribute error: Non-constant format string

Re: [PATCH 0/2] vsprintf: ignore %n again

2013-09-18 Thread Tetsuo Handa

Kees Cook wrote:
> > -- output start --
> > const : literal
> > not const : const char *
> > not const : const char []
> > const : const char * const
> 
> What version of gcc did you use? I don't get the last as const, for
> some reason. And as Dan mentions, shouldn't const char[] be detected
> as const too?

This worked on

  gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-3)
  gcc (GCC) 3.3.5 (Debian 1:3.3.5-13)

with -On (where n != 0), but didn't work on

  gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

. Oops...
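
Since the result for "* const" and "[]" differs between compiler versions, a
more conservative sketch would rely only on the two ends of the spectrum: a
string literal written at the call site, and a pointer whose value is only
known at run time. The names FMT_IS_CONSTANT(), literal_is_constant() and
runtime_pointer_is_constant() are made up for this illustration, and the
noinline attribute is an assumption added so that the compiler cannot see the
caller's argument.

```c
#include <assert.h>
#include <stdio.h>

/* Non-zero if the compiler can prove, at this point, that fmt is constant. */
#define FMT_IS_CONSTANT(fmt) __builtin_constant_p(fmt)

/* The argument is opaque here, so it is not expected to count as constant. */
static __attribute__((noinline)) int runtime_pointer_is_constant(const char *fmt)
{
	return FMT_IS_CONSTANT(fmt);
}

/* A string literal written directly in the macro argument. */
static int literal_is_constant(void)
{
	return FMT_IS_CONSTANT("literal");
}

/* Optional sanity check for the literal case, the one the test run in this
 * thread reported as "const". */
static void check_literal_detection(void)
{
	assert(literal_is_constant() == 1);
}
```

Anything between these two cases (const arrays, const pointers to literals)
should be treated as compiler and optimization-level dependent, as the gcc
versions listed above show.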

Re: [PATCH 0/2] vsprintf: ignore %n again

2013-09-19 Thread Tetsuo Handa

If the code to test is built into vmlinux, we could use run-time checking like

--
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -1601,6 +1601,11 @@ int vsnprintf(char *buf, size_t size, const char *fmt, 
va_list args)
if (WARN_ON_ONCE((int) size < 0))
return 0;
 
+   if (!(__start_rodata <= fmt && fmt < __end_rodata)) {
+   static unsigned char warn = 100;
+   WARN(warn && warn--, "Format string is not in RODATA section.");
+   }
+
str = buf;
end = buf + size;
 
--

which reports errors like below.

[0.814121] [ cut here ]
[0.814985] WARNING: CPU: 0 PID: 1 at lib/vsprintf.c:1606 
vsnprintf+0xb4/0x3f0()
[0.816036] Format string is not in RODATA section.
[0.816883] Modules linked in:
[0.817490] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 
3.12.0-rc1-00046-g9baa505-dirty #180
[0.818974] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[0.820040]  0646  df079dc0 c02eae66 df079dcc c02eaead c02f3684 
df079df8
[0.821538]  c01422a9 c061bc40 df079e28 0001 c061bc31 0646 c02f3684 
df079e08
[0.822995]  0646 c061bc40 df079e14 c0142301 0009 df079e08 c061bc40 
df079e28
[0.824676] Call Trace:
[0.825124]  [] __dump_stack+0x16/0x20
[0.825922]  [] dump_stack+0x3d/0x60
[0.826691]  [] ? vsnprintf+0xb4/0x3f0
[0.827478]  [] warn_slowpath_common+0x79/0xa0
[0.828040]  [] ? vsnprintf+0xb4/0x3f0
[0.828917]  [] warn_slowpath_fmt+0x31/0x40
[0.829798]  [] vsnprintf+0xb4/0x3f0
[0.830584]  [] ? trace_hardirqs_on+0xb/0x10
[0.832050]  [] kvasprintf+0x24/0x60
[0.832831]  [] kobject_set_name_vargs+0x21/0x60
[0.833850]  [] kobject_add_varg+0x21/0x50
[0.834750]  [] kobject_init_and_add+0x29/0x30
[0.835699]  [] sysfs_slab_add+0x63/0xe0
[0.836055]  [] ? kmem_cache_init_late+0x10/0x10
[0.837054]  [] slab_sysfs_init+0x77/0x110
[0.838025]  [] ? procswaps_init+0x21/0x30
[0.838958]  [] do_one_initcall+0x32/0xd0
[0.840046]  [] ? parse_one+0xc0/0xe0
[0.840852]  [] ? parse_args+0x7a/0x170
[0.841676]  [] ? loglevel+0x30/0x30
[0.842443]  [] do_initcall_level+0x7a/0x90
[0.843338]  [] ? loglevel+0x30/0x30
[0.844038]  [] do_initcalls+0x18/0x20
[0.844960]  [] do_basic_setup+0x28/0x30
[0.845894]  [] kernel_init_freeable+0x5f/0xf0
[0.846907]  [] kernel_init+0xb/0xe0
[0.848045]  [] ret_from_kernel_thread+0x1c/0x2c
[0.849070]  [] ? rest_init+0x140/0x140
[0.849987] ---[ end trace c57fc7b42d34a992 ]---

--
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -5136,7 +5136,7 @@ static int sysfs_slab_add(struct kmem_cache *s)
}
 
s->kobj.kset = slab_kset;
-   err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, name);
+   err = kobject_init_and_add(&s->kobj, &slab_ktype, NULL, "%s", name);
if (err) {
kobject_put(&s->kobj);
return err;
--

But tools like sparse might find such bugs better?
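
The kind of fix shown in the mm/slub.c hunk above can be reproduced in
userspace with a small sketch. Here set_name() is a hypothetical stand-in for a
function taking a printf-style format (such as kobject_init_and_add()), and
set_name_from_data() is an invented wrapper; neither is a kernel API.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a kernel function that takes a format string. */
static __attribute__((format(printf, 3, 4)))
int set_name(char *dst, size_t size, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(size > 0);
	va_start(args, fmt);
	len = vsnprintf(dst, size, fmt, args);
	va_end(args);
	return len;
}

/* Safe form: the literal "%s" is the format and name is only data, so any
 * '%' characters inside name are copied verbatim instead of interpreted. */
static int set_name_from_data(char *dst, size_t size, const char *name)
{
	return set_name(dst, size, "%s", name);
}
```

Passing name directly as the format, as in set_name(dst, size, name), would
make a name such as "cache-50%d" consume a variadic argument that was never
supplied, which is why the "%s" form is preferred.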

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-19 Thread Tetsuo Handa

George Spelvin wrote:
> >>   seq_setwidth(m, 21);
> >>   seq_printf(m, "%s%d", con->name, con->index);
> >>   seq_pad(m, '\n');
> 
> > Ooh, I like this a lot! Much cleaner.
> 
> That's certainly a good way to do it, too.
> My "general principles" filter thinks it should be in a local variable
> if it can, but hiding it in the struct seq_file is fine if people
> find that cleaner.

I think the good point of seq_file is that users don't need to worry about
length calculation for overflow detection. This patch helps users to simplify
length calculation for alignment. I think this approach is better if adding
a size_t to the seq_file is acceptable.

So, the patch follows.
--------
From 02b28fd709971f71e5de9a5b595ff4fd059028b3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Thu, 19 Sep 2013 17:23:17 +0900
Subject: [PATCH] seq_file: Introduce seq_setwidth() and seq_pad()

There are several users who want to know bytes written by seq_*() for alignment
purpose. Currently they are using %n format for knowing it because seq_*()
returns 0 on success.

This patch introduces seq_setwidth() and seq_pad() for allowing them to align
without using %n format.

Signed-off-by: Tetsuo Handa 
---
 fs/seq_file.c|   15 +++
 include/linux/seq_file.h |   15 +++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 3135c25..40e471e 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -764,6 +764,21 @@ int seq_write(struct seq_file *seq, const void *data, 
size_t len)
 }
 EXPORT_SYMBOL(seq_write);
 
+/**
+ * seq_pad - write padding spaces to buffer
+ * @m: seq_file identifying the buffer to which data should be written
+ * @c: the byte to append after padding if non-zero
+ */
+void seq_pad(struct seq_file *m, char c)
+{
+   int size = m->pad_until - m->count;
+   if (size > 0)
+   seq_printf(m, "%*s", size, "");
+   if (c)
+   seq_putc(m, c);
+}
+EXPORT_SYMBOL(seq_pad);
+
 struct list_head *seq_list_start(struct list_head *head, loff_t pos)
 {
struct list_head *lh;
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 4e32edc..52e0097 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -20,6 +20,7 @@ struct seq_file {
size_t size;
size_t from;
size_t count;
+   size_t pad_until;
loff_t index;
loff_t read_pos;
u64 version;
@@ -79,6 +80,20 @@ static inline void seq_commit(struct seq_file *m, int num)
}
 }
 
+/**
+ * seq_setwidth - set padding width
+ * @m: the seq_file handle
+ * @size: the max number of bytes to pad.
+ *
+ * Call seq_setwidth() for setting max width, then call seq_printf() etc. and
+ * finally call seq_pad() to pad the remaining bytes.
+ */
+static inline void seq_setwidth(struct seq_file *m, size_t size)
+{
+   m->pad_until = m->count + size;
+}
+void seq_pad(struct seq_file *m, char c);
+
 char *mangle_path(char *s, const char *p, const char *esc);
 int seq_open(struct file *, const struct seq_operations *);
 ssize_t seq_read(struct file *, char __user *, size_t, loff_t *);
-- 
1.7.1

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-19 Thread Tetsuo Handa

Hello.

We are discussing removal of %n support from vsnprintf() at
https://lkml.org/lkml/2013/9/16/52 , and you are using %n in seq_printf().

I posted https://lkml.org/lkml/diff/2013/9/19/53/1 , which introduces
seq_setwidth() / seq_pad() so that seq_printf() users can avoid %n.
Assuming that this patch is merged, would you confirm that the patch below
does not break your code?

Regards.

From f8b60ebe3971901b93dedb8eee0f85b60d0fdc5f Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Fri, 20 Sep 2013 12:01:07 +0900
Subject: [PATCH] Remove "%n" usage from seq_file users.

All seq_printf() users are using "%n" for calculating padding size, convert
them to use seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa 
---
 fs/proc/consoles.c   |   10 --
 fs/proc/nommu.c  |   12 +---
 fs/proc/task_mmu.c   |   20 ++--
 fs/proc/task_nommu.c |   19 ++-
 net/ipv4/fib_trie.c  |   13 +++--
 net/ipv4/ping.c  |   15 +++
 net/ipv4/tcp_ipv4.c  |   33 +++--
 net/ipv4/udp.c   |   15 +++
 net/phonet/socket.c  |   24 +++-
 net/sctp/objcnt.c|9 +
 10 files changed, 73 insertions(+), 97 deletions(-)

diff --git a/fs/proc/consoles.c b/fs/proc/consoles.c
index b701eaa..51942d5 100644
--- a/fs/proc/consoles.c
+++ b/fs/proc/consoles.c
@@ -29,7 +29,6 @@ static int show_console_dev(struct seq_file *m, void *v)
char flags[ARRAY_SIZE(con_flags) + 1];
struct console *con = v;
unsigned int a;
-   int len;
dev_t dev = 0;
 
if (con->device) {
@@ -47,11 +46,10 @@ static int show_console_dev(struct seq_file *m, void *v)
con_flags[a].name : ' ';
flags[a] = 0;
 
-   seq_printf(m, "%s%d%n", con->name, con->index, &len);
-   len = 21 - len;
-   if (len < 1)
-   len = 1;
-   seq_printf(m, "%*c%c%c%c (%s)", len, ' ', con->read ? 'R' : '-',
+   seq_setwidth(m, 21 - 1);
+   seq_printf(m, "%s%d", con->name, con->index);
+   seq_pad(m, ' ');
+   seq_printf(m, "%c%c%c (%s)", con->read ? 'R' : '-',
con->write ? 'W' : '-', con->unblank ? 'U' : '-',
flags);
if (dev)
diff --git a/fs/proc/nommu.c b/fs/proc/nommu.c
index ccfd99b..5f9bc8a 100644
--- a/fs/proc/nommu.c
+++ b/fs/proc/nommu.c
@@ -39,7 +39,7 @@ static int nommu_region_show(struct seq_file *m, struct 
vm_region *region)
unsigned long ino = 0;
struct file *file;
dev_t dev = 0;
-   int flags, len;
+   int flags;
 
flags = region->vm_flags;
file = region->vm_file;
@@ -50,8 +50,9 @@ static int nommu_region_show(struct seq_file *m, struct 
vm_region *region)
ino = inode->i_ino;
}
 
+   seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
seq_printf(m,
-  "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu %n",
+  "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
   region->vm_start,
   region->vm_end,
   flags & VM_READ ? 'r' : '-',
@@ -59,13 +60,10 @@ static int nommu_region_show(struct seq_file *m, struct 
vm_region *region)
   flags & VM_EXEC ? 'x' : '-',
   flags & VM_MAYSHARE ? flags & VM_SHARED ? 'S' : 's' : 'p',
   ((loff_t)region->vm_pgoff) << PAGE_SHIFT,
-  MAJOR(dev), MINOR(dev), ino, &len);
+  MAJOR(dev), MINOR(dev), ino);
 
if (file) {
-   len = 25 + sizeof(void *) * 6 - len;
-   if (len < 1)
-   len = 1;
-   seq_printf(m, "%*c", len, ' ');
+   seq_pad(m, ' ');
seq_path(m, &file->f_path, "");
}
 
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 7366e9d..cc24165 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -83,14 +83,6 @@ unsigned long task_statm(struct mm_struct *mm,
return mm->total_vm;
 }
 
-static void pad_len_spaces(struct seq_file *m, int len)
-{
-   len = 25 + sizeof(void*) * 6 - len;
-   if (len < 1)
-   len = 1;
-   seq_printf(m, "%*c", len, ' ');
-}
-
 #ifdef CONFIG_NUMA
 /*
  * These functions are for numa_maps but called in generic **maps seq_file
@@ -268,7 +260,6 @@ show_map_vma(struct seq_file *m, struct vm_area_struct 
*vma, int is_pid)
unsigned long long pgoff = 0;
unsigned long start, end;
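
The fs/proc/consoles.c hunk above changes the padding arithmetic from "at least
one space, up to column 21" to "pad to column 20, then always one space". The
following userspace sketch checks that the two produce the same text. It is an
illustration only: pad_old_style() and pad_new_style() are invented names, and
the return value of snprintf() stands in for the length that "%n" used to
report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Old style: learn the printed length, then emit at least one space. */
static void pad_old_style(char *dst, size_t size, const char *name, int index)
{
	int len = snprintf(dst, size, "%s%d", name, index); /* stands in for %n */
	int pad = 21 - len;

	assert(len >= 0 && (size_t)len + 22 <= size);
	if (pad < 1)
		pad = 1;
	snprintf(dst + len, size - len, "%*c", pad, ' ');
}

/* New style: pad up to column 20, then always append one separator. */
static void pad_new_style(char *dst, size_t size, const char *name, int index)
{
	int len = snprintf(dst, size, "%s%d", name, index);
	int pad = (21 - 1) - len;	/* models seq_setwidth(m, 21 - 1) */

	assert(len >= 0 && (size_t)len + 22 <= size);
	if (pad > 0) {			/* models the padding in seq_pad() */
		snprintf(dst + len, size - len, "%*s", pad, "");
		len += pad;
	}
	snprintf(dst + len, size - len, "%c", ' ');	/* seq_pad(m, ' ') */
}
```

For a short name both variants end at column 21; for a name longer than 20
bytes both append exactly one space.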

Re: [linux-next-20130903] module: broken module versions?

2013-09-04 Thread Tetsuo Handa

Hello.

Tetsuo Handa wrote:
> I tried further debugging but not yet successful.

I found what was wrong.

Bad kernel config contained

  CONFIG_PHYSICAL_START=0x100
  CONFIG_PHYSICAL_ALIGN=0x10

whereas good kernel config contained

  CONFIG_PHYSICAL_START=0x100
  CONFIG_PHYSICAL_ALIGN=0x100

. I think that there is no constraint checking for these values, which resulted in
something being overlaid on the kcrctab table.

Now linux-next-20130903 works fine. Thank you.

[3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-07 Thread Tetsuo Handa

Hello.

I noticed that 3.11 and current linux.git do not boot (they hang before
printing the "Linux version 3.10.0-rc7-00026-g040a0a3" line) when built with
CONFIG_DEBUG_MUTEXES=y using gcc (GCC) 3.3.5 (Debian 1:3.3.5-13). They boot OK
when built with the same config using gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3.

Bisection reached commit 040a0a37 "mutex: Add support for wound/wait style
locks". This commit might contain a gcc version dependent trick, but how can I
find it?

Kernel config (only for testing whether the kernel version line is printed) is
at http://I-love.SAKURA.ne.jp/tmp/config-3.11-mutex and the command line I used
for testing is

  $ qemu-system-i386 -m 512 -nographic -kernel arch/x86/boot/bzImage --append "console=ttyS0,115200n8"

.

Regards.

-- bisection log start --
# bad: [ad81f0545ef01ea651886dddac4bef6cec930092] Linux 3.11-rc1
# good: [8bb495e3f02401ee6f76d1b1d77f3ac9f079e376] Linux 3.10
# good: [c1be5a5b1b355d40e6cf79cc979eb66dafa24ad1] Linux 3.9
# good: [19f949f52599ba7c3f67a5897ac6be14bfcb1200] Linux 3.8
# good: [29594404d7fe73cd80eaa4ee8c43dcc53970c60e] Linux 3.7
# good: [a0d271cbfed1dd50278c6b06bead3d00ba0a88f9] Linux 3.6
# good: [28a33cbc24e4256c143dce96c7d93bf423229f92] Linux 3.5
# good: [76e10d158efb6d4516018846f60c2ab5501900bc] Linux 3.4
# good: [c16fa4f2ad19908a47c63d8fa436a1178438c7e7] Linux 3.3
# good: [805a6af8dba5dfdd35ec35dc52ec0122400b2610] Linux 3.2
# good: [c3b92c8787367a8bb53d57d9789b558f1295cc96] Linux 3.1
# good: [02f8c6aee8df3cdc935e9bdd4f2d020306035dbe] Linux 3.0
git bisect start 'v3.11-rc1' 'v3.10' 'v3.9' 'v3.8' 'v3.7' 'v3.6' 'v3.5' 'v3.4' 
'v3.3' 'v3.2' 'v3.1' 'v3.0'
# bad: [1286da8bc009cb2aee7f285e94623fc974c0c983] Merge tag 'sound-3.11' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
git bisect bad 1286da8bc009cb2aee7f285e94623fc974c0c983
# good: [ee1a8d402e7e204d57fb108aa40003b6d1633036] Merge tag 'dt-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
git bisect good ee1a8d402e7e204d57fb108aa40003b6d1633036
# bad: [3e34131a65127e73fbae68c82748f32c8af7e4a4] Merge tag 
'stable/for-linus-3.11-rc0-tag-two' of 
git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
git bisect bad 3e34131a65127e73fbae68c82748f32c8af7e4a4
# bad: [790eac5640abf7a57fa3a644386df330e18c11b0] Merge branch 'for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
git bisect bad 790eac5640abf7a57fa3a644386df330e18c11b0
# bad: [f0bb4c0ab064a8aeeffbda1cee380151a594eaab] Merge branch 
'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad f0bb4c0ab064a8aeeffbda1cee380151a594eaab
# good: [3e42dee676e8cf5adca817b1518b2e99d1c138ff] Merge branch 
'core-locking-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect good 3e42dee676e8cf5adca817b1518b2e99d1c138ff
# good: [130768b8c93cd8d21390a136ec8cef417153ca14] perf/x86/intel: Add Haswell 
PEBS record support
git bisect good 130768b8c93cd8d21390a136ec8cef417153ca14
# bad: [ab3d681e9d41816f90836ea8fe235168d973207f] Merge branch 
'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
git bisect bad ab3d681e9d41816f90836ea8fe235168d973207f
# good: [14961444696effb2e660fe876e5c1880f8bc3932] rcu: Shrink TINY_RCU by 
reworking CPU-stall ifdefs
git bisect good 14961444696effb2e660fe876e5c1880f8bc3932
# good: [be77f87c001b770f13fe742becb08b847d9542f1] Merge branches 
'cbnum.2013.06.10a', 'doc.2013.06.10a', 'fixes.2013.06.10a', 'srcu.2013.06.10a' 
and 'tiny.2013.06.10a' into HEAD
git bisect good be77f87c001b770f13fe742becb08b847d9542f1
# bad: [2fe3d4b149ccebbb384062fbbe6634439f2bf120] mutex: Add more tests to 
lib/locking-selftest.c
git bisect bad 2fe3d4b149ccebbb384062fbbe6634439f2bf120
# bad: [040a0a37100563754bb1fee6ff6427420bcfa609] mutex: Add support for 
wound/wait style locks
git bisect bad 040a0a37100563754bb1fee6ff6427420bcfa609
# good: [a41b56efa70e060f650aeb54740aaf52044a1ead] arch: Make 
__mutex_fastpath_lock_retval return whether fastpath succeeded or not
git bisect good a41b56efa70e060f650aeb54740aaf52044a1ead
-- bisection log end --

Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-07 Thread Tetsuo Handa

Hello.

I found what is wrong.

-- bad patch start --
From 3c56dfbd32a9b67ba824ce96128bb513eb65de4b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Sun, 8 Sep 2013 12:44:20 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() 
usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" which I guess the author meant that
"__builtin_constant_p(p) && p", but gcc 3.x cannot handle such expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Signed-off-by: Tetsuo Handa 
Cc:  [3.11+]
---
 kernel/mutex.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index a52ee7bb..0a6f14f 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -448,7 +448,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
struct task_struct *owner;
struct mspin_node  node;
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (__builtin_constant_p(ww_ctx) && ww_ctx && ww_ctx->acquired 
> 0) {
struct ww_mutex *ww;
 
ww = container_of(lock, struct ww_mutex, base);
@@ -478,7 +478,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
if ((atomic_read(&lock->count) == 1) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (__builtin_constant_p(ww_ctx) && ww_ctx) {
struct ww_mutex *ww;
ww = container_of(lock, struct ww_mutex, base);
 
@@ -548,7 +548,7 @@ slowpath:
goto err;
}
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (__builtin_constant_p(ww_ctx) && ww_ctx && ww_ctx->acquired 
> 0) {
ret = __mutex_lock_check_stamp(lock, ww_ctx);
if (ret)
goto err;
@@ -568,7 +568,7 @@ done:
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
 
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (__builtin_constant_p(ww_ctx) && ww_ctx) {
struct ww_mutex *ww = container_of(lock,
  struct ww_mutex,
  base);
-- 
1.7.8
-- bad patch end --

However, after applying the patch above, I get problems (both gcc 3.x and 4.x)
with locking selftests.

-- gcc version 3.3.5 start --
[0.00] Linux version 3.11.0-dirty (root@aqua) (gcc version 3.3.5 
(Debian 1:3.3.5-13)) #124 SMP Sun Sep 8 12:05:18 JST 2013
(...snipped...)
[0.00]   
--
[0.00]   | Wound/wait tests |
[0.00]   -
[0.00]   ww api failures:
[0.00] [ cut here ]
[0.00] WARNING: CPU: 0 PID: 0 at lib/locking-selftest.c:1143 
ww_test_fail_acquire+0x112/0x2c0()
[0.00] Modules linked in:
[0.00] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-dirty #124
[0.00] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
[0.00]  0477  c1577f18 c11e6736 c1577f24 c11e677d c11ffeb2 
c1577f50
[0.00]  c1041af9 c14f5ea0   c15114fa 0477 c11ffeb2 
c1cd9360
[0.00]   0001 c1577f60 c1041bbd 0009  c1577f84 
c11ffeb2
[0.00] Call Trace:
[0.00]  [] __dump_stack+0x16/0x20
[0.00]  [] dump_stack+0x3d/0x60
[0.00]  [] ? ww_test_fail_acquire+0x112/0x2c0
[0.00]  [] warn_slowpath_common+0x79/0xa0
[0.00]  [] ? ww_test_fail_acquire+0x112/0x2c0
[0.00]  [] warn_slowpath_null+0x1d/0x30
[0.00]  [] ww_test_fail_acquire+0x112/0x2c0
[0.00]  [] ? dotest+0x42/0x100
[0.00]  [] ? dotest+0x100/0x100
[0.00]  [] dotest+0x42/0x100
[0.00]  [] ? printk+0x35/0x40
[0.00]  [] ww_tests+0x53/0x410
[0.00]  [] locking_selftest+0x19a4/0x1ab0
[0.00]  [] start_kernel+0x1ec/0x2c0
[0.00]  [] ? repair_env_string+0x70/0x70
[0.00]  [] i386_start_kernel+0x25/0x30
[0.00] ---[ end trace 74d4202eb2b56266 ]---
[0.00]   ok  |  ok  |  ok  |
[0.00]ww contexts mixing:  ok  |FAILED|
[0.00] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW
3.11.0-dirty #124
[

Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-08 Thread Tetsuo Handa

Hello.

Ilia Mirkin wrote:
> > Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
> > "!__builtin_constant_p(p == NULL)" which I guess the author meant that
> > "__builtin_constant_p(p) && p", but gcc 3.x cannot handle such expression
> > correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.
> 
> I think that !__builtin_constant_p(p == NULL) is basically saying "I
> am unable to conclude that p == NULL at build time", which would
> translate to something along the lines of
> 
> (__builtin_constant_p(p) && p) || !__builtin_constant_p(p)
> 

I think

  (__builtin_constant_p(p) && p) && p->acquired > 0

is safe but

  (!__builtin_constant_p(p)) && p->acquired > 0

is not safe, because a "p != NULL" check is required to avoid a NULL pointer
dereference.

It seems to me that

  (!__builtin_constant_p(p == NULL))

needs to be translated to something along the lines of

  (__builtin_constant_p(p) && p) || (!__builtin_constant_p(p) && p)

which can be simplified as

  (p)

.
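
The simplification above can be checked mechanically. In the sketch below, the
boolean c models the value of __builtin_constant_p(p) and p_nonnull models
"p != NULL"; both function names are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* c models __builtin_constant_p(p); p_nonnull models "p != NULL". */
static bool expanded_form(bool c, bool p_nonnull)
{
	return (c && p_nonnull) || (!c && p_nonnull);
}

/* Walk the whole truth table: the expanded form must equal plain "p". */
static void check_simplification(void)
{
	for (int c = 0; c <= 1; c++)
		for (int p = 0; p <= 1; p++)
			assert(expanded_form(c, p) == (bool)p);
}
```

All four combinations agree with "p", so the constant-ness of p drops out of
the condition and only the NULL check remains.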

> Or perhaps it's just equivalent to !__builtin_constant_p(p), since the
> compiler's ability to conclude whether it is NULL at build-time should
> be unaffected by whether it actually is NULL or not.

Likewise, it seems to me that

  (!__builtin_constant_p(p == NULL))

needs to be translated to something along the lines of

  (!__builtin_constant_p(p) && p)

. This change can also fix the "boot failure on gcc 3.x" and avoid the "locking
selftests failure on gcc 3.x / 4.x". OK, let's wait for an answer from the author.

Can I add "Signed-off-by: Ilia Mirkin " to the patch below?

-- good patch start --
From a8bbf6b3c2d44cb90d63820f146aaff119d871c9 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Sun, 8 Sep 2013 16:09:27 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() 
usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Fix it by changing from "!__builtin_constant_p(p == NULL)" to
"!__builtin_constant_p(p) && p".

Signed-off-by: Tetsuo Handa 
Cc:  [3.11+]
---
 kernel/mutex.c |8 
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index a52ee7bb..ef02003 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -448,7 +448,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
struct task_struct *owner;
struct mspin_node  node;
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (!__builtin_constant_p(ww_ctx) && ww_ctx && ww_ctx->acquired 
> 0) {
struct ww_mutex *ww;
 
ww = container_of(lock, struct ww_mutex, base);
@@ -478,7 +478,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
if ((atomic_read(&lock->count) == 1) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (!__builtin_constant_p(ww_ctx) && ww_ctx) {
struct ww_mutex *ww;
ww = container_of(lock, struct ww_mutex, base);
 
@@ -548,7 +548,7 @@ slowpath:
goto err;
}
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (!__builtin_constant_p(ww_ctx) && ww_ctx && ww_ctx->acquired 
> 0) {
ret = __mutex_lock_check_stamp(lock, ww_ctx);
if (ret)
goto err;
@@ -568,7 +568,7 @@ done:
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
 
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (!__builtin_constant_p(ww_ctx) && ww_ctx) {
struct ww_mutex *ww = container_of(lock,
  struct ww_mutex,
  base);
-- 
1.7.8
-- good patch end --

Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-08 Thread Tetsuo Handa

Hello.

Maarten Lankhorst wrote:
> if it's broken for your compiler, please add a bool use_ww_ctx or something 
> to __mutex_lock_common that's set directly instead, the __builtin_constant_p 
> trick
> might be too gcc version specific.

I see. I tested that both gcc 3.x and gcc 4.x can generate a bootable kernel
using the fix below.
--
From f71fb89bccaa7ed5b3a14e735a1b9cc1a0e7112d Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Sun, 8 Sep 2013 20:37:19 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p()
 usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Signed-off-by: Tetsuo Handa 
Cc:  [3.11+]
---
 kernel/mutex.c |9 +
 1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index a52ee7bb..2e7984e 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -414,6 +414,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
struct mutex_waiter waiter;
unsigned long flags;
int ret;
+   const bool use_ww_ctx = !!ww_ctx;
 
preempt_disable();
mutex_acquire_nest(&lock->dep_map, subclass, 0, nest_lock, ip);
@@ -448,7 +449,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
struct task_struct *owner;
struct mspin_node  node;
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
struct ww_mutex *ww;
 
ww = container_of(lock, struct ww_mutex, base);
@@ -478,7 +479,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
if ((atomic_read(&lock->count) == 1) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww;
ww = container_of(lock, struct ww_mutex, base);
 
@@ -548,7 +549,7 @@ slowpath:
goto err;
}
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
ret = __mutex_lock_check_stamp(lock, ww_ctx);
if (ret)
goto err;
@@ -568,7 +569,7 @@ done:
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
 
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww = container_of(lock,
  struct ww_mutex,
  base);
-- 
1.7.8


Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-09 Thread Tetsuo Handa

Maarten Lankhorst wrote:
> Almost correct. I meant passing it as parameter to __mutex_lock_common. Your 
> version will still cause an extra pointless null check in the ww_mutex_lock 
> case.

Ah, I see.
--
From 95f189eb37c25ddf8e48d5dfc2f9f1185c52b6a8 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 9 Sep 2013 20:48:13 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() 
usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Fix it by explicitly passing a bool which tells whether p != NULL or not.

Signed-off-by: Tetsuo Handa 
Cc:  [3.11+]
---
 kernel/mutex.c |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index a52ee7bb..a2b80f1 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -408,7 +408,7 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock,
 static __always_inline int __sched
 __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
struct lockdep_map *nest_lock, unsigned long ip,
-   struct ww_acquire_ctx *ww_ctx)
+   struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
struct task_struct *task = current;
struct mutex_waiter waiter;
@@ -448,7 +448,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
struct task_struct *owner;
struct mspin_node  node;
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
struct ww_mutex *ww;
 
ww = container_of(lock, struct ww_mutex, base);
@@ -478,7 +478,7 @@ __mutex_lock_common(struct mutex *lock, long state, 
unsigned int subclass,
if ((atomic_read(&lock->count) == 1) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww;
ww = container_of(lock, struct ww_mutex, base);
 
@@ -548,7 +548,7 @@ slowpath:
goto err;
}
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 
0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
ret = __mutex_lock_check_stamp(lock, ww_ctx);
if (ret)
goto err;
@@ -568,7 +568,7 @@ done:
mutex_remove_waiter(lock, &waiter, current_thread_info());
mutex_set_owner(lock);
 
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww = container_of(lock,
  struct ww_mutex,
  base);
@@ -618,7 +618,7 @@ mutex_lock_nested(struct mutex *lock, unsigned int subclass)
 {
might_sleep();
__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
-   subclass, NULL, _RET_IP_, NULL);
+   subclass, NULL, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_nested);
@@ -628,7 +628,7 @@ _mutex_lock_nest_lock(struct mutex *lock, struct 
lockdep_map *nest)
 {
might_sleep();
__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
-   0, nest, _RET_IP_, NULL);
+   0, nest, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
@@ -638,7 +638,7 @@ mutex_lock_killable_nested(struct mutex *lock, unsigned int 
subclass)
 {
might_sleep();
return __mutex_lock_common(lock, TASK_KILLABLE,
-  subclass, NULL, _RET_IP_, NULL);
+  subclass, NULL, _RET_IP_, NULL, 0);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
 
@@ -647,7 +647,7 @@ mutex_lock_interruptible_nested(struct mutex *lock, 
unsigned int subclass)
 {
might_sleep();
return __mutex_lock_common(lock, TASK_INTERRUPTIBLE,
-  subclass, NULL, _RET_IP_, NULL);
+  subclass, NULL, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
@@ -685,7 +685,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct 
ww_acquire_ctx *ctx)
 
might_sleep();
ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
-  0, &ctx->dep_map, _RET_IP_, ctx)

[checkpatch.pl] runtime error by version dependency.

2013-09-09 Thread Tetsuo Handa

Hello.

Commit d1fe9c09 "checkpatch: add some --strict coding style checks"
introduced a dependency on perl >= 5.10.0.

Although the comment says "Any use must be runtime checked with $^V",
no runtime check is done, so running with perl v5.8.4 fails with the error

  Nested quantifiers in regex; marked by <-- HERE in m/(\((?:[^\(\)]++ <-- HERE |(?-1))*\))/ at ./scripts/checkpatch.pl line 340.

Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-09 Thread Tetsuo Handa

Maarten Lankhorst wrote:
> Yeah looks ok, did you run the selftests from CONFIG_DEBUG_LOCKING_API_SELFTESTS,
> with/without CONFIG_PROVE_LOCKING and once more with DEBUG_MUTEXES also unset?

Since CONFIG_DEBUG_MUTEXES=n && CONFIG_PROVE_LOCKING=y is impossible, I tested

  CONFIG_DEBUG_MUTEXES=y
  CONFIG_PROVE_LOCKING=y
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

  CONFIG_DEBUG_MUTEXES=y
  CONFIG_PROVE_LOCKING=n
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

  CONFIG_DEBUG_MUTEXES=n
  CONFIG_PROVE_LOCKING=n
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=y

  CONFIG_DEBUG_MUTEXES=y
  CONFIG_PROVE_LOCKING=y
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=n

  CONFIG_DEBUG_MUTEXES=y
  CONFIG_PROVE_LOCKING=n
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=n

  CONFIG_DEBUG_MUTEXES=n
  CONFIG_PROVE_LOCKING=n
  CONFIG_DEBUG_LOCKING_API_SELFTESTS=n

and all works OK.

[BusLogic] DMA-API: device driver failed to check map error

2013-09-09 Thread Tetsuo Handa

Hello.

I got the warning below on current linux.git.

--
[2.612237] scsi: * BusLogic SCSI Driver Version 2.1.16 of 18 July 2002 *
[2.613067] scsi: Copyright 1995-1998 by Leonard N. Zubkoff
[2.630942] scsi0: Configuring BusLogic Model BT-958 PCI Wide Ultra SCSI Host Adapter
[2.633063] scsi0:   Firmware Version: 5.07B, I/O Address: 0x10C0, IRQ Channel: 17/Level
[2.633125] scsi0:   PCI Bus: 0, Device: 16, Address: 0xD880, Host Adapter SCSI ID: 7
[2.633188] scsi0:   Parity Checking: Enabled, Extended Translation: Enabled
[2.633250] scsi0:   Synchronous Negotiation: Ultra, Wide Negotiation: Enabled
[2.635721] scsi0:   Disconnect/Reconnect: Enabled, Tagged Queuing: Enabled
[2.637054] scsi0:   Scatter/Gather Limit: 128 of 8192 segments, Mailboxes: 211
[2.637116] scsi0:   Driver Queue Depth: 211, Host Adapter Queue Depth: 192
[2.637179] scsi0:   Tagged Queue Depth: Automatic, Untagged Queue Depth: 3
[2.639812] scsi0: *** BusLogic BT-958 Initialized Successfully ***
[4.635828] scsi0 : BusLogic BT-958
[4.641883] [sched_delayed] sched: RT throttling activated
[4.647510] [ cut here ]
[4.647573] WARNING: CPU: 1 PID: 1 at lib/dma-debug.c:937 check_unmap+0x777/0x7f0()
[4.648851] pci :00:10.0: DMA-API: device driver failed to check map error[device address=0x349cddc0] [size=96 bytes] [mapped as single]
[4.649873] Modules linked in:
[4.649873] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 3.11.0-08716-g26b0332-dirty #8
[4.649873] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
[4.649873]  03a9  f64b7de4 c11e8096 f64b7df0 c11e80dd c1210437 f64b7e1c
[4.649873]  c10421b9 c1519440 f64b7e4c 0001 c1518220 03a9 c1210437 f64b7e2c
[4.649873]  0001 f64b7ea4 f64b7e38 c1042211 0009 f64b7e2c c1519440 f64b7e4c
[4.649873] Call Trace:
[4.649873]  [] __dump_stack+0x16/0x20
[4.649873]  [] dump_stack+0x3d/0x60
[4.649873]  [] ? check_unmap+0x777/0x7f0
[4.649873]  [] warn_slowpath_common+0x79/0xa0
[4.649873]  [] ? check_unmap+0x777/0x7f0
[4.649873]  [] warn_slowpath_fmt+0x31/0x40
[4.649873]  [] check_unmap+0x777/0x7f0
[4.649873]  [] debug_dma_unmap_page+0x78/0x90
[4.649873]  [] blogic_dealloc_ccb+0x83/0xb0
[4.649873]  [] blogic_process_ccbs+0x3c7/0x3f0
[4.649873]  [] ? blogic_scan_inbox+0x43/0xa0
[4.649873]  [] blogic_inthandler+0x81/0x150
[4.649873]  [] ? handle_irq_event+0x2e/0x60
[4.649873]  [] handle_irq_event_percpu+0x38/0x120
[4.649873]  [] ? handle_irq_event+0x2e/0x60
[4.649873]  [] handle_irq_event+0x37/0x60
[4.649873]  [] ? handle_level_irq+0xb0/0xb0
[4.649873]  [] handle_fasteoi_irq+0x87/0xc0
[4.649873][] ? do_IRQ+0x3c/0xb0
[4.649873]  [] ? mark_held_locks+0xca/0x100
[4.649873]  [] ? common_interrupt+0x31/0x36
[4.649873]  [] ? _raw_spin_unlock_irqrestore+0x47/0x60
[4.649873]  [] ? blogic_qcmd+0x3e/0x50
[4.649873]  [] ? scsi_dispatch_cmd+0x174/0x1f0
[4.649873]  [] ? trace_hardirqs_on+0xb/0x10
[4.649873]  [] ? scsi_request_fn+0x314/0x3a0
[4.649873]  [] ? __blk_run_queue+0x2b/0x40
[4.649873]  [] ? blk_execute_rq_nowait+0xa2/0xd0
[4.649873]  [] ? blk_rq_map_kern+0x130/0x130
[4.649873]  [] ? blk_execute_rq+0x8a/0xf0
[4.649873]  [] ? blk_rq_map_kern+0x130/0x130
[4.649873]  [] ? blk_recount_segments+0x1e/0x40
[4.649873]  [] ? aes_encrypt+0xe40/0x1450
[4.649873]  [] ? blk_rq_map_kern+0x10f/0x130
[4.649873]  [] ? scsi_execute+0xe4/0x150
[4.649873]  [] ? scsi_execute_req_flags+0x6e/0xa0
[4.649873]  [] ? scsi_probe_lun+0x112/0x300
[4.649873]  [] ? scsi_probe_and_add_lun+0x10d/0x2f0
[4.649873]  [] ? scsi_alloc_sdev+0x248/0x2b0
[4.649873]  [] ? scsi_probe_and_add_lun+0x12d/0x2f0
[4.649873]  [] ? anon_transport_class_unregister+0x30/0x30
[4.649873]  [] ? scsi_alloc_target+0x1c8/0x220
[4.649873]  [] ? __scsi_scan_target+0xa0/0xf0
[4.649873]  [] ? scsi_scan_channel+0x5a/0x90
[4.649873]  [] ? scsi_scan_host_selected+0xf9/0x140
[4.649873]  [] ? do_scsi_scan_host+0x69/0x80
[4.649873]  [] ? scsi_scan_host+0x30/0x50
[4.649873]  [] ? blogic_init+0x32c/0x3b0
[4.649873]  [] ? blogic_inithoststruct+0x80/0x80
[4.649873]  [] ? do_one_initcall+0x32/0xd0
[4.649873]  [] ? parse_one+0xc0/0xe0
[4.649873]  [] ? parse_args+0x7a/0x170
[4.649873]  [] ? loglevel+0x30/0x30
[4.649873]  [] ? do_initcall_level+0x7a/0x90
[4.649873]  [] ? loglevel+0x30/0x30
[4.649873]  [] ? do_initcalls+0x18/0x20
[4.649873]  [] ? do_basic_setup+0x28/0x30
[4.649873]  [] ? kernel_init_freeable+0x5f/0xf0
[4.649873]  [] ? kernel_init+0xb/0xe0
[4.649873]  [] ? ret_from_kernel_thread+0x1c/0x2c
[4.649873]  [] ? rest_init+0x140/0x140
[4.649873] ---[ end trace d5c0cda419f9730c ]---
[4.649873] Mapped at:
[

Re: [linux-next-20130822] module: broken module versions?

2013-09-10 Thread Tetsuo Handa

Hello.

Andrew Morton wrote:
> OK, thanks, I've dropped
> syscallsh-use-gcc-alias-instead-of-assembler-aliases-for-syscalls.patch
> and scripts-mod-modpostc-handle-non-abs-crc-symbols.patch.

The problem was solved ( https://lkml.org/lkml/2013/9/4/188 ) and therefore
there is no need to drop these patches.

Regards.

[3.12-rc1] Dependency on module-init-tools >= 3.11 ?

2013-09-11 Thread Tetsuo Handa

Hello.

I'm again having the boot failure problem due to commit 68411521 'Reinstate
"crypto: crct10dif - Wrap crc_t10dif function all to use crypto transform
framework"' in linux.git .

-- debug start --
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -188,7 +188,9 @@ int __request_module(bool wait, const char *fmt, ...)
 
trace_module_request(module_name, wait, _RET_IP_);
 
+   printk(KERN_WARNING "request_module(%s) start\n", module_name);
ret = call_modprobe(module_name, wait ? UMH_WAIT_PROC : UMH_WAIT_EXEC);
+   printk(KERN_WARNING "request_module(%s) end\n", module_name);
 
atomic_dec(&kmod_concurrent);
return ret;
-- debug end --

-- dmesg start --
[5.130608] Fusion MPT base driver 3.04.20
[5.130625] Copyright (c) 1999-2008 LSI Corporation
[5.136709] Fusion MPT SPI Host driver 3.04.20
[5.151422] mptbase: ioc0: Initiating bringup
[5.169695] ioc0: LSI53C1030 B0: Capabilities={Initiator}
[5.213380] scsi2 : ioc0: LSI53C1030 B0, FwRev=01032920h, Ports=1, MaxQ=128, IRQ=17
[5.223811] Switched to clocksource tsc
[5.247993] scsi 2:0:0:0: Direct-Access VMware,  VMware Virtual S 1.0  PQ: 0 ANSI: 2
[5.249783] scsi target2:0:0: Beginning Domain Validation
[5.258664] scsi target2:0:0: Domain Validation skipping write tests
[5.259592] scsi target2:0:0: Ending Domain Validation
[5.260933] scsi target2:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
[5.263868] scsi 2:0:1:0: Direct-Access VMware,  VMware Virtual S 1.0  PQ: 0 ANSI: 2
[5.264627] scsi target2:0:1: Beginning Domain Validation
[5.270310] scsi target2:0:1: Domain Validation skipping write tests
[5.270563] scsi target2:0:1: Ending Domain Validation
[5.271742] scsi target2:0:1: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
[5.423813] sr0: scsi3-mmc drive: 1x/1x writer dvd-ram cd/rw xa/form2 cdda tray
[5.424805] cdrom: Uniform CD-ROM driver Revision: 3.20
[5.436731] request_module(crct10dif) start
[5.441143] request_module(crct10dif) end
[5.441873] request_module(crct10dif-all) start
[5.446616] request_module(crct10dif-all) end
[5.462070] request_module(crct10dif) start
[5.466579] request_module(crct10dif) end
[5.466648] request_module(crct10dif-all) start
[5.470592] request_module(crct10dif-all) end
[5.544469] scsi_id (268) used greatest stack depth: 3552 bytes left
FATAL: Module scsi_wait_scan not found.
(...snipped...)
FATAL: Module scsi_wait_scan not found.
[   59.306043] dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
[   59.308188] dracut Warning: Signal caught!
-- dmesg end --

In the initramfs, crc-t10dif.ko is included but crct10dif.ko and
crct10dif-pclmul.ko are not included. This is because modules.dep does not
describe that crc-t10dif.ko depends on crct10dif.ko and optionally depends
on crct10dif-pclmul.ko .

-- dependency start --
$ grep t10dif modules.dep
kernel/arch/x86/crypto/crct10dif-pclmul.ko: kernel/crypto/crct10dif.ko
kernel/crypto/crct10dif.ko:
kernel/drivers/scsi/lpfc/lpfc.ko: kernel/drivers/scsi/scsi_transport_fc.ko kernel/drivers/scsi/scsi_tgt.ko kernel/lib/crc-t10dif.ko
kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko
kernel/drivers/scsi/scsi_debug.ko: kernel/lib/crc-t10dif.ko
kernel/lib/crc-t10dif.ko:
-- dependency end --
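
The missing dependency can also be checked mechanically. Below is a minimal sketch, assuming the "module.ko: dep1.ko dep2.ko" line format shown in the listing above; dep_line_lists() is a hypothetical helper and not part of module-init-tools or dracut.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Given one modules.dep line of the form "module.ko: dep1.ko dep2.ko",
 * report whether dep_name appears among the listed dependencies.
 * Simplification: this is a plain substring match on the text after the
 * colon, so pass a name that is unambiguous within the line.
 */
static bool dep_line_lists(const char *line, const char *dep_name)
{
	const char *deps = strchr(line, ':');

	if (!deps)
		return false;	/* not a modules.dep style line */
	return strstr(deps + 1, dep_name) != NULL;
}
```

With the listing above, the "kernel/lib/crc-t10dif.ko:" line has nothing after the colon, so a check for "crct10dif.ko" on that line fails, which is why the initramfs tool has no reason to pull that module in.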

I'm using module-init-tools-3.9-21.el6_4 / binutils-2.20.51.0.2-5.36.el6 /
dracut-004-303.el6 / gcc-4.4.7-3.el6 and there is no modules.softdep file.

Did commit 7cb14ba7 "modules: add support for soft module dependencies"
silently introduce a dependency on a module-init-tools version which can
generate the modules.softdep file ( module-init-tools >= 3.11 ) ?

Kernel config is at http://I-love.SAKURA.ne.jp/tmp/config-3.12-rc1-modules .

Regards.

Re: [RFC PATCH] vsnprintf: Remove use of %n and convert existing uses

2013-09-11 Thread Tetsuo Handa

Joe Perches wrote:
> - seq_printf(m, "%s%d%n", con->name, con->index, &len);
> + len = seq_printf(m, "%s%d", con->name, con->index);

Isn't len always 0 or -1 ?

int seq_vprintf(struct seq_file *m, const char *f, va_list args)
{
	int len;

	if (m->count < m->size) {
		len = vsnprintf(m->buf + m->count, m->size - m->count, f, args);
		if (m->count + len < m->size) {
			m->count += len;
			return 0;
		}
	}
	seq_set_overflow(m);
	return -1;
}
EXPORT_SYMBOL(seq_vprintf);

int seq_printf(struct seq_file *m, const char *f, ...)
{
	int ret;
	va_list args;

	va_start(args, f);
	ret = seq_vprintf(m, f, args);
	va_end(args);

	return ret;
}
EXPORT_SYMBOL(seq_printf);
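
A minimal user-space sketch of the same logic shows why the return value cannot be used as a length. struct seq_mock and mock_seq_printf() are hypothetical names for illustration only, and the overflow handling is an assumption about what seq_set_overflow() does.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for struct seq_file; only the fields used here. */
struct seq_mock {
	char buf[32];
	size_t size;
	size_t count;
};

/* Mirrors the seq_vprintf() logic quoted above: 0 on success, -1 on overflow. */
static int mock_seq_printf(struct seq_mock *m, const char *f, ...)
{
	va_list args;
	int len;

	if (m->count < m->size) {
		va_start(args, f);
		len = vsnprintf(m->buf + m->count, m->size - m->count, f, args);
		va_end(args);
		if (m->count + len < m->size) {
			m->count += len;
			return 0;	/* success: the length is not reported */
		}
	}
	m->count = m->size;	/* mark the buffer full, standing in for seq_set_overflow() */
	return -1;
}
```

Writing "ttyS0" into an empty 32-byte buffer returns 0 while m->count advances by 5, so a caller that needs the number of bytes written has to look at m->count rather than at the return value.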

Re: [3.12-rc1] Dependency on module-init-tools >= 3.11 ?

2013-09-11 Thread Tetsuo Handa

Herbert Xu wrote:
> This way at least you'll have a working system until your initramfs
> tool is fixed to do the right thing.

Thank you. But it is module-init-tools-3.9-21.el6_4 in RHEL 6.4.
We can't wait until Red Hat backports module-init-tools >= 3.11 to RHEL 6.x.

Since most people are already using module-init-tools >= 3.11 and
there is a workaround for my case (i.e. choose built-in), just updating

  module-init-tools  0.9.10  # depmod -V

line at "Current Minimal Requirements" in Documentation/Changes will be OK.

Regards.

Re: [3.12-rc1] Dependency on module-init-tools >= 3.11 ?

2013-09-12 Thread Tetsuo Handa

Herbert Xu wrote:
> The trouble is not all distros will include the softdep modules in
> the initramfs.  So for now I think we will have to live with a fallback.

I see.

Herbert Xu wrote:
> OK, can you please try this patch on top of the current tree?
> 
> This way at least you'll have a working system until your initramfs
> tool is fixed to do the right thing.

I tested the patch and confirmed that the boot failure was solved.

But arch/x86/crypto/crct10dif-pclmul.ko is not included in the initramfs and
therefore we cannot benefit from the PCLMULQDQ version.

-- before applying patch --
kernel/arch/x86/crypto/crct10dif-pclmul.ko: kernel/crypto/crct10dif.ko
kernel/crypto/crct10dif.ko:
kernel/drivers/scsi/lpfc/lpfc.ko: kernel/drivers/scsi/scsi_transport_fc.ko kernel/drivers/scsi/scsi_tgt.ko kernel/lib/crc-t10dif.ko
kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko
kernel/drivers/scsi/scsi_debug.ko: kernel/lib/crc-t10dif.ko
kernel/lib/crc-t10dif.ko:
-- before applying patch --

-- after applying patch --
kernel/arch/x86/crypto/crct10dif-pclmul.ko: kernel/crypto/crct10dif_common.ko
kernel/crypto/crct10dif_common.ko:
kernel/crypto/crct10dif_generic.ko: kernel/crypto/crct10dif_common.ko
kernel/drivers/scsi/lpfc/lpfc.ko: kernel/drivers/scsi/scsi_transport_fc.ko kernel/drivers/scsi/scsi_tgt.ko kernel/lib/crc-t10dif.ko kernel/crypto/crct10dif_common.ko
kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko kernel/crypto/crct10dif_common.ko
kernel/drivers/scsi/scsi_debug.ko: kernel/lib/crc-t10dif.ko kernel/crypto/crct10dif_common.ko
kernel/lib/crc-t10dif.ko: kernel/crypto/crct10dif_common.ko
-- after applying patch --

Re: [3.12-rc1] Dependency on module-init-tools >= 3.11 ?

2013-09-13 Thread Tetsuo Handa

Waiman Long wrote:
> I would like to report that I also have the same boot problem on a 
> RHEL6.4 box with the crypto patch. My workaround is to force kernel 
> build to have the crc_t10dif code built-in by changing the config file:
> 
> 4889c4889
> < CONFIG_CRYPTO_CRCT10DIF=m
> ---
>  > CONFIG_CRYPTO_CRCT10DIF=y
> 5002c5002
> < CONFIG_CRC_T10DIF=m
> ---
>  > CONFIG_CRC_T10DIF=y
> 
> This solved the boot problem without any additional patch.  Do you think 
> you should consider changing the configuration default to "y" instead of 
> "m" or doesn't allow the "m" option at all?

That was proposed but not accepted.
https://lkml.org/lkml/2013/7/17/543

You should choose CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y in your kernel config
if your CPU supports PCLMULQDQ.

Re: [PATCH] SCSI: buslogic: Added check for DMA mapping errors (was Re:[BusLogic] DMA-API: device driver failed to check map error)

2013-09-13 Thread Tetsuo Handa

Khalid Aziz wrote:
> Added check for DMA mapping errors for request sense data
> buffer. Checking for mapping error can avoid potential wild
> writes. This patch was prompted by the warning from
> dma_unmap when kernel is compiled with CONFIG_DMA_API_DEBUG.

This patch fixes the CONFIG_DMA_API_DEBUG warning.
But excuse me, is this error path correct?

> @@ -309,16 +309,17 @@ static struct blogic_ccb *blogic_alloc_ccb(struct blogic_adapter *adapter)
>blogic_dealloc_ccb deallocates a CCB, returning it to the Host Adapter's
>free list.  The Host Adapter's Lock should already have been acquired by the
>caller.
>  */
> 
> -static void blogic_dealloc_ccb(struct blogic_ccb *ccb)
> +static void blogic_dealloc_ccb(struct blogic_ccb *ccb, int dma_unmap)
>  {
> struct blogic_adapter *adapter = ccb->adapter;
> 
> scsi_dma_unmap(ccb->command);

blogic_dealloc_ccb() uses "ccb->command". But

> -   pci_unmap_single(adapter->pci_device, ccb->sensedata,
> +   if (dma_unmap)
> +   pci_unmap_single(adapter->pci_device, ccb->sensedata,
>  ccb->sense_datalen, PCI_DMA_FROMDEVICE);
> 
> ccb->command = NULL;
> ccb->status = BLOGIC_CCB_FREE;
> ccb->next = adapter->free_ccbs;
> @@ -3177,13 +3179,21 @@ static int blogic_qcmd_lck(struct scsi_cmnd *command,
> ccb->legacy_tag = queuetag;
> }
> }
> memcpy(ccb->cdb, cdb, cdblen);
> ccb->sense_datalen = SCSI_SENSE_BUFFERSIZE;
> -   ccb->sensedata = pci_map_single(adapter->pci_device,
> +   sense_buf = pci_map_single(adapter->pci_device,
> command->sense_buffer, ccb->sense_datalen,
> PCI_DMA_FROMDEVICE);
> +   if (dma_mapping_error(&adapter->pci_device->dev, sense_buf)) {
> +   blogic_err("DMA mapping for sense data buffer failed\n",
> +   adapter);
> +   scsi_dma_map(command);
> +   blogic_dealloc_ccb(ccb, 0);

when was "ccb->command = command;" called before calling blogic_dealloc_ccb()?

> +   return SCSI_MLQUEUE_HOST_BUSY;
> +   }
> +   ccb->sensedata = sense_buf;
> ccb->command = command;
> command->scsi_done = comp_cb;
> if (blogic_multimaster_type(adapter)) {
> /*
>Place the CCB in an Outgoing Mailbox. The higher levels

Also, what happens if "scsi_dma_map(command);" returns -ENOMEM?
If you are calling scsi_dma_map() because blogic_dealloc_ccb() calls
scsi_dma_unmap(), why can't we do

  {
  struct blogic_adapter *adapter = ccb->adapter;
  ccb->command = NULL;
  ccb->status = BLOGIC_CCB_FREE;
  ccb->next = adapter->free_ccbs;
  adapter->free_ccbs = ccb;
  }

instead of

  scsi_dma_map(command);
  blogic_dealloc_ccb(ccb);

?
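
A minimal user-space sketch of the suggestion above: put the CCB back on the free list without touching any DMA mapping. struct mock_ccb, struct mock_adapter and ccb_release_unmapped() are hypothetical stand-ins, not the driver's real definitions.

```c
#include <assert.h>
#include <stddef.h>

enum mock_ccb_status { MOCK_CCB_FREE, MOCK_CCB_ACTIVE };

struct mock_adapter;

/* Hypothetical subset of struct blogic_ccb: only the fields touched here. */
struct mock_ccb {
	struct mock_adapter *adapter;
	void *command;
	enum mock_ccb_status status;
	struct mock_ccb *next;
};

/* Hypothetical subset of struct blogic_adapter: just the free list head. */
struct mock_adapter {
	struct mock_ccb *free_ccbs;
};

/* Return a CCB that never got a DMA mapping to the adapter's free list. */
static void ccb_release_unmapped(struct mock_ccb *ccb)
{
	struct mock_adapter *adapter = ccb->adapter;

	ccb->command = NULL;
	ccb->status = MOCK_CCB_FREE;
	ccb->next = adapter->free_ccbs;	/* push onto the singly linked list */
	adapter->free_ccbs = ccb;
}
```

Because nothing is unmapped here, the error path would not need to call scsi_dma_map() first just to balance a later scsi_dma_unmap().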

Re: [linux-next-20130903] module: broken module versions?

2013-09-03 Thread Tetsuo Handa

Hello.

This issue is not yet fixed as of linux-next-20130903. Unless this issue is
fixed, we will get a 3.12-rc1 which cannot load several modules.

> I noticed that symbols which cause "disagrees about version of symbol" messages
> have crc == 0.

I tried further debugging, but without success so far.

-- debug patch start --
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -1192,6 +1192,9 @@ static int check_version(Elf_Shdr *sechdrs,
if (strcmp(versions[i].name, symname) != 0)
continue;
 
+   printk("Found %s checksum %lX vs module %lX\n",
+  versions[i].name, maybe_relocated(*crc, crc_owner),
+  versions[i].crc);
if (versions[i].crc == maybe_relocated(*crc, crc_owner))
return 1;
pr_debug("Found checksum %lX vs module %lX\n",
-- debug patch end --

On x86_64, it shows

  Found sock_register checksum 0 vs module 0
  Found ns_capable checksum 0 vs module 0

and loading of ipv6.ko succeeds.

On x86_32, it shows

  Found sock_register checksum FF10 vs module 0
  Found ns_capable checksum FF10 vs module 0

and loading of ipv6.ko fails.

Thus, I came to assume that crc == 0 is the expected value but for some reason
crc == 0xFF10 is recorded into kcrctab tables.
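
A minimal sketch of the comparison that fails on x86_32, using the values printed above. struct mock_modversion and mock_check_version() are hypothetical simplifications and are not the kernel's check_version(); in particular, the -1 return for a missing entry is an assumption made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical simplified form of one version entry recorded in a module. */
struct mock_modversion {
	unsigned long crc;
	const char *name;
};

/*
 * Returns 1 when the module's recorded CRC for symname equals the kernel's,
 * 0 when it differs ("disagrees about version of symbol"), and -1 when the
 * module has no entry for symname.
 */
static int mock_check_version(const struct mock_modversion *versions,
			      size_t num, const char *symname,
			      unsigned long kernel_crc)
{
	size_t i;

	for (i = 0; i < num; i++) {
		if (strcmp(versions[i].name, symname) != 0)
			continue;
		return versions[i].crc == kernel_crc;
	}
	return -1;
}
```

With a module entry of crc 0 for sock_register, a kernel-side value of 0 matches (the x86_64 case) while 0xFF10 does not (the x86_32 case).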



-- debug patch start --
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -320,6 +320,11 @@ bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
 #endif
};
 
+   static unsigned int i;
+   for (; i < __stop___ksymtab - __start___ksymtab; i++)
+   printk("%d crc(%s)=%lx\n", i, __start___ksymtab[i].name,
+  __start___kcrctab[i]);
+
if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
return true;
 
-- debug patch end --

On x86_64, "grep =0" shows

  236 crc(__symbol_put)=0
  784 crc(current_fs_time)=0
  810 crc(d_tmpfile)=0
  955 crc(do_sync_read)=0
  1093 crc(filp_close)=0
  1107 crc(finish_open)=0
  1183 crc(generic_getxattr)=0
  1207 crc(generic_write_sync)=0
  1237 crc(get_unmapped_area)=0
  1335 crc(in_group_p)=0
  1407 crc(inode_add_bytes)=0
  1495 crc(iov_shorten)=0
  1548 crc(iterate_fd)=0
  1815 crc(mnt_set_expiry)=0
  2038 crc(ns_capable)=0
  2102 crc(path_is_under)=0
  2405 crc(register_exec_domain)=0
  2506 crc(schedule_timeout)=0
  2911 crc(sock_register)=0
  2922 crc(softirq_work_list)=0
  2996 crc(sys_close)=0
  3022 crc(task_nice)=0
  3243 crc(vfs_fsync_range)=0
  3270 crc(vm_brk)=0

On x86_32, "grep =ff10" shows

  244 crc(__symbol_put)=ff10
  784 crc(current_fs_time)=ff10
  810 crc(d_tmpfile)=ff10
  967 crc(do_sync_read)=ff10
  1108 crc(filp_close)=ff10
  1122 crc(finish_open)=ff10
  1198 crc(generic_getxattr)=ff10
  1222 crc(generic_write_sync)=ff10
  1253 crc(get_unmapped_area)=ff10
  1351 crc(in_group_p)=ff10
  1423 crc(inode_add_bytes)=ff10
  1510 crc(iov_shorten)=ff10
  1567 crc(iterate_fd)=ff10
  1840 crc(mnt_set_expiry)=ff10
  2061 crc(ns_capable)=ff10
  2126 crc(path_is_under)=ff10
  2432 crc(register_exec_domain)=ff10
  2535 crc(schedule_timeout)=ff10
  2940 crc(sock_register)=ff10
  2951 crc(softirq_work_list)=ff10
  3013 crc(sys_close)=ff10
  3039 crc(task_nice)=ff10
  3261 crc(vfs_fsync_range)=ff10
  3288 crc(vm_brk)=ff10

Module.symvers says that these symbols have crc == 0x and
kcrctab on x86_64 says that these symbols have crc == 0x but
kcrctab on x86_32 says that these symbols have crc == 0xff10.

Prior to commit 5c019369 "syscalls.h: use gcc alias instead of assembler
aliases for syscalls", only crc(softirq_work_list)=ff10 line was shown.



I don't know where 0xff10 comes from, but this value is kconfig dependent
and might have been introduced by mistake.

I also tried older kernels. As far as I tested (I might be wrong as this value
is kconfig dependent), it seems that 3.9 and later kernels show the
crc(softirq_work_list)=ff10 line while 3.8 and earlier kernels show the
crc(softirq_work_list)=0 line. I'll try to bisect between 3.8 and 3.9 in case
this change was introduced by mistake.

Re: [PATCH v3] kthread: Make kthread_create() killable.

2013-09-25 Thread Tetsuo Handa

Thank you for comments, David.

David Rientjes wrote:
> Also results in a livelock if you're running in a memcg and have hit its
> limit.

> wait_for_completion() is scary if that completion requires memory that 
> cannot be allocated because the caller is killed but uninterruptible.

I don't think these lines are specific to wait_for_completion() users.

Currently the OOM killer is disabled for the whole period from "the moment the
OOM killer chose a process to kill" to "the moment the task_struct of the chosen
process becomes unreachable". Any blocking function which waits in
TASK_UNINTERRUPTIBLE (e.g. mutex_lock()) can keep the OOM killer disabled if
the current thread is chosen by the OOM killer. Therefore, all users of such
blocking functions are scary if they assume that the current thread will not be
chosen by the OOM killer.

But it seems to me that re-enabling the OOM killer at some point is more
feasible than purging all such users.

To re-enable the OOM killer at some point, the OOM killer needs to choose
another process if the to-be-killed process cannot be terminated within a
reasonable period.

For example, add "unsigned long memdie_stamp;" to "struct task_struct", do
"p->memdie_stamp = jiffies + 5 * HZ;" before "set_tsk_thread_flag(p, TIF_MEMDIE);",
and do

 	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
 		if (unlikely(frozen(task)))
 			__thaw_task(task);
+		/* Choose more processes if the chosen process cannot die. */
+		if (time_after(jiffies, task->memdie_stamp) &&
+		    task->state == TASK_UNINTERRUPTIBLE)
+			return OOM_SCAN_CONTINUE;
 		if (!force_kill)
 			return OOM_SCAN_ABORT;
 	}

in oom_scan_process_thread().
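
A minimal user-space sketch of the deadline test, assuming a wraparound-safe tick comparison in the style of the kernel's time_after(). MOCK_HZ, struct mock_task, mock_time_after() and victim_is_stuck() are hypothetical names for illustration only.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define MOCK_HZ 1000UL	/* assumed tick rate, for illustration only */

/* Wraparound-safe "a is after b", same idea as the kernel's time_after(). */
static bool mock_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Hypothetical subset of task state relevant to the proposal. */
struct mock_task {
	unsigned long memdie_stamp;	/* deadline set when TIF_MEMDIE is set */
	bool uninterruptible;		/* stands in for state == TASK_UNINTERRUPTIBLE */
};

/* True when the OOM scan should move on and pick another victim. */
static bool victim_is_stuck(const struct mock_task *t, unsigned long now)
{
	return mock_time_after(now, t->memdie_stamp) && t->uninterruptible;
}
```

The signed difference keeps the test correct even when the tick counter wraps between setting memdie_stamp and checking it, which a plain "now > stamp" comparison would get wrong.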

This idea comes at the cost of an increased possibility of other side effects
(e.g. the second-worst process is chosen by the OOM killer when the worst
process cannot be terminated => memory allocation for writeback fails because
the second-worst process was in the ext3's writeback path => fs-error action
(remount read-only or panic) gets triggered by the second-worst process).

Anyway, this patch is for helping the OOM killer to kill the process smoothly
when the chosen process is waiting at kthread_create(). I attach the updated
patch description. Did I merge your comments appropriately?
--
[PATCH v3] kthread: Make kthread_create() killable.

Any user process callers of wait_for_completion() except global init process
might be chosen by the OOM killer while waiting for completion() call by some
other process which does memory allocation.

When such users are chosen by the OOM killer when they are waiting for
completion() in TASK_UNINTERRUPTIBLE, the system will be kept stressed
due to memory starvation because the OOM killer cannot kill such users.

kthread_create() is one of such users and this patch fixes the problem for
kthreadd by making kthread_create() killable.

Signed-off-by: Tetsuo Handa 
Cc: Oleg Nesterov 
Acked-by: David Rientjes 
Signed-off-by: Andrew Morton 

Re: kthread: Make kthread_create() killable.

2013-09-28 Thread Tetsuo Handa

David Rientjes wrote:
> There may not be any eligible processes left and then the machine panics.  

Some enterprise users might prefer "kernel panic followed by kdump and
automatic reboot" to "a system not responding for an unpredictable period",
because the panic helps in getting information for analyzing which process
caused the freeze. Well, can they use the "Panic (Reboot) On Soft Lockups" option?

> These time-based delays also have caused a complete depletion of memory 
> reserves if more than one process is chosen and each consumes a 
> non-negligible amount of memory which would then cause livelock.  We used to 
> have a jiffies-based rekill in 2.6.18 internally and we finally could 
> remove it when mm->mmap_sem issues were fixed (mostly by checking for 
> fatal_signal_pending() and aborting when necessary).

So, you've already tried that.

Currently the OOM killer kills a process after

  blocking_notifier_call_chain(&oom_notify_list, 0, &freed);

in out_of_memory() has released all reclaimable memory. This call reduces the
chance of killing a process if the bad process no longer asks for more memory.
But if the bad process keeps asking for more memory and the chosen task is
in TASK_UNINTERRUPTIBLE state, this call contributes to the OOM killer being
disabled for an unpredictable period. Therefore, releasing all reclaimable
memory before the OOM killer kills a process might be considered bad.

Then, what about an approach described below?

(1) Introduce a kernel thread which reserves (e.g.) 1 percent of kernel memory
(this amount should be configurable via sysctl) upon startup.

(2) The kernel thread sleeps using wait_event(memory_reservoir_wait) and
releases PAGE_SIZE bytes from the reserved memory upon each wakeup.

(3) The OOM killer calls wake_up() like

 	if (test_tsk_thread_flag(task, TIF_MEMDIE)) {
 		if (unlikely(frozen(task)))
 			__thaw_task(task);
+		/*
+		 * Let the memory reservoir release memory if the chosen
+		 * process cannot die.
+		 */
+		if (time_after(jiffies, task->memdie_stamp) &&
+		    task->state == TASK_UNINTERRUPTIBLE)
+			wake_up(&memory_reservoir_wait);
 		if (!force_kill)
 			return OOM_SCAN_ABORT;
 	}

in oom_scan_process_thread().

(4) When a task where test_tsk_thread_flag(task, TIF_MEMDIE) is true has
terminated and memory used by the task is reclaimed, the reclaimed memory
is again reserved by the kernel thread up to 1 percent of kernel memory.

In this way, we could shorten the duration of the OOM killer being disabled
unless the reserved memory was not enough to terminate the chosen process.
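
A minimal user-space sketch of the bookkeeping behind steps (2) and (4), counted in pages. struct mock_reservoir, reservoir_wakeup() and reservoir_refill() are hypothetical names; no such kernel thread or API exists, this only illustrates the idea.

```c
#include <assert.h>

/* Hypothetical reservoir state, counted in pages. */
struct mock_reservoir {
	unsigned long target;	/* pages to hold in reserve (e.g. 1 percent) */
	unsigned long held;	/* pages currently held */
};

/* Step (2): each wakeup hands one page back; returns pages released. */
static unsigned long reservoir_wakeup(struct mock_reservoir *r)
{
	if (r->held == 0)
		return 0;	/* nothing left; the chosen process is still stuck */
	r->held--;
	return 1;
}

/* Step (4): after the victim exits, top the reserve back up to the target. */
static unsigned long reservoir_refill(struct mock_reservoir *r,
				      unsigned long reclaimed)
{
	unsigned long want = r->target - r->held;
	unsigned long take = reclaimed < want ? reclaimed : want;

	r->held += take;
	return reclaimed - take;	/* pages left for everyone else */
}
```

The case where reservoir_wakeup() returns 0 corresponds to the caveat above: once the reserve is exhausted, the OOM killer stays disabled until the chosen process dies.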

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-30 Thread Tetsuo Handa

Hello.

As it seems that there is no critical problem (naming preference can easily be
fixed if needed), can these patches go to linux-next?

If these patches are accepted, Kees Cook will submit a patch which removes %n
support from vsnprintf() ( https://lkml.org/lkml/2013/9/16/54 ).

Regards.

>From 02b28fd709971f71e5de9a5b595ff4fd059028b3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Thu, 19 Sep 2013 17:23:17 +0900
Subject: [PATCH] seq_file: Introduce seq_setwidth() and seq_pad()

There are several users who want to know bytes written by seq_*() for alignment
purpose. Currently they are using %n format for knowing it because seq_*()
returns 0 on success.

This patch introduces seq_setwidth() and seq_pad() for allowing them to align
without using %n format.

Signed-off-by: Tetsuo Handa 
Acked-by: Kees Cook 
---
 fs/seq_file.c|   15 +++
 include/linux/seq_file.h |   15 +++
 2 files changed, 30 insertions(+), 0 deletions(-)

diff --git a/fs/seq_file.c b/fs/seq_file.c
index 3135c25..40e471e 100644
--- a/fs/seq_file.c
+++ b/fs/seq_file.c
@@ -764,6 +764,21 @@ int seq_write(struct seq_file *seq, const void *data, size_t len)
 }
 EXPORT_SYMBOL(seq_write);
 
+/**
+ * seq_pad - write padding spaces to buffer
+ * @m: seq_file identifying the buffer to which data should be written
+ * @c: the byte to append after padding if non-zero
+ */
+void seq_pad(struct seq_file *m, char c)
+{
+   int size = m->pad_until - m->count;
+   if (size > 0)
+   seq_printf(m, "%*s", size, "");
+   if (c)
+   seq_putc(m, c);
+}
+EXPORT_SYMBOL(seq_pad);
+
 struct list_head *seq_list_start(struct list_head *head, loff_t pos)
 {
struct list_head *lh;
diff --git a/include/linux/seq_file.h b/include/linux/seq_file.h
index 4e32edc..52e0097 100644
--- a/include/linux/seq_file.h
+++ b/include/linux/seq_file.h
@@ -20,6 +20,7 @@ struct seq_file {
size_t size;
size_t from;
size_t count;
+   size_t pad_until;
loff_t index;
loff_t read_pos;
u64 version;
@@ -79,6 +80,20 @@ static inline void seq_commit(struct seq_file *m, int num)
}
 }
 
+/**
+ * seq_setwidth - set padding width
+ * @m: the seq_file handle
+ * @size: the max number of bytes to pad.
+ *
+ * Call seq_setwidth() for setting max width, then call seq_printf() etc. and
+ * finally call seq_pad() to pad the remaining bytes.
+ */
+static inline void seq_setwidth(struct seq_file *m, size_t size)
+{
+   m->pad_until = m->count + size;
+}
+void seq_pad(struct seq_file *m, char c);
+
 char *mangle_path(char *s, const char *p, const char *esc);
 int seq_open(struct file *, const struct seq_operations *);
 ssize_t seq_read(struct file *, char __user *, size_t, loff_t *);
-- 
1.7.1

From f8b60ebe3971901b93dedb8eee0f85b60d0fdc5f Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Fri, 20 Sep 2013 12:01:07 +0900
Subject: [PATCH] Remove "%n" usage from seq_file users.

All seq_printf() users that rely on "%n" use it to calculate the padding size.
Convert them to use the seq_setwidth() / seq_pad() pair.

Signed-off-by: Tetsuo Handa 
Acked-by: Kees Cook 
---
 fs/proc/consoles.c   |   10 --
 fs/proc/nommu.c  |   12 +---
 fs/proc/task_mmu.c   |   20 ++--
 fs/proc/task_nommu.c |   19 ++-
 net/ipv4/fib_trie.c  |   13 +++--
 net/ipv4/ping.c  |   15 +++
 net/ipv4/tcp_ipv4.c  |   33 +++--
 net/ipv4/udp.c   |   15 +++
 net/phonet/socket.c  |   24 +++-
 net/sctp/objcnt.c|9 +
 10 files changed, 73 insertions(+), 97 deletions(-)

diff --git a/fs/proc/consoles.c b/fs/proc/consoles.c
index b701eaa..51942d5 100644
--- a/fs/proc/consoles.c
+++ b/fs/proc/consoles.c
@@ -29,7 +29,6 @@ static int show_console_dev(struct seq_file *m, void *v)
char flags[ARRAY_SIZE(con_flags) + 1];
struct console *con = v;
unsigned int a;
-   int len;
dev_t dev = 0;
 
if (con->device) {
@@ -47,11 +46,10 @@ static int show_console_dev(struct seq_file *m, void *v)
con_flags[a].name : ' ';
flags[a] = 0;
 
-   seq_printf(m, "%s%d%n", con->name, con->index, &len);
-   len = 21 - len;
-   if (len < 1)
-   len = 1;
-   seq_printf(m, "%*c%c%c%c (%s)", len, ' ', con->read ? 'R' : '-',
+   seq_setwidth(m, 21 - 1);
+   seq_printf(m, "%s%d", con->name, con->index);
+   seq_pad(m, ' ');
+   seq_printf(m, "%c%c%c (%s)", con->read ? 'R' : '-',
con->write ? 'W' : '-', con->unblank ? 'U' : '-',
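
To illustrate how the seq_setwidth() / seq_pad() pair is meant to be used, here
is a minimal userspace sketch. struct demo_seq and the demo_*() helpers are
hypothetical stand-ins for struct seq_file and the seq_*() functions, not kernel
code; they only mimic the pad_until bookkeeping introduced by the first patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct seq_file: a fixed buffer plus fill level. */
struct demo_seq {
	char buf[128];
	size_t count;		/* bytes written so far */
	size_t pad_until;	/* offset that demo_pad() pads up to */
};

/* Append one byte if there is room (one byte is kept for the terminator). */
static void demo_putc(struct demo_seq *m, char c)
{
	assert(m != NULL);
	if (m->count + 1 < sizeof(m->buf)) {
		m->buf[m->count++] = c;
		m->buf[m->count] = '\0';
	}
}

/* Append a string byte by byte. */
static void demo_puts(struct demo_seq *m, const char *s)
{
	while (*s)
		demo_putc(m, *s++);
}

/* Mirrors seq_setwidth(): remember where the current field should end. */
static void demo_setwidth(struct demo_seq *m, size_t size)
{
	m->pad_until = m->count + size;
}

/* Mirrors seq_pad(): pad with spaces up to pad_until, then append c if non-zero. */
static void demo_pad(struct demo_seq *m, char c)
{
	while (m->count < m->pad_until && m->count + 1 < sizeof(m->buf))
		demo_putc(m, ' ');
	if (c)
		demo_putc(m, c);
}
```

With a width of 20 and a trailing ' ', the next column starts at offset 21
whenever the name fits, and an overlong name still gets exactly one separating
space, which matches the old "len < 1" clamp in show_console_dev().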

Re: [PATCH 1/2] remove all uses of printf's %n

2013-09-20 Thread Tetsuo Handa

Kees Cook wrote:
> >> - seq_printf(seq, "%*s\n", 127 - len, "");
> >> + seq_pad(seq, '\n');
> >
> > Hmm, seq_pad is unintuitive. I would say it pads the string by '\n'. Of
> > course it does not, but...
> 
> I don't think this is a very serious problem. Currently, the padding
> character is always ' ' for all existing callers, so it only makes
> sense to make the trailing character an argument.

If you want, we can rename seq_pad() to seq_pad_and_putc(). We could also pass
both the padding character (e.g. ' ') and the trailing character (e.g. '\n'),
as in seq_pad_and_putc((' ' << 8) | '\n'), though I doubt anyone wants to
use '\0', '\t', '\n' etc. as the padding character...
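
The packed-argument idea can be sketched in userspace as follows. This is only
an illustration under assumptions: pack_pad_arg() and pad_and_putc() are
hypothetical names, and the buffer-based signature is invented for the example
(the proposal in the mail operates on a seq_file).

```c
#include <assert.h>
#include <stddef.h>

/* Pack the padding byte (high) and trailing byte (low), as in (' ' << 8) | '\n'. */
static int pack_pad_arg(unsigned char pad, unsigned char trail)
{
	return (pad << 8) | trail;
}

/*
 * Hypothetical pad-and-putc: fill buf[len..width) with the padding byte,
 * append the trailing byte if it is non-zero, and return the new length.
 */
static size_t pad_and_putc(char *buf, size_t size, size_t len, size_t width,
			   int arg)
{
	char pad = (char)((arg >> 8) & 0xff);
	char trail = (char)(arg & 0xff);

	assert(buf != NULL && len < size);
	while (len < width && len + 1 < size)
		buf[len++] = pad;
	if (trail && len + 1 < size)
		buf[len++] = trail;
	buf[len] = '\0';
	return len;
}
```

The cost of this encoding is that every call site has to build the packed
value, which is harder to read than passing only the trailing character.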

[PATCH] kmod: Check for NULL at call_usermodehelper_exec().

2013-09-23 Thread Tetsuo Handa

Andrew, would you pick up this patch?

Regards.
--
From d6ff218545060c5f8b75b15d5b34bffcf625508f Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 16 Sep 2013 02:19:10 +0900
Subject: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer dereference
happens upon core dump because argv_split("") returns argv[0] == NULL.

This bug was once fixed by commit 264b83c0 "usermodehelper: check
subprocess_info->path != NULL" but was reintroduced by mistake by commit
7f57cfa4 "usermodehelper: kill the sub_info->path[0] check".

This bug seems to have existed since 2.6.19 (the version in which core dump to
pipe was added). Depending on kernel version and config, some side effect might
happen immediately after this oops (e.g. kernel panic with 2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa 
Acked-by: Oleg Nesterov 
---
 kernel/kmod.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index fb32636..a962470 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -571,6 +571,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
DECLARE_COMPLETION_ONSTACK(done);
int retval = 0;

+   if (!sub_info->path) {
+   call_usermodehelper_freeinfo(sub_info);
+   return -ENOENT;
+   }
helper_lock();
if (!khelper_wq || usermodehelper_disabled) {
retval = -EBUSY;
-- 
1.7.1
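
The failure mode described in the changelog can be reproduced in userspace with
a small sketch. demo_argv_split() is a simplified, hypothetical stand-in for the
kernel's argv_split() (it splits in place and takes a caller-supplied array),
and demo_exec() is a hypothetical launcher that only models the NULL check added
by the patch, returning -ENOENT as this version of the patch does.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

/*
 * Simplified stand-in for argv_split(): splits s in place on whitespace,
 * stores at most max - 1 pointers and NULL-terminates argv.
 */
static int demo_argv_split(char *s, char **argv, int max)
{
	int argc = 0;

	assert(max >= 1);
	while (*s && argc < max - 1) {
		while (isspace((unsigned char)*s))
			s++;
		if (!*s)
			break;
		argv[argc++] = s;
		while (*s && !isspace((unsigned char)*s))
			s++;
		if (*s)
			*s++ = '\0';
	}
	argv[argc] = NULL;
	return argc;
}

/* Hypothetical launcher: reject a NULL path instead of dereferencing it. */
static int demo_exec(const char *path)
{
	if (!path)
		return -ENOENT;
	return 0;	/* pretend the helper was started */
}
```

For a core_pattern of "|", the text after the pipe character is empty, so
argv[0] is NULL and the launcher has to fail early rather than use the path.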

Re: [3.11-rc1] CONFIG_DEBUG_MUTEXES=y using gcc 3.x makes unbootable kernel.

2013-09-24 Thread Tetsuo Handa

Hello, Maarten.

Is this patch already queued for 3.12-rcX?
I expect this patch to be committed before I send a patch for 3.11-stable.

Regards.
--
From a1b01c858143c2c2c92b17e7df096042bfe0df6b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 24 Sep 2013 23:44:17 +0900
Subject: [PATCH] mutex: Avoid gcc version dependent __builtin_constant_p() usage.

Commit 040a0a37 "mutex: Add support for wound/wait style locks" used
"!__builtin_constant_p(p == NULL)" but gcc 3.x cannot handle such an expression
correctly, leading to boot failure when built with CONFIG_DEBUG_MUTEXES=y.

Fix it by explicitly passing a bool which tells whether p != NULL or not.

Signed-off-by: Tetsuo Handa 
---
 kernel/mutex.c |   32 
 1 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/kernel/mutex.c b/kernel/mutex.c
index 6d647ae..d24105b 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -410,7 +410,7 @@ ww_mutex_set_context_fastpath(struct ww_mutex *lock,
 static __always_inline int __sched
 __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
struct lockdep_map *nest_lock, unsigned long ip,
-   struct ww_acquire_ctx *ww_ctx)
+   struct ww_acquire_ctx *ww_ctx, const bool use_ww_ctx)
 {
struct task_struct *task = current;
struct mutex_waiter waiter;
@@ -450,7 +450,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
struct task_struct *owner;
struct mspin_node  node;
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
struct ww_mutex *ww;
 
ww = container_of(lock, struct ww_mutex, base);
@@ -480,7 +480,7 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
if ((atomic_read(&lock->count) == 1) &&
(atomic_cmpxchg(&lock->count, 1, 0) == 1)) {
lock_acquired(&lock->dep_map, ip);
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww;
ww = container_of(lock, struct ww_mutex, base);
 
@@ -551,7 +551,7 @@ slowpath:
goto err;
}
 
-   if (!__builtin_constant_p(ww_ctx == NULL) && ww_ctx->acquired > 0) {
+   if (use_ww_ctx && ww_ctx->acquired > 0) {
ret = __mutex_lock_check_stamp(lock, ww_ctx);
if (ret)
goto err;
@@ -575,7 +575,7 @@ skip_wait:
lock_acquired(&lock->dep_map, ip);
mutex_set_owner(lock);
 
-   if (!__builtin_constant_p(ww_ctx == NULL)) {
+   if (use_ww_ctx) {
struct ww_mutex *ww = container_of(lock, struct ww_mutex, base);
struct mutex_waiter *cur;
 
@@ -615,7 +615,7 @@ mutex_lock_nested(struct mutex *lock, unsigned int subclass)
 {
might_sleep();
__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
-   subclass, NULL, _RET_IP_, NULL);
+   subclass, NULL, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_nested);
@@ -625,7 +625,7 @@ _mutex_lock_nest_lock(struct mutex *lock, struct lockdep_map *nest)
 {
might_sleep();
__mutex_lock_common(lock, TASK_UNINTERRUPTIBLE,
-   0, nest, _RET_IP_, NULL);
+   0, nest, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(_mutex_lock_nest_lock);
@@ -635,7 +635,7 @@ mutex_lock_killable_nested(struct mutex *lock, unsigned int subclass)
 {
might_sleep();
return __mutex_lock_common(lock, TASK_KILLABLE,
-  subclass, NULL, _RET_IP_, NULL);
+  subclass, NULL, _RET_IP_, NULL, 0);
 }
 EXPORT_SYMBOL_GPL(mutex_lock_killable_nested);
 
@@ -644,7 +644,7 @@ mutex_lock_interruptible_nested(struct mutex *lock, unsigned int subclass)
 {
might_sleep();
return __mutex_lock_common(lock, TASK_INTERRUPTIBLE,
-  subclass, NULL, _RET_IP_, NULL);
+  subclass, NULL, _RET_IP_, NULL, 0);
 }
 
 EXPORT_SYMBOL_GPL(mutex_lock_interruptible_nested);
@@ -682,7 +682,7 @@ __ww_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ctx)
 
might_sleep();
ret =  __mutex_lock_common(&lock->base, TASK_UNINTERRUPTIBLE,
-  0, &ctx->dep_map, _RET_IP_, ctx);
+  0, &ctx->dep_map, _RET_IP_, ctx, 1);
if (!ret && ctx->acquired > 1)
return ww_mutex_deadlo
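
The approach taken by this patch, passing an explicit constant flag from each
wrapper into a shared inline function, can be sketched outside the kernel as
below. All demo_* names are hypothetical, and the "lock" is a plain int with no
atomicity, so this shows only the calling convention, not a working mutex.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ww_acquire_ctx. */
struct demo_ctx {
	int acquired;
};

/*
 * Common path shared by both lock flavours. The caller states explicitly
 * whether ctx is meaningful, so no compiler builtin has to infer it from
 * "ctx == NULL".
 */
static inline int demo_lock_common(int *lock, struct demo_ctx *ctx,
				   const bool use_ctx)
{
	assert(lock != NULL);
	if (*lock)
		return -1;		/* already held */
	*lock = 1;
	if (use_ctx)
		ctx->acquired++;	/* only touched when the caller opted in */
	return 0;
}

/* Plain flavour: no context, the flag is the constant false. */
static int demo_mutex_lock(int *lock)
{
	return demo_lock_common(lock, NULL, false);
}

/* Wound/wait flavour: context supplied, the flag is the constant true. */
static int demo_ww_mutex_lock(int *lock, struct demo_ctx *ctx)
{
	return demo_lock_common(lock, ctx, true);
}
```

Because each wrapper passes a literal constant, a compiler that inlines the
common function can still drop the unused branch, without relying on how a
particular gcc version evaluates __builtin_constant_p() on a pointer comparison.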

Re: [PATCH] kmod: Check for NULL at call_usermodehelper_exec().

2013-09-24 Thread Tetsuo Handa

Andrew Morton wrote:
> > --- a/kernel/kmod.c
> > +++ b/kernel/kmod.c
> > @@ -571,6 +571,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
> > DECLARE_COMPLETION_ONSTACK(done);
> > int retval = 0;
> >  
> > +   if (!sub_info->path) {
> > +   call_usermodehelper_freeinfo(sub_info);
> > +   return -ENOENT;
> > +   }
> > helper_lock();
> > if (!khelper_wq || usermodehelper_disabled) {
> > retval = -EBUSY;
> 
> The error is that the user put a bare "|" into
> /proc/sys/kernel/core_pattern.  Is ENOENT ("No such file or directory")
> the most appropriate error code here?  I think EINVAL ("Invalid
> argument")?
> 
I'm fine with EINVAL. Updated patch follows.
--
From a1068778ed0dd3156dff0ac6245314b8627b8830 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 24 Sep 2013 23:54:05 +0900
Subject: [PATCH] kernel/kmod.c: check for NULL in call_usermodehelper_exec()

If /proc/sys/kernel/core_pattern contains only "|", a NULL pointer
dereference happens upon core dump because argv_split("") returns argv[0]
== NULL.

This bug was once fixed by commit 264b83c0 ("usermodehelper: check
subprocess_info->path != NULL") but was reintroduced by mistake by commit
7f57cfa4 ("usermodehelper: kill the sub_info->path[0] check").

This bug seems to have existed since 2.6.19 (the version in which core dump
to pipe was added).  Depending on kernel version and config, some side effect
might happen immediately after this oops (e.g.  kernel panic with
2.6.32-358.18.1.el6).

Signed-off-by: Tetsuo Handa 
Acked-by: Oleg Nesterov 
Cc: 
Signed-off-by: Andrew Morton 
---
 kernel/kmod.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index fb32636..b086006 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -571,6 +571,10 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
DECLARE_COMPLETION_ONSTACK(done);
int retval = 0;
 
+   if (!sub_info->path) {
+   call_usermodehelper_freeinfo(sub_info);
+   return -EINVAL;
+   }
helper_lock();
if (!khelper_wq || usermodehelper_disabled) {
retval = -EBUSY;
-- 
1.7.1

[PATCH v3] kthread: Make kthread_create() killable.

2013-09-24 Thread Tetsuo Handa

Andrew Morton wrote:
> That's a pretty big patch.  What's the status of this?  Do you think
> it's ready to go?  Oleg?
> 
> I don't like the changelog much - it doesn't really describe the bug. 
> I can google "CVE-2012-4398" and that turns up a bunch of stuff but
> it's more oriented toward sysadmins etc, rather than kernel developers.
> 
> Can we please have a conventional description?  When the user does A,
> the kernel does B because C, so we fix it via D?

I see. I updated description. Updated patch follows.
--
From 0fe0c9d09b45cce0f00457755861204d51d7c2c9 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Wed, 25 Sep 2013 00:00:27 +0900
Subject: [PATCH] kthread: make kthread_create() killable

Any user of wait_for_completion() might be chosen by the OOM killer while
waiting for a complete() call from some process which allocates memory.
kthread_create() is one such user.

When such a user is chosen by the OOM killer while it is waiting for
complete() in TASK_UNINTERRUPTIBLE, the system stays stressed by memory
starvation because the OOM killer cannot kill that user.

Fix this problem for kthreadd by making kthread_create() killable.

Signed-off-by: Tetsuo Handa 
Cc: Oleg Nesterov 
Signed-off-by: Andrew Morton 
---
 kernel/kthread.c |   73 -
 1 files changed, 55 insertions(+), 18 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 760e86d..b5ae3ee 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -33,7 +33,7 @@ struct kthread_create_info
 
/* Result passed back to kthread_create() from kthreadd. */
struct task_struct *result;
-   struct completion done;
+   struct completion *done;
 
struct list_head list;
 };
@@ -178,6 +178,7 @@ static int kthread(void *_create)
struct kthread_create_info *create = _create;
int (*threadfn)(void *data) = create->threadfn;
void *data = create->data;
+   struct completion *done;
struct kthread self;
int ret;
 
@@ -187,10 +188,16 @@ static int kthread(void *_create)
init_completion(&self.parked);
current->vfork_done = &self.exited;
 
+   /* If user was SIGKILLed, I release the structure. */
+   done = xchg(&create->done, NULL);
+   if (!done) {
+   kfree(create);
+   do_exit(-EINTR);
+   }
/* OK, tell user we're spawned, wait for stop or wakeup */
__set_current_state(TASK_UNINTERRUPTIBLE);
create->result = current;
-   complete(&create->done);
+   complete(done);
schedule();
 
ret = -EINTR;
@@ -223,8 +230,15 @@ static void create_kthread(struct kthread_create_info *create)
/* We want our own signal handler (we take no signals by default). */
pid = kernel_thread(kthread, create, CLONE_FS | CLONE_FILES | SIGCHLD);
if (pid < 0) {
+   /* If user was SIGKILLed, I release the structure. */
+   struct completion *done = xchg(&create->done, NULL);
+
+   if (!done) {
+   kfree(create);
+   return;
+   }
create->result = ERR_PTR(pid);
-   complete(&create->done);
+   complete(done);
}
 }
 
@@ -255,36 +269,59 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
   const char namefmt[],
   ...)
 {
-   struct kthread_create_info create;
-
-   create.threadfn = threadfn;
-   create.data = data;
-   create.node = node;
-   init_completion(&create.done);
+   DECLARE_COMPLETION_ONSTACK(done);
+   struct task_struct *task;
+   struct kthread_create_info *create = kmalloc(sizeof(*create),
+GFP_KERNEL);
+
+   if (!create)
+   return ERR_PTR(-ENOMEM);
+   create->threadfn = threadfn;
+   create->data = data;
+   create->node = node;
+   create->done = &done;
 
spin_lock(&kthread_create_lock);
-   list_add_tail(&create.list, &kthread_create_list);
+   list_add_tail(&create->list, &kthread_create_list);
spin_unlock(&kthread_create_lock);
 
wake_up_process(kthreadd_task);
-   wait_for_completion(&create.done);
-
-   if (!IS_ERR(create.result)) {
+   /*
+* Wait for completion in killable state, for I might be chosen by
+* the OOM killer while kthreadd is trying to allocate memory for
+* new kernel thread.
+*/
+   if (unlikely(wait_for_completion_killable(&done))) {
+   /*
+* If I was SIGKILLed before kthreadd (or new ker
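
The ownership handoff that the visible hunks implement with xchg() can be
sketched in userspace with C11 atomics. Whoever swaps the "done" pointer to NULL
first wins; the side that reads back NULL knows the other side got there first.
The demo_* names are hypothetical, an int flag stands in for the completion, and
the waiter side is an assumption about the truncated remainder of the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kthread_create_info. */
struct demo_create {
	_Atomic(int *) done;	/* points at the waiter's flag, or NULL */
	int result;
};

static int demo_freed;	/* counts frees so the behaviour can be checked */

static void demo_free(struct demo_create *c)
{
	demo_freed++;
	free(c);
}

/* Worker side: report the result unless the waiter has already given up. */
static bool demo_worker_finish(struct demo_create *c, int result)
{
	int *done;

	assert(c != NULL);
	done = atomic_exchange(&c->done, (int *)NULL);
	if (!done) {		/* waiter was killed first: worker owns c now */
		demo_free(c);
		return false;
	}
	c->result = result;
	*done = 1;		/* stands in for complete(done) */
	return true;
}

/*
 * Waiter side, called when the wait was interrupted by a fatal signal.
 * Returns true if the waiter claimed "done" first, in which case the
 * worker will free c later and the waiter must not touch it again.
 */
static bool demo_waiter_abandon(struct demo_create *c)
{
	return atomic_exchange(&c->done, (int *)NULL) != NULL;
}
```

This is why the patch moves the structure from the stack to kmalloc(): once the
waiter may return early, the request must outlive the waiter's stack frame.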

Re: [PATCH] kconfig/menuconfig: use TAILQ instead of CIRCLEQ

2012-10-19 Thread Tetsuo Handa

Yann E. MORIN wrote:
> Some systems (eg. Cygwin, FreeBSD) are missing the CIRCLEQ macros.
> They were removed in Y2000 from FreeBSD:
> http://svnweb.freebsd.org/base?view=revision&revision=70469
> 
> The reason was that TAILQ are perfectly capable of doing the exact
> same things:
> 
> http://www.mavetju.org/mail/view_thread.php?list=freebsd-arch&id=915145&thread=yes
> 
> (Thank Yaakov for the pointers!)
> 
> So, switch to using TAILQ instead, which are more portable.
> 
> Reported-by: Tetsuo Handa 
> Reported-by: Benjamin Poirier 
> Signed-off-by: "Yann E. MORIN" 
> Cc: Yaakov Selkowitz 
> ---
>  scripts/kconfig/expr.h  |4 ++--
>  scripts/kconfig/mconf.c |4 ++--
>  scripts/kconfig/menu.c  |6 +++---
>  3 files changed, 7 insertions(+), 7 deletions(-)
> 
Excuse me, but your patch does not solve my problem because kconfig started
using macros which do not exist in "@(#)queue.h 8.3 (Berkeley) 12/13/93".
Kconfig still fails after applying your patch:

  HOSTCC  scripts/kconfig/mconf.o
scripts/kconfig/mconf.c: In function `update_text':
scripts/kconfig/mconf.c:326: warning: implicit declaration of function `TAILQ_FOREACH'
scripts/kconfig/mconf.c:326: error: `entries' undeclared (first use in this function)
scripts/kconfig/mconf.c:326: error: (Each undeclared identifier is reported only once
scripts/kconfig/mconf.c:326: error: for each function it appears in.)
scripts/kconfig/mconf.c:326: error: syntax error before '{' token
scripts/kconfig/mconf.c:333: error: `header' undeclared (first use in this function)
scripts/kconfig/mconf.c: At top level:
scripts/kconfig/mconf.c:343: error: syntax error before '}' token
scripts/kconfig/mconf.c: In function `search_conf':
scripts/kconfig/mconf.c:378: warning: implicit declaration of function `TAILQ_HEAD_INITIALIZER'
scripts/kconfig/mconf.c:378: error: invalid initializer
make[1]: *** [scripts/kconfig/mconf.o] Error 1
make: *** [menuconfig] Error 2

So, would you add something which looks like "sed -e 's/CIRCLEQ/TAILQ/g'" upon
https://lkml.org/lkml/2012/10/16/274 ?

Re: [PATCH] kconfig/menuconfig: use TAILQ instead of CIRCLEQ

2012-10-20 Thread Tetsuo Handa

Michal Marek wrote:
> On 19.10.2012 14:10, Tetsuo Handa wrote:
> > Yann E. MORIN wrote:
> >> So, switch to using TAILQ instead, which are more portable.
> [...]
> > Excuse me, but your patch does not solve my problem because kconfig started
> > using macros which does not exist in "@(#)queue.h 8.3 (Berkeley) 12/13/93".
> > Kconfig still fails after applying your patch:
> > 
> >   HOSTCC  scripts/kconfig/mconf.o
> > scripts/kconfig/mconf.c: In function `update_text':
> > scripts/kconfig/mconf.c:326: warning: implicit declaration of function 
> > `TAILQ_FOREACH'
> [...]
> > scripts/kconfig/mconf.c:378: warning: implicit declaration of function 
> > `TAILQ_HEAD_INITIALIZER'
> > 
> > So, would you add something which looks like "sed -e 's/CIRCLEQ/TAILQ/g'" 
> > upon
> > https://lkml.org/lkml/2012/10/16/274 ?
> 
> Could you reduce that patch to not copy all of queue.h?
> TAILQ_HEAD_INITIALIZER can be replaced by a TAILQ_INIT() call after
> variable definitions, and we do not need stuff like
> TAILQ_FOREACH_REVERSE. The other option is to reimplement the needed
> operations under a different name, so that people don't accidentally use
> other macros that are missing in old queue.h revisions.
> 
> Michal
> 

I'm fine with manually adding the missing macros to /usr/include/sys/queue.h
("@(#)queue.h 8.3 (Berkeley) 12/13/93") in my environment instead of adding
define-as-needed lines to scripts/kconfig/expr.h, because the missing macros are
available in "@(#)queue.h 8.5 (Berkeley) 8/20/94".

Re: [PATCH] menuconfig: Replace CIRCLEQ by list_head-style lists.

2012-10-20 Thread Tetsuo Handa

Yann E. MORIN wrote:
> Benjamin, All,
> 
> On Saturday 20 October 2012 Benjamin Poirier wrote:
> > From: Benjamin Poirier 
> > 
> > sys/queue.h and CIRCLEQ in particular have proven to cause portability
> > problems (reported on Debian Sarge, Cygwin and FreeBSD)
> > 
> > Reported-by: Tetsuo Handa 
> > Signed-off-by: Benjamin Poirier 
> 
> Tested-by: "Yann E. MORIN" 
> 
> Thank you for this patch! I guess it is the best solution we can get.
> 
> Regards,
> Yann E. MORIN.

This patch solves my problem on Debian Sarge. Thank you.

Tested-by: Tetsuo Handa 

[3.7-rc6/perf,x86] Build failure at p6_pmu declaration.

2012-11-17 Thread Tetsuo Handa

Commit caaa8be3 "perf, x86: Fix __initconst vs const" causes

  arch/x86/kernel/cpu/perf_event_p6.c:200: error: p6_pmu causes a section type conflict
  make[3]: *** [arch/x86/kernel/cpu/perf_event_p6.o] Error 1

error. Patching

  -static __initconst const struct x86_pmu p6_pmu = {
  +static __initconst struct x86_pmu p6_pmu = {

solved the error. But is this rather a compiler bug? I'm using gcc 3.3.5.

[3.7-rc6] Build failure with scripts/Makefile.headersinst

2012-11-17 Thread Tetsuo Handa

I get a build error which complains that include/uapi/asm-generic/auxvec.h is
missing although the file exists. I'm using gcc 3.3.5.
This bug does not seem to happen with gcc 4.x.

# make
make[1]: Nothing to be done for `all'.
make[1]: Nothing to be done for `relocs'.
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CALL scripts/checksyscalls.sh
  CHK include/generated/compile.h
make[3]: `arch/x86/realmode/rm/realmode.bin' is up to date.
  CHK include/generated/uapi/linux/version.h
make[2]: Nothing to be done for `all'.
make[2]: Nothing to be done for `relocs'.
/usr/src/all/linux/scripts/Makefile.headersinst:50: *** Missing UAPI file /usr/src/all/linux/include/uapi/asm-generic/auxvec.h.  Stop.
make[2]: *** [asm-generic] Error 2
make[1]: *** [headers_install] Error 2
make: *** [vmlinux] Error 2
# ls -l /usr/src/all/linux/include/uapi/asm-generic/auxvec.h
-rw-r--r--  1 root root 218 Oct 20 14:56 /usr/src/all/linux/include/uapi/asm-generic/auxvec.h

Linux 3.6 builds fine. I can't use "git bisect" up to Linux 3.7-rc6, but this is
possibly caused by either commit 10b63956 "UAPI: Plumb the UAPI Kbuilds into
the user header installation and checking" or commit 40f1d4c2 "UAPI: Remove the
objhdr-y export list".

Re: [3.7-rc6] Build failure with scripts/Makefile.headersinst

2012-11-18 Thread Tetsuo Handa

David Howells wrote:
> Tetsuo Handa  wrote:
> 
> > I get build error which complains that include/uapi/asm-generic/auxvec.h is
> > missing although the file exists. I'm using gcc 3.3.5.
> > This bug seems to not happen with gcc 4.x.
> 
> What configuration?
> 
> David
> 

It is available at http://I-love.SAKURA.ne.jp/tmp/config-3.7-rc6 .
Workaround is to change to CONFIG_HEADERS_CHECK=n.

Regards.

Re: [3.7-rc6] Build failure with scripts/Makefile.headersinst

2012-11-19 Thread Tetsuo Handa

David Howells wrote:
> The version of the compiler shouldn't have any effect as far as I can see.
> Can you add the following:
> 
>   $(info  $(srcdir)/$(hdr)) \
>   $(info  $(oldsrcdir)/$(hdr)) \
> 
> immediately before the marked line and look for the lines in the output from
> make.  The error suggests that neither pattern match worked.

Inserting

$(info  $(srcdir)/$(hdr)), \
$(info  $(oldsrcdir)/$(hdr)), \

did not work.

> Btw, are you supplying an O= flag to make when you build?

No. But supplying V=1 revealed that $(_dst) is an empty string at

  # Recursion
  hdr-inst := -rR -f $(srctree)/scripts/Makefile.headersinst obj
  .PHONY: $(subdirs)
  $(subdirs):
  $(Q)$(MAKE) $(hdr-inst)=$(obj)/$@ dst=$(_dst)/$@

when using GNU Make 3.80, while $(_dst) contains appropriate string when using
GNU Make 3.81.



With Make 3.81:
make -f scripts/Makefile.build obj=scripts build_unifdef
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/asm-generic dst=include/uapi/asm-generic
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/drm dst=include/uapi/drm
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/linux dst=include/uapi/linux
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/linux/byteorder dst=include/uapi/linux/byteorder
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/linux/caif dst=include/uapi/linux/caif
(...snipped...)

With Make 3.80:
make -f scripts/Makefile.build obj=scripts build_unifdef
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/asm-generic dst=/asm-generic
/usr/src/all/linux/scripts/Makefile.headersinst:50: *** Missing UAPI file /usr/src/all/linux/include/uapi/asm-generic/auxvec.h.  Stop.
make[2]: *** [asm-generic] Error 2

Re: [PATCH] Yama: remove locking from delete path

2012-11-20 Thread Tetsuo Handa

Kees Cook wrote:
> Instead of locking the list during a delete, mark entries as invalid
> and trigger a workqueue to clean them up. This lets us easily handle
> task_free from interrupt context.

> @@ -57,9 +80,12 @@ static int yama_ptracer_add(struct task_struct *tracer,
>  
>   added->tracee = tracee;
>   added->tracer = tracer;
> + added->invalid = false;
>  
> - spin_lock_bh(&ptracer_relations_lock);
> + spin_lock(&ptracer_relations_lock);

Can't you use
spin_lock_irqsave(&ptracer_relations_lock, flags);
spin_unlock_irqrestore(&ptracer_relations_lock, flags);
instead of adding ->invalid ?

Re: [3.7-rc6] Build failure with scripts/Makefile.headersinst

2012-11-20 Thread Tetsuo Handa

David Howells wrote:
> Does $(info ...) not work at all in version 3.80?  If it does, can you get it
> to display the values $(destination-y), $(dst) and $(obj) at the top of
> Makefile.headersinst?  If $(info ...) doesn't exist, does $(warning ...)?

$(info ...) does not work, but $(warning ...) works.

Debug output shows that the command-line variable "dst=" passed on the
  make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst
line is wrong.

-- debug print --
--- a/scripts/Makefile.headersinst
+++ b/scripts/Makefile.headersinst
@@ -7,6 +7,10 @@
 #
 # ==
 
+$(warning D "$(destination-y)")
+$(warning E "$(dst)")
+$(warning F "$(obj)")
+
 # called may set destination dir (when installing to asm/)
 _dst := $(or $(destination-y),$(dst),$(obj))
-- make 3.81 --
make -f scripts/Makefile.build obj=scripts build_unifdef
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi
/usr/src/all/linux/scripts/Makefile.headersinst:10: D ""
/usr/src/all/linux/scripts/Makefile.headersinst:11: E ""
/usr/src/all/linux/scripts/Makefile.headersinst:12: F "include/uapi"
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/asm-generic dst=include/uapi/asm-generic
/usr/src/all/linux/scripts/Makefile.headersinst:10: D ""
/usr/src/all/linux/scripts/Makefile.headersinst:11: E "include/uapi/asm-generic"
/usr/src/all/linux/scripts/Makefile.headersinst:12: F "include/uapi/asm-generic"
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/drm dst=include/uapi/drm
/usr/src/all/linux/scripts/Makefile.headersinst:10: D ""
/usr/src/all/linux/scripts/Makefile.headersinst:11: E "include/uapi/drm"
/usr/src/all/linux/scripts/Makefile.headersinst:12: F "include/uapi/drm"
-- make 3.80 --
make -f scripts/Makefile.build obj=scripts build_unifdef
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi
/usr/src/all/linux/scripts/Makefile.headersinst:10: D ""
/usr/src/all/linux/scripts/Makefile.headersinst:11: E ""
/usr/src/all/linux/scripts/Makefile.headersinst:12: F "include/uapi"
make -rR -f /usr/src/all/linux/scripts/Makefile.headersinst obj=include/uapi/asm-generic dst=/asm-generic
/usr/src/all/linux/scripts/Makefile.headersinst:10: D ""
/usr/src/all/linux/scripts/Makefile.headersinst:11: E "/asm-generic"
/usr/src/all/linux/scripts/Makefile.headersinst:12: F "include/uapi/asm-generic"
/usr/src/all/linux/scripts/Makefile.headersinst:54: *** Missing UAPI file /usr/src/all/linux/include/uapi/asm-generic/auxvec.h.  Stop.

Re: memcmp in modules

2013-04-22 Thread Tetsuo Handa

Andy Shevchenko wrote:
> What did I miss?

Well, as of linux-next-20130422, memcmp() is not correctly exported to modules.
Since linux-3.9-rc8 correctly exports memcmp(), this problem seems to have been
introduced in the linux-next tree. Also, this problem seems to involve
CONFIG_MODVERSIONS=y.

  [root@localhost linux-next]# modprobe ipv6
  FATAL: Error inserting ipv6 (/lib/modules/3.9.0-rc8-next-20130422/kernel/net/ipv6/ipv6.ko): Invalid argument
  [root@localhost linux-next]# dmesg
  ipv6: no symbol version for memcmp
  ipv6: Unknown symbol memcmp (err -22)

Since arch/x86/include/asm/string_64.h uses

  int memcmp(const void *cs, const void *ct, size_t count);

while arch/x86/include/asm/string_32.h uses

  #define memcmp __builtin_memcmp

changing to what you have tried

  #define memcmp(a, b, n) __builtin_memcmp(a, b, n)

or changing to what x86_64 does

  int memcmp(const void *cs, const void *ct, size_t count);

might solve this problem. But I don't know which one is the correct solution...
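
The difference between the two macro forms can be seen with plain preprocessor
stringification. This sketch uses hypothetical demo_* names instead of
redefining memcmp, and it only illustrates preprocessor behaviour; how the
kernel's symbol versioning tooling actually derives the recorded name is not
shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Two-level stringification so that the argument is macro-expanded first. */
#define DEMO_STR(x)  #x
#define DEMO_XSTR(x) DEMO_STR(x)

/* Object-like macro: every bare use of the name is replaced. */
#define demo_cmp_obj __builtin_memcmp

/* Function-like macro: only replaced when followed by '('. */
#define demo_cmp_fn(a, b, n) __builtin_memcmp(a, b, n)

/* The name as seen by anything that expands macros before recording it. */
static const char *demo_name_obj(void) { return DEMO_XSTR(demo_cmp_obj); }
static const char *demo_name_fn(void)  { return DEMO_XSTR(demo_cmp_fn); }

/* Calls through the function-like macro still reach the builtin. */
static int demo_equal(const void *a, const void *b, size_t n)
{
	assert(a != NULL && b != NULL);
	return demo_cmp_fn(a, b, n) == 0;
}
```

With the object-like form the bare identifier turns into "__builtin_memcmp",
while with the function-like form it is left alone, which is consistent with
the "no symbol version for memcmp" message appearing only on x86_32.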

[PATCH] x86_32: Fix module version table mismatch.

2013-04-23 Thread Tetsuo Handa

Commit a4b6a77b "module: fix symbol versioning with symbol prefixes" broke
loading of net/ipv6/ipv6.ko built with CONFIG_MODVERSIONS=y for x86_32.

  # modprobe ipv6
  FATAL: Error inserting ipv6 (/lib/modules/3.9.0-rc8-next-20130422/kernel/net/ipv6/ipv6.ko): Invalid argument
  # dmesg
  ipv6: no symbol version for memcmp
  ipv6: Unknown symbol memcmp (err -22)

The reason for breakage is that check_version() in kernel/module.c tries to
find symname == "memcmp" but versions[i].name == "__builtin_memcmp".

The reason for versions[i].name == "__builtin_memcmp" is that
memcmp() for x86_32 is defined as

  #define memcmp __builtin_memcmp

in arch/x86/include/asm/string_32.h while memcmp() for x86_64 is defined as

  int memcmp(const void *cs, const void *ct, size_t count);

in arch/x86/include/asm/string_64.h.

Since __builtin_memcmp is a gcc built-in function which might emit a call to
memcmp, __builtin_memcmp should not be used for the versions[i].name field.

In order to make sure that versions[i].name == "memcmp", make the definition of
memcmp() for x86_32 identical with that of x86_64.

Signed-off-by: Tetsuo Handa 
Cc: James Hogan 
Cc: Rusty Russell 
---
 arch/x86/include/asm/string_32.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index 3d3e835..bb85b4e 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -199,7 +199,7 @@ static inline void *__memcpy3d(void *to, const void *from, size_t len)
 #define __HAVE_ARCH_MEMMOVE
 void *memmove(void *dest, const void *src, size_t n);
 
-#define memcmp __builtin_memcmp
+int memcmp(const void *cs, const void *ct, size_t count);
 
 #define __HAVE_ARCH_MEMCHR
 extern void *memchr(const void *cs, int c, size_t count);
-- 
1.7.1

[PATCH] proc: Add workaround for idle/iowait decreasing problem.

2013-04-23 Thread Tetsuo Handa

CONFIG_NO_HZ=y can cause idle/iowait values to decrease.

If /proc/stat is monitored at a short interval (e.g. 1 or 2 seconds) using the
sysstat package, sar reports bogus %idle/iowait values because sar expects
that idle/iowait values do not decrease unless wraparound happens.

This patch makes the idle/iowait values visible from /proc/stat increase
monotonically, on the assumption that we don't need to worry about
wraparound.

Signed-off-by: Tetsuo Handa 
---
 fs/proc/stat.c |   42 ++
 1 files changed, 38 insertions(+), 4 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index e296572..9fff534 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -19,6 +19,40 @@
 #define arch_irq_stat() 0
 #endif
 
+/*
+ * CONFIG_NO_HZ=y can cause idle/iowait values to decrease.
+ * Make sure that idle/iowait values visible from /proc/stat do not decrease.
+ */
+static inline u64 validate_iowait(u64 iowait, const int cpu)
+{
+#ifdef CONFIG_NO_HZ
+   static u64 max_iowait[NR_CPUS];
+   static DEFINE_SPINLOCK(lock);
+   spin_lock(&lock);
+   if (likely(iowait >= max_iowait[cpu]))
+           max_iowait[cpu] = iowait;
+   else
+           iowait = max_iowait[cpu];
+   spin_unlock(&lock);
+#endif
+   return iowait;
+}
+
+static inline u64 validate_idle(u64 idle, const int cpu)
+{
+#ifdef CONFIG_NO_HZ
+   static u64 max_idle[NR_CPUS];
+   static DEFINE_SPINLOCK(lock);
+   spin_lock(&lock);
+   if (likely(idle >= max_idle[cpu]))
+           max_idle[cpu] = idle;
+   else
+           idle = max_idle[cpu];
+   spin_unlock(&lock);
+#endif
+   return idle;
+}
+
 #ifdef arch_idle_time
 
 static cputime64_t get_idle_time(int cpu)
@@ -28,7 +62,7 @@ static cputime64_t get_idle_time(int cpu)
    idle = kcpustat_cpu(cpu).cpustat[CPUTIME_IDLE];
    if (cpu_online(cpu) && !nr_iowait_cpu(cpu))
            idle += arch_idle_time(cpu);
-   return idle;
+   return validate_idle(idle, cpu);
 }
 
 static cputime64_t get_iowait_time(int cpu)
@@ -38,7 +72,7 @@ static cputime64_t get_iowait_time(int cpu)
    iowait = kcpustat_cpu(cpu).cpustat[CPUTIME_IOWAIT];
    if (cpu_online(cpu) && nr_iowait_cpu(cpu))
            iowait += arch_idle_time(cpu);
-   return iowait;
+   return validate_iowait(iowait, cpu);
 }
 
 #else
@@ -56,7 +90,7 @@ static u64 get_idle_time(int cpu)
    else
            idle = usecs_to_cputime64(idle_time);
 
-   return idle;
+   return validate_idle(idle, cpu);
 }
 
 static u64 get_iowait_time(int cpu)
@@ -72,7 +106,7 @@ static u64 get_iowait_time(int cpu)
    else
            iowait = usecs_to_cputime64(iowait_time);
 
-   return iowait;
+   return validate_iowait(iowait, cpu);
 }
 
 #endif
-- 
1.7.1
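
The clamping done by validate_idle() and validate_iowait() in the patch above
can be sketched in user space as follows. This is a minimal single-threaded
illustration: DEMO_NR_CPUS, demo_max_seen and demo_validate() are hypothetical
names, one array is used instead of the two in the patch, and the spinlock is
omitted.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NR_CPUS 4 /* hypothetical CPU count for the sketch */

/* Largest value reported so far for each CPU (zero-initialized). */
static uint64_t demo_max_seen[DEMO_NR_CPUS];

/*
 * Return a value that never goes backwards for the given CPU.
 * The patch serializes this with a spinlock; this single-threaded
 * sketch omits locking.
 */
uint64_t demo_validate(uint64_t value, int cpu)
{
	assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
	if (value >= demo_max_seen[cpu])
		demo_max_seen[cpu] = value;	/* new maximum */
	else
		value = demo_max_seen[cpu];	/* hide the decrease */
	return value;
}
```

A reader that sees 100 and then a raw value of 90 for the same CPU is shown
100 again, so the difference between two samples is never negative.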

[linux-next-20130607] rcu warning, sleep in atomic, unchecked dma alloc, bad address access

2013-06-16 Thread Tetsuo Handa

I get the failures below using 3.10.0-rc4-next-20130607.

Let me know whether these are already reported/bisected or not.
Trace 3 has been known since linux-next-20121127 and is still waiting for
patches from Don Fry.

Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.10-rc4-next-20130607 .
Full log is at http://I-love.SAKURA.ne.jp/tmp/dmesg-3.10-rc4-next-20120607.txt .

Trace 5 is a trace with TOMOYO 1.8 patch applied (though TOMOYO 1.8 should be
unrelated).

Regards.

-- Trace 1 start --

[   52.252768] ===
[   52.253874] [ INFO: suspicious RCU usage. ]
[   52.255202] 3.10.0-rc4-next-20130607 #6 Not tainted
[   52.255302] ---
[   52.257909] include/linux/rcupdate.h:471 Illegal context switch in RCU read-side critical section!
[   52.265938] 
[   52.265938] other info that might help us debug this:
[   52.265938] 
[   52.268165] 
[   52.268165] rcu_scheduler_active = 1, debug_locks = 1
[   52.271240] 2 locks held by udevd/1909:
[   52.272229]  #0:  (&ids->rw_mutex){+.+.+.}, at: [] ipcget+0x45/0x70
[   52.275183]  #1:  (rcu_read_lock){.+.+..}, at: [] rcu_read_lock+0x0/0x80
[   52.277663] 
[   52.277663] stack backtrace:
[   52.277955] CPU: 0 PID: 1909 Comm: udevd Not tainted 3.10.0-rc4-next-20130607 #6
[   52.281381] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
[   52.285386]  0001 da601e94 c06765af 0001 da601ebc c0199376 c07f561b c07f7108
[   52.287848]  0001 0001 c07e6614  01a8 c081d2f7 da601ee4 c0176011
[   52.290814]  0246 0002 0001   00d0 de40ff0c ffe4
[   52.293472] Call Trace:
[   52.294297]  [] dump_stack+0x4c/0x6d
[   52.301317]  [] lockdep_rcu_suspicious+0xc6/0x100
[   52.302814]  [] __might_sleep+0xb1/0x200
[   52.303092]  [] idr_preload+0xa1/0xd0
[   52.305500]  [] ipc_addid+0x52/0x190
[   52.306692]  [] ? rcu_read_lock+0x5d/0x80
[   52.308119]  [] newary+0xba/0x1a0
[   52.310552]  [] ipcget+0x4e/0x70
[   52.311643]  [] ? __lock_release+0x72/0x1b0
[   52.313194]  [] SyS_semget+0x6f/0x80
[   52.314338]  [] ? sem_security+0x10/0x10
[   52.315562]  [] ? SyS_semget+0x80/0x80
[   52.316920]  [] ? SyS_msgctl+0xb0/0xb0
[   52.318119]  [] SyS_ipc+0xa3/0x250
[   52.318560]  [] ? vm_munmap+0x46/0x60
[   52.320684]  [] sysenter_do_call+0x12/0x32

-- Trace 1 end --

-- Trace 2 start --

[   52.322037] BUG: sleeping function called from invalid context at lib/idr.c:424
[   52.324064] in_atomic(): 1, irqs_disabled(): 0, pid: 1909, name: udevd
[   52.325711] 2 locks held by udevd/1909:
[   52.326703]  #0:  (&ids->rw_mutex){+.+.+.}, at: [] ipcget+0x45/0x70
[   52.334105]  #1:  (rcu_read_lock){.+.+..}, at: [] rcu_read_lock+0x0/0x80
[   52.337009] CPU: 0 PID: 1909 Comm: udevd Not tainted 3.10.0-rc4-next-20130607 #6
[   52.338954] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
[   52.341290]  01a8 da601ebc c06765af daae2920 da601ee4 c01760bd c07e670c 0001
[   52.345513]   0775 daae2bd8 00d0 de40ff0c ffe4 da601ef0 c0408cc1
[   52.348492]  c08e2b84 da601f10 c038adc2 da601f10 c038d8ad  0001 c08e2b80
[   52.350955] Call Trace:
[   52.352001]  [] dump_stack+0x4c/0x6d
[   52.352001]  [] __might_sleep+0x15d/0x200
[   52.352001]  [] idr_preload+0xa1/0xd0
[   52.352001]  [] ipc_addid+0x52/0x190
[   52.352001]  [] ? rcu_read_lock+0x5d/0x80
[   52.352001]  [] newary+0xba/0x1a0
[   52.360934]  [] ipcget+0x4e/0x70
[   52.362111]  [] ? __lock_release+0x72/0x1b0
[   52.366977]  [] SyS_semget+0x6f/0x80
[   52.367216]  [] ? sem_security+0x10/0x10
[   52.367400]  [] ? SyS_semget+0x80/0x80
[   52.372241]  [] ? SyS_msgctl+0xb0/0xb0
[   52.376344]  [] SyS_ipc+0xa3/0x250
[   52.377477]  [] ? vm_munmap+0x46/0x60
[   52.378647]  [] sysenter_do_call+0x12/0x32

[   70.993965] [ cut here ]

-- Trace 2 end --

-- Trace 3 start --

[   70.995522] WARNING: CPU: 1 PID: 13 at lib/dma-debug.c:937 check_unmap+0x4a6/0x8f0()
[   70.996073] pcnet32 :02:01.0: DMA-API: device driver failed to check map error[device address=0x1a0c0802] [size=90 bytes] [mapped as single]
[   70.996073] Modules linked in: ipv6 binfmt_misc
[   70.996073] CPU: 1 PID: 13 Comm: ksoftirqd/1 Not tainted 3.10.0-rc4-next-20130607 #6
[   70.996073] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
[   70.996073]  03a9 dedf1cec c06765af c081ee31 dedf1d1c c013c6ef c08208b8 dedf1d48
[   70.996073]  000d c081ee31 03a9 c042fcc6 c042fcc6 de57a780 dbb76840 c1191320
[   70.996073]  dedf1d34 c013c7b3 0009 dedf1d2c c08208b8 dedf1d48 dedf1da4 c042fcc6
[   70.996073] Call Trace:
[   70.996073]  [] dump_stack+0x4c/0x6d
[   70.996073]  [] warn_slowpath_common+0x7f/0xa0
[   70.996073]  [] ? check_unmap+0x4a6/0x8f0
[   70

[PATCH linux-next] ipc: Avoid sleeping inside RCU.

2013-06-17 Thread Tetsuo Handa

I got this.

===
[ INFO: suspicious RCU usage. ]
3.10.0-rc6-next-20130617 #7 Not tainted
---
include/linux/rcupdate.h:475 Illegal context switch in RCU read-side critical section!

other info that might help us debug this:


rcu_scheduler_active = 1, debug_locks = 1
2 locks held by udevd/1909:
 #0:  (&ids->rw_mutex){+.+.+.}, at: [] ipcget+0x45/0x70
 #1:  (rcu_read_lock){.+.+..}, at: [] rcu_read_lock+0x0/0x80

stack backtrace:
CPU: 0 PID: 1909 Comm: udevd Not tainted 3.10.0-rc6-next-20130617 #7
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
 0001 da591e94 c0677931 0001 da591ebc c0199606 c07f690d c07f83fa
 0001 0001 c07e7904  01a8 c081e5f7 da591ee4 c0176231
 0246 0002 0001   00d0 deee0f0c ffe4
Call Trace:
 [] dump_stack+0x4c/0x6b
 [] lockdep_rcu_suspicious+0xc6/0x100
 [] __might_sleep+0xb1/0x200
 [] idr_preload+0xa1/0xd0
 [] ipc_addid+0x52/0x190
 [] ? rcu_read_lock+0x5d/0x80
 [] newary+0xba/0x1a0
 [] ipcget+0x4e/0x70
 [] ? __lock_release+0x72/0x1b0
 [] SyS_semget+0x6f/0x80
 [] ? sem_security+0x10/0x10
 [] ? SyS_semget+0x80/0x80
 [] ? SyS_msgctl+0xb0/0xb0
 [] SyS_ipc+0xa3/0x250
 [] ? vm_munmap+0x46/0x60
 [] sysenter_do_call+0x12/0x32
BUG: sleeping function called from invalid context at lib/idr.c:424
in_atomic(): 1, irqs_disabled(): 0, pid: 1909, name: udevd
2 locks held by udevd/1909:
 #0:  (&ids->rw_mutex){+.+.+.}, at: [] ipcget+0x45/0x70
 #1:  (rcu_read_lock){.+.+..}, at: [] rcu_read_lock+0x0/0x80
CPU: 0 PID: 1909 Comm: udevd Not tainted 3.10.0-rc6-next-20130617 #7
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 08/15/2008
 01a8 da591ebc c0677931 da594260 da591ee4 c01762dd c07e79fc 0001
  0775 da594518 00d0 deee0f0c ffe4 da591ef0 c04098f1
 c08e2b84 da591f10 c038b4d2 da591f10 c038dfbd  0001 c08e2b80
Call Trace:
 [] dump_stack+0x4c/0x6b
 [] __might_sleep+0x15d/0x200
 [] idr_preload+0xa1/0xd0
 [] ipc_addid+0x52/0x190
 [] ? rcu_read_lock+0x5d/0x80
 [] newary+0xba/0x1a0
 [] ipcget+0x4e/0x70
 [] ? __lock_release+0x72/0x1b0
 [] SyS_semget+0x6f/0x80
 [] ? sem_security+0x10/0x10
 [] ? SyS_semget+0x80/0x80
 [] ? SyS_msgctl+0xb0/0xb0
 [] SyS_ipc+0xa3/0x250
 [] ? vm_munmap+0x46/0x60
 [] sysenter_do_call+0x12/0x32

--
From da6fbd77764e01cd611bd488666f349c11e1d4f3 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Mon, 17 Jun 2013 20:49:25 +0900
Subject: [PATCH linux-next] ipc: Avoid sleeping inside RCU.

Commit 4964214b "ipc: move rcu lock out of ipc_addid" moved
idr_preload(GFP_KERNEL) into an RCU read-side critical section, where
sleeping is not allowed. Use GFP_ATOMIC instead.

Signed-off-by: Tetsuo Handa 
---
 ipc/util.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index a746abb..00aca85 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -261,7 +261,7 @@ int ipc_addid(struct ipc_ids* ids, struct kern_ipc_perm* new, int size)
    if (ids->in_use >= size)
            return -ENOSPC;
 
-   idr_preload(GFP_KERNEL);
+   idr_preload(GFP_ATOMIC);
 
    spin_lock_init(&new->lock);
    new->deleted = 0;
-- 
1.7.1
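
The rule that the warnings above complain about can be modeled in user space
as a toy. This is only an illustration of the constraint, not kernel code:
demo_read_lock(), demo_read_unlock(), demo_alloc_is_safe() and the enum are
hypothetical names, and the two modes merely stand in for GFP_KERNEL (may
sleep) and GFP_ATOMIC (must not sleep).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical allocation modes standing in for GFP_KERNEL / GFP_ATOMIC. */
enum demo_gfp { DEMO_MAY_SLEEP, DEMO_ATOMIC };

/* Nesting depth of the modeled read-side critical section. */
static int demo_read_depth;

void demo_read_lock(void)
{
	demo_read_depth++;
}

void demo_read_unlock(void)
{
	assert(demo_read_depth > 0);
	demo_read_depth--;
}

/* A sleeping allocation is acceptable only outside the read-side section. */
bool demo_alloc_is_safe(enum demo_gfp mode)
{
	return mode == DEMO_ATOMIC || demo_read_depth == 0;
}
```

In this model, a DEMO_MAY_SLEEP request between demo_read_lock() and
demo_read_unlock() is the case that corresponds to the reported warnings.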

Re: [linux-next-20130422] Bug in SLAB?

2013-06-17 Thread Tetsuo Handa

Tetsuo Handa wrote:
> Christoph Lameter wrote:
> > Subject: SLAB: Fix init_lock_keys
> > 
> > init_lock_keys goes too far in initializing values in kmalloc_caches because
> > it assumed that the size of the kmalloc array goes up to MAX_ORDER. However,
> > the size of the kmalloc array for SLAB may be restricted due to increased
> > page sizes or CONFIG_FORCE_MAX_ZONEORDER.
> > 
> > Reported-by: Tetsuo Handa 
> > Signed-off-by: Christoph Lameter 
> > 
> > Index: linux/mm/slab.c
> > ===
> > --- linux.orig/mm/slab.c2013-05-09 09:06:20.0 -0500
> > +++ linux/mm/slab.c 2013-05-09 09:08:08.338606055 -0500
> > @@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
> >     if (slab_state < UP)
> >             return;
> > 
> > -   for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> > +   for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
> >             struct kmem_cache_node *n;
> >             struct kmem_cache *cache = kmalloc_caches[i];
> > 
> > 
> > 
> > 
> Looks OK to me. Please send this one to 3.10-rcX.
> 
It's already 3.10-rc6. Please be sure to send.
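
To see why the old loop bound in the quoted patch can run past the array, here
is a small user-space sketch. demo_old_loop_overrun() is a hypothetical helper,
and the numbers used with it are illustrative rather than taken from any
particular kernel configuration.

```c
#include <assert.h>

/*
 * Count how many indices a loop "for (i = 1; i < page_shift + max_order; i++)"
 * would touch beyond an array whose last valid index is shift_high.
 */
int demo_old_loop_overrun(int page_shift, int max_order, int shift_high)
{
	int overrun = 0;
	int i;

	assert(shift_high >= 0);
	for (i = 1; i < page_shift + max_order; i++)
		if (i > shift_high)
			overrun++;	/* index past the end of the array */
	return overrun;
}
```

For example, demo_old_loop_overrun(12, 11, 22) is 0, while
demo_old_loop_overrun(16, 11, 25) is 1: once the array is capped below
page_shift + max_order - 1, the old bound visits an entry that does not exist.
Looping up to the array's own limit, as the patch does, avoids this.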

Re: [PATCH linux-next] ipc: Avoid sleeping inside RCU.

2013-06-17 Thread Tetsuo Handa

Davidlohr Bueso wrote:
> This should already be fixed here:
> https://lkml.org/lkml/2013/6/11/705

OK. Please be sure to apply; it is not applied as of 3.10.0-rc6-next-20130617.

Re: [PATCH 0/4] LSM/TOMOYO: Introduce per a task_struct variables.

2013-06-18 Thread Tetsuo Handa

Please respond if you have any comments/questions/objections/problems.

Tetsuo Handa wrote:
> This patchset has four patches. Patches 1 and 2 essentially revive LSM
> hooks which existed until Linux 2.6.28.
> 
> [PATCH 1/4] LSM: Add security_bprm_aborting_creds() hook.
> [PATCH 2/4] LSM: Revive security_task_alloc() hook.
> [PATCH 3/4] TOMOYO: Remember the proposed domain while in execve() request.
> [PATCH 4/4] TOMOYO: Allow caching policy manager's state until execve() request.
> 
>  b/fs/exec.c|1
>  b/include/linux/security.h |   11 +++
>  b/kernel/fork.c|7 +
>  b/security/capability.c|5 +
>  b/security/security.c  |5 +
>  b/security/tomoyo/common.c |   22 +-
>  b/security/tomoyo/common.h |   34 +
>  b/security/tomoyo/tomoyo.c |  163 +++--
>  include/linux/security.h   |   10 ++
>  security/capability.c  |6 +
>  security/security.c|5 +
>  security/tomoyo/common.h   |6 +
>  security/tomoyo/tomoyo.c   |   32 
>  13 files changed, 298 insertions(+), 9 deletions(-)

[PATCH 0/4] LSM/TOMOYO: Introduce per a task_struct variables.

2013-06-11 Thread Tetsuo Handa

This patchset fixes two of TOMOYO's long-standing bugs, which have existed
since Linux 2.6.30.

Bug 1:

  TOMOYO has been unable to check the binary loader's permission upon execve().
  TOMOYO uses different permissions for the program passed to the execve()
  request and for the binary loader requested by that program, but it was not
  able to distinguish them because it could not pass the proposed credential
  as an argument. An attempt to pass the proposed credential was made but was
  not successful because it breaks DAC's behavior.

Bug 2:

  TOMOYO has been unable to remember that the current thread was once granted
  permission to manage policy, because there is no mechanism for cleanly
  allocating per a task_struct variables. As a result, TOMOYO needlessly has to
  check permission for updating policy whenever a line of policy is written.
  Also, if userspace deletes a line that is needed for updating policy, the
  current thread (which should still be able to update policy) fails to write
  the remaining lines.
  Variables associated with the copy-on-write credentials do not help fix
  this bug because TOMOYO may not be allowed to modify them when it wants to
  modify them.

This patchset has four patches. Patches 1 and 2 essentially revive LSM
hooks which existed until Linux 2.6.28.

[PATCH 1/4] LSM: Add security_bprm_aborting_creds() hook.
[PATCH 2/4] LSM: Revive security_task_alloc() hook.
[PATCH 3/4] TOMOYO: Remember the proposed domain while in execve() request.
[PATCH 4/4] TOMOYO: Allow caching policy manager's state until execve() request.

 b/fs/exec.c|1
 b/include/linux/security.h |   11 +++
 b/kernel/fork.c|7 +
 b/security/capability.c|5 +
 b/security/security.c  |5 +
 b/security/tomoyo/common.c |   22 +-
 b/security/tomoyo/common.h |   34 +
 b/security/tomoyo/tomoyo.c |  163 +++--
 include/linux/security.h   |   10 ++
 security/capability.c  |6 +
 security/security.c|5 +
 security/tomoyo/common.h   |6 +
 security/tomoyo/tomoyo.c   |   32 
 13 files changed, 298 insertions(+), 9 deletions(-)
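
The per-task bookkeeping that patches 3/4 and 4/4 build on can be sketched in
user space roughly as follows. This is a simplified model under stated
assumptions: every demo_* name is hypothetical, a plain singly linked list
replaces the kernel list, and the locking, RCU and freeing done by the real
patches are left out.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_TASK_IS_IN_EXECVE 1
#define DEMO_TASK_IS_MANAGER   2

struct demo_task { int pid; };		/* stand-in for struct task_struct */

/* One record per task that currently has state to remember. */
struct demo_security {
	struct demo_security *next;
	const struct demo_task *task;
	unsigned int flags;
};

static struct demo_security *demo_list;

/* Look up the record for a task; NULL means "no flags set". */
struct demo_security *demo_find(const struct demo_task *task)
{
	struct demo_security *ptr;

	for (ptr = demo_list; ptr; ptr = ptr->next)
		if (ptr->task == task)
			return ptr;
	return NULL;
}

/* Create a record for a task with the given flags. Returns 0 or -1. */
int demo_add(const struct demo_task *task, unsigned int flags)
{
	struct demo_security *ptr = calloc(1, sizeof(*ptr));

	if (!ptr)
		return -1;
	ptr->task = task;
	ptr->flags = flags;
	ptr->next = demo_list;
	demo_list = ptr;
	return 0;
}

/* On task creation, copy the parent's flags if the parent has a record. */
int demo_task_alloc(const struct demo_task *parent,
		    const struct demo_task *child)
{
	struct demo_security *old = demo_find(parent);

	assert(parent != child);
	return old ? demo_add(child, old->flags) : 0;
}
```

A task without a record is treated as having no flags, so only tasks that are
inside execve() or that were recognized as policy managers cost any memory.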

[PATCH 1/4] LSM: Add security_bprm_aborting_creds() hook.

2013-06-11 Thread Tetsuo Handa

From 27dfd0d7652917601a53f4439678097c8ce67b2b Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 11 Jun 2013 21:26:53 +0900
Subject: [PATCH 1/4] LSM: Add security_bprm_aborting_creds() hook.

Add an LSM hook which is called only when an execve operation fails after
prepare_bprm_creds() has succeeded. This hook is used by TOMOYO for
synchronously cleaning up resources allocated during an execve operation.

Signed-off-by: Tetsuo Handa 
---
 fs/exec.c|1 +
 include/linux/security.h |   11 +++
 security/capability.c|5 +
 security/security.c  |5 +
 4 files changed, 22 insertions(+), 0 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index 6430195..f71b2ae 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1175,6 +1175,7 @@ void free_bprm(struct linux_binprm *bprm)
 {
    free_arg_pages(bprm);
    if (bprm->cred) {
+           security_bprm_aborting_creds(bprm);
            mutex_unlock(&current->signal->cred_guard_mutex);
            abort_creds(bprm->cred);
    }
diff --git a/include/linux/security.h b/include/linux/security.h
index 40560f4..6f03e37 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -232,6 +232,11 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  * linux_binprm structure.  This hook is a good place to perform state
  * changes on the process such as clearing out non-inheritable signal
  * state.  This is called immediately after commit_creds().
+ * @bprm_aborting_creds:
+ * This hook is called when an execve operation failed after
+ * prepare_bprm_creds() succeeded so that we can synchronously clean up
+ * resources used by an execve operation.
+ * @bprm points to the linux_binprm structure.
  * @bprm_secureexec:
  * Return a boolean value (0 or 1) indicating whether a "secure exec"
  * is required.  The flag is passed in the auxiliary table
@@ -1426,6 +1431,7 @@ struct security_operations {
int (*bprm_secureexec) (struct linux_binprm *bprm);
void (*bprm_committing_creds) (struct linux_binprm *bprm);
void (*bprm_committed_creds) (struct linux_binprm *bprm);
+   void (*bprm_aborting_creds) (struct linux_binprm *bprm);
 
int (*sb_alloc_security) (struct super_block *sb);
void (*sb_free_security) (struct super_block *sb);
@@ -1714,6 +1720,7 @@ int security_bprm_set_creds(struct linux_binprm *bprm);
 int security_bprm_check(struct linux_binprm *bprm);
 void security_bprm_committing_creds(struct linux_binprm *bprm);
 void security_bprm_committed_creds(struct linux_binprm *bprm);
+void security_bprm_aborting_creds(struct linux_binprm *bprm);
 int security_bprm_secureexec(struct linux_binprm *bprm);
 int security_sb_alloc(struct super_block *sb);
 void security_sb_free(struct super_block *sb);
@@ -1954,6 +1961,10 @@ static inline void security_bprm_committed_creds(struct linux_binprm *bprm)
 {
 }
 
+static inline void security_bprm_aborting_creds(struct linux_binprm *bprm)
+{
+}
+
 static inline int security_bprm_secureexec(struct linux_binprm *bprm)
 {
return cap_bprm_secureexec(bprm);
diff --git a/security/capability.c b/security/capability.c
index 1728d4e..34b6f09 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -40,6 +40,10 @@ static void cap_bprm_committed_creds(struct linux_binprm *bprm)
 {
 }
 
+static void cap_bprm_aborting_creds(struct linux_binprm *bprm)
+{
+}
+
 static int cap_sb_alloc_security(struct super_block *sb)
 {
return 0;
@@ -916,6 +920,7 @@ void __init security_fixup_ops(struct security_operations *ops)
    set_to_cap_if_null(ops, bprm_set_creds);
    set_to_cap_if_null(ops, bprm_committing_creds);
    set_to_cap_if_null(ops, bprm_committed_creds);
+   set_to_cap_if_null(ops, bprm_aborting_creds);
    set_to_cap_if_null(ops, bprm_check_security);
    set_to_cap_if_null(ops, bprm_secureexec);
    set_to_cap_if_null(ops, sb_alloc_security);
diff --git a/security/security.c b/security/security.c
index a3dce87..7123178 100644
--- a/security/security.c
+++ b/security/security.c
@@ -235,6 +235,11 @@ void security_bprm_committed_creds(struct linux_binprm *bprm)
    security_ops->bprm_committed_creds(bprm);
 }
 
+void security_bprm_aborting_creds(struct linux_binprm *bprm)
+{
+   security_ops->bprm_aborting_creds(bprm);
+}
+
 int security_bprm_secureexec(struct linux_binprm *bprm)
 {
return security_ops->bprm_secureexec(bprm);
-- 
1.7.1
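
The way the new hook is wired up (an entry in the operations table, a no-op
default filled in by the fixup step, and a wrapper that callers use) can be
sketched in user space as below. All demo_* names and the cleanup_count field
are hypothetical; the sketch passes the table explicitly instead of using a
global pointer as the patch does.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct linux_binprm; counts how often cleanup ran. */
struct demo_binprm { int cleanup_count; };

struct demo_security_ops {
	void (*bprm_aborting_creds)(struct demo_binprm *bprm);
};

/* Default hook: does nothing, like the cap_*() stub in the patch. */
static void demo_cap_bprm_aborting_creds(struct demo_binprm *bprm)
{
	(void)bprm;
}

/* Fill in any hook the module left unset. */
void demo_fixup_ops(struct demo_security_ops *ops)
{
	assert(ops != NULL);
	if (!ops->bprm_aborting_creds)
		ops->bprm_aborting_creds = demo_cap_bprm_aborting_creds;
}

/* Wrapper used by callers; after fixup it never has to test for NULL. */
void demo_security_bprm_aborting_creds(const struct demo_security_ops *ops,
				       struct demo_binprm *bprm)
{
	ops->bprm_aborting_creds(bprm);
}

/* Example module hook that records that cleanup ran. */
void demo_module_aborting_creds(struct demo_binprm *bprm)
{
	bprm->cleanup_count++;
}
```

Because the fixup step guarantees a non-NULL pointer, the call site in the
failure path stays a single unconditional call.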

[PATCH 2/4] LSM: Revive security_task_alloc() hook.

2013-06-11 Thread Tetsuo Handa

From 5fc4d5f6d39e36f6b91ce6b26d084ef2354488c0 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 11 Jun 2013 21:33:49 +0900
Subject: [PATCH 2/4] LSM: Revive security_task_alloc() hook.

Revive an LSM hook which is called when a task_struct is allocated.
This hook is used by TOMOYO for inheriting per a task_struct variables.

Signed-off-by: Tetsuo Handa 
---
 include/linux/security.h |   10 ++
 kernel/fork.c|7 ++-
 security/capability.c|6 ++
 security/security.c  |5 +
 4 files changed, 27 insertions(+), 1 deletions(-)

diff --git a/include/linux/security.h b/include/linux/security.h
index 6f03e37..46566e3 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -660,6 +660,9 @@ static inline void security_free_mnt_opts(struct security_mnt_opts *opts)
  * manual page for definitions of the @clone_flags.
  * @clone_flags contains the flags indicating what should be shared.
  * Return 0 if permission is granted.
+ * @task_alloc:
+ * @task task being allocated.
+ * Handle allocation of task-related resources.
  * @task_free:
  * @task task being freed
  * Handle release of task-related resources. (Note that this can be called
@@ -1528,6 +1531,7 @@ struct security_operations {
int (*file_open) (struct file *file, const struct cred *cred);
 
int (*task_create) (unsigned long clone_flags);
+   int (*task_alloc) (struct task_struct *task);
void (*task_free) (struct task_struct *task);
int (*cred_alloc_blank) (struct cred *cred, gfp_t gfp);
void (*cred_free) (struct cred *cred);
@@ -1792,6 +1796,7 @@ int security_file_send_sigiotask(struct task_struct *tsk,
 int security_file_receive(struct file *file);
 int security_file_open(struct file *file, const struct cred *cred);
 int security_task_create(unsigned long clone_flags);
+int security_task_alloc(struct task_struct *task);
 void security_task_free(struct task_struct *task);
 int security_cred_alloc_blank(struct cred *cred, gfp_t gfp);
 void security_cred_free(struct cred *cred);
@@ -2281,6 +2286,11 @@ static inline int security_task_create(unsigned long clone_flags)
    return 0;
 }
 
+static inline int security_task_alloc(struct task_struct *task)
+{
+   return 0;
+}
+
 static inline void security_task_free(struct task_struct *task)
 { }
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 987b28a..fed8f5d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1315,9 +1315,12 @@ static struct task_struct *copy_process(unsigned long clone_flags,
    retval = perf_event_init_task(p);
    if (retval)
            goto bad_fork_cleanup_policy;
-   retval = audit_alloc(p);
+   retval = security_task_alloc(p);
    if (retval)
            goto bad_fork_cleanup_policy;
+   retval = audit_alloc(p);
+   if (retval)
+           goto bad_fork_cleanup_security;
    /* copy all the process information */
    retval = copy_semundo(clone_flags, p);
    if (retval)
@@ -1512,6 +1515,8 @@ bad_fork_cleanup_semundo:
    exit_sem(p);
 bad_fork_cleanup_audit:
    audit_free(p);
+bad_fork_cleanup_security:
+   security_task_free(p);
 bad_fork_cleanup_policy:
    perf_event_free_task(p);
 #ifdef CONFIG_NUMA
diff --git a/security/capability.c b/security/capability.c
index 34b6f09..3ddc282 100644
--- a/security/capability.c
+++ b/security/capability.c
@@ -363,6 +363,11 @@ static int cap_task_create(unsigned long clone_flags)
return 0;
 }
 
+static int cap_task_alloc(struct task_struct *task)
+{
+   return 0;
+}
+
 static void cap_task_free(struct task_struct *task)
 {
 }
@@ -990,6 +995,7 @@ void __init security_fixup_ops(struct security_operations *ops)
    set_to_cap_if_null(ops, file_receive);
    set_to_cap_if_null(ops, file_open);
    set_to_cap_if_null(ops, task_create);
+   set_to_cap_if_null(ops, task_alloc);
    set_to_cap_if_null(ops, task_free);
    set_to_cap_if_null(ops, cred_alloc_blank);
    set_to_cap_if_null(ops, cred_free);
diff --git a/security/security.c b/security/security.c
index 7123178..3aeaecf 100644
--- a/security/security.c
+++ b/security/security.c
@@ -782,6 +782,11 @@ int security_task_create(unsigned long clone_flags)
return security_ops->task_create(clone_flags);
 }
 
+int security_task_alloc(struct task_struct *task)
+{
+   return security_ops->task_alloc(task);
+}
+
 void security_task_free(struct task_struct *task)
 {
 #ifdef CONFIG_SECURITY_YAMA_STACKED
-- 
1.7.1
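
The ordering of the new error label in the kernel/fork.c hunks above follows
the usual goto-unwinding pattern: each label undoes one step and falls through
to the labels for earlier steps. A user-space sketch of that control flow,
with hypothetical names (demo_copy_process() and the DEMO_FREED_* flags) and
no real allocation, could look like this:

```c
#include <assert.h>

/* Bit flags recording which cleanup steps ran. */
#define DEMO_FREED_AUDIT    1
#define DEMO_FREED_SECURITY 2
#define DEMO_FREED_POLICY   4

/*
 * fail_at: 0 = no failure, 1 = the security allocation fails,
 * 2 = the audit allocation fails, 3 = a later step fails and
 * unwinds from the audit label.
 * Returns the set of cleanup steps executed.
 */
int demo_copy_process(int fail_at)
{
	int freed = 0;

	assert(fail_at >= 0 && fail_at <= 3);
	if (fail_at == 1)	/* nothing of ours to undo yet */
		goto cleanup_policy;
	if (fail_at == 2)	/* security state exists, audit does not */
		goto cleanup_security;
	if (fail_at == 3)	/* both exist */
		goto cleanup_audit;
	return freed;

cleanup_audit:
	freed |= DEMO_FREED_AUDIT;
cleanup_security:
	freed |= DEMO_FREED_SECURITY;
cleanup_policy:
	freed |= DEMO_FREED_POLICY;
	return freed;
}
```

A failure of the audit step therefore releases the security state but not the
audit state, which is why the new label sits between the audit and policy
labels.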

[PATCH 4/4] TOMOYO: Allow caching policy manager's state until execve() request.

2013-06-11 Thread Tetsuo Handa

From 05dfa116d92f31b9def142790748a2b479800718 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 11 Jun 2013 21:39:12 +0900
Subject: [PATCH 4/4] TOMOYO: Allow caching policy manager's state until execve() request.

Extend the lifetime of the per a task_struct variable from "start of execve()
to end of do_execve()" to "arbitrary moment to the end of execve()" so that
we can remember that the current thread (and threads created afterwards from
the current thread) can update policy without checking permission to update
policy for each line.

Signed-off-by: Tetsuo Handa 
---
 security/tomoyo/common.c |   22 +-
 security/tomoyo/common.h |6 ++
 security/tomoyo/tomoyo.c |   32 +++-
 3 files changed, 58 insertions(+), 2 deletions(-)

diff --git a/security/tomoyo/common.c b/security/tomoyo/common.c
index 283862a..40ae993 100644
--- a/security/tomoyo/common.c
+++ b/security/tomoyo/common.c
@@ -921,10 +921,14 @@ static bool tomoyo_manager(void)
    const char *exe;
    const struct task_struct *task = current;
    const struct tomoyo_path_info *domainname = tomoyo_domain()->domainname;
+   struct tomoyo_security *security;
    bool found = false;
 
    if (!tomoyo_policy_loaded)
            return true;
+   security = tomoyo_find_task_security();
+   if (security && (security->tomoyo_flags & TOMOYO_TASK_IS_MANAGER))
+           return true;
    if (!tomoyo_manage_by_non_root &&
        (!uid_eq(task->cred->uid,  GLOBAL_ROOT_UID) ||
         !uid_eq(task->cred->euid, GLOBAL_ROOT_UID)))
@@ -951,7 +955,23 @@ static bool tomoyo_manager(void)
            }
    }
    kfree(exe);
-   return found;
+   if (!found)
+           return false;
+   /*
+    * Remember that the current thread is allowed to update
+    * policies until do_execve().
+    */
+   if (security) {
+           security->tomoyo_flags |= TOMOYO_TASK_IS_MANAGER;
+           return true;
+   }
+   security = kzalloc(sizeof(*security), GFP_KERNEL);
+   if (security) {
+           security->task = task;
+           security->tomoyo_flags |= TOMOYO_TASK_IS_MANAGER;
+           tomoyo_add_task_security(security);
+   }
+   return true;
 }
 
 static struct tomoyo_domain_info *tomoyo_find_domain_by_qid
diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index 60e5800..923d237 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -38,6 +38,11 @@
 
 /* Current thread is doing do_execve() ? */
 #define TOMOYO_TASK_IS_IN_EXECVE 1
+/*
+ * Current thread is allowed to modify policy via /sys/kernel/security/tomoyo/
+ * interface?
+ */
+#define TOMOYO_TASK_IS_MANAGER   2
 
 /*
  * TOMOYO uses this hash only when appending a string into the string
@@ -927,6 +932,7 @@ struct tomoyo_policy_namespace {
 /** Function prototypes. **/
 
 struct tomoyo_security *tomoyo_find_task_security(void);
+void tomoyo_add_task_security(struct tomoyo_security *ptr);
 bool tomoyo_address_matches_group(const bool is_ipv6, const __be32 *address,
  const struct tomoyo_group *group);
 bool tomoyo_compare_number_union(const unsigned long value,
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index 7039302..de4c7f9 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -35,7 +35,7 @@ static void tomoyo_del_task_security(struct tomoyo_security *ptr)
  *
  * Returns nothing.
  */
-static void tomoyo_add_task_security(struct tomoyo_security *ptr)
+void tomoyo_add_task_security(struct tomoyo_security *ptr)
 {
    unsigned long flags;
    spin_lock_irqsave(&tomoyo_task_security_list_lock, flags);
@@ -84,6 +84,29 @@ static void tomoyo_task_free(struct task_struct *task)
 }
 
 /**
+ * tomoyo_task_alloc - Make snapshot of security context for new task.
+ *
+ * @p: Pointer to "struct task_struct".
+ *
+ * Returns 0 on success, negative value otherwise.
+ */
+static int tomoyo_task_alloc(struct task_struct *p)
+{
+   struct tomoyo_security *old_security = tomoyo_find_task_security();
+   struct tomoyo_security *new_security;
+
+   if (!old_security)
+           return 0;
+   new_security = kzalloc(sizeof(*new_security), GFP_KERNEL);
+   if (!new_security)
+           return -ENOMEM;
+   new_security->task = p;
+   new_security->tomoyo_flags = old_security->tomoyo_flags;
+   tomoyo_add_task_security(new_security);
+   return 0;
+}
+
+/**
  * tomoyo_bprm_committing_creds - Forget the proposed domain.
  *
  * @bprm: Pointer to "struct linux_binprm".
@@ -209,6 +232,12 @@ static int tomoyo_bprm_set_creds(struct linux_binprm *bprm)
 * execve operation.
 */
bprm->cred->security = NULL;
+   /* Clear the manager fla

[PATCH 3/4] TOMOYO: Remember the proposed domain while in execve() request.

2013-06-11 Thread Tetsuo Handa

From 2567f2c896e1fe57f096619cfe750ebc9fc2ad01 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 11 Jun 2013 21:35:29 +0900
Subject: [PATCH 3/4] TOMOYO: Remember the proposed domain while in execve() request.

Introduce per a task_struct variable which remembers the proposed domain
so that TOMOYO can find the proposed domain before execve() succeeds.

Signed-off-by: Tetsuo Handa 
---
 security/tomoyo/common.h |   34 +-
 security/tomoyo/tomoyo.c |  163 --
 2 files changed, 191 insertions(+), 6 deletions(-)

diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index b897d48..60e5800 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -36,6 +36,9 @@
 
 /** Constants definitions. **/
 
+/* Current thread is doing do_execve() ? */
+#define TOMOYO_TASK_IS_IN_EXECVE 1
+
 /*
  * TOMOYO uses this hash only when appending a string into the string
  * table. Frequency of appending strings is very low. So we don't need
@@ -398,6 +401,16 @@ enum tomoyo_pref_index {
 
 /** Structure definitions. **/
 
+/* Per a task_struct variable. */
+struct tomoyo_security {
+   struct list_head list;
+   const struct task_struct *task;
+   /* NULL unless tomoyo_flags has TOMOYO_TASK_IS_IN_EXECVE flag. */
+   struct tomoyo_domain_info *tomoyo_domain_info;
+   u32 tomoyo_flags;
+   struct rcu_head rcu;
+};
+
 /* Common header for holding ACL entries. */
 struct tomoyo_acl_head {
struct list_head list;
@@ -913,6 +926,7 @@ struct tomoyo_policy_namespace {
 
 /** Function prototypes. **/
 
+struct tomoyo_security *tomoyo_find_task_security(void);
 bool tomoyo_address_matches_group(const bool is_ipv6, const __be32 *address,
  const struct tomoyo_group *group);
 bool tomoyo_compare_number_union(const unsigned long value,
@@ -1202,7 +1216,25 @@ static inline void tomoyo_put_group(struct tomoyo_group *group)
  */
 static inline struct tomoyo_domain_info *tomoyo_domain(void)
 {
-   return current_cred()->security;
+   /*
+* Return the proposed domain stored in "struct linux_binprm *"
+* if current thread is in do_execve(). The proposed domain will be
+* cleared when do_execve() finished.
+*/
+   struct tomoyo_security *ptr = tomoyo_find_task_security();
+   return ptr && (ptr->tomoyo_flags & TOMOYO_TASK_IS_IN_EXECVE) ?
+   ptr->tomoyo_domain_info : current_cred()->security;
+}
+
+/**
+ * tomoyo_current_flags - Get flags for current thread.
+ *
+ * Returns flags for current thread.
+ */
+static inline u32 tomoyo_current_flags(void)
+{
+   struct tomoyo_security *ptr = tomoyo_find_task_security();
+   return ptr ? ptr->tomoyo_flags : 0;
 }
 
 /**
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index f0b756e..7039302 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -7,6 +7,110 @@
 #include 
 #include "common.h"
 
+/* List of "struct tomoyo_security" associated with "struct task_struct". */
+static LIST_HEAD(tomoyo_task_security_list);
+/* Lock for protecting tomoyo_task_security_list list. */
+static DEFINE_SPINLOCK(tomoyo_task_security_list_lock);
+
+/**
+ * tomoyo_del_task_security - Release "struct tomoyo_security".
+ *
+ * @ptr: Pointer to "struct tomoyo_security".
+ *
+ * Returns nothing.
+ */
+static void tomoyo_del_task_security(struct tomoyo_security *ptr)
+{
+   unsigned long flags;
+   spin_lock_irqsave(&tomoyo_task_security_list_lock, flags);
+   list_del_rcu(&ptr->list);
+   spin_unlock_irqrestore(&tomoyo_task_security_list_lock, flags);
+   kfree_rcu(ptr, rcu);
+}
+
+/**
+ * tomoyo_add_task_security - Add "struct tomoyo_security" to list.
+ *
+ * @ptr:  Pointer to "struct tomoyo_security".
+ *
+ * Returns nothing.
+ */
+static void tomoyo_add_task_security(struct tomoyo_security *ptr)
+{
+   unsigned long flags;
+   spin_lock_irqsave(&tomoyo_task_security_list_lock, flags);
+   list_add_rcu(&ptr->list, &tomoyo_task_security_list);
+   spin_unlock_irqrestore(&tomoyo_task_security_list_lock, flags);
+}
+
+/**
+ * tomoyo_find_task_security - Find "struct tomoyo_security" for current thread.
+ *
+ * Returns pointer to "struct tomoyo_security" if found, NULL otherwise.
+ */
+struct tomoyo_security *tomoyo_find_task_security(void)
+{
+   const struct task_struct *task = current;
+   struct tomoyo_security *ptr;
+   rcu_read_lock();
+   list_for_each_entry_rcu(ptr, &tomoyo_task_security_list, list) {
+           if (ptr->task != task)
+                   continue;
+           rcu_read_unlock();
+           return ptr;
+   }
+   rcu_read_unlock();
+   return NULL;
+}
+
+/**
+ * to

Re: [linux-next-20130422] Bug in SLAB?

2013-07-01 Thread Tetsuo Handa

Andrew Morton wrote:
> On Tue, 7 May 2013 14:28:49 + Christoph Lameter  wrote:
> 
> > On Tue, 7 May 2013, Tetsuo Handa wrote:
> > 
> > > > These are exclusively from the module load. So the kernel seems to be
> > > > clean of large kmalloc's ?
> > > >
> > > There are modules (e.g. TOMOYO) which do not check for KMALLOC_MAX_SIZE
> > > limit and expect kmalloc() larger than KMALLOC_MAX_SIZE bytes to return
> > > NULL.
> > 
> > Dont do that. Please fix these things.
> 
> Slab should return NULL for a request greater than KMALLOC_MAX_SIZE. 
> For heaven's sake don't break that!

The patch that fixes the above (commit 6286ae97) went into 3.10.



> What's going on with this bug, btw?  This:
> 
> --- a/mm/slab.c~slab-fix-init_lock_keys
> +++ a/mm/slab.c
> @@ -565,7 +565,7 @@ static void init_node_lock_keys(int q)
>   if (slab_state < UP)
>   return;
>  
> - for (i = 1; i < PAGE_SHIFT + MAX_ORDER; i++) {
> + for (i = 1; i <= KMALLOC_SHIFT_HIGH; i++) {
>   struct kmem_cache_node *n;
>   struct kmem_cache *cache = kmalloc_caches[i];
>  
> 
> still seems to be unapplied.
> 
The patch that fixes oops and panic on early boot on architectures with
PAGE_SHIFT + MAX_ORDER > 26 missed 3.10.

> I've read through the thread trying to work out what the end-user
> impact of that fix is, but it's all clear as mud.  It's possible that
> the end-user effect is `kernel locks up after printing "Booting the
> kernel"'.  Or maybe not.
> 
> And if the above patch does indeed fix something significant, we might
> need a -stable backport.
> 

Somebody needs this patch when debugging with CONFIG_LOCKDEP=y on
architectures with PAGE_SHIFT + MAX_ORDER > 26 .

> Can we get some clarity here please?
> 

Thank you for adding this to -mm. But please delete the

  Tetsuo said:
  
  : It hangs (with CPU#0 spinning) immediately after printing
  : 
  :   Decompressing Linux... Parsing ELF... done.
  :   Booting the kernel.
  : 
  : lines.

lines from "+ slab-fix-init_lock_keys.patch added to -mm tree", because the
symptom described there was fixed by commit 8a965b3b. Though the same symptom
would appear when hitting this PAGE_SHIFT + MAX_ORDER > 26 bug, I can't confirm
the symptom on environments which I don't have.

Re: [linux-next-20130422] Bug in SLAB?

2013-07-02 Thread Tetsuo Handa

Andrew Morton wrote:
> Look, I'll make this easier:
> 
> : Subject: slab: fix init_lock_keys
> : 
> : In 3.10 kernels with CONFIG_LOCKDEP=y on architectures with
> : PAGE_SHIFT + MAX_ORDER > 26 such as [architecture goes here], the kernel
> : does [x] when the user does [y].
> :
> : init_lock_keys() goes too far in initializing values in kmalloc_caches
> : because it assumed that the size of the kmalloc array goes up to
> : MAX_ORDER.  However, the size of the kmalloc array for SLAB may be
> : restricted due to increased page sizes or CONFIG_FORCE_MAX_ZONEORDER.
> :
> : Fix this by [z].
> 
> 
> Please fill in the text within [].
> 
OK. I wrote the following based on http://marc.info/?l=linux-kernel&m=136810234704350&w=2 .
-
Some architectures (e.g. powerpc built with CONFIG_PPC_256K_PAGES=y
CONFIG_FORCE_MAX_ZONEORDER=11) get PAGE_SHIFT + MAX_ORDER > 26.

In 3.10 kernels, CONFIG_LOCKDEP=y with PAGE_SHIFT + MAX_ORDER > 26 makes
init_lock_keys() dereference beyond kmalloc_caches[26].
This leads to an unbootable system (kernel panic while initializing SLAB)
if one of kmalloc_caches[26...PAGE_SHIFT+MAX_ORDER-1] is not NULL.

Fix this by making sure that init_lock_keys() does not dereference beyond
kmalloc_caches[26].
-
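
To make the loop-bound problem concrete, here is a small userspace sketch in
C. It is not kernel code: the cap of 25 and all sketch_* names are assumptions
chosen for illustration, and the real limits depend on the architecture and
kernel configuration. It only counts how many indices the old bound
(i < PAGE_SHIFT + MAX_ORDER) visits past the highest valid index, compared
with the new bound (i <= KMALLOC_SHIFT_HIGH).

```c
#include <assert.h>

/* Cap on the largest kmalloc size class used by this sketch (an assumption). */
#define SKETCH_SHIFT_CAP 25

/* Highest valid index of the cache array for a given configuration. */
int sketch_shift_high(int page_shift, int max_order)
{
    int top = page_shift + max_order - 1;

    return top <= SKETCH_SHIFT_CAP ? top : SKETCH_SHIFT_CAP;
}

/* Count indices visited by "for (i = 1; i < bound; i++)" beyond shift_high. */
int sketch_count_out_of_range(int bound_exclusive, int shift_high)
{
    int n = 0;

    assert(shift_high >= 0);
    for (int i = 1; i < bound_exclusive; i++)
        if (i > shift_high)
            n++;
    return n;
}

/* Old loop: upper bound is PAGE_SHIFT + MAX_ORDER (exclusive). */
int sketch_old_loop_overrun(int page_shift, int max_order)
{
    return sketch_count_out_of_range(page_shift + max_order,
                                     sketch_shift_high(page_shift, max_order));
}

/* New loop: upper bound is the highest valid index (inclusive). */
int sketch_new_loop_overrun(int page_shift, int max_order)
{
    int high = sketch_shift_high(page_shift, max_order);

    return sketch_count_out_of_range(high + 1, high);
}
```

With a hypothetical PAGE_SHIFT of 18 and MAX_ORDER of 11, the old loop in this
sketch walks three slots past the end, while the new loop stays in range. With
PAGE_SHIFT 12 and MAX_ORDER 11, both loops stay in range, which would explain
why common configurations never saw the problem.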

[PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-16 Thread Tetsuo Handa

I got the failure below.

  [5.258911] scsi 1:0:0:0: CD-ROMNECVMWar VMware IDE CDR10 1.00 PQ: 0 ANSI: 5
  [5.267651] modprobe (156) used greatest stack depth: 4064 bytes left
  [5.293607] Fusion MPT base driver 3.04.20
  [5.294416] Copyright (c) 1999-2008 LSI Corporation
  [5.300109] Fusion MPT SPI Host driver 3.04.20
  [5.300967] Switched to clocksource tsc
  [5.310921] mptbase: ioc0: Initiating bringup
  [5.329480] ioc0: LSI53C1030 B0: Capabilities={Initiator}
  [5.373136] scsi2 : ioc0: LSI53C1030 B0, FwRev=01032920h, Ports=1, MaxQ=128, IRQ=17
  [5.406190] scsi 2:0:0:0: Direct-Access VMware,  VMware Virtual S 1.0  PQ: 0 ANSI: 2
  [5.408582] scsi target2:0:0: Beginning Domain Validation
  [5.415451] scsi target2:0:0: Domain Validation skipping write tests
  [5.416399] scsi target2:0:0: Ending Domain Validation
  [5.417445] scsi target2:0:0: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
  [5.419855] scsi 2:0:1:0: Direct-Access VMware,  VMware Virtual S 1.0  PQ: 0 ANSI: 2
  [5.420404] scsi target2:0:1: Beginning Domain Validation
  [5.423171] scsi target2:0:1: Domain Validation skipping write tests
  [5.423363] scsi target2:0:1: Ending Domain Validation
  [5.424440] scsi target2:0:1: FAST-40 WIDE SCSI 80.0 MB/s ST (25 ns, offset 127)
  [5.439700] modprobe (211) used greatest stack depth: 3872 bytes left
  [5.561700] sr0: scsi3-mmc drive: 1x/1x writer dvd-ram cd/rw xa/form2 cdda tray
  [5.562596] cdrom: Uniform CD-ROM driver Revision: 3.20
  [5.667330] scsi_id (264) used greatest stack depth: 3568 bytes left
  FATAL: Module scsi_wait_scan not found.
  
  FATAL: Module scsi_wait_scan not found.
  
  FATAL: Module scsi_wait_scan not found.
  (...snipped...)
  [   60.308916] dracut Warning: Boot has failed. To debug this issue add "rdshell" to the kernel command line.
  [   60.311431] dracut Warning: Signal caught!

Kernel config is at http://I-love.SAKURA.ne.jp/tmp/config-3.11-rc1

Bisected to commit 2d31e518 "crypto: crct10dif - Wrap crc_t10dif function all
to use crypto transform framework", and confirmed that changing from
CONFIG_CRYPTO_CRCT10DIF=m to CONFIG_CRYPTO_CRCT10DIF=y solves this problem.
--
From ad396f0c049fe6d4ab14793d10367e32227c5991 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Tue, 16 Jul 2013 18:33:40 +0900
Subject: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

Commit 2d31e518 "crypto: crct10dif - Wrap crc_t10dif function all to use crypto
transform framework" was added without updating module dependency, breaking at
least one module which depends on CONFIG_CRYPTO_CRCT10DIF=y.

Signed-off-by: Tetsuo Handa 
---
 crypto/Kconfig |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 69ce573..aa8edba 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -377,7 +377,7 @@ config CRYPTO_CRC32_PCLMUL
  and gain better performance as compared with the table implementation.
 
 config CRYPTO_CRCT10DIF
-   tristate "CRCT10DIF algorithm"
+   bool "CRCT10DIF algorithm"
select CRYPTO_HASH
help
  CRC T10 Data Integrity Field computation is being cast as
-- 
1.7.1

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-16 Thread Tetsuo Handa

Herbert Xu wrote:
> Looks like a bug in whatever is creating the initrd as it isn't
> including modules necessary for the boot.

It turned out that things are already wrong at the time modules.dep is created.

  # grep crc /lib/modules/3.11.0-rc1/modules.dep
  kernel/crypto/crct10dif.ko:
  kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko
  kernel/lib/crc-t10dif.ko:

modules.dep says

  (1) sd_mod.ko depends on crc-t10dif.ko
  (2) crc-t10dif.ko does not depend on crct10dif.ko

but commit 2d31e518 made crc-t10dif.ko depend on crct10dif.ko , didn't it?

crct10dif.ko needs to be loaded before crc-t10dif.ko is loaded, but doing

diff --git a/lib/Kconfig b/lib/Kconfig
index 35da513..53ee0fd 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -68,6 +68,7 @@ config CRC_T10DIF
tristate "CRC calculation for the T10 Data Integrity Field"
select CRYPTO
select CRYPTO_CRCT10DIF
+   depends on CRYPTO_CRCT10DIF
help
  This option is only needed if a module that's not in the
  kernel tree needs to calculate CRC checks for use with the

causes the warning below.

  crypto/Kconfig:379: symbol CRYPTO_CRCT10DIF is selected by CRC_T10DIF
  warning: (BLK_DEV_SD && SCSI_LPFC && SCSI_DEBUG) selects CRC_T10DIF which has unmet direct dependencies (CRYPTO_CRCT10DIF)

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-17 Thread Tetsuo Handa

Tim Chen wrote:
> Herbert, seems like modules.dep generator wants explicit
> 
> - select CRYPTO_CRCT10DIF
> + depends on CRYPTO_CRCT10DIF
> 
> But it seems to me like it should have known CRC_T10DIF needs
> CRYPTO_CRCT10DIF when we do 
>   select CRYPTO_CRCT10DIF
> 
> Your thoughts?

"select" cannot tell /sbin/depmod that there is dependency, for /sbin/depmod
calculates dependencies by enumerating the symbols in each module rather than by
parsing Kconfig files, which would depend on "kernel-source_files_installed = y".
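
As a rough illustration of symbol-based dependency calculation, the following
userspace C sketch decides "A depends on B" purely from whether A has an
undefined symbol that B exports. This is not depmod's actual code. The struct
layout, the sketch_* names and the simplified symbol lists are assumptions made
for this example. The point is that a module which finds its provider by a
name string at run time (as crc-t10dif.ko does with
crypto_alloc_shash("crct10dif", 0, 0)) has no symbol reference that such a
tool could see.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_SYMS 4

/* One module as a symbol-based tool would see it (hypothetical layout). */
struct sketch_module {
    const char *name;
    const char *exports[SKETCH_MAX_SYMS]; /* symbols provided, NULL-terminated */
    const char *needs[SKETCH_MAX_SYMS];   /* undefined symbols, NULL-terminated */
};

/* Illustrative table; the symbol lists are simplified assumptions. */
const struct sketch_module sketch_mods[] = {
    /* sd_mod calls crc_t10dif(), so it has an undefined reference to it. */
    { "sd_mod",     { NULL },               { "crc_t10dif", NULL } },
    /* crc-t10dif looks up "crct10dif" by string at run time: no symbol. */
    { "crc-t10dif", { "crc_t10dif", NULL }, { NULL } },
    /* crct10dif only registers an algorithm; nothing links against it. */
    { "crct10dif",  { NULL },               { NULL } },
};

/* Return 1 if "user" needs at least one symbol that "provider" exports. */
int sketch_depends_on(const struct sketch_module *user,
                      const struct sketch_module *provider)
{
    assert(user != NULL && provider != NULL);
    for (size_t i = 0; i < SKETCH_MAX_SYMS && user->needs[i]; i++)
        for (size_t j = 0; j < SKETCH_MAX_SYMS && provider->exports[j]; j++)
            if (strcmp(user->needs[i], provider->exports[j]) == 0)
                return 1;
    return 0;
}
```

In this sketch sd_mod is found to depend on crc-t10dif, but crc-t10dif is not
found to depend on crct10dif, which matches the modules.dep output shown
earlier in this thread.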

Therefore, I think possible solutions are either

  (a) build the dependent modules into the kernel

  # grep crc /lib/modules/3.11.0-rc1+/modules.dep
  kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko
  kernel/lib/crc-t10dif.ko:

or

  (b) embed explicit reference to the dependent module's symbols

  # grep crc /lib/modules/3.11.0-rc1+/modules.dep
  kernel/arch/x86/crypto/crct10dif-pclmul.ko: kernel/crypto/crct10dif.ko
  kernel/crypto/crct10dif.ko:
  kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko kernel/arch/x86/crypto/crct10dif-pclmul.ko kernel/crypto/crct10dif.ko
  kernel/lib/crc-t10dif.ko: kernel/arch/x86/crypto/crct10dif-pclmul.ko kernel/crypto/crct10dif.ko

.

Two patches ((a) and (b) respectively) follow, but I think patch (b) will not
work unless the additional change

  static int __init crct10dif_intel_mod_init(void)
  {
  if (x86_match_cpu(crct10dif_cpu_id))
  return crypto_register_shash(&alg);
  return 0;
  }

  static void __exit crct10dif_intel_mod_fini(void)
  {
  if (x86_match_cpu(crct10dif_cpu_id))
  crypto_unregister_shash(&alg);
  }

is made, because currently crct10dif-pclmul.ko cannot be loaded on
!X86_FEATURE_MATCH(X86_FEATURE_PCLMULQDQ) systems.



From d8d9b7c3e5be9c5a6198dac6fe7279ca904343a8 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Wed, 17 Jul 2013 19:45:19 +0900
Subject: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

Commit 2d31e518 "crypto: crct10dif - Wrap crc_t10dif function all to use crypto
transform framework" was added without telling that "crc-t10dif.ko depends on
crct10dif.ko". This resulted in boot failure because "sd_mod.ko depends on
crc-t10dif.ko" but "crct10dif.ko is not loaded automatically".

Fix this by changing crct10dif.ko and crct10dif-pclmul.ko from "tristate" to
"bool" so that suitable version is chosen.

Signed-off-by: Tetsuo Handa 
---
 crypto/Kconfig |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 69ce573..dd3b79e 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -377,7 +377,7 @@ config CRYPTO_CRC32_PCLMUL
  and gain better performance as compared with the table implementation.
 
 config CRYPTO_CRCT10DIF
-   tristate "CRCT10DIF algorithm"
+   bool "CRCT10DIF algorithm"
select CRYPTO_HASH
help
  CRC T10 Data Integrity Field computation is being cast as
@@ -385,7 +385,7 @@ config CRYPTO_CRCT10DIF
  transforms to be used if they are available.
 
 config CRYPTO_CRCT10DIF_PCLMUL
-   tristate "CRCT10DIF PCLMULQDQ hardware acceleration"
+   bool "CRCT10DIF PCLMULQDQ hardware acceleration"
depends on X86 && 64BIT && CRC_T10DIF
select CRYPTO_HASH
help
-- 
1.7.1



From 153e209fc9a7e1df42555829c396ee9ed53e83d0 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Wed, 17 Jul 2013 20:23:15 +0900
Subject: [PATCH] [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

Commit 2d31e518 "crypto: crct10dif - Wrap crc_t10dif function all to use crypto
transform framework" was added without telling that "crc-t10dif.ko depends on
crct10dif.ko". This resulted in boot failure because "sd_mod.ko depends on
crc-t10dif.ko" but "crct10dif.ko is not loaded automatically".

Fix this by describing "crc-t10dif.ko depends on crct10dif.ko".

Signed-off-by: Tetsuo Handa 
---
 arch/x86/crypto/crct10dif-pclmul_glue.c |6 ++
 lib/crc-t10dif.c|7 +++
 2 files changed, 13 insertions(+), 0 deletions(-)

diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 7845d7f..2964608 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -149,3 +149,9 @@ MODULE_LICENSE("GPL");
 
 MODULE_ALIAS("crct10dif");
 MODULE_ALIAS("crct10dif-pclmul");
+
+/* Dummy for describing module dependency. */
+#if defined(CONFIG_CRYPTO_CRCT10DIF_PCLMUL_MODULE)
+const char crct10dif_pclmul;
+EXPORT_SYMBOL(crct10dif_pclmul);
+#

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-17 Thread Tetsuo Handa

Tim Chen wrote:
> > Therefore, I think possible solutions are either
> > 
> >   (a) built-in the dependent modules
> > 
> >   # grep crc /lib/modules/3.11.0-rc1+/modules.dep
> >   kernel/drivers/scsi/sd_mod.ko: kernel/lib/crc-t10dif.ko
> >   kernel/lib/crc-t10dif.ko:
> 
> This approach will increase kernel size for those who are not using
> t10dif so some people may object.  
> BTW, The PCLMULQDQ version does not need to be build in.

sd_mod.ko must choose one of the versions available at the time sd_mod.ko is
loaded. Although building in the PCLMULQDQ version is not needed for fixing the
boot failure, it is needed for getting the performance improvement when
PCLMULQDQ is supported.

> Your approach is quite complicated.  I think something simpler like the
> following will work:

We cannot benefit from PCLMULQDQ. Is it acceptable for you?

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-17 Thread Tetsuo Handa

Tim Chen wrote:
> > > Your approach is quite complicated.  I think something simpler like the
> > > following will work:
> >
> > We cannot benefit from PCLMULQDQ. Is it acceptable for you?
> 
> 
> The following code in crct10dif-pclmul_glue.c
> 
> static const struct x86_cpu_id crct10dif_cpu_id[] = {
> X86_FEATURE_MATCH(X86_FEATURE_PCLMULQDQ),
> {}
> };
> MODULE_DEVICE_TABLE(x86cpu, crct10dif_cpu_id);
> 
> will put the module in the device table and get the module
> loaded, as long as the cpu support PCLMULQDQ. So we should be able
> to benefit.

Excuse me, how can crct10dif-pclmul.ko get loaded automatically?
Did you test CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m with the debug messages below?

diff --git a/arch/x86/crypto/crct10dif-pclmul_glue.c b/arch/x86/crypto/crct10dif-pclmul_glue.c
index 7845d7f..a8a95aa 100644
--- a/arch/x86/crypto/crct10dif-pclmul_glue.c
+++ b/arch/x86/crypto/crct10dif-pclmul_glue.c
@@ -129,9 +129,10 @@ MODULE_DEVICE_TABLE(x86cpu, crct10dif_cpu_id);

 static int __init crct10dif_intel_mod_init(void)
 {
+   printk(KERN_WARNING "** Checking for X86_FEATURE_PCLMULQDQ\n");
if (!x86_match_cpu(crct10dif_cpu_id))
return -ENODEV;
-
+   printk(KERN_WARNING "** Registering crct10dif-pclmul\n");
return crypto_register_shash(&alg);
 }

As far as I tested, crct10dif-pclmul.ko will not be loaded unless
"modprobe crct10dif-pclmul" is manually added to the initramfs's /init or
CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y is chosen.

> So as long as the crct10dif.ko and crct10dif-pclmul.ko are loaded,
> the pclmulqdq t10dif will have a higher priority and get allocated
> and used.

What I'm talking about is:

  (1) Since mkinitramfs is unable to know that crct10dif-pclmul.ko has higher
  priority than crct10dif.ko , mkinitramfs will not include
  "modprobe crct10dif-pclmul" line in the generated initramfs.

  (2) In order to get benefit from PCLMULQDQ, users have to manually make sure
  that "modprobe crct10dif-pclmul" is called before crc-t10dif.ko (which is
  loaded before sd_mod.ko is loaded) is loaded by their initramfs's /init
  script.

  (3) The cause of (1) is that crct10dif-pclmul.ko will not be loaded
  automatically unless choosing CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y.

  (4) The cause of (3) is that modules.dep does not describe that users will
  benefit by loading crct10dif-pclmul.ko before loading crc-t10dif.ko .

  (5) Currently crct10dif-pclmul.ko cannot be loaded if PCLMULQDQ is not
  supported. This leads to boot failure (since sd_mod.ko cannot be loaded)
  if modules.dep says that "crct10dif-pclmul.ko is required by
  crc-t10dif.ko".

  (6) To solve (4) and (5), modules.dep should say "crct10dif-pclmul.ko is
  preferred for crc-t10dif.ko but is not required by crc-t10dif.ko".
      But there is no such mechanism. Thus, the currently available choices
      are "allow loading crct10dif-pclmul.ko even if PCLMULQDQ is not
      supported" or "ignore errors by building crct10dif-pclmul.ko in".

My patch (b) seems to be complicated but is required in order to solve (4)
without asking users to manually add "modprobe crct10dif-pclmul" into their
initramfs. If we choose patch (b) rather than patch (a), we need to solve (5).
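
The ordering problem in (1) and (2) can be sketched in userspace C as follows.
This is only a model of "pick the highest-priority implementation among those
registered at the moment of the lookup". It is not the crypto API itself, and
the struct, the sketch_pick() helper and the priority numbers are assumptions
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One implementation of an algorithm (hypothetical layout). */
struct sketch_alg {
    const char *alg_name;    /* name used for lookup, e.g. "crct10dif" */
    const char *driver_name; /* which implementation this is */
    int priority;            /* higher wins; values here are illustrative */
    int registered;          /* non-zero once its module has been loaded */
};

/*
 * Return the registered implementation of alg_name with the highest
 * priority, or NULL if none is registered at the time of the call.
 */
const struct sketch_alg *sketch_pick(const struct sketch_alg *algs, size_t n,
                                     const char *alg_name)
{
    const struct sketch_alg *best = NULL;

    assert(algs != NULL && alg_name != NULL);
    for (size_t i = 0; i < n; i++) {
        if (!algs[i].registered)
            continue;
        if (strcmp(algs[i].alg_name, alg_name) != 0)
            continue;
        if (best == NULL || algs[i].priority > best->priority)
            best = &algs[i];
    }
    return best;
}
```

If only the generic entry is registered when the lookup happens, the generic
one is returned even though a higher-priority entry exists in the table. The
lookup result depends on what has been loaded before the lookup, which is why
the load order in the initramfs matters.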

[PATCH v3] KPortReserve : kernel version of portreserve utility

2013-08-21 Thread Tetsuo Handa

Hello.

A good summary on this proposal written by Jake Edge is available at
http://lwn.net/SubscriberLink/563178/c8a2e2fd4a794a9e/ .

Changes from version 2:

(1) Report the number of rejections, the name of the process and its pid, at
    most once per minute, in order to be able to figure out unexpected
    rejections which could be caused by misconfiguration / misunderstanding.

Aug 21 21:28:38 localhost kernel: [  139.438347] KPortReserve:(#1): Rejected bind(22) by /root/testapp1 (pid=4636)
Aug 21 21:31:25 localhost kernel: [  306.755200] KPortReserve:(#3): Rejected bind(80) by /root/testapp2 (pid=4688)

(2) Updated Kconfig help.

Regards.

From efc84232e6df17ad0a7359fb9f4b72b4f4a02ed6 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa 
Date: Wed, 21 Aug 2013 21:19:28 +0900
Subject: [PATCH v3] KPortReserve : kernel version of portreserve utility

This module reserves local ports like /proc/sys/net/ipv4/ip_local_reserved_ports
does, but this module is designed for stopping bind() requests with non-zero
local port numbers from unwanted programs.

Signed-off-by: Tetsuo Handa 
---
 security/Kconfig   |6 +
 security/Makefile  |2 +
 security/kportreserve/Kconfig  |   43 +++
 security/kportreserve/Makefile |1 +
 security/kportreserve/kpr.c|  573 
 5 files changed, 625 insertions(+), 0 deletions(-)
 create mode 100644 security/kportreserve/Kconfig
 create mode 100644 security/kportreserve/Makefile
 create mode 100644 security/kportreserve/kpr.c

diff --git a/security/Kconfig b/security/Kconfig
index e9c6ac7..f4058ff 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -122,6 +122,7 @@ source security/smack/Kconfig
 source security/tomoyo/Kconfig
 source security/apparmor/Kconfig
 source security/yama/Kconfig
+source security/kportreserve/Kconfig
 
 source security/integrity/Kconfig
 
@@ -132,6 +133,7 @@ choice
default DEFAULT_SECURITY_TOMOYO if SECURITY_TOMOYO
default DEFAULT_SECURITY_APPARMOR if SECURITY_APPARMOR
default DEFAULT_SECURITY_YAMA if SECURITY_YAMA
+   default DEFAULT_SECURITY_KPR if SECURITY_KPR
default DEFAULT_SECURITY_DAC
 
help
@@ -153,6 +155,9 @@ choice
config DEFAULT_SECURITY_YAMA
bool "Yama" if SECURITY_YAMA=y
 
+   config DEFAULT_SECURITY_KPR
+   bool "KPortReserve" if SECURITY_KPR=y
+
config DEFAULT_SECURITY_DAC
bool "Unix Discretionary Access Controls"
 
@@ -165,6 +170,7 @@ config DEFAULT_SECURITY
default "tomoyo" if DEFAULT_SECURITY_TOMOYO
default "apparmor" if DEFAULT_SECURITY_APPARMOR
default "yama" if DEFAULT_SECURITY_YAMA
+   default "kpr" if DEFAULT_SECURITY_KPR
default "" if DEFAULT_SECURITY_DAC
 
 endmenu
diff --git a/security/Makefile b/security/Makefile
index c26c81e..87f95cc 100644
--- a/security/Makefile
+++ b/security/Makefile
@@ -8,6 +8,7 @@ subdir-$(CONFIG_SECURITY_SMACK) += smack
 subdir-$(CONFIG_SECURITY_TOMOYO)+= tomoyo
 subdir-$(CONFIG_SECURITY_APPARMOR) += apparmor
 subdir-$(CONFIG_SECURITY_YAMA) += yama
+subdir-$(CONFIG_SECURITY_KPR)  += kportreserve
 
 # always enable default capabilities
 obj-y  += commoncap.o
@@ -23,6 +24,7 @@ obj-$(CONFIG_AUDIT)   += lsm_audit.o
 obj-$(CONFIG_SECURITY_TOMOYO)  += tomoyo/built-in.o
 obj-$(CONFIG_SECURITY_APPARMOR)+= apparmor/built-in.o
 obj-$(CONFIG_SECURITY_YAMA)+= yama/built-in.o
+obj-$(CONFIG_SECURITY_KPR) += kportreserve/built-in.o
 obj-$(CONFIG_CGROUP_DEVICE)+= device_cgroup.o
 
 # Object integrity file lists
diff --git a/security/kportreserve/Kconfig b/security/kportreserve/Kconfig
new file mode 100644
index 000..41049ae
--- /dev/null
+++ b/security/kportreserve/Kconfig
@@ -0,0 +1,43 @@
+config SECURITY_KPR
+   bool "KPortReserve support"
+   depends on SECURITY
+   select SECURITY_NETWORK
+   select SECURITY_FS
+   default n
+   help
+ This selects local port reserving module which is similar to
+ /proc/sys/net/ipv4/ip_local_reserved_ports . But this module is
+ designed for stopping bind() requests with non-zero local port
+ numbers from unwanted programs using white list reservations.
+
+ If you are unsure how to answer this question, answer N.
+
+ Specifications:
+
+ Use "$port $identifier" format to add reservation.
+ Use "del $port $identifier" format to remove reservation.
+
+ The $port is a single port number between 0 and 65535.
+ The $identifier is an identifier word in TOMOYO's string
+ representation rule (i.e. consists with only ASCII printable
+ characters). Upon successful execve() operation,

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-19 Thread Tetsuo Handa

Tim Chen wrote:
> On Fri, 2013-07-19 at 16:37 -0700, Tim Chen wrote:
> > Herbert,
> > 
> > I've tried the module alias approach (see my earlier mail with patch) 
> > but it didn't seem to load things properly.  Can you check to see if 
> > there's anything I did incorrectly.
> > 
> > Tim
> 
> I fixed a missing request_module statement in crct10dif library.  
> So now things work if I have the following config:
> 
> CONFIG_CRYPTO_CRCT10DIF=m
> CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
> CONFIG_CRC_T10DIF=m
> 
> However, when I have the library and generic algorithm compiled in,
> I do not see the PCLMULQDQ version loaded.
> 
> CONFIG_CRYPTO_CRCT10DIF=y
> CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
> CONFIG_CRC_T10DIF=y
> 
> Perhaps I am initiating the crct10dif library at a really early
> stage when things are compiled in, where the module is not in 
> initramfs?  In that case, perhaps we should only allow 
> PCLMUL version to be compiled in
> and not exist as a module?

I think that use of request_module("crct10dif") does not help loading
crct10dif-pclmul.ko when CONFIG_CRC_T10DIF=y CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m ,
because there is no / directory (note that the initramfs is not yet mounted as /
for loading modules which are not in vmlinux) when the module_init() functions
which are in vmlinux are called.

Re: [PATCH 3.11-rc1] crypto: Fix boot failure due to module dependency.

2013-07-19 Thread Tetsuo Handa

Herbert Xu wrote:
> On Fri, Jul 19, 2013 at 06:31:04PM -0700, Tim Chen wrote:
> >
> > However, when I have the library and generic algorithm compiled in,
> > I do not see the PCLMULQDQ version loaded.
> > 
> > CONFIG_CRYPTO_CRCT10DIF=y
> > CONFIG_CRYPTO_CRCT10DIF_PCLMUL=m
> > CONFIG_CRC_T10DIF=y
> 
> That is completely expected.  I don't really think we need to
> do anything about this case.  After all, if the admin wants to
> use the optimised version for CRC_T10DIF then they could simply
> compile that in as well.
> 

Wow! ;-)

But I'd expect something like

 static int __init crc_t10dif_mod_init(void)
 {
+#if !defined(CONFIG_CRC_T10DIF_MODULE) && defined(CONFIG_CRYPTO_CRCT10DIF_PCLMUL_MODULE)
+   printk(KERN_WARNING "Consider CONFIG_CRYPTO_CRCT10DIF_PCLMUL=y for better performance\n");
+#endif
crct10dif_tfm = crypto_alloc_shash("crct10dif", 0, 0);
return PTR_RET(crct10dif_tfm);
 }

because the admin might not be aware of this implication.

Re: [3.7-rc6] Build failure with scripts/Makefile.headersinst

2013-03-03 Thread Tetsuo Handa

Sam Ravnborg wrote:
> On Wed, Nov 28, 2012 at 01:55:14PM +, David Howells wrote:
> > Tetsuo Handa  wrote:
> > 
> > > Tetsuo Handa wrote:
> > > > Linux 3.6 builds fine. I can't use "git bisect" until Linux 3.7-rc6 but
> > > > possibly caused by either commit 10b63956 "UAPI: Plumb the UAPI Kbuilds
> > > > into the user header installation and checking" or commit 40f1d4c2
> > > > "UAPI: Remove the objhdr-y export list".
> > > 
> > > Bisected to commit 10b63956 "UAPI: Plumb the UAPI Kbuilds into the user
> > > header installation and checking".
> > 
> > Indeed, but that doesn't help much.  The problem is that make's behaviour
> > has apparently changed.  I could do with Sam Ravnborg's help to work around
> > this since I think he's mainly responsible for the Makefile infrastructure.
> 
> Michal is the kbuild person these days.
> Anyway - is this still relevant?
> 
Yes, linux.git as of commit a7c1120d "Merge tag 'ext4_for_linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4" still has this
problem.

> I took a quick look and my suspect is the use of $(or ..),
> as this feature was added recently to make.
> 
>   Sam
> 

Michal, this thread starts at https://lkml.org/lkml/2012/11/17/71 .

Regards.

Re: [3.7-rc6] Build failure with scripts/Makefile.headersinst

2013-03-03 Thread Tetsuo Handa

Sam Ravnborg wrote:
> On Mon, Mar 04, 2013 at 12:07:31AM +0900, Tetsuo Handa wrote:
> > Sam Ravnborg wrote:
> > > On Wed, Nov 28, 2012 at 01:55:14PM +, David Howells wrote:
> > > > Tetsuo Handa  wrote:
> > > > 
> > > > > > Tetsuo Handa wrote:
> > > > > > > Linux 3.6 builds fine. I can't use "git bisect" until Linux
> > > > > > > 3.7-rc6 but possibly caused by either commit 10b63956 "UAPI:
> > > > > > > Plumb the UAPI Kbuilds into the user header installation and
> > > > > > > checking" or commit 40f1d4c2 "UAPI: Remove the objhdr-y export
> > > > > > > list".
> > > > > > 
> > > > > > Bisected to commit 10b63956 "UAPI: Plumb the UAPI Kbuilds into the
> > > > > > user header installation and checking".
> > > > > 
> > > > > Indeed, but that doesn't help much.  The problem is that make's
> > > > > behaviour has apparently changed.  I could do with Sam Ravnborg's
> > > > > help to work around this since I think he's mainly responsible for
> > > > > the Makefile infrastructure.
> > > 
> > > > Michal is the kbuild person these days.
> > > > Anyway - is this still relevant?
> > > 
> > Yes. As of commit a7c1120d "Merge tag 'ext4_for_linus' of
> > git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4", linux.git still
> > has this problem.
> > 
> > > I took a quick look and my suspect is the use of $(or ..),
> > > as this feature was added recently to make.
> 
> Hi Tetsuo.
> 
> If my guess is correct this patch should help.
> I have gmake 3.81 and I do not see the aforementioned problem.

Yes, I'm using 3.80 and this patch fixes my problem. Thank you.

> 
>   Sam
> 
> diff --git a/scripts/Makefile.headersinst b/scripts/Makefile.headersinst
> index 25f216a..477d137 100644
> --- a/scripts/Makefile.headersinst
> +++ b/scripts/Makefile.headersinst
> @@ -14,7 +14,7 @@ kbuild-file := $(srctree)/$(obj)/Kbuild
>  include $(kbuild-file)
>  
>  # called may set destination dir (when installing to asm/)
> -_dst := $(or $(destination-y),$(dst),$(obj))
> +_dst := $(if $(destination-y),$(destination-y),$(if $(dst),$(dst),$(obj)))
>  
>  old-kbuild-file := $(srctree)/$(subst uapi/,,$(obj))/Kbuild
>  ifneq ($(wildcard $(old-kbuild-file)),)
> @@ -48,13 +48,14 @@ all-files := $(header-y) $(genhdr-y) $(wrapper-files)
>  output-files  := $(addprefix $(installdir)/, $(all-files))
>  
>  input-files   := $(foreach hdr, $(header-y), \
> -$(or \
> +$(if $(wildcard $(srcdir)/$(hdr)), \
>   $(wildcard $(srcdir)/$(hdr)), \
> - $(wildcard $(oldsrcdir)/$(hdr)), \
> - $(error Missing UAPI file $(srcdir)/$(hdr)) \
> + $(if $(wildcard $(oldsrcdir)/$(hdr)), \
> + $(wildcard $(oldsrcdir)/$(hdr)), \
> + $(error Missing UAPI file $(srcdir)/$(hdr))) \
>  )) \
>$(foreach hdr, $(genhdr-y), \
> -$(or \
> +$(if $(wildcard $(gendir)/$(hdr)), \
>   $(wildcard $(gendir)/$(hdr)), \
>   $(error Missing generated UAPI file $(gendir)/$(hdr)) \
>  ))
> 

[3.9-rc1] Bug in bootup code or debug code?

2013-03-04 Thread Tetsuo Handa

Tetsuo Handa wrote:
> Hello.
> 
> I can boot linux-next-20130205 using kernel config at
> http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
> 
> Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> kernel config generated by "make allnoconfig", I guess something is wrong with
> code which is executed at very early stage of bootup.
> 
> Any clue?
> 
> Regards.
> 

This bug is not yet fixed as of 3.9-rc1.
Should I run git bisect?

Regards.

[3.9-rc1 x86/microcode] Bug in CONFIG_MICROCODE_INTEL_EARLY=y

2013-03-05 Thread Tetsuo Handa

Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Hello.
> > 
> > I can boot linux-next-20130205 using kernel config at
> > http://I-love.SAKURA.ne.jp/tmp/config-3.8-rc6-next-20130205 .
> > But I get VMware's virtual machine kernel stack fault (hardware reset) as soon
> > as kernel is loaded if CONFIG_DEBUG_VIRTUAL=y is added to the config above.
> > 
> > Since I don't get kernel stack fault if CONFIG_DEBUG_VIRTUAL=y is added to
> > kernel config generated by "make allnoconfig", I guess something is wrong with
> > code which is executed at very early stage of bootup.
> > 
> > Any clue?
> > 
> > Regards.
> > 
> 
> This bug is not yet fixed as of 3.9-rc1.
> Should I run git bisect?
> 
> Regards.
> 
I couldn't find the exact commit due to a build failure, but my guess is that
this problem is triggered by the early microcode loading changes, because it
happens only with CONFIG_MICROCODE_INTEL_EARLY=y on an x86_32 kernel.

Candidate commits are:

  086fc8f8 "x86/tlbflush.h: Define __native_flush_tlb_global_irq_disabled()"
  e666dfa2 "x86/microcode_intel_lib.c: Early update ucode on Intel's CPU"
  a8ebf6d1 "x86/microcode_core_early.c: Define interfaces for early loading ucode"
  ec400dde "x86/microcode_intel_early.c: Early update ucode on Intel's CPU"
  63b553c6 "x86/head_32.S: Early update ucode in 32-bit"
  e6ebf5de "x86/common.c: load ucode in 64 bit or show loading ucode info in 32 bit on AP"
  d288e1cf "x86/common.c: Make have_cpuid_p() a global function"
  feddc9de "x86/head64.c: Early update ucode in 64-bit"
  9cd4d78e "x86/microcode_intel.h: Define functions and macros for early loading ucode"
  cd745be8 "x86/mm/init.c: Copy ucode from initrd image to kernel memory"
  da76f64e "x86/Kconfig: Make early microcode loading a configuration feature"

I'm using Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz on VMware Player 4.0.5.

Regards.

[3.9-rc1 x86] Bug in ioremap code?

2013-03-05 Thread Tetsuo Handa

Another problem

[0.021748] Mount-cache hash table entries: 512
[0.036341] Disabled fast string operations
[0.037760] mce: CPU supports 0 MCE banks
[0.039813] Last level iTLB entries: 4KB 128, 2MB 4, 4MB 4
[0.039813] Last level dTLB entries: 4KB 256, 2MB 0, 4MB 32
[0.039813] tlb_flushall_shift: -1
[0.074005] debug: unmapping init [mem 0xc186a000-0xc186efff]
[0.077005] ACPI: Core revision 20121018
[0.083350] ------------[ cut here ]------------
[0.084000] kernel BUG at arch/x86/mm/physaddr.c:79!
[0.084000] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
[0.084000] Modules linked in:
[0.084000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc5-00105-g68d00bb #47 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[0.084000] EIP: 0060:[] EFLAGS: 00010206 CPU: 0
[0.084000] EIP is at __phys_addr+0x42/0x90
[0.084000] EAX:  EBX: 1fef ECX: 000c EDX: 
[0.084000] ESI: c1657edc EDI: 000f EBP: c1657dcc ESP: c1657dc8
[0.084000]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
[0.084000] CR0: 8005003b CR2: ffe13000 CR3: 01872000 CR4: 06d0
[0.084000] DR0:  DR1:  DR2:  DR3: 
[0.084000] DR6: 0ff0 DR7: 0400
[0.084000] Process swapper/0 (pid: 0, ti=c1656000 task=c1661180 task.ti=c1656000)
[0.084000] Stack:
[0.084000]  c1657e90 c1657dec c102ca3e c1661608 c166f700  c166f700 c1657df0
[0.084000]   c1657e70 c102ceee c10d7899 0002 c1655000  c10d7632
[0.084000]  0001 0dfc 02f0 c1657e60 c14ab000 000f 0110 c1657e90
[0.084000] Call Trace:
[0.084000]  [] __cpa_process_fault+0x3e/0x80
[0.084000]  [] __change_page_attr_set_clr+0x3de/0x6d0
[0.084000]  [] ? __purge_vmap_area_lazy+0x2a9/0x360
[0.084000]  [] ? __purge_vmap_area_lazy+0x42/0x360
[0.084000]  [] ? vm_unmap_aliases+0x2bc/0x300
[0.084000]  [] ? vm_unmap_aliases+0x64/0x300
[0.084000]  [] change_page_attr_set_clr+0xe5/0x390
[0.084000]  [] _set_memory_wb+0x32/0x40
[0.084000]  [] ioremap_change_attr+0xf/0x40
[0.084000]  [] kernel_map_sync_memtype+0x87/0xf0
[0.084000]  [] __ioremap_caller+0x21b/0x2f0
[0.084000]  [] ? walk_system_ram_range+0xca/0xf0
[0.084000]  [] ioremap_cache+0x13/0x20
[0.084000]  [] ? acpi_os_map_memory+0xb6/0x112
[0.084000]  [] acpi_os_map_memory+0xb6/0x112
[0.084000]  [] acpi_tb_verify_table+0x20/0x49
[0.084000]  [] acpi_load_tables+0x35/0x13e
[0.084000]  [] acpi_early_init+0x67/0xeb
[0.084000]  [] start_kernel+0x30e/0x319
[0.084000]  [] ? repair_env_string+0x5b/0x5b
[0.084000]  [] i386_start_kernel+0x12c/0x12f
[0.084000] Code: 0c db c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 be cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 8c 0b db
[0.084000] EIP: [] __phys_addr+0x42/0x90 SS:ESP 0068:c1657dc8
[0.085033] ---[ end trace bd778c4c9eceaf67 ]---
[0.088242] Kernel panic - not syncing: Attempted to kill the idle task!

was found using http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1 and was bisected
to commit 68d00bbe "Merge remote-tracking branch 'origin/x86/mm' into x86/mm2".

Regards.

Re: [3.9-rc1 x86] Bug in ioremap code?

2013-03-05 Thread Tetsuo Handa

Borislav Petkov wrote:
> + Dave.
> 
> This still says 3.8.0-rc5-00105-g68d00bb. Can you still trigger this
> with 3.9-rc1?

Yes, since I saw it in 3.9-rc1, I ran "git bisect" starting from 3.9-rc1
and below is the output from the first bad commit.

> 
> And also, this is Linux running as a 32-bit guest in vmware, correct?
> 
> On Wed, Mar 06, 2013 at 12:41:10AM +0900, Tetsuo Handa wrote:
> > Another problem
> > 
> > [0.021748] Mount-cache hash table entries: 512
> > [0.036341] Disabled fast string operations
> > [0.037760] mce: CPU supports 0 MCE banks
> > [0.039813] Last level iTLB entries: 4KB 128, 2MB 4, 4MB 4
> > [0.039813] Last level dTLB entries: 4KB 256, 2MB 0, 4MB 32
> > [0.039813] tlb_flushall_shift: -1
> > [0.074005] debug: unmapping init [mem 0xc186a000-0xc186efff]
> > [0.077005] ACPI: Core revision 20121018
> > [0.083350] ------------[ cut here ]------------
> > [0.084000] kernel BUG at arch/x86/mm/physaddr.c:79!
> > [0.084000] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
> > [0.084000] Modules linked in:
> > [0.084000] Pid: 0, comm: swapper/0 Not tainted 3.8.0-rc5-00105-g68d00bb #47 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
> > [0.084000] EIP: 0060:[] EFLAGS: 00010206 CPU: 0
> > [0.084000] EIP is at __phys_addr+0x42/0x90
> > [0.084000] EAX:  EBX: 1fef ECX: 000c EDX: 
> > [0.084000] ESI: c1657edc EDI: 000f EBP: c1657dcc ESP: c1657dc8
> > [0.084000]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
> > [0.084000] CR0: 8005003b CR2: ffe13000 CR3: 01872000 CR4: 06d0
> > [0.084000] DR0:  DR1:  DR2:  DR3: 
> > [0.084000] DR6: 0ff0 DR7: 0400
> > [0.084000] Process swapper/0 (pid: 0, ti=c1656000 task=c1661180 task.ti=c1656000)
> > [0.084000] Stack:
> > [0.084000]  c1657e90 c1657dec c102ca3e c1661608 c166f700  c166f700 c1657df0
> > [0.084000]   c1657e70 c102ceee c10d7899 0002 c1655000  c10d7632
> > [0.084000]  0001 0dfc 02f0 c1657e60 c14ab000 000f 0110 c1657e90
> > [0.084000] Call Trace:
> > [0.084000]  [] __cpa_process_fault+0x3e/0x80
> > [0.084000]  [] __change_page_attr_set_clr+0x3de/0x6d0
> > [0.084000]  [] ? __purge_vmap_area_lazy+0x2a9/0x360
> > [0.084000]  [] ? __purge_vmap_area_lazy+0x42/0x360
> > [0.084000]  [] ? vm_unmap_aliases+0x2bc/0x300
> > [0.084000]  [] ? vm_unmap_aliases+0x64/0x300
> > [0.084000]  [] change_page_attr_set_clr+0xe5/0x390
> > [0.084000]  [] _set_memory_wb+0x32/0x40
> > [0.084000]  [] ioremap_change_attr+0xf/0x40
> > [0.084000]  [] kernel_map_sync_memtype+0x87/0xf0
> > [0.084000]  [] __ioremap_caller+0x21b/0x2f0
> > [0.084000]  [] ? walk_system_ram_range+0xca/0xf0
> > [0.084000]  [] ioremap_cache+0x13/0x20
> > [0.084000]  [] ? acpi_os_map_memory+0xb6/0x112
> > [0.084000]  [] acpi_os_map_memory+0xb6/0x112
> > [0.084000]  [] acpi_tb_verify_table+0x20/0x49
> > [0.084000]  [] acpi_load_tables+0x35/0x13e
> > [0.084000]  [] acpi_early_init+0x67/0xeb
> > [0.084000]  [] start_kernel+0x30e/0x319
> > [0.084000]  [] ? repair_env_string+0x5b/0x5b
> > [0.084000]  [] i386_start_kernel+0x12c/0x12f
> > [0.084000] Code: 0c db c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 be cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 8c 0b db
> > [0.084000] EIP: [] __phys_addr+0x42/0x90 SS:ESP 0068:c1657dc8
> > [0.085033] ---[ end trace bd778c4c9eceaf67 ]---
> > [0.088242] Kernel panic - not syncing: Attempted to kill the idle task!
> > 
> > was found using http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1 and was bisected
> > to commit 68d00bbe "Merge remote-tracking branch 'origin/x86/mm' into x86/mm2".
> > 
> > Regards.
> > 
> 
> -- 
> Regards/Gruss,
> Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --
> 

Re: [3.9-rc1 x86] Bug in ioremap code?

2013-03-06 Thread Tetsuo Handa

Dave Hansen wrote:
> Could you also add the following to your .config:
> 
>   CONFIG_ACPI_DEBUG=y
> 
> and boot with these on the kernel command-line:
> 
>   acpi.debug_layer=0x acpi.debug_level=0x2
> 
> I _think_ that'll shed some light on exactly which ACPI table is being
> parsed when the BUG_ON() trips.  That will hopefully let other folks
> reproduce it more easily.
> 
Using CONFIG_ACPI_DEBUG=y and adding acpi.debug_layer=0x acpi.debug_level=0x2
changed nothing.

But I found that this bug occurs only when the system has little RAM.

With 892MB RAM where /proc/meminfo would show HighTotal > 0,
this bug does not occur.

HighTotal:  4040 kB
LowTotal: 873960 kB

With 888MB RAM where /proc/meminfo would show HighTotal == 0,
this bug occurs.

[0.005852] ------------[ cut here ]------------
[0.007043] kernel BUG at arch/x86/mm/physaddr.c:79!
[0.008203] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
[0.009546] Modules linked in:
[0.010303] Pid: 0, comm: swapper/0 Not tainted 3.9.0-rc1 #38 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[0.013023] EIP: 0060:[] EFLAGS: 00210206 CPU: 0
[0.014294] EIP is at __phys_addr+0x42/0x90
[0.015270] EAX:  EBX: 376f ECX: 000c EDX: 
[0.016686] ESI:  EDI: c1665e90 EBP: c1665dc8 ESP: c1665dc4
[0.018161]  DS: 007b ES: 007b FS: 00d8 GS:  SS: 0068
[0.019422] CR0: 80050033 CR2: ffe13000 CR3: 01882000 CR4: 000406d0
[0.020911] DR0:  DR1:  DR2:  DR3: 
[0.022363] DR6: 0ff0 DR7: 0400
[0.023242] Process swapper/0 (pid: 0, ti=c1664000 task=c166f140 task.ti=c1664000)
[0.024957] Stack:
[0.025427]  c1665e90 c1665de8 c102d02e c10d7b49 c166f5c8 c167dd00  c167dd00
[0.027391]  f72b9bc0 c1665e70 c102d565 c1665e30 c10d7b49 0002 c1664000
[0.029372]  c10d78e2 c1665e18 c167dce8 0001 c1665e60 c14b7000 c1665e18 02f0
[0.031373] Call Trace:
[0.031947]  [] __cpa_process_fault+0x3e/0x80
[0.033165]  [] ? __purge_vmap_area_lazy+0x2a9/0x360
[0.034500]  [] __change_page_attr_set_clr+0x2c5/0x5b0
[0.035879]  [] ? __purge_vmap_area_lazy+0x2a9/0x360
[0.037211]  [] ? __purge_vmap_area_lazy+0x42/0x360
[0.038538]  [] ? vm_unmap_aliases+0x64/0x300
[0.039717]  [] change_page_attr_set_clr+0xe5/0x390
[0.041059]  [] _set_memory_wb+0x32/0x40
[0.042143]  [] ioremap_change_attr+0xf/0x40
[0.043330]  [] kernel_map_sync_memtype+0x87/0xf0
[0.044610]  [] __ioremap_caller+0x21b/0x2f0
[0.045813]  [] ? walk_system_ram_range+0xca/0xf0
[0.047072]  [] ioremap_cache+0x13/0x20
[0.048183]  [] ? acpi_os_map_memory+0xb6/0x112
[0.049405]  [] acpi_os_map_memory+0xb6/0x112
[0.050620]  [] acpi_tb_verify_table+0x20/0x49
[0.051840]  [] acpi_load_tables+0x35/0x156
[0.053009]  [] acpi_early_init+0x67/0xeb
[0.054117]  [] start_kernel+0x30e/0x319
[0.055203]  [] ? repair_env_string+0x5b/0x5b
[0.056422]  [] i386_start_kernel+0x12c/0x12f
[0.057599] Code: 0e dc c1 8d 98 00 00 00 40 85 d2 74 12 89 d9 c1 e9 0c 39 ca 72 19 e8 3e cd ff ff 39 c3 75 0c 89 d8 5b 5d c3 0f 0b 8d 76 00 eb fb <0f> 0b eb fe 0f 0b 90 8d b4 26 00 00 00 00 eb f6 8b 15 4c 0e dc
[0.063695] EIP: [] __phys_addr+0x42/0x90 SS:ESP 0068:c1665dc4
[0.065336] ---[ end trace ddccf428d5f1e08d ]---
[0.066391] Kernel panic - not syncing: Attempted to kill the idle task!

Re: [3.9-rc1 x86] Bug in ioremap code?

2013-03-06 Thread Tetsuo Handa

Borislav Petkov wrote:
> Ok, before we continue guessing stuff, Tetsuo, can you please explain
> how exactly you're triggering this. More specifically, we need .config,
> hypervisor version, I'm assuming kernel is 3.9-rc1, Linux is guest/host
> etc, etc.

I'm using a CentOS 6.3 x86_32 guest running on VMware Workstation 6.5.5 on a
Windows XP x86_32 host and on VMware Player 4.0.5 on a Windows 7 x86_64 host.

Kernel version is 3.9-rc1 x86_32. This bug can be triggered only when the
guest has little RAM such that /proc/meminfo reports that HighTotal == 0.
Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-acpi .

I don't know why, but changing the kernel config to CONFIG_ACPI=n
( http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-noacpi ) makes this bug go away.
Should I run a bisection on the ACPI code?

Regards.

[3.10-rc1] tick: NULL pointer dereference at tick_handle_oneshot_broadcast

2013-05-13 Thread Tetsuo Handa

I'm hitting the bug below on bootup.

[1.215727] Switching to clocksource hpet
[1.217475] BUG: unable to handle kernel NULL pointer dereference at 0018
[1.218467] IP: [] tick_handle_oneshot_broadcast+0xdc/0x280
[1.218488] PGD 0 
[1.218509] Oops:  [#1] SMP DEBUG_PAGEALLOC
[1.218530] Modules linked in:
[1.218550] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 3.10.0-rc1 #72
[1.218571] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/20/2012
[1.218592] task: 88007ad58000 ti: 88007acde000 task.ti: 88007acde000
[1.218613] RIP: 0010:[]  [] tick_handle_oneshot_broadcast+0xdc/0x280
[1.218634] RSP: :88007b203df8  EFLAGS: 00010083
[1.218655] RAX: 0004 RBX: 7fff RCX: 0010
[1.218676] RDX: 0004 RSI: 0010 RDI: 
[1.218697] RBP: 88007b203e48 R08: 88007ac05578 R09: 
[1.218717] R10: 88007ac05570 R11: 7fff R12: 7fff
[1.218738] R13: d700 R14:  R15: 81e1f040
[1.218759] FS:  () GS:88007b20() knlGS:
[1.218780] CS:  0010 DS:  ES:  CR0: 80050033
[1.218801] CR2: 0018 CR3: 01e0b000 CR4: 000407f0
[1.218822] DR0:  DR1:  DR2: 
[1.218843] DR3:  DR6: 0ff0 DR7: 0400
[1.218864] Stack:
[1.218884]  88007b203e68 0086 88007b203e18 3ec3dd69
[1.218905]  88007b203e38 81e12680 88007ac08488
[1.218926]  88007acdfb48  88007b203e58 81005145
[1.218947] Call Trace:
[1.218968]   
[1.218989]  [] timer_interrupt+0x15/0x20
[1.219041]  [] handle_irq_event_percpu+0x95/0x380
[1.219071]  [] handle_irq_event+0x48/0x70
[1.219101]  [] handle_edge_irq+0x6d/0x130
[1.219130]  [] handle_irq+0x5c/0x150
[1.219170]  [] ? d_alloc+0x68/0x80
[1.219202]  [] ? irq_enter+0x1b/0x90
[1.219233]  [] do_IRQ+0x5d/0xe0
[1.219263]  [] ? d_alloc+0x68/0x80
[1.219294]  [] common_interrupt+0x6f/0x6f
[1.219315]   
[1.219335]  [] ? trace_hardirqs_off+0xd/0x10
[1.219386]  [] ? lock_release+0x7e/0x130
[1.219417]  [] _raw_spin_unlock+0x23/0x50
[1.219447]  [] d_alloc+0x68/0x80
[1.219478]  [] lookup_dcache+0xa3/0xd0
[1.219508]  [] __lookup_hash+0x23/0x50
[1.219540]  [] lookup_one_len+0xd1/0x120
[1.219571]  [] __create_file+0x93/0x2a0
[1.219601]  [] debugfs_create_file+0x1a/0x30
[1.219631]  [] trace_create_file+0x1c/0x50
[1.219660]  [] tracing_init_debugfs_percpu+0xa2/0x210
[1.219691]  [] init_tracer_debugfs+0x188/0x1e0
[1.219723]  [] tracer_init_debugfs+0x78/0x20f
[1.219755]  [] ? clear_boot_tracer+0x2d/0x2d
[1.219785]  [] do_one_initcall+0xf2/0x1a0
[1.219816]  [] do_basic_setup+0x9d/0xbb
[1.219849]  [] ? kernel_init_freeable+0x133/0x133
[1.219879]  [] kernel_init_freeable+0xba/0x133
[1.219909]  [] ? rest_init+0x180/0x180
[1.219938]  [] kernel_init+0xe/0xf0
[1.219969]  [] ret_from_fork+0x7c/0xb0
[1.21]  [] ? rest_init+0x180/0x180
[1.220020] Code: 63 f1 48 89 c7 48 63 d2 e8 42 e5 26 00 8b 0d e4 6f e4 00 89 c2 39 c8 89 ce 7d 74 48 63 fa 48 8b 3c fd 40 23 ef 81 49 8b 7c 3d 00 <4c> 8b 47 18 4c 3b 45 c8 7e 1a 4d 39 e0 7d a5 83 fa ff 41 89 c6
[1.220040] RIP  [] tick_handle_oneshot_broadcast+0xdc/0x280
[1.220061]  RSP 
[1.220082] CR2: 0018
[1.220103] ---[ end trace c8cac818edfcf493 ]---
[1.220124] Kernel panic - not syncing: Fatal exception in interrupt

Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.10-rc1
Full dmesg is at http://I-love.SAKURA.ne.jp/tmp/dmesg-3.10-rc1.txt

Below patch

-- debug printk() start --
diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 206bbfb..952df89 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -189,7 +189,9 @@ static void tick_do_broadcast(struct cpumask *mask)
if (cpumask_test_cpu(cpu, mask)) {
cpumask_clear_cpu(cpu, mask);
td = &per_cpu(tick_cpu_device, cpu);
-   td->evtdev->event_handler(td->evtdev);
+   printk(KERN_INFO "test=%d td=%p td->evtdev=%p\n", cpu, td, td->evtdev);
+   if (td->evtdev)
+   td->evtdev->event_handler(td->evtdev);
}
 
if (!cpumask_empty(mask)) {
@@ -200,7 +202,9 @@ static void tick_do_broadcast(struct cpumask *mask)
 * misfeature only on x86 (lapic)
 */
td = &per_cpu(tick_cpu_device, cpumask_first(mask));
-   td->evtdev->broadcast(mask);
+   printk(KERN_INFO "first td=%p td->evtdev=%p\
