[PATCH] drivers:staging:skein:skein_generic.c: Fixed a whitespace error

2014-11-21 Thread Anjana Sasindran
 This patch fixes the checkpatch.pl error:

 ERROR: trailing whitespace

Signed-off-by: Anjana Sasindran 
---
 drivers/staging/skein/skein_generic.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/skein/skein_generic.c b/drivers/staging/skein/skein_generic.c
index 7096d5a..268e4de 100644
--- a/drivers/staging/skein/skein_generic.c
+++ b/drivers/staging/skein/skein_generic.c
@@ -188,7 +188,6 @@ static int __init skein_generic_init(void)
goto unreg256;
if (crypto_register_shash())
goto unreg512;
-
return 0;
 unreg512:
crypto_unregister_shash();
-- 
1.9.1

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] can: eliminate banner[] variable and switch to pr_info()

2014-11-21 Thread Jeremiah Mahler
Several CAN modules use a design pattern with a banner[] variable at the
top which defines a string that is used once during init to print the
banner.  The string is also embedded with KERN_INFO which makes it
printk() specific.

Improve the code by eliminating the banner[] variable and moving the
string to where it is printed.  Then switch from printk(KERN_INFO to
pr_info() for the lines that were changed.

Signed-off-by: Jeremiah Mahler 
---
 net/can/af_can.c | 5 +
 net/can/bcm.c| 4 +---
 net/can/raw.c| 4 +---
 3 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/can/af_can.c b/net/can/af_can.c
index ce82337..ac05be1 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -64,9 +64,6 @@
 
 #include "af_can.h"
 
-static __initconst const char banner[] = KERN_INFO
-   "can: controller area network core (" CAN_VERSION_STRING ")\n";
-
 MODULE_DESCRIPTION("Controller Area Network PF_CAN core");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Urs Thuermann , "
@@ -896,7 +893,7 @@ static __init int can_init(void)
 offsetof(struct can_frame, data) !=
 offsetof(struct canfd_frame, data));
 
-   printk(banner);
+   pr_info("can: controller area network core (" CAN_VERSION_STRING ")\n");
 
memset(&can_rx_alldev_list, 0, sizeof(can_rx_alldev_list));
 
diff --git a/net/can/bcm.c b/net/can/bcm.c
index dcb75c0..9aa3f76 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -78,8 +78,6 @@
 (CAN_SFF_MASK | CAN_EFF_FLAG | CAN_RTR_FLAG))
 
 #define CAN_BCM_VERSION CAN_VERSION
-static __initconst const char banner[] = KERN_INFO
-   "can: broadcast manager protocol (rev " CAN_BCM_VERSION " t)\n";
 
 MODULE_DESCRIPTION("PF_CAN broadcast manager protocol");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -1615,7 +1613,7 @@ static int __init bcm_module_init(void)
 {
int err;
 
-   printk(banner);
+   pr_info("can: broadcast manager protocol (rev " CAN_BCM_VERSION " t)\n");
 
err = can_proto_register(_can_proto);
if (err < 0) {
diff --git a/net/can/raw.c b/net/can/raw.c
index 081e81f..e3250e2 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -56,8 +56,6 @@
 #include 
 
 #define CAN_RAW_VERSION CAN_VERSION
-static __initconst const char banner[] =
-   KERN_INFO "can: raw protocol (rev " CAN_RAW_VERSION ")\n";
 
 MODULE_DESCRIPTION("PF_CAN raw protocol");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -810,7 +808,7 @@ static __init int raw_module_init(void)
 {
int err;
 
-   printk(banner);
+   pr_info("can: raw protocol (rev " CAN_RAW_VERSION ")\n");
 
err = can_proto_register(_can_proto);
if (err < 0)
-- 
2.1.3
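
The banner change described in this patch can be tried out in userspace. What follows is only a sketch under stated assumptions: KERN_INFO, CAN_VERSION_STRING and pr_info_to() are stand-ins defined here for illustration (the values are placeholders, not the kernel's), and snprintf() takes the place of printk(). It shows that a level prefix baked into a file-scope string and a prefix added by a pr_info()-style macro at the call site yield the same text.

```c
#include <stdio.h>

/* Userspace stand-ins; the values are illustrative placeholders. */
#define KERN_INFO "<6>"
#define CAN_VERSION_STRING "rev 0 abi 0"

/* Old pattern: the log level is embedded in a string defined far away
 * from the single place where it is printed. */
static const char banner[] = KERN_INFO
	"can: controller area network core (" CAN_VERSION_STRING ")\n";

int log_banner_old(char *out, size_t size)
{
	return snprintf(out, size, "%s", banner);
}

/* Hypothetical pr_info() stand-in: string-literal concatenation glues
 * the level prefix onto the format string at the call site. */
#define pr_info_to(out, size, ...) snprintf(out, size, KERN_INFO __VA_ARGS__)

int log_banner_new(char *out, size_t size)
{
	return pr_info_to(out, size,
		"can: controller area network core (" CAN_VERSION_STRING ")\n");
}
```

Calling both functions on two buffers should produce identical strings, which is why the conversion is expected not to change the printed banner.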



Re: [PATCH] KVM: nVMX: nested MSR auto load/restore emulation.

2014-11-21 Thread Jan Kiszka
On 2014-11-22 05:24, Wincy Van wrote:
> Some hypervisors need the MSR auto load/restore feature.
> 
> We read MSRs from the vm-entry MSR load area specified by L1,
> and load them via kvm_set_msr on nested entry.
> When a nested exit occurs, we get MSRs via kvm_get_msr and write
> them to L1's MSR store area. After this, we read MSRs from the vm-exit
> MSR load area, and load them via kvm_set_msr.
> 
> VirtualBox will work fine with this patch.

Cool! This feature is long overdue.

Patch is unfortunately misformatted which makes it very hard to read.
Please check via linux/scripts/checkpatch.pl for the proper style.

Could you also write a corresponding kvm-unit-test (see x86/vmx_tests.c)?

Jan
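
The sequence quoted above (load on entry, store on exit, then load the exit list) can be modelled in userspace. This is a sketch only and is not taken from the patch under review: fake_set_msr() and fake_get_msr() are hypothetical stand-ins for kvm_set_msr() and kvm_get_msr(), the eight-slot MSR array is invented, and the entry layout (index, reserved word, value) follows the Intel SDM as far as I understand it.

```c
#include <stddef.h>
#include <stdint.h>

/* One slot of an MSR load/store area: MSR index, reserved word, value. */
struct msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t value;
};

#define FAKE_MSR_COUNT 8u	/* hypothetical, tiny MSR space */
static uint64_t fake_msrs[FAKE_MSR_COUNT];

/* Stand-ins for kvm_set_msr()/kvm_get_msr(); non-zero means failure. */
int fake_set_msr(uint32_t index, uint64_t value)
{
	if (index >= FAKE_MSR_COUNT)
		return 1;
	fake_msrs[index] = value;
	return 0;
}

int fake_get_msr(uint32_t index, uint64_t *value)
{
	if (index >= FAKE_MSR_COUNT)
		return 1;
	*value = fake_msrs[index];
	return 0;
}

/* Load every entry in order.  Returns 0 on success, or the 1-based slot
 * that failed (a convention chosen for this sketch). */
size_t load_msr_area(const struct msr_entry *area, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (fake_set_msr(area[i].index, area[i].value))
			return i + 1;
	return 0;
}

/* Read the current value of every listed MSR back into the area. */
size_t store_msr_area(struct msr_entry *area, size_t count)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (fake_get_msr(area[i].index, &area[i].value))
			return i + 1;
	return 0;
}
```

A nested entry would call load_msr_area() on the entry list; a nested exit would call store_msr_area() on the store list and then load_msr_area() on the exit list.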






Re: [RFC] situation with csum_and_copy_... API

2014-11-21 Thread David Miller
From: Al Viro 
Date: Sat, 22 Nov 2014 04:28:57 +

>   OK, here's the next bunch.  Sorry about the delay, iov_iter.c stuff
> took most of the day (and it's not included in this pile).  Please, review.

I read over this stuff twice and this series looks fine to me.

Since this is the weekend... maybe wait until Monday for other feedback
then give me a pull request?

Thanks Al.


Re: [PATCH] Revert "staging: sm7xxfb: remove driver"

2014-11-21 Thread Sudip Mukherjee
On Thu, Nov 20, 2014 at 03:23:29PM -0800, Greg Kroah-Hartman wrote:
> On Thu, Nov 20, 2014 at 05:09:25PM -0500, Steven Rostedt wrote:
> > 
> > Someone reported a bug in the function graph tracer for MIPS. As I'm
> > still waiting on my USB serial for my Imagination MIPS board, I decided
> > to bring my Lemote Yeeloong laptop back up to the latest kernel. This
> > is where I noticed that the screen no longer displays anything.
> > 
> > I ran a bisect, which came across a staging commit that removed the
> > sm7xxfb driver. When I reverted it on a v3.18-rc5 kernel and booted it
> > on my Lemote laptop, the display worked again.
> > 
> > I then did a search for this commit and found that Debian reverted it
> > too. Seems that there's still some Debian users of this laptop. (RMS?)
> > 
> > What needs to be done to make this a "proper" driver? I can try to
> > support it, although I have no idea how it works :-)
> 
> Have you read the TODO file in this patch?  If you are willing to work
> on this, I'll be glad to apply it, but the reason I removed it was
> because no one had done anything with it for a very long time.
> 
> It needs a maintainer / developer, otherwise I can't take this.

I would like to help with this. Silicon Motion still has the SM712 in its
product line, and the SM718 might be a similar one.

Now, the problems (as I see them) with my helping on the driver:
1) I am a newbie. Though I have learnt a lot from the patches I sent, I am
still a newbie.
2) Most important: I do not have the hardware. So, from the TODO list, dual head
and 2D acceleration support will be tough without actually checking on the
hardware.

thanks
sudip

> 
> thanks,
> 
> greg k-h


[PATCH] percpu-ref: correctly get percpu pointer

2014-11-21 Thread Shaohua Li
I saw random system hangs while testing virtio with blk-mq enabled and cpu
hotplug running in the background. It turns out __ref_is_percpu() doesn't always
return the correct percpu pointer. percpu_ref_put() calls __ref_is_percpu(),
which checks __PERCPU_REF_ATOMIC. After this check, __PERCPU_REF_ATOMIC or
__PERCPU_REF_DEAD might be set, so we must exclude the two bits from the percpu
pointer. Fortunately we can still use percpu data for percpu_ref_put() even
when this happens, because the final transition from percpu to atomic occurs in
rcu context while __ref_is_percpu() is always called under the rcu read lock.

CC: Jens Axboe 
CC: Tejun Heo 
CC: Kent Overstreet 
Signed-off-by: Shaohua Li 
---
 include/linux/percpu-refcount.h | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index d5c89e0..6beee08 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -136,7 +136,14 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
return false;
 
-   *percpu_countp = (unsigned long __percpu *)percpu_ptr;
+   /*
+* At this point ATOMIC or DEAD might be set when percpu_ref_kill() is
+* running. It's still safe to use percpu here, because the final
+* transition from percpu to atomic occurs at rcu context while this
+* routine is protected with rcu read lock.
+*/
+   *percpu_countp = (unsigned long __percpu *)(percpu_ptr &
+   ~__PERCPU_REF_ATOMIC_DEAD);
return true;
 }
 
-- 
1.8.3.2
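
The masking step in this patch can be illustrated in userspace. This is a sketch under assumptions, not the kernel code: REF_ATOMIC, REF_DEAD and ref_is_percpu() are stand-ins invented here, and a plain unsigned long takes the place of the percpu counter. It covers the case where the DEAD bit is already set in the tagged word while ATOMIC is still clear, so the pointer is used and the flag bits must be stripped first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the flag bits kept in the two low bits of the pointer. */
enum {
	REF_ATOMIC = 1u << 0,
	REF_DEAD = 1u << 1,
	REF_ATOMIC_DEAD = REF_ATOMIC | REF_DEAD,
};

/* Returns true and stores the untagged counter pointer when the
 * reference is still in percpu mode, false once ATOMIC is set. */
bool ref_is_percpu(uintptr_t tagged, unsigned long **countp)
{
	if (tagged & REF_ATOMIC)
		return false;

	/* Without the mask, a set DEAD bit would leak into the pointer. */
	*countp = (unsigned long *)(tagged & ~(uintptr_t)REF_ATOMIC_DEAD);
	return true;
}
```

The sketch relies on the counter being at least 4-byte aligned, so the two low bits of its address are free to carry flags.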



Re: [PATCH] Staging:skein: Fix trailing whitespace error

2014-11-21 Thread Sudip Mukherjee
On Sat, Nov 22, 2014 at 11:34:29AM +0530, Anjana Sasindran wrote:
>   This patch fixes the checkpatch.pl error:
This patch does not apply to next-20141121.
> 
>   ERROR: trailing whitespace
But your patch adds a blank line to the code?

thanks
sudip

> 
> Signed-off-by: Anjana Sasindran 
> ---
>  drivers/staging/skein/skein_generic.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/staging/skein/skein_generic.c b/drivers/staging/skein/skein_generic.c
> index 7096d5a..8660509 100644
> --- a/drivers/staging/skein/skein_generic.c
> +++ b/drivers/staging/skein/skein_generic.c
> @@ -190,6 +190,7 @@ static int __init skein_generic_init(void)
>   goto unreg512;
>  
>   return 0;
> +
>  unreg512:
>   crypto_unregister_shash();
>  unreg256:
> -- 
> 1.9.1
> 


[PATCH] Drivers:Staging:rtl8188eu:hal:usb_halinit.c: Added blank lines after declarations

2014-11-21 Thread Anjana Sasindran
This patch fixes the five checkpatch.pl warnings:

WARNING:Missing a blank line after declaration

Signed-off-by: Anjana Sasindran 
---
 drivers/staging/rtl8188eu/hal/usb_halinit.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/staging/rtl8188eu/hal/usb_halinit.c b/drivers/staging/rtl8188eu/hal/usb_halinit.c
index 14650e9..439828c 100644
--- a/drivers/staging/rtl8188eu/hal/usb_halinit.c
+++ b/drivers/staging/rtl8188eu/hal/usb_halinit.c
@@ -1931,6 +1931,7 @@ GetHalDefVar8188EUsb(
case HW_DEF_RA_INFO_DUMP:
{
u8 entry_id = *((u8 *)pValue);
+
if (check_fwstate(>mlmepriv, _FW_LINKED)) {
DBG_88E(" RA status check 
===\n");
DBG_88E("Mac_id:%d , RateID = %d, RAUseRate = 
0x%08x, RateSGI = %d, DecisionRate = 0x%02x ,PTStage = %d\n",
@@ -1946,6 +1947,7 @@ GetHalDefVar8188EUsb(
case HW_DEF_ODM_DBG_FLAG:
{
struct odm_dm_struct *dm_ocm = &(haldata->odmpriv);
+
pr_info("dm_ocm->DebugComponents = 0x%llx\n", dm_ocm->DebugComponents);
}
break;
@@ -1994,6 +1996,7 @@ static u8 SetHalDefVar8188EUsb(struct adapter *Adapter, enum hal_def_variable eV
} else if (dm_func == 6) {/* turn on all dynamic func */
if (!(podmpriv->SupportAbility  & 
DYNAMIC_BB_DIG)) {
struct rtw_dig *pDigTable = 
>DM_DigTable;
+
pDigTable->CurIGValue = 
usb_read8(Adapter, 0xc50);
}
podmpriv->SupportAbility = 
DYNAMIC_ALL_FUNC_ENABLE;
@@ -2011,6 +2014,7 @@ static u8 SetHalDefVar8188EUsb(struct adapter *Adapter, enum hal_def_variable eV
{
u8 bRSSIDump = *((u8 *)pValue);
struct odm_dm_struct *dm_ocm = &(haldata->odmpriv);
+
if (bRSSIDump)
dm_ocm->DebugComponents =   
ODM_COMP_DIG|ODM_COMP_FA_CNT;
else
@@ -2021,7 +2025,9 @@ static u8 SetHalDefVar8188EUsb(struct adapter *Adapter, enum hal_def_variable eV
{
u64 DebugComponents = *((u64 *)pValue);
struct odm_dm_struct *dm_ocm = &(haldata->odmpriv);
+
dm_ocm->DebugComponents = DebugComponents;
+
}
break;
default:
-- 
1.9.1



[PATCH] Staging:skein: Fix trailing whitespace error

2014-11-21 Thread Anjana Sasindran
  This patch fixes the checkpatch.pl error:

  ERROR: trailing whitespace

Signed-off-by: Anjana Sasindran 
---
 drivers/staging/skein/skein_generic.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/skein/skein_generic.c b/drivers/staging/skein/skein_generic.c
index 7096d5a..8660509 100644
--- a/drivers/staging/skein/skein_generic.c
+++ b/drivers/staging/skein/skein_generic.c
@@ -190,6 +190,7 @@ static int __init skein_generic_init(void)
goto unreg512;
 
return 0;
+
 unreg512:
crypto_unregister_shash();
 unreg256:
-- 
1.9.1



Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Andy Lutomirski
On Fri, Nov 21, 2014 at 8:20 PM, Paul E. McKenney
 wrote:
> On Fri, Nov 21, 2014 at 06:00:14PM -0800, Andy Lutomirski wrote:
>> On Fri, Nov 21, 2014 at 3:38 PM, Paul E. McKenney
>>  wrote:
>> > On Fri, Nov 21, 2014 at 03:06:48PM -0800, Andy Lutomirski wrote:
>> >> On Fri, Nov 21, 2014 at 2:55 PM, Paul E. McKenney
>> >>  wrote:
>> >> > On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
>> >> >> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
>> >> >>  wrote:
>> >> >> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
>> >> >> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski 
>> >> >> >>  wrote:
>> >> >> >> > We currently pretend that IST context is like standard exception
>> >> >> >> > context, but this is incorrect.  IST entries from userspace are like
>> >> >> >> > standard exceptions except that they use per-cpu stacks, so they are
>> >> >> >> > atomic.  IST entries from kernel space are like NMIs from RCU's
>> >> >> >> > perspective -- they are not quiescent states even if they
>> >> >> >> > interrupted the kernel during a quiescent state.
>> >> >> >> >
>> >> >> >> > Add and use ist_enter and ist_exit to track IST context.  Even
>> >> >> >> > though x86_32 has no IST stacks, we track these interrupts the same
>> >> >> >> > way.
>> >> >> >>
>> >> >> >> I should add:
>> >> >> >>
>> >> >> >> I have no idea why RCU read-side critical sections are safe inside
>> >> >> >> __do_page_fault today.  It's guarded by exception_enter(), but that
>> >> >> >> doesn't do anything if context tracking is off, and context tracking
>> >> >> >> is usually off. What am I missing here?
>> >> >> >
>> >> >> > Ah!  There are three cases:
>> >> >> >
>> >> >> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU is
>> >> >> > still paying attention to CPUs running in both userspace and in
>> >> >> > the kernel.  So if a page fault happens, RCU will be set up to
>> >> >> > notice any RCU read-side critical sections.
>> >> >> >
>> >> >> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
>> >> >> > might well be ignoring userspace execution: NO_HZ_FULL and
>> >> >> > all that.  However, as you pointed out, in this case the
>> >> >> > context-tracking code lets RCU know that we have entered the
>> >> >> > kernel, which means that RCU will again be paying attention to
>> >> >> > RCU read-side critical sections.
>> >> >> >
>> >> >> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
>> >> >> > if we take a page fault when context tracking is off, life
>> >> >> > will be hard.  But the kernel is not supposed to take page
>> >> >> > faults in the idle loop, so this is not a problem.
>> >> >>
>> >> >> I guess so, as long as there are really no page faults in the idle loop.
>> >> >
>> >> > As far as I know, there are not.  If there are, someone needs to let
>> >> > me know!  ;-)
>> >> >
>> >> >> There are, however, machine checks in the idle loop, and maybe kprobes
>> >> >> (haven't checked), so I think this patch might fix real bugs.
>> >> >
>> >> > If you can get ISTs from the idle loop, then the patch is needed.
>> >> >
>> >> >> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
>> >> >> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
>> >> >> > nested calls.
>> >> >>
>> >> >> Yes, and vice versa.  That code looked like it handled nesting
>> >> >> correctly, but I wasn't entirely sure.
>> >> >
>> >> > It currently does not, please see below patch.  Are you able to test
>> >> > nesting?  It would be really cool if you could do so -- I have no
>> >> > way to test this patch.
>> >>
>> >> I can try.  It's sort of easy -- I'll put an int3 into do_nmi and add
>> >> a fixup to avoid crashing.
>> >>
>> >> What should I look for?  Should I try to force full nohz on and assert
>> >> something?  I don't really know how to make full nohz work.
>> >
>> > You should look for the WARN_ON_ONCE() calls in rcu_nmi_enter() and
>> > rcu_nmi_exit() to fire.
>>
>> No warning with or without your patch, maybe because all of those
>> returns skip the labels.
>
> I will be guardedly optimistic and take this as a good sign.  ;-)
>
>> Also, an NMI can happen *during* rcu_nmi_enter or rcu_nmi_exit.  Is
>> that okay?  Should those dynticks_nmi_nesting++ things be local_inc
>> and local_dec_and_test?
>
> Yep, it is OK during rcu_nmi_enter() or rcu_nmi_exit().  The nested
> NMI will put the dynticks_nmi_nesting counter back where it was, so
> no chance of confusion.
>

That sounds like it's making a scary assumption about the code
generated by the ++ operator.
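
To make the disagreement concrete, here is a userspace model of the claim that a nested NMI "will put the counter back where it was". It is a sketch under assumptions, not kernel code: nesting stands in for dynticks_nmi_nesting, the increment is split by hand into load, add and store, and nested_nmi() is a hypothetical handler that increments and decrements in a balanced way. It models one CPU and one particular split of the ++, so it cannot settle what a real compiler may emit.

```c
#include <assert.h>

static int nesting;	/* stand-in for dynticks_nmi_nesting */

/* A nested NMI that enters and exits: balanced increment and decrement. */
void nested_nmi(void)
{
	nesting++;
	/* ... handler body would run here ... */
	nesting--;
}

/* Perform "nesting++" as load / add / store, letting a nested NMI fire
 * at one of four points (step 0..3).  Returns the final counter value. */
int increment_with_nmi_at(int start, int step)
{
	int tmp;

	assert(step >= 0 && step <= 3);
	nesting = start;

	if (step == 0)
		nested_nmi();
	tmp = nesting;		/* load */
	if (step == 1)
		nested_nmi();
	tmp = tmp + 1;		/* add */
	if (step == 2)
		nested_nmi();
	nesting = tmp;		/* store */
	if (step == 3)
		nested_nmi();

	return nesting;
}
```

In this model every interleaving ends at start + 1, because the nested handler restores the value before the outer code stores to it. Whether generated code always fits the model is the open question raised above.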

>> That dynticks_nmi_nesting thing seems scary to me.  Shouldn't the code
>> unconditionally increment dynticks_nmi_nesting in rcu_nmi_enter and
>> unconditionally decrement it 

[git pull] vfs.git fixes

2014-11-21 Thread Al Viro
Assorted fixes, most in overlayfs land.  Please, pull from the usual
place -
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Arnd Bergmann (1):
  isofs: avoid unused function warning

Miklos Szeredi (8):
  ovl: rename filesystem type to "overlay"
  ovl: fix remove/copy-up race
  ovl: fix race in private xattr checks
  ovl: allow filenames with comma
  ovl: use lockless_dereference() for upperdentry
  ovl: pass dentry into ovl_dir_read_merged()
  ovl: update MAINTAINERS
  ovl: ovl_dir_fsync() cleanup

Yan, Zheng (1):
  vfs: fix reference leak in d_prune_aliases()

Diffstat:
 Documentation/filesystems/overlayfs.txt |    2 +-
 MAINTAINERS                             |    7 ++--
 fs/Makefile                             |    2 +-
 fs/dcache.c                             |    1 +
 fs/isofs/inode.c                        |   42 ++---
 fs/overlayfs/Kconfig                    |    2 +-
 fs/overlayfs/Makefile                   |    4 +-
 fs/overlayfs/dir.c                      |   31 ++--
 fs/overlayfs/inode.c                    |   27 +-
 fs/overlayfs/readdir.c                  |   39
 fs/overlayfs/super.c                    |   61 +--
 11 files changed, 133 insertions(+), 85 deletions(-)


[PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 net/rds/message.c |   42 --
 net/rds/rds.h |3 +--
 net/rds/send.c|4 +++-
 3 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/net/rds/message.c b/net/rds/message.c
index 7a546e0..ff22022 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -264,64 +264,46 @@ struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned in
return rm;
 }
 
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-  size_t total_len)
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from)
 {
unsigned long to_copy;
-   unsigned long iov_off;
unsigned long sg_off;
-   struct iovec *iov;
struct scatterlist *sg;
int ret = 0;
 
-   rm->m_inc.i_hdr.h_len = cpu_to_be32(total_len);
+   rm->m_inc.i_hdr.h_len = cpu_to_be32(iov_iter_count(from));
 
/*
 * now allocate and copy in the data payload.
 */
sg = rm->data.op_sg;
-   iov = first_iov;
-   iov_off = 0;
sg_off = 0; /* Dear gcc, sg->page will be null from kzalloc. */
 
-   while (total_len) {
+   while (iov_iter_count(from)) {
if (!sg_page(sg)) {
-   ret = rds_page_remainder_alloc(sg, total_len,
+   ret = rds_page_remainder_alloc(sg, iov_iter_count(from),
   GFP_HIGHUSER);
if (ret)
-   goto out;
+   return ret;
rm->data.op_nents++;
sg_off = 0;
}
 
-   while (iov_off == iov->iov_len) {
-   iov_off = 0;
-   iov++;
-   }
-
-   to_copy = min(iov->iov_len - iov_off, sg->length - sg_off);
-   to_copy = min_t(size_t, to_copy, total_len);
-
-   rdsdebug("copying %lu bytes from user iov [%p, %zu] + %lu to "
-"sg [%p, %u, %u] + %lu\n",
-to_copy, iov->iov_base, iov->iov_len, iov_off,
-(void *)sg_page(sg), sg->offset, sg->length, sg_off);
+   to_copy = min_t(unsigned long, iov_iter_count(from),
+   sg->length - sg_off);
 
-   ret = rds_page_copy_from_user(sg_page(sg), sg->offset + sg_off,
- iov->iov_base + iov_off,
- to_copy);
-   if (ret)
-   goto out;
+   rds_stats_add(s_copy_from_user, to_copy);
+   ret = copy_page_from_iter(sg_page(sg), sg->offset + sg_off,
+ to_copy, from);
+   if (ret != to_copy)
+   return -EFAULT;
 
-   iov_off += to_copy;
-   total_len -= to_copy;
sg_off += to_copy;
 
if (sg_off == sg->length)
sg++;
}
 
-out:
return ret;
 }
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index b22dad9..c2a5eef 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -656,8 +656,7 @@ rds_conn_connecting(struct rds_connection *conn)
 /* message.c */
 struct rds_message *rds_message_alloc(unsigned int nents, gfp_t gfp);
 struct scatterlist *rds_message_alloc_sgs(struct rds_message *rm, int nents);
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-  size_t total_len);
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from);
 struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned int total_len);
 void rds_message_populate_header(struct rds_header *hdr, __be16 sport,
 __be16 dport, u64 seq);
diff --git a/net/rds/send.c b/net/rds/send.c
index 0a64541..4de62ea 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -934,7 +934,9 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
int queued = 0, allocated_mr = 0;
int nonblock = msg->msg_flags & MSG_DONTWAIT;
long timeo = sock_sndtimeo(sk, nonblock);
+   struct iov_iter from;
 
+   iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, payload_len);
/* Mirror Linux UDP mirror of BSD error message compatibility */
/* XXX: Perhaps MSG_MORE someday */
if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_CMSG_COMPAT)) {
@@ -982,7 +984,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
ret = -ENOMEM;
goto out;
}
-   ret = rds_message_copy_from_user(rm, msg->msg_iov, payload_len);
+   ret = rds_message_copy_from_user(rm, &from);
 

[PATCH 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter

2014-11-21 Thread Al Viro
instances get considerably simpler from that...
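
Why the instances get simpler can be seen in a small userspace model. This is a sketch under assumptions and not the kernel's iov_iter: struct mini_iter, mini_iter_init() and mini_copy_from_iter() are hypothetical names invented here. The point is that the segment pointer, the offset inside the segment and the remaining byte count live in one object, so a caller no longer tracks iov and iov_off by hand.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-in for an iovec iterator. */
struct mini_iter {
	const struct iovec *iov;	/* current segment */
	size_t iov_off;			/* bytes already consumed in *iov */
	size_t count;			/* total bytes still available */
};

void mini_iter_init(struct mini_iter *it, const struct iovec *iov,
		    size_t total)
{
	it->iov = iov;
	it->iov_off = 0;
	it->count = total;
}

/* Copy up to len bytes out of the iterator into dst and advance it.
 * Returns the number of bytes copied, which may be less than len. */
size_t mini_copy_from_iter(void *dst, size_t len, struct mini_iter *it)
{
	size_t done = 0;

	if (len > it->count)
		len = it->count;

	while (done < len) {
		size_t chunk;

		/* Skip segments that are fully consumed (or empty). */
		while (it->iov_off == it->iov->iov_len) {
			it->iov++;
			it->iov_off = 0;
		}

		chunk = it->iov->iov_len - it->iov_off;
		if (chunk > len - done)
			chunk = len - done;

		memcpy((char *)dst + done,
		       (const char *)it->iov->iov_base + it->iov_off, chunk);
		it->iov_off += chunk;
		it->count -= chunk;
		done += chunk;
	}
	return done;
}
```

A caller in the style of the converted loops would then only ask how many bytes remain (it->count) and request the next chunk, comparing the return value with the requested length to detect a short copy.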

Signed-off-by: Al Viro 
---
 net/rds/ib.h   |3 +--
 net/rds/ib_recv.c  |   37 +++--
 net/rds/iw.h   |3 +--
 net/rds/iw_recv.c  |   37 +++--
 net/rds/message.c  |   35 ---
 net/rds/rds.h  |6 ++
 net/rds/recv.c |5 +++--
 net/rds/tcp.h  |3 +--
 net/rds/tcp_recv.c |   38 +-
 9 files changed, 47 insertions(+), 120 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 7280ab8..c36d713 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -316,8 +316,7 @@ int rds_ib_recv_alloc_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_free_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_refill(struct rds_connection *conn, int prefill);
 void rds_ib_inc_free(struct rds_incoming *inc);
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-size_t size);
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_ib_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_ib_recv_tasklet_fn(unsigned long data);
 void rds_ib_recv_init_ring(struct rds_ib_connection *ic);
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index d67de45..1b981a4 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -472,15 +472,12 @@ static struct list_head *rds_ib_recv_cache_get(struct rds_ib_refill_cache *cache
return head;
 }
 
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-   size_t size)
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
struct rds_ib_incoming *ibinc;
struct rds_page_frag *frag;
-   struct iovec *iov = first_iov;
unsigned long to_copy;
unsigned long frag_off = 0;
-   unsigned long iov_off = 0;
int copied = 0;
int ret;
u32 len;
@@ -489,37 +486,25 @@ int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
frag = list_entry(ibinc->ii_frags.next, struct rds_page_frag, f_item);
len = be32_to_cpu(inc->i_hdr.h_len);
 
-   while (copied < size && copied < len) {
+   while (iov_iter_count(to) && copied < len) {
if (frag_off == RDS_FRAG_SIZE) {
frag = list_entry(frag->f_item.next,
  struct rds_page_frag, f_item);
frag_off = 0;
}
-   while (iov_off == iov->iov_len) {
-   iov_off = 0;
-   iov++;
-   }
-
-   to_copy = min(iov->iov_len - iov_off, RDS_FRAG_SIZE - frag_off);
-   to_copy = min_t(size_t, to_copy, size - copied);
+   to_copy = min_t(unsigned long, iov_iter_count(to),
+   RDS_FRAG_SIZE - frag_off);
to_copy = min_t(unsigned long, to_copy, len - copied);
 
-   rdsdebug("%lu bytes to user [%p, %zu] + %lu from frag "
-"[%p, %u] + %lu\n",
-to_copy, iov->iov_base, iov->iov_len, iov_off,
-sg_page(&frag->f_sg), frag->f_sg.offset, frag_off);
-
/* XXX needs + offset for multiple recvs per page */
-   ret = rds_page_copy_to_user(sg_page(&frag->f_sg),
-   frag->f_sg.offset + frag_off,
-   iov->iov_base + iov_off,
-   to_copy);
-   if (ret) {
-   copied = ret;
-   break;
-   }
+   rds_stats_add(s_copy_to_user, to_copy);
+   ret = copy_page_to_iter(sg_page(&frag->f_sg),
+   frag->f_sg.offset + frag_off,
+   to_copy,
+   to);
+   if (ret != to_copy)
+   return -EFAULT;
 
-   iov_off += to_copy;
frag_off += to_copy;
copied += to_copy;
}
diff --git a/net/rds/iw.h b/net/rds/iw.h
index 04ce3b1..cbe6674 100644
--- a/net/rds/iw.h
+++ b/net/rds/iw.h
@@ -325,8 +325,7 @@ int rds_iw_recv(struct rds_connection *conn);
 int rds_iw_recv_refill(struct rds_connection *conn, gfp_t kptr_gfp,
   gfp_t page_gfp, int prefill);
 void rds_iw_inc_free(struct rds_incoming *inc);
-int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-size_t size);
+int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_iw_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_iw_recv_tasklet_fn(unsigned long data);
 void rds_iw_recv_init_ring(struct rds_iw_connection *ic);
diff --git a/net/rds/iw_recv.c 

[PATCH 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 include/net/af_vsock.h |6 +++---
 net/vmw_vsock/af_vsock.c   |6 +++---
 net/vmw_vsock/vmci_transport.c |   14 +++---
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 4282778..0d87674 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -103,14 +103,14 @@ struct vsock_transport {
int (*dgram_dequeue)(struct kiocb *kiocb, struct vsock_sock *vsk,
 struct msghdr *msg, size_t len, int flags);
int (*dgram_enqueue)(struct vsock_sock *, struct sockaddr_vm *,
-struct iovec *, size_t len);
+struct msghdr *, size_t len);
bool (*dgram_allow)(u32 cid, u32 port);
 
/* STREAM. */
/* TODO: stream_bind() */
-   ssize_t (*stream_dequeue)(struct vsock_sock *, struct iovec *,
+   ssize_t (*stream_dequeue)(struct vsock_sock *, struct msghdr *,
  size_t len, int flags);
-   ssize_t (*stream_enqueue)(struct vsock_sock *, struct iovec *,
+   ssize_t (*stream_enqueue)(struct vsock_sock *, struct msghdr *,
  size_t len);
s64 (*stream_has_data)(struct vsock_sock *);
s64 (*stream_has_space)(struct vsock_sock *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 85d232b..1d0e39c 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1013,7 +1013,7 @@ static int vsock_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
goto out;
}
 
-   err = transport->dgram_enqueue(vsk, remote_addr, msg->msg_iov, len);
+   err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
 
 out:
release_sock(sk);
@@ -1617,7 +1617,7 @@ static int vsock_stream_sendmsg(struct kiocb *kiocb, 
struct socket *sock,
 */
 
written = transport->stream_enqueue(
-   vsk, msg->msg_iov,
+   vsk, msg,
len - total_written);
if (written < 0) {
err = -ENOMEM;
@@ -1739,7 +1739,7 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
break;
 
read = transport->stream_dequeue(
-   vsk, msg->msg_iov,
+   vsk, msg,
len - copied, flags);
if (read < 0) {
err = -ENOMEM;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a57ddef..c1c0389 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1697,7 +1697,7 @@ static int vmci_transport_dgram_bind(struct vsock_sock 
*vsk,
 static int vmci_transport_dgram_enqueue(
struct vsock_sock *vsk,
struct sockaddr_vm *remote_addr,
-   struct iovec *iov,
+   struct msghdr *msg,
size_t len)
 {
int err;
@@ -1714,7 +1714,7 @@ static int vmci_transport_dgram_enqueue(
if (!dg)
return -ENOMEM;
 
-   memcpy_fromiovec(VMCI_DG_PAYLOAD(dg), iov, len);
+   memcpy_from_msg(VMCI_DG_PAYLOAD(dg), msg, len);
 
dg->dst = vmci_make_handle(remote_addr->svm_cid,
   remote_addr->svm_port);
@@ -1835,22 +1835,22 @@ static int vmci_transport_connect(struct vsock_sock 
*vsk)
 
 static ssize_t vmci_transport_stream_dequeue(
struct vsock_sock *vsk,
-   struct iovec *iov,
+   struct msghdr *msg,
size_t len,
int flags)
 {
if (flags & MSG_PEEK)
-   return vmci_qpair_peekv(vmci_trans(vsk)->qpair, iov, len, 0);
+   return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg->msg_iov, 
len, 0);
else
-   return vmci_qpair_dequev(vmci_trans(vsk)->qpair, iov, len, 0);
+   return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg->msg_iov, 
len, 0);
 }
 
 static ssize_t vmci_transport_stream_enqueue(
struct vsock_sock *vsk,
-   struct iovec *iov,
+   struct msghdr *msg,
size_t len)
 {
-   return vmci_qpair_enquev(vmci_trans(vsk)->qpair, iov, len, 0);
+   return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 }
 
 static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
-- 
1.7.10.4



[PATCH 15/17] [atm] switch vcc_sendmsg() to copy_from_iter()

2014-11-21 Thread Al Viro
... and make it handle multi-segment iovecs - deals with that
"fix this later" issue for free.  A bit of shame, really - it
had been there since 2.3.15pre3 when the whole thing went into the
tree, practically a historical artefact by now...

Signed-off-by: Al Viro 
---
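Reader's note (not part of the commit): the multi-segment handling described
above can be modelled in userspace. The sketch below is only a stand-in under
stated assumptions, not the kernel implementation. struct seg_iter,
seg_iter_init() and seg_iter_copy_from() are hypothetical names that mimic how
iov_iter_init() and copy_from_iter() let a single call consume data spread
over several iovec segments.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical userspace stand-in for the kernel's struct iov_iter. */
struct seg_iter {
	const struct iovec *iov;	/* current segment */
	size_t nr_segs;			/* segments left, including the current one */
	size_t iov_offset;		/* bytes already consumed from the current segment */
	size_t count;			/* total bytes left to consume */
};

void seg_iter_init(struct seg_iter *it, const struct iovec *iov,
		   size_t nr_segs, size_t count)
{
	it->iov = iov;
	it->nr_segs = nr_segs;
	it->iov_offset = 0;
	it->count = count;
}

/*
 * Copy up to len bytes into dst, walking across segment boundaries.
 * Returns the number of bytes copied and advances the iterator by that
 * much, so a caller can compare the result against len to detect a
 * short copy.
 */
size_t seg_iter_copy_from(void *dst, size_t len, struct seg_iter *it)
{
	char *out = dst;
	size_t done = 0;

	if (len > it->count)
		len = it->count;

	while (done < len && it->nr_segs > 0) {
		size_t avail = it->iov->iov_len - it->iov_offset;
		size_t n = len - done < avail ? len - done : avail;

		memcpy(out + done,
		       (const char *)it->iov->iov_base + it->iov_offset, n);
		done += n;
		it->iov_offset += n;
		it->count -= n;

		/* Segment exhausted: step to the next one. */
		if (it->iov_offset == it->iov->iov_len) {
			it->iov++;
			it->nr_segs--;
			it->iov_offset = 0;
		}
	}
	return done;
}
```

With two segments holding "abc" and "defg", one call asking for 7 bytes
fills the destination with "abcdefg". That is the case the old
m->msg_iovlen != 1 check rejected with -ENOSYS.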
 net/atm/common.c |   17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index 9cd1cca..f591129 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -570,15 +570,16 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, 
struct msghdr *msg,
 }
 
 int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
-   size_t total_len)
+   size_t size)
 {
struct sock *sk = sock->sk;
DEFINE_WAIT(wait);
struct atm_vcc *vcc;
struct sk_buff *skb;
int eff, error;
-   const void __user *buff;
-   int size;
+   struct iov_iter from;
+
+   iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, size);
 
lock_sock(sk);
if (sock->state != SS_CONNECTED) {
@@ -589,12 +590,6 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, 
struct msghdr *m,
error = -EISCONN;
goto out;
}
-   if (m->msg_iovlen != 1) {
-   error = -ENOSYS; /* fix this later @@@ */
-   goto out;
-   }
-   buff = m->msg_iov->iov_base;
-   size = m->msg_iov->iov_len;
vcc = ATM_SD(sock);
if (test_bit(ATM_VF_RELEASED, &vcc->flags) ||
test_bit(ATM_VF_CLOSE, &vcc->flags) ||
@@ -607,7 +602,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, 
struct msghdr *m,
error = 0;
goto out;
}
-   if (size < 0 || size > vcc->qos.txtp.max_sdu) {
+   if (size > vcc->qos.txtp.max_sdu) {
error = -EMSGSIZE;
goto out;
}
@@ -639,7 +634,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, 
struct msghdr *m,
goto out;
skb->dev = NULL; /* for paths shared with net_device interfaces */
ATM_SKB(skb)->atm_options = vcc->atm_options;
-   if (copy_from_user(skb_put(skb, size), buff, size)) {
+   if (copy_from_iter(skb_put(skb, size), size, &from) != size) {
kfree_skb(skb);
error = -EFAULT;
goto out;
-- 
1.7.10.4



[PATCH 13/17] tipc_msg_build(): pass msghdr instead of its ->msg_iov

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 net/tipc/msg.c|8 
 net/tipc/msg.h|2 +-
 net/tipc/socket.c |7 +++
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index ec18076..9155496 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -162,14 +162,14 @@ err:
 /**
  * tipc_msg_build - create buffer chain containing specified header and data
  * @mhdr: Message header, to be prepended to data
- * @iov: User data
+ * @m: User message
  * @offset: Posision in iov to start copying from
  * @dsz: Total length of user data
  * @pktmax: Max packet size that can be used
  * @chain: Buffer or chain of buffers to be returned to caller
  * Returns message data size or errno: -ENOMEM, -EFAULT
  */
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
   int offset, int dsz, int pktmax , struct sk_buff **chain)
 {
int mhsz = msg_hdr_sz(mhdr);
@@ -194,7 +194,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec 
const *iov,
skb_copy_to_linear_data(buf, mhdr, mhsz);
pktpos = buf->data + mhsz;
TIPC_SKB_CB(buf)->chain_sz = 1;
-   if (!dsz || !memcpy_fromiovecend(pktpos, iov, offset, dsz))
+   if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iov, offset, 
dsz))
return dsz;
rc = -EFAULT;
goto error;
@@ -223,7 +223,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec 
const *iov,
if (drem < pktrem)
pktrem = drem;
 
-   if (memcpy_fromiovecend(pktpos, iov, offset, pktrem)) {
+   if (memcpy_fromiovecend(pktpos, m->msg_iov, offset, pktrem)) {
rc = -EFAULT;
goto error;
}
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 0ea7b69..d7d2ba2 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -743,7 +743,7 @@ bool tipc_msg_bundle(struct sk_buff *bbuf, struct sk_buff 
*buf, u32 mtu);
 
 bool tipc_msg_make_bundle(struct sk_buff **buf, u32 mtu, u32 dnode);
 
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
   int offset, int dsz, int mtu , struct sk_buff **chain);
 
 struct sk_buff *tipc_msg_reassemble(struct sk_buff *chain);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 8c94ec4..7ad2a93 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -719,7 +719,7 @@ static int tipc_sendmcast(struct  socket *sock, struct 
tipc_name_seq *seq,
 
 new_mtu:
mtu = tipc_bclink_get_mtu();
-   rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, );
+   rc = tipc_msg_build(mhdr, msg, 0, dsz, mtu, );
if (unlikely(rc < 0))
return rc;
 
@@ -897,7 +897,6 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket 
*sock,
struct sock *sk = sock->sk;
struct tipc_sock *tsk = tipc_sk(sk);
struct tipc_msg *mhdr = &tsk->phdr;
-   struct iovec *iov = m->msg_iov;
u32 dnode, dport;
struct sk_buff *buf;
struct tipc_name_seq *seq = >addr.nameseq;
@@ -974,7 +973,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket 
*sock,
 
 new_mtu:
mtu = tipc_node_get_mtu(dnode, tsk->ref);
-   rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, &buf);
+   rc = tipc_msg_build(mhdr, m, 0, dsz, mtu, &buf);
if (rc < 0)
goto exit;
 
@@ -1086,7 +1085,7 @@ static int tipc_send_stream(struct kiocb *iocb, struct 
socket *sock,
 next:
mtu = tsk->max_pkt;
send = min_t(uint, dsz - sent, TIPC_MAX_USER_MSG_SIZE);
-   rc = tipc_msg_build(mhdr, m->msg_iov, sent, send, mtu, );
+   rc = tipc_msg_build(mhdr, m, sent, send, mtu, );
if (unlikely(rc < 0))
goto exit;
do {
-- 
1.7.10.4



[PATCH 09/17] kill zerocopy_sg_from_iovec()

2014-11-21 Thread Al Viro
no users left

Signed-off-by: Al Viro 
---
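Reader's note (not part of the commit): the page-count expression in the
removed zerocopy_sg_from_iovec(),
((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT, is a round-up
division of "offset into the first page plus length" by the page size. The
sketch below works that arithmetic in userspace. It assumes 4 KiB pages, and
the DEMO_* macros and both function names are hypothetical.

```c
#include <assert.h>

/* Assumed page geometry for the sketch: 4 KiB pages. */
#define DEMO_PAGE_SHIFT	12UL
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)
#define DEMO_PAGE_MASK	(~(DEMO_PAGE_SIZE - 1))

/* Shift form, mirroring the expression in the removed function. */
unsigned long pages_spanned(unsigned long base, unsigned long len)
{
	unsigned long off = base & ~DEMO_PAGE_MASK;	/* offset inside the first page */

	/* ~DEMO_PAGE_MASK equals DEMO_PAGE_SIZE - 1, so this rounds up. */
	return (off + len + ~DEMO_PAGE_MASK) >> DEMO_PAGE_SHIFT;
}

/* Same quantity written as an explicit round-up division. */
unsigned long pages_spanned_div(unsigned long base, unsigned long len)
{
	unsigned long off = base & (DEMO_PAGE_SIZE - 1);

	return (off + len + DEMO_PAGE_SIZE - 1) / DEMO_PAGE_SIZE;
}
```

A 32-byte buffer starting 16 bytes before a page boundary touches two pages,
while a page-aligned 4096-byte buffer touches exactly one.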
 include/linux/skbuff.h |2 --
 net/core/datagram.c|   65 ++--
 2 files changed, 2 insertions(+), 65 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ce69d48..fa11bbd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2661,8 +2661,6 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int 
offset,
 int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 struct iov_iter *from, int len);
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
-  int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
   struct iov_iter *to, int size);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 34d82f8..5f8b90d 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -644,76 +644,15 @@ fault:
 EXPORT_SYMBOL(skb_copy_datagram_from_iter);
 
 /**
- * zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
+ * zerocopy_sg_from_iter - Build a zerocopy datagram from an iov_iter
  * @skb: buffer to copy
- * @from: io vector to copy from
- * @offset: offset in the io vector to start copying from
- * @count: amount of vectors to copy to buffer from
+ * @from: the source to copy from
  *
  * The function will first copy up to headlen, and then pin the userspace
  * pages and build frags through them.
  *
  * Returns 0, -EFAULT or -EMSGSIZE.
- * Note: the iovec is not modified during the copy
  */
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
- int offset, size_t count)
-{
-   int len = iov_length(from, count) - offset;
-   int copy = min_t(int, skb_headlen(skb), len);
-   int size;
-   int i = 0;
-
-   /* copy up to skb headlen */
-   if (skb_copy_datagram_from_iovec(skb, 0, from, offset, copy))
-   return -EFAULT;
-
-   if (len == copy)
-   return 0;
-
-   offset += copy;
-   while (count--) {
-   struct page *page[MAX_SKB_FRAGS];
-   int num_pages;
-   unsigned long base;
-   unsigned long truesize;
-
-   /* Skip over from offset and copied */
-   if (offset >= from->iov_len) {
-   offset -= from->iov_len;
-   ++from;
-   continue;
-   }
-   len = from->iov_len - offset;
-   base = (unsigned long)from->iov_base + offset;
-   size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
-   if (i + size > MAX_SKB_FRAGS)
-   return -EMSGSIZE;
-   num_pages = get_user_pages_fast(base, size, 0, &page[i]);
-   if (num_pages != size) {
-   release_pages(&page[i], num_pages, 0);
-   return -EFAULT;
-   }
-   truesize = size * PAGE_SIZE;
-   skb->data_len += len;
-   skb->len += len;
-   skb->truesize += truesize;
-   atomic_add(truesize, &skb->sk->sk_wmem_alloc);
-   while (len) {
-   int off = base & ~PAGE_MASK;
-   int size = min_t(int, len, PAGE_SIZE - off);
-   skb_fill_page_desc(skb, i, page[i], off, size);
-   base += size;
-   len -= size;
-   i++;
-   }
-   offset = 0;
-   ++from;
-   }
-   return 0;
-}
-EXPORT_SYMBOL(zerocopy_sg_from_iovec);
-
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
 {
int len = iov_iter_count(from);
-- 
1.7.10.4



[PATCH 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()

2014-11-21 Thread Al Viro
... and kill skb_copy_datagram_from_iovec()

Signed-off-by: Al Viro 
---
 include/linux/skbuff.h |3 --
 net/core/datagram.c|   88 ++--
 net/packet/af_packet.c |   11 --
 net/unix/af_unix.c |   11 --
 4 files changed, 18 insertions(+), 95 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fa11bbd..67f659e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2656,9 +2656,6 @@ static inline int skb_copy_and_csum_datagram_msg(struct 
sk_buff *skb, int hlen,
 {
return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
 }
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-const struct iovec *from, int from_offset,
-int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 struct iov_iter *from, int len);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 5f8b90d..836f76c 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -480,98 +480,14 @@ short_copy:
 EXPORT_SYMBOL(skb_copy_datagram_iter);
 
 /**
- * skb_copy_datagram_from_iovec - Copy a datagram from an iovec.
+ * skb_copy_datagram_from_iter - Copy a datagram from an iov_iter.
  * @skb: buffer to copy
  * @offset: offset in the buffer to start copying to
- * @from: io vector to copy to
- * @from_offset: offset in the io vector to start copying from
+ * @from: the copy source
  * @len: amount of data to copy to buffer from iovec
  *
  * Returns 0 or -EFAULT.
- * Note: the iovec is not modified during the copy.
  */
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-const struct iovec *from, int from_offset,
-int len)
-{
-   int start = skb_headlen(skb);
-   int i, copy = start - offset;
-   struct sk_buff *frag_iter;
-
-   /* Copy header. */
-   if (copy > 0) {
-   if (copy > len)
-   copy = len;
-   if (memcpy_fromiovecend(skb->data + offset, from, from_offset,
-   copy))
-   goto fault;
-   if ((len -= copy) == 0)
-   return 0;
-   offset += copy;
-   from_offset += copy;
-   }
-
-   /* Copy paged appendix. Hmm... why does this look so complicated? */
-   for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-   int end;
-   const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-   WARN_ON(start > offset + len);
-
-   end = start + skb_frag_size(frag);
-   if ((copy = end - offset) > 0) {
-   int err;
-   u8  *vaddr;
-   struct page *page = skb_frag_page(frag);
-
-   if (copy > len)
-   copy = len;
-   vaddr = kmap(page);
-   err = memcpy_fromiovecend(vaddr + frag->page_offset +
- offset - start,
- from, from_offset, copy);
-   kunmap(page);
-   if (err)
-   goto fault;
-
-   if (!(len -= copy))
-   return 0;
-   offset += copy;
-   from_offset += copy;
-   }
-   start = end;
-   }
-
-   skb_walk_frags(skb, frag_iter) {
-   int end;
-
-   WARN_ON(start > offset + len);
-
-   end = start + frag_iter->len;
-   if ((copy = end - offset) > 0) {
-   if (copy > len)
-   copy = len;
-   if (skb_copy_datagram_from_iovec(frag_iter,
-offset - start,
-from,
-from_offset,
-copy))
-   goto fault;
-   if ((len -= copy) == 0)
-   return 0;
-   offset += copy;
-   from_offset += copy;
-   }
-   start = end;
-   }
-   if (!len)
-   return 0;
-
-fault:
-   return -EFAULT;
-}
-EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
-
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 struct iov_iter *from,
 int len)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 692f049..83bae39 

[PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 include/linux/skbuff.h |3 ++
 net/core/datagram.c|  115 
 2 files changed, 118 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 18ce42e..ce69d48 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2659,10 +2659,13 @@ static inline int skb_copy_and_csum_datagram_msg(struct 
sk_buff *skb, int hlen,
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 const struct iovec *from, int from_offset,
 int len);
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+struct iov_iter *from, int len);
 int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
   int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
   struct iov_iter *to, int size);
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
 int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int 
flags);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 26391a3..34d82f8 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -572,6 +572,77 @@ fault:
 }
 EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
 
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+struct iov_iter *from,
+int len)
+{
+   int start = skb_headlen(skb);
+   int i, copy = start - offset;
+   struct sk_buff *frag_iter;
+
+   /* Copy header. */
+   if (copy > 0) {
+   if (copy > len)
+   copy = len;
+   if (copy_from_iter(skb->data + offset, copy, from) != copy)
+   goto fault;
+   if ((len -= copy) == 0)
+   return 0;
+   offset += copy;
+   }
+
+   /* Copy paged appendix. Hmm... why does this look so complicated? */
+   for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+   int end;
+   const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+   WARN_ON(start > offset + len);
+
+   end = start + skb_frag_size(frag);
+   if ((copy = end - offset) > 0) {
+   size_t copied;
+   if (copy > len)
+   copy = len;
+   copied = copy_page_from_iter(skb_frag_page(frag),
+ frag->page_offset + offset - start,
+ copy, from);
+   if (copied != copy)
+   goto fault;
+
+   if (!(len -= copy))
+   return 0;
+   offset += copy;
+   }
+   start = end;
+   }
+
+   skb_walk_frags(skb, frag_iter) {
+   int end;
+
+   WARN_ON(start > offset + len);
+
+   end = start + frag_iter->len;
+   if ((copy = end - offset) > 0) {
+   if (copy > len)
+   copy = len;
+   if (skb_copy_datagram_from_iter(frag_iter,
+   offset - start,
+   from, copy))
+   goto fault;
+   if ((len -= copy) == 0)
+   return 0;
+   offset += copy;
+   }
+   start = end;
+   }
+   if (!len)
+   return 0;
+
+fault:
+   return -EFAULT;
+}
+EXPORT_SYMBOL(skb_copy_datagram_from_iter);
+
 /**
  * zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
  * @skb: buffer to copy
@@ -643,6 +714,50 @@ int zerocopy_sg_from_iovec(struct sk_buff *skb, const 
struct iovec *from,
 }
 EXPORT_SYMBOL(zerocopy_sg_from_iovec);
 
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
+{
+   int len = iov_iter_count(from);
+   int copy = min_t(int, skb_headlen(skb), len);
+   int i = 0;
+
+   /* copy up to skb headlen */
+   if (skb_copy_datagram_from_iter(skb, 0, from, copy))
+   return -EFAULT;
+
+   while (iov_iter_count(from)) {
+   struct page *pages[MAX_SKB_FRAGS];
+   size_t start;
+   ssize_t copied;
+   unsigned long truesize;
+   int n = 0;
+
+   copied = iov_iter_get_pages(from, pages, ~0U, MAX_SKB_FRAGS, 
&start);
+   if (copied < 0)
+   return -EFAULT;
+
+   truesize = DIV_ROUND_UP(copied + start, PAGE_SIZE) * 

[PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter

2014-11-21 Thread Al Viro
allows switching macvtap and tun from ->aio_write() to ->write_iter()

Signed-off-by: Al Viro 
---
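Reader's note (not part of the commit): two iterator idioms show up in the
macvtap hunks below. The first copies a fixed-size header and then advances
past any extra on-wire header bytes. The second takes a by-value copy of the
iterator ("i = *from") to look ahead without consuming the caller's position.
The userspace sketch below models both under simplifying assumptions: it uses
a single flat buffer, and struct flat_iter and every flat_*() function are
hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical single-buffer stand-in for struct iov_iter. */
struct flat_iter {
	const unsigned char *pos;	/* next byte to consume */
	size_t count;			/* bytes left */
};

/* Copy up to len bytes and consume them; returns the bytes copied. */
size_t flat_copy_from(void *dst, size_t len, struct flat_iter *it)
{
	if (len > it->count)
		len = it->count;
	memcpy(dst, it->pos, len);
	it->pos += len;
	it->count -= len;
	return len;
}

/* Consume up to len bytes without copying them anywhere. */
void flat_advance(struct flat_iter *it, size_t len)
{
	if (len > it->count)
		len = it->count;
	it->pos += len;
	it->count -= len;
}

/*
 * Bytes that would remain after skipping the first 'linear' bytes.
 * Works on a by-value snapshot, so *it is left untouched.
 */
size_t flat_peek_remaining(const struct flat_iter *it, size_t linear)
{
	struct flat_iter snap = *it;

	flat_advance(&snap, linear);
	return snap.count;
}

/*
 * Read a fixed hdr_size-byte header out of a wire_len-byte on-wire
 * header, then skip the padding. Returns 0 on success, -1 on bad input.
 */
int flat_read_header(struct flat_iter *it, void *hdr, size_t hdr_size,
		     size_t wire_len)
{
	if (wire_len < hdr_size || it->count < wire_len)
		return -1;
	if (flat_copy_from(hdr, hdr_size, it) != hdr_size)
		return -1;
	flat_advance(it, wire_len - hdr_size);
	return 0;
}
```

The snapshot keeps the look-ahead free of side effects: the later copy still
starts from the position the header read left behind.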
 drivers/net/macvtap.c |   43 ---
 drivers/net/tun.c |   43 +++
 2 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index cdd820f..2bf08c6 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -640,12 +640,12 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff 
*skb,
 
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
-   const struct iovec *iv, unsigned long total_len,
-   size_t count, int noblock)
+   struct iov_iter *from, int noblock)
 {
int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
struct sk_buff *skb;
struct macvlan_dev *vlan;
+   unsigned long total_len = iov_iter_count(from);
unsigned long len = total_len;
int err;
struct virtio_net_hdr vnet_hdr = { 0 };
@@ -653,6 +653,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
int copylen = 0;
bool zerocopy = false;
size_t linear;
+   ssize_t n;
 
if (q->flags & IFF_VNET_HDR) {
vnet_hdr_len = q->vnet_hdr_sz;
@@ -662,10 +663,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
goto err;
len -= vnet_hdr_len;
 
-   err = memcpy_fromiovecend((void *)&vnet_hdr, iv, 0,
-  sizeof(vnet_hdr));
-   if (err < 0)
+   err = -EFAULT;
+   n = copy_from_iter(&vnet_hdr, sizeof(vnet_hdr), from);
+   if (n != sizeof(vnet_hdr))
goto err;
+   iov_iter_advance(from, vnet_hdr_len - sizeof(vnet_hdr));
if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
 vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
vnet_hdr.hdr_len)
@@ -680,17 +682,15 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
if (unlikely(len < ETH_HLEN))
goto err;
 
-   err = -EMSGSIZE;
-   if (unlikely(count > UIO_MAXIOV))
-   goto err;
-
if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
+   struct iov_iter i;
copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
if (copylen > good_linear)
copylen = good_linear;
linear = copylen;
-   if (iov_pages(iv, vnet_hdr_len + copylen, count)
-   <= MAX_SKB_FRAGS)
+   i = *from;
+   iov_iter_advance(&i, copylen);
+   if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
zerocopy = true;
}
 
@@ -708,10 +708,9 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, 
struct msghdr *m,
goto err;
 
if (zerocopy)
-   err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
+   err = zerocopy_sg_from_iter(skb, from);
else {
-   err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
-  len);
+   err = skb_copy_datagram_from_iter(skb, 0, from, len);
if (!err && m && m->msg_control) {
struct ubuf_info *uarg = m->msg_control;
uarg->callback(uarg, false);
@@ -764,16 +763,12 @@ err:
return err;
 }
 
-static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
-unsigned long count, loff_t pos)
+static ssize_t macvtap_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
struct file *file = iocb->ki_filp;
-   ssize_t result = -ENOLINK;
struct macvtap_queue *q = file->private_data;
 
-   result = macvtap_get_user(q, NULL, iv, iov_length(iv, count), count,
- file->f_flags & O_NONBLOCK);
-   return result;
+   return macvtap_get_user(q, NULL, from, file->f_flags & O_NONBLOCK);
 }
 
 /* Put packet to the user space buffer */
@@ -1079,8 +1074,9 @@ static const struct file_operations macvtap_fops = {
.open   = macvtap_open,
.release= macvtap_release,
.read   = new_sync_read,
+   .write  = new_sync_write,
.read_iter  = macvtap_read_iter,
-   .aio_write  = macvtap_aio_write,
+   .write_iter = macvtap_write_iter,
.poll   = macvtap_poll,
.llseek = no_llseek,
.unlocked_ioctl = macvtap_ioctl,
@@ -1093,8 +1089,9 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct 

[PATCH 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 include/net/sctp/structs.h |6 +++---
 net/sctp/chunk.c   |9 -
 net/sctp/sm_make_chunk.c   |   16 
 net/sctp/socket.c  |5 -
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 806e3b5..2bb2fcf 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -531,7 +531,7 @@ struct sctp_datamsg {
 
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *,
struct sctp_sndrcvinfo *,
-   struct msghdr *, int len);
+   struct iov_iter *);
 void sctp_datamsg_free(struct sctp_datamsg *);
 void sctp_datamsg_put(struct sctp_datamsg *);
 void sctp_chunk_fail(struct sctp_chunk *, int error);
@@ -647,8 +647,8 @@ struct sctp_chunk {
 
 void sctp_chunk_hold(struct sctp_chunk *);
 void sctp_chunk_put(struct sctp_chunk *);
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
- struct iovec *data);
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+ struct iov_iter *from);
 void sctp_chunk_free(struct sctp_chunk *);
 void  *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
 struct sctp_chunk *sctp_chunkify(struct sk_buff *,
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 158701d..a338091 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -164,7 +164,7 @@ static void sctp_datamsg_assign(struct sctp_datamsg *msg, 
struct sctp_chunk *chu
  */
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
struct sctp_sndrcvinfo *sinfo,
-   struct msghdr *msgh, int msg_len)
+   struct iov_iter *from)
 {
int max, whole, i, offset, over, err;
int len, first_len;
@@ -172,6 +172,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
sctp_association *asoc,
struct sctp_chunk *chunk;
struct sctp_datamsg *msg;
struct list_head *pos, *temp;
+   size_t msg_len = iov_iter_count(from);
__u8 frag;
 
msg = sctp_datamsg_new(GFP_KERNEL);
@@ -279,12 +280,10 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
sctp_association *asoc,
goto errout;
}
 
-   err = sctp_user_addto_chunk(chunk, offset, len, msgh->msg_iov);
+   err = sctp_user_addto_chunk(chunk, len, from);
if (err < 0)
goto errout_chunk_free;
 
-   offset += len;
-
/* Put the chunk->skb back into the form expected by send.  */
__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
   - (__u8 *)chunk->skb->data);
@@ -317,7 +316,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct 
sctp_association *asoc,
goto errout;
}
 
-   err = sctp_user_addto_chunk(chunk, offset, over, msgh->msg_iov);
+   err = sctp_user_addto_chunk(chunk, over, from);
 
/* Put the chunk->skb back into the form expected by send.  */
__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index e49bcce..e49e231 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1491,26 +1491,26 @@ static void *sctp_addto_chunk_fixed(struct sctp_chunk 
*chunk,
  * chunk is not big enough.
  * Returns a kernel err value.
  */
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
- struct iovec *data)
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+ struct iov_iter *from)
 {
-   __u8 *target;
-   int err = 0;
+   void *target;
+   ssize_t copied;
 
/* Make room in chunk for data.  */
target = skb_put(chunk->skb, len);
 
/* Copy data (whole iovec) into chunk */
-   if ((err = memcpy_fromiovecend(target, data, off, len)))
-   goto out;
+   copied = copy_from_iter(target, len, from);
+   if (copied != len)
+   return -EFAULT;
 
/* Adjust the chunk length field.  */
chunk->chunk_hdr->length =
htons(ntohs(chunk->chunk_hdr->length) + len);
chunk->chunk_end = skb_tail_pointer(chunk->skb);
 
-out:
-   return err;
+   return 0;
 }
 
 /* Helper function to assign a TSN if needed.  This assumes that both
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 2120292..7e866b7 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1609,6 +1609,9 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock 
*sk,
__u16 sinfo_flags = 0;
long timeo;
int err;
+   struct iov_iter 

[PATCH 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 net/tipc/socket.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 591bbfa..8c94ec4 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -692,7 +692,7 @@ static unsigned int tipc_poll(struct file *file, struct 
socket *sock,
  * tipc_sendmcast - send multicast message
  * @sock: socket structure
  * @seq: destination address
- * @iov: message data to send
+ * @msg: message to send
  * @dsz: total length of message data
  * @timeo: timeout to wait for wakeup
  *
@@ -700,7 +700,7 @@ static unsigned int tipc_poll(struct file *file, struct 
socket *sock,
  * Returns the number of bytes sent on success, or errno
  */
 static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
- struct iovec *iov, size_t dsz, long timeo)
+ struct msghdr *msg, size_t dsz, long timeo)
 {
struct sock *sk = sock->sk;
struct tipc_msg *mhdr = &tipc_sk(sk)->phdr;
@@ -719,7 +719,7 @@ static int tipc_sendmcast(struct  socket *sock, struct 
tipc_name_seq *seq,
 
 new_mtu:
mtu = tipc_bclink_get_mtu();
-   rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, );
+   rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, );
if (unlikely(rc < 0))
return rc;
 
@@ -943,7 +943,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket 
*sock,
timeo = sock_sndtimeo(sk, m->msg_flags & MSG_DONTWAIT);
 
if (dest->addrtype == TIPC_ADDR_MCAST) {
-   rc = tipc_sendmcast(sock, seq, iov, dsz, timeo);
+   rc = tipc_sendmcast(sock, seq, m, dsz, timeo);
goto exit;
} else if (dest->addrtype == TIPC_ADDR_NAME) {
u32 type = dest->addr.name.name.type;
-- 
1.7.10.4



[PATCH 06/17] switch macvtap to ->read_iter()

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
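Reader's note (not part of the commit): the macvtap_recvmsg() context at the
end of this patch reports truncation when the packet was longer than the
caller's buffer. It sets MSG_TRUNC in the message flags and returns the full
length only if the caller itself passed MSG_TRUNC. The sketch below isolates
that decision in userspace. report_read_len() is a hypothetical helper, and
the ret > 0 guard is an addition of the sketch, not something taken from the
kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * ret:        bytes the packet actually needed (or a negative error)
 * total_len:  size of the caller's buffer
 * flags:      flags the caller passed to the receive call
 * msg_flags:  output flags reported back to the caller
 */
ssize_t report_read_len(ssize_t ret, size_t total_len, int flags,
			int *msg_flags)
{
	/* Guard against negative errors before the unsigned comparison. */
	if (ret > 0 && (size_t)ret > total_len) {
		*msg_flags |= MSG_TRUNC;
		ret = (flags & MSG_TRUNC) ? ret : (ssize_t)total_len;
	}
	return ret;
}
```

A 100-byte packet read into a 60-byte buffer therefore reports 60 by default
and 100 when the caller asked for the real length with MSG_TRUNC.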
 drivers/net/macvtap.c |   39 ---
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index cea99d4..cdd820f 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -829,16 +829,17 @@ done:
 }
 
 static ssize_t macvtap_do_read(struct macvtap_queue *q,
-  const struct iovec *iv, unsigned long segs,
-  unsigned long len,
+  struct iov_iter *to,
   int noblock)
 {
DEFINE_WAIT(wait);
struct sk_buff *skb;
ssize_t ret = 0;
-   struct iov_iter iter;
 
-   while (len) {
+   if (!iov_iter_count(to))
+   return 0;
+
+   while (1) {
if (!noblock)
prepare_to_wait(sk_sleep(&q->sk), &wait,
TASK_INTERRUPTIBLE);
@@ -856,37 +857,27 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q,
}
/* Nothing to read, let's sleep */
schedule();
-   continue;
}
-   iov_iter_init(&iter, READ, iv, segs, len);
-   ret = macvtap_put_user(q, skb, &iter);
+   }
+   if (skb) {
+   ret = macvtap_put_user(q, skb, to);
kfree_skb(skb);
-   break;
}
-
if (!noblock)
finish_wait(sk_sleep(&q->sk), &wait);
return ret;
 }
 
-static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
-   unsigned long count, loff_t pos)
+static ssize_t macvtap_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
struct file *file = iocb->ki_filp;
struct macvtap_queue *q = file->private_data;
-   ssize_t len, ret = 0;
+   ssize_t len = iov_iter_count(to), ret;
 
-   len = iov_length(iv, count);
-   if (len < 0) {
-   ret = -EINVAL;
-   goto out;
-   }
-
-   ret = macvtap_do_read(q, iv, count, len, file->f_flags & O_NONBLOCK);
+   ret = macvtap_do_read(q, to, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
if (ret > 0)
iocb->ki_pos = ret;
-out:
return ret;
 }
 
@@ -1087,7 +1078,8 @@ static const struct file_operations macvtap_fops = {
.owner  = THIS_MODULE,
.open   = macvtap_open,
.release= macvtap_release,
-   .aio_read   = macvtap_aio_read,
+   .read   = new_sync_read,
+   .read_iter  = macvtap_read_iter,
.aio_write  = macvtap_aio_write,
.poll   = macvtap_poll,
.llseek = no_llseek,
@@ -1110,11 +1102,12 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct 
socket *sock,
   int flags)
 {
struct macvtap_queue *q = container_of(sock, struct macvtap_queue, 
sock);
+   struct iov_iter to;
int ret;
if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
return -EINVAL;
-   ret = macvtap_do_read(q, m->msg_iov, m->msg_iovlen, total_len,
- flags & MSG_DONTWAIT);
+   iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+   ret = macvtap_do_read(q, &to, flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
ret = flags & MSG_TRUNC ? ret : total_len;
-- 
1.7.10.4



[PATCH 05/17] switch drivers/net/tun.c to ->read_iter()

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 drivers/net/tun.c |   40 +++-
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index ac53a73..405dfdf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1339,18 +1339,17 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-  const struct iovec *iv, unsigned long segs,
-  ssize_t len, int noblock)
+  struct iov_iter *to,
+  int noblock)
 {
struct sk_buff *skb;
-   ssize_t ret = 0;
+   ssize_t ret;
int peeked, err, off = 0;
-   struct iov_iter iter;
 
tun_debug(KERN_INFO, tun, "tun_do_read\n");
 
-   if (!len)
-   return ret;
+   if (!iov_iter_count(to))
+   return 0;
 
if (tun->dev->reg_state != NETREG_REGISTERED)
return -EIO;
@@ -1359,37 +1358,27 @@ static ssize_t tun_do_read(struct tun_struct *tun, 
struct tun_file *tfile,
skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
  &peeked, &off, &err);
if (!skb)
-   return ret;
+   return 0;
 
-   iov_iter_init(&iter, READ, iv, segs, len);
-   ret = tun_put_user(tun, tfile, skb, &iter);
+   ret = tun_put_user(tun, tfile, skb, to);
kfree_skb(skb);
 
return ret;
 }
 
-static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
-   unsigned long count, loff_t pos)
+static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
struct file *file = iocb->ki_filp;
struct tun_file *tfile = file->private_data;
struct tun_struct *tun = __tun_get(tfile);
-   ssize_t len, ret;
+   ssize_t len = iov_iter_count(to), ret;
 
if (!tun)
return -EBADFD;
-   len = iov_length(iv, count);
-   if (len < 0) {
-   ret = -EINVAL;
-   goto out;
-   }
-
-   ret = tun_do_read(tun, tfile, iv, count, len,
- file->f_flags & O_NONBLOCK);
+   ret = tun_do_read(tun, tfile, to, file->f_flags & O_NONBLOCK);
ret = min_t(ssize_t, ret, len);
if (ret > 0)
iocb->ki_pos = ret;
-out:
tun_put(tun);
return ret;
 }
@@ -1471,6 +1460,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket 
*sock,
 {
struct tun_file *tfile = container_of(sock, struct tun_file, socket);
struct tun_struct *tun = __tun_get(tfile);
+   struct iov_iter to;
int ret;
 
if (!tun)
@@ -1485,8 +1475,8 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket 
*sock,
 SOL_PACKET, TUN_TX_TIMESTAMP);
goto out;
}
-   ret = tun_do_read(tun, tfile, m->msg_iov, m->msg_iovlen, total_len,
- flags & MSG_DONTWAIT);
+   iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+   ret = tun_do_read(tun, tfile, &to, flags & MSG_DONTWAIT);
if (ret > total_len) {
m->msg_flags |= MSG_TRUNC;
ret = flags & MSG_TRUNC ? ret : total_len;
@@ -2242,8 +2232,8 @@ static int tun_chr_show_fdinfo(struct seq_file *m, struct 
file *f)
 static const struct file_operations tun_fops = {
.owner  = THIS_MODULE,
.llseek = no_llseek,
-   .read  = do_sync_read,
-   .aio_read  = tun_chr_aio_read,
+   .read  = new_sync_read,
+   .read_iter  = tun_chr_read_iter,
.write = do_sync_write,
.aio_write = tun_chr_aio_write,
.poll   = tun_chr_poll,
-- 
1.7.10.4



[PATCH 04/17] new helper: memcpy_to_msg()

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
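
For readers following the series, here is a minimal userspace sketch of what
the new helper amounts to. It is not the kernel code: struct msghdr_sketch,
copy_to_iovec() and memcpy_to_msg_sketch() are hypothetical names, and plain
memcpy() stands in for copy_to_user() so that the example builds outside the
kernel. It only illustrates that callers hand over the msghdr and stop
reaching into ->msg_iov themselves.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>	/* struct iovec */

/* Hypothetical stand-in for struct msghdr; only the field used here. */
struct msghdr_sketch {
	struct iovec *msg_iov;
};

/*
 * Assumed model of memcpy_toiovec(): copy len bytes into the iovec array,
 * consuming each segment as it fills.  The caller must make sure the
 * segments hold at least len bytes in total.
 */
static int copy_to_iovec(struct iovec *iov, const void *data, size_t len)
{
	const unsigned char *src = data;

	while (len > 0) {
		if (iov->iov_len) {
			size_t copy = iov->iov_len < len ? iov->iov_len : len;

			memcpy(iov->iov_base, src, copy);
			src += copy;
			len -= copy;
			iov->iov_len -= copy;
			iov->iov_base = (unsigned char *)iov->iov_base + copy;
		}
		iov++;
	}
	return 0;
}

/* The wrapper: takes the msghdr, forwards its iovec to the copy routine. */
static inline int memcpy_to_msg_sketch(struct msghdr_sketch *msg,
				       const void *data, size_t len)
{
	return copy_to_iovec(msg->msg_iov, data, len);
}
```

With the wrapper in place, a later change to how the msghdr stores its
destination would only need to touch the helper, not every caller.
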
 crypto/algif_hash.c|2 +-
 include/linux/skbuff.h |5 +
 net/caif/caif_socket.c |2 +-
 net/can/bcm.c  |2 +-
 net/can/raw.c  |2 +-
 net/decnet/af_decnet.c |2 +-
 net/ipv4/tcp.c |2 +-
 net/irda/af_irda.c |2 +-
 net/packet/af_packet.c |3 +--
 9 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 8502462..35c93ff 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -174,7 +174,7 @@ static int hash_recvmsg(struct kiocb *unused, struct socket 
*sock,
goto unlock;
}
 
-   err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
+   err = memcpy_to_msg(msg, ctx->result, len);
 
 unlock:
release_sock(sk);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cb7fa2b..18ce42e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2689,6 +2689,11 @@ static inline int memcpy_from_msg(void *data, struct 
msghdr *msg, int len)
return memcpy_fromiovec(data, msg->msg_iov, len);
 }
 
+static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
+{
+   return memcpy_toiovec(msg->msg_iov, data, len);
+}
+
 struct skb_checksum_ops {
__wsum (*update)(const void *mem, int len, __wsum wsum);
__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 230f140..ac618b0 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -418,7 +418,7 @@ unlock:
}
release_sock(sk);
chunk = min_t(unsigned int, skb->len, size);
-   if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+   if (memcpy_to_msg(msg, skb->data, chunk)) {
skb_queue_head(&sk->sk_receive_queue, skb);
if (copied == 0)
copied = -EFAULT;
diff --git a/net/can/bcm.c b/net/can/bcm.c
index b9a1f71..0167118 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1555,7 +1555,7 @@ static int bcm_recvmsg(struct kiocb *iocb, struct socket 
*sock,
if (skb->len < size)
size = skb->len;
 
-   err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+   err = memcpy_to_msg(msg, skb->data, size);
if (err < 0) {
skb_free_datagram(sk, skb);
return err;
diff --git a/net/can/raw.c b/net/can/raw.c
index 0e4004f..dfdcffb 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -750,7 +750,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct socket 
*sock,
else
size = skb->len;
 
-   err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+   err = memcpy_to_msg(msg, skb->data, size);
if (err < 0) {
skb_free_datagram(sk, skb);
return err;
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index e2e2e3c..8102286 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -1760,7 +1760,7 @@ static int dn_recvmsg(struct kiocb *iocb, struct socket 
*sock,
if ((chunk + copied) > size)
chunk = size - copied;
 
-   if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+   if (memcpy_to_msg(msg, skb->data, chunk)) {
rv = -EFAULT;
break;
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c239f47..435443b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1349,7 +1349,7 @@ static int tcp_recv_urg(struct sock *sk, struct msghdr 
*msg, int len, int flags)
 
if (len > 0) {
if (!(flags & MSG_TRUNC))
-   err = memcpy_toiovec(msg->msg_iov, &c, 1);
+   err = memcpy_to_msg(msg, &c, 1);
len = 1;
} else
msg->msg_flags |= MSG_TRUNC;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 9052462..568edc7 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1466,7 +1466,7 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct 
socket *sock,
}
 
chunk = min_t(unsigned int, skb->len, size);
-   if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+   if (memcpy_to_msg(msg, skb->data, chunk)) {
skb_queue_head(&sk->sk_receive_queue, skb);
if (copied == 0)
copied = -EFAULT;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index be79208..692f049 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2936,8 +2936,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct 
socket *sock,
vnet_hdr.flags = VIRTIO_NET_HDR_F_DATA_VALID;
} /* else everything is zero */
 
-   err 

[PATCH 03/17] switch ipxrtr_route_packet() from iovec to msghdr

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 include/net/ipx.h   |2 +-
 net/ipx/af_ipx.c|3 +--
 net/ipx/ipx_route.c |4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/net/ipx.h b/include/net/ipx.h
index 320f47b..e5cff68 100644
--- a/include/net/ipx.h
+++ b/include/net/ipx.h
@@ -150,7 +150,7 @@ int ipxrtr_add_route(__be32 network, struct ipx_interface 
*intrfc,
 unsigned char *node);
 void ipxrtr_del_routes(struct ipx_interface *intrfc);
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-   struct iovec *iov, size_t len, int noblock);
+   struct msghdr *msg, size_t len, int noblock);
 int ipxrtr_route_skb(struct sk_buff *skb);
 struct ipx_route *ipxrtr_lookup(__be32 net);
 int ipxrtr_ioctl(unsigned int cmd, void __user *arg);
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index a0c7536..36f7990 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1745,8 +1745,7 @@ static int ipx_sendmsg(struct kiocb *iocb, struct socket 
*sock,
memcpy(usipx->sipx_node, ipxs->dest_addr.node, IPX_NODE_LEN);
}
 
-   rc = ipxrtr_route_packet(sk, usipx, msg->msg_iov, len,
-flags & MSG_DONTWAIT);
+   rc = ipxrtr_route_packet(sk, usipx, msg, len, flags & MSG_DONTWAIT);
if (rc >= 0)
rc = len;
 out:
diff --git a/net/ipx/ipx_route.c b/net/ipx/ipx_route.c
index 67e7ad3..3e2a32a 100644
--- a/net/ipx/ipx_route.c
+++ b/net/ipx/ipx_route.c
@@ -165,7 +165,7 @@ int ipxrtr_route_skb(struct sk_buff *skb)
  * Route an outgoing frame from a socket.
  */
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-   struct iovec *iov, size_t len, int noblock)
+   struct msghdr *msg, size_t len, int noblock)
 {
struct sk_buff *skb;
struct ipx_sock *ipxs = ipx_sk(sk);
@@ -229,7 +229,7 @@ int ipxrtr_route_packet(struct sock *sk, struct 
sockaddr_ipx *usipx,
memcpy(ipx->ipx_dest.node, usipx->sipx_node, IPX_NODE_LEN);
ipx->ipx_dest.sock  = usipx->sipx_port;
 
-   rc = memcpy_fromiovec(skb_put(skb, len), iov, len);
+   rc = memcpy_from_msg(skb_put(skb, len), msg, len);
if (rc) {
kfree_skb(skb);
goto out_put;
-- 
1.7.10.4



[PATCH 02/17] new helper: memcpy_from_msg()

2014-11-21 Thread Al Viro

Signed-off-by: Al Viro 
---
 crypto/algif_skcipher.c |   10 +-
 drivers/isdn/mISDN/socket.c |2 +-
 drivers/net/ppp/pppoe.c |2 +-
 include/linux/skbuff.h  |5 +
 include/net/sctp/sm.h   |2 +-
 net/appletalk/ddp.c |2 +-
 net/ax25/af_ax25.c  |2 +-
 net/bluetooth/hci_sock.c|2 +-
 net/bluetooth/mgmt.c|2 +-
 net/bluetooth/rfcomm/sock.c |2 +-
 net/bluetooth/sco.c |2 +-
 net/caif/caif_socket.c  |4 ++--
 net/can/bcm.c   |   19 ---
 net/can/raw.c   |2 +-
 net/dccp/proto.c|2 +-
 net/decnet/af_decnet.c  |2 +-
 net/ieee802154/dgram.c  |2 +-
 net/ieee802154/raw.c|2 +-
 net/ipv4/ping.c |2 +-
 net/ipv4/tcp_input.c|2 +-
 net/irda/af_irda.c  |6 +++---
 net/iucv/af_iucv.c  |2 +-
 net/key/af_key.c|2 +-
 net/l2tp/l2tp_ip.c  |2 +-
 net/l2tp/l2tp_ppp.c |3 +--
 net/llc/af_llc.c|2 +-
 net/netlink/af_netlink.c|2 +-
 net/netrom/af_netrom.c  |2 +-
 net/nfc/llcp_commands.c |4 ++--
 net/nfc/rawsock.c   |2 +-
 net/packet/af_packet.c  |5 ++---
 net/phonet/datagram.c   |2 +-
 net/phonet/pep.c|2 +-
 net/rose/af_rose.c  |2 +-
 net/sctp/sm_make_chunk.c|4 ++--
 net/x25/af_x25.c|2 +-
 36 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 83187f4..c3b482b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -298,9 +298,9 @@ static int skcipher_sendmsg(struct kiocb *unused, struct 
socket *sock,
len = min_t(unsigned long, len,
PAGE_SIZE - sg->offset - sg->length);
 
-   err = memcpy_fromiovec(page_address(sg_page(sg)) +
-  sg->offset + sg->length,
-  msg->msg_iov, len);
+   err = memcpy_from_msg(page_address(sg_page(sg)) +
+ sg->offset + sg->length,
+ msg, len);
if (err)
goto unlock;
 
@@ -337,8 +337,8 @@ static int skcipher_sendmsg(struct kiocb *unused, struct 
socket *sock,
if (!sg_page(sg + i))
goto unlock;
 
-   err = memcpy_fromiovec(page_address(sg_page(sg + i)),
-  msg->msg_iov, plen);
+   err = memcpy_from_msg(page_address(sg_page(sg + i)),
+ msg, plen);
if (err) {
__free_page(sg_page(sg + i));
sg_assign_page(sg + i, NULL);
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index dcbd858..84b3592 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -203,7 +203,7 @@ mISDN_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
if (!skb)
goto done;
 
-   if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+   if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
err = -EFAULT;
goto done;
}
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 443cbbf..d2408a5 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -869,7 +869,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket 
*sock,
ph = (struct pppoe_hdr *)skb_put(skb, total_len + sizeof(struct 
pppoe_hdr));
start = (char *)&ph->tag[0];
 
-   error = memcpy_fromiovec(start, m->msg_iov, total_len);
+   error = memcpy_from_msg(start, m, total_len);
if (error < 0) {
kfree_skb(skb);
goto end;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4fc4024..cb7fa2b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2684,6 +2684,11 @@ unsigned int skb_gso_transport_seglen(const struct 
sk_buff *skb);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 
+static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
+{
+   return memcpy_fromiovec(data, msg->msg_iov, len);
+}
+
 struct skb_checksum_ops {
__wsum (*update)(const void *mem, int len, __wsum wsum);
__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 72a31db..487ef34 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -219,7 +219,7 @@ struct sctp_chunk *sctp_make_abort_no_data(const struct 

[PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg()

2014-11-21 Thread Al Viro
Signed-off-by: Al Viro 
---
 include/linux/skbuff.h |5 +
 net/ipv4/udp.c |5 ++---
 net/ipv6/raw.c |2 +-
 net/ipv6/udp.c |2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 73c370e..4fc4024 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2651,6 +2651,11 @@ static inline int skb_copy_datagram_msg(const struct 
sk_buff *from, int offset,
 }
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 struct iovec *iov);
+static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
+   struct msghdr *msg)
+{
+   return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
+}
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 const struct iovec *from, int from_offset,
 int len);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a16b91..b2d6068 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1284,9 +1284,8 @@ try_again:
err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
msg, copied);
else {
-   err = skb_copy_and_csum_datagram_iovec(skb,
-  sizeof(struct udphdr),
-  msg->msg_iov);
+   err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr),
+msg);
 
if (err == -EINVAL)
goto csum_copy_err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 0cbcf98..8baa53e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -492,7 +492,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock 
*sk,
goto csum_copy_err;
err = skb_copy_datagram_msg(skb, 0, msg, copied);
} else {
-   err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
+   err = skb_copy_and_csum_datagram_msg(skb, 0, msg);
if (err == -EINVAL)
goto csum_copy_err;
}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0ba3de4..961565a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -427,7 +427,7 @@ try_again:
err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
msg, copied);
else {
-   err = skb_copy_and_csum_datagram_iovec(skb, sizeof(struct 
udphdr), msg->msg_iov);
+   err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct 
udphdr), msg);
if (err == -EINVAL)
goto csum_copy_err;
}
-- 
1.7.10.4



Re: [RFC] situation with csum_and_copy_... API

2014-11-21 Thread Al Viro
OK, here's the next bunch.  Sorry about the delay, iov_iter.c stuff
took most of the day (and it's not included in this pile).  Please, review.
Al Viro (17):
  new helper: skb_copy_and_csum_datagram_msg()
  new helper: memcpy_from_msg()
  switch ipxrtr_route_packet() from iovec to msghdr
  new helper: memcpy_to_msg()
  switch drivers/net/tun.c to ->read_iter()
  switch macvtap to ->read_iter()
  new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  {macvtap,tun}_get_user(): switch to iov_iter
  kill zerocopy_sg_from_iovec()
  switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
  switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
  tipc_sendmsg(): pass msghdr instead of its ->msg_iov
  tipc_msg_build(): pass msghdr instead of its ->msg_iov
  vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
  [atm] switch vcc_sendmsg() to copy_from_iter()
  rds: switch ->inc_copy_to_user() to passing iov_iter
  rds: switch rds_message_copy_from_user() to iov_iter

Patches themselves are in followups...


[PATCH] KVM: nVMX: nested MSR auto load/restore emulation.

2014-11-21 Thread Wincy Van
Some hypervisors need the MSR auto load/restore feature.

We read MSRs from the VM-entry MSR load area, which is specified by L1,
and load them via kvm_set_msr on nested entry.
When a nested exit occurs, we get MSRs via kvm_get_msr and write
them to L1's MSR store area. After this, we read MSRs from the VM-exit
MSR load area and load them via kvm_set_msr.

VirtualBox will work fine with this patch.
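
The per-entry validity checks described above can be tried outside the
kernel. What follows is a userspace sketch under stated assumptions, not the
patch itself: struct msr_entry is assumed to mirror the 16-byte layout of
struct vmx_msr_entry, the two MSR constants are the architectural FS/GS base
indices, and first_bad_entry() is a made-up name that follows the "0 on
success, 1-based index of the failing entry" convention of
nested_entry_load_msr().

```c
#include <stdint.h>

#define MSR_FS_BASE 0xc0000100u
#define MSR_GS_BASE 0xc0000101u

/* Assumed layout of one MSR-area entry (index, reserved, value). */
struct msr_entry {
	uint32_t index;
	uint32_t reserved;
	uint64_t value;
};

/* Reject x2APIC MSRs (index bits 31:8 == 0x8) and non-zero reserved bits. */
static int msr_check_common(const struct msr_entry *e)
{
	if ((e->index >> 8) == 0x8 || e->reserved != 0)
		return -1;
	return 0;
}

/* Entries in a load area additionally must not name FS_BASE or GS_BASE. */
static int load_msr_check(const struct msr_entry *e)
{
	if (e->index == MSR_FS_BASE || e->index == MSR_GS_BASE ||
	    msr_check_common(e))
		return -1;
	return 0;
}

/* Return 0 if every entry is acceptable, else the 1-based failing index. */
static uint32_t first_bad_entry(const struct msr_entry *area, uint32_t count)
{
	uint32_t i;

	for (i = 0; i < count; i++)
		if (load_msr_check(&area[i]))
			return i + 1;
	return 0;
}
```

Returning the 1-based index keeps 0 free to mean success while still
identifying which entry failed.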

Signed-off-by: Wincy Van 

diff --git a/arch/x86/include/uapi/asm/vmx.h b/arch/x86/include/uapi/asm/vmx.h
index 990a2fe..986af3f 100644
--- a/arch/x86/include/uapi/asm/vmx.h
+++ b/arch/x86/include/uapi/asm/vmx.h
@@ -56,6 +56,7 @@
 #define EXIT_REASON_MSR_READ31
 #define EXIT_REASON_MSR_WRITE   32
 #define EXIT_REASON_INVALID_STATE   33
+#define EXIT_REASON_MSR_LOAD_FAIL   34
 #define EXIT_REASON_MWAIT_INSTRUCTION   36
 #define EXIT_REASON_MONITOR_INSTRUCTION 39
 #define EXIT_REASON_PAUSE_INSTRUCTION   40
@@ -114,8 +115,12 @@
  { EXIT_REASON_APIC_WRITE,"APIC_WRITE" }, \
  { EXIT_REASON_EOI_INDUCED,   "EOI_INDUCED" }, \
  { EXIT_REASON_INVALID_STATE, "INVALID_STATE" }, \
+ { EXIT_REASON_MSR_LOAD_FAIL, "MSR_LOAD_FAIL" }, \
  { EXIT_REASON_INVD,  "INVD" }, \
  { EXIT_REASON_INVVPID,   "INVVPID" }, \
  { EXIT_REASON_INVPCID,   "INVPCID" }

+#define VMX_ABORT_SAVE_GUEST_MSR_FAIL1
+#define VMX_ABORT_LOAD_HOST_MSR_FAIL 4
+
 #endif /* _UAPIVMX_H */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 6a951d8..377e405 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6088,6 +6088,13 @@ static void nested_vmx_failValid(struct kvm_vcpu *vcpu,
  */
 }

+static void nested_vmx_abort(struct kvm_vcpu *vcpu, u32 indicator)
+{
+ /* TODO: not to simply reset guest here. */
+ kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
+ printk(KERN_WARNING"kvm: nested vmx abort, indicator %d\n", indicator);
+}
+
 static enum hrtimer_restart vmx_preemption_timer_fn(struct hrtimer *timer)
 {
  struct vcpu_vmx *vmx =
@@ -8215,6 +8222,88 @@ static void vmx_start_preemption_timer(struct
kvm_vcpu *vcpu)
   ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL);
 }

+static inline int nested_msr_check_common(struct vmx_msr_entry *e)
+{
+ if (e->index >> 8 == 0x8 || e->reserved != 0)
+ return -EINVAL;
+return 0;
+}
+
+static inline int nested_load_msr_check(struct vmx_msr_entry *e)
+{
+ if (e->index == MSR_FS_BASE ||
+e->index == MSR_GS_BASE ||
+nested_msr_check_common(e))
+ return -EINVAL;
+ return 0;
+}
+
+/* load guest msr at nested entry.
+ * return 0 for success, entry index for failed.
+ */
+static u32 nested_entry_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
+{
+ u32 i = 0;
+ struct vmx_msr_entry e;
+ struct msr_data msr;
+
+ msr.host_initiated = false;
+ while (i < count) {
+ kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry),
+ &e, sizeof(struct vmx_msr_entry));
+ if (nested_load_msr_check(&e))
+ goto fail;
+ msr.index = e.index;
+ msr.data = e.value;
+ if (kvm_set_msr(vcpu, &msr))
+ goto fail;
+ ++i;
+}
+ return 0;
+fail:
+ return i + 1;
+}
+
+static int nested_exit_store_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
+{
+ u32 i = 0;
+ struct vmx_msr_entry e;
+
+while (i < count) {
+ kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry),
+ &e, sizeof(struct vmx_msr_entry));
+ if (nested_msr_check_common(&e))
+ return -EINVAL;
+ if (kvm_get_msr(vcpu, e.index, &e.value))
+ return -EINVAL;
+ kvm_write_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry),
+ &e, sizeof(struct vmx_msr_entry));
+ ++i;
+ }
+ return 0;
+}
+
+static int nested_exit_load_msr(struct kvm_vcpu *vcpu, u64 gpa, u32 count)
+{
+ u32 i = 0;
+ struct vmx_msr_entry e;
+ struct msr_data msr;
+
+ msr.host_initiated = false;
+ while (i < count) {
+ kvm_read_guest(vcpu->kvm, gpa + i * sizeof(struct vmx_msr_entry),
+ &e, sizeof(struct vmx_msr_entry));
+ if (nested_load_msr_check(&e))
+ return -EINVAL;
+ msr.index = e.index;
+ msr.data = e.value;
+ if (kvm_set_msr(vcpu, &msr))
+ return -EINVAL;
+ ++i;
+ }
+ return 0;
+}
+
 /*
  * prepare_vmcs02 is called when the L1 guest hypervisor runs its nested
  * L2 guest. L1 has a vmcs for L2 (vmcs12), and this function "merges" it
@@ -8509,6 +8598,7 @@ static int nested_vmx_run(struct kvm_vcpu *vcpu,
bool launch)
  int cpu;
  struct loaded_vmcs *vmcs02;
  bool ia32e;
+ u32 msr_entry_idx;

  if (!nested_vmx_check_permission(vcpu) ||
 !nested_vmx_check_vmcs12(vcpu))
@@ -8556,11 +8646,12 @@ static int nested_vmx_run(struct kvm_vcpu
*vcpu, bool launch)
  return 1;
  }

- if (vmcs12->vm_entry_msr_load_count > 0 ||
-vmcs12->vm_exit_msr_load_count > 0 ||
-vmcs12->vm_exit_msr_store_count > 0) {
- pr_warn_ratelimited("%s: VMCS MSR_{LOAD,STORE} unsupported\n",
-__func__);
+ if ((vmcs12->vm_entry_msr_load_count > 0 &&
+ !IS_ALIGNED(vmcs12->vm_entry_msr_load_addr, 16)) ||
+(vmcs12->vm_exit_msr_load_count > 0 &&
+ !IS_ALIGNED(vmcs12->vm_exit_msr_load_addr, 16)) ||
+

Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Paul E. McKenney
On Fri, Nov 21, 2014 at 06:00:14PM -0800, Andy Lutomirski wrote:
> On Fri, Nov 21, 2014 at 3:38 PM, Paul E. McKenney
>  wrote:
> > On Fri, Nov 21, 2014 at 03:06:48PM -0800, Andy Lutomirski wrote:
> >> On Fri, Nov 21, 2014 at 2:55 PM, Paul E. McKenney
> >>  wrote:
> >> > On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
> >> >> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
> >> >>  wrote:
> >> >> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
> >> >> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski 
> >> >> >>  wrote:
> > >> >> >> > We currently pretend that IST context is like standard exception
> > >> >> >> > context, but this is incorrect.  IST entries from userspace are
> > >> >> >> > like standard exceptions except that they use per-cpu stacks, so
> > >> >> >> > they are atomic.  IST entries from kernel space are like NMIs from
> > >> >> >> > RCU's perspective -- they are not quiescent states even if they
> > >> >> >> > interrupted the kernel during a quiescent state.
> >> >> >> >
> >> >> >> > Add and use ist_enter and ist_exit to track IST context.  Even
> >> >> >> > though x86_32 has no IST stacks, we track these interrupts the same
> >> >> >> > way.
> >> >> >>
> >> >> >> I should add:
> >> >> >>
> >> >> >> I have no idea why RCU read-side critical sections are safe inside
> >> >> >> __do_page_fault today.  It's guarded by exception_enter(), but that
> >> >> >> doesn't do anything if context tracking is off, and context tracking
> >> >> >> is usually off. What am I missing here?
> >> >> >
> >> >> > Ah!  There are three cases:
> >> >> >
> > >> >> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU
> > >> >> > is still paying attention to CPUs running in both userspace and
> > >> >> > in the kernel.  So if a page fault happens, RCU will be set up to
> > >> >> > notice any RCU read-side critical sections.
> >> >> >
> >> >> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
> >> >> > might well be ignoring userspace execution: NO_HZ_FULL and
> >> >> > all that.  However, as you pointed out, in this case the
> >> >> > context-tracking code lets RCU know that we have entered the
> >> >> > kernel, which means that RCU will again be paying attention to
> >> >> > RCU read-side critical sections.
> >> >> >
> >> >> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
> >> >> > if we take a page fault when context tracking is off, life
> >> >> > will be hard.  But the kernel is not supposed to take page
> >> >> > faults in the idle loop, so this is not a problem.
> >> >>
> >> >> I guess so, as long as there are really no page faults in the idle loop.
> >> >
> >> > As far as I know, there are not.  If there are, someone needs to let
> >> > me know!  ;-)
> >> >
> >> >> There are, however, machine checks in the idle loop, and maybe kprobes
> >> >> (haven't checked), so I think this patch might fix real bugs.
> >> >
> >> > If you can get ISTs from the idle loop, then the patch is needed.
> >> >
> >> >> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
> >> >> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
> >> >> > nested calls.
> >> >>
> >> >> Yes, and vice versa.  That code looked like it handled nesting
> >> >> correctly, but I wasn't entirely sure.
> >> >
> >> > It currently does not, please see below patch.  Are you able to test
> >> > nesting?  It would be really cool if you could do so -- I have no
> >> > way to test this patch.
> >>
> >> I can try.  It's sort of easy -- I'll put an int3 into do_nmi and add
> >> a fixup to avoid crashing.
> >>
> >> What should I look for?  Should I try to force full nohz on and assert
> >> something?  I don't really know how to make full nohz work.
> >
> > You should look for the WARN_ON_ONCE() calls in rcu_nmi_enter() and
> > rcu_nmi_exit() to fire.
> 
> No warning with or without your patch, maybe because all of those
> returns skip the labels.

I will be guardedly optimistic and take this as a good sign.  ;-)

> Also, an NMI can happen *during* rcu_nmi_enter or rcu_nmi_exit.  Is
> that okay?  Should those dynticks_nmi_nesting++ things be local_inc
> and local_dec_and_test?

Yep, it is OK during rcu_nmi_enter() or rcu_nmi_exit().  The nested
NMI will put the dynticks_nmi_nesting counter back where it was, so
no chance of confusion.

> That dynticks_nmi_nesting thing seems scary to me.  Shouldn't the code
> unconditionally increment dynticks_nmi_nesting in rcu_nmi_enter and
> unconditionally decrement it in rcu_nmi_exit?

You might be able to get that to work, but the reason it is not done
that way is because we might get an NMI while not in dyntick-idle
state.  In that case, it would be very bad to atomically increment
rcu_dynticks, because that would tell RCU to ignore the CPU while it
was in the NMI handler, 

Re: Removal of bus->msi assignment breaks MSI with stacked domains

2014-11-21 Thread Yijing Wang


On 2014/11/22 1:31, Bjorn Helgaas wrote:

On Fri, Nov 21, 2014 at 09:54:40AM +0800, Yijing Wang wrote:

Thomas, let me know if you want to do that.  I suppose we could add a new
patch to add it back, but that would leave bisection broken for the
interval between c167caf8d174 and the patch that adds it back.

Fortunately my irq/irqdomain branch is not immutable yet. So we have
no problem at that point. I can rebase on your branch until tomorrow
night. Or just rebase on mainline and we sort out the merge conflicts
later, i.e. delegate them to Linus so his job of pulling stuff gets
not completely boring.

Hi Thomas, sorry for introducing the breakage.


What I'm more worried about is whether this intended change is going
to inflict a problem on Jiang's intention to deduce the MSI irq domain
from the device, which we really need for making DMAR work w/o going
through loops and hoops.

I have limited knowledge about the actual scope of iommu (DMAR) units
versus device/bus/host-controllers, so I would appreciate a proper
explanation for that from you or Jiang or both.

In my personal opinion, if it's not necessary, we should not put stuff
into pci_dev or pci_bus. If we plan to save msi_controller in pci_bus or
pci_dev, I have a proposal, and I would appreciate it if you could give
some comments.
First we refactor pci_host_bridge to make a generic
pci_host_bridge, then we could save the PCI domain in it to eliminate
arch-specific functions. I also wanted to save msi_controller the same
way as the PCI domain, but now that Jiang has refactored the hierarchical
irq domain, PCI devices under the same PCI host bridge may need to be
associated with different msi_controllers.

I think this is getting ahead of ourselves.  Let's make small steps.

We currently have the msi_controller pointer in struct pci_bus.  That was
there even before your series.  Your series added pci_msi_controller(),
and I reworked it so it looks like this:

 static struct msi_controller *pci_msi_controller(struct pci_dev *dev)
 {
   struct msi_controller *msi_ctrl = dev->bus->msi;

   if (msi_ctrl)
   return msi_ctrl;

   return pcibios_msi_controller(dev);
 }

So now your series basically just removes the ARM add_bus() and
remove_bus() methods and gets the MSI controller info from the ARM
pci_sys_data struct instead of from pci_bus.  Of course, that assumes that
on ARM, all devices under a host bridge have the same MSI controller.  That
seems like an unwarranted assumption, but if you want to do it for ARM,
that's fine with me.
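
The lookup order described above (per-bus pointer first, architecture hook
second) can be modelled in userspace. This is a sketch under stated
assumptions: the three structs are cut-down mock types rather than the kernel
definitions, and the default pcibios_msi_controller() is assumed to be a weak
symbol that an architecture may override.

```c
#include <stddef.h>

/* Mock types: only the fields this sketch needs, not the kernel ones. */
struct msi_controller { const char *name; };
struct pci_bus { struct msi_controller *msi; };
struct pci_dev { struct pci_bus *bus; };

/*
 * Default architecture hook.  Marked weak (a GCC/Clang extension) so an
 * architecture could supply its own definition; the default knows of no
 * controller.
 */
__attribute__((weak))
struct msi_controller *pcibios_msi_controller(struct pci_dev *dev)
{
	(void)dev;
	return NULL;
}

/* Prefer the controller recorded on the bus, else ask the architecture. */
static struct msi_controller *pci_msi_controller(struct pci_dev *dev)
{
	struct msi_controller *msi_ctrl = dev->bus->msi;

	if (msi_ctrl)
		return msi_ctrl;

	return pcibios_msi_controller(dev);
}
```

In this model a device on a bus with no recorded controller gets whatever the
architecture hook returns, which is NULL unless the hook is overridden.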


Agreed, we could use pci_msi_controller() to find the msi_controller for a
pci_dev until a better common way is found.




So I want to associate an msi_controller lookup op with the generic
pci_host_bridge, so that every PCI device could find its
msi_controller/irq_domain through a common function.

E.g.

struct msi_controller *pci_msi_controller(struct pci_dev *pdev)
{
        struct msi_controller *ctrl = NULL;
        struct pci_host_bridge *host = find_pci_host_bridge(pdev->bus);

        /* Use the host bridge's lookup op if one is registered */
        if (host && host->pci_get_msi_controller)
                ctrl = host->pci_get_msi_controller(pdev);

        return ctrl;
}

You can do this for ARM if you want (and your series already accomplishes
the same effect, though implemented differently).  But I don't think this
is appropriate for the PCI core.


OK. We need a better solution, not only for ARM; we also need to consider
arm64 and other platforms.



For anybody who is on this thread but not the original, I reworked the
series slightly, see [1].

Bjorn

[1] http://lkml.kernel.org/r/20141121172018.ga6...@google.com
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





Your editor/IDE settings for autocompletion and other easiness

2014-11-21 Thread Andrey Utkin
(I was asked to research this topic to help students. So please ignore
this topic if all you want to say is that it is OK to code in editor
without autocompletion and any other integration, and that there's LXR
website. We all know that.)

Dear kernel developers,
if you have a minute, please share
- what's your configuration for editor integration with sources tree?
(the opposite is "just using any editor")
- which IDE/editor handiness options except autocompletion are
possible to obtain while developing kernel code, and which options do
you use?

If you don't use any special configuration, feel free not to reply.

Thanks!

-- 
Andrey Utkin


Re: [PATCH 01/12] time: Rename udelay_test.c to test_udelay.c

2014-11-21 Thread Greg KH
On Fri, Nov 21, 2014 at 03:30:51PM -0800, Kees Cook wrote:
> On Fri, Nov 21, 2014 at 11:44 AM, John Stultz  wrote:
> > Kees requested that this test module be renamed for consistency sake,
> > so this patch renames the udelay_test.c file (recently added to
> > tip/timers/core for 3.17) to test_udelay.c
> >
> > Cc: Kees Cook 
> > Cc: Greg KH 
> > Cc: Stephen Rothwell 
> > Cc: Thomas Gleixner 
> > Cc: Ingo Molnar 
> > Cc: Linux-Next 
> > Cc: David Riley 
> > Signed-off-by: John Stultz 
> 
> Reviewed-by: Kees Cook 


Acked-by: Greg Kroah-Hartman 


Re: [PATCH] ipc,sem block sem_lock on sma->lock during sma initialization

2014-11-21 Thread Rik van Riel

On 11/21/2014 07:56 PM, Davidlohr Bueso wrote:
> On Fri, 2014-11-21 at 18:03 -0500, Rik van Riel wrote:

>> In other words, if you try to use a semaphore array before semget
>> returns, you can oops the task that calls semop.
> 
> This seems bogus from an application level: how can you call semop
> if you don't have the semid yet returned from semget? And the fact
> that the race is with newary, means that the call is in fact
> creating a *new* set, as opposed to plugging into an already
> existing set.

Agreed, this is bogus from userspace.

However, userspace doing bogus things should not lead to a
kernel crash.

> The fix in newary() being before the actual creation of the id
> seems even stranger:
> 
> sma->complex_count = 1;
> id = ipc_addid(&sem_ids(ns), &sma->sem_perm, ns->sc_semmni);
> 
> As for semtimedop() before even getting to sem_lock(), we first
> call:
> 
> sma = sem_obtain_object_check(ns, semid);
> 
> So shouldn't that fail anyway before we even consider acquiring the
> lock?

newary initializes a bunch of things after the call to
ipc_addid; however, some things are initialized inside
ipc_addid as well.

Looking closer at newary, I suppose that it should be
possible to move those other initializations before
the call to ipc_addid.  That would likely get rid of
the problem, too.
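
The reordering suggested above amounts to "finish initializing the object before making it findable". Below is a minimal user-space sketch of that pattern, not the ipc/sem.c code: struct sem_set, table, publish(), lookup() and new_set() are hypothetical names, and a real implementation would still need the locking and memory ordering that ipc_addid() provides.

```c
#include <stddef.h>

#define TABLE_SIZE 8

/* Hypothetical stand-in for a semaphore array; not the kernel's struct sem_array. */
struct sem_set {
	int nsems;
	int complex_count;
	int initialized;	/* set last, once every other field is valid */
};

/* Hypothetical id table: a slot is visible to lookup() once it is non-NULL. */
static struct sem_set *table[TABLE_SIZE];

/* Make the object findable by id. Returns the id, or -1 if the table is full. */
static int publish(struct sem_set *s)
{
	for (int id = 0; id < TABLE_SIZE; id++) {
		if (!table[id]) {
			table[id] = s;
			return id;
		}
	}
	return -1;
}

static struct sem_set *lookup(int id)
{
	if (id < 0 || id >= TABLE_SIZE)
		return NULL;
	return table[id];
}

/* Fill in every field first, and only then publish the object. */
static int new_set(struct sem_set *s, int nsems)
{
	s->nsems = nsems;
	s->complex_count = 0;
	s->initialized = 1;
	return publish(s);
}
```

With this ordering, anything that obtains the object through lookup() sees it fully initialized, at least in this single-threaded model.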

However, I also see this line in newary, and I have
no idea what protects that data:

ns->used_sems += nsems;

I don't see any locking around ns->used_sems for
simultaneous semget & RMID...

-- 
All rights reversed


Re: [RFC] situation with csum_and_copy_... API

2014-11-21 Thread Al Viro
On Sat, Nov 22, 2014 at 03:27:18AM +, Al Viro wrote:
  
> @@ -566,38 +445,15 @@ static size_t copy_to_iter_bvec(void *from, size_t 
> bytes, struct iov_iter *i)
[snip]
> + iterate_bvec(i, bytes, page, off, len, true,
> +  memcpy_from_page((from += len) - len, page, off, len))

should be
+memcpy_to_page(page, off, (from += len) - len, len))

and

> @@ -605,35 +461,9 @@ static size_t copy_from_iter_bvec(void *to, size_t 
> bytes, struct iov_iter *i)
[snip]
> + iterate_bvec(i, bytes, page, off, len, true,
> +  memcpy_to_page(page, off, (to += len) - len, len))

should be
+memcpy_from_page((to += len) - len, page, off, len))

Sorry, wrong delta.
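
Both corrected calls rely on the `(ptr += len) - len` idiom: the pointer is advanced, and the expression still yields the value it had before the advance. A small stand-alone sketch of it follows; copy_chunk() is a hypothetical helper and uses char pointers, since arithmetic on void * is a GNU extension.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy 'len' bytes from *src into dst and advance *src past them.
 * (*src += len) - len evaluates to the value *src had before the
 * addition, so one expression both consumes and advances.
 */
static void copy_chunk(char *dst, const char **src, size_t len)
{
	memcpy(dst, (*src += len) - len, len);
}
```

Calling copy_chunk() repeatedly on the same source pointer walks through the buffer one chunk at a time.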


Re: linux-next: build failure after merge of the net-next tree

2014-11-21 Thread David Miller
From: Stephen Rothwell 
Date: Mon, 17 Nov 2014 13:34:04 +1100

> I applied the following merge fix patch:
> 
> From: Stephen Rothwell 
> Date: Mon, 17 Nov 2014 13:31:33 +1100
> Subject: [PATCH] openvswitch: fix up for OVS_NLERR API change
> 
> Signed-off-by: Stephen Rothwell 

Thanks Stephen, I integrated this into the merge commit when
I merged net into net-next just now.

Thanks again.


Re: [RFC] situation with csum_and_copy_... API

2014-11-21 Thread Al Viro
On Fri, Nov 21, 2014 at 08:49:56AM +, Al Viro wrote:

> Overall, I think I have the whole series plotted in enough details to be
> reasonably certain we can pull it off.  Right now I'm dealing with
> mm/iov_iter.c stuff; the amount of boilerplate source is already high enough
> and with those extra primitives it'll get really unpleasant.
> 
> What we need there is something templates-like, as much as I hate C++, and
> I'm still not happy with what I have at the moment...  Hopefully I'll get
> that in more or less tolerable form today.

Folks, I would really like comments on the patch below.  It's an attempt
to reduce the amount of boilerplate code in mm/iov_iter.c; no new primitives
added, just trying to reduce the amount of duplication in there.  I'm not
too fond of the way it currently looks, to put it mildly.  It seems to
work, it's reasonably straightforward and it even generates slightly better
code than before, but I would _very_ much welcome any tricks that would
make it less tasteless.  I like the effect on line count (+124-358), but...

It defines two iterators (for iovec-backed and bvec-backed ones) and converts
a bunch of primitives to those.  The last argument is an expression evaluated
for a bunch of ranges; for bvec one it's void, for iovec - size_t; if it
evaluates to non-0, we treat it as read/write/whatever short by that many
bytes and do not proceed any further.

Any suggestions are welcome.
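
For readers who want the shape of the idea before the diff, here is a simplified user-space sketch of an iterator macro that takes a STEP expression. It is not the patch: struct seg, iterate_segs() and copy_from_segs() are hypothetical, there is no iov_offset or "move" handling, and STEP is assumed to evaluate to the number of bytes it could not handle.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct iovec. */
struct seg {
	char *base;
	size_t len;
};

/*
 * Walk up to 'n' bytes across 'segs', evaluating STEP once per range.
 * STEP sees 'buf' and 'len' and must evaluate to the number of bytes it
 * could NOT handle (0 on full success). On exit 'n' holds the number of
 * bytes actually handled.
 */
#define iterate_segs(segs, nsegs, n, buf, len, STEP) do {		\
	size_t wanted_ = (n);						\
	size_t left_ = 0;						\
	for (size_t k_ = 0; k_ < (nsegs) && (n) && !left_; k_++) {	\
		(buf) = (segs)[k_].base;				\
		(len) = (n) < (segs)[k_].len ? (n) : (segs)[k_].len;	\
		left_ = (STEP);						\
		(len) -= left_;						\
		(n) -= (len);						\
	}								\
	(n) = wanted_ - (n);						\
} while (0)

/* Gather 'bytes' bytes from the segments into 'to'; returns bytes copied. */
static size_t copy_from_segs(char *to, const struct seg *segs,
			     size_t nsegs, size_t bytes)
{
	char *buf;
	size_t len;

	/* memcpy cannot come up short, so the step always reports 0 left. */
	iterate_segs(segs, nsegs, bytes, buf, len,
		     (memcpy((to += len) - len, buf, len), 0));
	return bytes;
}
```

Each primitive then shrinks to one macro invocation plus its STEP expression, which is where the line-count saving comes from.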

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index eafcf60..611af2bd 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -4,11 +4,75 @@
 #include 
 #include 
 
+#define iterate_iovec(i, n, buf, len, move, STEP) {\
+   const struct iovec *iov = i->iov;   \
+   size_t skip = i->iov_offset;\
+   size_t left;\
+   size_t wanted = n;  \
+   buf = iov->iov_base + skip; \
+   len = min(n, iov->iov_len - skip);  \
+   left = STEP;\
+   len -= left;\
+   skip += len;\
+   n -= len;   \
+   while (unlikely(!left && n)) {  \
+   iov++;  \
+   buf = iov->iov_base;\
+   len = min(n, iov->iov_len); \
+   left = STEP;\
+   len -= left;\
+   skip = len; \
+   n -= len;   \
+   }   \
+   n = wanted - n; \
+   if (move) { \
+   if (skip == iov->iov_len) { \
+   iov++;  \
+   skip = 0;   \
+   }   \
+   i->count -= n;  \
+   i->nr_segs -= iov - i->iov; \
+   i->iov = iov;   \
+   i->iov_offset = skip;   \
+   }   \
+}
+
+#define iterate_bvec(i, n, page, off, len, move, STEP) {\
+   const struct bio_vec *bvec = i->bvec;   \
+   size_t skip = i->iov_offset;\
+   size_t wanted = n;  \
+   page = bvec->bv_page;   \
+   off = bvec->bv_offset + skip;   \
+   len = min_t(size_t, n, bvec->bv_len - skip);\
+   STEP;   \
+   skip += len;\
+   n -= len;   \
+   while (unlikely(n)) {   \
+   bvec++; \
+   page = bvec->bv_page;   \
+   off = bvec->bv_offset;  \
+   len = min_t(size_t, n, bvec->bv_len);   \
+   STEP;   \
+   skip = len; \
+   n -= len;   \
+   }   \
+   n = wanted; \
+   if (move) { \
+   if (skip == bvec->bv_len) { \
+   bvec++; \
+   skip = 0;   \
+   }   \
+   i->count -= n;  \
+   i->nr_segs -= bvec - i->bvec;   \
+   i->bvec = 

Re: [PATCH 00/10] Save MSI chip in pci_sys_data

2014-11-21 Thread Yijing Wang


On 2014/11/22 1:20, Bjorn Helgaas wrote:

[+cc Marc]

On Tue, Nov 11, 2014 at 09:23:59PM -0700, Bjorn Helgaas wrote:

On Mon, Oct 27, 2014 at 03:48:37PM +0800, Yijing Wang wrote:

Now PCI host bridge drivers in arm associate MSI chip and
PCI bus by adding .add_bus(), and assign MSI chip pointer
to every PCI bus. Associating MSI chip and every PCI bus
is not necessary. All PCI busses under the same PCI host bridge
share the same MSI chip. So saving MSI chip in pci_sys_data
is a better solution; it makes PCI host bridge drivers cleaner.
Because we still need to provide arch-specific pcibios_msi_controller()
to extract MSI controller pointer, a better solution is to
refactor PCI host bridge, make a generic pci_host_bridge, and
save common info like PCI domain number, MSI chip, resources
in it. We will do that work in another series soon.

To Bjorn: Because struct msi_chip defined in struct hw_pci and pci_sys_data
is under the #ifdef CONFIG_PCI_MSI, if we use if(IS_ENABLED(CONFIG_PCI_MSI))
in PCI host bridge drivers, it will cause build errors when the CONFIG_PCI_MSI
is off. So I keep #ifdef CONFIG_PCI_MSI in this series.
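
The build error described above can be illustrated with a small stand-alone model. This is only a sketch: the struct layouts are simplified, the member name msi_ctrl is assumed, sys_msi_controller() is a hypothetical accessor, and CONFIG_PCI_MSI is defined by hand instead of coming from Kconfig.

```c
#include <stddef.h>

/* Stand-in for the Kconfig symbol; remove this line to model CONFIG_PCI_MSI=n. */
#define CONFIG_PCI_MSI 1

struct msi_controller {
	int id;
};

/* Simplified model of pci_sys_data: the member exists only when MSI is configured. */
struct pci_sys_data {
	int busnr;
#ifdef CONFIG_PCI_MSI
	struct msi_controller *msi_ctrl;
#endif
};

/*
 * An `if (IS_ENABLED(CONFIG_PCI_MSI))` test is ordinary C: the compiler
 * still parses and type-checks the body, so `sys->msi_ctrl` would be an
 * error when the member is compiled out. The preprocessor guard removes
 * the reference entirely.
 */
static struct msi_controller *sys_msi_controller(struct pci_sys_data *sys)
{
#ifdef CONFIG_PCI_MSI
	return sys->msi_ctrl;
#else
	(void)sys;
	return NULL;
#endif
}
```

Making the struct member unconditional would allow IS_ENABLED() in the drivers, at the cost of one unused pointer when MSI is off.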

Yijing Wang (10):
   MSI: Rename msi_chip to msi_controller for better readability
   PCI/MSI: Introduce weak pcibios_msi_controller()
   arm/MSI: Save MSI controller in pci_sys_data
   PCI: tegra: Save MSI controller in pci_sys_data
   PCI: designware: Save MSI controller in pci_sys_data
   PCI: rcar: Save MSI controller in pci_sys_data
   PCI: mvebu: Save MSI controller in pci_sys_data
   PCI: xilinx: Save MSI controller in pci_sys_data
   arm/PCI: Clean unused pcibios_add_bus() and pcibios_remove_bus()
   PCI/MSI: Remove useless bus->msi assignment

  arch/arm/include/asm/mach/pci.h |   10 +---
  arch/arm/kernel/bios32.c|   28 ++--
  drivers/irqchip/irq-armada-370-xp.c |   22 +-
  drivers/of/of_pci.c |   40 +-
  drivers/pci/host/pci-keystone-dw.c  |4 +-
  drivers/pci/host/pci-keystone.h |2 +-
  drivers/pci/host/pci-mvebu.c|   14 ---
  drivers/pci/host/pci-tegra.c|   37 +---
  drivers/pci/host/pcie-designware.c  |   25 +++--
  drivers/pci/host/pcie-designware.h  |2 +-
  drivers/pci/host/pcie-rcar.c|   37 +---
  drivers/pci/host/pcie-xilinx.c  |   27 +++
  drivers/pci/msi.c   |   22 ++-
  drivers/pci/probe.c |1 -
  include/linux/msi.h |6 ++--
  include/linux/of_pci.h  |   14 ++--
  include/linux/pci.h |2 +-
  17 files changed, 132 insertions(+), 161 deletions(-)


Applied to pci/msi for v3.19, thanks.

I reworked this series slightly to:

   - Change pci_msi_controller() and pcibios_msi_controller() from taking a
 pci_bus * to taking a pci_dev *.  This is so the interface allows
 per-device MSI controllers.  I don't think there's any reason to assume
 all devices on a bus have to have the same controller.

   - Drop the last patch ("PCI/MSI: Remove useless bus->msi assignment")
 because it broke Marc's follow-on patches.

I updated pci/msi, but I haven't put it in my "next" branch yet because I
want confirmation from Fengguang's autobuilder that I didn't break
anything.

Current status of my tree:

 6cf00af0ae15 [1] pci/msi (contains this rework)
 149792795d2b [2] next (does *not* contain this rework)

Thomas, if pci/msi looks good to you, feel free to pull it into your tree.
The only change you should need to make is to change the parameters to
pci_msi_controller() and pcibios_msi_controller().

Just FYI, I'm leaving on vacation for a week, so I won't be able to fix any
issues until Dec 1.


Hi Bjorn, thanks very much for your improvement for this series.
It looks good to me. :)



Bjorn

[1] https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=next=6cf00af0ae15
[2] https://git.kernel.org/cgit/linux/kernel/git/helgaas/pci.git/log/?h=next=149792795d2b





Re: [PATCH v2 0/4] Hot plug support for the Armada 38x SoCs

2014-11-21 Thread Jason Cooper
On Thu, Oct 30, 2014 at 12:39:40PM +0100, Gregory CLEMENT wrote:
> Hi,
> 
> This patch set is the second version of the series adding the hot plug
> and also kexec support for the Armada 38x Socs.
> 
> If nobody objects, we could push them to linux-next.
> 
> The first patch was done in order to have the same code between Armada
> XP and the Cortex A9 based mvebu SoCs. In order to ensure
> backward compatibility for the device tree, it is only a preliminary
> work for it.
> 
> The second patch moves the SCU power up sequence into a dedicated
> assembly function. It was done in order to reuse it in the 3rd patch.
> 
> The third patch fixes the secondary startup for the cortex A9 mvebu
> SoC. Indeed, the initial code was written with the assumption the SCU
> will always be powered on, which is not always true, especially in the
> kexec case.
> 
> These 2 patches may be worth pushing to the stable kernel.
> 
> Then the last patch adds the CPU hotplug support for Armada 38x. I
> tested the hotplug using the /sys/devices/system/cpu/cpu1/online
> virtual file.  I also tested the kexec feature and managed to switch
> to a new kernel using kexec.
> 
> Thanks,
> 
> Gregory
> 
> Changelog:
> 
> v1 -> v2:
> 
> - Fix typo and improve the comment explaining why we need to keep the
>   .smp field in the 1st patch.
> 
> - Add a prefix to the power_up_scu function to keep it
>   private and not "pollute" the global namespace.
> 
> Gregory CLEMENT (4):
>   ARM: mvebu: Clean-up the Armada XP support
>   ARM: mvebu: Move SCU power up in a function
>   ARM: mvebu: Fix the secondary startup for Cortex A9 SoC
>   ARM: mvebu: Implement the CPU hotplug support for the Armada 38x SoCs
> 
>  arch/arm/mach-mvebu/armada-370-xp.h |  6 -
>  arch/arm/mach-mvebu/board-v7.c  |  5 
>  arch/arm/mach-mvebu/coherency.c |  1 -
>  arch/arm/mach-mvebu/cpu-reset.c |  1 -
>  arch/arm/mach-mvebu/headsmp-a9.S|  1 +
>  arch/arm/mach-mvebu/platsmp-a9.c| 53 +++--
>  arch/arm/mach-mvebu/platsmp.c   |  2 ++
>  arch/arm/mach-mvebu/pmsu.c  |  3 +--
>  arch/arm/mach-mvebu/pmsu.h  |  2 ++
>  arch/arm/mach-mvebu/pmsu_ll.S   | 20 +-
>  10 files changed, 75 insertions(+), 19 deletions(-)

Applied to mvebu/soc with Thomas' Reviewed-by and Tested-by.

thx,

Jason.


Re: [PATCH v5 7/7] clk: Add floor and ceiling constraints to clock rates

2014-11-21 Thread Stephen Boyd
On 11/18/2014 08:31 AM, Tomeu Vizoso wrote:
> On 14 November 2014 08:50, Stephen Boyd  wrote:
>> It's
>> possible that whatever is constrained at this user level goes
>> down to the hardware driver and then is rounded up or down to a
>> value that is outside of the constraints, in which case the
>> constraints did nothing besides control the value that the
>> hardware driver sees in the .round_rate() op. I doubt that was
>> intended.
> Indeed. Wonder what can be done about it with the least impact on
> existing code. I see the situation as clk implementations being able
> to apply arbitrary constraints in determine_rate() and round_rate(),
> and they would need to take into account the per-user constraints so
> they can all be applied consistently.

I was thinking that we put the loop over .round_rate() in the clock
framework, but you're right, we should provide the limits to the
hardware drivers via the ops so that they can figure out the acceptable
rate within whatever bounds the framework is maintaining. Given that
we're changing the signature of determine_rate() in this series perhaps
we should also add the floor/ceiling rates in there too. Then we can
hope that we've finally gotten that new op right and set it in stone and
migrate everyone over to .determine_rate() instead of .round_rate().

>
>> I also wonder what we should do about clocks that are in the
>> middle of the tree (i.e. not a leaf) and have constraints set on
>> them. It looks like if a child of that clock can propagate rates
>> up to the parent that we've constrained with the per-user
>> constraints, those constraints won't be respected at all, leading
>> to a hole in the constraints.
> True. Do we want to support per-user constraints on non-leaf clocks?

I have an mmc clock rate where it would be helpful.

>> I'm all for having a clk_set_rate_range() API (and floor/ceiling
>> too), but it needs to be done much deeper in the core framework
>> to actually work properly. Having a range API would make all the
>> confusion about how a particular clock driver decides to round a
>> rate go away and just leave an API that sanely says the rate will
>> be within these bounds or an error will be returned stating it
>> can't be satisfied.
> This sounds like a good way forward, but TBH I don't understand what
> you are proposing. Would you care to elaborate on how the API that you
> propose would look like?
>

int clk_set_rate_range(struct clk *clk, unsigned long min, unsigned long max);

int clk_set_floor(struct clk *clk, unsigned long floor)
{
	return clk_set_rate_range(clk, floor, ULONG_MAX);
}

int clk_set_ceiling(struct clk *clk, unsigned long ceiling)
{
	return clk_set_rate_range(clk, 0, ceiling);
}

Unfortunately we can't make clk_set_rate() a thin wrapper on top that
says min/max is the same as the requested rate because that would
horribly break current users of the API. I suppose we could call
clk_round_rate() and then clk_set_rate_range() with the floor as the
rounded rate and the ceiling as ULONG_MAX? Or maybe floor is 0 and
ceiling is rounded rate, not sure it actually matters.

int clk_set_rate(struct clk *clk, unsigned long rate)
{
	unsigned long rounded;

	rounded = clk_round_rate(clk, rate);
	return clk_set_rate_range(clk, rounded, ULONG_MAX);
}

Now we can get down to the real work. __clk_round_rate() will apply the
constraints. It will also send down the constraints to .determine_rate()
ops and throw errors if things don't work out (ugh we may need to return
the rate by reference so we can return unsigned long rates here or use
IS_ERR_VALUE() on the return value). In the case that clk_round_rate()
is calling this function it will need to know that we don't want any
constraints applied, so it will need to take min, max, and rate and
clk_round_rate() will just pass 0, ULONG_MAX, and rate respectively.

Next clk_set_rate() will be renamed to clk_set_rate_range() (in clk.c)
and then it will pass the constraints to clk_calc_new_rates(). We can
also try to bail out early here by checking the constraints against the
current rate to make sure it's within bounds. We can probably redo
clk_calc_new_rates() to be similar to __clk_round_rate() and apply any
constraints that are passed into the function along with any per-user
constraints that already exist.
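
To make the intended behaviour concrete, here is a toy user-space model of a range-aware rate selection. It is only a sketch under stated assumptions: struct clk_model and the model_* functions are hypothetical, the "hardware" supports only multiples of a non-zero step, and 0 is used as the "cannot satisfy" value to sidestep the error-return question raised above.

```c
#include <limits.h>

/* Hypothetical model of a clock; not the kernel's struct clk. */
struct clk_model {
	unsigned long rate;	/* current rate in Hz */
	unsigned long step;	/* hardware produces multiples of this; must be non-zero */
};

/*
 * Model of a driver's rate selection that is told the bounds: clamp the
 * request into [min, max], round down to a supported rate, and move up
 * one step if rounding fell below 'min'. Returns 0 if nothing fits.
 */
static unsigned long model_determine_rate(const struct clk_model *clk,
					  unsigned long rate,
					  unsigned long min, unsigned long max)
{
	unsigned long r;

	if (rate < min)
		rate = min;
	if (rate > max)
		rate = max;

	r = rate - rate % clk->step;	/* round down to what hardware supports */
	if (r < min)
		r += clk->step;		/* rounding left the range; try one step up */
	if (r < min || r > max || r == 0)
		return 0;
	return r;
}

/* Returns 0 on success, -1 if no supported rate lies within [min, max]. */
static int model_set_rate_range(struct clk_model *clk,
				unsigned long min, unsigned long max)
{
	unsigned long r = model_determine_rate(clk, clk->rate, min, max);

	if (!r)
		return -1;
	clk->rate = r;
	return 0;
}

static int model_set_floor(struct clk_model *clk, unsigned long floor)
{
	return model_set_rate_range(clk, floor, ULONG_MAX);
}

static int model_set_ceiling(struct clk_model *clk, unsigned long ceiling)
{
	return model_set_rate_range(clk, 0, ceiling);
}
```

Because the rounding code sees the bounds, it either lands inside them or reports failure, instead of silently rounding outside the range.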

Did I miss anything?

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project



RE: [PATCH v5] ARM: EXYNOS: add Exynos3250 PMU support

2014-11-21 Thread Kukjin Kim
Bartlomiej Zolnierkiewicz wrote:
> 
> This patch prepares the PMU code for the future:
> - suspend/resume (S2R) support
> - cpuidle AFTR/W-AFTR modes support
> on Exynos3250.
> 
> Cc: Vikas Sajjan 
> Reviewed-by: Pankaj Dubey 
> Acked-by: Kyungmin Park 
> Signed-off-by: Chanwoo Choi 
> Signed-off-by: Bartlomiej Zolnierkiewicz 
> ---
> v5:
> - added Reviewed-by tag from Pankaj Dubey
> - fixed form -> from typo
> 
> v4:
> - rebased on top of next-20141114 branch of linux-next kernel tree
>   (it also applies fine to for-next branch of linux-samsung.git)
> - removed writing to undocumented CORE2 and CORE3 related registers
> - fixed values used for EXYNOS3_[G3D,LCD]_SYS_PWR_REG registers
> - added defines for values used for EXYNOS3_*_DURATION registers
> - removed redundant pr_info("EXYNOS3250 PMU Initialize\n")
> 
> v3:
> - rebased on top of for-next branch of linux-samsung.git and
>   [PATCH v7] mfd: syscon: Decouple syscon interface from platform devices
>   (https://lkml.org/lkml/2014/9/30/156)
>   [PATCH v9 0/2] ARM: Exynos: Convert PMU implementation into a platform driver
>   (https://lkml.org/lkml/2014/10/6/89)
>   [PATCH v9 0/2] Adds PMU and S2R support for exynos5420
>   (http://www.spinics.net/lists/arm-kernel/msg368207.html)
> 
> v2:
> - rebased on top of next-20140708 and
>   http://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg32410.html
>   http://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg33660.html
>   http://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg33675.html
> 
>   this patch also applies fine after/before Exynos5800 PMU support:
>   http://www.mail-archive.com/linux-samsung-soc@vger.kernel.org/msg33835.html
> 
>  arch/arm/mach-exynos/pmu.c  | 167 
> 
>  arch/arm/mach-exynos/regs-pmu.h | 128 ++
>  2 files changed, 295 insertions(+)

Looks good to me. I think each SoC-specific PM feature should be handled in
its own file, like cpufreq, though... maybe next time? :-)

BTW, I need to sort out pmu related changes from Pankaj, Amit and you. If
anything is required, I'll let you know.

Thanks,
Kukjin



Re: [PATCH v3] x86/mce: Try printing all machine check banks known before panic

2014-11-21 Thread rui wang
On 11/22/14, Borislav Petkov  wrote:
>... there are two possibilities:
>
> * error got logged into mcelog and is long out to dmesg.
>
> So we go look at dmesg. Not very easy to do when we panic, I know, so we
> better make sure we have serial connected.
>
>
>  [ Btw., we can know when userspace is eating up error data:
>drivers/ras/debugfs.c. If it doesn't, we can then dump it to dmesg.
>We'll have to teach mcelog/ras daemons to open that file so that we
>don't issue to dmesg. ]
>
>
> * error is not logged yet so still in mcelog and we simply dump it out
> to dmesg.
>
> In any case, we cannot have fixed-size buffer for some number of errors
> and rely on it always having the error which caused the #MC as something
> will consume it at some point anyway.
>
> So maybe if we could get a more detailed explanation of when this thing
> happens, then we might address it better.
>

Hi Boris,
I think both possibilities are valid. But experiments show that the
error logs are not in the dmesg preserved by kdump in /var/crash/
after panic and reboot, and not in the mcelog.entry[] array in the
kernel. So they must be somewhere in user space memory. Even if we
have a serial console connected we still can't catch them. The
difficulty is that there's no easy way to force a user space daemon to
do something during panic.

The new banks_saved[] array acts like a safeguard when you pass
something to someone else - to prevent it from getting lost in the
interim.

Thanks
Rui


Re: [PATCH v4 6/7] ARM: mvebu: add PHY support to the dts for the USB controllers on Armada 375

2014-11-21 Thread Jason Cooper
On Thu, Nov 13, 2014 at 12:47:48PM +0100, Gregory CLEMENT wrote:
> Now that the USB cluster node has been added, use it as a PHY provider
> for the USB controller linked to it: the first EHCI and the xHCI.
> 
> Signed-off-by: Gregory CLEMENT 
> ---
>  arch/arm/boot/dts/armada-375.dtsi | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/armada-375.dtsi 
> b/arch/arm/boot/dts/armada-375.dtsi
> index 8f45cf5d2a50..f344ec420c95 100644
> --- a/arch/arm/boot/dts/armada-375.dtsi
> +++ b/arch/arm/boot/dts/armada-375.dtsi
> @@ -14,6 +14,7 @@
>  #include "skeleton.dtsi"
>  #include 
>  #include 
> +#include <dt-bindings/phy/phy.h>

Odd.  The previous patch in this series simply adds a line to phy.h,
however, I get the following error during 'make dtbs':


  DTC arch/arm/boot/dts/armada-375-db.dtb
In file included from arch/arm/boot/dts/armada-375-db.dts:17:0:
arch/arm/boot/dts/armada-375.dtsi:17:33: fatal error: dt-bindings/phy/phy.h: No such file or directory
 #include <dt-bindings/phy/phy.h>
 ^
compilation terminated.
scripts/Makefile.lib:282: recipe for target 'arch/arm/boot/dts/armada-375-db.dtb' failed


mvebu/dt is based on v3.18-rc1.  Is there a missing dependency
somewhere?  Perhaps we should let Kishon take the whole series and
handle the (hopefully trivial) merge conflict?

thx,

Jason.


Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Andy Lutomirski
On Fri, Nov 21, 2014 at 3:38 PM, Paul E. McKenney
 wrote:
> On Fri, Nov 21, 2014 at 03:06:48PM -0800, Andy Lutomirski wrote:
>> On Fri, Nov 21, 2014 at 2:55 PM, Paul E. McKenney
>>  wrote:
>> > On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
>> >> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
>> >>  wrote:
>> >> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
>> >> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski wrote:
>> >> >> > We currently pretend that IST context is like standard exception
>> >> >> > context, but this is incorrect.  IST entries from userspace are like
>> >> >> > standard exceptions except that they use per-cpu stacks, so they are
>> >> >> > atomic.  IST entries from kernel space are like NMIs from RCU's
>> >> >> > perspective -- they are not quiescent states even if they
>> >> >> > interrupted the kernel during a quiescent state.
>> >> >> >
>> >> >> > Add and use ist_enter and ist_exit to track IST context.  Even
>> >> >> > though x86_32 has no IST stacks, we track these interrupts the same
>> >> >> > way.
>> >> >>
>> >> >> I should add:
>> >> >>
>> >> >> I have no idea why RCU read-side critical sections are safe inside
>> >> >> __do_page_fault today.  It's guarded by exception_enter(), but that
>> >> >> doesn't do anything if context tracking is off, and context tracking
>> >> >> is usually off. What am I missing here?
>> >> >
>> >> > Ah!  There are three cases:
>> >> >
>> >> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU is
>> >> > still paying attention to CPUs running in both userspace and in
>> >> > the kernel.  So if a page fault happens, RCU will be set up to
>> >> > notice any RCU read-side critical sections.
>> >> >
>> >> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
>> >> > might well be ignoring userspace execution: NO_HZ_FULL and
>> >> > all that.  However, as you pointed out, in this case the
>> >> > context-tracking code lets RCU know that we have entered the
>> >> > kernel, which means that RCU will again be paying attention to
>> >> > RCU read-side critical sections.
>> >> >
>> >> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
>> >> > if we take a page fault when context tracking is off, life
>> >> > will be hard.  But the kernel is not supposed to take page
>> >> > faults in the idle loop, so this is not a problem.
>> >>
>> >> I guess so, as long as there are really no page faults in the idle loop.
>> >
>> > As far as I know, there are not.  If there are, someone needs to let
>> > me know!  ;-)
>> >
>> >> There are, however, machine checks in the idle loop, and maybe kprobes
>> >> (haven't checked), so I think this patch might fix real bugs.
>> >
>> > If you can get ISTs from the idle loop, then the patch is needed.
>> >
>> >> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
>> >> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
>> >> > nested calls.
>> >>
>> >> Yes, and vice versa.  That code looked like it handled nesting
>> >> correctly, but I wasn't entirely sure.
>> >
>> > It currently does not, please see below patch.  Are you able to test
>> > nesting?  It would be really cool if you could do so -- I have no
>> > way to test this patch.
>>
>> I can try.  It's sort of easy -- I'll put an int3 into do_nmi and add
>> a fixup to avoid crashing.
>>
>> What should I look for?  Should I try to force full nohz on and assert
>> something?  I don't really know how to make full nohz work.
>
> You should look for the WARN_ON_ONCE() calls in rcu_nmi_enter() and
> rcu_nmi_exit() to fire.

No warning with or without your patch, maybe because all of those
returns skip the labels.

Also, an NMI can happen *during* rcu_nmi_enter or rcu_nmi_exit.  Is
that okay?  Should those dynticks_nmi_nesting++ things be local_inc
and local_dec_and_test?

That dynticks_nmi_nesting thing seems scary to me.  Shouldn't the code
unconditionally increment dynticks_nmi_nesting in rcu_nmi_enter and
unconditionally decrement it in rcu_nmi_exit?

--Andy

>
> Thanx, Paul
>
>> >> Also, just to make sure: are we okay if rcu_nmi_enter() is called
>> >> before exception_enter if context tracking is on and we came directly
>> >> from userspace?
>> >
>> > If I understand correctly, this will result in context tracking invoking
>> > rcu_user_enter(), which will result in the rcu_dynticks counter having an
>> > odd value.  In that case, rcu_nmi_enter() will notice that RCU is already
>> > paying attention to this CPU via its check of (atomic_read(&rdtp->dynticks)
>> > & 0x1), and will thus just return.  The matching rcu_nmi_exit() will
>> > notice that the nesting count is zero, and will also just return.
>> >
>> > Thus, everything works in that case.
>> >
>> > In contrast, if rcu_nmi_enter() was invoked 

Re: [PATCH v2] thermal: provide an UAPI header file

2014-11-21 Thread Florian Fainelli
On 11/20/2014 08:32 AM, Florian Fainelli wrote:
> include/linux/thermal.h contains definitions for the Thermal generic
> netlink family, but none of the valuable information relevant to
> user-space such as the Genl family name, multicast group, version or
> command set and data types is exported to user-space.
> 
> Export all the relevant generic netlink information to user-space to
> make this genl family usable by user-space, and while at it, export
> THERMAL_NAME_LENGTH since it limits name length for thermal_hwmon
> devices.
> 
> Kbuild and MAINTAINERS are also updated accordingly to reflect this new
> file: include/uapi/linux/thermal.h.

I forgot to include the definition for the thermal_event structure which
is multi-casted through netlink, will resubmit with that.

> 
> Signed-off-by: Florian Fainelli 
> ---
> Changes in v2:
> - rebase against Eduardo's thermal/next tree
> 
>  MAINTAINERS  |  1 +
>  include/linux/thermal.h  | 31 +--
>  include/uapi/linux/Kbuild|  1 +
>  include/uapi/linux/thermal.h | 35 +++
>  4 files changed, 38 insertions(+), 30 deletions(-)
>  create mode 100644 include/uapi/linux/thermal.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c444907ccd69..790752a4fad2 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -9294,6 +9294,7 @@ Q:  
> https://patchwork.kernel.org/project/linux-pm/list/
>  S:   Supported
>  F:   drivers/thermal/
>  F:   include/linux/thermal.h
> +F:   include/uapi/linux/thermal.h
>  F:   include/linux/cpu_cooling.h
>  F:   Documentation/devicetree/bindings/thermal/
>  
> diff --git a/include/linux/thermal.h b/include/linux/thermal.h
> index 5bc28a70014e..be959e9df06c 100644
> --- a/include/linux/thermal.h
> +++ b/include/linux/thermal.h
> @@ -29,10 +29,10 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define THERMAL_TRIPS_NONE   -1
>  #define THERMAL_MAX_TRIPS12
> -#define THERMAL_NAME_LENGTH  20
>  
>  /* invalid cooling state */
>  #define THERMAL_CSTATE_INVALID -1UL
> @@ -49,11 +49,6 @@
>  #define MILLICELSIUS_TO_DECI_KELVIN_WITH_OFFSET(t, off) (((t) / 100) + (off))
>  #define MILLICELSIUS_TO_DECI_KELVIN(t) 
> MILLICELSIUS_TO_DECI_KELVIN_WITH_OFFSET(t, 2732)
>  
> -/* Adding event notification support elements */
> -#define THERMAL_GENL_FAMILY_NAME"thermal_event"
> -#define THERMAL_GENL_VERSION0x01
> -#define THERMAL_GENL_MCAST_GROUP_NAME   "thermal_mc_grp"
> -
>  /* Default Thermal Governor */
>  #if defined(CONFIG_THERMAL_DEFAULT_GOV_STEP_WISE)
>  #define DEFAULT_THERMAL_GOVERNOR   "step_wise"
> @@ -86,30 +81,6 @@ enum thermal_trend {
>   THERMAL_TREND_DROP_FULL, /* apply lowest cooling action */
>  };
>  
> -/* Events supported by Thermal Netlink */
> -enum events {
> - THERMAL_AUX0,
> - THERMAL_AUX1,
> - THERMAL_CRITICAL,
> - THERMAL_DEV_FAULT,
> -};
> -
> -/* attributes of thermal_genl_family */
> -enum {
> - THERMAL_GENL_ATTR_UNSPEC,
> - THERMAL_GENL_ATTR_EVENT,
> - __THERMAL_GENL_ATTR_MAX,
> -};
> -#define THERMAL_GENL_ATTR_MAX (__THERMAL_GENL_ATTR_MAX - 1)
> -
> -/* commands supported by the thermal_genl_family */
> -enum {
> - THERMAL_GENL_CMD_UNSPEC,
> - THERMAL_GENL_CMD_EVENT,
> - __THERMAL_GENL_CMD_MAX,
> -};
> -#define THERMAL_GENL_CMD_MAX (__THERMAL_GENL_CMD_MAX - 1)
> -
>  struct thermal_zone_device_ops {
>   int (*bind) (struct thermal_zone_device *,
>struct thermal_cooling_device *);
> diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
> index 4c94f31a8c99..a1943e2d1264 100644
> --- a/include/uapi/linux/Kbuild
> +++ b/include/uapi/linux/Kbuild
> @@ -383,6 +383,7 @@ header-y += tcp.h
>  header-y += tcp_metrics.h
>  header-y += telephony.h
>  header-y += termios.h
> +header-y += thermal.h
>  header-y += time.h
>  header-y += times.h
>  header-y += timex.h
> diff --git a/include/uapi/linux/thermal.h b/include/uapi/linux/thermal.h
> new file mode 100644
> index ..ac5535855982
> --- /dev/null
> +++ b/include/uapi/linux/thermal.h
> @@ -0,0 +1,35 @@
> +#ifndef _UAPI_LINUX_THERMAL_H
> +#define _UAPI_LINUX_THERMAL_H
> +
> +#define THERMAL_NAME_LENGTH  20
> +
> +/* Adding event notification support elements */
> +#define THERMAL_GENL_FAMILY_NAME"thermal_event"
> +#define THERMAL_GENL_VERSION0x01
> +#define THERMAL_GENL_MCAST_GROUP_NAME   "thermal_mc_grp"
> +
> +/* Events supported by Thermal Netlink */
> +enum events {
> + THERMAL_AUX0,
> + THERMAL_AUX1,
> + THERMAL_CRITICAL,
> + THERMAL_DEV_FAULT,
> +};
> +
> +/* attributes of thermal_genl_family */
> +enum {
> + THERMAL_GENL_ATTR_UNSPEC,
> + THERMAL_GENL_ATTR_EVENT,
> + __THERMAL_GENL_ATTR_MAX,
> +};
> +#define THERMAL_GENL_ATTR_MAX (__THERMAL_GENL_ATTR_MAX - 1)
> +
> +/* commands supported by the thermal_genl_family */
> +enum {
> + THERMAL_GENL_CMD_UNSPEC,
> + 

Re: [GIT PULL] ftrace/x86: Add frames pointers to trampoline as necessary

2014-11-21 Thread Linus Torvalds
On Fri, Nov 21, 2014 at 6:50 AM, Steven Rostedt  wrote:
> +.macro create_frame parent rip
> +#ifdef CC_USING_FENTRY
> +   pushq \parent
> +   pushq %rbp
> +   movq %rsp, %rbp

This is a very strange frame.

Why do you do the "pushq \parent" at all? Why isn't this just a *real*
frame and add it to MCOUNT_SAVE_FRAME.

You seem to create this fake frame-within-a-frame thing. It's not clear why.

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/1 linux-next] iio: common: remove unnecessary sizeof(u8)

2014-11-21 Thread Hartmut Knaack
Fabian Frederick schrieb am 16.11.2014 13:33:
> sizeof(u8) is always 1.
> 
> Signed-off-by: Fabian Frederick 
Acked-by: Hartmut Knaack 
> ---
>  drivers/iio/common/st_sensors/st_sensors_spi.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iio/common/st_sensors/st_sensors_spi.c 
> b/drivers/iio/common/st_sensors/st_sensors_spi.c
> index 78a6a1a..5b37737 100644
> --- a/drivers/iio/common/st_sensors/st_sensors_spi.c
> +++ b/drivers/iio/common/st_sensors/st_sensors_spi.c
> @@ -54,7 +54,7 @@ static int st_sensors_spi_read(struct 
> st_sensor_transfer_buffer *tb,
>   if (err)
>   goto acc_spi_read_error;
>  
> - memcpy(data, tb->rx_buf, len*sizeof(u8));
> + memcpy(data, tb->rx_buf, len);
>   mutex_unlock(>buf_lock);
>   return len;
>  
> 
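
A small user-space sketch of why the multiplication is redundant. The u8 typedef and the demo_copy() helper are stand-ins made up for illustration, not code from the driver.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* uint8_t has exactly 8 bits and no padding, so its size is one byte. */
_Static_assert(sizeof(u8) == 1, "u8 is one byte");

/* Copy len bytes; len * sizeof(u8) would be the same byte count. */
static size_t demo_copy(u8 *dst, const u8 *src, size_t len)
{
	memcpy(dst, src, len);
	return len;
}
```

The static assertion documents the assumption at compile time, so the call site can pass len directly.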

Please read the FAQ at  http://www.tux.org/lkml/


Re: Implement lbr-as-callgraph v10

2014-11-21 Thread Andi Kleen
>  f1 tcall.c:9
>  main tcall.c:17
>  main tcall.c:17
>  main tcall.c:16
>  main tcall.c:16
>  f1 tcall.c:12
>  f1 tcall.c:12
>  f2 tcall.c:6
>  f2 tcall.c:4
>  f1 tcall.c:11
>  f1 tcall.c:11
>  f2 tcall.c:6
>  f2 tcall.c:4
>  f1 tcall.c:10
>  f1 tcall.c:9
>  main tcall.c:17
> 
> 
> 
> Do you see the diff?  The 87.65% and 12.35% don't appear in the --tui
> output.

I see the problem. It's some issue in hist_browser__show_callchain.
--stdio doesn't show it because it doesn't seem to use that (?)

With this patch it shows percent for the first entry

@@ -791,7 +791,7 @@ static int hist_browser__show_entry(struct hist_browser 
*browser,
};
 
printed += hist_browser__show_callchain(browser,
-   >sorted_chain, 1, row, total,
+   >sorted_chain, 2, row, total,
hist_browser__show_callchain_entry, 
,
hist_browser__check_output_full);

But the numbers are still different from what --stdio outputs,
so there are some deeper issues.

I doubt I caused this, probably some latent bug that just got triggered.

Namhyung?

-Andi
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 9/9] netfilter: Replace smp_read_barrier_depends() with lockless_dereference()

2014-11-21 Thread Pranith Kumar
On Fri, Nov 21, 2014 at 7:05 PM, Eric Dumazet  wrote:
>
> On Fri, 2014-11-21 at 16:57 -0500, Pranith Kumar wrote:
>
>> Hi Eric,
>>
>> Thanks for looking at this patch.
>>
>> I've been scratching my head since morning trying to find out what was
>> so obviously wrong with this patch. Alas, I don't see what you do.
>>
>> Could you point it out and show me how incompetent I am, please?
>>
>> Thanks!
>
> Well, even if the code is _not_ broken, I don't see any value with this
> patch.

Phew. Not being broken itself is a win :)

>
> If I use git blame on current code, line containing
> smp_read_barrier_depends() exactly points to the relevant commit [1]

And that is an opinion I will respect. I don't want to muck the git
history where it is significant.

This effort is to eventually replace the uses of
smp_read_barrier_depends() and to use either rcu or
lockless_dereference() as documented in memory-barriers.txt.

>
> After your change, it will point to some cleanup, which makes little
> sense to me, considering you did not change the smp_wmb() in
> xt_replace_table().

That does not need to change as it is fine as it is. It still pairs
with the smp_read_barrier_depends() in lockless_dereference().

>
> I, as a netfilter contributor would like to keep current code as is,
> because it is how I feel safe with it.
>
> We have a proliferation of interfaces, but this does not help to
> understand the issues and code maintenance.
>
> smp_read_barrier_depends() better documents the read barrier than
> lockless_dereference().

I think this is a matter of opinion. But in the current effort I've
seen cases where it is not clear what the barrier is actually
guaranteeing. I am glad that the current code is not one of those and
it has reasonable comments.

lockless_dereference() on the other hand makes the dependency explicit.
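
As a rough user-space analogy (not the kernel primitives), the same publish/read pattern can be written with C11 atomics. Everything named demo_ is hypothetical, and memory_order_acquire is used as a conservative stand-in for the dependency ordering that lockless_dereference() relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_table_info {
	int entries;
};

/* Published pointer; hypothetical stand-in for table->private. */
static _Atomic(struct demo_table_info *) demo_private;

/* Writer: initialise the object fully, then publish the pointer. */
static void demo_publish(struct demo_table_info *info, int entries)
{
	info->entries = entries;
	atomic_store_explicit(&demo_private, info, memory_order_release);
}

/*
 * Reader: one ordered load of the pointer.  Loads of members through
 * the returned pointer are ordered after the pointer load itself.
 */
static int demo_read_entries(void)
{
	struct demo_table_info *info =
		atomic_load_explicit(&demo_private, memory_order_acquire);

	return info ? info->entries : -1;
}
```

The point of the single-call form is that the pointer load and its ordering requirement cannot drift apart when the code is edited later.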

>
> The point of having a lock or not is irrelevant here.
>
> [1]
> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=b416c144f46af1a30ddfa4e4319a8f077381ad63

Thanks!
-- 
Pranith
Please read the FAQ at  http://www.tux.org/lkml/


Re: frequent lockups in 3.18rc4

2014-11-21 Thread Thomas Gleixner
On Fri, 21 Nov 2014, Konrad Rzeszutek Wilk wrote:
> On Fri, Nov 21, 2014 at 08:51:43PM +0100, Thomas Gleixner wrote:
> > On Fri, 21 Nov 2014, Linus Torvalds wrote:
> > > Here's the simplified end result. Again, this is TOTALLY UNTESTED. I
> > > compiled it and verified that the code generation looks like what I'd
> > > have expected, but that's literally it.
> > > 
> > >   static noinline int vmalloc_fault(unsigned long address)
> > >   {
> > > pgd_t *pgd_dst;
> > > pgdval_t pgd_entry;
> > > unsigned index = pgd_index(address);
> > > 
> > > if (index < KERNEL_PGD_BOUNDARY)
> > > return -1;
> > > 
> > > pgd_entry = init_mm.pgd[index].pgd;
> > > if (!pgd_entry)
> > > return -1;
> > > 
> > > pgd_dst = __va(PAGE_MASK & read_cr3());
> > > pgd_dst += index;
> > > 
> > > if (pgd_dst->pgd)
> > > return -1;
> > > 
> > > ACCESS_ONCE(pgd_dst->pgd) = pgd_entry;
> > 
> > This will break paravirt. set_pgd/set_pmd are paravirt functions.
> > 
> > But I'm fine with breaking it, then you just need to change
> > CONFIG_PARAVIRT to 'def_bool n'
> 
> That is not very nice.

Maybe not nice, but sensible.

Thanks,

tglx

Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 3/4] i2c: omap: don't reset controller if Arbitration Lost detected

2014-11-21 Thread Felipe Balbi
On Sat, Nov 22, 2014 at 02:51:47AM +0400, Alexander Kochetkov wrote:
> Arbitration Lost is an expected situation in a multimaster
> environment. The I2C controller (IP) correctly detects and reports AL.
> 
> The only visible reason for resetting the IP in the AL case is
> to avoid advisory 1.94 (omap3) and errata i595 (omap4): "I2C:
> After an Arbitration is Lost the Module Incorrectly Starts
> the Next Transfer".
> 
> Errata workaround states: "The MST and STT bits inside I2C_CON
> should be set to 1 at the same moment (avoid setting the MST bit
> to 1 while STT = 0)." The driver never sets the MST and STT bits
> separately and doesn't create the condition for the errata. So the reset
> is not necessary.
> 
> Also corrected return value for AL to -EAGAIN.
> 
> Tested on Beagleboard XM C.
> 
> Signed-off-by: Alexander Kochetkov 

you could have kept my tested-by and reviewed-by:

Tested-by: Felipe Balbi 
Reviewed-by: Felipe Balbi 


> On 21.10.2014 21:11, Wolfram Sang  wrote:
> > The errno for AL is -EAGAIN. Curly braces are not needed.
> 
> Thank you, Wolfram, fixed.
> 
>  drivers/i2c/busses/i2c-omap.c |6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
> index 3ffb9c0..02da567 100644
> --- a/drivers/i2c/busses/i2c-omap.c
> +++ b/drivers/i2c/busses/i2c-omap.c
> @@ -707,13 +707,15 @@ static int omap_i2c_xfer_msg(struct i2c_adapter *adap,
>   return 0;
>  
>   /* We have an error */
> - if (dev->cmd_err & (OMAP_I2C_STAT_AL | OMAP_I2C_STAT_ROVR |
> - OMAP_I2C_STAT_XUDF)) {
> + if (dev->cmd_err & (OMAP_I2C_STAT_ROVR | OMAP_I2C_STAT_XUDF)) {
>   omap_i2c_reset(dev);
>   __omap_i2c_init(dev);
>   return -EIO;
>   }
>  
> + if (dev->cmd_err & OMAP_I2C_STAT_AL)
> + return -EAGAIN;
> +
>   if (dev->cmd_err & OMAP_I2C_STAT_NACK) {
>   if (msg->flags & I2C_M_IGNORE_NAK)
>   return 0;
> -- 
> 1.7.9.5
> 
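
The ordering of the checks in the patch can be sketched in isolation. This is a user-space model only: the DEMO_ flag values and the demo_classify_error() helper are made up for illustration and are not taken from the driver headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag bits; the real register layout is not modelled. */
#define DEMO_STAT_AL	(1u << 0)	/* arbitration lost */
#define DEMO_STAT_XUDF	(1u << 1)	/* transmit underflow */
#define DEMO_STAT_ROVR	(1u << 2)	/* receive overrun */

/*
 * Map a latched error mask to an errno-style result.  *need_reset is
 * set only for overrun/underflow; arbitration loss is reported as
 * -EAGAIN without touching the controller.
 */
static int demo_classify_error(unsigned int cmd_err, bool *need_reset)
{
	*need_reset = false;

	if (cmd_err & (DEMO_STAT_ROVR | DEMO_STAT_XUDF)) {
		*need_reset = true;
		return -EIO;
	}

	if (cmd_err & DEMO_STAT_AL)
		return -EAGAIN;	/* caller may retry the transfer */

	return 0;	/* other error bits are outside this sketch */
}
```

Because overrun/underflow is tested first, a mask that carries both AL and XUDF still results in a reset and -EIO, as in the patch.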

-- 
balbi


signature.asc
Description: Digital signature


Re: [PATCH] HID: usbhid: get/put around clearing needs_remote_wakeup

2014-11-21 Thread Benson Leung
On Fri, Nov 14, 2014 at 1:08 AM, Oliver Neukum  wrote:
> On Thu, 2014-11-13 at 12:16 -0800, Benson Leung wrote:
>
>> In usbhid_open, usb_autopm_get_interface is called
>> before setting the needs_remote_wakeup flag, and
>> usb_autopm_put_interface is called after hid_start_in.
>>
>> However, when the device is closed in usbhid_close, the same
>> protection isn't there when clearing needs_remote_wakeup. This will
>> add that to usbhid_close as well as usbhid_stop.
>
> Interesting, but this has the side effect of waking devices
> that are asleep just to remove the flag.
>
> Regards


If devices are already asleep with this flag enabled, that means that
they are presently configured for remote wake.

Waking the device in the case of a close() is appropriate because it
also has the effect of re-suspending the device with the capability
disabled, as it is no longer necessary.

-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] ipc,sem block sem_lock on sma->lock during sma initialization

2014-11-21 Thread Davidlohr Bueso
On Fri, 2014-11-21 at 18:03 -0500, Rik van Riel wrote:
> On 11/21/2014 03:42 PM, Andrew Morton wrote:
> > On Fri, 21 Nov 2014 15:29:27 -0500 Rik van Riel 
> > wrote:
> > 
> >> On 11/21/2014 03:09 PM, Andrew Morton wrote:
> >>> On Fri, 21 Nov 2014 14:52:26 -0500 Rik van Riel
> >>>  wrote:
> >>> 
>  When manipulating just one semaphore with semop, sem_lock
>  only takes that single semaphore's lock. This creates a
>  problem during initialization of the semaphore array, when
>  the data structures used by sem_lock have not been set up
>  yet. The sma->lock is already held by newary, and we just
>  have to make sure everything else waits on that lock during
>  initialization.
>  
>  Luckily it is easy to make sem_lock wait on the sma->lock,
>  by pretending there is a complex operation in progress while
>  the sma is being initialized.
>  
>  The newary function already zeroes sma->complex_count before 
>  unlocking the sma->lock.
> >>> 
> >>> What are the runtime effects of the bug?
> >>> 
> >> 
> >> NULL pointer dereference in spin_lock from sem_lock, if it is
> >> called before sma->sem_base has been pointed somewhere valid.
> > 
> > Help us out here.  People need to use this description to work out 
> > which kernel versions need the patch and whether to backport the
> > fix into their various kernels.  Other people will be starting at
> > this changelog wondering "will this fix the bug my customer has
> > reported".
> > 
> > Is there some bug report people can look at?

This would be nice for the changelog...

> > 
> > What userspace actions trigger this bug?
> 
> The reason the bug took almost two years to get noticed is that
> it takes one task doing a semop on a semaphore in an array that
> is still getting instantiated by newary (semget) from another
> task.
> 
> In other words, if you try to use a semaphore array before
> semget returns, you can oops the task that calls semop.

This seems bogus from an application level: how can you call semop if
you don't have the semid yet returned from semget? And the fact that the
race is with newary, means that the call is in fact creating a *new*
set, as opposed to plugging into an already existing set.

The fix in newary() being before the actual creation of the id seems
even stranger:

sma->complex_count = 1;
id = ipc_addid(_ids(ns), >sem_perm, ns->sc_semmni);

As for semtimedop() before even getting to sem_lock(), we first call:

sma = sem_obtain_object_check(ns, semid);

So shouldn't that fail anyway before we even consider acquiring the lock?

Thanks,
Davidlohr

Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] [LBR] Dump LBRs on Oops

2014-11-21 Thread Thomas Gleixner
On Fri, 21 Nov 2014, Emmanuel Berthier wrote:

> The purpose of this patch is to stop LBR at the early stage of
> Exception Handling, and dump its content later in the dumpstack
> process.

And that's useful in what way? The changelog should not tell WHAT the
patch does. It should tell WHY it is useful and what are the
usecases/benefits. Neither does it tell how that feature can be
used/enabled/disabled and how it provides useful information.

Where is that sample output which demonstrates that this is something
which adds debugging value rather than another level of pointless
featuritis?

> --- a/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_lbr.c
> @@ -4,7 +4,9 @@
>  #include 
>  #include 
>  #include 
> -
> +#ifdef CONFIG_LBR_DUMP_ON_EXCEPTION
> +#include 
> +#endif

We just can include that file unconditionally.

>  #include "perf_event.h"
>  
>  enum {
> @@ -135,6 +137,9 @@ static void __intel_pmu_lbr_enable(void)
>   u64 debugctl;
>   struct cpu_hw_events *cpuc = this_cpu_ptr(_hw_events);
>  
> + if (IS_ENABLED(CONFIG_LBR_DUMP_ON_EXCEPTION))
> + lbr_dump_on_exception = 0;

With certain compilers you might get a surprise here, because they are
too stupid to remove that 'lbr_dump_on_exception = 0;' right
away. They kill it in the optimization phase. So they complain about
lbr_dump_on_exception not being defined.

So you need something like this:

static inline void lbr_set_dump_on_oops(bool enable)
{
#ifdef CONFIG_LBR_DUMP_ON_EXCEPTION
   
#endif
}

and make that

 if (IS_ENABLED(CONFIG_LBR_DUMP_ON_EXCEPTION))
 lbr_set_dump_on_oops(false);

which is completely pointless as you can just call

  lbr_set_dump_on_oops(false);

unconditionally and be done with it.

IS_ENABLED(CONFIG_XXX) is not a proper solution for all problems. It
can avoid #ifdefs, but it also can introduce interesting nonsense.
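
Spelled out as a compilable user-space sketch, the helper pattern looks roughly like this. Assumptions: the assignment inside the #ifdef is a guess at the intended body, the #define merely stands in for the Kconfig symbol, and lbr_get_dump_on_oops() is a hypothetical getter added so the sketch can be checked.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for Kconfig; remove this line to build the stub variant. */
#define CONFIG_LBR_DUMP_ON_EXCEPTION 1

#ifdef CONFIG_LBR_DUMP_ON_EXCEPTION
static bool lbr_dump_on_exception;	/* exists only with the option on */
#endif

/* Callers use this unconditionally; the #ifdef lives in one place. */
static inline void lbr_set_dump_on_oops(bool enable)
{
#ifdef CONFIG_LBR_DUMP_ON_EXCEPTION
	lbr_dump_on_exception = enable;	/* assumed body */
#else
	(void)enable;			/* stub: nothing to record */
#endif
}

/* Hypothetical getter, present only to make the sketch testable. */
static inline bool lbr_get_dump_on_oops(void)
{
#ifdef CONFIG_LBR_DUMP_ON_EXCEPTION
	return lbr_dump_on_exception;
#else
	return false;
#endif
}
```

With the option off, the variable is never referenced outside the #ifdef, so no compiler can complain about an undefined symbol.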

>   if (cpuc->lbr_sel)
>   wrmsrl(MSR_LBR_SELECT, cpuc->lbr_sel->config);
>  
> @@ -147,6 +152,9 @@ static void __intel_pmu_lbr_disable(void)
>  {
>   u64 debugctl;
>  
> + if (IS_ENABLED(CONFIG_LBR_DUMP_ON_EXCEPTION))
> + lbr_dump_on_exception = 1;

Now the even more interesting question is, WHY is
lbr_dump_on_exception enabled in __intel_pmu_lbr_disable and disabled
in __intel_pmu_lbr_enable?

This obviously lacks an understandable comment, but before you
elaborate on this see the next comment.

> +void show_lbrs(void)
> +{
> + if (IS_ENABLED(CONFIG_LBR_DUMP_ON_EXCEPTION)) {
> + u64 debugctl;
> + int i, lbr_on;
> +
> + rdmsrl(MSR_IA32_DEBUGCTLMSR, debugctl);
> + lbr_on = debugctl & DEBUGCTLMSR_LBR;
> +
> + pr_info("Last Branch Records:");
> + if (!lbr_dump_on_exception) {
> + pr_cont(" (Disabled by perf_event)\n");

So, if perf uses LBR we do not print it? What a weird design
decision. If the machine crashes, we want that information no matter
whether perf is active or not. What kind of twisted logic is that?

> + } else if (x86_pmu.lbr_nr == 0) {
> + pr_cont(" (x86_model unknown, check 
> intel_pmu_init())\n");

Huch? Why do we get here if the pmu does not support it at all? Why
should we bother to print it? If it's not printed it's not
available. It's that simple.

> + } else if (lbr_on) {
> + pr_cont(" (not halted)\n");

Why would it be not halted? Code comments are optional, right?

> + } else {
> + struct cpu_hw_events *cpuc =
> + this_cpu_ptr(_hw_events);

A simple #ifdef would have saved you an indentation level and actually
made that code readable. IS_ENABLED() is a proper hammer for some
things but not everything is a nail.

> + intel_pmu_lbr_read_64(cpuc);
> +
> + pr_cont("\n");
> + for (i = 0; i < cpuc->lbr_stack.nr; i++) {
> + pr_info("   to: [<%016llx>] ",
> + cpuc->lbr_entries[i].to);
> + print_symbol("%s\n", cpuc->lbr_entries[i].to);
> + pr_info(" from: [<%016llx>] ",
> + cpuc->lbr_entries[i].from);
> + print_symbol("%s\n", cpuc->lbr_entries[i].from);
> + }
> + }
> + }
> +}
> +
>  void show_regs(struct pt_regs *regs)
>  {
>   int i;
> @@ -314,10 +352,15 @@ void show_regs(struct pt_regs *regs)
>   unsigned char c;
>   u8 *ip;
>  
> + /*
> +  * Called before show_stack_log_lvl() as it could trig
> +  * page_fault and reenable LBR

Huch? The kernel stack dump is going to page fault? If that happens
then you are in deep shit anyway. I doubt that anything useful gets
out of the machine at 

Re: [PATCH] HID: usbhid: get/put around clearing needs_remote_wakeup

2014-11-21 Thread Benson Leung
Hi Alan,

On Fri, Nov 14, 2014 at 7:17 AM, Alan Stern  wrote:
>
> The reason for the get/put is to force a call to autosuspend_check().
> But in this case, if killing the interrupt URB causes
> autosuspend_check() to run then the get/put isn't needed.
>
> On the other hand, I don't see why killing the interrupt URB would
> cause autosuspend_check() to run.  Can you explain that?

Sorry for the delay in my response. I did some more checking of my
particular failure, and my commit message is incorrect. The
usb_kill_urb is actually not the cause of this problem. It does not
result in autosuspend_check() itself, and is only serving to add some
delay.

hidraw_release() in hidraw.c calls drop_ref(), which calls the
following in sequence upon clearing the last reader :
/* close device for last reader */
hid_hw_power(hidraw->hid, PM_HINT_NORMAL);
hid_hw_close(hidraw->hid);

hid_hw_power results in a usb_autopm_put_interface. In this case, the
reference count is decremented to 0, and a delayed autosuspend request
is attempted.
hid_hw_close leads to usbhid_close, which clears needs_remote_wakeup.

However, there's no guarantee that the clear of needs_remote_wakeup
will occur before the delayed work ( runtime_idle() ->
autosuspend_check() )  runs. Moving usbhid->intf->needs_remote_wakeup
= 0 to before the usb_kill_urb(usbhid->urbin) only serves to reduce
the amount of time between these events and makes this particular
failure less likely.

The correct solution is to put get/put around each change of
needs_remote_wakeup, as that will correctly trigger another delayed
autosuspend_check(), whose result is affected by the state of
needs_remote_wakeup.

Since autosuspend_check() occurs as delayed work, I think it is
appropriate to add get/put around the clear in usbhid_stop as well.
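
A toy user-space model of the argument, in case it helps. All names here are hypothetical, and the idle check runs synchronously when the last reference is dropped, whereas the real autosuspend_check() runs later as delayed work.

```c
#include <assert.h>
#include <stdbool.h>

struct demo_intf {
	int pm_usage;			/* autopm reference count */
	bool needs_remote_wakeup;
	bool suspended_with_wakeup;	/* result of the last idle check */
};

/* Stand-in for autosuspend_check(): samples the flag when it runs. */
static void demo_idle_check(struct demo_intf *intf)
{
	intf->suspended_with_wakeup = intf->needs_remote_wakeup;
}

static void demo_get(struct demo_intf *intf)
{
	intf->pm_usage++;
}

/* Dropping the last reference triggers a fresh idle check. */
static void demo_put(struct demo_intf *intf)
{
	assert(intf->pm_usage > 0);
	if (--intf->pm_usage == 0)
		demo_idle_check(intf);
}

/* Change the flag under a reference so the final put re-evaluates it. */
static void demo_set_remote_wakeup(struct demo_intf *intf, bool on)
{
	demo_get(intf);
	intf->needs_remote_wakeup = on;
	demo_put(intf);
}
```

In the model, a bare write to needs_remote_wakeup leaves the earlier suspend decision in place, while the get/put wrapper forces it to be recomputed with the new value.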

-- 
Benson Leung
Software Engineer, Chrom* OS
ble...@chromium.org

On Fri, Nov 14, 2014 at 7:17 AM, Alan Stern  wrote:
> On Thu, 13 Nov 2014, Benson Leung wrote:
>
>> Hi Alan,
>>
>> On Thu, Nov 13, 2014 at 2:11 PM, Alan Stern  
>> wrote:
>> > Wait a minute -- in your previous email you said this approach didn't
>> > work.  So does it work or doesn't it?
>>
>> Sorry for the confusion. The approach *does* work.
>>
>> That was actually my original idea to fix the problem, but I saw other
>> places in the kernel where it was done with a get/put.
>
> The reason for the get/put is to force a call to autosuspend_check().
> But in this case, if killing the interrupt URB causes
> autosuspend_check() to run then the get/put isn't needed.
>
> On the other hand, I don't see why killing the interrupt URB would
> cause autosuspend_check() to run.  Can you explain that?
>
> Alan Stern
Please read the FAQ at  http://www.tux.org/lkml/


Re: frequent lockups in 3.18rc4

2014-11-21 Thread Andy Lutomirski
On Fri, Nov 21, 2014 at 4:18 PM, Linus Torvalds
 wrote:
> On Fri, Nov 21, 2014 at 4:11 PM, Tejun Heo  wrote:
>>
>> I don't think there's much percpu allocator itself can do.  The
>> ability to grow dynamically comes from being able to allocate
>> relatively consistent layout among areas for different CPUs and pretty
>> much requires vmalloc area and it'd generally be a good idea to take
>> out the vmalloc fault anyway.
>
> Why do you guys worry so much about the vmalloc fault?
>
> This started because of a very different issue: putting the actual
> stack in vmalloc space. Then it can cause nasty triple faults etc.
>
> But the normal vmalloc fault? Who cares, really? If that causes
> problems, they are bugs. Fix them.

Because of this in system_call_after_swapgs:

movq%rsp,PER_CPU_VAR(old_rsp)
movqPER_CPU_VAR(kernel_stack),%rsp

It occurs to me that, if we really want to change that, we could have
an array of syscall trampolines, one per CPU, that have the CPU number
hardcoded.  But I really don't think that's worth it.

Other than that, with your fix, vmalloc faults are no big deal :)

--Andy
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v2 9/9] netfilter: Replace smp_read_barrier_depends() with lockless_dereference()

2014-11-21 Thread Andres Freund
Hi,

On 2014-11-21 16:57:00 -0500, Pranith Kumar wrote:
> On Fri, Nov 21, 2014 at 11:12 AM, Eric Dumazet  wrote:
> > On Fri, 2014-11-21 at 10:06 -0500, Pranith Kumar wrote:
> >> Recently lockless_dereference() was added which can be used in place of
> >> hard-coding smp_read_barrier_depends(). The following PATCH makes the 
> >> change.
> >>
> >> Signed-off-by: Pranith Kumar 
> >> ---
> >>  net/ipv4/netfilter/arp_tables.c | 3 +--
> >>  net/ipv4/netfilter/ip_tables.c  | 3 +--
> >>  net/ipv6/netfilter/ip6_tables.c | 3 +--
> >>  3 files changed, 3 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/net/ipv4/netfilter/arp_tables.c 
> >> b/net/ipv4/netfilter/arp_tables.c
> >> index f95b6f9..fc7533d 100644
> >> --- a/net/ipv4/netfilter/arp_tables.c
> >> +++ b/net/ipv4/netfilter/arp_tables.c
> >> @@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
> >>
> >>   local_bh_disable();
> >>   addend = xt_write_recseq_begin();
> >> - private = table->private;
> >>   /*
> >>* Ensure we load private-> members after we've fetched the base
> >>* pointer.
> >>*/
> >> - smp_read_barrier_depends();
> >> + private = lockless_dereference(table->private);
> >>   table_base = private->entries[smp_processor_id()];
> >>
> >
> >
> > Please carefully read the code, before and after your change, then
> > you'll see this change broke the code.
> >
> > Problem is that a bug like that can be really hard to diagnose and fix
> > later, so really you have to be very careful doing these mechanical
> > changes.
> >
> > IMO, current code+comment is better than with this
> > lockless_dereference() which in this particular case obfuscates the
> > code more than anything.
> >
> > In this case we do have a lock (sort of), so lockless_dereference() is
> > quite misleading.
> >
> 
> Hi Eric,
> 
> Thanks for looking at this patch.
> 
> I've been scratching my head since morning trying to find out what was
> so obviously wrong with this patch. Alas, I don't see what you do.

Afaics the read_barrier_depends protected the load from private->entries[x]
earlier, not the load of table->private itself.

Greetings,

Andres Freund
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v3 2/2] ARM: dts: AM43xx: add tscadc DT entries for am437x-evm and am43x-epos-evm

2014-11-21 Thread Tony Lindgren
* Vignesh R  [141121 02:18]:
> This patch adds tscadc DT entries for am437x-gp-evm
> and am43x-epos-evm.

Applying into omap-for-v3.19/dt-v2 thanks.

Tony
Please read the FAQ at  http://www.tux.org/lkml/


[for-next][PATCH] printk/percpu: Define printk_func when printk is not defined

2014-11-21 Thread Steven Rostedt
  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: 04b74b27c2941e5d62120f6fee3a0a9388a30613


Steven Rostedt (Red Hat) (1):
  printk/percpu: Define printk_func when printk is not defined


 include/linux/percpu.h | 1 +
 include/linux/printk.h | 4 ++--
 kernel/printk/printk.c | 3 +++
 3 files changed, 6 insertions(+), 2 deletions(-)
---
commit 04b74b27c2941e5d62120f6fee3a0a9388a30613
Author: Steven Rostedt (Red Hat) 
Date:   Fri Nov 21 09:16:58 2014 -0500

printk/percpu: Define printk_func when printk is not defined

To avoid include hell, the per_cpu variable printk_func was declared
in percpu.h. But it is only defined if printk is defined.

As users of printk may also use the printk_func variable, it needs to
be defined even if CONFIG_PRINTK is not.

Also add a printk.h include in percpu.h just to be safe.

Link: http://lkml.kernel.org/r/20141121183215.01ba5...@canb.auug.org.au

Reported-by: Stephen Rothwell 
Signed-off-by: Steven Rostedt 

diff --git a/include/linux/percpu.h b/include/linux/percpu.h
index ba2e85a0ff5b..caebf2a758dc 100644
--- a/include/linux/percpu.h
+++ b/include/linux/percpu.h
@@ -5,6 +5,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
diff --git a/include/linux/printk.h b/include/linux/printk.h
index 3bbd979d32fb..c69be9ee8f48 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -124,6 +124,8 @@ static inline __printf(1, 2) __cold
 void early_printk(const char *s, ...) { }
 #endif
 
+typedef int(*printk_func_t)(const char *fmt, va_list args);
+
 #ifdef CONFIG_PRINTK
 asmlinkage __printf(5, 0)
 int vprintk_emit(int facility, int level,
@@ -162,8 +164,6 @@ extern int kptr_restrict;
 
 extern void wake_up_klogd(void);
 
-typedef int(*printk_func_t)(const char *fmt, va_list args);
-
 void log_buf_kexec_setup(void);
 void __init setup_log_buf(int early);
 void dump_stack_set_arch_desc(const char *fmt, ...);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f7b723f98cb9..5af2b8bc88f0 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1896,6 +1896,9 @@ static size_t msg_print_text(const struct printk_log 
*msg, enum log_flags prev,
 bool syslog, char *buf, size_t size) { return 0; }
 static size_t cont_print_text(char *text, size_t size) { return 0; }
 
+/* Still needs to be defined for users */
+DEFINE_PER_CPU(printk_func_t, printk_func);
+
 #endif /* CONFIG_PRINTK */
 
 #ifdef CONFIG_EARLY_PRINTK


Re: frequent lockups in 3.18rc4

2014-11-21 Thread Linus Torvalds
On Fri, Nov 21, 2014 at 4:11 PM, Tejun Heo  wrote:
>
> I don't think there's much percpu allocator itself can do.  The
> ability to grow dynamically comes from being able to allocate
> relatively consistent layout among areas for different CPUs and pretty
> much requires vmalloc area and it'd generally be a good idea to take
> out the vmalloc fault anyway.

Why do you guys worry so much about the vmalloc fault?

This started because of a very different issue: putting the actual
stack in vmalloc space. Then it can cause nasty triple faults etc.

But the normal vmalloc fault? Who cares, really? If that causes
problems, they are bugs. Fix them.

Linus


[PATCH] media/au0828: Fix IR stop, poll to not access device during disconnect

2014-11-21 Thread Shuah Khan
The au0828 IR stop and poll routines continue to access the device
while USB disconnect is in progress. There is a small window between
device disconnect and the point where the USB interface is set to NULL.
Accesses in that window fill the log with several of the following
error messages. Fix it by detecting the device disconnect condition
and avoiding device access.

Nov 20 18:58:02 anduin kernel: [  102.949819] au0828: au0828_usb_disconnect()
Nov 20 18:58:02 anduin kernel: [  102.950046] au0828: send_control_msg() Failed sending control message, error -71.
Nov 20 18:58:02 anduin kernel: [  102.950052] au0828: send_control_msg() Failed sending control message, error -19.
Nov 20 18:58:02 anduin kernel: [  102.950056] au0828: send_control_msg() Failed sending control message, error -19.
Nov 20 18:58:02 anduin kernel: [  102.950061] au0828: send_control_msg() Failed sending control message, error -19.
Nov 20 18:58:02 anduin kernel: [  102.950065] au0828: recv_control_msg() Failed receiving control message, error -19.
Nov 20 18:58:02 anduin kernel: [  102.950069] au0828: recv_control_msg() Failed receiving control message, error -19.
Nov 20 18:58:02 anduin kernel: [  102.950072] au0828: recv_control_msg() Failed receiving control message, error -19.

Signed-off-by: Shuah Khan 
---
 drivers/media/usb/au0828/au0828-core.c  |8 
 drivers/media/usb/au0828/au0828-input.c |   11 +--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/drivers/media/usb/au0828/au0828-core.c b/drivers/media/usb/au0828/au0828-core.c
index bc06480..2c3d3c1 100644
--- a/drivers/media/usb/au0828/au0828-core.c
+++ b/drivers/media/usb/au0828/au0828-core.c
@@ -153,6 +153,14 @@ static void au0828_usb_disconnect(struct usb_interface *interface)
 
dprintk(1, "%s()\n", __func__);
 
+   /* there is a small window after disconnect, before
+  dev->usbdev is NULL, for poll (e.g: IR) try to access
+  the device and fill the dmesg with error messages.
+  Set the status so poll routines can check and avoid
+  access after disconnect.
+   */
+   dev->dev_state = DEV_DISCONNECTED;
+
au0828_rc_unregister(dev);
/* Digital TV */
au0828_dvb_unregister(dev);
diff --git a/drivers/media/usb/au0828/au0828-input.c b/drivers/media/usb/au0828/au0828-input.c
index 63995f9..c7185c1 100644
--- a/drivers/media/usb/au0828/au0828-input.c
+++ b/drivers/media/usb/au0828/au0828-input.c
@@ -129,6 +129,10 @@ static int au0828_get_key_au8522(struct au0828_rc *ir)
int prv_bit, bit, width;
bool first = true;
 
+   /* do nothing if device is disconnected */
+   if (ir->dev->dev_state == DEV_DISCONNECTED)
+   return 0;
+
/* Check IR int */
rc = au8522_rc_read(ir, 0xe1, -1, buf, 1);
if (rc < 0 || !(buf[0] & (1 << 4))) {
@@ -255,8 +259,11 @@ static void au0828_rc_stop(struct rc_dev *rc)
 
cancel_delayed_work_sync(&ir->work);
 
-   /* Disable IR */
-   au8522_rc_clear(ir, 0xe0, 1 << 4);
+   /* do nothing if device is disconnected */
+   if (ir->dev->dev_state != DEV_DISCONNECTED) {
+   /* Disable IR */
+   au8522_rc_clear(ir, 0xe0, 1 << 4);
+   }
 }
 
 static int au0828_probe_i2c_ir(struct au0828_dev *dev)
-- 
1.7.10.4



Re: frequent lockups in 3.18rc4

2014-11-21 Thread Tejun Heo
Hello, Frederic.

On Fri, Nov 21, 2014 at 10:44:46PM +0100, Frederic Weisbecker wrote:
> I fear that enumerating and fixing the existing issues won't be enough.
> We can't find all the code sites out there which rely on not being
> faulted.

Oh, sure but that can take some time so adding documentation in the
mean time probably isn't a bad idea.

> The best would be to fix that from the percpu allocator itself, or
> vmalloc.

I don't think there's much percpu allocator itself can do.  The
ability to grow dynamically comes from being able to allocate
relatively consistent layout among areas for different CPUs and pretty
much requires vmalloc area and it'd generally be a good idea to take
out the vmalloc fault anyway.

Thanks.

-- 
tejun


Re: [PATCH v2 9/9] netfilter: Replace smp_read_barrier_depends() with lockless_dereference()

2014-11-21 Thread Eric Dumazet

On Fri, 2014-11-21 at 16:57 -0500, Pranith Kumar wrote:

> Hi Eric,
> 
> Thanks for looking at this patch.
> 
> I've been scratching my head since morning trying to find out what was
> so obviously wrong with this patch. Alas, I don't see what you do.
> 
> Could you point it out and show me how incompetent I am, please?
> 
> Thanks!

Well, even if the code is _not_ broken, I don't see any value with this
patch.

If I use git blame on current code, line containing
smp_read_barrier_depends() exactly points to the relevant commit [1]

After your change, it will point to some cleanup, which makes little
sense to me, considering you did not change the smp_wmb() in
xt_replace_table().

I, as a netfilter contributor would like to keep current code as is,
because it is how I feel safe with it.

We have a proliferation of interfaces, but this does not help to
understand the issues and code maintenance.

smp_read_barrier_depends() better documents the read barrier than 
lockless_dereference().

The point of having a lock or not is irrelevant here.

[1]
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=b416c144f46af1a30ddfa4e4319a8f077381ad63






Re: PROBLEM: apparent out-of-bounds memory write in fs/ecryptfs/crypto.c

2014-11-21 Thread Kees Cook
Hi Dmitry,

On Fri, Nov 21, 2014 at 04:44:02PM +0400, Dmitry Chernenkov wrote:
> Hi!
> 
> The following issue was discovered using Kernel Address Sanitizer
> we're developing
> (https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel,
> https://github.com/google/kasan/blob/kasan/Documentation/kasan.txt)
> 
> The apparent problem is here (fs/ecryptfs/crypto.c:1866 ..  I tested
> on kernel 3.14, but looks like the issue is still there upstream):
> 
> static size_t ecryptfs_max_decoded_size(size_t encoded_size)
> {
> /* Not exact; conservatively long. Every block of 4
> * encoded characters decodes into a block of 3
> * decoded characters. This segment of code provides
> * the caller with the maximum amount of allocated
> * space that @dst will need to point to in a
> * subsequent call. */
> return ((encoded_size + 1) * 3) / 4;
> }
> 
> /**
>  * ecryptfs_decode_from_filename
>  * @dst: If NULL, this function only sets @dst_size and returns. If
>  *   non-NULL, this function decodes the encoded octets in @src
>  *   into the memory that @dst points to.
>  * @dst_size: Set to the size of the decoded string.
>  * @src: The encoded set of octets to decode.
>  * @src_size: The size of the encoded set of octets to decode.
>  */
> static void
> ecryptfs_decode_from_filename(unsigned char *dst, size_t *dst_size,
>  const unsigned char *src, size_t src_size)
> {
> u8 current_bit_offset = 0;
> size_t src_byte_offset = 0;
> size_t dst_byte_offset = 0;
> 
> if (dst == NULL) {
> (*dst_size) = ecryptfs_max_decoded_size(src_size);
> goto out;
> }
> while (src_byte_offset < src_size) {
> unsigned char src_byte =
> filename_rev_map[(int)src[src_byte_offset]];
> 
> switch (current_bit_offset) {
> case 0:
> dst[dst_byte_offset] = (src_byte << 2);
> current_bit_offset = 6;
> break;
> case 6:
> dst[dst_byte_offset++] |= (src_byte >> 4);
> dst[dst_byte_offset] = ((src_byte & 0xF)
> << 4);
> current_bit_offset = 4;
> break;
> case 4:
> dst[dst_byte_offset++] |= (src_byte >> 2);
> dst[dst_byte_offset] = (src_byte << 6);
> current_bit_offset = 2;
> break;
> case 2:
> dst[dst_byte_offset++] |= (src_byte);
> dst[dst_byte_offset] = 0;
> current_bit_offset = 0;
> break;
> }
> src_byte_offset++;
> }
> (*dst_size) = dst_byte_offset;
> out:
> return;
> }
> For src_size a multiple of 4 (which I assume is usually the case), the
> line "dst[dst_byte_offset] = 0;" writes at dst[dst_size], where dst_size
> is the value reported by ecryptfs_max_decoded_size. The caller mallocs
> exactly dst_size bytes for dst, so the write is outside the allocated
> space. Depending on the allocator, this can be exploited to overwrite the
> next object's first byte. I didn't exactly understand whether dst should be a
> 0-terminated string, so we either need to skip writing the trailing
> zero or allocate 1 more byte.

Thanks for reporting this!

Based on a quick read, it looks like the only consumer of the decoded
memory is ecryptfs_parse_tag_70_packet, and it uses the passed size for
its parsing. It seems like dropping this line from
ecryptfs_decode_from_filename() would solve the issue:

dst[dst_byte_offset] = 0;

The state machine goes from case 2 back to case 0. If we're at the end,
dst_byte_offset is the correct value, and the rest of the buffer is
untouched. If we have more to go, case 0 will strictly initialize the
contents of dst[dst_byte_offset] since it uses "=" not "|=".
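
The reasoning above can be checked with a small userspace model. This is a
sketch only, not the ecryptfs code: rev_map() is a hypothetical stand-in for
filename_rev_map[] (which is not shown in this thread), and decode_extent()
is an illustrative name. It runs the same state machine with the trailing
zero store dropped and reports how far into dst it wrote, so the extent can
be compared with the size the caller would allocate.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the quoted ecryptfs_max_decoded_size(). */
static size_t max_decoded_size(size_t encoded_size)
{
	return ((encoded_size + 1) * 3) / 4;
}

/* Hypothetical stand-in for filename_rev_map[]: keep the low 6 bits. */
static unsigned char rev_map(unsigned char c)
{
	return c & 0x3F;
}

/*
 * Model of the decode loop with "dst[dst_byte_offset] = 0;" removed
 * from case 2.  Stores the decoded length in *dst_size and returns the
 * highest dst index written plus one.
 */
static size_t decode_extent(unsigned char *dst, size_t *dst_size,
			    const unsigned char *src, size_t src_size)
{
	unsigned int current_bit_offset = 0;
	size_t s = 0, d = 0, extent = 0;

	while (s < src_size) {
		unsigned char b = rev_map(src[s]);

		switch (current_bit_offset) {
		case 0:
			dst[d] = (unsigned char)(b << 2);
			extent = d + 1;
			current_bit_offset = 6;
			break;
		case 6:
			dst[d++] |= (unsigned char)(b >> 4);
			dst[d] = (unsigned char)((b & 0xF) << 4);
			extent = d + 1;
			current_bit_offset = 4;
			break;
		case 4:
			dst[d++] |= (unsigned char)(b >> 2);
			dst[d] = (unsigned char)(b << 6);
			extent = d + 1;
			current_bit_offset = 2;
			break;
		case 2:
			dst[d++] |= b;
			/* No store to dst[d] here: that was the stray write. */
			current_bit_offset = 0;
			break;
		}
		s++;
	}
	*dst_size = d;
	return extent;
}
```

In this model the extent stays within max_decoded_size(src_size) for every
input length, which is consistent with dropping the line rather than growing
the allocation.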

Thoughts?

-Kees

-- 
Kees Cook
Chrome OS Security


Re: [PATCH] DTS: ARM: OMAP3-N900: Add n900-battery support

2014-11-21 Thread Tony Lindgren
* Sebastian Reichel  [141114 18:10]:
> This adds support for the N900's battery to the
> Nokia N900 DTS file.

Applying this too thanks.

Tony


Re: [PATCH] ARM: DTS: OMAP3-N900: add si4713 support

2014-11-21 Thread Tony Lindgren
* Sebastian Reichel  [141114 17:49]:
> Add si4713 node to the N900 device tree file.

Applying into omap-for-v3.19/dt-v2 thanks.

Tony


Re: [PATCH] Btrfs: ctree: reduce args where only fs_info used

2014-11-21 Thread David Sterba
On Sat, Nov 22, 2014 at 01:03:32AM +0900, Daniel Dressler wrote:
> No problem, I'll redo everything so it is one function per patch. Now
> fair warning: there are about 102 functions to cleanup. I was a bit
> worried that many patches would cause too much maintainer overhead but
> it is no problem for me.

Yeah, I'm aware that it's all over the sources. I'd say send no more
than 30 patches in a burst first and see how it'd work.

> Only a few functions have dependencies on
> other functions needing cleanup. Thus there will be some small patch
> series for those function sets. A big benefit of one function one
> patch is that extent-io.c will no longer be a 34 function monster
> patch.
> 
> Is there any rate limiting I should be doing? I don't want to flood
> the list with burst of dozen plus patches, or is that an okay volume?

Should be fine.


Re: [PATCH V4 1/8] elf: Add new PowerPC specifc core note sections

2014-11-21 Thread Andrew Morton
On Tue, 11 Nov 2014 10:56:30 +0530 Anshuman Khandual wrote:

> This patch adds four new core note sections for PowerPC transactional
> memory and one core note section for general miscellaneous debug registers.
> This addition of new elf core note sections extends the existing elf ABI
> without affecting it in any manner.
> 
> Signed-off-by: Anshuman Khandual 
> ---
>  include/uapi/linux/elf.h | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/include/uapi/linux/elf.h b/include/uapi/linux/elf.h
> index ea9bf25..2260fc0 100644
> --- a/include/uapi/linux/elf.h
> +++ b/include/uapi/linux/elf.h
> @@ -379,6 +379,11 @@ typedef struct elf64_shdr {
>  #define NT_PPC_VMX   0x100   /* PowerPC Altivec/VMX registers */
>  #define NT_PPC_SPE   0x101   /* PowerPC SPE/EVR registers */
>  #define NT_PPC_VSX   0x102   /* PowerPC VSX registers */
> +#define NT_PPC_TM_SPR   0x103   /* PowerPC TM special registers */
> +#define NT_PPC_TM_CGPR   0x104   /* PowerpC TM checkpointed GPR */
> +#define NT_PPC_TM_CFPR   0x105   /* PowerPC TM checkpointed FPR */
> +#define NT_PPC_TM_CVMX   0x106   /* PowerPC TM checkpointed VMX */
> +#define NT_PPC_MISC  0x107   /* PowerPC miscellaneous registers */
>  #define NT_386_TLS   0x200   /* i386 TLS slots (struct user_desc) */
>  #define NT_386_IOPERM   0x201   /* x86 io permission bitmap (1=deny) */
>  #define NT_X86_XSTATE   0x202   /* x86 extended state using xsave */

ack from me, if that was at all expected.

Please cc Shuah Khan  on the tools/testing/selftests
changes.



Re: [PATCH] Btrfs: disk-io: replace root args iff only fs_info used

2014-11-21 Thread David Sterba
On Sat, Nov 22, 2014 at 01:37:10AM +0900, Daniel Dressler wrote:
> What would a cover letter be like? Would that be a separate email to
> the list, or maybe the first email in a patch series?

It's a separate mail that does not carry any diff but an overview of
the following patches. The patches are threaded under that mail. This is
what the git command does for you:

  git format-patch -o output --thread --cover-letter from..to

and in the directory 'output' you'll find the cover letter plus patches.
The cover contains some stub and should be edited.  Then send them via
'git send-email'.

> Sorry I've twice looked for the integration repo. I found some that
> look like it could be but those had older commits. Could you direct me
> to the exact branch I'd love to work against it. These patches were
> done against linux-next.

The integration is in Chris' git, but the branch may not be the most
recent compared to Linus' tree or the pending for-linus branches. This
depends on the phase of the development cycle or the stability of the
patches in the integration branch as it's supposed to be base of the
next pull.

What you did is fine under current conditions. If the integration is
made public you can check if your patches are merged or not and then
refresh the patch series eventually.

> I think small one function patches might be best. I have the codebase
> mapped out and each file's functions-to-be-cleaned count varies
> wildly. If I did batch files together and split large files apart
> there would be no rhyme or reason for the groupings. With single
> function patches it is very clear what changes are justified since
> they should only occur in the affected function or in a call-site.
> With multiple functions the call-site changes get mixed up and it
> would be harder to review.

Up to you.


Re: [PATCH v13 5/9] arm: omap1: Migrate debug_ll macros to use 8250.S

2014-11-21 Thread Tony Lindgren
* Daniel Thompson  [141117 06:53]:
> The omap1's debug-macro.S is similar to the generic 8250 code. Compared to
> the 8250 code the omap1 macro automatically determines what UART to use
> based on breadcrumbs left by the bootloader and automatically copes with
> the eccentric register layout on OMAP7XX.
> 
> This patch drops both these features and relies instead on the generic
> 8250 macros:
> 
> 1. Dropping support for the bootloader breadcrumbs is identical to the
>way the migration was handled for OMAP2 (see 808b7e07464d...).
> 
> 2. Support for OMAP7XX still exists but it must be configured by hand
>(DEBUG_OMAP7XXUART1/2/3) rather than handled at runtime.
> 
> Signed-off-by: Daniel Thompson 
> Cc: Russell King 
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: Tony Lindgren 
> Cc: Arnd Bergmann 
> Cc: linux-o...@vger.kernel.org
> Tested-by: Aaro Koskinen 

Looks OK to me, probably best that these are queued all together so:

Acked-by: Tony Lindgren 


Re: [PATCH v2 7/7] mm/page_owner: correct owner information for early allocated pages

2014-11-21 Thread Andrew Morton
On Fri, 21 Nov 2014 17:14:06 +0900 Joonsoo Kim  wrote:

> Extended memory to store page owner information is initialized some time
> after the page allocator starts. Until initialization, many pages
> can be allocated and they have no owner information. This makes debugging
> using page owner harder, so some fixup will be helpful.
> 
> This patch fixes up this situation by setting fake owner information
> immediately after page extension is initialized. The information doesn't
> tell the right owner, but, at least, it can tell whether a page is
> allocated or not, more correctly.
> 
> In my testing, this patch catches 13343 early allocated pages, although
> they are mostly allocated from the page extension feature. Anyway, after that,
> there is no page left that is allocated and has no page owner flag.

We really should have a Documentation/vm/page_owner.txt which explains
all this stuff, provides examples, etc.


Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Paul E. McKenney
On Fri, Nov 21, 2014 at 03:06:48PM -0800, Andy Lutomirski wrote:
> On Fri, Nov 21, 2014 at 2:55 PM, Paul E. McKenney
>  wrote:
> > On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
> >> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
> >>  wrote:
> >> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
> >> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski wrote:
> >> >> > We currently pretend that IST context is like standard exception
> >> >> > context, but this is incorrect.  IST entries from userspace are like
> >> >> > standard exceptions except that they use per-cpu stacks, so they are
> >> >> > atomic.  IST entries from kernel space are like NMIs from RCU's
> >> >> > perspective -- they are not quiescent states even if they
> >> >> > interrupted the kernel during a quiescent state.
> >> >> >
> >> >> > Add and use ist_enter and ist_exit to track IST context.  Even
> >> >> > though x86_32 has no IST stacks, we track these interrupts the same
> >> >> > way.
> >> >>
> >> >> I should add:
> >> >>
> >> >> I have no idea why RCU read-side critical sections are safe inside
> >> >> __do_page_fault today.  It's guarded by exception_enter(), but that
> >> >> doesn't do anything if context tracking is off, and context tracking
> >> >> is usually off. What am I missing here?
> >> >
> >> > Ah!  There are three cases:
> >> >
> >> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU is
> >> > still paying attention to CPUs running in both userspace and in
> >> > the kernel.  So if a page fault happens, RCU will be set up to
> >> > notice any RCU read-side critical sections.
> >> >
> >> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
> >> > might well be ignoring userspace execution: NO_HZ_FULL and
> >> > all that.  However, as you pointed out, in this case the
> >> > context-tracking code lets RCU know that we have entered the
> >> > kernel, which means that RCU will again be paying attention to
> >> > RCU read-side critical sections.
> >> >
> >> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
> >> > if we take a page fault when context tracking is off, life
> >> > will be hard.  But the kernel is not supposed to take page
> >> > faults in the idle loop, so this is not a problem.
> >>
> >> I guess so, as long as there are really no page faults in the idle loop.
> >
> > As far as I know, there are not.  If there are, someone needs to let
> > me know!  ;-)
> >
> >> There are, however, machine checks in the idle loop, and maybe kprobes
> >> (haven't checked), so I think this patch might fix real bugs.
> >
> > If you can get ISTs from the idle loop, then the patch is needed.
> >
> >> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
> >> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
> >> > nested calls.
> >>
> >> Yes, and vice versa.  That code looked like it handled nesting
> >> correctly, but I wasn't entirely sure.
> >
> > It currently does not, please see below patch.  Are you able to test
> > nesting?  It would be really cool if you could do so -- I have no
> > way to test this patch.
> 
> I can try.  It's sort of easy -- I'll put an int3 into do_nmi and add
> a fixup to avoid crashing.
> 
> What should I look for?  Should I try to force full nohz on and assert
> something?  I don't really know how to make full nohz work.

You should look for the WARN_ON_ONCE() calls in rcu_nmi_enter() and
rcu_nmi_exit() to fire.

Thanx, Paul

> >> Also, just to make sure: are we okay if rcu_nmi_enter() is called
> >> before exception_enter if context tracking is on and we came directly
> >> from userspace?
> >
> > If I understand correctly, this will result in context tracking invoking
> > rcu_user_enter(), which will result in the rcu_dynticks counter having an
> > odd value.  In that case, rcu_nmi_enter() will notice that RCU is already
> > paying attention to this CPU via its check of (atomic_read(&rdtp->dynticks)
> > & 0x1), and will thus just return.  The matching rcu_nmi_exit() will
> > notice that the nesting count is zero, and will also just return.
> >
> > Thus, everything works in that case.
> >
> > In contrast, if rcu_nmi_enter() was invoked from the idle loop, it
> > would see that RCU is not paying attention to this CPU and that the
> > NMI nesting depth (which rcu_nmi_enter() increments) used to be zero.
> > It would then atomically increment rdtp->dynticks, forcing RCU to start
> > paying attention to this CPU.  The matching rcu_nmi_exit() will see
> > that the nesting count was non-zero, but became zero when decremented.
> > This will cause rcu_nmi_exit() to atomically increment rdtp->dynticks,
> > which will tell RCU to stop paying attention to this CPU.
> >
> > Thanx, 
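
The two cases described in the quoted text can be walked through with a toy
model. This is a sketch under stated assumptions, not the kernel code and not
the patch referred to above (which does not appear in this excerpt): the
struct and function names are hypothetical, the counters are plain ints
rather than atomics, and the nesting rule (only the outermost entry and exit
touch dynticks) is one assumed way of handling nested calls.

```c
#include <assert.h>

/* Toy per-CPU state: an odd dynticks value means "RCU is watching". */
struct rcu_model {
	int dynticks;
	int nmi_nesting;
};

static void model_nmi_enter(struct rcu_model *m)
{
	/* Outermost entry while RCU is already watching: just return. */
	if (m->nmi_nesting == 0 && (m->dynticks & 0x1))
		return;
	/* Only the outermost entry from an unwatched CPU flips dynticks. */
	if (m->nmi_nesting++ == 0)
		m->dynticks++;
}

static void model_nmi_exit(struct rcu_model *m)
{
	/* Matching exit for an entry that did nothing. */
	if (m->nmi_nesting == 0)
		return;
	/* Only the outermost exit tells RCU to stop watching. */
	if (--m->nmi_nesting == 0)
		m->dynticks++;
}
```

Starting from an even dynticks value (the idle case), a nested enter/enter/
exit/exit sequence keeps dynticks odd until the final exit. Starting from an
odd value (the non-idle case), both calls return without changing anything.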

Re: [PATCH v2 6/7] mm/page_owner: keep track of page owners

2014-11-21 Thread Andrew Morton
On Fri, 21 Nov 2014 17:14:05 +0900 Joonsoo Kim  wrote:

> This is the page owner tracking code, which was introduced
> long ago. It is resident in Andrew's tree, though nobody
> tried to upstream it, so it remains as is. Our company uses this feature
> actively to debug memory leaks or to find memory hoggers, so
> I decided to upstream this feature.
> 
> This functionality helps us to know who allocates the page.
> When allocating a page, we store some information about the
> allocation in extra memory. Later, if we need to know the
> status of all pages, we can get and analyze it from this stored
> information.
> 
> In previous version of this feature, extra memory is statically defined
> in struct page, but, in this version, extra memory is allocated outside
> of struct page. It enables us to turn on/off this feature at boottime
> without considerable memory waste.
> 
> Although we already have tracepoints for tracing page allocation/free,
> using them to analyze page owner is rather complex. We need to enlarge
> the trace buffer to prevent overlapping until the userspace program
> is launched. And the launched program continually dumps out the trace buffer
> for later analysis, which is more likely to change system behaviour
> than just keeping the data in memory, so it is bad for debugging.
> 
> Moreover, we can use page_owner feature further for various purposes.
> For example, we can use it for fragmentation statistics implemented in
> this patch. And, I also plan to implement some CMA failure debugging
> feature using this interface.
> 
> I'd like to give credit to all developers who contributed to this feature,
> but it's not easy because I don't know the exact history. Sorry about that.
> Below are the people who have "Signed-off-by" in the patches in Andrew's tree.
> 
> ...
>
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -884,6 +884,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>   MTRR settings.  This parameter disables that behavior,
>   possibly causing your machine to run very slowly.
>  
> + disable_page_owner
> + [KNL] Disable to store the information who requests
> + the page.

How about "Disable storage of the information about who allocated each
page".

It seems odd that we have a disable flag.  Wouldn't it be less
surprising to disable it by default and only enable if the boot option
is provided?

What is the overhead of page_owner if it is runtime-disabled, btw? 
Will it be feasible for lots of people to just leave it enabled in
config and to only turn it on when they want to use it?  That would be
nice.  Please add a paragraph on this point to the changelog and the
yet-to-be-written documentation.




Re: [PATCH v2 5/7] stacktrace: introduce snprint_stack_trace for buffer output

2014-11-21 Thread Andrew Morton
On Fri, 21 Nov 2014 17:14:04 +0900 Joonsoo Kim  wrote:

> Current stacktrace only has a function for console output.
> page_owner, which will be introduced in a following patch, needs to print
> the output of stacktrace into a buffer for our own output format,
> so a new function, snprint_stack_trace(), is needed.
> 
> ...
>
> --- a/include/linux/stacktrace.h
> +++ b/include/linux/stacktrace.h
> @@ -20,6 +20,8 @@ extern void save_stack_trace_tsk(struct task_struct *tsk,
>   struct stack_trace *trace);
>  
>  extern void print_stack_trace(struct stack_trace *trace, int spaces);
> +extern int  snprint_stack_trace(char *buf, int buf_len,
> + struct stack_trace *trace, int spaces);
>  
>  #ifdef CONFIG_USER_STACKTRACE_SUPPORT
>  extern void save_stack_trace_user(struct stack_trace *trace);
> @@ -32,6 +34,7 @@ extern void save_stack_trace_user(struct stack_trace *trace);
>  # define save_stack_trace_tsk(tsk, trace)do { } while (0)
>  # define save_stack_trace_user(trace)do { } while (0)
>  # define print_stack_trace(trace, spaces)do { } while (0)
> +# define snprint_stack_trace(buf, len, trace, spaces)do { } while (0)

Doing this with macros instead of C functions is pretty crappy - it
defeats typechecking and can lead to unused-var warnings when the
feature is disabled.

Fixing this might not be practical if struct stack_trace isn't
available, dunno.
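
As a sketch of the alternative suggested here: a static inline stub keeps
argument type checking and marks its arguments as used even when the feature
is compiled out. The name snprint_stack_trace_stub and the size_t parameter
are assumptions for illustration, not the patch's code. A forward declaration
of struct stack_trace is enough because the stub only takes a pointer, which
may address the availability concern.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration only: a pointer parameter needs no layout. */
struct stack_trace;

/*
 * Hypothetical stub for the case where stack tracing is disabled.
 * Unlike "do { } while (0)", it can be used as an expression and
 * its arguments are still type-checked by the compiler.
 */
static inline int snprint_stack_trace_stub(char *buf, size_t size,
					   struct stack_trace *trace,
					   int spaces)
{
	(void)buf;
	(void)size;
	(void)trace;
	(void)spaces;
	return 0;	/* nothing was written */
}
```

A caller written as "ret = snprint_stack_trace(...)" would not compile
against the "do { } while (0)" macro, since that is a statement rather than
an expression, whereas it compiles against a stub of this shape.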

> --- a/kernel/stacktrace.c
> +++ b/kernel/stacktrace.c
> @@ -25,6 +25,30 @@ void print_stack_trace(struct stack_trace *trace, int spaces)
>  }
>  EXPORT_SYMBOL_GPL(print_stack_trace);
>  
> +int snprint_stack_trace(char *buf, int buf_len, struct stack_trace *trace,
> + int spaces)
> +{
> + int i, printed;
> + unsigned long ip;
> + int ret = 0;
> +
> + if (WARN_ON(!trace->entries))
> + return 0;
> +
> + for (i = 0; i < trace->nr_entries && buf_len; i++) {
> + ip = trace->entries[i];
> + printed = snprintf(buf, buf_len, "%*c[<%p>] %pS\n",
> + 1 + spaces, ' ', (void *) ip, (void *) ip);
> +
> + buf_len -= printed;
> + ret += printed;
> + buf += printed;
> + }
> +
> + return ret;
> +}

I'm not liking this much.  The behaviour when the output buffer is too
small is scary.  snprintf() will return "the number of characters which
would be generated for the given input", so local variable `buf_len'
will go negative and we pass a negative int into snprintf()'s `size_t
size'.  snprintf() says "goody, lots and lots of buffer!" and your
machine crashes.

buf_len should be a size_t and snprint_stack_trace() will need to be
changed to handle this.
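
A minimal userspace sketch of a loop that cannot underflow, assuming standard
C99 snprintf() semantics. The helper name format_entries and the simplified
format string are hypothetical; the kernel's %pS is not available in
userspace. The remaining size is kept as a size_t, and the loop stops as soon
as snprintf() reports truncation instead of subtracting past zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: append one line per entry to buf.
 * Returns the total reported by snprintf(), which for a truncated
 * entry counts the characters that would have been written.
 */
static int format_entries(char *buf, size_t size,
			  const unsigned long *entries, unsigned int nr,
			  int spaces)
{
	int total = 0;
	unsigned int i;

	for (i = 0; i < nr; i++) {
		int printed = snprintf(buf, size, "%*c[<%#lx>]\n",
				       1 + spaces, ' ', entries[i]);

		if (printed < 0)
			return printed;		/* encoding error */
		total += printed;
		if ((size_t)printed >= size)
			break;			/* truncated: buffer is full */
		buf += printed;
		size -= (size_t)printed;	/* cannot wrap: printed < size */
	}
	return total;
}
```

In kernel code, scnprintf() returns the number of characters actually
written, which is another way to keep the remaining length from going
negative.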


Re: [PATCH] mfd: twl4030-power: Fix poweroff with PM configuration enabled

2014-11-21 Thread Tony Lindgren
* NeilBrown  [141118 19:45]:
> On Wed, 12 Nov 2014 16:31:54 -0600 Felipe Balbi  wrote:
> > 
> > this is actually what the USB Battery Charging spec requires us to
> > implement. If Linux is doing differently, it's a bug on Linux which
> > should be fixed :-)
> > 
> > No host is allowed to source more than one unit load (100mA in LS/FS/HS,
> > 150mA in SS) until the device is fully enumerated. Hosts are also
> > required to drop max current budget to 8mA (IIRC) if the device doesn't
> > enumerate for however many minutes (I guess it was a pretty long
> > threshold, something like half an hour or so. My memory fails me right
> > now).
> > 
> 
> I think the twl4030 driver does do the "right" thing unless the "allow_usb"
> module parameter is set, in which case it enables charging at a higher rate
> which is 600mA (default value of BCIIREF1).
> 
> It would be nice if the driver could check if a charger was plugged in and
> act accordingly.
> The charger I have for my openmoko is identified by a 47K resistor between ID
> and ground.  The twl4030 can detect that easily enough, but it isn't very
> standard.

Sounds doable to me, feel free to patch it up since you guys are using
the twl4030 charger :)
 
> The standard is of course to have D+ and D- shorted, but I don't know if the
> twl4030 can detect that?  If it can, then getting some very early code to
> check for the short (or the 47k resistor) and quickly enabling charging might
> be a sufficient solution.

I guess. Note that there's also the USB BC1.2 spec that is more
complicated than having the data lines shorted.

Regards,

Tony


Re: [PATCH v2 1/7] mm/page_ext: resurrect struct page extending code for debugging

2014-11-21 Thread Andrew Morton
On Fri, 21 Nov 2014 17:14:00 +0900 Joonsoo Kim  wrote:

> When we debug something, we'd like to attach some information to
> every page. For this purpose, we sometimes modify struct page itself.
> But this has drawbacks. First, it requires a recompile. This makes us
> hesitate to use the powerful debug feature, so the development process is
> slowed down. Second, sometimes it is impossible to rebuild the kernel
> due to third-party module dependencies. Third, system behaviour would be
> largely different after the recompile, because it changes the size of struct
> page greatly and this structure is accessed by every part of the kernel.
> Keeping it as it is makes it easier to reproduce the erroneous situation.
> 
> This feature is intended to overcome the above-mentioned problems. It
> allocates memory for extended per-page data in a separate place rather than
> in struct page itself. This memory can be accessed through the accessor
> functions provided by this code. During the boot process, it checks whether
> allocating a huge chunk of memory is needed or not. If not, it avoids
> allocating memory at all. With this advantage, we can include this feature
> in the kernel by default, avoid rebuilds, and solve the related problems.
> 
> Until now, memcg used this technique. But memcg has now decided to embed
> its variable in struct page itself, and its code to extend struct page
> has been removed. I'd like to use this code to develop debug features,
> so this patch resurrects it.
> 
> To help these things work well, this patch introduces two callbacks
> for clients. One is the need callback, which is mandatory if the user wants
> to avoid useless memory allocation at boot time. The other is the optional
> init callback, which is used to do proper initialization after memory
> is allocated. A detailed explanation of the purpose of these functions is
> in the code comments. Please refer to them.
> 
> Everything else is the same as the previous extension code in memcg.
>
> ...
>
> +static bool __init invoke_need_callbacks(void)
> +{
> + int i;
> + int entries = ARRAY_SIZE(page_ext_ops);
> +
> + for (i = 0; i < entries; i++) {
> + if (page_ext_ops[i]->need && page_ext_ops[i]->need())
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static void __init invoke_init_callbacks(void)
> +{
> + int i;
> + int entries = sizeof(page_ext_ops) / sizeof(page_ext_ops[0]);

ARRAY_SIZE()

> + for (i = 0; i < entries; i++) {
> + if (page_ext_ops[i]->init)
> + page_ext_ops[i]->init();
> + }
> +}
> +
>
> ...
>
> +void __init page_ext_init_flatmem(void)
> +{
> +
> + int nid, fail;
> +
> + if (!invoke_need_callbacks())
> + return;
> +
> + for_each_online_node(nid)  {
> + fail = alloc_node_page_ext(nid);
> + if (fail)
> + goto fail;
> + }
> + pr_info("allocated %ld bytes of page_ext\n", total_usage);
> + invoke_init_callbacks();
> + return;
> +
> +fail:
> + pr_crit("allocation of page_ext failed.\n");
> + panic("Out of memory");

Did we really need to panic the machine?  The situation should be
pretty easily recoverable by disabling the clients.  I guess it's OK as
long as page_ext is being used for kernel developer debug things.

> +}
> +
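
For readers following along, here is a small userspace sketch of the need/init callback scheme discussed in this review. Every name in it (ext_ops, ext_clients, ext_setup and the two example clients) is made up for illustration, and ARRAY_SIZE is a simplified stand-in for the kernel macro, which also type-checks its argument.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's ARRAY_SIZE(). */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical per-client callback table, modelled on the patch. */
struct ext_ops {
	bool (*need)(void);	/* does this client want per-page data? */
	void (*init)(void);	/* optional: run once the memory exists */
};

static int init_calls;

static bool debug_need(void) { return true; }
static void debug_init(void) { init_calls++; }
static bool idle_need(void) { return false; }

static const struct ext_ops debug_client = { .need = debug_need, .init = debug_init };
static const struct ext_ops idle_client = { .need = idle_need, .init = NULL };
static const struct ext_ops *ext_clients[] = { &debug_client, &idle_client };

/* True as soon as one client asks for the extra memory. */
static bool any_client_needs_memory(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(ext_clients); i++)
		if (ext_clients[i]->need && ext_clients[i]->need())
			return true;
	return false;
}

static void run_init_callbacks(void)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(ext_clients); i++)
		if (ext_clients[i]->init)
			ext_clients[i]->init();
}

/* Returns true if the (imaginary) allocation and init were performed. */
static bool ext_setup(void)
{
	/* The callback check must be a call; a bare function name is never NULL. */
	if (!any_client_needs_memory())
		return false;
	run_init_callbacks();
	return true;
}
```

Returning false from ext_setup() rather than panicking is the kind of "disable the clients" fallback the review hints at; whether that is practical for the real page_ext code is not something this sketch can show.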

We'll need this to fix the build.  I'll queue it up.


From: Andrew Morton 
Subject: include/linux/kmemleak.h: needs slab.h

include/linux/kmemleak.h: In function 'kmemleak_alloc_recursive':
include/linux/kmemleak.h:43: error: 'SLAB_NOLEAKTRACE' undeclared (first use in this function)

--- a/include/linux/kmemleak.h~include-linux-kmemleakh-needs-slabh
+++ a/include/linux/kmemleak.h
@@ -21,6 +21,8 @@
 #ifndef __KMEMLEAK_H
 #define __KMEMLEAK_H
 
+#include <linux/slab.h>
+
 #ifdef CONFIG_DEBUG_KMEMLEAK
 
 extern void kmemleak_init(void) __ref;



And here are a couple of tweaks for this patch:

From: Andrew Morton 
Subject: mm-page_ext-resurrect-struct-page-extending-code-for-debugging-fix

use ARRAY_SIZE, clean up 80-col tricks

--- a/mm/page_ext.c~mm-page_ext-resurrect-struct-page-extending-code-for-debugging-fix
+++ a/mm/page_ext.c
@@ -71,7 +71,7 @@ static bool __init invoke_need_callbacks
 static void __init invoke_init_callbacks(void)
 {
int i;
-   int entries = sizeof(page_ext_ops) / sizeof(page_ext_ops[0]);
+   int entries = ARRAY_SIZE(page_ext_ops);
 
for (i = 0; i < entries; i++) {
if (page_ext_ops[i]->init)
@@ -81,7 +81,6 @@ static void __init invoke_init_callbacks
 
 #if !defined(CONFIG_SPARSEMEM)
 
-
 void __meminit pgdat_page_ext_init(struct pglist_data *pgdat)
 {
pgdat->node_page_ext = NULL;
@@ -232,8 +231,9 @@ static void free_page_ext(void *addr)
vfree(addr);
} else {
struct page *page = virt_to_page(addr);
-   size_t table_size =
-   sizeof(struct page_ext) * PAGES_PER_SECTION;
+   size_t 

Re: [PATCH v2 2/2] ARM: dts: Add devicetree for NovaTech OrionLXm

2014-11-21 Thread Tony Lindgren
* Felipe Balbi  [141117 11:10]:
> On Mon, Nov 17, 2014 at 01:02:35PM -0600, George McCollister wrote:
> > This adds the NovaTech OrionLXm which is based on the AM335x SoC
> > http://www.novatechweb.com/substation-automation/orionlxm/
> > 
> > RAM: 512MiB
> > Flash: 4GB eMMC
> > Ethernet PHYs: 2x Micrel KSZ8041FTLI
> > USB ports are used internally by the expansion cards.
> > Internal micro SD slot is available.
> > 
> > Signed-off-by: George McCollister 
> 
> this looks much better to my eyes:
> 
> Reviewed-by: Felipe Balbi 

Applying into omap-for-v3.19/dt-v2 thanks.

Tony


Re: frequent lockups in 3.18rc4

2014-11-21 Thread Linus Torvalds
On Fri, Nov 21, 2014 at 3:03 PM, Andy Lutomirski  wrote:
> On Fri, Nov 21, 2014 at 2:55 PM, Linus Torvalds
>  wrote:
>>
>> Anyway, here's an actual patch. As usual, it has seen absolutely no
>> actual testing,

.. ok, it boots and works fine as far as I can tell on x86-64 with no
paravirt anywhere.

> At the risk of going deeper down the rabbit hole, I grepped for
> pgd_list.  I found:

Ugh.

> __set_pmd_pte in pageattr.c.  It appears to be completely incorrect.
> Unless I've misunderstood, other than the very first line, it will
> either do nothing at all or crash when it falls off the end of the
> page tables that it's pointlessly trying to update.

I think you found a rat's nest.

I can't make heads nor tails of the logic. The !SHARED_KERNEL_PMD test
doesn't seem very sensible, since that's also the conditional for
adding anything to the list in the first place.

So I agree that the code doesn't make much sense. Although maybe it's
there just because that way the loop goes away at compile-time under
most circumstances. So maybe even that part does make sense.

And the "walk down to the pmd level" part actually looks ok. Remember:
this is on x86-32 only, and you have two cases: non-PAE where the
pmd/pud offset thing does nothing at all, and it just ends up
converting a "pgd_t *" to a "pmd_t *".  And for PAE, the top pud level
always exists, and the pmd is folded, so despite what looks like
walking two levels, it really just walks the one level - the
force-allocated PGD entries.

So it won't "fall off the end of the page tables" like you imply. It
will just walk to the pmd level. And there it will populate all the
page tables with the same pmd.

So I think it works.

   Linus


Re: [PATCH 01/12] time: Rename udelay_test.c to test_udelay.c

2014-11-21 Thread Kees Cook
On Fri, Nov 21, 2014 at 11:44 AM, John Stultz  wrote:
> Kees requested that this test module be renamed for consistency sake,
> so this patch renames the udelay_test.c file (recently added to
> tip/timers/core for 3.17) to test_udelay.c
>
> Cc: Kees Cook 
> Cc: Greg KH 
> Cc: Stephen Rothwell 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Linux-Next 
> Cc: David Riley 
> Signed-off-by: John Stultz 

Reviewed-by: Kees Cook 

Yes please! :)

-Kees

> ---
>  kernel/time/Makefile  |   2 +-
>  kernel/time/test_udelay.c | 168 ++
>  kernel/time/udelay_test.c | 168 --
>  3 files changed, 169 insertions(+), 169 deletions(-)
>  create mode 100644 kernel/time/test_udelay.c
>  delete mode 100644 kernel/time/udelay_test.c
>
> diff --git a/kernel/time/Makefile b/kernel/time/Makefile
> index 7347426..f622cf2 100644
> --- a/kernel/time/Makefile
> +++ b/kernel/time/Makefile
> @@ -13,7 +13,7 @@ obj-$(CONFIG_TICK_ONESHOT)+= tick-oneshot.o
>  obj-$(CONFIG_TICK_ONESHOT) += tick-sched.o
>  obj-$(CONFIG_TIMER_STATS)  += timer_stats.o
>  obj-$(CONFIG_DEBUG_FS) += timekeeping_debug.o
> -obj-$(CONFIG_TEST_UDELAY)  += udelay_test.o
> +obj-$(CONFIG_TEST_UDELAY)  += test_udelay.o
>
>  $(obj)/time.o: $(obj)/timeconst.h
>
> diff --git a/kernel/time/test_udelay.c b/kernel/time/test_udelay.c
> new file mode 100644
> index 000..e622ba3
> --- /dev/null
> +++ b/kernel/time/test_udelay.c
> @@ -0,0 +1,168 @@
> +/*
> + * udelay() test kernel module
> + *
> + * Test is executed by writing and reading to /sys/kernel/debug/udelay_test
> + * Tests are configured by writing: USECS ITERATIONS
> + * Tests are executed by reading from the same file.
> + * Specifying usecs of 0 or negative values will run multiple tests.
> + *
> + * Copyright (C) 2014 Google, Inc.
> + *
> + * This software is licensed under the terms of the GNU General Public
> + * License version 2, as published by the Free Software Foundation, and
> + * may be copied, distributed, and modified under those terms.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + */
> +
> +#include <linux/debugfs.h>
> +#include <linux/delay.h>
> +#include <linux/ktime.h>
> +#include <linux/module.h>
> +#include <linux/uaccess.h>
> +
> +#define DEFAULT_ITERATIONS 100
> +
> +#define DEBUGFS_FILENAME "udelay_test"
> +
> +static DEFINE_MUTEX(udelay_test_lock);
> +static struct dentry *udelay_test_debugfs_file;
> +static int udelay_test_usecs;
> +static int udelay_test_iterations = DEFAULT_ITERATIONS;
> +
> +static int udelay_test_single(struct seq_file *s, int usecs, uint32_t iters)
> +{
> +   int min = 0, max = 0, fail_count = 0;
> +   uint64_t sum = 0;
> +   uint64_t avg;
> +   int i;
> +   /* Allow udelay to be up to 0.5% fast */
> +   int allowed_error_ns = usecs * 5;
> +
> +   for (i = 0; i < iters; ++i) {
> +   struct timespec ts1, ts2;
> +   int time_passed;
> +
> +   ktime_get_ts(&ts1);
> +   udelay(usecs);
> +   ktime_get_ts(&ts2);
> +   time_passed = timespec_to_ns(&ts2) - timespec_to_ns(&ts1);
> +
> +   if (i == 0 || time_passed < min)
> +   min = time_passed;
> +   if (i == 0 || time_passed > max)
> +   max = time_passed;
> +   if ((time_passed + allowed_error_ns) / 1000 < usecs)
> +   ++fail_count;
> +   WARN_ON(time_passed < 0);
> +   sum += time_passed;
> +   }
> +
> +   avg = sum;
> +   do_div(avg, iters);
> +   seq_printf(s, "%d usecs x %d: exp=%d allowed=%d min=%d avg=%lld max=%d",
> +   usecs, iters, usecs * 1000,
> +   (usecs * 1000) - allowed_error_ns, min, avg, max);
> +   if (fail_count)
> +   seq_printf(s, " FAIL=%d", fail_count);
> +   seq_puts(s, "\n");
> +
> +   return 0;
> +}
> +
> +static int udelay_test_show(struct seq_file *s, void *v)
> +{
> +   int usecs;
> +   int iters;
> +   int ret = 0;
> +
> +   mutex_lock(&udelay_test_lock);
> +   usecs = udelay_test_usecs;
> +   iters = udelay_test_iterations;
> +   mutex_unlock(&udelay_test_lock);
> +
> +   if (usecs > 0 && iters > 0) {
> +   return udelay_test_single(s, usecs, iters);
> +   } else if (usecs == 0) {
> +   struct timespec ts;
> +
> +   ktime_get_ts(&ts);
> +   seq_printf(s, "udelay() test (lpj=%ld kt=%ld.%09ld)\n",
> +   loops_per_jiffy, ts.tv_sec, ts.tv_nsec);
> +   seq_puts(s, "usage:\n");
> +   seq_puts(s, "echo USECS [ITERS] > " DEBUGFS_FILENAME "\n");

[ANNOUNCE] Git v2.2.0-rc3

2014-11-21 Thread Junio C Hamano
A release candidate Git v2.2.0-rc3 is now available for testing
at the usual places.  I was planning to do the final one but we
found and fixed last-minute bugs in the code in -rc2, so this
is to doubly make sure the result is fit for the final one,
which I am planning to tag mid next week.

The tarballs are found at:

https://www.kernel.org/pub/software/scm/git/testing/

The following public repositories all have a copy of the 'v2.2.0-rc3'
tag and the 'master' branch that the tag points at:

  url = https://kernel.googlesource.com/pub/scm/git/git
  url = git://repo.or.cz/alt-git.git
  url = https://code.google.com/p/git-core/
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git

Git v2.2 Release Notes (draft)
==============================

Updates since v2.1
------------------

Ports

 * Building on older MacOS X systems automatically sets
   the necessary NO_APPLE_COMMON_CRYPTO build-time option.

 * The support to build with NO_PTHREADS has been resurrected.

 * Compilation options have been updated a bit to support z/OS port
   better.


UI, Workflows & Features

 * "git archive" learned to filter what gets archived with pathspec.

 * "git config --edit --global" starts from a skeletal per-user
   configuration file contents, instead of a total blank, when the
   user does not already have any.  This immediately reduces the
   need for a later "Have you forgotten setting core.user?" and we
   can add more to the template as we gain more experience.

 * "git stash list -p" used to be almost always a no-op because each
   stash entry is represented as a merge commit.  It learned to show
   the difference between the base commit version and the working tree
   version, which is in line with what "git stash show" gives.

 * Sometimes users want to report a bug they experience on their
   repository, but they are not at liberty to share the contents of
   the repository.  "fast-export" was taught an "--anonymize" option
   to replace blob contents, names of people and paths and log
   messages with bland and simple strings to help them.

 * "git difftool" learned an option to stop feeding paths to the
   diff backend when it exits with a non-zero status.

 * "git grep" allows to paint (or not paint) partial matches on
   context lines when showing "grep -C" output in color.

 * "log --date=iso" uses a slight variant of ISO 8601 format that is
   made more human readable.  A new "--date=iso-strict" option gives
   datetime output that is more strictly conformant.

 * The logic "git prune" uses is more resilient against various corner
   cases.

 * A broken reimplementation of Git could write an invalid index that
   records both stage #0 and higher stage entries for the same path.
   We now notice and reject such an index, as there is no sensible
   fallback (we do not know if the broken tool wanted to resolve and
   forgot to remove higher stage entries, or if it wanted to unresolve
   and forgot to remove the stage#0 entry).

 * The temporary files "git mergetool" uses are named to avoid too
   many dots in them (e.g. a temporary file for "hello.c" used to be
   named e.g. "hello.BASE.4321.c" but now uses underscore instead,
   e.g. "hello_BASE_4321.c", to allow us to have multiple variants).

 * The temporary files "git mergetool" uses can be placed in a newly
   created temporary directory, instead of the current directory, by
   setting the mergetool.writeToTemp configuration variable.

 * "git mergetool" understands "--tool bc" now, as version 4 of
   BeyondCompare can be driven the same way as its version 3 and it
   feels awkward to say "--tool bc3" to run version 4.

 * The "pre-receive" and "post-receive" hooks are no longer required
   to consume their input fully (not following this requirement used
   to result in intermittent errors in "git push").

 * The pretty-format specifier "%d", which expanded to " (tagname)"
   for a tagged commit, gained a cousin "%D" that just gives the
   "tagname" without frills.

 * "git push" learned "--signed" push, that allows a push (i.e.
   request to update the refs on the other side to point at a new
   history, together with the transmission of necessary objects) to be
   signed, so that it can be verified and audited, using the GPG
   signature of the person who pushed, that the tips of branches at a
   public repository really point at the commits the pusher wanted to,
   without having to "trust" the server.

 * "git interpret-trailers" is a new filter to programmatically edit
   the tail end of the commit log messages, e.g. "Signed-off-by:".

 * "git help everyday" shows the "Everyday Git in 20 commands or so"
   document, whose contents have been updated to match more modern
   Git practice.

 * On the "git svn" front, work to reduce memory consumption and
   to improve handling of mergeinfo progresses.


Performance, Internal 

Re: [PATCH] Btrfs: disk-io: replace root args iff only fs_info used

2014-11-21 Thread Duncan
Daniel Dressler posted on Sat, 22 Nov 2014 01:37:10 +0900 as excerpted:

> Thank you David this is helpful feedback.
> 
> What would a cover letter be like? Would that be a separate email to the
> list, or maybe the first email in a patch series?

In context that's the 0/N post seen often on patch series posted here.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH] i2c: omap: fix i207 errata handling

2014-11-21 Thread Alexander Kochetkov

On 21 Nov 2014, at 3:29, Alexander Kochetkov wrote:

>> 
>> Found by code review. Real impact haven't seen.
>> Tested on Beagleboard XM C.
> 
> Does anybody know the "certain rare conditions" when RDR errata appears?
> I tested without luck (Beagleboard XM C).

Spent half a day trying to catch the errata without luck.
Tried to simulate noise on the bus in hope it may happen.
Tried to run with OMAP_I2C_FLAG_NO_FIFO flag.

What a mystery errata. Hiding.

Anyway, thanks!
Have a nice weekend!



Re: linux-next: manual merge of the char-misc tree with the arm-soc tree

2014-11-21 Thread Tony Lindgren
* Stephen Rothwell  [141120 22:26]:
> Hi all,
> 
> Today's linux-next merge of the char-misc tree got a conflict in
> arch/arm/mach-omap2/Kconfig between commit 9e1e632c4846 ("ARM: OMAP2+:
> Drop board file for ti8168evm") from the arm-soc tree and commit
> 184901a06a36 ("ARM: removing support for etb/etm in
> "arch/arm/kernel/"") from the char-misc tree.
> 
> I fixed it up (see below) and can carry the fix as necessary (no action
> is required).

Looks right to me thanks.

Tony 
 
> diff --cc arch/arm/mach-omap2/Kconfig
> index 27ec8923ddf6,06020fe77e57..
> --- a/arch/arm/mach-omap2/Kconfig
> +++ b/arch/arm/mach-omap2/Kconfig
> @@@ -276,14 -282,16 +276,6 @@@ config MACH_SBC353
>   default y
>   select OMAP_PACKAGE_CUS
>   
> - config OMAP3_EMU
> - bool "OMAP3 debugging peripherals"
> - depends on ARCH_OMAP3
> - select ARM_AMBA
> - select OC_ETM
> - help
> -   Say Y here to enable debugging hardware of omap3
>  -config MACH_TI8168EVM
>  -bool "TI8168 Evaluation Module"
>  -depends on SOC_TI81XX
>  -default y
>  -
>  -config MACH_TI8148EVM
>  -bool "TI8148 Evaluation Module"
>  -depends on SOC_TI81XX
>  -default y
> --
>   config OMAP3_SDRC_AC_TIMING
>   bool "Enable SDRC AC timing register changes"
>   depends on ARCH_OMAP3




Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Andy Lutomirski
On Fri, Nov 21, 2014 at 2:55 PM, Paul E. McKenney
 wrote:
> On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
>> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
>>  wrote:
>> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
>> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski wrote:
>> >> > We currently pretend that IST context is like standard exception
>> >> > context, but this is incorrect.  IST entries from userspace are like
>> >> > standard exceptions except that they use per-cpu stacks, so they are
>> >> > atomic.  IST entries from kernel space are like NMIs from RCU's
>> >> > perspective -- they are not quiescent states even if they
>> >> > interrupted the kernel during a quiescent state.
>> >> >
>> >> > Add and use ist_enter and ist_exit to track IST context.  Even
>> >> > though x86_32 has no IST stacks, we track these interrupts the same
>> >> > way.
>> >>
>> >> I should add:
>> >>
>> >> I have no idea why RCU read-side critical sections are safe inside
>> >> __do_page_fault today.  It's guarded by exception_enter(), but that
>> >> doesn't do anything if context tracking is off, and context tracking
>> >> is usually off. What am I missing here?
>> >
>> > Ah!  There are three cases:
>> >
>> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU is
>> > still paying attention to CPUs running in both userspace and in
>> > the kernel.  So if a page fault happens, RCU will be set up to
>> > notice any RCU read-side critical sections.
>> >
>> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
>> > might well be ignoring userspace execution: NO_HZ_FULL and
>> > all that.  However, as you pointed out, in this case the
>> > context-tracking code lets RCU know that we have entered the
>> > kernel, which means that RCU will again be paying attention to
>> > RCU read-side critical sections.
>> >
>> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
>> > if we take a page fault when context tracking is off, life
>> > will be hard.  But the kernel is not supposed to take page
>> > faults in the idle loop, so this is not a problem.
>>
>> I guess so, as long as there are really no page faults in the idle loop.
>
> As far as I know, there are not.  If there are, someone needs to let
> me know!  ;-)
>
>> There are, however, machine checks in the idle loop, and maybe kprobes
>> (haven't checked), so I think this patch might fix real bugs.
>
> If you can get ISTs from the idle loop, then the patch is needed.
>
>> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
>> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
>> > nested calls.
>>
>> Yes, and vice versa.  That code looked like it handled nesting
>> correctly, but I wasn't entirely sure.
>
> It currently does not, please see below patch.  Are you able to test
> nesting?  It would be really cool if you could do so -- I have no
> way to test this patch.

I can try.  It's sort of easy -- I'll put an int3 into do_nmi and add
a fixup to avoid crashing.

What should I look for?  Should I try to force full nohz on and assert
something?  I don't really know how to make full nohz work.

>
>> Also, just to make sure: are we okay if rcu_nmi_enter() is called
>> before exception_enter if context tracking is on and we came directly
>> from userspace?
>
> If I understand correctly, this will result in context tracking invoking
> rcu_user_enter(), which will result in the rcu_dynticks counter having an
> odd value.  In that case, rcu_nmi_enter() will notice that RCU is already
> paying attention to this CPU via its check of (atomic_read(&rdtp->dynticks)
> & 0x1), and will thus just return.  The matching rcu_nmi_exit() will
> notice that the nesting count is zero, and will also just return.
>
> Thus, everything works in that case.
>
> In contrast, if rcu_nmi_enter() was invoked from the idle loop, it
> would see that RCU is not paying attention to this CPU and that the
> NMI nesting depth (which rcu_nmi_enter() increments) used to be zero.
> It would then atomically increment rdtp->dynticks, forcing RCU to start
> paying attention to this CPU.  The matching rcu_nmi_exit() will see
> that the nesting count was non-zero, but became zero when decremented.
> This will cause rcu_nmi_exit() to atomically increment rdtp->dynticks,
> which will tell RCU to stop paying attention to this CPU.
>
> Thanx, Paul
>
> 
>
> rcu: Make rcu_nmi_enter() handle nesting
>
> Andy Lutomirski is introducing ISTs into x86, which from RCU's
> viewpoint are NMIs.  Because ISTs and NMIs can nest, rcu_nmi_enter()
> and rcu_nmi_exit() must now correctly handle nesting.  As luck would
> have it, rcu_nmi_exit() handles nesting but rcu_nmi_enter() does not.
> 
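
To make the nesting problem concrete, here is a single-threaded userspace model of the enter/exit bookkeeping described above. It is only a sketch under simplifying assumptions: the struct and function names are invented, plain integers stand in for the kernel's atomics, and it is not the patch that was actually posted. The convention follows the description in this thread: an odd dynticks value means RCU is paying attention to the CPU.

```c
/* Hypothetical per-CPU state; plain ints stand in for kernel atomics. */
struct cpu_state {
	unsigned long dynticks;	/* odd: RCU is watching this CPU */
	int nmi_nesting;	/* NMI-like entries that flipped dynticks */
};

static void sketch_nmi_enter(struct cpu_state *s)
{
	/* RCU already watching for some other reason: nothing to do. */
	if (s->nmi_nesting == 0 && (s->dynticks & 1))
		return;
	/* Only the outermost entry from idle makes dynticks odd. */
	if (s->nmi_nesting++ == 0)
		s->dynticks++;
}

static void sketch_nmi_exit(struct cpu_state *s)
{
	/* The matching enter was a no-op, so the exit is too. */
	if (s->nmi_nesting == 0)
		return;
	/* Only the outermost exit makes dynticks even again. */
	if (--s->nmi_nesting == 0)
		s->dynticks++;
}
```

The difference from a non-nesting version is the test on the nesting count before touching dynticks: incrementing dynticks on every entry would flip it back to even on a nested entry from idle, telling RCU to stop watching while the handlers are still running.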

Re: [Qemu-devel] [PATCH 00/17] RFC: userfault v2

2014-11-21 Thread Peter Maydell
On 21 November 2014 20:14, Andrea Arcangeli  wrote:
> Hi Peter,
>
> On Wed, Oct 29, 2014 at 05:56:59PM +0000, Peter Maydell wrote:
>> On 29 October 2014 17:46, Andrea Arcangeli  wrote:
>> > After some chat during the KVMForum I've been already thinking it
>> > could be beneficial for some usage to give userland the information
>> > about the fault being read or write
>>
>> ...I wonder if that would let us replace the current nasty
>> mess we use in linux-user to detect read vs write faults
>> (which uses a bunch of architecture-specific hacks including
>> in some cases "look at the insn that triggered this SEGV and
>> decode it to see if it was a load or a store"; see the
>> various cpu_signal_handler() implementations in user-exec.c).
>
> There's currently no plan to deliver to userland read access
> notifications of a present page, simply because the task of the
> userfaultfd is to handle the page fault in userland, but if the page
> is mapped and readable it won't fault in the first place :). I just
> mean it's not like gdb read watch.

If it's mapped and readable-but-not-writable then it should still
fault on write accesses, though? These are cases we currently get
SEGV for, anyway.

> Even if the region would be set to PROT_NONE it would still SEGV
> without triggering a userfault (after all pte_present would still be
> true because the page is still mapped despite not being readable, so
> in any case it wouldn't be considered a not-present page fault).

Ah, I guess we have a terminology difference. I was considering
"page fault" to mean (roughly) "anything that causes the CPU to
take an exception on an attempted load/store" and expected that
userfaultfd would notify userspace of any of those. (Well, not
alignment faults, maybe, but I'm definitely surprised that
access permission issues don't get reported the same way as
page-completely-missing issues. In other words I was expecting
that this was "everything previously reported via SIGSEGV or
SIGBUS now comes via userfaultfd".)

> Temporarily removing/moving the page with remap_anon_pages shall be
> much better than using PROT_NONE for this (or alternative syscall name
> to differentiate it further from remap_file_pages, or equivalent
> userfaultfd command if we decide to hide the pte/pmd mangling as
> userfaultfd commands instead of adding new standalone syscalls).

We don't use PROT_NONE for the linux-user situation, we just use
mprotect() to remove the PAGE_WRITE permission so it's still
readable.

I suspect actually linux-user would be better off implementing
something like "if this is a page which we've mapped read-only
because we translated code out of it, then go ahead and remap
it r/w and throw away the translation and retry the access,
otherwise report SEGV to the guest", because taking SEGVs shouldn't
be a fast path in the guest binary. That would let us work without
architecture-specific junk and without requiring new kernel
features either. So you can ignore this whole tangent thread :-)
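
A rough Linux userspace sketch of the "remap r/w and retry" idea from the paragraph above, assuming a single tracked page. The names tracked_page, on_segv() and write_protect_demo() are invented for the example and this is not QEMU code. Calling mprotect() from a signal handler is not formally async-signal-safe, so treat that as an assumption that happens to hold on Linux.

```c
#define _DEFAULT_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static unsigned char *tracked_page;	/* page we write-protected ourselves */
static size_t tracked_len;
static volatile sig_atomic_t invalidations;

static void on_segv(int sig, siginfo_t *info, void *ctx)
{
	uintptr_t addr = (uintptr_t)info->si_addr;
	uintptr_t start = (uintptr_t)tracked_page;

	(void)sig;
	(void)ctx;
	if (tracked_page && addr >= start && addr < start + tracked_len) {
		/* Our own protection: drop the "translation", remap r/w, retry. */
		invalidations++;
		if (mprotect(tracked_page, tracked_len,
			     PROT_READ | PROT_WRITE) == 0)
			return;
	}
	/* Anything else is a genuine fault: let the default action run. */
	signal(SIGSEGV, SIG_DFL);
}

/* Returns the number of invalidations seen, or -1 on setup failure. */
static int write_protect_demo(void)
{
	struct sigaction sa;
	long pagesz = sysconf(_SC_PAGESIZE);
	void *p;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_segv;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (pagesz <= 0 || sigaction(SIGSEGV, &sa, NULL) != 0)
		return -1;

	p = mmap(NULL, (size_t)pagesz, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;
	tracked_page = p;
	tracked_len = (size_t)pagesz;

	tracked_page[0] = 1;
	if (mprotect(tracked_page, tracked_len, PROT_READ) != 0)
		return -1;

	/* Faults once; the handler restores write access and the store retries. */
	((volatile unsigned char *)tracked_page)[0] = 2;
	return invalidations;
}
```

The handler never needs to know whether the faulting instruction was a load or a store, which is the attraction of this approach: a fault inside a page we protected ourselves can only have been a write.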

thanks
-- PMM


Re: [PATCH] ipc,sem block sem_lock on sma->lock during sma initialization

2014-11-21 Thread Rik van Riel
On 11/21/2014 03:42 PM, Andrew Morton wrote:
> On Fri, 21 Nov 2014 15:29:27 -0500 Rik van Riel 
> wrote:
> 
>> On 11/21/2014 03:09 PM, Andrew Morton wrote:
>>> On Fri, 21 Nov 2014 14:52:26 -0500 Rik van Riel
>>>  wrote:
>>> 
>>>> When manipulating just one semaphore with semop, sem_lock
>>>> only takes that single semaphore's lock. This creates a
>>>> problem during initialization of the semaphore array, when
>>>> the data structures used by sem_lock have not been set up
>>>> yet. The sma->lock is already held by newary, and we just
>>>> have to make sure everything else waits on that lock during
>>>> initialization.
>>>> 
>>>> Luckily it is easy to make sem_lock wait on the sma->lock,
>>>> by pretending there is a complex operation in progress while
>>>> the sma is being initialized.
>>>> 
>>>> The newary function already zeroes sma->complex_count before
>>>> unlocking the sma->lock.
>>> 
>>> What are the runtime effects of the bug?
>>> 
>> 
>> NULL pointer dereference in spin_lock from sem_lock, if it is
>> called before sma->sem_base has been pointed somewhere valid.
> 
> Help us out here.  People need to use this description to work out 
> which kernel versions need the patch and whether to backport the
> fix into their various kernels.  Other people will be starting at
> this changelog wondering "will this fix the bug my customer has
> reported".
> 
> Is there some bug report people can look at?
> 
> What userspace actions trigger this bug?

The reason the bug took almost two years to get noticed is that
it takes one task doing a semop on a semaphore in an array that
is still getting instantiated by newary (semget) from another
task.

In other words, if you try to use a semaphore array before
semget returns, you can oops the task that calls semop.

It should not cause any damage to long-living kernel data
structures.

-- 
All rights reversed


Re: frequent lockups in 3.18rc4

2014-11-21 Thread Andy Lutomirski
On Fri, Nov 21, 2014 at 2:55 PM, Linus Torvalds wrote:
> On Fri, Nov 21, 2014 at 1:11 PM, Thomas Gleixner  wrote:
>>
>> I'm fine with that. I just think it's not horrid enough, but that can
>> be fixed easily :)
>
> Oh, I think it's plenty horrid.
>
> Anyway, here's an actual patch. As usual, it has seen absolutely no
> actual testing, but I did try to make sure it compiles and seems to do
> the right thing on:
>  - x86-32 no-PAE
>  - x86-32 no-PAE with PARAVIRT
>  - x86-32 PAE
>  - x86-64
>
> also, I just removed the noise that is "vmalloc_sync_all()", since
> it's just all garbage and nothing actually uses it. Yeah, it's used by
> "register_die_notifier()", which makes no sense what-so-ever.
> Whatever. It's gone.
>
> Can somebody actually *test* this? In particular, in any kind of real
> paravirt environment? Or, any comments even without testing?
>
> I *really* am not proud of the mess wrt the whole
>
>   #ifdef CONFIG_PARAVIRT
>   #ifdef CONFIG_X86_32
> ...
>
> but I think that from a long-term perspective, we're actually better
> off with this kind of really ugly - but very explicit - hack that very
> clearly shows what is going on.
>
> The old code that actually "walked" the page tables was more
> "portable", but was somewhat misleading about what was actually going
> on.

At the risk of going deeper down the rabbit hole, I grepped for
pgd_list.  I found:

__set_pmd_pte in pageattr.c.  It appears to be completely incorrect.
Unless I've misunderstood, other than the very first line, it will
either do nothing at all or crash when it falls off the end of the
page tables that it's pointlessly trying to update.

sync_global_pgds: OK, I guess -- this is for hot-add of memory, right?
 But if we teach the context switch code to check that the kernel
stack is okay, that can be removed, I think.  (We absolutely MUST keep
the static per-cpu stuff populated everywhere before running user
code, but that's never in hot-added memory.)

xen_mm_pin_all and xen_mm_unpin_all: I have no clue.  I wonder how
that works with SHARED_KERNEL_PMD.

Anyone want to attack these?  It would be kind of nice to remove
pgd_list entirely.  (I realize that doing so precludes the use of
bloody enormous 512GB kernel pages, but any attempt to use *those* is
so completely screwed without a major reworking of all of this (or
perhaps stop_machine) that keeping pgd_list around just for that is
probably a mistake.)

--Andy


Re: [PATCH] i2c: omap: fix i207 errata handling

2014-11-21 Thread Alexander Kochetkov

On 21 Nov 2014, at 19:08, Felipe Balbi wrote:

> Tested on BBB and AM437x Starter Kit
> 
> Tested-by: Felipe Balbi 
> Reviewed-by: Felipe Balbi 

On 21 Nov 2014, at 0:10, Aaro Koskinen wrote:

> I could not see any breakage or anything wrong on OMAP2 & OMAP3.
> On OMAP1 I don't have anything on the OMAP I2C bus, so cannot really
> test anything there.
> 
> Tested-by: Aaro Koskinen 


On 21 Nov 2014, at 21:11, Wolfram Sang wrote:

> The errno for AL is -EAGAIN. Curly braces are not needed.


Guys, I really appreciate your help.
So much testing and review.
I could not have done it alone.
Thank you!

Alexander.



[PATCH 1/1] [IA64] Deletion of unnecessary checks before the function call "unw_remove_unwind_table"

2014-11-21 Thread SF Markus Elfring
From: Markus Elfring 
Date: Fri, 21 Nov 2014 23:57:28 +0100

The unw_remove_unwind_table() function also performs input parameter validation.
Thus the test around the call is not needed.
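
The pattern behind this cleanup can be sketched in plain C. The names
below (table_model, remove_table, cleanup_model) are hypothetical and
only illustrate the idea; they are not the ia64 unwind code. When the
callee validates its own argument, the caller's guard is redundant.

```c
#include <stdlib.h>

/* Hypothetical stand-in for an unwind table. */
struct table_model {
	int id;
};

static int tables_removed;  /* counts real removals, for demonstration */

/* Hypothetical cleanup function that validates its own input. */
static void remove_table(struct table_model *t)
{
	if (!t)         /* the callee tolerates NULL itself */
		return;
	tables_removed++;
	free(t);
}

/* The caller needs no "if (ptr)" guard around each call. */
static void cleanup_model(struct table_model *init_tbl,
			  struct table_model *core_tbl)
{
	remove_table(init_tbl);
	remove_table(core_tbl);
}
```

This is the same convention that free(NULL) follows in standard C.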

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring 
---
 arch/ia64/kernel/module.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/kernel/module.c b/arch/ia64/kernel/module.c
index 24603be..30ff27d 100644
--- a/arch/ia64/kernel/module.c
+++ b/arch/ia64/kernel/module.c
@@ -946,8 +946,6 @@ module_finalize (const Elf_Ehdr *hdr, const Elf_Shdr *sechdrs, struct module *mo
 void
 module_arch_cleanup (struct module *mod)
 {
-   if (mod->arch.init_unw_table)
-   unw_remove_unwind_table(mod->arch.init_unw_table);
-   if (mod->arch.core_unw_table)
-   unw_remove_unwind_table(mod->arch.core_unw_table);
+   unw_remove_unwind_table(mod->arch.init_unw_table);
+   unw_remove_unwind_table(mod->arch.core_unw_table);
 }
-- 
2.1.3



[PATCH] xen-netback: do not report success if xenvif_alloc() fails

2014-11-21 Thread Alexey Khoroshilov
If xenvif_alloc() fails, netback_probe() still reports success and an
"online" uevent is emitted. This makes no sense and only misleads
users.

The patch implements propagation of error code if xenvif creation fails.
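
A minimal user-space sketch of the propagation, under assumptions: the
helper names are hypothetical, and the scan result follows the
convention visible in the diff (1 field parsed means success, anything
else is a failure that must become a negative errno).

```c
#include <errno.h>

/*
 * Hypothetical stand-in for the result of a scanf-style helper:
 * 1 field parsed is success, 0 means nothing matched, and a
 * negative value is already an errno.
 */
static int scan_result_to_errno(int scanned)
{
	if (scanned == 1)
		return 0;
	return (scanned < 0) ? scanned : -EINVAL;
}

/* Creation step: returns 0 or a negative errno instead of void. */
static int create_model(int scanned, int alloc_err)
{
	int err = scan_result_to_errno(scanned);

	if (err)
		return err;
	if (alloc_err)
		return alloc_err;   /* e.g. what PTR_ERR() would yield */
	return 0;
}

/* Probe step: forwards the error rather than reporting success. */
static int probe_model(int scanned, int alloc_err)
{
	int err = create_model(scanned, alloc_err);

	if (err)
		return err;         /* real code would jump to its cleanup label */
	return 0;
}
```

Mapping a non-negative but unsuccessful scan result to -EINVAL keeps
the probe from returning a positive value that callers could mistake
for success.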

Found by Linux Driver Verification project (linuxtesting.org).

Signed-off-by: Alexey Khoroshilov 
---
 drivers/net/xen-netback/xenbus.c | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/drivers/net/xen-netback/xenbus.c b/drivers/net/xen-netback/xenbus.c
index 4e56a27f9689..fab0d4b42f58 100644
--- a/drivers/net/xen-netback/xenbus.c
+++ b/drivers/net/xen-netback/xenbus.c
@@ -39,7 +39,7 @@ struct backend_info {
 static int connect_rings(struct backend_info *be, struct xenvif_queue *queue);
 static void connect(struct backend_info *be);
 static int read_xenbus_vif_flags(struct backend_info *be);
-static void backend_create_xenvif(struct backend_info *be);
+static int backend_create_xenvif(struct backend_info *be);
 static void unregister_hotplug_status_watch(struct backend_info *be);
 static void set_backend_state(struct backend_info *be,
  enum xenbus_state state);
@@ -352,7 +352,9 @@ static int netback_probe(struct xenbus_device *dev,
be->state = XenbusStateInitWait;
 
/* This kicks hotplug scripts, so do it immediately. */
-   backend_create_xenvif(be);
+   err = backend_create_xenvif(be);
+   if (err)
+   goto fail;
 
return 0;
 
@@ -397,19 +399,19 @@ static int netback_uevent(struct xenbus_device *xdev,
 }
 
 
-static void backend_create_xenvif(struct backend_info *be)
+static int backend_create_xenvif(struct backend_info *be)
 {
int err;
long handle;
struct xenbus_device *dev = be->dev;
 
if (be->vif != NULL)
-   return;
+   return 0;
 
err = xenbus_scanf(XBT_NIL, dev->nodename, "handle", "%li", &handle);
if (err != 1) {
xenbus_dev_fatal(dev, err, "reading handle");
-   return;
+   return (err < 0) ? err : -EINVAL;
}
 
be->vif = xenvif_alloc(&dev->dev, dev->otherend_id, handle);
@@ -417,10 +419,11 @@ static void backend_create_xenvif(struct backend_info *be)
err = PTR_ERR(be->vif);
be->vif = NULL;
xenbus_dev_fatal(dev, err, "creating interface");
-   return;
+   return err;
}
 
kobject_uevent(&dev->dev.kobj, KOBJ_ONLINE);
+   return 0;
 }
 
 static void backend_disconnect(struct backend_info *be)
-- 
1.9.1



Re: frequent lockups in 3.18rc4

2014-11-21 Thread Linus Torvalds
On Fri, Nov 21, 2014 at 1:11 PM, Thomas Gleixner  wrote:
>
> I'm fine with that. I just think it's not horrid enough, but that can
> be fixed easily :)

Oh, I think it's plenty horrid.

Anyway, here's an actual patch. As usual, it has seen absolutely no
actual testing, but I did try to make sure it compiles and seems to do
the right thing on:
 - x86-32 no-PAE
 - x86-32 no-PAE with PARAVIRT
 - x86-32 PAE
 - x86-64

also, I just removed the noise that is "vmalloc_sync_all()", since
it's just all garbage and nothing actually uses it. Yeah, it's used by
"register_die_notifier()", which makes no sense what-so-ever.
Whatever. It's gone.

Can somebody actually *test* this? In particular, in any kind of real
paravirt environment? Or, any comments even without testing?

I *really* am not proud of the mess wrt the whole

  #ifdef CONFIG_PARAVIRT
  #ifdef CONFIG_X86_32
...

but I think that from a long-term perspective, we're actually better
off with this kind of really ugly - but very explicit - hack that very
clearly shows what is going on.

The old code that actually "walked" the page tables was more
"portable", but was somewhat misleading about what was actually going
on.
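
The core idea (copy one top-level entry from the reference table
instead of walking the levels) can be modelled in user space. This is a
sketch under assumptions: the table size, the kernel boundary index and
the present bit are illustrative constants, and plain arrays stand in
for init_mm.pgd and the table that CR3 points at.

```c
#include <stdint.h>

#define NR_ENTRIES      1024u  /* assumed: non-PAE 32-bit top level */
#define KERNEL_BOUNDARY 768u   /* assumed: first kernel-space index */
#define ENTRY_PRESENT   0x1u   /* assumed: hardware "present" bit */

typedef uint32_t entry_t;

/*
 * Copy one top-level entry from the reference table into the active
 * table. Returns 0 if the entry was copied, -1 if the fault is not a
 * vmalloc-style fault that this path can fix.
 */
static int sync_top_level_entry(const entry_t *reference, entry_t *active,
				unsigned index)
{
	if (index >= NR_ENTRIES || index < KERNEL_BOUNDARY)
		return -1;      /* not a kernel-space index */
	if (!(reference[index] & ENTRY_PRESENT))
		return -1;      /* the reference table has no mapping either */
	if (active[index])
		return -1;      /* already populated, so not a missing entry */
	active[index] = reference[index];
	return 0;
}
```

The model reads the raw entry value, as the patch comment describes,
rather than going through accessor functions that treat folded levels
as always present.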

Comments?

   Linus
 arch/x86/mm/fault.c | 243 +---
 1 file changed, 58 insertions(+), 185 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index d973e61e450d..4b0a1b9404b1 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -42,6 +42,64 @@ enum x86_pf_error_code {
 };
 
 /*
+ * Handle a possible vmalloc fault. We just copy the
+ * top-level page table entry if necessary.
+ *
+ * With PAE, the top-most pgd entry is always shared,
+ * and that's where the vmalloc area is.  So PAE had
+ * better never have any vmalloc faults.
+ *
+ * NOTE! This on purpose does *NOT* use pgd_present()
+ * and such generic accessor functions, because
+ * the pgd may contain a folded pud/pmd, and is thus
+ * always "present". We access the actual hardware
+ * state directly, except for the final "set_pgd()"
+ * that may go through a paravirtualization layer.
+ *
+ * Also note the disgusting hackery for the whole
+ * paravirtualization case. Since PAE isn't an issue,
+ * we know that the pmd is the top level, and we just
+ * short-circuit it all.
+ *
+ * We *seriously* need to get rid of the crazy
+ * paravirtualization crud.
+ */
+static nokprobe_inline int vmalloc_fault(unsigned long address)
+{
+#ifdef CONFIG_X86_PAE
+   return -1;
+#else
+   pgd_t *pgd_dst, pgd_entry;
+   unsigned index = pgd_index(address);
+
+   if (index < KERNEL_PGD_BOUNDARY)
+return -1;
+
+   pgd_entry = init_mm.pgd[index];
+   if (!(pgd_entry.pgd & 1))
+   return -1;
+
+   pgd_dst = __va(PAGE_MASK & read_cr3());
+   pgd_dst += index;
+
+   if (pgd_dst->pgd)
+   return -1;
+
+#ifdef CONFIG_PARAVIRT
+#ifdef CONFIG_X86_32
+   set_pmd((pmd_t *)pgd_dst, (pmd_t){(pud_t){pgd_entry}});
+#else
+   set_pgd(pgd_dst, pgd_entry);
+   arch_flush_lazy_mmu_mode(); // WTF?
+#endif
+#else
+   *pgd_dst = pgd_entry;
+#endif
+   return 0;
+#endif
+}
+
+/*
  * Returns 0 if mmiotrace is disabled, or if the fault is not
  * handled by mmiotrace:
  */
@@ -189,110 +247,6 @@ DEFINE_SPINLOCK(pgd_lock);
 LIST_HEAD(pgd_list);
 
 #ifdef CONFIG_X86_32
-static inline pmd_t *vmalloc_sync_one(pgd_t *pgd, unsigned long address)
-{
-   unsigned index = pgd_index(address);
-   pgd_t *pgd_k;
-   pud_t *pud, *pud_k;
-   pmd_t *pmd, *pmd_k;
-
-   pgd += index;
-   pgd_k = init_mm.pgd + index;
-
-   if (!pgd_present(*pgd_k))
-   return NULL;
-
-   /*
-* set_pgd(pgd, *pgd_k); here would be useless on PAE
-* and redundant with the set_pmd() on non-PAE. As would
-* set_pud.
-*/
-   pud = pud_offset(pgd, address);
-   pud_k = pud_offset(pgd_k, address);
-   if (!pud_present(*pud_k))
-   return NULL;
-
-   pmd = pmd_offset(pud, address);
-   pmd_k = pmd_offset(pud_k, address);
-   if (!pmd_present(*pmd_k))
-   return NULL;
-
-   if (!pmd_present(*pmd))
-   set_pmd(pmd, *pmd_k);
-   else
-   BUG_ON(pmd_page(*pmd) != pmd_page(*pmd_k));
-
-   return pmd_k;
-}
-
-void vmalloc_sync_all(void)
-{
-   unsigned long address;
-
-   if (SHARED_KERNEL_PMD)
-   return;
-
-   for (address = VMALLOC_START & PMD_MASK;
-address >= TASK_SIZE && address < FIXADDR_TOP;
-address += PMD_SIZE) {
-   struct page *page;
-
-   spin_lock(&pgd_lock);
-   list_for_each_entry(page, &pgd_list, lru) {
-   spinlock_t *pgt_lock;
-   pmd_t *ret;
-
-   /* the pgt_lock only for Xen */
-   pgt_lock = &pgd_page_get_mm(page)->page_table_lock;
-
-   

Re: [PATCH v4 2/5] x86, traps: Track entry into and exit from IST context

2014-11-21 Thread Paul E. McKenney
On Fri, Nov 21, 2014 at 02:19:17PM -0800, Andy Lutomirski wrote:
> On Fri, Nov 21, 2014 at 2:07 PM, Paul E. McKenney
>  wrote:
> > On Fri, Nov 21, 2014 at 01:32:50PM -0800, Andy Lutomirski wrote:
> >> On Fri, Nov 21, 2014 at 1:26 PM, Andy Lutomirski  
> >> wrote:
> >> > We currently pretend that IST context is like standard exception
> >> > context, but this is incorrect.  IST entries from userspace are like
> >> > standard exceptions except that they use per-cpu stacks, so they are
> >> > atomic.  IST entries from kernel space are like NMIs from RCU's
> >> > perspective -- they are not quiescent states even if they
> >> > interrupted the kernel during a quiescent state.
> >> >
> >> > Add and use ist_enter and ist_exit to track IST context.  Even
> >> > though x86_32 has no IST stacks, we track these interrupts the same
> >> > way.
> >>
> >> I should add:
> >>
> >> I have no idea why RCU read-side critical sections are safe inside
> >> __do_page_fault today.  It's guarded by exception_enter(), but that
> >> doesn't do anything if context tracking is off, and context tracking
> >> is usually off. What am I missing here?
> >
> > Ah!  There are three cases:
> >
> > 1.  Context tracking is off on a non-idle CPU.  In this case, RCU is
> > still paying attention to CPUs running in both userspace and in
> > the kernel.  So if a page fault happens, RCU will be set up to
> > notice any RCU read-side critical sections.
> >
> > 2.  Context tracking is on on a non-idle CPU.  In this case, RCU
> > might well be ignoring userspace execution: NO_HZ_FULL and
> > all that.  However, as you pointed out, in this case the
> > context-tracking code lets RCU know that we have entered the
> > kernel, which means that RCU will again be paying attention to
> > RCU read-side critical sections.
> >
> > 3.  The CPU is idle.  In this case, RCU is ignoring the CPU, so
> > if we take a page fault when context tracking is off, life
> > will be hard.  But the kernel is not supposed to take page
> > faults in the idle loop, so this is not a problem.
> 
> I guess so, as long as there are really no page faults in the idle loop.

As far as I know, there are not.  If there are, someone needs to let
me know!  ;-)

> There are, however, machine checks in the idle loop, and maybe kprobes
> (haven't checked), so I think this patch might fix real bugs.

If you can get ISTs from the idle loop, then the patch is needed.

> > Just out of curiosity...  Can an NMI occur in IST context?  If it can,
> > I need to make rcu_nmi_enter() and rcu_nmi_exit() deal properly with
> > nested calls.
> 
> Yes, and vice versa.  That code looked like it handled nesting
> correctly, but I wasn't entirely sure.

It currently does not, please see below patch.  Are you able to test
nesting?  It would be really cool if you could do so -- I have no
way to test this patch.

> Also, just to make sure: are we okay if rcu_nmi_enter() is called
> before exception_enter if context tracking is on and we came directly
> from userspace?

If I understand correctly, this will result in context tracking invoking
rcu_user_enter(), which will result in the rcu_dynticks counter having an
odd value.  In that case, rcu_nmi_enter() will notice that RCU is already
paying attention to this CPU via its check of (atomic_read(&rdtp->dynticks)
& 0x1), and will thus just return.  The matching rcu_nmi_exit() will
notice that the nesting count is zero, and will also just return.

Thus, everything works in that case.

In contrast, if rcu_nmi_enter() was invoked from the idle loop, it
would see that RCU is not paying attention to this CPU and that the
NMI nesting depth (which rcu_nmi_enter() increments) used to be zero.
It would then atomically increment rdtp->dynticks, forcing RCU to start
paying attention to this CPU.  The matching rcu_nmi_exit() will see
that the nesting count was non-zero, but became zero when decremented.
This will cause rcu_nmi_exit() to atomically increment rdtp->dynticks,
which will tell RCU to stop paying attention to this CPU.
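
The two non-nested cases above can be traced with a small model. It is
a sketch under assumptions: plain integers replace the atomic counter,
the struct and function names are hypothetical, and the logic mirrors
only what is described here, not the nesting fix in the patch below.

```c
/* Simplified per-CPU state; plain integers instead of atomics. */
struct dynticks_model {
	unsigned dynticks;  /* odd value: RCU is paying attention to this CPU */
	int nmi_nesting;    /* NMI nesting depth */
};

static void nmi_enter_model(struct dynticks_model *d)
{
	/* RCU already watches this CPU: nothing to do. */
	if (d->nmi_nesting == 0 && (d->dynticks & 0x1))
		return;
	d->nmi_nesting++;
	d->dynticks++;      /* even -> odd for a non-nested entry from idle */
}

static void nmi_exit_model(struct dynticks_model *d)
{
	/* Either enter did nothing, or an outer NMI is still running. */
	if (d->nmi_nesting == 0 || --d->nmi_nesting != 0)
		return;
	d->dynticks++;      /* odd -> even: RCU stops watching this CPU */
}
```

Starting from an odd counter, both calls leave the state untouched.
Starting from an even counter, enter makes it odd and the matching exit
makes it even again.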

Thanx, Paul



rcu: Make rcu_nmi_enter() handle nesting

Andy Lutomirski is introducing ISTs into x86, which from RCU's
viewpoint are NMIs.  Because ISTs and NMIs can nest, rcu_nmi_enter()
and rcu_nmi_exit() must now correctly handle nesting.  As luck would
have it, rcu_nmi_exit() handles nesting but rcu_nmi_enter() does not.
This patch therefore makes rcu_nmi_enter() handle nesting.

Signed-off-by: Paul E. McKenney 

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 8749f43f3f05..875421aff6e3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -770,7 +770,8 @@ void rcu_nmi_enter(void)
if (rdtp->dynticks_nmi_nesting == 0 &&
(atomic_read(&rdtp->dynticks) & 0x1))
return;
-   

[PATCH v2 3/4] i2c: omap: don't reset controller if Arbitration Lost detected

2014-11-21 Thread Alexander Kochetkov
Arbitration Lost is an expected situation in a multimaster
environment. The I2C controller (IP) correctly detects and reports AL.

The only visible reason for resetting the IP in the AL case is
to avoid advisory 1.94 (omap3) and errata i595 (omap4): "I2C:
After an Arbitration is Lost the Module Incorrectly Starts
the Next Transfer".

The errata workaround states: "The MST and STT bits inside I2C_CON
should be set to 1 at the same moment (avoid setting the MST bit
to 1 while STT = 0)." The driver never sets the MST and STT bits
separately, so it does not create the condition for the errata, and
the reset is not necessary.

Also corrected return value for AL to -EAGAIN.
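
The resulting error classification can be sketched as follows. The bit
values, the struct and the fallback return are assumptions made for the
example; only the split between "reset and -EIO" and "-EAGAIN without
reset" is taken from the patch.

```c
#include <errno.h>

/* Hypothetical status bits; the real register layout differs. */
#define STAT_AL   (1u << 0)  /* arbitration lost */
#define STAT_ROVR (1u << 1)  /* receive overrun */
#define STAT_XUDF (1u << 2)  /* transmit underflow */

struct xfer_model {
	unsigned cmd_err;  /* error bits latched during the transfer */
	int resets;        /* how many times the controller was reset */
};

static int classify_error(struct xfer_model *x)
{
	if (x->cmd_err == 0)
		return 0;

	/* Overrun and underflow still reset the controller. */
	if (x->cmd_err & (STAT_ROVR | STAT_XUDF)) {
		x->resets++;
		return -EIO;
	}

	/* Arbitration lost: no reset, the caller may retry. */
	if (x->cmd_err & STAT_AL)
		return -EAGAIN;

	return -EIO;       /* assumed fallback for any other error bit */
}
```
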

Tested on Beagleboard XM C.

Signed-off-by: Alexander Kochetkov 
---

On 21.10.2014 21:11, Wolfram Sang  wrote:
> The errno for AL is -EAGAIN. Curly braces are not needed.

Thank you, Wolfram, fixed.

 drivers/i2c/busses/i2c-omap.c |6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-omap.c b/drivers/i2c/busses/i2c-omap.c
index 3ffb9c0..02da567 100644
--- a/drivers/i2c/busses/i2c-omap.c
+++ b/drivers/i2c/busses/i2c-omap.c
@@ -707,13 +707,15 @@ static int omap_i2c_xfer_msg(struct i2c_adapter *adap,
return 0;
 
/* We have an error */
-   if (dev->cmd_err & (OMAP_I2C_STAT_AL | OMAP_I2C_STAT_ROVR |
-   OMAP_I2C_STAT_XUDF)) {
+   if (dev->cmd_err & (OMAP_I2C_STAT_ROVR | OMAP_I2C_STAT_XUDF)) {
omap_i2c_reset(dev);
__omap_i2c_init(dev);
return -EIO;
}
 
+   if (dev->cmd_err & OMAP_I2C_STAT_AL)
+   return -EAGAIN;
+
if (dev->cmd_err & OMAP_I2C_STAT_NACK) {
if (msg->flags & I2C_M_IGNORE_NAK)
return 0;
-- 
1.7.9.5



Re: frequent lockups in 3.18rc4

2014-11-21 Thread Steven Rostedt
On Fri, 21 Nov 2014 22:50:41 +0100
Frederic Weisbecker  wrote:
> > Otherwise if we have a page fault inside do_page_fault, it's just a
> > nested page fault.
> 
> Oh ok!
> 
> But we still have the cr2 issue that Steve talked about.
>

Nope. Looking at the code, I noticed that do_page_fault(), the wrapper
around __do_page_fault(), is not traced, while __do_page_fault() is. And
do_page_fault() saves off the cr2 before calling anything else.

So we are ok in this respect as well.
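
A toy model of why saving the register first is enough. Everything here
is hypothetical: a global variable stands in for CR2, and the "nested
fault" is simulated by overwriting it from inside the inner handler.

```c
static unsigned long fake_cr2;  /* stands in for the CR2 register */

/* A nested fault overwrites the register with its own address. */
static void nested_fault(unsigned long addr)
{
	fake_cr2 = addr;
}

/* Inner (traced) handler: may itself fault before using the address. */
static unsigned long inner_handler(unsigned long address)
{
	nested_fault(0xdeadUL);  /* e.g. a fault taken by tracing code */
	return address;          /* still the original fault address */
}

/* Outer (untraced) wrapper: reads the register before anything else. */
static unsigned long outer_handler(void)
{
	unsigned long address = fake_cr2;

	return inner_handler(address);
}
```
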

-- Steve


Linux 3.12.33

2014-11-21 Thread Jiri Slaby
I'm announcing the release of the 3.12.33 kernel.

All users of the 3.12 kernel series must upgrade.

The updated 3.12.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-3.12.y
and can be browsed at the normal kernel.org git web browser:

http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary

- 
Adel Gadllah (4):
  USB: quirks: enable device-qualifier quirk for another Elan touchscreen
  USB: quirks: enable device-qualifier quirk for yet another Elan touchscreen
  HID: usbhid: enable always-poll quirk for Elan Touchscreen 009b
  HID: usbhid: enable always-poll quirk for Elan Touchscreen 016f

Al Viro (3):
  missing data dependency barrier in prepend_name()
  kill wbuf_queued/wbuf_dwork_lock
  fix misuses of f_count() in ppp and netlink

Alan Stern (1):
  usb-storage: handle a skipped data phase

Alex Deucher (2):
  drm/radeon/dpm: disable ulv support on SI
  drm/radeon: remove invalid pci id

Alexander Stein (1):
  spi: fsl-dspi: Fix CTAR selection

Alexey Khoroshilov (1):
  dm log userspace: fix memory leak in dm_ulog_tfr_init failure path

Anantha Krishnan (1):
  Bluetooth: Add support for Acer [13D3:3432]

Anatol Pomozov (1):
  Bluetooth: Fix crash in the Marvell driver initialization codepath

Andy Honig (2):
  KVM: x86: Prevent host from panicking on shared MSR writes.
  KVM: x86: Improve thread safety in pit

Andy Lutomirski (3):
  x86_64, entry: Filter RFLAGS.NT on entry from userspace
  x86, apic: Handle a bad TSC more gracefully
  x86_64, entry: Fix out of bounds read on sysenter

Andy Shevchenko (2):
  Bluetooth: sort the list of IDs in the source code
  Bluetooth: append new supported device to the list [0b05:17d0]

Artem Bityutskiy (3):
  UBIFS: remove mst_mutex
  UBIFS: fix a race condition
  UBIFS: fix free log space calculation

Axel Lin (1):
  media: tda7432: Fix setting TDA7432_MUTE bit for TDA7432_RF register

Ben Hutchings (3):
  drivers/net, ipv6: Select IPv6 fragment idents for virtio UFO packets
  drivers/net: macvtap and tun depend on INET
  x86: Reject x32 executables if x32 ABI not supported

Ben Skeggs (1):
  drm/nouveau/bios: memset dcb struct to zero before parsing

Benjamin Coddington (1):
  lockd: Try to reconnect if statd has moved

Benjamin Herrenschmidt (1):
  drm/ast: Fix HW cursor image

Benjamin Valentin (1):
  Input: xpad - sync device IDs with xboxdrv

Bryan O'Donoghue (1):
  x86: Add cpu_detect_cache_sizes to init_intel() add Quark legacy_cache()

Canek Peláez Valdés (1):
  rt2x00: support Ralink 5362.

Cesar Eduardo Barros (1):
  crypto: more robust crypto_memneq

Chris Ball (1):
  mfd: rtsx_pcr: Fix MSI enable error handling

Chris Mason (1):
  Btrfs: fix kfree on list_head in btrfs_lookup_csums_range error cleanup

Cong Wang (1):
  freezer: Do not freeze tasks killed by OOM killer

Cyril Brulebois (1):
  wireless: rt2x00: add new rt2800usb device

Dan Carpenter (1):
  [media] ttusb-dec: buffer overflow in ioctl

Dan Streetman (1):
  powerpc: use device_online/offline() instead of cpu_up/down()

Dan Williams (1):
  USB: option: add Haier CE81B CDMA modem

Daniel Borkmann (1):
  random: add and use memzero_explicit() for clearing data

Daniel Mack (1):
  ASoC: soc-dapm: fix use after free

Daniele Palmas (1):
  usb: option: add support for Telit LE910

Darrick J. Wong (4):
  jbd2: free bh when descriptor block checksum fails
  ext4: check EA value offset when loading
  ext4: check s_chksum_driver when looking for bg csum presence
  ext4: enable journal checksum when metadata checksum feature enabled

David Cohen (1):
  mmc: sdhci-pci: add Intel Merrifield support

David Daney (1):
  MIPS: tlbex: Properly fix HUGE TLB Refill exception handler

Derek Browne (1):
  mmc: sdhci-pci: SDIO host controller support for Intel Quark X1000

Dexuan Cui (1):
  x86, pageattr: Prevent overflow in slow_virt_to_phys() for X86_PAE

Dirk Brandewie (1):
  cpufreq: expose scaling_cur_freq sysfs file for set_policy() drivers

Dmitry Eremin-Solenikov (1):
  spi: pxa2xx: toggle clocks on suspend if not disabled by runtime PM

Dmitry Kasatkin (1):
  evm: check xattr value length and type in evm_inode_setxattr()

Dmitry Monakhov (2):
  ext4: grab missed write_count for EXT4_IOC_SWAP_BOOT
  ext4: Replace open coded mdata csum feature to helper function

Eric Dumazet (1):
  tcp: md5: do not use alloc_percpu()

Eric Ernst (1):
  mmc: sdhci-pci: Add SDIO/MMC device ID support for Intel Clovertrail

Eric Rannaud (1):
  fs: allow open(dir, O_TMPFILE|..., 0) with mode 0

Eric Sandeen (2):
  ext4: fix reservation overflow in ext4_da_write_begin
  xfs: 

[GIT] Networking

2014-11-21 Thread David Miller

1) Fix BUG when decrypting empty packets in mac80211, from Ronald Wahl.

2) nf_nat_range is not fully initialized and this is copied back to
   userspace, from Daniel Borkmann.

3) Fix read past end of buffer in netfilter ipset, also from Dan
   Carpenter.

4) Signed integer overflow in ipv4 address mask creation helper
   inet_make_mask(), from Vincent BENAYOUN.

5) VXLAN, be2net, mlx4_en, and qlcnic need ->ndo_gso_check() methods
   to properly describe the device's capabilities, from Joe
   Stringer.

6) Fix memory leaks and checksum miscalculations in openvswitch, from
   Pravin B Shelar and Jesse Gross.

7) FIB rules pass back an ambiguous error code for unreachable routes,
   making behavior confusing for userspace.  Fix from Panu Matilainen.

8) ieee802154fake_probe() doesn't release resources properly on error,
   from Alexey Khoroshilov.

9) Fix skb_over_panic in add_grhead(), from Daniel Borkmann.

10) Fix access of stale slave pointers in bonding code, from Nikolay
Aleksandrov.

11) Fix stack info leak in PPP pptp code, from Mathias Krause.

12) Cure locking bug in IPX stack, from Jiri Bohac.

13) Revert SKB fclone memory freeing optimization that is racy and can
allow accesses to freed up memory, from Eric Dumazet.
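
The class of bug in item 4 can be illustrated with a host-byte-order
sketch. This is not the patch from the pull request: the function name
is hypothetical and the byte-order conversion a real helper needs is
left out. It only shows how doing the shift on an unsigned 32-bit value
avoids the signed overflow that "1 << 31" causes on a 32-bit int.

```c
#include <stdint.h>

/*
 * Build an IPv4 netmask in host byte order from a prefix length.
 * prefix_len is expected to be in the range 0..32; larger values
 * are clamped to 32.
 */
static uint32_t make_mask_host_order(unsigned prefix_len)
{
	if (prefix_len == 0)
		return 0;       /* avoid shifting by 32, which is undefined */
	if (prefix_len > 32)
		prefix_len = 32;
	/* Unsigned arithmetic: a shift by 31 is well defined here. */
	return ~((UINT32_C(1) << (32 - prefix_len)) - 1);
}
```

For example, a prefix length of 24 yields 0xFFFFFF00 and a prefix
length of 1 yields 0x80000000.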

Please pull, thanks a lot!

The following changes since commit b23dc5a7cc6ebc9a0d57351da7a0e8454c9ffea3:

  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost (2014-11-13 18:07:52 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git master

for you to fetch changes up to 0c228e833c88e3aa029250f5db77d5968c5ce5b5:

  tcp: Restore RFC5961-compliant behavior for SYN packets (2014-11-21 15:33:50 -0500)


Alexey Khoroshilov (2):
  ieee802154: fix error handling in ieee802154fake_probe()
  can: esd_usb2: fix memory leak on disconnect

Anish Bhatt (3):
  dcbnl : Disable software interrupts before taking dcb_lock
  cxgb4i : Don't block unload/cxgb4 unload when remote closes TCP connection
  cxgb4 : Fix DCB priority groups being returned in wrong order

Arend van Spriel (1):
  brcmfmac: fix conversion of channel width 20MHZ_NOHT

Ben Greear (1):
  ath9k: fix regression in bssidmask calculation

Calvin Owens (2):
  ipvs: Keep skb->sk when allocating headroom on tunnel xmit
  tcp: Restore RFC5961-compliant behavior for SYN packets

Dan Carpenter (1):
  netfilter: ipset: small potential read beyond the end of buffer

Daniel Borkmann (2):
  netfilter: nft_masq: fix uninitialized range in nft_masq_{ipv4, ipv6}_eval
  ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs

Daniele Di Proietto (1):
  openvswitch: Fix NDP flow mask validation

David Cohen (1):
  can: m_can: add CONFIG_HAS_IOMEM dependence

David S. Miller (7):
  Merge tag 'master-2014-11-11' of 
git://git.kernel.org/.../linville/wireless
  Merge branch 'vxlan_gso_check'
  Merge git://git.kernel.org/.../pablo/nf
  Merge branch 'net_ovs' of git://git.kernel.org/.../pshelar/openvswitch
  Merge tag 'linux-can-fixes-for-3.18-20141118' of 
git://gitorious.org/linux-can/linux-can
  Merge tag 'master-2014-11-20' of 
git://git.kernel.org/.../linville/wireless
  Merge git://git.kernel.org/.../pablo/nf

Dmitry Torokhov (1):
  brcmfmac: fix error handling of irq_of_parse_and_map

Dong Aisheng (8):
  can: dev: add can_is_canfd_skb() API
  can: m_can: add .ndo_change_mtu function
  can: m_can: add missing message RAM initialization
  can: m_can: fix possible sleep in napi poll
  can: m_can: fix not set can_dlc for remote frame
  can: m_can: add missing delay after setting CCCR_INIT bit
  can: m_can: fix incorrect error messages
  can: m_can: update to support CAN FD features

Duan Jiong (1):
  ipv6: delete protocol and unregister rtnetlink when cleanup

Emmanuel Grumbach (1):
  iwlwifi: mvm: abort scan upon RFKILL

Eric Dumazet (1):
  net: Revert "net: avoid one atomic operation in skb_clone()"

Felix Fietkau (1):
  mac80211: minstrel_ht: fix a crash in rate sorting

Hannes Frederic Sowa (1):
  reciprocal_div: objects with exported symbols should be obj-y rather than 
lib-y

Hauke Mehrtens (1):
  b43: fix NULL pointer dereference in b43_phy_copy()

Jarno Rajahalme (1):
  openvswitch: Validate IPv6 flow key and mask values.

Jason Wang (1):
  virtio-net: validate features during probe

Jesse Gross (1):
  openvswitch: Fix checksum calculation when modifying ICMPv6 packets.

Jiri Bohac (1):
  ipx: fix locking regression in ipx_sendmsg and ipx_recvmsg

Joe Stringer (6):
  net: Add vxlan_gso_check() helper
  be2net: Implement ndo_gso_check()
  net/mlx4_en: Implement ndo_gso_check()
  qlcnic: Implement ndo_gso_check()
  vxlan: Inline vxlan_gso_check().
  openvswitch: Don't validate IPv6 label 

Re: frequent lockups in 3.18rc4

2014-11-21 Thread Konrad Rzeszutek Wilk
On Fri, Nov 21, 2014 at 08:51:43PM +0100, Thomas Gleixner wrote:
> On Fri, 21 Nov 2014, Linus Torvalds wrote:
> > Here's the simplified end result. Again, this is TOTALLY UNTESTED. I
> > compiled it and verified that the code generation looks like what I'd
> > have expected, but that's literally it.
> > 
> >   static noinline int vmalloc_fault(unsigned long address)
> >   {
> > pgd_t *pgd_dst;
> > pgdval_t pgd_entry;
> > unsigned index = pgd_index(address);
> > 
> > if (index < KERNEL_PGD_BOUNDARY)
> > return -1;
> > 
> > pgd_entry = init_mm.pgd[index].pgd;
> > if (!pgd_entry)
> > return -1;
> > 
> > pgd_dst = __va(PAGE_MASK & read_cr3());
> > pgd_dst += index;
> > 
> > if (pgd_dst->pgd)
> > return -1;
> > 
> > ACCESS_ONCE(pgd_dst->pgd) = pgd_entry;
> 
> This will break paravirt. set_pgd/set_pmd are paravirt functions.
> 
> But I'm fine with breaking it, then you just need to change
> CONFIG_PARAVIRT to 'def_bool n'

That is not very nice.



