Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Marcel Holtmann
Hi Linus,

>> okay. I only looked at BlueZ 5.x and that might have been my mistake. Let me 
>> check this and fix this properly.
> 
> Why not just revert that commit. It looks like garbage. It has odd code like
> 
> +   u32 valid_flags = 0;
> +   ci->flags = session->flags & valid_flags;
> 
> which is basically saying "no flags are valid, and we are silently
> just clearing them all when copying".
> 
> The reason I think it's garbage is
> 
> (a) the commit clearly breaks something, so the whole "let's check
> flags that we've never checked before" is already fundamentally
> suspicious
> 
> (b) code like the above is just crap to begin with, because it makes
> things superficially "look" sensible when looking at individual lines
> of code (for example, when grepping things), and then when you look at
> the actual bigger picture, it turns out that the code doesn't actually
> care about the flags it is "copying", it just clears them all.
> 
> The other code sequences do things like
> 
> +   u32 valid_flags = 0;
> +   if (req->flags & ~valid_flags)
> +   return -EINVAL;
> 
> Which again is just a very unreadable way of saying "if any flags are
> set, return an error". This kind of thing is presumably what breaks
> things, because clearly people *have* set flags that you thought are
> invalid.
> 
> Now *IF* the interfaces had had these kinds of flag validation checks
> from day one, that would be one thing. But adding these kinds of
> things after the fact, when somebody then reports that they break
> things, then that's just a big big flag that you shouldn't try to do
> this at all. It's water under the bridge. That ship has sailed. It's
> too late. Give up on it.
> 
> So I don't think this code is "fixable". It really smells like a
> fundamental mistake to begin with. Just revert it, chalk it up as "ok,
> that was a stupid idea", and move on.

accepting all flags regardless was an oversight on my part in the first place. 
What this patch tried to do is to limit it to what userspace is currently 
actually using. My mistake was to look only at BlueZ 5.x userspace and not at 
BlueZ 4.x userspace. The fix to not break existing userspace is essentially 
this:

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index a05b9dbf14c9..9070dfd6b4ad 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -1313,7 +1313,8 @@ int hidp_connection_add(struct hidp_connadd_req *req,
                         struct socket *ctrl_sock,
                         struct socket *intr_sock)
 {
-   u32 valid_flags = 0;
+   u32 valid_flags = BIT(HIDP_VIRTUAL_CABLE_UNPLUG) |
+                     BIT(HIDP_BOOT_PROTOCOL_MODE);

I asked Joerg to test this patch, but from looking at old userspace, that is
what is happening there.
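
For readers following the thread, the effect of the two mask patterns quoted above can be sketched in plain userspace C. This is only an illustration: the `sketch_` names are hypothetical, and the bit positions are assumed values that should be checked against net/bluetooth/hidp/hidp.h.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; verify against
 * net/bluetooth/hidp/hidp.h before relying on them. */
#define SKETCH_VIRTUAL_CABLE_UNPLUG 0
#define SKETCH_BOOT_PROTOCOL_MODE   1
#define SKETCH_BIT(n) (UINT32_C(1) << (n))

/* Reject a request that sets any bit outside valid_flags. */
int sketch_check_flags(uint32_t req_flags, uint32_t valid_flags)
{
    if (req_flags & ~valid_flags)
        return -EINVAL;
    return 0;
}

/* Copy flags out, keeping only the bits present in valid_flags. */
uint32_t sketch_copy_flags(uint32_t session_flags, uint32_t valid_flags)
{
    return session_flags & valid_flags;
}
```

With a mask of 0, every request that sets a flag is rejected and every copied flag is cleared. With the two bits allowed, a request from older userspace that sets only those bits passes.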

Regards

Marcel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH] Bluetooth: hidp: Fix regression with older userspace and flags validation

2015-04-17 Thread Marcel Holtmann
While it is not used by newer userspace anymore, the older userspace was
utilizing HIDP_VIRTUAL_CABLE_UNPLUG and HIDP_BOOT_PROTOCOL_MODE flags
when adding a new HIDP connection.

The flags validation is important, but we cannot break older userspace,
so allow these flags to be provided even if newer userspace does not
use them anymore.

Reported-by: Jörg Otte 
Signed-off-by: Marcel Holtmann 
---
 net/bluetooth/hidp/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/bluetooth/hidp/core.c b/net/bluetooth/hidp/core.c
index a05b9dbf14c9..9070dfd6b4ad 100644
--- a/net/bluetooth/hidp/core.c
+++ b/net/bluetooth/hidp/core.c
@@ -1313,7 +1313,8 @@ int hidp_connection_add(struct hidp_connadd_req *req,
                         struct socket *ctrl_sock,
                         struct socket *intr_sock)
 {
-   u32 valid_flags = 0;
+   u32 valid_flags = BIT(HIDP_VIRTUAL_CABLE_UNPLUG) |
+                     BIT(HIDP_BOOT_PROTOCOL_MODE);
    struct hidp_session *session;
    struct l2cap_conn *conn;
    struct l2cap_chan *chan;
-- 
2.1.0



Re: [GIT PULL] kdbus for 4.1-rc1

2015-04-17 Thread Havoc Pennington
Hi,

On Fri, Apr 17, 2015 at 3:27 PM, James Bottomley
 wrote:
>
> This is why I think kdbus is a bad idea: it solidifies as a linux kernel
> API something which runs counter to granular OS virtualization (and
> something which caused Windows to fall behind Linux in the container
> space).  Splitting out the acceleration problem and leaving the rest to
> user space currently looks fine because the ideas Al and Andy are
> kicking around don't cause problems with OS virtualization.
>

I'm interested in understanding this problem (if only for my own
curiosity) but I'm not confident I understand what you're saying
correctly.

Can I try to explain back / ask questions and see what I have right?

I think you are saying that if an application relies on a system
service (= any other process that runs on the system bus) then to
virtualize that app by itself in a dedicated container, the system bus
and the system service need to also be in the container. So the
container ends up with a bunch of stuff in it beyond only the
application.  Right / wrong / confused?

I also think you're saying that userspace dbus has the same issue
(this isn't a userspace vs. kernel thing per se), the objection to
kdbus is that it makes this issue more solidified / harder to fix?

Do you have ideas on how to go about fixing it, whether in userspace
or kernel dbus?

Havoc


Re: [GIT PULL] First batch of KVM changes for 4.1

2015-04-17 Thread Marcelo Tosatti
On Fri, Apr 17, 2015 at 09:57:12PM +0200, Paolo Bonzini wrote:
> 
> 
> >> From 4eb9d7132e1990c0586f28af3103675416d38974 Mon Sep 17 00:00:00 2001
> >> From: Paolo Bonzini 
> >> Date: Fri, 17 Apr 2015 14:57:34 +0200
> >> Subject: [PATCH] sched: add CONFIG_TASK_MIGRATION_NOTIFIER
> >>
> >> The task migration notifier is only used in x86 paravirt.  Make it
> >> possible to compile it out.
> >>
> >> While at it, move some code around to ensure tmn is filled from CPU
> >> registers.
> >>
> >> Signed-off-by: Paolo Bonzini 
> >> ---
> >>  arch/x86/Kconfig    | 1 +
> >>  init/Kconfig        | 3 +++
> >>  kernel/sched/core.c | 9 ++++++++-
> >>  3 files changed, 12 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >> index d43e7e1c784b..9af252c8698d 100644
> >> --- a/arch/x86/Kconfig
> >> +++ b/arch/x86/Kconfig
> >> @@ -649,6 +649,7 @@ if HYPERVISOR_GUEST
> >>  
> >>  config PARAVIRT
> >>bool "Enable paravirtualization code"
> >> +  select TASK_MIGRATION_NOTIFIER
> >>---help---
> >>  This changes the kernel so it can modify itself when it is run
> >>  under a hypervisor, potentially improving performance significantly
> >> diff --git a/init/Kconfig b/init/Kconfig
> >> index 3b9df1aa35db..891917123338 100644
> >> --- a/init/Kconfig
> >> +++ b/init/Kconfig
> >> @@ -2016,6 +2016,9 @@ source "block/Kconfig"
> >>  config PREEMPT_NOTIFIERS
> >>bool
> >>  
> >> +config TASK_MIGRATION_NOTIFIER
> >> +  bool
> >> +
> >>  config PADATA
> >>depends on SMP
> >>bool
> >> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> >> index f9123a82cbb6..c07a53aa543c 100644
> >> --- a/kernel/sched/core.c
> >> +++ b/kernel/sched/core.c
> >> @@ -1016,12 +1016,14 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
> >>rq_clock_skip_update(rq, true);
> >>  }
> >>  
> >> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
> >>  static ATOMIC_NOTIFIER_HEAD(task_migration_notifier);
> >>  
> >>  void register_task_migration_notifier(struct notifier_block *n)
> >>  {
> >>atomic_notifier_chain_register(&task_migration_notifier, n);
> >>  }
> >> +#endif
> >>  
> >>  #ifdef CONFIG_SMP
> >>  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> >> @@ -1053,18 +1055,23 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> >>trace_sched_migrate_task(p, new_cpu);
> >>  
> >>if (task_cpu(p) != new_cpu) {
> >> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
> >>struct task_migration_notifier tmn;
> >> +  int from_cpu = task_cpu(p);
> >> +#endif
> >>  
> >>if (p->sched_class->migrate_task_rq)
> >>p->sched_class->migrate_task_rq(p, new_cpu);
> >>p->se.nr_migrations++;
> >>perf_sw_event_sched(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 0);
> >>  
> >> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
> >>tmn.task = p;
> >> -  tmn.from_cpu = task_cpu(p);
> >> +  tmn.from_cpu = from_cpu;
> >>tmn.to_cpu = new_cpu;
> >>  
> >>atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
> >> +#endif
> >>}
> >>  
> >>__set_task_cpu(p, new_cpu);
> >> -- 
> >> 2.3.5
> > 
> > Paolo, 
> > 
> > Please revert the patch -- we can fix this properly in the host,
> > which also conforms to the documented KVM guest/host protocol.
> > 
> > Radim submitted a patch to kvm@ to split 
> > the kvm_write_guest in two with a barrier in between, I think.
> > 
> > I'll review that patch.
> 
> You're thinking of
> http://article.gmane.org/gmane.linux.kernel.stable/129187, but see
> Andy's reply:
> 
> > 
> > I think there are at least two ways that would work:
> > 
> > a) If KVM incremented version as advertised:
> > 
> > cpu = getcpu();
> > pvti = pvti for cpu;
> > 
> > ver1 = pvti->version;
> > check stable bit;
> > rdtsc_barrier, rdtsc, read scale, shift, etc.
> > if (getcpu() != cpu) retry;
> > if (pvti->version != ver1) retry;
> > 
> > I think this is safe because we're guaranteed that there was an
> > interval (between the two version reads) in which the vcpu we think
> > we're on was running and the kvmclock data was valid and marked
> > stable, and we know that the tsc we read came from that interval.
> > 
> > Note: rdtscp isn't needed. If we're stable, it makes no difference
> > which cpu's tsc we actually read.
> > 
> > b) If version remains buggy but we use this migrations_from hack:
> > 
> > cpu = getcpu();
> > pvti = pvti for cpu;
> > m1 = pvti->migrations_from;
> > barrier();
> > 
> > ver1 = pvti->version;
> > check stable bit;
> > rdtsc_barrier, rdtsc, read scale, shift, etc.
> > if (getcpu() != cpu) retry;
> > if (pvti->version != ver1) retry;  /* probably not really needed */
> > 
> > barrier();
> > if (pvti->migrations_from != m1) retry;
> > 
> > This is just like (a), except that we're using a guest kernel hack to
> > ensure that no one migrated off the vcpu during the version-protected
> > critical section and that we were, in fact,
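
Variant (a) in the quoted reply has the shape of a sequence-counter read loop. A minimal userspace sketch in C11 follows, under loud assumptions: the `sketch_` names and the structure layout are hypothetical, the odd/even convention for an in-progress update is assumed, and the getcpu() and stable-bit checks from the pseudocode are left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vcpu time info; not the real layout. */
struct sketch_pvti {
    atomic_uint version;          /* assumed: odd while an update is in progress */
    _Atomic uint64_t system_time; /* payload protected by version */
};

/* Writer side: bump version to odd, update the payload, bump to even. */
void sketch_publish(struct sketch_pvti *p, uint64_t t)
{
    unsigned v = atomic_load_explicit(&p->version, memory_order_relaxed);

    atomic_store_explicit(&p->version, v + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->system_time, t, memory_order_relaxed);
    atomic_store_explicit(&p->version, v + 2, memory_order_release);
}

/* Reader side: retry until the same even version is seen before and after. */
uint64_t sketch_read(struct sketch_pvti *p)
{
    unsigned ver1, ver2;
    uint64_t t;

    do {
        ver1 = atomic_load_explicit(&p->version, memory_order_acquire);
        t = atomic_load_explicit(&p->system_time, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        ver2 = atomic_load_explicit(&p->version, memory_order_relaxed);
    } while ((ver1 & 1) || ver1 != ver2);

    return t;
}
```

The reader only trusts the payload when the version did not change across the read, which is the "interval between the two version reads" argument made in the quoted text.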

Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Linus Torvalds
On Fri, Apr 17, 2015 at 1:02 PM, Marcel Holtmann  wrote:
>
> okay. I only looked at BlueZ 5.x and that might have been my mistake. Let me 
> check this and fix this properly.

Why not just revert that commit. It looks like garbage. It has odd code like

+   u32 valid_flags = 0;
+   ci->flags = session->flags & valid_flags;

which is basically saying "no flags are valid, and we are silently
just clearing them all when copying".

The reason I think it's garbage is

 (a) the commit clearly breaks something, so the whole "let's check
flags that we've never checked before" is already fundamentally
suspicious

 (b) code like the above is just crap to begin with, because it makes
things superficially "look" sensible when looking at individual lines
of code (for example, when grepping things), and then when you look at
the actual bigger picture, it turns out that the code doesn't actually
care about the flags it is "copying", it just clears them all.

The other code sequences do things like

+   u32 valid_flags = 0;
+   if (req->flags & ~valid_flags)
+   return -EINVAL;

Which again is just a very unreadable way of saying "if any flags are
set, return an error". This kind of thing is presumably what breaks
things, because clearly people *have* set flags that you thought are
invalid.

Now *IF* the interfaces had had these kinds of flag validation checks
from day one, that would be one thing. But adding these kinds of
things after the fact, when somebody then reports that they break
things, then that's just a big big flag that you shouldn't try to do
this at all. It's water under the bridge. That ship has sailed. It's
too late. Give up on it.

So I don't think this code is "fixable". It really smells like a
fundamental mistake to begin with. Just revert it, chalk it up as "ok,
that was a stupid idea", and move on.

   Linus


[GIT] Sparc

2015-04-17 Thread David Miller

The PowerPC folks have a really nice scalable IOMMU pool allocator
that we wanted to make use of for sparc.   So here we have a series
that abstracts out their code into a common layer that anyone can
make use of.

Sparc is converted, and the PowerPC folks have reviewed and ACK'd
this series and plan to convert PowerPC over as well.

Please pull, thanks a lot!

The following changes since commit 497a5df7bf6ffd136ae21c49d1a01292930d7ca2:

  Merge tag 'stable/for-linus-4.1-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip (2015-04-16 14:01:03 -0500)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc.git 

for you to fetch changes up to cb97201cb060d13da0b87fd1bf68208c7389c5b1:

  iommu-common: Fix PARISC compile-time warnings (2015-04-17 15:24:36 -0400)


David S. Miller (1):
  Merge branch 'generic-iommu-allocator'

Sowmini Varadhan (4):
  sparc: Break up monolithic iommu table/lock into finer graularity pools and lock
  sparc: Make sparc64 use scalable lib/iommu-common.c functions
  sparc: Make LDC use common iommu poll management functions
  iommu-common: Fix PARISC compile-time warnings

 arch/sparc/include/asm/iommu_64.h |   7 ++--
 arch/sparc/kernel/iommu.c         | 188
 arch/sparc/kernel/iommu_common.h  |   8 -
 arch/sparc/kernel/ldc.c           | 185
 arch/sparc/kernel/pci_sun4v.c     | 193
 include/linux/iommu-common.h      |  55
 lib/Makefile                      |   2 +-
 lib/iommu-common.c                | 224
 8 files changed, 537 insertions(+), 325 deletions(-)
 create mode 100644 include/linux/iommu-common.h
 create mode 100644 lib/iommu-common.c


[GIT] Networking

2015-04-17 Thread David Miller

1) Fix verifier memory corruption and other bugs in BPF layer, from
   Alexei Starovoitov.

2) Add a conservative fix for doing BPF properly in the BPF classifier
   of the packet scheduler on ingress.  Also from Alexei.

3) The SKB scrubber should not clear out the packet MARK and security
   label, from Herbert Xu.

4) Fix oops on rmmod in stmmac driver, from Bryan O'Donoghue.

5) Pause handling is not correct in the stmmac driver because it doesn't
   take into consideration the RX and TX fifo sizes.  From Vince
   Bridgers.

6) Failure path missing unlock in FOU driver, from Wang Cong.

Please pull, thanks a lot!

The following changes since commit c841e12add6926d64aa608687893465330b5a03e:

  Merge branch 'kconfig' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild (2015-04-15 11:24:41 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 

for you to fetch changes up to e3122b7fae7b4e3d1d49fa84f6515bcbe6cbc6fc:

  net: dsa: use DEVICE_ATTR_RW to declare temp1_max (2015-04-17 15:58:37 -0400)


Alexei Starovoitov (3):
  bpf: fix verifier memory corruption
  bpf: fix bpf helpers to use skb->mac_header relative offsets
  bpf: fix two bugs in verification logic when accessing 'ctx' pointer

Andreas Oetken (1):
  altera tse: Fix network-delays and -retransmissions after high throughput.

Anjali Singhai Jain (2):
  i40e: Add support to program FDir SB rules for VF from PF through ethtool
  i40e: For VF reset (VFR and VFLR) add some more delay

Bryan O'Donoghue (1):
  stmmac: fix oops on rmmod after assigning ip addr

Catherine Sullivan (1):
  i40e: Bump version to 1.3.2

David S. Miller (3):
  Merge branch 'master' of git://git.kernel.org/.../jkirsher/next-queue
  Merge branch 'stmmac-flow-control'
  sfc: Fix memcpy() with const destination compiler warning.

Denys Vlasenko (1):
  netns: remove BUG_ONs from net_generic()

Erez Shitrit (1):
  IB/ipoib: Fix ndo_get_iflink

Eric Dumazet (3):
  bnx2x: Fix busy_poll vs netpoll
  tcp: tcp_get_info() should fetch socket fields once
  inet_diag: fix access to tcp cc information

Geert Uytterhoeven (1):
  net: dsa: mv88e6xxx: Add missing initialization in mv88e6xxx_set_port_state()

Greg Rose (1):
  i40e: Use new 40G speeds

Guenter Roeck (2):
  dsa: mv88e6xxx: Fix error handling in mv88e6xxx_set_port_state
  dsa: mv88e6xxx: Drop duplicate declaration of 'ret' variable

Herbert Xu (3):
  Revert "net: Reset secmark when scrubbing packet"
  skbuff: Do not scrub skb mark within the same name space
  act_mirred: Fix bogus header when redirecting from VLAN

Jesse Brandeburg (3):
  i40e: enable user dump of internal hardware state
  i40e: handle possible memory allocation failure
  i40e: get rid of unused locals

Johannes Berg (1):
  net: remove unused 'dev' argument from netif_needs_gso()

Kevin Scott (1):
  i40e/i40evf: Save WR_CSR_PROT field from DEV/FUNC capabilities

Michal Hocko (1):
  cxgb4: drop __GFP_NOFAIL allocation

Mitch Williams (5):
  i40e: stop VF rings
  i40evf: fix bad indentation
  i40evf: remove aq_pending
  i40e: notify VFs of link state
  i40e: move VF notification routines up

Thomas Gleixner (1):
  net: hip04: Make tx coalesce timer actually work

Vasu Dev (1):
  i40e: print FCoE capability reported by the device function

Vince Bridgers (5):
  stmmac: Add properties for transmit and receive fifo sizes
  stmmac: Add defines and documentation for enabling flow control
  stmmac: Read tx-fifo-depth and rx-fifo-depth from the devicetree
  stmmac: Enable unicast pause frame detect in GMAC Register 6
  stmmac: Configure Flow Control to work correctly based on rxfifo size

Vivien Didelot (1):
  net: dsa: use DEVICE_ATTR_RW to declare temp1_max

WANG Cong (1):
  fou: avoid missing unlock in failure path

Wei Yongjun (3):
  rocker: fix error return code in rocker_probe()
  ethernet: remove unused including 
  netns: remove duplicated include from net_namespace.c

 Documentation/devicetree/bindings/net/ethernet.txt|   6 +++
 Documentation/devicetree/bindings/net/stmmac.txt  |   4 ++
 drivers/infiniband/hw/cxgb4/mem.c |   2 +-
 drivers/infiniband/ulp/ipoib/ipoib_main.c |   5 ++
 drivers/infiniband/ulp/ipoib/ipoib_vlan.c |   3 +-
 drivers/net/dsa/mv88e6xxx.c   |   8 ++--
 drivers/net/ethernet/altera/altera_tse_main.c |   9 +++-
 drivers/net/ethernet/broadcom/bnx2x/bnx2x.h   | 137
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c   |   9 ++--
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c   |  15 --
 drivers/net/ethernet/hisilicon/hip04_eth.c|  18 ---
 drivers/net/ethernet/i

[GIT] IDE

2015-04-17 Thread David Miller

Just one change, getting rid of usage of the deprecated PCI DMA
interfaces in the IDE drivers.

Please pull, thanks a lot!

The following changes since commit 54e514b91b95d6441c12a7955addfb9f9d2afc65:

  Merge branch 'akpm' (patches from Andrew) (2015-04-17 09:04:38 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide.git 

for you to fetch changes up to d681f1166919d6829083c069a83edcd59bfd5e34:

  ide: remove deprecated use of pci api (2015-04-17 15:32:07 -0400)


Quentin Lambert (1):
  ide: remove deprecated use of pci api

 drivers/ide/cs5520.c| 2 +-
 drivers/ide/pmac.c  | 5 ++---
 drivers/ide/setup-pci.c | 2 +-
 drivers/ide/sgiioc4.c   | 4 ++--
 4 files changed, 6 insertions(+), 7 deletions(-)


Device mapper failed to open temporary keystore device

2015-04-17 Thread Murilo Opsfelder Araújo
Hello, everyone.

Right after I enter my passphrase to unlock my cryptsetup partition,
the following error is displayed and I am asked for the cryptsetup
password again (it gets stuck in this loop).

This issue was introduced in next-20150413.  next-20150410 is working just fine.

Any hint on how to debug this?

Unlocking the disk /dev/disk/by-uuid/ (sda5_crypt)
Enter passphrase: *
[  244.239821] device-mapper: table: 252:0: crypt: Error allocating crypto tfm
device-mapper: reload ioctl on  failed: No such file or directory
Failed to open temporary keystore device.
device-mapper: remove ioctl on temporary-cryptsetup-239 failed: No such device or address
device-mapper: reload ioctl on temporary-cryptsetup-239 failed: No such device or address
device-mapper: remove ioctl on temporary-cryptsetup-239 failed: No such device or address
device-mapper: remove ioctl on temporary-cryptsetup-239 failed: No such device or address
device-mapper: remove ioctl on temporary-cryptsetup-239 failed: No such device or address
device-mapper: remove ioctl on temporary-cryptsetup-239 failed: No such device or address

-- 
Murilo


Re: 'perf upgrade' (was: Re: [PATCH v9 00/11] Add support for JSON event files.)

2015-04-17 Thread Andi Kleen
On Fri, Apr 17, 2015 at 05:31:26PM +0200, Jiri Olsa wrote:
> On Wed, Apr 15, 2015 at 01:50:42PM -0700, Sukadev Bhattiprolu wrote:
> 
> SNIP
> 
> > | 
> > |  - to blindly follow some poorly constructed vendor format with no 
> > |high level structure, that IMHO didn't work very well when OProfile 
> > |was written, and misrepresenting it as 'symbolic event names'.
> > | 
> > |Take a look at:
> > | 
> > |  https://download.01.org/perfmon/HSW/Haswell_core_V17.json
> > | 
> > |and weep.
> > 
> > Evil vendor formats, but to be fair, here is what _we_ have today:
> > 
> > perf stat -e r10068,r20036,r40060,r40ac sleep 1
> 
> hum, you could also use the 'cpu/event=.../' syntax right?

That's even worse -- same hex numbers, just more redundancy.

All the other profilers support symbolic names, which is what
users want.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: [RFC PATCH 1/2] tee: generic TEE subsystem

2015-04-17 Thread Arnd Bergmann
On Friday 17 April 2015 09:50:56 Jens Wiklander wrote:
>  Documentation/ioctl/ioctl-number.txt |   1 +
>  drivers/Kconfig  |   2 +
>  drivers/Makefile |   1 +
>  drivers/tee/Kconfig  |   8 +
>  drivers/tee/Makefile |   3 +
>  drivers/tee/tee.c| 253 +++
>  drivers/tee/tee_private.h|  64 +++
>  drivers/tee/tee_shm.c| 330 +++
>  drivers/tee/tee_shm_pool.c   | 246 ++
>  include/linux/tee/tee.h  | 180 +++
>  include/linux/tee/tee_drv.h  | 271 

Hi Jens,

The driver looks very well implemented, but as you are introducing a new user
space API, we have to very carefully consider every aspect of that interface,
so I'm commenting mainly on user-visible parts.
 
> diff --git a/Documentation/ioctl/ioctl-number.txt b/Documentation/ioctl/ioctl-number.txt
> index 8136e1f..6e9bd04 100644
> --- a/Documentation/ioctl/ioctl-number.txt
> +++ b/Documentation/ioctl/ioctl-number.txt
> @@ -301,6 +301,7 @@ Code  Seq#(hex)   Include FileComments
>  0xA3 80-8F   Port ACLin development:
>   
>  0xA3 90-9F   linux/dtlk.h
> +0xA4 00-1F   linux/sec-hw/tee.h  Generic TEE subsystem

File name does not match.

> +static long tee_ioctl_cmd(struct tee_context *ctx,
> + struct tee_ioctl_cmd_data __user *ucmd)
> +{
> + long ret;
> + struct tee_ioctl_cmd_data cmd;
> + void __user *buf_ptr;
> +
> + ret = copy_from_user(&cmd, ucmd, sizeof(cmd));
> + if (ret)
> + return ret;
> +
> + buf_ptr = (void __user *)(uintptr_t)cmd.buf_ptr;
> + return ctx->teedev->desc->ops->cmd(ctx, buf_ptr, cmd.buf_len);
> +}

What is that double indirection for? Normally each command gets its
own data structure, and then you can handle each command in the common
abstraction.

> +static long tee_ioctl_mem_share(struct tee_context *ctx,
> + struct tee_ioctl_mem_share_data __user *udata)
> +{
> + /* Not supported yet */
> + return -ENOSYS;
> +}
> +
> +static long tee_ioctl_mem_unshare(struct tee_context *ctx,
> + struct tee_ioctl_mem_share_data __user *udata)
> +{
> + /* Not supported yet */
> + return -ENOSYS;
> +}

Why -ENOSYS? ioctl does exist ;-)

> +static const struct file_operations tee_fops = {
> + .owner = THIS_MODULE,
> + .open = tee_open,
> + .release = tee_release,
> + .unlocked_ioctl = tee_ioctl
> +};

Add a .compat_ioctl function, to make it work on arm64 as well.
If you got all the data structures right, you can use the same
tee_ioctl function.
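
One way to read "got all the data structures right" is that the ioctl argument has the same layout for 32-bit and 64-bit callers. A rough sketch, assuming the structure carries a pointer and a length as the quoted tee_ioctl_cmd() suggests; the struct name, the field widths and the helper names here are hypothetical, not taken from the patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical argument layout: fixed-width fields only, no native
 * pointers or longs, so the size and offsets do not depend on the ABI. */
struct sketch_cmd_data {
    uint64_t buf_ptr; /* user pointer carried as a 64-bit integer */
    uint64_t buf_len; /* length in bytes */
};

/* Compile-time checks that the layout is what both ABIs would see. */
_Static_assert(sizeof(struct sketch_cmd_data) == 16, "unexpected size");
_Static_assert(offsetof(struct sketch_cmd_data, buf_len) == 8, "unexpected offset");

/* Convert between a native pointer and its 64-bit carrier. */
uint64_t sketch_ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

void *sketch_u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

The cast through uintptr_t mirrors the `(void __user *)(uintptr_t)cmd.buf_ptr` conversion in the quoted code.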

Minor nit: put a comma behind the last line in each struct initialization
to make it easier to add another callback.

> +
> +static void tee_shm_release(struct tee_shm *shm);

Try to avoid forward declarations by reordering the code.

> +static struct sg_table *tee_shm_op_map_dma_buf(struct dma_buf_attachment
> + *attach, enum dma_data_direction dir)
> +{
> + return NULL;
> +}
> +
> +static void tee_shm_op_unmap_dma_buf(struct dma_buf_attachment *attach,
> + struct sg_table *table, enum dma_data_direction dir)
> +{
> +}

Since a lot of callbacks are empty here, I'd probably change the
caller to check for NULL pointer before calling these, and remove
the empty implementations, unless your next patch fills them with
content.
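
The NULL-check idea could look roughly like this in isolation. The ops table and function names below are hypothetical and are not the dma-buf interface; treating a missing callback as success is also an assumption made for the sketch:

```c
#include <stddef.h>

/* Hypothetical ops table where every callback is optional. */
struct sketch_buf_ops {
    int  (*map)(void *buf);   /* may be NULL */
    void (*unmap)(void *buf); /* may be NULL */
};

/* Caller-side wrappers: skip callbacks the driver did not provide. */
int sketch_map(const struct sketch_buf_ops *ops, void *buf)
{
    if (!ops->map)
        return 0; /* assumed: nothing to do counts as success */
    return ops->map(buf);
}

void sketch_unmap(const struct sketch_buf_ops *ops, void *buf)
{
    if (ops->unmap)
        ops->unmap(buf);
}

/* Example callback: counts how often it was invoked. */
int sketch_count_map(void *buf)
{
    (*(int *)buf)++;
    return 1;
}
```

A driver that has nothing to do for a given operation then simply leaves the member unset instead of carrying an empty function.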

> +struct tee_shm *tee_shm_get_from_fd(int fd)
> +{
> + struct dma_buf *dmabuf = dma_buf_get(fd);
> +
> + if (IS_ERR(dmabuf))
> + return ERR_CAST(dmabuf);
> +
> + if (!is_shm_dma_buf(dmabuf)) {
> + dma_buf_put(dmabuf);
> + return ERR_PTR(-EINVAL);
> + }
> + return dmabuf->priv;
> +}
> +EXPORT_SYMBOL_GPL(tee_shm_get_from_fd);
> +
> +void tee_shm_put(struct tee_shm *shm)
> +{
> + if (shm->flags & TEE_SHM_DMA_BUF)
> + dma_buf_put(shm->dmabuf);
> +}
> +EXPORT_SYMBOL_GPL(tee_shm_put);

Can you explain why you picked dmabuf as the interface here? Normally this
is used when you have multiple DMA master devices access the same memory,
while my understanding of your use case is that you just have one other
piece of code running on the same CPU accessing this.

Do you need more than one such buffer per device? Could you perhaps just
implement mmap on the chardev as a lot of other drivers do?

> +struct tee_shm_pool *tee_shm_pool_alloc_cma(struct device *dev, u_long *vaddr,
> + phys_addr_t *paddr, size_t *size)

I think it would be better not to have 'cma' as part of the function
name -- the driver really should not care at all.

What is the typical and maximum allocation size here?

> +++ b/include/linux/tee/tee.h

This belongs into include/uapi/linux/, because you are defining ioctl values
fo

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread David Miller
From: Tejun Heo 
Date: Fri, 17 Apr 2015 15:52:38 -0400

> Hello,
> 
> On Fri, Apr 17, 2015 at 02:55:37PM -0400, David Miller wrote:
>> > * The bulk of patches are to pipe extended log messages to console
>> >   drivers and let netconsole relay them to the receiver (and quite a
>> >   bit of refactoring in the process), which, regardless of the
>> >   reliability logic, is beneficial as we're currently losing
>> >   structured logging (dictionary) and other metadata over consoles and
>> >   regardless of where the reliability logic is implemented, it's a lot
>> >   easier to have messages IDs.
>> 
>> I do not argue against cleanups and good restructuring of the existing
>> code.  But you have decided to mix that up with something that is not
>> exactly non-controversial.
> 
> Is the controversial part referring to sending extended messages or
> the reliability part or both?

Anything outside of the non-side-effect cleanups.

> Hmmm... yeah, probably would have been a better idea.  FWIW, the
> patches are stacked roughly in the order of escalating
> controversiness.  Will split the series up.

Thanks.

> Sure, if irq handling is hosed, this won't work but I think there are
> enough other failure modes like oopsing while holding a mutex or
> falling into infinite loop while holding task_list lock (IIRC we had
> something similar a while ago due to iterator bug).

If you oops while holding a mutex, unless it's the console mutex the
logging process can schedule and likely get the message transmitted.

What we're going to keep discussing is the fact that in return for all
of your unnecessary added complexity, we get something that only applies
in an extremely narrow scope of situations.

That is a very poor value proposition.

It took nearly two decades to get rid of all of the races and locking
problems with current netpoll/netconsole, and it's as simple as can
possibly be.  I do not want to even think about having to worry about
a reliability layer on top of it, that's just too much.


Re: Error: DMA: Out of SW-IOMMU space [was: External USB drives become unresponsive after few hours.]

2015-04-17 Thread Konrad Rzeszutek Wilk
On Fri, Apr 17, 2015 at 05:14:20PM +0200, Dorian Gray wrote:
> On 16 April 2015 at 20:42, Konrad Rzeszutek Wilk  
> wrote:
> > An easier way is to compile the kernel with CONFIG_DMA_API_DEBUG
> > and then load the attached module.
> >
> > That should tell you who and what else is holding on the buffers.
> 
> Ok, I have compiled 3.19.4 w/ CONFIG_DMA_API_DEBUG=y + the module you sent me.
> Now, I'm not sure if I've done it right - I waited until the error
> occurred and then modprobe'd dump_dma.
> I have attached the kernel log, but it tells me not much, if anything...

The network driver is quite hungry for DMA. Did it do the same thing
in the earlier kernels?

Thanks.
> 
> Thanks again.
> Jake




Re: [PATCH 1/2 V2] memory-hotplug: fix BUG_ON in move_freepages()

2015-04-17 Thread Yasuaki Ishimatsu

Your patches will fix your issue.
But if the BIOS reports memory first at node hot add, the pgdat cannot
be initialized.

Memory hot add flows are as follows:

add_memory
  ...
  -> hotadd_new_pgdat()
  ...
  -> node_set_online(nid)

When hotadd_new_pgdat() is called for a hot added node, the node is
offline because node_set_online() has not been called yet. So with
your patches applied, the pgdat is not initialized in this case.

Thanks,
Yasuaki Ishimatsu

On Fri, 17 Apr 2015 18:50:32 +0800
Xishi Qiu  wrote:

> Hot remove nodeXX, then hot add nodeXX. If BIOS reports cpu first, it will call
> hotadd_new_pgdat(nid, 0), this will set pgdat->node_start_pfn to 0. As nodeXX
> exists at boot time, pgdat->node_spanned_pages is the same as the original.
> Then free_area_init_core()->memmap_init() will pass a wrong start and a
> nonzero size.
> 
> free_area_init_core()
>   memmap_init()
>   memmap_init_zone()
>   early_pfn_in_nid()
>   set_page_links()
> 
> "if (!early_pfn_in_nid(pfn, nid))" will skip the pfn(memory in section), but it
> will not skip the pfn(hole in section), this will cover and relink the page to
> zone/nid, so page_zone() from memory and hole in the same section are different.
> The following call trace shows the bug.
> 
> This patch will set the node size to 0 when hotadd a new node(original or new).
> init_currently_empty_zone() and memmap_init() will be called in add_zone(), so
> need not to change it.
> 
> [90476.077469] kernel BUG at mm/page_alloc.c:1042!  // move_freepages() -> BUG_ON(page_zone(start_page) != page_zone(end_page));
> [90476.077469] invalid opcode:  [#1] SMP 
> [90476.077469] Modules linked in: iptable_nat nf_conntrack_ipv4 
> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack fuse btrfs zlib_deflate 
> raid6_pq xor msdos ext4 mbcache jbd2 binfmt_misc bridge stp llc 
> ip6table_filter ip6_tables iptable_filter ip_tables ebtable_nat ebtables 
> cfg80211 rfkill sg iTCO_wdt iTCO_vendor_support intel_powerclamp coretemp 
> intel_rapl kvm_intel kvm crct10dif_pclmul crc32_pclmul crc32c_intel 
> ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd 
> pcspkr igb vfat i2c_algo_bit dca fat sb_edac edac_core i2c_i801 lpc_ich 
> i2c_core mfd_core shpchp acpi_pad ipmi_si ipmi_msghandler uinput nfsd 
> auth_rpcgss nfs_acl lockd sunrpc xfs libcrc32c sd_mod crc_t10dif 
> crct10dif_common ahci libahci megaraid_sas tg3 ptp libata pps_core dm_mirror 
> dm_region_hash dm_log dm_mod [last unloaded: rasf]
> [90476.157382] CPU: 2 PID: 322803 Comm: updatedb Tainted: GF   W  
> O--   3.10.0-229.1.2.5.hulk.rc14.x86_64 #1
> [90476.157382] Hardware name: HUAWEI TECHNOLOGIES CO.,LTD. Huawei N1/Huawei 
> N1, BIOS V100R001 04/13/2015
> [90476.157382] task: 88006a6d5b00 ti: 880068eb8000 task.ti: 
> 880068eb8000
> [90476.157382] RIP: 0010:[]  [] 
> move_freepages+0x12f/0x140
> [90476.157382] RSP: 0018:880068ebb640  EFLAGS: 00010002
> [90476.157382] RAX: 880002316cc0 RBX: ea0001bd RCX: 
> 0001
> [90476.157382] RDX: 880002476e40 RSI:  RDI: 
> 880002316cc0
> [90476.157382] RBP: 880068ebb690 R08: 0010 R09: 
> ea0001bd7fc0
> [90476.157382] R10: 0006f5ff R11:  R12: 
> 0001
> [90476.157382] R13: 0003 R14: 880002316eb8 R15: 
> ea0001bd7fc0
> [90476.157382] FS:  7f4d3ab95740() GS:880033a0() 
> knlGS:
> [90476.157382] CS:  0010 DS:  ES:  CR0: 80050033
> [90476.157382] CR2: 7f4d3ae1a808 CR3: 00018907a000 CR4: 
> 001407e0
> [90476.157382] DR0:  DR1:  DR2: 
> 
> [90476.157382] DR3:  DR6: fffe0ff0 DR7: 
> 0400
> [90476.157382] Stack:
> [90476.157382]  880068ebb698 880002316cc0 a800b5378098 
> 880068ebb698
> [90476.157382]  810b11dc 880002316cc0 0001 
> 0003
> [90476.157382]  880002316eb8 ea0001bd6420 880068ebb6a0 
> 8115a003
> [90476.157382] Call Trace:
> [90476.157382]  [] ? update_curr+0xcc/0x150
> [90476.157382]  [] move_freepages_block+0x73/0x80
> [90476.157382]  [] __rmqueue+0x26a/0x460
> [90476.157382]  [] ? native_sched_clock+0x13/0x80
> [90476.157382]  [] get_page_from_freelist+0x7f2/0xd30
> [90476.157382]  [] ? __switch_to+0x179/0x4a0
> [90476.157382]  [] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
> [90476.157382]  [] __alloc_pages_nodemask+0x1c1/0xc90
> [90476.157382]  [] ? _xfs_buf_ioapply+0x31c/0x420 [xfs]
> [90476.157382]  [] ? down_trylock+0x2d/0x40
> [90476.157382]  [] ? xfs_buf_trylock+0x1f/0x80 [xfs]
> [90476.157382]  [] alloc_pages_current+0xa9/0x170
> [90476.157382]  [] new_slab+0x275/0x300
> [90476.157382]  [] __slab_alloc+0x315/0x48f
> [90476.157382]  [] ? kmem_zone_alloc+0x77/0x100 [xfs]
> [90476.157382]  [] ? xfs_bmap_search_extents+0x5c/0xc0 [xfs]
> [

Re: [PATCH 3.19 000/101] 3.19.5-stable review

2015-04-17 Thread Guenter Roeck
On Fri, Apr 17, 2015 at 03:27:48PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.19.5 release.
> There are 101 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:24:43 UTC 2015.
> Anything received after that time might be too late.
> 
Build results:
total: 123 pass: 123 fail: 0
Qemu test results:
total: 30 pass: 30 fail: 0

Details are available at http://server.roeck-us.net:8010/builders.

Guenter


Re: [PATCH] Bluetooth: Pre-initialize variables in read_local_oob_ext_data_complete()

2015-04-17 Thread Geert Uytterhoeven
Hi Marcel,

On Thu, Apr 16, 2015 at 10:34 PM, Marcel Holtmann  wrote:
>> net/bluetooth/mgmt.c: In function ‘read_local_oob_ext_data_complete’:
>> net/bluetooth/mgmt.c:6474: warning: ‘r256’ may be used uninitialized in this 
>> function
>> net/bluetooth/mgmt.c:6474: warning: ‘h256’ may be used uninitialized in this 
>> function
>> net/bluetooth/mgmt.c:6474: warning: ‘r192’ may be used uninitialized in this 
>> function
>> net/bluetooth/mgmt.c:6474: warning: ‘h192’ may be used uninitialized in this 
>> function
>>
>> While these are false positives, the code can be shortened by
>> pre-initializing the hash table pointers and eir_len. This has the side
>> effect of killing the compiler warnings.
>
> can you be a bit more specific about which compiler version this is? I fixed one 
> occurrence that seemed valid. However in this case the compiler seems to be 
> just plain stupid. On a gcc 4.9, I am not seeing these for example.

gcc 4.1.2. As there were too many false positives, these warnings were
disabled in later versions (throwing away the children with the bad water).

If you don't like my patch, just drop it. I only look at newly introduced
warnings of this kind anyway.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


Re: [PATCH 3.14 00/43] 3.14.39-stable review

2015-04-17 Thread Guenter Roeck
On Fri, Apr 17, 2015 at 03:28:34PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.39 release.
> There are 43 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:25:21 UTC 2015.
> Anything received after that time might be too late.
> 
Build results:
total: 125 pass: 123 fail: 2
Failed builds:
arm:allmodconfig
arm64:allmodconfig

Qemu test results:
total: 30 pass: 30 fail: 0

Build results are as expected. arm:allmodconfig and arm64:allmodconfig are
new additions to the list of build tests; the failures are not new.

Details are available at http://server.roeck-us.net:8010/builders.

Guenter


Re: [PATCH 3.10 00/34] 3.10.75-stable review

2015-04-17 Thread Guenter Roeck
On Fri, Apr 17, 2015 at 03:28:32PM +0200, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.75 release.
> There are 34 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:25:20 UTC 2015.
> Anything received after that time might be too late.
> 
Build results:
total: 125 pass: 124 fail: 1
Failed builds:
arm64:allmodconfig

Qemu test results:
total: 27 pass: 27 fail: 0

arm64:allmodconfig was added to the list of builds only recently;
the build failure is not new. Results are as expected.

Guenter


Re: [PATCH v2] net: dsa: use DEVICE_ATTR_RW to declare temp1_max

2015-04-17 Thread David Miller
From: Guenter Roeck 
Date: Fri, 17 Apr 2015 12:40:06 -0700

> On Fri, Apr 17, 2015 at 03:12:25PM -0400, Vivien Didelot wrote:
>> Since commit da4759c (sysfs: Use only return value from is_visible for
>> the file mode), it is possible to reduce the permissions of a file.
>> 
>> So declare temp1_max with the DEVICE_ATTR_RW macro and remove the write
>> permission in dsa_hwmon_attrs_visible if set_temp_limit isn't provided.
>> 
>> Signed-off-by: Vivien Didelot 
> 
> Looks good.
> 
> Reviewed-by: Guenter Roeck 

Applied, thanks everyone.


Re: [GIT PULL] First batch of KVM changes for 4.1

2015-04-17 Thread Paolo Bonzini


>> From 4eb9d7132e1990c0586f28af3103675416d38974 Mon Sep 17 00:00:00 2001
>> From: Paolo Bonzini 
>> Date: Fri, 17 Apr 2015 14:57:34 +0200
>> Subject: [PATCH] sched: add CONFIG_TASK_MIGRATION_NOTIFIER
>>
>> The task migration notifier is only used in x86 paravirt.  Make it
>> possible to compile it out.
>>
>> While at it, move some code around to ensure tmn is filled from CPU
>> registers.
>>
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  arch/x86/Kconfig    | 1 +
>>  init/Kconfig        | 3 +++
>>  kernel/sched/core.c | 9 ++++++++-
>>  3 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index d43e7e1c784b..9af252c8698d 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -649,6 +649,7 @@ if HYPERVISOR_GUEST
>>  
>>  config PARAVIRT
>>  bool "Enable paravirtualization code"
>> +select TASK_MIGRATION_NOTIFIER
>>  ---help---
>>This changes the kernel so it can modify itself when it is run
>>under a hypervisor, potentially improving performance significantly
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 3b9df1aa35db..891917123338 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -2016,6 +2016,9 @@ source "block/Kconfig"
>>  config PREEMPT_NOTIFIERS
>>  bool
>>  
>> +config TASK_MIGRATION_NOTIFIER
>> +bool
>> +
>>  config PADATA
>>  depends on SMP
>>  bool
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index f9123a82cbb6..c07a53aa543c 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1016,12 +1016,14 @@ void check_preempt_curr(struct rq *rq, struct 
>> task_struct *p, int flags)
>>  rq_clock_skip_update(rq, true);
>>  }
>>  
>> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>>  static ATOMIC_NOTIFIER_HEAD(task_migration_notifier);
>>  
>>  void register_task_migration_notifier(struct notifier_block *n)
>>  {
>>  atomic_notifier_chain_register(&task_migration_notifier, n);
>>  }
>> +#endif
>>  
>>  #ifdef CONFIG_SMP
>>  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>> @@ -1053,18 +1055,23 @@ void set_task_cpu(struct task_struct *p, unsigned 
>> int new_cpu)
>>  trace_sched_migrate_task(p, new_cpu);
>>  
>>  if (task_cpu(p) != new_cpu) {
>> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>>  struct task_migration_notifier tmn;
>> +int from_cpu = task_cpu(p);
>> +#endif
>>  
>>  if (p->sched_class->migrate_task_rq)
>>  p->sched_class->migrate_task_rq(p, new_cpu);
>>  p->se.nr_migrations++;
>>  perf_sw_event_sched(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 0);
>>  
>> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>>  tmn.task = p;
>> -tmn.from_cpu = task_cpu(p);
>> +tmn.from_cpu = from_cpu;
>>  tmn.to_cpu = new_cpu;
>>  
>>  atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
>> +#endif
>>  }
>>  
>>  __set_task_cpu(p, new_cpu);
>> -- 
>> 2.3.5
> 
> Paolo, 
> 
> Please revert the patch -- can fix properly in the host
> which also conforms the KVM guest/host documented protocol.
> 
> Radim submitted a patch to kvm@ to split 
> the kvm_write_guest in two with a barrier in between, i think.
> 
> I'll review that patch.

You're thinking of
http://article.gmane.org/gmane.linux.kernel.stable/129187, but see
Andy's reply:

> 
> I think there are at least two ways that would work:
> 
> a) If KVM incremented version as advertised:
> 
> cpu = getcpu();
> pvti = pvti for cpu;
> 
> ver1 = pvti->version;
> check stable bit;
> rdtsc_barrier, rdtsc, read scale, shift, etc.
> if (getcpu() != cpu) retry;
> if (pvti->version != ver1) retry;
> 
> I think this is safe because, we're guaranteed that there was an
> interval (between the two version reads) in which the vcpu we think
> we're on was running and the kvmclock data was valid and marked
> stable, and we know that the tsc we read came from that interval.
> 
> Note: rdtscp isn't needed. If we're stable, it makes no difference
> which cpu's tsc we actually read.
> 
> b) If version remains buggy but we use this migrations_from hack:
> 
> cpu = getcpu();
> pvti = pvti for cpu;
> m1 = pvti->migrations_from;
> barrier();
> 
> ver1 = pvti->version;
> check stable bit;
> rdtsc_barrier, rdtsc, read scale, shift, etc.
> if (getcpu() != cpu) retry;
> if (pvti->version != ver1) retry;  /* probably not really needed */
> 
> barrier();
> if (pvti->migrations_from != m1) retry;
> 
> This is just like (a), except that we're using a guest kernel hack to
> ensure that no one migrated off the vcpu during the version-protected
> critical section and that we were, in fact, on that vcpu at some point
> during that critical section.  Once we've ensured that we were on
> pvti's associated vcpu for the entire time we were reading it, then we
> are protected by the existing versioning in the host.
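
For readers following along, variant (a) quoted above can be modelled in
plain userspace C. This is only a sketch under stated assumptions:
pvti_model, read_clock_model and model_getcpu are hypothetical names and
not the pvclock ABI, the system_time field stands in for the "rdtsc, read
scale, shift" computation, and the stable-bit check and the real barriers
are left out because the model is single-threaded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one vcpu's time info; not the real layout. */
struct pvti_model {
    volatile uint32_t version;      /* bumped by the (modelled) host on update */
    volatile uint64_t system_time;  /* stands in for rdtsc + scale/shift */
};

typedef int (*getcpu_fn)(void);

/*
 * Variant (a): sample cpu and version, read the data, then retry if
 * either the cpu or the version changed while we were reading.
 */
static uint64_t read_clock_model(struct pvti_model *per_cpu, getcpu_fn getcpu,
                                 unsigned int *retries)
{
    assert(per_cpu != NULL && getcpu != NULL);

    for (;;) {
        int cpu = getcpu();
        struct pvti_model *pvti = &per_cpu[cpu];
        uint32_t ver1 = pvti->version;
        uint64_t now = pvti->system_time;

        if (getcpu() == cpu && pvti->version == ver1)
            return now;             /* consistent snapshot */
        if (retries)
            (*retries)++;           /* migrated or updated: go around again */
    }
}

/* Test helper: pretends the task migrated from cpu 0 to cpu 1 after the
 * first call, so the first pass through the loop must be thrown away. */
static int model_cpu_calls;

static int model_getcpu(void)
{
    return model_cpu_calls++ == 0 ? 0 : 1;
}
```

With model_getcpu the first pass reads cpu 0's data, notices the cpu
changed, and retries on cpu 1. Variant (b) would add the migrations_from
comparison around the same loop body.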

(a) is not going to happen until 4.2, and there are too many buggy hosts
around so we

Re: [PATCH] netns: deinline net_generic()

2015-04-17 Thread David Miller
From: Denys Vlasenko 
Date: Fri, 17 Apr 2015 19:05:17 +0200

> On 04/16/2015 02:38 PM, Eric Dumazet wrote:
>> On Thu, 2015-04-16 at 13:14 +0200, Denys Vlasenko wrote:
>> 
>>> However, without BUG_ONs, function is still a bit big
>>> on PREEMPT configs.
>> 
>> Only on allyesconfig builds, that nobody use but to prove some points
>> about code size.
> 
> How do you expect one to find excessively large inlines,
> if not on allyesconfig build?
> 
> Only by using allyesconfig, I can measure how many calls
> are there in the kernel. (grepping source is utterly unreliable
> due to nested inlines and macros).

It is not indicative of its overhead in what people actually make
use of, which is what is actually important.

Uninlining a static inline that basically does no more than index into
an array is nothing more than pure folly.  So please don't try to
weasel your way out of accepting this basic fact.

That's exactly the situation where the implementation of an abstraction
via a static inline is exactly the thing to do.



Re: AM335x OMAP2 common clock external fixed-clock registration

2015-04-17 Thread Michael Turquette
Quoting Russell King - ARM Linux (2015-04-17 03:18:33)
> On Fri, Apr 17, 2015 at 11:12:03AM +0200, Sebastian Hesselbarth wrote:
> > On 17.04.2015 04:00, Michael Welling wrote:
> > >On Fri, Apr 17, 2015 at 01:23:50AM +0200, Sebastian Hesselbarth wrote:
> > >>On 17.04.2015 00:09, Michael Welling wrote:
> > >>>On Thu, Apr 16, 2015 at 10:37:19PM +0200, Sebastian Hesselbarth wrote:
> > On 16.04.2015 18:17, Michael Welling wrote:
> > [...]
> > >>>What would be the proper error path?
> > >>>What cleanup is required?
> > >>
> > >>A proper error path would be to release any claimed resource
> > >>on any error. If you look at the code, the only resources that
> > >>need to be released are the two clocks in question.
> > >
> > >So for every error return in the probe function and in the of 
> > >si5351_dt_parse
> > >it needs to clk_put first right?
> > 
> > Not quite. The driver should clk_put() every clock that it called a
> > [of_]clk_get() for. The thing is that clocks can be passed by
> > platform_data and we never claim them.
> 
> I've always said clocks (as in struct clk) should never be passed through
> platform data.

+1

And for ccf clock drivers Stephen and I plan to change the behavior of
clk_register() at some point so that it returns an error code and not a
struct clk. This will make clkdev the only way to get at a struct clk
for users of the ccf implementation.

Of course it is still possible to clk_get from some place and pass as
platform_data, but every little bit helps.
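
As an illustration of the error-path rule quoted above (clk_put() every
clock the driver itself got), here is a small userspace model. It is a
sketch only: clk_model, model_clk_get, model_clk_put and model_probe are
hypothetical stand-ins and not the clk API, and the platform_data case,
where the driver never claimed the clock, is not modelled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a clock with a reference count. */
struct clk_model {
    int refs;
};

static struct clk_model *model_clk_get(struct clk_model *clk)
{
    if (clk)
        clk->refs++;
    return clk;             /* NULL models a failed lookup */
}

static void model_clk_put(struct clk_model *clk)
{
    if (clk)
        clk->refs--;
}

/*
 * Probe that claims two clocks. Every exit after a successful get
 * releases what was claimed so far, in reverse order.
 */
static int model_probe(struct clk_model *first_src,
                       struct clk_model *second_src, int fail_late)
{
    struct clk_model *first, *second;
    int ret;

    first = model_clk_get(first_src);
    if (!first)
        return -ENOENT;     /* nothing claimed yet, nothing to release */

    second = model_clk_get(second_src);
    if (!second) {
        ret = -ENOENT;
        goto err_put_first;
    }

    if (fail_late) {        /* models any later failure in probe */
        ret = -EIO;
        goto err_put_second;
    }

    return 0;               /* success: references are kept until remove */

err_put_second:
    model_clk_put(second);
err_put_first:
    model_clk_put(first);
    return ret;
}
```

After a failed model_probe() both reference counts are back where they
started; after a successful one each clock holds exactly one reference.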

Regards,
Mike

> 
> -- 
> FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
> according to speedtest.net.


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
Hello,

On Fri, Apr 17, 2015 at 02:55:37PM -0400, David Miller wrote:
> > * The bulk of patches are to pipe extended log messages to console
> >   drivers and let netconsole relay them to the receiver (and quite a
> >   bit of refactoring in the process), which, regardless of the
> >   reliability logic, is beneficial as we're currently losing
> >   structured logging (dictionary) and other metadata over consoles and
> >   regardless of where the reliability logic is implemented, it's a lot
> >   easier to have messages IDs.
> 
> I do not argue against cleanups and good restructuring of the existing
> code.  But you have decided to mix that up with something that is not
> exactly non-controversial.

Is the controversial part referring to sending extended messages or
the reliability part or both?

> You'd do well to separate the cleanups from the fundamental changes,
> so they can be handled separately.

Hmmm... yeah, probably would have been a better idea.  FWIW, the
patches are stacked roughly in the order of escalating
controversiness.  Will split the series up.

> > * The only thing necessary for reliable transmission are timer and
> >   netpoll.  There sure are cases where they go down too but there's a
> >   pretty big gap between those two going down and userland getting
> >   hosed, but where to put the retransmission and reliability logic
> >   definitely is debatable.
> 
> I fundamentally disagree, exactly on this point.
> 
> If you take an OOPS in a software interrupt handler (basically, all of
> the networking receive paths and part of the transmit paths, for
> example) you're not going to be taking timer interrupts.

Sure, if irq handling is hosed, this won't work but I think there are
enough other failure modes like oopsing while holding a mutex or
falling into infinite loop while holding task_list lock (IIRC we had
something similar a while ago due to iterator bug).  Whether being
more robust in those cases is worthwhile is definitely debatable.  I
thought the added complexity was small enough but the judgement can
easily fall on the other side.

> And that's the value of netconsole, the chance (albeit not 100%) of
> getting messages in those scenarios.

None of the changes harm that in any way.  Anyways, I'll split up the
extended message and the rest.

Thanks.

-- 
tejun


Re: [PATCH v4 04/10] mtd: pxa3xx_nand: rework flash detection and timing setup

2015-04-17 Thread Robert Jarzmik
Ezequiel Garcia  writes:

>> Also, as soon as Robert moves pxa3xx boards fully to DT, we'll lose
>> the pdata timings option above. *sigh*
>> 
> Well, such move would include proper timing DT properties for non-ONFI
> devices.
I will move several boards to DT, including several pxa3x boards, but not _all_
the boards.

For reference, I said in [1] this :
> Actually, this deserves another discussion altogether. My plan was not to
> convert all pxa board files to dt support, but all internal SoC IPs drivers +
> mach/plat support.
> 
> Or put another way at the end :
>  - there will be at least one pxa25x board which is fully DT converted
>  - there will be at least one pxa27x board which is fully DT converted
>  - there will be at least one pxa3xx board which is fully DT converted
> 
>  - there will be at least one pxa25x board which is not DT converted
>  - there will be at least one pxa27x board which is not DT converted
>  - there will be at least one pxa3xx board which is not DT converted
> 
> I want to keep the support for both legacy platform_data and DT for pxa
> architecture. The idea I had was that only fully DT converted machines will
> benefit from multiplatform support.

And as you said Ezequiel, the problem will arise regardless of being DT or not,
it's a lack of conformance of the zylonite board's nand amongst others.

So the timings will have to survive somewhere, whichever place is seen fit.

Cheers.

-- 
Robert

[1] http://lists.openwall.net/linux-kernel/2015/03/24/101


Re: [PATCH] usb: dwc3: gadget: call gadget driver's ->suspend/->resume

2015-04-17 Thread Felipe Balbi
On Fri, Apr 17, 2015 at 02:43:27PM -0500, Felipe Balbi wrote:
> On Fri, Apr 17, 2015 at 11:41:56AM -0700, David Cohen wrote:
> > From: Felipe Balbi 
> 
> missing the required:
> 
> [ Upstream commit bc5ba2e0b829c9397f96df1191c7d2319ebc36d9 ]
> 
> > 
> > When going into bus suspend/resume we _must_
> > call gadget driver's ->suspend/->resume callbacks
> > accordingly. This patch implements that very feature
> > which has been missing forever.
> > 
> > Cc:  # 3.14
> > Signed-off-by: Felipe Balbi 
> > Signed-off-by: David Cohen 
> > ---
> > 
> > Hi,
> > 
> > This patch was introduced on v3.15.
> > But the issue it fixes already existed on v3.14 and v3.14 is a long term
> > support version.
> 
> Can you show me a log of this breaking anywhere ? Why do you consider
> this a bug fix ? What sort of drawbacks did you notice ?
> 
> > I propose to backport it over there as well.
> > 
> > BR, David
> > ---
> > 
> >  drivers/usb/dwc3/gadget.c | 35 +++
> >  1 file changed, 35 insertions(+)
> > 
> > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > index 8f6738d46b14..1bb752736c32 100644
> > --- a/drivers/usb/dwc3/gadget.c
> > +++ b/drivers/usb/dwc3/gadget.c
> > @@ -2012,6 +2012,24 @@ static void dwc3_disconnect_gadget(struct dwc3 *dwc)
> > }
> >  }
> >  
> > +static void dwc3_suspend_gadget(struct dwc3 *dwc)
> > +{
> > +   if (dwc->gadget_driver && dwc->gadget_driver->disconnect) {
> 
> you also need Dan Carperter's commit which fixes this cut & paste error.

That's Carpenter, sorry.

-- 
balbi




Re: [PATCH] usb: dwc3: gadget: call gadget driver's ->suspend/->resume

2015-04-17 Thread Felipe Balbi
On Fri, Apr 17, 2015 at 11:41:56AM -0700, David Cohen wrote:
> From: Felipe Balbi 

missing the required:

[ Upstream commit bc5ba2e0b829c9397f96df1191c7d2319ebc36d9 ]

> 
> When going into bus suspend/resume we _must_
> call gadget driver's ->suspend/->resume callbacks
> accordingly. This patch implements that very feature
> which has been missing forever.
> 
> Cc:  # 3.14
> Signed-off-by: Felipe Balbi 
> Signed-off-by: David Cohen 
> ---
> 
> Hi,
> 
> This patch was introduced on v3.15.
> But the issue it fixes already existed on v3.14 and v3.14 is a long term
> support version.

Can you show me a log of this breaking anywhere ? Why do you consider
this a bug fix ? What sort of drawbacks did you notice ?

> I propose to backport it over there as well.
> 
> BR, David
> ---
> 
>  drivers/usb/dwc3/gadget.c | 35 +++
>  1 file changed, 35 insertions(+)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 8f6738d46b14..1bb752736c32 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -2012,6 +2012,24 @@ static void dwc3_disconnect_gadget(struct dwc3 *dwc)
>   }
>  }
>  
> +static void dwc3_suspend_gadget(struct dwc3 *dwc)
> +{
> + if (dwc->gadget_driver && dwc->gadget_driver->disconnect) {

you also need Dan Carperter's commit which fixes this cut & paste error.
That's commit 73a30bfc0d526db899033165db6f95c427e70505

> + spin_unlock(&dwc->lock);
> + dwc->gadget_driver->suspend(&dwc->gadget);
> + spin_lock(&dwc->lock);
> + }
> +}
> +
> +static void dwc3_resume_gadget(struct dwc3 *dwc)
> +{
> + if (dwc->gadget_driver && dwc->gadget_driver->disconnect) {
> + spin_unlock(&dwc->lock);
> + dwc->gadget_driver->resume(&dwc->gadget);
> + spin_lock(&dwc->lock);
> + }
> +}
> +
>  static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum)
>  {
>   struct dwc3_ep *dep;
> @@ -2391,6 +2409,23 @@ static void 
> dwc3_gadget_linksts_change_interrupt(struct dwc3 *dwc,
>  
>   dwc->link_state = next;
>  
> + switch (next) {
> + case DWC3_LINK_STATE_U1:
> + if (dwc->speed == USB_SPEED_SUPER)
> + dwc3_suspend_gadget(dwc);
> + break;
> + case DWC3_LINK_STATE_U2:
> + case DWC3_LINK_STATE_U3:
> + dwc3_suspend_gadget(dwc);
> + break;
> + case DWC3_LINK_STATE_RESUME:
> + dwc3_resume_gadget(dwc);
> + break;
> + default:
> + /* do nothing */
> + break;
> + }
> +
>   dev_vdbg(dwc->dev, "%s link %d\n", __func__, dwc->link_state);
>  }
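
The cut & paste error pointed out above appears to be that the helpers test
->disconnect but then call ->suspend or ->resume. A userspace model of the
intended shape, testing the same pointer that is about to be called, might
look like the following. It is a sketch under assumptions:
gadget_ops_model, model_suspend_gadget and count_call are hypothetical
names, and the locking around the callback is not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a gadget driver's optional callbacks. */
struct gadget_ops_model {
    void (*disconnect)(void *ctx);
    void (*suspend)(void *ctx);
    void (*resume)(void *ctx);
};

/* Returns 1 if the suspend callback was invoked, 0 if it was skipped. */
static int model_suspend_gadget(const struct gadget_ops_model *drv, void *ctx)
{
    /* Check the callback that is actually called, not a neighbour. */
    if (drv && drv->suspend) {
        drv->suspend(ctx);
        return 1;
    }
    return 0;
}

/* Test helper: counts how many times a callback ran. */
static void count_call(void *ctx)
{
    int *calls = ctx;

    assert(calls != NULL);
    (*calls)++;
}
```

A driver that only provides ->disconnect is then skipped safely instead of
having a NULL ->suspend pointer called.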

-- 
balbi




Re: [PATCH 3.19 000/101] 3.19.5-stable review

2015-04-17 Thread Greg Kroah-Hartman
On Fri, Apr 17, 2015 at 11:34:53AM -0600, Shuah Khan wrote:
> On 04/17/2015 07:27 AM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.19.5 release.
> > There are 101 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sun Apr 19 13:24:43 UTC 2015.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.19.5-rc1.gz
> > and the diffstat can be found below.
> > 
> > thanks,
> > 
> > greg k-h
> > 
> 
> Compiled and booted on my test system. No dmesg regressions.

Thanks for testing all three of these and letting me know.

greg k-h


Re: [PATCH] usb: dwc3: gadget: call gadget driver's ->suspend/->resume

2015-04-17 Thread Greg KH
On Fri, Apr 17, 2015 at 11:41:56AM -0700, David Cohen wrote:
> From: Felipe Balbi 
> 
> When going into bus suspend/resume we _must_
> call gadget driver's ->suspend/->resume callbacks
> accordingly. This patch implements that very feature
> which has been missing forever.
> 
> Cc:  # 3.14
> Signed-off-by: Felipe Balbi 
> Signed-off-by: David Cohen 
> ---
> 
> Hi,
> 
> This patch was introduced on v3.15.
> But the issue it fixes already existed on v3.14 and v3.14 is a long term
> support version.
> I propose to backport it over there as well.

What is the git commit id of the patch in Linus's tree?

thanks,

greg k-h


Re: [PATCH v2] net: dsa: use DEVICE_ATTR_RW to declare temp1_max

2015-04-17 Thread Guenter Roeck
On Fri, Apr 17, 2015 at 03:12:25PM -0400, Vivien Didelot wrote:
> Since commit da4759c (sysfs: Use only return value from is_visible for
> the file mode), it is possible to reduce the permissions of a file.
> 
> So declare temp1_max with the DEVICE_ATTR_RW macro and remove the write
> permission in dsa_hwmon_attrs_visible if set_temp_limit isn't provided.
> 
> Signed-off-by: Vivien Didelot 

Looks good.

Reviewed-by: Guenter Roeck 

> ---
>  net/dsa/dsa.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
> index 5eaadab..079a224 100644
> --- a/net/dsa/dsa.c
> +++ b/net/dsa/dsa.c
> @@ -124,7 +124,7 @@ static ssize_t temp1_max_store(struct device *dev,
>  
>   return count;
>  }
> -static DEVICE_ATTR(temp1_max, S_IRUGO, temp1_max_show, temp1_max_store);
> +static DEVICE_ATTR_RW(temp1_max);
>  
>  static ssize_t temp1_max_alarm_show(struct device *dev,
>   struct device_attribute *attr, char *buf)
> @@ -159,8 +159,8 @@ static umode_t dsa_hwmon_attrs_visible(struct kobject 
> *kobj,
>   if (index == 1) {
>   if (!drv->get_temp_limit)
>   mode = 0;
> - else if (drv->set_temp_limit)
> - mode |= S_IWUSR;
> + else if (!drv->set_temp_limit)
> + mode &= ~S_IWUSR;
>   } else if (index == 2 && !drv->get_temp_alarm) {
>   mode = 0;
>   }
> -- 
> 2.3.5
> 
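
The mode handling in the hunk above can be tried out in userspace. The
sketch below assumes the attribute starts out as 0644, which is what
DEVICE_ATTR_RW declares (S_IRUGO | S_IWUSR), and masks bits off from
there. hwmon_ops_model and model_temp1_max_visible are hypothetical names;
only the index == 1 branch of dsa_hwmon_attrs_visible is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical stand-in for the driver's optional temperature callbacks. */
struct hwmon_ops_model {
    int has_get_temp_limit;
    int has_set_temp_limit;
};

/*
 * Start from the declared mode and only ever remove permissions:
 * hide the file without a getter, drop owner-write without a setter.
 */
static mode_t model_temp1_max_visible(mode_t mode,
                                      const struct hwmon_ops_model *drv)
{
    assert(drv != NULL);

    if (!drv->has_get_temp_limit)
        mode = 0;
    else if (!drv->has_set_temp_limit)
        mode &= ~(mode_t)S_IWUSR;

    return mode;
}
```

A driver with both callbacks keeps 0644, one with only a getter ends up
with 0444, and one with neither hides the file (mode 0).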


Re: AM335x OMAP2 common clock external fixed-clock registration

2015-04-17 Thread Russell King - ARM Linux
On Fri, Apr 17, 2015 at 02:06:23PM -0500, Michael Welling wrote:
> On Fri, Apr 17, 2015 at 11:18:33AM +0100, Russell King - ARM Linux wrote:
> > On Fri, Apr 17, 2015 at 11:12:03AM +0200, Sebastian Hesselbarth wrote:
> > > On 17.04.2015 04:00, Michael Welling wrote:
> > > >On Fri, Apr 17, 2015 at 01:23:50AM +0200, Sebastian Hesselbarth wrote:
> > > >>On 17.04.2015 00:09, Michael Welling wrote:
> > > >>>On Thu, Apr 16, 2015 at 10:37:19PM +0200, Sebastian Hesselbarth wrote:
> > > On 16.04.2015 18:17, Michael Welling wrote:
> > > [...]
> > > >>>What would be the proper error path?
> > > >>>What cleanup is required?
> > > >>
> > > >>A proper error path would be to release any claimed resource
> > > >>on any error. If you look at the code, the only resources that
> > > >>need to be released are the two clocks in question.
> > > >
> > > >So for every error return in the probe function and in the of 
> > > >si5351_dt_parse
> > > >it needs to clk_put first right?
> > > 
> > > Not quite. The driver should clk_put() every clock that it called a
> > > [of_]clk_get() for. The thing is that clocks can be passed by
> > > platform_data and we never claim them.
> > 
> > I've always said clocks (as in struct clk) should never be passed through
> > platform data.
> >
> 
> What is the alternative for systems that still use the old platform files?

clkdev, which has pre-existed DT.

-- 
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.


Re: [GIT PULL] kdbus for 4.1-rc1

2015-04-17 Thread James Bottomley
On Thu, 2015-04-16 at 14:13 +0200, David Herrmann wrote:
> Hi
> 
> On Wed, Apr 15, 2015 at 8:12 PM, James Bottomley
>  wrote:
> > For me the biggest issue is the container problem: it's really hard to
> > containerise kdbus because of the stateful nature of the protocol and
> > the fact that it has a well known system bus.  Separation into domains
> > works for OS containers, but application containers need more fluidity.
> > It's not unlike the same problem on windows: Windows application
> > containers are very difficult to do because the global registry means
> > that OLE handlers all have to run inside your container as well
> > (effectively making it an OS container).  I'm sure, since we already
> > have a lot of containers people going to plumbers, that we can get them
> > to turn up for the discussion.
> 
> kdbus actually works very well in OS containers that mount a new
> kdbusfs inside the container. This new instance of kdbus will be
> entirely separated from any other on the system. We've designed it
> that way especially with OS containers in mind. This is explained in
> kdbus.fs(7). It's very similar to devpts' container support, where you
> mount a new instance of devpts into each container instance you run.
> 
> For Docker-style (i.e. app-focused) containers, it's a more complex
> story.

Well, no, docker-style is just one flavour of application containers.
I'm actually much more interested in something very different:
applications that use container features (like docker, rocket and
systemd).  Facilitating them is an interesting exercise.

Also, applications inside containers were around long before docker in
the PaaS space at least.

>  kdbus will not solve this for you, but at least one thing
> deserves being mentioned: for this kind of sandboxing kdbus certainly
> makes things *easier*, compared to dbus1.

So slightly better than really difficult isn't terribly useful.

>  Why? because the kernel
> gains a notion of individual messages and method call transactions,
> something that is completely unavailable if you stick to dbus1 where
> all the kernel sees is a raw stream of AF_UNIX/SOCK_STREAM bytes. In
> fact, kdbus as it is right now even contains minimal but explicit
> support for sandboxing, by allowing creation of multiple bus endpoints
> to the same bus that carry additional, more restrictive policy.

Sandboxing is a minor (albeit very useful) use of containers.

You nicely ignored the actual problem I listed, which is the system bus.
And the specific example of what happens.  Let me try again.  Just to
provide the context, Virtuozzo has long supported containers on both
Windows and Linux.  We have been doing application containers on Linux
for a long time, but we've been having issues doing the same thing on
Windows (in spite of the fact that our Windows container system is very
similar to the Linux one).

In Windows, OLE + the global registry is dbus on steroids.  The idea
seems simple and elegant: remote system elements are provided to you via
an IPC interaction instead of being directly dynamically linked into
your virtual address space.  It allows Windows applications to deal with
arbitrary objects of unknown type because the type handlers are provided
by the system via OLE.  It's really elegant in a single user desktop
environment because the system's job is to serve and protect only that
user.  In a multi user environment (as MS found with VDI) it's a lot
more problematic because now either the type handlers are global
(meaning local users can't modify them unlike in the single user case)
or they're all local, meaning we're back to OS containers again.  If you
think abstractly of containers as a way to bring multi-user features to
single user environments (essentially that's what OS virtualization is)
you can see immediately why we're having such issues with non-OS
containers on Windows because the single bus/global namespace idea
doesn't play well with multi-user.

This is why I think kdbus is a bad idea: it solidifies as a Linux kernel
API something which runs counter to granular OS virtualization (and
something which caused Windows to fall behind Linux in the container
space).  Splitting out the acceleration problem and leaving the rest to
user space currently looks fine because the ideas Al and Andy are
kicking around don't cause problems with OS virtualization.

James




Re: [PATCH] timer_list: Reduce SEQ_printf footprint

2015-04-17 Thread Joe Perches
On Fri, 2015-04-17 at 11:39 -0700, Joe Perches wrote:
> This macro can be converted to a static inline to reduce
> object size.

bah, that should be "static function".
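
A userspace sketch of the idea, for illustration only. It assumes the macro picks between an output file and a fallback at every call site; the name seq_printf_sketch and the stdout fallback are hypothetical stand-ins, not the kernel code. As a real function the branch and the call sequence are emitted once, where a macro would expand them at each use.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for the macro: print to 'm' when one is given,
 * otherwise fall back to stdout. Returns what vfprintf() returns.
 */
static int seq_printf_sketch(FILE *m, const char *fmt, ...)
{
	va_list args;
	int written;

	va_start(args, fmt);
	written = vfprintf(m ? m : stdout, fmt, args);
	va_end(args);
	return written;
}
```

Callers keep the printf-style call shape, e.g. seq_printf_sketch(m, "%s=%d\n", "cpu", 3).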




Re: [PATCH] sched/debug: Reduce SEQ_printf footprint

2015-04-17 Thread Joe Perches
On Fri, 2015-04-17 at 11:39 -0700, Joe Perches wrote:
> This macro can be converted to a static inline to reduce
> object size.

bah, that should be "static function".




Re: [PATCH] netns: remove BUG_ONs from net_generic()

2015-04-17 Thread David Miller
From: Denys Vlasenko 
Date: Fri, 17 Apr 2015 19:06:30 +0200

> This inline has ~500 callsites.
> 
> On 04/14/2015 08:37 PM, David Miller wrote:
>> That BUG_ON() was added 7 years ago, and I don't remember it ever
>> triggering or helping us diagnose something, so just remove it and
>> keep the function inlined.
> 
> On x86 allyesconfig build:
> 
>     text     data      bss       dec     hex filename
> 82447071 22255384 20627456 125329911 77861f7 vmlinux4
> 82441375 22255384 20627456 125324215 7784bb7 vmlinux5prime
> 
> Signed-off-by: Denys Vlasenko 

Applied.


[PATCH v2 1/2] i2c: core: Add support for best effort block read emulation

2015-04-17 Thread Irina Tirdea
There are devices that need to handle block transactions
regardless of the capabilities exported by the adapter.
For performance reasons, they need to use i2c read blocks
if available, otherwise emulate the block transaction with word
or byte transactions.

Add support for a helper function that would read a data block
using the best transfer available: I2C_FUNC_SMBUS_READ_I2C_BLOCK,
I2C_FUNC_SMBUS_READ_WORD_DATA or I2C_FUNC_SMBUS_READ_BYTE_DATA.

Signed-off-by: Irina Tirdea 
---
 drivers/i2c/i2c-core.c | 60 ++
 include/linux/i2c.h|  3 +++
 2 files changed, 63 insertions(+)

diff --git a/drivers/i2c/i2c-core.c b/drivers/i2c/i2c-core.c
index 098f698..5ceebc4 100644
--- a/drivers/i2c/i2c-core.c
+++ b/drivers/i2c/i2c-core.c
@@ -2907,6 +2907,66 @@ trace:
 }
 EXPORT_SYMBOL(i2c_smbus_xfer);
 
+/**
+ * i2c_smbus_read_i2c_block_data_or_emulated - read block or emulate
+ * @client: Handle to slave device
+ * @command: Byte interpreted by slave
+ * @length: Size of data block; SMBus allows at most 32 bytes
+ * @values: Byte array into which data will be read; big enough to hold
+ * the data returned by the slave.  SMBus allows at most 32 bytes.
+ *
+ * This executes the SMBus "block read" protocol if supported by the adapter.
+ * If block read is not supported, it emulates it using either word or byte
+ * read protocols depending on availability.
+ *
+ * Before using this function you must double-check if the I2C slave does
+ * support exchanging a block transfer with a byte transfer.
+ */
+s32 i2c_smbus_read_i2c_block_data_or_emulated(const struct i2c_client *client,
+					      u8 command, u8 length, u8 *values)
+{
+	u8 i;
+	int status;
+
+	if (length > I2C_SMBUS_BLOCK_MAX)
+		length = I2C_SMBUS_BLOCK_MAX;
+
+	if (i2c_check_functionality(client->adapter,
+				    I2C_FUNC_SMBUS_READ_I2C_BLOCK)) {
+		return i2c_smbus_read_i2c_block_data(client, command,
+						     length, values);
+	} else if (i2c_check_functionality(client->adapter,
+					   I2C_FUNC_SMBUS_READ_WORD_DATA)) {
+		for (i = 0; i < length; i += 2) {
+			status = i2c_smbus_read_word_data(client, command + i);
+			if (status < 0)
+				return status;
+			values[i] = status & 0xff;
+			if ((i + 1) < length)
+				values[i + 1] = status >> 8;
+		}
+		if (i > length)
+			return length;
+		return i;
+	} else if (i2c_check_functionality(client->adapter,
+					   I2C_FUNC_SMBUS_READ_BYTE_DATA)) {
+		for (i = 0; i < length; i++) {
+			status = i2c_smbus_read_byte_data(client, command + i);
+			if (status < 0)
+				return status;
+			values[i] = status;
+		}
+		return i;
+	}
+
+	dev_err(&client->adapter->dev, "Unsupported transactions: %d,%d,%d\n",
+		I2C_SMBUS_I2C_BLOCK_DATA, I2C_SMBUS_WORD_DATA,
+		I2C_SMBUS_BYTE_DATA);
+
+	return -EOPNOTSUPP;
+}
+EXPORT_SYMBOL(i2c_smbus_read_i2c_block_data_or_emulated);
+
 #if IS_ENABLED(CONFIG_I2C_SLAVE)
 int i2c_slave_register(struct i2c_client *client, i2c_slave_cb_t slave_cb)
 {
diff --git a/include/linux/i2c.h b/include/linux/i2c.h
index e83a738..faf518d 100644
--- a/include/linux/i2c.h
+++ b/include/linux/i2c.h
@@ -121,6 +121,9 @@ extern s32 i2c_smbus_read_i2c_block_data(const struct i2c_client *client,
 extern s32 i2c_smbus_write_i2c_block_data(const struct i2c_client *client,
  u8 command, u8 length,
  const u8 *values);
+extern s32
+i2c_smbus_read_i2c_block_data_or_emulated(const struct i2c_client *client,
+ u8 command, u8 length, u8 *values);
 #endif /* I2C */
 
 /**
-- 
1.9.1
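
For readers following the word-based fallback in the patch above, here is a userspace sketch of that one path. It is not the i2c core code: fake_dev, fake_read_word and fake_read_block_by_words are hypothetical names, and the device is modelled as a flat register file in which a word read at 'command' returns the bytes at 'command' and 'command + 1'. The point it shows is the odd-length case: an even number of bytes is fetched and the last one is dropped.

```c
#include <stddef.h>
#include <stdint.h>

#define FAKE_BLOCK_MAX 32	/* mirrors the 32-byte SMBus block limit */

/* Hypothetical device model: a flat register file indexed by command. */
struct fake_dev {
	uint8_t regs[256];
	unsigned int word_reads;	/* number of word transactions issued */
};

/* Stand-in for a word read: low byte at 'command', high byte at 'command + 1'. */
static int fake_read_word(struct fake_dev *dev, uint8_t command)
{
	dev->word_reads++;
	return dev->regs[command] | (dev->regs[(uint8_t)(command + 1)] << 8);
}

/* Emulate a block read with word reads; returns the number of bytes stored. */
static int fake_read_block_by_words(struct fake_dev *dev, uint8_t command,
				    uint8_t length, uint8_t *values)
{
	uint8_t i;
	int status;

	if (length > FAKE_BLOCK_MAX)
		length = FAKE_BLOCK_MAX;

	for (i = 0; i < length; i += 2) {
		status = fake_read_word(dev, (uint8_t)(command + i));
		if (status < 0)
			return status;
		values[i] = status & 0xff;
		/* For an odd length the high byte of the last word is dropped. */
		if (i + 1 < length)
			values[i + 1] = status >> 8;
	}
	return length;
}
```

Reading 5 bytes this way costs 3 word transactions. Whether a given slave tolerates a block read being replaced by consecutive word reads is device specific, as the kernel-doc comment in the patch warns.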



[PATCH v2 2/2] eeprom: at24: use i2c_smbus_read_i2c_block_data_or_emulated

2015-04-17 Thread Irina Tirdea
For i2c busses that support only SMBUS extensions, the eeprom at24
driver reads data from the device using the SMBus block, word or byte
read protocols depending on availability.

Replace the block read emulation from the driver with the
i2c_smbus_read_i2c_block_data_or_emulated call from i2c core.

Signed-off-by: Irina Tirdea 
---
 drivers/misc/eeprom/at24.c | 40 +---
 1 file changed, 9 insertions(+), 31 deletions(-)

diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
index 2d3db81..d13795a 100644
--- a/drivers/misc/eeprom/at24.c
+++ b/drivers/misc/eeprom/at24.c
@@ -186,19 +186,11 @@ static ssize_t at24_eeprom_read(struct at24_data *at24, char *buf,
if (count > io_limit)
count = io_limit;
 
-   switch (at24->use_smbus) {
-   case I2C_SMBUS_I2C_BLOCK_DATA:
+   if (at24->use_smbus) {
/* Smaller eeproms can work given some SMBus extension calls */
if (count > I2C_SMBUS_BLOCK_MAX)
count = I2C_SMBUS_BLOCK_MAX;
-   break;
-   case I2C_SMBUS_WORD_DATA:
-   count = 2;
-   break;
-   case I2C_SMBUS_BYTE_DATA:
-   count = 1;
-   break;
-   default:
+   } else {
/*
 * When we have a better choice than SMBus calls, use a
 * combined I2C message. Write address; then read up to
@@ -229,27 +221,13 @@ static ssize_t at24_eeprom_read(struct at24_data *at24, char *buf,
timeout = jiffies + msecs_to_jiffies(write_timeout);
do {
read_time = jiffies;
-   switch (at24->use_smbus) {
-   case I2C_SMBUS_I2C_BLOCK_DATA:
-   status = i2c_smbus_read_i2c_block_data(client, offset,
-   count, buf);
-   break;
-   case I2C_SMBUS_WORD_DATA:
-   status = i2c_smbus_read_word_data(client, offset);
-   if (status >= 0) {
-   buf[0] = status & 0xff;
-   buf[1] = status >> 8;
-   status = count;
-   }
-   break;
-   case I2C_SMBUS_BYTE_DATA:
-   status = i2c_smbus_read_byte_data(client, offset);
-   if (status >= 0) {
-   buf[0] = status;
-   status = count;
-   }
-   break;
-   default:
+   if (at24->use_smbus) {
+   status =
+   i2c_smbus_read_i2c_block_data_or_emulated(client,
+ offset,
+ count,
+ buf);
+   } else {
status = i2c_transfer(client->adapter, msg, 2);
if (status == 2)
status = count;
-- 
1.9.1



[for-next][PATCH] tracing: Fix possible out of bounds memory access when parsing enums

2015-04-17 Thread Steven Rostedt
  git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace.git
for-next

Head SHA1: 3193899d4dd54056f8c2e0b1e40dd6e2f0009f28


Steven Rostedt (Red Hat) (1):
  tracing: Fix possible out of bounds memory access when parsing enums


 kernel/trace/trace_events.c | 6 ++
 1 file changed, 6 insertions(+)
---
commit 3193899d4dd54056f8c2e0b1e40dd6e2f0009f28
Author: Steven Rostedt (Red Hat) 
Date:   Fri Apr 17 10:27:57 2015 -0400

tracing: Fix possible out of bounds memory access when parsing enums

The code that replaces the enum names with the enum values in the
tracepoints' format files could possibly miss the end of string nul
character. This was caused by processing things like backslashes, quotes
and other tokens. After processing the tokens, a check for the nul
character needed to be done before continuing the loop, because the loop
incremented the pointer before doing the check, which could bypass the nul
character.

Link: http://lkml.kernel.org/r/552e661d.5060...@oracle.com

Reported-by: Sasha Levin  # via KASan
Tested-by: Andrey Ryabinin 
Fixes: 0c564a538aa9 "tracing: Add TRACE_DEFINE_ENUM() macro to map enums to their values"
Signed-off-by: Steven Rostedt 

diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index 36a957c996c7..b49c107f82ac 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -1760,6 +1760,8 @@ static void update_event_printk(struct ftrace_event_call *call,
ptr++;
/* Check for alpha chars like ULL */
} while (isalnum(*ptr));
+   if (!*ptr)
+   break;
/*
 * A number must have some kind of delimiter after
 * it, and we can ignore that too.
@@ -1786,12 +1788,16 @@ static void update_event_printk(struct ftrace_event_call *call,
do {
ptr++;
} while (isalnum(*ptr) || *ptr == '_');
+   if (!*ptr)
+   break;
/*
 * If what comes after this variable is a '.' or
 * '->' then we can continue to ignore that string.
 */
if (*ptr == '.' || (ptr[0] == '-' && ptr[1] == '>')) {
ptr += *ptr == '.' ? 1 : 2;
+   if (!*ptr)
+   break;
goto skip_more;
}
/*
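
The failure mode described in the commit message can be reproduced in a few lines of userspace C. This is only a sketch of the pattern, not the tracing code: count_identifiers is a hypothetical function whose outer loop, like the one being patched, advances the pointer before testing for the terminator.

```c
#include <ctype.h>

/*
 * Count identifier-like tokens in 'fmt'. The outer loop advances 'ptr'
 * before testing '*ptr', so any inner scan that may stop on the
 * terminating nul has to break out first.
 */
static int count_identifiers(const char *fmt)
{
	const char *ptr;
	int count = 0;

	for (ptr = fmt; *ptr; ptr++) {
		if (isalpha((unsigned char)*ptr) || *ptr == '_') {
			count++;
			do {
				ptr++;
			} while (isalnum((unsigned char)*ptr) || *ptr == '_');
			/* Without this check the loop's ptr++ steps past the nul. */
			if (!*ptr)
				break;
		}
	}
	return count;
}
```

With the check removed, an input that ends in an identifier, such as "REC->foo", leaves ptr on the nul after the inner scan, and the loop increment then moves it one byte beyond the string before the condition is evaluated.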


[PATCH v2 0/2] Add support for best effort block read emulation

2015-04-17 Thread Irina Tirdea
This is the second version for adding i2c_smbus_read_i2c_block_data_or_emulated
to i2c-core. It contains mostly fixes suggested by Wolfram.

Changes since v1:
 - dropped the RFC tag
 - changed at24 to use i2c_smbus_read_i2c_block_data_or_emulated
 - when reading an odd number of bytes using word emulation, read an even
number of bytes and drop the last one
 - add a comment that this might not be suitable for all I2C slaves 

Irina Tirdea (2):
  i2c: core: Add support for best effort block read emulation
  eeprom: at24: use i2c_smbus_read_i2c_block_data_or_emulated

 drivers/i2c/i2c-core.c | 60 ++
 drivers/misc/eeprom/at24.c | 40 +++
 include/linux/i2c.h|  3 +++
 3 files changed, 72 insertions(+), 31 deletions(-)

-- 
1.9.1



Re: [GIT PULL] First batch of KVM changes for 4.1

2015-04-17 Thread Andy Lutomirski
On Fri, Apr 17, 2015 at 12:01 PM, Marcelo Tosatti  wrote:
> On Fri, Apr 17, 2015 at 03:38:58PM +0200, Paolo Bonzini wrote:
>> On 17/04/2015 15:10, Peter Zijlstra wrote:
>> > On Fri, Apr 17, 2015 at 02:46:57PM +0200, Paolo Bonzini wrote:
>> >> On 17/04/2015 12:55, Peter Zijlstra wrote:
>> >>> Also, it looks like you already do exactly this for other things, look
>> >>> at:
>> >>>
>> >>>   kvm_sched_in()
>> >>> kvm_arch_vcpu_load()
>> >>>   if (unlikely(vcpu->cpu != cpu) ... )
>> >>>
>> >>> So no, I don't believe for one second you need this.
>> >
>> > This [...] brings us back to where we were last
>> > time. There is _0_ justification for this in the patches, that alone is
>> > grounds enough to reject it.
>>
>> Oh, we totally agree on that.  I didn't commit that patch, but I already
>> said the commit message was insufficient.
>>
>> > Why should the guest task care about the physical cpu of the vcpu;
>> > that's a layering fail if ever there was one.
>>
>> It's totally within your right to not read the code, but then please
>> don't try commenting on it.
>>
>> This code:
>>
>>   kvm_sched_in()
>> kvm_arch_vcpu_load()
>>   if (unlikely(vcpu->cpu != cpu) ... )
>>
>> runs in the host.  The hypervisor obviously cares if the physical CPU of
>> the VCPU changes.  It has to tell the source processor (vcpu->cpu) to
>> release the VCPU's data structure and only then it can use it in the
>> target processor (cpu).  No layering violation here.
>>
>> The task migration notifier runs in the guest, whenever the VCPU of
>> a task changes.
>>
>> > Furthermore, the only thing that migration handler seems to do is
>> > increment a variable that is not actually used in that file.
>>
>> It's used in the vDSO, so you cannot increment it in the file that uses it.
>>
>> >> And frankly, I think the static key is snake oil.  The cost of task
>> >> migration in terms of cache misses and TLB misses is in no way
>> >> comparable to the cost of filling in a structure on the stack,
>> >> dereferencing the head of the notifiers list and seeing that it's NULL.
>> >
>> > The path this notifier is called from has nothing to do with those
>> > costs.
>>
>> How not?  The task is going to incur those costs, it's not like half
>> a dozen extra instructions make any difference.  But anyway...
>>
>> > And the fact you're inflicting these costs on _everyone_ for a
>> > single x86_64-paravirt case is insane.
>>
>> ... that's a valid objection.  Please look at the patch below.
>>
>> > I've had enough of this, the below goes into sched/urgent and you can
>> > come back with sane patches if and when you're ready.
>>
>> Oh, please, cut the alpha male crap.
>>
>> Paolo
>>
>> --- 8< 
>> >From 4eb9d7132e1990c0586f28af3103675416d38974 Mon Sep 17 00:00:00 2001
>> From: Paolo Bonzini 
>> Date: Fri, 17 Apr 2015 14:57:34 +0200
>> Subject: [PATCH] sched: add CONFIG_TASK_MIGRATION_NOTIFIER
>>
>> The task migration notifier is only used in x86 paravirt.  Make it
>> possible to compile it out.
>>
>> While at it, move some code around to ensure tmn is filled from CPU
>> registers.
>>
>> Signed-off-by: Paolo Bonzini 
>> ---
>>  arch/x86/Kconfig| 1 +
>>  init/Kconfig| 3 +++
>>  kernel/sched/core.c | 9 -
>>  3 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index d43e7e1c784b..9af252c8698d 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -649,6 +649,7 @@ if HYPERVISOR_GUEST
>>
>>  config PARAVIRT
>>   bool "Enable paravirtualization code"
>> + select TASK_MIGRATION_NOTIFIER
>>   ---help---
>> This changes the kernel so it can modify itself when it is run
>> under a hypervisor, potentially improving performance significantly
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 3b9df1aa35db..891917123338 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -2016,6 +2016,9 @@ source "block/Kconfig"
>>  config PREEMPT_NOTIFIERS
>>   bool
>>
>> +config TASK_MIGRATION_NOTIFIER
>> + bool
>> +
>>  config PADATA
>>   depends on SMP
>>   bool
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index f9123a82cbb6..c07a53aa543c 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -1016,12 +1016,14 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
>>   rq_clock_skip_update(rq, true);
>>  }
>>
>> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>>  static ATOMIC_NOTIFIER_HEAD(task_migration_notifier);
>>
>>  void register_task_migration_notifier(struct notifier_block *n)
>>  {
>>   atomic_notifier_chain_register(&task_migration_notifier, n);
>>  }
>> +#endif
>>
>>  #ifdef CONFIG_SMP
>>  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>> @@ -1053,18 +1055,23 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>>   trace_sched_migrate_task(p, new_cpu);
>>
>>   if (task_cpu(p) != new_

[PATCH v2] net: dsa: use DEVICE_ATTR_RW to declare temp1_max

2015-04-17 Thread Vivien Didelot
Since commit da4759c (sysfs: Use only return value from is_visible for
the file mode), it is possible to reduce the permissions of a file.

So declare temp1_max with the DEVICE_ATTR_RW macro and remove the write
permission in dsa_hwmon_attrs_visible if set_temp_limit isn't provided.

Signed-off-by: Vivien Didelot 
---
 net/dsa/dsa.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/dsa/dsa.c b/net/dsa/dsa.c
index 5eaadab..079a224 100644
--- a/net/dsa/dsa.c
+++ b/net/dsa/dsa.c
@@ -124,7 +124,7 @@ static ssize_t temp1_max_store(struct device *dev,
 
return count;
 }
-static DEVICE_ATTR(temp1_max, S_IRUGO, temp1_max_show, temp1_max_store);
+static DEVICE_ATTR_RW(temp1_max);
 
 static ssize_t temp1_max_alarm_show(struct device *dev,
struct device_attribute *attr, char *buf)
@@ -159,8 +159,8 @@ static umode_t dsa_hwmon_attrs_visible(struct kobject *kobj,
if (index == 1) {
if (!drv->get_temp_limit)
mode = 0;
-   else if (drv->set_temp_limit)
-   mode |= S_IWUSR;
+   else if (!drv->set_temp_limit)
+   mode &= ~S_IWUSR;
} else if (index == 2 && !drv->get_temp_alarm) {
mode = 0;
}
-- 
2.3.5



Re: [PATCH 14/20] VFS/namei: add 'inode' arg to put_link().

2015-04-17 Thread Al Viro
On Fri, Apr 17, 2015 at 05:25:36PM +0100, Al Viro wrote:
> On Mon, Mar 23, 2015 at 01:37:40PM +1100, NeilBrown wrote:
> > @@ -1669,13 +1669,14 @@ static inline int nested_symlink(struct path *path, struct nameidata *nd)
> >  
> > do {
> > struct path link = *path;
> > +   struct inode *inode = link.dentry->d_inode;
> > void *cookie;
> >  
> > res = follow_link(&link, nd, &cookie);
> > if (res)
> > break;
> > res = walk_component(nd, path, LOOKUP_FOLLOW);
> > -   put_link(nd, &link, cookie);
> > +   put_link(nd, &link, inode, cookie);
> > } while (res > 0);
> 
> That's really unpleasant - it means increased stack footprint in the
> recursion.
> 
> Damn, maybe it's time to bite the bullet and kill the recursion completely...
> 
> What do we really need to save across the recursive call?
>   * how far did we get in the previous pathname
>   * data needed for put_link:
>   cookie
>   link body
>   dentry of link
>   vfsmount (to pin containing fs; non-RCU) or inode (RCU)
> 
> We are already saving link body in nameidata, so we could fatten that array.
> It would allow flattening link_path_walk() completely - instead of
> recursive call we would just save what needed saving and jump to the beginning
> and on exits we'd check the depth and either return or restore the saved state
> and jump back to just past the place where recursive call used to be.
> It would even save quite a bit of space in the worst case.  However, it would
> blow the stack footprint in normal cases *and* blow it even worse for the
> things that need two struct nameidata instances at once (rename(), basically).
> 5 pointers instead of 1 pointer per level - extra 32 words on stack, i.e.
> extra 256 bytes on 64bit.  Extra 0.5Kb of stack footprint on rename() is
> probably too much, especially since this "saved" stuff from its two nameidata
> instances will never be used at the same time...
> 
> Alternatively, we could just allocate about a page worth of an array when
> the depth of nesting goes beyond 2 and put this saved stuff there - at
> 5 pointers per level it would completely dispose of the depth of nesting
> limit, giving us uniform "can't traverse more than 40 symlinks per pathname
> resolution".  40 * 5 * sizeof(pointer) is what, at most 1600 bytes?  So
> even half a page would suffice for that quite comfortably...
> 
> The question is whether we'll be able to avoid blowing the I-cache footprint
> of link_path_walk() to hell while doing that; it feels like we should be,
> but we'll have to see how well does that work in reality...
> 
> I'll try to implement that (with your #3..#7 as the first steps) and see
> how well does it work; it's obviously the next cycle fodder, but hopefully
> in testable shape by -rc2...

Hmm...  Actually, right now we have 192 bytes of stack footprint per
nesting level (amd64 allmodconfig).  Which means that simply making the
array fatter would give a clean benefit at the 3rd level of recursion (symlink
encountered while traversing a symlink) for everything other than rename()...
allnoconfig+64bit gives 160 bytes per level, with the same breakeven point.

Interesting...  It might even make sense to separate that array from
struct nameidata and solve the rename() problem that way (current->nameidata
would be replaced with pointer to that sucker in such variant, of course, and
->depth would move there).  In that variant we do not get rid of nesting limit
completely, but it would probably be simpler than the one above...
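
The arithmetic in this message and the one it quotes can be checked with a short C sketch. The constants are assumptions read off the figures above: a nesting limit of 8 is inferred from "extra 32 words" at 4 extra pointers per level, and all names are hypothetical.

```c
#include <stddef.h>

enum {
	NEST_LEVELS = 8,	/* assumed nesting limit: 32 words / 4 pointers */
	SAVED_PTRS_FLAT = 5,	/* pointers saved per level in the flat scheme */
	SAVED_PTRS_NOW = 1,	/* pointers saved per level today */
	MAX_SYMLINKS = 40,	/* total symlinks per pathname resolution */
};

/* Extra bytes in one struct nameidata if the per-level array is fattened. */
static size_t extra_bytes_per_nameidata(size_t ptr_size)
{
	return (SAVED_PTRS_FLAT - SAVED_PTRS_NOW) * NEST_LEVELS * ptr_size;
}

/* Size of a separately allocated array covering every possible symlink. */
static size_t full_array_bytes(size_t ptr_size)
{
	return (size_t)MAX_SYMLINKS * SAVED_PTRS_FLAT * ptr_size;
}

/* Smallest number of recursive frames whose stack cost exceeds the extra array. */
static unsigned int breakeven_frames(size_t frame_bytes, size_t ptr_size)
{
	unsigned int frames = 1;

	while (frames * frame_bytes <= extra_bytes_per_nameidata(ptr_size))
		frames++;
	return frames;
}
```

With 8-byte pointers this gives 256 extra bytes per nameidata (512 for the two that rename() needs) and 1600 bytes for the 40-symlink array. Both 192-byte and 160-byte frames break even at two recursive frames, which matches the "3rd level" above if the top-level walk is counted as the first level.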


Re: AM335x OMAP2 common clock external fixed-clock registration

2015-04-17 Thread Michael Welling
On Fri, Apr 17, 2015 at 11:18:33AM +0100, Russell King - ARM Linux wrote:
> On Fri, Apr 17, 2015 at 11:12:03AM +0200, Sebastian Hesselbarth wrote:
> > On 17.04.2015 04:00, Michael Welling wrote:
> > >On Fri, Apr 17, 2015 at 01:23:50AM +0200, Sebastian Hesselbarth wrote:
> > >>On 17.04.2015 00:09, Michael Welling wrote:
> > >>>On Thu, Apr 16, 2015 at 10:37:19PM +0200, Sebastian Hesselbarth wrote:
> > On 16.04.2015 18:17, Michael Welling wrote:
> > [...]
> > >>>What would be the proper error path?
> > >>>What cleanup is required?
> > >>
> > >>A proper error path would be to release any claimed resource
> > >>on any error. If you look at the code, the only resources that
> > >>need to be released are the two clocks in question.
> > >
> > >So for every error return in the probe function and in the of 
> > >si5351_dt_parse
> > >it needs to clk_put first right?
> > 
> > Not quite. The driver should clk_put() every clock that it called a
> > [of_]clk_get() for. The thing is that clocks can be passed by
> > platform_data and we never claim them.
> 
> I've always said clocks (as in struct clk) should never be passed through
> platform data.
>

What is the alternative for systems that still use the old platform files?

Hypothetically speaking of course.
 
> -- 
> FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
> according to speedtest.net.


Re: [GIT PULL] First batch of KVM changes for 4.1

2015-04-17 Thread Marcelo Tosatti
On Fri, Apr 17, 2015 at 03:38:58PM +0200, Paolo Bonzini wrote:
> On 17/04/2015 15:10, Peter Zijlstra wrote:
> > On Fri, Apr 17, 2015 at 02:46:57PM +0200, Paolo Bonzini wrote:
> >> On 17/04/2015 12:55, Peter Zijlstra wrote:
> >>> Also, it looks like you already do exactly this for other things, look
> >>> at:
> >>>
> >>>   kvm_sched_in()
> >>> kvm_arch_vcpu_load()
> >>>   if (unlikely(vcpu->cpu != cpu) ... )
> >>>
> >>> So no, I don't believe for one second you need this.
> > 
> > This [...] brings us back to where we were last
> > time. There is _0_ justification for this in the patches, that alone is
> > grounds enough to reject it.
> 
> Oh, we totally agree on that.  I didn't commit that patch, but I already
> said the commit message was insufficient.
> 
> > Why should the guest task care about the physical cpu of the vcpu;
> > that's a layering fail if ever there was one.
> 
> It's totally within your right to not read the code, but then please
> don't try commenting on it.
> 
> This code:
> 
>   kvm_sched_in()
> kvm_arch_vcpu_load()
>   if (unlikely(vcpu->cpu != cpu) ... )
> 
> runs in the host.  The hypervisor obviously cares if the physical CPU of
> the VCPU changes.  It has to tell the source processor (vcpu->cpu) to
> release the VCPU's data structure and only then it can use it in the
> target processor (cpu).  No layering violation here.
> 
> The task migration notifier runs in the guest, whenever the VCPU of
> a task changes.
> 
> > Furthermore, the only thing that migration handler seems to do is
> > increment a variable that is not actually used in that file.
> 
> It's used in the vDSO, so you cannot increment it in the file that uses it.
> 
> >> And frankly, I think the static key is snake oil.  The cost of task 
> >> migration in terms of cache misses and TLB misses is in no way 
> >> comparable to the cost of filling in a structure on the stack, 
> >> dereferencing the head of the notifiers list and seeing that it's NULL.
> > 
> > The path this notifier is called from has nothing to do with those
> > costs.
> 
> How not?  The task is going to incur those costs, it's not like half
> a dozen extra instructions make any difference.  But anyway...
> 
> > And the fact you're inflicting these costs on _everyone_ for a
> > single x86_64-paravirt case is insane.
> 
> ... that's a valid objection.  Please look at the patch below.
> 
> > I've had enough of this, the below goes into sched/urgent and you can
> > come back with sane patches if and when you're ready.
> 
> Oh, please, cut the alpha male crap.
> 
> Paolo
> 
> --- 8< 
> >From 4eb9d7132e1990c0586f28af3103675416d38974 Mon Sep 17 00:00:00 2001
> From: Paolo Bonzini 
> Date: Fri, 17 Apr 2015 14:57:34 +0200
> Subject: [PATCH] sched: add CONFIG_TASK_MIGRATION_NOTIFIER
> 
> The task migration notifier is only used in x86 paravirt.  Make it
> possible to compile it out.
> 
> While at it, move some code around to ensure tmn is filled from CPU
> registers.
> 
> Signed-off-by: Paolo Bonzini 
> ---
>  arch/x86/Kconfig| 1 +
>  init/Kconfig| 3 +++
>  kernel/sched/core.c | 9 -
>  3 files changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index d43e7e1c784b..9af252c8698d 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -649,6 +649,7 @@ if HYPERVISOR_GUEST
>  
>  config PARAVIRT
>   bool "Enable paravirtualization code"
> + select TASK_MIGRATION_NOTIFIER
>   ---help---
> This changes the kernel so it can modify itself when it is run
> under a hypervisor, potentially improving performance significantly
> diff --git a/init/Kconfig b/init/Kconfig
> index 3b9df1aa35db..891917123338 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -2016,6 +2016,9 @@ source "block/Kconfig"
>  config PREEMPT_NOTIFIERS
>   bool
>  
> +config TASK_MIGRATION_NOTIFIER
> + bool
> +
>  config PADATA
>   depends on SMP
>   bool
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index f9123a82cbb6..c07a53aa543c 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -1016,12 +1016,14 @@ void check_preempt_curr(struct rq *rq, struct task_struct *p, int flags)
>   rq_clock_skip_update(rq, true);
>  }
>  
> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>  static ATOMIC_NOTIFIER_HEAD(task_migration_notifier);
>  
>  void register_task_migration_notifier(struct notifier_block *n)
>  {
>   atomic_notifier_chain_register(&task_migration_notifier, n);
>  }
> +#endif
>  
>  #ifdef CONFIG_SMP
>  void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
> @@ -1053,18 +1055,23 @@ void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
>   trace_sched_migrate_task(p, new_cpu);
>  
>   if (task_cpu(p) != new_cpu) {
> +#ifdef CONFIG_TASK_MIGRATION_NOTIFIER
>   struct task_migration_notifier tmn;
> + int from_cpu = task_cpu(p);
> +#endif
>  
>

[PATCH] thp: cleanup how khugepaged enters freezer

2015-04-17 Thread Jiri Kosina
khugepaged_do_scan() checks in every iteration whether freezing(current) 
is true, and in such case breaks out of the loop, which causes 
try_to_freeze() to be called immediately afterwards in 
khugepaged_wait_work().

If nothing else, this causes an unnecessary freezing(current) test, and also 
makes the way khugepaged enters the freezer a bit less obvious than necessary.

Let's just try to freeze directly, instead of splitting it into two 
(directly adjacent) phases.

Signed-off-by: Jiri Kosina 
---

Stumbled upon this when debugging something completely unrelated.

 mm/huge_memory.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 078832c..b3d8cd8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2799,7 +2799,7 @@ static void khugepaged_do_scan(void)
 
cond_resched();
 
-   if (unlikely(kthread_should_stop() || freezing(current)))
+   if (unlikely(kthread_should_stop() || try_to_freeze()))
break;
 
spin_lock(&khugepaged_mm_lock);
@@ -2820,8 +2820,6 @@ static void khugepaged_do_scan(void)
 
 static void khugepaged_wait_work(void)
 {
-   try_to_freeze();
-
if (khugepaged_has_work()) {
if (!khugepaged_scan_sleep_millisecs)
return;

-- 
Jiri Kosina
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] netns: deinline net_generic()

2015-04-17 Thread David Miller
From: Eric Dumazet 
Date: Fri, 17 Apr 2015 10:42:16 -0700

> On Fri, 2015-04-17 at 19:05 +0200, Denys Vlasenko wrote:
> 
>> How do you expect one to find excessively large inlines,
>> if not on allyesconfig build?
> 
> Tuning kernel sources based on allyesconfig build _size_ only is
> terrible. We could build an interpreter based kernel and maybe reduce
> its size by 50% who knows...
> 
> You are saying that all inlines should be removed, since it is obvious
> kernel size _will_ be smaller.

+1


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread David Miller
From: Tejun Heo 
Date: Fri, 17 Apr 2015 13:37:54 -0400

> Hello, David.
> 
> On Fri, Apr 17, 2015 at 01:17:12PM -0400, David Miller wrote:
>> If userland cannot run properly, it is almost certain that neither will
>> your complex reliability layer logic.
> 
> * The bulk of patches are to pipe extended log messages to console
>   drivers and let netconsole relay them to the receiver (and quite a
>   bit of refactoring in the process), which, regardless of the
>   reliability logic, is beneficial as we're currently losing
>   structured logging (dictionary) and other metadata over consoles and
>   regardless of where the reliability logic is implemented, it's a lot
>   easier to have message IDs.

I do not argue against cleanups and good restructuring of the existing
code.  But you have decided to mix that up with something that is not
exactly non-controversial.

You'd do well to separate the cleanups from the fundamental changes,
so they can be handled separately.

> * The only thing necessary for reliable transmission are timer and
>   netpoll.  There sure are cases where they go down too but there's a
>   pretty big gap between those two going down and userland getting
>   hosed, but where to put the retransmission and reliability logic
>   definitely is debatable.

I fundamentally disagree, exactly on this point.

If you take an OOPS in a software interrupt handler (basically, all of
the networking receive paths and part of the transmit paths, for
example) you're not going to be taking timer interrupts.

And that's the value of netconsole, the chance (albeit not 100%) of
getting messages in those scenarios.


Re: [GIT PULL] kdbus for 4.1-rc1

2015-04-17 Thread Andy Lutomirski
On Fri, Apr 17, 2015 at 2:19 AM, Michal Hocko  wrote:
> On Thu 16-04-15 10:04:17, Andy Lutomirski wrote:
>> On Thu, Apr 16, 2015 at 8:01 AM, David Herrmann  
>> wrote:
>> > Hi
>> >
>> > On Thu, Apr 16, 2015 at 4:34 PM, Andy Lutomirski  
>> > wrote:
>> >> Whose memcg does the pool use?
>> >
>> > The pool-owner's (i.e., the receiver's).
>> >
>> >> If it's the receiver's, and if the
>> >> receiver can configure a memcg, then it seems that even a single
>> >> receiver could probably cause the sender to block for an unlimited
>> >> amount of time.
>> >
>> > How? Which of those calls can block? I don't see how that can happen.
>>
>> I admit I don't fully understand memcg, but vfs_iter_write is
>> presumably going to need to get write access to the target pool page,
>> and that, in turn, will need that page to exist in memory and to be
>> writable, which may need to page it in and/or allocate a page.  If
>> that uses the receiver's memcg (as it should), then the receiver can
>> make it block.  Even if it doesn't use the receiver's memcg, it can
>> trigger direct reclaim, I think.
>
> Yes, memcg direct reclaim might trigger but we are no longer waiting for
> the OOM victim from non page fault paths so the time is bounded. It
> still might take quite some time, though, depending on the amount of work
> done in the direct reclaim.

Is that still true if OOM notifiers are involved?  I've lost track of
what changed there.

In any event, I'm not entirely convinced that having a broadcast send
cause, say, PID 1 to block until an unbounded number of pages in a
potentially unbounded number of memcgs are reclaimed is a good idea.

In the kdbus model's favor, I think that allowing pages of data in the
receive queue to be swapped out is potentially quite nice, but I'm
less convinced about non-full pages in the receive queue.  There's a
resource management tradeoff here, and one nice thing about AF_UNIX is
that sends are genuinely non-blocking.

--Andy


[PATCH] usb: dwc3: gadget: call gadget driver's ->suspend/->resume

2015-04-17 Thread David Cohen
From: Felipe Balbi 

When going into bus suspend/resume we _must_
call gadget driver's ->suspend/->resume callbacks
accordingly. This patch implements that very feature
which has been missing forever.

Cc:  # 3.14
Signed-off-by: Felipe Balbi 
Signed-off-by: David Cohen 
---

Hi,

This patch was introduced on v3.15.
But the issue it fixes already existed on v3.14 and v3.14 is a long term
support version.
I propose to backport it over there as well.

BR, David
---

 drivers/usb/dwc3/gadget.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 8f6738d46b14..1bb752736c32 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -2012,6 +2012,24 @@ static void dwc3_disconnect_gadget(struct dwc3 *dwc)
}
 }
 
+static void dwc3_suspend_gadget(struct dwc3 *dwc)
+{
+   if (dwc->gadget_driver && dwc->gadget_driver->suspend) {
+   spin_unlock(&dwc->lock);
+   dwc->gadget_driver->suspend(&dwc->gadget);
+   spin_lock(&dwc->lock);
+   }
+}
+
+static void dwc3_resume_gadget(struct dwc3 *dwc)
+{
+   if (dwc->gadget_driver && dwc->gadget_driver->resume) {
+   spin_unlock(&dwc->lock);
+   dwc->gadget_driver->resume(&dwc->gadget);
+   spin_lock(&dwc->lock);
+   }
+}
+
 static void dwc3_stop_active_transfer(struct dwc3 *dwc, u32 epnum)
 {
struct dwc3_ep *dep;
@@ -2391,6 +2409,23 @@ static void dwc3_gadget_linksts_change_interrupt(struct dwc3 *dwc,
 
dwc->link_state = next;
 
+   switch (next) {
+   case DWC3_LINK_STATE_U1:
+   if (dwc->speed == USB_SPEED_SUPER)
+   dwc3_suspend_gadget(dwc);
+   break;
+   case DWC3_LINK_STATE_U2:
+   case DWC3_LINK_STATE_U3:
+   dwc3_suspend_gadget(dwc);
+   break;
+   case DWC3_LINK_STATE_RESUME:
+   dwc3_resume_gadget(dwc);
+   break;
+   default:
+   /* do nothing */
+   break;
+   }
+
dev_vdbg(dwc->dev, "%s link %d\n", __func__, dwc->link_state);
 }
 
-- 
2.1.4
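
For readers who want to try the dispatch logic outside the kernel, here is a
minimal userspace sketch of it. It is a simplified model, not the dwc3 code:
struct controller, struct gadget_ops, the enum names and count_event() are
hypothetical stand-ins, and the spin_unlock()/spin_lock() pair that the patch
places around each callback is left out. Each optional callback is guarded by
testing the same pointer that is about to be called.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the dwc3 and gadget types. */
enum link_state { LINK_U0, LINK_U1, LINK_U2, LINK_U3, LINK_RESUME };
enum usb_speed { SPEED_HIGH, SPEED_SUPER };

struct gadget_ops {
	void (*suspend)(void *ctx);	/* optional, may be NULL */
	void (*resume)(void *ctx);	/* optional, may be NULL */
};

struct controller {
	const struct gadget_ops *driver;	/* NULL when no driver is bound */
	enum usb_speed speed;
	enum link_state link_state;
	void *ctx;
};

/* Example callback for illustration: counts how often it was invoked. */
static void count_event(void *ctx)
{
	++*(int *)ctx;
}

static void notify_suspend(struct controller *c)
{
	if (c->driver && c->driver->suspend)
		c->driver->suspend(c->ctx);
}

static void notify_resume(struct controller *c)
{
	if (c->driver && c->driver->resume)
		c->driver->resume(c->ctx);
}

/* Mirrors the switch added to the link state change handler. */
static void link_state_changed(struct controller *c, enum link_state next)
{
	c->link_state = next;

	switch (next) {
	case LINK_U1:
		/* As in the patch: U1 is reported as suspend only at SuperSpeed. */
		if (c->speed == SPEED_SUPER)
			notify_suspend(c);
		break;
	case LINK_U2:
	case LINK_U3:
		notify_suspend(c);
		break;
	case LINK_RESUME:
		notify_resume(c);
		break;
	default:
		/* do nothing */
		break;
	}
}
```

A driver that sets only .suspend = count_event sees one call for U3, none for
U1 at high speed, and a RESUME transition is silently ignored because .resume
is NULL.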



Re: [PATCH 2/2] arm64: add KASan support

2015-04-17 Thread David Keitel
On 04/15/2015 11:04 AM, Andrey Ryabinin wrote:
> I've pushed the most fresh thing that I have in git:
>   git://github.com/aryabinin/linux.git kasan/arm64v1
> 
> It's the same patches with two simple but important fixes on top of it.

Thanks, the two commits do fix compilation issues that I had worked around 
to get to my mapping question.

I've addressed the mapping problem using __create_page_tables in 
arch/arm64/kernel/head.S as an example.

The next roadblock I hit was running into kasan_report_error calls in 
cgroups_early_init. After a short investigation it does seem to be a false 
positive due to the kasan_zero_page size and tracking bytes being reused for 
different memory regions.

I worked around that by enabling kasan error reporting only after 
kasan_init has run. This let me get to the shell with some real KASan reports 
along the way. There were some other fixes and hacks to get there. I'll 
backtrack to evaluate which ones warrant an RFC.

 - David

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH] rcu: small rcu_dereference doc update

2015-04-17 Thread Paul E. McKenney
On Fri, Apr 17, 2015 at 04:53:15PM +, Jeff Haran wrote:
> > -Original Message-
> > From: Paul E. McKenney [mailto:paul...@linux.vnet.ibm.com]
> > Sent: Friday, April 17, 2015 7:07 AM
> > To: Milos Vyletel
> > Cc: Josh Triplett; Steven Rostedt; Mathieu Desnoyers; Lai Jiangshan;
> > Jonathan Corbet; open list:READ-COPY UPDATE...; open
> > list:DOCUMENTATION; Jeff Haran
> > Subject: Re: [PATCH] rcu: small rcu_dereference doc update
> > 
> > On Fri, Apr 17, 2015 at 12:33:36PM +0200, Milos Vyletel wrote:
> > > Make a note stating that repeated calls of rcu_dereference() may not
> > > return the same pointer if update happens while in critical section.
> > >
> > > Reported-by: Jeff Haran 
> > > Signed-off-by: Milos Vyletel 
> > 
> > Hmmm...  Seems like that should be obvious, but on the other hand, I have
> > been using RCU for more than twenty years, so my obviousness sensors
> > might need recalibration.
> > 
> > Queued for 4.2.
> > 
> > Thanx, Paul
> 
> It's just that the original text suggests repeated rcu_dereference() calls 
> are discouraged because they are ugly and not efficient on some 
> architectures. When I read that I concluded that those were the only reasons 
> not to do it, that despite the possible inefficiency it would always return 
> the same pointer. Depending on how one's code is structured, being able to do 
> this could be advantageous. Then I started looking at the code that 
> implements it and I couldn't see how it could possibly be the case. I even 
> wrote a little kernel module to prove to myself that doing this could return 
> different pointer values. If I misinterpreted the original text I figured 
> others might also. Milos even found some code in the kernel where its author 
> had done this, so it might be a widely held misunderstanding. It's easy for 
> people who have worked with rwlock_ts to think an RCU read lock works the 
> same.

Fair point, and thank you for the rationale!  Are there any other parts of
the RCU documentation that are similarly blind to your initial point of
view?  If so, it would be good for them to be fixed.

Thanx, Paul

> Thanks,
> 
> Jeff Haran
> 
> > > ---
> > >  Documentation/RCU/whatisRCU.txt | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/Documentation/RCU/whatisRCU.txt
> > > b/Documentation/RCU/whatisRCU.txt index 88dfce1..82b1b2c 100644
> > > --- a/Documentation/RCU/whatisRCU.txt
> > > +++ b/Documentation/RCU/whatisRCU.txt
> > > @@ -256,7 +256,9 @@ rcu_dereference()
> > >   If you are going to be fetching multiple fields from the
> > >   RCU-protected structure, using the local variable is of
> > >   course preferred.  Repeated rcu_dereference() calls look
> > > - ugly and incur unnecessary overhead on Alpha CPUs.
> > > + ugly, do not guarantee that same pointer will be returned
> > > + if update happened while in critical section and incur
> > > + unnecessary overhead on Alpha CPUs.
> > >
> > >   Note that the value returned by rcu_dereference() is valid
> > >   only within the enclosing RCU read-side critical section.
> > > --
> > > 2.1.0
> > >
> 
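
The point of the documentation change can be shown with a small userspace
model. This is only a sketch under stated assumptions: C11 atomics stand in
for rcu_dereference() and rcu_assign_pointer(), there are no grace periods or
reclamation, and deref(), publish(), struct config and the racing_update
parameter are names made up for illustration. The update that would normally
race in from another CPU is injected between the two loads so the effect is
deterministic.

```c
#include <stdatomic.h>
#include <stddef.h>

struct config {
	int a;
	int b;
};

/* Shared pointer; stands in for an RCU-protected pointer. */
static _Atomic(struct config *) global_cfg;

/* Stand-in for rcu_dereference(): one fresh load of the shared pointer. */
static struct config *deref(void)
{
	return atomic_load_explicit(&global_cfg, memory_order_acquire);
}

/* Stand-in for rcu_assign_pointer(): publish a new version. */
static void publish(struct config *new_cfg)
{
	atomic_store_explicit(&global_cfg, new_cfg, memory_order_release);
}

/*
 * Fragile reader: dereferences twice. If an update lands between the two
 * loads (modelled by a non-NULL racing_update), 'a' and 'b' come from
 * different versions of the structure.
 */
static int sum_two_derefs(struct config *racing_update)
{
	int a = deref()->a;

	if (racing_update)
		publish(racing_update);	/* the updater runs "concurrently" */

	return a + deref()->b;
}

/* Preferred reader: dereference once, then use only the local pointer. */
static int sum_one_deref(struct config *racing_update)
{
	struct config *cfg = deref();
	int a = cfg->a;

	if (racing_update)
		publish(racing_update);

	return a + cfg->b;
}
```

With versions {1, 2} and {10, 20}, sum_one_deref() returns 3 even when the
update is injected, while sum_two_derefs() returns 21, a mix of both
versions.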



[PATCH] sched/debug: Reduce SEQ_printf footprint

2015-04-17 Thread Joe Perches
This macro can be converted to a static function to reduce
object size.

(x86-64 defconfig, with SCHED_DEBUG)

$ size kernel/sched/debug.o*
   text    data     bss     dec     hex filename
  13885       8    4098   17991    4647 kernel/sched/debug.o.new
  20413       8    4098   24519    5fc7 kernel/sched/debug.o.old

Signed-off-by: Joe Perches 
---
 kernel/sched/debug.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index a245c1f..c7932cc 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -25,13 +25,20 @@ static DEFINE_SPINLOCK(sched_debug_lock);
  * This allows printing both to /proc/sched_debug and
  * to the console
  */
-#define SEQ_printf(m, x...)\
- do {  \
-   if (m)  \
-   seq_printf(m, x);   \
-   else\
-   printk(x);  \
- } while (0)
+__printf(2, 3)
+static void SEQ_printf(struct seq_file *m, const char *fmt, ...)
+{
+   va_list args;
+
+   va_start(args, fmt);
+
+   if (m)
+   seq_vprintf(m, fmt, args);
+   else
+   vprintk(fmt, args);
+
+   va_end(args);
+}
 
 /*
  * Ease the printing of nsec fields:
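
For anyone who wants to experiment with the same conversion outside the
kernel tree, here is a minimal userspace sketch. It assumes FILE * in place
of struct seq_file *, and vfprintf()/vprintf() in place of
seq_vprintf()/vprintk(); the name out_printf() is hypothetical and not part
of the patch, and the format attribute is the GCC/Clang spelling of what
__printf(2, 3) presumably expands to.

```c
#include <stdarg.h>
#include <stdio.h>

/*
 * Userspace stand-in for the patched SEQ_printf(): one out-of-line
 * variadic function instead of a macro expanded at every call site.
 * A non-NULL FILE * plays the role of the seq_file, stdout the role of
 * the console. Returns the number of characters written.
 */
__attribute__((format(printf, 2, 3)))
static int out_printf(FILE *m, const char *fmt, ...)
{
	va_list args;
	int n;

	va_start(args, fmt);

	if (m)
		n = vfprintf(m, fmt, args);
	else
		n = vprintf(fmt, args);

	va_end(args);

	return n;
}
```

The macro form repeats the test and both calls wherever it is used; the
function form emits them once and each call site shrinks to a single call,
which is where the text size reduction would come from.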




[PATCH] timer_list: Reduce SEQ_printf footprint

2015-04-17 Thread Joe Perches
This macro can be converted to a static function to reduce
object size.

(x86-64 defconfig)
$ size kernel/time/timer_list.o*
   text    data     bss     dec     hex filename
   4647       8       0    4655    122f kernel/time/timer_list.o.new
   6583       8       0    6591    19bf kernel/time/timer_list.o.old

Signed-off-by: Joe Perches 
---
 kernel/time/timer_list.c | 21 ++++++++++++++-------
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/time/timer_list.c b/kernel/time/timer_list.c
index e878c2e..5960af21 100644
--- a/kernel/time/timer_list.c
+++ b/kernel/time/timer_list.c
@@ -35,13 +35,20 @@ DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
  * This allows printing both to /proc/timer_list and
  * to the console (on SysRq-Q):
  */
-#define SEQ_printf(m, x...)\
- do {  \
-   if (m)  \
-   seq_printf(m, x);   \
-   else\
-   printk(x);  \
- } while (0)
+__printf(2, 3)
+static void SEQ_printf(struct seq_file *m, const char *fmt, ...)
+{
+   va_list args;
+
+   va_start(args, fmt);
+
+   if (m)
+   seq_vprintf(m, fmt, args);
+   else
+   vprintk(fmt, args);
+
+   va_end(args);
+}
 
 static void print_name_offset(struct seq_file *m, void *sym)
 {




[PATCH v9 2/3] watchdog: add watchdog_cpumask sysctl to assist nohz

2015-04-17 Thread Chris Metcalf
Change the default behavior of watchdog so it only runs on the
housekeeping cores when nohz_full is enabled at build and boot time.
Allow modifying the set of cores the watchdog is currently running
on with a new kernel.watchdog_cpumask sysctl.

If we allowed the watchdog to run on nohz_full cores, the timer
interrupts and scheduler work would prevent the desired tickless
operation on those cores.  But if we disable the watchdog globally,
then the housekeeping cores can't benefit from the watchdog
functionality.  So we allow disabling it only on some cores.
See Documentation/lockup-watchdogs.txt for more information.

Acked-by: Don Zickus 
Signed-off-by: Chris Metcalf 
---
v9: use new, new semantics of smpboot_update_cpumask_percpu_thread() [Frederic]
add and use for_each_watchdog_cpu() [Uli]
check alloc_cpumask_var for failure [Chai Wen]

v8: use new semantics of smpboot_update_cpumask_percpu_thread() [Frederic]
improve documentation in "Documentation/" and in changelog [akpm]

v7: use cpumask field instead of valid_cpu() callback

v6: use alloc_cpumask_var() [Sasha Levin]
switch from watchdog_exclude to watchdog_cpumask [Frederic]
simplify the smp_hotplug_thread API to watchdog [Frederic]
add Don's Acked-by

 Documentation/lockup-watchdogs.txt | 18 +++
 Documentation/sysctl/kernel.txt| 15 +
 include/linux/nmi.h|  3 ++
 kernel/sysctl.c|  7 +
 kernel/watchdog.c  | 63 +++---
 5 files changed, 101 insertions(+), 5 deletions(-)

diff --git a/Documentation/lockup-watchdogs.txt b/Documentation/lockup-watchdogs.txt
index ab0baa692c13..22dd6af2e4bd 100644
--- a/Documentation/lockup-watchdogs.txt
+++ b/Documentation/lockup-watchdogs.txt
@@ -61,3 +61,21 @@ As explained above, a kernel knob is provided that allows
 administrators to configure the period of the hrtimer and the perf
 event. The right value for a particular environment is a trade-off
 between fast response to lockups and detection overhead.
+
+By default, the watchdog runs on all online cores.  However, on a
+kernel configured with NO_HZ_FULL, by default the watchdog runs only
+on the housekeeping cores, not the cores specified in the "nohz_full"
+boot argument.  If we allowed the watchdog to run by default on
+the "nohz_full" cores, we would have to run timer ticks to activate
+the scheduler, which would prevent the "nohz_full" functionality
+from protecting the user code on those cores from the kernel.
+Of course, disabling it by default on the nohz_full cores means that
+when those cores do enter the kernel, by default we will not be
+able to detect if they lock up.  However, allowing the watchdog
+to continue to run on the housekeeping (non-tickless) cores means
+that we will continue to detect lockups properly on those cores.
+
+In either case, the set of cores excluded from running the watchdog
+may be adjusted via the kernel.watchdog_cpumask sysctl.  For
+nohz_full cores, this may be useful for debugging a case where the
+kernel seems to be hanging on the nohz_full cores.
diff --git a/Documentation/sysctl/kernel.txt b/Documentation/sysctl/kernel.txt
index c831001c45f1..f1697858d71c 100644
--- a/Documentation/sysctl/kernel.txt
+++ b/Documentation/sysctl/kernel.txt
@@ -923,6 +923,21 @@ and nmi_watchdog.
 
 ==
 
+watchdog_cpumask:
+
+This value can be used to control on which cpus the watchdog may run.
+The default cpumask is all possible cores, but if NO_HZ_FULL is
+enabled in the kernel config, and cores are specified with the
+nohz_full= boot argument, those cores are excluded by default.
+Offline cores can be included in this mask, and if the core is later
+brought online, the watchdog will be started based on the mask value.
+
+Typically this value would only be touched in the nohz_full case
+to re-enable cores that by default were not running the watchdog,
+if a kernel lockup was suspected on those cores.
+
+==
+
 watchdog_thresh:
 
 This value can be used to control the frequency of hrtimer and NMI
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 3d46fb4708e0..f94da0e65dea 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -67,6 +67,7 @@ extern int nmi_watchdog_enabled;
 extern int soft_watchdog_enabled;
 extern int watchdog_user_enabled;
 extern int watchdog_thresh;
+extern unsigned long *watchdog_cpumask_bits;
 extern int sysctl_softlockup_all_cpu_backtrace;
 struct ctl_table;
 extern int proc_watchdog(struct ctl_table *, int ,
@@ -77,6 +78,8 @@ extern int proc_soft_watchdog(struct ctl_table *, int ,
  void __user *, size_t *, loff_t *);
 extern int proc_watchdog_thresh(struct ctl_table *, int ,
void __user *, size_t *, loff_t *);
+extern int proc_watchdog_cpumask(struct ctl_table *, int,
+

[PATCH v9 3/3] procfs: treat parked tasks as sleeping for task state

2015-04-17 Thread Chris Metcalf
Allowing watchdog threads to be parked means that we now have the
opportunity of actually seeing persistent parked threads in the output
of /proc's stat and status files.  The existing code reported such
threads as "Running", which is kind-of true if you think of the case
where we park them as part of taking cpus offline.  But if we allow
parking them indefinitely, "Running" is pretty misleading, so we report
them as "Sleeping" instead.

We could simply report them with a new string, "Parked", but it feels
like it's a bit risky for userspace to see unexpected new values.
The scheduler does report parked tasks with a "P" in debugging output
from sched_show_task() or dump_cpu_task(), but that's a different API.

This change seemed slightly cleaner than updating the task_state_array
to have additional rows.  TASK_DEAD should be subsumed by the exit_state
bits; TASK_WAKEKILL is just a modifier; and TASK_WAKING can very
reasonably be reported as "Running" (as it is now).  Only TASK_PARKED
shows up with unreasonable output here.

Signed-off-by: Chris Metcalf 
---
v9: fix to check tsk->state, and to set to TASK_INTERRUPTIBLE

 fs/proc/array.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/proc/array.c b/fs/proc/array.c
index a3893b7505b2..2a59d061941e 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -126,6 +126,10 @@ static inline const char *get_task_state(struct task_struct *tsk)
 {
unsigned int state = (tsk->state | tsk->exit_state) & TASK_REPORT;
 
+   /* Treat parked tasks as sleeping. */
+   if (tsk->state == TASK_PARKED)
+   state = TASK_INTERRUPTIBLE;
+
BUILD_BUG_ON(1 + ilog2(TASK_REPORT) != ARRAY_SIZE(task_state_array)-1);
 
return task_state_array[fls(state)];
-- 
2.1.2
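
The lookup that this patch adjusts can be modelled in userspace as below.
This is a sketch, not the fs/proc/array.c code: the state values are
illustrative and only assumed to mirror the kernel's definitions of that
era, the macro names are simplified, and fls_u(), state_names[] and
task_state_name() are hypothetical helpers. It shows why a parked task
reads as "running" without the special case: TASK_PARKED lies outside
TASK_REPORT, so the masked value is 0 and index 0 is selected.

```c
#include <stddef.h>

/* Illustrative values, assumed to mirror the kernel's state bits. */
#define TASK_RUNNING		0
#define TASK_INTERRUPTIBLE	1
#define TASK_UNINTERRUPTIBLE	2
#define TASK_STOPPED		4
#define TASK_TRACED		8
#define EXIT_DEAD		16
#define EXIT_ZOMBIE		32
#define TASK_PARKED		512

#define TASK_REPORT	(TASK_RUNNING | TASK_INTERRUPTIBLE | \
			 TASK_UNINTERRUPTIBLE | TASK_STOPPED | \
			 TASK_TRACED | EXIT_ZOMBIE | EXIT_DEAD)

/* One name per reportable bit, indexed by fls() of the masked state. */
static const char * const state_names[] = {
	"R (running)",		/* 0 */
	"S (sleeping)",		/* 1 */
	"D (disk sleep)",	/* 2 */
	"T (stopped)",		/* 4 */
	"t (tracing stop)",	/* 8 */
	"X (dead)",		/* 16 */
	"Z (zombie)",		/* 32 */
};

/* Portable fls(): 1-based index of the highest set bit, 0 for x == 0. */
static int fls_u(unsigned int x)
{
	int n = 0;

	while (x) {
		n++;
		x >>= 1;
	}
	return n;
}

static const char *task_state_name(unsigned int state, unsigned int exit_state)
{
	unsigned int report = (state | exit_state) & TASK_REPORT;

	/* Treat parked tasks as sleeping, as the patch does. */
	if (state == TASK_PARKED)
		report = TASK_INTERRUPTIBLE;

	return state_names[fls_u(report)];
}
```

Removing the TASK_PARKED test makes task_state_name(TASK_PARKED, 0) fall
back to the first entry, which is the "Running" output the changelog
describes as misleading.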



[PATCH v9 1/3] smpboot: allow excluding cpus from the smpboot threads

2015-04-17 Thread Chris Metcalf
This change allows some cores to be excluded from running the
smp_hotplug_thread tasks.  The following commit to update
kernel/watchdog.c to use this functionality is the motivating
example, and more information on the motivation is provided there.

A new smp_hotplug_thread field, "cpumask", is introduced.  It is
managed by the smpboot subsystem and indicates, for each core, whether
the given smp_hotplug_thread should run on that core; the
cpumask is checked when deciding whether to unpark the thread.

To limit the cpumask to less than cpu_possible, you must call
smpboot_update_cpumask_percpu_thread() after registering.

Signed-off-by: Chris Metcalf 
---
v9: move cpumask into smpboot_hotplug_thread and don't let the
client initialize it either [Frederic]
use alloc_cpumask_var, not a locked static cpumask [Frederic]

v8: make cpumask only updated by smpboot subsystem [Frederic]

v7: change from valid_cpu() callback to optional cpumask field
park smpboot threads rather than just not creating them

v6: change from an "exclude" data pointer to a more generic
valid_cpu() callback [Frederic]

v5: switch from watchdog_exclude to watchdog_cpumask [Frederic]
simplify the smp_hotplug_thread API to watchdog [Frederic]

 include/linux/smpboot.h |  6 ++
 kernel/smpboot.c| 57 +++--
 2 files changed, 61 insertions(+), 2 deletions(-)

 include/linux/smpboot.h |  5 +
 kernel/smpboot.c| 55 -
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/include/linux/smpboot.h b/include/linux/smpboot.h
index d600afb21926..7c42153edfac 100644
--- a/include/linux/smpboot.h
+++ b/include/linux/smpboot.h
@@ -27,6 +27,8 @@ struct smpboot_thread_data;
  * @pre_unpark:Optional unpark function, called before the thread is
  * unparked (cpu online). This is not guaranteed to be
  * called on the target cpu of the thread. Careful!
+ * @cpumask:   Internal state.  To update which threads are unparked,
+ * call smpboot_update_cpumask_percpu_thread().
  * @selfparking:   Thread is not parked by the park function.
  * @thread_comm:   The base name of the thread
  */
@@ -41,11 +43,14 @@ struct smp_hotplug_thread {
void(*park)(unsigned int cpu);
void(*unpark)(unsigned int cpu);
void(*pre_unpark)(unsigned int cpu);
+   struct cpumask  cpumask;
boolselfparking;
const char  *thread_comm;
 };
 
 int smpboot_register_percpu_thread(struct smp_hotplug_thread *plug_thread);
 void smpboot_unregister_percpu_thread(struct smp_hotplug_thread *plug_thread);
+int smpboot_update_cpumask_percpu_thread(struct smp_hotplug_thread *plug_thread,
+const struct cpumask *);
 
 #endif
diff --git a/kernel/smpboot.c b/kernel/smpboot.c
index c697f73d82d6..0d131daf3e7f 100644
--- a/kernel/smpboot.c
+++ b/kernel/smpboot.c
@@ -232,7 +232,8 @@ void smpboot_unpark_threads(unsigned int cpu)
 
mutex_lock(&smpboot_threads_lock);
list_for_each_entry(cur, &hotplug_threads, list)
-   smpboot_unpark_thread(cur, cpu);
+   if (cpumask_test_cpu(cpu, &cur->cpumask))
+   smpboot_unpark_thread(cur, cpu);
mutex_unlock(&smpboot_threads_lock);
 }
 
@@ -258,6 +259,15 @@ static void smpboot_destroy_threads(struct smp_hotplug_thread *ht)
 {
unsigned int cpu;
 
+   /* Unpark any threads that were voluntarily parked. */
+   for_each_cpu_not(cpu, &ht->cpumask) {
+   if (cpu_online(cpu)) {
+   struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
+   if (tsk)
+   kthread_unpark(tsk);
+   }
+   }
+
/* We need to destroy also the parked threads of offline cpus */
for_each_possible_cpu(cpu) {
struct task_struct *tsk = *per_cpu_ptr(ht->store, cpu);
@@ -281,6 +291,7 @@ int smpboot_register_percpu_thread(struct smp_hotplug_thread *plug_thread)
unsigned int cpu;
int ret = 0;
 
+   cpumask_copy(&plug_thread->cpumask, cpu_possible_mask);
get_online_cpus();
mutex_lock(&smpboot_threads_lock);
for_each_online_cpu(cpu) {
@@ -316,6 +327,48 @@ void smpboot_unregister_percpu_thread(struct smp_hotplug_thread *plug_thread)
 }
 EXPORT_SYMBOL_GPL(smpboot_unregister_percpu_thread);
 
+/**
+ * smpboot_update_cpumask_percpu_thread - Adjust which per_cpu hotplug threads stay parked
+ * @plug_thread:   Hotplug thread descriptor
+ * @new:   Revised mask to use
+ *
+ * The cpumask field in the smp_hotplug_thread must not be updated directly
+ * by the client, but only by calling this fun

Re: [PATCH v3] dmaengine: xgene-dma: Fix sparse wannings and coccinelle warnings

2015-04-17 Thread Vinod Koul

>   /* Get DMA error interrupt */
> @@ -2076,7 +2035,6 @@ static struct platform_driver xgene_dma_driver = {
>   .remove = xgene_dma_remove,
>   .driver = {
>   .name = "X-Gene-DMA",
> - .owner = THIS_MODULE,
I have already applied a patch for this

>   .of_match_table = xgene_dma_of_match_ptr,
>   },
>  };
> @@ -2085,6 +2043,5 @@ module_platform_driver(xgene_dma_driver);
> 
>  MODULE_DESCRIPTION("APM X-Gene SoC DMA driver");
>  MODULE_AUTHOR("Rameshwar Prasad Sahu ");
> -MODULE_AUTHOR("Loc Ho ");
And why this?

Fixes look good though

-- 
~Vinod
>  MODULE_LICENSE("GPL");
>  MODULE_VERSION("1.0");
> --
> 1.8.2.1
> 



[GIT PULL] arch/tile changes for v4.1

2015-04-17 Thread Chris Metcalf

Linus,

Please pull the following changes for 4.1 from:

  git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile.git HEAD

These are mostly nohz_full changes, plus a smattering of minor fixes
(notably a couple for ftrace).

Chris Metcalf (5):
  tile: use si_int instead of si_ptr for compat_siginfo
  tile: support arch_irq_work_raise
  tile: support CONTEXT_TRACKING and thus NOHZ_FULL
  tile: map data region shadow of kernel as R/W
  tile: nohz: warn if nohz_full uses hypervisor shared cores

Colin Ian King (1):
  arch: tile: fix null pointer dereference on pt_regs pointer

Davidlohr Bueso (1):
  tile/elf: reorganize notify_exec()

Tony Lu (1):
  tile: ftrace: fix function_graph tracer issues

 arch/tile/Kconfig                   |  1 +
 arch/tile/include/asm/Kbuild        |  1 -
 arch/tile/include/asm/ftrace.h      |  2 ++
 arch/tile/include/asm/irq_work.h    | 14 +++
 arch/tile/include/asm/smp.h         |  1 +
 arch/tile/include/asm/thread_info.h |  9 ---
 arch/tile/include/hv/hypervisor.h   |  6 -
 arch/tile/kernel/compat_signal.c    | 11 -
 arch/tile/kernel/ftrace.c           |  6 -
 arch/tile/kernel/mcount_64.S        |  7 +-
 arch/tile/kernel/process.c          | 12 ++
 arch/tile/kernel/ptrace.c           | 22 +++--
 arch/tile/kernel/setup.c            | 23 ++
 arch/tile/kernel/single_step.c      |  3 +++
 arch/tile/kernel/smp.c              | 32 -
 arch/tile/kernel/stack.c            | 15 ++--
 arch/tile/kernel/traps.c            | 16 +++--
 arch/tile/kernel/unaligned.c        | 22 ++---
 arch/tile/mm/elf.c                  | 47 +++--
 arch/tile/mm/fault.c                | 10 +---
 arch/tile/mm/init.c                 |  7 --
 21 files changed, 201 insertions(+), 66 deletions(-)
 create mode 100644 arch/tile/include/asm/irq_work.h

--
Chris Metcalf, EZChip Semiconductor
http://www.ezchip.com



Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
Hello,

On Sat, Apr 18, 2015 at 03:20:41AM +0900, Tetsuo Handa wrote:
> I didn't mean to introduce netconsole's own version of metadata.
> I meant we don't need to implement in-kernel retry logic.

Hmmm?  I'm not really following where this discussion is headed.  No,
we don't have to put it in the kernel.  We can punt the retry part to
userland as I wrote in another message at some cost to robustness.

> If we can assume that scheduler is working, adding a kernel thread that
> does
> 
>   while (1) {
>   read messages with metadata from /dev/kmsg
>   send them using UDP network
>   }
> 
> might be easier than modifying netconsole module.

But, I mean, if we are gonna do that in kernel, we better do it
properly where it belongs.  What's up with "easier than modifying
netconsole module"?  Why is netconsole special?  And how would the
above be any less complex than a single timer function?  What am I
missing?

Thanks.

-- 
tejun


Re: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

2015-04-17 Thread Peter Zijlstra
On Fri, Apr 17, 2015 at 08:20:37PM +0200, Andi Kleen wrote:
> On Fri, Apr 17, 2015 at 04:44:07PM +0200, Peter Zijlstra wrote:
> > On Fri, Apr 17, 2015 at 02:19:58PM +0000, Liang, Kan wrote:
> > 
> > > > But that brings us to patch 1 of this series, how is that correct in the
> > > > face of this? There is an arbitrary delay (A->B) added to the period.
> > > > And the Changelog of course never did bother to make that clear.
> 
> That's how perf and other profilers always behaved. The PMI
> is not part of the period. The automatic PEBS reload is not in any way
> different. It's much faster than a PMI, but it's also not zero cost.
> 
> This is not a gap in measurement though -- there is no other code
> running during that time on that CPU. It's simply overhead from the
> measurement mechanism.
> 
> > > 
> > > OK. I will update the changelog for patch 1 as below.
> > > ---
> > > > When a fixed period is specified, this patch makes perf use the PEBS
> > > auto reload mechanism. This makes normal profiling faster, because
> > > it avoids one costly MSR write in the PMI handler.
> > 
> > > > However, the reset value will be loaded by hardware assist. There is
> > > > a small delay compared to the previous non-auto-reload mechanism.
> > > > The delay is arbitrary but very small.
> > > 
> > > What is very small? And doesn't that mean it's significant at exactly the
> > > point this patch series is aimed at, namely very short periods.
> 
> The assist cost is 400-800 cycles, assuming common cases with everything
> cached. The minimum period the patch currently uses is 10000. In that
> extreme case it can be ~10% if cycles are used.

Thanks, please include all this information.


[GIT PULL] f2fs updates for v4.1

2015-04-17 Thread Jaegeuk Kim
Hi Linus,

Could you please pull the following patches?

Thank you very much,

The following changes since commit 13a7a6ac0a11197edcd0f756a035f472b42cdf8b:

  Linux 4.0-rc2 (2015-03-03 09:04:59 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs.git tags/for-f2fs-4.1

for you to fetch changes up to 10027551ccf5459cc771c31ac8bc8e5cc8db45f8:

  f2fs: pass checkpoint reason on roll-forward recovery (2015-04-16 09:45:40 -0700)


New features are:
 o in-memory extent_cache
 o fs_shutdown to test power-off-recovery
 o use inline_data to store symlink path
 o show f2fs as a non-misc filesystem

Major fixes are to:
 o avoid CPU stalls on sync_dirty_dir_inodes
 o fix some power-off-recovery procedure
 o fix to handle broken symlink correctly
 o fix missing dot and dotdot made by sudden power cuts
 o handle wrong data index during roll-forward recovery
 o preallocate data blocks for direct_io

And it includes a bunch of minor bug fixes and cleanups.


Changman Lee (2):
  f2fs: add stat info for moved blocks by background gc
  f2fs: cleanup statement about max orphan inodes calc

Chao Yu (31):
  f2fs: remove unused inline_dentry_addr
  f2fs: introduce f2fs_update_dentry to clean up duplicated codes
  f2fs: use ->writepage in sync_meta_pages
  f2fs: fix incorrectly stat number of inline data inode
  f2fs: move ext_lock out of struct extent_info
  f2fs: simplfy a field name in struct f2fs_extent,extent_info
  f2fs: introduce f2fs_map_bh to clean codes of check_extent_cache
  f2fs: introduce universal lookup/update interface for extent cache
  f2fs: introduce infra macro and data structure of rb-tree extent cache
  f2fs: add core functions for rb-tree extent cache
  f2fs: add a mount option for rb-tree extent cache
  f2fs: enable rb-tree extent cache
  f2fs: show extent tree, node stat info in debugfs
  f2fs: add trace for rb-tree extent cache ops
  f2fs: support fast lookup in extent cache
  f2fs: switch to check FI_NO_EXTENT in f2fs_{lookup,update}_extent_cache
  f2fs: use extent cache for dir
  f2fs: fix to issue small discard in real-time mode discard
  f2fs: fix to calculate max length of contiguous free slots correctly
  f2fs: fix reference leaks in f2fs_acl_create
  f2fs: fix to truncate inline data past EOF
  f2fs: fix to check current blkaddr in __allocate_data_blocks
  f2fs: set SBI_NEED_FSCK when encountering exception in recovery
  f2fs: split set_data_blkaddr from f2fs_update_extent_cache
  f2fs: introduce __{find,grab}_extent_tree
  f2fs: initialize extent tree with on-disk extent info of inode
  f2fs: preserve extent info for extent cache
  f2fs: preallocate fallocated blocks for direct IO
  f2fs: avoid NULL pointer dereference in f2fs_xattr_advise_get
  f2fs: persist system.advise into on-disk inode
  f2fs: limit b_size of mapped bh in f2fs_map_bh

Jaegeuk Kim (24):
  f2fs: remove obsolete code
  f2fs: avoid wrong error during recovery
  f2fs: support fs shutdown
  f2fs: clear page's up-to-date if block was deallocated
  f2fs: check its block allocation to avoid producing wrong dirty pages
  f2fs: avoid to trigger writepage during POR
  f2fs: clear append/update flags once fsync is done
  f2fs: report -ENOENT for unreached data indices
  f2fs: relocate Kconfig from misc filesystems
  f2fs: fix to cover sentry_lock for block allocation
  f2fs: set buffer_new when new blocks are allocated
  f2fs: enhance multi-threads performance
  f2fs: avoid wrong f2fs_bug_on when truncating inline_data
  f2fs: avoid punch_hole overhead when releasing volatile data
  f2fs: add some tracepoints to debug volatile and atomic writes
  f2fs: fix sparse warnings
  f2fs: fix mismatching lock and unlock pages for roll-forward recovery
  f2fs: add F2FS_INLINE_DOTS to recover missing dot dentries
  f2fs: assign parent's i_mode for empty dir
  f2fs: do not increase link count during recovery
  f2fs: do not recover wrong data index
  f2fs: flush symlink path to avoid broken symlink after POR
  f2fs: avoid abnormal behavior on broken symlink
  f2fs: pass checkpoint reason on roll-forward recovery

Sebastian Andrzej Siewior (1):
  f2fs: add cond_resched() to sync_dirty_dir_inodes()

Taehee Yoo (1):
  f2fs: change 0 to false for bool type

Wanpeng Li (10):
  f2fs: introduce macro __cp_payload
  f2fs: fix the number of orphan inode blocks
  f2fs: fix block_ops trace point
  f2fs: don't need to collect dirty sit entries and flush journal when there's no dirty sit entries
  f2fs: fix max orphan inodes calculation
  f2fs: fix extent cache memory leak
  f2fs: reduce searching region of segmap whe

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote:
> On Sat, Apr 18, 2015 at 03:03:46AM +0900, Tetsuo Handa wrote:
> > packet will be sufficient for finding out whether the packets were lost and/or
> > reordered in flight.
> > 
> >   printk("Hello");
> >=> netconsole sends "0000 Hello" using UDP
> >   printk("netconsole");
> >=> netconsole sends "0001 netconsole" using UDP
> >   printk("world\n");
> >=> netconsole sends "0002 world\n" using UDP
> > 
> > It might be nice to allow administrator to prefix a sequence number
> > to netconsole messages for those who are using special receiver
> > program (e.g. ncrx) which checks that sequence number.
> 
> That said, this is pretty much what the first 12 patches do (except
> for the last printk patch, which can be taken out).  We already have
> sequencing and established format to expose them to userland - try cat
> /dev/kmsg, which btw is what local loggers on modern systems use
> anyway.  Why introduce netconsole's own version of metadata?

I didn't mean to introduce netconsole's own version of metadata.
I meant we don't need to implement in-kernel retry logic.

If we can assume that scheduler is working, adding a kernel thread that
does

  while (1) {
  read messages with metadata from /dev/kmsg
  send them using UDP network
  }

might be easier than modifying netconsole module.

> 
> Thanks.
> 
> -- 
> tejun
> 


Re: [PATCH] dmaengine: shdmac: avoid unused variable warnings

2015-04-17 Thread Vinod Koul
On Sat, Apr 11, 2015 at 12:27:58AM +0200, Arnd Bergmann wrote:
> This driver uses '#ifdef CONFIG_ARCH_SHMOBILE' and '#ifdef CONFIG_ARM'
> interchangeably in its sh_dmae_probe function, which causes a build
> warning when building for ARM without also enabling shmobile:
> 
> dma/sh/shdmac.c: In function sh_dmae_probe:
> dma/sh/shdmac.c:696:6: warning: unused variable errirq [-Wunused-variable]
> dma/sh/shdmac.c:695:16: warning: unused variable irqflags [-Wunused-variable]
> dma/sh/shdmac.c: At top level:
> dma/sh/shdmac.c:447:20: warning: sh_dmae_err defined but not used [-Wunused-function]
> 
> This changes all the #ifdef to test for CONFIG_ARCH_SHMOBILE to
> avoid that warning. An earlier patch from Laurent had fixed the warning
> for non-ARM case, but it still remained present in ARM randconfig builds.
> 
> Signed-off-by: Arnd Bergmann 
> Fixes: 52d6a5ee101bf ("DMA: shdma: Fix warnings due to declared but unused symbols")
Applied, thanks

-- 
~Vinod



Re: [PATCH V6 4/6] perf, x86: handle multiple records in PEBS buffer

2015-04-17 Thread Andi Kleen
On Fri, Apr 17, 2015 at 04:44:07PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 17, 2015 at 02:19:58PM +0000, Liang, Kan wrote:
> 
> > > But that brings us to patch 1 of this series, how is that correct in the
> > > face of this? There is an arbitrary delay (A->B) added to the period.
> > > And the Changelog of course never did bother to make that clear.

That's how perf and other profilers always behaved. The PMI
is not part of the period. The automatic PEBS reload is not in any way
different. It's much faster than a PMI, but it's also not zero cost.

This is not a gap in measurement though -- there is no other code
running during that time on that CPU. It's simply overhead from the
measurement mechanism.

> > 
> > OK. I will update the changelog for patch 1 as below.
> > ---
> > When a fixed period is specified, this patch makes perf use the PEBS
> > auto reload mechanism. This makes normal profiling faster, because
> > it avoids one costly MSR write in the PMI handler.
> 
> > However, the reset value will be loaded by hardware assist. There is
> > a small delay compared to the previous non-auto-reload mechanism.
> > The delay is arbitrary but very small.
> 
> What is very small? And doesn't that mean it's significant at exactly the
> point this patch series is aimed at, namely very short periods.

The assist cost is 400-800 cycles, assuming common cases with everything
cached. The minimum period the patch currently uses is 10000. In that
extreme case it can be ~10% if cycles are used.
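
As a rough worked example of the figures above, here is a minimal sketch in C
that expresses the assist cost as a fraction of the sampling period. The
function names are hypothetical, and the period passed in is an illustrative
assumption rather than a measured value.

```c
#include <assert.h>
#include <stdio.h>

/* Assist cost as a percentage of the sampling period (both in cycles). */
double assist_overhead_pct(unsigned long assist_cycles,
			   unsigned long period_cycles)
{
	assert(period_cycles > 0);
	return 100.0 * (double)assist_cycles / (double)period_cycles;
}

/* Print the overhead range for the 400-800 cycle assist cost quoted above. */
void print_overhead_range(unsigned long period_cycles)
{
	printf("period %lu cycles: %.1f%% to %.1f%% overhead\n",
	       period_cycles,
	       assist_overhead_pct(400, period_cycles),
	       assist_overhead_pct(800, period_cycles));
}
```

Calling print_overhead_range(10000) reports 4.0% to 8.0%, which is the same
order as the ~10% worst case; longer periods shrink the share proportionally.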

-Andi
-- 
a...@linux.intel.com -- Speaking for myself only.


Re: [PATCH] dmaengine: fix platform_no_drv_owner.cocci warnings

2015-04-17 Thread Vinod Koul
On Sun, Apr 12, 2015 at 02:18:34AM +0800, kbuild test robot wrote:
> drivers/dma/xgene-dma.c:2079:3-8: No need to set .owner here. The core will do it.
> 
>  Remove .owner field if calls are used which set it automatically
> 
> Generated by: scripts/coccinelle/api/platform_no_drv_owner.cocci
Applied, thanks

-- 
~Vinod



Re: [PATCH] dmaengine: pch_dma: fix memory leak on failure path in pch_dma_probe()

2015-04-17 Thread Vinod Koul
On Sat, Apr 11, 2015 at 01:28:41AM +0300, Alexey Khoroshilov wrote:
> Memory allocated for pch_dma is not deallocated in case of failure
> in pch_dma_probe().
> 
> Found by Linux Driver Verification project (linuxtesting.org).
> 
Applied, thanks

-- 
~Vinod



Re: [PATCH] dmaengine: at_xdmac: unlock spin lock before return

2015-04-17 Thread Vinod Koul
On Tue, Apr 07, 2015 at 04:42:45PM +0200, Niklas Cassel wrote:
> Signed-off-by: Niklas Cassel 
> ---
>  drivers/dma/at_xdmac.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/dma/at_xdmac.c b/drivers/dma/at_xdmac.c
> index d9891d3..933e4b3 100644
> --- a/drivers/dma/at_xdmac.c
> +++ b/drivers/dma/at_xdmac.c
> @@ -1154,8 +1154,10 @@ static int at_xdmac_device_resume(struct dma_chan *chan)
>  	dev_dbg(chan2dev(chan), "%s\n", __func__);
>  
>  	spin_lock_bh(&atchan->lock);
> -	if (!at_xdmac_chan_is_paused(atchan))
> +	if (!at_xdmac_chan_is_paused(atchan)) {
> +		spin_unlock_bh(&atchan->lock);
>  		return 0;
> +	}
>  
>  	at_xdmac_write(atxdmac, AT_XDMAC_GRWR, atchan->mask);
>  	clear_bit(AT_XDMAC_CHAN_IS_PAUSED, &atchan->status);
> -- 
> 2.1.4
> 

Applied now
-- 
~Vinod
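
The pattern fixed by the patch above (an early return that leaves a lock held)
can be reproduced in userspace. This is only a sketch: struct chan, chan_lock()
and chan_unlock() are hypothetical stand-ins built on a C11 atomic_flag, not
the kernel's spin_lock_bh() API.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct chan {
	atomic_flag lock;	/* toy spin lock standing in for atchan->lock */
	bool paused;
	int resumes;		/* how many times the channel was resumed */
};

static void chan_lock(struct chan *c)
{
	while (atomic_flag_test_and_set_explicit(&c->lock, memory_order_acquire))
		;	/* spin until the flag was previously clear */
}

static void chan_unlock(struct chan *c)
{
	atomic_flag_clear_explicit(&c->lock, memory_order_release);
}

int chan_resume(struct chan *c)
{
	assert(c);
	chan_lock(c);
	if (!c->paused) {
		/* The early exit must drop the lock too. */
		chan_unlock(c);
		return 0;
	}
	c->paused = false;
	c->resumes++;
	chan_unlock(c);
	return 0;
}
```

Without the chan_unlock() in the early-exit branch, the next chan_lock() on a
channel that was not paused would spin forever.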


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
On Sat, Apr 18, 2015 at 03:03:46AM +0900, Tetsuo Handa wrote:
> If you tolerate loss of kernel messages, adding sequence number to each UDP

Well, there's a difference between accepting loss when log buffer
overflows and when any packets get lost.

> packet will be sufficient for finding out whether the packets were lost and/or
> reordered in flight.
> 
>   printk("Hello");
>=> netconsole sends "0000 Hello" using UDP
>   printk("netconsole");
>=> netconsole sends "0001 netconsole" using UDP
>   printk("world\n");
>=> netconsole sends "0002 world\n" using UDP
> 
> It might be nice to allow administrator to prefix a sequence number
> to netconsole messages for those who are using special receiver
> program (e.g. ncrx) which checks that sequence number.

That said, this is pretty much what the first 12 patches do (except
for the last printk patch, which can be taken out).  We already have
sequencing and established format to expose them to userland - try cat
/dev/kmsg, which btw is what local loggers on modern systems use
anyway.  Why introduce netconsole's own version of metadata?

Thanks.

-- 
tejun


Re: [PATCH] netns: deinline net_generic()

2015-04-17 Thread Denys Vlasenko
On 04/17/2015 07:42 PM, Eric Dumazet wrote:
> On Fri, 2015-04-17 at 19:05 +0200, Denys Vlasenko wrote:
>> How do you expect one to find excessively large inlines,
>> if not on allyesconfig build?
> 
> Tuning kernel sources based on allyesconfig build _size_ only is
> terrible. We could build an interpreter based kernel and maybe reduce
> its size by 50% who knows...
> 
> You are saying that all inline should be removed, since it is obvious
> kernel size _will_ be smaller.

I am not saying that. That would be stupid.




Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
Just a bit of addition.

On Fri, Apr 17, 2015 at 01:37:54PM -0400, Tejun Heo wrote:
> Up to patch 12, it's just the same mechanism transferring extended
> messages.  It doesn't add any smartness to netconsole per-se except
> that it can now emit messages with metadata headers.  What do you
> think about them?

So, as long as netconsole can send messages with metadata header,
moving the reliability part to userland is trivial.  All that's
necessary is a program which follows /dev/kmsg, keeps the unacked
sequences and implements the same retransmission mechanism.  It'd be
less robust in certain failure scenarios and a bit more cumbersome to
set up but nothing major and if we do that there'd be no reason to
keep the userland part in the kernel tree.
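
As a sketch of the first step such a userland program would need, the code
below pulls the sequence number out of one /dev/kmsg record. It assumes the
documented record layout "prio,seq,timestamp,flags;text"; struct kmsg_hdr and
kmsg_parse() are hypothetical names, and tracking unacked sequences and
retransmission are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct kmsg_hdr {
	unsigned int prio;		/* syslog priority/facility prefix */
	unsigned long long seq;		/* record sequence number */
	unsigned long long ts_usec;	/* timestamp in microseconds */
	const char *text;		/* points into the caller's buffer */
};

/* Parse the header of one record; returns 0 on success, -1 on bad input. */
int kmsg_parse(const char *rec, struct kmsg_hdr *h)
{
	const char *semi;

	assert(rec && h);
	if (sscanf(rec, "%u,%llu,%llu,", &h->prio, &h->seq, &h->ts_usec) != 3)
		return -1;
	semi = strchr(rec, ';');	/* header and text are split by ';' */
	if (!semi)
		return -1;
	h->text = semi + 1;
	return 0;
}
```

A sender could key its table of unacknowledged messages on h.seq and drop an
entry once the receiver acknowledges that sequence number.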

If the retransmission and timer parts are bothering, moving those to
userland sounds like an acceptable compromise.

Thanks.

-- 
tejun


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote:
> > printk() cannot wait for ack. Trying to wait for ack would break something.
> > How can you transmit subsequent kernel messages which failed to enqueue
> > due to waiting for ack for previous kernel messages?
> 
> Well, if log buffer overflows and the messages aren't at the logging
> target yet, they're lost.  It's the same as doing dmesg on localhost,
> isn't it?  This doesn't have much to do with where the reliability
> logic is implemented and is exactly the same with local logging too.

If you tolerate loss of kernel messages, adding sequence number to each UDP
packet will be sufficient for finding out whether the packets were lost and/or
reordered in flight.

  printk("Hello");
   => netconsole sends "0000 Hello" using UDP
  printk("netconsole");
   => netconsole sends "0001 netconsole" using UDP
  printk("world\n");
   => netconsole sends "0002 world\n" using UDP

It might be nice to allow administrator to prefix a sequence number
to netconsole messages for those who are using special receiver
program (e.g. ncrx) which checks that sequence number.
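
The receiver-side check could look roughly like the sketch below. It assumes
each UDP payload starts with a decimal sequence number followed by a single
space; struct seq_tracker and seq_check() are hypothetical names, not part of
netconsole or ncrx.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

enum seq_status { SEQ_OK, SEQ_LOST, SEQ_REORDERED, SEQ_MALFORMED };

struct seq_tracker {
	unsigned long next;	/* sequence number expected next */
};

/*
 * Classify one payload of the assumed form "<decimal seq> <text>".
 * On SEQ_LOST, *missing holds how many packets were skipped.
 */
enum seq_status seq_check(struct seq_tracker *t, const char *payload,
			  unsigned long *missing)
{
	char *end;
	unsigned long seq;

	assert(t && payload && missing);
	*missing = 0;
	if (!isdigit((unsigned char)payload[0]))
		return SEQ_MALFORMED;
	seq = strtoul(payload, &end, 10);
	if (*end != ' ')
		return SEQ_MALFORMED;
	if (seq < t->next)
		return SEQ_REORDERED;	/* late or duplicate packet */
	*missing = seq - t->next;
	t->next = seq + 1;
	return *missing ? SEQ_LOST : SEQ_OK;
}
```

This only detects loss and reordering; it cannot recover the missing text,
which is the part that needs acks and retransmission.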

> 
> Thanks.
> 
> -- 
> tejun
> 


Re: [LKP] [mtd] 6b44d910ae7: WARNING: CPU: 0 PID: 0 at kernel/locking/lockdep.c:3547 check_flags+0xae/0x17b()

2015-04-17 Thread Frans Klaver
On Fri, Apr 17, 2015 at 6:31 PM, Brian Norris
 wrote:
> On Fri, Apr 17, 2015 at 04:19:52PM +0200, Frans Klaver wrote:
>> > On Thu, Apr 16, 2015 at 7:27 AM, Huang Ying  wrote:
>> > I'm happy to
>> > send in a patch that restores "mtd->owner = THIS_MODULE" with these
>> > drivers, if that's preferred.
>>
>> Long story short, I should fix it.
>>
>> Brian, do you prefer a rework of the series or a patch fixing it?
>
> I don't have the time to handle much at the moment, so I've just removed
> your series from my +master branch. It won't go in my 4.1-rc1 pull
> request and will likely be delayed to 4.2. Feel free to send a new
> series, with a good changelog to describe the v1->v2 delta.

That will do.

Thanks,
Frans


[PATCH] __bitmap_parselist: fix bug in empty string handling

2015-04-17 Thread Chris Metcalf
bitmap_parselist("", &mask, nmaskbits) will erroneously set bit
zero in the mask.  The same bug is visible in cpumask_parselist()
since it is layered on top of the bitmask code, e.g. if you boot with
"isolcpus=", you will actually end up with cpu zero isolated.

The bug was introduced in commit 4b060420a596 ("bitmap, irq: add
smp_affinity_list interface to /proc/irq") when bitmap_parselist()
was generalized to support userspace as well as kernelspace.

Signed-off-by: Chris Metcalf 
Cc: sta...@vger.kernel.org
---
 lib/bitmap.c | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/lib/bitmap.c b/lib/bitmap.c
index d456f4c15a9f..c04448bf1271 100644
--- a/lib/bitmap.c
+++ b/lib/bitmap.c
@@ -536,12 +536,12 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 	unsigned a, b;
 	int c, old_c, totaldigits;
 	const char __user __force *ubuf = (const char __user __force *)buf;
-	int exp_digit, in_range;
+	int at_start, in_range;
 
 	totaldigits = c = 0;
 	bitmap_zero(maskp, nmaskbits);
 	do {
-		exp_digit = 1;
+		at_start = 1;
 		in_range = 0;
 		a = b = 0;
 
@@ -570,11 +570,10 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 				break;
 
 			if (c == '-') {
-				if (exp_digit || in_range)
+				if (at_start || in_range)
 					return -EINVAL;
 				b = 0;
 				in_range = 1;
-				exp_digit = 1;
 				continue;
 			}
 
@@ -584,16 +583,18 @@ static int __bitmap_parselist(const char *buf, unsigned int buflen,
 			b = b * 10 + (c - '0');
 			if (!in_range)
 				a = b;
-			exp_digit = 0;
+			at_start = 0;
 			totaldigits++;
 		}
 		if (!(a <= b))
 			return -EINVAL;
 		if (b >= nmaskbits)
 			return -ERANGE;
-		while (a <= b) {
-			set_bit(a, maskp);
-			a++;
+		if (!at_start) {
+			while (a <= b) {
+				set_bit(a, maskp);
+				a++;
+			}
 		}
 	} while (buflen && c == ',');
 	return 0;
-- 
2.1.2
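
The behaviour change can be tried outside the kernel with a simplified
userspace analog of the patched loop. This is a sketch under assumptions:
parselist64() is a hypothetical name, it takes a NUL-terminated string, covers
at most 64 bits, and omits the kernel's embedded-whitespace check.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>

/* Parse a list such as "0-3,5" into *mask; returns 0 or a negative errno. */
int parselist64(const char *buf, uint64_t *mask, unsigned int nbits)
{
	unsigned int a, b;
	int c, at_start, in_range;

	assert(buf && mask && nbits >= 1 && nbits <= 64);
	*mask = 0;
	do {
		at_start = 1;
		in_range = 0;
		a = b = 0;

		while ((c = (unsigned char)*buf) != '\0') {
			buf++;
			if (isspace(c))
				continue;
			if (c == ',')
				break;
			if (c == '-') {
				if (at_start || in_range)
					return -EINVAL;
				b = 0;
				in_range = 1;
				continue;
			}
			if (!isdigit(c))
				return -EINVAL;
			b = b * 10 + (unsigned int)(c - '0');
			if (!in_range)
				a = b;
			at_start = 0;
		}
		if (a > b)
			return -EINVAL;
		if (b >= nbits)
			return -ERANGE;
		if (!at_start) {	/* the fix: an empty field sets nothing */
			for (; a <= b; a++)
				*mask |= UINT64_C(1) << a;
		}
	} while (c == ',');
	return 0;
}
```

With the at_start guard removed, parselist64("", &mask, 64) would leave bit 0
set, which is the "isolcpus=" symptom described in the changelog.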



Re: [PATCH v2] dma: vdma: Fix compilation warnings

2015-04-17 Thread Vinod Koul
On Mon, Mar 30, 2015 at 06:48:29PM +0530, Kedareswara rao Appana wrote:
> This patch fixes the following compilation warnings.
> In file included from drivers/dma/xilinx/xilinx_vdma.c:26:0:
> include/linux/dmapool.h:18:4: warning: 'struct device' declared inside parameter list
> size_t size, size_t align, size_t allocation);
> ^
> include/linux/dmapool.h:18:4: warning: its scope is only this definition or declaration, which is probably not what you want
> include/linux/dmapool.h:31:7: warning: 'struct device' declared inside parameter list
>size_t size, size_t align, size_t allocation);
>^
> drivers/dma/xilinx/xilinx_vdma.c: In function 'xilinx_vdma_alloc_chan_resources':
> drivers/dma/xilinx/xilinx_vdma.c:501:20: warning: passing argument 2 of 'dma_pool_create' from incompatible pointer type
>   chan->desc_pool = dma_pool_create("xilinx_vdma_desc_pool",
> ^
> In file included from drivers/dma/xilinx/xilinx_vdma.c:26:0:
> include/linux/dmapool.h:17:18: note: expected 'struct device *' but argument is of type 'struct device *'
>  struct dma_pool *dma_pool_create(const char *name, struct device *dev, .
> 
I have applied this now

-- 
~Vinod


Re: [PATCH 1/5 v2] blk-mq: Add prep/unprep support

2015-04-17 Thread Christoph Hellwig
On Fri, Apr 17, 2015 at 10:15:46AM +0200, Matias Bjørling wrote:
> Just the prep/unprep, or other pieces as well?

All of it - it's functionality that lies logically below the block
layer, so that's where it should be handled.

In fact it should probably work similar to the mtd subsystem - that is
have its own API for low level drivers, and just export a block driver
as one consumer on the top side.

> In the future, applications can have an API to get/put flash block directly.
> (using the blk_nvm_[get/put]_blk interface).

s/application/filesystem/?


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
On Sat, Apr 18, 2015 at 02:43:30AM +0900, Tetsuo Handa wrote:
> > Up to patch 12, it's just the same mechanism transferring extended
> > messages.  It doesn't add any smartness to netconsole per-se except
> > that it can now emit messages with metadata headers.  What do you
> > think about them?
> 
> So, this patchset aims for obtaining kernel messages under problematic
> condition. You have to hold messages until ack is delivered. This means
> that printk buffer can become full before burst messages (e.g. SysRq-t)
> are acked due to packet loss in the network.
> 
> printk() cannot wait for ack. Trying to wait for ack would break something.
> How can you transmit subsequent kernel messages which failed to enqueue
> due to waiting for ack for previous kernel messages?

Well, if log buffer overflows and the messages aren't at the logging
target yet, they're lost.  It's the same as doing dmesg on localhost,
isn't it?  This doesn't have much to do with where the reliability
logic is implemented and is exactly the same with local logging too.

Thanks.

-- 
tejun


Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tetsuo Handa
Tejun Heo wrote:
> Hello, David.
> 
> On Fri, Apr 17, 2015 at 01:17:12PM -0400, David Miller wrote:
> > If userland cannot run properly, it is almost certain that neither will
> > your complex reliability layer logic.
> 
> * The bulk of patches are to pipe extended log messages to console
>   drivers and let netconsole relay them to the receiver (and quite a
>   bit of refactoring in the process), which, regardless of the
>   reliability logic, is beneficial as we're currently losing
>   structured logging (dictionary) and other metadata over consoles and
>   regardless of where the reliability logic is implemented, it's a lot
>   easier to have messages IDs.
> 
> * The only thing necessary for reliable transmission are timer and
>   netpoll.  There sure are cases where they go down too but there's a
>   pretty big gap between those two going down and userland getting
>   hosed, but where to put the retransmission and reliability logic
>   definitely is debatable.
> 
> * That said, the "reliability" part of the patch series are just two
>   patches - 13 and 14, both of which are actually pretty simple.
> 
> > I tend to agree with Tetsuo, that in-kernel netconsole should remain
> > as simple as possible and once it starts to have any smarts and less
> > trivial logic the job belongs in userspace.
> 
> Up to patch 12, it's just the same mechanism transferring extended
> messages.  It doesn't add any smartness to netconsole per-se except
> that it can now emit messages with metadata headers.  What do you
> think about them?

So, this patchset aims for obtaining kernel messages under problematic
condition. You have to hold messages until ack is delivered. This means
that printk buffer can become full before burst messages (e.g. SysRq-t)
are acked due to packet loss in the network.

printk() cannot wait for ack. Trying to wait for ack would break something.
How can you transmit subsequent kernel messages which failed to enqueue
due to waiting for ack for previous kernel messages?

> 
> Thanks.
> 
> -- 
> tejun


Re: [PATCH] netns: deinline net_generic()

2015-04-17 Thread Eric Dumazet
On Fri, 2015-04-17 at 19:05 +0200, Denys Vlasenko wrote:

> How do you expect one to find excessively large inlines,
> if not on allyesconfig build?

Tuning kernel sources based on allyesconfig build _size_ only is
terrible. We could build an interpreter based kernel and maybe reduce
its size by 50% who knows...

You are saying that all inline should be removed, since it is obvious
kernel size _will_ be smaller.

That is an ... interesting idea, but hardly related to net_generic().





Re: [PATCH V6 07/10] sched: add a macro to ref all CLONE_NEW* flags

2015-04-17 Thread Peter Zijlstra
On Fri, Apr 17, 2015 at 11:42:50AM -0400, Richard Guy Briggs wrote:
> On 15/04/17, Peter Zijlstra wrote:
> > On Fri, Apr 17, 2015 at 03:35:54AM -0400, Richard Guy Briggs wrote:
> > > Added the macro CLONE_NEW_MASK_ALL to refer to all CLONE_NEW* flags.
> > 
> > A wee bit about why might be nice..
> 
> It makes the following patch much cleaner to read:
>   [PATCH V6 08/10] fork: audit on creation of new namespace(s)
>   https://lkml.org/lkml/2015/4/17/50
> 
> I was hoping it might also make a lot of other code cleaner, but most of
> the other places where multiple CLONE_NEW* flags are used, not all six
> are used together, but only 5 are used.  Ok, so it is helpful in 1 of 3:
> 
> It would actually be useful in check_unshare_flags():
>   https://github.com/torvalds/linux/blob/v3.17/kernel/fork.c#L1791
> 
> but not in copy_namespaces() or unshare_nsproxy_namespaces():
>   https://github.com/torvalds/linux/blob/v3.17/kernel/nsproxy.c#L130
>   https://github.com/torvalds/linux/blob/v3.17/kernel/nsproxy.c#L183
> 

Right, so no objections from me on this, its just that I only saw this
one patch in isolation without context and the changelog failed on
rationale.

Does it perchance make sense to fold this patch into the next patch that
actually makes use of it?
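
As a rough illustration of what such a mask could look like, here is a minimal sketch. The six flag values are the ones from include/uapi/linux/sched.h; the definition of CLONE_NEW_MASK_ALL and the helper creates_new_namespace() are assumptions made for this example, not the text of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Namespace flag values as found in include/uapi/linux/sched.h. */
#define CLONE_NEWNS   0x00020000UL /* mount namespace */
#define CLONE_NEWUTS  0x04000000UL
#define CLONE_NEWIPC  0x08000000UL
#define CLONE_NEWUSER 0x10000000UL
#define CLONE_NEWPID  0x20000000UL
#define CLONE_NEWNET  0x40000000UL

/* Assumed definition: the OR of all six CLONE_NEW* flags. */
#define CLONE_NEW_MASK_ALL (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC | \
                            CLONE_NEWUSER | CLONE_NEWPID | CLONE_NEWNET)

/* Compile-time check that the mask covers exactly the six bits above. */
static_assert(CLONE_NEW_MASK_ALL == 0x7C020000UL, "unexpected mask value");

/* Hypothetical helper: true if the caller asked for any new namespace. */
static bool creates_new_namespace(unsigned long clone_flags)
{
    return (clone_flags & CLONE_NEW_MASK_ALL) != 0;
}
```

A caller such as an audit hook could then test one mask instead of spelling out all six flags at each site.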


Re: [RFC 1/4] fs: Add generic file system event notifications

2015-04-17 Thread John Spray



On 17/04/2015 17:22, Jan Kara wrote:

On Fri 17-04-15 17:08:10, John Spray wrote:

On 17/04/2015 16:43, Jan Kara wrote:
In that case I'm confused -- why would ENOSPC be an appropriate use
of this interface if the mount being entirely blocked would be
inappropriate?  Isn't being unable to service any I/O a more
fundamental and severe thing than being up and healthy but full?

Were you intending the interface to be exclusively for data
integrity issues like checksum failures, rather than more general
events about a mount that userspace would probably like to know
about?

   Well, I'm not saying we cannot have those events for fs availability /
inavailability. I'm just saying I'd like to see some use for that first.
I don't want events to be added just because it's possible...

For ENOSPC we have thin provisioned storage and the userspace deamon
shuffling real storage underneath. So there I know the usecase.



Ah, OK.  So I can think of a couple of use cases:
 * a cluster scheduling service (think MPI jobs or docker containers) 
might check for events like this.  If it can see the cluster filesystem 
is unavailable, then it can avoid scheduling the job, so that the 
(multi-node) application does not get hung on one node with a bad 
mount.  If it sees a mount go bad (unavailable, or client evicted) 
partway through a job, then it can kill -9 the process that was relying 
on the bad mount, and go run it somewhere else.
 * Boring but practical case: a nagios health check for checking if 
mounts are OK.


We don't have to invent these event types now of course, but something 
to bear in mind.  Hopefully if/when any of the distributed filesystems 
(Lustre/Ceph/etc) choose to implement this, we can look at making the 
event types common at that time though.


BTW in any case an interface for filesystem events to userspace will be 
a useful addition, thank you!


Cheers,
John


4.0 kernel XFS filesystem crash when running AIM7's disk workload

2015-04-17 Thread Waiman Long

Hi Dave,

When I was running AIM7's disk workload on an 8-socket Westmere-EX
server with the 4.0 kernel, the kernel crashed. A set of small ramdisks was
created (ramdisk_size=271072). Those ramdisks were formatted with the XFS
filesystem before the test began. The kernel log was:


XFS (ram12): Mounting V4 Filesystem
XFS (ram12): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram12): Log size out of supported range. Continuing onwards, but if log hangs are
experienced then please report this message in the bug report.
XFS (ram12): Ending clean mount
XFS (ram13): Mounting V4 Filesystem
XFS (ram13): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram13): Log size out of supported range. Continuing onwards, but if log hangs are
experienced then please report this message in the bug report.
XFS (ram13): Ending clean mount
XFS (ram14): Mounting V4 Filesystem
XFS (ram14): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram14): Log size out of supported range. Continuing onwards, but if log hangs are
experienced then please report this message in the bug report.
XFS (ram14): Ending clean mount
XFS (ram15): Mounting V4 Filesystem
XFS (ram15): Log size 1424 blocks too small, minimum size is 1596 blocks
XFS (ram15): Log size out of supported range. Continuing onwards, but if log hangs are
experienced then please report this message in the bug report.
XFS (ram15): Ending clean mount
BUG: unable to handle kernel NULL pointer dereference at   (null)
IP: [] __memcpy+0xd/0x110
PGD 29f7655f067 PUD 29f75a80067 PMD 0
Oops:  [#1] SMP
Modules linked in: xfs exportfs libcrc32c ebtable_nat ebtables 
xt_CHECKSUM iptable_mangle bridge stp llc autofs4 ipt_REJECT 
nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables 
ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state 
nf_conntrack ip6table_filter ip6_tables ipv6 vhost_net macvtap macvlan 
vhost tun kvm_intel kvm ipmi_si ipmi_msghandler tpm_infineon iTCO_wdt 
iTCO_vendor_support wmi acpi_cpufreq microcode pcspkr serio_raw qlcnic 
be2net vxlan udp_tunnel ip6_udp_tunnel ses enclosure igb dca ptp 
pps_core lpc_ich mfd_core hpilo hpwdt sg i7core_edac edac_core 
netxen_nic ext4(E) jbd2(E) mbcache(E) sr_mod(E) cdrom(E) sd_mod(E) 
lpfc(E) qla2xxx(E) scsi_transport_fc(E) pata_acpi(E) ata_generic(E) 
ata_piix(E) hpsa(E) radeon(E) ttm(E) drm_kms_helper(E) drm(E) 
i2c_algo_bit(E) i2c_core(E) dm_mirror(E) dm_region_hash(E) dm_log(E) 
dm_mod(E)

CPU: 69 PID: 116603 Comm: xfsaild/ram5 Tainted: GE   4.0.0 #2
Hardware name: HP ProLiant DL980 G7, BIOS P66 07/30/2012
task: 8b9f7eeb4f80 ti: 8b9f7f1ac000 task.ti: 8b9f7f1ac000
RIP: 0010:[]  [] __memcpy+0xd/0x110
RSP: 0018:8b9f7f1afc10  EFLAGS: 00010206
RAX: 88102476a3cc RBX: 889ff2ab5000 RCX: 0005
RDX: 0006 RSI:  RDI: 88102476a3cc
RBP: 8b9f7f1afc18 R08: 0001 R09: 88102476a3cc
R10: 8a1f6c03ea80 R11:  R12: 8b1ff1269400
R13: 8b1f64837c98 R14: 881038701200 R15: 88102476a300
FS:  () GS:8b1fffa4() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2:  CR3: 029f7655e000 CR4: 06e0
Stack:
 a0ca8c41 8b9f7f1afc68 a0cc4803 8b9f7f1afc68
 a0cd2777 8b9f7f1afc68 8b1ff1269400 8a9f59022800
 8b1f7c932718 0003 8a9f590228e4 8b9f7f1afce8
Call Trace:
 [] ? xfs_iflush_fork+0x181/0x240 [xfs]
 [] xfs_iflush_int+0x1f3/0x320 [xfs]
 [] ? kmem_alloc+0x87/0x100 [xfs]
 [] xfs_iflush_cluster+0x295/0x380 [xfs]
 [] xfs_iflush+0xf4/0x1f0 [xfs]
 [] xfs_inode_item_push+0xea/0x130 [xfs]
 [] xfsaild_push+0x10d/0x500 [xfs]
 [] ? lock_timer_base+0x70/0x70
 [] xfsaild+0x98/0x130 [xfs]
 [] ? xfsaild_push+0x500/0x500 [xfs]
 [] ? xfsaild_push+0x500/0x500 [xfs]
 [] ? xfsaild_push+0x500/0x500 [xfs]
 [] ? kthread_freezable_should_stop+0x70/0x70
 [] ret_from_fork+0x58/0x90
 [] ? kthread_freezable_should_stop+0x70/0x70
Code: 0f b6 c0 5b c9 c3 0f 1f 84 00 00 00 00 00 e8 2b f9 ff ff 80 7b 25 
00 74 c8 eb d3 90 90 90 48 89 f8 48 89 d1 48 c1 e9 03 83 e2 07  48 
a5 89 d1 f3 a4 c3 20 4c 8b 06 4c 8b 4e 08 4c 8b 56 10 4c

RIP  [] __memcpy+0xd/0x110
 RSP 
CR2: 
---[ end trace fb8a4add69562a76 ]---

The xfs_iflush_fork+0x181/0x240 (385) IP address is at:

823             case XFS_DINODE_FMT_LOCAL:
824                     if ((iip->ili_fields & dataflag[whichfork]) &&
   0x23c0 <+336>:       movslq %ecx,%rcx
   0x23c3 <+339>:       movswl 0x0(%rcx,%rcx,1),%eax
   0x23cb <+347>:       test   %eax,0x90(%rdx)
   0x23d1 <+353>:       je     0x2350
   0x23da <+362>:       test   %edx,%edx
   0x23dc <+364>:       jle    0x2350

825                         (ifp->if_bytes > 0)) {
   0x23d7 <+359>:       mov    (%r10),%edx

826ASSERT(ifp->if_u1.if_data != NU

Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread Tejun Heo
Hello, David.

On Fri, Apr 17, 2015 at 01:17:12PM -0400, David Miller wrote:
> If userland cannot run properly, it is almost certain that neither will
> your complex reliability layer logic.

* The bulk of the patches pipe extended log messages to console
  drivers and let netconsole relay them to the receiver (with quite a
  bit of refactoring in the process).  Regardless of the reliability
  logic, this is beneficial, as we're currently losing structured
  logging (dictionary) and other metadata over consoles, and
  regardless of where the reliability logic is implemented, it's a lot
  easier to have message IDs.

* The only things necessary for reliable transmission are timer and
  netpoll.  There sure are cases where they go down too, but there's a
  pretty big gap between those two going down and userland getting
  hosed.  Where to put the retransmission and reliability logic
  definitely is debatable, though.

* That said, the "reliability" part of the patch series is just two
  patches - 13 and 14, both of which are actually pretty simple.

> I tend to agree with Tetsuo, that in-kernel netconsole should remain
> as simple as possible and once it starts to have any smarts and less
> trivial logic the job belongs in userspace.

Up to patch 12, it's just the same mechanism transferring extended
messages.  It doesn't add any smartness to netconsole per se except
that it can now emit messages with metadata headers.  What do you
think about them?

Thanks.

-- 
tejun


Re: [PATCH 3.14 00/43] 3.14.39-stable review

2015-04-17 Thread Shuah Khan
On 04/17/2015 07:28 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.14.39 release.
> There are 43 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:25:21 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.14.39-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

-- Shuah

-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [PATCH 3.10 00/34] 3.10.75-stable review

2015-04-17 Thread Shuah Khan
On 04/17/2015 07:28 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.10.75 release.
> There are 34 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:25:20 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.75-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [PATCH 3.19 000/101] 3.19.5-stable review

2015-04-17 Thread Shuah Khan
On 04/17/2015 07:27 AM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.19.5 release.
> There are 101 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sun Apr 19 13:24:43 UTC 2015.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.19.5-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 

Compiled and booted on my test system. No dmesg regressions.

-- Shuah


-- 
Shuah Khan
Sr. Linux Kernel Developer
Open Source Innovation Group
Samsung Research America (Silicon Valley)
shua...@osg.samsung.com | (970) 217-8978


Re: [PATCH v4 0/3] i2c: davinci improvements and fixes

2015-04-17 Thread grygorii.stras...@linaro.org

On 04/10/2015 06:59 PM, Wolfram Sang wrote:

On Mon, Apr 06, 2015 at 03:38:38PM +0300, grygorii.stras...@linaro.org wrote:

From: Grygorii Strashko 

This series converts the driver to use the I2C bus recovery infrastructure and
adds a Davinci I2C bus recovery mechanism based on the ICPFUNC registers.
These patches are a combination of two patches from Ben Gardiner [1] and
Mike Looijmans [2], which I've reworked to use the I2C bus recovery infrastructure.


Applied to for-next, thanks for keeping at it and providing useful info!



Thanks

--
regards,
-grygorii


Re: [PATCH v3 2/2] [media] uvcvideo: Remain runtime-suspended at sleeps

2015-04-17 Thread Alan Stern
On Fri, 17 Apr 2015, Tomeu Vizoso wrote:

> When the system goes to sleep and afterwards resumes, a significant
> amount of time is spent suspending and resuming devices that were
> already runtime-suspended.
> 
> By setting the power.force_direct_complete flag, the PM core will ignore
> the state of descendant devices and the device will be left in
> runtime suspend.
> 
> Signed-off-by: Tomeu Vizoso 
> ---
>  drivers/media/usb/uvc/uvc_driver.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/media/usb/uvc/uvc_driver.c 
> b/drivers/media/usb/uvc/uvc_driver.c
> index 5970dd6..ae75a70 100644
> --- a/drivers/media/usb/uvc/uvc_driver.c
> +++ b/drivers/media/usb/uvc/uvc_driver.c
> @@ -1945,6 +1945,8 @@ static int uvc_probe(struct usb_interface *intf,
>   "supported.\n", ret);
>   }
>  
> + intf->dev.parent->power.force_direct_complete = true;

This seems wrong.  The uvc driver is bound to intf, not to intf's
parent.  So it would be okay for the driver to set
intf->dev.power.force_direct_complete, but it's wrong to set
intf->dev.parent->power.force_direct_complete.
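
To make the distinction concrete, here is a minimal sketch using simplified stand-in structures rather than the real kernel types. The struct layouts and the helper probe_set_direct_complete() are assumptions for illustration only; force_direct_complete is the flag proposed in this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel's dev_pm_info and struct device. */
struct pm_info {
    bool force_direct_complete;
};

struct device {
    struct device *parent;
    struct pm_info power;
};

struct usb_interface {
    struct device dev;
};

/* Hypothetical probe fragment: touch only the device the driver is bound to. */
static void probe_set_direct_complete(struct usb_interface *intf)
{
    assert(intf != NULL);
    intf->dev.power.force_direct_complete = true;
    /* intf->dev.parent (the USB device) is owned by another driver: leave it alone. */
}
```

The parent's flag stays whatever its own driver decided, which is the ownership rule described above.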

Alan Stern



Re: [PATCH v3 1/2] PM / sleep: Let devices force direct_complete

2015-04-17 Thread Alan Stern
On Fri, 17 Apr 2015, Laurent Pinchart wrote:

> Hi Tomeu,
> 
> Thank you for the patch.
> 
> On Friday 17 April 2015 17:24:49 Tomeu Vizoso wrote:
> > Introduce a new per-device flag power.force_direct_complete that will
> > instruct the PM core to ignore the runtime PM status of its descendants
> > when deciding whether to let this device remain in runtime suspend when
> > the system goes into a sleep power state.
> > 
> > This is needed because otherwise dozens of drivers would have to
> > implement the prepare() callback and be runtime PM active
> > even if they don't have a 1-to-1 relationship with a piece of HW.
> 
> I'll let PM experts comment on the approach, but I believe the new flag would 
> benefit from being documented (likely in Documentation/power/devices.txt) :-)

Documentation/power/runtime_pm.txt is the right place.

However, I'm not sure that this is the sort of thing Rafael meant when 
he suggested adding a new flag.  I thought he meant the PM core would 
look at the new flag only if there was no ->prepare method at all.  
Then if the new flag was set, the PM core would act as though ->prepare 
had returned 1.  That way there would be no need to add silly little
one-line *_prepare() routines all over the place.

Maybe he had something else in mind, though...
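
A minimal sketch of that reading, using simplified stand-in structures instead of the real PM core types. The struct layouts, example_prepare() and pm_prepare_result() are hypothetical names for this illustration; only the decision logic follows the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct device;

/* Simplified stand-in for the kernel's struct dev_pm_ops. */
struct dev_pm_ops {
    int (*prepare)(struct device *dev);
};

/* Simplified stand-in for struct device; the flag is flattened for brevity. */
struct device {
    const struct dev_pm_ops *pm;
    bool force_direct_complete;
};

/* Example driver callback that asks for the normal suspend path. */
static int example_prepare(struct device *dev)
{
    (void)dev;
    return 0;
}

static const struct dev_pm_ops example_ops = { .prepare = example_prepare };

/*
 * Returns > 0 if the device may stay runtime-suspended, 0 if it must go
 * through the normal suspend path.  The flag is consulted only when the
 * driver supplies no ->prepare method at all.
 */
static int pm_prepare_result(struct device *dev)
{
    assert(dev != NULL);
    if (dev->pm && dev->pm->prepare)
        return dev->pm->prepare(dev);
    return dev->force_direct_complete ? 1 : 0;
}
```

With this shape a driver that has no opinion sets the flag once and needs no one-line *_prepare() routine, while a driver that does supply ->prepare keeps full control.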

Alan Stern



Re: [PATCH v5 0/5] powerpc8xx: Further optimisation of TLB handling

2015-04-17 Thread Scott Wood
On Fri, 2015-04-17 at 18:32 +0200, root wrote:
> This patchset provides a further optimisation of TLB handling in the 8xx.
> Changes are:
> - Not saving registers like CR when not needed
> - Adding support to any TASK_SIZE
> 
> Only the last patch of the set is changed compared to v4
> 
> Christophe Leroy (5):
>   powerpc/8xx: macro for handling CPU15 errata
>   powerpc/8xx: Handle CR out of exception PROLOG/EPILOG
>   powerpc/8xx: dont save CR in SCRATCH registers
>   powerpc/8xx: Use SPRG2 instead of DAR for saving r3
>   powerpc/8xx: Add support for TASK_SIZE greater than 0x8000
> 
>  arch/powerpc/kernel/head_8xx.S | 79 
> +++---
>  1 file changed, 51 insertions(+), 28 deletions(-)
> 

Do you really want your name in the git history to be "root"?

-Scott




Re: [PATCH] dmaengine: bcm2835: Add slave dma support

2015-04-17 Thread Noralf Trønnes

Hi Stefan,

On 17.04.2015 19:08, Stefan Wahren wrote:

Hi Noralf,

On 17.04.2015 at 00:09, Noralf Trønnes wrote:


On 15.04.2015 21:00, Stefan Wahren wrote:

Hi Noralf,

On 15.04.2015 at 11:56, Noralf Trønnes wrote:

Add slave transfer capability to BCM2835 dmaengine driver.
This patch is pulled from the bcm2708-dmaengine driver in the
Raspberry Pi repo. The work was done by Gellert Weisz.

Tested with the bcm2835-mmc driver from the same repo.


why not with the upstream kernel?



See my answer to Alexander Stein.


I read the mail, but I'm still confused. Please let me paraphrase my
last question:


Is this patch testable with upstream kernel?



Sorry, I misread you.
This patch was made against mainline 4.0-rc7, not the Raspberry Pi repo.
I then used the bcm2835-mmc driver in mainline to be able to test the
functionality.


It would be helpful to put those facts from the email to Alexander into
the patch description. Please clarify the intention of your patch.



From my point of view, the mmc driver is a discussion of its own.
This patch provides functionality that other drivers can make use of as 
well.

Martin Sperl will soon start working on DMA support for spi-bcm2835,
relying on this patch to make that happen.


Noralf.



Re: [PATCH] dmaengine: bcm2835: Add slave dma support

2015-04-17 Thread Martin Sperl

> On 17.04.2015, at 19:08, Stefan Wahren  wrote:
> I read the mail, but I'm still confused. Please let me paraphrase my last
> question:
> 
> Is this patch testable with upstream kernel?
> 
> It would be helpful to put those facts from the email to Alexander into
> the patch description. Please clarify the intention of your patch.

The spi-bcm2835 driver will probably be the first “consumer” of this
patch. But that development has just started and it obviously
requires scatter/gather support in dma-engine to work.




Re: [V4.1] Regression: Bluetooth mouse not working.

2015-04-17 Thread Marcel Holtmann
Hi Joerg,

>> On Fri, Apr 17, 2015 at 5:36 AM, Jörg Otte  wrote:
>>> The BT mouse is "dead" in v4.1.
>>> The BT mouse has been working in 4.0 and previous kernels, so this
>>> is a regression.
>> 
>> Any chance of bisecting it?
>> 
>>Linus
> I will try that.
> 
> Thanks, Jörg
 
 I first tried to bisect over all. But then I got an unbootable kernel.
 Then I did a bisect over net/bluetooth and I get the following first bad
 commit:
 
 5f5da99f1da5b01c7c45473a500c7dbb77a00958 is the first bad commit
 commit 5f5da99f1da5b01c7c45473a500c7dbb77a00958
 Author: Marcel Holtmann 
 Date:   Wed Apr 1 13:51:53 2015 -0700
 
  Bluetooth: Restrict HIDP flags to only valid ones
 
  The HIDP flags should be clearly restricted to valid ones. So this puts
  extra checks in place to ensure this.
 
  Signed-off-by: Marcel Holtmann 
  Signed-off-by: Johan Hedberg 
 
 :04 04 b51ac3634c9d44f4d9df0e7f548b524954b99c76
 63bfb47283609849f1b3b8f05fe61743ccddfee6 M  net
>>> 
>>> thanks for bisecting this. I looked at our existing userspace and
>>> restricted it to the flags that are currently in use. However, it seems that
>>> I made a mistake. What version of BlueZ userspace are you using (bluetoothd
>>> --version)?
>>> 
>>> 
>> bluetoothd --version
>> 4.98
> 
> okay. I only looked at BlueZ 5.x and that might have been my mistake. Let me 
> check this and fix this properly.

I think that I know what I screwed up here. I sent you a patch to fix this. Can 
you please test it and report back. If that fixes it for you, then I will send 
it to Linus for inclusion.
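
For context, a minimal sketch of the kind of flag check under discussion: a request is rejected if it sets any bit outside a mask of known flags. The two bit numbers are assumed to match the kernel's net/bluetooth/hidp/hidp.h, and hidp_check_flags() is a hypothetical helper for this example, not the patch that was sent. With an empty mask, every request that sets any flag at all would be rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Bit numbers assumed to match net/bluetooth/hidp/hidp.h. */
#define HIDP_VIRTUAL_CABLE_UNPLUG 0
#define HIDP_BOOT_PROTOCOL_MODE   1

#define BIT(n) (UINT32_C(1) << (n))

static_assert(HIDP_BOOT_PROTOCOL_MODE < 32, "flag must fit in a u32");

/* Hypothetical validator: reject any flag outside the known set. */
static int hidp_check_flags(uint32_t flags)
{
    const uint32_t valid_flags = BIT(HIDP_VIRTUAL_CABLE_UNPLUG) |
                                 BIT(HIDP_BOOT_PROTOCOL_MODE);

    if (flags & ~valid_flags)
        return -EINVAL;
    return 0;
}
```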

Regards

Marcel



Re: [PATCHSET] printk, netconsole: implement reliable netconsole

2015-04-17 Thread David Miller
From: Tejun Heo 
Date: Fri, 17 Apr 2015 12:28:26 -0400

> On Sat, Apr 18, 2015 at 12:35:06AM +0900, Tetsuo Handa wrote:
>> If the sender side can wait for retransmission, why can't we use
>> userspace programs (e.g. rsyslogd)?
> 
> Because the system may be oopsing, ooming or thrashing excessively,
> rendering the userland inoperable, and that's exactly when we want
> those log messages to be transmitted out of the system.

If userland cannot run properly, it is almost certain that neither will
your complex reliability layer logic.

I tend to agree with Tetsuo, that in-kernel netconsole should remain
as simple as possible and once it starts to have any smarts and less
trivial logic the job belongs in userspace.


Re: [PATCH] rcu: small rcu_dereference doc update

2015-04-17 Thread Pranith Kumar
On Fri, Apr 17, 2015 at 12:15 PM, Paul E. McKenney
 wrote:
> Sounds like a good thought for a separate patch.  Please take a look
> through the rest of the documentation -- this might well be the right
> place for such an example, but there might well be a better place.
> Is this issue mentioned in the checklist?  If not, another item might
> be good.
>

Yup, I will take a look and send a patch for this.

-- 
Pranith


Re: [RFC PATCH 3/4] hugetlbfs: add hugetlbfs_fallocate()

2015-04-17 Thread Mike Kravetz

On 04/17/2015 01:00 AM, Hillf Danton wrote:

+   clear_huge_page(page, addr, pages_per_huge_page(h));
+   __SetPageUptodate(page);
+   error = huge_add_to_page_cache(page, mapping, index);
+   if (error) {
+   put_page(page);
+   /* Keep going if we see an -EEXIST */
+   if (error != -EEXIST)
+   goto out;  /* FIXME, need to free? */
+   }
+
+   /*
+* page_put due to reference from alloc_huge_page()
+* unlock_page because locked by add_to_page_cache()
+*/
+   put_page(page);


Still needed if EEXIST?


Nope.  Good catch.

I'll fix this in the next version.
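
One possible shape of the corrected reference handling, as a minimal sketch with a mock page type. struct page, put_page() and handle_new_page() here are simplified stand-ins written for this example; the point is only that each path drops the allocation reference exactly once.

```c
#include <assert.h>
#include <errno.h>

/* Mock page with a bare reference count. */
struct page {
    int refcount;
};

static void put_page(struct page *p)
{
    assert(p->refcount > 0);    /* catches a double put */
    p->refcount--;
}

/*
 * Hypothetical loop body: add_err stands in for the result of
 * huge_add_to_page_cache().  Returns 0 to keep going with the next
 * index, or a negative errno to abort.
 */
static int handle_new_page(struct page *page, int add_err)
{
    if (add_err) {
        put_page(page);                         /* drop the allocation reference */
        return add_err == -EEXIST ? 0 : add_err; /* -EEXIST: skip to the next index */
    }

    put_page(page);     /* success: the page cache now holds its own reference */
    return 0;
}
```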
--
Mike Kravetz


Re: linux-next: manual merge of the drm tree with the v4l-dvb tree

2015-04-17 Thread Philipp Zabel
On Friday, 17.04.2015 at 04:07 +0100, Dave Airlie wrote:
> > On Wednesday, 15.04.2015 at 13:33 +1000, Stephen Rothwell wrote:
> > > Hi Dave,
> > > 
> > > Today's linux-next merge of the drm tree got a conflict in
> > > Documentation/DocBook/media/v4l/subdev-formats.xml between commit
> > > 7b0fd4568bee ("[media] v4l: Add RBG and RGB 8:8:8 media bus formats on
> > > 24 and 32 bit busses") and e8b2d7a565ae ("[media] v4l: Sort YUV formats
> > > of v4l2_mbus_pixelcode") from the v4l-dvb tree and commits 08c38458be7e
> > > ("Add BGR888_1X24 and GBR888_1X24 media bus formats"), 0fc63eb104d7
> > > ("Add YUV8_1X24 media bus format") and 203508ef52e3 ("Add
> > > RGB666_1X24_CPADHI media bus format") from the drm tree.
> > > 
> > > I fixed it up (almost certainly incorrectly - see below) and can carry
> > > the fix as necessary.  Please sort out who "owns" this file and try to
> > > coordinate updates to it.
> > 
> > Together with the corresponding fixup for 
> > include/uapi/linux/media-bus-format.h,
> > how about this:
> 
> This should never have gone into my tree if there wasn't someone in the 
> v4l tree who knew it was coming,
>
> In future please merge the media-bus-formats through both trees, providing
> a stable git tree to both maintainers to pull from, though this may not
> avoid all bad cases, it hopefully will avoid this sort of mess.

I'll try this next time.

> I'm not really sure how best to clean this one up, I think I'd want
> patches to my tree that just use the correct values, then it would just
> be a normal conflict on merging, instead of renumbering userspace visible
> values,

So far the media tree has added formats 0x100e, 0x100f, and 0x2024, so I
will send a patch for drm-next that moves these three out of the way,
leaving them unused for the merge.
The merge conflict resolution will still have to take care of the
ordering in media-bus-format.h, and the conflicts in subdev-formats.xml
are still non-trivial, but at least the constant values won't move
around anymore.

regards
Philipp



[PATCH] Fixup RGB444_1X12, RGB565_1X16, and YUV8_1X24 media bus format

2015-04-17 Thread Philipp Zabel
Change the constant values for RGB444_1X12, RGB565_1X16, and YUV8_1X24 media
bus formats in anticipation of a merge conflict with the media tree, where
the old values are already taken by RBG888_1X24, RGB888_1X32_PADHI, and
VUY8_1X24, respectively.

Signed-off-by: Philipp Zabel 
---
 Documentation/DocBook/media/v4l/subdev-formats.xml |  6 +++---
 include/uapi/linux/media-bus-format.h  | 10 +-
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/Documentation/DocBook/media/v4l/subdev-formats.xml 
b/Documentation/DocBook/media/v4l/subdev-formats.xml
index 18b71af..553a380 100644
--- a/Documentation/DocBook/media/v4l/subdev-formats.xml
+++ b/Documentation/DocBook/media/v4l/subdev-formats.xml
@@ -196,7 +196,7 @@ see .
  

  MEDIA_BUS_FMT_RGB444_1X12
- 0x100e
+ 0x1016
  
  &dash-ent-20;
  r3
@@ -326,7 +326,7 @@ see .


  MEDIA_BUS_FMT_RGB565_1X16
- 0x100f
+ 0x1017
  
  &dash-ent-16;
  r4
@@ -3049,7 +3049,7 @@ see .


  MEDIA_BUS_FMT_YUV8_1X24
- 0x2024
+ 0x2025
  
  -
  -
diff --git a/include/uapi/linux/media-bus-format.h 
b/include/uapi/linux/media-bus-format.h
index 83ea46f..73c78f1 100644
--- a/include/uapi/linux/media-bus-format.h
+++ b/include/uapi/linux/media-bus-format.h
@@ -33,13 +33,13 @@
 
 #define MEDIA_BUS_FMT_FIXED  0x0001
 
-/* RGB - next is   0x1016 */
-#define MEDIA_BUS_FMT_RGB444_1X12  0x100e
+/* RGB - next is   0x1018 */
+#define MEDIA_BUS_FMT_RGB444_1X12  0x1016
 #define MEDIA_BUS_FMT_RGB444_2X8_PADHI_BE  0x1001
 #define MEDIA_BUS_FMT_RGB444_2X8_PADHI_LE  0x1002
 #define MEDIA_BUS_FMT_RGB555_2X8_PADHI_BE  0x1003
 #define MEDIA_BUS_FMT_RGB555_2X8_PADHI_LE  0x1004
-#define MEDIA_BUS_FMT_RGB565_1X16  0x100f
+#define MEDIA_BUS_FMT_RGB565_1X16  0x1017
 #define MEDIA_BUS_FMT_BGR565_2X8_BE  0x1005
 #define MEDIA_BUS_FMT_BGR565_2X8_LE  0x1006
 #define MEDIA_BUS_FMT_RGB565_2X8_BE0x1007
@@ -56,7 +56,7 @@
 #define MEDIA_BUS_FMT_RGB888_1X7X4_JEIDA   0x1012
 #define MEDIA_BUS_FMT_ARGB_1X32  0x100d
 
-/* YUV (including grey) - next is  0x2025 */
+/* YUV (including grey) - next is  0x2026 */
 #define MEDIA_BUS_FMT_Y8_1X8   0x2001
 #define MEDIA_BUS_FMT_UV8_1X8  0x2015
 #define MEDIA_BUS_FMT_UYVY8_1_5X8  0x2002
@@ -82,7 +82,7 @@
 #define MEDIA_BUS_FMT_VYUY10_1X20  0x201b
 #define MEDIA_BUS_FMT_YUYV10_1X20  0x200d
 #define MEDIA_BUS_FMT_YVYU10_1X20  0x200e
-#define MEDIA_BUS_FMT_YUV8_1X24  0x2024
+#define MEDIA_BUS_FMT_YUV8_1X24  0x2025
 #define MEDIA_BUS_FMT_YUV10_1X30   0x2016
 #define MEDIA_BUS_FMT_AYUV8_1X32   0x2017
 #define MEDIA_BUS_FMT_UYVY12_2X12  0x201c
-- 
2.1.4
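
As a quick sanity check on the renumbering, here is a minimal sketch that verifies the three moved codes no longer collide with the three codes the media tree already uses. The names and values are taken from the patch description above; struct bus_fmt and codes_are_unique() are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct bus_fmt {
    const char *name;
    uint32_t code;
};

/* New drm-next values plus the codes already taken in the media tree. */
static const struct bus_fmt fmts[] = {
    { "RGB444_1X12",       0x1016 },
    { "RGB565_1X16",       0x1017 },
    { "YUV8_1X24",         0x2025 },
    { "RBG888_1X24",       0x100e },
    { "RGB888_1X32_PADHI", 0x100f },
    { "VUY8_1X24",         0x2024 },
};

/* Returns true if no two entries share the same code. */
static bool codes_are_unique(const struct bus_fmt *f, size_t n)
{
    assert(f != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (f[i].code == f[j].code)
                return false;
    return true;
}
```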



Re: [RFC PATCH 4/4] mm: madvise allow remove operation for hugetlbfs

2015-04-17 Thread Mike Kravetz

On 04/17/2015 12:10 AM, Hillf Danton wrote:


Now that we have hole punching support for hugetlbfs, we can
also support the MADV_REMOVE interface to it.

Signed-off-by: Dave Hansen 
Signed-off-by: Mike Kravetz 
---
  mm/madvise.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index d551475..c4a1027 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -299,7 +299,7 @@ static long madvise_remove(struct vm_area_struct *vma,

*prev = NULL;   /* tell sys_madvise we drop mmap_sem */

-   if (vma->vm_flags & (VM_LOCKED | VM_HUGETLB))
+   if (vma->vm_flags & VM_LOCKED)
return -EINVAL;

f = vma->vm_file;
--
2.1.0


After the above change offset is computed,

offset = (loff_t)(start - vma->vm_start)
+ ((loff_t)vma->vm_pgoff << PAGE_SHIFT);

and I wonder if it is correct for huge page mapping.


I think it will be correct.

The above will be a (base) page size aligned offset into the file.
This offset will be huge page aligned in the fallocate hole punch
code.

/*
 * For hole punch round up the beginning offset of the hole and
 * round down the end.
 */
hole_start = (offset + hpage_size - 1) & ~huge_page_mask(h);
hole_end = (offset + len - (hpage_size - 1)) & ~huge_page_mask(h);

Was the alignment your concern, or something else?
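
To illustrate the rounding described in the comment above, here is a minimal sketch using plain power-of-two arithmetic instead of the kernel's huge_page_mask() helper. struct hole, hole_range() and the two rounding helpers are hypothetical names for this example, and the 2 MiB huge page size in the usage note is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up or down to a power-of-two boundary. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
    return x & ~(align - 1);
}

struct hole {
    uint64_t start;
    uint64_t end;
};

/*
 * Compute the huge-page-aligned hole inside [offset, offset + len):
 * the start is rounded up and the end is rounded down, so only whole
 * huge pages are ever punched.
 */
static struct hole hole_range(uint64_t offset, uint64_t len, uint64_t hpage_size)
{
    /* hpage_size must be a non-zero power of two. */
    assert(hpage_size != 0 && (hpage_size & (hpage_size - 1)) == 0);

    struct hole h = {
        .start = round_up_pow2(offset, hpage_size),
        .end   = round_down_pow2(offset + len, hpage_size),
    };
    return h;
}
```

With a 2 MiB huge page, a request at offset 0x1000 of length 0x500000 yields the hole [0x200000, 0x400000). If the request does not cover a whole huge page, start ends up greater than or equal to end and there is nothing to punch.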
--
Mike Kravetz


Re: [PATCH] dmaengine: bcm2835: Add slave dma support

2015-04-17 Thread Stefan Wahren

Hi Noralf,

On 17.04.2015 at 00:09, Noralf Trønnes wrote:


On 15.04.2015 at 21:00, Stefan Wahren wrote:

Hi Noralf,

On 15.04.2015 at 11:56, Noralf Trønnes wrote:

Add slave transfer capability to BCM2835 dmaengine driver.
This patch is pulled from the bcm2708-dmaengine driver in the
Raspberry Pi repo. The work was done by Gellert Weisz.

Tested with the bcm2835-mmc driver from the same repo.


why not with the upstream kernel?



See my answer to Alexander Stein.


I read the mail, but I'm still confused. Please let me paraphrase my
last question:


Is this patch testable with the upstream kernel?

It would be helpful to put those facts from the email to Alexander into
the patch description. Please clarify the intention of your patch.

Thanks
Stefan




[PATCH] netns: remove BUG_ONs from net_generic()

2015-04-17 Thread Denys Vlasenko
This inline has ~500 callsites.

On 04/14/2015 08:37 PM, David Miller wrote:
> That BUG_ON() was added 7 years ago, and I don't remember it ever
> triggering or helping us diagnose something, so just remove it and
> keep the function inlined.

On x86 allyesconfig build:

    text     data      bss       dec     hex filename
82447071 22255384 20627456 125329911 77861f7 vmlinux4
82441375 22255384 20627456 125324215 7784bb7 vmlinux5prime
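
As a rough reading of the numbers above, the following sketch computes the text-size difference and an average per call site. It assumes that vmlinux4 is the build before the change and vmlinux5prime the build after it, and it treats the "~500 callsites" figure as an estimate; the function names are hypothetical.

```c
#include <assert.h>

/* Text sizes copied from the size(1) output above. */
static const long text_before = 82447071L; /* vmlinux4 (assumed: before) */
static const long text_after  = 82441375L; /* vmlinux5prime (assumed: after) */

/* Bytes of text saved by the change. */
static long text_saved(void)
{
    return text_before - text_after;
}

/* Average bytes saved per call site; n_callsites is only an estimate. */
static double saved_per_callsite(long n_callsites)
{
    assert(n_callsites > 0);
    return (double)text_saved() / (double)n_callsites;
}
```

Under those assumptions the saving is 5696 bytes of text, or a little over 11 bytes per call site for 500 call sites.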

Signed-off-by: Denys Vlasenko 
CC: Eric W. Biederman 
CC: David S. Miller 
CC: Jan Engelhardt 
CC: Jiri Pirko 
CC: linux-kernel@vger.kernel.org
CC: net...@vger.kernel.org
---
 include/net/netns/generic.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/net/netns/generic.h b/include/net/netns/generic.h
index 0931618..70e1585 100644
--- a/include/net/netns/generic.h
+++ b/include/net/netns/generic.h
@@ -38,11 +38,9 @@ static inline void *net_generic(const struct net *net, int id)
 
    rcu_read_lock();
    ng = rcu_dereference(net->gen);
-   BUG_ON(id == 0 || id > ng->len);
    ptr = ng->ptr[id - 1];
    rcu_read_unlock();
 
-   BUG_ON(!ptr);
    return ptr;
 }
 #endif
-- 
1.8.1.4
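
For readers who do not have the helper in front of them, here is a user-space model of the lookup before and after the patch. It is only a sketch: struct gen_table and both function names are hypothetical, RCU is left out entirely, and assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-namespace pointer table. */
struct gen_table {
    unsigned int len; /* number of slots in ptr[] */
    void **ptr;       /* slot i holds the data registered under id i + 1 */
};

/* Model of the lookup before the patch: ids are 1-based and checked. */
static void *gen_lookup_checked(const struct gen_table *ng, int id)
{
    void *p;

    /* was: BUG_ON(id == 0 || id > ng->len) */
    assert(id > 0 && (unsigned int)id <= ng->len);
    p = ng->ptr[id - 1];
    /* was: BUG_ON(!ptr) */
    assert(p != NULL);
    return p;
}

/*
 * Model of the lookup after the patch: same indexing, no checks.
 * The caller must guarantee 1 <= id <= ng->len.
 */
static void *gen_lookup(const struct gen_table *ng, int id)
{
    return ng->ptr[id - 1];
}
```

The two variants return the same pointer for every valid id; the difference is only what happens on an invalid id or an empty slot, which is the case the removed checks guarded against.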



Re: [PATCH] netns: deinline net_generic()

2015-04-17 Thread Denys Vlasenko
On 04/16/2015 02:38 PM, Eric Dumazet wrote:
> On Thu, 2015-04-16 at 13:14 +0200, Denys Vlasenko wrote:
> 
>> However, without BUG_ONs, function is still a bit big
>> on PREEMPT configs.
> 
> Only on allyesconfig builds, that nobody uses except to prove some points
> about code size.

How do you expect one to find excessively large inlines,
if not on an allyesconfig build?

Only by using allyesconfig can I measure how many calls
there are in the kernel. (Grepping the source is utterly unreliable
due to nested inlines and macros.)

For the record: I am not using the _full_ allyesconfig,
I do disable some debugging options which clearly aren't
ever enabled on production systems. E.g. in my config:

# CONFIG_DEBUG_KMEMLEAK_TEST is not set
# CONFIG_KASAN is not set

etc.

> If you look at net_generic(), it is mostly used from code that is
> normally in 3 modules. How many people really load them ?
> 
> net/tipc : 91 call sites
> net/sunrpc : 57
> fs/nfsd & fs/lockd : 183
> 
> Then few remaining uses in tunnels.

Grepping is far from reliable. The above missed more than half
of all calls. I disassembled vmlinux after deinlining and found
nearly 500 calls to net_generic().

> As we suggested, please just remove the BUG_ON().

Going to send the patch in a minute.
-- 
vda


