date:20150316

Re: [RFC][PATCH] ring-buffer: Replace this_cpu_{read,write} with this_cpu_ptr()

2015-03-16 Thread Christoph Lameter

On Mon, 16 Mar 2015, Steven Rostedt wrote:

> It has come to my attention that this_cpu_read/write are horrible on
> architectures other than x86. Worse yet, they actually disable
> preemption or interrupts! This caused some unexpected tracing results
> on ARM.

Well, it's just been 7 years or so. Took a long time it seems.

These would need to be implemented on the architectures to
have comparable performance.

> I may go and remove all this_cpu_read,write() calls from my code
> because of this.

You could do that with __this_cpu_* but not this_cpu_*(). Doing
it to this_cpu_* would make the operations no longer per cpu atomic. If
they do not need per cpu atomicity then you could have used __this_cpu_*
instead. And __this_cpu_* do not disable preemption or interrupts.

So please do not send patches based on gut reactions.

NAK

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [Xen-devel] [PATCH v6 07/30] PCI: Pass PCI domain number combined with root bus number

2015-03-16 Thread Manish Jaggi



On Monday 09 March 2015 08:04 AM, Yijing Wang wrote:

Now we can pass the PCI domain combined with the bus number
in a u32 argument. On arm/arm64 the PCI domain number
is assigned by pci_bus_assign_domain_nr(), so we leave
pci_scan_root_bus() and pci_create_root_bus() in arm/arm64
unchanged. A new function, pci_host_assign_domain_nr(),
will be introduced for arm/arm64 to assign the domain number
in a later patch.
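
As a rough illustration of what "domain combined with bus number in a u32" could look like, here is a minimal userspace sketch. The bit layout (domain in the upper 16 bits, bus number in the low 8 bits) and the helper names are assumptions made for this example; the actual PCI_DOMBUS() definition is not part of this excerpt and may differ.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical packing, for illustration only:
 *   bits 31..16  PCI domain (segment) number
 *   bits  7..0   root bus number
 */
static inline uint32_t pci_dombus(uint32_t domain, uint32_t bus)
{
	/* Reject values that would not fit the assumed layout. */
	assert(domain <= 0xffff && bus <= 0xff);
	return (domain << 16) | bus;
}

/* Recover the domain number from a packed value. */
static inline uint32_t pci_dombus_domain(uint32_t dombus)
{
	return dombus >> 16;
}

/* Recover the bus number from a packed value. */
static inline uint32_t pci_dombus_bus(uint32_t dombus)
{
	return dombus & 0xff;
}
```

With a single packed argument, callers such as pci_scan_root_bus() keep their existing parameter count while the core can still split the value back into its two parts.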

Hi,
I think these changes might not be required. We have made very few 
changes in the xen-pcifront to support PCI passthrough in arm64.
As per xen architecture for a domU only a single pci virtual bus is 
created and all passthrough devices are attached to it.



-manish


Signed-off-by: Yijing Wang 
CC: Richard Henderson 
CC: Ivan Kokshaysky 
CC: Matt Turner 
CC: Tony Luck 
CC: Fenghua Yu 
CC: Michal Simek 
CC: Ralf Baechle 
CC: Benjamin Herrenschmidt 
CC: Paul Mackerras 
CC: Michael Ellerman 
CC: Sebastian Ott 
CC: Gerald Schaefer 
CC: "David S. Miller" 
CC: Chris Metcalf 
CC: Thomas Gleixner 
CC: Konrad Rzeszutek Wilk 
CC: linux-al...@vger.kernel.org
CC: linux-kernel@vger.kernel.org
CC: linux-i...@vger.kernel.org
CC: linux-m...@linux-mips.org
CC: linuxppc-...@lists.ozlabs.org
CC: linux-s...@vger.kernel.org
CC: linux...@vger.kernel.org
CC: sparcli...@vger.kernel.org
CC: xen-de...@lists.xenproject.org
Signed-off-by: Bjorn Helgaas 
---
  arch/alpha/kernel/pci.c  |5 +++--
  arch/alpha/kernel/sys_nautilus.c |4 ++--
  arch/ia64/pci/pci.c  |4 ++--
  arch/ia64/sn/kernel/io_init.c|5 +++--
  arch/microblaze/pci/pci-common.c |5 +++--
  arch/mips/pci/pci.c  |4 ++--
  arch/powerpc/kernel/pci-common.c |5 +++--
  arch/s390/pci/pci.c  |5 +++--
  arch/sh/drivers/pci/pci.c|5 +++--
  arch/sparc/kernel/pci.c  |5 +++--
  arch/tile/kernel/pci.c   |5 +++--
  arch/tile/kernel/pci_gx.c|5 +++--
  arch/x86/pci/acpi.c  |7 ---
  arch/x86/pci/common.c|3 ++-
  drivers/pci/xen-pcifront.c   |5 +++--
  15 files changed, 42 insertions(+), 30 deletions(-)

diff --git a/arch/alpha/kernel/pci.c b/arch/alpha/kernel/pci.c
index 5c845ad..deb0a36 100644
--- a/arch/alpha/kernel/pci.c
+++ b/arch/alpha/kernel/pci.c
@@ -336,8 +336,9 @@ common_init_pci(void)
pci_add_resource_offset(, hose->mem_space,
hose->mem_space->start);
  
-		bus = pci_scan_root_bus(NULL, next_busno, alpha_mv.pci_ops,

-   hose, );
+   bus = pci_scan_root_bus(NULL,
+   PCI_DOMBUS(hose->index, next_busno),
+   alpha_mv.pci_ops, hose, );
if (!bus)
continue;
hose->bus = bus;
diff --git a/arch/alpha/kernel/sys_nautilus.c b/arch/alpha/kernel/sys_nautilus.c
index 700686d..be0bbeb 100644
--- a/arch/alpha/kernel/sys_nautilus.c
+++ b/arch/alpha/kernel/sys_nautilus.c
@@ -206,10 +206,10 @@ nautilus_init_pci(void)
unsigned long memtop = max_low_pfn << PAGE_SHIFT;
  
  	/* Scan our single hose.  */

-   bus = pci_scan_bus(0, alpha_mv.pci_ops, hose);
+   bus = pci_scan_bus(PCI_DOMBUS(hose->index, 0),
+   alpha_mv.pci_ops, hose);
if (!bus)
return;
-
hose->bus = bus;
pcibios_claim_one_bus(bus);
  
diff --git a/arch/ia64/pci/pci.c b/arch/ia64/pci/pci.c

index 48cc657..675749f 100644
--- a/arch/ia64/pci/pci.c
+++ b/arch/ia64/pci/pci.c
@@ -465,8 +465,8 @@ struct pci_bus *pci_acpi_scan_root(struct acpi_pci_root 
*root)
 * should handle the case here, but it appears that IA64 hasn't
 * such quirk. So we just ignore the case now.
 */
-   pbus = pci_create_root_bus(NULL, bus, _root_ops, controller,
-  >resources);
+   pbus = pci_create_root_bus(NULL, PCI_DOMBUS(domain, bus),
+   _root_ops, controller, >resources);
if (!pbus) {
pci_free_resource_list(>resources);
__release_pci_root_info(info);
diff --git a/arch/ia64/sn/kernel/io_init.c b/arch/ia64/sn/kernel/io_init.c
index 1be65eb..7e0b7f9 100644
--- a/arch/ia64/sn/kernel/io_init.c
+++ b/arch/ia64/sn/kernel/io_init.c
@@ -266,8 +266,9 @@ sn_pci_controller_fixup(int segment, int busnum, struct 
pci_bus *bus)
pci_add_resource_offset(, [1],
prom_bussoft_ptr->bs_legacy_mem);
  
-	bus = pci_scan_root_bus(NULL, busnum, _root_ops, controller,

-   );
+   bus = pci_scan_root_bus(NULL,
+   PCI_DOMBUS(controller->segment, busnum),
+   _root_ops, controller, );
if (bus == NULL) {
kfree(res);
kfree(controller);
diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c
index 6d8d173..34a32ec 100644
---

[LKP] [locking/mutex] 07d2413a61d: -3.6% unixbench.score +60.8% unixbench.time.system_time

2015-03-16 Thread Huang Ying

[ASCII trend plot truncated in this archive; only the legend below survives]

[*] bisect-good sample
[O] bisect-bad  sample

To reproduce:

apt-get install ruby
git clone git://git.kernel.org/pub/scm/linux/kernel/git/wfg/lkp-tests.git
cd lkp-tests
bin/setup-local job.yaml # the job file attached in this email
bin/run-local   job.yaml


Disclaimer:
Results have been estimated based on internal Intel analysis and are provided
for informational purposes only. Any difference in system hardware or software
design or configuration may affect actual performance.


Thanks,
Ying Huang

---
testcase: unixbench
default-monitors:
  wait: pre-test
  uptime: 
  iostat: 
  vmstat: 
  numa-numastat: 
  numa-vmstat: 
  numa-meminfo: 
  proc-vmstat: 
  proc-stat: 
  meminfo: 
  slabinfo: 
  interrupts: 
  lock_stat: 
  latency_stats: 
  softirqs: 
  bdi_dev_mapping: 
  diskstats: 
  nfsstat: 
  cpuidle: 
  cpufreq-stats: 
  turbostat: 
  pmeter: 
  sched_debug:
interval: 10
default_watchdogs:
  watch-oom: 
  watchdog: 
cpufreq_governor: performance
commit: 159e7763d517804c61a673736660a5a35f2ea5f8
model: Grantley Haswell
nr_cpu: 16
memory: 16G
hdd_partitions: 
swap_partitions: 
rootfs_partition: 
unixbench:
  test: fsdisk
testbox: lituya
tbox_group: lituya
kconfig: x86_64-rhel
enqueue_time: 2015-03-13 17:18:20.807960694 +08:00
head_commit: 159e7763d517804c61a673736660a5a35f2ea5f8
base_commit: 9eccca0843205f87c00404b663188b88eb248051
branch: next/master
kernel: "/kernel/x86_64-rhel/159e7763d517804c61a673736660a5a35f2ea5f8/vmlinuz-4.0.0-rc3-next-20150316"
user: lkp
queue: cyclic
rootfs: debian-x86_64-2015-02-07.cgz
result_root: "/result/lituya/unixbench/performance-fsdisk/debian-x86_64-2015-02-07.cgz/x86_64-rhel/159e7763d517804c61a673736660a5a35f2ea5f8/0"
job_file: "/lkp/scheduled/lituya/cyclic_unixbench-performance-fsdisk-x86_64-rhel-HEAD-159e7763d517804c61a673736660a5a35f2ea5f8-0-20150313-22163-1pkno0h.yaml"
dequeue_time: 2015-03-16 17:37:39.210757811 +08:00
max_uptime: 1211.50002
modules_initrd: "/kernel/x86_64-rhel/159e7763d517804c61a673736660a5a35f2ea5f8/modules.cgz"
bm_initrd: "/lkp/benchmarks/turbostat.cgz,/lkp/benchmarks/unixbench-debian.cgz,/lkp/benchmarks/unixbench.cgz"
job_state: finished
loadavg: 8.28 4.83 1.96 1/211 5736
start_time: '1426498684'
end_time: '1426498982'
version: "/lkp/lkp/.src-20150316-152133"
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu10/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu11/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu12/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu13/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu14/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu15/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu4/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu5/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu6/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu7/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu8/cpufreq/scaling_governor
echo performance > /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
./Run fsdisk
___
LKP mailing list
l...@linux.intel.com

[Patch v6 1/2] dt/bindings: qcom_adm: Fix channel specifiers

2015-03-16 Thread Andy Gross

This patch removes the crci information from the dma channel property.  At least
one client device requires using more than one CRCI value for a channel.  This
does not match the current binding and the crci information needs to be removed.

Instead, the client device will provide this information via other means.

Signed-off-by: Andy Gross 
---
 Documentation/devicetree/bindings/dma/qcom_adm.txt |   16 ++--
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/Documentation/devicetree/bindings/dma/qcom_adm.txt b/Documentation/devicetree/bindings/dma/qcom_adm.txt
index 9bcab91..38d45f8 100644
--- a/Documentation/devicetree/bindings/dma/qcom_adm.txt
+++ b/Documentation/devicetree/bindings/dma/qcom_adm.txt
@@ -4,8 +4,7 @@ Required properties:
 - compatible: must contain "qcom,adm" for IPQ/APQ8064 and MSM8960
 - reg: Address range for DMA registers
 - interrupts: Should contain one interrupt shared by all channels
-- #dma-cells: must be <2>.  First cell denotes the channel number.  Second cell
-  denotes CRCI (client rate control interface) flow control assignment.
+- #dma-cells: must be <1>.  First cell denotes the channel number.
 - clocks: Should contain the core clock and interface clock.
 - clock-names: Must contain "core" for the core clock and "iface" for the
   interface clock.
@@ -22,7 +21,7 @@ Example:
compatible = "qcom,adm";
reg = <0x1830 0x10>;
interrupts = <0 170 0>;
-   #dma-cells = <2>;
+   #dma-cells = <1>;
 
clocks = < ADM0_CLK>, < ADM0_PBUS_CLK>;
clock-names = "core", "iface";
@@ -35,15 +34,12 @@ Example:
qcom,ee = <0>;
};
 
-DMA clients must use the format descripted in the dma.txt file, using a three
+DMA clients must use the format descripted in the dma.txt file, using a two
 cell specifier for each channel.
 
-Each dmas request consists of 3 cells:
+Each dmas request consists of two cells:
  1. phandle pointing to the DMA controller
  2. channel number
- 3. CRCI assignment, if applicable.  If no CRCI flow control is required, use 0.
-The CRCI is used for flow control.  It identifies the peripheral device that
-is the source/destination for the transferred data.
 
 Example:
 
@@ -56,7 +52,7 @@ Example:
 
cs-gpios = <_pinmux 20 0>;
 
-   dmas = <_dma 6 9>,
-   <_dma 5 10>;
+   dmas = <_dma 6>,
+   <_dma 5>;
dma-names = "rx", "tx";
};
-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


[Patch v6 0/2] Add Qualcomm ADM dmaengine driver

2015-03-16 Thread Andy Gross

This patch set introduces the dmaengine driver for the Qualcomm Application
Data Mover (ADM) DMA controller present on MSM8x60, APQ8064, and IPQ8064
devices.

The initial version of this driver will only support slave DMA operations
between system memory and peripherals.  Flow control via the CRCI (client rate
control interface) is supported and can be configured via device tree
configuration.  Flow control usage is required for some peripheral devices.

Changes from v5:
  - Fix erroneous adm_get_blksize for values of 192 and 256.

Changes from v4:
  - Fixed copyright date
  - Fixed error in EE offsets and usage
  - Changed namespace for registers and fields to ADM specific naming
  - Removed alloc_chan function
  - Removed control function and fixed up terminate_all and slave_config
  - Reworked descriptor processing code to make it more clean
  - Moved to use of_dma_xlate_by_chan_id
  - Fixed other small review comments

Changes from v3:
  - Remove .owner field

Changes from v2:
  - Removed extraneous achan variable from xlate function
  - Reworked crci check in slave_sg function
  - Added mux field to async_desc structure.
  - Reworked dma start function to use crci and mux values directly from
structure.
  - Added disable of clocks in probe error paths.
  - Changed to use #define for fixed number of channels.

Changes since v1:
  - Fixed various review comments
  - Fixed some descriptor programming issues.
  - Added single descriptors to support sub burst length transactions.
Selection of single or box descriptors depends on the sg length and burst
size.
  - Removed use of crci in the dmas property.  CRCI is now designated via the
slave_config structure and will be stored in slave_id.

Andy Gross (2):
  dt/bindings: qcom_adm: Fix channel specifiers
  dmaengine: Add ADM driver

 Documentation/devicetree/bindings/dma/qcom_adm.txt |   16 +-
 drivers/dma/Kconfig|   10 +
 drivers/dma/Makefile   |1 +
 drivers/dma/qcom_adm.c |  900 
 4 files changed, 917 insertions(+), 10 deletions(-)
 create mode 100644 drivers/dma/qcom_adm.c

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
hosted by The Linux Foundation


[Patch v6 2/2] dmaengine: Add ADM driver

2015-03-16 Thread Andy Gross

Add the DMA engine driver for the QCOM Application Data Mover (ADM) DMA
controller found in the MSM8x60 and IPQ/APQ8064 platforms.

The ADM supports both memory to memory transactions and memory
to/from peripheral device transactions.  The controller also provides flow
control capabilities for transactions to/from peripheral devices.

The initial release of this driver supports slave transfers to/from peripherals
and also incorporates CRCI (client rate control interface) flow control.

Signed-off-by: Andy Gross 
---
 drivers/dma/Kconfig|   10 +
 drivers/dma/Makefile   |1 +
 drivers/dma/qcom_adm.c |  900 
 3 files changed, 911 insertions(+)
 create mode 100644 drivers/dma/qcom_adm.c

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index a874b6e..6919013 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -473,4 +473,14 @@ config QCOM_BAM_DMA
  Enable support for the QCOM BAM DMA controller.  This controller
  provides DMA capabilities for a variety of on-chip devices.
 
+config QCOM_ADM
+   tristate "Qualcomm ADM support"
+   depends on ARCH_QCOM || (COMPILE_TEST && OF && ARM)
+   select DMA_ENGINE
+   select DMA_VIRTUAL_CHANNELS
+   ---help---
+ Enable support for the Qualcomm ADM DMA controller.  This controller
+ provides DMA capabilities for both general purpose and on-chip
+ peripheral devices.
+
 endif
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index f915f61..7f0fbe6 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -51,3 +51,4 @@ obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
 obj-$(CONFIG_NBPFAXI_DMA) += nbpfaxi.o
 obj-$(CONFIG_DMA_SUN6I) += sun6i-dma.o
 obj-$(CONFIG_IMG_MDC_DMA) += img-mdc-dma.o
+obj-$(CONFIG_QCOM_ADM) += qcom_adm.o
diff --git a/drivers/dma/qcom_adm.c b/drivers/dma/qcom_adm.c
new file mode 100644
index 000..7f8c119
--- /dev/null
+++ b/drivers/dma/qcom_adm.c
@@ -0,0 +1,900 @@
+/*
+ * Copyright (c) 2013-2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "dmaengine.h"
+#include "virt-dma.h"
+
+/* ADM registers - calculated from channel number and security domain */
+#define ADM_CHAN_MULTI 0x4
+#define ADM_CI_MULTI   0x4
+#define ADM_CRCI_MULTI 0x4
+#define ADM_EE_MULTI   0x800
+#define ADM_CHAN_OFFS(chan)  (ADM_CHAN_MULTI * chan)
+#define ADM_EE_OFFS(ee)  (ADM_EE_MULTI * ee)
+#define ADM_CHAN_EE_OFFS(chan, ee) (ADM_CHAN_OFFS(chan) + ADM_EE_OFFS(ee))
+#define ADM_CHAN_OFFS(chan)  (ADM_CHAN_MULTI * chan)
+#define ADM_CI_OFFS(ci)  (ADM_CHAN_OFF(ci))
+#define ADM_CH_CMD_PTR(chan, ee)   (ADM_CHAN_EE_OFFS(chan, ee))
+#define ADM_CH_RSLT(chan, ee)  (0x40 + ADM_CHAN_EE_OFFS(chan, ee))
+#define ADM_CH_FLUSH_STATE0(chan, ee)  (0x80 + ADM_CHAN_EE_OFFS(chan, ee))
+#define ADM_CH_STATUS_SD(chan, ee) (0x200 + ADM_CHAN_EE_OFFS(chan, ee))
+#define ADM_CH_CONF(chan)  (0x240 + ADM_CHAN_OFFS(chan))
+#define ADM_CH_RSLT_CONF(chan, ee) (0x300 + ADM_CHAN_EE_OFFS(chan, ee))
+#define ADM_SEC_DOMAIN_IRQ_STATUS(ee)  (0x380 + ADM_EE_OFFS(ee))
+#define ADM_CI_CONF(ci)  (0x390 + ci * ADM_CI_MULTI)
+#define ADM_GP_CTL 0x3d8
+#define ADM_CRCI_CTL(crci, ee) (0x400 + crci * ADM_CRCI_MULTI + \
+   ADM_EE_OFFS(ee))
+
+/* channel status */
+#define ADM_CH_STATUS_VALID  BIT(1)
+
+/* channel result */
+#define ADM_CH_RSLT_VALID  BIT(31)
+#define ADM_CH_RSLT_ERR  BIT(3)
+#define ADM_CH_RSLT_FLUSH  BIT(2)
+#define ADM_CH_RSLT_TPD  BIT(1)
+
+/* channel conf */
+#define ADM_CH_CONF_SHADOW_EN  BIT(12)
+#define ADM_CH_CONF_MPU_DISABLE  BIT(11)
+#define ADM_CH_CONF_PERM_MPU_CONF  BIT(9)
+#define ADM_CH_CONF_FORCE_RSLT_EN  BIT(7)
+#define ADM_CH_CONF_SEC_DOMAIN(ee) (((ee & 0x3) << 4) | ((ee & 0x4) << 11))
+
+/* channel result conf */
+#define ADM_CH_RSLT_CONF_FLUSH_EN  BIT(1)
+#define ADM_CH_RSLT_CONF_IRQ_EN  BIT(0)
+
+/* CRCI CTL */
+#define ADM_CRCI_CTL_MUX_SEL   BIT(18)
+#define ADM_CRCI_CTL_RST   BIT(17)
+
+/* CI configuration */
+#define
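
The register macros in this patch derive each offset from the channel number and the execution environment (ee). A minimal userspace sketch of that arithmetic follows, written as inline functions instead of macros. The function names are made up for this illustration and are not part of the posted driver; the constants are copied from the patch above.

```c
#include <assert.h>
#include <stdint.h>

#define ADM_CHAN_MULTI	0x4	/* stride between per-channel registers */
#define ADM_EE_MULTI	0x800	/* stride between execution environments */

/* Offset contributed by the channel and the execution environment. */
static inline uint32_t adm_chan_ee_offs(uint32_t chan, uint32_t ee)
{
	return ADM_CHAN_MULTI * chan + ADM_EE_MULTI * ee;
}

/* Mirrors ADM_CH_RSLT(chan, ee): result register for a channel. */
static inline uint32_t adm_ch_rslt(uint32_t chan, uint32_t ee)
{
	return 0x40 + adm_chan_ee_offs(chan, ee);
}

/*
 * Mirrors ADM_CH_CONF_SEC_DOMAIN(ee): the low two bits of ee land in
 * bits 5..4 and the third bit lands in bit 13 of the channel conf value.
 */
static inline uint32_t adm_ch_conf_sec_domain(uint32_t ee)
{
	assert(ee <= 0x7);	/* only three bits of ee are encoded */
	return ((ee & 0x3) << 4) | ((ee & 0x4) << 11);
}
```

For example, channel 3 in execution environment 1 gives 0x40 + 0xc + 0x800 = 0x84c. One reason to prefer functions in a sketch like this: the macro bodies in the patch use their arguments without parentheses, so an argument such as `chan + 1` would expand to `ADM_CHAN_MULTI * chan + 1`.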

[PATCH 2/2] ARM: dts: ls1021: Add qDMA node

2015-03-16 Thread Yuan Yao

Signed-off-by: Yuan Yao 
---
 arch/arm/boot/dts/ls1021a.dtsi | 16 
 1 file changed, 16 insertions(+)

diff --git a/arch/arm/boot/dts/ls1021a.dtsi b/arch/arm/boot/dts/ls1021a.dtsi
index 491480f..1a81b89 100644
--- a/arch/arm/boot/dts/ls1021a.dtsi
+++ b/arch/arm/boot/dts/ls1021a.dtsi
@@ -383,6 +383,22 @@
 <_clk 1>;
};
 
+   qdma: qdma@839 {
+   compatible = "fsl,ls1021a-qdma";
+   reg = <0x0 0x839 0x0 0x1>;
+   interrupts = ,
+   ;
+   interrupt-names = "qdma-controller", "qdma-queue";
+   status-sizes = <64>;
+   channels = <8>;
+   queues = <2>;
+   queue-sizes = <256 256>;
+   queue-group = <0 1>;
+   queue-weight = <0 0>;
+   default-queue = <0>;
+   big-endian;
+   };
+
mdio0: mdio@2d24000 {
compatible = "gianfar";
device_type = "mdio";
-- 
2.1.0.27.g96db324


Re: [PATCH RFC v2 0/6] ext4: yet another project quota

2015-03-16 Thread Konstantin Khlebnikov

On Mon, Mar 16, 2015 at 7:52 PM, Jan Kara  wrote:
>   Hello,
>
> On Tue 10-03-15 20:22:04, Konstantin Khlebnikov wrote:
>> Project quota allows enforcing disk quota for several subtrees or even
>> individual files on the filesystem. Each inode is marked with a project id
>> (independently from uid and gid) and accounted into the corresponding project
>> quota. New files inherit the project id from the directory where they are created.
>>
>> This is must-have feature for deploying lightweight containers.
>> Also project quota can tell size of subtree without doing recursive 'du'.
>>
>> This patchset adds project id and quota into ext4.
>>
>> This time I've prepared patches also for e2fsprogs and quota-tools.
>>
>> All patches are available at github:
>> https://github.com/koct9i/linux --branch project
>> https://github.com/koct9i/e2fsprogs --branch project
>> https://github.com/koct9i/quota-tools --branch project
>>
>> Proposed behavior is similar to project quota in XFS:
>>
>> * all inode are marked with project id
>> * new files inherits project id from parent directory
>> * project quota accounts inodes and enforces limits
>> * cross-project link and rename operations are restricted
>
>   Thanks for the patches and sorry for taking a long time to look into
> them. I like your patches and they look mostly fine to me but can we
> *please* start with making things completely compatible with how XFS works?
> I understand that may seem broken / not useful for your usecase but
> consistency among filesystems is really important.
>
> I was talking with Dave Chinner last week and we agreed that the direction
> you are changing things makes sense but getting things compatible with how
> they were for 15 years in XFS is an important first step (because there may
> be people with other usecases depending on the old behavior and they would
> get confused by ext4 behaving differently). After we have compatible code
> in, we can add your fs.protected_projects thing on top of that (and also
> support it in XFS to stay compatible).
>
>
>> Differences:
>>
>> There is no flag similar to XFS_XFLAG_PROJINHERIT (which allows to disable
>> project id inheritance), instead of that project which userspace sees as '0'
>> (in nested user-name space that might be actually non-zero project) acts as
>> default project where restrictions for link/rename are ignored.
>> (also see below, in "why new ioctl" part)
>>
>> This implementation adds shortcut for moving files from one project into
>> another: non-directory inodes with n_link == 1 are transferred without
>> copying (XFS in this case returns -EXDEV and userspace have to copy file).
>>
>> In XFS file owner (or process with CAP_FOWNER) can set any project id,
>> XFS permits changing project id only from init user-namespace.
>>
>> This patchset adds sysctl fs.protected_projects. By default it's 0 and project
>> id acts as XFS project. Setting it to 1 makes changing project id a privileged
>> operation which requires CAP_SYS_RESOURCE in current user-namespace, changing
>> project id mapping for nested user-namespace also requires that capability.
>> Thus there are two levels of control: project id mapping in user-ns defines the
>> set of permitted projects and capability protects operations within this set.
>>
>> I see no problems with supporting all this in XFS, all difference in interface.
>>
>> Ext4 layout
>> ---
>>
>> Project id introduce ro-compatible feature 'project'.
>>
>> Inode project id is stored in place of obsolete field 'i_faddr' (that trick was
>> suggested by Darrick J. Wong in previous discussions of project quota).
>> Linux never used that field and present fsck checks that it contains zero.
>>
>> Quota information is stored in special inode №11 (by default only 10 inodes are
>> reserved for special usage, I've added an option to resize2fs to reserve more).
>> (see e2fsprogs patches for details) For symmetry with other quotas inode number
>> is stored in superblock.
>>
>> Project quota supports only modern 'hidden' journaled mode.
>>
>> Interface
>> -
>>
>> Interface for changing limits / query current usage is common vfs quotactl()
>> where quotatype = PRJQUOTA = 2. User can query current state of any project
>> mapped into user-ns, changing of limits requires CAP_SYS_ADMIN in init user-ns.
>>
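
To make the "quotatype = PRJQUOTA = 2" remark concrete: the first argument to quotactl() is built by combining a subcommand with the quota type. The sketch below reproduces that encoding in plain userspace C so it does not depend on a particular libc header version. The constants mirror the values in the kernel's quota UAPI header as far as I know; the helper name make_qcmd() is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values mirroring the kernel's quota header (stated here as assumptions). */
#define SUBCMDMASK	0x00ffu
#define SUBCMDSHIFT	8
#define Q_GETQUOTA	0x800007u	/* "get user quota structure" subcommand */

#define USRQUOTA	0u	/* user quota */
#define GRPQUOTA	1u	/* group quota */
#define PRJQUOTA	2u	/* project quota, as named in the text above */

/* Equivalent of the QCMD(cmd, type) macro: command in the high bits, type in the low byte. */
static inline uint32_t make_qcmd(uint32_t cmd, uint32_t type)
{
	assert(type <= SUBCMDMASK);
	return (cmd << SUBCMDSHIFT) | (type & SUBCMDMASK);
}
```

A caller would pass make_qcmd(Q_GETQUOTA, PRJQUOTA) as the cmd argument of quotactl(), with the project id in the id argument, to query usage for one project.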
>> Two new ioctls for getting / changing inode project id:
>> int ioctl(fd, FS_IOC_GETPROJECT, unsigned *project);
>> int ioctl(fd, FS_IOC_SETPROJECT, unsigned *project);
>>
>> They act as interface for super-block methods get_project / set_project.
>> Generic code checks permissions, does project id translation in user-namespace
>> mapping, grabs write-access to the filesystem, locks i_mutex for set operation.
>> Filesystem method only updates inode and transfers project quota.
>>
>> No new mount options added. Disk usage tracking is enabled at mount.
>> Limits are enabled later by "quotaon".
>>
>> (BTW why journalled quota doesn't enable

[PATCH 1/2] dma: Add Freescale qDMA engine driver support

2015-03-16 Thread Yuan Yao

Add Freescale Queue Direct Memory Access (qDMA) controller support.
The qDMA supports channel virtualization by allowing DMA jobs to be
enqueued into different command queues. Core can initiate a DMA
transaction by preparing a command descriptor (CD) for each DMA job
and enqueuing this job to a command queue.

This module can be found on LS-1021, LS1043 and LS2085 SoCs.

Signed-off-by: Yuan Yao 
---
 Documentation/devicetree/bindings/dma/fsl-qdma.txt |  51 ++
 drivers/dma/Kconfig|  11 +
 drivers/dma/Makefile   |   1 +
 drivers/dma/fsl-qdma.c | 929 +
 4 files changed, 992 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/fsl-qdma.txt
 create mode 100644 drivers/dma/fsl-qdma.c

diff --git a/Documentation/devicetree/bindings/dma/fsl-qdma.txt b/Documentation/devicetree/bindings/dma/fsl-qdma.txt
new file mode 100644
index 000..676ae3d
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/fsl-qdma.txt
@@ -0,0 +1,50 @@
+* Freescale queue Direct Memory Access Controller(qDMA) Controller
+
+  The qDMA controller transfers blocks of data between one source and one or more
+destinations. The blocks of data transferred can be represented in memory as contiguous
+or non-contiguous using scatter/gather table(s). Channel virtualization is supported
+through enqueuing of DMA jobs to, or dequeuing DMA jobs from, different work
+queues.
+
+* qDMA Controller
+Required properties:
+- compatible :
+   - "fsl,ls1021a-qdma" for qDMA used similar to that on LS1021a SoC
+- reg : Specifies base physical address(s) and size of the qDMA registers.
+   The region is qDMA control register's address and size.
+- interrupts : A list of interrupt-specifiers, one for each entry in
+   interrupt-names.
+- interrupt-names : Should contain:
+   "qdma-controller" - the controller interrupt
+   "qdma-queue" - the queue interrupt
+- status-sizes : Number of circular status descriptor queue size
+- channels : Number of channels supported by the controller
+- queues : Number of queues supported by the controller
+- queue-sizes : Number of circular descriptor queue size for each queue
+- queue-group : The group each queue belongs to
+- queue-weight : The weight of each queue
+- default-queue : The default queue used when requesting a new channel
+
+Optional properties:
+- big-endian: If present registers and hardware scatter/gather descriptors
+   of the qDMA are implemented in big endian mode, otherwise in little
+   mode.
+
+
+Examples:
+
+   qdma: qdma@839 {
+   compatible = "fsl,ls1021a-qdma";
+   reg = <0x0 0x839 0x0 0x1>;
+   interrupts = ,
+   ;
+   interrupt-names = "qdma-controller", "qdma-queue";
+   status-sizes = <64>;
+   channels = <8>;
+   queues = <2>;
+   queue-sizes = <256 256>;
+   queue-group = <0 1>;
+   queue-weight = <0 0>;
+   default-queue = <0>;
+   big-endian;
+   };
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 074ffad..fa52c9e 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -388,6 +388,17 @@ config FSL_EDMA
  multiplexing capability for DMA request sources(slot).
  This module can be found on Freescale Vybrid and LS-1 SoCs.
 
+config FSL_QDMA
+   tristate "Freescale qDMA engine support"
+   depends on OF
+   select DMA_ENGINE
+   select DMA_VIRTUAL_CHANNELS
+   help
+ Support the Freescale qDMA engine with command queue and legacy mode.
+ Channel virtualization is supported through enqueuing of DMA jobs to,
+ or dequeuing DMA jobs from, different work queues.
+ This module can be found on Freescale LS SoCs.
+
 config XILINX_VDMA
tristate "Xilinx AXI VDMA Engine"
depends on (ARCH_ZYNQ || MICROBLAZE)
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index bf44858..5f2b95e 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_TI_CPPI41) += cppi41.o
 obj-$(CONFIG_K3_DMA) += k3dma.o
 obj-$(CONFIG_MOXART_DMA) += moxart-dma.o
 obj-$(CONFIG_FSL_EDMA) += fsl-edma.o
+obj-$(CONFIG_FSL_QDMA) += fsl-qdma.o
 obj-$(CONFIG_QCOM_BAM_DMA) += qcom_bam_dma.o
 obj-y += xilinx/
 obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
diff --git a/drivers/dma/fsl-qdma.c b/drivers/dma/fsl-qdma.c
new file mode 100644
index 000..00f3c33
--- /dev/null
+++ b/drivers/dma/fsl-qdma.c
@@ -0,0 +1,929 @@
+/*
+ * drivers/dma/fsl-qdma.c
+ *
+ * Copyright 2014-2015 Freescale Semiconductor, Inc.
+ *
+ * Driver for the Freescale qDMA engine with command queue and legacy mode.
+ * Channel virtualization is supported through enqueuing of DMA jobs to,
+ * or dequeuing DMA jobs from, different work queues.
+ * This module can be found on Freescale

Re: [PATCH 0/3] ARM: dts: Define stdout-path for Exynos Chromebooks

2015-03-16 Thread Arnd Bergmann

On Tuesday 17 March 2015 10:51:13 Kukjin Kim wrote:
> Javier Martinez Canillas wrote:
> > 
> > The kernel can use a serial port as the default console if it is
> > defined as the stdout device in the Device Tree.
> > 
> > This allows a board to be booted without the need of having a console
> > parameter in the kernel command line.
> > 
> > This small series adds a stdout-path property for Exynos5 Chromebooks and
> > is composed of the following patches:
> > 
> > Javier Martinez Canillas (3):
> >   ARM: dts: Define stdout-path property for Peach boards
> >   ARM: dts: Define stdout-path property for Snow board
> >   ARM: dts: Define stdout-path property for Spring board
> > 
> >  arch/arm/boot/dts/exynos5250-snow.dts  | 1 +
> >  arch/arm/boot/dts/exynos5250-spring.dts| 1 +
> >  arch/arm/boot/dts/exynos5420-peach-pit.dts | 4 
> >  arch/arm/boot/dts/exynos5800-peach-pi.dts  | 4 
> >  4 files changed, 10 insertions(+)
> > 
> + Arnd
> 
> Basically, I have no objection to add stdout-path property on board DT but I
> need to ask other ARM guys how they think about? Always I'm questioned what
> should be defined in bootloader before entering kernel and IMHO kernel can do
> it, it should be defined in bootloader though 
> 
> Let's wait for other opinions...
> 

We're trying to do this on all machines now so we can replace
debug_ll with earlycon for any normal use case aside from very
early boot debugging.

Please merge this patch set.

Arnd

Re: [PATCH 1/3] mm/vmalloc: fix possible exhaustion of vmalloc space caused by vm_map_ram allocator

2015-03-16 Thread Roman Peniaev

On Tue, Mar 17, 2015 at 1:56 PM, Joonsoo Kim  wrote:
> On Fri, Mar 13, 2015 at 09:12:55PM +0900, Roman Pen wrote:
>> If suitable block can't be found, new block is allocated and put into a head
>> of a free list, so on next iteration this new block will be found first.
>>
>> That's bad, because old blocks in a free list will not get a chance to be
>> fully used, thus fragmentation will grow.
>>
>> Let's consider this simple example:
>>
>>  #1 We have one block in a free list which is partially used, and where only
>> one page is free:
>>
>> HEAD |x-| TAIL
>>^
>>free space for 1 page, order 0
>>
>>  #2 New allocation request of order 1 (2 pages) comes, new block is allocated
>> since we do not have free space to complete this request. New block is
>> put into a head of a free list:
>>
>> HEAD |--|x-| TAIL
>>
>>  #3 Two pages were occupied in a new found block:
>>
>> HEAD |xx|x-| TAIL
>>   ^
>>   two pages mapped here
>>
>>  #4 New allocation request of order 0 (1 page) comes.  Block, which was
>> created on #2 step, is located at the beginning of a free list, so it will
>> be found first:
>>
>>   HEAD |xxX---|x-| TAIL
>>   ^ ^
>>   page mapped here, but better to use this hole
>>
>> It is obvious, that it is better to complete request of #4 step using the old
>> block, where free space is left, because in other case fragmentation will be
>> highly increased.
>>
>> But fragmentation is not only the case.  The most worst thing is that I can
>> easily create scenario, when the whole vmalloc space is exhausted by blocks,
>> which are not used, but already dirty and have several free pages.
>>
>> Let's consider this function which execution should be pinned to one CPU:
>>
>>  
>> --
>> /* Here we consider that our block is equal to 1MB, thus 256 pages */
>> static void exhaust_virtual_space(struct page *pages[256], int iters)
>> {
>>   /* Firstly we have to map a big chunk, e.g. 16 pages.
>>* Then we have to occupy the remaining space with smaller
>>* chunks, i.e. 8 pages. At the end small hole should remain.
>>* So at the end of our allocation sequence block looks like
>>* this:
>>*XX  big chunk
>>* |XXxxx-|x  small chunk
>>* -  hole, which is enough for a small chunk,
>>*but not for a big chunk
>>*/
>>   unsigned big_allocs   = 1;
>>   /* -1 for hole, which should be left at the end of each block
>>* to keep it partially used, with some free space available */
>>   unsigned small_allocs = (256 - 16) / 8 - 1;
>>   void*vaddrs[big_allocs + small_allocs];
>>
>>   while (iters--) {
>>   int i = 0, j;
>>
>>   /* Map big chunk */
>>   vaddrs[i++] = vm_map_ram(pages, 16, -1, PAGE_KERNEL);
>>
>>   /* Map small chunks */
>>   for (j = 0; j < small_allocs; j++)
>>   vaddrs[i++] = vm_map_ram(pages + 16 + j * 8, 8, -1,
>>PAGE_KERNEL);
>>
>>   /* Unmap everything */
>>   while (i--)
>>   vm_unmap_ram(vaddrs[i], (i ? 8 : 16));
>>   }
>> }
>>  
>> --
>>
>> On every iteration new block (1MB of vm area in my case) will be allocated
>> and then will be occupied, without attempt to resolve small allocation
>> request using previously allocated blocks in a free list.
>>
>> In current patch I simply put newly allocated block to the tail of a free
>> list, thus reduce fragmentation, giving a chance to resolve allocation
>> request using older blocks with possible holes left.
>
> Hello,
>
> I think that if you put newly allocated block to the tail of a free
> list, the example below would result in enormous performance degradation.
>
> new block: 1MB (256 pages)
>
> while (iters--) {
>   vm_map_ram(3 or something else that does not divide 256) * 85
>   vm_unmap_ram(3) * 85
> }
>
> On every iteration, it needs a newly allocated block, which is put at the
> tail of the free list, so finding it consumes a large amount of time.
>
> Is there any other solution to prevent your problem?

Hello.

My second patch fixes this problem.
I occupy the block on allocation and avoid jumping to the search loop.

Also, the problem is much wider.  Since we allocate a block on one CPU, but
the search for a free block can be done on another CPU (preemption was turned
on), the allocation can happen again.  In the worst case an allocation will
happen for each CPU available on the system.

This scenario also should be fixed by occupying block on allocation.

--
Roman
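
The head-versus-tail difference from the quoted example can be tried in a
small user-space model. This is only a sketch under stated assumptions, not
the vmalloc code: struct sim, sim_map(), sim_unmap() and live_blocks_after()
are hypothetical names, a block hands out pages linearly, a completely used
block leaves the free list, and a block is released only once all 256 of its
pages have been unmapped. Per-CPU lists and any purging of fragmented blocks
are ignored.

```c
#include <string.h>

#define BLOCK_PAGES 256
#define MAX_BLOCKS  1024

struct block {
	int free;	/* pages never handed out so far */
	int dirty;	/* pages handed out and unmapped again */
};

struct sim {
	struct block blk[MAX_BLOCKS];
	int list[MAX_BLOCKS];	/* free list of block ids, list[0] is HEAD */
	int nlist;
	int nblk;		/* blocks created so far */
	int live;		/* blocks created and not yet released */
	int add_to_tail;	/* 0: new block goes to HEAD, 1: to TAIL */
};

/* Map 'pages' pages: first fit from HEAD, new block if nothing fits. */
static int sim_map(struct sim *s, int pages)
{
	int i, b;

	for (i = 0; i < s->nlist; i++)
		if (s->blk[s->list[i]].free >= pages)
			break;

	if (i == s->nlist) {
		b = s->nblk++;
		s->blk[b].free = BLOCK_PAGES;
		s->blk[b].dirty = 0;
		s->live++;
		if (!s->add_to_tail) {
			memmove(&s->list[1], &s->list[0],
				s->nlist * sizeof(s->list[0]));
			i = 0;
		}
		s->list[i] = b;
		s->nlist++;
	}

	b = s->list[i];
	s->blk[b].free -= pages;
	if (s->blk[b].free == 0) {
		/* a completely used block leaves the free list */
		memmove(&s->list[i], &s->list[i + 1],
			(s->nlist - i - 1) * sizeof(s->list[0]));
		s->nlist--;
	}
	return b;
}

/* Unmap 'pages' pages of block 'b'; release it once fully dirty. */
static void sim_unmap(struct sim *s, int b, int pages)
{
	s->blk[b].dirty += pages;
	if (s->blk[b].dirty == BLOCK_PAGES)
		s->live--;
}

/* Run the 16 + 29 * 8 page pattern and report blocks still held. */
static int live_blocks_after(int add_to_tail, int iters)
{
	struct sim s;
	int owner[1 + 29];

	if (iters > MAX_BLOCKS)
		return -1;	/* model creates at most one block per iteration */
	memset(&s, 0, sizeof(s));
	s.add_to_tail = add_to_tail;

	while (iters--) {
		int i = 0, j;

		owner[i++] = sim_map(&s, 16);
		for (j = 0; j < 29; j++)
			owner[i++] = sim_map(&s, 8);
		while (i--)
			sim_unmap(&s, owner[i], i ? 8 : 16);
	}
	return s.live;
}
```

Under these assumptions live_blocks_after(0, 10) (head insertion) ends with
10 blocks still held, one per iteration, while live_blocks_after(1, 10)
(tail insertion) ends with a single block, because the 8-page hole of the
older block is consumed by the next small request.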

Re: [Patch v4 2/2] dmaengine: Add ADM driver

2015-03-16 Thread Andy Gross

On Mon, Mar 16, 2015 at 08:15:26AM -, sricha...@codeaurora.org wrote:
> Hi,
> 
> 
> >
> >>
> >> > +static int adm_get_blksize(unsigned int burst)
> >> > +{
> >> > +int ret;
> >> > +
> >> > +switch (burst) {
> >> > +case 16:
> >> > +ret = 0;
> >> > +break;
> >> > +case 32:
> >> > +ret = 1;
> >> > +break;
> >> > +case 64:
> >> > +ret = 2;
> >> > +break;
> >> > +case 128:
> >> > +ret = 3;
> >> > +break;
> >> > +case 192:
> >> > +ret = 4;
> >> > +break;
> >> > +case 256:
> >> > +ret = 5;
> >> > +break;
> >> ffs(burst>>4) ?
> >
> > that should work nicely.  thanks.
> >
>   Will not work for 192, 256 ?

you are right.  I'll have to separate those out into 2 more cases.  Good catch!
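
A possible shape for that, as a user-space sketch only: ffs() is 1-based, so
the power-of-two bursts need a "- 1", while 192 is not a power of two and 256
would come out as 4 instead of 5, hence the two explicit cases. The -1
default is an assumption standing in for whatever error value the driver
uses; the original default case is not shown in the quote.

```c
#include <strings.h>	/* ffs() */

/* Map a burst size in bytes to the block size code from the quoted switch. */
static int adm_get_blksize(unsigned int burst)
{
	switch (burst) {
	case 16:
	case 32:
	case 64:
	case 128:
		/* 16 -> 0, 32 -> 1, 64 -> 2, 128 -> 3 */
		return ffs(burst >> 4) - 1;
	case 192:
		return 4;
	case 256:
		return 5;
	default:
		return -1;	/* assumed error value, not from the patch */
	}
}
```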

-- 
Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project


Re: [PATCH 1/3] mm/vmalloc: fix possible exhaustion of vmalloc space caused by vm_map_ram allocator

2015-03-16 Thread Joonsoo Kim

On Fri, Mar 13, 2015 at 09:12:55PM +0900, Roman Pen wrote:
> If suitable block can't be found, new block is allocated and put into a head
> of a free list, so on next iteration this new block will be found first.
> 
> That's bad, because old blocks in a free list will not get a chance to be
> fully used, thus fragmentation will grow.
> 
> Let's consider this simple example:
> 
>  #1 We have one block in a free list which is partially used, and where only
> one page is free:
> 
> HEAD |x-| TAIL
>^
>free space for 1 page, order 0
> 
>  #2 New allocation request of order 1 (2 pages) comes, new block is allocated
> since we do not have free space to complete this request. New block is put
> into a head of a free list:
> 
> HEAD |--|x-| TAIL
> 
>  #3 Two pages were occupied in a new found block:
> 
> HEAD |xx|x-| TAIL
>   ^
>   two pages mapped here
> 
>  #4 New allocation request of order 0 (1 page) comes.  Block, which was
> created on #2 step, is located at the beginning of a free list, so it will
> be found first:
> 
>   HEAD |xxX---|x-| TAIL
>   ^ ^
>   page mapped here, but better to use this hole
> 
> It is obvious, that it is better to complete request of #4 step using the old
> block, where free space is left, because in other case fragmentation will be
> highly increased.
> 
> But fragmentation is not only the case.  The most worst thing is that I can
> easily create scenario, when the whole vmalloc space is exhausted by blocks,
> which are not used, but already dirty and have several free pages.
> 
> Let's consider this function which execution should be pinned to one CPU:
> 
>  
> --
> /* Here we consider that our block is equal to 1MB, thus 256 pages */
> static void exhaust_virtual_space(struct page *pages[256], int iters)
> {
>   /* Firstly we have to map a big chunk, e.g. 16 pages.
>* Then we have to occupy the remaining space with smaller
>* chunks, i.e. 8 pages. At the end small hole should remain.
>* So at the end of our allocation sequence block looks like
>* this:
>*XX  big chunk
>* |XXxxx-|x  small chunk
>* -  hole, which is enough for a small chunk,
>*but not for a big chunk
>*/
>   unsigned big_allocs   = 1;
>   /* -1 for hole, which should be left at the end of each block
>* to keep it partially used, with some free space available */
>   unsigned small_allocs = (256 - 16) / 8 - 1;
>   void*vaddrs[big_allocs + small_allocs];
> 
>   while (iters--) {
>   int i = 0, j;
> 
>   /* Map big chunk */
>   vaddrs[i++] = vm_map_ram(pages, 16, -1, PAGE_KERNEL);
> 
>   /* Map small chunks */
>   for (j = 0; j < small_allocs; j++)
>   vaddrs[i++] = vm_map_ram(pages + 16 + j * 8, 8, -1,
>PAGE_KERNEL);
> 
>   /* Unmap everything */
>   while (i--)
>   vm_unmap_ram(vaddrs[i], (i ? 8 : 16));
>   }
> }
>  
> --
> 
> On every iteration new block (1MB of vm area in my case) will be allocated and
> then will be occupied, without attempt to resolve small allocation request
> using previously allocated blocks in a free list.
> 
> In current patch I simply put newly allocated block to the tail of a free
> list, thus reduce fragmentation, giving a chance to resolve allocation
> request using older blocks with possible holes left.

Hello,

I think that if you put newly allocated block to the tail of a free
list, the example below would result in enormous performance degradation.

new block: 1MB (256 pages)

while (iters--) {
  vm_map_ram(3 or something else that does not divide 256) * 85
  vm_unmap_ram(3) * 85
}

On every iteration, it needs a newly allocated block, which is put at the
tail of the free list, so finding it consumes a large amount of time.

Is there any other solution to prevent your problem?

Thanks.
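
The search cost in that loop can be estimated with a small user-space model.
It is a sketch under assumptions, not the kernel code: scan_cost() is a
hypothetical helper, the free list is a plain array searched first-fit from
the head, each iteration maps 3 pages 85 times, and blocks left with one
free page stay on the list forever because they never become fully used.

```c
#include <string.h>

#define BLOCK_PAGES 256
#define MAX_BLOCKS  1024

/* Count how many free-list entries are visited over 'iters' iterations. */
static long scan_cost(int add_to_tail, int iters)
{
	int free_pages[MAX_BLOCKS];	/* free list, index 0 is HEAD */
	int n = 0;
	long visited = 0;

	if (iters > MAX_BLOCKS)
		return -1;	/* one new block per iteration in this model */

	while (iters--) {
		for (int m = 0; m < 85; m++) {
			int i;

			/* first fit: walk from HEAD until 3 pages fit */
			for (i = 0; i < n; i++) {
				visited++;
				if (free_pages[i] >= 3)
					break;
			}
			if (i == n) {
				/* nothing fits: add a fresh block */
				if (add_to_tail) {
					i = n;
				} else {
					memmove(&free_pages[1], &free_pages[0],
						n * sizeof(free_pages[0]));
					i = 0;
				}
				free_pages[i] = BLOCK_PAGES;
				n++;
				visited++;	/* the new block itself */
			}
			free_pages[i] -= 3;
		}
	}
	return visited;
}
```

In this model 10 iterations visit 895 entries with head insertion and 4675
with tail insertion, and the gap grows quadratically with the number of
leftover blocks, which is the degradation described above.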


RE: [PATCH 1/6] Drivers: hv: vmbus: Perform device register in the per-channel work element

2015-03-16 Thread KY Srinivasan



> -Original Message-
> From: Greg KH [mailto:gre...@linuxfoundation.org]
> Sent: Monday, March 16, 2015 1:22 PM
> To: KY Srinivasan
> Cc: a...@canonical.com; de...@linuxdriverproject.org; o...@aepfle.de;
> linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Perform device register in the
> per-channel work element
> 
> On Thu, Mar 12, 2015 at 02:16:23PM +, KY Srinivasan wrote:
> >
> >
> > > -Original Message-
> > > From: Greg KH [mailto:gre...@linuxfoundation.org]
> > > Sent: Thursday, March 12, 2015 6:29 AM
> > > To: KY Srinivasan
> > > Cc: a...@canonical.com; de...@linuxdriverproject.org; o...@aepfle.de;
> > > linux-kernel@vger.kernel.org
> > > Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Perform device register
> > > in the per-channel work element
> > >
> > > On Thu, Mar 12, 2015 at 01:12:29PM +, KY Srinivasan wrote:
> > > >
> > > >
> > > > > -Original Message-
> > > > > From: Greg KH [mailto:gre...@linuxfoundation.org]
> > > > > Sent: Thursday, March 12, 2015 2:03 AM
> > > > > To: KY Srinivasan
> > > > > Cc: linux-kernel@vger.kernel.org; de...@linuxdriverproject.org;
> > > > > o...@aepfle.de; a...@canonical.com; vkuzn...@redhat.com
> > > > > Subject: Re: [PATCH 1/6] Drivers: hv: vmbus: Perform device
> > > > > register in the per-channel work element
> > > > >
> > > > > On Thu, Mar 12, 2015 at 10:02:24AM +0100, Greg KH wrote:
> > > > > > On Wed, Mar 11, 2015 at 06:56:54PM -0700, K. Y. Srinivasan wrote:
> > > > > > > This patch is a continuation of the rescind handling cleanup work.
> > > > > > > We cannot block in the global message handling work context
> > > > > > > especially if we are blocking waiting for the host to wake
> > > > > > > us up. I would like to thank Dexuan Cui
> > > > > > >  for observing
> > > > > this problem.
> > > > > > >
> > > > > > > The current Linux 4.0 RC3 tree is broken and this patch
> > > > > > > fixes the
> > > problem.
> > > > > > >
> > > > > > > Signed-off-by: K. Y. Srinivasan 
> > > > > > > ---
> > > > > > >  drivers/hv/channel_mgmt.c |  143
> > > > > +++-
> > > > > > >  drivers/hv/connection.c   |6 ++-
> > > > > > >  drivers/hv/hyperv_vmbus.h |2 +-
> > > > > > >  3 files changed, 107 insertions(+), 44 deletions(-)
> > > > > >
> > > > > > This is a very big patch so late in the -rc cycle.  Is there
> > > > > > some patch that got merged in 4.0-rc1 that I should be
> > > > > > reverting instead to fix things up?
> > > > >
> > > > > Make that, "this is a very large patch set", not just one patch.
> > > > > I can't take all of these this late, sorry.  Please just tell me what 
> > > > > to
> revert.
> > > >
> > > > Greg,
> > > >
> > > > Would it be possible to pick up two patches. I could prune this
> > > > down to two. The two I want you to pick up are (in the order of
> importance):
> > > >
> > > > [PATCH 1/6] Drivers: hv: vmbus: Perform device register in the
> > > > per-channel work element [PATCH 2/6] Drivers: hv: hv_balloon: keep
> > > > locks balanced on add_memory() failure
> > > >
> > > > If you could pickup an additional patch that would be:
> > > >
> > > > [PATCH 6/6] Drivers: hv: vmbus: Fix a bug in rescind processing in
> > > > vmbus_close_internal()
> > > >
> > > > The first one is the most important one and if you can only pickup
> > > > one, the
> > > first one is the one I want you to pick up.
> > >
> > > You aren't answering my question, what happened that caused these to
> > > become an error and break the 4.0-rc tree?  Shouldn't I just revert
> > > a recent change here?  Or has things always been broken and no one
> > > has noticed it before?
> >
> > commit 2dd37cb81580dce6dfb8c5a7d5c37b904a188ae7
> >
> > introduced the bug (committed on Feb 28th). This patch cleaned up the
> > rescind handling code.
> >
> > The patches I sent a few days later:
> >
> > Drivers: hv: vmbus: Perform device register in the per-channel work
> > element fixed it.
> >
> > Drivers: hv: vmbus: Fix a bug in rescind processing in
> > vmbus_close_internal()
> >
> > Fixed the bugs.
> 
> Ok, commit 2dd37cb81580dce6dfb8c5a7d5c37b904a188ae7 is on my char-
> misc-next branch, and has nothing to do with 4.0-final.  So why do you think
> anything needs to be done for 4.0?
> 
> Please take a look at my tree, at Linus's tree, and figure out exactly what
> needs to be fixed where, and resend me patches that explicitly says which
> branch for me to apply them to (char-linus for patches that need to go for
> 4.0-final, char-next for patches that need to go into
> 4.1-rc1.)

You are right, the offending commit is NOT in 4.0-rc4 tree that 
I looked at earlier this afternoon.
 
> 
> I'm again dropping all of your pending patches in my to-apply queue, as it's
> all just too confusing here and no one seems to know what is going on (myself
> included.)

Sorry about the confusion. My mistake; last week I looked at a test tree that
had the offending commit and I was told that the tree was

Re: [rfc patch v2] rt,nohz_full: fix nohz_full for PREEMPT_RT_FULL

2015-03-16 Thread Mike Galbraith

On Tue, 2015-03-17 at 02:53 +0100, Mike Galbraith wrote:
> On Mon, 2015-03-16 at 21:24 +0100, Sebastian Andrzej Siewior wrote:

> > What you do is that you accept the fact that the timer-softirq is
> > scheduled for no reason and then you try to disable the tick from within
> > the timer-softirq. I assumed that it would work get the "expired timer"
> > somehow.
> 
> Yup, it works around that otherwise crippling wakeup.  If I re-apply..
> 
>  timers-do-not-raise-softirq-unconditionally.patch
> 
> ..the workaround is not needed of course, but the livelock fix still is.
> I haven't yet tested that in 3.18-rt though, only 4.0-rt, but I presume
> it'll be the same deal there when I do.

Did that, and it is.  Fire up tbench 4 + make -j4 on my i7-4790+smt box
booted nohz_full=2,3,6,7, death spiral begins shortly thereafter.

-Mike


Re: [PATCH] eeprom: at24: Add support for large EEPROMs connected to SMBus adapters

2015-03-16 Thread Guenter Roeck

On Mon, Feb 16, 2015 at 01:09:51PM +0100, Wolfram Sang wrote:
> Hi Guenter,
> 
> > I wonder where we are with this patch; I don't recall a reply to my
> > previous e-mail.
> 
> Sorry for the late reply. I needed to recover from a HDD headcrash :(
> 
> > Do you need some more time to think about it ? Otherwise I'll publish an
> > out-of-tree version of the at24 driver with the patch applied on github,
> > for those who might need the functionality provided by this patch.
> 
> Your last mail made me aware of why we were missing each other before. I
> see your point now, but yes, still need to think about it. My plan is to
> have a decision until the 3.21 merge window.
> 
Hi Wolfram,

any news ?

Thanks,
Guenter

Re: [PATCH] tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop

2015-03-16 Thread Preeti U Murthy

On 03/16/2015 08:26 PM, Peter Zijlstra wrote:
> On Thu, Mar 05, 2015 at 10:06:30AM +0530, Preeti U Murthy wrote:
>>
>> On 03/02/2015 08:23 PM, Peter Zijlstra wrote:
>>> On Thu, Feb 26, 2015 at 08:52:02AM +0530, Preeti U Murthy wrote:
>>>> The hrtimer mode of broadcast queues hrtimers in the idle entry
>>>> path so as to wakeup cpus in deep idle states.
>>>
>>> Callgraph please...
>>
>> cpuidle_idle_call()
>> | clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, ))
>>  |_tick_broadcast_set_event()
>>|clockevents_program_event()
>> |bc_set_next()
>>>
>>>> hrtimer_{start/cancel}
>>>> functions call into tracing which uses RCU. But it is not legal to call
>>>> into RCU in cpuidle because it is one of the quiescent states. Hence
>>>> protect this region with RCU_NONIDLE which informs RCU that the cpu
>>>> is momentarily non-idle.
>>>
>>> It it not clear to me that every user of bc_set_next() is from IDLE.
>>> From what I can tell it ends up being clockevents_program_event() and
>>> that is called quite a lot.
>>
>> bc_set_next() is called from two places:
>> 1. Idle entry : It is called when a cpu in its idle entry path finds the
>> need to reset the broadcast hrtimer.
>> 2. CPU offline operations : When the cpu on which the broadcast hrtimer
>> is being queued goes offline.
>>
>> So you see that almost all the time, it is called in idle entry path.
> 
> How about:
> 
>   hrtimer_reprogram()
> tick_program_event()
>   clockevents_program_event()
> ->set_next_ktime()
> 
> That is called from !idle loads of times. I guess I'm not seeing what
> avoids _broadcast_hrtimer from being the 'normal' clock event.

Ok I see your point now. Sorry about having misinterpreted it
previously. ce_broadcast_hrtimer is not the per-cpu clock device. It is
not a real clock device. It is a pseudo clock device, which is called
only from the guts of the broadcast framework.
When it is programmed, it queues a hrtimer and programs the per-cpu
clock device in the fashion mentioned above.

No hrtimer programming/starting/canceling will get routed through
bc_set_next(). The broadcast framework makes use of a separate broadcast
clock device, which is never the per-cpu clock device, to wake cpus from
idle. This device is programmed explicitly when required and not
indirectly via timer queueing. *Only* when this broadcast clock device
needs to be reprogrammed does bc_set_next() get called, and only on those
archs which *do not have a real broadcast clock device*. And the whole
thing kicks in only when cpus go idle, not just for PowerPC but for ARM
as well.

Regards
Preeti U Murthy
> 
> Sure; it might be that for power you only end up with that broadcast
> crap enabled on idle/hotplug, but is this always so?
> 


Re: [PATCH] rtc: OMAP: Add external 32k clock feature

2015-03-16 Thread Keerthy




On Tuesday 03 March 2015 03:12 PM, Keerthy wrote:

Add external 32k clock feature. The internal clock will be gated during suspend.
Hence make use of the external 32k clock so that rtc is functional across
suspend/resume.


A gentle ping on this.



Signed-off-by: Keerthy 
---

Tested on DRA7-EVM.

  drivers/rtc/rtc-omap.c | 15 ++-
  1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/drivers/rtc/rtc-omap.c b/drivers/rtc/rtc-omap.c
index 8e5851a..4f803ca 100644
--- a/drivers/rtc/rtc-omap.c
+++ b/drivers/rtc/rtc-omap.c
@@ -107,6 +107,8 @@

  /* OMAP_RTC_OSC_REG bit fields: */
  #define OMAP_RTC_OSC_32KCLK_EN BIT(6)
+#define OMAP_RTC_OSC_OSC32K_GZ BIT(4)
+#define OMAP_RTC_OSC_EXT_32K   BIT(3)

  /* OMAP_RTC_IRQWAKEEN bit fields: */
  #define OMAP_RTC_IRQWAKEEN_ALARM_WAKEEN   BIT(1)
@@ -120,6 +122,7 @@

  struct omap_rtc_device_type {
bool has_32kclk_en;
+   bool has_osc_ext_32k;
bool has_kicker;
bool has_irqwakeen;
bool has_pmic_mode;
@@ -446,6 +449,7 @@ static const struct omap_rtc_device_type omap_rtc_default_type = {

  static const struct omap_rtc_device_type omap_rtc_am3352_type = {
.has_32kclk_en  = true,
+   .has_osc_ext_32k = true,
.has_kicker = true,
.has_irqwakeen  = true,
.has_pmic_mode  = true,
@@ -543,7 +547,16 @@ static int __init omap_rtc_probe(struct platform_device *pdev)
if (rtc->type->has_32kclk_en) {
reg = rtc_read(rtc, OMAP_RTC_OSC_REG);
rtc_writel(rtc, OMAP_RTC_OSC_REG,
-   reg | OMAP_RTC_OSC_32KCLK_EN);
+  reg | OMAP_RTC_OSC_32KCLK_EN);
+   }
+
+   /* Enable External clock as the source */
+
+   if (rtc->type->has_osc_ext_32k) {
+   rtc_writel(rtc, OMAP_RTC_OSC_REG,
+  (OMAP_RTC_OSC_EXT_32K |
+  rtc_read(rtc, OMAP_RTC_OSC_REG)) &
+  (~OMAP_RTC_OSC_OSC32K_GZ));
}

/* clear old status */
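
The register update in the last hunk is a read-modify-write that sets the
external clock select bit and clears the OSC32K_GZ bit. As a stand-alone
sketch of just that bit arithmetic (osc_select_external_32k() is a
hypothetical helper, and BIT() is redefined locally so the snippet builds
outside the kernel; the bit positions are the ones from the patch):

```c
#include <stdint.h>

#define BIT(n)			(1U << (n))

#define OMAP_RTC_OSC_32KCLK_EN	BIT(6)
#define OMAP_RTC_OSC_OSC32K_GZ	BIT(4)
#define OMAP_RTC_OSC_EXT_32K	BIT(3)

/* Return the OSC register value with the external 32k source selected. */
static uint32_t osc_select_external_32k(uint32_t reg)
{
	/* set EXT_32K, clear OSC32K_GZ, leave every other bit untouched */
	return (reg | OMAP_RTC_OSC_EXT_32K) & ~OMAP_RTC_OSC_OSC32K_GZ;
}
```

For example, a register value of 0x50 (32KCLK_EN and OSC32K_GZ set) becomes
0x48: 32KCLK_EN is preserved, EXT_32K is set and OSC32K_GZ is cleared.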



Re: Possible no longer required cast in the function,usbhs_parse_dt in common.c

2015-03-16 Thread Felipe Balbi

On Tue, Mar 17, 2015 at 12:10:41AM -0400, nick wrote:
> 
> 
> On 2015-03-16 11:56 PM, nick wrote:
> > 
> > 
> > On 2015-03-16 11:54 PM, Felipe Balbi wrote:
> >> On Mon, Mar 16, 2015 at 11:51:15PM -0400, nick wrote:
> >>>
> >>>
> >>> On 2015-03-16 11:37 PM, Peter Chen wrote:
>   
> >
> > Greetings All,
> > I have been getting the below build warnings:
> > drivers/usb/renesas_usbhs/common.c: In function ‘usbhs_parse_dt’:
> > drivers/usb/renesas_usbhs/common.c:482:25: warning: cast from pointer to
> > integer of different size [-Wpointer-to-int-cast]
> > dparam->type = of_id ? (u32)of_id->data : 0;
> > After looking into the function I am curious if the hardware is only 32 bit,
> > as if the supported hardware for this driver is, then this cast is no longer
> > required and I can send in a patch removing it. Furthermore, sorry for the
> > simple question but I don't have access to the device specs for supported
> > hardware so I thought it would be better to ask before I send in a patch
> > fixing this issue.
> > Thanks,
> > Nick
> 
>  Patch is welcome, there will be comment if it is not suitable.
> 
>  Peter
> 
> >>> I understand that,my question was does the hardware for this driver 
> >>> support 64 bit.
> >>
> >> regardless, it shouldn't produce a build warning.
> >>
> > It does for me.
> > Nick
> > 
> After looking more closely it seems we are trying to convert a const
> void pointer to an int. Does the data member need to be const, as this may be
> causing the issue?

why would const have anything to do with sizes ?

-- 
balbi



[PATCH RFC v2] powerpc/powernv: Introduce kernel param to control fastsleep workaround behavior

2015-03-16 Thread Shreyas B. Prabhu

Fastsleep is one of the idle states which the cpuidle subsystem currently
uses on power8 machines. In this state the L2 cache is brought down to a
threshold voltage. Therefore when the core is in fastsleep, the
communication between L2 and L3 needs to be fenced. But there is a bug
in the current power8 chips surrounding this fencing. OPAL provides an
interface to work around this bug, and in the current implementation,
every time before a core enters fastsleep an OPAL call is made to 'apply'
the workaround and when the core wakes up from fastsleep an OPAL call is
made to 'undo' the workaround. These OPAL calls account for roughly
4000 cycles every time the core has to enter or wake up from fastsleep.
The other alternative is to apply this workaround once at boot, and not
undo it at all. While this would quicken the fastsleep entry/wakeup path,
the downside is that any correctable error detected in the L2 directory
will result in a checkstop.

This patch adds a new kernel parameter,
pnv_fastsleep_workaround_once, which can be used to override
the default behavior and apply the workaround once at boot and not undo
it.
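
To put the trade-off in numbers, here is a back-of-the-envelope sketch. It
assumes each 'apply' or 'undo' OPAL call costs the "roughly 4000 cycles"
quoted above, so the figures are illustrative only, and
fastsleep_workaround_cycles() is a hypothetical helper, not part of the
patch.

```c
#define OPAL_CALL_CYCLES 4000ULL	/* "roughly 4000 cycles", illustrative */

/* Cycles spent in workaround OPAL calls for 'sleeps' fastsleep entries. */
static unsigned long long
fastsleep_workaround_cycles(int workaround_once, unsigned long long sleeps)
{
	if (workaround_once)
		return OPAL_CALL_CYCLES;	/* single 'apply' at boot */

	/* default: 'apply' on entry plus 'undo' on wakeup, every time */
	return 2ULL * OPAL_CALL_CYCLES * sleeps;
}
```

Under that assumption 1000 fastsleep entries cost about 8 million cycles in
workaround calls by default, against a one-time 4000 cycles with
pnv_fastsleep_workaround_once, at the price of the checkstop risk described
above.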

Signed-off-by: Shreyas B. Prabhu 
CC: Michael Ellerman 
CC: Paul Mackerras 
CC: Benjamin Herrenschmidt 
CC: linuxppc-...@lists.ozlabs.org
---
Changes in v2:
--
Accurately describes the downside of running workaround always applied.

 Documentation/kernel-parameters.txt|  4 +++
 arch/powerpc/include/asm/opal.h|  8 +
 arch/powerpc/platforms/powernv/opal-wrappers.S |  1 +
 arch/powerpc/platforms/powernv/setup.c | 45 +-
 4 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index bfcb1a6..006863b 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2857,6 +2857,10 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
autoconfiguration.
Ranges are in pairs (memory base and size).
 
+   pnv_fastsleep_workaround_once=
+   [BUGS=ppc64] Tells kernel to apply fastsleep workaround
+   once at boot.
+
ports=  [IP_VS_FTP] IPVS ftp helper module
Default is 21.
Up to 8 (IP_VS_APP_MAX_PORTS) ports
diff --git a/arch/powerpc/include/asm/opal.h b/arch/powerpc/include/asm/opal.h
index 9ee0a30..8bea8fc 100644
--- a/arch/powerpc/include/asm/opal.h
+++ b/arch/powerpc/include/asm/opal.h
@@ -180,6 +180,13 @@ struct opal_sg_list {
 #define OPAL_PM_WINKLE_ENABLED 0x0004
 #define OPAL_PM_SLEEP_ENABLED_ER1  0x0008
 
+/*
+ * OPAL_CONFIG_CPU_IDLE_STATE parameters
+ */
+#define OPAL_CONFIG_IDLE_FASTSLEEP 1
+#define OPAL_CONFIG_IDLE_UNDO  0
+#define OPAL_CONFIG_IDLE_APPLY 1
+
 #ifndef __ASSEMBLY__
 
 #include 
@@ -924,6 +931,7 @@ int64_t opal_handle_hmi(void);
 int64_t opal_register_dump_region(uint32_t id, uint64_t start, uint64_t end);
 int64_t opal_unregister_dump_region(uint32_t id);
 int64_t opal_slw_set_reg(uint64_t cpu_pir, uint64_t sprn, uint64_t val);
+int64_t opal_config_cpu_idle_state(uint64_t state, uint64_t flag);
 int64_t opal_pci_set_phb_cxl_mode(uint64_t phb_id, uint64_t mode, uint64_t pe_number);
 int64_t opal_ipmi_send(uint64_t interface, struct opal_ipmi_msg *msg,
uint64_t msg_len);
diff --git a/arch/powerpc/platforms/powernv/opal-wrappers.S b/arch/powerpc/platforms/powernv/opal-wrappers.S
index 0509bca..84a20bb 100644
--- a/arch/powerpc/platforms/powernv/opal-wrappers.S
+++ b/arch/powerpc/platforms/powernv/opal-wrappers.S
@@ -283,6 +283,7 @@ OPAL_CALL(opal_sensor_read, OPAL_SENSOR_READ);
 OPAL_CALL(opal_get_param,  OPAL_GET_PARAM);
 OPAL_CALL(opal_set_param,  OPAL_SET_PARAM);
 OPAL_CALL(opal_handle_hmi, OPAL_HANDLE_HMI);
+OPAL_CALL(opal_config_cpu_idle_state,  OPAL_CONFIG_CPU_IDLE_STATE);
 OPAL_CALL(opal_slw_set_reg,OPAL_SLW_SET_REG);
 OPAL_CALL(opal_register_dump_region,   OPAL_REGISTER_DUMP_REGION);
 OPAL_CALL(opal_unregister_dump_region, OPAL_UNREGISTER_DUMP_REGION);
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index d2de7d5..21dde6c 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -405,6 +406,20 @@ u32 pnv_get_supported_cpuidle_states(void)
 }
 EXPORT_SYMBOL_GPL(pnv_get_supported_cpuidle_states);
 
+u8 pnv_apply_fastsleep_workaround_once;
+
+static int __init pnv_fastsleep_workaround_once(char *str)
+{
+   pnv_apply_fastsleep_workaround_once = 1;
+   return 0;
+}
+early_param("pnv_fastsleep_workaround_once", pnv_fastsleep_workaround_once);
+
+static void __init pnv_fastsleep_workaround_apply(void *info)
+{

Re: Possible no longer required cast in the function,usbhs_parse_dt in common.c

2015-03-16 Thread Felipe Balbi

On Mon, Mar 16, 2015 at 11:56:26PM -0400, nick wrote:
> 
> 
> On 2015-03-16 11:54 PM, Felipe Balbi wrote:
> > On Mon, Mar 16, 2015 at 11:51:15PM -0400, nick wrote:
> >>
> >>
> >> On 2015-03-16 11:37 PM, Peter Chen wrote:
> >>>  
> 
>  Greetings All,
>  I have been getting the below build warnings:
>  drivers/usb/renesas_usbhs/common.c: In function ‘usbhs_parse_dt’:
>  drivers/usb/renesas_usbhs/common.c:482:25: warning: cast from pointer to
>  integer of different size [-Wpointer-to-int-cast]
>  dparam->type = of_id ? (u32)of_id->data : 0;
>  After looking into the function I am curious if the hardware is only 32 bit,
>  as if the supported hardware for this driver is, then this cast is no longer
>  required and I can send in a patch removing it. Furthermore, sorry for the
>  simple question but I don't have access to the device specs for supported
>  hardware so I thought it would be better to ask before I send in a patch
>  fixing this issue.
>  Thanks,
>  Nick
> >>>
> >>> Patch is welcome, there will be comment if it is not suitable.
> >>>
> >>> Peter
> >>>
> >> I understand that,my question was does the hardware for this driver 
> >> support 64 bit.
> > 
> > regardless, it shouldn't produce a build warning.
> > 
> It does for me.

yes, and I'm saying that's wrong. Regardless of the platform supporting
64bit or not, if the driver is allowed to build in 64bit configurations,
there should be no warnings; if there are, they should be fixed and
patches are very welcome.

-- 
balbi
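
For reference, the warning comes from casting a 64-bit pointer straight to a
32-bit integer; const plays no part in it. One common way to keep the intent
and silence the warning is to go through a pointer-sized integer first. The
sketch below is a user-space illustration with hypothetical names (struct
match_entry stands in for the OF match entry, match_type() for the
assignment in usbhs_parse_dt()); it is not the driver code.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a match table entry carrying a small integer in 'data'. */
struct match_entry {
	const void *data;
};

/* Recover the integer stored in 'data' without a size-mismatch warning. */
static uint32_t match_type(const struct match_entry *id)
{
	/* pointer -> pointer-sized integer -> explicit narrowing to 32 bits */
	return id ? (uint32_t)(uintptr_t)id->data : 0;
}
```

The intermediate uintptr_t cast is the same width as the pointer on both
32-bit and 64-bit builds, so the narrowing to 32 bits becomes an ordinary
integer conversion.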



Re: [update][PATCH v10 06/21] ACPI / sleep: Introduce CONFIG_ACPI_GENERIC_SLEEP

2015-03-16 Thread Hanjun Guo

On 2015/3/17 11:23, Rafael J. Wysocki wrote:
> On Tuesday, March 17, 2015 10:36:47 AM Hanjun Guo wrote:
>> On 2015/3/17 10:28, Rafael J. Wysocki wrote:
>>> On Tuesday, March 17, 2015 09:08:45 AM Hanjun Guo wrote:
 On 2015/3/17 7:15, Rafael J. Wysocki wrote:
> On Monday, March 16, 2015 08:14:52 PM Hanjun Guo wrote:
>> On 2015-03-14 05:49, Rafael J. Wysocki wrote:
>>> On Friday, March 13, 2015 04:14:29 PM Hanjun Guo wrote:
 [...]
 diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
 index 074e52b..e8728d7 100644
 --- a/arch/ia64/Kconfig
 +++ b/arch/ia64/Kconfig
 @@ -10,6 +10,7 @@ config IA64
select ARCH_MIGHT_HAVE_PC_SERIO
select PCI if (!IA64_HP_SIM)
select ACPI if (!IA64_HP_SIM)
 +  select ACPI_GENERIC_SLEEP if ACPI
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select HAVE_UNSTABLE_SCHED_CLOCK
select HAVE_IDE
 diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
 index b7d31ca..9804431 100644
 --- a/arch/x86/Kconfig
 +++ b/arch/x86/Kconfig
 @@ -22,6 +22,7 @@ config X86_64
   ### Arch settings
   config X86
def_bool y
 +  select ACPI_GENERIC_SLEEP if ACPI
>>> One more nit.  If you did
>>>
>>> +   select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>>
>>> here (and above for ia64), you'd avoid having to make ACPI_SLEEP
>>> depend on ACPI_GENERIC_SLEEP which goes somewhat backwards.
>> In sleep.c,
>>
>> #ifdef CONFIG_ACPI_SLEEP
>> acpi_target_system_state()
>> {
>> }
>> #endif
>>
>> and CONFIG_ACPI_SLEEP depends on SUSPEND || HIBERNATION,
>> which one of them will be enabled on ARM64 so ACPI_SLEEP
>> will also enabled too.
>>
>> So if we
>>
>> +select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>
>> and
>>
>> +acpi-$(CONFIG_ACPI_GENERIC_SLEEP) += sleep.o
>>
>> it will lead to errors for acpi_target_system_state() that
>> is declared but not defined, so I will keep the code as
>> it is, what do you think?
> No, we need to hash this out.  Having two different Kconfig options 
> meaning
> almost the same thing (ACPI_SLEEP and ACPI_GENERIC_SLEEP) is beyond ugly.
>
> Do you need ACPI_SLEEP on ARM64 at all?
 No, at least for now we don't need it, the spec for sleep is not ready for
 ARM64 arch, so ACPI_SLEEP will not work at all on ARM64.
>>> Well, so what about selecting ACPI_SLEEP from the architectures that use it?
>> Do you mean remove CONFIG_ACPI_GENERIC_SLEEP and
>>
>> +acpi-$(CONFIG_ACPI_SLEEP) += sleep.o
>>
>> as well (also need to remove duplicate #ifdef CONFIG_ACPI_SLEEP in sleep.c if
>> we doing so)?
> Well, almost.  There is one problem with that, because sleep.c contains code
> outside of the ACPI_SLEEP-dependent blocks.  That code is used for powering
> off ACPI platforms.
>
> I guess you don't want that code on ARM too, right?

Yes, you are right.

>
> Perhaps we can use ACPI_REDUCED_HARDWARE_ONLY for that?  ARM64 will be the

Sorry, I can't fully understand your intention here, could you please
explain it more?

Let me guess a little bit. Do you mean using ACPI_REDUCED_HARDWARE_ONLY for
the code that powers off ACPI platforms? If so, I don't think it's a good
idea: the ACPI spec only says that S4BIOS is not supported on HW-reduced ACPI
platforms, and S5 has no such limitation. If I'm missing something here,
please let me know.

> only arch setting it at least for the time being, is that correct?

Yes, that's correct for now.

Thanks
Hanjun


Re: [PATCH 35/35 linux-next] pinctrl: constify of_device_id array

2015-03-16 Thread Jean-Christophe PLAGNIOL-VILLARD

On 20:59 Mon 16 Mar , Fabian Frederick wrote:
> of_device_id is always used as const.
> (See driver.of_match_table and open firmware functions)
> 
> Signed-off-by: Fabian Frederick 
Acked-by: Jean-Christophe PLAGNIOL-VILLARD 

Best Regards,
J.
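
For readers skimming the quoted diff below, a minimal user-space sketch of
why a match table can be const: the lookup only ever reads it.
struct sketch_of_device_id and sketch_match_node() are hypothetical,
cut-down stand-ins for illustration, not the kernel's of_device_id or
of_match_node().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for struct of_device_id. */
struct sketch_of_device_id {
	const char *compatible;
	const void *data;
};

/* const lets the table be placed in read-only data; an entry with a
 * NULL compatible string terminates it. */
static const struct sketch_of_device_id sketch_match[] = {
	{ .compatible = "brcm,bcm2835-gpio" },
	{ .compatible = NULL }
};

/* The lookup never writes to the table, so it takes a pointer to const. */
static const struct sketch_of_device_id *
sketch_match_node(const struct sketch_of_device_id *table, const char *compat)
{
	for (; table->compatible; table++)
		if (strcmp(table->compatible, compat) == 0)
			return table;
	return NULL;
}
```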
> ---
>  drivers/pinctrl/bcm/pinctrl-bcm2835.c   | 2 +-
>  drivers/pinctrl/mediatek/pinctrl-mt8135.c   | 2 +-
>  drivers/pinctrl/mediatek/pinctrl-mt8173.c   | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-armada-370.c  | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-armada-375.c  | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-armada-38x.c  | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-armada-39x.c  | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-armada-xp.c   | 2 +-
>  drivers/pinctrl/mvebu/pinctrl-kirkwood.c| 2 +-
>  drivers/pinctrl/mvebu/pinctrl-orion.c   | 2 +-
>  drivers/pinctrl/pinctrl-as3722.c| 2 +-
>  drivers/pinctrl/pinctrl-at91.c  | 4 ++--
>  drivers/pinctrl/pinctrl-palmas.c| 2 +-
>  drivers/pinctrl/pinctrl-single.c| 4 ++--
>  drivers/pinctrl/pinctrl-st.c| 2 +-
>  drivers/pinctrl/pinctrl-tz1090-pdc.c| 2 +-
>  drivers/pinctrl/pinctrl-tz1090.c| 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun4i-a10.c   | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun5i-a10s.c  | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun5i-a13.c   | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun6i-a31-r.c | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun6i-a31.c   | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun6i-a31s.c  | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun7i-a20.c   | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun8i-a23-r.c | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun8i-a23.c   | 2 +-
>  drivers/pinctrl/sunxi/pinctrl-sun9i-a80.c   | 2 +-
>  drivers/pinctrl/vt8500/pinctrl-vt8500.c | 2 +-
>  drivers/pinctrl/vt8500/pinctrl-wm8505.c | 2 +-
>  drivers/pinctrl/vt8500/pinctrl-wm8650.c | 2 +-
>  drivers/pinctrl/vt8500/pinctrl-wm8750.c | 2 +-
>  drivers/pinctrl/vt8500/pinctrl-wm8850.c | 2 +-
>  32 files changed, 34 insertions(+), 34 deletions(-)
> 
> diff --git a/drivers/pinctrl/bcm/pinctrl-bcm2835.c 
> b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
> index 9aa8a3f..4d08b85 100644
> --- a/drivers/pinctrl/bcm/pinctrl-bcm2835.c
> +++ b/drivers/pinctrl/bcm/pinctrl-bcm2835.c
> @@ -1051,7 +1051,7 @@ static int bcm2835_pinctrl_remove(struct 
> platform_device *pdev)
>   return 0;
>  }
>  
> -static struct of_device_id bcm2835_pinctrl_match[] = {
> +static const struct of_device_id bcm2835_pinctrl_match[] = {
>   { .compatible = "brcm,bcm2835-gpio" },
>   {}
>  };
> diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8135.c 
> b/drivers/pinctrl/mediatek/pinctrl-mt8135.c
> index 1296d6d..82c4af4 100644
> --- a/drivers/pinctrl/mediatek/pinctrl-mt8135.c
> +++ b/drivers/pinctrl/mediatek/pinctrl-mt8135.c
> @@ -347,7 +347,7 @@ static int mt8135_pinctrl_probe(struct platform_device 
> *pdev)
>   return mtk_pctrl_init(pdev, &mt8135_pinctrl_data);
>  }
>  
> -static struct of_device_id mt8135_pctrl_match[] = {
> +static const struct of_device_id mt8135_pctrl_match[] = {
>   {
>   .compatible = "mediatek,mt8135-pinctrl",
>   }, {
> diff --git a/drivers/pinctrl/mediatek/pinctrl-mt8173.c 
> b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> index f07cafb..594f7b5 100644
> --- a/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> +++ b/drivers/pinctrl/mediatek/pinctrl-mt8173.c
> @@ -427,7 +427,7 @@ static int mt8173_pinctrl_probe(struct platform_device 
> *pdev)
>   return mtk_pctrl_init(pdev, &mt8173_pinctrl_data);
>  }
>  
> -static struct of_device_id mt8173_pctrl_match[] = {
> +static const struct of_device_id mt8173_pctrl_match[] = {
>   {
>   .compatible = "mediatek,mt8173-pinctrl",
>   }, {
> diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-370.c 
> b/drivers/pinctrl/mvebu/pinctrl-armada-370.c
> index c4f51d0..42f930f 100644
> --- a/drivers/pinctrl/mvebu/pinctrl-armada-370.c
> +++ b/drivers/pinctrl/mvebu/pinctrl-armada-370.c
> @@ -379,7 +379,7 @@ static struct mvebu_mpp_mode mv88f6710_mpp_modes[] = {
>  
>  static struct mvebu_pinctrl_soc_info armada_370_pinctrl_info;
>  
> -static struct of_device_id armada_370_pinctrl_of_match[] = {
> +static const struct of_device_id armada_370_pinctrl_of_match[] = {
>   { .compatible = "marvell,mv88f6710-pinctrl" },
>   { },
>  };
> diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-375.c 
> b/drivers/pinctrl/mvebu/pinctrl-armada-375.c
> index cd7c8f5..ca1e757 100644
> --- a/drivers/pinctrl/mvebu/pinctrl-armada-375.c
> +++ b/drivers/pinctrl/mvebu/pinctrl-armada-375.c
> @@ -399,7 +399,7 @@ static struct mvebu_mpp_mode mv88f6720_mpp_modes[] = {
>  
>  static struct mvebu_pinctrl_soc_info armada_375_pinctrl_info;
>  
> -static struct of_device_id armada_375_pinctrl_of_match[] = {
> +static const struct of_device_id armada_375_pinctrl_of_match[] = {
>   { .compatible = "marvell,mv88f6720-pinctrl" },
>   { },
>  };
> diff --git a/drivers/pinctrl/mvebu/pinctrl-armada-38x.c 
>

Re: [PATCH 26/35 linux-next] hwmon: constify of_device_id array

2015-03-16 Thread Guenter Roeck


On 03/16/2015 12:54 PM, Fabian Frederick wrote:

> of_device_id is always used as const.
> (See driver.of_match_table and open firmware functions)
>
> Signed-off-by: Fabian Frederick 
> ---


Applied to hwmon-next.

Thanks,
Guenter



Re: Possible no longer required cast in the function,usbhs_parse_dt in common.c

2015-03-16 Thread Felipe Balbi

On Mon, Mar 16, 2015 at 11:51:15PM -0400, nick wrote:
> 
> 
> On 2015-03-16 11:37 PM, Peter Chen wrote:
> >  
> >>
> >> Greetings All,
> >> I have been getting the below build warnings:
> >> drivers/usb/renesas_usbhs/common.c: In function ‘usbhs_parse_dt’:
> >> drivers/usb/renesas_usbhs/common.c:482:25: warning: cast from pointer to
> >> integer of different size [-Wpointer-to-int-cast]
> >> dparam->type = of_id ? (u32)of_id->data : 0;
> >> After looking into the function I am curious if the hardware is only 32 
> >> bit as if
> >> the supported hardware for this driver is then this cast is no longer 
> >> required
> >> and I can send in a patch removing it. Furthermore, sorry for the simple
> >> question but  I don't have access to the device specs for supported 
> >> hardware
> >> so I though it would be better to ask before I send in patch fixing this 
> >> issue.
> >> Thanks,
> >> Nick
> > 
> > Patch is welcome, there will be comment if it is not suitable.
> > 
> > Peter
> > 
> I understand that,my question was does the hardware for this driver support 
> 64 bit.

regardless, it shouldn't produce a build warning.

-- 
balbi



RE: Possible no longer required cast in the function,usbhs_parse_dt in common.c

2015-03-16 Thread Peter Chen

 
> 
> Greetings All,
> I have been getting the below build warnings:
> drivers/usb/renesas_usbhs/common.c: In function ‘usbhs_parse_dt’:
> drivers/usb/renesas_usbhs/common.c:482:25: warning: cast from pointer to
> integer of different size [-Wpointer-to-int-cast]
> dparam->type = of_id ? (u32)of_id->data : 0;
> After looking into the function I am curious if the hardware is only 32 bit 
> as if
> the supported hardware for this driver is then this cast is no longer required
> and I can send in a patch removing it. Furthermore, sorry for the simple
> question but  I don't have access to the device specs for supported hardware
> so I though it would be better to ask before I send in patch fixing this 
> issue.
> Thanks,
> Nick

Patch is welcome, there will be comment if it is not suitable.

Peter

Re: [PATCH] tick/broadcast-hrtimer : Fix suspicious RCU usage in idle loop

2015-03-16 Thread Preeti U Murthy


On 03/16/2015 08:26 PM, Peter Zijlstra wrote:
> On Thu, Mar 05, 2015 at 10:06:30AM +0530, Preeti U Murthy wrote:
>>
>> On 03/02/2015 08:23 PM, Peter Zijlstra wrote:
>>> On Thu, Feb 26, 2015 at 08:52:02AM +0530, Preeti U Murthy wrote:
>>>> The hrtimer mode of broadcast queues hrtimers in the idle entry
>>>> path so as to wakeup cpus in deep idle states. 
>>>
>>> Callgraph please...
>>
>> cpuidle_idle_call()
>> | clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, ))
>>  |_tick_broadcast_set_event()
>>|clockevents_program_event()
>> |bc_set_next()
>>>
>>>> hrtimer_{start/cancel}
>>>> functions call into tracing which uses RCU. But it is not legal to call
>>>> into RCU in cpuidle because it is one of the quiescent states. Hence
>>>> protect this region with RCU_NONIDLE which informs RCU that the cpu
>>>> is momentarily non-idle.
>>>
>>> It it not clear to me that every user of bc_set_next() is from IDLE.
>>> From what I can tell it ends up being clockevents_program_event() and
>>> that is called quite a lot.
>>
>> bc_set_next() is called from two places:
>> 1. Idle entry : It is called when a cpu in its idle entry path finds the
>> need to reset the broadcast hrtimer.
>> 2. CPU offline operations : When the cpu on which the broadcast hrtimer
>> is being queued goes offline.
>>
>> So you see that almost all the time, it is called in idle entry path.
> 
> How about:
> 
>   hrtimer_reprogram()
> tick_program_event()
>   clockevents_program_event()
> ->set_next_ktime()
> 
> That is called from !idle loads of times. I guess I'm not seeing what
> avoids &ce_broadcast_hrtimer from being the 'normal' clock event.

It is a normal clock event. In the above context, this hrtimer is being
moved from CPUx to the CPU executing that code. Hence it needs to be
enqueued onto the new CPU. Any hrtimer enqueue calls into tracing.
A hrtimer_reprogram() alone will not suffice.
Moreover, hrtimer_reprogram() cannot be called directly, can it? Nor is
it safe. Or am I missing your point?

> 
> Sure; it might be that for power you only end up with that broadcast
> crap enabled on idle/hotplug, but is this always so?

The hrtimer broadcast framework gets invoked only during idle. This is
platform agnostic.

Regards
Preeti U Murthy
> 


[PATCH v3 1/3] of/unittest: replace 'selftest' with 'unittest'

2015-03-16 Thread Wang Long

This patch just replaces the string 'selftest' with 'unittest'
in the OF unittest code, its test data, and the binding file.

I have tested it successfully on ARM.

Signed-off-by: Gaurav Minocha 
Signed-off-by: Wang Long 
---
 Documentation/devicetree/bindings/unittest.txt |  44 +-
 drivers/of/unittest-data/tests-overlay.dtsi| 108 ++--
 drivers/of/unittest.c  | 706 -
 3 files changed, 429 insertions(+), 429 deletions(-)

diff --git a/Documentation/devicetree/bindings/unittest.txt 
b/Documentation/devicetree/bindings/unittest.txt
index 8933211..3bf58c2 100644
--- a/Documentation/devicetree/bindings/unittest.txt
+++ b/Documentation/devicetree/bindings/unittest.txt
@@ -1,60 +1,60 @@
-1) OF selftest platform device
+1) OF unittest platform device
 
-** selftest
+** unittest
 
 Required properties:
-- compatible: must be "selftest"
+- compatible: must be "unittest"
 
 All other properties are optional.
 
 Example:
-   selftest {
-   compatible = "selftest";
+   unittest {
+   compatible = "unittest";
status = "okay";
};
 
-2) OF selftest i2c adapter platform device
+2) OF unittest i2c adapter platform device
 
 ** platform device unittest adapter
 
 Required properties:
-- compatible: must be selftest-i2c-bus
+- compatible: must be unittest-i2c-bus
 
-Children nodes contain selftest i2c devices.
+Children nodes contain unittest i2c devices.
 
 Example:
-   selftest-i2c-bus {
-   compatible = "selftest-i2c-bus";
+   unittest-i2c-bus {
+   compatible = "unittest-i2c-bus";
status = "okay";
};
 
-3) OF selftest i2c device
+3) OF unittest i2c device
 
-** I2C selftest device
+** I2C unittest device
 
 Required properties:
-- compatible: must be selftest-i2c-dev
+- compatible: must be unittest-i2c-dev
 
 All other properties are optional
 
 Example:
-   selftest-i2c-dev {
-   compatible = "selftest-i2c-dev";
+   unittest-i2c-dev {
+   compatible = "unittest-i2c-dev";
status = "okay";
};
 
-4) OF selftest i2c mux device
+4) OF unittest i2c mux device
 
-** I2C selftest mux
+** I2C unittest mux
 
 Required properties:
-- compatible: must be selftest-i2c-mux
+- compatible: must be unittest-i2c-mux
 
-Children nodes contain selftest i2c bus nodes per channel.
+Children nodes contain unittest i2c bus nodes per channel.
 
 Example:
-   selftest-i2c-mux {
-   compatible = "selftest-i2c-mux";
+   unittest-i2c-mux {
+   compatible = "unittest-i2c-mux";
status = "okay";
#address-cells = <1>;
#size-cells = <0>;
@@ -64,7 +64,7 @@ Example:
#size-cells = <0>;
i2c-dev {
reg = <8>;
-   compatible = "selftest-i2c-dev";
+   compatible = "unittest-i2c-dev";
status = "okay";
};
};
diff --git a/drivers/of/unittest-data/tests-overlay.dtsi 
b/drivers/of/unittest-data/tests-overlay.dtsi
index 244226c..02ba56c 100644
--- a/drivers/of/unittest-data/tests-overlay.dtsi
+++ b/drivers/of/unittest-data/tests-overlay.dtsi
@@ -4,94 +4,94 @@
overlay-node {
 
/* test bus */
-   selftestbus: test-bus {
+   unittestbus: test-bus {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <0>;
 
-   selftest100: test-selftest100 {
-   compatible = "selftest";
+   unittest100: test-unittest100 {
+   compatible = "unittest";
status = "okay";
reg = <100>;
};
 
-   selftest101: test-selftest101 {
-   compatible = "selftest";
+   unittest101: test-unittest101 {
+   compatible = "unittest";
status = "disabled";
reg = <101>;
};
 
-   selftest0: test-selftest0 {
-   compatible = "selftest";
+   unittest0: test-unittest0 {
+   compatible = "unittest";
status = "disabled";
reg = <0>;
};
 
-   selftest1: test-selftest1 {
-   compatible =

[PATCH v3 2/3] Documentation: rename of_selftest.txt to of_unittest.txt

2015-03-16 Thread Wang Long

The tests of the devicetree's OF API now use 'unittest' as their
name, so rename of_selftest.txt to of_unittest.txt.

Signed-off-by: Wang Long 
---
 Documentation/devicetree/{of_selftest.txt => of_unittest.txt} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename Documentation/devicetree/{of_selftest.txt => of_unittest.txt} (100%)

diff --git a/Documentation/devicetree/of_selftest.txt 
b/Documentation/devicetree/of_unittest.txt
similarity index 100%
rename from Documentation/devicetree/of_selftest.txt
rename to Documentation/devicetree/of_unittest.txt
-- 
1.8.3.4


[PATCH v3 3/3] Documentation: update the of_unittest.txt

2015-03-16 Thread Wang Long

The directory "drivers/of/testcase-data" has been renamed to
"drivers/of/unittest-data", so update the path in
of_unittest.txt.

When the kernel is built with OF_UNITTEST enabled, the output
dtb is testcases.dtb instead of testcase.dtb, so update that as well
(s/testcase/testcases/).

Signed-off-by: Wang Long 
---
 Documentation/devicetree/of_unittest.txt | 35 
 1 file changed, 18 insertions(+), 17 deletions(-)

diff --git a/Documentation/devicetree/of_unittest.txt 
b/Documentation/devicetree/of_unittest.txt
index 57a808b..d79a6bc 100644
--- a/Documentation/devicetree/of_unittest.txt
+++ b/Documentation/devicetree/of_unittest.txt
@@ -1,11 +1,11 @@
-Open Firmware Device Tree Selftest
+Open Firmware Device Tree Unittest
 --
 
 Author: Gaurav Minocha 
 
 1. Introduction
 
-This document explains how the test data required for executing OF selftest
+This document explains how the test data required for executing OF unittest
 is attached to the live tree dynamically, independent of the machine's
 architecture.
 
@@ -22,31 +22,32 @@ most of the device drivers in various use cases.
 
 2. Test-data
 
-The Device Tree Source file (drivers/of/testcase-data/testcases.dts) contains
+The Device Tree Source file (drivers/of/unittest-data/testcases.dts) contains
 the test data required for executing the unit tests automated in
-drivers/of/selftests.c. Currently, following Device Tree Source Include files
-(.dtsi) are included in testcase.dts:
+drivers/of/unittest.c. Currently, following Device Tree Source Include files
+(.dtsi) are included in testcases.dts:
 
-drivers/of/testcase-data/tests-interrupts.dtsi
-drivers/of/testcase-data/tests-platform.dtsi
-drivers/of/testcase-data/tests-phandle.dtsi
-drivers/of/testcase-data/tests-match.dtsi
+drivers/of/unittest-data/tests-interrupts.dtsi
+drivers/of/unittest-data/tests-platform.dtsi
+drivers/of/unittest-data/tests-phandle.dtsi
+drivers/of/unittest-data/tests-match.dtsi
+drivers/of/unittest-data/tests-overlay.dtsi
 
 When the kernel is build with OF_SELFTEST enabled, then the following make rule
 
 $(obj)/%.dtb: $(src)/%.dts FORCE
$(call if_changed_dep, dtc)
 
-is used to compile the DT source file (testcase.dts) into a binary blob
-(testcase.dtb), also referred as flattened DT.
+is used to compile the DT source file (testcases.dts) into a binary blob
+(testcases.dtb), also referred as flattened DT.
 
 After that, using the following rule the binary blob above is wrapped as an
-assembly file (testcase.dtb.S).
+assembly file (testcases.dtb.S).
 
 $(obj)/%.dtb.S: $(obj)/%.dtb
$(call cmd, dt_S_dtb)
 
-The assembly file is compiled into an object file (testcase.dtb.o), and is
+The assembly file is compiled into an object file (testcases.dtb.o), and is
 linked into the kernel image.
 
 
@@ -98,8 +99,8 @@ child11 -> sibling12 -> sibling13 -> sibling14 -> null
 Figure 1: Generic structure of un-flattened device tree
 
 
-Before executing OF selftest, it is required to attach the test data to
-machine's device tree (if present). So, when selftest_data_add() is called,
+Before executing OF unittest, it is required to attach the test data to
+machine's device tree (if present). So, when unittest_data_add() is called,
 at first it reads the flattened device tree data linked into the kernel image
 via the following kernel symbols:
 
@@ -186,10 +187,10 @@ update_node_properties().
 
 2.2. Removing the test data
 
-Once the test case execution is complete, selftest_data_remove is called in
+Once the test case execution is complete, unittest_data_remove is called in
 order to remove the device nodes attached initially (first the leaf nodes are
 detached and then moving up the parent nodes are removed, and eventually the
-whole tree). selftest_data_remove() calls detach_node_and_children() that uses
+whole tree). unittest_data_remove() calls detach_node_and_children() that uses
 of_detach_node() to detach the nodes from the live device tree.
 
 To detach a node, of_detach_node() either updates the child pointer of given
-- 
1.8.3.4


[PATCH v3 0/3] replace 'selftest' with 'unittest' and update the document.

2015-03-16 Thread Wang Long

This patch series replaces 'selftest' with 'unittest' in the
OF unittest code and updates the documentation.

The first patch comes from Gaurav Minocha; I updated it because
it does not apply to linux 4.0-rc4 with the 'git am' command.


* v3 <- v2:
- Rebase the patch on 4.0-rc4
- Re-adjust the sequence of the patches

* v2 <- v1:
- Following Gaurav's advice, generate the file-rename
patch correctly.

Wang Long (3):
  of/unittest: replace 'selftest' with 'unittest'
  Documentation: rename of_selftest.txt to of_unittest.txt
  Documentation: update the of_unittest.txt

 Documentation/devicetree/bindings/unittest.txt |  44 +-
 .../{of_selftest.txt => of_unittest.txt}   |  35 +-
 drivers/of/unittest-data/tests-overlay.dtsi| 108 ++--
 drivers/of/unittest.c  | 706 ++---
 4 files changed, 447 insertions(+), 446 deletions(-)
 rename Documentation/devicetree/{of_selftest.txt => of_unittest.txt} (87%)

-- 
1.8.3.4


Re: linux panic on 4.0.0-rc4

2015-03-16 Thread Peter Hurley

On 03/16/2015 11:12 PM, Pranith Kumar wrote:
> On Mon, Mar 16, 2015 at 10:58 PM, Peter Hurley  
> wrote:
>>>> What is your init?
>>>
>>> I am using systemd from debian unstable.
>>
>> Do you have a stdout-path property defined in your dts to a serial
>> console you're not actually using?
>>
> 
> I am using tty0 as my console. From the config which I posted, it has:
> 
> CONFIG_CMDLINE="console=ttyS0,9600 console=tty0"
> 
> I am not using any device tree file.

Ok; there was some reported breakage on PowerMac with respect to the
'stdout-path' changes I made, so I thought I'd check whether that might
have affected your setup as well.

Regards,
Peter Hurley


Re: linux panic on 4.0.0-rc4

2015-03-16 Thread Pranith Kumar

On Mon, Mar 16, 2015 at 10:58 PM, Peter Hurley  wrote:
>>> What is your init?
>>
>> I am using systemd from debian unstable.
>
> Do you have a stdout-path property defined in your dts to a serial
> console you're not actually using?
>

I am using tty0 as my console. From the config which I posted, it has:

CONFIG_CMDLINE="console=ttyS0,9600 console=tty0"

I am not using any device tree file.
-- 
Pranith

[PATCH] kdbus: fix header guard name

2015-03-16 Thread lucas . de . marchi

From: Lucas De Marchi 

UAPI headers have a _UAPI_ as prefix, which is removed during
headers_install. If it's put as a suffix it will not be removed and will
be the only header with UAPI in the header guard macro.

Signed-off-by: Lucas De Marchi 
---
 include/uapi/linux/kdbus.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/kdbus.h b/include/uapi/linux/kdbus.h
index fc1d77d..302862f 100644
--- a/include/uapi/linux/kdbus.h
+++ b/include/uapi/linux/kdbus.h
@@ -5,8 +5,8 @@
  * your option) any later version.
  */
 
-#ifndef _KDBUS_UAPI_H_
-#define _KDBUS_UAPI_H_
+#ifndef _UAPI_KDBUS_H_
+#define _UAPI_KDBUS_H_
 
 #include 
 #include 
-- 
2.3.3
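
The header-guard reasoning in the commit message above can be modelled with
a small user-space sketch. This is a simplified assumption about what
headers_install does to guard names, not the real script:
strip_uapi_prefix() is a hypothetical helper that drops a leading "_UAPI"
and leaves every other name untouched.

```c
#include <stddef.h>
#include <string.h>

/* Simplified model (an assumption, not the real headers_install logic):
 * drop a leading "_UAPI" from a header guard macro name and copy any
 * other name through unchanged.
 * Returns 0 on success, -1 if out is too small. */
static int strip_uapi_prefix(const char *guard, char *out, size_t outlen)
{
	static const char prefix[] = "_UAPI";
	size_t plen = sizeof(prefix) - 1;
	const char *src = guard;

	if (strncmp(guard, prefix, plen) == 0)
		src = guard + plen;	/* prefix form: stripped */
	if (strlen(src) + 1 > outlen)
		return -1;
	strcpy(out, src);
	return 0;
}
```

Under this model "_UAPI_KDBUS_H_" becomes "_KDBUS_H_", while the old
"_KDBUS_UAPI_H_" passes through with UAPI still embedded, which is the
inconsistency the patch removes.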


Re: [PATCH kernel v6 26/29] vfio: powerpc/spapr: Define v2 IOMMU

2015-03-16 Thread Alexey Kardashevskiy


On 03/17/2015 06:45 AM, Alex Williamson wrote:

On Fri, 2015-03-13 at 19:07 +1100, Alexey Kardashevskiy wrote:

The existing IOMMU requires VFIO_IOMMU_ENABLE call to enable actual use
of the container (i.e. call DMA map/unmap) and this is where we check
the rlimit for locked pages. It assumes that only as much memory
as a default DMA window can be mapped. Every DMA map/unmap request will
do pinning/unpinning of physical pages.

New IOMMU will split physical pages pinning and TCE table update.
It will require guest pages to be registered first and consequent
map/unmap requests to work only with pre-registered memory.
For the default single window case this means that the entire guest
(instead of 2GB) needs to be pinned before using VFIO.
However when a huge DMA window is added, no additional pinning will be
required, otherwise it would be guest RAM + 2GB.

This advertises v2 SPAPR TCE IOMMU and restricts what the userspace
can do with v1 or v2 IOMMUs.

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v6:
* enforced limitations imposed by the SPAPR TCE IOMMU version
---
  drivers/vfio/vfio_iommu_spapr_tce.c | 18 +-
  include/uapi/linux/vfio.h   |  2 ++
  2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c 
b/drivers/vfio/vfio_iommu_spapr_tce.c
index 9d240b4..e191438 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -95,6 +95,7 @@ struct tce_container {
bool enabled;
unsigned long locked_pages;
struct list_head mem_list;
+   bool v2;
  };

  struct tce_memory {
@@ -398,7 +399,7 @@ static void *tce_iommu_open(unsigned long arg)
  {
struct tce_container *container;

-   if (arg != VFIO_SPAPR_TCE_IOMMU) {
+   if ((arg != VFIO_SPAPR_TCE_IOMMU) && (arg != VFIO_SPAPR_TCE_v2_IOMMU)) {
pr_err("tce_vfio: Wrong IOMMU type\n");
return ERR_PTR(-EINVAL);
}
@@ -410,6 +411,8 @@ static void *tce_iommu_open(unsigned long arg)
mutex_init(&container->lock);
INIT_LIST_HEAD_RCU(&container->mem_list);

+   container->v2 = arg == VFIO_SPAPR_TCE_v2_IOMMU;
+
return container;
  }

@@ -580,6 +583,7 @@ static long tce_iommu_ioctl(void *iommu_data,
case VFIO_CHECK_EXTENSION:
switch (arg) {
case VFIO_SPAPR_TCE_IOMMU:
+   case VFIO_SPAPR_TCE_v2_IOMMU:
ret = 1;
break;
default:
@@ -719,6 +723,9 @@ static long tce_iommu_ioctl(void *iommu_data,
case VFIO_IOMMU_SPAPR_REGISTER_MEMORY: {
struct vfio_iommu_spapr_register_memory param;

+   if (!container->v2)
+   return -EPERM;
+
minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
size);

@@ -741,6 +748,9 @@ static long tce_iommu_ioctl(void *iommu_data,
case VFIO_IOMMU_SPAPR_UNREGISTER_MEMORY: {
struct vfio_iommu_spapr_register_memory param;

+   if (!container->v2)
+   return -EPERM;
+
minsz = offsetofend(struct vfio_iommu_spapr_register_memory,
size);

@@ -761,6 +771,9 @@ static long tce_iommu_ioctl(void *iommu_data,
return 0;
}
case VFIO_IOMMU_ENABLE:
+   if (container->v2)
+   return -EPERM;
+
mutex_lock(&container->lock);
ret = tce_iommu_enable(container);
mutex_unlock(&container->lock);
@@ -768,6 +781,9 @@ static long tce_iommu_ioctl(void *iommu_data,


case VFIO_IOMMU_DISABLE:
+   if (container->v2)
+   return -EPERM;
+
mutex_lock(&container->lock);
tce_iommu_disable(container);
mutex_unlock(&container->lock);



I wouldn't have guessed; nothing in the documentation suggests these
ioctls are deprecated in v2 (ie. please document).  If the ioctl doesn't
exist for the IOMMU type, why not simply break and let it fall out at
-ENOTTY?  Same for the above, v1 would have previously returned -ENOTTY
for those ioctls, why change to -EPERM?
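
The suggestion above can be sketched in user space as follows. This is only
an illustration under stated assumptions: CMD_ENABLE, CMD_REGISTER_MEMORY
and sketch_ioctl() are hypothetical names, and the real command values live
in include/uapi/linux/vfio.h.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical command numbers for illustration only. */
enum { CMD_ENABLE = 1, CMD_REGISTER_MEMORY = 2 };

/* Commands that do not exist for the current IOMMU version break out of
 * the switch and share the common -ENOTTY return, instead of returning a
 * dedicated -EPERM. */
static long sketch_ioctl(bool v2, unsigned int cmd)
{
	switch (cmd) {
	case CMD_ENABLE:
		if (v2)
			break;		/* v1-only command */
		return 0;
	case CMD_REGISTER_MEMORY:
		if (!v2)
			break;		/* v2-only command */
		return 0;
	}
	return -ENOTTY;
}
```

With this shape, an unsupported command and an unknown command look the
same to userspace, which matches what v1 returned before the patch.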



Good points. I'll fix them and merge this patch into "vfio: powerpc/spapr: 
Register memory", which is where it actually belongs. Agree?



Thanks for the review!


--
Alexey

Re: [update][PATCH v10 06/21] ACPI / sleep: Introduce CONFIG_ACPI_GENERIC_SLEEP

2015-03-16 Thread Rafael J. Wysocki

On Tuesday, March 17, 2015 10:36:47 AM Hanjun Guo wrote:
> On 2015/3/17 10:28, Rafael J. Wysocki wrote:
> > On Tuesday, March 17, 2015 09:08:45 AM Hanjun Guo wrote:
> >> On 2015/3/17 7:15, Rafael J. Wysocki wrote:
> >>> On Monday, March 16, 2015 08:14:52 PM Hanjun Guo wrote:
>  On 2015年03月14日 05:49, Rafael J. Wysocki wrote:
> > On Friday, March 13, 2015 04:14:29 PM Hanjun Guo wrote:
> >> [...]
> >> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> >> index 074e52b..e8728d7 100644
> >> --- a/arch/ia64/Kconfig
> >> +++ b/arch/ia64/Kconfig
> >> @@ -10,6 +10,7 @@ config IA64
> >>select ARCH_MIGHT_HAVE_PC_SERIO
> >>select PCI if (!IA64_HP_SIM)
> >>select ACPI if (!IA64_HP_SIM)
> >> +  select ACPI_GENERIC_SLEEP if ACPI
> >>select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
> >>select HAVE_UNSTABLE_SCHED_CLOCK
> >>select HAVE_IDE
> >> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >> index b7d31ca..9804431 100644
> >> --- a/arch/x86/Kconfig
> >> +++ b/arch/x86/Kconfig
> >> @@ -22,6 +22,7 @@ config X86_64
> >>   ### Arch settings
> >>   config X86
> >>def_bool y
> >> +  select ACPI_GENERIC_SLEEP if ACPI
> > One more nit.  If you did
> >
> > +   select ACPI_GENERIC_SLEEP if ACPI_SLEEP
> >
> > here (and above for ia64), you'd avoid having to make ACPI_SLEEP
> > depend on ACPI_GENERIC_SLEEP which goes somewhat backwards.
>  In sleep.c,
> 
>  #ifdef CONFIG_ACPI_SLEEP
>  acpi_target_system_state()
>  {
>  }
>  #endif
> 
>  and CONFIG_ACPI_SLEEP depends on SUSPEND || HIBERNATION,
>  one of which will be enabled on ARM64, so ACPI_SLEEP
>  will be enabled too.
> 
>  So if we
> 
>  +select ACPI_GENERIC_SLEEP if ACPI_SLEEP
> 
>  and
> 
>  +acpi-$(CONFIG_ACPI_GENERIC_SLEEP) += sleep.o
> 
>  it will lead to errors for acpi_target_system_state() that
>  is declared but not defined, so I will keep the code as
>  it is, what do you think?
> >>> No, we need to hash this out.  Having two different Kconfig options
> >>> meaning almost the same thing (ACPI_SLEEP and ACPI_GENERIC_SLEEP) is
> >>> beyond ugly.
> >>>
> >>> Do you need ACPI_SLEEP on ARM64 at all?
> >> No, at least for now we don't need it, the spec for sleep is not ready for
> >> ARM64 arch, so ACPI_SLEEP will not work at all on ARM64.
> > Well, so what about selecting ACPI_SLEEP from the architectures that use it?
> 
> Do you mean remove CONFIG_ACPI_GENERIC_SLEEP and
> 
> +acpi-$(CONFIG_ACPI_SLEEP) += sleep.o
> 
> as well (we would also need to remove the duplicate #ifdef CONFIG_ACPI_SLEEP
> in sleep.c if we do so)?

Well, almost.  There is one problem with that, because sleep.c contains code
outside of the ACPI_SLEEP-dependent blocks.  That code is used for powering
off ACPI platforms.

I guess you don't want that code on ARM either, right?

Perhaps we can use ACPI_REDUCED_HARDWARE_ONLY for that?  ARM64 will be the
only arch setting it at least for the time being, is that correct?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: linux panic on 4.0.0-rc4

2015-03-16 Thread Peter Hurley

On 03/16/2015 10:02 PM, Pranith Kumar wrote:
> On Mon, Mar 16, 2015 at 7:22 PM, Michael Ellerman  wrote:
>>
>> The log shows that init is being killed, that's what's causing the panic.
>>
>> The exitcode of init is 0x200, which due to the vagaries of UNIX is I think
>> an "exit status" of 2 in the common usage.
>>
>> But it suggests that your init is just exiting for some reason?
>>
> 
> Yeah, seems like that. Not sure why though. git bisect seems to be the
> only option.
> 
>> What is your init?
> 
> I am using systemd from debian unstable.

Do you have a stdout-path property defined in your dts to a serial
console you're not actually using?
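As a side note on the 0x200 exit code quoted above: on Linux the value passed around as a wait status keeps the exit status in bits 8 to 15, so 0x200 decodes to an exit status of 2. A minimal user-space sketch using the standard <sys/wait.h> macros follows; the helper name exit_status_from_wait() is made up for this example.

```c
#include <sys/wait.h>

/*
 * Decode a wait(2)-style status word.
 * Returns the exit status (0..255) if the process exited normally,
 * or -1 if it ended some other way (for example, killed by a signal).
 */
int exit_status_from_wait(int status)
{
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;
}
```

Calling exit_status_from_wait(0x200) on Linux gives 2, which matches the reading of the panic log above: init exited on its own with status 2 rather than being killed by a signal.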



Re: [PATCH kernel v6 29/29] vfio: powerpc/spapr: Support Dynamic DMA windows

2015-03-16 Thread Alex Williamson

On Tue, 2015-03-17 at 12:02 +1100, Alexey Kardashevskiy wrote:
> On 03/17/2015 06:38 AM, Alex Williamson wrote:
> > On Fri, 2015-03-13 at 19:07 +1100, Alexey Kardashevskiy wrote:
> >> This adds create/remove window ioctls to create and remove DMA windows.
> >> sPAPR defines a Dynamic DMA windows capability which allows
> >> para-virtualized guests to create additional DMA windows on a PCI bus.
> >> The existing linux kernels use this new window to map the entire guest
> >> memory and switch to the direct DMA operations saving time on map/unmap
> >> requests which would normally happen in large numbers.
> >>
> >> This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and
> >> VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows.
> >> Up to 2 windows are supported now by the hardware and by this driver.
> >>
> >> This changes VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional
> >> information such as the number of supported windows and the maximum number
> >> of levels of TCE tables.
> >>
> >> DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature
> >> as we still want to support v2 on platforms which cannot do DDW for
> >> the sake of TCE acceleration in KVM (coming soon).
> >>
> >> Signed-off-by: Alexey Kardashevskiy 
> >> ---
> >> Changes:
> >> v6:
> >> * added explicit VFIO_IOMMU_INFO_DDW flag to vfio_iommu_spapr_tce_info,
> >> it used to be page mask flags from platform code
> >> * added explicit pgsizes field
> >> * added cleanup if tce_iommu_create_window() failed in a middle
> >> * added checks for callbacks in tce_iommu_create_window and remove those
> >> from tce_iommu_remove_window when it is too late to test anyway
> >> * spapr_tce_find_free_table returns sensible error code now
> >> * updated description of VFIO_IOMMU_SPAPR_TCE_CREATE/
> >> VFIO_IOMMU_SPAPR_TCE_REMOVE
> >>
> >> v4:
> >> * moved code to tce_iommu_create_window()/tce_iommu_remove_window()
> >> helpers
> >> * added docs
> >> ---
> >>   Documentation/vfio.txt  |  19 
> >>   arch/powerpc/include/asm/iommu.h|   2 +-
> >>   drivers/vfio/vfio_iommu_spapr_tce.c | 206 +++-
> >>   include/uapi/linux/vfio.h   |  41 ++-
> >>   4 files changed, 265 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
> >> index 791e85c..61ce393 100644
> >> --- a/Documentation/vfio.txt
> >> +++ b/Documentation/vfio.txt
> >> @@ -446,6 +446,25 @@ the memory block.
> >>   The user space is not expected to call these often and the block descriptors
> >>   are stored in a linked list in the kernel.
> >>
> >> +6) sPAPR specification allows guests to have an ddditional DMA window(s) on
> >
> >
> > s/ddditional/additional/
> >
> >> +a PCI bus with a variable page size. Two ioctls have been added to support
> >> +this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE.
> >> +The platform has to support the functionality or error will be returned to
> >> +the userspace. The existing hardware supports up to 2 DMA windows, one is
> >> +2GB long, uses 4K pages and called "default 32bit window"; the other can
> >> +be as big as entire RAM, use different page size, it is optional - guests
> >> +create those in run-time if the guest driver supports 64bit DMA.
> >> +
> >> +VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and
> >> +a number of TCE table levels (if a TCE table is going to be big enough and
> >> +the kernel may not be able to allocate enough of physicall contiguous memory).
> >
> > s/physicall/physically/
> >
> >> +It creates a new window in the available slot and returns the bus address where
> >> +the new window starts. Due to hardware limitation, the user space cannot choose
> >> +the location of DMA windows.
> >> +
> >> +VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window
> >> +and removes it.
> >> +
> >>   
> >> ---
> >>
> >>   [1] VFIO was originally an acronym for "Virtual Function I/O" in its
> >> diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
> >> index 13145a2..bac02bf 100644
> >> --- a/arch/powerpc/include/asm/iommu.h
> >> +++ b/arch/powerpc/include/asm/iommu.h
> >> @@ -138,7 +138,7 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
> >>   extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
> >>int nid);
> >>
> >> -#define IOMMU_TABLE_GROUP_MAX_TABLES  1
> >> +#define IOMMU_TABLE_GROUP_MAX_TABLES  2
> >>
> >>   struct iommu_table_group;
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
> >> index d94116b..0129a4f 100644
> >> --- a/drivers/vfio/vfio_iommu_spapr_tce.c
> >> +++ b/drivers/vfio/vfio_iommu_spapr_tce.c
> >> @@ -600,11 +600,137 @@ static long

[RFC PATCH v4 02/12] kmod - rename call_usermodehelper() flags parameter

2015-03-16 Thread Ian Kent

From: Ian Kent 

The wait parameter of call_usermodehelper() is not quite a parameter
that describes the wait behaviour alone and will later be used to
request execution within the current namespaces. This flag is tied
to the wait field of the subprocess_info structure which is also
a field that doesn't specify wait behaviour alone and is used to
hold the passed flags information.

So change both the parameter and structure field name to flags.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 include/linux/kmod.h |6 +++---
 kernel/kmod.c|   32 +---
 2 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 0555cc6..e647ddb 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -59,7 +59,7 @@ struct subprocess_info {
char *path;
char **argv;
char **envp;
-   int wait;
+   unsigned int flags;
int retval;
int (*init)(struct subprocess_info *info, struct cred *new);
void (*cleanup)(struct subprocess_info *info);
@@ -67,7 +67,7 @@ struct subprocess_info {
 };
 
 extern int
-call_usermodehelper(char *path, char **argv, char **envp, int wait);
+call_usermodehelper(char *path, char **argv, char **envp, unsigned int flags);
 
 extern struct subprocess_info *
 call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
@@ -75,7 +75,7 @@ call_usermodehelper_setup(char *path, char **argv, char **envp, gfp_t gfp_mask,
  void (*cleanup)(struct subprocess_info *), void *data);
 
 extern int
-call_usermodehelper_exec(struct subprocess_info *info, int wait);
+call_usermodehelper_exec(struct subprocess_info *info, unsigned int flags);
 
 extern struct ctl_table usermodehelper_table[];
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index 2777f40..e968e2d 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -259,7 +259,7 @@ static int call_usermodehelper(void *data)
 out:
sub_info->retval = retval;
/* wait_for_helper() will call umh_complete if UHM_WAIT_PROC. */
-   if (!(sub_info->wait & UMH_WAIT_PROC))
+   if (!(sub_info->flags & UMH_WAIT_PROC))
umh_complete(sub_info);
if (!retval)
return 0;
@@ -310,7 +310,7 @@ static void __call_usermodehelper(struct work_struct *work)
container_of(work, struct subprocess_info, work);
pid_t pid;
 
-   if (sub_info->wait & UMH_WAIT_PROC)
+   if (sub_info->flags & UMH_WAIT_PROC)
pid = kernel_thread(wait_for_helper, sub_info,
CLONE_FS | CLONE_FILES | SIGCHLD);
else
@@ -525,16 +525,17 @@ EXPORT_SYMBOL(call_usermodehelper_setup);
 /**
  * call_usermodehelper_exec - start a usermode application
  * @sub_info: information about the subprocessa
- * @wait: wait for the application to finish and return status.
- *when UMH_NO_WAIT don't wait at all, but you get no useful error back
- *when the program couldn't be exec'ed. This makes it safe to call
- *from interrupt context.
+ * @flags: flag to indicate whether to wait for the application to finish
+ *   and return status. If UMH_NO_WAIT is set don't wait at all, but
+ *   you get no useful error back when the program couldn't be exec'ed.
+ *   This makes it safe to call from interrupt context.
  *
  * Runs a user-space application.  The application is started
  * asynchronously if wait is not set, and runs as a child of keventd.
  * (ie. it runs with full root capabilities).
  */
-int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
+int call_usermodehelper_exec(struct subprocess_info *sub_info,
+unsigned int flags)
 {
DECLARE_COMPLETION_ONSTACK(done);
int retval = 0;
@@ -553,14 +554,14 @@ int call_usermodehelper_exec(struct subprocess_info *sub_info, int wait)
 * This makes it possible to use umh_complete to free
 * the data structure in case of UMH_NO_WAIT.
 */
-   sub_info->complete = (wait == UMH_NO_WAIT) ? NULL : &done;
-   sub_info->wait = wait;
+   sub_info->complete = (flags & UMH_NO_WAIT) ? NULL : &done;
+   sub_info->flags = flags;
 
queue_work(khelper_wq, &sub_info->work);
-   if (wait == UMH_NO_WAIT)/* task has freed sub_info */
+   if (flags & UMH_NO_WAIT)/* task has freed sub_info */
goto unlock;
 
-   if (wait & UMH_KILLABLE) {
+   if (flags & UMH_KILLABLE) {
retval = wait_for_completion_killable(&done);
if (!retval)
goto wait_done;
@@ -587,7 +588,7 @@ EXPORT_SYMBOL(call_usermodehelper_exec);
  * @path: path to usermode executable
  * @argv: arg vector for process
  * @envp: environment for process
- * @wait:

[RFC PATCH v4 05/12] kmod - teach call_usermodehelper() to use a namespace

2015-03-16 Thread Ian Kent

From: Ian Kent 

The call_usermodehelper() function executes all binaries in the
global "init" root context. This doesn't allow a binary to be run
within a namespace (eg. the namespaces of a container).

When the UMH_USE_NS flag is used, the init process of the caller's
environment is used to set up the namespaces, in almost the same way
as the root init process is.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 kernel/kmod.c |   66 +
 1 file changed, 66 insertions(+)

diff --git a/kernel/kmod.c b/kernel/kmod.c
index 213dbe0..d6ee21a 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -56,6 +56,8 @@ static kernel_cap_t usermodehelper_inheritable = CAP_FULL_SET;
 static DEFINE_SPINLOCK(umh_sysctl_lock);
 static DECLARE_RWSEM(umhelper_sem);
 
+static void umh_put_nsproxy(struct subprocess_info *);
+
 #ifdef CONFIG_MODULES
 
 /*
@@ -194,6 +196,7 @@ static void call_usermodehelper_freeinfo(struct subprocess_info *info)
 {
if (info->cleanup)
(*info->cleanup)(info);
+   umh_put_nsproxy(info);
kfree(info);
 }
 
@@ -565,6 +568,61 @@ static void helper_unlock(void)
wake_up(&running_helpers_waitq);
 }
 
+#ifndef CONFIG_NAMESPACES
+static int umh_get_nsproxy(struct subprocess_info *sub_info)
+{
+   return -ENOTSUP;
+}
+
+static void umh_put_nsproxy(struct subprocess_info *sub_info)
+{
+}
+#else
+static int umh_get_nsproxy(struct subprocess_info *sub_info)
+{
+   struct umh_ns_info *nsinfo = &sub_info->nsinfo;
+   struct task_struct *tsk;
+   struct user_namespace *user_ns;
+   struct nsproxy *new;
+   int err = 0;
+
+   rcu_read_lock();
+   tsk = find_task_by_vpid(1);
+   if (tsk)
+   get_task_struct(tsk);
+   rcu_read_unlock();
+   if (!tsk) {
+   err = -ESRCH;
+   goto out;
+   }
+
+   user_ns = get_user_ns(tsk->cred->user_ns);
+
+   new = create_new_namespaces(0, tsk, user_ns, tsk->fs);
+   if (IS_ERR(new)) {
+   err = PTR_ERR(new);
+   put_user_ns(user_ns);
+   put_task_struct(tsk);
+   goto out;
+   }
+
+   put_task_struct(tsk);
+
+   nsinfo->nsproxy = new;
+   nsinfo->user_ns = user_ns;
+out:
+   return err;
+}
+
+static void umh_put_nsproxy(struct subprocess_info *sub_info)
+{
+   if (sub_info->nsinfo.nsproxy) {
+   put_nsproxy(sub_info->nsinfo.nsproxy);
+   put_user_ns(sub_info->nsinfo.user_ns);
+   }
+}
+#endif
+
 /**
  * call_usermodehelper_setup - prepare to call a usermode helper
  * @path: path to usermode executable
@@ -697,6 +755,14 @@ int call_usermodehelper(char *path,
if (info == NULL)
return -ENOMEM;
 
+   if (flags & UMH_USE_NS) {
+   int err = umh_get_nsproxy(info);
+   if (err) {
+   kfree(info);
+   return err;
+   }
+   }
+
return call_usermodehelper_exec(info, flags);
 }
 EXPORT_SYMBOL(call_usermodehelper);


[RFC PATCH v4 04/12] kmod - add namespace aware thread runner

2015-03-16 Thread Ian Kent

From: Ian Kent 

Make usermode helper thread runner namespace aware.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 include/linux/kmod.h |   12 ++
 kernel/kmod.c|   98 --
 2 files changed, 106 insertions(+), 4 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index e647ddb..64c81c9 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -25,6 +25,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define KMOD_PATH_LEN 256
 
@@ -52,6 +53,14 @@ struct file;
 #define UMH_WAIT_EXEC  1   /* wait for the exec, but not the process */
 #define UMH_WAIT_PROC  2   /* wait for the process to complete */
 #define UMH_KILLABLE   4   /* wait for EXEC/PROC killable */
+#define UMH_USE_NS 32  /* exec using caller's init namespace */
+
+#ifdef CONFIG_NAMESPACES
+struct umh_ns_info {
+   struct nsproxy *nsproxy;
+   struct user_namespace *user_ns;
+};
+#endif
 
 struct subprocess_info {
struct work_struct work;
@@ -64,6 +73,9 @@ struct subprocess_info {
int (*init)(struct subprocess_info *info, struct cred *new);
void (*cleanup)(struct subprocess_info *info);
void *data;
+#ifdef CONFIG_NAMESPACES
+   struct umh_ns_info nsinfo;
+#endif
 };
 
 extern int
diff --git a/kernel/kmod.c b/kernel/kmod.c
index e968e2d..213dbe0 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -39,6 +39,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include 
@@ -303,11 +304,8 @@ static int wait_for_helper(void *data)
do_exit(0);
 }
 
-/* This is run by khelper thread  */
-static void __call_usermodehelper(struct work_struct *work)
+static pid_t umh_kernel_thread(struct subprocess_info *sub_info)
 {
-   struct subprocess_info *sub_info =
-   container_of(work, struct subprocess_info, work);
pid_t pid;
 
if (sub_info->flags & UMH_WAIT_PROC)
@@ -316,7 +314,99 @@ static void __call_usermodehelper(struct work_struct *work)
else
pid = kernel_thread(call_usermodehelper, sub_info,
SIGCHLD);
+   return pid;
+}
+
+#ifndef CONFIG_NAMESPACES
+static pid_t umh_kernel_thread_in_ns(struct subprocess_info *sub_info)
+{
+   return -ENOTSUP;
+}
+#else
+static pid_t umh_kernel_thread_in_ns(struct subprocess_info *sub_info)
+{
+   struct umh_ns_info *info = &sub_info->nsinfo;
+   struct nsproxy *saved_nsp, *new_nsp;
+   struct user_namespace *user_ns;
+   struct pid_namespace *pid_ns;
+   struct mnt_namespace *mnt_ns;
+   pid_t pid;
+   int err;
+
+   saved_nsp = current->nsproxy;
+   get_nsproxy(saved_nsp);
+
+   new_nsp = info->nsproxy;
+   get_nsproxy(new_nsp);
+
+   user_ns = get_user_ns(current->cred->user_ns);
+   if (info->user_ns) {
+   err = user_ns->ns.ops->install(new_nsp, &info->user_ns->ns);
+   if (err)
+   goto out;
+   }
+
+   /* May need to wait4() completion so a pid valid in the
+* thread runners namespace is needed. Install current
+* pid ns in the nsproxy.
+*/
+   pid_ns = current->nsproxy->pid_ns_for_children;
+   if (pid_ns) {
+   err = pid_ns->ns.ops->install(new_nsp, &pid_ns->ns);
+   if (err)
+   goto out_user_ns;
+   }
+
+   /* The mount namespace install function is a little
+* more than a no-op, as the install functions of the
+* other namespace types are in our case, we need to
+* call it to setup current fs root and pwd.
+*/
+   mnt_ns = current->nsproxy->mnt_ns;
+   err = mnt_ns->ns.ops->install(new_nsp, _nsp->mnt_ns->ns);
+   if (err)
+   goto out_user_ns;
 
+   /* Finally, switch to the nsproxy of the init namespace
+* of the caller.
+*/
+   switch_task_namespaces(current, new_nsp);
+
+   pid = umh_kernel_thread(sub_info);
+
+   mnt_ns->ns.ops->install(saved_nsp, _nsp->mnt_ns->ns);
+
+   if (info->user_ns)
+   user_ns->ns.ops->install(saved_nsp, &user_ns->ns);
+
+   switch_task_namespaces(current, saved_nsp);
+
+   put_user_ns(user_ns);
+
+   return pid;
+
+out_user_ns:
+   if (info->user_ns)
+   user_ns->ns.ops->install(saved_nsp, &user_ns->ns);
+out:
+   put_user_ns(user_ns);
+   put_nsproxy(new_nsp);
+   put_nsproxy(saved_nsp);
+   return err;
+}
+#endif
+
+/* This is run by khelper thread  */
+static void __call_usermodehelper(struct work_struct *work)
+{
+   struct subprocess_info *sub_info =
+   container_of(work, struct subprocess_info, work);
+   pid_t pid;
+
+   if (sub_info->flags & UMH_USE_NS)
+   pid = umh_kernel_thread_in_ns(sub_info);
+   else
+

[RFC PATCH v4 09/12] nfs - cache_lib use namespace if not executing in init namespace

2015-03-16 Thread Ian Kent

From: Ian Kent 

If pipefs is registered within a namespace other than the root init
namespace, subsequent pipefs requests should be run within the init
namespace of registration.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 fs/nfs/cache_lib.c   |7 ++-
 include/linux/sunrpc/cache.h |2 ++
 net/sunrpc/cache.c   |5 +
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/cache_lib.c b/fs/nfs/cache_lib.c
index 5f7b053..4f381ad 100644
--- a/fs/nfs/cache_lib.c
+++ b/fs/nfs/cache_lib.c
@@ -48,7 +48,12 @@ int nfs_cache_upcall(struct cache_detail *cd, char *entry_name)
 
if (nfs_cache_getent_prog[0] == '\0')
goto out;
-   ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
+   if (cd->u.pipefs.umh_token) {
+   long token = cd->u.pipefs.umh_token;
+   ret = call_usermodehelper_ns(argv[0], argv, envp,
+UMH_WAIT_EXEC, token);
+   } else
+   ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
/*
 * Disable the upcall mechanism if we're getting an ENOENT or
 * EACCES error. The admin can re-enable it on the fly by using
diff --git a/include/linux/sunrpc/cache.h b/include/linux/sunrpc/cache.h
index 437ddb6..f6c1eb2 100644
--- a/include/linux/sunrpc/cache.h
+++ b/include/linux/sunrpc/cache.h
@@ -68,6 +68,8 @@ struct cache_detail_procfs {
 
 struct cache_detail_pipefs {
struct dentry *dir;
+   /* Namespace token */
+   long umh_token;
 };
 
 struct cache_detail {
diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c
index 5199bb1..a635efb 100644
--- a/net/sunrpc/cache.c
+++ b/net/sunrpc/cache.c
@@ -1811,6 +1811,9 @@ int sunrpc_cache_register_pipefs(struct dentry *parent,
if (IS_ERR(dir))
return PTR_ERR(dir);
cd->u.pipefs.dir = dir;
+   if (cd->net != &init_net)
+   cd->u.pipefs.umh_token =
+umh_ns_get_token(cd->u.pipefs.umh_token);
return 0;
 }
 EXPORT_SYMBOL_GPL(sunrpc_cache_register_pipefs);
@@ -1819,6 +1822,8 @@ void sunrpc_cache_unregister_pipefs(struct cache_detail *cd)
 {
rpc_remove_cache_dir(cd->u.pipefs.dir);
cd->u.pipefs.dir = NULL;
+   umh_ns_put_token(cd->u.pipefs.umh_token);
+   cd->u.pipefs.umh_token = 0;
 }
 EXPORT_SYMBOL_GPL(sunrpc_cache_unregister_pipefs);
 


[RFC PATCH v4 10/12] nfs - objlayout use namespace if not executing in init namespace

2015-03-16 Thread Ian Kent

From: Ian Kent 

If the caller is running within a container then execute the usermode
helper callback within the init namespace of the container.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 fs/nfs/objlayout/objlayout.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/nfs/objlayout/objlayout.c b/fs/nfs/objlayout/objlayout.c
index 919efd4..00c9a34 100644
--- a/fs/nfs/objlayout/objlayout.c
+++ b/fs/nfs/objlayout/objlayout.c
@@ -599,9 +599,14 @@ static int __objlayout_upcall(struct __auto_login *login)
"PATH=/sbin:/usr/sbin:/bin:/usr/bin",
NULL
};
+   unsigned int umh_flags = UMH_WAIT_PROC;
char *argv[8];
int ret;
 
+   /* If running within a container use the container namespace */
+   if (current->nsproxy->net_ns != &init_net)
+   umh_flags |= UMH_USE_NS;
+
if (unlikely(!osd_login_prog[0])) {
dprintk("%s: osd_login_prog is disabled\n", __func__);
return -EACCES;
@@ -620,7 +625,7 @@ static int __objlayout_upcall(struct __auto_login *login)
argv[6] = login->systemid_hex;
argv[7] = NULL;
 
-   ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+   ret = call_usermodehelper(argv[0], argv, envp, umh_flags);
/*
 * Disable the upcall mechanism if we're getting an ENOENT or
 * EACCES error. The admin can re-enable it on the fly by using
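
The flag handling in the hunk above (start from UMH_WAIT_PROC, then OR in UMH_USE_NS when running inside a container) can be sketched in user space as follows. The flag values are the ones shown in the include/linux/kmod.h hunk of patch 04/12; the helper names pick_umh_flags() and wants_namespace() are hypothetical and exist only for this example.

```c
/* Flag values as they appear in the kmod.h hunk of patch 04/12. */
#define UMH_WAIT_EXEC	1	/* wait for the exec, but not the process */
#define UMH_WAIT_PROC	2	/* wait for the process to complete */
#define UMH_KILLABLE	4	/* wait for EXEC/PROC killable */
#define UMH_USE_NS	32	/* exec using caller's init namespace */

/* Hypothetical helper mirroring the flag selection in __objlayout_upcall(). */
unsigned int pick_umh_flags(int in_container)
{
	unsigned int flags = UMH_WAIT_PROC;

	if (in_container)
		flags |= UMH_USE_NS;
	return flags;
}

/* Non-zero if the caller asked for execution in its init namespace. */
int wants_namespace(unsigned int flags)
{
	return (flags & UMH_USE_NS) != 0;
}
```

Because the wait mode and the namespace request now share one word, consumers have to test individual bits (flags & UMH_WAIT_PROC) rather than compare the whole value, which is the reason the series renames the parameter from wait to flags.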


[RFC PATCH v4 12/12] KEYS: exec request-key within the requesting task's init namespace

2015-03-16 Thread Ian Kent

From: Ian Kent 

Containerized request-key helper callbacks need the ability to execute
a binary in a container's context. To do this, calling an in-kernel
equivalent of setns(2) should be sufficient, since the user mode helper
execution kernel thread ultimately calls do_execve().

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 include/linux/key.h |3 +++
 security/keys/gc.c  |2 ++
 security/keys/key.c |4 
 security/keys/request_key.c |   35 +--
 4 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/include/linux/key.h b/include/linux/key.h
index e1d4715..89dc2d7 100644
--- a/include/linux/key.h
+++ b/include/linux/key.h
@@ -209,6 +209,9 @@ struct key {
} payload;
struct assoc_array keys;
};
+
+   /* Namespace token */
+   long umh_token;
 };
 
 extern struct key *key_alloc(struct key_type *type,
diff --git a/security/keys/gc.c b/security/keys/gc.c
index c795237..57a0730 100644
--- a/security/keys/gc.c
+++ b/security/keys/gc.c
@@ -156,6 +156,8 @@ static noinline void key_gc_unused_keys(struct list_head *keys)
 
kfree(key->description);
 
+   umh_ns_put_token(key->umh_token);
+
 #ifdef KEY_DEBUGGING
key->magic = KEY_DEBUG_MAGIC_X;
 #endif
diff --git a/security/keys/key.c b/security/keys/key.c
index aee2ec5..e7ab89d 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "internal.h"
 
 struct kmem_cache *key_jar;
@@ -309,6 +310,9 @@ struct key *key_alloc(struct key_type *type, const char *desc,
/* publish the key by giving it a serial number */
atomic_inc(&user->nkeys);
key_alloc_serial(key);
+   /* If running within a container use the container namespace */
+   if (current->nsproxy->net_ns != &init_net)
+   key->umh_token = umh_ns_get_token(0);
 
 error:
return key;
diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index e865f9f..16ac3b0 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -90,6 +90,31 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp,
 }
 
 /*
+ * Call a usermode helper with a specific session keyring and execute
+ * within a namespace.
+ */
+static int call_usermodehelper_keys_ns(char *path, char **argv, char **envp,
+   struct key *session_keyring,
+   unsigned int wait, long token)
+{
+   struct subprocess_info *info;
+   unsigned int gfp_mask = (wait & UMH_NO_WAIT) ?
+   GFP_ATOMIC : GFP_KERNEL;
+
+   if (token <= 0)
+   return -EINVAL;
+
+   info = call_usermodehelper_setup_ns(path, argv, envp, gfp_mask,
+   umh_keys_init, umh_keys_cleanup,
+   session_keyring, token);
+   if (!info)
+   return -ENOMEM;
+
+   key_get(session_keyring);
+   return call_usermodehelper_exec(info, wait|UMH_USE_NS);
+}
+
+/*
  * Request userspace finish the construction of a key
  * - execute "/sbin/request-key   
"
  */
@@ -104,6 +129,7 @@ static int call_sbin_request_key(struct key_construction *cons,
char *argv[9], *envp[3], uid_str[12], gid_str[12];
char key_str[12], keyring_str[3][12];
char desc[20];
+   unsigned int wait = UMH_WAIT_PROC;
int ret, i;
 
kenter("{%d},{%d},%s", key->serial, authkey->serial, op);
@@ -174,8 +200,13 @@ static int call_sbin_request_key(struct key_construction *cons,
argv[i] = NULL;
 
/* do it */
-   ret = call_usermodehelper_keys(argv[0], argv, envp, keyring,
-  UMH_WAIT_PROC);
+   /* If running within a container use the container namespace */
+   if (key->umh_token)
+   ret = call_usermodehelper_keys_ns(argv[0], argv, envp,
+  keyring, wait, key->umh_token);
+   else
+   ret = call_usermodehelper_keys(argv[0],
+  argv, envp, keyring, wait);
kdebug("usermode -> 0x%x", ret);
if (ret >= 0) {
/* ret is the exit/wait code */


[RFC PATCH v4 07/12] kmod - add call_usermodehelper_ns()

2015-03-16 Thread Ian Kent

From: Ian Kent 

Add function call_usermodehelper_ns() to allow passing a namespace
token to lookup previously stored namespace information for usermode
helper execution.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 include/linux/kmod.h |   24 +
 kernel/kmod.c|   96 ++
 2 files changed, 120 insertions(+)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 77f41ce..a761650 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -87,9 +87,33 @@ static inline long umh_ns_get_token(long token)
 static inline void umh_ns_put_token(long token)
 {
 }
+
+static inline int
+call_usermodehelper_ns(char *path, char **argv, char **envp,
+  unsigned int flags, long token)
+{
+   return -ENOTSUP;
+}
+
+static inline struct subprocess_info *
+call_usermodehelper_setup_ns(char *path, char **argv, char **envp, gfp_t gfp_mask,
+int (*init)(struct subprocess_info *info, struct cred *new),
+void (*cleanup)(struct subprocess_info *), void *data,
+long token)
+{
+   return -ENOTSUP;
+}
 #else
 extern long umh_ns_get_token(long token);
 extern void umh_ns_put_token(long token);
+extern int
+call_usermodehelper_ns(char *path, char **argv, char **envp,
+  unsigned int flags, long token);
+extern struct subprocess_info *
+call_usermodehelper_setup_ns(char *path, char **argv, char **envp, gfp_t gfp_mask,
+int (*init)(struct subprocess_info *info, struct cred *new),
+void (*cleanup)(struct subprocess_info *), void *data,
+long token);
 #endif
 
 extern int
diff --git a/kernel/kmod.c b/kernel/kmod.c
index ddd41f1..d711240 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -842,6 +842,62 @@ struct subprocess_info *call_usermodehelper_setup(char *path, char **argv,
 }
 EXPORT_SYMBOL(call_usermodehelper_setup);
 
+#ifdef CONFIG_NAMESPACES
+/**
+ * call_usermodehelper_setup_ns - prepare to call a usermode helper
+ * within a namespace
+ * @path: path to usermode executable
+ * @argv: arg vector for process
+ * @envp: environment for process
+ * @gfp_mask: gfp mask for memory allocation
+ * @cleanup: a cleanup function
+ * @init: an init function
+ * @data: arbitrary context sensitive data
+ * @token: token used to locate namespace setup.
+ *
+ * Returns either an errno error cast, or a subprocess_info structure.
+ * This should be passed to call_usermodehelper_exec to exec the process
+ * and free the structure.
+ *
+ * The init function is used to customize the helper process prior to
+ * exec.  A non-zero return code causes the process to error out, exit,
+ * and return the failure to the calling process.
+ *
+ * The cleanup function is run just before the subprocess_info is about
+ * to be freed.  This can be used for freeing the argv and envp.  The
+ * function must be runnable in either a process context or the
+ * context in which call_usermodehelper_exec is called.
+ */
+struct subprocess_info *call_usermodehelper_setup_ns(char *path, char **argv,
+   char **envp, gfp_t gfp_mask,
+   int (*init)(struct subprocess_info *info, struct cred *new),
+   void (*cleanup)(struct subprocess_info *info),
+   void *data, long token)
+{
+   struct subprocess_info *info;
+   unsigned int nowait = gfp_mask == GFP_ATOMIC ? 1 : 0;
+   struct umh_ns_entry *entry;
+
+   info = call_usermodehelper_setup(path, argv, envp,
+gfp_mask, init, cleanup, data);
+   if (!info)
+   return ERR_PTR(-ENOMEM);
+
+   entry = umh_ns_find_entry(token, nowait);
+   if (IS_ERR(entry)) {
+   kfree(info);
+   info = ERR_CAST(entry);
+   goto out;
+   }
+   get_nsproxy(entry->nsinfo.nsproxy);
+   info->nsinfo.nsproxy = entry->nsinfo.nsproxy;
+   info->nsinfo.user_ns = get_user_ns(entry->nsinfo.user_ns);
+out:
+   return info;
+}
+EXPORT_SYMBOL(call_usermodehelper_setup_ns);
+#endif /* CONFIG_NAMESPACES */
+
 /**
  * call_usermodehelper_exec - start a usermode application
  * @sub_info: information about the subprocessa
@@ -939,6 +995,46 @@ int call_usermodehelper(char *path,
 }
 EXPORT_SYMBOL(call_usermodehelper);
 
+#ifdef CONFIG_NAMESPACES
+/**
+ * call_usermodehelper_ns() - prepare and start a usermode application and
+ * execute using the stored namespace information
+ * corresponding to the passed token
+ * @path: path to usermode executable
+ * @argv: arg vector for process
+ * @envp: environment for process
+ * @flags: wait for the application to finish and return status.
+ *
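
Because call_usermodehelper_setup_ns() returns a pointer, a failure has to
travel as an encoded pointer rather than as a bare negative integer. The
following is only a userspace sketch of that convention: err_ptr(), ptr_err(),
is_err() and MAX_ERRNO_SKETCH are stand-ins written for this note (modelled on
the kernel's ERR_PTR()/PTR_ERR()/IS_ERR()), and setup_stub() and
struct subprocess_info_sketch are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(); not the real macros. */
#define MAX_ERRNO_SKETCH 4095

static inline void *err_ptr(long error)
{
	/* Encode a negative errno in the pointer value itself. */
	return (void *)(intptr_t)error;
}

static inline long ptr_err(const void *ptr)
{
	/* Recover the errno from an encoded pointer. */
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO_SKETCH addresses are reserved for error codes. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO_SKETCH;
}

/* Hypothetical placeholder for struct subprocess_info. */
struct subprocess_info_sketch {
	int unused;
};

/* Hypothetical stub for a build without namespace support. */
static struct subprocess_info_sketch *setup_stub(void)
{
	return err_ptr(-EOPNOTSUPP);
}
```

A caller would test the result with is_err() before using it, and a valid
object address or NULL is never mistaken for an error.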

[RFC PATCH v4 08/12] nfsd - use namespace if not executing in init namespace

2015-03-16 Thread Ian Kent

From: Ian Kent 

If nfsd is running within a container the client tracking operations
should run within the originating container also.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 fs/nfsd/netns.h   |3 +++
 fs/nfsd/nfs4recover.c |   48 +++-
 fs/nfsd/nfsctl.c  |6 ++
 3 files changed, 40 insertions(+), 17 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index ea6749a..c85c13a 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -112,6 +112,9 @@ struct nfsd_net {
u32 clientid_counter;
 
struct svc_serv *nfsd_serv;
+
+   /* Namespace token */
+   long umh_token;
 };
 
 /* Simple check to find out if a given net was properly initialized */
diff --git a/fs/nfsd/nfs4recover.c b/fs/nfsd/nfs4recover.c
index 1c307f0..df13b54 100644
--- a/fs/nfsd/nfs4recover.c
+++ b/fs/nfsd/nfs4recover.c
@@ -1184,7 +1184,8 @@ nfsd4_cltrack_grace_start(time_t grace_start)
 }
 
 static int
-nfsd4_umh_cltrack_upcall(char *cmd, char *arg, char *env0, char *env1)
+nfsd4_umh_cltrack_upcall(char *cmd, char *arg,
+char *env0, char *env1, long token)
 {
char *envp[3];
char *argv[4];
@@ -1209,7 +1210,11 @@ nfsd4_umh_cltrack_upcall(char *cmd, char *arg, char *env0, char *env1)
argv[2] = arg;
argv[3] = NULL;
 
-   ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
+   if (token > 0)
+   ret = call_usermodehelper_ns(argv[0], argv, envp,
+UMH_WAIT_PROC, token);
+   else
+   ret = call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
/*
 * Disable the upcall mechanism if we're getting an ENOENT or EACCES
 * error. The admin can re-enable it on the fly by using sysfs
@@ -1252,14 +1257,8 @@ nfsd4_umh_cltrack_init(struct net *net)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
char *grace_start = nfsd4_cltrack_grace_start(nn->boot_time);
 
-   /* XXX: The usermode helper s not working in container yet. */
-   if (net != &init_net) {
-   WARN(1, KERN_ERR "NFSD: attempt to initialize umh client "
-   "tracking in a container!\n");
-   return -EINVAL;
-   }
-
-   ret = nfsd4_umh_cltrack_upcall("init", NULL, grace_start, NULL);
+   ret = nfsd4_umh_cltrack_upcall("init", NULL,
+   grace_start, NULL, nn->umh_token);
kfree(grace_start);
return ret;
 }
@@ -1285,6 +1284,7 @@ nfsd4_umh_cltrack_create(struct nfs4_client *clp)
 {
char *hexid, *has_session, *grace_start;
struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
+   int ret;
 
/*
 * With v4.0 clients, there's little difference in outcome between a
@@ -1312,7 +1312,10 @@ nfsd4_umh_cltrack_create(struct nfs4_client *clp)
grace_start = nfsd4_cltrack_grace_start(nn->boot_time);
 
nfsd4_cltrack_upcall_lock(clp);
-   if (!nfsd4_umh_cltrack_upcall("create", hexid, has_session, grace_start))
+   ret = nfsd4_umh_cltrack_upcall("create",
+  hexid, has_session, grace_start,
+  nn->umh_token);
+   if (!ret)
set_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
nfsd4_cltrack_upcall_unlock(clp);
 
@@ -1324,7 +1327,9 @@ nfsd4_umh_cltrack_create(struct nfs4_client *clp)
 static void
 nfsd4_umh_cltrack_remove(struct nfs4_client *clp)
 {
+   struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
char *hexid;
+   int ret;
 
if (!test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
return;
@@ -1336,9 +1341,13 @@ nfsd4_umh_cltrack_remove(struct nfs4_client *clp)
}
 
nfsd4_cltrack_upcall_lock(clp);
-   if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags) &&
-   nfsd4_umh_cltrack_upcall("remove", hexid, NULL, NULL) == 0)
-   clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+   if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags)) {
+   ret = nfsd4_umh_cltrack_upcall("remove",
+  hexid, NULL, NULL,
+  nn->umh_token);
+   if (ret == 0)
+   clear_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags);
+   }
nfsd4_cltrack_upcall_unlock(clp);
 
kfree(hexid);
@@ -1347,8 +1356,9 @@ nfsd4_umh_cltrack_remove(struct nfs4_client *clp)
 static int
 nfsd4_umh_cltrack_check(struct nfs4_client *clp)
 {
-   int ret;
+   struct nfsd_net *nn = net_generic(clp->net, nfsd_net_id);
char *hexid, *has_session, *legacy;
+   int ret;
 
if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
return 0;
@@ -1366,7 +1376,9 @@

[RFC PATCH v4 11/12] KEYS - use correct memory allocation flag in call_usermodehelper_keys()

2015-03-16 Thread Ian Kent

From: Ian Kent 

When call_usermodehelper_keys() is called it assumes it won't be called
with the flag UMH_NO_WAIT. Currently that's always the case.

Change this to check the flag and use the correct kernel memory allocation
flag to guard against future changes.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 security/keys/request_key.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/security/keys/request_key.c b/security/keys/request_key.c
index 486ef6f..e865f9f 100644
--- a/security/keys/request_key.c
+++ b/security/keys/request_key.c
@@ -76,8 +76,10 @@ static int call_usermodehelper_keys(char *path, char **argv, char **envp,
struct key *session_keyring, int wait)
 {
struct subprocess_info *info;
+   gfp_t gfp_mask = (wait == UMH_NO_WAIT) ?
+   GFP_ATOMIC : GFP_KERNEL;
 
-   info = call_usermodehelper_setup(path, argv, envp, GFP_KERNEL,
+   info = call_usermodehelper_setup(path, argv, envp, gfp_mask,
  umh_keys_init, umh_keys_cleanup,
  session_keyring);
if (!info)
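
One detail worth checking in a flag test like this: assuming UMH_NO_WAIT is
defined as 0 (as I believe it is in include/linux/kmod.h of this period), a
bitwise AND against it can never be true, so only an equality comparison
selects the atomic allocation class. A minimal userspace sketch with stand-in
constants follows; the enum gfp_sketch values and both helper functions are
hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Stand-in values, assumed to mirror include/linux/kmod.h of this period. */
#define UMH_NO_WAIT   0  /* don't wait at all */
#define UMH_WAIT_EXEC 1  /* wait for the exec, but not the process */
#define UMH_WAIT_PROC 2  /* wait for the process to complete */

/* Stand-ins for the allocation classes; the real gfp_t values differ. */
enum gfp_sketch { GFP_ATOMIC_SKETCH, GFP_KERNEL_SKETCH };

/* Bitwise test: "wait & 0" is always 0, so the atomic branch is dead. */
static enum gfp_sketch gfp_by_mask(int wait)
{
	return (wait & UMH_NO_WAIT) ? GFP_ATOMIC_SKETCH : GFP_KERNEL_SKETCH;
}

/* Equality test: picks the atomic class exactly when no wait is requested. */
static enum gfp_sketch gfp_by_equality(int wait)
{
	return (wait == UMH_NO_WAIT) ? GFP_ATOMIC_SKETCH : GFP_KERNEL_SKETCH;
}
```

With these stand-ins, gfp_by_mask(UMH_NO_WAIT) still yields the sleeping
class, while gfp_by_equality(UMH_NO_WAIT) yields the atomic one.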


[RFC PATCH v4 06/12] kmod - add namespace info store

2015-03-16 Thread Ian Kent

From: Ian Kent 

Persistent use of namespace information is needed where contained
execution is needed in a namespace other than the current namespace.

Use a simple random token as a key to store namespace information
in a hashed list for later usermode helper execution.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 include/linux/kmod.h |   14 
 kernel/kmod.c|  185 --
 2 files changed, 193 insertions(+), 6 deletions(-)

diff --git a/include/linux/kmod.h b/include/linux/kmod.h
index 64c81c9..77f41ce 100644
--- a/include/linux/kmod.h
+++ b/include/linux/kmod.h
@@ -78,6 +78,20 @@ struct subprocess_info {
 #endif
 };
 
+#ifndef CONFIG_NAMESPACES
+static inline long umh_ns_get_token(long token)
+{
+   return -EOPNOTSUPP;
+}
+
+static inline void umh_ns_put_token(long token)
+{
+}
+#else
+extern long umh_ns_get_token(long token);
+extern void umh_ns_put_token(long token);
+#endif
+
 extern int
 call_usermodehelper(char *path, char **argv, char **envp, unsigned int flags);
 
diff --git a/kernel/kmod.c b/kernel/kmod.c
index d6ee21a..ddd41f1 100644
--- a/kernel/kmod.c
+++ b/kernel/kmod.c
@@ -41,6 +41,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #include 
 
@@ -48,6 +51,21 @@ extern int max_threads;
 
 static struct workqueue_struct *khelper_wq;
 
+#ifdef CONFIG_NAMESPACES
+#define UMH_HASH_SHIFT  6
+#define UMH_HASH_SIZE   (1 << UMH_HASH_SHIFT)
+
+struct umh_ns_entry {
+   long token;
+   unsigned int count;
+   struct umh_ns_info nsinfo;
+   struct hlist_node umh_ns_hlist;
+};
+
+static DEFINE_SPINLOCK(umh_ns_hash_lock);
+static struct hlist_head umh_ns_hash[UMH_HASH_SIZE];
+#endif
+
 #define CAP_BSET   (void *)1
 #define CAP_PI (void *)2
 
@@ -577,10 +595,13 @@ static int umh_get_nsproxy(struct subprocess_info *sub_info)
 static void umh_put_nsproxy(struct subprocess_info *sub_info)
 {
 }
+
+static void umh_ns_hash_init(void)
+{
+}
 #else
-static int umh_get_nsproxy(struct subprocess_info *sub_info)
+static int _umh_get_nsproxy(struct umh_ns_info *nsinfo)
 {
-   struct umh_ns_info *nsinfo = &sub_info->nsinfo;
struct task_struct *tsk;
struct user_namespace *user_ns;
struct nsproxy *new;
@@ -614,14 +635,165 @@ out:
return err;
 }
 
+static int umh_get_nsproxy(struct subprocess_info *sub_info)
+{
+   return _umh_get_nsproxy(&sub_info->nsinfo);
+}
+
+static void _umh_put_nsproxy(struct umh_ns_info *nsinfo)
+{
+   if (nsinfo->nsproxy) {
+   put_nsproxy(nsinfo->nsproxy);
+   put_user_ns(nsinfo->user_ns);
+   }
+}
+
 static void umh_put_nsproxy(struct subprocess_info *sub_info)
 {
-   if (sub_info->nsinfo.nsproxy) {
-   put_nsproxy(sub_info->nsinfo.nsproxy);
-   put_user_ns(sub_info->nsinfo.user_ns);
+   return _umh_put_nsproxy(&sub_info->nsinfo);
+}
+
+static void umh_ns_hash_init(void)
+{
+   int i;
+
+   for (i = 0; i < UMH_HASH_SIZE; i++)
+   INIT_HLIST_HEAD(&umh_ns_hash[i]);
+}
+
+static struct umh_ns_entry *__umh_ns_find_entry(long token)
+{
+   struct umh_ns_entry *this, *entry;
+   struct hlist_head *bucket;
+   unsigned int hash;
+
+   hash = hash_64((unsigned long) token, UMH_HASH_SHIFT);
+   bucket = &umh_ns_hash[hash];
+
+   entry = ERR_PTR(-ENOENT);
+   if (hlist_empty(bucket))
+   goto out;
+
+   hlist_for_each_entry(this, bucket, umh_ns_hlist) {
+   if (this->token == token) {
+   entry = this;
+   break;
+   }
}
+out:
+   return entry;
 }
-#endif
+
+static struct umh_ns_entry *umh_ns_find_entry(long token, unsigned int nowait)
+{
+   struct umh_ns_entry *entry;
+   unsigned long flags;
+
+   if (nowait)
+   spin_lock_irqsave(&umh_ns_hash_lock, flags);
+   else
+   spin_lock(&umh_ns_hash_lock);
+   entry = __umh_ns_find_entry(token);
+   if (nowait)
+   spin_unlock_irqrestore(&umh_ns_hash_lock, flags);
+   else
+   spin_unlock(&umh_ns_hash_lock);
+
+   return entry;
+}
+
+/**
+ * umh_ns_get_token - allocate and store namespace information of the
+ * init process of the caller
+ * @token: token of stored namespace information or zero for a new
+ *token.
+ *
+ * Returns a token used to locate the namespace information for calls to
+ * call_usermodehelper_ns(). On failure returns a negative errno.
+ */
+long umh_ns_get_token(long token)
+{
+   struct umh_ns_entry *entry;
+   struct hlist_head *bucket;
+   unsigned int hash;
+   unsigned int new_token;
+   int err;
+
+   if (token) {
+   spin_lock(&umh_ns_hash_lock);
+   entry = __umh_ns_find_entry(token);
+   if (!IS_ERR(entry)) {
+
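
The store described in the commit message (a token used as the key into a
hashed list, with a count of users per entry) can be sketched in userspace as
below. This is not the patch's code: locking is omitted, the token comes from
a counter instead of a random source so the behaviour is deterministic, the
bucket index is a plain mask rather than hash_64(), and every name
(ns_entry, get_token, put_token, find_entry) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_HASH_SHIFT 6
#define SKETCH_HASH_SIZE  (1 << SKETCH_HASH_SHIFT)

struct ns_entry {
	long token;             /* lookup key handed back to the caller */
	unsigned int count;     /* number of users holding this token */
	struct ns_entry *next;  /* bucket chain */
};

static struct ns_entry *buckets[SKETCH_HASH_SIZE];
static long next_token = 1; /* stand-in for a random token source */

static unsigned int bucket_of(long token)
{
	return (unsigned long)token & (SKETCH_HASH_SIZE - 1);
}

/* Walk one bucket; returns NULL when the token is unknown. */
static struct ns_entry *find_entry(long token)
{
	struct ns_entry *e;

	for (e = buckets[bucket_of(token)]; e; e = e->next)
		if (e->token == token)
			return e;
	return NULL;
}

/* token == 0 creates a new entry; otherwise take another reference.
 * Returns the token, or -1 on failure. */
static long get_token(long token)
{
	struct ns_entry *e;
	unsigned int b;

	if (token) {
		e = find_entry(token);
		if (!e)
			return -1;
		e->count++;
		return token;
	}

	e = calloc(1, sizeof(*e));
	if (!e)
		return -1;
	e->token = next_token++;
	e->count = 1;
	b = bucket_of(e->token);
	e->next = buckets[b];
	buckets[b] = e;
	return e->token;
}

/* Drop one reference; unlink and free the entry when the last user is gone. */
static void put_token(long token)
{
	struct ns_entry **pp = &buckets[bucket_of(token)];

	for (; *pp; pp = &(*pp)->next) {
		struct ns_entry *e = *pp;

		if (e->token != token)
			continue;
		if (--e->count == 0) {
			*pp = e->next;
			free(e);
		}
		return;
	}
}
```

Returning NULL for a miss keeps the "found or not" test at the call site
unambiguous; a lookup that reports a miss as an encoded error pointer needs a
matching IS_ERR()-style check instead of a plain NULL test.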

[RFC PATCH v4 00/12] Second attempt at contained helper execution

2015-03-16 Thread Ian Kent

Here is another update to the attempt at contained helper execution.

The main change is I've tried to incorporate Oleg's suggestions
of directly constructing the namespaces rather than using the
open/setns approach and the addition of a namespace hash store.

I'm not particularly happy with this so far as there are a bunch
of ref counted objects and I've almost certainly got that wrong.
But also there are object lifetime problems, some I'm aware of
and for sure others I'm not. Also there is the integrity of the
thread runner process. I haven't performed a double fork on thread
execution, it might be painful to implement, so the thread runner
might end up with the wrong namespace setup if an error occurs.

Anyway, I've decided to stop spinning my wheels with this and
post an update in the hope that others can offer suggestions to
help and, of course, point out things I've missed.

The other change has been to the nfs and KEYS patches.
I've introduced the ability to get a token that can be used to
save namespace information for later execution and I've attempted
to use that for persistent namespace execution, as was discussed
previously.

I'm not at all sure I've done this in a sensible way but the
token does need to be accessible at helper execution time which
is why I've done it this way.

I definitely need advice here too. 

---

Ian Kent (12):
  nsproxy - make create_new_namespaces() non-static
  kmod - rename call_usermodehelper() flags parameter
  vfs - move mnt_namespace definition to linux/mount.h
  kmod - add namespace aware thread runner
  kmod - teach call_usermodehelper() to use a namespace
  kmod - add namespace info store
  kmod - add call_usermodehelper_ns()
  nfsd - use namespace if not executing in init namespace
  nfs - cache_lib use namespace if not executing in init namespace
  nfs - objlayout use namespace if not executing in init namespace
  KEYS - use correct memory allocation flag in call_usermodehelper_keys()
  KEYS: exec request-key within the requesting task's init namespace


 fs/mount.h   |   12 -
 fs/nfs/cache_lib.c   |7 +
 fs/nfs/objlayout/objlayout.c |7 +
 fs/nfsd/netns.h  |3 
 fs/nfsd/nfs4recover.c|   48 +++-
 fs/nfsd/nfsctl.c |6 +
 include/linux/key.h  |3 
 include/linux/kmod.h |   56 +
 include/linux/mount.h|   14 +
 include/linux/nsproxy.h  |3 
 include/linux/sunrpc/cache.h |2 
 kernel/kmod.c|  465 --
 kernel/nsproxy.c |2 
 net/sunrpc/cache.c   |5 
 security/keys/gc.c   |2 
 security/keys/key.c  |4 
 security/keys/request_key.c  |   39 +++-
 17 files changed, 620 insertions(+), 58 deletions(-)

--
Ian

[RFC PATCH v4 03/12] vfs - move mnt_namespace definition to linux/mount.h

2015-03-16 Thread Ian Kent

From: Ian Kent 

The mnt_namespace definition will be needed by the usermode helper
contained execution implementation, move it to include/linux/mount.h.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
Cc: Jeff Layton 
---
 fs/mount.h|   12 
 include/linux/mount.h |   14 +-
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index 6a61c2b..5b8423b 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -1,20 +1,8 @@
 #include 
 #include 
 #include 
-#include 
 #include 
 
-struct mnt_namespace {
-   atomic_t count;
-   struct ns_common ns;
-   struct mount * root;
-   struct list_head list;
-   struct user_namespace   *user_ns;
-   u64 seq; /* Sequence number to prevent loops */
-   wait_queue_head_t poll;
-   u64 event;
-};
-
 struct mnt_pcp {
int mnt_count;
int mnt_writers;
diff --git a/include/linux/mount.h b/include/linux/mount.h
index c2c561d..39dbcdf 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -15,11 +15,12 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 
 struct super_block;
 struct vfsmount;
 struct dentry;
-struct mnt_namespace;
 
 #define MNT_NOSUID 0x01
 #define MNT_NODEV  0x02
@@ -62,6 +63,17 @@ struct mnt_namespace;
 #define MNT_SYNC_UMOUNT 0x200
 #define MNT_MARKED 0x400
 
+struct mnt_namespace {
+   atomic_t count;
+   struct ns_common ns;
+   struct mount * root;
+   struct list_head list;
+   struct user_namespace   *user_ns;
+   u64 seq; /* Sequence number to prevent loops */
+   wait_queue_head_t poll;
+   u64 event;
+};
+
 struct vfsmount {
struct dentry *mnt_root; /* root of the mounted tree */
struct super_block *mnt_sb; /* pointer to superblock */


[RFC PATCH v4 01/12] nsproxy - make create_new_namespaces() non-static

2015-03-16 Thread Ian Kent

From: Ian Kent 

create_new_namespaces() will be needed by usermodehelper namespace
restricted execution.

Signed-off-by: Ian Kent 
Cc: Benjamin Coddington 
Cc: Al Viro 
Cc: J. Bruce Fields 
Cc: David Howells 
Cc: Trond Myklebust 
Cc: Stanislav Kinsbursky 
Cc: Oleg Nesterov 
Cc: Eric W. Biederman 
---
 include/linux/nsproxy.h |3 +++
 kernel/nsproxy.c|2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 35fa08f..dfe7dda 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -62,6 +62,9 @@ extern struct nsproxy init_nsproxy;
  *
  */
 
+struct nsproxy *create_new_namespaces(unsigned long flags,
+   struct task_struct *tsk, struct user_namespace *user_ns,
+   struct fs_struct *new_fs);
 int copy_namespaces(unsigned long flags, struct task_struct *tsk);
 void exit_task_namespaces(struct task_struct *tsk);
 void switch_task_namespaces(struct task_struct *tsk, struct nsproxy *new);
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index 49746c8..48d5e4a 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -56,7 +56,7 @@ static inline struct nsproxy *create_nsproxy(void)
  * Return the newly created nsproxy.  Do not attach this to the task,
  * leave it to the caller to do proper locking and attach it to task.
  */
-static struct nsproxy *create_new_namespaces(unsigned long flags,
+struct nsproxy *create_new_namespaces(unsigned long flags,
struct task_struct *tsk, struct user_namespace *user_ns,
struct fs_struct *new_fs)
 {


Re: [PATCH RESEND v2] sched/deadline: fix rt runtime corrupt when dl refuse a smaller bandwidth

2015-03-16 Thread Wanpeng Li

Ping Ingo, ;-)
On Fri, Mar 13, 2015 at 03:28:00PM +0800, Wanpeng Li wrote:
>Dl class will refuse the bandwidth being set to some value smaller
>than the currently allocated bandwidth in any of the root_domains
>through sched_rt_runtime_us and sched_rt_period_us. RT runtime will
>be set according to sched_rt_runtime_us before dl class verify if
>the new bandwidth is suitable in the case of !CONFIG_RT_GROUP_SCHED.
>
>However, rt runtime will be corrupt if dl refuse the new bandwidth
>since there is no undo to reset the rt runtime to the old value.
>
>This patch fix it by verifying new bandwidth for deadline in advance.
>
>Acked-by: Juri Lelli 
>Signed-off-by: Wanpeng Li 
>---
> kernel/sched/core.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
>diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>index 97fe79c..e884909 100644
>--- a/kernel/sched/core.c
>+++ b/kernel/sched/core.c
>@@ -7815,7 +7815,7 @@ static int sched_rt_global_constraints(void)
> }
> #endif /* CONFIG_RT_GROUP_SCHED */
> 
>-static int sched_dl_global_constraints(void)
>+static int sched_dl_global_validate(void)
> {
>   u64 runtime = global_rt_runtime();
>   u64 period = global_rt_period();
>@@ -7916,11 +7916,11 @@ int sched_rt_handler(struct ctl_table *table, int write,
>   if (ret)
>   goto undo;
> 
>-  ret = sched_rt_global_constraints();
>+  ret = sched_dl_global_validate();
>   if (ret)
>   goto undo;
> 
>-  ret = sched_dl_global_constraints();
>+  ret = sched_rt_global_constraints();
>   if (ret)
>   goto undo;
> 
>-- 
>1.9.1
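
The reordering in the quoted patch follows a validate-before-commit pattern:
run the check that can refuse the change first, and only then run the step
that modifies state, so a refusal leaves nothing to undo. A minimal userspace
sketch of the idea, with entirely hypothetical names and numbers (this is not
the scheduler's code, and the real handler works on sysctl values):

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the RT runtime in force and the bandwidth
 * the deadline class has already handed out. */
static long rt_runtime_us = 950000;
static long dl_allocated_us = 400000;

/* Validation only: report whether the deadline class can accept the value. */
static int dl_validate(long new_runtime_us)
{
	return new_runtime_us < dl_allocated_us ? -EBUSY : 0;
}

/* Commit step: by the time this runs, nothing can fail any more. */
static void rt_apply(long new_runtime_us)
{
	rt_runtime_us = new_runtime_us;
}

/* Validate first, then apply, so a refusal leaves the old value in place. */
static int set_rt_runtime(long new_runtime_us)
{
	int ret = dl_validate(new_runtime_us);

	if (ret)
		return ret;
	rt_apply(new_runtime_us);
	return 0;
}
```

In this sketch a request below the allocated deadline bandwidth is rejected
with -EBUSY and rt_runtime_us keeps its previous value.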

Re: [PATCH RESEND] sched/deadline: don't need to check throttled status when switched to dl

2015-03-16 Thread Wanpeng Li

Ping Ingo, ;-)
On Fri, Mar 13, 2015 at 03:27:51PM +0800, Wanpeng Li wrote:
>After commit 40767b0dc768 ("sched/deadline: Fix deadline parameter
>modification handling"), deadline task throttled status is cleared
>each time switch from dl, so throttled status always unset when
>switch back, there is no need to check throttled status, this patch
>drop the check.
>
>Acked-by: Juri Lelli 
>Signed-off-by: Wanpeng Li 
>---
> kernel/sched/deadline.c | 8 
> 1 file changed, 8 deletions(-)
>
>diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
>index 3fa8fa6..5cb5c9c 100644
>--- a/kernel/sched/deadline.c
>+++ b/kernel/sched/deadline.c
>@@ -1659,14 +1659,6 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
> {
>   int check_resched = 1;
> 
>-  /*
>-   * If p is throttled, don't consider the possibility
>-   * of preempting rq->curr, the check will be done right
>-   * after its runtime will get replenished.
>-   */
>-  if (unlikely(p->dl.dl_throttled))
>-  return;
>-
>   if (task_on_rq_queued(p) && rq->curr != p) {
> #ifdef CONFIG_SMP
>   if (p->nr_cpus_allowed > 1 && rq->dl.overloaded &&
>-- 
>1.9.1
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [9/9] powerpc/hv-24x7: Add missing put_cpu_var()

2015-03-16 Thread Michael Ellerman

On Tue, 2015-17-02 at 22:00:34 UTC, Sukadev Bhattiprolu wrote:
> Add missing put_cpu_var() for 24x7 requests.

When did it go missing? I assume in upstream, in which case this should be a
separate patch which I could merge for 4.0.

cheers

Re: [update][PATCH v10 06/21] ACPI / sleep: Introduce CONFIG_ACPI_GENERIC_SLEEP

2015-03-16 Thread Hanjun Guo

On 2015/3/17 10:28, Rafael J. Wysocki wrote:
> On Tuesday, March 17, 2015 09:08:45 AM Hanjun Guo wrote:
>> On 2015/3/17 7:15, Rafael J. Wysocki wrote:
>>> On Monday, March 16, 2015 08:14:52 PM Hanjun Guo wrote:
>>>> On 2015-03-14 05:49, Rafael J. Wysocki wrote:
>>>>> On Friday, March 13, 2015 04:14:29 PM Hanjun Guo wrote:
>>>>>> [...]
>>>>>> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
>>>>>> index 074e52b..e8728d7 100644
>>>>>> --- a/arch/ia64/Kconfig
>>>>>> +++ b/arch/ia64/Kconfig
>>>>>> @@ -10,6 +10,7 @@ config IA64
>>>>>>  select ARCH_MIGHT_HAVE_PC_SERIO
>>>>>>  select PCI if (!IA64_HP_SIM)
>>>>>>  select ACPI if (!IA64_HP_SIM)
>>>>>> +select ACPI_GENERIC_SLEEP if ACPI
>>>>>>  select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
>>>>>>  select HAVE_UNSTABLE_SCHED_CLOCK
>>>>>>  select HAVE_IDE
>>>>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>>>>> index b7d31ca..9804431 100644
>>>>>> --- a/arch/x86/Kconfig
>>>>>> +++ b/arch/x86/Kconfig
>>>>>> @@ -22,6 +22,7 @@ config X86_64
>>>>>>   ### Arch settings
>>>>>>   config X86
>>>>>>  def_bool y
>>>>>> +select ACPI_GENERIC_SLEEP if ACPI
>>>>> One more nit.  If you did
>>>>>
>>>>> + select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>>>>
>>>>> here (and above for ia64), you'd avoid having to make ACPI_SLEEP
>>>>> depend on ACPI_GENERIC_SLEEP which goes somewhat backwards.
>>>> In sleep.c,
>>>>
>>>> #ifdef CONFIG_ACPI_SLEEP
>>>> acpi_target_system_state()
>>>> {
>>>> }
>>>> #endif
>>>>
>>>> and CONFIG_ACPI_SLEEP depends on SUSPEND || HIBERNATION,
>>>> which one of them will be enabled on ARM64 so ACPI_SLEEP
>>>> will also enabled too.
>>>>
>>>> So if we
>>>>
>>>> +select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>>>
>>>> and
>>>>
>>>> +acpi-$(CONFIG_ACPI_GENERIC_SLEEP) += sleep.o
>>>>
>>>> it will lead to errors for acpi_target_system_state() that
>>>> is declared but not defined, so I will keep the code as
>>>> it is, what do you think?
>>> No, we need to hash this out.  Having two different Kconfig options meaning
>>> almost the same thing (ACPI_SLEEP and ACPI_GENERIC_SLEEP) is beyond ugly.
>>>
>>> Do you need ACPI_SLEEP on ARM64 at all?
>> No, at least for now we don't need it, the spec for sleep is not ready for
>> ARM64 arch, so ACPI_SLEEP will not work at all on ARM64.
> Well, so what about selecting ACPI_SLEEP from the architectures that use it?

Do you mean remove CONFIG_ACPI_GENERIC_SLEEP and

+acpi-$(CONFIG_ACPI_SLEEP) += sleep.o

as well (we would also need to remove the duplicate #ifdef CONFIG_ACPI_SLEEP
in sleep.c if we do so)?

Thanks
Hanjun




Re: [PATCH] mm: kill kmemcheck

2015-03-16 Thread Steven Rostedt

On Mon, 16 Mar 2015 21:48:23 -0400
Sasha Levin  wrote:


> Steven,
> 
> 
> Since the only objection raised was the too-newiness of GCC 4.9.2/5.0, what
> would you consider a good time-line for removal?
> 
> I haven't heard any "over my dead body" objections, so I guess that trying
> to remove it while no distribution was shipping the compiler that would make
> it possible was premature.
> 
> Although, on the other hand, I'd be happy if we can have a reasonable date
> (that is before my kid goes to college), preferably even before the next
> LSF/MM so that we could have a mission accomplished thingie with a round
> of beers and commemorative t-shirts.

Perhaps give it 2 years? With fair notice that it will soon be gone?

In 2 years I should be up to gcc 4.9 ;-)

I still need to test it out.

-- Steve

Re: [PATCH] ARM: dts: am437x-gp-evm: add DT nodes for ov2659 sensor

2015-03-16 Thread Tony Lindgren

* Lad, Prabhakar  [150316 18:20]:
> Hi Tony,
> 
> On Mon, Mar 16, 2015 at 10:17 PM, Tony Lindgren  wrote:
> > * Lad Prabhakar  [150312 16:38]:
> >> From: "Lad, Prabhakar" 
> >>
> >> this patch does the following:
> >> 1: adds DT node for fixed oscillator.
> >> 2: adds DT node entries for ov2659 sensor
> >> 3: adds remote-endpoint entry for VPFE.
> >>
> >> Signed-off-by: Lad, Prabhakar 
> >
> > Applying into omap-for-v4.1/dt thanks.
> >
> I would like to get this one in via media tree to avoid dependency
> as I am still waiting for Acks from DT maintainers for the sensor
> driver.

OK dropping it.
 
> If I can get your Ack on this I'll queue it up along with sensor
> driver via media tree.

Sorry the chances are too big for pointless merge conflicts with
these files with constant patching going on.

Please just resend this patch alone again to me later on once the
driver changes are merged into Linux next and on their way to the
mainline kernel.

Regards,

Tony

linux-next: manual merge of the input tree with the input-current tree

2015-03-16 Thread Stephen Rothwell

Hi Dmitry,

Today's linux-next merge of the input tree got a conflict in
drivers/input/mouse/synaptics.c between commit dc5465dc8a6d ("Input:
synaptics - fix middle button on Lenovo 2015 products") from the
input-current tree and commit de4e374b401a ("Input: synaptics - switch
ForcePad detection to PNP IDs") from the input tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

diff --cc drivers/input/mouse/synaptics.c
index dda605836546,4c69e3304011..
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@@ -670,20 -612,6 +677,18 @@@ static void synaptics_parse_agm(const u
}
  }
  
 +static void synaptics_parse_ext_buttons(const unsigned char buf[],
 +  struct synaptics_data *priv,
 +  struct synaptics_hw_state *hw)
 +{
 +  unsigned int ext_bits =
 +  (SYN_CAP_MULTI_BUTTON_NO(priv->ext_cap) + 1) >> 1;
 +  unsigned int ext_mask = GENMASK(ext_bits - 1, 0);
 +
 +  hw->ext_buttons = buf[4] & ext_mask;
 +  hw->ext_buttons |= (buf[5] & ext_mask) << ext_bits;
 +}
 +
- static bool is_forcepad;
- 
  static int synaptics_parse_hw_state(const unsigned char buf[],
struct synaptics_data *priv,
struct synaptics_hw_state *hw)



Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier (x86) (v12)

2015-03-16 Thread Steven Rostedt


[ Removed npig...@kernel.dk as I keep getting bounces from that addr ]

On Tue, 17 Mar 2015 01:45:25 + (UTC)
Mathieu Desnoyers  wrote:

> - Original Message -
> > From: "Peter Zijlstra" 
> > To: "Mathieu Desnoyers" 
> > Cc: linux-kernel@vger.kernel.org, "KOSAKI Motohiro" 
> > , "Steven Rostedt"
> > , "Paul E. McKenney" , 
> > "Nicholas Miell" ,
> > "Linus Torvalds" , "Ingo Molnar" 
> > , "Alan Cox"
> > , "Lai Jiangshan" , 
> > "Stephen Hemminger"
> > , "Andrew Morton" , 
> > "Josh Triplett" ,
> > "Thomas Gleixner" , "David Howells" 
> > , "Nick Piggin" 
> > Sent: Monday, March 16, 2015 4:54:35 PM
> > Subject: Re: [RFC PATCH] sys_membarrier(): system/process-wide memory 
> > barrier (x86) (v12)

Can you please fix your mail client to not include the entire header in
your replies please.

> Let's consider the following memory barrier scenario performed in
> user-space on an architecture with very relaxed ordering. PowerPC comes
> to mind.
> 
> https://lwn.net/Articles/573436/
> scenario 12:
> 
> CPU 0                           CPU 1
> CAO(x) = 1;                     r3 = CAO(y);
> cmm_smp_wmb();                  cmm_smp_rmb();
> CAO(y) = 1;                     r4 = CAO(x);
> 
> BUG_ON(r3 == 1 && r4 == 0)
> 
> 
> We tweak it to use sys_membarrier on CPU 1, and a simple compiler
> barrier() on CPU 0:
> 
> CPU 0                           CPU 1
> CAO(x) = 1;                     r3 = CAO(y);
> barrier();                      sys_membarrier();
> CAO(y) = 1;                     r4 = CAO(x);
> 
> BUG_ON(r3 == 1 && r4 == 0)
> 
> Now if CPU 1 executes sys_membarrier while CPU 0 is preempted after both
> stores, we have:
> 
> CPU 0                           CPU 1
> CAO(x) = 1;
>   [1st store is slow to
>    reach other cores]
> CAO(y) = 1;
>   [2nd store reaches other
>    cores more quickly]
> [preempted]
>                                 r3 = CAO(y)
>                                   (may see y = 1)
>                                 sys_membarrier()
> Scheduler changes rq->curr.
>                                 skips CPU 0, because rq->curr has
>                                   been updated.
>                                 [return to userspace]
>                                 r4 = CAO(x)
>                                   (may see x = 0)
>                                 BUG_ON(r3 == 1 && r4 == 0) -> fails.
> load_cr3, with implied
>   memory barrier, comes
>   after CPU 1 has read "x".
> 
> The only way to make this scenario work is if a memory barrier is added
> before updating rq->curr. (we could also do a similar scenario for the
> needed barrier after store to rq->curr).

Hmm, I wonder if anything were to break if rq->curr was updated after
the context_switch() call?

Would that help?

this_cpu_write(saved_next, next);
rq = context_switch(rq, prev, next);
rq->curr = this_cpu_read(saved_next);

As I recently found out that this_cpu_read/write() is not that nice on
all architectures, something else may need to be updated. Or we can add
a temp variable on the rq.

rq->saved_next = next;
rq = context_switch(rq, prev, next);
rq->curr = rq->saved_next;

-- Steve
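
For reference, the first (two-sided barrier) form of the quoted scenario 12 can be sketched in user-space C11. This is only an illustration under stated assumptions: the release and acquire fences stand in for cmm_smp_wmb() and cmm_smp_rmb(), the names litmus_writer(), litmus_reader() and litmus_forbidden() are hypothetical, and the sketch does not model sys_membarrier() or the scheduler at all. In a real test the two functions would run concurrently on different threads.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Shared variables of the litmus test; both are expected to start at 0. */
struct litmus {
    atomic_int x;
    atomic_int y;
};

/* "CPU 0" side: store x, write barrier, store y. */
static void litmus_writer(struct litmus *s)
{
    atomic_store_explicit(&s->x, 1, memory_order_relaxed);
    /* Assumed stand-in for cmm_smp_wmb(). */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&s->y, 1, memory_order_relaxed);
}

/* "CPU 1" side: load y, read barrier, load x. */
static void litmus_reader(struct litmus *s, int *r3, int *r4)
{
    *r3 = atomic_load_explicit(&s->y, memory_order_relaxed);
    /* Assumed stand-in for cmm_smp_rmb(). */
    atomic_thread_fence(memory_order_acquire);
    *r4 = atomic_load_explicit(&s->x, memory_order_relaxed);
}

/* The outcome that BUG_ON() in the quoted scenario checks for. */
static bool litmus_forbidden(int r3, int r4)
{
    return r3 == 1 && r4 == 0;
}
```

With both fences in place, a reader that observes y == 1 must also observe x == 1, so litmus_forbidden() should never be true; the thread is about what happens when one of the two barriers is replaced by a compiler barrier().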


Re: [8/9] powerpc/hv-24x7: Break up single_24x7_request

2015-03-16 Thread Michael Ellerman

On Tue, 2015-17-02 at 22:00:33 UTC, Sukadev Bhattiprolu wrote:
> Break up the function single_24x7_request() into smaller functions.
> This would later enable us to "prepare" a multi-event request
> buffer and then submit a single hcall for several events.

This looks fine, though the names are a bit laboured.
> 
> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> index 3c36694..fde6211 100644
> --- a/arch/powerpc/perf/hv-24x7.c
> +++ b/arch/powerpc/perf/hv-24x7.c
> @@ -1001,6 +1001,44 @@ static void log_24x7_hcall(struct hv_24x7_request_buffer *request_buffer,
>  }
>  
>  /*
> + * Start the process for a new H_GET_24x7_DATA hcall.
> + */
> +static void start_24x7_get_data(struct hv_24x7_request_buffer *request_buffer,
> + struct hv_24x7_data_result_buffer *result_buffer)
> +{

Just init_24x7_request() ?

> +
> + memset(request_buffer, 0, 4096);
> + memset(result_buffer, 0, 4096);
> +
> + request_buffer->interface_version = HV_24X7_IF_VERSION_CURRENT;
> + /* memset above set request_buffer->num_requests to 0 */
> +}
> +
> +/*
> + * Commit (i.e perform) the H_GET_24x7_DATA hcall using the data collected
> + * by 'start_24x7_get_data()' and 'add_event_to_24x7_request()'.
> + */
> +static int commit_24x7_get_data(struct hv_24x7_request_buffer *request_buffer,
> + struct hv_24x7_data_result_buffer *result_buffer)
> +{

I don't like "commit"; that is a loaded term.

Just make_24x7_request() perhaps?


cheers
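
The init / add / make split discussed above can be sketched as a small user-space mock. Everything here is hypothetical and only illustrates the shape of the API: the struct layout, MAX_REQUESTS, IF_VERSION_CURRENT and the function names are stand-ins chosen for this sketch, and make_request() merely returns the number of queued requests instead of issuing an hcall.

```c
#include <stdint.h>
#include <string.h>

#define MAX_REQUESTS 4        /* hypothetical capacity of one buffer */
#define IF_VERSION_CURRENT 1  /* stand-in for HV_24X7_IF_VERSION_CURRENT */

struct request {
    uint32_t data_offset;
    uint16_t starting_ix;
};

struct request_buffer {
    uint8_t interface_version;
    uint8_t num_requests;
    struct request requests[MAX_REQUESTS];
};

/* Zero the buffer and set the version; num_requests is 0 after the memset. */
static void init_request(struct request_buffer *rb)
{
    memset(rb, 0, sizeof(*rb));
    rb->interface_version = IF_VERSION_CURRENT;
}

/* Append one event to the buffer; returns 0 on success, -1 if it is full. */
static int add_event_to_request(struct request_buffer *rb,
                                uint32_t offset, uint16_t ix)
{
    struct request *req;

    if (rb->num_requests >= MAX_REQUESTS)
        return -1;
    req = &rb->requests[rb->num_requests++];
    req->data_offset = offset;
    req->starting_ix = ix;
    return 0;
}

/* Stand-in for the single hcall: reports how many requests were submitted. */
static int make_request(const struct request_buffer *rb)
{
    return rb->num_requests;
}
```

A caller would run init_request() once, add_event_to_request() per event, and make_request() once at the end, which is the multi-event flow the patch description is preparing for.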

Re: [PATCH] phy: Add a driver for dm816x USB PHY

2015-03-16 Thread Tony Lindgren

* Matthijs van Duin  [150316 14:17]:
> *gets increasingly confused*

:)
 
> The datasheet (sprs614e) only contains register addresses, and they
> seem to match the TRM's USB chapter.  The only disagreement I can spot
> is related to USB_CTRL register(s) in the control module (offsets
> 0x620 and 0x628) where
> * the TRM claims both exist in the control module chapter (1.16.1.3),
> one for each both
> * the TRM's USB chapter claims only one exists, and its layout
> (24.9.8.2) doesn't resemble the former one

Yes, the second entry seems to be the closest thing to documentation
here.

> * the datasheet agrees only one exists (but gives no layout)
> * reality seems to disagree with both: I booted up our evm816x,
> plugged in a mouse to confirm USB is functional, and attached JTAG:
> both registers read as zero (contradicting both layouts) and appear to
> ignore writes.

Yes, so it seems here too. This is dm816x rev c; what do you have?

Anyway, I'll add a note that at least rev c does not seem to do
anything with USB_CTRL, but that we follow what the TI tree is doing
in case some other revisions of the hardware use it.
 
> There doesn't seem to be any disagreement in the docs about the
> USBPHY_CTRL regs (offsets 0x624 and 0x62c in control module) as far as
> I can tell, and the documented reset values also match what I see via
> JTAG.

Yes agreed.

> > But dm814x [..] seems to be wired up like am335x.
> 
> This I mentioned: the only difference is related to GPIO mode (which
> on the dm814x yields exactly that: GPIO, while the am335x uses it to
> hook up UARTs).
> 
> > Yes I checked am3517 trm, and that too mentions Synopsys once
> 
> Only in its EHCI/OHCI host controller, not the OTG one...

Oh OK I missed that part.
 
> Anyhow, I didn't expect this small observation to end up becoming a
> device archeology expedition, sorry about that ;-)

Well thanks for spotting the weirdness :) 
 
> >> BTW, da850? Is that yet another instance of Primus? (i.e.
> >> omap-L1xx/c674x/am1xxx with odd final digit, also da830/da828)
> >
> > Yes it's the arm926 based series, l-138 is da850 I believe.
> 
> Ah, but there are two such series: Freon and Primus. Just to be
> different, their part numbers are both allocated from the omap-L1xx
> (arm+dsp) / tms320c674x (dsp only) / am1xxx (arm only) ranges, but
> distinguished by Freon having even final digit while Primus has odd
> final digit. Of course this doesn't hold for the da8xx parts, that
> would be too consistent.

I can't keep up with these part numbers..
 
> But you're right, da850 indeed seems to be a Freon rather than a
> Primus based on some more googling (apparently da850 has SATA --
> Primus doesn't).

OK

Regards,

Tony

Re: [PATCH] mm/page_alloc: Call kernel_map_pages in unset_migrateype_isolate

2015-03-16 Thread Rik van Riel

On 03/16/2015 02:29 PM, Laura Abbott wrote:
> Commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on 
> isolated pageblock")
> changed the logic of unset_migratetype_isolate to check the buddy allocator
> and explicitly call __free_pages to merge. The page that is being freed in
> this path never had prep_new_page called so set_page_refcounted is called
> explicitly but there is no call to kernel_map_pages. With the default
> kernel_map_pages this is mostly harmless but if kernel_map_pages does any
> manipulation of the page tables (unmapping or setting pages to read only) this
> may trigger a fault:
> 
> alloc_contig_range test_pages_isolated(ceb00, ced00) failed
> Unable to handle kernel paging request at virtual address ffc0cec0
> pgd = ffc045fc4000
> [ffc0cec0] *pgd=
> Internal error: Oops: 964f [#1] PREEMPT SMP
> Modules linked in: exfatfs
> CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 
> 3.10.49-gc72ad36-dirty #1
> task: ffc03de52100 ti: ffc015388000 task.ti: ffc015388000
> PC is at memset+0xc8/0x1c0
> LR is at kernel_map_pages+0x1ec/0x244
> 
> Fix this by calling kernel_map_pages to ensure the page is set in the
> page table properly

Acked-by: Rik van Riel 

-- 
All rights reversed

Re: [6/9] powerpc/hv-24x7: Define add_event_to_24x7_request()

2015-03-16 Thread Michael Ellerman

On Tue, 2015-17-02 at 22:00:31 UTC, Sukadev Bhattiprolu wrote:
> Move code that maps a perf_event to a 24x7 request buffer into a
> separate function, add_event_to_24x7_request().
> 
> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> index e78b127..76c649a 100644
> --- a/arch/powerpc/perf/hv-24x7.c
> +++ b/arch/powerpc/perf/hv-24x7.c
> @@ -1052,7 +1077,6 @@ static unsigned long single_24x7_request(struct 
> perf_event *event, u64 *count)
>   }
>  
>   resb = &result_buffer->results[0];
> -
>   *count = be64_to_cpu(resb->elements[0].element_data[0]);
>  out:
>   return ret;
> @@ -1150,6 +1174,7 @@ static void h_24x7_event_read(struct perf_event *event)
>  {
>   s64 prev;
>   u64 now;
> +
>   now = h_24x7_get_value(event);
>   prev = local64_xchg(&event->hw.prev_count, now);
>   local64_add(now - prev, &event->count);

I'm a fan of whitespace for readability in cases like this, but do it as a
separate patch.

cheers

Re: [3/9] powerpc/hv-24x7: Drop event_24x7_request()

2015-03-16 Thread Michael Ellerman

On Tue, 2015-17-02 at 22:00:28 UTC, Sukadev Bhattiprolu wrote:
> The function event_24x7_request() is essentially a wrapper to the
> function single_24x7_request() and can be dropped to simplify code.
> 
> diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
> index 7856e38..c189e75 100644
> --- a/arch/powerpc/perf/hv-24x7.c
> +++ b/arch/powerpc/perf/hv-24x7.c
> @@ -1004,17 +1004,22 @@ static unsigned long single_24x7_request(u8 domain, 
> u32 offset, u16 ix,
>   memset(request_buffer, 0, 4096);
>   memset(result_buffer, 0, 4096);
>  
> + if (is_physical_domain(event_get_domain(event)))
> + idx = event_get_core(event);
> + else
> + idx = event_get_vcpu(event);
> +
>   request_buffer->interface_version = HV_24X7_IF_VERSION_CURRENT;
>   request_buffer->num_requests = 1;
>  
>   req = &request_buffer->requests[0];
>  
> - req->performance_domain = domain;
> + req->performance_domain = event_get_domain(event);
>   req->data_size = cpu_to_be16(8);
> - req->data_offset = cpu_to_be32(offset);
> - req->starting_lpar_ix = cpu_to_be16(lpar),
> + req->data_offset = cpu_to_be32(event_get_offset(event));
> + req->starting_lpar_ix = cpu_to_be16(event_get_lpar(event)),
>   req->max_num_lpars = cpu_to_be16(1);
> - req->starting_ix = cpu_to_be16(ix);
> + req->starting_ix = cpu_to_be16(idx);
>   req->max_ix = cpu_to_be16(1);
>  
>   /*
> @@ -1029,7 +1034,9 @@ static unsigned long single_24x7_request(u8 domain, u32 
> offset, u16 ix,
>   if (ret) {
>   pr_notice_ratelimited("hcall failed: %d %#x %#x %d => "
>   "0x%lx (%ld) detail=0x%x failing ix=%x\n",
> - domain, offset, ix, lpar, ret, ret,
> + (int)event_get_domain(event),
> + (unsigned int)event_get_offset(event),
> + idx, (int)event_get_lpar(event), ret, ret,

It seems more natural here to print req->performance_domain etc. rather
than re-extracting them from the event?

cheers

Re: [GIT PULL] cpuidle: 4.0-rc3 fixes

2015-03-16 Thread Rafael J. Wysocki

On Monday, March 16, 2015 10:24:27 AM Geert Uytterhoeven wrote:
> Hi Rafael,
> 
> On Fri, Mar 13, 2015 at 10:56 PM, Rafael J. Wysocki  
> wrote:
> > On Friday, March 13, 2015 06:42:30 PM Daniel Lezcano wrote:
> >> this pull request contains a couple of fixes:
> >>
> >>   - Fix the cpu_pm_enter/exit symmetry in the mvebu driver (Gregory 
> >> Clement)
> >>
> >>   - Fix the mvebu drivers latency/residency values to reach an
> >> acceptable tradeoff between perf / power (Sebastian Rannou)
> >>
> >> Thanks !
> >
> > Pulled, but this is a bit too late for 4.0-rc4, so I'll queue it up for 
> > -rc5.
> 
> Please note commit 43b68879de27b199 ("cpuidle: mvebu: Fix the CPU PM
> notifier usage") has a "Cc: sta...@vger.kernel.org" tag.

Yes, it does, and that's why I've queued it up for 4.0-rc5 and not for 4.1.

Does it require any extra care or something?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

RE: [PATCH v4 2/5] ARM: dts: Prepare exynos5410-odroidxu device tree

2015-03-16 Thread Kukjin Kim

Javier Martinez Canillas wrote:
> 
> Hello Andreas,
> 
Hi,

> On Mon, Mar 16, 2015 at 11:27 AM, Andreas Färber  wrote:
> > Am 16.03.2015 um 08:56 schrieb Javier Martinez Canillas:
> >>
> >> I think this should be defined in exynos5410.dtsi instead since it is an
> >> IP block in the SoC and referenced in the .dts using a label to change
> >> the clock-frequency in the board.
> >
> > I hope you understood that this is a literal copy of smdk5410, so I'm
> > not going to make random changes here. If the Samsung guys want to make
> > this change for smdk5410, then fine, but otherwise - like for Snow and
> > Spring - I want to keep the diff -u low between the two.
> >
> 
> Yes I did understand that it was a copy but I thought it could be
> improved anyways. But I don't have a strong opinion either to block
> this series and always both DTS can be changed as a follow-up. So I'm
> ok with your decision to keeping the delta to the minimum for now.
> 
Yeah, anybody can update anything in mainline as long as it gets reviewed on
the mailing list. BTW, fin_pll can differ depending on the board, which is why
it is defined in each board DT file, although it is mostly the same across
boards. So I think keeping it as it is makes more sense.

Thanks,
Kukjin


Re: randconfig build error with next-20150316, in samples/kdbus/kdbus-workers

2015-03-16 Thread Michael Ellerman

On Tue, 2015-03-17 at 00:00 +0100, David Herrmann wrote:
> Hi
> 
> On Mon, Mar 16, 2015 at 11:51 PM, Michael Ellerman  
> wrote:
> > On Mon, 2015-03-16 at 23:27 +0100, David Herrmann wrote:
> >> The uapi-include only causes the warning, not the build failure.
> >
> > I don't know how you came to that conclusion?
> >
> > It fails looking for linux/compiler.h, which is only included from the 
> > kernel
> > headers, never from the exported headers.
> 
> We only include linux/kdbus.h. On sanitized headers, this will include
> linux/types.h -> linux/posix_types.h -> linux/stddef.h.
> If you use the uapi headers, then stddef.h is not sanitized and will
> still include linux/compiler.h (which is removed on sanitized
> headers). Hence, this error only occurs if you include
> uapi/linux/kdbus.h. With sanitized headers in ./usr/, the compiler
> will prefer ./usr/linux/kdbus.h and ./usr/linux/stddef.h, thus never
> including any linux/compiler.h.

Right. But if the uapi include isn't there, you don't get this build failure.
This build failure is caused by the uapi include.

> If you drop -I./include/uapi/, you will get "linux/kdbus.h not found".

Yep, which is exactly correct (unless it's provided by your distro headers).

> The error will be different, but it will not fix anything. However, if
> you run "make headers_install", everything will compile just fine even
> with -I./include/uapi/.

Well probably. But I could probably construct a scenario where something gets
pulled in from uapi, and then you've got a mixture of user & kernel headers
again.

At the end of the day there is no good reason to use include/uapi, I think we
agree on that :)

cheers



Re: [PATCH] mm/page_alloc: Call kernel_map_pages in unset_migrateype_isolate

2015-03-16 Thread Joonsoo Kim

On Mon, Mar 16, 2015 at 11:29:45AM -0700, Laura Abbott wrote:
> Commit 3c605096d315 ("mm/page_alloc: restrict max order of merging on 
> isolated pageblock")
> changed the logic of unset_migratetype_isolate to check the buddy allocator
> and explicitly call __free_pages to merge. The page that is being freed in
> this path never had prep_new_page called so set_page_refcounted is called
> explicitly but there is no call to kernel_map_pages. With the default
> kernel_map_pages this is mostly harmless but if kernel_map_pages does any
> manipulation of the page tables (unmapping or setting pages to read only) this
> may trigger a fault:
> 
> alloc_contig_range test_pages_isolated(ceb00, ced00) failed
> Unable to handle kernel paging request at virtual address ffc0cec0
> pgd = ffc045fc4000
> [ffc0cec0] *pgd=
> Internal error: Oops: 964f [#1] PREEMPT SMP
> Modules linked in: exfatfs
> CPU: 1 PID: 23237 Comm: TimedEventQueue Not tainted 
> 3.10.49-gc72ad36-dirty #1
> task: ffc03de52100 ti: ffc015388000 task.ti: ffc015388000
> PC is at memset+0xc8/0x1c0
> LR is at kernel_map_pages+0x1ec/0x244
> 
> Fix this by calling kernel_map_pages to ensure the page is set in the
> page table properly
> 
> Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on 
> isolated pageblock")
> Cc: Naoya Horiguchi 
> Cc: Mel Gorman 
> Cc: Rik van Riel 
> Cc: Yasuaki Ishimatsu 
> Cc: Zhang Yanfei 
> Cc: Xishi Qiu 
> Cc: Vladimir Davydov 
> Cc: Joonsoo Kim 
> Cc: Gioh Kim 
> Cc: Michal Nazarewicz 
> Cc: Marek Szyprowski 
> Cc: Vlastimil Babka 
> Signed-off-by: Laura Abbott 
> ---
> Note this was found on a backport to 3.10 and the code to make 
> kernel_map_pages
> change the page table state is currently out of tree. The original had stable,
> so this may need to go into stable as well.

I found that some implementations of kernel_map_pages() in mainline also require
this change. Some implementations don't check the previous state of the page
table, but others do check it when kernel_map_pages() is called.

Acked-by: Joonsoo Kim 

Thanks.
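
The failure mode described in the commit message can be illustrated with a toy model. This is a sketch under loud assumptions, not kernel code: struct fake_page and all fake_* helpers are hypothetical, and "mapped" is just a flag standing in for whatever page table state a non-default kernel_map_pages() manages. Freed pages are unmapped, and a path that hands pages out without going through the usual preparation step must map them again itself.

```c
#include <stdbool.h>

struct fake_page {
    bool mapped;    /* pretend page table state for this page */
    int refcount;
};

/* Toy counterpart of kernel_map_pages(page, numpages, enable). */
static void fake_kernel_map_pages(struct fake_page *pg, int numpages, bool enable)
{
    for (int i = 0; i < numpages; i++)
        pg[i].mapped = enable;
}

/* Freeing drops the reference and unmaps the pages. */
static void fake_free_pages(struct fake_page *pg, int numpages)
{
    for (int i = 0; i < numpages; i++)
        pg[i].refcount = 0;
    fake_kernel_map_pages(pg, numpages, false);
}

/*
 * Toy counterpart of the isolation path: it sets the refcount by hand,
 * so it also has to map the pages by hand (with_fix == true).
 */
static void fake_grab_isolated(struct fake_page *pg, int numpages, bool with_fix)
{
    if (with_fix)
        fake_kernel_map_pages(pg, numpages, true);
    pg[0].refcount = 1;
}

/* True only if every page in the range may be touched safely. */
static bool fake_all_mapped(const struct fake_page *pg, int numpages)
{
    for (int i = 0; i < numpages; i++)
        if (!pg[i].mapped)
            return false;
    return true;
}
```

Without the fix, the grabbed range is still unmapped, which corresponds to the memset() fault in the quoted oops; with the fix, the range is mapped before anyone touches it.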

> ---
>  mm/page_isolation.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 72f5ac3..755a42c 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -103,6 +103,7 @@ void unset_migratetype_isolate(struct page *page, 
> unsigned migratetype)
>  
>   if (!is_migrate_isolate_page(buddy)) {
>   __isolate_free_page(page, order);
> + kernel_map_pages(page, (1 << order), 1);
>   set_page_refcounted(page);
>   isolated_page = page;
>   }
> -- 
> Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux 
> Foundation Collaborative Project
> This e-mail address will be inactive after March 20, 2015
> Please contact privately for follow up after that date.
> 

Re: [PATCH RESEND] usb: dwc2: avoid leaking DMA channels on disconnection

2015-03-16 Thread John Youn

On 3/16/2015 2:50 AM, Yunzhi Li wrote:
> Hi
>> When the HCD is disconnected, the DMA transfers still in-flight were 
>> cleaned-up
>> but the count of available DMA channels (e.g. available_host_channels) was 
>> not
>> reset.
>> The pool of DMA channels can be depleted when doing unclean
>> disconnection of USB peripherals, and reaches the point where no
>> transfer was possible until the next reboot/reload of the driver.
>>
>> Tested by putting a programmable USB mux on the port and randomly
>> plugging/unplugging a USB HUB with USB mass-storage key, USB-audio and
>> USB-ethernet dongle connected to its downstream ports, and also doing the
>> disconnection early while the devices are still enumerating to get more URBs
>> in-flight.
>> After the patch, the devices are still enumerating after thousands of cycles,
>> while the port was totally dead before.
>>
>> Signed-off-by: Vincent Palatin 
>> ---
>> I'm re-sending it, it seems the previous email did not show up.
>>
>>   drivers/usb/dwc2/hcd.c | 8 
>>   1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/usb/dwc2/hcd.c b/drivers/usb/dwc2/hcd.c
>> index c78c874..559b55e 100644
>> --- a/drivers/usb/dwc2/hcd.c
>> +++ b/drivers/usb/dwc2/hcd.c
>> @@ -257,6 +257,14 @@ static void dwc2_hcd_cleanup_channels(struct dwc2_hsotg 
>> *hsotg)
>>   */
>>  channel->qh = NULL;
>>  }
>> +/* All channels have been freed, mark them available */
>> +if (hsotg->core_params->uframe_sched > 0) {
>> +hsotg->available_host_channels =
>> +hsotg->core_params->host_channels;
>> +} else {
>> +hsotg->non_periodic_channels = 0;
>> +hsotg->periodic_channels = 0;
>> +}
>>   }
>>   
>>   /**
> 
> I have reviewed this patch. Obviously,it makes sense.
> 
> Reviewed-by: Yunzhi Li 


Acked-by: John Youn 
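
The leak fixed by the quoted patch can be shown with a small stand-alone model of a channel pool. This is a sketch only: struct hc_pool, HOST_CHANNELS and the pool_* functions are hypothetical names for illustration, and the model covers just the available-count bookkeeping (the uframe_sched > 0 branch), not the periodic/non-periodic counters.

```c
#include <stdbool.h>

#define HOST_CHANNELS 8   /* hypothetical pool size */

struct hc_pool {
    int available;                /* counterpart of available_host_channels */
    bool in_use[HOST_CHANNELS];
};

static void pool_init(struct hc_pool *p)
{
    p->available = HOST_CHANNELS;
    for (int i = 0; i < HOST_CHANNELS; i++)
        p->in_use[i] = false;
}

/* Returns a channel index, or -1 when the counter says none are left. */
static int pool_acquire(struct hc_pool *p)
{
    if (p->available <= 0)
        return -1;
    for (int i = 0; i < HOST_CHANNELS; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            p->available--;
            return i;
        }
    }
    return -1;
}

/*
 * Disconnect cleanup: every channel is freed. Without resetting the
 * counter (reset_count == false) the pool stays depleted forever.
 */
static void pool_disconnect(struct hc_pool *p, bool reset_count)
{
    for (int i = 0; i < HOST_CHANNELS; i++)
        p->in_use[i] = false;
    if (reset_count)
        p->available = HOST_CHANNELS;
}
```

Acquiring all channels and then disconnecting without the reset leaves pool_acquire() failing even though no channel is in use, which matches the "no transfer was possible until the next reboot" symptom in the description.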




Re: [update][PATCH v10 06/21] ACPI / sleep: Introduce CONFIG_ACPI_GENERIC_SLEEP

2015-03-16 Thread Rafael J. Wysocki

On Tuesday, March 17, 2015 09:08:45 AM Hanjun Guo wrote:
> On 2015/3/17 7:15, Rafael J. Wysocki wrote:
> > On Monday, March 16, 2015 08:14:52 PM Hanjun Guo wrote:
> >> On 2015年03月14日 05:49, Rafael J. Wysocki wrote:
> >>> On Friday, March 13, 2015 04:14:29 PM Hanjun Guo wrote:
> [...]
> 
> >>>> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> >>>> index 074e52b..e8728d7 100644
> >>>> --- a/arch/ia64/Kconfig
> >>>> +++ b/arch/ia64/Kconfig
> >>>> @@ -10,6 +10,7 @@ config IA64
> >>>>  select ARCH_MIGHT_HAVE_PC_SERIO
> >>>>  select PCI if (!IA64_HP_SIM)
> >>>>  select ACPI if (!IA64_HP_SIM)
> >>>> +select ACPI_GENERIC_SLEEP if ACPI
> >>>>  select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
> >>>>  select HAVE_UNSTABLE_SCHED_CLOCK
> >>>>  select HAVE_IDE
> >>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> >>>> index b7d31ca..9804431 100644
> >>>> --- a/arch/x86/Kconfig
> >>>> +++ b/arch/x86/Kconfig
> >>>> @@ -22,6 +22,7 @@ config X86_64
> >>>>   ### Arch settings
> >>>>   config X86
> >>>>  def_bool y
> >>>> +select ACPI_GENERIC_SLEEP if ACPI
> >>> One more nit.  If you did
> >>>
> >>> + select ACPI_GENERIC_SLEEP if ACPI_SLEEP
> >>>
> >>> here (and above for ia64), you'd avoid having to make ACPI_SLEEP
> >>> depend on ACPI_GENERIC_SLEEP which goes somewhat backwards.
> >> In sleep.c,
> >>
> >> #ifdef CONFIG_ACPI_SLEEP
> >> acpi_target_system_state()
> >> {
> >> }
> >> #endif
> >>
> >> and CONFIG_ACPI_SLEEP depends on SUSPEND || HIBERNATION,
> >> which one of them will be enabled on ARM64 so ACPI_SLEEP
> >> will also enabled too.
> >>
> >> So if we
> >>
> >> +select ACPI_GENERIC_SLEEP if ACPI_SLEEP
> >>
> >> and
> >>
> >> +acpi-$(CONFIG_ACPI_GENERIC_SLEEP) += sleep.o
> >>
> >> it will lead to errors for acpi_target_system_state() that
> >> is declared but not defined, so I will keep the code as
> >> it is, what do you think?
> > No, we need to hash this out.  Having two different Kconfig options meaning
> > almost the same thing (ACPI_SLEEP and ACPI_GENERIC_SLEEP) is beyond ugly.
> >
> > Do you need ACPI_SLEEP on ARM64 at all?
> 
> No, at least for now we don't need it, the spec for sleep is not ready for
> ARM64 arch, so ACPI_SLEEP will not work at all on ARM64.

Well, so what about selecting ACPI_SLEEP from the architectures that use it?


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH v4 0/5] mm: cma: add some debug information for CMA

2015-03-16 Thread Joonsoo Kim

On Mon, Mar 16, 2015 at 09:54:18PM -0400, Sasha Levin wrote:
> On 03/16/2015 09:43 PM, Joonsoo Kim wrote:
> > On Mon, Mar 16, 2015 at 07:06:55PM +0300, Stefan Strogin wrote:
> >> > Hi all.
> >> > 
> >> > Here is the fourth version of a patch set that adds some debugging 
> >> > facility for
> >> > CMA.
> >> > 
> >> > This patch set is based on next-20150316.
> >> > It is also available on git:
> >> > git://github.com/stefanstrogin/linux -b cmainfo-v4
> >> > 
> >> > We want an interface to see a list of currently allocated CMA buffers 
> >> > and some
> >> > useful information about them (like /proc/vmallocinfo but for physically
> >> > contiguous buffers allocated with CMA).
> >> > 
> >> > For example. We want a big (megabytes) CMA buffer to be allocated in 
> >> > runtime
> >> > in default CMA region. If someone already uses CMA then the big 
> >> > allocation
> >> > could fail. If it happened then with such an interface we could find who 
> >> > used
> >> > CMA at the moment of failure, who caused fragmentation and so on. Ftrace 
> >> > also
> >> > would be helpful here, but with ftrace we can see the whole history of
> >> > allocations and releases, whereas with this patch set we can see a 
> >> > snapshot of
> >> > CMA region with actual information about its allocations.
> > Hello,
> > 
> > Hmm... I still don't think that this is really helpful to find root
> > cause of fragmentation. Think about following example.
> > 
> > Assume 1024 MB CMA region.
> > 
> > 128 MB allocation * 4
> > 1 MB allocation
> > 128 MB allocation
> > 128 MB release * 4 (first 4)
> > try 512 MB allocation
> > 
> > With above sequences, fragmentation happens and 512 MB allocation would
> > be failed. We can get information about 1 MB allocation and 128 MB one
> > from the buffer list as you suggested, but, fragmentation are related
> > to whole sequence of allocation/free history, not snapshot of allocation.
> 
> This is solvable by dumping task->comm in the tracepoint patch (1/5), right?

Yes, it can be solved by 1/5.
What I mean is that I'm not sure whether patch 4/5 is really needed.

Thanks.

Re: linux panic on 4.0.0-rc4

2015-03-16 Thread Pranith Kumar

On Mon, Mar 16, 2015 at 7:22 PM, Michael Ellerman  wrote:
>
> The log shows that init is being killed, that's what's causing the panic.
>
> The exitcode of init is 0x200, which due to the vagaries of UNIX is I think an
> "exit status" of 2 in the common usage.
>
> But it suggests that your init is just exiting for some reason?
>

Yeah, seems like that. Not sure why though. git bisect seems to be the
only option.

> What is your init?

I am using systemd from debian unstable.

-- 
Pranith
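
The reading of exitcode 0x200 as an "exit status" of 2 can be checked with the standard wait-status macros. A minimal sketch, assuming the Linux wait-status encoding; the helper name exit_status_of() is hypothetical.

```c
#include <sys/wait.h>

/*
 * Decode a wait()-style status word. Returns the exit status if the
 * process exited normally, or -1 otherwise (for example when it was
 * killed by a signal).
 */
static int exit_status_of(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    return -1;
}
```

On Linux the exit status lives in bits 8 to 15 of the status word, so exit_status_of(0x200) yields 2, consistent with init calling exit(2) or returning 2 from main().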

Re: [PATCH v4 1/1] extcon: usb-gpio: Introduce gpio usb extcon driver

2015-03-16 Thread Chanwoo Choi

Hi Ivan,

On 03/16/2015 11:23 PM, Ivan T. Ivanov wrote:
> 
> Hi Roger, 
> 
> On Mon, 2015-03-16 at 15:11 +0200, Roger Quadros wrote:
>> Hi Ivan,
>>
>> On 16/03/15 14:32, Ivan T. Ivanov wrote:
>>> Hi,
>>>
>>> On Mon, 2015-02-02 at 12:21 +0200, Roger Quadros wrote:
>>>> This driver observes the USB ID pin connected over a GPIO and
>>>> updates the USB cable extcon states accordingly.
>>>>
>>>> The existing GPIO extcon driver is not suitable for this purpose
>>>> as it needs to be taught to understand USB cable states and it
>>>> can't handle more than one cable per instance.
>>>>
>>>> For the USB case we need to handle 2 cable states.
>>>> 1) USB (attach/detach)
>>>> 2) USB-HOST (attach/detach)
>>>>
>>>> This driver can be easily updated in the future to handle VBUS
>>>> events in case it happens to be available on GPIO for any platform.
>>>>
>>>> Signed-off-by: Roger Quadros
>>>> ---
>>>> v4:
>>>> - got rid of id_irqwake flag. Fail if enable/disable_irq_wake() fails
>>>> - changed host cable name to "USB-HOST"
>>>
>>> I am sorry that I am getting a bit little late into this.
>>>
>>> Isn't supposed that we have to use strings defined in
>>> const char extcon_cable_name[][]?
>>>
>>>
>>>> +
>>>> +/* List of detectable cables */
>>>> +enum {
>>>> +   EXTCON_CABLE_USB = 0,
>>>> +   EXTCON_CABLE_USB_HOST,
>>>> +
>>>
>>> Same here: duplicated with enum extcon_cable_name
>>>
>>>> +   EXTCON_CABLE_END,
>>>> +};
>>>> +
>>>> +static const char *usb_extcon_cable[] = {
>>>> +   [EXTCON_CABLE_USB] = "USB",
>>>> +   [EXTCON_CABLE_USB_HOST] = "USB-HOST",
>>>> +   NULL,
>>>> +};
>>
>> I'm not exactly sure how else it is supposed to work if we
>> support only a subset of cables from the global extcon_cable_name[][].
> 
> I don't see issue that we use just 2 events. I think that we can
> reuse  enum extcon_cable_name and strings already defined in 
> extcon_cable_name[][] global variable. It is defined extern in
> extcon.h file exactly for this purpose, no?

The 'extcon_cable_name' global variable is not used by extcon drivers directly.
It only lists the recommended cable names.

I plan to have extcon drivers use the standard cable names instead of
each extcon driver defining its own cable names.
[snip]

Thanks,
Chanwoo Choi
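
The per-driver cable table under discussion is a NULL-terminated array of names indexed by a small enum. A stand-alone sketch of that pattern follows; the identifiers CABLE_USB, CABLE_USB_HOST, usb_cables and find_cable_index() are hypothetical and are not the extcon API.

```c
#include <stddef.h>
#include <string.h>

/* Subset of cables one driver supports. */
enum {
    CABLE_USB = 0,
    CABLE_USB_HOST,
    CABLE_END,
};

/* NULL-terminated name table, indexed by the enum above. */
static const char *const usb_cables[] = {
    [CABLE_USB]      = "USB",
    [CABLE_USB_HOST] = "USB-HOST",
    NULL,
};

/* Look a cable up by name; returns its index, or -1 if unsupported. */
static int find_cable_index(const char *const *cables, const char *name)
{
    for (int i = 0; cables[i] != NULL; i++)
        if (strcmp(cables[i], name) == 0)
            return i;
    return -1;
}
```

Whether the strings come from a driver-local table like this one or from a shared list of standard names, the lookup stays the same; only the source of the names changes.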

linux-next: build failure after merge of the crypto tree

2015-03-16 Thread Stephen Rothwell

Hi Herbert,

After merging the crypto tree, today's linux-next build (x86_64 allmodconfig)
failed like this:

drivers/crypto/img-hash.c: At top level:
drivers/crypto/img-hash.c:878:1: error: expected ',' or ';' before 'static'
 static int img_hash_probe(struct platform_device *pdev)
 ^

Caused by commit d358f1abbf71 ("crypto: img-hash - Add Imagination
Technologies hw hash accelerator").

I have used the crypto tree from next-20150316 for today.
-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au



[PULL] virtio-next fixes

2015-03-16 Thread Rusty Russell

The following changes since commit 6587457b4b3d663b237a0f95ddf6e67d1828c8ea:

  Merge tag 'dma-buf-for-4.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/sumits/dma-buf (2015-03-04 09:59:51 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git tags/virtio-next-for-linus

for you to fetch changes up to 704a0b5f234db26de5203740999e39523cfa4e3a:

  virtio_mmio: fix access width for mmio (2015-03-17 12:12:21 +1030)


Not entirely surprising: the ongoing QEMU work on virtio 1.0 has revealed
more minor issues with our virtio 1.0 drivers just introduced in the
kernel.

(I would normally use my fixes branch for this, but there was a batch of
them...)

Thanks,
Rusty.


Michael S. Tsirkin (11):
  virtio_console: init work unconditionally
  virtio_console: avoid config access from irq
  virtio_balloon: set DRIVER_OK before using device
  virtio_blk: typo fix
  virtio_blk: fix comment for virtio 1.0
  virtio-balloon: do not call blocking ops when !TASK_RUNNING
  9p/trans_virtio: fix hot-unplug
  virtio_rpmsg: set DRIVER_OK before using device
  virtio_mmio: generation support
  uapi/virtio_scsi: allow overriding CDB/SENSE size
  virtio_mmio: fix access width for mmio

 drivers/char/virtio_console.c    | 19 -
 drivers/rpmsg/virtio_rpmsg_bus.c | 17 +++-
 drivers/virtio/virtio_balloon.c  | 21 +++---
 drivers/virtio/virtio_mmio.c     | 90
 include/uapi/linux/virtio_blk.h  |  8 +++-
 include/uapi/linux/virtio_scsi.h | 12 +-
 net/9p/trans_virtio.c            | 24 +--
 7 files changed, 168 insertions(+), 23 deletions(-)

Re: [PATCH v4] livepatch/module: Correctly handle coming and going modules

2015-03-16 Thread Rusty Russell

Jiri Kosina  writes:
> On Thu, 12 Mar 2015, Petr Mladek wrote:
>
>> There is a notifier that handles live patches for coming and going modules.
>> It takes klp_mutex lock to avoid races with coming and going patches but
>> it does not keep the lock all the time. Therefore the following races are
>> possible:
> [ ... snip ... ]
>> diff --git a/include/linux/module.h b/include/linux/module.h
>> index b653d7c0a05a..7232fde6a991 100644
>> --- a/include/linux/module.h
>> +++ b/include/linux/module.h
>> @@ -344,6 +344,10 @@ struct module {
>>  unsigned long *ftrace_callsites;
>>  #endif
>>  
>> +#ifdef CONFIG_LIVEPATCH
>> +bool klp_alive;
>> +#endif
>> +
>
> Rusty, are you okay with this please? I'd like to have the race fixed in 
> 4.0 still, but don't want to be making changes to struct module without 
> your ack.

I look at the amount of explanation and discussion around these patches
and I fear the complexity of what you're doing.

But not enough to rewrite it myself, so:

Acked-by: Rusty Russell 

Good luck!
Rusty.

Re: [PATCH v4 0/5] mm: cma: add some debug information for CMA

2015-03-16 Thread Sasha Levin

On 03/16/2015 09:43 PM, Joonsoo Kim wrote:
> On Mon, Mar 16, 2015 at 07:06:55PM +0300, Stefan Strogin wrote:
>> > Hi all.
>> > 
>> > Here is the fourth version of a patch set that adds some debugging
>> > facility for CMA.
>> > 
>> > This patch set is based on next-20150316.
>> > It is also available on git:
>> > git://github.com/stefanstrogin/linux -b cmainfo-v4
>> > 
>> > We want an interface to see a list of currently allocated CMA buffers
>> > and some useful information about them (like /proc/vmallocinfo but for
>> > physically contiguous buffers allocated with CMA).
>> > 
>> > For example. We want a big (megabytes) CMA buffer to be allocated in
>> > runtime in default CMA region. If someone already uses CMA then the big
>> > allocation could fail. If it happened then with such an interface we
>> > could find who used CMA at the moment of failure, who caused
>> > fragmentation and so on. Ftrace also would be helpful here, but with
>> > ftrace we can see the whole history of allocations and releases, whereas
>> > with this patch set we can see a snapshot of CMA region with actual
>> > information about its allocations.
> Hello,
> 
> Hmm... I still don't think that this is really helpful for finding the
> root cause of fragmentation. Think about the following example.
> 
> Assume 1024 MB CMA region.
> 
> 128 MB allocation * 4
> 1 MB allocation
> 128 MB allocation
> 128 MB release * 4 (first 4)
> try 512 MB allocation
> 
> With the above sequence, fragmentation happens and the 512 MB allocation
> would fail. We can get information about the 1 MB allocation and the 128 MB
> one from the buffer list as you suggested, but fragmentation is related
> to the whole sequence of allocation/free history, not a snapshot of allocations.

This is solvable by dumping task->comm in the tracepoint patch (1/5), right?


Thanks,
Sasha

Re: [rfc patch v2] rt,nohz_full: fix nohz_full for PREEMPT_RT_FULL

2015-03-16 Thread Mike Galbraith

On Mon, 2015-03-16 at 21:24 +0100, Sebastian Andrzej Siewior wrote:
> * Mike Galbraith | 2015-03-13 05:53:25 [+0100]:
> 
> >First of all, a task being ticked and trying to shut the tick down will
> >fail to do so due to having just awakened ksoftirqd, so let ksoftirqd
> >try to do that after SOFTIRQ_TIMER processing.  Secondly, should the
> >tick be shut down, we may livelock in hrtimer-cancel() because in -rt
> >a callback may be running.  Break the loop, and let tick_nohz_restart()
> >know that the timer is busy so it can bail.
> 
> I am a bit undecided on that one. I included it in the series but did
> not enable it yet.
> Just so we are on the same page here: you boot your machine with
> something like
> "isolcpus=1-31 rcu_nocbs=1-31 nohz_full=1-31"
> and pin all kernel threads to CPU0, right?

No, I only declare the nohz_full set, do the isolation via cpusets.

> What you do is that you accept the fact that the timer-softirq is
> scheduled for no reason and then you try to disable the tick from within
> the timer-softirq. I assumed that it would work get the "expired timer"
> somehow.

Yup, it works around that otherwise crippling wakeup.  If I re-apply..

 timers-do-not-raise-softirq-unconditionally.patch

..the workaround is not needed of course, but the livelock fix still is.
I haven't yet tested that in 3.18-rt though, only 4.0-rt, but I presume
it'll be the same deal there when I do.

-Mike


RE: [PATCH 0/3] ARM: dts: Define stdout-path for Exynos Chromebooks

2015-03-16 Thread Kukjin Kim

Javier Martinez Canillas wrote:
> 
> The kernel can use a serial port as the default console if it is defined
> as the stdout device in the Device Tree.
> 
> This allows a board to be booted without the need of having a console
> parameter in the kernel command line.
> 
> This small series adds a stdout-path property for Exynos5 Chromebooks and
> is composed of the following patches:
> 
> Javier Martinez Canillas (3):
>   ARM: dts: Define stdout-path property for Peach boards
>   ARM: dts: Define stdout-path property for Snow board
>   ARM: dts: Define stdout-path property for Spring board
> 
>  arch/arm/boot/dts/exynos5250-snow.dts  | 1 +
>  arch/arm/boot/dts/exynos5250-spring.dts| 1 +
>  arch/arm/boot/dts/exynos5420-peach-pit.dts | 4 
>  arch/arm/boot/dts/exynos5800-peach-pi.dts  | 4 
>  4 files changed, 10 insertions(+)
> 
+ Arnd

Basically, I have no objection to adding the stdout-path property to the
board DT, but I need to ask the other ARM guys what they think. I'm always
asked what should be defined in the bootloader before entering the kernel;
IMHO the kernel can do it, though it should be defined in the bootloader ;)

Let's wait for other opinions...

Thanks,
Kukjin


Re: [PATCH] mm: kill kmemcheck

2015-03-16 Thread Sasha Levin

On 03/11/2015 10:52 AM, Steven Rostedt wrote:
>> > Could you try KASan for your use case and see if it potentially uncovers
>> > anything new?
> The problem is, I don't have a setup to build with the latest compiler.
> 
> I could build with my host compiler (that happens to be 4.9.2), but it
> would take a while to build, and is not part of my work flow.
> 
> 4.9.2 is very new, I think it's a bit premature to declare that the
> only way to test memory allocations is with the latest and greatest
> kernel.
> 
> But if kmemcheck really doesn't work anymore, then perhaps we should
> get rid of it.

Steven,

Since the only objection raised was the too-newiness of GCC 4.9.2/5.0, what
would you consider a good time-line for removal?

I haven't heard any "over my dead body" objections, so I guess that trying
to remove it while no distribution was shipping the compiler that would make
it possible was premature.

Although, on the other hand, I'd be happy if we can have a reasonable date
(that is before my kid goes to college), preferably even before the next
LSF/MM so that we could have a mission accomplished thingie with a round
of beers and commemorative t-shirts.

Thanks,
Sasha

Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier (x86) (v12)

2015-03-16 Thread Mathieu Desnoyers

- Original Message -
> From: "Peter Zijlstra"
> To: "Mathieu Desnoyers"
> Cc: linux-kernel@vger.kernel.org, "KOSAKI Motohiro", "Steven Rostedt",
> "Paul E. McKenney", "Nicholas Miell", "Linus Torvalds", "Ingo Molnar",
> "Alan Cox", "Lai Jiangshan", "Stephen Hemminger", "Andrew Morton",
> "Josh Triplett", "Thomas Gleixner", "David Howells", "Nick Piggin"
> Sent: Monday, March 16, 2015 4:54:35 PM
> Subject: Re: [RFC PATCH] sys_membarrier(): system/process-wide memory barrier (x86) (v12)
> 
> On Mon, Mar 16, 2015 at 06:53:35PM +, Mathieu Desnoyers wrote:
> > > I'm not entirely awake atm but I'm not seeing why it would need to be
> > > that strict; I think the current single MB on task switch is sufficient
> > > because if we're in the middle of schedule, userspace isn't actually
> > > running.
> > > 
> > > So from the point of userspace the task switch is atomic. Therefore even
> > > if we do not get a barrier before setting ->curr, the expedited thing
> > > missing us doesn't matter as userspace cannot observe the difference.
> > 
> > AFAIU, atomicity is not what matters here. It's more about memory ordering.
> > What is guaranteeing that upon entry in kernel-space, all prior memory
> > accesses (loads and stores) are ordered prior to following loads/stores ?
> > 
> > The same applies when returning to user-space: what is guaranteeing that
> > all prior loads/stores are ordered before the user-space loads/stores
> > performed after returning to user-space ?
> 
> You're still one step ahead of me; why does this matter?
> 
> Or put it another way; what can go wrong? By virtue of being in
> schedule() both tasks (prev and next) get an effective MB from the task
> switch.
> 
> So even if we see the 'wrong' rq->curr, that CPU will still observe the
> MB by the time it gets to userspace.
> 
> All of this is really only about userspace load/store ordering and the
> context switch already very much needs to guarantee userspace program
> order in the face of context switches.

Let's go through a memory ordering scenario to highlight my reasoning
there.

Let's consider the following memory barrier scenario performed in
user-space on an architecture with very relaxed ordering. PowerPC comes
to mind.

https://lwn.net/Articles/573436/
scenario 12:

CPU 0                           CPU 1
CAO(x) = 1;                     r3 = CAO(y);
cmm_smp_wmb();                  cmm_smp_rmb();
CAO(y) = 1;                     r4 = CAO(x);

BUG_ON(r3 == 1 && r4 == 0)


We tweak it to use sys_membarrier on CPU 1, and a simple compiler
barrier() on CPU 0:

CPU 0                           CPU 1
CAO(x) = 1;                     r3 = CAO(y);
barrier();                      sys_membarrier();
CAO(y) = 1;                     r4 = CAO(x);

BUG_ON(r3 == 1 && r4 == 0)

Now if CPU 1 executes sys_membarrier while CPU 0 is preempted after both
stores, we have:

CPU 0                           CPU 1
CAO(x) = 1;
  [1st store is slow to
   reach other cores]
CAO(y) = 1;
  [2nd store reaches other
   cores more quickly]
[preempted]
                                r3 = CAO(y)
                                  (may see y = 1)
                                sys_membarrier()
Scheduler changes rq->curr.
                                skips CPU 0, because rq->curr has
                                  been updated.
                                [return to userspace]
                                r4 = CAO(x)
                                  (may see x = 0)
                                BUG_ON(r3 == 1 && r4 == 0) -> fails.
load_cr3, with implied
  memory barrier, comes
  after CPU 1 has read "x".

The only way to make this scenario work is if a memory barrier is added
before updating rq->curr. (we could also do a similar scenario for the
needed barrier after store to rq->curr).
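
The interleaving above can be replayed with a small deterministic toy
model. This is only a sketch under loud assumptions: the names
(toy_mem, toy_cpu0_mb, replay) are hypothetical, a store that is "slow to
reach other cores" is modelled as a pending flag, and a memory barrier on
CPU 0 is modelled as making that pending store visible. It is not kernel
code and does not model real PowerPC behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state: what CPU 1 can observe, plus CPU 0's slow store. */
struct toy_mem {
	int x_visible;		/* value of x as seen by other cores */
	int y_visible;		/* value of y as seen by other cores */
	bool x_pending;		/* CPU 0 stored x = 1, not yet visible elsewhere */
};

/* Model of a full barrier executed on CPU 0: the pending store becomes visible. */
void toy_cpu0_mb(struct toy_mem *m)
{
	if (m->x_pending) {
		m->x_visible = 1;
		m->x_pending = false;
	}
}

/*
 * Replay the interleaving from the scenario. Returns true when CPU 1 ends
 * up with r3 == 1 && r4 == 0, i.e. the outcome the BUG_ON() forbids.
 */
bool replay(bool barrier_before_curr_update)
{
	struct toy_mem m = { 0, 0, false };
	bool cpu0_runs_user_task = true;	/* stands in for rq->curr */
	int r3, r4;

	/* CPU 0: CAO(x) = 1; barrier(); CAO(y) = 1; */
	m.x_pending = true;			/* 1st store is slow */
	m.y_visible = 1;			/* 2nd store is fast */

	/* CPU 1: r3 = CAO(y); */
	r3 = m.y_visible;

	/* CPU 0 is preempted: the scheduler updates rq->curr. */
	if (barrier_before_curr_update)
		toy_cpu0_mb(&m);
	cpu0_runs_user_task = false;

	/* CPU 1: sys_membarrier() only targets CPUs still running the task. */
	if (cpu0_runs_user_task)
		toy_cpu0_mb(&m);

	/* CPU 1: r4 = CAO(x); */
	r4 = m.x_visible;

	/* CPU 0: load_cr3 with its implied barrier arrives too late to help. */
	toy_cpu0_mb(&m);
	assert(!m.x_pending);

	return r3 == 1 && r4 == 0;
}
```

In this model replay(false) reaches the forbidden outcome and replay(true)
does not, which is the argument for a barrier before the rq->curr update.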

> 
> > > > In order to be able to dereference rq->curr->mm without holding the
> > > > rq->lock, do you envision we should protect task reclaim with RCU-sched
> > > > ?
> > > 
> > > A recent discussion had Linus suggest SLAB_DESTROY_BY_RCU, although I
> > > think Oleg did mention it would still be 'interesting'. I've not yet had
> > > time to really think about that.
> > 
> > This might be an "interesting" modification. :) This could perhaps come
> > as an optimization later on ?
> 
> Not really, again, take this for (;;) sys_membar(EXPEDITED) that'll
> generate horrendous rq lock contention, with or without the PRIVATE
> thing it'll pound a number of rq locks real bad.
> 
> Typical scheduler syscalls only affect a single rq lock at a time -- the
> one the task is on. This one potentially pounds all of them.

Would you see it as acceptable if we start by implementing
only the non-expedited sys_membarrier() ? Then we can add
the expedited-private implementation after rq->curr becomes
available through RCU.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

Re: [PATCH v4 0/5] mm: cma: add some debug information for CMA

2015-03-16 Thread Joonsoo Kim

On Mon, Mar 16, 2015 at 07:06:55PM +0300, Stefan Strogin wrote:
> Hi all.
> 
> Here is the fourth version of a patch set that adds some debugging
> facility for CMA.
> 
> This patch set is based on next-20150316.
> It is also available on git:
> git://github.com/stefanstrogin/linux -b cmainfo-v4
> 
> We want an interface to see a list of currently allocated CMA buffers and some
> useful information about them (like /proc/vmallocinfo but for physically
> contiguous buffers allocated with CMA).
> 
> For example. We want a big (megabytes) CMA buffer to be allocated in runtime
> in default CMA region. If someone already uses CMA then the big allocation
> could fail. If it happened then with such an interface we could find who used
> CMA at the moment of failure, who caused fragmentation and so on. Ftrace also
> would be helpful here, but with ftrace we can see the whole history of
> allocations and releases, whereas with this patch set we can see a snapshot of
> CMA region with actual information about its allocations.

Hello,

Hmm... I still don't think that this is really helpful for finding the
root cause of fragmentation. Think about the following example.

Assume 1024 MB CMA region.

128 MB allocation * 4
1 MB allocation
128 MB allocation
128 MB release * 4 (first 4)
try 512 MB allocation

With the above sequence, fragmentation happens and the 512 MB allocation
would fail. We can get information about the 1 MB allocation and the 128 MB
one from the buffer list as you suggested, but fragmentation is related
to the whole sequence of allocation/free history, not a snapshot of allocations.

Thanks.

Re: [PATCHv3] kvmppc: Implement H_LOGICAL_CI_{LOAD,STORE} in KVM

2015-03-16 Thread David Gibson

On Thu, Feb 05, 2015 at 01:57:11AM +0100, Alexander Graf wrote:
> 
> 
> On 05.02.15 01:53, David Gibson wrote:
> > On POWER, storage caching is usually configured via the MMU - attributes
> > such as cache-inhibited are stored in the TLB and the hashed page table.
> > 
> > This makes correctly performing cache inhibited IO accesses awkward when
> > the MMU is turned off (real mode).  Some CPU models provide special
> > registers to control the cache attributes of real mode load and stores but
> > this is not at all consistent.  This is a problem in particular for SLOF,
> > the firmware used on KVM guests, which runs entirely in real mode, but
> > which needs to do IO to load the kernel.
> > 
> > To simplify this qemu implements two special hypercalls, H_LOGICAL_CI_LOAD
> > and H_LOGICAL_CI_STORE which simulate a cache-inhibited load or store to
> > a logical address (aka guest physical address).  SLOF uses these for IO.
> > 
> > However, because these are implemented within qemu, not the host kernel,
> > these bypass any IO devices emulated within KVM itself.  The simplest way
> > to see this problem is to attempt to boot a KVM guest from a virtio-blk
> > device with iothread / dataplane enabled.  The iothread code relies on an
> > in kernel implementation of the virtio queue notification, which is not
> > triggered by the IO hcalls, and so the guest will stall in SLOF unable to
> > load the guest OS.
> > 
> > This patch addresses this by providing in-kernel implementations of the
> > 2 hypercalls, which correctly scan the KVM IO bus.  Any access to an
> > address not handled by the KVM IO bus will cause a VM exit, hitting the
> > qemu implementation as before.
> > 
> > Note that a userspace change is also required, in order to enable these
> > new hcall implementations with KVM_CAP_PPC_ENABLE_HCALL.
> > 
> > Signed-off-by: David Gibson 
> 
> Thanks, applied to kvm-ppc-queue.

Any news on when this might go up to mainline?

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson



Re: [PATCH] MIPS: bcm63xx: move bcm63xx_gpio_init() to bcm63xx_register_devices().

2015-03-16 Thread Maxime Bizon


On Monday 16 Mar 2015 at 16:54:54 (+0100), Jonas Gorski wrote:

> So I don't see how this breaks anything. But for the sake of the
> argument, let's give it a spin:

my mistake, you are right, I completely misread the patch.

-- 
Maxime

[PATCH] ext4: Remove useless condition in if statement.

2015-03-16 Thread Wei Yuan

In this if statement, the first condition is redundant; the second one
already covers it.

Signed-off-by: Weiyuan 
---
 fs/ext4/xattr.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 1e09fc7..f2ccad7 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -639,8 +639,7 @@ ext4_xattr_set_entry(struct ext4_xattr_info *i, struct ext4_xattr_search *s)
 		free += EXT4_XATTR_LEN(name_len);
 	}
 	if (i->value) {
-		if (free < EXT4_XATTR_SIZE(i->value_len) ||
-		    free < EXT4_XATTR_LEN(name_len) +
+		if (free < EXT4_XATTR_LEN(name_len) +
 			EXT4_XATTR_SIZE(i->value_len))
 			return -ENOSPC;
 	}
--
2.1.0
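
The redundancy claimed in the commit message can be checked with a small
user-space sketch. The helper names (no_space_old, no_space_new,
count_mismatches) are hypothetical, and plain size_t arguments stand in for
the results of EXT4_XATTR_LEN() and EXT4_XATTR_SIZE(). The sketch assumes
the sum of the two sizes does not overflow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Form of the check before the patch: two comparisons joined with ||. */
bool no_space_old(size_t avail, size_t name_sz, size_t value_sz)
{
	return avail < value_sz || avail < name_sz + value_sz;
}

/* Form of the check after the patch: only the combined comparison. */
bool no_space_new(size_t avail, size_t name_sz, size_t value_sz)
{
	return avail < name_sz + value_sz;
}

/* Compare both forms exhaustively below 'limit'; returns the mismatch count. */
size_t count_mismatches(size_t limit)
{
	size_t avail, n, v, bad = 0;

	assert(limit > 0);
	for (avail = 0; avail < limit; avail++)
		for (n = 0; n < limit; n++)
			for (v = 0; v < limit; v++)
				if (no_space_old(avail, n, v) !=
				    no_space_new(avail, n, v))
					bad++;
	return bad;
}
```

Because both sizes are unsigned, avail < value_sz already implies
avail < name_sz + value_sz, so the two forms should agree for every input
in the tested range.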


[PATCH v2 1/3] staging: lustre: space prohibited between function name and open parenthesis '('

2015-03-16 Thread Alberto Pires de Oliveira Neto

This patch fixes checkpatch.pl warning.
WARNING: space prohibited between function name and open parenthesis '('

Signed-off-by: Alberto Pires de Oliveira Neto 
---
 drivers/staging/lustre/lustre/fld/fld_internal.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/fld/fld_internal.h b/drivers/staging/lustre/lustre/fld/fld_internal.h
index 6125bbe..68bec765 100644
--- a/drivers/staging/lustre/lustre/fld/fld_internal.h
+++ b/drivers/staging/lustre/lustre/fld/fld_internal.h
@@ -142,7 +142,7 @@ extern struct lu_fld_hash fld_hash[];
 int fld_client_rpc(struct obd_export *exp,
   struct lu_seq_range *range, __u32 fld_op);
 
-#if defined (CONFIG_PROC_FS)
+#if defined(CONFIG_PROC_FS)
 extern struct lprocfs_vars fld_client_proc_list[];
 #endif
 
-- 
1.9.1


[PATCH v2 2/3] staging: lustre: void function return statements are not generally useful.

2015-03-16 Thread Alberto Pires de Oliveira Neto

This patch fixes checkpatch.pl warning.
WARNING: void function return statements are not generally useful

Signed-off-by: Alberto Pires de Oliveira Neto 
---
 drivers/staging/lustre/lustre/fld/fld_request.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/fld/fld_request.c b/drivers/staging/lustre/lustre/fld/fld_request.c
index 22e0d94..6ac225e 100644
--- a/drivers/staging/lustre/lustre/fld/fld_request.c
+++ b/drivers/staging/lustre/lustre/fld/fld_request.c
@@ -326,7 +326,6 @@ static int fld_client_proc_init(struct lu_client_fld *fld)
 
 void fld_client_proc_fini(struct lu_client_fld *fld)
 {
-   return;
 }
 #endif
 EXPORT_SYMBOL(fld_client_proc_fini);
-- 
1.9.1


[PATCH v2 3/3] staging: lustre: space required after that close brace '}'

2015-03-16 Thread Alberto Pires de Oliveira Neto

This patch fixes checkpatch.pl warning.
WARNING: space required after that close brace '}'

Signed-off-by: Alberto Pires de Oliveira Neto 
---
 drivers/staging/lustre/lustre/fld/lproc_fld.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/fld/lproc_fld.c b/drivers/staging/lustre/lustre/fld/lproc_fld.c
index 8c5a657..f53fdcf 100644
--- a/drivers/staging/lustre/lustre/fld/lproc_fld.c
+++ b/drivers/staging/lustre/lustre/fld/lproc_fld.c
@@ -168,4 +168,5 @@ struct lprocfs_vars fld_client_proc_list[] = {
{ "targets", _proc_targets_fops },
{ "hash", _proc_hash_fops },
{ "cache_flush", _proc_cache_flush_fops },
-   { NULL }};
+   { NULL }
+};
-- 
1.9.1


Re: [RFC, PATCH] pagemap: do not leak physical addresses to non-privileged userspace

2015-03-16 Thread Andy Lutomirski

On Mon, Mar 16, 2015 at 5:49 PM, Mark Seaborn  wrote:
> On 16 March 2015 at 14:11, Pavel Machek  wrote:
>> On Mon 2015-03-09 23:11:12, Kirill A. Shutemov wrote:
>> > From: "Kirill A. Shutemov" 
>> >
>> > As pointed out by a recent post[1] on exploiting DRAM physical imperfection,
>> > /proc/PID/pagemap exposes sensitive information which can be used to do
>> > attacks.
>> >
>> > This is an RFC patch which disallows anybody without CAP_SYS_ADMIN from
>> > reading the pagemap.
>> >
>> > Any comments?
>> >
>> > [1] http://googleprojectzero.blogspot.com/2015/03/exploiting-dram-rowhammer-bug-to-gain.html
>>
>> Note that this kind of attack still works without pagemap, it just
>> takes longer. Actually the first demo program is not using pagemap.
>
> That depends on the machine -- it depends on how bad the machine's
> DRAM is, and whether the machine has the 2x refresh rate mitigation
> enabled.
>
> Machines with less-bad DRAM or with a 2x refresh rate might still be
> vulnerable to rowhammer, but only if the attacker has access to huge
> pages or to /proc/PID/pagemap.
>
> /proc/PID/pagemap also gives an attacker the ability to scan for bad
> DRAM locations, save a list of their addresses, and exploit them in
> the future.
>
> Given that, I think it would still be worthwhile to disable /proc/PID/pagemap.

Having slept on this further, I think that unprivileged pagemap access
is awful and we should disable it with no option to re-enable.  If we
absolutely must, we could allow programs to read all zeros or to read
addresses that are severely scrambled (e.g. ECB-encrypted by a key
generated once per open of pagemap).

Pagemap is awful because:

 - Rowhammer.

 - It exposes internals that users have no business knowing.

 - It could easily leak direct-map addresses, and there's a nice paper
detailing a SMAP bypass using that technique.

Can we just try getting rid of it except with global CAP_SYS_ADMIN?

(Hmm.  Rowhammer attacks targeting SMRAM could be interesting.)
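
For context on what pagemap hands out, here is a minimal sketch of decoding
one entry. It assumes the layout described in the kernel's pagemap
documentation (one 64-bit entry per virtual page, PFN in bits 0-54 when the
page is present, bit 63 as the present flag). The helper names are
hypothetical, and the sketch only decodes a value; it does not open or read
/proc/PID/pagemap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PM_PRESENT	(UINT64_C(1) << 63)		/* page is present in RAM */
#define PM_PFN_MASK	((UINT64_C(1) << 55) - 1)	/* bits 0-54: page frame number */

/* True if the entry describes a page that is currently in RAM. */
bool pagemap_present(uint64_t entry)
{
	return (entry & PM_PRESENT) != 0;
}

/* Extract the page frame number from an entry. */
uint64_t pagemap_pfn(uint64_t entry)
{
	return entry & PM_PFN_MASK;
}

/*
 * Combine the PFN with the offset inside the page to get a physical
 * address. Returns 0 for a non-present page (a convention of this sketch).
 */
uint64_t pagemap_phys_addr(uint64_t entry, uint64_t vaddr, uint64_t page_size)
{
	/* page_size must be a non-zero power of two */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	if (!pagemap_present(entry))
		return 0;
	return pagemap_pfn(entry) * page_size + (vaddr & (page_size - 1));
}
```

A physical address recovered this way is what lets an attacker pick
addresses in chosen DRAM rows.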

>
>
>> Can we do anything about that? Disabling cache flushes from userland
>> should make it no longer exploitable.
>
> Unfortunately there's no way to disable userland code's use of
> CLFLUSH, as far as I know.
>
> Maybe Intel or AMD could disable CLFLUSH via a microcode update, but
> they have not said whether that would be possible.

The Intel people I asked last week weren't confident.  For one thing,
I fully expect that rowhammer can be exploited using only reads and
writes with some clever tricks involving cache associativity.  I don't
think there are any fully-associative caches, although the cache
replacement algorithm could make the attacks interesting.

--Andy

[RFC PATCH v2 1/5] new helper: iov_iter_rw()

2015-03-16 Thread Omar Sandoval

Get either READ or WRITE out of iter->type.

Signed-off-by: Omar Sandoval 
---
Thanks, Al, this is much better. Anything else you'd like me to address
for this series?

 include/linux/uio.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 7188029..3d80a36 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -111,6 +111,12 @@ static inline bool iter_is_iovec(struct iov_iter *i)
 }
 
 /*
+ * Get one of READ or WRITE out of iter->type without any other flags OR'd in
+ * with it.
+ */
+#define iov_iter_rw(i) ((0 ? (struct iov_iter *)0 : (i))->type & RW_MASK)
+
+/*
  * Cap the iov_iter by given limit; note that the second argument is
  * *not* the new size - it's upper limit for such.  Passing it a value
  * greater than the amount of data in iov_iter is fine - it'll just do
-- 
2.3.3
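
The conditional expression inside the macro may look odd, so here is a
user-space sketch of the same shape. The struct layout and the values of
READ, WRITE, RW_MASK and ITER_KVEC below are stand-ins chosen for the
sketch, not copies of the kernel headers, and make_iter is a hypothetical
helper.

```c
#include <assert.h>

/* Stand-in values: assumptions for this sketch, not the kernel headers. */
#define READ		0
#define WRITE		1
#define RW_MASK		1
#define ITER_KVEC	2	/* an unrelated flag that may be OR'd into ->type */

struct iov_iter {
	int type;
};

/*
 * Same shape as the macro in the patch. The expression
 * "0 ? (struct iov_iter *)0 : (i)" always evaluates to (i), but the
 * conditional operator makes the compiler check that (i) is compatible
 * with struct iov_iter * before ->type is applied.
 */
#define iov_iter_rw(i) ((0 ? (struct iov_iter *)0 : (i))->type & RW_MASK)

/* Build an iterator whose type carries a direction plus extra flags. */
struct iov_iter make_iter(int direction, int extra_flags)
{
	struct iov_iter it = { direction | extra_flags };

	assert(direction == READ || direction == WRITE);
	return it;
}
```

With these stand-ins, iov_iter_rw() on an iterator built with
make_iter(WRITE, ITER_KVEC) yields WRITE, since the mask drops the extra
flag. Passing a void pointer or a pointer to another struct should be
diagnosed at compile time instead of silently reading some other ->type
field.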


[PATCH v2 0/3] staging: lustre: Fix checkpatch.pl warnings.

2015-03-16 Thread Alberto Pires de Oliveira Neto

Changes since v1:
 - Put '}' on the next line instead of just inserting a space for lproc_fld.c.

Alberto Pires de Oliveira Neto (3):
  staging: lustre: space prohibited between function name and open parenthesis '('
  staging: lustre: void function return statements are not generally useful.
  staging: lustre: space required after that close brace '}'

 drivers/staging/lustre/lustre/fld/fld_internal.h | 2 +-
 drivers/staging/lustre/lustre/fld/fld_request.c  | 1 -
 drivers/staging/lustre/lustre/fld/lproc_fld.c    | 3 ++-
 3 files changed, 3 insertions(+), 3 deletions(-)

-- 
1.9.1


Re: [PATCH] ARM: dts: am437x-gp-evm: add DT nodes for ov2659 sensor

2015-03-16 Thread Lad, Prabhakar

Hi Tony,

On Mon, Mar 16, 2015 at 10:17 PM, Tony Lindgren  wrote:
> * Lad Prabhakar  [150312 16:38]:
>> From: "Lad, Prabhakar" 
>>
>> this patch does the following:
>> 1: adds DT node for fixed oscillator.
>> 2: adds DT node entries for ov2659 sensor
>> 3: adds remote-endpoint entry for VPFE.
>>
>> Signed-off-by: Lad, Prabhakar 
>
> Applying into omap-for-v4.1/dt thanks.
>
I would like to get this one in via media tree to avoid dependency
as I am still waiting for Acks from DT maintainers for the sensor
driver.

If I can get your Ack on this I'll queue it up along with sensor
driver via media tree.

Cheers,
--Prabhakar Lad

> Tony
>
>> ---
>>  Note this patch depends on
>>  https://patchwork.kernel.org/patch/6000161/
>>
>>  arch/arm/boot/dts/am437x-gp-evm.dts | 42 +++--
>>  1 file changed, 40 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm/boot/dts/am437x-gp-evm.dts b/arch/arm/boot/dts/am437x-gp-evm.dts
>> index f84d971..195f452 100644
>> --- a/arch/arm/boot/dts/am437x-gp-evm.dts
>> +++ b/arch/arm/boot/dts/am437x-gp-evm.dts
>> @@ -106,6 +106,14 @@
>>   };
>>   };
>>   };
>> +
>> + /* fixed 12MHz oscillator */
>> + refclk: oscillator {
>> + #clock-cells = <0>;
>> + compatible = "fixed-clock";
>> + clock-frequency = <12000000>;
>> + };
>> +
>>  };
>>
>>  _pinmux {
>> @@ -404,6 +412,21 @@
>>   regulator-always-on;
>>   };
>>   };
>> +
>> + ov2659@30 {
>> + compatible = "ovti,ov2659";
>> + reg = <0x30>;
>> +
>> + clocks = <&refclk 0>;
>> + clock-names = "xvclk";
>> +
>> + port {
>> + ov2659_0: endpoint {
>> + remote-endpoint = <&vpfe1_ep>;
>> + link-frequencies = /bits/ 64 <7000>;
>> + };
>> + };
>> + };
>>  };
>>
>>   {
>> @@ -423,6 +446,21 @@
>>   touchscreen-size-x = <1024>;
>>   touchscreen-size-y = <600>;
>>   };
>> +
>> + ov2659@30 {
>> + compatible = "ovti,ov2659";
>> + reg = <0x30>;
>> +
>> + clocks = <&refclk 0>;
>> + clock-names = "xvclk";
>> +
>> + port {
>> + ov2659_1: endpoint {
>> + remote-endpoint = <&vpfe0_ep>;
>> + link-frequencies = /bits/ 64 <7000>;
>> + };
>> + };
>> + };
>>  };
>>
>>   {
>> @@ -626,7 +664,7 @@
>>
>>   port {
>>   vpfe0_ep: endpoint {
>> - /* remote-endpoint = <>; add once we have it */
>> + remote-endpoint = <&ov2659_1>;
>>   ti,am437x-vpfe-interface = <0>;
>>   bus-width = <8>;
>>   hsync-active = <0>;
>> @@ -643,7 +681,7 @@
>>
>>   port {
>>   vpfe1_ep: endpoint {
>> - /* remote-endpoint = <>; add once we have it */
>> + remote-endpoint = <&ov2659_0>;
>>   ti,am437x-vpfe-interface = <0>;
>>   bus-width = <8>;
>>   hsync-active = <0>;
>> --
>> 2.1.0
>>

Re: [PATCH] ARM: mm: Do not invoke OOM for higher order IOMMU DMA allocations

2015-03-16 Thread Tomasz Figa

Hi David,

On Tue, Mar 17, 2015 at 8:32 AM, David Rientjes  wrote:
> On Mon, 16 Mar 2015, Tomasz Figa wrote:
>
>> diff --git a/arch/arm/mm/dma-mapping.c b/arch/arm/mm/dma-mapping.c
>> index 83cd5ac..f081e9e 100644
>> --- a/arch/arm/mm/dma-mapping.c
>> +++ b/arch/arm/mm/dma-mapping.c
>> @@ -1145,18 +1145,31 @@ static struct page **__iommu_alloc_buffer(struct device *dev, size_t size,
>>   }
>>
>>   /*
>> -  * IOMMU can map any pages, so himem can also be used here
>> +  * IOMMU can map any pages, so himem can also be used here.
>> +  * We do not want OOM killer to be invoked as long as we can fall back
>> +  * to single pages, so we use __GFP_NORETRY for positive orders.
>>*/
>> - gfp |= __GFP_NOWARN | __GFP_HIGHMEM;
>> + gfp |= __GFP_NOWARN | __GFP_HIGHMEM | __GFP_NORETRY;
>>
>>   while (count) {
>> - int j, order = __fls(count);
>> + int j, order;
>>
>> - pages[i] = alloc_pages(gfp, order);
>> - while (!pages[i] && order)
>> - pages[i] = alloc_pages(gfp, --order);
>> - if (!pages[i])
>> - goto error;
>> + for (order = __fls(count); order; --order) {
>> + /* Will not trigger OOM. */
>> + pages[i] = alloc_pages(gfp, order);
>> + if (pages[i])
>> + break;
>> + }
>> +
>> + if (!pages[i]) {
>> + /*
>> +  * Fall back to single page allocation.
>> +  * Might invoke OOM killer as last resort.
>> +  */
>> + pages[i] = alloc_pages(gfp & ~__GFP_NORETRY, 0);
>> + if (!pages[i])
>> + goto error;
>> + }
>>
>>   if (order) {
>>   split_page(pages[i], order);
>
> I think this makes sense, but the problem is the unconditional setting and
> clearing of __GFP_NORETRY.  Strictly speaking, gfp may already have
> __GFP_NORETRY set when calling this function so it would be better to do
> the loop with alloc_pages(gfp | __GFP_NORETRY, order) and then the
> fallback as alloc_page(gfp).

Good point. I'll change it to that in next version.
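
A minimal user-space sketch of that shape, assuming a mock allocator: the
gfp_t typedef, the __GFP_NORETRY value, struct page, max_ok_order and
alloc_chunk() below are stand-ins invented for illustration, and the mock
alloc_pages() is not the kernel function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types and flag values: assumptions for this sketch only. */
typedef unsigned int gfp_t;
#define __GFP_NORETRY	0x1u

struct page {
	unsigned int order;	/* order the mock allocation succeeded with */
	gfp_t gfp;		/* flags the mock allocator was called with */
};

/* Hypothetical mock: only orders <= max_ok_order succeed. */
unsigned int max_ok_order;
struct page mock_page;

struct page *alloc_pages(gfp_t gfp, unsigned int order)
{
	if (order > max_ok_order)
		return NULL;
	mock_page.order = order;
	mock_page.gfp = gfp;
	return &mock_page;
}

/*
 * Try higher orders with __GFP_NORETRY OR'd in for the call only, then
 * fall back to a single page with the caller's gfp left untouched.
 */
struct page *alloc_chunk(gfp_t gfp, unsigned int start_order)
{
	unsigned int order;
	struct page *page;

	for (order = start_order; order; --order) {
		page = alloc_pages(gfp | __GFP_NORETRY, order);
		if (page) {
			assert(page->order == order);
			return page;
		}
	}
	return alloc_pages(gfp, 0);
}
```

Since gfp itself is never modified, a caller that already passed
__GFP_NORETRY keeps it for the order-0 fallback as well.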

Best regards,
Tomasz

[PATCH] zsmalloc: zsmalloc documentation

2015-03-16 Thread Minchan Kim

On Wed, Mar 04, 2015 at 04:56:10PM -0800, Andrew Morton wrote:
> On Thu, 5 Mar 2015 09:43:31 +0900 Minchan Kim  wrote:
> 
> > Hello Andrew,
> > 
> > On Wed, Mar 04, 2015 at 02:02:02PM -0800, Andrew Morton wrote:
> > > On Wed,  4 Mar 2015 14:01:32 +0900 Minchan Kim  wrote:
> > > 
> > > > +static int zs_stats_size_show(struct seq_file *s, void *v)
> > > > +{
> > > > +   int i;
> > > > +   struct zs_pool *pool = s->private;
> > > > +   struct size_class *class;
> > > > +   int objs_per_zspage;
> > > > +   unsigned long class_almost_full, class_almost_empty;
> > > > +   unsigned long obj_allocated, obj_used, pages_used;
> > > > +   unsigned long total_class_almost_full = 0, total_class_almost_empty = 0;
> > > > +   unsigned long total_objs = 0, total_used_objs = 0, total_pages = 0;
> > > > +
> > > > +   seq_printf(s, " %5s %5s %11s %12s %13s %10s %10s %16s\n",
> > > > +   "class", "size", "almost_full", "almost_empty",
> > > > +   "obj_allocated", "obj_used", "pages_used",
> > > > +   "pages_per_zspage");
> > > 
> > > Documentation?
> > 
> > It should have been there since [0f050d9, mm/zsmalloc: add statistics support].
> > Anyway, I will try it.
> > Where is the right place to put just these statistics in Documentation?
> > 
> > Documentation/zsmalloc.txt?
> > Documentation/vm/zsmalloc.txt?
> > Documentation/blockdev/zram.txt?
> > Documentation/ABI/testing/sysfs-block-zram?
> 
> hm, this is debugfs so Documentation/ABI/testing/sysfs-block-zram isn't
> the right place.
> 
> akpm3:/usr/src/25> grep -rli zsmalloc Documentation 
> akpm3:/usr/src/25> 
> 
> lol.
> 
> Documentation/vm/zsmalloc.txt looks good.

Here it goes.

From cb5ac24125c14467d1a5b6fbb92757d5517b0300 Mon Sep 17 00:00:00 2001
From: Minchan Kim 
Date: Tue, 17 Mar 2015 10:02:07 +0900
Subject: [PATCH] zsmalloc: zsmalloc documentation

This patch creates zsmalloc doc which explains design concept
and stat information.

Signed-off-by: Minchan Kim 
---
 Documentation/vm/zsmalloc.txt | 70 +++
 MAINTAINERS                   |  1 +
 mm/zsmalloc.c                 | 29 --
 3 files changed, 71 insertions(+), 29 deletions(-)
 create mode 100644 Documentation/vm/zsmalloc.txt

diff --git a/Documentation/vm/zsmalloc.txt b/Documentation/vm/zsmalloc.txt
new file mode 100644
index ..64ed63c4f69d
--- /dev/null
+++ b/Documentation/vm/zsmalloc.txt
@@ -0,0 +1,70 @@
+zsmalloc
+
+
+This allocator is designed for use with zram. Thus, the allocator is
+supposed to work well under low memory conditions. In particular, it
+never attempts higher order page allocation which is very likely to
+fail under memory pressure. On the other hand, if we just use single
+(0-order) pages, it would suffer from very high fragmentation --
+any object of size PAGE_SIZE/2 or larger would occupy an entire page.
+This was one of the major issues with its predecessor (xvmalloc).
+
+To overcome these issues, zsmalloc allocates a bunch of 0-order pages
+and links them together using various 'struct page' fields. These linked
+pages act as a single higher-order page i.e. an object can span 0-order
+page boundaries. The code refers to these linked pages as a single entity
+called zspage.
+
+For simplicity, zsmalloc can only allocate objects of size up to PAGE_SIZE
+since this satisfies the requirements of all its current users (in the
+worst case, page is incompressible and is thus stored "as-is" i.e. in
+uncompressed form). For allocation requests larger than this size, failure
+is returned (see zs_malloc).
+
+Additionally, zs_malloc() does not return a dereferenceable pointer.
+Instead, it returns an opaque handle (unsigned long) which encodes actual
+location of the allocated object. The reason for this indirection is that
+zsmalloc does not keep zspages permanently mapped since that would cause
+issues on 32-bit systems where the VA region for kernel space mappings
+is very small. So, before using the allocated memory, the object has to
+be mapped using zs_map_object() to get a usable pointer and subsequently
+unmapped using zs_unmap_object().
+
+stat
+
+
+With CONFIG_ZSMALLOC_STAT, we can see zsmalloc internal information via
+/sys/kernel/debug/zsmalloc/. Here is a sample of stat output:
+
+# cat /sys/kernel/debug/zsmalloc/zram0/classes
+
+ class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage
+    ..
+    ..
+     9   176           0            1           186        129          8                4
+    10   192           1            0          2880       2872        135                3
+    11   208           0            1           819        795         42                2
+    12   224           0            1           219        159         12                4
+    ..
+    ..
+
+
+class: index
+size: object size zspage stores
+almost_empty: the number of

Re: [PATCH 30/35 linux-next] devfreq: constify of_device_id array

2015-03-16 Thread MyungJoo Ham

> of_device_id is always used as const.
> (See driver.of_match_table and open firmware functions)
> 
> Signed-off-by: Fabian Frederick 

Acked-by: MyungJoo Ham 

> ---
>  drivers/devfreq/event/exynos-ppmu.c | 2 +-
>  drivers/devfreq/tegra-devfreq.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
> index ad83473..5afb851 100644
> --- a/drivers/devfreq/event/exynos-ppmu.c
> +++ b/drivers/devfreq/event/exynos-ppmu.c
> @@ -354,7 +354,7 @@ static int exynos_ppmu_remove(struct platform_device *pdev)
>   return 0;
>  }
>  
> -static struct of_device_id exynos_ppmu_id_match[] = {
> +static const struct of_device_id exynos_ppmu_id_match[] = {
>   { .compatible = "samsung,exynos-ppmu", },
>   { /* sentinel */ },
>  };
> diff --git a/drivers/devfreq/tegra-devfreq.c b/drivers/devfreq/tegra-devfreq.c
> index 3479096..244d8db 100644
> --- a/drivers/devfreq/tegra-devfreq.c
> +++ b/drivers/devfreq/tegra-devfreq.c
> @@ -695,7 +695,7 @@ static SIMPLE_DEV_PM_OPS(tegra_devfreq_pm_ops,
>tegra_devfreq_suspend,
>tegra_devfreq_resume);
>  
> -static struct of_device_id tegra_devfreq_of_match[] = {
> +static const struct of_device_id tegra_devfreq_of_match[] = {
>   { .compatible = "nvidia,tegra124-actmon" },
>   { },
>  };
> -- 
> 2.1.0
> 
>

Re: [PATCH] mm/slub: fix lockups on PREEMPT && !SMP kernels

2015-03-16 Thread Joonsoo Kim

Hello,

On Fri, Mar 13, 2015 at 03:47:12PM +, Mark Rutland wrote:
> Commit 9aabf810a67cd97e ("mm/slub: optimize alloc/free fastpath by
> removing preemption on/off") introduced an occasional hang for kernels
> built with CONFIG_PREEMPT && !CONFIG_SMP.
> 
> The problem is the following loop the patch introduced to
> slab_alloc_node and slab_free:
> 
> 	do {
> 		tid = this_cpu_read(s->cpu_slab->tid);
> 		c = raw_cpu_ptr(s->cpu_slab);
> 	} while (IS_ENABLED(CONFIG_PREEMPT) && unlikely(tid != c->tid));
> 
> GCC 4.9 has been observed to hoist the load of c and c->tid above the
> loop for !SMP kernels (as in this case raw_cpu_ptr(x) is compile-time
> constant and does not force a reload). On arm64 the generated assembly
> looks like:
> 
> ffc00016d3c4:   f9400404    ldr     x4, [x0,#8]
> ffc00016d3c8:   f9400401    ldr     x1, [x0,#8]
> ffc00016d3cc:   eb04003f    cmp     x1, x4
> ffc00016d3d0:   54c1        b.ne    ffc00016d3c8
> 
> 
> If the thread is preempted between the load of c->tid (into x1) and tid
> (into x4), and an allocation or free occurs in another thread (bumping
> the cpu_slab's tid), the thread will be stuck in the loop until
> s->cpu_slab->tid wraps, which may be forever in the absence of
> allocations on the same CPU.

Is there any method to guarantee refetching these in each loop?

> 
> The loop itself is somewhat pointless as the thread can be preempted at
> any point after the loop before the this_cpu_cmpxchg_double, and the
> window for preemption within the loop is far smaller. Given that we
> assume accessing c->tid is safe for the loop condition, and we retry
> when the cmpxchg fails, we can get rid of the loop entirely and just
> access c->tid via the raw_cpu_ptr for s->cpu_slab.

Hmm... IIUC, the loop itself is not pointless. It guarantees that tid and
c (s->cpu_slab) are fetched on the same processor, which the algorithm
needs for correctness.

Think about your code.

c = raw_cpu_ptr(s->cpu_slab);
tid = READ_ONCE(c->tid);

This doesn't guarantee that tid is fetched on the cpu where c was
fetched if preemption/migration happens between these operations.

If c->tid, c->freelist and c->page are fetched on another cpu,
there is no ordering guarantee, and c->freelist and c->page could be stale
values even if c->tid is the recent one.

Think about the following free case with your patch.

Assume cpu 0's initial state is as follows.
c->tid: 1, c->freelist: NULL, c->page: A

User X: try to free object X for page A
User X: fetch c (s->cpu_slab)

Preemption and migration happen...
Other allocations/frees happen... so cpu 0's state becomes the following.
c->tid: 3, c->freelist: NULL, c->page: B

User X: read c->tid: 3, c->freelist: NULL, c->page: A (stale value)

Because tid and freelist match the current ones, the free would
succeed, but the current c->page is B and the object belongs to A, so this
success is wrong.

The loop prevents this possibility.
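
To make the point about the loop concrete, here is a small user-space model. It is a sketch under stated assumptions, not the SLUB code itself: struct sim_state, sim_preempt_point() and sim_fetch_pair() are hypothetical names, a migration is modelled by switching cur_cpu between the two reads, and each CPU's tid is assumed to differ from the other's, so a mismatch signals that tid and c came from different CPUs.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_NR_CPUS 2

/* Hypothetical stand-in for the per-cpu slab structure; only tid is modelled. */
struct sim_cpu_slab {
	unsigned long tid;
};

struct sim_state {
	struct sim_cpu_slab slab[SIM_NR_CPUS];
	int cur_cpu;        /* CPU the modelled thread currently runs on */
	int migrate_budget; /* how many forced migrations are still pending */
};

/* Models a preemption point: migrate to the other CPU while budget remains. */
static void sim_preempt_point(struct sim_state *s)
{
	if (s->migrate_budget > 0) {
		s->migrate_budget--;
		s->cur_cpu = (s->cur_cpu + 1) % SIM_NR_CPUS;
	}
}

/*
 * Fetches tid and c and retries until both were read on the same CPU.
 * Returns the number of loop iterations that were needed.
 */
static int sim_fetch_pair(struct sim_state *s, unsigned long *tid,
			  struct sim_cpu_slab **c)
{
	int iterations = 0;

	assert(s != NULL && tid != NULL && c != NULL);

	do {
		/* Models tid = this_cpu_read(s->cpu_slab->tid). */
		*tid = s->slab[s->cur_cpu].tid;
		/* The thread may be migrated between the two reads. */
		sim_preempt_point(s);
		/* Models c = raw_cpu_ptr(s->cpu_slab). */
		*c = &s->slab[s->cur_cpu];
		iterations++;
	} while (*tid != (*c)->tid);

	return iterations;
}
```

With one migration pending, the first pass reads tid on cpu 0 and c on cpu 1, the tids differ, and the second pass returns a pair that both belong to cpu 1. Without the retry, the caller would continue with a tid and a c from different CPUs.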

Thanks.


Re: [update][PATCH v10 06/21] ACPI / sleep: Introduce CONFIG_ACPI_GENERIC_SLEEP

2015-03-16 Thread Hanjun Guo

On 2015/3/17 7:15, Rafael J. Wysocki wrote:
> On Monday, March 16, 2015 08:14:52 PM Hanjun Guo wrote:
>> On 2015-03-14 05:49, Rafael J. Wysocki wrote:
>>> On Friday, March 13, 2015 04:14:29 PM Hanjun Guo wrote:
[...]

>>>> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
>>>> index 074e52b..e8728d7 100644
>>>> --- a/arch/ia64/Kconfig
>>>> +++ b/arch/ia64/Kconfig
>>>> @@ -10,6 +10,7 @@ config IA64
>>>>  	select ARCH_MIGHT_HAVE_PC_SERIO
>>>>  	select PCI if (!IA64_HP_SIM)
>>>>  	select ACPI if (!IA64_HP_SIM)
>>>> +	select ACPI_GENERIC_SLEEP if ACPI
>>>>  	select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
>>>>  	select HAVE_UNSTABLE_SCHED_CLOCK
>>>>  	select HAVE_IDE
>>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>>> index b7d31ca..9804431 100644
>>>> --- a/arch/x86/Kconfig
>>>> +++ b/arch/x86/Kconfig
>>>> @@ -22,6 +22,7 @@ config X86_64
>>>>  ### Arch settings
>>>>  config X86
>>>>  	def_bool y
>>>> +	select ACPI_GENERIC_SLEEP if ACPI
>>> One more nit.  If you did
>>>
>>> +   select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>>
>>> here (and above for ia64), you'd avoid having to make ACPI_SLEEP
>>> depend on ACPI_GENERIC_SLEEP which goes somewhat backwards.
>> In sleep.c,
>>
>> #ifdef CONFIG_ACPI_SLEEP
>> acpi_target_system_state()
>> {
>> }
>> #endif
>>
>> and CONFIG_ACPI_SLEEP depends on SUSPEND || HIBERNATION,
>> one of which will be enabled on ARM64, so ACPI_SLEEP
>> will be enabled too.
>>
>> So if we
>>
>> +select ACPI_GENERIC_SLEEP if ACPI_SLEEP
>>
>> and
>>
>> +acpi-$(CONFIG_ACPI_GENERIC_SLEEP) += sleep.o
>>
>> it will lead to errors because acpi_target_system_state() is
>> declared but not defined, so I will keep the code as
>> it is. What do you think?
> No, we need to hash this out.  Having two different Kconfig options meaning
> almost the same thing (ACPI_SLEEP and ACPI_GENERIC_SLEEP) is beyond ugly.
>
> Do you need ACPI_SLEEP on ARM64 at all?

No, at least for now we don't need it. The sleep spec is not ready for
the ARM64 arch, so ACPI_SLEEP will not work at all on ARM64.

Thanks
Hanjun


Re: [PATCH kernel v6 29/29] vfio: powerpc/spapr: Support Dynamic DMA windows

2015-03-16 Thread Alexey Kardashevskiy


On 03/17/2015 06:38 AM, Alex Williamson wrote:

On Fri, 2015-03-13 at 19:07 +1100, Alexey Kardashevskiy wrote:

This adds create/remove window ioctls to create and remove DMA windows.
sPAPR defines a Dynamic DMA windows capability which allows
para-virtualized guests to create additional DMA windows on a PCI bus.
The existing Linux kernels use this new window to map the entire guest
memory and switch to direct DMA operations, saving time on map/unmap
requests which would normally happen in large numbers.

This adds 2 ioctl handlers - VFIO_IOMMU_SPAPR_TCE_CREATE and
VFIO_IOMMU_SPAPR_TCE_REMOVE - to create and remove windows.
Up to 2 windows are supported now by the hardware and by this driver.

This changes the VFIO_IOMMU_SPAPR_TCE_GET_INFO handler to return additional
information such as the number of supported windows and the maximum number
of TCE table levels.

DDW is added as a capability, not as a SPAPR TCE IOMMU v2 unique feature
as we still want to support v2 on platforms which cannot do DDW for
the sake of TCE acceleration in KVM (coming soon).

Signed-off-by: Alexey Kardashevskiy 
---
Changes:
v6:
* added explicit VFIO_IOMMU_INFO_DDW flag to vfio_iommu_spapr_tce_info,
it used to be page mask flags from platform code
* added explicit pgsizes field
* added cleanup if tce_iommu_create_window() failed in the middle
* added checks for callbacks in tce_iommu_create_window and remove those
from tce_iommu_remove_window when it is too late to test anyway
* spapr_tce_find_free_table returns sensible error code now
* updated description of VFIO_IOMMU_SPAPR_TCE_CREATE/
VFIO_IOMMU_SPAPR_TCE_REMOVE

v4:
* moved code to tce_iommu_create_window()/tce_iommu_remove_window()
helpers
* added docs
---
  Documentation/vfio.txt  |  19 
  arch/powerpc/include/asm/iommu.h|   2 +-
  drivers/vfio/vfio_iommu_spapr_tce.c | 206 +++-
  include/uapi/linux/vfio.h   |  41 ++-
  4 files changed, 265 insertions(+), 3 deletions(-)

diff --git a/Documentation/vfio.txt b/Documentation/vfio.txt
index 791e85c..61ce393 100644
--- a/Documentation/vfio.txt
+++ b/Documentation/vfio.txt
@@ -446,6 +446,25 @@ the memory block.
  The user space is not expected to call these often and the block descriptors
  are stored in a linked list in the kernel.

+6) sPAPR specification allows guests to have an ddditional DMA window(s) on



s/ddditional/additional/


+a PCI bus with a variable page size. Two ioctls have been added to support
+this: VFIO_IOMMU_SPAPR_TCE_CREATE and VFIO_IOMMU_SPAPR_TCE_REMOVE.
+The platform has to support the functionality or error will be returned to
+the userspace. The existing hardware supports up to 2 DMA windows, one is
+2GB long, uses 4K pages and called "default 32bit window"; the other can
+be as big as entire RAM, use different page size, it is optional - guests
+create those in run-time if the guest driver supports 64bit DMA.
+
+VFIO_IOMMU_SPAPR_TCE_CREATE receives a page shift, a DMA window size and
+a number of TCE table levels (if a TCE table is going to be big enough and
+the kernel may not be able to allocate enough of physicall contiguous memory).


s/physicall/physically/


+It creates a new window in the available slot and returns the bus address where
+the new window starts. Due to hardware limitation, the user space cannot choose
+the location of DMA windows.
+
+VFIO_IOMMU_SPAPR_TCE_REMOVE receives the bus start address of the window
+and removes it.
+
  
---

  [1] VFIO was originally an acronym for "Virtual Function I/O" in its
diff --git a/arch/powerpc/include/asm/iommu.h b/arch/powerpc/include/asm/iommu.h
index 13145a2..bac02bf 100644
--- a/arch/powerpc/include/asm/iommu.h
+++ b/arch/powerpc/include/asm/iommu.h
@@ -138,7 +138,7 @@ extern void iommu_free_table(struct iommu_table *tbl, const char *node_name);
  extern struct iommu_table *iommu_init_table(struct iommu_table * tbl,
int nid);

-#define IOMMU_TABLE_GROUP_MAX_TABLES   1
+#define IOMMU_TABLE_GROUP_MAX_TABLES   2

  struct iommu_table_group;

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index d94116b..0129a4f 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -600,11 +600,137 @@ static long tce_iommu_build(struct tce_container *container,
return ret;
  }

+static int spapr_tce_find_free_table(struct tce_container *container)
+{
+   int i;
+
+   for (i = 0; i < IOMMU_TABLE_GROUP_MAX_TABLES; ++i) {
+   struct iommu_table *tbl = &container->tables[i];
+
+   if (!tbl->it_size)
+   return i;
+   }
+
+   return -ENOSPC;
+}
+
+static long tce_iommu_create_window(struct tce_container *container,
+   __u32 page_shift, __u64 window_size, __u32 levels,
+   __u64 *start_addr)
+{
+   struct tce_iommu_group
