Re: [PATCH] sched: Get rid of unnecessary checks from select_idle_sibling

2013-01-08 Thread Namhyung Kim
Hi Alex,

On Wed, 09 Jan 2013 15:33:40 +0800, Alex Shi wrote:
> On 01/09/2013 02:50 PM, Namhyung Kim wrote:
>> From: Namhyung Kim 
>> 
>> AFAICS @target cpu of select_idle_sibling() is always either prev_cpu
>> or this_cpu.  So no need to check it again and the conditionals can be
>> consolidated.
[snip]
> Uh, we don't know if the target is this_cpu or the previous cpu. If we just
> check the target's idle status, we may miss another idle cpu. So this
> patch changes the logic in this function.

select_idle_sibling() is called only from select_task_rq_fair(), and only
when it has found a suitable affine_sd.  The default target is the
'prev_cpu' of the task, but if wake_affine() returns true it becomes
(this) 'cpu'.

I cannot see where the prev_cpu or the cpu is set to another one before
calling select_idle_sibling.
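
A minimal sketch of the argument above, not the actual scheduler code. The
helper name pick_wake_target() and the boolean 'affine' (standing in for the
result of wake_affine()) are assumptions made for illustration. It only shows
that the value handed to select_idle_sibling() is always one of the two inputs:

```c
#include <stdbool.h>

/*
 * Hypothetical model of how the caller picks the target passed to
 * select_idle_sibling(): start from prev_cpu and switch to the waking
 * cpu only when the affine wakeup test succeeds.
 */
static int pick_wake_target(int this_cpu, int prev_cpu, bool affine)
{
	int target = prev_cpu;	/* default: where the task last ran */

	if (affine)		/* stands in for wake_affine() returning true */
		target = this_cpu;

	return target;		/* always this_cpu or prev_cpu */
}
```

With this model there is no third value the target could take, which is why a
single idle_cpu(target) test would cover both of the removed conditionals.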

Thanks,
Namhyung
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH v5] pwm: atmel: add Timer Counter Block PWM driver

2013-01-08 Thread Thierry Reding
On Tue, Jan 08, 2013 at 04:36:42PM +0100, Boris BREZILLON wrote:
> Hi,
> 
> This patch adds a PWM driver based on Atmel Timer Counter Block.
> Timer Counter Block is used in Waveform generator mode.
> 
> A Timer Counter Block provides up to 6 PWM devices grouped by 2:
> * group 0 = PWM 0 and 1
> * group 1 = PWM 2 and 3
> * group 2 = PWM 4 and 5
> 
> PWM devices in a given group must be configured with the same
> period value.
> If a PWM device in a group tries to change the period value and
> the other device is already configured with a different value an
> error will be returned.
> 
> This driver requires device tree support.
> The Timer Counter Block number used to create a PWM chip is
> given by tc-block field in an "atmel,tcb-pwm" compatible node.
> 
> This patch was tested on kizbox board (at91sam9g20 SoC) with 
> pwm-leds.
> 
> Regards,
> 
> Boris
> 
> Signed-off-by: Boris BREZILLON 
> ---
> Changes since v1:
>   - Fix device tree binding Documentation
>   - Fix Kconfig issues (missing OF dependency, 
>   deprecated HAVE_PWM select, ...)
>   - Fix various coding style issues.
>   - Cleanup code and add some comments.
> 
> Changes since v2:
>   - Replace kzalloc/kfree with managed versions
> (devm_kzalloc/devm_kfree).
>   - Add one cell to device tree binding to support polarity
> flag.
>   - Replace min computation (2 div -> 1 mul + 1 div).
> 
> Changes since v3:
>   - Fix device tree binding Documentation
>   - Fix Kconfig description
>   - Fix coding style issues (function parameters alignment)
>   - Replace 10 value with NSEC_PER_SEC macro
>   - Get rid of newcmr variable in enable/disable functions
>   - Remove unneeded devm_kfree
>   - Add missing atmel_tc_free
> 
> Changes since v4:
>   - Add missing comments
>   - Fix coding style issues (multi-line error string)
>   - Fix wrong MODULE_DEVICE_TABLE setting
>   - Remove unneeded MODULE_ALIAS declaration
> 
>  .../devicetree/bindings/pwm/atmel-tcb-pwm.txt  |   18 +
>  drivers/pwm/Kconfig|   12 +
>  drivers/pwm/Makefile   |1 +
>  drivers/pwm/pwm-atmel-tcb.c|  445 
> 
>  4 files changed, 476 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/pwm/atmel-tcb-pwm.txt
>  create mode 100644 drivers/pwm/pwm-atmel-tcb.c

Applied with minor changes to the commit message, thanks.

Thierry
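
The group constraint from the quoted patch description (two PWMs per group
sharing one period) can be sketched roughly as follows. This is not the
driver's code: struct pwm_chan, set_period() and the choice of -EBUSY as the
error value are all assumptions for illustration, since the description only
says "an error will be returned".

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-channel state; not the driver's real structures. */
struct pwm_chan {
	bool configured;
	unsigned int period_ns;
};

/*
 * Channels 2*g and 2*g+1 form group g and must share one period.
 * Returns 0 on success, or -EBUSY (an assumed error code) if the
 * sibling channel is already configured with a different period.
 */
static int set_period(struct pwm_chan chans[6], int index,
		      unsigned int period_ns)
{
	struct pwm_chan *sibling = &chans[index ^ 1];	/* other half of the pair */

	if (sibling->configured && sibling->period_ns != period_ns)
		return -EBUSY;

	chans[index].configured = true;
	chans[index].period_ns = period_ns;
	return 0;
}
```

The index ^ 1 trick simply flips the low bit, so 0 pairs with 1, 2 with 3 and
4 with 5, matching the grouping listed in the description.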



Re: [PATCH v3 2/4] videobuf2-dma-streaming: new videobuf2 memory allocator

2013-01-08 Thread Michael Olbrich
On Tue, Jan 08, 2013 at 07:31:30AM -0700, Jonathan Corbet wrote:
> On Tue, 08 Jan 2013 07:50:41 +0100
> Marek Szyprowski  wrote:
> 
> > > Couldn't this performance difference be due to the usage of GFP_DMA inside
> > > the VB2 code, like Federico's new patch series is proposing?
> > >
> > > If not, why is there such a large performance penalty?
> > 
> > Nope, this was caused rather by a very poor CPU access to non-cached (aka
> > 'coherent') memory and the way the video data has been accessed/read 
> > with CPU.
> 
> Exactly.  Uncached memory *hurts*, especially if you're having to touch it
> all with the CPU.

Even worse, on ARMv7 (at least) the cache implements or is necessary for
(I'm not an expert here) unaligned access. I've seen applications crash
on non-cached memory with a bus error because gcc assumes unaligned access
works. And there isn't even an exception handler in the kernel, probably for
the same reason.
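
For illustration only, here is the usual portable C idiom for reading a value
from a possibly unaligned address. load_u32_unaligned() is a hypothetical
helper, not something from the videobuf2 code. Note the caveat: this avoids
undefined behavior at the C level, but whether the compiler still emits a
single unaligned load (and so whether it helps on uncached memory) depends on
the target and compiler flags, so treat it as a sketch rather than a fix.

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from a possibly unaligned address without
 * dereferencing a misaligned uint32_t pointer. memcpy() leaves the
 * choice of access sequence to the compiler.
 */
static uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```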

Michael

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |


Re: [PATCHv2 2/2] pwm: vt8500: Add polarity support

2013-01-08 Thread Thierry Reding
On Thu, Jan 03, 2013 at 08:44:16AM +1300, Tony Prisk wrote:
> Add support to set polarity on PWM devices, allowing for inverted
> duty cycles.
> 
> Also update the binding document to #pwm-cells = <3> to allow
> passing the flags from devicetree.
> 
> Signed-off-by: Tony Prisk 
> ---
> v2:
> Change binding document to detail flags usage.
> Add missing .of_xlate function
> Add missing .of_pwm_n_cells
> 
>  .../devicetree/bindings/pwm/vt8500-pwm.txt |9 +---
>  drivers/pwm/pwm-vt8500.c   |   23 
> 
>  2 files changed, 29 insertions(+), 3 deletions(-)

Applied, thanks.

Thierry



Re: [PATCHv2 1/2] pwm: vt8500: Register write busy test performed incorrectly

2013-01-08 Thread Thierry Reding
On Thu, Jan 03, 2013 at 08:44:15AM +1300, Tony Prisk wrote:
> Correct operation for register writes is to perform a busy-wait
> after writing the register. Currently the busy wait is performed
> before, meaning subsequent register writes to bitfields may occur
> before the previous field has been updated.
> 
> Also, all registers are defined as 32-bit read/write. Change
> pwm_busy_wait() to use readl rather than readb.
> 
> Improve readability of code with defines for registers and bitfields.
> 
> Signed-off-by: Tony Prisk 
> ---
> v2:
> Change parenthesis around defines
> Replace pr_warn with dev_warn in pwm_busy_wait()
> 
>  drivers/pwm/pwm-vt8500.c |   64 +++---
>  1 file changed, 49 insertions(+), 15 deletions(-)

Applied, thanks.

Thierry
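
The ordering described in the quoted commit message (write first, then
busy-wait) can be sketched against a simulated device. Everything here is
hypothetical: struct fake_pwm, STATUS_BUSY, the three-poll latency and the
helper names are assumptions, not the vt8500 register layout.

```c
#include <stdint.h>

#define STATUS_BUSY 0x1u	/* hypothetical busy bit */

/* Hypothetical simulated device: a write stays pending for a few polls. */
struct fake_pwm {
	uint32_t ctrl;		/* value the hardware currently uses */
	uint32_t pending;	/* value latched by the last write */
	unsigned int busy_polls;	/* polls left until the write lands */
};

/* Each status read advances the simulated hardware by one step. */
static uint32_t read_status(struct fake_pwm *dev)
{
	if (dev->busy_polls == 0)
		return 0;
	if (--dev->busy_polls == 0)
		dev->ctrl = dev->pending;	/* write completes */
	return STATUS_BUSY;
}

static void write_ctrl(struct fake_pwm *dev, uint32_t val)
{
	dev->pending = val;
	dev->busy_polls = 3;
}

/*
 * Write, then wait: when this returns 0 the update has taken effect,
 * so a following write cannot race with it. Returns -1 on timeout.
 */
static int write_ctrl_sync(struct fake_pwm *dev, uint32_t val,
			   unsigned int max_polls)
{
	write_ctrl(dev, val);
	while (max_polls--) {
		if (!(read_status(dev) & STATUS_BUSY))
			return 0;
	}
	return -1;
}
```

Waiting before the write instead would only prove that the previous update
finished, and would let the caller continue while its own write is pending.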



[PATCH V2 2/2] Xen ACPI memory hotplug

2013-01-08 Thread Liu, Jinsong
This patch implements the real Xen ACPI memory hotplug driver as a module.
When loaded, it replaces the Xen stub driver.

When an ACPI memory device hot-add event occurs, the OS is notified and
the notification callback is invoked. The callback adds the related memory
device, parses its memory information, and finally issues a hypercall to
the Xen hypervisor to add the memory.
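
As a rough sketch of what the hypercall arguments look like, the page frame
range can be derived from a memory range by shifting, as the patch does with
PAGE_SHIFT. The names struct pfn_range, mem_range_to_pfns() and
SKETCH_PAGE_SHIFT are made up for this example, and 4 KiB pages are assumed.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

struct pfn_range {
	uint64_t spfn;	/* frame number of the start address */
	uint64_t epfn;	/* frame number of the end address (start + length) */
};

/* Convert a physical address range into start/end page frame numbers. */
static struct pfn_range mem_range_to_pfns(uint64_t start_addr, uint64_t length)
{
	struct pfn_range r;

	r.spfn = start_addr >> SKETCH_PAGE_SHIFT;
	r.epfn = (start_addr + length) >> SKETCH_PAGE_SHIFT;
	return r;
}
```

For example, 1 GiB of memory starting at the 4 GiB boundary gives spfn
0x100000 and epfn 0x140000.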

Signed-off-by: Liu Jinsong 
---
 drivers/xen/Kconfig   |   11 +
 drivers/xen/Makefile  |1 +
 drivers/xen/xen-acpi-memhotplug.c |  487 +
 include/xen/interface/platform.h  |   13 +-
 4 files changed, 508 insertions(+), 4 deletions(-)
 create mode 100644 drivers/xen/xen-acpi-memhotplug.c

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index 2986de9..b8cf899 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -191,6 +191,17 @@ config XEN_STUB
 
  To enable Xen features like cpu and memory hotplug, select Y here.
 
+config XEN_ACPI_HOTPLUG_MEMORY
+   tristate "Xen ACPI memory hotplug"
+   depends on XEN_STUB && ACPI
+   default n
+   help
+ This is Xen ACPI memory hotplug.
+
+ Currently Xen only support ACPI memory hot-add. If you want
+ to hot-add memory at runtime (the hot-added memory cannot be
+ removed until machine stop), select Y/M here, otherwise select N.
+
 config XEN_ACPI_PROCESSOR
tristate "Xen ACPI processor"
depends on XEN && X86 && ACPI_PROCESSOR && CPU_FREQ
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index b63edd8..1605f59 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -31,6 +31,7 @@ obj-$(CONFIG_XEN_MCE_LOG) += mcelog.o
 obj-$(CONFIG_XEN_PCIDEV_BACKEND)   += xen-pciback/
 obj-$(CONFIG_XEN_PRIVCMD)  += xen-privcmd.o
 obj-$(CONFIG_XEN_STUB) += xen-stub.o
+obj-$(CONFIG_XEN_ACPI_HOTPLUG_MEMORY)  += xen-acpi-memhotplug.o
 obj-$(CONFIG_XEN_ACPI_PROCESSOR)   += xen-acpi-processor.o
 xen-evtchn-y   := evtchn.o
 xen-gntdev-y   := gntdev.o
diff --git a/drivers/xen/xen-acpi-memhotplug.c b/drivers/xen/xen-acpi-memhotplug.c
new file mode 100644
index 000..d207fec
--- /dev/null
+++ b/drivers/xen/xen-acpi-memhotplug.c
@@ -0,0 +1,487 @@
+/*
+ * Copyright (C) 2012 Intel Corporation
+ *Author: Liu Jinsong 
+ *Author: Jiang Yunhong 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT.  See the GNU General Public License for more
+ * details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PREFIX "ACPI:xen_memory_hotplug:"
+
+struct acpi_memory_info {
+   struct list_head list;
+   u64 start_addr; /* Memory Range start physical addr */
+   u64 length; /* Memory Range length */
+   unsigned short caching; /* memory cache attribute */
+   unsigned short write_protect;   /* memory read/write attribute */
+   /* copied from buffer getting from _CRS */
+   unsigned int enabled:1;
+};
+
+struct acpi_memory_device {
+   struct acpi_device *device;
+   struct list_head res_list;
+};
+
+static bool acpi_hotmem_initialized __read_mostly;
+
+static int xen_hotadd_memory(int pxm, struct acpi_memory_info *info)
+{
+   struct xen_platform_op op;
+
+   op.cmd = XENPF_mem_hotadd;
+   op.u.mem_add.spfn = info->start_addr >> PAGE_SHIFT;
+   op.u.mem_add.epfn = (info->start_addr + info->length) >> PAGE_SHIFT;
+   op.u.mem_add.pxm = pxm;
+
+   return HYPERVISOR_dom0_op(&op);
+}
+
+static int xen_acpi_get_pxm(acpi_handle h)
+{
+   unsigned long long pxm;
+   acpi_status status;
+   acpi_handle handle;
+   acpi_handle phandle = h;
+
+   do {
+   handle = phandle;
+   status = acpi_evaluate_integer(handle, "_PXM", NULL, &pxm);
+   if (ACPI_SUCCESS(status))
+   return pxm;
+   status = acpi_get_parent(handle, &phandle);
+   } while (ACPI_SUCCESS(status));
+   return -1;
+}
+
+static int xen_acpi_memory_enable_device(struct acpi_memory_device *mem_device)
+{
+   int pxm, result;
+   int num_enabled = 0;
+   struct acpi_memory_info *info;
+
+   if (!mem_device)
+   return -EINVAL;
+
+   pxm = xen_acpi_get_pxm(mem_device->device->handle);
+   if (pxm < 0)
+   return -EINVAL;
+
+   list_for_each_entry(info, &mem_device->res_list, list) {
+   if (info->enabled) { /* just sanity check...*/
+ 

[PATCH V2 1/2] Xen stub driver for memory hotplug

2013-01-08 Thread Liu, Jinsong
This patch creates a file (xen-stub.c) for Xen stub drivers.
Xen stub drivers are used to reserve space for Xen drivers, i.e.
memory hotplug and cpu hotplug, and to block native drivers from
loading, so that the real Xen drivers can be modular and loaded on demand.

This patch is specific to Xen memory hotplug (other Xen logic
can add stub drivers on their own). The Xen stub driver registers
early via subsys_initcall, before the native memory hotplug driver
registers via module_init, and so blocks the native driver. Later the
real Xen memory hotplug logic unregisters the stub driver and registers
itself to take effect on demand.
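
The blocking idea can be sketched with a toy registry. This is not the ACPI
bus code: the registry array, register_driver(), unregister_driver() and the
-EBUSY return value are assumptions, and the sketch assumes registration
fails when a driver with the same name is already present.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_DRIVERS 8

/* Hypothetical registry keyed by driver name. */
static const char *registry[MAX_DRIVERS];

/* Returns 0 on success, -EBUSY if the name is taken, -ENOMEM if full. */
static int register_driver(const char *name)
{
	int free_slot = -1;

	for (int i = 0; i < MAX_DRIVERS; i++) {
		if (!registry[i]) {
			if (free_slot < 0)
				free_slot = i;	/* remember first free slot */
		} else if (strcmp(registry[i], name) == 0) {
			return -EBUSY;		/* name already registered */
		}
	}
	if (free_slot < 0)
		return -ENOMEM;
	registry[free_slot] = name;
	return 0;
}

static void unregister_driver(const char *name)
{
	for (int i = 0; i < MAX_DRIVERS; i++)
		if (registry[i] && strcmp(registry[i], name) == 0)
			registry[i] = NULL;
}
```

In this model the stub registers "acpi_memhotplug" first, a later attempt to
register the same name is refused, and the name becomes available again once
the stub is unregistered.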

Signed-off-by: Liu Jinsong 
---
 drivers/xen/Kconfig|   11 
 drivers/xen/Makefile   |1 +
 drivers/xen/xen-stub.c |   60 
 include/xen/acpi.h |6 
 4 files changed, 78 insertions(+), 0 deletions(-)
 create mode 100644 drivers/xen/xen-stub.c

diff --git a/drivers/xen/Kconfig b/drivers/xen/Kconfig
index cabfa97..2986de9 100644
--- a/drivers/xen/Kconfig
+++ b/drivers/xen/Kconfig
@@ -180,6 +180,17 @@ config XEN_PRIVCMD
depends on XEN
default m
 
+config XEN_STUB
+   bool "Xen stub drivers"
+   depends on XEN_DOM0 && X86_64
+   default y
+   help
+ Allow kernel to install stub drivers, to reserve space for Xen drivers,
+ i.e. memory hotplug and cpu hotplug, and to block native drivers loaded,
+ so that real Xen drivers can be modular.
+
+ To enable Xen features like cpu and memory hotplug, select Y here.
+
 config XEN_ACPI_PROCESSOR
tristate "Xen ACPI processor"
depends on XEN && X86 && ACPI_PROCESSOR && CPU_FREQ
diff --git a/drivers/xen/Makefile b/drivers/xen/Makefile
index fb213cf..b63edd8 100644
--- a/drivers/xen/Makefile
+++ b/drivers/xen/Makefile
@@ -30,6 +30,7 @@ obj-$(CONFIG_SWIOTLB_XEN) += swiotlb-xen.o
 obj-$(CONFIG_XEN_MCE_LOG)  += mcelog.o
 obj-$(CONFIG_XEN_PCIDEV_BACKEND)   += xen-pciback/
 obj-$(CONFIG_XEN_PRIVCMD)  += xen-privcmd.o
+obj-$(CONFIG_XEN_STUB) += xen-stub.o
 obj-$(CONFIG_XEN_ACPI_PROCESSOR)   += xen-acpi-processor.o
 xen-evtchn-y   := evtchn.o
 xen-gntdev-y   := gntdev.o
diff --git a/drivers/xen/xen-stub.c b/drivers/xen/xen-stub.c
new file mode 100644
index 000..01a49e3
--- /dev/null
+++ b/drivers/xen/xen-stub.c
@@ -0,0 +1,60 @@
+/*
+ * xen-stub.c - stub drivers to reserve space for Xen
+ *
+ * Copyright (C) 2012 Intel Corporation
+ *Author: Liu Jinsong 
+ *Author: Jiang Yunhong 
+ *
+ * Copyright (C) 2012 Oracle Inc
+ *Author: Konrad Rzeszutek Wilk 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or (at
+ * your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
+ * NON INFRINGEMENT.  See the GNU General Public License for more
+ * details.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+/*
+   stub driver for Xen memory hotplug
+*/
+
+#ifdef CONFIG_ACPI
+
+static const struct acpi_device_id memory_device_ids[] = {
+   {ACPI_MEMORY_DEVICE_HID, 0},
+   {"", 0},
+};
+
+struct acpi_driver xen_stub_memory_device_driver = {
+   /* same name as native memory driver to block native loaded */
+   .name = "acpi_memhotplug",
+   .class = ACPI_MEMORY_DEVICE_CLASS,
+   .ids = memory_device_ids,
+};
+EXPORT_SYMBOL_GPL(xen_stub_memory_device_driver);
+
+static int __init xen_stub_memory_device_init(void)
+{
+   if (!xen_initial_domain())
+   return -ENODEV;
+
+   /* just reserve space for Xen, block native driver loaded */
+   return acpi_bus_register_driver(&xen_stub_memory_device_driver);
+}
+subsys_initcall(xen_stub_memory_device_init);
+
+#endif
diff --git a/include/xen/acpi.h b/include/xen/acpi.h
index 48a9c01..7366e58 100644
--- a/include/xen/acpi.h
+++ b/include/xen/acpi.h
@@ -40,6 +40,12 @@
 #include 
 #include 
 
+#define ACPI_MEMORY_DEVICE_CLASS"memory"
+#define ACPI_MEMORY_DEVICE_HID  "PNP0C80"
+#define ACPI_MEMORY_DEVICE_NAME "Hotplug Mem Device"
+
+extern struct acpi_driver xen_stub_memory_device_driver;
+
 int xen_acpi_notify_hypervisor_state(u8 sleep_state,
 u32 pm1a_cnt, u32 pm1b_cnd);
 
-- 
1.7.1

Re: 3.8-rc2: pciehp waitqueue hang...

2013-01-08 Thread Yijing Wang
Hi Bjorn,
   I will send the shpchp patch soon.

Thanks!
Yijing

>>> Yijing, please check for the same problem in other hotplug drivers.
>>> Questions I have after a quick look:
>>>
>>
>> Hi Bjorn,
>>Sorry for the delayed reply. I have been busy these days.
>>
>>>   - shpchp_wq looks like it might have the same deadlock issue.
>>
>> The shpchp driver uses two workqueues, shpchp_wq and shpchp_ordered_wq. Both
>> are created by alloc_ordered_workqueue, which sets the "max_active" parameter
>> to 1, so only one pci hotplug slot can do hotplug at a time.
>> shpchp introduced these workqueues to remove the use of
>> flush_scheduled_work(), which is deprecated and scheduled for removal.
>>
>> hot remove path is:
>>   button press
>>     shpc_isr (interrupt handler)
>>       shpchp_handle_attention_button
>>         queue_interrupt_event
>>           queue_work "interrupt_event_handler" into "shpchp_wq"
>>             interrupt_event_handler
>>               handle_button_press_event
>>                 queue_delayed_work "shpchp_queue_pushbutton_work" into "shpchp_wq"
>>                   queue_work "shpchp_pushbutton_thread" into "shpchp_ordered_wq"
>>                     shpchp_pushbutton_thread
>>                       shpchp_disable_slot
>>                         pci_stop_and_remove_bus_device
>>                           ..
>>                           shpc_remove()
>>                             (if the hotplug slot is connected to an iobox that
>>                              contains a hotplug pcieport, shpc_remove is called
>>                              when the pcie port device is removed)
>>                             hpc_release_ctlr
>>                               flush_workqueue(shpchp_wq);
>>                               flush_workqueue(shpchp_ordered_wq);
>>                               So the hotplug task hangs.
>>
>> The shpchp driver has the same deadlock issue as the pciehp driver. I think we
>> should fix it, and I will send out the patch if you agree, but I have no
>> machine that supports shpchp hotplug, so I can't test the patch on a real
>> machine.
> 
> That's OK.  You've tested pciehp, and I don't want to leave shpchp
> broken the same way just because we can't test a similar fix there, so
> please do send the shpchp patch, too.
> 
>>>   - pciehp_wq (and your per-slot replacement) are allocated with
>>> alloc_workqueue().  shpchp_wq is allocated with
>>> alloc_ordered_workqueue().  Why the difference?
>>
>> alloc_workqueue(name, 0, 0) sets max_active to 0, which selects the default
>> (up to 256 work items of the wq can execute at the same time per CPU).
>> So the pciehp driver can handle push button events asynchronously.
>>
>> alloc_ordered_workqueue can handle only one push button event at a time.
> 
> pciehp and shpchp should work the same in this respect unless there's
> a reason they can't, so it sounds like we should make shpchp work like
> pciehp.
> 
>>>   - The alloc/alloc_ordered difference might be related to 486b10b9f4,
>>> where Kenji removed alloc_ordered from pciehp.  Should a similar
>>> change be made to shpchp?
>>
>> Yes, I agree, we can use per-slot workqueue to fix this issue.
>>
>>>
>>>   - acpiphp uses the global kacpi_hotplug_wq.  We never flush or drain
>>> kacpi_hotplug_wq, so I doubt there's a deadlock issue, but I wonder if
>>> there are any ordering issues there because we *don't* ever wait for
>>> things in that queue to be completed.
>>
>> The acpiphp driver is not attached to a pci device, so when a pci device is
>> hot removed, the driver does not flush or drain kacpi_hotplug_wq.
>> But if we do acpiphp hot remove in a sequence like this, it may cause
>> some unexpected errors, I think.
>> slot(A)--pcie port--slot(B)
>> slot A and slot B both support acpiphp hotplug.
>> 1. press attention button on slot A;
>> 2. press attention button on slot B quickly after step 1;
>> Because kacpi_hotplug_wq is an ordered workqueue, slot B hot remove won't run
>> until the slot A hot remove action has completed.
>> After Slot B hot remove completed, some resources of slot A have also been
>> destroyed. So slot B hot remove will cause some unexpected errors.
>> Because my hotplug machine's BIOS doesn't support iobox
>> hotplug (slot-connected-slot), I can't verify this situation.
> 
> Hmm.  That definitely sounds like a potential problem.  But I think
> it's beyond the scope of the issue you're trying to fix, and any fix
> would look much different from your current pciehp patch, so I think
> we can treat it separately.
> 
> Bjorn
> 
> .
> 


-- 
Thanks!
Yijing
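
The hang in the quoted call chain comes down to a work item flushing the
queue it is running on. A toy model of that situation is sketched below. It
is not the kernel workqueue API: struct ordered_wq, flush_wq(), run_work()
and remove_work() are hypothetical names, and instead of actually hanging the
model returns -1 where the real flush would wait forever.

```c
#include <stdbool.h>

/* Hypothetical single-threaded (ordered) queue model. */
struct ordered_wq {
	bool running;	/* true while a work item executes on this queue */
};

typedef int (*work_fn)(struct ordered_wq *wq);

/*
 * Flushing waits for all queued work to finish. Called from inside a
 * work item on the same queue, it would wait for itself, so the model
 * reports -1 instead of blocking.
 */
static int flush_wq(struct ordered_wq *wq)
{
	if (wq->running)
		return -1;	/* would deadlock */
	return 0;
}

/* Run one work item on the queue and return its result. */
static int run_work(struct ordered_wq *wq, work_fn fn)
{
	int ret;

	wq->running = true;
	ret = fn(wq);
	wq->running = false;
	return ret;
}

/* Mimics a remove path that ends in a flush of its own queue. */
static int remove_work(struct ordered_wq *wq)
{
	return flush_wq(wq);
}
```

A flush from outside the queue succeeds, while the same flush reached through
run_work() is the self-wait that the per-slot workqueue approach avoids.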


RE: [patch] bnx2x: NULL dereference on error in debug code

2013-01-08 Thread Ariel Elior
> -----Original Message-----
> From: netdev-ow...@vger.kernel.org [mailto:netdev-
> ow...@vger.kernel.org] On Behalf Of Dan Carpenter
> Sent: Tuesday, January 08, 2013 3:42 PM
> To: Eilon Greenstein
> Cc: net...@vger.kernel.org; linux-kernel@vger.kernel.org; kernel-
> janit...@vger.kernel.org
> Subject: [patch] bnx2x: NULL dereference on error in debug code
> 
> "vfop" is NULL here.  I've changed the debugging to not use it.
> 
> Signed-off-by: Dan Carpenter 
> ---
> Only needed in linux-next.
> 
> diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
> b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
> index 71fcef0..3eef972 100644
> --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
> +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c
> @@ -463,8 +463,7 @@ static int bnx2x_vfop_qdtor_cmd(struct bnx2x *bp,
>   return bnx2x_vfop_transition(bp, vf, bnx2x_vfop_qdtor,
>cmd->block);
>   }
> - DP(BNX2X_MSG_IOV, "VF[%d] failed to add a vfop. rc %d\n",
> -vf->abs_vfid, vfop->rc);
> + DP(BNX2X_MSG_IOV, "VF[%d] failed to add a vfop.\n", vf->abs_vfid);
>   return -ENOMEM;
>  }
> 
Acked-by: Ariel Elior 



Re: [PATCH] sched: Get rid of unnecessary checks from select_idle_sibling

2013-01-08 Thread Alex Shi

> 
> Uh, we don't know if the target is this_cpu or the previous cpu. If we just
> check the target's idle status, we may miss another idle cpu. So this
> patch changes the logic in this function.

But you can fold wake_affine() into select_idle_sibling(); that would save
a complicated calculation whichever cpu is idle. :)

-- 
Thanks Alex


RE: [PATCH 1/2] Add mempressure cgroup

2013-01-08 Thread leonid.moiseichuk
-----Original Message-----
From: ext Anton Vorontsov [mailto:anton.voront...@linaro.org] 
Sent: 08 January, 2013 08:30
...
> > +static const uint vmpressure_level_med = 60;
> > +static const uint vmpressure_level_oom = 99;
> > +static const uint vmpressure_level_oom_prio = 4;
> > +
..
vmpressure_level_oom = 99 seems quite high if I understand it as a global
level. If I am not mistaken, in old kernel versions the kernel-only memory
border was set at 1/32 of available memory, meaning no allocations for
user-space once the amount of free memory dropped to 1/32. So decreasing
this parameter to 95 or 90 would allow the notification to be propagated
to user-space and handled.
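
To make the arithmetic explicit, here is a small sketch under my reading of
the argument: the threshold is taken as a percentage of memory in use, and
1/32 of memory is held back from user-space. Both helper names are invented
for this example, and the model says nothing about how the mempressure code
actually computes its level.

```c
/*
 * Highest whole-percent usage user-space can reach when 1/reserve_div
 * of memory is held back (reserve_div must be non-zero).
 */
static unsigned int max_user_usage_percent(unsigned int reserve_div)
{
	return (reserve_div - 1) * 100u / reserve_div;
}

/* Non-zero if a percent threshold can ever be reached under that limit. */
static int threshold_reachable(unsigned int threshold_percent,
			       unsigned int reserve_div)
{
	return threshold_percent <= max_user_usage_percent(reserve_div);
}
```

With a 1/32 reserve the ceiling is 31/32 = 96.875%, which truncates to 96, so
in this model a threshold of 99 never fires while 95 or 90 can.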

Best wishes,
Leonid


Re: [PATCH] sched: Get rid of unnecessary checks from select_idle_sibling

2013-01-08 Thread Alex Shi
On 01/09/2013 02:50 PM, Namhyung Kim wrote:
> From: Namhyung Kim 
> 
> AFAICS @target cpu of select_idle_sibling() is always either prev_cpu
> or this_cpu.  So no need to check it again and the conditionals can be
> consolidated.
> 
> Cc: Mike Galbraith 
> Cc: Preeti U Murthy 
> Cc: Vincent Guittot 
> Cc: Alex Shi 
> Signed-off-by: Namhyung Kim 
> ---
>  kernel/sched/fair.c | 17 -
>  1 file changed, 4 insertions(+), 13 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 5eea8707234a..af665814c216 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -3254,25 +3254,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
>   */
>  static int select_idle_sibling(struct task_struct *p, int target)
>  {
> - int cpu = smp_processor_id();
> - int prev_cpu = task_cpu(p);
>   struct sched_domain *sd;
>   struct sched_group *sg;
>   int i;
>  
>   /*
> -  * If the task is going to be woken-up on this cpu and if it is
> -  * already idle, then it is the right target.
> -  */
> - if (target == cpu && idle_cpu(cpu))
> - return cpu;
> -
> - /*
> -  * If the task is going to be woken-up on the cpu where it previously
> -  * ran and if it is currently idle, then it the right target.
> +  * If the task is going to be woken-up on this cpu or the cpu where it
> +  * previously ran and it is already idle, then it is the right target.
>*/
> - if (target == prev_cpu && idle_cpu(prev_cpu))
> - return prev_cpu;
> + if (idle_cpu(target))
> + return target;

Uh, we don't know if the target is this_cpu or the previous cpu. If we just
check the target's idle status, we may miss another idle cpu. So this
patch changes the logic in this function.

>  
>   /*
>* Otherwise, iterate the domains and find an elegible idle cpu.
> 


-- 
Thanks Alex


Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)

2013-01-08 Thread Lijo Antony

On 01/09/2013 09:31 AM, Dave Airlie wrote:

On Wed, Jan 9, 2013 at 2:25 PM, Greg KH  wrote:

On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:

Hi all,

I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:

[11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
[11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
[11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
[11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
[11883.083225] gnome-shell[19396]: segfault at 218 ip 7feef5f32333 sp 7c1dc930 error 4 in i965_dri.so[7feef5ecb000+d]


I just hit this again.  And, as the kernel was asking for it, attached
is the i915_error_state file, compressed due to the size of it.


Welcome to sink hole that is
https://bugs.freedesktop.org/show_bug.cgi?id=55984

3 months and ticking, Intel guys are all running away from it saying
they can't reproduce, everyone else on planet seems to reproduce quite
easily.

It's generally considered a bug in the relocation/shrinker/no idea category,


Ugh, what a mess.


Assuming you have an Ironlake machine which I'm going to guess you do.


I don't know, it's an old i5 machine that has never had any video
problems for many years now.  How do I tell?


lspci -nn probably an 8086:0046 device.

Old i5 probably means original i5 which means ironlake.



I have also seen this a couple of times on 3.7 and 3.8-rc1.
Most of the time I was watching a YouTube video in Chrome. Nothing
crashed, though (I am not running gnome shell). The system recovered after
a few seconds.


I haven't seen this on 3.8-rc2 yet, probably because I haven't watched any
video.


-lijo





Re: [PATCH] fs/autofs: Fix sparse warning: context imbalance in 'autofs4_d_automount' - different lock contexts for basic block

2013-01-08 Thread Ian Kent
Hi all,

I see this hasn't been merged yet.
Do I need to re-post this (perhaps to Linus)?

On Thu, 2013-01-03 at 15:52 +0100, Peter Huewe wrote:
> Sparse complains:
> + fs/autofs4/root.c:409:9: sparse: context imbalance in
> 'autofs4_d_automount' - different lock contexts for basic block
> 
> This was introduced by commit
> f55fb0c243 autofs4 - dont clear DCACHE_NEED_AUTOMOUNT on rootless mount
> 
> The function autofs4_d_automount can be left with the (&sbi->fs_lock)
> held if sbi->version <= 4 and simple_empty(dentry) == false so the warning
> seems valid.
> 
> --> Add a spin_unlock in this case before we jump to done
> 
> Unfortunately compile tested only.
> 
> Reported-by: Fengguang Wu 
> Signed-off-by: Peter Huewe 
> ---
>  fs/autofs4/root.c |4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index c934476..74a299f 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -383,8 +383,10 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
>   goto done;
>   }
>   } else {
> - if (!simple_empty(dentry))
> + if (!simple_empty(dentry)) {
> + spin_unlock(&sbi->fs_lock);
>   goto done;
> + }
>   }
>   ino->flags |= AUTOFS_INF_PENDING;
>   spin_unlock(&sbi->fs_lock);

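The imbalance sparse reports in the quoted patch can be modelled with a lock
that only counts how often it is held. This is a sketch, not the autofs4
code: struct sketch_lock, lock_acquire(), lock_release() and
automount_sketch() are hypothetical, and the two booleans stand in for
"sbi->version <= 4" and simple_empty(dentry).

```c
#include <stdbool.h>

/* Hypothetical lock model that only tracks how many times it is held. */
struct sketch_lock {
	int held;
};

static void lock_acquire(struct sketch_lock *l)
{
	l->held++;
}

static void lock_release(struct sketch_lock *l)
{
	l->held--;
}

/* Returns the hold count on exit: 0 means every path balanced the lock. */
static int automount_sketch(struct sketch_lock *l, bool old_proto, bool empty)
{
	lock_acquire(l);
	if (old_proto && !empty) {
		lock_release(l);	/* the early exit must drop the lock too */
		goto done;
	}
	/* ... mark the mount pending while still holding the lock ... */
	lock_release(l);
done:
	return l->held;
}
```

Removing the lock_release() before the goto makes the first case return 1,
which is the "different lock contexts for basic block" situation.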



Re: [PATCH V3 4/8] memcg: add per cgroup dirty pages accounting

2013-01-08 Thread Kamezawa Hiroyuki

(2013/01/09 14:15), Hugh Dickins wrote:

On Mon, 7 Jan 2013, Kamezawa Hiroyuki wrote:

(2013/01/07 5:02), Hugh Dickins wrote:


Forgive me, I must confess I'm no more than skimming this thread,
and don't like dumping unsigned-off patches on people; but thought
that on balance it might be more helpful than not if I offer you a
patch I worked on around 3.6-rc2 (but have updated to 3.8-rc2 below).

I too was getting depressed by the constraints imposed by
mem_cgroup_{begin,end}_update_page_stat (good job though Kamezawa-san
did to minimize them), and wanted to replace by something freer, more
RCU-like.  In the end it seemed more effort than it was worth to go
as far as I wanted, but I do think that this is some improvement over
what we currently have, and should deal with your recursion issue.


In what case does this improve performance?


Perhaps none.  I was aiming to not degrade performance at the stats
update end, and make it more flexible, so new stats can be updated which
would be problematic today (for lock ordering and recursion reasons).

I've not done any performance measurement on it, and don't have enough
cpus for an interesting report; but if someone thinks it might solve a
problem for them, and has plenty of cpus to test with, please go ahead,
we'd be glad to hear the results.


Hi, this patch seems interesting, but doesn't this make move_account() very
slow as the number of cpus increases, because it scans all cpus for each page?
And this looks like a reader-can-block-writer percpu rwlock; it's too heavy
for writers if there are many readers.


I was happy to make the relatively rare move_account end considerably
heavier.  I'll be disappointed if it turns out to be prohibitively
heavy at that end - if we're going to make move_account impossible,
there are much easier ways to achieve that! - but it is a possibility.



move_account at task-move has been a required feature for NEC, and Nishimura-san
did a good job. I'd like to keep it available as much as possible.


Something you might have missed when considering many readers (stats
updaters): the move_account end does not wait for a moment when there
are no readers, that would indeed be a losing strategy; it just waits
for each cpu that's updating page stats to leave that section, so every
cpu is sure to notice and hold off if it then tries to update the page
which is to be moved.  (I may not be explaining that very well!)



Hmm, yeah, maybe I missed something.

BTW, if nesting, mem_cgroup_end_update_page_stat() seems to make the counter negative.

Thanks,
-Kame




Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Minchan Kim
Hi Hannes,

On Wed, Jan 09, 2013 at 01:56:12AM -0500, Johannes Weiner wrote:
> On Wed, Jan 09, 2013 at 03:21:13PM +0900, Minchan Kim wrote:
> > Recently, Luigi reported there are lots of free swap space when
> > OOM happens. It's easily reproduced on zram-over-swap, where
> > many instance of memory hogs are running and laptop_mode is enabled.
> > 
> > Luigi reported there was no problem when he disabled laptop_mode.
> > What I found when investigating the problem is as follows.
> > 
> > try_to_free_pages disable may_writepage if laptop_mode is enabled.
> > shrink_page_list adds lots of anon pages in swap cache by
> > add_to_swap, which makes pages Dirty and rotate them to head of
> > inactive LRU without pageout. If it is repeated, inactive anon LRU
> > is full of Dirty and SwapCache pages.
> > 
> > In case of that, isolate_lru_pages fails because it try to isolate
> > clean page due to may_writepage == 0.
> > 
> > The may_writepage could be 1 only if total_scanned is higher than
> > writeback_threshold in do_try_to_free_pages but unfortunately,
> > VM can't isolate anon pages from inactive anon lru list by
> > above reason and we already reclaimed all file-backed pages.
> > So it ends up OOM killing.
> > 
> > This patch avoids adding a page to the swap cache unnecessarily when
> > may_writepage is unset, so the anonymous LRU list isn't full of
> > Dirty/Swapcache page. So VM can isolate pages from anon lru list,
> > which ends up setting may_writepage to 1 and could swap out
> > anon lru pages. When OOM triggers, I confirmed swap space was full.
> > 
> > Reported-by: Luigi Semenzato 
> > Signed-off-by: Minchan Kim 
> 
> Acked-by: Johannes Weiner 
> 
> We used to ignore the page's writeback state on isolation in the past,
> could you include a reference to since when this problem has been in

Good idea.
It has existed since f80c067 [mm: zone_reclaim: make isolate_lru_page() filter-aware].
I will note it in the changelog.

> the tree?  Also, would it make sense to tag it for one of the stable
> trees?

If Luigi confirms it, I will Cc sta...@vger.kernel.org in the next spin.
Thanks!


-- 
Kind regards,
Minchan Kim
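
For readers following the changelog above, the decision it describes can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions: the patch body is not shown in this thread, so toy_scan_control, toy_page and toy_add_to_swap() are hypothetical stand-ins, not the kernel's structures or the actual change to shrink_page_list().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for struct scan_control and struct page. */
struct toy_scan_control {
	bool may_writepage;	/* false while writeback is being deferred */
};

struct toy_page {
	bool anon;
	bool swapcache;
	bool dirty;
};

/*
 * Model of the decision described in the changelog: an anonymous page
 * is only moved into the swap cache (which also marks it dirty) when the
 * reclaim pass is allowed to write pages out. Returns true if the page
 * was added.
 */
static bool toy_add_to_swap(struct toy_page *page,
			    const struct toy_scan_control *sc)
{
	assert(page && sc);

	if (!page->anon || page->swapcache)
		return false;
	if (!sc->may_writepage)
		return false;	/* leave the page untouched for a later pass */

	page->swapcache = true;
	page->dirty = true;
	return true;
}
```

With may_writepage unset the page stays out of the swap cache and is not dirtied; once a later pass sets may_writepage, the same page is added and can be written out.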


Re: [PATCH v2 2/4] input: keyboard: tegra: use devm_* for resource allocation

2013-01-08 Thread Thierry Reding
On Sun, Jan 06, 2013 at 11:57:48AM -0800, Dmitry Torokhov wrote:
> On Sun, Jan 06, 2013 at 08:27:39PM +0100, Thierry Reding wrote:
> > On Sat, Jan 05, 2013 at 12:06:58AM -0800, Dmitry Torokhov wrote:
> > > On Sat, Jan 05, 2013 at 01:15:08PM +0530, Laxman Dewangan wrote:
> > [...]
> > > > @@ -735,25 +738,16 @@ static int tegra_kbc_probe(struct platform_device *pdev)
> > > > 	spin_lock_init(&kbc->lock);
> > > > 	setup_timer(&kbc->timer, tegra_kbc_keypress_timer, (unsigned long)kbc);
> > > >  
> > > > -	res = request_mem_region(res->start, resource_size(res), pdev->name);
> > > > -	if (!res) {
> > > > -		dev_err(&pdev->dev, "failed to request I/O memory\n");
> > > > -		err = -EBUSY;
> > > > -		goto err_free_mem;
> > > > -	}
> > > > -
> > > > -	kbc->mmio = ioremap(res->start, resource_size(res));
> > > > +	kbc->mmio = devm_request_and_ioremap(&pdev->dev, res);
> > > > 	if (!kbc->mmio) {
> > > > -		dev_err(&pdev->dev, "failed to remap I/O memory\n");
> > > > -		err = -ENXIO;
> > > > -		goto err_free_mem_region;
> > > > +		dev_err(&pdev->dev, "Cannot request memregion/iomap address\n");
> > > > +		return -EADDRNOTAVAIL;
> > > 
> > > Erm, no, -EBUSY please.
> > 
> > EADDRNOTAVAIL is the canonical error for devm_request_and_ioremap()
> > failure. The kerneldoc comment in lib/devres.c even gives a short
> > example that uses this error code.
> 
> I am sorry, but I do not consider a function that was added a little
> over a year ago as a canon. If you look at the uses of EADDRNOTAVAIL it
> is used predominantly in networking code to indicate that attempted
> _network_ address is not available.

EBUSY might be misleading, though. devm_request_and_ioremap() can fail
in both the request_mem_region() and ioremap() calls. Furthermore it'd
be good to settle on a consistent error-code instead of doing it
differently depending on subsystem and/or driver. Currently the various
error codes used are:

EBUSY, EADDRNOTAVAIL, ENXIO, ENOMEM, ENODEV, ENOENT, EINVAL,
EIO, EFAULT, EADDRINUSE

Also if we can settle on one error code we should follow up with a patch
to make it consistent across the tree and also update that kerneldoc
comment. I volunteer to do that if nobody else steps up. I'm also Cc'ing
Wolfram (the original author), maybe he has some thoughts on this.

Thierry
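
The point about a single return value hiding two failure causes can be shown with a small userspace mock. This is a sketch, not kernel code: mock_request_and_ioremap() and mock_probe() are hypothetical names, and -EADDRNOTAVAIL is used only because it is the code the kerneldoc example mentioned above uses; the thread has not settled on one.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace mock of a combined "request region, then remap" helper.
 * Either step may fail, but the caller only ever sees NULL.
 */
static void *mock_request_and_ioremap(bool region_busy, bool remap_fails)
{
	static char fake_mmio[16];

	if (region_busy)
		return NULL;	/* the request step failed */
	if (remap_fails)
		return NULL;	/* the remap step failed */
	return fake_mmio;
}

/* A probe routine can therefore map NULL to just one error code. */
static int mock_probe(bool region_busy, bool remap_fails)
{
	void *mmio = mock_request_and_ioremap(region_busy, remap_fails);

	if (!mmio)
		return -EADDRNOTAVAIL;	/* same code for both failure causes */
	return 0;
}
```

Whichever code is chosen, mock_probe(true, false) and mock_probe(false, true) necessarily return the same value, which is why a code that names only one of the two causes can mislead.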




Re: [PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Johannes Weiner
On Wed, Jan 09, 2013 at 03:21:13PM +0900, Minchan Kim wrote:
> Recently, Luigi reported there are lots of free swap space when
> OOM happens. It's easily reproduced on zram-over-swap, where
> many instance of memory hogs are running and laptop_mode is enabled.
> 
> Luigi reported there was no problem when he disabled laptop_mode.
> What I found when investigating the problem is as follows.
> 
> try_to_free_pages disable may_writepage if laptop_mode is enabled.
> shrink_page_list adds lots of anon pages in swap cache by
> add_to_swap, which makes pages Dirty and rotate them to head of
> inactive LRU without pageout. If it is repeated, inactive anon LRU
> is full of Dirty and SwapCache pages.
> 
> In case of that, isolate_lru_pages fails because it try to isolate
> clean page due to may_writepage == 0.
> 
> The may_writepage could be 1 only if total_scanned is higher than
> writeback_threshold in do_try_to_free_pages but unfortunately,
> VM can't isolate anon pages from inactive anon lru list by
> above reason and we already reclaimed all file-backed pages.
> So it ends up OOM killing.
> 
> This patch avoids adding a page to the swap cache unnecessarily when
> may_writepage is unset, so the anonymous LRU list isn't full of
> Dirty/Swapcache page. So VM can isolate pages from anon lru list,
> which ends up setting may_writepage to 1 and could swap out
> anon lru pages. When OOM triggers, I confirmed swap space was full.
> 
> Reported-by: Luigi Semenzato 
> Signed-off-by: Minchan Kim 

Acked-by: Johannes Weiner 

We used to ignore the page's writeback state on isolation in the past,
could you include a reference to since when this problem has been in
the tree?  Also, would it make sense to tag it for one of the stable
trees?


[PATCH] sched: Get rid of unnecessary checks from select_idle_sibling

2013-01-08 Thread Namhyung Kim
From: Namhyung Kim 

AFAICS @target cpu of select_idle_sibling() is always either prev_cpu
or this_cpu.  So no need to check it again and the conditionals can be
consolidated.

Cc: Mike Galbraith 
Cc: Preeti U Murthy 
Cc: Vincent Guittot 
Cc: Alex Shi 
Signed-off-by: Namhyung Kim 
---
 kernel/sched/fair.c | 17 -
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5eea8707234a..af665814c216 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3254,25 +3254,16 @@ find_idlest_cpu(struct sched_group *group, struct task_struct *p, int this_cpu)
  */
 static int select_idle_sibling(struct task_struct *p, int target)
 {
-	int cpu = smp_processor_id();
-	int prev_cpu = task_cpu(p);
 	struct sched_domain *sd;
 	struct sched_group *sg;
 	int i;
 
 	/*
-	 * If the task is going to be woken-up on this cpu and if it is
-	 * already idle, then it is the right target.
-	 */
-	if (target == cpu && idle_cpu(cpu))
-		return cpu;
-
-	/*
-	 * If the task is going to be woken-up on the cpu where it previously
-	 * ran and if it is currently idle, then it the right target.
+	 * If the task is going to be woken-up on this cpu or the cpu where it
+	 * previously ran and it is already idle, then it is the right target.
 	 */
-	if (target == prev_cpu && idle_cpu(prev_cpu))
-		return prev_cpu;
+	if (idle_cpu(target))
+		return target;
 
 	/*
 	 * Otherwise, iterate the domains and find an elegible idle cpu.
-- 
1.7.11.7
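
The consolidation rests on the assumption stated in the changelog, namely that target is always this cpu or prev_cpu. A small userspace model can check that claim exhaustively. It is a sketch only: toy_idle_cpu(), fast_path_old() and fast_path_new() are hypothetical stand-ins for idle_cpu() and the two versions of the early-return logic, with -1 meaning "fall through to the domain scan".

```c
#include <assert.h>
#include <stdbool.h>

#define NR_TOY_CPUS 4

/* Toy stand-in for the kernel's idle_cpu(): idle state comes from a table. */
static bool toy_idle[NR_TOY_CPUS];

static bool toy_idle_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_TOY_CPUS);
	return toy_idle[cpu];
}

/* Early returns before the patch: two separate comparisons. */
static int fast_path_old(int target, int cpu, int prev_cpu)
{
	if (target == cpu && toy_idle_cpu(cpu))
		return cpu;
	if (target == prev_cpu && toy_idle_cpu(prev_cpu))
		return prev_cpu;
	return -1;
}

/* Early return after the patch: a single idle check on target. */
static int fast_path_new(int target)
{
	if (toy_idle_cpu(target))
		return target;
	return -1;
}

/*
 * Try every idle mask and every (cpu, prev_cpu) pair, with target set to
 * one of the two. Returns true if both versions always agree.
 */
static bool fast_paths_agree(void)
{
	for (unsigned int mask = 0; mask < (1u << NR_TOY_CPUS); mask++) {
		for (int i = 0; i < NR_TOY_CPUS; i++)
			toy_idle[i] = (mask >> i) & 1u;

		for (int cpu = 0; cpu < NR_TOY_CPUS; cpu++) {
			for (int prev = 0; prev < NR_TOY_CPUS; prev++) {
				if (fast_path_old(cpu, cpu, prev) != fast_path_new(cpu))
					return false;
				if (fast_path_old(prev, cpu, prev) != fast_path_new(prev))
					return false;
			}
		}
	}
	return true;
}
```

The two versions only diverge if target is some third cpu: with cpu 2 idle, fast_path_old(2, 0, 1) falls through while fast_path_new(2) returns 2. So the patch is behaviour-preserving exactly as long as the changelog's assumption about the callers holds.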



Re: [PATCH] Bluetooth: btmrvl_sdio: look for sd8688 firmware in alternate place

2013-01-08 Thread Marcel Holtmann
Hi Lubomir,

> > > > linux-firmware ships the sd8688* firmware images that are shared with
> > > > libertas_sdio WiFi driver under libertas/. libertas_sdio looks in both 
> > > > places
> > > > and so should we.
> > > >
> > > > Signed-off-by: Lubomir Rintel 
> > > > ---
> > > >  drivers/bluetooth/btmrvl_sdio.c |   24 ++--
> > > >  drivers/bluetooth/btmrvl_sdio.h |6 --
> > > >  2 files changed, 26 insertions(+), 4 deletions(-)
> > > 
> > > NAK from me on this one. I do not want the driver to check two
> > > locations. That is what userspace can work around.
> > > 
> > > If we want to unify the location between the WiFi driver and the
> > > Bluetooth driver, I am fine with that, but seriously, just pick one over
> > > the other. I do not care which one.
> > 
> > The unified location is mrvl/ directory.
> > 
> > We can probably move SD8688 firmware & helper binaries to mrvl/ and have 
> > both drivers grab the images there?
> 
> That would break existing setups, wouldn't it?
> 
> I was under impression (commit 3d32a58b) that we care about
> compatibility here. Do we?

that is what symlinks are for.

Regards

Marcel
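
As an illustration of the symlink approach, here is a minimal userspace sketch that leaves a relative link at the old location pointing to the new one. The helper name compat_symlink() and the directory layout passed to it are assumptions for the example; nothing here is taken from linux-firmware or from either driver.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create <root>/<old_dir>/<name> as a relative symlink to
 * ../<new_dir>/<name>, so lookups through the old path still resolve
 * after the file has moved. Returns 0 on success, -1 on failure.
 */
static int compat_symlink(const char *root, const char *old_dir,
			  const char *new_dir, const char *name)
{
	char path[PATH_MAX];
	char target[PATH_MAX];
	int n;

	assert(root && old_dir && new_dir && name);

	/* Make sure the legacy directory exists; an existing one is fine. */
	n = snprintf(path, sizeof(path), "%s/%s", root, old_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;
	if (mkdir(path, 0755) != 0 && errno != EEXIST)
		return -1;

	n = snprintf(path, sizeof(path), "%s/%s/%s", root, old_dir, name);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	n = snprintf(target, sizeof(target), "../%s/%s", new_dir, name);
	if (n < 0 || (size_t)n >= sizeof(target))
		return -1;

	return symlink(target, path);
}
```

For a quick trial, create a scratch root with mkdtemp() and call compat_symlink(root, "libertas", "mrvl", "sd8688.bin"). A relative target keeps the link valid if the whole firmware tree is relocated.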




Re: [PATCH] clk: remove unreachable code

2013-01-08 Thread Tushar Behera
On 01/09/2013 11:59 AM, Rajagopal Venkat wrote:
> while reparenting a clock, NULL check is done for clock in
> consideration and its new parent. So re-check is not required.
> If done, else part becomes unreachable.
> 
> Signed-off-by: Rajagopal Venkat 
> ---

It is good to have revision history of the patches (version number and
changelog).

>  drivers/clk/clk.c |   13 ++---
>  1 file changed, 2 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 251e45d..1c4097c 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1040,7 +1040,6 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>  {
>  #ifdef CONFIG_COMMON_CLK_DEBUG
>  	struct dentry *d;
> -	struct dentry *new_parent_d;
>  #endif
>  
>  	if (!clk || !new_parent)
> @@ -1048,22 +1047,14 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>  
>  	hlist_del(&clk->child_node);
>  
> -	if (new_parent)
> -		hlist_add_head(&clk->child_node, &new_parent->children);
> -	else
> -		hlist_add_head(&clk->child_node, &clk_orphan_list);
> +	hlist_add_head(&clk->child_node, &new_parent->children);
>  
>  #ifdef CONFIG_COMMON_CLK_DEBUG
>  	if (!inited)
>  		goto out;
>  
> -	if (new_parent)
> -		new_parent_d = new_parent->dentry;
> -	else
> -		new_parent_d = orphandir;
> -
>  	d = debugfs_rename(clk->dentry->d_parent, clk->dentry,
> -			new_parent_d, clk->name);
> +			new_parent->dentry, clk->name);
>  	if (d)
>  		clk->dentry = d;
>  	else
> 


-- 
Tushar Behera


[PATCH] clk: remove unreachable code

2013-01-08 Thread Rajagopal Venkat
while reparenting a clock, NULL check is done for clock in
consideration and its new parent. So re-check is not required.
If done, else part becomes unreachable.

Signed-off-by: Rajagopal Venkat 
---
 drivers/clk/clk.c |   13 ++---
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
index 251e45d..1c4097c 100644
--- a/drivers/clk/clk.c
+++ b/drivers/clk/clk.c
@@ -1040,7 +1040,6 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
 {
 #ifdef CONFIG_COMMON_CLK_DEBUG
 	struct dentry *d;
-	struct dentry *new_parent_d;
 #endif
 
 	if (!clk || !new_parent)
@@ -1048,22 +1047,14 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
 
 	hlist_del(&clk->child_node);
 
-	if (new_parent)
-		hlist_add_head(&clk->child_node, &new_parent->children);
-	else
-		hlist_add_head(&clk->child_node, &clk_orphan_list);
+	hlist_add_head(&clk->child_node, &new_parent->children);
 
 #ifdef CONFIG_COMMON_CLK_DEBUG
 	if (!inited)
 		goto out;
 
-	if (new_parent)
-		new_parent_d = new_parent->dentry;
-	else
-		new_parent_d = orphandir;
-
 	d = debugfs_rename(clk->dentry->d_parent, clk->dentry,
-			new_parent_d, clk->name);
+			new_parent->dentry, clk->name);
 	if (d)
 		clk->dentry = d;
 	else
-- 
1.7.10.4
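
The reasoning in the changelog can be checked with a reduced model of the control flow. This is a sketch with hypothetical names (toy_clk, toy_reparent_old(), toy_reparent_new()); it keeps only the NULL checks and reports which list the clock would have been added to.

```c
#include <assert.h>
#include <stddef.h>

struct toy_clk {
	const char *name;
};

enum toy_dest {
	TOY_NONE,		/* early return, nothing re-linked */
	TOY_PARENT_CHILDREN,	/* added to new_parent->children */
	TOY_ORPHAN_LIST		/* added to the orphan list */
};

/* Shape of the function before the patch. */
static enum toy_dest toy_reparent_old(struct toy_clk *clk,
				      struct toy_clk *new_parent)
{
	if (!clk || !new_parent)
		return TOY_NONE;

	if (new_parent)
		return TOY_PARENT_CHILDREN;
	else
		return TOY_ORPHAN_LIST;	/* unreachable: new_parent is non-NULL here */
}

/* Shape of the function after the patch. */
static enum toy_dest toy_reparent_new(struct toy_clk *clk,
				      struct toy_clk *new_parent)
{
	if (!clk || !new_parent)
		return TOY_NONE;

	return TOY_PARENT_CHILDREN;
}
```

For all four NULL/non-NULL combinations of the two arguments the old and new shapes return the same value, and TOY_ORPHAN_LIST is never produced. Note this also means a NULL new_parent makes the function return early instead of moving the clock to the orphan list, both before and after the patch.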



RE: [PATCH] Bluetooth: btmrvl_sdio: look for sd8688 firmware in alternate place

2013-01-08 Thread Lubomir Rintel
On Tue, 2013-01-08 at 18:43 -0800, Bing Zhao wrote:
> > > linux-firmware ships the sd8688* firmware images that are shared with
> > > libertas_sdio WiFi driver under libertas/. libertas_sdio looks in both 
> > > places
> > > and so should we.
> > >
> > > Signed-off-by: Lubomir Rintel 
> > > ---
> > >  drivers/bluetooth/btmrvl_sdio.c |   24 ++--
> > >  drivers/bluetooth/btmrvl_sdio.h |6 --
> > >  2 files changed, 26 insertions(+), 4 deletions(-)
> > 
> > NAK from me on this one. I do not want the driver to check two
> > locations. That is what userspace can work around.
> > 
> > If we want to unify the location between the WiFi driver and the
> > Bluetooth driver, I am fine with that, but seriously, just pick one over
> > the other. I do not care which one.
> 
> The unified location is mrvl/ directory.
> 
> We can probably move SD8688 firmware & helper binaries to mrvl/ and have both 
> drivers grab the images there?

That would break existing setups, wouldn't it?

I was under impression (commit 3d32a58b) that we care about
compatibility here. Do we?

--
Lubomir Rintel



Re: [PATCH 1/4] arm: vt8500: Add support for Wondermedia WM8750/WM8850

2013-01-08 Thread Olof Johansson
On Tue, Jan 8, 2013 at 10:13 PM, Tony Prisk  wrote:
> On Fri, 2012-12-28 at 12:20 +1300, Tony Prisk wrote:
>> This patch adds support for the WM8750 (ARMv6) and WM8850 (ARMv7).
>>
>> Common features across all SoCs are split into ARCH_VT8500 and
>> unique features are specified by each SoC option.
>>
>> Signed-off-by: Tony Prisk 
>
>
> Hi Arnd, Olof,
>
> Haven't heard anything re: this patch series. Problem?

Nope, just wasn't sure if you would send a git pull request or if you
wanted them applied.

I'm out of time for tonight, but I'll look closer at them (and apply
them if all OK) tomorrow.


-Olof


Re: [PATCH 1/3] rtc-efi: register rtc-efi device when EFI enabled

2013-01-08 Thread H. Peter Anvin
That makes it even less compelling...

joeyli  wrote:

>On Fri, 2012-12-28 at 17:07 -0800, H. Peter Anvin wrote:
>> On 12/28/2012 05:00 PM, joeyli wrote:
>> > On Fri, 2012-12-28 at 17:43 +, Matthew Garrett wrote:
>> >> On Sat, 2012-12-29 at 00:26 +0800, Lee, Chun-Yi wrote:
>> >>> UEFI time services, GetTime(), SetTime(), GetWakeupTime(), SetWakeupTime() are also
>> >>> supported by other non-IA64 architecture with UEFI BIOS, e.g. x86.
>> >>>
>> >>> This patch changed RTC_DRV_EFI configuration to depend on EFI but not just IA64. It
>> >>> checks efi_enabled flag and efi-rtc driver should enabled.
>> >>
>> >> In theory, certainly - but do we still have machines that explode if the
>> >> get_time call is made? We may also want to think about disabling the
>> >> legacy access to the RTC if the EFI calls are present.
>> > 
>> > The legacy get_time access on my test machine is work fine, not thing
>> > explode. :-)
>> > Just we have a function want to expose the timezone information to
>> > userspace and also store it.
>> > 
>> 
>> We should indeed save the timezone information if it is available --
>> either from the ACPI TAD or from the EFI RTC, or even via some
>> platform-dependent mechanism.  It is important, though, that that is
>> separate from the order of priority.
>> 
>>  -hpa
>> 
>
>I found Windows 8 doesn't aware/maintain the Timezone and Daylight
>fields in EFI_TIME struct.
>
>I got a Acer UEFI notebook and I keep the Windows 8 hard drive
>(/dev/sda) but install Linux to another hard drive (/dev/sdb). 
>
>On Linux, I applied my rtc-efi patches for allow user space feed
>Timezone and store it to BIOS through SetTime(). I wrote a simple user
>space program to set Timezone and Daylight fields, after set those
>fields I reboot to Windows 8 and use DateTime setting GUI to look at the
>change. Looks Windows doesn't aware the change, it just assume the time
>in DateTime filed is local time, but didn't show the Timezone that was
>set by me on Linux to GUI.
>
>Then, I select another Timezone(country) through Windows 8 GUI, and
>reboot to Linux. I read the Timezone and Daylight by program but didn't
>see the Timezone and Daylight changed by Windows 8, the value is still
>the same with my latest time setting by Linux program. Windows 8 changed
>DayTime fields but didn't maintain Timezone and Daylight.
>
>I only have this machine with preloaded Windows 8 for verify the
>behavior, not sure it's normally or not. If Windows 8 ignores Timezone
>and Daylight fields in UEFI BIOS, then I think it's lower down the
>necessary for we maintain Timezone and Daylight in UEFI BIOS. 
>
>We still can store Timezone and Daylight value to UEFI, but will have no
>any interactive with Windows 8.
>
>Appreciate for any suggestions.
>
>
>Thanks a lot!
>Joey Lee

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.


[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

2013-01-08 Thread Pandarathil, Vijaymohan R
- Create eventfd per vfio device assigned to a guest and register an
  event handler

- This fd is passed to the vfio_pci driver through a new ioctl

- When the device encounters an error, the eventfd is signaled
  and the qemu eventfd handler gets invoked.

- In the handler decide what action to take. Current action taken
  is to terminate the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)

diff --git a/hw/vfio_pci.c b/hw/vfio_pci.c
index 28c8303..9c3c28b 100644
--- a/hw/vfio_pci.c
+++ b/hw/vfio_pci.c
@@ -38,6 +38,7 @@
 #include "qemu/error-report.h"
 #include "qemu/queue.h"
 #include "qemu/range.h"
+#include "sysemu/sysemu.h"
 
 /* #define DEBUG_VFIO */
 #ifdef DEBUG_VFIO
@@ -130,6 +131,8 @@ typedef struct VFIODevice {
 QLIST_ENTRY(VFIODevice) next;
 struct VFIOGroup *group;
 bool reset_works;
+EventNotifier errfd;
+__u32 dev_info_flags;
 } VFIODevice;
 
 typedef struct VFIOGroup {
@@ -1805,6 +1808,8 @@ static int vfio_get_device(VFIOGroup *group, const char *name, VFIODevice *vdev)
 DPRINTF("Device %s flags: %u, regions: %u, irgs: %u\n", name,
 dev_info.flags, dev_info.num_regions, dev_info.num_irqs);
 
+vdev->dev_info_flags = dev_info.flags;
+
 if (!(dev_info.flags & VFIO_DEVICE_FLAGS_PCI)) {
 error_report("vfio: Um, this isn't a PCI device\n");
 goto error;
@@ -1900,6 +1905,55 @@ static void vfio_put_device(VFIODevice *vdev)
 }
 }
 
+static void vfio_errfd_handler(void *opaque)
+{
+    VFIODevice *vdev = opaque;
+
+    if (!event_notifier_test_and_clear(&vdev->errfd)) {
+        return;
+    }
+
+    /*
+     * TBD. Retrieve the error details and decide what action
+     * needs to be taken. One of the actions could be to pass
+     * the error to the guest and have the guest driver recover
+     * the error. This requires that PCIe capabilities be
+     * exposed to the guest. At present, we just terminate the
+     * guest to contain the error.
+     */
+    error_report("%s(%04x:%02x:%02x.%x) "
+        "Unrecoverable error detected... Terminating guest\n",
+        __func__, vdev->host.domain, vdev->host.bus, vdev->host.slot,
+        vdev->host.function);
+
+    qemu_system_shutdown_request();
+    return;
+}
+
+static void vfio_register_errfd(VFIODevice *vdev)
+{
+    int32_t pfd;
+    int ret;
+
+    if (!(vdev->dev_info_flags & VFIO_DEVICE_FLAGS_AER_NOTIFY)) {
+        error_report("vfio: Warning: Error notification not supported for the device\n");
+        return;
+    }
+    if (event_notifier_init(&vdev->errfd, 0)) {
+        error_report("vfio: Warning: Unable to init event notifier for error detection\n");
+        return;
+    }
+    pfd = event_notifier_get_fd(&vdev->errfd);
+    qemu_set_fd_handler(pfd, vfio_errfd_handler, NULL, vdev);
+
+    ret = ioctl(vdev->fd, VFIO_DEVICE_SET_ERRFD, pfd);
+    if (ret) {
+        error_report("vfio: Warning: Failed to setup error fd: %d\n", ret);
+        qemu_set_fd_handler(pfd, NULL, NULL, vdev);
+        event_notifier_cleanup(&vdev->errfd);
+    }
+    return;
+}
 static int vfio_initfn(PCIDevice *pdev)
 {
 VFIODevice *pvdev, *vdev = DO_UPCAST(VFIODevice, pdev, pdev);
@@ -2010,6 +2064,8 @@ static int vfio_initfn(PCIDevice *pdev)
 }
 }
 
+vfio_register_errfd(vdev);
+
 return 0;
 
 out_teardown:
diff --git a/linux-headers/linux/vfio.h b/linux-headers/linux/vfio.h
index 4758d1b..0ca4eeb 100644
--- a/linux-headers/linux/vfio.h
+++ b/linux-headers/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1 << 2)   /* Supports aer notification */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 12)
+ *
+ * Pass the eventfd to the vfio-pci driver for signalling any device
+ * error notifications
+ */
+#define VFIO_DEVICE_SET_ERRFD   _IO(VFIO_TYPE, VFIO_BASE + 12)
+
 /*
  * The VFIO-PCI bus driver makes use of the following fixed region and
  * IRQ index mapping.  Unimplemented regions return a size of zero.
-- 
1.7.11.3
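
The signalling mechanism the patch relies on can be tried from plain userspace with the Linux eventfd(2) interface. The sketch below is not QEMU or kernel code: the errfd_* helper names are made up for illustration, errfd_signal() plays the role of the kernel-side eventfd_signal(), and errfd_test_and_clear() mimics what a test-and-clear style handler does with the counter.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Create a non-blocking eventfd with its counter at 0. Returns fd or -1. */
static int errfd_create(void)
{
	return eventfd(0, EFD_NONBLOCK);
}

/* Producer side: add 1 to the counter. Returns 0 on success, -1 on error. */
static int errfd_signal(int fd)
{
	uint64_t one = 1;

	return write(fd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/*
 * Consumer side: read the counter, which also resets it to 0.
 * Returns the number of signals seen since the last read, or 0 if
 * nothing was pending (the non-blocking read fails with EAGAIN).
 */
static uint64_t errfd_test_and_clear(int fd)
{
	uint64_t count = 0;

	if (read(fd, &count, sizeof(count)) != (ssize_t)sizeof(count))
		return 0;
	return count;
}
```

Two signals followed by one errfd_test_and_clear() yield a count of 2, and the next call returns 0. Several errors that arrive before the handler runs are therefore coalesced into a single wakeup, which is fine for a handler that only needs to know that an error happened.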



[PATCH 0/2] AER-KVM: Error containment of VFIO devices assigned to KVM guests

2013-01-08 Thread Pandarathil, Vijaymohan R
Add support for error containment when a VFIO device assigned to a KVM
guest encounters an error. This is for PCIe devices/drivers that support AER
functionality. When the host OS is notified of an error in a device either
through the firmware first approach or through an interrupt handled by the AER
root port driver, the error handler registered by the vfio-pci driver gets
invoked. The qemu process is signaled through an eventfd registered per
VFIO device by the qemu process. In the eventfd handler, qemu decides on
what action to take. In this implementation, guest is brought down to
contain the error.

---
Vijay Mohan Pandarathil(2):

[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER
[PATCH 2/2] QEMU-AER: Qemu changes to support AER for VFIO-PCI devices

Kernel files changed

 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

Qemu files changed

 hw/vfio_pci.c  | 56 ++
 linux-headers/linux/vfio.h |  9 
 2 files changed, 65 insertions(+)


[PATCH 1/2] VFIO-AER: Vfio-pci driver changes for supporting AER

2013-01-08 Thread Pandarathil, Vijaymohan R

- New ioctl which is used to pass the eventfd that is signaled when
  an error occurs in the vfio_pci_device

- Register pci_error_handler for the vfio_pci driver

- When the device encounters an error, the error handler registered by
  the vfio_pci driver gets invoked by the AER infrastructure

- In the error handler, signal the eventfd registered for the device.

- This results in the qemu eventfd handler getting invoked and
  appropriate action taken for the guest.

Signed-off-by: Vijay Mohan Pandarathil 
---
 drivers/vfio/pci/vfio_pci.c | 29 +
 drivers/vfio/pci/vfio_pci_private.h |  1 +
 drivers/vfio/vfio.c |  8 
 include/linux/vfio.h|  1 +
 include/uapi/linux/vfio.h   |  9 +
 5 files changed, 48 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 6c11994..4ae9526 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -207,6 +207,8 @@ static long vfio_pci_ioctl(void *device_data,
if (vdev->reset_works)
info.flags |= VFIO_DEVICE_FLAGS_RESET;
 
+   info.flags |= VFIO_DEVICE_FLAGS_AER_NOTIFY;
+
info.num_regions = VFIO_PCI_NUM_REGIONS;
info.num_irqs = VFIO_PCI_NUM_IRQS;
 
@@ -348,6 +350,19 @@ static long vfio_pci_ioctl(void *device_data,
 
return ret;
 
+   } else if (cmd == VFIO_DEVICE_SET_ERRFD) {
+   int32_t fd = (int32_t)arg;
+
+   if (fd < 0)
+   return -EINVAL;
+
+   vdev->err_trigger = eventfd_ctx_fdget(fd);
+
+   if (IS_ERR(vdev->err_trigger))
+   return PTR_ERR(vdev->err_trigger);
+
+   return 0;
+
} else if (cmd == VFIO_DEVICE_RESET)
return vdev->reset_works ?
pci_reset_function(vdev->pdev) : -EINVAL;
@@ -527,11 +542,25 @@ static void vfio_pci_remove(struct pci_dev *pdev)
kfree(vdev);
 }
 
+static pci_ers_result_t vfio_err_detected(struct pci_dev *pdev,
+				pci_channel_state_t state)
+{
+	struct vfio_pci_device *vdev = vfio_get_vdev(&pdev->dev);
+
+	eventfd_signal(vdev->err_trigger, 1);
+	return PCI_ERS_RESULT_CAN_RECOVER;
+}
+
+static const struct pci_error_handlers vfio_err_handlers = {
+	.error_detected = vfio_err_detected,
+};
+
 static struct pci_driver vfio_pci_driver = {
 	.name		= "vfio-pci",
 	.id_table	= NULL, /* only dynamic ids */
 	.probe		= vfio_pci_probe,
 	.remove		= vfio_pci_remove,
+	.err_handler	= &vfio_err_handlers,
 };
 
 static void __exit vfio_pci_cleanup(void)
diff --git a/drivers/vfio/pci/vfio_pci_private.h b/drivers/vfio/pci/vfio_pci_private.h
index 611827c..daee62f 100644
--- a/drivers/vfio/pci/vfio_pci_private.h
+++ b/drivers/vfio/pci/vfio_pci_private.h
@@ -55,6 +55,7 @@ struct vfio_pci_device {
 	bool			bardirty;
 	struct pci_saved_state	*pci_saved_state;
 	atomic_t		refcnt;
+	struct eventfd_ctx	*err_trigger;
 };
 
 #define is_intx(vdev) (vdev->irq_type == VFIO_PCI_INTX_IRQ_INDEX)
diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
index 56097c6..5ed5a54 100644
--- a/drivers/vfio/vfio.c
+++ b/drivers/vfio/vfio.c
@@ -693,6 +693,14 @@ void *vfio_del_group_dev(struct device *dev)
 }
 EXPORT_SYMBOL_GPL(vfio_del_group_dev);
 
+void *vfio_get_vdev(struct device *dev)
+{
+   struct vfio_device *device = dev_get_drvdata(dev);
+
+   return device->device_data;
+}
+EXPORT_SYMBOL_GPL(vfio_get_vdev);
+
 /**
  * VFIO base fd, /dev/vfio/vfio
  */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index ab9e862..3c97b03 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -45,6 +45,7 @@ extern int vfio_add_group_dev(struct device *dev,
  void *device_data);
 
 extern void *vfio_del_group_dev(struct device *dev);
+extern void *vfio_get_vdev(struct device *dev);
 
 /**
  * struct vfio_iommu_driver_ops - VFIO IOMMU driver callbacks
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 4758d1b..fa67213 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -147,6 +147,7 @@ struct vfio_device_info {
 	__u32	flags;
 #define VFIO_DEVICE_FLAGS_RESET	(1 << 0)	/* Device supports reset */
 #define VFIO_DEVICE_FLAGS_PCI	(1 << 1)	/* vfio-pci device */
+#define VFIO_DEVICE_FLAGS_AER_NOTIFY (1 << 2)  /* Supports aer notify */
 	__u32	num_regions;	/* Max region index + 1 */
 	__u32	num_irqs;	/* Max IRQ index + 1 */
 };
@@ -288,6 +289,14 @@ struct vfio_irq_set {
  */
 #define VFIO_DEVICE_RESET  _IO(VFIO_TYPE, VFIO_BASE + 11)
 
+/**
+ * VFIO_DEVICE_SET_ERRFD - _IO(VFIO_TYPE, VFIO_BASE + 12)
+ *
+ * Pass the 

Re: [PATCH 5/6] udf: implement extent caching while reading-writing to a file

2013-01-08 Thread Namjae Jeon
2012/12/13, Namjae Jeon :
> 2012/10/19, Namjae Jeon :
>> 2012/10/19, Jan Kara :
>>>   Hello,
>>>
>>> On Wed 10-10-12 00:10:01, Namjae Jeon wrote:
>>>> From: Namjae Jeon 
>>>>
>>>> This patch implements extent caching.
>>>> Instead of reading metadata everytime from file's starting position,
>>>> now we read from the cached extent.
>>>> This speeds up the transformation of file logical offsets to
>>>> corresponding on-disk blocks.
>>>   I have some mostly minor comments to the patch. But when reading the
>>> extent code it is just ugly and hard to follow. So I'm thinking how to
>>> improve that before making things even harder with the extent cache. So
>>> give me a few more days please.
>> Hi Jan.
>> Okay, I see. Thanks!
>>>
> Hi Jan.
> Sorry for interrupt. I am still waiting for your review.
> Would you check extent cache patches ?
> Thanks.
Hi Jan.

Maybe you are not convinced by the extent cache implementation of the
write part in this patch.
So I suggest that we first add the extent cache for the read part only,
for the read-only mount case,
because it is a real issue when playing 3D BD discs (BD discs are used
with read-only mounts).
And I will look at the write part again.

Thanks.

>
>>> Honza
>


Re: [PATCH 1/3] rtc-efi: register rtc-efi device when EFI enabled

2013-01-08 Thread joeyli
On Fri, 2012-12-28 at 17:07 -0800, H. Peter Anvin wrote:
> On 12/28/2012 05:00 PM, joeyli wrote:
> > On Fri, 2012-12-28 at 17:43 +, Matthew Garrett wrote:
> >> On Sat, 2012-12-29 at 00:26 +0800, Lee, Chun-Yi wrote:
> >>> The UEFI time services GetTime(), SetTime(), GetWakeupTime() and
> >>> SetWakeupTime() are also supported by non-IA64 architectures with
> >>> a UEFI BIOS, e.g. x86.
> >>>
> >>> This patch changes the RTC_DRV_EFI configuration to depend on EFI
> >>> rather than just IA64. It checks the efi_enabled flag to decide
> >>> whether the efi-rtc driver should be enabled.
> >>
> >> In theory, certainly - but do we still have machines that explode if the
> >> get_time call is made? We may also want to think about disabling the
> >> legacy access to the RTC if the EFI calls are present.
> > 
> > The legacy get_time access works fine on my test machine, nothing
> > explodes. :-)
> > It is just that we have a function that wants to expose the timezone
> > information to userspace and also store it.
> > 
> 
> We should indeed save the timezone information if it is available --
> either from the ACPI TAD or from the EFI RTC, or even via some
> platform-dependent mechanism.  It is important, though, that that is
> separate from the order of priority.
> 
>   -hpa
> 

I found that Windows 8 is not aware of, and does not maintain, the
Timezone and Daylight fields in the EFI_TIME struct.

I have an Acer UEFI notebook; I kept the Windows 8 hard drive
(/dev/sda) and installed Linux on another hard drive (/dev/sdb).

On Linux, I applied my rtc-efi patches, which allow user space to feed
in the Timezone and store it in the BIOS through SetTime(). I wrote a
simple user space program to set the Timezone and Daylight fields.
After setting those fields I rebooted into Windows 8 and used the
DateTime settings GUI to look at the change. It looks like Windows is
not aware of the change: it just assumes the time in the DateTime
fields is local time, and the GUI did not show the Timezone that I had
set on Linux.

Then I selected another Timezone (country) through the Windows 8 GUI
and rebooted into Linux. I read the Timezone and Daylight fields with
my program but did not see any change made by Windows 8; the values
were still the same as my latest setting from the Linux program.
Windows 8 changed the DateTime fields but did not maintain Timezone
and Daylight.

I only have this one machine with preloaded Windows 8 to verify the
behavior, so I am not sure whether it is normal or not. If Windows 8
ignores the Timezone and Daylight fields in the UEFI BIOS, then I think
that lowers the need for us to maintain Timezone and Daylight in the
UEFI BIOS.

We can still store the Timezone and Daylight values in UEFI, but that
will have no interaction with Windows 8.
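
For reference, a minimal sketch of the fields in question. The layout
and constants follow the EFI_TIME definition in the UEFI specification;
the struct name and the two helper functions are made up for
illustration and are not part of the rtc-efi patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Field layout as given for EFI_TIME in the UEFI specification. */
struct efi_time_sketch {
	uint16_t year;
	uint8_t  month;
	uint8_t  day;
	uint8_t  hour;
	uint8_t  minute;
	uint8_t  second;
	uint8_t  pad1;
	uint32_t nanosecond;
	int16_t  timezone;	/* minutes from UTC, or EFI_UNSPECIFIED_TIMEZONE */
	uint8_t  daylight;	/* bitmask of the EFI_TIME_* flags below */
	uint8_t  pad2;
};

#define EFI_UNSPECIFIED_TIMEZONE	0x07ff
#define EFI_TIME_ADJUST_DAYLIGHT	0x01
#define EFI_TIME_IN_DAYLIGHT		0x02

/* Hypothetical helper: true when the stored time has no timezone attached. */
static bool efi_time_is_local(const struct efi_time_sketch *t)
{
	return t->timezone == EFI_UNSPECIFIED_TIMEZONE;
}

/* Hypothetical helper: true when a timezone value is in the allowed range. */
static bool efi_timezone_valid(int16_t tz)
{
	return tz == EFI_UNSPECIFIED_TIMEZONE || (tz >= -1440 && tz <= 1440);
}
```

When TimeZone is EFI_UNSPECIFIED_TIMEZONE the specification says the
time is to be interpreted as local time, which would be consistent with
the Windows 8 behavior described above.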

I would appreciate any suggestions.


Thanks a lot!
Joey Lee



Re: [PATCH v5 7/8] fat (exportfs): rebuild directory-inode if fat_dget() fails

2013-01-08 Thread Namjae Jeon
>
> BTW, fat_search_long() was wrong as similar function. Actually it would
> be fat_scan(), because we don't care longname entry.
Hi OGAWA.

We rewrote the patch following your suggestion, using a dummy inode.
Would you please review the patch below?

Thanks.


Subject: [PATCH] fat (exportfs): rebuild directory-inode if fat_dget()
 fails

This patch enables rebuilding of directory inodes which are not present
in the cache. This is done by traversing the disk clusters to find the
directory entry of the parent directory and using its i_pos to build the
inode.

Do this only if the "nostale_ro" nfs mount option is specified.
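
As a rough user-space illustration of the scan described above (this is
not the kernel code): each short directory entry is compared against the
wanted start cluster, which on FAT32 is assembled from a low and a high
16-bit half. The struct and function names below are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the start-cluster fields of a short entry. */
struct short_entry_sketch {
	uint16_t start_lo;
	uint16_t start_hi;	/* only meaningful on FAT32 */
};

/* Combine the two halves into one cluster number. */
static uint32_t entry_start(const struct short_entry_sketch *e, int is_fat32)
{
	uint32_t start = e->start_lo;

	if (is_fat32)
		start |= (uint32_t)e->start_hi << 16;
	return start;
}

/* Return the index of the first entry starting at logstart, or -1. */
static int scan_logstart(const struct short_entry_sketch *ents, size_t n,
			 uint32_t logstart, int is_fat32)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (entry_start(&ents[i], is_fat32) == logstart)
			return (int)i;
	return -1;
}
```

In the patch the matching entry is then turned into an i_pos, which is
what the inode is rebuilt from.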

---
 fs/fat/dir.c   |   23 +++
 fs/fat/fat.h   |3 +++
 fs/fat/inode.c |2 +-
 fs/fat/nfs.c   |   52 +++-
 4 files changed, 78 insertions(+), 2 deletions(-)

diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index 695c15c..ac97f34 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -975,6 +975,29 @@ int fat_scan(struct inode *dir, const unsigned char *name,

 EXPORT_SYMBOL_GPL(fat_scan);

+/*
+ * Scans a directory for a given logstart.
+ * Returns an error code or zero.
+ */
+int fat_scan_logstart(struct inode *dir, int i_logstart,
+ struct fat_slot_info *sinfo)
+{
+   struct super_block *sb = dir->i_sb;
+
+   sinfo->slot_off = 0;
+   sinfo->bh = NULL;
+   while (fat_get_short_entry(dir, &sinfo->slot_off, &sinfo->bh,
+  &sinfo->de) >= 0) {
+   if (fat_get_start(MSDOS_SB(sb), sinfo->de) == i_logstart) {
+   sinfo->slot_off -= sizeof(*sinfo->de);
+   sinfo->nr_slots = 1;
+   sinfo->i_pos = fat_make_i_pos(sb, sinfo->bh, sinfo->de);
+   return 0;
+   }
+   }
+   return -ENOENT;
+}
+
 static int __fat_remove_entries(struct inode *dir, loff_t pos, int nr_slots)
 {
struct super_block *sb = dir->i_sb;
diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 73f15b8..d882c01 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -291,6 +291,8 @@ extern int fat_dir_empty(struct inode *dir);
 extern int fat_subdirs(struct inode *dir);
 extern int fat_scan(struct inode *dir, const unsigned char *name,
struct fat_slot_info *sinfo);
+extern int fat_scan_logstart(struct inode *dir, int i_logstart,
+struct fat_slot_info *sinfo);
 extern int fat_get_dotdot_entry(struct inode *dir, struct buffer_head **bh,
struct msdos_dir_entry **de);
 extern int fat_alloc_new_dir(struct inode *dir, struct timespec *ts);
@@ -370,6 +372,7 @@ extern int fat_fill_super(struct super_block *sb,
void *data, int silent,

 extern int fat_flush_inodes(struct super_block *sb, struct inode *i1,
struct inode *i2);
+extern int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de);
 static inline unsigned long fat_dir_hash(int logstart)
 {
return hash_32(logstart, FAT_HASH_BITS);
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 491320b..c4c286a 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -384,7 +384,7 @@ static int fat_calc_dir_size(struct inode *inode)
 }

 /* doesn't deal with root inode */
-static int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de)
+int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de)
 {
struct msdos_sb_info *sbi = MSDOS_SB(inode->i_sb);
int error;
diff --git a/fs/fat/nfs.c b/fs/fat/nfs.c
index 08ff9fa..e94da33 100644
--- a/fs/fat/nfs.c
+++ b/fs/fat/nfs.c
@@ -213,6 +213,53 @@ static struct dentry *fat_fh_to_parent_nostale(struct super_block *sb,
 }

 /*
+ * Rebuild the parent for a directory that is not connected
+ *  to the filesystem root
+ */
+static
+struct inode *fat_rebuild_parent(struct super_block *sb, int parent_logstart)
+{
+   int search_clus, clus_to_match;
+   struct msdos_dir_entry *de;
+   struct inode *parent = NULL;
+   struct inode *dummy_grand_parent = NULL;
+   struct fat_slot_info sinfo;
+   struct msdos_sb_info *sbi = MSDOS_SB(sb);
+   sector_t blknr = fat_clus_to_blknr(sbi, parent_logstart);
+   struct buffer_head *parent_bh = sb_bread(sb, blknr);
+   if (!parent_bh) {
+   fat_msg(sb, KERN_ERR,
+   "unable to read cluster of parent directory");
+   return NULL;
+   }
+
+   de = (struct msdos_dir_entry *) parent_bh->b_data;
+   clus_to_match = fat_get_start(sbi, &de[0]);
+   search_clus = fat_get_start(sbi, &de[1]);
+
+   dummy_grand_parent = fat_dget(sb, search_clus);
+   if (!dummy_grand_parent) {
+   dummy_grand_parent = new_inode(sb);
+   if (!dummy_grand_parent) {
+   brelse(parent_bh);
+   return parent;
+   }
+
+   

[PATCH 1/2] mm: prevent to add a page to swap if may_writepage is unset

2013-01-08 Thread Minchan Kim
Recently, Luigi reported that lots of swap space is still free when
OOM happens. It is easily reproduced with zram-over-swap, where
many instances of memory hogs are running and laptop_mode is enabled.

Luigi reported there was no problem when he disabled laptop_mode.
What I found while investigating is the following.

try_to_free_pages disables may_writepage if laptop_mode is enabled.
shrink_page_list adds lots of anon pages to the swap cache via
add_to_swap, which makes the pages Dirty and rotates them to the head
of the inactive LRU without pageout. If this repeats, the inactive
anon LRU fills up with Dirty and SwapCache pages.

In that case, isolate_lru_pages fails because it tries to isolate
only clean pages due to may_writepage == 0.

may_writepage can become 1 only if total_scanned is higher than
writeback_threshold in do_try_to_free_pages, but unfortunately the
VM can't isolate anon pages from the inactive anon LRU list for the
above reason, and we have already reclaimed all file-backed pages.
So it ends up OOM killing.

This patch prevents adding a page to the swap cache unnecessarily when
may_writepage is unset, so the anonymous LRU list doesn't fill up with
Dirty/SwapCache pages. The VM can then isolate pages from the anon LRU
list, which ends up setting may_writepage to 1 and allows swapping out
anon LRU pages. When OOM triggers, I confirmed swap space was full.
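
The decision this patch changes can be modelled in user space roughly
as follows. This is only a sketch of the branch order visible in the
hunk; the enum, the struct and the function name are invented for
illustration, and may_io stands in for the __GFP_IO test.

```c
/* Possible outcomes for one page, named after the labels in the hunk. */
enum anon_action { KEEP_LOCKED, ACTIVATE_LOCKED, PROCEED };

/* Hypothetical, reduced stand-in for struct scan_control. */
struct scan_control_sketch {
	int may_io;		/* models sc->gfp_mask & __GFP_IO */
	int may_writepage;
};

static enum anon_action
anon_swap_decision(const struct scan_control_sketch *sc, int is_anon,
		   int in_swap_cache, int add_to_swap_ok)
{
	if (is_anon && !in_swap_cache) {
		if (!sc->may_io)
			return KEEP_LOCKED;
		/* The new check: do not enter swap cache if we cannot write. */
		if (!sc->may_writepage)
			return KEEP_LOCKED;
		if (!add_to_swap_ok)
			return ACTIVATE_LOCKED;
	}
	return PROCEED;
}
```

With may_writepage == 0 an anon page is now left alone instead of being
marked Dirty and put into the swap cache.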

Reported-by: Luigi Semenzato 
Signed-off-by: Minchan Kim 
---
 mm/vmscan.c |2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ff869d2..439cc47 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -780,6 +780,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
if (PageAnon(page) && !PageSwapCache(page)) {
if (!(sc->gfp_mask & __GFP_IO))
goto keep_locked;
+   if (!sc->may_writepage)
+   goto keep_locked;
if (!add_to_swap(page))
goto activate_locked;
may_enter_fs = 1;
-- 
1.7.9.5



[PATCH 0/2] Use up free swap space before reaching OOM kill

2013-01-08 Thread Minchan Kim
Recently, Luigi reported that lots of swap space is still free when
OOM happens. It is easily reproduced with zram-over-swap, where
many instances of memory hogs are running and laptop_mode = 2.
http://marc.info/?l=linux-mm&m=135421750914807&w=2

This patchset fixes the problem. In fact, applying either one of the
two patches fixes it, but I am sending both because they address
separate issues, even though each of them resolves this report.

Andrew, could you replace [1] with this patchset in mmotm?
I think this patchset is better than [1].

[1] mm-swap-out-anonymous-page-regardless-of-laptop_mode.patch

Minchan Kim (2):
  [1/2] mm: prevent to add a page to swap if may_writepage is unset
  [2/2] mm: forcely swapout when we are out of page cache

 mm/vmscan.c |8 
 1 file changed, 8 insertions(+)

-- 
1.7.9.5



[PATCH 2/2] mm: forcely swapout when we are out of page cache

2013-01-08 Thread Minchan Kim
If laptop_mode is enabled, the VM tries to avoid I/O to save power.
But if there is no memory that can be reclaimed without I/O, we should
do I/O to prevent an unnecessary OOM kill, even though we sacrifice
power.

One example is when we are out of page cache. Only anonymous pages
remain, and to swap them out we need may_writepage = 1.
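
A small user-space model of the condition touched by this patch, under
the assumption that only the three counters shown in the hunk matter.
The struct and function names are hypothetical.

```c
/* Hypothetical per-zone counters, in pages. */
struct zone_sketch {
	unsigned long file;		/* file-backed LRU pages */
	unsigned long free;		/* free pages */
	unsigned long high_wmark;	/* high watermark */
};

/* Hypothetical, reduced stand-in for struct scan_control. */
struct scan_sketch {
	int may_writepage;
};

/*
 * Return 1 when page cache plus free memory cannot reach the high
 * watermark, i.e. only anon pages are left to reclaim. In that case
 * writing is allowed, as the patch does with sc->may_writepage = 1.
 */
static int out_of_page_cache(const struct zone_sketch *z,
			     struct scan_sketch *sc)
{
	if (z->file + z->free <= z->high_wmark) {
		sc->may_writepage = 1;
		return 1;
	}
	return 0;
}
```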

Reported-by: Luigi Semenzato 
Signed-off-by: Minchan Kim 
---
 mm/vmscan.c |6 ++
 1 file changed, 6 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 439cc47..624c816 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1728,6 +1728,12 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
free = zone_page_state(zone, NR_FREE_PAGES);
if (unlikely(file + free <= high_wmark_pages(zone))) {
scan_balance = SCAN_ANON;
+   /*
+* From now on, we have to swap out
+* for preventing OOM kill although
+* we sacrifice power consumption.
+*/
+   sc->may_writepage = 1;
goto out;
}
}
-- 
1.7.9.5



Re: [PATCH 1/4] arm: vt8500: Add support for Wondermedia WM8750/WM8850

2013-01-08 Thread Tony Prisk
On Fri, 2012-12-28 at 12:20 +1300, Tony Prisk wrote:
> This patch adds support for the WM8750 (ARMv6) and WM8850 (ARMv7).
> 
> Common features across all SoCs are split into ARCH_VT8500 and
> unique features are specified by each SoC option.
> 
> Signed-off-by: Tony Prisk 


Hi Arnd, Olof,

Haven't heard anything re: this patch series. Problem?

Regards
Tony P



Re: [PATCH] clk: remove unreachable code

2013-01-08 Thread Rajagopal Venkat
On 9 January 2013 11:20, Tushar Behera  wrote:
> On 01/08/2013 06:33 PM, Rajagopal Venkat wrote:
>> while reparenting a clock, NULL check is done for clock in
>> consideration and its new parent. So re-check is not required.
>> If done, else part becomes unreachable.
>>
>> Signed-off-by: Rajagopal Venkat 
>> ---
>>  drivers/clk/clk.c |5 +
>>  1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
>> index 251e45d..f896584 100644
>> --- a/drivers/clk/clk.c
>> +++ b/drivers/clk/clk.c
>> @@ -1048,10 +1048,7 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>>
>>   hlist_del(&clk->child_node);
>>
>> - if (new_parent)
>> - hlist_add_head(&clk->child_node, &new_parent->children);
>> - else
>> - hlist_add_head(&clk->child_node, &clk_orphan_list);
>> + hlist_add_head(&clk->child_node, &new_parent->children);
>>
>>  #ifdef CONFIG_COMMON_CLK_DEBUG
>>   if (!inited)
>>
>
> The same logic holds good for following piece of code too.
>
> 1060 |---if (new_parent)
> 1061 |---|---new_parent_d = new_parent->dentry;
> 1062 |---else
> 1063 |---|---new_parent_d = orphandir;

Yes. Thanks for pointing out.

>
>
> --
> Tushar Behera



-- 
Regards,
Rajagopal


Re: [PATCH] clk: remove unreachable code

2013-01-08 Thread Tushar Behera
On 01/08/2013 06:33 PM, Rajagopal Venkat wrote:
> while reparenting a clock, NULL check is done for clock in
> consideration and its new parent. So re-check is not required.
> If done, else part becomes unreachable.
> 
> Signed-off-by: Rajagopal Venkat 
> ---
>  drivers/clk/clk.c |5 +
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/drivers/clk/clk.c b/drivers/clk/clk.c
> index 251e45d..f896584 100644
> --- a/drivers/clk/clk.c
> +++ b/drivers/clk/clk.c
> @@ -1048,10 +1048,7 @@ void __clk_reparent(struct clk *clk, struct clk *new_parent)
>  
>   hlist_del(&clk->child_node);
>  
> - if (new_parent)
> - hlist_add_head(&clk->child_node, &new_parent->children);
> - else
> - hlist_add_head(&clk->child_node, &clk_orphan_list);
> + hlist_add_head(&clk->child_node, &new_parent->children);
>  
>  #ifdef CONFIG_COMMON_CLK_DEBUG
>   if (!inited)
> 

The same logic holds good for following piece of code too.

1060 |---if (new_parent)
1061 |---|---new_parent_d = new_parent->dentry;
1062 |---else
1063 |---|---new_parent_d = orphandir;


-- 
Tushar Behera


[PATCH 2/2] mirror throttling

2013-01-08 Thread Mikulas Patocka
dm-kcopyd: use throttle

This patch allows the administrator to limit kcopyd rate.

We maintain a history of kcopyd usage in variables io_period and
total_period. The actual kcopyd activity is "(100 * io_period /
total_period)" percent of time. If we exceed user-defined percentage
threshold, we sleep.
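
The accounting described above can be sketched in user space like this.
It is a simplified model, not the patch itself: ticks are passed in
instead of being read from jiffies, locking is left out, the window
size of 256 is an assumed stand-in for (1 << ACOUNT_INTERVAL_SHIFT),
and all names ending in _sketch or starting with throttle_ are invented.

```c
#include <stdbool.h>

/* Assumed window size; the patch derives it from SHIFT_HZ. */
#define SKETCH_INTERVAL 256u

struct throttle_sketch {
	unsigned throttle;	/* allowed activity in percent; >= 100 disables */
	unsigned io_period;	/* ticks during which I/O was in flight */
	unsigned total_period;	/* all ticks accounted so far */
};

/* Account elapsed ticks, then decay both counters once the window is full. */
static void throttle_account(struct throttle_sketch *t, unsigned ticks,
			     bool io_active)
{
	if (io_active)
		t->io_period += ticks;
	t->total_period += ticks;

	/* Halve until below the window; the patch gets the shift via fls(). */
	while (t->total_period >= SKETCH_INTERVAL) {
		t->total_period >>= 1;
		t->io_period >>= 1;
	}
}

/* True when I/O time exceeds "throttle" percent of the total time. */
static bool throttle_should_sleep(const struct throttle_sketch *t)
{
	int skew;

	if (t->throttle >= 100)
		return false;
	skew = (int)t->io_period - (int)(t->throttle * t->total_period / 100);
	return skew > 0;
}
```

For example, with throttle = 50, ten busy ticks give io_period = 10 and
total_period = 10, so the skew is 5 and the caller would sleep; after
ten further idle ticks the skew is 0 and I/O may proceed.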

Signed-off-by: Mikulas Patocka 

---
 drivers/md/dm-kcopyd.c |  110 +
 1 file changed, 110 insertions(+)

Index: linux-3.8-rc1-fast/drivers/md/dm-kcopyd.c
===
--- linux-3.8-rc1-fast.orig/drivers/md/dm-kcopyd.c  2013-01-02 23:23:17.0 +0100
+++ linux-3.8-rc1-fast/drivers/md/dm-kcopyd.c   2013-01-02 23:23:25.0 +0100
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -51,6 +52,8 @@ struct dm_kcopyd_client {
struct workqueue_struct *kcopyd_wq;
struct work_struct kcopyd_work;
 
+   struct dm_kcopyd_throttle *throttle;
+
 /*
  * We maintain three lists of jobs:
  *
@@ -68,6 +71,108 @@ struct dm_kcopyd_client {
 
 static struct page_list zero_page_list;
 
+static DEFINE_SPINLOCK(throttle_spinlock);
+
+/*
+ * IO/IDLE accounting slowly decays after (1 << ACOUNT_INTERVAL_SHIFT) period.
+ * When total_period >= (1 << ACOUNT_INTERVAL_SHIFT) the counters are divided
+ * by 2.
+ */
+#define ACOUNT_INTERVAL_SHIFT  SHIFT_HZ
+
+/*
+ * Sleep this number of milliseconds.
+ *
+ * This value was found experimentally.
+ * Smaller values cause increased copy rate above the limit. The reason for
+ * this is unknown. Possible explanations could be jiffies rounding errors
+ * or read/write cache inside the disk.
+ */
+#define SLEEP_MSEC 100
+
+/*
+ * Maximum number of sleep events. There is a theoretical livelock if more
+ * kcopyd clients do work simultaneously, this limit allows us to get out of
+ * the livelock.
+ */
+#define MAX_SLEEPS 10
+
+static void io_job_start(struct dm_kcopyd_throttle *t)
+{
+   unsigned throttle, now, difference;
+   int slept, skew;
+
+   if (unlikely(!t))
+   return;
+
+   slept = 0;
+
+try_again:
+   spin_lock_irq(&throttle_spinlock);
+
+   throttle = ACCESS_ONCE(t->throttle);
+
+   if (likely(throttle >= 100))
+   goto skip_limit;
+
+   now = jiffies;
+   difference = now - t->last_jiffies;
+   t->last_jiffies = now;
+   if (t->num_io_jobs)
+   t->io_period += difference;
+   t->total_period += difference;
+
+   if (unlikely(t->total_period >= (1 << ACOUNT_INTERVAL_SHIFT))) {
+   int shift = fls(t->total_period >> ACOUNT_INTERVAL_SHIFT);
+   t->total_period >>= shift;
+   t->io_period >>= shift;
+   }
+
+   skew = t->io_period - throttle * t->total_period / 100;
+   /* skew = t->io_period * 100 / throttle - t->total_period; */
+   if (unlikely(skew > 0) && slept < MAX_SLEEPS) {
+   slept++;
+   spin_unlock_irq(&throttle_spinlock);
+   msleep(SLEEP_MSEC);
+   goto try_again;
+   }
+
+skip_limit:
+   t->num_io_jobs++;
+
+   spin_unlock_irq(&throttle_spinlock);
+}
+
+static void io_job_finish(struct dm_kcopyd_throttle *t)
+{
+   unsigned long flags;
+
+   if (unlikely(!t))
+   return;
+
+   spin_lock_irqsave(&throttle_spinlock, flags);
+
+   t->num_io_jobs--;
+
+   if (likely(ACCESS_ONCE(t->throttle) >= 100))
+   goto skip_limit;
+
+   if (!t->num_io_jobs) {
+   unsigned now, difference;
+
+   now = jiffies;
+   difference = now - t->last_jiffies;
+   t->last_jiffies = now;
+
+   t->io_period += difference;
+   t->total_period += difference;
+   }
+
+skip_limit:
+   spin_unlock_irqrestore(&throttle_spinlock, flags);
+}
+
+
 static void wake(struct dm_kcopyd_client *kc)
 {
queue_work(kc->kcopyd_wq, &kc->kcopyd_work);
@@ -348,6 +453,8 @@ static void complete_io(unsigned long er
struct kcopyd_job *job = (struct kcopyd_job *) context;
struct dm_kcopyd_client *kc = job->kc;
 
+   io_job_finish(kc->throttle);
+
if (error) {
if (job->rw & WRITE)
job->write_err |= error;
@@ -389,6 +496,8 @@ static int run_io_job(struct kcopyd_job 
.client = job->kc->io_client,
};
 
+   io_job_start(job->kc->throttle);
+
if (job->rw == READ)
r = dm_io(&io_req, 1, &job->source, NULL);
else
@@ -708,6 +817,7 @@ struct dm_kcopyd_client *dm_kcopyd_clien
INIT_LIST_HEAD(&kc->complete_jobs);
INIT_LIST_HEAD(&kc->io_jobs);
INIT_LIST_HEAD(&kc->pages_jobs);
+   kc->throttle = throttle;
 
kc->job_pool = mempool_create_slab_pool(MIN_JOBS, _job_cache);
if (!kc->job_pool)


[PATCH 1/2] mirror throttling

2013-01-08 Thread Mikulas Patocka
dm-kcopyd: introduce per-module throttle structure

The structure contains the throttle parameter (it can be set in
/sys/module/*/parameters) and auxiliary variables for activity counting.

The throttle does nothing, it will be activated in the next patch.

Signed-off-by: Mikulas Patocka 

---
 drivers/md/dm-kcopyd.c|2 +-
 drivers/md/dm-raid1.c |5 -
 drivers/md/dm-snap.c  |5 -
 drivers/md/dm-thin.c  |5 -
 include/linux/dm-kcopyd.h |   15 ++-
 5 files changed, 27 insertions(+), 5 deletions(-)

Index: linux-3.8-rc1-fast/drivers/md/dm-kcopyd.c
===
--- linux-3.8-rc1-fast.orig/drivers/md/dm-kcopyd.c  2013-01-02 23:00:49.0 +0100
+++ linux-3.8-rc1-fast/drivers/md/dm-kcopyd.c   2013-01-02 23:23:17.0 +0100
@@ -695,7 +695,7 @@ int kcopyd_cancel(struct kcopyd_job *job
 /*-
  * Client setup
  *---*/
-struct dm_kcopyd_client *dm_kcopyd_client_create(void)
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)
 {
int r = -ENOMEM;
struct dm_kcopyd_client *kc;
Index: linux-3.8-rc1-fast/drivers/md/dm-raid1.c
===
--- linux-3.8-rc1-fast.orig/drivers/md/dm-raid1.c   2013-01-02 23:00:49.0 +0100
+++ linux-3.8-rc1-fast/drivers/md/dm-raid1.c    2013-01-02 23:23:17.0 +0100
@@ -82,6 +82,9 @@ struct mirror_set {
struct mirror mirror[0];
 };
 
+DECLARE_DM_KCOPYD_THROTTLE(raid1_resync_throttle,
+   "A percentage of time allocated for raid resynchronization");
+
 static void wakeup_mirrord(void *context)
 {
struct mirror_set *ms = context;
@@ -,7 +1114,7 @@ static int mirror_ctr(struct dm_target *
goto err_destroy_wq;
}
 
-   ms->kcopyd_client = dm_kcopyd_client_create();
+   ms->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
if (IS_ERR(ms->kcopyd_client)) {
r = PTR_ERR(ms->kcopyd_client);
goto err_destroy_wq;
Index: linux-3.8-rc1-fast/drivers/md/dm-snap.c
===
--- linux-3.8-rc1-fast.orig/drivers/md/dm-snap.c    2013-01-02 23:00:49.0 +0100
+++ linux-3.8-rc1-fast/drivers/md/dm-snap.c 2013-01-02 23:23:17.0 +0100
@@ -124,6 +124,9 @@ struct dm_snapshot {
 #define RUNNING_MERGE  0
 #define SHUTDOWN_MERGE 1
 
+DECLARE_DM_KCOPYD_THROTTLE(snapshot_copy_throttle,
+   "A percentage of time allocated for copy on write");
+
 struct dm_dev *dm_snap_origin(struct dm_snapshot *s)
 {
return s->origin;
@@ -1109,7 +1112,7 @@ static int snapshot_ctr(struct dm_target
goto bad_hash_tables;
}
 
-   s->kcopyd_client = dm_kcopyd_client_create();
+   s->kcopyd_client = dm_kcopyd_client_create(&dm_kcopyd_throttle);
if (IS_ERR(s->kcopyd_client)) {
r = PTR_ERR(s->kcopyd_client);
ti->error = "Could not create kcopyd client";
Index: linux-3.8-rc1-fast/include/linux/dm-kcopyd.h
===
--- linux-3.8-rc1-fast.orig/include/linux/dm-kcopyd.h   2013-01-02 22:59:41.0 +0100
+++ linux-3.8-rc1-fast/include/linux/dm-kcopyd.h    2013-01-02 23:23:17.0 +0100
@@ -21,11 +21,24 @@
 
 #define DM_KCOPYD_IGNORE_ERROR 1
 
+struct dm_kcopyd_throttle {
+   unsigned throttle;
+   unsigned long num_io_jobs;
+   unsigned io_period;
+   unsigned total_period;
+   unsigned last_jiffies;
+};
+
+#define DECLARE_DM_KCOPYD_THROTTLE(name, description)  \
+static struct dm_kcopyd_throttle dm_kcopyd_throttle = { 100, 0, 0, 0, 0 }; \
+module_param_named(name, dm_kcopyd_throttle.throttle, uint, 0644); \
+MODULE_PARM_DESC(name, description)
+
 /*
  * To use kcopyd you must first create a dm_kcopyd_client object.
  */
 struct dm_kcopyd_client;
-struct dm_kcopyd_client *dm_kcopyd_client_create(void);
+struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle);
 void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc);
 
 /*
Index: linux-3.8-rc1-fast/drivers/md/dm-thin.c
===
--- linux-3.8-rc1-fast.orig/drivers/md/dm-thin.c    2013-01-02 23:00:49.0 +0100
+++ linux-3.8-rc1-fast/drivers/md/dm-thin.c 2013-01-02 23:23:17.0 +0100
@@ -26,6 +26,9 @@
 #define PRISON_CELLS 1024
 #define COMMIT_PERIOD HZ
 
+DECLARE_DM_KCOPYD_THROTTLE(snapshot_copy_throttle,
+   "A percentage of time allocated for copy on write");
+
 /*
  * The block size of the device holding pool data must be
  * between 64KB and 1GB.
@@ -1636,7 +1639,7 @@ static struct pool *pool_create(struct m
goto 

Re: [dm-devel] [PATCH 0/3 v3] add resync speed control for dm-raid1

2013-01-08 Thread Mikulas Patocka
Hi

I did this already some time ago.
I'm sending my patches in the next mail.

Basically, my and Guangliang's patches have the following differences:

my patch: uses per-module throttle settings
Guangliang's patch: uses per-device settings
(my patch could be changed to use per-device throttle too, but without
userspace support it isn't very useful, because userspace lvm can
reload the mirror and the per-device settings would be lost)

my patch: uses fine grained throttling of the individual IOs in kcopyd - 
it measures active/inactive ratio and if the disk is active more than the 
specified percentage of time, sleep is inserted.
Guangliang's patch: throttles on segment granularity, it waits when
starting a new segment, but each segment is copied unthrottled.

my patch: the user selects a percentage value (0 - 100) in 
"/sys/module/dm_mirror/parameters/raid1_resync_throttle", the device is 
kept active the specified percent of time
Guangliang's patch: limits the number of segments per a specified 
interval

My patch is noticeably bigger.

Mikulas


On Mon, 7 Jan 2013, Guangliang Zhao wrote:

> Hi,
> 
> These patches are used to add resync speed control for dm-raid1. The
> second and third patch provide support for user-space tool dmsetup.
> I have made some modifications by the comments. This is the third
> version.
> 
> Guangliang Zhao (3):
>   dm raid1: add resync speed control for dm-raid1
>   dm raid1: add interface to set resync speed
>   dm raid1: add interface to get resync speed
> 
>  drivers/md/dm-raid1.c |   90 
> -
>  1 file changed, 89 insertions(+), 1 deletion(-)
> 
> -- 
> 1.7.10.4
> 
> --
> dm-devel mailing list
> dm-de...@redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
> 


Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)

2013-01-08 Thread Dave Airlie
On Wed, Jan 9, 2013 at 2:25 PM, Greg KH  wrote:
> On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
>> >> Hi all,
>> >>
>> >> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>> >>
>> >> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer 
>> >> elapsed... GPU hung
>> >> [11868.414655] [drm] capturing error event; look for more information in 
>> >> /debug/dri/0/i915_error_state
>> >> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer 
>> >> elapsed... GPU hung
>> >> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring 
>> >> wedged!
>> >> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>> >> [11883.083225] gnome-shell[19396]: segfault at 218 ip 7feef5f32333 sp 
>> >> 7c1dc930 error 4 in i965_dri.so[7feef5ecb000+d]
>> >
>> > I just hit this again.  And, as the kernel was asking for it, attached
>> > is the i915_error_state file, compressed due to the size of it.
>> >
>> Welcome to the sinkhole that is
>> https://bugs.freedesktop.org/show_bug.cgi?id=55984
>>
>> 3 months and ticking, Intel guys are all running away from it saying
>> they can't reproduce, everyone else on the planet seems to reproduce
>> it quite easily.
>>
>> It's generally considered a bug in the relocation/shrinker/no idea category,
>
> Ugh, what a mess.
>
>> Assuming you have an Ironlake machine which I'm going to guess you do.
>
> I don't know, it's an old i5 machine that has never had any video
> problems for many years now.  How do I tell?

lspci -nn; it is probably an 8086:0046 device.

Old i5 probably means original i5 which means ironlake.

Dave.


Re: [PATCH] clk: mxs: Index is always positive

2013-01-08 Thread Shawn Guo
On Mon, Jan 07, 2013 at 11:38:55PM -0200, Fabio Estevam wrote:
> From: Fabio Estevam 
> 
> Fix the following warnings when building with W=1 option:
> 
> drivers/clk/mxs/clk-imx23.c: In function 'mx23_clocks_init':
> drivers/clk/mxs/clk-imx23.c:149:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> drivers/clk/mxs/clk-imx23.c:165:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> ...
> 
> drivers/clk/mxs/clk-imx28.c: In function 'mx28_clocks_init':
> drivers/clk/mxs/clk-imx28.c:227:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> drivers/clk/mxs/clk-imx28.c:244:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> 
> Signed-off-by: Fabio Estevam 

Acked-by: Shawn Guo 
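
For readers unfamiliar with the warning, a minimal sketch of the
pattern. The diff itself is not quoted in this mail, so this assumes,
going by the subject line, that the fix is to declare the loop index
unsigned; the array and function names are made up.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical table standing in for the clock array. */
static const int rates[] = { 24, 48, 96 };

static int count_positive(void)
{
	/*
	 * ARRAY_SIZE() yields an unsigned size_t. With "int i" the loop
	 * condition mixes signed and unsigned operands and triggers
	 * -Wsign-compare; an unsigned index avoids that.
	 */
	unsigned int i;
	int n = 0;

	for (i = 0; i < ARRAY_SIZE(rates); i++)
		if (rates[i] > 0)
			n++;
	return n;
}
```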



Re: [PATCH V3 4/8] memcg: add per cgroup dirty pages accounting

2013-01-08 Thread Hugh Dickins
On Mon, 7 Jan 2013, Kamezawa Hiroyuki wrote:
> (2013/01/07 5:02), Hugh Dickins wrote:
> > 
> > Forgive me, I must confess I'm no more than skimming this thread,
> > and don't like dumping unsigned-off patches on people; but thought
> > that on balance it might be more helpful than not if I offer you a
> > patch I worked on around 3.6-rc2 (but have updated to 3.8-rc2 below).
> > 
> > I too was getting depressed by the constraints imposed by
> > mem_cgroup_{begin,end}_update_page_stat (good job though Kamezawa-san
> > did to minimize them), and wanted to replace by something freer, more
> > RCU-like.  In the end it seemed more effort than it was worth to go
> > as far as I wanted, but I do think that this is some improvement over
> > what we currently have, and should deal with your recursion issue.
> > 
> In what case does this improve performance ?

Perhaps none.  I was aiming to not degrade performance at the stats
update end, and make it more flexible, so new stats can be updated which
would be problematic today (for lock ordering and recursion reasons).

I've not done any performance measurement on it, and don't have enough
cpus for an interesting report; but if someone thinks it might solve a
problem for them, and has plenty of cpus to test with, please go ahead,
we'd be glad to hear the results.

> Hi, this patch seems interesting but...doesn't this make move_account() very
> slow if the number of cpus increases because of scanning all cpus per a page
> ?
> And this looks like reader-can-block-writer percpu rwlock..it's too heavy to
> writers if there are many readers.

I was happy to make the relatively rare move_account end considerably
heavier.  I'll be disappointed if it turns out to be prohibitively
heavy at that end - if we're going to make move_account impossible,
there are much easier ways to achieve that! - but it is a possibility.

Something you might have missed when considering many readers (stats
updaters): the move_account end does not wait for a moment when there
are no readers, that would indeed be a losing strategy; it just waits
for each cpu that's updating page stats to leave that section, so every
cpu is sure to notice and hold off if it then tries to update the page
which is to be moved.  (I may not be explaining that very well!)

Hugh


Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu

2013-01-08 Thread Takao Indoh

Hi Thomas,

(2013/01/09 11:32), Thomas Renninger wrote:

On Tuesday, January 08, 2013 09:27:55 AM Yinghai Lu wrote:

On Tue, Jan 8, 2013 at 8:50 AM, Thomas Renninger  wrote:

megaraid_sas


Can you check whether the initrd for your kdump kernel has that driver
and the modules it depends on, such as the SCSI SAS transport?


Removing the 5 patches and the disk works and the
dump is written.

I can look a bit further at the memmap=exactmap issue tomorrow.
I can also double check above then, but I am rather sure about it
already:
I tried plain vanilla -> worked, dumping started


It seems that there are several disk controllers in your system.

00:1f.2 SATA controller [0106]: Intel Corporation Device [8086:1d02] (rev 05)
02:00.0 RAID bus controller [0104]: LSI Logic / Symbios Logic Device [1000:005b] (rev 01)
05:00.0 Serial Attached SCSI controller [0107]: LSI Logic / Symbios Logic SAS2116 PCI-Express Fusion-MPT SAS-2 [Meteor] [1000:0064] (rev 02)

Which disk are you using to save the vmcore?



I tried with only these 5 patches added -> no disk.


Some questions:

Are you trying to initialize the PCI subsystem in the kexec case the way
the BIOS typically has to do it?


These patches send a hot reset to the endpoints to reset them; this may
differ from how the BIOS initializes them.


Reacting and trying to handle error conditions more gracefully
at the place where they are caught could be another approach which
imo makes sense to implement in parallel.

In my case for example I see:
"Present field in the IRTE entry is clear"
DMAR errors. I expect this comes from a device which still raises
interrupts, but whose irq vector was not set up or registered in the
kexec'ed kernel.

I could imagine this is the same error which happens when an irq is
wrongly configured and spurious interrupts happen (but in irq remapped case).
In my case it's not severe as I only see this message once, but according
to another report, they see about 80 such DMAR error messages per
second. This seems to result in endless DMAR error interrupts and finally
a dead system.

I wonder whether the DMAR error handler could already invoke a PCIe
reset.
I found:
int pci_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
which unfortunately is only implemented for PPC, but would it make sense to
implement this one and trigger a function level reset if several specific DMAR
errors are seen (or when other PCI(e) error handlers become active)?


Or the AER framework may be able to handle this. It already has a function
to reset an endpoint when an error is detected.
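
The "reset after several errors" idea discussed above could be sketched
as a small rate check. This is a userspace model only: the names
(fault_tracker, fault_storm_should_reset) and the limit and window values
are hypothetical, and it says nothing about how the DMAR fault handler or
AER would actually issue the reset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FAULT_LIMIT	8	/* hypothetical: faults tolerated per window */
#define FAULT_WINDOW_MS	1000	/* hypothetical window length */

struct fault_tracker {
	uint64_t window_start_ms;	/* start of the current window */
	unsigned int count;		/* faults seen in this window */
	bool reset_requested;		/* a reset was already asked for */
};

/*
 * Record one fault seen at now_ms.  Returns true exactly once, when the
 * device has exceeded FAULT_LIMIT faults inside one window and the
 * caller should reset it.
 */
bool fault_storm_should_reset(struct fault_tracker *t, uint64_t now_ms)
{
	assert(t);
	if (now_ms - t->window_start_ms >= FAULT_WINDOW_MS) {
		/* window expired: start counting afresh */
		t->window_start_ms = now_ms;
		t->count = 0;
	}
	t->count++;
	if (t->count > FAULT_LIMIT && !t->reset_requested) {
		t->reset_requested = true;
		return true;
	}
	return false;
}
```

With these made-up thresholds, a device that faults once (as in Thomas's
case) is left alone, while one faulting about 80 times a second would be
flagged for reset after its ninth fault.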

Thanks,
Takao Indoh



If this does not help the next step could be to stop DMAR error interrupt
handling or other iommu commands to keep the machine alive, even if one
device keeps firing interrupts to an unconfigured irq vector (or whatever other
things could happen).

Just some ideas...
Comments appreciated.

Thomas






Re: [linux-pm] [PATCH] cpufreq: exynos: Show list of available frequencies

2013-01-08 Thread amit kachhap
On Tue, Jan 8, 2013 at 2:50 AM, Inderpal Singh wrote:
> Add freq_attr attribute to show list of available frequencies.
>
> Signed-off-by: Donggeun Kim 
> Signed-off-by: MyungJoo Ham 
> Signed-off-by: KyungMin Park 
> Signed-off-by: Inderpal Singh 
> ---
>  drivers/cpufreq/exynos-cpufreq.c |   13 +
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/cpufreq/exynos-cpufreq.c b/drivers/cpufreq/exynos-cpufreq.c
> index 7012ea8..bc1e833 100644
> --- a/drivers/cpufreq/exynos-cpufreq.c
> +++ b/drivers/cpufreq/exynos-cpufreq.c
> @@ -244,13 +244,26 @@ static int exynos_cpufreq_cpu_init(struct cpufreq_policy *policy)
> return cpufreq_frequency_table_cpuinfo(policy, exynos_info->freq_table);
>  }
>
> +static int exynos_cpufreq_cpu_exit(struct cpufreq_policy *policy)
> +{
> +   cpufreq_frequency_table_put_attr(policy->cpu);
> +   return 0;
> +}
> +
> +static struct freq_attr *exynos_cpufreq_attr[] = {
> +   &cpufreq_freq_attr_scaling_available_freqs,
> +   NULL,
> +};
> +

This change looks fine. I believe it was posted before as well but did
not make it into mainline.
Reviewed-by: Amit Daniel Kachhap

Thanks,
Amit Daniel
>  static struct cpufreq_driver exynos_driver = {
> .flags  = CPUFREQ_STICKY,
> .verify = exynos_verify_speed,
> .target = exynos_target,
> .get= exynos_getspeed,
> .init   = exynos_cpufreq_cpu_init,
> +   .exit   = exynos_cpufreq_cpu_exit,
> .name   = "exynos_cpufreq",
> +   .attr   = exynos_cpufreq_attr,
>  #ifdef CONFIG_PM
> .suspend= exynos_cpufreq_suspend,
> .resume = exynos_cpufreq_resume,
> --
> 1.7.9.5
>


Re: [PATCH 2/5 RESEND] thermal: exynos: Miscellaneous fixes to support falling threshold interrupt

2013-01-08 Thread amit kachhap
Hi Joe,

Thanks for the review. Will re-post with your suggestions.

On Sun, Jan 6, 2013 at 3:55 PM, Joe Perches  wrote:
> On Sun, 2013-01-06 at 15:50 -0800, Amit Daniel Kachhap wrote:
>> Below fixes are done to support falling threshold interrupt,
>> * Falling interrupt status macro corrected according to exynos5 data sheet.
>> * The get trend function modified to calculate trip temperature correctly.
>> * The clearing of interrupt status in the isr is now done after handling
>>   the event that caused the interrupt.
> []
>> diff --git a/drivers/thermal/exynos_thermal.c b/drivers/thermal/exynos_thermal.c
> []
>> @@ -373,12 +373,19 @@ static int exynos_get_temp(struct thermal_zone_device *thermal,
>>  static int exynos_get_trend(struct thermal_zone_device *thermal,
>>   int trip, enum thermal_trend *trend)
>>  {
>> - if (thermal->temperature >= trip)
>> + int ret = 0;
Yes agreed. Will modify it.
>
> Unnecessary initialization
>
>> + unsigned long trip_temp;
>> +
>> + ret = exynos_get_trip_temp(thermal, trip, &trip_temp);
>> + if (ret < 0)
>> + return ret;
>> +
>> + if (thermal->temperature >= trip_temp)
>>   *trend = THERMAL_TREND_RAISING;
>>   else
>>   *trend = THERMAL_TREND_DROPPING;
>
> THERMAL_TREND_STABLE ?
Only 2 trends are sufficient. The temperature stays stable for some time,
as the falling threshold interrupt is some units below the trip temp.
>
>>
>> - return 0;
>> + return ret;
Ok agreed
>
> return 0 is clearer.
>
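
For reference, the review points above (no initializer on ret, compare
against the trip temperature rather than the trip index, plain return 0)
can be tried out in a small userspace model. This is only a sketch: the
struct zone type, the zone_* helpers, the enum names and the millidegree
values are hypothetical stand-ins, not the thermal framework's API.

```c
#include <assert.h>
#include <errno.h>

enum trend { TREND_RAISING, TREND_DROPPING };	/* hypothetical names */

struct zone {
	long temperature;	/* current reading, millidegrees C */
	const long *trip_temps;	/* trip temperatures, same unit */
	int nr_trips;
};

/* Look up the temperature for trip index 'trip'. */
int zone_trip_temp(const struct zone *z, int trip, long *temp)
{
	if (trip < 0 || trip >= z->nr_trips)
		return -EINVAL;
	*temp = z->trip_temps[trip];
	return 0;
}

int zone_get_trend(const struct zone *z, int trip, enum trend *trend)
{
	long trip_temp;
	int ret;		/* assigned below, so no initializer */

	assert(z && trend);
	ret = zone_trip_temp(z, trip, &trip_temp);
	if (ret < 0)
		return ret;

	/* Compare against the trip temperature, not the trip index. */
	*trend = z->temperature >= trip_temp ? TREND_RAISING : TREND_DROPPING;
	return 0;
}
```

Comparing the reading with the index itself, as the pre-patch code did,
would report a raising trend for practically any temperature.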


Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)

2013-01-08 Thread Greg KH
On Wed, Jan 09, 2013 at 01:42:39PM +1000, Dave Airlie wrote:
> >> Hi all,
> >>
> >> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
> >>
> >> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> >> [11868.414655] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state
> >> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... GPU hung
> >> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring wedged!
> >> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
> >> [11883.083225] gnome-shell[19396]: segfault at 218 ip 7feef5f32333 sp 7c1dc930 error 4 in i965_dri.so[7feef5ecb000+d]
> >
> > I just hit this again.  And, as the kernel was asking for it, attached
> > is the i915_error_state file, compressed due to the size of it.
> >
> Welcome to the sinkhole that is
> https://bugs.freedesktop.org/show_bug.cgi?id=55984
> 
> 3 months and ticking, Intel guys are all running away from it saying
> they can't reproduce, everyone else on planet seems to reproduce quite
> easily.
> 
> It's generally considered a bug in the relocation/shrinker/no idea category,

Ugh, what a mess.

> Assuming you have an Ironlake machine which I'm going to guess you do.

I don't know, it's an old i5 machine that has never had any video
problems for many years now.  How do I tell?

thanks,

greg k-h


Re: oops in copy_page_rep()

2013-01-08 Thread Hugh Dickins
On Tue, 8 Jan 2013, Andrea Arcangeli wrote:
> 
> Looking at this, one thing that isn't clear is where the page_count is
> checked in migrate_misplaced_transhuge_page. Ok that it's unable to
> migrate anon transhuge COW shared pages so it doesn't need to mess
> with rmap (the mapcount check makes it safe), but it shouldn't be
> allowed to migrate memory that has gup direct-IO in flight (and that
> can only be detected with a page_count vs mapcount check). Real
> migrate does page_freeze_refs to be safe. Mel comments?

Yes, I protested to Mel about that before the holidays, and he
quickly provided a patch, now in akpm's tree; but checking it again
today, I believe it's still not quite right yet - see other mail.

Hugh


Re: [PATCH] mm: migrate: Check page_count of THP before migrating

2013-01-08 Thread Hugh Dickins
On Mon, 7 Jan 2013, Mel Gorman wrote:

> Hugh Dickins pointed out that migrate_misplaced_transhuge_page() does not
> check page_count before migrating like base page migration and khugepaged. He
> could not see why this was safe and he is right.
> 
> The potential impact of the bug is avoided due to the limitations of NUMA
> balancing.  The page_mapcount() check ensures that only a single address
> space is using this page and as THPs are typically private it should not be
> possible for another address space to fault it in parallel. If the address
> space has one associated task then it's difficult to have both a GUP pin
> and be referencing the page at the same time. If there are multiple tasks
> then a buggy scenario requires that another thread be accessing the page
> while the direct IO is in flight. This is dodgy behaviour as there is
> a possibility of corruption with or without THP migration. It would be
> difficult to identify the corruption as being a migration bug.
> 
> While we happen to be safe for the most part it is shoddy to depend on
> such "safety" so this patch checks the page count similar to anonymous
> pages. Note that this does not mean that the page_mapcount() check can go
> away. If we were to remove the page_mapcount() check then the THP would
> have to be unmapped from all referencing PTEs, replaced with migration
> PTEs and restored properly afterwards.
> 
> Reported-by: Hugh Dickins 
> Signed-off-by: Mel Gorman 

Sorry, Mel, it's a NAK: you will have expected an ack from me two weeks
or more ago; but somehow I had an intuition that if I sat on it for
long enough, a worm would crawl out.  Got down to looking again today,
and I notice that although the putback_lru_page() is right,
NR_ISOLATED_ANON is not restored on this path, so that would leak.

I expect you'll want to do something like:
	if (isolated) {
		putback_lru_page(page);
		isolated = 0;
		goto out;
	}
and that may be the appropriate fix right now.

But I do still dislike the way you always put_page in
numamigrate_isolate_page(): it makes sense in the case when
isolate_lru_page() succeeds (I've long thought that weird both to
insist on an existing page reference and add one of its own), but
I find it very confusing on the failure paths, to have the put_page
far away from the unlock_page - and I get worried when I see put_page
followed by unlock_page rather than vice versa (it happens on !pmd_same
paths: if the pmd is not the same, then can we be sure that the put_page
does not free the page?)

At the bottom I've put my own cleanup for this, which simplifies by
doing the putback_lru_page() inside numamigrate_isolate_page(), and
doesn't put_page when it doesn't isolate.

I think the only functional difference from yours (aside from fixing
up NR_ISOLATED) is that migrate_misplaced_transhuge_page() doesn't
have to pretend to its caller that it succeeded when actually it
failed at the last hurdle (because it already did the unlock_page,
which in yours the caller expects to do on failure).  Oh, and I'm
not holding page lock (sometimes) at clear_pmdnuma: I didn't see
the reason for that, perhaps I'm missing something important there.

Maybe our tastes differ, and you won't see mine as an improvement.
And I've hardly tested, so haven't signed off, and won't be
surprised if its own worms crawl out.

Hugh

> ---
>  mm/migrate.c |   11 ++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 3b676b0..f466827 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1679,9 +1679,18 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>   page_xchg_last_nid(new_page, page_last_nid(page));
>  
>   isolated = numamigrate_isolate_page(pgdat, page);
> - if (!isolated) {
> +
> + /*
> +  * Failing to isolate or a GUP pin prevents migration. The expected
> +  * page count is 2. 1 for anonymous pages without a mapping and 1
> +  * for the callers pin. If the page was isolated, the page will
> +  * need to be put back on the LRU.
> +  */
> + if (!isolated || page_count(page) != 2) {
>   count_vm_events(PGMIGRATE_FAIL, HPAGE_PMD_NR);
>   put_page(new_page);
> + if (isolated)
> + putback_lru_page(page);
>   goto out_keep_locked;
>   }

Not-signed-off-by: Hugh Dickins 
---

 mm/huge_memory.c |   28 +--
 mm/migrate.c |   79 ++---
 2 files changed, 48 insertions(+), 59 deletions(-)

--- 3.8-rc2/mm/huge_memory.c	2012-12-22 09:43:27.616015582 -0800
+++ linux/mm/huge_memory.c	2013-01-08 17:39:06.340407864 -0800
@@ -1298,7 +1298,6 @@ int do_huge_pmd_numa_page(struct mm_stru
int target_nid;
int current_nid = -1;
bool migrated;
-   bool page_locked = false;
 
spin_lock(&mm->page_table_lock);
if (unlikely(!pmd_same(pmd, 

Re: [block] allow blk_flush_policy to return REQ_FSEQ_DATA independent of *FLUSH

2013-01-08 Thread Ajith Kumar
Hello,
Thanks for the response.
A block device driver decides during initialization whether it supports
FLUSH/FUA.  If the driver claims FLUSH/FUA support, then any bio targeted
at it with the FLUSH bit set (which file systems such as XFS commonly set
for their internal logging) is vulnerable to data corruption in case of an
abrupt shutdown.  So, IMO, the vulnerability is not limited to the rare
window where the driver changes q->flush_flags while IO is in flight; it
exists for a much larger window, from the time the driver comes up and
throughout its lifetime.

Thanks,
Ajith

On Wednesday, 9 January 2013 00:15:31 UTC+5:30, Tejun Heo  wrote:
> Hello,
>
> On Tue, Jan 08, 2013 at 10:04:23AM -0800, ajithb.ku...@gmail.com wrote:
> > Hi,
> > Could you please provide clarity on the following.
> > ">   Hmmm... yes, this can become a correctness issue if (and only if)
> > >   blk_queue_flush() is called to change q->flush_flags while requests
> > >   are in-flight;"
> >
> > Could you please clarify as to why is it a correctness issue only if
> > blk_queue_flush() is used to change flush_flags when requests are in
> > flight ?  As I understand, XFS does set WRITE_FLUSH_FUA flag in
> > _xfs_buf_ioapply() function irrespective of whether the underlying
> > device supports flush capabilities or not which will flow into
> > blk_insert_flush().  Is my reading of the code correct and is there
> > a general correctness issue here which potentially results in XFS
> > file system corruption in case of an abrupt shutdown independent of
> > q->flush_flags getting changed while request is in flight.
>
> My memory is kinda fuzzy at this point but if a queue doesn't support
> flush, its flush_flags should be zero and
> generic_make_request_checks() will clear REQ_FLUSH|REQ_FUA from
> bio->bi_rw so we never hit blk_insert_flush() and the request will be
> processed as a normal IO one; however, if REQ_FLUSH goes off after a
> request passed generic_make_request_checks() but before
> blk_flush_policy(), it'll become null op and its data payload won't
> get written out to the underlying device, which is data corruption.
>
> Thanks.
>
> -- 
> tejun
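
The window described in the quoted reply can be replayed in a small
userspace model. This is a sketch under assumptions: the RQ_* and SEQ_*
constants, struct req, and both functions are stand-ins invented here,
and the data_always switch is only a way to compare the two behaviours
the thread subject implies (data step tied to the queue's FLUSH flag
versus data step independent of it); it is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the request and flush-sequence flags. */
enum { RQ_FLUSH = 1 << 0, RQ_FUA = 1 << 1 };
enum { SEQ_PREFLUSH = 1 << 0, SEQ_DATA = 1 << 1, SEQ_POSTFLUSH = 1 << 2 };

struct req {
	unsigned int flags;	/* RQ_FLUSH / RQ_FUA as set by the submitter */
	unsigned int sectors;	/* size of the data payload, 0 for none */
};

/* Submission-time check: a queue without flush support strips the bits. */
void submit_checks(unsigned int queue_flags, struct req *rq)
{
	assert(rq);
	if ((rq->flags & (RQ_FLUSH | RQ_FUA)) && !queue_flags)
		rq->flags &= ~(unsigned int)(RQ_FLUSH | RQ_FUA);
}

/*
 * Decide which steps a flush request needs.  With data_always == false
 * the data step is only scheduled when the queue advertises RQ_FLUSH;
 * with data_always == true it depends on the payload alone.
 */
unsigned int flush_policy(unsigned int queue_flags, const struct req *rq,
			  bool data_always)
{
	unsigned int policy = 0;

	assert(rq);
	if (data_always && rq->sectors)
		policy |= SEQ_DATA;

	if (queue_flags & RQ_FLUSH) {
		if (rq->flags & RQ_FLUSH)
			policy |= SEQ_PREFLUSH;
		if (!data_always && rq->sectors)
			policy |= SEQ_DATA;
		if (!(queue_flags & RQ_FUA) && (rq->flags & RQ_FUA))
			policy |= SEQ_POSTFLUSH;
	}
	return policy;
}
```

If the queue's flags drop to zero after submit_checks() has let the
request through, the first variant returns an empty policy even though
the request carries data, which is the lost-payload case; the second
variant still schedules the data step.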


linux-next: Tree for Jan 9

2013-01-08 Thread Stephen Rothwell
Hi all,

Changes since 20130108:

The staging tree lost its build failure.

The tegra tree gained a conflict against the arm-perf tree.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 214 trees (counting Linus' and 28 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (ed2c891 Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound)
Merging fixes/master (d287b87 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs)
Merging kbuild-current/rc-fixes (bad9955 menuconfig: Replace CIRCLEQ by list_head-style lists.)
Merging arm-current/fixes (d106de3 ARM: 7614/1: mm: fix wrong branch from Cortex-A9 to PJ4b)
Merging m68k-current/for-linus (e7e29b4 m68k: Wire up finit_module)
Merging powerpc-merge/merge (e6449c9 powerpc: Add missing NULL terminator to avoid boot panic on PPC40x)
Merging sparc/master (4e4d78f sparc: Hook up finit_module syscall.)
Merging net/master (ed2c891 Merge tag 'sound-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound)
Merging sound-current/for-linus (6d3cd5d ALSA: hda - add mute LED for HP Pavilion 17 (Realtek codec))
Merging pci-current/for-linus (56d0da4 PCI/AER: pci_get_domain_bus_and_slot() call missing required pci_dev_put())
Merging wireless/master (5e20a4b b43: Fix firmware loading when driver is built into the kernel)
Merging driver-core.current/driver-core-linus (4956964 Merge tag 'driver-core-3.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core)
Merging tty.current/tty-linus (d1c3ed6 Linux 3.8-rc2)
Merging usb.current/usb-linus (75e1a2a USB: ehci: make debug port in-use detection functional again)
Merging staging.current/staging-linus (e16a922 staging: tidspbridge: use prepare/unprepare on dsp clocks)
Merging char-misc.current/char-misc-linus (e6028db mei: fix mismatch in mutex unlock-lock in mei_amthif_read())
Merging input-current/for-linus (bec7a4b Input: lm8323 - fix checking PWM interrupt status)
Merging md-current/for-linus (a9add5d md/raid5: add blktrace calls)
Merging audit-current/for-linus (c158a35 audit: no leading space in audit_log_d_path prefix)
Merging crypto-current/master (a2c0911 crypto: caam - Updated SEC-4.0 device tree binding for ERA information.)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (084a0ec x86: add CONFIG_X86_MOVBE option)
CONFLICT (content): Merge conflict in arch/x86/Kconfig
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to inline functions)
Merging irqdomain-current/irqdomain/merge (a0d271c Linux 3.6)
Merging devicetree-current/devicetree/merge (ab28698 of: define struct device in of_platform.h if !OF_DEVICE and !OF_ADDRESS)
Merging spi-current/spi/merge (d3601e5 spi/sh-hspi: fix return value check in hspi_probe().)
Merging gpio-current/gpio/merge (bc1008c gpio/mvebu-gpio: Make mvebu-gpio depend on OF_CONFIG)
Merging rr-fixes/fixes (52441fa module: prevent warning when finit_module a 0 sized file)
Merging asm-generic/master (fb9de7e xtensa: Use generic asm/mmu.h for nommu)
Merging arm/for-next (32887f3 Merge branch 'fixes' into for-ne

[PATCH] usb: gadget: FunctionFS: Fix missing braces in parse_opts

2013-01-08 Thread Benoit Goby
Add missing braces around an if block in ffs_fs_parse_opts. This broke
parsing the uid/gid mount options and causes mount to fail when using
uid/gid. This has been introduced by commit b9b73f7c (userns: Convert usb
functionfs to use kuid/kgid where appropriate) in 3.7.

Signed-off-by: Benoit Goby 
---
 drivers/usb/gadget/f_fs.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/gadget/f_fs.c b/drivers/usb/gadget/f_fs.c
index 4a6961c..8c2f251 100644
--- a/drivers/usb/gadget/f_fs.c
+++ b/drivers/usb/gadget/f_fs.c
@@ -1153,15 +1153,15 @@ static int ffs_fs_parse_opts(struct ffs_sb_fill_data *data, char *opts)
 					pr_err("%s: unmapped value: %lu\n", opts, value);
 					return -EINVAL;
 				}
-			}
-			else if (!memcmp(opts, "gid", 3))
+			} else if (!memcmp(opts, "gid", 3)) {
 				data->perms.gid = make_kgid(current_user_ns(), value);
 				if (!gid_valid(data->perms.gid)) {
 					pr_err("%s: unmapped value: %lu\n", opts, value);
 					return -EINVAL;
 				}
-			else
+			} else {
 				goto invalid;
+			}
 			break;
 
 		default:
-- 
1.7.7.3
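
For readers wondering why the missing braces break both options, here is
a userspace model of the two shapes. It is a sketch only: struct perms,
id_valid(), the enum and both function names are hypothetical, and the
first function is indented the way the compiler parses the pre-patch
code rather than the way it was written.

```c
#include <assert.h>
#include <string.h>

enum parse_result { PARSE_OK, PARSE_UNMAPPED, PARSE_INVALID };

struct perms {
	unsigned long uid, gid;
};

/* Hypothetical stand-in for uid_valid()/gid_valid(). */
static int id_valid(unsigned long id)
{
	return id != (unsigned long)-1;
}

/* Pre-patch shape: the else-if guards a single statement. */
enum parse_result parse_unbraced(struct perms *p, const char *opt,
				 unsigned long value)
{
	assert(strlen(opt) >= 3);
	if (!memcmp(opt, "uid", 3)) {
		p->uid = value;
		if (!id_valid(p->uid))
			return PARSE_UNMAPPED;
	} else if (!memcmp(opt, "gid", 3))
		p->gid = value;	/* the only statement the else-if guards */

	/* Everything below runs for every option, "uid" included. */
	if (!id_valid(p->gid))
		return PARSE_UNMAPPED;
	else
		return PARSE_INVALID;	/* was "goto invalid" */
}

/* Patched shape: each branch is braced, so the final else is reachable
 * only for unknown options. */
enum parse_result parse_braced(struct perms *p, const char *opt,
			       unsigned long value)
{
	assert(strlen(opt) >= 3);
	if (!memcmp(opt, "uid", 3)) {
		p->uid = value;
		if (!id_valid(p->uid))
			return PARSE_UNMAPPED;
	} else if (!memcmp(opt, "gid", 3)) {
		p->gid = value;
		if (!id_valid(p->gid))
			return PARSE_UNMAPPED;
	} else {
		return PARSE_INVALID;
	}
	return PARSE_OK;
}
```

In the unbraced shape no three-letter option can succeed: the trailing
else belongs to the gid validity check, so a valid gid sends every call
down the "invalid" path.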



rwlock_t unfairness and tasklist_lock

2013-01-08 Thread Michel Lespinasse
Like others before me, I have discovered how easy it is to DOS a
system by abusing the rwlock_t unfairness and causing the
tasklist_lock read side to be continuously held (my abuse code makes
use of the getpriority syscall, but there are plenty of other ways
anyway).

My understanding is that the issue of rwlock_t fairness has come up
several times over the last 10 years (I first saw a fair rwlock_t
proposal by David Howells 10 years ago,
https://lkml.org/lkml/2002/11/8/102), and every time the answer has
been that we can't easily change this because tasklist_lock makes use
of the read-side reentrancy and interruptibility properties of
rwlock_t, and that we should really find something smart to do about
tasklist_lock. Yet that last part never gets done, and the problem is
still with us.

I am wondering:

- Does anyone know of any current work towards removing the
tasklist_lock use of rwlock_t ? Thomas Gleixner mentioned 3 years ago
that he'd give it a shot (https://lwn.net/Articles/364601/), did he
encounter some unforeseen difficulty that we should learn from ?

- Would there be any fundamental objection to implementing a fair
rwlock_t and dealing with the reentrancy issues in tasklist_lock ? My
proposal there would be along the lines of:

1- implement a fair rwlock_t - the ticket based idea from David
Howells seems quite appropriate to me

2- if any places use reader side reentrancy within the same context,
adjust the code as needed to get rid of that reentrancy

3- a simple way to deal with reentrancy between contexts (as in, we
take the tasklist_lock read side in process context, get interrupted,
and we now need to take it again in interrupt or softirq context)
would be to have different locks depending on context. tasklist_lock
read side in process context would work as usual, but in irq or softirq
contexts we'd take tasklist_irq_lock instead (and, if there are any
irq handlers taking tasklist_lock read side, we'd have to disable
interrupt handling when tasklist_irq_lock is held to avoid further
nesting). tasklist_lock write side - that is, mainly fork() and exec()
- would have to take both tasklist_lock and tasklist_irq_lock, in that
order.

While it might seem to be a downside that tasklist_lock write side
would now have to take both tasklist_lock and tasklist_irq_lock, I
must note that this wouldn't increase the number of atomic operations:
the current rwlock_t implementation uses atomics on both lock and
unlock, while the ticket based one would only need atomics on the lock
side (unlock is just a regular mov instruction), so the total cost
should be comparable to what we have now.
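
To illustrate step 1, here is a minimal ticket-based fair rwlock written
with C11 atomics. It is a generic textbook-style sketch, not David
Howells' proposal and not kernel code: the struct and function names are
made up, it spins without any cpu_relax-style pause, and for simplicity
it uses atomic increments on unlock as well, unlike the plain-store
unlock mentioned above.

```c
#include <assert.h>
#include <stdatomic.h>

struct ticket_rwlock {
	atomic_uint next;	/* next ticket to hand out */
	atomic_uint read_ok;	/* ticket allowed to start reading */
	atomic_uint write_ok;	/* ticket allowed to start writing */
};

void ticket_rwlock_init(struct ticket_rwlock *l)
{
	atomic_init(&l->next, 0);
	atomic_init(&l->read_ok, 0);
	atomic_init(&l->write_ok, 0);
}

void ticket_read_lock(struct ticket_rwlock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->read_ok) != me)
		;	/* an earlier writer holds or awaits the lock */
	atomic_fetch_add(&l->read_ok, 1);	/* admit the next reader */
}

void ticket_read_unlock(struct ticket_rwlock *l)
{
	atomic_fetch_add(&l->write_ok, 1);
}

void ticket_write_lock(struct ticket_rwlock *l)
{
	unsigned int me = atomic_fetch_add(&l->next, 1);

	while (atomic_load(&l->write_ok) != me)
		;	/* wait for every earlier ticket to unlock */
}

void ticket_write_unlock(struct ticket_rwlock *l)
{
	/* A writer holds the lock alone, so both counters sit at its ticket. */
	assert(atomic_load(&l->read_ok) == atomic_load(&l->write_ok));
	atomic_fetch_add(&l->read_ok, 1);
	atomic_fetch_add(&l->write_ok, 1);
}
```

Because tickets are served in order, a reader that arrives after a
waiting writer queues behind it. That is also why read-side reentrancy
becomes a problem: a task that already holds the read lock and takes it
again while a writer is queued would wait for that writer, which in turn
waits for the task's first unlock.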

Any comments about this proposal ?

(I should note that I haven't given much thought to tasklist_lock
before, and I'm not quite sure just from code inspection which read
locks are run in which context...)

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Wong
Eric Dumazet  wrote:
> On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote:
> > Hmm, it seems sk_filter() can return -ENOMEM because skb has the
> > pfmemalloc() set.
> 
> > 
> > One TCP socket keeps retransmitting an SKB via loopback, and TCP stack 
> > drops the packet again and again.
> 
> sock_init_data() sets sk->sk_allocation to GFP_KERNEL
> 
> Shouldnt it use (GFP_KERNEL | __GFP_NOMEMALLOC) instead ?

Thanks, things are running good after ~35 minutes so far.
Will report back if things break (hopefully I don't run out
of laptop battery power :x).

I'm now getting allocation failure warnings (which I don't believe
happened before, and should be expected, I think...)

kworker/1:1: page allocation failure: order:0, mode:0x20
Pid: 236, comm: kworker/1:1 Not tainted 3.8.0-rc2w5+ #76
Call Trace:
   [] warn_alloc_failed+0xe1/0x130
 [] __alloc_pages_nodemask+0x5e9/0x840
 [] ? ip_rcv+0x24d/0x340
 [] ? sg_init_table+0x23/0x50
 [] get_a_page.isra.25+0x3a/0x40 [virtio_net]
 [] try_fill_recv+0x318/0x4a0 [virtio_net]
 [] virtnet_poll+0x3dd/0x610 [virtio_net]
 [] net_rx_action+0x9d/0x1a0
 [] __do_softirq+0xba/0x170
 [] call_softirq+0x1c/0x30
   [] do_softirq+0x6d/0xa0
 [] local_bh_enable+0x94/0xa0
 [] __cond_resched_softirq+0x35/0x50
 [] release_sock+0x9c/0x150
 [] tcp_sendmsg+0x11e/0xd80
 [] inet_sendmsg+0x5e/0xa0
 [] sock_sendmsg+0x87/0xa0
 [] ? __free_memcg_kmem_pages+0x9/0x10
 [] ? select_task_rq_fair+0x699/0x6b0
 [] kernel_sendmsg+0x3b/0x50
 [] xs_send_kvec+0x89/0xa0 [sunrpc]
 [] xs_sendpages+0x5f/0x1e0 [sunrpc]
 [] ? lock_timer_base.isra.32+0x33/0x60
 [] xs_tcp_send_request+0x57/0x110 [sunrpc]
 [] xprt_transmit+0x6d/0x260 [sunrpc]
 [] call_transmit+0x1a8/0x240 [sunrpc]
 [] __rpc_execute+0x56/0x250 [sunrpc]
 [] rpc_async_schedule+0x25/0x40 [sunrpc]
 [] process_one_work+0x12c/0x480
 [] ? __rpc_execute+0x250/0x250 [sunrpc]
 [] worker_thread+0x15d/0x460
 [] ? flush_delayed_work+0x60/0x60
 [] kthread+0xbb/0xc0
 [] ? kthread_create_on_node+0x120/0x120
 [] ret_from_fork+0x7c/0xb0
 [] ? kthread_create_on_node+0x120/0x120
Mem-Info:
DMA per-cpu:
CPU0: hi:0, btch:   1 usd:   0
CPU1: hi:0, btch:   1 usd:   0
DMA32 per-cpu:
CPU0: hi:  186, btch:  31 usd:   0
CPU1: hi:  186, btch:  31 usd:   0
active_anon:3620 inactive_anon:3624 isolated_anon:0
 active_file:4290 inactive_file:101218 isolated_file:0
 unevictable:0 dirty:2306 writeback:0 unstable:0
 free:1711 slab_reclaimable:1529 slab_unreclaimable:5796
 mapped:2325 shmem:66 pagetables:759 bounce:0
 free_cma:0
DMA free:2012kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:4kB inactive_file:13624kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15644kB managed:15900kB mlocked:0kB dirty:244kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:16kB slab_unreclaimable:80kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 488 488 488
DMA32 free:4832kB min:2784kB low:3480kB high:4176kB active_anon:14480kB inactive_anon:14496kB active_file:17156kB inactive_file:391248kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:499960kB managed:491256kB mlocked:0kB dirty:8980kB writeback:0kB mapped:9300kB shmem:264kB slab_reclaimable:6100kB slab_unreclaimable:23104kB kernel_stack:1336kB pagetables:3036kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 2*4kB (U) 1*8kB (R) 1*16kB (U) 0*32kB 1*64kB (R) 1*128kB (R) 1*256kB (R) 1*512kB (R) 1*1024kB (R) 0*2048kB 0*4096kB = 2016kB
DMA32: 207*4kB (UEM) 116*8kB (UEM) 32*16kB (UM) 58*32kB (UM) 13*64kB (UM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4956kB
108890 total pagecache pages
3302 pages in swap cache
Swap cache stats: add 4086, delete 784, find 494/535
Free swap  = 378980kB
Total swap = 392188kB
131054 pages RAM
3820 pages reserved
541743 pages shared
117221 pages non-shared
cat: page allocation failure: order:0, mode:0x20
Pid: 23684, comm: cat Not tainted 3.8.0-rc2w5+ #76
Call Trace:
   [] warn_alloc_failed+0xe1/0x130
 [] __alloc_pages_nodemask+0x5e9/0x840
 [] ? ip_rcv+0x24d/0x340
 [] ? sg_init_table+0x23/0x50
 [] get_a_page.isra.25+0x3a/0x40 [virtio_net]
 [] try_fill_recv+0x318/0x4a0 [virtio_net]
 [] virtnet_poll+0x3dd/0x610 [virtio_net]
 [] net_rx_action+0x9d/0x1a0
 [] __do_softirq+0xba/0x170
 [] call_softirq+0x1c/0x30
 [] do_softirq+0x6d/0xa0
 [] irq_exit+0xa5/0xb0
 [] do_IRQ+0x5e/0xd0
 [] common_interrupt+0x6d/0x6d
   [] ? _raw_spin_unlock_irqrestore+0xc/0x20
 [] pagevec_lru_move_fn+0xb6/0xe0
 [] ? compound_unlock_irqrestore+0x20/0x20
 [] ? nfs_read_completion+0x190/0x190 [nfs]
 [] __pagevec_lru_add+0x17/0x20
 [] __lru_cache_add+0x68/0x90
 [] add_to_page_cache_lru+0x29/0x40
 [] read_cache_pages+0x6c/0x100
 [] nfs_readpages+0xcc/0x160 [nfs]
 [] __do_page_cache_readahead+0x1c7/0x280
 [] ra_submit+0x1c/0x20
 [] ondemand_readahead+0x12d/0x250
 [] ? 

Re: tg3 v3.123 in 100Mbps Full-Duplex mode with autoneg off

2013-01-08 Thread Michael Chan
Please tell us what tg3 device you're using.  You can provide lspci
output or tg3 probing dmesg output.  Thanks.

On Wed, 2013-01-09 at 11:20 +0800, 王金浦 wrote: 
> For this kind of driver related question, I suggest you send mail to
> tg3 driver developers, who I already cced.
> 
> 
> I think they should know what's going on, right?
> 
> 
> Jack
> 
> 
> 2013/1/8 Marcin Miotk 
> Hi,
> 
> any conclusions re this? 
> 
> Regards,
> Marcin Miotk
> --
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at
>  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
> 
> 
> 



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: i915_hangcheck_hung problem with 3.8-rc2+ (Linus's latest tree)

2013-01-08 Thread Dave Airlie
>> Hi all,
>>
>> I've hit this 3 times today on Linus's latest 3.8-rc2+ tree:
>>
>> [11868.414648] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... 
>> GPU hung
>> [11868.414655] [drm] capturing error event; look for more information in 
>> /debug/dri/0/i915_error_state
>> [11870.408342] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... 
>> GPU hung
>> [11870.408412] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring 
>> wedged!
>> [11870.408414] [drm:i915_reset] *ERROR* Failed to reset chip.
>> [11883.083225] gnome-shell[19396]: segfault at 218 ip 7feef5f32333 sp 
>> 7c1dc930 error 4 in i965_dri.so[7feef5ecb000+d]
>
> I just hit this again.  And, as the kernel was asking for it, attached
> is the i915_error_state file, compressed due to the size of it.
>
Welcome to the sinkhole that is
https://bugs.freedesktop.org/show_bug.cgi?id=55984

3 months and ticking. The Intel guys are all running away from it saying
they can't reproduce it; everyone else on the planet seems to reproduce it
quite easily.

It's generally considered a bug in the relocation/shrinker/no-idea category.

Assuming you have an Ironlake machine, which I'm going to guess you do.

Dave.


[PATCH] fs: Disable preempt when acquire i_size_seqcount write lock

2013-01-08 Thread Fan Du
Two rt tasks are bound to one CPU core.

The higher priority rt task A preempts the lower priority rt task B, which
has already taken the write seq lock. When the higher priority rt task A
then tries to acquire the read seq lock, it is doomed to lock up.

rt task B with lower priority: call write
  i_size_write
    write_seqcount_begin(&inode->i_size_seqcount);
    inode->i_size = i_size;
                      rt task A with higher priority: call sync, and preempt task B
                        i_size_read
                          read_seqcount_begin <-- lockup here...

So disabling preemption around every i_size_seqcount *write* lock will
cure the problem.

Signed-off-by: Fan Du 
---
 include/linux/fs.h |2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index db84f77..1b69e87 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -758,9 +758,11 @@ static inline loff_t i_size_read(const struct inode *inode)
 static inline void i_size_write(struct inode *inode, loff_t i_size)
 {
 #if BITS_PER_LONG==32 && defined(CONFIG_SMP)
+	preempt_disable();
 	write_seqcount_begin(&inode->i_size_seqcount);
 	inode->i_size = i_size;
 	write_seqcount_end(&inode->i_size_seqcount);
+	preempt_enable();
 #elif BITS_PER_LONG==32 && defined(CONFIG_PREEMPT)
 	preempt_disable();
 	inode->i_size = i_size;
-- 
1.7.0.5
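
To illustrate why the reader in the scenario above can never make progress, here is a minimal userspace sketch of a sequence counter. It is only a model, not the kernel's seqcount_t: the names seqcount_model, model_write_begin(), model_write_end() and model_read_begin_bounded() are hypothetical, and the bounded spin stands in for the kernel's unbounded retry loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model of a sequence counter (hypothetical, not seqcount_t). */
struct seqcount_model {
	unsigned int sequence;
};

/* Writer entry: the counter becomes odd, meaning "write in progress". */
static void model_write_begin(struct seqcount_model *s)
{
	s->sequence++;
}

/* Writer exit: the counter becomes even again, meaning "stable". */
static void model_write_end(struct seqcount_model *s)
{
	s->sequence++;
}

/* A reader may only start while the counter is even. */
static bool model_reader_must_wait(const struct seqcount_model *s)
{
	return (s->sequence & 1u) != 0;
}

/*
 * Bounded stand-in for the reader's retry loop.  Returns the number of
 * spins before the reader could start, or -1 if the writer never
 * finished within max_spins.  On a single CPU, with the writer preempted
 * by a higher priority reader, the writer never runs again, so the real
 * loop would spin forever.
 */
static int model_read_begin_bounded(const struct seqcount_model *s,
				    int max_spins)
{
	assert(s != NULL && max_spins >= 0);
	for (int spins = 0; spins <= max_spins; spins++) {
		if (!model_reader_must_wait(s))
			return spins;
	}
	return -1;
}
```

In this model, a reader that starts between model_write_begin() and model_write_end() gets -1 no matter how large max_spins is, which is the situation the patch tries to rule out by keeping the writer from being preempted.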



Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-08 Thread Benjamin Herrenschmidt
On Tue, 2013-01-08 at 18:38 -0800, Michel Lespinasse wrote:
> 
> Well no fair, the previous patch (for powerpc as well) has 22
> insertions and 93 deletions :)
> 
> The benefit is that the new code has lower algorithmic complexity, it
> replaces a per-vma loop with O(N) complexity with an outer loop that
> finds contiguous slice blocks and passes them to vm_unmapped_area()
> which is only O(log N) complexity. So the new code will be faster for
> workloads which use lots of vmas.
> 
> That said, I do agree that the code that looks for contiguous
> available slices looks kinda ugly - just not sure how to make it look
> nicer though.

Ok. I think at least you can move that construct:

+   if (addr < SLICE_LOW_TOP) {
+   slice = GET_LOW_SLICE_INDEX(addr);
+   addr = (slice + 1) << SLICE_LOW_SHIFT;
+   if (!(available.low_slices & (1u << slice)))
+   continue;
+   } else {
+   slice = GET_HIGH_SLICE_INDEX(addr);
+   addr = (slice + 1) << SLICE_HIGH_SHIFT;
+   if (!(available.high_slices & (1u << slice)))
+   continue;
+   }

Into some kind of helper. It will probably compile to the same thing but
at least it's more readable and it will avoid a fuckup in the future if
somebody changes the algorithm and forgets to update one of the
copies :-)
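
A rough sketch of what such a helper might look like, written as standalone C so it can be compiled outside the kernel. The helper name slice_scan_available() is hypothetical, and the macro values and the slice_mask layout below are stand-ins chosen for this sketch rather than copied from the powerpc headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in definitions for this sketch only. */
#define SLICE_LOW_SHIFT			28
#define SLICE_LOW_TOP			(UINT64_C(1) << 32)
#define SLICE_HIGH_SHIFT		40
#define GET_LOW_SLICE_INDEX(addr)	((addr) >> SLICE_LOW_SHIFT)
#define GET_HIGH_SLICE_INDEX(addr)	((addr) >> SLICE_HIGH_SHIFT)

struct slice_mask {
	uint16_t low_slices;
	uint16_t high_slices;
};

/*
 * Report whether the slice containing addr is set in the available mask,
 * and store the address of the next slice boundary in *boundary_addr.
 */
static bool slice_scan_available(uint64_t addr, struct slice_mask available,
				 uint64_t *boundary_addr)
{
	uint64_t slice;

	assert(boundary_addr != NULL);
	if (addr < SLICE_LOW_TOP) {
		slice = GET_LOW_SLICE_INDEX(addr);
		*boundary_addr = (slice + 1) << SLICE_LOW_SHIFT;
		return (available.low_slices & (1u << slice)) != 0;
	}
	slice = GET_HIGH_SLICE_INDEX(addr);
	assert(slice < 16);	/* the mask in this sketch holds 16 high slices */
	*boundary_addr = (slice + 1) << SLICE_HIGH_SHIFT;
	return (available.high_slices & (1u << slice)) != 0;
}
```

Each copy of the quoted construct would then shrink to something like "if (!slice_scan_available(addr, available, &addr)) continue;", since addr is passed by value before being overwritten through the pointer.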

Cheers,
Ben.




Re: [GIT PULL] ARM: arm-soc fixes for 3.8-rc

2013-01-08 Thread Olof Johansson
On Tue, Jan 8, 2013 at 7:13 PM, Olof Johansson  wrote:

> Or maybe a better solution is to make git request-pull throw an error
> if there is a local signed tag for the request, but none is found on
> the server (or has different contents). I'll take a look at that.

A-HA! Git does that as of 1.7.11.2 / 1.7.12, my version was just a
little too old.


-Olof


Re: [PATCH] pinctrl: pinctrl-mxs: Fix variables' definition type

2013-01-08 Thread Shawn Guo
On Mon, Jan 07, 2013 at 10:53:49PM -0200, Fabio Estevam wrote:
> From: Fabio Estevam 
> 
> Fix the following warnings when building with W=1 option:
> 
> drivers/pinctrl/pinctrl-mxs.c: In function 'mxs_dt_free_map':
> drivers/pinctrl/pinctrl-mxs.c:151:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> drivers/pinctrl/pinctrl-mxs.c: In function 'mxs_pinctrl_enable':
> drivers/pinctrl/pinctrl-mxs.c:208:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> drivers/pinctrl/pinctrl-mxs.c: In function 'mxs_pinconf_group_set':
> drivers/pinctrl/pinctrl-mxs.c:265:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> drivers/pinctrl/pinctrl-mxs.c: In function 'mxs_pinctrl_parse_group':
> drivers/pinctrl/pinctrl-mxs.c:376:16: warning: comparison between signed and 
> unsigned integer expressions [-Wsign-compare]
> 
> Signed-off-by: Fabio Estevam 

Acked-by: Shawn Guo 
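
For background on why -Wsign-compare is worth listening to, here is a small standalone sketch. It is not taken from pinctrl-mxs.c; the function names are hypothetical. It shows that a mixed signed/unsigned comparison converts the signed operand to unsigned first, so a negative index compares as a very large value.

```c
#include <stdbool.h>

/*
 * What the compiler effectively does for "i < n" when i is int and n is
 * unsigned int: i is converted to unsigned before the comparison.  The
 * cast is spelled out here so the conversion is visible.
 */
static bool mixed_compare_less(int i, unsigned int n)
{
	return (unsigned int)i < n;
}

/* Comparison by mathematical value: any negative i is less than n. */
static bool value_compare_less(int i, unsigned int n)
{
	return i < 0 || (unsigned int)i < n;
}
```

With i = -1 and n = 1 the first function returns false while the second returns true; giving the loop variable the same signedness as the bound it is compared against avoids the question entirely.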



Re: Re: [3.8-rc2] stuck at reading CIFS mounted directory

2013-01-08 Thread 허종만
Hi,

> --- Original Message ---
> Sender : Jeff Layton
> Date : 2013-01-08 00:13 (GMT+09:00)
> Title : Re: [3.8-rc2] stuck at reading CIFS mounted directory
> 
> On Mon, 07 Jan 2013 15:10:05 +0530
> Suresh Jayaraman wrote:
> 
> > (Cc linux-c...@vger.kernel.org)
> > 
> > On 01/04/2013 06:27 AM, Jongman Heo wrote:
> > > Hi, all,
> > > 
> > > In 3.8-rc2, access to CIFS-mounted directory (df, ls, or similar) got 
> > > stuck with following message.
> > > 
> > > It's mounted with...
> > >   mount -t cifs ///Share  /mnt/window -o 
> > > user=jongman.heo,password=,sec=ntlm
> > > 
> > > 
> > > [16655.288591] INFO: task bash:4042 blocked for more than 120 seconds.
> > > [16655.318117] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" 
> > > disables this message.
> > > [16655.318123] bashD dada9c5c 0  4042  1 0x0004
> > > [16655.318132]  dada9cd0 0082 0282 dada9c5c c09022c6 dada9c7c 
> > > c044d316 c0c7c300
> > > [16655.318139]  d6db3a7b 0f09 c0c7c300  0f09 f3b7b240 
> > > c04401ba 
> > > [16655.318145]  c0b9e0d8 f598e960  0303 dada9c98 dada9c98 
> > > f598e960 0006
> > > [16655.318150] Call Trace:
> > > [16655.342785]  [] ? _raw_spin_unlock_irqrestore+0xf/0x11
> > > [16655.351554]  [] ? __wake_up+0x3b/0x42
> > > [16655.358802]  [] ? call_usermodehelper_fns+0x148/0x152
> > > [16655.358840]  [] ? __request_module+0x15e/0x1a1
> > > [16655.358842]  [] ? call_usermodehelper_freeinfo+0x19/0x19
> > > [16655.358845]  [] schedule+0x51/0x53
> > > [16655.358847]  [] schedule_preempt_disabled+0x8/0xa
> > > [16655.384345]  [] __mutex_lock_common+0xd6/0x123
> > > [16655.384430]  [] __mutex_lock_slowpath+0x20/0x22
> > > [16655.384436]  [] ? mutex_lock+0x18/0x25
> > > [16655.384441]  [] mutex_lock+0x18/0x25
> > > [16655.384892]  [] cifs_reconnect_tcon+0x170/0x252
> > > [16655.384953]  [] ? should_resched+0x8/0x22
> > > [16655.384963]  [] ? _cond_resched+0x8/0x1c
> > > [16655.384969]  [] smb_init+0x1d/0x6d
> > > [16655.385023]  [] CIFSSMBQPathInfo+0x4e/0x1e4
> > > [16655.385071]  [] cifs_query_path_info+0x38/0x73
> > > [16655.385080]  [] cifs_get_inode_info+0x122/0x3ac
> > > [16655.385548]  [] ? walk_component+0x14a/0x17a
> > > [16655.385570]  [] ? build_path_from_dentry+0xa3/0x19e
> > > [16655.385585]  [] ? build_path_from_dentry+0xa3/0x19e
> > > [16655.385596]  [] ? build_path_from_dentry+0xa3/0x19e
> > > [16655.385601]  [] ? getname_flags+0x59/0xeb
> > > [16655.385606]  [] ? _raw_spin_lock+0x8/0xa
> > > [16655.385613]  [] cifs_revalidate_dentry_attr+0x120/0x168
> > > [16655.385618]  [] cifs_getattr+0x5e/0xe3
> > > [16655.385625]  [] vfs_getattr+0x37/0x4e
> > > [16655.385631]  [] ? cifs_revalidate_dentry+0x20/0x20
> > > [16655.385639]  [] vfs_fstatat+0x59/0x8a
> > > [16655.385645]  [] vfs_stat+0x19/0x1b
> > > [16655.385652]  [] sys_stat64+0x11/0x22
> > > [16655.385659]  [] ? should_resched+0x8/0x22
> > > [16655.385668]  [] ? _cond_resched+0x8/0x1c
> > > [16655.385674]  [] ? task_work_run+0x6d/0x79
> > > [16655.385825]  [] ? __do_page_fault+0x33b/0x33b
> > > [16655.385834]  [] ? do_page_fault+0x8/0xa
> > > [16655.385840]  [] sysenter_do_call+0x12/0x2c
> > > 
> > > 
> > 
> > 
> 
> Looks like it's waiting on the session_mutex to become free. The
> question is what's holding it and why. Some questions:
> 
> 1) is this a regression? If so, what version were you using previously?

Yeah, regression. IIRC I didn't have this issue with 3.7.

> 2) any other processes stuck on on this mutex? What about the cifsd
> thread for this mount? Is it stuck holding it? You may want to
> "cat /proc//stack" on any other threads that might be related here
> and see if you can figure out what they're doing.

After the hang, CIFS works fine again. The stack of cifsd doesn't show
anything interesting.

[ORANGE@/mnt/window] ps ax | grep cifsd
 1135 ?S  1:13 [cifsd]
[ORANGE@/mnt/window] cat /proc/1135/stack
[] sk_wait_data+0x63/0x9b
[] tcp_recvmsg+0x3aa/0x780
[] inet_recvmsg+0x51/0x63
[] sock_recvmsg+0x80/0x9d
[] kernel_recvmsg+0x2f/0x3f
[] cifs_readv_from_socket+0x142/0x1d3
[] cifs_read_from_socket+0x1c/0x1e
[] cifs_demultiplex_thread+0x701/0x72b
[] kthread+0x6b/0x70
[] ret_from_kernel_thread+0x1b/0x28
[] 0x


Host : Windows 7
Guest : VMWare Fedora 16 + 3.8 custom kernel

I feel that the issue is more likely to happen with the following sequence.

 mount cifs directory from VMWare guest -> go to S3 sleep mode (by closing lid 
of laptop) -> open lid -> check cifs directory of VMWare


Here is another call trace I hit today.

[23245.542488] INFO: task bash:2711 blocked for more than 120 seconds.
[23245.571664] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this 
message.
[23245.571672] bashD f3bd5c5c 0  2711   2689 0x
[23245.580899]  f3bd5cd0 0086 0282 f3bd5c5c 

Re: sched: Consequences of integrating the Per Entity Load Tracking Metric into the Load Balancer

2013-01-08 Thread Preeti U Murthy
 Here comes the point of making both load balancing and wake up
 balance(select_idle_sibling) co operative. How about we always schedule
 the woken up task on the prev_cpu? This seems more sensible considering
 load balancing considers blocked load as being a part of the load of cpu2.
>>>
>>> Hi Preeti,
>>>
>>> I'm not sure that we want such steady state at cores level because we
>>> take advantage of migrating wake up tasks between cores that share
>>> their cache as Matthew demonstrated. But I agree that reaching such
>>> steady state at cluster and CPU level is interesting.
>>>
>>> IMHO, you're right that taking the blocked load into consideration
>>> should minimize task migration between clusters, but it should not
>>> prevent fast task migration between cores that share their cache
>>
>> True, Vincent. But I think the one disadvantage even at cpu or cluster
>> level is that when we consider blocked load, we might prevent any more
>> tasks from being scheduled on that cpu during periodic load balance if
>> the blocked load is too much. This is very poor cpu utilization
> 
> The blocked load of a cluster will be high if the blocked tasks have
> run recently. The contribution of a blocked task will be divided by 2
> each 32ms, so it means that a high blocked load will be made of recent
> running tasks and the long sleeping tasks will not influence the load
> balancing.
> The load balance period is between 1 tick (10ms for idle load balance
> on ARM) and up to 256 ms (for busy load balance) so a high blocked
> load should imply some tasks that have run recently otherwise your
> blocked load will be small and will not have a large influence on your
> load balance

Makes a lot of sense.
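
To put numbers on the decay Vincent describes, here is a deliberately coarse sketch. It only models the statement "the contribution is divided by 2 each 32ms" as a step function; the real metric decays continuously, and the function name blocked_contribution_model() is hypothetical.

```c
#include <assert.h>

/*
 * Coarse model: halve the contribution once per full 32 ms of sleep.
 * load is the contribution at the moment the task blocked.
 */
static unsigned long blocked_contribution_model(unsigned long load,
						unsigned int ms_blocked)
{
	unsigned int halvings = ms_blocked / 32;

	assert(sizeof(load) * 8 > 0);
	if (halvings >= sizeof(load) * 8)
		return 0;	/* shifted out completely */
	return load >> halvings;
}
```

Under this model a task that blocked with a contribution of 1024 is down to 4 after 256 ms, the upper end of the busy load balance period mentioned above, so only recently running tasks carry much weight.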

>> Also we can consider steady states if the waking tasks have a specific
>> waking pattern. I am not sure if we can risk hoping that the blocked task
>> would wake up soon or would wake up at time 'x' and utilize that cpu.
> 
> Ok, so you no longer consider using blocked load in load balancing?

Hmm.. This has got me thinking. I thought the existing
select_idle_sibling() problem of bouncing tasks all over the L3 package
and taking time to find an idle buddy could be solved in isolation with
PJT's metric. But that does not seem to be the case considering the
suggestions by you and Mike.

Currently there are so many approaches proposed to improve the scheduler
that it is confusing as to how and which pieces fit well. Let me lay them
down. Please do help me put them together.

Jigsaw Piece 1: Use PJT's metric in load balancing, and blocked
load + runnable load as part of cpu load while load balancing.

Jigsaw Piece 2: select_idle_sibling() choosing the cpu to wake up tasks on.

Jigsaw Piece 3: 'cpu buddy' concept to prevent bouncing of tasks.

Considering both your and Mike's suggestions, what do you guys think of
the following puzzle and solution?

*Puzzle*: Waking up tasks should not take too much time to find a cpu to
run on and should not keep bouncing on too many cpus all over the
package, and should try as much not to create too much of an imbalance
in the load distribution of the cpus.

*Solution:*

Place Jigsaw Piece 1 first: Use PJT's metric and blocked load + runnable
load as part of cpu load while load balancing.
(As time passes the blocked load becomes less significant on that
cpu, hence load balancing will go on as usual.)

Place Jigsaw Piece 2 next: When tasks wake up, **use
select_idle_sibling() only to see if you can migrate tasks between cores
that share their cache**.
IOW see if the cpu at the lowest level sched domain is idle. If it is,
then schedule on it and migrate_task_rq_fair() will remove the load from
the prev_cpu; if it is not idle, then return the prev_cpu, which had already
considered the blocked load as part of its overall load. Hence very
little imbalance will be created.
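
As a rough illustration of the wake-up placement proposed in Piece 2, here is a userspace sketch. It is not scheduler code: pick_wake_cpu() and its parameters are hypothetical, the cache-sharing siblings are passed in as a plain array, and keeping an idle prev_cpu before probing its siblings is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Choose a cpu for a waking task.  Only cpus that share a cache with
 * prev_cpu (the lowest sched domain level) are probed.  If none of them
 * is idle, fall back to prev_cpu, whose load already accounts for the
 * task's blocked load.
 */
static int pick_wake_cpu(int prev_cpu, const int *siblings,
			 size_t nr_siblings, const bool *cpu_idle)
{
	assert(siblings != NULL && cpu_idle != NULL);

	if (cpu_idle[prev_cpu])
		return prev_cpu;

	for (size_t i = 0; i < nr_siblings; i++) {
		if (cpu_idle[siblings[i]])
			return siblings[i];
	}
	return prev_cpu;
}
```

The search is bounded by the number of cache siblings, which is the property the proposal relies on to keep wake-up latency small.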


*Possible End Picture*

Waking up tasks will not take time to find a cpu since we are probing
the cpus at only one sched domain level. The bouncing of tasks will be
restricted to the core level. An imbalance will not be created as the
blocked load is also considered while load balancing.

*Concerns*

1. Is the wake up load balancing in this solution less aggressive so as
to harm throughput significantly?
2. Do we need Jigsaw Piece 3 at all?

Please do let me know what you all think. Thank you very much for your
suggestions.
>
> regards,
> Vincent

Regards
Preeti U Murthy





Re: [Consult] our latest kernel and latest Android under arm Samsung S5PV210 with yaffs2 file system

2013-01-08 Thread Chen Gang F T
On 2013-01-08 22:53, Theodore Ts'o wrote:

  Firstly, thank you for your detailed reply.

  :-)

> yaffs2 is not a _standard_ file system for Android.  There may be some
> phones which use it, but the much more common is either FAT (for the
> older phones) or ext4.  The Google AOSP releases for pretty much all
> modern Nexus phones (Galaxy Nexus and newer if I recall correctly,
> certainly for GN, N4, N7, etc.) all use ext4.
> 
  From a development standpoint, yaffs2 may not be a _standard_ file system
  for Android, but in the market it really is in common use.
  As far as I know:
    some HTC phones use it.
    some embedded development boards use it.
    it is well known in the embedded area (many documents about it can be found)


> 
> So if your existing Android device is using yaffs2, you'll need to
> integrate yaffs2, since yaff2 is not in the upstream kernel.  As far
> as hardware support, it will depend on the specifics of your
> development board, ...
>

  Thank you for your suggestions.
  I will integrate yaffs2 into the kernel I am currently using.


> ..., Others hopefully on this list will be able to answer it.
> 
> 
  Any members are welcome to reply, thanks.


  And an additional question:
    is it suitable to integrate yaffs2 into the upstream kernel?

  :-)


  thanks.

-- 
Chen Gang

Flying Transformer
<>

Re: [GIT PULL] ARM: arm-soc fixes for 3.8-rc

2013-01-08 Thread Olof Johansson
On Tue, Jan 8, 2013 at 6:57 PM, Linus Torvalds
 wrote:
> On Tue, Jan 8, 2013 at 10:49 AM, Olof Johansson  wrote:
>>
>> A slightly larger delta than I'd ideally want by now, in part due to some
>> of the OMAP PM fixes that's adding a bit of code. I decided to include
>> it instead of push it to 3.9, but from here on out we'll be stricter.
>
> Ugh. Not only that, but:
>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git fixes
>
> you have both a branch called "fixes" and a tag called "fixes".
> Ambiguous. And when you ask me to pull like the above, it actually
> picks the branch, not the tag.
>
> Don't do this. Either use the unambiguous name ("tags/fixes" rather
> than just "fixes") or don't push out branches and tags that have the
> same name.

I switched from the latter to the former a while back, and should
probably switch back. What happened in this case is that the tag
hadn't mirrored out from ra yet, so git request-pull fell back to the
branch name instead and I didn't notice. :(

A non-ambiguous name will still fall back to the branch name instead
of the (differently named) tag, but it'll be easier to catch when I
check the pull request contents before sending it.

Or maybe a better solution is to make git request-pull throw an error
if there is a local signed tag for the request, but none is found on
the server (or has different contents). I'll take a look at that.


-Olof


Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

2013-01-08 Thread Jason Wang
On 01/09/2013 09:52 AM, Wanlong Gao wrote:
> On 01/08/2013 06:26 PM, Jason Wang wrote:
>> On 01/08/2013 06:07 PM, Wanlong Gao wrote:
>>> As Michael mentioned, set affinity and select queue will not work very
>>> well when CPU IDs are not consecutive, this can happen with hot unplug.
>>> Fix this bug by traversal the online CPUs, and create a per cpu variable
>>> to find the mapping from CPU to the preferable virtual-queue.
>>>
>>> Cc: Rusty Russell 
>>> Cc: "Michael S. Tsirkin" 
>>> Cc: Jason Wang 
>>> Cc: Eric Dumazet 
>>> Cc: virtualizat...@lists.linux-foundation.org
>>> Cc: net...@vger.kernel.org
>>> Signed-off-by: Wanlong Gao 
>>> ---
>>>  drivers/net/virtio_net.c | 39 +--
>>>  1 file changed, 29 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>> index a6fcf15..a77f86c 100644
>>> --- a/drivers/net/virtio_net.c
>>> +++ b/drivers/net/virtio_net.c
>>> @@ -41,6 +41,8 @@ module_param(gso, bool, 0444);
>>>  #define VIRTNET_SEND_COMMAND_SG_MAX2
>>>  #define VIRTNET_DRIVER_VERSION "1.0.0"
>>>  
>>> +DEFINE_PER_CPU(int, vq_index) = -1;
>>> +
>> I think this should not be a global one, consider we may have more than
>> one virtio-net cards with different max queues.
> Yes, would you move this into virtio_info?

Yes, I think it's better.
>>>  struct virtnet_stats {
>>> struct u64_stats_sync tx_syncp;
>>> struct u64_stats_sync rx_syncp;
>>> @@ -1016,6 +1018,7 @@ static int virtnet_vlan_rx_kill_vid(struct net_device 
>>> *dev, u16 vid)
>>>  static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>>>  {
>>> int i;
>>> +   int cpu;
>>>  
>>> /* In multiqueue mode, when the number of cpu is equal to the number of
>>>  * queue pairs, we let the queue pairs to be private to one cpu by
>>> @@ -1029,16 +1032,29 @@ static void virtnet_set_affinity(struct 
>>> virtnet_info *vi, bool set)
>>> return;
>>> }
>>>  
>>> -   for (i = 0; i < vi->max_queue_pairs; i++) {
>>> -   int cpu = set ? i : -1;
>>> -   virtqueue_set_affinity(vi->rq[i].vq, cpu);
>>> -   virtqueue_set_affinity(vi->sq[i].vq, cpu);
>>> -   }
>>> +   if (set) {
>>> +   i = 0;
>>> +   for_each_online_cpu(cpu) {
>>> +   virtqueue_set_affinity(vi->rq[i].vq, cpu);
>>> +   virtqueue_set_affinity(vi->sq[i].vq, cpu);
>>> +   per_cpu(vq_index, cpu) = i;
>>> +   i++;
>>> +   if (i >= vi->max_queue_pairs)
>>> +   break;
>> Can this happen? we check only set when the number are equal.
> will remove.
>
>>> +   }
>>>  
>>> -   if (set)
>>> vi->affinity_hint_set = true;
>>> -   else
>>> +   } else {
>>> +   for(i = 0; i < vi->max_queue_pairs; i++) {
>>> +   virtqueue_set_affinity(vi->rq[i].vq, -1);
>>> +   virtqueue_set_affinity(vi->sq[i].vq, -1);
>>> +   }
>>> +
>>> +   for_each_online_cpu(cpu)
>>> +   per_cpu(vq_index, cpu) = -1;
>>> +
>> This looks suboptimal since it may leads only txq zero is used.
> So, which value is best for txq when we don't set affinity?
> just remain to smp_processor_id()?

Any value that lets us use all queues is OK.

How about this?

	i = 0;
	for_each_online_cpu(cpu)
		per_cpu(vq_index, cpu) = ++i % vi->curr_queues;
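
A userspace sketch of the same round-robin idea, for readers who want to try it with non-consecutive CPU IDs. Everything here is a stand-in: the online CPUs are passed as an array instead of for_each_online_cpu(), vq_index is a plain array instead of a per-cpu variable, and the function name is hypothetical. Unlike the snippet above it starts counting at queue 0 (i++ rather than ++i), which only changes which CPU gets which queue.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Spread the online cpus over nr_queues queues in round-robin order.
 * vq_index must be large enough to be indexed by every cpu id in
 * online_cpus; entries for offline cpus are left untouched.
 */
static void map_cpus_to_queues(const int *online_cpus, size_t nr_online,
			       int nr_queues, int *vq_index)
{
	int i = 0;

	assert(online_cpus != NULL && vq_index != NULL && nr_queues > 0);
	for (size_t n = 0; n < nr_online; n++)
		vq_index[online_cpus[n]] = i++ % nr_queues;
}
```

With CPUs 0, 2, 5 and 7 online and two queues, the mapping alternates between queue 0 and queue 1 regardless of the gaps in the CPU numbering.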
> Thanks,
> Wanlong Gao
>
>>> vi->affinity_hint_set = false;
>>> +   }
>>>  }
>>>  
>>>  static void virtnet_get_ringparam(struct net_device *dev,
>>> @@ -1127,12 +1143,15 @@ static int virtnet_change_mtu(struct net_device 
>>> *dev, int new_mtu)
>>>  
>>>  /* To avoid contending a lock hold by a vcpu who would exit to host, 
>>> select the
>>>   * txq based on the processor id.
>>> - * TODO: handle cpu hotplug.
>>>   */
>>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff 
>>> *skb)
>>>  {
>>> -   int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>>> - smp_processor_id();
>>> +   int txq = 0;
>>> +
>>> +   if (skb_rx_queue_recorded(skb))
>>> +   txq = skb_get_rx_queue(skb);
>>> +   else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
>>> +   txq = 0;
>>>  
>>> while (unlikely(txq >= dev->real_num_tx_queues))
>>> txq -= dev->real_num_tx_queues;
>>



linux-next: manual merge of the tegra tree with the tree

2013-01-08 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the tegra tree got a conflict in
arch/arm/mach-tegra/headsmp.S between commit bc4f1bdabc89 ("ARM:
coresight: common definition for (OS) Lock Access Register key value")
from the arm-perf tree and commit 2a3eb5bc45bd ("ARM: tegra: make device
can run on UP") from the tegra tree.

I fixed it up (see below) and can carry the fix as necessary (no action
is required).

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

diff --cc arch/arm/mach-tegra/headsmp.S
index b154213,fd473f2..000
--- a/arch/arm/mach-tegra/headsmp.S
+++ b/arch/arm/mach-tegra/headsmp.S
@@@ -1,67 -1,9 +1,11 @@@
  #include 
  #include 
  
- #include 
- #include 
- #include 
 +#include 
 +
- #include "flowctrl.h"
- #include "iomap.h"
- #include "reset.h"
  #include "sleep.h"
  
- #define APB_MISC_GP_HIDREV0x804
- #define PMC_SCRATCH41 0x140
- 
- #define RESET_DATA(x) ((TEGRA_RESET_##x)*4)
- 
  .section ".text.head", "ax"
-   __CPUINIT
- 
- /*
-  * Tegra specific entry point for secondary CPUs.
-  *   The secondary kernel init calls v7_flush_dcache_all before it enables
-  *   the L1; however, the L1 comes out of reset in an undefined state, so
-  *   the clean + invalidate performed by v7_flush_dcache_all causes a bunch
-  *   of cache lines with uninitialized data and uninitialized tags to get
-  *   written out to memory, which does really unpleasant things to the main
-  *   processor.  We fix this by performing an invalidate, rather than a
-  *   clean + invalidate, before jumping into the kernel.
-  */
- ENTRY(v7_invalidate_l1)
- mov r0, #0
- mcr p15, 2, r0, c0, c0, 0
- mrc p15, 1, r0, c0, c0, 0
- 
- ldr r1, =0x7fff
- and r2, r1, r0, lsr #13
- 
- ldr r1, =0x3ff
- 
- and r3, r1, r0, lsr #3  @ NumWays - 1
- add r2, r2, #1  @ NumSets
- 
- and r0, r0, #0x7
- add r0, r0, #4  @ SetShift
- 
- clz r1, r3  @ WayShift
- add r4, r3, #1  @ NumWays
- 1:  sub r2, r2, #1  @ NumSets--
- mov r3, r4  @ Temp = NumWays
- 2:  subs r3, r3, #1  @ Temp--
- mov r5, r3, lsl r1
- mov r6, r2, lsl r0
- orr r5, r5, r6  @ Reg = 
(Temp<



Re: [GIT PULL] ARM: arm-soc fixes for 3.8-rc

2013-01-08 Thread Linus Torvalds
On Tue, Jan 8, 2013 at 10:49 AM, Olof Johansson  wrote:
>
> A slightly larger delta than I'd ideally want by now, in part due to some
> of the OMAP PM fixes that's adding a bit of code. I decided to include
> it instead of push it to 3.9, but from here on out we'll be stricter.

Ugh. Not only that, but:

>   git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git fixes

you have both a branch called "fixes" and a tag called "fixes".
Ambiguous. And when you ask me to pull like the above, it actually
picks the branch, not the tag.

Don't do this. Either use the unambiguous name ("tags/fixes" rather
than just "fixes") or don't push out branches and tags that have the
same name.

 Linus


Re: [v2.6.34-stable 71/77] crypto: ghash - Avoid null pointer dereference if no key is set

2013-01-08 Thread Nick Bowler
On 2013-01-08 18:35 -0500, Paul Gortmaker wrote:
> From: Nick Bowler 
> 
>---
> This is a commit scheduled for the next v2.6.34 longterm release.
> http://git.kernel.org/?p=linux/kernel/git/paulg/longterm-queue-2.6.34.git
> If you see a problem with using this for longterm, please comment.
>---
> 
> commit 7ed47b7d142ec99ad6880bbbec51e9f12b3af74c upstream.
> 
> The ghash_update function passes a pointer to gf128mul_4k_lle which will
> be NULL if ghash_setkey is not called or if the most recent call to
> ghash_setkey failed to allocate memory.  This causes an oops.  Fix this
> up by returning an error code in the null case.
> 
> This is trivially triggered from unprivileged userspace through the
> AF_ALG interface by simply writing to the socket without setting a key.

I haven't been following 2.6.34-longterm development, but unless
you've also backported the AF_ALG userspace interface from 2.6.38,
this sequence can only be triggered by kernel code.  So while this
patch shouldn't break anything, it isn't really necessary.
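
For reference, the shape of the check described in the quoted commit message can be sketched in standalone C as below. This is a model only: the structure, the function name and the error constant MODEL_ENOKEY are hypothetical stand-ins, and the actual hashing is left out.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_ENOKEY 126	/* stand-in error number for this sketch */

/* Model of a hash context whose key table exists only after setkey. */
struct ghash_ctx_model {
	const void *key_table;	/* NULL until a key has been set */
};

/*
 * Refuse to process data when no key table is present, instead of
 * handing a NULL pointer to the multiplication routine.
 */
static int model_ghash_update(const struct ghash_ctx_model *ctx,
			      const unsigned char *src, size_t len)
{
	assert(ctx != NULL);
	if (!ctx->key_table)
		return -MODEL_ENOKEY;

	(void)src;	/* the real update would consume src here */
	(void)len;
	return 0;
}
```

Whether the unkeyed path can be reached at all depends on who is allowed to call update, which is the point made above about the AF_ALG interface.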

Cheers,
-- 
Nick Bowler, Elliptic Technologies (http://www.elliptictech.com/)


Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:32 -0800, Eric Dumazet wrote:

> 
> Hmm, it seems sk_filter() can return -ENOMEM because skb has the
> pfmemalloc() set.

> 
> One TCP socket keeps retransmitting an SKB via loopback, and TCP stack 
> drops the packet again and again.

sock_init_data() sets sk->sk_allocation to GFP_KERNEL

Shouldn't it use (GFP_KERNEL | __GFP_NOMEMALLOC) instead?



diff --git a/net/core/sock.c b/net/core/sock.c
index bc131d4..76c4b39 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -286,6 +286,7 @@ void sk_set_memalloc(struct sock *sk)
 {
 	sock_set_flag(sk, SOCK_MEMALLOC);
 	sk->sk_allocation |= __GFP_MEMALLOC;
+	sk->sk_allocation &= ~__GFP_NOMEMALLOC;
 	static_key_slow_inc(&memalloc_socks);
 }
 EXPORT_SYMBOL_GPL(sk_set_memalloc);
@@ -294,6 +295,7 @@ void sk_clear_memalloc(struct sock *sk)
 {
 	sock_reset_flag(sk, SOCK_MEMALLOC);
 	sk->sk_allocation &= ~__GFP_MEMALLOC;
+	sk->sk_allocation |= __GFP_NOMEMALLOC;
 	static_key_slow_dec(&memalloc_socks);
 
 	/*
@@ -2230,7 +2232,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
 
 	init_timer(&sk->sk_timer);
 
-	sk->sk_allocation	=	GFP_KERNEL;
+	sk->sk_allocation	=	GFP_KERNEL | __GFP_NOMEMALLOC;
 	sk->sk_rcvbuf		=	sysctl_rmem_default;
 	sk->sk_sndbuf		=	sysctl_wmem_default;
 	sk->sk_state		=	TCP_CLOSE;
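
The invariant this diff aims for can be sketched in userspace as follows: a socket's allocation mask carries either the "may use reserves" bit or the "must not use reserves" bit, never both. The flag values, struct and function names below are hypothetical stand-ins for this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch only. */
#define MODEL_GFP_KERNEL	0x01u
#define MODEL_GFP_MEMALLOC	0x02u
#define MODEL_GFP_NOMEMALLOC	0x04u

struct sock_model {
	unsigned int allocation;
};

/* New sockets start out forbidden from dipping into reserves. */
static void model_sock_init(struct sock_model *sk)
{
	assert(sk != NULL);
	sk->allocation = MODEL_GFP_KERNEL | MODEL_GFP_NOMEMALLOC;
}

/* Marking a socket memalloc swaps the forbid bit for the allow bit. */
static void model_set_memalloc(struct sock_model *sk)
{
	assert(sk != NULL);
	sk->allocation |= MODEL_GFP_MEMALLOC;
	sk->allocation &= ~MODEL_GFP_NOMEMALLOC;
}

/* Clearing it swaps them back. */
static void model_clear_memalloc(struct sock_model *sk)
{
	assert(sk != NULL);
	sk->allocation &= ~MODEL_GFP_MEMALLOC;
	sk->allocation |= MODEL_GFP_NOMEMALLOC;
}
```

Setting and then clearing memalloc returns the mask to its initial value, so the two helpers stay symmetric.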




Re: Supporting SYSRQ on broken laptops like the thinkpad T530

2013-01-08 Thread Marc MERLIN
On Wed, Jan 09, 2013 at 03:36:44AM +0100, Roland Eggner wrote:
> On 2013-01-08 Tuesday at 15:09 -0800 Marc MERLIN wrote:
> > In its infinite wisdom, lenovo has removed the sysrq key on the latest
> > thinkpads, and replaced it with a stupid ALT+FN+S key combination, which
> > doesn't really work for doing sysrq from the console (nor do I know how the
> > genius who did that intended for SYSRQ-S to work).
> > http://forums.lenovo.com/t5/T400-T500-and-newer-T-series/T430-s-T530-Where-are-the-shortcut-function-keys-break-Pause-etc/ta-p/781749
> > 
> > I realize that one solution is to throw my laptop out the window at a suitably high
> > floor and replace it with one from a vendor that doesn't randomly remove keys
> > from the keyboard.
> > That said, I was wondering if there were other solutions, especially
> > considering that thinkpads used to be the better linux laptops.
> 
> My Dell “Precision M4500” notebook suffers similar (same?) problem.  So far 
> I could not find a solution better than this:  e.g. Alt-Fn-SysRq-s
> 
> press and hold Alt
> press and hold Fn
> press and leave F10|SysRq
> leave Fn
> press and leave s
> leave Alt

Holy crap. That works for me too. If only lenovo could have been bothered to
document it properly. It's still a pity to type and remember the exact
hold and release key sequences, but it's better than nothing.

Thanks much.

> Several months ago an LKML user claimed his cat had managed to press
> Alt-Fn-SysRq-c on his Dell Latitude notebook with a similar keyboard, and
> provided photos showing the kernel crash message ;)

Yeah, but my cat is not nearly smart enough for that :)

Thanks for your help again,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


RE: [PATCH] Bluetooth: btmrvl_sdio: look for sd8688 firmware in alternate place

2013-01-08 Thread Bing Zhao

> > linux-firmware ships the sd8688* firmware images that are shared with
> > libertas_sdio WiFi driver under libertas/. libertas_sdio looks in both places
> > and so should we.
> >
> > Signed-off-by: Lubomir Rintel 
> > ---
> >  drivers/bluetooth/btmrvl_sdio.c |   24 ++--
> >  drivers/bluetooth/btmrvl_sdio.h |6 --
> >  2 files changed, 26 insertions(+), 4 deletions(-)
> 
> NAK from me on this one. I do not want the driver to check two
> locations. That is what userspace can work around.
> 
> If we want to unify the location between the WiFi driver and the
> Bluetooth driver, I am fine with that, but seriously, just pick one over
> the other. I do not care which one.

The unified location is mrvl/ directory.

We can probably move SD8688 firmware & helper binaries to mrvl/ and have both 
drivers grab the images there?

Regards,
Bing



Re: [PATCH 5/5] kfifo: log based kfifo API

2013-01-08 Thread Yuanhan Liu
On Tue, Jan 08, 2013 at 10:16:46AM -0800, Dmitry Torokhov wrote:
> Hi Yuanhan,
> 
> On Tue, Jan 08, 2013 at 10:57:53PM +0800, Yuanhan Liu wrote:
> > The current kfifo API takes the kfifo size as input, but __kfifo_alloc
> > rounds the size _down_ to a power of 2. This may introduce a
> > potential issue.
> > > 
> > Take the code at drivers/hid/hid-logitech-dj.c as an example:
> > > 
> > > if (kfifo_alloc(&djrcv_dev->notif_fifo,
> > >        DJ_MAX_NUMBER_NOTIFICATIONS * sizeof(struct dj_report),
> > >        GFP_KERNEL)) {
> > > 
> > Here DJ_MAX_NUMBER_NOTIFICATIONS is 8, and sizeof(struct dj_report)
> > is 15.
> > > 
> > This means it wants to allocate a kfifo buffer that can store 8
> > dj_report entries at once. The expected kfifo buffer size would then be
> > 8 * 15 = 120. In the end, however, __kfifo_alloc will round the
> > size down to rounddown_power_of_2(120) = 64 and allocate a buffer
> > of 64 bytes, which I don't think is what the original author wanted.
> > > 
> > With the new log API, we can do the following:
> > > 
> > > int kfifo_size_order = order_base_2(DJ_MAX_NUMBER_NOTIFICATIONS *
> > > sizeof(struct dj_report));
> > > 
> > > if (kfifo_alloc(&djrcv_dev->notif_fifo, kfifo_size_order, GFP_KERNEL)) {
> > > 
> > This makes sure we will allocate a kfifo buffer large enough to hold
> > DJ_MAX_NUMBER_NOTIFICATIONS dj_report entries.
> 
> Why don't you simply change __kfifo_alloc to round the allocation up
> instead of down?

Hi Dmitry,

Yes, it would be neat and that was my first reaction as well. I then
sent out a patch, but it was NACKed by Stefani(the original kfifo
author). Here is the link:

https://lkml.org/lkml/2012/10/26/144

Then Stefani proposed changing the API to take the log of the size as input,
to fix the root cause of this kind of issue. And here it is.

Thanks.

--yliu
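
The rounding arithmetic discussed above can be checked with a small userspace sketch. The two helpers below are hypothetical stand-ins written for this note (the demo_ names are not kernel symbols); they only mimic what rounding down to a power of two and taking the base-2 order of a size would do.

```c
/* Userspace sketch only: demo_* helpers are illustrative stand-ins,
 * not the kernel's own implementations. */

/* Largest power of two that is <= n (n must be >= 1). */
static unsigned long demo_rounddown_pow_of_two(unsigned long n)
{
    unsigned long p = 1;

    while (p * 2 <= n)
        p *= 2;
    return p;
}

/* Smallest order such that (1UL << order) >= n. */
static unsigned int demo_order_base_2(unsigned long n)
{
    unsigned int order = 0;

    while ((1UL << order) < n)
        order++;
    return order;
}
```

For the 8 * 15 = 120 byte request, rounding down yields 64 bytes (room for only four whole 15-byte reports), while the order-based size is 1 << 7 = 128 bytes, which holds all eight.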


Re: [Xen-devel] [PATCH v2] xen/grant-table: correctly initialize grant table version 1

2013-01-08 Thread ANNIE LI

Thanks so much for posting this.

On 2013-1-6 19:14, Matt Wilson wrote:

Commit 85ff6acb075a484780b3d763fdf41596d8fc0970 (xen/granttable: Grant
tables V2 implementation) changed the GREFS_PER_GRANT_FRAME macro from
a constant to a conditional expression. The expression depends on
grant_table_version being appropriately set. Unfortunately, at init
time grant_table_version will be 0. The GREFS_PER_GRANT_FRAME
conditional expression checks for "grant_table_version == 1", and
therefore returns the number of grant references per frame for v2.

This causes gnttab_init() to allocate fewer pages for gnttab_list, as
a frame can hold half as many v2 entries as v1 entries. After
gnttab_resume() is called, grant_table_version is appropriately
set. nr_init_grefs will then be miscalculated and gnttab_free_count
will hold a value larger than the actual number of free gref entries.

If a guest is heavily utilizing improperly initialized v1 grant
tables, memory corruption can occur. One common manifestation is
corruption of the vmalloc list, resulting in a poisoned pointer
dereference when accessing /proc/meminfo or /proc/vmallocinfo:

[   40.770064] BUG: unable to handle kernel paging request at 20021407
[   40.770083] IP: [] get_vmalloc_info+0x70/0x110
[   40.770102] PGD 0
[   40.770107] Oops:  [#1] SMP
[   40.770114] CPU 10

This patch introduces a static variable, grefs_per_grant_frame, to
cache the calculated value. gnttab_init() now calls
gnttab_request_version() early so that grant_table_version and
grefs_per_grant_frame can be appropriately set. A few BUG_ON()s have
been added to prevent this type of bug from reoccurring in the future.
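
The init-time miscalculation described above can be illustrated with a userspace sketch. The page size (4 KiB) and the entry sizes (8 bytes for v1, 16 bytes for v2) are assumptions made for this illustration, and the demo names are hypothetical, not the driver's code.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Assumed layouts, chosen only so that a v2 entry is twice the size of v1. */
struct demo_grant_entry_v1 { uint16_t flags; uint16_t domid; uint32_t frame; };
union demo_grant_entry_v2 { uint8_t pad[16]; };

/* Old macro logic: anything other than version 1, including the
 * not-yet-initialized value 0, is treated as v2. */
static unsigned int demo_grefs_old(int version)
{
    return version == 1 ?
        (unsigned int)(DEMO_PAGE_SIZE / sizeof(struct demo_grant_entry_v1)) :
        (unsigned int)(DEMO_PAGE_SIZE / sizeof(union demo_grant_entry_v2));
}

/* Cached-value idea: 0 means "version not known yet", which a caller
 * can assert on before using the value. */
static unsigned int demo_grefs_cached(int version)
{
    if (version == 1)
        return (unsigned int)(DEMO_PAGE_SIZE / sizeof(struct demo_grant_entry_v1));
    if (version == 2)
        return (unsigned int)(DEMO_PAGE_SIZE / sizeof(union demo_grant_entry_v2));
    return 0;
}
```

Under these assumptions, a version of 0 at init time gives 256 entries per frame from the old logic, even though a v1 guest really has 512, which is the mismatch the commit message describes.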

Signed-off-by: Matt Wilson
Reviewed-and-Tested-by: Steven Noonan
Cc: Ian Campbell
Cc: Konrad Rzeszutek Wilk
Cc: Annie Li
Cc: xen-de...@lists.xen.org
Cc: linux-kernel@vger.kernel.org
Cc: sta...@vger.kernel.org # v3.3 and newer
---
Changes since v1:
* introduced a new gnttab_setup() function and moved all of the
   initialization code from gnttab_resume() there.
---
  drivers/xen/grant-table.c |   52 ++--
  1 files changed, 31 insertions(+), 21 deletions(-)

diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index 043bf07..53715de 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -55,10 +55,6 @@
  /* External tools reserve first few grant table entries. */
  #define NR_RESERVED_ENTRIES 8
  #define GNTTAB_LIST_END 0xffffffff
-#define GREFS_PER_GRANT_FRAME \
-(grant_table_version == 1 ?  \
-(PAGE_SIZE / sizeof(struct grant_entry_v1)) :   \
-(PAGE_SIZE / sizeof(union grant_entry_v2)))

  static grant_ref_t **gnttab_list;
  static unsigned int nr_grant_frames;
@@ -153,6 +149,7 @@ static struct gnttab_ops *gnttab_interface;
  static grant_status_t *grstatus;

  static int grant_table_version;
+static int grefs_per_grant_frame;

  static struct gnttab_free_callback *gnttab_free_callback_list;

@@ -766,12 +763,14 @@ static int grow_gnttab_list(unsigned int more_frames)
unsigned int new_nr_grant_frames, extra_entries, i;
unsigned int nr_glist_frames, new_nr_glist_frames;

+   BUG_ON(grefs_per_grant_frame == 0);
+
new_nr_grant_frames = nr_grant_frames + more_frames;
-   extra_entries   = more_frames * GREFS_PER_GRANT_FRAME;
+   extra_entries   = more_frames * grefs_per_grant_frame;

-   nr_glist_frames = (nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+   nr_glist_frames = (nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
new_nr_glist_frames =
-   (new_nr_grant_frames * GREFS_PER_GRANT_FRAME + RPP - 1) / RPP;
+   (new_nr_grant_frames * grefs_per_grant_frame + RPP - 1) / RPP;
for (i = nr_glist_frames; i < new_nr_glist_frames; i++) {
gnttab_list[i] = (grant_ref_t *)__get_free_page(GFP_ATOMIC);
if (!gnttab_list[i])
@@ -779,12 +778,12 @@ static int grow_gnttab_list(unsigned int more_frames)
}


-   for (i = GREFS_PER_GRANT_FRAME * nr_grant_frames;
-        i < GREFS_PER_GRANT_FRAME * new_nr_grant_frames - 1; i++)
+   for (i = grefs_per_grant_frame * nr_grant_frames;
+        i < grefs_per_grant_frame * new_nr_grant_frames - 1; i++)
gnttab_entry(i) = i + 1;

gnttab_entry(i) = gnttab_free_head;
-   gnttab_free_head = GREFS_PER_GRANT_FRAME * nr_grant_frames;
+   gnttab_free_head = grefs_per_grant_frame * nr_grant_frames;
gnttab_free_count += extra_entries;

nr_grant_frames = new_nr_grant_frames;
@@ -904,7 +903,8 @@ EXPORT_SYMBOL_GPL(gnttab_unmap_refs);

  static unsigned nr_status_frames(unsigned nr_grant_frames)
  {
-   return (nr_grant_frames * GREFS_PER_GRANT_FRAME + SPP - 1) / SPP;
+   BUG_ON(grefs_per_grant_frame == 0);
+   return (nr_grant_frames * grefs_per_grant_frame + SPP - 1) / SPP;
  }

  static int gnttab_map_frames_v1(xen_pfn_t *frames, 

Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-08 Thread Michel Lespinasse
On Tue, Jan 8, 2013 at 6:15 PM, Benjamin Herrenschmidt
 wrote:
> On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote:
>> Update the powerpc slice_get_unmapped_area function to make use of
>> vm_unmapped_area() instead of implementing a brute force search.
>>
>> Signed-off-by: Michel Lespinasse 
>>
>> ---
>>  arch/powerpc/mm/slice.c |  128 
>> +-
>>  1 files changed, 81 insertions(+), 47 deletions(-)
>
> That doesn't look good ... the resulting code is longer than the
> original, which makes me wonder how it is an improvement...

Well no fair, the previous patch (for powerpc as well) has 22
insertions and 93 deletions :)

The benefit is that the new code has lower algorithmic complexity: it
replaces a per-vma loop of O(N) complexity with an outer loop that
finds contiguous slice blocks and passes them to vm_unmapped_area(),
which is only O(log N). So the new code will be faster for
workloads which use lots of vmas.

That said, I do agree that the code that looks for contiguous
available slices looks kinda ugly - just not sure how to make it look
nicer though.
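
As a rough illustration of the outer loop described above, the sketch below collects runs of contiguous available slices from a bitmask. It is a hypothetical userspace example (the struct and function names are invented here and are not taken from arch/powerpc/mm/slice.c); each run it finds is the kind of range that would then be handed to a gap search such as vm_unmapped_area().

```c
#include <stdint.h>

/* Hypothetical description of one run of adjacent available slices. */
struct demo_slice_run {
    unsigned int first; /* index of the first slice in the run */
    unsigned int count; /* number of contiguous slices */
};

/* Scan a 16-bit availability mask (bit i set = slice i available) and
 * record each run of consecutive set bits. Returns the number of runs
 * stored, at most max_runs. */
static unsigned int demo_find_slice_runs(uint16_t mask,
                                         struct demo_slice_run *runs,
                                         unsigned int max_runs)
{
    unsigned int n = 0, i = 0;

    while (i < 16 && n < max_runs) {
        unsigned int start;

        if (!(mask & (1u << i))) {
            i++;
            continue;
        }
        start = i;
        while (i < 16 && (mask & (1u << i)))
            i++;
        runs[n].first = start;
        runs[n].count = i - start;
        n++;
    }
    return n;
}
```

The scan itself is linear in the number of slices, which is a small fixed number; the saving comes from no longer walking every vma inside each range.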

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


Re: Supporting SYSRQ on broken laptops like the thinkpad T530

2013-01-08 Thread Roland Eggner
On 2013-01-08 Tuesday at 15:09 -0800 Marc MERLIN wrote:
> In its infinite wisdom, lenovo has removed the sysrq key on the latest
> thinkpads, and replaced it with a stupid ALT+FN+S key combination, which
> doesn't really work for doing sysrq from the console (nor do I know how the
> genius who did that intended for SYSRQ-S to work).
> http://forums.lenovo.com/t5/T400-T500-and-newer-T-series/T430-s-T530-Where-are-the-shortcut-function-keys-break-Pause-etc/ta-p/781749
> 
> I realize that one solution is to throw my laptop window at a suitable high
> floor and replace it with one from a vendor that doesn't randomly remove keys
> from the keyboard.
> That said, I was wondering if there were other solutions, especially
> considering that thinkpads used to be the better linux laptops.

My Dell “Precision M4500” notebook suffers similar (same?) problem.  So far 
I could not find a solution better than this:  e.g. Alt-Fn-SysRq-s

press and hold Alt
press and hold Fn
press and leave F10|SysRq
leave Fn
press and leave s
leave Alt

Several months ago an LKML user claimed his cat had managed to press
Alt-Fn-SysRq-c on his Dell Latitude notebook with a similar keyboard, and
provided photos showing the kernel crash message ;)

-- 
Roland


RE: [PATCH RFT] regulator: lp8788-ldo: Use ldo->en_pin to check if regulator is enabled by external pin

2013-01-08 Thread Kim, Milo
> -Original Message-
> From: Mark Brown [mailto:broo...@opensource.wolfsonmicro.com]
> Sent: Tuesday, January 08, 2013 7:44 PM
> To: Axel Lin
> Cc: Kim, Milo; Girdwood, Liam; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH RFT] regulator: lp8788-ldo: Use ldo->en_pin to
> check if regulator is enabled by external pin
> 
> On Tue, Jan 08, 2013 at 12:06:44PM +0800, Axel Lin wrote:
> 
> > In this driver,
> > There is a case that one gpio controls more than one regulator.
> > e.g. ALDO2, ALDO3, ALDO4 are controlled by the same pin.
> 
> > It looks like current regulator core does not support this case.
> 
> Then the code to cope with this should be ported over to the core
> instead of being open coded in the driver.

Mark, could you review the code below?

This covers the case where multiple regulators are enabled by one shared GPIO pin.
It doesn't look nice yet, but I'd like to get your feedback on this idea first.

diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index 0f65b24..f63bdf1 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -3328,6 +3328,7 @@ regulator_register(const struct regulator_desc *regulator_desc,
struct device *dev;
int ret, i;
const char *supply = NULL;
+   bool shared_ena_pin = false;
 
if (regulator_desc == NULL || config == NULL)
return ERR_PTR(-EINVAL);
@@ -3409,17 +3410,24 @@ regulator_register(const struct regulator_desc *regulator_desc,
if (ret != 0) {
rdev_err(rdev, "Failed to request enable GPIO%d: %d\n",
 config->ena_gpio, ret);
-   goto wash;
+
+   if (ret == -EBUSY && config->is_gpio_shared)
+   shared_ena_pin = true;
+
+   if (!shared_ena_pin)
+   goto wash;
}
 
-   rdev->ena_gpio = config->ena_gpio;
-   rdev->ena_gpio_invert = config->ena_gpio_invert;
+   if (!shared_ena_pin) {
+   rdev->ena_gpio = config->ena_gpio;
+   rdev->ena_gpio_invert = config->ena_gpio_invert;
 
-   if (config->ena_gpio_flags & GPIOF_OUT_INIT_HIGH)
-   rdev->ena_gpio_state = 1;
+   if (config->ena_gpio_flags & GPIOF_OUT_INIT_HIGH)
+   rdev->ena_gpio_state = 1;
 
-   if (rdev->ena_gpio_invert)
-   rdev->ena_gpio_state = !rdev->ena_gpio_state;
+   if (rdev->ena_gpio_invert)
+   rdev->ena_gpio_state = !rdev->ena_gpio_state;
+   }
}
 
/* set regulator constraints */
diff --git a/include/linux/regulator/driver.h b/include/linux/regulator/driver.h
index d10bb0f..fa0e4e5 100644
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -241,6 +241,8 @@ struct regulator_desc {
  * @regmap: regmap to use for core regmap helpers if dev_get_regulator() is
  *  insufficient.
  * @ena_gpio: GPIO controlling regulator enable.
+ * @is_gpio_shared: Set true if enable GPIO is shared (multiple regulators are
+ * enabled by one GPIO pin).
  * @ena_gpio_invert: Sense for GPIO enable control.
  * @ena_gpio_flags: Flags to use when calling gpio_request_one()
  */
@@ -252,6 +254,7 @@ struct regulator_config {
struct regmap *regmap;
 
int ena_gpio;
+   bool is_gpio_shared;
unsigned int ena_gpio_invert:1;
unsigned int ena_gpio_flags;
 };

Thanks,
Milo
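
One way to picture the bookkeeping behind a shared enable pin is a small table of requested pins with a user count. The sketch below is a hypothetical userspace illustration (the table, its size and the function name are assumptions, not regulator core code): a second request for an already-claimed pin fails with -EBUSY unless the caller marks the pin as shared.

```c
#include <errno.h>
#include <stdbool.h>

#define DEMO_MAX_PINS 8 /* arbitrary table size for the illustration */

struct demo_ena_pin {
    int gpio;           /* GPIO number controlling the enable line */
    unsigned int users; /* how many regulators share this pin */
};

static struct demo_ena_pin demo_pins[DEMO_MAX_PINS];
static unsigned int demo_nr_pins;

/* Returns 0 on success, -EBUSY if the pin is already claimed and the
 * caller did not declare it shared, -ENOMEM if the table is full. */
static int demo_request_ena_pin(int gpio, bool is_gpio_shared)
{
    unsigned int i;

    for (i = 0; i < demo_nr_pins; i++) {
        if (demo_pins[i].gpio != gpio)
            continue;
        if (!is_gpio_shared)
            return -EBUSY;
        demo_pins[i].users++;
        return 0;
    }

    if (demo_nr_pins == DEMO_MAX_PINS)
        return -ENOMEM;

    demo_pins[demo_nr_pins].gpio = gpio;
    demo_pins[demo_nr_pins].users = 1;
    demo_nr_pins++;
    return 0;
}
```

Keeping a user count per pin would also let a disable path drop the line only when the last regulator sharing it has been disabled, though that part is not shown here.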


Re: [PATCH 1/5] kfifo: remove unnecessary type check

2013-01-08 Thread Yuanhan Liu
On Tue, Jan 08, 2013 at 10:51:04PM +0100, Stefani Seibold wrote:
> Am Dienstag, den 08.01.2013, 22:57 +0800 schrieb Yuanhan Liu:
> > Firstly, this kind of type check doesn't work. It does something similar
> > to the following:
> > void * __dummy = NULL;
> > __buf = __dummy;
> > 
> > __dummy is defined as void *. Thus it will not trigger warnings as
> > expected.
> > 
> > Second, we don't need that kind of check. Since the prototype
> > of __kfifo_out is:
> > unsigned int __kfifo_out(struct __kfifo *fifo, void *buf, unsigned int len)
> > 
> > buf is defined as void *, so we don't need to do the type check. Remove it.
> > 
> 
> That's wrong.
> 
> First the type checking will be used in kfifo_put() and kfifo_in() for
> const types to check if the passed type of the data can be converted to the
> fifo element type. 

Hi Stefani,

Yes, I see now. After rechecking the code, I found that this kind of
type checking only works for statically defined kfifos, i.e. those created
with DECLARE/DEFINE_KFIFO, because there the ptrtype is the same as the
data type:

/* the 4th argument "type" is "ptrtype" */
#define STRUCT_KFIFO(type, size) struct __STRUCT_KFIFO(type, size, 0, type)

#define DECLARE_KFIFO(fifo, type, size) STRUCT_KFIFO(type, size) fifo

For dynamically allocated kfifos, however, the type checking will not
work as expected, because ptrtype is always "void":

struct kfifo __STRUCT_KFIFO_PTR(unsigned char, 0, void);

So there is no need for a forced type conversion like the following:
arch/arm/plat-omap/mailbox.c:   len = kfifo_in(&mq->fifo, (unsigned char *)&msg, sizeof(msg));
as mq->fifo is dynamically allocated.

So, the type checking does work, and I'll drop this patch.

Sorry for the noise.

--yliu
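
The difference between the typed and the void ptrtype can be seen in a simplified userspace sketch. The structs and the DEMO_CHECK_BUF macro below are hypothetical and much smaller than the real kfifo macros; they only reproduce the idea of a never-executed assignment that the compiler still type-checks.

```c
#include <stddef.h>

/* Statically typed fifo: ptr has the element type, so the check bites. */
struct demo_typed_fifo { int *ptr; };

/* Dynamically allocated fifo: ptr is void *, so any object pointer passes. */
struct demo_untyped_fifo { void *ptr; };

/* Dead-code assignment: never runs, but is still type-checked. */
#define DEMO_CHECK_BUF(fifo, buf) \
    do { if (0) { (fifo)->ptr = (buf); } } while (0)

static int demo_check_compiles(void)
{
    struct demo_typed_fifo tf = { NULL };
    struct demo_untyped_fifo uf = { NULL };
    int ibuf[2] = { 0, 0 };
    unsigned char cbuf[2] = { 0, 0 };

    DEMO_CHECK_BUF(&tf, ibuf); /* accepted: int * matches */
    DEMO_CHECK_BUF(&uf, cbuf); /* accepted: void * takes unsigned char * */
    DEMO_CHECK_BUF(&uf, ibuf); /* also accepted, so no cast is needed here */
    /* DEMO_CHECK_BUF(&tf, cbuf); would be diagnosed as incompatible */

    /* The assignments are dead code, so both pointers are still NULL. */
    return tf.ptr == NULL && uf.ptr == NULL;
}
```

With the void variant every buffer pointer is accepted without a cast, which matches the observation above that the cast in the mailbox code is not needed.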

> And it will be used in kfifo_get(), kfifo_peek(), kfifo_out() and
> kfifo_out_peek() to check if the element type of the fifo can be
> converted to the passed type of the destination.
> 
> So a big NAK!  
> 
> > v2: remove ptr and const_ptr, which were used for type checking.
> > 
> > LINK: https://lkml.org/lkml/2012/10/25/386
> > LINK: https://lkml.org/lkml/2012/10/25/584
> > 
> > Cc: Stefani Seibold 
> > Cc: Andrew Morton 
> > Signed-off-by: Yuanhan Liu 
> > ---
> >  include/linux/kfifo.h |   46 --
> >  1 files changed, 12 insertions(+), 34 deletions(-)
> > 
> > diff --git a/include/linux/kfifo.h b/include/linux/kfifo.h
> > index 10308c6..7a18245 100644
> > --- a/include/linux/kfifo.h
> > +++ b/include/linux/kfifo.h
> > @@ -63,49 +63,47 @@ struct __kfifo {
> > void*data;
> >  };
> >  
> > -#define __STRUCT_KFIFO_COMMON(datatype, recsize, ptrtype) \
> > +#define __STRUCT_KFIFO_COMMON(datatype, recsize) \
> > union { \
> > struct __kfifo  kfifo; \
> > datatype*type; \
> > char(*rectype)[recsize]; \
> > -   ptrtype *ptr; \
> > -   const ptrtype   *ptr_const; \
> > }
> >  
> > -#define __STRUCT_KFIFO(type, size, recsize, ptrtype) \
> > +#define __STRUCT_KFIFO(type, size, recsize) \
> >  { \
> > -   __STRUCT_KFIFO_COMMON(type, recsize, ptrtype); \
> > +   __STRUCT_KFIFO_COMMON(type, recsize); \
> > typebuf[((size < 2) || (size & (size - 1))) ? -1 : size]; \
> >  }
> >  
> >  #define STRUCT_KFIFO(type, size) \
> > -   struct __STRUCT_KFIFO(type, size, 0, type)
> > +   struct __STRUCT_KFIFO(type, size, 0)
> >  
> > -#define __STRUCT_KFIFO_PTR(type, recsize, ptrtype) \
> > +#define __STRUCT_KFIFO_PTR(type, recsize) \
> >  { \
> > -   __STRUCT_KFIFO_COMMON(type, recsize, ptrtype); \
> > +   __STRUCT_KFIFO_COMMON(type, recsize); \
> > typebuf[0]; \
> >  }
> >  
> >  #define STRUCT_KFIFO_PTR(type) \
> > -   struct __STRUCT_KFIFO_PTR(type, 0, type)
> > +   struct __STRUCT_KFIFO_PTR(type, 0)
> >  
> >  /*
> >   * define compatibility "struct kfifo" for dynamic allocated fifos
> >   */
> > -struct kfifo __STRUCT_KFIFO_PTR(unsigned char, 0, void);
> > +struct kfifo __STRUCT_KFIFO_PTR(unsigned char, 0);
> >  
> >  #define STRUCT_KFIFO_REC_1(size) \
> > -   struct __STRUCT_KFIFO(unsigned char, size, 1, void)
> > +   struct __STRUCT_KFIFO(unsigned char, size, 1)
> >  
> >  #define STRUCT_KFIFO_REC_2(size) \
> > -   struct __STRUCT_KFIFO(unsigned char, size, 2, void)
> > +   struct __STRUCT_KFIFO(unsigned char, size, 2)
> >  
> >  /*
> >   * define kfifo_rec types
> >   */
> > -struct kfifo_rec_ptr_1 __STRUCT_KFIFO_PTR(unsigned char, 1, void);
> > -struct kfifo_rec_ptr_2 __STRUCT_KFIFO_PTR(unsigned char, 2, void);
> > +struct kfifo_rec_ptr_1 __STRUCT_KFIFO_PTR(unsigned char, 1);
> > +struct kfifo_rec_ptr_2 __STRUCT_KFIFO_PTR(unsigned char, 2);
> >  
> >  /*
> >   * helper macro to distinguish between real in place fifo where the fifo
> > @@ -390,10 +388,6 @@ __kfifo_int_must_check_helper( \
> > unsigned int __ret; \
> > const size_t __recsize = sizeof(*__tmp->rectype); \
> > struct __kfifo *__kfifo = &__tmp->kfifo; \
> > -   if (0) { \
> > - 

Re: [PATCH v7 0/5] Reset PCIe devices to address DMA problem on kdump with iommu

2013-01-08 Thread Thomas Renninger
On Tuesday, January 08, 2013 09:27:55 AM Yinghai Lu wrote:
> On Tue, Jan 8, 2013 at 8:50 AM, Thomas Renninger  wrote:
> > megaraid_sas
> 
> can you check if your initrd for kdump kernel has that driver and
> module that it depends on like
> scsi sas transport etc ?

Removing the 5 patches and the disk works and the
dump is written.

I can look a bit further at the memmap=exactmap issue tomorrow.
I can also double check above then, but I am rather sure about it
already:
I tried plain vanilla -> worked, dumping started
I tried with only these 5 patches added -> no disk.


Some questions:

You try to initialize the PCI subsystem in a way the BIOS typically has
to do it in kexec case?

Reacting and trying to handle error conditions more gracefully
at the place where they are caught could be another approach which
imo makes sense to implement in parallel.

In my case for example I see:
"Present field in the IRTE entry is clear"
DMAR errors. I expect this comes from a device which still throws
interrupts, but the irq vector was not set up or registered in the kexec'ed
kernel.

I could imagine this is the same error which happens when an irq is
wrongly configured and spurious interrupts happen (but in irq remapped case).
In my case it's not severe as I only see this message once, but according
to another report, they see about 80 of such DMAR error messages per
second. This seems to result in endless DMAR error interrupts and finally
a dead system.

I wonder whether the DMAR error handler could already invoke a PCIe
reset.
I found:
int pci_set_pcie_reset_state(struct pci_dev *dev, enum pcie_reset_state state)
which unfortunately is only implemented for PPC, but would it make sense to
implement this one and trigger a function level reset if several specific DMAR
errors are seen (or other PCI(e) error handlers get active)?

If this does not help the next step could be to stop DMAR error interrupt
handling or other iommu commands to keep the machine alive, even if one
device keeps firing interrupts to an unconfigured irq vector (or whatever other
things could happen).

Just some ideas...
Comments appreciated.

   Thomas


Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 18:14 -0800, Eric Dumazet wrote:
> On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote:
> > Mel Gorman  wrote:
> > > Please try the following patch. However, even if it works the benefit of
> > > capture may be so marginal that partially reverting it and simplifying
> > > compaction.c is the better decision.
> > 
> > I already got my VM stuck on this one.  I had two twosleepy instances,
> > 2774 was the one that got stuck (also confirmed by watching top).
> > 
> > Btw, have you been able to reproduce this on your end?
> > 
> > I think the easiest reproduction on my 2-core VM is by running 2
> > twosleepy processes and doing the following to dirty a lot of pages:
> 
> Given the persistent sk_stream_wait_memory() traces I suspect a plain
> TCP bug, triggered by some extra wait somewhere.
> 
> Please mm guys don't spend too much time right now, I'll try to
> reproduce the problem.
> 
> Don't be confused by sk_stream_wait_memory() name.
> A thread is stuck here because TCP stack is failing to wake it.
> 

Hmm, it seems sk_filter() can return -ENOMEM because the skb has
pfmemalloc set.

It seems nobody really tested this stuff under memory stress.

Mel, it looks like you are the guy who could fix this, after all ;)

One TCP socket keeps retransmitting an SKB via loopback, and TCP stack 
drops the packet again and again.


commit c93bdd0e03e848555d144eb44a1f275b871a8dd5
Author: Mel Gorman 
Date:   Tue Jul 31 16:44:19 2012 -0700

netvm: allow skb allocation to use PFMEMALLOC reserves

Change the skb allocation API to indicate RX usage and use this to fall
back to the PFMEMALLOC reserve when needed.  SKBs allocated from the
reserve are tagged in skb->pfmemalloc.  If an SKB is allocated from the
reserve and the socket is later found to be unrelated to page reclaim, the
packet is dropped so that the memory remains available for page reclaim.
Network protocols are expected to recover from this packet loss.

[a.p.zijls...@chello.nl: Ideas taken from various patches]
[da...@davemloft.net: Use static branches, coding style corrections]
[sebast...@breakpoint.cc: Avoid unnecessary cast, fix !CONFIG_NET build]
Signed-off-by: Mel Gorman 
Acked-by: David S. Miller 
Cc: Neil Brown 
Cc: Peter Zijlstra 
Cc: Mike Christie 
Cc: Eric B Munson 
Cc: Eric Dumazet 
Cc: Sebastian Andrzej Siewior 
Cc: Mel Gorman 
Cc: Christoph Lameter 
Signed-off-by: Andrew Morton 
Signed-off-by: Linus Torvalds 




[PATCH] staging: imx-drm: ipu-common: Remove unused variable

2013-01-08 Thread Fabio Estevam
From: Fabio Estevam 

Fix the following warning when building with W=1 option:

drivers/staging/imx-drm/ipu-v3/ipu-common.c: In function 'ipu_remove':
drivers/staging/imx-drm/ipu-v3/ipu-common.c:1145:19: warning: variable 'res' set but not used [-Wunused-but-set-variable]

Signed-off-by: Fabio Estevam 
---
 drivers/staging/imx-drm/ipu-v3/ipu-common.c |3 ---
 1 file changed, 3 deletions(-)

diff --git a/drivers/staging/imx-drm/ipu-v3/ipu-common.c b/drivers/staging/imx-drm/ipu-v3/ipu-common.c
index f7059cd..366f259 100644
--- a/drivers/staging/imx-drm/ipu-v3/ipu-common.c
+++ b/drivers/staging/imx-drm/ipu-v3/ipu-common.c
@@ -1142,9 +1142,6 @@ failed_ioremap:
 static int ipu_remove(struct platform_device *pdev)
 {
struct ipu_soc *ipu = platform_get_drvdata(pdev);
-   struct resource *res;
-
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
 
platform_device_unregister_children(pdev);
ipu_submodules_exit(ipu);
-- 
1.7.9.5



Re: [PATCH v7u1 26/31] x86: Don't enable swiotlb if there is not enough ram for it

2013-01-08 Thread Eric W. Biederman
Yinghai Lu  writes:

> On Tue, Jan 8, 2013 at 5:07 PM, Yinghai Lu  wrote:
>> On Tue, Jan 8, 2013 at 4:58 PM, Eric W. Biederman  
>> wrote:
>>
>>>
>>> So instead we need to say?
>>>
>>> +   if (no_iotlb_memory)
>>> +   panic("Cannot allocate SWIOTLB buffer");
>>> +
>>>
>>> Which is just making the panic a little later than it used to be and
>>> seems completely reasonable.
>>
>> yes, looks some driver just use map_single without checking results.
>
> update one.
>
> later could have another patch to shrink size...

It does look better.

Reading the code I am still left with the question why do the nopanic
handling at all?  Since the code effectively moves the panic to later.

Why can't other architectures use the same panic handling as x86?

Eric



Re: [PATCH] Bluetooth: btmrvl_sdio: look for sd8688 firmware in alternate place

2013-01-08 Thread Marcel Holtmann
Hi Lubomir,

> linux-firmware ships the sd8688* firmware images that are shared with
> libertas_sdio WiFi driver under libertas/. libertas_sdio looks in both places
> and so should we.
> 
> Signed-off-by: Lubomir Rintel 
> ---
>  drivers/bluetooth/btmrvl_sdio.c |   24 ++--
>  drivers/bluetooth/btmrvl_sdio.h |6 --
>  2 files changed, 26 insertions(+), 4 deletions(-)

NAK from me on this one. I do not want the driver to check two
locations. That is what userspace can work around.

If we want to unify the location between the WiFi driver and the
Bluetooth driver, I am fine with that, but seriously, just pick one over
the other. I do not care which one.

Regards

Marcel




Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

2013-01-08 Thread Wanlong Gao
On 01/09/2013 07:31 AM, Rusty Russell wrote:
> Wanlong Gao  writes:
>>   */
>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>>  {
>> -int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> -  smp_processor_id();
>> +int txq = 0;
>> +
>> +if (skb_rx_queue_recorded(skb))
>> +txq = skb_get_rx_queue(skb);
>> +else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
>> +txq = 0;
> 
> You should use __get_cpu_var() instead of smp_processor_id() here, ie:
> 
> else if ((txq = __get_cpu_var(vq_index)) == -1)
> 
> And AFAICT, no reason to initialize txq to 0 to start with.
> 
> So:
> 
> int txq;
> 
> if (skb_rx_queue_recorded(skb))
>   txq = skb_get_rx_queue(skb);
> else {
> txq = __get_cpu_var(vq_index);
> if (txq == -1)
> txq = 0;
> }

Got it, thank you.

> 
> Now, just to confirm, I assume this can happen even if we use vq_index,
> right, because of races with virtnet_set_channels?

I still can't understand this race, could you explain more? thank you.

Regards,
Wanlong Gao

> 
>   while (unlikely(txq >= dev->real_num_tx_queues))
>   txq -= dev->real_num_tx_queues;
> 
> 
> Thanks,
> Rusty.
> 
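
The queue selection discussed in this thread can be sketched in userspace as follows. The array, the fixed CPU count and the function names are assumptions made for illustration (this is not the virtio_net code): queue pairs are handed out to online CPUs in order, CPUs without a mapping fall back to queue 0, and the result is wrapped into the number of queues currently in use.

```c
#define DEMO_NR_CPUS 8 /* arbitrary CPU count for the illustration */

/* Per-CPU queue index; -1 means "no queue assigned to this CPU". */
static int demo_vq_index[DEMO_NR_CPUS];

/* Assign queue pairs 0..n-1 to the online CPUs in the order given,
 * so gaps in the CPU numbering do not matter. */
static void demo_set_affinity(const int *online, int nr_online,
                              int max_queue_pairs)
{
    int cpu, i;

    for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        demo_vq_index[cpu] = -1;
    for (i = 0; i < nr_online && i < max_queue_pairs; i++)
        demo_vq_index[online[i]] = i;
}

/* Pick a tx queue for the given CPU, falling back to queue 0 and
 * wrapping if fewer queues are active than were mapped. */
static int demo_select_queue(int this_cpu, int real_num_tx_queues)
{
    int txq = demo_vq_index[this_cpu];

    if (txq == -1)
        txq = 0;
    while (txq >= real_num_tx_queues)
        txq -= real_num_tx_queues;
    return txq;
}
```

With online CPUs 0, 2 and 5 and three queue pairs, CPU 5 maps to queue 2 rather than to a nonexistent queue 5. If the active queue count later drops to two, the final loop wraps CPU 5 back to queue 0.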



Re: [PATCH V3 1/2] virtio-net: fix the set affinity bug when CPU IDs are not consecutive

2013-01-08 Thread Wanlong Gao
On 01/08/2013 06:26 PM, Jason Wang wrote:
> On 01/08/2013 06:07 PM, Wanlong Gao wrote:
>> As Michael mentioned, set affinity and select queue will not work very
>> well when CPU IDs are not consecutive; this can happen with hot unplug.
>> Fix this bug by traversing the online CPUs, and create a per-cpu variable
>> to find the mapping from CPU to the preferable virtual-queue.
>>
>> Cc: Rusty Russell 
>> Cc: "Michael S. Tsirkin" 
>> Cc: Jason Wang 
>> Cc: Eric Dumazet 
>> Cc: virtualizat...@lists.linux-foundation.org
>> Cc: net...@vger.kernel.org
>> Signed-off-by: Wanlong Gao 
>> ---
>>  drivers/net/virtio_net.c | 39 +--
>>  1 file changed, 29 insertions(+), 10 deletions(-)
>>
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index a6fcf15..a77f86c 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>> @@ -41,6 +41,8 @@ module_param(gso, bool, 0444);
>>  #define VIRTNET_SEND_COMMAND_SG_MAX2
>>  #define VIRTNET_DRIVER_VERSION "1.0.0"
>>  
>> +DEFINE_PER_CPU(int, vq_index) = -1;
>> +
> 
> I think this should not be a global one, consider we may have more than
> one virtio-net cards with different max queues.

Yes, would you move this into virtio_info?

>>  struct virtnet_stats {
>>  struct u64_stats_sync tx_syncp;
>>  struct u64_stats_sync rx_syncp;
>> @@ -1016,6 +1018,7 @@ static int virtnet_vlan_rx_kill_vid(struct net_device *dev, u16 vid)
>>  static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>>  {
>>  int i;
>> +int cpu;
>>  
>>  /* In multiqueue mode, when the number of cpu is equal to the number of
>>   * queue pairs, we let the queue pairs to be private to one cpu by
>> @@ -1029,16 +1032,29 @@ static void virtnet_set_affinity(struct virtnet_info *vi, bool set)
>>  return;
>>  }
>>  
>> -for (i = 0; i < vi->max_queue_pairs; i++) {
>> -int cpu = set ? i : -1;
>> -virtqueue_set_affinity(vi->rq[i].vq, cpu);
>> -virtqueue_set_affinity(vi->sq[i].vq, cpu);
>> -}
>> +if (set) {
>> +i = 0;
>> +for_each_online_cpu(cpu) {
>> +virtqueue_set_affinity(vi->rq[i].vq, cpu);
>> +virtqueue_set_affinity(vi->sq[i].vq, cpu);
>> +per_cpu(vq_index, cpu) = i;
>> +i++;
>> +if (i >= vi->max_queue_pairs)
>> +break;
> 
> Can this happen? We only set affinity when the numbers are equal.

will remove.

>> +}
>>  
>> -if (set)
>>  vi->affinity_hint_set = true;
>> -else
>> +} else {
>> +for(i = 0; i < vi->max_queue_pairs; i++) {
>> +virtqueue_set_affinity(vi->rq[i].vq, -1);
>> +virtqueue_set_affinity(vi->sq[i].vq, -1);
>> +}
>> +
>> +for_each_online_cpu(cpu)
>> +per_cpu(vq_index, cpu) = -1;
>> +
> 
> This looks suboptimal since it may lead to only txq zero being used.

So, which value is best for txq when we don't set affinity?
Just keep using smp_processor_id()?

Thanks,
Wanlong Gao

>>  vi->affinity_hint_set = false;
>> +}
>>  }
>>  
>>  static void virtnet_get_ringparam(struct net_device *dev,
>> @@ -1127,12 +1143,15 @@ static int virtnet_change_mtu(struct net_device *dev, int new_mtu)
>>  
>>  /* To avoid contending a lock hold by a vcpu who would exit to host, select the
>>   * txq based on the processor id.
>> - * TODO: handle cpu hotplug.
>>   */
>>  static u16 virtnet_select_queue(struct net_device *dev, struct sk_buff *skb)
>>  {
>> -int txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) :
>> -  smp_processor_id();
>> +int txq = 0;
>> +
>> +if (skb_rx_queue_recorded(skb))
>> +txq = skb_get_rx_queue(skb);
>> +else if ((txq = per_cpu(vq_index, smp_processor_id())) == -1)
>> +txq = 0;
>>  
>>  while (unlikely(txq >= dev->real_num_tx_queues))
>>  txq -= dev->real_num_tx_queues;
> 
> 
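
As a rough illustration of the mapping the patch is after, the sketch below
walks a set of online CPUs and hands out consecutive queue indexes, so gaps
in the CPU ID space do not turn into gaps in queue numbering. MAX_CPUS, the
online[] array and map_cpus_to_queues() are assumptions made for this
example; the kernel uses for_each_online_cpu() and per-CPU storage instead.

```c
#include <stdbool.h>

#define MAX_CPUS 8   /* hypothetical small system for the sketch */

/*
 * Walk the online CPUs in ID order and hand out queue indexes 0, 1, 2...
 * Offline CPUs keep -1.  Returns the number of queues assigned.
 */
static int map_cpus_to_queues(const bool online[MAX_CPUS],
                              int vq_index[MAX_CPUS], int max_queue_pairs)
{
    int cpu, i = 0;

    /* Start from "no queue" everywhere, as the patch does with -1. */
    for (cpu = 0; cpu < MAX_CPUS; cpu++)
        vq_index[cpu] = -1;

    for (cpu = 0; cpu < MAX_CPUS && i < max_queue_pairs; cpu++) {
        if (!online[cpu])
            continue;           /* skip holes left by hot unplug */
        vq_index[cpu] = i++;
    }
    return i;
}
```

With CPUs 0, 2 and 3 online, the queues come out as 0, 1 and 2, whereas
using the raw CPU ID would have pointed CPU 3 at a queue that may not exist.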



Re: [Pv-drivers] [PATCH 0/6] VSOCK for Linux upstreaming

2013-01-08 Thread Dmitry Torokhov
On Tue, Jan 08, 2013 at 05:46:01PM -0800, David Miller wrote:
> From: Dmitry Torokhov 
> Date: Tue, 08 Jan 2013 17:41:44 -0800
> 
> > On Tuesday, January 08, 2013 05:30:56 PM David Miller wrote:
> >> From: Greg KH 
> >> Date: Tue, 8 Jan 2013 16:21:10 -0800
> >> 
> >> > On Tue, Jan 08, 2013 at 03:59:08PM -0800, George Zhang wrote:
> >> >> * * *
> >> >> 
> > >> >> This series of VSOCK linux upstreaming patches include latest update from
> > >> >> VMware to address Greg's and all other's code review comments.
> >> > 
> >> > Dave, you acked these patches a while ago,
> >> 
> >> Really?  I'd like to see where I did that.
> >> 
> >> Instead, what I remember doing was deferring to the feedback these
> >> folks received, stating that ideas that the virtio people had
> >> mentioned should be considered instead.
> >> 
> > >> http://marc.info/?l=linux-netdev&m=135301515818462&w=2
> > 
> > I believe Andy replied to Anthony's AF_VMCHANNEL post and the differences
> > between the proposed solutions.
> 
> I'd much rather see a hypervisor neutral solution than a hypervisor
> specific one which this certainly is.

Objectively speaking neither solution is hypervisor neutral as there are
hypervisors that implement either VMCI or virtio or something else
entirely.

Our position is that VSOCK feature set is more complete and that it
should be possible to use transports other than VMCI for VSOCK traffic,
should interested parties implement them, and on this basis we ask to
include VSOCK.

Thanks,
Dmitry


Re: [PATCH 7/8] mm: use vm_unmapped_area() on powerpc architecture

2013-01-08 Thread Benjamin Herrenschmidt
On Tue, 2013-01-08 at 17:28 -0800, Michel Lespinasse wrote:
> Update the powerpc slice_get_unmapped_area function to make use of
> vm_unmapped_area() instead of implementing a brute force search.
> 
> Signed-off-by: Michel Lespinasse 
> 
> ---
>  arch/powerpc/mm/slice.c |  128 +-
>  1 files changed, 81 insertions(+), 47 deletions(-)

That doesn't look good ... the resulting code is longer than the
original, which makes me wonder how it is an improvement...

Now it could just be a matter of how the code is factored, I see
quite a bit of duplication of the whole slice mask test...

Cheers,
Ben.

> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index 999a74f25ebe..048346b7eed5 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -242,31 +242,51 @@ static unsigned long slice_find_area_bottomup(struct mm_struct *mm,
> struct slice_mask available,
> int psize)
>  {
> - struct vm_area_struct *vma;
> - unsigned long addr;
> - struct slice_mask mask;
>   int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
> + unsigned long addr, found, slice;
> + struct vm_unmapped_area_info info;
>  
> - addr = TASK_UNMAPPED_BASE;
> + info.flags = 0;
> + info.length = len;
> + info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
> + info.align_offset = 0;
>  
> - for (;;) {
> - addr = _ALIGN_UP(addr, 1ul << pshift);
> - if ((TASK_SIZE - len) < addr)
> - break;
> - vma = find_vma(mm, addr);
> - BUG_ON(vma && (addr >= vma->vm_end));
> + addr = TASK_UNMAPPED_BASE;
> + while (addr < TASK_SIZE) {
> + info.low_limit = addr;
> + if (addr < SLICE_LOW_TOP) {
> + slice = GET_LOW_SLICE_INDEX(addr);
> + addr = (slice + 1) << SLICE_LOW_SHIFT;
> + if (!(available.low_slices & (1u << slice)))
> + continue;
> + } else {
> + slice = GET_HIGH_SLICE_INDEX(addr);
> + addr = (slice + 1) << SLICE_HIGH_SHIFT;
> + if (!(available.high_slices & (1u << slice)))
> + continue;
> + }
>  
> - mask = slice_range_to_mask(addr, len);
> - if (!slice_check_fit(mask, available)) {
> - if (addr < SLICE_LOW_TOP)
> - addr = _ALIGN_UP(addr + 1,  1ul << SLICE_LOW_SHIFT);
> - else
> - addr = _ALIGN_UP(addr + 1,  1ul << SLICE_HIGH_SHIFT);
> - continue;
> + next_slice:
> + if (addr >= TASK_SIZE)
> + addr = TASK_SIZE;
> + else if (addr < SLICE_LOW_TOP) {
> + slice = GET_LOW_SLICE_INDEX(addr);
> + if (available.low_slices & (1u << slice)) {
> + addr = (slice + 1) << SLICE_LOW_SHIFT;
> + goto next_slice;
> + }
> + } else {
> + slice = GET_HIGH_SLICE_INDEX(addr);
> + if (available.high_slices & (1u << slice)) {
> + addr = (slice + 1) << SLICE_HIGH_SHIFT;
> + goto next_slice;
> + }
>   }
> - if (!vma || addr + len <= vma->vm_start)
> - return addr;
> - addr = vma->vm_end;
> + info.high_limit = addr;
> +
> + found = vm_unmapped_area(&info);
> + if (!(found & ~PAGE_MASK))
> + return found;
>   }
>  
>   return -ENOMEM;
> @@ -277,39 +297,53 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
>struct slice_mask available,
>int psize)
>  {
> - struct vm_area_struct *vma;
> - unsigned long addr;
> - struct slice_mask mask;
>   int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
> + unsigned long addr, found, slice;
> + struct vm_unmapped_area_info info;
>  
> - addr = mm->mmap_base;
> - while (addr > len) {
> - /* Go down by chunk size */
> - addr = _ALIGN_DOWN(addr - len, 1ul << pshift);
> + info.flags = VM_UNMAPPED_AREA_TOPDOWN;
> + info.length = len;
> + info.align_mask = PAGE_MASK & ((1ul << pshift) - 1);
> + info.align_offset = 0;
>  
> - /* Check for hit with different page size */
> - mask = slice_range_to_mask(addr, len);
> - if (!slice_check_fit(mask, available)) {
> - if (addr < SLICE_LOW_TOP)
> - addr = _ALIGN_DOWN(addr, 1ul << 
> 
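
One way to read the bottom-up loop in the quoted patch is "find each run of
consecutive available slices and search inside it". The user-space sketch
below shows that idea with a single shared slice test, which is roughly the
kind of factoring the duplication remark hints at. It assumes one slice size
only; SLICE_SHIFT, NUM_SLICES and both function names are hypothetical and
do not come from arch/powerpc/mm/slice.c.

```c
#include <stdint.h>

#define SLICE_SHIFT 28u   /* hypothetical: 256MB slices */
#define NUM_SLICES  16u

/* One shared test for "is this slice usable?", instead of repeating it. */
static int slice_available(uint32_t mask, unsigned int slice)
{
    return slice < NUM_SLICES && (mask & (1u << slice));
}

/*
 * Starting at *addr, find the next run of consecutive available slices.
 * On success store the run in [*low, *high), advance *addr past it and
 * return 1; return 0 when no available slice is left.
 */
static int next_slice_range(uint32_t mask, uint64_t *addr,
                            uint64_t *low, uint64_t *high)
{
    unsigned int slice = (unsigned int)(*addr >> SLICE_SHIFT);

    while (slice < NUM_SLICES && !slice_available(mask, slice))
        slice++;
    if (slice >= NUM_SLICES)
        return 0;

    /* The first range may start mid-slice, at the caller's address. */
    *low = (*addr >> SLICE_SHIFT) == slice ? *addr
                                           : (uint64_t)slice << SLICE_SHIFT;
    while (slice_available(mask, slice))
        slice++;
    *high = (uint64_t)slice << SLICE_SHIFT;
    *addr = *high;
    return 1;
}
```

Each [low, high) pair would then play the role of info.low_limit and
info.high_limit for one search attempt.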

Re: ppoll() stuck on POLLIN while TCP peer is sending

2013-01-08 Thread Eric Dumazet
On Tue, 2013-01-08 at 23:23 +, Eric Wong wrote:
> Mel Gorman  wrote:
> > Please try the following patch. However, even if it works the benefit of
> > capture may be so marginal that partially reverting it and simplifying
> > compaction.c is the better decision.
> 
> I already got my VM stuck on this one.  I had two twosleepy instances,
> 2774 was the one that got stuck (also confirmed by watching top).
> 
> Btw, have you been able to reproduce this on your end?
> 
> I think the easiest reproduction on my 2-core VM is by running 2
> twosleepy processes and doing the following to dirty a lot of pages:

Given the persistent sk_stream_wait_memory() traces I suspect a plain
TCP bug, triggered by some extra wait somewhere.

mm guys, please don't spend too much time on this right now; I'll try to
reproduce the problem.

Don't be confused by the sk_stream_wait_memory() name.
A thread is stuck here because the TCP stack is failing to wake it.





Re: [PATCH] cgroup: use new hashtable implementation

2013-01-08 Thread Li Zefan
On 2013/1/9 2:17, Tejun Heo wrote:
> Hello, Li.
> 
> On Tue, Jan 08, 2013 at 03:51:45PM +0800, Li Zefan wrote:
>> -static struct hlist_head *css_set_hash(struct cgroup_subsys_state *css[])
>> +static unsigned long css_set_hash(struct cgroup_subsys_state *css[])
>>  {
>>  int i;
>> -int index;
>> -unsigned long tmp = 0UL;
>> +unsigned long key = 0UL;
>>  
>>  for (i = 0; i < CGROUP_SUBSYS_COUNT; i++)
>> -tmp += (unsigned long)css[i];
>> -tmp = (tmp >> 16) ^ tmp;
>> +key += (unsigned long)css[i];
>> +key = (key >> 16) ^ key;
> 
> @key is gonna go through hash function anyway.  Do we still need the
> above (key >> 16) ^ key?  It's not gonna help anything.
> 

Nothing's changed after this patch, so the key will still be passed to
hash_long().

I tested this hash function long ago; the original version lacked
(key >> 16) ^ key and produced more hash collisions.

>> -index = hash_long(tmp, CSS_SET_HASH_BITS);
>> -
>> -return &css_set_table[index];
>> +return key;
>>  }
>>  
>>  /* We don't maintain the lists running through each css_set to its
>> @@ -4503,23 +4498,17 @@ int __init_or_module cgroup_load_subsys(struct cgroup_subsys *ss)
>>   * this is all done under the css_set_lock.
>>   */
>>  write_lock(&css_set_lock);
>> -for (i = 0; i < CSS_SET_TABLE_SIZE; i++) {
>> -struct css_set *cg;
>> -struct hlist_node *node, *tmp;
>> -struct hlist_head *bucket = &css_set_table[i], *new_bucket;
>> -
>> -hlist_for_each_entry_safe(cg, node, tmp, bucket, hlist) {
>> -/* skip entries that we already rehashed */
>> -if (cg->subsys[ss->subsys_id])
>> -continue;
>> -/* remove existing entry */
>> -hlist_del(&cg->hlist);
>> -/* set new value */
>> -cg->subsys[ss->subsys_id] = css;
>> -/* recompute hash and restore entry */
>> -new_bucket = css_set_hash(cg->subsys);
>> -hlist_add_head(&cg->hlist, new_bucket);
>> -}
>> +hash_for_each_safe(css_set_table, i, node, tmp, cg, hlist) {
>> +/* skip entries that we already rehashed */
>> +if (cg->subsys[ss->subsys_id])
>> +continue;
>> +/* remove existing entry */
>> +hlist_del(&cg->hlist);
> 
>   hash_del()?
> 

will fix.
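
The key computation under discussion can be sketched in user space as
follows: sum the pointer values, fold the upper bits down with
(key >> 16) ^ key, then feed the result to a bucket hash. css_key() and
hash_bits() are hypothetical names, and the multiplier is an assumption
chosen for the example rather than the constant hash_long() used at the
time.

```c
#include <stdint.h>
#include <stddef.h>

/* Sum the pointer values, then fold the upper bits into the lower ones. */
static uint64_t css_key(const void *const css[], size_t count)
{
    uint64_t key = 0;
    size_t i;

    for (i = 0; i < count; i++)
        key += (uint64_t)(uintptr_t)css[i];
    return (key >> 16) ^ key;
}

/*
 * Multiplicative hash keeping the top 'bits' bits (1 <= bits <= 63).
 * The constant is an assumption for this sketch (a 64-bit golden-ratio
 * value), not necessarily what hash_long() used.
 */
static unsigned int hash_bits(uint64_t key, unsigned int bits)
{
    return (unsigned int)((key * 0x9E3779B97F4A7C15ull) >> (64 - bits));
}
```

Whether the fold still matters once a multiplicative hash is applied is the
open question in the thread; the sketch only makes the two steps explicit so
they can be measured separately.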



Re: [PATCH] target: initialize sense_reason_t ret in core_scsi3_emulate_pro_register()

2013-01-08 Thread Nicholas A. Bellinger
Hi Geert,

Apologies for the delay on this one.  Still catching up on some older
holiday items..

On Sat, 2012-12-22 at 22:15 +0100, Geert Uytterhoeven wrote:
> drivers/target/target_core_pr.c: In function ‘core_scsi3_emulate_pro_register’:
> drivers/target/target_core_pr.c:2056: warning: ‘ret’ may be used uninitialized in this function
> 
> If !spec_i_pt, the "goto out_put_pr_reg" on line 2141 seems to be a real
> case where ret is not initialized.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/target/target_core_pr.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/target/target_core_pr.c b/drivers/target/target_core_pr.c
> index e35dbf8..c2e8026 100644
> --- a/drivers/target/target_core_pr.c
> +++ b/drivers/target/target_core_pr.c
> @@ -2053,7 +2053,7 @@ core_scsi3_emulate_pro_register(struct se_cmd *cmd, u64 res_key, u64 sa_res_key,
>   /* Used for APTPL metadata w/ UNREGISTER */
>   unsigned char *pr_aptpl_buf = NULL;
>   unsigned char isid_buf[PR_REG_ISID_LEN], *isid_ptr = NULL;
> - sense_reason_t ret;
> + sense_reason_t ret = 0;
>   int pr_holder = 0, type;
>  
>   if (!se_sess || !se_lun) {

Looks fine, applied to target-pending/master.

Thank you,

--nab
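
The shape of the bug is easy to reproduce in miniature. In the sketch below,
the early goto skips every assignment to ret, so the function returns
whatever ret was declared with. The typedef, the function name and the
non-zero codes are stand-ins invented for the example, not the target core
definitions.

```c
typedef unsigned int sense_reason_t;  /* stand-in for the kernel type */

static sense_reason_t emulate_register(int have_session, int spec_i_pt)
{
    sense_reason_t ret = 0;   /* without "= 0", the goto below returns garbage */

    if (!have_session)
        return 1;             /* hypothetical error code */

    if (!spec_i_pt)
        goto out;             /* path that never assigns ret */

    ret = 2;                  /* hypothetical result of the spec_i_pt branch */
out:
    return ret;
}
```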



Re: [PATCH 2/4] gpiolib: add gpiod_get and gpiod_put functions

2013-01-08 Thread Alexandre Courbot
On Tue, Jan 8, 2013 at 10:07 PM, Arnd Bergmann  wrote:
> On Tuesday 08 January 2013, Alexandre Courbot wrote:
>>
>> Adds new GPIO allocation functions that work with the opaque descriptor
>> interface.
>>
>> Signed-off-by: Alexandre Courbot 
>
> I think you need to reorder the patches slightly, since the gpiod_get
> function introduced here is already being used in the first patch, which
> breaks bisection.

Yes, gpiod_get and gpiod_put should only appear from the second patch
on. Thanks for pointing that out.

Alex.


Re: [PATCH 0/4] gpio: introduce descriptor-based interface

2013-01-08 Thread Alexandre Courbot
On Tue, Jan 8, 2013 at 10:06 PM, Arnd Bergmann  wrote:
> I like the interface, good idea!

Great! This was initially suggested by Linus W.

> A few questions:
>
> Is there a plan for migrating all the existing users of the current GPIO
> interface?

Nothing specifically planned for now, as we need to make sure the new
interface covers all needs first. There would be a lot of drivers to
change if we decide to deprecate the integer interface, but Coccinelle
could probably help here.

The question is, do we want to totally get rid of the integer
namespace? That would be the ultimate step, but would require another
way to identify GPIOs (controller_device:offset might be such a way),
and also to reorganize sysfs nodes. Wouldn't that be considered
breaking user-space? 'cause we all know what happens to those who
break user-space.

> How do you want to deal with drivers that work on platforms that currently
> don't use gpiolib but have their own implementation of asm/gpio.h? Are
> we going to phase them out?

With the current code, a driver should depend on gpiolib being
compiled if it uses the new interface. It is not even declared if
gpiolib is not used.

Given that both interfaces are quite close, one could imagine having a
gpiod wrapper around the integer namespace (the "opaque descriptors"
would then just be cast integers). This way drivers would only need
to depend on GENERIC_GPIO. It's a little bit weird to have gpiod
wrapping around gpio in one case and the opposite in another though -
I'd rather have these platforms convert to GPIO descriptors internally
(or even better, to gpiolib), but this is probably asking too much.

I do not know all the details of gpiolib's history, but why would
anyone want to implement the generic gpio interface and not use
gpiolib anyways?

Then there are platforms who do not follow generic gpios and implement
their own sauce. I don't think we need to care here, as these are not
supposed to be used with generic drivers anyway.

> If we are adding a new way to deal with GPIOs, would it make sense to
> have that more closely integrated into pinctrl in one form or another?
> My feeling is that there is already a significant overlap between the
> two, and while the addition of the gpiod_* functions doesn't necessarily
> make this worse, it could be a chance to improve the current situation
> to make the interface more consistent with pinctrl.

That may be a chance to introduce deeper changes indeed - what do you
have in mind exactly?

Thanks,
Alex.
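
To make the "descriptors as cast integers" idea above concrete, here is a
small user-space sketch. The helper names int_to_desc() and desc_to_int()
are hypothetical, and the +1 offset is a choice made for this example so
that GPIO 0 does not turn into a NULL descriptor; nothing here is taken from
gpiolib.

```c
#include <stdint.h>

struct gpio_desc;   /* opaque: callers never see the layout */

/* Hypothetical wrappers: the "descriptor" is just the integer, cast. */
static struct gpio_desc *int_to_desc(int gpio)
{
    /* +1 so that GPIO 0 does not become a NULL descriptor */
    return (struct gpio_desc *)(intptr_t)(gpio + 1);
}

static int desc_to_int(const struct gpio_desc *desc)
{
    return (int)(intptr_t)desc - 1;
}
```

A driver written against the descriptor type would not be able to tell
whether it holds such a cast integer or a pointer to a real structure, which
is what would let both back ends sit behind one interface.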


Re: [Pv-drivers] [PATCH 0/6] VSOCK for Linux upstreaming

2013-01-08 Thread David Miller
From: Dmitry Torokhov 
Date: Tue, 08 Jan 2013 17:41:44 -0800

> On Tuesday, January 08, 2013 05:30:56 PM David Miller wrote:
>> From: Greg KH 
>> Date: Tue, 8 Jan 2013 16:21:10 -0800
>> 
>> > On Tue, Jan 08, 2013 at 03:59:08PM -0800, George Zhang wrote:
>> >> * * *
>> >> 
>> >> This series of VSOCK linux upstreaming patches include latest update from
>> >> VMware to address Greg's and all other's code review comments.
>> > 
>> > Dave, you acked these patches a while ago,
>> 
>> Really?  I'd like to see where I did that.
>> 
>> Instead, what I remember doing was deferring to the feedback these
>> folks received, stating that ideas that the virtio people had
>> mentioned should be considered instead.
>> 
>> http://marc.info/?l=linux-netdev&m=135301515818462&w=2
> 
> I believe Andy replied to Anthony's AF_VMCHANNEL post and the differences
> between the proposed solutions.

I'd much rather see a hypervisor neutral solution than a hypervisor
specific one which this certainly is.


Re: [Pv-drivers] [PATCH 0/6] VSOCK for Linux upstreaming

2013-01-08 Thread Dmitry Torokhov
On Tuesday, January 08, 2013 05:30:56 PM David Miller wrote:
> From: Greg KH 
> Date: Tue, 8 Jan 2013 16:21:10 -0800
> 
> > On Tue, Jan 08, 2013 at 03:59:08PM -0800, George Zhang wrote:
> >> * * *
> >> 
> >> This series of VSOCK linux upstreaming patches include latest update from
> >> VMware to address Greg's and all other's code review comments.
> > 
> > Dave, you acked these patches a while ago,
> 
> Really?  I'd like to see where I did that.
> 
> Instead, what I remember doing was deferring to the feedback these
> folks received, stating that ideas that the virtio people had
> mentioned should be considered instead.
> 
> http://marc.info/?l=linux-netdev&m=135301515818462&w=2

I believe Andy replied to Anthony's AF_VMCHANNEL post and the differences
between the proposed solutions.

> 
> So definitely NACK this code and any infrastructure you've
> merged which essentially depends upon it.

No, there is no infrastructure that depends on VSOCK, as VSOCK is built
on top of VMCI, not the other way around.

Thanks,
Dmitry



Re: [PATCH 0/8] vm_unmapped_area: finish the mission

2013-01-08 Thread Michel Lespinasse
Whoops, I was supposed to find a more appropriate subject line before
sending this :]

On Tue, Jan 8, 2013 at 5:28 PM, Michel Lespinasse  wrote:
> These patches, which apply on top of v3.8-rc kernels, are to complete the
> VMA gap finding code I introduced (following Rik's initial proposal) in
> v3.8-rc1.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


[PATCH 1/8] mm: use vm_unmapped_area() on parisc architecture

2013-01-08 Thread Michel Lespinasse
Update the parisc arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse 

---
 arch/parisc/kernel/sys_parisc.c |   46 ++
 1 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/arch/parisc/kernel/sys_parisc.c b/arch/parisc/kernel/sys_parisc.c
index f76c10863c62..6ab138088076 100644
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -35,18 +35,15 @@
 
 static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
 {
-   struct vm_area_struct *vma;
+   struct vm_unmapped_area_info info;
 
-   addr = PAGE_ALIGN(addr);
-
-   for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-   /* At this point:  (!vma || addr < vma->vm_end). */
-   if (TASK_SIZE - len < addr)
-   return -ENOMEM;
-   if (!vma || addr + len <= vma->vm_start)
-   return addr;
-   addr = vma->vm_end;
-   }
+   info.flags = 0;
+   info.length = len;
+   info.low_limit = PAGE_ALIGN(addr);
+   info.high_limit = TASK_SIZE;
+   info.align_mask = 0;
+   info.align_offset = 0;
+   return vm_unmapped_area(&info);
 }
 
 #define DCACHE_ALIGN(addr) (((addr) + (SHMLBA - 1)) &~ (SHMLBA - 1))
@@ -63,30 +60,21 @@ static unsigned long get_unshared_area(unsigned long addr, unsigned long len)
  */
 static int get_offset(struct address_space *mapping)
 {
-   int offset = (unsigned long) mapping << (PAGE_SHIFT - 8);
-   return offset & 0x3FF000;
+   return (unsigned long) mapping >> 8;
 }
 
 static unsigned long get_shared_area(struct address_space *mapping,
unsigned long addr, unsigned long len, unsigned long pgoff)
 {
-   struct vm_area_struct *vma;
-   int offset = mapping ? get_offset(mapping) : 0;
-
-   offset = (offset + (pgoff << PAGE_SHIFT)) & 0x3FF000;
+   struct vm_unmapped_area_info info;
 
-   addr = DCACHE_ALIGN(addr - offset) + offset;
-
-   for (vma = find_vma(current->mm, addr); ; vma = vma->vm_next) {
-   /* At this point:  (!vma || addr < vma->vm_end). */
-   if (TASK_SIZE - len < addr)
-   return -ENOMEM;
-   if (!vma || addr + len <= vma->vm_start)
-   return addr;
-   addr = DCACHE_ALIGN(vma->vm_end - offset) + offset;
-   if (addr < vma->vm_end) /* handle wraparound */
-   return -ENOMEM;
-   }
+   info.flags = 0;
+   info.length = len;
+   info.low_limit = PAGE_ALIGN(addr);
+   info.high_limit = TASK_SIZE;
+   info.align_mask = PAGE_MASK & (SHMLBA - 1);
+   info.align_offset = (get_offset(mapping) + pgoff) << PAGE_SHIFT;
+   return vm_unmapped_area(&info);
 }
 
 unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr,
-- 
1.7.7.3
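
As I understand the align_mask / align_offset pair used above, a candidate
address is bumped up until its bits under the mask match those of the
offset. The helper below is a user-space sketch of that rounding step only;
align_with_offset() is a hypothetical name and the values in the usage note
are made up for illustration.

```c
#include <stdint.h>

/*
 * Round 'addr' up to the next address whose bits under 'align_mask'
 * equal those of 'align_offset'.
 */
static uint64_t align_with_offset(uint64_t addr, uint64_t align_mask,
                                  uint64_t align_offset)
{
    return addr + ((align_offset - addr) & align_mask);
}
```

With a mask of 0x3ff000 and an offset of 0x5000, an address of 0x10000 moves
up to 0x405000, whose masked bits are 0x5000. A mask of 0 leaves the address
unchanged, which corresponds to the unshared case in the patch.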


Re: [PATCH 0/6] VSOCK for Linux upstreaming

2013-01-08 Thread David Miller
From: Greg KH 
Date: Tue, 8 Jan 2013 16:21:10 -0800

> On Tue, Jan 08, 2013 at 03:59:08PM -0800, George Zhang wrote:
>> 
>> * * *
>> 
>> This series of VSOCK linux upstreaming patches include latest update from
>> VMware to address Greg's and all other's code review comments.
> 
> Dave, you acked these patches a while ago,

Really?  I'd like to see where I did that.

Instead, what I remember doing was deferring to the feedback these
folks received, stating that ideas that the virtio people had
mentioned should be considered instead.

http://marc.info/?l=linux-netdev&m=135301515818462&w=2

So definitely NACK this code and any infrastructure you've
merged which essentially depends upon it.


[PATCH 3/8] mm: use vm_unmapped_area() on frv architecture

2013-01-08 Thread Michel Lespinasse
Update the frv arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse 

---
 arch/frv/mm/elf-fdpic.c |   49 --
 1 files changed, 17 insertions(+), 32 deletions(-)

diff --git a/arch/frv/mm/elf-fdpic.c b/arch/frv/mm/elf-fdpic.c
index 385fd30b142f..836f14707a62 100644
--- a/arch/frv/mm/elf-fdpic.c
+++ b/arch/frv/mm/elf-fdpic.c
@@ -60,7 +60,7 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
 unsigned long pgoff, unsigned long flags)
 {
struct vm_area_struct *vma;
-   unsigned long limit;
+   struct vm_unmapped_area_info info;
 
if (len > TASK_SIZE)
return -ENOMEM;
@@ -79,39 +79,24 @@ unsigned long arch_get_unmapped_area(struct file *filp, unsigned long addr, unsi
}
 
/* search between the bottom of user VM and the stack grow area */
-   addr = PAGE_SIZE;
-   limit = (current->mm->start_stack - 0x0020);
-   if (addr + len <= limit) {
-   limit -= len;
-
-   if (addr <= limit) {
-   vma = find_vma(current->mm, PAGE_SIZE);
-   for (; vma; vma = vma->vm_next) {
-   if (addr > limit)
-   break;
-   if (addr + len <= vma->vm_start)
-   goto success;
-   addr = vma->vm_end;
-   }
-   }
-   }
+   info.flags = 0;
+   info.length = len;
+   info.low_limit = PAGE_SIZE;
+   info.high_limit = (current->mm->start_stack - 0x0020);
+   info.align_mask = 0;
+   info.align_offset = 0;
+   addr = vm_unmapped_area(&info);
+   if (!(addr & ~PAGE_MASK))
+   goto success;
+   VM_BUG_ON(addr != -ENOMEM);
 
/* search from just above the WorkRAM area to the top of memory */
-   addr = PAGE_ALIGN(0x8000);
-   limit = TASK_SIZE - len;
-   if (addr <= limit) {
-   vma = find_vma(current->mm, addr);
-   for (; vma; vma = vma->vm_next) {
-   if (addr > limit)
-   break;
-   if (addr + len <= vma->vm_start)
-   goto success;
-   addr = vma->vm_end;
-   }
-
-   if (!vma && addr <= limit)
-   goto success;
-   }
+   info.low_limit = PAGE_ALIGN(0x8000);
+   info.high_limit = TASK_SIZE;
+   addr = vm_unmapped_area(&info);
+   if (!(addr & ~PAGE_MASK))
+   goto success;
+   VM_BUG_ON(addr != -ENOMEM);
 
 #if 0
printk("[area] l=%lx (ENOMEM) f='%s'\n",
-- 
1.7.7.3
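
The control flow of this conversion is "search one window, and if that
fails, search a second one". The user-space sketch below shows that pattern
over a sorted array of mappings using a plain first-fit scan. struct range,
NO_GAP and both function names are assumptions made for the example; the
real search is done by vm_unmapped_area() and reports failure as -ENOMEM.

```c
#include <stddef.h>

struct range { unsigned long start, end; };   /* one mapping: [start, end) */

#define NO_GAP (~0ul)   /* hypothetical failure value for this sketch */

/* First-fit search for 'len' free bytes in [low, high); maps sorted by start. */
static unsigned long find_gap(const struct range *maps, size_t n,
                              unsigned long len,
                              unsigned long low, unsigned long high)
{
    unsigned long addr = low;
    size_t i;

    for (i = 0; i < n; i++) {
        if (maps[i].end <= addr)
            continue;                       /* mapping lies below addr */
        if (addr + len <= maps[i].start)
            break;                          /* gap before this mapping fits */
        addr = maps[i].end;                 /* skip past the mapping */
    }
    if (addr > high || len > high - addr)
        return NO_GAP;
    return addr;
}

/* Try the preferred window first, then fall back to the second one. */
static unsigned long find_gap_two_windows(const struct range *maps, size_t n,
                                          unsigned long len,
                                          unsigned long low1, unsigned long high1,
                                          unsigned long low2, unsigned long high2)
{
    unsigned long addr = find_gap(maps, n, len, low1, high1);

    if (addr != NO_GAP)
        return addr;
    return find_gap(maps, n, len, low2, high2);
}
```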


[PATCH 0/8] vm_unmapped_area: finish the mission

2013-01-08 Thread Michel Lespinasse
These patches, which apply on top of v3.8-rc kernels, are to complete the
VMA gap finding code I introduced (following Rik's initial proposal) in
v3.8-rc1.

First 5 patches introduce the use of vm_unmapped_area() to replace brute
force searches on parisc, alpha, frv and ia64 architectures (all relatively
trivial uses of the vm_unmapped_area() infrastructure)

Next 2 patches do the same as above for the powerpc architecture. This
change is not as trivial as for the other architectures, because we
need to account for each address space slice potentially having a
different page size.

The last patch removes the free_area_cache, which was used by all the
brute force searches before they got converted to the
vm_unmapped_area() infrastructure.

I did some basic testing on x86 and powerpc; however the first 5 (simpler)
patches for parisc, alpha, frv and ia64 architectures are untested.

Michel Lespinasse (8):
  mm: use vm_unmapped_area() on parisc architecture
  mm: use vm_unmapped_area() on alpha architecture
  mm: use vm_unmapped_area() on frv architecture
  mm: use vm_unmapped_area() on ia64 architecture
  mm: use vm_unmapped_area() in hugetlbfs on ia64 architecture
  mm: remove free_area_cache use in powerpc architecture
  mm: use vm_unmapped_area() on powerpc architecture
  mm: remove free_area_cache

 arch/alpha/kernel/osf_sys.c  |   20 ++--
 arch/arm/mm/mmap.c   |2 -
 arch/arm64/mm/mmap.c |2 -
 arch/frv/mm/elf-fdpic.c  |   49 +++
 arch/ia64/kernel/sys_ia64.c  |   37 ++
 arch/ia64/mm/hugetlbpage.c   |   20 ++--
 arch/mips/mm/mmap.c  |2 -
 arch/parisc/kernel/sys_parisc.c  |   46 +++
 arch/powerpc/include/asm/page_64.h   |3 +-
 arch/powerpc/mm/hugetlbpage.c|2 +-
 arch/powerpc/mm/mmap_64.c|2 -
 arch/powerpc/mm/slice.c  |  228 +-
 arch/powerpc/platforms/cell/spufs/file.c |2 +-
 arch/s390/mm/mmap.c  |4 -
 arch/sparc/kernel/sys_sparc_64.c |2 -
 arch/tile/mm/mmap.c  |2 -
 arch/x86/ia32/ia32_aout.c|2 -
 arch/x86/mm/mmap.c   |2 -
 fs/binfmt_aout.c |2 -
 fs/binfmt_elf.c  |2 -
 include/linux/mm_types.h |3 -
 include/linux/sched.h|2 -
 kernel/fork.c|4 -
 mm/mmap.c|   28 
 mm/nommu.c   |4 -
 mm/util.c|1 -
 26 files changed, 163 insertions(+), 310 deletions(-)

-- 
1.7.7.3


RE: [PATCH] Allow Marvell SATA driver to work with LEDS_TRIGGER_IDE_DISK

2013-01-08 Thread Kim, Milo
> -Original Message-
> From: linux-leds-ow...@vger.kernel.org [mailto:linux-leds-
> ow...@vger.kernel.org] On Behalf Of Jason Cooper
> Sent: Wednesday, January 09, 2013 5:04 AM
> To: Josh Coombs
> Cc: coolo...@gmail.com; linux-kernel@vger.kernel.org; linux-
> i...@vger.kernel.org; rpur...@rpsys.net; linux ARM; jgar...@pobox.com;
> linux-l...@vger.kernel.org
> Subject: Re: [PATCH] Allow Marvell SATA driver to work with
> LEDS_TRIGGER_IDE_DISK
> 
> On Tue, Jan 08, 2013 at 02:18:04PM -0500, Josh Coombs wrote:
> > Would it make more sense to add a second trigger
> > for SATA instead?
> 
> I'll leave that up to the led guys, I just wanted to raise the point
> that the driver could logically support other types of disks, and we
> should come up with a migration path instead of adding capabilities
> ad-hoc.

I agree with Jason's suggestion. One more thing to consider.

If we replace the name, 'ide-disk' or 'ide_disk' with general one,
then we should change the value of 'default_trigger' under arch directory also.
LED trigger works if the name is matched with trigger name of LED device.

Thanks,
Milo
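
The name-matching concern can be shown with a few lines of C. The struct and
helper below are a simplified, hypothetical model of an LED with a
default_trigger string, not the LED class code; the second trigger name in
the usage note is also invented for the example.

```c
#include <string.h>

struct led { const char *name; const char *default_trigger; };

/* An LED picks up a trigger only when the names compare equal. */
static int led_matches_trigger(const struct led *led, const char *trigger_name)
{
    return led->default_trigger &&
           strcmp(led->default_trigger, trigger_name) == 0;
}
```

An LED declared with default_trigger "ide-disk" matches a trigger registered
as "ide-disk" but not one registered under a new name such as
"disk-activity", so renaming the trigger without updating the board files
would silently leave the LED unbound.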


[PATCH 4/8] mm: use vm_unmapped_area() on ia64 architecture

2013-01-08 Thread Michel Lespinasse
Update the ia64 arch_get_unmapped_area function to make use of
vm_unmapped_area() instead of implementing a brute force search.

Signed-off-by: Michel Lespinasse 

---
 arch/ia64/kernel/sys_ia64.c |   37 -
 1 files changed, 12 insertions(+), 25 deletions(-)

diff --git a/arch/ia64/kernel/sys_ia64.c b/arch/ia64/kernel/sys_ia64.c
index d9439ef2f661..41e33f84c185 100644
--- a/arch/ia64/kernel/sys_ia64.c
+++ b/arch/ia64/kernel/sys_ia64.c
@@ -25,9 +25,9 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
unsigned long pgoff, unsigned long flags)
 {
long map_shared = (flags & MAP_SHARED);
-   unsigned long start_addr, align_mask = PAGE_SIZE - 1;
+   unsigned long align_mask = 0;
struct mm_struct *mm = current->mm;
-   struct vm_area_struct *vma;
+   struct vm_unmapped_area_info info;
 
if (len > RGN_MAP_LIMIT)
return -ENOMEM;
@@ -44,7 +44,7 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 		addr = 0;
 #endif
 	if (!addr)
-		addr = mm->free_area_cache;
+		addr = TASK_UNMAPPED_BASE;
 
 	if (map_shared && (TASK_SIZE > 0xfffffffful))
 		/*
@@ -53,28 +53,15 @@ arch_get_unmapped_area (struct file *filp, unsigned long addr, unsigned long len
 		 * tasks, we prefer to avoid exhausting the address space too quickly by
 		 * limiting alignment to a single page.
 		 */
-		align_mask = SHMLBA - 1;
-
-  full_search:
-	start_addr = addr = (addr + align_mask) & ~align_mask;
-
-	for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
-		/* At this point:  (!vma || addr < vma->vm_end). */
-		if (TASK_SIZE - len < addr || RGN_MAP_LIMIT - len < REGION_OFFSET(addr)) {
-			if (start_addr != TASK_UNMAPPED_BASE) {
-				/* Start a new search --- just in case we missed some holes.  */
-				addr = TASK_UNMAPPED_BASE;
-				goto full_search;
-			}
-			return -ENOMEM;
-		}
-		if (!vma || addr + len <= vma->vm_start) {
-			/* Remember the address where we stopped this search:  */
-			mm->free_area_cache = addr + len;
-			return addr;
-		}
-		addr = (vma->vm_end + align_mask) & ~align_mask;
-	}
+		align_mask = PAGE_MASK & (SHMLBA - 1);
+
+	info.flags = 0;
+	info.length = len;
+	info.low_limit = addr;
+	info.high_limit = TASK_SIZE;
+	info.align_mask = align_mask;
+	info.align_offset = 0;
+	return vm_unmapped_area(&info);
 }
 
 asmlinkage long
-- 
1.7.7.3
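
For readers following the align_mask change in the patch above, here is a
small userspace sketch of the mask arithmetic. It is only a model under
stated assumptions: the MODEL_* constants (16KB pages, 1MB SHMLBA) and both
function names are hypothetical, and the rounding step is an assumed reading
of "place the mapping at the first address whose masked bits are zero", not
a copy of the kernel implementation.

```c
#include <assert.h>

/* Assumed values for illustration only; real ia64 page size is configurable. */
#define MODEL_PAGE_SHIFT	14UL
#define MODEL_PAGE_SIZE		(1UL << MODEL_PAGE_SHIFT)
#define MODEL_PAGE_MASK		(~(MODEL_PAGE_SIZE - 1))
#define MODEL_SHMLBA		(1024UL * 1024UL)

/*
 * Shared mappings of 64-bit tasks get SHMLBA alignment; everything else
 * gets a zero mask, i.e. plain page alignment.  The sub-page bits are
 * masked off because the search is assumed to work on page-aligned
 * addresses already.
 */
unsigned long model_align_mask(int shared_64bit)
{
	return shared_64bit ? (MODEL_PAGE_MASK & (MODEL_SHMLBA - 1)) : 0;
}

/* Round a page-aligned address up until (addr & align_mask) == 0. */
unsigned long model_align_up(unsigned long addr, unsigned long align_mask)
{
	assert((addr & ~MODEL_PAGE_MASK) == 0);	/* caller passes page-aligned addr */
	return addr + ((0UL - addr) & align_mask);
}
```

Under these assumptions the shared mask is 0xFC000 (bits 14 to 19), so a
candidate address of 0x104000 is rounded up to 0x200000, while a zero mask
leaves any page-aligned address unchanged.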

