date:20120830

Re: [RFC v2 PATCH 1/7] thp: remove assumptions on pgtable_t type

2012-08-30 Thread Aneesh Kumar K.V

Gerald Schaefer  writes:

> The thp page table pre-allocation code currently assumes that pgtable_t
> is of type "struct page *". This may not be true for all architectures,
> so this patch removes that assumption by replacing the functions
> prepare_pmd_huge_pte() and get_pmd_huge_pte() with two new functions
> that can be defined architecture-specific.
>
> It also removes two VM_BUG_ON checks for page_count() and page_mapcount()
> operating on a pgtable_t. Apart from the VM_BUG_ON removal, there will
> be no functional change introduced by this patch.

Why is that VM_BUG_ON not needed any more? What has changed that breaks
that requirement?

-aneesh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v2 0/2] dw_dmac: repair driver for use with AVR32 (AP7000)

2012-08-30 Thread Viresh Kumar

On 31 August 2012 10:45, Hein Tibosch  wrote:
> On 8/31/2012 12:26 PM, Viresh Kumar wrote:
>> BTW, Ideally speaking the fix for AVR32 which will enable 32bit mem
>> support and enable BIG endian support should have been part of this
>> patchset. That code can go through DMA tree as these patches are
>> very closely related. Otherwise now you have to wait till these patches
>> are included in linux-next, then only you can send AVR32 patches for
>> inclusion.
>>
>> So, maybe you can just add Acked-by from me and Arnd and include
>> AVR patches (Only changes related to these two patches) in the same series.
>> That will make life easier for you.
>
> Good idea, I already wondered how these 5 patches can be kept together:
>
> 1 [PATCH v2 1/2] dw_dmac: make driver endianness configurable
> 2 [PATCH v2 2/2] dw_dmac: max_mem_width limits value for SRC/DST_TR_WID 
> register
> 3 [PATCH v2] avr32: at32ap700x: set DMA slave properties for MCI dw_dmac
> 4 [PATCH v2 1/2] mmc: atmel-mci: DMA can be used with other controller
> 5 [PATCH v2 2/2] mmc: atmel-mci: AP700x PDC is not connected to MCI
>
> Patch 3 will only compile after patch 2 has been applied.
>
> Patch 4 and 5 will compile but they will only result in a working mci+dma
> after patches 1, 2 and 3 have been applied.
>
> I'm a mere  developer, not a MAINTAINER. But sure it would be good to keep
> these together as much as possible. It would also be easier for fellow
> avr32/mci users who want to upgrade to 3.5.2 without problems.
>
> So I assume that you want patches 1 to 3, packed as [PATCH v3 3/3] ?
>
> The atmel-mci patches will be handled by Ludovic Desroches.

Perfect !!

Re: [RFC PATCH 1/5] ACPI: Add acpi_lookup_driver() function

2012-08-30 Thread Bjorn Helgaas

On Thu, Aug 30, 2012 at 1:16 PM, Toshi Kani  wrote:
> Added acpi_lookup_driver(), which looks up an associated driver
> for the notified ACPI device object by walking through the list
> of ACPI drivers.
>
> Signed-off-by: Toshi Kani 
> ---
>  drivers/acpi/scan.c |   65 
> +++
>  include/acpi/acpi_bus.h |2 +
>  2 files changed, 67 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index d1ecca2..d0e0d18 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -1630,3 +1630,68 @@ int __init acpi_scan_init(void)
>
> return result;
>  }
> +
> +static int acpi_match_driver(struct device_driver *drv, void *data)
> +{
> +   struct acpi_driver *driver = to_acpi_driver(drv);
> +   struct acpi_device *device = (struct acpi_device *) data;
> +   int ret;
> +
> +   ret = acpi_match_device_ids(device, driver->ids);
> +   if (!ret)
> +   device->driver = driver;
> +
> +   return !ret;
> +}
> +
> +/**
> + * acpi_lookup_driver: Look up a driver for the notified ACPI device
> + * @handle: ACPI handle of the notified device object
> + * @event: Notify event
> + *
> + * Look up an associated driver for the notified ACPI device object
> + * by walking through the list of ACPI drivers.
> + */
> +struct acpi_driver *acpi_lookup_driver(acpi_handle handle, u32 event)
> +{
> +   struct acpi_device *device;
> +   struct acpi_driver *driver = NULL;
> +   unsigned long long sta;
> +   int type;
> +   int ret;
> +
> +   /* allocate a temporary device object */
> +   device = kzalloc(sizeof(struct acpi_device), GFP_KERNEL);
> +   if (!device) {
> +   pr_err(PREFIX "No memory to allocate a tmp device\n");
> +   return NULL;
> +   }
> +
> +   ret = acpi_bus_type_and_status(handle, &type, &sta);
> +   if (ret) {
> +   pr_err(PREFIX "Failed to get type of device\n");
> +   goto out;
> +   }
> +
> +   /* setup this temporary device object */
> +   INIT_LIST_HEAD(&device->pnp.ids);
> +   device->device_type = type;
> +   device->handle = handle;
> +   device->parent = acpi_bus_get_parent(handle);
> +   device->dev.bus = &acpi_bus_type;
> +   device->driver = NULL;
> +   STRUCT_TO_INT(device->status) = sta;
> +   device->status.present = 1;
> +
> +   /* set HID to this device object */
> +   acpi_device_set_id(device);
> +
> +   /* lookup a matching driver */
> +   (void) bus_for_each_drv(device->dev.bus, NULL,
> +   device, acpi_match_driver);
> +   driver = device->driver;

This path is used when we receive a Notify to a device and a matching
driver has been registered, but the driver is not bound to the device.
 For example, it may be a newly-added device where we haven't bound a
driver to it yet.

Is there anything that prevents us from unloading the driver between
here (the point where we capture the "struct acpi_driver *") and the
point where we call "driver->ops.sys_notify"?

> +
> +out:
> +   kfree(device);
> +   return driver;
> +}
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index bde976e..a773b46 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -345,6 +345,8 @@ extern int unregister_acpi_notifier(struct notifier_block 
> *);
>
>  extern int register_acpi_bus_notifier(struct notifier_block *nb);
>  extern void unregister_acpi_bus_notifier(struct notifier_block *nb);
> +extern struct acpi_driver *acpi_lookup_driver(acpi_handle handle, u32 event);
> +
>  /*
>   * External Functions
>   */
> --
> 1.7.7.6
>

[PATCH] hv: vmbus_drv: detect hyperv through x86_hyper

2012-08-30 Thread Jason Wang

There are two reasons we need to use x86_hyper instead of
query_hypervisor_presence():

- Not only hyperv but also other hypervisors such as kvm would set
  X86_FEATURE_HYPERVISOR, so query_hypervisor_presence() will return true even
  in kvm. This may cause extra delay of 5 seconds before failing the probing in
  kvm guest.
- The hypervisor has been detected in init_hypervisor(), so no need to do the
  work again.

Cc: "K. Y. Srinivasan" 
Cc: Haiyang Zhang 
Signed-off-by: Jason Wang 
---
 drivers/hv/vmbus_drv.c |   25 ++---
 1 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index f40dd57..8e1a9ec 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -34,6 +34,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "hyperv_vmbus.h"
 
 
@@ -719,33 +720,11 @@ static struct acpi_driver vmbus_acpi_driver = {
},
 };
 
-/*
- * query_hypervisor_presence
- * - Query the cpuid for presence of windows hypervisor
- */
-static int query_hypervisor_presence(void)
-{
-   unsigned int eax;
-   unsigned int ebx;
-   unsigned int ecx;
-   unsigned int edx;
-   unsigned int op;
-
-   eax = 0;
-   ebx = 0;
-   ecx = 0;
-   edx = 0;
-   op = HVCPUID_VERSION_FEATURES;
-   cpuid(op, &eax, &ebx, &ecx, &edx);
-
-   return ecx & HV_PRESENT_BIT;
-}
-
 static int __init hv_acpi_init(void)
 {
int ret, t;
 
-   if (!query_hypervisor_presence())
+   if (x86_hyper != &x86_hyper_ms_hyperv)
return -ENODEV;
 
init_completion(_event);
-- 
1.7.1


RE: [PATCH] x86/kernel: remove tboot 1:1 page table creation code

2012-08-30 Thread Wei, Gang

Acked-by: Gang Wei 

> From: Xiaoyan Zhang 
> 
> For TXT boot, when the Linux kernel tries to shutdown/S3/S4/reboot, it needs
> to jump back to tboot code and do TXT teardown work. Previously the kernel
> zapped all identity mappings (va=pa) after booting, so the tboot code's
> memory had to be mapped again with an identity mapping. Now the kernel no
> longer zaps the identity mapping page table, so the tboot-related code can
> drop the remapping code before trapping back.
> 
> Signed-off-by: Xiaoyan Zhang 
> ---
>  arch/x86/kernel/tboot.c |   78
+++
>  1 files changed, 5 insertions(+), 73 deletions(-)
> 
> diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
> index f84fe00..d4f460f 100644
> --- a/arch/x86/kernel/tboot.c
> +++ b/arch/x86/kernel/tboot.c
> @@ -103,71 +103,13 @@ void __init tboot_probe(void)
>   pr_debug("tboot_size: 0x%x\n", tboot->tboot_size);
>  }
> 
> -static pgd_t *tboot_pg_dir;
> -static struct mm_struct tboot_mm = {
> - .mm_rb  = RB_ROOT,
> - .pgd= swapper_pg_dir,
> - .mm_users   = ATOMIC_INIT(2),
> - .mm_count   = ATOMIC_INIT(1),
> - .mmap_sem   = __RWSEM_INITIALIZER(init_mm.mmap_sem),
> - .page_table_lock =
> __SPIN_LOCK_UNLOCKED(init_mm.page_table_lock),
> - .mmlist = LIST_HEAD_INIT(init_mm.mmlist),
> -};
> -
>  static inline void switch_to_tboot_pt(void)
>  {
> - write_cr3(virt_to_phys(tboot_pg_dir));
> -}
> -
> -static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
> -   pgprot_t prot)
> -{
> - pgd_t *pgd;
> - pud_t *pud;
> - pmd_t *pmd;
> - pte_t *pte;
> -
> - pgd = pgd_offset(&tboot_mm, vaddr);
> - pud = pud_alloc(&tboot_mm, pgd, vaddr);
> - if (!pud)
> - return -1;
> - pmd = pmd_alloc(&tboot_mm, pud, vaddr);
> - if (!pmd)
> - return -1;
> - pte = pte_alloc_map(&tboot_mm, NULL, pmd, vaddr);
> - if (!pte)
> - return -1;
> - set_pte_at(&tboot_mm, vaddr, pte, pfn_pte(pfn, prot));
> - pte_unmap(pte);
> - return 0;
> -}
> -
> -static int map_tboot_pages(unsigned long vaddr, unsigned long start_pfn,
> -unsigned long nr)
> -{
> - /* Reuse the original kernel mapping */
> - tboot_pg_dir = pgd_alloc(&tboot_mm);
> - if (!tboot_pg_dir)
> - return -1;
> -
> - for (; nr > 0; nr--, vaddr += PAGE_SIZE, start_pfn++) {
> - if (map_tboot_page(vaddr, start_pfn, PAGE_KERNEL_EXEC))
> - return -1;
> - }
> -
> - return 0;
> -}
> -
> -static void tboot_create_trampoline(void)
> -{
> - u32 map_base, map_size;
> -
> - /* Create identity map for tboot shutdown code. */
> - map_base = PFN_DOWN(tboot->tboot_base);
> - map_size = PFN_UP(tboot->tboot_size);
> - if (map_tboot_pages(map_base << PAGE_SHIFT, map_base, map_size))
> - panic("tboot: Error mapping tboot pages (mfns) @ 0x%x,
0x%x\n",
> -   map_base, map_size);
> +#ifdef CONFIG_X86_32
> + load_cr3(initial_page_table);
> +#else
> + write_cr3(real_mode_header->trampoline_pgd);
> +#endif
>  }
> 
>  #ifdef CONFIG_ACPI_SLEEP
> @@ -225,14 +167,6 @@ void tboot_shutdown(u32 shutdown_type)
>   if (!tboot_enabled())
>   return;
> 
> - /*
> -  * if we're being called before the 1:1 mapping is set up then just
> -  * return and let the normal shutdown happen; this should only be
> -  * due to very early panic()
> -  */
> - if (!tboot_pg_dir)
> - return;
> -
>   /* if this is S3 then set regions to MAC */
>   if (shutdown_type == TB_SHUTDOWN_S3)
>   if (tboot_setup_sleep())
> @@ -343,8 +277,6 @@ static __init int tboot_late_init(void)
>   if (!tboot_enabled())
>   return 0;
> 
> - tboot_create_trampoline();
> -
>   atomic_set(_wfs_count, 0);
>   register_hotcpu_notifier(_cpu_notifier);
> 
> --
> 1.7.7.6




[PATCH V2] block/throttle: Add IO throttled information in blkio.throttle.

2012-08-30 Thread Tao Ma

From: Tao Ma 

Currently, if the IO is throttled by io-throttle, the SA has no idea of
the situation and can't tell the real application user that he/she
has to do something about it. So this patch adds a new interface
named blkio.throttle.io_queued which indicates how many IOs are
currently throttled.

Also another function blkg_rwstat_dec is added since the number of throttled
IOs can either increase or decrease.

Cc: Tejun Heo 
Cc: Vivek Goyal 
Cc: Jens Axboe 
Signed-off-by: Tao Ma 
---
 block/blk-cgroup.h   |   26 ++
 block/blk-throttle.c |   37 +
 2 files changed, 63 insertions(+), 0 deletions(-)

diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 2459730..b1f6f5c 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -413,6 +413,32 @@ static inline void blkg_rwstat_add(struct blkg_rwstat 
*rwstat,
 }
 
 /**
+ * blkg_rwstat_dec - dec a value to a blkg_rwstat
+ * @rwstat: target blkg_rwstat
+ * @rw: mask of REQ_{WRITE|SYNC}
+ * @val: value to dec
+ *
+ * Dec @val to @rwstat.  The counters are chosen according to @rw.  The
+ * caller is responsible for synchronizing calls to this function.
+ */
+static inline void blkg_rwstat_dec(struct blkg_rwstat *rwstat,
+  int rw, uint64_t val)
+{
+   u64_stats_update_begin(&rwstat->syncp);
+
+   if (rw & REQ_WRITE)
+   rwstat->cnt[BLKG_RWSTAT_WRITE] -= val;
+   else
+   rwstat->cnt[BLKG_RWSTAT_READ] -= val;
+   if (rw & REQ_SYNC)
+   rwstat->cnt[BLKG_RWSTAT_SYNC] -= val;
+   else
+   rwstat->cnt[BLKG_RWSTAT_ASYNC] -= val;
+
+   u64_stats_update_end(&rwstat->syncp);
+}
+
+/**
  * blkg_rwstat_read - read the current values of a blkg_rwstat
  * @rwstat: blkg_rwstat to read
  *
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 1588c2d..9317d71 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -46,6 +46,8 @@ struct tg_stats_cpu {
struct blkg_rwstat  service_bytes;
/* total IOs serviced, post merge */
struct blkg_rwstat  serviced;
+   /* total IOs queued, not submitted to the underlying device. */
+   struct blkg_rwstat  io_queued;
 };
 
 struct throtl_grp {
@@ -267,6 +269,7 @@ static void throtl_pd_reset_stats(struct blkcg_gq *blkg)
 
blkg_rwstat_reset(>service_bytes);
blkg_rwstat_reset(>serviced);
+   blkg_rwstat_reset(>io_queued);
}
 }
 
@@ -700,6 +703,31 @@ static void throtl_update_dispatch_stats(struct throtl_grp 
*tg, u64 bytes,
local_irq_restore(flags);
 }
 
+static void throtl_update_queued_stats(struct throtl_grp *tg, int rw, int add)
+{
+   struct tg_stats_cpu *stats_cpu;
+   unsigned long flags;
+
+   /* If per cpu stats are not allocated yet, don't do any accounting. */
+   if (tg->stats_cpu == NULL)
+   return;
+
+   /*
+* Disabling interrupts to provide mutual exclusion between two
+* writes on same cpu. It probably is not needed for 64bit. Not
+* optimizing that case yet.
+*/
+   local_irq_save(flags);
+
+   stats_cpu = this_cpu_ptr(tg->stats_cpu);
+   if (add)
+   blkg_rwstat_add(&stats_cpu->io_queued, rw, 1);
+   else
+   blkg_rwstat_dec(&stats_cpu->io_queued, rw, 1);
+
+   local_irq_restore(flags);
+}
+
 static void throtl_charge_bio(struct throtl_grp *tg, struct bio *bio)
 {
bool rw = bio_data_dir(bio);
@@ -715,6 +743,8 @@ static void throtl_add_bio_tg(struct throtl_data *td, 
struct throtl_grp *tg,
struct bio *bio)
 {
bool rw = bio_data_dir(bio);
+   struct tg_stats_cpu *stats_cpu;
+   unsigned long flags;
 
bio_list_add(&tg->bio_lists[rw], bio);
/* Take a bio reference on tg */
@@ -722,6 +752,7 @@ static void throtl_add_bio_tg(struct throtl_data *td, 
struct throtl_grp *tg,
tg->nr_queued[rw]++;
td->nr_queued[rw]++;
throtl_enqueue_tg(td, tg);
+   throtl_update_queued_stats(tg, bio->bi_rw, 1);
 }
 
 static void tg_update_disptime(struct throtl_data *td, struct throtl_grp *tg)
@@ -762,6 +793,7 @@ static void tg_dispatch_one_bio(struct throtl_data *td, 
struct throtl_grp *tg,
bio->bi_rw |= REQ_THROTTLED;
 
throtl_trim_slice(td, tg, rw);
+   throtl_update_queued_stats(tg, bio->bi_rw, 0);
 }
 
 static int throtl_dispatch_tg(struct throtl_data *td, struct throtl_grp *tg,
@@ -1090,6 +1122,11 @@ static struct cftype throtl_files[] = {
.private = offsetof(struct tg_stats_cpu, serviced),
.read_seq_string = tg_print_cpu_rwstat,
},
+   {
+   .name = "throttle.io_queued",
+   .private = offsetof(struct tg_stats_cpu, io_queued),
+   .read_seq_string = tg_print_cpu_rwstat,
+   },
{ } /* terminate */
 };
 
-- 
1.7.0.4
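
The accounting this patch adds can be modelled in userspace. What follows is only a minimal sketch under stated assumptions, not the kernel code: the flag values for `REQ_WRITE` and `REQ_SYNC`, the `rwstat_model` struct and both helper names are stand-ins invented for illustration, and the `u64_stats_update_begin()`/`u64_stats_update_end()` synchronization and per-cpu handling are left out.

```c
#include <stdint.h>

/* Illustrative stand-ins; the kernel's real flag values differ. */
#define REQ_WRITE (1u << 0)
#define REQ_SYNC  (1u << 1)

enum { RWSTAT_READ, RWSTAT_WRITE, RWSTAT_SYNC, RWSTAT_ASYNC, RWSTAT_NR };

/* Hypothetical userspace model of a read/write + sync/async counter set. */
struct rwstat_model {
    uint64_t cnt[RWSTAT_NR];
};

/*
 * Select one direction counter and one sync counter from @rw, then
 * add @val to both (add != 0) or subtract it from both (add == 0).
 */
static void rwstat_adjust(struct rwstat_model *s, unsigned int rw,
                          uint64_t val, int add)
{
    int dir = (rw & REQ_WRITE) ? RWSTAT_WRITE : RWSTAT_READ;
    int sync = (rw & REQ_SYNC) ? RWSTAT_SYNC : RWSTAT_ASYNC;

    if (add) {
        s->cnt[dir] += val;
        s->cnt[sync] += val;
    } else {
        s->cnt[dir] -= val;
        s->cnt[sync] -= val;
    }
}

/* Bios currently queued: every bio is counted once as a read or a write. */
static uint64_t rwstat_queued(const struct rwstat_model *s)
{
    return s->cnt[RWSTAT_READ] + s->cnt[RWSTAT_WRITE];
}
```

In this model a bio is added when it is queued and subtracted with the same `rw` bits when it is dispatched, so the counters return to zero once nothing is throttled.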


RE: [PATCH] MAINTAINERS: fix TXT maintainer list and source repo path

2012-08-30 Thread Wei, Gang

Thanks for pointing it out.

Jimmy

On Thu, 2012-08-30 at 16:34 +0800, j...@perches.com wrote:
> On Thu, 2012-08-30 at 13:19 +0800, gang@intel.com wrote:
> > diff --git a/MAINTAINERS b/MAINTAINERS
> []
> >  INTEL(R) TRUSTED EXECUTION TECHNOLOGY (TXT)
> []
> > -T: Mercurial http://www.bughost.org/repos.hg/tboot.hg
> > +T: Mercurial http://tboot.hg.sourceforge.net:8000/hgroot/tboot/tboot
>
> Perhaps this would be better as:
> T:hg http://tboot.hg.sourceforge.net:8000/hgroot/tboot/tboot
>
> From the MAINTAINERS introduction:
>
> Descriptions of section entries:
> []
> T: SCM tree type and location.  Type is one of: git, hg, quilt, stgit, 
> topgit.
>



[PATCH] block/throttle: Call throtl_update_dispatch_stats with throtl_grp directly.

2012-08-30 Thread Tao Ma

From: Tao Ma 

All callers of throtl_update_dispatch_stats use tg_to_blkg, and then
in this function we use blkg_to_tg to change it back. So remove
these useless conversions and pass throtl_grp directly to
throtl_update_dispatch_stats.

Cc: Tejun Heo 
Cc: Vivek Goyal 
Cc: Jens Axboe 
Signed-off-by: Tao Ma 
---
 block/blk-throttle.c |7 +++
 1 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index e287c19..1588c2d 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -675,10 +675,9 @@ static bool tg_may_dispatch(struct throtl_data *td, struct 
throtl_grp *tg,
return 0;
 }
 
-static void throtl_update_dispatch_stats(struct blkcg_gq *blkg, u64 bytes,
+static void throtl_update_dispatch_stats(struct throtl_grp *tg, u64 bytes,
 int rw)
 {
-   struct throtl_grp *tg = blkg_to_tg(blkg);
struct tg_stats_cpu *stats_cpu;
unsigned long flags;
 
@@ -709,7 +708,7 @@ static void throtl_charge_bio(struct throtl_grp *tg, struct 
bio *bio)
tg->bytes_disp[rw] += bio->bi_size;
tg->io_disp[rw]++;
 
-   throtl_update_dispatch_stats(tg_to_blkg(tg), bio->bi_size, bio->bi_rw);
+   throtl_update_dispatch_stats(tg, bio->bi_size, bio->bi_rw);
 }
 
 static void throtl_add_bio_tg(struct throtl_data *td, struct throtl_grp *tg,
@@ -1133,7 +1132,7 @@ bool blk_throtl_bio(struct request_queue *q, struct bio 
*bio)
tg = throtl_lookup_tg(td, blkcg);
if (tg) {
if (tg_no_rule_group(tg, rw)) {
-   throtl_update_dispatch_stats(tg_to_blkg(tg),
+   throtl_update_dispatch_stats(tg,
 bio->bi_size, bio->bi_rw);
goto out_unlock_rcu;
}
-- 
1.7.0.4


Re: WARNING: at fs/inode.c:280 drop_nlink+0x31/0x33()

2012-08-30 Thread Nick Pasich

On Wed, Aug 29, 2012 at 03:16:41PM -0700, Jeff Layton wrote:
> On Wed, 29 Aug 2012 09:25:27 -0700
> Nick Pasich  wrote:
> 
> > 
> > I'm using kernel 3.5.3 ... 
> > 
> > It happens on 3.5.1 and 3.5.2 also.
> > 
> > I know that Nick Bowler has already reported this...
> > 
> > I'm experiencing the same thing.
> > 
> > It happens when moving files from one directory to another
> > on the same partition (NFS). 
> > 
> >   --( Nick Pasich )--
> > 
> > 
> > #
> > ##
> > ## Happens when PSTs are moved from one directory to another on the ISCSI 
> > ...
> > ##
> > #
> > 
> > Aug 29 08:06:16 localhost kernel: [ cut here ]
> > Aug 29 08:06:16 localhost kernel: WARNING: at fs/inode.c:280 
> > drop_nlink+0x31/0x33()
> > Aug 29 08:06:16 localhost kernel: Hardware name: To Be Filled By O.E.M.
> > Aug 29 08:06:16 localhost kernel: Modules linked in: ecb md4 cifs w83627hf 
> > eeprom asb100 hwmon_vid hwmon nfsd exportfs ipv6 psmouse usb_storage 
> > io_edgeport usbserial sg r8169 mii evdev intel_agp uhci_hcd i2c_i801 
> > i2c_core shpchp intel_gtt agpgart ehci_hcd microcode serio_raw
> > Aug 29 08:06:16 localhost kernel: Pid: 31477, comm: rm Tainted: GW  
> >   3.5.3 #1
> > Aug 29 08:06:16 localhost kernel: Call Trace:
> > Aug 29 08:06:16 localhost kernel:  [] ? drop_nlink+0x31/0x33
> > Aug 29 08:06:16 localhost kernel:  [] ? 
> > warn_slowpath_common+0x7b/0x90
> > Aug 29 08:06:16 localhost kernel:  [] ? drop_nlink+0x31/0x33
> > Aug 29 08:06:16 localhost kernel:  [] ? 
> > warn_slowpath_null+0x1b/0x1f
> > Aug 29 08:06:16 localhost kernel:  [] ? drop_nlink+0x31/0x33
> > Aug 29 08:06:16 localhost kernel:  [] ? cifs_unlink+0x134/0x63d 
> > [cifs]
> > Aug 29 08:06:16 localhost kernel:  [] ? dput+0x11/0x117
> > Aug 29 08:06:16 localhost kernel:  [] ? mntput_no_expire+0xf/0xf7
> > Aug 29 08:06:16 localhost kernel:  [] ? vfs_unlink+0x4e/0xb6
> > Aug 29 08:06:16 localhost kernel:  [] ? __lookup_hash+0x54/0xac
> > Aug 29 08:06:16 localhost kernel:  [] ? do_unlinkat+0x10a/0x12d
> > Aug 29 08:06:16 localhost kernel:  [] ? sys_ioctl+0x34/0x57
> > Aug 29 08:06:16 localhost kernel:  [] ? syscall_call+0x7/0xb
> > Aug 29 08:06:16 localhost kernel: ---[ end trace 756b427e3bd671f9 ]---
> > 
> 
> (cc'ing linux-cifs ml)
> 
> This stack trace comes from cifs, not nfs.
> 
> Steve French has a patch queued in his tree to silence this warning
> that I believe he intends to send to Linus for 3.6. Perhaps we should
> consider backporting it for 3.5.z too?
> 
> -- 
> Jeff Layton 

Jeff,

I applied this patch from Pavel to kernel 3.5.3 and the
warning is gone with no problems.

Thanks,
 
--( Nick Pasich )--

##

From df2d6b1fbf2401c5ee04f2ac143ea0954e3a87a6 Mon Sep 17 00:00:00 2001
From: Pavel Shilovsky 
Date: Fri, 13 Jul 2012 11:59:45 +0400
Subject: [PATCH] CIFS: Protect i_nlink from being negative

that can cause warning messages.

Signed-off-by: Pavel Shilovsky 
---
 fs/cifs/inode.c |   13 +++--
 1 files changed, 11 insertions(+), 2 deletions(-)

diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index 7354877..88afb1a 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1110,6 +1110,15 @@ undo_setattr:
goto out_close;
 }
 
+/* copied from fs/nfs/dir.c with small changes */
+static void
+cifs_drop_nlink(struct inode *inode)
+{
+   spin_lock(&inode->i_lock);
+   if (inode->i_nlink > 0)
+   drop_nlink(inode);
+   spin_unlock(&inode->i_lock);
+}
 
 /*
  * If dentry->d_inode is null (usually meaning the cached dentry
@@ -1166,13 +1175,13 @@ retry_std_delete:
 psx_del_no_retry:
if (!rc) {
if (inode)
-   drop_nlink(inode);
+   cifs_drop_nlink(inode);
} else if (rc == -ENOENT) {
d_drop(dentry);
} else if (rc == -ETXTBSY) {
rc = cifs_rename_pending_delete(full_path, 
dentry, xid);
if (rc == 0)
-   drop_nlink(inode);
+   cifs_drop_nlink(inode);
} else if ((rc == -EACCES) && (dosattr == 0) && inode) {
attrs = kzalloc(sizeof(*attrs), GFP_KERNEL);
if (attrs == NULL) {
-- 
1.7.3.3

##
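
The effect of the guard in the patch above can be illustrated with a small userspace model. This is a sketch only: `inode_model` and both functions are hypothetical stand-ins, the `i_lock` locking from the patch is omitted, and the model assumes the warning fires when the link count is already zero on entry.

```c
/* Hypothetical stand-in for the link count kept in an inode. */
struct inode_model {
    unsigned int nlink;
};

/*
 * Models an unguarded drop: returns 1 if the "count already zero"
 * warning condition was hit, and decrements regardless.
 */
static int drop_nlink_model(struct inode_model *inode)
{
    int warned = (inode->nlink == 0);

    inode->nlink--;
    return warned;
}

/* Models the guarded wrapper: only drop while the count is positive. */
static int guarded_drop_nlink_model(struct inode_model *inode)
{
    if (inode->nlink > 0)
        return drop_nlink_model(inode);
    return 0;
}
```

With a count of zero, the unguarded call trips the warning condition and wraps the unsigned counter, while the guarded call leaves the count at zero.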


Re: [PATCH v2 0/2] dw_dmac: repair driver for use with AVR32 (AP7000)

2012-08-30 Thread Viresh Kumar

On 30 August 2012 22:44, Hein Tibosch  wrote:
> After some recent changes to dw_dmac, the driver got broken
> for the AVR32 platform for two reasons:
>
> The accessors to i/o memory had become little-endian.
> The maximum transfer width on the memory side was increased
> from 32 to 64 bits. This led to undefined behavior on the
> avr32 platform.
>
> These patches repair the driver by:
> 1. making the endianness configurable through Kconfig,
> for AVR32 it will become big-endian
> 2. making the maximum memory transfer width configurable
> It can be set in the code within arch

Acked-by: Viresh Kumar 

BTW, Ideally speaking the fix for AVR32 which will enable 32bit mem
support and enable BIG endian support should have been part of this
patchset. That code can go through DMA tree as these patches are
very closely related. Otherwise now you have to wait till these patches
are included in linux-next, then only you can send AVR32 patches for
inclusion.

So, maybe you can just add Acked-by from me and Arnd and include
AVR patches (Only changes related to these two patches) in the same series.
That will make life easier for you.

viresh

Re: [REGRESSION] Xorg doesn't like 4e8b14526 "time: Improve sanity checking of timekeeping inputs"

2012-08-30 Thread Linus Torvalds

On Thu, Aug 30, 2012 at 9:05 PM, Andreas Bombe  wrote:
>
> With that somewhat easy test I bisected it down to 4e8b14526 "time:
> Improve sanity checking of timekeeping inputs". The latest Linus git
> (155e36d40) with a revert of the bisected commit does not show the
> problem.

Ok, I guess we need to revert it. Although it might be interesting to
add a WARN_ON_ONCE() for the case of timespec_valid() returning false,
to just see exactly *where* that thing triggers. Could you do that? In
fact, do it with separate WARN_ON_ONCE's for each of the reasons that
function returns false, so that we also see which check it is that
triggers. Ok?

Linus
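
The per-reason reporting suggested above can be sketched in userspace by returning a distinct code for each check. This is a hedged illustration, not the kernel function: it assumes timespec_valid() rejects negative seconds, out-of-range nanoseconds, and second counts that would overflow a signed 64-bit nanosecond value. `ts_model`, `ts_reject` and `timespec_reject_reason()` are invented names, and the `KTIME_SEC_MAX` definition here is an assumption.

```c
#include <limits.h>

#define NSEC_PER_SEC 1000000000L
/* Assumed bound: seconds whose nanosecond count still fits in 63 bits. */
#define KTIME_SEC_MAX (LLONG_MAX / NSEC_PER_SEC)

/* Hypothetical stand-in for struct timespec with fixed-width seconds. */
struct ts_model {
    long long tv_sec;
    long tv_nsec;
};

enum ts_reject {
    TS_OK = 0,
    TS_NEGATIVE_SEC,    /* a first WARN_ON_ONCE() would sit here */
    TS_NSEC_RANGE,      /* a second one here */
    TS_KTIME_OVERFLOW   /* and a third one here */
};

/* Report which sanity check, if any, rejects @ts. */
static enum ts_reject timespec_reject_reason(const struct ts_model *ts)
{
    if (ts->tv_sec < 0)
        return TS_NEGATIVE_SEC;
    /* The unsigned cast also catches negative nanosecond values. */
    if ((unsigned long)ts->tv_nsec >= NSEC_PER_SEC)
        return TS_NSEC_RANGE;
    if ((unsigned long long)ts->tv_sec >= KTIME_SEC_MAX)
        return TS_KTIME_OVERFLOW;
    return TS_OK;
}
```

In the kernel each return path would carry its own WARN_ON_ONCE(), so the first backtrace would show both the caller and the check that failed.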

[REGRESSION] Xorg doesn't like 4e8b14526 "time: Improve sanity checking of timekeeping inputs"

2012-08-30 Thread Andreas Bombe

I have recently started to get problems with X simply shutting itself
down and returning to the login screen. In the X logs I find:

> [  1492.936] 
> Fatal server error:
> [  1492.936] WaitForSomething(): select: Invalid argument

No messages whatsoever are found in the kernel logs. This error happens
randomly without any correlation to user input, but with a high
likelihood (within a few minutes at most) when a video is playing. It
doesn't matter if the video is in Flash in a browser window or in a
video player playing a local file.

With that somewhat easy test I bisected it down to 4e8b14526 "time:
Improve sanity checking of timekeeping inputs". The latest Linus git
(155e36d40) with a revert of the bisected commit does not show the
problem.

Video is Radeon HD 6950 with open source drivers. Xorg version is the
one currently in Debian unstable (xserver-xorg-core: 2:1.12.3.902-1,
xserver-xorg-video-radeon: 1:6.14.4-5, libdrm: 2.4.33-3).

-- 
Andreas Bombe

Re: [PATCH 0/5] dev_ and dynamic_debug cleanups

2012-08-30 Thread Jim Cromie

On Thu, Aug 30, 2012 at 11:43 AM, Jim Cromie  wrote:
> On Sun, Aug 26, 2012 at 5:25 AM, Joe Perches  wrote:
>> The recent commit to fix dynamic_debug was a bit unclean.
>> Neaten the style for dynamic_debug.
>> Reduce the stack use of message logging that uses netdev_printk
>> Add utility functions dev_printk_emit and dev_vprintk_emit for /dev/kmsg.
>>
>> Joe Perches (5):
>>   dev_dbg/dynamic_debug: Update to use printk_emit, optimize stack
>>   netdev_printk/dynamic_netdev_dbg: Directly call printk_emit
>>   netdev_printk/netif_printk: Remove a superfluous logging colon
>>   dev: Add dev_vprintk_emit and dev_printk_emit
>>   device and dynamic_debug: Use dev_vprintk_emit and dev_printk_emit
>>
>
> I've tested this on 2 builds differing only by DYNAMIC_DEBUG
> It works for me on x86-64
>
> However, I just booted a non-dyndbg build on x86-32, and got this.
>

Ok, transient error, went away with a clean build.

tested-by: Jim Cromie 

thanks

[PATCH] ASoC: correct the check for NULL dma_buffer pointer

2012-08-30 Thread Prasad Joshi

The if condition
if (!buf && !buf->area)

checks if the buf pointer is NULL and then dereferences it again to
check if the buffer area is NULL, resulting in possible NULL
dereference.

Signed-off-by: Prasad Joshi 
---
 sound/soc/spear/spear_pcm.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/sound/soc/spear/spear_pcm.c b/sound/soc/spear/spear_pcm.c
index 97c2cac..8c7f237 100644
--- a/sound/soc/spear/spear_pcm.c
+++ b/sound/soc/spear/spear_pcm.c
@@ -138,7 +138,7 @@ static void spear_pcm_free(struct snd_pcm *pcm)
continue;
 
buf = >dma_buffer;
-   if (!buf && !buf->area)
+   if (!buf || !buf->area)
continue;
 
dma_free_writecombine(pcm->card->dev, buf->bytes,
-- 
1.7.5.4
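
The difference between the two conditions comes down to short-circuit evaluation, which a small standalone example can show. `dma_buf_model` and `should_skip()` are hypothetical names used only for this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for a DMA buffer descriptor. */
struct dma_buf_model {
    void *area;
};

/*
 * Fixed form: '||' stops evaluating as soon as !buf is true, so
 * buf->area is only read when buf is known to be non-NULL.
 */
static int should_skip(const struct dma_buf_model *buf)
{
    return !buf || !buf->area;
}
```

With `&&`, the right-hand side is evaluated only when `!buf` is true, that is, only when `buf` is NULL, which is exactly the case where `buf->area` must not be read. When `buf` is valid, the `&&` form is always false, so a buffer with a NULL `area` is never skipped.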


[BUG REPORT] Wrong deadlock warning!

2012-08-30 Thread Stanley.Miao

Hi, All,

I used two spinlocks in my code, and I enabled the following CONFIGs
for debugging.

CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_LOCKDEP=y

void abc_init(struct abc_dev *dev)
{
   spin_lock_init(&dev->locka);
   spin_lock_init(&dev->lockb);
}

void set_last_active_blk(struct abc_dev *dev)
{
   spin_lock(&dev->locka);
   spin_lock(&dev->lockb);
   /* do something */
   spin_unlock(&dev->lockb);
   spin_unlock(&dev->locka);
}

The code above works fine. No Warning.

For some reasons, I tried to encapsulate the spin_lock API.

typedef spinlock_t abc_spinlock_t;
void abc_spin_lock_init(abc_spinlock_t *lock)
{
   spin_lock_init((spinlock_t *)lock);
}

void abc_spin_lock(abc_spinlock_t *lock)
{
   spin_lock((spinlock_t *)lock);
}

void abc_spin_unlock(abc_spinlock_t *lock)
{
   spin_unlock((spinlock_t *)lock);
}

Then my code became:

void abc_init(struct abc_dev *dev)
{
   abc_spin_lock_init(&dev->locka);
   abc_spin_lock_init(&dev->lockb);
}

void set_last_active_blk(struct abc_dev *dev)
{
   abc_spin_lock(&dev->locka);
   abc_spin_lock(&dev->lockb);
   /* do something */
   abc_spin_unlock(&dev->lockb);
   abc_spin_unlock(&dev->locka);
}

Then I got the following Warning:

[  538.987581] =
[  538.988776] [ INFO: possible recursive locking detected ]
[  538.989594] 3.1.4+ #1085
[  538.989984] -
[  538.990801] fio/732 is trying to acquire lock:
[  538.991368]  (&((spinlock_t *)lock)->rlock){+.+...}, at:
[] abc_spin_lock+0xe/0x10
[  538.992341]
[  538.992341] but task is already holding lock:
[  538.992341]  (&((spinlock_t *)lock)->rlock){+.+...}, at:
[] abc_spin_lock+0xe/0x10
[  538.992341]
[  538.992341] other info that might help us debug this:
[  538.992341]  Possible unsafe locking scenario:
[  538.992341]
[  538.992341]CPU0
[  538.992341]
[  538.992341]   lock(&((spinlock_t *)lock)->rlock);
[  538.992341]   lock(&((spinlock_t *)lock)->rlock);
[  538.992341]
[  538.992341]  *** DEADLOCK ***
[  538.992341]
[  538.992341]  May be due to missing lock nesting notation
[  538.992341]
[  538.992341] 2 locks held by fio/732:
[  538.992341]  #0:  ((struct mutex *)lock){+.+.+.}, at:
[] abc_mutex_trylock+0xe/0x10
[  538.992341]  #1:  (&((spinlock_t *)lock)->rlock){+.+...}, at:
[] abc_spin_lock+0xe/0x10
[  538.992341]
[  538.992341] stack backtrace:
[  538.992341] Pid: 732, comm: fio Not tainted 3.1.4+ #1085
[  538.992341] Call Trace:
[  538.992341]  [] __lock_acquire+0xff8/0x1864
[  538.992341]  [] ? mempool_alloc_slab+0x15/0x17
[  538.992341]  [] ? abc_spin_lock+0xe/0x10
[  538.992341]  [] lock_acquire+0x101/0x12e
[  538.992341]  [] ? abc_spin_lock+0xe/0x10
[  538.992341]  [] _raw_spin_lock+0x52/0x87
[  538.992341]  [] ? abc_spin_lock+0xe/0x10
[  538.992341]  [] abc_spin_lock+0xe/0x10
[  538.992341]  [] set_last_active_blk+0x74/0x141
[  538.992341]  [] move_to_next_chunk+0xab/0xef

Obviously this is wrong. There are two different spinlocks and it
won't cause deadlock. There is no warning if I don't encapsulate the
spinlock API.


Stanley
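
A likely explanation for the report above: with lockdep enabled,
spin_lock_init() is a macro that declares a static struct lock_class_key
wherever it is expanded, so the lock class is tied to the call site and
not to the lock object. Once the macro is wrapped in a function, it is
expanded only once, and every lock initialised through the wrapper lands
in the same class. The identical lock name on both lines of the report is
consistent with that. The following is a minimal userspace sketch of the
mechanism, not kernel code; all demo_* names are hypothetical stand-ins.

```c
/* Userspace stand-in for the kernel's struct lock_class_key. */
struct demo_class_key { char unused; };

/* Userspace stand-in for a lock that remembers its lockdep class. */
struct demo_lock { const struct demo_class_key *key; };

static void demo_lock_init_key(struct demo_lock *lock,
                               const struct demo_class_key *key)
{
    lock->key = key;
}

/*
 * Modelled on spin_lock_init() under lockdep: every expansion of this
 * macro gets its own static key, so the class follows the call site.
 */
#define demo_lock_init(lock)                            \
    do {                                                \
        static struct demo_class_key demo_key;          \
        demo_lock_init_key((lock), &demo_key);          \
    } while (0)

/* Function wrapper: the macro expands once, here, so all locks share a key. */
static void demo_wrapped_init(struct demo_lock *lock)
{
    demo_lock_init(lock);
}

/* Macro wrapper: expands at each caller, so each call site gets its own key. */
#define demo_wrapped_init_macro(lock) demo_lock_init(lock)

/* Returns 1 if two locks set up through the function wrapper share a class. */
int demo_function_wrapper_shares_class(void)
{
    struct demo_lock a, b;

    demo_wrapped_init(&a);
    demo_wrapped_init(&b);
    return a.key == b.key;
}

/* Returns 1 if two locks set up through the macro wrapper share a class. */
int demo_macro_wrapper_shares_class(void)
{
    struct demo_lock a, b;

    demo_wrapped_init_macro(&a);
    demo_wrapped_init_macro(&b);
    return a.key == b.key;
}
```

In kernel code, the usual ways around this are to make the init wrapper a
macro, or to give each lock its own class explicitly with
lockdep_set_class().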

RE: [PATCH v7 1/8] Talitos: Support for async_tx XOR offload

2012-08-30 Thread Liu Qiang-B32616

> -Original Message-
> From: Geanta Neag Horia Ioan-B05471
> Sent: Thursday, August 30, 2012 10:23 PM
> To: Liu Qiang-B32616; linux-cry...@vger.kernel.org;
> dan.j.willi...@gmail.com; herb...@gondor.hengli.com.au;
> da...@davemloft.net; linux-kernel@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org
> Cc: Li Yang-R58472; Phillips Kim-R1AAHA; vinod.k...@intel.com;
> dan.j.willi...@intel.com; a...@arndb.de; gre...@linuxfoundation.org; Liu
> Qiang-B32616
> Subject: RE: [PATCH v7 1/8] Talitos: Support for async_tx XOR offload
> 
> On Thu, 9 Aug 2012 11:20:48 +0300, qiang@freescale.com wrote:
> > From: Qiang Liu 
> >
> > Expose Talitos's XOR functionality to be used for RAID parity
> > calculation via the Async_tx layer.
> >
> > Cc: Herbert Xu 
> > Cc: David S. Miller 
> > Signed-off-by: Dipen Dudhat 
> > Signed-off-by: Maneesh Gupta 
> > Signed-off-by: Kim Phillips 
> > Signed-off-by: Vishnu Suresh 
> > Signed-off-by: Qiang Liu 
> > ---
> >  drivers/crypto/Kconfig   |9 +
> >  drivers/crypto/talitos.c |  413 ++
> >  drivers/crypto/talitos.h |   53 ++
> >  3 files changed, 475 insertions(+), 0 deletions(-)
> 
> 
> > +static void talitos_xor_run_tx_complete_actions(struct talitos_xor_desc *desc,
> > +   struct talitos_xor_chan *xor_chan)
> > +{
> > +   struct device *dev = xor_chan->dev;
> > +   dma_addr_t dest, addr;
> > +   unsigned int src_cnt = desc->unmap_src_cnt;
> > +   unsigned int len = desc->unmap_len;
> > +   enum dma_ctrl_flags flags = desc->async_tx.flags;
> > +   struct dma_async_tx_descriptor *tx = &desc->async_tx;
> > +
> > +   /* unmap dma addresses */
> > +   dest = desc->hwdesc.ptr[6].ptr;
> > +   if (likely(!(flags & DMA_COMPL_SKIP_DEST_UNMAP)))
> > +   dma_unmap_page(dev, dest, len, DMA_BIDIRECTIONAL);
> > +
> > +   desc->idx = 6 - src_cnt;
> > +   if (likely(!(flags & DMA_COMPL_SKIP_SRC_UNMAP))) {
> > +   while(desc->idx < 6) {
> > +   addr = desc->hwdesc.ptr[desc->idx++].ptr;
> > +   if (addr == dest)
> > +   continue;
> > +   dma_unmap_page(dev, addr, len, DMA_TO_DEVICE);
> > +   }
> > +   }
> 
> No need for braces around the while block.
I will remove them.

> 
> > +   /* run dependent operations */
> > +   dma_run_dependencies(tx);
> > +}
> 
> 
> > +static void talitos_release_xor(struct device *dev, struct talitos_desc *hwdesc,
> > +   void *context, int error)
> > +{
> > +   struct talitos_xor_desc *desc = context;
> > +   struct talitos_xor_chan *xor_chan;
> > +   dma_async_tx_callback callback;
> > +   void *callback_param;
> > +
> > +   if (unlikely(error))
> > +   dev_err(dev, "xor operation: talitos error %d\n", error);
> > +
> > +   xor_chan = container_of(desc->async_tx.chan, struct talitos_xor_chan,
> > +   common);
> > +   spin_lock_bh(&xor_chan->desc_lock);
> > +   if (xor_chan->completed_cookie < desc->async_tx.cookie)
> > +   xor_chan->completed_cookie = desc->async_tx.cookie;
> > +
> > +   callback = desc->async_tx.callback;
> > +   callback_param = desc->async_tx.callback_param;
> > +
> > +   if (callback) {
> > +   spin_unlock_bh(&xor_chan->desc_lock);
> > +   callback(callback_param);
> > +   spin_lock_bh(&xor_chan->desc_lock);
> > +   }
> 
> Since callback_param is used only here, maybe:
> 
> if (callback) {
>   void *callback_param = desc->async_tx.callback_param;
> 
>   spin_unlock_bh(&xor_chan->desc_lock);
>   callback(callback_param);
>   spin_lock_bh(&xor_chan->desc_lock);
> }
Fine. I will change it in the next version.

> 
> > +
> > +   talitos_xor_run_tx_complete_actions(desc, xor_chan);
> > +
> > +   list_del(&desc->node);
> > +   list_add_tail(&desc->node, &xor_chan->free_desc);
> > +   spin_unlock_bh(&xor_chan->desc_lock);
> > +   if (!list_empty(&xor_chan->pending_q))
> > +   talitos_process_pending(xor_chan);
> > +}
> 
> 
> > +static int talitos_alloc_chan_resources(struct dma_chan *chan)
> > +{
> > +   struct talitos_xor_chan *xor_chan;
> > +   struct talitos_xor_desc *desc;
> > +   LIST_HEAD(tmp_list);
> > +   int i;
> > +
> > +   xor_chan = container_of(chan, struct talitos_xor_chan, common);
> > +
> > +   if (!list_empty(&xor_chan->free_desc))
> > +   return xor_chan->total_desc;
> > +
> > +   for (i = 0; i < TALITOS_MAX_DESCRIPTOR_NR; i++) {
> > +   desc = talitos_xor_alloc_descriptor(xor_chan,
> > +   GFP_KERNEL | GFP_DMA);
> 
> talitos_xor_alloc_descriptor() is called here without holding
> the xor_chan->desc_lock and it increments xor_chan->total_desc.
> Isn't this an issue ?

No, please refer to the code below:
+   list_add_tail(&desc->node, &tmp_list);

The list is a temporary one. It is merged into xor_chan->free_desc in the
next step, which is protected by the lock:
+   spin_lock_bh(&xor_chan->desc_lock);
+   list_splice_init(&tmp_list, &xor_chan->free_desc);
+   spin_unlock_bh(&xor_chan->desc_lock);

Re: [PATCH] of: add devres version of of_iomap

2012-08-30 Thread Rob Herring

On 08/30/2012 05:09 PM, Karicheri, Muralidharan wrote:
>>> -Original Message-
>>> From: Rob Herring [mailto:robherri...@gmail.com]
>>> Sent: Thursday, August 30, 2012 2:27 PM
>>> To: Karicheri, Muralidharan
>>> Cc: grant.lik...@secretlab.ca; devicetree-disc...@lists.ozlabs.org; linux-
>>> ker...@vger.kernel.org
>>> Subject: Re: [PATCH] of: add devres version of of_iomap
>>>
>>> On 08/30/2012 10:32 AM, Murali Karicheri wrote:
>>>> This adds devres version of the of_iomap() to allow resource to be cleaned
>>>> through devres.
>>>
>>> If you have a struct device, then don't you already have a resource and
>>> can just use devm_ioremap in a driver? New drivers should not be using
>>> of_iomap.
>>>
> 
> That is the point. If you do a grep under drivers, there are many drivers
> using a pattern like this. This helper function is meant to replace this code.
> 
> From dma/sirf-dma.c
> 
> ret = of_address_to_resource(dn, 0, &res);
> if (ret) {
>         dev_err(dev, "Error parsing memory region!\n");
>         goto error;
> }
> 
> regs_start = res.start;
> regs_size = resource_size(&res);
> 
> base = devm_ioremap(dev, regs_start, regs_size);
> if (!base) {
>         dev_err(dev, "Error mapping memory region!\n");
>         goto error;
> }
> 

That's wrong and should be fixed. The resource is already setup and
available to the probe function.

> Other instances.
> 
> edac/mpc85xx_edac.c
> media/video/fsl-viu.c
> mtd/nand/mpc5121_nfc.c

These are all PPC drivers that used the old of_platform_driver, and they
also need to be updated.

> 
> Some of this code uses devm_request_mem_region() as well. Isn't it a good
> idea to add this helper, which new drivers can call to replace this
> sequence? I could update the patch to do this call as well.

devm_request_and_ioremap

Rob


Re: [RFC,PATCH] efi: Add support for a UEFI variable filesystem

2012-08-30 Thread H. Peter Anvin

Wouldn't that be better handled by O_APPEND?



Jeremy Kerr  wrote:

>Hi hpa,
>
>Thanks for the review!
>
>> However, I have a question... rather than putting the attributes as the
>> first data bytes, would it be better to make it either part of the
>> filename (assuming there is at least one character other than / which
>> can be reasonably relied upon to not be part of the name); for example:
>>
>>  LangCodes,BS,RT
>>
>> ... or ...
>>
>>  LangCodes,6
>
>This will get tricky when handling EFI_VARIABLE_APPEND_WRITE: this
>attribute will never appear in the attributes returned by GetVariable(),
>but may be passed to SetVariable(). If we put attributes in the
>filename, we'd need to handle writes to both names, and/or have
>duplicate dentries for each variable. We could do it, but the filesystem
>interface might be a little messy.
>
>[Supporting append writes is essential for key database updates, which
>may be signed]
>
>Cheers,
>
>
>Jeremy

-- 
Sent from my mobile phone. Please excuse brevity and lack of formatting.
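
One way to read the O_APPEND suggestion above, as a sketch only: the
append bit never needs to be part of the stored attributes or the file
name, because it can be derived from how the file was opened at write
time. The attribute values below are the ones defined by the UEFI
specification; the helper demo_setvariable_attrs() is hypothetical and
not taken from the patch under review.

```c
#include <fcntl.h>
#include <stdint.h>

/* Attribute bits as defined by the UEFI specification. */
#define EFI_VARIABLE_NON_VOLATILE        0x00000001u
#define EFI_VARIABLE_BOOTSERVICE_ACCESS  0x00000002u
#define EFI_VARIABLE_RUNTIME_ACCESS      0x00000004u
#define EFI_VARIABLE_APPEND_WRITE        0x00000040u

/*
 * Hypothetical helper: compute the attributes to hand to SetVariable()
 * for one write. The stored attributes never carry APPEND_WRITE; the bit
 * is added only when the file was opened with O_APPEND.
 */
uint32_t demo_setvariable_attrs(uint32_t stored_attrs, int open_flags)
{
    uint32_t attrs = stored_attrs & ~EFI_VARIABLE_APPEND_WRITE;

    if (open_flags & O_APPEND)
        attrs |= EFI_VARIABLE_APPEND_WRITE;
    return attrs;
}
```

With the "LangCodes,6" example from the quoted mail, 6 is BS | RT; opening
that variable with O_APPEND would turn a write into a SetVariable() call
with attributes 0x46, while a plain write would keep 6.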

Re: i2c-eg20t: regression since i2c_add_numbered_adapter change

2012-08-30 Thread Feng Tang

Hi Alexander,

On Thu, 30 Aug 2012 13:08:09 +0200
Alexander Stein  wrote:

> On Thursday 30 August 2012 17:19:15, Feng Tang wrote:
> > > IMO the i2c_register_board_info only works in quite static setups.
> > > Especially with I2C busses attached to hotpluggable PCI devices, this
> > > way doesn't work reliably any more.
> > > The devices come and go dynamically, so you can't assume a fixed mapping.
> > 
> > Can you specify which hotplug scenario you mean?
> > 1. A hotpluggable i2c bus controller (say i2c_eg20t) with all fixed i2c
> >  devices connecting to it
> > 2. i2c bus controller is fixed, while the i2c devices will be dynamically
> >  connected to it.
> > 3. Both the bus controller and devices are dynamically hotplugged
> 
> I had scenario 1 in mind, but with more than 1 bus controller (say 2x
> i2c_eg20t). How can you set a fixed numbering if there are more controllers,
> each with maybe more than 1 bus?
> Anyway, how can you provide a static bus numbering if there is more than one
> driver or more than one device per driver and the devices are hotpluggable?

There is no problem for one PCI driver to support multiple devices with
the same HW. If you look at pch_pcidev_id[] in i2c-eg20t.c, there is
already an ML7213 platform which has 2 i2c controllers. With the current
in-tree driver, they will get fixed bus numbers 0 and 1.

For scenario 1, if you really have another hotpluggable PCI eg20t
controller, it will surely have its own unique PCI ID, and then we can use
the "driver_data" of struct pci_device_id to point to a platform info
structure like

struct i2c_eg20t_platform_info {
        u16 bus_base_num;
        u16 total_hw_num;
};

And for EG20T-compatible platforms, I don't think there will be many
different types of hotpluggable i2c controller other than i2c_eg20t
and i2c_sch.

Thanks,
Feng
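
The driver_data idea above can be sketched in userspace as a table lookup.
This is only an illustration under stated assumptions: the vendor/device
IDs, the demo_* names and the table contents are made up, and the real
driver would get the matched entry from the PCI core instead of searching
a table itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the structure proposed in the mail. */
struct i2c_eg20t_platform_info {
    uint16_t bus_base_num;  /* first bus number assigned to this device */
    uint16_t total_hw_num;  /* number of i2c controllers on this device */
};

/* Hypothetical stand-in for struct pci_device_id with its driver_data. */
struct demo_pci_id {
    uint16_t vendor;
    uint16_t device;
    const struct i2c_eg20t_platform_info *info;
};

/* Made-up example data: an on-board device and a hotpluggable one. */
static const struct i2c_eg20t_platform_info demo_onboard = { 0, 2 };
static const struct i2c_eg20t_platform_info demo_hotplug = { 2, 1 };

static const struct demo_pci_id demo_ids[] = {
    { 0x1234, 0x0001, &demo_onboard },
    { 0x1234, 0x0002, &demo_hotplug },
    { 0, 0, NULL }  /* terminator */
};

/*
 * Returns the fixed bus number for controller hw_index of the given
 * device, or -1 if the device is unknown or hw_index is out of range.
 */
int demo_bus_number(uint16_t vendor, uint16_t device, unsigned int hw_index)
{
    const struct demo_pci_id *id;

    for (id = demo_ids; id->info != NULL; id++) {
        if (id->vendor != vendor || id->device != device)
            continue;
        if (hw_index >= id->info->total_hw_num)
            return -1;
        return id->info->bus_base_num + (int)hw_index;
    }
    return -1;
}
```

Each PCI ID carries its own base, so the numbering stays fixed regardless
of the order in which the devices appear.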

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-08-30 Thread Kent Overstreet

On Thu, Aug 30, 2012 at 6:43 PM, Kent Overstreet  wrote:
> On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
>> On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
>>
>> [..]
>> > > Performance aside, punting submission to per device worker in case of
>> > > deep stack usage sounds cleaner solution to me.
>> >
>> > Agreed, but performance tends to matter in the real world. And either
>> > way the tricky bits are going to be confined to a few functions, so I
>> > don't think it matters that much.
>> >
>> > If someone wants to code up the workqueue version and test it, they're
>> > more than welcome...
>>
>> Here is one quick and dirty proof of concept patch. It checks for stack
>> depth and if remaining space is less than 20% of stack size, then it
>> defers the bio submission to per queue worker.
>
> I can't think of any correctness issues. I see some stuff that could be
> simplified (blk_drain_deferred_bios() is redundant, just make it a
> wrapper around blk_deffered_bio_work()).
>
> Still skeptical about the performance impact, though - frankly, on some
> of the hardware I've been running bcache on this would be a visible
> performance regression - probably double digit percentages but I'd have
> to benchmark it.  That kind of hardware/usage is not normal today,
> but I've put a lot of work into performance and I don't want to make
> things worse without good reason.

Here's another crazy idea - we don't really need another thread, just
more stack space.

We could check if we're running out of stack space, then if we are
just allocate another two pages and memcpy the struct thread_info
over.

I think the main obstacle is that we'd need some per arch code for
mucking with the stack pointer. And it'd break backtraces, but that's
fixable.

[PATCH] hwmon: (sht15) remove multiple driver registration

2012-08-30 Thread Vivien Didelot

Declare an array of platform_device_id, instead of registering a driver
for each supported chip. This makes the code cleaner.
Also add a module description.

Signed-off-by: Vivien Didelot 
---
 drivers/hwmon/sht15.c | 94 +--
 1 file changed, 23 insertions(+), 71 deletions(-)

diff --git a/drivers/hwmon/sht15.c b/drivers/hwmon/sht15.c
index 8b011d0..38e0233 100644
--- a/drivers/hwmon/sht15.c
+++ b/drivers/hwmon/sht15.c
@@ -1,7 +1,7 @@
 /*
  * sht15.c - support for the SHT15 Temperature and Humidity Sensor
  *
- * Portions Copyright (c) 2010-2011 Savoir-faire Linux Inc.
+ * Portions Copyright (c) 2010-2012 Savoir-faire Linux Inc.
  *  Jerome Oufella 
  *  Vivien Didelot 
  *
@@ -53,6 +53,9 @@
 #define SHT15_STATUS_HEATER0x04
 #define SHT15_STATUS_LOW_BATTERY   0x40
 
+/* List of supported chips */
+enum sht15_chips { sht10, sht11, sht15, sht71, sht75 };
+
 /* Actions the driver may be doing */
 enum sht15_state {
SHT15_READING_NOTHING,
@@ -1042,77 +1045,26 @@ static int __devexit sht15_remove(struct platform_device *pdev)
return 0;
 }
 
-/*
- * sht_drivers simultaneously refers to __devinit and __devexit function
- * which causes spurious section mismatch warning. So use __refdata to
- * get rid from this.
- */
-static struct platform_driver __refdata sht_drivers[] = {
-   {
-   .driver = {
-   .name = "sht10",
-   .owner = THIS_MODULE,
-   },
-   .probe = sht15_probe,
-   .remove = __devexit_p(sht15_remove),
-   }, {
-   .driver = {
-   .name = "sht11",
-   .owner = THIS_MODULE,
-   },
-   .probe = sht15_probe,
-   .remove = __devexit_p(sht15_remove),
-   }, {
-   .driver = {
-   .name = "sht15",
-   .owner = THIS_MODULE,
-   },
-   .probe = sht15_probe,
-   .remove = __devexit_p(sht15_remove),
-   }, {
-   .driver = {
-   .name = "sht71",
-   .owner = THIS_MODULE,
-   },
-   .probe = sht15_probe,
-   .remove = __devexit_p(sht15_remove),
-   }, {
-   .driver = {
-   .name = "sht75",
-   .owner = THIS_MODULE,
-   },
-   .probe = sht15_probe,
-   .remove = __devexit_p(sht15_remove),
-   },
+static struct platform_device_id sht15_device_ids[] = {
+   { "sht10", sht10 },
+   { "sht11", sht11 },
+   { "sht15", sht15 },
+   { "sht71", sht71 },
+   { "sht75", sht75 },
+   { }
 };
+MODULE_DEVICE_TABLE(platform, sht15_device_ids);
 
-static int __init sht15_init(void)
-{
-   int ret;
-   int i;
-
-   for (i = 0; i < ARRAY_SIZE(sht_drivers); i++) {
-   ret = platform_driver_register(&sht_drivers[i]);
-   if (ret)
-   goto error_unreg;
-   }
-
-   return 0;
-
-error_unreg:
-   while (--i >= 0)
-   platform_driver_unregister(&sht_drivers[i]);
-
-   return ret;
-}
-module_init(sht15_init);
-
-static void __exit sht15_exit(void)
-{
-   int i;
-   for (i = ARRAY_SIZE(sht_drivers) - 1; i >= 0; i--)
-   platform_driver_unregister(&sht_drivers[i]);
-}
-module_exit(sht15_exit);
+static struct platform_driver sht15_driver = {
+   .driver = {
+   .name = "sht15",
+   .owner = THIS_MODULE,
+   },
+   .probe = sht15_probe,
+   .remove = __devexit_p(sht15_remove),
+   .id_table = sht15_device_ids,
+};
+module_platform_driver(sht15_driver);
 
 MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Sensirion SHT15 temperature and humidity sensor driver");
-- 
1.7.11.4
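
The effect of the id table in the patch above can be sketched in
userspace: a single driver carries a list of supported names, and a device
is matched by comparing its name against each entry, which yields the chip
type. The enum and the name list are taken from the patch; struct
demo_device_id and demo_match_id() are hypothetical stand-ins for the
kernel's platform_device_id matching.

```c
#include <stddef.h>
#include <string.h>

/* List of supported chips, as in the patch. */
enum sht15_chips { sht10, sht11, sht15, sht71, sht75 };

/* Hypothetical stand-in for struct platform_device_id. */
struct demo_device_id {
    const char *name;
    int driver_data;
};

static const struct demo_device_id demo_sht15_ids[] = {
    { "sht10", sht10 },
    { "sht11", sht11 },
    { "sht15", sht15 },
    { "sht71", sht71 },
    { "sht75", sht75 },
    { NULL, 0 }  /* terminator, like the empty entry in the patch */
};

/* Returns the chip type for a device name, or -1 if it is not supported. */
int demo_match_id(const char *name)
{
    const struct demo_device_id *id;

    for (id = demo_sht15_ids; id->name != NULL; id++) {
        if (strcmp(id->name, name) == 0)
            return id->driver_data;
    }
    return -1;
}
```

One table lookup replaces the five near-identical platform_driver entries
that the old code had to register and unregister in a loop.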


Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-08-30 Thread Kent Overstreet

On Thu, Aug 30, 2012 at 06:07:45PM -0400, Vivek Goyal wrote:
> On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:
> 
> [..]
> > > Performance aside, punting submission to per device worker in case of deep
> > > stack usage sounds cleaner solution to me.
> > 
> > Agreed, but performance tends to matter in the real world. And either
> > way the tricky bits are going to be confined to a few functions, so I
> > don't think it matters that much.
> > 
> > If someone wants to code up the workqueue version and test it, they're
> > more than welcome...
> 
> Here is one quick and dirty proof of concept patch. It checks for stack
> depth and if remaining space is less than 20% of stack size, then it
> defers the bio submission to per queue worker.

I can't think of any correctness issues. I see some stuff that could be
simplified (blk_drain_deferred_bios() is redundant, just make it a
wrapper around blk_deffered_bio_work()).

Still skeptical about the performance impact, though - frankly, on some
of the hardware I've been running bcache on this would be a visible
performance regression - probably double digit percentages but I'd have
to benchmark it.  That kind of hardware/usage is not normal today,
but I've put a lot of work into performance and I don't want to make
things worse without good reason.

Have you tested/benchmarked it?

There's scheduling behaviour, too. We really want the workqueue thread's
cpu time to be charged to the process that submitted the bio. (We could
use a mechanism like that in other places, too... not like this is a new
issue).

This is going to be a real issue for users that need strong isolation -
for any driver that uses non-negligible cpu (e.g. dm crypt), we're
breaking that (not that it wasn't broken already, but this makes it
worse).

I could be convinced, but right now I prefer my solution.
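
For reference, the check described in the quoted proof of concept ("if
remaining space is less than 20% of stack size, then it defers") comes
down to a small predicate. The sketch below is a userspace illustration
only; demo_should_defer_bio() is a hypothetical name, and how the kernel
patch actually measures stack usage is not shown in this thread.

```c
#include <stddef.h>

/*
 * Returns 1 when less than 20% of the stack remains, meaning the bio
 * should be handed to the per-queue worker, and 0 otherwise.
 */
int demo_should_defer_bio(size_t stack_size, size_t stack_used)
{
    size_t remaining;

    if (stack_used >= stack_size)
        return 1;  /* already at or past the end of the stack */

    remaining = stack_size - stack_used;
    /* remaining < stack_size / 5, written without losing precision */
    return remaining * 5 < stack_size;
}
```

Assuming an 8192-byte stack, the deferral starts once more than 6553
bytes are in use.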

[PATCH v3] hwmon: add Maxim MAX197 support

2012-08-30 Thread Vivien Didelot

The MAX197 is an A/D converter, made by Maxim. This driver currently
supports the MAX197, and MAX199. They are both 8-Channel, Multi-Range,
5V, 12-Bit DAS with 8+4 Bus Interface and Fault Protection.

The available ranges for the MAX197 are {0,-5V} to 5V, and {0,-10V} to
10V, while they are {0,-2V} to 2V, and {0,-4V} to 4V on the MAX199.

Signed-off-by: Vivien Didelot 
---
 Documentation/hwmon/max197   |  60 ++
 drivers/hwmon/Kconfig|   9 +
 drivers/hwmon/Makefile   |   1 +
 drivers/hwmon/max197.c   | 349 +++
 include/linux/platform_data/max197.h |  21 +++
 5 files changed, 440 insertions(+)
 create mode 100644 Documentation/hwmon/max197
 create mode 100644 drivers/hwmon/max197.c
 create mode 100644 include/linux/platform_data/max197.h

diff --git a/Documentation/hwmon/max197 b/Documentation/hwmon/max197
new file mode 100644
index 0000000..8d89b90
--- /dev/null
+++ b/Documentation/hwmon/max197
@@ -0,0 +1,60 @@
+Maxim MAX197 driver
+===================
+
+Author:
+  * Vivien Didelot 
+
+Supported chips:
+  * Maxim MAX197
+Prefix: 'max197'
+Datasheet: http://datasheets.maxim-ic.com/en/ds/MAX197.pdf
+
+  * Maxim MAX199
+Prefix: 'max199'
+Datasheet: http://datasheets.maxim-ic.com/en/ds/MAX199.pdf
+
+Description
+-----------
+
+The A/D converters MAX197, and MAX199 are both 8-Channel, Multi-Range, 5V,
+12-Bit DAS with 8+4 Bus Interface and Fault Protection.
+
+The available ranges for the MAX197 are {0,-5V} to 5V, and {0,-10V} to 10V,
+while they are {0,-2V} to 2V, and {0,-4V} to 4V on the MAX199.
+
+Platform data
+-------------
+
+The MAX197 platform data (defined in linux/platform_data/max197.h) should be
+filled with a pointer to a conversion function, defined like:
+
+int convert(u8 ctrl);
+
+ctrl is the control byte to write to start a new conversion.
+On success, the function must return the 12-bit raw value read from the chip,
+or a negative error code otherwise.
+
+Control byte format:
+
+Bit     Name       Description
+7,6     PD1,PD0    Clock and Power-Down modes
+5       ACQMOD     Internal or External Controlled Acquisition
+4       RNG        Full-scale voltage magnitude at the input
+3       BIP        Unipolar or Bipolar conversion mode
+2,1,0   A2,A1,A0   Channel
+
+Sysfs interface
+---------------
+
+* in[0-7]_input: The conversion value for the corresponding channel.
+ RO
+
+* in[0-7]_min:   The lower limit (in mV) for the corresponding channel.
+ For the MAX197, it will be adjusted to -10000, -5000, or 0.
+ For the MAX199, it will be adjusted to -4000, -2000, or 0.
+ RW
+
+* in[0-7]_max:   The higher limit (in mV) for the corresponding channel.
+ For the MAX197, it will be adjusted to 0, 5000, or 10000.
+ For the MAX199, it will be adjusted to 0, 2000, or 4000.
+ RW
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index b0a2e4c..2196869 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -803,6 +803,15 @@ config SENSORS_MAX1668
  This driver can also be built as a module.  If so, the module
  will be called max1668.
 
+config SENSORS_MAX197
+   tristate "Maxim MAX197 and compatibles"
+   help
+ Support for the Maxim MAX197 A/D converter.
+ Support will include, but not be limited to, MAX197, and MAX199.
+
+ This driver can also be built as a module. If so, the module
+ will be called max197.
+
 config SENSORS_MAX6639
tristate "Maxim MAX6639 sensor chip"
depends on I2C && EXPERIMENTAL
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index 7aa9811..1e8f690 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -94,6 +94,7 @@ obj-$(CONFIG_SENSORS_MAX) += max.o
 obj-$(CONFIG_SENSORS_MAX16065) += max16065.o
 obj-$(CONFIG_SENSORS_MAX1619)  += max1619.o
 obj-$(CONFIG_SENSORS_MAX1668)  += max1668.o
+obj-$(CONFIG_SENSORS_MAX197)   += max197.o
 obj-$(CONFIG_SENSORS_MAX6639)  += max6639.o
 obj-$(CONFIG_SENSORS_MAX6642)  += max6642.o
 obj-$(CONFIG_SENSORS_MAX6650)  += max6650.o
diff --git a/drivers/hwmon/max197.c b/drivers/hwmon/max197.c
new file mode 100644
index 0000000..6304f26
--- /dev/null
+++ b/drivers/hwmon/max197.c
@@ -0,0 +1,349 @@
+/*
+ * Maxim MAX197 A/D Converter driver
+ *
+ * Copyright (c) 2012 Savoir-faire Linux Inc.
+ *  Vivien Didelot 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * For further information, see the Documentation/hwmon/max197 file.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MAX199_LIMIT   4000/* 4V */
+#define MAX197_LIMIT   10000   /* 10V */
+
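
The control byte layout given in the documentation part of the patch above
can be sketched as a small helper. The bit positions follow the table
there; the macro and function names are hypothetical, and which value of
each field selects which mode should be taken from the datasheet and not
from this sketch.

```c
#include <stdint.h>

/* Bit positions from the "Control byte format" table. */
#define DEMO_MAX197_PD_SHIFT  6          /* bits 7,6: PD1,PD0 */
#define DEMO_MAX197_ACQMOD    (1u << 5)  /* bit 5: ACQMOD */
#define DEMO_MAX197_RNG       (1u << 4)  /* bit 4: RNG */
#define DEMO_MAX197_BIP       (1u << 3)  /* bit 3: BIP */
#define DEMO_MAX197_CHAN_MASK 0x7u       /* bits 2..0: A2,A1,A0 */

/* Pack the individual fields into the control byte passed to convert(). */
uint8_t demo_max197_ctrl(unsigned int pd, int acqmod, int rng, int bip,
                         unsigned int channel)
{
    unsigned int ctrl = (pd & 0x3u) << DEMO_MAX197_PD_SHIFT;

    if (acqmod)
        ctrl |= DEMO_MAX197_ACQMOD;
    if (rng)
        ctrl |= DEMO_MAX197_RNG;
    if (bip)
        ctrl |= DEMO_MAX197_BIP;
    ctrl |= channel & DEMO_MAX197_CHAN_MASK;

    return (uint8_t)ctrl;
}
```

For example, RNG and BIP set on channel 5 with the other fields cleared
gives 0x1D.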

Re: [PATCH] gpio: em: Fix checking return value of irq_alloc_descs

2012-08-30 Thread Magnus Damm

On Fri, Aug 31, 2012 at 9:05 AM, Linus Walleij  wrote:
> On Tue, Aug 28, 2012 at 4:30 AM, Axel Lin  wrote:
>
>> irq_alloc_descs() returns negative error code on failure.
>>
>> Signed-off-by: Axel Lin 
>
> Magnus, can I have your ACK on this?

Yes, of course! I never disagree with bug fixes =)

Acked-by: Magnus Damm 

Thanks a lot to both Axel and you!

/ magnus

Re: [PATCH] userns: Add basic quota support v4

2012-08-30 Thread Dave Chinner

On Wed, Aug 29, 2012 at 02:31:26AM -0700, Eric W. Biederman wrote:
> 
> Dave thanks for taking the time to take a detailed look at this code.
> 
> Dave Chinner  writes:
> 
> > On Tue, Aug 28, 2012 at 12:09:56PM -0700, Eric W. Biederman wrote:
> >> 
> >> Add the data type struct kqid which holds the kernel internal form of
> >> the owning identifier of a quota.  struct kqid is a replacement for
> >> the implicit union of uid, gid and project stored in an unsigned int
> >> and the quota type field that was used in the quota data
> >> structures.  Making the data type explicit allows the kuid_t and
> >> kgid_t type safety to propagate more thoroughly through the code,
> >> revealing more places where uid/gid conversions need be made.
> >> 
> >> Along with the data type struct kqid comes the helper functions
> >> qid_eq, qid_lt, from_kqid, from_kqid_munged, qid_valid, make_kqid,
> >
> > I think Jan's comment about from_kqid being named id_from_kgid is
> > better, though I also think it would read better as kqid_to_id().
> > ie:
> >
> > id = kqid_to_id(ns, qid);
> 
> kqid and qid are the same thing just in a different encoding.
> Emphasizing the quota identifier instead of the kernel vs user encoding
> change is paying attention to the wrong thing.

Not from a quota perspective. The only thing the quota code really
cares about is the quota identifier, not the encoding.

Fundamentally, from_kqid() doesn't tell me anything about what I'm
getting from the kqid. There's code all over the place that uses the
"_to_" convention because it's obvious what is
being converted from/to. e.g. cpu_to_beXX, compat_to_ptr,
dma_to_phys, pfn_to_page, etc.  Best practice says "follow existing
conventions".

> Using make_kqid and from_kqid follows the exact same conventions as I have
> established for kuids and kgids.  So if you learn one you have learned
> them all.

For those of us that have to look at it once every few months,
following the same conventions as all the other code in the kernel
(i.e. kqid_to_id()) tells me everything I need to know without
having to go through the process of looking up the unusual
from_kqid() function and then from_kuid() to find out what it is
actually doing.

> >> make_kqid_invalid, make_kqid_uid, make_kqid_gid.
> >
> > and these named something like uid_to_kqid()
> 
> The last two are indeed weird, and definitely not the common case,
> since there is no precedent I can almost see doing something different
> but I don't see a good case for a different name.

There's plenty of precedent in other code that converts formats.
A very common convention that is used everywhere is DEFINE_...().
That would make the code easier to grasp than "make...".

> >> Change struct dquot dq_id to a struct kqid and remove the now
> >> unnecessary dq_type.
> >> 
> >> Update the signature of dqget, quota_send_warning, dquot_get_dqblk,
> >> and dquot_set_dqblk to use struct kqid.
> >> 
> >> Make minimal changes to ext3, ext4, gfs2, ocfs2, and xfs to deal with
> >> the change in quota structures and signatures.  The ocfs2 changes are
> >> larger than most because of the extensive tracing throughout the ocfs2
> >> quota code that prints out dq_id.
> >
> > How did you test that this all works?
> 
> By making it a compile error if you get a conversion wrong and making it
> a rule not to make any logic changes.
>
> That combined with code review
> and running the code a bit to make certain I did not horribly mess up.

But no actual regression testing. You're messing with code that I
will have to triage when it goes wrong for a user, so IMO your code
has to pass the same bar as the code I write has to pass for review
- please regression test your code and write new regression tests
for new functionality.

> > e.g. run xfstests -g quota on
> > each of those filesystems and check for no regressions? And if you
> > wrote any tests, can you convert them to be part of xfstests so that
> > namespace aware quotas get tested regularly?
> 
> I have not written any tests, and running the xfstests in a namespace
> should roughly be a matter of "unshare -U xfstest -g quota"  It isn't
> quite that easy because  /proc/self/uid_map and /proc/self/gid_map need

Asking people to run the entire regression test suite differently
and with special setup magic won't get the code tested regularly.
Writing a new, self contained test that exercises quota in multiple
namespaces simultaneously is what is needed - that way people who
don't even know that namespaces exist will be regression testing
it...

> >> --- a/include/linux/quota.h
> >> +++ b/include/linux/quota.h
> >> @@ -181,10 +181,161 @@ enum {
> >>  #include 
> >>  
> >>  #include 
> >> +#include 
> >>  
> >>  typedef __kernel_uid32_t qid_t; /* Type in which we store ids in memory */
> >>  typedef long long qsize_t;/* Type in which we store sizes */
> >
> > From fs/xfs/xfs_types.h:
> >
> > typedef __uint32_t  prid_t; /* project ID */
> >
> > Perhaps it

Re: [PATCH v3] linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative operands

2012-08-30 Thread Guenter Roeck

On Thu, Aug 30, 2012 at 05:35:31PM -0700, Andrew Morton wrote:
> On Thu, 30 Aug 2012 17:10:47 -0700 Guenter Roeck  wrote:
> 
> > DIV_ROUND_CLOSEST returns a bad result for dividends with different sign:
> > DIV_ROUND_CLOSEST(-2, 2) = 0
> > 
> > Most of the time this does not matter. However, in the hardware monitoring
> > subsystem, DIV_ROUND_CLOSEST is sometimes used on integers which can be
> > negative (such as temperatures).
> > 
> > ...
> >
> > --- a/include/linux/kernel.h
> > +++ b/include/linux/kernel.h
> > @@ -84,8 +84,11 @@
> >  )
> >  #define DIV_ROUND_CLOSEST(x, divisor)( \
> >  {  \
> > -   typeof(divisor) __divisor = divisor;\
> > -   (((x) + ((__divisor) / 2)) / (__divisor));  \
> > +   typeof(x) __x = x;  \
> > +   typeof(divisor) __d = divisor;  \
> > +   ((__x) < 0) == ((__d) < 0) ?\
> > +   (((__x) + ((__d) / 2)) / (__d)) :   \
> > +   (((__x) - ((__d) / 2)) / (__d));\
> >  }  \
> >  )
> 
> Your v2 had that sneaky little "(typeof(x))-1 >= 0" trick in it, so
> half the code gets elided at compile time if `x' (why isn't this called
> "dividend") has an unsigned type.
> 
> Would retaining that be of any benefit?  We do want to avoid doing the
> compare-and-branch in as many cases as possible.
> 
DIV_ROUND_CLOSEST(0,-2)=1

This also happens if I keep the sneaky code. The v3 code does not have this
problem. I know it is a bit theoretical, but it is still there. Of course, I could
simply ignore the divisor's sign entirely, assuming (and documenting) that
negative divisors are just too odd to deal with. Comments welcome ...

> Also, this would be a great opportunity to document the macro's behaviour
> (I do go on).  That would be a useful thing to do, given that we're now
> handling the four +/+, +/-, -/+, -/- cases and the behaviour for each
> case isn't terribly obvious.
> 
Ok.

Guenter
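
The four sign combinations discussed above can be checked with plain
functions. This is a userspace sketch for int operands only; the real
macro is type-generic through typeof(), and the demo_* names are
hypothetical. The _old variant mirrors the lines removed by the quoted
hunk, the _v3 variant mirrors the lines it adds.

```c
/* Behaviour before the patch: always add half the divisor. */
int demo_div_round_closest_old(int x, int divisor)
{
    return (x + divisor / 2) / divisor;
}

/*
 * Behaviour of the v3 patch: add half the divisor when both operands
 * have the same sign, subtract it when the signs differ, so that the
 * truncating division rounds to the closest value in every case.
 */
int demo_div_round_closest_v3(int x, int divisor)
{
    if ((x < 0) == (divisor < 0))
        return (x + divisor / 2) / divisor;
    return (x - divisor / 2) / divisor;
}
```

With these, demo_div_round_closest_old(-2, 2) gives 0, as described in the
commit message, while demo_div_round_closest_v3(-2, 2) gives -1.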

Re: [PATCH 4/9 V3] workqueue: add non_manager_role_manager_mutex_unlock()

2012-08-30 Thread Lai Jiangshan

On 08/30/2012 05:17 PM, Tejun Heo wrote:
> Hello, Lai.
> 
> On Thu, Aug 30, 2012 at 05:16:01PM +0800, Lai Jiangshan wrote:
>> gcwq_unbind_fn() is unsafe even if it is called from a work item,
>> so we need non_manager_role_manager_mutex_unlock().
>>
>> If rebind_workers() is called from a work item, it is safe when there are
>> no CPU_INTENSIVE items. But we can't disable CPU_INTENSIVE items,
>> so it is still unsafe, and we need non_manager_role_manager_mutex_unlock() too.
> 
> Can you please elaborate?  Why is it not safe if there are
> CPU_INTENSIVE items?
> 
> Thanks.
> 

Imagine there are only two workers, and both still have the UNBOUND bit set
because rebind_workers() has not been called yet. The first one is processing
work items and the second one is idle. When the first one encounters the
rebind_workers() work item and handles it, the second one at the same time
tries to create workers, fails, and goes on to process work items too. If the
second one unluckily picks up a CPU_INTENSIVE item, nr_running is still <= 1
after the first one finishes rebind_workers().

Contribution to nr_running:
first one:  processes work items endlessly              +0 or +1
second one: processes the CPU_INTENSIVE item endlessly  +0

No one is left to take the manager role.

Thanks.
Lai

Re: [PATCH v3 3/3] spi: spi-davinci: convert to DMA engine API

2012-08-30 Thread Vinod Koul

On Thu, 2012-08-30 at 10:43 -0400, Matt Porter wrote:
> On Thu, Aug 30, 2012 at 07:46:32PM +0530, Sekhar Nori wrote:
> > Hi Matt,
> > 
> > On 8/23/2012 6:39 AM, Matt Porter wrote:
> > > Removes use of the DaVinci EDMA private DMA API and replaces
> > > it with use of the DMA engine API.
> > > 
> > > Signed-off-by: Matt Porter 
> > 
> > I tried testing this patch on my OMAP-L138 EVM, but SPI fails to 
> > initialize after applying the patch.
> > 
> > root@arago:~# dmesg | grep -i spi   
> > 
> > spi_davinci spi_davinci.1: request RX DMA channel failed
> > 
> 
> Hi Sekhar,
> 
> Most likely CONFIG_TI_EDMA is off as it defaults to off in the v3
> series. Try enabling this and if it's the problem then this error
> path can be fixed to properly fallback to PIO only or fail to
> initialize as needed.
I didn't see any update on this one. Is it okay? If so, care to send a
Tested-by?

-- 
~Vinod Koul
Intel Corp.


Re: [PATCH 0/2] ARM: EXYNOS: Set the capability of pdm0 and pdm1 as DMA_PRIVATE

2012-08-30 Thread Vinod Koul

On Wed, 2012-08-29 at 10:16 +0530, Tushar Behera wrote:
> DMA clients pdma0 and pdma1 are internal to the SoC and are used only
> by dedicated peripherals. Since they cannot be used for generic
> purpose, their capability should be set as DMA_PRIVATE.
> 
> The patches are rebased on top of v3.6-rc3.
Kukjin, if you ack them I can take them through my tree; the other way round
is fine with me too.
> 
> Tushar Behera (2):
>   ARM: EXYNOS: Set the capability of pdm0 and pdm1 as DMA_PRIVATE
>   DMA: PL330: Set the capability of pdm0 and pdm1 as DMA_PRIVATE
> 
>  arch/arm/mach-exynos/dma.c |2 ++
>  drivers/dma/pl330.c|1 +
>  2 files changed, 3 insertions(+), 0 deletions(-)
> 

-- 
~Vinod Koul
Intel Corp.


Re: [RFC,PATCH] efi: Add support for a UEFI variable filesystem

2012-08-30 Thread Jeremy Kerr


Hi hpa,

Thanks for the review!


> However, I have a question... rather than putting the attributes as the
> first data bytes, would it be better to make it either part of the
> filename (assuming there is at least one character other than / which
> can be reasonably relied upon to not be part of the name); for example:
>
> LangCodes,BS,RT
>
> ... or ...
>
> LangCodes,6


This will get tricky when handling EFI_VARIABLE_APPEND_WRITE: this 
attribute will never appear in the attributes returned by GetVariable(), 
but may be passed to SetVariable(). If we put attributes in the 
filename, we'd need to handle writes to both names, and/or have 
duplicate dentries for each variable. We could do it, but the filesystem 
interface might be a little messy.


[Supporting append writes is essential for key database updates, which 
may be signed]
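
To make the "attributes as the first data bytes" layout concrete, here is a
rough user-space sketch of how a write could be split up. The attribute values
are the ones defined by the UEFI specification; everything else (the struct,
the function name parse_var_write, the 4-byte host-order prefix) is an
assumption for illustration and not code from the RFC patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Attribute bits as defined by the UEFI specification. */
#define EFI_VARIABLE_NON_VOLATILE       0x00000001u
#define EFI_VARIABLE_BOOTSERVICE_ACCESS 0x00000002u
#define EFI_VARIABLE_RUNTIME_ACCESS     0x00000004u
#define EFI_VARIABLE_APPEND_WRITE       0x00000040u

/* Hypothetical result of splitting one write() buffer. */
struct var_write {
	uint32_t attributes;       /* attributes without APPEND_WRITE */
	int append;                /* nonzero if APPEND_WRITE was requested */
	const unsigned char *data; /* payload following the prefix */
	size_t len;                /* payload length in bytes */
};

/*
 * Assumed layout: a 32-bit attribute word in host byte order,
 * followed by the variable data. Returns 0 on success, -1 if the
 * buffer is too short to hold the attribute word.
 */
int parse_var_write(const unsigned char *buf, size_t count,
		    struct var_write *out)
{
	uint32_t attrs;

	if (count < sizeof(attrs))
		return -1;
	memcpy(&attrs, buf, sizeof(attrs));

	/* APPEND_WRITE is a request flag, never a stored attribute. */
	out->append = (attrs & EFI_VARIABLE_APPEND_WRITE) != 0;
	out->attributes = attrs & ~EFI_VARIABLE_APPEND_WRITE;
	out->data = buf + sizeof(attrs);
	out->len = count - sizeof(attrs);
	return 0;
}
```

Because the append flag travels in the data rather than in the name, a single
dentry per variable is enough: "LangCodes" written with attributes 0x46 and
"LangCodes" read back with attributes 0x6 are the same file.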


Cheers,


Jeremy


Re: [PATCH -v12 02/15] resources: Add probe_resource()

2012-08-30 Thread Bjorn Helgaas

On Wed, Aug 29, 2012 at 10:36 AM, Yinghai Lu  wrote:
> On Wed, Aug 29, 2012 at 8:57 AM, Yinghai Lu  wrote:
>> also have another version for probe_resource, please check attached
>> version -v8.
>>
>
> sorry, v8 forgot to remove two lines.
>
> please use -v9 instead.
>
> -v8: Linus said: allocation/return is not right, and -1 step tricks make it
> not work as generic resource probe.
>  So try to remove the needed_size tricks, and also use __adjust_resource
> for probing instead.
> -v9: remove two lines that were supposed to be removed after converting to use
> __adjust_resource

These tweaks might be slight improvements, but they completely miss
the point of my objection.  I just don't think the probe_resource()
interface is a reasonable addition to kernel/resource.c.  I think it's
too hard to describe what it does, and it seems like it's too specific
to what PCI needs in this particular case.  We should be able to look
at the prototype and get a pretty good idea of what the function does,
but I can't do that with this:

+int probe_resource(struct resource *b_res,
+struct resource *busn_res,
+resource_size_t needed_size, struct resource **p,
+int skip_nr, int limit, int stop_flags)

We already have adjust_resource(), which grows or shrinks a resource
while maintaining the invariants that the adjusted resource (1)
doesn't overlap any of its siblings and (2) still contains all its
children.

adjust_resource() seems like a fairly generic, generally useful
interface.  What you're trying to do with probe_resource() is quite
similar, except that probe_resource() adds the idea of walking up the
tree.

I think you should consider something like an "expand_resource()" that
just balloons a resource at both ends until it abuts its siblings,
i.e., it grows the resource as much as possible.  Then you know the
largest possible size, and you can use adjust_resource() to shrink it
again if you don't need that much.  You can walk up the tree in the
caller when you need to.
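
To show what I mean, here is a rough user-space sketch. The struct is a
cut-down stand-in for the kernel's struct resource, the function body is only
my guess at the semantics, there is no locking, and clamping to the parent's
bounds when there is no sibling on one side is an assumption on my part.

```c
#include <stddef.h>

typedef unsigned long long resource_size_t; /* stand-in for the kernel type */

/* Cut-down stand-in: siblings are kept sorted by start address. */
struct resource {
	resource_size_t start;
	resource_size_t end; /* inclusive */
	struct resource *parent, *sibling, *child;
};

/*
 * Hypothetical expand_resource(): grow res at both ends until it
 * abuts its neighbours (or the parent's bounds if it has none).
 * Growing never uncovers a child, so both adjust_resource()
 * invariants still hold afterwards. Returns 0 on success, -1 if
 * res has no parent or is not linked into the parent's child list.
 */
int expand_resource(struct resource *res)
{
	struct resource *parent = res->parent;
	struct resource *prev = NULL;
	struct resource *p;

	if (!parent)
		return -1;

	/* Find the sibling immediately before res, if any. */
	for (p = parent->child; p && p != res; p = p->sibling)
		prev = p;
	if (!p)
		return -1;

	res->start = prev ? prev->end + 1 : parent->start;
	res->end = res->sibling ? res->sibling->start - 1 : parent->end;
	return 0;
}
```

The prototype says everything the function does: one resource in, same
resource out, as large as its neighbours allow. Shrinking back to the needed
size and walking up the tree both stay in the caller.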

Bjorn

Re: [PATCH v3] linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative operands

2012-08-30 Thread Andrew Morton

On Thu, 30 Aug 2012 17:10:47 -0700 Guenter Roeck  wrote:

> DIV_ROUND_CLOSEST returns a bad result for dividends with different sign:
>   DIV_ROUND_CLOSEST(-2, 2) = 0
> 
> Most of the time this does not matter. However, in the hardware monitoring
> subsystem, DIV_ROUND_CLOSEST is sometimes used on integers which can be
> negative (such as temperatures).
> 
> ...
>
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -84,8 +84,11 @@
>  )
>  #define DIV_ROUND_CLOSEST(x, divisor)(   \
>  {\
> - typeof(divisor) __divisor = divisor;\
> - (((x) + ((__divisor) / 2)) / (__divisor));  \
> + typeof(x) __x = x;  \
> + typeof(divisor) __d = divisor;  \
> + ((__x) < 0) == ((__d) < 0) ?\
> + (((__x) + ((__d) / 2)) / (__d)) :   \
> + (((__x) - ((__d) / 2)) / (__d));\
>  }\
>  )

Your v2 had that sneaky little "(typeof(x))-1 >= 0" trick in it, so
half the code gets elided at compile time if `x' (why isn't this called
"dividend") has an unsigned type.

Would retaining that be of any benefit?  We do want to avoid doing the
compare-and-branch in as many cases as possible.

Also, this would be a great opportunity to document the macro's behaviour
(I do go on).  That would be a useful thing to do, given that we're now
handling the four +/+, +/-, -/+, -/- cases and the behaviour for each
case isn't terribly obvious.
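
As a starting point for that documentation, here is a small user-space sketch
that exercises the four sign combinations. The macro body follows the v3 logic
quoted above, spelled with __typeof__ so it also builds in strict ISO modes;
the helper name print_sign_cases is made up for this example.

```c
#include <stdio.h>

/* Same logic as the v3 macro; needs GNU C statement expressions. */
#define DIV_ROUND_CLOSEST(x, divisor)(			\
{							\
	__typeof__(x) __x = x;				\
	__typeof__(divisor) __d = divisor;		\
	((__x) < 0) == ((__d) < 0) ?			\
		(((__x) + ((__d) / 2)) / (__d)) :	\
		(((__x) - ((__d) / 2)) / (__d));	\
}							\
)

/* Hypothetical helper: one line of output per sign combination. */
void print_sign_cases(void)
{
	printf("+/+  7 /  2 -> %d\n", DIV_ROUND_CLOSEST(7, 2));   /* 4 */
	printf("+/-  7 / -2 -> %d\n", DIV_ROUND_CLOSEST(7, -2));  /* -4 */
	printf("-/+ -7 /  2 -> %d\n", DIV_ROUND_CLOSEST(-7, 2));  /* -4 */
	printf("-/- -7 / -2 -> %d\n", DIV_ROUND_CLOSEST(-7, -2)); /* 4 */
}
```

In all four cases the exact quotient is 3.5 in magnitude and the result is
rounded away from zero; the case from the changelog, DIV_ROUND_CLOSEST(-2, 2),
now yields -1.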

Re: [PATCH] ioat: Adding Ivy Bridge IOATDMA PCI device IDs

2012-08-30 Thread Vinod Koul

On Fri, 2012-08-24 at 16:36 -0700, Dave Jiang wrote:
> Signed-off-by: Dave Jiang 
Sounds okay to me, I can carry it once Dan acks it
> ---
> 
>  drivers/dma/ioat/pci.c |   22 ++
>  1 files changed, 22 insertions(+), 0 deletions(-)
> 
> diff --git a/drivers/dma/ioat/pci.c b/drivers/dma/ioat/pci.c
> index 5e3a40f..c057306 100644
> --- a/drivers/dma/ioat/pci.c
> +++ b/drivers/dma/ioat/pci.c
> @@ -40,6 +40,17 @@ MODULE_VERSION(IOAT_DMA_VERSION);
>  MODULE_LICENSE("Dual BSD/GPL");
>  MODULE_AUTHOR("Intel Corporation");
>  
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB0 0x0e20
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB1 0x0e21
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB2 0x0e22
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB3 0x0e23
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB4 0x0e24
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB5 0x0e25
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB6 0x0e26
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB7 0x0e27
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB8 0x0e2e
> +#define PCI_DEVICE_ID_INTEL_IOAT_IVB9 0x0e2f
> +
>  static struct pci_device_id ioat_pci_tbl[] = {
>   /* I/OAT v1 platforms */
>   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT) },
> @@ -83,6 +94,17 @@ static struct pci_device_id ioat_pci_tbl[] = {
>   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_SNB8) },
>   { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_SNB9) },
>  
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB0) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB1) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB2) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB3) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB4) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB5) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB6) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB7) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB8) },
> + { PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_IOAT_IVB9) },
> +
>   { 0, }
>  };
>  MODULE_DEVICE_TABLE(pci, ioat_pci_tbl);
> 

-- 
~Vinod Koul
Intel Corp.


Re: [PATCH] dma/ste_dma40: Fixup clock usage during probe

2012-08-30 Thread Vinod Koul

On Thu, 2012-08-23 at 13:41 +0200, Ulf Hansson wrote:
> From: Ulf Hansson 
> 
> Fix up some error handling for clocks during probe and make sure
> to use clk_prepare as well as clk_enable.
> 
> Signed-off-by: Ulf Hansson 
> Acked-by: Linus Walleij 
Applied, thanks

-- 
~Vinod Koul
Intel Corp.


[PATCH v3] linux/kernel.h: Fix DIV_ROUND_CLOSEST to support negative operands

2012-08-30 Thread Guenter Roeck

DIV_ROUND_CLOSEST returns a bad result for dividends with different sign:
DIV_ROUND_CLOSEST(-2, 2) = 0

Most of the time this does not matter. However, in the hardware monitoring
subsystem, DIV_ROUND_CLOSEST is sometimes used on integers which can be
negative (such as temperatures).

Signed-off-by: Guenter Roeck 
---
v3: Instead of adding a new macro, fix DIV_ROUND_CLOSEST.
This version works for negative dividend and divisor.

v2: v1 did not work if typeof(divisor) was an unsigned variable type
(which can obviously not be negative).
Rework to revert to DIV_ROUND_CLOSEST if the dividend is unsigned,
or if it is signed but non-negative.

 include/linux/kernel.h |7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 6043821..4b180de 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -84,8 +84,11 @@
 )
 #define DIV_ROUND_CLOSEST(x, divisor)( \
 {  \
-   typeof(divisor) __divisor = divisor;\
-   (((x) + ((__divisor) / 2)) / (__divisor));  \
+   typeof(x) __x = x;  \
+   typeof(divisor) __d = divisor;  \
+   ((__x) < 0) == ((__d) < 0) ?\
+   (((__x) + ((__d) / 2)) / (__d)) :   \
+   (((__x) - ((__d) / 2)) / (__d));\
 }  \
 )
 
-- 
1.7.9.7


Re: [PATCH] gpio: em: Fix checking return value of irq_alloc_descs

2012-08-30 Thread Linus Walleij

On Tue, Aug 28, 2012 at 4:30 AM, Axel Lin  wrote:

> irq_alloc_descs() returns negative error code on failure.
>
> Signed-off-by: Axel Lin 

Magnus, can I have your ACK on this?

Yours,
Linus Walleij

Re: [PATCH 0/3] staging: ramster: move to new zcache2 code base

2012-08-30 Thread Konrad Rzeszutek Wilk

On Thu, Aug 30, 2012 at 03:46:01PM -0700, Dan Magenheimer wrote:
> Hi Greg --
> 
> gregkh> If you feel that the existing code needs to be dropped
> gregkh> and replaced with a totally new version, that's fine with
> gregkh> me.  It's forward progress, which is all that I ask for. 
> (http://lkml.indiana.edu/hypermail/linux/kernel/1208.0/02240.html,
> in reference to zcache, assuming applies to ramster as well)
> 
> Please apply for staging-next for the 3.7 window to move ramster forward.
> Since AFAICT there have been no patches or contributions from others to
> drivers/staging/ramster since it was merged, this totally new version
> of ramster should not run afoul and the patches should apply to
> 3.5 or 3.6-rcN.
> 
> Thanks,
> Dan
> 
> When ramster was merged into staging at 3.4, it used a "temporarily" forked
> version of zcache.  Code was proposed to merge zcache and ramster into
> a new common redesigned codebase which both resolves various serious design
> flaws and eliminates all code duplication between zcache and ramster, with
> the result to replace "zcache".  Sadly, that proposal was blocked, so the
> zcache (and tmem) code in drivers/staging/zcache and the zcache (and tmem)
> code in drivers/staging/ramster continue to be different.

Right. They will diverge for now.
> 
> This patchset moves ramster to the new redesigned codebase and calls that
> new codebase "zcache2".  Most, if not all, of the redesign will eventually
> need to be merged with "zcache1" before zcache functionality should be
> promoted out of staging.

Or, as part of unstaging ramster, 'zcache1' can be made more of a
library so that ramster can use it. Naturally this also requires some
modifications in zcache1 to provide the infrastructure functionality for
ramster. But that is something we can worry about later.

> 
> An overview of the zcache2 rewrite is provided in a git commit comment
> later in this series.
> 
> A significant item of debate in the new codebase is the removal of zsmalloc.

Just clarifying: what you mean is that zsmalloc is removed in your ramster's
version of zcache (so zcache2), as you are not using it.

> This removal may be temporary if zsmalloc is enhanced with necessary
> features to meet the needs of the new zcache codebase.  Justification
> for the change can be found at http://lkml.org/lkml/2012/8/15/292
> Such zsmalloc enhancements will almost certainly necessitate a major
> rework, not a small patch.

Or have zcache be able to select whether it is going to use zbud
or zsmalloc for any type of pages (so you could use zsmalloc for
both cleancache and frontswap pages, or be more selective like
zcache1 is).
> 
> While this zcache2 codebase is far from perfect (and thus remains in staging),
> the foundation is now cleaner, more stable, more maintainable, and much
> better commented.
> 
> Signed-off-by: Dan Magenheimer 
> 
> ---
> Diffstat:
> 
>  drivers/staging/Kconfig|4 +-
>  drivers/staging/Makefile   |2 +-
>  drivers/staging/ramster/Kconfig|   25 +-
>  drivers/staging/ramster/Makefile   |7 +-
>  drivers/staging/ramster/TODO   |   13 -
>  drivers/staging/ramster/cluster/Makefile   |3 -
>  drivers/staging/ramster/cluster/heartbeat.c|  464 ---
>  drivers/staging/ramster/cluster/heartbeat.h|   87 -
>  drivers/staging/ramster/cluster/masklog.c  |  155 -
>  drivers/staging/ramster/cluster/masklog.h  |  220 --
>  drivers/staging/ramster/cluster/nodemanager.c  |  992 --
>  drivers/staging/ramster/cluster/nodemanager.h  |   88 -
>  .../staging/ramster/cluster/ramster_nodemanager.h  |   39 -
>  drivers/staging/ramster/cluster/tcp.c  | 2256 -
>  drivers/staging/ramster/cluster/tcp.h  |  159 -
>  drivers/staging/ramster/cluster/tcp_internal.h |  248 --
>  drivers/staging/ramster/r2net.c|  401 ---
>  drivers/staging/ramster/ramster.h  |  113 +-
>  drivers/staging/ramster/ramster/heartbeat.c|  462 +++
>  drivers/staging/ramster/ramster/heartbeat.h|   87 +
>  drivers/staging/ramster/ramster/masklog.c  |  155 +
>  drivers/staging/ramster/ramster/masklog.h  |  220 ++
>  drivers/staging/ramster/ramster/nodemanager.c  |  995 ++
>  drivers/staging/ramster/ramster/nodemanager.h  |   88 +
>  drivers/staging/ramster/ramster/r2net.c|  414 +++
>  drivers/staging/ramster/ramster/ramster.c  |  985 ++
>  drivers/staging/ramster/ramster/ramster.h  |  161 +
>  .../staging/ramster/ramster/ramster_nodemanager.h  |   39 +
>  drivers/staging/ramster/ramster/tcp.c  | 2253 +
>  drivers/staging/ramster/ramster/tcp.h  |  159 +
>  drivers/staging/ramster/ramster/tcp_internal.h |  248 ++
>  drivers/staging/ramster/tmem.c |  313 +-
>  drivers/staging/ramster/tmem.h

Re: [PATCH 5/8] hpfs: drop lock/unlock super

2012-08-30 Thread Mikulas Patocka

It looks ok.

Mikulas

On Thu, 30 Aug 2012, Marco Stornelli wrote:

> Removed lock/unlock super.
> 
> Signed-off-by: Marco Stornelli 
> ---
>  fs/hpfs/super.c |3 ---
>  1 files changed, 0 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/hpfs/super.c b/fs/hpfs/super.c
> index 706a12c..8af2cdc 100644
> --- a/fs/hpfs/super.c
> +++ b/fs/hpfs/super.c
> @@ -389,7 +389,6 @@ static int hpfs_remount_fs(struct super_block *s, int *flags, char *data)
>   *flags |= MS_NOATIME;
>   
>   hpfs_lock(s);
> - lock_super(s);
>   uid = sbi->sb_uid; gid = sbi->sb_gid;
>   umask = 0777 & ~sbi->sb_mode;
>   lowercase = sbi->sb_lowercase;
> @@ -422,12 +421,10 @@ static int hpfs_remount_fs(struct super_block *s, int *flags, char *data)
>  
>   replace_mount_options(s, new_opts);
>  
> - unlock_super(s);
>   hpfs_unlock(s);
>   return 0;
>  
>  out_err:
> - unlock_super(s);
>   hpfs_unlock(s);
>   kfree(new_opts);
>   return -EINVAL;
> -- 
> 1.7.3.4
> 

Re: [PATCH 13/22] ARM: ux500: Fork MSP platform registration for step-by-step DT enablement

2012-08-30 Thread Linus Walleij

On Tue, Aug 28, 2012 at 12:48 AM, Lee Jones  wrote:
> On Mon, Aug 27, 2012 at 04:07:58PM -0700, Linus Walleij wrote:

>> If you're adding and then removing *all* of them in this set,
>> why add them in the first place?
>
> So that there's no breakage during bisection.
>
> You should be able to roll the kernel back in between each of these
> patches and there to be full compatibility at each point. At least
> that was the intention. Is that wrong?

No, it's correct. The only way to do what I'm thinking of may be
to squash them all into one gigantic patch, which is not good either.

So go ahead with this scheme, it's the lesser of two evils.
Acked-by etc.

Yours,
Linus Walleij

Re: [PATCH 09/17] ARM: ux500: Add AB8500 CODEC node to DB8500 Device Tree

2012-08-30 Thread Linus Walleij

On Fri, Aug 24, 2012 at 7:01 AM, Lee Jones  wrote:

> Ensure correct probing and pass through important configuration
> options to the AB8500 CODEC driver when DT is enabled
>
> Signed-off-by: Lee Jones 

Acked-by: Linus Walleij 

Yours,
Linus Walleij

Re: [PATCH v2 1/2] linux/kernel.h: Introduce IDIV_ROUND_CLOSEST

2012-08-30 Thread Guenter Roeck

On Thu, Aug 30, 2012 at 04:15:32PM -0700, Andrew Morton wrote:
> On Tue, 28 Aug 2012 09:30:55 -0700
> Guenter Roeck  wrote:
> 
> > DIV_ROUND_CLOSEST returns a bad result for negative dividends:
> > DIV_ROUND_CLOSEST(-2, 2) = 0
> > 
> > Most of the time this does not matter. However, in the hardware monitoring
> > subsystem, it is sometimes used on integers which can be negative (such as
> > temperatures). Introduce new macro IDIV_ROUND_CLOSEST which also supports
> > negative dividends.
> > 
> 
> Can't we just fix DIV_ROUND_CLOSEST?  That will make it a bit slower
> but it's not exactly a speed demon right now.  And fixing
> DIV_ROUND_CLOSEST() might just fix other bugs that we don't know about
> yet.
> 
Sure, fine with me. I'll submit a patch. Let's see who starts screaming.

Guenter

Re: [PATCH 01/17] ASoC: Ux500: Move MSP pinctrl setup into the MSP driver

2012-08-30 Thread Linus Walleij

On Fri, Aug 24, 2012 at 7:01 AM, Lee Jones  wrote:

> In the initial submission of the MSP driver msp1 and msp3's associated
> pinctrl mechanism was passed back to platform code using a plat_init()
> call-back routine, but it has no place in platform code. The MSP driver
> should set this up for the appropriate ports. Instead we use a use_pinctrl
> identifier which is passed from platform_data/Device Tree which indicates
> which ports should use pinctrl.
>
> CC: alsa-de...@alsa-project.org
> Signed-off-by: Lee Jones 

Looks good to me, so Acked-by: Linus Walleij 
for the ux500 mach part.

However I'd request Ola/Roger to ACK it too on the ALSA side.

Yours,
Linus Walleij

Re: [PATCH] clk: Make the generic clock API available by default

2012-08-30 Thread Stephen Warren


On 08/30/12 10:19, Mark Brown wrote:
> On Wed, Aug 29, 2012 at 02:49:34PM -0700, Stephen Warren wrote:
>> On 08/28/12 13:35, Mark Brown wrote:
>>> @@ -674,6 +676,7 @@ config ARCH_TEGRA
>>> select GENERIC_CLOCKEVENTS
>>> select GENERIC_GPIO
>>> select HAVE_CLK
>>> +   select HAVE_CUSTOM_CLK
>>
>> For 3.7, Tegra will switch to the common clock framework. I think
>> this patch would then disable that. How should we resolve this -
>> rebase the Tegra common-clk tree on top of any branch containing
>> this patch in order to remove that select statement?
>
> I'd expect this to be applied on a separate branch so you should be able
> to rebase your conversion on top of it or merge it into your branch
> which should deal with things well enough I think?

That should work.


Re: [PATCH tip/core/rcu 1/5] rcu: Update rcutorture defaults

2012-08-30 Thread Josh Triplett

On Thu, Aug 30, 2012 at 02:35:36PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 30, 2012 at 11:57:05AM -0700, Josh Triplett wrote:
> > On Thu, Aug 30, 2012 at 11:45:08AM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > A number of new features have been added to rcutorture over the years, but
> > > the defaults have not been updated to include them.  This commit therefore
> > > turns on a couple of them that have proven helpful and trustworthy, namely
> > > periodic progress reports and testing of NO_HZ.
> > > 
> > > Signed-off-by: Paul E. McKenney 
> > > Signed-off-by: Paul E. McKenney 
> > > ---
> > >  kernel/rcutorture.c |4 ++--
> > >  1 files changed, 2 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
> > > index 25b1503..86315d3 100644
> > > --- a/kernel/rcutorture.c
> > > +++ b/kernel/rcutorture.c
> > > @@ -53,10 +53,10 @@ MODULE_AUTHOR("Paul E. McKenney  
> > > and Josh Triplett  > >  
> > >  static int nreaders = -1; /* # reader threads, defaults to 2*ncpus */
> > >  static int nfakewriters = 4; /* # fake writer threads */
> > > -static int stat_interval; /* Interval between stats, in seconds. */
> > > +static int stat_interval = 60; /* Interval between stats, in seconds. */
> > >   /*  Defaults to "only at end of test". */
> > 
> > Need to remove this comment about the default.
> 
> Good catch!  I have replaced it with "Zero means 'only at end of test'".

Good point, you definitely still need to document what zero means.

> > >  static bool verbose; /* Print more debug info. */
> > > -static bool test_no_idle_hz; /* Test RCU's support for tickless idle CPUs. */
> > > +static bool test_no_idle_hz = 1; /* Test RCU support for tickless idle CPUs. */
> > 
> > s/1/true/
> 
> Good point, fixed.
> 
> Thank you for looking this over!

With those two fixes:

Reviewed-by: Josh Triplett 

Re: [PATCH tip/core/rcu 2/5] rcu: Track CPU-hotplug duration statistics

2012-08-30 Thread Josh Triplett

On Thu, Aug 30, 2012 at 01:38:42PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 30, 2012 at 12:00:18PM -0700, Josh Triplett wrote:
> > On Thu, Aug 30, 2012 at 11:45:09AM -0700, Paul E. McKenney wrote:
> > > From: "Paul E. McKenney" 
> > > 
> > > Many rcutorture runs include CPU-hotplug operations in their stress
> > > testing.  This commit accumulates statistics on the durations of these
> > > operations in deference to the recent concern about the overhead and
> > > latency of these operations.
> > 
> > How many jiffies, on average, do these operations take?  Measuring these
> > using jiffies seems highly prone to repeated rounding error.
> 
> On my laptop, 30-140 depending on what hotplug patches I have in place.
> Some users have reported as few as 2-3 jiffies, but they don't use
> rcutorture.
> 
> I eagerly look forward to the time when I need to change the timebase for
> my own use.  ;-)

Fair enough.  In that case, this seems precise enough for the purpose it
serves.
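
For reference, the bookkeeping in the patch boils down to something like the
following user-space sketch. The struct and function names are made up for
illustration; as in the patch, a minimum of -1 means "no sample yet", and each
sample is a jiffies delta, so it carries up to one tick of rounding error.

```c
/* Hypothetical container for one set of duration statistics. */
struct duration_stats {
	unsigned long sum; /* total of all samples, in ticks */
	int min;           /* smallest sample, -1 until the first one */
	int max;           /* largest sample */
};

/* Fold one measured duration (in ticks) into the running statistics. */
void duration_stats_add(struct duration_stats *s, unsigned long delta)
{
	s->sum += delta;
	if (s->min < 0) {
		/* First sample initializes both bounds. */
		s->min = (int)delta;
		s->max = (int)delta;
		return;
	}
	if ((unsigned long)s->min > delta)
		s->min = (int)delta;
	if ((unsigned long)s->max < delta)
		s->max = (int)delta;
}
```

With samples in the 30-140 jiffy range, an error of one tick per sample is a
few percent at worst; at 2-3 jiffies per operation it would dominate, which is
where a finer timebase would be needed.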

> > > Signed-off-by: Paul E. McKenney 
> > > Signed-off-by: Paul E. McKenney 

Reviewed-by: Josh Triplett 

> > > ---
> > >  kernel/rcutorture.c |   42 +-
> > >  1 files changed, 37 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
> > > index 86315d3..c6cf6ff 100644
> > > --- a/kernel/rcutorture.c
> > > +++ b/kernel/rcutorture.c
> > > @@ -176,8 +176,14 @@ static long n_rcu_torture_boosts;
> > >  static long n_rcu_torture_timers;
> > >  static long n_offline_attempts;
> > >  static long n_offline_successes;
> > > +static unsigned long sum_offline;
> > > +static int min_offline = -1;
> > > +static int max_offline;
> > >  static long n_online_attempts;
> > >  static long n_online_successes;
> > > +static unsigned long sum_online;
> > > +static int min_online = -1;
> > > +static int max_online;
> > >  static long n_barrier_attempts;
> > >  static long n_barrier_successes;
> > >  static struct list_head rcu_torture_removed;
> > > @@ -1214,11 +1220,13 @@ rcu_torture_printk(char *page)
> > >  n_rcu_torture_boost_failure,
> > >  n_rcu_torture_boosts,
> > >  n_rcu_torture_timers);
> > > > - cnt += sprintf(&page[cnt], "onoff: %ld/%ld:%ld/%ld ",
> > > -n_online_successes,
> > > -n_online_attempts,
> > > -n_offline_successes,
> > > -n_offline_attempts);
> > > > + cnt += sprintf(&page[cnt],
> > > +"onoff: %ld/%ld:%ld/%ld %d,%d:%d,%d %lu:%lu (HZ=%d) ",
> > > +n_online_successes, n_online_attempts,
> > > +n_offline_successes, n_offline_attempts,
> > > +min_online, max_online,
> > > +min_offline, max_offline,
> > > +sum_online, sum_offline, HZ);
> > > >   cnt += sprintf(&page[cnt], "barrier: %ld/%ld:%ld",
> > >  n_barrier_successes,
> > >  n_barrier_attempts,
> > > @@ -1490,8 +1498,10 @@ static int __cpuinit
> > >  rcu_torture_onoff(void *arg)
> > >  {
> > >   int cpu;
> > > + unsigned long delta;
> > >   int maxcpu = -1;
> > >   DEFINE_RCU_RANDOM(rand);
> > > + unsigned long starttime;
> > >  
> > >   VERBOSE_PRINTK_STRING("rcu_torture_onoff task started");
> > >   for_each_online_cpu(cpu)
> > > @@ -1509,6 +1519,7 @@ rcu_torture_onoff(void *arg)
> > >   printk(KERN_ALERT "%s" TORTURE_FLAG
> > >  "rcu_torture_onoff task: offlining %d\n",
> > >  torture_type, cpu);
> > > + starttime = jiffies;
> > >   n_offline_attempts++;
> > >   if (cpu_down(cpu) == 0) {
> > >   if (verbose)
> > > @@ -1516,12 +1527,23 @@ rcu_torture_onoff(void *arg)
> > > >  "rcu_torture_onoff task: offlined %d\n",
> > >  torture_type, cpu);
> > >   n_offline_successes++;
> > > + delta = jiffies - starttime;
> > > + sum_offline += delta;
> > > + if (min_offline < 0) {
> > > + min_offline = delta;
> > > + max_offline = delta;
> > > + }
> > > + if (min_offline > delta)
> > > + min_offline = delta;
> > > + if (max_offline < delta)
> > > + max_offline = delta;
> > >   }
> > >   } else if (cpu_is_hotpluggable(cpu)) {
> > >   if (verbose)
> > >   printk(KERN_ALERT "%s" TORTURE_FLAG
> > >  "rcu_torture_onoff task: onlining %d\n",
> > >  torture_type, cpu);
> > > + starttime = jiffies;
> > >

Re: [PATCH tip/core/rcu 0/5] Documentation and rcutorture changes

2012-08-30 Thread Josh Triplett

On Thu, Aug 30, 2012 at 02:46:03PM -0700, Paul E. McKenney wrote:
> On Thu, Aug 30, 2012 at 11:56:09AM -0700, Josh Triplett wrote:
> > On Thu, Aug 30, 2012 at 11:44:48AM -0700, Paul E. McKenney wrote:
> > > Hello!
> > > 
> > > This series covers changes to rcutorture and documentation updates.
> > > The individual patches in this series are as follows:
> > > 
> > > 1.Update rcutorture default values so that casual rcutorture
> > >   users will do more aggressive testing.
> > > 2.Make rcutorture track CPU-hotplug latency statistics.
> > > 3.Document SRCU's new-found ability to be used by offline and
> > >   idle CPUs, and also emphasize SRCU's limitations.
> > > 4.Use the new pr_*() interfaces in rcutorture.
> > > 5.Prevent kthread-initialization races in rcutorture.
> > > 
> > >   Thanx, Paul
> > > 
> > > 
> > > 
> > >  b/Documentation/RCU/checklist.txt |6 +
> > >  b/Documentation/RCU/whatisRCU.txt |9 +-
> > >  b/kernel/rcutorture.c |4 -
> > >  kernel/rcutorture.c   |  152 
> > > +++---
> > >  4 files changed, 108 insertions(+), 63 deletions(-)
> > 
> > Something seems wrong with this diffstat; how'd the b/ prefixes get
> > there, and why does it list kernel/rcutorture.c twice, once with and
> > once without?
> 
> Hmmm...  It seems quite reproducible.  I did the usual git-format-patch
> and ran the resulting set of patches through diffstat.  I seem to have a
> broken diffstat...
> 
> However, git diff --stat v3.6-rc1..hotplug.2012.08.28a generates the
> following:
> 
>  kernel/rcutree.c   |   93 
> +++-
>  kernel/rcutree.h   |3 --
>  kernel/rcutree_trace.c |4 +-
>  kernel/sched/core.c|   41 ++---
>  4 files changed, 43 insertions(+), 98 deletions(-)
> 
> Which does look much better.

You might try generating your cover letter template via git format-patch
--cover-letter, which will automatically give you a list of patches and
a git-produced diffstat; much easier than trying to format a cover
letter by hand.  Meanwhile, you might consider sending your patches as a
bug report to diffstat upstream: Thomas E. Dickey
.

- Josh Triplett

Re: [PATCH 0/8] x86, mm: init_memory_mapping cleanup

2012-08-30 Thread Jacob Shin

On Thu, Aug 30, 2012 at 04:06:07PM -0700, Yinghai Lu wrote:
> Only create mapping for E820_820 and E820_RESERVED_KERN.
> 
> Also seperate find_early_page_table out with init_memory_mapping.
> 
> Jacob Shin (3):
>   x86: if kernel .text .data .bss are not marked as E820_RAM, complain
> and fix
>   x86: Fixup code testing if a pfn is direct mapped
>   x86: Only direct map addresses that are marked as E820_RAM
> 
> Yinghai Lu (5):
>   x86, mm: Add global page_size_mask
>   x86, mm: Split out split_mem_range
>   x86, mm: Moving init_memory_mapping calling
>   x86, mm: Revert back good_end setting for 64bit
>   x86, mm: Find early page table only one time
> 
>  arch/x86/include/asm/init.h   |1 -
>  arch/x86/include/asm/page_types.h |3 +
>  arch/x86/include/asm/pgtable.h|1 +
>  arch/x86/kernel/cpu/amd.c |8 +-
>  arch/x86/kernel/setup.c   |   34 ---
>  arch/x86/mm/init.c|  225 
> ++---
>  arch/x86/mm/init_64.c |6 +-
>  arch/x86/platform/efi/efi.c   |8 +-
>  8 files changed, 191 insertions(+), 95 deletions(-)
> 
> -- 
> 1.7.7
> 
> 

I'll be out of office tomorrow, and Monday is a holiday, so I'll test it
on our machines on Tuesday.

Thanks,

-Jacob


Re: [PATCH v2 1/2] linux/kernel.h: Introduce IDIV_ROUND_CLOSEST

2012-08-30 Thread Andrew Morton

On Tue, 28 Aug 2012 09:30:55 -0700
Guenter Roeck  wrote:

> DIV_ROUND_CLOSEST returns a bad result for negative dividends:
>   DIV_ROUND_CLOSEST(-2, 2) = 0
> 
> Most of the time this does not matter. However, in the hardware monitoring
> subsystem, it is sometimes used on integers which can be negative (such as
> temperatures). Introduce new macro IDIV_ROUND_CLOSEST which also supports
> negative dividends.
> 

Can't we just fix DIV_ROUND_CLOSEST?  That will make it a bit slower
but it's not exactly a speed demon right now.  And fixing
DIV_ROUND_CLOSEST() might just fix other bugs that we don't know about
yet.

Also, the name IDIV_ROUND_CLOSEST doesn't communicate much at all.


> +#define IDIV_ROUND_CLOSEST(x, divisor)(  \
> +{\
> + typeof(x) __x = x;  \
> + typeof(divisor) __d = divisor;  \
> + (((typeof(x))-1) >= 0 || (__x) >= 0) ?  \
> + DIV_ROUND_CLOSEST((__x), (__d)) :   \
> + (((__x) - ((__d) / 2)) / (__d));\
> +}\
> +)

And it doesn't help that the new "function" is undocumented.  Yes, we
screwed up with DIV_ROUND_CLOSEST(), but that doesn't mean we need to
keep screwing up!
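
To make the rounding problem concrete, here is a minimal userspace sketch, not the kernel macro itself. The function names are hypothetical, the arithmetic of the first helper mirrors the existing add-half-then-divide approach, and the second assumes a positive divisor, as in the proposed macro.

```c
#include <assert.h>

/* Same arithmetic as the existing macro: add half the divisor, then divide. */
static int div_round_closest_addhalf(int x, int divisor)
{
	return (x + divisor / 2) / divisor;
}

/*
 * Hypothetical sign-aware variant: C division truncates toward zero, so a
 * negative dividend needs half the divisor subtracted instead of added.
 * Assumes divisor > 0.
 */
static int div_round_closest_signed(int x, int divisor)
{
	if (x >= 0)
		return (x + divisor / 2) / divisor;
	return (x - divisor / 2) / divisor;
}
```

With these helpers, div_round_closest_addhalf(-2, 2) yields 0 while div_round_closest_signed(-2, 2) yields -1, which is the case raised in the patch description.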

Re: [PATCH 0/8] x86, mm: init_memory_mapping cleanup

2012-08-30 Thread Yinghai Lu

On Thu, Aug 30, 2012 at 4:06 PM, Yinghai Lu  wrote:
> Only create mapping for E820_820 and E820_RESERVED_KERN.
>
> Also seperate find_early_page_table out with init_memory_mapping.
>
> Jacob Shin (3):
>   x86: if kernel .text .data .bss are not marked as E820_RAM, complain
> and fix
>   x86: Fixup code testing if a pfn is direct mapped
>   x86: Only direct map addresses that are marked as E820_RAM
>
> Yinghai Lu (5):
>   x86, mm: Add global page_size_mask
>   x86, mm: Split out split_mem_range
>   x86, mm: Moving init_memory_mapping calling
>   x86, mm: Revert back good_end setting for 64bit
>   x86, mm: Find early page table only one time
>
>  arch/x86/include/asm/init.h   |1 -
>  arch/x86/include/asm/page_types.h |3 +
>  arch/x86/include/asm/pgtable.h|1 +
>  arch/x86/kernel/cpu/amd.c |8 +-
>  arch/x86/kernel/setup.c   |   34 ---
>  arch/x86/mm/init.c|  225 
> ++---
>  arch/x86/mm/init_64.c |6 +-
>  arch/x86/platform/efi/efi.c   |8 +-
>  8 files changed, 191 insertions(+), 95 deletions(-)

The series can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm

[PATCH 3/8] x86, mm: Moving init_memory_mapping calling

2012-08-30 Thread Yinghai Lu

Move the init_memory_mapping() calls from setup.c to mm/init.c,
so that all related callers can be updated together.

Signed-off-by: Yinghai Lu 
---
 arch/x86/include/asm/init.h    |    1 -
 arch/x86/include/asm/pgtable.h |    2 +-
 arch/x86/kernel/setup.c        |   13 +
 arch/x86/mm/init.c             |   19 ++-
 4 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/init.h b/arch/x86/include/asm/init.h
index adcc0ae..4f13998 100644
--- a/arch/x86/include/asm/init.h
+++ b/arch/x86/include/asm/init.h
@@ -12,7 +12,6 @@ kernel_physical_mapping_init(unsigned long start,
 unsigned long end,
 unsigned long page_size_mask);
 
-
 extern unsigned long __initdata pgt_buf_start;
 extern unsigned long __meminitdata pgt_buf_end;
 extern unsigned long __meminitdata pgt_buf_top;
diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index e47e4db..ae2cabb 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -597,7 +597,7 @@ static inline int pgd_none(pgd_t pgd)
 #ifndef __ASSEMBLY__
 
 extern int direct_gbpages;
-void probe_page_size_mask(void);
+void init_mem_mapping(void);
 
 /* local pte updates need not use xchg for locking */
 static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d6e8c03..c30c78c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -912,20 +912,9 @@ void __init setup_arch(char **cmdline_p)
setup_real_mode();
 
init_gbpages();
-   probe_page_size_mask();
 
-   /* max_pfn_mapped is updated here */
-   max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn< max_low_pfn) {
-   max_pfn_mapped = init_memory_mapping(1UL<<32,
-max_pfn<> PAGE_SHIFT;
 }
 
+void __init init_mem_mapping(void)
+{
+   probe_page_size_mask();
+
+   /* max_pfn_mapped is updated here */
+   max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn< max_low_pfn) {
+   max_pfn_mapped = init_memory_mapping(1UL<<32,
+max_pfn

RE: Using uio_pdrv to create an platform device for an FPGA, mmap() fails

2012-08-30 Thread Worth, Kevin

 >[Added Greg Kroah-Hartman to Cc:]
>
>On Thu, Aug 30, 2012 at 08:10:11PM +, Worth, Kevin wrote:
>> >> Thanks for the reply, Hans. Your question about opening /dev/uio0 O_RDWR
>> >> prompted me to check out how I was creating /dev/uio0 ... my system
>> >> isn't using udev, and I was accidentally creating it with major/minor
>> >> number 254/0 instead of the correct 253/0 (found by looking at
>> >> /proc/devices). Fixed that and the mmap() call started working.
>> >
>> >Good.
>> >
>> >> 
>> >> Verified that if /dev/uio0 has permissions 0644, root can open it O_RDWR
>> >> and mmap PROT_READ | PROT_WRITE using the below code and write to an
>> >> address within my memory map. Of course this contradicts the statement
>> >> "/dev/uioX is a read-only file" in the UIO howto.
>> >
>> >You're right. That wants to be fixed...
>> >
>> >> 
>> >> Including my updated, tested code for completeness.
>> >> Note I also cleaned up the device registration a little by
>> >> using a different platform_device_register_ call and removing fields
>> >> in the struct uio_info that get filled in by uio_pdrv automatically.
>> >
>> >If you want to have that included in the mainline, please choose a more
>> >descriptive name than "myfpga" and send a proper patch.
>> 
>> I wasn't sure about submitting as a patch since it's for a custom FPGA
>> that I don't expect the community will be using,
>
>That doesn't matter. If it helps YOU that the code is maintained in mainline,
>post it.
>
>>  but the code seems like
>> possibly useful sample/example code.
>
>That is another good argument.

Perhaps this could be genericized to be a generic "Memory Map Userspace
IO Device" that takes a base address and a length in config (since those
are really the only things that are particular to my device/usage).
Could be enhanced to allow for additional maps, etc. or just serve as a
working example. Docs could then also simply refer to this as an example
of a device that uses the uio_pdrv driver.
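
As a rough illustration of what the userspace side of such a generic example might look like, here is a small sketch. The helper names, the device path argument and the size argument are assumptions for illustration only; the one UIO convention relied on is that map N is selected by passing N pages as the mmap() offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* UIO selects map N through an mmap() offset of N pages. */
static off_t uio_map_offset(unsigned int map_num, long page_size)
{
	return (off_t)map_num * page_size;
}

/*
 * Hypothetical helper: map `size` bytes of map `map_num` from the UIO
 * device node at `dev_path`. Returns NULL if open() or mmap() fails.
 */
static void *uio_map_region(const char *dev_path, unsigned int map_num,
			    size_t size)
{
	void *p;
	int fd = open(dev_path, O_RDWR | O_SYNC);

	if (fd < 0)
		return NULL;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
		 uio_map_offset(map_num, sysconf(_SC_PAGESIZE)));
	close(fd);	/* an established mapping outlives the descriptor */

	return p == MAP_FAILED ? NULL : p;
}
```

A caller would pass something like "/dev/uio0", map 0 and the map size reported by lsuio, then access registers through the returned pointer rather than through the physical address.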

>
>> Perhaps patching the HOWTO like
>> http://www.kernel.org/doc/htmldocs/uio-howto.html#uio_pci_generic_example
>> is the right approach?
>
>Oh, if you could hack up a patch for the documentation, that would be great.
>But please make it a second patch, don't mix it with your driver code.

Certainly these would belong in separate patches.

>
>Thanks,
>Hans
>
>> 
>> >
>> >Thanks,
>> >Hans
>> >
>> >> 
>> >> -Kevin
>> >> 
>> >> # lsuio -m -v
>> >> uio0: name=uio_myfpga, version=0.1, events=0
>> >> map[0]: addr=0xD000, size=262144, mmap test: OK
>> >> Device attributes:
>> >> uevent=DRIVER=uio_pdrv
>> >> modalias=platform:uio_pdrv
>> >> 
>> >> --Kernelspace--
>> >> #include 
>> >> #include 
>> >> #include 
>> >> 
>> >> #define MYFPGA_BASE 0xd000 // 3G
>> >> #define MYFPGA_SIZE 0x0004 // 256k
>> >> 
>> >> static struct resource myfpga_resources[] = {
>> >> {
>> >> .start = MYFPGA_BASE,
>> >> .end   = MYFPGA_BASE + MYFPGA_SIZE - 1,
>> >> .name  = "myfpga",
>> >> .flags = IORESOURCE_MEM
>> >> }
>> >> };
>> >> 
>> >> static struct uio_info myfpga_uio_info = {
>> >>.name = "uio_myfpga",
>> >>.version = "0.1",
>> >> };
>> >> 
>> >> static struct platform_device *myfpga_uio_pdev;
>> >> 
>> >> static int __init myfpga_init(void)
>> >> {
>> >> myfpga_uio_pdev = platform_device_register_resndata (NULL,
>> >>  "uio_pdrv",
>> >>  -1,
>> >>  myfpga_resources,
>> >>  1,
>> >>  _uio_info,
>> >>  sizeof(struct 
>> >> uio_info)
>> >> );
>> >> if (IS_ERR(myfpga_uio_pdev)) {
>> >> return PTR_ERR(myfpga_uio_pdev);
>> >> }
>> >> 
>> >> return 0;
>> >> }
>> >> 
>> >> static void __exit myfpga_exit(void)
>> >> {
>> >> platform_device_unregister(myfpga_uio_pdev);
>> >> }
>> >> 
>> >> module_init(myfpga_init);
>> >> module_exit(myfpga_exit);
>> >> 
>> >> --Userspace---
>> >> #include 
>> >> #include 
>> >> #include 
>> >> 
>> >> #include 
>> >> #include 
>> >> #include 
>> >> #include 
>> >> #include 
>> >> #include 
>> >> #include 
>> >> 
>> >> #define MYFPGA_BASE 0xd000 // 3G
>> >> #define MYFPGA_SIZE 0x0004 // 256k
>> >> #define MYFPGA_MAP_NUM  0 // First and only defined map
>> >> 
>> >> #define BIT32(n) (1 << (n))
>> >> 
>> >> /* Use mmap()'ped address "iomem", not physical MYFPGA address */
>> >> #define MYFPGA_REG(iomem) (volatile uint32_t*)(iomem + 0x8) // Third 
>> >> 32-bit reg
>> >> 
>> >> int main (int argc, char *argv[])
>> >> {
>> >> int fd;
>> >> void *iomem;
>> >> fd = open("/dev/uio0", O_RDWR|O_SYNC);
>> >> if

[PATCH 8/8] x86: Only direct map addresses that are marked as E820_RAM

2012-08-30 Thread Yinghai Lu

From: Jacob Shin 

Currently direct mappings are created for [ 0 to max_low_pfn<
---
 arch/x86/include/asm/page_types.h |   11 +
 arch/x86/kernel/setup.c           |    8 ++-
 arch/x86/mm/init.c                |   85
 arch/x86/mm/init_64.c             |    6 +--
 4 files changed, 85 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index 45aae6e..fbf5cc4 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -46,19 +46,14 @@ extern int devmem_is_allowed(unsigned long pagenr);
 extern unsigned long max_low_pfn_mapped;
 extern unsigned long max_pfn_mapped;
 
+void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn);
+bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn);
+
 static inline phys_addr_t get_max_mapped(void)
 {
return (phys_addr_t)max_pfn_mapped << PAGE_SHIFT;
 }
 
-static inline bool pfn_range_is_mapped(unsigned long start_pfn,
-   unsigned long end_pfn)
-{
-   return end_pfn <= max_low_pfn_mapped ||
-  (end_pfn > (1UL << (32 - PAGE_SHIFT)) &&
-   end_pfn <= max_pfn_mapped);
-}
-
 extern unsigned long init_memory_mapping(unsigned long start,
 unsigned long end);
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 587dcd9..2eb91b7 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -115,9 +115,11 @@
 #include 
 
 /*
- * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
- * The direct mapping extends to max_pfn_mapped, so that we can directly access
- * apertures, ACPI and other tables without having to play with fixmaps.
+ * max_low_pfn_mapped: highest direct mapped pfn under 4GB
+ * max_pfn_mapped: highest direct mapped pfn over 4GB
+ *
+ * The direct mapping only covers E820_RAM regions, so the ranges and gaps are
+ * represented by pfn_mapped
  */
 unsigned long max_low_pfn_mapped;
 unsigned long max_pfn_mapped;
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index c3e4341..9b871d0 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -246,6 +246,33 @@ static int __meminit split_mem_range(struct map_range *mr, int nr_range,
return nr_range;
 }
 
+static struct range pfn_mapped[E820_X_MAX];
+static int nr_pfn_mapped;
+
+void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
+{
+   nr_pfn_mapped = add_range_with_merge(pfn_mapped, E820_X_MAX,
+nr_pfn_mapped, start_pfn, end_pfn);
+   nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_X_MAX);
+
+   max_pfn_mapped = max(max_pfn_mapped, end_pfn);
+
+   if (end_pfn <= (1UL << (32 - PAGE_SHIFT)))
+   max_low_pfn_mapped = max(max_low_pfn_mapped, end_pfn);
+}
+
+bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn)
+{
+   int i;
+
+   for (i = 0; i < nr_pfn_mapped; i++)
+   if ((start_pfn >= pfn_mapped[i].start) &&
+   (end_pfn <= pfn_mapped[i].end))
+   return true;
+
+   return false;
+}
+
 /*
  * Setup the direct mapping of the physical memory at PAGE_OFFSET.
  * This runs before bootmem is initialized and gets pages directly from
@@ -278,9 +305,55 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 
__flush_tlb_all();
 
+   add_pfn_range_mapped(start >> PAGE_SHIFT, ret >> PAGE_SHIFT);
+
return ret >> PAGE_SHIFT;
 }
 
+/*
+ * Iterate through E820 memory map and create direct mappings for only E820_RAM
+ * regions. We cannot simply create direct mappings for all pfns from
+ * [0 to max_low_pfn) and [4GB to max_pfn) because of possible memory holes in
+ * high addresses that cannot be marked as UC by fixed/variable range MTRRs.
+ * Depending on the alignment of E820 ranges, this may possibly result in using
+ * smaller size (i.e. 4K instead of 2M or 1G) page tables.
+ */
+static void __init __init_mem_mapping(void)
+{
+unsigned long start_pfn, end_pfn;
+int i;
+
+   /* the ISA range is always mapped regardless of memory holes */
+   init_memory_mapping(0, ISA_END_ADDRESS);
+
+for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
+   u64 start = start_pfn << PAGE_SHIFT;
+   u64 end = end_pfn << PAGE_SHIFT;
+
+   if (end <= ISA_END_ADDRESS)
+   continue;
+
+   if (start < ISA_END_ADDRESS)
+   start = ISA_END_ADDRESS;
+#ifdef CONFIG_X86_32
+   /* on 32 bit, we only map up to max_low_pfn */
+   if ((start >> PAGE_SHIFT) >= max_low_pfn)
+   continue;
+
+   if ((end >> PAGE_SHIFT) > max_low_pfn)
+   end = max_low_pfn << PAGE_SHIFT;
+#endif
+   init_memory_mapping(start,

[PATCH 7/8] x86: Fixup code testing if a pfn is direct mapped

2012-08-30 Thread Yinghai Lu

From: Jacob Shin 

Update code that previously assumed pfns [ 0 - max_low_pfn_mapped ) and
[ 4GB - max_pfn_mapped ) were always direct mapped, to now look up
pfn_mapped ranges instead.


-v2: change applying sequence to keep git bisecting working.
 so add dummy pfn_range_is_mapped(). - Yinghai Lu

Signed-off-by: Jacob Shin 
---
 arch/x86/include/asm/page_types.h |    8
 arch/x86/kernel/cpu/amd.c         |    8 +++-
 arch/x86/platform/efi/efi.c       |    8
 3 files changed, 15 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index e21fdd1..45aae6e 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -51,6 +51,14 @@ static inline phys_addr_t get_max_mapped(void)
return (phys_addr_t)max_pfn_mapped << PAGE_SHIFT;
 }
 
+static inline bool pfn_range_is_mapped(unsigned long start_pfn,
+   unsigned long end_pfn)
+{
+   return end_pfn <= max_low_pfn_mapped ||
+  (end_pfn > (1UL << (32 - PAGE_SHIFT)) &&
+   end_pfn <= max_pfn_mapped);
+}
+
 extern unsigned long init_memory_mapping(unsigned long start,
 unsigned long end);
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 9d92e19..4235553 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -676,12 +676,10 @@ static void __cpuinit init_amd(struct cpuinfo_x86 *c)
 * benefit in doing so.
 */
if (!rdmsrl_safe(MSR_K8_TSEG_ADDR, &tseg)) {
+   unsigned long pfn = tseg >> PAGE_SHIFT;
+
printk(KERN_DEBUG "tseg: %010llx\n", tseg);
-   if ((tseg>>PMD_SHIFT) <
-   (max_low_pfn_mapped>>(PMD_SHIFT-PAGE_SHIFT)) ||
-   ((tseg>>PMD_SHIFT) <
-   (max_pfn_mapped>>(PMD_SHIFT-PAGE_SHIFT)) &&
-   (tseg>>PMD_SHIFT) >= (1ULL<<(32 - PMD_SHIFT))))
+   if (pfn_range_is_mapped(pfn, pfn + 1))
set_memory_4k((unsigned long)__va(tseg), 1);
}
}
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index 92660eda..f1facde 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -776,7 +776,7 @@ void __init efi_enter_virtual_mode(void)
efi_memory_desc_t *md, *prev_md = NULL;
efi_status_t status;
unsigned long size;
-   u64 end, systab, addr, npages, end_pfn;
+   u64 end, systab, addr, npages, start_pfn, end_pfn;
void *p, *va, *new_memmap = NULL;
int count = 0;
 
@@ -827,10 +827,10 @@ void __init efi_enter_virtual_mode(void)
size = md->num_pages << EFI_PAGE_SHIFT;
end = md->phys_addr + size;
 
+   start_pfn = PFN_DOWN(md->phys_addr);
end_pfn = PFN_UP(end);
-   if (end_pfn <= max_low_pfn_mapped
-   || (end_pfn > (1UL << (32 - PAGE_SHIFT))
-   && end_pfn <= max_pfn_mapped))
+
+   if (pfn_range_is_mapped(start_pfn, end_pfn))
va = __va(md->phys_addr);
else
va = efi_ioremap(md->phys_addr, size, md->type);
-- 
1.7.7
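
For readers following the series, the difference between the interim bounds check added here and the range lookup that patch 8/8 replaces it with can be sketched in userspace as below. This is only a model: the struct, the constants and the sample range table are made up for illustration, and a 4K page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12	/* assumed 4K pages */
#define SKETCH_PFN_4G (1UL << (32 - SKETCH_PAGE_SHIFT))

/* Half-open pfn range [start, end). */
struct pfn_range {
	unsigned long start;
	unsigned long end;
};

/* Interim check from this patch: only compares against the two maxima. */
static bool bounds_is_mapped(unsigned long end_pfn,
			     unsigned long max_low_pfn_mapped,
			     unsigned long max_pfn_mapped)
{
	return end_pfn <= max_low_pfn_mapped ||
	       (end_pfn > SKETCH_PFN_4G && end_pfn <= max_pfn_mapped);
}

/* Lookup in the style of patch 8/8: the range must fit inside one entry. */
static bool range_is_mapped(const struct pfn_range *mapped, size_t nr,
			    unsigned long start_pfn, unsigned long end_pfn)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (start_pfn >= mapped[i].start && end_pfn <= mapped[i].end)
			return true;

	return false;
}

/* Made-up layout with a hole between pfn 0x9f and pfn 0x100. */
static const struct pfn_range sketch_mapped[] = {
	{ 0x0,      0x9f },
	{ 0x100,    0xc0000 },
	{ 0x100000, 0x140000 },
};
```

With this table, a pfn inside the hole (0xa0) passes the bounds check but fails the range lookup, which is the behavioural change the series is after.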


[PATCH 6/8] x86: if kernel .text .data .bss are not marked as E820_RAM, complain and fix

2012-08-30 Thread Yinghai Lu

From: Jacob Shin 

There could be cases where user supplied memmap=exactmap memory
mappings do not mark the region where the kernel .text .data and
.bss reside as E820_RAM, as reported here:

https://lkml.org/lkml/2012/8/14/86

Handle it by complaining, and adding the range back into the e820.

Signed-off-by: Jacob Shin 
---
 arch/x86/kernel/setup.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index c30c78c..587dcd9 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -831,6 +831,20 @@ void __init setup_arch(char **cmdline_p)
insert_resource(_resource, _resource);
insert_resource(_resource, _resource);
 
+   /*
+* Complain if .text .data and .bss are not marked as E820_RAM and
+* attempt to fix it by adding the range. We may have a confused BIOS,
+* or the user may have incorrectly supplied it via memmap=exactmap. If
+* we really are running on top non-RAM, we will crash later anyways.
+*/
+   if (!e820_all_mapped(code_resource.start, __pa(__brk_limit), E820_RAM)) {
+   pr_warn(".text .data .bss are not marked as E820_RAM!\n");
+
+   e820_add_region(code_resource.start,
+   __pa(__brk_limit) - code_resource.start + 1,
+   E820_RAM);
+   }
+
trim_bios_range();
 #ifdef CONFIG_X86_32
if (ppro_with_ram_bug()) {
-- 
1.7.7
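
The coverage test this patch depends on can be modelled in userspace roughly as follows. This is a simplified sketch and not the kernel's e820_all_mapped(): the types and names are hypothetical, and the table is assumed to be sorted by address with no overlapping entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_type { SKETCH_RAM = 1, SKETCH_RESERVED = 2 };

struct sketch_entry {
	uint64_t addr;
	uint64_t size;
	enum sketch_type type;
};

/*
 * Return true if every byte of [start, end) is covered by entries of the
 * given type. Assumes the table is sorted by address.
 */
static bool sketch_all_mapped(const struct sketch_entry *map, int nr,
			      uint64_t start, uint64_t end,
			      enum sketch_type type)
{
	int i;

	for (i = 0; i < nr; i++) {
		uint64_t e_end = map[i].addr + map[i].size;

		if (map[i].type != type)
			continue;
		/* Skip entries that do not overlap the remaining range. */
		if (map[i].addr >= end || e_end <= start)
			continue;
		/* Entry covers the current start: advance past it. */
		if (map[i].addr <= start)
			start = e_end;
		if (start >= end)
			return true;
	}

	return false;
}

/* Made-up map where the kernel image at 16M-32M lies inside RAM. */
static const struct sketch_entry sketch_good[] = {
	{ 0x0,      0x9f000,    SKETCH_RAM },
	{ 0x100000, 0x3ff00000, SKETCH_RAM },
};

/* Made-up memmap=exactmap style table that leaves the image uncovered. */
static const struct sketch_entry sketch_bad[] = {
	{ 0x4000000, 0x1000000, SKETCH_RAM },
};
```

When the check fails, as with the second table, the patch warns and adds the missing range back as RAM.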


[PATCH 5/8] x86, mm: Find early page table only one time

2012-08-30 Thread Yinghai Lu

find_early_table_space() should not run on every call to
init_memory_mapping(); during early boot it only needs to run once.

Also move the early_memtest() call down.

Signed-off-by: Yinghai Lu 
---
 arch/x86/mm/init.c |   71 +++-
 1 files changed, 37 insertions(+), 34 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index cca9b7d..c3e4341 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -37,7 +37,7 @@ struct map_range {
 
 static int page_size_mask;
 
-static void __init find_early_table_space(struct map_range *mr,
+static void __init find_early_table_space(unsigned long begin,
  unsigned long end)
 {
unsigned long puds, pmds, ptes, tables, start = 0, good_end = end;
@@ -64,8 +64,8 @@ static void __init find_early_table_space(struct map_range *mr,
extra += PMD_SIZE;
 #endif
/* The first 2/4M doesn't use large pages. */
-   if (mr->start < PMD_SIZE)
-   extra += mr->end - mr->start;
+   if (begin < PMD_SIZE)
+   extra += (PMD_SIZE - start) >> PAGE_SHIFT;
 
ptes = (extra + PAGE_SIZE - 1) >> PAGE_SHIFT;
} else
@@ -265,15 +265,6 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
nr_range = 0;
nr_range = split_mem_range(mr, nr_range, start, end);
 
-   /*
-* Find space for the kernel direct mapping tables.
-*
-* Later we should allocate these tables in the local node of the
-* memory mapped. Unfortunately this is done currently before the
-* nodes are discovered.
-*/
-   if (!after_bootmem)
-   find_early_table_space(&mr[0], end);
 
for (i = 0; i < nr_range; i++)
ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
@@ -287,6 +278,36 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 
__flush_tlb_all();
 
+   return ret >> PAGE_SHIFT;
+}
+
+void __init init_mem_mapping(void)
+{
+   probe_page_size_mask();
+
+   /*
+* Find space for the kernel direct mapping tables.
+*
+* Later we should allocate these tables in the local node of the
+* memory mapped. Unfortunately this is done currently before the
+* nodes are discovered.
+*/
+#ifdef CONFIG_X86_64
+   find_early_table_space(0, max_pfn< max_low_pfn) {
+   max_pfn_mapped = init_memory_mapping(1UL<<32,
+max_pfn< pgt_buf_start)
+   if (pgt_buf_end > pgt_buf_start)
x86_init.mapping.pagetable_reserve(PFN_PHYS(pgt_buf_start),
PFN_PHYS(pgt_buf_end));
 
-   if (!after_bootmem)
-   early_memtest(start, end);
-
-   return ret >> PAGE_SHIFT;
-}
-
-void __init init_mem_mapping(void)
-{
-   probe_page_size_mask();
-
-   /* max_pfn_mapped is updated here */
-   max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn< max_low_pfn) {
-   max_pfn_mapped = init_memory_mapping(1UL<<32,
-max_pfn

[PATCH 4/8] x86, mm: Revert back good_end setting for 64bit

2012-08-30 Thread Yinghai Lu

This lets the page tables be placed high again on 64-bit.

Signed-off-by: Yinghai Lu 
---
 arch/x86/mm/init.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 15a6a38..cca9b7d 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -76,8 +76,8 @@ static void __init find_early_table_space(struct map_range *mr,
 #ifdef CONFIG_X86_32
/* for fixmap */
tables += roundup(__end_of_fixed_addresses * sizeof(pte_t), PAGE_SIZE);
-#endif
good_end = max_pfn_mapped << PAGE_SHIFT;
+#endif
 
base = memblock_find_in_range(start, good_end, tables, PAGE_SIZE);
if (!base)
-- 
1.7.7


[PATCH 2/8] x86, mm: Split out split_mem_range

2012-08-30 Thread Yinghai Lu

Split split_mem_range() out of init_memory_mapping() to make
init_memory_mapping() more readable.

Suggested-by: Ingo Molnar 
Signed-off-by: Yinghai Lu 
---
 arch/x86/mm/init.c |   42 ++
 1 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 838e9bc..7d05e28 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -143,25 +143,13 @@ static int __meminit save_mr(struct map_range *mr, int nr_range,
return nr_range;
 }
 
-/*
- * Setup the direct mapping of the physical memory at PAGE_OFFSET.
- * This runs before bootmem is initialized and gets pages directly from
- * the physical memory. To access them they are temporarily mapped.
- */
-unsigned long __init_refok init_memory_mapping(unsigned long start,
-  unsigned long end)
+static int __meminit split_mem_range(struct map_range *mr, int nr_range,
+unsigned long start,
+unsigned long end)
 {
unsigned long start_pfn, end_pfn;
-   unsigned long ret = 0;
unsigned long pos;
-   struct map_range mr[NR_RANGE_MR];
-   int nr_range, i;
-
-   printk(KERN_INFO "init_memory_mapping: [mem %#010lx-%#010lx]\n",
-  start, end - 1);
-
-   memset(mr, 0, sizeof(mr));
-   nr_range = 0;
+   int i;
 
/* head if not big page alignment ? */
start_pfn = start >> PAGE_SHIFT;
@@ -255,6 +243,28 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
(mr[i].page_size_mask & (1

[PATCH 1/8] x86, mm: Add global page_size_mask

2012-08-30 Thread Yinghai Lu

Detect whether 1G or 2M pages can be used and store the result in
page_size_mask.

Probe only once.

Suggested-by: Ingo Molnar 
Signed-off-by: Yinghai Lu 
---
 arch/x86/include/asm/pgtable.h |    1 +
 arch/x86/kernel/setup.c        |    1 +
 arch/x86/mm/init.c             |   66 +++-
 3 files changed, 33 insertions(+), 35 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 49afb3f..e47e4db 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -597,6 +597,7 @@ static inline int pgd_none(pgd_t pgd)
 #ifndef __ASSEMBLY__
 
 extern int direct_gbpages;
+void probe_page_size_mask(void);
 
 /* local pte updates need not use xchg for locking */
 static inline pte_t native_local_ptep_get_and_clear(pte_t *ptep)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index f4b9b80..d6e8c03 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -912,6 +912,7 @@ void __init setup_arch(char **cmdline_p)
setup_real_mode();
 
init_gbpages();
+   probe_page_size_mask();
 
/* max_pfn_mapped is updated here */
max_low_pfn_mapped = init_memory_mapping(0, max_low_pfn<> PUD_SHIFT;
tables = roundup(puds * sizeof(pud_t), PAGE_SIZE);
 
-   if (use_gbpages) {
+   if (page_size_mask & (1 << PG_LEVEL_1G)) {
unsigned long extra;
 
extra = end - ((end>>PUD_SHIFT) << PUD_SHIFT);
@@ -54,7 +56,7 @@ static void __init find_early_table_space(struct map_range *mr, unsigned long en
 
tables += roundup(pmds * sizeof(pmd_t), PAGE_SIZE);
 
-   if (use_pse) {
+   if (page_size_mask & (1 << PG_LEVEL_2M)) {
unsigned long extra;
 
extra = end - ((end>>PMD_SHIFT) << PMD_SHIFT);
@@ -90,6 +92,30 @@ static void __init find_early_table_space(struct map_range *mr, unsigned long en
(pgt_buf_top << PAGE_SHIFT) - 1);
 }
 
+void probe_page_size_mask(void)
+{
+#if !defined(CONFIG_DEBUG_PAGEALLOC) && !defined(CONFIG_KMEMCHECK)
+   /*
+* For CONFIG_DEBUG_PAGEALLOC, identity mapping will use small pages.
+* This will simplify cpa(), which otherwise needs to support splitting
+* large pages into small in interrupt context, etc.
+*/
+   if (direct_gbpages)
+   page_size_mask |= 1 << PG_LEVEL_1G;
+   if (cpu_has_pse)
+   page_size_mask |= 1 << PG_LEVEL_2M;
+#endif
+
+   /* Enable PSE if available */
+   if (cpu_has_pse)
+   set_in_cr4(X86_CR4_PSE);
+
+   /* Enable PGE if available */
+   if (cpu_has_pge) {
+   set_in_cr4(X86_CR4_PGE);
+   __supported_pte_mask |= _PAGE_GLOBAL;
+   }
+}
 void __init native_pagetable_reserve(u64 start, u64 end)
 {
memblock_reserve(start, end - start);
@@ -125,45 +151,15 @@ static int __meminit save_mr(struct map_range *mr, int nr_range,
 unsigned long __init_refok init_memory_mapping(unsigned long start,
   unsigned long end)
 {
-   unsigned long page_size_mask = 0;
unsigned long start_pfn, end_pfn;
unsigned long ret = 0;
unsigned long pos;
-
struct map_range mr[NR_RANGE_MR];
int nr_range, i;
-   int use_pse, use_gbpages;
 
printk(KERN_INFO "init_memory_mapping: [mem %#010lx-%#010lx]\n",
   start, end - 1);
 
-#if defined(CONFIG_DEBUG_PAGEALLOC) || defined(CONFIG_KMEMCHECK)
-   /*
-* For CONFIG_DEBUG_PAGEALLOC, identity mapping will use small pages.
-* This will simplify cpa(), which otherwise needs to support splitting
-* large pages into small in interrupt context, etc.
-*/
-   use_pse = use_gbpages = 0;
-#else
-   use_pse = cpu_has_pse;
-   use_gbpages = direct_gbpages;
-#endif
-
-   /* Enable PSE if available */
-   if (cpu_has_pse)
-   set_in_cr4(X86_CR4_PSE);
-
-   /* Enable PGE if available */
-   if (cpu_has_pge) {
-   set_in_cr4(X86_CR4_PGE);
-   __supported_pte_mask |= _PAGE_GLOBAL;
-   }
-
-   if (use_gbpages)
-   page_size_mask |= 1 << PG_LEVEL_1G;
-   if (use_pse)
-   page_size_mask |= 1 << PG_LEVEL_2M;
-
memset(mr, 0, sizeof(mr));
nr_range = 0;
 
@@ -267,7 +263,7 @@ unsigned long __init_refok init_memory_mapping(unsigned long start,
 * nodes are discovered.
 */
if (!after_bootmem)
-   find_early_table_space(&mr[0], end, use_pse, use_gbpages);
+   find_early_table_space(&mr[0], end);
 
for (i = 0; i < nr_range; i++)
ret = kernel_physical_mapping_init(mr[i].start, mr[i].end,
-- 
1.7.7


[PATCH 0/8] x86, mm: init_memory_mapping cleanup

2012-08-30 Thread Yinghai Lu

Only create mappings for E820_RAM and E820_RESERVED_KERN.

Also separate find_early_page_table out from init_memory_mapping.

Jacob Shin (3):
  x86: if kernel .text .data .bss are not marked as E820_RAM, complain
and fix
  x86: Fixup code testing if a pfn is direct mapped
  x86: Only direct map addresses that are marked as E820_RAM

Yinghai Lu (5):
  x86, mm: Add global page_size_mask
  x86, mm: Split out split_mem_range
  x86, mm: Moving init_memory_mapping calling
  x86, mm: Revert back good_end setting for 64bit
  x86, mm: Find early page table only one time

 arch/x86/include/asm/init.h   |1 -
 arch/x86/include/asm/page_types.h |3 +
 arch/x86/include/asm/pgtable.h|1 +
 arch/x86/kernel/cpu/amd.c |8 +-
 arch/x86/kernel/setup.c   |   34 ---
 arch/x86/mm/init.c|  225 ++---
 arch/x86/mm/init_64.c |6 +-
 arch/x86/platform/efi/efi.c   |8 +-
 8 files changed, 191 insertions(+), 95 deletions(-)

-- 
1.7.7
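
For readers following the "Add global page_size_mask" entry in this series, the mask that init_memory_mapping() has been computing locally can be sketched in plain C. This is a userspace illustration only, not code from the series: the enum is assumed to mirror the kernel's enum pg_level ordering, and build_page_size_mask() and page_size_allowed() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the ordering of the kernel's enum pg_level. */
enum pg_level {
	PG_LEVEL_NONE,
	PG_LEVEL_4K,
	PG_LEVEL_2M,
	PG_LEVEL_1G,
	PG_LEVEL_NUM
};

/*
 * Hypothetical helper: one bit per large page size that the direct
 * mapping is allowed to use.
 */
static unsigned long build_page_size_mask(bool use_pse, bool use_gbpages)
{
	unsigned long mask = 0;

	if (use_gbpages)
		mask |= 1UL << PG_LEVEL_1G;
	if (use_pse)
		mask |= 1UL << PG_LEVEL_2M;
	return mask;
}

/* True when mappings at @level are permitted by @mask. */
static bool page_size_allowed(unsigned long mask, enum pg_level level)
{
	assert(level > PG_LEVEL_NONE && level < PG_LEVEL_NUM);
	return (mask & (1UL << level)) != 0;
}
```

Computing the mask once and keeping it global would let both the table-space estimate and the actual mapping pass agree on which page sizes are in play.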


Re: [PATCH 8/8] input: ab8500-ponkey: Rely on MFD core to convert IRQs to virtual

2012-08-30 Thread Dmitry Torokhov

On Thu, Aug 30, 2012 at 04:02:21PM -0700, Dmitry Torokhov wrote:
> On Thu, Aug 30, 2012 at 02:12:04PM +0100, Lee Jones wrote:
> > > Sorry for the delay. Yes, this should be fine, but since it is
> > > essentially a revert of the original patch it should be pushed in as
> > > such.
> > 
> > How's this?
> > 
> 
> Excellent.

I assume you will be merging it with the rest of AB8500 patches, right?

-- 
Dmitry

Re: [PATCH 8/8] input: ab8500-ponkey: Rely on MFD core to convert IRQs to virtual

2012-08-30 Thread Dmitry Torokhov

On Thu, Aug 30, 2012 at 02:12:04PM +0100, Lee Jones wrote:
> > Sorry for the delay. Yes, this should be fine, but since it is
> > essentially a revert of the original patch it should be pushed in as
> > such.
> 
> How's this?
> 

Excellent.

> Author: Lee Jones 
> Date:   Thu Aug 30 14:08:19 2012 +0100
> 
> Revert "input: ab8500-ponkey: Create AB8500 domain IRQ mapping"
> 
> This reverts commit ca3b3faf9bee4dc5df4f10eae2d1e48f7de0a8ad.
> 
> There was a plan to place ab8500_irq_get_virq() calls in each AB8500
> child device prior to requesting an IRQ, but as we're no longer using
> Device Tree to collect our IRQ numbers, it's actually better to allow
> the core to do this during device registration time. So the IRQ number
> we pull from its resource has already been converted to a virtual IRQ.
> 
> CC: Dmitry Torokhov 
> CC: linux-in...@vger.kernel.org
> Acked-by: Linus Walleij 

Acked-by: Dmitry Torokhov 

> Signed-off-by: Lee Jones 
> 
> diff --git a/drivers/input/misc/ab8500-ponkey.c 
> b/drivers/input/misc/ab8500-ponkey.c
> index f06231b..84ec691 100644
> --- a/drivers/input/misc/ab8500-ponkey.c
> +++ b/drivers/input/misc/ab8500-ponkey.c
> @@ -74,8 +74,8 @@ static int __devinit ab8500_ponkey_probe(struct 
> platform_device *pdev)
>  
> ponkey->idev = input;
> ponkey->ab8500 = ab8500;
> -   ponkey->irq_dbf = ab8500_irq_get_virq(ab8500, irq_dbf);
> -   ponkey->irq_dbr = ab8500_irq_get_virq(ab8500, irq_dbr);
> +   ponkey->irq_dbf = irq_dbf;
> +   ponkey->irq_dbr = irq_dbr;
>  
> input->name = "AB8500 POn(PowerOn) Key";
> input->dev.parent = &pdev->dev;

-- 
Dmitry

Re: [PATCH v2] drivers/tty: Folding Android's keyreset driver in sysRQ

2012-08-30 Thread Dmitry Torokhov

Hi Mathieu,

On Thu, Aug 30, 2012 at 04:30:54PM -0600, mathieu.poir...@linaro.org wrote:
> From: "Mathieu J. Poirier" 
> 
> This patch adds keyreset functionality to the sysrq driver. It
> allows certain button/key combinations to be used in order to
> trigger device resets.
> 
> The first time the key-combo is detected a work function that syncs
> the filesystems is scheduled and the kernel rebooted. If all the keys
> are released and then pressed again, it calls panic. Reboot on panic
> should be set for this to work.  A platform device that specifies a
> reset key-combo should be added to the board file to trigger the
> feature.

Why do we need to involve a platform device and not use, for example, a module
parameter, that could be set up from userspace?

Also, why do we need reset_fn() and not simply invoke SysRq-B handler
that should call ctrl_alt_del() for us?

Thanks.

-- 
Dmitry

RE: [PATCH 0/8] lp8727_charger: cleanup code

2012-08-30 Thread Kim, Milo

> -Original Message-
> From: Anton Vorontsov [mailto:anton.voront...@linaro.org]
> Sent: Thursday, August 30, 2012 9:15 PM
> To: Kim, Milo
> Cc: linux-kernel@vger.kernel.org; David Woodhouse
> Subject: Re: [PATCH 0/8] lp8727_charger: cleanup code
> 
> On Thu, Aug 30, 2012 at 11:37:16AM +, Kim, Milo wrote:
> > LP8727 driver should be patched for several reasons.
> >
> > (a) Need to clean up _probe()/_remove()
> > (b) Not secure code when the platform data is NULL
> > (c) Interrupt handling
> > Two threads are running for handling one IRQ.
> > One is for the IRQ pin, the other is used for delayed processing.
> > This is unusual and can be enhanced.
> > (d) Misuse of mutex code
> > (e) Lots of definitions should be fixed
> > (f) Others..
> 
> Thanks a lot for the cleanups, this is much appreciated! The cleanups
> themselves look great, but I'd really like to see them more separated.
> 
> Thanks,
> Anton.

Sorry to bother you.
I'll resend the patch-set separately.
Thanks a lot for detailed review.

Best Regards,
Milo

Re: [PATCH] Input: Let the FT5x06 driver build without debugfs

2012-08-30 Thread Dmitry Torokhov

On Thu, Aug 30, 2012 at 03:26:21PM -0700, David Rientjes wrote:
> On Tue, 21 Aug 2012, Dmitry Torokhov wrote:
> 
> > > > > On 08/17/2012 02:15 AM, Eric W. Biederman wrote:
> > > > >> When testing to make certain my user namespace code works in
> > > > >> various configurations I tripped over the tf5x06.c not building
> > > > >> with debugfs disabled.
> > > > >
> > > > > Sorry for that.
> > > > >
> > > > > There already is a patch for this issue which I slightly prefer. You
> > > > > can find it in the mail from Guenther Roeck:
> > > > >
> > > > > http://www.mail-archive.com/linux-input@vger.kernel.org/msg00646.html
> > > > 
> > > > If you guys could get that merged into 3.6 I would appreciate it.
> > > > 
> > > Yes, that would be great. I don't recall seeing an e-mail from Dmitry 
> > > accepting
> > > it, though.
> > 
> > Applied, sorry for the delay.
> > 
> 
> This still affects Linus' tree and causes a build breakage without debugfs 
> configured.  Considering the driver went into 3.5-rc5, could you please 
> push this fix for 3.6

I just sent a pull request to Linus.

> (and mark it for stable backport)?

I am pretty sure it was merged in 3.6 merge window, not 3.5, so no need
to mark for stable.

Thanks.

-- 
Dmitry

Re: [PATCH] exportfs: add FILEID_INVALID to indicate invalid fid_type

2012-08-30 Thread Namjae Jeon

2012/8/31, J. Bruce Fields :
> On Wed, Aug 29, 2012 at 10:10:10AM -0400, Namjae Jeon wrote:
>> This commit adds FILEID_INVALID = 0xff in fid_type to
>> indicate invalid fid_type
>
> OK, applying for 3.7.
>
> Looks like this shows up in a lot of filesystems too as just "255".  Are
> you planning to patch up the filesystems afterwards?
Hi Bruce.
Yes, I will fix these in follow-up patches.
Thanks.
>
> --b.
>
>>
>> It avoids using magic number 255
>>
>> Signed-off-by: Namjae Jeon 
>> Signed-off-by: Vivek Trivedi 
>> ---
>>  fs/exportfs/expfs.c  |4 ++--
>>  fs/fhandle.c |2 +-
>>  fs/nfsd/nfsfh.c  |4 ++--
>>  include/linux/exportfs.h |5 +
>>  4 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/exportfs/expfs.c b/fs/exportfs/expfs.c
>> index 29ab099..f1f1c59 100644
>> --- a/fs/exportfs/expfs.c
>> +++ b/fs/exportfs/expfs.c
>> @@ -322,10 +322,10 @@ static int export_encode_fh(struct inode *inode,
>> struct fid *fid,
>>
>>  if (parent && (len < 4)) {
>>  *max_len = 4;
>> -return 255;
>> +return FILEID_INVALID;
>>  } else if (len < 2) {
>>  *max_len = 2;
>> -return 255;
>> +return FILEID_INVALID;
>>  }
>>
>>  len = 2;
>> diff --git a/fs/fhandle.c b/fs/fhandle.c
>> index a48e4a1..78a7879 100644
>> --- a/fs/fhandle.c
>> +++ b/fs/fhandle.c
>> @@ -52,7 +52,7 @@ static long do_sys_name_to_handle(struct path *path,
>>  handle_bytes = handle_dwords * sizeof(u32);
>>  handle->handle_bytes = handle_bytes;
>>  if ((handle->handle_bytes > f_handle.handle_bytes) ||
>> -(retval == 255) || (retval == -ENOSPC)) {
>> +(retval == FILEID_INVALID) || (retval == -ENOSPC)) {
>>  /* As per old exportfs_encode_fh documentation
>>   * we could return ENOSPC to indicate overflow
>>   * But file system returned 255 always. So handle
>> diff --git a/fs/nfsd/nfsfh.c b/fs/nfsd/nfsfh.c
>> index 032af38..814afaa 100644
>> --- a/fs/nfsd/nfsfh.c
>> +++ b/fs/nfsd/nfsfh.c
>> @@ -572,7 +572,7 @@ fh_compose(struct svc_fh *fhp, struct svc_export *exp,
>> struct dentry *dentry,
>>
>>  if (inode)
>>  _fh_update(fhp, exp, dentry);
>> -if (fhp->fh_handle.fh_fileid_type == 255) {
>> +if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID) {
>>  fh_put(fhp);
>>  return nfserr_opnotsupp;
>>  }
>> @@ -603,7 +603,7 @@ fh_update(struct svc_fh *fhp)
>>  goto out;
>>
>>  _fh_update(fhp, fhp->fh_export, dentry);
>> -if (fhp->fh_handle.fh_fileid_type == 255)
>> +if (fhp->fh_handle.fh_fileid_type == FILEID_INVALID)
>>  return nfserr_opnotsupp;
>>  }
>>  out:
>> diff --git a/include/linux/exportfs.h b/include/linux/exportfs.h
>> index 12291a7..0e14525 100644
>> --- a/include/linux/exportfs.h
>> +++ b/include/linux/exportfs.h
>> @@ -83,6 +83,11 @@ enum fid_type {
>>   * 64 bit parent inode number.
>>   */
>>  FILEID_NILFS_WITH_PARENT = 0x62,
>> +
>> +/*
>> + * Filesystems must not use 0xff file ID.
>> + */
>> +FILEID_INVALID = 0xff,
>>  };
>>
>>  struct fid {
>> --
>> 1.7.9.5
>>
>
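
As a rough illustration of the convention the quoted patch names, the following userspace sketch shows an encoder that reports "buffer too small" through FILEID_INVALID instead of a bare 255. It is modelled loosely on the quoted export_encode_fh() hunk; sketch_encode_fh() and its parameters are hypothetical, and only a subset of enum fid_type is reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of enum fid_type, reproduced for the sketch. */
enum fid_type {
	FILEID_INO32_GEN = 1,
	FILEID_INO32_GEN_PARENT = 2,
	FILEID_INVALID = 0xff,	/* filesystems must not use this as a type */
};

/*
 * Hypothetical encoder. On entry *max_len is the buffer size in 32-bit
 * words; on return it is the number of words needed or used.
 * @parent points at {ino, generation} of the parent, or is NULL.
 */
static int sketch_encode_fh(uint32_t *buf, int *max_len,
			    uint32_t ino, uint32_t gen,
			    const uint32_t *parent)
{
	int len;

	assert(buf != NULL && max_len != NULL);
	len = *max_len;

	if (parent && len < 4) {
		*max_len = 4;
		return FILEID_INVALID;
	} else if (len < 2) {
		*max_len = 2;
		return FILEID_INVALID;
	}

	buf[0] = ino;
	buf[1] = gen;
	*max_len = 2;
	if (!parent)
		return FILEID_INO32_GEN;

	buf[2] = parent[0];
	buf[3] = parent[1];
	*max_len = 4;
	return FILEID_INO32_GEN_PARENT;
}
```

A caller can then compare the return value against FILEID_INVALID and retry with the size written back to *max_len.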

[git pull] Input updates for 3.6-rc4

2012-08-30 Thread Dmitry Torokhov

Hi Linus,

Please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input.git for-linus
or
master.kernel.org:/pub/scm/linux/kernel/git/dtor/input.git for-linus

to receive updates for the input subsystem.

Changelog:
-

Dmitry Torokhov (1):
  Input: i8042 - add Gigabyte T1005 series netbooks to noloop table

Guenter Roeck (1):
  Input: edt-ft5x06 - fix build error when compiling wthout CONFIG_DEBUG_FS

Jason Gerecke (1):
  Input: wacom - add support for EMR on Cintiq 24HD touch

Michael Grzeschik (1):
  Input: imx_keypad - reset the hardware before enabling


Diffstat:


 drivers/input/keyboard/imx_keypad.c|  3 +++
 drivers/input/serio/i8042-x86ia64io.h  | 14 ++
 drivers/input/tablet/wacom_wac.c   |  6 +-
 drivers/input/touchscreen/edt-ft5x06.c |  2 +-
 4 files changed, 23 insertions(+), 2 deletions(-)

-- 
Dmitry




Latency.

2012-08-30 Thread Uwaysi Bin Kareem

I have done some research on latency. I have configured a Linux kernel to
run audio streams at a reliable 0.3 ms latency under normal working
conditions (an audio app, and maybe some small tasks in between).


This also resulted in an extremely smooth game-playing experience, like an
asm-programmed custom hardware arcade. (Why gamebox developers aren't using
this is a mystery.)


Recently I also tried to come as close as possible to that experience on
Windows, and found that Win32PrioritySeparation set to 25, all processes at
idle priority (to avoid cpu2 stalling cpu1), and minimal drivers, services,
and processes gave a similar experience. Windows, by the way, also gives
lower latency while one moves windows, which one can use/abuse in a
script/hack.


The feeling from low-latency systems brings back the exhilaration of
custom hardware and assembly programming. It gives a different feel, and I
do believe it sets a high quality expectation for software. I wonder if
that is why the Amiga is said to have so much good software, and whether it
is responsible for its reputation.


My windows-partition now runs as good as an Amiga, and I managed to make  
it run even better, reminding me of singletasking systems like Mac OS.


Games are just so much more fun with this. And the overall os is so much  
more responsive.


More optimized stuff like Wayland will of course improve things even more.

I do think that for "desktop" the focus should really be on low-latency
systems.
If "desktop" and "server" are the two different profiles you usually
configure for in Linux, how about two different standard configs? Or are
these merging as well, since I would think multi-CPU servers appreciate low
OS jitter as well?


Just some thoughts.

Peace Be With You.

[PATCH 0/3] staging: ramster: move to new zcache2 code base

2012-08-30 Thread Dan Magenheimer

Hi Greg --

gregkh> If you feel that the existing code needs to be dropped
gregkh> and replaced with a totally new version, that's fine with
gregkh> me.  It's forward progress, which is all that I ask for. 
(http://lkml.indiana.edu/hypermail/linux/kernel/1208.0/02240.html,
in reference to zcache, assuming applies to ramster as well)

Please apply for staging-next for the 3.7 window to move ramster forward.
Since AFAICT there have been no patches or contributions from others to
drivers/staging/ramster since it was merged, this totally new version
of ramster should not run afoul and the patches should apply to
3.5 or 3.6-rcN.

Thanks,
Dan

When ramster was merged into staging at 3.4, it used a "temporarily" forked
version of zcache.  Code was proposed to merge zcache and ramster into
a new common redesigned codebase which both resolves various serious design
flaws and eliminates all code duplication between zcache and ramster, with
the result to replace "zcache".  Sadly, that proposal was blocked, so the
zcache (and tmem) code in drivers/staging/zcache and the zcache (and tmem)
code in drivers/staging/ramster continue to be different.

This patchset moves ramster to the new redesigned codebase and calls that
new codebase "zcache2".  Most, if not all, of the redesign will eventually
need to be merged with "zcache1" before zcache functionality should be
promoted out of staging.

An overview of the zcache2 rewrite is provided in a git commit comment
later in this series.

A significant item of debate in the new codebase is the removal of zsmalloc.
This removal may be temporary if zsmalloc is enhanced with necessary
features to meet the needs of the new zcache codebase.  Justification
for the change can be found at http://lkml.org/lkml/2012/8/15/292
Such zsmalloc enhancements will almost certainly necessitate a major
rework, not a small patch.

While this zcache2 codebase is far from perfect (and thus remains in staging),
the foundation is now cleaner, more stable, more maintainable, and much
better commented.

Signed-off-by: Dan Magenheimer 

---
Diffstat:

 drivers/staging/Kconfig|4 +-
 drivers/staging/Makefile   |2 +-
 drivers/staging/ramster/Kconfig|   25 +-
 drivers/staging/ramster/Makefile   |7 +-
 drivers/staging/ramster/TODO   |   13 -
 drivers/staging/ramster/cluster/Makefile   |3 -
 drivers/staging/ramster/cluster/heartbeat.c|  464 ---
 drivers/staging/ramster/cluster/heartbeat.h|   87 -
 drivers/staging/ramster/cluster/masklog.c  |  155 -
 drivers/staging/ramster/cluster/masklog.h  |  220 --
 drivers/staging/ramster/cluster/nodemanager.c  |  992 --
 drivers/staging/ramster/cluster/nodemanager.h  |   88 -
 .../staging/ramster/cluster/ramster_nodemanager.h  |   39 -
 drivers/staging/ramster/cluster/tcp.c  | 2256 -
 drivers/staging/ramster/cluster/tcp.h  |  159 -
 drivers/staging/ramster/cluster/tcp_internal.h |  248 --
 drivers/staging/ramster/r2net.c|  401 ---
 drivers/staging/ramster/ramster.h  |  113 +-
 drivers/staging/ramster/ramster/heartbeat.c|  462 +++
 drivers/staging/ramster/ramster/heartbeat.h|   87 +
 drivers/staging/ramster/ramster/masklog.c  |  155 +
 drivers/staging/ramster/ramster/masklog.h  |  220 ++
 drivers/staging/ramster/ramster/nodemanager.c  |  995 ++
 drivers/staging/ramster/ramster/nodemanager.h  |   88 +
 drivers/staging/ramster/ramster/r2net.c|  414 +++
 drivers/staging/ramster/ramster/ramster.c  |  985 ++
 drivers/staging/ramster/ramster/ramster.h  |  161 +
 .../staging/ramster/ramster/ramster_nodemanager.h  |   39 +
 drivers/staging/ramster/ramster/tcp.c  | 2253 +
 drivers/staging/ramster/ramster/tcp.h  |  159 +
 drivers/staging/ramster/ramster/tcp_internal.h |  248 ++
 drivers/staging/ramster/tmem.c |  313 +-
 drivers/staging/ramster/tmem.h |  109 +-
 drivers/staging/ramster/xvmalloc.c |  509 ---
 drivers/staging/ramster/xvmalloc.h |   30 -
 drivers/staging/ramster/xvmalloc_int.h |   95 -
 drivers/staging/ramster/zbud.c | 1060 ++
 drivers/staging/ramster/zbud.h |   33 +
 drivers/staging/ramster/zcache-main.c  | 3532 ++--
 drivers/staging/ramster/zcache.h   |   55 +-
 40 files changed, 8711 insertions(+), 8567 deletions(-)

Re: [PATCH 0/5] rbtree based interval tree as a prio_tree replacement

2012-08-30 Thread Michel Lespinasse

On Thu, Aug 30, 2012 at 2:34 PM, Andrew Morton
 wrote:
> On Tue,  7 Aug 2012 00:25:38 -0700
> Michel Lespinasse  wrote:
>
>> This patchset goes over the rbtree changes that have been already integrated
>> into Andrew's -mm tree, as well as the augmented rbtree proposal which is
>> currently pending.
>
> hm.  Well I grabbed these for a bit of testing.
>
> It's a large change in MM and it depends on code which hasn't yet been
> merged in mainline.  It's probably prudent to do all this in two steps
> - we'll see.

Makes sense to me. If we want to split the series as they get sent
upstream, I would suggest sending all the rbtree and augmented rbtree
infrastructure first, and then the rbtree usages (prio tree
replacement, anon rmap interval tree which I'm going to send next, and
rik's augmented rbtree based vma gap finder) in the next kernel.

> The templates-with-CPP thing is not terribly appealing.  It's not
> obvious that it really needed to be done this way - we've avoided it in
> plenty of other places.  It would be nice to see that alternatives have
> been thoroughly explored, and why they were rejected.

I am actually wondering if the interval_tree_tmpl.h include file
shouldn't be done as one large preprocessor #define instead. The
ITSTRUCT, ITRB, etc... definitions would then become arguments to that
large definition. It would also be possible to break up that #define
into smaller ones - most likely, one for insertion, one for removal,
and one for the subtree_search / iter_first / iter_next functions. Do
you think this might help ?

I don't really see other workable alternatives that don't involve code
replication.
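
To make the single-#define idea above concrete, here is a small userspace sketch. It is not the interval tree code: it generates a linear overlap search over an array rather than an rbtree walk, and DEFINE_INTERVAL_SEARCH, struct mapping, and the accessor macros are all made up for illustration. The only point is that the struct type and its start/last accessors become macro arguments instead of #defines set before including a template header.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical generator: expands to a function NAME() returning the
 * first element of @v whose closed interval [START(e), LAST(e)]
 * overlaps [start, last], or NULL if there is none.
 */
#define DEFINE_INTERVAL_SEARCH(NAME, TYPE, START, LAST)			\
static TYPE *NAME(TYPE *v, size_t n, unsigned long start,		\
		  unsigned long last)					\
{									\
	size_t i;							\
									\
	assert(start <= last);						\
	for (i = 0; i < n; i++)						\
		if (START(&v[i]) <= last && start <= LAST(&v[i]))	\
			return &v[i];					\
	return NULL;							\
}

/* Example user type: a range of page offsets. */
struct mapping {
	unsigned long pgoff;
	unsigned long pages;
};

#define MAPPING_START(m) ((m)->pgoff)
#define MAPPING_LAST(m)  ((m)->pgoff + (m)->pages - 1)

/* One instantiation; a second user would pass its own type and accessors. */
DEFINE_INTERVAL_SEARCH(mapping_search, struct mapping,
		       MAPPING_START, MAPPING_LAST)
```

The trade-off is the usual one for this style: each user gets fully typed, inlinable code, at the cost of a body that is harder to read and to step through in a debugger than an ordinary function.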

> The code uses the lame-and-useless "inline" absolutely all over the
> place.  I do think that for new code it would be better to get down and
> actually make proper engineering decisions about which functions should
> be inlined and mark them __always_inline.

You mentioned this before, but I'm not convinced that __always_inline
would be better. The kernel is full of 2-line functions that we really
want inlined, and I don't see what the value would be in converting
these all to __always_inline. I am tempted to stick with the current
usage, which I understand as being:

- use inline when the programmer believes a function should be inlined
(this includes the static inline functions in header files - if the
programmer didn't believe this should be inlined, he wouldn't put the
function in a header file)

- use __always_inline if the function absolutely needs to be inlined
for correct operation (I believe scheduler has some of these, which
need to be included in the parent in order to end up in the correct
section), OR in rare cases if the compiler is known to generate bad
code with a mere inline and the programmer wants to force the issue.

I would also note that replacing inline with __always_inline is not a
no-op change, even when the compiler was already inlining the original
(marked inline) function. Sometimes the generated code ends up being
different with __always_inline and I would rather not apply these
changes blindly.

> Hillf has made a review suggestion which AFAICT remains unresponded to.

To be honest, I wasn't quite sure what he was suggesting ?

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

[PATCH v2] drivers/tty: Folding Android's keyreset driver in sysRQ

2012-08-30 Thread mathieu . poirier

From: "Mathieu J. Poirier" 

This patch adds keyreset functionality to the sysrq driver. It
allows certain button/key combinations to be used in order to
trigger device resets.

The first time the key-combo is detected a work function that syncs
the filesystems is scheduled and the kernel rebooted. If all the keys
are released and then pressed again, it calls panic. Reboot on panic
should be set for this to work.  A platform device that specifies a
reset key-combo should be added to the board file to trigger the
feature.

This functionality comes from the keyreset driver submitted by
Arve Hjønnevåg in the Android kernel.

Cc: a...@android.com
Cc: kernel-t...@android.com
Cc: dmitry.torok...@gmail.com
Cc: john.stu...@linaro.org
Signed-off-by: Mathieu Poirier 
---
 drivers/tty/sysrq.c   |  161 +
 include/linux/sysrq.h |8 +++
 2 files changed, 169 insertions(+), 0 deletions(-)

diff --git a/drivers/tty/sysrq.c b/drivers/tty/sysrq.c
index 05728894..f210853 100644
--- a/drivers/tty/sysrq.c
+++ b/drivers/tty/sysrq.c
@@ -41,6 +41,9 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #include 
 #include 
@@ -49,6 +52,11 @@
 static int __read_mostly sysrq_enabled = SYSRQ_DEFAULT_ENABLE;
 static bool __read_mostly sysrq_always_enabled;
 
+static struct input_handler sysrq_handler;
+
+/* Keep track of what has been called */
+static atomic_t restart_requested;
+
 static bool sysrq_on(void)
 {
return sysrq_enabled || sysrq_always_enabled;
@@ -570,6 +578,15 @@ struct sysrq_state {
struct input_handle handle;
struct work_struct reinject_work;
unsigned long key_down[BITS_TO_LONGS(KEY_CNT)];
+   unsigned long keybit[BITS_TO_LONGS(KEY_CNT)];
+   unsigned long upbit[BITS_TO_LONGS(KEY_CNT)];
+   unsigned long key[BITS_TO_LONGS(KEY_CNT)];
+   int (*reset_fn)(void);
+   int key_down_target;
+   int key_down_ctn;
+   int key_up_ctn;
+   int keyreset_data;
+   int restart_disabled;
unsigned int alt;
unsigned int alt_use;
bool active;
@@ -603,6 +620,93 @@ static void sysrq_reinject_alt_sysrq(struct work_struct 
*work)
}
 }
 
+
+static int sysrq_probe(struct platform_device *pdev)
+{
+   struct keyreset_platform_data *pdata = pdev->dev.platform_data;
+
+   /*
+* No sequence of keys to trigger on,
+* assuming default sysRQ behavior.
+*/
+   if (pdata) {
+   atomic_set(&restart_requested, 0);
+   sysrq_handler.private = pdata;
+   } else
+   sysrq_handler.private = NULL;
+
+   /* FETCH DT INFO HERE */
+
+   return 0;
+
+}
+
+static void deferred_restart(struct work_struct *dummy)
+{
+   atomic_inc(&restart_requested);
+   sys_sync();
+   atomic_inc(&restart_requested);
+   kernel_restart(NULL);
+}
+static DECLARE_WORK(restart_work, deferred_restart);
+
+static int do_keyreset_event(struct sysrq_state *state,
+unsigned int code, int value)
+{
+   int ret;
+   int processed = 0;
+
+   /* Is the code of interest to us? */
+   if (!test_bit(code, state->keybit))
+   return processed;
+
+   /* No need to take care of key up events */
+   if (!test_bit(code, state->key) == !value)
+   return processed;
+
+   /* Record new entry */
+   __change_bit(code, state->key);
+
+   processed = 1;
+
+   if (test_bit(code, state->upbit)) {
+   if (value) {
+   state->restart_disabled = 1;
+   state->key_up_ctn++;
+   } else
+   state->key_up_ctn--;
+   } else {
+   if (value)
+   state->key_down_ctn++;
+   else
+   state->key_down_ctn--;
+   }
+
+   if (state->key_down_ctn == 0 && state->key_up_ctn == 0)
+   state->restart_disabled = 0;
+
+   if (value && !state->restart_disabled &&
+   state->key_down_ctn == state->key_down_target) {
+   state->restart_disabled = 1;
+   if (atomic_read(&restart_requested))
+   panic("keyboard reset failed, %d - panic\n",
+atomic_read(&restart_requested));
+   if (state->reset_fn) {
+   ret = state->reset_fn();
+   atomic_set(&restart_requested, ret);
+   } else {
+   pr_info("keyboard reset\n");
+   schedule_work(&restart_work);
+   atomic_inc(&restart_requested);
+   }
+   }
+
+   /* no need to suppress keyreset characters */
+   state->active = false;
+
+   return processed;
+}
+
 static bool sysrq_filter(struct input_handle *handle,
 unsigned int type, unsigned int code, int value)
 {
@@ -669,6 +773,11 @@ static bool sysrq_filter(struct input_handle *handle,
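
The counting logic in do_keyreset_event() above can be tried outside the kernel with a small model. The sketch below is an assumption-laden simplification: bitmaps become bool arrays, SKETCH_KEY_CNT stands in for KEY_CNT, the function returns an action instead of scheduling a restart, and the "second detection panics" path is left out. All names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_KEY_CNT 16	/* small stand-in for KEY_CNT */

enum keyreset_action { KR_NONE, KR_RESET };

struct keyreset_state {
	bool keybit[SKETCH_KEY_CNT];	/* keys we track at all */
	bool upbit[SKETCH_KEY_CNT];	/* tracked keys that must stay released */
	bool key[SKETCH_KEY_CNT];	/* current state of tracked keys */
	int key_down_target;		/* number of keys in the combo */
	int key_down_ctn;
	int key_up_ctn;
	bool restart_disabled;
};

/* Register @code as one of the keys that must be held for a reset. */
static void keyreset_add_down_key(struct keyreset_state *s, unsigned int code)
{
	assert(s != NULL && code < SKETCH_KEY_CNT);
	s->keybit[code] = true;
	s->key_down_target++;
}

static enum keyreset_action keyreset_event(struct keyreset_state *s,
					   unsigned int code, bool pressed)
{
	assert(s != NULL);
	if (code >= SKETCH_KEY_CNT || !s->keybit[code])
		return KR_NONE;		/* not a key we track */
	if (s->key[code] == pressed)
		return KR_NONE;		/* no state change, e.g. autorepeat */
	s->key[code] = pressed;

	if (s->upbit[code]) {
		if (pressed) {
			s->restart_disabled = true;
			s->key_up_ctn++;
		} else {
			s->key_up_ctn--;
		}
	} else {
		s->key_down_ctn += pressed ? 1 : -1;
	}

	/* Everything released: arm the combo again. */
	if (s->key_down_ctn == 0 && s->key_up_ctn == 0)
		s->restart_disabled = false;

	if (pressed && !s->restart_disabled &&
	    s->key_down_ctn == s->key_down_target) {
		s->restart_disabled = true;
		return KR_RESET;
	}
	return KR_NONE;
}
```

In this model a reset fires once per full press of the combo, and it cannot fire again until every tracked key has been released.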

Re: [PATCH] slub: consider pfmemalloc_match() in get_partial_node()

2012-08-30 Thread David Rientjes

On Sat, 25 Aug 2012, Joonsoo Kim wrote:

> There is no consideration for pfmemalloc_match() in get_partial(). If we don't
> consider that, we can't restrict access to PFMEMALLOC page mostly.
> 
> We may encounter following scenario.
> 
> Assume there is a request from normal allocation
> and there is no objects in per cpu cache and no node partial slab.
> 
> In this case, slab_alloc go into slow-path and
> new_slab_objects() is invoked. It may return PFMEMALLOC page.
> Current user is not allowed to access PFMEMALLOC page,
> deactivate_slab() is called (commit 5091b74a95d447e34530e713a8971450a45498b3),
> then return object from PFMEMALLOC page.
> 
> Next time, when we meet another request from normal allocation,
> slab_alloc() go into slow-path and re-go new_slab_objects().
> In new_slab_objects(), we invoke get_partial() and we get a partial slab
> which we have been deactivated just before, that is, PFMEMALLOC page.
> We extract one object from it and re-deactivate.
> 
> "deactivate -> re-get in get_partial -> re-deactivate" occurs repeatedly.
> 
> As a result, we can't restrict access to PFMEMALLOC page and
> moreover, it introduces significant performance degradation to normal
> allocation because of frequent deactivation.
> 
> Now, we need to consider pfmemalloc_match() in get_partial_node()
> It prevent "deactivate -> re-get in get_partial".
> Instead, new_slab() is called. It may return !PFMEMALLOC page,
> so above situation will be suspended sometime.
> 
> Signed-off-by: Joonsoo Kim 
> Cc: David Miller 
> Cc: Neil Brown 
> Cc: Peter Zijlstra 
> Cc: Mike Christie 
> Cc: Eric B Munson 
> Cc: Eric Dumazet 
> Cc: Sebastian Andrzej Siewior 
> Cc: Mel Gorman 
> Cc: Christoph Lameter 
> Cc: Andrew Morton 

Acked-by: David Rientjes 
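
The behaviour the changelog asks for can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the slub code: the partial list is a plain array, struct sketch_slab and both functions are hypothetical, and "caller_may_use_reserves" stands in for whatever the allocation flags permit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_slab {
	bool pfmemalloc;	/* page came from the emergency reserves */
	int free_objects;
};

/* A reserve-backed slab may only serve callers entitled to the reserves. */
static bool sketch_pfmemalloc_match(const struct sketch_slab *slab,
				    bool caller_may_use_reserves)
{
	return !slab->pfmemalloc || caller_may_use_reserves;
}

/*
 * Hypothetical stand-in for the partial-list walk: return the first
 * usable slab, skipping reserve-backed slabs the caller may not touch.
 * NULL means the caller should fall back to allocating a new slab.
 */
static struct sketch_slab *sketch_get_partial(struct sketch_slab *partial,
					      size_t n,
					      bool caller_may_use_reserves)
{
	size_t i;

	assert(partial != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (!sketch_pfmemalloc_match(&partial[i],
					     caller_may_use_reserves))
			continue;
		if (partial[i].free_objects > 0)
			return &partial[i];
	}
	return NULL;
}
```

Skipping the mismatching slab during the walk is what breaks the "deactivate, re-get, re-deactivate" loop described in the changelog: a normal allocation never picks the reserve-backed slab up again in the first place.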

Re: [PATCH] Input: Let the FT5x06 driver build without debugfs

2012-08-30 Thread David Rientjes

On Tue, 21 Aug 2012, Dmitry Torokhov wrote:

> > > > On 08/17/2012 02:15 AM, Eric W. Biederman wrote:
> > > >> When testing to make certain my user namespace code works in
> > > >> various configurations I tripped over the tf5x06.c not building
> > > >> with debugfs disabled.
> > > >
> > > > Sorry for that.
> > > >
> > > > There already is a patch for this issue which I slightly prefer. You
> > > > can find it in the mail from Guenther Roeck:
> > > >http://www.mail-archive.com/linux-input@vger.kernel.org/msg00646.html
> > > 
> > > If you guys could get that merged into 3.6 I would appreciate it.
> > > 
> > Yes, that would be great. I don't recall seeing an e-mail from Dmitry 
> > accepting
> > it, though.
> 
> Applied, sorry for the delay.
> 

This still affects Linus' tree and causes a build breakage without debugfs 
configured.  Considering the driver went into 3.5-rc5, could you please 
push this fix for 3.6 (and mark it for stable backport)?

Re: Using uio_pdrv to create an platform device for an FPGA, mmap() fails

2012-08-30 Thread Hans J. Koch

[Added Greg Kroah-Hartman to Cc:]

On Thu, Aug 30, 2012 at 08:10:11PM +, Worth, Kevin wrote:
> >> Thanks for the reply, Hans. Your question about opening /dev/uio0 O_RDWR
> >> prompted me to check out how I was creating /dev/uio0 ... my system
> >> isn't using udev, and I was accidentally creating it with major/minor
> >> number 254/0 instead of the correct 253/0 (found by looking at
> >> /proc/devices). Fixed that and the mmap() call started working.
> >
> >Good.
> >
> >> 
> >> Verified that if /dev/uio0 has permissions 0644, root can open it O_RDWR
> >> and mmap PROT_READ | PROT_WRITE using the below code and write to an
> >> address within my memory map. Of course this contradicts the statement
> >> "/dev/uioX is a read-only file" in the UIO howto.
> >
> >You're right. That wants to be fixed...
> >
> >> 
> >> Including my updated, tested code for completeness.
> >> Note I also cleaned up the device registration a little by
> >> using a different platform_device_register_ call and removing fields
> >> in the struct uio_info that get filled in by uio_pdrv automatically.
> >
> >If you want to have that included in the mainline, please choose a more
> >descriptive name than "myfpga" and send a proper patch.
> 
> I wasn't sure about submitting as a patch since it's for a custom FPGA
> that I don't expect the community will be using,

That doesn't matter. If it helps YOU that the code is maintained in mainline,
post it.

>  but the code seems like
> possibly useful sample/example code.

That is another good argument.

> Perhaps patching the HOWTO like
> http://www.kernel.org/doc/htmldocs/uio-howto.html#uio_pci_generic_example
> is the right approach?

Oh, if you could hack up a patch for the documentation, that would be great.
But please make it a second patch, don't mix it with your driver code.

Thanks,
Hans
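
For anyone reading the quoted userspace program, the "special meaning" of the mmap() offset on UIO devices is that it selects a mapping index rather than a byte position: mapping N is requested with an offset of N pages. A minimal sketch of that calculation follows; uio_map_offset() is a hypothetical helper name, not part of any UIO library.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Return the offset to pass to mmap() on a /dev/uioX file descriptor
 * in order to select mapping number @map_num.
 */
static off_t uio_map_offset(unsigned int map_num)
{
	long page_size = sysconf(_SC_PAGESIZE);

	assert(page_size > 0);
	return (off_t)map_num * page_size;
}
```

With a single mapping, as in the quoted code, the offset is simply 0; the helper only starts to matter once a device exposes more than one memory region.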

> 
> >
> >Thanks,
> >Hans
> >
> >> 
> >> -Kevin
> >> 
> >> # lsuio -m -v
> >> uio0: name=uio_myfpga, version=0.1, events=0
> >> map[0]: addr=0xD000, size=262144, mmap test: OK
> >> Device attributes:
> >> uevent=DRIVER=uio_pdrv
> >> modalias=platform:uio_pdrv
> >> 
> >> --Kernelspace--
> >> #include 
> >> #include 
> >> #include 
> >> 
> >> #define MYFPGA_BASE 0xd0000000 // 3G
> >> #define MYFPGA_SIZE 0x00040000 // 256k
> >> 
> >> static struct resource myfpga_resources[] = {
> >> {
> >> .start = MYFPGA_BASE,
> >> .end   = MYFPGA_BASE + MYFPGA_SIZE - 1,
> >> .name  = "myfpga",
> >> .flags = IORESOURCE_MEM
> >> }
> >> };
> >> 
> >> static struct uio_info myfpga_uio_info = {
> >>.name = "uio_myfpga",
> >>.version = "0.1",
> >> };
> >> 
> >> static struct platform_device *myfpga_uio_pdev;
> >> 
> >> static int __init myfpga_init(void)
> >> {
> >> myfpga_uio_pdev = platform_device_register_resndata (NULL,
> >>  "uio_pdrv",
> >>  -1,
> >>  myfpga_resources,
> >>  1,
> >>  &myfpga_uio_info,
> >>  sizeof(struct uio_info)
> >> );
> >> if (IS_ERR(myfpga_uio_pdev)) {
> >> return PTR_ERR(myfpga_uio_pdev);
> >> }
> >> 
> >> return 0;
> >> }
> >> 
> >> static void __exit myfpga_exit(void)
> >> {
> >> platform_device_unregister(myfpga_uio_pdev);
> >> }
> >> 
> >> module_init(myfpga_init);
> >> module_exit(myfpga_exit);
> >> 
> >> --Userspace---
> >> #include 
> >> #include 
> >> #include 
> >> 
> >> #include 
> >> #include 
> >> #include 
> >> #include 
> >> #include 
> >> #include 
> >> #include 
> >> 
> >> #define MYFPGA_BASE 0xd0000000 // 3G
> >> #define MYFPGA_SIZE 0x00040000 // 256k
> >> #define MYFPGA_MAP_NUM  0 // First and only defined map
> >> 
> >> #define BIT32(n) (1 << (n))
> >> 
> >> /* Use mmap()'ped address "iomem", not physical MYFPGA address */
> >> #define MYFPGA_REG(iomem) (volatile uint32_t*)(iomem + 0x8) // Third 32-bit reg
> >> 
> >> int main (int argc, char *argv[])
> >> {
> >> int fd;
> >> void *iomem;
> >> fd = open("/dev/uio0", O_RDWR|O_SYNC);
> >> if (fd < 0) {
> >> printf("failed to open /dev/uio0, quitting\n");
> >> return -1;
> >> }
> >> /* Note offset has a special meaning with uio devices */
> >> iomem = mmap(NULL, MYFPGA_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
> >>  MYFPGA_MAP_NUM * getpagesize());
> >> if (iomem == MAP_FAILED) {
> >> printf("mmap failed, quitting\n");
> >> close(fd);
> >> return -2;
> >> }
> >> 
> >> /* Set bit 5 of MYFPGA_REG register */
> >> *MYFPGA_REG(iomem) |= BIT32(5);
> >> 
> >> munmap(iomem, MYFPGA_SIZE);
> >>
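
The "special meaning" of the mmap() offset mentioned in the userspace code above is that UIO selects mapping N by an offset of N times the page size. A minimal sketch of that convention follows; uio_map_offset() and uio_mmap() are hypothetical helper names used only for illustration, not part of any library.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: UIO selects mapping N via offset N * page size. */
static off_t uio_map_offset(unsigned int map_num, long page_size)
{
    return (off_t)map_num * (off_t)page_size;
}

/*
 * Hypothetical helper: open a UIO node read-write and map one of its
 * regions. Returns MAP_FAILED on any error. The fd is closed before
 * returning because an established mapping stays valid after close().
 */
static void *uio_mmap(const char *path, unsigned int map_num, size_t size)
{
    int fd = open(path, O_RDWR | O_SYNC);
    void *p;

    if (fd < 0)
        return MAP_FAILED;
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
             uio_map_offset(map_num, sysconf(_SC_PAGESIZE)));
    close(fd);
    return p;
}
```

A caller would compare the result against MAP_FAILED, as the code above does, and munmap() the region with the same size when done.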

RE: [PATCH] of: add devres version of of_iomap

2012-08-30 Thread Karicheri, Muralidharan

>> -Original Message-
>> From: Rob Herring [mailto:robherri...@gmail.com]
>> Sent: Thursday, August 30, 2012 2:27 PM
>> To: Karicheri, Muralidharan
>> Cc: grant.lik...@secretlab.ca; devicetree-disc...@lists.ozlabs.org; linux-
>> ker...@vger.kernel.org
>> Subject: Re: [PATCH] of: add devres version of of_iomap
>> 
>> On 08/30/2012 10:32 AM, Murali Karicheri wrote:
>> > This adds devres version of the of_iomap() to allow resource to be cleaned
>> > through devres.
>> 
>> If you have a struct device, then don't you already have a resource and
>> can just use devm_ioremap in a driver? New drivers should not be using
>> of_iomap.
>> 

That is the point. If you grep under drivers/, there are many drivers using a
pattern like this. This helper function is meant to replace that code.

From dma/sirf-dma.c

ret = of_address_to_resource(dn, 0, &res);
if (ret) {
dev_err(dev, "Error parsing memory region!\n");
   goto error;
}

regs_start = res.start;
regs_size = resource_size(&res);

base = devm_ioremap(dev, regs_start, regs_size);
if (!base) {
dev_err(dev, "Error mapping memory region!\n");
   goto error;
}

Other instances.

edac/mpc85xx_edac.c
media/video/fsl-viu.c
mtd/nand/mpc5121_nfc.c

Some of this code uses devm_request_mem_region() as well. Isn't it a good idea
to add this helper so that new drivers can call it in place of this sequence?
I could update the patch to do this call as well.
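
For readers unfamiliar with why a devres variant helps here: a managed resource is tied to the device, so its release runs automatically when the driver detaches, and the error paths need no explicit unwinding. The following is a userspace model of that idea only, assuming hypothetical names (model_devm_add() and friends); it is not the kernel devres API.

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace model only: hypothetical names, not the kernel devres API. */
struct model_devres {
    void (*release)(void *data);
    void *data;
    struct model_devres *next;
};

struct model_device {
    struct model_devres *resources;   /* most recently added first */
};

/* Attach a release action to the device; returns 0 on success, -1 on OOM. */
static int model_devm_add(struct model_device *dev,
                          void (*release)(void *), void *data)
{
    struct model_devres *dr = malloc(sizeof(*dr));

    if (!dr)
        return -1;
    dr->release = release;
    dr->data = data;
    dr->next = dev->resources;
    dev->resources = dr;
    return 0;
}

/* Run every release action, newest first, as on driver detach. */
static void model_devm_release_all(struct model_device *dev)
{
    while (dev->resources) {
        struct model_devres *dr = dev->resources;

        dev->resources = dr->next;
        dr->release(dr->data);
        free(dr);
    }
}

/* Illustration helper: records the order in which releases ran. */
static int model_release_log[4];
static int model_release_count;

static void model_log_release(void *data)
{
    model_release_log[model_release_count++] = *(int *)data;
}
```

In this model a driver registers each mapping once and never unmaps by hand; teardown releases everything in reverse order of acquisition.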

>> Rob
>> 
>> >
>> > Signed-off-by: Murali Karicheri 
>> > ---
>> >  drivers/of/address.c   |   26 --
>> >  include/linux/of_address.h |2 ++
>> >  2 files changed, 26 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/drivers/of/address.c b/drivers/of/address.c
>> > index 7e262a6..d3da426 100644
>> > --- a/drivers/of/address.c
>> > +++ b/drivers/of/address.c
>> > @@ -602,10 +602,9 @@ struct device_node *of_find_matching_node_by_address(struct device_node *from,
>> >return NULL;
>> >  }
>> >
>> > -
>> >  /**
>> >   * of_iomap - Maps the memory mapped IO for a given device_node
>> > - * @device:   the device whose io range will be mapped
>> > + * @device_node: Ptr to the device node that has the reg property
>> >   * @index:index of the io range
>> >   *
>> >   * Returns a pointer to the mapped memory
>> > @@ -620,3 +619,26 @@ void __iomem *of_iomap(struct device_node *np, int index)
>> >return ioremap(res.start, resource_size(&res));
>> >  }
>> >  EXPORT_SYMBOL(of_iomap);
>> > +
>> > +/**
>> > + * of_devm_iomap - devres version of of_iomap
>> > + * @device:   the device whose io range will be mapped
>> > + * @index:index of the io range
>> > + *
>> > + * Returns a pointer to the mapped memory
>> > + */
>> > +void __iomem *of_devm_iomap(struct device *dev, int index)
>> > +{
>> > +  struct device_node *np;
>> > +  struct resource res;
>> > +
>> > +  if (!dev)
>> > +  return NULL;
>> > +
>> > +  np = dev->of_node;
>> > +  if (of_address_to_resource(np, index, &res))
>> > +  return NULL;
>> > +
>> > +  return devm_ioremap(dev, res.start, resource_size(&res));
>> > +}
>> > +EXPORT_SYMBOL(of_devm_iomap);
>> > diff --git a/include/linux/of_address.h b/include/linux/of_address.h
>> > index 01b925a..67efa5f 100644
>> > --- a/include/linux/of_address.h
>> > +++ b/include/linux/of_address.h
>> > @@ -3,6 +3,7 @@
>> >  #include 
>> >  #include 
>> >  #include 
>> > +#include 
>> >
>> >  #ifdef CONFIG_OF_ADDRESS
>> >  extern u64 of_translate_address(struct device_node *np, const __be32 *addr);
>> > @@ -13,6 +14,7 @@ extern struct device_node *of_find_matching_node_by_address(
>> >const struct of_device_id *matches,
>> >u64 base_address);
>> >  extern void __iomem *of_iomap(struct device_node *device, int index);
>> > +extern void __iomem *of_devm_iomap(struct device *dev, int index);
>> >
>> >  /* Extract an address from a device, returns the region size and
>> >   * the address space flags too. The PCI version uses a BAR number
>> >
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH v7 9/9] block: Avoid deadlocks with bio allocation by stacking drivers

2012-08-30 Thread Vivek Goyal

On Wed, Aug 29, 2012 at 10:13:45AM -0700, Kent Overstreet wrote:

[..]
> > Performance aside, punting submission to per device worker in case of deep
> > stack usage sounds cleaner solution to me.
> 
> Agreed, but performance tends to matter in the real world. And either
> way the tricky bits are going to be confined to a few functions, so I
> don't think it matters that much.
> 
> If someone wants to code up the workqueue version and test it, they're
> more than welcome...

Here is one quick and dirty proof of concept patch. It checks for stack
depth and if remaining space is less than 20% of stack size, then it
defers the bio submission to per queue worker.
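
The threshold test described above can be sketched as plain arithmetic. This is an illustration under assumptions, not code from the patch: should_defer_submission() is a hypothetical name, and the caller is assumed to already know the stack size and the bytes in use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the patch: decide whether a
 * submission should be punted to a worker, given the total stack size
 * and the bytes already in use. Defers when less than 20% remains.
 */
static bool should_defer_submission(size_t stack_size, size_t stack_used)
{
    size_t remaining;

    if (stack_used >= stack_size)
        return true;            /* already at or past the limit */
    remaining = stack_size - stack_used;
    /* remaining < 20% of stack_size, written without division */
    return remaining * 5 < stack_size;
}
```

With an 8192-byte stack, 6000 bytes in use leaves enough room to submit directly, while 7000 bytes in use would defer.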

Thanks
Vivek


---
 block/blk-core.c  |  171 ++
 block/blk-sysfs.c |1 
 include/linux/blk_types.h |1 
 include/linux/blkdev.h|8 ++
 4 files changed, 138 insertions(+), 43 deletions(-)

Index: linux-2.6/include/linux/blkdev.h
===
--- linux-2.6.orig/include/linux/blkdev.h   2012-09-01 17:44:51.686485550 -0400
+++ linux-2.6/include/linux/blkdev.h2012-09-01 18:09:58.805577658 -0400
@@ -430,6 +430,14 @@ struct request_queue {
/* Throttle data */
struct throtl_data *td;
 #endif
+
+   /*
+* Bio submission to queue can be deferred to a workqueue if stack
+* usage of submitter is high.
+*/
+   struct bio_list deferred_bios;
+   struct work_struct  deferred_bio_work;
+   struct workqueue_struct *deferred_bio_workqueue;
 };
 
 #define QUEUE_FLAG_QUEUED  1   /* uses generic tag queueing */
Index: linux-2.6/block/blk-core.c
===
--- linux-2.6.orig/block/blk-core.c 2012-09-01 17:44:51.686485550 -0400
+++ linux-2.6/block/blk-core.c  2012-09-02 00:34:55.204091269 -0400
@@ -211,6 +211,23 @@ static void blk_delay_work(struct work_s
spin_unlock_irq(q->queue_lock);
 }
 
+static void blk_deferred_bio_work(struct work_struct *work)
+{
+   struct request_queue *q;
+   struct bio *bio = NULL;
+
+   q = container_of(work, struct request_queue, deferred_bio_work);
+
+   do {
+   spin_lock_irq(q->queue_lock);
+   bio = bio_list_pop(&q->deferred_bios);
+   spin_unlock_irq(q->queue_lock);
+   if (!bio)
+   break;
+   generic_make_request(bio);
+   } while (1);
+}
+
 /**
  * blk_delay_queue - restart queueing after defined interval
  * @q: The &struct request_queue in question
@@ -289,6 +306,7 @@ void blk_sync_queue(struct request_queue
 {
del_timer_sync(&q->timeout);
cancel_delayed_work_sync(&q->delay_work);
+   cancel_work_sync(&q->deferred_bio_work);
 }
 EXPORT_SYMBOL(blk_sync_queue);
 
@@ -351,6 +369,29 @@ void blk_put_queue(struct request_queue 
 EXPORT_SYMBOL(blk_put_queue);
 
 /**
+ * blk_drain_deferred_bios - drain deferred bios
+ * @q: request_queue to drain deferred bios for
+ *
+ * Dispatch all currently deferred bios on @q through ->make_request_fn().
+ */
+static void blk_drain_deferred_bios(struct request_queue *q)
+{
+   struct bio_list bl;
+   struct bio *bio;
+   unsigned long flags;
+
+   bio_list_init(&bl);
+
+   spin_lock_irqsave(q->queue_lock, flags);
+   bio_list_merge(&bl, &q->deferred_bios);
+   bio_list_init(&q->deferred_bios);
+   spin_unlock_irqrestore(q->queue_lock, flags);
+
+   while ((bio = bio_list_pop(&bl)))
+   generic_make_request(bio);
+}
+
+/**
  * blk_drain_queue - drain requests from request_queue
  * @q: queue to drain
  * @drain_all: whether to drain all requests or only the ones w/ ELVPRIV
@@ -358,6 +399,10 @@ EXPORT_SYMBOL(blk_put_queue);
  * Drain requests from @q.  If @drain_all is set, all requests are drained.
  * If not, only ELVPRIV requests are drained.  The caller is responsible
  * for ensuring that no new requests which need to be drained are queued.
+ *
+ * Note: It does not drain bios on q->deferred_bios list.
+ * Call blk_drain_deferred_bios() if need be.
+ *
  */
 void blk_drain_queue(struct request_queue *q, bool drain_all)
 {
@@ -505,6 +550,9 @@ void blk_cleanup_queue(struct request_qu
spin_unlock_irq(lock);
mutex_unlock(&q->sysfs_lock);
 
+   /* First drain all deferred bios. */
+   blk_drain_deferred_bios(q);
+
/* drain all requests queued before DEAD marking */
blk_drain_queue(q, true);
 
@@ -614,11 +662,19 @@ struct request_queue *blk_alloc_queue_no
q->bypass_depth = 1;
__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
 
-   if (blkcg_init_queue(q))
+   bio_list_init(&q->deferred_bios);
+   INIT_WORK(&q->deferred_bio_work, blk_deferred_bio_work);
+   q->deferred_bio_workqueue = alloc_workqueue("kdeferbiod", WQ_MEM_RECLAIM, 0);
+   if (!q->deferred_bio_workqueue)
goto fail_id;
 
+   if

[PATCH] ftdi_sio: PID for NZR SEM 16+ USB

2012-08-30 Thread Horst Schirmeier

This adds the USB PID for the NZR SEM 16+ USB energy monitor device
.  It works perfectly with the GPL software on
.

Signed-off-by: Horst Schirmeier 

---
 drivers/usb/serial/ftdi_sio.c |1 +
 drivers/usb/serial/ftdi_sio_ids.h |3 +++
 2 files changed, 4 insertions(+)
diff --git a/drivers/usb/serial/ftdi_sio.c b/drivers/usb/serial/ftdi_sio.c
index 5620db6..5cae2a1 100644
--- a/drivers/usb/serial/ftdi_sio.c
+++ b/drivers/usb/serial/ftdi_sio.c
@@ -704,6 +704,7 @@ static struct usb_device_id id_table_combined [] = {
{ USB_DEVICE(FTDI_VID, FTDI_PCDJ_DAC2_PID) },
{ USB_DEVICE(FTDI_VID, FTDI_RRCIRKITS_LOCOBUFFER_PID) },
{ USB_DEVICE(FTDI_VID, FTDI_ASK_RDR400_PID) },
+   { USB_DEVICE(FTDI_VID, FTDI_NZR_SEM_USB_PID) },
{ USB_DEVICE(ICOM_VID, ICOM_ID_1_PID) },
{ USB_DEVICE(ICOM_VID, ICOM_OPC_U_UC_PID) },
{ USB_DEVICE(ICOM_VID, ICOM_ID_RP2C1_PID) },
diff --git a/drivers/usb/serial/ftdi_sio_ids.h b/drivers/usb/serial/ftdi_sio_ids.h
index 5dd96ca..117f42b 100644
--- a/drivers/usb/serial/ftdi_sio_ids.h
+++ b/drivers/usb/serial/ftdi_sio_ids.h
@@ -75,6 +75,9 @@
 #define FTDI_OPENDCC_GATEWAY_PID   0xBFDB
 #define FTDI_OPENDCC_GBM_PID   0xBFDC
 
+/* NZR SEM 16+ USB (http://www.nzr.de) */
+#define FTDI_NZR_SEM_USB_PID   0xC1E0  /* NZR SEM-LOG16+ */
+
 /*
  * RR-CirKits LocoBuffer USB (http://www.rr-cirkits.com)
  */

-- 
PGP-Key 0xD40E0E7A



Re: [PATCH 2/2] [RESEND] add discard support to nbd

2012-08-30 Thread Andrew Morton

On Wed, 29 Aug 2012 08:41:02 -0400
paul.cleme...@steeleye.com wrote:

> Description: This patch adds discard support to nbd. When the nbd client
> system receives a discard request, this will be passed along to the nbd
> server system, where the nbd-server will respond by performing:
>   fallocate(.. FALLOC_FL_PUNCH_HOLE ..)
> 
> To punch a hole in the backend storage, which is no longer needed.
> 

What happens if the user is running an older server?

Is it possible that because the old server didn't set
NBD_FLAG_SEND_TRIM, the user's screen gets filled with WARN_ONs?

Anyway, please make sure this combination was tested!
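
The older-server case asked about above comes down to a flag test before forwarding a discard. A minimal sketch, assuming the flag values shown (the value of NBD_FLAG_SEND_TRIM in particular is an assumption here and should be checked against include/linux/nbd.h); nbd_may_send_trim() is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>

/* Values assumed for this sketch; check include/linux/nbd.h in your tree. */
#define NBD_FLAG_HAS_FLAGS  (1u << 0)
#define NBD_FLAG_READ_ONLY  (1u << 1)
#define NBD_FLAG_SEND_TRIM  (1u << 5)

/*
 * Hypothetical helper: only forward a discard when the server advertised
 * trim support. An older server that never sets the flag yields false, so
 * the client can reject the request quietly rather than warn.
 */
static bool nbd_may_send_trim(unsigned int flags)
{
    return (flags & NBD_FLAG_SEND_TRIM) != 0;
}
```

Testing the old-server combination then means exercising the path where this returns false and confirming that nothing noisy is logged.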



Continuous warn slow path warning messages from top of the kernel tree

2012-08-30 Thread Sam Gandhi

I am using the latest git repo (3.6-rc3) to boot my ARM mx28 based board.

When I boot this board via nfs I see continuous messages shown below.

I looked at ip_auto_config() in net/ipv4/ipconfig.c and I don't see an
explicit printk() call, although there are DBG() and pr_info()/pr_err() calls,
but I thought those functions were slow-path safe?

I was wondering if this is some new regression that anybody else has seen?

-Sam

---
[4.18]  device=eth0, addr=192.168.137.1,
mask=255.255.255.0, gw=192.168.137.254
[4.18] Dumping ftrace buffer:
[4.18](ftrace buffer empty)
[4.18] [ cut here ]
[4.18] WARNING: at /src/git/kernel/kernel/rcutiny.c:135
rcu_idle_exit_common+0x88/0xa0()
[4.18] Current pid: 1 comm: swapper / Idle pid: 0 comm: swapper
[4.18] Modules linked in:
[4.18] [] (unwind_backtrace+0x0/0xf0) from
[] (warn_slowpath_common+0x4c/0x64)
[4.18] [] (warn_slowpath_common+0x4c/0x64) from
[] (warn_slowpath_fmt+0x30/0x40)
[4.18] [] (warn_slowpath_fmt+0x30/0x40) from
[] (rcu_idle_exit_common+0x88/0xa0)
[4.18] [] (rcu_idle_exit_common+0x88/0xa0) from
[] (rcu_irq_enter+0x40/0x7c)
[4.18] [] (rcu_irq_enter+0x40/0x7c) from
[] (irq_enter+0x8/0x64)
[4.18] [] (irq_enter+0x8/0x64) from []
(handle_IRQ+0x18/0x84)
[4.18] [] (handle_IRQ+0x18/0x84) from []
(__irq_svc+0x34/0x58)
[4.18] [] (__irq_svc+0x34/0x58) from []
(vprintk_emit+0x134/0x504)
[4.18] [] (vprintk_emit+0x134/0x504) from
[] (printk+0x34/0x44)
[4.18] [] (printk+0x34/0x44) from []
(ip_auto_config+0xda0/0xf4c)
[4.18] [] (ip_auto_config+0xda0/0xf4c) from
[] (do_one_initcall+0x30/0x16c)
[4.18] [] (do_one_initcall+0x30/0x16c) from
[] (kernel_init+0xf0/0x1b8)
[4.18] [] (kernel_init+0xf0/0x1b8) from []
(kernel_thread_exit+0x0/0x8)
[4.18] ---[ end trace 837047ef750231b9 ]---
[4.34] [ cut here ]
[4.34] WARNING: at /src/git/kernel/kernel/rcutiny.c:75
rcu_idle_enter_common.clone.6+0x9c/0x)
[4.34] Current pid: 1 comm: swapper / Idle pid: 0 comm: swapper
[4.34] Modules linked in:
[4.34] [] (unwind_backtrace+0x0/0xf0) from
[] (warn_slowpath_common+0x4c/0x64)
[4.34] [] (warn_slowpath_common+0x4c/0x64) from
[] (warn_slowpath_fmt+0x30/0x40)
[4.34] [] (warn_slowpath_fmt+0x30/0x40) from
[] (rcu_idle_enter_common.clone.6+0x)
[4.34] [] (rcu_idle_enter_common.clone.6+0x9c/0xb8)
from [] (rcu_irq_exit+0x3c/0x)
[4.34] [] (rcu_irq_exit+0x3c/0x78) from []
(handle_IRQ+0x34/0x84)
[4.34] [] (handle_IRQ+0x34/0x84) from []
(__irq_svc+0x34/0x58)
[4.34] [] (__irq_svc+0x34/0x58) from []
(vprintk_emit+0x134/0x504)
[4.34] [] (vprintk_emit+0x134/0x504) from
[] (printk+0x34/0x44)
[4.34] [] (printk+0x34/0x44) from []
(ip_auto_config+0xda0/0xf4c)
[4.34] [] (ip_auto_config+0xda0/0xf4c) from
[] (do_one_initcall+0x30/0x16c)
[4.34] [] (do_one_initcall+0x30/0x16c) from
[] (kernel_init+0xf0/0x1b8)
[4.34] [] (kernel_init+0xf0/0x1b8) from []
(kernel_thread_exit+0x0/0x8)
[4.34] ---[ end trace 837047ef750231ba ]---
[4.47]  host=192.168.137.1, domain=, nis-domain=(none)
[4.48]  bootserver=255.255.255.255,
rootserver=192.168.137.254, rootpath=
[4.49] [ cut here ]

Re: [PATCH 1/2] [RESEND] add discard support to nbd

2012-08-30 Thread Andrew Morton


> Subject: [PATCH 1/2] [RESEND] add discard support to nbd

Please don't send multiple patches with the same title.  And please
prefix the patch titles with text which identifies the affected
subsystem.  Documentation/SubmittingPatches goes into details.


On Wed, 29 Aug 2012 08:40:45 -0400
paul.cleme...@steeleye.com wrote:

> Description: This patch adds a set-flags ioctl, allowing various option
> flags to be set on an nbd device.

That tells us what it does, but omits the all-important "why it does it".  
What's the requirement here?  What value does this change add to NBD users?

What does the interface do and how are users to use it?

Are the nbd ioctls documented anywhere?

> Signed-off-by: Paul Clements 
> ---
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index d07c9f7..c544bb4 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -78,6 +78,8 @@ static const char *ioctl_cmd_to_ascii(int cmd)
>   case NBD_SET_SOCK: return "set-sock";
>   case NBD_SET_BLKSIZE: return "set-blksize";
>   case NBD_SET_SIZE: return "set-size";
> + case NBD_SET_TIMEOUT: return "set-timeout";

That was an unchangelogged bugfix.

> + case NBD_SET_FLAGS: return "set-flags";
>   case NBD_DO_IT: return "do-it";
>   case NBD_CLEAR_SOCK: return "clear-sock";
>   case NBD_CLEAR_QUE: return "clear-que";
> @@ -460,7 +462,7 @@ static void nbd_handle_req(struct nbd_device *nbd, struct request *req)
>   nbd_cmd(req) = NBD_CMD_READ;
>   if (rq_data_dir(req) == WRITE) {
>   nbd_cmd(req) = NBD_CMD_WRITE;
> - if (nbd->flags & NBD_READ_ONLY) {
> + if (nbd->flags & NBD_FLAG_READ_ONLY) {
>   dev_err(disk_to_dev(nbd->disk),
>   "Write on read-only\n");
>   goto error_out;
> @@ -642,6 +644,10 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,
>   nbd->xmit_timeout = arg * HZ;
>   return 0;
>  
> + case NBD_SET_FLAGS:
> + nbd->flags = arg;
> + return 0;
> +
>   case NBD_SET_SIZE_BLOCKS:
>   nbd->bytesize = ((u64) arg) * nbd->blksize;
>   bdev->bd_inode->i_size = nbd->bytesize;
> diff --git a/include/linux/nbd.h b/include/linux/nbd.h
> index d146ca1..bb349be 100644
> --- a/include/linux/nbd.h
> +++ b/include/linux/nbd.h
> @@ -27,6 +27,7 @@
>  #define NBD_SET_SIZE_BLOCKS  _IO( 0xab, 7 )
>  #define NBD_DISCONNECT  _IO( 0xab, 8 )
>  #define NBD_SET_TIMEOUT _IO( 0xab, 9 )
> +#define NBD_SET_FLAGS   _IO( 0xab, 10)
>  
>  enum {
>   NBD_CMD_READ = 0,
> @@ -34,6 +35,10 @@ enum {
>   NBD_CMD_DISC = 2
>  };
>  
> +/* values for flags field */
> +#define NBD_FLAG_HAS_FLAGS   (1 << 0)
> +#define NBD_FLAG_READ_ONLY   (1 << 1)

These could be individually documented right here, at their definition
site.


>  #define nbd_cmd(req) ((req)->cmd[0])
>  
>  /* userspace doesn't need the nbd_device structure */
> @@ -42,10 +47,6 @@ enum {
>  #include 
>  #include 
>  
> -/* values for flags field */
> -#define NBD_READ_ONLY 0x0001
> -#define NBD_WRITE_NOCHK 0x0002
> -
>  struct request;
>  
>  struct nbd_device {
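
As an illustration of the two review points above (documenting each flag at its definition, and what the new ioctl does), here is a small userspace model. The flag values are those in the quoted patch; the comments on them are this sketch's wording, and struct model_nbd and the model_nbd_*() functions are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as in the quoted patch; the comments are this sketch's wording. */
#define NBD_FLAG_HAS_FLAGS  (1 << 0)    /* server sent a valid flags word */
#define NBD_FLAG_READ_ONLY  (1 << 1)    /* export must not be written */

/* Hypothetical model of the device state touched by NBD_SET_FLAGS. */
struct model_nbd {
    unsigned int flags;
};

/* Mirrors the ioctl case in the patch: the argument replaces the flags. */
static void model_nbd_set_flags(struct model_nbd *nbd, unsigned long arg)
{
    nbd->flags = (unsigned int)arg;
}

/* Mirrors the check in nbd_handle_req(): writes fail on a read-only device. */
static bool model_nbd_write_allowed(const struct model_nbd *nbd)
{
    return !(nbd->flags & NBD_FLAG_READ_ONLY);
}
```

Note that the ioctl overwrites the whole flags word rather than OR-ing bits in, so a caller has to pass the complete set each time.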

[PATCH tip/core/rcu 02/15] rcu: Pull TINY_RCU dyntick-idle tracing into non-idle region

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

Because TINY_RCU's idle detection keys directly off of the nesting
level, rather than from a separate variable as in TREE_RCU, the
TINY_RCU dyntick-idle tracing on transition to idle must happen
before the change to the nesting level.  This commit therefore makes
this change by passing the desired new value (rather than the old value)
of the nesting level in to rcu_idle_enter_common().

[ paulmck: Add fix for wrong-variable bug spotted by
  Michael Wang . ]

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutiny.c |   31 ---
 1 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index 547b1fe..e4163c5 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -56,24 +56,27 @@ static void __call_rcu(struct rcu_head *head,
 static long long rcu_dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
 
 /* Common code for rcu_idle_enter() and rcu_irq_exit(), see kernel/rcutree.c. */
-static void rcu_idle_enter_common(long long oldval)
+static void rcu_idle_enter_common(long long newval)
 {
-   if (rcu_dynticks_nesting) {
+   if (newval) {
RCU_TRACE(trace_rcu_dyntick("--=",
-   oldval, rcu_dynticks_nesting));
+   rcu_dynticks_nesting, newval));
+   rcu_dynticks_nesting = newval;
return;
}
-   RCU_TRACE(trace_rcu_dyntick("Start", oldval, rcu_dynticks_nesting));
+   RCU_TRACE(trace_rcu_dyntick("Start", rcu_dynticks_nesting, newval));
if (!is_idle_task(current)) {
struct task_struct *idle = idle_task(smp_processor_id());
 
RCU_TRACE(trace_rcu_dyntick("Error on entry: not idle task",
-   oldval, rcu_dynticks_nesting));
+   rcu_dynticks_nesting, newval));
ftrace_dump(DUMP_ALL);
WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
  current->pid, current->comm,
  idle->pid, idle->comm); /* must be idle task! */
}
+   barrier();
+   rcu_dynticks_nesting = newval;
rcu_sched_qs(0); /* implies rcu_bh_qsctr_inc(0) */
 }
 
@@ -84,17 +87,16 @@ static void rcu_idle_enter_common(long long oldval)
 void rcu_idle_enter(void)
 {
unsigned long flags;
-   long long oldval;
+   long long newval;
 
local_irq_save(flags);
-   oldval = rcu_dynticks_nesting;
WARN_ON_ONCE((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
if ((rcu_dynticks_nesting & DYNTICK_TASK_NEST_MASK) ==
DYNTICK_TASK_NEST_VALUE)
-   rcu_dynticks_nesting = 0;
+   newval = 0;
else
-   rcu_dynticks_nesting  -= DYNTICK_TASK_NEST_VALUE;
-   rcu_idle_enter_common(oldval);
+   newval = rcu_dynticks_nesting - DYNTICK_TASK_NEST_VALUE;
+   rcu_idle_enter_common(newval);
local_irq_restore(flags);
 }
 EXPORT_SYMBOL_GPL(rcu_idle_enter);
@@ -105,13 +107,12 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
 void rcu_irq_exit(void)
 {
unsigned long flags;
-   long long oldval;
+   long long newval;
 
local_irq_save(flags);
-   oldval = rcu_dynticks_nesting;
-   rcu_dynticks_nesting--;
-   WARN_ON_ONCE(rcu_dynticks_nesting < 0);
-   rcu_idle_enter_common(oldval);
+   newval = rcu_dynticks_nesting - 1;
+   WARN_ON_ONCE(newval < 0);
+   rcu_idle_enter_common(newval);
local_irq_restore(flags);
 }
 
-- 
1.7.8
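
The ordering this patch establishes (trace while the old nesting level is still in place, then publish the new one) can be shown with a small userspace model. All names below are hypothetical stand-ins, not the rcutiny.c code.

```c
#include <assert.h>

/* Userspace model; names are hypothetical stand-ins for the rcutiny.c code. */
static long long model_nesting = 1;

/* What the last trace call observed. */
static long long model_traced_old, model_traced_new, model_nesting_at_trace;

static void model_trace(long long oldval, long long newval)
{
    model_traced_old = oldval;
    model_traced_new = newval;
    model_nesting_at_trace = model_nesting;  /* still the old, non-idle value */
}

/* Trace first, then publish the new nesting level. */
static void model_idle_enter_common(long long newval)
{
    model_trace(model_nesting, newval);
    model_nesting = newval;
}
```

Because the callee receives the desired new value, the caller no longer modifies the nesting level itself, which is what lets the tracing run in the non-idle region.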


[PATCH tip/core/rcu 08/15] rcu: Apply for_each_rcu_flavor() to increment_cpu_stall_ticks()

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The increment_cpu_stall_ticks() function listed each RCU flavor
explicitly, with an ifdef to handle preemptible RCU.  This commit
therefore applies for_each_rcu_flavor() to save a line of code.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 3ea60c9..139a803 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -2196,11 +2196,10 @@ static void zero_cpu_stall_ticks(struct rcu_data *rdp)
 /* Increment ->ticks_this_gp for all flavors of RCU. */
 static void increment_cpu_stall_ticks(void)
 {
-   __get_cpu_var(rcu_sched_data).ticks_this_gp++;
-   __get_cpu_var(rcu_bh_data).ticks_this_gp++;
-#ifdef CONFIG_TREE_PREEMPT_RCU
-   __get_cpu_var(rcu_preempt_data).ticks_this_gp++;
-#endif /* #ifdef CONFIG_TREE_PREEMPT_RCU */
+   struct rcu_state *rsp;
+
+   for_each_rcu_flavor(rsp)
+   __this_cpu_ptr(rsp->rda)->ticks_this_gp++;
 }
 
 #else /* #ifdef CONFIG_RCU_CPU_STALL_INFO */
-- 
1.7.8
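
The shape of this change, replacing a hand-written list of flavors with an iterator over whatever flavors are registered, can be sketched in userspace as follows. The struct and macro names are hypothetical and only model the idea.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model; the struct and macro names are hypothetical. */
struct model_flavor {
    const char *name;
    unsigned long ticks_this_gp;
    struct model_flavor *next;
};

#define model_for_each_flavor(f, head) \
    for ((f) = (head); (f) != NULL; (f) = (f)->next)

/* Bump the tick counter for every registered flavor. */
static void model_increment_stall_ticks(struct model_flavor *head)
{
    struct model_flavor *f;

    model_for_each_flavor(f, head)
        f->ticks_this_gp++;
}
```

A flavor that is compiled out simply never appears on the list, so no ifdef is needed at the point of use.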


[PATCH tip/core/rcu 10/15] rcu: Protect rcu_node accesses during CPU stall warnings

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The print_other_cpu_stall() function accesses a number of rcu_node
fields without protection from the ->lock.  In theory, this is not
a problem because the fields accessed are all integers, but in
practice the compiler can get nasty.  Therefore, the commit extends
the existing critical section to cover the entire loop body.

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 9f44749..fbe43b0 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -746,14 +746,16 @@ static void print_other_cpu_stall(struct rcu_state *rsp)
rcu_for_each_leaf_node(rsp, rnp) {
raw_spin_lock_irqsave(&rnp->lock, flags);
ndetected += rcu_print_task_stall(rnp);
-   raw_spin_unlock_irqrestore(&rnp->lock, flags);
-   if (rnp->qsmask == 0)
+   if (rnp->qsmask == 0) {
+   raw_spin_unlock_irqrestore(&rnp->lock, flags);
continue;
+   }
for (cpu = 0; cpu <= rnp->grphi - rnp->grplo; cpu++)
if (rnp->qsmask & (1UL << cpu)) {
print_cpu_stall_info(rsp, rnp->grplo + cpu);
ndetected++;
}
+   raw_spin_unlock_irqrestore(&rnp->lock, flags);
}
 
/*
-- 
1.7.8
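
The pattern the patch adopts is to hold the lock across the whole loop body and to release it on the early-continue path as well as at the end. A userspace model of that pattern follows, using a toy lock that asserts balanced use; every name in it is hypothetical.

```c
#include <assert.h>

/* Toy lock for illustration: asserts on double acquire or stray release. */
struct model_lock {
    int held;
};

static void model_lock_acquire(struct model_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void model_lock_release(struct model_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Userspace model of a leaf node; names are hypothetical. */
struct model_node {
    struct model_lock lock;
    unsigned long qsmask;   /* one bit per CPU still being waited on */
    int ncpus;
};

/* Count stalled CPUs, reading qsmask only while the node lock is held. */
static int model_count_stalled(struct model_node *nodes, int nnodes)
{
    int ndetected = 0;

    for (int i = 0; i < nnodes; i++) {
        struct model_node *n = &nodes[i];

        model_lock_acquire(&n->lock);
        if (n->qsmask == 0) {
            model_lock_release(&n->lock);   /* release on the early path too */
            continue;
        }
        for (int cpu = 0; cpu < n->ncpus; cpu++)
            if (n->qsmask & (1UL << cpu))
                ndetected++;
        model_lock_release(&n->lock);
    }
    return ndetected;
}
```

The cost of the wider critical section is small here for the same reason given in the commit message: this path runs only when a stall is being reported.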


[PATCH tip/core/rcu 07/15] rcu: Fix obsolete rcu_initiate_boost() header comment

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

Commit 1217ed1b (rcu: permit rcu_read_unlock() to be called while holding
runqueue locks) made rcu_initiate_boost() restore irq state when releasing
the rcu_node structure's ->lock, but failed to update the header comment
accordingly.  This commit therefore brings the header comment up to date.

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index c930a47..3ea60c9 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -1193,9 +1193,9 @@ static int rcu_boost_kthread(void *arg)
  * kthread to start boosting them.  If there is an expedited grace
  * period in progress, it is always time to boost.
  *
- * The caller must hold rnp->lock, which this function releases,
- * but irqs remain disabled.  The ->boost_kthread_task is immortal,
- * so we don't need to worry about it going away.
+ * The caller must hold rnp->lock, which this function releases.
+ * The ->boost_kthread_task is immortal, so we don't need to worry
+ * about it going away.
  */
 static void rcu_initiate_boost(struct rcu_node *rnp, unsigned long flags)
 {
-- 
1.7.8


Re: [PATCH tip/core/rcu 0/5] Documentation and rcutorture changes

2012-08-30 Thread Paul E. McKenney

On Thu, Aug 30, 2012 at 11:56:09AM -0700, Josh Triplett wrote:
> On Thu, Aug 30, 2012 at 11:44:48AM -0700, Paul E. McKenney wrote:
> > Hello!
> > 
> > This series covers changes to rcutorture and documentation updates.
> > The individual patches in this series are as follows:
> > 
> > 1.  Update rcutorture default values so that casual rcutorture
> > users will do more aggressive testing.
> > 2.  Make rcutorture track CPU-hotplug latency statistics.
> > 3.  Document SRCU's new-found ability to be used by offline and
> > idle CPUs, and also emphasize SRCU's limitations.
> > 4.  Use the new pr_*() interfaces in rcutorture.
> > 5.  Prevent kthread-initialization races in rcutorture.
> > 
> > Thanx, Paul
> > 
> > 
> > 
> >  b/Documentation/RCU/checklist.txt |6 +
> >  b/Documentation/RCU/whatisRCU.txt |9 +-
> >  b/kernel/rcutorture.c |4 -
> >  kernel/rcutorture.c   |  152 +++---
> >  4 files changed, 108 insertions(+), 63 deletions(-)
> 
> Something seems wrong with this diffstat; how'd the b/ prefixes get
> there, and why does it list kernel/rcutorture.c twice, once with and
> once without?

Hmmm...  It seems quite reproducible.  I did the usual git-format-patch
and ran the resulting set of patches through diffstat.  I seem to have a
broken diffstat...

However, git diff --stat v3.6-rc1..hotplug.2012.08.28a generates the
following:

 kernel/rcutree.c   |   93 +++-
 kernel/rcutree.h   |3 --
 kernel/rcutree_trace.c |4 +-
 kernel/sched/core.c|   41 ++---
 4 files changed, 43 insertions(+), 98 deletions(-)

Which does look much better.

Thanx, Paul


[PATCH tip/core/rcu 14/15] time: RCU permitted to stop idle entry via softirq

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The can_stop_idle_tick() function complains if a softirq vector is
raised too late in the idle-entry process, presumably in order to
prevent dangling softirq invocations from being delayed across the
full idle period, which might be indefinitely long -- and if softirq
was asserted any later than the call to this function, such a delay
might well happen.

However, RCU needs to be able to use softirq to stop idle entry in
order to be able to drain RCU callbacks from the current CPU, which in
turn enables faster entry into dyntick-idle mode, which in turn reduces
power consumption.  Because RCU takes this action at a well-defined
point in the idle-entry path, it is safe for RCU to take this approach.

This commit therefore silences the error message that is sometimes
produced when the going-idle CPU suddenly finds that it has an RCU_SOFTIRQ
to process.  The error message will continue to be issued for other
softirq vectors.

Reported-by: Sedat Dilek 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
Tested-by: Sedat Dilek 
---
 include/linux/interrupt.h |2 ++
 kernel/time/tick-sched.c  |3 ++-
 2 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index c5f856a..5e4e617 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -430,6 +430,8 @@ enum
NR_SOFTIRQS
 };
 
+#define SOFTIRQ_STOP_IDLE_MASK (~(1 << RCU_SOFTIRQ))
+
 /* map softirq index to softirq name. update 'softirq_to_name' in
  * kernel/softirq.c when adding a new softirq.
  */
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 024540f..4b1785a 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -436,7 +436,8 @@ static bool can_stop_idle_tick(int cpu, struct tick_sched *ts)
if (unlikely(local_softirq_pending() && cpu_online(cpu))) {
static int ratelimit;
 
-   if (ratelimit < 10) {
+   if (ratelimit < 10 &&
+   (local_softirq_pending() & SOFTIRQ_STOP_IDLE_MASK)) {
printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
   (unsigned int) local_softirq_pending());
ratelimit++;
-- 
1.7.8

[PATCH tip/core/rcu 09/15] rcu: Avoid rcu_print_detail_task_stall_rnp() segfault

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The rcu_print_detail_task_stall_rnp() function invokes
rcu_preempt_blocked_readers_cgp() to verify that there are some preempted
RCU readers blocking the current grace period outside of the protection
of the rcu_node structure's ->lock.  This means that the last blocked
reader might exit its RCU read-side critical section and remove itself
from the ->blkd_tasks list before the ->lock is acquired, resulting in
a segmentation fault when the subsequent code attempts to dereference
the now-NULL gp_tasks pointer.

This commit therefore moves the test under the lock.  This will not
have a measurable effect on lock contention because this code is invoked
only when printing RCU CPU stall warnings, in other words, in the common
case, never.
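
The pattern of testing under the lock can be sketched in user-space C
with a pthread mutex.  This is only an analogy: struct sketch_node and
read_first_blocked() are hypothetical names, and a mutex stands in for
the rcu_node structure's raw spinlock.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the rcu_node structure. */
struct sketch_node {
	pthread_mutex_t lock;
	int *gp_tasks;	/* becomes NULL once the last blocked reader leaves */
};

/* Return the value behind gp_tasks, or -1 if no reader is blocked. */
static int read_first_blocked(struct sketch_node *np)
{
	int val;

	pthread_mutex_lock(&np->lock);
	if (np->gp_tasks == NULL) {	/* the test is made under the lock */
		pthread_mutex_unlock(&np->lock);
		return -1;
	}
	val = *np->gp_tasks;	/* an updater holding the lock cannot race here */
	pthread_mutex_unlock(&np->lock);
	return val;
}
```

Had the NULL test been made before taking the lock, an updater could
clear gp_tasks between the test and the dereference.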

Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 139a803..c02dc1d 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -422,9 +422,11 @@ static void rcu_print_detail_task_stall_rnp(struct 
rcu_node *rnp)
unsigned long flags;
struct task_struct *t;
 
-   if (!rcu_preempt_blocked_readers_cgp(rnp))
-   return;
raw_spin_lock_irqsave(&rnp->lock, flags);
+   if (!rcu_preempt_blocked_readers_cgp(rnp)) {
+   raw_spin_unlock_irqrestore(&rnp->lock, flags);
+   return;
+   }
t = list_entry(rnp->gp_tasks,
   struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry)
-- 
1.7.8

[PATCH tip/core/rcu 13/15] rcu: Move TINY_PREEMPT_RCU away from raw_local_irq_save()

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The use of raw_local_irq_save() is unnecessary, given that local_irq_save()
really does disable interrupts.  Also, it appears to interfere with lockdep.
Therefore, this commit moves to local_irq_save().

Reported-by: Fengguang Wu 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
Tested-by: Fengguang Wu 
---
 kernel/rcutiny_plugin.h |   10 +-
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/rcutiny_plugin.h b/kernel/rcutiny_plugin.h
index 918fd1e..3d01902 100644
--- a/kernel/rcutiny_plugin.h
+++ b/kernel/rcutiny_plugin.h
@@ -278,7 +278,7 @@ static int rcu_boost(void)
rcu_preempt_ctrlblk.exp_tasks == NULL)
return 0;  /* Nothing to boost. */
 
-   raw_local_irq_save(flags);
+   local_irq_save(flags);
 
/*
 * Recheck with irqs disabled: all tasks in need of boosting
@@ -287,7 +287,7 @@ static int rcu_boost(void)
 */
if (rcu_preempt_ctrlblk.boost_tasks == NULL &&
rcu_preempt_ctrlblk.exp_tasks == NULL) {
-   raw_local_irq_restore(flags);
+   local_irq_restore(flags);
return 0;
}
 
@@ -317,7 +317,7 @@ static int rcu_boost(void)
t = container_of(tb, struct task_struct, rcu_node_entry);
rt_mutex_init_proxy_locked(&mtx, t);
t->rcu_boost_mutex = &mtx;
-   raw_local_irq_restore(flags);
+   local_irq_restore(flags);
rt_mutex_lock(&mtx);
rt_mutex_unlock(&mtx);  /* Keep lockdep happy. */
 
@@ -991,9 +991,9 @@ static void rcu_trace_sub_qlen(struct rcu_ctrlblk *rcp, int 
n)
 {
unsigned long flags;
 
-   raw_local_irq_save(flags);
+   local_irq_save(flags);
rcp->qlen -= n;
-   raw_local_irq_restore(flags);
+   local_irq_restore(flags);
 }
 
 /*
-- 
1.7.8

[PATCH tip/core/rcu 03/15] rcu: Properly initialize ->boost_tasks on CPU offline

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

When rcu_preempt_offline_tasks() clears tasks from a leaf rcu_node
structure, it does not NULL out the structure's ->boost_tasks field.
This commit therefore fixes this issue.

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h |7 ---
 1 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 7f3244c..b1b4851 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -584,8 +584,11 @@ static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
raw_spin_unlock(&rnp_root->lock); /* irqs still disabled */
}
 
+   rnp->gp_tasks = NULL;
+   rnp->exp_tasks = NULL;
 #ifdef CONFIG_RCU_BOOST
-   /* In case root is being boosted and leaf is not. */
+   rnp->boost_tasks = NULL;
+   /* In case root is being boosted and leaf was not. */
raw_spin_lock(&rnp_root->lock); /* irqs already disabled */
if (rnp_root->boost_tasks != NULL &&
rnp_root->boost_tasks != rnp_root->gp_tasks)
@@ -593,8 +596,6 @@ static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
raw_spin_unlock(&rnp_root->lock); /* irqs still disabled */
 #endif /* #ifdef CONFIG_RCU_BOOST */
 
-   rnp->gp_tasks = NULL;
-   rnp->exp_tasks = NULL;
return retval;
 }
 
-- 
1.7.8

[PATCH tip/core/rcu 05/15] rcu: Improve boost selection when moving tasks to root rcu_node

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The rcu_preempt_offline_tasks() function moves all tasks queued on a given
leaf rcu_node structure to the root rcu_node, which is done when the last CPU
corresponding to the leaf rcu_node structure goes offline.  Now that
RCU-preempt's synchronize_rcu_expedited() implementation blocks CPU-hotplug
operations during the initialization of each rcu_node structure's
->boost_tasks pointer, rcu_preempt_offline_tasks() can do a better job
of setting the root rcu_node's ->boost_tasks pointer.

The key point is that rcu_preempt_offline_tasks() runs as part of the
CPU-hotplug process, so that a concurrent synchronize_rcu_expedited() is
guaranteed either to have not started on the one hand (in which case there
is no boosting on behalf of the expedited grace period) or to be completely
initialized on the other (in which case, in the absence of other priority
boosting, all ->boost_tasks pointers will be initialized).  Therefore,
if rcu_preempt_offline_tasks() finds that the ->boost_tasks pointer is
equal to the ->exp_tasks pointer, it can be sure that it is correctly
placed.

In the case where there was boosting ongoing at the time that the
synchronize_rcu_expedited() function started, different nodes might
start boosting the tasks blocking the expedited grace period at different
times.  In this mixed case, the root node will either be boosting tasks
for the expedited grace period already, or it will start as soon as it
gets done boosting for the normal grace period -- but in this latter
case, the root node's tasks needed to be boosted in any case.

This commit therefore adds a check of the ->boost_tasks pointer against
the ->exp_tasks pointer to the list that prevents updating ->boost_tasks.
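
The resulting decision can be sketched in isolation as follows.  This is
a user-space illustration, not kernel code: struct boost_node and
fixup_root_boost() are hypothetical names, and the locking that protects
the root rcu_node is omitted.

```c
#include <stddef.h>

/* Hypothetical stand-in for the three task pointers of the root rcu_node. */
struct boost_node {
	void *gp_tasks;
	void *exp_tasks;
	void *boost_tasks;
};

/*
 * Leave ->boost_tasks alone if it is NULL or already references either
 * the normal or the expedited grace period's tasks.  Otherwise point it
 * at the tasks blocking the normal grace period.
 */
static void fixup_root_boost(struct boost_node *root)
{
	if (root->boost_tasks != NULL &&
	    root->boost_tasks != root->gp_tasks &&
	    root->boost_tasks != root->exp_tasks)
		root->boost_tasks = root->gp_tasks;
}
```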

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree_plugin.h |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index b1b4851..c930a47 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -591,7 +591,8 @@ static int rcu_preempt_offline_tasks(struct rcu_state *rsp,
/* In case root is being boosted and leaf was not. */
raw_spin_lock(&rnp_root->lock); /* irqs already disabled */
if (rnp_root->boost_tasks != NULL &&
-   rnp_root->boost_tasks != rnp_root->gp_tasks)
+   rnp_root->boost_tasks != rnp_root->gp_tasks &&
+   rnp_root->boost_tasks != rnp_root->exp_tasks)
rnp_root->boost_tasks = rnp_root->gp_tasks;
raw_spin_unlock(&rnp_root->lock); /* irqs still disabled */
 #endif /* #ifdef CONFIG_RCU_BOOST */
-- 
1.7.8

[PATCH tip/core/rcu 06/15] rcu: Make offline-CPU checking allow for indefinite delays

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The rcu_implicit_offline_qs() function implicitly assumed that execution
would progress predictably when interrupts are disabled, which is of course
not guaranteed when running on a hypervisor.  Furthermore, this function
is short, and is called from one place only in a short function.

This commit therefore ensures that the timing is checked before
checking the condition, which guarantees correct behavior even given
indefinite delays.  It also inlines rcu_implicit_offline_qs() into
rcu_implicit_dynticks_qs().
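
The ordering can be sketched in user-space C as follows.  This is an
illustration only: offline_qs() and its parameters are hypothetical
names, and the comparison macro is written in the style of the kernel's
ULONG_CMP_GE() so that it tolerates counter wraparound.

```c
#include <limits.h>
#include <stdbool.h>

/* Wraparound-tolerant "a >= b" for free-running unsigned long counters. */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))

/*
 * Hypothetical helper: report a quiescent state for an offline CPU, but
 * only if the grace period is old enough.  The time is checked first and
 * the offline flag is sampled afterwards, so a delay of any length between
 * the two steps can only make the grace period older, never younger.
 */
static bool offline_qs(unsigned long gp_start, unsigned long now,
		       const volatile bool *offline)
{
	if (ULONG_CMP_GE(gp_start + 2, now))
		return false;	/* grace period is not old enough */
	return *offline;	/* sampled only after the time check */
}
```

Checking in the opposite order would allow the offline flag to go stale
while the caller was delayed before looking at the clock.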

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c |   53 +
 1 files changed, 21 insertions(+), 32 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 96b8aff..9f44749 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -317,35 +317,6 @@ static struct rcu_node *rcu_get_root(struct rcu_state *rsp)
 }
 
 /*
- * If the specified CPU is offline, tell the caller that it is in
- * a quiescent state.  Otherwise, whack it with a reschedule IPI.
- * Grace periods can end up waiting on an offline CPU when that
- * CPU is in the process of coming online -- it will be added to the
- * rcu_node bitmasks before it actually makes it online.  The same thing
- * can happen while a CPU is in the process of coming online.  Because this
- * race is quite rare, we check for it after detecting that the grace
- * period has been delayed rather than checking each and every CPU
- * each and every time we start a new grace period.
- */
-static int rcu_implicit_offline_qs(struct rcu_data *rdp)
-{
-   /*
-* If the CPU is offline for more than a jiffy, it is in a quiescent
-* state.  We can trust its state not to change because interrupts
-* are disabled.  The reason for the jiffy's worth of slack is to
-* handle CPUs initializing on the way up and finding their way
-* to the idle loop on the way down.
-*/
-   if (cpu_is_offline(rdp->cpu) &&
-   ULONG_CMP_LT(rdp->rsp->gp_start + 2, jiffies)) {
-   trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, "ofl");
-   rdp->offline_fqs++;
-   return 1;
-   }
-   return 0;
-}
-
-/*
  * rcu_idle_enter_common - inform RCU that current CPU is moving towards idle
  *
  * If the new value of the ->dynticks_nesting counter now is zero,
@@ -675,7 +646,7 @@ static int dyntick_save_progress_counter(struct rcu_data 
*rdp)
  * Return true if the specified CPU has passed through a quiescent
  * state by virtue of being in or having passed through an dynticks
  * idle state since the last call to dyntick_save_progress_counter()
- * for this same CPU.
+ * for this same CPU, or by virtue of having been offline.
  */
 static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
 {
@@ -699,8 +670,26 @@ static int rcu_implicit_dynticks_qs(struct rcu_data *rdp)
return 1;
}
 
-   /* Go check for the CPU being offline. */
-   return rcu_implicit_offline_qs(rdp);
+   /*
+* Check for the CPU being offline, but only if the grace period
+* is old enough.  We don't need to worry about the CPU changing
+* state: If we see it offline even once, it has been through a
+* quiescent state.
+*
+* The reason for insisting that the grace period be at least
+* one jiffy old is that CPUs that are not quite online and that
+* have just gone offline can still execute RCU read-side critical
+* sections.
+*/
+   if (ULONG_CMP_GE(rdp->rsp->gp_start + 2, jiffies))
+   return 0;  /* Grace period is not old enough. */
+   barrier();
+   if (cpu_is_offline(rdp->cpu)) {
+   trace_rcu_fqs(rdp->rsp->name, rdp->gpnum, rdp->cpu, "ofl");
+   rdp->offline_fqs++;
+   return 1;
+   }
+   return 0;
 }
 
 static int jiffies_till_stall_check(void)
-- 
1.7.8

[PATCH tip/core/rcu 12/15] rcu: Remove redundant memory barrier from __call_rcu()

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

The first memory barrier in __call_rcu() is supposed to order any
updates done beforehand by the caller against the actual queuing
of the callback.  However, the second memory barrier (which is intended
to order incrementing the queue lengths before queuing the callback)
is also between the caller's updates and the queuing of the callback.
The second memory barrier can therefore serve both purposes.

This commit therefore removes the first memory barrier.

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index e58097b..5b6709b 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -1923,8 +1923,6 @@ __call_rcu(struct rcu_head *head, void (*func)(struct 
rcu_head *rcu),
head->func = func;
head->next = NULL;
 
-   smp_mb(); /* Ensure RCU update seen before callback registry. */
-
/*
 * Opportunistically note grace-period endings and beginnings.
 * Note that we might see a beginning right after we see an
-- 
1.7.8

[PATCH tip/core/rcu 11/15] rcu: Avoid spurious RCU CPU stall warnings

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

If a given CPU avoids the idle loop but also avoids starting a new
RCU grace period for a full minute, RCU can issue spurious RCU CPU
stall warnings.  This commit fixes this issue by adding a check for
an ongoing grace period to avoid these spurious stall warnings.

Reported-by: Becky Bruce 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcutree.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index fbe43b0..e58097b 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -820,7 +820,8 @@ static void check_cpu_stall(struct rcu_state *rsp, struct 
rcu_data *rdp)
j = ACCESS_ONCE(jiffies);
js = ACCESS_ONCE(rsp->jiffies_stall);
rnp = rdp->mynode;
-   if ((ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
+   if (rcu_gp_in_progress(rsp) &&
+   (ACCESS_ONCE(rnp->qsmask) & rdp->grpmask) && ULONG_CMP_GE(j, js)) {
 
/* We haven't checked in, so go dump stack. */
print_cpu_stall(rsp);
-- 
1.7.8

[PATCH tip/core/rcu 01/15] rcu: Add PROVE_RCU_DELAY to provoke difficult races

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

There have been some recent bugs that were triggered only when
preemptible RCU's __rcu_read_unlock() was preempted just after setting
->rcu_read_lock_nesting to INT_MIN, which is a low-probability event.
Therefore, reproducing those bugs (to say nothing of gaining confidence
in alleged fixes) was quite difficult.  This commit therefore creates
a new debug-only RCU kernel config option that forces a short delay
in __rcu_read_unlock() to increase the probability of those sorts of
bugs occurring.

Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcupdate.c |4 
 lib/Kconfig.debug |   14 ++
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/kernel/rcupdate.c b/kernel/rcupdate.c
index 4e6a61b..29ca1c6 100644
--- a/kernel/rcupdate.c
+++ b/kernel/rcupdate.c
@@ -45,6 +45,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define CREATE_TRACE_POINTS
 #include 
@@ -81,6 +82,9 @@ void __rcu_read_unlock(void)
} else {
barrier();  /* critical section before exit code. */
t->rcu_read_lock_nesting = INT_MIN;
+#ifdef CONFIG_PROVE_RCU_DELAY
+   udelay(10); /* Make preemption more probable. */
+#endif /* #ifdef CONFIG_PROVE_RCU_DELAY */
barrier();  /* assign before ->rcu_read_unlock_special load */
if (unlikely(ACCESS_ONCE(t->rcu_read_unlock_special)))
rcu_read_unlock_special(t);
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2403a63..dacbbe4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -629,6 +629,20 @@ config PROVE_RCU_REPEATEDLY
 
 Say N if you are unsure.
 
+config PROVE_RCU_DELAY
+   bool "RCU debugging: preemptible RCU race provocation"
+   depends on DEBUG_KERNEL && PREEMPT_RCU
+   default n
+   help
+There is a class of races that involve an unlikely preemption
+of __rcu_read_unlock() just after ->rcu_read_lock_nesting has
+been set to INT_MIN.  This feature inserts a delay at that
+point to increase the probability of these races.
+
+Say Y to increase probability of preemption of __rcu_read_unlock().
+
+Say N if you are unsure.
+
 config SPARSE_RCU_POINTER
bool "RCU debugging: sparse-based checks for pointer usage"
default n
-- 
1.7.8

[PATCH tip/core/rcu 04/15] rcu: Permit RCU_NONIDLE() to be used from interrupt context

2012-08-30 Thread Paul E. McKenney

From: "Paul E. McKenney" 

There is a need to use RCU from interrupt context, but either before
rcu_irq_enter() is called or after rcu_irq_exit() is called.  If the
interrupt occurs from idle, then lockdep-RCU will complain about such
uses, as they appear to be illegal uses of RCU from the idle loop.
In other environments, RCU_NONIDLE() could be used to properly protect
the use of RCU, but RCU_NONIDLE() currently cannot be invoked except
from process context.

This commit therefore modifies RCU_NONIDLE() to permit its use more
globally.
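
The shape of the wrapper can be sketched in user-space C as follows.
This is an analogy only: SKETCH_NONIDLE(), sketch_irq_enter(),
sketch_irq_exit() and the counter are hypothetical stand-ins, with a
plain nesting count taking the place of RCU's dyntick-idle bookkeeping.

```c
/* Hypothetical nesting count standing in for RCU's idle-state tracking. */
static int nonidle_nesting;

static void sketch_irq_enter(void) { nonidle_nesting++; }
static void sketch_irq_exit(void)  { nonidle_nesting--; }

/* Run statement "a" with the nesting count raised, then restore it. */
#define SKETCH_NONIDLE(a) \
	do { \
		sketch_irq_enter(); \
		do { a; } while (0); \
		sketch_irq_exit(); \
	} while (0)

static int observed;

/* Record the nesting level seen from inside the wrapper. */
static void sample(void)
{
	SKETCH_NONIDLE(observed = nonidle_nesting);
}
```

Because the enter and exit calls simply nest, the wrapper leaves the
count unchanged on exit no matter what level it started from.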

Reported-by: Steven Rostedt 
Signed-off-by: Paul E. McKenney 
Signed-off-by: Paul E. McKenney 
---
 include/linux/rcupdate.h |6 ++
 kernel/rcutiny.c |2 ++
 kernel/rcutree.c |2 ++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 115ead2..0fbbd52 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -210,14 +210,12 @@ extern void exit_rcu(void);
  * to nest RCU_NONIDLE() wrappers, but the nesting level is currently
  * quite limited.  If deeper nesting is required, it will be necessary
  * to adjust DYNTICK_TASK_NESTING_VALUE accordingly.
- *
- * This macro may be used from process-level code only.
  */
 #define RCU_NONIDLE(a) \
do { \
-   rcu_idle_exit(); \
+   rcu_irq_enter(); \
do { a; } while (0); \
-   rcu_idle_enter(); \
+   rcu_irq_exit(); \
} while (0)
 
 /*
diff --git a/kernel/rcutiny.c b/kernel/rcutiny.c
index e4163c5..2e073a2 100644
--- a/kernel/rcutiny.c
+++ b/kernel/rcutiny.c
@@ -115,6 +115,7 @@ void rcu_irq_exit(void)
rcu_idle_enter_common(newval);
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_irq_exit);
 
 /* Common code for rcu_idle_exit() and rcu_irq_enter(), see kernel/rcutree.c. 
*/
 static void rcu_idle_exit_common(long long oldval)
@@ -172,6 +173,7 @@ void rcu_irq_enter(void)
rcu_idle_exit_common(oldval);
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_irq_enter);
 
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 
diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f280e54..96b8aff 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -447,6 +447,7 @@ void rcu_irq_exit(void)
rcu_idle_enter_common(rdtp, oldval);
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_irq_exit);
 
 /*
  * rcu_idle_exit_common - inform RCU that current CPU is moving away from idle
@@ -542,6 +543,7 @@ void rcu_irq_enter(void)
rcu_idle_exit_common(rdtp, oldval);
local_irq_restore(flags);
 }
+EXPORT_SYMBOL_GPL(rcu_irq_enter);
 
 /**
  * rcu_nmi_enter - inform RCU of entry to NMI context
-- 
1.7.8

[PATCH tip/core/rcu 15/15] kmemleak: Replace list_for_each_continue_rcu with new interface

2012-08-30 Thread Paul E. McKenney

From: Michael Wang 

This patch replaces list_for_each_continue_rcu() with
list_for_each_entry_continue_rcu() to save a few lines
of code and allow removing list_for_each_continue_rcu().
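
The difference between the two iterator styles can be sketched in
user-space C as follows.  This is an illustration only: it has no RCU
protection at all, the type is passed explicitly to avoid typeof, and
for_each_entry_continue(), struct object and next_live() are hypothetical
names rather than the kernel's definitions.

```c
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Map a pointer to an embedded member back to its enclosing structure. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Continue iterating over entries after pos until the list head is reached. */
#define for_each_entry_continue(pos, head, type, member) \
	for (pos = container_of((pos)->member.next, type, member); \
	     &(pos)->member != (head); \
	     pos = container_of((pos)->member.next, type, member))

struct object {
	int id;
	int live;
	struct list_head object_list;
};

/* Return the first live object after prev, or NULL if none remains. */
static struct object *next_live(struct object *prev, struct list_head *head)
{
	struct object *obj = prev;

	for_each_entry_continue(obj, head, struct object, object_list) {
		if (obj->live)
			return obj;
	}
	return NULL;
}
```

The entry-based iterator yields the enclosing structure directly, so the
loop body needs no separate list_entry() step.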

Signed-off-by: Michael Wang 
Acked-by: Catalin Marinas 
Signed-off-by: Paul E. McKenney 
---
 mm/kmemleak.c |6 ++
 1 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index 45eb621..0de83b4 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -1483,13 +1483,11 @@ static void *kmemleak_seq_next(struct seq_file *seq, 
void *v, loff_t *pos)
 {
struct kmemleak_object *prev_obj = v;
struct kmemleak_object *next_obj = NULL;
-   struct list_head *n = &prev_obj->object_list;
+   struct kmemleak_object *obj = prev_obj;
 
++(*pos);
 
-   list_for_each_continue_rcu(n, &object_list) {
-   struct kmemleak_object *obj =
-   list_entry(n, struct kmemleak_object, object_list);
+   list_for_each_entry_continue_rcu(obj, &object_list, object_list) {
if (get_object(obj)) {
next_obj = obj;
break;
-- 
1.7.8

[PATCH tip/core/rcu 0/3] CPU-hotplug changes

2012-08-30 Thread Paul E. McKenney

Hello!

This patch series contains fixes and improvements related to CPU hotplug:

1.  Remove _rcu_barrier() dependency on __stop_machine().
2.  Disallow callback registry on offline CPUs.
3.  Fix load avg vs cpu-hotplug (with Peter Zijlstra).

Thanx, Paul



 b/kernel/rcutree.c   |   83 ++-
 b/kernel/rcutree.h   |3 -
 b/kernel/rcutree_trace.c |4 +-
 b/kernel/sched/core.c|   41 +++
 kernel/rcutree.c |   10 +
 5 files changed, 43 insertions(+), 98 deletions(-)

Re: [PATCH 0/5] rbtree based interval tree as a prio_tree replacement

2012-08-30 Thread Rik van Riel


On 08/30/2012 05:34 PM, Andrew Morton wrote:

> It would be good to have solid acknowledgement from Rik that this approach
> does indeed suit his pending vma changes.

It does. Michel's rbtree rework is exactly what I need.

I do not need the interval tree bits, but the faster
augmented rbtree is required for my vma changes to
no longer have the performance regression Johannes
measured with a kernel build.


Re: [PATCH tip/core/rcu 1/5] rcu: Update rcutorture defaults

2012-08-30 Thread Paul E. McKenney

On Thu, Aug 30, 2012 at 11:57:05AM -0700, Josh Triplett wrote:
> On Thu, Aug 30, 2012 at 11:45:08AM -0700, Paul E. McKenney wrote:
> > From: "Paul E. McKenney" 
> > 
> > A number of new features have been added to rcutorture over the years, but
> > the defaults have not been updated to include them.  This commit therefore
> > turns on a couple of them that have proven helpful and trustworthy, namely
> > periodic progress reports and testing of NO_HZ.
> > 
> > Signed-off-by: Paul E. McKenney 
> > Signed-off-by: Paul E. McKenney 
> > ---
> >  kernel/rcutorture.c |4 ++--
> >  1 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/kernel/rcutorture.c b/kernel/rcutorture.c
> > index 25b1503..86315d3 100644
> > --- a/kernel/rcutorture.c
> > +++ b/kernel/rcutorture.c
> > @@ -53,10 +53,10 @@ MODULE_AUTHOR("Paul E. McKenney  
> > and Josh Triplett  >  
> >  static int nreaders = -1;  /* # reader threads, defaults to 2*ncpus */
> >  static int nfakewriters = 4;   /* # fake writer threads */
> > -static int stat_interval;  /* Interval between stats, in seconds. */
> > +static int stat_interval = 60; /* Interval between stats, in seconds. 
> > */
> > /*  Defaults to "only at end of test". */
> 
> Need to remove this comment about the default.

Good catch!  I have replaced it with "Zero means 'only at end of test'".

> >  static bool verbose;   /* Print more debug info. */
> > -static bool test_no_idle_hz;   /* Test RCU's support for tickless idle 
> > CPUs. */
> > +static bool test_no_idle_hz = 1; /* Test RCU support for tickless idle 
> > CPUs. */
> 
> s/1/true/

Good point, fixed.

Thank you for looking this over!

Thanx, Paul

> >  static int shuffle_interval = 3; /* Interval between shuffles (in sec)*/
> >  static int stutter = 5;/* Start/stop testing interval (in sec) 
> > */
> >  static int irqreader = 1;  /* RCU readers from irq (timers). */
> > -- 
> > 1.7.8
> > 
> 

Re: [PATCH 0/5] rbtree based interval tree as a prio_tree replacement

2012-08-30 Thread Andrew Morton

On Tue,  7 Aug 2012 00:25:38 -0700
Michel Lespinasse  wrote:

> This patchset goes over the rbtree changes that have been already integrated
> into Andrew's -mm tree, as well as the augmented rbtree proposal which is
> currently pending.

hm.  Well I grabbed these for a bit of testing.

It's a large change in MM and it depends on code which hasn't yet been
merged in mainline.  It's probably prudent to do all this in two steps
- we'll see.

It would be good to have solid acknowledgement from Rik that this approach
does indeed suit his pending vma changes.

The templates-with-CPP thing is not terribly appealing.  It's not
obvious that it really needed to be done this way - we've avoided it in
plenty of other places.  It would be nice to see that alternatives have
been thoroughly explored, and why they were rejected.

AFAICT the code will work OK when expanding macros which reference their
arguments multiple times.  For example, interval_tree.c has

#define ITLAST(n)  ((n)->vm_pgoff + \
(((n)->vm_end - (n)->vm_start) >> PAGE_SHIFT) - 1)

which will explode if passed "foo++".  Things like that.
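
The multiple-evaluation hazard can be sketched in user-space C as
follows.  This is an illustration only: struct range, LAST_MACRO(),
last_inline() and counted() are hypothetical names, and a 4 KiB page
size is assumed.

```c
#define SKETCH_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Hypothetical stand-in for the fields the macro reads from a vma. */
struct range {
	unsigned long start;
	unsigned long end;
	unsigned long pgoff;
};

/* Macro form: the argument expression is evaluated three times. */
#define LAST_MACRO(n) ((n)->pgoff + \
	(((n)->end - (n)->start) >> SKETCH_PAGE_SHIFT) - 1)

/* Function form: the argument expression is evaluated exactly once. */
static inline unsigned long last_inline(const struct range *n)
{
	return n->pgoff + ((n->end - n->start) >> SKETCH_PAGE_SHIFT) - 1;
}

static int evaluations;

/* Argument with a side effect: counts how often it is evaluated. */
static struct range *counted(struct range *r)
{
	evaluations++;
	return r;
}
```

Passing counted(&r) to the macro bumps the counter three times, whereas
the inline function bumps it once; an argument such as "foo++" would be
advanced three times in the same way.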

The code uses the lame-and-useless "inline" absolutely all over the
place.  I do think that for new code it would be better to get down and
actually make proper engineering decisions about which functions should
be inlined and mark them __always_inline.

Hillf has made a review suggestion which AFAICT remains unresponded to.

[PATCH 3/3] efi: Fix the ACPI BGRT driver for images located in EFI boot services memory

2012-08-30 Thread Josh Triplett

The ACPI BGRT driver accesses the BIOS logo image when it initializes.
However, ACPI 5.0 (which introduces the BGRT) recommends putting the
logo image in EFI boot services memory, so that the OS can reclaim that
memory.  Production systems follow this recommendation, breaking the
ACPI BGRT driver.

Move the bulk of the BGRT code to run during a new EFI late
initialization phase, which occurs after switching EFI to virtual mode,
and after initializing ACPI, but before freeing boot services memory.
Copy the BIOS logo image to kernel memory at that point, and make it
accessible to the BGRT driver.  Rework the existing ACPI BGRT driver to
act as a simple wrapper exposing that image (and the properties from the
BGRT) via sysfs.
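
The copy step can be sketched in user-space C as follows.  This is an
illustration only: copy_bmp() is a hypothetical helper, plain memcpy()
and malloc() stand in for memcpy_fromio() and kmalloc(), the bounds
checks against "avail" are an addition for the sketch, and the packed
attribute assumes GCC or Clang.  Reading the size field this way from a
real BMP file also assumes a little-endian host.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Leading fields of a BMP file: magic number, then total file size. */
struct bmp_header {
	uint16_t id;
	uint32_t size;
} __attribute__((packed));

/*
 * Copy an image out of src, sizing the copy from its own header.
 * Returns a malloc()ed buffer that the caller frees, or NULL on failure.
 */
static void *copy_bmp(const void *src, size_t avail, size_t *out_size)
{
	struct bmp_header hdr;
	void *copy;

	if (avail < sizeof(hdr))
		return NULL;
	memcpy(&hdr, src, sizeof(hdr));	/* read just the header first */
	if (hdr.size < sizeof(hdr) || hdr.size > avail)
		return NULL;
	copy = malloc(hdr.size);	/* allocate room for the whole image */
	if (!copy)
		return NULL;
	memcpy(copy, src, hdr.size);	/* then copy the full image */
	*out_size = hdr.size;
	return copy;
}
```

Once the copy exists, the source memory can be reclaimed without
affecting later readers of the image.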

Signed-off-by: Josh Triplett 
---
 arch/x86/platform/efi/Makefile   |1 +
 arch/x86/platform/efi/efi-bgrt.c |   76 ++
 arch/x86/platform/efi/efi.c  |6 +++
 drivers/acpi/Kconfig |4 +-
 drivers/acpi/bgrt.c  |   76 +-
 include/linux/efi-bgrt.h |   21 +++
 include/linux/efi.h  |1 +
 init/main.c  |4 +-
 8 files changed, 119 insertions(+), 70 deletions(-)
 create mode 100644 arch/x86/platform/efi/efi-bgrt.c
 create mode 100644 include/linux/efi-bgrt.h

diff --git a/arch/x86/platform/efi/Makefile b/arch/x86/platform/efi/Makefile
index 73b8be0..6db1cc4 100644
--- a/arch/x86/platform/efi/Makefile
+++ b/arch/x86/platform/efi/Makefile
@@ -1 +1,2 @@
 obj-$(CONFIG_EFI)  += efi.o efi_$(BITS).o efi_stub_$(BITS).o
+obj-$(CONFIG_ACPI_BGRT) += efi-bgrt.o
diff --git a/arch/x86/platform/efi/efi-bgrt.c b/arch/x86/platform/efi/efi-bgrt.c
new file mode 100644
index 000..f6a0c1b
--- /dev/null
+++ b/arch/x86/platform/efi/efi-bgrt.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright 2012 Intel Corporation
+ * Author: Josh Triplett 
+ *
+ * Based on the bgrt driver:
+ * Copyright 2012 Red Hat, Inc 
+ * Author: Matthew Garrett
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+
+struct acpi_table_bgrt *bgrt_tab;
+void *bgrt_image;
+size_t bgrt_image_size;
+
+struct bmp_header {
+   u16 id;
+   u32 size;
+} __packed;
+
+void efi_bgrt_init(void)
+{
+   acpi_status status;
+   void __iomem *image;
+   bool ioremapped = false;
+   struct bmp_header bmp_header;
+
+   if (acpi_disabled)
+   return;
+
+   status = acpi_get_table("BGRT", 0,
+   (struct acpi_table_header **)&bgrt_tab);
+   if (ACPI_FAILURE(status))
+   return;
+
+   if (bgrt_tab->version != 1)
+   return;
+   if (bgrt_tab->image_type != 0 || !bgrt_tab->image_address)
+   return;
+
+   image = efi_lookup_mapped_addr(bgrt_tab->image_address);
+   if (!image) {
+   image = ioremap(bgrt_tab->image_address, sizeof(bmp_header));
+   ioremapped = true;
+   if (!image)
+   return;
+   }
+
+   memcpy_fromio(&bmp_header, image, sizeof(bmp_header));
+   if (ioremapped)
+   iounmap(image);
+   bgrt_image_size = bmp_header.size;
+
+   bgrt_image = kmalloc(bgrt_image_size, GFP_KERNEL);
+   if (!bgrt_image)
+   return;
+
+   if (ioremapped) {
+   image = ioremap(bgrt_tab->image_address, bmp_header.size);
+   if (!image) {
+   kfree(bgrt_image);
+   bgrt_image = NULL;
+   return;
+   }
+   }
+
+   memcpy_fromio(bgrt_image, image, bgrt_image_size);
+   if (ioremapped)
+   iounmap(image);
+}
diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index ae35cc8..0226585 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -745,6 +746,11 @@ void __init efi_init(void)
 #endif
 }
 
+void __init efi_late_init(void)
+{
+   efi_bgrt_init();
+}
+
 void __init efi_set_executable(efi_memory_desc_t *md, bool executable)
 {
u64 addr, npages;
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 8099895..119d58d 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -385,8 +385,8 @@ config ACPI_CUSTOM_METHOD
  to override that restriction).
 
 config ACPI_BGRT
-tristate "Boottime Graphics Resource Table support"
-default n
+   bool "Boottime Graphics Resource Table support"
+   depends on EFI
 help
  This driver adds support for exposing the ACPI Boottime Graphics
  Resource Table, which allows the operating system to obtain
diff --git a/drivers/acpi/bgrt.c
