date:20121217

[PATCH] SPI: SSP SPI Controller driver v3

2012-12-17 Thread chao bi


This patch implements the SSP SPI controller driver, which has been applied and
validated on Intel Moorestown & Medfield platforms. The patch originated with
Ken Mills and Sylvain Centelles, and was migrated to the latest Linux mainline
SPI framework by Channing and Chen Jun according to their integration &
validation on the Medfield platform.

Signed-off-by: Ken Mills 
Signed-off-by: Sylvain Centelles 
Signed-off-by: channing 
Signed-off-by: Chen Jun 
---
 drivers/spi/Kconfig   |9 +
 drivers/spi/Makefile  |1 +
 drivers/spi/spi-intel-mid-ssp.c   | 1614 +
 include/linux/spi/spi-intel-mid-ssp.h |  103 +++
 4 files changed, 1727 insertions(+), 0 deletions(-)
 create mode 100644 drivers/spi/spi-intel-mid-ssp.c
 create mode 100644 include/linux/spi/spi-intel-mid-ssp.h

diff --git a/drivers/spi/Kconfig b/drivers/spi/Kconfig
index 2e188e1..6285f17 100644
--- a/drivers/spi/Kconfig
+++ b/drivers/spi/Kconfig
@@ -186,6 +186,15 @@ config SPI_IMX
  This enables using the Freescale i.MX SPI controllers in master
  mode.
 
+config SPI_INTEL_MID_SSP
+   tristate "SSP SPI controller driver for Intel MID platforms"
+   depends on SPI_MASTER && INTEL_MID_DMAC
+   help
+ This is the unified SSP SPI master controller driver for
+ the Intel MID platforms, handling Moorestown & Medfield,
+ master clock mode.
+ It supports Bulverde SSP core.
+
 config SPI_LM70_LLP
tristate "Parallel port adapter for LM70 eval board (DEVELOPMENT)"
depends on PARPORT && EXPERIMENTAL
diff --git a/drivers/spi/Makefile b/drivers/spi/Makefile
index 64e970b..1738966 100644
--- a/drivers/spi/Makefile
+++ b/drivers/spi/Makefile
@@ -33,6 +33,7 @@ obj-$(CONFIG_SPI_FSL_ESPI)+= spi-fsl-espi.o
 obj-$(CONFIG_SPI_FSL_SPI)  += spi-fsl-spi.o
 obj-$(CONFIG_SPI_GPIO) += spi-gpio.o
 obj-$(CONFIG_SPI_IMX)  += spi-imx.o
+obj-$(CONFIG_SPI_INTEL_MID_SSP)+= spi-intel-mid-ssp.o
 obj-$(CONFIG_SPI_LM70_LLP) += spi-lm70llp.o
 obj-$(CONFIG_SPI_MPC512x_PSC)  += spi-mpc512x-psc.o
 obj-$(CONFIG_SPI_MPC52xx_PSC)  += spi-mpc52xx-psc.o
diff --git a/drivers/spi/spi-intel-mid-ssp.c b/drivers/spi/spi-intel-mid-ssp.c
new file mode 100644
index 000..440c4a2
--- /dev/null
+++ b/drivers/spi/spi-intel-mid-ssp.c
@@ -0,0 +1,1614 @@
+/*
+ * spi-intel-mid-ssp.c
+ * This driver supports Bulverde SSP core used on Intel MID platforms
+ * It supports SSP of Moorestown & Medfield platforms and handles clock
+ * slave & master modes.
+ *
+ * Copyright (c) 2010, Intel Corporation.
+ *  Ken Mills 
+ *  Sylvain Centelles 
+ *  Jun Chen 
+ *  Chao Bi 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
+ *
+ */
+
+/*
+ * Note:
+ *
+ * Supports DMA and non-interrupt polled transfers.
+ *
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PCI_MRST_DMAC1_ID  0x0814
+#define PCI_MDFL_DMAC1_ID  0x0827
+
+#define SSP_NOT_SYNC BIT(22)
+#define MAX_SPI_TRANSFER_SIZE 8192
+#define MAX_BITBANGING_LOOP   1
+#define SPI_FIFO_SIZE 16
+
+/* PM QoS define(usec) */
+#define MIN_EXIT_LATENCY 20
+
+/* SPI DMA max transfer time */
+#define SSP_SPI_DMA_TIMEOUT 100
+
+/* SSP assignment configuration from PCI config */
+#define SSP_CFG_GET_MODE(ssp_cfg)  ((ssp_cfg) & 0x07)
+#define SSP_CFG_GET_SPI_BUS_NB(ssp_cfg)(((ssp_cfg) >> 3) & 0x07)
+#define SSP_CFG_IS_SPI_SLAVE(ssp_cfg)  ((ssp_cfg) & BIT(6))
+#define SSP_CFG_SPI_MODE_ID1
+/* adid field offset is 6 inside the vendor specific capability */
+#define VNDR_CAPABILITY_ADID_OFFSET6
+
+/* Driver's quirk flags
+ * This workaround buffers data in the audio fabric SDRAM from
+ * where the DMA transfers will operate. Should be enabled only for
+ * SPI slave mode. */
+#define QUIRKS_SRAM_ADDITIONAL_CPY 1
+/* If set the trailing bytes won't be handled by the DMA.
+ * Trailing byte feature not fully available. */
+#define QUIRKS_DMA_USE_NO_TRAIL2
+/* If set, the driver will use PM_QOS to reduce the latency
+ * introduced by the deeper C-states which may produce over/under
+ * run issues. Must be used in slave mode. In master mode, the
+ * latency is not critical, but setting

Re: [PATCH] sched: numa: Fix build error if CONFIG_NUMA_BALANCING && !CONFIG_TRANSPARENT_HUGEPAGE

2012-12-17 Thread David Rientjes

On Mon, 17 Dec 2012, Mel Gorman wrote:

> Michal Hocko reported that the following build error occurs if
> CONFIG_NUMA_BALANCING is set without THP support
> 
> kernel/sched/fair.c: In function ‘task_numa_work’:
> kernel/sched/fair.c:932:55: error: call to ‘__build_bug_failed’ declared 
> with attribute error: BUILD_BUG failed
> 
> The problem is that HPAGE_PMD_SHIFT triggers a BUILD_BUG() on
> !CONFIG_TRANSPARENT_HUGEPAGE. This patch addresses the problem.
> 
> Reported-by: Michal Hocko 
> Signed-off-by: Mel Gorman 

Acked-by: David Rientjes 

Fixes the build issue for me, thanks.

[PATCH] clk: max77686: Remove unnecessary NULL checking for container_of()

2012-12-17 Thread Axel Lin

container_of() never returns NULL, thus remove the NULL checking for it.
Also rename get_max77686_clk() to to_max77686_clk() for better readability.

Signed-off-by: Axel Lin 
---
 drivers/clk/clk-max77686.c |   30 --
 1 file changed, 8 insertions(+), 22 deletions(-)

diff --git a/drivers/clk/clk-max77686.c b/drivers/clk/clk-max77686.c
index 8944214..3680f66 100644
--- a/drivers/clk/clk-max77686.c
+++ b/drivers/clk/clk-max77686.c
@@ -44,48 +44,34 @@ struct max77686_clk {
struct clk_lookup *lookup;
 };
 
-static struct max77686_clk *get_max77686_clk(struct clk_hw *hw)
+static struct max77686_clk *to_max77686_clk(struct clk_hw *hw)
 {
return container_of(hw, struct max77686_clk, hw);
 }
 
 static int max77686_clk_prepare(struct clk_hw *hw)
 {
-   struct max77686_clk *max77686;
-   int ret;
-
-   max77686 = get_max77686_clk(hw);
-   if (!max77686)
-   return -ENOMEM;
-
-   ret = regmap_update_bits(max77686->iodev->regmap,
-   MAX77686_REG_32KHZ, max77686->mask, max77686->mask);
+   struct max77686_clk *max77686 = to_max77686_clk(hw);
 
-   return ret;
+   return regmap_update_bits(max77686->iodev->regmap,
+ MAX77686_REG_32KHZ, max77686->mask,
+ max77686->mask);
 }
 
 static void max77686_clk_unprepare(struct clk_hw *hw)
 {
-   struct max77686_clk *max77686;
-
-   max77686 = get_max77686_clk(hw);
-   if (!max77686)
-   return;
+   struct max77686_clk *max77686 = to_max77686_clk(hw);
 
regmap_update_bits(max77686->iodev->regmap,
-   MAX77686_REG_32KHZ, max77686->mask, ~max77686->mask);
+  MAX77686_REG_32KHZ, max77686->mask, 0);
 }
 
 static int max77686_clk_is_enabled(struct clk_hw *hw)
 {
-   struct max77686_clk *max77686;
+   struct max77686_clk *max77686 = to_max77686_clk(hw);
int ret;
u32 val;
 
-   max77686 = get_max77686_clk(hw);
-   if (!max77686)
-   return -ENOMEM;
-
ret = regmap_read(max77686->iodev->regmap,
MAX77686_REG_32KHZ, &val);
 
-- 
1.7.9.5
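
The claim in the commit message can be illustrated outside the kernel. What follows is a minimal userspace sketch under stated assumptions: the `container_of()` macro here is a simplified stand-in that omits the kernel's type checking, and `struct demo_hw`, `struct demo_clk` and `to_demo_clk()` are hypothetical names that are not part of the driver. Because the result is only the member pointer minus a constant offset, a NULL check on it cannot detect anything useful.

```c
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical embedded handle, playing the role of struct clk_hw. */
struct demo_hw {
	int dummy;
};

/* Hypothetical wrapper, playing the role of struct max77686_clk. */
struct demo_clk {
	unsigned int mask;
	struct demo_hw hw;	/* embedded, not a pointer */
};

/* Recover the wrapper from a pointer to its embedded member. */
static struct demo_clk *to_demo_clk(struct demo_hw *hw)
{
	return container_of(hw, struct demo_clk, hw);
}
```

Given a `struct demo_clk c`, calling `to_demo_clk(&c.hw)` yields `&c` again, which is why the helper reads naturally as a "to_" conversion rather than a lookup that can fail.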



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH 3/7] KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based

2012-12-17 Thread Takuya Yoshikawa

This makes it possible to release mmu_lock and reschedule conditionally
in a later patch.  Although this may increase the time needed to protect
the whole slot when we start dirty logging, the kernel should not allow
the userspace to trigger something that will hold a spinlock for such a
long time as tens of milliseconds: actually there is no limit since it
is roughly proportional to the number of guest pages.

Another point to note is that this patch removes the only user of
slot_bitmap which will cause some problems when we increase the number
of slots further.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c |   28 +++-
 1 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bee3509..b4d4fd1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4198,25 +4198,27 @@ int kvm_mmu_setup(struct kvm_vcpu *vcpu)
 
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
 {
-   struct kvm_mmu_page *sp;
-   bool flush = false;
+   struct kvm_memory_slot *memslot;
+   gfn_t last_gfn;
+   int i;
 
-   list_for_each_entry(sp, &kvm->arch.active_mmu_pages, link) {
-   int i;
-   u64 *pt;
+   memslot = id_to_memslot(kvm->memslots, slot);
+   last_gfn = memslot->base_gfn + memslot->npages - 1;
 
-   if (!test_bit(slot, sp->slot_bitmap))
-   continue;
+   for (i = PT_PAGE_TABLE_LEVEL;
+i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
+   unsigned long *rmapp;
+   unsigned long last_index, index;
 
-   pt = sp->spt;
-   for (i = 0; i < PT64_ENT_PER_PAGE; ++i) {
-   if (!is_shadow_present_pte(pt[i]) ||
- !is_last_spte(pt[i], sp->role.level))
-   continue;
+   rmapp = memslot->arch.rmap[i - PT_PAGE_TABLE_LEVEL];
+   last_index = gfn_to_index(last_gfn, memslot->base_gfn, i);
 
-   spte_write_protect(kvm, &pt[i], &flush, false);
+   for (index = 0; index <= last_index; ++index, ++rmapp) {
+   if (*rmapp)
+   __rmap_write_protect(kvm, rmapp, false);
}
}
+
kvm_flush_remote_tlbs(kvm);
 }
 
-- 
1.7.5.4
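
The loop bounds in the new code depend on how many rmap entries a slot has at each mapping level. The sketch below works that arithmetic through in userspace. It assumes that `gfn_to_index()` shifts both frame numbers right by 9 bits per level above the base level and subtracts, which is how the x86 helper is believed to behave; every `demo_` name is hypothetical and introduced only for illustration.

```c
#include <stdint.h>

typedef uint64_t demo_gfn_t;

/* Assumed shift: level 1 -> 0 bits, level 2 -> 9 bits, level 3 -> 18 bits. */
#define DEMO_HPAGE_GFN_SHIFT(level)	(((level) - 1) * 9)

/* Index of gfn inside the per-level rmap array of a slot starting at base_gfn. */
static demo_gfn_t demo_gfn_to_index(demo_gfn_t gfn, demo_gfn_t base_gfn, int level)
{
	return (gfn >> DEMO_HPAGE_GFN_SHIFT(level)) -
	       (base_gfn >> DEMO_HPAGE_GFN_SHIFT(level));
}

/* Number of rmap entries the write-protect loop visits for one level. */
static demo_gfn_t demo_rmap_entries(demo_gfn_t base_gfn, demo_gfn_t npages, int level)
{
	demo_gfn_t last_gfn = base_gfn + npages - 1;

	return demo_gfn_to_index(last_gfn, base_gfn, level) + 1;
}
```

For a 1GB slot starting at 4GB (base_gfn 0x100000, npages 0x40000), this gives 262144 entries at level 1, 512 at level 2 and 1 at level 3, so the walk scales with the size of the slot rather than with the number of shadow pages.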

[PATCH 5/7] KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself

2012-12-17 Thread Takuya Yoshikawa

No reason to make callers take mmu_lock since we do not need to protect
kvm_mmu_change_mmu_pages() and kvm_mmu_slot_remove_write_access()
together by mmu_lock in kvm_arch_commit_memory_region(): the former
calls kvm_mmu_commit_zap_page() and flushes TLBs by itself.

Note: we do not need to protect kvm->arch.n_requested_mmu_pages by
mmu_lock as can be seen from the fact that it is read locklessly.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c |4 
 arch/x86/kvm/x86.c |9 -
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index bb964b3..fc7d84a 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2143,6 +2143,8 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned 
int goal_nr_mmu_pages)
 * change the value
 */
 
+   spin_lock(&kvm->mmu_lock);
+
if (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages) {
while (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages &&
!list_empty(&kvm->arch.active_mmu_pages)) {
@@ -2157,6 +2159,8 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned 
int goal_nr_mmu_pages)
}
 
kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages;
+
+   spin_unlock(&kvm->mmu_lock);
 }
 
 int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9451efa..d9d0f6b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3270,12 +3270,10 @@ static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm 
*kvm,
return -EINVAL;
 
mutex_lock(&kvm->slots_lock);
-   spin_lock(&kvm->mmu_lock);
 
kvm_mmu_change_mmu_pages(kvm, kvm_nr_mmu_pages);
kvm->arch.n_requested_mmu_pages = kvm_nr_mmu_pages;
 
-   spin_unlock(&kvm->mmu_lock);
mutex_unlock(&kvm->slots_lock);
return 0;
 }
@@ -6894,7 +6892,6 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
if (!kvm->arch.n_requested_mmu_pages)
nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
 
-   spin_lock(&kvm->mmu_lock);
if (nr_mmu_pages)
kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
/*
@@ -6903,9 +6900,11 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 * not be created until the end of the logging.
 */
if ((mem->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-   !(old.flags & KVM_MEM_LOG_DIRTY_PAGES))
+   !(old.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+   spin_lock(&kvm->mmu_lock);
kvm_mmu_slot_remove_write_access(kvm, mem->slot);
-   spin_unlock(&kvm->mmu_lock);
+   spin_unlock(&kvm->mmu_lock);
+   }
/*
 * If memory slot is created, or moved, we need to clear all
 * mmio sptes.
-- 
1.7.5.4

[PATCH 4/7] KVM: x86: Remove unused slot_bitmap from kvm_mmu_page

2012-12-17 Thread Takuya Yoshikawa

Not needed any more.

Signed-off-by: Takuya Yoshikawa 
---
 Documentation/virtual/kvm/mmu.txt |7 ---
 arch/x86/include/asm/kvm_host.h   |5 -
 arch/x86/kvm/mmu.c|   10 --
 3 files changed, 0 insertions(+), 22 deletions(-)

diff --git a/Documentation/virtual/kvm/mmu.txt 
b/Documentation/virtual/kvm/mmu.txt
index fa5f1db..43fcb76 100644
--- a/Documentation/virtual/kvm/mmu.txt
+++ b/Documentation/virtual/kvm/mmu.txt
@@ -187,13 +187,6 @@ Shadow pages contain the following information:
 perform a reverse map from a pte to a gfn. When role.direct is set, any
 element of this array can be calculated from the gfn field when used, in
 this case, the array of gfns is not allocated. See role.direct and gfn.
-  slot_bitmap:
-A bitmap containing one bit per memory slot.  If the page contains a pte
-mapping a page from memory slot n, then bit n of slot_bitmap will be set
-(if a page is aliased among several slots, then it is not guaranteed that
-all slots will be marked).
-Used during dirty logging to avoid scanning a shadow page if none if its
-pages need tracking.
   root_count:
 A counter keeping track of how many hardware registers (guest cr3 or
 pdptrs) are now pointing at the page.  While this counter is nonzero, the
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c431b33..f75e1fe 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -219,11 +219,6 @@ struct kvm_mmu_page {
u64 *spt;
/* hold the gfn of each spte inside spt */
gfn_t *gfns;
-   /*
-* One bit set per slot which has memory
-* in this shadow page.
-*/
-   DECLARE_BITMAP(slot_bitmap, KVM_MEM_SLOTS_NUM);
bool unsync;
int root_count;  /* Currently serving as active root */
unsigned int unsync_children;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b4d4fd1..bb964b3 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1522,7 +1522,6 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct 
kvm_vcpu *vcpu,
sp->gfns = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache);
set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
-   bitmap_zero(sp->slot_bitmap, KVM_MEM_SLOTS_NUM);
sp->parent_ptes = 0;
mmu_page_add_parent_pte(vcpu, sp, parent_pte);
kvm_mod_used_mmu_pages(vcpu->kvm, +1);
@@ -2183,14 +2182,6 @@ int kvm_mmu_unprotect_page(struct kvm *kvm, gfn_t gfn)
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_unprotect_page);
 
-static void page_header_update_slot(struct kvm *kvm, void *pte, gfn_t gfn)
-{
-   int slot = memslot_id(kvm, gfn);
-   struct kvm_mmu_page *sp = page_header(__pa(pte));
-
-   __set_bit(slot, sp->slot_bitmap);
-}
-
 /*
  * The function is based on mtrr_type_lookup() in
  * arch/x86/kernel/cpu/mtrr/generic.c
@@ -2497,7 +2488,6 @@ static void mmu_set_spte(struct kvm_vcpu *vcpu, u64 
*sptep,
++vcpu->kvm->stat.lpages;
 
if (is_shadow_present_pte(*sptep)) {
-   page_header_update_slot(vcpu->kvm, sptep, gfn);
if (!was_rmapped) {
rmap_count = rmap_add(vcpu, sptep, gfn);
if (rmap_count > RMAP_RECYCLE_THRESHOLD)
-- 
1.7.5.4

[PATCH 7/7] KVM: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time

2012-12-17 Thread Takuya Yoshikawa

If the userspace starts dirty logging for a large slot, say 64GB of
memory, kvm_mmu_slot_remove_write_access() needs to hold mmu_lock for
a long time such as tens of milliseconds.  This patch controls the lock
hold time by asking the scheduler if we need to reschedule for others.

One penalty for this is that we need to flush TLBs before releasing
mmu_lock.  But since holding mmu_lock for a long time affects not only
the guest (in other words, the vCPU threads) but also the host as a
whole, we should pay for that.

In practice, the cost will not be so high because we can protect a fair
amount of memory before being rescheduled: in my test environment,
cond_resched_lock() was called only once for protecting 12GB of memory
even without THP.  We can also revisit Avi's "unlocked TLB flush" work
later for completely suppressing extra TLB flushes if needed.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c |5 +
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index b7a1235..a32e8cf 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4212,6 +4212,11 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, 
int slot)
for (index = 0; index <= last_index; ++index, ++rmapp) {
if (*rmapp)
__rmap_write_protect(kvm, rmapp, false);
+
+   if (need_resched() || spin_needbreak(&kvm->mmu_lock)) {
+   kvm_flush_remote_tlbs(kvm);
+   cond_resched_lock(&kvm->mmu_lock);
+   }
}
}
 
-- 
1.7.5.4
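
The control flow described in the commit message can be modelled in userspace as follows. This is only a sketch under stated assumptions, not the kernel code: no real lock is taken (the lock points are marked with comments), the `need_break` callback stands in for `need_resched() || spin_needbreak()`, and all `demo_` names are hypothetical. It counts how many TLB flushes a walk would issue, to make the "one penalty" visible.

```c
#include <stddef.h>

struct demo_stats {
	size_t protected_entries;	/* non-empty rmap entries handled */
	size_t flushes;			/* TLB flushes issued */
};

/* Hypothetical break policy: someone wants the lock at every 4th entry. */
static int demo_break_every_4(size_t index)
{
	return (index % 4) == 3;
}

static struct demo_stats demo_write_protect(const unsigned long *rmap, size_t n,
					    int (*need_break)(size_t index))
{
	struct demo_stats st = { 0, 0 };
	size_t i;

	/* the lock would be taken here */
	for (i = 0; i < n; i++) {
		if (rmap[i])
			st.protected_entries++;

		if (need_break(i)) {
			st.flushes++;	/* flush before the lock is dropped */
			/* the lock would be dropped and retaken here */
		}
	}
	st.flushes++;	/* final flush, still under the lock */
	/* the lock would be released here */
	return st;
}
```

Each break costs one extra flush on top of the single flush at the end, so the overhead grows with how often the lock has to be yielded rather than with the size of the slot.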

[PATCH 6/7] KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself

2012-12-17 Thread Takuya Yoshikawa

Better to place mmu_lock handling and TLB flushing code together since
this is a self-contained function.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c |3 +++
 arch/x86/kvm/x86.c |5 +
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fc7d84a..b7a1235 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4199,6 +4199,8 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, 
int slot)
memslot = id_to_memslot(kvm->memslots, slot);
last_gfn = memslot->base_gfn + memslot->npages - 1;
 
+   spin_lock(&kvm->mmu_lock);
+
for (i = PT_PAGE_TABLE_LEVEL;
 i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
unsigned long *rmapp;
@@ -4214,6 +4216,7 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, 
int slot)
}
 
kvm_flush_remote_tlbs(kvm);
+   spin_unlock(&kvm->mmu_lock);
 }
 
 void kvm_mmu_zap_all(struct kvm *kvm)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d9d0f6b..aa183e9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6900,11 +6900,8 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 * not be created until the end of the logging.
 */
if ((mem->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-   !(old.flags & KVM_MEM_LOG_DIRTY_PAGES)) {
-   spin_lock(&kvm->mmu_lock);
+   !(old.flags & KVM_MEM_LOG_DIRTY_PAGES))
kvm_mmu_slot_remove_write_access(kvm, mem->slot);
-   spin_unlock(&kvm->mmu_lock);
-   }
/*
 * If memory slot is created, or moved, we need to clear all
 * mmio sptes.
-- 
1.7.5.4

[PATCH 2/7] KVM: MMU: Remove unused parameter level from __rmap_write_protect()

2012-12-17 Thread Takuya Yoshikawa

No longer need to care about the mapping level in this function.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/mmu.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 01d7c2a..bee3509 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1142,7 +1142,7 @@ spte_write_protect(struct kvm *kvm, u64 *sptep, bool 
*flush, bool pt_protect)
 }
 
 static bool __rmap_write_protect(struct kvm *kvm, unsigned long *rmapp,
-int level, bool pt_protect)
+bool pt_protect)
 {
u64 *sptep;
struct rmap_iterator iter;
@@ -1180,7 +1180,7 @@ void kvm_mmu_write_protect_pt_masked(struct kvm *kvm,
while (mask) {
rmapp = __gfn_to_rmap(slot->base_gfn + gfn_offset + __ffs(mask),
  PT_PAGE_TABLE_LEVEL, slot);
-   __rmap_write_protect(kvm, rmapp, PT_PAGE_TABLE_LEVEL, false);
+   __rmap_write_protect(kvm, rmapp, false);
 
/* clear the first set bit */
mask &= mask - 1;
@@ -1199,7 +1199,7 @@ static bool rmap_write_protect(struct kvm *kvm, u64 gfn)
for (i = PT_PAGE_TABLE_LEVEL;
 i < PT_PAGE_TABLE_LEVEL + KVM_NR_PAGE_SIZES; ++i) {
rmapp = __gfn_to_rmap(gfn, i, slot);
-   write_protected |= __rmap_write_protect(kvm, rmapp, i, true);
+   write_protected |= __rmap_write_protect(kvm, rmapp, true);
}
 
return write_protected;
-- 
1.7.5.4

Re: [GIT PULL] user namespace and namespace infrastructure changes for 3.8

2012-12-17 Thread Eric W. Biederman

ebied...@xmission.com (Eric W. Biederman) writes:

> Linus,
>
> Please pull the for-linus git tree from:
>
>git://git.kernel.org:/pub/scm/linux/kernel/git/ebiederm/user-namespace.git 
> for-linus
>
>HEAD: 5155040ed349950e16c093ba8e65ad534994df2a userns: Fix typo in 
> description of the limitation of userns_install
>
>This tree is against v3.7-rc3
>
> The embarrassing oversights that Andy found have been corrected.

Those bugs, those darn embarrassing bugs just don't want to get
fixed.

Linus, I just updated my mirror of your kernel.org tree and it appears
you successfully pulled everything except the last 4 commits that fix
those embarrassing bugs.

When you get a chance can you please repull my branch (the details
above are still correct).

The pending changes are:

Eric W. Biederman (4):
  Fix cap_capable to only allow owners in the parent user namespace to have 
caps.
  userns: Require CAP_SYS_ADMIN for most uses of setns.
  userns: Add a more complete capability subset test to commit_creds
  userns: Fix typo in description of the limitation of userns_install

 fs/namespace.c   |3 ++-
 ipc/namespace.c  |3 ++-
 kernel/cred.c|   27 ++-
 kernel/pid_namespace.c   |3 ++-
 kernel/user_namespace.c  |2 +-
 kernel/utsname.c |3 ++-
 net/core/net_namespace.c |3 ++-
 security/commoncap.c |   25 +
 8 files changed, 54 insertions(+), 15 deletions(-)

Eric

Re: common clock framework: clk_set_rate issue

2012-12-17 Thread Sascha Hauer

On Tue, Dec 18, 2012 at 10:19:21AM +0800, Chao Xie wrote:
> On Tue, Dec 18, 2012 at 4:19 AM, Sascha Hauer  wrote:
> > On Thu, Dec 06, 2012 at 10:52:03AM +0800, Chao Xie wrote:
> >> hi
> >> While developing the clk drivers for SoCs based on the common clock
> >> framework, I ran into an issue.
> >> For example, there is a UART device whose function clock comes from a
> >> divider, and the divider's parent is a mux. It means
> >>
> >> MUX --> DIV --> UART
> >>
> >> As we know, the UART can work at a low baud rate for a terminal, while it
> >> can also connect to a GPS module which needs a high rate. So the MUX
> >> will provide two clock sources, a low clock rate and a high clock rate.
> >>
> >> The MUX clk driver clk-mux.c does not implement a ->round_rate callback.
> >> It means that when the UART driver is used for a GPS and wants to change
> >> its clock, the driver will call clk_set_rate(); clk_set_rate() will loop
> >> upward to DIV, and DIV will try to set its divider, and it needs to loop
> >> upward to MUX.
> >> In fact the current clk drivers have an issue here.
> >> The MUX clk driver should provide the round_rate callback, so that it can
> >> provide a new_rate. It means that in clk_calc_subtree MUX can switch
> >> the clock source.
> >
> > It's not that simple. The input clocks to a mux may not only differ in
> > their rate but can also have other different properties, like for
> > example one input may be always present whereas another input only runs
> > when the CPU is in run mode.
> >
> > It may be a possibility to add a flag to the mux to explicitly
> > allow reparenting on a rate change.
> >
> There is already a flag to do it.
> CLK_SET_RATE_PARENT

That flag has another meaning. It means that a clock is allowed to
change the parent's rate when a rate change is requested. What I meant
is a flag that allows a mux to change its parent when a rate change is
requested.
 
> If the mux does not want to change the input for clk_set_rate called
> by its child, it can clear this flag.
> The question is whether we need to add round_rate/recalc_rate for the
> MUX type of clock. Is there any special reason why the current
> MUX implementation does not have these callbacks?

They currently do not need these callbacks. When a clock does not have
round_rate, the request propagates up to the parent if CLK_SET_RATE_PARENT
is set, or it returns the parent's rate if this flag is not set. The
situation with set_rate is similar.
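
The fallback rule described above can be sketched in a few lines of userspace C. This is a toy model under stated assumptions, not the framework's implementation: `struct demo_clk`, `DEMO_SET_RATE_PARENT` (standing in for CLK_SET_RATE_PARENT), `demo_round_rate()` and `demo_pll_round()` are all hypothetical names, and the rule encoded is exactly the one stated in the paragraph above.

```c
#include <stddef.h>

#define DEMO_SET_RATE_PARENT	0x1	/* stands in for CLK_SET_RATE_PARENT */

struct demo_clk {
	struct demo_clk *parent;
	unsigned long rate;
	unsigned int flags;
	/* Optional callback; NULL for a clock such as the mux in this thread. */
	unsigned long (*round_rate)(struct demo_clk *clk, unsigned long rate);
};

/* Hypothetical fully programmable parent: it can offer any requested rate. */
static unsigned long demo_pll_round(struct demo_clk *clk, unsigned long rate)
{
	(void)clk;
	return rate;
}

static unsigned long demo_round_rate(struct demo_clk *clk, unsigned long rate)
{
	if (clk->round_rate)
		return clk->round_rate(clk, rate);

	/* No callback: pass the request up only if the flag allows it. */
	if ((clk->flags & DEMO_SET_RATE_PARENT) && clk->parent)
		return demo_round_rate(clk->parent, rate);

	/* Otherwise report the parent's current rate (own rate for a root). */
	return clk->parent ? clk->parent->rate : clk->rate;
}
```

Note that nothing in this model ever changes `clk->parent`; that is the separate reparenting behaviour for which a new flag is suggested in the reply.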

Sascha

-- 
Pengutronix e.K.   | |
Industrial Linux Solutions | http://www.pengutronix.de/  |
Peiner Str. 6-8, 31137 Hildesheim, Germany | Phone: +49-5121-206917-0|
Amtsgericht Hildesheim, HRA 2686   | Fax:   +49-5121-206917- |

[PATCH] clk: max77686: Fix return value checking for devm_kzalloc

2012-12-17 Thread Axel Lin

devm_kzalloc returns NULL on failure.

Signed-off-by: Axel Lin 
---
 drivers/clk/clk-max77686.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/clk-max77686.c b/drivers/clk/clk-max77686.c
index d098f72..8944214 100644
--- a/drivers/clk/clk-max77686.c
+++ b/drivers/clk/clk-max77686.c
@@ -132,7 +132,7 @@ static int max77686_clk_register(struct device *dev,
 
max77686->lookup = devm_kzalloc(dev, sizeof(struct clk_lookup),
GFP_KERNEL);
-   if (IS_ERR(max77686->lookup))
+   if (!max77686->lookup)
return -ENOMEM;
 
max77686->lookup->con_id = hw->init->name;
@@ -151,13 +151,13 @@ static int max77686_clk_probe(struct platform_device 
*pdev)
 
max77686_clks = devm_kzalloc(&pdev->dev, sizeof(struct max77686_clk *)
* MAX77686_CLKS_NUM, GFP_KERNEL);
-   if (IS_ERR(max77686_clks))
+   if (!max77686_clks)
return -ENOMEM;
 
for (i = 0; i < MAX77686_CLKS_NUM; i++) {
max77686_clks[i] = devm_kzalloc(&pdev->dev,
sizeof(struct max77686_clk), 
GFP_KERNEL);
-   if (IS_ERR(max77686_clks[i]))
+   if (!max77686_clks[i])
return -ENOMEM;
}
 
-- 
1.7.9.5
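
Why the old check could never fire can be shown with a small userspace model. The sketch below assumes that the kernel's `IS_ERR()` treats only the top 4095 addresses as encoded error values; `demo_is_err()`, `demo_err_ptr()` and `DEMO_MAX_ERRNO` are hypothetical stand-ins written for this illustration, not the kernel macros themselves.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO	4095

/* Model of IS_ERR(): true only for pointers in the top 4095 addresses. */
static bool demo_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

/* Model of ERR_PTR(): encode a negative errno value as a pointer. */
static void *demo_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}
```

With this model, `demo_is_err(NULL)` is false while `demo_is_err(demo_err_ptr(-12))` is true, so an allocator that reports failure by returning NULL has to be checked with `!ptr`, as the patch does.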



Re: [PATCH 2/3] perf tool: Add support to include non architectural event aliases

2012-12-17 Thread Jiri Olsa

On Tue, Dec 18, 2012 at 10:12:03AM +0900, Namhyung Kim wrote:
> Hi Jiri,
> 
> On Mon, 17 Dec 2012 14:37:04 +0100, Jiri Olsa wrote:
> > Adding support to parse non architectural event aliases
> > for given cpu. These aliases will be provided as 'events'
> > directory like architectural ones provided by kernel.
> >
> [snip]
> > +
> > +$(OUTPUT)$(OUTPUT)arch/$(ARCH)/util/pmu.o: 
> > $(OUTPUT)arch/$(ARCH)/util/pmu.c $(OUTPUT)PERF-CFLAGS
> 
> Double OUTPUT ? ;)

humm, yes ;-)

> 
> 
> > +   $(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) \
> > +'-DPERF_EXEC_PATH="$(perfexecdir_SQ)"' \
> > +'-DPREFIX="$(prefix_SQ)"' \
> > +$<
> [snip]
> > +static int cpu_aliases(struct list_head *head)
> > +{
> > +   unsigned vendol, model;
> 
> s/vendol/vendor/ ?

http://en.wikipedia.org/wiki/The_13th_Warrior
... 'Wendol', fiends who come with the mist to kill and eat human flesh ...

;-)

thanks
jirka

[PATCH 1/7] KVM: Write protect the updated slot only when we start dirty logging

2012-12-17 Thread Takuya Yoshikawa

This is needed to make kvm_mmu_slot_remove_write_access() rmap based:
otherwise we may end up using invalid rmap's.

Signed-off-by: Takuya Yoshikawa 
---
 arch/x86/kvm/x86.c  |9 -
 virt/kvm/kvm_main.c |1 -
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1c9c834..9451efa 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6897,7 +6897,14 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
spin_lock(&kvm->mmu_lock);
if (nr_mmu_pages)
kvm_mmu_change_mmu_pages(kvm, nr_mmu_pages);
-   kvm_mmu_slot_remove_write_access(kvm, mem->slot);
+   /*
+* Write protect all pages for dirty logging.
+* Existing largepage mappings are destroyed here and new ones will
+* not be created until the end of the logging.
+*/
+   if ((mem->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
+   !(old.flags & KVM_MEM_LOG_DIRTY_PAGES))
+   kvm_mmu_slot_remove_write_access(kvm, mem->slot);
spin_unlock(&kvm->mmu_lock);
/*
 * If memory slot is created, or moved, we need to clear all
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index bd31096..0ef5daa 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -805,7 +805,6 @@ int __kvm_set_memory_region(struct kvm *kvm,
if ((new.flags & KVM_MEM_LOG_DIRTY_PAGES) && !new.dirty_bitmap) {
if (kvm_create_dirty_bitmap(&new) < 0)
goto out_free;
-   /* destroy any largepage mappings for dirty tracking */
}
 
if (!npages || base_gfn != old.base_gfn) {
-- 
1.7.5.4

[PATCH 0/7] KVM: Alleviate mmu_lock hold time when we start dirty logging

2012-12-17 Thread Takuya Yoshikawa

This patch set makes kvm_mmu_slot_remove_write_access() rmap based and
adds conditional rescheduling to it.

The motivation for this change is of course to reduce the mmu_lock hold
time when we start dirty logging for a large memory slot.  You may not
see the problem if you just give 8GB or less of the memory to the guest
with THP enabled on the host -- this is for the worst case.


IMPORTANT NOTE (not about this patch set):

I have hit the following bug many times with the current next branch,
even WITHOUT my patches.  Although I do not know a way to reproduce this
yet, it seems that something was broken around slot->dirty_bitmap.  I am
now investigating the new code in __kvm_set_memory_region().

The bug:
[  575.238063] BUG: unable to handle kernel paging request at 0002efe83a77
[  575.238185] IP: [] mark_page_dirty_in_slot+0x19/0x20 [kvm]
[  575.238308] PGD 0 
[  575.238343] Oops: 0002 [#1] SMP 

The call trace:
[  575.241207] Call Trace:
[  575.241257]  [] kvm_write_guest_cached+0x91/0xb0 [kvm]
[  575.241370]  [] kvm_arch_vcpu_ioctl_run+0x1109/0x12c0 [kvm]
[  575.241488]  [] ? kvm_arch_vcpu_ioctl_run+0xa5/0x12c0 [kvm]
[  575.241595]  [] ? mutex_lock_killable_nested+0x274/0x340
[  575.241706]  [] ? kvm_set_ioapic_irq+0x20/0x20 [kvm]
[  575.241813]  [] kvm_vcpu_ioctl+0x559/0x670 [kvm]
[  575.241913]  [] ? kvm_vm_ioctl+0x1b8/0x570 [kvm]
[  575.242007]  [] ? native_sched_clock+0x13/0x80
[  575.242125]  [] ? sched_clock+0x9/0x10
[  575.242208]  [] ? sched_clock_cpu+0xbd/0x110
[  575.242298]  [] ? fget_light+0x3c/0x140
[  575.242381]  [] do_vfs_ioctl+0x98/0x570
[  575.242463]  [] ? fget_light+0xa1/0x140
[  575.246393]  [] ? fget_light+0x3c/0x140
[  575.250363]  [] sys_ioctl+0x91/0xb0
[  575.254327]  [] system_call_fastpath+0x16/0x1b


Takuya Yoshikawa (7):
  KVM: Write protect the updated slot only when we start dirty logging
  KVM: MMU: Remove unused parameter level from __rmap_write_protect()
  KVM: MMU: Make kvm_mmu_slot_remove_write_access() rmap based
  KVM: x86: Remove unused slot_bitmap from kvm_mmu_page
  KVM: Make kvm_mmu_change_mmu_pages() take mmu_lock by itself
  KVM: Make kvm_mmu_slot_remove_write_access() take mmu_lock by itself
  KVM: Conditionally reschedule when kvm_mmu_slot_remove_write_access() takes a long time

 Documentation/virtual/kvm/mmu.txt |7 
 arch/x86/include/asm/kvm_host.h   |5 ---
 arch/x86/kvm/mmu.c|   56 +++-
 arch/x86/kvm/x86.c|   13 +---
 virt/kvm/kvm_main.c   |1 -
 5 files changed, 38 insertions(+), 44 deletions(-)

-- 
1.7.5.4


[PATCH v7 03/27] x86, realmode: set real_mode permissions early

2012-12-17 Thread Yinghai Lu

We need to mark the trampoline code executable early, before we boot
the SMP APs.

The problem was found after switching to the #PF handler to set up the
page table.

Use early_initcall instead of arch_initcall.

Signed-off-by: Yinghai Lu 
---
 arch/x86/realmode/init.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 8045026..0b7e840 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -111,5 +111,4 @@ static int __init set_real_mode_permissions(void)
 
return 0;
 }
-
-arch_initcall(set_real_mode_permissions);
+early_initcall(set_real_mode_permissions);
-- 
1.7.10.4


[PATCH v7 04/27] x86, realmode: use init_level4_pgt to set trampoline_pgd directly

2012-12-17 Thread Yinghai Lu

With the #PF handler approach to setting up the early page table,
level3_ident_pgt will go away.

So use the entries in init_level4_pgt to set trampoline_pgd directly.

Signed-off-by: Yinghai Lu 
---
 arch/x86/realmode/init.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/realmode/init.c b/arch/x86/realmode/init.c
index 0b7e840..815eec1 100644
--- a/arch/x86/realmode/init.c
+++ b/arch/x86/realmode/init.c
@@ -78,8 +78,8 @@ void __init setup_real_mode(void)
*trampoline_cr4_features = read_cr4();
 
trampoline_pgd = (u64 *) __va(real_mode_header->trampoline_pgd);
-   trampoline_pgd[0] = __pa_symbol(level3_ident_pgt) + _KERNPG_TABLE;
-   trampoline_pgd[511] = __pa_symbol(level3_kernel_pgt) + _KERNPG_TABLE;
+   trampoline_pgd[0] = init_level4_pgt[pgd_index(__PAGE_OFFSET)].pgd;
+   trampoline_pgd[511] = init_level4_pgt[511].pgd;
 #endif
 }
 
-- 
1.7.10.4


[PATCH v7 09/27] x86: add get_ramdisk_image/size()

2012-12-17 Thread Yinghai Lu

There are several places that look up ramdisk information early, for
reserving and relocating.

Use helper functions to make the code more readable and consistent.

A later patch will add ext_ramdisk_image/size to those functions to
support loading the ramdisk above 4G.
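
As a rough illustration of where these helpers are heading, the sketch below
combines a 32-bit low field with a 32-bit extension field into one 64-bit
value. The struct toy_ramdisk_fields and the way the ext_* members are used
are assumptions made for this example; only the names
ext_ramdisk_image/size come from the description above.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the relevant boot fields. */
struct toy_ramdisk_fields {
    uint32_t ramdisk_image;     /* low 32 bits of the load address */
    uint32_t ramdisk_size;      /* low 32 bits of the size */
    uint32_t ext_ramdisk_image; /* assumed: high 32 bits of the address */
    uint32_t ext_ramdisk_size;  /* assumed: high 32 bits of the size */
};

/* Build the full 64-bit address from the low and (assumed) high halves. */
uint64_t toy_get_ramdisk_image(const struct toy_ramdisk_fields *f)
{
    uint64_t image = f->ramdisk_image;

    image |= (uint64_t)f->ext_ramdisk_image << 32;
    return image;
}

/* Same idea for the size. */
uint64_t toy_get_ramdisk_size(const struct toy_ramdisk_fields *f)
{
    uint64_t size = f->ramdisk_size;

    size |= (uint64_t)f->ext_ramdisk_size << 32;
    return size;
}
```

Keeping the lookup in one helper means callers such as relocate_initrd()
would not need to change when the high halves are added.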

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/setup.c |   29 +
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 1b8a8cc..644a123 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -294,12 +294,25 @@ static void __init reserve_brk(void)
 
 #ifdef CONFIG_BLK_DEV_INITRD
 
+static u64 __init get_ramdisk_image(void)
+{
+   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+
+   return ramdisk_image;
+}
+static u64 __init get_ramdisk_size(void)
+{
+   u64 ramdisk_size = boot_params.hdr.ramdisk_size;
+
+   return ramdisk_size;
+}
+
 #define MAX_MAP_CHUNK  (NR_FIX_BTMAPS << PAGE_SHIFT)
 static void __init relocate_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-   u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+   u64 ramdisk_image = get_ramdisk_image();
+   u64 ramdisk_size  = get_ramdisk_size();
u64 area_size = PAGE_ALIGN(ramdisk_size);
u64 ramdisk_here;
unsigned long slop, clen, mapaddr;
@@ -338,8 +351,8 @@ static void __init relocate_initrd(void)
ramdisk_size  -= clen;
}
 
-   ramdisk_image = boot_params.hdr.ramdisk_image;
-   ramdisk_size  = boot_params.hdr.ramdisk_size;
+   ramdisk_image = get_ramdisk_image();
+   ramdisk_size  = get_ramdisk_size();
printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
" [mem %#010llx-%#010llx]\n",
ramdisk_image, ramdisk_image + ramdisk_size - 1,
@@ -363,8 +376,8 @@ static u64 __init get_mem_size(unsigned long limit_pfn)
 static void __init early_reserve_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-   u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+   u64 ramdisk_image = get_ramdisk_image();
+   u64 ramdisk_size  = get_ramdisk_size();
u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 
if (!boot_params.hdr.type_of_loader ||
@@ -376,8 +389,8 @@ static void __init early_reserve_initrd(void)
 static void __init reserve_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-   u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+   u64 ramdisk_image = get_ramdisk_image();
+   u64 ramdisk_size  = get_ramdisk_size();
u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
u64 mapped_size;
 
-- 
1.7.10.4


[PATCH v7 11/27] x86, boot: move checking of cmd_line_ptr out of common path

2012-12-17 Thread Yinghai Lu

cmdline.c::__cmdline_find_option... are shared between the
16-bit setup code and the 32/64-bit decompressor code.

For the 32/64-bit-only path via kexec, we should not check whether the
pointer is below 1M, as the cmdline could be put above 1M, or even above 4G.

Move the accessibility check out of __cmdline_find_option()
so the decompressor in misc.c can parse the cmdline correctly.
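
To show why the 1M check only makes sense for the 16-bit setup code, here is
a small sketch of real-mode segment:offset addressing, which mirrors the
cmdline_ptr >> 4 and cmdline_ptr & 0xf split used in cmdline.c. The type and
function names are invented for this example.

```c
#include <stdint.h>

/* Hypothetical pair describing a real-mode segment:offset address. */
struct toy_seg_off {
    uint16_t seg;
    uint16_t off;
};

/*
 * Split a linear address the way the 16-bit code does: segment is the
 * address shifted right by 4, offset is the low nibble.
 * Returns 0 on success, -1 if the address is at or above 1M and so
 * cannot be expressed this way.
 */
int toy_linear_to_seg_off(uint32_t linear, struct toy_seg_off *out)
{
    if (linear >= 0x100000)
        return -1;

    out->seg = (uint16_t)(linear >> 4);
    out->off = (uint16_t)(linear & 0xf);
    return 0;
}

/* Recombine segment:offset into a linear address. */
uint32_t toy_seg_off_to_linear(struct toy_seg_off a)
{
    return ((uint32_t)a.seg << 4) + a.off;
}
```

A 32/64-bit caller has flat addressing and no such restriction, which is why
the check moves into the 16-bit-only inline wrappers.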

Signed-off-by: Yinghai Lu 
---
 arch/x86/boot/boot.h|   14 --
 arch/x86/boot/cmdline.c |8 
 2 files changed, 16 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 18997e5..7fadf80 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -289,12 +289,22 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
 int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option);
 static inline int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-   return __cmdline_find_option(boot_params.hdr.cmd_line_ptr, option, buffer, bufsize);
+   u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+   if (cmd_line_ptr >= 0x100000)
+   return -1;  /* inaccessible */
+
+   return __cmdline_find_option(cmd_line_ptr, option, buffer, bufsize);
 }
 
 static inline int cmdline_find_option_bool(const char *option)
 {
-   return __cmdline_find_option_bool(boot_params.hdr.cmd_line_ptr, option);
+   u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+   if (cmd_line_ptr >= 0x100000)
+   return -1;  /* inaccessible */
+
+   return __cmdline_find_option_bool(cmd_line_ptr, option);
 }
 
 
diff --git a/arch/x86/boot/cmdline.c b/arch/x86/boot/cmdline.c
index 6b3b6f7..768f00f 100644
--- a/arch/x86/boot/cmdline.c
+++ b/arch/x86/boot/cmdline.c
@@ -41,8 +41,8 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
st_bufcpy   /* Copying this to buffer */
} state = st_wordstart;
 
-   if (!cmdline_ptr || cmdline_ptr >= 0x100000)
-   return -1;  /* No command line, or inaccessible */
+   if (!cmdline_ptr)
+   return -1;  /* No command line */
 
cptr = cmdline_ptr & 0xf;
set_fs(cmdline_ptr >> 4);
@@ -111,8 +111,8 @@ int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option)
st_wordskip,/* Miscompare, skip */
} state = st_wordstart;
 
-   if (!cmdline_ptr || cmdline_ptr >= 0x100000)
-   return -1;  /* No command line, or inaccessible */
+   if (!cmdline_ptr)
+   return -1;  /* No command line */
 
cptr = cmdline_ptr & 0xf;
set_fs(cmdline_ptr >> 4);
-- 
1.7.10.4


[PATCH v7 08/27] x86: Merge early_reserve_initrd for 32bit and 64bit

2012-12-17 Thread Yinghai Lu

They are the same, so move them out of head32/64.c into setup.c.

We are using memblock, which handles overlapping reservations properly, so
we don't need to reserve the range up front to hold the location. We only
need to make sure we reserve it before we use memblock to find free
memory.

Signed-off-by: Yinghai Lu 
Reviewed-by: Pekka Enberg 
---
 arch/x86/kernel/head32.c |   11 ---
 arch/x86/kernel/head64.c |   11 ---
 arch/x86/kernel/setup.c  |   22 ++
 3 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index e175548..b071d41 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -33,17 +33,6 @@ void __init i386_start_kernel(void)
memblock_reserve(__pa_symbol(_text),
 (unsigned long)__bss_stop - (unsigned long)_text);
 
-#ifdef CONFIG_BLK_DEV_INITRD
-   /* Reserve INITRD */
-   if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
-   /* Assume only end is not page aligned */
-   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
-   u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
-   u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-   memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
-   }
-#endif
-
/* Call the subarch specific early setup function */
switch (boot_params.hdr.hardware_subarch) {
case X86_SUBARCH_MRST:
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 46e509e..0aff120 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -162,17 +162,6 @@ void __init x86_64_start_reservations(char *real_mode_data)
memblock_reserve(__pa_symbol(_text),
 (unsigned long)__bss_stop - (unsigned long)_text);
 
-#ifdef CONFIG_BLK_DEV_INITRD
-   /* Reserve INITRD */
-   if (boot_params.hdr.type_of_loader && boot_params.hdr.ramdisk_image) {
-   /* Assume only end is not page aligned */
-   unsigned long ramdisk_image = boot_params.hdr.ramdisk_image;
-   unsigned long ramdisk_size  = boot_params.hdr.ramdisk_size;
-   unsigned long ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
-   memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
-   }
-#endif
-
reserve_ebda_region();
 
/*
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 04797e78..1b8a8cc 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -360,6 +360,19 @@ static u64 __init get_mem_size(unsigned long limit_pfn)
 
return mapped_pages << PAGE_SHIFT;
 }
+static void __init early_reserve_initrd(void)
+{
+   /* Assume only end is not page aligned */
+   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+   u64 ramdisk_size  = boot_params.hdr.ramdisk_size;
+   u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
+
+   if (!boot_params.hdr.type_of_loader ||
+   !ramdisk_image || !ramdisk_size)
+   return; /* No initrd provided by bootloader */
+
+   memblock_reserve(ramdisk_image, ramdisk_end - ramdisk_image);
+}
 static void __init reserve_initrd(void)
 {
/* Assume only end is not page aligned */
@@ -386,10 +399,6 @@ static void __init reserve_initrd(void)
if (pfn_range_is_mapped(PFN_DOWN(ramdisk_image),
PFN_DOWN(ramdisk_end))) {
/* All are mapped, easy case */
-   /*
-* don't need to reserve again, already reserved early
-* in i386_start_kernel
-*/
initrd_start = ramdisk_image + PAGE_OFFSET;
initrd_end = initrd_start + ramdisk_size;
return;
@@ -400,6 +409,9 @@ static void __init reserve_initrd(void)
memblock_free(ramdisk_image, ramdisk_end - ramdisk_image);
 }
 #else
+static void __init early_reserve_initrd(void)
+{
+}
 static void __init reserve_initrd(void)
 {
 }
@@ -661,6 +673,8 @@ early_param("reservelow", parse_reservelow);
 
 void __init setup_arch(char **cmdline_p)
 {
+   early_reserve_initrd();
+
 #ifdef CONFIG_X86_32
memcpy(&boot_cpu_data, &new_cpu_data, sizeof(new_cpu_data));
visws_early_detect();
-- 
1.7.10.4


[PATCH v7 12/27] x86, boot: pass cmd_line_ptr with unsigned long

2012-12-17 Thread Yinghai Lu

boot/compressed/misc.c is used for bzImage on both 64-bit and 32-bit, and
cmd_line_ptr could point to a buffer that is above 4G. cmd_line_ptr
should be 64-bit there, otherwise the high 32 bits will be cut off.

So change the data type to unsigned long, which is 64-bit on 64-bit builds
and yields the correct address of the command line buffer.

This is fine for a 32-bit bzImage, because unsigned long is still
32-bit there.
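
The truncation being avoided can be sketched in a few lines. This example
uses fixed-width types so that it behaves the same on any host; the function
name is made up for illustration.

```c
#include <stdint.h>

/*
 * Returns 1 if an address survives being passed through a 32-bit
 * parameter unchanged, 0 if its high 32 bits would be cut off.
 */
int toy_survives_u32(uint64_t addr)
{
    uint32_t narrowed = (uint32_t)addr; /* what a u32 parameter would keep */

    return (uint64_t)narrowed == addr;
}
```

An address below 4G passes through unchanged, while anything above 4G loses
its upper half, which is the failure mode described above.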

Signed-off-by: Yinghai Lu 
---
 arch/x86/boot/boot.h|8 
 arch/x86/boot/cmdline.c |4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/x86/boot/boot.h b/arch/x86/boot/boot.h
index 7fadf80..5b75319 100644
--- a/arch/x86/boot/boot.h
+++ b/arch/x86/boot/boot.h
@@ -285,11 +285,11 @@ struct biosregs {
 void intcall(u8 int_no, const struct biosregs *ireg, struct biosregs *oreg);
 
 /* cmdline.c */
-int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int bufsize);
-int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option);
+int __cmdline_find_option(unsigned long cmdline_ptr, const char *option, char *buffer, int bufsize);
+int __cmdline_find_option_bool(unsigned long cmdline_ptr, const char *option);
 static inline int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-   u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+   unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
if (cmd_line_ptr >= 0x100000)
return -1;  /* inaccessible */
@@ -299,7 +299,7 @@ static inline int cmdline_find_option(const char *option, char *buffer, int bufs
 
 static inline int cmdline_find_option_bool(const char *option)
 {
-   u32 cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+   unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
 
if (cmd_line_ptr >= 0x100000)
return -1;  /* inaccessible */
diff --git a/arch/x86/boot/cmdline.c b/arch/x86/boot/cmdline.c
index 768f00f..625d21b 100644
--- a/arch/x86/boot/cmdline.c
+++ b/arch/x86/boot/cmdline.c
@@ -27,7 +27,7 @@ static inline int myisspace(u8 c)
  * Returns the length of the argument (regardless of if it was
  * truncated to fit in the buffer), or -1 on not found.
  */
-int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int bufsize)
+int __cmdline_find_option(unsigned long cmdline_ptr, const char *option, char *buffer, int bufsize)
 {
addr_t cptr;
char c;
@@ -99,7 +99,7 @@ int __cmdline_find_option(u32 cmdline_ptr, const char *option, char *buffer, int
  * Returns the position of that option (starts counting with 1)
  * or 0 on not found
  */
-int __cmdline_find_option_bool(u32 cmdline_ptr, const char *option)
+int __cmdline_find_option_bool(unsigned long cmdline_ptr, const char *option)
 {
addr_t cptr;
char c;
-- 
1.7.10.4


[PATCH v7 10/27] x86, boot: add get_cmd_line_ptr()

2012-12-17 Thread Yinghai Lu

Add a get_cmd_line_ptr() helper. A later patch will make it check
ext_cmd_line_ptr at the same time.

Signed-off-by: Yinghai Lu 
---
 arch/x86/boot/compressed/cmdline.c |   10 --
 arch/x86/kernel/head64.c   |   13 +++--
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/boot/compressed/cmdline.c b/arch/x86/boot/compressed/cmdline.c
index 10f6b11..b4c913c 100644
--- a/arch/x86/boot/compressed/cmdline.c
+++ b/arch/x86/boot/compressed/cmdline.c
@@ -13,13 +13,19 @@ static inline char rdfs8(addr_t addr)
return *((char *)(fs + addr));
 }
 #include "../cmdline.c"
+static unsigned long get_cmd_line_ptr(void)
+{
+   unsigned long cmd_line_ptr = real_mode->hdr.cmd_line_ptr;
+
+   return cmd_line_ptr;
+}
 int cmdline_find_option(const char *option, char *buffer, int bufsize)
 {
-   return __cmdline_find_option(real_mode->hdr.cmd_line_ptr, option, buffer, bufsize);
+   return __cmdline_find_option(get_cmd_line_ptr(), option, buffer, bufsize);
 }
 int cmdline_find_option_bool(const char *option)
 {
-   return __cmdline_find_option_bool(real_mode->hdr.cmd_line_ptr, option);
+   return __cmdline_find_option_bool(get_cmd_line_ptr(), option);
 }
 
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0aff120..fe9037d 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -102,13 +102,22 @@ static void __init clear_bss(void)
   (unsigned long) __bss_stop - (unsigned long) __bss_start);
 }
 
+static unsigned long get_cmd_line_ptr(void)
+{
+   unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;
+
+   return cmd_line_ptr;
+}
+
 static void __init copy_bootdata(char *real_mode_data)
 {
char * command_line;
+   unsigned long cmd_line_ptr;
 
memcpy(&boot_params, real_mode_data, sizeof boot_params);
-   if (boot_params.hdr.cmd_line_ptr) {
-   command_line = __va(boot_params.hdr.cmd_line_ptr);
+   cmd_line_ptr = get_cmd_line_ptr();
+   if (cmd_line_ptr) {
+   command_line = __va(cmd_line_ptr);
memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
}
 }
-- 
1.7.10.4


[PATCH v7 00/27] x86, boot, 64bit: Add support for loading ramdisk and bzImage above 4G

2012-12-17 Thread Yinghai Lu

Currently the kdump reservation is limited to below 896M, because kexec has
that limitation, and bzImage also needs to stay below 4G.

To let kexec/kdump use ranges above 4G, we need to make it possible to load
bzImage and the ramdisk above 4G.
During boot, bzImage will be unpacked at the same position and stay high.

The patches add fields to setup_header and boot_params to
1. get the ramdisk position above 4G from the bootloader/kexec
2. get cmd_line_ptr above 4G from the bootloader/kexec
3. set xloadflags bit0 in the header for bzImage, so the bootloader/kexec
   loader can check it to decide whether it can put bzImage high.
4. use a sentinel to make sure the ext_* fields in boot_params can be used.
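
The four points above can be sketched roughly as follows. Everything here is
a simplified assumption for illustration: the struct layout, the toy_* names
and the exact sentinel rule (a nonzero sentinel meaning the ext_* fields
cannot be trusted) are not taken from the patches themselves.

```c
#include <stdint.h>

/* Assumed meaning of xloadflags bit0: the image may be loaded above 4G. */
#define TOY_XLF_CAN_LOAD_HIGH (1u << 0)

/* Hypothetical, heavily trimmed stand-in for boot_params. */
struct toy_boot_params {
    uint8_t  sentinel;          /* assumed: must be 0 if ext_* are valid */
    uint16_t xloadflags;
    uint32_t ramdisk_image;
    uint32_t ext_ramdisk_image; /* assumed: high 32 bits */
    uint32_t cmd_line_ptr;
    uint32_t ext_cmd_line_ptr;  /* assumed: high 32 bits */
};

/* Loader side: may this image be placed above 4G? */
int toy_can_load_high(const struct toy_boot_params *bp)
{
    return (bp->xloadflags & TOY_XLF_CAN_LOAD_HIGH) != 0;
}

/* Kernel side: drop the ext_* fields if the sentinel says they are stale. */
void toy_sanitize(struct toy_boot_params *bp)
{
    if (bp->sentinel) {
        bp->ext_ramdisk_image = 0;
        bp->ext_cmd_line_ptr = 0;
        bp->sentinel = 0;
    }
}

/* Full 64-bit command line address from the low and high halves. */
uint64_t toy_cmd_line_ptr(const struct toy_boot_params *bp)
{
    return bp->cmd_line_ptr | ((uint64_t)bp->ext_cmd_line_ptr << 32);
}
```

The point of a sentinel in this sketch is that an older loader, which does
not know about the new fields, may leave garbage in them; the kernel can
then detect that and fall back to the 32-bit values.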

These patches were tested with kexec-tools with local changes, which will be
sent to the kexec list later.

The series can be found at:

git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-boot

It is on top of Linus's tree as of 2012-12-17
plus tip:x86/mm, tip:x86/urgent, tip:x86/mm2

-v2: add ext_cmd_line_ptr support, and handle the case where
     boot_params/cmd_line is above 4G.
-v3: according to hpa, use xloadflags instead of code32_start_offset.
     0x200 will not be changed...
-v4: move ext_ramdisk_image/ext_ramdisk_size/ext_cmd_line_ptr to boot_params.
     add handling of the cross-GB-boundary case.
-v5: put spare pages in BRK, so we avoid wasting about 4 pages.
     add check for bit USE_EXT_BOOT_PARAMS in xloadflags
-v6: use sentinel according to HPA
     add kdump load high support.
-v7: move sentinel from 0x1f0 to 0x1ef... according to HPA.
     Use HPA's #PF handler version instead of ioremap.

H. Peter Anvin (1):
  x86, 64bit: early #PF handler set page table

Yinghai Lu (26):
  x86, mm: Fix page table early allocation offset checking
  x86, mm: make pgd next calculation consistent with pud/pmd
  x86, realmode: set real_mode permissions early
  x86, realmode: use init_level4_pgt to set trampoline_pgd directly
  x86, realmode: Separate real_mode reserve and setup
  x86, 64bit: Print init kernel lowmap correctly
  x86: Merge early_reserve_initrd for 32bit and 64bit
  x86: add get_ramdisk_image/size()
  x86, boot: add get_cmd_line_ptr()
  x86, boot: move checking of cmd_line_ptr out of common path
  x86, boot: pass cmd_line_ptr with unsigned long
  x86, boot: move verify_cpu.S and no_longmode after 0x200
  x86, boot: Move lldt/ltr out of 64bit code section
  x86, kexec: remove 1024G limitation for kexec buffer on 64bit
  x86, kexec: set ident mapping for kernel that is above max_pfn
  x86, kexec: Merge ident_mapping_init and init_level4_page
  x86, kexec: only set ident mapping for ram.
  x86, boot: add fields to support load bzImage and ramdisk above 4G
  x86, boot: update comments about entries for 64bit image
  x86, boot: Not need to check setup_header version
  mm: Add alloc_bootmem_low_pages_nopanic()
  x86: Don't panic if can not alloc buffer for swiotlb
  x86: Add swiotlb force off support
  x86, kdump: remove crashkernel range find limit for 64bit
  x86: add Crash kernel low reservation
  x86: Merge early kernel reserve for 32bit and 64bit

 Documentation/kernel-parameters.txt |   10 ++
 Documentation/x86/boot.txt  |   53 ++-
 Documentation/x86/zero-page.txt |4 +
 arch/x86/boot/boot.h|   18 ++-
 arch/x86/boot/cmdline.c |   12 +-
 arch/x86/boot/compressed/cmdline.c  |   12 +-
 arch/x86/boot/compressed/head_64.S  |   48 ---
 arch/x86/boot/compressed/misc.c |   12 ++
 arch/x86/boot/header.S  |   12 +-
 arch/x86/boot/setup.ld  |7 +
 arch/x86/include/asm/kexec.h|6 +-
 arch/x86/include/asm/page.h |4 +
 arch/x86/include/asm/pgtable_64_types.h |4 +
 arch/x86/include/asm/processor.h|1 +
 arch/x86/include/asm/realmode.h |3 +-
 arch/x86/include/uapi/asm/bootparam.h   |   13 +-
 arch/x86/kernel/head32.c|   20 ---
 arch/x86/kernel/head64.c|  115 +++-
 arch/x86/kernel/head_64.S   |  202 +++
 arch/x86/kernel/machine_kexec_64.c  |  228 ++-
 arch/x86/kernel/pci-swiotlb.c   |   15 +-
 arch/x86/kernel/setup.c |  140 ++-
 arch/x86/kernel/traps.c |9 ++
 arch/x86/mm/init.c  |   11 +-
 arch/x86/mm/init_64.c   |   12 +-
 arch/x86/realmode/init.c|   37 +++--
 drivers/iommu/amd_iommu.c   |1 +
 include/linux/bootmem.h |5 +
 include/linux/kexec.h   |3 +
 include/linux/swiotlb.h |3 +-
 kernel/kexec.c  |   34 -
 lib/swiotlb.c   |   20 ++-
 mm/bootmem.c|8 ++
 mm/nobootmem.c  |8 ++
 34 files changed, 713 insertions(+), 377 deletions(-)


[PATCH v7 01/27] x86, mm: Fix page table early allocation offset checking

2012-12-17 Thread Yinghai Lu

While debugging loading the kernel above 4G, we found that one page in BRK
is left unused by early page table allocation when it should be used.

Fix the check, and also print every allocation from BRK.
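
The off-by-one can be sketched as two predicates over page frame numbers,
assuming the buffer covers the half-open range [pgt_buf_end, pgt_buf_top).
The function names are invented for this example.

```c
/*
 * Old check: treats "exactly fills the buffer" as out of space,
 * so the last page before 'top' can never be handed out.
 */
int toy_fits_old(unsigned long end, unsigned long num, unsigned long top)
{
    return !((end + num) >= top);
}

/* New check: only fails when the request would run past 'top'. */
int toy_fits_new(unsigned long end, unsigned long num, unsigned long top)
{
    return !((end + num) > top);
}
```

With end = 9, num = 1 and top = 10 the old check refuses the allocation even
though page 9 is still free, while the new check allows it and still rejects
a request starting at page 10.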

Signed-off-by: Yinghai Lu 
---
 arch/x86/mm/init.c |4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 6f85de8..c4293cf 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -47,7 +47,7 @@ __ref void *alloc_low_pages(unsigned int num)
__GFP_ZERO, order);
}
 
-   if ((pgt_buf_end + num) >= pgt_buf_top) {
+   if ((pgt_buf_end + num) > pgt_buf_top) {
unsigned long ret;
if (min_pfn_mapped >= max_pfn_mapped)
panic("alloc_low_page: ran out of memory");
@@ -61,6 +61,8 @@ __ref void *alloc_low_pages(unsigned int num)
} else {
pfn = pgt_buf_end;
pgt_buf_end += num;
+   printk(KERN_DEBUG "BRK [%#010lx, %#010lx] PGTABLE\n",
+   pfn << PAGE_SHIFT, (pgt_buf_end << PAGE_SHIFT) - 1);
}
 
for (i = 0; i < num; i++) {
-- 
1.7.10.4


[PATCH v7 05/27] x86, realmode: Separate real_mode reserve and setup

2012-12-17 Thread Yinghai Lu

After we switch to using the #PF handler to help set up the page table,
init_level4_pgt will only have its entries set after init_mem_mapping().
We need to move the copying of init_level4_pgt entries to trampoline_pgd
after that.

So split reserve_real_mode() out, and move the setup after
init_mem_mapping().

Signed-off-by: Yinghai Lu 
---
 arch/x86/include/asm/realmode.h |3 ++-
 arch/x86/kernel/setup.c |4 +++-
 arch/x86/realmode/init.c|   30 +++---
 3 files changed, 24 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/realmode.h b/arch/x86/include/asm/realmode.h
index fe1ec5b..9c6b890 100644
--- a/arch/x86/include/asm/realmode.h
+++ b/arch/x86/include/asm/realmode.h
@@ -58,6 +58,7 @@ extern unsigned char boot_gdt[];
 extern unsigned char secondary_startup_64[];
 #endif
 
-extern void __init setup_real_mode(void);
+void reserve_real_mode(void);
+void setup_real_mode(void);
 
 #endif /* _ARCH_X86_REALMODE_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 81ea5a5..01b22d0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -913,10 +913,12 @@ void __init setup_arch(char **cmdline_p)
printk(KERN_DEBUG "initial memory mapped: [mem 0x-%#010lx]\n",
(max_pfn_mapped

[PATCH v7 06/27] x86, 64bit: early #PF handler set page table

2012-12-17 Thread Yinghai Lu

From: "H. Peter Anvin" 

Two use cases:
1. We will support loading and running the kernel above 4G, and zero_page
   and the ramdisk will be above 4G, too.
2. We need to access the ramdisk early to get the microcode and update it
   as early as possible.

We could use early_iomap to access them, but it would make the code too
messy and hard to unify with 32-bit.

So here comes the #PF handler to set up the page table.

When a #PF happens, the handler uses pages in __initdata to set up page
table entries covering the accessed page.

The code and pages are in __INIT sections, so they will not increase RAM usage.

The good point is: with the help of the #PF handler, we can build the kernel
mapping from blank, and switch to init_level4_pgt later.

The switchover in head_64.S uses only three pages to handle the kernel
crossing 1G and 512G boundaries, by sharing pages. This is the most
interesting part.

early_make_pgtable uses the kernel high mapping address to access pages
to set up the page table.
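
The demand-mapping idea can be illustrated with a toy model in user space.
This is only a sketch under simplifying assumptions: a flat list of 2M
regions stands in for real page tables, the pool has four slots instead of
EARLY_DYNAMIC_PAGE_TABLES, and all toy_* names are invented here.

```c
#include <stdint.h>

#define TOY_POOL_SLOTS 4    /* stand-in for the fixed pool of early tables */
#define TOY_REGION_SHIFT 21 /* map in 2M units */

struct toy_early_map {
    uint64_t mapped[TOY_POOL_SLOTS]; /* region numbers currently mapped */
    unsigned int used;               /* slots handed out so far */
    unsigned int faults;             /* how many "page faults" were taken */
};

/*
 * Access an address. Returns 0 if it was already mapped, 1 if a
 * simulated fault had to map it first. When the pool is exhausted,
 * all mappings are wiped and allocation starts over, so later
 * accesses simply fault their regions back in.
 */
int toy_touch(struct toy_early_map *m, uint64_t addr)
{
    uint64_t region = addr >> TOY_REGION_SHIFT;
    unsigned int i;

    for (i = 0; i < m->used; i++)
        if (m->mapped[i] == region)
            return 0;

    m->faults++;
    if (m->used == TOY_POOL_SLOTS)
        m->used = 0; /* wipe everything and reuse the pool */

    m->mapped[m->used++] = region;
    return 1;
}
```

The design trade-off shown here is that a small, fixed pool is enough
because mappings are disposable: losing them only costs another fault.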

-v4: Add phys_base offset to make kexec happy, and add
     init_mapping_kernel()   - Yinghai
-v5: fix compiling with xen, and add back ident level3 and level2 for xen
     also move init_level4_pgt back from BSS to DATA again,
     because we have to clear it anyway.  - Yinghai
-v6: switch to init_level4_pgt in init_mem_mapping. - Yinghai
-v7: remove unneeded clear_page for init_level4_page;
     it is already zeroed with fill 512,8,0 in head_64.S  - Yinghai
-v8: we need to keep that handler alive until init_mem_mapping and not
     let early_trap_init trash that early #PF handler.
     So split early_trap_pf_init out and move it down. - Yinghai
-v9: switchover only covers kernel space instead of 1G, so it avoids
     touching possible mem holes. - Yinghai

Signed-off-by: Yinghai Lu 
---
 arch/x86/include/asm/pgtable_64_types.h |4 +
 arch/x86/include/asm/processor.h|1 +
 arch/x86/kernel/head64.c|   79 ++--
 arch/x86/kernel/head_64.S   |  202 +--
 arch/x86/kernel/setup.c |2 +
 arch/x86/kernel/traps.c |9 ++
 arch/x86/mm/init.c  |3 +-
 7 files changed, 204 insertions(+), 96 deletions(-)

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 766ea16..2d88344 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -1,6 +1,8 @@
 #ifndef _ASM_X86_PGTABLE_64_DEFS_H
 #define _ASM_X86_PGTABLE_64_DEFS_H
 
+#include 
+
 #ifndef __ASSEMBLY__
 #include 
 
@@ -60,4 +62,6 @@ typedef struct { pteval_t pte; } pte_t;
 #define MODULES_END  _AC(0xff00, UL)
 #define MODULES_LEN   (MODULES_END - MODULES_VADDR)
 
+#define EARLY_DYNAMIC_PAGE_TABLES  64
+
 #endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 888184b..a0b58dd 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -731,6 +731,7 @@ extern void enable_sep_cpu(void);
 extern int sysenter_setup(void);
 
 extern void early_trap_init(void);
+extern void early_trap_pf_init(void);
 
 /* Defined in head.S */
 extern struct desc_ptr early_gdt_descr;
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 7b215a5..cac61dc 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -26,11 +26,72 @@
 #include 
 #include 
 
-static void __init zap_identity_mappings(void)
+/*
+ * Manage page tables very early on.
+ */
+extern pgd_t early_level4_pgt[PTRS_PER_PGD];
+extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
+static unsigned int __initdata next_early_pgt = 2;
+
+/* Wipe all early page tables except for the kernel symbol map */
+static void __init reset_early_page_tables(void)
 {
-   pgd_t *pgd = pgd_offset_k(0UL);
-   pgd_clear(pgd);
-   __flush_tlb_all();
+   unsigned long i;
+
+   for (i = 0; i < PTRS_PER_PGD-1; i++)
+   early_level4_pgt[i].pgd = 0;
+
+   next_early_pgt = 0;
+
+   write_cr3(__pa(early_level4_pgt));
+}
+
+/* Create a new PMD entry */
+int __init early_make_pgtable(unsigned long address)
+{
+   unsigned long physaddr = address - __PAGE_OFFSET;
+   unsigned long i;
+   pgdval_t pgd, *pgd_p;
+   pudval_t *pud_p;
+   pmdval_t pmd, *pmd_p;
+
+
+   /* Invalid address or early pgt is done ?  */
+   if (physaddr >= MAXMEM || read_cr3() != __pa(early_level4_pgt))
+   return -1;
+
+   pgd_p = &early_level4_pgt[pgd_index(address)].pgd;
+   pgd = *pgd_p;
+
+   /*
+* The use of __START_KERNEL_map rather than __PAGE_OFFSET here is
+* critical -- __PAGE_OFFSET would point us back into the dynamic
+* range and we might end up looping forever...
+*/
+   if (pgd && next_early_pgt < EARLY_DYNAMIC_PAGE_TABLES) {
+   pud_p = (pudval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
+   } else {
+   if (next_early_pgt >=

[PATCH v7 15/27] x86, kexec: remove 1024G limitation for kexec buffer on 64bit

2012-12-17 Thread Yinghai Lu

Now the 64-bit kernel supports more than 1T of RAM, and kexec tools
can find a buffer above 1T, so remove that obsolete limitation
and use MAXMEM instead.

Tested on a system with more than 1024G of RAM.
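
The effect of the change can be sketched with plain arithmetic. Two values
here are assumptions for the example: the old limit is taken to be
2^40 - 1, as the 1024G figure implies, and MAXMEM is modeled as 2^46,
which may not match every kernel configuration.

```c
#include <stdint.h>

/* Assumed old limit: 1024G - 1, i.e. 2^40 - 1. */
#define TOY_OLD_LIMIT 0xFFFFFFFFFFULL

/* Assumed stand-in for MAXMEM on 64-bit: 2^46 bytes. */
#define TOY_MAXMEM (1ULL << 46)

/* Returns 1 if a physical address is usable under the given limit. */
int toy_within_limit(uint64_t addr, uint64_t limit)
{
    return addr <= limit;
}
```

Under these assumptions, an address at exactly 1T is rejected by the old
limit but accepted once the limit becomes MAXMEM - 1.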

Signed-off-by: Yinghai Lu 
Cc: "Eric W. Biederman" 
---
 arch/x86/include/asm/kexec.h |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 6080d26..17483a4 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -48,11 +48,11 @@
 # define vmcore_elf_check_arch_cross(x) ((x)->e_machine == EM_X86_64)
 #else
 /* Maximum physical address we can use pages from */
-# define KEXEC_SOURCE_MEMORY_LIMIT  (0xFFUL)
+# define KEXEC_SOURCE_MEMORY_LIMIT  (MAXMEM-1)
 /* Maximum address we can reach in physical address mode */
-# define KEXEC_DESTINATION_MEMORY_LIMIT (0xFFUL)
+# define KEXEC_DESTINATION_MEMORY_LIMIT (MAXMEM-1)
 /* Maximum address we can use for the control pages */
-# define KEXEC_CONTROL_MEMORY_LIMIT (0xFFFFFFFFFFUL)
+# define KEXEC_CONTROL_MEMORY_LIMIT (MAXMEM-1)
 
 /* Allocate one page for the pdp and the second for the code */
 # define KEXEC_CONTROL_PAGE_SIZE  (4096UL + 4096UL)
-- 
1.7.10.4
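
For scale, the effect of this change can be sketched as plain arithmetic. This is a minimal user-space sketch, not kernel code. It assumes the removed constant was the 40-bit mask that the subject's 1024G figure implies, and that MAXMEM is 1 << 46 as on x86_64 kernels of this period. OLD_KEXEC_LIMIT, ASSUMED_MAXMEM and kexec_buffer_fits() are illustrative names only.

```c
#include <stdint.h>

/* Assumption: the old limit was a 40-bit mask, i.e. 1024G - 1. */
#define OLD_KEXEC_LIMIT  0xFFFFFFFFFFULL
/* Assumption: MAXMEM == 1 << 46 on x86_64 (MAX_PHYSMEM_BITS == 46). */
#define ASSUMED_MAXMEM   (1ULL << 46)

/* Return 1 if a buffer ending at 'end' (exclusive) lies within 'limit'. */
static int kexec_buffer_fits(uint64_t end, uint64_t limit)
{
	return end != 0 && end - 1 <= limit;
}
```

Under these assumptions a buffer ending at 2T is rejected by the old limit but accepted by MAXMEM-1.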

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v7 16/27] x86, kexec: set ident mapping for kernel that is above max_pfn

2012-12-17 Thread Yinghai Lu

When the first kernel is booted with memmap= or mem= to limit max_pfn,
kexec can load the second kernel above that max_pfn.

In this case we need to set up the identity mapping for the whole image,
not just for the first 2M.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/machine_kexec_64.c |   43 +++-
 1 file changed, 37 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index b3ea9db..be14ee1 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -56,6 +56,25 @@ out:
return result;
 }
 
+static int ident_mapping_init(struct kimage *image, pgd_t *level4p,
+   unsigned long mstart, unsigned long mend)
+{
+   int result;
+
+   mstart = round_down(mstart, PMD_SIZE);
+   mend   = round_up(mend - 1, PMD_SIZE);
+
+   while (mstart < mend) {
+   result = init_one_level2_page(image, level4p, mstart);
+   if (result)
+   return result;
+
+   mstart += PMD_SIZE;
+   }
+
+   return 0;
+}
+
 static void init_level2_page(pmd_t *level2p, unsigned long addr)
 {
unsigned long end_addr;
@@ -184,22 +203,34 @@ err:
return result;
 }
 
-
 static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
 {
+   unsigned long mstart, mend;
pgd_t *level4p;
int result;
+   int i;
+
level4p = (pgd_t *)__va(start_pgtable);
result = init_level4_page(image, level4p, 0, max_pfn << PAGE_SHIFT);
if (result)
return result;
+
/*
-* image->start may be outside 0 ~ max_pfn, for example when
-* jump back to original kernel from kexeced kernel
+* segments's mem ranges could be outside 0 ~ max_pfn,
+* for example when jump back to original kernel from kexeced kernel.
+* or first kernel is booted with user mem map, and second kernel
+* could be loaded out of that range.
 */
-   result = init_one_level2_page(image, level4p, image->start);
-   if (result)
-   return result;
+   for (i = 0; i < image->nr_segments; i++) {
+   mstart = image->segment[i].mem;
+   mend   = mstart + image->segment[i].memsz;
+
+   result = ident_mapping_init(image, level4p, mstart, mend);
+
+   if (result)
+   return result;
+   }
+
return init_transition_pgtable(image, level4p);
 }
 
-- 
1.7.10.4
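
The range rounding in ident_mapping_init above can be tried out in user space. The following is a minimal sketch that mirrors the patch's round_down/round_up calls with plain helpers. PMD_SIZE_2M, the two rounding helpers and count_2m_pages() are hypothetical names, and the 2M PMD size is the usual x86_64 value rather than something taken from this patch.

```c
#include <stdint.h>

#define PMD_SIZE_2M (1ULL << 21)	/* assumed 2M PMD mapping size */

/* Power-of-two rounding helpers standing in for round_down()/round_up(). */
static uint64_t round_down_pow2(uint64_t x, uint64_t align)
{
	return x & ~(align - 1);
}

static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/* Count the 2M pages the loop in the patch would visit for [mstart, mend). */
static unsigned long count_2m_pages(uint64_t mstart, uint64_t mend)
{
	unsigned long n = 0;

	mstart = round_down_pow2(mstart, PMD_SIZE_2M);
	mend = round_up_pow2(mend - 1, PMD_SIZE_2M);

	while (mstart < mend) {
		n++;			/* one init_one_level2_page() call */
		mstart += PMD_SIZE_2M;
	}
	return n;
}
```

For example, a 6M segment that starts 4K past a 2M boundary touches four 2M pages, which is why mapping only the first 2M of the image is not enough.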


[PATCH v7 18/27] x86, kexec: only set ident mapping for ram.

2012-12-17 Thread Yinghai Lu

We should not set up the mapping for everything under max_pfn.
That causes the same problem that was fixed by

x86, mm: Only direct map addresses that are marked as E820_RAM

This patch exposes the pfn_mapped array and only sets up the identity
mapping for the ranges in that array.

This patch relies on the new ident_mapping_init, which can handle a
pgd/pud shared between different calls.

Signed-off-by: Yinghai Lu 
---
 arch/x86/include/asm/page.h|4 
 arch/x86/kernel/machine_kexec_64.c |   13 ++---
 arch/x86/mm/init.c |4 ++--
 3 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
index 3698a6a..c878924 100644
--- a/arch/x86/include/asm/page.h
+++ b/arch/x86/include/asm/page.h
@@ -17,6 +17,10 @@
 
 struct page;
 
+#include 
+extern struct range pfn_mapped[];
+extern int nr_pfn_mapped;
+
 static inline void clear_user_page(void *page, unsigned long vaddr,
   struct page *pg)
 {
diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index a0bf7fb..cc6d0e3 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -157,9 +157,16 @@ static int init_pgtable(struct kimage *image, unsigned long start_pgtable)
 
level4p = (pgd_t *)__va(start_pgtable);
clear_page(level4p);
-   result = ident_mapping_init(image, level4p, 0, max_pfn << PAGE_SHIFT);
-   if (result)
-   return result;
+
+   for (i = 0; i < nr_pfn_mapped; i++) {
+   mstart = pfn_mapped[i].start << PAGE_SHIFT;
+   mend   = pfn_mapped[i].end << PAGE_SHIFT;
+
+   result = ident_mapping_init(image, level4p, mstart, mend);
+
+   if (result)
+   return result;
+   }
 
/*
 * segments's mem ranges could be outside 0 ~ max_pfn,
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index ab26a15..d704b36 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -300,8 +300,8 @@ static int __meminit split_mem_range(struct map_range *mr, int nr_range,
return nr_range;
 }
 
-static struct range pfn_mapped[E820_X_MAX];
-static int nr_pfn_mapped;
+struct range pfn_mapped[E820_X_MAX];
+int nr_pfn_mapped;
 
 static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
 {
-- 
1.7.10.4
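
The effect of walking pfn_mapped instead of mapping everything below max_pfn can be illustrated with a small user-space sketch. It assumes 4K pages (a shift of 12) and ignores the 2M rounding done by ident_mapping_init. struct pfn_range and addr_is_ident_mapped() are hypothetical names, not part of the patch.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12	/* assumed 4K base page size */

/* Stand-in for struct range: [start, end) in page frame numbers. */
struct pfn_range {
	uint64_t start;
	uint64_t end;
};

/* Return 1 if 'addr' falls inside one of the mapped RAM ranges. */
static int addr_is_ident_mapped(uint64_t addr,
				const struct pfn_range *mapped, int nr)
{
	for (int i = 0; i < nr; i++) {
		uint64_t mstart = mapped[i].start << PAGE_SHIFT_4K;
		uint64_t mend = mapped[i].end << PAGE_SHIFT_4K;

		if (addr >= mstart && addr < mend)
			return 1;
	}
	return 0;
}
```

With RAM at 0-3G and 4G-6G, an address at 3.5G is below max_pfn but lies in a hole, so it gets no identity mapping under this scheme.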


[PATCH v7 17/27] x86, kexec: Merge ident_mapping_init and init_level4_page

2012-12-17 Thread Yinghai Lu

ident_mapping_init currently checks whether the pgd/pud is present for
every 2M, so when several 2M ranges share a PUD it keeps rechecking
whether that pud is there.

init_level4_page does not check for an existing pgd/pud.

We will need to use ident_mapping_init with the pfn_mapped array to map
RAM only, and two entries in pfn_mapped could share a pgd/pud, so we need
a function that checks whether the pgd/pud is present, which
init_level4_page does not do.

So merge these two sets of functions: make the new ident_mapping_init
avoid rechecking the pgd/pud for every pmd in the same pgd/pud, and use
it to replace init_level4_page.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/machine_kexec_64.c |  214 ++--
 1 file changed, 80 insertions(+), 134 deletions(-)

diff --git a/arch/x86/kernel/machine_kexec_64.c b/arch/x86/kernel/machine_kexec_64.c
index be14ee1..a0bf7fb 100644
--- a/arch/x86/kernel/machine_kexec_64.c
+++ b/arch/x86/kernel/machine_kexec_64.c
@@ -21,139 +21,6 @@
 #include 
 #include 
 
-static int init_one_level2_page(struct kimage *image, pgd_t *pgd,
-   unsigned long addr)
-{
-   pud_t *pud;
-   pmd_t *pmd;
-   struct page *page;
-   int result = -ENOMEM;
-
-   addr &= PMD_MASK;
-   pgd += pgd_index(addr);
-   if (!pgd_present(*pgd)) {
-   page = kimage_alloc_control_pages(image, 0);
-   if (!page)
-   goto out;
-   pud = (pud_t *)page_address(page);
-   clear_page(pud);
-   set_pgd(pgd, __pgd(__pa(pud) | _KERNPG_TABLE));
-   }
-   pud = pud_offset(pgd, addr);
-   if (!pud_present(*pud)) {
-   page = kimage_alloc_control_pages(image, 0);
-   if (!page)
-   goto out;
-   pmd = (pmd_t *)page_address(page);
-   clear_page(pmd);
-   set_pud(pud, __pud(__pa(pmd) | _KERNPG_TABLE));
-   }
-   pmd = pmd_offset(pud, addr);
-   if (!pmd_present(*pmd))
-   set_pmd(pmd, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
-   result = 0;
-out:
-   return result;
-}
-
-static int ident_mapping_init(struct kimage *image, pgd_t *level4p,
-   unsigned long mstart, unsigned long mend)
-{
-   int result;
-
-   mstart = round_down(mstart, PMD_SIZE);
-   mend   = round_up(mend - 1, PMD_SIZE);
-
-   while (mstart < mend) {
-   result = init_one_level2_page(image, level4p, mstart);
-   if (result)
-   return result;
-
-   mstart += PMD_SIZE;
-   }
-
-   return 0;
-}
-
-static void init_level2_page(pmd_t *level2p, unsigned long addr)
-{
-   unsigned long end_addr;
-
-   addr &= PAGE_MASK;
-   end_addr = addr + PUD_SIZE;
-   while (addr < end_addr) {
-   set_pmd(level2p++, __pmd(addr | __PAGE_KERNEL_LARGE_EXEC));
-   addr += PMD_SIZE;
-   }
-}
-
-static int init_level3_page(struct kimage *image, pud_t *level3p,
-   unsigned long addr, unsigned long last_addr)
-{
-   unsigned long end_addr;
-   int result;
-
-   result = 0;
-   addr &= PAGE_MASK;
-   end_addr = addr + PGDIR_SIZE;
-   while ((addr < last_addr) && (addr < end_addr)) {
-   struct page *page;
-   pmd_t *level2p;
-
-   page = kimage_alloc_control_pages(image, 0);
-   if (!page) {
-   result = -ENOMEM;
-   goto out;
-   }
-   level2p = (pmd_t *)page_address(page);
-   init_level2_page(level2p, addr);
-   set_pud(level3p++, __pud(__pa(level2p) | _KERNPG_TABLE));
-   addr += PUD_SIZE;
-   }
-   /* clear the unused entries */
-   while (addr < end_addr) {
-   pud_clear(level3p++);
-   addr += PUD_SIZE;
-   }
-out:
-   return result;
-}
-
-
-static int init_level4_page(struct kimage *image, pgd_t *level4p,
-   unsigned long addr, unsigned long last_addr)
-{
-   unsigned long end_addr;
-   int result;
-
-   result = 0;
-   addr &= PAGE_MASK;
-   end_addr = addr + (PTRS_PER_PGD * PGDIR_SIZE);
-   while ((addr < last_addr) && (addr < end_addr)) {
-   struct page *page;
-   pud_t *level3p;
-
-   page = kimage_alloc_control_pages(image, 0);
-   if (!page) {
-   result = -ENOMEM;
-   goto out;
-   }
-   level3p = (pud_t *)page_address(page);
-   result = init_level3_page(image, level3p, addr, last_addr);
-   if (result)
-   goto out;
-   set_pgd(level4p++, __pgd(__pa(level3p) | _KERNPG_TABLE));
-   addr += PGDIR_SIZE;
-   }
-   /* clear the unused entries */
-   while (addr < end_addr) {
-   pgd_clear(level4p++);
-   addr

[PATCH v7 20/27] x86, boot: update comments about entries for 64bit image

2012-12-17 Thread Yinghai Lu

The 64-bit entry point is now fixed at 0x200 and cannot be changed anymore.

Update the comments to reflect that.

Also document it in boot.txt.

Signed-off-by: Yinghai Lu 
---
 Documentation/x86/boot.txt |   38 
 arch/x86/boot/compressed/head_64.S |   22 -
 2 files changed, 51 insertions(+), 9 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 18ca9fb..24cc542 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -1042,6 +1042,44 @@ must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
 must be __BOOT_DS; interrupt must be disabled; %esi must hold the base
 address of the struct boot_params; %ebp, %edi and %ebx must be zero.
 
+ 64-bit BOOT PROTOCOL
+
+For machine with 64bit cpus and 64bit kernel, we could use 64bit bootloader
+We need a 64-bit boot protocol.
+
+In 64-bit boot protocol, the first step in loading a Linux kernel
+should be to setup the boot parameters (struct boot_params,
+traditionally known as "zero page"). The memory for struct boot_params
+should be allocated under or above 4G and initialized to all zero.
+Then the setup header from offset 0x01f1 of kernel image on should be
+loaded into struct boot_params and examined. The end of setup header
+can be calculated as follow:
+
+   0x0202 + byte value at offset 0x0201
+
+In addition to read/modify/write the setup header of the struct
+boot_params as that of 16-bit boot protocol, the boot loader should
+also fill the additional fields of the struct boot_params as that
+described in zero-page.txt.
+
+After setting up the struct boot_params, the boot loader can load the
+64-bit kernel in the same way as that of 16-bit boot protocol, but
+kernel could be above 4G.
+
+In 64-bit boot protocol, the kernel is started by jumping to the
+64-bit kernel entry point, which is the start address of loaded
+64-bit kernel plus 0x200.
+
+At entry, the CPU must be in 64-bit mode with paging enabled.
+The range with setup_header.init_size from start address of loaded
+kernel and zero page and command line buffer get ident mapping;
+a GDT must be loaded with the descriptors for selectors
+__BOOT_CS(0x10) and __BOOT_DS(0x18); both descriptors must be 4G flat
+segment; __BOOT_CS must have execute/read permission, and __BOOT_DS
+must have read/write permission; CS must be __BOOT_CS and DS, ES, SS
+must be __BOOT_DS; interrupt must be disabled; %rsi must hold the base
+address of the struct boot_params.
+
  EFI HANDOVER PROTOCOL
 
 This protocol allows boot loaders to defer initialisation to the EFI
diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 5c80b94..aaafd4e 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -37,6 +37,12 @@
__HEAD
.code32
 ENTRY(startup_32)
+   /*
+* 32bit entry is 0, could not be changed!
+* If we come here directly from a bootloader,
+* kernel(text+data+bss+brk) ramdisk, zero_page, command line
+* all need to be under 4G limit.
+*/
cld
/*
 * Test KEEP_SEGMENTS flag to see if the bootloader is asking
@@ -182,20 +188,18 @@ ENTRY(startup_32)
lret
 ENDPROC(startup_32)
 
-   /*
-* Be careful here startup_64 needs to be at a predictable
-* address so I can export it in an ELF header.  Bootloaders
-* should look at the ELF header to find this address, as
-* it may change in the future.
-*/
.code64
.org 0x200
 ENTRY(startup_64)
/*
+* 64bit entry is 0x200, could not be changed!
 * We come here either from startup_32 or directly from a
-* 64bit bootloader.  If we come here from a bootloader we depend on
-* an identity mapped page table being provied that maps our
-* entire text+data+bss and hopefully all of memory.
+* 64bit bootloader.
+* If we come here from a bootloader, kernel(text+data+bss+brk),
+* ramdisk, zero_page, command line could be above 4G.
+* We depend on an identity mapped page table being provided
+* that maps our entire kernel(text+data+bss+brk), zero page
+* and command line.
 */
 #ifdef CONFIG_EFI_STUB
/*
-- 
1.7.10.4
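
The two address calculations in the boot.txt text above (end of the setup header, and the 64-bit entry point) can be sketched from a boot loader's point of view. This is a minimal user-space sketch. setup_header_end() and entry_point_64() are hypothetical helper names, and the image buffer is assumed to hold at least the first 0x202 bytes of the kernel image.

```c
#include <stddef.h>
#include <stdint.h>

/* End of the setup header: 0x0202 + byte value at offset 0x0201. */
static size_t setup_header_end(const uint8_t *image)
{
	return 0x0202 + (size_t)image[0x0201];
}

/* 64-bit entry point: start address of the loaded 64-bit kernel + 0x200. */
static uint64_t entry_point_64(uint64_t load_addr)
{
	return load_addr + 0x200;
}
```

A loader would copy the bytes from offset 0x01f1 up to setup_header_end() into struct boot_params, then jump to entry_point_64() with %rsi pointing at boot_params.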


[PATCH v7 23/27] x86: Don't panic if can not alloc buffer for swiotlb

2012-12-17 Thread Yinghai Lu

Normal boot path on a system with IOMMU support:
the swiotlb buffer is allocated early, then the kernel tries to initialize
the IOMMU; if the Intel or AMD IOMMU can be set up properly, the swiotlb
buffer is freed.

The early allocation uses bootmem, and can panic when we try to use
kdump with a buffer above 4G only.

Replace the panic with a WARN, so the kernel can go on without swiotlb
and can use the IOMMU later.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/pci-swiotlb.c |5 -
 include/linux/swiotlb.h   |2 +-
 lib/swiotlb.c |   15 ++-
 3 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 6c483ba..6f93eb7 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -91,7 +91,10 @@ IOMMU_INIT(pci_swiotlb_detect_4gb,
 void __init pci_swiotlb_init(void)
 {
if (swiotlb) {
-   swiotlb_init(0);
+   if (swiotlb_init(0)) {
+   swiotlb = 0;
+   return;
+   }
dma_ops = &swiotlb_dma_ops;
}
 }
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 071d62c..1d2506f 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -22,7 +22,7 @@ extern int swiotlb_force;
  */
 #define IO_TLB_SHIFT 11
 
-extern void swiotlb_init(int verbose);
+int swiotlb_init(int verbose);
 extern void swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose);
 extern unsigned long swiotlb_nr_tbl(void);
 extern int swiotlb_late_init_with_tbl(char *tlb, unsigned long nslabs);
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 196b069..958322e 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -175,7 +175,7 @@ void __init swiotlb_init_with_tbl(char *tlb, unsigned long nslabs, int verbose)
  * Statically reserve bounce buffer space and initialize bounce buffer data
  * structures for the software IO TLB used to implement the DMA API.
  */
-static void __init
+static int __init
 swiotlb_init_with_default_size(size_t default_size, int verbose)
 {
unsigned char *vstart;
@@ -192,16 +192,21 @@ swiotlb_init_with_default_size(size_t default_size, int verbose)
 * Get IO TLB memory from the low pages
 */
vstart = alloc_bootmem_low_pages(PAGE_ALIGN(bytes));
-   if (!vstart)
-   panic("Cannot allocate SWIOTLB buffer");
+   if (!vstart) {
+   WARN(1, "Cannot allocate SWIOTLB buffer");
+   return -1;
+   }
 
swiotlb_init_with_tbl(vstart, io_tlb_nslabs, verbose);
+
+   return 0;
 }
 
-void __init
+int __init
 swiotlb_init(int verbose)
 {
-   swiotlb_init_with_default_size(64 * (1<<20), verbose);  /* default to 64MB */
+   /* default to 64MB */
+   return swiotlb_init_with_default_size(64 * (1<<20), verbose);
 }
 
 /*
-- 
1.7.10.4
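
The control flow this patch introduces (warn and return an error instead of panicking, then let the caller switch the feature off) can be sketched in user space. Everything here is a stand-in: bounce_enabled plays the role of the swiotlb flag, low_alloc is a hypothetical allocator hook replacing alloc_bootmem_low_pages(), and fprintf replaces WARN().

```c
#include <stdio.h>
#include <stdlib.h>

static int bounce_enabled = 1;	/* stands in for the 'swiotlb' flag */
static void *bounce_buf;

/* Hypothetical allocator hook so the failure path can be exercised. */
static void *(*low_alloc)(size_t) = malloc;

/* Allocator that always fails, to simulate "no low memory available". */
static void *failing_alloc(size_t bytes)
{
	(void)bytes;
	return NULL;
}

/* Returns 0 on success, -1 if the buffer cannot be allocated. */
static int bounce_init(size_t bytes)
{
	bounce_buf = low_alloc(bytes);
	if (!bounce_buf) {
		fprintf(stderr, "WARNING: cannot allocate bounce buffer\n");
		return -1;
	}
	return 0;
}

/* Caller: on failure, disable the feature and carry on. */
static void pci_bounce_init(void)
{
	if (bounce_enabled) {
		if (bounce_init(64 * 1024)) {
			bounce_enabled = 0;
			return;
		}
		/* the real code installs the swiotlb dma_ops here */
	}
}
```

Setting low_alloc to failing_alloc before calling pci_bounce_init() leaves bounce_enabled at 0 and the program still running, which is the behaviour the commit message asks for.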


[RFC v4 2/3] Discard volatile page

2012-12-17 Thread Minchan Kim

The VM doesn't need to swap out volatile pages. Instead, it just discards
the pages and sets the vma's purge state to true, so if the user tries to
access a purged vma without calling mnovolatile, it will encounter SIGBUS.

The reclaimer reclaims a volatile page when it reaches the tail of the LRU,
regardless of recent references. So when there is no memory pressure,
the page isn't evicted, which reduces the number of minor faults.
Even when memory pressure happens, the page isn't evicted until it reaches
the tail of the LRU. This could mitigate fault/data-regeneration overhead if
memory pressure isn't severe. But it's not a solid design and needs more
discussion.

Cc: Michael Kerrisk 
Cc: Arun Sharma 
Cc: san...@google.com
Cc: Paul Turner 
CC: David Rientjes 
Cc: John Stultz 
Cc: Andrew Morton 
Cc: Christoph Lameter 
Cc: Android Kernel Team 
Cc: Robert Love 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Cc: Dave Hansen 
Cc: Rik van Riel 
Cc: Dave Chinner 
Cc: Neil Brown 
Cc: Mike Hommey 
Cc: Taras Glek 
Cc: KOSAKI Motohiro 
Cc: Christoph Lameter 
Cc: KAMEZAWA Hiroyuki 
Signed-off-by: Minchan Kim 
---
 include/linux/rmap.h |3 ++
 mm/memory.c  |2 ++
 mm/migrate.c |6 ++--
 mm/rmap.c|   95 --
 mm/vmscan.c  |3 ++
 5 files changed, 105 insertions(+), 4 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index bfe1f47..ed263bb 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -80,6 +80,8 @@ enum ttu_flags {
TTU_IGNORE_MLOCK = (1 << 8),/* ignore mlock */
TTU_IGNORE_ACCESS = (1 << 9),   /* don't age */
TTU_IGNORE_HWPOISON = (1 << 10),/* corrupted page is recoverable */
+   /* ignore volatile. Should be revisit to handle migration entry */
+   TTU_IGNORE_VOLATILE = (1 << 11),
 };
 
 #ifdef CONFIG_MMU
@@ -261,5 +263,6 @@ static inline int page_mkclean(struct page *page)
 #define SWAP_AGAIN 1
 #define SWAP_FAIL  2
 #define SWAP_MLOCK 3
+#define SWAP_DISCARD   4
 
 #endif /* _LINUX_RMAP_H */
diff --git a/mm/memory.c b/mm/memory.c
index 221fc9f..71e06fe 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3459,6 +3459,8 @@ int handle_pte_fault(struct mm_struct *mm,
return do_linear_fault(mm, vma, address,
pte, pmd, flags, entry);
}
+   if (unlikely(vma->vm_flags & VM_VOLATILE))
+   return VM_FAULT_SIGBUS;
return do_anonymous_page(mm, vma, address,
 pte, pmd, flags);
}
diff --git a/mm/migrate.c b/mm/migrate.c
index 77ed2d7..bf9d76a 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -800,7 +800,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
}
 
/* Establish migration ptes or remove ptes */
-   try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+   try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|
+   TTU_IGNORE_ACCESS|TTU_IGNORE_VOLATILE);
 
 skip_unmap:
if (!page_mapped(page))
@@ -915,7 +916,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
if (PageAnon(hpage))
anon_vma = page_get_anon_vma(hpage);
 
-   try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+   try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|
+   TTU_IGNORE_ACCESS|TTU_IGNORE_VOLATILE);
 
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 7f4493c..02ee1a3 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1189,6 +1189,64 @@ out:
mem_cgroup_end_update_page_stat(page, , );
 }
 
+int try_to_zap_one(struct page *page, struct vm_area_struct *vma,
+   unsigned long address)
+{
+   struct mm_struct *mm = vma->vm_mm;
+   pgd_t *pgd;
+   pud_t *pud;
+   pmd_t *pmd;
+   pte_t *pte;
+   spinlock_t *ptl;
+
+   swp_entry_t entry = { .val = page_private(page) };
+
+   VM_BUG_ON(!PageLocked(page));
+   VM_BUG_ON(!PageAnon(page));
+   VM_BUG_ON(!PageSwapCache(page));
+
+   pgd = pgd_offset(mm, address);
+   if (!pgd_present(*pgd))
+   return 0;
+
+   pud = pud_offset(pgd, address);
+   if (!pud_present(*pud))
+   return 0;
+
+   pmd = pmd_offset(pud, address);
+   if (!pmd_present(*pmd))
+   return 0;
+
+   VM_BUG_ON(pmd_trans_huge(*pmd));
+
+   pte = pte_offset_map(pmd, address);
+   /* Make a quick check before getting the lock */
+   if(!pte_present(*pte)) {
+   pte_unmap(pte);
+   return 0;
+   }
+
+   ptl = pte_lockptr(mm, pmd);
+   spin_lock(ptl);
+
+   if (entry.val != pte_to_swp_entry(*pte).val) {
+   pte_unmap_unlock(pte, ptl);
+

[PATCH v7 21/27] x86, boot: Not need to check setup_header version

2012-12-17 Thread Yinghai Lu

The version field is for the bootloader.

setup_data is part of setup_header, and every bootloader copies
setup_header from the bzImage, so with an old bootloader setup_data
should stay 0.

For ELF images, kexec has so far set setup_data to 0.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/setup.c |6 --
 1 file changed, 6 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2509efa..15ce495 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -439,8 +439,6 @@ static void __init parse_setup_data(void)
struct setup_data *data;
u64 pa_data;
 
-   if (boot_params.hdr.version < 0x0209)
-   return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
u32 data_len, map_len;
@@ -476,8 +474,6 @@ static void __init e820_reserve_setup_data(void)
u64 pa_data;
int found = 0;
 
-   if (boot_params.hdr.version < 0x0209)
-   return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
data = early_memremap(pa_data, sizeof(*data));
@@ -501,8 +497,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
struct setup_data *data;
u64 pa_data;
 
-   if (boot_params.hdr.version < 0x0209)
-   return;
pa_data = boot_params.hdr.setup_data;
while (pa_data) {
data = early_memremap(pa_data, sizeof(*data));
-- 
1.7.10.4


[PATCH v7 24/27] x86: Add swiotlb force off support

2012-12-17 Thread Yinghai Lu

Let the user disable swiotlb from the command line even when swiotlb
support is compiled in, just like intel_iommu=on and intel_iommu=off.

Signed-off-by: Yinghai Lu 
---
 Documentation/kernel-parameters.txt |7 +++
 arch/x86/kernel/pci-swiotlb.c   |   10 +-
 drivers/iommu/amd_iommu.c   |1 +
 include/linux/swiotlb.h |1 +
 lib/swiotlb.c   |5 -
 5 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index ea8e5b4..2b37020 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2835,6 +2835,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
 
swiotlb=[IA-64] Number of I/O TLB slabs
 
+   swiotlb=[force|off|on] [KNL] disable or enable swiotlb.
+   force
+   on
+   Enable swiotlb.
+   off
+   Disable swiotlb.
+
switches=   [HW,M68k]
 
sysfs.deprecated=0|1 [KNL]
diff --git a/arch/x86/kernel/pci-swiotlb.c b/arch/x86/kernel/pci-swiotlb.c
index 6f93eb7..80afd3b 100644
--- a/arch/x86/kernel/pci-swiotlb.c
+++ b/arch/x86/kernel/pci-swiotlb.c
@@ -58,12 +58,12 @@ static struct dma_map_ops swiotlb_dma_ops = {
  */
 int __init pci_swiotlb_detect_override(void)
 {
-   int use_swiotlb = swiotlb | swiotlb_force;
-
if (swiotlb_force)
swiotlb = 1;
+   else if (swiotlb_force_off)
+   swiotlb = 0;
 
-   return use_swiotlb;
+   return swiotlb;
 }
 IOMMU_INIT_FINISH(pci_swiotlb_detect_override,
  pci_xen_swiotlb_detect,
@@ -76,9 +76,9 @@ IOMMU_INIT_FINISH(pci_swiotlb_detect_override,
  */
 int __init pci_swiotlb_detect_4gb(void)
 {
-   /* don't initialize swiotlb if iommu=off (no_iommu=1) */
+   /* don't initialize swiotlb if iommu=off (no_iommu=1) or force off */
 #ifdef CONFIG_X86_64
-   if (!no_iommu && max_pfn > MAX_DMA32_PFN)
+   if (!no_iommu && !swiotlb_force_off && max_pfn > MAX_DMA32_PFN)
swiotlb = 1;
 #endif
return swiotlb;
diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 55074cb..4f370d3 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3082,6 +3082,7 @@ int __init amd_iommu_init_dma_ops(void)
unhandled = device_dma_ops_init();
if (unhandled && max_pfn > MAX_DMA32_PFN) {
/* There are unhandled devices - initialize swiotlb for them */
+   WARN(swiotlb_force_off, "Please remove swiotlb=off\n");
swiotlb = 1;
}
 
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 1d2506f..dc43968 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -8,6 +8,7 @@ struct dma_attrs;
 struct scatterlist;
 
 extern int swiotlb_force;
+extern int swiotlb_force_off;
 
 /*
  * Maximum allowable number of contiguous slabs to map,
diff --git a/lib/swiotlb.c b/lib/swiotlb.c
index 958322e..3a0ec46 100644
--- a/lib/swiotlb.c
+++ b/lib/swiotlb.c
@@ -51,6 +51,7 @@
 #define IO_TLB_MIN_SLABS ((1<<20) >> IO_TLB_SHIFT)
 
 int swiotlb_force;
+int swiotlb_force_off;
 
 /*
  * Used to do a quick range check in swiotlb_tbl_unmap_single and
@@ -102,8 +103,10 @@ setup_io_tlb_npages(char *str)
}
if (*str == ',')
++str;
-   if (!strcmp(str, "force"))
+   if (!strcmp(str, "force") || !strcmp(str, "on"))
swiotlb_force = 1;
+   if (!strcmp(str, "off"))
+   swiotlb_force_off = 1;
 
return 1;
 }
-- 
1.7.10.4
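
The option handling added to setup_io_tlb_npages above can be exercised outside the kernel. The sketch below is a simplified user-space approximation: it uses strtoul in place of the kernel's number parsing, skips any alignment of the slab count, and collects the results in a hypothetical struct swiotlb_opts rather than in globals.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for what the kernel keeps in global variables. */
struct swiotlb_opts {
	unsigned long nslabs;
	int force;
	int force_off;
};

/* Parse "swiotlb=" values such as "65536", "65536,force", "on", "off". */
static void parse_swiotlb(const char *str, struct swiotlb_opts *o)
{
	char *end;

	if (isdigit((unsigned char)*str)) {
		o->nslabs = strtoul(str, &end, 0);
		str = end;
	}
	if (*str == ',')
		++str;
	if (!strcmp(str, "force") || !strcmp(str, "on"))
		o->force = 1;
	if (!strcmp(str, "off"))
		o->force_off = 1;
}
```

Note that "on" and "force" set the same flag here, matching the patch, so the two spellings are not distinguished after parsing.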


[PATCH v7 25/27] x86, kdump: remove crashkernel range find limit for 64bit

2012-12-17 Thread Yinghai Lu

Now that the kexeced kernel/ramdisk can be above 4G, remove the 896 MiB
limit for 64-bit.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/setup.c |4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 15ce495..2631008 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -515,13 +515,11 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 /*
  * Keep the crash kernel below this limit.  On 32 bits earlier kernels
  * would limit the kernel to the low 512 MiB due to mapping restrictions.
- * On 64 bits, kexec-tools currently limits us to 896 MiB; increase this
- * limit once kexec-tools are fixed.
  */
 #ifdef CONFIG_X86_32
 # define CRASH_KERNEL_ADDR_MAX (512 << 20)
 #else
-# define CRASH_KERNEL_ADDR_MAX (896 << 20)
+# define CRASH_KERNEL_ADDR_MAX MAXMEM
 #endif
 
 static void __init reserve_crashkernel(void)
-- 
1.7.10.4


[PATCH v7 13/27] x86, boot: move verify_cpu.S and no_longmode after 0x200

2012-12-17 Thread Yinghai Lu

We are short of space before 0x200, which is the entry point for startup_64.

According to hpa, we cannot move startup_64 to another offset, because
that offset has become ABI.

We can move the function verify_cpu and no_longmode down, because one is
reached via call and the other never returns.
That avoids the extra code for jumping back and forth that moving other
lines would require.

Signed-off-by: Yinghai Lu 
Cc: Matt Fleming 
---
 arch/x86/boot/compressed/head_64.S |   17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index 2c4b171..fb984c0 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -176,14 +176,6 @@ ENTRY(startup_32)
lret
 ENDPROC(startup_32)
 
-no_longmode:
-   /* This isn't an x86-64 CPU so hang */
-1:
-   hlt
-   jmp 1b
-
-#include "../../kernel/verify_cpu.S"
-
/*
 * Be careful here startup_64 needs to be at a predictable
 * address so I can export it in an ELF header.  Bootloaders
@@ -349,6 +341,15 @@ relocated:
  */
jmp *%rbp
 
+   .code32
+no_longmode:
+   /* This isn't an x86-64 CPU so hang */
+1:
+   hlt
+   jmp 1b
+
+#include "../../kernel/verify_cpu.S"
+
.data
 gdt:
.word   gdt_end - gdt
-- 
1.7.10.4


[PATCH v7 19/27] x86, boot: add fields to support load bzImage and ramdisk above 4G

2012-12-17 Thread Yinghai Lu

ext_ramdisk_image/size record the high 32 bits of the ramdisk info.

xloadflags bit0 is set if the kernel is relocatable with 64bit.

Let get_ramdisk_image/size use ext_ramdisk_image/size to get the
right position for the ramdisk.

The bootloader fills in ext_ramdisk_image/size when it loads the
ramdisk above 4G.

The bootloader also checks whether xloadflags bit0 is set to decide if
it can load the ramdisk above 4G.

sentinel is used to make sure the kernel has valid ext_* values set.

Update header version to 2.12.

-v2: add ext_cmd_line_ptr for above 4G support.
-v3: update to xloadflags from HPA.
-v4: use fields from bootparam instead of setup_header according to HPA.
-v5: add checking for USE_EXT_BOOT_PARAMS
-v6: use sentinel to check if ext_* are valid, as suggested by HPA.
 HPA said:
1. add a field in the uninitialized portion, call it "sentinel";
2. make sure the byte position corresponding to the "sentinel" field is
   nonzero in the bzImage file;
3. if the kernel boots up and sentinel is nonzero, erase those fields
   that you identified as uninitialized;
-v7: change to 0x1ef instead of 0x1f0, HPA said:
it is quite plausible that someone may (fairly sanely) start the
copy range at 0x1f0 instead of 0x1f1

Signed-off-by: Yinghai Lu 
Cc: Rob Landley 
Cc: Matt Fleming 
---
 Documentation/x86/boot.txt|   15 ++-
 Documentation/x86/zero-page.txt   |4 
 arch/x86/boot/compressed/cmdline.c|2 ++
 arch/x86/boot/compressed/misc.c   |   12 
 arch/x86/boot/header.S|   12 ++--
 arch/x86/boot/setup.ld|7 +++
 arch/x86/include/uapi/asm/bootparam.h |   13 ++---
 arch/x86/kernel/head64.c  |2 ++
 arch/x86/kernel/setup.c   |4 
 9 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/Documentation/x86/boot.txt b/Documentation/x86/boot.txt
index 406d82d..18ca9fb 100644
--- a/Documentation/x86/boot.txt
+++ b/Documentation/x86/boot.txt
@@ -57,6 +57,9 @@ Protocol 2.10:(Kernel 2.6.31) Added a protocol for relaxed alignment
 Protocol 2.11: (Kernel 3.6) Added a field for offset of EFI handover
protocol entry point.
 
+Protocol 2.12: (Kernel 3.9) Added three fields for loading bzImage and
+ramdisk above 4G with 64bit in bootparam.
+
  MEMORY LAYOUT
 
 The traditional memory map for the kernel loader, used for Image or
@@ -182,7 +185,7 @@ Offset  Proto   NameMeaning
 0230/4 2.05+   kernel_alignment Physical addr alignment required for kernel
 0234/1 2.05+   relocatable_kernel Whether kernel is relocatable or not
 0235/1 2.10+   min_alignment   Minimum alignment, as a power of two
-0236/2 N/A pad3Unused
+0236/2 2.12+   xloadflags  Boot protocol option flags
 0238/4 2.06+   cmdline_sizeMaximum size of the kernel command line
 023C/4 2.07+   hardware_subarch Hardware subarchitecture
 0240/8 2.07+   hardware_subarch_data Subarchitecture-specific data
@@ -582,6 +585,16 @@ Protocol:  2.10+
   misaligned kernel.  Therefore, a loader should typically try each
   power-of-two alignment from kernel_alignment down to this alignment.
 
+Field name: xloadflags
+Type:   modify (obligatory)
+Offset/size:0x236/2
+Protocol:   2.12+
+
+  This field is a bitmask.
+
+  Bit 0 (read): CAN_BE_LOADED_ABOVE_4G
+- If 1, kernel/boot_params/cmdline/ramdisk can be above 4g,
+
 Field name:cmdline_size
 Type:  read
 Offset/size:   0x238/4
diff --git a/Documentation/x86/zero-page.txt b/Documentation/x86/zero-page.txt
index cf5437d..1140e59 100644
--- a/Documentation/x86/zero-page.txt
+++ b/Documentation/x86/zero-page.txt
@@ -19,6 +19,9 @@ OffsetProto   NameMeaning
 090/010ALL hd1_infohd1 disk parameter, OBSOLETE!!
 0A0/010ALL sys_desc_table  System description table (struct 
sys_desc_table)
 0B0/010ALL olpc_ofw_header OLPC's OpenFirmware CIF and friends
+0C0/004ALL ext_ramdisk_image ramdisk_image high 32bits
+0C4/004ALL ext_ramdisk_size  ramdisk_size high 32bits
+0C8/004ALL ext_cmd_line_ptr  cmd_line_ptr high 32bits
 140/080ALL edid_info   Video mode setup (struct edid_info)
 1C0/020ALL efi_infoEFI 32 information (struct efi_info)
 1E0/004ALL alk_mem_k   Alternative mem check, in KB
@@ -27,6 +30,7 @@ OffsetProto   NameMeaning
 1E9/001ALL eddbuf_entries  Number of entries in eddbuf (below)
 1EA/001ALL edd_mbr_sig_buf_entries Number of entries in 
edd_mbr_sig_buffer
(below)
+1EF/001ALL sentinel0: states _ext_* fields are valid
 290/040ALL edd_mbr_sig_buffer EDD MBR signatures
 2D0/A00ALL e820_mapE820 memory map table
(array of struct e820entry)

[PATCH v7 26/27] x86: add Crash kernel low reservation

2012-12-17 Thread Yinghai Lu

During the kdump kernel's boot stage, it needs to find low RAM for the
swiotlb buffer when the system does not support Intel IOMMU/DMAR remapping.

kexec-tools appends memmap=exactmap and the range from /proc/iomem
marked "Crash kernel", and that range is above 4G for 64bit after boot
protocol 2.12.

We need to add another range in /proc/iomem, like "Crash kernel low",
so kexec-tools can find that info and append it to the kdump kernel
command line.

Try to reserve some memory under 4G if the normal "Crash kernel" is above
4G.

The user can specify the size with crashkernel_low=XX[KMG].
If the user does not specify it, 72M is used instead.
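
The size selection described above can be sketched in userspace as follows.
This is only an illustration: parse_low_size() is a hypothetical helper, not
the parse_crashkernel_low() used by the patch, and its handling of malformed
input is an assumption.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define DEFAULT_LOW_SIZE (72ULL << 20)	/* 72M fallback named above */

/*
 * Hypothetical parser for the value of crashkernel_low=XX[KMG].
 * arg is the text after '=', or NULL if the option was not given.
 * A missing or zero size falls back to the default.
 */
static uint64_t parse_low_size(const char *arg)
{
	char *end;
	uint64_t size;

	/* the fallback must itself fit under 4G */
	assert(DEFAULT_LOW_SIZE < (1ULL << 32));

	if (!arg)
		return DEFAULT_LOW_SIZE;

	size = strtoull(arg, &end, 0);
	switch (*end) {
	case 'G': case 'g':
		size <<= 10;
		/* fall through */
	case 'M': case 'm':
		size <<= 10;
		/* fall through */
	case 'K': case 'k':
		size <<= 10;
		break;
	default:
		break;
	}

	return size ? size : DEFAULT_LOW_SIZE;
}
```

For example, parse_low_size("128M") yields 128 << 20 bytes, while
parse_low_size(NULL) yields the 72M default.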

-v2: fix warning that is found by Fengguang's test robot.

Signed-off-by: Yinghai Lu 
---
 Documentation/kernel-parameters.txt |3 ++
 arch/x86/kernel/setup.c |   70 +++
 include/linux/kexec.h   |3 ++
 kernel/kexec.c  |   34 ++---
 4 files changed, 89 insertions(+), 21 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index 2b37020..e8c48e9 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -600,6 +600,9 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
is selected automatically. Check
Documentation/kdump/kdump.txt for further details.
 
+   crashkernel_low=size[KMG]
+   [KNL, x86] parts under 4G.
+
crashkernel=range1:size1[,range2:size2,...][@offset]
[KNL] Same as above, but depends on the memory
in the running system. The syntax of range is
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2631008..5373a71 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -292,6 +292,21 @@ static void __init reserve_brk(void)
_brk_start = 0;
 }
 
+static u64 __init get_mem_size(unsigned long limit_pfn)
+{
+   int i;
+   u64 pages = 0;
+   unsigned long start_pfn, end_pfn;
+
+   for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
+   start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
+   end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
+   pages += end_pfn - start_pfn;
+   }
+
+   return pages << PAGE_SHIFT;
+}
+
 #ifdef CONFIG_BLK_DEV_INITRD
 
 static u64 __init get_ramdisk_image(void)
@@ -363,20 +378,6 @@ static void __init relocate_initrd(void)
ramdisk_here, ramdisk_here + ramdisk_size - 1);
 }
 
-static u64 __init get_mem_size(unsigned long limit_pfn)
-{
-   int i;
-   u64 mapped_pages = 0;
-   unsigned long start_pfn, end_pfn;
-
-   for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
-   start_pfn = min_t(unsigned long, start_pfn, limit_pfn);
-   end_pfn = min_t(unsigned long, end_pfn, limit_pfn);
-   mapped_pages += end_pfn - start_pfn;
-   }
-
-   return mapped_pages << PAGE_SHIFT;
-}
 static void __init early_reserve_initrd(void)
 {
/* Assume only end is not page aligned */
@@ -522,8 +523,43 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_MAX MAXMEM
 #endif
 
+static void __init reserve_crashkernel_low(void)
+{
+#ifdef CONFIG_X86_64
+   const unsigned long long alignment = 16<<20;/* 16M */
+   unsigned long long low_base = 0, low_size = 0;
+   unsigned long total_low_mem;
+   unsigned long long base;
+   int ret;
+
+   total_low_mem = get_mem_size(1UL<<(32-PAGE_SHIFT));
+   ret = parse_crashkernel_low(boot_command_line, total_low_mem,
+   &low_size, &base);
+   if (ret != 0 || low_size <= 0)
+   low_size = (72UL<<20);  /* 72M */
+   low_base = memblock_find_in_range(low_size, (1ULL<<32),
+   low_size, alignment);
+
+   if (!low_base) {
+   pr_info("crashkernel low reservation failed - No suitable area found.\n");
+
+   return;
+   }
+
+   memblock_reserve(low_base, low_size);
+   pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
+   (unsigned long)(low_size >> 20),
+   (unsigned long)(low_base >> 20),
+   (unsigned long)(total_low_mem >> 20));
+   crashk_low_res.start = low_base;
+   crashk_low_res.end   = low_base + low_size - 1;
+   insert_resource(&iomem_resource, &crashk_low_res);
+#endif
+}
+
 static void __init reserve_crashkernel(void)
 {
+   const unsigned long long alignment = 16<<20;/* 16M */
unsigned long long total_mem;
unsigned long long crash_size, crash_base;
int ret;
@@ -537,8 +573,6 @@ static void __init reserve_crashkernel(void)
 
/* 0 means: find the address automatically */
if

[PATCH v7 27/27] x86: Merge early kernel reserve for 32bit and 64bit

2012-12-17 Thread Yinghai Lu

They are the same, so move them out of head32.c/head64.c into setup.c.

We are using memblock, and it handles overlapping reservations properly, so
we don't need to reserve some memory first to hold the location. We just
need to make sure we reserve them before we use memblock to find
free memory.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/head32.c |9 -
 arch/x86/kernel/head64.c |9 -
 arch/x86/kernel/setup.c  |9 +
 3 files changed, 9 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kernel/head32.c b/arch/x86/kernel/head32.c
index b071d41..17f7792 100644
--- a/arch/x86/kernel/head32.c
+++ b/arch/x86/kernel/head32.c
@@ -30,9 +30,6 @@ static void __init i386_default_early_setup(void)
 
 void __init i386_start_kernel(void)
 {
-   memblock_reserve(__pa_symbol(_text),
-(unsigned long)__bss_stop - (unsigned long)_text);
-
/* Call the subarch specific early setup function */
switch (boot_params.hdr.hardware_subarch) {
case X86_SUBARCH_MRST:
@@ -46,11 +43,5 @@ void __init i386_start_kernel(void)
break;
}
 
-   /*
-* At this point everything still needed from the boot loader
-* or BIOS or kernel text should be early reserved or marked not
-* RAM in e820. All other memory is free game.
-*/
-
start_kernel();
 }
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 0824b02..2370970 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -170,16 +170,7 @@ void __init x86_64_start_reservations(char *real_mode_data)
 {
copy_bootdata(__va(real_mode_data));
 
-   memblock_reserve(__pa_symbol(_text),
-(unsigned long)__bss_stop - (unsigned long)_text);
-
reserve_ebda_region();
 
-   /*
-* At this point everything still needed from the boot loader
-* or BIOS or kernel text should be early reserved or marked not
-* RAM in e820. All other memory is free game.
-*/
-
start_kernel();
 }
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 5373a71..1f07058 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -720,8 +720,17 @@ early_param("reservelow", parse_reservelow);
 
 void __init setup_arch(char **cmdline_p)
 {
+   memblock_reserve(__pa_symbol(_text),
+(unsigned long)__bss_stop - (unsigned long)_text);
+
early_reserve_initrd();
 
+   /*
+* At this point everything still needed from the boot loader
+* or BIOS or kernel text should be early reserved or marked not
+* RAM in e820. All other memory is free game.
+*/
+
 #ifdef CONFIG_X86_32
memcpy(&boot_cpu_data, &new_cpu_data, sizeof(new_cpu_data));
visws_early_detect();
-- 
1.7.10.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

[PATCH v7 22/27] mm: Add alloc_bootmem_low_pages_nopanic()

2012-12-17 Thread Yinghai Lu

We don't need to panic in some cases, such as swiotlb preallocation.

Signed-off-by: Yinghai Lu 
---
 include/linux/bootmem.h |5 +
 mm/bootmem.c|8 
 mm/nobootmem.c  |8 
 3 files changed, 21 insertions(+)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 3f778c2..3cd16ba 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -99,6 +99,9 @@ void *___alloc_bootmem_node_nopanic(pg_data_t *pgdat,
 extern void *__alloc_bootmem_low(unsigned long size,
 unsigned long align,
 unsigned long goal);
+void *__alloc_bootmem_low_nopanic(unsigned long size,
+unsigned long align,
+unsigned long goal);
 extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
@@ -132,6 +135,8 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 
 #define alloc_bootmem_low(x) \
__alloc_bootmem_low(x, SMP_CACHE_BYTES, 0)
+#define alloc_bootmem_low_pages_nopanic(x) \
+   __alloc_bootmem_low_nopanic(x, PAGE_SIZE, 0)
 #define alloc_bootmem_low_pages(x) \
__alloc_bootmem_low(x, PAGE_SIZE, 0)
 #define alloc_bootmem_low_pages_node(pgdat, x) \
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 1324cd7..315d253 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -821,6 +821,14 @@ void * __init __alloc_bootmem_low(unsigned long size, 
unsigned long align,
return ___alloc_bootmem(size, align, goal, ARCH_LOW_ADDRESS_LIMIT);
 }
 
+void * __init __alloc_bootmem_low_nopanic(unsigned long size,
+ unsigned long align,
+ unsigned long goal)
+{
+   return ___alloc_bootmem_nopanic(size, align, goal,
+   ARCH_LOW_ADDRESS_LIMIT);
+}
+
 /**
  * __alloc_bootmem_low_node - allocate low boot memory from a specific node
  * @pgdat: node to allocate from
diff --git a/mm/nobootmem.c b/mm/nobootmem.c
index 03d152a..5e07d36 100644
--- a/mm/nobootmem.c
+++ b/mm/nobootmem.c
@@ -391,6 +391,14 @@ void * __init __alloc_bootmem_low(unsigned long size, 
unsigned long align,
return ___alloc_bootmem(size, align, goal, ARCH_LOW_ADDRESS_LIMIT);
 }
 
+void * __init __alloc_bootmem_low_nopanic(unsigned long size,
+ unsigned long align,
+ unsigned long goal)
+{
+   return ___alloc_bootmem_nopanic(size, align, goal,
+   ARCH_LOW_ADDRESS_LIMIT);
+}
+
 /**
  * __alloc_bootmem_low_node - allocate low boot memory from a specific node
  * @pgdat: node to allocate from
-- 
1.7.10.4


Re: [PATCH] backlight: add lms501kf03 LCD driver

2012-12-17 Thread Jingoo Han

On Tuesday, December 18, 2012 12:51 AM, Joe Perches wrote
> On Mon, 2012-12-17 at 17:22 +0900, Jingoo Han wrote:
> > Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
> > x 480) driver uses 3-wired SPI inteface.
> 
> A trivial note:
> 
> > diff --git a/drivers/video/backlight/lms501kf03.c 
> > b/drivers/video/backlight/lms501kf03.c
> 
> []
> 
> > +static const unsigned short seq_rgb_gamma[] = {
> > +   0xc1, 0x01, 0x03, 0x07, 0x0f, 0x1a, 0x22, 0x2c, 0x33, 0x3c,
> > +   0x46, 0x4f, 0x58, 0x60, 0x69, 0x71, 0x79, 0x82, 0x89, 0x92,
> > +   0x9a, 0xa1, 0xa9, 0xb1, 0xb9, 0xc1, 0xc9, 0xcf, 0xd6, 0xde,
> > +   0xe5, 0xec, 0xf3, 0xf9, 0xff, 0xdd, 0x39, 0x07, 0x1c, 0xcb,
> > +   0xab, 0x5f, 0x49, 0x80, 0x03, 0x07, 0x0f, 0x19, 0x20, 0x2a,
> > +   0x31, 0x39, 0x42, 0x4b, 0x53, 0x5b, 0x63, 0x6b, 0x73, 0x7b,
> > +   0x83, 0x8a, 0x92, 0x9b, 0xa2, 0xaa, 0xb2, 0xba, 0xc2, 0xca,
> > +   0xd0, 0xd8, 0xe1, 0xe8, 0xf0, 0xf8, 0xff, 0xf7, 0xd8, 0xbe,
> > +   0xa7, 0x39, 0x40, 0x85, 0x8c, 0xc0, 0x04, 0x07, 0x0c, 0x17,
> > +   0x1c, 0x23, 0x2b, 0x34, 0x3b, 0x43, 0x4c, 0x54, 0x5b, 0x63,
> > +   0x6a, 0x73, 0x7a, 0x82, 0x8a, 0x91, 0x98, 0xa1, 0xa8, 0xb0,
> > +   0xb7, 0xc1, 0xc9, 0xcf, 0xd9, 0xe3, 0xea, 0xf4, 0xff, 0x00,
> > +   0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> > +   ENDDEF
> > +};
> 
> All of these ushort arrays could be uchar.
> 
> > +static int lms501kf03_spi_write(struct lms501kf03 *lcd, unsigned char 
> > address,
> > +   unsigned char command)
> > +{
> > +   int ret;
> > +
> > +   ret = lms501kf03_spi_write_byte(lcd, address, command);
> > +
> > +   return ret;
> > +}
> > +
> > +static int lms501kf03_panel_send_sequence(struct lms501kf03 *lcd,
> > +   const unsigned short *wbuf)
> > +{
> > +   int ret = 0, i = 0;
> > +
> > +   while (wbuf[i] != ENDDEF) {
> 
> Using an unsigned short where the high order byte
> is an end-of-buffer indicator is a bit space wasteful.
> 
> Perhaps a sized struct or array instead.

OK, I will use unsigned char, instead of unsigned short.

Thanks.
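
The "sized struct or array" suggestion quoted above could look roughly like
the sketch below. It is an illustration under stated assumptions, not the
driver's code: struct lcd_seq, lcd_send_sequence(), lcd_write_fn and
count_bytes() are hypothetical names, and the byte writer stands in for the
real SPI transfer. The four data values are the first entries of
seq_rgb_gamma.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical descriptor: byte data plus an explicit length, no ENDDEF. */
struct lcd_seq {
	const unsigned char *data;
	size_t len;
};

static const unsigned char seq_example_data[] = { 0xc1, 0x01, 0x03, 0x07 };

static const struct lcd_seq seq_example = {
	seq_example_data, ARRAY_SIZE(seq_example_data)
};

/* Hypothetical byte writer; returns 0 on success. */
typedef int (*lcd_write_fn)(void *ctx, unsigned char byte);

/* Send every byte of the sequence, stopping at the first error. */
static int lcd_send_sequence(const struct lcd_seq *seq,
			     lcd_write_fn write, void *ctx)
{
	size_t i;
	int ret = 0;

	assert(seq != NULL && write != NULL);

	for (i = 0; i < seq->len && !ret; i++)
		ret = write(ctx, seq->data[i]);

	return ret;
}

/* Example writer that only counts the bytes it is given. */
static int count_bytes(void *ctx, unsigned char byte)
{
	(void)byte;
	(*(size_t *)ctx)++;
	return 0;
}
```

With an explicit length, every value 0x00..0xff stays usable as data and the
arrays shrink to one byte per entry.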





[PATCH v7 14/27] x86, boot: Move lldt/ltr out of 64bit code section

2012-12-17 Thread Yinghai Lu

commit 08da5a2ca

x86_64: Early segment setup for VT

added lldt/ltr to clean more segments.

That code was put in the code64 section, but it uses a gdt that is only
loaded on the code32 path.

That breaks booting with a 64bit bootloader that does not go through
the code32 path. It enters at startup_64 directly, and it has a different
gdt.

Move those lines into code32, after the gdt is loaded.

Signed-off-by: Yinghai Lu 
Cc: Zachary Amsden 
Cc: Matt Fleming 
---
 arch/x86/boot/compressed/head_64.S |9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/x86/boot/compressed/head_64.S b/arch/x86/boot/compressed/head_64.S
index fb984c0..5c80b94 100644
--- a/arch/x86/boot/compressed/head_64.S
+++ b/arch/x86/boot/compressed/head_64.S
@@ -154,6 +154,12 @@ ENTRY(startup_32)
btsl$_EFER_LME, %eax
wrmsr
 
+   /* After gdt is loaded */
+   xorl%eax, %eax
+   lldt%ax
+   movl$0x20, %eax
+   ltr %ax
+
/*
 * Setup for the jump to 64bit mode
 *
@@ -239,9 +245,6 @@ preferred_addr:
movl%eax, %ss
movl%eax, %fs
movl%eax, %gs
-   lldt%ax
-   movl$0x20, %eax
-   ltr %ax
 
/*
 * Compute the decompressed kernel start address.  It is where
-- 
1.7.10.4


[PATCH v7 02/27] x86, mm: make pgd next calculation consistent with pud/pmd

2012-12-17 Thread Yinghai Lu

Calculate next the same way as for PUD_SIZE and PMD_SIZE, i.e.
round down and add the size.

Also remove the unneeded check of next and just pass end instead;
phys_pud_init later uses a PTRS_PER_PUD check to exit early
if end is too big.
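
As a quick check of the calculation described above, the sketch below
compares the old form (add the size, then round down) with the new form
(round down, then add the size). It is a userspace illustration only: the
PGDIR_SHIFT value of 39 and the helper names are assumptions chosen for the
example. For a power-of-two size both forms give the same boundary, so the
functional change in the patch is dropping the clamp to end.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative value; assumed here for x86-64 with 4-level paging. */
#define PGDIR_SHIFT	39
#define PGDIR_SIZE	(UINT64_C(1) << PGDIR_SHIFT)
#define PGDIR_MASK	(~(PGDIR_SIZE - 1))

/* Old form: add the size, then round down. */
static uint64_t pgd_next_old(uint64_t start)
{
	return (start + PGDIR_SIZE) & PGDIR_MASK;
}

/* New form: round down, then add the size (same style as pud/pmd). */
static uint64_t pgd_next_new(uint64_t start)
{
	uint64_t next = (start & PGDIR_MASK) + PGDIR_SIZE;

	/* Both forms agree because PGDIR_SIZE is a power of two. */
	assert(next == pgd_next_old(start));
	return next;
}
```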

Signed-off-by: Yinghai Lu 
---
 arch/x86/mm/init_64.c |6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 167439c..b1178eb 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -530,9 +530,7 @@ kernel_physical_mapping_init(unsigned long start,
pgd_t *pgd = pgd_offset_k(start);
pud_t *pud;
 
-   next = (start + PGDIR_SIZE) & PGDIR_MASK;
-   if (next > end)
-   next = end;
+   next = (start & PGDIR_MASK) + PGDIR_SIZE;
 
if (pgd_val(*pgd)) {
pud = (pud_t *)pgd_page_vaddr(*pgd);
@@ -542,7 +540,7 @@ kernel_physical_mapping_init(unsigned long start,
}
 
pud = alloc_low_page();
-   last_map_addr = phys_pud_init(pud, __pa(start), __pa(next),
+   last_map_addr = phys_pud_init(pud, __pa(start), __pa(end),
 page_size_mask);
 
spin_lock(&init_mm.page_table_lock);
-- 
1.7.10.4


[PATCH v7 07/27] x86, 64bit: Print init kernel lowmap correctly

2012-12-17 Thread Yinghai Lu

max_pfn_mapped is not set correctly until init_memory_mapping(),
so don't print its initial value for 64bit.

Also, KERNEL_IMAGE_SIZE needs to be used directly for highmap cleanup.

Signed-off-by: Yinghai Lu 
---
 arch/x86/kernel/head64.c |3 ---
 arch/x86/kernel/setup.c  |2 ++
 arch/x86/mm/init_64.c|6 +-
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index cac61dc..46e509e 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -137,9 +137,6 @@ void __init x86_64_start_kernel(char * real_mode_data)
/* clear bss before set_intr_gate with early_idt_handler */
clear_bss();
 
-   /* XXX - this is wrong... we need to build page tables from scratch */
-   max_pfn_mapped = KERNEL_IMAGE_SIZE >> PAGE_SHIFT;
-
for (i = 0; i < NUM_EXCEPTION_VECTORS; i++) {
 #ifdef CONFIG_EARLY_PRINTK
set_intr_gate(i, &early_idt_handlers[i]);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 63160c6..04797e78 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -910,8 +910,10 @@ void __init setup_arch(char **cmdline_p)
setup_bios_corruption_check();
 #endif
 
+#ifdef CONFIG_X86_32
printk(KERN_DEBUG "initial memory mapped: [mem 0x00000000-%#010lx]\n",
(max_pfn_mapped

[PATCH] usb: host: tegra: make use of PHY pointer of HCD

2012-12-17 Thread Venu Byravarasu

As a pointer to the PHY structure can be stored in struct usb_hcd,
make use of it to call the PHY APIs.

The call to usb_phy_shutdown() is moved up in tegra_ehci_remove()
to avoid dereferencing hcd after it is freed.
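
The ordering issue described above can be shown with a small mock. Nothing
here is the USB core API: struct mock_hcd, struct mock_phy and the mock_*()
helpers are hypothetical stand-ins that only model "the hcd is freed when the
last reference is dropped".

```c
#include <assert.h>
#include <stdlib.h>

struct mock_phy {
	int shut_down;
};

struct mock_hcd {
	struct mock_phy *phy;	/* reachable only while hcd is alive */
};

static void mock_phy_shutdown(struct mock_phy *phy)
{
	phy->shut_down = 1;
}

/* Drops the last reference, which frees the hcd. */
static void mock_put_hcd(struct mock_hcd *hcd)
{
	free(hcd);
}

static void mock_remove(struct mock_hcd *hcd)
{
	assert(hcd != NULL);

	/* hcd->phy is read here, while hcd is still valid ... */
	mock_phy_shutdown(hcd->phy);

	/* ... and hcd must not be touched after this call. */
	mock_put_hcd(hcd);
}
```

Calling mock_phy_shutdown(hcd->phy) after mock_put_hcd(hcd) would read freed
memory, which is the situation the reordering avoids.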

Signed-off-by: Venu Byravarasu 
---
This patch depends on the patch at
http://marc.info/?l=linux-kernel&m=135581274019690&w=2.
Without that patch applied, phy->notify_connect & phy->notify_disconnect are
set to some unknown values, which need not be NULL.
This creates a problem once the hcd->phy pointer is initialized, because
hub_port_init() in hub.c calls usb_phy_notify_connect().

Stephen,
Can you please take this patch through the tegra tree, so as to take care of
the dependencies?

 drivers/usb/host/ehci-tegra.c |   19 +++
 1 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/drivers/usb/host/ehci-tegra.c b/drivers/usb/host/ehci-tegra.c
index aca6606..5b2c48d 100644
--- a/drivers/usb/host/ehci-tegra.c
+++ b/drivers/usb/host/ehci-tegra.c
@@ -53,7 +53,7 @@ static void tegra_ehci_power_up(struct usb_hcd *hcd)
 
clk_prepare_enable(tegra->emc_clk);
clk_prepare_enable(tegra->clk);
-   usb_phy_set_suspend(&tegra->phy->u_phy, 0);
+   usb_phy_set_suspend(hcd->phy, 0);
tegra->host_resumed = 1;
 }
 
@@ -62,7 +62,7 @@ static void tegra_ehci_power_down(struct usb_hcd *hcd)
struct tegra_ehci_hcd *tegra = dev_get_drvdata(hcd->self.controller);
 
tegra->host_resumed = 0;
-   usb_phy_set_suspend(&tegra->phy->u_phy, 1);
+   usb_phy_set_suspend(hcd->phy, 1);
clk_disable_unprepare(tegra->clk);
clk_disable_unprepare(tegra->emc_clk);
 }
@@ -716,9 +716,14 @@ static int tegra_ehci_probe(struct platform_device *pdev)
goto fail_io;
}
 
-   usb_phy_init(&tegra->phy->u_phy);
+   hcd->phy = &tegra->phy->u_phy;
+   err = usb_phy_init(hcd->phy);
+   if (err) {
+   dev_err(&pdev->dev, "Failed to initialize phy\n");
+   goto fail;
+   }
 
-   err = usb_phy_set_suspend(&tegra->phy->u_phy, 0);
+   err = usb_phy_set_suspend(hcd->phy, 0);
if (err) {
dev_err(&pdev->dev, "Failed to power on the phy\n");
goto fail;
@@ -764,7 +769,7 @@ fail:
if (!IS_ERR_OR_NULL(tegra->transceiver))
otg_set_host(tegra->transceiver->otg, NULL);
 #endif
-   usb_phy_shutdown(&tegra->phy->u_phy);
+   usb_phy_shutdown(hcd->phy);
 fail_io:
clk_disable_unprepare(tegra->emc_clk);
 fail_emc_clk:
@@ -787,12 +792,10 @@ static int tegra_ehci_remove(struct platform_device *pdev)
if (!IS_ERR_OR_NULL(tegra->transceiver))
otg_set_host(tegra->transceiver->otg, NULL);
 #endif
-
+   usb_phy_shutdown(hcd->phy);
usb_remove_hcd(hcd);
usb_put_hcd(hcd);
 
-   usb_phy_shutdown(&tegra->phy->u_phy);
-
clk_disable_unprepare(tegra->clk);
 
clk_disable_unprepare(tegra->emc_clk);
-- 
1.7.0.4


Re: [PATCH 1/4] ARM: tegra30: Add support for Uart clock source divider as 15.1

2012-12-17 Thread Prashant Gaikwad


On Tuesday 18 December 2012 11:54 AM, Laxman Dewangan wrote:

On Tuesday 18 December 2012 11:44 AM, Prashant Gaikwad wrote:

On Tuesday 18 December 2012 03:13 AM, Stephen Warren wrote:

On 12/17/2012 05:08 AM, Laxman Dewangan wrote:

Tegra20 uart clock source have the 15.1 clock divider in place of

That says Tegra20, but ...


7.1. Add support for 15.1 clock divider and change the uart clock divider
flag to DIV_U151.
arch/arm/mach-tegra/clock.h   |3 +-
arch/arm/mach-tegra/tegra30_clocks.c  |   70 
++--
arch/arm/mach-tegra/tegra30_clocks_data.c |   10 ++--

... the patch only modifies Tegra30. Do both Tegra20 and Tegra30 have
this feature; should both clock drivers be updated?

BTW, Prashant is reworking the Tegra clock support to be modular, rather
than having a single monolithic "Tegra clock" type, and also moving the
code to drivers/clk. This patch will conflict signifcantly with that.
Please work with him to integrate this patch into his rework series,
either before or after his changes, and have him include the patch when
he posts his series. You'll also need to think about whether/how your
and his series depend on each-other.

... but: Is this a pure bug-fix? If so, I guess this patch should be
applied before Prashant's patches, and this patch also Cc: stable?

My clock driver rework includes this fix. Divider supports both DIVU71
and DIVU151.
UART divider is set to DIVU151.

Prashant,
I like to go this patch as first patch towards bug fixes rather than
after moving the clock. The reason is that we will pull this change in
our downstream and will be available in our K3.7 code.


Laxman,

We are not going to use ccf in our downstream kernel port to K3.7. How 
does that help pushing in upstream?
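
For readers unfamiliar with the "7.1" and "15.1" notation in this thread, the
sketch below reads it as a fixed-point divider with 7 (or 15) integer bits
and 1 fractional bit. This is an assumption for illustration, not something
stated in the patch: it assumes the register holds N = 2 * divisor - 2, so
the effective divisor is N / 2 + 1. The function name and the -1 error return
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the register value for a U<int_bits>.1 divider, assuming the
 * encoding N = 2 * divisor - 2 (an assumption, see the text above).
 * Returns N, or -1 if the requested rate needs a divisor that does not
 * fit in int_bits + 1 bits.
 */
static long div_ux1_get_divider(uint64_t parent_rate, uint64_t rate,
				unsigned int int_bits)
{
	uint64_t max = (UINT64_C(1) << (int_bits + 1)) - 1;
	uint64_t twice_div;

	assert(rate != 0);

	/* ceil(2 * parent_rate / rate): the divisor in half-steps */
	twice_div = (parent_rate * 2 + rate - 1) / rate;

	if (twice_div < 2)
		return 0;		/* cannot divide by less than 1 */
	if (twice_div - 2 > max)
		return -1;		/* does not fit in the field */

	return (long)(twice_div - 2);
}
```

Under these assumptions, a 216 MHz parent and a 153600 Hz target (9600 baud
times 16) do not fit in a 7.1 field but do fit in a 15.1 field.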








Re: [PATCH 2/4] ARM: tegra: add connection name for uart clock table

2012-12-17 Thread Laxman Dewangan


On Tuesday 18 December 2012 03:14 AM, Stephen Warren wrote:

On 12/17/2012 05:08 AM, Laxman Dewangan wrote:

Add connection name "uart-clk" for the uart clock information.

Does the UART receive more than one clock, so that it actually cares
what the clock connection name is? If not, can we just drop this patch?


I would like to keep this patch because:
- In future, I also want to get the name of the clock source and set the
parent of the uart clock properly.
- I want to switch the parent clock source dynamically between CLKM and
PLLP to achieve more power optimization, i.e. use CLKM wherever possible.


[PATCH V2] serial: tegra: add serial driver

2012-12-17 Thread Laxman Dewangan

Nvidia's Tegra has multiple UART controllers which support:
- APB DMA based controller FIFO read/write.
- End Of Data interrupt on incoming data to know whether the end
  of frame has been reached or not.
- HW controlled RTS and CTS flow control to reduce SW overhead.

Add a serial driver to use all of the above features.

Signed-off-by: Laxman Dewangan 
---
Changes from V1:
- Remove port-number parameter and use of_alias_get().
- Put the ref count for the tty.
- Rename the binding document file to serial-tegra.txt to match the
  driver name.
- Remove a mistakenly introduced line from Kconfig.
- Move platform data file to linux/platform_data. Not removing the
  platform data completely now; if it is required to remove it, it will
  be removed later along with the other Tegra drivers.
- Simplify tegra_uart_set_mctrl.
- Clear flag for CMSPAR as the driver does not support this.
- Modify uart_get_baud_rate() to use actual baudrate.
- Reorder compatibles in documentation file.
- Use of_property_read_bool for modem interrupt.
- Remove check if (pdev->dev.of_node) as it is always true.
- Drop devinit and devexit compiler option.
- Nit cleanups for moving structure to the usage area.

 .../devicetree/bindings/serial/serial-tegra.txt|   24 +
 drivers/tty/serial/Kconfig |   11 +
 drivers/tty/serial/Makefile|1 +
 drivers/tty/serial/serial-tegra.c  | 1407 
 include/linux/platform_data/serial-tegra.h |   37 +
 5 files changed, 1480 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/serial/serial-tegra.txt
 create mode 100644 drivers/tty/serial/serial-tegra.c
 create mode 100644 include/linux/platform_data/serial-tegra.h

diff --git a/Documentation/devicetree/bindings/serial/serial-tegra.txt b/Documentation/devicetree/bindings/serial/serial-tegra.txt
new file mode 100644
index 0000000..8b20248
--- /dev/null
+++ b/Documentation/devicetree/bindings/serial/serial-tegra.txt
@@ -0,0 +1,24 @@
+NVIDIA Tegra20/Tegra30 high speed (dma based) UART controller driver.
+
+Required properties:
+- compatible : should be "nvidia,tegra30-hsuart", "nvidia,tegra20-hsuart".
+- reg: Should contain UART controller registers location and length.
+- interrupts: Should contain UART controller interrupts.
+- nvidia,dma-request-selector : The Tegra DMA controller's phandle and
+  request selector for this UART controller.
+
+Optional properties:
+- nvidia,enable-modem-interrupt: Enable modem interrupts. Should be enable
+   only if all 8 lines of uart controller are pinmuxed.
+
+Example:
+
+serial@70006000 {
+   compatible = "nvidia,tegra30-hsuart", "nvidia,tegra20-hsuart";
+   reg = <0x70006000 0x40>;
+   reg-shift = <2>;
+   interrupts = <0 36 0x04>;
+   nvidia,dma-request-selector = <&apbdma 8>;
+   nvidia,enable-modem-interrupt;
+   status = "disabled";
+};
diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 59c23d0..366631c 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -269,6 +269,17 @@ config SERIAL_SIRFSOC_CONSOLE
   your boot loader about how to pass options to the kernel at
   boot time.)
 
+config SERIAL_TEGRA
+   tristate "Nvidia Tegra20/30 SoC serial controller"
+   depends on ARCH_TEGRA && TEGRA20_APB_DMA
+   select SERIAL_CORE
+   help
+ Support for the on-chip UARTs on the Nvidia Tegra seria SOCs
+ providing /dev/ttyHS0, 1, 2, 3 and 4 (note, some machines may not
+ provide all of these ports, depending on how the serial port
+ are enabled). This driver uses the APB dma to achieve higher baudrate
+ and better performance.
+
 config SERIAL_MAX3100
tristate "MAX3100 support"
depends on SPI
diff --git a/drivers/tty/serial/Makefile b/drivers/tty/serial/Makefile
index df1b998..82e4306 100644
--- a/drivers/tty/serial/Makefile
+++ b/drivers/tty/serial/Makefile
@@ -80,6 +80,7 @@ obj-$(CONFIG_SERIAL_MXS_AUART) += mxs-auart.o
 obj-$(CONFIG_SERIAL_LANTIQ)+= lantiq.o
 obj-$(CONFIG_SERIAL_XILINX_PS_UART) += xilinx_uartps.o
 obj-$(CONFIG_SERIAL_SIRFSOC) += sirfsoc_uart.o
+obj-$(CONFIG_SERIAL_TEGRA) += serial-tegra.o
 obj-$(CONFIG_SERIAL_AR933X)   += ar933x_uart.o
 obj-$(CONFIG_SERIAL_EFM32_UART) += efm32-uart.o
 obj-$(CONFIG_SERIAL_ARC)   += arc_uart.o
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
new file mode 100644
index 0000000..0b7efb3
--- /dev/null
+++ b/drivers/tty/serial/serial-tegra.c
@@ -0,0 +1,1407 @@
+/*
+ * serial_tegra.c
+ *
+ * High-speed serial driver for NVIDIA Tegra SoCs
+ *
+ * Copyright (c) 2012, NVIDIA CORPORATION.  All rights reserved.
+ *
+ * Author: Laxman Dewangan 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in

Re: [PATCH 3/4] ARM: tegra: Add OF_DEV_AUXDATA for uart driver in board dt

2012-12-17 Thread Laxman Dewangan


On Tuesday 18 December 2012 11:48 AM, Prashant Gaikwad wrote:

On Tuesday 18 December 2012 03:17 AM, Stephen Warren wrote:

On 12/17/2012 05:08 AM, Laxman Dewangan wrote:

Add OF_DEV_AUXDATA for high speed uart controller driver for
Tegra20/Tegra30 board dt files.
Set the parent clock of uart controller to PLLP.
diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -94,6 +94,11 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = {
+   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006000, "tegra-uart.0", 
NULL),
+   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006040, "tegra-uart.1", 
NULL),
+   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006200, "tegra-uart.2", 
NULL),
+   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006300, "tegra-uart.3", 
NULL),
+   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006400, "tegra-uart.4", 
NULL),

Instead, can we simply get the clocks from device tree? Prashant, how
much effort will that be once your clock patches are checked in, or is
it already part of those patches?

It is not part of rework patches, but I will send a patch for it
immediately after those patches are accepted upstream.


@@ -106,7 +111,10 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata 
= {
   static __initdata struct tegra_clk_init_table tegra_dt_clk_init_table[] = {
/* name parent  rateenabled */
{ "uarta","pll_p",  216000000,  true },
+   { "uartb","pll_p",  216000000,  false },
+   { "uartc","pll_p",  216000000,  false },
{ "uartd","pll_p",  216000000,  true },
+   { "uarte","pll_p",  216000000,  false },

Prashant's clock patches remove this table. Please work with him to work
out how to deal with that.

Laxman,

If you want I can include these entries in current tables.


No issue, you can add this in your change.

Re: [PATCH RESEND 0/6 v10] gpio: Add block GPIO

2012-12-17 Thread Wolfgang Grandegger

On 12/18/2012 06:55 AM, Jean-Christophe PLAGNIOL-VILLARD wrote:
> On 20:47 Mon 17 Dec , Wolfgang Grandegger wrote:
>> On 12/17/2012 07:02 PM, Roland Stigge wrote:
>>> On 12/17/2012 06:37 PM, Wolfgang Grandegger wrote:
/* Do synchronous data output with a single write access */
__raw_writel(~mask, pio + PIO_OWDR);
__raw_writel(mask, pio + PIO_OWER);
__raw_writel(val, pio + PIO_ODSR);

 For caching we would need a storage. Not sure if it's worth compared to
 a context switch into the kernel.
>>>
>>> Block GPIO is not only for you in userspace. ;-) You can also implement
>>> efficient n-bit bus I/O in kernel drivers, n-bit-banging. :-) So not
>>> always context switches involved.
>>
>> OK, what do you think about the following untested patch:
>>
>> From b44cad16cbbca84715dffd4cb5268497216add25 Mon Sep 17 00:00:00 2001
>> From: Wolfgang Grandegger 
>> Date: Mon, 3 Dec 2012 08:31:55 +0100
>> Subject: [PATCH 1/2] gpio: add GPIO block callback functions for AT91
>>
>> Signed-off-by: Wolfgang Grandegger 
>> ---
>>  arch/arm/mach-at91/gpio.c |   29 +
>>  1 file changed, 29 insertions(+)
>>
>> diff --git a/arch/arm/mach-at91/gpio.c b/arch/arm/mach-at91/gpio.c
>> index be42cf0..cf6bd45 100644
>> --- a/arch/arm/mach-at91/gpio.c
>> +++ b/arch/arm/mach-at91/gpio.c
>> @@ -42,13 +42,16 @@ struct at91_gpio_chip {
>>  void __iomem*regbase;   /* PIO bank virtual address */
>>  struct clk  *clock; /* associated clock */
>>  struct irq_domain   *domain;/* associated irq domain */
>> +unsigned long   mask_shadow;/* synchronous data output */
>>  };
>>  
>>  #define to_at91_gpio_chip(c) container_of(c, struct at91_gpio_chip, chip)
>>  
>>  static void at91_gpiolib_dbg_show(struct seq_file *s, struct gpio_chip 
>> *chip);
>>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
>> val);
>> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
>> mask, unsigned long val);
>>  static int at91_gpiolib_get(struct gpio_chip *chip, unsigned offset);
>> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, 
>> unsigned long mask);
>>  static int at91_gpiolib_direction_output(struct gpio_chip *chip,
>>   unsigned offset, int val);
>>  static int at91_gpiolib_direction_input(struct gpio_chip *chip,
>> @@ -62,7 +65,9 @@ static int at91_gpiolib_to_irq(struct gpio_chip *chip, 
>> unsigned offset);
>>  .direction_input  = at91_gpiolib_direction_input, \
>>  .direction_output = at91_gpiolib_direction_output, \
>>  .get  = at91_gpiolib_get,   \
>> +.get_block= at91_gpiolib_get_block, \
>>  .set  = at91_gpiolib_set,   \
>> +.set_block= at91_gpiolib_set_block, \
>>  .dbg_show = at91_gpiolib_dbg_show,  \
>>  .to_irq   = at91_gpiolib_to_irq,\
>>  .ngpio= nr_gpio,\
>> @@ -896,6 +901,16 @@ static int at91_gpiolib_get(struct gpio_chip *chip, 
>> unsigned offset)
>>  return (pdsr & mask) != 0;
>>  }
>>  
>> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, 
>> unsigned long mask)
>> +{
>> +struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
>> +void __iomem *pio = at91_gpio->regbase;
>> +u32 pdsr;
>> +
>> +pdsr = __raw_readl(pio + PIO_PDSR);
>> +return pdsr & mask;
>> +}
>> +
>>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
>> val)
>>  {
>>  struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
>> @@ -905,6 +920,20 @@ static void at91_gpiolib_set(struct gpio_chip *chip, 
>> unsigned offset, int val)
>>  __raw_writel(mask, pio + (val ? PIO_SODR : PIO_CODR));
>>  }
>>  
>> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
>> mask, unsigned long val)
>> +{
>> +struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
>> +void __iomem *pio = at91_gpio->regbase;
>> +
>> +/* Do synchronous data output with a single write access */
>> +if (mask != at91_gpio->mask_shadow) {
>> +at91_gpio->mask_shadow = mask;
>> +__raw_writel(~mask, pio + PIO_OWDR);
>> +__raw_writel(mask, pio + PIO_OWER);
>> +}
>> +__raw_writel(val, pio + PIO_ODSR);
>> +}
> this driver is only for old at91 platforms; if you touch at91 you need to
> update the pinctrl too

Well, the patch is for the hardware I have at hand and can test. There
are many other GPIO hardware interfaces which could be enhanced with
block GPIO. Roland only did it for the interfaces in drivers/gpio. Also,
I think an ACK for this patch series would be nice before we continue.

Wolfgang.


[RFC v4 1/3] Introduce new system call mvolatile

2012-12-17 Thread Minchan Kim

This patch adds the new system calls m[no]volatile. If users ask for an
is_volatile system call, it could be added, too.

The reason why I introduced new system calls instead of using madvise
is that m[no]volatile's vma handling is totally different from madvise's
vma handling.

1) m[no]volatile should succeed even if the range includes unmapped or
   non-volatile ranges. It just skips such ranges instead of stopping and
   returning an error when it encounters an invalid range.
   This is convenient for the user, who doesn't have to make several
   calls on small ranges.
   - Suggested by John Stultz

2) The propagation of the purged state between vmas should be atomic
   between m[no]volatile and reclaim. For that, we would need to tweak
   vma_merge/split_vma's anon_vma handling. It's a very common operation
   and I don't want to add unnecessary overhead and code there if
   possible.

3) The purged state of a volatile range should be propagated out to the
   user by the mnovolatile operation, and it should be atomic with
   reclaim, too.

To meet the above requirements, I introduced the new system calls
m[no]volatile. They don't change vma_merge/split; instead they repair the
vmas after the vma operation.

So the semantics of mvolatile(start, len) are as follows.

1) It makes range(start, len) volatile even if the range includes
   unmapped areas, special mappings and mlocked areas, which are just
   skipped.
   Hugepage and KSM are not supported yet. - TODO
   Returns -EINVAL if the range doesn't include a valid vma at all.
   Returns -ENOMEM, interrupting the range operation, if there is not
   enough memory to merge/split vmas. In this case, part of the range
   would be volatile and the rest not, so the user has to call mvolatile
   again after cancelling the whole range with mnovolatile.
   Returns 0 if the range consists only of proper vmas.
   Returns 1 if part of the range includes hole/huge/ksm/mlock/special
   areas.

2) If the user calls mvolatile on a range which is already a volatile VMA,
   even one in purged state, the VOLATILE attribute remains but the purged
   state is reset. I expect some users want to split a volatile vma into
   smaller ranges. Although they can do that with mnovolatile(whole range)
   followed by several calls of mvolatile(smaller range), this function
   can avoid the mnovolatile call if they don't care about the purged
   state. I'm not sure we really need this function, so I'd like to hear
   opinions. Unfortunately, the current implementation doesn't split the
   volatile VMA at the new range in this case. I forgot to implement it
   in this version but decided to send it anyway to hear opinions, because
   implementing it is rather trivial if we decide to.
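
The return values listed for mvolatile can be restated as a small user-space
model. This is a sketch under assumptions, not code from the patch: enum
seg_kind and model_mvolatile_result() are hypothetical names, a range is
simplified to an array of segments, and -ENOMEM is left out because it
depends on allocator state rather than on the layout of the range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of the pieces a range can be made of. */
enum seg_kind {
	SEG_ANON,    /* ordinary anonymous vma: can become volatile */
	SEG_HOLE,    /* unmapped */
	SEG_HUGE,
	SEG_KSM,
	SEG_MLOCK,
	SEG_SPECIAL,
};

/* Model of the documented mvolatile() result for a range of n segments. */
static int model_mvolatile_result(const enum seg_kind *segs, size_t n)
{
	size_t i, usable = 0, skipped = 0;

	assert(segs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (segs[i] == SEG_ANON)
			usable++;
		else
			skipped++;
	}
	if (usable == 0)
		return -EINVAL;  /* no valid vma in the range at all */
	return skipped ? 1 : 0;  /* 1: something was skipped, 0: all proper */
}
```

A caller would treat both 0 and 1 as success and only use the 1 to learn
that part of the range was left untouched.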

The semantics of mnovolatile(start, len) are as follows.

1) It makes range(start, len) non-volatile even if the range includes
   unmapped areas, special mappings and non-volatile ranges, which are
   just skipped.

2) If the range is purged, it will return 1 regardless of whether it
   includes invalid ranges.

3) It returns -ENOMEM if the system doesn't have enough memory for the vma
   operation.

4) It returns -EINVAL if the range doesn't include a valid vma at all.

5) If the user tries to access a purged range without calling mnovolatile
   first, it gets SIGBUS, which shows up in the next patch.

Cc: Michael Kerrisk 
Cc: Arun Sharma 
Cc: san...@google.com
Cc: Paul Turner 
CC: David Rientjes 
Cc: John Stultz 
Cc: Andrew Morton 
Cc: Christoph Lameter 
Cc: Android Kernel Team 
Cc: Robert Love 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Cc: Dave Hansen 
Cc: Rik van Riel 
Cc: Dave Chinner 
Cc: Neil Brown 
Cc: Mike Hommey 
Cc: Taras Glek 
Cc: KOSAKI Motohiro 
Cc: Christoph Lameter 
Cc: KAMEZAWA Hiroyuki 
Signed-off-by: Minchan Kim 
---
 arch/x86/syscalls/syscall_64.tbl |3 +-
 include/linux/mm.h   |1 +
 include/linux/mm_types.h |2 +
 include/linux/syscalls.h |2 +
 mm/Makefile  |4 +-
 mm/huge_memory.c |9 +-
 mm/ksm.c |3 +-
 mm/mlock.c   |5 +-
 mm/mmap.c|2 +-
 mm/mvolatile.c   |  396 ++
 mm/rmap.c|2 +
 11 files changed, 419 insertions(+), 10 deletions(-)
 create mode 100644 mm/mvolatile.c

diff --git a/arch/x86/syscalls/syscall_64.tbl b/arch/x86/syscalls/syscall_64.tbl
index a582bfe..7da9c4a 100644
--- a/arch/x86/syscalls/syscall_64.tbl
+++ b/arch/x86/syscalls/syscall_64.tbl
@@ -319,7 +319,8 @@
 31064  process_vm_readvsys_process_vm_readv
 31164  process_vm_writev   sys_process_vm_writev
 312common  kcmpsys_kcmp
-
+313common  mvolatile   sys_mvolatile
+314common  mnovolatile sys_mnovolatile
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
 # for native 64-bit operation.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bcaab4e..94742c4 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -87,6 +87,7 @@ extern unsigned int kobjsize(const void *objp);
 #define VM_PFNMAP  0x0400  /* Page-ranges managed

[RFC v4 3/3] add PGVOLATILE vmstat count

2012-12-17 Thread Minchan Kim

This patch adds a pgvolatile vmstat counter so the admin can see how many
volatile pages have been discarded by the VM so far. It could be a good
indicator of the patch's effect during testing, but I'm still not sure we
need it in real practice. Will rethink it.

Cc: Michael Kerrisk 
Cc: Arun Sharma 
Cc: san...@google.com
Cc: Paul Turner 
CC: David Rientjes 
Cc: John Stultz 
Cc: Andrew Morton 
Cc: Christoph Lameter 
Cc: Android Kernel Team 
Cc: Robert Love 
Cc: Mel Gorman 
Cc: Hugh Dickins 
Cc: Dave Hansen 
Cc: Rik van Riel 
Cc: Dave Chinner 
Cc: Neil Brown 
Cc: Mike Hommey 
Cc: Taras Glek 
Cc: KOSAKI Motohiro 
Cc: Christoph Lameter 
Cc: KAMEZAWA Hiroyuki 
Signed-off-by: Minchan Kim 
---
 include/linux/vm_event_item.h |2 +-
 mm/vmscan.c   |1 +
 mm/vmstat.c   |1 +
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 3d31145..f83c3d2 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -23,7 +23,7 @@
 
 enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
FOR_ALL_ZONES(PGALLOC),
-   PGFREE, PGACTIVATE, PGDEACTIVATE,
+   PGFREE, PGVOLATILE, PGACTIVATE, PGDEACTIVATE,
PGFAULT, PGMAJFAULT,
FOR_ALL_ZONES(PGREFILL),
FOR_ALL_ZONES(PGSTEAL_KSWAPD),
diff --git a/mm/vmscan.c b/mm/vmscan.c
index cfe95d3..1ec7345 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -794,6 +794,7 @@ static unsigned long shrink_page_list(struct list_head 
*page_list,
if (page_mapped(page) && mapping) {
switch (try_to_unmap(page, ttu_flags)) {
case SWAP_DISCARD:
+   count_vm_event(PGVOLATILE);
goto discard_page;
case SWAP_FAIL:
goto activate_locked;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index c737057..9fd8ead 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -747,6 +747,7 @@ const char * const vmstat_text[] = {
TEXTS_FOR_ZONES("pgalloc")
 
"pgfree",
+   "pgvolatile",
"pgactivate",
"pgdeactivate",
 
-- 
1.7.9.5
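
On a kernel carrying this RFC, the new counter would presumably show up as a
"pgvolatile" line in /proc/vmstat next to "pgfree". A minimal sketch of how
a test tool could pick such a value out of vmstat-style text follows;
vmstat_lookup() is a hypothetical helper and works on a string rather than
on the file itself.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up "key value" in text laid out like /proc/vmstat (one pair per
 * line). Returns the value, or -1 if the key is not present.
 */
static long vmstat_lookup(const char *text, const char *key)
{
	size_t klen;
	const char *line = text;

	assert(text != NULL && key != NULL);
	klen = strlen(key);
	while (*line) {
		const char *eol = strchr(line, '\n');

		/* Match the whole key, followed by the separating space. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ')
			return strtol(line + klen + 1, NULL, 10);
		if (!eol)
			break;
		line = eol + 1;
	}
	return -1;
}
```

Sampling the value before and after a test run and taking the difference
gives the number of volatile pages discarded during that run.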


[RFC v4 0/3] Support volatile for anonymous range

2012-12-17 Thread Minchan Kim

This is still RFC because we need more input from user-space
people and more discussion about the interface/reclaim policy of
volatile pages, and I want to expand this concept to tmpfs volatile
ranges if it is possible without a big performance drop for anonymous
volatile ranges (Let's define our terms. anon volatile vs. tmpfs
volatile? John?)

NOTE: I didn't consider THP/KSM, so for testing you should disable them.

I hope for more input from user-space allocator people, and that they
test the patch with their allocators, because it might need a design
change in arena management to get real value.

Changelog from v4

 * Add new system call mvolatile/mnovolatile
 * Add sigbus when user try to access volatile range
 * Rebased on v3.7
 * Applied bug fix from John Stultz, Thanks!

Changelog from v3

 * Removing madvise(addr, length, MADV_NOVOLATILE).
 * add vmstat about the number of discarded volatile pages
 * discard volatile pages without promotion in reclaim path

This is based on v3.7

- What's mvolatile(addr, length)?

  It's a hint that the user delivers to the kernel so the kernel can
  *discard* pages in the range at any time.

- What happens if the user accesses a page (i.e., virtual address)
  discarded by the kernel?

  The user can encounter SIGBUS.

- What should the user do to avoid SIGBUS?

  He should call mnovolatile(addr, length) before accessing a range on
  which mvolatile was called.

- What happens if the user accesses a page (i.e., virtual address) that
  wasn't discarded by the kernel?

  The user can see the old data without a page fault.

- What's different from madvise(DONTNEED)?

  System call semantics

  DONTNEED makes sure the user always sees zero-fill pages after he
  calls madvise, while after mvolatile he can see old data or encounter
  SIGBUS.

  Internal implementation

  madvise(DONTNEED) has to zap all mapped pages in the range, so its
  overhead increases linearly with the number of mapped pages.
  Moreover, if the user then writes to the zapped pages, page fault +
  page allocation + memset have to happen.

  mvolatile just sets a flag on the range (i.e., VMA) instead of
  zapping all the ptes in the vma, so it doesn't touch the ptes at all.
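
  The DONTNEED half of this comparison can be observed on any Linux system
  with the existing madvise(2) call. The sketch below assumes a private
  anonymous mapping; byte_after_dontneed() is a hypothetical helper name.
  The mvolatile side cannot be shown this way, since it needs a kernel with
  this RFC applied.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Fill one private anonymous page with 0xAA, drop it with MADV_DONTNEED
 * and return the first byte read back afterwards; -1 on any error.
 */
static int byte_after_dontneed(void)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	int value;

	if (page <= 0)
		return -1;
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	memset(p, 0xAA, (size_t)page);
	assert(p[0] == 0xAA);

	if (madvise(p, (size_t)page, MADV_DONTNEED) != 0) {
		munmap(p, (size_t)page);
		return -1;
	}
	value = p[0]; /* this read faults in a fresh zero-filled page */
	munmap(p, (size_t)page);
	return value;
}
```

  The read after madvise is where the page fault, page allocation and
  zeroing mentioned above are paid.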

- What's the benefit compared to DONTNEED?

  1. The system call overhead is smaller because mvolatile just sets a
     flag on the VMA instead of zapping all the pages in the range.

  2. It has a chance to eliminate overheads (e.g., zapping ptes + page
     fault + page allocation + memset(PAGE_SIZE)) if memory pressure
     isn't severe.

  3. It has the potential to zap all ptes and free the pages if memory
     pressure is severe, so the reclaim overhead could disappear - TODO

- Isn't there any drawback?

  madvise(DONTNEED) doesn't need exclusive mmap_sem, so concurrent page
  faults of other threads are allowed. But m[no]volatile needs exclusive
  mmap_sem, so other threads would be blocked if they try to access
  not-yet-mapped pages. That's why I designed m[no]volatile so that its
  overhead is as small as possible.

  It could suffer from increased max rss usage, because
  madvise(DONTNEED) deallocates pages instantly when the system call is
  issued, while mvolatile delays that until memory pressure happens. So
  if the max rss increase makes memory pressure severe, the system would
  suffer. First of all, the allocator needs some balancing logic for
  that, or the kernel might handle it by zapping pages even though the
  user called mvolatile, if memory pressure is severe.
  The problem is how we know that memory pressure is severe.
  One solution is to check whether kswapd is active. Another solution is
  Anton's mempressure, so the allocator can handle it.

- What's the target?

  Firstly, user-space allocators like ptmalloc and tcmalloc, or the heap
  management of virtual machines like Dalvik. It also comes in handy for
  embedded systems which don't have a swap device and so can't reclaim
  anonymous pages. By discarding instead of swapping out, it can be used
  on non-swap systems. For that, we have to age the anon lru list even
  though we don't have swap, because I don't want to discard volatile
  pages with top priority when memory pressure happens: volatile in this
  patch means "We don't need to swap out because the user can handle the
  situation where data disappears suddenly", NOT "They are useless so
  hurry up and reclaim them". So I want to apply the same aging rule as
  for normal pages to them.

  Anonymous page background aging on non-swap systems would be a
  trade-off for getting a good feature. We even did that two years ago,
  until merge [1], and I believe the gain of this patch will beat the
  loss from the anon lru aging overhead once all allocators start to use
  madvise.
  (This patch doesn't include background aging for non-swap systems, but
  it's trivial if we decide to add it.)

  As another choice, we can zap the range like madvise(DONTNEED) when
  mvolatile is called if we don't have swap space.

- Stupid performance test
  I attach test program/script which are utter crap and I don't expect
  current smart allocator never have done it so we need more

[GIT PULL] ARM: arm-soc fixes for 3.8

2012-12-17 Thread Olof Johansson

Hi Linus,


The following changes since commit fa4c95bfdb85d568ae327d57aa33a4f55bab79c4:

  Merge branch 'for_linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs (2012-12-17 
08:27:59 -0800)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc.git tags/fixes

for you to fetch changes up to 4d1839138220e7e35bf9e31c854e4e0196dea7a1:

  Merge tag 'ep93xx-fixes-for-3.8' of git://github.com/RyanMallon/linux-ep93xx 
into fixes (2012-12-17 18:42:30 -0800)



ARM: arm-soc fixes for 3.8

This is a batch of fixes for arm-soc platforms, most of it is for OMAP
but there are others too (i.MX, Tegra, ep93xx). Fixes warnings, some
broken platforms and drivers, etc. A bit all over the map really.


Fabio Estevam (2):
  ARM: dts: mx27: Fix the AIPI bus for FEC
  ARM: imx: Move platform-mx2-emma to arch/arm/mach-imx/devices

Florian Fainelli (1):
  ARM: ep93xx: properly wait for UART FIFO to be empty

Hiroshi Doyu (1):
  amba: tegra-ahb: Fix warning w/o PM_SLEEP

Javier Martinez Canillas (1):
  ARM: OMAP2+: common: remove use of vram

Jon Hunter (11):
  ARM: OMAP2+: Fix realtime_counter_init warning in timer.c
  ARM: AM335x: Fix warning in timer.c
  ARM: OMAP2420: Fix ethernet support for OMAP2420 H4
  ARM: OMAP: Remove debug-devices.c
  ARM: dts: OMAP2420: Correct H4 board memory size
  ARM: dts: Add build target for omap4-panda-a4
  ARM: OMAP4: Update timer clock aliases
  ARM: OMAP4: Add function table for non-M4X dplls
  ARM: OMAP4: Enhance support for DPLLs with 4X multiplier
  ARM: OMAP4460: Workaround ABE DPLL failing to turn-on
  ARM: OMAP4: Fix EMU clock domain always on

Linus Walleij (2):
  ARM: u300: delete custom pin hog code
  ARM: ux500: fix missing include

Maxime Ripard (1):
  ARM: sunxi: Change device tree naming scheme for sunxi

Oleg Matcovschi (1):
  OMAP2+: mux: Fixed gpio mux mode analysis

Olof Johansson (7):
  ARM: exynos: Fix warning due to missing 'inline' in stub
  ARM: davinci: fix build break due to missing include
  Merge tag 'tegra-for-3.8-fixes-for-rc1' of 
git://git.kernel.org/.../swarren/linux-tegra into fixes
  Merge tag 'omap-fixes-a-for-v3.8-window' of 
git://git.kernel.org/.../pjw/omap-pending into fixes
  Merge tag 'omap-for-v3.8/fixes-for-merge-window-v4-signed' of 
git://git.kernel.org/.../tmlind/linux-omap into fixes
  Merge tag 'imx-fixes-rc' of git://git.pengutronix.de/git/imx/linux-2.6 
into fixes
  Merge tag 'ep93xx-fixes-for-3.8' of 
git://github.com/RyanMallon/linux-ep93xx into fixes

Paul Walmsley (3):
  ARM: OMAP4: clock data: div_iva_hs_clk is a power-of-two divider
  ARM: OMAP4: clock data: DPLLs are missing bypass clocks in their parent 
lists
  ARM: OMAP3/4: cpuidle: fix sparse and checkpatch warnings

Roger Quadros (1):
  mfd: omap-usb-host: get rid of cpu_is_omap..() macros

Sascha Hauer (1):
  ARM i.MX51 clock: Fix regression since enabling MIPI/HSP clocks

Sivaram Nair (2):
  ARM: tegra: select correct parent clk for pll_p
  ARM: tegra: fix comment in dsib clk set_parent

Tomi Valkeinen (1):
  OMAP: board-files: fix i2c_bus for tfp410

Tony Lindgren (3):
  Merge branch 'fixes-timer-build' of git://github.com/jonhunter/linux into 
omap-for-v3.8/fixes-for-merge-window
  ARM: OMAP: Move plat/omap-serial.h to 
include/linux/platform_data/serial-omap.h
  Merge branch 'omap-for-v3.8/fixes-for-merge-window' into 
omap-for-v3.8/fixes-for-merge-window-v2

Vaibhav Hiremath (1):
  ARM: OMAP2+: Fix sparse warnings in timer.c

 arch/arm/boot/dts/Makefile |  5 +-
 arch/arm/boot/dts/imx27-3ds.dts|  8 +-
 arch/arm/boot/dts/imx27-phytec-phycore.dts | 13 +--
 arch/arm/boot/dts/imx27.dtsi   | 11 ++-
 arch/arm/boot/dts/omap2420-h4.dts  |  2 +-
 arch/arm/boot/dts/sun4i-cubieboard.dts |  4 +-
 arch/arm/boot/dts/sun5i-olinuxino.dts  |  4 +-
 arch/arm/mach-davinci/board-da850-evm.c|  1 +
 arch/arm/mach-ep93xx/include/mach/uncompress.h | 10 +--
 arch/arm/mach-exynos/common.h  |  2 +-
 arch/arm/mach-imx/clk-imx51-imx53.c| 16 
 .../devices/platform-mx2-emma.c|  4 +-
 arch/arm/mach-omap2/Kconfig|  3 +-
 arch/arm/mach-omap2/board-3430sdp.c|  1 +
 arch/arm/mach-omap2/board-am3517evm.c  |  1 +
 arch/arm/mach-omap2/board-cm-t35.c |  1 +
 arch/arm/mach-omap2/board-devkit8000.c |  1 +
 arch/arm/mach-omap2/board-h4.c | 83 +--
 arch/arm/mach-omap2/board-omap3evm.c   |  1 +
 arch/arm/mach-omap2/board-omap3stalker.c   |  1 +

Re: [PATCH RESEND 0/6 v10] gpio: Add block GPIO

2012-12-17 Thread Wolfgang Grandegger

On 12/17/2012 10:33 PM, Roland Stigge wrote:
> On 17/12/12 20:47, Wolfgang Grandegger wrote:
>> On 12/17/2012 07:02 PM, Roland Stigge wrote:
>>> On 12/17/2012 06:37 PM, Wolfgang Grandegger wrote:
>>>> /* Do synchronous data output with a single write access */
>>>> __raw_writel(~mask, pio + PIO_OWDR);
>>>> __raw_writel(mask, pio + PIO_OWER);
>>>> __raw_writel(val, pio + PIO_ODSR);
>>>>
>>>> For caching we would need a storage. Not sure if it's worth compared to
>>>> a context switch into the kernel.
>>>
>>> Block GPIO is not only for you in userspace. ;-) You can also implement
>>> efficient n-bit bus I/O in kernel drivers, n-bit-banging. :-) So not
>>> always context switches involved.
>>
>> OK, what do you think about the following untested patch:
> 
> Looks good!
> 
> Why "untested"? ;-)

Because I didn't have a chance to test it yet. Will do tomorrow.

Wolfgang.


Re: [PATCH 1/4] ODROID-X: dts: Add board dts file for ODROID-X

2012-12-17 Thread Jean-Christophe PLAGNIOL-VILLARD

On 22:14 Mon 17 Dec , Olof Johansson wrote:
> On Mon, Dec 17, 2012 at 10:00 PM, Jean-Christophe PLAGNIOL-VILLARD
>  wrote:
> > On 17:56 Mon 17 Dec , Olof Johansson wrote:
> >> On Mon, Dec 17, 2012 at 11:55 AM, Dongjin Kim  wrote:
> >> > Add initial dtb file for Hardkernel's ODROID-X board based on EXYNOS4412 
> >> > SoC.
> >> >
> >> > Signed-off-by: Dongjin Kim 
> >> > ---
> >> >  arch/arm/boot/dts/Makefile   |1 +
> >> >  arch/arm/boot/dts/exynos4412-odroidx.dts |   52 
> >> > ++
> >> >  2 files changed, 53 insertions(+)
> >> >  create mode 100644 arch/arm/boot/dts/exynos4412-odroidx.dts
> >> >
> >> > diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
> >> > index ca6fb8e..3355af9 100644
> >> > --- a/arch/arm/boot/dts/Makefile
> >> > +++ b/arch/arm/boot/dts/Makefile
> >> > @@ -45,6 +45,7 @@ dtb-$(CONFIG_ARCH_EXYNOS) += exynos4210-origen.dtb \
> >> > exynos5250-smdk5250.dtb \
> >> > exynos5440-ssdk5440.dtb \
> >> > exynos4412-smdk4412.dtb \
> >> > +   exynos4412-odroidx.dtb \
> >>
> >> Please add them alphabetically, so before smdk.
> > we need to drop the \ stuff, otherwise it will end in merge conflicts:
> > if you add 2 dtbs at the end you end up with 2 patches that touch the
> > same previous line
> 
> ..which is why the dts files should be added alphabetically instead of
> just appended to the list.
we need to drop this \

and use this syntax instead, and keep it ordered:
dtb-$() +=

> 
> >> > diff --git a/arch/arm/boot/dts/exynos4412-odroidx.dts 
> >> > b/arch/arm/boot/dts/exynos4412-odroidx.dts
> >> > new file mode 100644
> >> > index 000..786ddd7
> >> > --- /dev/null
> >> > +++ b/arch/arm/boot/dts/exynos4412-odroidx.dts
> >> > @@ -0,0 +1,52 @@
> >> > +/*
> >> > + * Hardkernel's Exynos4412 based ODROID-X board device tree source
> >> > + *
> >> > + * Copyright (c) 2012-2013 Dongjin Kim 
> >>
> >> Are you from the future?
> >>
> >> > + *
> >> > + * Device tree source file for Hardkernel's ODROID-X board which is 
> >> > based on
> >> > + * Samsung's Exynos4412 SoC.
> >> > + *
> >> > + * This program is free software; you can redistribute it and/or modify
> >> > + * it under the terms of the GNU General Public License version 2 as
> >> > + * published by the Free Software Foundation.
> >> > +*/
> >> > +
> >> > +/dts-v1/;
> >> > +/include/ "exynos4412.dtsi"
> >> > +
> >> > +/ {
> >> > +   model = "Hardkernel ODROID-X board based on Exynos4412";
> >> > +   compatible = "samsung,exynos4412";
> >>
> >> It should have a more specific compatible value first, i.e.
> >> "hardkernel,odroid-x" or similar.
> >>
> >>
> >> > +   memory {
> >> > +   reg = <0x4000 0x4000>;
> >> > +   };
> >> > +
> >> > +   chosen {
> >> > +   bootargs ="root=/dev/mmcblk0p3 rw console=ttySAC1,115200 
> >> > init=/sbin/init delay=2";
> >>
> >> Bootargs should be passed in from u-boot, don't specify them in the
> >> static device tree.
> >
> > why not? we can choose to have a default cmdline and even use it as a
> > complement of the bootloader one
> >
> > it's up to the dts maintainer to choose
> 
> The chance of having a valid generic command line that will work for
> everyone with that hardware is fairly small, especially on more
> generic systems that might have a regular distro installed on them.

this does not hurt anyone; 99% of the people will overwrite it, and you can
just see it as a cmdline example

Best Regards,
J.

[PATCH] usb: phy: tegra: Using devm API for memory allocation

2012-12-17 Thread Venu Byravarasu

Use devm_kzalloc to allocate the memory needed for the PHY
pointer, which makes the kfree calls on the PHY pointer unnecessary.

Signed-off-by: Venu Byravarasu 
---
 drivers/usb/phy/tegra_usb_phy.c |4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/usb/phy/tegra_usb_phy.c b/drivers/usb/phy/tegra_usb_phy.c
index 9d13c81..0b99e1f 100644
--- a/drivers/usb/phy/tegra_usb_phy.c
+++ b/drivers/usb/phy/tegra_usb_phy.c
@@ -704,7 +704,6 @@ static void tegra_usb_phy_close(struct usb_phy *x)
utmip_pad_close(phy);
clk_disable_unprepare(phy->pll_u);
clk_put(phy->pll_u);
-   kfree(phy);
 }
 
 static int tegra_usb_phy_power_on(struct tegra_usb_phy *phy)
@@ -740,7 +739,7 @@ struct tegra_usb_phy *tegra_usb_phy_open(struct device 
*dev, int instance,
int i;
int err;
 
-   phy = kmalloc(sizeof(struct tegra_usb_phy), GFP_KERNEL);
+   phy = devm_kzalloc(dev, sizeof(struct tegra_usb_phy), GFP_KERNEL);
if (!phy)
return ERR_PTR(-ENOMEM);
 
@@ -791,7 +790,6 @@ err1:
clk_disable_unprepare(phy->pll_u);
clk_put(phy->pll_u);
 err0:
-   kfree(phy);
return ERR_PTR(err);
 }
 EXPORT_SYMBOL_GPL(tegra_usb_phy_open);
-- 
1.7.0.4
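
The kfree calls can go because devm_kzalloc ties the allocation to the
struct device: the memory is zeroed when handed out and freed automatically
when the device is released. The idea can be sketched in user space as
below. This is an analogy only; struct toy_device, toy_devm_zalloc() and
toy_device_release() are made-up names and do not reflect the kernel's
devres implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* One tracked allocation owned by a device. */
struct toy_devres {
	struct toy_devres *next;
	void *data;
};

struct toy_device {
	struct toy_devres *resources;
};

/* Zeroed allocation owned by dev, in the spirit of devm_kzalloc(). */
static void *toy_devm_zalloc(struct toy_device *dev, size_t size)
{
	struct toy_devres *res;

	assert(dev != NULL);
	res = malloc(sizeof(*res));
	if (!res)
		return NULL;
	res->data = calloc(1, size);
	if (!res->data) {
		free(res);
		return NULL;
	}
	res->next = dev->resources;
	dev->resources = res;
	return res->data;
}

/* Called once when the device goes away; returns the number of blocks freed. */
static unsigned int toy_device_release(struct toy_device *dev)
{
	unsigned int freed = 0;

	while (dev->resources) {
		struct toy_devres *res = dev->resources;

		dev->resources = res->next;
		free(res->data);
		free(res);
		freed++;
	}
	return freed;
}
```

Error paths in the caller then only need to return; nothing has to be freed
by hand, which is what the removed kfree(phy) lines reflect.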


Re: [PATCH RESEND 0/6 v10] gpio: Add block GPIO

2012-12-17 Thread Jean-Christophe PLAGNIOL-VILLARD

On 20:47 Mon 17 Dec , Wolfgang Grandegger wrote:
> On 12/17/2012 07:02 PM, Roland Stigge wrote:
> > On 12/17/2012 06:37 PM, Wolfgang Grandegger wrote:
> >>/* Do synchronous data output with a single write access */
> >>__raw_writel(~mask, pio + PIO_OWDR);
> >>__raw_writel(mask, pio + PIO_OWER);
> >>__raw_writel(val, pio + PIO_ODSR);
> >>
> >> For caching we would need a storage. Not sure if it's worth compared to
> >> a context switch into the kernel.
> > 
> > Block GPIO is not only for you in userspace. ;-) You can also implement
> > efficient n-bit bus I/O in kernel drivers, n-bit-banging. :-) So not
> > always context switches involved.
> 
> OK, what do you think about the following untested patch:
> 
> From b44cad16cbbca84715dffd4cb5268497216add25 Mon Sep 17 00:00:00 2001
> From: Wolfgang Grandegger 
> Date: Mon, 3 Dec 2012 08:31:55 +0100
> Subject: [PATCH 1/2] gpio: add GPIO block callback functions for AT91
> 
> Signed-off-by: Wolfgang Grandegger 
> ---
>  arch/arm/mach-at91/gpio.c |   29 +
>  1 file changed, 29 insertions(+)
> 
> diff --git a/arch/arm/mach-at91/gpio.c b/arch/arm/mach-at91/gpio.c
> index be42cf0..cf6bd45 100644
> --- a/arch/arm/mach-at91/gpio.c
> +++ b/arch/arm/mach-at91/gpio.c
> @@ -42,13 +42,16 @@ struct at91_gpio_chip {
>   void __iomem*regbase;   /* PIO bank virtual address */
>   struct clk  *clock; /* associated clock */
>   struct irq_domain   *domain;/* associated irq domain */
> + unsigned long   mask_shadow;/* synchronous data output */
>  };
>  
>  #define to_at91_gpio_chip(c) container_of(c, struct at91_gpio_chip, chip)
>  
>  static void at91_gpiolib_dbg_show(struct seq_file *s, struct gpio_chip 
> *chip);
>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
> val);
> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
> mask, unsigned long val);
>  static int at91_gpiolib_get(struct gpio_chip *chip, unsigned offset);
> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, unsigned 
> long mask);
>  static int at91_gpiolib_direction_output(struct gpio_chip *chip,
>unsigned offset, int val);
>  static int at91_gpiolib_direction_input(struct gpio_chip *chip,
> @@ -62,7 +65,9 @@ static int at91_gpiolib_to_irq(struct gpio_chip *chip, 
> unsigned offset);
>   .direction_input  = at91_gpiolib_direction_input, \
>   .direction_output = at91_gpiolib_direction_output, \
>   .get  = at91_gpiolib_get,   \
> + .get_block= at91_gpiolib_get_block, \
>   .set  = at91_gpiolib_set,   \
> + .set_block= at91_gpiolib_set_block, \
>   .dbg_show = at91_gpiolib_dbg_show,  \
>   .to_irq   = at91_gpiolib_to_irq,\
>   .ngpio= nr_gpio,\
> @@ -896,6 +901,16 @@ static int at91_gpiolib_get(struct gpio_chip *chip, 
> unsigned offset)
>   return (pdsr & mask) != 0;
>  }
>  
> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, unsigned 
> long mask)
> +{
> + struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> + void __iomem *pio = at91_gpio->regbase;
> + u32 pdsr;
> +
> + pdsr = __raw_readl(pio + PIO_PDSR);
> + return pdsr & mask;
> +}
> +
>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
> val)
>  {
>   struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> @@ -905,6 +920,20 @@ static void at91_gpiolib_set(struct gpio_chip *chip, 
> unsigned offset, int val)
>   __raw_writel(mask, pio + (val ? PIO_SODR : PIO_CODR));
>  }
>  
> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
> mask, unsigned long val)
> +{
> + struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> + void __iomem *pio = at91_gpio->regbase;
> +
> + /* Do synchronous data output with a single write access */
> + if (mask != at91_gpio->mask_shadow) {
> + at91_gpio->mask_shadow = mask;
> + __raw_writel(~mask, pio + PIO_OWDR);
> + __raw_writel(mask, pio + PIO_OWER);
> + }
> + __raw_writel(val, pio + PIO_ODSR);
> +}
this driver is only for old at91 platforms; if you touch at91 you need to
update the pinctrl too

Best Regards,
J.
> +
>  static void at91_gpiolib_dbg_show(struct seq_file *s, struct gpio_chip *chip)
>  {
>   int i;
> -- 
> 1.7.9.5
> 
> 

Re: [RFC v2 6/8] gpu: drm: tegra: Remove redundant host1x

2012-12-17 Thread Terje Bergström

On 17.12.2012 22:55, Stephen Warren wrote:
> On 12/16/2012 09:37 AM, Terje Bergström wrote:
> ...
>> ... Sure we could tell DC to ask its parent
>> (host1x), and call host1x driver with platform_device pointer found that
>> way, and host1x would return a pointer to tegradrm's data. Hanging the
>> data onto host1x driver is just a more complicated way of implementing
>> global data
> 
> No it's not; at that point, the data is no longer global, but specific
> to the driver instance.

We have only one tegradrm, so the advantage is theoretical - the one
driver gets the same pointer in both cases.

If we use static pointer with an accessor function, we can keep the
solution contained to one source code file and the ownership of data is
clear - tegradrm allocates and deallocates the object and is the sole
user. Code is already in the patchset I sent.

Shared responsibility with host1x and tegradrm would work probably
something like this:

tegradrm creates an object, and passes the reference to host1x
(host1x_set_drm_pdata(host1x_platform_dev, object)). host1x sets the
pdata to a member in struct host1x. A getter host1x_get_drm_pdata()
allows retrieving the object. DC would call it with
"host1x_get_drm_pdata(to_platform_device(pdev->dev.parent))".

This assumes that tegradrm would keep ownership of the data and free it
before host1x gets unloaded.

To me this sounds like a steep price and I fail to see the advantage.

Dummy device is something that would come then on top of this mechanism.
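
For comparison, the two options above can be sketched in plain userspace C.
This is only an illustration under assumptions: the structs are toy stand-ins,
tegradrm_set_data()/tegradrm_get_data() are hypothetical names for the
static-pointer accessor, and host1x_set_drm_pdata()/host1x_get_drm_pdata()
reuse the names from the text but take a toy struct host1x rather than a
platform device.

```c
#include <stddef.h>

/* Toy stand-ins for the real driver structures (assumptions). */
struct tegradrm_data { int id; };
struct host1x { void *drm_pdata; };

/* Option 1: file-local static pointer with accessors (hypothetical names). */
static struct tegradrm_data *tegradrm_instance;

void tegradrm_set_data(struct tegradrm_data *data)
{
	tegradrm_instance = data;
}

struct tegradrm_data *tegradrm_get_data(void)
{
	return tegradrm_instance;
}

/* Option 2: the pointer is stored per host1x instance. */
void host1x_set_drm_pdata(struct host1x *host, void *pdata)
{
	host->drm_pdata = pdata;
}

void *host1x_get_drm_pdata(struct host1x *host)
{
	return host ? host->drm_pdata : NULL;
}
```

With a single tegradrm both getters return the same object; the difference
only shows up if a second host1x instance exists, in which case option 2
keeps the two pointers apart.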

Terje

[PATCH] acpi: glue: Update DBG macro to include KERN_DEBUG

2012-12-17 Thread Joe Perches

Currently these DBG statements are emitted at KERN_DEFAULT.
Change the macro to emit at KERN_DEBUG.

This can help avoid unexpected message interleaving.

Signed-off-by: Joe Perches 
---

Another way to fix this message interleaving...

 drivers/acpi/glue.c |9 +++--
 1 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/acpi/glue.c b/drivers/acpi/glue.c
index 0155184..95af6f6 100644
--- a/drivers/acpi/glue.c
+++ b/drivers/acpi/glue.c
@@ -18,9 +18,14 @@
 
 #define ACPI_GLUE_DEBUG 0
 #if ACPI_GLUE_DEBUG
-#define DBG(x...) printk(PREFIX x)
+#define DBG(fmt, ...)  \
+   printk(KERN_DEBUG PREFIX fmt, ##__VA_ARGS__)
 #else
-#define DBG(x...) do { } while(0)
+#define DBG(fmt, ...)  \
+do {   \
+   if (0)  \
+   printk(KERN_DEBUG PREFIX fmt, ##__VA_ARGS__);   \
+} while (0)
 #endif
 static LIST_HEAD(bus_type_list);
 static DECLARE_RWSEM(bus_type_sem);
-- 
1.7.8.112.g3fd21
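
The "if (0)" form used in the disabled branch can be tried out in userspace.
The following is a minimal sketch, not the kernel code: fprintf() stands in
for printk(), the "ACPI: " literal stands in for PREFIX, and DBG_ENABLED,
next_id() and dbg_demo() are hypothetical names. It relies on the GNU
##__VA_ARGS__ extension, as the patch does.

```c
#include <stdio.h>

#define DBG_ENABLED 0	/* flip to 1 to emit messages */

#if DBG_ENABLED
#define DBG(fmt, ...) \
	fprintf(stderr, "ACPI: " fmt, ##__VA_ARGS__)
#else
/* Never executed, but the compiler still checks fmt against the arguments. */
#define DBG(fmt, ...) \
do { \
	if (0) \
		fprintf(stderr, "ACPI: " fmt, ##__VA_ARGS__); \
} while (0)
#endif

/* Counts how often an argument expression is actually evaluated. */
static int evaluations;

static int next_id(void)
{
	return ++evaluations;
}

/* Returns the number of argument evaluations caused by one DBG() call. */
int dbg_demo(void)
{
	DBG("device id %d\n", next_id());
	return evaluations;
}
```

With DBG_ENABLED set to 0 the call is compiled and type-checked but never
run, so the argument is not evaluated and no output appears. An empty
"do { } while (0)" body would instead hide format mismatches until debugging
is switched on.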


Re: [PATCH 1/4] ARM: tegra30: Add support for Uart clock source divider as 15.1

2012-12-17 Thread Laxman Dewangan


On Tuesday 18 December 2012 11:44 AM, Prashant Gaikwad wrote:
> On Tuesday 18 December 2012 03:13 AM, Stephen Warren wrote:
>> On 12/17/2012 05:08 AM, Laxman Dewangan wrote:
>>> Tegra20 uart clock source have the 15.1 clock divider in place of
>>
>> That says Tegra20, but ...
>>
>>> 7.1. Add support for 15.1 clock divider and change the uart clock divider
>>> flag to DIV_U151.
>>>   arch/arm/mach-tegra/clock.h   |3 +-
>>>   arch/arm/mach-tegra/tegra30_clocks.c  |   70 ++--
>>>   arch/arm/mach-tegra/tegra30_clocks_data.c |   10 ++--
>>
>> ... the patch only modifies Tegra30. Do both Tegra20 and Tegra30 have
>> this feature; should both clock drivers be updated?
>>
>> BTW, Prashant is reworking the Tegra clock support to be modular, rather
>> than having a single monolithic "Tegra clock" type, and also moving the
>> code to drivers/clk. This patch will conflict significantly with that.
>> Please work with him to integrate this patch into his rework series,
>> either before or after his changes, and have him include the patch when
>> he posts his series. You'll also need to think about whether/how your
>> and his series depend on each-other.
>>
>> ... but: Is this a pure bug-fix? If so, I guess this patch should be
>> applied before Prashant's patches, and this patch also Cc: stable?
>
> My clock driver rework includes this fix. Divider supports both DIVU71
> and DIVU151.
> UART divider is set to DIVU151.

Prashant,
I would like this patch to go in first, as a bug fix, rather than after
the clock code is moved. The reason is that we will pull this change into
our downstream tree, where it will be available in our K3.7 code.





Re: [PATCH 3/4] clk: vt8500: Use of_init_clk_data()

2012-12-17 Thread Tony Prisk

On Mon, 2012-12-17 at 13:02 -0800, Stephen Boyd wrote:
> Reduce lines of code and simplify this driver by using the
> generic clock binding parsing function.
> 
> Signed-off-by: Stephen Boyd 
> Cc: Tony Prisk 
> ---
>  drivers/clk/clk-vt8500.c | 39 +++
>  1 file changed, 15 insertions(+), 24 deletions(-)

Looks fine, and compiles without errors/warnings.

Acked-by: Tony Prisk 


Re: [PATCH 3/4] ARM: tegra: Add OF_DEV_AUXDATA for uart driver in board dt

2012-12-17 Thread Prashant Gaikwad


On Tuesday 18 December 2012 03:17 AM, Stephen Warren wrote:
> On 12/17/2012 05:08 AM, Laxman Dewangan wrote:
>> Add OF_DEV_AUXDATA for high speed uart controller driver for
>> Tegra20/Tegra30 board dt files.
>> Set the parent clock of uart controller to PLLP.
>> diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c b/arch/arm/mach-tegra/board-dt-tegra20.c
>> @@ -94,6 +94,11 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = {
>> +   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006000, "tegra-uart.0", NULL),
>> +   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006040, "tegra-uart.1", NULL),
>> +   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006200, "tegra-uart.2", NULL),
>> +   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006300, "tegra-uart.3", NULL),
>> +   OF_DEV_AUXDATA("nvidia,tegra20-hsuart", 0x70006400, "tegra-uart.4", NULL),
>
> Instead, can we simply get the clocks from device tree? Prashant, how
> much effort will that be once your clock patches are checked in, or is
> it already part of those patches?

It is not part of rework patches, but I will send a patch for it
immediately after those patches are accepted upstream.

>> @@ -106,7 +111,10 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = {
>>   static __initdata struct tegra_clk_init_table tegra_dt_clk_init_table[] = {
>> /* name parent  rateenabled */
>> { "uarta","pll_p",  21600,  true },
>> +   { "uartb","pll_p",  21600,  false },
>> +   { "uartc","pll_p",  21600,  false },
>> { "uartd","pll_p",  21600,  true },
>> +   { "uarte","pll_p",  21600,  false },
>
> Prashant's clock patches remove this table. Please work with him to work
> out how to deal with that.

Laxman,

If you want I can include these entries in current tables.




Re: [PATCH 1/4] ARM: tegra30: Add support for Uart clock source divider as 15.1

2012-12-17 Thread Prashant Gaikwad


On Tuesday 18 December 2012 03:13 AM, Stephen Warren wrote:
> On 12/17/2012 05:08 AM, Laxman Dewangan wrote:
>> Tegra20 uart clock source have the 15.1 clock divider in place of
>
> That says Tegra20, but ...
>
>> 7.1. Add support for 15.1 clock divider and change the uart clock divider
>> flag to DIV_U151.
>>   arch/arm/mach-tegra/clock.h   |3 +-
>>   arch/arm/mach-tegra/tegra30_clocks.c  |   70 ++--
>>   arch/arm/mach-tegra/tegra30_clocks_data.c |   10 ++--
>
> ... the patch only modifies Tegra30. Do both Tegra20 and Tegra30 have
> this feature; should both clock drivers be updated?
>
> BTW, Prashant is reworking the Tegra clock support to be modular, rather
> than having a single monolithic "Tegra clock" type, and also moving the
> code to drivers/clk. This patch will conflict significantly with that.
> Please work with him to integrate this patch into his rework series,
> either before or after his changes, and have him include the patch when
> he posts his series. You'll also need to think about whether/how your
> and his series depend on each-other.
>
> ... but: Is this a pure bug-fix? If so, I guess this patch should be
> applied before Prashant's patches, and this patch also Cc: stable?


My clock driver rework includes this fix. Divider supports both DIVU71 
and DIVU151.

UART divider is set to DIVU151.
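
To illustrate why the field width matters, here is a small userspace sketch.
It assumes the usual fixed-point reading of an N.1 divider (one fractional
bit, divisor = 1 + field / 2); div_n1_field() and its parameters are
hypothetical names and this is not the code from either series.

```c
#include <stdint.h>

/*
 * Compute the register field for an N.1 fixed-point divider, assuming
 * divisor = 1 + field / 2, i.e. field = 2 * parent / rate - 2 (rounded up).
 * int_bits is 7 for a 7.1 divider and 15 for a 15.1 divider.
 * Returns -1 if the requested rate cannot be encoded in the field.
 */
long div_n1_field(uint64_t parent_hz, uint64_t rate_hz, unsigned int int_bits)
{
	uint64_t max_field = (1ULL << (int_bits + 1)) - 1;
	uint64_t twice_div;

	if (rate_hz == 0)
		return -1;

	/* 2 * divisor, rounded up so the output never exceeds rate_hz. */
	twice_div = (2 * parent_hz + rate_hz - 1) / rate_hz;
	if (twice_div < 2)
		return 0;	/* cannot run faster than the parent */
	if (twice_div - 2 > max_field)
		return -1;	/* divisor too large for this field */
	return (long)(twice_div - 2);
}
```

As an example, with a 408 MHz parent and a 1843200 Hz target (16 x 115200)
the field comes out as 441, which does not fit the 8-bit 7.1 field but fits
the 16-bit 15.1 field.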



Re: [PATCH 1/4] ODROID-X: dts: Add board dts file for ODROID-X

2012-12-17 Thread Olof Johansson

On Mon, Dec 17, 2012 at 10:00 PM, Jean-Christophe PLAGNIOL-VILLARD
 wrote:
> On 17:56 Mon 17 Dec , Olof Johansson wrote:
>> On Mon, Dec 17, 2012 at 11:55 AM, Dongjin Kim  wrote:
>> > Add initial dtb file for Hardkernel's ODROID-X board based on EXYNOS4412 
>> > SoC.
>> >
>> > Signed-off-by: Dongjin Kim 
>> > ---
>> >  arch/arm/boot/dts/Makefile   |1 +
>> >  arch/arm/boot/dts/exynos4412-odroidx.dts |   52 
>> > ++
>> >  2 files changed, 53 insertions(+)
>> >  create mode 100644 arch/arm/boot/dts/exynos4412-odroidx.dts
>> >
>> > diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
>> > index ca6fb8e..3355af9 100644
>> > --- a/arch/arm/boot/dts/Makefile
>> > +++ b/arch/arm/boot/dts/Makefile
>> > @@ -45,6 +45,7 @@ dtb-$(CONFIG_ARCH_EXYNOS) += exynos4210-origen.dtb \
>> > exynos5250-smdk5250.dtb \
>> > exynos5440-ssdk5440.dtb \
>> > exynos4412-smdk4412.dtb \
>> > +   exynos4412-odroidx.dtb \
>>
>> Please add them alphabetically, so before smdk.
> We need to drop the trailing-backslash list style; it will end in merge
> conflicts, because if you add 2 dtbs at the end you end up with 2 patches
> that touch the same previous line.

..which is why the dts files should be added alphabetically instead of
just appended to the list.

>> > diff --git a/arch/arm/boot/dts/exynos4412-odroidx.dts 
>> > b/arch/arm/boot/dts/exynos4412-odroidx.dts
>> > new file mode 100644
>> > index 000..786ddd7
>> > --- /dev/null
>> > +++ b/arch/arm/boot/dts/exynos4412-odroidx.dts
>> > @@ -0,0 +1,52 @@
>> > +/*
>> > + * Hardkernel's Exynos4412 based ODROID-X board device tree source
>> > + *
>> > + * Copyright (c) 2012-2013 Dongjin Kim 
>>
>> Are you from the future?
>>
>> > + *
>> > + * Device tree source file for Hardkernel's ODROID-X board which is based 
>> > on
>> > + * Samsung's Exynos4412 SoC.
>> > + *
>> > + * This program is free software; you can redistribute it and/or modify
>> > + * it under the terms of the GNU General Public License version 2 as
>> > + * published by the Free Software Foundation.
>> > +*/
>> > +
>> > +/dts-v1/;
>> > +/include/ "exynos4412.dtsi"
>> > +
>> > +/ {
>> > +   model = "Hardkernel ODROID-X board based on Exynos4412";
>> > +   compatible = "samsung,exynos4412";
>>
>> It should have a more specific compatible value first, i.e.
>> "hardkernel,odroid-x" or similar.
>>
>>
>> > +   memory {
>> > +   reg = <0x4000 0x4000>;
>> > +   };
>> > +
>> > +   chosen {
>> > +   bootargs ="root=/dev/mmcblk0p3 rw console=ttySAC1,115200 
>> > init=/sbin/init delay=2";
>>
>> Bootargs should be passed in from u-boot, don't specify them in the
>> static device tree.
>
> Why not? We can choose to have a default cmdline and even use it as a
> complement to the bootloader one.
>
> It's up to the dts maintainer to choose.

The chance of having a valid generic command line that will work for
everyone with that hardware is fairly small, especially on more
generic systems that might have a regular distro installed on them.


-Olof

Re: [PATCH 1/9] can: add tx/rx LED trigger support

2012-12-17 Thread Bernd Krumböck

Hi Fabio!

> On Mon, Dec 17, 2012 at 09:20:57PM +0100, "Bernd Krumböck" wrote:
>> > If you think it's useful for USB controller, just tell me or modify
>> the
>> > driver by yourself!  As you see the patch is really easy.
>>
>> At least it is useful for the usb_8dev driver. I'll write a patch.
>>
>> Photo of the device:
>> http://www.8devices.com/product/2/usb2can
>
> Sure that it's needed?  That status LED is probably controlled directly
> by the firmware itself.  Anyway I think it makes sense to support all
> the mainline drivers and I'm really happy if you test the patch on
> your hardware!

Yes, it is controlled by the firmware, but it does not show rx/tx. Whereas
my OpenWRT hardware has enough leds. ;-)

regards,
Bernd

Re: [PATCH 1/4] ODROID-X: dts: Add board dts file for ODROID-X

2012-12-17 Thread Jean-Christophe PLAGNIOL-VILLARD

On 17:56 Mon 17 Dec , Olof Johansson wrote:
> On Mon, Dec 17, 2012 at 11:55 AM, Dongjin Kim  wrote:
> > Add initial dtb file for Hardkernel's ODROID-X board based on EXYNOS4412 
> > SoC.
> >
> > Signed-off-by: Dongjin Kim 
> > ---
> >  arch/arm/boot/dts/Makefile   |1 +
> >  arch/arm/boot/dts/exynos4412-odroidx.dts |   52 
> > ++
> >  2 files changed, 53 insertions(+)
> >  create mode 100644 arch/arm/boot/dts/exynos4412-odroidx.dts
> >
> > diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
> > index ca6fb8e..3355af9 100644
> > --- a/arch/arm/boot/dts/Makefile
> > +++ b/arch/arm/boot/dts/Makefile
> > @@ -45,6 +45,7 @@ dtb-$(CONFIG_ARCH_EXYNOS) += exynos4210-origen.dtb \
> > exynos5250-smdk5250.dtb \
> > exynos5440-ssdk5440.dtb \
> > exynos4412-smdk4412.dtb \
> > +   exynos4412-odroidx.dtb \
> 
> Please add them alphabetically, so before smdk.
We need to drop the trailing-backslash list style; it will end in merge
conflicts, because if you add 2 dtbs at the end you end up with 2 patches
that touch the same previous line.
> 
> > diff --git a/arch/arm/boot/dts/exynos4412-odroidx.dts 
> > b/arch/arm/boot/dts/exynos4412-odroidx.dts
> > new file mode 100644
> > index 000..786ddd7
> > --- /dev/null
> > +++ b/arch/arm/boot/dts/exynos4412-odroidx.dts
> > @@ -0,0 +1,52 @@
> > +/*
> > + * Hardkernel's Exynos4412 based ODROID-X board device tree source
> > + *
> > + * Copyright (c) 2012-2013 Dongjin Kim 
> 
> Are you from the future?
> 
> > + *
> > + * Device tree source file for Hardkernel's ODROID-X board which is based 
> > on
> > + * Samsung's Exynos4412 SoC.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > +*/
> > +
> > +/dts-v1/;
> > +/include/ "exynos4412.dtsi"
> > +
> > +/ {
> > +   model = "Hardkernel ODROID-X board based on Exynos4412";
> > +   compatible = "samsung,exynos4412";
> 
> It should have a more specific compatible value first, i.e.
> "hardkernel,odroid-x" or similar.
> 
> 
> > +   memory {
> > +   reg = <0x4000 0x4000>;
> > +   };
> > +
> > +   chosen {
> > +   bootargs ="root=/dev/mmcblk0p3 rw console=ttySAC1,115200 
> > init=/sbin/init delay=2";
> 
> Bootargs should be passed in from u-boot, don't specify them in the
> static device tree.

Why not? We can choose to have a default cmdline and even use it as a
complement to the bootloader one.

It's up to the dts maintainer to choose.

Best Regards,
J.

[PATCH] usb: phy: samsung: Add support to set pmu isolation

2012-12-17 Thread Vivek Gautam

Adding support to parse device node data in order to get
required properties to set pmu isolation for usb-phy.

Signed-off-by: Vivek Gautam 
---
 .../devicetree/bindings/usb/samsung-usbphy.txt |   10 +++
 drivers/usb/phy/samsung-usbphy.c   |   80 ++--
 2 files changed, 82 insertions(+), 8 deletions(-)

diff --git a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt 
b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
index 7b26e2d..112eaa6 100644
--- a/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
+++ b/Documentation/devicetree/bindings/usb/samsung-usbphy.txt
@@ -9,3 +9,13 @@ Required properties:
 - compatible : should be "samsung,exynos4210-usbphy"
 - reg : base physical address of the phy registers and length of memory mapped
region.
+- samsung,usb-phyctrl : should point to usb-phyctrl sub-node which provides
+   binding data to enable/disable device PHY handled by
+   PMU register.
+
+   Required properties:
+   - compatible : should be "samsung,usbdev-phyctrl" for
+   DEVICE type phy.
+   - samsung,phyctrl-reg: base physical address of
+   PHY_CONTROL register in PMU.
+- samsung,enable-mask : should be '1'
diff --git a/drivers/usb/phy/samsung-usbphy.c b/drivers/usb/phy/samsung-usbphy.c
index 5c5e1bb5..ef394c3 100644
--- a/drivers/usb/phy/samsung-usbphy.c
+++ b/drivers/usb/phy/samsung-usbphy.c
@@ -72,6 +72,8 @@ enum samsung_cpu_type {
  * @dev: The parent device supplied to the probe function
  * @clk: usb phy clock
  * @regs: usb phy register memory base
+ * @devctrl_reg: usb phy-control pmu register memory base
+ * @en_mask: enable mask
  * @ref_clk_freq: reference clock frequency selection
  * @cpu_type: machine identifier
  */
@@ -81,12 +83,62 @@ struct samsung_usbphy {
struct device   *dev;
struct clk  *clk;
void __iomem*regs;
+   void __iomem*devctrl_reg;
+   u32 en_mask;
int ref_clk_freq;
int cpu_type;
 };
 
 #define phy_to_sphy(x) container_of((x), struct samsung_usbphy, phy)
 
+static int samsung_usbphy_parse_dt_param(struct samsung_usbphy *sphy)
+{
+   struct device_node *usb_phyctrl;
+   u32 reg;
+
+   if (!sphy->dev->of_node) {
+   sphy->devctrl_reg = NULL;
+   return -ENODEV;
+   }
+
+   usb_phyctrl = of_parse_phandle(sphy->dev->of_node,
+   "samsung,usb-phyctrl", 0);
+   if (!usb_phyctrl) {
+   dev_dbg(sphy->dev, "Can't get usb-phy control node\n");
+   sphy->devctrl_reg = NULL;
+   return -ENODEV;
+   }
+
+   of_property_read_u32(usb_phyctrl, "samsung,phyctrl-reg", &reg);
+
+   sphy->devctrl_reg = ioremap(reg, SZ_4);
+
+   of_property_read_u32(sphy->dev->of_node, "samsung,enable-mask",
+   &sphy->en_mask);
+
+   return 0;
+}
+
+/*
+ * Set isolation here for phy.
+ * SOCs control this by controlling corresponding PMU registers
+ */
+static void samsung_usbphy_set_isolation(struct samsung_usbphy *sphy, int on)
+{
+   void __iomem *usb_phyctrl_reg;
+   u32 en_mask = sphy->en_mask;
+   u32 reg;
+
+   usb_phyctrl_reg = sphy->devctrl_reg;
+
+   reg = readl(usb_phyctrl_reg);
+
+   if (on)
+   writel(reg & ~en_mask, usb_phyctrl_reg);
+   else
+   writel(reg | en_mask, usb_phyctrl_reg);
+}
+
 /*
  * Returns reference clock frequency selection value
  */
@@ -199,6 +251,8 @@ static int samsung_usbphy_init(struct usb_phy *phy)
/* Disable phy isolation */
if (sphy->plat && sphy->plat->pmu_isolation)
sphy->plat->pmu_isolation(false);
+   else
+   samsung_usbphy_set_isolation(sphy, false);
 
/* Initialize usb phy registers */
samsung_usbphy_enable(sphy);
@@ -228,6 +282,8 @@ static void samsung_usbphy_shutdown(struct usb_phy *phy)
/* Enable phy isolation */
if (sphy->plat && sphy->plat->pmu_isolation)
sphy->plat->pmu_isolation(true);
+   else
+   samsung_usbphy_set_isolation(sphy, true);
 
clk_disable_unprepare(sphy->clk);
 }
@@ -249,17 +305,12 @@ static inline int samsung_usbphy_get_driver_data(struct 
platform_device *pdev)
 static int __devinit samsung_usbphy_probe(struct platform_device *pdev)
 {
struct samsung_usbphy *sphy;
-   struct samsung_usbphy_data *pdata;
+   struct samsung_usbphy_data *pdata = pdev->dev.platform_data;
    struct device *dev = &pdev->dev;
struct resource *phy_mem;
void __iomem*phy_base;
struct clk *clk;
-
-   pdata = pdev->dev.platform_data;
-   if (!pdata) {
-   dev_err(&pdev->dev, "%s: no platform data defined\n", __func__);

[PATCH] usb: phy: samsung: Add support to set pmu isolation

2012-12-17 Thread Vivek Gautam

Based on patches for samsung-usbphy driver available at:
https://patchwork.kernel.org/patch/1794651/

This patch adds support for parsing device tree data in the
samsung-usbphy driver and for setting pmu_isolation to enable/disable
the phy as needed.
It also removes the need for platform data for samsung-usbphy on
DT-enabled systems, which addresses the discussion in the thread for:
[PATCH v8 2/2] usb: s3c-hsotg: Adding phy driver support

Vivek Gautam (1):
  usb: phy: samsung: Add support to set pmu isolation

 .../devicetree/bindings/usb/samsung-usbphy.txt |   10 +++
 drivers/usb/phy/samsung-usbphy.c   |   80 ++--
 2 files changed, 82 insertions(+), 8 deletions(-)

-- 
1.7.6.5
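
The isolation logic in the patch is a read-modify-write on one PMU register.
A minimal userspace sketch follows, under stated assumptions: the register is
modelled as a plain word in memory instead of an ioremap()ed address, and
struct toy_usbphy and toy_set_isolation() are hypothetical names. Unlike the
posted helper, the sketch also refuses to touch a missing register, which may
be worth considering since devctrl_reg is left NULL when no phyctrl node is
found.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy model: the PMU PHY_CONTROL register is just a word in memory. */
struct toy_usbphy {
	volatile uint32_t *devctrl_reg;	/* NULL when no phyctrl node exists */
	uint32_t en_mask;
};

/*
 * Isolation on  -> clear the enable bits (PHY isolated).
 * Isolation off -> set the enable bits (PHY enabled).
 * Returns 0 on success, -1 if there is no register to touch.
 */
int toy_set_isolation(struct toy_usbphy *sphy, int on)
{
	uint32_t reg;

	if (!sphy || !sphy->devctrl_reg)
		return -1;

	reg = *sphy->devctrl_reg;
	if (on)
		reg &= ~sphy->en_mask;
	else
		reg |= sphy->en_mask;
	*sphy->devctrl_reg = reg;
	return 0;
}
```

Only the bits in en_mask change; the other bits of the register are
preserved across both calls.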


Re: [PATCH v2 4/4] uprobes/powerpc: Make use of generic routines to enable single step

2012-12-17 Thread Ananth N Mavinakayanahalli

On Fri, Dec 14, 2012 at 09:02:41PM +0100, Oleg Nesterov wrote:
> On 12/03, Suzuki K. Poulose wrote:
> >
> > Replace the ptrace helpers with the powerpc generic routines to
> > enable/disable single step. We save/restore the MSR (and DBCR for BookE)
> > across for the operation. We don't have to disable the single step,
> > as restoring the MSR/DBCR would restore the previous state.
> 
> Obviously I can't review this series (although it looks fine to me).
> 
> Just one note,
> 
> > @@ -121,7 +132,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, 
> > struct pt_regs *regs)
> >
> > WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR);
> >
> > -   uprobe_restore_context_sstep(>autask);
> > +   uprobe_restore_context_sstep(>autask, regs);
> 
> I am not sure ppc needs this, but note that x86 does a bit more.
> 
> Not only we need to restore the "single-step" state, we need to
> send SIGTRAP if it was not set by us. The same for _skip_sstep.

Do you mean restoring the TF equivalent on powerpc to what it was before?

If so, powerpc has always been unique in this aspect -- the single-step
exception handler *always* resets the sstep bit in MSR. Any user needing
to continue single-stepping has to explicitly set it again.

Ananth


[PULL] virtio-next

2012-12-17 Thread Rusty Russell

  git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux.git tags/virtio-next-for-linus

for you to fetch changes up to 1b6370463e88b0c1c317de16d7b962acc1dab4f2:

  virtio_console: Add support for remoteproc serial (2012-12-18 15:20:44 +1030)


Some nice cleanups, and even a patch my wife did as a "live" demo for
Latinoware 2012.

There's a slightly non-trivial merge in virtio-net, as we cleaned up the
virtio add_buf interface while DaveM accepted the mq virtio-net patches.

You can see my solution in my pending-rebases branch, if that helps, but I
know you love merging:

https://git.kernel.org/?p=linux/kernel/git/rusty/linux.git;a=commit;h=12e4e64fa66a4c812e4855de32abdb4d819526fe

Cheers,
Rusty.


Alex Russell (1):
  lguest: fix typo

Amit Shah (1):
  virtio: console: don't rely on virtqueue_add_buf() returning capacity.

Bryan Venteicher (1):
  virtio-scsi: Add real 2-clause BSD license to header

Joe Perches (1):
  virtio: Convert dev_printk(KERN_<LEVEL> to dev_<level>(

Michael S. Tsirkin (2):
  virtio-net: correct capacity math on ring full
  virtio-net: remove unused skb_vnet_hdr->num_sg field

Pawel Moll (1):
  virtio-mmio: Fix irq parsing in command line parameter

Rusty Russell (9):
  lguest: fix block request handling in example launcher.
  virtio: move queue_index and num_free fields into core struct virtqueue.
  virtio_net: don't rely on virtqueue_add_buf() returning capacity.
  virtio: make virtqueue_add_buf() returning 0 on success, not capacity.
  virtio: console: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: net: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: rpmsg: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: scsi: make it clear that virtqueue_add_buf() no longer returns > 0
  virtio: tools: make it clear that virtqueue_add_buf() no longer returns > 0

Sjur Brændeland (4):
  virtio_console: Free buffer if splice fails
  virtio_console: Use kmalloc instead of kzalloc
  virtio_console: Merge struct buffer_token into struct port_buffer
  virtio_console: Add support for remoteproc serial

Wanlong Gao (2):
  virtio: use dev_to_virtio wrapper in virtio
  virtio: add drv_to_virtio to make code clearly

Wei Yongjun (1):
  virtio-pci: use module_pci_driver to simplify the code

Will Deacon (3):
  mm: highmem: export kmap_to_page for modules
  virtio: 9p: correctly pass physical address to userspace for high pages
  virtio: force vring descriptors to be allocated from lowmem

sjur.brandel...@stericsson.com (1):
  virtio_console: Free buffers from out-queue upon close

 drivers/char/virtio_console.c|  329 ++
 drivers/lguest/core.c|2 +-
 drivers/net/virtio_net.c |   46 +++---
 drivers/rpmsg/virtio_rpmsg_bus.c |6 +-
 drivers/scsi/virtio_scsi.c   |   24 +--
 drivers/virtio/virtio.c  |   30 ++--
 drivers/virtio/virtio_balloon.c  |7 +-
 drivers/virtio/virtio_mmio.c |   30 ++--
 drivers/virtio/virtio_pci.c  |   20 +--
 drivers/virtio/virtio_ring.c |   46 +++---
 include/linux/virtio.h   |   25 ++-
 include/linux/virtio_scsi.h  |   28 +++-
 include/uapi/linux/virtio_ids.h  |1 +
 mm/highmem.c |1 +
 net/9p/trans_virtio.c|3 +-
 tools/lguest/lguest.c|   84 --
 tools/virtio/virtio_test.c   |4 +-
 17 files changed, 412 insertions(+), 274 deletions(-)
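
The add_buf return-value change named in the shortlog can be pictured with a
toy model. This is a sketch under assumptions, not the virtio code: struct
toy_vq, toy_add_buf() and toy_can_queue_more() are hypothetical, and the only
point shown is that the call reports success or failure while remaining
capacity is read from the queue's num_free field.

```c
#include <errno.h>

/* Toy queue: only tracks free descriptors (not the real struct virtqueue). */
struct toy_vq {
	unsigned int num_free;
};

/* New-style contract: 0 on success, negative errno on failure. */
int toy_add_buf(struct toy_vq *vq, unsigned int descs)
{
	if (descs == 0 || descs > vq->num_free)
		return -ENOSPC;
	vq->num_free -= descs;
	return 0;
}

/* Callers that need the remaining capacity read num_free directly. */
int toy_can_queue_more(const struct toy_vq *vq, unsigned int descs_needed)
{
	return vq->num_free >= descs_needed;
}
```

A caller written against the old contract would have tested the return value
for "> 0" to learn the remaining capacity; in this model that test is
replaced by a separate look at num_free.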

Re: [PATCH] backlight: add lms501kf03 LCD driver

2012-12-17 Thread Sachin Kamat

On 18 December 2012 06:59, Jingoo Han  wrote:
> On Monday, Monday, December 17, 2012 7:01 PM, Sachin Kamat wrote
>>
>> Hi Jingoo,
>>
>> I had already submitted a patch for adding support for this driver [1]
>> and you had also provided your review comments on them ([2] and [3]).
>> There were certain comments from Andrew Morton that needed to be addressed
>> which I could not due to some other priorities.
>
> CC'ed Ilho Lee
>
> You have abandoned this patch for 7 months!
>
> In addition, before you submitted a patch, it was already developed and
> managed by me with Ilho Lee. So, ownership should be given to me and Ilho Lee.
> Also, I am not sure that you can manage this lms501kf03 LCD driver.
>

I leave it to the maintainers to decide if it is an ethical practice
to claim ownership by incorporating review comments on the original
patch.

>>
>> IMO, it would be better to address comments on that driver rather than
>> posting a new (similar) driver altogether.


>>
>> [1] http://marc.info/?l=linux-fbdev&m=133455903202255&w=4
>> [2] http://marc.info/?l=linux-fbdev&m=133574414215045&w=4
>> [3] http://marc.info/?l=linux-fbdev&m=133576237619447&w=4
>>
>> On 17 December 2012 13:52, Jingoo Han  wrote:
>> > Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
>> > x 480) driver uses a 3-wire SPI interface.
>> >
>> > Signed-off-by: Ilho Lee 
>> > Signed-off-by: Jingoo Han 
>> > ---
>> >  drivers/video/backlight/Kconfig  |8 +
>> >  drivers/video/backlight/Makefile |1 +
>> >  drivers/video/backlight/lms501kf03.c |  448 
>> > ++
>> >  3 files changed, 457 insertions(+), 0 deletions(-)
>> >  create mode 100644 drivers/video/backlight/lms501kf03.c
>> >
>> > diff --git a/drivers/video/backlight/Kconfig 
>> > b/drivers/video/backlight/Kconfig
>> > index 765a945..081d6cf 100644
>> > --- a/drivers/video/backlight/Kconfig
>> > +++ b/drivers/video/backlight/Kconfig
>> > @@ -126,6 +126,14 @@ config LCD_AMS369FG06
>> >   If you have an AMS369FG06 AMOLED Panel, say Y to enable its
>> >   LCD control driver.
>> >
>> > +config LCD_LMS501KF03
>> > +   tristate "LMS501KF03 LCD Driver"
>> > +   depends on SPI
>> > +   default n
>> > +   help
>> > + If you have an LMS501KF03 LCD Panel, say Y to enable its
>> > + LCD control driver.
>> > +
>> >  endif # LCD_CLASS_DEVICE
>> >
>> >  #
>> > diff --git a/drivers/video/backlight/Makefile 
>> > b/drivers/video/backlight/Makefile
>> > index e7ce729..d02a728 100644
>> > --- a/drivers/video/backlight/Makefile
>> > +++ b/drivers/video/backlight/Makefile
>> > @@ -14,6 +14,7 @@ obj-$(CONFIG_LCD_TOSA)   += tosa_lcd.o
>> >  obj-$(CONFIG_LCD_S6E63M0)  += s6e63m0.o
>> >  obj-$(CONFIG_LCD_LD9040)   += ld9040.o
>> >  obj-$(CONFIG_LCD_AMS369FG06)   += ams369fg06.o
>> > +obj-$(CONFIG_LCD_LMS501KF03)   += lms501kf03.o
>> >
>> >  obj-$(CONFIG_BACKLIGHT_CLASS_DEVICE) += backlight.o
>> >  obj-$(CONFIG_BACKLIGHT_ATMEL_PWM)+= atmel-pwm-bl.o
>> > diff --git a/drivers/video/backlight/lms501kf03.c 
>> > b/drivers/video/backlight/lms501kf03.c
>> > new file mode 100644
>> > index 000..c30ea53
>> > --- /dev/null
>> > +++ b/drivers/video/backlight/lms501kf03.c
>> > @@ -0,0 +1,448 @@
>> > +/*
>> > + * lms501kf03 TFT LCD panel driver.
>> > + *
>> > + * Copyright (c) 2012 Samsung Electronics Co., Ltd.
>> > + * Author: Jingoo Han  
>> > + *
>> > + * This program is free software; you can redistribute it and/or modify it
>> > + * under the terms of the GNU General Public License as published by the
>> > + * Free Software Foundation; either version 2 of the License, or (at your
>> > + * option) any later version.
>> > + */
>> > +
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +#include 
>> > +
>> > +#define ENDDEF 0xff00
>> > +#define COMMAND_ONLY   0x00
>> > +#define DATA_ONLY  0x01
>> > +
>> > +struct lms501kf03 {
>> > +   struct device   *dev;
>> > +   struct spi_device   *spi;
>> > +   unsigned intpower;
>> > +   struct lcd_device   *ld;
>> > +   struct lcd_platform_data*lcd_pd;
>> > +};
>> > +
>> > +static const unsigned short seq_password[] = {
>> > +   0xb9, 0xff, 0x83, 0x69,
>> > +   ENDDEF
>> > +};
>> > +
>> > +static const unsigned short seq_power[] = {
>> > +   0xb1, 0x01, 0x00, 0x34, 0x06, 0x00, 0x14, 0x14, 0x20, 0x28,
>> > +   0x12, 0x12, 0x17, 0x0a, 0x01, 0xe6, 0xe6, 0xe6, 0xe6, 0xe6,
>> > +   ENDDEF
>> > +};
>> > +
>> > +static const unsigned short seq_display[] = {
>> > +   0xb2, 0x00, 0x2b, 0x03, 0x03, 0x70, 0x00, 0xff, 0x00, 0x00,
>> > +   0x00, 0x00, 0x03, 0x03, 0x00, 0x01,
>> > +   ENDDEF
>> > +};
>> > +
>> > +static const unsigned short seq_rgb_if[] = {
>> > +   0xb3, 0x09,
>> > +   ENDDEF
>> > +};
>> > +
>> > +static const unsigned short

Re: [PATCH] backlight: add lms501kf03 LCD driver

2012-12-17 Thread Jingoo Han

On Tuesday, December 18, 2012 2:00 PM, Sachin Kamat wrote
> On 18 December 2012 06:59, Jingoo Han  wrote:
> > On Monday, Monday, December 17, 2012 7:01 PM, Sachin Kamat wrote
> >>
> >> Hi Jingoo,
> >>
> >> I had already submitted a patch for adding support for this driver [1]
> >> and you had also provided your review comments on them ([2] and [3]).
> >> There were certain comments from Andrew Morton that needed to be addressed
> >> which I could not due to some other priorities.
> >
> > CC'ed Ilho Lee
> >
> > You have abandoned this patch for 7 months!
> >
> > In addition, before you submitted a patch, it was already developed and
> > managed by me with Ilho Lee. So, ownership should be given to me and Ilho 
> > Lee.
> > Also, I am not sure that you can manage this lms501kf03 LCD driver.
> >
> 
> I leave it to the maintainers to decide if it is an ethical practice
> to claim ownership by incorporating review comments on the original
> patch.


I am not claiming ownership by incorporating review comments
on the original patch!!!


Here is the history of the LMS501KF03 LCD driver.

  2011.9~2011.12: Original code was developed by Ilho Lee, Jingoo Han
  2012.1~2012.10: Original code was managed by Jingoo Han

  2012.4.16: 1st patch was submitted by Sachin Kamat using original code.
  2012.4.30: 3rd patch was submitted by Sachin Kamat using original code.

  2012.5.01: Andrew gave the comment on 3rd patch.

  2012.5.01~2012.12.16: There is no response from Sachin Kamat.

  2012.12.17: 4th patch was submitted by Jingoo Han using original code.


The original patch was developed by Ilho Lee and me during 2011.9~2011.12.
You just copied the original patch and sent it on 2012.4.16.
Moreover, you did not follow up on it for 7 months (2012.5.01~2012.12.16).

So, ownership should be given to Ilho Lee and me.
I don't want to make noise about this.


> 
> >>
> >> IMO, it would be better to address comments on that driver rather than
> >> posting a new (similar) driver altogether.
> 
> 
> >>
> >> [1] http://marc.info/?l=linux-fbdev=133455903202255=4
> >> [2] http://marc.info/?l=linux-fbdev=133574414215045=4
> >> [3] http://marc.info/?l=linux-fbdev=133576237619447=4
> >>
> >> On 17 December 2012 13:52, Jingoo Han  wrote:
> >> > Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
> >> > x 480) driver uses a 3-wire SPI interface.
> >> >
> >> > Signed-off-by: Ilho Lee 
> >> > Signed-off-by: Jingoo Han 
> >> > ---
> >> >  drivers/video/backlight/Kconfig  |8 +
> >> >  drivers/video/backlight/Makefile |1 +
> >> >  drivers/video/backlight/lms501kf03.c |  448 
> >> > ++
> >> >  3 files changed, 457 insertions(+), 0 deletions(-)
> >> >  create mode 100644 drivers/video/backlight/lms501kf03.c
> >> >
> >> > diff --git a/drivers/video/backlight/Kconfig 
> >> > b/drivers/video/backlight/Kconfig
> >> > index 765a945..081d6cf 100644
> >> > --- a/drivers/video/backlight/Kconfig
> >> > +++ b/drivers/video/backlight/Kconfig
> >> > @@ -126,6 +126,14 @@ config LCD_AMS369FG06
> >> >   If you have an AMS369FG06 AMOLED Panel, say Y to enable its
> >> >   LCD control driver.
> >> >
> >> > +config LCD_LMS501KF03
> >> > +   tristate "LMS501KF03 LCD Driver"
> >> > +   depends on SPI
> >> > +   default n
> >> > +   help
> >> > + If you have an LMS501KF03 LCD Panel, say Y to enable its
> >> > + LCD control driver.
> >> > +
> >> >  endif # LCD_CLASS_DEVICE
> >> >
> >> >  #
> >> > diff --git a/drivers/video/backlight/Makefile 
> >> > b/drivers/video/backlight/Makefile
> >> > index e7ce729..d02a728 100644
> >> > --- a/drivers/video/backlight/Makefile
> >> > +++ b/drivers/video/backlight/Makefile
> >> > @@ -14,6 +14,7 @@ obj-$(CONFIG_LCD_TOSA)   += tosa_lcd.o
> >> >  obj-$(CONFIG_LCD_S6E63M0)  += s6e63m0.o
> >> >  obj-$(CONFIG_LCD_LD9040)   += ld9040.o
> >> >  obj-$(CONFIG_LCD_AMS369FG06)   += ams369fg06.o
> >> > +obj-$(CONFIG_LCD_LMS501KF03)   += lms501kf03.o
> >> >
> >> >  obj-$(CONFIG_BACKLIGHT_CLASS_DEVICE) += backlight.o
> >> >  obj-$(CONFIG_BACKLIGHT_ATMEL_PWM)+= atmel-pwm-bl.o
> >> > diff --git a/drivers/video/backlight/lms501kf03.c 
> >> > b/drivers/video/backlight/lms501kf03.c
> >> > new file mode 100644
> >> > index 000..c30ea53
> >> > --- /dev/null
> >> > +++ b/drivers/video/backlight/lms501kf03.c
> >> > @@ -0,0 +1,448 @@
> >> > +/*
> >> > + * lms501kf03 TFT LCD panel driver.
> >> > + *
> >> > + * Copyright (c) 2012 Samsung Electronics Co., Ltd.
> >> > + * Author: Jingoo Han  
> >> > + *
> >> > + * This program is free software; you can redistribute it and/or modify 
> >> > it
> >> > + * under the terms of the GNU General Public License as published by the
> >> > + * Free Software Foundation; either version 2 of the License, or (at 
> >> > your
> >> > + * option) any later version.
> >> > + */
> >> > +
> >> > +#include 
> >> > +#include 
> >> > +#include 
> >> > +#include 
> >> > +#include 
> >> > +#include 
> >> >

[PATCH] x86,idle: pr_debug information need separated

2012-12-17 Thread Youquan Song

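When debugging the kernel, the following output was found:
intel_idle: unaware of model 0x1a MWAIT 4 please contact lenb@kernel.orgACPI: 
Device input0 -> No ACPI support

The intel_idle message lacks a trailing newline, so the next message is
appended to the same line. This patch adds the newline to separate them.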

Signed-off-by: Youquan Song 
---
 drivers/idle/intel_idle.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index b0f6b4c..eae6e3b 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -518,7 +518,7 @@ static int intel_idle_cpuidle_driver_init(void)
if (*cpuidle_state_table[cstate].name == '\0')
pr_debug(PREFIX "unaware of model 0x%x"
" MWAIT %d please"
-   " contact l...@kernel.org",
+   " contact l...@kernel.org\n",
boot_cpu_data.x86_model, cstate);
continue;
}
-- 
1.6.4.2

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
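
The effect described in the patch above can be modelled in user space. This is only a sketch: `log_append` and `log_lines` are hypothetical helpers that imitate consecutive messages landing in one log stream, not the kernel's printk machinery.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Append one message to a log buffer, the way consecutive messages
 * end up in the same log stream. Returns the new length.
 */
static size_t log_append(char *log, size_t cap, const char *msg)
{
	size_t used = strlen(log);

	assert(used < cap);
	snprintf(log + used, cap - used, "%s", msg);
	return strlen(log);
}

/* Count the lines in the buffer; an unterminated tail counts as a line. */
static int log_lines(const char *log)
{
	int lines = 0;
	const char *p;

	for (p = log; *p; p++)
		if (*p == '\n')
			lines++;
	if (p != log && p[-1] != '\n')
		lines++;
	return lines;
}
```

Appending a message without '\n' followed by a second message yields one line; with the '\n' added, the same two messages yield two lines.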

[GIT PULL] Bug-fixes to xen-blkfront for v3.8

2012-12-17 Thread Konrad Rzeszutek Wilk

Hey Jens,

Please git pull the following branch:

 git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git stable/for-jens-3.8

which has a bug-fix to the xen-blkfront and xen-blkback driver
when using the persistent mode. An issue was discovered where LVM
disks could not be read correctly and this fixes it. There
is also a change in llist.h which has been blessed by akpm.

Please pull!

 drivers/block/xen-blkback/blkback.c | 18 +++---
 drivers/block/xen-blkfront.c| 10 ++
 include/linux/llist.h   | 25 +
 3 files changed, 42 insertions(+), 11 deletions(-)

Roger Pau Monne (3):
  xen-blkback: implement safe iterator for the list of persistent grants
  llist/xen-blkfront: implement safe version of llist_for_each_entry
  xen-blkfront: handle bvecs with partial data


[PATCH] x86,perf: Add IvyBridge EP support

2012-12-17 Thread Youquan Song

Running the perf utility on an IvyBridge EP server reports the following
events as "not supported":

L1-dcache-loads
L1-dcache-load-misses
L1-dcache-stores
L1-dcache-store-misses
L1-dcache-prefetches
L1-dcache-prefetch-misses

This patch adds support for this processor.

Reviewed-by: Andi Kleen 
Signed-off-by: Youquan Song 
---
 arch/x86/kernel/cpu/perf_event_intel.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel.c 
b/arch/x86/kernel/cpu/perf_event_intel.c
index 324bb52..aea3503 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -2075,6 +2075,7 @@ __init int intel_pmu_init(void)
pr_cont("SandyBridge events, ");
break;
case 58: /* IvyBridge */
+   case 62: /* IvyBridge EP */
memcpy(hw_cache_event_ids, snb_hw_cache_event_ids,
   sizeof(hw_cache_event_ids));
memcpy(hw_cache_extra_regs, snb_hw_cache_extra_regs,
-- 
1.6.4.2
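
The one-line change works because of case fall-through: the new model number shares the branch that already handles IvyBridge. A reduced sketch, where `event_table_for_model` is a hypothetical helper and only models 58 and 62 come from the hunk:

```c
#include <assert.h>
#include <string.h>

/*
 * Map an x86 model number to the name of the cache event table it uses.
 * Only models 58 and 62 are taken from the quoted hunk; both reach the
 * same SandyBridge-derived table. Every other model is reported as
 * unknown in this sketch.
 */
static const char *event_table_for_model(unsigned int model)
{
	switch (model) {
	case 58: /* IvyBridge */
	case 62: /* IvyBridge EP */
		return "snb_hw_cache_event_ids";
	default:
		return "unknown";
	}
}
```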


Re: [PATCH v2 4/4] uprobes/powerpc: Make use of generic routines to enable single step

2012-12-17 Thread Suzuki K. Poulose


On 12/15/2012 01:32 AM, Oleg Nesterov wrote:

On 12/03, Suzuki K. Poulose wrote:


Replace the ptrace helpers with the powerpc generic routines to
enable/disable single step. We save/restore the MSR (and DBCR for BookE)
across the operation. We don't have to disable the single step,
as restoring the MSR/DBCR would restore the previous state.


Obviously I can't review this series (although it looks fine to me).

Just one note,


@@ -121,7 +132,7 @@ int arch_uprobe_post_xol(struct arch_uprobe *auprobe, 
struct pt_regs *regs)

WARN_ON_ONCE(current->thread.trap_nr != UPROBE_TRAP_NR);

-   uprobe_restore_context_sstep(>autask);
+   uprobe_restore_context_sstep(>autask, regs);


I am not sure ppc needs this, but note that x86 does a bit more.

Not only we need to restore the "single-step" state, we need to
send SIGTRAP if it was not set by us. The same for _skip_sstep.


Ok. I will investigate that part and do the necessary.

Thanks
Suzuki


RE: [RFC][PATCH v6]trace,x86: add x86 irq vector tracepoints

2012-12-17 Thread Seiji Aguchi

Thank you for reviewing my patch.
I will update it in accordance with your comment.

Seiji

> -Original Message-
> From: Steven Rostedt [mailto:rost...@goodmis.org]
> Sent: Monday, December 17, 2012 10:02 PM
> To: Seiji Aguchi
> Cc: x...@kernel.org; linux-kernel@vger.kernel.org; H. Peter Anvin 
> (h...@zytor.com); Thomas Gleixner (t...@linutronix.de);
> 'mi...@elte.hu' (mi...@elte.hu); Borislav Petkov (b...@alien8.de); Satoru 
> Moriya; dle-deve...@lists.sourceforge.net; linux-
> e...@vger.kernel.org; Luck, Tony (tony.l...@intel.com)
> Subject: Re: [RFC][PATCH v6]trace,x86: add x86 irq vector tracepoints
> 
> On Tue, 2012-12-18 at 01:34 +, Seiji Aguchi wrote:
> > Change log
> >
> >  v5 -> v6
> >  - Rebased to 3.7
> >
> >  v4 -> v5
> >  - Rebased to 3.6.0
> >
> >  - Introduce logic that switches the IDT when tracepoints are enabled/disabled
> >so that the time penalty is zero when tracepoints are disabled.
> >This IDT is created only when CONFIG_TRACEPOINTS is enabled.
> >
> >  - Remove arch_irq_vector_entry/exit and add followings again
> >so that we can add each tracepoint in a generic way.
> >- error_apic_vector
> >- thermal_apic_vector
> >- threshold_apic_vector
> >- spurious_apic_vector
> >- x86_platform_ipi_vector
> >
> >  - Drop nmi tracepoints to begin with apic interrupts and discuss a logic 
> > switching
> >IDT first.
> >
> >  - Move irq_vectors.h in the directory of arch/x86/include/asm/trace because
> >I'm not sure if a logic switching IDT is sharable with other 
> > architectures.
> >
> >  v3 -> v4
> >  - Add a latency measurement of each tracepoint
> >  - Rebased to 3.6-rc6
> >
> >  v2 -> v3
> >  - Remove an invalidate_tlb_vector event because it was replaced by a call 
> > function vector
> >in a following commit.
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;
> > h=52aec3308db85f4e9f5c8b9f5dc4fbd0138c6fa4
> >
> >  v1 -> v2
> >  - Modify variable name from irq to vector.
> >  - Merge arch-specific tracepoints below to an arch_irq_vector_entry/exit.
> >- error_apic_vector
> >- thermal_apic_vector
> >- threshold_apic_vector
> >- spurious_apic_vector
> >- x86_platform_ipi_vector
> >
> > [Purpose of this patch]
> >
> > As Vaibhav explained in the thread below, tracepoints for irq vectors
> > are useful.
> >
> > http://www.spinics.net/lists/mm-commits/msg85707.html
> >
> > 
> > The current interrupt traces from irq_handler_entry and
> > irq_handler_exit provide when an interrupt is handled.  They provide
> > good data about when the system has switched to kernel space and how
> > it affects the currently running processes.
> >
> > There are some IRQ vectors which trigger the system into kernel space,
> > which are not handled in generic IRQ handlers.  Tracing such events
> > gives us the information about IRQ interaction with other system events.
> >
> > The trace also tells where the system is spending its time.  We want
> > to know which cores are handling interrupts and how they are affecting
> > other processes in the system.  Also, the trace provides information
> > about when the cores are idle and which interrupts are changing that state.
> > 
> >
> > On the other hand, my use case is tracing just the local timer event and
> > getting the value of the instruction pointer.
> >
> >   I previously suggested adding an argument to the local timer event to get
> > the instruction pointer.
> >   But there is another way to get it, with an external module like systemtap.
> >   So, I don't need to add any argument to irq vector tracepoints now.
> >
> > [Patch Description]
> >
> > Vaibhav's patch shared one tracepoint, irq_vector_entry/irq_vector_exit,
> > across all events.
> > But as described above, there is a use case for tracing a specific irq
> > vector rather than all events.
> > In this case, we are concerned about overhead due to unwanted events.
> >
> > This patch adds the following tracepoints instead of introducing
> > irq_vector_entry/exit,
> > so that we can enable them independently.
> >- local_timer_vector
> >- reschedule_vector
> >- call_function_vector
> >- call_function_single_vector
> >- irq_work_entry_vector
> >- error_apic_vector
> >- thermal_apic_vector
> >- threshold_apic_vector
> >- spurious_apic_vector
> >- x86_platform_ipi_vector
> >
> > Also, it introduces logic that switches the IDT at enabling/disabling time
> > so that the time penalty is zero when tracepoints are disabled.
> > Detailed explanations are as follows.
> >  - Create new irq handlers with tracepoints inserted, using macros.
> >  - Create a new IDT, trace_idt_table, at boot time by duplicating the
> >original IDT, idt table, and registering the new handlers for tracepoints.
> >  - Switch the IDT to the new one when a tracepoint is enabled.
> >  - Restore the original IDT when tracepoints are disabled.
> > The new IDT is created only when CONFIG_TRACEPOINTS is enabled to avoid
> > being used for other purposes.
> >
> > Signed-off-by:
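
The table-switching idea in the quoted description can be sketched in plain C with two arrays of function pointers standing in for the two IDTs. All names here are hypothetical, and the user count is an assumption added so that several enabled tracepoints can share the traced table.

```c
#include <assert.h>
#include <stddef.h>

#define NR_VECTORS 4

typedef int (*vector_handler)(int vector);

static int trace_hits;  /* stands in for emitted tracepoint events */

static int plain_handler(int vector)
{
	return vector;
}

static int traced_handler(int vector)
{
	trace_hits++;                 /* what a tracepoint would record */
	return plain_handler(vector);
}

/* The "original IDT" and its duplicate with traced handlers. */
static vector_handler plain_table[NR_VECTORS] = {
	plain_handler, plain_handler, plain_handler, plain_handler
};
static vector_handler trace_table[NR_VECTORS] = {
	traced_handler, traced_handler, traced_handler, traced_handler
};

static vector_handler *active_table = plain_table;
static int trace_users;  /* assumption: count of enabled tracepoints */

static void trace_enable(void)
{
	if (trace_users++ == 0)
		active_table = trace_table;
}

static void trace_disable(void)
{
	assert(trace_users > 0);
	if (--trace_users == 0)
		active_table = plain_table;
}

static int dispatch(int vector)
{
	assert(vector >= 0 && vector < NR_VECTORS);
	return active_table[vector](vector);
}
```

While tracing is off, dispatch goes through handlers that contain no tracing code at all, which is the zero-penalty property the description aims for.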

[PATCH 1/2 v2] menuconfig: Add Save/Load buttons

2012-12-17 Thread Wang YanQing

If menuconfig had Save/Load buttons like the alternative
.config editors (xconfig, nconfig, etc.), we could save
and load our .config quickly and conveniently from
menuconfig, just as we can with the others.

This patch adds Save/Load buttons to menuconfig.

[remove trailing space while at it for below line:
"*)  Formerly when I used Page Down and Page Up, the cursor would be set"
]

Changes:
V1-V2:
1: use PATH_MAX instead of a hard-coded value, as suggested by Yann E. MORIN
2: drop the spurious empty-line removal, as suggested by Yann E. MORIN

Signed-off-by: Wang YanQing 
---
 scripts/kconfig/lxdialog/menubox.c | 20 +++-
 scripts/kconfig/mconf.c| 30 +-
 2 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/scripts/kconfig/lxdialog/menubox.c 
b/scripts/kconfig/lxdialog/menubox.c
index 1d60473..8b534d5 100644
--- a/scripts/kconfig/lxdialog/menubox.c
+++ b/scripts/kconfig/lxdialog/menubox.c
@@ -26,7 +26,7 @@
  *
  **)  A bugfix for the Page-Down problem
  *
- **)  Formerly when I used Page Down and Page Up, the cursor would be set 
+ **)  Formerly when I used Page Down and Page Up, the cursor would be set
  *to the first position in the menu box.  Now lxdialog is a bit
  *smarter and works more like other menu systems (just have a look at
  *it).
@@ -154,12 +154,14 @@ static void print_arrows(WINDOW * win, int item_no, int 
scroll, int y, int x,
  */
 static void print_buttons(WINDOW * win, int height, int width, int selected)
 {
-   int x = width / 2 - 16;
+   int x = width / 2 - 26;
int y = height - 2;
 
print_button(win, gettext("Select"), y, x, selected == 0);
print_button(win, gettext(" Exit "), y, x + 12, selected == 1);
print_button(win, gettext(" Help "), y, x + 24, selected == 2);
+   print_button(win, gettext(" Save "), y, x + 36, selected == 3);
+   print_button(win, gettext(" Load "), y, x + 48, selected == 4);
 
wmove(win, y, x + 1 + 12 * selected);
wrefresh(win);
@@ -372,7 +374,7 @@ do_resize:
case TAB:
case KEY_RIGHT:
button = ((key == KEY_LEFT ? --button : ++button) < 0)
-   ? 2 : (button > 2 ? 0 : button);
+   ? 4 : (button > 4 ? 0 : button);
 
print_buttons(dialog, height, width, button);
wrefresh(menu);
@@ -399,17 +401,17 @@ do_resize:
return 2;
case 's':
case 'y':
-   return 3;
+   return 5;
case 'n':
-   return 4;
+   return 6;
case 'm':
-   return 5;
+   return 7;
case ' ':
-   return 6;
+   return 8;
case '/':
-   return 7;
+   return 9;
case 'z':
-   return 8;
+   return 10;
case '\n':
return button;
}
diff --git a/scripts/kconfig/mconf.c b/scripts/kconfig/mconf.c
index 53975cf..9fb90f0 100644
--- a/scripts/kconfig/mconf.c
+++ b/scripts/kconfig/mconf.c
@@ -280,6 +280,7 @@ static struct menu *current_menu;
 static int child_count;
 static int single_menu_mode;
 static int show_all_options;
+static int save_and_exit;
 
 static void conf(struct menu *menu, struct menu *active_menu);
 static void conf_choice(struct menu *menu);
@@ -651,6 +652,12 @@ static void conf(struct menu *menu, struct menu 
*active_menu)
show_helptext(_("README"), _(mconf_readme));
break;
case 3:
+   conf_save();
+   break;
+   case 4:
+   conf_load();
+   break;
+   case 5:
if (item_is_tag('t')) {
if (sym_set_tristate_value(sym, yes))
break;
@@ -658,24 +665,24 @@ static void conf(struct menu *menu, struct menu 
*active_menu)
show_textbox(NULL, setmod_text, 6, 74);
}
break;
-   case 4:
+   case 6:
if (item_is_tag('t'))
sym_set_tristate_value(sym, no);
break;
-   case 5:
+   case 7:
if (item_is_tag('t'))
sym_set_tristate_value(sym, mod);
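
The button handling in the hunks above reduces to two small computations: wrapping the selected index over five buttons, and placing each label 12 columns after the previous one. A sketch with hypothetical helper names (`next_button`, `button_column`):

```c
#include <assert.h>

#define NR_BUTTONS  5   /* Select, Exit, Help, Save, Load */
#define BUTTON_STEP 12  /* column distance between buttons in the hunk */

/* Move the selection one step left or right, wrapping at both ends. */
static int next_button(int button, int go_left)
{
	assert(button >= 0 && button < NR_BUTTONS);
	button += go_left ? -1 : 1;
	if (button < 0)
		return NR_BUTTONS - 1;
	if (button >= NR_BUTTONS)
		return 0;
	return button;
}

/* Column of a button's label, given the column of the first button. */
static int button_column(int first_x, int button)
{
	assert(button >= 0 && button < NR_BUTTONS);
	return first_x + BUTTON_STEP * button;
}
```

Because button indices 3 and 4 now mean Save and Load, the patch shifts the keyboard shortcut return codes up by two (3 becomes 5, and so on), and conf() gains matching cases.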

Re: [Suggestion] drivers/staging/tidspbridge: strcpy and strncpy, src length checking issue.

2012-12-17 Thread Chen Gang

Hello Omar Ramirez Luna:

  Sorry to bother you (maybe you are busy these days).
  Please help check this suggestion when you have free time.

  My suggestion may not be valid (I have already made at least 9 mistakes).
  Some examples of my mistakes:
    A) net/atm:  "%pM means format this pointer as a mac address", thanks to Chas Williams
    B) net/tipc: "TIPC_MAX_IF_NAME is not TIPC_MAX_LINK_NAME", thanks to Xue Ying
    C) net/core: "not see 'if (PAGE_SIZE - len < 3)' ", found by myself
    D) MAINTAINER: "tty != serial", thanks to Jiri Slaby and Joe Perches
    E) drivers/staging/telephony: "torvalds' tree is different with next tree", thanks to devendra.aaru
    F) drivers/staging/telephony: "we should probably fix it for older kernels", thanks to Dan Carpenter
    G) drivers/usb/core: "doing DMA on the stack violates the DMA rules", thanks to Oliver Neukum
    H) arch/blackfin/kernel: "%8s is used to take up the same space", thanks to Mike Frysinger and Steven Miao
    I) drivers/usb/host: "usb_hcd_giveback_urb set urb->hcpriv to NULL", thanks to Alan Stern

  Finding and solving issues is a way (not a goal) of contributing
to Open Source, so I hope that when you have free time you can
contribute to Open Source, too.

  thanks.


By the way:
  This week, I need to work on 2 patches related to the usb sub-system.
  If I still get no reply about tidspbridge by next week,
I should work on it myself; it is my duty (since I made the
'suggestion' for it).
"work on it" means:
   if tidspbridge is still useful
 I need to set up the relevant environment for unit testing,
 then provide the relevant patches.
   else (useless)
 I need to delete it from Open Source.
 (since it does not compile and there is no response from
*@ti.com, it is most likely useless)
 (at least, fix the 2 compile issues which I have reported,
so that it compiles)


  Other members are welcome to give suggestions and completions
(especially from *@ti.com).


  Regards

gchen.


On 2012-12-14 11:50, Chen Gang wrote:
> Hello Omar Ramirez Luna:
> 
>   in drivers/staging/tidspbridge/rmgr/proc.c:
> 
> if strlen(drv_datap->base_img) == size, the check passes (line 397);
> size is the full length of exec_file (line 382, line 468..469),
> so strcpy overflows: it writes strlen(drv_datap->base_img) bytes
> plus the '\0'. (line 400)
> 
> strncpy also seems to have an issue: it needs to use size instead of
> strlen(iva_img) + 1. (line 402..403)
> 
>   please help to check, thanks.
> 
> gchen.
> 
> 
>  380 static int get_exec_file(struct cfg_devnode *dev_node_obj,
>  381 struct dev_object *hdev_obj,
>  382 u32 size, char *exec_file)
>  383 {
>  384 u8 dev_type;
>  385 s32 len;
>  386 struct drv_data *drv_datap = dev_get_drvdata(bridge);
>  387 
>  388 dev_get_dev_type(hdev_obj, (u8 *) &dev_type);
>  389 
>  390 if (!exec_file)
>  391 return -EFAULT;
>  392 
>  393 if (dev_type == DSP_UNIT) {
>  394 if (!drv_datap || !drv_datap->base_img)
>  395 return -EFAULT;
>  396 
>  397 if (strlen(drv_datap->base_img) > size)
>  398 return -EINVAL;
>  399 
>  400 strcpy(exec_file, drv_datap->base_img);
>  401 } else if (dev_type == IVA_UNIT && iva_img) {
>  402 len = strlen(iva_img);
>  403 strncpy(exec_file, iva_img, len + 1);
>  404 } else {
>  405 return -ENOENT;
>  406 }
>  407 
>  408 return 0;
>  409 }
>  410 
>  ...
> 
>  465 /* Get the default executable for this board... */
>  466 dev_get_dev_type(hdev_obj, (u8 *) &dev_type);
>  467 p_proc_object->processor_id = dev_type;
>  468 status = get_exec_file(dev_node_obj, hdev_obj, 
> sizeof(sz_exec_file),
>  469sz_exec_file);
> 


-- 
Chen Gang

Asianux Corporation
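
The off-by-one described in the quoted report comes down to whether the length check leaves room for the terminating '\0'. A minimal sketch of a check that does, using a hypothetical helper `copy_name` rather than the tidspbridge function itself:

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/*
 * Copy src into dst, which holds `size` bytes including the terminator.
 * A source whose length equals `size` is rejected; that is the case a
 * `strlen(src) > size` comparison would let through.
 */
static int copy_name(char *dst, size_t size, const char *src)
{
	size_t len;

	if (!dst || !src)
		return -EFAULT;
	len = strlen(src);
	if (len >= size)        /* need len + 1 bytes for the '\0' */
		return -EINVAL;
	memcpy(dst, src, len + 1);
	return 0;
}
```

With a 4-byte buffer, "abc" fits exactly and "abcd" is refused instead of writing a fifth byte past the end.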



[PATCH] x86,apic: Blacklist x2APIC on some platforms

2012-12-17 Thread Youquan Song

Blacklist x2apic when NVIDIA graphics is enabled on the Lenovo ThinkPad T420.
Also blacklist x2apic on the Lenovo ThinkPad W520 and L520.


There are 3 bug reports:
https://bugzilla.kernel.org/show_bug.cgi?id=43054
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/776999
https://bugs.launchpad.net/bugs/922037

This patch is based on http://git.kernel.org/?p=linux/kernel/git/yinghai/
linux-yinghai.git;a=patch;h=de38757e964cfee20e6da1977572a2191d7f4aa0

Reviewed-by: Yinghai Lu 
Signed-off-by: Youquan Song 
---
 arch/x86/include/asm/x86_init.h |1 +
 arch/x86/kernel/apic/apic.c |   51 +++
 arch/x86/kernel/early-quirks.c  |9 +++
 3 files changed, 61 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 38155f6..88e39e6 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -202,5 +202,6 @@ extern struct x86_msi_ops x86_msi;
 extern struct x86_io_apic_ops x86_io_apic_ops;
 extern void x86_init_noop(void);
 extern void x86_init_uint_noop(unsigned int unused);
+extern int early_found_nvidia_display_card;
 
 #endif
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 24deb30..0822fe9 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -170,6 +170,54 @@ static __init int setup_nox2apic(char *str)
return 0;
 }
 early_param("nox2apic", setup_nox2apic);
+
+static __init int x2apic_set_blacklist_nvidia(const struct dmi_system_id *d)
+{
+   if (!early_found_nvidia_display_card)
+   return 1;
+
+   setup_nox2apic("");
+   pr_info("x2apic blacklisted when Nivida graphics enabled on %s\n",
+   d->ident);
+   return 0;
+}
+
+static __init int x2apic_set_blacklist(const struct dmi_system_id *d)
+{
+   setup_nox2apic("");
+   pr_info("x2apic blacklisted because of broken SMI on %s\n",
+   d->ident);
+   return 0;
+}
+
+static const struct dmi_system_id x2apic_dmi_table[] = {
+   {
+   .callback = x2apic_set_blacklist_nvidia,
+   .ident = "Lenovo ThinkPad T420",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+   DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad T420"),
+   },
+   },
+   {
+   .callback = x2apic_set_blacklist,
+   .ident = "Lenovo ThinkPad W520",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+   DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad W520"),
+   },
+   },
+   {
+   .callback = x2apic_set_blacklist,
+   .ident = "Lenovo ThinkPad L520",
+   .matches = {
+   DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+   DMI_MATCH(DMI_PRODUCT_VERSION, "ThinkPad L520"),
+   },
+   },
+   {}
+};
+
 #endif
 
 unsigned long mp_lapic_addr;
@@ -1542,6 +1590,9 @@ void __init enable_IR_x2apic(void)
int ret, x2apic_enabled = 0;
int hardware_init_ret;
 
+   if (x2apic_supported())
+   dmi_check_system(x2apic_dmi_table);
+
/* Make sure irq_remap_ops are initialized */
setup_irq_remapping_ops();
 
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index 7548932..852d7a0 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -19,6 +19,8 @@
 #include 
 #include 
 
+int early_found_nvidia_display_card __initdata;
+
 static void __init fix_hypertransport_config(int num, int slot, int func)
 {
u32 htcfg;
@@ -192,6 +194,11 @@ static void __init ati_bugs_contd(int num, int slot, int 
func)
 }
 #endif
 
+static void __init nvidia_x2apic_bugs(int num, int slot, int func)
+{
+   early_found_nvidia_display_card = 1;
+}
+
 #define QFLAG_APPLY_ONCE   0x1
 #define QFLAG_APPLIED  0x2
 #define QFLAG_DONE (QFLAG_APPLY_ONCE|QFLAG_APPLIED)
@@ -221,6 +228,8 @@ static struct chipset early_qrk[] __initdata = {
  PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs },
{ PCI_VENDOR_ID_ATI, PCI_DEVICE_ID_ATI_SBX00_SMBUS,
  PCI_CLASS_SERIAL_SMBUS, PCI_ANY_ID, 0, ati_bugs_contd },
+   { PCI_VENDOR_ID_NVIDIA, PCI_ANY_ID,
+ PCI_CLASS_DISPLAY_VGA, 0xff00, 0, nvidia_x2apic_bugs},
{}
 };
 
-- 
1.7.1
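
The quirk table in the patch is a sentinel-terminated array searched at boot. The following sketch reduces it to plain C. It is a simplification: `quirk_entry` and `x2apic_blacklisted` are hypothetical, the callback is folded into a flag, and strings are compared exactly rather than through the kernel's DMI matching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct quirk_entry {
	const char *vendor;
	const char *product;
	int needs_nvidia;  /* 1: apply only when an NVIDIA card was seen */
};

/*
 * Entries mirror the three machines named in the patch. The empty
 * entry terminates the table, like the trailing {} in the hunk.
 */
static const struct quirk_entry x2apic_quirks[] = {
	{ "LENOVO", "ThinkPad T420", 1 },
	{ "LENOVO", "ThinkPad W520", 0 },
	{ "LENOVO", "ThinkPad L520", 0 },
	{ NULL, NULL, 0 }
};

/* Returns 1 if x2APIC should be disabled on this machine, else 0. */
static int x2apic_blacklisted(const char *vendor, const char *product,
			      int nvidia_found)
{
	const struct quirk_entry *q;

	assert(vendor != NULL && product != NULL);
	for (q = x2apic_quirks; q->vendor; q++) {
		if (strcmp(q->vendor, vendor) || strcmp(q->product, product))
			continue;
		return !q->needs_nvidia || nvidia_found != 0;
	}
	return 0;
}
```

The T420 entry is the only conditional one: it depends on the flag that the early PCI quirk sets when it sees an NVIDIA VGA device.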


Re: [PATCH] tuntap: fix sparse warning

2012-12-17 Thread David Miller

From: Jason Wang 
Date: Tue, 18 Dec 2012 11:00:27 +0800

> Make tun_enable_queue() static to fix the sparse warning:
> 
> drivers/net/tun.c:399:19: sparse: symbol 'tun_enable_queue' was not declared. 
> Should it be static?
> 
> Reported-by: Fengguang Wu 
> Signed-off-by: Jason Wang 

Applied, thanks.

[PATCH] arm: tegra: remove USB address related macros from iomap.h

2012-12-17 Thread Venu Byravarasu

The USB register base addresses and sizes defined in iomap.h
are not used in any file other than board-dt-tegra20.c.
Hence remove those defines from the header file and use
the absolute values in the board file.

Signed-off-by: Venu Byravarasu 
---
 arch/arm/mach-tegra/board-dt-tegra20.c |6 +++---
 arch/arm/mach-tegra/iomap.h|9 -
 2 files changed, 3 insertions(+), 12 deletions(-)

diff --git a/arch/arm/mach-tegra/board-dt-tegra20.c 
b/arch/arm/mach-tegra/board-dt-tegra20.c
index 734d9cc..aae399b 100644
--- a/arch/arm/mach-tegra/board-dt-tegra20.c
+++ b/arch/arm/mach-tegra/board-dt-tegra20.c
@@ -81,11 +81,11 @@ struct of_dev_auxdata tegra20_auxdata_lookup[] __initdata = 
{
OF_DEV_AUXDATA("nvidia,tegra20-i2s", TEGRA_I2S1_BASE, "tegra20-i2s.0", 
NULL),
OF_DEV_AUXDATA("nvidia,tegra20-i2s", TEGRA_I2S2_BASE, "tegra20-i2s.1", 
NULL),
OF_DEV_AUXDATA("nvidia,tegra20-das", TEGRA_APB_MISC_DAS_BASE, 
"tegra20-das", NULL),
-   OF_DEV_AUXDATA("nvidia,tegra20-ehci", TEGRA_USB_BASE, "tegra-ehci.0",
+   OF_DEV_AUXDATA("nvidia,tegra20-ehci", 0xC5000000, "tegra-ehci.0",
   _ehci1_pdata),
-   OF_DEV_AUXDATA("nvidia,tegra20-ehci", TEGRA_USB2_BASE, "tegra-ehci.1",
+   OF_DEV_AUXDATA("nvidia,tegra20-ehci", 0xC5004000, "tegra-ehci.1",
   _ehci2_pdata),
-   OF_DEV_AUXDATA("nvidia,tegra20-ehci", TEGRA_USB3_BASE, "tegra-ehci.2",
+   OF_DEV_AUXDATA("nvidia,tegra20-ehci", 0xC5008000, "tegra-ehci.2",
   _ehci3_pdata),
OF_DEV_AUXDATA("nvidia,tegra20-apbdma", TEGRA_APB_DMA_BASE, 
"tegra-apbdma", NULL),
OF_DEV_AUXDATA("nvidia,tegra20-pwm", TEGRA_PWFM_BASE, "tegra-pwm", 
NULL),
diff --git a/arch/arm/mach-tegra/iomap.h b/arch/arm/mach-tegra/iomap.h
index db8be51..399fbca 100644
--- a/arch/arm/mach-tegra/iomap.h
+++ b/arch/arm/mach-tegra/iomap.h
@@ -240,15 +240,6 @@
 #define TEGRA_CSITE_BASE   0x7004
 #define TEGRA_CSITE_SIZE   SZ_256K
 
-#define TEGRA_USB_BASE 0xC5000000
-#define TEGRA_USB_SIZE SZ_16K
-
-#define TEGRA_USB2_BASE0xC5004000
-#define TEGRA_USB2_SIZESZ_16K
-
-#define TEGRA_USB3_BASE0xC5008000
-#define TEGRA_USB3_SIZESZ_16K
-
 #define TEGRA_SDMMC1_BASE  0xC800
 #define TEGRA_SDMMC1_SIZE  SZ_512
 
-- 
1.7.0.4


Re: [PATCH] mm/swap: add independed bio pool for swap

2012-12-17 Thread Hugh Dickins

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:

> This bio pool guarantees reclaim progress for anonymous pages.
> All available bios in fs_bio_set may be borrowed by writeback that may
> never end because the disk is too slow or broken. I have seen this situation
> in real life on a system with a lot of bio requests to a loop device
> lying on top of a special fuse-based filesystem.

Hmm, perhaps, I'm not at all sure.

I don't particularly want to fragment off yet another pool if it's
not the right approach.  Or maybe it's loop or fuse which should
have the pool, rather than swap.

If the disk is slow, I'd expect us to be okay; but if it's not
responding at all, then yes, those mempools will remain exhausted.
You're imagining swap going to a more reliable disk, but it's being
starved by the unresponding disk, so deserves a separate pool?

Let's Cc Mel and Jens, who will each have plenty of experience of
running out of bios/mempools, and the proper way to avoid or accept it.

(Note that BIO_POOL_SIZE is only 2 nowadays: when mempools were
first introduced, indeed they were sized larger; but once we found so
much memory disappearing into them, they got cut down to the minimum
needed for forward progress - I forget why that's 2 not 1).

Hugh

> 
> Signed-off-by: Konstantin Khlebnikov 
> Cc: Andrew Morton 
> Cc: Hugh Dickins 
> ---
>  mm/page_io.c |   13 -
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_io.c b/mm/page_io.c
> index 78eee32..699f85e 100644
> --- a/mm/page_io.c
> +++ b/mm/page_io.c
> @@ -22,12 +22,14 @@
>  #include 
>  #include 
>  
> +static struct bio_set *swap_bio_set;
> +
>  static struct bio *get_swap_bio(gfp_t gfp_flags,
>   struct page *page, bio_end_io_t end_io)
>  {
>   struct bio *bio;
>  
> - bio = bio_alloc(gfp_flags, 1);
> + bio = bio_alloc_bioset(gfp_flags, 1, swap_bio_set);
>   if (bio) {
>   bio->bi_sector = map_swap_page(page, >bi_bdev);
>   bio->bi_sector <<= PAGE_SHIFT - 9;
> @@ -290,3 +292,12 @@ int swap_set_page_dirty(struct page *page)
>   return __set_page_dirty_no_writeback(page);
>   }
>  }
> +
> +static int __init swap_bio_init(void)
> +{
> + swap_bio_set = bioset_create(SWAP_CLUSTER_MAX, 0);
> + if (!swap_bio_set)
> + panic("can't allocate swap_bio_set\n");
> + return 0;
> +}
> +late_initcall(swap_bio_init);
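
The discussion above turns on how a small reserve pool behaves once its elements are held by a consumer that never returns them. A toy user-space model of that behaviour follows. It is only a sketch: `struct pool` and its helpers are hypothetical, `backing_ok` is a test knob that simulates allocator failure, and the real mempool code sleeps rather than returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_MIN 2  /* reserve size; the reply says BIO_POOL_SIZE is 2 */

struct pool {
	void  *reserve[POOL_MIN];
	int    nr;          /* elements currently held in reserve */
	int    backing_ok;  /* 0 simulates the normal allocator failing */
	size_t elem_size;
};

/* Fill the reserve up front so allocations can later fall back to it. */
static int pool_init(struct pool *p, size_t elem_size)
{
	p->nr = 0;
	p->backing_ok = 1;
	p->elem_size = elem_size;
	while (p->nr < POOL_MIN) {
		void *e = malloc(elem_size);

		if (!e) {
			while (p->nr > 0)
				free(p->reserve[--p->nr]);
			return -1;
		}
		p->reserve[p->nr++] = e;
	}
	return 0;
}

/* Try the normal allocator first, then dip into the reserve. */
static void *pool_alloc(struct pool *p)
{
	void *e = p->backing_ok ? malloc(p->elem_size) : NULL;

	if (e)
		return e;
	if (p->nr > 0)
		return p->reserve[--p->nr];
	return NULL;  /* a sleeping caller would wait here for a free */
}

/* Refill the reserve before handing memory back to the allocator. */
static void pool_free(struct pool *p, void *e)
{
	assert(e != NULL);
	if (p->nr < POOL_MIN)
		p->reserve[p->nr++] = e;
	else
		free(e);
}
```

If the two reserved elements are held by I/O that never completes, every later caller finds the pool empty; a separate pool for swap would keep its own reserve out of reach of that stuck consumer.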

[PATCH v4] netfilter: nf_conntrack_sip: Handle Cisco 7941/7945 IP phones

2012-12-17 Thread Kevin Cernekee

Most SIP devices use a source port of 5060/udp on SIP requests, so the
response automatically comes back to port 5060:

phone_ip:5060 -> proxy_ip:5060   REGISTER
proxy_ip:5060 -> phone_ip:5060   100 Trying

The newer Cisco IP phones, however, use a randomly chosen high source
port for the SIP request but expect the response on port 5060:

phone_ip:49173 -> proxy_ip:5060  REGISTER
proxy_ip:5060 -> phone_ip:5060   100 Trying

Standard Linux NAT, with or without nf_nat_sip, will send the reply back
to port 49173, not 5060:

phone_ip:49173 -> proxy_ip:5060  REGISTER
proxy_ip:5060 -> phone_ip:49173  100 Trying

But the phone is not listening on 49173, so it will never see the reply.

This patch modifies nf_*_sip to work around this quirk by extracting
the SIP response port from the Via: header, iff the source IP in the
packet header matches the source IP in the SIP request.
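
For readers skimming the thread, the port-selection rule described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct sip_request_view and sip_reply_port() are hypothetical names, addresses are reduced to IPv4 integers, and a Via port of 0 is taken to mean "no port given". It is not the kernel code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one SIP request (illustration only). */
struct sip_request_view {
	uint32_t pkt_src_ip;   /* source IP taken from the IP header */
	uint16_t pkt_src_port; /* source port taken from the UDP header */
	uint32_t via_ip;       /* address parsed from the Via: header */
	uint16_t via_port;     /* port parsed from the Via: header, 0 if absent */
};

/*
 * Pick the port a response should be sent to.  The Via: port overrides the
 * UDP source port only when it differs from that source port and the Via:
 * address matches the packet's source address.
 */
static uint16_t sip_reply_port(const struct sip_request_view *req)
{
	assert(req != NULL);

	if (req->via_port != 0 &&
	    req->via_port != req->pkt_src_port &&
	    req->via_ip == req->pkt_src_ip)
		return req->via_port;

	return req->pkt_src_port;
}
```

With the Cisco trace above (source port 49173, Via: port 5060, matching addresses) the sketch picks 5060. If the Via: address differs from the packet source, it falls back to the UDP source port.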

Signed-off-by: Kevin Cernekee 
Acked-by: Eric Dumazet 
Cc: Patrick McHardy 
---


Baseline: git://1984.lsi.us.es/nf-next

v3->v4 changes:

Fix patch context and APIs to match the current Linux tree.  These
changes are from OpenWRT (Gabor?) and David W.

v4 was tested with Cisco 7945 (high UDP destination port) and Snom m9
(normal "symmetric" UDP destination port), both on IPv4 only.

I've been running a recent OpenWRT port of this patch (Attitude Adjustment
release, 3.3 kernel) for ~2mo, with both phones as clients.


 include/linux/netfilter/nf_conntrack_sip.h |3 +++
 net/netfilter/nf_conntrack_sip.c   |   17 +
 net/netfilter/nf_nat_sip.c |   27 ---
 3 files changed, 44 insertions(+), 3 deletions(-)

diff --git a/include/linux/netfilter/nf_conntrack_sip.h b/include/linux/netfilter/nf_conntrack_sip.h
index 387bdd0..ba7f571 100644
--- a/include/linux/netfilter/nf_conntrack_sip.h
+++ b/include/linux/netfilter/nf_conntrack_sip.h
@@ -4,12 +4,15 @@
 
 #include 
 
+#include 
+
 #define SIP_PORT   5060
 #define SIP_TIMEOUT3600
 
 struct nf_ct_sip_master {
unsigned intregister_cseq;
unsigned intinvite_cseq;
+   __be16  forced_dport;
 };
 
 enum sip_expectation_classes {
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index df8f4f2..72a67bb 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1440,8 +1440,25 @@ static int process_sip_request(struct sk_buff *skb, unsigned int protoff,
 {
enum ip_conntrack_info ctinfo;
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
+   struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
+   enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
unsigned int matchoff, matchlen;
unsigned int cseq, i;
+   union nf_inet_addr addr;
+   __be16 port;
+
+   /* Many Cisco IP phones use a high source port for SIP requests, but
+* listen for the response on port 5060.  If we are the local
+* router for one of these phones, save the port number from the
+* Via: header so that nf_nat_sip can redirect the responses to
+* the correct port.
+*/
+   if (ct_sip_parse_header_uri(ct, *dptr, NULL, *datalen,
+   SIP_HDR_VIA_UDP, NULL, &matchoff,
+   &matchlen, &addr, &port) > 0 &&
+   port != ct->tuplehash[dir].tuple.src.u.udp.port &&
+   nf_inet_addr_cmp(&addr, &ct->tuplehash[dir].tuple.src.u3))
+   ct_sip_info->forced_dport = port;
 
for (i = 0; i < ARRAY_SIZE(sip_handlers); i++) {
const struct sip_handler *handler;
diff --git a/net/netfilter/nf_nat_sip.c b/net/netfilter/nf_nat_sip.c
index 16303c7..5951146e 100644
--- a/net/netfilter/nf_nat_sip.c
+++ b/net/netfilter/nf_nat_sip.c
@@ -95,6 +95,7 @@ static int map_addr(struct sk_buff *skb, unsigned int protoff,
enum ip_conntrack_info ctinfo;
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
+   struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
char buffer[INET6_ADDRSTRLEN + sizeof("[]:n")];
unsigned int buflen;
union nf_inet_addr newaddr;
@@ -107,7 +108,8 @@ static int map_addr(struct sk_buff *skb, unsigned int protoff,
} else if (nf_inet_addr_cmp(&ct->tuplehash[dir].tuple.dst.u3, addr) &&
   ct->tuplehash[dir].tuple.dst.u.udp.port == port) {
newaddr = ct->tuplehash[!dir].tuple.src.u3;
-   newport = ct->tuplehash[!dir].tuple.src.u.udp.port;
+   newport = ct_sip_info->forced_dport ? :
+ ct->tuplehash[!dir].tuple.src.u.udp.port;
} else
return 1;
 
@@ -144,6 +146,7 @@ static unsigned int nf_nat_sip(struct sk_buff *skb, unsigned int protoff,
enum ip_conntrack_info ctinfo;
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
+   struct nf_ct_sip_master

[PULL REQUEST] md update for 3.8

2012-12-17 Thread NeilBrown


Hi Linus,
  Not much for md this time round.

Thanks,
NeilBrown

The following changes since commit 874807a83139abc094f939e93623c5623573d543:

  md/raid1{,0}: fix deadlock in bitmap_unplug. (2012-11-27 12:14:40 +1100)

are available in the git repository at:

  git://neil.brown.name/md/ tags/md-3.8

for you to fetch changes up to a9add5d92b64ea57fb4c3b557c3891cdeb15fa0c:

  md/raid5: add blktrace calls (2012-12-18 10:22:21 +1100)


md update for 3.8

Mostly just little fixes.  Probably biggest part is
AVX accelerated RAID6 calculations.


Jim Kukunas (1):
  lib/raid6: Add AVX2 optimized recovery functions

NeilBrown (5):
  md: removed unused variable in calc_sb_1_csm.
  md: close race between removing and adding a device.
  md.c: re-indent various 'switch' statements.
  md/raid5: use async_tx_quiesce() instead of open-coding it.
  md/raid5: add blktrace calls

Yuanhan Liu (2):
  lib/raid6: Add AVX2 optimized gen_syndrome functions
  lib/raid6: build proper files on corresponding arch

kernelmail (1):
  md:Add place to update ->recovery_cp.

majianpeng (2):
  md: Update checkpoint of resync/recovery based on time.
  md: Use ->curr_resync as last completed request when cleanly aborting resync.

 arch/x86/Makefile   |   5 +-
 drivers/md/md.c | 256 --
 drivers/md/md.h |   2 +
 drivers/md/raid5.c  |  43 +--
 include/linux/raid/pq.h |   4 +
 lib/raid6/Makefile  |   9 +-
 lib/raid6/algos.c   |  12 ++
 lib/raid6/altivec.uc|   3 -
 lib/raid6/avx2.c| 251 +
 lib/raid6/mmx.c |   2 +-
 lib/raid6/recov_avx2.c  | 323 
 lib/raid6/recov_ssse3.c |   4 -
 lib/raid6/sse1.c|   2 +-
 lib/raid6/sse2.c|   8 +-
 lib/raid6/test/Makefile |  29 -
 lib/raid6/x86.h |  14 ++-
 16 files changed, 809 insertions(+), 158 deletions(-)
 create mode 100644 lib/raid6/avx2.c
 create mode 100644 lib/raid6/recov_avx2.c



Re: [PATCH] firmware: make sure paths remain relative

2012-12-17 Thread Ming Lei

On Tue, Dec 18, 2012 at 12:09 PM, Kees Cook  wrote:
>
> Do you mean a printk should be emitted on this error path? I can add that if 
> so.

dev_err() should be better. With that, please feel free to add

   Acked-by: Ming Lei 

Thanks,
--
Ming Lei

Re: [PATCH] backlight: add lms501kf03 LCD driver

2012-12-17 Thread Jingoo Han

On Tuesday, December 18, 2012 1:51 AM, devendra.aaru wrote
> Hello,
> 
> > +static int lms501kf03_spi_write(struct lms501kf03 *lcd, unsigned char 
> > address,
> > +   unsigned char command)
> > +{
> > +   int ret;
> > +
> > +   ret = lms501kf03_spi_write_byte(lcd, address, command);
> > +
> > +   return ret;
> 
> there is redundancy here;
> you can drop ret and return the call's result directly.

OK. I will fix it.

> 
> > +}
> > +
> > +static int lms501kf03_panel_send_sequence(struct lms501kf03 *lcd,
> > +   const unsigned short *wbuf)
> > +{
> > +   int ret = 0, i = 0;
> > +
> > +   while (wbuf[i] != ENDDEF) {
> > +   if (i == 0)
> > +   ret = lms501kf03_spi_write(lcd, COMMAND_ONLY, 
> > wbuf[i]);
> > +   else
> > +   ret = lms501kf03_spi_write(lcd, DATA_ONLY, wbuf[i]);
> > +   if (ret)
> > +   break;
> > +   i += 1;
> > +   }
> > +
> > +   return ret;
> > +}
> > +
> > +static int lms501kf03_ldi_init(struct lms501kf03 *lcd)
> > +{
> > +   int ret, i;
> > +   static const unsigned short *init_seq[] = {
> > +   seq_password,
> > +   seq_power,
> > +   seq_display,
> > +   seq_rgb_if,
> > +   seq_display_inv,
> > +   seq_vcom,
> > +   seq_gate,
> > +   seq_panel,
> > +   seq_col_mod,
> > +   seq_w_gamma,
> > +   seq_rgb_gamma,
> > +   seq_sleep_out,
> > +   };
> > +
> > +   for (i = 0; i < ARRAY_SIZE(init_seq); i++) {
> > +   ret = lms501kf03_panel_send_sequence(lcd, init_seq[i]);
> > +   if (ret)
> > +   break;
> > +   }
> > +   /* according to the datasheet, 120ms delay time is required. */
> it would be good to say in the comment why the 120ms delay is required, or
> you can put a link to the datasheet if one is available.

OK, I will add more explanation.
However, link to datasheet is not available.

> 
> > +   msleep(120);
> > +
> > +   return ret;
> > +}
> > +
> > +static int lms501kf03_ldi_enable(struct lms501kf03 *lcd)
> > +{
> > +   return lms501kf03_panel_send_sequence(lcd, seq_display_on);
> > +}
> > +
> > +static int lms501kf03_ldi_disable(struct lms501kf03 *lcd)
> > +{
> > +   return lms501kf03_panel_send_sequence(lcd, seq_display_off);
> > +}
> > +
> > +static int lms501kf03_power_is_on(int power)
> > +{
> > +   return (power) <= FB_BLANK_NORMAL;
> > +}
> > +
> > +static int lms501kf03_power_on(struct lms501kf03 *lcd)
> > +{
> > +   int ret = 0;
> > +   struct lcd_platform_data *pd;
> > +
> > +   pd = lcd->lcd_pd;
> > +
> > +   if (!pd->power_on) {
> > +   dev_err(lcd->dev, "power_on is NULL.\n");
> > +   return -EFAULT;
> we may need to do -EINVAL instead of EFAULT as EFAULT tends to be for
> the invalid memory addresses.

OK. I will fix it.

> 
> > +   } else {
> > +   pd->power_on(lcd->ld, 1);
> > +   msleep(pd->power_on_delay);
> > +   }
> > +
> > +   if (!pd->reset) {
> > +   dev_err(lcd->dev, "reset is NULL.\n");
> > +   return -EFAULT;
> 
> may be here too..

OK. I will fix it.

> 
> > +   } else {
> > +   pd->reset(lcd->ld);
> > +   msleep(pd->reset_delay);
> > +   }
> > +
> > +   ret = lms501kf03_ldi_init(lcd);
> > +   if (ret) {
> > +   dev_err(lcd->dev, "failed to initialize ldi.\n");
> > +   return ret;
> > +   }
> > +
> > +   ret = lms501kf03_ldi_enable(lcd);
> > +   if (ret) {
> > +   dev_err(lcd->dev, "failed to enable ldi.\n");
> > +   return ret;
> > +   }
> > +
> > +   return 0;
> > +}
> > +
> > +static int lms501kf03_power_off(struct lms501kf03 *lcd)
> > +{
> > +   int ret = 0;
> > +   struct lcd_platform_data *pd;
> > +
> > +   pd = lcd->lcd_pd;
> > +
> > +   ret = lms501kf03_ldi_disable(lcd);
> > +   if (ret) {
> > +   dev_err(lcd->dev, "lcd setting failed.\n");
> > +   return -EIO;
> > +   }
> > +
> > +   msleep(pd->power_off_delay);
> > +
> > +   pd->power_on(lcd->ld, 0);
> > +
> 
> seems that you are calling the core lcd framework api, i am curious to
> know why, :p , and obviously i dunno why its calling that way.
> 
> > +   return 0;
> > +}
> > +
> > +static int lms501kf03_power(struct lms501kf03 *lcd, int power)
> > +{
> > +   int ret = 0;
> > +
> > +   if (lms501kf03_power_is_on(power) &&
> > +   !lms501kf03_power_is_on(lcd->power))
> > +   ret = lms501kf03_power_on(lcd);
> > +   else if (!lms501kf03_power_is_on(power) &&
> > +   lms501kf03_power_is_on(lcd->power))
> > +   ret = lms501kf03_power_off(lcd);
> > +
> 
> seems that ret is used

Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-17 Thread Hugh Dickins

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:

> The content of non-uptodate pages is completely random; we cannot expose it to
> userspace. Doing so leaks information and will crash userspace for sure.

Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.

> Probably we can reuse hwpoison entries here, but tmpfs is already too complex.

HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.

> 
> Signed-off-by: Konstantin Khlebnikov 
> Original-patch-by: Alexey Kuznetsov 
> Cc: Andrew Morton 
> Cc: Hugh Dickins 
> Cc: Andi Kleen 
> ---
>  mm/swapfile.c |   16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index e97a0e5..98fc2fd 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
>   wait_on_page_writeback(page);
>  
>   /*
> +  * If read failed we cannot map not-uptodate page to
> +  * user space. Actually, we are in serious troubles,
> +  * we do not even know what process to kill. So, the only

try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.

> +  * variant remains: to stop swapoff() and allow someone
> +  * to kill processes to zap invalid pages.

No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.

> +  *
> +  * TODO replace page with hwpoison entry in pte and shmem.

Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?

Thanks,
Hugh

> +  */
> + if (unlikely(!PageUptodate(page))) {
> + unlock_page(page);
> + page_cache_release(page);
> + retval = -EIO;
> + break;
> + }
> +
> + /*
>* Remove all references to entry.
>*/
>   swcount = *swap_map;

mutex warning in intel_cacheinfo.c:cpu_list_show

2012-12-17 Thread Dave Jones

(At least I think that's where 'cpu_list_show' comes from...
 those preprocessor tricks confuse ctags)

Just started seeing this today..
(fwiw, cpu is a Phenom(tm) 9750)

Dave

WARNING: at kernel/mutex.c:198 mutex_lock_nested+0x39c/0x3b0()
Hardware name: GA-MA78GM-S2H
Modules linked in: hidp fuse l2tp_ppp l2tp_core bnep rfcomm binfmt_misc phonet 
rose af_key nfnetlink caif_socket caif ipt_ULOG af_rxrpc netrom appletalk ax25 
can ipx p8023 rds psnap nfc p8022 atm scsi_transport_iscsi pppoe pppox 
ppp_generic llc2 irda decnet slhc llc x25 crc_ccitt nfsv3 nfs_acl nfs fscache 
lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack 
nf_conntrack ip6table_filter ip6_tables snd_hda_codec_realtek microcode 
snd_hda_intel btusb snd_hda_codec bluetooth serio_raw usb_debug snd_pcm pcspkr 
snd_page_alloc snd_timer snd edac_core soundcore r8169 mii vhost_net tun 
macvtap macvlan kvm_amd kvm
Pid: 667, comm: trinity-child2 Not tainted 3.7.0+ #25
Call Trace:
 [] warn_slowpath_common+0x7f/0xc0
 [] warn_slowpath_null+0x1a/0x20
 [] mutex_lock_nested+0x39c/0x3b0
 [] ? cpu_list_show+0x35/0x70
 [] ? cpu_list_show+0x35/0x70
 [] ? sysfs_read_file+0x8b/0x1c0
 [] cpu_list_show+0x35/0x70
 [] dev_attr_show+0x20/0x60
 [] ? sysfs_read_file+0x8b/0x1c0
 [] sysfs_read_file+0xb7/0x1c0
 [] vfs_read+0xac/0x180
 [] sys_pread64+0x9a/0xb0
 [] system_call_fastpath+0x16/0x1b
 [] ? ack_flush+0x1/0x10 [kvm]
---[ end trace 6d0328611cb687b5 ]---


Re: [PATCH] firmware: make sure paths remain relative

2012-12-17 Thread Kees Cook

On Mon, Dec 17, 2012 at 6:15 PM, Ming Lei  wrote:
> On Tue, Dec 18, 2012 at 9:37 AM, Kees Cook  wrote:
>> On Mon, Dec 17, 2012 at 5:30 PM, Ming Lei  wrote:
>>> On Sat, Dec 15, 2012 at 6:51 AM, Kees Cook  wrote:
>>>> Some devices have configurable firmware locations. If these configuration
>>>> mechanisms are exposed to unprivileged userspace, it may be possible to
>>>
>>> I am wondering how unprivileged userspace can write the firmware sysfs
>>> to trigger loading firmware?
>>
>> If a daemon were to, for example, make firmware selectable by the user
>> (which under certain situations is possible in Chrome OS), it seems
>> wasteful to require these userspace tools/interfaces to each perform
>> filtering, so I figured it would be trivial to put in here instead to
>> avoid possible future vulnerabilities.
>
> OK, I understand your concern, and looks reasonable wrt. the specific
> problem, and IMO, it is better to provide failure log so that the affected
> device driver can be fixed easily.

Do you mean a printk should be emitted on this error path? I can add that if so.

-Kees

-- 
Kees Cook
Chrome OS Security

[PATCH v2] iio: dac: ad5446: Don't set error code to voltage_uv

2012-12-17 Thread Axel Lin

regulator_get_voltage() may return negative error code.
Add error checking to avoid setting error code to voltage_uv.

Signed-off-by: Axel Lin 
---
Sorry. Just found I made the same mistake again.
Here is v2, should check if ret is negative value.
 drivers/iio/dac/ad5446.c |6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iio/dac/ad5446.c b/drivers/iio/dac/ad5446.c
index 3310cbb..607378f 100644
--- a/drivers/iio/dac/ad5446.c
+++ b/drivers/iio/dac/ad5446.c
@@ -226,7 +226,11 @@ static int __devinit ad5446_probe(struct device *dev, const char *name,
if (ret)
goto error_put_reg;
 
-   voltage_uv = regulator_get_voltage(reg);
+   ret = regulator_get_voltage(reg);
+   if (ret < 0)
+   goto error_disable_reg;
+
+   voltage_uv = ret;
}
 
indio_dev = iio_device_alloc(sizeof(*st));
-- 
1.7.9.5
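
The pattern in the hunk above (take the result in a signed int, test it, then assign) can be shown in a small standalone sketch. It assumes that the destination variable is unsigned, so a negative errno stored directly would look like a huge voltage. fake_get_voltage() and read_vref_uv() are hypothetical stand-ins for illustration, not part of the driver or the regulator API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for regulator_get_voltage(): returns microvolts on
 * success or a negative errno on failure.
 */
static int fake_get_voltage(int fail)
{
	return fail ? -EINVAL : 3300000;
}

/*
 * Store the voltage only after the return value is known to be valid, so an
 * error code never ends up in *voltage_uv.
 */
static int read_vref_uv(int fail, unsigned int *voltage_uv)
{
	int ret;

	assert(voltage_uv != NULL);

	ret = fake_get_voltage(fail);
	if (ret < 0)
		return ret;	/* propagate the error, leave *voltage_uv alone */

	*voltage_uv = ret;
	return 0;
}
```

On failure the caller sees the negative error code and the previously stored voltage is untouched.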




Re: [PATCH 02/15] mm,ksm: use new hashtable implementation

2012-12-17 Thread Hugh Dickins

On Mon, 17 Dec 2012, Sasha Levin wrote:
> Switch ksm to use the new hashtable implementation. This reduces the amount of
> generic unrelated code in the ksm module.
> 
> This patch depends on d9b482c ("hashtable: introduce a small and naive
> hashtable") which was merged in v3.6.
> 
> Signed-off-by: Sasha Levin 

This seems fine, thanks:
except please drop that irrelevant final hunk to ksm_init(), then
Acked-by: Hugh Dickins 

> ---
>  mm/ksm.c | 31 +--
>  1 file changed, 13 insertions(+), 18 deletions(-)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 382d930..e888f54 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -33,7 +33,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  
> @@ -156,9 +156,8 @@ struct rmap_item {
>  static struct rb_root root_stable_tree = RB_ROOT;
>  static struct rb_root root_unstable_tree = RB_ROOT;
>  
> -#define MM_SLOTS_HASH_SHIFT 10
> -#define MM_SLOTS_HASH_HEADS (1 << MM_SLOTS_HASH_SHIFT)
> -static struct hlist_head mm_slots_hash[MM_SLOTS_HASH_HEADS];
> +#define MM_SLOTS_HASH_BITS 10
> +static DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
>  
>  static struct mm_slot ksm_mm_head = {
>   .mm_list = LIST_HEAD_INIT(ksm_mm_head.mm_list),
> @@ -275,26 +274,21 @@ static inline void free_mm_slot(struct mm_slot *mm_slot)
>  
>  static struct mm_slot *get_mm_slot(struct mm_struct *mm)
>  {
> - struct mm_slot *mm_slot;
> - struct hlist_head *bucket;
>   struct hlist_node *node;
> + struct mm_slot *slot;
> +
> + hash_for_each_possible(mm_slots_hash, slot, node, link, (unsigned long)mm)
> + if (slot->mm == mm)
> + return slot;
>  
> - bucket = &mm_slots_hash[hash_ptr(mm, MM_SLOTS_HASH_SHIFT)];
> - hlist_for_each_entry(mm_slot, node, bucket, link) {
> - if (mm == mm_slot->mm)
> - return mm_slot;
> - }
>   return NULL;
>  }
>  
>  static void insert_to_mm_slots_hash(struct mm_struct *mm,
>   struct mm_slot *mm_slot)
>  {
> - struct hlist_head *bucket;
> -
> - bucket = &mm_slots_hash[hash_ptr(mm, MM_SLOTS_HASH_SHIFT)];
>   mm_slot->mm = mm;
> - hlist_add_head(&mm_slot->link, bucket);
> + hash_add(mm_slots_hash, &mm_slot->link, (unsigned long)mm);
>  }
>  
>  static inline int in_stable_tree(struct rmap_item *rmap_item)
> @@ -647,7 +641,7 @@ static int unmerge_and_remove_all_rmap_items(void)
>   ksm_scan.mm_slot = list_entry(mm_slot->mm_list.next,
>   struct mm_slot, mm_list);
>   if (ksm_test_exit(mm)) {
> - hlist_del(&mm_slot->link);
> + hash_del(&mm_slot->link);
>   list_del(&mm_slot->mm_list);
>   spin_unlock(&ksm_mmlist_lock);
>  
> @@ -1392,7 +1386,7 @@ next_mm:
>* or when all VM_MERGEABLE areas have been unmapped (and
>* mmap_sem then protects against race with MADV_MERGEABLE).
>*/
> - hlist_del(&slot->link);
> + hash_del(&slot->link);
>   list_del(&slot->mm_list);
>   spin_unlock(&ksm_mmlist_lock);
>  
> @@ -1559,7 +1553,7 @@ void __ksm_exit(struct mm_struct *mm)
>   mm_slot = get_mm_slot(mm);
>   if (mm_slot && ksm_scan.mm_slot != mm_slot) {
>   if (!mm_slot->rmap_list) {
> - hlist_del(&mm_slot->link);
> + hash_del(&mm_slot->link);
>   list_del(&mm_slot->mm_list);
>   easy_to_free = 1;
>   } else {
> @@ -2035,6 +2029,7 @@ static int __init ksm_init(void)
>*/
>   hotplug_memory_notifier(ksm_memory_callback, 100);
>  #endif
> +
>   return 0;
>  
>  out_free:
> -- 
> 1.8.0

[PATCH] tuntap: fix sparse warning

2012-12-17 Thread Jason Wang

Make tun_enable_queue() static to fix the sparse warning:

drivers/net/tun.c:399:19: sparse: symbol 'tun_enable_queue' was not declared. Should it be static?

Reported-by: Fengguang Wu 
Signed-off-by: Jason Wang 
---
 drivers/net/tun.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 173acf5..504f7f1 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -396,7 +396,7 @@ static void tun_disable_queue(struct tun_struct *tun, struct tun_file *tfile)
++tun->numdisabled;
 }
 
-struct tun_struct *tun_enable_queue(struct tun_file *tfile)
+static struct tun_struct *tun_enable_queue(struct tun_file *tfile)
 {
struct tun_struct *tun = tfile->detached;
 
-- 
1.7.1


Re: [RFC][PATCH v6]trace,x86: add x86 irq vector tracepoints

2012-12-17 Thread Steven Rostedt

On Tue, 2012-12-18 at 01:34 +, Seiji Aguchi wrote:
> Change log 
> 
>  v5 -> v6
>  - Rebased to 3.7
>  
>  v4 -> v5
>  - Rebased to 3.6.0
> 
>  - Introduce a logic switching IDT at enabling/disabling TP time 
>so that a time penalty makes a zero when tracepoints are disabled.
>This IDT is created only when CONFIG_TRACEPOINTS is enabled.
> 
>  - Remove arch_irq_vector_entry/exit and add followings again
>so that we can add each tracepoint in a generic way.
>- error_apic_vector
>- thermal_apic_vector
>- threshold_apic_vector
>- spurious_apic_vector
>- x86_platform_ipi_vector
> 
>  - Drop nmi tracepoints to begin with apic interrupts and discuss a logic 
> switching
>IDT first.
> 
>  - Move irq_vectors.h in the directory of arch/x86/include/asm/trace because
>I'm not sure if a logic switching IDT is sharable with other architectures.
> 
>  v3 -> v4
>  - Add a latency measurement of each tracepoint
>  - Rebased to 3.6-rc6
> 
>  v2 -> v3
>  - Remove an invalidate_tlb_vector event because it was replaced by a call 
> function vector
>in a following commit.
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=52aec3308db85f4e9f5c8b9f5dc4fbd0138c6fa4
> 
>  v1 -> v2
>  - Modify variable name from irq to vector.
>  - Merge arch-specific tracepoints below to an arch_irq_vector_entry/exit.
>- error_apic_vector
>- thermal_apic_vector
>- threshold_apic_vector
>- spurious_apic_vector
>- x86_platform_ipi_vector
> 
> [Purpose of this patch]
> 
> As Vaibhav explained in the thread below, tracepoints for irq vectors
> are useful.
> 
> http://www.spinics.net/lists/mm-commits/msg85707.html
> 
> 
> The current interrupt traces from irq_handler_entry and irq_handler_exit
> provide when an interrupt is handled.  They provide good data about when
> the system has switched to kernel space and how it affects the currently
> running processes.
> 
> There are some IRQ vectors which trigger the system into kernel space,
> which are not handled in generic IRQ handlers.  Tracing such events gives
> us the information about IRQ interaction with other system events.
> 
> The trace also tells where the system is spending its time.  We want to
> know which cores are handling interrupts and how they are affecting other
> processes in the system.  Also, the trace provides information about when
> the cores are idle and which interrupts are changing that state.
> 
> 
> On the other hand, my usecase is tracing just local timer event and 
> getting a value of instruction pointer.
> 
>   I suggested to add an argument local timer event to get instruction pointer 
> before.
>   But there is another way to get it with external module like systemtap.
>   So, I don't need to add any argument to irq vector tracepoints now.
> 
> [Patch Description]
> 
> Vaibhav's patch shared a single tracepoint pair, irq_vector_entry/irq_vector_exit,
> across all events.
> But the use case above needs to trace a specific irq vector rather than
> all events.
> In that case, we are concerned about the overhead of unwanted events.
> 
> This patch adds the following tracepoints instead of introducing
> irq_vector_entry/exit, so that we can enable them independently.
>- local_timer_vector
>- reschedule_vector
>- call_function_vector
>- call_function_single_vector 
>- irq_work_entry_vector
>- error_apic_vector
>- thermal_apic_vector
>- threshold_apic_vector
>- spurious_apic_vector
>- x86_platform_ipi_vector
> 
> Also, it introduces logic that switches the IDT when tracepoints are enabled or
> disabled, so that the time penalty is zero while tracepoints are disabled.
> Detailed explanations are as follows.
>  - Create new irq handlers with tracepoints inserted, using macros.
>  - Create a new IDT, trace_idt_table, at boot time by duplicating the original
>    IDT, idt_table, and registering the new handlers for tracepoints.
>  - Switch to the new IDT when tracepoints are enabled.
>  - Restore the original IDT when tracepoints are disabled.
> The new IDT is created only when CONFIG_TRACEPOINTS is enabled, to avoid it being
> used for other purposes.
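
The table-switching idea in the list above can be modelled in userspace with two arrays of function pointers and one "active table" pointer. This is a rough sketch under stated assumptions, not the x86 implementation: NR_VECTORS, the counters, and all function names except idt_table and trace_idt_table are made up for illustration, and real IDT entries are gate descriptors loaded into the CPU rather than C function pointers.

```c
#include <assert.h>
#include <stddef.h>

#define NR_VECTORS 4	/* illustrative size, not the real x86 vector count */

typedef void (*handler_fn)(void);

static int handled_calls;	/* how often the real work ran */
static int traced_calls;	/* how often a tracepoint fired */

static void plain_handler(void)
{
	handled_calls++;
}

/* Same work as plain_handler(), plus the tracepoint bookkeeping. */
static void traced_handler(void)
{
	traced_calls++;
	plain_handler();
}

static handler_fn idt_table[NR_VECTORS];
static handler_fn trace_idt_table[NR_VECTORS];
static handler_fn *active_idt = idt_table;

/* Build the original table, duplicate it, then register traced handlers. */
static void setup_tables(void)
{
	size_t i;

	for (i = 0; i < NR_VECTORS; i++) {
		idt_table[i] = plain_handler;
		trace_idt_table[i] = idt_table[i];
		trace_idt_table[i] = traced_handler;
	}
}

/* Enabling or disabling tracing only swaps which table is active. */
static void set_tracing(int on)
{
	active_idt = on ? trace_idt_table : idt_table;
}

static void raise_vector(size_t vector)
{
	assert(vector < NR_VECTORS);
	active_idt[vector]();
}
```

While tracing is off, raise_vector() goes through plain_handler() only, so the untraced path carries no check for whether tracing is enabled.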
> 
> Signed-off-by: Seiji Aguchi 
> ---
>  arch/x86/include/asm/desc.h  |   27 +
>  arch/x86/include/asm/entry_arch.h|   32 +
>  arch/x86/include/asm/hw_irq.h|   14 +++
>  arch/x86/kernel/Makefile |1 +
>  arch/x86/kernel/apic/apic.c  |  186 +-
>  arch/x86/kernel/cpu/mcheck/therm_throt.c |   26 +++--
>  arch/x86/kernel/cpu/mcheck/threshold.c   |   27 +++--
>  arch/x86/kernel/entry_64.S   |   33 ++
>  arch/x86/kernel/head_64.S|6 +
>  arch/x86/kernel/irq.c|   44 ---
>  arch/x86/kernel/irq_work.c   |   22 +++-
>  arch/x86/kernel/irqinit.c|2 +
>  arch/x86/kernel/smp.c|   68 
>  13 files changed, 345 insertions(+),

linux-next: Tree for Dec 18

2012-12-17 Thread Stephen Rothwell

Hi all,

Changes since 20121217:

Lots of conflicts are migrating between trees.

The btrfs tree lost its build failure.

The akpm tree gained a build failure for which I applied a patch.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 214 trees (counting Linus' and 28 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwells...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (9228ff9 Merge branch 'for-3.8/drivers' of 
git://git.kernel.dk/linux-block)
Merging fixes/master (8041853 disable the SB105X driver)
Merging kbuild-current/rc-fixes (bad9955 menuconfig: Replace CIRCLEQ by 
list_head-style lists.)
Merging arm-current/fixes (dad5451 ARM: 7605/1: vmlinux.lds: Move .notes 
section next to the rodata)
Merging m68k-current/for-linus (5fec45a m68k/sun3: Fix instruction faults)
Merging powerpc-merge/merge (e716e01 powerpc/eeh: Do not invalidate PE properly)
Merging sparc/master (66cdd0c Merge tag 'kvm-3.8-1' of 
git://git.kernel.org/pub/scm/virt/kvm/kvm)
Merging net/master (76fe458 tuntap: reset network header before calling 
skb_get_rxhash())
Merging sound-current/for-linus (d846b17 ALSA: hda - Fix build without 
CONFIG_PM)
Merging pci-current/for-linus (ff8e59b PCI/portdrv: Don't create hotplug slots 
unless port supports hotplug)
Merging wireless/master (009b969 wireless: fix Atheros drivers compilation)
Merging driver-core.current/driver-core-linus (9360b53 Revert "bdi: add a 
user-tunable cpu_list for the bdi flusher threads")
Merging tty.current/tty-linus (1ebaf4f Merge branch 'x86-timers-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging usb.current/usb-linus (1ebaf4f Merge branch 'x86-timers-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging staging.current/staging-linus (1ebaf4f Merge branch 
'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging char-misc.current/char-misc-linus (1ebaf4f Merge branch 
'x86-timers-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip)
Merging input-current/for-linus (022573c Merge branch 'next' into for-linus)
CONFLICT (modify/delete): drivers/input/touchscreen/ti_tscadc.c deleted in HEAD 
and modified in input-current/for-linus. Version input-current/for-linus of 
drivers/input/touchscreen/ti_tscadc.c left in tree.
CONFLICT (content): Merge conflict in arch/arm/mach-ux500/board-mop500-stuib.c
$ git rm -f drivers/input/touchscreen/ti_tscadc.c
Merging md-current/for-linus (749586b md/raid5: use async_tx_quiesce() instead 
of open-coding it.)
Merging audit-current/for-linus (c158a35 audit: no leading space in 
audit_log_d_path prefix)
Merging crypto-current/master (a2c0911 crypto: caam - Updated SEC-4.0 device 
tree binding for ERA information.)
Merging ide/master (9974e43 ide: fix generic_ide_suspend/resume Oops)
Merging dwmw2/master (03a0b4c solos-pci: fix double-free of TX skb in DMA mode)
CONFLICT (content): Merge conflict in arch/x86/Kconfig.cpu
CONFLICT (content): Merge conflict in arch/x86/Kconfig
CONFLICT (content): Merge conflict in arch/powerpc/Kconfig
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to 
inline functions)
Merging irqdomain-current/irqdomain/mer

[PATCH 2/2]linux-usb:optimize to match the Huawei USB storage devices and support new switch command

2012-12-17 Thread fangxiaozhi 00110321

From: fangxiaozhi 

1. Optimize the match rules with a new macro for Huawei USB storage devices,
   to avoid loading the USB storage driver for the modem interface
   on Huawei devices.
2. Add support for a new switch command for new Huawei USB dongles.

Signed-off-by: fangxiaozhi 


diff -uprN linux-3.7_bak/drivers/usb/storage/initializers.c 
linux-3.7/drivers/usb/storage/initializers.c
--- linux-3.7_bak/drivers/usb/storage/initializers.c2012-12-11 
09:56:11.0 +0800
+++ linux-3.7/drivers/usb/storage/initializers.c2012-12-17 
11:12:12.0 +0800
@@ -92,8 +92,8 @@ int usb_stor_ucr61s2b_init(struct us_dat
return 0;
 }
 
-/* This places the HUAWEI E220 devices in multi-port mode */
-int usb_stor_huawei_e220_init(struct us_data *us)
+/* This places the HUAWEI usb dongles in multi-port mode */
+static int usb_stor_huawei_feature_init(struct us_data *us)
 {
int result;
 
@@ -104,3 +104,59 @@ int usb_stor_huawei_e220_init(struct us_
US_DEBUGP("Huawei mode set result is %d\n", result);
return 0;
 }
+
+/* Find the supported Huawei USB dongles */
+static int usb_stor_huawei_dongles_pid(struct us_data *us)
+{
+   struct usb_interface_descriptor *idesc;
+   int idProduct;
+   
+   idesc = &us->pusb_intf->cur_altsetting->desc;
+   idProduct = us->pusb_dev->descriptor.idProduct;
+   if (idesc && idesc->bInterfaceNumber == 0) {
+   if ((idProduct == 0x1001)
+   || (idProduct == 0x1003)
+   || (idProduct == 0x1004)
+   || (idProduct >= 0x1401 && idProduct < 0x1501)
+   || (idProduct > 0x1504 && idProduct <= 0x1600)
+   || (idProduct >= 0x1c02 && idProduct <= 0x2202)) {
+   return 1;
+   }
+   }
+   return 0;
+}
+
+static int usb_stor_huawei_scsi_init(struct us_data *us)
+{
+   int result = 0;
+   int act_len = 0;
+   char rewind_cmd[] = {0x11, 0x06, 0x20, 0x00, 0x00, 0x01, 0x01, 0x00,
+   0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+   struct bulk_cb_wrap *bcbw = (struct bulk_cb_wrap *) us->iobuf;
+   
+   memset(bcbw, 0, sizeof(struct bulk_cb_wrap));
+   bcbw->Signature = cpu_to_le32(US_BULK_CB_SIGN);
+   bcbw->Tag = 0;
+   bcbw->DataTransferLength = 0;
+   bcbw->Flags = bcbw->Lun = 0;
+   bcbw->Length = sizeof(rewind_cmd);
+   memcpy(bcbw->CDB, rewind_cmd, sizeof(rewind_cmd));
+
+   result = usb_stor_bulk_transfer_buf(us, us->send_bulk_pipe, bcbw,
+   US_BULK_CS_WRAP_LEN, &act_len);
+   US_DEBUGP("transfer actual length=%d, result=%d\n", act_len, result);
+   return result;
+}
+
+int usb_stor_huawei_init(struct us_data *us)
+{
+   int result = 0;
+   
+   if (usb_stor_huawei_dongles_pid(us)) {
+   if (us->pusb_dev->descriptor.idProduct >= 0x1446)
+   result = usb_stor_huawei_scsi_init(us);
+   else
+   result = usb_stor_huawei_feature_init(us);
+   }
+   return result;
+}
diff -uprN linux-3.7_bak/drivers/usb/storage/initializers.h 
linux-3.7/drivers/usb/storage/initializers.h
--- linux-3.7_bak/drivers/usb/storage/initializers.h2012-12-11 
09:56:11.0 +0800
+++ linux-3.7/drivers/usb/storage/initializers.h2012-12-17 
10:39:55.0 +0800
@@ -46,5 +46,5 @@ int usb_stor_euscsi_init(struct us_data 
  * flash reader */
 int usb_stor_ucr61s2b_init(struct us_data *us);
 
-/* This places the HUAWEI E220 devices in multi-port mode */
-int usb_stor_huawei_e220_init(struct us_data *us);
+/* This places the HUAWEI usb dongles in multi-port mode */
+int usb_stor_huawei_init(struct us_data *us);
Binary files linux-3.7_bak/drivers/usb/storage/initializers.o and 
linux-3.7/drivers/usb/storage/initializers.o differ
diff -uprN linux-3.7_bak/drivers/usb/storage/unusual_devs.h 
linux-3.7/drivers/usb/storage/unusual_devs.h
--- linux-3.7_bak/drivers/usb/storage/unusual_devs.h2012-12-11 
09:56:11.0 +0800
+++ linux-3.7/drivers/usb/storage/unusual_devs.h2012-12-17 
10:40:10.0 +0800
@@ -1527,335 +1527,10 @@ UNUSUAL_DEV(  0x1210, 0x0003, 0x0100, 0x
 /* Reported by fangxiaozhi 
  * This brings the HUAWEI data card devices into multi-port mode
  */
-UNUSUAL_DEV(  0x12d1, 0x1001, 0x0000, 0x0000,
+UNUSUAL_VENDOR_INTF(0x12d1, 0x08, 0x06, 0x50,
"HUAWEI MOBILE",
"Mass Storage",
-   USB_SC_DEVICE, USB_PR_DEVICE, usb_stor_huawei_e220_init,
-   0),
-UNUSUAL_DEV(  0x12d1, 0x1003, 0x0000, 0x0000,
-   "HUAWEI MOBILE",
-   "Mass Storage",
-   USB_SC_DEVICE, USB_PR_DEVICE, usb_stor_huawei_e220_init,
-   0),
-UNUSUAL_DEV(  0x12d1, 0x1004, 0x0000, 0x0000,
-   "HUAWEI MOBILE",
-   "Mass Storage",
-
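
The product-ID selection in the hunk above can be exercised outside the kernel. What follows is a minimal userspace sketch, not the driver code: `huawei_pid_supported()`, `huawei_pick_init()` and the `huawei_init_method` enum are hypothetical names, the `interface_number` parameter stands in for `bInterfaceNumber`, and the ranges and the 0x1446 threshold are copied from the patch as posted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the two switch methods used in the patch. */
enum huawei_init_method {
	HUAWEI_INIT_NONE,	/* not a dongle this code handles */
	HUAWEI_INIT_FEATURE,	/* older devices: control-message switch */
	HUAWEI_INIT_SCSI,	/* newer devices: bulk "rewind" command */
};

/* Mirrors the range check in usb_stor_huawei_dongles_pid(). */
static bool huawei_pid_supported(uint8_t interface_number, uint16_t pid)
{
	/* Only interface 0 is considered, as in the patch. */
	if (interface_number != 0)
		return false;

	return pid == 0x1001 || pid == 0x1003 || pid == 0x1004 ||
	       (pid >= 0x1401 && pid < 0x1501) ||
	       (pid > 0x1504 && pid <= 0x1600) ||
	       (pid >= 0x1c02 && pid <= 0x2202);
}

/* Mirrors the dispatch in usb_stor_huawei_init(). */
static enum huawei_init_method huawei_pick_init(uint8_t interface_number,
						 uint16_t pid)
{
	if (!huawei_pid_supported(interface_number, pid))
		return HUAWEI_INIT_NONE;

	return pid >= 0x1446 ? HUAWEI_INIT_SCSI : HUAWEI_INIT_FEATURE;
}
```

The sketch assumes `pid` is already in host byte order. In the kernel, `descriptor.idProduct` is a little-endian 16-bit field, so a comparison like this would need a conversion on big-endian hosts.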

Re: [PATCH 1/3] timekeeping: Add persistent_clock_exist flag

2012-12-17 Thread Feng Tang

On Mon, Dec 17, 2012 at 11:22:02AM -0700, Jason Gunthorpe wrote:
> On Tue, Dec 18, 2012 at 12:14:33AM +0800, Feng Tang wrote:
> 
> > > Sure, but my view on this is that it has nothing to do with
> > > read_persistent_clock. If the RTC driver can run with IRQs off is a
> > > property of the RTC driver and RTC hardware - it has nothing to do
> > > with the platform. ARM platforms will vary on a machine by machine
> > > basis. The rtc-mv driver used on my ARM system is perfectly
> > > re-entrant, lots of rtc on SOC drivers will be the same.
> > > 
> > > If this is the only thing keeping you on read_persistent_clock, for
> > > real RTCs, then how about a RTC_DEV_SAFE_READ flag (or whatever) in
> > > rtc_device.flags?
> > > 
> > > Reserve read_persistent_clock for things like that very specialized
> > > non-RTC ARM counter.
> > 
> > Yes, these non-RTC device is one reason for keeping read_persistent_clock,
> > one other reason I can think of is the CONFIG_RTC_LIB is not always on by
> > default for all Archs, and some platforms may chose to disable it on 
> > purpose. 
> > When CONFIG_RTC_LIB is not set, we need the read_persistent_clock for
> > time init/suspend/resume.
> 
> I thought your concern was the case where the RTC was turned on and
> read_persistent_clock was also turned on. Having a flag in the RTC and
> disabling read_persistent_clock for that situation would help you
> avoid the double code path to the same hardware.

No, it's not about my concern (actually I don't have concerns :), my 3
patches are just some optimization).

My point is that "drivers/rtc" or "drivers/rtc/class.c" can't be guaranteed
to be built for all kernels. I have seen several cases myself in which
read_persistent_clock is still needed.

If you want the drivers/rtc code to take priority, the first thing you have
to do is make that code always "obj-y=xxx.o"

Thanks,
Feng

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Are there u32 atomic bitops? (or dealing w/ i_flags)

2012-12-17 Thread Andy Lutomirski

On Mon, Dec 17, 2012 at 5:57 PM, Al Viro  wrote:
> On Mon, Dec 17, 2012 at 05:10:21PM -0800, Andy Lutomirski wrote:
>> I want to change inode->i_flags access to be atomic -- there are some
>> locking oddities right now, I think, and I want to use a new inode
>> flag to signal mtime updates from page_mkwrite.  The problem is that
>> i_flags is an unsigned int, and making it an unsigned long seems like
>> a waste, but there aren't any u32 atomic bitops.
>
> ... and atomic accesses cost more.  A lot more on some architectures.
> FWIW, atomic_t *is* 32bit on 32bit architectures, which still doesn't
> make it a good idea.

Are atomic_set_mask and atomic_clear_mask as fast as set_bit and
friends on all archs?

In any case, i_flags looks like it's rarely written, so I find it a
bit hard to believe that making it atomic would hurt.  Isn't
atomic_read equivalent to non-atomic reads everywhere?

I want page_mkwrite to set a flag (without taking i_mutex) but *not*
call file_update_time and then to have the writeback paths update the
inode time.  (This, along with stable pages, is the major cause of
long sleeps in my application.)  OTOH, maybe I should just use i_state
and i_lock for this.
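
The operation under discussion, setting or clearing a bit in a 32-bit flag word without taking a lock, can be illustrated in userspace with C11 atomics. This is only a sketch under stated assumptions: `fake_inode`, `FLAG_MTIME_DIRTY` and the helper names are hypothetical, and `<stdatomic.h>` stands in for whatever primitive the kernel would use. It says nothing about the cost on any particular architecture.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; not a real i_flags value. */
#define FLAG_MTIME_DIRTY (UINT32_C(1) << 5)

/* Hypothetical stand-in for an inode carrying a 32-bit flag word. */
struct fake_inode {
	_Atomic uint32_t flags;
};

/* Atomically set bits; returns true if any of them were already set. */
static bool flags_test_and_set(struct fake_inode *inode, uint32_t mask)
{
	uint32_t old = atomic_fetch_or(&inode->flags, mask);

	return (old & mask) != 0;
}

/* Atomically clear bits; returns true if any of them were set before. */
static bool flags_test_and_clear(struct fake_inode *inode, uint32_t mask)
{
	uint32_t old = atomic_fetch_and(&inode->flags, ~mask);

	return (old & mask) != 0;
}

/* Relaxed load of the whole flag word. */
static uint32_t flags_read(struct fake_inode *inode)
{
	return atomic_load_explicit(&inode->flags, memory_order_relaxed);
}
```

In the scheme described above, a fault-time path would call something like `flags_test_and_set()` and a writeback path would call `flags_test_and_clear()`, updating the timestamp only when the flag was actually set.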

--Andy
