[PATCHv2 1/4] Clocksource: Flextimer: Use internal clocksource read API.

2014-09-04 Thread Xiubo Li
The Flextimer device is implemented in BE mode on the LS1 SoC
and in LE mode on the Vybrid and LS2 SoCs, so we need to check
the endianness before doing the MMIO access.

Signed-off-by: Xiubo Li 
---
 drivers/clocksource/fsl_ftm_timer.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/clocksource/fsl_ftm_timer.c b/drivers/clocksource/fsl_ftm_timer.c
index 454227d..9c6e935 100644
--- a/drivers/clocksource/fsl_ftm_timer.c
+++ b/drivers/clocksource/fsl_ftm_timer.c
@@ -226,6 +226,11 @@ static int __init ftm_clockevent_init(unsigned long freq, int irq)
return 0;
 }
 
+static cycle_t ftm_clocksource_read_up(struct clocksource *c)
+{
+   return ftm_readl(priv->clksrc_base + FTM_CNT) & 0xffff;
+}
+
 static int __init ftm_clocksource_init(unsigned long freq)
 {
int err;
@@ -238,7 +243,7 @@ static int __init ftm_clocksource_init(unsigned long freq)
sched_clock_register(ftm_read_sched_clock, 16, freq / (1 << priv->ps));
err = clocksource_mmio_init(priv->clksrc_base + FTM_CNT, "fsl-ftm",
freq / (1 << priv->ps), 300, 16,
-   clocksource_mmio_readl_up);
+   ftm_clocksource_read_up);
if (err) {
pr_err("ftm: init clock source mmio failed: %d\n", err);
return err;
-- 
2.1.0.27.g96db324

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
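
For readers following along, the endianness check described in the commit message can be modelled in user space roughly as below. This is only a sketch: `struct ftm_sketch`, its `big_endian` flag and both helper names are hypothetical, and the driver's real `ftm_readl()` is not shown in this message. The 16-bit mask follows from the counter being registered as 16 bits wide in `clocksource_mmio_init()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register block; not the driver's struct. */
struct ftm_sketch {
	const uint8_t *regs;	/* raw register bytes as the device lays them out */
	bool big_endian;	/* assumed flag: true on BE parts, false on LE parts */
};

/* Assemble a 32-bit value from four bytes in the device's byte order. */
static uint32_t ftm_read32_sketch(const struct ftm_sketch *ftm, unsigned int off)
{
	const uint8_t *p = ftm->regs + off;

	if (ftm->big_endian)
		return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
		       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}

/* Only the low 16 bits are counter bits, so mask off the rest. */
static uint32_t ftm_counter_sketch(const struct ftm_sketch *ftm, unsigned int off)
{
	return ftm_read32_sketch(ftm, off) & 0xffffu;
}
```

The point of the custom read callback is the same as here: the generic little-endian MMIO reader would return byte-swapped values on a big-endian implementation.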


[PATCHv2 0/4] Clocksource: Flextimer: add LS1 support

2014-09-04 Thread Xiubo Li
Change in v2:
- Follow Thomas Gleixner's advice.


Xiubo Li (4):
  Clocksource: Flextimer: Use internal clocksource read API.
  Clocksource: Flextimer: Remove the useless code.
  Clocksource: Flextimer: Set cpumask to cpu_possible_mask
  Clocksource: Flextimer: Fix counter clock prescaler calculation.

 drivers/clocksource/fsl_ftm_timer.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

-- 
2.1.0.27.g96db324



[PATCH] Rockchip: RK3288: CRU: swap value of bit for CORE clock pll source selection

2014-09-04 Thread jianqun
From: xujianqun 

On RK3288, the core clock PLL source is APLL when the bit is 1 and GPLL
when the bit is 0:

CRU_CLKSEL0_CON [15]
- core_clk_pll_sel
- CORE clock pll source selection
-- 1'b1: select ARM PLL
-- 1'b0: select GENERAL PLL

BUG=none
TEST= "cat /sys/kernel/debug/clk/clk_summary |grep apll" check parent of core clock

Change-Id: I44a528af256da1fad573b4ccf9d0a20ad4cf6d68
Signed-off-by: xujianqun 
---
 drivers/clk/rockchip/clk-cpu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/rockchip/clk-cpu.c b/drivers/clk/rockchip/clk-cpu.c
index c5b14e9..1725ac7 100644
--- a/drivers/clk/rockchip/clk-cpu.c
+++ b/drivers/clk/rockchip/clk-cpu.c
@@ -136,7 +136,7 @@ static int rockchip_cpuclk_pre_rate_change(struct rockchip_cpuclk *cpuclk,
}
 
/* select alternate parent */
-   writel(HIWORD_UPDATE(1, 1, reg_data->mux_core_shift),
+   writel(HIWORD_UPDATE(0, 1, reg_data->mux_core_shift),
   cpuclk->reg_base + reg_data->core_reg);
 
/* alternate parent is active now. set the dividers */
@@ -163,7 +163,7 @@ static int rockchip_cpuclk_post_rate_change(struct rockchip_cpuclk *cpuclk,
spin_lock(cpuclk->lock);
 
/* post-rate change event, re-mux back to primary parent */
-   writel(HIWORD_UPDATE(0, 1, reg_data->mux_core_shift),
+   writel(HIWORD_UPDATE(1, 1, reg_data->mux_core_shift),
   cpuclk->reg_base + RK2928_CLKSEL_CON(0));
 
/* remove any core dividers */
-- 
1.9.1
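
The swapped values make more sense with the write-mask convention in mind: the upper 16 bits of a write select which of the lower 16 bits get updated. Below is a minimal user-space sketch of that behaviour. The macro body is an assumption about the shape of `HIWORD_UPDATE` (its definition is not part of this patch), `cru_write_sketch` is a hypothetical model of the register, and bit 15 is the `core_clk_pll_sel` position quoted in the commit message.

```c
#include <stdint.h>

/* Assumed shape of the macro: value in the low half, write mask in the high half. */
#define HIWORD_UPDATE(val, mask, shift) \
	((uint32_t)(val) << (shift) | (uint32_t)(mask) << ((shift) + 16))

/* Hypothetical model of a write-masked 16-bit register: only bits whose
 * mask bit is set in the upper half of 'wr' are changed. */
static uint32_t cru_write_sketch(uint32_t reg, uint32_t wr)
{
	uint32_t wmask = wr >> 16;

	return (reg & ~wmask & 0xffffu) | (wr & wmask);
}
```

With this model, `HIWORD_UPDATE(0, 1, 15)` clears bit 15 (GPLL, the alternate parent during a rate change) and `HIWORD_UPDATE(1, 1, 15)` sets it again (APLL), leaving every other bit untouched.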




Re: [PATCH -v4] x86: only load initrd above 4g on second try

2014-09-04 Thread Anders Darander
* Yinghai Lu  [140905 03:19]:


> On Thu, Sep 4, 2014 at 2:29 PM, Matt Fleming  wrote:
> > On Thu, 04 Sep, at 01:59:05PM, H. Peter Anvin wrote:

> >> I am fine with this patch, but at the same time I do want to note that
> >> there is an alternative to double-buffer the patch and/or (if that
> >> applies to the buggy BIOS) round up the size of the target buffer.

> > I'm not sure that rounding up the size of the target buffer will
> > workaround this issue correctly.

> > As far as I know, the only thing that Mantas tried was rounding up the
> > size of the source file, by padding it.

> Can you try attached patch on top of linus tree?

I took the liberty to test the patch on my Dell XPS13 9333, and
unfortunately I got the old hang back. 

This was tested on the current Linus' tree.

Cheers,
Anders Darander

> - chunksize = size;
> + chunksize = round_up(size, 
> EFI_PAGE_SIZE);
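
For reference, the rounding in the quoted hunk can be modelled in user space as below. This is a sketch only: `round_up_pow2` and `EFI_PAGE_SIZE_SKETCH` are hypothetical names, and the 4 KiB page size is stated here as an assumption rather than taken from the thread.

```c
#include <assert.h>
#include <stdint.h>

#define EFI_PAGE_SIZE_SKETCH 4096u	/* assumed: EFI pages are 4 KiB */

/* Round x up to the next multiple of align; align must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}
```

A read of 4097 bytes would therefore ask the firmware for 8192 bytes, so the destination buffer has to be at least that large.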



-- 
It usually takes more than three weeks to prepare a good impromptu speech.
-- Mark Twain


Re: [PATCH] blk-merge: fix blk_recount_segments

2014-09-04 Thread Rusty Russell
Ming Lei  writes:
> On Tue, 02 Sep 2014 10:24:24 -0600
> Jens Axboe  wrote:
>
>> On 09/02/2014 10:21 AM, Christoph Hellwig wrote:
>> > Btw, one thing we should reconsider is where we set
>> > QUEUE_FLAG_NO_SG_MERGE.  At least for virtio-blk it seems to me that
>> > doing the S/G merge should be a lot cheaper than fanning out into the
>> > indirect descriptors.
>
> Indirect is always considered first no matter SG merge is off or on,
> at least from current virtio-blk implementation.
>
> But it is a good idea to try direct descriptor first, the below simple
> change can improve randread(libaio, O_DIRECT, multi-queue) 7% in my test,
> and 77% transfer starts to use direct descriptor, and almost all transfer
> uses indirect descriptor only in current upstream implementation.

Hi Ming!

In general, we want to use direct descriptors if we have plenty
of descriptors, and indirect if the ring is going to fill up.  I was
thinking about this just yesterday, in fact.

I've been trying to use EWMA to figure out how full the ring gets, but
so far it's not working well.  I'm still hacking on a solution though,
and your thoughts would be welcome.

Cheers,
Rusty.
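
To illustrate the EWMA idea mentioned above, here is a minimal fixed-point sketch. Everything in it is hypothetical: the names, the 1/8 weight and the "more than half full" threshold are choices made for the example, not values from virtio or from this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define EWMA_SHIFT 3	/* hypothetical weight: each new sample counts for 1/8 */
#define FP_SHIFT   8	/* fixed-point fraction bits */

struct ring_ewma {
	uint32_t avg_fp;	/* average number of in-use descriptors, fixed point */
};

/* Fold one observation of ring occupancy into the running average. */
static void ring_ewma_add(struct ring_ewma *e, uint32_t in_use)
{
	uint32_t sample = in_use << FP_SHIFT;

	/* avg += (sample - avg) / 8, arranged so nothing underflows */
	e->avg_fp = e->avg_fp - (e->avg_fp >> EWMA_SHIFT) + (sample >> EWMA_SHIFT);
}

/* Hypothetical policy: go indirect once the ring averages more than half full. */
static bool ring_prefers_indirect(const struct ring_ewma *e, uint32_t ring_size)
{
	return (e->avg_fp >> FP_SHIFT) * 2 > ring_size;
}
```

A single burst barely moves the average, so the policy stays on direct descriptors until the ring is busy over a sustained period.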


Re: [PATCH 1/1] Drivers: scsi: storvsc: Get rid of warning messages

2014-09-04 Thread Christoph Hellwig
Looks good to me.

Olaf, Hannes - can I get another review for this (and the older hyperv
scanning patch set)?



Re: [PATCH v3 6/6] mm/hugetlb: remove unused argument of follow_huge_addr()

2014-09-04 Thread Naoya Horiguchi
On Wed, Sep 03, 2014 at 02:26:37PM -0700, Hugh Dickins wrote:
> On Thu, 28 Aug 2014, Naoya Horiguchi wrote:
> 
> > follow_huge_addr()'s parameter write is not used, so let's remove it.
> > 
> > Signed-off-by: Naoya Horiguchi 
> 
> I think this patch is a waste of time: that it should be replaced
> by a patch which replaces the "write" argument by a "flags" argument,

OK, I just drop this patch.

> so that follow_huge_addr() can do get_page() for FOLL_GET while holding
> appropriate lock, instead of the BUG_ON(flags & FOLL_GET) we currently
> have.
> 
> Once that is implemented, you could try getting hugetlb migration
> tested on ia64 and powerpc; but yes, keep hugetlb migration disabled
> on all but x86 until it has been tested elsewhere.
> 
> > ---
> >  arch/ia64/mm/hugetlbpage.c| 2 +-
> >  arch/powerpc/mm/hugetlbpage.c | 2 +-
> >  arch/x86/mm/hugetlbpage.c | 2 +-
> >  include/linux/hugetlb.h   | 5 ++---
> >  mm/gup.c  | 2 +-
> >  mm/hugetlb.c  | 3 +--
> >  6 files changed, 7 insertions(+), 9 deletions(-)
> > 
> > diff --git mmotm-2014-08-25-16-52.orig/arch/ia64/mm/hugetlbpage.c 
> > mmotm-2014-08-25-16-52/arch/ia64/mm/hugetlbpage.c
> > index 6170381bf074..524a4e001bda 100644
> > --- mmotm-2014-08-25-16-52.orig/arch/ia64/mm/hugetlbpage.c
> > +++ mmotm-2014-08-25-16-52/arch/ia64/mm/hugetlbpage.c
> > @@ -89,7 +89,7 @@ int prepare_hugepage_range(struct file *file,
> > return 0;
> >  }
> >  
> > -struct page *follow_huge_addr(struct mm_struct *mm, unsigned long addr, 
> > int write)
> > +struct page *follow_huge_addr(struct mm_struct *mm, unsigned long addr)
> >  {
> > struct page *page = NULL;
> > pte_t *ptep;
> > diff --git mmotm-2014-08-25-16-52.orig/arch/powerpc/mm/hugetlbpage.c 
> > mmotm-2014-08-25-16-52/arch/powerpc/mm/hugetlbpage.c
> > index 1d8854a56309..5b6fe8b0cde3 100644
> > --- mmotm-2014-08-25-16-52.orig/arch/powerpc/mm/hugetlbpage.c
> > +++ mmotm-2014-08-25-16-52/arch/powerpc/mm/hugetlbpage.c
> > @@ -674,7 +674,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb,
> >  }
> >  
> >  struct page *
> > -follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
> > +follow_huge_addr(struct mm_struct *mm, unsigned long address)
> >  {
> > pte_t *ptep;
> > struct page *page = ERR_PTR(-EINVAL);
> > diff --git mmotm-2014-08-25-16-52.orig/arch/x86/mm/hugetlbpage.c 
> > mmotm-2014-08-25-16-52/arch/x86/mm/hugetlbpage.c
> > index 03b8a7c11817..cab09d87ae65 100644
> > --- mmotm-2014-08-25-16-52.orig/arch/x86/mm/hugetlbpage.c
> > +++ mmotm-2014-08-25-16-52/arch/x86/mm/hugetlbpage.c
> > @@ -18,7 +18,7 @@
> >  
> >  #if 0  /* This is just for testing */
> >  struct page *
> > -follow_huge_addr(struct mm_struct *mm, unsigned long address, int write)
> > +follow_huge_addr(struct mm_struct *mm, unsigned long address)
> >  {
> > unsigned long start = address;
> > int length = 1;
> > diff --git mmotm-2014-08-25-16-52.orig/include/linux/hugetlb.h 
> > mmotm-2014-08-25-16-52/include/linux/hugetlb.h
> > index b3200fce07aa..cdff1bd393bb 100644
> > --- mmotm-2014-08-25-16-52.orig/include/linux/hugetlb.h
> > +++ mmotm-2014-08-25-16-52/include/linux/hugetlb.h
> > @@ -96,8 +96,7 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
> > unsigned long addr, unsigned long sz);
> >  pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr);
> >  int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t 
> > *ptep);
> > -struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
> > - int write);
> > +struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address);
> >  struct page *follow_huge_pmd(struct vm_area_struct *vma, unsigned long 
> > address,
> > pmd_t *pmd, int flags);
> >  struct page *follow_huge_pud(struct vm_area_struct *vma, unsigned long 
> > address,
> > @@ -124,7 +123,7 @@ static inline unsigned long hugetlb_total_pages(void)
> >  }
> >  
> >  #define follow_hugetlb_page(m,v,p,vs,a,b,i,w)  ({ BUG(); 0; })
> > -#define follow_huge_addr(mm, addr, write)  ERR_PTR(-EINVAL)
> > +#define follow_huge_addr(mm, addr) ERR_PTR(-EINVAL)
> >  #define copy_hugetlb_page_range(src, dst, vma) ({ BUG(); 0; })
> >  static inline void hugetlb_report_meminfo(struct seq_file *m)
> >  {
> > diff --git mmotm-2014-08-25-16-52.orig/mm/gup.c 
> > mmotm-2014-08-25-16-52/mm/gup.c
> > index 597a5e92e265..8f0550f1770d 100644
> > --- mmotm-2014-08-25-16-52.orig/mm/gup.c
> > +++ mmotm-2014-08-25-16-52/mm/gup.c
> > @@ -149,7 +149,7 @@ struct page *follow_page_mask(struct vm_area_struct 
> > *vma,
> >  
> > *page_mask = 0;
> >  
> > -   page = follow_huge_addr(mm, address, flags & FOLL_WRITE);
> > +   page = follow_huge_addr(mm, address);
> > if (!IS_ERR(page)) {
> > BUG_ON(flags & FOLL_GET);
> > return page;
> > diff --git mmotm-2014-08-25-16-52.orig/mm/hugetlb.c 
> > 

Re: [PATCH v3 4/6] mm/hugetlb: add migration entry check in hugetlb_change_protection

2014-09-04 Thread Naoya Horiguchi
On Wed, Sep 03, 2014 at 06:06:34PM -0700, Hugh Dickins wrote:
> On Thu, 28 Aug 2014, Naoya Horiguchi wrote:
> 
> > There is a race condition between hugepage migration and 
> > change_protection(),
> > where hugetlb_change_protection() doesn't care about migration entries and
> > wrongly overwrites them. That causes unexpected results like kernel crash.
> > 
> > This patch adds is_hugetlb_entry_(migration|hwpoisoned) check in this
> > function to do proper actions.
> > 
> > ChangeLog v3:
> > - handle migration entry correctly (instead of just skipping)
> > 
> > Signed-off-by: Naoya Horiguchi 
> > Cc:  # [2.6.36+]
> 
> 2.6.36+?  For the hwpoisoned part of it, I suppose.
> Then you'd better mentioned the hwpoisoned case in the comment above.

OK, I'll update the description and the subject.

> > ---
> >  mm/hugetlb.c | 21 -
> >  1 file changed, 20 insertions(+), 1 deletion(-)
> > 
> > diff --git mmotm-2014-08-25-16-52.orig/mm/hugetlb.c 
> > mmotm-2014-08-25-16-52/mm/hugetlb.c
> > index 2aafe073cb06..1ed9df6def54 100644
> > --- mmotm-2014-08-25-16-52.orig/mm/hugetlb.c
> > +++ mmotm-2014-08-25-16-52/mm/hugetlb.c
> > @@ -3362,7 +3362,26 @@ unsigned long hugetlb_change_protection(struct 
> > vm_area_struct *vma,
> > spin_unlock(ptl);
> > continue;
> > }
> > -   if (!huge_pte_none(huge_ptep_get(ptep))) {
> > +   pte = huge_ptep_get(ptep);
> > +   if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) {
> > +   spin_unlock(ptl);
> > +   continue;
> > +   }
> > +   if (unlikely(is_hugetlb_entry_migration(pte))) {
> > +   swp_entry_t entry = pte_to_swp_entry(pte);
> > +
> > +   if (is_write_migration_entry(entry)) {
> > +   pte_t newpte;
> > +
> > +   make_migration_entry_read(&entry);
> > +   newpte = swp_entry_to_pte(entry);
> > +   set_pte_at(mm, address, ptep, newpte);
> 
> set_huge_pte_at.

Fixed, thanks.

> 
> (As usual, I can't bear to see these is_hugetlb_entry_hwpoisoned and
> is_hugetlb_entry_migration examples go past without bleating about
> wanting to streamline them a little; but agreed last time to leave
> that to some later cleanup once all the stable backports are stable.)

Yes, these two check routines need cleanup.
I'll do it in separate work later.

> > +   pages++;
> > +   }
> > +   spin_unlock(ptl);
> > +   continue;
> > +   }
> > +   if (!huge_pte_none(pte)) {
> > pte = huge_ptep_get_and_clear(mm, address, ptep);
> > pte = pte_mkhuge(huge_pte_modify(pte, newprot));
> > pte = arch_make_huge_pte(pte, vma, NULL, 0);
> > -- 
> > 1.9.3
> 


Re: [PATCH v3 3/6] mm/hugetlb: fix getting refcount 0 page in hugetlb_fault()

2014-09-04 Thread Naoya Horiguchi
On Wed, Sep 03, 2014 at 05:20:59PM -0700, Hugh Dickins wrote:
> On Thu, 28 Aug 2014, Naoya Horiguchi wrote:
> 
> > When running the test which causes the race as shown in the previous patch,
> > we can hit the BUG "get_page() on refcount 0 page" in hugetlb_fault().
> > 
> > This race happens when pte turns into migration entry just after the first
> > check of is_hugetlb_entry_migration() in hugetlb_fault() passed with false.
> > To fix this, we need to check pte_present() again with holding ptl.
> > 
> > This patch also reorders taking ptl and doing pte_page(), because pte_page()
> > should be done in ptl. Due to this reordering, we need use trylock_page()
> > in page != pagecache_page case to respect locking order.
> > 
> > ChangeLog v3:
> > - doing pte_page() and taking refcount under page table lock
> > - check pte_present after taking ptl, which makes it unnecessary to use
> >   get_page_unless_zero()
> > - use trylock_page in page != pagecache_page case
> > - fixed target stable version
> 
> ChangeLog vN below the --- (or am I contradicting some other advice?)

No, that is practical advice.

> > 
> > Fixes: 66aebce747ea ("hugetlb: fix race condition in hugetlb_fault()")
> > Signed-off-by: Naoya Horiguchi 
> > Cc:   # [3.2+]
> 
> One bug, one warning, a couple of suboptimals...
> 
> > ---
> >  mm/hugetlb.c | 32 ++--
> >  1 file changed, 18 insertions(+), 14 deletions(-)
> > 
> > diff --git mmotm-2014-08-25-16-52.orig/mm/hugetlb.c 
> > mmotm-2014-08-25-16-52/mm/hugetlb.c
> > index c5345c5edb50..2aafe073cb06 100644
> > --- mmotm-2014-08-25-16-52.orig/mm/hugetlb.c
> > +++ mmotm-2014-08-25-16-52/mm/hugetlb.c
> > @@ -3184,6 +3184,15 @@ int hugetlb_fault(struct mm_struct *mm, struct 
> > vm_area_struct *vma,
> > vma, address);
> > }
> >  
> > +   ptl = huge_pte_lock(h, mm, ptep);
> > +
> > +   /* Check for a racing update before calling hugetlb_cow */
> > +   if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
> > +   goto out_ptl;
> > +
> > +   if (!pte_present(entry))
> > +   goto out_ptl;
> 
> A comment on that test would be helpful.  Is a migration entry
> the only !pte_present() case you would expect to find there?

No, we can have the same race with hwpoisoned entry, although it's
very rare.

> It would be better to test "entry" for this (or for being a migration
> entry) higher up, just after getting "entry": less to unwind on error.

Right, thanks.

> And better to call migration_entry_wait_huge(), after dropping locks,
> before returning 0, so that we don't keep the cpu busy faulting while
> the migration entry remains there.  Maybe not important, but better.

OK.

> Probably best done with a goto unwinding code at end of function.
> 
> (Whereas we don't worry about "wait"s in the !pte_same case,
> because !pte_same indicates that change is already occurring:
> it's prolonged pte_same cases that we want to get away from.)
> 
> > +
> > /*
> >  * hugetlb_cow() requires page locks of pte_page(entry) and
> >  * pagecache_page, so here we need take the former one
> > @@ -3192,22 +3201,17 @@ int hugetlb_fault(struct mm_struct *mm, struct 
> > vm_area_struct *vma,
> >  * so no worry about deadlock.
> >  */
> > page = pte_page(entry);
> > -   get_page(page);
> > if (page != pagecache_page)
> > -   lock_page(page);
> > -
> > -   ptl = huge_pte_lockptr(h, mm, ptep);
> > -   spin_lock(ptl);
> > -   /* Check for a racing update before calling hugetlb_cow */
> > -   if (unlikely(!pte_same(entry, huge_ptep_get(ptep))))
> > -   goto out_ptl;
> > +   if (!trylock_page(page))
> > +   goto out_ptl;
> 
> And, again to avoid keeping the cpu busy refaulting, it would be better
> to wait_on_page_locked(), after dropping locks, before returning 0;
> probably best done with another goto end of function.

OK.

> >  
> > +   get_page(page);
> >  
> > if (flags & FAULT_FLAG_WRITE) {
> > if (!huge_pte_write(entry)) {
> > ret = hugetlb_cow(mm, vma, address, ptep, entry,
> > pagecache_page, ptl);
> > -   goto out_ptl;
> > +   goto out_put_page;
> > }
> > entry = huge_pte_mkdirty(entry);
> > }
> > @@ -3215,7 +3219,11 @@ int hugetlb_fault(struct mm_struct *mm, struct 
> > vm_area_struct *vma,
> > if (huge_ptep_set_access_flags(vma, address, ptep, entry,
> > flags & FAULT_FLAG_WRITE))
> > update_mmu_cache(vma, address, ptep);
> > -
> > +out_put_page:
> > +   put_page(page);
> 
> If I'm reading this correctly, there's now a small but nasty chance that
> this put_page will be the one which frees the page, and the unlock_page
> below will then be unlocking a freed page.  Our "Bad page" checks should
> detect that case, so it won't be as serious as unlocking someone 

deb-pkg: Add support for powerpc little endian

2014-09-04 Thread Michael Neuling
The Debian powerpc little endian architecture is called ppc64el.  This
is the default architecture used by Ubuntu for powerpc.

The below checks the kernel config to see if we are compiling little
endian and sets the Debian arch appropriately.

Signed-off-by: Michael Neuling 

diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index 35d5a58..6f4a1af 100644
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -37,7 +37,7 @@ create_package() {
s390*)
debarch=s390$(grep -q CONFIG_64BIT=y $KCONFIG_CONFIG && echo x || true) ;;
ppc*)
-   debarch=powerpc ;;
+   debarch=$(grep -q CPU_LITTLE_ENDIAN=y $KCONFIG_CONFIG && echo ppc64el || echo powerpc) ;;
parisc*)
debarch=hppa ;;
mips*)



Re: [PATCH v3 5/6] mm/hugetlb: add migration entry check in __unmap_hugepage_range

2014-09-04 Thread Naoya Horiguchi
On Wed, Sep 03, 2014 at 06:47:38PM -0700, Hugh Dickins wrote:
> On Thu, 28 Aug 2014, Naoya Horiguchi wrote:
> 
> > If __unmap_hugepage_range() tries to unmap the address range over which
> > hugepage migration is on the way, we get the wrong page because pte_page()
> > doesn't work for migration entries. This patch calls pte_to_swp_entry() and
> > migration_entry_to_page() to get the right page for migration entries.
> > 
> > Signed-off-by: Naoya Horiguchi 
> > Cc:   # [2.6.36+]
> 
> 2.6.36+?  But this one doesn't affect hwpoisoned.
> I admit I've lost track of how far back hugetlb migration goes:
> oh, to 2.6.37+, that fits with what you marked on some commits earlier.
> But then 2/6 says 3.12+.  Help!  Please remind me of the sequence of events.

The bug of this patch exists after any kind of hugetlb migration appears,
so I tagged as [2.6.36+] (Fixes: 290408d4a2 "hugetlb: hugepage migration core".)
As for patch 2/6, the related bug was introduced due to follow_huge_pmd()
with FOLL_GET, which can happen after commit e632a938d914 "mm: migrate:
add hugepage migration code to move_pages()", so I tagged as [3.12+].

> 
> > ---
> >  mm/hugetlb.c | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> > 
> > diff --git mmotm-2014-08-25-16-52.orig/mm/hugetlb.c 
> > mmotm-2014-08-25-16-52/mm/hugetlb.c
> > index 1ed9df6def54..0a455ee0 100644
> > --- mmotm-2014-08-25-16-52.orig/mm/hugetlb.c
> > +++ mmotm-2014-08-25-16-52/mm/hugetlb.c
> > @@ -2652,6 +2652,13 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, 
> > struct vm_area_struct *vma,
> > if (huge_pte_none(pte))
> > goto unlock;
> >  
> > +   if (unlikely(is_hugetlb_entry_migration(pte))) {
> > +   swp_entry_t entry = pte_to_swp_entry(pte);
> > +
> > +   page = migration_entry_to_page(entry);
> > +   goto clear;
> > +   }
> > +
> 
> This surprises me: are you sure?  Obviously you know hugetlb migration
> much better than I do: is it done in a significantly different way from
> order:0 page migration?  In the order:0 case, there is no reference to
> the page corresponding to the migration entry placed in a page table,
> just the remaining reference held by the task doing the migration.  But
> here you are jumping to the code which unmaps and frees a present page.

Sorry, I misread the code again, you're right.

> I can see that a fix is necessary, but I would have expected it to
> consist of merely changing the "HWPoisoned" comment below to include
> migration entries, and changing its test from
>   if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) {
> to
>   if (unlikely(!pte_present(pte))) {

Yes, this looks the best way.

> 
> > /*
> >  * HWPoisoned hugepage is already unmapped and dropped reference
> >  */
> > @@ -2677,7 +2684,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, 
> > struct vm_area_struct *vma,
> >  */
> > set_vma_resv_flags(vma, HPAGE_RESV_UNMAPPED);
> > }
> > -
> > +clear:
> > pte = huge_ptep_get_and_clear(mm, address, ptep);
> > tlb_remove_tlb_entry(tlb, ptep, address);
> > if (huge_pte_dirty(pte))
> > -- 
> > 1.9.3
> 


Re: [PATCH v3 2/6] mm/hugetlb: take page table lock in follow_huge_(addr|pmd|pud)()

2014-09-04 Thread Naoya Horiguchi
Hi Hugh,

Thank you very much for your close review and valuable comments.
And I can't help feeling shame on many mistakes/misunderstandings
and lack of thoughts throughout the patchset.
I promise that all these will be fixed in the next version.

On Wed, Sep 03, 2014 at 02:17:41PM -0700, Hugh Dickins wrote:
> On Thu, 28 Aug 2014, Naoya Horiguchi wrote:
> 
> > We have a race condition between move_pages() and freeing hugepages,
> > where move_pages() calls follow_page(FOLL_GET) for hugepages internally
> > and tries to get its refcount without preventing concurrent freeing.
> > This race crashes the kernel, so this patch fixes it by moving FOLL_GET
> > code for hugepages into follow_huge_pmd() with taking the page table lock.
> 
> You really ought to mention how you are intentionally dropping the
> unnecessary check for NULL pte_page() in this patch: we agree on that,
> but it does need to be mentioned somewhere in the comment.

OK, I'll add it.

> > 
> > This patch also adds the similar locking to follow_huge_(addr|pud)
> > for consistency.
> > 
> > Here is the reproducer:
> > 
> >   $ cat movepages.c
> >   #include 
> >   #include 
> >   #include 
> > 
> >   #define ADDR_INPUT  0x7000UL
> >   #define HPS 0x20
> >   #define PS  0x1000
> > 
> >   int main(int argc, char *argv[]) {
> >   int i;
> >   int nr_hp = strtol(argv[1], NULL, 0);
> >   int nr_p  = nr_hp * HPS / PS;
> >   int ret;
> >   void **addrs;
> >   int *status;
> >   int *nodes;
> >   pid_t pid;
> > 
> >   pid = strtol(argv[2], NULL, 0);
> >   addrs  = malloc(sizeof(char *) * nr_p + 1);
> >   status = malloc(sizeof(char *) * nr_p + 1);
> >   nodes  = malloc(sizeof(char *) * nr_p + 1);
> > 
> >   while (1) {
> >   for (i = 0; i < nr_p; i++) {
> >   addrs[i] = (void *)ADDR_INPUT + i * PS;
> >   nodes[i] = 1;
> >   status[i] = 0;
> >   }
> >   ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
> > MPOL_MF_MOVE_ALL);
> >   if (ret == -1)
> >   err("move_pages");
> > 
> >   for (i = 0; i < nr_p; i++) {
> >   addrs[i] = (void *)ADDR_INPUT + i * PS;
> >   nodes[i] = 0;
> >   status[i] = 0;
> >   }
> >   ret = numa_move_pages(pid, nr_p, addrs, nodes, status,
> > MPOL_MF_MOVE_ALL);
> >   if (ret == -1)
> >   err("move_pages");
> >   }
> >   return 0;
> >   }
> > 
> >   $ cat hugepage.c
> >   #include 
> >   #include 
> >   #include 
> > 
> >   #define ADDR_INPUT  0x7000UL
> >   #define HPS 0x20
> > 
> >   int main(int argc, char *argv[]) {
> >   int nr_hp = strtol(argv[1], NULL, 0);
> >   char *p;
> > 
> >   while (1) {
> >   p = mmap((void *)ADDR_INPUT, nr_hp * HPS, PROT_READ | 
> > PROT_WRITE,
> >MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 
> > 0);
> >   if (p != (void *)ADDR_INPUT) {
> >   perror("mmap");
> >   break;
> >   }
> >   memset(p, 0, nr_hp * HPS);
> >   munmap(p, nr_hp * HPS);
> >   }
> >   }
> > 
> >   $ sysctl vm.nr_hugepages=40
> >   $ ./hugepage 10 &
> >   $ ./movepages 10 $(pgrep -f hugepage)
> > 
> > Note for stable inclusion:
> >   This patch fixes e632a938d914 ("mm: migrate: add hugepage migration code
> >   to move_pages()"), so is applicable to -stable kernels which includes it.
> 
> Just say
> Fixes: e632a938d914 ("mm: migrate: add hugepage migration code to 
> move_pages()")

I just found that Documentation/SubmittingPatches started to state about
Fixes: tag. I'll use it from now.

> > 
> > ChangeLog v3:
> > - remove unnecessary if (page) check
> > - check (pmd|pud)_huge again after holding ptl
> > - do the same change also on follow_huge_pud()
> > - take page table lock also in follow_huge_addr()
> > 
> > ChangeLog v2:
> > - introduce follow_huge_pmd_lock() to do locking in arch-independent code.
> 
> ChangeLog vN info belongs below the ---

OK.
I didn't know this but it's written in SubmittingPatches, so I'll keep it
in mind.

> > 
> > Reported-by: Hugh Dickins 
> > Signed-off-by: Naoya Horiguchi 
> > Cc:   # [3.12+]
> 
> No ack to this one yet, I'm afraid.

OK, I defer Reported-by until all the problems in this patch are solved.
I added this Reported-by because Andrew asked how I found this problem,
and advised me to show the reporter.
And I didn't intend by this Reported-by that you acked the patch.
In this case, should I have used some unofficial tag 

Re: [PATCH] scsi_debug: deadlock between completions and surprise module removal

2014-09-04 Thread Christoph Hellwig
Can I get another review for this one?

On Sun, Aug 31, 2014 at 07:09:59PM -0400, Douglas Gilbert wrote:
> A deadlock has been reported when the completion
> of SCSI commands (simulated by a timer) was surprised
> by a module removal. This patch removes one half of
> the offending locks around timer deletions. This fix
> is applied both to stop_all_queued() which is where
> the deadlock was discovered and stop_queued_cmnd()
> which has very similar logic.
> 
> This patch should be applied both to the lk 3.17 tree
> and Christoph's drivers-for-3.18 tree.
> 
> Tested-and-reported-by: Milan Broz 
> Signed-off-by: Douglas Gilbert 

> --- a/drivers/scsi/scsi_debug.c   2014-08-26 13:24:51.646948507 -0400
> +++ b/drivers/scsi/scsi_debug.c   2014-08-30 18:04:54.589226679 -0400
> @@ -2743,6 +2743,13 @@ static int stop_queued_cmnd(struct scsi_
>   if (test_bit(k, queued_in_use_bm)) {
>   sqcp = &queued_arr[k];
>   if (cmnd == sqcp->a_cmnd) {
> + devip = (struct sdebug_dev_info *)
> + cmnd->device->hostdata;
> + if (devip)
> + atomic_dec(&devip->num_in_q);
> + sqcp->a_cmnd = NULL;
> + spin_unlock_irqrestore(&queued_arr_lock,
> +iflags);
>   if (scsi_debug_ndelay > 0) {
>   if (sqcp->sd_hrtp)
>   hrtimer_cancel(
> @@ -2755,18 +2762,13 @@ static int stop_queued_cmnd(struct scsi_
>   if (sqcp->tletp)
>   tasklet_kill(sqcp->tletp);
>   }
> - __clear_bit(k, queued_in_use_bm);
> - devip = (struct sdebug_dev_info *)
> - cmnd->device->hostdata;
> - if (devip)
> - atomic_dec(&devip->num_in_q);
> - sqcp->a_cmnd = NULL;
> - break;
> + clear_bit(k, queued_in_use_bm);
> + return 1;
>   }
>   }
>   }
>   spin_unlock_irqrestore(&queued_arr_lock, iflags);
> - return (k < qmax) ? 1 : 0;
> + return 0;
>  }
>  
>  /* Deletes (stops) timers or tasklets of all queued commands */
> @@ -2782,6 +2784,13 @@ static void stop_all_queued(void)
>   if (test_bit(k, queued_in_use_bm)) {
>   sqcp = &queued_arr[k];
>   if (sqcp->a_cmnd) {
> + devip = (struct sdebug_dev_info *)
> + sqcp->a_cmnd->device->hostdata;
> + if (devip)
> + atomic_dec(&devip->num_in_q);
> + sqcp->a_cmnd = NULL;
> + spin_unlock_irqrestore(&queued_arr_lock,
> +iflags);
>   if (scsi_debug_ndelay > 0) {
>   if (sqcp->sd_hrtp)
>   hrtimer_cancel(
> @@ -2794,12 +2803,8 @@ static void stop_all_queued(void)
>   if (sqcp->tletp)
>   tasklet_kill(sqcp->tletp);
>   }
> - __clear_bit(k, queued_in_use_bm);
> - devip = (struct sdebug_dev_info *)
> - sqcp->a_cmnd->device->hostdata;
> - if (devip)
> - atomic_dec(&devip->num_in_q);
> - sqcp->a_cmnd = NULL;
> + clear_bit(k, queued_in_use_bm);
> + spin_lock_irqsave(&queued_arr_lock, iflags);
>   }
>   }
>   }

---end quoted text---


[PATCH v10 net-next 1/2] net: filter: add "load 64-bit immediate" eBPF instruction

2014-09-04 Thread Alexei Starovoitov
Add the BPF_LD_IMM64 instruction to load a 64-bit immediate value into a register.
All previous instructions were 8 bytes long. This is the first 16-byte instruction.
Two consecutive 'struct bpf_insn' blocks are interpreted as a single instruction:
insn[0].code = BPF_LD | BPF_DW | BPF_IMM
insn[0].dst_reg = destination register
insn[0].imm = lower 32-bit
insn[1].code = 0
insn[1].imm = upper 32-bit
All unused fields must be zero.
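
The two-slot encoding above can be sketched in plain C. This is a stand-alone
illustration under stated assumptions, not kernel code: 'struct insn' mirrors
the 'struct bpf_insn' layout from this series using <stdint.h> types, and the
helper names ld_imm64_encode() and ld_imm64_decode() are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Opcode fields; the values match the BPF headers touched by this series. */
#define BPF_LD   0x00
#define BPF_IMM  0x00
#define BPF_DW   0x18

/* User-space mirror of 'struct bpf_insn' (one 8-byte slot). */
struct insn {
	uint8_t code;        /* opcode */
	uint8_t dst_reg:4;   /* dest register */
	uint8_t src_reg:4;   /* source register */
	int16_t off;         /* signed offset */
	int32_t imm;         /* signed immediate constant */
};

/* Hypothetical helper: fill two consecutive slots with one 64-bit load. */
static void ld_imm64_encode(struct insn out[2], uint8_t dst, uint64_t val)
{
	memset(out, 0, 2 * sizeof(out[0]));   /* all unused fields must be zero */
	out[0].code = BPF_LD | BPF_DW | BPF_IMM;
	out[0].dst_reg = dst;
	out[0].imm = (int32_t)(uint32_t)val;           /* lower 32 bits */
	out[1].imm = (int32_t)(uint32_t)(val >> 32);   /* upper 32 bits */
}

/* Hypothetical helper: rebuild the constant from the two slots. */
static uint64_t ld_imm64_decode(const struct insn in[2])
{
	return (uint64_t)(uint32_t)in[0].imm |
	       ((uint64_t)(uint32_t)in[1].imm << 32);
}
```

Casting each 'imm' through uint32_t before widening avoids sign extension of
the lower half leaking into the upper 32 bits of the reconstructed value.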

Classic BPF has a similar instruction: BPF_LD | BPF_W | BPF_IMM,
which loads a 32-bit immediate value into a register.

x64 JITs it as a single 'movabsq %rax, imm64'.
arm64 may JIT it as a sequence of four 'movk x0, #imm16, lsl #shift' insns.

Note that old eBPF programs are binary compatible with the new interpreter.

It lets eBPF programs load a 64-bit constant into a register with one
instruction instead of using two registers and 4 instructions:
BPF_MOV32_IMM(R1, imm32)
BPF_ALU64_IMM(BPF_LSH, R1, 32)
BPF_MOV32_IMM(R2, imm32)
BPF_ALU64_REG(BPF_OR, R1, R2)

User space generated programs will use this instruction to load constants only.

To tell the kernel that user space needs a pointer, a _pseudo_ variant of
this instruction may be added later, which will use extra bits of the encoding
to indicate what type of pointer user space is asking the kernel to provide.
For example, the 'off' or 'src_reg' fields can be used for this purpose:
src_reg = 1 could mean that user space is asking the kernel to validate and
load an in-kernel map pointer;
src_reg = 2 could mean that user space needs a read-only data section pointer;
src_reg = 3 could mean that user space needs a pointer to per-cpu local data.
No such future pseudo instruction will carry the actual pointer
as part of the instruction; it will instead be treated as a request to the
kernel to provide one. The kernel will verify the request_for_a_pointer, then
drop the _pseudo_ marking and store the actual internal pointer inside
the instruction, so the end result is that the interpreter and JITs never
see pseudo BPF_LD_IMM64 insns and only operate on the generic BPF_LD_IMM64 that
loads a 64-bit immediate into a register. User space never operates on direct
pointers, and the verifier can easily distinguish request_for_pointer from other
instructions.

Signed-off-by: Alexei Starovoitov 
---
 Documentation/networking/filter.txt |8 +++-
 arch/x86/net/bpf_jit_comp.c |   17 +
 include/linux/filter.h  |   18 ++
 kernel/bpf/core.c   |5 +
 lib/test_bpf.c  |   21 +
 5 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/filter.txt b/Documentation/networking/filter.txt
index c48a9704bda8..81916ab5d96f 100644
--- a/Documentation/networking/filter.txt
+++ b/Documentation/networking/filter.txt
@@ -951,7 +951,7 @@ Size modifier is one of ...
 
 Mode modifier is one of:
 
-  BPF_IMM  0x00  /* classic BPF only, reserved in eBPF */
+  BPF_IMM  0x00  /* used for 32-bit mov in classic BPF and 64-bit in eBPF */
   BPF_ABS  0x20
   BPF_IND  0x40
   BPF_MEM  0x60
@@ -995,6 +995,12 @@ BPF_XADD | BPF_DW | BPF_STX: lock xadd *(u64 *)(dst_reg + off16) += src_reg
 Where size is one of: BPF_B or BPF_H or BPF_W or BPF_DW. Note that 1 and
 2 byte atomic increments are not supported.
 
+eBPF has one 16-byte instruction: BPF_LD | BPF_DW | BPF_IMM which consists
+of two consecutive 'struct bpf_insn' 8-byte blocks and interpreted as single
+instruction that loads 64-bit immediate value into a dst_reg.
+Classic BPF has similar instruction: BPF_LD | BPF_W | BPF_IMM which loads
+32-bit immediate value into a register.
+
 Testing
 ---
 
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 39ccfbb4a723..06f8c17f5484 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -393,6 +393,23 @@ static int do_jit(struct bpf_prog *bpf_prog, int *addrs, u8 *image,
EMIT1_off32(add_1reg(0xB8, dst_reg), imm32);
break;
 
+   case BPF_LD | BPF_IMM | BPF_DW:
+   if (insn[1].code != 0 || insn[1].src_reg != 0 ||
+   insn[1].dst_reg != 0 || insn[1].off != 0) {
+   /* verifier must catch invalid insns */
+   pr_err("invalid BPF_LD_IMM64 insn\n");
+   return -EINVAL;
+   }
+
+   /* movabsq %rax, imm64 */
+   EMIT2(add_1mod(0x48, dst_reg), add_1reg(0xB8, dst_reg));
+   EMIT(insn[0].imm, 4);
+   EMIT(insn[1].imm, 4);
+
+   insn++;
+   i++;
+   break;
+
/* dst %= src, dst /= src, dst %= imm32, dst /= imm32 */
case BPF_ALU | BPF_MOD | BPF_X:
case BPF_ALU | BPF_DIV | BPF_X:
diff --git a/include/linux/filter.h b/include/linux/filter.h
index c78994593355..bf323da77950 

[PATCH v10 net-next 2/2] net: filter: split filter.h and expose eBPF to user space

2014-09-04 Thread Alexei Starovoitov
allow user space to generate eBPF programs

uapi/linux/bpf.h: eBPF instruction set definition

linux/filter.h: the rest

This patch only moves macro definitions, but practically it freezes existing
eBPF instruction set, though new instructions can still be added in the future.

These eBPF definitions cannot go into uapi/linux/filter.h, since the names
may conflict with existing applications.

Full eBPF ISA description is in Documentation/networking/filter.txt

Signed-off-by: Alexei Starovoitov 
Acked-by: Daniel Borkmann 
---
 include/linux/filter.h|   56 +-
 include/uapi/linux/Kbuild |1 +
 include/uapi/linux/bpf.h  |   65 +
 3 files changed, 67 insertions(+), 55 deletions(-)
 create mode 100644 include/uapi/linux/bpf.h

diff --git a/include/linux/filter.h b/include/linux/filter.h
index bf323da77950..8f82ef3f1cdd 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -10,58 +10,12 @@
 #include 
 #include 
 #include 
+#include 
 
 struct sk_buff;
 struct sock;
 struct seccomp_data;
 
-/* Internally used and optimized filter representation with extended
- * instruction set based on top of classic BPF.
- */
-
-/* instruction classes */
-#define BPF_ALU64  0x07/* alu mode in double word width */
-
-/* ld/ldx fields */
-#define BPF_DW 0x18/* double word */
-#define BPF_XADD   0xc0/* exclusive add */
-
-/* alu/jmp fields */
-#define BPF_MOV0xb0/* mov reg to reg */
-#define BPF_ARSH   0xc0/* sign extending arithmetic shift right */
-
-/* change endianness of a register */
-#define BPF_END0xd0/* flags for endianness conversion: */
-#define BPF_TO_LE  0x00/* convert to little-endian */
-#define BPF_TO_BE  0x08/* convert to big-endian */
-#define BPF_FROM_LEBPF_TO_LE
-#define BPF_FROM_BEBPF_TO_BE
-
-#define BPF_JNE0x50/* jump != */
-#define BPF_JSGT   0x60/* SGT is signed '>', GT in x86 */
-#define BPF_JSGE   0x70/* SGE is signed '>=', GE in x86 */
-#define BPF_CALL   0x80/* function call */
-#define BPF_EXIT   0x90/* function return */
-
-/* Register numbers */
-enum {
-   BPF_REG_0 = 0,
-   BPF_REG_1,
-   BPF_REG_2,
-   BPF_REG_3,
-   BPF_REG_4,
-   BPF_REG_5,
-   BPF_REG_6,
-   BPF_REG_7,
-   BPF_REG_8,
-   BPF_REG_9,
-   BPF_REG_10,
-   __MAX_BPF_REG,
-};
-
-/* BPF has 10 general purpose 64-bit registers and stack frame. */
-#define MAX_BPF_REG__MAX_BPF_REG
-
 /* ArgX, context and stack frame pointer register positions. Note,
  * Arg1, Arg2, Arg3, etc are used as argument mappings of function
  * calls in BPF_CALL instruction.
@@ -322,14 +276,6 @@ enum {
 #define SK_RUN_FILTER(filter, ctx) \
(*filter->prog->bpf_func)(ctx, filter->prog->insnsi)
 
-struct bpf_insn {
-   __u8code;   /* opcode */
-   __u8dst_reg:4;  /* dest register */
-   __u8src_reg:4;  /* source register */
-   __s16   off;/* signed offset */
-   __s32   imm;/* signed immediate constant */
-};
-
 #ifdef CONFIG_COMPAT
 /* A struct sock_filter is architecture independent. */
 struct compat_sock_fprog {
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index 24e9033f8b3f..fb3f7b675229 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -67,6 +67,7 @@ header-y += bfs_fs.h
 header-y += binfmts.h
 header-y += blkpg.h
 header-y += blktrace_api.h
+header-y += bpf.h
 header-y += bpqether.h
 header-y += bsg.h
 header-y += btrfs.h
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
new file mode 100644
index ..479ed0b6be16
--- /dev/null
+++ b/include/uapi/linux/bpf.h
@@ -0,0 +1,65 @@
+/* Copyright (c) 2011-2014 PLUMgrid, http://plumgrid.com
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#ifndef _UAPI__LINUX_BPF_H__
+#define _UAPI__LINUX_BPF_H__
+
+#include 
+
+/* Extended instruction set based on top of classic BPF */
+
+/* instruction classes */
+#define BPF_ALU64  0x07/* alu mode in double word width */
+
+/* ld/ldx fields */
+#define BPF_DW 0x18/* double word */
+#define BPF_XADD   0xc0/* exclusive add */
+
+/* alu/jmp fields */
+#define BPF_MOV0xb0/* mov reg to reg */
+#define BPF_ARSH   0xc0/* sign extending arithmetic shift right */
+
+/* change endianness of a register */
+#define BPF_END0xd0/* flags for endianness conversion: */
+#define BPF_TO_LE  0x00/* convert to little-endian */
+#define BPF_TO_BE  0x08/* convert to big-endian */
+#define BPF_FROM_LEBPF_TO_LE
+#define BPF_FROM_BEBPF_TO_BE
+
+#define BPF_JNE0x50/* jump != */
+#define 

[PATCH v10 net-next 0/2] load imm64 insn and uapi/linux/bpf.h

2014-09-04 Thread Alexei Starovoitov
Hi,

V9->V10
- no changes, added Daniel's ack

Note they're on top of Hannes's patch in the same area [1]

V8 thread with 'why' reasoning and end goal [2]

Original set [3] of ~28 patches I'm planning to present in 4 stages:

  I. these 2 patches, to fork off llvm upstreaming
 II. bpf syscall with manpage and map implementation
III. bpf program load/unload with verifier testsuite (1st user of
 instruction macros from bpf.h and 1st user of load imm64 insn)
 IV. tracing, etc

[1] http://patchwork.ozlabs.org/patch/385266/
[2] https://lkml.org/lkml/2014/8/27/628
[3] https://lkml.org/lkml/2014/8/26/859



Re: [PATCH] mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

2014-09-04 Thread Junxiao Bi
On 09/05/2014 10:32 AM, Junxiao Bi wrote:
> On 09/04/2014 05:23 PM, Dave Chinner wrote:
>> On Wed, Sep 03, 2014 at 01:54:54PM +0800, Junxiao Bi wrote:
>>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O 
>>> during memory allocation")
>>> introduces PF_MEMALLOC_NOIO flag to avoid doing I/O inside memory 
>>> allocation, __GFP_IO is cleared
>>> when this flag is set, but __GFP_FS implies __GFP_IO, it should also be 
>>> cleared. Or it may still
>>> run into I/O, like in superblock shrinker.
>>>
>>> Signed-off-by: Junxiao Bi 
>>> Cc: joyce.xue 
>>> Cc: Ming Lei 
>>> ---
>>>  include/linux/sched.h |6 --
>>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>>> index 5c2c885..2fb2c47 100644
>>> --- a/include/linux/sched.h
>>> +++ b/include/linux/sched.h
>>> @@ -1936,11 +1936,13 @@ extern void thread_group_cputime_adjusted(struct 
>>> task_struct *p, cputime_t *ut,
>>>  #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
>>>  #define used_math() tsk_used_math(current)
>>>  
>>> -/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags */
>>> +/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags
>>> + * __GFP_FS is also cleared as it implies __GFP_IO.
>>> + */
>>>  static inline gfp_t memalloc_noio_flags(gfp_t flags)
>>>  {
>>> if (unlikely(current->flags & PF_MEMALLOC_NOIO))
>>> -   flags &= ~__GFP_IO;
>>> +   flags &= ~(__GFP_IO | __GFP_FS);
>>> return flags;
>>>  }
>>
>> You also need to mask all the shrink_control->gfp_mask
>> initialisations in mm/vmscan.c. The current code only masks the page
>> reclaim gfp_mask, not those that are passed to the shrinkers.
> Yes, there are some shrink_control->gfp_mask not masked in vmscan.c in
> the following functions. Beside this, all seemed be masked from direct
> reclaim path by memalloc_noio_flags().
> 
> -reclaim_clean_pages_from_list()
> used by alloc_contig_range(), this function is invoked in hugetlb and
> cma, for hugetlb, it should be safe as only userspace use it. I am not
> sure about the cma.
> David & Andrew, may you share your idea about whether cma is affected?
> 
Looking at CMA: it is used for devices that do not support scatter/gather
DMA, mainly embedded devices such as cameras, which should not be
the case for block devices. So I think this gfp_mask does not need to be
masked.

Thanks,
Junxiao.
> -mem_cgroup_shrink_node_zone()
> -try_to_free_mem_cgroup_pages()
> These two are used by mem cgroup, as no kernel thread can be assigned
> into such cgroup, so i think, no need mask.
> 
> -balance_pgdat()
> used by kswapd, no need mask.
> 
> -shrink_all_memory()
> used by hibernate, should be safe with GFP_FS/IO.
> 
> Thanks,
> Junxiao.
>>
>> Cheers,
>>
>> Dave.
>>
> 
> 



Re: [PATCH v7] pcie: Add Xilinx PCIe Host Bridge IP driver

2014-09-04 Thread Srikanth Thokala
On Wed, Sep 3, 2014 at 11:35 PM, Bjorn Helgaas  wrote:
> On Wed, Aug 20, 2014 at 09:56:02PM +0530, Srikanth Thokala wrote:
>> This is the driver for Xilinx AXI PCIe Host Bridge Soft IP
>>
>> Signed-off-by: Srikanth Thokala 
>> Acked-by: Arnd Bergmann 
>
> Applied to pci/host-xilinx for v3.18, thanks!

Thanks Bjorn!

Srikanth


>
>> ---
>> Changes in v7:
>> - Removed errors reported from build-bot.  The errors are
>>   mainly due to same CONFIG_PCI_XILINX flag being used for
>>   Zynq and Microblaze platforms.  So, renamed the driver
>>   config flag name to CONFIG_PCIE_XILINX.
>> - Also, renamed the driver file to pcie-xilinx.c to be in
>>   sync with CONFIG option.
>> - Fixed the annotation of xilinx_pcie_scan_bus() function
>>   to remove build-bot warnings.
>>
>> Changes in v6:
>> - Added Ack from Arnd. Thanks Arnd.
>> - Rebased on 3.16-rc7.
>>
>> Changes in v5:
>> - Removed unnecessary checking of port structure.
>> - Changed the return type of verify_config from int to bool.
>> - Renamed following functions,
>>   xilinx_pcie_is_link_up() -> xilinx_pcie_link_is_up()
>>   xilinx_pcie_verify_config() -> xilinx_pcie_valid_device()
>>   xilinx_pcie_get_config_base() -> xilinx_pcie_config_base()
>> - Removed link_up bool flag from port structure, as it is not
>>   being used.
>> - Removed unused constants.
>> - Rebased on 3.16-rc6.
>> - Fixed some minor comments.
>> - Thanks Bjorn for the review.
>>
>> Changes in v4:
>> - Regarding the comments to separate ECAM functionality,
>>   I have sent a separate patch and it is decided to implement
>>   it later. The patch is here,
>>   https://lkml.org/lkml/2014/5/18/54
>> - Fixed issue with adding configuration bus resource.
>> - Moved the logic for setting up bus resources to probe() from
>>   pcie_setup().
>> - Instead of mapping all the MSI interrupts in the probe, changed
>>   to map only when a MSI is requested.
>> - Earlier, the implementation of legacy and MSI interrupts init-
>>   is mutually exclusive, now changed to have the legacy interrupts
>>   init always and MSI interrupt init based on CONFIG_PCI_MSI flag.
>> - Regarding the MSI generic implementation comment, I will plan to
>>   do on top of this driver patch.
>> - Rebased on 3.16-rc2.
>> - Fixed other minor comments.
>> - Thanks Arnd and Bjorn for the review.
>>
>> Changes in v3:
>> - Rebased on v3.15.0-rc1
>> - Added support for interrupt-map DT functionality.
>> - Removed map_irq() wrapper, instead using of_irq_parse_and_map_pci().
>> - Modified resource mapping logic as per the series
>>   "PCI: ARM: add support for generic PCI host controller"
>> - Modified devicetree binding documentation to update with interrupt-
>>   map properties.
>> - Use devm calls wherever applicable.
>> - Fixed minor comments from Jason
>> - Thanks Jason for the review and suggestions.
>>
>> Changes in v2:
>> - Rebased on v3.14.0-rc8
>> - Removed IP specific DT properties like include-rc, axibar-num etc.,
>>   as suggested by Jason and Bjorn, Thanks
>> ---
>>  .../devicetree/bindings/pci/xilinx-pcie.txt|   62 ++
>>  drivers/pci/host/Kconfig   |7 +
>>  drivers/pci/host/Makefile  |1 +
>>  drivers/pci/host/pcie-xilinx.c |  978 
>> 
>>  4 files changed, 1048 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/pci/xilinx-pcie.txt
>>  create mode 100644 drivers/pci/host/pcie-xilinx.c
>>
>> diff --git a/Documentation/devicetree/bindings/pci/xilinx-pcie.txt 
>> b/Documentation/devicetree/bindings/pci/xilinx-pcie.txt
>> new file mode 100644
>> index 000..3e2c88d
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/pci/xilinx-pcie.txt
>> @@ -0,0 +1,62 @@
>> +* Xilinx AXI PCIe Root Port Bridge DT description
>> +
>> +Required properties:
>> +- #address-cells: Address representation for root ports, set to <3>
>> +- #size-cells: Size representation for root ports, set to <2>
>> +- #interrupt-cells: specifies the number of cells needed to encode an
>> + interrupt source. The value must be 1.
>> +- compatible: Should contain "xlnx,axi-pcie-host-1.00.a"
>> +- reg: Should contain AXI PCIe registers location and length
>> +- device_type: must be "pci"
>> +- interrupts: Should contain AXI PCIe interrupt
>> +- interrupt-map-mask,
>> +  interrupt-map: standard PCI properties to define the mapping of the
>> + PCI interface to interrupt numbers.
>> +- ranges: ranges for the PCI memory regions (I/O space region is not
>> + supported by hardware)
>> + Please refer to the standard PCI bus binding document for a more
>> + detailed explanation
>> +
>> +Optional properties:
>> +- bus-range: PCI bus numbers covered
>> +
>> +Interrupt controller child node
>> 
>> +Required properties:
>> +- interrupt-controller: identifies the node as an interrupt controller
>> +- #address-cells: specifies the number of cells needed to encode an
>> + address. The value must be 

[PATCH v2] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Namhyung Kim
There's a problem with finding the correct kernel symbols when perf report
runs on a different kernel.  Although part of the problem was solved
by the prior commit 0a7e6d1b6844 ("perf tools: Check recorded kernel
version when finding vmlinux"), a problem still remains.

When perf records samples, it synthesizes the kernel map using
machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
You can easily see it using 'perf report -D' command.

After the record finishes, it goes through the recorded events to find the
maps/dsos actually used, and then records build-id info for them.

During this process, it needs to load symbols in a dso, and it calls
dso__load_vmlinux() since the default value of
symbol_conf.try_vmlinux_path is true.  However, this changes dso->long_name
to the real path of the vmlinux file (e.g.
/lib/modules/3.16.0-rc2+/build/vmlinux) if one is running a custom kernel.

As a result, perf report reads the build-id of the vmlinux but
cannot use it, since it only knows about the [kernel.kallsyms] map.  It
then silently falls back to possible vmlinux paths, using the recorded kernel
version (in the case of a recent version) or the running kernel.

Even with the recent tools, this can still break
the result.  As the build directory is a symbolic link, if one builds a
new kernel in the same directory with a different source/config, the old
link to vmlinux will point to the new file.  So it is essential to
use the build-id when finding a kernel image.

This patch changes the lookup to first search for a kernel dso using the
"vmlinux" short name (which always has a build-id) and, if none is found,
to search for "[kernel.kallsyms]".
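
The lookup order can be sketched as a stand-alone C fragment. It does not use
the real perf data structures: 'struct dso_entry', find_by_short_name() and
find_kernel_dso() are hypothetical stand-ins for the dso list, dsos__find()
and the patched logic.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for perf's 'struct dso'. */
struct dso_entry {
	const char *short_name;
	int has_build_id;
};

/* Linear search by short name; returns NULL when nothing matches. */
static const struct dso_entry *
find_by_short_name(const struct dso_entry *list, size_t n, const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(list[i].short_name, name) == 0)
			return &list[i];
	return NULL;
}

/* Prefer "vmlinux"; fall back to "[kernel.kallsyms]" if it is absent. */
static const struct dso_entry *
find_kernel_dso(const struct dso_entry *list, size_t n)
{
	const struct dso_entry *kernel = find_by_short_name(list, n, "vmlinux");

	if (kernel == NULL)
		kernel = find_by_short_name(list, n, "[kernel.kallsyms]");
	return kernel;
}
```

Unlike the __dsos__findnew() fallback in the patch, the fallback in this sketch
only searches and never creates a new entry, so it can return NULL.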

Before:

  $ perf report
  # Children  Self  Command  Shared Object  Symbol
  # ........  ....  .......  .............  ......
  #
  72.15% 0.00%  swapper  [kernel.kallsyms]  [k] set_curr_task_rt
  72.15% 0.00%  swapper  [kernel.kallsyms]  [k] native_calibrate_tsc
  72.15% 0.00%  swapper  [kernel.kallsyms]  [k] tsc_refine_calibration_work
  71.87%71.87%  swapper  [kernel.kallsyms]  [k] module_finalize
   ...

After (for the same perf.data):

  72.15% 0.00%  swapper  vmlinux  [k] cpu_startup_entry
  72.15% 0.00%  swapper  vmlinux  [k] arch_cpu_idle
  72.15% 0.00%  swapper  vmlinux  [k] default_idle
  71.87%71.87%  swapper  vmlinux  [k] native_safe_halt
   ...

Cc: sta...@vger.kernel.org
Signed-off-by: Namhyung Kim 
---
 tools/perf/util/machine.c | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index b2ec38bf211e..d9f828f3b54f 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1060,10 +1060,14 @@ static int machine__process_kernel_mmap_event(struct machine *machine,
strlen(kmmap_prefix));
/*
 * Should be there already, from the build-id table in
-* the header.
+* the header (but maybe with a different name: "vmlinux").
 */
-   struct dso *kernel = __dsos__findnew(>kernel_dsos,
-kmmap_prefix);
+   struct dso *kernel = dsos__find(>kernel_dsos,
+   "vmlinux", true);
+
+   if (kernel == NULL)
+   kernel = __dsos__findnew(>kernel_dsos,
+kmmap_prefix);
if (kernel == NULL)
goto out_problem;
 
-- 
2.1.0



Re: [PATCH] pinctrl: st: Add remove function and remove gpio_chip on failure

2014-09-04 Thread Pramod Gurav
Hi Srini,

Thanks for review.

On 04-09-2014 11:38 PM, Srinivas Kandagatla wrote:
> Pramod,
> sorry for delay in reply as I was travelling, still in Jet lag.
>> Signed-off-by: Pramod Gurav 
>> ---
>>   drivers/pinctrl/pinctrl-st.c |   25 +
>>   1 files changed, 25 insertions(+), 0 deletions(-)
>>
>> diff --git a/drivers/pinctrl/pinctrl-st.c b/drivers/pinctrl/pinctrl-st.c
>> index 5475374..9296845 100644
>> --- a/drivers/pinctrl/pinctrl-st.c
>> +++ b/drivers/pinctrl/pinctrl-st.c
>> @@ -1517,6 +1517,7 @@ static int st_gpiolib_register_bank(struct st_pinctrl 
>> *info,
>> 0, handle_simple_irq,
>> IRQ_TYPE_LEVEL_LOW);
>>  if (err) {
>> +gpiochip_remove(>gpio_chip);
> This change-set looks good.
> IMO, you can send a patch for this change set.
Will resend just this.

> 
>>  dev_info(dev, "could not add irqchip\n");
>>  return err;
>>  }
>> @@ -1685,6 +1686,29 @@ static int st_pctl_probe(struct platform_device *pdev)
>>  return 0;
>>   }
>>
> 
>> +static int st_pctl_remove(struct platform_device *pdev)
>> +{
> Ideally this driver will not be removed, as other drivers depend on 
> this, even the serial.
> 
> so I see no big achievement in adding the remove functionality, as this 
> is going to be a dead code and would never be tested.
> 
>> +struct st_pinctrl *info = platform_get_drvdata(pdev);
>> +struct device_node *np = pdev->dev.of_node;
>> +struct device_node *child;
>> +struct gpio_chip gpio_chip;
>> +int bank = 0;
>> +
> 
>> +if (info->nbanks) {
>> +for_each_child_of_node(np, child) {
>> +if (of_property_read_bool(child, "gpio-controller")) {
> We should not re-parse the DT nodes once we are done with it in the probe.
Thanks. :)
> 
>> +gpio_chip = info->banks[bank].gpio_chip;
>> +gpiochip_remove(_chip);
>> +bank++;
>> +}
>> +}
>> +}
>> +
> 
> I think the logic is very simple:
> 
>   while (nbanks--)
>   gpiochip_remove(>banks[bank++].gpio_chip))
Thanks again. Remove is not needed hence will not do this. But good know.
> 
> 
> thanks,
> srini
>> +pinctrl_unregister(info->pctl);
>> +
>> +return 0;
>> +}
>> +
>>   static struct platform_driver st_pctl_driver = {
>>  .driver = {
>>  .name = "st-pinctrl",
>> @@ -1692,6 +1716,7 @@ static struct platform_driver st_pctl_driver = {
>>  .of_match_table = st_pctl_of_match,
>>  },
>>  .probe = st_pctl_probe,
>> +.remove = st_pctl_remove,
>>   };
>>
>>   static int __init st_pctl_init(void)
>>
> 


[RFC PATCH 4/5] ARM: dts: Add PPMU dt node for Exynos4 SoC

2014-09-04 Thread Chanwoo Choi
This patch adds PPMU (Performance Profiling Monitoring Unit) DT nodes for the
Exynos4 (Exynos4210/4212/4412) SoCs. The PPMU DT nodes are used to monitor the
utilization of each IP.

The Exynos4210/Exynos4212/Exynos4412 SoCs include the following PPMUs:
- PPMU_DMC0      0x106A_
- PPMU_DMC1      0x106B_
- PPMU_CPU       0x106C_
- PPMU_ACP       0x10AE_
- PPMU_RIGHT_BUS 0x112A_
- PPMU_LEFT_BUS  0x116A_
- PPMU_LCD0      0x11E4_
- PPMU_CAMIF     0x11AC_
- PPMU_IMAGE     0x12AA_
- PPMU_TV        0x12E4_
- PPMU_3D        0x1322_
- PPMU_MFC_L     0x1366_
- PPMU_MFC_R     0x1367_

Additionally, the Exynos4210 SoC includes the following PPMU:
- PPMU_LCD1      0x1224_

Signed-off-by: Chanwoo Choi 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos4.dtsi| 96 +++
 arch/arm/boot/dts/exynos4210.dtsi |  8 
 2 files changed, 104 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4.dtsi b/arch/arm/boot/dts/exynos4.dtsi
index e0278ec..c58d3f3 100644
--- a/arch/arm/boot/dts/exynos4.dtsi
+++ b/arch/arm/boot/dts/exynos4.dtsi
@@ -645,4 +645,100 @@
samsung,sysreg = <_reg>;
status = "disabled";
};
+
+   ppmu_dmc0: ppmu_dmc0@106a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106a 0x2000>;
+   clocks = < CLK_PPMUDMC0>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_dmc1: ppmu_dmc1@106b {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106b 0x2000>;
+   clocks = < CLK_PPMUDMC1>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_cpu: ppmu_cpu@106c {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106c 0x2000>;
+   clocks = < CLK_PPMUCPU>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_rightbus: ppmu_rightbus@112a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x112a 0x2000>;
+   clocks = < CLK_PPMURIGHT>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_leftbus: ppmu_leftbus0@116a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x116a 0x2000>;
+   clocks = < CLK_PPMULEFT>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_camif: ppmu_camif@11ac {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x11ac 0x2000>;
+   clocks = < CLK_PPMUCAMIF>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_lcd0: ppmu_lcd0@11e4 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x11e4 0x2000>;
+   clocks = < CLK_PPMULCD0>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_image: ppmu_image@12aa {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x12aa 0x2000>;
+   clocks = < CLK_PPMUIMAGE>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_tv: ppmu_tv@12e4 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x12e4 0x2000>;
+   clocks = < CLK_PPMUTV>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_g3d: ppmu_g3d@1322 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1322 0x2000>;
+   clocks = < CLK_PPMUG3D>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_mfc_l: ppmu_mfc_l@1366 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1366 0x2000>;
+   clocks = < CLK_PPMUMFC_L>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_mfc_r: ppmu_mfc_r@1367 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1367 0x2000>;
+   clocks = < CLK_PPMUMFC_R>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
 };
diff --git a/arch/arm/boot/dts/exynos4210.dtsi b/arch/arm/boot/dts/exynos4210.dtsi
index 807bb5b..1b854e6 100644
--- a/arch/arm/boot/dts/exynos4210.dtsi
+++ b/arch/arm/boot/dts/exynos4210.dtsi
@@ -175,4 +175,12 @@
samsung,lcd-wb;
};
};
+
+   ppmu_lcd1: ppmu_lcd1@1224 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1224 0x2000>;
+   clocks = < CLK_PPMULCD1>;
+   clock-names = "ppmu";
+   status = 

Re: [PATCH] i2c: rk3x: fix divisor calculation for SCL frequency

2014-09-04 Thread Doug Anderson
Addy,

On Thu, Sep 4, 2014 at 7:32 PM, Addy Ke  wrote:
> The description of the I2C_CLKDIV register in the previous version of
> the RK3x chip manual is incorrect. Plus 1 is required.
>
> The correct formula:
> - T(SCL_HIGH) = T(PCLK) * (CLKDIVH + 1) * 8
> - T(SCL_LOW) = T(PCLK) * (CLKDIVL + 1) * 8
> - (SCL Divisor) = 8 * ((CLKDIVL + 1) + (CLKDIVH + 1))
> - SCL = PCLK / (SCL Divisor)

I'll trust that you tested this with a scope


> It will be updated to the latest version of chip manual.
>
> Signed-off-by: Addy Ke 
> ---
>  drivers/i2c/busses/i2c-rk3x.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
> index e637c32..76b6604 100644
> --- a/drivers/i2c/busses/i2c-rk3x.c
> +++ b/drivers/i2c/busses/i2c-rk3x.c
> @@ -433,8 +433,8 @@ static void rk3x_i2c_set_scl_rate(struct rk3x_i2c *i2c, 
> unsigned long scl_rate)
> unsigned long i2c_rate = clk_get_rate(i2c->clk);
> unsigned int div;
>
> -   /* SCL rate = (clk rate) / (8 * DIV) */
> -   div = DIV_ROUND_UP(i2c_rate, scl_rate * 8);
> +   /* SCL rate = (clk rate) / (8 * (DIV + 2)) */
> +   div = DIV_ROUND_UP(i2c_rate, scl_rate * 8) - 2;

Given the bug I just fixed in the Rockchip SPI driver, I was a little
worried about div becoming -1 (and thus being a really large positive
number since div is unsigned).

However, it seems that you get saved by the next statement:
  div = DIV_ROUND_UP(div, 2);

In the testing I did with the Linux macros, that magically transformed
a div of 0x (-1) to 0, so it's not technically a bug.  ...but
it's very non-obvious.  Can you do something a little cleaner?
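
One possible cleaner form, as a minimal stand-alone sketch rather than a patch
against the driver: clamp the rounded-up ratio to its hardware minimum before
subtracting, so the unsigned value never wraps. The helper names
scl_half_div() and scl_rate_from_div() are hypothetical, DIV_ROUND_UP is
redefined locally with the usual kernel definition, and the sketch assumes
CLKDIVL == CLKDIVH as in the quoted formula.

```c
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper: divider value used for both CLKDIVL and CLKDIVH. */
static unsigned int scl_half_div(unsigned long clk_rate, unsigned long scl_rate)
{
	/* Number of 8-PCLK units in one SCL period, rounded up. */
	unsigned long units = DIV_ROUND_UP(clk_rate, scl_rate * 8);

	/* The smallest possible period is (0 + 1) + (0 + 1) = 2 units. */
	if (units < 2)
		units = 2;

	/* Each half lasts (div + 1) units, so 2 * (div + 1) >= units. */
	return (unsigned int)(DIV_ROUND_UP(units, 2) - 1);
}

/* Resulting SCL rate: PCLK / (8 * ((div + 1) + (div + 1))). */
static unsigned long scl_rate_from_div(unsigned long clk_rate, unsigned int div)
{
	return clk_rate / (8UL * 2 * (div + 1));
}
```

For every input with units >= 2 this gives the same result as subtracting 2
and then applying DIV_ROUND_UP(div, 2); the only difference is that the small
case is handled by an explicit clamp instead of by unsigned wrap-around.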

-Doug


[RFC PATCH 0/5] devfreq: Add devfreq-event class to provide raw data for devfreq device

2014-09-04 Thread Chanwoo Choi
This patchset adds a new devfreq_event class to provide the raw data used to
determine the current utilization of a device, for use by the devfreq governor.
However, this patchset is just an initial version and needs more implementation
and updates.

Although this patchset does not yet have enough features to support the
devfreq-event class, I am sending this RFC patchset to share the current
implementation. Please feel free to give comments and advice.

[Description of devfreq-event class]
This patchset adds a new devfreq_event class for devfreq_event devices, which
provide raw data (e.g., memory bus utilization/GPU utilization). The raw data
from a devfreq_event device is used by the governor of the devfreq subsystem.
- devfreq_event device : Provides raw data for the governor of an existing
                         devfreq device
- devfreq device       : Monitors device state and changes the
                         frequency/voltage of the device using the raw data
                         from the devfreq_event device

The devfreq subsystem supports generic DVFS (Dynamic Voltage/Frequency Scaling)
for non-CPU devices. A devfreq device determines the current device state
using one of various governors (e.g., ondemand, performance, powersave). Once
the system state has been determined, the devfreq device changes its
frequency/voltage according to the governor's result.

However, a devfreq governor needs basic data that indicates the current device
state. The existing devfreq subsystem only considers the devfreq device, which
checks the current system state and determines the proper system state using
that basic data. There is no subsystem for the devices that provide the basic
data to a devfreq device.

The devfreq subsystem therefore needs devfreq_event devices (data-provider
devices) for existing devfreq devices. So this patch adds a new devfreq_event
class for devfreq_event devices, which read various basic data (e.g., memory
bus utilization, GPU utilization) and provide the measured data to existing
devfreq devices through the standard APIs of the devfreq_event class.

The following description explains the feature of two kind of devfreq class:
- devfreq class (existing)
 : devfreq consumer device use raw data from devfreq_event device for
   determining proper current system state and change voltage/frequency
   dynamically using various governors.
- devfreq_event class (new)
 : Provide measured raw data to devfreq device for governor
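
To make the provider/consumer split described above concrete, here is a
minimal user-space sketch. It is not the API from this patchset: the names
event_data, event_ops, event_dev, make_fake_event() and pick_freq_khz(), as
well as the 70% threshold, are hypothetical and exist only for illustration.
The provider exposes raw counters through callbacks, and the consumer (a toy
governor) turns them into a frequency decision.

```c
/* Raw sample reported by a data provider (hypothetical layout). */
struct event_data {
	unsigned long busy;   /* events counted while the device was busy */
	unsigned long total;  /* total events in the sampling window */
};

struct event_dev;

/* Callbacks a provider implements (hypothetical, modelled on the text). */
struct event_ops {
	int (*enable)(struct event_dev *edev);
	int (*disable)(struct event_dev *edev);
	int (*get_event)(struct event_dev *edev, struct event_data *out);
};

struct event_dev {
	const char *name;
	const struct event_ops *ops;
	int enabled;
	struct event_data sample;  /* stands in for hardware counters */
};

static int fake_enable(struct event_dev *edev)  { edev->enabled = 1; return 0; }
static int fake_disable(struct event_dev *edev) { edev->enabled = 0; return 0; }

/* Return the stored sample; fail if the provider was never enabled. */
static int fake_get_event(struct event_dev *edev, struct event_data *out)
{
	if (!edev->enabled)
		return -1;
	*out = edev->sample;
	return 0;
}

static const struct event_ops fake_ops = {
	.enable = fake_enable,
	.disable = fake_disable,
	.get_event = fake_get_event,
};

/* Build a provider whose "counters" hold a fixed sample. */
struct event_dev make_fake_event(unsigned long busy, unsigned long total)
{
	struct event_dev edev = { "fake-ppmu", &fake_ops, 0, { busy, total } };
	return edev;
}

/* Consumer side: pick max_khz above 70% utilization, else min_khz. */
unsigned long pick_freq_khz(struct event_dev *edev,
			    unsigned long min_khz, unsigned long max_khz)
{
	struct event_data d;

	if (edev->ops->get_event(edev, &d) || d.total == 0)
		return max_khz;  /* no usable data: stay at the maximum */

	return (d.busy * 100 / d.total > 70) ? max_khz : min_khz;
}
```

The point of the split is that the governor only ever sees event_data, so a
different counter source can be swapped in without touching the
frequency-selection logic.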

[For example]
If the board dts includes PPMU_DMC0/DMC1/CPU event nodes, the following sysfs
entries are shown. A devfreq driver (e.g., exynos4_bus.c) can also get the
instance of a devfreq-event device by using the provided API and then get the
raw data that reflects the current utilization of the device.

-sh-3.2# pwd
/sys/class/devfreq_event
-sh-3.2# ls -al
total 0
drwxr-xr-x  2 root root 0 Jan  1 21:06 .
drwxr-xr-x 45 root root 0 Jan  1 21:06 ..
lrwxrwxrwx  1 root root 0 Jan  1 21:06 event.0 -> ../../devices/soc/106a.ppmu_dmc0/devfreq_event/event.0
lrwxrwxrwx  1 root root 0 Jan  1 21:06 event.1 -> ../../devices/soc/106b.ppmu_dmc1/devfreq_event/event.1
lrwxrwxrwx  1 root root 0 Jan  1 21:06 event.2 -> ../../devices/soc/106c.ppmu_cpu/devfreq_event/event.2

Chanwoo Choi (5):
  devfreq: Add new devfreq_event class to provide basic data for devfreq governor
  devfreq: event: Add exynos-ppmu devfreq evnet driver
  ARM: dts: Add PPMU dt node for Exynos3250
  ARM: dts: Add PPMU dt node for Exynos4 SoC
  ARM: dts: Add dt node fo PPMU_CPU/DMC0/DMC1 for exynos4412-trats2

 arch/arm/boot/dts/exynos3250.dtsi       |  66 +
 arch/arm/boot/dts/exynos4.dtsi          |  96
 arch/arm/boot/dts/exynos4210.dtsi       |   8 +
 arch/arm/boot/dts/exynos4412-trats2.dts |  29 +++
 drivers/devfreq/Kconfig                 |  12 +
 drivers/devfreq/Makefile                |   5 +-
 drivers/devfreq/devfreq-event.c         | 251 +++
 drivers/devfreq/event/Makefile          |   2 +
 drivers/devfreq/event/exynos-ppmu.c     | 410
 include/linux/devfreq.h                 |  97
 10 files changed, 975 insertions(+), 1 deletion(-)
 create mode 100644 drivers/devfreq/devfreq-event.c
 create mode 100644 drivers/devfreq/event/Makefile
 create mode 100644 drivers/devfreq/event/exynos-ppmu.c

-- 
1.8.0



[RFC PATCH 2/5] devfreq: event: Add exynos-ppmu devfreq evnet driver

2014-09-04 Thread Chanwoo Choi
This patch adds the exynos-ppmu devfreq event driver to provide raw data about
the utilization of each IP in the Exynos SoC series.

Signed-off-by: Chanwoo Choi 
Acked-by: Kyungmin Park 
---
 drivers/devfreq/Kconfig             |  10 +
 drivers/devfreq/event/Makefile      |   1 +
 drivers/devfreq/event/exynos-ppmu.c | 410
 3 files changed, 421 insertions(+)
 create mode 100644 drivers/devfreq/event/exynos-ppmu.c

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index ef839e7..4fbbcea 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -90,4 +90,14 @@ config ARM_EXYNOS5_BUS_DEVFREQ
 
 comment "DEVFREQ Event Drivers"
 
+config DEVFREQ_EVENT_EXYNOS_PPMU
+   bool "EXYNOS PPMU (Performance Profiling Monitoring Unit) DEVFREQ event Driver"
+   depends on ARCH_EXYNOS
+   select ARCH_HAS_OPP
+   select PM_OPP
+   help
+This add the DEVFREQ event driver for Exynos SoC. It provides PPMU
+(Performance Profiling Monitoring Unit) counters to estimate the
+utilization of each module.
+
 endif # PM_DEVFREQ
diff --git a/drivers/devfreq/event/Makefile b/drivers/devfreq/event/Makefile
index dc56005..be146ea 100644
--- a/drivers/devfreq/event/Makefile
+++ b/drivers/devfreq/event/Makefile
@@ -1 +1,2 @@
 # Exynos DEVFREQ Event Drivers
+obj-$(CONFIG_DEVFREQ_EVENT_EXYNOS_PPMU) += exynos-ppmu.o
diff --git a/drivers/devfreq/event/exynos-ppmu.c b/drivers/devfreq/event/exynos-ppmu.c
new file mode 100644
index 000..54fd492
--- /dev/null
+++ b/drivers/devfreq/event/exynos-ppmu.c
@@ -0,0 +1,410 @@
+/*
+ * exynos_ppmu.c - EXYNOS PPMU (Performance Profiling Monitoring Units) support
+ *
+ * Copyright (c) 2014 Samsung Electronics Co., Ltd.
+ * Author : Chanwoo Choi 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This driver is based on drivers/devfreq/exynos/exynos_ppmu.c
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define PPMU_ENABLE             BIT(0)
+#define PPMU_DISABLE            0x0
+#define PPMU_CYCLE_RESET        BIT(1)
+#define PPMU_COUNTER_RESET      BIT(2)
+
+#define PPMU_ENABLE_COUNT0      BIT(0)
+#define PPMU_ENABLE_COUNT1      BIT(1)
+#define PPMU_ENABLE_COUNT2      BIT(2)
+#define PPMU_ENABLE_COUNT3      BIT(3)
+#define PPMU_ENABLE_CYCLE       BIT(31)
+
+#define PPMU_CNTENS             0x10
+#define PPMU_FLAG               0x50
+#define PPMU_CCNT_OVERFLOW      BIT(31)
+#define PPMU_CCNT               0x100
+
+#define PPMU_PMCNT0             0x110
+#define PPMU_PMCNT_OFFSET       0x10
+#define PMCNT_OFFSET(x)         (PPMU_PMCNT0 + (PPMU_PMCNT_OFFSET * x))
+
+#define PPMU_BEVT0SEL           0x1000
+#define PPMU_BEVTSEL_OFFSET     0x100
+#define PPMU_BEVTSEL(x)         (PPMU_BEVT0SEL + (x * PPMU_BEVTSEL_OFFSET))
+
+#define RD_DATA_COUNT           0x5
+#define WR_DATA_COUNT           0x6
+#define RDWR_DATA_COUNT         0x7
+
+enum exynos_ppmu_type {
+   TYPE_PPMU_EXYNOS4210,
+};
+
+enum ppmu_counter {
+   PPMU_PMNCNT0,
+   PPMU_PMNCNT1,
+   PPMU_PMNCNT2,
+   PPMU_PMNCNT3,
+   PPMU_PMNCNT_MAX,
+};
+
+/* Platform data */
+struct exynos_ppmu_data {
+   struct devfreq *devfreq;
+   struct devfreq_event_dev **edev;
+   struct devfreq_event_desc *events;
+   unsigned int num_events;
+
+   struct device *dev;
+   struct clk *clk_ppmu;
+   struct mutex lock;
+
+   enum exynos_ppmu_type type;
+
+   struct __exynos_ppmu {
+   void __iomem *hw_base;
+   unsigned int ccnt;
+   unsigned int event[PPMU_PMNCNT_MAX];
+   unsigned int count[PPMU_PMNCNT_MAX];
+   unsigned long long ns;
+   ktime_t reset_time;
+   bool ccnt_overflow;
+   bool count_overflow[PPMU_PMNCNT_MAX];
+   } ppmu;
+};
+
+static int exynos_ppmu_enable(struct devfreq_event_dev *edev)
+{
+   struct exynos_ppmu_data *exynos_ppmu = edev_get_drvdata(edev);
+
+   __raw_writel(PPMU_ENABLE, exynos_ppmu->ppmu.hw_base);
+
+   return 0;
+}
+
+static int exynos_ppmu_disable(struct devfreq_event_dev *edev)
+{
+   struct exynos_ppmu_data *exynos_ppmu = edev_get_drvdata(edev);
+
+   __raw_writel(PPMU_DISABLE, exynos_ppmu->ppmu.hw_base);
+
+   return 0;
+}
+
+static int exynos_ppmu_set_event(struct devfreq_event_dev *edev)
+{
+   struct exynos_ppmu_data *exynos_ppmu = edev_get_drvdata(edev);
+   void __iomem *ppmu_base = exynos_ppmu->ppmu.hw_base;
+   int id = edev->desc->id;
+
+   __raw_writel(RDWR_DATA_COUNT, ppmu_base + PPMU_BEVTSEL(id));
+
+   return 0;
+}
+
+static int exynos_ppmu_get_event(struct devfreq_event_dev *edev)
+{
+   struct exynos_ppmu_data *exynos_ppmu = edev_get_drvdata(edev);
+   void __iomem *ppmu_base = 

[RFC PATCH 3/5] ARM: dts: Add PPMU dt node for Exynos3250

2014-09-04 Thread Chanwoo Choi
This patch adds PPMU (Performance Profiling Monitoring Units) dt nodes
to estimate the utilization of each IP in the Exynos SoC through the DEVFREQ
Event subsystem.

This patch adds the following PPMU dt nodes:
- PPMU_DMC0     0x106a
- PPMU_DMC1     0x106b
- PPMU_RIGHTBUS 0x112A
- PPMU_LEFTBUS  0x116A
- PPMU_CAMIF    0x11AC
- PPMU_LCD0     0x11E4
- PPMU_3D       0x1322
- PPMU_MFC_L    0x1366
- PPMU_CPU      0x106c

Signed-off-by: Chanwoo Choi 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos3250.dtsi | 66 +++
 1 file changed, 66 insertions(+)

diff --git a/arch/arm/boot/dts/exynos3250.dtsi b/arch/arm/boot/dts/exynos3250.dtsi
index 1d52de6..20a4c59 100644
--- a/arch/arm/boot/dts/exynos3250.dtsi
+++ b/arch/arm/boot/dts/exynos3250.dtsi
@@ -465,6 +465,72 @@
compatible = "arm,cortex-a7-pmu";
interrupts = <0 18 0>, <0 19 0>;
};
+
+   ppmu_dmc0: ppmu_dmc0@106a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106a 0x2000>;
+   status = "disabled";
+   };
+
+   ppmu_dmc1: ppmu_dmc1@106b {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106b 0x2000>;
+   status = "disabled";
+   };
+
+   ppmu_cpu: ppmu_cpu@106c {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x106c 0x2000>;
+   status = "disabled";
+   };
+
+   ppmu_rightbus: ppmu_rightbus@112a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x112a 0x2000>;
+   clocks = < CLK_PPMURIGHT>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_leftbus: ppmu_leftbus0@116a {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x116a 0x2000>;
+   clocks = < CLK_PPMULEFT>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_camif: ppmu_camif@11ac {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x11ac 0x2000>;
+   clocks = < CLK_PPMUCAMIF>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_lcd0: ppmu_lcd0@11e4 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x11e4 0x2000>;
+   clocks = < CLK_PPMULCD0>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_g3d: ppmu_g3d@1322 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1322 0x2000>;
+   clocks = < CLK_PPMUG3D>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
+
+   ppmu_mfc_l: ppmu_mfc_l@1366 {
+   compatible = "samsung,exynos4210-ppmu";
+   reg = <0x1366 0x2000>;
+   clocks = < CLK_PPMUMFC_L>;
+   clock-names = "ppmu";
+   status = "disabled";
+   };
};
 };
 
-- 
1.8.0



[RFC PATCH 1/5] devfreq: Add new devfreq_event class to provide basic data for devfreq governor

2014-09-04 Thread Chanwoo Choi
This patch adds a new devfreq_event class for devfreq_event devices, which
provide raw data (e.g., memory bus utilization or GPU utilization). The raw data
from a devfreq_event device is used by the governor of the devfreq subsystem.
- devfreq_event device : Provides raw data for the governor of an existing
                         devfreq device
- devfreq device       : Monitors device state and changes the frequency/voltage
                         of the device using the raw data from the
                         devfreq_event device

The devfreq subsystem supports generic DVFS (Dynamic Voltage/Frequency Scaling)
for non-CPU devices. A devfreq device determines the current device state
using one of various governors (e.g., ondemand, performance, powersave). Once
the system state has been determined, the devfreq device changes its
frequency/voltage according to the governor's result.

However, a devfreq governor needs basic data that indicates the current device
state. The existing devfreq subsystem only considers the devfreq device, which
checks the current system state and determines the proper system state using
that basic data. There is no subsystem for the devices that provide the basic
data to a devfreq device.

The devfreq subsystem therefore needs a devfreq_event device (a data-provider
device) for each existing devfreq device. So, this patch adds a new
devfreq_event class for devfreq_event devices, which read various basic data
(e.g., memory bus utilization, GPU utilization) and provide the measured data
to an existing devfreq device through the standard APIs of the devfreq_event
class.

The following describes the roles of the two kinds of devfreq class:
- devfreq class (existing)
 : A devfreq consumer device uses raw data from a devfreq_event device to
   determine the proper current system state and changes voltage/frequency
   dynamically using various governors.

- devfreq_event class (new)
 : Provides measured raw data to a devfreq device for its governor.

Signed-off-by: Chanwoo Choi 
Acked-by: Kyungmin Park 
---
 drivers/devfreq/Kconfig         |   2 +
 drivers/devfreq/Makefile        |   5 +-
 drivers/devfreq/devfreq-event.c | 251
 drivers/devfreq/event/Makefile  |   1 +
 include/linux/devfreq.h         |  97
 5 files changed, 355 insertions(+), 1 deletion(-)
 create mode 100644 drivers/devfreq/devfreq-event.c
 create mode 100644 drivers/devfreq/event/Makefile

diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
index 3dced0a..ef839e7 100644
--- a/drivers/devfreq/Kconfig
+++ b/drivers/devfreq/Kconfig
@@ -88,4 +88,6 @@ config ARM_EXYNOS5_BUS_DEVFREQ
  It reads PPMU counters of memory controllers and adjusts the
  operating frequencies and voltages with OPP support.
 
+comment "DEVFREQ Event Drivers"
+
 endif # PM_DEVFREQ
diff --git a/drivers/devfreq/Makefile b/drivers/devfreq/Makefile
index 16138c9..a1ffabe 100644
--- a/drivers/devfreq/Makefile
+++ b/drivers/devfreq/Makefile
@@ -1,4 +1,4 @@
-obj-$(CONFIG_PM_DEVFREQ)   += devfreq.o
+obj-$(CONFIG_PM_DEVFREQ)   += devfreq.o devfreq-event.o
 obj-$(CONFIG_DEVFREQ_GOV_SIMPLE_ONDEMAND)  += governor_simpleondemand.o
 obj-$(CONFIG_DEVFREQ_GOV_PERFORMANCE)  += governor_performance.o
 obj-$(CONFIG_DEVFREQ_GOV_POWERSAVE)+= governor_powersave.o
@@ -7,3 +7,6 @@ obj-$(CONFIG_DEVFREQ_GOV_USERSPACE) += governor_userspace.o
 # DEVFREQ Drivers
 obj-$(CONFIG_ARM_EXYNOS4_BUS_DEVFREQ)  += exynos/
 obj-$(CONFIG_ARM_EXYNOS5_BUS_DEVFREQ)  += exynos/
+
+# DEVFREQ Event Drivers
+obj-$(CONFIG_PM_DEVFREQ)   += event/
diff --git a/drivers/devfreq/devfreq-event.c b/drivers/devfreq/devfreq-event.c
new file mode 100644
index 000..1629197
--- /dev/null
+++ b/drivers/devfreq/devfreq-event.c
@@ -0,0 +1,251 @@
+/*
+ * devfreq-event: Generic DEVFREQ Event class driver
+ *
+ * Copyright (C) 2014 Samsung Electronics
+ * Chanwoo Choi 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This driver is based on drivers/devfreq/devfreq.c
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "governor.h"
+
+static struct class *devfreq_event_class;
+
+/* The list of all devfreq event list */
+static LIST_HEAD(devfreq_event_list);
+static DEFINE_MUTEX(devfreq_event_list_lock);
+
+#define to_devfreq_event(DEV) container_of(DEV, struct devfreq_event_dev, dev)
+
+struct devfreq_event_dev *devfreq_add_event_device(struct device *dev,
+   struct devfreq_event_desc *desc)
+{
+   struct devfreq_event_dev *edev;
+   static atomic_t event_no = ATOMIC_INIT(0);
+   int ret;
+
+   if (!dev || !desc)
+   return ERR_PTR(-EINVAL);
+
+   if (!desc->name || !desc->ops || !desc->dev)
+   return ERR_PTR(-EINVAL);
+
+   edev = kzalloc(sizeof(struct 

[RFC PATCH 5/5] ARM: dts: Add dt node fo PPMU_CPU/DMC0/DMC1 for exynos4412-trats2

2014-09-04 Thread Chanwoo Choi
This patch adds dt nodes for PPMU_CPU/DMC0/DMC1 to the exynos4412-trats2 board.
Each PPMU dt node includes one event, 'PPMU Count 3'.

Signed-off-by: Chanwoo Choi 
Acked-by: Kyungmin Park 
---
 arch/arm/boot/dts/exynos4412-trats2.dts | 29 +
 1 file changed, 29 insertions(+)

diff --git a/arch/arm/boot/dts/exynos4412-trats2.dts b/arch/arm/boot/dts/exynos4412-trats2.dts
index 11967f4..00dcb91 100644
--- a/arch/arm/boot/dts/exynos4412-trats2.dts
+++ b/arch/arm/boot/dts/exynos4412-trats2.dts
@@ -770,6 +770,35 @@
status = "okay";
};
 
+   ppmu_cpu: ppmu_cpu@106c {
+   status = "okay";
+
+   events {
+   ppmu_cpu_3: ppmu-cpu-pmcnt3 {
+   event-name = "ppmu-cpu-pmcnt3";
+   };
+   };
+   };
+
+   ppmu_dmc0: ppmu_dmc0@106a {
+   status = "okay";
+
+   events {
+   ppmu_dmc0_3: ppmu-dmc0-pmcnt3 {
+   event-name = "ppmu-dmc0-pmcnt3";
+   };
+   };
+   };
+
+   ppmu_dmc1: ppmu_dmc1@106b {
+   status = "okay";
+
+   events {
+   ppmu_dmc1_3: ppmu-dmc1-pmcnt3 {
+   event-name = "ppmu-dmc1-pmcnt3";
+   };
+   };
+
thermistor-ap@0 {
compatible = "ntc,ncp15wb473";
pullup-uv = <180>;   /* VCC_1.8V_AP */
-- 
1.8.0



[PATCH v2.2 1/7] regulator: sky81452: Adding Skyworks SKY81452 MFD driver

2014-09-04 Thread Gyungoh Yoo
This patch adds the SKY81452 MFD driver.
The SKY81452 has two functions: a backlight LED driver and a boost converter.
The MFD driver registers the I2C regmap and registers the child drivers.

Signed-off-by: Gyungoh Yoo 
---
 drivers/mfd/Kconfig          |  12 +
 drivers/mfd/Makefile         |   1 +
 drivers/mfd/sky81452.c       | 111 +++
 include/linux/mfd/sky81452.h |  32 +
 4 files changed, 156 insertions(+)
 create mode 100644 drivers/mfd/sky81452.c
 create mode 100644 include/linux/mfd/sky81452.h

diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index de5abf2..6962b4e 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -626,6 +626,18 @@ config MFD_SM501_GPIO
 lines on the SM501. The platform data is used to supply the
 base number for the first GPIO line to register.
 
+config MFD_SKY81452
+   tristate "Skyworks Solutions SKY81452"
+   select MFD_CORE
+   select REGMAP_I2C
+   depends on I2C
+   help
+ This is the core driver for the Skyworks SKY81452 backlight and
+ voltage regulator device.
+
+ This driver can also be built as a module.  If so, the module
+ will be called sky81452.
+
 config MFD_SMSC
bool "SMSC ECE1099 series chips"
depends on I2C=y
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index f001487..6c2f317 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -169,6 +169,7 @@ obj-$(CONFIG_MFD_AS3711)+= as3711.o
 obj-$(CONFIG_MFD_AS3722)   += as3722.o
 obj-$(CONFIG_MFD_STW481X)  += stw481x.o
 obj-$(CONFIG_MFD_IPAQ_MICRO)   += ipaq-micro.o
+obj-$(CONFIG_MFD_SKY81452) += sky81452.o
 
 intel-soc-pmic-objs:= intel_soc_pmic_core.o intel_soc_pmic_crc.o
 obj-$(CONFIG_INTEL_SOC_PMIC)   += intel-soc-pmic.o
diff --git a/drivers/mfd/sky81452.c b/drivers/mfd/sky81452.c
new file mode 100644
index 000..3455ac7
--- /dev/null
+++ b/drivers/mfd/sky81452.c
@@ -0,0 +1,111 @@
+/*
+ * sky81452.c  SKY81452 MFD driver
+ *
+ * Copyright 2014 Skyworks Solutions Inc.
+ * Author : Gyungoh Yoo 
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see .
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static const struct regmap_config sky81452_config = {
+   .reg_bits = 8,
+   .val_bits = 8,
+};
+
+static int sky81452_probe(struct i2c_client *client,
+   const struct i2c_device_id *id)
+{
+   struct device *dev = &client->dev;
+   const struct sky81452_platform_data *pdata = dev_get_platdata(dev);
+   struct mfd_cell cells[2];
+   struct regmap *regmap;
+   int ret;
+
+   if (!pdata) {
+   pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
+   if (!pdata)
+   return -ENOMEM;
+   }
+
+   regmap = devm_regmap_init_i2c(client, &sky81452_config);
+   if (IS_ERR(regmap)) {
+   dev_err(dev, "failed to initialize I2C: %d", PTR_ERR(regmap));
+   return PTR_ERR(regmap);
+   }
+
+   i2c_set_clientdata(client, regmap);
+
+   memset(cells, 0, sizeof(cells));
+   cells[0].name = "sky81452-bl";
+   cells[0].of_compatible = "skyworks,sky81452-backlight";
+   cells[0].platform_data = pdata->bl_pdata;
+   cells[0].pdata_size = sizeof(*pdata->bl_pdata);
+   cells[1].name = "sky81452-regulator";
+   cells[1].of_compatible = "skyworks,sky81452-regulator";
+   cells[1].platform_data = pdata->regulator_init_data;
+   cells[1].pdata_size = sizeof(*pdata->regulator_init_data);
+
+   ret = mfd_add_devices(dev, -1, cells, ARRAY_SIZE(cells), NULL, 0, NULL);
+   if (ret)
+   dev_err(dev, "failed to add child devices: %d", ret);
+
+   return ret;
+}
+
+static int sky81452_remove(struct i2c_client *client)
+{
+   mfd_remove_devices(&client->dev);
+   return 0;
+}
+
+static const struct i2c_device_id sky81452_ids[] = {
+   { "sky81452" },
+   { }
+};
+MODULE_DEVICE_TABLE(i2c, sky81452_ids);
+
+#ifdef CONFIG_OF
+static const struct of_device_id sky81452_of_match[] = {
+   { .compatible = "skyworks,sky81452", },
+   { }
+};
+MODULE_DEVICE_TABLE(of, sky81452_of_match);
+#endif
+
+static struct i2c_driver sky81452_driver = {
+   .driver = {
+   .name = "sky81452",
+   .of_match_table = 

[PATCH] perf tool: fix compilation for ARM

2014-09-04 Thread Stephane Eranian

This patch fixes ARM compile of the perf tool.
The debug.h header file was missing from a couple
of unwind related modules.

Signed-off-by: Stephane Eranian 
--

diff --git a/tools/perf/arch/arm/tests/dwarf-unwind.c b/tools/perf/arch/arm/tests/dwarf-unwind.c
index 9f870d2..62eff84 100644
--- a/tools/perf/arch/arm/tests/dwarf-unwind.c
+++ b/tools/perf/arch/arm/tests/dwarf-unwind.c
@@ -3,6 +3,7 @@
 #include "thread.h"
 #include "map.h"
 #include "event.h"
+#include "debug.h"
 #include "tests/tests.h"
 
 #define STACK_SIZE 8192
diff --git a/tools/perf/arch/arm/util/unwind-libunwind.c b/tools/perf/arch/arm/util/unwind-libunwind.c
index 729ed69..62c397e 100644
--- a/tools/perf/arch/arm/util/unwind-libunwind.c
+++ b/tools/perf/arch/arm/util/unwind-libunwind.c
@@ -3,6 +3,7 @@
 #include 
 #include "perf_regs.h"
 #include "../../util/unwind.h"
+#include "../../util/debug.h"
 
 int libunwind__arch_reg_id(int regnum)
 {


[PATCH 1/1] staging/lustre: Fixed checkpatch warning: Use #include instead of

2014-09-04 Thread Filipe Gonçalves
Signed-off-by: Filipe Gonçalves 
---
 drivers/staging/lustre/lustre/lmv/lproc_lmv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
index 310df44..a7814f1 100644
--- a/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
+++ b/drivers/staging/lustre/lustre/lmv/lproc_lmv.c
@@ -37,7 +37,7 @@
 #define DEBUG_SUBSYSTEM S_CLASS
 
 #include 
-#include 
+#include 
 #include "../include/lprocfs_status.h"
 #include "../include/obd_class.h"
 
-- 
1.9.1



Re: [PATCH] kernel: add support for gcc 5

2014-09-04 Thread Hannes Frederic Sowa
On Thu, 2014-09-04 at 23:37 -0400, Sasha Levin wrote:
> On 09/04/2014 07:47 PM, Joe Perches wrote:
> > On Fri, 2014-09-05 at 00:43 +0200, Hannes Frederic Sowa wrote:
> >> > Most statements are already depending on GCC_VERSION, maybe we can just
> >> > unify all gcc specific headers to one, still trying to keep the file
> >> > organized? ;)
> > Maybe something like:
> > 
> > gnu development of gcc will be more frequent and the use of
> > compiler-gcc.h likely will not be convenient anymore.
> > 
> > Integrate the individual compiler-gcc.h files into
> > compiler-gcc.h.
> 
> Please no. We have a similar file we maintain in our team that's supposed to
> do something very similar for kernel versions. It goes all the way back to
> 2.6.9 and it's a *horrible* mess.
> 
> This is how compiler-gcc.h will end up looking in a while.

Something along these lines? We could make '4' a macro that names the
latest available compiler-gccX.h file.

--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -103,7 +103,12 @@
 #define __gcc_header(x) #x
 #define _gcc_header(x) __gcc_header(linux/compiler-gcc##x.h)
 #define gcc_header(x) _gcc_header(x)
+
+#if __GNUC__ > 4
+#include gcc_header(4)
+#else
 #include gcc_header(__GNUC__)
+#endif
 
 #if !defined(__noclone)
 #define __noclone  /* not needed */
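
For anyone not familiar with the computed-include trick in the hunk above,
here is a stand-alone sketch of the same clamping logic. It only builds the
header name as a string instead of including it, so it compiles without any
kernel headers. DEMO_GNUC and LATEST_GCC_HEADER are made-up names for this
illustration (DEMO_GNUC stands in for __GNUC__); only the three-level macro
layout is taken from compiler-gcc.h.

```c
/* Same three-level layout as compiler-gcc.h: paste, then stringify. */
#define __demo_header(x) #x
#define _demo_header(x) __demo_header(linux/compiler-gcc##x.h)
#define demo_header(x) _demo_header(x)  /* expands x before the paste */

/* Hypothetical: newest compiler-gccX.h that exists in the tree. */
#define LATEST_GCC_HEADER 4

/* Hypothetical stand-in for __GNUC__, so the result is predictable. */
#ifndef DEMO_GNUC
#define DEMO_GNUC 5
#endif

/* Clamp: compilers newer than the latest header reuse the latest header. */
#if DEMO_GNUC > LATEST_GCC_HEADER
#define DEMO_HEADER_NAME demo_header(LATEST_GCC_HEADER)
#else
#define DEMO_HEADER_NAME demo_header(DEMO_GNUC)
#endif

/* Returns the header name the preprocessor selected. */
const char *selected_gcc_header(void)
{
	return DEMO_HEADER_NAME;
}
```

The extra demo_header() level matters: without it the ## operator would paste
the literal macro name rather than its value.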


I still think we should start chaining newer gcc header files to
deduplicate the content. What do you think?

Bye,
Hannes




Re: bit fields && data tearing

2014-09-04 Thread Paul E. McKenney
On Thu, Sep 04, 2014 at 10:47:24PM -0400, Peter Hurley wrote:
> Hi James,
> 
> On 09/04/2014 10:11 PM, James Bottomley wrote:
> > On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote:
> >> +And there are anti-guarantees:
> >> +
> >> + (*) These guarantees do not apply to bitfields, because compilers often
> >> + generate code to modify these using non-atomic read-modify-write
> >> + sequences.  Do not attempt to use bitfields to synchronize parallel
> >> + algorithms.
> >> +
> >> + (*) Even in cases where bitfields are protected by locks, all fields
> >> + in a given bitfield must be protected by one lock.  If two fields
> >> + in a given bitfield are protected by different locks, the compiler's
> >> + non-atomic read-modify-write sequences can cause an update to one
> >> + field to corrupt the value of an adjacent field.
> >> +
> >> + (*) These guarantees apply only to properly aligned and sized scalar
> >> + variables.  "Properly sized" currently means "int" and "long",
> >> + because some CPU families do not support loads and stores of
> >> + other sizes.  ("Some CPU families" is currently believed to
> >> + be only Alpha 21064.  If this is actually the case, a different
> >> + non-guarantee is likely to be formulated.)
> > 
> > This is a bit unclear.  Presumably you're talking about definiteness of
> > the outcome (as in what's seen after multiple stores to the same
> > variable).
> 
> No, the last conditions refers to adjacent byte stores from different
> cpu contexts (either interrupt or SMP).
> 
> > The guarantees are only for natural width on Parisc as well,
> > so you would get a mess if you did byte stores to adjacent memory
> > locations.
> 
> For a simple test like:
> 
> struct x {
>   long a;
>   char b;
>   char c;
>   char d;
>   char e;
> };
> 
> void store_bc(struct x *p) {
>   p->b = 1;
>   p->c = 2;
> }
> 
> on parisc, gcc generates separate byte stores
> 
> void store_bc(struct x *p) {
>0: 34 1c 00 02 ldi 1,ret0
>4: 0f 5c 12 08 stb ret0,4(r26)
>8: 34 1c 00 04 ldi 2,ret0
>c: e8 40 c0 00 bv r0(rp)
>   10: 0f 5c 12 0a stb ret0,5(r26)
> 
> which appears to confirm that on parisc adjacent byte data
> is safe from corruption by concurrent cpu updates; that is,
> 
> CPU 0| CPU 1
>  |
> p->b = 1 | p->c = 2
>  |
> 
> will result in p->b == 1 && p->c == 2 (assume both values
> were 0 before the call to store_bc()).

What Peter said.  I would ask for suggestions for better wording, but
I would much rather be able to say that single-byte reads and writes
are atomic and that aligned-short reads and writes are also atomic.

Thus far, it looks like we lose only very old Alpha systems, so unless
I hear otherwise, I will update my patch to outlaw these very old systems.
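
To spell out the difference between the two cases discussed above, here is a
small single-threaded sketch that replays the problematic interleaving by
hand. It is an illustration only, not kernel code: the helper names are
made up, and the two 4-bit fields are packed manually into one byte to stand
in for a compiler-generated bitfield read-modify-write sequence.

```c
#include <stdint.h>

/* Read-modify-write helpers for two 4-bit fields sharing one byte. */
static uint8_t set_low(uint8_t word, uint8_t v)
{
	return (uint8_t)((word & 0xF0u) | (v & 0x0Fu));
}

static uint8_t set_high(uint8_t word, uint8_t v)
{
	return (uint8_t)((word & 0x0Fu) | ((v & 0x0Fu) << 4));
}

/*
 * Replay a bad interleaving: both "CPUs" load the shared byte before
 * either stores, so the second store overwrites the first update.
 */
uint8_t interleaved_rmw(void)
{
	uint8_t shared = 0;
	uint8_t cpu0 = shared;     /* CPU 0 loads the containing byte */
	uint8_t cpu1 = shared;     /* CPU 1 loads the containing byte */

	cpu0 = set_low(cpu0, 1);   /* CPU 0 modifies its private copy */
	cpu1 = set_high(cpu1, 2);  /* CPU 1 modifies its private copy */

	shared = cpu0;             /* CPU 0 stores: low field is now 1 */
	shared = cpu1;             /* CPU 1 stores: low field is lost */
	return shared;
}

/* Adjacent bytes written with byte stores do not disturb each other. */
struct two_bytes {
	uint8_t b;
	uint8_t c;
};

struct two_bytes separate_byte_stores(void)
{
	struct two_bytes s = { 0, 0 };

	s.b = 1;  /* touches only b */
	s.c = 2;  /* touches only c */
	return s;
}
```

With single-byte stores such as the stb instructions in the parisc listing,
there is no shared load to go stale, which is why the second case is safe on
machines that have byte-sized store instructions.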

Thanx, Paul



Re: [PATCH] kernel: add support for gcc 5

2014-09-04 Thread Joe Perches
On Thu, 2014-09-04 at 23:37 -0400, Sasha Levin wrote:
> On 09/04/2014 07:47 PM, Joe Perches wrote:
> > On Fri, 2014-09-05 at 00:43 +0200, Hannes Frederic Sowa wrote:
> >> > Most statements are already depending on GCC_VERSION, maybe we can just
> >> > unify all gcc specific headers to one, still trying to keep the file
> >> > organized? ;)
> > Maybe something like:
> > 
> > gnu development of gcc will be more frequent and the use of
> > compiler-gcc.h likely will not be convenient anymore.
> > 
> > Integrate the individual compiler-gcc.h files into
> > compiler-gcc.h.
> 
> Please no. We have a similar file we maintain in our team that's supposed to
> do something very similar for kernel versions. It goes all the way back to
> 2.6.9 and it's a *horrible* mess.
> 
> This is how compiler-gcc.h will end up looking in a while.

As gcc options are generally upward compatible,
I don't think it will but there's no great way
to predict the future.

It wouldn't be as complicated as say backports.




Re: [PATCHv4 0/3] new APIs to allocate buffer-cache with user specific flag

2014-09-04 Thread Theodore Ts'o
Joonsoo,

Thanks for the update.  I've applied Gioh's patches to the ext4 tree,
but I'd appreciate a further clarification.  My understanding of the
problem you were trying to address is that with the current CMA
implementation, kswapd was getting activated too early, yes?

But it would still be a good idea to try to use non-moveable memory in
preference to CMA memory; even if the page migration can move
the contents of the page elsewhere, wouldn't it be better to avoid
needing to do the page migration in the first place?  Given that the
ext4 file systems are getting mounted very early in the boot process,
when there should be plenty of CMA and non-CMA eligible memory
available, why was CMA memory getting selected for the buffer cache
allocations when non-CMA memory was available?

In other words, even without Gioh's patch to force the use of non-CMA
eligible memory, wouldn't it be better if the memory allocator used
non-CMA memory preferentially if it were available?  This should be
orthogonal to whether or not kswapd gets activated, right?

Regards,

- Ted


Re: + mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch added to -mm tree

2014-09-04 Thread Randy Dunlap
On 09/04/14 17:44, Joonsoo Kim wrote:
> On Wed, Sep 03, 2014 at 12:40:52PM -0700, a...@linux-foundation.org wrote:
>>
>> The patch titled
>>  Subject: mm: fix kmemcheck.c build errors
>> has been added to the -mm tree.  Its filename is
>>  mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
>>
>> This patch should soon appear at
>> 
>> http://ozlabs.org/~akpm/mmots/broken-out/mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
>> and later at
>> 
>> http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
>>
>> Before you just go and hit "reply", please:
>>a) Consider who else should be cc'ed
>>b) Prefer to cc a suitable mailing list as well
>>c) Ideally: find the original patch on the mailing list and do a
>>   reply-to-all to that, adding suitable additional cc's
>>
>> *** Remember to use Documentation/SubmitChecklist when testing your code ***
>>
>> The -mm tree is included into linux-next and is updated
>> there every 3-4 working days
>>
>> --
>> From: Randy Dunlap 
>> Subject: mm: fix kmemcheck.c build errors
>>
>> Add header files to fix kmemcheck.c build errors:
>>
>> ../mm/kmemcheck.c:70:7: error: dereferencing pointer to incomplete type
>> ../mm/kmemcheck.c:83:15: error: dereferencing pointer to incomplete type
>> ../mm/kmemcheck.c:95:8: error: dereferencing pointer to incomplete type
>> ../mm/kmemcheck.c:95:21: error: dereferencing pointer to incomplete type
>>
>> ../mm/slab.h: In function 'cache_from_obj':
>> ../mm/slab.h:283:2: error: implicit declaration of function 
>> 'memcg_kmem_enabled' [-Werror=implicit-function-declaration]
>>
>> Signed-off-by: Randy Dunlap 
>> Cc: Joonsoo Kim 
>> Signed-off-by: Andrew Morton 
>> ---
>>
>>  mm/kmemcheck.c |1 +
>>  mm/slab.h  |2 ++
>>  2 files changed, 3 insertions(+)
>>
>> diff -puN 
>> mm/kmemcheck.c~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2
>>  mm/kmemcheck.c
>> --- 
>> a/mm/kmemcheck.c~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2
>> +++ a/mm/kmemcheck.c
>> @@ -2,6 +2,7 @@
>>  #include 
>>  #include 
>>  #include 
>> +#include "slab.h"
>>  #include 
>>  
>>  void kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int 
>> node)
>> diff -puN 
>> mm/slab.h~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2 
>> mm/slab.h
>> --- 
>> a/mm/slab.h~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2
>> +++ a/mm/slab.h
>> @@ -268,6 +268,8 @@ static inline void memcg_uncharge_slab(s
>>  }
>>  #endif
>>  
>> +#include 
>> +
>>  static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void 
>> *x)
>>  {
>>  struct kmem_cache *cachep;
> 
> Hello, Andrew.
> 
> Could you take another fix instead of this one?
> This one also causes a build failure if CONFIG_MEMCG_KMEM=y.
> Please see the following patch I sent.
> 
> https://lkml.org/lkml/2014/8/31/148
> 
> The only difference is the position of the memcontrol.h header.
> 
> Thanks.

Ack that.  Thanks.

-- 
~Randy
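
For context, "dereferencing pointer to incomplete type" is what gcc reports
when code accesses a member of a struct that has only been forward-declared
at that point, which is why adding the header that carries the definition
fixes the build. A minimal sketch, assuming a hypothetical struct demo_cache
standing in for struct kmem_cache (the names are illustrative, not kernel
code):

```c
#include <assert.h>

/* Forward declaration only: enough to pass pointers around, not to dereference. */
struct demo_cache;
int demo_cache_size(const struct demo_cache *c);

/*
 * Full definition. In the thread, this is the part that the added
 * internal header makes visible to kmemcheck.c.
 */
struct demo_cache {
	int object_size;
};

int demo_cache_size(const struct demo_cache *c)
{
	/* Compiles only because the definition above is in scope here. */
	return c->object_size;
}
```

Moving the body of demo_cache_size() above the struct definition would
reproduce the same class of error.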


[PATCH] openvswitch: Remove redundant comment in file net/openvswitch/vport.h

2014-09-04 Thread Lawrance Jing
From: lawrancejing 

In net/openvswitch/vport.h, lines 35 and 37 are the same comment;
this patch removes one of them to make the code cleaner.

Signed-off-by: lawrancejing 
---


--- net/openvswitch/vport.h.orig 2014-08-22 02:45:36.474929300 +0800
+++ net/openvswitch/vport.h 2014-08-22 02:46:15.041929282 +0800
@@ -34,7 +34,6 @@ struct vport_parms;

 /* The following definitions are for users of the vport subsytem: */

-/* The following definitions are for users of the vport subsytem: */
 struct vport_net {
  struct vport __rcu *gre_vport;
 };


Re: [PATCH] kernel: add support for gcc 5

2014-09-04 Thread Sasha Levin
On 09/04/2014 07:47 PM, Joe Perches wrote:
> On Fri, 2014-09-05 at 00:43 +0200, Hannes Frederic Sowa wrote:
>> > Most statements are already depending on GCC_VERSION, maybe we can just
>> > unify all gcc specific headers to one, still trying to keep the file
>> > organized? ;)
> Maybe something like:
> 
> gnu development of gcc will be more frequent and the use of
> compiler-gcc<n>.h likely will not be convenient anymore.
> 
> Integrate the individual compiler-gcc<n>.h files into
> compiler-gcc.h.

Please no. We have a similar file we maintain in our team that's supposed to
do something very similar for kernel versions. It goes all the way back to
2.6.9 and it's a *horrible* mess.

This is how compiler-gcc.h will end up looking in a while.


Thanks,
Sasha


Re: [PULL] seccomp update (3.18)

2014-09-04 Thread James Morris
On Wed, 3 Sep 2014, Kees Cook wrote:

> Hi,
> 
> Please pull these seccomp changes for 3.18.
> 
> Thanks!
> 
> -Kees
> 
> The following changes since commit c3ce6dfa48e3879206382cdfdc015bffc50dce30:
> 
>   KEYS: Set pr_fmt() in asymmetric key signature handling (2014-09-03 
> 11:08:45 +1000)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git 
> tags/seccomp-3.18
> 
> for you to fetch changes up to a0cfd75fdc46b56978ece383a7d6f6b04e9087ad:

Pulled.

-- 
James Morris




Re: [PATCH 1/1] add selftest for virtio-net v1.0

2014-09-04 Thread Jason Wang
On 09/05/2014 09:51 AM, Hengjinxiao wrote:
>   Selftest is an important part of network driver, this patch adds 
> selftest for
> virtio-net, including loopback test, negotiate test and reset test. Loopback 
> test checks whether virtio-net can send and receive packets normally. 
> Negotiate test
> executes feature negotiation between virtio-net driver in Guest OS and 
> virtio-net 
> device in Host OS. Reset test resets virtio-net.
>   Following last patch, this version has deleted some useless codes and 
> fixed bugs
> as you suggest.
>   Any corrections are welcome.
>
> Signed-off-by: Hengjinxiao 

Some of the lines were indented incorrectly, and some others exceeded 80
characters per line. Please see Documentation/SubmittingPatches for more
guidelines. More importantly, some of the comments on V1 were still not
addressed.

I suggest splitting this patch into a series:

- patch that introduces virtio core helpers
- patch that introduces new virtio-net helpers
- patch that implements a skeleton of the selftest
- patch that implements loopback test
- patch that implements feature negotiation test

Thanks
>
> ---
>  drivers/net/virtio_net.c| 241 
> ++--
>  drivers/virtio/virtio.c |  20 +++-
>  include/linux/virtio.h  |   2 +
>  include/uapi/linux/virtio_net.h |   9 ++
>  4 files changed, 256 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 59caa06..22d8228 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -28,6 +28,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Why is pci.h needed?
>  
>  static int napi_weight = NAPI_POLL_WEIGHT;
>  module_param(napi_weight, int, 0444);
> @@ -51,6 +52,17 @@ module_param(gso, bool, 0444);
>  #define MERGEABLE_BUFFER_ALIGN max(L1_CACHE_BYTES, 256)
>  
>  #define VIRTNET_DRIVER_VERSION "1.0.0"
> +#define __VIRTNET_TESTING  0
> +
> +static const struct {
> + const char string[ETH_GSTRING_LEN];
> +} virtnet_gstrings_test[] = {
> + { "loopback test   (offline)" },
> + { "negotiate test  (offline)" },
> + { "reset test (offline)" },
> +};
> +
> +#define VIRTNET_NUM_TEST ARRAY_SIZE(virtnet_gstrings_test)
>  
>  struct virtnet_stats {
>   struct u64_stats_sync tx_syncp;
> @@ -104,6 +116,8 @@ struct virtnet_info {
>   struct send_queue *sq;
>   struct receive_queue *rq;
>   unsigned int status;
> + unsigned long flags;
> + atomic_t lb_count;
>  
>   /* Max # of queue pairs supported by the device */
>   u16 max_queue_pairs;
> @@ -436,6 +450,19 @@ err_buf:
>   return NULL;
>  }
>  
> +void virtnet_check_lb_frame(struct virtnet_info *vi,
> +struct sk_buff *skb)
> +{
> + unsigned int frame_size = skb->len;
> +
> + if (*(skb->data + 3) == 0xFF) {
> + if ((*(skb->data + frame_size / 2 + 10) == 0xBE) &&
> +(*(skb->data + frame_size / 2 + 12) == 0xAF)) {
> + atomic_dec(&vi->lb_count);
> + }
> + }
> +}
> +
>  static void receive_buf(struct receive_queue *rq, void *buf, unsigned int 
> len)
>  {
>   struct virtnet_info *vi = rq->vq->vdev->priv;
> @@ -485,7 +512,12 @@ static void receive_buf(struct receive_queue *rq, void 
> *buf, unsigned int len)
>   } else if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID) {
>   skb->ip_summed = CHECKSUM_UNNECESSARY;
>   }
> -
> + /* loopback self test for ethtool */
> + if (test_bit(__VIRTNET_TESTING, &vi->flags)) {
> + virtnet_check_lb_frame(vi, skb);
> + dev_kfree_skb_any(skb);
> + return;
> + }

Again, we really need a selftest-specific handler; please don't put
anything in the ordinary fast path.
>   skb->protocol = eth_type_trans(skb, dev);
>   pr_debug("Receiving skb proto 0x%04x len %i type %i\n",
>ntohs(skb->protocol), skb->len, skb->pkt_type);
> @@ -813,6 +845,9 @@ static int virtnet_open(struct net_device *dev)
>  {
>   struct virtnet_info *vi = netdev_priv(dev);
>   int i;
> + /* disallow open during test */
> + if (test_bit(__VIRTNET_TESTING, &vi->flags))
> + return -EBUSY;
>  
>   for (i = 0; i < vi->max_queue_pairs; i++) {
>   if (i < vi->curr_queue_pairs)
> @@ -1363,12 +1398,166 @@ static void virtnet_get_channels(struct net_device 
> *dev,
>   channels->other_count = 0;
>  }
>  
> +static int virtnet_reset(struct virtnet_info *vi, u64 *data);
> +
> +static void virtnet_create_lb_frame(struct sk_buff *skb,
> + unsigned int frame_size)
> +{
> + memset(skb->data, 0xFF, frame_size);
> + frame_size &= ~1;
> + memset(&skb->data[frame_size / 2], 0xAA, frame_size / 2 - 1);
> + memset(&skb->data[frame_size / 2 + 10], 0xBE, 1);
> + memset(&skb->data[frame_size / 2 + 12], 0xAF, 1);
> +}
> +
> +static int virtnet_start_loopback(struct virtnet_info *vi)
> +{
> + if 
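
For readers following the loopback test, the frame pattern built by
virtnet_create_lb_frame() and checked by virtnet_check_lb_frame() in the
quoted diff can be exercised in user space. A minimal sketch on a plain
byte buffer; the demo_* names, the return values and the size guard are
assumptions added for illustration and are not part of the patch:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Conservative lower bound so that offset (size / 2 + 12) stays in bounds. */
#define DEMO_MIN_FRAME 26

/* Fill the buffer with the pattern from the quoted patch. Returns 0 on success. */
static int demo_create_lb_frame(unsigned char *data, size_t frame_size)
{
	if (frame_size < DEMO_MIN_FRAME)
		return -1;

	memset(data, 0xFF, frame_size);
	frame_size &= ~(size_t)1;	/* round down to an even size */
	memset(&data[frame_size / 2], 0xAA, frame_size / 2 - 1);
	data[frame_size / 2 + 10] = 0xBE;
	data[frame_size / 2 + 12] = 0xAF;
	return 0;
}

/* Returns 1 if the three marker bytes are where the creator put them. */
static int demo_check_lb_frame(const unsigned char *data, size_t frame_size)
{
	if (frame_size < DEMO_MIN_FRAME)
		return 0;

	return data[3] == 0xFF &&
	       data[frame_size / 2 + 10] == 0xBE &&
	       data[frame_size / 2 + 12] == 0xAF;
}
```

Because the check only looks at three bytes, it detects a missing or
badly damaged frame but not arbitrary corruption elsewhere in the payload.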

Re: [PATCH v4 2/6] perf/x86: add support for sampling PEBS machine state registers

2014-09-04 Thread Stephane Eranian
On Fri, Sep 5, 2014 at 4:16 AM, Chuck Ebbert  wrote:
> On Wed,  3 Sep 2014 16:59:07 +0200
> Stephane Eranian  wrote:
>
>> PEBS can capture machine state regs at retirement of the sampled
>> instructions. When precise sampling is enabled on an event, PEBS
>> is used, so substitute the interrupted state with the PEBS state.
>> Note that not all registers are captured by PEBS. Those missing
>> are replaced by the interrupt state counterparts.
>>
>> Signed-off-by: Stephane Eranian 
>> ---
>>  arch/x86/kernel/cpu/perf_event_intel_ds.c |   17 +
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> index 9dc4199..139a8a5 100644
>> --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
>> @@ -886,6 +886,23 @@ static void __intel_pmu_pebs_event(struct perf_event *event,
>>   regs.bp = pebs->bp;
>>   regs.sp = pebs->sp;
>>
>> + if (sample_type & PERF_SAMPLE_REGS_INTR) {
>> + regs.ax = pebs->ax;
>> + regs.bx = pebs->bx;
>> + regs.cx = pebs->cx;
>> + regs.si = pebs->si;
>> + regs.di = pebs->di;
>> +
>> + regs.r8 = pebs->r8;
>> + regs.r9 = pebs->r9;
>> + regs.r10 = pebs->r10;
>> + regs.r11 = pebs->r11;
>> + regs.r12 = pebs->r12;
>> + regs.r13 = pebs->r13;
>> + regs.r14 = pebs->r14;
>> + regs.r14 = pebs->r15;
>  ^^^
>  r15 ???
>
Oops, good catch. I will fix in v5.
Thanks.

>> + }
>> +
>>   if (event->attr.precise_ip > 1 && x86_pmu.intel_cap.pebs_format >= 2) {
>>   regs.ip = pebs->real_ip;
>>   regs.flags |= PERF_EFLAGS_EXACT;
>
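
The slip Chuck spotted (r15 copied into r14) is the kind of mistake a
distinct-value check catches. A minimal user-space sketch, assuming a
hypothetical struct demo_regs that stands in for both struct pt_regs and
the PEBS record; only four of the registers are shown:

```c
#include <assert.h>

/* Hypothetical stand-in for the register fields; not the kernel's struct pt_regs. */
struct demo_regs {
	unsigned long r12, r13, r14, r15;
};

/* Copy the sampled registers field by field, as the hunk does for regs and pebs. */
static void demo_copy_regs(struct demo_regs *dst, const struct demo_regs *src)
{
	dst->r12 = src->r12;
	dst->r13 = src->r13;
	dst->r14 = src->r14;
	dst->r15 = src->r15;	/* v4 wrote this value into r14 instead */
}

/* Returns 1 when every field matches, 0 otherwise. */
static int demo_regs_equal(const struct demo_regs *a, const struct demo_regs *b)
{
	return a->r12 == b->r12 && a->r13 == b->r13 &&
	       a->r14 == b->r14 && a->r15 == b->r15;
}
```

Filling the source with four different values and comparing after the copy
would flag the v4 typo, because r14 would hold the r15 value and r15 would
keep whatever it held before.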


Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Stephane Eranian
On Fri, Sep 5, 2014 at 2:23 AM, Namhyung Kim  wrote:
> Hi Stephane,
>
> On Thu, 4 Sep 2014 16:37:51 +0200, Stephane Eranian wrote:
>> On Tue, Aug 26, 2014 at 8:38 AM, Namhyung Kim  wrote:
>>>
>>> There's a problem on finding correct kernel symbols when perf report
>>> runs on a different kernel.  Although a part of the problem was solved
>>> by the prior commit 0a7e6d1b6844 ("perf tools: Check recorded kernel
>>> version when finding vmlinux"), there's a remaining problem still.
>>>
>>> When perf records samples, it synthesizes the kernel map using
>>> machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
>>> You can easily see it using 'perf report -D' command.
>>>
>>> After finishing record, it goes through the recorded events to find
>>> maps/dsos actually used.  And then record build-id info of them.
>>>
>>> During this process, it needs to load symbols in a dso and it'd call
>>> dso__load_vmlinux() since the default value of
>>> symbol_conf.try_vmlinux_path is true.  However it changes
>>> dso->long_name to a real path of the vmlinux file (e.g.
>>> /lib/modules/3.16.0-rc2+/build/vmlinux) if one is running on a custom
>>> kernel.
>>>
>>> It resulted in that perf report reads the build-id of the vmlinux, but
>>> cannot use it since it only knows about the [kernel.kallsyms] map.  It
>>> then falls back to possible vmlinux paths by using the recorded kernel
>>> version (in case of a recent version) or a running kernel silently
>>> (which might break the result).  I think it's worth going to the
>>> stable tree.
>>>
>>> I can think of a couple of ways to fix it.  In this patch, I changed
>>> to use the name of "[kernel.kallsyms]" for the kernel build-id event
>>> instead of not trying vmlinux paths.  This way we can provide maximum
>>> info (like annotation) with minimum change IMHO.
>>>
>>> Before:
>>>
>>>   $ perf record -a usleep 1
>>>
>>>   $ perf buildid-list
>>>   00d5ff078efe1d30b8492854f259215fd877ce30 /lib/modules/3.16.0-rc2+/build/vmlinux
>>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>>
>>>   $ perf buildid-list -H
>>>    [kernel.kallsyms]
>>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>>    /tmp/perf-2523.map
>>>
>> There is something I don't understand in your example above.  The -H
>> option shows only DSO with samples. So why do you get the buildid
>> without -H and you get no buildid with -H? In other words, I don't
>> connect the dots between what -H does on the buildid change for the
>> kernel. Looks like you have the buildid in the perf.data file.
>
> Without -H, it just prints all DSOs found in build-id table (rebuilt
> during read perf data file header) and skips processing events.  But
> with -H, it'd process the event records and so set kernel map to
> '[kernel.kallsyms]' - since the kernel mmap event always has the name -
> and mark it as hit.  Thus the actual vmlinux can't be marked and then
> cannot be printed.
>
Still don't follow this. You're saying because as part of processing the events,
you create or replace the mmap record corresponding to the kernel from the
synthesized mmap (actual kernel filename) to the generic kernel.kallsyms, you
lose the buildid. Why not just transfer it? It has to be the one
listed without -H.
This would certainly be much less confusing (to me at least)! Seems to me
you have one piece of information or the other (buildid or filename) but
never both.



> Hmm.. now I'm curious that why the -H option is needed at all.. the perf
> record already wrote build-ids that are actually hits..
>
> Thanks,
> Namhyung


Re: [PATCH v2 2/4] Input: misc: Add haptic driver on max77693

2014-09-04 Thread Jaewon Kim

Hi Dmitry Torokhov,

Thanks for reviewing my patches.

On 2014-09-05 01:59, Dmitry Torokhov wrote:

Hi Jaewon,

On Fri, Sep 05, 2014 at 12:01:29AM +0900, Jaewon Kim wrote:

This patch adds the max77693-haptic device driver to support the haptic controller
on MAX77693. The MAX77693 is a multifunction device with PMIC, CHARGER, LED,
MUIC and HAPTIC blocks, and this patch is the haptic device driver for it. The driver
supports external PWM and LRA (Linear Resonant Actuator) motors. Users can control
the haptic driver through the force feedback framework.

Signed-off-by: Jaewon Kim 
Acked-by: Chanwoo Choi 
---
  drivers/input/misc/Kconfig   |   12 ++
  drivers/input/misc/Makefile  |1 +
  drivers/input/misc/max77693-haptic.c |  333 ++
  include/linux/mfd/max77693-private.h |9 +
  4 files changed, 355 insertions(+)
  create mode 100644 drivers/input/misc/max77693-haptic.c

diff --git a/drivers/input/misc/Kconfig b/drivers/input/misc/Kconfig
index 2ff4425..c597c52 100644
--- a/drivers/input/misc/Kconfig
+++ b/drivers/input/misc/Kconfig
@@ -144,6 +144,18 @@ config INPUT_M68K_BEEP
tristate "M68k Beeper support"
depends on M68K
  
+config INPUT_MAX77693_HAPTIC

+   tristate "MAXIM MAX77693 haptic controller support"
+   depends on MFD_MAX77693 && PWM
+   select INPUT_FF_MEMLESS
+   help
+ This option enables device driver support for the haptic controller
+ on MAXIM MAX77693 chip. This driver supports ff-memless interface
+ from input framework.
+
+ To compile this driver as module, choose M here: the
+ module will be called max77693-haptic.
+
  config INPUT_MAX8925_ONKEY
tristate "MAX8925 ONKEY support"
depends on MFD_MAX8925
diff --git a/drivers/input/misc/Makefile b/drivers/input/misc/Makefile
index 4955ad3..b28570c 100644
--- a/drivers/input/misc/Makefile
+++ b/drivers/input/misc/Makefile
@@ -35,6 +35,7 @@ obj-$(CONFIG_INPUT_IXP4XX_BEEPER) += ixp4xx-beeper.o
  obj-$(CONFIG_INPUT_KEYSPAN_REMOTE)+= keyspan_remote.o
  obj-$(CONFIG_INPUT_KXTJ9) += kxtj9.o
  obj-$(CONFIG_INPUT_M68K_BEEP) += m68kspkr.o
+obj-$(CONFIG_INPUT_MAX77693_HAPTIC)+= max77693-haptic.o
  obj-$(CONFIG_INPUT_MAX8925_ONKEY) += max8925_onkey.o
  obj-$(CONFIG_INPUT_MAX8997_HAPTIC)+= max8997_haptic.o
  obj-$(CONFIG_INPUT_MC13783_PWRBUTTON) += mc13783-pwrbutton.o
diff --git a/drivers/input/misc/max77693-haptic.c 
b/drivers/input/misc/max77693-haptic.c
new file mode 100644
index 000..2a69496
--- /dev/null
+++ b/drivers/input/misc/max77693-haptic.c
@@ -0,0 +1,333 @@
+/*
+ * max77693-haptic.c - MAXIM MAX77693 Haptic device driver
+ *
+ * Copyright (C) 2014 Samsung Electronics
+ * Jaewon Kim 
+ *
+ * This program is not provided / owned by Maxim Integrated Products.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MAX_MAGNITUDE_SHIFT 16
+
+enum max77693_haptic_motor_type {
+   MAX77693_HAPTIC_ERM = 0,
+   MAX77693_HAPTIC_LRA,
+};
+
+enum max77693_haptic_pulse_mode {
+   MAX77693_HAPTIC_EXTERNAL_MODE = 0,
+   MAX77693_HAPTIC_INTERNAL_MODE,
+};
+
+enum max77693_haptic_pwm_divisor {
+   MAX77693_HAPTIC_PWM_DIVISOR_32 = 0,
+   MAX77693_HAPTIC_PWM_DIVISOR_64,
+   MAX77693_HAPTIC_PWM_DIVISOR_128,
+   MAX77693_HAPTIC_PWM_DIVISOR_256,
+};
+
+struct max77693_haptic {
+   struct regmap *regmap_pmic;
+   struct regmap *regmap_haptic;
+   struct device *dev;
+   struct input_dev *input_dev;
+   struct pwm_device *pwm_dev;
+   struct regulator *motor_reg;
+
+   bool enabled;
+   unsigned int magnitude;
+   enum max77693_haptic_motor_type type;
+   enum max77693_haptic_pulse_mode mode;
+   enum max77693_haptic_pwm_divisor pwm_divisor;
+
+   struct mutex mutex;
+   struct work_struct work;
+};
+
+static int max77693_haptic_set_duty_cycle(struct max77693_haptic *haptic,
+   unsigned int pwm_duty)
+{
+   int ret;
+   int delta = (haptic->pwm_dev->period + pwm_duty)/2;
+
+   ret = pwm_config(haptic->pwm_dev, delta, haptic->pwm_dev->period);
+   if (ret) {
+   dev_err(haptic->dev, "cannot configure pwm\n");
+   return ret;
+   }
+
+   return 0;
+}
+
+static int max77693_haptic_configure(struct max77693_haptic *haptic,
+   unsigned int enable)
+{
+   int ret;
+   unsigned int value = 0;
+
+   value = ((haptic->type << MAX77693_CONFIG2_MODE) |
+   (enable << MAX77693_CONFIG2_MEN) |
+   

Re: bit fields && data tearing

2014-09-04 Thread Peter Hurley
Hi James,

On 09/04/2014 10:11 PM, James Bottomley wrote:
> On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote:
>> +And there are anti-guarantees:
>> +
>> + (*) These guarantees do not apply to bitfields, because compilers often
>> + generate code to modify these using non-atomic read-modify-write
>> + sequences.  Do not attempt to use bitfields to synchronize parallel
>> + algorithms.
>> +
>> + (*) Even in cases where bitfields are protected by locks, all fields
>> + in a given bitfield must be protected by one lock.  If two fields
>> + in a given bitfield are protected by different locks, the compiler's
>> + non-atomic read-modify-write sequences can cause an update to one
>> + field to corrupt the value of an adjacent field.
>> +
>> + (*) These guarantees apply only to properly aligned and sized scalar
>> + variables.  "Properly sized" currently means "int" and "long",
>> + because some CPU families do not support loads and stores of
>> + other sizes.  ("Some CPU families" is currently believed to
>> + be only Alpha 21064.  If this is actually the case, a different
>> + non-guarantee is likely to be formulated.)
> 
> This is a bit unclear.  Presumably you're talking about definiteness of
> the outcome (as in what's seen after multiple stores to the same
> variable).

No, the last condition refers to adjacent byte stores from different
cpu contexts (either interrupt or SMP).

> The guarantees are only for natural width on Parisc as well,
> so you would get a mess if you did byte stores to adjacent memory
> locations.

For a simple test like:

struct x {
	long a;
	char b;
	char c;
	char d;
	char e;
};

void store_bc(struct x *p) {
	p->b = 1;
	p->c = 2;
}

on parisc, gcc generates separate byte stores

void store_bc(struct x *p) {
   0:   34 1c 00 02 ldi 1,ret0
   4:   0f 5c 12 08 stb ret0,4(r26)
   8:   34 1c 00 04 ldi 2,ret0
   c:   e8 40 c0 00 bv r0(rp)
  10:   0f 5c 12 0a stb ret0,5(r26)

which appears to confirm that on parisc adjacent byte data
is safe from corruption by concurrent cpu updates; that is,

CPU 0    | CPU 1
         |
p->b = 1 | p->c = 2
         |

will result in p->b == 1 && p->c == 2 (assume both values
were 0 before the call to store_bc()).
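
To make the contrast concrete, here is a deterministic single-threaded
simulation of the two cases. It is only a sketch: the "CPUs" are ordinary
local copies, struct demo_word and the function names are hypothetical,
and the worst-case interleaving is hard-coded rather than observed on real
hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: four bytes sharing one 32-bit word, like b..e in struct x. */
struct demo_word {
	uint8_t b, c, d, e;
};

/* Byte-granular stores (stb on parisc): each CPU touches only its own byte. */
static struct demo_word demo_store_bytewise(struct demo_word w)
{
	w.b = 1;	/* CPU 0 */
	w.c = 2;	/* CPU 1 */
	return w;
}

/*
 * Word-granular read-modify-write, interleaved in the worst order:
 * both CPUs load the old word before either one stores its copy back.
 */
static struct demo_word demo_store_wordwise_rmw(struct demo_word w)
{
	struct demo_word cpu0 = w;	/* CPU 0 loads the whole word */
	struct demo_word cpu1 = w;	/* CPU 1 loads the whole word */

	cpu0.b = 1;
	cpu1.c = 2;

	w = cpu0;	/* CPU 0 stores the whole word */
	w = cpu1;	/* CPU 1 stores on top, carrying a stale b */
	return w;
}
```

Starting from all zeroes, the bytewise version ends with b == 1 and
c == 2, while the word-wide version loses the update to b. That lost
update is the corruption the anti-guarantees warn about for bitfields and
for machines without byte stores.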

Regards,
Peter Hurley




Re: [PATCHv4 2/3] ext4: use non-movable memory for the ext4 superblock

2014-09-04 Thread Theodore Ts'o
On Thu, Sep 04, 2014 at 04:29:40PM +0900, Gioh Kim wrote:
> A buffer-cache for superblock is disturbing page migration,
> because the buffer-cache is not released until unmount.
> The buffer-cache must be allocated from non-movable area.
> 
> Signed-off-by: Gioh Kim 
> Reviewed-by: Jan Kara 

Thanks, I've applied this to the ext4 git tree with a slightly
rewritten commit description (for clarity).

- Ted


Re: [PATCHv4 1/3] fs.c: support buffer cache allocations with gfp modifiers

2014-09-04 Thread Theodore Ts'o
On Thu, Sep 04, 2014 at 04:29:39PM +0900, Gioh Kim wrote:
> A buffer cache is allocated from movable area
> because it is referred for a while and released soon.
> But some filesystems are taking buffer cache for a long time
> and it can disturb page migration.
> 
> New APIs are introduced to allocate buffer cache
> with a user-specified flag.
> The *_gfp APIs are for users who want to set the page allocation flag
> for page cache allocation.
> The *_unmovable APIs are for users who want to allocate page cache from
> the non-movable area.
> 
> Signed-off-by: Gioh Kim 
> Reviewed-by: Jan Kara 

Thanks, I've applied this to the ext4 git tree with a slightly
rewritten commit description (for clarity).

- Ted


Re: [PATCH] IBM Akebono: Remove obsolete config select

2014-09-04 Thread Alistair Popple
On Fri, 5 Sep 2014 00:20:42 Paul Bolle wrote:



> > On Fri, 13 Jun 2014 13:56:32 Paul Bolle wrote:
> > > On Fri, 2014-05-02 at 18:06 +1000, Alistair Popple wrote:
> > > > The original implementation of MMC support for Akebono introduced a
> > > > new configuration symbol (MMC_SDHCI_OF_476GTR). This symbol has been
> > > > dropped in favour of using the generic platform driver however the
> > > > select for this symbol was mistakenly left in the platform
> > > > configuration.
> > > > 
> > > > This patch removes the obsolete symbol selection.
> > > > 
> > > > Signed-off-by: Alistair Popple 
> > > 
> > > This patch hasn't yet entered linux-next nor Linus' tree. Is it queued
> > > somewhere? If not, would a
> > > 
> > > Acked-by: Paul Bolle 
> > > 
> > > help to get this trivial patch queued for either of those trees?
> > > 
> > > 
> > > Paul Bolle
> > > 
> > > > ---
> > > > 
> > > >  arch/powerpc/platforms/44x/Kconfig | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > > 
> > > > diff --git a/arch/powerpc/platforms/44x/Kconfig b/arch/powerpc/platforms/44x/Kconfig
> > > > index 8beec7d..908bf11 100644
> > > > --- a/arch/powerpc/platforms/44x/Kconfig
> > > > +++ b/arch/powerpc/platforms/44x/Kconfig
> > > > @@ -220,7 +220,6 @@ config AKEBONO
> > > > 
> > > > select USB_EHCI_HCD_PLATFORM
> > > > select MMC_SDHCI
> > > > select MMC_SDHCI_PLTFM
> > > > 
> > > > -   select MMC_SDHCI_OF_476GTR
> > > > 
> > > > select ATA
> > > > select SATA_AHCI_PLATFORM
> > > > help
> 
> This trivial cleanup is still not in linux-next nor in Linus' tree.
> Could someone else please have a look at it?
> 
> Thanks,
> 
> 
> Paul Bolle

Ben,

Any chance you could merge this? It's in patchwork (see 
http://patchwork.ozlabs.org/patch/344894/).

Regards,

Alistair



Re: [PATCHv4 3/3] jbd/jbd2: use non-movable memory for the jbd superblock

2014-09-04 Thread Theodore Ts'o
On Thu, Sep 04, 2014 at 04:29:41PM +0900, Gioh Kim wrote:
> A long-lasting buffer-cache can disturb page migration so that
> it must be allocated from non-movable area.
> 
> The journal_init_inode is creating a buffer-cache for superblock journaling.
> The superblock exists until system shutdown so that the buffer-cache
> for the superblock would also exist for a long time
> and it can disturb page migration.
> 
> This patch makes the buffer-cache be allocated from non-movable area
> not to disturb page migration.
> 
> Signed-off-by: Gioh Kim 
> Reviewed-by: Jan Kara 

Thanks, I've applied this to the ext4 git tree with a slightly
rewritten commit description (for clarity).

- Ted


Re: [PATCH] moduleparam: Resolve missing-field-initializer warning

2014-09-04 Thread Rusty Russell
"Rustad, Mark D"  writes:
> On Aug 31, 2014, at 5:52 PM, Rusty Russell  wrote:
>
>> Jeff Kirsher  writes:
>>> From: Mark Rustad 
>>> 
>>> Resolve a missing-field-initializer warning, that is produced
>>> by every reference to module_param_call, by using designated
>>> initialization for the first field. That is enough to silence
>>> the complaint.
>>> 
>>> Signed-off-by: Mark Rustad 
>>> Signed-off-by: Jeff Kirsher 
>>> ---
>>> include/linux/moduleparam.h | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> Strange, I haven't seen this warning.  Compiler version?  And it's good
>> to quote the error message, so people can google it.
>
> The message is only seen when doing a W=2 build. I happened to be using gcc 
> 4.8.3, but I think most versions would produce the warning when it is 
> enabled. It can either be silenced by using even a single designated 
> initializer as I did here, or providing values for all of the fields. Because 
> of the number of references to the macro, this change silences many warnings 
> in W=2 builds.
>
> One instance of the full warning message looks like this:
>
> /home/share/git/nn-mdr/include/linux/moduleparam.h:198:16: warning: missing 
> initializer for field ‘free’ of ‘struct kernel_param_ops’ 
> [-Wmissing-field-initializers]
>   static struct kernel_param_ops __param_ops_##name =  \
> ^
> /home/share/git/nn-mdr/fs/fuse/inode.c:35:1: note: in expansion of macro 
> ‘module_param_call’
>  module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
>  ^
> /home/share/git/nn-mdr/include/linux/moduleparam.h:56:9: note: ‘free’ 
> declared here
>   void (*free)(void *arg);

OK, I pasted this into your commit message, and applied it.  See below.

Thanks!
Rusty.

From: Mark Rustad 
Subject: moduleparam: Resolve missing-field-initializer warning

Resolve a missing-field-initializer warning, that is produced
by every reference to module_param_call, by using designated
initialization for the first field. That is enough to silence
the complaint.

The message is only seen when doing a W=2 build. I happened to be using gcc
4.8.3, but I think most versions would produce the warning when it is
enabled. It can either be silenced by using even a single designated
initializer as I did here, or providing values for all of the fields. Because
of the number of references to the macro, this change silences many warnings
in W=2 builds.

One instance of the full warning message looks like this:

/home/share/git/nn-mdr/include/linux/moduleparam.h:198:16: warning: missing
initializer for field ‘free’ of ‘struct kernel_param_ops’
[-Wmissing-field-initializers]
  static struct kernel_param_ops __param_ops_##name =  \
  ^
/home/share/git/nn-mdr/fs/fuse/inode.c:35:1: note: in expansion of macro
‘module_param_call’
 module_param_call(max_user_bgreq, set_global_limit, param_get_uint,
 ^
/home/share/git/nn-mdr/include/linux/moduleparam.h:56:9: note: ‘free’
declared here
  void (*free)(void *arg);

Signed-off-by: Mark Rustad 
Signed-off-by: Jeff Kirsher 
Signed-off-by: Rusty Russell 
---
 include/linux/moduleparam.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/moduleparam.h b/include/linux/moduleparam.h
index 593501996574..b43f4752304e 100644
--- a/include/linux/moduleparam.h
+++ b/include/linux/moduleparam.h
@@ -224,7 +224,7 @@ struct kparam_array
 /* Obsolete - use module_param_cb() */
 #define module_param_call(name, set, get, arg, perm)   \
static struct kernel_param_ops __param_ops_##name = \
-   { 0, (void *)set, (void *)get };\
+   { .flags = 0, (void *)set, (void *)get };   \
__module_param_call(MODULE_PARAM_PREFIX,\
name, &__param_ops_##name, arg, \
(perm) + sizeof(__check_old_set_param(set))*0, -1, 
0)
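
The effect described in the commit message can be tried outside the
kernel. A minimal sketch, assuming a hypothetical struct demo_param_ops
with the same four-member shape (it is not the kernel's struct
kernel_param_ops, and the callback signatures are simplified), compiled
with gcc and -Wmissing-field-initializers:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kernel_param_ops. */
struct demo_param_ops {
	unsigned int flags;
	int (*set)(const char *val, void *arg);
	int (*get)(char *buf, void *arg);
	void (*free)(void *arg);
};

static int demo_set(const char *val, void *arg) { (void)val; (void)arg; return 0; }
static int demo_get(char *buf, void *arg) { (void)buf; (void)arg; return 0; }

/* Purely positional: 'free' is left out, which is what the warning points at. */
static struct demo_param_ops demo_ops_positional = { 0, demo_set, demo_get };

/*
 * One designator, then positional: the remaining values fill the members
 * that follow 'flags' in declaration order. Per the thread, this form
 * does not trigger the warning.
 */
static struct demo_param_ops demo_ops_designated = { .flags = 0, demo_set, demo_get };
```

Both objects end up with identical contents; members without an
initializer are zero-initialized either way, so the change only affects
the diagnostic, not the generated data.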


Re: [PATCH] mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set

2014-09-04 Thread Junxiao Bi
On 09/04/2014 05:23 PM, Dave Chinner wrote:
> On Wed, Sep 03, 2014 at 01:54:54PM +0800, Junxiao Bi wrote:
>> commit 21caf2fc1931 ("mm: teach mm by current context info to not do I/O 
>> during memory allocation")
>> introduces PF_MEMALLOC_NOIO flag to avoid doing I/O inside memory 
>> allocation, __GFP_IO is cleared
>> when this flag is set, but __GFP_FS implies __GFP_IO, it should also be 
>> cleared. Or it may still
>> run into I/O, like in superblock shrinker.
>>
>> Signed-off-by: Junxiao Bi 
>> Cc: joyce.xue 
>> Cc: Ming Lei 
>> ---
>>  include/linux/sched.h |6 --
>>  1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 5c2c885..2fb2c47 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -1936,11 +1936,13 @@ extern void thread_group_cputime_adjusted(struct 
>> task_struct *p, cputime_t *ut,
>>  #define tsk_used_math(p) ((p)->flags & PF_USED_MATH)
>>  #define used_math() tsk_used_math(current)
>>  
>> -/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags */
>> +/* __GFP_IO isn't allowed if PF_MEMALLOC_NOIO is set in current->flags
>> + * __GFP_FS is also cleared as it implies __GFP_IO.
>> + */
>>  static inline gfp_t memalloc_noio_flags(gfp_t flags)
>>  {
>>  if (unlikely(current->flags & PF_MEMALLOC_NOIO))
>> -flags &= ~__GFP_IO;
>> +flags &= ~(__GFP_IO | __GFP_FS);
>>  return flags;
>>  }
> 
> You also need to mask all the shrink_control->gfp_mask
> initialisations in mm/vmscan.c. The current code only masks the page
> reclaim gfp_mask, not those that are passed to the shrinkers.
Yes, some shrink_control->gfp_mask initialisations in vmscan.c are not
masked, in the functions listed below. Apart from these, everything in the
direct reclaim path appears to be masked by memalloc_noio_flags().

-reclaim_clean_pages_from_list()
used by alloc_contig_range(), which is invoked by hugetlb and cma. For
hugetlb it should be safe, as only userspace uses it. I am not sure about
cma.
David & Andrew, could you share your thoughts on whether cma is affected?

-mem_cgroup_shrink_node_zone()
-try_to_free_mem_cgroup_pages()
These two are used by the memory cgroup. As no kernel thread can be
assigned to such a cgroup, I think no mask is needed.

-balance_pgdat()
used by kswapd, no mask needed.

-shrink_all_memory()
used by hibernate, should be safe with GFP_FS/IO.
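
The masking under discussion can be sketched in plain C. This is only an illustration: the bit values and every demo_ name are assumptions, not the kernel's definitions. The second helper shows the point raised in the review, that a shrink_control-style gfp_mask has to go through the same filter as the page reclaim mask.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values only; the real ones live in the kernel headers. */
#define DEMO_GFP_WAIT          0x10u
#define DEMO_GFP_IO            0x40u
#define DEMO_GFP_FS            0x80u
#define DEMO_PF_MEMALLOC_NOIO  0x00080000u

/* Clear both IO and FS when the task has asked for no I/O in allocations. */
unsigned int demo_memalloc_noio_flags(unsigned int task_flags, unsigned int gfp)
{
	if (task_flags & DEMO_PF_MEMALLOC_NOIO)
		gfp &= ~(DEMO_GFP_IO | DEMO_GFP_FS);
	return gfp;
}

/* Hypothetical miniature of shrink_control. */
struct demo_shrink_control {
	unsigned int gfp_mask;
};

/* The mask handed to shrinkers is filtered the same way. */
void demo_init_shrink_control(struct demo_shrink_control *sc,
			      unsigned int task_flags, unsigned int gfp)
{
	assert(sc != NULL);
	sc->gfp_mask = demo_memalloc_noio_flags(task_flags, gfp);
}
```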

Thanks,
Junxiao.
> 
> Cheers,
> 
> Dave.
> 



[PATCH] i2c: rk3x: fix divisor calculation for SCL frequency

2014-09-04 Thread Addy Ke
The description of the I2C_CLKDIV register in the previous version of the
RK3x chip manual is incorrect: plus 1 is required.

The correct formula:
- T(SCL_HIGH) = T(PCLK) * (CLKDIVH + 1) * 8
- T(SCL_LOW) = T(PCLK) * (CLKDIVL + 1) * 8
- (SCL Divisor) = 8 * ((CLKDIVL + 1) + (CLKDIVH + 1))
- SCL = PCLK / (SCL Divisor)

It will be updated to the latest version of chip manual.
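
The formulas above can be checked with a small worked example. This is a sketch under stated assumptions: the demo_ helpers are hypothetical, the 24 MHz PCLK and 400 kHz target are made-up inputs, and the clamp to zero for very small ratios is an addition of this sketch, not something the patch does.

```c
#include <assert.h>

#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* SCL = PCLK / (8 * ((CLKDIVL + 1) + (CLKDIVH + 1))) */
unsigned long demo_scl_rate(unsigned long pclk, unsigned int divl,
			    unsigned int divh)
{
	unsigned long divisor = 8UL * ((divl + 1UL) + (divh + 1UL));

	return pclk / divisor;
}

/*
 * Combined CLKDIVL + CLKDIVH for a target rate, i.e. the patched
 * "DIV_ROUND_UP(i2c_rate, scl_rate * 8) - 2", clamped at zero so the
 * unsigned subtraction cannot wrap (assumption of this sketch).
 */
unsigned int demo_total_div(unsigned long pclk, unsigned long scl_rate)
{
	unsigned long units;

	assert(scl_rate > 0);
	units = DEMO_DIV_ROUND_UP(pclk, scl_rate * 8);
	return units > 2 ? (unsigned int)(units - 2) : 0;
}
```

With PCLK = 24000000 and a 400000 Hz target, the total comes out as 6. Splitting it as CLKDIVL = CLKDIVH = 3 gives 24000000 / (8 * 8) = 375000 Hz, which stays at or below the requested rate.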

Signed-off-by: Addy Ke 
---
 drivers/i2c/busses/i2c-rk3x.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/i2c/busses/i2c-rk3x.c b/drivers/i2c/busses/i2c-rk3x.c
index e637c32..76b6604 100644
--- a/drivers/i2c/busses/i2c-rk3x.c
+++ b/drivers/i2c/busses/i2c-rk3x.c
@@ -433,8 +433,8 @@ static void rk3x_i2c_set_scl_rate(struct rk3x_i2c *i2c, 
unsigned long scl_rate)
unsigned long i2c_rate = clk_get_rate(i2c->clk);
unsigned int div;
 
-   /* SCL rate = (clk rate) / (8 * DIV) */
-   div = DIV_ROUND_UP(i2c_rate, scl_rate * 8);
+   /* SCL rate = (clk rate) / (8 * (DIV + 2)) */
+   div = DIV_ROUND_UP(i2c_rate, scl_rate * 8) - 2;
 
/* The lower and upper half of the CLKDIV reg describe the length of
 * SCL low & high periods. */
-- 
1.8.3.2




Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Namhyung Kim
On Fri, Sep 5, 2014 at 10:42 AM, Namhyung Kim  wrote:
> On Fri, Sep 5, 2014 at 10:18 AM, Arnaldo Carvalho de Melo
>  wrote:
>> Em Fri, Sep 05, 2014 at 09:09:49AM +0900, Namhyung Kim escreveu:
>>> The perf report rebuilds machine states from the event records only.  In
>>> this case, the kernel map was recorded in the name of [kernel.kallsyms]
>>> so it couldn't find the build-id from the table.
>>
>> Ok, but then we can special case this one, no?
>>
>> Somehow mark in the buildid table that that entry is the one for the
>> kernel and hook it up to the synthesized event that has the
>> [kernel.kallsyms].ref_reloc_sym entry.
>
> Maybe we can search vmlinux in machine->kernel_dsos first when
> processing kernel mmap event.  And in this case do you want replace
> the name of mapping to a fullname of vmlinux?

Ah.. the name of mapping is not used at all after finding a matching
dso.  Ok then, I'll post a new patch to search vmlinux before
kallsyms.

Thanks
Namhyung


Re: [PATCH v4 2/6] perf/x86: add support for sampling PEBS machine state registers

2014-09-04 Thread Chuck Ebbert
On Wed,  3 Sep 2014 16:59:07 +0200
Stephane Eranian  wrote:

> PEBS can capture machine state regs at retiremnt of the sampled
> instructions. When precise sampling is enabled on an event, PEBS
> is used, so substitute the interrupted state with the PEBS state.
> Note that not all registers are captured by PEBS. Those missing
> are replaced by the interrupt state counter-parts.
> 
> Signed-off-by: Stephane Eranian 
> ---
>  arch/x86/kernel/cpu/perf_event_intel_ds.c |   17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> b/arch/x86/kernel/cpu/perf_event_intel_ds.c index 9dc4199..139a8a5
> 100644 --- a/arch/x86/kernel/cpu/perf_event_intel_ds.c
> +++ b/arch/x86/kernel/cpu/perf_event_intel_ds.c
> @@ -886,6 +886,23 @@ static void __intel_pmu_pebs_event(struct
> perf_event *event, regs.bp = pebs->bp;
>   regs.sp = pebs->sp;
>  
> + if (sample_type & PERF_SAMPLE_REGS_INTR) {
> + regs.ax = pebs->ax;
> + regs.bx = pebs->bx;
> + regs.cx = pebs->cx;
> + regs.si = pebs->si;
> + regs.di = pebs->di;
> +
> + regs.r8 = pebs->r8;
> + regs.r9 = pebs->r9;
> + regs.r10 = pebs->r10;
> + regs.r11 = pebs->r11;
> + regs.r12 = pebs->r12;
> + regs.r13 = pebs->r13;
> + regs.r14 = pebs->r14;
> + regs.r14 = pebs->r15;
 ^^^
 r15 ???

> + }
> +
>   if (event->attr.precise_ip > 1 &&
> x86_pmu.intel_cap.pebs_format >= 2) { regs.ip = pebs->real_ip;
>   regs.flags |= PERF_EFLAGS_EXACT;



Re: bit fields && data tearing

2014-09-04 Thread James Bottomley
On Thu, 2014-09-04 at 17:17 -0700, Paul E. McKenney wrote:
> +And there are anti-guarantees:
> +
> + (*) These guarantees do not apply to bitfields, because compilers often
> + generate code to modify these using non-atomic read-modify-write
> + sequences.  Do not attempt to use bitfields to synchronize parallel
> + algorithms.
> +
> + (*) Even in cases where bitfields are protected by locks, all fields
> + in a given bitfield must be protected by one lock.  If two fields
> + in a given bitfield are protected by different locks, the compiler's
> + non-atomic read-modify-write sequences can cause an update to one
> + field to corrupt the value of an adjacent field.
> +
> + (*) These guarantees apply only to properly aligned and sized scalar
> + variables.  "Properly sized" currently means "int" and "long",
> + because some CPU families do not support loads and stores of
> + other sizes.  ("Some CPU families" is currently believed to
> + be only Alpha 21064.  If this is actually the case, a different
> + non-guarantee is likely to be formulated.)

This is a bit unclear.  Presumably you're talking about definiteness of
the outcome (as in what's seen after multiple stores to the same
variable).  The guarantees are only for natural width on Parisc as well,
so you would get a mess if you did byte stores to adjacent memory
locations.  But multiple 32 bit stores guarantees to see one of the
stored values as the final outcome.
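
The bitfield hazard in the quoted text can be made concrete with a deterministic simulation. This is a sketch, not real concurrent code: the two "CPUs" are just two local copies of one shared word, and the 4-bit field layout and all demo_ names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Replace one 4-bit field inside a 32-bit word (read-modify-write). */
uint32_t demo_set_field(uint32_t word, unsigned int shift, uint32_t value)
{
	assert(shift <= 28);
	return (word & ~(UINT32_C(0xF) << shift)) | ((value & 0xF) << shift);
}

/*
 * Interleaving in which both CPUs load the word before either stores:
 * CPU B writes back a stale copy and CPU A's update to the neighbouring
 * field is lost, even though each CPU only "owns" its own field.
 */
uint32_t demo_racy_update(uint32_t shared)
{
	uint32_t a_copy = shared;	/* CPU A loads the word */
	uint32_t b_copy = shared;	/* CPU B loads before A stores */

	shared = demo_set_field(a_copy, 0, 0x5);	/* CPU A stores */
	shared = demo_set_field(b_copy, 4, 0xA);	/* CPU B stores stale data */
	return shared;
}
```

Starting from zero, the low field ends up as 0 rather than 0x5. A full-width 32-bit store from each CPU would instead leave one of the two stored values intact as the final outcome.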

James




Re: [PATCH] PM / sleep: add configurable delay for pm_test

2014-09-04 Thread Chirantan Ekbote
On Thu, Sep 4, 2014 at 10:54 AM, Brian Norris
 wrote:
> On Thu, Sep 04, 2014 at 09:14:12AM +0200, Pavel Machek wrote:
>> > When CONFIG_PM_DEBUG=y, we provide a sysfs file (/sys/power/pm_test) for
>> > selecting one of a few suspend test modes, where rather than entering a
>> > full suspend state, the kernel will perform some subset of suspend
>> > steps, wait 5 seconds, and then resume back to normal operation.
>> >
>> > This mode is useful for (among other things) observing the state of the
>> > system just before entering a sleep mode, for debugging or analysis
>> > purposes. However, a constant 5 second wait is not sufficient for some
>> > sorts of analysis; for example, on an SoC, one might want to use
>> > external tools to probe the power states of various on-chip controllers
>> > or clocks.
>>
>> When you are doing this kind of analysis, perhaps directly modifying
>> kernel source is the way to go ...?
>
> That's what I've been doing for now, but I have a few engineers who need
> to do this sort of testing and aren't kernel developers. I could
> continue to maintain my own patch for this, but I just thought I'd see
> what others thought.
>
> Is there a good reason this can't be in mainline? These features are
> hidden behind a Kconfig symbol called PM_DEBUG anyway, and I think this
> classifies as a pretty simple extension to the limited existing PM
> debugging options.
>

We're currently carrying a similar patch in the chrome os kernel tree.
I would also like to see this go into mainline.

-Chirantan

> Regards,
> Brian
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: bit fields && data tearing

2014-09-04 Thread H. Peter Anvin
On 09/04/2014 05:59 PM, Peter Hurley wrote:
> I have no idea how prevalent the ev56 is compared to the ev5.
> Still we're talking about a chip that came out in 1996.

Ah yes, I stand corrected.  According to Wikipedia, the affected CPUs
were all the 2106x CPUs (EV4, EV45, LCA4, LCA45) plus the 21164 with no
suffix (EV5).  However, we're still talking about museum pieces here.

I wonder what the one I have in my garage is... I'm sure I could emulate
it faster, though.

-hpa




RE: [PATCH 3/3] GPIO: gpio-dwapb: Suspend & Resume PM enabling

2014-09-04 Thread Chen, Alvin
> 
> > +#if defined CONFIG_PM_SLEEP
> 
> I wonder whether it's worth #ifdef:in out such things, it clutters the place.
OK. I will use '#ifdef'.
> 
> > +/* Store GPIO context across system-wide suspend/resume transitions
> > +*/ static struct gpio_saved_regs {
> 
> Call the struct:
> 
> struct dwapb_context
> 
> because that is easier to understand.
> 
OK.

> > +   unsigned long data;
> > +   unsigned long dir;
> > +   unsigned long int_en;
> > +   unsigned long int_mask;
> > +   unsigned long int_type;
> > +   unsigned long int_pol;
> > +   unsigned long int_deb;
> > +} saved_regs;
> 
> Singleton huh?
> 
> Insert this into the dynamically allocated per-port or chip struct instead.
> 
How about the following?

static struct dwapb_context {
u32 data[DWAPB_MAX_PORTS];
u32 dir[DWAPB_MAX_PORTS];
u32 ext[DWAPB_MAX_PORTS];
u32 int_en;
u32 int_mask;
u32 int_type;
u32 int_pol;
u32 int_deb;
} dwapb_context;

Only portA supports irq, so the irq-related registers exist only for portA.
Compared with allocating a context for each port dynamically, this is more
direct and easier to understand.
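
For comparison, the reviewer's suggestion of keeping the saved state in the per-chip structure could look roughly like the sketch below. All of it is hypothetical: the demo_ types are not the driver's, the "hw" member is an ordinary struct standing in for the MMIO registers, and only a subset of the registers is shown.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_PORTS 4

/* Register subset; per-port arrays plus the portA-only irq registers. */
struct demo_regs {
	uint32_t data[DEMO_MAX_PORTS];
	uint32_t dir[DEMO_MAX_PORTS];
	uint32_t int_en;
	uint32_t int_mask;
};

/* Per-chip state: the saved context is a member, not a file-scope singleton. */
struct demo_gpio {
	struct demo_regs hw;	/* stands in for the memory-mapped registers */
	struct demo_regs ctx;	/* copy taken at suspend time */
	unsigned int nr_ports;
};

void demo_suspend(struct demo_gpio *gpio)
{
	unsigned int i;

	assert(gpio->nr_ports <= DEMO_MAX_PORTS);
	for (i = 0; i < gpio->nr_ports; i++) {
		gpio->ctx.data[i] = gpio->hw.data[i];
		gpio->ctx.dir[i] = gpio->hw.dir[i];
	}
	gpio->ctx.int_en = gpio->hw.int_en;
	gpio->ctx.int_mask = gpio->hw.int_mask;
}

void demo_resume(struct demo_gpio *gpio)
{
	unsigned int i;

	assert(gpio->nr_ports <= DEMO_MAX_PORTS);
	for (i = 0; i < gpio->nr_ports; i++) {
		gpio->hw.data[i] = gpio->ctx.data[i];
		gpio->hw.dir[i] = gpio->ctx.dir[i];
	}
	gpio->hw.int_en = gpio->ctx.int_en;
	gpio->hw.int_mask = gpio->ctx.int_mask;
}
```

Because the context lives inside each chip instance, two controllers on one system would not overwrite each other's saved registers.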


> +   dwapb_write(gpio, GPIO_SWPORTA_DR, saved_regs.data);
> +   dwapb_write(gpio, GPIO_SWPORTA_DDR, saved_regs.dir);
> 
> And port B, C, D?
> 
> This looks like a crude hack.
I will add port B, C, D.
> 
> Yours,
> Linus Walleij


Re: [PATCH 1/3] blk-mq: Cleanup blk_mq_tag_busy() and blk_mq_tag_idle()

2014-09-04 Thread Chuck Ebbert
On Thu, 04 Sep 2014 19:30:18 -0600
Jens Axboe  wrote:

> On 09/04/2014 07:26 PM, Chuck Ebbert wrote:
> > On Wed, 03 Sep 2014 14:35:29 -0600
> > Jens Axboe  wrote:
> > 
> >> On 09/03/2014 02:33 PM, Alexander Gordeev wrote:
> > 
> >>> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> >>> index 6206ed1..795ec3f 100644
> >>> --- a/block/blk-mq-tag.h
> >>> +++ b/block/blk-mq-tag.h
> >>> @@ -66,23 +66,22 @@ enum {
> >>>   BLK_MQ_TAG_MAX  = BLK_MQ_TAG_FAIL - 1,
> >>>  };
> >>>  
> >>> -extern bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
> >>> +extern void __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
> >>>  extern void __blk_mq_tag_idle(struct blk_mq_hw_ctx *);
> >>>  
> >>>  static inline bool blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
> >>>  {
> >>> - if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
> >>> - return false;
> >>> -
> >>> - return __blk_mq_tag_busy(hctx);
> >>> + if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
> >>> + __blk_mq_tag_busy(hctx);
> >>> + return true;
> >>> + }
> >>> + return false;
> >>>  }
> >>
> >> The normal/fast path here is the flag NOT being set, which is why
> >> it was coded that way to put the fast path inline.
> >>
> >>>  
> >>>  static inline void blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
> >>>  {
> >>> - if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
> >>> - return;
> >>> -
> >>> - __blk_mq_tag_idle(hctx);
> >>> + if (hctx->flags & BLK_MQ_F_TAG_SHARED)
> >>> + __blk_mq_tag_idle(hctx);
> >>>  }
> >>
> >> Ditto
> > 
> > Shouldn't it just add unlikely() then? That way it's obvious what
> > the common case is, instead of relying on convoluted code.
> 
> It's a common construct. Besides, if you find a flag-not-set check
> convoluted, then I hope you are not programming anything I use.
> That's a bit of a straw man, imho.
> 

Sure, it's a common construct. But there's nothing there to prevent the
optimizer from rearranging things any way it pleases. Nor is there
anything keeping a human from doing the same. ;)
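
The unlikely() suggestion can be sketched outside the kernel as follows. Assumptions: demo_unlikely() is a local macro built on the GCC/Clang __builtin_expect builtin, and the flag value, struct and helper names are all made up for the example. The hint only informs code layout; it does not change what the function returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local equivalent of an unlikely() hint; requires GCC or Clang. */
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

#define DEMO_F_TAG_SHARED (1u << 1)	/* illustrative flag value */

struct demo_hw_ctx {
	unsigned int flags;
	unsigned int busy_calls;	/* counts trips through the slow path */
};

/* Out-of-line slow path, only reached for shared tags. */
static bool demo_slow_tag_busy(struct demo_hw_ctx *hctx)
{
	hctx->busy_calls++;
	return true;
}

/* The annotation documents that the shared case is the rare one. */
bool demo_tag_busy(struct demo_hw_ctx *hctx)
{
	assert(hctx != NULL);
	if (demo_unlikely(hctx->flags & DEMO_F_TAG_SHARED))
		return demo_slow_tag_busy(hctx);
	return false;
}
```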


Re: bit fields && data tearing

2014-09-04 Thread Peter Hurley
[ +cc linux-alpha ]

Hi Paul,

On 09/04/2014 08:17 PM, Paul E. McKenney wrote:
> On Thu, Sep 04, 2014 at 03:16:03PM -0700, H. Peter Anvin wrote:
>> On 09/04/2014 12:42 PM, Peter Hurley wrote:
>>>
>>> Or we could give up on the Alpha.
>>>
>>
>> If Alpha is turning into Voyager (kept alive only as a museum piece, but
>> actively causing problems) then please let's kill it.
> 
> Sorry for being slow to join this thread, but I propose the following
> patch.  If we can remove support for all CPUs that do not support
> direct access to bytes and shorts (which I would very much like to
> see happen), I will remove the last non-guarantee.
> 
>   Thanx, Paul

Although I don't mind the patch below, I don't think the bitfield thing
happened because anyone was confused about what the compiler would do;
here, it's more a case of legacy code that came out from under the
Big Kernel Lock and the bitfield was an oversight.

However, my patch to fix it by splitting the bitfield into 4 bytes
was rejected as insufficient to prevent accidental sharing. This is
what spun off the Alpha discussion about non-atomic byte updates.

FWIW, there are a bunch of problems with both the documentation and
kernel code if adjacent bytes can be overwritten by a single byte write.

Documentation/atomic-ops.txt claims that properly aligned chars are
atomic in the same sense that ints are, which is not true on the Alpha
(hopefully not a possible optimization on other arches -- I tried
with the ia64 cross compiler but it stuck with byte-sized writes).

Pretty much any large aggregate kernel structure is bound to have some
byte-size fields that are either lockless or serialized by different
locks, which may be corrupted by concurrent updates to adjacent data.
IOW, ACCESS_ONCE(), spinlocks, whatever, doesn't prevent adjacent
byte-sized data from being overwritten. I haven't bothered to count
how many global bools/chars there are and whether they might be
overwritten by adjacent updates.

Documentation/circular-buffers.txt and any lockless implementation based
on or similar to it for bytes or shorts will be corrupted if the head
nears the tail.

I'm sure there's other interesting outcomes that haven't come to light.

I think that 'naturally aligned scalar writes are atomic' should be the
minimum arch guarantee.

Regards,
Peter Hurley


> 
> 
> documentation: Record limitations of bitfields and small variables
> 
> This commit documents the fact that it is not safe to use bitfields
> as shared variables in synchronization algorithms.
> 
> Signed-off-by: Paul E. McKenney 
> 
> diff --git a/Documentation/memory-barriers.txt 
> b/Documentation/memory-barriers.txt
> index 87be0a8a78de..a28bfe4fd759 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -269,6 +269,26 @@ And there are a number of things that _must_ or 
> _must_not_ be assumed:
>   STORE *(A + 4) = Y; STORE *A = X;
>   STORE {*A, *(A + 4) } = {X, Y};
>  
> +And there are anti-guarantees:
> +
> + (*) These guarantees do not apply to bitfields, because compilers often
> + generate code to modify these using non-atomic read-modify-write
> + sequences.  Do not attempt to use bitfields to synchronize parallel
> + algorithms.
> +
> + (*) Even in cases where bitfields are protected by locks, all fields
> + in a given bitfield must be protected by one lock.  If two fields
> + in a given bitfield are protected by different locks, the compiler's
> + non-atomic read-modify-write sequences can cause an update to one
> + field to corrupt the value of an adjacent field.
> +
> + (*) These guarantees apply only to properly aligned and sized scalar
> + variables.  "Properly sized" currently means "int" and "long",
> + because some CPU families do not support loads and stores of
> + other sizes.  ("Some CPU families" is currently believed to
> + be only Alpha 21064.  If this is actually the case, a different
> + non-guarantee is likely to be formulated.)
> +
>  
>  =
>  WHAT ARE MEMORY BARRIERS?
 



Re: [PATCH v2.2 2/2] dt-bindings: Adding compatible attribute for SKY81452 regulator

2014-09-04 Thread Gyungoh Yoo
On Mon, Sep 01, 2014 at 11:31:58AM +0100, Mark Brown wrote:
> On Mon, Sep 01, 2014 at 11:40:18AM +0900, Gyungoh Yoo wrote:
> > Adding compatible attribute for SKY81452 regulator driver.
> 
> >  Required properties:
> > +- compatible   : Must be "skyworks,sky81452-regulator"
> 
> Why is this a good idea - can this driver be used for anything other
> than a sky81452?

Thank you for the answer.

Yes. It is possible that this driver will be used by devices similar to the
SKY81452.


[PATCH 1/1] add selftest for virtio-net v1.0

2014-09-04 Thread Hengjinxiao
Selftest is an important part of a network driver. This patch adds a selftest
for virtio-net, including a loopback test, a negotiate test and a reset test.
The loopback test checks whether virtio-net can send and receive packets
normally. The negotiate test executes feature negotiation between the
virtio-net driver in the guest OS and the virtio-net device in the host OS.
The reset test resets virtio-net.
Following the last patch, this version deletes some useless code and fixes
bugs as you suggested.
Any corrections are welcome.
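
The loopback frame pattern used by the patch below can be exercised on a plain byte buffer. This is a sketch under assumptions: the demo_ functions are hypothetical and operate on an unsigned char array instead of an sk_buff, and the size check is an addition of the sketch to keep the marker bytes inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill a buffer with the test pattern: 0xFF, then 0xAA, plus two markers. */
void demo_create_lb_frame(unsigned char *data, size_t frame_size)
{
	assert(frame_size > 24);	/* marker at half + 12 must fit */
	memset(data, 0xFF, frame_size);
	frame_size &= ~(size_t)1;
	memset(&data[frame_size / 2], 0xAA, frame_size / 2 - 1);
	data[frame_size / 2 + 10] = 0xBE;
	data[frame_size / 2 + 12] = 0xAF;
}

/* Return nonzero if the received buffer still carries the pattern. */
int demo_check_lb_frame(const unsigned char *data, size_t frame_size)
{
	assert(frame_size > 24);
	return data[3] == 0xFF &&
	       data[frame_size / 2 + 10] == 0xBE &&
	       data[frame_size / 2 + 12] == 0xAF;
}
```

In the patch the check decrements a counter of outstanding frames on a match; here the result is returned directly so it can be tested in isolation.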

Signed-off-by: Hengjinxiao 

---
 drivers/net/virtio_net.c| 241 ++--
 drivers/virtio/virtio.c |  20 +++-
 include/linux/virtio.h  |   2 +
 include/uapi/linux/virtio_net.h |   9 ++
 4 files changed, 256 insertions(+), 16 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 59caa06..22d8228 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static int napi_weight = NAPI_POLL_WEIGHT;
 module_param(napi_weight, int, 0444);
@@ -51,6 +52,17 @@ module_param(gso, bool, 0444);
 #define MERGEABLE_BUFFER_ALIGN max(L1_CACHE_BYTES, 256)
 
 #define VIRTNET_DRIVER_VERSION "1.0.0"
+#define __VIRTNET_TESTING  0
+
+static const struct {
+   const char string[ETH_GSTRING_LEN];
+} virtnet_gstrings_test[] = {
+   { "loopback test   (offline)" },
+   { "negotiate test  (offline)" },
+   { "reset test (offline)" },
+};
+
+#define VIRTNET_NUM_TEST   ARRAY_SIZE(virtnet_gstrings_test)
 
 struct virtnet_stats {
struct u64_stats_sync tx_syncp;
@@ -104,6 +116,8 @@ struct virtnet_info {
struct send_queue *sq;
struct receive_queue *rq;
unsigned int status;
+   unsigned long flags;
+   atomic_t lb_count;
 
/* Max # of queue pairs supported by the device */
u16 max_queue_pairs;
@@ -436,6 +450,19 @@ err_buf:
return NULL;
 }
 
+void virtnet_check_lb_frame(struct virtnet_info *vi,
+  struct sk_buff *skb)
+{
+   unsigned int frame_size = skb->len;
+
+   if (*(skb->data + 3) == 0xFF) {
+   if ((*(skb->data + frame_size / 2 + 10) == 0xBE) &&
+  (*(skb->data + frame_size / 2 + 12) == 0xAF)) {
+   atomic_dec(&vi->lb_count);
+   }
+   }
+}
+
 static void receive_buf(struct receive_queue *rq, void *buf, unsigned int len)
 {
struct virtnet_info *vi = rq->vq->vdev->priv;
@@ -485,7 +512,12 @@ static void receive_buf(struct receive_queue *rq, void 
*buf, unsigned int len)
} else if (hdr->hdr.flags & VIRTIO_NET_HDR_F_DATA_VALID) {
skb->ip_summed = CHECKSUM_UNNECESSARY;
}
-
+   /* loopback self test for ethtool */
+   if (test_bit(__VIRTNET_TESTING, &vi->flags)) {
+   virtnet_check_lb_frame(vi, skb);
+   dev_kfree_skb_any(skb);
+   return;
+   }
skb->protocol = eth_type_trans(skb, dev);
pr_debug("Receiving skb proto 0x%04x len %i type %i\n",
 ntohs(skb->protocol), skb->len, skb->pkt_type);
@@ -813,6 +845,9 @@ static int virtnet_open(struct net_device *dev)
 {
struct virtnet_info *vi = netdev_priv(dev);
int i;
+   /* disallow open during test */
+   if (test_bit(__VIRTNET_TESTING, &vi->flags))
+   return -EBUSY;
 
for (i = 0; i < vi->max_queue_pairs; i++) {
if (i < vi->curr_queue_pairs)
@@ -1363,12 +1398,166 @@ static void virtnet_get_channels(struct net_device 
*dev,
channels->other_count = 0;
 }
 
+static int virtnet_reset(struct virtnet_info *vi, u64 *data);
+
+static void virtnet_create_lb_frame(struct sk_buff *skb,
+   unsigned int frame_size)
+{
+   memset(skb->data, 0xFF, frame_size);
+   frame_size &= ~1;
+   memset(&skb->data[frame_size / 2], 0xAA, frame_size / 2 - 1);
+   memset(&skb->data[frame_size / 2 + 10], 0xBE, 1);
+   memset(&skb->data[frame_size / 2 + 12], 0xAF, 1);
+}
+
+static int virtnet_start_loopback(struct virtnet_info *vi)
+{
+   if (!virtnet_send_command(vi, VIRTIO_NET_CTRL_LOOPBACK,
+ VIRTIO_NET_CTRL_LOOPBACK_SET, NULL, NULL)) {
+   dev_warn(&vi->dev->dev, "Failed to set loopback.\n");
+   return -EINVAL;
+   }
+   for (i = 0; i < vi->curr_queue_pairs; i++)
+   napi_disable(&vi->rq[i].napi);
+   return 0;
+}
+
+static int virtnet_run_loopback_test(struct virtnet_info *vi)
+{
+   int i;
+   struct sk_buff *skb;
+   unsigned int size = GOOD_COPY_LEN;
+
+   for (i = 0; i < 100; i++) {
+   skb = netdev_alloc_skb(vi->dev, size);
+   if (!skb)
+   return -ENOMEM;
+
+   skb->queue_mapping = 0;
+   skb_put(skb, size);
+   virtnet_create_lb_frame(skb, size);
+   

Re: [PATCHv4 0/3] new APIs to allocate buffer-cache with user specific flag

2014-09-04 Thread Joonsoo Kim
On Thu, Sep 04, 2014 at 09:14:19PM -0400, Theodore Ts'o wrote:
> On Fri, Sep 05, 2014 at 09:37:05AM +0900, Gioh Kim wrote:
> > >But what were the problems which were observed in standard kernels and
> > >what effect did this patchset have upon them?  Some quantitative
> > >measurements will really help here.
> > 
> > The problem is that I cannot allocate entire CMA memory.
> > >
> > Actually the problem is not found without Joonsoo's patch:
> > https://lkml.org/lkml/2014/5/28/64.  Without it CMA memory is free
> > and every CMA-memory allocation is successed.
> > 
> > If the Joonsoo's patch is applied, the CMA memory is allocated
> > generally when system boots-up.
> 
> As I said earlier, I'm happy to carry this patch in the ext4 tree,
> because as it turns out I could use this facility for another purpose
> (to cause a few buffer cache allocations to happen with __GFP_NOFAIL).
> 
> I do have one question; I note that Joonsoo's patch dates back to May,
> and yet this has not hit the mainline kernel, and I haven't seen any
> discussions about this patch after May.  Has there been some pushback
> from the mm maintainers about Joonsoo's approach with respect to this
> patch?   What is the current status of that patch set?

Hello,

That patchset is postponed, but will be continued. The reason is that
other bugs turn up frequently when that patchset is applied. I will
fix these bugs first and then re-submit the patchset. The following is
an attempt to fix them.

https://lkml.org/lkml/2014/8/26/147

Thanks.


Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Namhyung Kim
On Fri, Sep 5, 2014 at 10:18 AM, Arnaldo Carvalho de Melo
 wrote:
> Em Fri, Sep 05, 2014 at 09:09:49AM +0900, Namhyung Kim escreveu:
>> The perf report rebuilds machine states from the event records only.  In
>> this case, the kernel map was recorded in the name of [kernel.kallsyms]
>> so it couldn't find the build-id from the table.
>
> Ok, but then we can special case this one, no?
>
> Somehow mark in the buildid table that that entry is the one for the
> kernel and hook it up to the synthesized event that has the
> [kernel.kallsyms].ref_reloc_sym entry.

Maybe we can search vmlinux in machine->kernel_dsos first when
processing kernel mmap event.  And in this case do you want replace
the name of mapping to a fullname of vmlinux?

Thanks,
Namhyung


Re: [PATCH v9 2/3] clk: RK808: Add clkout driver for RK808

2014-09-04 Thread Mike Turquette
Quoting Chris Zhong (2014-09-03 18:12:38)
> +static int rk808_clkout1_is_prepared(struct clk_hw *hw)
> +{
> +   return 1;
> +}
> +



> +static const struct clk_ops rk808_clkout1_ops = {
> +   .is_prepared = rk808_clkout1_is_prepared,
> +   .recalc_rate = rk808_clkout_recalc_rate,
> +};

Hi Chris,

I do not see a need for clkout1 to have a .is_prepared callback. You should
be fine with only a .recalc_rate.

Regards,
Mike


[PATCH net v6 4/4] tg3: Fix tx_pending checks for tg3_tso_bug

2014-09-04 Thread Benjamin Poirier
In tg3_set_ringparam(), the tx_pending test to cover the cases where
tg3_tso_bug() is entered has two problems
1) the check is only done for certain hardware whereas the workaround
is now used more broadly. IOW, the check may not be performed when it
is needed.
2) the check is too optimistic.

For example, with a 5761 (SHORT_DMA_BUG), tg3_set_ringparam() skips over the
"tx_pending <= (MAX_SKB_FRAGS * 3)" check because TSO_BUG is false. Even if it
did do the check, with a full sized skb, frag_cnt_est = 135 but the check is
for <= MAX_SKB_FRAGS * 3 (= 17 * 3 = 51). So the check is insufficient. This
leads to the following situation: by setting, ex. tx_pending = 100, there can
be an skb that triggers tg3_tso_bug() and that is large enough to cause
tg3_tso_bug() to stop the queue even when it is empty. We then end up with a
netdev watchdog transmit timeout.

Given that 1) some of the conditions tested for in tg3_tx_frag_set() apply
regardless of the chipset flags and that 2) it is difficult to estimate ahead
of time the max possible number of frames that a large skb may be split into
by gso, this patch changes tg3_set_ringparam() to ignore the requirements of
tg3_tso_bug(). Those requirements are instead checked in tg3_tso_bug() itself
and if there is not a sufficient number of descriptors available in the tx
queue, the skb is linearized.

This patch also removes the current scheme in tg3_tso_bug() where the number
of descriptors required to transmit an skb is estimated. Instead,
gso_segment() is called without _SG which yields predictable, linear skbs.
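
The arithmetic behind the example above can be written out as a short sketch. Assumptions: the demo_ helpers are hypothetical, MAX_SKB_FRAGS is taken as 17 as stated in the message, and the 45 segments are inferred from the quoted estimate of 135 together with the old "gso_segs * 3" formula.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define DEMO_MAX_SKB_FRAGS 17u	/* value quoted in the message */

/* Old estimate in tg3_tso_bug(): three descriptors per gso segment. */
unsigned int demo_frag_cnt_est(unsigned int gso_segs)
{
	assert(gso_segs <= UINT_MAX / 3);
	return gso_segs * 3;
}

/* Old ring size test: rejects only tx_pending <= MAX_SKB_FRAGS * 3. */
bool demo_old_check_accepts(unsigned int tx_pending)
{
	return tx_pending > DEMO_MAX_SKB_FRAGS * 3;
}

/* Can the estimate ever be satisfied by a completely empty ring? */
bool demo_fits_empty_ring(unsigned int gso_segs, unsigned int tx_pending)
{
	return demo_frag_cnt_est(gso_segs) <= tx_pending;
}
```

With tx_pending = 100 the old test passes (100 > 51), yet an skb estimated at 135 descriptors cannot fit even when the ring is empty, which is the stuck queue described above.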

Signed-off-by: Benjamin Poirier 

---

Changes v1->v2
* in tg3_set_ringparam(), reduce gso_max_segs further to budget 3 descriptors
  per gso seg instead of only 1 as in v1
* in tg3_tso_bug(), check that this estimation (3 desc/seg) holds, otherwise
  linearize some skbs as needed
* in tg3_start_xmit(), make the queue stop threshold a parameter, for the
  reason explained in the commit description

Changes v2->v3
* use tg3_maybe_stop_txq() instead of repeatedly open coding it
* add the requested tp->tx_dropped++ stat increase in tg3_tso_bug() if
  skb_linearize() fails and we must abort
* in the same code block, add an additional check to stop the queue with the
  default threshold. Otherwise, the netdev_err message at the start of
  __tg3_start_xmit() could be triggered when the next frame is transmitted.
  That is because the previous calls to __tg3_start_xmit() in tg3_tso_bug()
  may have been using a stop_thresh=segs_remaining that is < MAX_SKB_FRAGS +
  1.

Changes v3->v4
* in tg3_set_ringparam(), make sure that wakeup_thresh does not end up being
  >= tx_pending. Identified by Prashant.

Changes v4->v5
* in tg3_set_ringparam(), use TG3_TX_WAKEUP_THRESH() and tp->txq_cnt instead
  of tp->irq_max. Identified by Prashant.

Changes v5->v6
* avoid changing gso_max_segs and making the tx queue wakeup threshold
  dynamic. Instead of stopping the queue when there are not enough descriptors
  available, the skb is linearized.

I reproduced this bug using the same approach explained in patch 1.
The bug reproduces with tx_pending <= 135
---
 drivers/net/ethernet/broadcom/tg3.c | 59 -
 1 file changed, 38 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 6e6b07c..a9787a1 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7830,6 +7830,8 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 }
 
 static netdev_tx_t tg3_start_xmit(struct sk_buff *, struct net_device *);
+static netdev_tx_t __tg3_start_xmit(struct sk_buff *, struct net_device *,
+   u32);
 
 /* Returns true if the queue has been stopped. Note that it may have been
  * restarted since.
@@ -7866,27 +7868,38 @@ static inline bool tg3_maybe_stop_txq(struct tg3_napi *tnapi,
 static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
   struct netdev_queue *txq, struct sk_buff *skb)
 {
-   struct sk_buff *segs, *nskb;
-   u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
+   unsigned int segs_remaining = skb_shinfo(skb)->gso_segs;
 
-   /* Estimate the number of fragments in the worst case */
-   tg3_maybe_stop_txq(tnapi, txq, frag_cnt_est, frag_cnt_est);
-   if (netif_tx_queue_stopped(txq))
-   return NETDEV_TX_BUSY;
+   if (unlikely(tg3_tx_avail(tnapi) <= segs_remaining)) {
+   if (!skb_is_nonlinear(skb) || skb_linearize(skb))
+   goto tg3_tso_bug_drop;
+   tg3_start_xmit(skb, tp->dev);
+   } else {
+   struct sk_buff *segs, *nskb;
 
-   segs = skb_gso_segment(skb, tp->dev->features &
-   ~(NETIF_F_TSO | NETIF_F_TSO6));
-   if (IS_ERR(segs) || !segs)
-   goto tg3_tso_bug_end;
+   segs = skb_gso_segment(skb, tp->dev->features 

[PATCH net v6 3/4] tg3: Move tx queue stop logic to its own function

2014-09-04 Thread Benjamin Poirier
It is duplicated. Also, the first instance in tg3_start_xmit() is racy.
Consider:

tg3_start_xmit()
if budget <= ...
tg3_tx()
(free up the entire ring)
tx_cons =
smp_mb
if queue_stopped and tx_avail, NO
if !queue_stopped
stop queue
return NETDEV_TX_BUSY

... tx queue stopped forever
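
The interleaving above can be replayed deterministically in user space. This
is only a sketch under assumptions: struct txq_model, RING_SIZE and the helper
names are hypothetical, C11 fences stand in for smp_mb(), and the completion
path is called by hand inside the race window instead of running on another
CPU.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 64u	/* arbitrary ring size for the model */

/* Hypothetical model of one tx queue; not the driver's data structures. */
struct txq_model {
	atomic_uint avail;	/* what tg3_tx_avail() would report */
	atomic_bool stopped;	/* what netif_tx_queue_stopped() would report */
};

/* Models tg3_tx() completing the whole ring: publish the new index first,
 * then wake the queue only if it is already marked stopped.
 */
static void completion_frees_ring(struct txq_model *q)
{
	atomic_store(&q->avail, RING_SIZE);
	atomic_thread_fence(memory_order_seq_cst);	/* smp_mb() stand-in */
	if (atomic_load(&q->stopped))
		atomic_store(&q->stopped, false);
}

/* Old open-coded logic: stop the queue and report busy, no recheck. */
static bool old_logic_ends_stopped(unsigned int needed)
{
	struct txq_model q;

	atomic_init(&q.avail, 0);
	atomic_init(&q.stopped, false);
	if (atomic_load(&q.avail) <= needed) {
		completion_frees_ring(&q);		/* race window */
		atomic_store(&q.stopped, true);
	}
	return atomic_load(&q.stopped);
}

/* Stop, barrier, recheck: the order used by tg3_maybe_stop_txq(). */
static bool new_logic_ends_stopped(unsigned int stop_thresh,
				   unsigned int wakeup_thresh)
{
	struct txq_model q;

	atomic_init(&q.avail, 0);
	atomic_init(&q.stopped, false);
	if (atomic_load(&q.avail) <= stop_thresh) {
		completion_frees_ring(&q);		/* same race window */
		atomic_store(&q.stopped, true);
		atomic_thread_fence(memory_order_seq_cst);
		if (atomic_load(&q.avail) > wakeup_thresh)
			atomic_store(&q.stopped, false);
	}
	return atomic_load(&q.stopped);
}
```

In this model the old logic leaves the queue stopped although the ring is
completely free, while the recheck after the barrier notices the freed
descriptors and wakes the queue again.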

Signed-off-by: Benjamin Poirier 

---

Changes v2->v3
* new patch to avoid repeatedly open coding this block in the next patch.

Changes v3->v4
* added a comment to clarify the return value, as suggested
* replaced the BUG_ON with netdev_err(). No need to be so dramatic, this
  situation will trigger a netdev watchdog anyways.
---
 drivers/net/ethernet/broadcom/tg3.c | 75 -
 1 file changed, 40 insertions(+), 35 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index c5061c3..6e6b07c 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -7831,6 +7831,35 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 
 static netdev_tx_t tg3_start_xmit(struct sk_buff *, struct net_device *);
 
+/* Returns true if the queue has been stopped. Note that it may have been
+ * restarted since.
+ */
+static inline bool tg3_maybe_stop_txq(struct tg3_napi *tnapi,
+ struct netdev_queue *txq,
+ u32 stop_thresh, u32 wakeup_thresh)
+{
+   bool stopped = false;
+
+   if (unlikely(tg3_tx_avail(tnapi) <= stop_thresh)) {
+   if (!netif_tx_queue_stopped(txq)) {
+   stopped = true;
+   netif_tx_stop_queue(txq);
+   if (wakeup_thresh >= tnapi->tx_pending)
+   netdev_err(tnapi->tp->dev,
+  "BUG! wakeup_thresh too large (%u >= %u)\n",
+  wakeup_thresh, tnapi->tx_pending);
+   }
+   /* netif_tx_stop_queue() must be done before checking tx index
+* in tg3_tx_avail(), because in tg3_tx(), we update tx index
+* before checking for netif_tx_queue_stopped().
+*/
+   smp_mb();
+   if (tg3_tx_avail(tnapi) > wakeup_thresh)
+   netif_tx_wake_queue(txq);
+   }
+   return stopped;
+}
+
 /* Use GSO to workaround all TSO packets that meet HW bug conditions
  * indicated in tg3_tx_frag_set()
  */
@@ -7841,20 +7870,9 @@ static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
 
/* Estimate the number of fragments in the worst case */
-   if (unlikely(tg3_tx_avail(tnapi) <= frag_cnt_est)) {
-   netif_tx_stop_queue(txq);
-
-   /* netif_tx_stop_queue() must be done before checking
-* checking tx index in tg3_tx_avail() below, because in
-* tg3_tx(), we update tx index before checking for
-* netif_tx_queue_stopped().
-*/
-   smp_mb();
-   if (tg3_tx_avail(tnapi) <= frag_cnt_est)
-   return NETDEV_TX_BUSY;
-
-   netif_tx_wake_queue(txq);
-   }
+   tg3_maybe_stop_txq(tnapi, txq, frag_cnt_est, frag_cnt_est);
+   if (netif_tx_queue_stopped(txq))
+   return NETDEV_TX_BUSY;
 
segs = skb_gso_segment(skb, tp->dev->features &
~(NETIF_F_TSO | NETIF_F_TSO6));
@@ -7902,16 +7920,13 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 * interrupt.  Furthermore, IRQ processing runs lockless so we have
 * no IRQ context deadlocks to worry about either.  Rejoice!
 */
-   if (unlikely(budget <= (skb_shinfo(skb)->nr_frags + 1))) {
-   if (!netif_tx_queue_stopped(txq)) {
-   netif_tx_stop_queue(txq);
-
-   /* This is a hard error, log it. */
-   netdev_err(dev,
-  "BUG! Tx Ring full when queue awake!\n");
-   }
-   return NETDEV_TX_BUSY;
+   if (tg3_maybe_stop_txq(tnapi, txq, skb_shinfo(skb)->nr_frags + 1,
+  TG3_TX_WAKEUP_THRESH(tnapi))) {
+   /* This is a hard error, log it. */
+   netdev_err(dev, "BUG! Tx Ring full when queue awake!\n");
}
+   if (netif_tx_queue_stopped(txq))
+   return NETDEV_TX_BUSY;
 
entry = tnapi->tx_prod;
base_flags = 0;
@@ -8087,18 +8102,8 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)

[PATCH net v6 2/4] tg3: Fix tx_pending check for MAX_SKB_FRAGS

2014-09-04 Thread Benjamin Poirier
The rest of the driver assumes at least one free descriptor in the tx ring.
Therefore, since an skb with max frags takes up (MAX_SKB_FRAGS + 1)
descriptors, tx_pending must be > (MAX_SKB_FRAGS + 1).
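
A quick user-space illustration of the off-by-one, assuming MAX_SKB_FRAGS = 17
(the value on the test system); the helper names are invented for this sketch
and do not exist in the driver.

```c
#include <stdbool.h>

#define MAX_SKB_FRAGS 17u	/* assumed value, see above */

/* Check before this patch: tx_pending is rejected if <= MAX_SKB_FRAGS. */
static bool old_tx_pending_ok(unsigned int tx_pending)
{
	return tx_pending > MAX_SKB_FRAGS;
}

/* Check after this patch: rejected if <= MAX_SKB_FRAGS + 1. */
static bool new_tx_pending_ok(unsigned int tx_pending)
{
	return tx_pending > MAX_SKB_FRAGS + 1;
}

/* Descriptors still free after queuing one skb with the maximum number of
 * frags (MAX_SKB_FRAGS + 1 descriptors) on an otherwise empty ring.
 */
static int free_after_max_skb(unsigned int tx_pending)
{
	return (int)tx_pending - (int)(MAX_SKB_FRAGS + 1);
}
```

With tx_pending = 18 the old check passes but no descriptor is left free after
a max-frags skb; 19 is the smallest value that keeps one free.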

Signed-off-by: Benjamin Poirier 

---

Changes v1->v2
Moved ahead in the series from 3/3 to 2/3, no functional change

I reproduced this bug using the same approach explained in patch 1.
The bug reproduces with tx_pending = 18
---
 drivers/net/ethernet/broadcom/tg3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 81b3a57..c5061c3 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -12331,7 +12331,7 @@ static int tg3_set_ringparam(struct net_device *dev, struct ethtool_ringparam *e
if ((ering->rx_pending > tp->rx_std_ring_mask) ||
(ering->rx_jumbo_pending > tp->rx_jmb_ring_mask) ||
(ering->tx_pending > TG3_TX_RING_SIZE - 1) ||
-   (ering->tx_pending <= MAX_SKB_FRAGS) ||
+   (ering->tx_pending <= MAX_SKB_FRAGS + 1) ||
(tg3_flag(tp, TSO_BUG) &&
 (ering->tx_pending <= (MAX_SKB_FRAGS * 3))))
return -EINVAL;
-- 
1.8.4.5



[PATCH net v6 1/4] tg3: Limit minimum tx queue wakeup threshold

2014-09-04 Thread Benjamin Poirier
tx_pending may be set by the user (via ethtool -G) to a low enough value that
TG3_TX_WAKEUP_THRESH becomes smaller than MAX_SKB_FRAGS + 1. This may cause
the tx queue to be woken when there are in fact not enough descriptors to
handle an skb with max frags. This in turn causes tg3_start_xmit() to return
NETDEV_TX_BUSY and print error messages. Fix the problem by putting a limit to
how low TG3_TX_WAKEUP_THRESH can go.
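
For illustration, the effect of the lower bound can be sketched in plain C.
MAX_SKB_FRAGS = 17 is an assumption matching the test system, and the two
functions are stand-ins for the old and new macro, not driver code.

```c
#define MAX_SKB_FRAGS 17u	/* assumed value */

/* Old macro: a quarter of the ring, however small that gets. */
static unsigned int old_wakeup_thresh(unsigned int tx_pending)
{
	return tx_pending / 4;
}

/* New macro: same quarter, but never below what one max-frags skb needs. */
static unsigned int new_wakeup_thresh(unsigned int tx_pending)
{
	unsigned int quarter = tx_pending / 4;
	unsigned int min_thresh = MAX_SKB_FRAGS + 1;

	return quarter > min_thresh ? quarter : min_thresh;
}
```

For tx_pending = 50 the old threshold is 12, below the 18 descriptors a
max-frags skb needs, while the new one is 18. Large rings are unaffected:
tx_pending = 512 still gives 128.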

Signed-off-by: Benjamin Poirier 

---

I noticed the problem in a 3.0 kernel when setting `ethtool -G eth0 tx 50` and
running a netperf TCP_STREAM test. The console fills up with
[10597.596155] tg3 :06:00.0: eth0: BUG! Tx Ring full when queue awake!
The problem in tg3 remains in current kernels though it does not reproduce as
easily since "5640f76 net: use a per task frag allocator (v3.7-rc1)". I
reproduced on current kernels by using the fail_page_alloc fault injection
mechanism to force the creation of skbs with many order-0 frags. Note that the
following script may also trigger another bug (NETDEV WATCHDOG), which is
fixed in the next patch.

$ cat /tmp/doit.sh

F="/sys/kernel/debug/fail_page_alloc"

echo -1 > "$F/times"
echo 0 > "$F/verbose"
echo 0 > "$F/ignore-gfp-wait"
echo 1 > "$F/task-filter"
echo 100 > "$F/probability"

netperf -H 192.168.9.30 -l100 -t omni -- -d send &

n=$!

sleep 0.3
echo 1 > "/proc/$n/make-it-fail"
sleep 10

kill "$n"
---
 drivers/net/ethernet/broadcom/tg3.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index cb77ae9..81b3a57 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -202,7 +202,8 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS flag, unsigned long *bits)
 #endif
 
 /* minimum number of free TX descriptors required to wake up TX process */
-#define TG3_TX_WAKEUP_THRESH(tnapi)   ((tnapi)->tx_pending / 4)
+#define TG3_TX_WAKEUP_THRESH(tnapi)   max_t(u32, (tnapi)->tx_pending / 4, \
+ MAX_SKB_FRAGS + 1)
 #define TG3_TX_BD_DMA_MAX_2K   2048
 #define TG3_TX_BD_DMA_MAX_4K   4096
 
-- 
1.8.4.5



[PATCH net v6 0/4] tg3: tx_pending fixes

2014-09-04 Thread Benjamin Poirier

Extra info regarding patch 4:
This version of the series calls gso_segment() without NETIF_F_SG. This avoids
the need for desc_cnt_est in tg3_tso_bug() as in previous versions of this
patch series. Since Michael had previously raised concerns about gso_segment
without SG, I ran some netperf throughput tests. I used a small patch to force
tg3_tso_bug() to be called even when it is not needed [1].

root@linux-y64m:~# perf stat -r10 -ad netperf -H 192.168.9.30 -l60 -T 0,0 -t omni -- -d send

* original tg3_tso_bug() (ie. without patch 4/4)
  781±2 10^6bits/s
  6.60 cycle/bit
* gso_segment() without SG (current series)
  801.0±0.9 10^6bits/s
  5.79 cycle/bit
* gso_segment() with SG (alternate patch 4/4 [2])
  783±2 10^6bits/s
  7.25 cycle/bit

(For reference, with the original tg3_tso_bug() implementation but without
forcing it to be called, the throughput I get is 822±1 10^6bits/s @ 3.82
cycle/bit with 0 invocations of tg3_tso_bug)

[1] fault injection patch

---
 drivers/net/ethernet/broadcom/tg3.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index cb77ae9..f9144dc 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -47,6 +47,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -468,6 +469,27 @@ static const struct {
 #define TG3_NUM_TEST   ARRAY_SIZE(ethtool_test_keys)
 
 
+/* debugging stuff */
+static u32 tg3_do_mangle;
+static struct dentry *tg3_mangle_debugfs;
+
+static int __init tg3_mod_init(void)
+{
+   tg3_mangle_debugfs = debugfs_create_u32("tg3_do_mangle", S_IRUGO |
+   S_IWUSR, NULL,
+   &tg3_do_mangle);
+
+   return 0;
+}
+module_init(tg3_mod_init);
+
+static void __exit tg3_mod_exit(void)
+{
+   debugfs_remove(tg3_mangle_debugfs);
+}
+module_exit(tg3_mod_exit);
+/* --- */
+
 static void tg3_write32(struct tg3 *tp, u32 off, u32 val)
 {
writel(val, tp->regs + off);
@@ -8048,6 +8070,11 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
would_hit_hwbug = 1;
break;
}
+
+   if (tg3_do_mangle > 0) {
+   would_hit_hwbug = 4;
+   break;
+   }
}
}
 
-- 

[2] alternate patch 4

call gso_segment with SG (without removing it, actually)

---
 drivers/net/ethernet/broadcom/tg3.c | 80 +++--
 1 file changed, 59 insertions(+), 21 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index ee93b51..1ecb393 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -205,6 +205,9 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS flag, unsigned long *bits)
 /* minimum number of free TX descriptors required to wake up TX process */
 #define TG3_TX_WAKEUP_THRESH(tnapi)   max_t(u32, (tnapi)->tx_pending / 4, \
   MAX_SKB_FRAGS + 1)
+/* estimate a certain number of descriptors per gso segment */
+#define TG3_TX_DESC_PER_SEG(seg_nb)   ((seg_nb) * 3)
+
 #define TG3_TX_BD_DMA_MAX_2K   2048
 #define TG3_TX_BD_DMA_MAX_4K   4096
 
@@ -7852,6 +7855,8 @@ static int tigon3_dma_hwbug_workaround(struct tg3_napi *tnapi,
 }
 
 static netdev_tx_t tg3_start_xmit(struct sk_buff *, struct net_device *);
+static netdev_tx_t __tg3_start_xmit(struct sk_buff *, struct net_device *,
+   u32);
 
 /* Returns true if the queue has been stopped. Note that it may have been
  * restarted since.
@@ -7888,27 +7893,56 @@ static inline bool tg3_maybe_stop_txq(struct tg3_napi *tnapi,
 static int tg3_tso_bug(struct tg3 *tp, struct tg3_napi *tnapi,
   struct netdev_queue *txq, struct sk_buff *skb)
 {
-   struct sk_buff *segs, *nskb;
-   u32 frag_cnt_est = skb_shinfo(skb)->gso_segs * 3;
+   unsigned int segs_remaining = skb_shinfo(skb)->gso_segs;
+   u32 desc_cnt_est = TG3_TX_DESC_PER_SEG(segs_remaining);
 
-   /* Estimate the number of fragments in the worst case */
-   tg3_maybe_stop_txq(tnapi, txq, frag_cnt_est, frag_cnt_est);
-   if (netif_tx_queue_stopped(txq))
-   return NETDEV_TX_BUSY;
+   if (unlikely(tg3_tx_avail(tnapi) <= desc_cnt_est)) {
+   if (!skb_is_nonlinear(skb) || skb_linearize(skb))
+   goto tg3_tso_bug_drop;
+   tg3_start_xmit(skb, tp->dev);
+   } else {
+   struct sk_buff *segs, *nskb;
 
-   segs = skb_gso_segment(skb, tp->dev->features &
-   ~(NETIF_F_TSO | NETIF_F_TSO6));
-   if (IS_ERR(segs) || !segs)
-   goto tg3_tso_bug_end;
+  

Re: [PATCH v8] usb:serial:pl2303: add GPIOs interface on PL2303

2014-09-04 Thread Wang YanQing
On Thu, Sep 04, 2014 at 06:44:31PM +0200, Benjamin Henrion wrote:
> On Thu, Sep 4, 2014 at 6:14 PM, Benjamin Henrion  wrote:
> > I have subscribed to the lkml.
> >
> > Could you do me a favour and send me your email, as you posted it on
> > the LKML, in mbox format, attached to an email?
> >
> > I am trying to reply to it, but I cannot find an mbox archive, outside
> > of gmane which rewrites all the email addresses.
> >
> > Otherwise, I made a page here:
> >
> > http://www.zoobab.com/pl2303hxd-gpio
> >
> > I will also try with other adapters to see if the 2 GPIOs show up.
> >
> > Any idea how to test the GPIO sysfs speed?
> 
> Also tested with an HXA laying around, I could export the 2 GPIOS in
> /sys/class/gpio, but I could not change the status of the pins via
> echo.
> 
> Any idea why?

That is strange. I tested again with my two HXA adapters and they work well.
How do you test: by reading the value with cat, or by checking for a voltage
change? I read the value with cat, and it works fine.


> 
> -- 
> Benjamin Henrion 
> FFII Brussels - +32-484-566109 - +32-2-4148403
> "In July 2005, after several failed attempts to legalise software
> patents in Europe, the patent establishment changed its strategy.
> Instead of explicitly seeking to sanction the patentability of
> software, they are now seeking to create a central European patent
> court, which would establish and enforce patentability rules in their
> favor, without any possibility of correction by competing courts or
> democratically elected legislators."


Re: [PATCH 1/3] blk-mq: Cleanup blk_mq_tag_busy() and blk_mq_tag_idle()

2014-09-04 Thread Jens Axboe
On 09/04/2014 07:26 PM, Chuck Ebbert wrote:
> On Wed, 03 Sep 2014 14:35:29 -0600
> Jens Axboe  wrote:
> 
>> On 09/03/2014 02:33 PM, Alexander Gordeev wrote:
> 
>>> diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
>>> index 6206ed1..795ec3f 100644
>>> --- a/block/blk-mq-tag.h
>>> +++ b/block/blk-mq-tag.h
>>> @@ -66,23 +66,22 @@ enum {
>>> BLK_MQ_TAG_MAX  = BLK_MQ_TAG_FAIL - 1,
>>>  };
>>>  
>>> -extern bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
>>> +extern void __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
>>>  extern void __blk_mq_tag_idle(struct blk_mq_hw_ctx *);
>>>  
>>>  static inline bool blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
>>>  {
>>> -   if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
>>> -   return false;
>>> -
>>> -   return __blk_mq_tag_busy(hctx);
>>> +   if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
>>> +   __blk_mq_tag_busy(hctx);
>>> +   return true;
>>> +   }
>>> +   return false;
>>>  }
>>
>> The normal/fast path here is the flag NOT being set, which is why it
>> was coded that way to put the fast path inline.
>>
>>>  
>>>  static inline void blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
>>>  {
>>> -   if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
>>> -   return;
>>> -
>>> -   __blk_mq_tag_idle(hctx);
>>> +   if (hctx->flags & BLK_MQ_F_TAG_SHARED)
>>> +   __blk_mq_tag_idle(hctx);
>>>  }
>>
>> Ditto
> 
> Shouldn't it just add unlikely() then? That way it's obvious what the
> common case is, instead of relying on convoluted code.

It's a common construct. Besides, if you find a flag-not-set check
convoluted, then I hope you are not programming anything I use. That's a
bit of a straw man, imho.

-- 
Jens Axboe



Re: [PATCH 1/3] blk-mq: Cleanup blk_mq_tag_busy() and blk_mq_tag_idle()

2014-09-04 Thread Chuck Ebbert
On Wed, 03 Sep 2014 14:35:29 -0600
Jens Axboe  wrote:

> On 09/03/2014 02:33 PM, Alexander Gordeev wrote:

> > diff --git a/block/blk-mq-tag.h b/block/blk-mq-tag.h
> > index 6206ed1..795ec3f 100644
> > --- a/block/blk-mq-tag.h
> > +++ b/block/blk-mq-tag.h
> > @@ -66,23 +66,22 @@ enum {
> > BLK_MQ_TAG_MAX  = BLK_MQ_TAG_FAIL - 1,
> >  };
> >  
> > -extern bool __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
> > +extern void __blk_mq_tag_busy(struct blk_mq_hw_ctx *);
> >  extern void __blk_mq_tag_idle(struct blk_mq_hw_ctx *);
> >  
> >  static inline bool blk_mq_tag_busy(struct blk_mq_hw_ctx *hctx)
> >  {
> > -   if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
> > -   return false;
> > -
> > -   return __blk_mq_tag_busy(hctx);
> > +   if (hctx->flags & BLK_MQ_F_TAG_SHARED) {
> > +   __blk_mq_tag_busy(hctx);
> > +   return true;
> > +   }
> > +   return false;
> >  }
> 
> The normal/fast path here is the flag NOT being set, which is why it
> was coded that way to put the fast path inline.
> 
> >  
> >  static inline void blk_mq_tag_idle(struct blk_mq_hw_ctx *hctx)
> >  {
> > -   if (!(hctx->flags & BLK_MQ_F_TAG_SHARED))
> > -   return;
> > -
> > -   __blk_mq_tag_idle(hctx);
> > +   if (hctx->flags & BLK_MQ_F_TAG_SHARED)
> > +   __blk_mq_tag_idle(hctx);
> >  }
> 
> Ditto

Shouldn't it just add unlikely() then? That way it's obvious what the
common case is, instead of relying on convoluted code.
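
To make the two variants under discussion concrete, here is a user-space
sketch. The unlikely() definition mirrors the usual GCC/Clang builtin;
TAG_SHARED, slow_path() and both function names are hypothetical stand-ins,
not the blk-mq code. The sketch only shows that the two forms behave
identically; it says nothing about how a compiler lays out either one.

```c
#include <stdbool.h>

/* User-space stand-in for the kernel's unlikely() (GCC/Clang builtin). */
#define unlikely(x) __builtin_expect(!!(x), 0)

#define TAG_SHARED 0x2u	/* hypothetical bit, stands in for BLK_MQ_F_TAG_SHARED */

static int slow_path_calls;

/* Stand-in for the out-of-line __blk_mq_tag_busy(). */
static bool slow_path(void)
{
	slow_path_calls++;
	return true;
}

/* Early-return form: the flag-not-set case leaves first. */
static bool tag_busy_early_return(unsigned int flags)
{
	if (!(flags & TAG_SHARED))
		return false;

	return slow_path();
}

/* Annotated form: the rare branch is marked explicitly. */
static bool tag_busy_unlikely(unsigned int flags)
{
	if (unlikely(flags & TAG_SHARED))
		return slow_path();

	return false;
}
```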


Re: [PATCH -v4] x86: only load initrd above 4g on second try

2014-09-04 Thread Yinghai Lu
On Thu, Sep 4, 2014 at 2:29 PM, Matt Fleming  wrote:
> On Thu, 04 Sep, at 01:59:05PM, H. Peter Anvin wrote:
>>
>> I am fine with this patch, but at the same time I do want to note that
>> there is an alternative to double-buffer the patch and/or (if that
>> applies to the buggy BIOS) round up the size of the target buffer.
>
> I'm not sure that rounding up the size of the target buffer will
> workaround this issue correctly.
>
> As far as I know, the only thing that Mantas tried was rounding up the
> size of the source file, by padding it.

Hi Mantas,

Can you try attached patch on top of linus tree?

Thanks

Yinghai
---
 drivers/firmware/efi/libstub/efi-stub-helper.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-2.6/drivers/firmware/efi/libstub/efi-stub-helper.c
===
--- linux-2.6.orig/drivers/firmware/efi/libstub/efi-stub-helper.c
+++ linux-2.6/drivers/firmware/efi/libstub/efi-stub-helper.c
@@ -426,7 +426,7 @@ efi_status_t handle_cmdline_files(efi_sy
 if (size > EFI_READ_CHUNK_SIZE)
 	chunksize = EFI_READ_CHUNK_SIZE;
 else
-	chunksize = size;
+	chunksize = round_up(size, EFI_PAGE_SIZE);
 
 status = efi_file_read(files[j].handle,
 		   ,


Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Arnaldo Carvalho de Melo
Em Fri, Sep 05, 2014 at 09:09:49AM +0900, Namhyung Kim escreveu:
> >> Before:
> >>   $ perf record -a usleep 1

> >>   $ perf buildid-list
> >>   00d5ff078efe1d30b8492854f259215fd877ce30 /lib/modules/3.16.0-rc2+/build/vmlinux
> >>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so

> >>   $ perf buildid-list -H
> >>    [kernel.kallsyms]
> >>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so

> >> After:
> >>   $ perf record -a usleep 1

> >>   $ perf buildid-list
> >>   00d5ff078efe1d30b8492854f259215fd877ce30 [kernel.kallsyms]

> > We are losing information, namely the pathname for the kernel used, that
> > may be useful in analysis.
 
> Right.  That's a problem.

> > Why not make sure that if there is a build-id in the perf.data header,
> > then we completely refuse anything that doesn't match the build-id?
> > I.e. the name is irrelevant for this purpose, the contents, as keyed by
> > the build-id, is what matters.
 
> The perf report rebuilds machine states from the event records only.  In
> this case, the kernel map was recorded in the name of [kernel.kallsyms]
> so it couldn't find the build-id from the table.

Ok, but then we can special case this one, no?

Somehow mark in the buildid table that that entry is the one for the
kernel and hook it up to the synthesized event that has the
[kernel.kallsyms].ref_reloc_sym entry.

- Arnaldo


[PATCH] regulator: max77802: Remove duplicate rdev_get_id() call

2014-09-04 Thread Axel Lin
Signed-off-by: Axel Lin 
---
 drivers/regulator/max77802.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/regulator/max77802.c b/drivers/regulator/max77802.c
index 967e109..d89792b 100644
--- a/drivers/regulator/max77802.c
+++ b/drivers/regulator/max77802.c
@@ -134,7 +134,7 @@ static int max77802_ldo_set_suspend_mode_logic1(struct regulator_dev *rdev,
return -EINVAL;
}
 
-   max77802->opmode[rdev_get_id(rdev)] = val;
+   max77802->opmode[id] = val;
return regmap_update_bits(rdev->regmap, rdev->desc->enable_reg,
  rdev->desc->enable_mask, val << shift);
 }
@@ -167,7 +167,7 @@ static int max77802_ldo_set_suspend_mode_logic2(struct regulator_dev *rdev,
return -EINVAL;
}
 
-   max77802->opmode[rdev_get_id(rdev)] = val;
+   max77802->opmode[id] = val;
return regmap_update_bits(rdev->regmap, rdev->desc->enable_reg,
  rdev->desc->enable_mask, val << shift);
 }
-- 
1.9.1





Re: [PATCHv4 0/3] new APIs to allocate buffer-cache with user specific flag

2014-09-04 Thread Theodore Ts'o
On Fri, Sep 05, 2014 at 09:37:05AM +0900, Gioh Kim wrote:
> >But what were the problems which were observed in standard kernels and
> >what effect did this patchset have upon them?  Some quantitative
> >measurements will really help here.
> 
> The problem is that I cannot allocate entire CMA memory.
> >
> Actually the problem is not found without Joonsoo's patch:
> https://lkml.org/lkml/2014/5/28/64.  Without it CMA memory is free
> and every CMA-memory allocation succeeds.
> 
> If the Joonsoo's patch is applied, the CMA memory is allocated
> generally when system boots-up.

As I said earlier, I'm happy to carry this patch in the ext4 tree,
because as it turns out I could use this facility for another purpose
(to cause a few buffer cache allocations to happen with __GFP_NOFAIL).

I do have one question; I note that Joonsoo's patch dates back to May,
and yet this has not hit the mainline kernel, and I haven't seen any
discussions about this patch after May.  Has there been some pushback
from the mm maintainers about Joonsoo's approach with respect to this
patch?   What is the current status of that patch set?

Thanks,

- Ted


[PATCH] regulator: hi6421: Fix misleading comment

2014-09-04 Thread Axel Lin
Signed-off-by: Axel Lin 
---
 drivers/regulator/hi6421-regulator.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/regulator/hi6421-regulator.c b/drivers/regulator/hi6421-regulator.c
index e389920..a8c362c 100644
--- a/drivers/regulator/hi6421-regulator.c
+++ b/drivers/regulator/hi6421-regulator.c
@@ -37,7 +37,7 @@ struct hi6421_regulator_pdata {
  * struct hi6421_regulator_info - hi6421 regulator information
  * @desc: regulator description
  * @mode_mask: ECO mode bitmask of LDOs; for BUCKs, this masks sleep
- * @eco_microamp: eco mode load upper limit (in mA), valid for LDOs only
+ * @eco_microamp: eco mode load upper limit (in uA), valid for LDOs only
  */
 struct hi6421_regulator_info {
struct regulator_desc   desc;
-- 
1.9.1





[PATCH] regulator: da9211: Set of_match_table and export device table

2014-09-04 Thread Axel Lin
Also move da9211_i2c_id and da9211_dt_ids close to the user for better
readability.

Signed-off-by: Axel Lin 
---
 drivers/regulator/da9211-regulator.c | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/regulator/da9211-regulator.c b/drivers/regulator/da9211-regulator.c
index 044c36c..f47adf3 100644
--- a/drivers/regulator/da9211-regulator.c
+++ b/drivers/regulator/da9211-regulator.c
@@ -388,20 +388,6 @@ static int da9211_regulator_init(struct da9211 *chip)
return 0;
 }
 
-static const struct i2c_device_id da9211_i2c_id[] = {
-   {"da9211", DA9211},
-   {"da9213", DA9213},
-   {},
-};
-
-#ifdef CONFIG_OF
-static const struct of_device_id da9211_dt_ids[] = {
-   { .compatible = "dlg,da9211", .data = &da9211_i2c_id[0] },
-   { .compatible = "dlg,da9213", .data = &da9211_i2c_id[1] },
-   {},
-};
-#endif
-
 /*
  * I2C driver interface functions
  */
@@ -479,12 +465,27 @@ static int da9211_i2c_probe(struct i2c_client *i2c,
return ret;
 }
 
+static const struct i2c_device_id da9211_i2c_id[] = {
+   {"da9211", DA9211},
+   {"da9213", DA9213},
+   {},
+};
 MODULE_DEVICE_TABLE(i2c, da9211_i2c_id);
 
+#ifdef CONFIG_OF
+static const struct of_device_id da9211_dt_ids[] = {
+   { .compatible = "dlg,da9211", .data = &da9211_i2c_id[0] },
+   { .compatible = "dlg,da9213", .data = &da9211_i2c_id[1] },
+   {},
+};
+MODULE_DEVICE_TABLE(of, da9211_dt_ids);
+#endif
+
 static struct i2c_driver da9211_regulator_driver = {
.driver = {
.name = "da9211",
.owner = THIS_MODULE,
+   .of_match_table = of_match_ptr(da9211_dt_ids),
},
.probe = da9211_i2c_probe,
.id_table = da9211_i2c_id,
-- 
1.9.1





Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Arnaldo Carvalho de Melo
Em Fri, Sep 05, 2014 at 09:23:56AM +0900, Namhyung Kim escreveu:
> Hmm.. now I'm curious why the -H option is needed at all.. the perf
> record already wrote the build-ids that are actually hits..

Probably because when 'perf buildid-list' was introduced 'perf record'
was inserting all the PERF_RECORD_MMAP dsos into the buildid-list, not
just the ones with hits.

Then, later on, 'perf record' probably moved to register just the ones
with hits.

This way -H ends up being a nop, i.e. should produce the same result as
'perf buildid-list' with no options.

If it doesn't, then it is a bug.

- Arnaldo


[PATCH] HID: Add Holtek USB ID 04d9:a0c2 ETEKCITY Scroll

2014-09-04 Thread John DeSilva
The report descriptor for the HOLTEK USB ID 04d9:a0c2 (ETEKCITY Scroll
T-140 Gaming Mouse) declares a very large number of consumer usages
(2^16), exceeding HID_MAX_USAGES. Add the id, bindings and comments for
the mouse, and reduce the usage and logical maximums to 0x2fff,
consistent with the other mice in the category.
Tested on the hardware.
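
As a rough user-space sketch of what the fixup amounts to: the byte offsets
and the 0xff 0x7f test are taken from the existing check in
holtek_mouse_report_fixup(), and HID stores multi-byte items little-endian.
That the rewrite touches only the high bytes (116 and 121) is my reading of
"reduce ... to 0x2fff" and should be treated as an assumption; the function
names are invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Lower the two 16-bit maximums from 0x7fff to 0x2fff when the descriptor
 * matches the known layout. Returns true if the descriptor was changed.
 */
static bool fixup_maximums(uint8_t *rdesc, size_t rsize)
{
	if (rsize >= 122 && rdesc[115] == 0xff && rdesc[116] == 0x7f &&
	    rdesc[120] == 0xff && rdesc[121] == 0x7f) {
		/* little-endian: the low byte 0xff stays, the high byte changes */
		rdesc[116] = 0x2f;
		rdesc[121] = 0x2f;
		return true;
	}
	return false;
}

/* Build a dummy 122-byte descriptor with the two maximums in place and
 * return the high byte of the first one after the fixup.
 */
static int demo_high_byte_after_fixup(void)
{
	uint8_t rdesc[122] = { 0 };

	rdesc[115] = 0xff;
	rdesc[116] = 0x7f;
	rdesc[120] = 0xff;
	rdesc[121] = 0x7f;
	fixup_maximums(rdesc, sizeof(rdesc));
	return rdesc[116];
}
```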

Signed-off-by: John C. DeSilva 
---
 drivers/hid/hid-holtek-mouse.c | 4 
 drivers/hid/hid-ids.h  | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/hid/hid-holtek-mouse.c b/drivers/hid/hid-holtek-mouse.c
index d60fbd0..78b3a0c 100644
--- a/drivers/hid/hid-holtek-mouse.c
+++ b/drivers/hid/hid-holtek-mouse.c
@@ -29,6 +29,7 @@
  *   and Zalman ZM-GM1
  * - USB ID 04d9:a081, sold as SHARKOON DarkGlider Gaming mouse
  * - USB ID 04d9:a072, sold as LEETGION Hellion Gaming Mouse
+ * - USB ID 04d9:a0c2, sold as ETEKCITY Scroll T-140 Gaming Mouse
  */
 
 static __u8 *holtek_mouse_report_fixup(struct hid_device *hdev, __u8 *rdesc,
@@ -42,6 +43,7 @@ static __u8 *holtek_mouse_report_fixup(struct hid_device *hdev, __u8 *rdesc,
switch (hdev->product) {
case USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A067:
case USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A072:
+   case USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A0C2:
if (*rsize >= 122 && rdesc[115] == 0xff && rdesc[116] == 0x7f
&& rdesc[120] == 0xff && rdesc[121] == 0x7f) {
hid_info(hdev, "Fixing up report descriptor\n");
@@ -74,6 +76,8 @@ static const struct hid_device_id holtek_mouse_devices[] = {
USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A072) },
{ HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT,
USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A081) },
+   { HID_USB_DEVICE(USB_VENDOR_ID_HOLTEK_ALT,
+   USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A0C2) },
{ }
 };
 MODULE_DEVICE_TABLE(hid, holtek_mouse_devices);
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h
index 25cd674..c7b36ad 100644
--- a/drivers/hid/hid-ids.h
+++ b/drivers/hid/hid-ids.h
@@ -479,6 +479,7 @@
 #define USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A070 0xa070
 #define USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A072 0xa072
 #define USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A081 0xa081
+#define USB_DEVICE_ID_HOLTEK_ALT_MOUSE_A0C2 0xa0c2
 #define USB_DEVICE_ID_HOLTEK_ALT_KEYBOARD_A096 0xa096
 
 #define USB_VENDOR_ID_IMATION  0x0718
-- 
1.8.5.5
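
The fixup that this patch extends to the new ID can be sketched as a stand-alone C function. This is only a sketch: the offsets and the 0xff/0x7f test come from the hunk above, while the bytes written (0x2f) are inferred from the "0x2fff" in the commit message, and the function name `holtek_fixup_sketch` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the descriptor check.
 * Returns 1 if the descriptor was patched, 0 if it was left alone.
 */
int holtek_fixup_sketch(uint8_t *rdesc, size_t rsize)
{
	assert(rdesc != NULL);

	/* Both maximums are stored low byte first: 0xff 0x7f is 0x7fff. */
	if (rsize >= 122 &&
	    rdesc[115] == 0xff && rdesc[116] == 0x7f &&
	    rdesc[120] == 0xff && rdesc[121] == 0x7f) {
		/* Assumed fix: lower the high bytes so both read 0x2fff. */
		rdesc[116] = 0x2f;
		rdesc[121] = 0x2f;
		return 1;
	}
	return 0;
}
```

A descriptor shorter than 122 bytes, or one whose bytes at those offsets differ, is left untouched, so the check is safe to run on every device listed in the switch.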



Re: bit fields && data tearing

2014-09-04 Thread Peter Hurley
[ +cc linux-alpha ]

On 09/04/2014 06:14 PM, H. Peter Anvin wrote:
> On 09/04/2014 02:52 AM, Benjamin Herrenschmidt wrote:
>>
>> Yeah correct, alpha and bytes right ? Is there any other ? That's why I
>> suggested int.
>>
> 
> Even for Alpha it is only the 21064 AFAIK.

For -mcpu=ev5 (21164) and the following test

struct x {
	long a;
	char b;
	char c;
	char d;
	char e;
};

void store_b(struct x *p) {
	p->b = 1;
}

gcc generates:

void store_b(struct x *p) {
   0:   08 00 30 a0     ldl     t0,8(a0)
   4:   01 f1 3f 44     andnot  t0,0xff,t0
   8:   01 34 20 44     or      t0,0x1,t0
   c:   08 00 30 b0     stl     t0,8(a0)
  10:   01 80 fa 6b     ret

IOW, rmw on 3 adjacent bytes, which is bad :)
For -mcpu=ev56 (21164A), the generated code is:

void store_b(struct x *p) {
   0:   01 00 3f 20     lda     t0,1
   4:   08 00 30 38     stb     t0,8(a0)
   8:   01 80 fa 6b     ret

which is ok.
I have no idea how prevalent the ev56 is compared to the ev5.
Still we're talking about a chip that came out in 1996.
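
Why the ldl/stl sequence is a problem can be shown with a small single-threaded model. This is only a sketch under stated assumptions: the four chars are modelled as one 32-bit word with b in the low byte, the second CPU is simulated by calling a helper between the load and the store, and the names `store_byte` and `c_after_race` are made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Model: b, c, d, e share one 32-bit word; b is bits 0-7, c is bits 8-15. */
static uint32_t word;

/* Stands in for a single stb: only the addressed byte changes. */
static void store_byte(unsigned idx, uint8_t v)
{
	assert(idx < 4);
	word = (word & ~(0xffu << (8 * idx))) | ((uint32_t)v << (8 * idx));
}

/* Returns the value of c after CPU 0 stores b while CPU 1 stores c. */
uint8_t c_after_race(int have_byte_store)
{
	uint32_t t0;

	word = 0;
	if (have_byte_store) {
		store_byte(1, 0x55);		/* CPU 1: c = 0x55 */
		store_byte(0, 1);		/* CPU 0: stb, b = 1 */
	} else {
		t0 = word;			/* CPU 0: ldl */
		store_byte(1, 0x55);		/* CPU 1: c = 0x55 lands in between */
		t0 = (t0 & ~0xffu) | 0x1u;	/* CPU 0: andnot + or */
		word = t0;			/* CPU 0: stl writes back stale c, d, e */
	}
	return (uint8_t)(word >> 8);
}
```

In this model `c_after_race(0)` returns 0, because the word-wide store overwrites the neighbour's update, while `c_after_race(1)` returns 0x55.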

I still hate split caches though.

Regards,
Peter Hurley



Re: [PATCH 1/5] x86, mm, pat: Set WT to PA4 slot of PAT MSR

2014-09-04 Thread Andy Lutomirski
On Thu, Sep 4, 2014 at 5:29 PM, Toshi Kani  wrote:
> On Thu, 2014-09-04 at 16:34 -0700, Andy Lutomirski wrote:
>> On Thu, Sep 4, 2014 at 4:19 PM, Henrique de Moraes Holschuh
>>  wrote:
>> > On Thu, 04 Sep 2014, H. Peter Anvin wrote:
>> >> On 09/04/2014 01:11 PM, Henrique de Moraes Holschuh wrote:
>> >> > I am worried of uncharted territory, here.  I'd actually advocate for not
>> >> > enabling the upper four PAT entries on IA-32 at all, unless Windows 9X / XP
>> >> > is using them as well.  Is this a real concern, or am I being overly
>> >> > cautious?
>> >>
>> >> It is extremely unlikely that we'd have PAT issues in 32-bit mode and
>> >> not in 64-bit mode on the same CPU.
>> >
>> > Sure, but is it really a good idea to enable this on the *old* non-64-bit
>> > capable processors (note: I don't mean x86-64 processors operating in 32-bit
>> > mode) ?
>> >
>> >> As far as I know, the current blacklist rule is very conservative due to
>> >> lack of testing more than anything else.
>> >
>> > I was told that much in 2009 when I asked why cpuid 0x6d8 was blacklisted
>> > from using PAT :-)
>>
>> At the very least, anyone who plugs an NV-DIMM into a 32-bit machine
>> is nuts, and not just because I'd be somewhat amazed if it even
>> physically fits into the slot. :)
>
> According to the spec, the upper four entries bug was fixed in Pentium 4
> model 0x1.  So, the remaining Intel 32-bit processors that may enable
> the upper four entries are Pentium 4 model 0x1-4.  Should we disable it
> for all Pentium 4 models?

Assuming that this is Pentium 4 erratum N46, then there may be another
option: use slot 7 instead of slot 4 for WT.  Then, even if somehow
the blacklist screws up, the worst that happens is that a WT page gets
interpreted as UC.  I suppose this could cause aliasing issues, but
can't cause problems for people who don't use the high entries in the
first place.
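
The slot-7 argument can be sketched in C. This is a rough model, not the kernel's code: the table layout (WB, WC, UC-, UC in both halves) and the failure mode (the CPU dropping the PAT bit, so only the lower four entries are reachable) are assumptions made for the illustration, and all names are hypothetical.

```c
#include <assert.h>

enum pat_type { PAT_WB, PAT_WC, PAT_WT, PAT_UC_MINUS, PAT_UC };

/* PAT slot selected by the page table entry bits: index = PAT:PCD:PWT. */
unsigned pat_slot(unsigned pat, unsigned pcd, unsigned pwt)
{
	return (pat << 2) | (pcd << 1) | pwt;
}

/*
 * Memory type a WT mapping ends up with when WT lives in wt_slot (4 or 7).
 * If pat_bit_ignored is set, the CPU is modelled as dropping the PAT bit.
 */
enum pat_type effective_type(unsigned wt_slot, int pat_bit_ignored)
{
	/* Assumed layout of the other entries, repeated in both halves. */
	enum pat_type table[8] = {
		PAT_WB, PAT_WC, PAT_UC_MINUS, PAT_UC,
		PAT_WB, PAT_WC, PAT_UC_MINUS, PAT_UC,
	};
	unsigned slot = wt_slot;

	assert(wt_slot >= 4 && wt_slot < 8);
	table[wt_slot] = PAT_WT;
	if (pat_bit_ignored)
		slot &= 3;	/* slot 4 falls back to 0, slot 7 to 3 */
	return table[slot];
}
```

Under these assumptions a WT page in slot 4 would degrade to WB on an affected CPU, whereas a WT page in slot 7 would degrade to UC, which is slower but does not cache data that was meant to be written through.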

--Andy

>
> Thanks,
> -Toshi
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [PATCH V1 1/2] ARM: clk-imx6q: refine clock tree for ASRC

2014-09-04 Thread Shawn Guo
On Thu, Sep 04, 2014 at 05:48:58PM +0800, Shengjiu Wang wrote:
> ASRC has "asrc", "asrc_ipg", "asrc_mem" clocks, and they share
> the same gate bits.
> 
> Signed-off-by: Shengjiu Wang 

Applied both, thanks.


Re: + mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch added to -mm tree

2014-09-04 Thread Joonsoo Kim
On Wed, Sep 03, 2014 at 12:40:52PM -0700, a...@linux-foundation.org wrote:
> 
> The patch titled
>  Subject: mm: fix kmemcheck.c build errors
> has been added to the -mm tree.  Its filename is
>  mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
> 
> This patch should soon appear at
> 
> http://ozlabs.org/~akpm/mmots/broken-out/mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
> and later at
> 
> http://ozlabs.org/~akpm/mmotm/broken-out/mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2.patch
> 
> Before you just go and hit "reply", please:
>a) Consider who else should be cc'ed
>b) Prefer to cc a suitable mailing list as well
>c) Ideally: find the original patch on the mailing list and do a
>   reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> --
> From: Randy Dunlap 
> Subject: mm: fix kmemcheck.c build errors
> 
> Add header files to fix kmemcheck.c build errors:
> 
> ../mm/kmemcheck.c:70:7: error: dereferencing pointer to incomplete type
> ../mm/kmemcheck.c:83:15: error: dereferencing pointer to incomplete type
> ../mm/kmemcheck.c:95:8: error: dereferencing pointer to incomplete type
> ../mm/kmemcheck.c:95:21: error: dereferencing pointer to incomplete type
> 
> ../mm/slab.h: In function 'cache_from_obj':
> ../mm/slab.h:283:2: error: implicit declaration of function 'memcg_kmem_enabled' [-Werror=implicit-function-declaration]
> 
> Signed-off-by: Randy Dunlap 
> Cc: Joonsoo Kim 
> Signed-off-by: Andrew Morton 
> ---
> 
>  mm/kmemcheck.c |1 +
>  mm/slab.h  |2 ++
>  2 files changed, 3 insertions(+)
> 
> diff -puN mm/kmemcheck.c~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2 mm/kmemcheck.c
> --- a/mm/kmemcheck.c~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2
> +++ a/mm/kmemcheck.c
> @@ -2,6 +2,7 @@
>  #include 
>  #include 
>  #include 
> +#include "slab.h"
>  #include 
>  
>  void kmemcheck_alloc_shadow(struct page *page, int order, gfp_t flags, int 
> node)
> diff -puN mm/slab.h~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2 mm/slab.h
> --- a/mm/slab.h~mm-slab_common-move-kmem_cache-definition-to-internal-header-fix-2
> +++ a/mm/slab.h
> @@ -268,6 +268,8 @@ static inline void memcg_uncharge_slab(s
>  }
>  #endif
>  
> +#include 
> +
>  static inline struct kmem_cache *cache_from_obj(struct kmem_cache *s, void 
> *x)
>  {
>   struct kmem_cache *cachep;

Hello, Andrew.

Could you take another fix instead of this one?
This one also causes a build failure if CONFIG_MEMCG_KMEM=y.
Please see the following patch I sent.

https://lkml.org/lkml/2014/8/31/148

The only difference is the position of the memcontrol.h header.

Thanks.


Re: [PATCH] ARM: clk-imx6sl: correct the pxp and epdc axi clock selections

2014-09-04 Thread Shawn Guo
On Thu, Sep 04, 2014 at 04:33:12PM +0800, Fancy Fang wrote:
> The parent clocks of IMX6SL_CLK_PXP_AXI_SEL and IMX6SL_CLK_EPDC_AXI_SEL
> clocks are not the same. So split the epdc_pxp_sels into two different
> clock selections 'pxp_axi_sels' and 'epdc_axi_sels'.
> 
> Signed-off-by: Fancy Fang 
> Signed-off-by: Robby Cai 

Applied, thanks.


Re: [PATCH 1/5] x86, mm, pat: Set WT to PA4 slot of PAT MSR

2014-09-04 Thread Toshi Kani
On Thu, 2014-09-04 at 16:34 -0700, Andy Lutomirski wrote:
> On Thu, Sep 4, 2014 at 4:19 PM, Henrique de Moraes Holschuh
>  wrote:
> > On Thu, 04 Sep 2014, H. Peter Anvin wrote:
> >> On 09/04/2014 01:11 PM, Henrique de Moraes Holschuh wrote:
> >> > I am worried of uncharted territory, here.  I'd actually advocate for not
> >> > enabling the upper four PAT entries on IA-32 at all, unless Windows 9X / XP
> >> > is using them as well.  Is this a real concern, or am I being overly
> >> > cautious?
> >>
> >> It is extremely unlikely that we'd have PAT issues in 32-bit mode and
> >> not in 64-bit mode on the same CPU.
> >
> > Sure, but is it really a good idea to enable this on the *old* non-64-bit
> > capable processors (note: I don't mean x86-64 processors operating in 32-bit
> > mode) ?
> >
> >> As far as I know, the current blacklist rule is very conservative due to
> >> lack of testing more than anything else.
> >
> > I was told that much in 2009 when I asked why cpuid 0x6d8 was blacklisted
> > from using PAT :-)
> 
> At the very least, anyone who plugs an NV-DIMM into a 32-bit machine
> is nuts, and not just because I'd be somewhat amazed if it even
> physically fits into the slot. :)

According to the spec, the upper four entries bug was fixed in Pentium 4
model 0x1.  So, the remaining Intel 32-bit processors that may enable
the upper four entries are Pentium 4 model 0x1-4.  Should we disable it
for all Pentium 4 models?

Thanks,
-Toshi  



Re: [PATCH 1/2] x86: reimplement ___preempt_schedule*() using THUNK helpers

2014-09-04 Thread Andy Lutomirski
On Thu, Sep 4, 2014 at 6:42 AM, Oleg Nesterov  wrote:
>
> ___preempt_schedule() does SAVE_ALL/RESTORE_ALL but this is suboptimal,
> we do not need to save/restore the callee-saved register. And we already
> have arch/x86/lib/thunk_*.S which implements the similar asm wrappers,
> so it makes sense to redefine ___preempt_schedule() as "THUNK ..." and
> remove preempt.S altogether.

Reviewed-by: Andy Lutomirski 

>
> Signed-off-by: Oleg Nesterov 
> ---
>  arch/x86/kernel/Makefile  |2 --
>  arch/x86/kernel/preempt.S |   25 -
>  arch/x86/lib/thunk_32.S   |   20 
>  arch/x86/lib/thunk_64.S   |7 +++
>  4 files changed, 23 insertions(+), 31 deletions(-)
>  delete mode 100644 arch/x86/kernel/preempt.S
>
> diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
> index b5ea75c..58cf60b 100644
> --- a/arch/x86/kernel/Makefile
> +++ b/arch/x86/kernel/Makefile
> @@ -39,8 +39,6 @@ obj-y += tsc.o tsc_msr.o io_delay.o rtc.o
>  obj-y  += pci-iommu_table.o
>  obj-y  += resource.o
>
> -obj-$(CONFIG_PREEMPT)  += preempt.o
> -
>  obj-y  += process.o
>  obj-y  += i387.o xsave.o
>  obj-y  += ptrace.o
> diff --git a/arch/x86/kernel/preempt.S b/arch/x86/kernel/preempt.S
> deleted file mode 100644
> index ca7f0d5..000
> --- a/arch/x86/kernel/preempt.S
> +++ /dev/null
> @@ -1,25 +0,0 @@
> -
> -#include 
> -#include 
> -#include 
> -#include 
> -
> -ENTRY(___preempt_schedule)
> -   CFI_STARTPROC
> -   SAVE_ALL
> -   call preempt_schedule
> -   RESTORE_ALL
> -   ret
> -   CFI_ENDPROC
> -
> -#ifdef CONFIG_CONTEXT_TRACKING
> -
> -ENTRY(___preempt_schedule_context)
> -   CFI_STARTPROC
> -   SAVE_ALL
> -   call preempt_schedule_context
> -   RESTORE_ALL
> -   ret
> -   CFI_ENDPROC
> -
> -#endif
> diff --git a/arch/x86/lib/thunk_32.S b/arch/x86/lib/thunk_32.S
> index 28f85c9..7f1641a 100644
> --- a/arch/x86/lib/thunk_32.S
> +++ b/arch/x86/lib/thunk_32.S
> @@ -7,16 +7,19 @@
> #include 
> #include 
>
> -#ifdef CONFIG_TRACE_IRQFLAGS
> /* put return address in eax (arg1) */
> -   .macro thunk_ra name,func
> +   .macro thunk_ra name, func, put_ret_addr_in_eax=0
> .globl \name
>  \name:
> pushl %eax
> pushl %ecx
> pushl %edx
> +
> +   .if \put_ret_addr_in_eax
> /* Place EIP in the arg1 */
> movl 3*4(%esp), %eax
> +   .endif
> +
> call \func
> popl %edx
> popl %ecx
> @@ -25,6 +28,15 @@
> _ASM_NOKPROBE(\name)
> .endm
>
> -   thunk_ra trace_hardirqs_on_thunk,trace_hardirqs_on_caller
> -   thunk_ra trace_hardirqs_off_thunk,trace_hardirqs_off_caller
> +#ifdef CONFIG_TRACE_IRQFLAGS
> +   thunk_ra trace_hardirqs_on_thunk,trace_hardirqs_on_caller,1
> +   thunk_ra trace_hardirqs_off_thunk,trace_hardirqs_off_caller,1
> +#endif
> +
> +#ifdef CONFIG_PREEMPT
> +   THUNK ___preempt_schedule, preempt_schedule
> +#ifdef CONFIG_CONTEXT_TRACKING
> +   THUNK ___preempt_schedule_context, preempt_schedule_context
>  #endif
> +#endif
> +
> diff --git a/arch/x86/lib/thunk_64.S b/arch/x86/lib/thunk_64.S
> index 92d9fea..b30b5eb 100644
> --- a/arch/x86/lib/thunk_64.S
> +++ b/arch/x86/lib/thunk_64.S
> @@ -38,6 +38,13 @@
> THUNK lockdep_sys_exit_thunk,lockdep_sys_exit
>  #endif
>
> +#ifdef CONFIG_PREEMPT
> +   THUNK ___preempt_schedule, preempt_schedule
> +#ifdef CONFIG_CONTEXT_TRACKING
> +   THUNK ___preempt_schedule_context, preempt_schedule_context
> +#endif
> +#endif
> +
> /* SAVE_ARGS below is used only for the .cfi directives it contains. 
> */
> CFI_STARTPROC
> SAVE_ARGS
> --
> 1.5.5.1
>
>



-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [PATCHv4 0/3] new APIs to allocate buffer-cache with user specific flag

2014-09-04 Thread Gioh Kim



On 2014-09-05 7:16 AM, Andrew Morton wrote:
> On Thu,  4 Sep 2014 16:29:38 +0900 Gioh Kim  wrote:


>> This patch tries to solve the problem that long-lasting page caches of
>> the ext4 superblock and the superblock's journaling data disturb page
>> migration.
>>
>> I've been testing the CMA feature on my ARM-based platform
>> and found that two page caches cannot be migrated.
>> They are the page caches of the ext4 filesystem's superblock and its
>> journaling data.
>>
>> Current ext4 reads the superblock with sb_bread(), which allocates a page
>> from the movable area. But the problem is that ext4 holds the page until
>> it is unmounted. If the root filesystem is ext4, the page can never be
>> migrated.
>> The journaling data for the superblock cannot be migrated either.
>>
>> I introduce new APIs that allocate page cache with a specific flag passed
>> as an argument.
>> The *_gfp APIs are for users who want to set the page allocation flag for
>> page cache allocation.
>> The *_unmovable APIs are for users who want to allocate page cache from
>> the non-movable area.
>>
>> It is useful for ext4/ext3 and others that want to hold page cache for a
>> long time.


> Could we please have some detailed information about the real-world
> effect of this patchset?
>
> You earlier said "My test platform is currently selling item in the
> market.  And also I test my patch when my platform is working as if
> real user uses it.".


OK. I'll give as much detail as I can.
Please feel free to ask me for more information.

My platform is a TV with 1GB of system memory and 256MB of CMA memory.
I want to use the full 256MB of CMA memory.



> But what were the problems which were observed in standard kernels and
> what effect did this patchset have upon them?  Some quantitative
> measurements will really help here.


The problem is that I cannot allocate the entire CMA memory.
Actually, the problem does not appear without Joonsoo's patch:
https://lkml.org/lkml/2014/5/28/64.
Without it, the CMA memory stays free and every CMA allocation succeeds.

If Joonsoo's patch is applied, the CMA memory is generally allocated from as
soon as the system boots up.
Therefore the superblocks of mounted filesystems and their buffer caches are
allocated from CMA memory.

I have three ext4 partitions and one squash partition.
The squash filesystem has no problem. It holds buffer-cache only temporarily.

But each ext4 partition holds 2 buffer-caches until it is unmounted (one for
the sb and one for the journal),
so I found that 2 or 3 pages, the pages storing those buffer-caches, are busy
on my platform
when I try to allocate 256MB, the entire CMA memory.
So my allocation fails.

This patchset makes the long-lasting buffer-caches be allocated in the non-CMA
area.
Therefore I can always succeed in allocating the entire CMA memory.

Please tell me what I should measure quantitatively.
What I know is that every ext4 filesystem has 2 long-lasting buffer-caches
that are released when it is unmounted.

I applied this patch and tried to allocate the entire CMA memory almost 100
times.
It succeeded every time.



> I'm trying to get an understanding of how effective and important the
> change is, whether others will see similar benefits.  I'd also like to
> understand how *complete* the fix is - were the problems which you
> observed completely fixed, or do outstanding problems remain?


I think this patch benefits only systems that use the CMA or HOTPLUG feature.
As I mentioned above, the problem does not occur without Joonsoo's patch, which
allocates from the CMA area frequently.

If a system wants to use the CMA/HOTPLUG feature, I think this patch is very
important.
The problem involves only several pages, but several MB can be wasted once the
alignment of the allocation size is considered.
If the allocation size alignment is 16MB and one page is busy, 16MB can be
wasted.
For an embedded system like a TV, 16MB is a really big issue.
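
The alignment arithmetic can be sketched in C as follows. This is an illustration only: the function name `wasted_bytes` and its inputs (byte offsets of pinned pages inside the region) are hypothetical, and it simply counts every aligned chunk that contains at least one busy page as lost.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Bytes of a contiguous region that cannot be handed out in aligned
 * chunks because at least one page inside the chunk is busy.
 * busy_offsets[] holds byte offsets of pinned pages (hypothetical input).
 */
size_t wasted_bytes(const size_t *busy_offsets, size_t n,
		    size_t region, size_t align)
{
	size_t wasted = 0;

	assert(align > 0);
	for (size_t chunk = 0; chunk + align <= region; chunk += align) {
		for (size_t i = 0; i < n; i++) {
			if (busy_offsets[i] >= chunk &&
			    busy_offsets[i] < chunk + align) {
				wasted += align;	/* whole chunk is unusable */
				break;
			}
		}
	}
	return wasted;
}
```

With a 256MB region and 16MB alignment, a single busy page at offset 4096 makes the function return 16MB, and two busy pages that fall into different chunks make it return 32MB.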

I believe the problem is completely fixed with my patch.
I've tested many times over several days and reviewed the ext4 code that deals
with buffer-header.
I couldn't find any other problem.

I'm sorry to confuse you with my poor English.
Please ask me for whatever you need.

Next week is the Korean Thanksgiving holiday.
I think I can reply on Fri.



Thanks.


Re: [PATCH v3] zram: add num_discards for discarded pages stat

2014-09-04 Thread Minchan Kim
Hi Sergey,

On Fri, Sep 05, 2014 at 12:43:31AM +0900, Sergey Senozhatsky wrote:
> Hello,
> 
> On (09/04/14 11:25), Minchan Kim wrote:
> > Hello Sergey,
> > 
> > First of all, Sorry for late response.
> > 
> > On Tue, Aug 26, 2014 at 11:15:43PM +0900, Sergey Senozhatsky wrote:
> > > Hello,
> > > 
> > > On (08/26/14 14:08), Minchan Kim wrote:
> > > > Hi,
> > > > 
> > > > On Mon, Aug 25, 2014 at 08:01:18PM +0900, Sergey Senozhatsky wrote:
> > > > > Hello,
> > > > > 
> > > > > On (08/25/14 09:36), Minchan Kim wrote:
> > > > > > Hello Chao,
> > > > > > 
> > > > > > On Fri, Aug 22, 2014 at 04:21:01PM +0800, Chao Yu wrote:
> > > > > > > Since we have supported handling discard request in this commit
> > > > > > > f4659d8e620d08bd1a84a8aec5d2f5294a242764 (zram: support REQ_DISCARD),
> > > > > > > zram got one more chance to free unused memory whenever received
> > > > > > > discard request. But without stating for discard request, there is no
> > > > > > > method for user to know whether discard request has been handled by
> > > > > > > zram or how many blocks were discarded by zram when user wants to know
> > > > > > > the effect of discard.
> > > > > > 
> > > > > > My concern is that how much we are able to know the effect of discard
> > > > > > exactly with your patch.
> > > > > > 
> > > > > > The issue I can think of is zram-swap discard.
> > > > > > Now, zram handles notification from VM to free duplicated copy between
> > > > > > VM-owned memory and zRAM-owned's one so discarding for zram-swap might
> > > > > > be pointless overhead but your stat indicates lots of free page discarded
> > > > > > without real freeing 
> > > > > 
> > > > > this is why I've moved stats accounting to the place where actual
> > > > > zs_free() happens. and, frankly, I still would like to see the number
> > > > > of zs_free() calls, rather than the number of slot free notifications
> > > > > and REQ_DISCARD (or separately), because they all end up calling
> > > > > zs_free(). iow, despite the call path, from the user point of view
> > > > > they are just zs_free() -- the number of pages that's been freed by
> > > > > the 3rd party and we had have to deal with that.
> > > > 
> > > > My question is that what user can do with the only real freeing count?
> > > > Could you give me a concrete example?
> > > 
> > > for !swap device case it's identical to `num_discarded'.
> > > for swap device case, it's a bit more complicated (less convenient) if
> > > we actually can receive both slot free and delayed REQ_DISCARDs.
> > > 
> > > > It's a just number of real freeing count so if you were admin, what
> > > > do you expect from that? That's what I'd like to see in changelog.
> > > > 
> > > > > 
> > > > > > so that user might think "We should keep enable
> > > > > > swap discard for zRAM because the stat indicates it's really good".
> > > > > > 
> > > > > > In summary, wouldn't it better to have two?
> > > > > > 
> > > > > > num_discards,
> > > > > > num_failed_discards?
> > > > > 
> > > > > do we actually need this? the only value I can think of (perhaps I'm
> > > > > missing something) is that we can make sure that we need to support
> > > > > both slot free and REQ_DISCARDS, or we can leave only REQ_DISCARDS.
> > > > > is there anything else?
> > > > 
> > > > The scenario I imagined with two stat is how REQ_DISCARDS is effective
> > > > from swap layer. Normally, slot free logic is called in advance
> > > > when the page is zapped or swap read happens to avoid duplicate copy,
> > > > so discard request from swap space would be just overhead without
> > > > any benefit so we might guide zram-swap user don't use "swap -d".
> > > > Otherwise, as failed_discard ratio is low, it means it would be
> > > > better to remove swap slot free logic because swap discard works well
> > > > without slot free hint.(Although I don't think)
> > > 
> > > yes, so it looks like it is a developer's stat - to make some
> > > observations and to come up with some decisions. do we really
> > > want to put it into release?
> > 
> > Agree. I was too specific for my purpose and it couldn't be
> > a compelling reason to make it export for general purpose.
> > 
> > Actually, discard req sent by swap for getting free cluster
> > shouldn't be success(i,e num_discarded should be zero) because
> > zram_slot_free_notify will always free the duplicated copy
> > in advance so user don't have any gain with 'swapon -d'.
> > 
> > Now, I agree with you that we shouldn't add more stat without
> > compelling reason so it would be better to rename notify_free
> > with discarded and move it in zram_free_page like your patch.
> > https://lkml.org/lkml/2014/8/21/294
> > 
> > I will ask to Andrew to revert Chao's patch and pick your patch
> > after a few days unless Chao has another opinion.
> > 
> 
> no problem.
> 
> I, probably, was not clear enough. one of my objections was that
> it is 

Re: [PATCH v4 1/4] thermal: rockchip: add driver for thermal

2014-09-04 Thread Caesar Wang


On 2014-09-05 01:06, Dmitry Torokhov wrote:
> Hi Caesar,
>
> On Wed, Sep 03, 2014 at 10:10:36AM +0800, Caesar Wang wrote:
>> +static int rockchip_thermal_remove(struct platform_device *pdev)
>> +{
>> +   struct rockchip_thermal_data *data = platform_get_drvdata(pdev);
>> +
>> +   rockchip_thermal_control(data, false);
>> +
>> +   thermal_zone_device_unregister(data->tz);
>> +   cpufreq_cooling_unregister(data->cdev);
>> +   cpufreq_cooling_unregister(data->cdev);
>
> It looks like we are unregistering the same cooling device one too many times.
>
> Thanks.

Yes, I made a stupid mistake. Thanks.

--
Best regards,
Caesar




Re: [PATCH v2 05/17] locks: generic_delete_lease doesn't need a file_lock at all

2014-09-04 Thread Jeff Layton
On Thu, 4 Sep 2014 13:14:24 -0700
Christoph Hellwig  wrote:

> On Thu, Sep 04, 2014 at 08:38:31AM -0400, Jeff Layton wrote:
> > Ensure that it's OK to pass in a NULL file_lock double pointer on
> > a F_UNLCK request and convert the vfs_setlease F_UNLCK callers to
> > do just that.
> > 
> > Finally, turn the BUG_ON in generic_setlease into a WARN_ON_ONCE
> > with an error return. That's a problem we can handle without
> > crashing the box if it occurs.
> 
> Can we just make generic_delete_lease (maybe renamed to vfs_delete_lease)
> the interface for deleting leases instead of going through a useless
> multiplex and file operation?
> 

I'm not sure that change really makes sense to me at this point.

Suppose we have an exportable filesystem with a ->setlease
implementation [1]. We end up calling into it to set up a lease and it
calls generic_add_lease. If we make the change you're suggesting, then
we'll have no parallel to a ->setlease op when removing that lease.

We could of course make a ->dellease op or something, but I'd rather
not introduce that change until I've had a chance to do some other
cleanup to the file locking infrastructure.

So...I'm not opposed to doing what you suggest, but I'd rather not do it
just yet until I've gotten a little farther with some other cleanup of
how we deal with locks in general. I think it'll be easier to do that
once some other changes have gone in.

I'll post a draft patchset based on those changes "real soon now" as an
RFC. Hopefully at that point my rationale will make a bit more sense...

[1]: of course, only cifs has a non-trivial one for now and it's pretty
half-assed...

-- 
Jeff Layton 


Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Namhyung Kim
Hi Stephane,

On Thu, 4 Sep 2014 16:37:51 +0200, Stephane Eranian wrote:
> On Tue, Aug 26, 2014 at 8:38 AM, Namhyung Kim  wrote:
>>
>> There's a problem on finding correct kernel symbols when perf report
>> runs on a different kernel.  Although a part of the problem was solved
>> by the prior commit 0a7e6d1b6844 ("perf tools: Check recorded kernel
>> version when finding vmlinux"), there's a remaining problem still.
>>
>> When perf records samples, it synthesizes the kernel map using
>> machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
>> You can easily see it using 'perf report -D' command.
>>
>> After finishing record, it goes through the recorded events to find
>> maps/dsos actually used.  And then record build-id info of them.
>>
>> During this process, it needs to load symbols in a dso and it'd call
>> dso__load_vmlinux() since the default value of the
>> symbol_conf.try_vmlinux_path is true.  However it changes dso->long_name
>> to a real path of the vmlinux file (e.g.
>> /lib/modules/3.16.0-rc2+/build/vmlinux) if one is running on a custom
>> kernel.
>>
>> It resulted in that perf report reads the build-id of the vmlinux, but
>> cannot use it since it only knows about the [kernel.kallsyms] map.  It
>> then falls back to possible vmlinux paths by using the recorded kernel
>> version (in case of a recent version) or a running kernel silently
>> (which might break the result).  I think it's worth going to the
>> stable tree.
>>
>> I can think of a couple of ways to fix it.  In this patch, I changed
>> to use the name of "[kernel.kallsyms]" for the kernel build-id event
>> instead of not trying vmlinux paths.  This way we can provide maximum
>> info (like annotation) with minimum change IMHO.
>>
>> Before:
>>
>>   $ perf record -a usleep 1
>>
>>   $ perf buildid-list
>>   00d5ff078efe1d30b8492854f259215fd877ce30 /lib/modules/3.16.0-rc2+/build/vmlinux
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>
>>   $ perf buildid-list -H
>>    [kernel.kallsyms]
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>    /tmp/perf-2523.map
>>
> There is something I don't understand in your example above.  The -H
> option shows only DSO with samples. So why do you get the buildid
> without -H and you get no buildid with -H? In other words, I don't
> connect the dots between what -H does on the buildid change for the
> kernel. Looks like you have the buildid in the perf.data file.

Without -H, it just prints all DSOs found in the build-id table (rebuilt
while reading the perf data file header) and skips processing events.  But
with -H, it processes the event records and so sets the kernel map to
'[kernel.kallsyms]' - since the kernel mmap event always has that name -
and marks it as hit.  Thus the actual vmlinux can't be marked as hit and so
cannot be printed.

Hmm.. now I'm curious why the -H option is needed at all.. perf record
already wrote the build-ids that actually got hits..

Thanks,
Namhyung


[PATCH] dm cache: fix race causing dirty blocks to be marked as clean

2014-09-04 Thread Anssi Hannula
When a writeback or a promotion of a block is completed, the cell of
that block is removed from the prison, the block is marked as clean, and
the clear_dirty() callback of the cache policy is called.

Unfortunately, performing those actions in this order allows an incoming
new write bio for that block to come in before clearing the dirty status
is completed and therefore possibly causing one of these two scenarios:

Scenario A:

Thread 1                            Thread 2
cell_defer()                        .
- cell removed from prison          .
- detained bios queued              .
.                                   incoming write bio
.                                   remapped to cache
.                                   set_dirty() called,
.                                     but block already dirty
.                                     => it does nothing
clear_dirty()                       .
- block marked clean                .
- policy clear_dirty() called       .

Result: Block is marked clean even though it is actually dirty. No
writeback will occur.

Scenario B:

Thread 1                            Thread 2
cell_defer()                        .
- cell removed from prison          .
- detained bios queued              .
clear_dirty()                       .
- block marked clean                .
.                                   incoming write bio
.                                   remapped to cache
.                                   set_dirty() called
.                                   - block marked dirty
.                                   - policy set_dirty() called
- policy clear_dirty() called       .

Result: Block is properly marked as dirty, but policy thinks it is clean
and therefore never asks us to write it back.
This case is visible in "dmsetup status" dirty block count (which
normally decreases to 0 on a quiet device).

Fix these issues by calling clear_dirty() before calling cell_defer().
Incoming bios for that block will then be detained in the cell and
released only after clear_dirty() has completed, so the race will not
occur.

Found by inspecting the code after noticing spurious dirty counts
(scenario B).

Signed-off-by: Anssi Hannula 
Cc: Joe Thornber 
Cc: sta...@vger.kernel.org
---

> Unfortunately it seems there is some other potentially more serious bug
> still in there...

After looking through the code that indeed seems to be the case, as
explained above.

Unless I'm missing something?

I can't say with 100% certainty if this fixes the spurious counts I saw
since those took quite a long time (1-2 weeks?) to appear and the load
of that system is somewhat irregular.


 drivers/md/dm-cache-target.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index 1af40ee209e2..7130505c2425 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -895,8 +895,8 @@ static void migration_success_pre_commit(struct dm_cache_migration *mg)
 	struct cache *cache = mg->cache;
 
 	if (mg->writeback) {
-		cell_defer(cache, mg->old_ocell, false);
 		clear_dirty(cache, mg->old_oblock, mg->cblock);
+		cell_defer(cache, mg->old_ocell, false);
 		cleanup_migration(mg);
 		return;
 
@@ -951,13 +951,13 @@ static void migration_success_post_commit(struct dm_cache_migration *mg)
 		}
 
 	} else {
+		clear_dirty(cache, mg->new_oblock, mg->cblock);
 		if (mg->requeue_holder)
 			cell_defer(cache, mg->new_ocell, true);
 		else {
 			bio_endio(mg->new_ocell->holder, 0);
 			cell_defer(cache, mg->new_ocell, false);
 		}
-		clear_dirty(cache, mg->new_oblock, mg->cblock);
 		cleanup_migration(mg);
 	}
 }
-- 
1.8.4.5



Re: bit fields && data tearing

2014-09-04 Thread Paul E. McKenney
On Thu, Sep 04, 2014 at 03:16:03PM -0700, H. Peter Anvin wrote:
> On 09/04/2014 12:42 PM, Peter Hurley wrote:
> > 
> > Or we could give up on the Alpha.
> > 
> 
> If Alpha is turning into Voyager (kept alive only as a museum piece, but
> actively causing problems) then please let's kill it.

Sorry for being slow to join this thread, but I propose the following
patch.  If we can remove support for all CPUs that do not support
direct access to bytes and shorts (which I would very much like to
see happen), I will remove the last non-guarantee.

Thanx, Paul



documentation: Record limitations of bitfields and small variables

This commit documents the fact that it is not safe to use bitfields
as shared variables in synchronization algorithms.

Signed-off-by: Paul E. McKenney 

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 87be0a8a78de..a28bfe4fd759 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -269,6 +269,26 @@ And there are a number of things that _must_ or _must_not_ be assumed:
STORE *(A + 4) = Y; STORE *A = X;
STORE {*A, *(A + 4) } = {X, Y};
 
+And there are anti-guarantees:
+
+ (*) These guarantees do not apply to bitfields, because compilers often
+ generate code to modify these using non-atomic read-modify-write
+ sequences.  Do not attempt to use bitfields to synchronize parallel
+ algorithms.
+
+ (*) Even in cases where bitfields are protected by locks, all fields
+ in a given bitfield must be protected by one lock.  If two fields
+ in a given bitfield are protected by different locks, the compiler's
+ non-atomic read-modify-write sequences can cause an update to one
+ field to corrupt the value of an adjacent field.
+
+ (*) These guarantees apply only to properly aligned and sized scalar
+ variables.  "Properly sized" currently means "int" and "long",
+ because some CPU families do not support loads and stores of
+ other sizes.  ("Some CPU families" is currently believed to
+ be only Alpha 21064.  If this is actually the case, a different
+ non-guarantee is likely to be formulated.)
+
 
 =
 WHAT ARE MEMORY BARRIERS?
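
The second anti-guarantee in the patch above can be illustrated by spelling
out the read-modify-write sequence by hand.  A minimal sketch, assuming two
4-bit fields packed into one 32-bit word: `FIELD_A_MASK`, `merge_field_a()`,
`merge_field_b()` and `lost_update_demo()` are made-up names, and the
interleaving is simulated in a single thread rather than produced by a real
compiler on two CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define FIELD_A_MASK  0x0000000fu   /* bits 0-3 */
#define FIELD_B_MASK  0x000000f0u   /* bits 4-7 */
#define FIELD_B_SHIFT 4

/* Roughly what "s.a = val" becomes: merge val into an already loaded word. */
uint32_t merge_field_a(uint32_t word, uint32_t val)
{
	assert(val <= 0xfu);
	return (word & ~FIELD_A_MASK) | val;
}

/* Same for "s.b = val". */
uint32_t merge_field_b(uint32_t word, uint32_t val)
{
	assert(val <= 0xfu);
	return (word & ~FIELD_B_MASK) | (val << FIELD_B_SHIFT);
}

/*
 * CPU 0 updates field a (holding lock A) while CPU 1 updates field b
 * (holding lock B).  Both load the shared word before either one stores.
 * Returns the final value of the shared word.
 */
uint32_t lost_update_demo(uint32_t shared)
{
	uint32_t cpu0 = shared;            /* CPU 0 loads */
	uint32_t cpu1 = shared;            /* CPU 1 loads */

	cpu1 = merge_field_b(cpu1, 0x7);   /* CPU 1 modifies b ... */
	shared = cpu1;                     /* ... and stores */

	cpu0 = merge_field_a(cpu0, 0x3);   /* CPU 0 modifies its stale copy */
	shared = cpu0;                     /* store overwrites CPU 1's update */

	return shared;
}
```

Starting from a zeroed word, the sequence ends with field a set and field b
back at zero, even though each CPU held "its" lock the whole time.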



Re: [PATCH] genirq: Introduce irq_read_line()

2014-09-04 Thread Bjorn Andersson
On Tue, Aug 19, 2014 at 1:23 PM, Bjorn Andersson
 wrote:
> Introduce the irq_read_line() function to allow device drivers to read
> the current logical state of an input when the hardware only exposes
> this through status bits in the interrupt controller.
>
> The new function is backed by a new callback function in the irq_chip -
> irq_read_line() - that can be implemented by irq_chips that owns such
> status bits.
>
> Based on rfc patch from April 2011 by Abhijeet.
>
> Cc: Abhijeet Dharmapurikar 
> Signed-off-by: Bjorn Andersson 

ping?


Re: [PATCH] perf tools: Fix build-id matching on vmlinux

2014-09-04 Thread Namhyung Kim
Hi Arnaldo,

On Thu, 4 Sep 2014 11:18:45 -0300, Arnaldo Carvalho de Melo wrote:
> Em Tue, Aug 26, 2014 at 03:38:39PM +0900, Namhyung Kim escreveu:
>> There's a problem on finding correct kernel symbols when perf report
>> runs on a different kernel.  Although a part of the problem was solved
>> by the prior commit 0a7e6d1b6844 ("perf tools: Check recorded kernel
>> version when finding vmlinux"), there's a remaining problem still.
>> 
>> When perf records samples, it synthesizes the kernel map using
>> machine__mmap_name() and ref_reloc_sym like "[kernel.kallsyms]_text".
>> You can easily see it using 'perf report -D' command.
>> 
>> After finishing record, it goes through the recorded events to find
>> maps/dsos actually used.  And then record build-id info of them.
>> 
>> During this process, it needs to load symbols in a dso and it'd call
>> dso__load_vmlinux() since the default value of
>> symbol_conf.try_vmlinux_path is true.  However it changes
>> dso->long_name to a real path of the vmlinux file (e.g.
>> /lib/modules/3.16.0-rc2+/build/vmlinux) if one is running on a custom
>> kernel.
>> 
>> It resulted in that perf report reads the build-id of the vmlinux, but
>> cannot use it since it only knows about the [kernel.kallsyms] map.  It
>> then falls back to possible vmlinux paths by using the recorded kernel
>> version (in case of a recent version) or a running kernel silently
>> (which might break the result).  I think it's worth going to the
>> stable tree.
>> 
>> I can think of a couple of ways to fix it.  In this patch, I changed
>> to use the name of "[kernel.kallsyms]" for the kernel build-id event
>> instead of not trying vmlinux paths.  This way we can provide maximum
>> info (like annotation) with minimum change IMHO.
>> 
>> Before:
>> 
>>   $ perf record -a usleep 1
>> 
>>   $ perf buildid-list
>>   00d5ff078efe1d30b8492854f259215fd877ce30 /lib/modules/3.16.0-rc2+/build/vmlinux
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>> 
>>   $ perf buildid-list -H
>>    [kernel.kallsyms]
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>    /tmp/perf-2523.map
>> 
>> After:
>> 
>>   $ perf record -a usleep 1
>> 
>>   $ perf buildid-list
>>   00d5ff078efe1d30b8492854f259215fd877ce30 [kernel.kallsyms]
>
> We are losing information, namely the pathname for the kernel used, that
> may be useful in analysis.

Right.  That's a problem.


>
> Why not make sure that if there is a build-id in the perf.data header,
> then we completely refuse anything that doesn't match the build-id?
> I.e. the name is irrelevant for this purpose, the contents, as keyed by
> the build-id, is what matters.

perf report rebuilds the machine state from the event records only.  In
this case the kernel map was recorded under the name [kernel.kallsyms],
so it couldn't find the build-id in the table.
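
As a rough illustration of why the lookup fails, consider a table keyed by
DSO name.  This is not perf's actual data structure: `struct buildid_entry`
and `lookup_buildid()` are invented for the sketch, and only the build-id
and the two names come from the output quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical build-id table entry: one build-id per recorded DSO name. */
struct buildid_entry {
	const char *build_id;
	const char *name;
};

/* Return the build-id recorded under name, or NULL if there is none. */
const char *lookup_buildid(const struct buildid_entry *tbl, size_t n,
			   const char *name)
{
	assert(tbl != NULL && name != NULL);
	for (size_t i = 0; i < n; i++)
		if (strcmp(tbl[i].name, name) == 0)
			return tbl[i].build_id;
	return NULL;
}
```

With the table as written before the patch, the kernel's build-id sits
under the vmlinux path, so a lookup for "[kernel.kallsyms]" comes back
empty.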

Thanks,
Namhyung


>
> - Arnaldo
>
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>> 
>>   $ perf buildid-list -H
>>   00d5ff078efe1d30b8492854f259215fd877ce30 [kernel.kallsyms]
>>   78186287bba77069a056a5ccbeb14b7fd2ca3a4b /usr/lib64/libc-2.17.so
>>   4eadca6cb82e0a85edb87c15b5e3980742514501 /usr/lib64/ld-2.17.so
>>   1e272ca30081e81ef41935a630eb2f4c636798b4 /usr/lib64/dri/swrast_dri.so
>>    /tmp/perf-2523.map
>> 
>> Cc: sta...@vger.kernel.org
>> Signed-off-by: Namhyung Kim 
>> ---
>>  tools/perf/util/header.c | 3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
>> index 158c787ce0c4..5c4093dee467 100644
>> --- a/tools/perf/util/header.c
>> +++ b/tools/perf/util/header.c
>> @@ -263,6 +263,9 @@ static int __dsos__write_buildid_table(struct list_head *head,
>>  machine__mmap_name(machine, nm, sizeof(nm));
>>  name = nm;
>>  name_len = strlen(nm) + 1;
>> +} else if (dso__is_vmlinux(pos)) {
>> +name = pos->name;
>> +name_len = strlen(pos->name) + 1;
>>  } else {
>>  name = pos->long_name;
>>  name_len = pos->long_name_len + 1;
>> -- 
>> 2.0.0


Re: [RFC 3/3] zram: add swap_get_free hint

2014-09-04 Thread Minchan Kim
Hi Heesub,

On Thu, Sep 04, 2014 at 03:26:14PM +0900, Heesub Shin wrote:
> Hello Minchan,
> 
> First of all, I agree with the overall purpose of your patch set.

Thank you.

> 
> On 09/04/2014 10:39 AM, Minchan Kim wrote:
> >This patch implements a SWAP_GET_FREE handler in zram so that the VM
> >can know how much freeable space zram has.
> >The VM can use it to stop anonymous reclaim once zram is full.
> >
> >Signed-off-by: Minchan Kim 
> >---
> >  drivers/block/zram/zram_drv.c | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> >diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> >index 88661d62e46a..8e22b20aa2db 100644
> >--- a/drivers/block/zram/zram_drv.c
> >+++ b/drivers/block/zram/zram_drv.c
> >@@ -951,6 +951,22 @@ static int zram_slot_free_notify(struct block_device *bdev,
> > return 0;
> >  }
> >
> >+static int zram_get_free_pages(struct block_device *bdev, long *free)
> >+{
> >+struct zram *zram;
> >+struct zram_meta *meta;
> >+
> >+zram = bdev->bd_disk->private_data;
> >+meta = zram->meta;
> >+
> >+if (!zram->limit_pages)
> >+return 1;
> >+
> >+*free = zram->limit_pages - zs_get_total_pages(meta->mem_pool);
> 
> Even if 'free' is zero here, there may be free space available to
> store more compressed pages in the zs_pool. I mean the calculation
> above is not quite accurate and wastes memory, but I have no better
> idea for now.

Yep, good point.

Actually, I thought about that, but in this patchset I wanted to
go with a conservative approach as a safeguard against a
system hang, which is worse than an early OOM kill.

The whole point of this patchset is to add a facility to the VM so
that it can collaborate with zram via the interface to avoid the worst
case (i.e. a system hang).  The throttling logic could be enhanced by
several approaches in the future, but I agree my logic was too simple
and conservative.

We could improve it with [anti|de]fragmentation in the future, but
at the moment the simple heuristic below is not too bad for a first
step. :)
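
The heuristic in the patch below boils down to counting consecutive
allocation failures.  A rough userspace model, for illustration only:
`struct zram_model` and the `model_*` helpers are hypothetical, and only the
threshold of 32 and the return convention (0 with `*free = 0` once the
device looks full, 1 when there is no hint to give) are taken from the
patch.

```c
#include <assert.h>
#include <stddef.h>

#define FULL_THRESH_HOLD 32   /* same value as in the patch */

/* Hypothetical stand-in for the two struct zram fields the patch uses. */
struct zram_model {
	unsigned long limit_pages;  /* 0 means no limit configured */
	int alloc_fail;             /* consecutive failed writes */
};

/* A write hit the memory limit. */
void model_write_failed(struct zram_model *z)
{
	assert(z != NULL);
	z->alloc_fail++;
}

/* A write succeeded or a slot was freed: space may be available again. */
void model_progress(struct zram_model *z)
{
	assert(z != NULL);
	z->alloc_fail = 0;
}

/*
 * Returns 0 and sets *free to 0 once more than FULL_THRESH_HOLD writes
 * have failed in a row; returns 1 (no hint) otherwise.
 */
int model_get_free_pages(const struct zram_model *z, long *free)
{
	assert(z != NULL && free != NULL);
	if (z->limit_pages && z->alloc_fail > FULL_THRESH_HOLD) {
		*free = 0;
		return 0;
	}
	return 1;
}
```

A single successful write or freed slot resets the counter, so the "full"
hint only persists while every write keeps failing.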


---
 drivers/block/zram/zram_drv.c | 15 ++-
 drivers/block/zram/zram_drv.h |  1 +
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 8e22b20aa2db..af9dfe6a7d2b 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -410,6 +410,7 @@ static bool zram_free_page(struct zram *zram, size_t index)
atomic64_sub(zram_get_obj_size(meta, index),
&zram->stats.compr_data_size);
atomic64_dec(&zram->stats.pages_stored);
+   atomic_set(&zram->alloc_fail, 0);
 
meta->table[index].handle = 0;
zram_set_obj_size(meta, index, 0);
@@ -600,10 +601,12 @@ static int zram_bvec_write(struct zram *zram, struct bio_vec *bvec, u32 index,
alloced_pages = zs_get_total_pages(meta->mem_pool);
if (zram->limit_pages && alloced_pages > zram->limit_pages) {
zs_free(meta->mem_pool, handle);
+   atomic_inc(&zram->alloc_fail);
ret = -ENOMEM;
goto out;
}
 
+   atomic_set(&zram->alloc_fail, 0);
update_used_max(zram, alloced_pages);
 
cmem = zs_map_object(meta->mem_pool, handle, ZS_MM_WO);
@@ -951,6 +954,7 @@ static int zram_slot_free_notify(struct block_device *bdev,
return 0;
 }
 
+#define FULL_THRESH_HOLD 32
 static int zram_get_free_pages(struct block_device *bdev, long *free)
 {
struct zram *zram;
@@ -959,12 +963,13 @@ static int zram_get_free_pages(struct block_device *bdev, long *free)
zram = bdev->bd_disk->private_data;
meta = zram->meta;
 
-   if (!zram->limit_pages)
-   return 1;
-
-   *free = zram->limit_pages - zs_get_total_pages(meta->mem_pool);
+   if (zram->limit_pages &&
+   (atomic_read(&zram->alloc_fail) > FULL_THRESH_HOLD)) {
+   *free = 0;
+   return 0;
+   }
 
-   return 0;
+   return 1;
 }
 
 static int zram_swap_hint(struct block_device *bdev,
diff --git a/drivers/block/zram/zram_drv.h b/drivers/block/zram/zram_drv.h
index 779d03fa4360..182a2544751b 100644
--- a/drivers/block/zram/zram_drv.h
+++ b/drivers/block/zram/zram_drv.h
@@ -115,6 +115,7 @@ struct zram {
u64 disksize;   /* bytes */
int max_comp_streams;
struct zram_stats stats;
+   atomic_t alloc_fail;
/*
 * the number of pages zram can consume for storing compressed data
 */
-- 
2.0.0

> 
> heesub
> 
> >+
> >+return 0;
> >+}
> >+
> >  static int zram_swap_hint(struct block_device *bdev,
> > unsigned int hint, void *arg)
> >  {
> >@@ -958,6 +974,8 @@ static int zram_swap_hint(struct block_device *bdev,
> >
> > if (hint == SWAP_SLOT_FREE)
> > ret = zram_slot_free_notify(bdev, (unsigned long)arg);
> >+else if (hint == SWAP_GET_FREE)
> >+ret = zram_get_free_pages(bdev, arg);
> >
> > return ret;
> >  }
> >
> 

Re: [PATCH] kernel: add support for gcc 5

2014-09-04 Thread Joe Perches
On Fri, 2014-09-05 at 00:43 +0200, Hannes Frederic Sowa wrote:
> Most statements are already depending on GCC_VERSION, maybe we can just
> unify all gcc specific headers to one, still trying to keep the file
> organized? ;)

Maybe something like:

GNU development of gcc will be more frequent and the use of
compiler-gcc<n>.h likely will not be convenient anymore.

Integrate the individual compiler-gcc<n>.h files into
compiler-gcc.h.
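
The version checks in this patch rely on GCC_VERSION, which the kernel's
compiler-gcc.h computes as major * 10000 + minor * 100 + patchlevel.  A
small sketch of that encoding; `ENCODE_GCC_VERSION` and
`weak_is_miscompiled()` are illustrative names that do not exist in the
kernel.

```c
#include <assert.h>

/* Same arithmetic as GCC_VERSION, but with explicit arguments. */
#define ENCODE_GCC_VERSION(major, minor, patch) \
	((major) * 10000 + (minor) * 100 + (patch))

/* Mirrors the "GCC 4.1.[01] miscompiles __weak" range check in the patch. */
int weak_is_miscompiled(int version)
{
	assert(version >= 0);
	return version >= 40100 && version <= 40101;
}
```

With this encoding a single header can compare against any release, for
example `#if GCC_VERSION >= 50000`, instead of including one file per major
version.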
---
 include/linux/compiler-gcc.h  | 113 --
 include/linux/compiler-gcc3.h |  23 -
 include/linux/compiler-gcc4.h |  88 
 3 files changed, 109 insertions(+), 115 deletions(-)
 delete mode 100644 include/linux/compiler-gcc3.h
 delete mode 100644 include/linux/compiler-gcc4.h

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 02ae99e..ec73109 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -100,10 +100,115 @@
 #define __maybe_unused __attribute__((unused))
 #define __always_unused __attribute__((unused))
 
-#define __gcc_header(x) #x
-#define _gcc_header(x) __gcc_header(linux/compiler-gcc##x.h)
-#define gcc_header(x) _gcc_header(x)
-#include gcc_header(__GNUC__)
+/* gcc version specific checks */
+
+#if GCC_VERSION < 30200
+# error Sorry, your compiler is too old - please upgrade it.
+#endif
+
+#if GCC_VERSION < 30300
+# define __used __attribute__((__unused__))
+#else
+# define __used __attribute__((__used__))
+#endif
+
+#ifdef CONFIG_GCOV_KERNEL
+# if GCC_VERSION < 30400
+#   error "GCOV profiling support for gcc versions below 3.4 not included"
+# endif /* __GNUC_MINOR__ */
+#endif /* CONFIG_GCOV_KERNEL */
+
+#if GCC_VERSION >= 30400
+#define __must_check   __attribute__((warn_unused_result))
+#endif
+
+#if GCC_VERSION >= 40000
+
+/* GCC 4.1.[01] miscompiles __weak */
+#ifdef __KERNEL__
+# if GCC_VERSION >= 40100 &&  GCC_VERSION <= 40101
+#  error Your version of gcc miscompiles the __weak directive
+# endif
+#endif
+
+#define __used __attribute__((__used__))
+#define __compiler_offsetof(a,b) __builtin_offsetof(a,b)
+
+#if GCC_VERSION >= 40100 && GCC_VERSION < 40600
+# define __compiletime_object_size(obj) __builtin_object_size(obj, 0)
+#endif
+
+#if GCC_VERSION >= 40300
+/* Mark functions as cold. gcc will assume any path leading to a call
+   to them will be unlikely.  This means a lot of manual unlikely()s
+   are unnecessary now for any paths leading to the usual suspects
+   like BUG(), printk(), panic() etc. [but let's keep them for now for
+   older compilers]
+
+   Early snapshots of gcc 4.3 don't support this and we can't detect this
+   in the preprocessor, but we can live with this because they're unreleased.
+   Maketime probing would be overkill here.
+
+   gcc also has a __attribute__((__hot__)) to move hot functions into
+   a special section, but I don't see any sense in this right now in
+   the kernel context */
+#define __cold __attribute__((__cold__))
+
+#define __UNIQUE_ID(prefix) __PASTE(__PASTE(__UNIQUE_ID_, prefix), __COUNTER__)
+
+#ifndef __CHECKER__
+# define __compiletime_warning(message) __attribute__((warning(message)))
+# define __compiletime_error(message) __attribute__((error(message)))
+#endif /* __CHECKER__ */
+#endif /* GCC_VERSION >= 40300 */
+
+#if GCC_VERSION >= 40500
+/*
+ * Mark a position in code as unreachable.  This can be used to
+ * suppress control flow warnings after asm blocks that transfer
+ * control elsewhere.
+ *
+ * Early snapshots of gcc 4.5 don't support this and we can't detect
+ * this in the preprocessor, but we can live with this because they're
+ * unreleased.  Really, we need to have autoconf for the kernel.
+ */
+#define unreachable() __builtin_unreachable()
+
+/* Mark a function definition as prohibited from being cloned. */
+#define __noclone  __attribute__((__noclone__))
+
+#endif /* GCC_VERSION >= 40500 */
+
+#if GCC_VERSION >= 40600
+/*
+ * Tell the optimizer that something else uses this function or variable.
+ */
+#define __visible __attribute__((externally_visible))
+#endif
+
+/*
+ * GCC 'asm goto' miscompiles certain code sequences:
+ *
+ *   http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670
+ *
+ * Work it around via a compiler barrier quirk suggested by Jakub Jelinek.
+ * Fixed in GCC 4.8.2 and later versions.
+ *
+ * (asm goto is automatically volatile - the naming reflects this.)
+ */
+#define asm_volatile_goto(x...) do { asm goto(x); asm (""); } while (0)
+
+#ifdef CONFIG_ARCH_USE_BUILTIN_BSWAP
+#if GCC_VERSION >= 40400
+#define __HAVE_BUILTIN_BSWAP32__
+#define __HAVE_BUILTIN_BSWAP64__
+#endif
+#if GCC_VERSION >= 40800 || (defined(__powerpc__) && GCC_VERSION >= 40600)
+#define __HAVE_BUILTIN_BSWAP16__
+#endif
+#endif /* CONFIG_ARCH_USE_BUILTIN_BSWAP */
+
+#endif /* gcc version specific checks */
 
 #if !defined(__noclone)
 #define 

[git pull] drm fixes (resend + one patch)

2014-09-04 Thread Dave Airlie

Hi Linus,

Sent one yesterday; it bounced due to an LF mail failure
(linux-foundation.org), hopefully this one makes it.

i915 fixes, a few display regressions
vmwgfx, possible loop forever fix
nouveau, one userspace interface fix.

Dave.

The following changes since commit 59753a805499f1ffbca4ac0a24b3dff67bf1:

  Merge tag 'backlight-fixes-3.17' of 
git://git.kernel.org/pub/scm/linux/kernel/git/lee/backlight (2014-08-28 
10:47:10 -0700)

are available in the git repository at:


  git://people.freedesktop.org/~airlied/linux drm-fixes

for you to fetch changes up to 68c78bd67bd6f868474ac75d98ea7d6ebf28d2e7:

  Merge branch 'linux-3.17' of 
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes (2014-09-05 
09:27:33 +1000)



Ben Skeggs (1):
  drm/nouveau/core: don't leak oclass type bits to user

Dave Airlie (5):
  Merge tag 'drm-intel-fixes-2014-08-28' of 
git://anongit.freedesktop.org/drm-intel into drm-fixes
  drm/i915: handle G45/GM45 pulse detection connected state.
  Merge branch 'vmwgfx-fixes-3.17' of 
git://people.freedesktop.org/~thomash/linux into drm-fixes
  Merge tag 'drm-intel-fixes-2014-09-03' of 
git://anongit.freedesktop.org/drm-intel into drm-fixes
  Merge branch 'linux-3.17' of 
git://anongit.freedesktop.org/git/nouveau/linux-2.6 into drm-fixes

Mathias Krause (1):
  drm/i915: Remove bogus __init annotation from DMI callbacks

Paulo Zanoni (1):
  drm/i915: fix plane/cursor handling when runtime suspended

Scot Doyle (2):
  drm/i915: Ignore VBT backlight presence check on Acer C720 (4005U)
  drm/i915: don't warn if backlight unexpectedly enabled

Thomas Hellstrom (2):
  drm/vmwgfx: Fix an incorrect OOM return value
  drm/vmwgfx: Fix a potential infinite spin waiting for fifo idle

Ville Syrjälä (2):
  drm/i915: Move intel_ddi_set_vc_payload_alloc(false) to 
haswell_crtc_disable()
  drm/i915: Fix lock dropping in intel_tv_detect()

 drivers/gpu/drm/i915/intel_bios.c  |  2 +-
 drivers/gpu/drm/i915/intel_crt.c   |  2 +-
 drivers/gpu/drm/i915/intel_display.c   | 34 +++---
 drivers/gpu/drm/i915/intel_dp.c| 55 --
 drivers/gpu/drm/i915/intel_lvds.c  |  2 +-
 drivers/gpu/drm/i915/intel_panel.c |  8 ++---
 drivers/gpu/drm/i915/intel_tv.c| 10 --
 drivers/gpu/drm/nouveau/core/core/parent.c |  4 +--
 drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c| 11 +++---
 drivers/gpu/drm/vmwgfx/vmwgfx_fifo.c   |  3 +-
 10 files changed, 90 insertions(+), 41 deletions(-)

Re: [3.16.1 REGRESSION]: Simtec Entropy Key (cdc-acm) broken in 3.16

2014-09-04 Thread Nix
On 1 Sep 2014, Oliver Neukum stated:

>
>> I'll do a bisection of the cdc-acm changes since 3.15 tomorrow night and
>> see if I can find the commit at fault.
>
> Thank you for the report. Please let me know the results of your
> bisection.

Bisection underway (fifth attempt -- I *may* have characterized it well
enough after a few hours of thrashing at it to bisect accurately this
time).

Some more random info.

btw, when the Entropy Key has ended up in a messed up state due to this
bug, we sometimes see

[2.330158] usb 2-1: new full-speed USB device number 2 using ohci-pci
[2.552465] usb 2-1: device descriptor read/64, error -62
[2.870142] usb 2-1: device descriptor read/64, error -62
[3.190150] usb 2-1: new full-speed USB device number 3 using ohci-pci
[3.410137] usb 2-1: device descriptor read/64, error -62
[3.740142] usb 2-1: device descriptor read/64, error -62
[4.060146] usb 2-1: new full-speed USB device number 4 using ohci-pci
[4.520133] usb 2-1: device not accepting address 4, error -62
[4.730139] usb 2-1: new full-speed USB device number 5 using ohci-pci
[5.180117] usb 2-1: device not accepting address 5, error -62
[5.215194] hub 2-0:1.0: unable to enumerate USB device on port 1

when starting up a working kernel (the key then doesn't work until
physically disconnected and reconnected again).

More generally, the problem may be at *shutdown* -- something goes wrong
during link suspension or something, such that the link never comes up
again until physically reconnected. So a straight bisect is misleading
-- the error may have been in the *last* kernel tested -- and even then,
some kernels (e.g. the 3.15.0 merge base) appear capable of making it
work fine. But even this is not consistent: sometimes a kernel that
works fine if you repeatedly reboot it (such as 3.15) malfunctions when
you reboot into 3.16 -- but sometimes a newly plugged USB key on a 3.16
kernel malfunctions upon reboot, even if you reboot into a working
kernel such as 3.15 (and it then proceeds to work indefinitely if you
unplug and replug it and stick with 3.15.x, but upon rebooting into
3.16.x it goes wrong again).

So sometimes a faulty kernel makes the key go wrong when you restart
into another kernel (faulty or not), and sometimes it makes a key go
wrong when it is restarted into. There doesn't seem to be any
consistency to this that I've spotted, at least not yet.

Upon physical reconnection, the USB key works again, even on afflicted
kernels.

I'm working around this confusing morass by rebooting into each test
kernel, unplugging and replugging the entropy key if it was fubared,
then rebooting into the same kernel again and seeing if it was still
fubared. But this is not terribly fast, particularly not on a headless
compact-flash-based Geode box which doesn't even complete booting
without the entropy source which this bug cuts off :) so it'll be
sometime tomorrow before I can get this bisection done, I'm afraid.

