date:20130614

Re: [netlink] WARNING: at mm/vmalloc.c:1487 __vunmap()

2013-06-14 Thread Cong Wang

On Sat, Jun 15, 2013 at 6:01 AM, Fengguang Wu  wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit c05cdb1b864f548c0c3d8ae3b51264e6739a69b1
> Author: Pablo Neira Ayuso 
> Date:   Mon Jun 3 09:46:28 2013 +
>
> netlink: allow large data transfers from user-space

Hi, Fengguang,

Could you try the following quick fix?

diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 8978755..d8c6c03 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -750,7 +750,7 @@ static void netlink_skb_destructor(struct sk_buff *skb)
 		skb->head = NULL;
 	}
 #endif
-	if (is_vmalloc_addr(skb->head)) {
+	else if (is_vmalloc_addr(skb->head)) {
 		vfree(skb->head);
 		skb->head = NULL;
 	}
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 0/6] ipc/sem.c: performance improvements, FIFO

2013-06-14 Thread Mike Galbraith

On Sat, 2013-06-15 at 07:27 +0200, Manfred Spraul wrote:

> Assume there is one op (semctl(), whatever) that acquires the global 
> lock - and a continuous stream of simple ops.
> - spin_is_locked() returns true due to the semctl().
> - then simple ops will switch to spin_lock(&sma->sem_perm.lock).
> - since the spinlock is acquired, the next operation will get true from 
> spin_is_locked().
> 
> It will stay that way around - as long as there is at least one op 
> waiting for sma->sem_perm.lock.
> With enough cpus, it will stay like this forever.

Yup, pondered that yesterday, scratching my head over how to do better.
Hints highly welcome.  Maybe if I figure out how to scratch dual lock
thingy properly for -rt, non-rt will start acting sane too, as that spot
seems to be itchy in both kernels.

-Mike


Re: [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states

2013-06-14 Thread Tejun Heo

On Fri, Jun 14, 2013 at 10:35:22PM -0700, Tejun Heo wrote:
> On Fri, Jun 14, 2013 at 03:31:25PM -0700, Tejun Heo wrote:
> > I'll play with it a bit more on an actual machine and post more
> > results.  Test program attached.
> 
> So, here are the results from the same test on a dual-socket 2-way
> NUMA opteron 8 core machine.
> 
> Running on one CPU.
> 
>   copy size   atomic      percpu       diff in pct
>   0           535964443   616756827    +15.07%
>   32          399988186   378678713     -5.33%
>   64          389067476   355073979     -8.74%
>   128         342192631   315615300     -7.77%
>   256         281208005   260598931     -7.33%
>   512         188070912   193225269     +2.74%
> 
> Running on all eight cores.
> 
>   copy size   atomic      percpu       diff in pct
>   0           121324328   4889425511   +3,930.05%
>   32           96170193   2999613380   +3,019.07%
>   64           98139061   2813894184   +2,767.25%
>   128         112610025   2503229487   +2,122.92%
>   256          96828114   2069865752   +2,037.67%
>   512          95858297   1537726109   +1,504.17%

A bit of addition, this of course is completely synthetic and
exaggerates the differences both ways, but it's pretty clear that this
is gonna be a clear gain in any kind of workload which would generate
some amount of cross-CPU refcnting, which would be the norm anyway.

Thanks.

-- 
tejun

Re: [PATCH 11/11] cgroup: use percpu refcnt for cgroup_subsys_states

2013-06-14 Thread Tejun Heo

On Fri, Jun 14, 2013 at 03:31:25PM -0700, Tejun Heo wrote:
> I'll play with it a bit more on an actual machine and post more
> results.  Test program attached.

So, here are the results from the same test on a dual-socket 2-way
NUMA opteron 8 core machine.

Running on one CPU.

  copy size   atomic      percpu       diff in pct
  0           535964443   616756827    +15.07%
  32          399988186   378678713     -5.33%
  64          389067476   355073979     -8.74%
  128         342192631   315615300     -7.77%
  256         281208005   260598931     -7.33%
  512         188070912   193225269     +2.74%

Running on all eight cores.

  copy size   atomic      percpu       diff in pct
  0           121324328   4889425511   +3,930.05%
  32           96170193   2999613380   +3,019.07%
  64           98139061   2813894184   +2,767.25%
  128         112610025   2503229487   +2,122.92%
  256          96828114   2069865752   +2,037.67%
  512          95858297   1537726109   +1,504.17%

Ratio of all cores / single core.

  copy size   atomic   percpu
  0           0.23     7.93
  32          0.24     7.92
  64          0.25     7.92
  128         0.33     7.93
  256         0.34     7.94
  512         0.51     7.96

Not exactly 8 - the cores do share quite a bit of resources after all
- but pretty close.
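
For anyone who wants to re-derive the computed columns, here is a
minimal sketch in userspace C.  The helper names diff_pct() and
scaling_ratio() are made up for illustration and are not part of the
attached test program; the sketch only assumes that "diff in pct" is the
relative change of percpu over atomic and that the last table divides
the all-core figure by the single-CPU one.

```c
#include <assert.h>

/* Relative change of percpu over atomic, in percent
 * (the "diff in pct" column).  Hypothetical helper. */
double diff_pct(double atomic, double percpu)
{
	assert(atomic > 0.0);
	return (percpu / atomic - 1.0) * 100.0;
}

/* All-core figure divided by the single-CPU figure
 * (the ratio table).  Hypothetical helper. */
double scaling_ratio(double all_cores, double one_cpu)
{
	assert(one_cpu > 0.0);
	return all_cores / one_cpu;
}
```

With the copy size 0 rows, diff_pct(535964443, 616756827) comes out at
about 15.07, scaling_ratio(121324328, 535964443) at about 0.23 and
scaling_ratio(4889425511, 616756827) at about 7.93, which agrees with
the tables.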

Thanks.

-- 
tejun

Re: [PATCH 0/6] ipc/sem.c: performance improvements, FIFO

2013-06-14 Thread Manfred Spraul


On 06/14/2013 09:05 PM, Mike Galbraith wrote:

> 32 of 64 cores DL980 without the -rt killing goto again loop removal I
> showed you.  Unstable, not wonderful throughput.

Unfortunately the -rt approach is definitely unstable:

@@ -285,9 +274,29 @@ static inline int sem_lock(struct sem_ar
 		 * but have to wait for the global lock to be released.
 		 */
 		if (unlikely(spin_is_locked(&sma->sem_perm.lock))) {
-			spin_unlock(&sem->lock);
-			spin_unlock_wait(&sma->sem_perm.lock);
-			goto again;
+			spin_lock(&sma->sem_perm.lock);
+			if (sma->complex_count)
+				goto wait_array;
+
+			/*
+			 * Acquiring our sem->lock under the global lock
+			 * forces new complex operations to wait for us
+			 * to exit our critical section.
+			 */
+			spin_lock(&sem->lock);
+			spin_unlock(&sma->sem_perm.lock);


Assume there is one op (semctl(), whatever) that acquires the global 
lock - and a continuous stream of simple ops.

- spin_is_locked() returns true due to the semctl().
- then simple ops will switch to spin_lock(&sma->sem_perm.lock).
- since the spinlock is acquired, the next operation will get true from 
spin_is_locked().


It will stay that way around - as long as there is at least one op 
waiting for sma->sem_perm.lock.

With enough cpus, it will stay like this forever.

--
Manfred

Re: [PATCH] ncpfs: fix rmdir returns Device or resource busy

2013-06-14 Thread Al Viro

On Sat, Jun 15, 2013 at 06:09:39AM +0100, Al Viro wrote:

> BTW, in ncp_fill_cache() we have a provably pointless
> 	if (!ino)
> 		ino = find_inode_number(dentry, &qname);
> Check it out - any path that can lead there with ino == 0 will *not*
> have a positive dentry with such name, so this find_inode_number()
> call is just "waste some time and return 0".  Cargo-cult, plain and
> simple...

Incidentally, the only other caller of find_inode_number() is equally
pointless, so I'm very inclined to kill the damn function off.  Sure,
it's exported.  And I'm fairly sure that its out-of-tree users are
just as fishy (as in Innsmouth)...

Gumstix Overo Linux 3.10-rc5 overo-defconfig etc.

2013-06-14 Thread kernel-dev

I have been using Linux on Gumstix since around 2007.

I recently posted a request to help me bring an OEM
driver from 2.6.29 to 2.6.34, since that is the kernel
of the system I am developing user-space code on.

I have gotten 3.x kernels working on Gumstix 
in the past, but at some point forgot about the change 
from ttyS2 to ttyO2, and so gave up on that to finish my code.

Since I was informed that I need to make requests for assistance 
against current kernels, I recently built and booted 3.9.5, 3.9.6,
and 3.10-rc5 on a Gumstix Overo, on a Gallop-43 Breakout Board. 

I started with a 3.5 kernel that Gumstix publishes that
was built by the Yocto Project, extracted the .config,
then brought it forward with "make oldconfig" on the 
newer source.

That being said, I have a .config that works with 
current releases. I noticed that the overo-defconfig
was removed from arch/arm/configs/ at or around
kernel 3.0.

Any interest in putting it back?

Also, if need be I can set up and test an overo-defconfig
for all the 3.x stable releases.
 

Re: [PATCH] ncpfs: fix rmdir returns Device or resource busy

2013-06-14 Thread Al Viro

On Thu, Jun 13, 2013 at 11:19:26PM -0500, Dave Chiluk wrote:

> I'm afraid you are way beyond my current vfs experience level on this
> one.  While you're getting rid of things you might consider
> dentry_unhash as well, as only hpfs_unlink, ncp_rmdir, and ncp_rename
> call that.

The trouble is, hpfs_unlink() really wants it, so we probably won't be
able to kill that off.

> If you get a patch together, I'll do my best to test it.  Looks like
> only ncp_readdir calls that, so afaik, a few varying ls commands should
> be all that's needed for a test.

... combined with memory pressure and changes to directory, to test the
invalidation logic.

> Dave.
> p.s. are you sure you don't just want to just deprecate all of ncpfs?

Don't tempt me ;-)  As far as I'm concerned, everything NetWare-related is
best dealt with by fine folks from Miskatonic University, with all the precautions
due when working with spawn of the Old Ones...

Speaking of the madness and perversion: take a look at ncp_fill_cache().
What happens there is that we try to find or create a dentry according
to the directory entry we've got from server, then stuff a reference to
it into page cache of directory inode and call filldir for that sucker.
* if dentry allocation fails, we skip stuffing a reference into
page cache.  Result: garbage pointer left there.  _Another_ result:
if that happens more than page size / sizeof(pointer) times and then
we finally manage to allocate an entry (or just find one already in
dcache), we hit this:
	if (ctl.idx >= NCP_DIRCACHE_SIZE) {
		if (ctl.page) {
			kunmap(ctl.page);
			SetPageUptodate(ctl.page);
			unlock_page(ctl.page);
			page_cache_release(ctl.page);
		}
		ctl.cache = NULL;
		ctl.idx  -= NCP_DIRCACHE_SIZE;
		ctl.ofs  += 1;
		ctl.page  = grab_cache_page(&dir->i_data, ctl.ofs);
		if (ctl.page)
			ctl.cache = kmap(ctl.page);
	}
ctl.idx was being incremented on each entry, so now we are past
NCP_DIRCACHE_SIZE * 2.  We subtract NCP_DIRCACHE_SIZE, increment
ctl.ofs (page number), grab that page... and proceed to
	if (ctl.cache) {
		ctl.cache->dentry[ctl.idx] = newdent;
		valid = 1;
	}
which stuffs pointer past the end of that page.  And no, the caller
won't stop calling that on the first failure - if ->f_pos is large
enough, we'll record the failure in ctl.valid and have ncp_fill_cache()
return true - ctl.filled is false (== filldir hadn't told us to stop,
since we hadn't called it at all), so ctl.valid || !ctl.filled is
true.  IOW, the loop in caller will keep calling that sucker.
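
The index arithmetic behind that overflow can be checked outside the
kernel.  What follows is only a userspace sketch: DIRCACHE_SLOTS is an
assumed value (a 4096-byte page holding 8-byte pointers), the helper
names are hypothetical, and the model assumes the index advances by one
for every failed entry while the page-switch code runs only once a
dentry is finally obtained, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for illustration: 4096-byte page / 8-byte pointer. */
#define DIRCACHE_SLOTS 512

/* The adjustment from the snippet above: at most one subtraction. */
size_t adjust_once(size_t idx)
{
	if (idx >= DIRCACHE_SLOTS)
		idx -= DIRCACHE_SLOTS;
	return idx;
}

/* Index after `failed` entries that each bumped it without the
 * adjustment running, followed by one entry that does run it. */
size_t idx_after_failures(size_t start, size_t failed)
{
	assert(start < DIRCACHE_SLOTS);	/* we begin inside a page */
	return adjust_once(start + failed);
}

/* Nonzero if idx is a legal index into a DIRCACHE_SLOTS-entry array. */
int slot_is_valid(size_t idx)
{
	return idx < DIRCACHE_SLOTS;
}
```

Starting at slot 500, 20 failures leave the index at 8 after the single
subtraction, which is fine.  600 failures leave it at 588, so the store
into dentry[] would land past the end of the page.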

What's more, if ctl.valid is set to 0, there's no point bothering with
page cache anymore - it won't be used at all and on the next readdir()
we'll reread from scratch.

Even better, OOM for inode allocation is treated differently - we stuff
a reference to negative dentry into page cache, so that ncp_dget_fpos()
will find it, notice that it's negative and return NULL.  At which
point the caller will invalidate the damn cache and reread everything
from scratch.  Why bother stuffing it there at all?

BTW, in ncp_fill_cache() we have a provably pointless
	if (!ino)
		ino = find_inode_number(dentry, &qname);
Check it out - any path that can lead there with ino == 0 will *not*
have a positive dentry with such name, so this find_inode_number()
call is just "waste some time and return 0".  Cargo-cult, plain and
simple...

Grr...  When has that code been read the last time?

Re: [PATCH 1/2] ocfs2: Fix llseek() semantics and do some cleanup

2013-06-14 Thread Jeff Liu

[Add ocfs2-devel to CC-list]

Hello Richard,

Thanks for your patch.

On 06/15/2013 03:23 AM, Richard Yao wrote:

> There are multiple issues with the custom llseek implemented in ocfs2 for
> implementing SEEK_HOLE/SEEK_DATA.
> 
> 1. It takes the inode->i_mutex lock before calling generic_file_llseek(),
> which is unnecessary.

Agree, but please see my comments below.

> 
> 2. It fails to take the filp->f_lock spinlock before modifying filp->f_pos and
> filp->f_version, which differs from generic_file_llseek().
> 
> 3. It does an offset > inode->i_sb->s_maxbytes check that permits seeking up to
> the maximum file size possible on the ocfs2 filesystem, even when it is past
> the end of the file. Seeking beyond that (if possible), would return EINVAL
> instead of ENXIO.
> 
> 4. The switch statement tries to cover all whence values when in reality it
> should only care about SEEK_HOLE/SEEK_DATA. Any other cases should be passed
> to generic_file_llseek().

I have another patch set that refactors ocfs2_file_llseek(), but I have not
yet found time to run comprehensive tests.  It solves the existing issues and
also improves SEEK_DATA/SEEK_HOLE for unwritten extents, i.e.
OCFS2_EXT_UNWRITTEN.

With this change, SEEK_DATA and SEEK_HOLE go into separate functions, with a
little code duplication, instead of the current mix-up in
ocfs2_seek_data_hole_offset(), i.e.,

loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int origin)
{
	switch (origin) {
	case SEEK_END:
	case SEEK_CUR:
	case SEEK_SET:
		return generic_file_llseek(file, offset, origin);
	case SEEK_DATA:
		return ocfs2_seek_data(file, offset);
	case SEEK_HOLE:
		return ocfs2_seek_hole(file, offset);
	default:
		return -EINVAL;
	}
}

I personally like keeping SEEK_DATA/SEEK_HOLE in switch...case style rather
than dealing with them in a condition check block.

Thanks,
-Jeff

> 
> btrfs_file_llseek() and ocfs2_file_llseek() are extremely similar and
> consequently, contain many of the same flaws. Li Dongyang filed a pull
> request with ZFSOnLinux for SEEK_HOLE/SEEK_DATA support that included a
> custom llseek function that appears to have been modelled after the one
> in ocfs2. The similarity was strong enough that it suffered from many of
> the same flaws, which I caught during review. I addressed the issues
> with his patch with one that I wrote. Since a small percentage of Gentoo
> Linux users are affected by these flaws, I decided to adapt that code
> that to btrfs (separate patch) and ocfs2.
> 
> Note that commit 48802c8ae2a9d618ec734a61283d645ad527e06c by Jeff Liu at
> Oracle mostly addressed #3 in btrfs. The only lingering issue was that
> the offset > inode->i_sb->s_maxbytes check became dead code. The ocfs2
> code was not fortunate enough to have had a similar correction until
> now.
> 
> Signed-off-by: Richard Yao 
> ---
>  fs/ocfs2/file.c | 65 ++---
>  1 file changed, 25 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
> index ff54014..84f8c9c 100644
> --- a/fs/ocfs2/file.c
> +++ b/fs/ocfs2/file.c
> @@ -2615,54 +2615,39 @@ bail:
>  }
>  
>  /* Refer generic_file_llseek_unlocked() */
> -static loff_t ocfs2_file_llseek(struct file *file, loff_t offset, int whence)
> +static loff_t ocfs2_file_llseek(struct file *filp, loff_t offset, int whence)
>  {
> - struct inode *inode = file->f_mapping->host;
> - int ret = 0;
> + if (whence == SEEK_DATA || whence == SEEK_HOLE) {
> + struct inode *inode = filp->f_mapping->host;
> + int ret;
>  
> - mutex_lock(&inode->i_mutex);
> + if (offset < 0 && !(filp->f_mode & FMODE_UNSIGNED_OFFSET))
> + return -EINVAL;
>  
> - switch (whence) {
> - case SEEK_SET:
> - break;
> - case SEEK_END:
> - offset += inode->i_size;
> - break;
> - case SEEK_CUR:
> - if (offset == 0) {
> - offset = file->f_pos;
> - goto out;
> + if (offset >= i_size_read(inode)) {
> + return -ENXIO;
>   }
> - offset += file->f_pos;
> - break;
> - case SEEK_DATA:
> - case SEEK_HOLE:
> - ret = ocfs2_seek_data_hole_offset(file, &offset, whence);
> - if (ret)
> - goto out;
> - break;
> - default:
> - ret = -EINVAL;
> - goto out;
> - }
>  
> - if (offset < 0 && !(file->f_mode & FMODE_UNSIGNED_OFFSET))
> - ret = -EINVAL;
> - if (!ret && offset > inode->i_sb->s_maxbytes)
> - ret = -EINVAL;
> - if (ret)
> - goto out;
> + mutex_lock(&inode->i_mutex);
> + ret = ocfs2_seek_data_hole_offset(filp, &offset, whence);
> + mutex_unlock(&inode->i_mutex);
> +
> + if (ret) {
> + return ret;

[PATCH V2] dma: mmp_tdma: disable irq when disabling dma channel

2013-06-14 Thread Qiao Zhou

mask dma irq when disabling dma channel, so that interrupt status
will not be set and interrupt won't come again.

Signed-off-by: Qiao Zhou 
---
 drivers/dma/mmp_tdma.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/drivers/dma/mmp_tdma.c b/drivers/dma/mmp_tdma.c
index 43d5a6c..9b93665 100644
--- a/drivers/dma/mmp_tdma.c
+++ b/drivers/dma/mmp_tdma.c
@@ -154,6 +154,10 @@ static void mmp_tdma_disable_chan(struct mmp_tdma_chan *tdmac)
 {
 	writel(readl(tdmac->reg_base + TDCR) & ~TDCR_CHANEN,
 					tdmac->reg_base + TDCR);
+
+	/* disable irq */
+	writel(0, tdmac->reg_base + TDIMR);
+
 	tdmac->status = DMA_SUCCESS;
 }
 
-- 
1.7.0.4


[PATCH V2] dma: mmp_tdma: disable irq when disabling dma channel

2013-06-14 Thread Qiao Zhou

V1 -> V2:
1. Mask the DMA interrupt when disabling the DMA channel.
2. Remove patch v1.

If the DMA channel is disabled without masking the interrupt, the
interrupt status may still be set. The next time the DMA channel is
enabled, the stale interrupt status may wrongly trigger the interrupt.
We need to mask the interrupt when the DMA channel is disabled.

Qiao Zhou (1):
  dma: mmp_tdma: disable irq when disabling dma channel

 drivers/dma/mmp_tdma.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)


Re: [PATCH] tty/vt: Return EBUSY if deallocating VT1 and it is busy

2013-06-14 Thread Greg KH

On Fri, Jun 14, 2013 at 07:01:56PM -0400, Peter Hurley wrote:
> On 06/14/2013 06:24 PM, Ross Lagerwall wrote:
> >Commit 421b40a6286e ("tty/vt: Fix vc_deallocate() lock order") changed
> >the behavior when deallocating VT 1.  Previously if trying to
> >deallocate VT1 and it is busy, we would return EBUSY.  The commit
> >changed this to return 0 (success).
> >
> >This commit restores the old behavior.
> >
> >Signed-off-by: Ross Lagerwall 
> 
> Thanks.
> 
> Acked-by: Peter Hurley 

Thanks for this, I'll queue it up for Linus soon.

greg k-h

Re: [PATCH 3/3] i915: Don't provide ACPI backlight interface if firmware expects Windows 8

2013-06-14 Thread Matthew Garrett

On Sat, Jun 15, 2013 at 12:14:42PM +0800, Aaron Lu wrote:
> On 06/15/2013 09:38 AM, Matthew Garrett wrote:
> > Well, Windows 8 will only use the ACPI backlight interface if the GPU
> > driver decides to, right? So the logic for deciding whether to remove
> > the ACPI backlight control or not should be left up to the GPU. There's
> 
> I don't know this. From the document I googled, Microsoft suggests under
> win8, backlight should be controlled by the graphics driver for smooth
> brightness level change, instead of ACPI or other methods. So it is
> possible that OEM will not test the ACPI interface well and thus the
> interface is likely broken. I don't see why GPU driver has any better
> knowledge on which systems the firmware interface is broken or not.

The vendor will presumably have tested that backlight control works - if 
the GPU driver uses the ACPI interface and backlight control is broken, 
then the vendor would fix it.

> > no harm in refusing to expose a working method if there's another
> > working method, but there is harm in exposing a broken one and expecting
> > userspace to know the difference.
> 
> BTW, the proposed solution is not meant to solve win8 problems alone, it
> should make solving other problems easy and make individual backlight
> control interface provider module independent with each other such as
> the platform drivers will not need to check if ACPI video driver will
> control backlight or not and can always create backlight interface(its
> default priority is lower than ACPI video driver's so will not be taken
> by user space by default, showing the same behavior of the current code).

Sure, but it still requires the replacement of existing userspace. We 
need to fix the problem with existing userspace, too.

> The current acpi_backlight=video/vendor kernel command line is pretty
> misleading, for laptops that do not have vendor backlight interface,
> adding acpi_backlight=vendor actually makes the system using the GPU's
> interface. Some laptops are using this switch to work around problems in
> ACPI video driver and users think they are using vendor interface.
> That's why I think we need a new command line as the
> backlight.force_interface=raw/firmware/platform.

No, I think we need to fix the bugs that currently require users to pass 
options.

> Instead of letting individual driver to make decisions on which
> backlight interface this system should use(either in platform driver as
> we currently did, see acer-wmi and asus-wmi, or GPU driver as this case),
> I think we need a better and clear way to handle such things. For
> example, suppose an acer laptop: vendor does not support backlight, ACPI
> video's backlight interface is broken and GPU's works, to make it work,
> user will need to select acer-wmi module while this module does not have
> anything to do with the functionality, but simply because it serves as
> the backlight manager for acer laptops.

How do these machines work under Windows? Until we know the answer to 
that, we don't know what the correct way to handle the problem is under 
Linux.

-- 
Matthew Garrett | mj...@srcf.ucam.org

Re: [PATCH 3/3] i915: Don't provide ACPI backlight interface if firmware expects Windows 8

2013-06-14 Thread Aaron Lu

On 06/15/2013 09:38 AM, Matthew Garrett wrote:
> On Sat, 2013-06-15 at 09:26 +0800, Aaron Lu wrote:
>> It's not easy to decide if they work or not sometimes, e.g. I came
>> across a system that claims win8 in ACPI table and has an Intel GPU,
>> while its ACPI video interface also works. With this patch, the working
>> ACPI video interface is removed, while with the priority based solution,
>> the GPU's interface priority gets higher, but the ACPI video interface
>> still stays.
> 
> Well, Windows 8 will only use the ACPI backlight interface if the GPU
> driver decides to, right? So the logic for deciding whether to remove
> the ACPI backlight control or not should be left up to the GPU. There's

I don't know this. From the document I googled, Microsoft suggests that under
win8 the backlight should be controlled by the graphics driver for smooth
brightness level changes, instead of by ACPI or other methods. So it is
possible that OEMs will not test the ACPI interface well, and thus the
interface is likely broken. I don't see why the GPU driver would have any
better knowledge of which systems have a broken firmware interface.

> no harm in refusing to expose a working method if there's another
> working method, but there is harm in exposing a broken one and expecting
> userspace to know the difference.

BTW, the proposed solution is not meant to solve win8 problems alone. It
should make solving other problems easier and make the individual backlight
control interface providers independent of each other: the platform
drivers would no longer need to check whether the ACPI video driver will
control the backlight, and could always create a backlight interface (its
default priority is lower than the ACPI video driver's, so user space will
not pick it by default, which matches the behavior of the current code).

The current acpi_backlight=video/vendor kernel command line is pretty
misleading: for laptops that do not have a vendor backlight interface,
adding acpi_backlight=vendor actually makes the system use the GPU's
interface. Some laptops use this switch to work around problems in the
ACPI video driver, and users think they are using the vendor interface.
That's why I think we need a new command line option such as
backlight.force_interface=raw/firmware/platform.

Instead of letting individual drivers decide which backlight interface
the system should use (either in a platform driver as we currently do,
see acer-wmi and asus-wmi, or in the GPU driver as in this case), I think
we need a better and clearer way to handle such things. For example,
suppose an Acer laptop where the vendor does not support backlight
control, ACPI video's backlight interface is broken and the GPU's works.
To make it work, the user has to select the acer-wmi module, not because
the module has anything to do with the functionality, but simply because
it serves as the backlight manager for Acer laptops.

The above information and ideas were formed while solving bugs reported
in the kernel bugzilla video component. The proposed solution may not be good
enough, but I hope we can find a better way to handle backlight problems.

Thanks,
Aaron

Re: [PATCH RFC ticketlock] Auto-queued ticketlock

2013-06-14 Thread Waiman Long


On 06/14/2013 09:26 PM, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 14:17 -0400, Waiman Long wrote:
> > With some minor changes, the current patch can be modified to support
> > debugging lock for 32-bit system. For 64-bit system, we can apply a
> > similar concept for debugging lock with cmpxchg_double. However, for
> > architecture that does not have cmpxchg_double support, it will be out
> > of luck and we probably couldn't support the same feature in debugging
> > mode. It will have to fall back to taking the lock.
>
> That means only x86_64 and s390 would benefit from it ... I'm sure we can
> do better :-)
>
> Cheers,
> Ben.


On second thought, using cmpxchg_double may not be such a good idea 
after all as it requires a 16-byte alignment, at least for x86-64. 
Another possible alternative is to integrate the reference count 
directly into the spinlock_t data structure immediately after 
arch_spinlock_t for this special case. If CONFIG_GENERIC_LOCKBREAK is 
not defined, there will be a 4-byte hole that can be used. Otherwise, 
the spinlock_t structure will have an 8 byte size increase. I suppose 
that others won't be too upset for an 8-byte increase in size when 
spinlock debugging is turned on.
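
The size arithmetic can be sanity-checked in userspace with stand-in
structures.  This is only a sketch under stated assumptions: a 4-byte
lock word, 8-byte pointers with 8-byte alignment, and a debug layout of
two 32-bit fields followed by an owner pointer.  The struct and field
names are illustrative and are not copied from the kernel headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Debug lock without a break_lock field (assumed layout). */
struct dbg_lock {
	uint32_t raw_lock;
	uint32_t magic, owner_cpu;
	void *owner;
};

/* Same, with a 4-byte reference count right after the lock word. */
struct dbg_lock_ref {
	uint32_t raw_lock;
	uint32_t refcnt;
	uint32_t magic, owner_cpu;
	void *owner;
};

/* Debug lock with a break_lock field (assumed layout). */
struct dbg_lock_break {
	uint32_t raw_lock;
	uint32_t break_lock;
	uint32_t magic, owner_cpu;
	void *owner;
};

/* Same, with the reference count added. */
struct dbg_lock_break_ref {
	uint32_t raw_lock;
	uint32_t refcnt;
	uint32_t break_lock;
	uint32_t magic, owner_cpu;
	void *owner;
};

/* Bytes added by the reference count for a given pair of sizes. */
size_t growth(size_t with_ref, size_t without_ref)
{
	assert(with_ref >= without_ref);
	return with_ref - without_ref;
}

size_t growth_without_lockbreak(void)
{
	return growth(sizeof(struct dbg_lock_ref), sizeof(struct dbg_lock));
}

size_t growth_with_lockbreak(void)
{
	return growth(sizeof(struct dbg_lock_break_ref),
		      sizeof(struct dbg_lock_break));
}
```

On an LP64 build the first case grows by 0 bytes, because the count
fills the padding in front of the pointer, and the second grows by 8.
Other ABIs, or a lock word of a different size, will give different
numbers.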


Regards,
Longman

[git pull] vfs.git

2013-06-14 Thread Al Viro

Several fixes + obvious cleanup (you've missed a couple of
open-coded can_lookup() back then).  Please, pull from
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-linus

Shortlog:
Al Viro (2):
  use can_lookup() instead of direct checks of ->i_op->lookup
  snd_pcm_link(): fix a leak...

Dave Chiluk (1):
  ncpfs: fix rmdir returns Device or resource busy

Oleg Nesterov (2):
  fput: task_work_add() can fail if the caller has passed exit_task_work()
  move exit_task_namespaces() outside of exit_notify()

Diffstat:
 fs/file_table.c |   19 ++-
 fs/namei.c  |4 ++--
 fs/ncpfs/dir.c  |9 -
 kernel/exit.c   |2 +-
 sound/core/pcm_native.c |4 ++--
 5 files changed, 15 insertions(+), 23 deletions(-)

[git pull] Please pull powerpc.git merge branch

2013-06-14 Thread Benjamin Herrenschmidt

Hi Linus !

Hopefully this one smells better ...

So here are 3 fixes still for 3.10. Fixes are simple, bugs are nasty
(though not recent regressions, nasty enough) and all targeted at
stable. Please apply.

Thanks !

Cheers,
Ben.

The following changes since commit 34376a50fb1fa095b9d0636fa41ed2e73125f214:

  Fix lockup related to stop_machine being stuck in __do_softirq. (2013-06-10 17:46:57 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc.git merge

for you to fetch changes up to 230b3034793247f61e6a0b08c44cf415f6d92981:

  powerpc: Fix missing/delayed calls to irq_work (2013-06-15 12:33:30 +1000)


Benjamin Herrenschmidt (1):
  powerpc: Fix missing/delayed calls to irq_work

Michael Ellerman (1):
  powerpc: Fix stack overflow crash in resume_kernel when ftracing

Paul Mackerras (1):
  powerpc: Fix emulation of illegal instructions on PowerNV platform

 arch/powerpc/include/asm/exception-64s.h |2 +-
 arch/powerpc/kernel/exceptions-64s.S |2 +-
 arch/powerpc/kernel/irq.c|2 +-
 arch/powerpc/kernel/process.c|4 ++--
 arch/powerpc/kernel/traps.c  |   10 ++
 5 files changed, 15 insertions(+), 5 deletions(-)



Re: XFS (vdb): Corruption detected. Unmount and run xfs_repair

2013-06-14 Thread Dave Chinner

[cc x...@oss.sgi.com, where XFS bug reports should go]

On Sat, Jun 15, 2013 at 10:36:20AM +0800, Fengguang Wu wrote:
> Greetings,
> 
> I got the below dmesg in both upstream and linux-next, and the first
> bad commit *might be* commit 211d022c43ca ("xfs: Avoid pathological
> backwards allocation").
> 
> [   74.595386] 
> [   74.603826] CPU: 0 PID: 2137 Comm: kworker/0:1H Not tainted 3.10.0-rc1-00031-gade1335 #1508
> [   74.609255] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
> [   74.612498] Workqueue: xfslogd xfs_buf_iodone_work
> [   74.615690]  0001 880016815c68 81aa456f 880016815c88
> [   74.621548]  8130b179 81309514 0001 880016815cc8
> [   74.627417]  8130b1d0  02da 0016
> [   74.633321] Call Trace:
> [   74.635427]  [] dump_stack+0x19/0x1b
> [   74.638412]  [] xfs_error_report+0x3d/0x3f
> [   74.641627]  [] ? xfs_buf_iodone_work+0x4a/0x83
> [   74.644970]  [] xfs_corruption_error+0x55/0x71
> [   74.648217]  [] xfs_sb_read_verify+0xee/0x105
> [   74.651478]  [] ? xfs_buf_iodone_work+0x4a/0x83
> [   74.654820]  [] ? 
> ftrace_raw_event_workqueue_execute_start+0x92/0xa1
> [   74.659821]  [] xfs_buf_iodone_work+0x4a/0x83
> [   74.663042]  [] process_one_work+0x26c/0x470
> [   74.666296]  [] ? process_one_work+0x1ca/0x470
> [   74.669647]  [] worker_thread+0x1d0/0x2cb
> [   74.672770]  [] ? manage_workers.isra.19+0x1c3/0x1c3
> [   74.676201]  [] kthread+0xd5/0xdd
> [   74.679151]  [] ? trace_hardirqs_on+0xd/0xf
> [   74.682411]  [] ? __init_kthread_worker+0x5a/0x5a
> [   74.685776]  [] ret_from_fork+0x7c/0xb0
> [   74.688798]  [] ? __init_kthread_worker+0x5a/0x5a
> [   74.692206] XFS (vdb): Corruption detected. Unmount and run xfs_repair
> [   74.696000] XFS (vdb): SB validate failed with error 22.

EINVAL, which means there should have been some kind of output in
the log before the -corruption report- that explains why EINVAL was
returned.

> I'm not sure whether it's the first bad commit because

It's not, because it isn't in the upstream kernel and so if you are
seeing it in the upstream kernel, it can't be the cause. And,
besides:

> [   74.570969] XFS (vdb): bad magic number
> [   74.573837] 8800170ed000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.579266] 8800170ed010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.584581] 8800170ed020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.590036] 8800170ed030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.595386] XFS (vdb): Internal error xfs_sb_read_verify at line 730 of 
> file /c/kernel-tests/src/stable/fs/xfs/xfs_mount.c.  Caller 0x81309514
.
> [   74.692206] XFS (vdb): Corruption detected. Unmount and run xfs_repair
> [   74.696000] XFS (vdb): SB validate failed with error 22.

It's obviously not an XFS filesystem you are asking the kernel to
mount, so it's perfectly valid to throw a corruption error at you.
What it has actually thrown is EWRONGFS, but because you've asked
the kernel specifically to mount the device as an XFS filesystem,
the kernel is explicitly telling you that it's a corrupt
filesystem... :)
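
For anyone following along, the "bad magic number" check can be sketched
in user space. This is only an illustration, not the kernel code: the
helper name looks_like_xfs_sb() and the SKETCH_ prefix are made up, and
the only assumption is that an XFS superblock begins with the four bytes
"XFSB" (0x58465342, stored big-endian). A sector of all zeroes, like the
hex dump quoted above, fails the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* "XFSB" as a big-endian 32-bit value; the SKETCH_ name is illustrative. */
#define SKETCH_XFS_SB_MAGIC 0x58465342u

/*
 * Hypothetical helper (not a kernel function): return 1 if the buffer
 * starts with the XFS superblock magic, 0 otherwise.
 */
static int looks_like_xfs_sb(const unsigned char *buf, size_t len)
{
	uint32_t magic;

	assert(buf != NULL);

	/* Too short to hold the magic at all. */
	if (len < 4)
		return 0;

	/* Assemble the big-endian value byte by byte. */
	magic = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
		((uint32_t)buf[2] << 8) | (uint32_t)buf[3];

	return magic == SKETCH_XFS_SB_MAGIC;
}
```

Running something like this against the first sector of the test device
before calling mount would separate "never formatted" from "corrupted".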

> common.rc: retrying test device mount with external set
> [   74.782247] XFS (vdb): bad magic number
> [   74.784895] 8800170e7000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.790201] 8800170e7010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.795466] 8800170e7020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.800759] 8800170e7030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> 00  
> [   74.806031] XFS (vdb): Internal error xfs_sb_read_verify at line 730 of 
> file /c/kernel-tests/src/stable/fs/xfs/xfs_mount.c.  Caller 0x81309514

It still isn't an XFS filesystem :/

This looks like user error, not a bug.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com

[PATCH] libata: remove dead code from libata-acpi.c

2013-06-14 Thread Liu Jiang

From: Liu Jiang 

Commit 30dcf76acc69 "libata: migrate ACPI code over to new bindings"
removed ACPI dock notification related code, but there's some dead
code left, so clean it up.

Cc: Tejun Heo 
Cc: Matthew Garrett 
Cc: Aaron Lu 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Liu Jiang 
---
 drivers/ata/libata-acpi.c | 123 --
 1 file changed, 123 deletions(-)

diff --git a/drivers/ata/libata-acpi.c b/drivers/ata/libata-acpi.c
index 87f2f39..e50c987 100644
--- a/drivers/ata/libata-acpi.c
+++ b/drivers/ata/libata-acpi.c
@@ -91,129 +91,6 @@ acpi_handle ata_dev_acpi_handle(struct ata_device *dev)
 }
 EXPORT_SYMBOL(ata_dev_acpi_handle);
 
-/* @ap and @dev are the same as ata_acpi_handle_hotplug() */
-static void ata_acpi_detach_device(struct ata_port *ap, struct ata_device *dev)
-{
-   if (dev)
-   dev->flags |= ATA_DFLAG_DETACH;
-   else {
-   struct ata_link *tlink;
-   struct ata_device *tdev;
-
-   ata_for_each_link(tlink, ap, EDGE)
-   ata_for_each_dev(tdev, tlink, ALL)
-   tdev->flags |= ATA_DFLAG_DETACH;
-   }
-
-   ata_port_schedule_eh(ap);
-}
-
-/**
- * ata_acpi_handle_hotplug - ACPI event handler backend
- * @ap: ATA port ACPI event occurred
- * @dev: ATA device ACPI event occurred (can be NULL)
- * @event: ACPI event which occurred
- *
- * All ACPI bay / device realted events end up in this function.  If
- * the event is port-wide @dev is NULL.  If the event is specific to a
- * device, @dev points to it.
- *
- * Hotplug (as opposed to unplug) notification is always handled as
- * port-wide while unplug only kills the target device on device-wide
- * event.
- *
- * LOCKING:
- * ACPI notify handler context.  May sleep.
- */
-static void ata_acpi_handle_hotplug(struct ata_port *ap, struct ata_device *dev,
-   u32 event)
-{
-   struct ata_eh_info *ehi = &ap->link.eh_info;
-   int wait = 0;
-   unsigned long flags;
-
-   spin_lock_irqsave(ap->lock, flags);
-   /*
-* When dock driver calls into the routine, it will always use
-* ACPI_NOTIFY_BUS_CHECK/ACPI_NOTIFY_DEVICE_CHECK for add and
-* ACPI_NOTIFY_EJECT_REQUEST for remove
-*/
-   switch (event) {
-   case ACPI_NOTIFY_BUS_CHECK:
-   case ACPI_NOTIFY_DEVICE_CHECK:
-   ata_ehi_push_desc(ehi, "ACPI event");
-
-   ata_ehi_hotplugged(ehi);
-   ata_port_freeze(ap);
-   break;
-   case ACPI_NOTIFY_EJECT_REQUEST:
-   ata_ehi_push_desc(ehi, "ACPI event");
-
-   ata_acpi_detach_device(ap, dev);
-   wait = 1;
-   break;
-   }
-
-   spin_unlock_irqrestore(ap->lock, flags);
-
-   if (wait)
-   ata_port_wait_eh(ap);
-}
-
-static void ata_acpi_dev_notify_dock(acpi_handle handle, u32 event, void *data)
-{
-   struct ata_device *dev = data;
-
-   ata_acpi_handle_hotplug(dev->link->ap, dev, event);
-}
-
-static void ata_acpi_ap_notify_dock(acpi_handle handle, u32 event, void *data)
-{
-   struct ata_port *ap = data;
-
-   ata_acpi_handle_hotplug(ap, NULL, event);
-}
-
-static void ata_acpi_uevent(struct ata_port *ap, struct ata_device *dev,
-   u32 event)
-{
-   struct kobject *kobj = NULL;
-   char event_string[20];
-   char *envp[] = { event_string, NULL };
-
-   if (dev) {
-   if (dev->sdev)
-   kobj = &dev->sdev->sdev_gendev.kobj;
-   } else
-   kobj = &ap->dev->kobj;
-
-   if (kobj) {
-   snprintf(event_string, 20, "BAY_EVENT=%d", event);
-   kobject_uevent_env(kobj, KOBJ_CHANGE, envp);
-   }
-}
-
-static void ata_acpi_ap_uevent(acpi_handle handle, u32 event, void *data)
-{
-   ata_acpi_uevent(data, NULL, event);
-}
-
-static void ata_acpi_dev_uevent(acpi_handle handle, u32 event, void *data)
-{
-   struct ata_device *dev = data;
-   ata_acpi_uevent(dev->link->ap, dev, event);
-}
-
-static const struct acpi_dock_ops ata_acpi_dev_dock_ops = {
-   .handler = ata_acpi_dev_notify_dock,
-   .uevent = ata_acpi_dev_uevent,
-};
-
-static const struct acpi_dock_ops ata_acpi_ap_dock_ops = {
-   .handler = ata_acpi_ap_notify_dock,
-   .uevent = ata_acpi_ap_uevent,
-};
-
 /**
  * ata_acpi_dissociate - dissociate ATA host from ACPI objects
  * @host: target ATA host
-- 
1.8.1.2


Re: Regression in RCU subsystem in latest mainline kernel

2013-06-14 Thread Paul E. McKenney

On Fri, Jun 14, 2013 at 10:31:12PM -0400, Steven Rostedt wrote:
> On Sat, 2013-06-15 at 12:21 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> > > On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > > time either tonight or next week to figure it out.
> > > > 
> > > > Does this help ?
> > > 
> > > It did for me. Rojhalat, did this fix your issue too?
> > 
> > Ok, adding that to what I'm about to send to Linus.
> 
> Feel free to add: Tested-by: Steven Rostedt 

Thank you both!!!

Thanx, Paul


[PATCH] PCI: only WARN_ON() when pci_ioremap_bar() is called for io port BAR

2013-06-14 Thread Jiang Liu

Ideally callers should check that the BAR resource is available before
calling pci_ioremap_bar(), but no caller does that yet.
The WARN_ON() in pci_ioremap_bar() is there to warn the caller when it
is called for an IO port BAR, so skip it when the OS has failed to
allocate resources for the BAR. Otherwise it generates heavy log
messages like the ones below (the last line is actually enough for
analysis):
[  157.383390] [ cut here ]
[  157.383396] WARNING: at drivers/pci/pci.c:130 pci_ioremap_bar+0x69/0x70()
[  157.383397] Modules linked in: ata_generic pata_acpi pata_marvell fuse 
zram(C) rfcomm bnep snd_hda_codec_hdmi snd_hda_codec_realtek rtsx_pci_ms 
rtsx_pci_sdmmc memstick mmc_core iTCO_wdt iTCO_vendor_support arc4 uvcvideo 
iwldvm videobuf2_vmalloc videobuf2_memops mac80211 qcserial videobuf2_core 
usb_wwan videodev media usbserial btusb bluetooth iwlwifi coretemp kvm_intel 
kvm sony_laptop cfg80211 snd_hda_intel snd_hda_codec rfkill i915 pcspkr r8169 
joydev mii rtsx_pci snd_hwdep intel_agp snd_pcm intel_gtt i2c_algo_bit 
snd_page_alloc drm_kms_helper snd_timer drm lpc_ich snd agpgart i2c_i801 
mfd_core sha256_ssse3 sha256_generic dm_crypt raid0 md_mod crc32_pclmul 
crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 glue_helper lrw 
gf128mul ablk_helper cryptd xhci_hcd ehci_pci ehci_hcd dm_mirror dm_region_hash
[  157.383462]  dm_log dm_mod [last unloaded: microcode]
[  157.383469] CPU: 0 PID: 40 Comm: kworker/0:1 Tainted: GWC   
3.10.0-rc4 #7
[  157.383473] Hardware name: Sony Corporation VPCZ23A4R/VAIO, BIOS R1013H5 
05/21/2012
[  157.383478] Workqueue: kacpi_hotplug _handle_hotplug_event_bridge
[  157.383481]  81a2a114 880253d3f808 8165aab8 
880253d3f848
[  157.383487]  8103c8cb 880253d3f828 880253152800 
88022eb09000
[  157.383494]   880253153000 0001 
880253d3f858
[  157.383500] Call Trace:
[  157.383507]  [] dump_stack+0x19/0x1b
[  157.383513]  [] warn_slowpath_common+0x6b/0xa0
[  157.383519]  [] warn_slowpath_null+0x15/0x20
[  157.383524]  [] pci_ioremap_bar+0x69/0x70
[  157.383532]  [] azx_first_init+0x56/0x600 [snd_hda_intel]
[  157.383536]  [] ? pci_get_domain_bus_and_slot+0x2f/0x70
[  157.383540]  [] azx_probe+0x555/0x940 [snd_hda_intel]
[  157.383543]  [] ? trace_hardirqs_on+0xd/0x10
[  157.383546]  [] local_pci_probe+0x46/0x80
[  157.383549]  [] pci_device_probe+0xf9/0x120
[  157.383554]  [] driver_probe_device+0x76/0x220
[  157.383556]  [] __device_attach+0x4b/0x60
[  157.383559]  [] ? __driver_attach+0xb0/0xb0
[  157.383561]  [] bus_for_each_drv+0x5c/0x90
[  157.383564]  [] device_attach+0x98/0xb0
[  157.383566]  [] pci_bus_add_device+0x34/0x60
[  157.383569]  [] pci_bus_add_devices+0x39/0xa0
[  157.383571]  [] pci_bus_add_devices+0x87/0xa0
[  157.383573]  [] pci_bus_add_devices+0x87/0xa0
[  157.383575]  [] pci_bus_add_devices+0x87/0xa0
[  157.383577]  [] pci_bus_add_devices+0x87/0xa0
[  157.383580]  [] enable_device+0x370/0x450
[  157.383583]  [] acpiphp_enable_slot+0xca/0x140
[  157.383585]  [] acpiphp_check_bridge+0x77/0x100
[  157.383587]  [] _handle_hotplug_event_bridge+0x13d/0x290
[  157.383591]  [] process_one_work+0x1c2/0x560
[  157.383594]  [] ? process_one_work+0x157/0x560
[  157.383596]  [] worker_thread+0x116/0x370
[  157.383598]  [] ? manage_workers.isra.20+0x2d0/0x2d0
[  157.383601]  [] kthread+0xd6/0xe0
[  157.383604]  [] ? _raw_spin_unlock_irq+0x2b/0x60
[  157.383607]  [] ? __init_kthread_worker+0x70/0x70
[  157.383610]  [] ret_from_fork+0x7c/0xb0
[  157.383613]  [] ? __init_kthread_worker+0x70/0x70
[  157.383614] ---[ end trace f366acc9dc87b38a ]---
[  157.383616] hda-intel :16:00.1: ioremap error

Signed-off-by: Jiang Liu 
Cc: linux-...@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
---
 drivers/pci/pci.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index a899d8b..9288161 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -127,7 +127,8 @@ void __iomem *pci_ioremap_bar(struct pci_dev *pdev, int bar)
 * Make sure the BAR is actually a memory resource, not an IO resource
 */
if (!(pci_resource_flags(pdev, bar) & IORESOURCE_MEM)) {
-   WARN_ON(1);
+   if (pci_resource_flags(pdev, bar))
+   WARN_ON(1);
return NULL;
}
return ioremap_nocache(pci_resource_start(pdev, bar),
-- 
1.8.1.2
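
The decision the patch makes can be sketched in isolation. This is a
user-space illustration only: classify_bar() and the enum are made-up
names, the SKETCH_ flag values mirror IORESOURCE_IO and IORESOURCE_MEM
from include/linux/ioport.h, and it takes the patch's premise that a BAR
the OS failed to allocate shows up with zero resource flags.

```c
#include <assert.h>

/* Illustrative copies of the kernel's resource type bits. */
#define SKETCH_IORESOURCE_IO  0x00000100ul
#define SKETCH_IORESOURCE_MEM 0x00000200ul

/* Hypothetical outcome names, not part of the PCI core. */
enum bar_action {
	BAR_MAP,        /* memory BAR: go ahead and ioremap it */
	BAR_FAIL_WARN,  /* wrong kind of BAR: fail and warn the caller */
	BAR_FAIL_QUIET  /* unassigned BAR: fail without the backtrace */
};

static enum bar_action classify_bar(unsigned long flags)
{
	if (flags & SKETCH_IORESOURCE_MEM)
		return BAR_MAP;

	/* Flags are set but MEM is not: the caller passed e.g. an IO BAR. */
	if (flags)
		return BAR_FAIL_WARN;

	/* No flags at all: the resource was never assigned. */
	return BAR_FAIL_QUIET;
}
```

Only the middle case points at a driver bug, which is why it is the only
one that keeps the WARN_ON().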


Re: [GIT PULL] at91: soc updates for 3.11 #1

2013-06-14 Thread Olof Johansson

Hi,

On Fri, Jun 14, 2013 at 11:42:18PM +0200, Nicolas Ferre wrote:
> Arnd, Olof,
> 
> A little AT91 pull-request for patches that are more targeted to SoC/boards
> modifications. It is prepared on top of the arm-soc/at91/cleanup branch.
> 
> Thanks, best regards,
> 
> The following changes since commit b3f442b0eedbc20b5ce3f4a96530588d14901199:
> 
>   ARM: at91: udpate defconfigs (2013-05-17 15:05:08 +0200)
> 
> are available in the git repository at:
> 
>   git://github.com/at91linux/linux-at91.git tags/at91-soc
> 
> for you to fetch changes up to 7e75545ea7fb972c3da759f92c3d0be84d1cee72:
> 
>   ARM: at91: drop rm9200dk board support (2013-06-14 23:34:11 +0200)
> 
> 
> Two non critical fixes that can go in 3.11.
> An old board removed.
> 
> 
> Alexandre Belloni (1):
>   ARM: at91: Fix link breakage when !CONFIG_PHYLIB

Fix

> Jean-Christophe PLAGNIOL-VILLARD (1):
>   ARM: at91: drop rm9200dk board support

Cleanup

> Wenyou Yang (1):
>   ARM: at91: Change the internal SRAM memory type MT_MEMORY_NONCACHED

Fix

...assuming, of course, that none of the fixes are for errors introduced in
some branch we already pulled, since then they should go on top of that branch.


-Olof

[PATCH v10 6/8] spi: omap2-mcspi: add generic DMA request support to the DT binding

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

The binding definition is based on the generic DMA request binding

Signed-off-by: Matt Porter 
Signed-off-by: Joel A Fernandes 
---
 Documentation/devicetree/bindings/spi/omap-spi.txt |   27 +++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/Documentation/devicetree/bindings/spi/omap-spi.txt b/Documentation/devicetree/bindings/spi/omap-spi.txt
index 938809c..4c85c4c 100644
--- a/Documentation/devicetree/bindings/spi/omap-spi.txt
+++ b/Documentation/devicetree/bindings/spi/omap-spi.txt
@@ -10,7 +10,18 @@ Required properties:
  input. The default is D0 as input and
  D1 as output.
 
-Example:
+Optional properties:
+- dmas: List of DMA specifiers with the controller specific format
+   as described in the generic DMA client binding. A tx and rx
+   specifier is required for each chip select.
+- dma-names: List of DMA request names. These strings correspond
+   1:1 with the DMA specifiers listed in dmas. The string naming
+   is to be "rxN" and "txN" for RX and TX requests,
+   respectively, where N equals the chip select number.
+
+Examples:
+
+[hwmod populated DMA resources]
 
 mcspi1: mcspi@1 {
 #address-cells = <1>;
@@ -20,3 +31,17 @@ mcspi1: mcspi@1 {
 ti,spi-num-cs = <4>;
 };
 
+[generic DMA request binding]
+
+mcspi1: mcspi@1 {
+#address-cells = <1>;
+#size-cells = <0>;
+compatible = "ti,omap4-mcspi";
+ti,hwmods = "mcspi1";
+ti,spi-num-cs = <2>;
+dmas = <&edma 42
+&edma 43
+&edma 44
+&edma 45>;
+dma-names = "tx0", "rx0", "tx1", "rx1";
+};
-- 
1.7.9.5


[PATCH v10 1/8] ARM: edma: Add AM33XX support to the private EDMA API

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Adds support for parsing the TI EDMA DT data into the required EDMA
private API platform data. Enables runtime PM support to initialize
the EDMA hwmod. Enables build on OMAP.

Changes by Joel:
* Setup default one-to-one mapping for queue_priority and queue_tc
mapping as discussed in [1].
* Split out xbar stuff to separate patch. [1]

[1] https://patchwork.kernel.org/patch/2226761/

Signed-off-by: Matt Porter 
Acked-by: Sekhar Nori 
Signed-off-by: Joel A Fernandes 
---
 arch/arm/common/edma.c |  190 +---
 arch/arm/mach-omap2/Kconfig|1 +
 include/linux/platform_data/edma.h |4 +-
 3 files changed, 181 insertions(+), 14 deletions(-)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index a1db6cd..9823b79 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -24,6 +24,13 @@
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 #include 
 
@@ -1369,31 +1376,173 @@ void edma_clear_event(unsigned channel)
 EXPORT_SYMBOL(edma_clear_event);
 
 /*---*/
+static int edma_of_read_u32_to_s8_array(const struct device_node *np,
+const char *propname, s8 *out_values,
+size_t sz)
+{
+   int ret;
+
+   ret = of_property_read_u8_array(np, propname, out_values, sz);
+   if (ret)
+   return ret;
+
+   /* Terminate it */
+   *out_values++ = -1;
+   *out_values++ = -1;
+
+   return 0;
+}
+
+static int edma_of_read_u32_to_s16_array(const struct device_node *np,
+const char *propname, s16 *out_values,
+size_t sz)
+{
+   int ret;
+
+   ret = of_property_read_u16_array(np, propname, out_values, sz);
+   if (ret)
+   return ret;
+
+   /* Terminate it */
+   *out_values++ = -1;
+   *out_values++ = -1;
+
+   return 0;
+}
 
-static int __init edma_probe(struct platform_device *pdev)
+static int edma_of_parse_dt(struct device *dev,
+   struct device_node *node,
+   struct edma_soc_info *pdata)
+{
+   int ret = 0, i;
+   u32 value;
+   struct property *prop;
+   size_t sz;
+   struct edma_rsv_info *rsv_info;
+   const s16 (*rsv_chans)[2], (*rsv_slots)[2];
+   s8 (*queue_tc_map)[2], (*queue_priority_map)[2];
+
+   memset(pdata, 0, sizeof(struct edma_soc_info));
+
+   ret = of_property_read_u32(node, "dma-channels", &value);
+   if (ret < 0)
+   return ret;
+   pdata->n_channel = value;
+
+   ret = of_property_read_u32(node, "ti,edma-regions", &value);
+   if (ret < 0)
+   return ret;
+   pdata->n_region = value;
+
+   ret = of_property_read_u32(node, "ti,edma-slots", &value);
+   if (ret < 0)
+   return ret;
+   pdata->n_slot = value;
+
+   pdata->n_cc = 1;
+   pdata->n_tc = 3;
+
+   rsv_info =
+   devm_kzalloc(dev, sizeof(struct edma_rsv_info), GFP_KERNEL);
+   if (!rsv_info)
+   return -ENOMEM;
+   pdata->rsv = rsv_info;
+
+   queue_tc_map = devm_kzalloc(dev, 8*sizeof(s8), GFP_KERNEL);
+   if (!queue_tc_map)
+   return -ENOMEM;
+
+   for (i = 0; i < 3; i++)
+   queue_tc_map[i][0] = queue_tc_map[i][1] = i;
+   queue_tc_map[i][0] = queue_tc_map[i][1] = -1;
+
+   pdata->queue_tc_mapping = queue_tc_map;
+
+   queue_priority_map = devm_kzalloc(dev, 8*sizeof(s8), GFP_KERNEL);
+   if (!queue_priority_map)
+   return -ENOMEM;
+
+   for (i = 0; i < 3; i++)
+   queue_priority_map[i][0] = queue_priority_map[i][1] = i;
+   queue_priority_map[i][0] = queue_priority_map[i][1] = -1;
+
+   pdata->queue_priority_mapping = queue_priority_map;
+
+   pdata->default_queue = 0;
+
+
+   return ret;
+}
+
+static struct of_dma_filter_info edma_filter_info = {
+   .filter_fn = edma_filter_fn,
+};
+
+static int edma_probe(struct platform_device *pdev)
 {
struct edma_soc_info**info = pdev->dev.platform_data;
-   const s8(*queue_priority_mapping)[2];
-   const s8(*queue_tc_mapping)[2];
+   struct edma_soc_info*ninfo[EDMA_MAX_CC] = {NULL, NULL};
+   struct edma_soc_info tmpinfo;
+   s8  (*queue_priority_mapping)[2];
+   s8  (*queue_tc_mapping)[2];
int i, j, off, ln, found = 0;
int status = -1;
const s16   (*rsv_chans)[2];
const s16   (*rsv_slots)[2];
int irq[EDMA_MAX_CC] = {0, 0};
int err_irq[EDMA_MAX_CC] = {0, 0};
-   struct resource *r[EDMA_MAX_CC] = {NULL};
+
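
The default one-to-one queue mapping that edma_of_parse_dt() builds in
the hunk above can be tried out in user space. This is a sketch under
assumptions: fill_identity_map() is a made-up helper, s8 is typedef'd
locally, and the caller provides n + 1 rows so the {-1, -1} terminator
fits (the patch allocates 8 s8 values, i.e. four pairs, for n = 3).

```c
#include <assert.h>
#include <stddef.h>

typedef signed char s8;	/* local stand-in for the kernel type */

/*
 * Hypothetical helper: fill rows 0..n-1 with {i, i} and terminate the
 * table with {-1, -1}.  The map must have room for n + 1 rows.
 */
static void fill_identity_map(s8 (*map)[2], size_t n)
{
	size_t i;

	assert(map != NULL);

	for (i = 0; i < n; i++)
		map[i][0] = map[i][1] = (s8)i;

	/* i == n here, so this writes the terminator row. */
	map[i][0] = map[i][1] = -1;
}
```

The same loop shape serves both queue_tc_mapping and
queue_priority_mapping, since both default to the identity.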

[PATCH v10 7/8] spi: omap2-mcspi: convert to dma_request_slave_channel_compat()

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Convert dmaengine channel requests to use
dma_request_slave_channel_compat(). This supports the DT case of
platforms requiring channel selection from either the OMAP DMA or
the EDMA engine. AM33xx only boots from DT and is the only user
implementing EDMA so in the !DT case we can default to the OMAP DMA
filter.

Signed-off-by: Matt Porter 
Acked-by: Mark Brown 
Signed-off-by: Joel A Fernandes 
---
 drivers/spi/spi-omap2-mcspi.c |   64 -
 1 file changed, 44 insertions(+), 20 deletions(-)

diff --git a/drivers/spi/spi-omap2-mcspi.c b/drivers/spi/spi-omap2-mcspi.c
index 86d2158..ca4ab78 100644
--- a/drivers/spi/spi-omap2-mcspi.c
+++ b/drivers/spi/spi-omap2-mcspi.c
@@ -102,6 +102,9 @@ struct omap2_mcspi_dma {
 
struct completion dma_tx_completion;
struct completion dma_rx_completion;
+
+   char dma_rx_ch_name[14];
+   char dma_tx_ch_name[14];
 };
 
 /* use PIO for small transfers, avoiding DMA setup/teardown overhead and
@@ -830,12 +833,20 @@ static int omap2_mcspi_request_dma(struct spi_device *spi)
dma_cap_zero(mask);
dma_cap_set(DMA_SLAVE, mask);
sig = mcspi_dma->dma_rx_sync_dev;
-   mcspi_dma->dma_rx = dma_request_channel(mask, omap_dma_filter_fn, &sig);
+
+   mcspi_dma->dma_rx =
+   dma_request_slave_channel_compat(mask, omap_dma_filter_fn,
+&sig, &master->dev,
+mcspi_dma->dma_rx_ch_name);
if (!mcspi_dma->dma_rx)
goto no_dma;
 
sig = mcspi_dma->dma_tx_sync_dev;
-   mcspi_dma->dma_tx = dma_request_channel(mask, omap_dma_filter_fn, &sig);
+   mcspi_dma->dma_tx =
+   dma_request_slave_channel_compat(mask, omap_dma_filter_fn,
+&sig, &master->dev,
+mcspi_dma->dma_tx_ch_name);
+
if (!mcspi_dma->dma_tx) {
dma_release_channel(mcspi_dma->dma_rx);
mcspi_dma->dma_rx = NULL;
@@ -1256,29 +1267,42 @@ static int omap2_mcspi_probe(struct platform_device *pdev)
goto free_master;
 
for (i = 0; i < master->num_chipselect; i++) {
-   char dma_ch_name[14];
+   char *dma_rx_ch_name = mcspi->dma_channels[i].dma_rx_ch_name;
+   char *dma_tx_ch_name = mcspi->dma_channels[i].dma_tx_ch_name;
struct resource *dma_res;
 
-   sprintf(dma_ch_name, "rx%d", i);
-   dma_res = platform_get_resource_byname(pdev, IORESOURCE_DMA,
-   dma_ch_name);
-   if (!dma_res) {
-   dev_dbg(&pdev->dev, "cannot get DMA RX channel\n");
-   status = -ENODEV;
-   break;
-   }
+   sprintf(dma_rx_ch_name, "rx%d", i);
+   if (!pdev->dev.of_node) {
+   dma_res =
+   platform_get_resource_byname(pdev,
+IORESOURCE_DMA,
+dma_rx_ch_name);
+   if (!dma_res) {
+   dev_dbg(&pdev->dev,
+   "cannot get DMA RX channel\n");
+   status = -ENODEV;
+   break;
+   }
 
-   mcspi->dma_channels[i].dma_rx_sync_dev = dma_res->start;
-   sprintf(dma_ch_name, "tx%d", i);
-   dma_res = platform_get_resource_byname(pdev, IORESOURCE_DMA,
-   dma_ch_name);
-   if (!dma_res) {
-   dev_dbg(&pdev->dev, "cannot get DMA TX channel\n");
-   status = -ENODEV;
-   break;
+   mcspi->dma_channels[i].dma_rx_sync_dev =
+   dma_res->start;
}
+   sprintf(dma_tx_ch_name, "tx%d", i);
+   if (!pdev->dev.of_node) {
+   dma_res =
+   platform_get_resource_byname(pdev,
+IORESOURCE_DMA,
+dma_tx_ch_name);
+   if (!dma_res) {
+   dev_dbg(&pdev->dev,
+   "cannot get DMA TX channel\n");
+   status = -ENODEV;
+   break;
+   }
 
-   mcspi->dma_channels[i].dma_tx_sync_dev = dma_res->start;
+   mcspi->dma_channels[i].dma_tx_sync_dev =
+   dma_res->start;
+   }
}
 
if (status < 0)
-- 
1.7.9.5
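
The per-chip-select channel naming used by this patch ("rxN"/"txN" in a
14-byte buffer) can be sketched in user space as follows. The helper
format_dma_names() and the SKETCH_ constant are made up for
illustration, and the sketch uses snprintf() with a bounds check where
the driver uses plain sprintf().

```c
#include <assert.h>
#include <stdio.h>

/* Same size as dma_rx_ch_name/dma_tx_ch_name in the patch. */
#define SKETCH_DMA_NAME_LEN 14

/*
 * Hypothetical helper: write "rx<cs>" and "tx<cs>" into the two
 * buffers, each of which must hold SKETCH_DMA_NAME_LEN bytes.
 * Returns 0 on success, -1 if a name was cut short or failed to format.
 */
static int format_dma_names(int cs, char *rx, char *tx)
{
	int n;

	assert(rx != NULL && tx != NULL);

	n = snprintf(rx, SKETCH_DMA_NAME_LEN, "rx%d", cs);
	if (n < 0 || n >= SKETCH_DMA_NAME_LEN)
		return -1;

	n = snprintf(tx, SKETCH_DMA_NAME_LEN, "tx%d", cs);
	if (n < 0 || n >= SKETCH_DMA_NAME_LEN)
		return -1;

	return 0;
}
```

These are the strings that have to match the dma-names entries in the
device tree, which is why the patch keeps them in struct omap2_mcspi_dma
rather than in a scratch buffer on the stack.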


[PATCH v10 2/8] ARM: edma: Add AM33XX EDMA crossbar event mux support

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Changes by Joel:
* Split EDMA xbar support out of original EDMA DT parsing patch
to keep it easier for review.
* rewrite shift and offset calculation as per

Suggested-by: Sekhar Nori 
Suggested-by: Andy Shevchenko 
Signed-off-by: Joel A Fernandes 

Reference:
[1] https://patchwork.kernel.org/patch/2226991/
---
 arch/arm/common/edma.c |   59 
 include/linux/platform_data/edma.h |1 +
 2 files changed, 60 insertions(+)

diff --git a/arch/arm/common/edma.c b/arch/arm/common/edma.c
index 9823b79..1c2fb15 100644
--- a/arch/arm/common/edma.c
+++ b/arch/arm/common/edma.c
@@ -1410,6 +1410,52 @@ static int edma_of_read_u32_to_s16_array(const struct device_node *np,
return 0;
 }
 
+static int edma_xbar_event_map(struct device *dev,
+  struct device_node *node,
+  struct edma_soc_info *pdata, int len)
+{
+   int ret = 0;
+   int i;
+   struct resource res;
+   void *xbar;
+   const s16 (*xbar_chans)[2];
+   u32 shift, offset, mux;
+
+   xbar_chans = devm_kzalloc(dev,
+ len/sizeof(s16) + 2*sizeof(s16),
+ GFP_KERNEL);
+   if (!xbar_chans)
+   return -ENOMEM;
+
+   ret = of_address_to_resource(node, 1, &res);
+   if (ret)
+   return -EIO;
+
+   xbar = devm_ioremap(dev, res.start, resource_size(&res));
+   if (!xbar)
+   return -ENOMEM;
+
+   ret = edma_of_read_u32_to_s16_array(node,
+   "ti,edma-xbar-event-map",
+   (s16 *)xbar_chans,
+   len/sizeof(u32));
+   if (ret)
+   return -EIO;
+
+   for (i = 0; xbar_chans[i][0] != -1; i++) {
+   shift = (xbar_chans[i][1] & 0x03) << 3;
+   offset = xbar_chans[i][1] & 0xfffc;
+   mux = readl((void *)((u32)xbar + offset));
+   mux &= ~(0xff << shift);
+   mux |= xbar_chans[i][0] << shift;
+   writel(mux, (void *)((u32)xbar + offset));
+   }
+
+   pdata->xbar_chans = xbar_chans;
+
+   return 0;
+}
+
 static int edma_of_parse_dt(struct device *dev,
struct device_node *node,
struct edma_soc_info *pdata)
@@ -1470,6 +1516,9 @@ static int edma_of_parse_dt(struct device *dev,
 
pdata->default_queue = 0;
 
+   prop = of_find_property(node, "ti,edma-xbar-event-map", &sz);
+   if (prop)
+   ret = edma_xbar_event_map(dev, node, pdata, sz);
 
return ret;
 }
@@ -1489,6 +1538,7 @@ static int edma_probe(struct platform_device *pdev)
int status = -1;
const s16   (*rsv_chans)[2];
const s16   (*rsv_slots)[2];
+   const s16   (*xbar_chans)[2];
int irq[EDMA_MAX_CC] = {0, 0};
int err_irq[EDMA_MAX_CC] = {0, 0};
struct resource *r[EDMA_MAX_CC] = {NULL, NULL};
@@ -1617,6 +1667,15 @@ static int edma_probe(struct platform_device *pdev)
}
}
 
+   /* Clear the xbar mapped channels in unused list */
+   xbar_chans = info[j]->xbar_chans;
+   if (xbar_chans) {
+   for (i = 0; xbar_chans[i][1] != -1; i++) {
+   off = xbar_chans[i][1];
+   clear_bits(off, 1,
+   edma_cc[j]->edma_unused);
+   }
+   }
 
if (node)
irq[j] = irq_of_parse_and_map(node, 0);
diff --git a/include/linux/platform_data/edma.h b/include/linux/platform_data/edma.h
index 317f2be..57300fd 100644
--- a/include/linux/platform_data/edma.h
+++ b/include/linux/platform_data/edma.h
@@ -177,6 +177,7 @@ struct edma_soc_info {
 
s8  (*queue_tc_mapping)[2];
s8  (*queue_priority_mapping)[2];
+   const s16   (*xbar_chans)[2];
 };
 
 #endif
-- 
1.7.9.5
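
The shift and offset arithmetic in edma_xbar_event_map() is easier to
follow with the registers replaced by a plain array. This is a sketch
under assumptions: xbar_route() is a made-up name, the mux registers are
modelled as an array of 32-bit words with one byte lane per DMA channel,
and the low two bits of the channel number are cleared with ~0x03 to get
the byte offset of the word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of the crossbar mux update: store xbar_event in
 * the byte lane belonging to channel.  regs has nwords 32-bit words.
 * Returns 0 on success, -1 if the channel or event is out of range.
 */
static int xbar_route(uint32_t *regs, size_t nwords,
		      uint32_t xbar_event, uint32_t channel)
{
	uint32_t shift = (channel & 0x03u) << 3;  /* bit position of the lane */
	uint32_t offset = channel & ~0x03u;       /* byte offset of the word */
	size_t word = offset / 4;
	uint32_t mux;

	assert(regs != NULL);

	if (word >= nwords || xbar_event > 0xffu)
		return -1;

	/* Read-modify-write so the other three lanes are preserved. */
	mux = regs[word];
	mux &= ~(0xffu << shift);
	mux |= xbar_event << shift;
	regs[word] = mux;

	return 0;
}
```

With the map from the binding example, <1 12 2 13>, events 1 and 2 land
in the two lowest byte lanes of the word at byte offset 12.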


[PATCH v10 8/8] ARM: dts: add AM33XX SPI DMA support

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Adds DMA resources to the AM33XX SPI nodes.

Signed-off-by: Matt Porter 
Signed-off-by: Joel A Fernandes 
---
 arch/arm/boot/dts/am33xx.dtsi |   10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index f8a8f19..119f8a9 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -352,6 +352,11 @@
interrupts = <65>;
ti,spi-num-cs = <2>;
ti,hwmods = "spi0";
+   dmas = <&edma 16
+&edma 17
+&edma 18
+&edma 19>;
+   dma-names = "tx0", "rx0", "tx1", "rx1";
status = "disabled";
};
 
@@ -363,6 +368,11 @@
interrupts = <125>;
ti,spi-num-cs = <2>;
ti,hwmods = "spi1";
+   dmas = <&edma 42
+&edma 43
+&edma 44
+&edma 45>;
+   dma-names = "tx0", "rx0", "tx1", "rx1";
status = "disabled";
};
 
-- 
1.7.9.5


[PATCH v10 4/8] dmaengine: edma: enable build for AM33XX

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Enable TI EDMA option on OMAP.

Signed-off-by: Matt Porter 
Signed-off-by: Joel A Fernandes 
---
 drivers/dma/Kconfig |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index e992489..3215a3c 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -213,7 +213,7 @@ config SIRF_DMA
 
 config TI_EDMA
tristate "TI EDMA support"
-   depends on ARCH_DAVINCI
+   depends on ARCH_DAVINCI || ARCH_OMAP
select DMA_ENGINE
select DMA_VIRTUAL_CHANNELS
default n
-- 
1.7.9.5


[PATCH v10 3/8] ARM: dts: add AM33XX EDMA support

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

Adds AM33XX EDMA support to the am33xx.dtsi as documented in
Documentation/devicetree/bindings/dma/ti-edma.txt

Joel: Drop DT entries that are non-hardware-description for now as
discussed in [1]

[1] https://patchwork.kernel.org/patch/2226761/

Signed-off-by: Matt Porter 
Signed-off-by: Joel A Fernandes 
---
 arch/arm/boot/dts/am33xx.dtsi |   12 
 1 file changed, 12 insertions(+)

diff --git a/arch/arm/boot/dts/am33xx.dtsi b/arch/arm/boot/dts/am33xx.dtsi
index 70c86a0..f8a8f19 100644
--- a/arch/arm/boot/dts/am33xx.dtsi
+++ b/arch/arm/boot/dts/am33xx.dtsi
@@ -89,6 +89,18 @@
reg = <0x4820 0x1000>;
};
 
+   edma: edma@4900 {
+   compatible = "ti,edma3";
+   ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2";
+   reg =   <0x4900 0x1>,
+   <0x44e10f90 0x10>;
+   interrupts = <12 13 14>;
+   #dma-cells = <1>;
+   dma-channels = <64>;
+   ti,edma-regions = <4>;
+   ti,edma-slots = <256>;
+   };
+
gpio0: gpio@44e07000 {
compatible = "ti,omap4-gpio";
ti,hwmods = "gpio1";
-- 
1.7.9.5


[PATCH v10 5/8] dmaengine: edma: Add TI EDMA device tree binding

2013-06-14 Thread Joel A Fernandes

From: Matt Porter 

The binding definition is based on the generic DMA controller
binding.

Joel: Dropped reserved and queue DT entries from Documentation
for now from the original patch series.

Signed-off-by: Matt Porter 
Signed-off-by: Joel A Fernandes 
---
 Documentation/devicetree/bindings/dma/ti-edma.txt |   26 +
 1 file changed, 26 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/ti-edma.txt

diff --git a/Documentation/devicetree/bindings/dma/ti-edma.txt b/Documentation/devicetree/bindings/dma/ti-edma.txt
new file mode 100644
index 000..ada0018
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/ti-edma.txt
@@ -0,0 +1,26 @@
+TI EDMA
+
+Required properties:
+- compatible : "ti,edma3"
+- ti,hwmods: Name of the hwmods associated to the EDMA
+- ti,edma-regions: Number of regions
+- ti,edma-slots: Number of slots
+
+Optional properties:
+- ti,edma-xbar-event-map: Crossbar event to channel map
+
+Example:
+
+edma: edma@4900 {
+   reg = <0x4900 0x1>;
+   interrupt-parent = <>;
+   interrupts = <12 13 14>;
+   compatible = "ti,edma3";
+   ti,hwmods = "tpcc", "tptc0", "tptc1", "tptc2";
+   #dma-cells = <1>;
+   dma-channels = <64>;
+   ti,edma-regions = <4>;
+   ti,edma-slots = <256>;
+   ti,edma-xbar-event-map = <1 12
+ 2 13>;
+};
-- 
1.7.9.5


[PATCH] ARM: ux500: minor code cleanup

2013-06-14 Thread Olof Johansson

Clean up coding style a bit in cpu-db8500.

Signed-off-by: Olof Johansson 
---

Linus,

Noticed the last chunk of this patch when I resolved one of the recent
conflicts, so I did a once-over of the file. Feel free to drop the
first chunk when applying if you'd prefer to do a sweeping DEFINE_RES()
cleanup instead.

 arch/arm/mach-ux500/cpu-db8500.c | 19 ---
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c
index 27e5566..d25db87 100644
--- a/arch/arm/mach-ux500/cpu-db8500.c
+++ b/arch/arm/mach-ux500/cpu-db8500.c
@@ -95,11 +95,7 @@ void __init u8500_map_io(void)
 }
 
 static struct resource db8500_pmu_resources[] = {
-   [0] = {
-   .start  = IRQ_DB8500_PMU,
-   .end= IRQ_DB8500_PMU,
-   .flags  = IORESOURCE_IRQ,
-   },
+   [0] = DEFINE_RES_IRQ(IRQ_DB8500_PMU),
 };
 
 /*
@@ -281,8 +277,8 @@ static const struct of_device_id u8500_local_bus_nodes[] = {
/* only create devices below soc node */
{ .compatible = "stericsson,db8500", },
{ .compatible = "stericsson,db8500-prcmu", },
-   { .compatible = "simple-bus"},
-   { },
+   { .compatible = "simple-bus" },
+   {},
 };
 
 static void __init u8500_init_machine(void)
@@ -290,15 +286,16 @@ static void __init u8500_init_machine(void)
struct device *parent = db8500_soc_device_init();
 
/* Pinmaps must be in place before devices register */
-   if (of_machine_is_compatible("st-ericsson,mop500"))
+   if (of_machine_is_compatible("st-ericsson,mop500")) {
mop500_pinmaps_init();
-   else if (of_machine_is_compatible("calaosystems,snowball-a9500")) {
+   } else if (of_machine_is_compatible("calaosystems,snowball-a9500")) {
snowball_pinmaps_init();
mop500_snowball_ethernet_clock_enable();
-   } else if (of_machine_is_compatible("st-ericsson,hrefv60+"))
+   } else if (of_machine_is_compatible("st-ericsson,hrefv60+")) {
hrefv60_pinmaps_init();
-   else if (of_machine_is_compatible("st-ericsson,ccu9540")) {}
+   } else if (of_machine_is_compatible("st-ericsson,ccu9540")) {
/* TODO: Add pinmaps for ccu9540 board. */
+   }
 
/* automatically probe child nodes of dbx5x0 devices */
if (of_machine_is_compatible("st-ericsson,u8540"))
-- 
1.8.1.192.gc4361b8
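
For readers unfamiliar with the helper used in the first hunk, here is a
minimal userspace sketch of why the open-coded initializer and
DEFINE_RES_IRQ() describe the same resource. The struct, the macro body, and
IRQ_EXAMPLE_PMU below are simplified stand-ins written for this illustration;
they are assumptions, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct resource (assumption). */
struct resource {
	unsigned long start;
	unsigned long end;
	const char *name;
	unsigned long flags;
};

/* Stand-in flag value for this sketch. */
#define IORESOURCE_IRQ 0x00000400UL

/* Simplified model of the helper: a single IRQ, so start == end. */
#define DEFINE_RES_IRQ(_irq) \
	{ .start = (_irq), .end = (_irq), .name = NULL, .flags = IORESOURCE_IRQ }

/* Hypothetical IRQ number, chosen only for this example. */
#define IRQ_EXAMPLE_PMU 39

/* The style removed by the first hunk. */
static const struct resource open_coded[] = {
	[0] = {
		.start = IRQ_EXAMPLE_PMU,
		.end   = IRQ_EXAMPLE_PMU,
		.flags = IORESOURCE_IRQ,
	},
};

/* The style added by the first hunk. */
static const struct resource with_macro[] = {
	[0] = DEFINE_RES_IRQ(IRQ_EXAMPLE_PMU),
};

/* Returns 1 when both initializers describe the same resource. */
static int resources_match(const struct resource *a, const struct resource *b)
{
	return a->start == b->start && a->end == b->end &&
	       a->name == b->name && a->flags == b->flags;
}

/* Aborts if the two spellings ever diverge. */
static void check_equivalence(void)
{
	assert(resources_match(&open_coded[0], &with_macro[0]));
}
```

Calling check_equivalence() from a test program passes silently, which is the
sense in which the hunk is a pure style cleanup in this model.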


Re: BUG: unable to handle kernel NULL pointer dereference at 0000000000000040

2013-06-14 Thread Ming Lei

On Sat, Jun 15, 2013 at 1:07 AM, nirinA raseliarison
 wrote:

> patch applied and no longer have the bug message when i
> reboot and wake up the ethernet controller.

I am not sure Guenter's patch really fixes the race, so I'd like to see
Guenter's explanation of his patch.

The race is likely caused by the following sequence:

- the request timeout is triggered by the internal timer

- user space aborts the request before the timeout path reaches this line
in _request_firmware_load():

 fw_priv->buf = NULL

- the abort() called from firmware_loading_store() may then use a freed fw
buf, since the timeout path frees the fw buffer.

Since clearing 'fw_priv->buf' in _request_firmware_load() isn't protected
by fw_lock at the moment, Guenter's patch can't avoid the race entirely.
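
To illustrate the point about fw_lock, here is a minimal userspace sketch of
what "clearing the pointer under the lock" would look like. It is only a
model under stated assumptions: struct fw_priv_model, timeout_path() and
abort_path() are hypothetical names, a pthread mutex stands in for fw_lock,
and this is neither Guenter's patch nor the firmware loader's actual code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the loader's buffer and private data. */
struct fw_buf {
	int aborted;
};

struct fw_priv_model {
	pthread_mutex_t lock;	/* plays the role of fw_lock */
	struct fw_buf *buf;
};

/* Timeout path: detach the buffer under the lock, then free it. */
static void timeout_path(struct fw_priv_model *p)
{
	struct fw_buf *old;

	assert(p != NULL);
	pthread_mutex_lock(&p->lock);
	old = p->buf;
	p->buf = NULL;		/* cleared while holding the lock */
	pthread_mutex_unlock(&p->lock);
	free(old);		/* nobody can reach it through p any more */
}

/*
 * Abort path: touch the buffer only if it is still attached.
 * Returns 1 if the abort was delivered, 0 if the buffer was already gone.
 */
static int abort_path(struct fw_priv_model *p)
{
	int delivered = 0;

	assert(p != NULL);
	pthread_mutex_lock(&p->lock);
	if (p->buf) {
		p->buf->aborted = 1;
		delivered = 1;
	}
	pthread_mutex_unlock(&p->lock);
	return delivered;
}
```

In this model the abort path either sees a live buffer or sees NULL; it never
sees a pointer that the timeout path has already freed, because the check and
the clearing both happen under the same lock.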

Thanks,
--
Ming Lei

Re: Regression in RCU subsystem in latest mainline kernel

2013-06-14 Thread Steven Rostedt

On Sat, 2013-06-15 at 12:21 +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> > On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > > thing is, when I type on the console, it makes progress. Anyway, it
> > > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > > time either tonight or next week to figure it out.
> > > 
> > > Does this help ?
> > 
> > It did for me. Rojhalat, did this fix your issue too?
> 
> Ok, adding that to what I'm about to send to Linus.
> 

Feel free to add: Tested-by: Steven Rostedt 

-- Steve



[PATCH v10 0/8] DMA Engine support for AM33XX

2013-06-14 Thread Joel A Fernandes

This series is a repost of Matt Porter's EDMA patches for AM33XX EDMA support,
with changes addressing a few pending review comments on the v9 series.

Currently this is required for AM33XX (Beaglebone or EVM) to access MMC,
mount the rootfs, and boot to a command prompt over MMC.
Unless there are other pending review comments, I hope this series can
finally make it into the 3.11 merge window, though that is probably wishful
thinking at this point considering that the code movement patches below [1]
are pending review and Ack.

Tested EDMA on AM1808 EVM and AM33XX Beaglebone with MMC.

Sekhar Nori has posted a GIT PULL [1] which has 2 patches this series depends on:
ARM: davinci: move private EDMA API to arm/common
ARM: dts: add AM33XX MMC support
[1] http://www.spinics.net/lists/arm-kernel/msg251503.html

Changes since v9:
- Dropped reserved and queue DT entries from Documentation
for now from the original patch series.
- Drop DT entries that are non-hardware-description
- Split EDMA xbar support out of original EDMA DT parsing patch
to keep it easier for review.
- Rewrite shift and offset calculation xbar event mapping.
- Setup default one-to-one mapping for queue_priority and queue_tc
mapping as discussed in.
- Split out xbar stuff to separate patch.

Reference discussion:
   https://patchwork.kernel.org/patch/2226761/

Changes since v8:
- Removed edma node interrupt-parent property, it is inherited

Changes since v7:
- Dropped dmaengine compat() patch. It is upstream.
- Submitted edma_alloc_slot() error checking bug fix separately,
  now a dependency
- Fixed bisect issues due to 3/10 hunks that went into 1/10
- Fixed incorrect IS_ERRVALUE() use in 3/10

Changes since v6:
- Converted edma_of_read_*() to wrappers around of_property_read_*()
- Fixed wording on the omap-spi generic DMA properties
- Added comment/check to clarify that the driver only supports
  a single EDMA instance when booting from DT

Changes since v5:
- Dropped mmc portion and moved it to a separate series
- Incorporate corrected version of dma_request_slave_channel_compat()
- Fix #defines and enablement of TI_PRIV_EDMA option

Changes since v4:
- Fixed debug section mismatch in private edma api [01/14]
- Respun format-patch to catch the platform_data/edma.h rename [01/14]
- Removed address/size-cells from the EDMA binding [05/14]

Changes since v3:
- Rebased on 3.8-rc3
- No longer an RFC
- Fixed bugs in DT/pdata parsing reported by Vaibhav Bedia
- Restored all the Davinci pdata to const
- Removed max_segs hack in favor of using dma_get_channel_caps()
- Fixed extra parens, __raw_* accessors, and ioremap error checks
  in xbar handling
- Removed excess license info in platform_data/edma.h
- Removed unneeded reserved channels data for AM33xx
- Removed test-specific pinmuxing from dts files
- Adjusted mmc1 node to be disabled by default in the dtsi

Changes since v2:
- Rebased on 3.7-rc1
- Fixed bug in DT/pdata parsing first found by Gururaja
  that turned out to be masked by some toolchains
- Dropped unused mach-omap2/devices.c hsmmc patch
- Added AM33XX crossbar DMA event mux support
- Added am335x-evm support

Changes since v1:
- Rebased on top of mainline from 12250d8
- Dropped the feature removal schedule patch
- Implemented dma_request_slave_channel_compat() and
  converted the mmc and spi drivers to use it
- Dropped unneeded #address-cells and #size-cells from
  EDMA DT support
- Moved private EDMA header to linux/platform_data/ and
  removed some unneeded definitions
- Fixed parsing of optional properties

This series adds DMA Engine support for AM33xx, which uses
an EDMA DMAC. The EDMA DMAC has been previously supported by only
a private API implementation (much like the situation with OMAP
DMA) found on the DaVinci family of SoCs.

The series applies on top of 3.10-rc4.

The approach taken is similar to how OMAP DMA is being converted to
DMA Engine support. With the functional EDMA private API already
existing in mach-davinci/dma.c, we first move that to an ARM common
area so it can be shared. Adding DT and runtime PM support to the
private EDMA API implementation allows it to run on AM33xx. AM33xx
*only* boots using DT so the upstream generic DT DMA helpers are
leveraged to register EDMA DMAC with the of_dma framework. SPI (and
MMC in a separate series) are supported using the upstream
dma_request_slave_channel_compat() dmaengine call that allows
compatibility with !DT platforms.

With this series both BeagleBone and the AM335x EVM have working
SPI DMA support (and MMC support with the separate MMC series).

This is tested on BeagleBone with a SPI framebuffer driver and MMC

Re: [PATCH v3 0/6] KVM: MMU: fast invalidate all mmio sptes

2013-06-14 Thread Takuya Yoshikawa

On Thu, 13 Jun 2013 21:08:21 -0300
Marcelo Tosatti  wrote:

> On Fri, Jun 07, 2013 at 04:51:22PM +0800, Xiao Guangrong wrote:

> - Where is the generation number increased?

Looks like when a new slot is installed in update_memslots() because
it's based on slots->generation.  This is not restricted to "create"
and "move".

> - Should use spinlock breakable code in kvm_mmu_zap_mmio_sptes()
> (picture guest with 512GB of RAM, even walking all those pages is
> expensive) (ah, patch to remove kvm_mmu_zap_mmio_sptes does that).
> - Is -13 enough to test wraparound? Its highly likely the guest has 
> not began executing by the time 13 kvm_set_memory_calls are made
> (so no sptes around). Perhaps -2000 is more sensible (should confirm
> though).

In the future, after we've tested enough, we should change the testing
code to be executed only for some debugging configs.  Especially, if we
change zap_mmio_sptes() to zap_all_shadows(), very common guests, even
without huge memory like 512GB, can see the effect induced by sudden page
faults unnecessarily.

If necessary, developers can test the wraparound code by lowering the
max_gen itself anyway.

> - Why remove "if (change == KVM_MR_CREATE) || (change
> ==  KVM_MR_MOVE)" from kvm_arch_commit_memory_region?
> Its instructive.

There may be a chance that we miss generation wraparounds if we don't
check other cases: seems unlikely, but theoretically possible.

In short, all memory slot changes make mmio sptes stored in shadow pages
obsolete, or zapped for wraparounds, in the new way -- am I right?
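
The generation scheme summarized above can be modelled in a few lines. This
is a sketch under stated assumptions, not the KVM code: struct mmu_model, the
4-bit GEN_BITS width, and all function names are hypothetical and chosen only
so that wraparound is easy to reach.

```c
#include <assert.h>
#include <stdint.h>

#define GEN_BITS 4u				/* hypothetical width */
#define GEN_MASK ((1u << GEN_BITS) - 1u)

struct mmu_model {
	uint64_t slots_generation;	/* bumped on every memslot update */
	unsigned int zap_count;		/* how often everything was zapped */
};

/* Generation as seen by cached entries: truncated to GEN_BITS. */
static unsigned int current_gen(const struct mmu_model *m)
{
	return (unsigned int)(m->slots_generation & GEN_MASK);
}

/* Any memslot change bumps the generation; on wraparound, zap everything. */
static void memslot_changed(struct mmu_model *m)
{
	m->slots_generation++;
	if (current_gen(m) == 0)
		m->zap_count++;
}

/* A cached entry is usable only if its stamp matches the current generation. */
static int entry_is_valid(const struct mmu_model *m, unsigned int stamp)
{
	assert(stamp <= GEN_MASK);
	return stamp == current_gen(m);
}
```

Starting the counter a few steps below the wrap point, in the spirit of the
"-13" discussed above, exercises both cases quickly: the first change makes an
older stamp invalid, and the change that wraps the truncated generation
triggers the full zap.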

Takuya

Re: Regression in RCU subsystem in latest mainline kernel

2013-06-14 Thread Benjamin Herrenschmidt

On Fri, 2013-06-14 at 22:17 -0400, Steven Rostedt wrote:
> On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> > On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > > thing is, when I type on the console, it makes progress. Anyway, it
> > > seems that powerpc has an issue with irq_work(). I'll try to get some
> > > time either tonight or next week to figure it out.
> > 
> > Does this help ?
> 
> It did for me. Rojhalat, did this fix your issue too?

Ok, adding that to what I'm about to send to Linus.

Cheers,
Ben.



Re: Regression in RCU subsystem in latest mainline kernel

2013-06-14 Thread Steven Rostedt

On Sat, 2013-06-15 at 12:02 +1000, Benjamin Herrenschmidt wrote:
> On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> > I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> > thing is, when I type on the console, it makes progress. Anyway, it
> > seems that powerpc has an issue with irq_work(). I'll try to get some
> > time either tonight or next week to figure it out.
> 
> Does this help ?

It did for me. Rojhalat, did this fix your issue too?

-- Steve

> 
> diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
> index 5cbcf4d..ea185e0 100644
> --- a/arch/powerpc/kernel/irq.c
> +++ b/arch/powerpc/kernel/irq.c
> @@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
>* in case we also had a rollover while hard disabled
>*/
>   local_paca->irq_happened &= ~PACA_IRQ_DEC;
> - if (decrementer_check_overflow())
> + if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
>   return 0x900;
>  
>   /* Finally check if an external interrupt happened */
> 
> Cheers,
> Ben.
> 



Re: [PATCH 1/2] Initial support for Allwinner's Security ID fuses

2013-06-14 Thread Andy Shevchenko

On Sat, Jun 15, 2013 at 2:16 AM, Oliver Schinagl
 wrote:
> From: Oliver Schinagl 
>
> Allwinner has electric fuses (efuse) on their line of chips. This driver
> reads those fuses, seeds the kernel entropy and exports them as a sysfs node.
>
> These fuses are most likly to be programmed at the factory, encoding
> things like Chip ID, some sort of serial number etc and appear to be
> reasonable unique.
> While in theory, these should be writeable by the user, it will probably
> be inconvinient to do so. Allwinner recommends that a certain input pin,
> labeled 'efuse_vddq', be connected to GND. To write these fuses, 2.5 V
> needs to be applied to this pin.
>
> Even so, they can still be used to generate a board-unique mac from, board
> unique RSA key and seed the kernel RNG.
>
> Currently supported are the following known chips:
> Allwinner sun4i (A10)
> Allwinner sun5i (A10s, A13)

A few comments below.

> +++ b/drivers/misc/eeprom/sunxi_sid.c

> +#include 

Are you sure this has to be explicitly mentioned?

> +#define SID_SIZE (SID_KEYS * 4)
> +
> +

Extra line.

> +/* We read the entire key, but only return the requested byte. This is of
> + * course slower then it could be and uses 4 times more reads as needed but
> + * keeps code simpler.

Maybe it would be better to rewrite this logic and save CPU and I/O resources?

> + */
> +static u8 sunxi_sid_read_byte(const void __iomem *sid_reg_base,
> + const unsigned int offset)
> +{
> +   u32 sid_key = 0;
> +
> +   if (offset >= SID_SIZE)
> +   goto exit;

Just return here.

> +   sid_key = ioread32be(sid_reg_base + round_down(offset, 4));
> +   sid_key >>= (offset % 4) * 8;
> +   sid_key &= 0xff;

Redundant 0xff.

> +   /* fall through */
> +
> +exit:
> +   return (u8)sid_key;

No need to have explicit casting here.

> +   pdev = (struct platform_device *)to_platform_device(kobj_to_dev(kobj));

Ditto.

> +   sid_reg_base = (void __iomem *)platform_get_drvdata(pdev);

Ditto.

> +static int sunxi_sid_remove(struct platform_device *pdev)
> +{
> +   device_remove_bin_file(>dev, _bin_attr);
> +   dev_info(>dev, "sunxi SID driver unloaded\n");

This is often a useless message. In what case is it crucial?

> +static int __init sunxi_sid_probe(struct platform_device *pdev)
> +{
> +   int entropy[SID_SIZE], i;
> +   struct resource *res;
> +   void __iomem *sid_reg_base;
> +   int ret;
> +
> +   if (!pdev->dev.of_node) {
> +   dev_err(>dev, "No devicetree data available\n");
> +   ret = -ENXIO;
> +   goto exit;

The label only returns, so just return here. It's common practice in the .probe() function.

> +   if (IS_ERR(sid_reg_base)) {
> +   ret = PTR_ERR(sid_reg_base);
> +   goto exit;

Ditto.

> +   ret = device_create_bin_file(>dev, _bin_attr);
> +   if (ret) {
> +   dev_err(>dev, "Unable to create sysfs bin entry\n");
> +   goto exit;

Ditto.

> +   dev_info(>dev, "sunxi SID ver %s loaded\n", DRV_VERSION);
> +   ret = 0;
> +   /* fall through */

Ditto.

> +
> +exit:
> +   return ret;

Useless lines.

> +module_platform_driver(sunxi_sid_driver);
> +
> +

Extra line.
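
To make the comments on sunxi_sid_read_byte() concrete, here is a minimal
userspace sketch of the suggested shape: an early return instead of the goto,
and no mask or explicit cast, since the conversion to the 8-bit return type
already keeps only the low byte. It is an illustration under assumptions: the
register block is modelled as a plain array, SID_KEYS is assumed to be 4, and
read_byte() and read_byte_masked() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SID_KEYS 4			/* assumed value for this sketch */
#define SID_SIZE (SID_KEYS * 4)

/* Suggested shape: early return, no mask, no explicit cast. */
static uint8_t read_byte(const uint32_t keys[SID_KEYS], unsigned int offset)
{
	uint32_t word;

	assert(keys != NULL);
	if (offset >= SID_SIZE)
		return 0;

	word = keys[offset / 4];	/* the word round_down(offset, 4) selects */
	word >>= (offset % 4) * 8;
	return word;			/* conversion to uint8_t keeps the low 8 bits */
}

/* Shape from the patch under review: mask plus explicit cast. */
static uint8_t read_byte_masked(const uint32_t keys[SID_KEYS],
				unsigned int offset)
{
	uint32_t word;

	assert(keys != NULL);
	if (offset >= SID_SIZE)
		return 0;

	word = keys[offset / 4];
	word >>= (offset % 4) * 8;
	word &= 0xff;
	return (uint8_t)word;
}
```

Both functions return the same value for every offset in this model, which is
why the mask and the cast can go without changing behaviour.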


--
With Best Regards,
Andy Shevchenko

Re: [GIT PULL v3] at91: USBA DT support / drivers update for 3.11 #1

2013-06-14 Thread Olof Johansson

On Sat, Jun 15, 2013 at 12:02:44AM +0200, Nicolas Ferre wrote:
> Arnd, Olof,
> 
> This is a rework of the previous pull-request done by Jean-Christophe
> PLAGNIOL-VILLARD ([GIT PULL] at91: USBA DT support for 3.11).
> It is also the division of my previous pull request (v2) to extract only
> patches related to drivers. The DT patches will be stacked on top of our
> arm-soc/at91/dt branch in an upcoming pull-request.
> You should see that this material is based on the cleanup branch that you
> already have in arm-soc/at91/cleanup.
> 
> This adds the DT support for USBA gadget driver present in the most recent
> AT91 SoCs.
> As agreed with Arnd and Felipe we send these drivers updates via arm-soc.
> 
> Thanks, best regards,
> 
> The following changes since commit b3f442b0eedbc20b5ce3f4a96530588d14901199:
> 
>   ARM: at91: udpate defconfigs (2013-05-17 15:05:08 +0200)
> 
> are available in the git repository at:
> 
>   git://github.com/at91linux/linux-at91.git tags/at91-drivers

Pulled, thanks. This had conflicts with some code from fixes. The change/remove
ones were easy (cs-gpio vs interrupts), but there was one dealing with pinctrl.
Please double-check that there wasn't a need to carry over a fix to the
include-file-based version.


-Olof

Re: Regression in RCU subsystem in latest mainline kernel

2013-06-14 Thread Benjamin Herrenschmidt

On Fri, 2013-06-14 at 17:06 -0400, Steven Rostedt wrote:
> I was pretty much able to reproduce this on my PA Semi PPC box. Funny
> thing is, when I type on the console, it makes progress. Anyway, it
> seems that powerpc has an issue with irq_work(). I'll try to get some
> time either tonight or next week to figure it out.

Does this help ?

diff --git a/arch/powerpc/kernel/irq.c b/arch/powerpc/kernel/irq.c
index 5cbcf4d..ea185e0 100644
--- a/arch/powerpc/kernel/irq.c
+++ b/arch/powerpc/kernel/irq.c
@@ -162,7 +162,7 @@ notrace unsigned int __check_irq_replay(void)
 * in case we also had a rollover while hard disabled
 */
local_paca->irq_happened &= ~PACA_IRQ_DEC;
-   if (decrementer_check_overflow())
+   if ((happened & PACA_IRQ_DEC) || decrementer_check_overflow())
return 0x900;
 
/* Finally check if an external interrupt happened */

Cheers,
Ben.



Re: [GIT PULL] at91: Device Tree update for 3.11 #2

2013-06-14 Thread Olof Johansson

On Sat, Jun 15, 2013 at 12:47:47AM +0200, Nicolas Ferre wrote:
> Arnd, Olof,
> 
> Additional pull-request for AT91 DT patches.
> It contains the remaining part of the USB gadget pull-request that I sent you
> last week. After having split it, here is the DT part.
> It also contains the update of DMA bindings: it is the AT91 part the should go
> through arm-soc. I have included the patch (ARM: at91: dt: add header to define
> at_hdmac configuration) so that we avoid build errors whichever git tree
> (slave-dma or arm-soc) is merged first.
> A SPI DT patch for at91sam9x5 is also added.
> 
> Thanks, best regards,
> 
> The following changes since commit 028633c238f91dc113520a7ad25d37b2ba9068af:
> 
>   ARM: at91/dt: add pinctrl definition for at91 tc blocks (2013-05-31 22:40:37 +0200)
> 
> are available in the git repository at:
> 
>   git://github.com/at91linux/linux-at91.git tags/at91-dt

Pulled, thanks.


-Olof

Re: [BUGFIX v2 2/4] ACPI, DOCK: resolve possible deadlock scenarios

2013-06-14 Thread Jiang Liu

On Sat 15 Jun 2013 06:21:02 AM CST, Rafael J. Wysocki wrote:
> On Saturday, June 15, 2013 03:27:59 AM Jiang Liu wrote:
>> This is a preparation for next patch to avoid breaking bisecting.
>> If next patch is applied without this one, it will cause deadlock
>> as below:
>>
>> Case 1:
>> [   31.015593]  Possible unsafe locking scenario:
>>
>> [   31.018350]CPU0CPU1
>> [   31.019691]
>> [   31.021002]   lock(_station->hp_lock);
>> [   31.022327]lock(>crit_sect);
>> [   31.023650]lock(_station->hp_lock);
>> [   31.025010]   lock(>crit_sect);
>> [   31.026342]
>>
>> Case 2:
>> hotplug_dock_devices()
>> mutex_lock(>hp_lock)
>> dd->ops->handler()
>> register_hotplug_dock_device()
>> mutex_lock(>hp_lock)
>> [   34.316570] [ INFO: possible recursive locking detected ]
>> [   34.316573] 3.10.0-rc4 #6 Tainted: G C
>> [   34.316575] -
>> [   34.316577] kworker/0:0/4 is trying to acquire lock:
>> [   34.316579]  (_station->hp_lock){+.+.+.}, at:
>> [] register_hotplug_dock_device+0x6a/0xbf
>> [   34.316588]
>> but task is already holding lock:
>> [   34.316590]  (_station->hp_lock){+.+.+.}, at:
>> [] hotplug_dock_devices+0x2c/0xda
>> [   34.316595]
>> other info that might help us debug this:
>> [   34.316597]  Possible unsafe locking scenario:
>>
>> [   34.316599]CPU0
>> [   34.316601]
>> [   34.316602]   lock(_station->hp_lock);
>> [   34.316605]   lock(_station->hp_lock);
>> [   34.316608]
>>  *** DEADLOCK ***
>>
>> So fix this deadlock by not taking ds->hp_lock in function
>> register_hotplug_dock_device(). This patch also fixes a possible
>> race conditions in function dock_event() because previously it
>> accesses ds->hotplug_devices list without holding ds->hp_lock.
>>
>> Signed-off-by: Jiang Liu 
>> Cc: Len Brown 
>> Cc: "Rafael J. Wysocki" 
>> Cc: Bjorn Helgaas 
>> Cc: Yinghai Lu 
>> Cc: Yijing Wang 
>> Cc: linux-a...@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Cc: linux-...@vger.kernel.org
>> Cc:  # 3.8+
>> ---
>>  drivers/acpi/dock.c| 109 ++---
>>  drivers/pci/hotplug/acpiphp_glue.c |  15 +
>>  include/acpi/acpi_drivers.h|   2 +
>>  3 files changed, 82 insertions(+), 44 deletions(-)
>>
>> diff --git a/drivers/acpi/dock.c b/drivers/acpi/dock.c
>> index 02b0563..602bce5 100644
>> --- a/drivers/acpi/dock.c
>> +++ b/drivers/acpi/dock.c
>> @@ -66,7 +66,7 @@ struct dock_station {
>>  spinlock_t dd_lock;
>>  struct mutex hp_lock;
>>  struct list_head dependent_devices;
>> -struct list_head hotplug_devices;
>> +struct klist hotplug_devices;
>>
>>  struct list_head sibling;
>>  struct platform_device *dock_device;
>> @@ -76,12 +76,18 @@ static int dock_station_count;
>>
>>  struct dock_dependent_device {
>>  struct list_head list;
>> -struct list_head hotplug_list;
>> +acpi_handle handle;
>> +};
>> +
>> +struct dock_hotplug_info {
>> +struct klist_node node;
>>  acpi_handle handle;
>>  const struct acpi_dock_ops *ops;
>>  void *context;
>>  };
>
> Can we please relax a bit and possibly take a step back?
>
> So since your last reply to me wasn't particularly helpful, I went through the
> code in dock.c and acpiphp_glue.c and I simply think that the whole
> hotplug_list thing is simply redundant.
>
> It looks like instead of using it (or the klist in this patch), we can add a
> "hotplug_device" flag to dock_dependent_device and set that flag instead of
> adding dd to hotplug_devices or clear it instead of removing dd from that
> list.
>
> That would allow us to avoid the deadlock, because we wouldn't need the
> hp_lock any more and perhaps we could make the code simpler instead of making
> it more complex.
>
> How does that sound?
>
> Rafael
Hi Rafael,
   Thanks for the comments! It would be great if we could kill the
hotplug_devices list so that things get simpler. But there are still some
special cases :(

As you have mentioned, ds->hp_lock is used to make both addition and removal
of hotplug devices wait for us to complete walking ds->hotplug_devices.
So it plays two roles:
1) protect the hotplug_devices list,
2) serialize unregister_hotplug_dock_device() and hotplug_dock_devices() so
that the dock driver doesn't access the registered handler and associated
data structure once unregister_hotplug_dock_device() has returned.

If we simply use a flag to mark the presence of a registered callback, we
can't achieve the second goal. Take the Sony laptop as an example. It has
several PCI hotplug slots associated with the dock station:
[   28.829316] acpiphp_glue: _handle_hotplug_event_func: Bus check notify on \_SB_.PCI0.RP07.LPMB
[   30.174964] acpiphp_glue: _handle_hotplug_event_func:

support for Intel Atom based QNAP LEDs/buttons/buzzer in Linux?

2013-06-14 Thread Christoph Anton Mitterer

Hi.

I wondered whether anyone knows if the kernel supports the
LEDs/buttons/buzzer of Intel Atom based QNAP NAS devices like the TS-569 Pro?

I got the two-line LCD, which is an A125, working (it can easily be
controlled via the serial device), but not the others.
It seems these are GPIO controlled...

I further found this: https://github.com/tomtastic/qnap-gpio/
but it seems it's for the TS-239 Pro only.

Any people out there with some experience? :)

Cheers and thanks,
Chris.


Re: [PATCH 3/3] i915: Don't provide ACPI backlight interface if firmware expects Windows 8

2013-06-14 Thread Matthew Garrett

On Sat, 2013-06-15 at 09:26 +0800, Aaron Lu wrote:
> On 06/15/2013 01:29 AM, Matthew Garrett wrote: 
> > How would that work with existing userspace?
> 
> User space tool will need to be updated to use this as stated in the
> gist page, I've patches for gsd-backlight-helper and xorg-x11-drv-intel,
> for others we can add I think if the priority based solution is deemed
> useful.

Right, that's not a great solution.

> > We shouldn't export interfaces if we don't expect them to work.
> 
> It's not easy to decide if they work or not sometimes, e.g. I came
> across a system that claims win8 in ACPI table and has an Intel GPU,
> while its ACPI video interface also works. With this patch, the working
> ACPI video interface is removed, while with the priority based solution,
> the GPU's interface priority gets higher, but the ACPI video interface
> still stays.

Well, Windows 8 will only use the ACPI backlight interface if the GPU
driver decides to, right? So the logic for deciding whether to remove
the ACPI backlight control or not should be left up to the GPU. There's
no harm in refusing to expose a working method if there's another
working method, but there is harm in exposing a broken one and expecting
userspace to know the difference.

-- 
Matthew Garrett | mj...@srcf.ucam.org

Re: [GIT PULL 3/3] msm clock for 3.11

2013-06-14 Thread Olof Johansson

On Fri, Jun 14, 2013 at 12:52:55PM -0700, David Brown wrote:
> The following changes since commit f722406faae2d073cc1d01063d1123c35425939e:
> 
>  Linux 3.10-rc1 (2013-05-11 17:14:08 -0700)
> 
> are available in the git repository at:
> 
>  git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm.git tags/msm-clock-for-3.11
> 
> for you to fetch changes up to f54c1c11053d8aaf318a096b8b332fd41a394a66:
> 
>  ARM: msm: Migrate to common clock framework (2013-06-12 14:43:31 -0700)
> 
> 
> MSM clock updates for 3.11.
> 
> Per Stephen Boyd's coverletter:
> 
> This patchset moves the existing MSM clock code and affected drivers
> to the common clock framework. A prerequisite of moving to the common
> clock framework is to use clk_prepare() and clk_enable() so the first
> few patches migrate drivers to that call (clk_prepare() is a no-op on
> MSM right now). It also removes some custom clock APIs that MSM
> provides and finally moves the proc_comm clock code to the common
> struct clk.
> 
> This patch series will be used as the foundation of the MSM 8660/8960
> clock code that I plan to send out after this series.
> 
> 
> Stephen Boyd (12):
>  msm_serial: Convert to clk_prepare/unprepare
>  msm_serial: Use devm_clk_get() and properly return errors
>  usb: otg: msm: Convert to clk_prepare/unprepare
>  msm_sdcc: Convert to clk_prepare/unprepare
>  msm: iommu: Convert to clk_prepare/unprepare
>  msm: iommu: Use clk_set_rate() instead of clk_set_min_rate()
>  ARM: msm: Remove custom clk_set_flags() API
>  ARM: msm: Remove custom clk_set_{max,min}_rate() API
>  ARM: msm: Remove clock-7x30.h include file
>  ARM: msm: Prepare clk_get() users in mach-msm for clock-pcom driver
>  ARM: msm: Make proc_comm clock control into a platform driver
>  ARM: msm: Migrate to common clock framework

I don't see a single ack from any of the driver subsystem maintainers for
serial, usb or iommu. Some of them have been acked though, for example Felipe
acked the USB one on the mailing list.

I also didn't see Greg cc:d on the serial patches, which would explain why
there was no ack for them.

Care to repost/collect/rebuild this branch, please? It's a great cleanup and
I'd be happy to take it even if it ends up coming in after -rc6.


-Olof


Re: [PATCH RFC ticketlock] Auto-queued ticketlock

2013-06-14 Thread Benjamin Herrenschmidt

On Fri, 2013-06-14 at 14:17 -0400, Waiman Long wrote:
> 
> With some minor changes, the current patch can be modified to support 
> debugging lock for 32-bit system. For 64-bit system, we can apply a 
> similar concept for debugging lock with cmpxchg_double. However, for 
> architecture that does not have cmpxchg_double support, it will be out 
> of luck and we probably couldn't support the same feature in debugging 
> mode. It will have to fall back to taking the lock.

That means only x86_64 and s390 would benefit from it ... I'm sure we can do 
better :-)

Cheers,
Ben.



Re: [GIT PULL 2/3] msm fixes for 3.11

2013-06-14 Thread Olof Johansson

On Fri, Jun 14, 2013 at 10:56:55AM -0700, David Brown wrote:
> The following changes since commit f722406faae2d073cc1d01063d1123c35425939e:
> 
>   Linux 3.10-rc1 (2013-05-11 17:14:08 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm.git 
> tags/msm-fix-for-3.11
> 
> for you to fetch changes up to 7ba655fc965b073292349fa49fb9d16d701185bc:
> 
>   gpio: msm-v1: Remove errant __devinit to fix compile (2013-06-12 14:49:06 
> -0700)
> 
> 
> Some minor fixes for MSM for 3.11
> 
> I don't expect these to be necessary for stable, since the fixes are
> to recently added code.  The strncpy fix is only in debug code that
> isn't normally compiled or used (and is being removed in upcoming
> patches).
> 
> 
> Chen Gang (1):
>   arch: arm: mach-msm: using strlcpy instead of strncpy
> 
> Stephen Boyd (3):
>   ARM: dts: msm: Fix bad register addresses

Hmm. I see that the msm-hsuart device nodes completely lack reg entries. That's
considerably more important to fix than the cosmetic unit address that's not
even needed unless two nodes happen to have the same name.

I ended up pulling this in underneath of the cleanup branch to resolve the
add/change conflicts there. 

-Olof

Re: [PATCH 3/3] i915: Don't provide ACPI backlight interface if firmware expects Windows 8

2013-06-14 Thread Aaron Lu

On 06/15/2013 01:29 AM, Matthew Garrett wrote:
> On Fri, 2013-06-14 at 14:47 +0800, Aaron Lu wrote:
> 
>> What about a priority based solution? We can introduce a new field named
>> priority to backlight_device and instead of calling another module's
>> function like the unregister one here(which cause unnecessary module
>> dependency), we only need to boost priority for its own interface. This
>> field will be exported to sysfs, so user can change it during runtime
>> too. And we can also introduce a new kernel command line as
>> backlight.force_interface=raw/firmware/platform, to overcome the limited
>> functionality provided by acpi_backlight=video/vendor, which does not
>> involve GPU's interface.
> 
> How would that work with existing userspace?

User space tools will need to be updated to use this, as stated on the
gist page. I have patches for gsd-backlight-helper and xorg-x11-drv-intel;
others can be added, I think, if the priority based solution is deemed
useful.

> 
>> And we can place the quirk code in backlight layer instead of individual
>> backlight functionality provider module. Suppose we have a backlight
>> manager there, for all win8 systems, we can boost the raw type's
>> priority on its registration, so no need to add code in
>> intel/amd/etc./'s GPU driver code.
> 
> But we'd need to add code to every piece of userspace that currently
> uses the backlight, right?

Yes that's the case.

> 
>> With priority based solution, all backlight control interfaces stay,
>> the priority field is an indication given by kernel to user space.
> 
> We shouldn't export interfaces if we don't expect them to work.

It's not easy to decide if they work or not sometimes, e.g. I came
across a system that claims win8 in ACPI table and has an Intel GPU,
while its ACPI video interface also works. With this patch, the working
ACPI video interface is removed, while with the priority based solution,
the GPU's interface priority gets higher, but the ACPI video interface
still stays.
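
To make the idea concrete, here is a minimal userspace sketch of the
priority scheme. It is only an illustration under stated assumptions: the
struct layout, the demo_ names and the boost value are hypothetical, and
only the priority field itself comes from the proposal above.

```c
#include <assert.h>
#include <stddef.h>

enum demo_bl_type { DEMO_BL_RAW, DEMO_BL_FIRMWARE, DEMO_BL_PLATFORM };

/* Hypothetical per-interface record; only "priority" is from the proposal. */
struct demo_backlight {
	const char *name;
	enum demo_bl_type type;
	int priority;
};

/* Quirk applied at registration time: boost every raw interface. */
static void demo_boost_raw(struct demo_backlight *devs, size_t nr, int boost)
{
	size_t i;

	assert(devs != NULL || nr == 0);
	for (i = 0; i < nr; i++)
		if (devs[i].type == DEMO_BL_RAW)
			devs[i].priority += boost;
}

/* What a user space tool would do: pick the highest priority interface. */
static const struct demo_backlight *
demo_pick(const struct demo_backlight *devs, size_t nr)
{
	const struct demo_backlight *best = NULL;
	size_t i;

	for (i = 0; i < nr; i++)
		if (best == NULL || devs[i].priority > best->priority)
			best = &devs[i];
	return best;
}
```

Nothing is unregistered in this scheme: the boost only changes which
interface demo_pick() returns, and the other interfaces remain usable.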

Thanks,
Aaron

Re: [GIT PULL 1/3] msm cleanups for 3.11

2013-06-14 Thread Olof Johansson

On Fri, Jun 14, 2013 at 10:56:53AM -0700, David Brown wrote:
> The following changes since commit f722406faae2d073cc1d01063d1123c35425939e:
> 
>   Linux 3.10-rc1 (2013-05-11 17:14:08 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/davidb/linux-msm.git 
> tags/msm-cleanup-for-3.11
> 
> for you to fetch changes up to 1aa3d1a3c7d235c47e30c7c8c6b5ef02fb1536b3:
> 
>   mfd: ssbi: Use devm_* and simplify code (2013-06-12 14:50:12 -0700)
> 
> 
> Cleanups for MSM for 3.11
> 
> These are a handful of cleanups to the MSM tree.  The gpio cleanups
> get us closer to having proper pinmux and gpio support.

Pulled, thanks.


-Olof

[PATCH v2] module: don't modify argument of module_kallsyms_lookup_name()

2013-06-14 Thread Mathias Krause

If we pass a pointer to a const string in the form "module:symbol",
module_kallsyms_lookup_name() will try to split the string at the colon,
i.e., will try to modify r/o data. That will, in fact, fail on a kernel
with CONFIG_DEBUG_RODATA enabled.

Avoid modifying the passed string in module_kallsyms_lookup_name(),
modify find_module_all() instead to pass it the module name length.
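
For illustration, the length-bounded lookup can be sketched in userspace C.
This is a minimal sketch, not the kernel code: demo_modules,
demo_find_module() and demo_lookup() are hypothetical names, and the fixed
array stands in for the kernel's module list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's list of loaded modules. */
static const char *const demo_modules[] = { "ext4", "e1000", "e1000e" };

/*
 * Return the index of the module whose name is exactly the first len
 * bytes of name, or -1 if there is none. name is never written to.
 */
static int demo_find_module(const char *name, size_t len)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(demo_modules) / sizeof(demo_modules[0]); i++) {
		const char *cand = demo_modules[i];

		/* Compare lengths first, so "e1000" does not match "e1000e". */
		if (strlen(cand) == len && memcmp(cand, name, len) == 0)
			return (int)i;
	}
	return -1;
}

/* Split "module:symbol" by pointer arithmetic instead of writing '\0'. */
static int demo_lookup(const char *name, const char **symbol)
{
	const char *colon = strchr(name, ':');

	if (colon == NULL)
		return -1;
	*symbol = colon + 1;
	return demo_find_module(name, (size_t)(colon - name));
}
```

Because the string is only read, demo_lookup() can safely be called with a
string literal, which is the case that trips over read-only data when the
colon is overwritten in place.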

Signed-off-by: Mathias Krause 
---
v2:
- don't use kstrdup(), pass the string length to find_module_all() as
  suggested by Rusty

 kernel/module.c |   15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/kernel/module.c b/kernel/module.c
index cab4bce..bf3b846 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -455,7 +455,7 @@ const struct kernel_symbol *find_symbol(const char *name,
 EXPORT_SYMBOL_GPL(find_symbol);
 
 /* Search for module by name: must hold module_mutex. */
-static struct module *find_module_all(const char *name,
+static struct module *find_module_all(const char *name, size_t len,
  bool even_unformed)
 {
struct module *mod;
@@ -463,7 +463,7 @@ static struct module *find_module_all(const char *name,
list_for_each_entry(mod, &modules, list) {
if (!even_unformed && mod->state == MODULE_STATE_UNFORMED)
continue;
-   if (strcmp(mod->name, name) == 0)
+   if (strlen(mod->name) == len && !memcmp(mod->name, name, len))
return mod;
}
return NULL;
@@ -471,7 +471,7 @@ static struct module *find_module_all(const char *name,
 
 struct module *find_module(const char *name)
 {
-   return find_module_all(name, false);
+   return find_module_all(name, strlen(name), false);
 }
 EXPORT_SYMBOL_GPL(find_module);
 
@@ -3014,7 +3014,7 @@ static bool finished_loading(const char *name)
bool ret;
 
mutex_lock(&module_mutex);
-   mod = find_module_all(name, true);
+   mod = find_module_all(name, strlen(name), true);
ret = !mod || mod->state == MODULE_STATE_LIVE
|| mod->state == MODULE_STATE_GOING;
mutex_unlock(&module_mutex);
@@ -3152,7 +3152,8 @@ static int add_unformed_module(struct module *mod)
 
 again:
mutex_lock(&module_mutex);
-   if ((old = find_module_all(mod->name, true)) != NULL) {
+   old = find_module_all(mod->name, strlen(mod->name), true);
+   if (old != NULL) {
if (old->state == MODULE_STATE_COMING
|| old->state == MODULE_STATE_UNFORMED) {
/* Wait in case it fails to load. */
@@ -3563,10 +3564,8 @@ unsigned long module_kallsyms_lookup_name(const char *name)
/* Don't lock: we're in enough trouble already. */
preempt_disable();
if ((colon = strchr(name, ':')) != NULL) {
-   *colon = '\0';
-   if ((mod = find_module(name)) != NULL)
+   if ((mod = find_module_all(name, colon - name, false)) != NULL)
ret = mod_find_symname(mod, colon+1);
-   *colon = ':';
} else {
list_for_each_entry_rcu(mod, &modules, list) {
if (mod->state == MODULE_STATE_UNFORMED)
-- 
1.7.10.4


[PATCH] scripts/setlocalversion on write-protected source tree

2013-06-14 Thread Christian Kujau


Since no one objected[0] and Nico kinda approved of my suggestion to 
remove "git update-index", I propose the 2nd version of this patch
for 3.11 (Linux, that is)

[0] https://lkml.org/lkml/2013/6/9/185

---

  Signed-off-by: Christian Kujau 
  Cc: Nico Schottelius 

I just stumbled across another[0] issue when scripts/setlocalversion
operates on a write-protected source tree. Back then[0] the source tree
was on a read-only NFS share, so "test -w" was introduced before "git
update-index" was run.

This time, the source tree is on a read/write NFS share, but the permissions
are world-readable and only a specific user (or root) can write.
Thus, "test -w ." returns "0" and then runs "git update-index",
producing the following message (on a dirty tree):

  fatal: Unable to create '/usr/local/src/linux-git/.git/index.lock': Permission denied

While it says "fatal", compilation continues just fine.

However, I don't think a kernel compilation should alter the source 
tree (or the .git directory) in any way and I don't see how removing 
"git update-index" could do any harm. The Mercurial and SVN routines in 
scripts/setlocalversion don't have any tree-modifying commands, AFAICS. 
So, maybe the patch below would be acceptable.

[0] https://patchwork.kernel.org/patch/29718/

---

diff --git a/scripts/setlocalversion b/scripts/setlocalversion
index 84b88f1..d105a44 100755
--- a/scripts/setlocalversion
+++ b/scripts/setlocalversion
@@ -71,9 +71,6 @@ scm_version()
printf -- '-svn%s' "`git svn find-rev $head`"
fi
 
-   # Update index only on r/w media
-   [ -w . ] && git update-index --refresh --unmerged > /dev/null
-
# Check for uncommitted changes
if git diff-index --name-only HEAD | grep -qv "^scripts/package"; then
printf '%s' -dirty




-- 
BOFH excuse #359:

YOU HAVE AN I/O ERROR -> Incompetent Operator error

[PATCH v5 03/22] x86, ACPI, mm: Kill max_low_pfn_mapped

2013-06-14 Thread Yinghai Lu

Now we have arch_pfn_mapped array, and max_low_pfn_mapped should not
be used anymore.

User should use arch_pfn_mapped or just 1UL<<(32-PAGE_SHIFT) instead.
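
As a worked example of that constant: assuming 4 KiB pages (PAGE_SHIFT of
12, the usual x86 value), 1UL<<(32-PAGE_SHIFT) is pfn 0x100000, the first
page frame at 4G. The sketch below uses hypothetical DEMO_/demo_ names and
fixed-width types so that the arithmetic also holds on a 32bit build.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* First page frame number at the 4G boundary. */
static uint64_t demo_pfn_4g(void)
{
	return UINT64_C(1) << (32 - DEMO_PAGE_SHIFT);
}

/* Convert a pfn to a physical address; guard against shift overflow. */
static uint64_t demo_pfn_to_phys(uint64_t pfn)
{
	assert(pfn <= (UINT64_MAX >> DEMO_PAGE_SHIFT));
	return pfn << DEMO_PAGE_SHIFT;
}

/* Clamp the end of a pfn range to the part that lies below 4G. */
static uint64_t demo_clamp_below_4g(uint64_t end_pfn)
{
	uint64_t limit = demo_pfn_4g();

	return end_pfn < limit ? end_pfn : limit;
}
```

demo_clamp_below_4g() mirrors the min(end_pfn, 1UL<<(32-PAGE_SHIFT))
expression that this patch removes from add_pfn_range_mapped().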

The only user is ACPI_INITRD_TABLE_OVERRIDE, and it should not use that,
as later access goes through early_ioremap(). We can change it to use
1UL<<(32-PAGE_SHIFT) instead, i.e. stay under 4G.

-v2: Leave alone max_low_pfn_mapped in i915 code according to tj.

Suggested-by: H. Peter Anvin 
Signed-off-by: Yinghai Lu 
Cc: "Rafael J. Wysocki" 
Cc: Jacob Shin 
Cc: Pekka Enberg 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/page_types.h | 1 -
 arch/x86/kernel/setup.c   | 4 +---
 arch/x86/mm/init.c| 4 
 drivers/acpi/osl.c| 6 +++---
 4 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index 54c9787..b012b82 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -43,7 +43,6 @@
 
 extern int devmem_is_allowed(unsigned long pagenr);
 
-extern unsigned long max_low_pfn_mapped;
 extern unsigned long max_pfn_mapped;
 
 static inline phys_addr_t get_max_mapped(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 66ab495..6ca5f2c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -112,13 +112,11 @@
 #include 
 
 /*
- * max_low_pfn_mapped: highest direct mapped pfn under 4GB
- * max_pfn_mapped: highest direct mapped pfn over 4GB
+ * max_pfn_mapped: highest direct mapped pfn
  *
  * The direct mapping only covers E820_RAM regions, so the ranges and gaps are
  * represented by pfn_mapped
  */
-unsigned long max_low_pfn_mapped;
 unsigned long max_pfn_mapped;
 
 #ifdef CONFIG_DMI
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index eaac174..8554656 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -313,10 +313,6 @@ static void add_pfn_range_mapped(unsigned long start_pfn, unsigned long end_pfn)
nr_pfn_mapped = clean_sort_range(pfn_mapped, E820_X_MAX);
 
max_pfn_mapped = max(max_pfn_mapped, end_pfn);
-
-   if (start_pfn < (1UL<<(32-PAGE_SHIFT)))
-   max_low_pfn_mapped = max(max_low_pfn_mapped,
-min(end_pfn, 1UL<<(32-PAGE_SHIFT)));
 }
 
 bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn)
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index e721863..93e3194 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -624,9 +624,9 @@ void __init acpi_initrd_override(void *data, size_t size)
if (table_nr == 0)
return;
 
-   acpi_tables_addr =
-   memblock_find_in_range(0, max_low_pfn_mapped << PAGE_SHIFT,
-  all_tables_size, PAGE_SIZE);
+   /* under 4G at first, then above 4G */
+   acpi_tables_addr = memblock_find_in_range(0, (1ULL<<32) - 1,
+   all_tables_size, PAGE_SIZE);
if (!acpi_tables_addr) {
WARN_ON(1);
return;
-- 
1.8.1.4


[PATCH v5 07/22] x86, ACPI: Store override acpi tables phys addr in cpio files info array

2013-06-14 Thread Yinghai Lu

On 32bit we will find the tables by phys address during 32bit flat mode
in head_32.S, because at that time we don't need to set up page tables to
access the initrd.

For copying we can use early_ioremap() with the phys address directly, before
the mem mapping is set up.

To keep 32bit and 64bit consistent, use phys_addr for all.

-v2: introduce file_pos to save phys address instead of abusing cpio_data
that tj is not happy with.

Signed-off-by: Yinghai Lu 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 drivers/acpi/osl.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index fea73af..3a307ec 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -570,7 +570,11 @@ static const char * const table_sigs[] = {
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
 #define ACPI_OVERRIDE_TABLES 64
-static struct cpio_data __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
+struct file_pos {
+   phys_addr_t data;
+   phys_addr_t size;
+};
+static struct file_pos __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
 
 void __init acpi_initrd_override_find(void *data, size_t size)
 {
@@ -615,7 +619,7 @@ void __init acpi_initrd_override_find(void *data, size_t size)
table->signature, cpio_path, file.name, table->length);
 
all_tables_size += table->length;
-   acpi_initrd_files[table_nr].data = file.data;
+   acpi_initrd_files[table_nr].data = __pa_nodebug(file.data);
acpi_initrd_files[table_nr].size = file.size;
table_nr++;
}
@@ -624,7 +628,7 @@ void __init acpi_initrd_override_find(void *data, size_t size)
 void __init acpi_initrd_override_copy(void)
 {
int no, total_offset = 0;
-   char *p;
+   char *p, *q;
 
if (!all_tables_size)
return;
@@ -659,12 +663,15 @@ void __init acpi_initrd_override_copy(void)
 * one by one during copying.
 */
for (no = 0; no < ACPI_OVERRIDE_TABLES; no++) {
+   phys_addr_t addr = acpi_initrd_files[no].data;
phys_addr_t size = acpi_initrd_files[no].size;
 
if (!size)
break;
+   q = early_ioremap(addr, size);
p = early_ioremap(acpi_tables_addr + total_offset, size);
-   memcpy(p, acpi_initrd_files[no].data, size);
+   memcpy(p, q, size);
+   early_iounmap(q, size);
early_iounmap(p, size);
total_offset += size;
}
-- 
1.8.1.4


[PATCH v5 05/22] x86, ACPI: Increase override tables number limit

2013-06-14 Thread Yinghai Lu

The number of acpi tables in the initrd is currently limited to 10, which
is too small. 64 should be good enough, as we have 35 sigs and could have
several SSDTs.

Two problems in the current code prevent us from increasing the limit:
1. The cpio file info array is put on the stack. As every element is 32
   bytes, we could run out of stack if we increase the array size to 64.
   We can move it off the stack, make it global and put it in the
   __initdata section.
2. early_ioremap can only remap 256k at a time. The current code maps all
   10 tables at once. If we increase the limit, the whole size could be
   more than 256k, and early_ioremap will fail.
   We can map the tables one by one during copying, instead of mapping
   all of them at once.
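
The second point can be sketched in userspace C. This is an illustration
only, under stated assumptions: demo_map() is a hypothetical stand-in for
early_ioremap() that simply refuses requests above the 256k limit named
above, and struct demo_file is a hypothetical stand-in for the cpio file
info.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAP_LIMIT (256 * 1024)	/* per-mapping limit from the text */

struct demo_file {
	const void *data;
	size_t size;
};

/* Hypothetical stand-in for early_ioremap(): rejects oversized requests. */
static void *demo_map(unsigned char *base, size_t offset, size_t size)
{
	return size <= DEMO_MAP_LIMIT ? base + offset : NULL;
}

/*
 * Copy the tables back to back into dst, using one mapping per table.
 * Returns the total number of bytes copied, or 0 if a mapping failed.
 */
static size_t demo_copy_tables(unsigned char *dst,
			       const struct demo_file *files, size_t nr)
{
	size_t total = 0, i;

	assert(dst != NULL && files != NULL);
	for (i = 0; i < nr; i++) {
		void *p = demo_map(dst, total, files[i].size);

		if (p == NULL)
			return 0;
		memcpy(p, files[i].data, files[i].size);
		/* A real early mapping would be released here. */
		total += files[i].size;
	}
	return total;
}
```

With two 200k tables the combined size exceeds the limit, yet each
individual mapping stays below it, so the copy succeeds. A single mapping
of the whole 400k area would be rejected.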

-v2: According to tj, split it out to separated patch, also
 rename array name to acpi_initrd_files.
-v3: Add some comments about mapping table one by one during copying
 per tj.

Signed-off-by: Yinghai 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Acked-by: Tejun Heo 
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 drivers/acpi/osl.c | 26 +++---
 1 file changed, 15 insertions(+), 11 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 42c48fc..c4ea2b7 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -569,8 +569,8 @@ static const char * const table_sigs[] = {
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
-/* Must not increase 10 or needs code modification below */
-#define ACPI_OVERRIDE_TABLES 10
+#define ACPI_OVERRIDE_TABLES 64
+static struct cpio_data __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
 
 void __init acpi_initrd_override(void *data, size_t size)
 {
@@ -579,7 +579,6 @@ void __init acpi_initrd_override(void *data, size_t size)
struct acpi_table_header *table;
char cpio_path[32] = "kernel/firmware/acpi/";
struct cpio_data file;
-   struct cpio_data early_initrd_files[ACPI_OVERRIDE_TABLES];
char *p;
 
if (data == NULL || size == 0)
@@ -617,8 +616,8 @@ void __init acpi_initrd_override(void *data, size_t size)
table->signature, cpio_path, file.name, table->length);
 
all_tables_size += table->length;
-   early_initrd_files[table_nr].data = file.data;
-   early_initrd_files[table_nr].size = file.size;
+   acpi_initrd_files[table_nr].data = file.data;
+   acpi_initrd_files[table_nr].size = file.size;
table_nr++;
}
if (table_nr == 0)
@@ -648,14 +647,19 @@ void __init acpi_initrd_override(void *data, size_t size)
memblock_reserve(acpi_tables_addr, all_tables_size);
arch_reserve_mem_area(acpi_tables_addr, all_tables_size);
 
-   p = early_ioremap(acpi_tables_addr, all_tables_size);
-
+   /*
+* early_ioremap only can remap 256k one time. If we map all
+* tables one time, we will hit the limit. Need to map table
+* one by one during copying.
+*/
for (no = 0; no < table_nr; no++) {
-   memcpy(p + total_offset, early_initrd_files[no].data,
-  early_initrd_files[no].size);
-   total_offset += early_initrd_files[no].size;
+   phys_addr_t size = acpi_initrd_files[no].size;
+
+   p = early_ioremap(acpi_tables_addr + total_offset, size);
+   memcpy(p, acpi_initrd_files[no].data, size);
+   early_iounmap(p, size);
+   total_offset += size;
}
-   early_iounmap(p, all_tables_size);
 }
 #endif /* CONFIG_ACPI_INITRD_TABLE_OVERRIDE */
 
-- 
1.8.1.4


[PATCH v5 15/22] x86, mm, numa: Move node_possible_map setting later

2013-06-14 Thread Yinghai Lu

Move node_possible_map handling out of numa_check_memblks() to avoid side
effects in numa_check_memblks().

Only set it once on the successful path instead of resetting it in
numa_init() every time.

Suggested-by: Tejun Heo 
Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index e448b6f..da2ebab 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -536,12 +536,13 @@ static unsigned long __init node_map_pfn_alignment(struct numa_meminfo *mi)
 
 static int __init numa_check_memblks(struct numa_meminfo *mi)
 {
+   nodemask_t nodes_parsed;
unsigned long pfn_align;
 
/* Account for nodes with cpus and no memory */
-   node_possible_map = numa_nodes_parsed;
-   numa_nodemask_from_meminfo(&node_possible_map, mi);
-   if (WARN_ON(nodes_empty(node_possible_map)))
+   nodes_parsed = numa_nodes_parsed;
+   numa_nodemask_from_meminfo(&nodes_parsed, mi);
+   if (WARN_ON(nodes_empty(nodes_parsed)))
return -EINVAL;
 
if (!numa_meminfo_cover_memory(mi))
@@ -593,7 +594,6 @@ static int __init numa_init(int (*init_func)(void))
set_apicid_to_node(i, NUMA_NO_NODE);
 
nodes_clear(numa_nodes_parsed);
-   nodes_clear(node_possible_map);
memset(&numa_meminfo, 0, sizeof(numa_meminfo));
numa_reset_distance();
 
@@ -669,6 +669,9 @@ void __init x86_numa_init(void)
 
early_x86_numa_init();
 
+   node_possible_map = numa_nodes_parsed;
+   numa_nodemask_from_meminfo(&node_possible_map, mi);
+
for (i = 0; i < mi->nr_blks; i++) {
struct numa_memblk *mb = &mi->blk[i];
memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
-- 
1.8.1.4


[PATCH v5 14/22] x86, mm, numa: Set memblock nid later

2013-06-14 Thread Yinghai Lu

For the separation, we need to set memblock nid later, as it
could change the memblock array and possibly double the memblock.memory
array, which would need to allocate a buffer.

Only set memblock nid once, on the successful path.

Also rename numa_register_memblks() to numa_check_memblks()
after moving out the code that sets memblock nid.

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index cff565a..e448b6f 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -534,10 +534,9 @@ static unsigned long __init node_map_pfn_alignment(struct numa_meminfo *mi)
 }
 #endif
 
-static int __init numa_register_memblks(struct numa_meminfo *mi)
+static int __init numa_check_memblks(struct numa_meminfo *mi)
 {
unsigned long pfn_align;
-   int i;
 
/* Account for nodes with cpus and no memory */
node_possible_map = numa_nodes_parsed;
@@ -560,11 +559,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
return -EINVAL;
}
 
-   for (i = 0; i < mi->nr_blks; i++) {
-   struct numa_memblk *mb = &mi->blk[i];
-   memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
-   }
-
return 0;
 }
 
@@ -601,7 +595,6 @@ static int __init numa_init(int (*init_func)(void))
nodes_clear(numa_nodes_parsed);
nodes_clear(node_possible_map);
memset(&numa_meminfo, 0, sizeof(numa_meminfo));
-   WARN_ON(memblock_set_node(0, ULLONG_MAX, MAX_NUMNODES));
numa_reset_distance();
 
ret = init_func();
@@ -613,7 +606,7 @@ static int __init numa_init(int (*init_func)(void))
 
numa_emulation(&numa_meminfo, numa_distance_cnt);
 
-   ret = numa_register_memblks(&numa_meminfo);
+   ret = numa_check_memblks(&numa_meminfo);
if (ret < 0)
return ret;
 
@@ -676,6 +669,11 @@ void __init x86_numa_init(void)
 
early_x86_numa_init();
 
+   for (i = 0; i < mi->nr_blks; i++) {
+   struct numa_memblk *mb = &mi->blk[i];
+   memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
+   }
+
/* Finally register nodes. */
for_each_node_mask(nid, node_possible_map) {
u64 start = PFN_PHYS(max_pfn);
-- 
1.8.1.4


[PATCH v5 02/22] x86, microcode: Use common get_ramdisk_image()

2013-06-14 Thread Yinghai Lu

Use common get_ramdisk_image() to get ramdisk start phys address.

We need this to get the correct ramdisk address for a 64bit bzImage whose
initrd can be loaded above 4G by kexec-tools.

-v2: fix one typo that is found by Tang Chen

Signed-off-by: Yinghai Lu 
Cc: Fenghua Yu 
Acked-by: Tejun Heo 
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/kernel/microcode_intel_early.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kernel/microcode_intel_early.c b/arch/x86/kernel/microcode_intel_early.c
index 2e9e128..54575a9 100644
--- a/arch/x86/kernel/microcode_intel_early.c
+++ b/arch/x86/kernel/microcode_intel_early.c
@@ -743,8 +743,8 @@ load_ucode_intel_bsp(void)
struct boot_params *boot_params_p;
 
boot_params_p = (struct boot_params *)__pa_nodebug(&boot_params);
-   ramdisk_image = boot_params_p->hdr.ramdisk_image;
-   ramdisk_size  = boot_params_p->hdr.ramdisk_size;
+   ramdisk_image = get_ramdisk_image(boot_params_p);
+   ramdisk_size  = get_ramdisk_size(boot_params_p);
initrd_start_early = ramdisk_image;
initrd_end_early = initrd_start_early + ramdisk_size;
 
@@ -753,8 +753,8 @@ load_ucode_intel_bsp(void)
(unsigned long *)__pa_nodebug(_saved_in_initrd),
initrd_start_early, initrd_end_early, );
 #else
-   ramdisk_image = boot_params.hdr.ramdisk_image;
-   ramdisk_size  = boot_params.hdr.ramdisk_size;
+   ramdisk_image = get_ramdisk_image(&boot_params);
+   ramdisk_size  = get_ramdisk_size(&boot_params);
initrd_start_early = ramdisk_image + PAGE_OFFSET;
initrd_end_early = initrd_start_early + ramdisk_size;
 
-- 
1.8.1.4


[PATCH v5 01/22] x86: Change get_ramdisk_image() to global

2013-06-14 Thread Yinghai Lu

We need to use get_ramdisk_image() for early microcode updating in another
file, so make it global.

Also make it take a boot_params pointer, as head_32.S needs to access it via
phys address during 32bit flat mode.

Signed-off-by: Yinghai Lu 
Acked-by: Tejun Heo 
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/setup.h |  3 +++
 arch/x86/kernel/setup.c  | 28 ++--
 2 files changed, 17 insertions(+), 14 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index b7bf350..4f71d48 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -106,6 +106,9 @@ void *extend_brk(size_t size, size_t align);
RESERVE_BRK(name, sizeof(type) * entries)
 
 extern void probe_roms(void);
+u64 get_ramdisk_image(struct boot_params *bp);
+u64 get_ramdisk_size(struct boot_params *bp);
+
 #ifdef __i386__
 
 void __init i386_start_kernel(void);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 56f7fcf..66ab495 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -297,19 +297,19 @@ static void __init reserve_brk(void)
 
 #ifdef CONFIG_BLK_DEV_INITRD
 
-static u64 __init get_ramdisk_image(void)
+u64 __init get_ramdisk_image(struct boot_params *bp)
 {
-   u64 ramdisk_image = boot_params.hdr.ramdisk_image;
+   u64 ramdisk_image = bp->hdr.ramdisk_image;
 
-   ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
+   ramdisk_image |= (u64)bp->ext_ramdisk_image << 32;
 
return ramdisk_image;
 }
-static u64 __init get_ramdisk_size(void)
+u64 __init get_ramdisk_size(struct boot_params *bp)
 {
-   u64 ramdisk_size = boot_params.hdr.ramdisk_size;
+   u64 ramdisk_size = bp->hdr.ramdisk_size;
 
-   ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
+   ramdisk_size |= (u64)bp->ext_ramdisk_size << 32;
 
return ramdisk_size;
 }
@@ -318,8 +318,8 @@ static u64 __init get_ramdisk_size(void)
 static void __init relocate_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = get_ramdisk_image();
-   u64 ramdisk_size  = get_ramdisk_size();
+   u64 ramdisk_image = get_ramdisk_image(&boot_params);
+   u64 ramdisk_size  = get_ramdisk_size(&boot_params);
u64 area_size = PAGE_ALIGN(ramdisk_size);
u64 ramdisk_here;
unsigned long slop, clen, mapaddr;
@@ -358,8 +358,8 @@ static void __init relocate_initrd(void)
ramdisk_size  -= clen;
}
 
-   ramdisk_image = get_ramdisk_image();
-   ramdisk_size  = get_ramdisk_size();
+   ramdisk_image = get_ramdisk_image(&boot_params);
+   ramdisk_size  = get_ramdisk_size(&boot_params);
printk(KERN_INFO "Move RAMDISK from [mem %#010llx-%#010llx] to"
" [mem %#010llx-%#010llx]\n",
ramdisk_image, ramdisk_image + ramdisk_size - 1,
@@ -369,8 +369,8 @@ static void __init relocate_initrd(void)
 static void __init early_reserve_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = get_ramdisk_image();
-   u64 ramdisk_size  = get_ramdisk_size();
+   u64 ramdisk_image = get_ramdisk_image(&boot_params);
+   u64 ramdisk_size  = get_ramdisk_size(&boot_params);
u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
 
if (!boot_params.hdr.type_of_loader ||
@@ -382,8 +382,8 @@ static void __init early_reserve_initrd(void)
 static void __init reserve_initrd(void)
 {
/* Assume only end is not page aligned */
-   u64 ramdisk_image = get_ramdisk_image();
-   u64 ramdisk_size  = get_ramdisk_size();
+   u64 ramdisk_image = get_ramdisk_image(&boot_params);
+   u64 ramdisk_size  = get_ramdisk_size(&boot_params);
u64 ramdisk_end   = PAGE_ALIGN(ramdisk_image + ramdisk_size);
u64 mapped_size;
 
-- 
1.8.1.4


[PATCH v5 19/22] x86, mm: Parse numa info early

2013-06-14 Thread Yinghai Lu

Parsing numa info has now been separated into two functions.

early_initmem_init() only parses info into numa_meminfo and
nodes_parsed. It still keeps the numaq, acpi_numa, amd_numa, dummy
fallback sequence working.

SLIT and numa emulation handling are still left in initmem_init().

Call early_initmem_init before init_mem_mapping() to prepare
to use numa_info with it.

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/kernel/setup.c | 24 ++--
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 301165e..fd0d5be 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1125,13 +1125,21 @@ void __init setup_arch(char **cmdline_p)
trim_platform_memory_ranges();
trim_low_memory_range();
 
+   /*
+* Parse the ACPI tables for possible boot-time SMP configuration.
+*/
+   acpi_initrd_override_copy();
+   acpi_boot_table_init();
+   early_acpi_boot_init();
+   early_initmem_init();
init_mem_mapping();
-
+   memblock.current_limit = get_max_mapped();
early_trap_pf_init();
 
+   reserve_initrd();
+
setup_real_mode();
 
-   memblock.current_limit = get_max_mapped();
dma_contiguous_reserve(0);
 
/*
@@ -1145,24 +1153,12 @@ void __init setup_arch(char **cmdline_p)
/* Allocate bigger log buffer */
setup_log_buf(1);
 
-   acpi_initrd_override_copy();
-
-   reserve_initrd();
-
reserve_crashkernel();
 
vsmp_init();
 
io_delay_init();
 
-   /*
-* Parse the ACPI tables for possible boot-time SMP configuration.
-*/
-   acpi_boot_table_init();
-
-   early_acpi_boot_init();
-
-   early_initmem_init();
initmem_init();
memblock_find_dma_reserve();
 
-- 
1.8.1.4


[PATCH v5 16/22] x86, mm, numa: Move emulation handling down.

2013-06-14 Thread Yinghai Lu

Emulation needs to allocate buffers for the new numa_meminfo and the
distance matrix, so move it down.

We also change the behavior:
before this patch, if the user passes wrong data on the command line, it
will fall back to the next numa probing method or disable numa.
after this patch, if the user passes wrong data on the command line, it
will stay with the numa info from the earlier probing, like acpi srat or
amd_numa.

We need to call numa_check_memblks() to reject wrong user input early,
so that the original numa_meminfo is kept unchanged.
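
The "validate first, then commit" behavior described above can be sketched in plain C. This is only an illustration under assumptions: struct meminfo, check_memblks() and apply_emulation() are simplified, hypothetical stand-ins and not the kernel's numa_meminfo code.

```c
#define MAX_BLKS 8

struct memblk {
	unsigned long start, end;	/* [start, end) */
	int nid;
};

struct meminfo {
	int nr_blks;
	struct memblk blk[MAX_BLKS];
};

/* Hypothetical check: every block must be non-empty and have a valid nid. */
static int check_memblks(const struct meminfo *mi)
{
	for (int i = 0; i < mi->nr_blks; i++)
		if (mi->blk[i].start >= mi->blk[i].end || mi->blk[i].nid < 0)
			return -1;
	return 0;
}

/*
 * Commit the emulated layout only if it passes the check.
 * On failure *cur is left untouched, so the info from the
 * earlier probing stays in effect. Returns 1 if applied.
 */
static int apply_emulation(struct meminfo *cur, const struct meminfo *candidate)
{
	if (candidate->nr_blks <= 0 || candidate->nr_blks > MAX_BLKS ||
	    check_memblks(candidate) < 0)
		return 0;
	*cur = *candidate;
	return 1;
}
```

The point of the design is that the candidate is built and checked separately, so a bad command line never damages the already probed data.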

Signed-off-by: Yinghai Lu 
Cc: David Rientjes 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c   | 6 +++---
 arch/x86/mm/numa_emulation.c | 2 +-
 arch/x86/mm/numa_internal.h  | 2 ++
 3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index da2ebab..3254f22 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -534,7 +534,7 @@ static unsigned long __init node_map_pfn_alignment(struct numa_meminfo *mi)
 }
 #endif
 
-static int __init numa_check_memblks(struct numa_meminfo *mi)
+int __init numa_check_memblks(struct numa_meminfo *mi)
 {
nodemask_t nodes_parsed;
unsigned long pfn_align;
@@ -604,8 +604,6 @@ static int __init numa_init(int (*init_func)(void))
if (ret < 0)
return ret;
 
-   numa_emulation(&numa_meminfo, numa_distance_cnt);
-
ret = numa_check_memblks(&numa_meminfo);
if (ret < 0)
return ret;
@@ -669,6 +667,8 @@ void __init x86_numa_init(void)
 
early_x86_numa_init();
 
+   numa_emulation(&numa_meminfo, numa_distance_cnt);
+
node_possible_map = numa_nodes_parsed;
numa_nodemask_from_meminfo(&node_possible_map, mi);
 
diff --git a/arch/x86/mm/numa_emulation.c b/arch/x86/mm/numa_emulation.c
index d47..5a0433d 100644
--- a/arch/x86/mm/numa_emulation.c
+++ b/arch/x86/mm/numa_emulation.c
@@ -348,7 +348,7 @@ void __init numa_emulation(struct numa_meminfo *numa_meminfo, int numa_dist_cnt)
if (ret < 0)
goto no_emu;
 
-   if (numa_cleanup_meminfo(&ei) < 0) {
+   if (numa_cleanup_meminfo(&ei) < 0 || numa_check_memblks(&ei) < 0) {
pr_warning("NUMA: Warning: constructed meminfo invalid, disabling emulation\n");
goto no_emu;
}
diff --git a/arch/x86/mm/numa_internal.h b/arch/x86/mm/numa_internal.h
index ad86ec9..bb2fbcc 100644
--- a/arch/x86/mm/numa_internal.h
+++ b/arch/x86/mm/numa_internal.h
@@ -21,6 +21,8 @@ void __init numa_reset_distance(void);
 
 void __init x86_numa_init(void);
 
+int __init numa_check_memblks(struct numa_meminfo *mi);
+
 #ifdef CONFIG_NUMA_EMU
 void __init numa_emulation(struct numa_meminfo *numa_meminfo,
   int numa_dist_cnt);
-- 
1.8.1.4


[PATCH v5 11/22] x86, mm, numa: Call numa_meminfo_cover_memory() checking early

2013-06-14 Thread Yinghai Lu

For the separation, we need to set memblock nid later, as it
could change the memblock array, and possibly double the memblock.memory
array, which would need to allocate a buffer.

We do not need to use nid in memblock to find out absent pages,
so we can move the numa_meminfo_cover_memory() check earlier.

We can also change __absent_pages_in_range() to static and use
absent_pages_in_range() directly.

Later we can set memblock nid only one time, on the successful path.

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 7 ---
 include/linux/mm.h | 2 --
 mm/page_alloc.c| 2 +-
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 07ae800..1bb565d 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -457,7 +457,7 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
u64 s = mi->blk[i].start >> PAGE_SHIFT;
u64 e = mi->blk[i].end >> PAGE_SHIFT;
numaram += e - s;
-   numaram -= __absent_pages_in_range(mi->blk[i].nid, s, e);
+   numaram -= absent_pages_in_range(s, e);
if ((s64)numaram < 0)
numaram = 0;
}
@@ -485,6 +485,9 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
if (WARN_ON(nodes_empty(node_possible_map)))
return -EINVAL;
 
+   if (!numa_meminfo_cover_memory(mi))
+   return -EINVAL;
+
for (i = 0; i < mi->nr_blks; i++) {
struct numa_memblk *mb = &mi->blk[i];
memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
@@ -503,8 +506,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
return -EINVAL;
}
 #endif
-   if (!numa_meminfo_cover_memory(mi))
-   return -EINVAL;
 
return 0;
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e0c8528..28e9470 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1385,8 +1385,6 @@ static inline unsigned long free_initmem_default(int poison)
  */
 extern void free_area_init_nodes(unsigned long *max_zone_pfn);
 unsigned long node_map_pfn_alignment(void);
-unsigned long __absent_pages_in_range(int nid, unsigned long start_pfn,
-   unsigned long end_pfn);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 378a15b..c427f46 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4395,7 +4395,7 @@ static unsigned long __meminit zone_spanned_pages_in_node(int nid,
  * Return the number of holes in a range on a node. If nid is MAX_NUMNODES,
  * then all holes in the requested range will be accounted for.
  */
-unsigned long __meminit __absent_pages_in_range(int nid,
+static unsigned long __meminit __absent_pages_in_range(int nid,
unsigned long range_start_pfn,
unsigned long range_end_pfn)
 {
-- 
1.8.1.4


[PATCH v5 21/22] x86, mm: Make init_mem_mapping be able to be called several times

2013-06-14 Thread Yinghai Lu

Prepare to put page table on local nodes.

Move the call of init_mem_mapping() into early_initmem_init().

Rework alloc_low_pages() to allocate page table in the following order:
BRK, local node, low range

Still call load_cr3 only one time.
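
The allocation order above (BRK, local node, low range) can be sketched as a small selection function. This is a simplified user-space model under assumptions: struct pgt_alloc_state, enum pgt_source and pick_pgt_source() are hypothetical names, the can_use_brk_pgt flag is left out, and where the sketch returns PGT_NONE the kernel code panics.

```c
enum pgt_source { PGT_NONE, PGT_BRK, PGT_LOCAL, PGT_LOW };

/* [min_pfn, max_pfn); the range is empty when min_pfn >= max_pfn */
struct pfn_range {
	unsigned long min_pfn, max_pfn;
};

struct pgt_alloc_state {
	unsigned long brk_next, brk_top;	/* page cursor in the BRK buffer */
	struct pfn_range local;			/* mapped range of the local node */
	struct pfn_range low;			/* mapped low range */
};

/* Decide where the next 'num' page-table pages should come from. */
static enum pgt_source pick_pgt_source(struct pgt_alloc_state *st,
				       unsigned long num)
{
	/* 1. BRK buffer, as long as the request still fits. */
	if (st->brk_next + num <= st->brk_top) {
		st->brk_next += num;
		return PGT_BRK;
	}
	/* 2. Already mapped memory of the local node. */
	if (st->local.min_pfn < st->local.max_pfn)
		return PGT_LOCAL;
	/* 3. Already mapped low memory. */
	if (st->low.min_pfn < st->low.max_pfn)
		return PGT_LOW;
	return PGT_NONE;
}
```

As in the patch, the local range is preferred whenever it is non-empty, and the low range is only consulted when no local memory has been mapped yet.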

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Cc: Konrad Rzeszutek Wilk 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/pgtable.h |   2 +-
 arch/x86/kernel/setup.c|   1 -
 arch/x86/mm/init.c | 101 +
 arch/x86/mm/numa.c |  24 ++
 4 files changed, 88 insertions(+), 40 deletions(-)

diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
index 1e67223..868687c 100644
--- a/arch/x86/include/asm/pgtable.h
+++ b/arch/x86/include/asm/pgtable.h
@@ -621,7 +621,7 @@ static inline int pgd_none(pgd_t pgd)
 #ifndef __ASSEMBLY__
 
 extern int direct_gbpages;
-void init_mem_mapping(void);
+void init_mem_mapping(unsigned long begin, unsigned long end);
 void early_alloc_pgt_buf(void);
 
 /* local pte updates need not use xchg for locking */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index fd0d5be..9ccbd60 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1132,7 +1132,6 @@ void __init setup_arch(char **cmdline_p)
acpi_boot_table_init();
early_acpi_boot_init();
early_initmem_init();
-   init_mem_mapping();
memblock.current_limit = get_max_mapped();
early_trap_pf_init();
 
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 5f38e72..21b1653 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -24,7 +24,10 @@ static unsigned long __initdata pgt_buf_start;
 static unsigned long __initdata pgt_buf_end;
 static unsigned long __initdata pgt_buf_top;
 
-static unsigned long min_pfn_mapped;
+static unsigned long low_min_pfn_mapped;
+static unsigned long low_max_pfn_mapped;
+static unsigned long local_min_pfn_mapped;
+static unsigned long local_max_pfn_mapped;
 
 static bool __initdata can_use_brk_pgt = true;
 
@@ -52,10 +55,17 @@ __ref void *alloc_low_pages(unsigned int num)
 
if ((pgt_buf_end + num) > pgt_buf_top || !can_use_brk_pgt) {
unsigned long ret;
-   if (min_pfn_mapped >= max_pfn_mapped)
-   panic("alloc_low_page: ran out of memory");
-   ret = memblock_find_in_range(min_pfn_mapped << PAGE_SHIFT,
-   max_pfn_mapped << PAGE_SHIFT,
+   if (local_min_pfn_mapped >= local_max_pfn_mapped) {
+   if (low_min_pfn_mapped >= low_max_pfn_mapped)
+   panic("alloc_low_page: ran out of memory");
+   ret = memblock_find_in_range(
+   low_min_pfn_mapped << PAGE_SHIFT,
+   low_max_pfn_mapped << PAGE_SHIFT,
+   PAGE_SIZE * num , PAGE_SIZE);
+   } else
+   ret = memblock_find_in_range(
+   local_min_pfn_mapped << PAGE_SHIFT,
+   local_max_pfn_mapped << PAGE_SHIFT,
PAGE_SIZE * num , PAGE_SIZE);
if (!ret)
panic("alloc_low_page: can not alloc memory");
@@ -412,67 +422,87 @@ static unsigned long __init get_new_step_size(unsigned long step_size)
return  step_size;
 }
 
-void __init init_mem_mapping(void)
+void __init init_mem_mapping(unsigned long begin, unsigned long end)
 {
-   unsigned long end, real_end, start, last_start;
+   unsigned long real_end, start, last_start;
unsigned long step_size;
unsigned long addr;
unsigned long mapped_ram_size = 0;
unsigned long new_mapped_ram_size;
+   bool is_low = false;
+
+   if (!begin) {
+   probe_page_size_mask();
+   /* the ISA range is always mapped regardless of memory holes */
+   init_memory_mapping(0, ISA_END_ADDRESS);
+   begin = ISA_END_ADDRESS;
+   is_low = true;
+   }
 
-   probe_page_size_mask();
-
-#ifdef CONFIG_X86_64
-   end = max_pfn << PAGE_SHIFT;
-#else
-   end = max_low_pfn << PAGE_SHIFT;
-#endif
-
-   /* the ISA range is always mapped regardless of memory holes */
-   init_memory_mapping(0, ISA_END_ADDRESS);
+   if (begin >= end)
+   return;
 
/* xen has big range in reserved near end of ram, skip it at first.*/
-   addr = memblock_find_in_range(ISA_END_ADDRESS, end, PMD_SIZE, PMD_SIZE);
+   addr = memblock_find_in_range(begin, end, PMD_SIZE, PMD_SIZE);
real_end = addr + PMD_SIZE;
 
/* step_size need to be small so pgt_buf from BRK could cover it */
step_size = PMD_SIZE;
-   max_pfn_mapped = 0; /* will get exact value next */
-   min_pfn_mapped = real_end >>

[PATCH v5 09/22] x86, ACPI: Find acpi tables in initrd early from head_32.S/head64.c

2013-06-14 Thread Yinghai Lu

head64.c can use the page table set up by the #PF handler to access initrd
before init mem mapping and initrd relocating.

head_32.S can use 32bit flat mode to access initrd before init mem
mapping and initrd relocating.

That makes 32bit and 64bit more consistent.

-v2: use inline function in header file instead according to tj.
 also still need to keep #ifdef in head_32.S to avoid compiling error.
-v3: need to move down reserve_initrd() after acpi_initrd_override_copy(),
 to make sure we are using right address.

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/setup.h |  6 ++
 arch/x86/kernel/head64.c |  2 ++
 arch/x86/kernel/head_32.S|  4 
 arch/x86/kernel/setup.c  | 34 ++
 4 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/setup.h b/arch/x86/include/asm/setup.h
index 4f71d48..6f885b7 100644
--- a/arch/x86/include/asm/setup.h
+++ b/arch/x86/include/asm/setup.h
@@ -42,6 +42,12 @@ extern void visws_early_detect(void);
 static inline void visws_early_detect(void) { }
 #endif
 
+#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
+void x86_acpi_override_find(void);
+#else
+static inline void x86_acpi_override_find(void) { }
+#endif
+
 extern unsigned long saved_video_mode;
 
 extern void reserve_standard_io_resources(void);
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 55b6761..229b281 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -175,6 +175,8 @@ void __init x86_64_start_kernel(char * real_mode_data)
if (console_loglevel == 10)
early_printk("Kernel alive\n");
 
+   x86_acpi_override_find();
+
clear_page(init_level4_pgt);
/* set init_level4_pgt kernel high mapping*/
init_level4_pgt[511] = early_level4_pgt[511];
diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
index 73afd11..ca08f0e 100644
--- a/arch/x86/kernel/head_32.S
+++ b/arch/x86/kernel/head_32.S
@@ -149,6 +149,10 @@ ENTRY(startup_32)
call load_ucode_bsp
 #endif
 
+#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
+   call x86_acpi_override_find
+#endif
+
 /*
  * Initialize page tables.  This creates a PDE and a set of page
  * tables, which are located immediately beyond __brk_base.  The variable
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 142e042..d11b1b7 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -421,6 +421,34 @@ static void __init reserve_initrd(void)
 }
 #endif /* CONFIG_BLK_DEV_INITRD */
 
+#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
+void __init x86_acpi_override_find(void)
+{
+   unsigned long ramdisk_image, ramdisk_size;
+   unsigned char *p = NULL;
+
+#ifdef CONFIG_X86_32
+   struct boot_params *boot_params_p;
+
+   /*
+* 32bit is from head_32.S, and it is 32bit flat mode.
+* So need to use phys address to access global variables.
+*/
+   boot_params_p = (struct boot_params *)__pa_nodebug(&boot_params);
+   ramdisk_image = get_ramdisk_image(boot_params_p);
+   ramdisk_size  = get_ramdisk_size(boot_params_p);
+   p = (unsigned char *)ramdisk_image;
+   acpi_initrd_override_find(p, ramdisk_size, true);
+#else
+   ramdisk_image = get_ramdisk_image(&boot_params);
+   ramdisk_size  = get_ramdisk_size(&boot_params);
+   if (ramdisk_image)
+   p = __va(ramdisk_image);
+   acpi_initrd_override_find(p, ramdisk_size, false);
+#endif
+}
+#endif
+
 static void __init parse_setup_data(void)
 {
struct setup_data *data;
@@ -1117,12 +1145,10 @@ void __init setup_arch(char **cmdline_p)
/* Allocate bigger log buffer */
setup_log_buf(1);
 
-   reserve_initrd();
-
-   acpi_initrd_override_find((void *)initrd_start,
-   initrd_end - initrd_start, false);
acpi_initrd_override_copy();
 
+   reserve_initrd();
+
reserve_crashkernel();
 
vsmp_init();
-- 
1.8.1.4


[PATCH v5 06/22] x86, ACPI: Split acpi_initrd_override to find/copy two functions

2013-06-14 Thread Yinghai Lu

To parse srat early, we need to move acpi table probing early.
acpi_initrd_override() runs before acpi table probing, so we need to
move it early too.

In the current code, acpi_initrd_override() runs after init_mem_mapping()
and relocate_initrd(), so it can scan initrd and copy acpi tables using the
kernel virtual address of initrd.
Copying needs to happen after memblock is ready, because it needs to
allocate a buffer for the new acpi tables.

So we have to split that function into two functions, find and copy.
Find should be as early as possible. Copy should be after memblock is ready.
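
The find/copy split can be pictured with a minimal user-space sketch. It is an illustration under assumptions, not the osl.c code: struct found_table, MAX_TABLES, override_find() and override_copy() are hypothetical names, and a plain memcpy() into a caller-supplied buffer stands in for the memblock allocation and early_ioremap() steps.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TABLES 4	/* hypothetical limit for the sketch */

struct found_table {
	const void *data;
	size_t size;
};

static struct found_table found[MAX_TABLES];
static size_t all_tables_size;

/* Phase 1 (early): no allocation, only record where each table lives. */
static void override_find(const struct found_table *src, size_t n)
{
	size_t nr = 0;

	for (size_t i = 0; i < n && nr < MAX_TABLES; i++) {
		if (!src[i].data || !src[i].size)
			continue;	/* skip invalid entries */
		found[nr++] = src[i];
		all_tables_size += src[i].size;
	}
}

/*
 * Phase 2 (later, once a buffer can be allocated): copy the recorded
 * tables back to back. The loop stops at the first empty slot, so no
 * table count has to be passed between the two phases.
 * Returns the number of bytes copied.
 */
static size_t override_copy(void *dst, size_t dst_size)
{
	size_t off = 0;

	if (!all_tables_size || all_tables_size > dst_size)
		return 0;

	for (size_t no = 0; no < MAX_TABLES; no++) {
		size_t size = found[no].size;

		if (!size)
			break;
		memcpy((char *)dst + off, found[no].data, size);
		off += size;
	}
	return off;
}
```

The only state shared between the two phases is the array of recorded entries and the total size, which matches the changelog note about not passing table_nr around.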

Finding can be done in head_32.S and head64.c, just like the early
microcode scanning. In head_32.S, we are in 32bit flat mode, so we don't
need to set up a page table to access initrd. In head64.c, the page table
set up by the #PF handler lets us access initrd through the kernel low
mapping address.

Copying can be done just after memblock is ready and before probing
acpi tables; we need early_ioremap() to access the source and target
ranges, as init_mem_mapping() has not been called yet.

While a dummy version of acpi_initrd_override() was defined when
!CONFIG_ACPI_INITRD_TABLE_OVERRIDE, the prototype and dummy version
were conditionalized inside CONFIG_ACPI.  This forced setup_arch() to
have its own #ifdefs around acpi_initrd_override() as otherwise build
would fail when !CONFIG_ACPI.  Move the prototypes and dummy
implementations of the newly split functions below CONFIG_ACPI block
in acpi.h so that we can do away with #ifdefs in its user.

-v2: Split one patch out according to tj.
 also don't pass table_nr around.
-v3: Add Tj's changelog about moving down to #ifdef in acpi.h to
 avoid #ifdef in setup.c

Signed-off-by: Yinghai 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Acked-by: Tejun Heo 
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/kernel/setup.c |  6 +++---
 drivers/acpi/osl.c  | 18 +-
 include/linux/acpi.h| 16 
 3 files changed, 24 insertions(+), 16 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 6ca5f2c..42f584c 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1119,9 +1119,9 @@ void __init setup_arch(char **cmdline_p)
 
reserve_initrd();
 
-#if defined(CONFIG_ACPI) && defined(CONFIG_BLK_DEV_INITRD)
-   acpi_initrd_override((void *)initrd_start, initrd_end - initrd_start);
-#endif
+   acpi_initrd_override_find((void *)initrd_start,
+   initrd_end - initrd_start);
+   acpi_initrd_override_copy();
 
reserve_crashkernel();
 
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index c4ea2b7..fea73af 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -572,14 +572,13 @@ static const char * const table_sigs[] = {
 #define ACPI_OVERRIDE_TABLES 64
 static struct cpio_data __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
 
-void __init acpi_initrd_override(void *data, size_t size)
+void __init acpi_initrd_override_find(void *data, size_t size)
 {
-   int sig, no, table_nr = 0, total_offset = 0;
+   int sig, no, table_nr = 0;
long offset = 0;
struct acpi_table_header *table;
char cpio_path[32] = "kernel/firmware/acpi/";
struct cpio_data file;
-   char *p;
 
if (data == NULL || size == 0)
return;
@@ -620,7 +619,14 @@ void __init acpi_initrd_override(void *data, size_t size)
acpi_initrd_files[table_nr].size = file.size;
table_nr++;
}
-   if (table_nr == 0)
+}
+
+void __init acpi_initrd_override_copy(void)
+{
+   int no, total_offset = 0;
+   char *p;
+
+   if (!all_tables_size)
return;
 
/* under 4G at first, then above 4G */
@@ -652,9 +658,11 @@ void __init acpi_initrd_override(void *data, size_t size)
 * tables one time, we will hit the limit. Need to map table
 * one by one during copying.
 */
-   for (no = 0; no < table_nr; no++) {
+   for (no = 0; no < ACPI_OVERRIDE_TABLES; no++) {
phys_addr_t size = acpi_initrd_files[no].size;
 
+   if (!size)
+   break;
p = early_ioremap(acpi_tables_addr + total_offset, size);
memcpy(p, acpi_initrd_files[no].data, size);
early_iounmap(p, size);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 17b5b59..8dd917b 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -79,14 +79,6 @@ typedef int (*acpi_tbl_table_handler)(struct acpi_table_header *table);
 typedef int (*acpi_tbl_entry_handler)(struct acpi_subtable_header *header,
  const unsigned long end);
 
-#ifdef CONFIG_ACPI_INITRD_TABLE_OVERRIDE
-void acpi_initrd_override(void *data, size_t size);
-#else
-static inline void acpi_initrd_override(void *data, size_t size)
-{
-}
-#endif
-
 char * __acpi_map_table

[PATCH v5 04/22] x86, ACPI: Search buffer above 4G in second try for acpi override tables

2013-06-14 Thread Yinghai Lu

Currently we only search under 4G for the buffer that holds the override
acpi tables. In some cases, e.g. when the user uses memmap to exclude all
low ram, we may not find a range for it under 4G.

Do a second try that also searches above 4G.
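
The two-step search can be sketched as follows. This is a toy model under assumptions: find_in_range() is a hypothetical first-fit stand-in for memblock_find_in_range() working on a plain array of free ranges (the real function works on memblock and is not reproduced here), and find_override_buffer() is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

struct free_range {
	uint64_t start, end;	/* [start, end) */
};

/*
 * Hypothetical first-fit search: return the lowest suitably aligned
 * address in [lo, hi) with room for 'size' bytes, or 0 on failure.
 * 'align' must be a power of two. Address 0 is never returned,
 * because 0 doubles as the failure value.
 */
static uint64_t find_in_range(const struct free_range *r, size_t n,
			      uint64_t lo, uint64_t hi,
			      uint64_t size, uint64_t align)
{
	if (lo < align)
		lo = align;

	for (size_t i = 0; i < n; i++) {
		uint64_t s = r[i].start > lo ? r[i].start : lo;
		uint64_t e = r[i].end < hi ? r[i].end : hi;

		s = (s + align - 1) & ~(align - 1);
		if (s < e && e - s >= size)
			return s;
	}
	return 0;
}

/* Under 4G at first; if that fails, retry over the whole address range. */
static uint64_t find_override_buffer(const struct free_range *r, size_t n,
				     uint64_t size)
{
	uint64_t addr = find_in_range(r, n, 0, (1ULL << 32) - 1, size, 4096);

	if (!addr)
		addr = find_in_range(r, n, 0, ~(uint64_t)0, size, 4096);
	return addr;
}
```

Keeping the under-4G attempt first preserves the old placement whenever low ram is available, and the second attempt only matters for setups where low ram has been excluded.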

Signed-off-by: Yinghai Lu 
Cc: "Rafael J. Wysocki" 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 drivers/acpi/osl.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 93e3194..42c48fc 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -627,6 +627,10 @@ void __init acpi_initrd_override(void *data, size_t size)
/* under 4G at first, then above 4G */
acpi_tables_addr = memblock_find_in_range(0, (1ULL<<32) - 1,
all_tables_size, PAGE_SIZE);
+   if (!acpi_tables_addr)
+   acpi_tables_addr = memblock_find_in_range(0,
+   ~(phys_addr_t)0,
+   all_tables_size, PAGE_SIZE);
if (!acpi_tables_addr) {
WARN_ON(1);
return;
-- 
1.8.1.4


[PATCH v5 20/22] x86, mm: Add comments for step_size shift

2013-06-14 Thread Yinghai Lu

As requested by hpa, add comments explaining why we choose 5 as the
step size shift.
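
The arithmetic behind the shift can be checked with a small user-space sketch. The SKETCH_* constants are assumptions (4K pages, 2M PMD, 1G PUD, as on x86-64), and the two helper functions for page-table cost are hypothetical; only get_new_step_size() mirrors the function added by the patch.

```c
#define SKETCH_PAGE_SIZE 4096UL		/* assumed 4K pages */
#define SKETCH_PMD_SIZE  (2UL << 20)	/* 2M */
#define SKETCH_PUD_SIZE  (1UL << 30)	/* 1G */

/* Grow the step by 2^5, but keep the old value if the shift overflowed. */
static unsigned long get_new_step_size(unsigned long step_size)
{
	unsigned long new_step_size = step_size << 5;

	if (new_step_size > step_size)
		step_size = new_step_size;

	return step_size;
}

/* Bytes of PTE pages needed to map 'size' bytes with 4K pages only. */
static unsigned long pte_pgt_bytes(unsigned long size)
{
	return (size / SKETCH_PMD_SIZE) * SKETCH_PAGE_SIZE;
}

/* Worst case from the comment: 1 + 1 + 512 pages to map a 1G range. */
static unsigned long worst_case_pgt_bytes_for_1g(void)
{
	unsigned long pte_pages = SKETCH_PUD_SIZE / SKETCH_PMD_SIZE;	/* 512 */

	return (1 + 1 + pte_pages) * SKETCH_PAGE_SIZE;
}
```

With a 2M initial mapping, a shift of 5 gives a 64M step, whose PTE pages (32 pages, 128k) fit into what is already mapped. A 1G step would need 2M + 8k in the worst case, which does not fit.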

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/init.c | 21 ++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 3c21f16..5f38e72 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -395,8 +395,23 @@ static unsigned long __init init_range_memory_mapping(
return mapped_ram_size;
 }
 
-/* (PUD_SHIFT-PMD_SHIFT)/2 */
-#define STEP_SIZE_SHIFT 5
+static unsigned long __init get_new_step_size(unsigned long step_size)
+{
+   /*
+* initial mapped size is PMD_SIZE, aka 2M.
+* We can not set step_size to be PUD_SIZE aka 1G yet.
+* In worse case, when 1G is cross the 1G boundary, and
+* PG_LEVEL_2M is not set, we will need 1+1+512 pages (aka 2M + 8k)
+* to map 1G range with PTE. Use 5 as shift for now.
+*/
+   unsigned long new_step_size = step_size << 5;
+
+   if (new_step_size > step_size)
+   step_size = new_step_size;
+
+   return  step_size;
+}
+
 void __init init_mem_mapping(void)
 {
unsigned long end, real_end, start, last_start;
@@ -445,7 +460,7 @@ void __init init_mem_mapping(void)
min_pfn_mapped = last_start >> PAGE_SHIFT;
/* only increase step_size after big range get mapped */
if (new_mapped_ram_size > mapped_ram_size)
-   step_size <<= STEP_SIZE_SHIFT;
+   step_size = get_new_step_size(step_size);
mapped_ram_size += new_mapped_ram_size;
}
 
-- 
1.8.1.4


[PATCH v5 08/22] x86, ACPI: Make acpi_initrd_override_find work with 32bit flat mode

2013-06-14 Thread Yinghai Lu

For finding on 32bit, it is easy to access initrd in 32bit
flat mode, as we don't need to set up a page table.

That is done from head_32.S, and microcode updating already uses this trick.

acpi_initrd_override_find() needs to be changed to use physical addresses
to access global variables.

Pass is_phys to the function, as on 32bit we cannot tell from the address
whether it is physical or virtual: the boot loader could load initrd above
max_low_pfn.

Don't call printk, as it uses global variables; delay the printing until
copying.

Change table_sigs to live on the stack instead; otherwise it is too messy to
convert the string array to phys and still keep the offset calculation correct.
Its size is about 36x4 bytes, which is small enough to sit on the stack.

Also remove "continue" from the macro to make the code more readable.

-v2: add (size_t) casting according to hpa to fix the compiling warning
found by Fengguang Wu.

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Tested-by: Thomas Renninger 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/kernel/setup.c |  2 +-
 drivers/acpi/osl.c  | 86 ++---
 include/linux/acpi.h|  5 +--
 3 files changed, 64 insertions(+), 29 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 42f584c..142e042 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1120,7 +1120,7 @@ void __init setup_arch(char **cmdline_p)
reserve_initrd();
 
acpi_initrd_override_find((void *)initrd_start,
-   initrd_end - initrd_start);
+   initrd_end - initrd_start, false);
acpi_initrd_override_copy();
 
reserve_crashkernel();
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 3a307ec..3b2beac 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -551,21 +551,9 @@ u8 __init acpi_table_checksum(u8 *buffer, u32 length)
return sum;
 }
 
-/* All but ACPI_SIG_RSDP and ACPI_SIG_FACS: */
-static const char * const table_sigs[] = {
-   ACPI_SIG_BERT, ACPI_SIG_CPEP, ACPI_SIG_ECDT, ACPI_SIG_EINJ,
-   ACPI_SIG_ERST, ACPI_SIG_HEST, ACPI_SIG_MADT, ACPI_SIG_MSCT,
-   ACPI_SIG_SBST, ACPI_SIG_SLIT, ACPI_SIG_SRAT, ACPI_SIG_ASF,
-   ACPI_SIG_BOOT, ACPI_SIG_DBGP, ACPI_SIG_DMAR, ACPI_SIG_HPET,
-   ACPI_SIG_IBFT, ACPI_SIG_IVRS, ACPI_SIG_MCFG, ACPI_SIG_MCHI,
-   ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
-   ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
-   ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
-   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
-
 /* Non-fatal errors: Affected tables/files are ignored */
 #define INVALID_TABLE(x, path, name)   \
-   { pr_err("ACPI OVERRIDE: " x " [%s%s]\n", path, name); continue; }
+   do { pr_err("ACPI OVERRIDE: " x " [%s%s]\n", path, name); } while (0)
 
 #define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
 
@@ -576,17 +564,45 @@ struct file_pos {
 };
 static struct file_pos __initdata acpi_initrd_files[ACPI_OVERRIDE_TABLES];
 
-void __init acpi_initrd_override_find(void *data, size_t size)
+/*
+ * acpi_initrd_override_find() is called from head_32.S and head64.c.
+ * head_32.S calling path is with 32bit flat mode, so we can access
+ * initrd early without setting pagetable or relocating initrd. For
+ * global variables accessing, we need to use phys address instead of
+ * kernel virtual address, try to put table_sigs string array in stack,
+ * so avoid switching for it.
+ * Also don't call printk as it uses global variables.
+ */
+void __init acpi_initrd_override_find(void *data, size_t size, bool is_phys)
 {
int sig, no, table_nr = 0;
long offset = 0;
struct acpi_table_header *table;
char cpio_path[32] = "kernel/firmware/acpi/";
struct cpio_data file;
+   struct file_pos *files = acpi_initrd_files;
+   int *all_tables_size_p = &all_tables_size;
+
+   /* All but ACPI_SIG_RSDP and ACPI_SIG_FACS: */
+   char *table_sigs[] = {
+   ACPI_SIG_BERT, ACPI_SIG_CPEP, ACPI_SIG_ECDT, ACPI_SIG_EINJ,
+   ACPI_SIG_ERST, ACPI_SIG_HEST, ACPI_SIG_MADT, ACPI_SIG_MSCT,
+   ACPI_SIG_SBST, ACPI_SIG_SLIT, ACPI_SIG_SRAT, ACPI_SIG_ASF,
+   ACPI_SIG_BOOT, ACPI_SIG_DBGP, ACPI_SIG_DMAR, ACPI_SIG_HPET,
+   ACPI_SIG_IBFT, ACPI_SIG_IVRS, ACPI_SIG_MCFG, ACPI_SIG_MCHI,
+   ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
+   ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
+   ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
+   ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, NULL };
 
if (data == NULL || size == 0)
return;
 
+   if (is_phys) {
+   files = (struct file_pos *)__pa_symbol(acpi_initrd_files);

[PATCH v5 00/22] x86, ACPI, numa: Parse numa info early

2013-06-14 Thread Yinghai Lu

One commit that tried to parse SRAT early got reverted before v3.9-rc1.

| commit e8d1955258091e4c92d5a975ebd7fd8a98f5d30f
| Author: Tang Chen 
| Date:   Fri Feb 22 16:33:44 2013 -0800
|
|acpi, memory-hotplug: parse SRAT before memblock is ready

It broke several things, like acpi override and the fallback path etc.

This patchset is a clean implementation that parses numa info early.
1. keep the acpi table initrd override working by splitting finding from copying.
   finding is done at the head_32.S and head64.c stage:
   in head_32.S, initrd is accessed in 32bit flat mode with phys addr.
   in head64.c, initrd is accessed via kernel low mapping address
   with help of #PF set page table.
   copying is done with early_ioremap just after memblock is set up.
2. keep the fallback path working: numaq, ACPI, amd_numa and dummy.
   separate initmem_init into two stages.
   early_initmem_init will only extract numa info early into numa_meminfo.
   initmem_init will keep slit and emulation handling.
3. keep other old code flow untouched, like relocate_initrd and initmem_init.
   early_initmem_init will take the old init_mem_mapping position.
   it calls early_x86_numa_init and init_mem_mapping for every node.
   For 64bit, we avoid having a size limit on initrd, as relocate_initrd
   is still after init_mem_mapping for all memory.
4. the last patch will try to put page table on the local node, so that
   memory hotplug will be happy.

In short, early_initmem_init will parse numa info early and call
init_mem_mapping to set page table for every node's mem.

The patchset can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-yinghai.git for-x86-mm

and it is based on today's Linus tree.

-v2: Address tj's review and split patches to small ones.
-v3: Add some Acked-by from tj, also stop abusing cpio_data for acpi_files info
-v4: fix one typo found by Tang Chen.
 Also added tested-by from Thomas Renninger and Tony.
-v5: rebase to v3.10-rc3, and add tested-by from Tang Chen.
 fix one warning for 32bit found by Fengguang.
 resend as Tang's rebase seems broken and fails the compiling test
 in Fengguang's test bots.

Thanks

Yinghai

Yinghai Lu (22):
  x86: Change get_ramdisk_image() to global
  x86, microcode: Use common get_ramdisk_image()
  x86, ACPI, mm: Kill max_low_pfn_mapped
  x86, ACPI: Search buffer above 4G in second try for acpi override tables
  x86, ACPI: Increase override tables number limit
  x86, ACPI: Split acpi_initrd_override to find/copy two functions
  x86, ACPI: Store override acpi tables phys addr in cpio files info array
  x86, ACPI: Make acpi_initrd_override_find work with 32bit flat mode
  x86, ACPI: Find acpi tables in initrd early from head_32.S/head64.c
  x86, mm, numa: Move two functions calling on successful path later
  x86, mm, numa: Call numa_meminfo_cover_memory() checking early
  x86, mm, numa: Move node_map_pfn alignment() to x86
  x86, mm, numa: Use numa_meminfo to check node_map_pfn alignment
  x86, mm, numa: Set memblock nid later
  x86, mm, numa: Move node_possible_map setting later
  x86, mm, numa: Move emulation handling down.
  x86, ACPI, numa, ia64: split SLIT handling out
  x86, mm, numa: Add early_initmem_init() stub
  x86, mm: Parse numa info early
  x86, mm: Add comments for step_size shift
  x86, mm: Make init_mem_mapping be able to be called several times
  x86, mm, numa: Put pagetable on local node ram for 64bit

 arch/ia64/kernel/setup.c|   4 +-
 arch/x86/include/asm/acpi.h |   3 +-
 arch/x86/include/asm/page_types.h   |   2 +-
 arch/x86/include/asm/pgtable.h  |   2 +-
 arch/x86/include/asm/setup.h|   9 ++
 arch/x86/kernel/head64.c|   2 +
 arch/x86/kernel/head_32.S   |   4 +
 arch/x86/kernel/microcode_intel_early.c |   8 +-
 arch/x86/kernel/setup.c |  86 +++-
 arch/x86/mm/init.c  | 122 ++--
 arch/x86/mm/numa.c  | 240 +---
 arch/x86/mm/numa_emulation.c|   2 +-
 arch/x86/mm/numa_internal.h |   2 +
 arch/x86/mm/srat.c  |  11 +-
 drivers/acpi/numa.c |  13 +-
 drivers/acpi/osl.c  | 139 --
 include/linux/acpi.h|  20 +--
 include/linux/mm.h  |   3 -
 mm/page_alloc.c |  52 +--
 19 files changed, 477 insertions(+), 247 deletions(-)

-- 
1.8.1.4


[PATCH v5 22/22] x86, mm, numa: Put pagetable on local node ram for 64bit

2013-06-14 Thread Yinghai Lu

If a node with RAM is hotpluggable, the memory for its page tables and
vmemmap should be on that node's RAM.

This patch is a refresh of
| commit 1411e0ec3123ae4c4ead6bfc9fe3ee5a3ae5c327
| Date:   Mon Dec 27 16:48:17 2010 -0800
|
|x86-64, numa: Put pgtable to local node memory
which was reverted earlier.

We now have reason to reintroduce it: it is needed to make memory
hotplug work.

Call init_mem_mapping() in early_initmem_init() for every node.
alloc_low_pages() allocates page tables in the following order:
BRK, local node, low range.
So page tables will be on the low range or on local nodes.

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Cc: Konrad Rzeszutek Wilk 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 34 +-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 9b18ee8..5adf803 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -670,7 +670,39 @@ static void __init early_x86_numa_init(void)
 #ifdef CONFIG_X86_64
 static void __init early_x86_numa_init_mapping(void)
 {
-   init_mem_mapping(0, max_pfn << PAGE_SHIFT);
+   unsigned long last_start = 0, last_end = 0;
+   struct numa_meminfo *mi = &numa_meminfo;
+   unsigned long start, end;
+   int last_nid = -1;
+   int i, nid;
+
+   for (i = 0; i < mi->nr_blks; i++) {
+   nid   = mi->blk[i].nid;
+   start = mi->blk[i].start;
+   end   = mi->blk[i].end;
+
+   if (last_nid == nid) {
+   last_end = end;
+   continue;
+   }
+
+   /* other nid now */
+   if (last_nid >= 0) {
+   printk(KERN_DEBUG "Node %d: [mem %#016lx-%#016lx]\n",
+   last_nid, last_start, last_end - 1);
+   init_mem_mapping(last_start, last_end);
+   }
+
+   /* for next nid */
+   last_nid   = nid;
+   last_start = start;
+   last_end   = end;
+   }
+   /* last one */
+   printk(KERN_DEBUG "Node %d: [mem %#016lx-%#016lx]\n",
+   last_nid, last_start, last_end - 1);
+   init_mem_mapping(last_start, last_end);
+
if (max_pfn > max_low_pfn)
max_low_pfn = max_pfn;
 }
-- 
1.8.1.4
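
The loop in the patch above walks the numa_meminfo blocks and issues one
init_mem_mapping() call per run of consecutive blocks that share a nid.
A minimal userspace sketch of that coalescing step follows. It assumes
the blocks are already sorted, as numa_meminfo is after cleanup; struct
blk, struct range and coalesce_by_nid() are hypothetical names for
illustration, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct numa_memblk. */
struct blk {
	unsigned long start;
	unsigned long end;
	int nid;
};

/* Hypothetical stand-in for one init_mem_mapping(start, end) call. */
struct range {
	unsigned long start;
	unsigned long end;
	int nid;
};

/*
 * Merge each run of consecutive blocks that share a nid into one range,
 * following the loop in the patch. Node ids are assumed to be >= 0.
 * Returns the number of ranges written to out[].
 */
static size_t coalesce_by_nid(const struct blk *blk, size_t nr_blks,
			      struct range *out, size_t max_out)
{
	unsigned long last_start = 0, last_end = 0;
	int last_nid = -1;
	size_t i, n = 0;

	assert(out != NULL || max_out == 0);

	for (i = 0; i < nr_blks; i++) {
		if (last_nid == blk[i].nid) {
			/* same node: only extend the pending range */
			last_end = blk[i].end;
			continue;
		}

		/* other nid now: flush the pending range, if any */
		if (last_nid >= 0 && n < max_out)
			out[n++] = (struct range){ last_start, last_end, last_nid };

		/* for next nid */
		last_nid = blk[i].nid;
		last_start = blk[i].start;
		last_end = blk[i].end;
	}

	/* last one; unlike the patch, skip it when there were no blocks */
	if (last_nid >= 0 && n < max_out)
		out[n++] = (struct range){ last_start, last_end, last_nid };

	return n;
}
```

Note that a hole between two blocks of the same node ends up inside the
merged range, because only last_end is advanced; the sketch keeps that
behaviour of the patch on purpose.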


[PATCH v5 18/22] x86, mm, numa: Add early_initmem_init() stub

2013-06-14 Thread Yinghai Lu

early_initmem_init() calls early_x86_numa_init() to parse NUMA info early.

A later patch will call init_mem_mapping() for each node from it.

Signed-off-by: Yinghai Lu 
Cc: Pekka Enberg 
Cc: Jacob Shin 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/include/asm/page_types.h | 1 +
 arch/x86/kernel/setup.c   | 1 +
 arch/x86/mm/init.c| 6 ++
 arch/x86/mm/numa.c| 7 +--
 4 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/page_types.h b/arch/x86/include/asm/page_types.h
index b012b82..d04dd8c 100644
--- a/arch/x86/include/asm/page_types.h
+++ b/arch/x86/include/asm/page_types.h
@@ -55,6 +55,7 @@ bool pfn_range_is_mapped(unsigned long start_pfn, unsigned long end_pfn);
 extern unsigned long init_memory_mapping(unsigned long start,
 unsigned long end);
 
+void early_initmem_init(void);
 extern void initmem_init(void);
 
 #endif /* !__ASSEMBLY__ */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index d11b1b7..301165e 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1162,6 +1162,7 @@ void __init setup_arch(char **cmdline_p)
 
early_acpi_boot_init();
 
+   early_initmem_init();
initmem_init();
memblock_find_dma_reserve();
 
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 8554656..3c21f16 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -467,6 +467,12 @@ void __init init_mem_mapping(void)
early_memtest(0, max_pfn_mapped << PAGE_SHIFT);
 }
 
+#ifndef CONFIG_NUMA
+void __init early_initmem_init(void)
+{
+}
+#endif
+
 /*
  * devmem_is_allowed() checks to see if /dev/mem access to a certain address
  * is valid. The argument is a physical page number.
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 630e09f..7d76936 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -665,13 +665,16 @@ static void __init early_x86_numa_init(void)
numa_init(dummy_numa_init);
 }
 
+void __init early_initmem_init(void)
+{
+   early_x86_numa_init();
+}
+
 void __init x86_numa_init(void)
 {
int i, nid;
struct numa_meminfo *mi = &numa_meminfo;
 
-   early_x86_numa_init();
-
 #ifdef CONFIG_ACPI_NUMA
if (srat_used)
x86_acpi_numa_init_slit();
-- 
1.8.1.4


[PATCH v5 17/22] x86, ACPI, numa, ia64: split SLIT handling out

2013-06-14 Thread Yinghai Lu

We need to handle SLIT later, as it needs to allocate a buffer for the
distance matrix. Also, we do not need SLIT info before init_mem_mapping().

So move SLIT parsing later.

x86_acpi_numa_init() becomes
x86_acpi_numa_init_srat()/x86_acpi_numa_init_slit().

This should not break ia64, where acpi_numa_init() is replaced with
acpi_numa_init_srat()/acpi_numa_init_slit()/acpi_numa_arch_fixup().

-v2: Change names to acpi_numa_init_srat/acpi_numa_init_slit, per tj.
     Remove the numa_reset_distance() call in numa_init(), as we now
     only set distance in SLIT handling.

Signed-off-by: Yinghai Lu 
Cc: Rafael J. Wysocki 
Cc: linux-a...@vger.kernel.org
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: linux-i...@vger.kernel.org
Tested-by: Tony Luck 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/ia64/kernel/setup.c|  4 +++-
 arch/x86/include/asm/acpi.h |  3 ++-
 arch/x86/mm/numa.c  | 14 --
 arch/x86/mm/srat.c  | 11 +++
 drivers/acpi/numa.c | 13 +++--
 include/linux/acpi.h|  3 ++-
 6 files changed, 33 insertions(+), 15 deletions(-)

diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 13bfdd2..5f7db4a 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -558,7 +558,9 @@ setup_arch (char **cmdline_p)
acpi_table_init();
early_acpi_boot_init();
 # ifdef CONFIG_ACPI_NUMA
-   acpi_numa_init();
+   acpi_numa_init_srat();
+   acpi_numa_init_slit();
+   acpi_numa_arch_fixup();
 #  ifdef CONFIG_ACPI_HOTPLUG_CPU
prefill_possible_map();
 #  endif
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index b31bf97..651db0b 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -178,7 +178,8 @@ static inline void disable_acpi(void) { }
 
 #ifdef CONFIG_ACPI_NUMA
 extern int acpi_numa;
-extern int x86_acpi_numa_init(void);
+int x86_acpi_numa_init_srat(void);
+void x86_acpi_numa_init_slit(void);
 #endif /* CONFIG_ACPI_NUMA */
 
 #define acpi_unlazy_tlb(x) leave_mm(x)
diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 3254f22..630e09f 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -595,7 +595,6 @@ static int __init numa_init(int (*init_func)(void))
 
nodes_clear(numa_nodes_parsed);
memset(&numa_meminfo, 0, sizeof(numa_meminfo));
-   numa_reset_distance();
 
ret = init_func();
if (ret < 0)
@@ -633,6 +632,10 @@ static int __init dummy_numa_init(void)
return 0;
 }
 
+#ifdef CONFIG_ACPI_NUMA
+static bool srat_used __initdata;
+#endif
+
 /**
  * x86_numa_init - Initialize NUMA
  *
@@ -648,8 +651,10 @@ static void __init early_x86_numa_init(void)
return;
 #endif
 #ifdef CONFIG_ACPI_NUMA
-   if (!numa_init(x86_acpi_numa_init))
+   if (!numa_init(x86_acpi_numa_init_srat)) {
+   srat_used = true;
return;
+   }
 #endif
 #ifdef CONFIG_AMD_NUMA
if (!numa_init(amd_numa_init))
@@ -667,6 +672,11 @@ void __init x86_numa_init(void)
 
early_x86_numa_init();
 
+#ifdef CONFIG_ACPI_NUMA
+   if (srat_used)
+   x86_acpi_numa_init_slit();
+#endif
+
numa_emulation(&numa_meminfo, numa_distance_cnt);
 
node_possible_map = numa_nodes_parsed;
diff --git a/arch/x86/mm/srat.c b/arch/x86/mm/srat.c
index cdd0da9..443f9ef 100644
--- a/arch/x86/mm/srat.c
+++ b/arch/x86/mm/srat.c
@@ -185,14 +185,17 @@ out_err:
return -1;
 }
 
-void __init acpi_numa_arch_fixup(void) {}
-
-int __init x86_acpi_numa_init(void)
+int __init x86_acpi_numa_init_srat(void)
 {
int ret;
 
-   ret = acpi_numa_init();
+   ret = acpi_numa_init_srat();
if (ret < 0)
return ret;
return srat_disabled() ? -EINVAL : 0;
 }
+
+void __init x86_acpi_numa_init_slit(void)
+{
+   acpi_numa_init_slit();
+}
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 33e609f..6460db4 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -282,7 +282,7 @@ acpi_table_parse_srat(enum acpi_srat_type id,
handler, max_entries);
 }
 
-int __init acpi_numa_init(void)
+int __init acpi_numa_init_srat(void)
 {
int cnt = 0;
 
@@ -303,11 +303,6 @@ int __init acpi_numa_init(void)
NR_NODE_MEMBLKS);
}
 
-   /* SLIT: System Locality Information Table */
-   acpi_table_parse(ACPI_SIG_SLIT, acpi_parse_slit);
-
-   acpi_numa_arch_fixup();
-
if (cnt < 0)
return cnt;
else if (!parsed_numa_memblks)
@@ -315,6 +310,12 @@ int __init acpi_numa_init(void)
return 0;
 }
 
+void __init acpi_numa_init_slit(void)
+{
+   /* SLIT: System Locality Information Table */
+   acpi_table_parse(ACPI_SIG_SLIT, acpi_parse_slit);
+}
+
 int acpi_get_pxm(acpi_handle h)
 {
unsigned long long pxm;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index

[PATCH v5 12/22] x86, mm, numa: Move node_map_pfn_alignment() to x86

2013-06-14 Thread Yinghai Lu

Move node_map_pfn_alignment() to arch/x86/mm, as it has no other users.

A later patch will update it to use numa_meminfo instead of memblock.

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 50 ++
 include/linux/mm.h |  1 -
 mm/page_alloc.c| 50 --
 3 files changed, 50 insertions(+), 51 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 1bb565d..10c6240 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -474,6 +474,56 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
return true;
 }
 
+/**
+ * node_map_pfn_alignment - determine the maximum internode alignment
+ *
+ * This function should be called after node map is populated and sorted.
+ * It calculates the maximum power of two alignment which can distinguish
+ * all the nodes.
+ *
+ * For example, if all nodes are 1GiB and aligned to 1GiB, the return value
+ * would indicate 1GiB alignment with (1 << (30 - PAGE_SHIFT)).  If the
+ * nodes are shifted by 256MiB, 256MiB.  Note that if only the last node is
+ * shifted, 1GiB is enough and this function will indicate so.
+ *
+ * This is used to test whether pfn -> nid mapping of the chosen memory
+ * model has fine enough granularity to avoid incorrect mapping for the
+ * populated node map.
+ *
+ * Returns the determined alignment in pfn's.  0 if there is no alignment
+ * requirement (single node).
+ */
+unsigned long __init node_map_pfn_alignment(void)
+{
+   unsigned long accl_mask = 0, last_end = 0;
+   unsigned long start, end, mask;
+   int last_nid = -1;
+   int i, nid;
+
+   for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid) {
+   if (!start || last_nid < 0 || last_nid == nid) {
+   last_nid = nid;
+   last_end = end;
+   continue;
+   }
+
+   /*
+* Start with a mask granular enough to pin-point to the
+* start pfn and tick off bits one-by-one until it becomes
+* too coarse to separate the current node from the last.
+*/
+   mask = ~((1 << __ffs(start)) - 1);
+   while (mask && last_end <= (start & (mask << 1)))
+   mask <<= 1;
+
+   /* accumulate all internode masks */
+   accl_mask |= mask;
+   }
+
+   /* convert mask to number of pages */
+   return ~accl_mask + 1;
+}
+
 static int __init numa_register_memblks(struct numa_meminfo *mi)
 {
unsigned long uninitialized_var(pfn_align);
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 28e9470..b827743 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1384,7 +1384,6 @@ static inline unsigned long free_initmem_default(int poison)
  * CONFIG_HAVE_MEMBLOCK_NODE_MAP.
  */
 extern void free_area_init_nodes(unsigned long *max_zone_pfn);
-unsigned long node_map_pfn_alignment(void);
 extern unsigned long absent_pages_in_range(unsigned long start_pfn,
unsigned long end_pfn);
 extern void get_pfn_range_for_nid(unsigned int nid,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c427f46..28c4a97 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4760,56 +4760,6 @@ void __init setup_nr_node_ids(void)
 }
 #endif
 
-/**
- * node_map_pfn_alignment - determine the maximum internode alignment
- *
- * This function should be called after node map is populated and sorted.
- * It calculates the maximum power of two alignment which can distinguish
- * all the nodes.
- *
- * For example, if all nodes are 1GiB and aligned to 1GiB, the return value
- * would indicate 1GiB alignment with (1 << (30 - PAGE_SHIFT)).  If the
- * nodes are shifted by 256MiB, 256MiB.  Note that if only the last node is
- * shifted, 1GiB is enough and this function will indicate so.
- *
- * This is used to test whether pfn -> nid mapping of the chosen memory
- * model has fine enough granularity to avoid incorrect mapping for the
- * populated node map.
- *
- * Returns the determined alignment in pfn's.  0 if there is no alignment
- * requirement (single node).
- */
-unsigned long __init node_map_pfn_alignment(void)
-{
-   unsigned long accl_mask = 0, last_end = 0;
-   unsigned long start, end, mask;
-   int last_nid = -1;
-   int i, nid;
-
-   for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid) {
-   if (!start || last_nid < 0 || last_nid == nid) {
-   last_nid = nid;
-   last_end = end;
-   continue;
-   }
-
-   /*
-* Start with a mask granular enough to pin-point to the
-* start pfn and tick off bits one-by-one until it becomes
-* too coarse to separate the current node from the last.
-*/
-

[PATCH v5 13/22] x86, mm, numa: Use numa_meminfo to check node_map_pfn alignment

2013-06-14 Thread Yinghai Lu

We can use numa_meminfo directly instead of the memblock nid.

That lets us move setting the memblock nid down and do it only once,
on the successful path.

-v2: According to tj, separate the move into another patch.

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index 10c6240..cff565a 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -493,14 +493,18 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
  * Returns the determined alignment in pfn's.  0 if there is no alignment
  * requirement (single node).
  */
-unsigned long __init node_map_pfn_alignment(void)
+#ifdef NODE_NOT_IN_PAGE_FLAGS
+static unsigned long __init node_map_pfn_alignment(struct numa_meminfo *mi)
 {
unsigned long accl_mask = 0, last_end = 0;
unsigned long start, end, mask;
int last_nid = -1;
int i, nid;
 
-   for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, &nid) {
+   for (i = 0; i < mi->nr_blks; i++) {
+   start = mi->blk[i].start >> PAGE_SHIFT;
+   end = mi->blk[i].end >> PAGE_SHIFT;
+   nid = mi->blk[i].nid;
if (!start || last_nid < 0 || last_nid == nid) {
last_nid = nid;
last_end = end;
@@ -523,10 +527,16 @@ unsigned long __init node_map_pfn_alignment(void)
/* convert mask to number of pages */
return ~accl_mask + 1;
 }
+#else
+static unsigned long __init node_map_pfn_alignment(struct numa_meminfo *mi)
+{
+   return 0;
+}
+#endif
 
 static int __init numa_register_memblks(struct numa_meminfo *mi)
 {
-   unsigned long uninitialized_var(pfn_align);
+   unsigned long pfn_align;
int i;
 
/* Account for nodes with cpus and no memory */
@@ -538,24 +548,22 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
if (!numa_meminfo_cover_memory(mi))
return -EINVAL;
 
-   for (i = 0; i < mi->nr_blks; i++) {
-   struct numa_memblk *mb = &mi->blk[i];
-   memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
-   }
-
/*
 * If sections array is gonna be used for pfn -> nid mapping, check
 * whether its granularity is fine enough.
 */
-#ifdef NODE_NOT_IN_PAGE_FLAGS
-   pfn_align = node_map_pfn_alignment();
+   pfn_align = node_map_pfn_alignment(mi);
if (pfn_align && pfn_align < PAGES_PER_SECTION) {
printk(KERN_WARNING "Node alignment %LuMB < min %LuMB, rejecting NUMA config\n",
   PFN_PHYS(pfn_align) >> 20,
   PFN_PHYS(PAGES_PER_SECTION) >> 20);
return -EINVAL;
}
-#endif
+
+   for (i = 0; i < mi->nr_blks; i++) {
+   struct numa_memblk *mb = &mi->blk[i];
+   memblock_set_node(mb->start, mb->end - mb->start, mb->nid);
+   }
 
return 0;
 }
-- 
1.8.1.4
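
The alignment computation that this patch switches over to numa_meminfo
can be tried outside the kernel. The following is a minimal userspace
sketch, assuming 4KiB pages and a 64-bit unsigned long; struct pfn_blk,
lowest_set_bit() and pfn_alignment() are hypothetical names, with
lowest_set_bit() standing in for the kernel's __ffs(). It uses 1UL where
the kernel code shifts a plain int.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block descriptor; start and end are already in pfns. */
struct pfn_blk {
	unsigned long start;
	unsigned long end;
	int nid;
};

/* Index of the lowest set bit; stand-in for the kernel's __ffs(). */
static unsigned int lowest_set_bit(unsigned long x)
{
	unsigned int bit = 0;

	assert(x != 0);
	while (!(x & 1UL)) {
		x >>= 1;
		bit++;
	}
	return bit;
}

/*
 * Largest power-of-two alignment (in pfns) that still tells all nodes
 * apart, or 0 when there is nothing to separate (single node).
 */
static unsigned long pfn_alignment(const struct pfn_blk *blk, size_t nr_blks)
{
	unsigned long accl_mask = 0, last_end = 0, mask;
	int last_nid = -1;
	size_t i;

	for (i = 0; i < nr_blks; i++) {
		unsigned long start = blk[i].start;

		if (!start || last_nid < 0 || last_nid == blk[i].nid) {
			last_nid = blk[i].nid;
			last_end = blk[i].end;
			continue;
		}

		/*
		 * Start with a mask fine enough to pin-point the start pfn,
		 * then coarsen it bit by bit for as long as it still
		 * separates this node from the previous one.
		 */
		mask = ~((1UL << lowest_set_bit(start)) - 1);
		while (mask && last_end <= (start & (mask << 1)))
			mask <<= 1;

		/* accumulate all internode masks */
		accl_mask |= mask;
	}

	/* convert mask to number of pages */
	return ~accl_mask + 1;
}
```

With two 1GiB nodes aligned to 1GiB this gives 1 << 18 pfns, which is
the (1 << (30 - PAGE_SHIFT)) case from the function's comment; shifting
both nodes by 256MiB gives 1 << 16 pfns.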


Re: [PATCH] of: irq: Pass trigger type in IRQ resource flags

2013-06-14 Thread Javier Martinez Canillas


On 15/06/2013, at 00:00, Grant Likely  wrote:

> On Wed, 05 Jun 2013 20:20:39 +0200, Tomasz Figa  wrote:
>> Hi,
>> 
>> On Sunday 19 of May 2013 00:56:30 Tomasz Figa wrote:
>>> Some drivers might rely on availability of trigger flags in IRQ
>>> resource, for example to configure the hardware for particular interrupt
>>> type. However current code creating IRQ resources from data in device
>>> tree does not configure trigger flags in resulting resources.
>>> 
>>> This patch solves the problem, based on the fact that
>>> irq_of_parse_and_map() configures the trigger based on DT interrupt
>>> specifier, IRQD_TRIGGER_* flags are consistent with IORESOURCE_IRQ_*,
>>> and we can get trigger type by calling irqd_get_trigger_type() after
>>> mapping the interrupt.
>>> 
>>> Signed-off-by: Tomasz Figa 
>>> ---
>>> drivers/of/irq.c | 10 ++
>>> 1 file changed, 10 insertions(+)
>>> 
>>> diff --git a/drivers/of/irq.c b/drivers/of/irq.c
>>> index a3c1c5a..79a7a26 100644
>>> --- a/drivers/of/irq.c
>>> +++ b/drivers/of/irq.c
>>> @@ -355,6 +355,16 @@ int of_irq_to_resource(struct device_node *dev, int index, struct resource *r)
>>>r->start = r->end = irq;
>>>r->flags = IORESOURCE_IRQ;
>>>r->name = name ? name : dev->full_name;
>>> +
>>> +/*
>>> + * Some drivers might rely on availability of trigger flags
>>> + * in IRQ resource. Since irq_of_parse_and_map() configures the
>>> + * trigger based on interrupt specifier and IRQD_TRIGGER_*
>>> + * flags are consistent with IORESOURCE_IRQ_*, we can get
>>> + * trigger type that was just set and pass it through resource
>>> + * flags as well.
>>> + */
>>> +r->flags |= irqd_get_trigger_type(irq_get_irq_data(irq));
>>>}
>>> 
>>>return irq;
>> 
>> Any comments on this patch?
> 
> That's actually a pretty good solution and a whole lot less invasive
> than the approach that Javier was pursuing. Javier, I'm going to pick
> up this patch. Please yell if it doesn't solve the problem that you're
> trying to solve.
> 
> g.
> 

It solves the issue I was trying to solve and the solution is indeed more 
elegant and simpler than the one I posted.

Thanks a lot for pointing this out.

Best regards,
Javier
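
The patch discussed in this thread relies on the trigger type bits and
the IORESOURCE_IRQ_* bits having the same values, so the trigger type
can be OR-ed straight into the resource flags. A minimal userspace
sketch of that idea follows. The constant values are reproduced from
memory of include/linux/irq.h and include/linux/ioport.h and should be
treated as assumptions to check against your tree; irq_resource_flags()
is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/* Assumed values of the IRQ_TYPE_* trigger bits (include/linux/irq.h). */
#define IRQ_TYPE_EDGE_RISING	0x1u
#define IRQ_TYPE_EDGE_FALLING	0x2u
#define IRQ_TYPE_LEVEL_HIGH	0x4u
#define IRQ_TYPE_LEVEL_LOW	0x8u
#define IRQ_TYPE_SENSE_MASK	0xfu

/* Assumed values of the resource flags (include/linux/ioport.h). */
#define IORESOURCE_IRQ			0x00000400u
#define IORESOURCE_IRQ_HIGHEDGE		(1u << 0)
#define IORESOURCE_IRQ_LOWEDGE		(1u << 1)
#define IORESOURCE_IRQ_HIGHLEVEL	(1u << 2)
#define IORESOURCE_IRQ_LOWLEVEL		(1u << 3)

/*
 * Hypothetical helper: build IRQ resource flags from a trigger type,
 * which is what "r->flags |= irqd_get_trigger_type(...)" amounts to
 * when the two sets of bits coincide.
 */
static unsigned int irq_resource_flags(unsigned int trigger)
{
	/* only the trigger (sense) bits may be passed through */
	assert((trigger & ~IRQ_TYPE_SENSE_MASK) == 0);
	return IORESOURCE_IRQ | trigger;
}
```

A driver reading such a resource could then test, for example,
flags & IORESOURCE_IRQ_LOWLEVEL without knowing that the bit originated
from the device tree interrupt specifier.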

[PATCH v5 10/22] x86, mm, numa: Move two functions calling on successful path later

2013-06-14 Thread Yinghai Lu

We need to have NUMA info ready before init_mem_mapping(), so that we
can call init_mem_mapping() per node and also trim node memory ranges
to a big alignment.

Current NUMA parsing needs to allocate some buffers, so it has to be
called after init_mem_mapping().

So split NUMA info parsing into two stages: the early one runs before
init_mem_mapping() and must not need to allocate buffers.

In the end we will have early_initmem_init() and initmem_init().

This patch is the first step of that separation.

setup_node_data() and numa_init_array() are only called on the successful
path, so we can move the calls to x86_numa_init(). That also makes
numa_init() smaller and more readable.

-v2: Remove the node_online_map clearing in numa_init(), as it is only
     set in setup_node_data(), at the end of the successful path.

Signed-off-by: Yinghai Lu 
Reviewed-by: Tang Chen 
Tested-by: Tang Chen 
---
 arch/x86/mm/numa.c | 69 ++
 1 file changed, 39 insertions(+), 30 deletions(-)

diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c
index a71c4e2..07ae800 100644
--- a/arch/x86/mm/numa.c
+++ b/arch/x86/mm/numa.c
@@ -477,7 +477,7 @@ static bool __init numa_meminfo_cover_memory(const struct numa_meminfo *mi)
 static int __init numa_register_memblks(struct numa_meminfo *mi)
 {
unsigned long uninitialized_var(pfn_align);
-   int i, nid;
+   int i;
 
/* Account for nodes with cpus and no memory */
node_possible_map = numa_nodes_parsed;
@@ -506,24 +506,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi)
if (!numa_meminfo_cover_memory(mi))
return -EINVAL;
 
-   /* Finally register nodes. */
-   for_each_node_mask(nid, node_possible_map) {
-   u64 start = PFN_PHYS(max_pfn);
-   u64 end = 0;
-
-   for (i = 0; i < mi->nr_blks; i++) {
-   if (nid != mi->blk[i].nid)
-   continue;
-   start = min(mi->blk[i].start, start);
-   end = max(mi->blk[i].end, end);
-   }
-
-   if (start < end)
-   setup_node_data(nid, start, end);
-   }
-
-   /* Dump memblock with node info and return. */
-   memblock_dump_all();
return 0;
 }
 
@@ -559,7 +541,6 @@ static int __init numa_init(int (*init_func)(void))
 
nodes_clear(numa_nodes_parsed);
nodes_clear(node_possible_map);
-   nodes_clear(node_online_map);
memset(&numa_meminfo, 0, sizeof(numa_meminfo));
WARN_ON(memblock_set_node(0, ULLONG_MAX, MAX_NUMNODES));
numa_reset_distance();
@@ -577,15 +558,6 @@ static int __init numa_init(int (*init_func)(void))
if (ret < 0)
return ret;
 
-   for (i = 0; i < nr_cpu_ids; i++) {
-   int nid = early_cpu_to_node(i);
-
-   if (nid == NUMA_NO_NODE)
-   continue;
-   if (!node_online(nid))
-   numa_clear_node(i);
-   }
-   numa_init_array();
return 0;
 }
 
@@ -618,7 +590,7 @@ static int __init dummy_numa_init(void)
  * last fallback is dummy single node config encomapssing whole memory and
  * never fails.
  */
-void __init x86_numa_init(void)
+static void __init early_x86_numa_init(void)
 {
if (!numa_off) {
 #ifdef CONFIG_X86_NUMAQ
@@ -638,6 +610,43 @@ void __init x86_numa_init(void)
numa_init(dummy_numa_init);
 }
 
+void __init x86_numa_init(void)
+{
+   int i, nid;
+   struct numa_meminfo *mi = &numa_meminfo;
+
+   early_x86_numa_init();
+
+   /* Finally register nodes. */
+   for_each_node_mask(nid, node_possible_map) {
+   u64 start = PFN_PHYS(max_pfn);
+   u64 end = 0;
+
+   for (i = 0; i < mi->nr_blks; i++) {
+   if (nid != mi->blk[i].nid)
+   continue;
+   start = min(mi->blk[i].start, start);
+   end = max(mi->blk[i].end, end);
+   }
+
+   if (start < end)
+   setup_node_data(nid, start, end); /* online is set */
+   }
+
+   /* Dump memblock with node info */
+   memblock_dump_all();
+
+   for (i = 0; i < nr_cpu_ids; i++) {
+   int nid = early_cpu_to_node(i);
+
+   if (nid == NUMA_NO_NODE)
+   continue;
+   if (!node_online(nid))
+   numa_clear_node(i);
+   }
+   numa_init_array();
+}
+
 static __init int find_near_online_node(int node)
 {
int n, val;
-- 
1.8.1.4
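
The "Finally register nodes" loop moved by this patch computes, for each
node, the span from the lowest block start to the highest block end and
only calls setup_node_data() when that span is non-empty. A minimal
userspace sketch of the span computation follows; struct mem_blk and
node_span() are hypothetical names, and the limit argument plays the
role of PFN_PHYS(max_pfn).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct numa_memblk. */
struct mem_blk {
	uint64_t start;
	uint64_t end;
	int nid;
};

/*
 * Compute the [start, end) span covering every block of node nid.
 * Returns 1 and fills *startp and *endp when the node has memory,
 * 0 otherwise (the case where the kernel loop skips the node).
 */
static int node_span(const struct mem_blk *blk, size_t nr_blks, int nid,
		     uint64_t limit, uint64_t *startp, uint64_t *endp)
{
	uint64_t start = limit;	/* shrinks towards the lowest block start */
	uint64_t end = 0;	/* grows towards the highest block end */
	size_t i;

	assert(startp != NULL && endp != NULL);

	for (i = 0; i < nr_blks; i++) {
		if (nid != blk[i].nid)
			continue;
		if (blk[i].start < start)
			start = blk[i].start;
		if (blk[i].end > end)
			end = blk[i].end;
	}

	if (start >= end)
		return 0;	/* no memory on this node */

	*startp = start;
	*endp = end;
	return 1;
}
```

A node whose blocks are interleaved with another node's still gets a
single span, so spans of different nodes may overlap.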


[PATCH] [NET]: Unmap fragment page once iterator is done

2013-06-14 Thread Wedson Almeida Filho

Callers of skb_seq_read() are currently forced to call skb_abort_seq_read()
even when consuming all the data because the last call to skb_seq_read (the
one that returns 0 to indicate the end) fails to unmap the last fragment page.

With this patch callers will be allowed to traverse the SKB data by calling
skb_prepare_seq_read() once and repeatedly calling skb_seq_read() as originally
intended (and documented in the original commit 677e90eda), that is, only call
skb_abort_seq_read() if the sequential read is actually aborted.

Signed-off-by: Wedson Almeida Filho 
---
 net/core/skbuff.c |7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cfd777b..26ea1cf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2554,8 +2554,13 @@ unsigned int skb_seq_read(unsigned int consumed, const u8 **data,
unsigned int block_limit, abs_offset = consumed + st->lower_offset;
skb_frag_t *frag;
 
-   if (unlikely(abs_offset >= st->upper_offset))
+   if (unlikely(abs_offset >= st->upper_offset)) {
+   if (st->frag_data) {
+   kunmap_atomic(st->frag_data);
+   st->frag_data = NULL;
+   }
return 0;
+   }
 
 next_skb:
block_limit = skb_headlen(st->cur_skb) + st->stepped_offset;
-- 
1.7.9.5
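
The calling convention described in this patch (prepare once, read until
0 is returned, abort only when stopping early) can be illustrated with a
simplified userspace model. This is not the kernel API: struct frag,
struct seq_state and the seq_*() functions are hypothetical, the
"consumed" offset of skb_seq_read() is left out, and mapped_count merely
models an outstanding kmap_atomic() of a fragment page.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fragment: a chunk of data that must be mapped to be read. */
struct frag {
	const unsigned char *data;
	size_t len;
};

/* Hypothetical iterator state, loosely modelled on struct skb_seq_state. */
struct seq_state {
	const struct frag *frags;
	size_t nr_frags;
	size_t next;		/* index of the next fragment to hand out */
	int mapped_count;	/* 1 while a fragment is still "mapped" */
};

static void seq_prepare(struct seq_state *st, const struct frag *frags,
			size_t nr_frags)
{
	st->frags = frags;
	st->nr_frags = nr_frags;
	st->next = 0;
	st->mapped_count = 0;
}

/* Drop the mapping left over from the previous read, if there is one. */
static void seq_unmap(struct seq_state *st)
{
	if (st->mapped_count)
		st->mapped_count--;
}

/*
 * Return the length of the next chunk and point *data at it, or 0 at
 * the end. The end-of-data path unmaps too, which is the behaviour the
 * patch adds. Fragments are assumed to be non-empty.
 */
static size_t seq_read(struct seq_state *st, const unsigned char **data)
{
	assert(st->mapped_count <= 1);

	seq_unmap(st);
	if (st->next >= st->nr_frags)
		return 0;

	assert(st->frags[st->next].len > 0);
	st->mapped_count++;
	*data = st->frags[st->next].data;
	return st->frags[st->next++].len;
}

/* Only needed when the caller stops before seq_read() returned 0. */
static void seq_abort(struct seq_state *st)
{
	seq_unmap(st);
}
```

In this model a caller that loops until seq_read() returns 0 leaves
nothing mapped, while a caller that breaks out early has to call
seq_abort().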


Re: [PATCH] mm: vmscan: remove redundant querying to shrinker

2013-06-14 Thread Kyungmin Park

On Sat, Jun 15, 2013 at 8:04 AM, Andrew Morton
 wrote:
>
> On Sat, 15 Jun 2013 03:13:26 +0900 HeeSub Shin  wrote:
>
> > Hello,
> >
> > On Fri, Jun 14, 2013 at 8:10 PM, Minchan Kim  wrote:
> >
> > >
> > > Hello,
> > >
> > > On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> > > > shrink_slab() queries each slab cache to get the number of
> > > > elements in it. In most cases such queries are cheap, but
> > > > not on some caches. For example, the Android low-memory-killer,
> > > > which operates as a slab shrinker, does a relatively
> > > > long calculation once invoked, and it is quite expensive.
> > >
> > > LMK as a shrinker is really bad, which nobody wanted
> > > when we reviewed it a few years ago, so that's one of the reasons
> > > LMK couldn't be promoted to mainline yet. So your motivation is
> > > already not attractive. ;-)
> > >
> > > >
> > > > This patch removes redundant queries to shrinker function
> > > > in the loop of shrink batch.
> > >
> > > I didn't review the patch and others don't want it, I guess.
> > > Because slab shrink is under construction and many patches were
> > > already merged into mmotm. Please look at the latest mmotm tree.
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> >
> >
> > >
> > > If your concern is still there and it's really a big concern for MM
> > > that we should take care of, NOT LMK, please resend it.
> > >
> > >
> > I've noticed that there are huge changes in the recent mmotm and you
> > guys already settled the issue of my concern. I usually keep track of
> > changes in the recent mm-tree, but this time I didn't. My bad :-)
> >
>
> I'm not averse to merging an improvement like this even if it gets
> rubbed out by forthcoming changes.  The big changes may never get
> merged or may be reverted.  And by merging this patch, others are more
> likely to grab it, backport it into earlier kernels and benefit from
> it.
>
> Also, the problem which this simple patch fixes might be present in a
> different form after the large patchset has been merged.  That does not
> appear to be the case this time.
>
> So I'd actually like to merge Heesub's patch.  Problem is, I don't have
> a way to redistribute it for testing - I'd need to effectively revert
> the whole thing when integrating Glauber's stuff on top, so nobody who
> is using linux-next would test Heesub's change.  Drat.
>
>
>
>
> However I'm a bit sceptical about the description here.  The shrinker
> is supposed to special-case the "nr_to_scan == 0" case and AFAICT
> drivers/staging/android/lowmemorykiller.c:lowmem_shrink() does do this,
> and it looks like the function will be pretty quick in this case.
>
> In other words, the behaviour of lowmem_shrink(nr_to_scan == 0) does
> not match Heesub's description.  What's up with that?
>
Right, but the real use case differs from the mainline kernel, and that
is where he found it. There are two approaches:
1. Reduce the number of calls into the shrinker, as this patch does.
2. Optimize the shrinker function itself, as you suggest below.

Thank you,
Kyungmin Park
>
>
> Also, there is an obvious optimisation which we could make to
> lowmem_shrink().  All this stuff:
>
> if (lowmem_adj_size < array_size)
> array_size = lowmem_adj_size;
> if (lowmem_minfree_size < array_size)
> array_size = lowmem_minfree_size;
> for (i = 0; i < array_size; i++) {
> if (other_free < lowmem_minfree[i] &&
> other_file < lowmem_minfree[i]) {
> min_score_adj = lowmem_adj[i];
> break;
> }
> }
>
> does nothing useful in the nr_to_scan==0 case and should be omitted for
> this special case.  But this problem was fixed in the large shrinker
> rework in -mm.
>

Re: [GIT PULL 1/5]: ux500 core changes for v3.11

2013-06-14 Thread Olof Johansson

On Tue, Jun 04, 2013 at 02:21:10PM +0200, Linus Walleij wrote:
> Hi ARM SoC folks,
> 
> this is the ux500 core changes for v3.11. See the signed
> tag for details.
> 
> The following changes since commit e4aa937ec75df0eea0bee03bffa3303ad36c986b:
> 
>   Linux 3.10-rc3 (2013-05-26 16:00:47 -0700)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-stericsson.git
> tags/ux500-core-for-arm-soc

Hi,

> 
> for you to fetch changes up to 080e0435e54298992dfc03dc04ca53cfe3de36ba:
> 
>   ARM: ux500: avoid warning in ux500_read_asicid (2013-06-04 11:21:58 +0200)
> 
> 
> Ux500 core changes:
> - Fixes for size and location of PRCMU TCDM
> - SD/MMC/SDIO caps updates to boardfiles
> - Misc fixes
> 
> 
> Arnd Bergmann (1):
>   ARM: ux500: avoid warning in ux500_read_asicid
> 
> Lee Jones (3):
>   ARM: ux500: Increase the size of the PRCMU's TCPM size
>   ARM: ux500: Remove incorrect DB9540 PRCMU TCDM base location
>   ARM: ux500: regulators: Remove misleading comment
> 
> Ulf Hansson (6):
>   ARM: ux500: Enable 100MHz for SD/SDIO/MMC devices
>   ARM: ux500: Don't set plf ocr mask for SD/MMC device
>   ARM: ux500: Enable support for RPMB and Reliable Write for eMMC
>   ARM: ux500: Enable support for discard for MMC/SD
>   ARM: ux500: Set eMMC and WLAN card slot as non-removable
>   ARM: ux500: Enable support for UHS-I SD-cards

So, the first two patches seem to be fixes (the TCDM ones). Then a bunch of
board patches, and then one warning fix.

Sorting the three fixes in a fixes-non-critical branch, and the rest in a board
branch, could have been a good way to organize this.

Anyway, not a big deal, I'll pull this into next/boards, since that's what most
of the patches are for.


-Olof

Re: [PATCH RFC ticketlock] v3 Auto-queued ticketlock

2013-06-14 Thread Paul E. McKenney

On Fri, Jun 14, 2013 at 09:28:16AM +0800, Lai Jiangshan wrote:
> On 06/14/2013 07:57 AM, Paul E. McKenney wrote:
> > On Fri, Jun 14, 2013 at 07:25:57AM +0800, Lai Jiangshan wrote:
> >> On Thu, Jun 13, 2013 at 11:22 PM, Paul E. McKenney
> >>  wrote:
> >>> On Thu, Jun 13, 2013 at 10:55:41AM +0800, Lai Jiangshan wrote:
>  On 06/12/2013 11:40 PM, Paul E. McKenney wrote:
> > Breaking up locks is better than implementing high-contention locks, but
> > if we must have high-contention locks, why not make them automatically
> > switch between light-weight ticket locks at low contention and queued
> > locks at high contention?  After all, this would remove the need for
> > the developer to predict which locks will be highly contended.
> >
> > This commit allows ticket locks to automatically switch between pure
> > ticketlock and queued-lock operation as needed.  If too many CPUs are
> > spinning on a given ticket lock, a queue structure will be allocated
> > and the lock will switch to queued-lock operation.  When the lock 
> > becomes
> > free, it will switch back into ticketlock operation.  The low-order bit
> > of the head counter is used to indicate that the lock is in queued mode,
> > which forces an unconditional mismatch between the head and tail 
> > counters.
> > This approach means that the common-case code path under conditions of
> > low contention is very nearly that of a plain ticket lock.
> >
> > A fixed number of queueing structures is statically allocated in an
> > array.  The ticket-lock address is used to hash into an initial element,
> > but if that element is already in use, it moves to the next element.  If
> > the entire array is already in use, continue to spin in ticket mode.
> >
> > Signed-off-by: Paul E. McKenney 
> > [ paulmck: Eliminate duplicate code and update comments (Steven 
> > Rostedt). ]
> > [ paulmck: Address Eric Dumazet review feedback. ]
> > [ paulmck: Use Lai Jiangshan idea to eliminate smp_mb(). ]
> > [ paulmck: Expand ->head_tkt from s32 to s64 (Waiman Long). ]
> > [ paulmck: Move cpu_relax() to main spin loop (Steven Rostedt). ]
> > [ paulmck: Reduce queue-switch contention (Waiman Long). ]
> > [ paulmck: __TKT_SPIN_INC for __ticket_spin_trylock() (Steffen 
> > Persvold). ]
> > [ paulmck: Type safety fixes (Steven Rostedt). ]
> > [ paulmck: Pre-check cmpxchg() value (Waiman Long). ]
> > [ paulmck: smp_mb() downgrade to smp_wmb() (Lai Jiangshan). ]
> 
> 
>  Hi, Paul,
> 
>  I simplify the code and remove the search by encoding the index of 
>  struct tkt_q_head
>  into lock->tickets.head.
> 
>  A) lock->tickets.head(when queued-lock):
>  -
>   index of struct tkt_q_head |0|1|
>  -
> >>>
> >>> Interesting approach!  It might reduce queued-mode overhead a bit in
> >>> some cases, though I bet that in the common case the first queue element
> >>> examined is the right one.  More on this below.
> >>>
>  The bit0 = 1 for queued, and the bit1 = 0 is reserved for 
>  __ticket_spin_unlock(),
>  thus __ticket_spin_unlock() will not change the higher bits of 
>  lock->tickets.head.
> 
>  B) tqhp->head is for the real value of lock->tickets.head.
>  if the last bit of tqhp->head is 1, it means it is the head ticket when 
>  started queuing.
> >>>
> >>> But don't you also need the xadd() in __ticket_spin_unlock() to become
> >>> a cmpxchg() for this to work?  Or is your patch missing your changes to
> >>> arch/x86/include/asm/spinlock.h?  Either way, this is likely to increase
> >>> the no-contention overhead, which might be counterproductive.  Wouldn't
> >>> hurt to get measurements, though.
> >>
> >> No, don't need to change __ticket_spin_unlock() in my idea.
> >> bit1 in the  tickets.head is reserved for __ticket_spin_unlock(),
> >> __ticket_spin_unlock() only changes the bit1, it will not change
> >> the higher bits. tkt_q_do_wake() will restore the tickets.head.
> >>
> >> This approach avoids cmpxchg in __ticket_spin_unlock().
> > 
> > Ah, I did miss that.  But doesn't the adjustment in __ticket_spin_lock()
> > need to be atomic in order to handle concurrent invocations of
> > __ticket_spin_lock()?
> 
> I don't understand, do we just discuss about __ticket_spin_unlock() only?
> Or does my suggestion hurt __ticket_spin_lock()?

On many architectures, it is harmless.  But my concern is that
__ticket_spin_lock() is atomically incrementing the full value
(both head and tail), but in such a way as to never change the
value of head.  So in theory, a concurrent non-atomic store to
head should be OK, but it does make me quite nervous.

At the very least, it needs a comment saying why it is safe.

Thanx, Paul
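
For readers following the bit layout, here is a minimal userspace sketch of
the two points under discussion, written with C11 atomics rather than the
kernel's xadd(). It assumes a 32-bit word with head in the low 16 bits and
tail in the high 16 bits, and a ticket step of 2 so that bit 0 of head can
flag queued mode. The names tkt_lock, tkt_take_ticket() and
tkt_encode_queued_head() are hypothetical and are not taken from the patch.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TKT_INC    2u	/* assumed ticket step: keeps bit 0 of head free */
#define TKT_QUEUED 1u	/* bit 0 of head: lock is in queued mode */

/* Hypothetical layout: head in the low 16 bits, tail in the high 16 bits. */
struct tkt_lock {
	_Atomic uint32_t head_tail;
};

static uint16_t tkt_head(uint32_t v) { return (uint16_t)(v & 0xffffu); }
static uint16_t tkt_tail(uint32_t v) { return (uint16_t)(v >> 16); }

/*
 * Take a ticket with one atomic add on the full word.  The addend only
 * touches the tail half, so head is never changed by this operation.
 * Returns the ticket number (the tail value before the add).
 */
static uint16_t tkt_take_ticket(struct tkt_lock *lock)
{
	uint32_t old = atomic_fetch_add(&lock->head_tail, TKT_INC << 16);

	return tkt_tail(old);
}

/* Queued-mode head as described by Lai: index | bit1 = 0 | bit0 = 1. */
static uint16_t tkt_encode_queued_head(uint16_t index)
{
	return (uint16_t)((index << 2) | TKT_QUEUED);
}

/* Recover the queue index; bits 0 and 1 are ignored. */
static uint16_t tkt_queued_index(uint16_t head)
{
	return (uint16_t)(head >> 2);
}
```

In this sketch an unlock that adds TKT_INC to an encoded head only flips
bit 1, so the index in the higher bits survives until the waker restores
the real head value. Whether a plain store to head is safe next to the
full-word atomic add is exactly the open question above; the sketch does
not settle it.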

> > Either way, it would be good to

Re: [PATCH RFC ticketlock] v3 Auto-queued ticketlock

2013-06-14 Thread Paul E. McKenney

On Fri, Jun 14, 2013 at 03:12:43PM +0800, Lai Jiangshan wrote:
> On 06/14/2013 07:57 AM, Paul E. McKenney wrote:
> > On Fri, Jun 14, 2013 at 07:25:57AM +0800, Lai Jiangshan wrote:
> >> On Thu, Jun 13, 2013 at 11:22 PM, Paul E. McKenney
> >>  wrote:
> >>> On Thu, Jun 13, 2013 at 10:55:41AM +0800, Lai Jiangshan wrote:
>  On 06/12/2013 11:40 PM, Paul E. McKenney wrote:
> > Breaking up locks is better than implementing high-contention locks, but
> > if we must have high-contention locks, why not make them automatically
> > switch between light-weight ticket locks at low contention and queued
> > locks at high contention?  After all, this would remove the need for
> > the developer to predict which locks will be highly contended.
> >
> > This commit allows ticket locks to automatically switch between pure
> > ticketlock and queued-lock operation as needed.  If too many CPUs are
> > spinning on a given ticket lock, a queue structure will be allocated
> > and the lock will switch to queued-lock operation.  When the lock 
> > becomes
> > free, it will switch back into ticketlock operation.  The low-order bit
> > of the head counter is used to indicate that the lock is in queued mode,
> > which forces an unconditional mismatch between the head and tail 
> > counters.
> > This approach means that the common-case code path under conditions of
> > low contention is very nearly that of a plain ticket lock.
> >
> > A fixed number of queueing structures is statically allocated in an
> > array.  The ticket-lock address is used to hash into an initial element,
> > but if that element is already in use, it moves to the next element.  If
> > the entire array is already in use, continue to spin in ticket mode.
> >
> > Signed-off-by: Paul E. McKenney 
> > [ paulmck: Eliminate duplicate code and update comments (Steven 
> > Rostedt). ]
> > [ paulmck: Address Eric Dumazet review feedback. ]
> > [ paulmck: Use Lai Jiangshan idea to eliminate smp_mb(). ]
> > [ paulmck: Expand ->head_tkt from s32 to s64 (Waiman Long). ]
> > [ paulmck: Move cpu_relax() to main spin loop (Steven Rostedt). ]
> > [ paulmck: Reduce queue-switch contention (Waiman Long). ]
> > [ paulmck: __TKT_SPIN_INC for __ticket_spin_trylock() (Steffen 
> > Persvold). ]
> > [ paulmck: Type safety fixes (Steven Rostedt). ]
> > [ paulmck: Pre-check cmpxchg() value (Waiman Long). ]
> > [ paulmck: smp_mb() downgrade to smp_wmb() (Lai Jiangshan). ]
> 
> 
>  Hi, Paul,
> 
>  I simplify the code and remove the search by encoding the index of 
>  struct tkt_q_head
>  into lock->tickets.head.
> 
>  A) lock->tickets.head(when queued-lock):
>  -
>   index of struct tkt_q_head |0|1|
>  -
> >>>
> >>> Interesting approach!  It might reduce queued-mode overhead a bit in
> >>> some cases, though I bet that in the common case the first queue element
> >>> examined is the right one.  More on this below.
> >>>
>  The bit0 = 1 for queued, and the bit1 = 0 is reserved for 
>  __ticket_spin_unlock(),
>  thus __ticket_spin_unlock() will not change the higher bits of 
>  lock->tickets.head.
> 
>  B) tqhp->head is for the real value of lock->tickets.head.
>  if the last bit of tqhp->head is 1, it means it is the head ticket when 
>  started queuing.
> >>>
> >>> But don't you also need the xadd() in __ticket_spin_unlock() to become
> >>> a cmpxchg() for this to work?  Or is your patch missing your changes to
> >>> arch/x86/include/asm/spinlock.h?  Either way, this is likely to increase
> >>> the no-contention overhead, which might be counterproductive.  Wouldn't
> >>> hurt to get measurements, though.
> >>
> >> No, don't need to change __ticket_spin_unlock() in my idea.
> >> bit1 in the  tickets.head is reserved for __ticket_spin_unlock(),
> >> __ticket_spin_unlock() only changes the bit1, it will not change
> >> the higher bits. tkt_q_do_wake() will restore the tickets.head.
> >>
> >> This approach avoids cmpxchg in __ticket_spin_unlock().
> > 
> > Ah, I did miss that.  But doesn't the adjustment in __ticket_spin_lock()
> > need to be atomic in order to handle concurrent invocations of
> > __ticket_spin_lock()?
> > 
> > Either way, it would be good to see the performance effects of this.
> > 
> > Thanx, Paul
> > 
> >>> Given the results that Davidlohr posted, I believe that the following
> >>> optimizations would also provide some improvement:
> >>>
> >>> 1.  Move the call to tkt_spin_pass() from __ticket_spin_lock()
> >>> to a separate linker section in order to reduce the icache
> >>> penalty exacted by the spinloop.  This is likely to be causing
> >>> some of the performance reductions in the cases where ticket
> >>>

Re: [RFC] Staging: imx-drm: Do not use fractional part of divider

2013-06-14 Thread Jiada Wang


Hello Alexander


Alexander Shiyan wrote:

Hello.

Analysis of the imx-drm driver led me to believe that using the fractional part
of the divider is not always a good idea.
For example, with a parallel display bus connected to an LVDS converter chip
(DS90C363), using the fractional part makes the clock unstable, and the
on-chip PLL does not work properly, or rather, does not work at all.

Let me give a specific example.
ipu_crtc_mode_set 0x36314752
imx-ipuv3 4000.ipu: clk_di_round_rate: inrate: 13300 div: 0x0035 
outrate: 40150928 wanted: 4000
imx-ipuv3 4000.ipu: clk_di_round_rate: inrate: 13300 div: 0x0035 
outrate: 40150928 wanted: 40150928
imx-ipuv3 4000.ipu: clk_di_set_rate: inrate: 13300 desired: 40150928 
div: 0x0035

In this case the divider is 3.5, which results in an incorrect clock. See the
attached oscillogram FTEK.jpg.

After the patch the clock is OK. The patch just uncomments some FSL code.
imx-ipuv3 4000.ipu: clk_di_round_rate: inrate: 13300 div: 0x0040 
outrate: 3325 wanted: 4000
imx-ipuv3 4000.ipu: clk_di_round_rate: inrate: 13300 div: 0x0040 
outrate: 3325 wanted: 3325
imx-ipuv3 4000.ipu: clk_di_set_rate: inrate: 13300 desired: 3325 
div: 0x0040

See an attached oscillogram F0001TEK.jpg.

So, I would like developers to review this, and I am waiting for comments.




Recently I have also been looking at the use of the fractional part of CLKGEN0,
and want to share some information I found.

In our code we use an external PLL to drive the DI pixel clock. Although
I haven't checked with an oscilloscope, the fractional part of CLKGEN0
apparently doesn't work as described in the reference manual.
After some investigation I found that 0x8 (0.5) seems to work fine.


Our solution is to set the DI pixel clock's root clock to an integer multiple
of clk_di_pixel, as close as possible, so that we can avoid using the
fractional part to get the desired clock. If PLL -> ipu_di_podf cannot
provide a clock that is close enough, we then try to set it to X.5 times the
DI pixel clock, so that only the "proven" 0x8 fractional part is used.
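
The divider arithmetic above can be sketched in plain C as follows. This
assumes the divider carries a 4-bit fractional part (so 0x10 is 1.0 and a
low nibble of 0x8 is 0.5), as the values in this thread suggest. The
function names are hypothetical and the 133 MHz input rate in the note
after the code is an illustrative assumption, not a value taken from the
driver.

```c
#include <stdint.h>

/* Assumed format: integer part above a 4-bit fraction (units of 1/16). */
#define DI_DIV_FRAC_MASK 0x0fu
#define DI_DIV_ONE       0x10u

/* Round a divider up to the next whole number, never going below 1.0. */
static uint32_t di_div_round_up_to_integer(uint32_t div)
{
	if (div < DI_DIV_ONE)
		return DI_DIV_ONE;
	if (div & DI_DIV_FRAC_MASK) {
		div += DI_DIV_ONE;
		div &= ~DI_DIV_FRAC_MASK;	/* strip the fractional part */
	}
	return div;
}

/* Output rate for a given input rate and 1/16-unit divider. */
static uint64_t di_out_rate(uint64_t inrate, uint32_t div)
{
	return inrate * 16u / div;
}

/* The "X or X.5" rule: accept only a fraction of 0 or 0x8. */
static int di_div_is_safe(uint32_t div)
{
	uint32_t frac = div & DI_DIV_FRAC_MASK;

	return frac == 0 || frac == 0x8;
}
```

With an assumed 133 MHz input, di_out_rate(133000000, 0x40) gives
33.25 MHz. Note that di_div_round_up_to_integer(0x11) yields 0x20, which
nearly halves the clock; that is the case the inline comment on the patch
below warns about.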



diff --git a/drivers/staging/imx-drm/ipu-v3/ipu-di.c b/drivers/staging/imx-drm/ipu-v3/ipu-di.c
index 19d777e..d424c22 100644
--- a/drivers/staging/imx-drm/ipu-v3/ipu-di.c
+++ b/drivers/staging/imx-drm/ipu-v3/ipu-di.c
@@ -154,22 +154,15 @@ static int ipu_di_clk_calc_div(unsigned long inrate, 
unsigned long outrate)

if (div < 0x10)
div = 0x10;
-
-#ifdef WTF_IS_THIS
-   /*
-* Freescale has this in their Kernel. It is neither clear what
-* it does nor why it does it
-*/
-   if (div & 0x10)
-   div &= ~0x7;
else {
/* Round up divider if it gets us closer to desired pix clk */
-   if ((div & 0xC) == 0xC) {
+   if (div & 0x0f) {
div += 0x10;
-   div &= ~0xF;
+   /* Strip fractional part of divider */
+   div &= ~0x0f;
If div = 0x11, and the display is not forgiving enough, the pixel clock 
will probably not be accepted by it.


thanks,
jiada

}
}
-#endif
+
return div;
  }




___
linux-arm-kernel mailing list
linux-arm-ker...@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel




[PATCH] gpio MIPS/OCTEON: Add a driver for OCTEON's on-chip GPIO pins.

2013-06-14 Thread David Daney

From: David Daney 

The SOCs in the OCTEON family have 16 (or in some cases 20) on-chip
GPIO pins; this driver handles them all.  Configuring the pins as
interrupt sources is handled elsewhere (OCTEON's irq handling code).

Signed-off-by: David Daney 
---

This patch depends somewhat on patches in Ralf's MIPS/Linux -next tree
where we have patches that enable the Kconfig CAVIUM_OCTEON_SOC and
ARCH_REQUIRE_GPIOLIB symbols.  Apart from that it is stand-alone and
is probably suitable for merging via the GPIO tree.

Device tree binding definitions already exist for this device in
Documentation/devicetree/bindings/gpio/cavium-octeon-gpio.txt

 drivers/gpio/Kconfig   |   8 +++
 drivers/gpio/Makefile  |   1 +
 drivers/gpio/gpio-octeon.c | 153 +
 3 files changed, 162 insertions(+)
 create mode 100644 drivers/gpio/gpio-octeon.c

diff --git a/drivers/gpio/Kconfig b/drivers/gpio/Kconfig
index 573c449..7b5df9a 100644
--- a/drivers/gpio/Kconfig
+++ b/drivers/gpio/Kconfig
@@ -190,6 +190,14 @@ config GPIO_MXS
select GPIO_GENERIC
select GENERIC_IRQ_CHIP
 
+config GPIO_OCTEON
+   tristate "Cavium OCTEON GPIO"
+   depends on GPIOLIB && CAVIUM_OCTEON_SOC
+   default y
+   help
+ Say yes here to support the on-chip GPIO lines on the OCTEON
+ family of SOCs.
+
 config GPIO_PL061
bool "PrimeCell PL061 GPIO support"
depends on ARM && ARM_AMBA
diff --git a/drivers/gpio/Makefile b/drivers/gpio/Makefile
index 0cb2d65..b8487b6 100644
--- a/drivers/gpio/Makefile
+++ b/drivers/gpio/Makefile
@@ -50,6 +50,7 @@ obj-$(CONFIG_GPIO_MSM_V2) += gpio-msm-v2.o
 obj-$(CONFIG_GPIO_MVEBU)+= gpio-mvebu.o
 obj-$(CONFIG_GPIO_MXC) += gpio-mxc.o
 obj-$(CONFIG_GPIO_MXS) += gpio-mxs.o
+obj-$(CONFIG_GPIO_OCTEON)  += gpio-octeon.o
 obj-$(CONFIG_ARCH_OMAP)+= gpio-omap.o
 obj-$(CONFIG_GPIO_PCA953X) += gpio-pca953x.o
 obj-$(CONFIG_GPIO_PCF857X) += gpio-pcf857x.o
diff --git a/drivers/gpio/gpio-octeon.c b/drivers/gpio/gpio-octeon.c
new file mode 100644
index 000..f5bd127
--- /dev/null
+++ b/drivers/gpio/gpio-octeon.c
@@ -0,0 +1,153 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2011, 2012 Cavium Inc.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#define RX_DAT 0x80
+#define TX_SET 0x88
+#define TX_CLEAR 0x90
+/*
+ * The address offset of the GPIO configuration register for a given
+ * line.
+ */
+static unsigned int bit_cfg_reg(unsigned int gpio)
+{
+   if (gpio < 16)
+   return 8 * gpio;
+   else
+   return 8 * (gpio - 16) + 0x100;
+}
+
+struct octeon_gpio {
+   struct gpio_chip chip;
+   u64 register_base;
+};
+
+static int octeon_gpio_dir_in(struct gpio_chip *chip, unsigned offset)
+{
+   struct octeon_gpio *gpio = container_of(chip, struct octeon_gpio, chip);
+
+   cvmx_write_csr(gpio->register_base + bit_cfg_reg(offset), 0);
+   return 0;
+}
+
+static void octeon_gpio_set(struct gpio_chip *chip, unsigned offset, int value)
+{
+   struct octeon_gpio *gpio = container_of(chip, struct octeon_gpio, chip);
+   u64 mask = 1ull << offset;
+   u64 reg = gpio->register_base + (value ? TX_SET : TX_CLEAR);
+   cvmx_write_csr(reg, mask);
+}
+
+static int octeon_gpio_dir_out(struct gpio_chip *chip, unsigned offset,
+  int value)
+{
+   struct octeon_gpio *gpio = container_of(chip, struct octeon_gpio, chip);
+   union cvmx_gpio_bit_cfgx cfgx;
+
+   octeon_gpio_set(chip, offset, value);
+
+   cfgx.u64 = 0;
+   cfgx.s.tx_oe = 1;
+
+   cvmx_write_csr(gpio->register_base + bit_cfg_reg(offset), cfgx.u64);
+   return 0;
+}
+
+static int octeon_gpio_get(struct gpio_chip *chip, unsigned offset)
+{
+   struct octeon_gpio *gpio = container_of(chip, struct octeon_gpio, chip);
+   u64 read_bits = cvmx_read_csr(gpio->register_base + RX_DAT);
+
+   return ((1ull << offset) & read_bits) != 0;
+}
+
+static int octeon_gpio_probe(struct platform_device *pdev)
+{
+   struct octeon_gpio *gpio;
+   struct gpio_chip *chip;
+   struct resource *res_mem;
+   int err = 0;
+
+   gpio = devm_kzalloc(&pdev->dev, sizeof(*gpio), GFP_KERNEL);
+   if (!gpio)
+   return -ENOMEM;
+   chip = &gpio->chip;
+
+   res_mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+   if (res_mem == NULL) {
+   dev_err(&pdev->dev, "found no memory resource\n");
+   err = -ENXIO;
+   goto out;
+   }
+   if (!devm_request_mem_region(&pdev->dev, res_mem->start,
+   resource_size(res_mem),
+res_mem->name)) {
+   dev_err(&pdev->dev, "request_mem_region failed\n");
+

[PATCH 2/2] Add sunxi-sid to dts for sun4i and sun5i

2013-06-14 Thread Oliver Schinagl

From: Oliver Schinagl 

This patch adds the sunxi-sid node to the device tree for
sun4i and sun5i.

Signed-off-by: Oliver Schinagl 
---
 arch/arm/boot/dts/sun4i-a10.dtsi | 5 +
 arch/arm/boot/dts/sun5i-a13.dtsi | 5 +
 2 files changed, 10 insertions(+)

diff --git a/arch/arm/boot/dts/sun4i-a10.dtsi b/arch/arm/boot/dts/sun4i-a10.dtsi
index e7ef619..bc71d64 100644
--- a/arch/arm/boot/dts/sun4i-a10.dtsi
+++ b/arch/arm/boot/dts/sun4i-a10.dtsi
@@ -213,6 +213,11 @@
reg = <0x01c20c90 0x10>;
};
 
+   sid: eeprom@01c23800 {
+   compatible = "allwinner,sun4i-sid";
+   reg = <0x01c23800 0x10>;
+   };
+
uart0: serial@01c28000 {
compatible = "snps,dw-apb-uart";
reg = <0x01c28000 0x400>;
diff --git a/arch/arm/boot/dts/sun5i-a13.dtsi b/arch/arm/boot/dts/sun5i-a13.dtsi
index 8ba65c1..c80c81b 100644
--- a/arch/arm/boot/dts/sun5i-a13.dtsi
+++ b/arch/arm/boot/dts/sun5i-a13.dtsi
@@ -196,6 +196,11 @@
reg = <0x01c20c90 0x10>;
};
 
+   sid: eeprom@01c23800 {
+   compatible = "allwinner,sun4i-sid";
+   reg = <0x01c23800 0x10>;
+   };
+
uart1: serial@01c28400 {
compatible = "snps,dw-apb-uart";
reg = <0x01c28400 0x400>;
-- 
1.8.1.5


[PATCH 0/2] v3 Driver for Allwinner sunxi Security ID

2013-06-14 Thread Oliver Schinagl

From: Oliver Schinagl 

I've tried to incorporate all requests/issues, but as always I may have
missed some. I've talked to a few people, mostly Maxime, about return vs
goto; he said it was up to me, and I have chosen to stick with goto for
the error handling.

Changes from v2:
* Removed the global pointer, we can change that when the need for external
  access arises
* Fixed header inclusions
* Corrected if guards. There were some crude mistakes there
* Changed offset to an unsigned int so we don't have to worry about negatives
* Cleaned up variable declarations
* Changed ret value, ENXIO (No device/io) as that better matches a missing dt
* Made the informational print at load time show the version so it is somewhat useful

Changes from v1:
* Renamed the sysfs exported key to eeprom, since it really is a read-only eeprom
* Removed mention of sun[67]i since we haven't tested those
* Fixed up mistakes in comments
* Removed PAGE_SIZE references, since this is a binary only driver
* Removed lookup table and calculate offsets better
* Use proper endianness
* Add the SID to seed the kernel entropy pool
* Rewrite probe to use platform_get_resource/devm_ioremap_resource instead


The Allwinner A-series SoCs expose their factory-programmed e-fuses via
registers. These should in theory be programmable, but this is
still to be confirmed. The fuses do appear to be unique enough to be
used as serial numbers or RSA keys, or to generate MAC addresses from. If
they turn out to be user programmable, their usefulness obviously increases.
Allwinner did use the fuses initially to determine the chip type.

This driver supports all currently known chips based on datasheets and 'dumped'
drivers that we have so far; the dts is only implemented for known chips.
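
As a rough illustration of the byte-wise access, here is a userspace sketch
that mirrors the shift-and-mask approach of sunxi_sid_read_byte() in patch
1/2. It assumes the four 32-bit keys have already been read into an array;
the names sid_byte() and sid_copy() are hypothetical and no register access
or endianness conversion is modelled.

```c
#include <stddef.h>
#include <stdint.h>

#define SID_KEYS 4			/* four 32-bit keys */
#define SID_SIZE (SID_KEYS * 4)		/* 16 bytes in total */

/* Return byte 'offset' of the ID, or 0 if the offset is out of range. */
static uint8_t sid_byte(const uint32_t keys[SID_KEYS], unsigned int offset)
{
	if (offset >= SID_SIZE)
		return 0;

	return (uint8_t)(keys[offset / 4] >> ((offset % 4) * 8));
}

/*
 * Copy up to 'size' bytes starting at 'pos' into 'buf', stopping at the
 * end of the ID.  Returns the number of bytes actually copied.
 */
static size_t sid_copy(const uint32_t keys[SID_KEYS], uint8_t *buf,
		       size_t pos, size_t size)
{
	size_t i;

	for (i = 0; i < size && pos < SID_SIZE && i < SID_SIZE - pos; i++)
		buf[i] = sid_byte(keys, (unsigned int)(pos + i));

	return i;
}
```

A read that starts near the end of the ID is simply cut short, which is
the same short-read behaviour the sysfs read handler in the driver has.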

It has been tested on a Cubieboard 1

This is my very first driver so please try to be gentle 

Oliver Schinagl (2):
  Initial support for Allwinner's Security ID fuses
  Add sunxi-sid to dts for sun4i and sun5i

 arch/arm/boot/dts/sun4i-a10.dtsi |   5 ++
 arch/arm/boot/dts/sun5i-a13.dtsi |   5 ++
 drivers/misc/eeprom/Kconfig  |  17 
 drivers/misc/eeprom/Makefile |   1 +
 drivers/misc/eeprom/sunxi_sid.c  | 167 +++
 5 files changed, 195 insertions(+)
 create mode 100644 drivers/misc/eeprom/sunxi_sid.c

-- 
1.8.1.5


[PATCH 1/2] Initial support for Allwinner's Security ID fuses

2013-06-14 Thread Oliver Schinagl

From: Oliver Schinagl 

Allwinner has electric fuses (efuse) on their line of chips. This driver
reads those fuses, seeds the kernel entropy and exports them as a sysfs node.

These fuses are most likely programmed at the factory, encoding
things like chip ID, some sort of serial number, etc., and appear to be
reasonably unique.
While in theory these should be writable by the user, it will probably
be inconvenient to do so. Allwinner recommends that a certain input pin,
labeled 'efuse_vddq', be connected to GND. To write these fuses, 2.5 V
needs to be applied to this pin.

Even so, they can still be used to generate a board-unique MAC address and a
board-unique RSA key, and to seed the kernel RNG.

Currently supported are the following known chips:
Allwinner sun4i (A10)
Allwinner sun5i (A10s, A13)

Signed-off-by: Oliver Schinagl 
---
 drivers/misc/eeprom/Kconfig |  17 
 drivers/misc/eeprom/Makefile|   1 +
 drivers/misc/eeprom/sunxi_sid.c | 167 
 3 files changed, 185 insertions(+)
 create mode 100644 drivers/misc/eeprom/sunxi_sid.c

diff --git a/drivers/misc/eeprom/Kconfig b/drivers/misc/eeprom/Kconfig
index 04f2e1f..c7bc6ed 100644
--- a/drivers/misc/eeprom/Kconfig
+++ b/drivers/misc/eeprom/Kconfig
@@ -96,4 +96,21 @@ config EEPROM_DIGSY_MTC_CFG
 
  If unsure, say N.
 
+config EEPROM_SUNXI_SID
+   tristate "Allwinner sunxi security ID support"
+   depends on ARCH_SUNXI && SYSFS
+   help
+ This is a driver for the 'security ID' available on various Allwinner
+ devices. Currently supported are:
+   sun4i (A10)
+   sun5i (A13)
+
+ Due to the potential risks involved with changing e-fuses,
+ this driver is read-only
+
+ For more information visit http://linux-sunxi.org/SID
+
+ This driver can also be built as a module. If so, the module
+ will be called sunxi_sid.
+
 endmenu
diff --git a/drivers/misc/eeprom/Makefile b/drivers/misc/eeprom/Makefile
index fc1e81d..9507aec 100644
--- a/drivers/misc/eeprom/Makefile
+++ b/drivers/misc/eeprom/Makefile
@@ -4,4 +4,5 @@ obj-$(CONFIG_EEPROM_LEGACY) += eeprom.o
 obj-$(CONFIG_EEPROM_MAX6875)   += max6875.o
 obj-$(CONFIG_EEPROM_93CX6) += eeprom_93cx6.o
 obj-$(CONFIG_EEPROM_93XX46)+= eeprom_93xx46.o
+obj-$(CONFIG_EEPROM_SUNXI_SID) += sunxi_sid.o
 obj-$(CONFIG_EEPROM_DIGSY_MTC_CFG) += digsy_mtc_eeprom.o
diff --git a/drivers/misc/eeprom/sunxi_sid.c b/drivers/misc/eeprom/sunxi_sid.c
new file mode 100644
index 000..f014e1b
--- /dev/null
+++ b/drivers/misc/eeprom/sunxi_sid.c
@@ -0,0 +1,167 @@
+/*
+ * Copyright (c) 2013 Oliver Schinagl
+ * http://www.linux-sunxi.org
+ *
+ * Oliver Schinagl 
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * This driver exposes the Allwinner security ID, a 128 bit eeprom, in byte
+ * sized chunks.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define DRV_NAME "sunxi-sid"
+#define DRV_VERSION "1.0"
+
+/* There are 4 32-bit keys */
+#define SID_KEYS 4
+/* and 4 byte sized keys per 32-bit key */
+#define SID_SIZE (SID_KEYS * 4)
+
+
+/* We read the entire key, but only return the requested byte. This is of
+ * course slower than it could be and uses 4 times as many reads as needed,
+ * but keeps the code simpler.
+ */
+static u8 sunxi_sid_read_byte(const void __iomem *sid_reg_base,
+ const unsigned int offset)
+{
+   u32 sid_key = 0;
+
+   if (offset >= SID_SIZE)
+   goto exit;
+
+   sid_key = ioread32be(sid_reg_base + round_down(offset, 4));
+   sid_key >>= (offset % 4) * 8;
+   sid_key &= 0xff;
+   /* fall through */
+
+exit:
+   return (u8)sid_key;
+}
+
+static ssize_t sid_read(struct file *fd, struct kobject *kobj,
+   struct bin_attribute *attr, char *buf,
+   loff_t pos, size_t size)
+{
+   int i;
+   struct platform_device *pdev;
+   void __iomem *sid_reg_base;
+
+   pdev = (struct platform_device *)to_platform_device(kobj_to_dev(kobj));
+   sid_reg_base = (void __iomem *)platform_get_drvdata(pdev);
+
+   for (i = 0; i < size; i++) {
+   if ((pos + i) >= SID_SIZE || (pos < 0))
+   break;
+   buf[i] = sunxi_sid_read_byte(sid_reg_base, pos + i);
+   }
+
+   return i;
+}
+
+static const struct of_device_id

[PATCH v2] Drivers: hv: vmbus: incorrect device name is printed when child device is unregistered

2013-06-14 Thread Fernando Soto

From: Fernando Soto  
Please CC me, I am not subscribed to the list.

Whenever a device is unregistered in vmbus_device_unregister 
(drivers/hv/vmbus_drv.c), the device name in the log message may contain 
garbage as the memory has already been freed by the time pr_info is called. Log 
example:
 [ 3149.170475] hv_vmbus: child device àõsèè0_5 unregistered

By logging the message just before calling device_unregister, the correct 
device name is printed:
[ 3145.034652] hv_vmbus: child device vmbus_0_5 unregistered

Also change the register and unregister messages to debug level to avoid
unnecessarily cluttering the kernel log.
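
The ordering issue can be shown with a small userspace analogue. This is
only a sketch: struct child_dev and its helpers are hypothetical stand-ins
for the device object and its reference counting, not the vmbus code. The
point is that the message is formatted while the name is still valid, and
only then is the last reference dropped.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted device object. */
struct child_dev {
	char name[32];
	int refcount;
};

static struct child_dev *child_dev_create(const char *name)
{
	struct child_dev *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	snprintf(dev->name, sizeof(dev->name), "%s", name);
	dev->refcount = 1;
	return dev;
}

/* Drop a reference; after the last one, 'dev' must not be touched. */
static void child_dev_put(struct child_dev *dev)
{
	if (--dev->refcount == 0)
		free(dev);
}

static void child_dev_unregister(struct child_dev *dev, char *log, size_t len)
{
	/* Format the message first, while dev->name is still valid. */
	snprintf(log, len, "child device %s unregistered", dev->name);

	/* This may free 'dev', so nothing below may dereference it. */
	child_dev_put(dev);
}
```

Swapping the two statements in child_dev_unregister() would read the name
from freed memory, which is the kind of access that produced the garbage in
the log example above.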

Signed-off-by: Fernando M Soto 
--- linux-3.10-rc5/drivers/hv/vmbus_drv.c.orig  2013-06-13 19:20:55.359511352 
-0400
+++ linux-3.10-rc5/drivers/hv/vmbus_drv.c   2013-06-14 19:00:21.722105728 
-0400
@@ -686,7 +686,7 @@ int vmbus_device_register(struct hv_devi
if (ret)
pr_err("Unable to register child device\n");
else
-   pr_info("child device %s registered\n",
+   pr_debug("child device %s registered\n",
dev_name(&child_device_obj->device));
 
return ret;
@@ -698,14 +698,14 @@ int vmbus_device_register(struct hv_devi
  */
 void vmbus_device_unregister(struct hv_device *device_obj)
 {
+   pr_debug("child device %s unregistered\n",
+   dev_name(&device_obj->device));
+
/*
 * Kick off the process of unregistering the device.
 * This will call vmbus_remove() and eventually vmbus_device_release()
 */
device_unregister(&device_obj->device);
-
-   pr_info("child device %s unregistered\n",
-   dev_name(&device_obj->device));
 }
 
 

Re: [PATCH 0/5] ACPI / scan: Make it possible to use the container hotplug with other scan handlers

2013-06-14 Thread Rafael J. Wysocki

On Friday, June 14, 2013 03:32:42 PM Tony Luck wrote:
> On Fri, Jun 14, 2013 at 3:23 PM, Rafael J. Wysocki  wrote:
> > Can you please just test patch [5/5] alone without patches [1-4/5]?  We 
> > believe
> > that this should work too and if that's the case, we'll only need that patch
> > and a reworked [1/5].
> 
> Your belief is sound - I popped all five patches and then applied just
> 5/5 ... and
> the system still works.

Great, thanks!

Can you please apply the appended patch on top of it and see if the system
still works then?

Rafael


---
From: Rafael J. Wysocki 
Subject: ACPI / scan: Do not bind ACPI drivers to objects with scan handlers

ACPI drivers must not be bound to device objects having scan handlers
attached to them, so make acpi_device_probe() fail with -EINVAL if the
device object being probed has an ACPI scan handler.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/scan.c  |3 +++
 drivers/acpi/video.c |3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

Index: linux-pm/drivers/acpi/scan.c
===
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -939,6 +939,9 @@ static int acpi_device_probe(struct devi
struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver);
int ret;
 
+   if (acpi_dev->handler)
+   return -EINVAL;
+
if (!acpi_drv->ops.add)
return -ENOSYS;
 
Index: linux-pm/drivers/acpi/video.c
===
--- linux-pm.orig/drivers/acpi/video.c
+++ linux-pm/drivers/acpi/video.c
@@ -1722,9 +1722,6 @@ static int acpi_video_bus_add(struct acp
int error;
acpi_status status;
 
-   if (device->handler)
-   return -EINVAL;
-
status = acpi_walk_namespace(ACPI_TYPE_DEVICE,
device->parent->handle, 1,
acpi_video_bus_match, NULL,


Re: [PATCH] PCI / ACPI / PM: Use correct power state strings in messages

2013-06-14 Thread Rafael J. Wysocki

On Friday, June 14, 2013 04:49:49 PM Bjorn Helgaas wrote:
> On Sat, Jun 15, 2013 at 12:28:12AM +0200, Rafael J. Wysocki wrote:
> > On Friday, June 14, 2013 11:08:44 AM Bjorn Helgaas wrote:
> > > On Thu, Jun 13, 2013 at 4:29 PM, Rafael J. Wysocki  wrote:
> > > > From: Rafael J. Wysocki 
> > > >
> > > > Make acpi_pci_set_power_state() print the name of the ACPI device
> > > > power state the device has been actually put into instead of printing
> > > > the name of the requested PCI device power state, which need not be
> > > > the same.
> > > >
> > > > Signed-off-by: Rafael J. Wysocki 
> > > > ---
> > > >
> > > > For 3.11.
> > > >
> > > > Thanks,
> > > > Rafael
> > > >
> > > > ---
> > > >  drivers/pci/pci-acpi.c |2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > Index: linux-pm/drivers/pci/pci-acpi.c
> > > > ===
> > > > --- linux-pm.orig/drivers/pci/pci-acpi.c
> > > > +++ linux-pm/drivers/pci/pci-acpi.c
> > > > @@ -211,7 +211,7 @@ static int acpi_pci_set_power_state(stru
> > > >
> > > > if (!error)
> > > > dev_info(&dev->dev, "power state changed by ACPI to 
> > > > %s\n",
> > > > -pci_power_name(state));
> > > > +acpi_power_state_string(state_conv[state]));
> > > >
> > > > return error;
> > > >  }
> > > >
> > > 
> > > Just to double-check this, it *looks* like the effective change is
> > > that for PCI_D3hot and PCI_D3cold, we'll print "(unknown)" instead of
> > > "D3hot" and "D3cold" because state_conv[] folds both PCI_D3hot and
> > > PCI_D3cold into ACPI_STATE_D3, and acpi_power_state_string() doesn't
> > > have a case for ACPI_STATE_D3.
> > 
> > No, it won't work like this, because ACPI_STATE_D3 == ACPI_STATE_D3_COLD. 
> > :-)
> > 
> > So, actually, "D3cold" will be printed for both PCI_D3hot and PCI_D3cold
> > (and I have tested this).
> 
> Ah, right, I should have noticed that.
> 
> > Well, I think it's better to actually replace ACPI_STATE_D3 everywhere in
> > pci-acpi.c with ACPI_STATE_D3_COLD to avoid that confusion.  Do you want me
> > to prepare a patch?
> 
> If the following is OK, I'll just put it in my pci/misc branch:

Yes, it is, thanks!

Rafael


> commit fc6504b3a4dc9beae782a11e6f7c3c4a9f077fb8
> Author: Rafael J. Wysocki 
> Date:   Fri Jun 14 00:29:50 2013 +0200
> 
> PCI / ACPI / PM: Use correct power state strings in messages
> 
> Make acpi_pci_set_power_state() print the name of the ACPI device
> power state the device has been actually put into instead of printing
> the name of the requested PCI device power state, which need not be
> the same.
> 
> [bhelgaas: use ACPI_STATE_D3_COLD (ACPI_STATE_D3 == ACPI_STATE_D3_COLD)]
> Signed-off-by: Rafael J. Wysocki 
> Signed-off-by: Bjorn Helgaas 
> 
> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
> index 6c15d6a..dbdc5f7 100644
> --- a/drivers/pci/pci-acpi.c
> +++ b/drivers/pci/pci-acpi.c
> @@ -186,8 +186,8 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>   [PCI_D0] = ACPI_STATE_D0,
>   [PCI_D1] = ACPI_STATE_D1,
>   [PCI_D2] = ACPI_STATE_D2,
> - [PCI_D3hot] = ACPI_STATE_D3,
> - [PCI_D3cold] = ACPI_STATE_D3
> + [PCI_D3hot] = ACPI_STATE_D3_COLD,
> + [PCI_D3cold] = ACPI_STATE_D3_COLD,
>   };
>   int error = -EINVAL;
>  
> @@ -211,7 +211,7 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>  
>   if (!error)
>   dev_info(&dev->dev, "power state changed by ACPI to %s\n",
> -  pci_power_name(state));
> +  acpi_power_state_string(state_conv[state]));
>  
>   return error;
>  }
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

Re: [PATCH] mm: vmscan: remove redundant querying to shrinker

2013-06-14 Thread Andrew Morton

On Sat, 15 Jun 2013 03:13:26 +0900 HeeSub Shin  wrote:

> Hello,
> 
> On Fri, Jun 14, 2013 at 8:10 PM, Minchan Kim  wrote:
> 
> >
> > Hello,
> >
> > On Fri, Jun 14, 2013 at 07:07:51PM +0900, Heesub Shin wrote:
> > > shrink_slab() queries each slab cache to get the number of
> > > elements in it. In most cases such queries are cheap but,
> > > on some caches. For example, Android low-memory-killer,
> > > which is operates as a slab shrinker, does relatively
> > > long calculation once invoked and it is quite expensive.
> >
> > LMK as shrinker is really bad, which everybody didn't want
> > when we reviewed it a few years ago so that's a one of reason
> > LMK couldn't be promoted to mainline yet. So your motivation is
> > already not atrractive. ;-)
> >
> > >
> > > This patch removes redundant queries to shrinker function
> > > in the loop of shrink batch.
> >
> > I didn't review the patch and others don't want it, I guess.
> > Because slab shrink is under construction and many patches were
> > already merged into mmtom. Please look at latest mmotm tree.
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/mhocko/mm.git
> 
> 
> >
> > If you concern is still in there and it's really big concern of MM
> > we should take care, NOT LMK, plese, resend it.
> >
> >
> I've noticed that there are huge changes there in the recent mmotm and you
> guys already settled the issue of my concern. I usually keep track changes
> in recent mm-tree, but this time I didn't. My bad :-)
> 

I'm not averse to merging an improvement like this even if it gets
rubbed out by forthcoming changes.  The big changes may never get
merged or may be reverted.  And by merging this patch, others are more
likely to grab it, backport it into earlier kernels and benefit from
it.

Also, the problem which this simple patch fixes might be present in a
different form after the large patchset has been merged.  That does not
appear to be the case this time.

So I'd actually like to merge Heesub's patch.  Problem is, I don't have
a way to redistribute it for testing - I'd need to effectively revert
the whole thing when integrating Glauber's stuff on top, so nobody who
is using linux-next would test Heesub's change.  Drat.

However I'm a bit sceptical about the description here.  The shrinker
is supposed to special-case the "nr_to_scan == 0" case and AFAICT
drivers/staging/android/lowmemorykiller.c:lowmem_shrink() does do this,
and it looks like the function will be pretty quick in this case.

In other words, the behaviour of lowmem_shrink(nr_to_scan == 0) does
not match Heesub's description.  What's up with that?

Also, there is an obvious optimisation which we could make to
lowmem_shrink().  All this stuff:

	if (lowmem_adj_size < array_size)
		array_size = lowmem_adj_size;
	if (lowmem_minfree_size < array_size)
		array_size = lowmem_minfree_size;
	for (i = 0; i < array_size; i++) {
		if (other_free < lowmem_minfree[i] &&
		    other_file < lowmem_minfree[i]) {
			min_score_adj = lowmem_adj[i];
			break;
		}
	}

does nothing useful in the nr_to_scan==0 case and should be omitted for
this special case.  But this problem was fixed in the large shrinker
rework in -mm.
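
For illustration only, the nr_to_scan == 0 special case discussed above can
be modelled in plain userspace C.  This is a minimal sketch, not the
lowmemorykiller code: shrink_model(), pick_min_score_adj(), threshold_scans,
NO_THRESHOLD and the table values are all hypothetical stand-ins, and victim
selection is left out entirely.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the lowmemorykiller tables (values illustrative). */
static const short lowmem_adj[] = { 0, 1, 6, 12 };
static const int lowmem_minfree[] = { 1536, 2048, 4096, 16384 };

/* Assumed sentinel meaning "no threshold matched". */
#define NO_THRESHOLD 1001

/* Counts how often the threshold scan runs; exists only for this sketch. */
static int threshold_scans;

/* The loop quoted above: pick the score threshold for the current pressure. */
static short pick_min_score_adj(int other_free, int other_file)
{
	size_t n = sizeof(lowmem_adj) / sizeof(lowmem_adj[0]);
	size_t i;

	threshold_scans++;
	for (i = 0; i < n; i++) {
		if (other_free < lowmem_minfree[i] &&
		    other_file < lowmem_minfree[i])
			return lowmem_adj[i];
	}
	return NO_THRESHOLD;
}

/*
 * Model of a shrinker callback.  A pure query (nr_to_scan == 0) reports the
 * object count and returns before the threshold scan is ever run.
 */
static int shrink_model(int nr_to_scan, int nr_objects,
			int other_free, int other_file)
{
	short min_score_adj;

	if (nr_to_scan == 0)
		return nr_objects;

	min_score_adj = pick_min_score_adj(other_free, other_file);
	(void)min_score_adj;	/* victim selection would use this value */
	return nr_objects;
}
```

With this ordering a query such as shrink_model(0, 42, 100, 100) never
enters pick_min_score_adj(), which is the saving the paragraph above
describes.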

Re: [BUGFIX 2/9] ACPIPHP: fix device destroying order issue when handling dock notification

2013-06-14 Thread Rafael J. Wysocki

On Friday, June 14, 2013 11:30:12 PM Jiang Liu wrote:
> On 06/14/2013 10:12 PM, Rafael J. Wysocki wrote:
> > On Friday, June 14, 2013 09:57:15 PM Jiang Liu wrote:
> >> On 06/14/2013 08:23 PM, Rafael J. Wysocki wrote:
> >>> On Thursday, June 13, 2013 09:59:44 PM Rafael J. Wysocki wrote:
> >>>> On Friday, June 14, 2013 12:32:25 AM Jiang Liu wrote:
> >>>>> Current ACPI glue logic expects that physical devices are destroyed
> >>>>> before destroying companion ACPI devices, otherwise it will break the
> >>>>> ACPI unbind logic and cause following warning messages:
> >>>>> [  185.026073] usb usb5: Oops, 'acpi_handle' corrupt
> >>>>> [  185.035150] pci :1b:00.0: Oops, 'acpi_handle' corrupt
> >>>>> [  185.035515] pci :18:02.0: Oops, 'acpi_handle' corrupt
> >>>>> [  180.013656]  port1: Oops, 'acpi_handle' corrupt
> >>>>> Please refer to https://bugzilla.kernel.org/attachment.cgi?id=104321
> >>>>> for full log message.
> >>>>
> >>>> So my question is, did we have this problem before commit 3b63aaa70e1?
> >>>>
> >>>> If we did, then when did it start?  Or was it present forever?
> >>>>
> >>>>> Above warning messages are caused by following scenario:
> >>>>> 1) acpi_dock_notifier_call() queues a task (T1) onto kacpi_hotplug_wq
> >>>>> 2) kacpi_hotplug_wq handles T1, which invokes acpi_dock_deferred_cb()
> >>>>>    ->dock_notify()-> handle_eject_request()->hotplug_dock_devices()
> >>>>> 3) hotplug_dock_devices() first invokes registered hotplug callbacks to
> >>>>>    destroy physical devices, then destroys all affected ACPI devices.
> >>>>>    Everything seems perfect until now. But the acpiphp dock notification
> >>>>>    handler will queue another task (T2) onto kacpi_hotplug_wq to really
> >>>>>    destroy affected physical devices.
> >>>>
> >>>> Would not the solution be to modify it so that it didn't spawn the other
> >>>> task (T2), but removed the affected physical devices synchronously?
> >>>>
> >>>>> 4) kacpi_hotplug_wq finishes T1, and all affected ACPI devices have
> >>>>>    been destroyed.
> >>>>> 5) kacpi_hotplug_wq handles T2, which destroys all affected physical
> >>>>>    devices.
> >>>>>
> >>>>> So it breaks ACPI glue logic's expectation because ACPI devices are destroyed
> >>>>> in step 3 and physical devices are destroyed in step 5.
> >>>>>
> >>>>> Signed-off-by: Jiang Liu
> >>>>> Reported-by: Alexander E. Patrakov
> >>>>> Cc: Bjorn Helgaas
> >>>>> Cc: Yinghai Lu
> >>>>> Cc: "Rafael J. Wysocki"
> >>>>> Cc: linux-...@vger.kernel.org
> >>>>> Cc: linux-kernel@vger.kernel.org
> >>>>> Cc: sta...@vger.kernel.org
> >>>>> ---
> >>>>> Hi Bjorn and Rafael,
> >>>>>  The recursive lock changes haven't been tested yet, need help
> >>>>> from Alexander for testing.
> >>>>
> >>>> Well, let's just say I'm not a fan of recursive locks.  Is that unavoidable
> >>>> here?
> >>>
> >>> What about the appended patch (on top of [1/9], untested)?
> >>>
> >>> Rafael
> >> It should have similar effect as patch 2/9, and it will encounter the
> >> same deadlock scenario as 2/9 too.
> > 
> > And why exactly?
> > 
> > I'm looking at acpiphp_disable_slot() and I'm not seeing where the
> > problematic lock is taken.  Similarly for power_off_slot().
> > 
> > It should take the ACPI scan lock, but that's a different matter.
> > 
> > Thanks,
> > Rafael
> The deadlock scenario is the same:

Well, you didn't answer my question.

> hotplug_dock_devices()
> mutex_lock(&ds->hp_lock)
> dd->ops->handler()
>   destroy pci bus

And this wasn't particularly helpful.

What about mentioning acpi_pci_remove_bus()?  I don't remember all changes
made recently, sorry.

>   unregister_hotplug_dock_device()
>   mutex_lock(&ds->hp_lock)

I see the problem.

ds->hp_lock is used to make both addition and removal of hotplug devices wait
for us to complete walking ds->hotplug_devices.  However, if we are in the
process of removing hotplug devices, we don't need removals to block on
ds->hp_lock (in fact, we don't even want them to block on it).  Conversely, if
we are in the process of adding hotplug devices, we don't want additions to
block on ds->hp_lock.
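
To make the self-deadlock above concrete, here is a minimal userspace model.
It is only a sketch under stated assumptions: struct toy_lock and the
*_model() functions are hypothetical names, and the toy lock reports failure
at the point where a real non-recursive mutex_lock() would block forever.

```c
#include <stdbool.h>

/* Toy non-recursive lock standing in for ds->hp_lock (a kernel mutex). */
struct toy_lock {
	bool held;
};

/* Returns false where a real mutex_lock() would block forever. */
static bool toy_trylock(struct toy_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void toy_unlock(struct toy_lock *l)
{
	l->held = false;
}

/* Stand-in for unregister_hotplug_dock_device(): it needs hp_lock itself. */
static bool unregister_model(struct toy_lock *hp_lock)
{
	if (!toy_trylock(hp_lock))
		return false;	/* would self-deadlock in the kernel */
	toy_unlock(hp_lock);
	return true;
}

/*
 * Stand-in for hotplug_dock_devices(): walks the list under hp_lock and
 * calls a handler that ends up unregistering a device.  Returns true only
 * if the nested unregister could take the lock.
 */
static bool hotplug_walk_model(struct toy_lock *hp_lock)
{
	bool nested_ok;

	if (!toy_trylock(hp_lock))
		return false;
	nested_ok = unregister_model(hp_lock);	/* dd->ops->handler() path */
	toy_unlock(hp_lock);
	return nested_ok;
}
```

Called on its own, unregister_model() succeeds; called from inside
hotplug_walk_model() it cannot take the lock, which mirrors the call chain
quoted above.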

So, why don't we do the following:

(1) Introduce a 'hotplug_status' field into struct dock_station with possible
values representing "removal", "addition" and "no action" and a wait queue
associated with it.  We can use ds->dd_lock to protect that field.

(2) hotplug_status will be modified by hotplug_dock_devices() depending on the
event.  For example, on eject it will set hotplug_status to "removal".
Then, after completing the walk, it will reset hotplug_status to
"no action" and wake up its wait queue.

(3) dock_del_hotplug_device() will check if hotplug_status is "removal" or
"no_action" and if so, it will just do the removal under ds->dd_lock.  If
it is "addition", though, it will go to sleep

Re: [PATCH] tty/vt: Return EBUSY if deallocating VT1 and it is busy

2013-06-14 Thread Peter Hurley


On 06/14/2013 06:24 PM, Ross Lagerwall wrote:

Commit 421b40a6286e ("tty/vt: Fix vc_deallocate() lock order") changed
the behavior when deallocating VT 1.  Previously if trying to
deallocate VT1 and it is busy, we would return EBUSY.  The commit
changed this to return 0 (success).

This commit restores the old behavior.

Signed-off-by: Ross Lagerwall 


Thanks.

Acked-by: Peter Hurley 


Re: [PATCH] PCI: Remove not needed check in disable aspm link

2013-06-14 Thread Yinghai Lu

On Fri, Jun 14, 2013 at 3:48 PM, Matthew Garrett
 wrote:
> On Fri, 2013-06-14 at 15:40 -0700, Yinghai Lu wrote:
>> On Fri, Jun 14, 2013 at 3:27 PM, Matthew Garrett
>>  wrote:
>> > On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote:
>> >
>> >> after those two patches, it aspm_disabled is set, via _osc early,
>> >> pre-1.1 devices aspm register will be touched even aspm_force is not specified.
>> >
>> > I don't follow. We were previously automatically disabling ASPM on
>> > pre-1.1 devices even if _OSC didn't give us control.
>>
>> I don't think so, we just moved _OSC support/control setting before pci scan
>> in 3.8 and revert that in v3.9.
>
> Right, sorry, I don't mean _OSC, I mean the FADT flag. We were
> previously automatically disabling ASPM on pre-1.1 devices even if the
> FADT flag was set.

So in that case aspm_disabled never gets set.
Boot path: pcie_aspm_init_link_state() will not touch ASPM on pre-1.1 devices.

Later, the FADT check causes pcie_clear_aspm() to be called, which calls
__pci_disable_link_state(). Because of the following lines in
pcie_aspm_link()
	/* Nothing to do if the link is already in the requested state */
	state &= (link->aspm_capable & ~link->aspm_disable);
	if (link->aspm_enabled == state)
		return;
ASPM on pre-1.1 devices still does not get touched.

Maybe I am missing something.

Thanks

Yinghai
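
The early return quoted in the message above can be exercised in isolation.
The following is a hedged userspace sketch: struct link_model and
config_would_touch_link() are hypothetical names, the struct keeps only the
three fields the check reads, and the bit values used with it are
illustrative rather than taken from the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the link state; only the fields the check reads. */
struct link_model {
	uint32_t aspm_capable;	/* states the link can support */
	uint32_t aspm_disable;	/* states forbidden for this link */
	uint32_t aspm_enabled;	/* states currently programmed */
};

/*
 * Returns true if the config registers would be written, false when the
 * "already in the requested state" early return is taken.
 */
static bool config_would_touch_link(const struct link_model *link,
				    uint32_t state)
{
	/* Nothing to do if the link is already in the requested state */
	state &= (link->aspm_capable & ~link->aspm_disable);
	if (link->aspm_enabled == state)
		return false;
	return true;
}
```

Assuming a pre-1.1 link has every state set in aspm_disable and nothing in
aspm_enabled (an assumption made for this sketch), any requested state masks
down to 0, matches aspm_enabled, and the function returns before writing
anything.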

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Tony Luck

On Fri, Jun 14, 2013 at 3:47 PM, Anton Vorontsov  wrote:
>
> Acked-by: Anton Vorontsov 
>
> (Or I can pick this via linux-pstore.git tree, I'll let Tony decide.)

Added that Acked-by: and applied to my tree.

Thanks

-Tony

Re: [PATCH] powerpc/sysfs: disable hotplug for the boot cpu

2013-06-14 Thread Benjamin Herrenschmidt

On Thu, 2013-06-13 at 19:25 +0800, Zhao Chenhui wrote:
> Some multicore SoCs firstly boot up the cpu0 after warm reset.
> In some suspend/resume cases, SoC will do a warm reset when resuming.
> In order to ensure that the suspending and resuming is running
> on a same cpu, cpu0 should be the last cpu to suspend. Here, cpu0 is
> the boot_cpuid.

Well, so:

  - In any case, your patch will break pseries, so it's not acceptable.

  - Why does it have to absolutely resume from the same CPU it suspended
from ? Can't you have a little bit of code on the resuming CPU that
checks if it's not online, poke an online one and goes back to sleep ?

Cheers,
Ben.



Re: [PATCH] PCI / ACPI / PM: Use correct power state strings in messages

2013-06-14 Thread Bjorn Helgaas

On Sat, Jun 15, 2013 at 12:28:12AM +0200, Rafael J. Wysocki wrote:
> On Friday, June 14, 2013 11:08:44 AM Bjorn Helgaas wrote:
> > On Thu, Jun 13, 2013 at 4:29 PM, Rafael J. Wysocki  wrote:
> > > From: Rafael J. Wysocki 
> > >
> > > Make acpi_pci_set_power_state() print the name of the ACPI device
> > > power state the device has been actually put into instead of printing
> > > the name of the requested PCI device power state, which need not be
> > > the same.
> > >
> > > Signed-off-by: Rafael J. Wysocki 
> > > ---
> > >
> > > For 3.11.
> > >
> > > Thanks,
> > > Rafael
> > >
> > > ---
> > >  drivers/pci/pci-acpi.c |2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: linux-pm/drivers/pci/pci-acpi.c
> > > ===
> > > --- linux-pm.orig/drivers/pci/pci-acpi.c
> > > +++ linux-pm/drivers/pci/pci-acpi.c
> > > @@ -211,7 +211,7 @@ static int acpi_pci_set_power_state(stru
> > >
> > > if (!error)
> > > > dev_info(&dev->dev, "power state changed by ACPI to %s\n",
> > > -pci_power_name(state));
> > > +acpi_power_state_string(state_conv[state]));
> > >
> > > return error;
> > >  }
> > >
> > 
> > Just to double-check this, it *looks* like the effective change is
> > that for PCI_D3hot and PCI_D3cold, we'll print "(unknown)" instead of
> > "D3hot" and "D3cold" because state_conv[] folds both PCI_D3hot and
> > PCI_D3cold into ACPI_STATE_D3, and acpi_power_state_string() doesn't
> > have a case for ACPI_STATE_D3.
> 
> No, it won't work like this, because ACPI_STATE_D3 == ACPI_STATE_D3_COLD. :-)
> 
> So, actually, "D3cold" will be printed for both PCI_D3hot and PCI_D3cold
> (and I have tested this).

Ah, right, I should have noticed that.

> Well, I think it's better to actually replace ACPI_STATE_D3 everywhere in
> pci-acpi.c with ACPI_STATE_D3_COLD to avoid that confusion.  Do you want me
> to prepare a patch?

If the following is OK, I'll just put it in my pci/misc branch:

commit fc6504b3a4dc9beae782a11e6f7c3c4a9f077fb8
Author: Rafael J. Wysocki 
Date:   Fri Jun 14 00:29:50 2013 +0200

PCI / ACPI / PM: Use correct power state strings in messages

Make acpi_pci_set_power_state() print the name of the ACPI device
power state the device has been actually put into instead of printing
the name of the requested PCI device power state, which need not be
the same.

[bhelgaas: use ACPI_STATE_D3_COLD (ACPI_STATE_D3 == ACPI_STATE_D3_COLD)]
Signed-off-by: Rafael J. Wysocki 
Signed-off-by: Bjorn Helgaas 

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 6c15d6a..dbdc5f7 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -186,8 +186,8 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 		[PCI_D0] = ACPI_STATE_D0,
 		[PCI_D1] = ACPI_STATE_D1,
 		[PCI_D2] = ACPI_STATE_D2,
-		[PCI_D3hot] = ACPI_STATE_D3,
-		[PCI_D3cold] = ACPI_STATE_D3
+		[PCI_D3hot] = ACPI_STATE_D3_COLD,
+		[PCI_D3cold] = ACPI_STATE_D3_COLD,
 	};
 	int error = -EINVAL;
 
@@ -211,7 +211,7 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
 
 	if (!error)
 		dev_info(&dev->dev, "power state changed by ACPI to %s\n",
-			 pci_power_name(state));
+			 acpi_power_state_string(state_conv[state]));
 
 	return error;
 }
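
As a rough userspace illustration of why "D3cold" ends up being printed for
both PCI_D3hot and PCI_D3cold, the mapping can be modelled as below.  This is
a sketch only: the enums are simplified stand-ins for the kernel constants,
and acpi_state_name() and reported_name() are hypothetical helpers, not
kernel functions.

```c
/* Simplified stand-ins; the kernel headers define the real constants. */
enum { PCI_D0, PCI_D1, PCI_D2, PCI_D3hot, PCI_D3cold, PCI_NSTATES };

enum {
	ACPI_STATE_D0,
	ACPI_STATE_D1,
	ACPI_STATE_D2,
	ACPI_STATE_D3_HOT,
	ACPI_STATE_D3,
	ACPI_STATE_D3_COLD = ACPI_STATE_D3,	/* the alias discussed above */
};

/* Hypothetical name lookup: ACPI_STATE_D3 shares the D3_COLD case. */
static const char *acpi_state_name(int state)
{
	switch (state) {
	case ACPI_STATE_D0:      return "D0";
	case ACPI_STATE_D1:      return "D1";
	case ACPI_STATE_D2:      return "D2";
	case ACPI_STATE_D3_HOT:  return "D3hot";
	case ACPI_STATE_D3_COLD: return "D3cold";
	default:                 return "(unknown)";
	}
}

/* Same folding as the patched table: both PCI D3 states map to D3_COLD. */
static const int state_conv[PCI_NSTATES] = {
	[PCI_D0] = ACPI_STATE_D0,
	[PCI_D1] = ACPI_STATE_D1,
	[PCI_D2] = ACPI_STATE_D2,
	[PCI_D3hot] = ACPI_STATE_D3_COLD,
	[PCI_D3cold] = ACPI_STATE_D3_COLD,
};

/* Name that the message would carry for a requested PCI state. */
static const char *reported_name(int pci_state)
{
	return acpi_state_name(state_conv[pci_state]);
}
```

Because ACPI_STATE_D3 is only another name for ACPI_STATE_D3_COLD in this
model, spelling the table entries either way gives the same string; using
ACPI_STATE_D3_COLD just makes that visible at the point of use.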

Re: [PATCH] PCI: Remove not needed check in disable aspm link

2013-06-14 Thread Matthew Garrett

On Fri, 2013-06-14 at 15:40 -0700, Yinghai Lu wrote:
> On Fri, Jun 14, 2013 at 3:27 PM, Matthew Garrett
>  wrote:
> > On Fri, 2013-06-14 at 15:17 -0700, Yinghai Lu wrote:
> >
> >> after those two patches, it aspm_disabled is set, via _osc early,
> >> pre-1.1 devices aspm register will be touched even aspm_force is not specified.
> >
> > I don't follow. We were previously automatically disabling ASPM on
> > pre-1.1 devices even if _OSC didn't give us control.
> 
> I don't think so, we just moved _OSC support/control setting before pci scan
> in 3.8 and revert that in v3.9.

Right, sorry, I don't mean _OSC, I mean the FADT flag. We were
previously automatically disabling ASPM on pre-1.1 devices even if the
FADT flag was set.

-- 
Matthew Garrett | mj...@srcf.ucam.org

[GIT PULL] at91: Device Tree update for 3.11 #2

2013-06-14 Thread Nicolas Ferre

Arnd, Olof,

Additional pull-request for AT91 DT patches.
It contains the remaining part of the USB gadget pull-request that I sent you
last week. After having split it, here is the DT part.
It also contains the update of DMA bindings: it is the AT91 part that should go
through arm-soc. I have included the patch (ARM: at91: dt: add header to define
at_hdmac configuration) so that we avoid build errors whichever git tree
(slave-dma or arm-soc) is merged first.
A SPI DT patch for at91sam9x5 is also added.

Thanks, best regards,

The following changes since commit 028633c238f91dc113520a7ad25d37b2ba9068af:

  ARM: at91/dt: add pinctrl definition for at91 tc blocks (2013-05-31 22:40:37 +0200)

are available in the git repository at:

  git://github.com/at91linux/linux-at91.git tags/at91-dt

for you to fetch changes up to 24ce10e142e7b063c4ae4437dd3b290fbfafe052:

  ARM: at91: sam9m10g45ek add udc DT support (2013-06-15 00:15:22 +0200)


Again some nice DT updates for AT91:
- DMA binding update with one patch shared with slave-dma tree
- more SPI DT activation
- enable the USB gadget HS for DT platforms


Jean-Christophe PLAGNIOL-VILLARD (4):
  ARM: at91: sam9x5 add udc DT support
  ARM: at91: sam9x5ek add udc DT support
  ARM: at91: sam9g45 add udc DT support
  ARM: at91: sam9m10g45ek add udc DT support

Ludovic Desroches (2):
  ARM: at91: dt: add header to define at_hdmac configuration
  ARM: at91: dt: switch DMA DT bindings to pre-processor

Richard Genoud (1):
  ARM: at91: dt: at91sam9x5: add SPI DMA client infos

 arch/arm/boot/dts/at91sam9g45.dtsi | 67 ++-
 arch/arm/boot/dts/at91sam9m10g45ek.dts |  5 ++
 arch/arm/boot/dts/at91sam9n12.dtsi | 11 +++--
 arch/arm/boot/dts/at91sam9x5.dtsi  | 85 ++
 arch/arm/boot/dts/at91sam9x5ek.dtsi|  5 ++
 arch/arm/boot/dts/sama5d3.dtsi | 19 
 arch/arm/mach-at91/at91sam9g45.c   |  2 +
 arch/arm/mach-at91/at91sam9x5.c|  2 +
 include/dt-bindings/dma/at91.h | 27 +++
 9 files changed, 199 insertions(+), 24 deletions(-)
 create mode 100644 include/dt-bindings/dma/at91.h

-- 
Nicolas Ferre

Re: [GIT PULL] pstore/ram for 3.11

2013-06-14 Thread Anton Vorontsov

On Fri, Jun 14, 2013 at 05:32:16PM -0500, Rob Herring wrote:
> https://lkml.org/lkml/2013/4/9/831
> 
> The main discussion was around the write-combining change which I
> dropped. You can pick patches 2 and 3 off the mail list if you prefer. I
> would assume they only require an ack from one of the 4 of you which
> pulling the tree implicitly does.

I've just re-reviewed the discussion. I see Colin was talking about ftrace
causing infinite loops due to recursion. I once stumbled onto this too
(though that issue was due to a missing 'notrace' mark), and Stephen said
that recursion protection is kind of mandatory:

  http://lists.linaro.org/pipermail/linaro-kernel/2012-August/002084.html

I believe that the protection is actually there nowadays, and that is why
Rob does not see the problem with his patch.

So, I guess the patches are good, at least they fix real issues for Rob.
:)

Acked-by: Anton Vorontsov 

(Or I can pick this via linux-pstore.git tree, I'll let Tony decide.)

Thanks!

Anton

Re: Performance regression from switching lock to rw-sem for anon-vma tree

2013-06-14 Thread Michel Lespinasse

On Fri, Jun 14, 2013 at 3:31 PM, Davidlohr Bueso  wrote:
> A few ideas that come to mind are avoiding taking the ->wait_lock and
> avoid dealing with waiters when doing the optimistic spinning (just like
> mutexes do).
>
> I agree that we should first deal with the optimistic spinning before
> adding the MCS complexity.

Maybe it would be worth disabling the MCS patch in mutex and comparing
that to the rwsem patches ? Just to make sure the rwsem performance
delta isn't related to that.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.

Re: Performance regression from switching lock to rw-sem for anon-vma tree

2013-06-14 Thread Tim Chen


> 
> Unfortunately this patch didn't make any difference, in fact it hurt
> several of the workloads even more. I also tried disabling preemption
> when spinning on owner to actually resemble spinlocks, which was my
> original plan, yet not much difference. 
> 

That's also similar to the performance I got.  There are things about
optimistic spinning that I missed that result in the better mutex
performance.  So I'm scratching my head.

> A few ideas that come to mind are avoiding taking the ->wait_lock and
> avoid dealing with waiters when doing the optimistic spinning (just like
> mutexes do).
> 

For my patch, we actually spin without the wait_lock.

> I agree that we should first deal with the optimistic spinning before
> adding the MCS complexity.
> 
> > Matthew and I have also discussed possibly introducing some 
> > limited spinning for writer when semaphore is held by read.  
> > His idea was to have readers as well as writers set ->owner.  
> > Writers, as now, unconditionally clear owner.  Readers clear 
> > owner if sem->owner == current.  Writers spin on ->owner if ->owner 
> > is non-NULL and still active. That gives us a reasonable chance 
> > to spin since we'll be spinning on
> > the most recent acquirer of the lock.
> 
> I also tried implementing this concept on top of your patch, didn't make
> much of a difference with or without it. 
> 

It also didn't make a difference for me.

Tim

