date:20121218

[RFC PATCH 0/5] clockevents: decouple broadcast mechanism from drivers

2012-12-18 Thread Mark Rutland

In some SMP systems, cpu-local timers may stop delivering interrupts
when in low power states, or not all CPUs may have local timers. To
support these systems we have a mechanism for broadcasting timer ticks
to other CPUs. This mechanism relies on the struct
clock_event_device::broadcast function pointer, which is a
driver-specific mechanism for broadcasting ticks to other CPUs.

As the broadcast mechanism is architecture-specific, placing the
broadcast function on struct clock_event_device ties each driver to a
single architecture. Additionally the driver or architecture backend
must handle the routing of broadcast ticks to the correct
clock_event_device, leading to duplication of the list of active
clock_event_devices.

These patches introduce a generic mechanism for handling the receipt of
timer broadcasts, and an optional architecture-specific broadcast
function which allows drivers to be decoupled from a particular
architecture while retaining support for timer tick broadcasts. These
mechanisms are wired up for the arm port, and have been boot-tested on a
pandaboard.

Thanks,
Mark.

Mark Rutland (5):
  ARM: remove useless guard in smp.c
  clockevents: Add generic timer broadcast receiver
  ARM: Use generic timer broadcast receive
  clockevents: Add generic timer broadcast function
  ARM: Add generic timer broadcast support

 arch/arm/Kconfig |1 +
 arch/arm/kernel/smp.c|   15 ++-
 include/linux/clockchips.h   |9 +
 kernel/time/Kconfig  |4 
 kernel/time/tick-broadcast.c |   30 ++
 5 files changed, 46 insertions(+), 13 deletions(-)
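
Editor's note: the split described in the cover letter can be pictured with a small userspace model. This is only a sketch under stated assumptions, not code from the series: the reduced struct clock_event_device, the cpu_dev[] table, and the names tick_receive_broadcast() and arch_send_tick_broadcast() are illustrative, and the IPI is modelled as a direct function call.

```c
#include <stddef.h>

#define NR_CPUS 4

/* Reduced stand-in for the kernel structure: only the handler matters here. */
struct clock_event_device {
	void (*event_handler)(struct clock_event_device *dev);
	unsigned int ticks;	/* number of ticks delivered to this device */
};

/* One registered device per CPU; the core owns this table, not the driver. */
static struct clock_event_device *cpu_dev[NR_CPUS];

static void count_tick(struct clock_event_device *dev)
{
	dev->ticks++;
}

/* Generic receive path: route a broadcast tick to the CPU's own device. */
static int tick_receive_broadcast(unsigned int cpu)
{
	struct clock_event_device *dev;

	if (cpu >= NR_CPUS)
		return -1;
	dev = cpu_dev[cpu];
	if (!dev || !dev->event_handler)
		return -1;
	dev->event_handler(dev);
	return 0;
}

/*
 * Architecture hook (hypothetical name): a real port would raise an IPI
 * for every CPU in the mask; this model calls the receive path directly.
 */
static unsigned int arch_send_tick_broadcast(unsigned long mask)
{
	unsigned int cpu, delivered = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if ((mask & (1UL << cpu)) && tick_receive_broadcast(cpu) == 0)
			delivered++;
	return delivered;
}
```

In this model the driver only supplies event_handler; routing and the send hook live outside it, which is the decoupling the cover letter argues for.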


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [IPv6] crashed when __ip6_del_rt()

2012-12-18 Thread YOSHIFUJI Hideaki

stanley zhou wrote:

> When write_lock_bh() is called, table is NULL, which causes a crash in __ip6_del_rt().
> kernel version is 2.6.30.10
:
> static int __ip6_del_rt(struct rt6_info *rt, struct nl_info *info)
> {
> int err;
> struct fib6_table *table;
> struct net *net = dev_net(rt->rt6i_dev);
> 
> if (rt == net->ipv6.ip6_null_entry) {
> +++err = -ENOENT;
> +++goto out;
> --- return -ENOENT;
> }
> 
> table = rt->rt6i_table;
> write_lock_bh(&table->tb6_lock);
> err = fib6_del(rt, info);
> write_unlock_bh(&table->tb6_lock);
> +++out:
> dst_release(&rt->u.dst);
> return err;
> } 
>  

I think this is what commit 6825a26c ("ipv6: release reference of
ip6_null_entry's dst entry in __ip6_del_rt") by Gao feng does, which is
already in v3.7.

Are you suggesting that we should have this in -stable tree as well?

--yoshfuji
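
Editor's note: the quoted change sends the early exit through a common label so that the reference is dropped on every path. The userspace sketch below shows that single-exit pattern only; struct route, route_put() and route_del() are hypothetical stand-ins, not the kernel's rt6_info API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the route entry and its reference count. */
struct route {
	int refcnt;
	int is_null_entry;	/* models rt == net->ipv6.ip6_null_entry */
	int linked;		/* models membership in a FIB table */
};

static void route_put(struct route *rt)
{
	rt->refcnt--;
}

/* The caller hands in one reference; every path out must drop it. */
static int route_del(struct route *rt)
{
	int err;

	if (rt->is_null_entry) {
		err = -ENOENT;
		goto out;	/* an early "return" here would leak the reference */
	}

	/* the table lock would be taken around this step */
	err = rt->linked ? 0 : -ENOENT;
	rt->linked = 0;
out:
	route_put(rt);
	return err;
}
```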



Re: [PATCH] at24: make module parameters changeable via sysfs

2012-12-18 Thread Wolfram Sang


> I reviewed this patch 3 months ago and did not hear back. Are you going
> to update this patch and resubmit, or should I just drop it?

Uwe is on holiday. I'll take care of it if the need is still there...

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |



Re: [RFC PATCH v2 3/6] sched: pack small tasks

2012-12-18 Thread Alex Shi

On Tue, Dec 18, 2012 at 5:53 PM, Vincent Guittot wrote:
> On 17 December 2012 16:24, Alex Shi wrote:
>>>> The scheme below tries to summarize the idea:
>>>>
>>>> Socket      | socket 0 | socket 1   | socket 2   | socket 3   |
>>>> LCPU        | 0 | 1-15 | 16 | 17-31 | 32 | 33-47 | 48 | 49-63 |
>>>> buddy conf0 | 0 | 0    | 1  | 16    | 2  | 32    | 3  | 48    |
>>>> buddy conf1 | 0 | 0    | 0  | 16    | 16 | 32    | 32 | 48    |
>>>> buddy conf2 | 0 | 0    | 16 | 16    | 32 | 32    | 48 | 48    |
>>>>
>>>> But, I don't know how this can interact with NUMA load balance and the
>>>> better might be to use conf3.
>>>
>>> I mean conf2 not conf3

>
> Cyclictest is the ultimate small tasks use case which points out all
> weaknesses of a scheduler for such kind of tasks.
> Music playback is a more realistic one and it also shows improvement
>
>> granularity or one tick, thus we really don't need to consider task
>> migration cost. But when the task are not too small, migration is more
>
> For which kind of machine are you stating that hypothesis ?

It seems our biggest disagreement is that you don't want to admit that
'not too small' tasks exist and that they will cause more migrations
because of your patch.

>> even so they should run in the same socket for power saving
>> consideration(my power scheduling patch can do this), instead of spread
>> to all sockets.
>
> This is may be good for your scenario and your machine :-)
> Packing small tasks is the best choice for any scenario and machine.

That's clearly wrong. As I have explained many times, your single buddy
CPU cannot pack all tasks on a big machine, even one with just 16 LCPUs,
although it is supposed to.

Anyway, you have the right to insist on your design, and I don't think I
can state the scalability issue any more clearly. I won't judge the patch
again.

Re: [PATCH] at24: make module parameters changeable via sysfs

2012-12-18 Thread Jean Delvare

Uwe,

On Fri, 14 Sep 2012 10:25:36 +0200, Jean Delvare wrote:
> On Wed, 12 Sep 2012 11:43:32 +0200, Uwe Kleine-König wrote:
> > The respective values are evaluated at each read/write, so no further
> > action is required than to change the perm argument to module_param.
> > 
> > Note there is no sanity check so root can make the driver effectively
> > unusable but that's what root is for :-)
> >
> > Signed-off-by: Uwe Kleine-König 
> > ---
> >  drivers/misc/eeprom/at24.c |4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/misc/eeprom/at24.c b/drivers/misc/eeprom/at24.c
> > index ab1ad41..8a5a192 100644
> > --- a/drivers/misc/eeprom/at24.c
> > +++ b/drivers/misc/eeprom/at24.c
> > @@ -85,7 +85,7 @@ struct at24_data {
> >   * This value is forced to be a power of two so that writes align on pages.
> >   */
> >  static unsigned io_limit = 128;
> > -module_param(io_limit, uint, 0);
> > +module_param(io_limit, uint, S_IRUGO | S_IWUSR);
> 
> This won't work. Not only there is no validation of the value, while
> there is such a validation (and value adjustment!) in at24_init(); you
> seem to not care, but I do. But the more important problem is that
> changing io_limit at run-time will only affect reads, not writes. The
> size limit from writes is computed at device probing time:
> 
> static int at24_probe(struct i2c_client *client, const struct i2c_device_id *id)
> {
> (...)
>   if (writable) {
>   (...)
>   if (write_max > io_limit)
>   write_max = io_limit;
> 
> So changing the value through sysfs will have no effect. If you want it
> to have an effect, you have to move the check from at24_probe() to
> at24_eeprom_write().
> 
> Back to the validation issue, I think it would be worth looking into
> module_param_cb(). Using it, it may not be that difficult to get
> validation when the value is changed through sysfs. Otherwise I'll ask
> you to check what exactly happens if someone sets io_limit to 0. We
> can't afford infinite loops or EEPROM corruption on root mistyping.
> 
> >  MODULE_PARM_DESC(io_limit, "Maximum bytes per I/O (default 128)");
> >  
> >  /*
> > @@ -93,7 +93,7 @@ MODULE_PARM_DESC(io_limit, "Maximum bytes per I/O (default 128)");
> >   * it's important to recover from write timeouts.
> >   */
> >  static unsigned write_timeout = 25;
> > -module_param(write_timeout, uint, 0);
> > +module_param(write_timeout, uint, S_IRUGO | S_IWUSR);
> 
> This one is OK.
> 
> >  MODULE_PARM_DESC(write_timeout, "Time (in ms) to try writes (default 25)");
> >  
> >  #define AT24_SIZE_BYTELEN 5

I reviewed this patch 3 months ago and did not hear back. Are you going
to update this patch and resubmit, or should I just drop it?

-- 
Jean Delvare
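
Editor's note: the two review points (validate the value when it is set, and apply the limit at write time rather than at probe time) can be sketched in userspace as follows. This is an illustration, not the at24 driver: io_limit_set() only plays the role a module_param_cb() "set" hook would have, write_chunk() is a hypothetical helper, and the policy (reject 0, round down to a power of two) is assumed from the driver comment and the review.

```c
#include <errno.h>
#include <stddef.h>

static unsigned int io_limit = 128;

/* Largest power of two <= v; v must be non-zero. */
static unsigned int rounddown_pow2(unsigned int v)
{
	unsigned int p = 1;

	while (p <= v / 2)
		p *= 2;
	return p;
}

/* Setter in the spirit of a module_param_cb() "set" hook (assumed policy). */
static int io_limit_set(unsigned int val)
{
	if (val == 0)
		return -EINVAL;	/* a zero limit could never make progress */
	io_limit = rounddown_pow2(val);
	return 0;
}

/* Clamp at write time so a run-time change of io_limit takes effect. */
static size_t write_chunk(size_t count, size_t page_size)
{
	size_t max = page_size;

	if (max > io_limit)
		max = io_limit;
	return count < max ? count : max;
}
```

Because write_chunk() reads io_limit on every call, a value changed through sysfs would affect the next write, which is the behaviour the review asks for.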

Re: [PATCH resend 0/2] I2C: sis630: add sis964 support

2012-12-18 Thread Jean Delvare

Hi Amaury,

On Wed, 29 Aug 2012 03:35:13 +0200, Amaury Decrême wrote:
> This series of patches brings SIS964 support to i2c-sis630.
> 
> The SiS datasheets have been used.
> 
> The SIS964 isn't part of the SIS96X family and behaves differently.
> For i2c, this table shows the differences between sis630 and sis964.
>   +------------------------+--------------------+-------------------+
>   |                        | SIS630/730         | SIS964            |
>   +------------------------+--------------------+-------------------+
>   | Clock                  | 14kHz/56kHz        | 55.56kHz/27.78kHz |
>   | SMBus registers offset | 0x80               | 0xE0              |
>   | SMB_CNT                | Bit 1 = Slave Busy | Bit 1 = Bus probe |
>   | SMB_COUNT              | 4:0 bits           | 5:0 bits          |
>   +------------------------+--------------------+-------------------+
> 
> The other differences don't affect the functions provided by the original
> i2c-sis630 driver.
> 
> The first patch is mandatory as it adds support for the SIS964 bus.
> The second patch is optional. It depends on the first patch.
> 
> Amaury Decrême (2):
>   I2C: sis630: sis964 bus
>   I2C: sis630: Cleaning and cosmetics
> 
>  Documentation/i2c/busses/i2c-sis630 |   17 +-
>  drivers/i2c/busses/Kconfig  |4 +-
>  drivers/i2c/busses/i2c-sis630.c |  445 
> +--
>  3 files changed, 278 insertions(+), 188 deletions(-)

I reviewed these two patches 2 months ago, but did not hear back from
you since then. Do you plan to resubmit these patches with the
improvements I suggested? I would hate to see your work time and mine
wasted.

Thanks,
-- 
Jean Delvare

Re: [PATCH 13/13] drivers/media/tuners/e4000.c: use macros for i2c_msg initialization

2012-12-18 Thread Jean Delvare

Hi Julia,

On Thu, 11 Oct 2012 08:45:43 +0200 (CEST), Julia Lawall wrote:
> I found 6 cases where there are more than 2 messages in the array.  I
> didn't check how many cases where there are two messages but there is
> something other than one read and one write.
> 
> Perhaps a reasonable option would be to use
> 
> I2C_MSG_READ
> I2C_MSG_WRITE
> I2C_MSG_READ_OP
> I2C_MSG_WRITE_OP
> 
> The last two are for the few cases where more flags are specified.  As
> compared to the original proposal of I2C_MSG_OP, these keep the READ or
> WRITE idea in the macro name.  The additional argument to the OP macros
> would be or'd with the read or write (nothing to do in this case) flags as
> appropriate.
> 
> Mauro proposed INIT_I2C_READ_SUBADDR for the very common case where a
> message array has one read and one write.  I think that putting one
> I2C_MSG_READ and one I2C_MSG_WRITE in this case is readable enough, and
> avoids the need to do something special for the cases that don't match the
> expectations of INIT_I2C_READ_SUBADDR.
> 
> I propose not to do anything for the moment either for sizes or for
> message or buffer arrays that contain only one element.

Please note that I resigned from my position of i2c subsystem
maintainer, so I will not handle this. If you think this is important,
you'll have to resubmit and Wolfram will decide what he wants to do
about it.

-- 
Jean Delvare
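
Editor's note: the four macro names come from the quoted proposal, but their argument lists were not spelled out in this message. The sketch below assumes (address, buffer, length) plus an extra-flags argument for the _OP variants; struct i2c_msg is a reduced copy of the kernel layout, and the address and register values are made up.

```c
#include <stdint.h>

#define I2C_M_RD 0x0001	/* same value as the kernel flag */

/* Reduced copy of the kernel's struct i2c_msg layout. */
struct i2c_msg {
	uint16_t addr;
	uint16_t flags;
	uint16_t len;
	uint8_t *buf;
};

/* Hypothetical argument lists; the thread only settles the macro names. */
#define I2C_MSG_OP(_addr, _buf, _len, _flags) \
	{ .addr = (_addr), .flags = (_flags), .len = (_len), .buf = (_buf) }
#define I2C_MSG_WRITE(_addr, _buf, _len) \
	I2C_MSG_OP(_addr, _buf, _len, 0)
#define I2C_MSG_READ(_addr, _buf, _len) \
	I2C_MSG_OP(_addr, _buf, _len, I2C_M_RD)
#define I2C_MSG_WRITE_OP(_addr, _buf, _len, _extra) \
	I2C_MSG_OP(_addr, _buf, _len, (_extra))
#define I2C_MSG_READ_OP(_addr, _buf, _len, _extra) \
	I2C_MSG_OP(_addr, _buf, _len, I2C_M_RD | (_extra))

static uint8_t reg = 0x02, val;

/* The common "write register address, then read value" pair. */
static struct i2c_msg read_reg[2] = {
	I2C_MSG_WRITE(0x64, &reg, 1),
	I2C_MSG_READ(0x64, &val, 1),
};
```

Spelling out one write and one read keeps the direction visible at the call site, which is the readability argument made in the quoted text against a dedicated two-message macro.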

Re: [boot crash] Re: [GIT PULL[ block drivers bits for 3.8

2012-12-18 Thread Jens Axboe

On 2012-12-18 10:25, Ingo Molnar wrote:
> 
> * Jens Axboe  wrote:
> 
>> Hi Linus,
>>
>> Now that the core bits are in, here are the driver bits for 3.8. The
>> branch contains:
> 
> FYI, I'm getting a divide-by-zero boot crash (serial log capture 
> below) with the attached config.
> 
> Reproduced with 848b81415c42.
> 
> The bug might have gone upstream between 8874e81 (Linus's tree 
> from yesterday) and 848b81415c42 (Linus's tree from today). Or 
> it's from earlier and I only triggered it today.
> 
> ( Note that every log line is duplicated, haven't tracked that
>   down yet, earlyprintk=,keep might be busted. )

Bah. Does the below fix it up for you?

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index acb4f7b..067f195 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1188,12 +1188,13 @@ static inline int queue_discard_alignment(struct request_queue *q)
 
 static inline int queue_limit_discard_alignment(struct queue_limits *lim, sector_t sector)
 {
-	sector_t alignment = sector << 9;
-	alignment = sector_div(alignment, lim->discard_granularity);
+	sector_t alignment;
 
-	if (!lim->max_discard_sectors)
+	if (!lim->max_discard_sectors || !lim->discard_granularity)
 		return 0;
 
+	alignment = sector << 9;
+	alignment = sector_div(alignment, lim->discard_granularity);
 	alignment = lim->discard_granularity + lim->discard_alignment - alignment;
 	return sector_div(alignment, lim->discard_granularity);
 }

-- 
Jens Axboe
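
Editor's note: a userspace model of the patched helper shows why the check has to come before the first division. The struct is reduced to the three fields the helper reads, sector_div_model() is a stand-in for the kernel's sector_div() (divide in place, return the remainder), and discard_alignment_at() is an illustrative name.

```c
#include <stdint.h>

typedef uint64_t sector_t;

/* Only the fields the helper reads; names follow struct queue_limits. */
struct queue_limits {
	unsigned int max_discard_sectors;
	unsigned int discard_granularity;	/* bytes */
	unsigned int discard_alignment;		/* bytes */
};

/* Userspace stand-in for sector_div(): divide in place, return remainder. */
static uint32_t sector_div_model(sector_t *n, uint32_t base)
{
	uint32_t rem = (uint32_t)(*n % base);

	*n /= base;
	return rem;
}

static int discard_alignment_at(const struct queue_limits *lim, sector_t sector)
{
	sector_t alignment;

	/* Bail out before any division when discard is unsupported. */
	if (!lim->max_discard_sectors || !lim->discard_granularity)
		return 0;

	alignment = sector << 9;	/* sectors to bytes */
	alignment = sector_div_model(&alignment, lim->discard_granularity);
	alignment = lim->discard_granularity + lim->discard_alignment - alignment;
	return (int)sector_div_model(&alignment, lim->discard_granularity);
}
```

With a granularity of 0 the function returns before dividing, which is the case the boot crash report points at.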


Re: [PATCH V2] serial: tegra: add serial driver

2012-12-18 Thread Alan Cox

On Tue, 18 Dec 2012 12:29:53 +0530
Laxman Dewangan  wrote:

> Nvidia's Tegra has multiple uart controller which supports:
> - APB dma based controller fifo read/write.
> - End Of Data interrupt in incoming data to know whether end
>   of frame achieve or not.
> - Hw controlled RTS and CTS flow control to reduce SW overhead.
> 
> Add serial driver to use all above feature.
> 
> Signed-off-by: Laxman Dewangan 

Acked-by: Alan Cox 

Re: [PATCH 2/8] Thermal: Create zone level APIs

2012-12-18 Thread Joe Perches

On Tue, 2012-12-18 at 14:59 +0530, Durgadoss R wrote:
> This patch adds a new thermal_zone structure to
> thermal.h. Also, adds zone level APIs to the thermal
> framework.

[]

> diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c

> +#define GET_INDEX(tz, ptr, indx, type)   \
> + do {\
> + int i;  \
> + indx = -EINVAL; \
> + if (!tz || !ptr)\
> + break;  \
> + mutex_lock(&type##_list_lock);  \
> + for (i = 0; i < tz->type##_indx; i++) { \
> + if (tz->type##s[i] == ptr) {\
> + indx = i;   \
> + break;  \
> + }   \
> + }   \
> + mutex_unlock(&type##_list_lock);\
> + } while (0)

A statement expression macro returning int would be
closer to kernel style and nicer to use.

(sorry about the whitespace, evolution 3.6 is crappy)

#define GET_INDEX(tz, ptr, type)\
({  \
int rtn = -EINVAL;  \
do {\
int i;  \
if (!tz || !ptr)\
break;  \
mutex_lock(&type##_list_lock);  \
for (i = 0; i < tz->type##_indx; i++) { \
if (tz->type##s[i] == ptr) {\
rtn = i;\
break;  \
}   \
}   \
mutex_unlock(&type##_list_lock);\
} while (0);\
rtn;\
})


> +static void remove_sensor_from_zone(struct thermal_zone *tz,
> + struct thermal_sensor *ts)
> +{
> + int j, indx;
> +
> + GET_INDEX(tz, ts, indx, sensor);

This becomes

indx = GET_INDEX(tz, ts, sensor);

> + if (indx < 0)
> + return;
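
Editor's note: a compilable userspace version of the statement-expression form is sketched below. It relies on the GNU C ({ ... }) extension (GCC and Clang); struct zone, struct sensor, and the integer sensor_list_lock standing in for the mutex are hypothetical simplifications.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_SENSORS 4

struct sensor { int id; };

struct zone {
	struct sensor *sensors[MAX_SENSORS];
	int sensor_indx;	/* number of used slots */
};

static int sensor_list_lock;	/* stand-in for a mutex: 1 while held */

/* GNU C statement expression: the last expression is the macro's value. */
#define GET_INDEX(tz, ptr, type)				\
({								\
	int rtn = -EINVAL;					\
	if ((tz) && (ptr)) {					\
		int i;						\
		type##_list_lock = 1;				\
		for (i = 0; i < (tz)->type##_indx; i++) {	\
			if ((tz)->type##s[i] == (ptr)) {	\
				rtn = i;			\
				break;				\
			}					\
		}						\
		type##_list_lock = 0;				\
	}							\
	rtn;							\
})
```

Because the macro yields a value, the caller can write indx = GET_INDEX(tz, ts, sensor); instead of passing the result variable in as an argument.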



Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Laurent Navet

>> > Are you also able to build-test the changes?
Yes,

> Me too :) I just wanted to express that I would love to see a compile
> test before submission, even for checkpatch thingies. Can save some
> hassle for all of us.
>
I agree,

Thanks for your comments, I'll send a new patch in a few days.

Re: [PATCH 1/8] Thermal: Create sensor level APIs

2012-12-18 Thread Joe Perches

On Tue, 2012-12-18 at 14:59 +0530, Durgadoss R wrote:
> This patch creates sensor level APIs, in the
> generic thermal framework.

Just some trivial notes.

> diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
[]
> +static ssize_t
> +sensor_temp_show(struct device *dev, struct device_attribute *attr, char *buf)
> +{
> + int ret;
> + long val;
> + struct thermal_sensor *ts = to_thermal_sensor(dev);
> +
> + ret = ts->ops->get_temp(ts, &val);
> +
> + return ret ? ret : sprintf(buf, "%ld\n", val);

I'd much prefer the form

ret = ts->ops...
if (ret)
return ret;

return sprintf(buf, "%ld\n", val);

Otherwise, maybe use gcc's pretty common ?: extension
return ret ?: sprintf(...) 

[]

> +static int enable_sensor_thresholds(struct thermal_sensor *ts, int count)
> +{
> + int i;
> + int size = sizeof(struct thermal_attr) * count;
> +
> + ts->thresh_attrs = kzalloc(size, GFP_KERNEL);

kcalloc

> + if (!ts->thresh_attrs)
> + return -ENOMEM;
> +
> + if (ts->ops->get_hyst) {
> + ts->hyst_attrs = kzalloc(size, GFP_KERNEL);

kcalloc here too
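
Editor's note: both suggestions can be tried out in userspace. The sketch assumes GCC or Clang for the "?:" extension; show_result() and alloc_array() are hypothetical helpers, and alloc_array() only imitates the overflow check that makes kcalloc() preferable to kzalloc(n * size).

```c
#include <stdint.h>
#include <stdlib.h>

/* Return the error if there is one, otherwise the length written. */
static int show_result(int ret, int len)
{
	return ret ?: len;	/* GNU extension: same as "ret ? ret : len" */
}

/* Overflow-checked, zeroed array allocation (the property kcalloc() adds). */
static void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */
	return calloc(n, size);
}
```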




Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Jean Delvare

On Tue, 18 Dec 2012 18:38:20 +0800, wenhao zhang wrote:
> *A Stupid Question*

This is not a stupid question, but a completely OFF-TOPIC question.
Don't do that again, please. If you have a question to ask, start a new
discussion thread on the appropriate list.

Thanks,
-- 
Jean Delvare

Re: [PATCH] [RFC] drm/radeon: return 0 on successful gpu reset

2012-12-18 Thread Christian König


On 17.12.2012 22:31, Paul Bolle wrote:

> On an (outdated) laptop the radeon driver (almost always) prints, during
> the first resume of each session:
>  [drm] crtc 1 is connected to a TV
>
> This message is a bit puzzling as, as far as I know, no TV has ever
> been connected to this laptop. Anyhow, before v3.5, if that happened the
> radeon driver then printed an error during all following resumes:
>  [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -35!
>
> (-35 is -EDEADLK.) But the resume would succeed and the driver seemed to
> run without too much trouble. From v3.5 onwards things changed. If the
> (puzzling) message about crtc 1 was printed on first resume the laptop
> would simply hang on second resume. Only a manual power off would then
> be possible. In that case nothing of interest would be found in the
> (truncated) logs.  And, most annoyingly, the hang would never happen if
> the laptop was booted with, say, "console=ttyS0,115200n8" added to the
> kernel command line.
>
> I bisected the hang to commit 6c6f478370eccfbfafbdc6fc55c0def03e58f124
> ("drm/radeon: rework recursive gpu reset handling"), which was added in
> the v3.5 release cycle. After discovering that and poking at the driver
> it turned out that this hang is triggered by radeon_cs_handle_lockup()
> returning -EAGAIN after successfully resetting the gpu. Simply returning
> 0 makes the hang disappear (and makes the drm error reappear).
>
> Nothing in the code or the commit explanation clarifies why -EAGAIN
> should be returned on successful gpu reset. So I suggest
> radeon_cs_handle_lockup() simply returns what radeon_gpu_reset()
> returns, eg 0 (on success) or a negative error code (on failure).
>
> Signed-off-by: Paul Bolle
> ---
> 0) This exact patch is untested (but I run something comparable).
>
> 1) Sent as an RFC because I do not understand why this laptop (almost
> always) prints the "crtc 1" message on first resume. Note that another
> workaround for this hang is simply booting with "radeon.tv=0".

Alex should probably take a look into this, since he probably is the one 
with the deepest knowledge of the display engine. My best guess is that 
it is just some error while probing for an attached TV and actually 
isn't so bad after all.



> 2) Also sent as an RFC because I have no idea whatsoever why returning
> -EAGAIN will hang the machine. I guess it's returned to userland by
> radeon_cs_ioctl(). What code uses that ioctl? And what does that code do
> on -EAGAIN that hangs this laptop?


EAGAIN just tells userspace to reissue the requested system call. When a
system call is interrupted (either by a signal or, in our case, a GPU
lockup) it aborts and EAGAIN is returned to userspace, telling userspace
that it should try again. So when you just return 0, userspace thinks that
our system call was executed and doesn't try it again.


So you just prevented the normal reissuing of the system call and so 
also prevented whatever this command submission should be doing in the 
first place.



> 3) A third reason to send this as an RFC is that I also have no idea why
> this hang doesn't happen when booting with "console=ttyS0,115200n8" or
> even "console=tty0"! But I guess I'm now allowed to call this hang a
> Heisenbug.

"Heisenbug" ? LOL, I need to remember that.

But anyway, it is not so unusual to see a bug like this, because it is
possible (but highly unlikely) that actually trying to print an error
message can cause a lockup.


You should definitely try Alex's latest drm-fixes-3.8 branch
(git://people.freedesktop.org/~agd5f/linux) since the possibility is
quite high that we have already fixed that bug. If that doesn't help
then please open a bug report and leave me a note so that I can
investigate further.


Christian.




>   drivers/gpu/drm/radeon/radeon_cs.c |5 +
>   1 files changed, 1 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
> index 41672cc..a302c00 100644
> --- a/drivers/gpu/drm/radeon/radeon_cs.c
> +++ b/drivers/gpu/drm/radeon/radeon_cs.c
> @@ -486,11 +486,8 @@ out:
>  
>  static int radeon_cs_handle_lockup(struct radeon_device *rdev, int r)
>  {
> -	if (r == -EDEADLK) {
> +	if (r == -EDEADLK)
>  		r = radeon_gpu_reset(rdev);
> -		if (!r)
> -			r = -EAGAIN;
> -	}
>  	return r;
>  }
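
Editor's note: the retry contract described in the reply can be modelled in a few lines of userspace C. This is a sketch, not the radeon or libdrm code: struct fake_gpu, submit_once() and submit_with_retry() are hypothetical, and the bounded retry count is an assumption made to keep the example finite.

```c
#include <errno.h>

/* Hypothetical device state: resets still pending, submissions accepted. */
struct fake_gpu {
	int resets_pending;
	int submitted;
};

/* Models the ioctl: after a reset the work was NOT submitted. */
static int submit_once(struct fake_gpu *gpu)
{
	if (gpu->resets_pending > 0) {
		gpu->resets_pending--;	/* models a successful GPU reset */
		return -EAGAIN;		/* tell the caller to issue it again */
	}
	gpu->submitted++;
	return 0;
}

/* Caller-side loop: -EAGAIN means "nothing happened, try again". */
static int submit_with_retry(struct fake_gpu *gpu, int max_tries)
{
	int ret = -EAGAIN;

	while (max_tries-- > 0) {
		ret = submit_once(gpu);
		if (ret != -EAGAIN)
			break;
	}
	return ret;
}
```

If submit_once() returned 0 after a reset instead, the caller would stop without the command stream ever having been queued, which is the objection raised against the patch.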
  



Re: [sched/rt] Optimization of function pull_rt_task()

2012-12-18 Thread Kirill Tkhai



16.11.2012, 00:36, "Steven Rostedt" :
> Doing my INBOX maintenance (clean up), I've stumbled on this thread
> again. I'm not sure the changes here are hopeless.
>
> On Mon, 2012-06-04 at 13:27 +0800, Yong Zhang wrote:
>
>> On Fri, Jun 01, 2012 at 08:45:16PM +0400, Kirill Tkhai wrote:
>>> 19.04.2012, 12:54, "Yong Zhang":
>>>> On Wed, Apr 18, 2012 at 05:16:55PM -0400, Steven Rostedt wrote:
>>>>> On Wed, 2012-04-18 at 14:32 -0400, Steven Rostedt wrote:
>>>>>> On Mon, 2012-04-16 at 12:06 -0400, Steven Rostedt wrote:
>>>>>>> On Sun, 2012-04-15 at 23:45 +0400, Kirill Tkhai wrote:
>>>>>>>> The condition (src_rq->rt.rt_nr_running) is weak because it doesn't
>>>>>>>> consider the cases when src_rq has only processes bound to it (when
>>>>>>>> single cpu is allowed). It may be running kernel thread like
>>>>>>>> migration/x etc.
>>>>>>>>
>>>>>>>> So it's better to use a stronger condition which is able to exclude
>>>>>>>> above conditions. The function has_pushable_tasks() completely does
>>>>>>>> this. A task may be pullable for another cpu rq only if he is pushable
>>>>>>>> for his own queue.
>>>>>>> I considered this before, and for some reason I never did the change.
>>>>>>> I'll have to think about it. It seems like this would be the obvious
>>>>>>> case, but I think there was something not so obvious that caused issues.
>>>>>>> But I don't remember what it was.
>>>>>>>
>>>>>>> I'll have to rethink this again.
>>>>>> I can't find anything wrong with this change. Maybe things change, or I
>>>>>> was thinking of another change.
>>>>>>
>>>>>> I'll apply it and start running my tests against it.
>>>>> Not only does this seem to work fine, I took it one step further :-)
>>>> Hmm... throttle doesn't handle the pushable list, so we may find a
>>>> throttled task by pick_next_pushable_task().
>>>>
>>>> Thanks,
>>>> Yong
>>> I don't completely understand throttle logic.
>>>
>>> Is the original patch not applicable for the same reason?
>> I guess so.
>>
>> Your patch will change the semantic of pick_next_pushable_task().
>
> Looking at the original patch, I don't see how it changes the semantics
> (although mine may have). The original patch was:
>
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -1729,7 +1729,7 @@ static int pull_rt_task(struct rq *this_rq)
> /*
>  * Are there still pullable RT tasks?
>  */
> -   if (src_rq->rt.rt_nr_running <= 1)
> +   if (!has_pushable_tasks(src_rq))
> goto skip;
>
> p = pick_next_highest_task_rt(src_rq, this_cpu);
>
> And I still don't see a problem with this. If a rq has no pushable
> tasks, then we shouldn't bother trying to pull from it (no task can
> migrate).
>
> Thus, the original patch, I believe should be applied without question.
>
> Now, about my patch, the one that made pick_next_highest_task_rt into
> just:
>
> static struct task_struct *pick_next_highest_task_rt(struct rq *rq, int cpu)
> {
>	struct plist_head *head = &rq->rt.pushable_tasks;
>	struct task_struct *next;
>
>	plist_for_each_entry(next, head, pushable_tasks) {
>		if (pick_rt_task(rq, next, cpu))
>			return next;
>	}
>
>	return NULL;
> }
>
> You said could pick a task from a throttled rq. I'm not sure that is
> different than what we have now. As the current
> pick_next_highest_task_rt() just does a loop over the leaf_rt_rqs which
> includes throttled rqs. That's because a throttled rq will not dequeue
> the rt_rq from the leaf_rt_rq list if the rt_rq has rt_nr_running != 0.

Yes, there is no connection between logic of pushable tasks and throttling at 
the moment.
These activities are independent. ( I tried to connect them at the patch:
http://lkml.indiana.edu/hypermail/linux/kernel/1211.2/03750.html )

I think, there is no problem.

Kirill
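
Editor's note: the difference between the two skip conditions in pull_rt_task() can be shown with a reduced model. The struct and both helpers are illustrative only; nr_pushable stands for "the pushable_tasks list is non-empty", which is what has_pushable_tasks() reports in the kernel.

```c
#include <stdbool.h>

/* Reduced model of a runqueue's RT state; field names are illustrative. */
struct rq_model {
	unsigned int rt_nr_running;	/* all runnable RT tasks */
	unsigned int nr_pushable;	/* queued RT tasks allowed on more than one CPU */
};

/* Old test: skip only when there is at most one RT task. */
static bool worth_pulling_old(const struct rq_model *rq)
{
	return rq->rt_nr_running > 1;
}

/* Proposed test: skip whenever nothing on the queue can migrate. */
static bool worth_pulling_new(const struct rq_model *rq)
{
	return rq->nr_pushable > 0;
}
```

A runqueue holding two RT tasks that are both bound to that CPU passes the old test but not the new one, so the new test avoids a pointless search for a task to pull.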

>
> I'm still thinking about adding both patches.
>
> -- Steve
>
>> Thanks,
>> Yong
>>> Kirill
>>>>> Peter, do you see anything wrong with this patch?
>>>>>
>>>>> -- Steve
>>>>>
>>>>> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
>>>>> index 61e3086..b44fd1b 100644
>>>>> --- a/kernel/sched/rt.c
>>>>> +++ b/kernel/sched/rt.c
>>>>> @@ -1416,39 +1416,15 @@ static int pick_rt_task(struct rq *rq, struct task_struct *p, int cpu)
>>>>>  /* Return the second highest RT task, NULL otherwise */
>>>>>  static struct task_struct *pick_next_highest_task_rt(struct rq *rq, int cpu)
>>>>>  {
>>>>> -	struct task_struct *next = NULL;
>>>>> -	struct sched_rt_entity *rt_se;
>>>>> -	struct rt_prio_array *array;
>>>>> -	struct rt_rq *rt_rq;
>>>>> -	int idx;
>>>>> +	struct plist_head *head = &rq->rt.pushable_tasks;
>>>>> +	struct task_struct *next;
>>>>>
>>>>> -	for_each_leaf_rt_rq(rt_rq, rq) {
>>>>> -		array = &rt_rq->active;
>>>>> -		idx = sched_find_first_bit(array->bitmap);

Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Wolfram Sang


> > Are you also able to build-test the changes?
> 
> Most certainly yes, this driver has almost no dependencies. I will
> build-test it anyway.

Me too :) I just wanted to express that I would love to see a compile
test before submission, even for checkpatch thingies. Can save some
hassle for all of us.

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |



Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Jean Delvare

On Tue, 18 Dec 2012 11:09:06 +0100, Wolfram Sang wrote:
> 
> > i have planned fixing these too, do you prefer one patch fixing all or
> > multiple patches (one per error/warning type )?
> 
> One patch, definitely.

Yes please :)
> You can skip the 80 char thing.

For PCI device IDs, agreed.

> Are you also able to build-test the changes?

Most certainly yes, this driver has almost no dependencies. I will
build-test it anyway.


-- 
Jean Delvare

Re: [PATCH RESEND 0/6 v10] gpio: Add block GPIO

2012-12-18 Thread Jean-Christophe PLAGNIOL-VILLARD

On 07:58 Tue 18 Dec , Wolfgang Grandegger wrote:
> On 12/18/2012 06:55 AM, Jean-Christophe PLAGNIOL-VILLARD wrote:
> > On 20:47 Mon 17 Dec , Wolfgang Grandegger wrote:
> >> On 12/17/2012 07:02 PM, Roland Stigge wrote:
> >>> On 12/17/2012 06:37 PM, Wolfgang Grandegger wrote:
> >>>>	/* Do synchronous data output with a single write access */
> >>>>	__raw_writel(~mask, pio + PIO_OWDR);
> >>>>	__raw_writel(mask, pio + PIO_OWER);
> >>>>	__raw_writel(val, pio + PIO_ODSR);
> >>>>
> >>>> For caching we would need a storage. Not sure if it's worth compared to
> >>>> a context switch into the kernel.
> >>>
> >>> Block GPIO is not only for you in userspace. ;-) You can also implement
> >>> efficient n-bit bus I/O in kernel drivers, n-bit-banging. :-) So not
> >>> always context switches involved.
> >>
> >> OK, what do you think about the following untested patch:
> >>
> >> From b44cad16cbbca84715dffd4cb5268497216add25 Mon Sep 17 00:00:00 2001
> >> From: Wolfgang Grandegger 
> >> Date: Mon, 3 Dec 2012 08:31:55 +0100
> >> Subject: [PATCH 1/2] gpio: add GPIO block callback functions for AT91
> >>
> >> Signed-off-by: Wolfgang Grandegger 
> >> ---
> >>  arch/arm/mach-at91/gpio.c |   29 +
> >>  1 file changed, 29 insertions(+)
> >>
> >> diff --git a/arch/arm/mach-at91/gpio.c b/arch/arm/mach-at91/gpio.c
> >> index be42cf0..cf6bd45 100644
> >> --- a/arch/arm/mach-at91/gpio.c
> >> +++ b/arch/arm/mach-at91/gpio.c
> >> @@ -42,13 +42,16 @@ struct at91_gpio_chip {
> >>void __iomem*regbase;   /* PIO bank virtual address */
> >>struct clk  *clock; /* associated clock */
> >>struct irq_domain   *domain;/* associated irq domain */
> >> +  unsigned long   mask_shadow;/* synchronous data output */
> >>  };
> >>  
> >>  #define to_at91_gpio_chip(c) container_of(c, struct at91_gpio_chip, chip)
> >>  
> >>  static void at91_gpiolib_dbg_show(struct seq_file *s, struct gpio_chip 
> >> *chip);
> >>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
> >> val);
> >> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
> >> mask, unsigned long val);
> >>  static int at91_gpiolib_get(struct gpio_chip *chip, unsigned offset);
> >> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, 
> >> unsigned long mask);
> >>  static int at91_gpiolib_direction_output(struct gpio_chip *chip,
> >> unsigned offset, int val);
> >>  static int at91_gpiolib_direction_input(struct gpio_chip *chip,
> >> @@ -62,7 +65,9 @@ static int at91_gpiolib_to_irq(struct gpio_chip *chip, 
> >> unsigned offset);
> >>.direction_input  = at91_gpiolib_direction_input, \
> >>.direction_output = at91_gpiolib_direction_output, \
> >>.get  = at91_gpiolib_get,   \
> >> +  .get_block= at91_gpiolib_get_block, \
> >>.set  = at91_gpiolib_set,   \
> >> +  .set_block= at91_gpiolib_set_block, \
> >>.dbg_show = at91_gpiolib_dbg_show,  \
> >>.to_irq   = at91_gpiolib_to_irq,\
> >>.ngpio= nr_gpio,\
> >> @@ -896,6 +901,16 @@ static int at91_gpiolib_get(struct gpio_chip *chip, 
> >> unsigned offset)
> >>return (pdsr & mask) != 0;
> >>  }
> >>  
> >> +static unsigned long at91_gpiolib_get_block(struct gpio_chip *chip, 
> >> unsigned long mask)
> >> +{
> >> +  struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> >> +  void __iomem *pio = at91_gpio->regbase;
> >> +  u32 pdsr;
> >> +
> >> +  pdsr = __raw_readl(pio + PIO_PDSR);
> >> +  return pdsr & mask;
> >> +}
> >> +
> >>  static void at91_gpiolib_set(struct gpio_chip *chip, unsigned offset, int 
> >> val)
> >>  {
> >>struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> >> @@ -905,6 +920,20 @@ static void at91_gpiolib_set(struct gpio_chip *chip, 
> >> unsigned offset, int val)
> >>__raw_writel(mask, pio + (val ? PIO_SODR : PIO_CODR));
> >>  }
> >>  
> >> +static void at91_gpiolib_set_block(struct gpio_chip *chip, unsigned long 
> >> mask, unsigned long val)
> >> +{
> >> +  struct at91_gpio_chip *at91_gpio = to_at91_gpio_chip(chip);
> >> +  void __iomem *pio = at91_gpio->regbase;
> >> +
> >> +  /* Do synchronous data output with a single write access */
> >> +  if (mask != at91_gpio->mask_shadow) {
> >> +  at91_gpio->mask_shadow = mask;
> >> +  __raw_writel(~mask, pio + PIO_OWDR);
> >> +  __raw_writel(mask, pio + PIO_OWER);
> >> +  }
> >> +  __raw_writel(val, pio + PIO_ODSR);
> >> +}
> > > this driver is only for the old at91 platform; if you touch at91 you need
> > > to update the pinctrl driver too
> 
> Well, the patch is for the hardware I have at hand and I can test. There
> are many other GPIO hardware

Re: [RFC PATCH v2 3/6] sched: pack small tasks

2012-12-18 Thread Vincent Guittot

On 17 December 2012 16:24, Alex Shi  wrote:
>>> The scheme below tries to summarize the idea:
>>>
>>> Socket      | socket 0 | socket 1   | socket 2   | socket 3   |
>>> LCPU        | 0 | 1-15 | 16 | 17-31 | 32 | 33-47 | 48 | 49-63 |
>>> buddy conf0 | 0 | 0    | 1  | 16    | 2  | 32    | 3  | 48    |
>>> buddy conf1 | 0 | 0    | 0  | 16    | 16 | 32    | 32 | 48    |
>>> buddy conf2 | 0 | 0    | 16 | 16    | 32 | 32    | 48 | 48    |
>>>
>>> But, I don't know how this can interact with NUMA load balance and the
>>> better might be to use conf3.
>>
>> I mean conf2 not conf3
>
> So, it has 4 levels 0/16/32/ for socket 3 and 0 level for socket 0, it
> is unbalanced for different socket.

 That the target because we have decided to pack the small tasks in
 socket 0 when we have parsed the topology at boot.
 We don't have to loop into sched_domain or sched_group anymore to find
 the best LCPU when a small tasks wake up.
>>>
>>> Iterating over domains and groups is an advantage for power-efficiency
>>> requirements, not a shortcoming. If some CPUs are already idle before
>>> forking, letting the waking CPU check their load/util and then decide
>>> which CPU is best can reduce later migrations, which saves both
>>> performance and power.
>>
>> In fact, we have already done this job once at boot and we consider
>> that moving small tasks in the buddy CPU is always benefit so we don't
>> need to waste time looping sched_domain and sched_group to compute
>> current capacity of each LCPU for each wake up of each small tasks. We
>> want all small tasks and background activity waking up on the same
>> buddy CPU and let the default behavior of the scheduler choosing the
>> best CPU for heavy tasks or loaded CPUs.
>
> IMHO, the design should be very good for your scenario and your machine,
> but when the code move to general scheduler, we do want it can handle
> more general scenarios. like sometime the 'small task' is not as small
> as tasks in cyclictest which even hardly can run longer than migration

Cyclictest is the ultimate small tasks use case which points out all
weaknesses of a scheduler for such kind of tasks.
Music playback is a more realistic one and it also shows improvement

> granularity or one tick, thus we really don't need to consider task
> migration cost. But when the task are not too small, migration is more

For which kind of machine are you stating that hypothesis ?

> heavier than domain/group walking, that is the common sense in
> fork/exec/waking balance.

I would have said the opposite: the current scheduler limits its
computation of statistics during fork/exec/waking, compared to a
periodic load balance, because it is too heavy. This is even more true
for wake up if wake affine is possible.

>
>>
>>>
>>> On the contrary, move task walking on each level buddies is not only bad
>>> on performance but also bad on power. Consider the quite big latency of
>>> waking a deep idle CPU. we lose too much..
>>
>> My result have shown different conclusion.
>
> That should be due to your tasks are too small to need consider
> migration cost.
>> In fact, there is much more chance that the buddy will not be in a
>> deep idle as all the small tasks and background activity are already
>> waking on this CPU.
>
> powertop is helpful to tune your system for more idle time. Another
> reason is current kernel just try to spread tasks on more cpu for
> performance consideration. My power scheduling patch should helpful on this.
>>
>>>

>
> And the ground level has just one buddy for 16 LCPUs - 8 cores, that's
> not a good design, consider my previous examples: if there are 4 or 8
> tasks in one socket, you just has 2 choices: spread them into all cores,
> or pack them into one LCPU. Actually, moving them just into 2 or 4 cores
> maybe a better solution. but the design missed this.

 You speak about tasks without any notion of load. This patch only care
 of small tasks and light LCPU load, but it falls back to default
 behavior for other situation. So if there are 4 or 8 small tasks, they
 will migrate to the socket 0 after 1 or up to 3 migration (it depends
 of the conf and the LCPU they come from).
>>>
>>> According to your patch, what your mean 'notion of load' is the
>>> utilization of cpu, not the load weight of tasks, right?
>>
>> Yes but not only. The number of tasks that run simultaneously, is
>> another important input
>>
>>>
>>> Yes, I just talked about tasks numbers, but it naturally extends to the
>>> task utilization on cpu. like 8 tasks with 25% util, that just can full
>>> fill 2 CPUs. but clearly beyond the capacity of the buddy, so you need
>>> to wake up another CPU socket while local socket has some LCPU idle...
>>
>> 8 tasks with a running period of 25ms per 100ms that wake up
>> simultaneously should probably run on 8 different LCPU in order to
>> race to idle
>
> nope,

Re: linux-next: build failure after merge of the akpm tree

2012-12-18 Thread Mel Gorman

On Tue, Dec 18, 2012 at 01:29:07PM +1100, Stephen Rothwell wrote:
> Hi Andrew,
> 
> After merging the akpm tree, today's linux-next build (x86_64
> allmodconfig) failed like this:
> 
> mm/migrate.c: In function 'migrate_misplaced_transhuge_page':
> mm/migrate.c:1738:2: error: incompatible type for argument 3 of 
> 'update_mmu_cache_pmd'
> arch/x86/include/asm/pgtable.h:792:20: note: expected 'struct pmd_t *' but 
> argument is of type 'pmd_t'
> 
> Caused by commit b32967ff101a ("mm: numa: Add THP migration for the NUMA
> working set scanning fault case") from Linus' tree interacting with
> commit "x86: convert update_mmu_cache() and update_mmu_cache_pmd() to
> functions".
> 
> This was previously reported as against the tip tree, but the fix patch
> (which was applied to the tip tree) did not survive the rewrite that went
> into Linus' tree.  Just a little annoyed :-(
> 

Sorry about that Stephen, I should have caught the fix in the tip tree
and picked it up. Thanks for fixing it a second time.

-- 
Mel Gorman
SUSE Labs

Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Wolfram Sang


> I plan to fix these too. Do you prefer one patch fixing everything, or
> multiple patches (one per error/warning type)?

One patch, definitely. You can skip the 80 char thing. Are you also able
to build-test the changes?

-- 
Pengutronix e.K.   | Wolfram Sang|
Industrial Linux Solutions | http://www.pengutronix.de/  |



Re: [PATCH] mm/swap: abort swapoff after disk error

2012-12-18 Thread Konstantin Khlebnikov


Hugh Dickins wrote:

On Fri, 14 Dec 2012, Konstantin Khlebnikov wrote:


The content of non-uptodate pages is completely random, so we cannot expose
them to userspace. Doing so leaks information and will crash userspace for sure.


Good find, yes, it's very wrong as is.  But, sorry, I don't like your fix
- better than ignoring the issue as at present, but not the right answer.


We could probably reuse hwpoison entries here, but tmpfs is already too complex.


HWpoison entries?  They're for when that page of RAM is bad, but this is
quite a different case: the page is fine and can perfectly well be freed
and reused - what's bad is the data currently in it.



Signed-off-by: Konstantin Khlebnikov
Original-patch-by: Alexey Kuznetsov
Cc: Andrew Morton
Cc: Hugh Dickins
Cc: Andi Kleen
---
  mm/swapfile.c |   16 
  1 file changed, 16 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index e97a0e5..98fc2fd 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1127,6 +1127,22 @@ int try_to_unuse(unsigned int type, bool frontswap,
wait_on_page_writeback(page);

/*
+* If read failed we cannot map not-uptodate page to
+* user space. Actually, we are in serious troubles,
+* we do not even know what process to kill. So, the only


try_to_unuse() is all about locating exactly where this page belongs;
and if the user is lucky, the page in question won't even be needed again
before the process exits, so nothing should be killed at this point.



+* variant remains: to stop swapoff() and allow someone
+* to kill processes to zap invalid pages.


No, we should not abort swapoff: there's every reason to continue,
to make sure that this unreliable area can be taken out of service.


+*
+* TODO replace page with hwpoison entry in pte and shmem.


Instead of blindly going ahead and inserting ptes pointing to the
!PageUptodate page, unuse_pte() and shmem_unuse_inode() should insert
a substitute bad swapentry, to generate SIGBUS if it's accessed.

swp_entry(1, 0) might serve, but there's probably a few mods needed
here and there; and getting the details right (e.g. memcg charges)
will need care.

Not as straightforward as your block below, I admit.  I wonder if you
posted that just to stir me to do better: or can you take it further?


I found this patch in our kernel tree. For some reason it was never sent
to mainline, so I decided to send it as is rather than lose it for a few more
years. Using hwpoison here was just a guess. Your bad-swap-entry approach is
a much more accurate solution. There seems to be no rush, since this bug has
been here from the beginning, so I'll handle it. Thanks for your advice.



Thanks,
Hugh


+*/
+   if (unlikely(!PageUptodate(page))) {
+   unlock_page(page);
+   page_cache_release(page);
+   retval = -EIO;
+   break;
+   }
+
+   /*
 * Remove all references to entry.
 */
swcount = *swap_map;


Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Laurent Navet

Hi guys,

> This is correct, however there are several other checkpatch errors and
> warnings in this file and I would appreciate if you could fix them as
> well. I'm not asking that you fix them all, but please consider fixing
> the following:
>
> WARNING: space prohibited between function name and open parenthesis '('
> #63: FILE: i2c/busses/i2c-nforce2.c:63:
> +MODULE_AUTHOR ("Hans-Frieder Vogt ");
>
> WARNING: suspect code indent for conditional statements (24, 33)
> #226: FILE: i2c/busses/i2c-nforce2.c:226:
> + if (read_write == I2C_SMBUS_WRITE) {
> +  outb_p(data->word, NVIDIA_SMB_DATA);
>
> WARNING: line over 80 characters
> #275: FILE: i2c/busses/i2c-nforce2.c:275:
> + data->word = inb_p(NVIDIA_SMB_DATA) | 
> (inb_p(NVIDIA_SMB_DATA+1) << 8);
>
> WARNING: quoted string split across lines
> #282: FILE: i2c/busses/i2c-nforce2.c:282:
> + dev_err(&adap->dev, "Transaction failed "
> + "(received block size: 0x%02x)\n",
>
> WARNING: space prohibited between function name and open parenthesis '('
> #330: FILE: i2c/busses/i2c-nforce2.c:330:
> +MODULE_DEVICE_TABLE (pci, nforce2_ids);
>
> WARNING: line over 80 characters
> #380: FILE: i2c/busses/i2c-nforce2.c:380:
> + dev_info(&smbus->adapter.dev, "nForce2 SMBus adapter at %#x\n",
> smbus->base);
>
> ERROR: space required before the open parenthesis '('
> #395: FILE: i2c/busses/i2c-nforce2.c:395:
> + switch(dev->device) {
>
> These are simple coding style issues, very easy to fix.

I plan to fix these too. Do you prefer one patch fixing everything, or
multiple patches (one per error/warning type)?

regards,
Laurent.

[PATCH] fuse: truncate file if async dio failed - v2

2012-12-18 Thread Maxim V. Patlasov

The patch improves error handling in fuse_direct_IO(): if we successfully
submitted several fuse requests on behalf of a synchronous direct write that
extends the file, and some of them failed, let's do our best to clean up.

Changed in v2: reuse fuse_do_setattr(). Thanks to Brian for suggestion.

Signed-off-by: Maxim Patlasov 
---
 fs/fuse/dir.c|   17 +
 fs/fuse/file.c   |   27 +--
 fs/fuse/fuse_i.h |3 +++
 3 files changed, 37 insertions(+), 10 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 20b52a5..049d4c2 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1532,10 +1532,9 @@ void fuse_release_nowrite(struct inode *inode)
  * vmtruncate() doesn't allow for this case, so do the rlimit checking
  * and the actual truncation by hand.
  */
-static int fuse_do_setattr(struct dentry *entry, struct iattr *attr,
-  struct file *file)
+int fuse_do_setattr(struct inode *inode, struct iattr *attr,
+   struct file *file)
 {
-   struct inode *inode = entry->d_inode;
struct fuse_conn *fc = get_fuse_conn(inode);
struct fuse_req *req;
struct fuse_setattr_in inarg;
@@ -1544,9 +1543,6 @@ static int fuse_do_setattr(struct dentry *entry, struct 
iattr *attr,
loff_t oldsize;
int err;
 
-   if (!fuse_allow_task(fc, current))
-   return -EACCES;
-
if (!(fc->flags & FUSE_DEFAULT_PERMISSIONS))
attr->ia_valid |= ATTR_FORCE;
 
@@ -1641,10 +1637,15 @@ error:
 
 static int fuse_setattr(struct dentry *entry, struct iattr *attr)
 {
+   struct inode *inode = entry->d_inode;
+
+   if (!fuse_allow_task(get_fuse_conn(inode), current))
+   return -EACCES;
+
if (attr->ia_valid & ATTR_FILE)
-   return fuse_do_setattr(entry, attr, attr->ia_file);
+   return fuse_do_setattr(inode, attr, attr->ia_file);
else
-   return fuse_do_setattr(entry, attr, NULL);
+   return fuse_do_setattr(inode, attr, NULL);
 }
 
 static int fuse_getattr(struct vfsmount *mnt, struct dentry *entry,
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 05eed23..d9a0568 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -2340,6 +2340,25 @@ int fuse_notify_poll_wakeup(struct fuse_conn *fc,
return 0;
 }
 
+static void fuse_do_truncate(struct file *file)
+{
+   struct inode *inode = file->f_mapping->host;
+   struct iattr attr;
+   int err;
+
+   attr.ia_valid = ATTR_SIZE;
+   attr.ia_size = i_size_read(inode);
+
+   attr.ia_file = file;
+   attr.ia_valid |= ATTR_FILE;
+
+   err = fuse_do_setattr(inode, &attr, file);
+
+   if (err)
+   printk(KERN_WARNING "failed to truncate to %lld with error "
+  "%d\n", i_size_read(inode), err);
+}
+
 static ssize_t
 fuse_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,
loff_t offset, unsigned long nr_segs)
@@ -2400,8 +2419,12 @@ fuse_direct_IO(int rw, struct kiocb *iocb, const struct 
iovec *iov,
kfree(io);
}
 
-   if (rw == WRITE && ret > 0)
-   fuse_write_update_size(inode, pos);
+   if (rw == WRITE) {
+   if (ret > 0)
+   fuse_write_update_size(inode, pos);
+   else if (ret < 0 && offset + count > i_size)
+   fuse_do_truncate(file);
+   }
 
return ret;
 }
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 91b5192..d4f7f07 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -840,4 +840,7 @@ int fuse_dev_release(struct inode *inode, struct file 
*file);
 
 void fuse_write_update_size(struct inode *inode, loff_t pos);
 
+int fuse_do_setattr(struct inode *inode, struct iattr *attr,
+   struct file *file);
+
 #endif /* _FS_FUSE_I_H */
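
The condition added to fuse_direct_IO() above can be read in isolation as a
small decision function. The following is only a user-space sketch of that
check, not code from the patch: the enum and the function name are
hypothetical, and it assumes old_size is the file size sampled before the
write was submitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's READ/WRITE constants. */
enum dio_dir { DIO_READ = 0, DIO_WRITE = 1 };

/*
 * Returns true when a failed direct write should be followed by a
 * truncate back to the old size: the request failed (ret < 0) and it
 * would have extended the file past old_size.
 */
static bool dio_needs_truncate(enum dio_dir rw, int64_t ret,
			       int64_t offset, int64_t count,
			       int64_t old_size)
{
	if (rw != DIO_WRITE)
		return false;
	return ret < 0 && offset + count > old_size;
}
```

A failed write that stays within the old file size needs no cleanup, which
is why the size comparison is part of the condition.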


Re: [patch 3/7] mm: vmscan: clarify how swappiness, highest priority, memcg interact

2012-12-18 Thread Mel Gorman

On Mon, Dec 17, 2012 at 01:12:33PM -0500, Johannes Weiner wrote:
> A swappiness of 0 has a slightly different meaning for global reclaim
> (may swap if file cache really low) and memory cgroup reclaim (never
> swap, ever).
> 
> In addition, global reclaim at highest priority will scan all LRU
> lists equal to their size and ignore other balancing heuristics.
> UNLESS swappiness forbids swapping, then the lists are balanced based
> on recent reclaim effectiveness.  UNLESS file cache is running low,
> then anonymous pages are force-scanned.
> 
> This (total mess of a) behaviour is implicit and not obvious from the
> way the code is organized.  At least make it apparent in the code flow
> and document the conditions.  It will make it easier to come up with
> sane semantics later.
> 
> Signed-off-by: Johannes Weiner 

Acked-by: Mel Gorman 

-- 
Mel Gorman
SUSE Labs

Re: 32kHz clock removal causes problems omap_hsmmc

2012-12-18 Thread Felipe Balbi

Hi,

On Thu, Nov 15, 2012 at 10:31:33AM +0200, Luciano Coelho wrote:
> Since the 32KHz clock was removed from the twl-regulator (0e8e5c34
> regulator: twl: Remove references to 32kHz clock from DT bindings),
> we've been having problems with our wl12xx chip that is connected
> through the omap_hsmmc.
> 
> Our card simply doesn't get added to the system and we get lots of
> -ETIMEOUTs during mmc_attach.  If I revert 0e8e5c34 (plus a couple of
> associated patches), everything works fine.
> 
> I've been using a Blaze device with a WiLink 1283 chip connected in the
> COM (internal) port.  The funny thing is that I don't have the same
> problem with a WiLink 1853 chip (which is connected externally to
> another Blaze).
> 
> Does anyone know what the problem could be and how to fix it?
> 
> This is a regression that has been there since 3.6-rc1.

Damn, this is still part of our v3.7-rc kernel. The original commit was done
with no testing whatsoever and caused a big regression to (at least)
TI's WiFi driver, which depends on SDIO to function.

Too bad things break and even when reported nobody gives a rat's ***
about them :-s

-- 
balbi



[PATCH] drivers/watchdog/eurotechwdt: handle spurious interrupts on wrong hardware

2012-12-18 Thread Konstantin Khlebnikov

"eurotechwdt" has no PCI-ID or DMI checks, so it can be loaded on any
hardware. On my PC this leads to an immediate reboot, because the driver gets
an irq right after registering its irq handler. This patch rejects interrupts
until the device is activated. There is no point in loading this driver
without the special hardware, but such bugs block my automated testing of
allmodconfig kernels.

Signed-off-by: Konstantin Khlebnikov 
Cc: Wim Van Sebroeck 
Cc: linux-watch...@vger.kernel.org
Cc: Alan Cox 

---

By default the driver uses irq 10; on my machine it's free:

root@buzz:~# cat /proc/interrupts
   CPU0   CPU1
  0:187  0   IO-APIC-edge  timer
  1:  0  8   IO-APIC-edge  i8042
  7:  2  0   IO-APIC-edge  parport0
  8:  0  1   IO-APIC-edge  rtc0
  9:  0  0   IO-APIC-fasteoi   acpi
 10:  1  0   IO-APIC-edge
 14:  1 61   IO-APIC-edge  pata_amd
 15:  0  0   IO-APIC-edge  pata_amd
 20:  0317   IO-APIC-fasteoi   snd_hda_intel
 22:  0  2   IO-APIC-fasteoi   ehci_hcd:usb1, sata_nv
 23:  6   3667   IO-APIC-fasteoi   ohci_hcd:usb2, sata_nv
 41: 93 776888   PCI-MSI-edge  eth0
NMI:  0  2   Non-maskable interrupts
LOC:  13293  32610   Local timer interrupts
SPU:  0  0   Spurious interrupts
PMI:  0  2   Performance monitoring interrupts
IWI:  0  0   IRQ work interrupts
RTR:  0  0   APIC ICR read retries
RES:  11427   7619   Rescheduling interrupts
CAL:182 52   Function call interrupts
TLB:824580   TLB shootdowns
TRM:  0  0   Thermal event interrupts
THR:  0  0   Threshold APIC interrupts
MCE:  0  0   Machine check exceptions
MCP:  2  2   Machine check polls
ERR:  1
MIS:  0
---
 drivers/watchdog/eurotechwdt.c |7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/watchdog/eurotechwdt.c b/drivers/watchdog/eurotechwdt.c
index cd31b8a..7d24914 100644
--- a/drivers/watchdog/eurotechwdt.c
+++ b/drivers/watchdog/eurotechwdt.c
@@ -65,6 +65,7 @@
 static unsigned long eurwdt_is_open;
 static int eurwdt_timeout;
 static char eur_expect_close;
+static bool eurwdt_is_active;
 static DEFINE_SPINLOCK(eurwdt_lock);
 
 /*
@@ -139,6 +140,7 @@ static inline void eurwdt_disable_timer(void)
 static void eurwdt_activate_timer(void)
 {
eurwdt_disable_timer();
+   eurwdt_is_active = true;
eurwdt_write_reg(WDT_CTRL_REG, 0x01);   /* activate the WDT */
eurwdt_write_reg(WDT_OUTPIN_CFG,
!strcmp("int", ev) ? WDT_EVENT_INT : WDT_EVENT_REBOOT);
@@ -164,6 +166,11 @@ static void eurwdt_activate_timer(void)
 
 static irqreturn_t eurwdt_interrupt(int irq, void *dev_id)
 {
+   if (!eurwdt_is_active) {
+   pr_crit("spurious interrupt\n");
+   return IRQ_NONE;
+   }
+
pr_crit("timeout WDT timeout\n");
 
 #ifdef ONLY_TESTING
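
The guard added above follows a common pattern: an interrupt handler
declines any interrupt that arrives before the device has been armed. A
minimal user-space sketch of that pattern is shown here; the names are
hypothetical stand-ins (not the driver's symbols), and the enum only mimics
the kernel's IRQ_NONE/IRQ_HANDLED return values.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's irqreturn_t values. */
enum irq_result { IRQ_RESULT_NONE = 0, IRQ_RESULT_HANDLED = 1 };

static bool wdt_is_active;	/* set once the timer has been armed */
static int wdt_timeouts;	/* counts interrupts treated as real */

static void wdt_activate(void)
{
	wdt_is_active = true;
}

static enum irq_result wdt_interrupt(void)
{
	if (!wdt_is_active)
		return IRQ_RESULT_NONE;	/* not ours: device never armed */
	wdt_timeouts++;
	return IRQ_RESULT_HANDLED;
}
```

Returning the "none" value tells the interrupt core that this handler did
not service the interrupt, so a stray irq on unrelated hardware no longer
triggers the watchdog's timeout path.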


Re: [patch 1/7] mm: memcg: only evict file pages when we have plenty

2012-12-18 Thread Mel Gorman

On Mon, Dec 17, 2012 at 01:12:31PM -0500, Johannes Weiner wrote:
> e986850 "mm, vmscan: only evict file pages when we have plenty" makes
> a point of not going for anonymous memory while there is still enough
> inactive cache around.
> 
> The check was added only for global reclaim, but it is just as useful
> to reduce swapping in memory cgroup reclaim:
> 
> 200M-memcg-defconfig-j2
> 
>                                vanilla                 patched
> Real time            454.06 (  +0.00%)       453.71 (  -0.08%)
> User time            668.57 (  +0.00%)       668.73 (  +0.02%)
> System time          128.92 (  +0.00%)       129.53 (  +0.46%)
> Swap in             1246.80 (  +0.00%)       814.40 ( -34.65%)
> Swap out            1198.90 (  +0.00%)       827.00 ( -30.99%)
> Pages allocated 16431288.10 (  +0.00%)  16434035.30 (  +0.02%)
> Major faults         681.50 (  +0.00%)       593.70 ( -12.86%)
> THP faults           237.20 (  +0.00%)       242.40 (  +2.18%)
> THP collapse         241.20 (  +0.00%)       248.50 (  +3.01%)
> THP splits           157.30 (  +0.00%)       161.40 (  +2.59%)
> 
> Signed-off-by: Johannes Weiner 
> Acked-by: Michal Hocko 

Acked-by: Mel Gorman 

-- 
Mel Gorman
SUSE Labs

Re: [PATCH] driver i2c-nforce2: fix pointer CodingStyle issues

2012-12-18 Thread Jean Delvare

Hi Laurent,

On Mon, 17 Dec 2012 22:04:19 +0100, Laurent Navet wrote:
> fix these errors reported by checkpatch.pl
> - drivers/i2c/busses/i2c-nforce2.c:191
> - drivers/i2c/busses/i2c-nforce2.c:193
> ERROR: "foo * bar" should be "foo *bar"
> 
> - drivers/i2c/busses/i2c-nforce2.c:302:
> ERROR: "(foo*)" should be "(foo *)"
> 
> Signed-off-by: Laurent Navet 
> ---
>  drivers/i2c/busses/i2c-nforce2.c |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/i2c/busses/i2c-nforce2.c 
> b/drivers/i2c/busses/i2c-nforce2.c
> index 392303b..9db5ff5 100644
> --- a/drivers/i2c/busses/i2c-nforce2.c
> +++ b/drivers/i2c/busses/i2c-nforce2.c
> @@ -188,9 +188,9 @@ static int nforce2_check_status(struct i2c_adapter *adap)
>  }
>  
>  /* Return negative errno on error */
> -static s32 nforce2_access(struct i2c_adapter * adap, u16 addr,
> +static s32 nforce2_access(struct i2c_adapter *adap, u16 addr,
>   unsigned short flags, char read_write,
> - u8 command, int size, union i2c_smbus_data * data)
> + u8 command, int size, union i2c_smbus_data *data)
>  {
>   struct nforce2_smbus *smbus = adap->algo_data;
>   unsigned char protocol, pec;
> @@ -299,7 +299,7 @@ static u32 nforce2_func(struct i2c_adapter *adapter)
>   return I2C_FUNC_SMBUS_QUICK | I2C_FUNC_SMBUS_BYTE |
>  I2C_FUNC_SMBUS_BYTE_DATA | I2C_FUNC_SMBUS_WORD_DATA |
>  I2C_FUNC_SMBUS_PEC |
> -(((struct nforce2_smbus*)adapter->algo_data)->blockops ?
> +(((struct nforce2_smbus *)adapter->algo_data)->blockops ?
>   I2C_FUNC_SMBUS_BLOCK_DATA : 0);
>  }
>  

This is correct, however there are several other checkpatch errors and
warnings in this file and I would appreciate if you could fix them as
well. I'm not asking that you fix them all, but please consider fixing
the following:

WARNING: space prohibited between function name and open parenthesis '('
#63: FILE: i2c/busses/i2c-nforce2.c:63:
+MODULE_AUTHOR ("Hans-Frieder Vogt ");

WARNING: suspect code indent for conditional statements (24, 33)
#226: FILE: i2c/busses/i2c-nforce2.c:226:
+   if (read_write == I2C_SMBUS_WRITE) {
+outb_p(data->word, NVIDIA_SMB_DATA);

WARNING: line over 80 characters
#275: FILE: i2c/busses/i2c-nforce2.c:275:
+   data->word = inb_p(NVIDIA_SMB_DATA) | 
(inb_p(NVIDIA_SMB_DATA+1) << 8);

WARNING: quoted string split across lines
#282: FILE: i2c/busses/i2c-nforce2.c:282:
+   dev_err(&adap->dev, "Transaction failed "
+   "(received block size: 0x%02x)\n",

WARNING: space prohibited between function name and open parenthesis '('
#330: FILE: i2c/busses/i2c-nforce2.c:330:
+MODULE_DEVICE_TABLE (pci, nforce2_ids);

WARNING: line over 80 characters
#380: FILE: i2c/busses/i2c-nforce2.c:380:
+   dev_info(&smbus->adapter.dev, "nForce2 SMBus adapter at %#x\n",
smbus->base);

ERROR: space required before the open parenthesis '('
#395: FILE: i2c/busses/i2c-nforce2.c:395:
+   switch(dev->device) {

These are simple coding style issues, very easy to fix.

Thanks,
-- 
Jean Delvare

Why Linux kernel forced to enter X2APIC mode( just because of booting cpu has supported x2apic) without depending on BIOS' setting in MSR->x2apic enablement bit ?

2012-12-18 Thread Zhang, Lin-Bao (Linux Kernel R)

Hi Suresh and others,

In 3.4.4/3.6.6 I found what looks like an x2apic issue. If I am wrong, I
apologize in advance and welcome your corrections. Please feel free to
forward this to other maintainers.
I am testing a server whose BIOS behaves like this:
a) If the BIOS decides the system should use x2apic, it sets the x2apic
enable bit in the MSR, creates the x2apic ACPI tables, and passes control to
the OS in x2apic mode.
b) If the BIOS decides the system does not meet the x2apic conditions, it
does not set the x2apic enable bit in the MSR, and passes control to the OS
in xapic mode.

It seems the MSR should be the interface the OS relies on here, but the
kernel does not appear to do so.
In the Linux kernel source code we see:
void check_x2apic(void)  
{
if (x2apic_enabled()) {  // this depends on 2 conditions : cpu has 
supported x2apic or not, bios setting MSR bit or not .
pr_info("x2apic enabled by BIOS, switching to x2apic ops\n");
x2apic_preenabled = x2apic_mode = 1;
}
So in early boot the Linux kernel follows the BIOS setting.

But elsewhere, in enable_IR_x2apic(), the Linux kernel apparently discards
the BIOS setting, like this:

#define x2apic_supported()  (cpu_has_x2apic)  // just depends on if 
booting CPU has supported x2apic
Enable_IR_x2apic {

x2apic_enabled = 1;
if (x2apic_supported() && !x2apic_mode) {  //x2apic_mode depends on 
BIOS's MSR. 
x2apic_mode = 1;
enable_x2apic();// OS 
will write MSR->x2apic enablement bit and print 
pr_info("Enabled x2apic\n");
}
..
}

void enable_x2apic(void)
{
u64 msr;

rdmsrl(MSR_IA32_APICBASE, msr);
if (x2apic_disabled) {
__disable_x2apic(msr);
return;
}

if (!x2apic_mode)
return;

if (!(msr & X2APIC_ENABLE)) {
printk_once(KERN_INFO "Enabling x2apic\n");
// linux kernel will write MSR->x2apic_enable ,work around BIOS 
, this is reasonable ?
wrmsrl(MSR_IA32_APICBASE, msr | X2APIC_ENABLE);
}
}
I am very surprised by this. Why does the Linux kernel force-enable x2apic
(just because the boot CPU supports it) when the BIOS has indicated that it
does not support x2apic?
I have one machine whose boot log does not contain
"x2apic enabled by BIOS, switching to x2apic ops",
but later the kernel prints:
" [0.464649] parse_iosapics: ecap f0207e
[0.469728] IOAPIC id 8 under DRHD base  0xa800 IU 0
[0.576731] IOAPIC id 0 under DRHD base  0xa800 IOMMU 0
[0.584065] parse_io_apic: io_apic 2 nr 2
[0.589590] Enabled IRQ remapping in x2apic mode
[0.595663] Enabling x2apic
[0.599337] Enabled x2apic
[0.603007] Switched APIC routing to cluster x2apic."  ( I can add 
x2apic_phys to force using physical mode).


1. This worries me. The BIOS did not enable x2apic; it only enabled xAPIC
mode and did not create the x2apic ACPI tables. In that case, how can the
Linux kernel run normally in x2apic mode? In fact, on this machine the kernel
has not run into any problems, but I still think this is a potential defect.
2. Why does the Linux kernel first depend on the MSR and later discard it?
Is there some history behind this? Does the Linux kernel intend to drop the
MSR check completely in the future?
3. Has anybody tested machines that support x2apic? Do you see similar
problems? On those machines, do you check the MSR bit?


-- Bob (LinBao Zhang)
HP Linux kernel engineer


[PATCH 4/6] Add new base function _install_special_mapping() to mmap.c

2012-12-18 Thread stefani

From: Stefani Seibold 

_install_special_mapping() is the new base function for
install_special_mapping(). It returns a pointer to the vma that was
created, or an error code wrapped in an ERR_PTR().

install_special_mapping() converts this back to the old behaviour,
returning an int.

The new function is needed by X86_64/IA32_EMULATION to map the hpet into
the 32 bit address space. This is done with io_remap_pfn_range(), which
requires a vm_area_struct.
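
The pointer-or-error convention described here can be modelled in user
space. The sketch below is only an illustration of the pattern, not the
kernel's implementation: err_ptr(), is_err() and ptr_err() are simplified
lowercase stand-ins for ERR_PTR(), IS_ERR() and PTR_ERR(), and struct mapping
and both install functions are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in the pointer itself. */
static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

struct mapping { unsigned long addr; };

/* New-style base function: returns the object or an encoded error. */
static struct mapping *install_mapping_ptr(unsigned long addr)
{
	struct mapping *m;

	if (addr == 0)
		return err_ptr(-EINVAL);
	m = malloc(sizeof(*m));
	if (!m)
		return err_ptr(-ENOMEM);
	m->addr = addr;
	return m;
}

/* Old-style wrapper: collapses the result back to an int. */
static int install_mapping(unsigned long addr)
{
	struct mapping *m = install_mapping_ptr(addr);

	if (is_err(m))
		return (int)ptr_err(m);
	free(m);	/* toy model only: nothing keeps the object */
	return 0;
}
```

Callers that only need success or failure keep using the int-returning
wrapper, while callers that need the created object call the base function
and test the result with the is_err() equivalent.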

Signed-off-by: Stefani Seibold 
---
 include/linux/mm.h |  3 +++
 mm/mmap.c  | 20 
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bcaab4e..82a992b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1435,6 +1435,9 @@ extern void set_mm_exe_file(struct mm_struct *mm, struct 
file *new_exe_file);
 extern struct file *get_mm_exe_file(struct mm_struct *mm);
 
 extern int may_expand_vm(struct mm_struct *mm, unsigned long npages);
+extern struct vm_area_struct *_install_special_mapping(struct mm_struct *mm,
+  unsigned long addr, unsigned long len,
+  unsigned long flags, struct page **pages);
 extern int install_special_mapping(struct mm_struct *mm,
   unsigned long addr, unsigned long len,
   unsigned long flags, struct page **pages);
diff --git a/mm/mmap.c b/mm/mmap.c
index 9a796c4..dd85d21 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2515,7 +2515,7 @@ static const struct vm_operations_struct 
special_mapping_vmops = {
  * The array pointer and the pages it points to are assumed to stay alive
  * for as long as this mapping might exist.
  */
-int install_special_mapping(struct mm_struct *mm,
+struct vm_area_struct *_install_special_mapping(struct mm_struct *mm,
unsigned long addr, unsigned long len,
unsigned long vm_flags, struct page **pages)
 {
@@ -2524,7 +2524,7 @@ int install_special_mapping(struct mm_struct *mm,
 
vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
if (unlikely(vma == NULL))
-   return -ENOMEM;
+   return ERR_PTR(-ENOMEM);
 
INIT_LIST_HEAD(&vma->anon_vma_chain);
vma->vm_mm = mm;
@@ -2545,11 +2545,23 @@ int install_special_mapping(struct mm_struct *mm,
 
perf_event_mmap(vma);
 
-   return 0;
+   return vma;
 
 out:
kmem_cache_free(vm_area_cachep, vma);
-   return ret;
+   return ERR_PTR(ret);
+}
+
+int install_special_mapping(struct mm_struct *mm,
+   unsigned long addr, unsigned long len,
+   unsigned long vm_flags, struct page **pages)
+{
+   struct vm_area_struct *vma = _install_special_mapping(mm,
+   addr, len, vm_flags, pages);
+
+   if (IS_ERR(vma))
+   return PTR_ERR(vma);
+   return 0;
 }
 
 static DEFINE_MUTEX(mm_all_locks_mutex);
-- 
1.8.0


[PATCH 0/6] Add 32 bit VDSO time function support

2012-12-18 Thread stefani

From: Stefani Seibold 

This patch series adds the functions vdso_gettimeofday(), vdso_clock_gettime()
and vdso_time() to the 32 bit VDSO.

The reason for doing this is to get a fast, reliable time stamp. Many developers
use the TSC to get a fast time stamp without knowing the pitfalls. The VDSO
time functions are a fast and reliable alternative, because the kernel knows the
best time source and the P- and C-state of the CPU.

The helper library for using the VDSO functions can be downloaded at
http://seibold.net/vdso.c
The library is very small, only 228 lines of code. Compile it with
gcc -Wall -O3 -fpic vdso.c -lrt -shared -o libvdso.so
and use it with LD_PRELOAD=/libvdso.so

This kind of helper must be integrated into glibc; for x86 64 bit and
PowerPC it is already there.

Some benchmark results for 32 bit Linux (all measurements are in nanoseconds):

Intel(R) Celeron(TM) CPU 400MHz

Average time kernel call:
 gettimeofday(): 1039
 clock_gettime(): 1578
 time(): 526
Average time VDSO call:
 gettimeofday(): 378
 clock_gettime(): 303
 time(): 60

Celeron(R) Dual-Core CPU T3100 1.90GHz

Average time kernel call:
 gettimeofday(): 209
 clock_gettime(): 406
 time(): 135
Average time VDSO call:
 gettimeofday(): 51
 clock_gettime(): 43
 time(): 10

So you can see a speedup by a factor of roughly 3 to 13, depending on the
CPU and the function.
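
Numbers of this kind could be collected with a small loop like the one below. It is a rough sketch, not the benchmark used for the figures above: the function names are hypothetical, it assumes Linux with glibc, and whether the libc clock_gettime() is actually served by the VDSO depends on the kernel, the architecture and the libc in use. The syscall(2) variant always enters the kernel.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic timestamp in nanoseconds, used only to time the loops. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost of the libc wrapper, which may be served by the VDSO. */
static double avg_libc_ns(long iterations)
{
	struct timespec ts;
	uint64_t start;

	assert(iterations > 0);
	start = now_ns();
	for (long i = 0; i < iterations; i++)
		clock_gettime(CLOCK_REALTIME, &ts);
	return (double)(now_ns() - start) / (double)iterations;
}

/* Average cost when every call is forced through the kernel entry. */
static double avg_syscall_ns(long iterations)
{
	struct timespec ts;
	uint64_t start;

	assert(iterations > 0);
	start = now_ns();
	for (long i = 0; i < iterations; i++)
		syscall(SYS_clock_gettime, CLOCK_REALTIME, &ts);
	return (double)(now_ns() - start) / (double)iterations;
}
```

Dividing avg_syscall_ns() by avg_libc_ns() for a large iteration count gives a speedup factor comparable to the ones quoted above. The results vary with CPU frequency scaling and system load, so several runs are advisable.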

The patch is against kernel 3.7. Please apply if you like it.

Changelog:
25.11.2012 - first release and proof of concept for linux 3.4
11.12.2012 - Port to linux 3.7 and code cleanup
12.12.2012 - fixes suggested by Andy Lutomirski
   - fixes suggested by John Stultz
   - use call VDSO32_vsyscall instead of int 80
   - code cleanup
17.12.2012 - support for IA32_EMULATION, this includes
 - code cleanup
 - include cleanup to fix compile warnings and errors
 - move out seqcount from seqlock, enable use in VDSO
 - map FIXMAP and HPET into the 32 bit address space
18.12.2012 - split into separate patches

[PATCH 2/6] Make vsyscall_gtod_data handling x86 generic

2012-12-18 Thread stefani

From: Stefani Seibold 

This patch moves the vsyscall_gtod_data handling out of vsyscall_64.c
into an additional file, vsyscall_gtod.c, and makes the functions
available for x86 32 bit kernels.

Signed-off-by: Stefani Seibold 
---
 arch/x86/Kconfig   |  4 +-
 arch/x86/include/asm/fixmap.h  |  6 +-
 arch/x86/include/asm/clocksource.h |  4 --
 arch/x86/include/asm/vgtod.h   |  1 +
 arch/x86/include/asm/vvar.h|  4 ++
 arch/x86/kernel/hpet.c |  2 -
 arch/x86/kernel/setup.c|  2 +
 arch/x86/kernel/tsc.c  |  2 -
 arch/x86/kernel/vmlinux.lds.S  |  4 --
 arch/x86/kernel/vsyscall_64.c  | 49 
 arch/x86/kernel/vsyscall_gtod.c| 93 ++
 10 files changed, 102 insertions(+), 63 deletions(-)
 create mode 100644 arch/x86/kernel/vsyscall_gtod.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 46c3bff..b8c2c74 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -100,9 +100,9 @@ config X86
select GENERIC_CMOS_UPDATE
select CLOCKSOURCE_WATCHDOG
select GENERIC_CLOCKEVENTS
-   select ARCH_CLOCKSOURCE_DATA if X86_64
+   select ARCH_CLOCKSOURCE_DATA
select GENERIC_CLOCKEVENTS_BROADCAST if X86_64 || (X86_32 && 
X86_LOCAL_APIC)
-   select GENERIC_TIME_VSYSCALL if X86_64
+   select GENERIC_TIME_VSYSCALL
select KTIME_SCALAR if X86_32
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
index 4da3c0c..75ebc52 100644
--- a/arch/x86/include/asm/fixmap.h
+++ b/arch/x86/include/asm/fixmap.h
@@ -16,7 +16,8 @@
 
 #ifndef __ASSEMBLY__
 #include 
-#include 
+#include 
+#include 
 #include 
 #include 
 #ifdef CONFIG_X86_32
@@ -78,9 +79,10 @@ enum fixed_addresses {
VSYSCALL_LAST_PAGE,
VSYSCALL_FIRST_PAGE = VSYSCALL_LAST_PAGE
+ ((VSYSCALL_END-VSYSCALL_START) >> PAGE_SHIFT) - 1,
+#endif
VVAR_PAGE,
VSYSCALL_HPET,
-#endif
+
FIX_DBGP_BASE,
FIX_EARLYCON_MEM_BASE,
 #ifdef CONFIG_PROVIDE_OHCI1394_DMA_INIT
diff --git a/arch/x86/include/asm/clocksource.h 
b/arch/x86/include/asm/clocksource.h
index 0bdbbb3..67d68b9 100644
--- a/arch/x86/include/asm/clocksource.h
+++ b/arch/x86/include/asm/clocksource.h
@@ -3,8 +3,6 @@
 #ifndef _ASM_X86_CLOCKSOURCE_H
 #define _ASM_X86_CLOCKSOURCE_H
 
-#ifdef CONFIG_X86_64
-
 #define VCLOCK_NONE 0  /* No vDSO clock available. */
 #define VCLOCK_TSC  1  /* vDSO should use vread_tsc.   */
 #define VCLOCK_HPET 2  /* vDSO should use vread_hpet.  */
@@ -13,6 +11,4 @@ struct arch_clocksource_data {
int vclock_mode;
 };
 
-#endif /* CONFIG_X86_64 */
-
 #endif /* _ASM_X86_CLOCKSOURCE_H */
diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index 46e24d3..eb87b53 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -27,4 +27,5 @@ struct vsyscall_gtod_data {
 };
 extern struct vsyscall_gtod_data vsyscall_gtod_data;
 
+extern void map_vgtod(void);
 #endif /* _ASM_X86_VGTOD_H */
diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
index de656ac..8084d55 100644
--- a/arch/x86/include/asm/vvar.h
+++ b/arch/x86/include/asm/vvar.h
@@ -17,7 +17,11 @@
  */
 
 /* Base address of vvars.  This is not ABI. */
+#ifdef CONFIG_X86_64
 #define VVAR_ADDRESS (-10*1024*1024 - 4096)
+#else
+#define VVAR_ADDRESS 0xd000
+#endif
 
 #if defined(__VVAR_KERNEL_LDS)
 
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 1460a5d..859bb2d 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -752,9 +752,7 @@ static struct clocksource clocksource_hpet = {
.mask   = HPET_MASK,
.flags  = CLOCK_SOURCE_IS_CONTINUOUS,
.resume = hpet_resume_counter,
-#ifdef CONFIG_X86_64
.archdata   = { .vclock_mode = VCLOCK_HPET },
-#endif
 };
 
 static int hpet_clocksource_register(void)
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index ca45696..c2f6bbb 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -114,6 +114,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * end_pfn only includes RAM, while max_pfn_mapped includes all e820 entries.
@@ -997,6 +998,7 @@ void __init setup_arch(char **cmdline_p)
 #ifdef CONFIG_X86_64
map_vsyscall();
 #endif
+   map_vgtod();
 
generic_apic_probe();
 
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index cfa5d4f..078cc9a 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -772,9 +772,7 @@ static struct clocksource clocksource_tsc = {
.mask   = CLOCKSOURCE_MASK(64),
.flags  = CLOCK_SOURCE_IS_CONTINUOUS |
  CLOCK_SOURCE_MUST_VERIFY,
-#ifdef CONFIG_X86_64
.archdata   = { .vclock_mode = VCLOCK_TSC },
-#endif
 };
 
 void

[PATCH 1/6] Move out seqcount from seqlock.h

2012-12-18 Thread stefani

From: Stefani Seibold 

Create a separate seqcount.h file which handles only seqcount. This file
is safe to include in the VDSO, since it uses no in-kernel functionality
such as spinlocks. seqlock.h still includes seqcount.h, so there is no
side effect for current users.
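
For readers unfamiliar with the pattern, the reader/writer protocol that seqcount provides can be sketched in user space with C11 atomics. This is an analogue under stated assumptions, not the kernel code: the memory barriers are expressed with C11 fences instead of smp_rmb()/smp_wmb(), the payload fields are relaxed atomics to stay within the C11 memory model, and struct time_data with its helpers is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace analogue of seqcount_t; an odd value means a write is active. */
typedef struct seqcount {
	atomic_uint sequence;
} seqcount_t;

static inline unsigned read_seqcount_begin(seqcount_t *s)
{
	unsigned ret;

	/* Wait until no writer is active, then remember the even count. */
	while ((ret = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;
	return ret;
}

static inline int read_seqcount_retry(seqcount_t *s, unsigned start)
{
	/* Order the data loads before the re-check of the counter. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence,
				    memory_order_relaxed) != start;
}

static inline void write_seqcount_begin(seqcount_t *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	/* Make the odd count visible before any of the data stores. */
	atomic_thread_fence(memory_order_release);
}

static inline void write_seqcount_end(seqcount_t *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	assert(seq & 1);	/* must pair with write_seqcount_begin() */
	/* Release store: data stores are visible before the even count. */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}

/* Hypothetical payload: a timestamp published by a single writer. */
struct time_data {
	seqcount_t seq;
	atomic_long sec;
	atomic_long nsec;
};

static void time_data_init(struct time_data *d)
{
	atomic_init(&d->seq.sequence, 0);
	atomic_init(&d->sec, 0);
	atomic_init(&d->nsec, 0);
}

/* Writers must serialize among themselves, as the header comment says. */
static void time_data_write(struct time_data *d, long sec, long nsec)
{
	write_seqcount_begin(&d->seq);
	atomic_store_explicit(&d->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&d->nsec, nsec, memory_order_relaxed);
	write_seqcount_end(&d->seq);
}

/* Readers never block the writer; they retry if a write intervened. */
static void time_data_read(struct time_data *d, long *sec, long *nsec)
{
	unsigned start;

	do {
		start = read_seqcount_begin(&d->seq);
		*sec = atomic_load_explicit(&d->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&d->nsec, memory_order_relaxed);
	} while (read_seqcount_retry(&d->seq, start));
}
```

Because the reader side consists only of loads and a retry loop, it needs no lock, which is the property that makes the counter usable from VDSO code.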

Signed-off-by: Stefani Seibold 
---
 include/linux/seqcount.h | 150 +++
 include/linux/seqlock.h  | 145 +
 2 files changed, 151 insertions(+), 144 deletions(-)
 create mode 100644 include/linux/seqcount.h

diff --git a/include/linux/seqcount.h b/include/linux/seqcount.h
new file mode 100644
index 000..b83dff3
--- /dev/null
+++ b/include/linux/seqcount.h
@@ -0,0 +1,150 @@
+/*
+ * Version using sequence counter only.
+ * This can be used when code has its own mutex protecting the
+ * updating starting before the write_seqcount_begin() and ending
+ * after the write_seqcount_end().
+ */
+
+#ifndef __LINUX_SEQCOUNT_H
+#define __LINUX_SEQCOUNT_H
+
+#include 
+#include 
+
+typedef struct seqcount {
+   unsigned sequence;
+} seqcount_t;
+
+#define SEQCNT_ZERO { 0 }
+#define seqcount_init(x) do { *(x) = (seqcount_t) SEQCNT_ZERO; } while (0)
+
+/**
+ * __read_seqcount_begin - begin a seq-read critical section (without barrier)
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
+ * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
+ * provided before actually loading any of the variables that are to be
+ * protected in this critical section.
+ *
+ * Use carefully, only in critical code, and comment how the barrier is
+ * provided.
+ */
+static inline unsigned __read_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret;
+
+repeat:
+   ret = ACCESS_ONCE(s->sequence);
+   if (unlikely(ret & 1)) {
+   cpu_relax();
+   goto repeat;
+   }
+   return ret;
+}
+
+/**
+ * read_seqcount_begin - begin a seq-read critical section
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * read_seqcount_begin opens a read critical section of the given seqcount.
+ * Validity of the critical section is tested by checking read_seqcount_retry
+ * function.
+ */
+static inline unsigned read_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret = __read_seqcount_begin(s);
+   smp_rmb();
+   return ret;
+}
+
+/**
+ * raw_seqcount_begin - begin a seq-read critical section
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * raw_seqcount_begin opens a read critical section of the given seqcount.
+ * Validity of the critical section is tested by checking read_seqcount_retry
+ * function.
+ *
+ * Unlike read_seqcount_begin(), this function will not wait for the count
+ * to stabilize. If a writer is active when we begin, we will fail the
+ * read_seqcount_retry() instead of stabilizing at the beginning of the
+ * critical section.
+ */
+static inline unsigned raw_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret = ACCESS_ONCE(s->sequence);
+   smp_rmb();
+   return ret & ~1;
+}
+
+/**
+ * __read_seqcount_retry - end a seq-read critical section (without barrier)
+ * @s: pointer to seqcount_t
+ * @start: count, from read_seqcount_begin
+ * Returns: 1 if retry is required, else 0
+ *
+ * __read_seqcount_retry is like read_seqcount_retry, but has no smp_rmb()
+ * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
+ * provided before actually loading any of the variables that are to be
+ * protected in this critical section.
+ *
+ * Use carefully, only in critical code, and comment how the barrier is
+ * provided.
+ */
+static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
+{
+   return unlikely(s->sequence != start);
+}
+
+/**
+ * read_seqcount_retry - end a seq-read critical section
+ * @s: pointer to seqcount_t
+ * @start: count, from read_seqcount_begin
+ * Returns: 1 if retry is required, else 0
+ *
+ * read_seqcount_retry closes a read critical section of the given seqcount.
+ * If the critical section was invalid, it must be ignored (and typically
+ * retried).
+ */
+static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
+{
+   smp_rmb();
+
+   return __read_seqcount_retry(s, start);
+}
+
+
+/*
+ * Sequence counter only version assumes that callers are using their
+ * own mutexing.
+ */
+static inline void write_seqcount_begin(seqcount_t *s)
+{
+   s->sequence++;
+   smp_wmb();
+}
+
+static inline void write_seqcount_end(seqcount_t *s)
+{
+   smp_wmb();
+   s->sequence++;
+}
+
+/**
+ * write_seqcount_barrier - invalidate in-progress read-side seq operations
+ * @s: pointer to seqcount_t
+ *
+ * After write_seqcount_barrier, no read-side seq operations will complete
+ * successfully and see data older

[PATCH 3/6] Make vsyscall_gtod_data compatible with 32 bit VDSO

2012-12-18 Thread stefani

From: Stefani Seibold 

To make vsyscall_gtod_data available to both VDSOs (X86_64 and
IA32_EMULATION), the alignment must be set to 4. Otherwise the code
created with "gcc -m32" will fail, since the structure alignment in 32
bit mode is 4 bytes.

There is currently no drawback for X86_64, since the structure members
are in a good order.
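
The effect of the attribute can be illustrated with a small sketch. The two structures below are hypothetical and deliberately put a 32 bit member in front of a 64 bit one, so that the padding difference becomes visible; they are not the layout of vsyscall_gtod_data. The attribute syntax assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Natural layout: with -m64 the u64 member forces 8-byte alignment. */
struct clock_natural {
	uint32_t mult;
	uint64_t cycle_last;
};

/* Same members, annotated the way the patch annotates the structure. */
struct clock_packed4 {
	uint32_t mult;
	uint64_t cycle_last;
} __attribute__((aligned(4), packed));

/* With the attribute the layout no longer depends on -m32 versus -m64. */
static_assert(offsetof(struct clock_packed4, cycle_last) == 4,
	      "u64 member follows the u32 without padding");
static_assert(sizeof(struct clock_packed4) == 12, "no tail padding");
static_assert(_Alignof(struct clock_packed4) == 4, "4-byte alignment");

/* Offset of the 64 bit member under the compiler's default rules. */
static size_t natural_offset(void)
{
	return offsetof(struct clock_natural, cycle_last);
}

/* Offset of the 64 bit member with aligned(4) and packed applied. */
static size_t packed_offset(void)
{
	return offsetof(struct clock_packed4, cycle_last);
}
```

On x86-64 natural_offset() is 8, while a 32 bit x86 build reports 4; packed_offset() is 4 in both cases, which is what lets the kernel and a 32 bit VDSO agree on the layout.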

Signed-off-by: Stefani Seibold 
---
 arch/x86/include/asm/vgtod.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index eb87b53..86afff8 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -13,7 +13,7 @@ struct vsyscall_gtod_data {
cycle_t mask;
u32 mult;
u32 shift;
-   } clock;
+   } __attribute__((aligned(4),packed)) clock;
 
/* open coded 'struct timespec' */
time_t  wall_time_sec;
@@ -24,7 +24,8 @@ struct vsyscall_gtod_data {
struct timezone sys_tz;
struct timespec wall_time_coarse;
struct timespec monotonic_time_coarse;
-};
+} __attribute__((aligned(4),packed));
+
 extern struct vsyscall_gtod_data vsyscall_gtod_data;
 
 extern void map_vgtod(void);
-- 
1.8.0


[PATCH 6/6] Add 32 bit VDSO support for 32 and 64 bit kernels

2012-12-18 Thread stefani

From: Stefani Seibold 

This patch adds support for the 32 bit VDSO.

For 32 bit programs running on a 32 bit kernel, the same mechanism is
used as for 64 bit programs running on a 64 bit kernel.

For 32 bit programs running under a 64 bit kernel with IA32_EMULATION, it is a
little more tricky. In this case the VVAR and HPET pages are mapped
into the 32 bit address space by cutting off the upper 32 bits of their
addresses, so the addresses do not change from the point of view of the 32 bit
VDSO. The HPET is mapped at 0xff5fe000 and the VVAR at 0xff5ff000.
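
The two addresses can be checked with a short calculation. The sketch below takes the 64 bit VVAR_ADDRESS definition from asm/vvar.h and assumes that the HPET fixmap page lies one page below the VVAR page, which matches the addresses quoted above; the macro and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit VVAR base as defined in asm/vvar.h: -10 MiB minus one page. */
#define VVAR_ADDRESS_64 ((uint64_t)(-10LL * 1024 * 1024 - 4096))

/* Assumption: the HPET fixmap page sits one page below the VVAR page. */
#define HPET_ADDRESS_64 (VVAR_ADDRESS_64 - 4096)

static_assert(VVAR_ADDRESS_64 == 0xffffffffff5ff000ull,
	      "VVAR page in the 64-bit kernel address space");

/* Address seen by a 32-bit task: the upper 32 bits are cut off. */
static inline uint32_t compat_address(uint64_t kernel_address)
{
	return (uint32_t)(kernel_address & 0xffffffffu);
}
```

compat_address(VVAR_ADDRESS_64) yields 0xff5ff000 and compat_address(HPET_ADDRESS_64) yields 0xff5fe000, the values given in the description.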

The transformation between the 64 bit in-kernel representation and the 32 bit
ABI is also provided.

So we have one VDSO source for all.

Signed-off-by: Stefani Seibold 
---
 arch/x86/include/asm/vgtod.h  |   4 +-
 arch/x86/include/asm/vsyscall.h   |   1 -
 arch/x86/include/asm/vvar.h   |   1 +
 arch/x86/kernel/Makefile  |   1 +
 arch/x86/kernel/hpet.c|   9 ++-
 arch/x86/vdso/Makefile|   6 ++
 arch/x86/vdso/vclock_gettime.c| 108 ++
 arch/x86/vdso/vdso32-setup.c  |  43 ++
 arch/x86/vdso/vdso32/vclock_gettime.c |  29 +
 arch/x86/vdso/vdso32/vdso32.lds.S |   3 +
 11 files changed, 179 insertions(+), 32 deletions(-)
 create mode 100644 arch/x86/vdso/vdso32/vclock_gettime.c

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index 86afff8..74c80d4 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -1,8 +1,8 @@
 #ifndef _ASM_X86_VGTOD_H
 #define _ASM_X86_VGTOD_H
 
-#include 
-#include 
+#include 
+#include 
 
 struct vsyscall_gtod_data {
seqcount_t  seq;
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index eaea1d3..24730cb 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -14,7 +14,6 @@ enum vsyscall_num {
 #define VSYSCALL_ADDR(vsyscall_nr) (VSYSCALL_START+VSYSCALL_SIZE*(vsyscall_nr))
 
 #ifdef __KERNEL__
-#include 
 
 #define VGETCPU_RDTSCP 1
 #define VGETCPU_LSL2
diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
index 8084d55..1e71e6c 100644
--- a/arch/x86/include/asm/vvar.h
+++ b/arch/x86/include/asm/vvar.h
@@ -50,5 +50,6 @@
 DECLARE_VVAR(0, volatile unsigned long, jiffies)
 DECLARE_VVAR(16, int, vgetcpu_mode)
 DECLARE_VVAR(128, struct vsyscall_gtod_data, vsyscall_gtod_data)
+DECLARE_VVAR(512, const void __iomem *, vsyscall_hpet)
 
 #undef DECLARE_VVAR
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 91ce48f..298a0b1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -26,6 +26,7 @@ obj-y += probe_roms.o
 obj-$(CONFIG_X86_32)   += i386_ksyms_32.o
 obj-$(CONFIG_X86_64)   += sys_x86_64.o x8664_ksyms_64.o
 obj-y  += syscall_$(BITS).o
+obj-y  += vsyscall_gtod.o
 obj-$(CONFIG_X86_64)   += vsyscall_64.o
 obj-$(CONFIG_X86_64)   += vsyscall_emu_64.o
 obj-y  += bootflag.o e820.o
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 859bb2d..4b7bb5d 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -69,14 +69,19 @@ static inline void hpet_writel(unsigned int d, unsigned int 
a)
 
 #ifdef CONFIG_X86_64
 #include 
+#else
+#include 
 #endif
 
+DEFINE_VVAR(const void __iomem *, vsyscall_hpet);
+
+#include 
+
 static inline void hpet_set_mapping(void)
 {
hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);
-#ifdef CONFIG_X86_64
__set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VVAR_NOCACHE);
-#endif
+   vsyscall_hpet = (const void __iomem *)fix_to_virt(VSYSCALL_HPET);
 }
 
 static inline void hpet_clear_mapping(void)
diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile
index fd14be1..e136314 100644
--- a/arch/x86/vdso/Makefile
+++ b/arch/x86/vdso/Makefile
@@ -145,8 +145,14 @@ KBUILD_AFLAGS_32 := $(filter-out -m64,$(KBUILD_AFLAGS))
 $(vdso32-images:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_32)
 $(vdso32-images:%=$(obj)/%.dbg): asflags-$(CONFIG_X86_64) += -m32
 
+KBUILD_CFLAGS_32 := $(filter-out -m64,$(KBUILD_CFLAGS))
+KBUILD_CFLAGS_32 := $(filter-out -mcmodel=kernel,$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=3 -freg-struct-return
+$(vdso32-images:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_32)
+
 $(vdso32-images:%=$(obj)/%.dbg): $(obj)/vdso32-%.so.dbg: FORCE \
 $(obj)/vdso32/vdso32.lds \
+$(obj)/vdso32/vclock_gettime.o \
 $(obj)/vdso32/note.o \
 $(obj)/vdso32/%.o
$(call if_changed,vdso)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index 4df6c37..e856bd8 100644
--- a/arch/x86/vdso/vclock_gettime.c
+++ b/arch/x86/vdso/vclock_gettime.c
@@ -4,6 +4,8 @@
  *
  * Fast user context implementation of clock_gettime,

[PATCH 5/6] Cleanup header files to build a proper 32 bit VDSO

2012-12-18 Thread stefani

From: Stefani Seibold 

To build a proper VDSO for 64 bit and 32 bit from the same source, some
header cleanup is necessary; otherwise "gcc -m32" will produce a lot
of errors and warnings due to the differences between LP64 and ILP32.

Signed-off-by: Stefani Seibold 
---
 arch/x86/mm/init_32.c   | 1 +
 include/linux/clocksource.h | 1 -
 include/linux/time.h| 3 +--
 include/linux/timekeeper_internal.h | 1 +
 include/linux/types.h   | 2 ++
 5 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 11a5800..394e563 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -52,6 +52,7 @@
 #include 
 #include 
 #include 
+#include 
 
 unsigned long highstart_pfn, highend_pfn;
 
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 4dceaf8..84ed093 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -19,7 +19,6 @@
 #include 
 
 /* clocksource cycle base type */
-typedef u64 cycle_t;
 struct clocksource;
 
 #ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
diff --git a/include/linux/time.h b/include/linux/time.h
index 4d358e9..edfab8a 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -2,9 +2,8 @@
 #define _LINUX_TIME_H
 
 # include 
-# include 
 # include 
-#include 
+# include 
 
 extern struct timezone sys_tz;
 
diff --git a/include/linux/timekeeper_internal.h 
b/include/linux/timekeeper_internal.h
index e1d558e..9a55a0c 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
diff --git a/include/linux/types.h b/include/linux/types.h
index 1cc0e4b..3ff59cf 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -74,6 +74,8 @@ typedef __kernel_time_t   time_t;
 typedef __kernel_clock_t   clock_t;
 #endif
 
+typedef u64 cycle_t;
+
 #ifndef _CADDR_T
 #define _CADDR_T
 typedef __kernel_caddr_t   caddr_t;
-- 
1.8.0


[PATCH 1/8] Thermal: Create sensor level APIs

2012-12-18 Thread Durgadoss R

This patch creates sensor level APIs in the
generic thermal framework.

A thermal sensor is a piece of hardware that can report
the temperature of the spot in which it is placed. A thermal
sensor driver reads the temperature from this sensor
and reports it out. This kind of driver can be in
any subsystem. If the sensor needs to participate
in platform thermal management, the corresponding
driver can use the APIs introduced in this patch to
register (or unregister) with the thermal framework.

Signed-off-by: Durgadoss R 
---
 drivers/thermal/thermal_sys.c |  280 +
 include/linux/thermal.h   |   29 +
 2 files changed, 309 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 8f0f37b..b2becb9 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -45,13 +45,16 @@ MODULE_LICENSE("GPL");
 
 static DEFINE_IDR(thermal_tz_idr);
 static DEFINE_IDR(thermal_cdev_idr);
+static DEFINE_IDR(thermal_sensor_idr);
 static DEFINE_MUTEX(thermal_idr_lock);
 
 static LIST_HEAD(thermal_tz_list);
+static LIST_HEAD(thermal_sensor_list);
 static LIST_HEAD(thermal_cdev_list);
 static LIST_HEAD(thermal_governor_list);
 
 static DEFINE_MUTEX(thermal_list_lock);
+static DEFINE_MUTEX(sensor_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
 static struct thermal_governor *__find_governor(const char *name)
@@ -421,6 +424,103 @@ static void thermal_zone_device_check(struct work_struct 
*work)
 #define to_thermal_zone(_dev) \
container_of(_dev, struct thermal_zone_device, device)
 
+#define to_thermal_sensor(_dev) \
+   container_of(_dev, struct thermal_sensor, device)
+
+static ssize_t
+sensor_name_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   return sprintf(buf, "%s\n", ts->name);
+}
+
+static ssize_t
+sensor_temp_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   int ret;
+   long val;
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   ret = ts->ops->get_temp(ts, &val);
+
+   return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+hyst_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   int indx, ret;
+   long val;
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   if (!sscanf(attr->attr.name, "threshold%d_hyst", &indx))
+   return -EINVAL;
+
+   ret = ts->ops->get_hyst(ts, indx, &val);
+
+   return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+hyst_store(struct device *dev, struct device_attribute *attr,
+  const char *buf, size_t count)
+{
+   int indx, ret;
+   long val;
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   if (!ts->ops->set_hyst)
+   return -EPERM;
+
+   if (!sscanf(attr->attr.name, "threshold%d_hyst", &indx))
+   return -EINVAL;
+
+   if (kstrtol(buf, 10, &val))
+   return -EINVAL;
+
+   ret = ts->ops->set_hyst(ts, indx, val);
+
+   return ret ? ret : count;
+}
+
+static ssize_t
+threshold_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   int indx, ret;
+   long val;
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   if (!sscanf(attr->attr.name, "threshold%d", &indx))
+   return -EINVAL;
+
+   ret = ts->ops->get_threshold(ts, indx, &val);
+
+   return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+threshold_store(struct device *dev, struct device_attribute *attr,
+  const char *buf, size_t count)
+{
+   int indx, ret;
+   long val;
+   struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+   if (!ts->ops->set_threshold)
+   return -EPERM;
+
+   if (!sscanf(attr->attr.name, "threshold%d", &indx))
+   return -EINVAL;
+
+   if (kstrtol(buf, 10, &val))
+   return -EINVAL;
+
+   ret = ts->ops->set_threshold(ts, indx, val);
+
+   return ret ? ret : count;
+}
+
 static ssize_t
 type_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
@@ -705,6 +805,10 @@ static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
 static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 
+/* Thermal sensor attributes */
+static DEVICE_ATTR(sensor_name, 0444, sensor_name_show, NULL);
+static DEVICE_ATTR(temp_input, 0444, sensor_temp_show, NULL);
+
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
@@ -1491,6 +1595,182 @@ static void remove_trip_attrs(struct 
thermal_zone_device *tz)
 }
 
 /**
+ * enable_sensor_thresholds - create sysfs nodes for thresholdX
+ * @ts:the thermal sensor
+ * @count:

[PATCH 5/8] Thermal: Add 'thermal_map' sysfs node

2012-12-18 Thread Durgadoss R

This patch creates a thermal map sysfs node under
/sys/class/thermal/thermal_zoneX/. This contains
entries named map0, map1 .. mapN. Each map has the
following space separated values:
trip_type sensor_name cdev_name trip_mask weights
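
The line format of one map entry can be sketched in user space as shown below. This is an illustration only: struct map_entry, its field sizes and format_map() are hypothetical stand-ins, and unlike the sysfs handler in the patch the sketch bounds every write with snprintf().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_LEN 20
#define MAX_WEIGHTS 8

enum trip_type { TRIP_ACTIVE, TRIP_PASSIVE };

/* Hypothetical mirror of one map entry; field names are illustrative. */
struct map_entry {
	enum trip_type trip_type;
	char sensor_name[NAME_LEN];
	char cdev_name[NAME_LEN];
	unsigned int trip_mask;
	int num_weights;
	int weights[MAX_WEIGHTS];
};

/*
 * Format one entry as "trip_type sensor_name cdev_name trip_mask weights".
 * Returns the number of characters written, or -1 if buf is too small.
 */
static int format_map(const struct map_entry *map, char *buf, size_t size)
{
	size_t len;
	int n, i;

	assert(map != NULL && buf != NULL);

	n = snprintf(buf, size, "%s %s %s 0x%x",
		     map->trip_type == TRIP_ACTIVE ? "active" : "passive",
		     map->sensor_name, map->cdev_name, map->trip_mask);
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	/* Append the space separated weights. */
	for (i = 0; i < map->num_weights; i++) {
		n = snprintf(buf + len, size - len, " %d", map->weights[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	n = snprintf(buf + len, size - len, "\n");
	if (n < 0 || (size_t)n >= size - len)
		return -1;
	return (int)(len + (size_t)n);
}
```

For an entry with a passive trip, sensor "cpu", cooling device "fan", mask 0x3 and weights 50 and 50, the resulting line is "passive cpu fan 0x3 50 50" followed by a newline.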

Signed-off-by: Durgadoss R 
---
 drivers/thermal/thermal_sys.c |  149 -
 include/linux/thermal.h   |   29 
 2 files changed, 176 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 29ec073..a3adc00 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -506,6 +506,41 @@ static void remove_cdev_from_zone(struct thermal_zone *tz,
tz->cdev_indx--;
 }
 
+static void __clean_map_entry(struct thermal_zone *tz, int i)
+{
+   tz->map[i] = NULL;
+   sysfs_remove_file(tz->kobj_thermal_map, &tz->map_attr[i]->attr.attr);
+   /* Free map attributes */
+   kfree(tz->map_attr[i]);
+   tz->map_attr[i] = NULL;
+}
+
+static void remove_sensor_map_entry(struct thermal_zone *tz,
+   struct thermal_sensor *ts)
+{
+   int i;
+
+   for (i = 0; i < MAX_MAPS_PER_ZONE; i++) {
+   if (tz->map[i] && !strnicmp(ts->name, tz->map[i]->sensor_name,
+   THERMAL_NAME_LENGTH)) {
+   __clean_map_entry(tz, i);
+   }
+   }
+}
+
+static void remove_cdev_map_entry(struct thermal_zone *tz,
+   struct thermal_cooling_device *cdev)
+{
+   int i;
+
+   for (i = 0; i < MAX_MAPS_PER_ZONE; i++) {
+   if (tz->map[i] && !strnicmp(cdev->type, tz->map[i]->cdev_name,
+   THERMAL_NAME_LENGTH)) {
+   __clean_map_entry(tz, i);
+   }
+   }
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -898,6 +933,52 @@ policy_show(struct device *dev, struct device_attribute 
*devattr, char *buf)
 }
 
 static ssize_t
+map_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+   int i, indx, ret = 0;
+   struct thermal_zone *tz;
+   struct thermal_map *map;
+   struct device *dev;
+   char *trip;
+
+   /*
+* For maps under /sys/class/thermal/zoneX/thermal_map/mapY:
+* attr points to mapY
+* kobj points to thermal_map
+* kobj->parent points to zoneX
+*/
+
+   /* Get zone pointer */
+   dev = container_of(kobj->parent, struct device, kobj);
+   tz = to_zone(dev);
+   if (!tz)
+   return -EINVAL;
+
+   sscanf(attr->attr.name, "map%d", &indx);
+
+   if (indx < 0 || indx >= MAX_MAPS_PER_ZONE)
+   return -EINVAL;
+
+   if (!tz->map[indx])
+   return sprintf(buf, "\n");
+
+   map = tz->map[indx];
+
+   trip = (map->trip_type == THERMAL_TRIP_ACTIVE) ?
+   "active" : "passive";
+   ret += sprintf(buf, "%s", trip);
+   ret += sprintf(buf + ret, " %s", map->sensor_name);
+   ret += sprintf(buf + ret, " %s", map->cdev_name);
+   ret += sprintf(buf + ret, " 0x%x", map->trip_mask);
+
+   for (i = 0; i < map->num_weights; i++)
+   ret += sprintf(buf + ret, " %d", map->weights[i]);
+
+   ret += sprintf(buf + ret, "\n");
+   return ret;
+}
+
+static ssize_t
 active_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 {
int i, indx, ret = 0;
@@ -1676,8 +1757,10 @@ void thermal_cooling_device_unregister(struct 
thermal_cooling_device *cdev)
 
mutex_lock(_list_lock);
 
-   for_each_thermal_zone(tmp_tz)
+   for_each_thermal_zone(tmp_tz) {
remove_cdev_from_zone(tmp_tz, cdev);
+   remove_cdev_map_entry(tmp_tz, cdev);
+   }
 
mutex_unlock(_list_lock);
 
@@ -1931,12 +2014,19 @@ struct thermal_zone *create_thermal_zone(const char 
*name, void *devdata)
if (!tz->kobj_thermal_trip)
goto exit_name;
 
+   tz->kobj_thermal_map = kobject_create_and_add("thermal_map",
+   &tz->device.kobj);
+   if (!tz->kobj_thermal_map)
+   goto exit_trip;
+
/* Add this zone to the global list of thermal zones */
mutex_lock(_list_lock);
list_add_tail(>node, _zone_list);
mutex_unlock(_list_lock);
return tz;
 
+exit_trip:
+   kobject_del(tz->kobj_thermal_trip);
 exit_name:
device_remove_file(>device, _attr_zone_name);
 exit_unregister:
@@ -2000,6 +2090,12 @@ void remove_thermal_zone(struct thermal_zone *tz)
kobject_name(>cdevs[i]->device.kobj));
}
 
+   for (i = 0; i < MAX_MAPS_PER_ZONE; i++)
+   __clean_map_entry(tz, i);
+
+   /* Remove /sys/class/thermal/zoneX/thermal_map */
+   kobject_del(tz->kobj_thermal_map);
+
release_idr(_zone_idr, _idr_lock,

[PATCH 4/8] Thermal: Add Thermal_trip sysfs node

2012-12-18 Thread Durgadoss R

This patch adds a thermal_trip directory under
/sys/class/thermal/zoneX. This directory contains
the trip point values for sensors bound to this
zone.

Signed-off-by: Durgadoss R 
---
 drivers/thermal/thermal_sys.c |  237 -
 include/linux/thermal.h   |   37 +++
 2 files changed, 272 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index b39bf97..29ec073 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -448,6 +448,22 @@ static void thermal_zone_device_check(struct work_struct *work)
thermal_zone_device_update(tz);
 }
 
+static int get_sensor_indx_by_kobj(struct thermal_zone *tz, const char *name)
+{
+   int i, indx = -EINVAL;
+
+   mutex_lock(_list_lock);
+   for (i = 0; i < tz->sensor_indx; i++) {
+   if (!strnicmp(name, kobject_name(tz->kobj_trip[i]),
+   THERMAL_NAME_LENGTH)) {
+   indx = i;
+   break;
+   }
+   }
+   mutex_unlock(_list_lock);
+   return indx;
+}
+
 static void remove_sensor_from_zone(struct thermal_zone *tz,
struct thermal_sensor *ts)
 {
@@ -459,9 +475,15 @@ static void remove_sensor_from_zone(struct thermal_zone *tz,
 
sysfs_remove_link(>device.kobj, kobject_name(>device.kobj));
 
+   /* Delete this sensor's trip Kobject */
+   kobject_del(tz->kobj_trip[indx]);
+
/* Shift the entries in the tz->sensors array */
-   for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++)
+   for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++) {
tz->sensors[j] = tz->sensors[j + 1];
+   tz->sensor_trip[j] = tz->sensor_trip[j + 1];
+   tz->kobj_trip[j] = tz->kobj_trip[j + 1];
+   }
 
tz->sensor_indx--;
 }
@@ -875,6 +897,120 @@ policy_show(struct device *dev, struct device_attribute *devattr, char *buf)
return sprintf(buf, "%s\n", tz->governor->name);
 }
 
+static ssize_t
+active_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+   int i, indx, ret = 0;
+   struct thermal_zone *tz;
+   struct device *dev;
+
+   /* In this function, for
+* /sys/class/thermal/zoneX/thermal_trip/sensorY:
+* attr points to sysfs node 'active'
+* kobj points to sensorY
+* kobj->parent points to thermal_trip
+* kobj->parent->parent points to zoneX
+*/
+
+   /* Get the zone pointer */
+   dev = container_of(kobj->parent->parent, struct device, kobj);
+   tz = to_zone(dev);
+   if (!tz)
+   return -EINVAL;
+
+   /*
+* We need this because in the sysfs tree, 'sensorY' is
+* not really the sensor pointer. It just has the name
+* 'sensorY'; whereas 'zoneX' is actually the zone pointer.
+* This means container_of(kobj, struct device, kobj) will not
+* provide the actual sensor pointer.
+*/
+   indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+   if (indx < 0)
+   return indx;
+
+   if (tz->sensor_trip[indx]->num_active_trips <= 0)
+   return sprintf(buf, "\n");
+
+   ret += sprintf(buf, "0x%x", tz->sensor_trip[indx]->active_trip_mask);
+   for (i = 0; i < tz->sensor_trip[indx]->num_active_trips; i++) {
+   ret += sprintf(buf + ret, " %d",
+   tz->sensor_trip[indx]->active_trips[i]);
+   }
+
+   ret += sprintf(buf + ret, "\n");
+   return ret;
+}
+
+static ssize_t
+ptrip_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+   int i, indx, ret = 0;
+   struct thermal_zone *tz;
+   struct device *dev;
+
+   /* Get the zone pointer */
+   dev = container_of(kobj->parent->parent, struct device, kobj);
+   tz = to_zone(dev);
+   if (!tz)
+   return -EINVAL;
+
+   indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+   if (indx < 0)
+   return indx;
+
+   if (tz->sensor_trip[indx]->num_passive_trips <= 0)
+   return sprintf(buf, "\n");
+
+   for (i = 0; i < tz->sensor_trip[indx]->num_passive_trips; i++) {
+   ret += sprintf(buf + ret, "%d ",
+   tz->sensor_trip[indx]->passive_trips[i]);
+   }
+
+   ret += sprintf(buf + ret, "\n");
+   return ret;
+}
+
+static ssize_t
+hot_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+   int indx;
+   struct thermal_zone *tz;
+   struct device *dev;
+
+   /* Get the zone pointer */
+   dev = container_of(kobj->parent->parent, struct device, kobj);
+   tz = to_zone(dev);
+   if (!tz)
+   return -EINVAL;
+
+   indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+   if (indx < 0)
+   return indx;
+
+   return

[PATCH 6/8] Thermal: Add Documentation to new APIs

2012-12-18 Thread Durgadoss R

This patch adds Documentation for the new APIs
introduced in this patch set. The documentation
also has a model sysfs structure for reference.

Signed-off-by: Durgadoss R 
---
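Reader's note: the terminology section of the new document describes four
trip levels per sensor (active, passive, hot and critical), where hot and
critical hold one value each and active/passive hold any number of values.
The following is a minimal userspace sketch of that data model. The struct
mirrors the field names used by the test driver in this series; the enum,
the function names, the "temp >= trip" comparison and the priority order
(critical, hot, passive, active) are assumptions made for illustration, not
behaviour defined by the patches.

```c
#include <stddef.h>

/* Hypothetical ordering of trip levels, lowest to highest severity. */
enum trip_level {
	TRIP_NONE,
	TRIP_ACTIVE,
	TRIP_PASSIVE,
	TRIP_HOT,
	TRIP_CRITICAL
};

/* Userspace stand-in for the per-sensor trip point data. */
struct trip_points_sketch {
	int hot;                  /* single value */
	int crit;                 /* single value */
	int num_passive_trips;
	const int *passive_trips; /* any number of values */
	int num_active_trips;
	const int *active_trips;  /* any number of values */
};

/* Returns 1 if temp has reached at least one entry of trips[]. */
static int any_trip_reached(const int *trips, int num, int temp)
{
	int i;

	for (i = 0; i < num; i++)
		if (temp >= trips[i])
			return 1;
	return 0;
}

/* Assumed policy: report the most severe level that temp has reached. */
enum trip_level highest_trip_reached(const struct trip_points_sketch *tp,
				     int temp)
{
	if (temp >= tp->crit)
		return TRIP_CRITICAL;
	if (temp >= tp->hot)
		return TRIP_HOT;
	if (any_trip_reached(tp->passive_trips, tp->num_passive_trips, temp))
		return TRIP_PASSIVE;
	if (any_trip_reached(tp->active_trips, tp->num_active_trips, temp))
		return TRIP_ACTIVE;
	return TRIP_NONE;
}
```
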
 Documentation/thermal/sysfs-api2.txt |  248 ++
 1 file changed, 248 insertions(+)
 create mode 100644 Documentation/thermal/sysfs-api2.txt

diff --git a/Documentation/thermal/sysfs-api2.txt b/Documentation/thermal/sysfs-api2.txt
new file mode 100644
index 000..ffd0402
--- /dev/null
+++ b/Documentation/thermal/sysfs-api2.txt
@@ -0,0 +1,248 @@
+Thermal Framework
+-
+
+Written by Durgadoss R 
+Copyright (c) 2012 Intel Corporation
+
+Created on: 4 November 2012
+Updated on: 18 December 2012
+
+0. Introduction
+---
+The Linux thermal framework provides a set of interfaces for thermal
+sensors and thermal cooling devices (fan, processor...) to register
+with the thermal management solution and to be a part of it.
+
+This document focuses on how to enable new thermal sensors and cooling
+devices to participate in thermal management. This solution is intended
+to be 'light-weight' and platform/architecture independent. Any thermal
+sensor/cooling device should be able to use the infrastructure easily.
+
+The goal of the thermal framework is to expose the thermal sensor/zone and
+cooling device attributes in a consistent way. This will help the
+thermal governors to make use of the information to manage platform
+thermals efficiently.
+
+The thermal sensor source file can be generic (it can be any sensor driver,
+in any subsystem). This driver will use the sensor APIs and register with
+the thermal framework to participate in platform thermal management. It
+does not (and should not) know which zone it belongs to, or any
+other information about platform thermals. A sensor driver is a standalone
+piece of code, which can optionally register with the thermal framework.
+
+However, for any platform, there should be a platformX_thermal.c file,
+which knows the platform thermal characteristics (how many sensors,
+zones and cooling devices there are, and how they are related to each other,
+i.e. the mapping information). The zone level APIs should be used only in
+this file, which then has all the information required to attach
+various sensors to a particular zone.
+
+This way, we can have one platform level thermal file, which can support
+multiple platforms, possibly using the same set of sensors but bound in
+a different way. This file can get the platform thermal information
+through firmware, ACPI tables, device tree etc.
+
+Unfortunately, today we don't have many drivers that can be clearly
+differentiated as 'sensor_file.c' and 'platform_thermal_file.c',
+but we will need them very soon. The reason is that
+a lot of chip drivers are starting to use the thermal framework,
+and we should keep it really light-weight for them to do so.
+
+An example: drivers/hwmon/emc1403.c - a generic thermal chip driver.
+On one platform this sensor can belong to 'ZoneA' and on another the
+same sensor can belong to 'ZoneB'. But emc1403.c does not really care
+where it belongs. It just reports temperature.
+
+1. Terminology
+--
+This section describes the terminology used in the rest of this
+document as well as the thermal framework code.
+
+thermal_sensor: Hardware that can report temperature of a particular
+   spot in the platform, where it is placed. The temperature
+   reported by the sensor is the 'real' temperature reported
+   by the hardware.
+thermal_zone:  A virtual area on the device, that gets heated up. It may
+   have one or more thermal sensors attached to it.
+cooling_device: Any component that can help in reducing the temperature of
+   a 'hot spot' either by reducing its performance (passive
+   cooling) or by other means (active cooling, e.g. a fan).
+
+trip_points:   Various temperature levels for each sensor. As of now, we
+   have four levels, namely active, passive, hot and critical.
+   Hot and critical trip points support only one value each, whereas
+   active and passive can have any number of values. These
+   temperature values can come from platform data, and are
+   exposed through sysfs in a consistent manner. Stand-alone
+   thermal sensor drivers are not expected to know these values.
+   These values are RO.
+thresholds: These are programmable temperature limits, on reaching which
+   the thermal sensor generates an interrupt. The framework is
+   notified about this interrupt to take appropriate action.
+   There can be as many thresholds as the hardware
+   supports. These values are RW.
+
+thermal_map:   This provides the mapping (aka binding) information between
+

[PATCH 7/8] Thermal: Make PER_ZONE values configurable

2012-12-18 Thread Durgadoss R

This patch makes MAX_SENSORS_PER_ZONE and
MAX_CDEVS_PER_ZONE values configurable. The
default value is 1, and range is 1-12.

Signed-off-by: Durgadoss R 
---
No great reason for using 12.
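Reader's note: because MAX_MAPS_PER_ZONE is the product of the two
configurable values, the per-zone map table grows quadratically: 1 entry at
the defaults, and 12 * 12 = 144 entries at the top of the range. The
following is a minimal userspace sketch of that arithmetic. The fallback
defines stand in for the Kconfig defaults, and map_slot() with its flat
row-major layout is a hypothetical helper, not something this series
defines.

```c
/* Fallbacks matching the Kconfig defaults proposed in this patch. */
#ifndef CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
#define CONFIG_THERMAL_MAX_SENSORS_PER_ZONE 1
#endif
#ifndef CONFIG_THERMAL_MAX_CDEVS_PER_ZONE
#define CONFIG_THERMAL_MAX_CDEVS_PER_ZONE 1
#endif

#define MAX_SENSORS_PER_ZONE	CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
#define MAX_CDEVS_PER_ZONE	CONFIG_THERMAL_MAX_CDEVS_PER_ZONE

/* If we map each sensor with every possible cdev for a zone */
#define MAX_MAPS_PER_ZONE	(MAX_SENSORS_PER_ZONE * MAX_CDEVS_PER_ZONE)

/* Compile-time check of the 1-12 range that Kconfig enforces. */
_Static_assert(MAX_SENSORS_PER_ZONE >= 1 && MAX_SENSORS_PER_ZONE <= 12,
	       "sensors per zone out of range");
_Static_assert(MAX_CDEVS_PER_ZONE >= 1 && MAX_CDEVS_PER_ZONE <= 12,
	       "cdevs per zone out of range");

/*
 * Hypothetical flat index for a (sensor, cdev) pair, assuming a row-major
 * table of MAX_MAPS_PER_ZONE entries. Returns -1 when out of range.
 */
int map_slot(int sensor, int cdev)
{
	if (sensor < 0 || sensor >= MAX_SENSORS_PER_ZONE)
		return -1;
	if (cdev < 0 || cdev >= MAX_CDEVS_PER_ZONE)
		return -1;
	return sensor * MAX_CDEVS_PER_ZONE + cdev;
}
```
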
---
 drivers/thermal/Kconfig |   14 ++
 include/linux/thermal.h |6 +++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index d96da07..c5ba3340 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -15,6 +15,20 @@ menuconfig THERMAL
 
 if THERMAL
 
+config THERMAL_MAX_SENSORS_PER_ZONE
+   int "Maximum number of sensors allowed per thermal zone"
+   default 1
+   range 1 12
+   ---help---
+ Specify the number of sensors allowed per zone
+
+config THERMAL_MAX_CDEVS_PER_ZONE
+   int "Maximum number of cooling devices allowed per thermal zone"
+   default 1
+   range 1 12
+   ---help---
+ Specify the number of cooling devices allowed per zone
+
 config THERMAL_HWMON
bool
depends on HWMON=y || HWMON=THERMAL
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 581dc87..7b0359b 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -49,9 +49,9 @@
 /* Default Thermal Governor: Does Linear Throttling */
 #define DEFAULT_THERMAL_GOVERNOR   "step_wise"
 
-#define MAX_SENSORS_PER_ZONE   5
-
-#define MAX_CDEVS_PER_ZONE 5
+/* Maximum number of sensors/cdevs per zone, defined through Kconfig */
+#define MAX_SENSORS_PER_ZONE   CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
+#define MAX_CDEVS_PER_ZONE CONFIG_THERMAL_MAX_CDEVS_PER_ZONE
 
 /* If we map each sensor with every possible cdev for a zone */
 #define MAX_MAPS_PER_ZONE  (MAX_SENSORS_PER_ZONE * MAX_CDEVS_PER_ZONE)
-- 
1.7.9.5


[PATCH 8/8] Thermal: Dummy driver used for testing

2012-12-18 Thread Durgadoss R

This patch has a dummy driver that can be used for
testing purposes. This patch is not for merge.

Signed-off-by: Durgadoss R 
---
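Reader's note: the dummy cooling device in this patch reports a maximum
state of 5 and stores whatever state it is given. The following is a minimal
userspace sketch of the same get/set pair with a bounds check added. The
struct and function names are hypothetical, and rejecting an out-of-range
state with -EINVAL is an assumption; the test driver itself does no such
check.

```c
#include <errno.h>

/* Userspace stand-in for the dummy cooling device's state. */
struct cdev_state_sketch {
	unsigned long cur; /* current cooling state */
	unsigned long max; /* highest state the device supports */
};

/* Read back the current state, as read_cur_state() does. */
int sketch_get_cur_state(const struct cdev_state_sketch *c,
			 unsigned long *state)
{
	*state = c->cur;
	return 0;
}

/* Store a new state, refusing values above the advertised maximum. */
int sketch_set_cur_state(struct cdev_state_sketch *c, unsigned long state)
{
	if (state > c->max)
		return -EINVAL;
	c->cur = state;
	return 0;
}
```
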
 drivers/thermal/Kconfig|5 +
 drivers/thermal/Makefile   |3 +
 drivers/thermal/thermal_test.c |  315 
 3 files changed, 323 insertions(+)
 create mode 100644 drivers/thermal/thermal_test.c

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index c5ba3340..3b92a76 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -136,4 +136,9 @@ config DB8500_CPUFREQ_COOLING
  bound cpufreq cooling device turns active to set CPU frequency low to
  cool down the CPU.
 
+config THERMAL_TEST
+   tristate "test driver"
+   help
+ Enable this to test the thermal framework.
+
 endif
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index d8da683..02c3edb 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -18,3 +18,6 @@ obj-$(CONFIG_RCAR_THERMAL)+= rcar_thermal.o
 obj-$(CONFIG_EXYNOS_THERMAL)   += exynos_thermal.o
 obj-$(CONFIG_DB8500_THERMAL)   += db8500_thermal.o
 obj-$(CONFIG_DB8500_CPUFREQ_COOLING)   += db8500_cpufreq_cooling.o
+
+# dummy driver for testing
+obj-$(CONFIG_THERMAL_TEST) += thermal_test.o
diff --git a/drivers/thermal/thermal_test.c b/drivers/thermal/thermal_test.c
new file mode 100644
index 000..5a11e34
--- /dev/null
+++ b/drivers/thermal/thermal_test.c
@@ -0,0 +1,315 @@
+/*
+ * thermal_test.c - This driver can be used to test Thermal
+ *Framework changes. Not specific to any
+ *platform. Fills the log buffer generously ;)
+ *
+ * Copyright (C) 2012 Intel Corporation
+ *
+ * ~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~
+ * Author: Durgadoss R 
+ */
+
+#define pr_fmt(fmt)  "thermal_test: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define MAX_THERMAL_ZONES  2
+#define MAX_THERMAL_SENSORS2
+#define MAX_COOLING_DEVS   4
+#define NUM_THRESHOLDS 3
+
+static struct ts_data {
+   int curr_temp;
+   int flag;
+} ts_data;
+
+int active_trips[10] = {100, 90, 80, 70, 60, 50, 40, 30, 20, 10};
+int passive_trips[5] = {100, 90, 60, 50, 40};
+
+static struct platform_device *pdev;
+static unsigned long cur_cdev_state = 2;
+static struct thermal_sensor *ts, *ts1;
+static struct thermal_zone *tz;
+static struct thermal_cooling_device *cdev;
+
+static long thermal_thresholds[NUM_THRESHOLDS] = {3, 4, 5};
+
+static struct thermal_trip_point trip = {
+   .hot = 90,
+   .crit = 100,
+   .num_passive_trips = 5,
+   .passive_trips = passive_trips,
+   .num_active_trips = 10,
+   .active_trips = active_trips,
+   .active_trip_mask = 0xCFF,
+};
+
+static struct thermal_trip_point trip1 = {
+   .hot = 95,
+   .crit = 125,
+   .num_passive_trips = 0,
+   .passive_trips = passive_trips,
+   .num_active_trips = 6,
+   .active_trips = active_trips,
+   .active_trip_mask = 0xFF,
+};
+
+static int read_cur_state(struct thermal_cooling_device *cdev,
+   unsigned long *state)
+{
+   *state = cur_cdev_state;
+   return 0;
+}
+
+static int write_cur_state(struct thermal_cooling_device *cdev,
+   unsigned long state)
+{
+   cur_cdev_state = state;
+   return 0;
+}
+
+static int read_max_state(struct thermal_cooling_device *cdev,
+   unsigned long *state)
+{
+   *state = 5;
+   return 0;
+}
+
+static int read_curr_temp(struct thermal_sensor *ts, long *temp)
+{
+   *temp = ts_data.curr_temp;
+   return 0;
+}
+
+static ssize_t
+flag_show(struct device *dev, struct device_attribute *devattr, char *buf)
+{
+   return sprintf(buf, "%d\n", ts_data.flag);
+}
+
+static ssize_t
+flag_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   long flag;
+
+   if (kstrtol(buf, 10, ))
+   return -EINVAL;
+
+   ts_data.flag = flag;
+
+   if (flag == 0) {
+

[PATCH 2/8] Thermal: Create zone level APIs

2012-12-18 Thread Durgadoss R

This patch adds a new thermal_zone structure to
thermal.h. It also adds zone level APIs to the thermal
framework.

A thermal zone is a hot spot on the platform, which
can have one or more sensors and cooling devices attached
to it. These sensors can be mapped to a set of cooling
devices, which when throttled, can help to bring down
the temperature of the hot spot.

Signed-off-by: Durgadoss R 
---
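Reader's note: remove_sensor_from_zone() in this patch looks up a sensor's
slot with GET_INDEX and then shifts the remaining array entries down. The
following is a minimal userspace sketch of that lookup-and-shift pattern.
The struct and function names are hypothetical stand-ins, locking is left
out, and clearing the vacated tail slot is an addition of this sketch that
the patch does not perform.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_SENSORS_PER_ZONE 5

/* Hypothetical stand-in for struct thermal_zone; only the fields used here. */
struct zone_sketch {
	int sensor_indx; /* number of slots in use */
	const void *sensors[MAX_SENSORS_PER_ZONE];
};

/* Linear search, as GET_INDEX does: slot index, or -EINVAL if absent. */
int zone_find_sensor(const struct zone_sketch *tz, const void *ts)
{
	int i;

	if (!tz || !ts)
		return -EINVAL;
	for (i = 0; i < tz->sensor_indx; i++)
		if (tz->sensors[i] == ts)
			return i;
	return -EINVAL;
}

/* Remove ts by shifting later entries down one slot. */
int zone_remove_sensor(struct zone_sketch *tz, const void *ts)
{
	int j, indx = zone_find_sensor(tz, ts);

	if (indx < 0)
		return indx;

	assert(tz->sensor_indx > 0);
	for (j = indx; j < tz->sensor_indx - 1; j++)
		tz->sensors[j] = tz->sensors[j + 1];

	/* Clear the now-unused tail slot so no stale pointer remains. */
	tz->sensors[--tz->sensor_indx] = NULL;
	return 0;
}
```
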
 drivers/thermal/thermal_sys.c |  194 +
 include/linux/thermal.h   |   21 +
 2 files changed, 215 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index b2becb9..06d5a12 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -44,19 +44,44 @@ MODULE_DESCRIPTION("Generic thermal management sysfs support");
 MODULE_LICENSE("GPL");
 
 static DEFINE_IDR(thermal_tz_idr);
+static DEFINE_IDR(thermal_zone_idr);
 static DEFINE_IDR(thermal_cdev_idr);
 static DEFINE_IDR(thermal_sensor_idr);
 static DEFINE_MUTEX(thermal_idr_lock);
 
 static LIST_HEAD(thermal_tz_list);
 static LIST_HEAD(thermal_sensor_list);
+static LIST_HEAD(thermal_zone_list);
 static LIST_HEAD(thermal_cdev_list);
 static LIST_HEAD(thermal_governor_list);
 
 static DEFINE_MUTEX(thermal_list_lock);
 static DEFINE_MUTEX(sensor_list_lock);
+static DEFINE_MUTEX(zone_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
+#define for_each_thermal_sensor(pos) \
+   list_for_each_entry(pos, _sensor_list, node)
+
+#define for_each_thermal_zone(pos) \
+   list_for_each_entry(pos, _zone_list, node)
+
+#define GET_INDEX(tz, ptr, indx, type) \
+   do {\
+   int i;  \
+   indx = -EINVAL; \
+   if (!tz || !ptr)\
+   break;  \
+   mutex_lock(##_list_lock);  \
+   for (i = 0; i < tz->type##_indx; i++) { \
+   if (tz->type##s[i] == ptr) {\
+   indx = i;   \
+   break;  \
+   }   \
+   }   \
+   mutex_unlock(##_list_lock);\
+   } while (0)
+
 static struct thermal_governor *__find_governor(const char *name)
 {
struct thermal_governor *pos;
@@ -419,15 +444,44 @@ static void thermal_zone_device_check(struct work_struct *work)
thermal_zone_device_update(tz);
 }
 
+static void remove_sensor_from_zone(struct thermal_zone *tz,
+   struct thermal_sensor *ts)
+{
+   int j, indx;
+
+   GET_INDEX(tz, ts, indx, sensor);
+   if (indx < 0)
+   return;
+
+   sysfs_remove_link(>device.kobj, kobject_name(>device.kobj));
+
+   /* Shift the entries in the tz->sensors array */
+   for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++)
+   tz->sensors[j] = tz->sensors[j + 1];
+
+   tz->sensor_indx--;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
container_of(_dev, struct thermal_zone_device, device)
 
+#define to_zone(_dev) \
+   container_of(_dev, struct thermal_zone, device)
+
 #define to_thermal_sensor(_dev) \
container_of(_dev, struct thermal_sensor, device)
 
 static ssize_t
+zone_name_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+   struct thermal_zone *tz = to_zone(dev);
+
+   return sprintf(buf, "%s\n", tz->name);
+}
+
+static ssize_t
 sensor_name_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
struct thermal_sensor *ts = to_thermal_sensor(dev);
@@ -809,6 +863,8 @@ static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 static DEVICE_ATTR(sensor_name, 0444, sensor_name_show, NULL);
 static DEVICE_ATTR(temp_input, 0444, sensor_temp_show, NULL);
 
+static DEVICE_ATTR(zone_name, 0444, zone_name_show, NULL);
+
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
@@ -1654,6 +1710,136 @@ static int enable_sensor_thresholds(struct thermal_sensor *ts, int count)
return 0;
 }
 
+struct thermal_zone *create_thermal_zone(const char *name, void *devdata)
+{
+   struct thermal_zone *tz;
+   int ret;
+
+   if (!name || (name && strlen(name) >= THERMAL_NAME_LENGTH))
+   return ERR_PTR(-EINVAL);
+
+   tz = kzalloc(sizeof(*tz), GFP_KERNEL);
+   if (!tz)
+   return ERR_PTR(-ENOMEM);
+
+   idr_init(>idr);
+   ret = get_idr(_zone_idr, _idr_lock, >id);
+   if (ret)
+   goto exit_free;
+
+   strcpy(tz->name, name);
+   tz->devdata = devdata;
+   tz->device.class = _class;
+
+   dev_set_name(>device, "zone%d",

[PATCH 0/8] Thermal Framework Enhancements

2012-12-18 Thread Durgadoss R

This patch set is v1, based on the RFC submitted here:
https://patchwork.kernel.org/patch/1758921/

This patch set is based on Rui's -thermal tree, and is
tested on a Core-i5 and an Atom netbook.

This series contains 8 patches:
Patch 1/8: Creates new sensor level APIs
Patch 2/8: Creates new zone level APIs. The existing tzd structure is
   kept as such for clarity and compatibility purposes.
Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
   existing tcd structure need not be modified.
Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
   points for all sensors present in a zone.
Patch 5/8: Adds a thermal_map sysfs node. It is a compact representation
   of the binding relationship between a sensor and a cdev,
   within a zone.
Patch 6/8: Creates Documentation for the new APIs. A new file is
   created for clarity. The final goal is to merge it with the existing
   file or refactor the files, whichever seems appropriate.
Patch 7/8: Makes PER_ZONE values configurable through Kconfig
Patch 8/8: A dummy driver that can be used for testing. This is not for merge.

Thanks to Rui Zhang, Honghbo Zhang, Wei Ni for their feedback on the
RFC version.

Durgadoss R (8):
  Thermal: Create sensor level APIs
  Thermal: Create zone level APIs
  Thermal: Add APIs to bind cdev to new zone structure
  Thermal: Add Thermal_trip sysfs node
  Thermal: Add 'thermal_map' sysfs node
  Thermal: Add Documentation to new APIs
  Thermal: Make PER_ZONE values configurable
  Thermal: Dummy driver used for testing

 Documentation/thermal/sysfs-api2.txt |  248 +
 drivers/thermal/Kconfig  |   19 +
 drivers/thermal/Makefile |3 +
 drivers/thermal/thermal_sys.c|  932 ++
 drivers/thermal/thermal_test.c   |  315 
 include/linux/thermal.h  |  124 +
 6 files changed, 1641 insertions(+)
 create mode 100644 Documentation/thermal/sysfs-api2.txt
 create mode 100644 drivers/thermal/thermal_test.c

-- 
1.7.9.5


[PATCH 3/8] Thermal: Add APIs to bind cdev to new zone structure

2012-12-18 Thread Durgadoss R

This patch creates new APIs to add/remove a
cdev to/from a zone. This patch does not change
the old cooling device implementation.

Signed-off-by: Durgadoss R 
---
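Reader's note: add_cdev_to_zone() in this patch rejects a cdev that is
already bound to the zone and otherwise appends it to tz->cdevs. The
following is a minimal userspace sketch of that append path. The struct and
function names are hypothetical, locking and the sysfs link are left out,
and the -ENOSPC capacity check is an assumption of this sketch; the patch
as posted does not test cdev_indx against MAX_CDEVS_PER_ZONE.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_CDEVS_PER_ZONE 5

/* Hypothetical stand-in for the cdev fields of struct thermal_zone. */
struct cdev_zone_sketch {
	int cdev_indx; /* number of slots in use */
	const void *cdevs[MAX_CDEVS_PER_ZONE];
};

int zone_add_cdev(struct cdev_zone_sketch *tz, const void *cdev)
{
	int i;

	if (!tz || !cdev)
		return -EINVAL;

	/* Ensure we are not adding the same cdev again. */
	for (i = 0; i < tz->cdev_indx; i++)
		if (tz->cdevs[i] == cdev)
			return -EEXIST;

	/* Assumed extra check: refuse to write past the end of the array. */
	if (tz->cdev_indx >= MAX_CDEVS_PER_ZONE)
		return -ENOSPC;

	tz->cdevs[tz->cdev_indx++] = cdev;
	return 0;
}
```
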
 drivers/thermal/thermal_sys.c |   80 +
 include/linux/thermal.h   |8 +
 2 files changed, 88 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 06d5a12..b39bf97 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -58,6 +58,7 @@ static LIST_HEAD(thermal_governor_list);
 static DEFINE_MUTEX(thermal_list_lock);
 static DEFINE_MUTEX(sensor_list_lock);
 static DEFINE_MUTEX(zone_list_lock);
+static DEFINE_MUTEX(cdev_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
 #define for_each_thermal_sensor(pos) \
@@ -82,6 +83,9 @@ static DEFINE_MUTEX(thermal_governor_lock);
mutex_unlock(##_list_lock);\
} while (0)
 
+#define for_each_cdev(pos) \
+   list_for_each_entry(pos, _cdev_list, node)
+
 static struct thermal_governor *__find_governor(const char *name)
 {
struct thermal_governor *pos;
@@ -462,6 +466,24 @@ static void remove_sensor_from_zone(struct thermal_zone *tz,
tz->sensor_indx--;
 }
 
+static void remove_cdev_from_zone(struct thermal_zone *tz,
+   struct thermal_cooling_device *cdev)
+{
+   int j, indx;
+
+   GET_INDEX(tz, cdev, indx, cdev);
+   if (indx < 0)
+   return;
+
+   sysfs_remove_link(>device.kobj, kobject_name(>device.kobj));
+
+   /* Shift the entries in the tz->cdevs array */
+   for (j = indx; j < MAX_CDEVS_PER_ZONE - 1; j++)
+   tz->cdevs[j] = tz->cdevs[j + 1];
+
+   tz->cdev_indx--;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -1458,6 +1480,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
int i;
const struct thermal_zone_params *tzp;
struct thermal_zone_device *tz;
+   struct thermal_zone *tmp_tz;
struct thermal_cooling_device *pos = NULL;
 
if (!cdev)
@@ -1495,6 +1518,13 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
 
mutex_unlock(_list_lock);
 
+   mutex_lock(_list_lock);
+
+   for_each_thermal_zone(tmp_tz)
+   remove_cdev_from_zone(tmp_tz, cdev);
+
+   mutex_unlock(_list_lock);
+
if (cdev->type[0])
device_remove_file(>device, _attr_cdev_type);
device_remove_file(>device, _attr_max_state);
@@ -1790,6 +1820,23 @@ exit:
 }
 EXPORT_SYMBOL(remove_thermal_zone);
 
+struct thermal_cooling_device *get_cdev_by_name(const char *name)
+{
+   struct thermal_cooling_device *pos;
+   struct thermal_cooling_device *cdev = NULL;
+
+   mutex_lock(_list_lock);
+   for_each_cdev(pos) {
+   if (!strnicmp(pos->type, name, THERMAL_NAME_LENGTH)) {
+   cdev = pos;
+   break;
+   }
+   }
+   mutex_unlock(_list_lock);
+   return cdev;
+}
+EXPORT_SYMBOL(get_cdev_by_name);
+
 struct thermal_sensor *get_sensor_by_name(const char *name)
 {
struct thermal_sensor *pos;
@@ -1840,6 +1887,39 @@ exit_zone:
 }
 EXPORT_SYMBOL(add_sensor_to_zone);
 
+int add_cdev_to_zone(struct thermal_zone *tz,
+   struct thermal_cooling_device *cdev)
+{
+   int ret;
+
+   if (!tz || !cdev)
+   return -EINVAL;
+
+   mutex_lock(_list_lock);
+
+   /* Ensure we are not adding the same cdev again!! */
+   GET_INDEX(tz, cdev, ret, cdev);
+   if (ret >= 0) {
+   ret = -EEXIST;
+   goto exit_zone;
+   }
+
+   mutex_lock(_list_lock);
+   ret = sysfs_create_link(>device.kobj, >device.kobj,
+   kobject_name(>device.kobj));
+   if (ret)
+   goto exit_cdev;
+
+   tz->cdevs[tz->cdev_indx++] = cdev;
+
+exit_cdev:
+   mutex_unlock(_list_lock);
+exit_zone:
+   mutex_unlock(_list_lock);
+   return ret;
+}
+EXPORT_SYMBOL(add_cdev_to_zone);
+
 /**
  * thermal_sensor_register - register a new thermal sensor
  * @name:  name of the thermal sensor
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f08f774..c4e45c7 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -51,6 +51,8 @@
 
 #define MAX_SENSORS_PER_ZONE   5
 
+#define MAX_CDEVS_PER_ZONE 5
+
 struct thermal_sensor;
 struct thermal_zone_device;
 struct thermal_cooling_device;
@@ -209,6 +211,10 @@ struct thermal_zone {
/* Sensor level information */
int sensor_indx; /* index into 'sensors' array */
struct thermal_sensor *sensors[MAX_SENSORS_PER_ZONE];
+
+   /* cdev level information */
+   int cdev_indx; /* index into 'cdevs' array */
+   struct thermal_cooling_device *cdevs[MAX_CDEVS_PER_ZONE];
 };
 
 /* Structure that holds thermal governor

Re: [PATCH 4/4] ARM: tegra: Set SCU base address dynamically from DT

2012-12-18 Thread Hiroshi Doyu

Hi Rob,

Rob Herring  wrote @ Mon, 17 Dec 2012 15:00:46 +0100:

> On 12/17/2012 12:18 AM, Hiroshi Doyu wrote:
> > Set Snoop Control Unit(SCU) register base address dynamically from DT.
> > 
> > Signed-off-by: Hiroshi Doyu 
> > ---
> >  arch/arm/mach-tegra/platsmp.c |   23 ---
> >  1 file changed, 20 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm/mach-tegra/platsmp.c b/arch/arm/mach-tegra/platsmp.c
> > index 1b926df..45c0b79 100644
> > --- a/arch/arm/mach-tegra/platsmp.c
> > +++ b/arch/arm/mach-tegra/platsmp.c
> > @@ -18,6 +18,8 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> > +#include 
> >  
> >  #include 
> >  #include 
> > @@ -36,7 +38,7 @@
> >  
> >  extern void tegra_secondary_startup(void);
> >  
> > -static void __iomem *scu_base = IO_ADDRESS(TEGRA_ARM_PERIF_BASE);
> > +static void __iomem *scu_base;
> >  
> >  #define EVP_CPU_RESET_VECTOR \
> > (IO_ADDRESS(TEGRA_EXCEPTION_VECTORS_BASE) + 0x100)
> > @@ -143,14 +145,28 @@ done:
> > return status;
> >  }
> >  
> > +static const struct of_device_id cortex_a9_scu_match[] __initconst = {
> > +   { .compatible = "arm,cortex-a9-scu", },
> > +   {}
> > +};
> > +
> >  /*
> >   * Initialise the CPU possible map early - this describes the CPUs
> >   * which may be present or become present in the system.
> >   */
> >  static void __init tegra_smp_init_cpus(void)
> >  {
> > -   unsigned int i, ncores = scu_get_core_count(scu_base);
> > +   struct device_node *np;
> > +   unsigned int i, ncores = 1;
> > +
> > +   np = of_find_matching_node(NULL, cortex_a9_scu_match);
> > +   if (!np)
> > +   return;
> > +   scu_base = of_iomap(np, 0);
> 
> Did you actually test this? Unless something changed, ioremap does not
> work this early. The only reason to have it mapped this early is to get
> the core count, but that doesn't work on A15 or A7. So we really need to
> get core count/mask in a standard way. At least some work to get core
> count from DT went into 3.8.
> 
> BTW, you can get the scu address on the A9 by reading cp15 register:
> 
>   /* Get SCU base */
>   asm("mrc p15, 4, %0, c15, c0, 0" : "=r" (base));
> 
> It's still probably good to have the DT node, but the reg property can
> be optional in this case.

I'm simply wondering: if the above cp15 read works with Cortex-A9, do we
still need the SCU DT node? At least from the Cortex-A15 TRM, it seems that
the SCU is tightly integrated into the CPU core and doesn't have any user
control, so Cortex-A15 doesn't seem to need SCU configuration. For
Cortex-A7, I haven't yet found S/W configurable register definitions
in the TRM. So if neither A15 nor A7 needs the SCU base, would the above cp15
instruction be enough?

> We need to move away from having the DT matching code within the
> platforms. This should all be moved to the scu code in a scu_of_init
> function that could be called from common code.

True if SCU DT node is still necessary.

[PATCH 2/2] ARM: ux500: add pinctrl address resources

2012-12-18 Thread Fabio Baltieri

The current nmk_pinctrl driver is not PRCMU dependent anymore, so it needs
its own DT address resources to work properly, as done for
platform_device in:

f482833 ARM: ux500: add PRCM register base for pinctrl

Reviewed-by: Linus Walleij 
Signed-off-by: Fabio Baltieri 
---
 arch/arm/boot/dts/dbx5x0.dtsi| 3 ++-
 arch/arm/mach-ux500/cpu-db8500.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/dbx5x0.dtsi b/arch/arm/boot/dts/dbx5x0.dtsi
index 2efd9c8..16552d4 100644
--- a/arch/arm/boot/dts/dbx5x0.dtsi
+++ b/arch/arm/boot/dts/dbx5x0.dtsi
@@ -170,7 +170,8 @@
gpio-bank = <8>;
};
 
-   pinctrl {
+   pinctrl@80157000 {
+   reg = <0x80157000 0x2000>;
compatible = "stericsson,nmk_pinctrl";
};
 
diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c
index db0bb75..5b286e0 100644
--- a/arch/arm/mach-ux500/cpu-db8500.c
+++ b/arch/arm/mach-ux500/cpu-db8500.c
@@ -285,7 +285,8 @@ static struct of_dev_auxdata u8500_auxdata_lookup[] __initdata = {
OF_DEV_AUXDATA("st,nomadik-i2c", 0x8011, "nmk-i2c.3", NULL),
OF_DEV_AUXDATA("st,nomadik-i2c", 0x8012a000, "nmk-i2c.4", NULL),
/* Requires device name bindings. */
-   OF_DEV_AUXDATA("stericsson,nmk_pinctrl", 0, "pinctrl-db8500", NULL),
+   OF_DEV_AUXDATA("stericsson,nmk_pinctrl", U8500_PRCMU_BASE,
+   "pinctrl-db8500", NULL),
/* Requires clock name and DMA bindings. */
OF_DEV_AUXDATA("stericsson,ux500-msp-i2s", 0x80123000,
"ux500-msp-i2s.0", _platform_data),
-- 
1.7.12.1


Re: [PATCH] kvm: fix i8254 counter 0 wraparound

2012-12-18 Thread Gleb Natapov

On Sat, Dec 15, 2012 at 06:34:37AM -0500, Nickolai Zeldovich wrote:
> The kvm i8254 emulation for counter 0 (but not for counters 1 and 2)
> has at least two bugs in mode 0:
> 
> 1. The OUT bit, computed by pit_get_out(), is never set high.
> 
> 2. The counter value, computed by pit_get_count(), wraps back around to
>the initial counter value, rather than wrapping back to 0x
>(which is the behavior described in the comment in __kpit_elapsed,
>the behavior implemented by qemu, and the behavior observed on AMD
>hardware).
> 
> The bug stems from __kpit_elapsed computing the elapsed time mod the
> initial counter value (stored as nanoseconds in ps->period).  This is both
> unnecessary (none of the callers of kpit_elapsed expect the value to be
> at most the initial counter value) and incorrect (it causes pit_get_count
> to appear to wrap around to the initial counter value rather than 0x).
> Removing this mod from __kpit_elapsed fixes both of the above bugs.
> 
> Signed-off-by: Nickolai Zeldovich 
Applied, thanks!

> ---
>  arch/x86/kvm/i8254.c |1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 11300d2..c1d30b2 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -122,7 +122,6 @@ static s64 __kpit_elapsed(struct kvm *kvm)
>*/
>   remaining = hrtimer_get_remaining(>timer);
>   elapsed = ps->period - ktime_to_ns(remaining);
> - elapsed = mod_64(elapsed, ps->period);
>  
>   return elapsed;
>  }
> -- 
> 1.7.10.4

--
Gleb.
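
For readers following the quoted description, here is a minimal user-space
sketch of why the modulo breaks mode 0. It is a toy model, not the kvm code:
time is counted directly in PIT ticks rather than nanoseconds, and every
function name below is hypothetical. It only assumes the documented mode 0
behaviour of a 16-bit down-counter whose OUT pin goes high at terminal count.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model of i8254 counter 0 in mode 0. Time is measured in PIT ticks
 * and all names are hypothetical, chosen only for this illustration.
 */

/* Old behaviour: elapsed time is reduced modulo the initial count. */
static uint64_t elapsed_old(uint64_t ticks, uint16_t initial)
{
    assert(initial != 0);
    return ticks % initial;
}

/* Patched behaviour: elapsed time is passed through unchanged. */
static uint64_t elapsed_new(uint64_t ticks, uint16_t initial)
{
    (void)initial;
    return ticks;
}

/* Mode 0 count: a 16-bit down-counter that wraps to 0xffff after zero. */
static uint16_t mode0_count(uint64_t elapsed, uint16_t initial)
{
    return (uint16_t)((initial - elapsed) & 0xffff);
}

/* Mode 0 OUT: goes high once the initial count has fully elapsed. */
static int mode0_out(uint64_t elapsed, uint16_t initial)
{
    return elapsed >= initial;
}
```

With the modulo, the elapsed value can never reach the initial count, so
mode0_out() stays low and the count jumps back to the initial value instead
of continuing down through 0xffff.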

[PATCH 1/2] pinctrl: nomadik: return if prcm_base is NULL

2012-12-18 Thread Fabio Baltieri

This patch adds a check for npct->prcm_base to make sure that the
address is not NULL before using it, as the driver was made capable of
loading even without a proper memory resource in:

f1671bf pinctrl/nomadik: make independent of prcmu driver

Also refuse to probe without prcm_base on anything other than Nomadik.

This solves the following crash, introduced during the merge window when
booting on U8500 with device tree:

pinctrl-nomadik pinctrl-db8500: No PRCM base, assume no ALT-Cx control is available
Unable to handle kernel NULL pointer dereference at virtual address 0138
pgd = c0004000
[0138] *pgd=
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0Not tainted  (3.7.0-02892-g1ebaf4f #631)
PC is at nmk_pmx_enable+0x1bc/0x4d0
LR is at clk_disable+0x40/0x44
[snip]
[] (nmk_pmx_enable+0x1bc/0x4d0) from [] (pinmux_enable_setting+0x12c/0x1ec)
[] (pinmux_enable_setting+0x12c/0x1ec) from [] (pinctrl_select_state_locked+0xfc/0x134)
[] (pinctrl_select_state_locked+0xfc/0x134) from [] (pinctrl_register+0x26c/0x43c)
[] (pinctrl_register+0x26c/0x43c) from [] (nmk_pinctrl_probe+0x114/0x238)
[] (nmk_pinctrl_probe+0x114/0x238) from [] (platform_drv_probe+0x28/0x2c)
[] (platform_drv_probe+0x28/0x2c) from [] (driver_probe_device+0x84/0x21c)
[] (driver_probe_device+0x84/0x21c) from [] (__device_attach+0x50/0x54)
[] (__device_attach+0x50/0x54) from [] (bus_for_each_drv+0x54/0x9c)
[] (bus_for_each_drv+0x54/0x9c) from [] (device_attach+0x84/0x9c)
[] (device_attach+0x84/0x9c) from [] (bus_probe_device+0x94/0xb8)
[] (bus_probe_device+0x94/0xb8) from [] (device_add+0x4f0/0x5bc)
[] (device_add+0x4f0/0x5bc) from [] (of_device_add+0x40/0x48)
[] (of_device_add+0x40/0x48) from [] (of_platform_device_create_pdata+0x68/0x98)
[] (of_platform_device_create_pdata+0x68/0x98) from [] (of_platform_bus_create+0xe4/0x260)
[] (of_platform_bus_create+0xe4/0x260) from [] (of_platform_bus_create+0x130/0x260)
[] (of_platform_bus_create+0x130/0x260) from [] (of_platform_populate+0x6c/0xac)
[] (of_platform_populate+0x6c/0xac) from [] (u8500_init_machine+0x78/0x140)
[] (u8500_init_machine+0x78/0x140) from [] (customize_machine+0x24/0x30)
[] (customize_machine+0x24/0x30) from [] (do_one_initcall+0x130/0x1b0)
[] (do_one_initcall+0x130/0x1b0) from [] (kernel_init+0x138/0x2e8)
[] (kernel_init+0x138/0x2e8) from [] (ret_from_fork+0x14/0x20)
Code: 0a1b e19400b2 e59a200c e0822000 (e592c000)
---[ end trace 1b75b31a2719ed1c ]---
note: swapper/0[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b

Reviewed-by: Linus Walleij 
Signed-off-by: Fabio Baltieri 
---
 drivers/pinctrl/pinctrl-nomadik.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/pinctrl-nomadik.c b/drivers/pinctrl/pinctrl-nomadik.c
index ef66f98..015b093 100644
--- a/drivers/pinctrl/pinctrl-nomadik.c
+++ b/drivers/pinctrl/pinctrl-nomadik.c
@@ -259,6 +259,9 @@ static void nmk_prcm_altcx_set_mode(struct nmk_pinctrl *npct,
const struct prcm_gpiocr_altcx_pin_desc *pin_desc;
const u16 *gpiocr_regs;
 
+   if (!npct->prcm_base)
+   return;
+
if (alt_num > PRCM_IDX_GPIOCR_ALTC_MAX) {
dev_err(npct->dev, "PRCM GPIOCR: alternate-C%i is invalid\n",
alt_num);
@@ -682,6 +685,9 @@ static int nmk_prcm_gpiocr_get_mode(struct pinctrl_dev *pctldev, int gpio)
const struct prcm_gpiocr_altcx_pin_desc *pin_desc;
const u16 *gpiocr_regs;
 
+   if (!npct->prcm_base)
+   return NMK_GPIO_ALT_C;
+
for (i = 0; i < npct->soc->npins_altcx; i++) {
if (npct->soc->altcx_pins[i].pin == gpio)
break;
@@ -1887,9 +1893,12 @@ static int __devinit nmk_pinctrl_probe(struct platform_device *pdev)
"failed to ioremap PRCM registers\n");
return -ENOMEM;
}
-   } else {
+   } else if (version == PINCTRL_NMK_STN8815) {
dev_info(>dev,
 "No PRCM base, assume no ALT-Cx control is 
available\n");
+   } else {
+   dev_err(>dev, "missing PRCM base address\n");
+   return -EINVAL;
}
 
/*
-- 
1.7.12.1
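
The two guards this patch introduces can be sketched outside the kernel as
follows. This is an illustrative model only: the struct, the enum and both
function names are hypothetical, and a plain pointer stands in for the
ioremapped PRCM register window.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variants: one SoC has no PRCM block, the other needs it. */
enum soc_variant { SOC_WITHOUT_PRCM, SOC_WITH_PRCM };

struct pinctrl_model {
    uint32_t *prcm_base;    /* NULL when no PRCM resource was supplied */
};

/* Register access path: bail out before touching a missing mapping. */
static int altcx_set_mode(struct pinctrl_model *p, uint32_t value)
{
    if (!p->prcm_base)
        return 0;           /* nothing written */
    *p->prcm_base = value;
    return 1;               /* one register written */
}

/* Probe path: a missing base is only tolerated where no PRCM exists. */
static int probe_check(enum soc_variant variant, const uint32_t *prcm_base)
{
    if (prcm_base)
        return 0;
    if (variant == SOC_WITHOUT_PRCM)
        return 0;
    return -EINVAL;
}
```

The early return protects the register accessors, while the probe-time check
turns a silent misconfiguration into an explicit error.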


Re: [PATCH v2] backlight: add lms501kf03 LCD driver

2012-12-18 Thread devendra.aaru

On Tue, Dec 18, 2012 at 3:46 AM, Jingoo Han  wrote:
> Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
> x 480) driver uses 3-wired SPI inteface.
>
> Signed-off-by: Ilho Lee 
> Signed-off-by: Jingoo Han 
> ---
> Change since v1:
> - remove redundant return variables
> - use -EINVAL instead of -EFAULT
> - add a more detailed description of 120ms delay time


Thanks for taking care of these comments!

[PATCH V2] spi: remove check for bits_per_word on transfer from low level driver

2012-12-18 Thread Laxman Dewangan

The SPI core makes sure that each transfer structure has the proper
setting for bits_per_word before calling the low-level transfer APIs.

Hence low-level drivers no longer need to check whether this field is
set correctly. Remove such code from the low-level drivers.

Signed-off-by: Laxman Dewangan 
---
This continues the feedback received from Jonas on the change
spi: make sure all transfer has bits_per_word set
where I made the change only in the tegra slink driver.
This patch removes the similar code from all drivers.

Changes from V1:
- No change in code.
- Description is rewritten to match with the change.
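
As an illustration of the division of labour described above, here is a
minimal sketch. The struct and function names are hypothetical, and it
assumes the core inherits the device's bits_per_word when the transfer
leaves the field at 0, which is what the removed driver fallbacks did.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins for the device and transfer structs. */
struct spi_device_model   { uint8_t bits_per_word; };
struct spi_transfer_model { uint8_t bits_per_word; unsigned int len; };

/* Core side: a transfer value of 0 means "inherit from the device". */
static void core_fill_defaults(const struct spi_device_model *spi,
                               struct spi_transfer_model *t)
{
    if (!t->bits_per_word)
        t->bits_per_word = spi->bits_per_word;
}

/* Driver side: reads the field directly, with no fallback of its own. */
static unsigned int driver_word_count(const struct spi_transfer_model *t)
{
    unsigned int bytes_per_word = t->bits_per_word / 8;

    assert(bytes_per_word != 0);
    return t->len / bytes_per_word;
}
```

Once the default is applied in one place, every driver can drop its private
"t->bits_per_word ? : spi->bits_per_word" expression.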

 drivers/spi/spi-altera.c|2 +-
 drivers/spi/spi-bfin-sport.c|3 +--
 drivers/spi/spi-bfin5xx.c   |3 +--
 drivers/spi/spi-bitbang.c   |6 +++---
 drivers/spi/spi-clps711x.c  |2 +-
 drivers/spi/spi-coldfire-qspi.c |3 +--
 drivers/spi/spi-ep93xx.c|2 +-
 drivers/spi/spi-s3c64xx.c   |2 +-
 drivers/spi/spi-sirf.c  |3 +--
 drivers/spi/spi-tegra20-slink.c |9 +++--
 drivers/spi/spi-txx9.c  |6 ++
 11 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/spi/spi-altera.c b/drivers/spi/spi-altera.c
index 5e7314a..a537f8d 100644
--- a/drivers/spi/spi-altera.c
+++ b/drivers/spi/spi-altera.c
@@ -134,7 +134,7 @@ static int altera_spi_txrx(struct spi_device *spi, struct spi_transfer *t)
hw->tx = t->tx_buf;
hw->rx = t->rx_buf;
hw->count = 0;
-   hw->bytes_per_word = (t->bits_per_word ? : spi->bits_per_word) / 8;
+   hw->bytes_per_word = t->bits_per_word / 8;
hw->len = t->len / hw->bytes_per_word;
 
if (hw->irq >= 0) {
diff --git a/drivers/spi/spi-bfin-sport.c b/drivers/spi/spi-bfin-sport.c
index ac7ffca..39b0d17 100644
--- a/drivers/spi/spi-bfin-sport.c
+++ b/drivers/spi/spi-bfin-sport.c
@@ -416,8 +416,7 @@ bfin_sport_spi_pump_transfers(unsigned long data)
drv_data->cs_change = transfer->cs_change;
 
/* Bits per word setup */
-   bits_per_word = transfer->bits_per_word ? :
-   message->spi->bits_per_word ? : 8;
+   bits_per_word = transfer->bits_per_word;
if (bits_per_word % 16 == 0)
drv_data->ops = _sport_transfer_ops_u16;
else
diff --git a/drivers/spi/spi-bfin5xx.c b/drivers/spi/spi-bfin5xx.c
index 0429d83..7d7c991 100644
--- a/drivers/spi/spi-bfin5xx.c
+++ b/drivers/spi/spi-bfin5xx.c
@@ -642,8 +642,7 @@ static void bfin_spi_pump_transfers(unsigned long data)
drv_data->cs_change = transfer->cs_change;
 
/* Bits per word setup */
-   bits_per_word = transfer->bits_per_word ? :
-   message->spi->bits_per_word ? : 8;
+   bits_per_word = transfer->bits_per_word;
if (bits_per_word % 16 == 0) {
drv_data->n_bytes = bits_per_word/8;
drv_data->len = (transfer->len) >> 1;
diff --git a/drivers/spi/spi-bitbang.c b/drivers/spi/spi-bitbang.c
index 8b3d8ef..61beaec 100644
--- a/drivers/spi/spi-bitbang.c
+++ b/drivers/spi/spi-bitbang.c
@@ -69,7 +69,7 @@ static unsigned bitbang_txrx_8(
unsignedns,
struct spi_transfer *t
 ) {
-   unsignedbits = t->bits_per_word ? : spi->bits_per_word;
+   unsignedbits = t->bits_per_word;
unsignedcount = t->len;
const u8*tx = t->tx_buf;
u8  *rx = t->rx_buf;
@@ -95,7 +95,7 @@ static unsigned bitbang_txrx_16(
unsignedns,
struct spi_transfer *t
 ) {
-   unsignedbits = t->bits_per_word ? : spi->bits_per_word;
+   unsignedbits = t->bits_per_word;
unsignedcount = t->len;
const u16   *tx = t->tx_buf;
u16 *rx = t->rx_buf;
@@ -121,7 +121,7 @@ static unsigned bitbang_txrx_32(
unsignedns,
struct spi_transfer *t
 ) {
-   unsignedbits = t->bits_per_word ? : spi->bits_per_word;
+   unsignedbits = t->bits_per_word;
unsignedcount = t->len;
const u32   *tx = t->tx_buf;
u32 *rx = t->rx_buf;
diff --git a/drivers/spi/spi-clps711x.c b/drivers/spi/spi-clps711x.c
index 1366c46..a11cbf0 100644
--- a/drivers/spi/spi-clps711x.c
+++ b/drivers/spi/spi-clps711x.c
@@ -68,7 +68,7 @@ static int spi_clps711x_setup_xfer(struct spi_device *spi,
   struct spi_transfer *xfer)
 {
u32 speed = xfer->speed_hz ? : spi->max_speed_hz;
-   u8 bpw = xfer->bits_per_word ? : spi->bits_per_word;
+   u8 bpw = xfer->bits_per_word;
struct spi_clps711x_data *hw = spi_master_get_devdata(spi->master);
 
if (bpw != 8) {
diff --git a/drivers/spi/spi-coldfire-qspi.c b/drivers/spi/spi-coldfire-qspi.c
index 58466b8..7b5cc9e 100644

[PATCH] [RFC] cpufreq: can't raise max frequency with cpu_thermal

2012-12-18 Thread Sonny Rao

The cpu_thermal generic thermal management code has a bug where once
max cpu frequency has been lowered in sysfs (scaling_max_freq) it is
not possible to raise the max back up later.  The bug is that the
notifier gets called by __cpufreq_set_policy() before the user policy
max is raised, and is incorrectly trying to enforce the max frequency
policy even when we are trying to change the policy.  It is also not
clear why this driver is looking at the user policy since it is
primarily supposed to enforce thermal policy, not user set policy.

Signed-off-by: Sonny Rao 
---
 drivers/thermal/cpu_cooling.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 836828e..63bc708 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -219,10 +219,6 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
if (cpumask_test_cpu(policy->cpu, _device->allowed_cpus))
max_freq = notify_device->cpufreq_val;
 
-   /* Never exceed user_policy.max*/
-   if (max_freq > policy->user_policy.max)
-   max_freq = policy->user_policy.max;
-
if (policy->max != max_freq)
cpufreq_verify_within_limits(policy, 0, max_freq);
 
-- 
1.7.7.3
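
The effect of the removed clamp can be shown with a small model. This is a
sketch under stated assumptions: the struct and both functions are
hypothetical, "max" is the limit currently being validated, and "user_max"
is the user limit stored from the previous request, which has not been
updated yet when the notifier runs.

```c
#include <assert.h>

/* Hypothetical, reduced view of the policy seen by the notifier. */
struct policy_model {
    unsigned int max;       /* new limit being validated */
    unsigned int user_max;  /* user limit saved from the previous request */
};

/* Old notifier: thermal limit is additionally capped by the stale user max. */
static unsigned int notifier_old(const struct policy_model *p,
                                 unsigned int thermal_max)
{
    unsigned int max_freq = thermal_max;

    if (max_freq > p->user_max)
        max_freq = p->user_max;
    return p->max > max_freq ? max_freq : p->max;
}

/* Patched notifier: only the thermal limit is enforced. */
static unsigned int notifier_new(const struct policy_model *p,
                                 unsigned int thermal_max)
{
    return p->max > thermal_max ? thermal_max : p->max;
}
```

With a stale user limit of 800000 and a request for 1700000, the old logic
pins the result at 800000, while the patched logic lets it rise as far as
the thermal limit allows.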


[PATCH v2] backlight: add lms501kf03 LCD driver

2012-12-18 Thread Jingoo Han

Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
x 480) driver uses a 3-wire SPI interface.

Signed-off-by: Ilho Lee 
Signed-off-by: Jingoo Han 
---
Change since v1:
- remove redundant return variables
- use -EINVAL instead of -EFAULT
- add a more detailed description of 120ms delay time
- replace unsigned short arrays with unsigned char arrays

 drivers/video/backlight/Kconfig  |8 +
 drivers/video/backlight/Makefile |1 +
 drivers/video/backlight/lms501kf03.c |  444 ++
 3 files changed, 453 insertions(+), 0 deletions(-)
 create mode 100644 drivers/video/backlight/lms501kf03.c

diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
index 765a945..081d6cf 100644
--- a/drivers/video/backlight/Kconfig
+++ b/drivers/video/backlight/Kconfig
@@ -126,6 +126,14 @@ config LCD_AMS369FG06
  If you have an AMS369FG06 AMOLED Panel, say Y to enable its
  LCD control driver.
 
+config LCD_LMS501KF03
+   tristate "LMS501KF03 LCD Driver"
+   depends on SPI
+   default n
+   help
+ If you have an LMS501KF03 LCD Panel, say Y to enable its
+ LCD control driver.
+
 endif # LCD_CLASS_DEVICE
 
 #
diff --git a/drivers/video/backlight/Makefile b/drivers/video/backlight/Makefile
index e7ce729..d02a728 100644
--- a/drivers/video/backlight/Makefile
+++ b/drivers/video/backlight/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_LCD_TOSA)   += tosa_lcd.o
 obj-$(CONFIG_LCD_S6E63M0)  += s6e63m0.o
 obj-$(CONFIG_LCD_LD9040)   += ld9040.o
 obj-$(CONFIG_LCD_AMS369FG06)   += ams369fg06.o
+obj-$(CONFIG_LCD_LMS501KF03)   += lms501kf03.o
 
 obj-$(CONFIG_BACKLIGHT_CLASS_DEVICE) += backlight.o
 obj-$(CONFIG_BACKLIGHT_ATMEL_PWM)+= atmel-pwm-bl.o
diff --git a/drivers/video/backlight/lms501kf03.c b/drivers/video/backlight/lms501kf03.c
new file mode 100644
index 000..af7979d
--- /dev/null
+++ b/drivers/video/backlight/lms501kf03.c
@@ -0,0 +1,444 @@
+/*
+ * lms501kf03 TFT LCD panel driver.
+ *
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * Author: Jingoo Han  
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#define COMMAND_ONLY   0x00
+#define DATA_ONLY  0x01
+
+struct lms501kf03 {
+   struct device   *dev;
+   struct spi_device   *spi;
+   unsigned intpower;
+   struct lcd_device   *ld;
+   struct lcd_platform_data*lcd_pd;
+};
+
+static const unsigned char seq_password[] = {
+   0xb9, 0xff, 0x83, 0x69,
+};
+
+static const unsigned char seq_power[] = {
+   0xb1, 0x01, 0x00, 0x34, 0x06, 0x00, 0x14, 0x14, 0x20, 0x28,
+   0x12, 0x12, 0x17, 0x0a, 0x01, 0xe6, 0xe6, 0xe6, 0xe6, 0xe6,
+};
+
+static const unsigned char seq_display[] = {
+   0xb2, 0x00, 0x2b, 0x03, 0x03, 0x70, 0x00, 0xff, 0x00, 0x00,
+   0x00, 0x00, 0x03, 0x03, 0x00, 0x01,
+};
+
+static const unsigned char seq_rgb_if[] = {
+   0xb3, 0x09,
+};
+
+static const unsigned char seq_display_inv[] = {
+   0xb4, 0x01, 0x08, 0x77, 0x0e, 0x06,
+};
+
+static const unsigned char seq_vcom[] = {
+   0xb6, 0x4c, 0x2e,
+};
+
+static const unsigned char seq_gate[] = {
+   0xd5, 0x00, 0x05, 0x03, 0x29, 0x01, 0x07, 0x17, 0x68, 0x13,
+   0x37, 0x20, 0x31, 0x8a, 0x46, 0x9b, 0x57, 0x13, 0x02, 0x75,
+   0xb9, 0x64, 0xa8, 0x07, 0x0f, 0x04, 0x07,
+};
+
+static const unsigned char seq_panel[] = {
+   0xcc, 0x02,
+};
+
+static const unsigned char seq_col_mod[] = {
+   0x3a, 0x77,
+};
+
+static const unsigned char seq_w_gamma[] = {
+   0xe0, 0x00, 0x04, 0x09, 0x0f, 0x1f, 0x3f, 0x1f, 0x2f, 0x0a,
+   0x0f, 0x10, 0x16, 0x18, 0x16, 0x17, 0x0d, 0x15, 0x00, 0x04,
+   0x09, 0x0f, 0x38, 0x3f, 0x20, 0x39, 0x0a, 0x0f, 0x10, 0x16,
+   0x18, 0x16, 0x17, 0x0d, 0x15,
+};
+
+static const unsigned char seq_rgb_gamma[] = {
+   0xc1, 0x01, 0x03, 0x07, 0x0f, 0x1a, 0x22, 0x2c, 0x33, 0x3c,
+   0x46, 0x4f, 0x58, 0x60, 0x69, 0x71, 0x79, 0x82, 0x89, 0x92,
+   0x9a, 0xa1, 0xa9, 0xb1, 0xb9, 0xc1, 0xc9, 0xcf, 0xd6, 0xde,
+   0xe5, 0xec, 0xf3, 0xf9, 0xff, 0xdd, 0x39, 0x07, 0x1c, 0xcb,
+   0xab, 0x5f, 0x49, 0x80, 0x03, 0x07, 0x0f, 0x19, 0x20, 0x2a,
+   0x31, 0x39, 0x42, 0x4b, 0x53, 0x5b, 0x63, 0x6b, 0x73, 0x7b,
+   0x83, 0x8a, 0x92, 0x9b, 0xa2, 0xaa, 0xb2, 0xba, 0xc2, 0xca,
+   0xd0, 0xd8, 0xe1, 0xe8, 0xf0, 0xf8, 0xff, 0xf7, 0xd8, 0xbe,
+   0xa7, 0x39, 0x40, 0x85, 0x8c, 0xc0, 0x04, 0x07, 0x0c, 0x17,
+   0x1c, 0x23, 0x2b, 0x34, 0x3b, 0x43, 0x4c, 0x54, 0x5b, 0x63,
+   0x6a, 0x73, 0x7a, 0x82, 0x8a, 0x91, 0x98, 0xa1, 0xa8, 0xb0,
+   0xb7, 0xc1, 0xc9, 0xcf, 0xd9,

[PATCH RESEND 4] ARM: plat-versatile: move secondary CPU startup into cpuinit

2012-12-18 Thread Claudio Fontana


Using __CPUINIT instead of __INIT puts the secondary CPU startup code
into the "right" section: it will not be freed in hotplug configurations,
allowing hot-add of cpus, while still getting freed in non-hotplug configs.

Tested successfully on Fast-Models and on Arndale for VCPU hotplug. 

Signed-off-by: Claudio Fontana 
Tested-by: Claudio Fontana 

diff --git a/arch/arm/plat-versatile/headsmp.S b/arch/arm/plat-versatile/headsmp.S
index dd703ef..19fe180 100644
--- a/arch/arm/plat-versatile/headsmp.S
+++ b/arch/arm/plat-versatile/headsmp.S
@@ -11,7 +11,7 @@
 #include 
 #include 
 
-   __INIT
+   __CPUINIT
 
 /*
  * Realview/Versatile Express specific entry point for secondary CPUs.
-- 
1.7.12.1



[PATCHv2 1/2] dma: dw_dmac: add dwc_chan_pause and dwc_chan_resume

2012-12-18 Thread Andy Shevchenko

Factor the pause and resume sequences out of dwc_control() into helpers.
At least dwc_chan_resume() will be used by a later patch.

Signed-off-by: Andy Shevchenko 
Acked-by: Viresh Kumar 
---
 drivers/dma/dw_dmac.c |   31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index 4413f69..687af2a 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -1008,6 +1008,26 @@ set_runtime_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
return 0;
 }
 
+static inline void dwc_chan_pause(struct dw_dma_chan *dwc)
+{
+   u32 cfglo = channel_readl(dwc, CFG_LO);
+
+   channel_writel(dwc, CFG_LO, cfglo | DWC_CFGL_CH_SUSP);
+   while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY))
+   cpu_relax();
+
+   dwc->paused = true;
+}
+
+static inline void dwc_chan_resume(struct dw_dma_chan *dwc)
+{
+   u32 cfglo = channel_readl(dwc, CFG_LO);
+
+   channel_writel(dwc, CFG_LO, cfglo & ~DWC_CFGL_CH_SUSP);
+
+   dwc->paused = false;
+}
+
 static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
   unsigned long arg)
 {
@@ -1015,18 +1035,13 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
struct dw_dma   *dw = to_dw_dma(chan->device);
struct dw_desc  *desc, *_desc;
unsigned long   flags;
-   u32 cfglo;
LIST_HEAD(list);
 
if (cmd == DMA_PAUSE) {
spin_lock_irqsave(>lock, flags);
 
-   cfglo = channel_readl(dwc, CFG_LO);
-   channel_writel(dwc, CFG_LO, cfglo | DWC_CFGL_CH_SUSP);
-   while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY))
-   cpu_relax();
+   dwc_chan_pause(dwc);
 
-   dwc->paused = true;
spin_unlock_irqrestore(>lock, flags);
} else if (cmd == DMA_RESUME) {
if (!dwc->paused)
@@ -1034,9 +1049,7 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 
spin_lock_irqsave(>lock, flags);
 
-   cfglo = channel_readl(dwc, CFG_LO);
-   channel_writel(dwc, CFG_LO, cfglo & ~DWC_CFGL_CH_SUSP);
-   dwc->paused = false;
+   dwc_chan_resume(dwc);
 
spin_unlock_irqrestore(>lock, flags);
} else if (cmd == DMA_TERMINATE_ALL) {
-- 
1.7.10.4


[PATCHv2 2/2] dma: dw_dmac: clear suspend bit during termination

2012-12-18 Thread Andy Shevchenko

From: Heikki Krogerus 

A DMA transfer cannot be started if the channel was previously paused and
then terminated. In that case the channel's suspend bit remains set, which
prevents any transfer until the channel is resumed.

This patch clears the DWC_CFGL_CH_SUSP bit during termination.

Signed-off-by: Heikki Krogerus 
Signed-off-by: Andy Shevchenko 
Acked-by: Viresh Kumar 
---
 drivers/dma/dw_dmac.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index 687af2a..8d77643 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -1059,7 +1059,7 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 
dwc_chan_disable(dw, dwc);
 
-   dwc->paused = false;
+   dwc_chan_resume(dwc);
 
/* active_list entries will end up before queued entries */
list_splice_init(>queue, );
-- 
1.7.10.4
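
To illustrate the failure, here is a small model of a channel with a
suspend bit. It is only a sketch: the struct, the functions and the bit
position of MODEL_CH_SUSP are placeholders, and the "paused" flag plays the
role of the driver's software state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_CH_SUSP (1u << 0)    /* placeholder bit position */

struct chan_model {
    uint32_t cfg_lo;    /* stands in for the channel config register */
    bool     paused;    /* software view of the channel state */
};

static void chan_pause(struct chan_model *c)
{
    c->cfg_lo |= MODEL_CH_SUSP;
    c->paused = true;
}

static void chan_resume(struct chan_model *c)
{
    c->cfg_lo &= ~MODEL_CH_SUSP;
    c->paused = false;
}

/* Old termination: only the software flag is reset. */
static void terminate_old(struct chan_model *c)
{
    c->paused = false;
}

/* Patched termination: hardware bit and software flag are reset together. */
static void terminate_new(struct chan_model *c)
{
    chan_resume(c);
}

/* The hardware moves data only while the suspend bit is clear. */
static bool can_transfer(const struct chan_model *c)
{
    return !(c->cfg_lo & MODEL_CH_SUSP);
}
```

After pause followed by the old termination, the software flag claims the
channel is running while the register still blocks transfers, and a later
DMA_RESUME is skipped because "paused" is already false.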


Re: [PATCH] mm: Suppress mm/memory.o warning on older compilers if !CONFIG_NUMA_BALANCING

2012-12-18 Thread David Rientjes

On Mon, 17 Dec 2012, Andrew Morton wrote:

> > The kbuild test robot reported the following after the merge of Automatic
> > NUMA Balancing when cross-compiling for avr32.
> > 
> > mm/memory.c: In function 'do_pmd_numa_page':
> > mm/memory.c:3593: warning: no return statement in function returning 
> > non-void
> > 
> > The code is unreachable but the avr32 cross-compiler was not new enough
> > to know that. This patch suppresses the warning.
> > 
> > Signed-off-by: Mel Gorman 
> > ---
> >  mm/memory.c |1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e6a3b93..23f1fdf 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3590,6 +3590,7 @@ static int do_pmd_numa_page(struct mm_struct *mm, 
> > struct vm_area_struct *vma,
> >  unsigned long addr, pmd_t *pmdp)
> >  {
> > BUG();
> > +   return 0;
> >  }
> >  #endif /* CONFIG_NUMA_BALANCING */
> 
> Odd.  avr32's BUG() includes a call to unreachable(), which should
> evaluate to "do { } while (1)".  Can you check that this is working?
> 
> Perhaps it _is_ working, but the compiler incorrectly thinks that the
> function can return?
> 

This isn't the typical "control reaches end of non-void function", the 
warning is merely stating there is no return statement in the function 
which happens to be the case (and it has nothing to do with avr32, it 
will be the same on all archs).  This is one of the last things that gcc 
does after it parses a function declaration and will be emitted with 
-Wreturn-type unless the function in question is main() and it isn't 
marked with __attribute__((noreturn)).  If you're testing this, try making 
the function statically defined and it should show up even with 
do {} while(1).

And for CONFIG_BUG=n this ends up being do {} while (0) which is just a 
no-op and would end up returning that "control reaches end of non-void 
function" warning.
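
A stand-alone sketch of the shape being discussed, for anyone who wants to
experiment with compiler flags. MODEL_BUG() is a hypothetical stand-in for
the CONFIG_BUG=n form mentioned above; which diagnostic a particular
compiler version prints without the explicit return is not asserted here.

```c
#include <assert.h>

/* Hypothetical stand-in for BUG() when it compiles down to a no-op. */
#define MODEL_BUG() do { } while (0)

/*
 * Non-void stub whose body is never meant to run. The explicit return
 * keeps the function well-formed whether MODEL_BUG() is a no-op, an
 * endless loop, or a call the compiler knows cannot return.
 */
static int stub_with_return(void)
{
    MODEL_BUG();
    return 0;
}
```

Deleting the "return 0;" line and building with -Wreturn-type is one way to
see how a given toolchain reports the missing return.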

Re: [PATCH] usb: gadget zero: avoid unnecessary reinit of data in f_sourcesink

2012-12-18 Thread Sebastian Andrzej Siewior

On Mon, Dec 17, 2012 at 06:21:16PM +0100, Armando Visconti wrote:
> >Besides that the patch looks fine :)
> 
> Do you mean that 'inited' should be changed with 'initialized'?
Yes, I do.

> Oh ... my poor english... :(
Don't worry. Others, including myself, do this from time to time as well :)

> 
> Rgds,
> Arm

Sebastian

[PATCH] spi: Add the flag indicate to registe new device as children of master or not.

2012-12-18 Thread Jun Chen


A new device can be allocated for two purposes: one is to register it as a
child of the master, the other is for the master itself. This patch adds a
flag to indicate which purpose is intended.

Signed-off-by: Bi Chao 
Signed-off-by: Chen Jun 
---
 drivers/spi/spi.c   |   16 +++-
 include/linux/spi/spi.h |3 ++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 718cc1f..06f69ce 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -300,6 +300,8 @@ static DEFINE_MUTEX(board_lock);
 /**
  * spi_alloc_device - Allocate a new SPI device
  * @master: Controller to which device is connected
+ * device_was_children_of_master is flag which the device is registed
+ * as the children of the bus
  * Context: can sleep
  *
  * Allows a driver to allocate and initialize a spi_device without
@@ -314,7 +316,8 @@ static DEFINE_MUTEX(board_lock);
  *
  * Returns a pointer to the new device, or NULL.
  */
-struct spi_device *spi_alloc_device(struct spi_master *master)
+struct spi_device *spi_alloc_device(struct spi_master *master,
+   bool device_was_children_of_master)
 {
struct spi_device   *spi;
struct device   *dev = master->dev.parent;
@@ -330,7 +333,10 @@ struct spi_device *spi_alloc_device(struct spi_master *master)
}
 
spi->master = master;
-   spi->dev.parent = >dev;
+   if (device_was_children_of_master == true)
+   spi->dev.parent = >dev;
+   else
+   spi->dev.parent = dev;
spi->dev.bus = _bus_type;
spi->dev.release = spidev_release;
device_initialize(>dev);
@@ -434,7 +440,7 @@ struct spi_device *spi_new_device(struct spi_master *master,
 * suggests syslogged diagnostics are best here (ugh).
 */
 
-   proxy = spi_alloc_device(master);
+   proxy = spi_alloc_device(master, false);
if (!proxy)
return NULL;
 
@@ -827,7 +833,7 @@ static void of_register_spi_devices(struct spi_master *master)
 
for_each_available_child_of_node(master->dev.of_node, nc) {
/* Alloc an spi_device */
-   spi = spi_alloc_device(master);
+   spi = spi_alloc_device(master, true);
if (!spi) {
dev_err(>dev, "spi_device alloc error for %s\n",
nc->full_name);
@@ -939,7 +945,7 @@ static acpi_status acpi_spi_add_device(acpi_handle handle, u32 level,
if (acpi_bus_get_status(adev) || !adev->status.present)
return AE_OK;
 
-   spi = spi_alloc_device(master);
+   spi = spi_alloc_device(master, false);
if (!spi) {
dev_err(>dev, "failed to allocate SPI device for %s\n",
dev_name(>dev));
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index fa702ae..43d2f8e 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -838,7 +838,8 @@ spi_register_board_info(struct spi_board_info const *info, unsigned n)
  * be defined using the board info.
  */
 extern struct spi_device *
-spi_alloc_device(struct spi_master *master);
+spi_alloc_device(struct spi_master *master,
+   bool device_was_children_of_master);
 
 extern int
 spi_add_device(struct spi_device *spi);
-- 
1.7.4.1
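
The parent selection this patch adds can be modelled as below. All types
and names are hypothetical. The "true" branch is partly garbled in this
archive copy of the diff, so treating it as "the master's own device" is an
assumption; the "false" branch follows the visible
"dev = master->dev.parent" line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal device hierarchy. */
struct device_model  { struct device_model *parent; };
struct master_model  { struct device_model dev; };
struct spi_dev_model { struct device_model dev; struct master_model *master; };

/* Pick the parent of a newly allocated device according to the flag. */
static void alloc_init(struct spi_dev_model *spi, struct master_model *master,
                       bool child_of_master)
{
    spi->master = master;
    if (child_of_master)
        spi->dev.parent = &master->dev;       /* assumed: master's device */
    else
        spi->dev.parent = master->dev.parent; /* controller's parent device */
}
```

In the diff, only the device-tree registration path passes true; the board
info and ACPI paths pass false.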




[regression][PATCH v3] watchdog: Fix disable/enable regression

2012-12-18 Thread Bjørn Mork

Commit 8d451690 ("watchdog: Fix CPU hotplug regression") causes
an oops or hard lockup when doing

 echo 0 > /proc/sys/kernel/nmi_watchdog
 echo 1 > /proc/sys/kernel/nmi_watchdog

and the kernel is booted with nmi_watchdog=1 (default)

Running laptop-mode-tools and disconnecting/connecting AC power
will cause this to trigger, making it a common failure scenario
on laptops.

Instead of bailing out of watchdog_disable() when !watchdog_enabled
we can initialize the hrtimer regardless of watchdog_enabled status.
This makes it safe to call watchdog_disable() in the nmi_watchdog=0
case, without the negative effect on the enabled => disabled =>
enabled case.

All these tests pass with this patch:
- nmi_watchdog=1
  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 1 > /proc/sys/kernel/nmi_watchdog

- nmi_watchdog=0
  echo 0 > /sys/devices/system/cpu/cpu1/online

- nmi_watchdog=0
  echo mem > /sys/power/state

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661

Cc:  # v3.7
Cc: Norbert Warmuth 
Cc: Joseph Salisbury 
Cc: Thomas Gleixner 
Signed-off-by: Bjørn Mork 
---
v3:
  added Bugzilla reference and additional recipients
  rebased on current mainline
v2:
  implemented an alternate workaround for the original problem.

Hello,

Sorry for nagging, but this patch or some other fix should be applied
to mainline ASAP so it can be included in the 3.7 stable series. 3.7.0
and 3.7.1 die when AC power is plugged in on a large number of laptop
systems.

I will not claim to understand this code, but it seemed to me like the
original problem was caused by the missing initialization of the hrtimer
in the disabled case. Calling hrtimer_cancel() on an initialized timer
not running should be perfectly OK.  And watchdog_nmi_disable() will 
not do anything unless the event is initialized.  So this patch looks
like a fix.

At least it survives both the original test cases and the post v3.7-rc8
regression test case.


Bjørn

 kernel/watchdog.c |   11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 997c6a1..75a2ab3 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -344,6 +344,10 @@ static void watchdog_enable(unsigned int cpu)
 {
struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
 
+   /* kick off the timer for the hardlockup detector */
+   hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   hrtimer->function = watchdog_timer_fn;
+
if (!watchdog_enabled) {
kthread_park(current);
return;
@@ -352,10 +356,6 @@ static void watchdog_enable(unsigned int cpu)
/* Enable the perf event */
watchdog_nmi_enable(cpu);
 
-   /* kick off the timer for the hardlockup detector */
-   hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-   hrtimer->function = watchdog_timer_fn;
-
/* done here because hrtimer_start can only pin to smp_processor_id() */
hrtimer_start(hrtimer, ns_to_ktime(sample_period),
  HRTIMER_MODE_REL_PINNED);
@@ -369,9 +369,6 @@ static void watchdog_disable(unsigned int cpu)
 {
struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
 
-   if (!watchdog_enabled)
-   return;
-
watchdog_set_prio(SCHED_NORMAL, 0);
hrtimer_cancel(hrtimer);
/* disable the perf event */
-- 
1.7.10.4
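
The ordering change can be explored with a toy timer model. Everything
here is hypothetical and simplified: "bad_ops" counts the two hazards this
sketch assumes, namely cancelling a timer that was never initialized and
re-initializing one that is still running. The exact kernel failure path is
not spelled out in the message above.

```c
#include <assert.h>
#include <stdbool.h>

struct timer_model {
    bool initialized;
    bool running;
    int  bad_ops;   /* counts operations assumed to be unsafe */
};

static void timer_init(struct timer_model *t)
{
    if (t->running)
        t->bad_ops++;           /* re-init of an active timer */
    t->initialized = true;
    t->running = false;
}

static void timer_start(struct timer_model *t) { t->running = true; }

static void timer_cancel(struct timer_model *t)
{
    if (!t->initialized)
        t->bad_ops++;           /* cancel of an uninitialized timer */
    t->running = false;
}

/* Before the patch: both paths bail out when the watchdog is disabled. */
static void enable_old(struct timer_model *t, bool enabled)
{
    if (!enabled)
        return;
    timer_init(t);
    timer_start(t);
}

static void disable_old(struct timer_model *t, bool enabled)
{
    if (!enabled)
        return;
    timer_cancel(t);
}

/* After the patch: always initialize, always cancel. */
static void enable_new(struct timer_model *t, bool enabled)
{
    timer_init(t);
    if (!enabled)
        return;
    timer_start(t);
}

static void disable_new(struct timer_model *t) { timer_cancel(t); }
```

Replaying "enabled, then echo 0, then echo 1" through the old pair leaves
the timer running across the disable and trips the re-init check, while the
new pair handles that sequence and the nmi_watchdog=0 case cleanly.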


Re: [PATCH 5/6] fuse: truncate file if async dio failed

2012-12-18 Thread Maxim V. Patlasov


12/17/2012 11:04 PM, Brian Foster пишет:

On 12/17/2012 09:13 AM, Maxim V. Patlasov wrote:

Hi,

12/15/2012 12:16 AM, Brian Foster пишет:

On 12/14/2012 10:21 AM, Maxim V. Patlasov wrote:

...

+

fuse_do_truncate() looks fairly close to fuse_do_setattr(). Is there any
reason we couldn't make fuse_do_setattr() non-static, change the dentry
parameter to an inode and use that?

fuse_do_setattr() performs extra checks that fuse_do_truncate() needn't.
Some of them are harmless, some not: fuse_allow_task() may return 0 if
task credentials changed. E.g. super-user successfully opened a file,
then setuid(other_user_uid), then write(2) to the file. write(2) doesn't
check uid, but fuse_do_truncate() - via fuse_allow_task() - does.


Conversely, what about the extra error handling bits in
fuse_do_setattr() that do not appear in fuse_do_truncate() (i.e., the
inode mode check, the change attributes call, updating the inode size,
etc.)? It seems like we would want some of that code here.


Yes, they won't harm.



fuse_setattr() is the only caller of fuse_do_setattr(), so why not embed
some of the initial checks (such as fuse_allow_task()) there? I suppose
we could pull out some of the error handling checks there as well if
they are considered harmful to this post-write error truncate situation.


Makes sense. I like it especially because it avoids code
duplication (handling of the FUSE_SETATTR fuse-request).



FWIW, I just tested a quick change that pulls up the fuse_allow_task()
check (via instrumenting a write error) and it seems to work as
expected. I can forward a patch if interested...


I did exactly the same before sending previous email :) In my tests it 
works as expected too (modulo fuse_allow_task() that we can move up). 
I'll re-send corrected patch soon.


Thanks,
Maxim

[GIT PULL] s390 patches for the 3.8 merge window #2

2012-12-18 Thread Martin Schwidefsky

Hi Linus,

please pull from the 'for-linus' branch of

git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux.git for-linus

to receive the following updates:
The main patch is the function measurement blocks extension for PCI to do
performance statistics and help with debugging. The other patch is a small
cleanup in ccwdev.h.

Cornelia Huck (1):
  s390/ccwdev: Include asm/schid.h.

Jan Glauber (1):
  s390/pci: performance statistics and debug infrastructure

 arch/s390/include/asm/ccwdev.h|4 +-
 arch/s390/include/asm/pci.h   |   39 
 arch/s390/include/asm/pci_debug.h |   36 +++
 arch/s390/pci/Makefile|2 +-
 arch/s390/pci/pci.c   |   73 +-
 arch/s390/pci/pci_clp.c   |1 +
 arch/s390/pci/pci_debug.c |  193 +
 arch/s390/pci/pci_dma.c   |8 +-
 arch/s390/pci/pci_event.c |2 +
 9 files changed, 350 insertions(+), 8 deletions(-)
 create mode 100644 arch/s390/include/asm/pci_debug.h
 create mode 100644 arch/s390/pci/pci_debug.c

diff --git a/arch/s390/include/asm/ccwdev.h b/arch/s390/include/asm/ccwdev.h
index 6d1f357..e606161 100644
--- a/arch/s390/include/asm/ccwdev.h
+++ b/arch/s390/include/asm/ccwdev.h
@@ -12,15 +12,13 @@
 #include <linux/mod_devicetable.h>
 #include <asm/fcx.h>
 #include <asm/irq.h>
+#include <asm/schid.h>
 
 /* structs from asm/cio.h */
 struct irb;
 struct ccw1;
 struct ccw_dev_id;
 
-/* from asm/schid.h */
-struct subchannel_id;
-
 /* simplified initializers for struct ccw_device:
  * CCW_DEVICE and CCW_DEVICE_DEVTYPE initialize one
  * entry in your MODULE_DEVICE_TABLE and set the match_flag correctly */
diff --git a/arch/s390/include/asm/pci.h b/arch/s390/include/asm/pci.h
index a6175ad..b1fa93c 100644
--- a/arch/s390/include/asm/pci.h
+++ b/arch/s390/include/asm/pci.h
@@ -9,6 +9,7 @@
 #include <asm-generic/pci.h>
 #include <asm-generic/pci-dma-compat.h>
 #include <asm/pci_clp.h>
+#include <asm/pci_debug.h>
 
 #define PCIBIOS_MIN_IO 0x1000
 #define PCIBIOS_MIN_MEM0x1000
@@ -33,6 +34,25 @@ int pci_proc_domain(struct pci_bus *);
 #define ZPCI_FC_BLOCKED0x20
 #define ZPCI_FC_DMA_ENABLED0x10
 
+struct zpci_fmb {
+   u32 format  :  8;
+   u32 dma_valid   :  1;
+   u32 : 23;
+   u32 samples;
+   u64 last_update;
+   /* hardware counters */
+   u64 ld_ops;
+   u64 st_ops;
+   u64 stb_ops;
+   u64 rpcit_ops;
+   u64 dma_rbytes;
+   u64 dma_wbytes;
+   /* software counters */
+   atomic64_t allocated_pages;
+   atomic64_t mapped_pages;
+   atomic64_t unmapped_pages;
+} __packed __aligned(16);
+
 struct msi_map {
unsigned long irq;
struct msi_desc *msi;
@@ -92,7 +112,15 @@ struct zpci_dev {
u64 end_dma;/* End of available DMA addresses */
u64 dma_mask;   /* DMA address space mask */
 
+   /* Function measurement block */
+   struct zpci_fmb *fmb;
+   u16 fmb_update; /* update interval */
+
enum pci_bus_speed max_bus_speed;
+
+   struct dentry   *debugfs_dev;
+   struct dentry   *debugfs_perf;
+   struct dentry   *debugfs_debug;
 };
 
 struct pci_hp_callback_ops {
@@ -155,4 +183,15 @@ extern struct list_head zpci_list;
 extern struct pci_hp_callback_ops hotplug_ops;
 extern unsigned int pci_probe;
 
+/* FMB */
+int zpci_fmb_enable_device(struct zpci_dev *);
+int zpci_fmb_disable_device(struct zpci_dev *);
+
+/* Debug */
+int zpci_debug_init(void);
+void zpci_debug_exit(void);
+void zpci_debug_init_device(struct zpci_dev *);
+void zpci_debug_exit_device(struct zpci_dev *);
+void zpci_debug_info(struct zpci_dev *, struct seq_file *);
+
 #endif
diff --git a/arch/s390/include/asm/pci_debug.h b/arch/s390/include/asm/pci_debug.h
new file mode 100644
index 000..6bbec42
--- /dev/null
+++ b/arch/s390/include/asm/pci_debug.h
@@ -0,0 +1,36 @@
+#ifndef _S390_ASM_PCI_DEBUG_H
+#define _S390_ASM_PCI_DEBUG_H
+
+#include <asm/debug.h>
+
+extern debug_info_t *pci_debug_msg_id;
+extern debug_info_t *pci_debug_err_id;
+
+#ifdef CONFIG_PCI_DEBUG
+#define zpci_dbg(fmt, args...) \
+   do { \
+   if (pci_debug_msg_id->level >= 2) \
+   debug_sprintf_event(pci_debug_msg_id, 2, fmt , ## args); \
+   } while (0)
+
+#else /* !CONFIG_PCI_DEBUG */
+#define zpci_dbg(fmt, args...) do { } while (0)
+#endif
+
+#define zpci_err(text...) \
+   do { \
+   char debug_buffer[16]; \
+   snprintf(debug_buffer, 16, text); \
+   debug_text_event(pci_debug_err_id, 0, debug_buffer); \
+   } while (0)
+
+static inline void zpci_err_hex(void

Re: [PATCH 1/3] tools/hv: Fix for long file names from readdir

2012-12-18 Thread Tomas Hozza

- Original Message -
> > This is just for sanity. The value PATH_MAX was chosen after
> > discussion with K. Y. Srinivasan and Olaf Hering instead of some
> > "magic" number like 256 or 512.
> 
> PATH_MAX is a magic name.

It is defined in "limits.h". I would welcome some more constructive
argumentation and criticism.

> > > Using snprintf() is a good idea, but you need to check the return
> > > value and handle the truncation case somehow.
> > 
> > By using PATH_MAX sized buffer there is no need for handling the
> > truncation case.
>  
> You are claiming two contradictory things: sprintf() may overrun the
> buffer, so we need the length check provided by snprintf(), but there
> is no need to check for truncation because we know the length is
> sufficient.

So what do you propose? How should it be solved?

Thank you.

Regards,
Tomas Hozza
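
One possible answer to the question above, as a minimal user-space sketch: keep snprintf(), but treat a return value that is negative or not smaller than the buffer size as an error instead of assuming that a PATH_MAX-sized buffer can never be too small. The helper name build_path() and its return convention are assumptions made for illustration; they are not taken from tools/hv.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Build "<dir>/<name>" into buf.
 * Returns 0 on success, -1 if the result did not fit or formatting failed.
 * build_path() is a hypothetical helper, not code from tools/hv.
 */
static int build_path(char *buf, size_t size, const char *dir, const char *name)
{
	int len = snprintf(buf, size, "%s/%s", dir, name);

	/* snprintf() returns the length it wanted to write, excluding the NUL. */
	if (len < 0 || (size_t)len >= size)
		return -1;	/* truncated or failed: the caller must not use buf */

	return 0;
}
```

A caller would skip (or log) the directory entry when build_path() returns -1, so a truncated path is never opened by mistake.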

[patch] x86, paravirt: fix build error when thp is disabled

2012-12-18 Thread David Rientjes

With CONFIG_PARAVIRT=y and CONFIG_TRANSPARENT_HUGEPAGE=n, the build breaks 
because set_pmd_at() is undeclared:

mm/memory.c: In function 'do_pmd_numa_page':
mm/memory.c:3520: error: implicit declaration of function 'set_pmd_at'
mm/mprotect.c: In function 'change_pmd_protnuma':
mm/mprotect.c:120: error: implicit declaration of function 'set_pmd_at'

This is because paravirt defines set_pmd_at() only when 
CONFIG_TRANSPARENT_HUGEPAGE=y and such a restriction is unneeded.  The fix 
is to define it for all CONFIG_PARAVIRT configurations.

Signed-off-by: David Rientjes <rient...@google.com>
---
 arch/x86/include/asm/paravirt.h |2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/paravirt.h b/arch/x86/include/asm/paravirt.h
--- a/arch/x86/include/asm/paravirt.h
+++ b/arch/x86/include/asm/paravirt.h
@@ -528,7 +528,6 @@ static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
PVOP_VCALL4(pv_mmu_ops.set_pte_at, mm, addr, ptep, pte.pte);
 }
 
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
  pmd_t *pmdp, pmd_t pmd)
 {
@@ -539,7 +538,6 @@ static inline void set_pmd_at(struct mm_struct *mm, unsigned long addr,
PVOP_VCALL4(pv_mmu_ops.set_pmd_at, mm, addr, pmdp,
native_pmd_val(pmd));
 }
-#endif
 
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {

Re: [GIT PULL] x86/uapi for 3.8

2012-12-18 Thread Jan Beulich

>>> On 17.12.12 at 18:15, "H. Peter Anvin" <h...@linux.intel.com> wrote:
> On 12/17/2012 09:03 AM, Jan Beulich wrote:
> On 17.12.12 at 17:39, "H. Peter Anvin" <h...@linux.intel.com> wrote:
>>> Right, I think you nailed this one.  This patch copies PTEs from the
>>> kernel PTEs and thus they will have the global bit set.  It obviously
>>> makes no sense to *copy* PTEs from the kernel and yet leaving the global
>>> bit set, which means there are two ways of fixing it: either sharing
>>> page tables and use the cr4.pge off/on trick that Jan mentioned -- this
>>> would also be my preference -- and the other is to copy the PTEs but
>>> strip the global bit, which has the advantage that the actual kernel
>>> mappings will survive.
>> 
>> PTE copying is only one half of it. I think additionally L4 entries
>> get copied for the 1:1 mapping, and you can't strip the global
>> bits there without allocating separate page tables.
>> 
> 
> The point right now is that it *does* allocate separate page tables, but

My point was that this isn't really the case: You only considered
the ioremap() adjustment of the respective patch, but the first
of the two loops the same patch adds to setup_real_mode() does
in fact share page tables for the identity mapping of RAM.

Matthew - that loop is, btw, off by one, i.e. should be

   for (i = 0; i <= pgd_index((max_pfn - 1) << PAGE_SHIFT); i++) {

But of course this, at least for the moment, is only a theoretical
issue.

> doesn't take advantage of it.  What I say is I think we should take the
> flush for the advantage of sharing page tables.  If we are allocating
> new page tables then we should of course make them non-global.
> 
> Do we know how often this gets called?  I presume the most common case
> is when we have an EFI RTC?  Unless there is a use case where this
> happens a lot sharing seems much easier...

When running on EFI any access to the real time clock will go
that route (i.e. there is no such thing as EFI without EFI RTC).

But then again there of course shouldn't be frequent accesses
to the RTC in the first place (which otherwise would quickly
become a bottleneck with the CMOS RTC as well).

Jan
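
The off-by-one mentioned above can be illustrated with a small user-space sketch. It assumes x86-64 4-level paging constants and assumes that the loop as posted used an exclusive bound (`<`); the patch under discussion is not quoted here, so that form is a guess. The helper names entries_exclusive() and entries_inclusive() are hypothetical and exist only to count loop iterations.

```c
#include <stdint.h>

#define PAGE_SHIFT   12	/* 4 KiB pages */
#define PGDIR_SHIFT  39	/* one top-level entry maps 512 GiB (4-level paging) */
#define PTRS_PER_PGD 512

/* User-space stand-in for the kernel's pgd_index(). */
static unsigned int pgd_index(uint64_t addr)
{
	return (unsigned int)((addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1));
}

/* Entries visited with an exclusive bound (assumed form of the posted loop). */
static unsigned int entries_exclusive(uint64_t max_pfn)
{
	unsigned int i, n = 0;

	for (i = 0; i < pgd_index((max_pfn - 1) << PAGE_SHIFT); i++)
		n++;
	return n;
}

/* Entries visited with the inclusive bound suggested in the mail. */
static unsigned int entries_inclusive(uint64_t max_pfn)
{
	unsigned int i, n = 0;

	for (i = 0; i <= pgd_index((max_pfn - 1) << PAGE_SHIFT); i++)
		n++;
	return n;
}
```

With 4 GiB of RAM (max_pfn = 0x100000) the last byte of RAM lies in top-level entry 0, so the exclusive form visits no entry at all while the inclusive form visits exactly one.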


[regression][PATCH v3] watchdog: Fix disable/enable regression

2012-12-18 Thread Bjørn Mork

commit 8d451690 (watchdog: Fix CPU hotplug regression) causes
an oops or hard lockup when doing

 echo 0 > /proc/sys/kernel/nmi_watchdog
 echo 1 > /proc/sys/kernel/nmi_watchdog

and the kernel is booted with nmi_watchdog=1 (default)

Running laptop-mode-tools and disconnecting/connecting AC power
will cause this to trigger, making it a common failure scenario
on laptops.

Instead of bailing out of watchdog_disable() when !watchdog_enabled
we can initialize the hrtimer regardless of watchdog_enabled status.
This makes it safe to call watchdog_disable() in the nmi_watchdog=0
case, without the negative effect on the enabled => disabled =>
enabled case.

All these tests pass with this patch:
- nmi_watchdog=1
  echo 0 > /proc/sys/kernel/nmi_watchdog
  echo 1 > /proc/sys/kernel/nmi_watchdog

- nmi_watchdog=0
  echo 0 > /sys/devices/system/cpu/cpu1/online

- nmi_watchdog=0
  echo mem > /sys/power/state

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661

Cc: <sta...@vger.kernel.org> # v3.7
Cc: Norbert Warmuth <nwarm...@t-online.de>
Cc: Joseph Salisbury <joseph.salisb...@canonical.com>
Cc: Thomas Gleixner <t...@linutronix.de>
Signed-off-by: Bjørn Mork <bj...@mork.no>
---
v3:
  added Bugzilla reference and additional recipients
  rebased on current mainline
v2:
  implemented an alternate workaround for the original problem.

Hello,

Sorry for nagging, but this patch or some other fix should be applied
to mainline ASAP so it can be included in the 3.7 stable series. 3.7.0
and 3.7.1 die when plugging in AC power on a large number of laptop
systems.

I will not claim to understand this code, but it seemed to me like the
original problem was caused by the missing initialization of the hrtimer
in the disabled case. Calling hrtimer_cancel() on an initialized timer
not running should be perfectly OK.  And watchdog_nmi_disable() will 
not do anything unless the event is initialized.  So this patch looks
like a fix.

At least it survives both the original test cases and the post v3.7-rc8
regression test case.


Bjørn

 kernel/watchdog.c |   11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 997c6a1..75a2ab3 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -344,6 +344,10 @@ static void watchdog_enable(unsigned int cpu)
 {
struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
 
+   /* kick off the timer for the hardlockup detector */
+   hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+   hrtimer->function = watchdog_timer_fn;
+
if (!watchdog_enabled) {
kthread_park(current);
return;
@@ -352,10 +356,6 @@ static void watchdog_enable(unsigned int cpu)
/* Enable the perf event */
watchdog_nmi_enable(cpu);
 
-   /* kick off the timer for the hardlockup detector */
-   hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
-   hrtimer->function = watchdog_timer_fn;
-
/* done here because hrtimer_start can only pin to smp_processor_id() */
hrtimer_start(hrtimer, ns_to_ktime(sample_period),
  HRTIMER_MODE_REL_PINNED);
@@ -369,9 +369,6 @@ static void watchdog_disable(unsigned int cpu)
 {
struct hrtimer *hrtimer = &__raw_get_cpu_var(watchdog_hrtimer);
 
-   if (!watchdog_enabled)
-   return;
-
watchdog_set_prio(SCHED_NORMAL, 0);
hrtimer_cancel(hrtimer);
/* disable the perf event */
-- 
1.7.10.4


[PATCH] spi: Add the flag indicate to registe new device as children of master or not.

2012-12-18 Thread Jun Chen


A new device is allocated for one of two purposes: as a child of the master,
or as a child of the master's parent device. This patch adds a flag to
indicate which one is intended.

Signed-off-by: Bi Chao <chao...@intel.com>
Signed-off-by: Chen Jun <jun.d.c...@intel.com>
---
 drivers/spi/spi.c   |   16 +++-
 include/linux/spi/spi.h |3 ++-
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 718cc1f..06f69ce 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -300,6 +300,8 @@ static DEFINE_MUTEX(board_lock);
 /**
  * spi_alloc_device - Allocate a new SPI device
  * @master: Controller to which device is connected
+ * device_was_children_of_master is flag which the device is registed
+ * as the children of the bus
  * Context: can sleep
  *
  * Allows a driver to allocate and initialize a spi_device without
@@ -314,7 +316,8 @@ static DEFINE_MUTEX(board_lock);
  *
  * Returns a pointer to the new device, or NULL.
  */
-struct spi_device *spi_alloc_device(struct spi_master *master)
+struct spi_device *spi_alloc_device(struct spi_master *master,
+   bool device_was_children_of_master)
 {
struct spi_device   *spi;
struct device   *dev = master->dev.parent;
@@ -330,7 +333,10 @@ struct spi_device *spi_alloc_device(struct spi_master *master)
}
 
spi->master = master;
-   spi->dev.parent = &master->dev;
+   if (device_was_children_of_master == true)
+   spi->dev.parent = &master->dev;
+   else
+   spi->dev.parent = dev;
spi->dev.bus = &spi_bus_type;
spi->dev.release = spidev_release;
device_initialize(&spi->dev);
@@ -434,7 +440,7 @@ struct spi_device *spi_new_device(struct spi_master *master,
 * suggests syslogged diagnostics are best here (ugh).
 */
 
-   proxy = spi_alloc_device(master);
+   proxy = spi_alloc_device(master, false);
if (!proxy)
return NULL;
 
@@ -827,7 +833,7 @@ static void of_register_spi_devices(struct spi_master *master)
 
for_each_available_child_of_node(master->dev.of_node, nc) {
/* Alloc an spi_device */
-   spi = spi_alloc_device(master);
+   spi = spi_alloc_device(master, true);
if (!spi) {
dev_err(&master->dev, "spi_device alloc error for %s\n",
nc->full_name);
@@ -939,7 +945,7 @@ static acpi_status acpi_spi_add_device(acpi_handle handle, u32 level,
if (acpi_bus_get_status(adev) || !adev->status.present)
return AE_OK;
 
-   spi = spi_alloc_device(master);
+   spi = spi_alloc_device(master, false);
if (!spi) {
dev_err(&master->dev, "failed to allocate SPI device for %s\n",
dev_name(&adev->dev));
diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index fa702ae..43d2f8e 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -838,7 +838,8 @@ spi_register_board_info(struct spi_board_info const *info, unsigned n)
  * be defined using the board info.
  */
 extern struct spi_device *
-spi_alloc_device(struct spi_master *master);
+spi_alloc_device(struct spi_master *master,
+   bool device_was_children_of_master);
 
 extern int
 spi_add_device(struct spi_device *spi);
-- 
1.7.4.1




Re: [PATCH] usb: gadget zero: avoid unnecessary reinit of data in f_sourcesink

2012-12-18 Thread Sebastian Andrzej Siewior

On Mon, Dec 17, 2012 at 06:21:16PM +0100, Armando Visconti wrote:
> > Besides that the patch looks fine :)
>
> Do you mean that 'inited' should be changed with 'initialized'?
Yes, I do.

> Oh ... my poor english... :(
Don't worry. Others, including myself, do this from time to time as well :)

>
> Rgds,
> Arm

Sebastian

Re: [PATCH] mm: Suppress mm/memory.o warning on older compilers if !CONFIG_NUMA_BALANCING

2012-12-18 Thread David Rientjes

On Mon, 17 Dec 2012, Andrew Morton wrote:

> > The kbuild test robot reported the following after the merge of Automatic
> > NUMA Balancing when cross-compiling for avr32.
> >
> > mm/memory.c: In function 'do_pmd_numa_page':
> > mm/memory.c:3593: warning: no return statement in function returning non-void
> >
> > The code is unreachable but the avr32 cross-compiler was not new enough
> > to know that. This patch suppresses the warning.
> >
> > Signed-off-by: Mel Gorman <mgor...@suse.de>
> > ---
> >  mm/memory.c |1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index e6a3b93..23f1fdf 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3590,6 +3590,7 @@ static int do_pmd_numa_page(struct mm_struct *mm, struct vm_area_struct *vma,
> >  unsigned long addr, pmd_t *pmdp)
> >  {
> > BUG();
> > +   return 0;
> >  }
> >  #endif /* CONFIG_NUMA_BALANCING */
>
> Odd.  avr32's BUG() includes a call to unreachable(), which should
> evaluate to do { } while (1).  Can you check that this is working?
>
> Perhaps it _is_ working, but the compiler incorrectly thinks that the
> function can return?
>

This isn't the typical "control reaches end of non-void function"; the
warning is merely stating that there is no return statement in the function,
which happens to be the case (and it has nothing to do with avr32, it
will be the same on all archs).  This is one of the last things that gcc
does after it parses a function declaration, and it will be emitted with
-Wreturn-type provided the function in question is not main() and is not
marked with __attribute__((noreturn)).  If you're testing this, try making
the function statically defined and it should show up even with
do {} while(1).

And for CONFIG_BUG=n this ends up being do {} while (0), which is just a
no-op and would end up producing the "control reaches end of non-void
function" warning.
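
The two cases discussed above can be sketched in userspace C. This is only an illustration: demo_bug() and DEMO_BUG_NOP() are hypothetical stand-ins, not the kernel's BUG(), and the sketch does not try to reproduce the older compiler's "no return statement" diagnostic.

```c
#include <stdlib.h>

/* Hypothetical stand-in for BUG(); the real kernel macro is arch-specific. */
static void demo_bug(void) __attribute__((noreturn));

static void demo_bug(void)
{
	abort();
}

/* Stand-in for the CONFIG_BUG=n case described above: a plain no-op. */
#define DEMO_BUG_NOP() do { } while (0)

/*
 * The failing path ends in a noreturn call, so the compiler can tell that
 * control never reaches the closing brace on that path.
 */
static int demo_checked_value(int x)
{
	if (x >= 0)
		return x;
	demo_bug();
}

/*
 * With the no-op variant, control does fall off the end of the macro, so
 * an explicit return is needed, as in the patch quoted above.
 */
static int demo_nop_then_return(void)
{
	DEMO_BUG_NOP();
	return 0;
}
```

The explicit "return 0;" costs nothing when the preceding statement never returns, which is why it works as a warning suppressor in both configurations.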

[PATCHv2 2/2] dma: dw_dmac: clear suspend bit during termination

2012-12-18 Thread Andy Shevchenko

From: Heikki Krogerus heikki.kroge...@linux.intel.com

A DMA transfer could not be established if the channel was previously paused
and then terminated. In that case the channel's suspend bit remains set, which
prevents any transfer until the channel is resumed.

This patch adds code that clears the DWC_CFGL_CH_SUSP bit during termination.

Signed-off-by: Heikki Krogerus heikki.kroge...@linux.intel.com
Signed-off-by: Andy Shevchenko andriy.shevche...@linux.intel.com
Acked-by: Viresh Kumar viresh.ku...@linaro.org
---
 drivers/dma/dw_dmac.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index 687af2a..8d77643 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -1059,7 +1059,7 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 
 		dwc_chan_disable(dw, dwc);
 
-		dwc->paused = false;
+		dwc_chan_resume(dwc);
 
 		/* active_list entries will end up before queued entries */
 		list_splice_init(&dwc->queue, &list);
-- 
1.7.10.4
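
A minimal userspace model of the problem this patch addresses, assuming a plain variable in place of the CFG_LO register. The demo_* names and the bit position of DEMO_CFGL_CH_SUSP are made up for illustration and are not taken from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define DEMO_CFGL_CH_SUSP	(1u << 8)

struct demo_chan {
	uint32_t cfg_lo;	/* stands in for the CFG_LO hardware register */
	bool paused;		/* software view of the channel state */
};

static void demo_chan_pause(struct demo_chan *c)
{
	c->cfg_lo |= DEMO_CFGL_CH_SUSP;
	c->paused = true;
}

static void demo_chan_resume(struct demo_chan *c)
{
	c->cfg_lo &= ~DEMO_CFGL_CH_SUSP;
	c->paused = false;
}

/* Old behaviour: terminate only resets the software flag. */
static void demo_terminate_old(struct demo_chan *c)
{
	c->paused = false;
}

/* Patched behaviour: terminate goes through resume and clears the bit. */
static void demo_terminate_new(struct demo_chan *c)
{
	demo_chan_resume(c);
}

/* A new transfer can only make progress while the suspend bit is clear. */
static bool demo_can_transfer(const struct demo_chan *c)
{
	return !(c->cfg_lo & DEMO_CFGL_CH_SUSP);
}
```

In the old path the software flag and the register disagree after pause followed by terminate; routing terminate through the resume helper keeps them in step.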


[PATCHv2 1/2] dma: dw_dmac: add dwc_chan_pause and dwc_chan_resume

2012-12-18 Thread Andy Shevchenko

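Factor the pause and resume sequences out of dwc_control() into
dwc_chan_pause() and dwc_chan_resume(). At least dwc_chan_resume() will be
used later.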

Signed-off-by: Andy Shevchenko andriy.shevche...@linux.intel.com
Acked-by: Viresh Kumar viresh.ku...@linaro.org
---
 drivers/dma/dw_dmac.c |   31 ++-
 1 file changed, 22 insertions(+), 9 deletions(-)

diff --git a/drivers/dma/dw_dmac.c b/drivers/dma/dw_dmac.c
index 4413f69..687af2a 100644
--- a/drivers/dma/dw_dmac.c
+++ b/drivers/dma/dw_dmac.c
@@ -1008,6 +1008,26 @@ set_runtime_config(struct dma_chan *chan, struct dma_slave_config *sconfig)
 	return 0;
 }
 
+static inline void dwc_chan_pause(struct dw_dma_chan *dwc)
+{
+	u32 cfglo = channel_readl(dwc, CFG_LO);
+
+	channel_writel(dwc, CFG_LO, cfglo | DWC_CFGL_CH_SUSP);
+	while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY))
+		cpu_relax();
+
+	dwc->paused = true;
+}
+
+static inline void dwc_chan_resume(struct dw_dma_chan *dwc)
+{
+	u32 cfglo = channel_readl(dwc, CFG_LO);
+
+	channel_writel(dwc, CFG_LO, cfglo & ~DWC_CFGL_CH_SUSP);
+
+	dwc->paused = false;
+}
+
 static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 		       unsigned long arg)
 {
@@ -1015,18 +1035,13 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 	struct dw_dma		*dw = to_dw_dma(chan->device);
 	struct dw_desc		*desc, *_desc;
 	unsigned long		flags;
-	u32			cfglo;
 	LIST_HEAD(list);
 
 	if (cmd == DMA_PAUSE) {
 		spin_lock_irqsave(&dwc->lock, flags);
 
-		cfglo = channel_readl(dwc, CFG_LO);
-		channel_writel(dwc, CFG_LO, cfglo | DWC_CFGL_CH_SUSP);
-		while (!(channel_readl(dwc, CFG_LO) & DWC_CFGL_FIFO_EMPTY))
-			cpu_relax();
+		dwc_chan_pause(dwc);
 
-		dwc->paused = true;
 		spin_unlock_irqrestore(&dwc->lock, flags);
 	} else if (cmd == DMA_RESUME) {
 		if (!dwc->paused)
@@ -1034,9 +1049,7 @@ static int dwc_control(struct dma_chan *chan, enum dma_ctrl_cmd cmd,
 
 		spin_lock_irqsave(&dwc->lock, flags);
 
-		cfglo = channel_readl(dwc, CFG_LO);
-		channel_writel(dwc, CFG_LO, cfglo & ~DWC_CFGL_CH_SUSP);
-		dwc->paused = false;
+		dwc_chan_resume(dwc);
 
 		spin_unlock_irqrestore(&dwc->lock, flags);
 	} else if (cmd == DMA_TERMINATE_ALL) {
-- 
1.7.10.4


[PATCH RESEND 4] ARM: plat-versatile: move secondary CPU startup into cpuinit

2012-12-18 Thread Claudio Fontana


Using __CPUINIT instead of __INIT puts the secondary CPU startup code
into the right section: it will not be freed in hotplug configurations,
allowing hot-add of cpus, while still getting freed in non-hotplug configs.

Tested successfully on Fast-Models and on Arndale for VCPU hotplug. 

Signed-off-by: Claudio Fontana claudio.font...@huawei.com
Tested-by: Claudio Fontana claudio.font...@huawei.com

diff --git a/arch/arm/plat-versatile/headsmp.S 
b/arch/arm/plat-versatile/headsmp.S
index dd703ef..19fe180 100644
--- a/arch/arm/plat-versatile/headsmp.S
+++ b/arch/arm/plat-versatile/headsmp.S
@@ -11,7 +11,7 @@
 #include <linux/linkage.h>
 #include <linux/init.h>
 
-   __INIT
+   __CPUINIT
 
 /*
  * Realview/Versatile Express specific entry point for secondary CPUs.
-- 
1.7.12.1



[PATCH v2] backlight: add lms501kf03 LCD driver

2012-12-18 Thread Jingoo Han

Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
x 480) driver uses a 3-wire SPI interface.

Signed-off-by: Ilho Lee ilho215@samsung.com
Signed-off-by: Jingoo Han jg1@samsung.com
---
Change since v1:
- remove redundant return variables
- use -EINVAL instead of -EFAULT
- add a more detailed description of 120ms delay time
- replace unsigned short arrays with unsigned char arrays

 drivers/video/backlight/Kconfig  |8 +
 drivers/video/backlight/Makefile |1 +
 drivers/video/backlight/lms501kf03.c |  444 ++
 3 files changed, 453 insertions(+), 0 deletions(-)
 create mode 100644 drivers/video/backlight/lms501kf03.c

diff --git a/drivers/video/backlight/Kconfig b/drivers/video/backlight/Kconfig
index 765a945..081d6cf 100644
--- a/drivers/video/backlight/Kconfig
+++ b/drivers/video/backlight/Kconfig
@@ -126,6 +126,14 @@ config LCD_AMS369FG06
  If you have an AMS369FG06 AMOLED Panel, say Y to enable its
  LCD control driver.
 
+config LCD_LMS501KF03
+	tristate "LMS501KF03 LCD Driver"
+	depends on SPI
+	default n
+	help
+	  If you have an LMS501KF03 LCD Panel, say Y to enable its
+	  LCD control driver.
+
 endif # LCD_CLASS_DEVICE
 
 #
diff --git a/drivers/video/backlight/Makefile b/drivers/video/backlight/Makefile
index e7ce729..d02a728 100644
--- a/drivers/video/backlight/Makefile
+++ b/drivers/video/backlight/Makefile
@@ -14,6 +14,7 @@ obj-$(CONFIG_LCD_TOSA)   += tosa_lcd.o
 obj-$(CONFIG_LCD_S6E63M0)  += s6e63m0.o
 obj-$(CONFIG_LCD_LD9040)   += ld9040.o
 obj-$(CONFIG_LCD_AMS369FG06)   += ams369fg06.o
+obj-$(CONFIG_LCD_LMS501KF03)   += lms501kf03.o
 
 obj-$(CONFIG_BACKLIGHT_CLASS_DEVICE) += backlight.o
 obj-$(CONFIG_BACKLIGHT_ATMEL_PWM)+= atmel-pwm-bl.o
diff --git a/drivers/video/backlight/lms501kf03.c 
b/drivers/video/backlight/lms501kf03.c
new file mode 100644
index 000..af7979d
--- /dev/null
+++ b/drivers/video/backlight/lms501kf03.c
@@ -0,0 +1,444 @@
+/*
+ * lms501kf03 TFT LCD panel driver.
+ *
+ * Copyright (c) 2012 Samsung Electronics Co., Ltd.
+ * Author: Jingoo Han  jg1@samsung.com
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/backlight.h>
+#include <linux/delay.h>
+#include <linux/fb.h>
+#include <linux/gpio.h>
+#include <linux/lcd.h>
+#include <linux/module.h>
+#include <linux/spi/spi.h>
+#include <linux/wait.h>
+
+#define COMMAND_ONLY   0x00
+#define DATA_ONLY  0x01
+
+struct lms501kf03 {
+	struct device			*dev;
+	struct spi_device		*spi;
+	unsigned int			power;
+	struct lcd_device		*ld;
+	struct lcd_platform_data	*lcd_pd;
+};
+
+static const unsigned char seq_password[] = {
+   0xb9, 0xff, 0x83, 0x69,
+};
+
+static const unsigned char seq_power[] = {
+   0xb1, 0x01, 0x00, 0x34, 0x06, 0x00, 0x14, 0x14, 0x20, 0x28,
+   0x12, 0x12, 0x17, 0x0a, 0x01, 0xe6, 0xe6, 0xe6, 0xe6, 0xe6,
+};
+
+static const unsigned char seq_display[] = {
+   0xb2, 0x00, 0x2b, 0x03, 0x03, 0x70, 0x00, 0xff, 0x00, 0x00,
+   0x00, 0x00, 0x03, 0x03, 0x00, 0x01,
+};
+
+static const unsigned char seq_rgb_if[] = {
+   0xb3, 0x09,
+};
+
+static const unsigned char seq_display_inv[] = {
+   0xb4, 0x01, 0x08, 0x77, 0x0e, 0x06,
+};
+
+static const unsigned char seq_vcom[] = {
+   0xb6, 0x4c, 0x2e,
+};
+
+static const unsigned char seq_gate[] = {
+   0xd5, 0x00, 0x05, 0x03, 0x29, 0x01, 0x07, 0x17, 0x68, 0x13,
+   0x37, 0x20, 0x31, 0x8a, 0x46, 0x9b, 0x57, 0x13, 0x02, 0x75,
+   0xb9, 0x64, 0xa8, 0x07, 0x0f, 0x04, 0x07,
+};
+
+static const unsigned char seq_panel[] = {
+   0xcc, 0x02,
+};
+
+static const unsigned char seq_col_mod[] = {
+   0x3a, 0x77,
+};
+
+static const unsigned char seq_w_gamma[] = {
+   0xe0, 0x00, 0x04, 0x09, 0x0f, 0x1f, 0x3f, 0x1f, 0x2f, 0x0a,
+   0x0f, 0x10, 0x16, 0x18, 0x16, 0x17, 0x0d, 0x15, 0x00, 0x04,
+   0x09, 0x0f, 0x38, 0x3f, 0x20, 0x39, 0x0a, 0x0f, 0x10, 0x16,
+   0x18, 0x16, 0x17, 0x0d, 0x15,
+};
+
+static const unsigned char seq_rgb_gamma[] = {
+   0xc1, 0x01, 0x03, 0x07, 0x0f, 0x1a, 0x22, 0x2c, 0x33, 0x3c,
+   0x46, 0x4f, 0x58, 0x60, 0x69, 0x71, 0x79, 0x82, 0x89, 0x92,
+   0x9a, 0xa1, 0xa9, 0xb1, 0xb9, 0xc1, 0xc9, 0xcf, 0xd6, 0xde,
+   0xe5, 0xec, 0xf3, 0xf9, 0xff, 0xdd, 0x39, 0x07, 0x1c, 0xcb,
+   0xab, 0x5f, 0x49, 0x80, 0x03, 0x07, 0x0f, 0x19, 0x20, 0x2a,
+   0x31, 0x39, 0x42, 0x4b, 0x53, 0x5b, 0x63, 0x6b, 0x73, 0x7b,
+   0x83, 0x8a, 0x92, 0x9b, 0xa2, 0xaa, 0xb2, 0xba, 0xc2, 0xca,
+   0xd0, 0xd8, 0xe1, 0xe8, 0xf0, 0xf8, 0xff, 0xf7, 0xd8, 0xbe,
+   0xa7, 0x39, 0x40, 0x85, 0x8c, 0xc0, 0x04, 0x07, 0x0c, 0x17,
+   0x1c,

[PATCH] [RFC] cpufreq: can't raise max frequency with cpu_thermal

2012-12-18 Thread Sonny Rao

The cpu_thermal generic thermal management code has a bug where, once
the max cpu frequency has been lowered in sysfs (scaling_max_freq), it is
not possible to raise the max back up later.  The bug is that the
notifier gets called by __cpufreq_set_policy() before the user policy
max is raised, and incorrectly tries to enforce the old max frequency
policy even when we are trying to change that policy.  It is also not
clear why this driver is looking at the user policy, since it is
primarily supposed to enforce thermal policy, not user set policy.

Signed-off-by: Sonny Rao sonny...@chromium.org
---
 drivers/thermal/cpu_cooling.c |4 
 1 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 836828e..63bc708 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -219,10 +219,6 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb,
 	if (cpumask_test_cpu(policy->cpu, &notify_device->allowed_cpus))
 		max_freq = notify_device->cpufreq_val;
 
-	/* Never exceed user_policy.max*/
-	if (max_freq > policy->user_policy.max)
-		max_freq = policy->user_policy.max;
-
 	if (policy->max != max_freq)
 		cpufreq_verify_within_limits(policy, 0, max_freq);
 
 
-- 
1.7.7.3
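
The ordering problem described above can be modelled in a few lines of plain C. This is a sketch under simplifying assumptions: the demo_* functions are hypothetical, frequencies are bare integers in kHz, and demo_apply() only mimics the upper-bound clamp that cpufreq_verify_within_limits() performs.

```c
/* Limit computed by the notifier before the patch: the thermal limit,
 * further clamped to the user max that is still stored in the policy. */
static unsigned int demo_limit_old(unsigned int thermal_max,
				   unsigned int stored_user_max)
{
	unsigned int max_freq = thermal_max;

	if (max_freq > stored_user_max)
		max_freq = stored_user_max;
	return max_freq;
}

/* Limit computed after the patch: the thermal limit only. */
static unsigned int demo_limit_new(unsigned int thermal_max)
{
	return thermal_max;
}

/* Clamp the newly requested max to the limit chosen by the notifier. */
static unsigned int demo_apply(unsigned int requested_max, unsigned int limit)
{
	return requested_max > limit ? limit : requested_max;
}
```

With a stored user max of 800000, a thermal limit of 1200000 and a new request of 1200000, the old path returns 800000 (the request is clamped back to the stale value), while the new path returns 1200000.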


[PATCH V2] spi: remove check for bits_per_word on transfer from low level driver

2012-12-18 Thread Laxman Dewangan

The SPI core makes sure that each transfer structure has a proper
bits_per_word setting before calling the low-level transfer APIs.

Hence the low-level drivers no longer need to check whether this field
is set correctly. Remove such code from the low-level drivers.

Signed-off-by: Laxman Dewangan ldewan...@nvidia.com
---
This is a continuation of the feedback from Jonas on the change
spi: make sure all transfer has bits_per_word set
where I made the change only in the tegra slink driver.
This patch removes the similar code from all drivers.

Changes from V1:
- No change in code.
- Description is rewritten to match with the change.
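
A rough sketch of the division of labour this change relies on, assuming simplified demo_* structures that only carry the fields involved. These are not the kernel's struct spi_device or struct spi_transfer, and demo_core_prepare() merely illustrates the kind of defaulting the core is described as doing.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_spi_device {
	uint8_t bits_per_word;		/* device-wide default */
};

struct demo_spi_transfer {
	uint8_t bits_per_word;		/* 0 means "use the device default" */
	size_t len;			/* transfer length in bytes */
};

/* Core-side step: fill in the default so drivers never see 0. */
static void demo_core_prepare(const struct demo_spi_device *spi,
			      struct demo_spi_transfer *t)
{
	if (!t->bits_per_word)
		t->bits_per_word = spi->bits_per_word;
}

/* Driver-side step after the patch: use the field directly.
 * Assumes bits_per_word is at least 8, as the sketch has no error path. */
static size_t demo_driver_word_count(const struct demo_spi_transfer *t)
{
	size_t bytes_per_word = t->bits_per_word / 8;

	return t->len / bytes_per_word;
}
```

Once the fallback lives in one place, each driver's "t->bits_per_word ? : spi->bits_per_word" expression reduces to a plain field read.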

 drivers/spi/spi-altera.c|2 +-
 drivers/spi/spi-bfin-sport.c|3 +--
 drivers/spi/spi-bfin5xx.c   |3 +--
 drivers/spi/spi-bitbang.c   |6 +++---
 drivers/spi/spi-clps711x.c  |2 +-
 drivers/spi/spi-coldfire-qspi.c |3 +--
 drivers/spi/spi-ep93xx.c|2 +-
 drivers/spi/spi-s3c64xx.c   |2 +-
 drivers/spi/spi-sirf.c  |3 +--
 drivers/spi/spi-tegra20-slink.c |9 +++--
 drivers/spi/spi-txx9.c  |6 ++
 11 files changed, 16 insertions(+), 25 deletions(-)

diff --git a/drivers/spi/spi-altera.c b/drivers/spi/spi-altera.c
index 5e7314a..a537f8d 100644
--- a/drivers/spi/spi-altera.c
+++ b/drivers/spi/spi-altera.c
@@ -134,7 +134,7 @@ static int altera_spi_txrx(struct spi_device *spi, struct spi_transfer *t)
 	hw->tx = t->tx_buf;
 	hw->rx = t->rx_buf;
 	hw->count = 0;
-	hw->bytes_per_word = (t->bits_per_word ? : spi->bits_per_word) / 8;
+	hw->bytes_per_word = t->bits_per_word / 8;
 	hw->len = t->len / hw->bytes_per_word;
 
 	if (hw->irq >= 0) {
diff --git a/drivers/spi/spi-bfin-sport.c b/drivers/spi/spi-bfin-sport.c
index ac7ffca..39b0d17 100644
--- a/drivers/spi/spi-bfin-sport.c
+++ b/drivers/spi/spi-bfin-sport.c
@@ -416,8 +416,7 @@ bfin_sport_spi_pump_transfers(unsigned long data)
 	drv_data->cs_change = transfer->cs_change;
 
 	/* Bits per word setup */
-	bits_per_word = transfer->bits_per_word ? :
-		message->spi->bits_per_word ? : 8;
+	bits_per_word = transfer->bits_per_word;
 	if (bits_per_word % 16 == 0)
 		drv_data->ops = &bfin_sport_transfer_ops_u16;
 	else
diff --git a/drivers/spi/spi-bfin5xx.c b/drivers/spi/spi-bfin5xx.c
index 0429d83..7d7c991 100644
--- a/drivers/spi/spi-bfin5xx.c
+++ b/drivers/spi/spi-bfin5xx.c
@@ -642,8 +642,7 @@ static void bfin_spi_pump_transfers(unsigned long data)
 	drv_data->cs_change = transfer->cs_change;
 
 	/* Bits per word setup */
-	bits_per_word = transfer->bits_per_word ? :
-		message->spi->bits_per_word ? : 8;
+	bits_per_word = transfer->bits_per_word;
 	if (bits_per_word % 16 == 0) {
 		drv_data->n_bytes = bits_per_word/8;
 		drv_data->len = (transfer->len) >> 1;
diff --git a/drivers/spi/spi-bitbang.c b/drivers/spi/spi-bitbang.c
index 8b3d8ef..61beaec 100644
--- a/drivers/spi/spi-bitbang.c
+++ b/drivers/spi/spi-bitbang.c
@@ -69,7 +69,7 @@ static unsigned bitbang_txrx_8(
 	unsigned		ns,
 	struct spi_transfer	*t
 ) {
-	unsigned		bits = t->bits_per_word ? : spi->bits_per_word;
+	unsigned		bits = t->bits_per_word;
 	unsigned		count = t->len;
 	const u8		*tx = t->tx_buf;
 	u8			*rx = t->rx_buf;
@@ -95,7 +95,7 @@ static unsigned bitbang_txrx_16(
 	unsigned		ns,
 	struct spi_transfer	*t
 ) {
-	unsigned		bits = t->bits_per_word ? : spi->bits_per_word;
+	unsigned		bits = t->bits_per_word;
 	unsigned		count = t->len;
 	const u16		*tx = t->tx_buf;
 	u16			*rx = t->rx_buf;
@@ -121,7 +121,7 @@ static unsigned bitbang_txrx_32(
 	unsigned		ns,
 	struct spi_transfer	*t
 ) {
-	unsigned		bits = t->bits_per_word ? : spi->bits_per_word;
+	unsigned		bits = t->bits_per_word;
 	unsigned		count = t->len;
 	const u32		*tx = t->tx_buf;
 	u32			*rx = t->rx_buf;
diff --git a/drivers/spi/spi-clps711x.c b/drivers/spi/spi-clps711x.c
index 1366c46..a11cbf0 100644
--- a/drivers/spi/spi-clps711x.c
+++ b/drivers/spi/spi-clps711x.c
@@ -68,7 +68,7 @@ static int spi_clps711x_setup_xfer(struct spi_device *spi,
 				   struct spi_transfer *xfer)
 {
 	u32 speed = xfer->speed_hz ? : spi->max_speed_hz;
-	u8 bpw = xfer->bits_per_word ? : spi->bits_per_word;
+	u8 bpw = xfer->bits_per_word;
 	struct spi_clps711x_data *hw = spi_master_get_devdata(spi->master);
 
 	if (bpw != 8) {
diff --git a/drivers/spi/spi-coldfire-qspi.c b/drivers/spi/spi-coldfire-qspi.c
index 58466b8..7b5cc9e 100644
---

Re: [PATCH v2] backlight: add lms501kf03 LCD driver

2012-12-18 Thread devendra.aaru

On Tue, Dec 18, 2012 at 3:46 AM, Jingoo Han jg1@samsung.com wrote:
> Add the lms501kf03 LCD panel driver. The lms501kf03 LCD panel (800
> x 480) driver uses 3-wired SPI inteface.
>
> Signed-off-by: Ilho Lee ilho215@samsung.com
> Signed-off-by: Jingoo Han jg1@samsung.com
> ---
> Change since v1:
> - remove redundant return variables
> - use -EINVAL instead of -EFAULT
> - add a more detailed description of 120ms delay time
>

Thanks for taking care of these comments!

[PATCH 1/2] pinctrl: nomadik: return if prcm_base is NULL

2012-12-18 Thread Fabio Baltieri

This patch adds a check for npct->prcm_base to make sure that the
address is not NULL before using it, as the driver was made capable of
loading even without a proper memory resource in:

f1671bf pinctrl/nomadik: make independent of prcmu driver

Also, refuse to probe without prcm_base on anything other than Nomadik.

This solves the following crash, introduced during the merge window when
booting on U8500 with device tree:

pinctrl-nomadik pinctrl-db8500: No PRCM base, assume no ALT-Cx control is 
available
Unable to handle kernel NULL pointer dereference at virtual address 0138
pgd = c0004000
[0138] *pgd=
Internal error: Oops: 5 [#1] PREEMPT SMP ARM
Modules linked in:
CPU: 0Not tainted  (3.7.0-02892-g1ebaf4f #631)
PC is at nmk_pmx_enable+0x1bc/0x4d0
LR is at clk_disable+0x40/0x44
[snip]
[c01d5e50] (nmk_pmx_enable+0x1bc/0x4d0) from [c01d3ba8] 
(pinmux_enable_setting+0x12c/0x1ec)
[c01d3ba8] (pinmux_enable_setting+0x12c/0x1ec) from [c01d1dc8] 
(pinctrl_select_state_locked+0xfc/0x134)
[c01d1dc8] (pinctrl_select_state_locked+0xfc/0x134) from [c01d2814] 
(pinctrl_register+0x26c/0x43c)
[c01d2814] (pinctrl_register+0x26c/0x43c) from [c01d668c] 
(nmk_pinctrl_probe+0x114/0x238)
[c01d668c] (nmk_pinctrl_probe+0x114/0x238) from [c0211cc4] 
(platform_drv_probe+0x28/0x2c)
[c0211cc4] (platform_drv_probe+0x28/0x2c) from [c0210738] 
(driver_probe_device+0x84/0x21c)
[c0210738] (driver_probe_device+0x84/0x21c) from [c02109c0] 
(__device_attach+0x50/0x54)
[c02109c0] (__device_attach+0x50/0x54) from [c020eb1c] 
(bus_for_each_drv+0x54/0x9c)
[c020eb1c] (bus_for_each_drv+0x54/0x9c) from [c0210668] 
(device_attach+0x84/0x9c)
[c0210668] (device_attach+0x84/0x9c) from [c020fbac] 
(bus_probe_device+0x94/0xb8)
[c020fbac] (bus_probe_device+0x94/0xb8) from [c020e084] 
(device_add+0x4f0/0x5bc)
[c020e084] (device_add+0x4f0/0x5bc) from [c0276400] 
(of_device_add+0x40/0x48)
[c0276400] (of_device_add+0x40/0x48) from [c0276a98] 
(of_platform_device_create_pdata+0x68/0x98)
[c0276a98] (of_platform_device_create_pdata+0x68/0x98) from [c0276bac] 
(of_platform_bus_create+0xe4/0x260)
[c0276bac] (of_platform_bus_create+0xe4/0x260) from [c0276bf8] 
(of_platform_bus_create+0x130/0x260)
[c0276bf8] (of_platform_bus_create+0x130/0x260) from [c0276d94] 
(of_platform_populate+0x6c/0xac)
[c0276d94] (of_platform_populate+0x6c/0xac) from [c04a8224] 
(u8500_init_machine+0x78/0x140)
[c04a8224] (u8500_init_machine+0x78/0x140) from [c04a3560] 
(customize_machine+0x24/0x30)
[c04a3560] (customize_machine+0x24/0x30) from [c00087b0] 
(do_one_initcall+0x130/0x1b0)
[c00087b0] (do_one_initcall+0x130/0x1b0) from [c033ff9c] 
(kernel_init+0x138/0x2e8)
[c033ff9c] (kernel_init+0x138/0x2e8) from [c000eb18] 
(ret_from_fork+0x14/0x20)
Code: 0a1b e19400b2 e59a200c e0822000 (e592c000)
---[ end trace 1b75b31a2719ed1c ]---
note: swapper/0[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init! exitcode=0x000b

Reviewed-by: Linus Walleij linus.wall...@linaro.org
Signed-off-by: Fabio Baltieri fabio.balti...@linaro.org
---
 drivers/pinctrl/pinctrl-nomadik.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/pinctrl-nomadik.c 
b/drivers/pinctrl/pinctrl-nomadik.c
index ef66f98..015b093 100644
--- a/drivers/pinctrl/pinctrl-nomadik.c
+++ b/drivers/pinctrl/pinctrl-nomadik.c
@@ -259,6 +259,9 @@ static void nmk_prcm_altcx_set_mode(struct nmk_pinctrl *npct,
 	const struct prcm_gpiocr_altcx_pin_desc *pin_desc;
 	const u16 *gpiocr_regs;
 
+	if (!npct->prcm_base)
+		return;
+
 	if (alt_num > PRCM_IDX_GPIOCR_ALTC_MAX) {
 		dev_err(npct->dev, "PRCM GPIOCR: alternate-C%i is invalid\n",
 			alt_num);
@@ -682,6 +685,9 @@ static int nmk_prcm_gpiocr_get_mode(struct pinctrl_dev *pctldev, int gpio)
 	const struct prcm_gpiocr_altcx_pin_desc *pin_desc;
 	const u16 *gpiocr_regs;
 
+	if (!npct->prcm_base)
+		return NMK_GPIO_ALT_C;
+
 	for (i = 0; i < npct->soc->npins_altcx; i++) {
 		if (npct->soc->altcx_pins[i].pin == gpio)
 			break;
@@ -1887,9 +1893,12 @@ static int __devinit nmk_pinctrl_probe(struct platform_device *pdev)
 				"failed to ioremap PRCM registers\n");
 			return -ENOMEM;
 		}
-	} else {
+	} else if (version == PINCTRL_NMK_STN8815) {
 		dev_info(&pdev->dev,
 			 "No PRCM base, assume no ALT-Cx control is available\n");
+	} else {
+		dev_err(&pdev->dev, "missing PRCM base address\n");
+		return -EINVAL;
 	}
 
/*
-- 
1.7.12.1


Re: [PATCH] kvm: fix i8254 counter 0 wraparound

2012-12-18 Thread Gleb Natapov

On Sat, Dec 15, 2012 at 06:34:37AM -0500, Nickolai Zeldovich wrote:
> The kvm i8254 emulation for counter 0 (but not for counters 1 and 2)
> has at least two bugs in mode 0:
> 
> 1. The OUT bit, computed by pit_get_out(), is never set high.
> 
> 2. The counter value, computed by pit_get_count(), wraps back around to
>    the initial counter value, rather than wrapping back to 0x
>    (which is the behavior described in the comment in __kpit_elapsed,
>    the behavior implemented by qemu, and the behavior observed on AMD
>    hardware).
> 
> The bug stems from __kpit_elapsed computing the elapsed time mod the
> initial counter value (stored as nanoseconds in ps->period).  This is both
> unnecessary (none of the callers of kpit_elapsed expect the value to be
> at most the initial counter value) and incorrect (it causes pit_get_count
> to appear to wrap around to the initial counter value rather than 0x).
> Removing this mod from __kpit_elapsed fixes both of the above bugs.
> 
> Signed-off-by: Nickolai Zeldovich nicko...@csail.mit.edu
Applied, thanks!

> ---
>  arch/x86/kvm/i8254.c |1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 11300d2..c1d30b2 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -122,7 +122,6 @@ static s64 __kpit_elapsed(struct kvm *kvm)
>  	 */
>  	remaining = hrtimer_get_remaining(&ps->timer);
>  	elapsed = ps->period - ktime_to_ns(remaining);
> -	elapsed = mod_64(elapsed, ps->period);
>  
>  	return elapsed;
>  }
> -- 
> 1.7.10.4
--
Gleb.
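
For readers following the quoted description, the effect of the removed mod can be modelled in isolation. The sketch below is a simplification under stated assumptions: elapsed time is given directly in counter ticks rather than nanoseconds, the initial count must be non-zero, and the demo_* helpers are hypothetical rather than the kvm functions themselves.

```c
#include <stdint.h>

/* Old behaviour: elapsed ticks are first reduced mod the initial count,
 * so the result can never pass through zero. 'initial' must be non-zero. */
static uint16_t demo_count_old(uint16_t initial, uint64_t elapsed_ticks)
{
	uint64_t d = elapsed_ticks % initial;

	return (uint16_t)((initial - d) & 0xffff);
}

/* New behaviour: no mod, the 16-bit counter simply keeps decrementing. */
static uint16_t demo_count_new(uint16_t initial, uint64_t elapsed_ticks)
{
	return (uint16_t)((initial - elapsed_ticks) & 0xffff);
}

/* Mode 0 OUT goes high once the elapsed ticks reach the initial count. */
static int demo_out_old(uint16_t initial, uint64_t elapsed_ticks)
{
	return (elapsed_ticks % initial) >= initial;	/* never true */
}

static int demo_out_new(uint16_t initial, uint64_t elapsed_ticks)
{
	return elapsed_ticks >= initial;
}
```

With an initial count of 100, the old model jumps from 1 back to 100 at tick 100 and never raises OUT, while the new model reaches 0 at tick 100, raises OUT, and continues from the top of the 16-bit range on the next tick.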

[PATCH 2/2] ARM: ux500: add pinctrl address resources

2012-12-18 Thread Fabio Baltieri

The current nmk_pinctrl driver no longer depends on PRCMU, so it needs
its own DT address resources to work properly, as was done for
platform_device in:

f482833 ARM: ux500: add PRCM register base for pinctrl

Reviewed-by: Linus Walleij linus.wall...@linaro.org
Signed-off-by: Fabio Baltieri fabio.balti...@linaro.org
---
 arch/arm/boot/dts/dbx5x0.dtsi| 3 ++-
 arch/arm/mach-ux500/cpu-db8500.c | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/arm/boot/dts/dbx5x0.dtsi b/arch/arm/boot/dts/dbx5x0.dtsi
index 2efd9c8..16552d4 100644
--- a/arch/arm/boot/dts/dbx5x0.dtsi
+++ b/arch/arm/boot/dts/dbx5x0.dtsi
@@ -170,7 +170,8 @@
 			gpio-bank = <8>;
 		};
 
-		pinctrl {
+		pinctrl@80157000 {
+			reg = <0x80157000 0x2000>;
 			compatible = "stericsson,nmk_pinctrl";
 		};
 
diff --git a/arch/arm/mach-ux500/cpu-db8500.c b/arch/arm/mach-ux500/cpu-db8500.c
index db0bb75..5b286e0 100644
--- a/arch/arm/mach-ux500/cpu-db8500.c
+++ b/arch/arm/mach-ux500/cpu-db8500.c
@@ -285,7 +285,8 @@ static struct of_dev_auxdata u8500_auxdata_lookup[] __initdata = {
 	OF_DEV_AUXDATA("st,nomadik-i2c", 0x8011, "nmk-i2c.3", NULL),
 	OF_DEV_AUXDATA("st,nomadik-i2c", 0x8012a000, "nmk-i2c.4", NULL),
 	/* Requires device name bindings. */
-	OF_DEV_AUXDATA("stericsson,nmk_pinctrl", 0, "pinctrl-db8500", NULL),
+	OF_DEV_AUXDATA("stericsson,nmk_pinctrl", U8500_PRCMU_BASE,
+		"pinctrl-db8500", NULL),
 	/* Requires clock name and DMA bindings. */
 	OF_DEV_AUXDATA("stericsson,ux500-msp-i2s", 0x80123000,
 		"ux500-msp-i2s.0", &msp0_platform_data),
-- 
1.7.12.1


Re: [PATCH 4/4] ARM: tegra: Set SCU base address dynamically from DT

2012-12-18 Thread Hiroshi Doyu

Hi Rob,

Rob Herring robherri...@gmail.com wrote @ Mon, 17 Dec 2012 15:00:46 +0100:

> On 12/17/2012 12:18 AM, Hiroshi Doyu wrote:
> > Set Snoop Control Unit(SCU) register base address dynamically from DT.
> > 
> > Signed-off-by: Hiroshi Doyu hd...@nvidia.com
> > ---
> >  arch/arm/mach-tegra/platsmp.c |   23 ---
> >  1 file changed, 20 insertions(+), 3 deletions(-)
> > 
> > diff --git a/arch/arm/mach-tegra/platsmp.c b/arch/arm/mach-tegra/platsmp.c
> > index 1b926df..45c0b79 100644
> > --- a/arch/arm/mach-tegra/platsmp.c
> > +++ b/arch/arm/mach-tegra/platsmp.c
> > @@ -18,6 +18,8 @@
> >  #include <linux/jiffies.h>
> >  #include <linux/smp.h>
> >  #include <linux/io.h>
> > +#include <linux/of.h>
> > +#include <linux/of_address.h>
> >  
> >  #include <asm/cacheflush.h>
> >  #include <asm/hardware/gic.h>
> > @@ -36,7 +38,7 @@
> >  
> >  extern void tegra_secondary_startup(void);
> >  
> > -static void __iomem *scu_base = IO_ADDRESS(TEGRA_ARM_PERIF_BASE);
> > +static void __iomem *scu_base;
> >  
> >  #define EVP_CPU_RESET_VECTOR \
> >  	(IO_ADDRESS(TEGRA_EXCEPTION_VECTORS_BASE) + 0x100)
> > @@ -143,14 +145,28 @@ done:
> >  	return status;
> >  }
> >  
> > +static const struct of_device_id cortex_a9_scu_match[] __initconst = {
> > +	{ .compatible = "arm,cortex-a9-scu", },
> > +	{}
> > +};
> > +
> >  /*
> >   * Initialise the CPU possible map early - this describes the CPUs
> >   * which may be present or become present in the system.
> >   */
> >  static void __init tegra_smp_init_cpus(void)
> >  {
> > -	unsigned int i, ncores = scu_get_core_count(scu_base);
> > +	struct device_node *np;
> > +	unsigned int i, ncores = 1;
> > +
> > +	np = of_find_matching_node(NULL, cortex_a9_scu_match);
> > +	if (!np)
> > +		return;
> > +	scu_base = of_iomap(np, 0);
> 
> Did you actually test this? Unless something changed, ioremap does not
> work this early. The only reason to have it mapped this early is to get
> the core count, but that doesn't work on A15 or A7. So we really need to
> get core count/mask in a standard way. At least some work to get core
> count from DT went into 3.8.
> 
> BTW, you can get the scu address on the A9 by reading cp15 register:
> 
> 	/* Get SCU base */
> 	asm("mrc p15, 4, %0, c15, c0, 0" : "=r" (base));
> 
> It's still probably good to have the DT node, but the reg property can
> be optional in this case.

I'm simply wondering: if the above cp15 read works with Cortex-A9, do we
still need the SCU DT node? At least from the Cortex-A15 TRM, it seems that
the SCU is tightly integrated into the CPU core and it doesn't have any user
control. So Cortex-A15 doesn't seem to need to configure the SCU. For
Cortex-A7, I haven't yet found S/W configurable register definitions
in the TRM. So if neither A15 nor A7 needs the SCU base, would the above cp15
instruction be enough?

> We need to move away from having the DT matching code within the
> platforms. This should all be moved to the scu code in a scu_of_init
> function that could be called from common code.

True if SCU DT node is still necessary.

[PATCH 0/8] Thermal Framework Enhancements

2012-12-18 Thread Durgadoss R

This patch is a v1 based on the RFC submitted here:
https://patchwork.kernel.org/patch/1758921/

This patch set is based on Rui's -thermal tree, and is
tested on a Core-i5 and an Atom netbook.

This series contains 8 patches:
Patch 1/8: Creates new sensor level APIs
Patch 2/8: Creates new zone level APIs. The existing tzd structure is
   kept as such for clarity and compatibility purposes.
Patch 3/8: Creates functions to add/remove a cdev to/from a zone. The
   existing tcd structure need not be modified.
Patch 4/8: Adds a thermal_trip sysfs node, which exposes various trip
   points for all sensors present in a zone.
Patch 5/8: Adds a thermal_map sysfs node. It is a compact representation
   of the binding relationship between a sensor and a cdev,
   within a zone.
Patch 6/8: Creates Documentation for the new APIs. A new file is
   created for clarity. The final goal is to merge it with the existing
   file or refactor the files, whichever seems appropriate.
Patch 7/8: Make PER ZONE values configurable through Kconfig
Patch 8/8: A dummy driver that can be used for testing. This is not for merge.

Thanks to Rui Zhang, Hongbo Zhang and Wei Ni for their feedback on the
RFC version.

Durgadoss R (8):
  Thermal: Create sensor level APIs
  Thermal: Create zone level APIs
  Thermal: Add APIs to bind cdev to new zone structure
  Thermal: Add Thermal_trip sysfs node
  Thermal: Add 'thermal_map' sysfs node
  Thermal: Add Documentation to new APIs
  Thermal: Make PER_ZONE values configurable
  Thermal: Dummy driver used for testing

 Documentation/thermal/sysfs-api2.txt |  248 +
 drivers/thermal/Kconfig  |   19 +
 drivers/thermal/Makefile |3 +
 drivers/thermal/thermal_sys.c|  932 ++
 drivers/thermal/thermal_test.c   |  315 
 include/linux/thermal.h  |  124 +
 6 files changed, 1641 insertions(+)
 create mode 100644 Documentation/thermal/sysfs-api2.txt
 create mode 100644 drivers/thermal/thermal_test.c

-- 
1.7.9.5


[PATCH 3/8] Thermal: Add APIs to bind cdev to new zone structure

2012-12-18 Thread Durgadoss R

This patch creates new APIs to add/remove a
cdev to/from a zone. This patch does not change
the old cooling device implementation.
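
The add/remove bookkeeping can be sketched in userspace as follows. This is a simplified model, not the patch's code: the demo_* names are hypothetical, cooling devices are represented by string pointers, the capacity check is an assumption, and the removal loop shifts only the occupied slots.

```c
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_CDEVS 4	/* stands in for MAX_CDEVS_PER_ZONE */

struct demo_zone {
	const char *cdevs[DEMO_MAX_CDEVS];
	int count;
};

/* Return the slot holding 'cdev' (compared by pointer), or -1. */
static int demo_find(const struct demo_zone *z, const char *cdev)
{
	int i;

	for (i = 0; i < z->count; i++)
		if (z->cdevs[i] == cdev)
			return i;
	return -1;
}

static int demo_add(struct demo_zone *z, const char *cdev)
{
	if (!z || !cdev)
		return -EINVAL;
	if (demo_find(z, cdev) >= 0)
		return -EEXIST;		/* do not bind the same cdev twice */
	if (z->count >= DEMO_MAX_CDEVS)
		return -ENOSPC;		/* assumed capacity check */
	z->cdevs[z->count++] = cdev;
	return 0;
}

static void demo_remove(struct demo_zone *z, const char *cdev)
{
	int i, idx = demo_find(z, cdev);

	if (idx < 0)
		return;
	/* Close the gap by shifting the later entries down one slot. */
	for (i = idx; i < z->count - 1; i++)
		z->cdevs[i] = z->cdevs[i + 1];
	z->cdevs[--z->count] = NULL;
}
```

Keeping the array dense means the entry count doubles as the next free slot, at the cost of an O(n) shift on removal, which is cheap for a small per-zone limit.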

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/thermal_sys.c |   80 +
 include/linux/thermal.h   |8 +
 2 files changed, 88 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 06d5a12..b39bf97 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -58,6 +58,7 @@ static LIST_HEAD(thermal_governor_list);
 static DEFINE_MUTEX(thermal_list_lock);
 static DEFINE_MUTEX(sensor_list_lock);
 static DEFINE_MUTEX(zone_list_lock);
+static DEFINE_MUTEX(cdev_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
 #define for_each_thermal_sensor(pos) \
@@ -82,6 +83,9 @@ static DEFINE_MUTEX(thermal_governor_lock);
 	mutex_unlock(&type##_list_lock);			\
 	} while (0)
 
+#define for_each_cdev(pos) \
+	list_for_each_entry(pos, &thermal_cdev_list, node)
+
 static struct thermal_governor *__find_governor(const char *name)
 {
struct thermal_governor *pos;
@@ -462,6 +466,24 @@ static void remove_sensor_from_zone(struct thermal_zone *tz,
 	tz->sensor_indx--;
 }
 
+static void remove_cdev_from_zone(struct thermal_zone *tz,
+				struct thermal_cooling_device *cdev)
+{
+	int j, indx;
+
+	GET_INDEX(tz, cdev, indx, cdev);
+	if (indx < 0)
+		return;
+
+	sysfs_remove_link(&tz->device.kobj, kobject_name(&cdev->device.kobj));
+
+	/* Shift the entries in the tz->cdevs array */
+	for (j = indx; j < MAX_CDEVS_PER_ZONE - 1; j++)
+		tz->cdevs[j] = tz->cdevs[j + 1];
+
+	tz->cdev_indx--;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -1458,6 +1480,7 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
 	int i;
 	const struct thermal_zone_params *tzp;
 	struct thermal_zone_device *tz;
+	struct thermal_zone *tmp_tz;
 	struct thermal_cooling_device *pos = NULL;
 
 	if (!cdev)
@@ -1495,6 +1518,13 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
 
 	mutex_unlock(&thermal_list_lock);
 
+	mutex_lock(&zone_list_lock);
+
+	for_each_thermal_zone(tmp_tz)
+		remove_cdev_from_zone(tmp_tz, cdev);
+
+	mutex_unlock(&zone_list_lock);
+
 	if (cdev->type[0])
 		device_remove_file(&cdev->device, &dev_attr_cdev_type);
 	device_remove_file(&cdev->device, &dev_attr_max_state);
@@ -1790,6 +1820,23 @@ exit:
 }
 EXPORT_SYMBOL(remove_thermal_zone);
 
+struct thermal_cooling_device *get_cdev_by_name(const char *name)
+{
+	struct thermal_cooling_device *pos;
+	struct thermal_cooling_device *cdev = NULL;
+
+	mutex_lock(&cdev_list_lock);
+	for_each_cdev(pos) {
+		if (!strnicmp(pos->type, name, THERMAL_NAME_LENGTH)) {
+			cdev = pos;
+			break;
+		}
+	}
+	mutex_unlock(&cdev_list_lock);
+	return cdev;
+}
+EXPORT_SYMBOL(get_cdev_by_name);
+
 struct thermal_sensor *get_sensor_by_name(const char *name)
 {
struct thermal_sensor *pos;
@@ -1840,6 +1887,39 @@ exit_zone:
 }
 EXPORT_SYMBOL(add_sensor_to_zone);
 
+int add_cdev_to_zone(struct thermal_zone *tz,
+			struct thermal_cooling_device *cdev)
+{
+	int ret;
+
+	if (!tz || !cdev)
+		return -EINVAL;
+
+	mutex_lock(&zone_list_lock);
+
+	/* Ensure we are not adding the same cdev again!! */
+	GET_INDEX(tz, cdev, ret, cdev);
+	if (ret >= 0) {
+		ret = -EEXIST;
+		goto exit_zone;
+	}
+
+	mutex_lock(&cdev_list_lock);
+	ret = sysfs_create_link(&tz->device.kobj, &cdev->device.kobj,
+				kobject_name(&cdev->device.kobj));
+	if (ret)
+		goto exit_cdev;
+
+	tz->cdevs[tz->cdev_indx++] = cdev;
+
+exit_cdev:
+	mutex_unlock(&cdev_list_lock);
+exit_zone:
+	mutex_unlock(&zone_list_lock);
+	return ret;
+}
+EXPORT_SYMBOL(add_cdev_to_zone);
+
 /**
  * thermal_sensor_register - register a new thermal sensor
  * @name:  name of the thermal sensor
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index f08f774..c4e45c7 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -51,6 +51,8 @@
 
 #define MAX_SENSORS_PER_ZONE   5
 
+#define MAX_CDEVS_PER_ZONE 5
+
 struct thermal_sensor;
 struct thermal_zone_device;
 struct thermal_cooling_device;
@@ -209,6 +211,10 @@ struct thermal_zone {
/* Sensor level information */
int sensor_indx; /* index into 'sensors' array */
struct thermal_sensor *sensors[MAX_SENSORS_PER_ZONE];
+
+   /* cdev level information */
+   int cdev_indx; /* index into 'cdevs' array */
+   struct

[PATCH 2/8] Thermal: Create zone level APIs

2012-12-18 Thread Durgadoss R

This patch adds a new thermal_zone structure to
thermal.h, and adds zone level APIs to the thermal
framework.

A thermal zone is a hot spot on the platform, which
can have one or more sensors and cooling devices attached
to it. These sensors can be mapped to a set of cooling
devices, which when throttled, can help to bring down
the temperature of the hot spot.

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/thermal_sys.c |  194 +
 include/linux/thermal.h   |   21 +
 2 files changed, 215 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index b2becb9..06d5a12 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -44,19 +44,44 @@ MODULE_DESCRIPTION("Generic thermal management sysfs support");
 MODULE_LICENSE("GPL");
 
 static DEFINE_IDR(thermal_tz_idr);
+static DEFINE_IDR(thermal_zone_idr);
 static DEFINE_IDR(thermal_cdev_idr);
 static DEFINE_IDR(thermal_sensor_idr);
 static DEFINE_MUTEX(thermal_idr_lock);
 
 static LIST_HEAD(thermal_tz_list);
 static LIST_HEAD(thermal_sensor_list);
+static LIST_HEAD(thermal_zone_list);
 static LIST_HEAD(thermal_cdev_list);
 static LIST_HEAD(thermal_governor_list);
 
 static DEFINE_MUTEX(thermal_list_lock);
 static DEFINE_MUTEX(sensor_list_lock);
+static DEFINE_MUTEX(zone_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
+#define for_each_thermal_sensor(pos) \
+	list_for_each_entry(pos, &thermal_sensor_list, node)
+
+#define for_each_thermal_zone(pos) \
+	list_for_each_entry(pos, &thermal_zone_list, node)
+
+#define GET_INDEX(tz, ptr, indx, type)			\
+	do {						\
+		int i;					\
+		indx = -EINVAL;				\
+		if (!tz || !ptr)			\
+			break;				\
+		mutex_lock(&type##_list_lock);		\
+		for (i = 0; i < tz->type##_indx; i++) {	\
+			if (tz->type##s[i] == ptr) {	\
+				indx = i;		\
+				break;			\
+			}				\
+		}					\
+		mutex_unlock(&type##_list_lock);	\
+	} while (0)
+
 static struct thermal_governor *__find_governor(const char *name)
 {
struct thermal_governor *pos;
@@ -419,15 +444,44 @@ static void thermal_zone_device_check(struct work_struct *work)
 	thermal_zone_device_update(tz);
 }
 
+static void remove_sensor_from_zone(struct thermal_zone *tz,
+				struct thermal_sensor *ts)
+{
+	int j, indx;
+
+	GET_INDEX(tz, ts, indx, sensor);
+	if (indx < 0)
+		return;
+
+	sysfs_remove_link(&tz->device.kobj, kobject_name(&ts->device.kobj));
+
+	/* Shift the entries in the tz->sensors array */
+	for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++)
+		tz->sensors[j] = tz->sensors[j + 1];
+
+	tz->sensor_indx--;
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
container_of(_dev, struct thermal_zone_device, device)
 
+#define to_zone(_dev) \
+   container_of(_dev, struct thermal_zone, device)
+
 #define to_thermal_sensor(_dev) \
container_of(_dev, struct thermal_sensor, device)
 
 static ssize_t
+zone_name_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct thermal_zone *tz = to_zone(dev);
+
+	return sprintf(buf, "%s\n", tz->name);
+}
+
+static ssize_t
 sensor_name_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
struct thermal_sensor *ts = to_thermal_sensor(dev);
@@ -809,6 +863,8 @@ static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 static DEVICE_ATTR(sensor_name, 0444, sensor_name_show, NULL);
 static DEVICE_ATTR(temp_input, 0444, sensor_temp_show, NULL);
 
+static DEVICE_ATTR(zone_name, 0444, zone_name_show, NULL);
+
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
@@ -1654,6 +1710,136 @@ static int enable_sensor_thresholds(struct thermal_sensor *ts, int count)
 	return 0;
 }
 
+struct thermal_zone *create_thermal_zone(const char *name, void *devdata)
+{
+	struct thermal_zone *tz;
+	int ret;
+
+	if (!name || (name && strlen(name) >= THERMAL_NAME_LENGTH))
+		return ERR_PTR(-EINVAL);
+
+	tz = kzalloc(sizeof(*tz), GFP_KERNEL);
+	if (!tz)
+		return ERR_PTR(-ENOMEM);
+
+	idr_init(&tz->idr);
+	ret = get_idr(&thermal_zone_idr, &thermal_idr_lock, &tz->id);
+	if (ret)
+		goto exit_free;
+
+	strcpy(tz->name, name);
+	tz->devdata = devdata;
+	tz->device.class =

[PATCH 8/8] Thermal: Dummy driver used for testing

2012-12-18 Thread Durgadoss R

This patch has a dummy driver that can be used for
testing purposes. This patch is not for merge.

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/Kconfig|5 +
 drivers/thermal/Makefile   |3 +
 drivers/thermal/thermal_test.c |  315 
 3 files changed, 323 insertions(+)
 create mode 100644 drivers/thermal/thermal_test.c

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index c5ba3340..3b92a76 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -136,4 +136,9 @@ config DB8500_CPUFREQ_COOLING
  bound cpufreq cooling device turns active to set CPU frequency low to
  cool down the CPU.
 
+config THERMAL_TEST
+	tristate "test driver"
+	help
+	  Enable this to test the thermal framework.
+
 endif
diff --git a/drivers/thermal/Makefile b/drivers/thermal/Makefile
index d8da683..02c3edb 100644
--- a/drivers/thermal/Makefile
+++ b/drivers/thermal/Makefile
@@ -18,3 +18,6 @@ obj-$(CONFIG_RCAR_THERMAL)+= rcar_thermal.o
 obj-$(CONFIG_EXYNOS_THERMAL)   += exynos_thermal.o
 obj-$(CONFIG_DB8500_THERMAL)   += db8500_thermal.o
 obj-$(CONFIG_DB8500_CPUFREQ_COOLING)   += db8500_cpufreq_cooling.o
+
+# dummy driver for testing
+obj-$(CONFIG_THERMAL_TEST) += thermal_test.o
diff --git a/drivers/thermal/thermal_test.c b/drivers/thermal/thermal_test.c
new file mode 100644
index 000..5a11e34
--- /dev/null
+++ b/drivers/thermal/thermal_test.c
@@ -0,0 +1,315 @@
+/*
+ * thermal_test.c - This driver can be used to test Thermal
+ *Framework changes. Not specific to any
+ *platform. Fills the log buffer generously ;)
+ *
+ * Copyright (C) 2012 Intel Corporation
+ *
+ * ~~
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; version 2 of the License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
+ *
+ * ~~
+ * Author: Durgadoss R durgados...@intel.com
+ */
+
+#define pr_fmt(fmt)  "thermal_test: " fmt
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/param.h>
+#include <linux/device.h>
+#include <linux/slab.h>
+#include <linux/pm.h>
+#include <linux/platform_device.h>
+#include <linux/thermal.h>
+
+#define MAX_THERMAL_ZONES	2
+#define MAX_THERMAL_SENSORS	2
+#define MAX_COOLING_DEVS	4
+#define NUM_THRESHOLDS		3
+
+static struct ts_data {
+   int curr_temp;
+   int flag;
+} ts_data;
+
+int active_trips[10] = {100, 90, 80, 70, 60, 50, 40, 30, 20, 10};
+int passive_trips[5] = {100, 90, 60, 50, 40};
+
+static struct platform_device *pdev;
+static unsigned long cur_cdev_state = 2;
+static struct thermal_sensor *ts, *ts1;
+static struct thermal_zone *tz;
+static struct thermal_cooling_device *cdev;
+
+static long thermal_thresholds[NUM_THRESHOLDS] = {3, 4, 5};
+
+static struct thermal_trip_point trip = {
+   .hot = 90,
+   .crit = 100,
+   .num_passive_trips = 5,
+   .passive_trips = passive_trips,
+   .num_active_trips = 10,
+   .active_trips = active_trips,
+   .active_trip_mask = 0xCFF,
+};
+
+static struct thermal_trip_point trip1 = {
+   .hot = 95,
+   .crit = 125,
+   .num_passive_trips = 0,
+   .passive_trips = passive_trips,
+   .num_active_trips = 6,
+   .active_trips = active_trips,
+   .active_trip_mask = 0xFF,
+};
+
+static int read_cur_state(struct thermal_cooling_device *cdev,
+   unsigned long *state)
+{
+   *state = cur_cdev_state;
+   return 0;
+}
+
+static int write_cur_state(struct thermal_cooling_device *cdev,
+   unsigned long state)
+{
+   cur_cdev_state = state;
+   return 0;
+}
+
+static int read_max_state(struct thermal_cooling_device *cdev,
+   unsigned long *state)
+{
+   *state = 5;
+   return 0;
+}
+
+static int read_curr_temp(struct thermal_sensor *ts, long *temp)
+{
+   *temp = ts_data.curr_temp;
+   return 0;
+}
+
+static ssize_t
+flag_show(struct device *dev, struct device_attribute *devattr, char *buf)
+{
+	return sprintf(buf, "%d\n", ts_data.flag);
+}
+
+static ssize_t
+flag_store(struct device *dev, struct device_attribute *attr,
+   const char *buf, size_t count)
+{
+   long flag;
+
+

[PATCH 7/8] Thermal: Make PER_ZONE values configurable

2012-12-18 Thread Durgadoss R

This patch makes MAX_SENSORS_PER_ZONE and
MAX_CDEVS_PER_ZONE values configurable. The
default value is 1, and range is 1-12.
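
As a rough illustration of how the Kconfig values feed the array sizes, the
sketch below mirrors the macro chain from thermal.h in plain C. The #ifndef
fallbacks and the helper maps_per_zone() are assumptions made only so the
fragment builds outside the kernel; in a real build the CONFIG_ symbols come
from the generated configuration, and Kconfig itself enforces the range.

```c
#include <assert.h>

/* Fall back to the Kconfig default (1) when built outside the kernel. */
#ifndef CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
#define CONFIG_THERMAL_MAX_SENSORS_PER_ZONE 1
#endif
#ifndef CONFIG_THERMAL_MAX_CDEVS_PER_ZONE
#define CONFIG_THERMAL_MAX_CDEVS_PER_ZONE 1
#endif

#define MAX_SENSORS_PER_ZONE	CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
#define MAX_CDEVS_PER_ZONE	CONFIG_THERMAL_MAX_CDEVS_PER_ZONE

/* If we map each sensor with every possible cdev for a zone */
#define MAX_MAPS_PER_ZONE	(MAX_SENSORS_PER_ZONE * MAX_CDEVS_PER_ZONE)

/* Mirror the Kconfig "range 1 12" as compile-time checks (C11). */
_Static_assert(MAX_SENSORS_PER_ZONE >= 1 && MAX_SENSORS_PER_ZONE <= 12,
	       "sensors per zone out of range");
_Static_assert(MAX_CDEVS_PER_ZONE >= 1 && MAX_CDEVS_PER_ZONE <= 12,
	       "cdevs per zone out of range");

/* Hypothetical helper: number of map slots a zone needs. */
static int maps_per_zone(void)
{
	return MAX_MAPS_PER_ZONE;
}
```

With both options at their maximum of 12, a zone would carry 144 map slots,
which is worth keeping in mind when picking values.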

Signed-off-by: Durgadoss R durgados...@intel.com
---
No great reason for using 12.
---
 drivers/thermal/Kconfig |   14 ++
 include/linux/thermal.h |6 +++---
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/drivers/thermal/Kconfig b/drivers/thermal/Kconfig
index d96da07..c5ba3340 100644
--- a/drivers/thermal/Kconfig
+++ b/drivers/thermal/Kconfig
@@ -15,6 +15,20 @@ menuconfig THERMAL
 
 if THERMAL
 
+config THERMAL_MAX_SENSORS_PER_ZONE
+	int "Maximum number of sensors allowed per thermal zone"
+	default 1
+	range 1 12
+	---help---
+	  Specify the number of sensors allowed per zone
+
+config THERMAL_MAX_CDEVS_PER_ZONE
+	int "Maximum number of cooling devices allowed per thermal zone"
+	default 1
+	range 1 12
+	---help---
+	  Specify the number of cooling devices allowed per zone
+
 config THERMAL_HWMON
bool
depends on HWMON=y || HWMON=THERMAL
diff --git a/include/linux/thermal.h b/include/linux/thermal.h
index 581dc87..7b0359b 100644
--- a/include/linux/thermal.h
+++ b/include/linux/thermal.h
@@ -49,9 +49,9 @@
 /* Default Thermal Governor: Does Linear Throttling */
 #define DEFAULT_THERMAL_GOVERNOR	"step_wise"
 
-#define MAX_SENSORS_PER_ZONE   5
-
-#define MAX_CDEVS_PER_ZONE 5
+/* Maximum number of sensors/cdevs per zone, defined through Kconfig */
+#define MAX_SENSORS_PER_ZONE   CONFIG_THERMAL_MAX_SENSORS_PER_ZONE
+#define MAX_CDEVS_PER_ZONE CONFIG_THERMAL_MAX_CDEVS_PER_ZONE
 
 /* If we map each sensor with every possible cdev for a zone */
 #define MAX_MAPS_PER_ZONE  (MAX_SENSORS_PER_ZONE * MAX_CDEVS_PER_ZONE)
-- 
1.7.9.5

[PATCH 6/8] Thermal: Add Documentation to new APIs

2012-12-18 Thread Durgadoss R

This patch adds Documentation for the new APIs
introduced in this patch set. The documentation
also has a model sysfs structure for reference.

Signed-off-by: Durgadoss R durgados...@intel.com
---
 Documentation/thermal/sysfs-api2.txt |  248 ++
 1 file changed, 248 insertions(+)
 create mode 100644 Documentation/thermal/sysfs-api2.txt

diff --git a/Documentation/thermal/sysfs-api2.txt b/Documentation/thermal/sysfs-api2.txt
new file mode 100644
index 000..ffd0402
--- /dev/null
+++ b/Documentation/thermal/sysfs-api2.txt
@@ -0,0 +1,248 @@
+Thermal Framework
+-----------------
+
+Written by Durgadoss R durgados...@intel.com
+Copyright (c) 2012 Intel Corporation
+
+Created on: 4 November 2012
+Updated on: 18 December 2012
+
+0. Introduction
+---------------
+The Linux thermal framework provides a set of interfaces for thermal
+sensors and thermal cooling devices (fan, processor...) to register
+with the thermal management solution and to be a part of it.
+
+This document focuses on how to enable new thermal sensors and cooling
+devices to participate in thermal management. This solution is intended
+to be 'light-weight' and platform/architecture independent. Any thermal
+sensor/cooling device should be able to use the infrastructure easily.
+
+The goal of the thermal framework is to expose the thermal sensor/zone
+and cooling device attributes in a consistent way. This will help the
+thermal governors to make use of the information to manage platform
+thermals efficiently.
+
+The thermal sensor source file can be generic (any sensor driver, in any
+subsystem). Such a driver uses the sensor APIs to register with the
+thermal framework and take part in platform thermal management. It does
+not (and should not) know which zone it belongs to, or anything else
+about platform thermals. A sensor driver is a standalone piece of code
+that can optionally register with the thermal framework.
+
+However, for any platform, there should be a platformX_thermal.c file,
+which knows the platform's thermal characteristics (how many sensors,
+zones and cooling devices there are, and how they are related to each
+other, i.e. the mapping information). The zone level APIs should be used
+only in this file, since it has all the information required to attach
+the various sensors to a particular zone.
+
+This way, one platform level thermal file can support multiple platforms
+that use the same set of sensors bound in different ways. This file can
+get the platform thermal information from firmware, ACPI tables, device
+tree, etc.
+
+Unfortunately, today we don't have many drivers that can be clearly
+differentiated as 'sensor_file.c' and 'platform_thermal_file.c', but we
+will need them very soon. A lot of chip drivers are starting to use the
+thermal framework, and we should keep it really light-weight for them
+to do so.
+
+An example: drivers/hwmon/emc1403.c is a generic thermal chip driver.
+On one platform this sensor can belong to 'ZoneA' and on another it can
+belong to 'ZoneB'. But emc1403.c does not care which zone it belongs
+to; it just reports temperature.
+
+1. Terminology
+--------------
+This section describes the terminology used in the rest of this
+document as well as the thermal framework code.
+
+thermal_sensor:	Hardware that can report the temperature of the particular
+		spot in the platform where it is placed. The temperature
+		reported by the sensor is the 'real' temperature reported
+		by the hardware.
+thermal_zone:	A virtual area on the device that gets heated up. It may
+		have one or more thermal sensors attached to it.
+cooling_device:	Any component that can help in reducing the temperature of
+		a 'hot spot', either by reducing its performance (passive
+		cooling) or by other means (active cooling, e.g. a fan).
+
+trip_points:	Various temperature levels for each sensor. As of now, we
+		have four levels, namely active, passive, hot and critical.
+		Hot and critical trip points support only one value, whereas
+		active and passive can have any number of values. These
+		temperature values can come from platform data, and are
+		exposed through sysfs in a consistent manner. Stand-alone
+		thermal sensor drivers are not expected to know these values.
+		These values are RO.
+thresholds:	These are programmable temperature limits; on reaching one,
+		the thermal sensor generates an interrupt. The framework is
+		notified about this interrupt to take appropriate action.
+		There can be as many thresholds as the hardware supports.
+		These values are RW.
+
+thermal_map:   This provides the mapping (aka

[PATCH 4/8] Thermal: Add Thermal_trip sysfs node

2012-12-18 Thread Durgadoss R

This patch adds a thermal_trip directory under
/sys/class/thermal/zoneX. This directory contains
the trip point values for sensors bound to this
zone.
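
To make the node contents concrete, here is a small user-space sketch of
the line that the 'active' node reports: the trip mask in hex followed by
the active trip values, or "Not available" when there are none. The struct
trip_point and format_active_trips() names are hypothetical, and the sketch
uses a bounded snprintf() where the patch uses sprintf() on the sysfs page
buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical subset of struct thermal_trip_point. */
struct trip_point {
	int num_active_trips;
	const int *active_trips;
	unsigned int active_trip_mask;
};

/*
 * Write "<mask> <trip0> <trip1> ...\n" into @buf.
 * Returns the number of characters written, or -1 if @buf is too small.
 */
static int format_active_trips(char *buf, size_t size,
			       const struct trip_point *tp)
{
	size_t len;
	int i, n;

	if (tp->num_active_trips <= 0) {
		n = snprintf(buf, size, "Not available\n");
		return (n < 0 || (size_t)n >= size) ? -1 : n;
	}

	n = snprintf(buf, size, "0x%x", tp->active_trip_mask);
	if (n < 0 || (size_t)n >= size)
		return -1;
	len = (size_t)n;

	for (i = 0; i < tp->num_active_trips; i++) {
		n = snprintf(buf + len, size - len, " %d",
			     tp->active_trips[i]);
		if (n < 0 || (size_t)n >= size - len)
			return -1;
		len += (size_t)n;
	}

	n = snprintf(buf + len, size - len, "\n");
	if (n < 0 || (size_t)n >= size - len)
		return -1;

	return (int)(len + (size_t)n);
}
```

For a sensor with mask 0x7 and trips 100, 90 and 80, the formatted line is
"0x7 100 90 80" followed by a newline.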

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/thermal_sys.c |  237 -
 include/linux/thermal.h   |   37 +++
 2 files changed, 272 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index b39bf97..29ec073 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -448,6 +448,22 @@ static void thermal_zone_device_check(struct work_struct *work)
 	thermal_zone_device_update(tz);
 }
 
+static int get_sensor_indx_by_kobj(struct thermal_zone *tz, const char *name)
+{
+	int i, indx = -EINVAL;
+
+	mutex_lock(&sensor_list_lock);
+	for (i = 0; i < tz->sensor_indx; i++) {
+		if (!strnicmp(name, kobject_name(tz->kobj_trip[i]),
+			THERMAL_NAME_LENGTH)) {
+			indx = i;
+			break;
+		}
+	}
+	mutex_unlock(&sensor_list_lock);
+	return indx;
+}
+
 static void remove_sensor_from_zone(struct thermal_zone *tz,
struct thermal_sensor *ts)
 {
@@ -459,9 +475,15 @@ static void remove_sensor_from_zone(struct thermal_zone *tz,
 
 	sysfs_remove_link(&tz->device.kobj, kobject_name(&ts->device.kobj));
 
+	/* Delete this sensor's trip Kobject */
+	kobject_del(tz->kobj_trip[indx]);
+
 	/* Shift the entries in the tz->sensors array */
-	for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++)
+	for (j = indx; j < MAX_SENSORS_PER_ZONE - 1; j++) {
 		tz->sensors[j] = tz->sensors[j + 1];
+		tz->sensor_trip[j] = tz->sensor_trip[j + 1];
+		tz->kobj_trip[j] = tz->kobj_trip[j + 1];
+	}
 
 	tz->sensor_indx--;
 }
@@ -875,6 +897,120 @@ policy_show(struct device *dev, struct device_attribute *devattr, char *buf)
 	return sprintf(buf, "%s\n", tz->governor->name);
 }
 
+static ssize_t
+active_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	int i, indx, ret = 0;
+	struct thermal_zone *tz;
+	struct device *dev;
+
+	/* In this function, for
+	 * /sys/class/thermal/zoneX/thermal_trip/sensorY:
+	 * attr points to sysfs node 'active'
+	 * kobj points to sensorY
+	 * kobj->parent points to thermal_trip
+	 * kobj->parent->parent points to zoneX
+	 */
+
+	/* Get the zone pointer */
+	dev = container_of(kobj->parent->parent, struct device, kobj);
+	tz = to_zone(dev);
+	if (!tz)
+		return -EINVAL;
+
+	/*
+	 * We need this because in the sysfs tree, 'sensorY' is
+	 * not really the sensor pointer. It just has the name
+	 * 'sensorY'; whereas 'zoneX' is actually the zone pointer.
+	 * This means container_of(kobj, struct device, kobj) will not
+	 * provide the actual sensor pointer.
+	 */
+	indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+	if (indx < 0)
+		return indx;
+
+	if (tz->sensor_trip[indx]->num_active_trips <= 0)
+		return sprintf(buf, "Not available\n");
+
+	ret += sprintf(buf, "0x%x", tz->sensor_trip[indx]->active_trip_mask);
+	for (i = 0; i < tz->sensor_trip[indx]->num_active_trips; i++) {
+		ret += sprintf(buf + ret, " %d",
+			tz->sensor_trip[indx]->active_trips[i]);
+	}
+
+	ret += sprintf(buf + ret, "\n");
+	return ret;
+}
+
+static ssize_t
+ptrip_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	int i, indx, ret = 0;
+	struct thermal_zone *tz;
+	struct device *dev;
+
+	/* Get the zone pointer */
+	dev = container_of(kobj->parent->parent, struct device, kobj);
+	tz = to_zone(dev);
+	if (!tz)
+		return -EINVAL;
+
+	indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+	if (indx < 0)
+		return indx;
+
+	if (tz->sensor_trip[indx]->num_passive_trips <= 0)
+		return sprintf(buf, "Not available\n");
+
+	for (i = 0; i < tz->sensor_trip[indx]->num_passive_trips; i++) {
+		ret += sprintf(buf + ret, "%d ",
+			tz->sensor_trip[indx]->passive_trips[i]);
+	}
+
+	ret += sprintf(buf + ret, "\n");
+	return ret;
+}
+
+static ssize_t
+hot_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	int indx;
+	struct thermal_zone *tz;
+	struct device *dev;
+
+	/* Get the zone pointer */
+	dev = container_of(kobj->parent->parent, struct device, kobj);
+	tz = to_zone(dev);
+	if (!tz)
+		return -EINVAL;
+
+	indx = get_sensor_indx_by_kobj(tz, kobject_name(kobj));
+	if (indx < 0)
+		return indx;
+
+	return

[PATCH 5/8] Thermal: Add 'thermal_map' sysfs node

2012-12-18 Thread Durgadoss R

This patch creates a thermal map sysfs node under
/sys/class/thermal/thermal_zoneX/. This contains
entries named map0, map1 .. mapN. Each map has the
following space separated values:
trip_type sensor_name cdev_name trip_mask weights
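
Since the entries are named map0 .. mapN, the show routine has to recover
the index from the attribute name. A minimal user-space sketch of that step
follows; map_index_from_name() is a hypothetical helper, the value 25 for
MAX_MAPS_PER_ZONE assumes the 5 x 5 limits from the earlier patches, and
the check on the sscanf() return value is an addition of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

#define MAX_MAPS_PER_ZONE 25	/* assumed: 5 sensors * 5 cdevs */

/*
 * Recover the map index from an attribute name such as "map3".
 * Returns the index, or -EINVAL for a malformed or out-of-range name.
 */
static int map_index_from_name(const char *name)
{
	int indx;

	/* sscanf() must convert exactly one integer after "map". */
	if (!name || sscanf(name, "map%d", &indx) != 1)
		return -EINVAL;

	if (indx < 0 || indx >= MAX_MAPS_PER_ZONE)
		return -EINVAL;

	return indx;
}
```

Checking the conversion count means a name that does not match leaves no
uninitialised index behind to be used as an array subscript.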

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/thermal_sys.c |  149 -
 include/linux/thermal.h   |   29 
 2 files changed, 176 insertions(+), 2 deletions(-)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 29ec073..a3adc00 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -506,6 +506,41 @@ static void remove_cdev_from_zone(struct thermal_zone *tz,
tz-cdev_indx--;
 }
 
+static void __clean_map_entry(struct thermal_zone *tz, int i)
+{
+	tz->map[i] = NULL;
+	sysfs_remove_file(tz->kobj_thermal_map, &tz->map_attr[i]->attr.attr);
+	/* Free map attributes */
+	kfree(tz->map_attr[i]);
+	tz->map_attr[i] = NULL;
+}
+
+static void remove_sensor_map_entry(struct thermal_zone *tz,
+				struct thermal_sensor *ts)
+{
+	int i;
+
+	for (i = 0; i < MAX_MAPS_PER_ZONE; i++) {
+		if (tz->map[i] && !strnicmp(ts->name, tz->map[i]->sensor_name,
+						THERMAL_NAME_LENGTH)) {
+			__clean_map_entry(tz, i);
+		}
+	}
+}
+
+static void remove_cdev_map_entry(struct thermal_zone *tz,
+				struct thermal_cooling_device *cdev)
+{
+	int i;
+
+	for (i = 0; i < MAX_MAPS_PER_ZONE; i++) {
+		if (tz->map[i] && !strnicmp(cdev->type, tz->map[i]->cdev_name,
+						THERMAL_NAME_LENGTH)) {
+			__clean_map_entry(tz, i);
+		}
+	}
+}
+
 /* sys I/F for thermal zone */
 
 #define to_thermal_zone(_dev) \
@@ -898,6 +933,52 @@ policy_show(struct device *dev, struct device_attribute *devattr, char *buf)
 }
 
 static ssize_t
+map_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
+{
+	int i, indx, ret = 0;
+	struct thermal_zone *tz;
+	struct thermal_map *map;
+	struct device *dev;
+	char *trip;
+
+	/*
+	 * For maps under /sys/class/thermal/zoneX/thermal_map/mapY:
+	 * attr points to mapY
+	 * kobj points to thermal_map
+	 * kobj->parent points to zoneX
+	 */
+
+	/* Get zone pointer */
+	dev = container_of(kobj->parent, struct device, kobj);
+	tz = to_zone(dev);
+	if (!tz)
+		return -EINVAL;
+
+	sscanf(attr->attr.name, "map%d", &indx);
+
+	if (indx < 0 || indx >= MAX_MAPS_PER_ZONE)
+		return -EINVAL;
+
+	if (!tz->map[indx])
+		return sprintf(buf, "Unavailable\n");
+
+	map = tz->map[indx];
+
+	trip = (map->trip_type == THERMAL_TRIP_ACTIVE) ?
+					"active" : "passive";
+	ret += sprintf(buf, "%s", trip);
+	ret += sprintf(buf + ret, " %s", map->sensor_name);
+	ret += sprintf(buf + ret, " %s", map->cdev_name);
+	ret += sprintf(buf + ret, " 0x%x", map->trip_mask);
+
+	for (i = 0; i < map->num_weights; i++)
+		ret += sprintf(buf + ret, " %d", map->weights[i]);
+
+	ret += sprintf(buf + ret, "\n");
+	return ret;
+}
+
+static ssize_t
 active_show(struct kobject *kobj, struct kobj_attribute *attr, char *buf)
 {
int i, indx, ret = 0;
@@ -1676,8 +1757,10 @@ void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)
 
 	mutex_lock(&zone_list_lock);
 
-	for_each_thermal_zone(tmp_tz)
+	for_each_thermal_zone(tmp_tz) {
 		remove_cdev_from_zone(tmp_tz, cdev);
+		remove_cdev_map_entry(tmp_tz, cdev);
+	}
 
 	mutex_unlock(&zone_list_lock);
 
@@ -1931,12 +2014,19 @@ struct thermal_zone *create_thermal_zone(const char *name, void *devdata)
 	if (!tz->kobj_thermal_trip)
 		goto exit_name;
 
+	tz->kobj_thermal_map = kobject_create_and_add("thermal_map",
+				&tz->device.kobj);
+	if (!tz->kobj_thermal_map)
+		goto exit_trip;
+
 	/* Add this zone to the global list of thermal zones */
 	mutex_lock(&zone_list_lock);
 	list_add_tail(&tz->node, &thermal_zone_list);
 	mutex_unlock(&zone_list_lock);
 	return tz;
 
+exit_trip:
+	kobject_del(tz->kobj_thermal_trip);
 exit_name:
 	device_remove_file(&tz->device, &dev_attr_zone_name);
 exit_unregister:
@@ -2000,6 +2090,12 @@ void remove_thermal_zone(struct thermal_zone *tz)
 			kobject_name(&tz->cdevs[i]->device.kobj));
 	}
 
+	for (i = 0; i < MAX_MAPS_PER_ZONE; i++)
+		__clean_map_entry(tz, i);
+
+	/* Remove /sys/class/thermal/zoneX/thermal_map */
+	kobject_del(tz->kobj_thermal_map);
+

[PATCH 1/8] Thermal: Create sensor level APIs

2012-12-18 Thread Durgadoss R

This patch creates sensor level APIs, in the
generic thermal framework.

A Thermal sensor is a piece of hardware that can report
temperature of the spot in which it is placed. A thermal
sensor driver reads the temperature from this sensor
and reports it out. This kind of driver can be in
any subsystem. If the sensor needs to participate
in platform thermal management, the corresponding
driver can use the APIs introduced in this patch, to
register(or unregister) with the thermal framework.
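
A rough user-space sketch of the callback pattern may help here: the sensor
driver supplies a get_temp operation, and the framework's show routine
calls it and formats the result. The struct sensor, struct sensor_ops,
fixed_get_temp() and sensor_temp_show() definitions below are hypothetical
stand-ins for illustration, not the structures added by this patch, and the
NULL checks are an addition of the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct sensor;

/* Operations a sensor driver provides to the framework. */
struct sensor_ops {
	int (*get_temp)(struct sensor *ts, long *temp);
};

struct sensor {
	const char *name;
	const struct sensor_ops *ops;
	void *devdata;		/* driver-private data */
};

/* Example driver callback: report the value stored in devdata. */
static int fixed_get_temp(struct sensor *ts, long *temp)
{
	*temp = *(long *)ts->devdata;
	return 0;
}

/*
 * Framework side: read the temperature through the driver callback and
 * format it as "<value>\n". Returns the length written or a negative errno.
 */
static int sensor_temp_show(struct sensor *ts, char *buf, size_t size)
{
	long val;
	int ret;

	if (!ts || !ts->ops || !ts->ops->get_temp)
		return -EINVAL;

	ret = ts->ops->get_temp(ts, &val);

	return ret ? ret : snprintf(buf, size, "%ld\n", val);
}
```

The driver only fills in the ops table; it never needs to know which zone,
if any, the sensor ends up in.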

Signed-off-by: Durgadoss R durgados...@intel.com
---
 drivers/thermal/thermal_sys.c |  280 +
 include/linux/thermal.h   |   29 +
 2 files changed, 309 insertions(+)

diff --git a/drivers/thermal/thermal_sys.c b/drivers/thermal/thermal_sys.c
index 8f0f37b..b2becb9 100644
--- a/drivers/thermal/thermal_sys.c
+++ b/drivers/thermal/thermal_sys.c
@@ -45,13 +45,16 @@ MODULE_LICENSE("GPL");
 
 static DEFINE_IDR(thermal_tz_idr);
 static DEFINE_IDR(thermal_cdev_idr);
+static DEFINE_IDR(thermal_sensor_idr);
 static DEFINE_MUTEX(thermal_idr_lock);
 
 static LIST_HEAD(thermal_tz_list);
+static LIST_HEAD(thermal_sensor_list);
 static LIST_HEAD(thermal_cdev_list);
 static LIST_HEAD(thermal_governor_list);
 
 static DEFINE_MUTEX(thermal_list_lock);
+static DEFINE_MUTEX(sensor_list_lock);
 static DEFINE_MUTEX(thermal_governor_lock);
 
 static struct thermal_governor *__find_governor(const char *name)
@@ -421,6 +424,103 @@ static void thermal_zone_device_check(struct work_struct *work)
 #define to_thermal_zone(_dev) \
container_of(_dev, struct thermal_zone_device, device)
 
+#define to_thermal_sensor(_dev) \
+   container_of(_dev, struct thermal_sensor, device)
+
+static ssize_t
+sensor_name_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	return sprintf(buf, "%s\n", ts->name);
+}
+
+static ssize_t
+sensor_temp_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	int ret;
+	long val;
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	ret = ts->ops->get_temp(ts, &val);
+
+	return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+hyst_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	int indx, ret;
+	long val;
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	if (!sscanf(attr->attr.name, "threshold%d_hyst", &indx))
+		return -EINVAL;
+
+	ret = ts->ops->get_hyst(ts, indx, &val);
+
+	return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+hyst_store(struct device *dev, struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	int indx, ret;
+	long val;
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	if (!ts->ops->set_hyst)
+		return -EPERM;
+
+	if (!sscanf(attr->attr.name, "threshold%d_hyst", &indx))
+		return -EINVAL;
+
+	if (kstrtol(buf, 10, &val))
+		return -EINVAL;
+
+	ret = ts->ops->set_hyst(ts, indx, val);
+
+	return ret ? ret : count;
+}
+
+static ssize_t
+threshold_show(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	int indx, ret;
+	long val;
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	if (!sscanf(attr->attr.name, "threshold%d", &indx))
+		return -EINVAL;
+
+	ret = ts->ops->get_threshold(ts, indx, &val);
+
+	return ret ? ret : sprintf(buf, "%ld\n", val);
+}
+
+static ssize_t
+threshold_store(struct device *dev, struct device_attribute *attr,
+			const char *buf, size_t count)
+{
+	int indx, ret;
+	long val;
+	struct thermal_sensor *ts = to_thermal_sensor(dev);
+
+	if (!ts->ops->set_threshold)
+		return -EPERM;
+
+	if (!sscanf(attr->attr.name, "threshold%d", &indx))
+		return -EINVAL;
+
+	if (kstrtol(buf, 10, &val))
+		return -EINVAL;
+
+	ret = ts->ops->set_threshold(ts, indx, val);
+
+	return ret ? ret : count;
+}
+
 static ssize_t
 type_show(struct device *dev, struct device_attribute *attr, char *buf)
 {
@@ -705,6 +805,10 @@ static DEVICE_ATTR(mode, 0644, mode_show, mode_store);
 static DEVICE_ATTR(passive, S_IRUGO | S_IWUSR, passive_show, passive_store);
 static DEVICE_ATTR(policy, S_IRUGO | S_IWUSR, policy_show, policy_store);
 
+/* Thermal sensor attributes */
+static DEVICE_ATTR(sensor_name, 0444, sensor_name_show, NULL);
+static DEVICE_ATTR(temp_input, 0444, sensor_temp_show, NULL);
+
 /* sys I/F for cooling device */
 #define to_cooling_device(_dev)\
container_of(_dev, struct thermal_cooling_device, device)
@@ -1491,6 +1595,182 @@ static void remove_trip_attrs(struct thermal_zone_device *tz)
 }
 
 /**
+ * enable_sensor_thresholds - create sysfs nodes for thresholdX
+ * @ts:the thermal sensor
+ *

[PATCH 5/6] Cleanup header files to build a proper 32 bit VDSO

2012-12-18 Thread stefani

From: Stefani Seibold stef...@seibold.net

To build a proper VDSO for 64 bit and 32 bit from the same source, some
header cleanup is necessary; otherwise gcc -m32 produces a lot of errors
and warnings due to the differences between LP64 and ILP32.

Signed-off-by: Stefani Seibold stef...@seibold.net
---
 arch/x86/mm/init_32.c   | 1 +
 include/linux/clocksource.h | 1 -
 include/linux/time.h| 3 +--
 include/linux/timekeeper_internal.h | 1 +
 include/linux/types.h   | 2 ++
 5 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
index 11a5800..394e563 100644
--- a/arch/x86/mm/init_32.c
+++ b/arch/x86/mm/init_32.c
@@ -52,6 +52,7 @@
 #include <asm/cacheflush.h>
 #include <asm/page_types.h>
 #include <asm/init.h>
+#include <asm/numa_32.h>
 
 unsigned long highstart_pfn, highend_pfn;
 
diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 4dceaf8..84ed093 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -19,7 +19,6 @@
 #include <asm/io.h>
 
 /* clocksource cycle base type */
-typedef u64 cycle_t;
 struct clocksource;
 
 #ifdef CONFIG_ARCH_CLOCKSOURCE_DATA
diff --git a/include/linux/time.h b/include/linux/time.h
index 4d358e9..edfab8a 100644
--- a/include/linux/time.h
+++ b/include/linux/time.h
@@ -2,9 +2,8 @@
 #define _LINUX_TIME_H
 
 # include <linux/cache.h>
-# include <linux/seqlock.h>
 # include <linux/math64.h>
-#include <uapi/linux/time.h>
+# include <uapi/linux/time.h>
 
 extern struct timezone sys_tz;
 
diff --git a/include/linux/timekeeper_internal.h b/include/linux/timekeeper_internal.h
index e1d558e..9a55a0c 100644
--- a/include/linux/timekeeper_internal.h
+++ b/include/linux/timekeeper_internal.h
@@ -9,6 +9,7 @@
 #include <linux/clocksource.h>
 #include <linux/jiffies.h>
 #include <linux/time.h>
+#include <linux/seqlock.h>
 
 /* Structure holding internal timekeeping values. */
 struct timekeeper {
diff --git a/include/linux/types.h b/include/linux/types.h
index 1cc0e4b..3ff59cf 100644
--- a/include/linux/types.h
+++ b/include/linux/types.h
@@ -74,6 +74,8 @@ typedef __kernel_time_t   time_t;
 typedef __kernel_clock_t   clock_t;
 #endif
 
+typedef u64 cycle_t;
+
 #ifndef _CADDR_T
 #define _CADDR_T
 typedef __kernel_caddr_t   caddr_t;
-- 
1.8.0


[PATCH 1/6] Move out seqcount from seqlock.h

2012-12-18 Thread stefani

From: Stefani Seibold stef...@seibold.net

Create a separate seqcount.h file which handles only seqcount. This file
is safe to include in the VDSO, since it uses no in-kernel functionality
such as spinlocks. seqlock.h still includes seqcount.h, so there is no
side effect for current users.
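
For readers unfamiliar with the pattern, here is a minimal user-space
sketch of the sequence-counter protocol that seqcount.h provides. It is
not the kernel code: it uses C11 atomics instead of ACCESS_ONCE() and
smp_rmb()/smp_wmb(), and all demo_* names are hypothetical. Like the
kernel version, it assumes that writers are serialized by the caller.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical user-space analogue of seqcount_t; not the kernel type. */
struct demo_seqcount {
	atomic_uint sequence;
};

/* Data guarded by the counter; atomics keep the racy loads well defined. */
struct demo_time {
	struct demo_seqcount seq;
	_Atomic uint64_t sec;
	_Atomic uint64_t nsec;
};

static void demo_time_init(struct demo_time *t)
{
	atomic_init(&t->seq.sequence, 0);
	atomic_init(&t->sec, 0);
	atomic_init(&t->nsec, 0);
}

/* Reader side: wait until the counter is even (no writer in progress). */
static unsigned demo_read_begin(struct demo_seqcount *s)
{
	unsigned seq;

	while ((seq = atomic_load_explicit(&s->sequence,
					   memory_order_acquire)) & 1)
		;
	return seq;
}

/* Returns 1 if a writer ran since demo_read_begin(), so retry is needed. */
static int demo_read_retry(struct demo_seqcount *s, unsigned start)
{
	/* Order the data loads before re-reading the counter. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence,
				    memory_order_relaxed) != start;
}

/* Writer side: the caller must already exclude other writers. */
static void demo_write_begin(struct demo_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	/* Counter becomes odd: readers spin or retry from here on. */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

static void demo_write_end(struct demo_seqcount *s)
{
	unsigned seq = atomic_load_explicit(&s->sequence,
					    memory_order_relaxed);

	/* Counter becomes even again: the update is published. */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_release);
}

static void demo_set_time(struct demo_time *t, uint64_t sec, uint64_t nsec)
{
	demo_write_begin(&t->seq);
	atomic_store_explicit(&t->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&t->nsec, nsec, memory_order_relaxed);
	demo_write_end(&t->seq);
}

static void demo_get_time(struct demo_time *t, uint64_t *sec, uint64_t *nsec)
{
	unsigned seq;

	do {
		seq = demo_read_begin(&t->seq);
		*sec = atomic_load_explicit(&t->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&t->nsec, memory_order_relaxed);
	} while (demo_read_retry(&t->seq, seq));
}
```

The reader never takes a lock and never writes shared memory, which is
why this part can live in a header that is usable from the VDSO.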

Signed-off-by: Stefani Seibold stef...@seibold.net
---
 include/linux/seqcount.h | 150 +++
 include/linux/seqlock.h  | 145 +
 2 files changed, 151 insertions(+), 144 deletions(-)
 create mode 100644 include/linux/seqcount.h

diff --git a/include/linux/seqcount.h b/include/linux/seqcount.h
new file mode 100644
index 000..b83dff3
--- /dev/null
+++ b/include/linux/seqcount.h
@@ -0,0 +1,150 @@
+/*
+ * Version using sequence counter only.
+ * This can be used when code has its own mutex protecting the
+ * updating starting before the write_seqcount_begin() and ending
+ * after the write_seqcount_end().
+ */
+
+#ifndef __LINUX_SEQCOUNT_H
+#define __LINUX_SEQCOUNT_H
+
+#include <asm/processor.h>
+#include <asm/barrier.h>
+
+typedef struct seqcount {
+   unsigned sequence;
+} seqcount_t;
+
+#define SEQCNT_ZERO { 0 }
+#define seqcount_init(x) do { *(x) = (seqcount_t) SEQCNT_ZERO; } while (0)
+
+/**
+ * __read_seqcount_begin - begin a seq-read critical section (without barrier)
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
+ * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
+ * provided before actually loading any of the variables that are to be
+ * protected in this critical section.
+ *
+ * Use carefully, only in critical code, and comment how the barrier is
+ * provided.
+ */
+static inline unsigned __read_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret;
+
+repeat:
+   ret = ACCESS_ONCE(s->sequence);
+   if (unlikely(ret & 1)) {
+           cpu_relax();
+           goto repeat;
+   }
+   return ret;
+}
+
+/**
+ * read_seqcount_begin - begin a seq-read critical section
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * read_seqcount_begin opens a read critical section of the given seqcount.
+ * Validity of the critical section is tested by checking read_seqcount_retry
+ * function.
+ */
+static inline unsigned read_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret = __read_seqcount_begin(s);
+   smp_rmb();
+   return ret;
+}
+
+/**
+ * raw_seqcount_begin - begin a seq-read critical section
+ * @s: pointer to seqcount_t
+ * Returns: count to be passed to read_seqcount_retry
+ *
+ * raw_seqcount_begin opens a read critical section of the given seqcount.
+ * Validity of the critical section is tested by checking read_seqcount_retry
+ * function.
+ *
+ * Unlike read_seqcount_begin(), this function will not wait for the count
+ * to stabilize. If a writer is active when we begin, we will fail the
+ * read_seqcount_retry() instead of stabilizing at the beginning of the
+ * critical section.
+ */
+static inline unsigned raw_seqcount_begin(const seqcount_t *s)
+{
+   unsigned ret = ACCESS_ONCE(s->sequence);
+   smp_rmb();
+   return ret & ~1;
+}
+
+/**
+ * __read_seqcount_retry - end a seq-read critical section (without barrier)
+ * @s: pointer to seqcount_t
+ * @start: count, from read_seqcount_begin
+ * Returns: 1 if retry is required, else 0
+ *
+ * __read_seqcount_retry is like read_seqcount_retry, but has no smp_rmb()
+ * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
+ * provided before actually loading any of the variables that are to be
+ * protected in this critical section.
+ *
+ * Use carefully, only in critical code, and comment how the barrier is
+ * provided.
+ */
+static inline int __read_seqcount_retry(const seqcount_t *s, unsigned start)
+{
+   return unlikely(s->sequence != start);
+}
+
+/**
+ * read_seqcount_retry - end a seq-read critical section
+ * @s: pointer to seqcount_t
+ * @start: count, from read_seqcount_begin
+ * Returns: 1 if retry is required, else 0
+ *
+ * read_seqcount_retry closes a read critical section of the given seqcount.
+ * If the critical section was invalid, it must be ignored (and typically
+ * retried).
+ */
+static inline int read_seqcount_retry(const seqcount_t *s, unsigned start)
+{
+   smp_rmb();
+
+   return __read_seqcount_retry(s, start);
+}
+
+
+/*
+ * Sequence counter only version assumes that callers are using their
+ * own mutexing.
+ */
+static inline void write_seqcount_begin(seqcount_t *s)
+{
+   s->sequence++;
+   smp_wmb();
+}
+
+static inline void write_seqcount_end(seqcount_t *s)
+{
+   smp_wmb();
+   s->sequence++;
+}
+
+/**
+ * write_seqcount_barrier - invalidate in-progress read-side seq operations
+ * @s: pointer to seqcount_t
+ *
+ * After write_seqcount_barrier, no read-side seq

[PATCH 3/6] Make vsyscall_gtod_data compatible with 32 bit VDSO

2012-12-18 Thread stefani

From: Stefani Seibold stef...@seibold.net

To make vsyscall_gtod_data available to both VDSOs (X86_64 and
IA32_EMULATION), the alignment must be set to 4. Otherwise the code
created with gcc -m32 will fail, since the structure alignment in 32
bit mode is 4 bytes.

There is currently no drawback for X86_64, since the structure members
are in a good order.
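
To illustrate the effect of the attribute, here is a small sketch using a
hypothetical stand-in structure (demo_clock_packed is not the kernel
layout, only a struct with the same member types as the hunk below). With
GCC or Clang, packed plus aligned(4) gives the same alignment and the same
member offsets whether the file is built with -m64 or -m32:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the clock sub-structure; not the kernel layout. */
struct demo_clock_packed {
	uint64_t cycle_last;
	uint64_t mask;
	uint32_t mult;
	uint32_t shift;
} __attribute__((aligned(4), packed));

/* These hold for both -m64 and -m32 builds (GCC/Clang attribute syntax). */
_Static_assert(_Alignof(struct demo_clock_packed) == 4,
	       "structure alignment is forced to 4 bytes");
_Static_assert(sizeof(struct demo_clock_packed) == 24,
	       "no padding is inserted");
_Static_assert(offsetof(struct demo_clock_packed, mask) == 8,
	       "64 bit members stay at their natural offsets");
_Static_assert(offsetof(struct demo_clock_packed, mult) == 16,
	       "32 bit members follow without gaps");

/* Helper so that the layout can also be inspected at run time. */
static size_t demo_clock_alignment(void)
{
	return _Alignof(struct demo_clock_packed);
}
```

Without the attribute, the same structure would typically have 8 byte
alignment on x86_64 and 4 byte alignment with -m32.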

Signed-off-by: Stefani Seibold stef...@seibold.net
---
 arch/x86/include/asm/vgtod.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index eb87b53..86afff8 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -13,7 +13,7 @@ struct vsyscall_gtod_data {
cycle_t mask;
u32 mult;
u32 shift;
-   } clock;
+   } __attribute__((aligned(4),packed)) clock;
 
/* open coded 'struct timespec' */
time_t  wall_time_sec;
@@ -24,7 +24,8 @@ struct vsyscall_gtod_data {
struct timezone sys_tz;
struct timespec wall_time_coarse;
struct timespec monotonic_time_coarse;
-};
+} __attribute__((aligned(4),packed));
+
 extern struct vsyscall_gtod_data vsyscall_gtod_data;
 
 extern void map_vgtod(void);
-- 
1.8.0


[PATCH 6/6] Add 32 bit VDSO support for 32 and 64 bit kernels

2012-12-18 Thread stefani

From: Stefani Seibold stef...@seibold.net

This patch adds support for 32 bit VDSO.

For 32 bit programs running on a 32 bit kernel, the same mechanism is
used as for 64 bit programs running on a 64 bit kernel.

For 32 bit programs running on a 64 bit kernel with IA32_EMULATION, it is
a little more tricky. In this case the VVAR and HPET will be mapped into
the 32 bit address space by cutting off the upper 32 bits, so their
addresses do not change from the point of view of the 32 bit VDSO. The
HPET will be mapped in this case at 0xff5fe000 and the VVAR at 0xff5ff000.
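
As a sketch of the address arithmetic only: the two 64 bit constants
below are an assumption (the 32 bit values quoted above, sign-extended to
64 bit), and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Assumed 64 bit addresses: the 32 bit values above, sign-extended. */
#define DEMO_HPET_ADDR64 0xffffffffff5fe000ULL
#define DEMO_VVAR_ADDR64 0xffffffffff5ff000ULL

/* Drop the upper 32 bits to get the address seen by a 32 bit task. */
static uint32_t demo_addr_to_32(uint64_t addr64)
{
	return (uint32_t)addr64;
}
```

demo_addr_to_32(DEMO_HPET_ADDR64) yields 0xff5fe000 and
demo_addr_to_32(DEMO_VVAR_ADDR64) yields 0xff5ff000, so the low 32 bits
are the same in both views.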

The transformation between the 64 bit in-kernel representation and the
32 bit ABI is also provided.

So we have one VDSO source for all.

Signed-off-by: Stefani Seibold stef...@seibold.net
---
 arch/x86/include/asm/vgtod.h  |   4 +-
 arch/x86/include/asm/vsyscall.h   |   1 -
 arch/x86/include/asm/vvar.h   |   1 +
 arch/x86/kernel/Makefile  |   1 +
 arch/x86/kernel/hpet.c|   9 ++-
 arch/x86/vdso/Makefile|   6 ++
 arch/x86/vdso/vclock_gettime.c| 108 ++
 arch/x86/vdso/vdso32-setup.c  |  43 ++
 arch/x86/vdso/vdso32/vclock_gettime.c |  29 +
 arch/x86/vdso/vdso32/vdso32.lds.S |   3 +
 11 files changed, 179 insertions(+), 32 deletions(-)
 create mode 100644 arch/x86/vdso/vdso32/vclock_gettime.c

diff --git a/arch/x86/include/asm/vgtod.h b/arch/x86/include/asm/vgtod.h
index 86afff8..74c80d4 100644
--- a/arch/x86/include/asm/vgtod.h
+++ b/arch/x86/include/asm/vgtod.h
@@ -1,8 +1,8 @@
 #ifndef _ASM_X86_VGTOD_H
 #define _ASM_X86_VGTOD_H
 
-#include <asm/vsyscall.h>
-#include <linux/clocksource.h>
+#include <linux/seqcount.h>
+#include <uapi/linux/time.h>
 
 struct vsyscall_gtod_data {
seqcount_t  seq;
diff --git a/arch/x86/include/asm/vsyscall.h b/arch/x86/include/asm/vsyscall.h
index eaea1d3..24730cb 100644
--- a/arch/x86/include/asm/vsyscall.h
+++ b/arch/x86/include/asm/vsyscall.h
@@ -14,7 +14,6 @@ enum vsyscall_num {
 #define VSYSCALL_ADDR(vsyscall_nr) (VSYSCALL_START+VSYSCALL_SIZE*(vsyscall_nr))
 
 #ifdef __KERNEL__
-#include <linux/seqlock.h>
 
 #define VGETCPU_RDTSCP 1
 #define VGETCPU_LSL 2
diff --git a/arch/x86/include/asm/vvar.h b/arch/x86/include/asm/vvar.h
index 8084d55..1e71e6c 100644
--- a/arch/x86/include/asm/vvar.h
+++ b/arch/x86/include/asm/vvar.h
@@ -50,5 +50,6 @@
 DECLARE_VVAR(0, volatile unsigned long, jiffies)
 DECLARE_VVAR(16, int, vgetcpu_mode)
 DECLARE_VVAR(128, struct vsyscall_gtod_data, vsyscall_gtod_data)
+DECLARE_VVAR(512, const void __iomem *, vsyscall_hpet)
 
 #undef DECLARE_VVAR
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 91ce48f..298a0b1 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -26,6 +26,7 @@ obj-y += probe_roms.o
 obj-$(CONFIG_X86_32)   += i386_ksyms_32.o
 obj-$(CONFIG_X86_64)   += sys_x86_64.o x8664_ksyms_64.o
 obj-y  += syscall_$(BITS).o
+obj-y  += vsyscall_gtod.o
 obj-$(CONFIG_X86_64)   += vsyscall_64.o
 obj-$(CONFIG_X86_64)   += vsyscall_emu_64.o
 obj-y  += bootflag.o e820.o
diff --git a/arch/x86/kernel/hpet.c b/arch/x86/kernel/hpet.c
index 859bb2d..4b7bb5d 100644
--- a/arch/x86/kernel/hpet.c
+++ b/arch/x86/kernel/hpet.c
@@ -69,14 +69,19 @@ static inline void hpet_writel(unsigned int d, unsigned int a)
 
 #ifdef CONFIG_X86_64
 #include <asm/pgtable.h>
+#else
+#include <asm/vvar.h>
 #endif
 
+DEFINE_VVAR(const void __iomem *, vsyscall_hpet);
+
+#include <linux/mm.h>
+
 static inline void hpet_set_mapping(void)
 {
hpet_virt_address = ioremap_nocache(hpet_address, HPET_MMAP_SIZE);
-#ifdef CONFIG_X86_64
__set_fixmap(VSYSCALL_HPET, hpet_address, PAGE_KERNEL_VVAR_NOCACHE);
-#endif
+   vsyscall_hpet = (const void __iomem *)fix_to_virt(VSYSCALL_HPET);
 }
 
 static inline void hpet_clear_mapping(void)
diff --git a/arch/x86/vdso/Makefile b/arch/x86/vdso/Makefile
index fd14be1..e136314 100644
--- a/arch/x86/vdso/Makefile
+++ b/arch/x86/vdso/Makefile
@@ -145,8 +145,14 @@ KBUILD_AFLAGS_32 := $(filter-out -m64,$(KBUILD_AFLAGS))
 $(vdso32-images:%=$(obj)/%.dbg): KBUILD_AFLAGS = $(KBUILD_AFLAGS_32)
 $(vdso32-images:%=$(obj)/%.dbg): asflags-$(CONFIG_X86_64) += -m32
 
+KBUILD_CFLAGS_32 := $(filter-out -m64,$(KBUILD_CFLAGS))
+KBUILD_CFLAGS_32 := $(filter-out -mcmodel=kernel,$(KBUILD_CFLAGS_32))
+KBUILD_CFLAGS_32 += -m32 -msoft-float -mregparm=3 -freg-struct-return
+$(vdso32-images:%=$(obj)/%.dbg): KBUILD_CFLAGS = $(KBUILD_CFLAGS_32)
+
 $(vdso32-images:%=$(obj)/%.dbg): $(obj)/vdso32-%.so.dbg: FORCE \
 $(obj)/vdso32/vdso32.lds \
+$(obj)/vdso32/vclock_gettime.o \
 $(obj)/vdso32/note.o \
 $(obj)/vdso32/%.o
$(call if_changed,vdso)
diff --git a/arch/x86/vdso/vclock_gettime.c b/arch/x86/vdso/vclock_gettime.c
index 4df6c37..e856bd8 100644
---

[PATCH 0/6] Add 32 bit VDSO time function support

2012-12-18 Thread stefani

From: Stefani Seibold stef...@seibold.net

This patch adds the functions vdso_gettimeofday(), vdso_clock_gettime()
and vdso_time() to the 32 bit VDSO.

The reason for doing this is to get a fast, reliable time stamp. Many
developers use the TSC to get a fast time stamp without knowing the
pitfalls. The VDSO time functions are a fast and reliable alternative,
because the kernel knows the best time source and the P- and C-state of
the CPU.

The helper library for using the VDSO functions can be downloaded at
http://seibold.net/vdso.c
The library is very small, only 228 lines of code. Compile it with
gcc -Wall -O3 -fpic vdso.c -lrt -shared -o libvdso.so
and use it with LD_PRELOAD=path/libvdso.so

This kind of helper needs to be integrated into glibc; for x86 64 bit and
PowerPC it is already there.
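
For anyone who wants to take comparable measurements, here is a minimal
sketch of a timing loop. It is not the benchmark used for the numbers
below, and the demo_* names are hypothetical. Whether clock_gettime()
ends up in the VDSO or in a real system call depends on the libc, the
kernel and any preloaded helper, so the same loop can be run with and
without LD_PRELOAD.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>
#include <time.h>

/* Convert a timespec into a single nanosecond count. */
static int64_t demo_ts_to_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + (int64_t)ts->tv_nsec;
}

/*
 * Average cost in nanoseconds of one clock_gettime(CLOCK_MONOTONIC) call.
 * Returns -1.0 if iterations is 0 or the clock cannot be read.
 */
static double demo_avg_clock_gettime_ns(unsigned iterations)
{
	struct timespec start, end, tmp;
	unsigned i;

	if (iterations == 0 || clock_gettime(CLOCK_MONOTONIC, &start) != 0)
		return -1.0;

	for (i = 0; i < iterations; i++)
		clock_gettime(CLOCK_MONOTONIC, &tmp);

	if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
		return -1.0;

	return (double)(demo_ts_to_ns(&end) - demo_ts_to_ns(&start)) /
	       (double)iterations;
}
```

A call such as demo_avg_clock_gettime_ns(1000000) gives a rough per-call
figure; on older glibc versions the program has to be linked with -lrt.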

Some benchmark results on 32 bit Linux (all measurements are in nanoseconds):

Intel(R) Celeron(TM) CPU 400MHz

Average time kernel call:
 gettimeofday(): 1039
 clock_gettime(): 1578
 time(): 526
Average time VDSO call:
 gettimeofday(): 378
 clock_gettime(): 303
 time(): 60

Celeron(R) Dual-Core CPU T3100 1.90GHz

Average time kernel call:
 gettimeofday(): 209
 clock_gettime(): 406
 time(): 135
Average time VDSO call:
 gettimeofday(): 51
 clock_gettime(): 43
 time(): 10

So you can see a speedup of roughly 2.7x to 13.5x, depending on the
CPU and the function.
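
The speedup factors follow directly from the figures above. A small
sketch to recompute them (the numbers are copied from the tables, the
demo_* names are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

struct demo_result {
	const char *cpu;
	const char *func;
	unsigned kernel_ns;	/* average time of the kernel call */
	unsigned vdso_ns;	/* average time of the VDSO call */
};

/* Figures copied from the benchmark tables above. */
static const struct demo_result demo_results[] = {
	{ "Celeron 400MHz",    "gettimeofday()",  1039, 378 },
	{ "Celeron 400MHz",    "clock_gettime()", 1578, 303 },
	{ "Celeron 400MHz",    "time()",           526,  60 },
	{ "Celeron T3100 1.9", "gettimeofday()",   209,  51 },
	{ "Celeron T3100 1.9", "clock_gettime()",  406,  43 },
	{ "Celeron T3100 1.9", "time()",           135,  10 },
};

#define DEMO_NRESULTS (sizeof(demo_results) / sizeof(demo_results[0]))

/* Speedup factor of the VDSO call over the kernel call. */
static double demo_speedup(const struct demo_result *r)
{
	assert(r->vdso_ns != 0);
	return (double)r->kernel_ns / (double)r->vdso_ns;
}

static double demo_min_speedup(void)
{
	double min = demo_speedup(&demo_results[0]);
	size_t i;

	for (i = 1; i < DEMO_NRESULTS; i++)
		if (demo_speedup(&demo_results[i]) < min)
			min = demo_speedup(&demo_results[i]);
	return min;
}

static double demo_max_speedup(void)
{
	double max = demo_speedup(&demo_results[0]);
	size_t i;

	for (i = 1; i < DEMO_NRESULTS; i++)
		if (demo_speedup(&demo_results[i]) > max)
			max = demo_speedup(&demo_results[i]);
	return max;
}
```

The smallest factor comes from gettimeofday() on the 400MHz Celeron and
the largest from time() on the T3100.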

The patch is against kernel 3.7. Please apply if you like it.

Changelog:
25.11.2012 - first release and proof of concept for linux 3.4
11.12.2012 - Port to linux 3.7 and code cleanup
12.12.2012 - fixes suggested by Andy Lutomirski
   - fixes suggested by John Stultz
   - use call VDSO32_vsyscall instead of int 80
   - code cleanup
17.12.2012 - support for IA32_EMULATION, this includes
 - code cleanup
 - include cleanup to fix compile warnings and errors
 - move out seqcount from seqlock, enable use in VDSO
 - map FIXMAP and HPET into the 32 bit address space
18.12.2012 - split into separate patches