Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-14 Thread Catalin Marinas
On Mon, Jun 10, 2013 at 05:12:08AM +0100, Rob Herring wrote:
> On 06/04/2013 05:21 AM, Russell King - ARM Linux wrote:
> > On Mon, Jun 03, 2013 at 06:51:59PM -0700, Stephen Boyd wrote:
> >> On 06/03/13 15:12, Russell King - ARM Linux wrote:
> >>> If you have a 56-bit clock which ticks at a period of 1ns, then
> >>> cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
> >>> The scheduler always _requires_ 64-bits from sched_clock.  That's why we
> >>> have the complicated code to extend the 32-bits-or-less to a _full_
> >>> 64-bit value.
> >>>
> >>> Let me make this clearer: sched_clock() return values _must_ without
> >>> exception monotonically increment from zero to 2^64-1 and then wrap
> >>> back to zero.  No other behaviour is acceptable for sched_clock().
> >>
> >> Ok so you're saying if we have less than 64 bits of useable information
> >> we _must_ do something to find where the wraparound will occur and
> >> adjust for it so that epoch_ns is always incrementing until 2^64-1. Fair
> >> enough. I was trying to avoid more work because on arm architected timer
> >> platforms it takes many years for that to happen.
> >>
> >> I'll see what I can do.
> > 
> > Well, 56 bits at 1ns intervals is 833 days (2^56 / (10^9*60*60*24)).
> > We used to say that 497 days was enough several years ago, and that got
> > fixed.  We used to say 640K was enough memory for anything, and that
> > got fixed.
> 
> The ARM ARM states a minimum resolution of 40 years AND at least 56-bits
> of resolution. So a 1GHz counter would have to have more than 56 bits.

At a quick calculation, with a full 64-bit counter and a 40-year roll-over
we can have at most a 14.6GHz clock. So we shouldn't just mask the top
8 bits of the counter, as the bottom 56 could roll over in much less time.
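
The figures above can be checked with a small C sketch. It is only an
illustration: the helper names (pow2, max_rate_hz, rollover_years) are
hypothetical, and the 365.25-day year is an assumption, since the thread
does not say which year length was used.

```c
/* Assumption: a Julian year of 365.25 days. */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/* 2^bits as a double (exact for bits <= 64). */
static double pow2(unsigned int bits)
{
	double v = 1.0;

	while (bits--)
		v *= 2.0;
	return v;
}

/* Highest tick rate in Hz at which a counter `bits` wide still needs at
 * least `years` to roll over. */
static double max_rate_hz(unsigned int bits, double years)
{
	return pow2(bits) / (years * SECONDS_PER_YEAR);
}

/* Years until a counter `bits` wide rolls over when ticking at `rate_hz`. */
static double rollover_years(unsigned int bits, double rate_hz)
{
	return pow2(bits) / rate_hz / SECONDS_PER_YEAR;
}
```

Under these assumptions max_rate_hz(64, 40.0) comes out a little above
14.6e9, while rollover_years(56, 14.6e9) is well under a year, which is
the reason for not masking down to 56 bits.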

-- 
Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-10 Thread Russell King - ARM Linux
On Mon, Jun 10, 2013 at 09:31:21PM +0530, anish singh wrote:
> On Mon, Jun 10, 2013 at 9:08 PM, Russell King - ARM Linux
> <li...@arm.linux.org.uk> wrote:
> 
> Least I can do is to say "Thanks".
> > On Mon, Jun 10, 2013 at 08:46:36PM +0530, anish singh wrote:
> >> Probably a trivial question. I was wondering why this particular requirement
> >> exists in the first place. I looked into this commit 112f38a4a3 but couldn't
> >> gather the reason.
> >
> > You're looking at a commit introducing an implementation.  The requirement
> > isn't driven by the implementation.  It's driven by the code and the maths
> > in the core scheduler, and it's been a requirement for years.
> >
> > sched_clock() needs to be monotonic, and needs to wrap at 64-bit, because
> > calculations are done by comparing the difference of two 64-bit values
> > returned from this function.
> 
> Yes, and this is the question. If it is 32 bit then also it can overflow but
> it will happen relatively fast. So I guess that is the reason why we use 64 bit
> and this will avoid recalculations for recalibration.

And that's why 112f38a4a3 is there - to ensure that we extend a 32-bit
or smaller counter all the way up to the full 64-bits.  This replaces
the previous generation code which only extended it to 63-bits.  Problems
were reported!

> > Let's take a trivial example - if you have a 16 bit counter, and you have
> > a value of 0xc000 ns, and next time you read it, it has value 0x0001 ns,
> > then what value do you end up with when you calculate the time passed
> > using 64-bit maths.
> >
> > That's 0x0001 - 0xc000.  The answer is a very big
> > number which is not the correct 16385.  This means that things like process
> > timeslice counting and scheduler fairness are compromised - I'd expect even
> 
> So you mean when counter overflows the scheduler doesn't handle it?

There is no handling of counter overflows at scheduler level because
the specification for sched_clock() is that this function _will_ return
a monotonically increasing 64-bit value from 0 to the maximum 64-bit
value.

The reason for this is that there are popular architectures around
which do this natively, so the powers that be do not want additional
useless code cluttering their architectures.


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-10 Thread anish singh
On Mon, Jun 10, 2013 at 9:08 PM, Russell King - ARM Linux
<li...@arm.linux.org.uk> wrote:

Least I can do is to say "Thanks".
> On Mon, Jun 10, 2013 at 08:46:36PM +0530, anish singh wrote:
>> Probably a trivial question. I was wondering why this particular requirement
>> exists in the first place. I looked into this commit 112f38a4a3 but couldn't
>> gather the reason.
>
> You're looking at a commit introducing an implementation.  The requirement
> isn't driven by the implementation.  It's driven by the code and the maths
> in the core scheduler, and it's been a requirement for years.
>
> sched_clock() needs to be monotonic, and needs to wrap at 64-bit, because
> calculations are done by comparing the difference of two 64-bit values
> returned from this function.

Yes, and this is the question. If it is 32 bit then also it can overflow but
it will happen relatively fast. So I guess that is the reason why we use 64 bit
and this will avoid recalculations for recalibration.
>
> Let's take a trivial example - if you have a 16 bit counter, and you have
> a value of 0xc000 ns, and next time you read it, it has value 0x0001 ns,
> then what value do you end up with when you calculate the time passed
> using 64-bit maths.
>
> That's 0x0001 - 0xc000.  The answer is a very big
> number which is not the correct 16385.  This means that things like process
> timeslice counting and scheduler fairness are compromised - I'd expect even

So you mean when counter overflows the scheduler doesn't handle it?
> more so if you're running RT and this is being used to provide guarantees.


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-10 Thread Russell King - ARM Linux
On Mon, Jun 10, 2013 at 08:46:36PM +0530, anish singh wrote:
> Probably a trivial question. I was wondering why this particular requirement
> exists in the first place. I looked into this commit 112f38a4a3 but couldn't
> gather the reason.

You're looking at a commit introducing an implementation.  The requirement
isn't driven by the implementation.  It's driven by the code and the maths
in the core scheduler, and it's been a requirement for years.

sched_clock() needs to be monotonic, and needs to wrap at 64-bit, because
calculations are done by comparing the difference of two 64-bit values
returned from this function.

Let's take a trivial example - if you have a 16 bit counter, and you have
a value of 0xc000 ns, and next time you read it, it has value 0x0001 ns,
then what value do you end up with when you calculate the time passed
using 64-bit maths.

That's 0x0001 - 0xc000.  The answer is a very big
number which is not the correct 16385.  This means that things like process
timeslice counting and scheduler fairness are compromised - I'd expect even
more so if you're running RT and this is being used to provide guarantees.
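
The 16-bit example can be reproduced with a short C sketch. The helper
names delta_u64 and delta_masked are hypothetical and appear nowhere in
the patch; they only contrast a plain 64-bit subtraction with one that
is masked to the counter width.

```c
#include <stdint.h>

/* Elapsed time computed with plain 64-bit maths, as a caller that
 * assumes a full 64-bit clock would do it. */
static uint64_t delta_u64(uint64_t now, uint64_t then)
{
	return now - then;
}

/* The same subtraction masked to the counter width.
 * Assumption: bits is in the range 1..63. */
static uint64_t delta_masked(uint64_t now, uint64_t then, unsigned int bits)
{
	uint64_t mask = (UINT64_C(1) << bits) - 1;

	return (now - then) & mask;
}
```

With now = 0x0001 and then = 0xc000, delta_u64() yields
0xffffffffffff4001, the "very big number", while delta_masked() with
bits = 16 yields 0x4001, i.e. 16385.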


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-10 Thread anish singh
On Tue, Jun 4, 2013 at 3:42 AM, Russell King - ARM Linux
<li...@arm.linux.org.uk> wrote:
> On Mon, Jun 03, 2013 at 02:11:59PM -0700, Stephen Boyd wrote:
>> On 06/03/13 02:39, Russell King - ARM Linux wrote:
>> > On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
>> >> +}
>> >> +
>> >> +void __init
>> >> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
>> >> +{
>> >> +        if (cd.rate > rate)
>> >> +                return;
>> >> +
>> >> +        BUG_ON(bits <= 32);
>> >> +        WARN_ON(!irqs_disabled());
>> >> +        read_sched_clock_64 = read;
>> >> +        sched_clock_func = sched_clock_64;
>> >> +        cd.rate = rate;
>> >> +        cd.mult = NSEC_PER_SEC / rate;
>> > Here, you don't check that the (2^bits) * mult results in a wrap of the
>> > resulting 64-bit number, which is a _basic_ requirement for sched_clock
>> > (hence all the code for <=32bit clocks, otherwise we wouldn't need this
>> > complexity in the first place.)
>>
>> Ok I will use clocks_calc_mult_shift() here.
>
> No, that's not the problem.
>
> If you have a 56-bit clock which ticks at a period of 1ns, then
> cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
> The scheduler always _requires_ 64-bits from sched_clock.  That's why we
> have the complicated code to extend the 32-bits-or-less to a _full_
> 64-bit value.
>
> Let me make this clearer: sched_clock() return values _must_ without
> exception monotonically increment from zero to 2^64-1 and then wrap
> back to zero.  No other behaviour is acceptable for sched_clock().

Probably a trivial question. I was wondering why this particular requirement
exists in the first place. I looked into this commit 112f38a4a3 but couldn't
gather the reason.



Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-09 Thread Rob Herring
On 06/04/2013 05:21 AM, Russell King - ARM Linux wrote:
> On Mon, Jun 03, 2013 at 06:51:59PM -0700, Stephen Boyd wrote:
>> On 06/03/13 15:12, Russell King - ARM Linux wrote:
>>> If you have a 56-bit clock which ticks at a period of 1ns, then
>>> cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
>>> The scheduler always _requires_ 64-bits from sched_clock.  That's why we
>>> have the complicated code to extend the 32-bits-or-less to a _full_
>>> 64-bit value.
>>>
>>> Let me make this clearer: sched_clock() return values _must_ without
>>> exception monotonically increment from zero to 2^64-1 and then wrap
>>> back to zero.  No other behaviour is acceptable for sched_clock().
>>
>> Ok so you're saying if we have less than 64 bits of useable information
>> we _must_ do something to find where the wraparound will occur and
>> adjust for it so that epoch_ns is always incrementing until 2^64-1. Fair
>> enough. I was trying to avoid more work because on arm architected timer
>> platforms it takes many years for that to happen.
>>
>> I'll see what I can do.
> 
> Well, 56 bits at 1ns intervals is 833 days (2^56 / (10^9*60*60*24)).
> We used to say that 497 days was enough several years ago, and that got
> fixed.  We used to say 640K was enough memory for anything, and that
> got fixed.

The ARM ARM states a minimum resolution of 40 years AND at least 56-bits
of resolution. So a 1GHz counter would have to have more than 56 bits.

Rob



Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-04 Thread Russell King - ARM Linux
On Mon, Jun 03, 2013 at 06:51:59PM -0700, Stephen Boyd wrote:
> On 06/03/13 15:12, Russell King - ARM Linux wrote:
> > If you have a 56-bit clock which ticks at a period of 1ns, then
> > cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
> > The scheduler always _requires_ 64-bits from sched_clock.  That's why we
> > have the complicated code to extend the 32-bits-or-less to a _full_
> > 64-bit value.
> >
> > Let me make this clearer: sched_clock() return values _must_ without
> > exception monotonically increment from zero to 2^64-1 and then wrap
> > back to zero.  No other behaviour is acceptable for sched_clock().
> 
> Ok so you're saying if we have less than 64 bits of useable information
> we _must_ do something to find where the wraparound will occur and
> adjust for it so that epoch_ns is always incrementing until 2^64-1. Fair
> enough. I was trying to avoid more work because on arm architected timer
> platforms it takes many years for that to happen.
> 
> I'll see what I can do.

Well, 56 bits at 1ns intervals is 833 days (2^56 / (10^9*60*60*24)).
We used to say that 497 days was enough several years ago, and that got
fixed.  We used to say 640K was enough memory for anything, and that
got fixed.

Whenever there's a limit, that limit will always be exceeded.  833 days
uptime has already been exceeded by ARM machines - I have one at the
moment:

 11:17:58 up 1082 days, 11:53, 14 users,  load average: 1.20, 1.28, 1.32

and I would not be surprised if there were others around.
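
The day counts quoted above can be reproduced with integer arithmetic.
This is a sketch only: wrap_days is a hypothetical helper, and reading
the 497-day figure as a 32-bit tick counter at 100 Hz (10 ms per tick)
is an assumption that the message itself does not spell out.

```c
#include <stdint.h>

#define NSEC_PER_DAY (UINT64_C(1000000000) * 60 * 60 * 24)

/* Whole days until a counter `bits` wide wraps when each tick lasts
 * `ns_per_tick` nanoseconds.
 * Assumptions: bits < 64 and (2^bits * ns_per_tick) fits in 64 bits. */
static uint64_t wrap_days(unsigned int bits, uint64_t ns_per_tick)
{
	uint64_t span_ns = (UINT64_C(1) << bits) * ns_per_tick;

	return span_ns / NSEC_PER_DAY;
}
```

wrap_days(56, 1) gives 833, and wrap_days(32, 10000000) gives 497, both
short of the 1082-day uptime shown above.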


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-03 Thread Stephen Boyd
On 06/03/13 15:12, Russell King - ARM Linux wrote:
> On Mon, Jun 03, 2013 at 02:11:59PM -0700, Stephen Boyd wrote:
>> On 06/03/13 02:39, Russell King - ARM Linux wrote:
>>> On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
>>>> +}
>>>> +
>>>> +void __init
>>>> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
>>>> +{
>>>> +        if (cd.rate > rate)
>>>> +                return;
>>>> +
>>>> +        BUG_ON(bits <= 32);
>>>> +        WARN_ON(!irqs_disabled());
>>>> +        read_sched_clock_64 = read;
>>>> +        sched_clock_func = sched_clock_64;
>>>> +        cd.rate = rate;
>>>> +        cd.mult = NSEC_PER_SEC / rate;
>>> Here, you don't check that the (2^bits) * mult results in a wrap of the
>>> resulting 64-bit number, which is a _basic_ requirement for sched_clock
>>> (hence all the code for <=32bit clocks, otherwise we wouldn't need this
>>> complexity in the first place.)
>> Ok I will use clocks_calc_mult_shift() here.
> No, that's not the problem.
>
> If you have a 56-bit clock which ticks at a period of 1ns, then
> cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
> The scheduler always _requires_ 64-bits from sched_clock.  That's why we
> have the complicated code to extend the 32-bits-or-less to a _full_
> 64-bit value.
>
> Let me make this clearer: sched_clock() return values _must_ without
> exception monotonically increment from zero to 2^64-1 and then wrap
> back to zero.  No other behaviour is acceptable for sched_clock().

Ok so you're saying if we have less than 64 bits of useable information
we _must_ do something to find where the wraparound will occur and
adjust for it so that epoch_ns is always incrementing until 2^64-1. Fair
enough. I was trying to avoid more work because on arm architected timer
platforms it takes many years for that to happen.

I'll see what I can do.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-03 Thread Russell King - ARM Linux
On Mon, Jun 03, 2013 at 02:11:59PM -0700, Stephen Boyd wrote:
> On 06/03/13 02:39, Russell King - ARM Linux wrote:
> > On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
> >> +}
> >> +
> >> +void __init
> >> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
> >> +{
> >> +        if (cd.rate > rate)
> >> +                return;
> >> +
> >> +        BUG_ON(bits <= 32);
> >> +        WARN_ON(!irqs_disabled());
> >> +        read_sched_clock_64 = read;
> >> +        sched_clock_func = sched_clock_64;
> >> +        cd.rate = rate;
> >> +        cd.mult = NSEC_PER_SEC / rate;
> > Here, you don't check that the (2^bits) * mult results in a wrap of the
> > resulting 64-bit number, which is a _basic_ requirement for sched_clock
> > (hence all the code for <=32bit clocks, otherwise we wouldn't need this
> > complexity in the first place.)
> 
> Ok I will use clocks_calc_mult_shift() here.

No, that's not the problem.

If you have a 56-bit clock which ticks at a period of 1ns, then
cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
The scheduler always _requires_ 64-bits from sched_clock.  That's why we
have the complicated code to extend the 32-bits-or-less to a _full_
64-bit value.

Let me make this clearer: sched_clock() return values _must_ without
exception monotonically increment from zero to 2^64-1 and then wrap
back to zero.  No other behaviour is acceptable for sched_clock().
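
One way to meet that requirement with a counter narrower than 64 bits
is to accumulate masked deltas into a 64-bit nanosecond epoch. The
sketch below is a simplified userspace illustration, not the kernel's
implementation: struct ext_clock and its functions are hypothetical,
the field names only loosely echo cd.epoch_ns and cd.mult from the
patch, there is no locking, and it assumes an integral number of
nanoseconds per tick.

```c
#include <stdint.h>

/* Hypothetical state for extending a narrow counter to 64-bit ns. */
struct ext_clock {
	uint64_t epoch_cyc;	/* raw counter value at the last update */
	uint64_t epoch_ns;	/* nanoseconds accumulated at the last update */
	uint64_t mask;		/* (1 << bits) - 1 */
	uint32_t mult;		/* nanoseconds per tick (assumed integral) */
};

static void ext_clock_init(struct ext_clock *c, unsigned int bits,
			   uint32_t mult, uint64_t now_cyc)
{
	c->mask = bits >= 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
	c->mult = mult;
	c->epoch_cyc = now_cyc & c->mask;
	c->epoch_ns = 0;
}

/* Must be called at least once per wrap period of the raw counter,
 * otherwise a whole wrap goes unnoticed. */
static uint64_t ext_clock_update(struct ext_clock *c, uint64_t now_cyc)
{
	/* Masking the difference absorbs a wrap of the narrow counter. */
	uint64_t delta = (now_cyc - c->epoch_cyc) & c->mask;

	/* epoch_ns itself is a plain u64, so it wraps only at 2^64. */
	c->epoch_ns += delta * c->mult;
	c->epoch_cyc = now_cyc & c->mask;
	return c->epoch_ns;
}
```

Both epochs are kept in their own units (cycles and nanoseconds), so
the subtraction is always cycles minus cycles and only the delta is
scaled by mult.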


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-03 Thread Stephen Boyd
On 06/03/13 02:39, Russell King - ARM Linux wrote:
> On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
>> +static unsigned long long notrace sched_clock_64(void)
>> +{
>> +u64 cyc = read_sched_clock_64() - cd.epoch_ns;
>> +return cyc * cd.mult;
> So, the use of cd.mult implies that the return value from
> read_sched_clock_64() is not nanoseconds but something else.  But then
> we subtract it from the nanoseconds epoch - which has to be nanoseconds
> because you simply return that when suspended.

You're right, it is confusing and broken. I was thinking we may need a
union for epoch_ns but I will try to make it always nanoseconds and see
if that makes the code clearer.

>
>> +}
>> +
>> +void __init
>> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
>> +{
>> +if (cd.rate > rate)
>> +return;
>> +
>> +BUG_ON(bits <= 32);
>> +WARN_ON(!irqs_disabled());
>> +read_sched_clock_64 = read;
>> +sched_clock_func = sched_clock_64;
>> +cd.rate = rate;
>> +cd.mult = NSEC_PER_SEC / rate;
> Here, you don't check that the (2^bits) * mult results in a wrap of the
> resulting 64-bit number, which is a _basic_ requirement for sched_clock
> (hence all the code for <=32bit clocks, otherwise we wouldn't need this
> complexity in the first place.)

Ok I will use clocks_calc_mult_shift() here.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-03 Thread Russell King - ARM Linux
On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
> The ARM architected system counter has at least 56 useable bits.
> Add support for counters with more than 32 bits to the generic
> sched_clock implementation so we can avoid the complexity of
> dealing with wrap-around on these devices while benefiting from
> the irqtime accounting and suspend/resume handling that the
> generic sched_clock code already has.

This looks like a horrid hack to me.

> +static unsigned long long notrace sched_clock_64(void)
> +{
> + u64 cyc = read_sched_clock_64() - cd.epoch_ns;
> + return cyc * cd.mult;

So, the use of cd.mult implies that the return value from
read_sched_clock_64() is not nanoseconds but something else.  But then
we subtract it from the nanoseconds epoch - which has to be nanoseconds
because you simply return that when suspended.

> +}
> +
> +void __init
> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
> +{
> + if (cd.rate > rate)
> + return;
> +
> + BUG_ON(bits <= 32);
> + WARN_ON(!irqs_disabled());
> + read_sched_clock_64 = read;
> + sched_clock_func = sched_clock_64;
> + cd.rate = rate;
> + cd.mult = NSEC_PER_SEC / rate;

Here, you don't check that the (2^bits) * mult results in a wrap of the
resulting 64-bit number, which is a _basic_ requirement for sched_clock
(hence all the code for <=32bit clocks, otherwise we wouldn't need this
complexity in the first place.)

So, I think this whole approach is broken.
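
As a worked example of the check described above, the following user-space sketch computes how many nanoseconds a counter spans before it wraps, and whether (2^bits) * mult reaches 2^64. The helper names raw_span_ns() and covers_full_64bit() are hypothetical; ns_per_tick stands in for cd.mult under the patch's NSEC_PER_SEC / rate assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nanoseconds covered by one full cycle of a 'bits'-wide counter,
 * saturating at UINT64_MAX if the product does not fit in 64 bits.
 */
static uint64_t raw_span_ns(int bits, uint64_t ns_per_tick)
{
	uint64_t ticks;

	assert(bits > 0 && bits < 64 && ns_per_tick > 0);
	ticks = UINT64_C(1) << bits;
	if (ns_per_tick > UINT64_MAX / ticks)
		return UINT64_MAX;
	return ticks * ns_per_tick;
}

/*
 * Returns 1 if (2^bits) * ns_per_tick >= 2^64, i.e. simply scaling the
 * raw counter already wraps at the 64-bit boundary or beyond.
 */
static int covers_full_64bit(int bits, uint64_t ns_per_tick)
{
	assert(bits > 0 && bits < 64);
	return ns_per_tick >= (UINT64_C(1) << (64 - bits));
}
```

For a 56-bit counter at 1ns per tick, raw_span_ns(56, 1) is 2^56 ns, just under 834 days, and covers_full_64bit(56, 1) is 0: the scaled value wraps well short of 2^64.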


Re: [PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-03 Thread Stephen Boyd
On 06/03/13 15:12, Russell King - ARM Linux wrote:
> On Mon, Jun 03, 2013 at 02:11:59PM -0700, Stephen Boyd wrote:
>> On 06/03/13 02:39, Russell King - ARM Linux wrote:
>>> On Sat, Jun 01, 2013 at 11:39:41PM -0700, Stephen Boyd wrote:
>>>> +}
>>>> +
>>>> +void __init
>>>> +setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
>>>> +{
>>>> +  if (cd.rate > rate)
>>>> +  return;
>>>> +
>>>> +  BUG_ON(bits <= 32);
>>>> +  WARN_ON(!irqs_disabled());
>>>> +  read_sched_clock_64 = read;
>>>> +  sched_clock_func = sched_clock_64;
>>>> +  cd.rate = rate;
>>>> +  cd.mult = NSEC_PER_SEC / rate;
>>> Here, you don't check that the (2^bits) * mult results in a wrap of the
>>> resulting 64-bit number, which is a _basic_ requirement for sched_clock
>>> (hence all the code for <=32bit clocks, otherwise we wouldn't need this
>>> complexity in the first place.)
>> Ok I will use clocks_calc_mult_shift() here.
> No, that's not the problem.
>
> If you have a 56-bit clock which ticks at a period of 1ns, then
> cd.rate = 1, and your sched_clock() values will be truncated to 56-bits.
> The scheduler always _requires_ 64-bits from sched_clock.  That's why we
> have the complicated code to extend the 32-bits-or-less to a _full_
> 64-bit value.
>
> Let me make this clearer: sched_clock() return values _must_ without
> exception monotonically increment from zero to 2^64-1 and then wrap
> back to zero.  No other behaviour is acceptable for sched_clock().

Ok so you're saying if we have less than 64 bits of useable information
we _must_ do something to find where the wraparound will occur and
adjust for it so that epoch_ns is always incrementing until 2^64-1. Fair
enough. I was trying to avoid more work because on arm architected timer
platforms it takes many years for that to happen.

I'll see what I can do.

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by The Linux Foundation



[PATCHv2 4/6] sched_clock: Add support for >32 bit sched_clock

2013-06-02 Thread Stephen Boyd
The ARM architected system counter has at least 56 useable bits.
Add support for counters with more than 32 bits to the generic
sched_clock implementation so we can avoid the complexity of
dealing with wrap-around on these devices while benefiting from
the irqtime accounting and suspend/resume handling that the
generic sched_clock code already has.

Signed-off-by: Stephen Boyd <sb...@codeaurora.org>
---
 include/linux/sched_clock.h |   2 +
 kernel/time/sched_clock.c   | 101 
 2 files changed, 77 insertions(+), 26 deletions(-)

diff --git a/include/linux/sched_clock.h b/include/linux/sched_clock.h
index fa7922c..e732b39 100644
--- a/include/linux/sched_clock.h
+++ b/include/linux/sched_clock.h
@@ -18,4 +18,6 @@ extern void setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate);
 
 extern unsigned long long (*sched_clock_func)(void);
 
+extern void setup_sched_clock_64(u64 (*read)(void), int bits,
+unsigned long rate);
 #endif
diff --git a/kernel/time/sched_clock.c b/kernel/time/sched_clock.c
index aad1ae6..482242c 100644
--- a/kernel/time/sched_clock.c
+++ b/kernel/time/sched_clock.c
@@ -43,6 +43,7 @@ static u32 notrace jiffy_sched_clock_read(void)
 }
 
 static u32 __read_mostly (*read_sched_clock)(void) = jiffy_sched_clock_read;
+static u64 __read_mostly (*read_sched_clock_64)(void);
 
 static inline u64 notrace cyc_to_ns(u64 cyc, u32 mult, u32 shift)
 {
@@ -103,24 +104,12 @@ static void sched_clock_poll(unsigned long wrap_ticks)
update_sched_clock();
 }
 
-void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
+static u64 __init sched_clock_calc_wrap(int bits, unsigned long rate)
 {
-   unsigned long r, w;
+   unsigned long r;
u64 res, wrap;
char r_unit;
 
-   if (cd.rate > rate)
-   return;
-
-   BUG_ON(bits > 32);
-   WARN_ON(!irqs_disabled());
-   read_sched_clock = read;
-   sched_clock_mask = (1 << bits) - 1;
-   cd.rate = rate;
-
-   /* calculate the mult/shift to convert counter ticks to ns. */
-   clocks_calc_mult_shift(&cd.mult, &cd.shift, rate, NSEC_PER_SEC, 0);
-
r = rate;
if (r >= 4000000) {
r /= 1000000;
@@ -134,12 +123,39 @@ void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
/* calculate how many ns until we wrap */
wrap = cyc_to_ns((1ULL << bits) - 1, cd.mult, cd.shift);
do_div(wrap, NSEC_PER_MSEC);
-   w = wrap;
 
/* calculate the ns resolution of this counter */
res = cyc_to_ns(1ULL, cd.mult, cd.shift);
-   pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %lums\n",
-   bits, r, r_unit, res, w);
+   pr_info("sched_clock: %u bits at %lu%cHz, resolution %lluns, wraps every %llums\n",
+   bits, r, r_unit, res, wrap);
+
+   return wrap;
+}
+
+static void __init try_to_enable_irqtime(unsigned long rate)
+{
+   /* Enable IRQ time accounting if we have a fast enough sched_clock */
+   if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
+   enable_sched_clock_irqtime();
+}
+
+void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
+{
+   unsigned long w;
+
+   if (cd.rate > rate)
+   return;
+
+   BUG_ON(bits > 32);
+   WARN_ON(!irqs_disabled());
+   read_sched_clock = read;
+   sched_clock_mask = (1 << bits) - 1;
+   cd.rate = rate;
+
+   /* calculate the mult/shift to convert counter ticks to ns. */
+   clocks_calc_mult_shift(&cd.mult, &cd.shift, rate, NSEC_PER_SEC, 0);
+
+   w = sched_clock_calc_wrap(bits, rate);
 
/*
 * Start the timer to keep sched_clock() properly updated and
@@ -153,9 +169,7 @@ void __init setup_sched_clock(u32 (*read)(void), int bits, unsigned long rate)
 */
cd.epoch_ns = 0;
 
-   /* Enable IRQ time accounting if we have a fast enough sched_clock */
-   if (irqtime > 0 || (irqtime == -1 && rate >= 1000000))
-   enable_sched_clock_irqtime();
+   try_to_enable_irqtime(rate);
 
pr_debug("Registered %pF as sched_clock source\n", read);
 }
@@ -168,6 +182,32 @@ static unsigned long long notrace sched_clock_32(void)
 
 unsigned long long __read_mostly (*sched_clock_func)(void) = sched_clock_32;
 
+static unsigned long long notrace sched_clock_64(void)
+{
+   u64 cyc = read_sched_clock_64() - cd.epoch_ns;
+   return cyc * cd.mult;
+}
+
+void __init
+setup_sched_clock_64(u64 (*read)(void), int bits, unsigned long rate)
+{
+   if (cd.rate > rate)
+   return;
+
+   BUG_ON(bits <= 32);
+   WARN_ON(!irqs_disabled());
+   read_sched_clock_64 = read;
+   sched_clock_func = sched_clock_64;
+   cd.rate = rate;
+   cd.mult = NSEC_PER_SEC / rate;
+   cd.epoch_ns = read_sched_clock_64();
+
+   sched_clock_calc_wrap(bits, rate);
+
+   try_to_enable_irqtime(rate);
+   
