Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-28 Thread Will Deacon
On Tue, Jul 26, 2016 at 09:11:49AM -0500, Timur Tabi wrote:
> Will Deacon wrote:
> >The kernel really needs to support both of those platforms :/
> >
> >For the memory-mapped counter registers, the architecture says:
> >
> >   `If the implementation supports 64-bit atomic accesses, then the
> >CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> >
> >which is borderline tautological. If we take the generous reading that
> >this means AArch64 CPUs can use readq (and I'm not completely
> >comfortable with that assertion, particularly as you say that it breaks
> >the model), then you still need to use readq_relaxed here to avoid a
> >DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> >exist over there, and if you use the generic implementation then it's
> >not atomic. In which case, we end up with the current code, as well as a
> >readq_relaxed guarded by a questionable #ifdef that is known to break a
> >supported platform for an unknown performance improvement. Hardly a big
> >win.
> 
> I know Fu dropped this patch, and I don't want to kick a dead horse, but I
> was wondering if it would be okay to do this:
> 
> static u64 arch_counter_get_cntvct_mem(void)
> {
> #ifdef readq_relaxed
>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
> #else
>   u32 vct_lo, vct_hi, tmp_hi;
> 
>   do {
>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   } while (vct_hi != tmp_hi);
> 
>   return ((u64) vct_hi << 32) | vct_lo;
> #endif
> }
> 
> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why
> would the function exist if AArch64 CPUs can't use it?
> 
> Do we need something like ARCH_HAS_64BIT_ATOMIC_READ in order to decide
> whether readq is safe?

No, I'm still not ok with this. If you want to use readq_relaxed we need
the following guarantees:

  1. readq_relaxed is provided by the architecture
  2. readq_relaxed is single-copy atomic from the CPU's perspective
  3. The memory-mapped timer has been integrated in such a way that it
 can be accessed using 64-bit transactions.

(1) is easy, and you have that above. For (2), we just need to avoid
include/linux/io-64-nonatomic-*.h. (3), however, is not something we
can safely probe. If this optimisation really is worthwhile, then we
need to extend the device-tree binding for the counter so that we can
tell the kernel that it's ok to use 64-bit accesses for the counter
without tearing.

I have confirmed this with the architects here at ARM.

Will


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-28 Thread Will Deacon
On Tue, Jul 26, 2016 at 09:11:49AM -0500, Timur Tabi wrote:
> Will Deacon wrote:
> >The kernel really needs to support both of those platforms :/
> >
> >For the memory-mapped counter registers, the architecture says:
> >
> >   `If the implementation supports 64-bit atomic accesses, then the
> >CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> >
> >which is borderline tautological. If we take the generous reading that
> >this means AArch64 CPUs can use readq (and I'm not completely
> >comfortable with that assertion, particularly as you say that it breaks
> >the model), then you still need to use readq_relaxed here to avoid a
> >DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> >exist over there, and if you use the generic implementation then it's
> >not atomic. In which case, we end up with the current code, as well as a
> >readq_relaxed guarded by a questionable #ifdef that is known to break a
> >supported platform for an unknown performance improvement. Hardly a big
> >win.
> 
> I know Fu dropped this patch, and I don't want to kick a dead horse, but I
> was wondering if it would be okay to do this:
> 
> static u64 arch_counter_get_cntvct_mem(void)
> {
> #ifdef readq_relaxed
>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
> #else
>   u32 vct_lo, vct_hi, tmp_hi;
> 
>   do {
>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   } while (vct_hi != tmp_hi);
> 
>   return ((u64) vct_hi << 32) | vct_lo;
> #endif
> }
> 
> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why
> would the function exist if AArch64 CPUs can't use it?
> 
> Do we need something like ARCH_HAS_64BIT_ATOMIC_READ in order to decide
> whether readq is safe?

No, I'm still not ok with this. If you want to use readq_relaxed we need
the following guarantees:

  1. readq_relaxed is provided by the architecture
  2. readq_relaxed is single-copy atomic from the CPU's perspective
  3. The memory-mapped timer has been integrated in such a way that it
 can be accessed using 64-bit transactions.

(1) is easy, and you have that above. For (2), we just need to avoid
include/linux/io-64-nonatomic-*.h. (3), however, is not something we
can safely probe. If this optimisation really is worthwhile, then we
need to extend the device-tree binding for the counter so that we can
tell the kernel that it's ok to use 64-bit accesses for the counter
without tearing.

I have confirmed this with the architects here at ARM.

Will


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Fu Wei
Hi all,

On 27 July 2016 at 11:33, Jisheng Zhang  wrote:
> +1
>
> On Tue, 26 Jul 2016 09:11:49 -0500 Timur Tabi  wrote:
>
>> Will Deacon wrote:
>> > The kernel really needs to support both of those platforms :/
>> >
>> > For the memory-mapped counter registers, the architecture says:
>> >
>> >`If the implementation supports 64-bit atomic accesses, then the
>> > CNTV_CVAL register must be accessible as an atomic 64-bit value.'
>> >
>> > which is borderline tautological. If we take the generous reading that
>> > this means AArch64 CPUs can use readq (and I'm not completely
>> > comfortable with that assertion, particularly as you say that it breaks
>> > the model), then you still need to use readq_relaxed here to avoid a
>> > DSB. Furthermore, what are you going to do for AArch32? readq doesn't
>> > exist over there, and if you use the generic implementation then it's
>> > not atomic. In which case, we end up with the current code, as well as a
>> > readq_relaxed guarded by a questionable #ifdef that is known to break a
>> > supported platform for an unknown performance improvement. Hardly a big
>> > win.
>>
>> I know Fu dropped this patch, and I don't want to kick a dead horse, but
>> I was wondering if it would be okay to do this:
>>
>> static u64 arch_counter_get_cntvct_mem(void)
>> {
>> #ifdef readq_relaxed
>>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
>> #else
>>   u32 vct_lo, vct_hi, tmp_hi;
>>
>>   do {
>>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>>   } while (vct_hi != tmp_hi);
>>
>>   return ((u64) vct_hi << 32) | vct_lo;
>> #endif
>> }
>>
>> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why
>> would the function exist if AArch64 CPUs can't use it?

yes, that is a good idea. Thanks Timur! :-)

>
> +1

I like this idea too, but please allow me to upstream this patch separately,
because this GTDT patchset can work without it, this readq support is
a  optimizing.

I also can see another arm-related driver are using readq in this way(
#ifdef readq): bus/arm-ccn.c
And some other drivers are also doing this.

>
> I measured the performance on berlin arm64 platforms:
>
> compared with original version, using readq_relaxed could reduce
> time of arch_counter_get_cntvct_mem() by about 42%!

Great thanks for your data, :-)

>
> Thanks,
> Jisheng



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Fu Wei
Hi all,

On 27 July 2016 at 11:33, Jisheng Zhang  wrote:
> +1
>
> On Tue, 26 Jul 2016 09:11:49 -0500 Timur Tabi  wrote:
>
>> Will Deacon wrote:
>> > The kernel really needs to support both of those platforms :/
>> >
>> > For the memory-mapped counter registers, the architecture says:
>> >
>> >`If the implementation supports 64-bit atomic accesses, then the
>> > CNTV_CVAL register must be accessible as an atomic 64-bit value.'
>> >
>> > which is borderline tautological. If we take the generous reading that
>> > this means AArch64 CPUs can use readq (and I'm not completely
>> > comfortable with that assertion, particularly as you say that it breaks
>> > the model), then you still need to use readq_relaxed here to avoid a
>> > DSB. Furthermore, what are you going to do for AArch32? readq doesn't
>> > exist over there, and if you use the generic implementation then it's
>> > not atomic. In which case, we end up with the current code, as well as a
>> > readq_relaxed guarded by a questionable #ifdef that is known to break a
>> > supported platform for an unknown performance improvement. Hardly a big
>> > win.
>>
>> I know Fu dropped this patch, and I don't want to kick a dead horse, but
>> I was wondering if it would be okay to do this:
>>
>> static u64 arch_counter_get_cntvct_mem(void)
>> {
>> #ifdef readq_relaxed
>>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
>> #else
>>   u32 vct_lo, vct_hi, tmp_hi;
>>
>>   do {
>>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>>   } while (vct_hi != tmp_hi);
>>
>>   return ((u64) vct_hi << 32) | vct_lo;
>> #endif
>> }
>>
>> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why
>> would the function exist if AArch64 CPUs can't use it?

yes, that is a good idea. Thanks Timur! :-)

>
> +1

I like this idea too, but please allow me to upstream this patch separately,
because this GTDT patchset can work without it, this readq support is
a  optimizing.

I also can see another arm-related driver are using readq in this way(
#ifdef readq): bus/arm-ccn.c
And some other drivers are also doing this.

>
> I measured the performance on berlin arm64 platforms:
>
> compared with original version, using readq_relaxed could reduce
> time of arch_counter_get_cntvct_mem() by about 42%!

Great thanks for your data, :-)

>
> Thanks,
> Jisheng



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Jisheng Zhang
+1

On Tue, 26 Jul 2016 09:11:49 -0500 Timur Tabi  wrote:

> Will Deacon wrote:
> > The kernel really needs to support both of those platforms :/
> >
> > For the memory-mapped counter registers, the architecture says:
> >
> >`If the implementation supports 64-bit atomic accesses, then the
> > CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> >
> > which is borderline tautological. If we take the generous reading that
> > this means AArch64 CPUs can use readq (and I'm not completely
> > comfortable with that assertion, particularly as you say that it breaks
> > the model), then you still need to use readq_relaxed here to avoid a
> > DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> > exist over there, and if you use the generic implementation then it's
> > not atomic. In which case, we end up with the current code, as well as a
> > readq_relaxed guarded by a questionable #ifdef that is known to break a
> > supported platform for an unknown performance improvement. Hardly a big
> > win.  
> 
> I know Fu dropped this patch, and I don't want to kick a dead horse, but 
> I was wondering if it would be okay to do this:
> 
> static u64 arch_counter_get_cntvct_mem(void)
> {
> #ifdef readq_relaxed
>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
> #else
>   u32 vct_lo, vct_hi, tmp_hi;
> 
>   do {
>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   } while (vct_hi != tmp_hi);
> 
>   return ((u64) vct_hi << 32) | vct_lo;
> #endif
> }
> 
> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why 
> would the function exist if AArch64 CPUs can't use it?

+1

I measured the performance on berlin arm64 platforms:

compared with original version, using readq_relaxed could reduce
time of arch_counter_get_cntvct_mem() by about 42%!

Thanks,
Jisheng


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Jisheng Zhang
+1

On Tue, 26 Jul 2016 09:11:49 -0500 Timur Tabi  wrote:

> Will Deacon wrote:
> > The kernel really needs to support both of those platforms :/
> >
> > For the memory-mapped counter registers, the architecture says:
> >
> >`If the implementation supports 64-bit atomic accesses, then the
> > CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> >
> > which is borderline tautological. If we take the generous reading that
> > this means AArch64 CPUs can use readq (and I'm not completely
> > comfortable with that assertion, particularly as you say that it breaks
> > the model), then you still need to use readq_relaxed here to avoid a
> > DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> > exist over there, and if you use the generic implementation then it's
> > not atomic. In which case, we end up with the current code, as well as a
> > readq_relaxed guarded by a questionable #ifdef that is known to break a
> > supported platform for an unknown performance improvement. Hardly a big
> > win.  
> 
> I know Fu dropped this patch, and I don't want to kick a dead horse, but 
> I was wondering if it would be okay to do this:
> 
> static u64 arch_counter_get_cntvct_mem(void)
> {
> #ifdef readq_relaxed
>   return readq_relaxed(arch_counter_base + CNTVCT_LO);
> #else
>   u32 vct_lo, vct_hi, tmp_hi;
> 
>   do {
>   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>   } while (vct_hi != tmp_hi);
> 
>   return ((u64) vct_hi << 32) | vct_lo;
> #endif
> }
> 
> readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why 
> would the function exist if AArch64 CPUs can't use it?

+1

I measured the performance on berlin arm64 platforms:

compared with original version, using readq_relaxed could reduce
time of arch_counter_get_cntvct_mem() by about 42%!

Thanks,
Jisheng


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Timur Tabi

Will Deacon wrote:

The kernel really needs to support both of those platforms :/

For the memory-mapped counter registers, the architecture says:

   `If the implementation supports 64-bit atomic accesses, then the
CNTV_CVAL register must be accessible as an atomic 64-bit value.'

which is borderline tautological. If we take the generous reading that
this means AArch64 CPUs can use readq (and I'm not completely
comfortable with that assertion, particularly as you say that it breaks
the model), then you still need to use readq_relaxed here to avoid a
DSB. Furthermore, what are you going to do for AArch32? readq doesn't
exist over there, and if you use the generic implementation then it's
not atomic. In which case, we end up with the current code, as well as a
readq_relaxed guarded by a questionable #ifdef that is known to break a
supported platform for an unknown performance improvement. Hardly a big
win.


I know Fu dropped this patch, and I don't want to kick a dead horse, but 
I was wondering if it would be okay to do this:


static u64 arch_counter_get_cntvct_mem(void)
{
#ifdef readq_relaxed
return readq_relaxed(arch_counter_base + CNTVCT_LO);
#else
u32 vct_lo, vct_hi, tmp_hi;

do {
vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
} while (vct_hi != tmp_hi);

return ((u64) vct_hi << 32) | vct_lo;
#endif
}

readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why 
would the function exist if AArch64 CPUs can't use it?


Do we need something like ARCH_HAS_64BIT_ATOMIC_READ in order to decide 
whether readq is safe?


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Timur Tabi

Will Deacon wrote:

The kernel really needs to support both of those platforms :/

For the memory-mapped counter registers, the architecture says:

   `If the implementation supports 64-bit atomic accesses, then the
CNTV_CVAL register must be accessible as an atomic 64-bit value.'

which is borderline tautological. If we take the generous reading that
this means AArch64 CPUs can use readq (and I'm not completely
comfortable with that assertion, particularly as you say that it breaks
the model), then you still need to use readq_relaxed here to avoid a
DSB. Furthermore, what are you going to do for AArch32? readq doesn't
exist over there, and if you use the generic implementation then it's
not atomic. In which case, we end up with the current code, as well as a
readq_relaxed guarded by a questionable #ifdef that is known to break a
supported platform for an unknown performance improvement. Hardly a big
win.


I know Fu dropped this patch, and I don't want to kick a dead horse, but 
I was wondering if it would be okay to do this:


static u64 arch_counter_get_cntvct_mem(void)
{
#ifdef readq_relaxed
return readq_relaxed(arch_counter_base + CNTVCT_LO);
#else
u32 vct_lo, vct_hi, tmp_hi;

do {
vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
} while (vct_hi != tmp_hi);

return ((u64) vct_hi << 32) | vct_lo;
#endif
}

readq and readq_relaxed are defined in arch/arm64/include/asm/io.h.  Why 
would the function exist if AArch64 CPUs can't use it?


Do we need something like ARCH_HAS_64BIT_ATOMIC_READ in order to decide 
whether readq is safe?


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Fu Wei
Hi Russell King,

On 26 July 2016 at 06:49, Russell King - ARM Linux
 wrote:
> On Mon, Jul 25, 2016 at 05:31:45PM +0100, Will Deacon wrote:
>> On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
>> > On 25 July 2016 at 23:31, Will Deacon  wrote:
>> > > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
>> > >> From: Fu Wei 
>> > >>
>> > >> This patch simplify arch_counter_get_cntvct_mem function by
>> > >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
>> > >>
>> > >> Signed-off-by: Fu Wei 
>> > >> ---
>> > >>  drivers/clocksource/arm_arch_timer.c | 10 +-
>> > >>  1 file changed, 1 insertion(+), 9 deletions(-)
>> > >>
>> > >> diff --git a/drivers/clocksource/arm_arch_timer.c 
>> > >> b/drivers/clocksource/arm_arch_timer.c
>> > >> index e6fd42d..483d2f9 100644
>> > >> --- a/drivers/clocksource/arm_arch_timer.c
>> > >> +++ b/drivers/clocksource/arm_arch_timer.c
>> > >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>> > >>
>> > >>  static u64 arch_counter_get_cntvct_mem(void)
>> > >>  {
>> > >> - u32 vct_lo, vct_hi, tmp_hi;
>> > >> -
>> > >> - do {
>> > >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> > >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>> > >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> > >> - } while (vct_hi != tmp_hi);
>> > >> -
>> > >> - return ((u64) vct_hi << 32) | vct_lo;
>> > >> + return readq(arch_counter_base + CNTVCT_LO);
>> > >
>> > > Please drop this patch. It doesn't work.
>> >
>> > I am OK to drop this, but could you let me know why it doesn't work?
>> >
>> > I did get some problem on Foundation model about readq, but it works on 
>> > Seattle.
>> > I guess that is a problem of model, but not a code problem.
>> > So I just got confused, why readq  doesn't work,  :-)
>>
>> The kernel really needs to support both of those platforms :/
>>
>> For the memory-mapped counter registers, the architecture says:
>>
>>   `If the implementation supports 64-bit atomic accesses, then the
>>CNTV_CVAL register must be accessible as an atomic 64-bit value.'
>>
>> which is borderline tautological. If we take the generous reading that
>> this means AArch64 CPUs can use readq (and I'm not completely
>> comfortable with that assertion, particularly as you say that it breaks
>> the model), then you still need to use readq_relaxed here to avoid a
>> DSB. Furthermore, what are you going to do for AArch32? readq doesn't
>> exist over there, and if you use the generic implementation then it's
>> not atomic. In which case, we end up with the current code, as well as a
>> readq_relaxed guarded by a questionable #ifdef that is known to break a
>> supported platform for an unknown performance improvement. Hardly a big
>> win.
>>
>> Did you see any performance advantage from this? Given that you've added
>> a DSB, this looks to be extremely premature.
>
> +1, absolutely agreed on the 32-bit ARM bits.

Sorry for misunderstanding it, will drop it in v10.

Great thanks for your help! :-)

>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-26 Thread Fu Wei
Hi Russell King,

On 26 July 2016 at 06:49, Russell King - ARM Linux
 wrote:
> On Mon, Jul 25, 2016 at 05:31:45PM +0100, Will Deacon wrote:
>> On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
>> > On 25 July 2016 at 23:31, Will Deacon  wrote:
>> > > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
>> > >> From: Fu Wei 
>> > >>
>> > >> This patch simplify arch_counter_get_cntvct_mem function by
>> > >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
>> > >>
>> > >> Signed-off-by: Fu Wei 
>> > >> ---
>> > >>  drivers/clocksource/arm_arch_timer.c | 10 +-
>> > >>  1 file changed, 1 insertion(+), 9 deletions(-)
>> > >>
>> > >> diff --git a/drivers/clocksource/arm_arch_timer.c 
>> > >> b/drivers/clocksource/arm_arch_timer.c
>> > >> index e6fd42d..483d2f9 100644
>> > >> --- a/drivers/clocksource/arm_arch_timer.c
>> > >> +++ b/drivers/clocksource/arm_arch_timer.c
>> > >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>> > >>
>> > >>  static u64 arch_counter_get_cntvct_mem(void)
>> > >>  {
>> > >> - u32 vct_lo, vct_hi, tmp_hi;
>> > >> -
>> > >> - do {
>> > >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> > >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>> > >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> > >> - } while (vct_hi != tmp_hi);
>> > >> -
>> > >> - return ((u64) vct_hi << 32) | vct_lo;
>> > >> + return readq(arch_counter_base + CNTVCT_LO);
>> > >
>> > > Please drop this patch. It doesn't work.
>> >
>> > I am OK to drop this, but could you let me know why it doesn't work?
>> >
>> > I did get some problem on Foundation model about readq, but it works on 
>> > Seattle.
>> > I guess that is a problem of model, but not a code problem.
>> > So I just got confused, why readq  doesn't work,  :-)
>>
>> The kernel really needs to support both of those platforms :/
>>
>> For the memory-mapped counter registers, the architecture says:
>>
>>   `If the implementation supports 64-bit atomic accesses, then the
>>CNTV_CVAL register must be accessible as an atomic 64-bit value.'
>>
>> which is borderline tautological. If we take the generous reading that
>> this means AArch64 CPUs can use readq (and I'm not completely
>> comfortable with that assertion, particularly as you say that it breaks
>> the model), then you still need to use readq_relaxed here to avoid a
>> DSB. Furthermore, what are you going to do for AArch32? readq doesn't
>> exist over there, and if you use the generic implementation then it's
>> not atomic. In which case, we end up with the current code, as well as a
>> readq_relaxed guarded by a questionable #ifdef that is known to break a
>> supported platform for an unknown performance improvement. Hardly a big
>> win.
>>
>> Did you see any performance advantage from this? Given that you've added
>> a DSB, this looks to be extremely premature.
>
> +1, absolutely agreed on the 32-bit ARM bits.

Sorry for misunderstanding it, will drop it in v10.

Great thanks for your help! :-)

>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Russell King - ARM Linux
On Mon, Jul 25, 2016 at 05:31:45PM +0100, Will Deacon wrote:
> On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
> > On 25 July 2016 at 23:31, Will Deacon  wrote:
> > > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> > >> From: Fu Wei 
> > >>
> > >> This patch simplify arch_counter_get_cntvct_mem function by
> > >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> > >>
> > >> Signed-off-by: Fu Wei 
> > >> ---
> > >>  drivers/clocksource/arm_arch_timer.c | 10 +-
> > >>  1 file changed, 1 insertion(+), 9 deletions(-)
> > >>
> > >> diff --git a/drivers/clocksource/arm_arch_timer.c 
> > >> b/drivers/clocksource/arm_arch_timer.c
> > >> index e6fd42d..483d2f9 100644
> > >> --- a/drivers/clocksource/arm_arch_timer.c
> > >> +++ b/drivers/clocksource/arm_arch_timer.c
> > >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
> > >>
> > >>  static u64 arch_counter_get_cntvct_mem(void)
> > >>  {
> > >> - u32 vct_lo, vct_hi, tmp_hi;
> > >> -
> > >> - do {
> > >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> > >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> > >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> > >> - } while (vct_hi != tmp_hi);
> > >> -
> > >> - return ((u64) vct_hi << 32) | vct_lo;
> > >> + return readq(arch_counter_base + CNTVCT_LO);
> > >
> > > Please drop this patch. It doesn't work.
> > 
> > I am OK to drop this, but could you let me know why it doesn't work?
> > 
> > I did get some problem on Foundation model about readq, but it works on 
> > Seattle.
> > I guess that is a problem of model, but not a code problem.
> > So I just got confused, why readq  doesn't work,  :-)
> 
> The kernel really needs to support both of those platforms :/
> 
> For the memory-mapped counter registers, the architecture says:
> 
>   `If the implementation supports 64-bit atomic accesses, then the
>CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> 
> which is borderline tautological. If we take the generous reading that
> this means AArch64 CPUs can use readq (and I'm not completely
> comfortable with that assertion, particularly as you say that it breaks
> the model), then you still need to use readq_relaxed here to avoid a
> DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> exist over there, and if you use the generic implementation then it's
> not atomic. In which case, we end up with the current code, as well as a
> readq_relaxed guarded by a questionable #ifdef that is known to break a
> supported platform for an unknown performance improvement. Hardly a big
> win.
> 
> Did you see any performance advantage from this? Given that you've added
> a DSB, this looks to be extremely premature.

+1, absolutely agreed on the 32-bit ARM bits.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Russell King - ARM Linux
On Mon, Jul 25, 2016 at 05:31:45PM +0100, Will Deacon wrote:
> On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
> > On 25 July 2016 at 23:31, Will Deacon  wrote:
> > > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> > >> From: Fu Wei 
> > >>
> > >> This patch simplify arch_counter_get_cntvct_mem function by
> > >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> > >>
> > >> Signed-off-by: Fu Wei 
> > >> ---
> > >>  drivers/clocksource/arm_arch_timer.c | 10 +-
> > >>  1 file changed, 1 insertion(+), 9 deletions(-)
> > >>
> > >> diff --git a/drivers/clocksource/arm_arch_timer.c 
> > >> b/drivers/clocksource/arm_arch_timer.c
> > >> index e6fd42d..483d2f9 100644
> > >> --- a/drivers/clocksource/arm_arch_timer.c
> > >> +++ b/drivers/clocksource/arm_arch_timer.c
> > >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
> > >>
> > >>  static u64 arch_counter_get_cntvct_mem(void)
> > >>  {
> > >> - u32 vct_lo, vct_hi, tmp_hi;
> > >> -
> > >> - do {
> > >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> > >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> > >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> > >> - } while (vct_hi != tmp_hi);
> > >> -
> > >> - return ((u64) vct_hi << 32) | vct_lo;
> > >> + return readq(arch_counter_base + CNTVCT_LO);
> > >
> > > Please drop this patch. It doesn't work.
> > 
> > I am OK to drop this, but could you let me know why it doesn't work?
> > 
> > I did get some problem on Foundation model about readq, but it works on 
> > Seattle.
> > I guess that is a problem of model, but not a code problem.
> > So I just got confused, why readq  doesn't work,  :-)
> 
> The kernel really needs to support both of those platforms :/
> 
> For the memory-mapped counter registers, the architecture says:
> 
>   `If the implementation supports 64-bit atomic accesses, then the
>CNTV_CVAL register must be accessible as an atomic 64-bit value.'
> 
> which is borderline tautological. If we take the generous reading that
> this means AArch64 CPUs can use readq (and I'm not completely
> comfortable with that assertion, particularly as you say that it breaks
> the model), then you still need to use readq_relaxed here to avoid a
> DSB. Furthermore, what are you going to do for AArch32? readq doesn't
> exist over there, and if you use the generic implementation then it's
> not atomic. In which case, we end up with the current code, as well as a
> readq_relaxed guarded by a questionable #ifdef that is known to break a
> supported platform for an unknown performance improvement. Hardly a big
> win.
> 
> Did you see any performance advantage from this? Given that you've added
> a DSB, this looks to be extremely premature.

+1, absolutely agreed on the 32-bit ARM bits.

-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Will Deacon
On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
> On 25 July 2016 at 23:31, Will Deacon  wrote:
> > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> >> From: Fu Wei 
> >>
> >> This patch simplify arch_counter_get_cntvct_mem function by
> >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> >>
> >> Signed-off-by: Fu Wei 
> >> ---
> >>  drivers/clocksource/arm_arch_timer.c | 10 +-
> >>  1 file changed, 1 insertion(+), 9 deletions(-)
> >>
> >> diff --git a/drivers/clocksource/arm_arch_timer.c 
> >> b/drivers/clocksource/arm_arch_timer.c
> >> index e6fd42d..483d2f9 100644
> >> --- a/drivers/clocksource/arm_arch_timer.c
> >> +++ b/drivers/clocksource/arm_arch_timer.c
> >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
> >>
> >>  static u64 arch_counter_get_cntvct_mem(void)
> >>  {
> >> - u32 vct_lo, vct_hi, tmp_hi;
> >> -
> >> - do {
> >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> - } while (vct_hi != tmp_hi);
> >> -
> >> - return ((u64) vct_hi << 32) | vct_lo;
> >> + return readq(arch_counter_base + CNTVCT_LO);
> >
> > Please drop this patch. It doesn't work.
> 
> I am OK to drop this, but could you let me know why it doesn't work?
> 
> I did get some problem on Foundation model about readq, but it works on 
> Seattle.
> I guess that is a problem of model, but not a code problem.
> So I just got confused, why readq  doesn't work,  :-)

The kernel really needs to support both of those platforms :/

For the memory-mapped counter registers, the architecture says:

  `If the implementation supports 64-bit atomic accesses, then the
   CNTV_CVAL register must be accessible as an atomic 64-bit value.'

which is borderline tautological. If we take the generous reading that
this means AArch64 CPUs can use readq (and I'm not completely
comfortable with that assertion, particularly as you say that it breaks
the model), then you still need to use readq_relaxed here to avoid a
DSB. Furthermore, what are you going to do for AArch32? readq doesn't
exist over there, and if you use the generic implementation then it's
not atomic. In which case, we end up with the current code, as well as a
readq_relaxed guarded by a questionable #ifdef that is known to break a
supported platform for an unknown performance improvement. Hardly a big
win.

Did you see any performance advantage from this? Given that you've added
a DSB, this looks to be extremely premature.

Will


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Will Deacon
On Mon, Jul 25, 2016 at 11:55:49PM +0800, Fu Wei wrote:
> On 25 July 2016 at 23:31, Will Deacon  wrote:
> > On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> >> From: Fu Wei 
> >>
> >> This patch simplify arch_counter_get_cntvct_mem function by
> >> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> >>
> >> Signed-off-by: Fu Wei 
> >> ---
> >>  drivers/clocksource/arm_arch_timer.c | 10 +-
> >>  1 file changed, 1 insertion(+), 9 deletions(-)
> >>
> >> diff --git a/drivers/clocksource/arm_arch_timer.c 
> >> b/drivers/clocksource/arm_arch_timer.c
> >> index e6fd42d..483d2f9 100644
> >> --- a/drivers/clocksource/arm_arch_timer.c
> >> +++ b/drivers/clocksource/arm_arch_timer.c
> >> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
> >>
> >>  static u64 arch_counter_get_cntvct_mem(void)
> >>  {
> >> - u32 vct_lo, vct_hi, tmp_hi;
> >> -
> >> - do {
> >> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> >> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> >> - } while (vct_hi != tmp_hi);
> >> -
> >> - return ((u64) vct_hi << 32) | vct_lo;
> >> + return readq(arch_counter_base + CNTVCT_LO);
> >
> > Please drop this patch. It doesn't work.
> 
> I am OK to drop this, but could you let me know why it doesn't work?
> 
> I did get some problem on Foundation model about readq, but it works on 
> Seattle.
> I guess that is a problem of model, but not a code problem.
> So I just got confused, why readq  doesn't work,  :-)

The kernel really needs to support both of those platforms :/

For the memory-mapped counter registers, the architecture says:

  `If the implementation supports 64-bit atomic accesses, then the
   CNTV_CVAL register must be accessible as an atomic 64-bit value.'

which is borderline tautological. If we take the generous reading that
this means AArch64 CPUs can use readq (and I'm not completely
comfortable with that assertion, particularly as you say that it breaks
the model), then you still need to use readq_relaxed here to avoid a
DSB. Furthermore, what are you going to do for AArch32? readq doesn't
exist over there, and if you use the generic implementation then it's
not atomic. In which case, we end up with the current code, as well as a
readq_relaxed guarded by a questionable #ifdef that is known to break a
supported platform for an unknown performance improvement. Hardly a big
win.

Did you see any performance advantage from this? Given that you've added
a DSB, this looks to be extremely premature.

Will


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Timur Tabi

Will Deacon wrote:

>  {
>-   u32 vct_lo, vct_hi, tmp_hi;
>-
>-   do {
>-   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>-   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>-   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>-   } while (vct_hi != tmp_hi);
>-
>-   return ((u64) vct_hi << 32) | vct_lo;
>+   return readq(arch_counter_base + CNTVCT_LO);

Please drop this patch. It doesn't work.


On systems where readq() does work, wouldn't it be more optimal than the 
above while-loop?


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Timur Tabi

Will Deacon wrote:

>  {
>-   u32 vct_lo, vct_hi, tmp_hi;
>-
>-   do {
>-   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>-   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>-   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>-   } while (vct_hi != tmp_hi);
>-
>-   return ((u64) vct_hi << 32) | vct_lo;
>+   return readq(arch_counter_base + CNTVCT_LO);

Please drop this patch. It doesn't work.


On systems where readq() does work, wouldn't it be more optimal than the 
above while-loop?


--
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the
Code Aurora Forum, hosted by The Linux Foundation.


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Fu Wei
Hi Will,

On 25 July 2016 at 23:31, Will Deacon  wrote:
> On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
>> From: Fu Wei 
>>
>> This patch simplify arch_counter_get_cntvct_mem function by
>> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
>>
>> Signed-off-by: Fu Wei 
>> ---
>>  drivers/clocksource/arm_arch_timer.c | 10 +-
>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>> b/drivers/clocksource/arm_arch_timer.c
>> index e6fd42d..483d2f9 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>>
>>  static u64 arch_counter_get_cntvct_mem(void)
>>  {
>> - u32 vct_lo, vct_hi, tmp_hi;
>> -
>> - do {
>> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> - } while (vct_hi != tmp_hi);
>> -
>> - return ((u64) vct_hi << 32) | vct_lo;
>> + return readq(arch_counter_base + CNTVCT_LO);
>
> Please drop this patch. It doesn't work.

I am OK to drop this, but could you let me know why it doesn't work?

I did get some problem on Foundation model about readq, but it works on Seattle.
I guess that is a problem of model, but not a code problem.
So I just got confused, why readq  doesn't work,  :-)


>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/445369.html

I just replied to it, sorry.

>
> Will



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Fu Wei
Hi Will,

On 25 July 2016 at 23:31, Will Deacon  wrote:
> On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
>> From: Fu Wei 
>>
>> This patch simplify arch_counter_get_cntvct_mem function by
>> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
>>
>> Signed-off-by: Fu Wei 
>> ---
>>  drivers/clocksource/arm_arch_timer.c | 10 +-
>>  1 file changed, 1 insertion(+), 9 deletions(-)
>>
>> diff --git a/drivers/clocksource/arm_arch_timer.c 
>> b/drivers/clocksource/arm_arch_timer.c
>> index e6fd42d..483d2f9 100644
>> --- a/drivers/clocksource/arm_arch_timer.c
>> +++ b/drivers/clocksource/arm_arch_timer.c
>> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>>
>>  static u64 arch_counter_get_cntvct_mem(void)
>>  {
>> - u32 vct_lo, vct_hi, tmp_hi;
>> -
>> - do {
>> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
>> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
>> - } while (vct_hi != tmp_hi);
>> -
>> - return ((u64) vct_hi << 32) | vct_lo;
>> + return readq(arch_counter_base + CNTVCT_LO);
>
> Please drop this patch. It doesn't work.

I am OK to drop this, but could you let me know why it doesn't work?

I did get some problem on Foundation model about readq, but it works on Seattle.
I guess that is a problem of model, but not a code problem.
So I just got confused, why readq  doesn't work,  :-)


>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/445369.html

I just replied to it, sorry.

>
> Will



-- 
Best regards,

Fu Wei
Software Engineer
Red Hat


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Will Deacon
On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> From: Fu Wei 
> 
> This patch simplify arch_counter_get_cntvct_mem function by
> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> 
> Signed-off-by: Fu Wei 
> ---
>  drivers/clocksource/arm_arch_timer.c | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c 
> b/drivers/clocksource/arm_arch_timer.c
> index e6fd42d..483d2f9 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>  
>  static u64 arch_counter_get_cntvct_mem(void)
>  {
> - u32 vct_lo, vct_hi, tmp_hi;
> -
> - do {
> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> - } while (vct_hi != tmp_hi);
> -
> - return ((u64) vct_hi << 32) | vct_lo;
> + return readq(arch_counter_base + CNTVCT_LO);

Please drop this patch. It doesn't work.

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/445369.html

Will


Re: [PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread Will Deacon
On Mon, Jul 25, 2016 at 11:27:02PM +0800, fu@linaro.org wrote:
> From: Fu Wei 
> 
> This patch simplify arch_counter_get_cntvct_mem function by
> using readq to get 64-bit CNTVCT value instead of readl_relaxed.
> 
> Signed-off-by: Fu Wei 
> ---
>  drivers/clocksource/arm_arch_timer.c | 10 +-
>  1 file changed, 1 insertion(+), 9 deletions(-)
> 
> diff --git a/drivers/clocksource/arm_arch_timer.c 
> b/drivers/clocksource/arm_arch_timer.c
> index e6fd42d..483d2f9 100644
> --- a/drivers/clocksource/arm_arch_timer.c
> +++ b/drivers/clocksource/arm_arch_timer.c
> @@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
>  
>  static u64 arch_counter_get_cntvct_mem(void)
>  {
> - u32 vct_lo, vct_hi, tmp_hi;
> -
> - do {
> - vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> - vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
> - tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
> - } while (vct_hi != tmp_hi);
> -
> - return ((u64) vct_hi << 32) | vct_lo;
> + return readq(arch_counter_base + CNTVCT_LO);

Please drop this patch. It doesn't work.

http://lists.infradead.org/pipermail/linux-arm-kernel/2016-July/445369.html

Will


[PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread fu . wei
From: Fu Wei 

This patch simplify arch_counter_get_cntvct_mem function by
using readq to get 64-bit CNTVCT value instead of readl_relaxed.

Signed-off-by: Fu Wei 
---
 drivers/clocksource/arm_arch_timer.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index e6fd42d..483d2f9 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
 
 static u64 arch_counter_get_cntvct_mem(void)
 {
-   u32 vct_lo, vct_hi, tmp_hi;
-
-   do {
-   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
-   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
-   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
-   } while (vct_hi != tmp_hi);
-
-   return ((u64) vct_hi << 32) | vct_lo;
+   return readq(arch_counter_base + CNTVCT_LO);
 }
 
 /*
-- 
2.5.5



[PATCH v9 4/9] clocksource/drivers/arm_arch_timer: use readq to get 64-bit CNTVCT

2016-07-25 Thread fu . wei
From: Fu Wei 

This patch simplify arch_counter_get_cntvct_mem function by
using readq to get 64-bit CNTVCT value instead of readl_relaxed.

Signed-off-by: Fu Wei 
---
 drivers/clocksource/arm_arch_timer.c | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/drivers/clocksource/arm_arch_timer.c 
b/drivers/clocksource/arm_arch_timer.c
index e6fd42d..483d2f9 100644
--- a/drivers/clocksource/arm_arch_timer.c
+++ b/drivers/clocksource/arm_arch_timer.c
@@ -418,15 +418,7 @@ u32 arch_timer_get_rate(void)
 
 static u64 arch_counter_get_cntvct_mem(void)
 {
-   u32 vct_lo, vct_hi, tmp_hi;
-
-   do {
-   vct_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
-   vct_lo = readl_relaxed(arch_counter_base + CNTVCT_LO);
-   tmp_hi = readl_relaxed(arch_counter_base + CNTVCT_HI);
-   } while (vct_hi != tmp_hi);
-
-   return ((u64) vct_hi << 32) | vct_lo;
+   return readq(arch_counter_base + CNTVCT_LO);
 }
 
 /*
-- 
2.5.5