Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-24 Thread Marc Gonzalez
On 19/11/2015 14:57, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Marc Gonzalez wrote:
>> On 19/11/2015 12:14, Thomas Gleixner wrote:
>>
>>> So yes, the alignment of the clocksource struct is not longer
>>> relevant. The case where we access clocksource->max_cycles is when
>>> CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
>>> problems to timekeeping than the extra cacheline.
>>>
>>> So the simple solution for this issue is indeed the one liner below.
>>
>> It would make sense to also remove the comment emphasizing the
>> alignment requirement.
> 
> Indeed.

Thomas,

Will you commit that patch yourself?

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-24 Thread Marc Gonzalez
On 19/11/2015 14:57, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Marc Gonzalez wrote:
>> On 19/11/2015 12:14, Thomas Gleixner wrote:
>>
>>> So yes, the alignment of the clocksource struct is not longer
>>> relevant. The case where we access clocksource->max_cycles is when
>>> CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
>>> problems to timekeeping than the extra cacheline.
>>>
>>> So the simple solution for this issue is indeed the one liner below.
>>
>> It would make sense to also remove the comment emphasizing the
>> alignment requirement.
> 
> Indeed.

Thomas,

Will you commit that patch yourself?

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Marc Gonzalez wrote:
> On 19/11/2015 12:14, Thomas Gleixner wrote:
> 
> > So yes, the alignment of the clocksource struct is not longer
> > relevant. The case where we access clocksource->max_cycles is when
> > CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
> > problems to timekeeping than the extra cacheline.
> > 
> > So the simple solution for this issue is indeed the one liner below.
> 
> It would make sense to also remove the comment emphasizing the
> alignment requirement.

Indeed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 12:21, Russell King wrote:

> When I wrote the MMIO clocksource implementation, there was no
> cacheline_aligned on struct clocksource, and the arrangement I came
> to for the structure put the 'reg' and 'read' within the same cache line
> (note that the MMIO clocksource pre-dates Thomas' rearrangement of struct
> clocksource and the addition of the cache line alignment.)  The original
> layout did not have any padding gaps.

For the record, I pointed out the chronology in a previous discussion.
But Thomas didn't comment at the time :-(

http://thread.gmane.org/gmane.linux.ports.arm.kernel/402968/focus=403604

Mason wrote:
> Oh and while I have your attention ;-) I have alignment-related
> questions about clocksource_mmio_init() (commit 442c8176d2) wrt
> Thomas Gleixner's 369db4c952 patch. (I think the two patches
> do not play nice.)

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 12:14, Thomas Gleixner wrote:

> So yes, the alignment of the clocksource struct is not longer
> relevant. The case where we access clocksource->max_cycles is when
> CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
> problems to timekeeping than the extra cacheline.
> 
> So the simple solution for this issue is indeed the one liner below.

It would make sense to also remove the comment emphasizing the
alignment requirement.

Regards.

8<---

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd279a7a8..6a0f86a9a92d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -64,10 +64,6 @@ struct module;
  * @owner: module reference, must be set by clocksource in modules
  */
 struct clocksource {
-   /*
-* Hotpath data, fits in a single cache line when the
-* clocksource itself is cacheline aligned.
-*/
cycle_t (*read)(struct clocksource *cs);
cycle_t mask;
u32 mult;
@@ -95,7 +91,7 @@ struct clocksource {
cycle_t wd_last;
 #endif
struct module *owner;
-} cacheline_aligned;
+};
 
 /*
  * Clock source flags bits::

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:55:46AM +0100, Marc Gonzalez wrote:
> If you just object to the ifdef, then perhaps 'reg' can be included
> unconditionally.
> 
> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 278dd279a7a8..50725fd23ab0 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -69,6 +69,7 @@ struct clocksource {
>* clocksource itself is cacheline aligned.
>*/
>   cycle_t (*read)(struct clocksource *cs);
> + void __iomem *reg;
>   cycle_t mask;
>   u32 mult;
>   u32 shift;
> 
> > The basic cause of this problem is the cacheline_aligned annotation
> > which effectively prevents wrapping struct clocksource to provide
> > implementation specific data.
> 
> True. But note that placing reg inside struct clocksource comes
> for free on 32-bit platforms (it just replaces padding).

Yes, I'd object less if it's just replacing padding.

> > Maybe your idea is that struct clocksource should be bloated with all
> > implementation specific data in the long term?
> 
> reg is not implementation-specific data, right?

That depends on how you look at what consitutes "implementation specific".

The vast majority of clocksource drivers access some non-MMIO register,
or some register that they need to store the __iomem pointer externally
for other reasons.  The original clocksource design did not have any
iomem pointer.

When I wrote the MMIO clocksource implementation, there was no
cacheline_aligned on struct clocksource, and the arrangement I came
to for the structure put the 'reg' and 'read' within the same cache line
(note that the MMIO clocksource pre-dates Thomas' rearrangement of struct
clocksource and the addition of the cache line alignment.)  The original
layout did not have any padding gaps.

So, it was pointless to add a __iomem pointer to the main structure,
which would have bloated the struct for every user, and it made no sense
what so ever to do that.

Things may have changed, and there may be padding, but things have changed
again which actually mean that very little of struct clocksource will be
in a cache line when the ->read function is called, so the whole idea of
fitting things into one cache line here is totally irrelevant anymore.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:

> On Thu, Nov 19, 2015 at 11:42:48AM +0100, Thomas Gleixner wrote:
> > On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > > The basic cause of this problem is the cacheline_aligned annotation
> > > which effectively prevents wrapping struct clocksource to provide
> > > implementation specific data.
> > >
> > > Maybe your idea is that struct clocksource should be bloated with all
> > > implementation specific data in the long term?
> > 
> > Certainly not. That mmio use case is sane enough, but you are right,
> > that we should try to lift that cacheline_aligned restriction.
> 
> I don't think the cache line alignment of struct clocksource matters
> anymore - the core timekeeping code no longer uses struct clocksource
> but instead uses struct timekeeper, which caches much of the data from
> struct clocksource.  The only member of struct clocksource which it
> does access is max_cycles, which is more than 32 bytes into struct
> clocksource.
>
> So, I see no reason to waste memory with all these struct clocksources
> being bloated out to cacheline alignments.  In addition, once
> cacheline_aligned is removed, I see no reason for Marc's change
> either.

Right. I completely forgot that I rewrote the core part some time
ago. I'm older than 50, so I'm entitled to use the beginning Alzheimer
excuse. :)

So yes, the alignment of the clocksource struct is not longer
relevant. The case where we access clocksource->max_cycles is when
CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
problems to timekeeping than the extra cacheline.

So the simple solution for this issue is indeed the one liner below.

Thanks,

tglx

8<---

--- tip.orig/include/linux/clocksource.h
+++ tip/include/linux/clocksource.h
@@ -95,7 +95,7 @@ struct clocksource {
cycle_t wd_last;
 #endif
struct module *owner;
-} cacheline_aligned;
+}
 
 /*
  * Clock source flags bits::

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:42:48AM +0100, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > The basic cause of this problem is the cacheline_aligned annotation
> > which effectively prevents wrapping struct clocksource to provide
> > implementation specific data.
> >
> > Maybe your idea is that struct clocksource should be bloated with all
> > implementation specific data in the long term?
> 
> Certainly not. That mmio use case is sane enough, but you are right,
> that we should try to lift that cacheline_aligned restriction.

I don't think the cache line alignment of struct clocksource matters
anymore - the core timekeeping code no longer uses struct clocksource
but instead uses struct timekeeper, which caches much of the data from
struct clocksource.  The only member of struct clocksource which it
does access is max_cycles, which is more than 32 bytes into struct
clocksource.

So, I see no reason to waste memory with all these struct clocksources
being bloated out to cacheline alignments.  In addition, once
cacheline_aligned is removed, I see no reason for Marc's change
either.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 11:36, Russell King - ARM Linux wrote:
> On Thu, Nov 19, 2015 at 11:33:47AM +0100, Thomas Gleixner wrote:
>> Russell,
>>
>> On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:
>>
>>> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
 Since 'struct clocksource' is cacheline_aligned, gcc must insert
 a lot of padding between reg and clksrc in 'struct clocksource_mmio'
 (for example, L1_CACHE_BYTES = 64 on ARMv7).

 Storing reg within 'struct clocksource' removes unnecessary padding,
 and reg can then be grouped with other hot data. A nice side-effect
 of this patch is making container_of() unnecessary, which makes the
 code a bit simpler.

 On 32-bit platforms, reg fits in the padding between read and mask,
 meaning no downside from storing it there.
>>>
>>> Just swap the order of 'reg' and 'clksrc'.
>>
>> That might reduce the memory footprint, but it does not bring the
>> iomem pointer closer to the other hotpath clocksource data. So we
>> still need to touch at minimum two cache lines for reading the time.
>>
>> With Marc's change we have all hotpath data in a single cacheline.
> 
> Right, and what it's doing is polluting struct clocksource with lots
> of ifdefs which determine how much data is contained in there.  Seems
> to me to be totally insane.

What I find puzzling is that you would consider the read field to be
a bona fide member of struct clocksource, but reg (the address passed
to read) does not belong inside the struct?

If you just object to the ifdef, then perhaps 'reg' can be included
unconditionally.

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd279a7a8..50725fd23ab0 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -69,6 +69,7 @@ struct clocksource {
 * clocksource itself is cacheline aligned.
 */
cycle_t (*read)(struct clocksource *cs);
+   void __iomem *reg;
cycle_t mask;
u32 mult;
u32 shift;

> The basic cause of this problem is the cacheline_aligned annotation
> which effectively prevents wrapping struct clocksource to provide
> implementation specific data.

True. But note that placing reg inside struct clocksource comes
for free on 32-bit platforms (it just replaces padding).

> Maybe your idea is that struct clocksource should be bloated with all
> implementation specific data in the long term?

reg is not implementation-specific data, right?

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> The basic cause of this problem is the cacheline_aligned annotation
> which effectively prevents wrapping struct clocksource to provide
> implementation specific data.
>
> Maybe your idea is that struct clocksource should be bloated with all
> implementation specific data in the long term?

Certainly not. That mmio use case is sane enough, but you are right,
that we should try to lift that cacheline_aligned restriction.

We have 77 instances of static struct clocksource declaration...

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:36:53AM +0100, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > Maybe the cacheline_aligned is inappropriate then, because it means
> > any wrapping of struct clocksource has exactly the same problem.
> 
> We could do that, but that does not necessarily solve the cache
> footprint issue. Aside of that we'd need to add cacheline_aligned
> to quite some of the statically allocated clocksource declarations.

Sorry, but it does solve the cache footprint issue, because it would
have the effect of moving 'reg' right into the same cache line as
the 'read' pointer.

Yes, we'd need cacheline_aligned at the static declarations, but
surely that's better than constantly having to stuff implementation
specific data into struct clocksource.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> On Thu, Nov 19, 2015 at 10:27:33AM +0100, Marc Gonzalez wrote:
> > On 18/11/2015 18:21, Russell King - ARM Linux wrote:
> > > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > >
> > >> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > >> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > >> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > >>
> > >> Storing reg within 'struct clocksource' removes unnecessary padding,
> > >> and reg can then be grouped with other hot data. A nice side-effect
> > >> of this patch is making container_of() unnecessary, which makes the
> > >> code a bit simpler.
> > >>
> > >> On 32-bit platforms, reg fits in the padding between read and mask,
> > >> meaning no downside from storing it there.
> > > 
> > > Just swap the order of 'reg' and 'clksrc'.
> > 
> > You already suggested that the last time (April 1st).
> > What problem is this supposed to solve?
> > Swapping the fields does not change the amount of padding required,
> > and does not place reg close to the hot data.
> > 
> > On a 32-bit platform, with L1_CACHE_BYTES = 64
> > 
> > sizeof(struct unaligned_clocksource) = 80
> > sizeof(struct clocksource) = 128
> > sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
> > sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0
> > 
> > Same amount of padding.
> 
> Maybe the cacheline_aligned is inappropriate then, because it means
> any wrapping of struct clocksource has exactly the same problem.

We could do that, but that does not necessarily solve the cache
footprint issue. Aside of that we'd need to add cacheline_aligned
to quite some of the statically allocated clocksource declarations.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:33:47AM +0100, Thomas Gleixner wrote:
> Russell,
> 
> On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:
> 
> > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > > Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > > a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > > (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > > 
> > > Storing reg within 'struct clocksource' removes unnecessary padding,
> > > and reg can then be grouped with other hot data. A nice side-effect
> > > of this patch is making container_of() unnecessary, which makes the
> > > code a bit simpler.
> > > 
> > > On 32-bit platforms, reg fits in the padding between read and mask,
> > > meaning no downside from storing it there.
> > 
> > Just swap the order of 'reg' and 'clksrc'.
> 
> That might reduce the memory footprint, but it does not bring the
> iomem pointer closer to the other hotpath clocksource data. So we
> still need to touch at minimum two cache lines for reading the time.
> 
> With Marcs change we have all hotpath data in a single cacheline.

Right, and what it's doing is polluting struct clocksource with lots
of ifdefs which determine how much data is contained in there.  Seems
to me to be totally insane.

The basic cause of this problem is the cacheline_aligned annotation
which effectively prevents wrapping struct clocksource to provide
implementation specific data.

Maybe your idea is that struct clocksource should be bloated with all
implementation specific data in the long term?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
Russell,

On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:

> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > 
> > Storing reg within 'struct clocksource' removes unnecessary padding,
> > and reg can then be grouped with other hot data. A nice side-effect
> > of this patch is making container_of() unnecessary, which makes the
> > code a bit simpler.
> > 
> > On 32-bit platforms, reg fits in the padding between read and mask,
> > meaning no downside from storing it there.
> 
> Just swap the order of 'reg' and 'clksrc'.

That might reduce the memory footprint, but it does not bring the
iomem pointer closer to the other hotpath clocksource data. So we
still need to touch at minimum two cache lines for reading the time.

With Marcs change we have all hotpath data in a single cacheline.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 10:27:33AM +0100, Marc Gonzalez wrote:
> On 18/11/2015 18:21, Russell King - ARM Linux wrote:
> > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> >
> >> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> >> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> >> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> >>
> >> Storing reg within 'struct clocksource' removes unnecessary padding,
> >> and reg can then be grouped with other hot data. A nice side-effect
> >> of this patch is making container_of() unnecessary, which makes the
> >> code a bit simpler.
> >>
> >> On 32-bit platforms, reg fits in the padding between read and mask,
> >> meaning no downside from storing it there.
> > 
> > Just swap the order of 'reg' and 'clksrc'.
> 
> You already suggested that the last time (April 1st).
> What problem is this supposed to solve?
> Swapping the fields does not change the amount of padding required,
> and does not place reg close to the hot data.
> 
> On a 32-bit platform, with L1_CACHE_BYTES = 64
> 
> sizeof(struct unaligned_clocksource) = 80
> sizeof(struct clocksource) = 128
> sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
> sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0
> 
> Same amount of padding.

Maybe the cacheline_aligned is inappropriate then, because it means
any wrapping of struct clocksource has exactly the same problem.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 18/11/2015 18:21, Russell King - ARM Linux wrote:

> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
>
>> Since 'struct clocksource' is cacheline_aligned, gcc must insert
>> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
>> (for example, L1_CACHE_BYTES = 64 on ARMv7).
>>
>> Storing reg within 'struct clocksource' removes unnecessary padding,
>> and reg can then be grouped with other hot data. A nice side-effect
>> of this patch is making container_of() unnecessary, which makes the
>> code a bit simpler.
>>
>> On 32-bit platforms, reg fits in the padding between read and mask,
>> meaning no downside from storing it there.
> 
> Just swap the order of 'reg' and 'clksrc'.

You already suggested that the last time (April 1st).
What problem is this supposed to solve?
Swapping the fields does not change the amount of padding required,
and does not place reg close to the hot data.

On a 32-bit platform, with L1_CACHE_BYTES = 64

sizeof(struct unaligned_clocksource) = 80
sizeof(struct clocksource) = 128
sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0

Same amount of padding.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 10:27:33AM +0100, Marc Gonzalez wrote:
> On 18/11/2015 18:21, Russell King - ARM Linux wrote:
> > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> >
> >> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> >> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> >> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> >>
> >> Storing reg within 'struct clocksource' removes unnecessary padding,
> >> and reg can then be grouped with other hot data. A nice side-effect
> >> of this patch is making container_of() unnecessary, which makes the
> >> code a bit simpler.
> >>
> >> On 32-bit platforms, reg fits in the padding between read and mask,
> >> meaning no downside from storing it there.
> > 
> > Just swap the order of 'reg' and 'clksrc'.
> 
> You already suggested that the last time (April 1st).
> What problem is this supposed to solve?
> Swapping the fields does not change the amount of padding required,
> and does not place reg close to the hot data.
> 
> On a 32-bit platform, with L1_CACHE_BYTES = 64
> 
> sizeof(struct unaligned_clocksource) = 80
> sizeof(struct clocksource) = 128
> sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
> sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0
> 
> Same amount of padding.

Maybe the cacheline_aligned is inappropriate then, because it means
any wrapping of struct clocksource has exactly the same problem.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:33:47AM +0100, Thomas Gleixner wrote:
> Russell,
> 
> On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:
> 
> > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > > Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > > a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > > (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > > 
> > > Storing reg within 'struct clocksource' removes unnecessary padding,
> > > and reg can then be grouped with other hot data. A nice side-effect
> > > of this patch is making container_of() unnecessary, which makes the
> > > code a bit simpler.
> > > 
> > > On 32-bit platforms, reg fits in the padding between read and mask,
> > > meaning no downside from storing it there.
> > 
> > Just swap the order of 'reg' and 'clksrc'.
> 
> That might reduce the memory footprint, but it does not bring the
> iomem pointer closer to the other hotpath clocksource data. So we
> still need to touch at minimum two cache lines for reading the time.
> 
> With Marcs change we have all hotpath data in a single cacheline.

Right, and what it's doing is polluting struct clocksource with lots
of ifdefs which determine how much data is contained in there.  Seems
to me to be totally insane.

The basic cause of this problem is the cacheline_aligned annotation
which effectively prevents wrapping struct clocksource to provide
implementation specific data.

Maybe your idea is that struct clocksource should be bloated with all
implementation specific data in the long term?

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> On Thu, Nov 19, 2015 at 10:27:33AM +0100, Marc Gonzalez wrote:
> > On 18/11/2015 18:21, Russell King - ARM Linux wrote:
> > > On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > >
> > >> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > >> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > >> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > >>
> > >> Storing reg within 'struct clocksource' removes unnecessary padding,
> > >> and reg can then be grouped with other hot data. A nice side-effect
> > >> of this patch is making container_of() unnecessary, which makes the
> > >> code a bit simpler.
> > >>
> > >> On 32-bit platforms, reg fits in the padding between read and mask,
> > >> meaning no downside from storing it there.
> > > 
> > > Just swap the order of 'reg' and 'clksrc'.
> > 
> > You already suggested that the last time (April 1st).
> > What problem is this supposed to solve?
> > Swapping the fields does not change the amount of padding required,
> > and does not place reg close to the hot data.
> > 
> > On a 32-bit platform, with L1_CACHE_BYTES = 64
> > 
> > sizeof(struct unaligned_clocksource) = 80
> > sizeof(struct clocksource) = 128
> > sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
> > sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0
> > 
> > Same amount of padding.
> 
> Maybe the cacheline_aligned is inappropriate then, because it means
> any wrapping of struct clocksource has exactly the same problem.

We could do that, but that does not necessarily solve the cache
footprint issue. Aside of that we'd need to add cacheline_aligned
to quite some of the statically allocated clocksource declarations.

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:36:53AM +0100, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > Maybe the cacheline_aligned is inappropriate then, because it means
> > any wrapping of struct clocksource has exactly the same problem.
> 
> We could do that, but that does not necessarily solve the cache
> footprint issue. Aside of that we'd need to add cacheline_aligned
> to quite some of the statically allocated clocksource declarations.

Sorry, but it does solve the cache footprint issue, because it would
have the effect of moving 'reg' right into the same cache line as
the 'read' pointer.

Yes, we'd need cacheline_aligned at the static declarations, but
surely that's better than constantly having to stuff implementation
specific data into struct clocksource.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> The basic cause of this problem is the cacheline_aligned annotation
> which effectively prevents wrapping struct clocksource to provide
> implementation specific data.
>
> Maybe your idea is that struct clocksource should be bloated with all
> implementation specific data in the long term?

Certainly not. That mmio use case is sane enough, but you are right,
that we should try to lift that cacheline_aligned restriction.

We have 77 instances of static struct clocksource declaration...

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 18/11/2015 18:21, Russell King - ARM Linux wrote:

> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
>
>> Since 'struct clocksource' is cacheline_aligned, gcc must insert
>> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
>> (for example, L1_CACHE_BYTES = 64 on ARMv7).
>>
>> Storing reg within 'struct clocksource' removes unnecessary padding,
>> and reg can then be grouped with other hot data. A nice side-effect
>> of this patch is making container_of() unnecessary, which makes the
>> code a bit simpler.
>>
>> On 32-bit platforms, reg fits in the padding between read and mask,
>> meaning no downside from storing it there.
> 
> Just swap the order of 'reg' and 'clksrc'.

You already suggested that the last time (April 1st).
What problem is this supposed to solve?
Swapping the fields does not change the amount of padding required,
and does not place reg close to the hot data.

On a 32-bit platform, with L1_CACHE_BYTES = 64

sizeof(struct unaligned_clocksource) = 80
sizeof(struct clocksource) = 128
sizeof(struct clocksource_mmio)  = 192, reg at +0,   clksrc at +64
sizeof(struct clocksource_mmio2) = 192, reg at +128, clksrc at +0

Same amount of padding.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 12:21, Russell King wrote:

> When I wrote the MMIO clocksource implementation, there was no
> cacheline_aligned on struct clocksource, and the arrangement I came
> to for the structure put the 'reg' and 'read' within the same cache line
> (note that the MMIO clocksource pre-dates Thomas' rearrangement of struct
> clocksource and the addition of the cache line alignment.)  The original
> layout did not have any padding gaps.

For the record, I pointed out the chronology in a previous discussion.
But Thomas didn't comment at the time :-(

http://thread.gmane.org/gmane.linux.ports.arm.kernel/402968/focus=403604

Mason wrote:
> Oh and while I have your attention ;-) I have alignment-related
> questions about clocksource_mmio_init() (commit 442c8176d2) wrt
> Thomas Gleixner's 369db4c952 patch. (I think the two patches
> do not play nice.)

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 12:14, Thomas Gleixner wrote:

> So yes, the alignment of the clocksource struct is not longer
> relevant. The case where we access clocksource->max_cycles is when
> CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
> problems to timekeeping than the extra cacheline.
> 
> So the simple solution for this issue is indeed the one liner below.

It would make sense to also remove the comment emphasizing the
alignment requirement.

Regards.

8<---

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd279a7a8..6a0f86a9a92d 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -64,10 +64,6 @@ struct module;
  * @owner: module reference, must be set by clocksource in modules
  */
 struct clocksource {
-   /*
-* Hotpath data, fits in a single cache line when the
-* clocksource itself is cacheline aligned.
-*/
cycle_t (*read)(struct clocksource *cs);
cycle_t mask;
u32 mult;
@@ -95,7 +91,7 @@ struct clocksource {
cycle_t wd_last;
 #endif
struct module *owner;
-} cacheline_aligned;
+};
 
 /*
  * Clock source flags bits::

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:

> On Thu, Nov 19, 2015 at 11:42:48AM +0100, Thomas Gleixner wrote:
> > On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > > The basic cause of this problem is the cacheline_aligned annotation
> > > which effectively prevents wrapping struct clocksource to provide
> > > implementation specific data.
> > >
> > > Maybe your idea is that struct clocksource should be bloated with all
> > > implementation specific data in the long term?
> > 
> > Certainly not. That mmio use case is sane enough, but you are right,
> > that we should try to lift that cacheline_aligned restriction.
> 
> I don't think the cache line alignment of struct clocksource matters
> anymore - the core timekeeping code no longer uses struct clocksource
> but instead uses struct timekeeper, which caches much of the data from
> struct clocksource.  The only member of struct clocksource which it
> does access is max_cycles, which is more than 32 bytes into struct
> clocksource.
>
> So, I see no reason to waste memory with all these struct clocksources
> being bloated out to cacheline alignments.  In addition, once
> cacheline_aligned is removed, I see no reason for Marc's change
> either.

Right. I completely forgot that I rewrote the core part some time
ago. I'm older than 50, so I'm entitled to use the beginning Alzheimer
excuse. :)

So yes, the alignment of the clocksource struct is not longer
relevant. The case where we access clocksource->max_cycles is when
CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
problems to timekeeping than the extra cacheline.

So the simple solution for this issue is indeed the one liner below.

Thanks,

tglx

8<---

--- tip.orig/include/linux/clocksource.h
+++ tip/include/linux/clocksource.h
@@ -95,7 +95,7 @@ struct clocksource {
cycle_t wd_last;
 #endif
struct module *owner;
-} cacheline_aligned;
+}
 
 /*
  * Clock source flags bits::

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:42:48AM +0100, Thomas Gleixner wrote:
> On Thu, 19 Nov 2015, Russell King - ARM Linux wrote:
> > The basic cause of this problem is the cacheline_aligned annotation
> > which effectively prevents wrapping struct clocksource to provide
> > implementation specific data.
> >
> > Maybe your idea is that struct clocksource should be bloated with all
> > implementation specific data in the long term?
> 
> Certainly not. That mmio use case is sane enough, but you are right,
> that we should try to lift that cacheline_aligned restriction.

I don't think the cache line alignment of struct clocksource matters
anymore - the core timekeeping code no longer uses struct clocksource
but instead uses struct timekeeper, which caches much of the data from
struct clocksource.  The only member of struct clocksource which it
does access is max_cycles, which is more than 32 bytes into struct
clocksource.

So, I see no reason to waste memory with all these struct clocksources
being bloated out to cacheline alignments.  In addition, once
cacheline_aligned is removed, I see no reason for Marc's change
either.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Russell King - ARM Linux
On Thu, Nov 19, 2015 at 11:55:46AM +0100, Marc Gonzalez wrote:
> If you just object to the ifdef, then perhaps 'reg' can be included
> unconditionally.
> 
> diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
> index 278dd279a7a8..50725fd23ab0 100644
> --- a/include/linux/clocksource.h
> +++ b/include/linux/clocksource.h
> @@ -69,6 +69,7 @@ struct clocksource {
>* clocksource itself is cacheline aligned.
>*/
>   cycle_t (*read)(struct clocksource *cs);
> + void __iomem *reg;
>   cycle_t mask;
>   u32 mult;
>   u32 shift;
> 
> > The basic cause of this problem is the cacheline_aligned annotation
> > which effectively prevents wrapping struct clocksource to provide
> > implementation specific data.
> 
> True. But note that placing reg inside struct clocksource comes
> for free on 32-bit platforms (it just replaces padding).

Yes, I'd object less if it's just replacing padding.

> > Maybe your idea is that struct clocksource should be bloated with all
> > implementation specific data in the long term?
> 
> reg is not implementation-specific data, right?

That depends on how you look at what consitutes "implementation specific".

The vast majority of clocksource drivers access some non-MMIO register,
or some register that they need to store the __iomem pointer externally
for other reasons.  The original clocksource design did not have any
iomem pointer.

When I wrote the MMIO clocksource implementation, there was no
cacheline_aligned on struct clocksource, and the arrangement I came
to for the structure put the 'reg' and 'read' within the same cache line
(note that the MMIO clocksource pre-dates Thomas' rearrangement of struct
clocksource and the addition of the cache line alignment.)  The original
layout did not have any padding gaps.

So, it was pointless to add a __iomem pointer to the main structure,
which would have bloated the struct for every user, and it made no sense
what so ever to do that.

Things may have changed, and there may be padding, but things have changed
again which actually mean that very little of struct clocksource will be
in a cache line when the ->read function is called, so the whole idea of
fitting things into one cache line here is totally irrelevant anymore.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
Russell,

On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:

> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> > Since 'struct clocksource' is cacheline_aligned, gcc must insert
> > a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> > (for example, L1_CACHE_BYTES = 64 on ARMv7).
> > 
> > Storing reg within 'struct clocksource' removes unnecessary padding,
> > and reg can then be grouped with other hot data. A nice side-effect
> > of this patch is making container_of() unnecessary, which makes the
> > code a bit simpler.
> > 
> > On 32-bit platforms, reg fits in the padding between read and mask,
> > meaning no downside from storing it there.
> 
> Just swap the order of 'reg' and 'clksrc'.

That might reduce the memory footprint, but it does not bring the
iomem pointer closer to the other hotpath clocksource data. So we
still need to touch at minimum two cache lines for reading the time.

With Marcs change we have all hotpath data in a single cacheline.

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Marc Gonzalez
On 19/11/2015 11:36, Russell King - ARM Linux wrote:
> On Thu, Nov 19, 2015 at 11:33:47AM +0100, Thomas Gleixner wrote:
>> Russell,
>>
>> On Wed, 18 Nov 2015, Russell King - ARM Linux wrote:
>>
>>> On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
 Since 'struct clocksource' is cacheline_aligned, gcc must insert
 a lot of padding between reg and clksrc in 'struct clocksource_mmio'
 (for example, L1_CACHE_BYTES = 64 on ARMv7).

 Storing reg within 'struct clocksource' removes unnecessary padding,
 and reg can then be grouped with other hot data. A nice side-effect
 of this patch is making container_of() unnecessary, which makes the
 code a bit simpler.

 On 32-bit platforms, reg fits in the padding between read and mask,
 meaning no downside from storing it there.
>>>
>>> Just swap the order of 'reg' and 'clksrc'.
>>
>> That might reduce the memory footprint, but it does not bring the
>> iomem pointer closer to the other hotpath clocksource data. So we
>> still need to touch at minimum two cache lines for reading the time.
>>
>> With Marc's change we have all hotpath data in a single cacheline.
> 
> Right, and what it's doing is polluting struct clocksource with lots
> of ifdefs which determine how much data is contained in there.  Seems
> to me to be totally insane.

What I find puzzling is that you would consider the read field to be
a bona fide member of struct clocksource, but reg (the address passed
to read) does not belong inside the struct?

If you just object to the ifdef, then perhaps 'reg' can be included
unconditionally.

diff --git a/include/linux/clocksource.h b/include/linux/clocksource.h
index 278dd279a7a8..50725fd23ab0 100644
--- a/include/linux/clocksource.h
+++ b/include/linux/clocksource.h
@@ -69,6 +69,7 @@ struct clocksource {
 * clocksource itself is cacheline aligned.
 */
cycle_t (*read)(struct clocksource *cs);
+   void __iomem *reg;
cycle_t mask;
u32 mult;
u32 shift;

> The basic cause of this problem is the cacheline_aligned annotation
> which effectively prevents wrapping struct clocksource to provide
> implementation specific data.

True. But note that placing reg inside struct clocksource comes
for free on 32-bit platforms (it just replaces padding).

> Maybe your idea is that struct clocksource should be bloated with all
> implementation specific data in the long term?

reg is not implementation-specific data, right?

Regards.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-19 Thread Thomas Gleixner
On Thu, 19 Nov 2015, Marc Gonzalez wrote:
> On 19/11/2015 12:14, Thomas Gleixner wrote:
> 
> > So yes, the alignment of the clocksource struct is not longer
> > relevant. The case where we access clocksource->max_cycles is when
> > CONFIG_DEBUG_TIMEKEEPING is enabled, which imposes worse performance
> > problems to timekeeping than the extra cacheline.
> > 
> > So the simple solution for this issue is indeed the one liner below.
> 
> It would make sense to also remove the comment emphasizing the
> alignment requirement.

Indeed.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-18 Thread Russell King - ARM Linux
On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> 
> Storing reg within 'struct clocksource' removes unnecessary padding,
> and reg can then be grouped with other hot data. A nice side-effect
> of this patch is making container_of() unnecessary, which makes the
> code a bit simpler.
> 
> On 32-bit platforms, reg fits in the padding between read and mask,
> meaning no downside from storing it there.

Just swap the order of 'reg' and 'clksrc'.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-18 Thread Måns Rullgård
Marc Gonzalez  writes:

> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> (for example, L1_CACHE_BYTES = 64 on ARMv7).
>
> Storing reg within 'struct clocksource' removes unnecessary padding,
> and reg can then be grouped with other hot data.

Can you demonstrate a difference with this change?  Not saying it's bad,
but it's always good to have numbers.

> A nice side-effect of this patch is making container_of() unnecessary,
> which makes the code a bit simpler.

You really need to get used to that construct.

-- 
Måns Rullgård
m...@mansr.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-18 Thread Russell King - ARM Linux
On Wed, Nov 18, 2015 at 02:43:34PM +0100, Marc Gonzalez wrote:
> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> (for example, L1_CACHE_BYTES = 64 on ARMv7).
> 
> Storing reg within 'struct clocksource' removes unnecessary padding,
> and reg can then be grouped with other hot data. A nice side-effect
> of this patch is making container_of() unnecessary, which makes the
> code a bit simpler.
> 
> On 32-bit platforms, reg fits in the padding between read and mask,
> meaning no downside from storing it there.

Just swap the order of 'reg' and 'clksrc'.

-- 
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] clocksource: Store reg field within struct clocksource

2015-11-18 Thread Måns Rullgård
Marc Gonzalez  writes:

> Since 'struct clocksource' is cacheline_aligned, gcc must insert
> a lot of padding between reg and clksrc in 'struct clocksource_mmio'
> (for example, L1_CACHE_BYTES = 64 on ARMv7).
>
> Storing reg within 'struct clocksource' removes unnecessary padding,
> and reg can then be grouped with other hot data.

Can you demonstrate a difference with this change?  Not saying it's bad,
but it's always good to have numbers.

> A nice side-effect of this patch is making container_of() unnecessary,
> which makes the code a bit simpler.

You really need to get used to that construct.

-- 
Måns Rullgård
m...@mansr.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/