Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Luck, Tony
On Mon, Oct 17, 2016 at 09:43:41AM -0700, Yu, Fenghua wrote:
> > > > I wonder whether this is the proper abstraction level. We might as
> > > > well do the following:
> > > >
> > > > rdtresources[] = {
> > > >  {
> > > > .name   = "L3",
> > > >  },
> > > >  {
> > > > .name   = "L3Data",
> > > >  },
> > > >  {
> > > > .name   = "L3Code",
> > > >  },
> > > >
> > > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > > simpler, but it's definitely worth a thought or two.
> > >
> > > This way will be better than having cdp_enabled/capable for L3 and not
> > > for L2.  And this doesn't change current userinterface design either,
> > > I think.
> > 
> > User interface would change if you did this. The schemata file would look 
> > like
> > this with CDP enabled:
> > 
> > # cat schemata
> > L3Data:0=f;1=f;2=f;3=f
> > L3Code:0=f;1=f;2=f;3=f
> > 
> > but that is easier to read than the current:
> > 
> > # cat schemata
> > L3:0=f,f;1=f,f;2=f,f;3=f,f
> > 
> > which gives you no clue on which mask is code and which is data.
> 
> Right.
> 
> Also changing to uniform format :=cbm1;=cbm2;...
> is lot easier to parse schemata line in CDP mode.
> 
> So I'll change the code and doc to have two new resources: L3Data and L3Code 
> for CDP mode.

Doc change (fold into part 05):
diff --git a/Documentation/x86/intel_rdt_ui.txt 
b/Documentation/x86/intel_rdt_ui.txt
index e56781952f42..b9f634c9a058 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -97,13 +97,18 @@ With CDP disabled the L3 schemata format is:
 
 L3 details (CDP enabled via mount option to resctrl)
 
-When CDP is enabled, you need to specify separate cache bit masks for
-code and data access. The generic format is:
+When CDP is enabled L3 control is split into two separate resources
+so you can specify independent masks for code and data like this:
 
-   L3:=,;=,;...
+   L3data:=;=;...
+   L3code:=;=;...
 
-where the d_cbm masks are for data access, and the i_cbm masks for code.
+L2 details
+--
+L2 cache does not support code and data prioritization, so the
+schemata format is always:
 
+   L2:=;=;...
 
 Example 1
 -


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Luck, Tony
On Mon, Oct 17, 2016 at 09:43:41AM -0700, Yu, Fenghua wrote:
> > > > I wonder whether this is the proper abstraction level. We might as
> > > > well do the following:
> > > >
> > > > rdtresources[] = {
> > > >  {
> > > > .name   = "L3",
> > > >  },
> > > >  {
> > > > .name   = "L3Data",
> > > >  },
> > > >  {
> > > > .name   = "L3Code",
> > > >  },
> > > >
> > > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > > simpler, but it's definitely worth a thought or two.
> > >
> > > This way will be better than having cdp_enabled/capable for L3 and not
> > > for L2.  And this doesn't change current userinterface design either,
> > > I think.
> > 
> > User interface would change if you did this. The schemata file would look 
> > like
> > this with CDP enabled:
> > 
> > # cat schemata
> > L3Data:0=f;1=f;2=f;3=f
> > L3Code:0=f;1=f;2=f;3=f
> > 
> > but that is easier to read than the current:
> > 
> > # cat schemata
> > L3:0=f,f;1=f,f;2=f,f;3=f,f
> > 
> > which gives you no clue on which mask is code and which is data.
> 
> Right.
> 
> Also changing to uniform format :=cbm1;=cbm2;...
> is lot easier to parse schemata line in CDP mode.
> 
> So I'll change the code and doc to have two new resources: L3Data and L3Code 
> for CDP mode.

Doc change (fold into part 05):
diff --git a/Documentation/x86/intel_rdt_ui.txt 
b/Documentation/x86/intel_rdt_ui.txt
index e56781952f42..b9f634c9a058 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -97,13 +97,18 @@ With CDP disabled the L3 schemata format is:
 
 L3 details (CDP enabled via mount option to resctrl)
 
-When CDP is enabled, you need to specify separate cache bit masks for
-code and data access. The generic format is:
+When CDP is enabled L3 control is split into two separate resources
+so you can specify independent masks for code and data like this:
 
-   L3:=,;=,;...
+   L3data:=;=;...
+   L3code:=;=;...
 
-where the d_cbm masks are for data access, and the i_cbm masks for code.
+L2 details
+--
+L2 cache does not support code and data prioritization, so the
+schemata format is always:
 
+   L2:=;=;...
 
 Example 1
 -


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Fenghua Yu
On Mon, Oct 17, 2016 at 07:02:19PM +0200, Thomas Gleixner wrote:
> On Mon, 17 Oct 2016, Fenghua Yu wrote:
> > On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > > I wonder whether this is the proper abstraction level. We might as well do
> > > the following:
> > > 
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > > 
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> > 
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.
> 
> So you need to change the struct to have capable and enabled
> 
> > > static void rdt_get_config(int idx, struct rdt_resource *r)
> > > {
> > >   union cpuid_0x10_1_eax eax;
> > >   union cpuid_0x10_1_edx edx;
> > >   u32 ebx, ecx;
> > > 
> > >   cpuid_count(0x0010, idx, , , , );
> > >   r->max_closid = edx.split.cos_max + 1;
> > >   r->num_closid = r->max_closid;
> > >   r->cbm_len = eax.split.cbm_len + 1;
> > >   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > >   r->enabled = true;
> 
> And set 
> 
>   r->capable = true; 
> 
> here instead of r->enabled and set r->enabled at mount time for the
> resources depending on the mount flags.

Neat code change!
 
Thanks,

-Fenghua


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Fenghua Yu
On Mon, Oct 17, 2016 at 07:02:19PM +0200, Thomas Gleixner wrote:
> On Mon, 17 Oct 2016, Fenghua Yu wrote:
> > On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > > I wonder whether this is the proper abstraction level. We might as well do
> > > the following:
> > > 
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > > 
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> > 
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.
> 
> So you need to change the struct to have capable and enabled
> 
> > > static void rdt_get_config(int idx, struct rdt_resource *r)
> > > {
> > >   union cpuid_0x10_1_eax eax;
> > >   union cpuid_0x10_1_edx edx;
> > >   u32 ebx, ecx;
> > > 
> > >   cpuid_count(0x0010, idx, , , , );
> > >   r->max_closid = edx.split.cos_max + 1;
> > >   r->num_closid = r->max_closid;
> > >   r->cbm_len = eax.split.cbm_len + 1;
> > >   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > >   r->enabled = true;
> 
> And set 
> 
>   r->capable = true; 
> 
> here instead of r->enabled and set r->enabled at mount time for the
> resources depending on the mount flags.

Neat code change!
 
Thanks,

-Fenghua


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Fenghua Yu wrote:
> On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > I wonder whether this is the proper abstraction level. We might as well do
> > the following:
> > 
> > rdtresources[] = {
> >  {
> > .name   = "L3",
> >  },
> >  {
> > .name   = "L3Data",
> >  },
> >  {
> > .name   = "L3Code",
> >  },
> > 
> > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > simpler, but it's definitely worth a thought or two.
> 
> This way will be better than having cdp_enabled/capable for L3 and not
> for L2.

So you need to change the struct to have capable and enabled

> > static void rdt_get_config(int idx, struct rdt_resource *r)
> > {
> > union cpuid_0x10_1_eax eax;
> > union cpuid_0x10_1_edx edx;
> > u32 ebx, ecx;
> > 
> > cpuid_count(0x0010, idx, , , , );
> > r->max_closid = edx.split.cos_max + 1;
> > r->num_closid = r->max_closid;
> > r->cbm_len = eax.split.cbm_len + 1;
> > r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > r->enabled = true;

And set 

r->capable = true; 

here instead of r->enabled and set r->enabled at mount time for the
resources depending on the mount flags.

Thanks,

tglx







Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Fenghua Yu wrote:
> On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > I wonder whether this is the proper abstraction level. We might as well do
> > the following:
> > 
> > rdtresources[] = {
> >  {
> > .name   = "L3",
> >  },
> >  {
> > .name   = "L3Data",
> >  },
> >  {
> > .name   = "L3Code",
> >  },
> > 
> > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > simpler, but it's definitely worth a thought or two.
> 
> This way will be better than having cdp_enabled/capable for L3 and not
> for L2.

So you need to change the struct to have capable and enabled

> > static void rdt_get_config(int idx, struct rdt_resource *r)
> > {
> > union cpuid_0x10_1_eax eax;
> > union cpuid_0x10_1_edx edx;
> > u32 ebx, ecx;
> > 
> > cpuid_count(0x0010, idx, , , , );
> > r->max_closid = edx.split.cos_max + 1;
> > r->num_closid = r->max_closid;
> > r->cbm_len = eax.split.cbm_len + 1;
> > r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > r->enabled = true;

And set 

r->capable = true; 

here instead of r->enabled and set r->enabled at mount time for the
resources depending on the mount flags.

Thanks,

tglx







Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Fenghua Yu wrote:
> On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > On Fri, 14 Oct 2016, Fenghua Yu wrote:
> > > +/**
> > > + * struct rdt_resource - attributes of an RDT resource
> > > + * @enabled: Is this feature enabled on this machine
> > > + * @name:Name to use in "schemata" file
> > > + * @max_closid:  Maximum number of CLOSIDs supported
> > > + * @num_closid:  Current number of CLOSIDs available
> > > + * @max_cbm: Largest Cache Bit Mask allowed
> > > + * @min_cbm_bits:Minimum number of bits to be set in a 
> > > cache
> > 
> > That should be 'number of consecutive bits', right?
> 
> Change to "Minimum number of consecutive bits to be set in a cache", is
> that ok?

Yes.
 
Thanks,

tglx


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Luck, Tony wrote:
> > > I wonder whether this is the proper abstraction level. We might as well do
> > > the following:
> > > 
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > > 
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> > 
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.  And this doesn't change current userinterface design either,
> > I think.
> 
> User interface would change if you did this. The schemata file would
> look like this with CDP enabled:
> 
> # cat schemata
> L3Data:0=f;1=f;2=f;3=f
> L3Code:0=f;1=f;2=f;3=f
> 
> but that is easier to read than the current:
> 
> # cat schemata
> L3:0=f,f;1=f,f;2=f,f;3=f,f
> 
> which gives you no clue on which mask is code and which is data.

Indeed.
 
> We'd also end up with "info/L3Data/" and "info/L3code/"
> which would be a little redundant (since the files in each
> would contain the same numbers), but perhaps that is worth
> it to get the better schemata file.

I think so. Making the user interface more intuitive is always worth it.

Thanks,

tglx


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Fenghua Yu wrote:
> On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> > On Fri, 14 Oct 2016, Fenghua Yu wrote:
> > > +/**
> > > + * struct rdt_resource - attributes of an RDT resource
> > > + * @enabled: Is this feature enabled on this machine
> > > + * @name:Name to use in "schemata" file
> > > + * @max_closid:  Maximum number of CLOSIDs supported
> > > + * @num_closid:  Current number of CLOSIDs available
> > > + * @max_cbm: Largest Cache Bit Mask allowed
> > > + * @min_cbm_bits:Minimum number of bits to be set in a 
> > > cache
> > 
> > That should be 'number of consecutive bits', right?
> 
> Change to "Minimum number of consecutive bits to be set in a cache", is
> that ok?

Yes.
 
Thanks,

tglx


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Mon, 17 Oct 2016, Luck, Tony wrote:
> > > I wonder whether this is the proper abstraction level. We might as well do
> > > the following:
> > > 
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > > 
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> > 
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.  And this doesn't change current userinterface design either,
> > I think.
> 
> User interface would change if you did this. The schemata file would
> look like this with CDP enabled:
> 
> # cat schemata
> L3Data:0=f;1=f;2=f;3=f
> L3Code:0=f;1=f;2=f;3=f
> 
> but that is easier to read than the current:
> 
> # cat schemata
> L3:0=f,f;1=f,f;2=f,f;3=f,f
> 
> which gives you no clue on which mask is code and which is data.

Indeed.
 
> We'd also end up with "info/L3Data/" and "info/L3code/"
> which would be a little redundant (since the files in each
> would contain the same numbers), but perhaps that is worth
> it to get the better schemata file.

I think so. Making the user interface more intuitive is always worth it.

Thanks,

tglx


RE: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Yu, Fenghua
> > > I wonder whether this is the proper abstraction level. We might as
> > > well do the following:
> > >
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > >
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> >
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.  And this doesn't change current userinterface design either,
> > I think.
> 
> User interface would change if you did this. The schemata file would look like
> this with CDP enabled:
> 
> # cat schemata
> L3Data:0=f;1=f;2=f;3=f
> L3Code:0=f;1=f;2=f;3=f
> 
> but that is easier to read than the current:
> 
> # cat schemata
> L3:0=f,f;1=f,f;2=f,f;3=f,f
> 
> which gives you no clue on which mask is code and which is data.

Right.

Also changing to uniform format :=cbm1;=cbm2;...
is lot easier to parse schemata line in CDP mode.

So I'll change the code and doc to have two new resources: L3Data and L3Code 
for CDP mode.

> 
> We'd also end up with "info/L3Data/" and "info/L3code/"
> which would be a little redundant (since the files in each would contain the
> same numbers), but perhaps that is worth it to get the better schemata file.
> 
> -Tony


RE: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Yu, Fenghua
> > > I wonder whether this is the proper abstraction level. We might as
> > > well do the following:
> > >
> > > rdtresources[] = {
> > >  {
> > >   .name   = "L3",
> > >  },
> > >  {
> > >   .name   = "L3Data",
> > >  },
> > >  {
> > >   .name   = "L3Code",
> > >  },
> > >
> > > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > > simpler, but it's definitely worth a thought or two.
> >
> > This way will be better than having cdp_enabled/capable for L3 and not
> > for L2.  And this doesn't change current userinterface design either,
> > I think.
> 
> User interface would change if you did this. The schemata file would look like
> this with CDP enabled:
> 
> # cat schemata
> L3Data:0=f;1=f;2=f;3=f
> L3Code:0=f;1=f;2=f;3=f
> 
> but that is easier to read than the current:
> 
> # cat schemata
> L3:0=f,f;1=f,f;2=f,f;3=f,f
> 
> which gives you no clue on which mask is code and which is data.

Right.

Also changing to uniform format :=cbm1;=cbm2;...
is lot easier to parse schemata line in CDP mode.

So I'll change the code and doc to have two new resources: L3Data and L3Code 
for CDP mode.

> 
> We'd also end up with "info/L3Data/" and "info/L3code/"
> which would be a little redundant (since the files in each would contain the
> same numbers), but perhaps that is worth it to get the better schemata file.
> 
> -Tony


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Luck, Tony
> > I wonder whether this is the proper abstraction level. We might as well do
> > the following:
> > 
> > rdtresources[] = {
> >  {
> > .name   = "L3",
> >  },
> >  {
> > .name   = "L3Data",
> >  },
> >  {
> > .name   = "L3Code",
> >  },
> > 
> > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > simpler, but it's definitely worth a thought or two.
> 
> This way will be better than having cdp_enabled/capable for L3 and not
> for L2.  And this doesn't change current userinterface design either,
> I think.

User interface would change if you did this. The schemata file would
look like this with CDP enabled:

# cat schemata
L3Data:0=f;1=f;2=f;3=f
L3Code:0=f;1=f;2=f;3=f

but that is easier to read than the current:

# cat schemata
L3:0=f,f;1=f,f;2=f,f;3=f,f

which gives you no clue on which mask is code and which is data.

We'd also end up with "info/L3Data/" and "info/L3code/"
which would be a little redundant (since the files in each
would contain the same numbers), but perhaps that is worth
it to get the better schemata file.

-Tony


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Luck, Tony
> > I wonder whether this is the proper abstraction level. We might as well do
> > the following:
> > 
> > rdtresources[] = {
> >  {
> > .name   = "L3",
> >  },
> >  {
> > .name   = "L3Data",
> >  },
> >  {
> > .name   = "L3Code",
> >  },
> > 
> > and enable either L3 or L3Data+L3Code. Not sure if that makes things
> > simpler, but it's definitely worth a thought or two.
> 
> This way will be better than having cdp_enabled/capable for L3 and not
> for L2.  And this doesn't change current userinterface design either,
> I think.

User interface would change if you did this. The schemata file would
look like this with CDP enabled:

# cat schemata
L3Data:0=f;1=f;2=f;3=f
L3Code:0=f;1=f;2=f;3=f

but that is easier to read than the current:

# cat schemata
L3:0=f,f;1=f,f;2=f,f;3=f,f

which gives you no clue on which mask is code and which is data.

We'd also end up with "info/L3Data/" and "info/L3code/"
which would be a little redundant (since the files in each
would contain the same numbers), but perhaps that is worth
it to get the better schemata file.

-Tony


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Fenghua Yu
On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> On Fri, 14 Oct 2016, Fenghua Yu wrote:
> > +/**
> > + * struct rdt_resource - attributes of an RDT resource
> > + * @enabled:   Is this feature enabled on this machine
> > + * @name:  Name to use in "schemata" file
> > + * @max_closid:Maximum number of CLOSIDs supported
> > + * @num_closid:Current number of CLOSIDs available
> > + * @max_cbm:   Largest Cache Bit Mask allowed
> > + * @min_cbm_bits:  Minimum number of bits to be set in a cache
> 
> That should be 'number of consecutive bits', right?

Change to "Minimum number of consecutive bits to be set in a cache", is
that ok?

It's 2 on Haswell. It's 1 in other cases i.e. L3 on Broadwell and
Skylake servers etc.

> 
> > + * bit mask
> > + * @domains:   All domains for this resource
> > + * @num_domains:   Number of domains active
> > + * @msr_base:  Base MSR address for CBMs
> > + * @cdp_capable:   Code/Data Prioritization available
> > + * @cdp_enabled:   Code/Data Prioritization enabled
> 
> I wonder whether this is the proper abstraction level. We might as well do
> the following:
> 
> rdtresources[] = {
>  {
>   .name   = "L3",
>  },
>  {
>   .name   = "L3Data",
>  },
>  {
>   .name   = "L3Code",
>  },
> 
> and enable either L3 or L3Data+L3Code. Not sure if that makes things
> simpler, but it's definitely worth a thought or two.

This way will be better than having cdp_enabled/capable for L3 and not
for L2.  And this doesn't change current userinterface design either,
I think.

> 
> > +#define for_each_rdt_resource(r)   \
> > +   for (r = rdt_resources_all; r->name; r++) \
> > +   if (r->enabled)
> 
> So the resource array must be NULL terminated, right? You might as well use
> 
>r < rdt_resources_all + ARRAY_SIZE(rdt_resources_all)
> 
> as the loop condition. So you avoid the NULL termination.

Will do.

> 
> > +
> > +#define IA32_L3_CBM_BASE   0xc90
> > +extern struct rdt_resource rdt_resources_all[];
> 
> Please visually split this. CBM_BASE has nothing to do with the resource
> array.
> 
> > +#define domain_init(name) LIST_HEAD_INIT(rdt_resources_all[name].domains)
> 
> name is really misleading here. Please use id and make this an inline
> function.
> 
> > +struct rdt_resource rdt_resources_all[] = {
> 
> >  static inline bool get_rdt_resources(void)
> >  {
> > +   struct rdt_resource *r;
> > bool ret = false;
> >  
> > if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> > @@ -74,20 +105,53 @@ static inline bool get_rdt_resources(void)
> >  
> > if (!boot_cpu_has(X86_FEATURE_RDT_A))
> > return false;
> > -   if (boot_cpu_has(X86_FEATURE_CAT_L3))
> > +   if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
> > +   union cpuid_0x10_1_eax eax;
> > +   union cpuid_0x10_1_edx edx;
> > +   u32 ebx, ecx;
> > +
> > +   r = _resources_all[RDT_RESOURCE_L3];
> > +   cpuid_count(0x0010, 1, , , , );
> > +   r->max_closid = edx.split.cos_max + 1;
> > +   r->num_closid = r->max_closid;
> > +   r->cbm_len = eax.split.cbm_len + 1;
> > +   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > +   if (boot_cpu_has(X86_FEATURE_CDP_L3))
> > +   r->cdp_capable = true;
> > +   r->enabled = true;
> > +
> > ret = true;
> > +   }
> > +   if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
> > +   union cpuid_0x10_1_eax eax;
> > +   union cpuid_0x10_1_edx edx;
> > +   u32 ebx, ecx;
> > +
> > +   /* CPUID 0x10.2 fields are same format at 0x10.1 */
> > +   r = _resources_all[RDT_RESOURCE_L2];
> > +   cpuid_count(0x0010, 2, , , , );
> > +   r->max_closid = edx.split.cos_max + 1;
> > +   r->num_closid = r->max_closid;
> > +   r->cbm_len = eax.split.cbm_len + 1;
> > +   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > +   r->enabled = true;
> 
> Copy and paste is a wonderful thing, right?
> 
> static void rdt_get_config(int idx, struct rdt_resource *r)
> {
>   union cpuid_0x10_1_eax eax;
>   union cpuid_0x10_1_edx edx;
>   u32 ebx, ecx;
> 
>   cpuid_count(0x0010, idx, , , , );
>   r->max_closid = edx.split.cos_max + 1;
>   r->num_closid = r->max_closid;
>   r->cbm_len = eax.split.cbm_len + 1;
>   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
>   r->enabled = true;
> }
> 
> and and the call site:
> 
>   if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
>   rdt_get_config(1, _resources_all[RDT_RESOURCE_L3]);
>   if (boot_cpu_has(X86_FEATURE_CDP_L3))
>   r->cdp_capable = true;
>   ret = true;
>   }
> 
>   if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
>   

Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Fenghua Yu
On Mon, Oct 17, 2016 at 03:45:32PM +0200, Thomas Gleixner wrote:
> On Fri, 14 Oct 2016, Fenghua Yu wrote:
> > +/**
> > + * struct rdt_resource - attributes of an RDT resource
> > + * @enabled:   Is this feature enabled on this machine
> > + * @name:  Name to use in "schemata" file
> > + * @max_closid:Maximum number of CLOSIDs supported
> > + * @num_closid:Current number of CLOSIDs available
> > + * @max_cbm:   Largest Cache Bit Mask allowed
> > + * @min_cbm_bits:  Minimum number of bits to be set in a cache
> 
> That should be 'number of consecutive bits', right?

Change to "Minimum number of consecutive bits to be set in a cache", is
that ok?

It's 2 on Haswell. It's 1 in other cases i.e. L3 on Broadwell and
Skylake servers etc.

> 
> > + * bit mask
> > + * @domains:   All domains for this resource
> > + * @num_domains:   Number of domains active
> > + * @msr_base:  Base MSR address for CBMs
> > + * @cdp_capable:   Code/Data Prioritization available
> > + * @cdp_enabled:   Code/Data Prioritization enabled
> 
> I wonder whether this is the proper abstraction level. We might as well do
> the following:
> 
> rdtresources[] = {
>  {
>   .name   = "L3",
>  },
>  {
>   .name   = "L3Data",
>  },
>  {
>   .name   = "L3Code",
>  },
> 
> and enable either L3 or L3Data+L3Code. Not sure if that makes things
> simpler, but it's definitely worth a thought or two.

This way will be better than having cdp_enabled/capable for L3 and not
for L2.  And this doesn't change current userinterface design either,
I think.

> 
> > +#define for_each_rdt_resource(r)   \
> > +   for (r = rdt_resources_all; r->name; r++) \
> > +   if (r->enabled)
> 
> So the resource array must be NULL terminated, right? You might as well use
> 
>r < rdt_resources_all + ARRAY_SIZE(rdt_resources_all)
> 
> as the loop condition. So you avoid the NULL termination.

Will do.

> 
> > +
> > +#define IA32_L3_CBM_BASE   0xc90
> > +extern struct rdt_resource rdt_resources_all[];
> 
> Please visually split this. CBM_BASE has nothing to do with the resource
> array.
> 
> > +#define domain_init(name) LIST_HEAD_INIT(rdt_resources_all[name].domains)
> 
> name is really misleading here. Please use id and make this an inline
> function.
> 
> > +struct rdt_resource rdt_resources_all[] = {
> 
> >  static inline bool get_rdt_resources(void)
> >  {
> > +   struct rdt_resource *r;
> > bool ret = false;
> >  
> > if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> > @@ -74,20 +105,53 @@ static inline bool get_rdt_resources(void)
> >  
> > if (!boot_cpu_has(X86_FEATURE_RDT_A))
> > return false;
> > -   if (boot_cpu_has(X86_FEATURE_CAT_L3))
> > +   if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
> > +   union cpuid_0x10_1_eax eax;
> > +   union cpuid_0x10_1_edx edx;
> > +   u32 ebx, ecx;
> > +
> > +   r = _resources_all[RDT_RESOURCE_L3];
> > +   cpuid_count(0x0010, 1, , , , );
> > +   r->max_closid = edx.split.cos_max + 1;
> > +   r->num_closid = r->max_closid;
> > +   r->cbm_len = eax.split.cbm_len + 1;
> > +   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > +   if (boot_cpu_has(X86_FEATURE_CDP_L3))
> > +   r->cdp_capable = true;
> > +   r->enabled = true;
> > +
> > ret = true;
> > +   }
> > +   if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
> > +   union cpuid_0x10_1_eax eax;
> > +   union cpuid_0x10_1_edx edx;
> > +   u32 ebx, ecx;
> > +
> > +   /* CPUID 0x10.2 fields are same format at 0x10.1 */
> > +   r = _resources_all[RDT_RESOURCE_L2];
> > +   cpuid_count(0x0010, 2, , , , );
> > +   r->max_closid = edx.split.cos_max + 1;
> > +   r->num_closid = r->max_closid;
> > +   r->cbm_len = eax.split.cbm_len + 1;
> > +   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> > +   r->enabled = true;
> 
> Copy and paste is a wonderful thing, right?
> 
> static void rdt_get_config(int idx, struct rdt_resource *r)
> {
>   union cpuid_0x10_1_eax eax;
>   union cpuid_0x10_1_edx edx;
>   u32 ebx, ecx;
> 
>   cpuid_count(0x0010, idx, , , , );
>   r->max_closid = edx.split.cos_max + 1;
>   r->num_closid = r->max_closid;
>   r->cbm_len = eax.split.cbm_len + 1;
>   r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
>   r->enabled = true;
> }
> 
> and and the call site:
> 
>   if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
>   rdt_get_config(1, _resources_all[RDT_RESOURCE_L3]);
>   if (boot_cpu_has(X86_FEATURE_CDP_L3))
>   r->cdp_capable = true;
>   ret = true;
>   }
> 
>   if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
>   

Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Fri, 14 Oct 2016, Fenghua Yu wrote:
> +/**
> + * struct rdt_resource - attributes of an RDT resource
> + * @enabled: Is this feature enabled on this machine
> + * @name:Name to use in "schemata" file
> + * @max_closid:  Maximum number of CLOSIDs supported
> + * @num_closid:  Current number of CLOSIDs available
> + * @max_cbm: Largest Cache Bit Mask allowed
> + * @min_cbm_bits:Minimum number of bits to be set in a cache

That should be 'number of consecutive bits', right?

> + *   bit mask
> + * @domains: All domains for this resource
> + * @num_domains: Number of domains active
> + * @msr_base:Base MSR address for CBMs
> + * @cdp_capable: Code/Data Prioritization available
> + * @cdp_enabled: Code/Data Prioritization enabled

I wonder whether this is the proper abstraction level. We might as well do
the following:

rdtresources[] = {
 {
.name   = "L3",
 },
 {
.name   = "L3Data",
 },
 {
.name   = "L3Code",
 },

and enable either L3 or L3Data+L3Code. Not sure if that makes things
simpler, but it's definitely worth a thought or two.

> +#define for_each_rdt_resource(r) \
> + for (r = rdt_resources_all; r->name; r++) \
> + if (r->enabled)

So the resource array must be NULL terminated, right? You might as well use

   r < rdt_resources_all + ARRAY_SIZE(rdt_resources_all)

as the loop condition. So you avoid the NULL termination.

> +
> +#define IA32_L3_CBM_BASE 0xc90
> +extern struct rdt_resource rdt_resources_all[];

Please visually split this. CBM_BASE has nothing to do with the resource
array.

> +#define domain_init(name) LIST_HEAD_INIT(rdt_resources_all[name].domains)

name is really misleading here. Please use id and make this an inline
function.

> +struct rdt_resource rdt_resources_all[] = {

>  static inline bool get_rdt_resources(void)
>  {
> + struct rdt_resource *r;
>   bool ret = false;
>  
>   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> @@ -74,20 +105,53 @@ static inline bool get_rdt_resources(void)
>  
>   if (!boot_cpu_has(X86_FEATURE_RDT_A))
>   return false;
> - if (boot_cpu_has(X86_FEATURE_CAT_L3))
> + if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
> + union cpuid_0x10_1_eax eax;
> + union cpuid_0x10_1_edx edx;
> + u32 ebx, ecx;
> +
> + r = _resources_all[RDT_RESOURCE_L3];
> + cpuid_count(0x0010, 1, , , , );
> + r->max_closid = edx.split.cos_max + 1;
> + r->num_closid = r->max_closid;
> + r->cbm_len = eax.split.cbm_len + 1;
> + r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> + if (boot_cpu_has(X86_FEATURE_CDP_L3))
> + r->cdp_capable = true;
> + r->enabled = true;
> +
>   ret = true;
> + }
> + if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
> + union cpuid_0x10_1_eax eax;
> + union cpuid_0x10_1_edx edx;
> + u32 ebx, ecx;
> +
> + /* CPUID 0x10.2 fields are same format at 0x10.1 */
> + r = _resources_all[RDT_RESOURCE_L2];
> + cpuid_count(0x0010, 2, , , , );
> + r->max_closid = edx.split.cos_max + 1;
> + r->num_closid = r->max_closid;
> + r->cbm_len = eax.split.cbm_len + 1;
> + r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> + r->enabled = true;

Copy and paste is a wonderful thing, right?

static void rdt_get_config(int idx, struct rdt_resource *r)
{
union cpuid_0x10_1_eax eax;
union cpuid_0x10_1_edx edx;
u32 ebx, ecx;

cpuid_count(0x0010, idx, , , , );
r->max_closid = edx.split.cos_max + 1;
r->num_closid = r->max_closid;
r->cbm_len = eax.split.cbm_len + 1;
r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->enabled = true;
}

and and the call site:

if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
rdt_get_config(1, _resources_all[RDT_RESOURCE_L3]);
if (boot_cpu_has(X86_FEATURE_CDP_L3))
r->cdp_capable = true;
ret = true;
}

if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
rdt_get_config(2, _resources_all[RDT_RESOURCE_L2]);
ret = true;
}

Hmm?

> + for_each_rdt_resource(r)
> + pr_info("Intel RDT %s allocation %s detected\n", r->name,
> + r->cdp_capable ? " (with CDP)" : "");

This want's curly braces around the loop.
  
Thanks,

tglx


Re: [PATCH v4 08/18] x86/intel_rdt: Pick up L3/L2 RDT parameters from CPUID

2016-10-17 Thread Thomas Gleixner
On Fri, 14 Oct 2016, Fenghua Yu wrote:
> +/**
> + * struct rdt_resource - attributes of an RDT resource
> + * @enabled: Is this feature enabled on this machine
> + * @name:Name to use in "schemata" file
> + * @max_closid:  Maximum number of CLOSIDs supported
> + * @num_closid:  Current number of CLOSIDs available
> + * @max_cbm: Largest Cache Bit Mask allowed
> + * @min_cbm_bits:Minimum number of bits to be set in a cache

That should be 'number of consecutive bits', right?

> + *   bit mask
> + * @domains: All domains for this resource
> + * @num_domains: Number of domains active
> + * @msr_base:Base MSR address for CBMs
> + * @cdp_capable: Code/Data Prioritization available
> + * @cdp_enabled: Code/Data Prioritization enabled

I wonder whether this is the proper abstraction level. We might as well do
the following:

rdtresources[] = {
 {
.name   = "L3",
 },
 {
.name   = "L3Data",
 },
 {
.name   = "L3Code",
 },

and enable either L3 or L3Data+L3Code. Not sure if that makes things
simpler, but it's definitely worth a thought or two.

> +#define for_each_rdt_resource(r) \
> + for (r = rdt_resources_all; r->name; r++) \
> + if (r->enabled)

So the resource array must be NULL terminated, right? You might as well use

   r < rdt_resources_all + ARRAY_SIZE(rdt_resources_all)

as the loop condition. So you avoid the NULL termination.

> +
> +#define IA32_L3_CBM_BASE 0xc90
> +extern struct rdt_resource rdt_resources_all[];

Please visually split this. CBM_BASE has nothing to do with the resource
array.

> +#define domain_init(name) LIST_HEAD_INIT(rdt_resources_all[name].domains)

name is really misleading here. Please use id and make this an inline
function.

> +struct rdt_resource rdt_resources_all[] = {

>  static inline bool get_rdt_resources(void)
>  {
> + struct rdt_resource *r;
>   bool ret = false;
>  
>   if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
> @@ -74,20 +105,53 @@ static inline bool get_rdt_resources(void)
>  
>   if (!boot_cpu_has(X86_FEATURE_RDT_A))
>   return false;
> - if (boot_cpu_has(X86_FEATURE_CAT_L3))
> + if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
> + union cpuid_0x10_1_eax eax;
> + union cpuid_0x10_1_edx edx;
> + u32 ebx, ecx;
> +
> + r = _resources_all[RDT_RESOURCE_L3];
> + cpuid_count(0x0010, 1, , , , );
> + r->max_closid = edx.split.cos_max + 1;
> + r->num_closid = r->max_closid;
> + r->cbm_len = eax.split.cbm_len + 1;
> + r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> + if (boot_cpu_has(X86_FEATURE_CDP_L3))
> + r->cdp_capable = true;
> + r->enabled = true;
> +
>   ret = true;
> + }
> + if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
> + union cpuid_0x10_1_eax eax;
> + union cpuid_0x10_1_edx edx;
> + u32 ebx, ecx;
> +
> + /* CPUID 0x10.2 fields are same format at 0x10.1 */
> + r = _resources_all[RDT_RESOURCE_L2];
> + cpuid_count(0x0010, 2, , , , );
> + r->max_closid = edx.split.cos_max + 1;
> + r->num_closid = r->max_closid;
> + r->cbm_len = eax.split.cbm_len + 1;
> + r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
> + r->enabled = true;

Copy and paste is a wonderful thing, right?

static void rdt_get_config(int idx, struct rdt_resource *r)
{
union cpuid_0x10_1_eax eax;
union cpuid_0x10_1_edx edx;
u32 ebx, ecx;

cpuid_count(0x0010, idx, , , , );
r->max_closid = edx.split.cos_max + 1;
r->num_closid = r->max_closid;
r->cbm_len = eax.split.cbm_len + 1;
r->max_cbm = BIT_MASK(eax.split.cbm_len + 1) - 1;
r->enabled = true;
}

and and the call site:

if (boot_cpu_has(X86_FEATURE_CAT_L3)) {
rdt_get_config(1, _resources_all[RDT_RESOURCE_L3]);
if (boot_cpu_has(X86_FEATURE_CDP_L3))
r->cdp_capable = true;
ret = true;
}

if (boot_cpu_has(X86_FEATURE_CAT_L2)) {
rdt_get_config(2, _resources_all[RDT_RESOURCE_L2]);
ret = true;
}

Hmm?

> + for_each_rdt_resource(r)
> + pr_info("Intel RDT %s allocation %s detected\n", r->name,
> + r->cdp_capable ? " (with CDP)" : "");

This want's curly braces around the loop.
  
Thanks,

tglx