Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-25 02:24:40, Jan Beulich wrote:
> >>> On 25.04.17 at 09:15, wrote:
> > Sorry, this may cause a potential issue and is not a good example. But from the SW
> > view, there is another case where per-socket support is important in
> > real-time scenarios. You may have a real-time domain on one socket that requires
> > different masks (especially for code/data) to guarantee its performance vs.
> > other general-purpose domains running on a different socket. In that case it's a
> > heterogeneous software usage model (rather than heterogeneous hardware). And
> > we should not force the same mask setting on different sockets in such a case.
>
> I don't follow: The COS IDs for the real-time and general purpose
> domains would be different, wouldn't they? Thus there would be
> different masks in use, as intended.
>
Yes, you are right. But as in the above case, the real-time domain only runs on
one socket. If per-socket support is enabled, we can allocate this COS ID to
other domains on other sockets. This can help to improve the utilization of the
scarce cache.

> Jan
___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 25.04.17 at 09:15, wrote:
> Sorry, this may cause a potential issue and is not a good example. But from the SW
> view, there is another case where per-socket support is important in
> real-time scenarios. You may have a real-time domain on one socket that requires
> different masks (especially for code/data) to guarantee its performance vs.
> other general-purpose domains running on a different socket. In that case it's a
> heterogeneous software usage model (rather than heterogeneous hardware). And
> we should not force the same mask setting on different sockets in such a case.

I don't follow: The COS IDs for the real-time and general purpose
domains would be different, wouldn't they? Thus there would be
different masks in use, as intended.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-24 00:55:38, Jan Beulich wrote:
> >>> On 24.04.17 at 08:40, wrote:
> > As we discussed on IRC last Friday, I have got answers for your
> > two final questions below:
> > 1. Why is the domain setting designed to be per-socket, any reason?
> > Answer: There is a real case from Intel's customer. HSX (Haswell server)
> > and BDX (Broadwell server) processors are plugged into each socket of a
> > Grantley platform. HSX does not support CAT but BDX does.
> >
> > You and Chao Peng discussed this before for the CAT feature enabling patches.
> > The asymmetry support was agreed.
>
> I don't see any agreement in those threads. The first sub-thread is
> merely mentioning this configuration, while the second is only about
> nr_sockets calculation.
>
> > http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs3ti+state:results
> > http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63ejte+state:results
> >
> > 2. Why can the previous settings of the domains not be kept when a socket
> > comes online?
> > Answer: If the asymmetric system is supported, we cannot assume the
> > configuration can be applied to the new socket.
>
> I'm afraid we'd have problems elsewhere if we tried to run Xen on
> a mixed-model system. Unless you can prove PSR/CAT is the only
> relevant (read: affecting Xen's behavior) hardware difference
> between Haswell and Broadwell (for example, isn't the former
> SMEP only, but the latter SMEP+SMAP?), I don't buy this as a
> reason to have more complicated than necessary code.
>
Sorry, this may cause a potential issue and is not a good example. But from the SW
view, there is another case where per-socket support is important in real-time
scenarios. You may have a real-time domain on one socket that requires different
masks (especially for code/data) to guarantee its performance vs. other
general-purpose domains running on a different socket.

In that case it's a heterogeneous software usage model (rather than heterogeneous
hardware). And we should not force the same mask setting on different sockets in
such a case, because CLOS are a scarce software resource which should not be
wasted. One of the most important reasons for managing CLOS independently across
sockets is to preserve the flexibility in using CLOS, which is key as they are a
scarce resource.

BRs,
Sun Yi
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 24.04.17 at 08:40, wrote:
> As we discussed on IRC last Friday, I have got answers for your
> two final questions below:
> 1. Why is the domain setting designed to be per-socket, any reason?
> Answer: There is a real case from Intel's customer. HSX (Haswell server)
> and BDX (Broadwell server) processors are plugged into each socket of a
> Grantley platform. HSX does not support CAT but BDX does.
>
> You and Chao Peng discussed this before for the CAT feature enabling patches.
> The asymmetry support was agreed.

I don't see any agreement in those threads. The first sub-thread is
merely mentioning this configuration, while the second is only about
nr_sockets calculation.

> http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs3ti+state:results
> http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63ejte+state:results
>
> 2. Why can the previous settings of the domains not be kept when a socket
> comes online?
> Answer: If the asymmetric system is supported, we cannot assume the
> configuration can be applied to the new socket.

I'm afraid we'd have problems elsewhere if we tried to run Xen on
a mixed-model system. Unless you can prove PSR/CAT is the only
relevant (read: affecting Xen's behavior) hardware difference
between Haswell and Broadwell (for example, isn't the former
SMEP only, but the latter SMEP+SMAP?), I don't buy this as a
reason to have more complicated than necessary code.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-21 00:18:27, Jan Beulich wrote:
> >>> On 20.04.17 at 04:14, wrote:
> > On 17-04-19 03:00:29, Jan Beulich wrote:
> >> >>> On 19.04.17 at 10:22, wrote:
> >> > On 17-04-18 05:46:43, Jan Beulich wrote:
> >> >> >>> On 18.04.17 at 12:55, wrote:
> >> >> > I made a test patch based on v10 and attached it in mail. Could you
> >> >> > please help to check it? Thanks!
> >> >>
> >> >> This looks reasonable at the first glance, albeit I continue to be
> >> >> unconvinced that this is the only (reasonable) way of solving the
> >
> > Do you have any other suggestion on this? Thanks!
> >
> >> >> problem. After all we don't have to go through similar hoops for
> >> >> any other of the register state associated with a vCPU. There
> >> >
> >> > Sorry, I do not understand your meaning clearly.
> >> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this
> >> >    patch, we check the 'dom_ids' array to know if the domain's COS ID has
> >> >    not been set but its 'psr_cos_ids[socket]' already has a non-zero
> >> >    value. This case means the socket offline has happened, so that we
> >> >    have to restore the domain's 'psr_cos_ids[socket]' to the default
> >> >    value 0, which points to the default COS MSR. I think we have
> >> >    discussed this in previous mails and reached agreement on such logic.
> >> > 2. DYM the COS MSRs? We have discussed this before. Per your comments,
> >> >    when a socket is online, the register values may have been modified by
> >> >    FW and others. So, we have to restore them to the default value, which
> >> >    is being done in 'cat_init_feature'.
> >> >
> >> > So, what is your exact meaning here? Thanks!
> >>
> >> Neither of the two. I'm comparing with COS/PSR-_unrelated_
> >> handling of register state. After all there are other MSRs which
> >> we need to put into the right state after a core comes online.
> >>
> > For the PSR feature, the 'cos_reg_val[]' of the feature is destroyed when a
> > socket is offline. The values in it are all 0 when the socket comes online
> > again. This causes an error when the user shows the CBMs. So, we have two
> > options when a socket comes online:
> > 1. Read the COS MSR values and save them into 'cos_reg_val[]'.
> > 2. Restore the COS MSR values and the 'cos_reg_val[]' values to the default CBM.
> >
> > Per our discussion on v9 patch 05, we decided to adopt option 2.
>
> Btw., having thought some more about this, putting code into the
> context switch path when there is an alternative is probably the
> wrong thing after all, i.e. if special treatment _is_ really needed,
> doing it in less frequently executed code would likely be better.
> But as before - much depends on clarifying why special treatment
> is needed here, but not elsewhere (and to avoid questions, with
> "elsewhere" I mean outside of PSR/CAT code - there's plenty of
> other CPU register state to take as reference).
>
Hi, Jan,

As we talked on IRC last Friday, I have got answers for your
two final questions below:

1. Why is the domain setting designed to be per-socket, any reason?
Answer: There is a real case from Intel's customer. HSX (Haswell server)
and BDX (Broadwell server) processors are plugged into each socket of a
Grantley platform. HSX does not support CAT but BDX does.

You and Chao Peng discussed this before for the CAT feature enabling patches.
The asymmetry support was agreed.
http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:smfz7fbatbnxs3ti+state:results
http://markmail.org/message/xcq5odezfngszvcb#query:+page:1+mid:wlovqpg7oj63ejte+state:results

2. Why can the previous settings of the domains not be kept when a socket
comes online?
Answer: If the asymmetric system is supported, we cannot assume the
configuration can be applied to the new socket.

BRs,
Sun Yi
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 20.04.17 at 04:14, wrote:
> On 17-04-19 03:00:29, Jan Beulich wrote:
>> >>> On 19.04.17 at 10:22, wrote:
>> > On 17-04-18 05:46:43, Jan Beulich wrote:
>> >> >>> On 18.04.17 at 12:55, wrote:
>> >> > I made a test patch based on v10 and attached it in mail. Could you
>> >> > please help to check it? Thanks!
>> >>
>> >> This looks reasonable at the first glance, albeit I continue to be
>> >> unconvinced that this is the only (reasonable) way of solving the
>
> Do you have any other suggestion on this? Thanks!
>
>> >> problem. After all we don't have to go through similar hoops for
>> >> any other of the register state associated with a vCPU. There
>> >
>> > Sorry, I do not understand your meaning clearly.
>> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this
>> >    patch, we check the 'dom_ids' array to know if the domain's COS ID has
>> >    not been set but its 'psr_cos_ids[socket]' already has a non-zero
>> >    value. This case means the socket offline has happened, so that we have
>> >    to restore the domain's 'psr_cos_ids[socket]' to the default value 0,
>> >    which points to the default COS MSR. I think we have discussed this in
>> >    previous mails and reached agreement on such logic.
>> > 2. DYM the COS MSRs? We have discussed this before. Per your comments,
>> >    when a socket is online, the register values may have been modified by
>> >    FW and others. So, we have to restore them to the default value, which
>> >    is being done in 'cat_init_feature'.
>> >
>> > So, what is your exact meaning here? Thanks!
>>
>> Neither of the two. I'm comparing with COS/PSR-_unrelated_
>> handling of register state. After all there are other MSRs which
>> we need to put into the right state after a core comes online.
>>
> For the PSR feature, the 'cos_reg_val[]' of the feature is destroyed when a
> socket is offline. The values in it are all 0 when the socket comes online
> again. This causes an error when the user shows the CBMs. So, we have two
> options when a socket comes online:
> 1. Read the COS MSR values and save them into 'cos_reg_val[]'.
> 2. Restore the COS MSR values and the 'cos_reg_val[]' values to the default CBM.
>
> Per our discussion on v9 patch 05, we decided to adopt option 2.

Btw., having thought some more about this, putting code into the
context switch path when there is an alternative is probably the
wrong thing after all, i.e. if special treatment _is_ really needed,
doing it in less frequently executed code would likely be better.
But as before - much depends on clarifying why special treatment
is needed here, but not elsewhere (and to avoid questions, with
"elsewhere" I mean outside of PSR/CAT code - there's plenty of
other CPU register state to take as reference).

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 20.04.17 at 18:52, wrote:
> So to summarise:
>
> On item 1: it appears that you are looking for some more justification why
> some of the changes were made, maybe with a rationale for some of the choices
> that were made. Given that this is quite a complex series which has diverged
> quite a lot from the design, the goal is to make it easier for either you (or
> someone else) to sanity check the proposal, which on the face of things looks
> OK. But you have some doubts and you can't easily check against the design as
> it is out-of-date.
>
> On item 2: you think something may not be quite right, but you can't really
> decide until a couple of questions (not quite sure which, but I am sure Yi
> can locate them) are answered.
>
> Let me know whether this is actually true.

Well, afaict both actually boil down to the same single question regarding
the special handling of CAT MSRs after onlining (at runtime) a core on a
socket all of whose cores had been offline, namely considering that other
CPU registers don't require any such special treatment (in context switch
code or elsewhere).

As to the design possibly being out of date - I have to admit I didn't even
check whether the accompanying documentation has been kept up to date with
the actual code changes. The matter here really isn't with comparing with
the design, but rather whether the design choice (written down or not) was
an appropriate one.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
> If this reaction of mine is not acceptable, all I can do is refrain
> from further looking at this series. And Yi, I certainly apologize
> for perhaps not doing these reviews wholeheartedly, since -
> as also expressed before - I continue to not really view this as
> very important functionality. Yet considering for how long some
> of the versions hadn't been looked at by anyone at all, the
> alternative would have been to simply let it sit further without
> looking at it. I actually take this lack of interest by others as an
> indication that I'm not the only one considering this nice to have,
> but no more.

I do have an interest in this series. And I can certainly give it a
review - but once you get your teeth into a patchset I feel it is a bit
counterintuitive to review it as well - as you do a much, much better job
than I could have.
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
> On 20 Apr 2017, at 14:21, Jan Beulich wrote:
>> On 20.04.17 at 15:02, wrote:
>>> On 20 Apr 2017, at 10:43, Jan Beulich wrote:
>>>> On 20.04.17 at 04:14, wrote:
>>>>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>>>> On 19.04.17 at 10:22, wrote:
>>>>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>>>> On 18.04.17 at 12:55, wrote:
>>>>>>>>> I made a test patch based on v10 and attached it in mail. Could you
>>>>>>>>> please help to check it? Thanks!
>>
>> [Item 1]
>>>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>>>> unconvinced that this is the only (reasonable) way of solving the
>>>>>
>>>>> Do you have any other suggestion on this? Thanks!
>>>
>>> I'm sorry, but no. I'm already spending _much_ more time on this
>>> series than I should be, considering its (afaic) relatively low
>>> priority. I really have to ask you to properly think through both
>>> the data layout and code structure, including considering
>>> alternatives especially in places where you can _anticipate_
>>> controversy.
>>
>> Jan, can you confirm whether you are happy with the current proposal. You
>> say "this looks reasonable at the first glance", but then go on to say that
>> there may be other "reasonable ways" of solving the problem at hand.
>>
>> This is not very concrete and hard to respond to from a contributor's point
>> of view: it would be good to establish whether a) the v10 proposal in this
>> area is good enough, b) whether there are any concrete improvements to be
>> made.
>
> I understand it's not very concrete, but please understand that with
> the over 100 patches wanting looking at right now it is simply
> impossible for me to give precise suggestions everywhere. I really
> have to be allowed to defer to the originator to come up with
> possible better mechanisms (or defend what there is by addressing
> questions raised),

Jan, I don't object to the principle of deferring issues to a contributor,
for the contributor to defend their viewpoint or to come up with a better
mechanism. I just observed that I could not make a lot of sense of what you
were looking for in this particular review. I am assuming that it would be
even harder for Yi.

> especially with - as said - the amount of time spent
> here already having been way higher than is justifiable.

And of course we have passed the 4.9 code freeze, so some of the pressure
is off. At the same time I understand that because of the upcoming releases
you need to focus on bug fixes, etc.

> Just go and
> compare v10 with one of the initial versions: Almost all of the data
> layout and code flow have fundamentally changed, mostly based on
> feedback I gave. I'm sorry for saying that, but to me this is an
> indication that things hadn't been thought through well in the design
> phase, i.e. before even submitting a first RFC.

That is good feedback which may contain some valuable lessons. Once we are
through this (or maybe at the summit) it may be worthwhile to look at what
has gone wrong and see how we can do better in future.

[Item 2]
>> You say please think through whether "you can _anticipate_ controversy", but
>> at the same time you can't currently identify/think of any issues. That seems
>> to suggest to me that maybe the proposal is good enough. Or that something is
>> missing, which has not been articulated. Taking into account language
>> barriers, more clarity would probably be helpful.
>
> I've given criteria by which I have the gut feeling (but no more)
> that this isn't the right approach. I'm absolutely fine to be
> convinced that my gut feeling is wrong. That would require to
> simply answer the question I raised multiple times, and which was
> repeatedly "answered" by simply re-stating previously expressed
> facts.

I have not followed the full thread. But it seems that we have a
communication issue here. Normally this happens when expectations don't
quite match and one party (in this case Yi) does not quite get what the
other one is looking for. Maybe the best approach would be for Yi to get
some of these things resolved during a short IRC conversation with you. I
did see him and others resolve some previous issues more effectively on
IRC.

> If this reaction of mine is not acceptable, all I can do is refrain
> from further looking at this series. And Yi, I certainly apologize
> for perhaps not doing these reviews wholeheartedly, since -
> as also expressed before - I continue to not really view this as
> very important functionality.

As I said earlier, I stepped in as I didn't really understand what was
going on. I think this is a little clearer now. So to summarise:

On item 1: it appears that you are looking for some more justification why
some of the changes were made, maybe with a rationale for some of the
choices that were made. Given that this is quite a complex series which has
diverged quite a lot from the design, the goal is to make it easier for
either you (or someone else) to sanity check the proposal, which on the
face of things looks OK. But you have some doubts and you can't easily
check against the design as it is out-of-date.

On item 2: you think something may not be quite right, but you can't really
decide until a couple of questions (not quite sure which, but I am sure Yi
can locate them) are answered.

Let me know whether this is actually true.
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 20.04.17 at 15:02, wrote:
>> On 20 Apr 2017, at 10:43, Jan Beulich wrote:
>>> On 20.04.17 at 04:14, wrote:
>>>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>>> On 19.04.17 at 10:22, wrote:
>>>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>>> On 18.04.17 at 12:55, wrote:
>>>>>>>> I made a test patch based on v10 and attached it in mail. Could you
>>>>>>>> please help to check it? Thanks!
>>>>>>
>>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>>> unconvinced that this is the only (reasonable) way of solving the
>>>>
>>>> Do you have any other suggestion on this? Thanks!
>>
>> I'm sorry, but no. I'm already spending _much_ more time on this
>> series than I should be, considering its (afaic) relatively low
>> priority. I really have to ask you to properly think through both
>> the data layout and code structure, including considering
>> alternatives especially in places where you can _anticipate_
>> controversy.
>
> Jan, can you confirm whether you are happy with the current proposal. You
> say "this looks reasonable at the first glance", but then go on to say that
> there may be other "reasonable ways" of solving the problem at hand.
>
> This is not very concrete and hard to respond to from a contributor's point
> of view: it would be good to establish whether a) the v10 proposal in this
> area is good enough, b) whether there are any concrete improvements to be
> made.

I understand it's not very concrete, but please understand that with
the over 100 patches wanting looking at right now it is simply
impossible for me to give precise suggestions everywhere. I really
have to be allowed to defer to the originator to come up with
possible better mechanisms (or defend what there is by addressing
questions raised), especially with - as said - the amount of time spent
here already having been way higher than is justifiable. Just go and
compare v10 with one of the initial versions: Almost all of the data
layout and code flow have fundamentally changed, mostly based on
feedback I gave. I'm sorry for saying that, but to me this is an
indication that things hadn't been thought through well in the design
phase, i.e. before even submitting a first RFC.

> You say please think through whether "you can _anticipate_ controversy", but
> at the same time you can't currently identify/think of any issues. That seems
> to suggest to me that maybe the proposal is good enough. Or that something is
> missing, which has not been articulated. Taking into account language
> barriers, more clarity would probably be helpful.

I've given criteria by which I have the gut feeling (but no more)
that this isn't the right approach. I'm absolutely fine to be
convinced that my gut feeling is wrong. That would require to
simply answer the question I raised multiple times, and which was
repeatedly "answered" by simply re-stating previously expressed
facts.

If this reaction of mine is not acceptable, all I can do is refrain
from further looking at this series. And Yi, I certainly apologize
for perhaps not doing these reviews wholeheartedly, since -
as also expressed before - I continue to not really view this as
very important functionality. Yet considering for how long some
of the versions hadn't been looked at by anyone at all, the
alternative would have been to simply let it sit further without
looking at it. I actually take this lack of interest by others as an
indication that I'm not the only one considering this nice to have,
but no more.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
Apologies for stepping in here. But it feels to me that this thread is at
risk of becoming unproductive.

> On 20 Apr 2017, at 10:43, Jan Beulich wrote:
>> On 20.04.17 at 04:14, wrote:
>>> On 17-04-19 03:00:29, Jan Beulich wrote:
>>>> On 19.04.17 at 10:22, wrote:
>>>>> On 17-04-18 05:46:43, Jan Beulich wrote:
>>>>>> On 18.04.17 at 12:55, wrote:
>>>>>>> I made a test patch based on v10 and attached it in mail. Could you
>>>>>>> please help to check it? Thanks!
>>>>>
>>>>> This looks reasonable at the first glance, albeit I continue to be
>>>>> unconvinced that this is the only (reasonable) way of solving the
>>>
>>> Do you have any other suggestion on this? Thanks!
>
> I'm sorry, but no. I'm already spending _much_ more time on this
> series than I should be, considering its (afaic) relatively low
> priority. I really have to ask you to properly think through both
> the data layout and code structure, including considering
> alternatives especially in places where you can _anticipate_
> controversy.

Jan, can you confirm whether you are happy with the current proposal. You
say "this looks reasonable at the first glance", but then go on to say
that there may be other "reasonable ways" of solving the problem at hand.

This is not very concrete and hard to respond to from a contributor's
point of view: it would be good to establish whether a) the v10 proposal
in this area is good enough, b) whether there are any concrete
improvements to be made.

You say please think through whether "you can _anticipate_ controversy",
but at the same time you can't currently identify/think of any issues.
That seems to suggest to me that maybe the proposal is good enough. Or
that something is missing, which has not been articulated. Taking into
account language barriers, more clarity would probably be helpful.

>> Per our discussion on v9 patch 05, we decided to adopt option 2. Below are
>> what we discussed. FYR.
>
> Well, we decided option 2 is better than option 1. I'm still
> unconvinced there's not a yet better alternative.

I suppose that is the same type of argument. Aka, we looked at a number of
options and seem to have agreed one is better than the other. But there is
no clarity as to whether in this case option 2 is good enough and what
concrete steps would be needed to get to a better alternative. Of course I
may have missed some of the context here in some of the older threads.

>> If it is not 0, which means the domain's COS ID does not need restoring
>> to 0, we can directly set it into the ASSOC register. That can avoid
>> an unnecessary lock. I will send out the test patch again to ask your
>> help to provide review comments (you said there are also 'a number
>> of cosmetic issues').
>
> And I would hope you would try to eliminate some (if not all) yourself
> first. After all you can easily go over your patch yourself, checking
> for e.g. style violations.

I think this is fair enough.

Best Regards
Lars
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 20.04.17 at 04:14, wrote:
> On 17-04-19 03:00:29, Jan Beulich wrote:
>> >>> On 19.04.17 at 10:22, wrote:
>> > On 17-04-18 05:46:43, Jan Beulich wrote:
>> >> >>> On 18.04.17 at 12:55, wrote:
>> >> > I made a test patch based on v10 and attached it in mail. Could you
>> >> > please help to check it? Thanks!
>> >>
>> >> This looks reasonable at the first glance, albeit I continue to be
>> >> unconvinced that this is the only (reasonable) way of solving the
>
> Do you have any other suggestion on this? Thanks!

I'm sorry, but no. I'm already spending _much_ more time on this
series than I should be, considering its (afaic) relatively low
priority. I really have to ask you to properly think through both
the data layout and code structure, including considering
alternatives especially in places where you can _anticipate_
controversy.

>> >> problem. After all we don't have to go through similar hoops for
>> >> any other of the register state associated with a vCPU. There
>> >
>> > Sorry, I do not understand your meaning clearly.
>> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this
>> >    patch, we check the 'dom_ids' array to know if the domain's COS ID has
>> >    not been set but its 'psr_cos_ids[socket]' already has a non-zero
>> >    value. This case means the socket offline has happened, so that we have
>> >    to restore the domain's 'psr_cos_ids[socket]' to the default value 0,
>> >    which points to the default COS MSR. I think we have discussed this in
>> >    previous mails and reached agreement on such logic.
>> > 2. DYM the COS MSRs? We have discussed this before. Per your comments,
>> >    when a socket is online, the register values may have been modified by
>> >    FW and others. So, we have to restore them to the default value, which
>> >    is being done in 'cat_init_feature'.
>> >
>> > So, what is your exact meaning here? Thanks!
>>
>> Neither of the two. I'm comparing with COS/PSR-_unrelated_
>> handling of register state. After all there are other MSRs which
>> we need to put into the right state after a core comes online.
>>
> For the PSR feature, the 'cos_reg_val[]' of the feature is destroyed when a
> socket is offline. The values in it are all 0 when the socket comes online
> again. This causes an error when the user shows the CBMs. So, we have two
> options when a socket comes online:
> 1. Read the COS MSR values and save them into 'cos_reg_val[]'.
> 2. Restore the COS MSR values and the 'cos_reg_val[]' values to the default CBM.

This re-states what you want to do; it does not answer my question.
Along the lines of what you say, for example the FS and GS base MSRs come
back as zero too after a socket has been (re-)onlined. We don't need to go
through any hoops there, nevertheless.

> Per our discussion on v9 patch 05, we decided to adopt option 2. Below are
> what we discussed. FYR.

Well, we decided option 2 is better than option 1. I'm still
unconvinced there's not a yet better alternative.

>> >> are a number of cosmetic issues, but commenting on an attached
>> >> (rather than inlined) patch is a little difficult.
>> >>
>> > Sorry for that, I will directly send the patch out next time.
>> >
>> >> One remark regarding the locking though: Acquiring a lock in the
>> >> context switch path should be made as low risk of long stalls as
>> >> possible. Therefore you will want to consider using r/w locks
>> >> instead of spin locks here, which would allow parallelism on all
>> >> cores of a socket as long as COS IDs aren't being updated.
>> >>
>> > In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
>> > So, I do not understand why a read-write lock is better? Anything I don't
>> > consider? Please help to point out. Thanks!
>>
>> Hmm, looking again I can see that r/w locks indeed may not help
>> here. However, what you say still doesn't look correct to me: You
>> acquire the lock depending on _only_ psra->cos_mask being non-
>> zero. This means that all cores on one socket are being serialized
>> here. Quite possibly all you need is for some of the checks done
>> inside the locked region to be replicated (but _not_ moved) to the
>> outer if(), to limit the number of times where the lock is to be
>> acquired.
>>
> I think your suggestion is to check old_cos outside the lock region.

My suggestion is to check as much state as possible, to prevent having
to acquire the lock whenever possible.

> If it is not 0, which means the domain's COS ID does not need restoring
> to 0, we can directly set it into the ASSOC register. That can avoid
> an unnecessary lock. I will send out the test patch again to ask your
> help to provide review comments (you said there are also 'a number
> of cosmetic issues').

And I would hope you would try to eliminate some (if not all) yourself
first. After all you can easily go over your patch yourself, checking
for e.g. style violations.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-19 03:00:29, Jan Beulich wrote:
> >>> On 19.04.17 at 10:22,wrote:
> > On 17-04-18 05:46:43, Jan Beulich wrote:
> >> >>> On 18.04.17 at 12:55, wrote:
> >> > I made a test patch based on v10 and attached it in mail. Could you
> >> > please help to check it? Thanks!
> >>
> >> This looks reasonable at first glance, albeit I continue to be
> >> unconvinced that this is the only (reasonable) way of solving the

Do you have any other suggestion on this? Thanks!

> >> problem. After all we don't have to go through similar hoops for
> >> any other of the register state associated with a vCPU. There
> >
> > Sorry, I do not understand your meaning clearly.
> > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this
> >    patch, we check the 'dom_ids' array to know if the domain's cos id has
> >    not been set but its 'psr_cos_ids[socket]' already has a non-zero
> >    value. This case means a socket offline has happened, so we have to
> >    restore the domain's 'psr_cos_ids[socket]' to the default value 0,
> >    which points to the default COS MSR. I think we have discussed this in
> >    previous mails and reached agreement on such logic.
> > 2. DYM the COS MSRs? We have discussed this before. Per your comments,
> >    when a socket comes online, the register values may have been modified
> >    by FW and others. So, we have to restore them to the default value,
> >    which is being done in 'cat_init_feature'.
> >
> > So, what exactly do you mean here? Thanks!
>
> Neither of the two. I'm comparing with COS/PSR-_unrelated_
> handling of register state. After all there are other MSRs which
> we need to put into the right state after a core comes online.
>
For the PSR feature, the 'cos_reg_val[]' of the feature is destroyed when a
socket goes offline. The values in it are all 0 when the socket comes online
again. This causes errors when the user shows the CBMs. So, we have two
options when a socket comes online:
1. Read the COS MSRs' values and save them into 'cos_reg_val[]'.
2.
Restore the COS MSRs' values and 'cos_reg_val[]' values to the default CBM.

Per our discussion on v9 patch 05, we decided to adopt option 2. Below is
what we discussed. FYR.

[v9 patch 05]
>> >> > +/* cos=0 is reserved as default cbm(all bits within cbm_len are 1). */
>> >> > +feat->cos_reg_val[0] = cat_default_val(cat.cbm_len);
>> >> > +/*
>> >> > + * To handle cpu offline and then online case, we need read MSRs
>> >> > + * back to save values into cos_reg_val array.
>> >> > + */
>> >> > +for ( i = 1; i <= cat.cos_max; i++ )
>> >> > +{
>> >> > +    rdmsrl(MSR_IA32_PSR_L3_MASK(i), val);
>> >> > +    feat->cos_reg_val[i] = (uint32_t)val;
>> >> > +}
>> >>
[Jan]
>> >> You mention this in the changes done, but I don't understand why
>> >> you do this. What meaning do these values have to you? If you want
>> >> hardware and cached values to match up, the much more conventional
>> >> way of enforcing this would be to write the values you actually
>> >> want (normally all zero).
>> >>
[Sun Yi]
>> > When all cpus on a socket are offline, free_feature() is called
>> > to free feature resources so that the values saved in
>> > cos_reg_val[] are lost. When the socket is online again, features
>> > are allocated again so that cos_reg_val[] members are all
>> > initialized to 0. Only cos_reg_val[0] is initialized to the default
>> > value in this function in the old codes.
>> >
>> > But the domain is still alive so that its cos id on the socket is kept.
>> > The corresponding MSR value is kept too per test. To make
>> > cos_reg_val[] values be the same as HW so as to not mislead the user,
>> > we should read back the valid values on HW into cos_reg_val[].
>>
[Jan]
>> Okay, I understand the background, but I don't view this solution as
>> viable: Once the last core on a socket goes offline, all references
>> to it should be cleaned up.
After all what will be brought back >> online may be a different physical CPU altogether; you can't assume >> MSR values to have survived even if it is the same CPU which comes >> back online, as it may have undergone a reset cycle, or BIOS/SMM may >> have played with the MSRs. >> That's even a possibility for a single core coming back online, so >> you have to reload MSRs explicitly anyway if implicit reloading (i.e. >> once vCPU-s get scheduled onto it) doesn't suffice. >> [Sun Yi] > So, you think the MSRs values may not be valid after such process and > reloading (write MSRs to default value) is needed. If so, I would like > to do more operations in 'free_feature()': > 1. Iterate all domains working on the offline socket to change >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init >status. > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. > > These can make the socket's info be totally restored back to init status. [Jan] Yes, that's what I think is
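The "restore to defaults" direction argued for above (option 2: write the default CBM rather than trusting values read back from hardware) could look roughly like the sketch below. `cat_default_val` follows the v9 excerpt quoted earlier; `reset_cos_regs` and the `wrmsrl` comment are illustrative assumptions, not Xen's actual code:

```c
#include <stdint.h>

/* All-ones CBM of cbm_len bits, as in the quoted cat_default_val(). */
uint32_t cat_default_val(unsigned int cbm_len)
{
    return (cbm_len >= 32) ? ~0u : ((1u << cbm_len) - 1);
}

/*
 * Sketch of resetting both the cached values and (in real code) the
 * hardware MSRs to the default CBM when a socket comes back online,
 * instead of trusting whatever survived the offline period (reset
 * cycles, BIOS/SMM writes, or a physically different CPU).
 */
void reset_cos_regs(uint32_t *cos_reg_val, unsigned int cos_max,
                    unsigned int cbm_len)
{
    unsigned int i;

    for ( i = 0; i <= cos_max; i++ )
    {
        cos_reg_val[i] = cat_default_val(cbm_len);
        /* wrmsrl(MSR_IA32_PSR_L3_MASK(i), cos_reg_val[i]); in the hypervisor */
    }
}
```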
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 19.04.17 at 10:22,wrote: > On 17-04-18 05:46:43, Jan Beulich wrote: >> >>> On 18.04.17 at 12:55, wrote: >> > I made a test patch based on v10 and attached it in mail. Could you please >> > help to check it? Thanks! >> >> This looks reasonable at the first glance, albeit I continue to be >> unconvinced that this is the only (reasonable) way of solving the >> problem. After all we don't have to go through similar hoops for >> any other of the register state associated with a vCPU. There > > Sorry, I do not understand your meaning clearly. > 1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this patch, >we check 'dom_ids' array to know if the domain's cos id has not been set > but >its 'psr_cos_ids[socket]' already has a non zero value. This case means the >socket offline has happened so that we have to restore the domain's >'psr_cos_ids[socket]' to default value 0 which points to default COS MSR. >I think we have discussed this in previous mails and achieved agreement on >such logic. > 2. DYM the COS MSRs? We have discussed this before. Per your comments, when >socket is online, the registers values may be modified by FW and others. >So, we have to restore them to default value which is being done in >'cat_init_feature'. > > So, what is your exactly meaning here? Thanks! Neither of the two. I'm comparing with COS/PSR-_unrelated_ handling of register state. After all there are other MSRs which we need to put into the right state after a core comes online. >> are a number of cosmetic issues, but commenting on an attached >> (rather than inlined) patch is a little difficult. >> > Sorry for that, I will directly send patch out next time. > >> One remark regarding the locking though: Acquiring a lock in the >> context switch path should be made as low risk of long stalls as >> possible. 
Therefore you will want to consider using r/w locks >> instead of spin locks here, which would allow parallelism on all >> cores of a socket as long as COS IDs aren't being updated. >> > In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions. > So, I do not understand why read-write lock is better? Anything I don't > consider? Please help to point out. Thanks! Hmm, looking again I can see that r/w locks indeed may not help here. However, what you say still doesn't look correct to me: You acquire the lock depending on _only_ psra->cos_mask being non- zero. This means that all cores on one socket are being serialized here. Quite possibly all you need is for some of the checks done inside the locked region to be replicated (but _not_ moved) to the outer if(), to limit the number of times where the lock is to be acquired. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-18 05:46:43, Jan Beulich wrote:
> >>> On 18.04.17 at 12:55,wrote:
> > I made a test patch based on v10 and attached it in mail. Could you please
> > help to check it? Thanks!
>
> This looks reasonable at first glance, albeit I continue to be
> unconvinced that this is the only (reasonable) way of solving the
> problem. After all we don't have to go through similar hoops for
> any other of the register state associated with a vCPU. There

Sorry, I do not understand your meaning clearly.
1. DYM the ASSOC register which is set in 'psr_ctxt_switch_to'? In this
   patch, we check the 'dom_ids' array to know if the domain's cos id has not
   been set but its 'psr_cos_ids[socket]' already has a non-zero value. This
   case means a socket offline has happened, so we have to restore the
   domain's 'psr_cos_ids[socket]' to the default value 0, which points to the
   default COS MSR. I think we have discussed this in previous mails and
   reached agreement on such logic.
2. DYM the COS MSRs? We have discussed this before. Per your comments, when a
   socket comes online, the register values may have been modified by FW and
   others. So, we have to restore them to the default value, which is being
   done in 'cat_init_feature'.

So, what exactly do you mean here? Thanks!

> are a number of cosmetic issues, but commenting on an attached
> (rather than inlined) patch is a little difficult.
>
Sorry for that, I will directly send the patch out next time.

> One remark regarding the locking though: Acquiring a lock in the
> context switch path should be made as low risk of long stalls as
> possible. Therefore you will want to consider using r/w locks
> instead of spin locks here, which would allow parallelism on all
> cores of a socket as long as COS IDs aren't being updated.
>
In 'psr_ctxt_switch_to', I use the lock only to protect 'write' actions.
So, I do not understand why a read-write lock is better? Anything I didn't
consider? Please help to point out. Thanks!
> Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 18.04.17 at 12:55,wrote:
> I made a test patch based on v10 and attached it in mail. Could you please
> help to check it? Thanks!

This looks reasonable at first glance, albeit I continue to be
unconvinced that this is the only (reasonable) way of solving the
problem. After all we don't have to go through similar hoops for
any other of the register state associated with a vCPU. There
are a number of cosmetic issues, but commenting on an attached
(rather than inlined) patch is a little difficult.

One remark regarding the locking though: Acquiring a lock in the
context switch path should be made as low risk of long stalls as
possible. Therefore you will want to consider using r/w locks
instead of spin locks here, which would allow parallelism on all
cores of a socket as long as COS IDs aren't being updated.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-13 05:50:01, Jan Beulich wrote: > >>> On 13.04.17 at 13:44,wrote: > > On 17-04-13 05:31:41, Jan Beulich wrote: > >> >>> On 13.04.17 at 13:11, wrote: > >> > On 17-04-13 04:58:06, Jan Beulich wrote: > >> >> >>> On 13.04.17 at 12:49, wrote: > >> >> > How about a per socket array like this: > >> >> > uint32_t domain_switch[1024]; > >> >> > > >> >> > Every bit represents a domain id. Then, we can handle this case as > >> >> > below: > >> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place > >> >> > is enough to > >> >> >cover socket offline case. We do not need to clear it in > >> >> > 'free_socket_resources'. > >> >> > > >> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, > >> >> > domain_switch) to set > >> >> >the bit to 1 according to domain_id. If the old value is 0 and the > >> >> >'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to > >> >> > be 0. > >> >> > > >> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to > >> >> > set the bit > >> >> >to 1 too. Then, update 'psr_cos_ids[socket]' according to > >> >> > find/pick flow. > >> >> > > >> >> > Then, we only use 4KB for one socket. > >> >> > >> >> This looks to come closer to something I'd consider acceptable, but > >> >> I may not understand your intentions in full yet: For one, there's > >> >> nowhere you clear the bit (other than presumably during socket > >> >> cleanup). > >> > > >> > Actually, clear the array in 'free_socket_resources' has same effect. I > >> > can > >> > move clear action into it. > >> > >> That wasn't my point - I was asking about clearing individual bits. > >> Point being that if you only ever set bits in the map, you'll likely > >> end up iterating through all active domains anyway. > >> > > If entering 'free_socket_resources', that means no more actions to > > the array on this socket except clearing it. Can I just memset this array > > of the socekt to 0? 
>
> You can, afaict, but unless first you act on the set bits I can't see why
> you would want to track the bits in the first place. Or maybe I'm still
> not understanding your intention, in which case I guess the best you
> can do is simply implement your plan, and we'll discuss it in v11 review.
>
I made a test patch based on v10 and attached it in mail. Could you please
help to check it? Thanks!

> Jan

diff --git a/xen/arch/x86/psr.c b/xen/arch/x86/psr.c
index a85ea99..ef8d3e9 100644
--- a/xen/arch/x86/psr.c
+++ b/xen/arch/x86/psr.c
@@ -125,6 +125,8 @@ struct feat_node {
     uint32_t cos_reg_val[MAX_COS_REG_CNT];
 };
 
+#define PSR_DOM_IDS_NUM ((DOMID_IDLE + 1) / (sizeof(uint32_t) * 8))
+
 /*
  * PSR features are managed per socket. Below structure defines the members
  * used to manage these features.
@@ -134,9 +136,11 @@ struct feat_node {
  * COS ID. Every entry of cos_ref corresponds to one COS ID.
  */
 struct psr_socket_info {
-    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     spinlock_t ref_lock;
+    spinlock_t dom_ids_lock;
+    struct feat_node *features[PSR_SOCKET_MAX_FEAT];
     unsigned int cos_ref[MAX_COS_REG_CNT];
+    uint32_t dom_ids[PSR_DOM_IDS_NUM];
 };
 
 struct psr_assoc {
@@ -194,26 +198,11 @@ static void free_socket_resources(unsigned int socket)
 {
     unsigned int i;
     struct psr_socket_info *info = socket_info + socket;
-    struct domain *d;
+    unsigned long flag;
 
     if ( !info )
         return;
 
-    /* Restore domain cos id to 0 when socket is offline. */
-    for_each_domain ( d )
-    {
-        unsigned int cos = d->arch.psr_cos_ids[socket];
-        if ( cos == 0 )
-            continue;
-
-        spin_lock(&info->ref_lock);
-        ASSERT(!cos || info->cos_ref[cos]);
-        info->cos_ref[cos]--;
-        spin_unlock(&info->ref_lock);
-
-        d->arch.psr_cos_ids[socket] = 0;
-    }
-
     /*
      * Free resources of features. The global feature object, e.g. feat_l3_cat,
      * may not be freed here if it is not added into array. It is simply being
@@ -221,12 +210,17 @@ static void free_socket_resources(unsigned int socket)
      */
     for ( i = 0; i < PSR_SOCKET_MAX_FEAT; i++ )
     {
-        if ( !info->features[i] )
-            continue;
-
         xfree(info->features[i]);
         info->features[i] = NULL;
     }
+
+    spin_lock(&info->ref_lock);
+    memset(info->cos_ref, 0, MAX_COS_REG_CNT * sizeof(unsigned int));
+    spin_unlock(&info->ref_lock);
+
+    spin_lock_irqsave(&info->dom_ids_lock, flag);
+    memset(info->dom_ids, 0, PSR_DOM_IDS_NUM * sizeof(uint32_t));
+    spin_unlock_irqrestore(&info->dom_ids_lock, flag);
 }
 
 static bool feat_init_done(const struct psr_socket_info *info)
@@ -682,9 +676,34 @@ void psr_ctxt_switch_to(struct domain *d)
     psr_assoc_rmid(&reg, d->arch.psr_rmid);
 
     if ( psra->cos_mask )
-        psr_assoc_cos(&reg, d->arch.psr_cos_ids ?
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 13.04.17 at 13:44,wrote: > On 17-04-13 05:31:41, Jan Beulich wrote: >> >>> On 13.04.17 at 13:11, wrote: >> > On 17-04-13 04:58:06, Jan Beulich wrote: >> >> >>> On 13.04.17 at 12:49, wrote: >> >> > How about a per socket array like this: >> >> > uint32_t domain_switch[1024]; >> >> > >> >> > Every bit represents a domain id. Then, we can handle this case as >> >> > below: >> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is >> >> > enough to >> >> >cover socket offline case. We do not need to clear it in >> >> > 'free_socket_resources'. >> >> > >> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, >> >> > domain_switch) to set >> >> >the bit to 1 according to domain_id. If the old value is 0 and the >> >> >'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be >> >> > 0. >> >> > >> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to >> >> > set the bit >> >> >to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick >> >> > flow. >> >> > >> >> > Then, we only use 4KB for one socket. >> >> >> >> This looks to come closer to something I'd consider acceptable, but >> >> I may not understand your intentions in full yet: For one, there's >> >> nowhere you clear the bit (other than presumably during socket >> >> cleanup). >> > >> > Actually, clear the array in 'free_socket_resources' has same effect. I can >> > move clear action into it. >> >> That wasn't my point - I was asking about clearing individual bits. >> Point being that if you only ever set bits in the map, you'll likely >> end up iterating through all active domains anyway. >> > If entering 'free_socket_resources', that means no more actions to > the array on this socket except clearing it. Can I just memset this array > of the socekt to 0? You can, afaict, but unless first you act on the set bits I can't see why you would want to track the bits in the first place. 
Or maybe I'm still not understanding your intention, in which case I guess
the best you can do is simply implement your plan, and we'll discuss it in
v11 review.

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-13 05:31:41, Jan Beulich wrote:
> >>> On 13.04.17 at 13:11,wrote:
> > On 17-04-13 04:58:06, Jan Beulich wrote:
> >> >>> On 13.04.17 at 12:49, wrote:
> >> > How about a per socket array like this:
> >> > uint32_t domain_switch[1024];
> >> >
> >> > Every bit represents a domain id. Then, we can handle this case as
> >> > below:
> >> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is
> >> >    enough to cover the socket offline case. We do not need to clear it
> >> >    in 'free_socket_resources'.
> >> >
> >> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id,
> >> >    domain_switch) to set the bit to 1 according to domain_id. If the
> >> >    old value is 0 and the 'psr_cos_ids[socket]' is not 0, restore
> >> >    'psr_cos_ids[socket]' to be 0.
> >> >
> >> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to
> >> >    set the bit to 1 too. Then, update 'psr_cos_ids[socket]' according
> >> >    to the find/pick flow.
> >> >
> >> > Then, we only use 4KB for one socket.
> >>
> >> This looks to come closer to something I'd consider acceptable, but
> >> I may not understand your intentions in full yet: For one, there's
> >> nowhere you clear the bit (other than presumably during socket
> >> cleanup).
> >
> > Actually, clearing the array in 'free_socket_resources' has the same
> > effect. I can move the clear action into it.
>
> That wasn't my point - I was asking about clearing individual bits.
> Point being that if you only ever set bits in the map, you'll likely
> end up iterating through all active domains anyway.
>
If entering 'free_socket_resources', that means no more actions to
the array on this socket except clearing it. Can I just memset this array
of the socket to 0?

> Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 13.04.17 at 13:11,wrote:
> On 17-04-13 04:58:06, Jan Beulich wrote:
>> >>> On 13.04.17 at 12:49, wrote:
>> > How about a per socket array like this:
>> > uint32_t domain_switch[1024];
>> >
>> > Every bit represents a domain id. Then, we can handle this case as below:
>> > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is
>> >    enough to cover the socket offline case. We do not need to clear it in
>> >    'free_socket_resources'.
>> >
>> > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch)
>> >    to set the bit to 1 according to domain_id. If the old value is 0 and
>> >    the 'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to
>> >    be 0.
>> >
>> > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set
>> >    the bit to 1 too. Then, update 'psr_cos_ids[socket]' according to the
>> >    find/pick flow.
>> >
>> > Then, we only use 4KB for one socket.
>>
>> This looks to come closer to something I'd consider acceptable, but
>> I may not understand your intentions in full yet: For one, there's
>> nowhere you clear the bit (other than presumably during socket
>> cleanup).
>
> Actually, clearing the array in 'free_socket_resources' has the same effect.
> I can move the clear action into it.

That wasn't my point - I was asking about clearing individual bits.
Point being that if you only ever set bits in the map, you'll likely
end up iterating through all active domains anyway.

Jan
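The per-socket bitmap being debated here can be modeled compactly. In the sketch below, `DOMID_LIMIT` stands in for Xen's `DOMID_IDLE + 1`, and the non-atomic helper stands in for the hypervisor's atomic `test_and_set_bit()`; one bit per domain id gives 32768 bits = 1024 `uint32_t` = the 4KB per socket mentioned above:

```c
#include <stdbool.h>
#include <stdint.h>

#define DOMID_LIMIT 32768                 /* stand-in for DOMID_IDLE + 1 */

/* One bit per domain id: 32768 bits = 1024 uint32_t = 4KB per socket. */
static uint32_t domain_switch[DOMID_LIMIT / 32];

/*
 * Non-atomic model of test_and_set_bit(): sets the bit for domid and
 * returns its previous value.  The hypervisor would use its atomic
 * bitop helpers here.
 */
bool test_and_set_dom(unsigned int domid)
{
    uint32_t mask = 1u << (domid % 32);
    bool old = (domain_switch[domid / 32] & mask) != 0;

    domain_switch[domid / 32] |= mask;
    return old;
}
```

The "old value was 0" return is exactly what steps 2 and 3 of the quoted proposal key off: the first sighting of a domain on a socket after (re-)onlining triggers the restore of `psr_cos_ids[socket]` to 0.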
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-13 19:11:54, Yi Sun wrote: > On 17-04-13 04:58:06, Jan Beulich wrote: > > >>> On 13.04.17 at 12:49,wrote: > > > On 17-04-13 03:41:44, Jan Beulich wrote: > > >> >>> On 13.04.17 at 10:11, wrote: > > >> > On 17-04-12 06:42:01, Jan Beulich wrote: > > >> >> >>> On 12.04.17 at 14:23, wrote: > > >> >> > On 17-04-12 03:09:56, Jan Beulich wrote: > > >> >> >> >>> On 12.04.17 at 07:53, wrote: > > >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote: > > >> >> >> >> >>> On 01.04.17 at 15:53, wrote: > > >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do > > >> >> >> >> in the > > >> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you > > >> >> >> >> have a few thousand VMs, the loop above may take a while. > > >> >> >> >> > > >> >> >> > Hmm, that may be a potential issue. I have two proposals below. > > >> >> >> > Could you > > >> >> >> > please help to check which one you prefer? Or provide another > > >> >> >> > solution? > > >> >> >> > > > >> >> >> > 1. Start a tasklet in free_socket_resources() to restore > > >> >> > 'psr_cos_ids[socket]' > > >> >> >> >of all domains. The action is protected by 'ref_lock' to > > >> >> >> > avoid > > >> >> > confliction > > >> >> >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in > > >> >> >> > tasklet or > > > memset > > >> >> >> >the array to 0 in free_socket_resources(). > > >> >> >> > > > >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and > > >> >> >> > change > > > index > > >> >> >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs > > >> >> >> > per > > > socket > > >> >> >> >and can memset the array to 0 when socket is offline. But > > >> >> >> > here is an > > >> >> > issue > > >> >> >> >that we do not know how many members this array should have. > > >> >> >> > I cannot > > >> >> > find > > >> >> >> >a macro something like 'DOMAIN_MAX_NUMBER'. 
So, I prefer to > > >> >> >> > use > > >> >> > reallocation > > >> >> >> >in 'psr_alloc_cos' if the newly created domain's id is bigger > > >> >> >> > than > > >> >> > current > > >> >> >> >array number. > > >> >> >> > > >> >> >> The number of domains is limited by the special DOMID_* values. > > >> >> >> However, allocating an array with 32k entries doesn't sound very > > >> >> >> reasonable. > > >> >> > > > >> >> > I think 32K entries should be the extreme case. I can allocate e.g. > > >> >> > 100 > > > entries > > >> >> > when the first domain is created. If a new domain's id exceeds 100, > > > reallocate > > >> >> > another 100 entries. The total number of entries allocated should > > >> >> > be less > > > than > > >> >> > 32K. This is a functional requirement which cannot be avoided. How > > >> >> > do you > > >> >> > think? > > >> >> > > >> >> So how many entries would your array have once I start the 32,000th > > >> >> domain (having at any one time at most a single one running, besides > > >> >> Dom0)? > > >> >> > > >> > In such case, we have to keep a 32K array because the domain_id is the > > > index to > > >> > access the array. But this array is per socket so the whole memory > > >> > used > > > should > > >> > not be too much. > > >> > > >> We carefully avoid any runtime allocations of order > 0, so if you > > >> were to set up such an array, you'd need to use vmalloc()/vzalloc(). > > >> But I continue to be unconvinced that we want such a large array > > >> in the first place. > > >> > > >> > After considering this issue more, I think the original codes might > > >> > not be > > >> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at > > >> > most > > >> > 288 CPUs. So, I think the domains running at same time in reality may > > >> > not > > > be > > >> > so many (no efficient resources). If this hypothesis is right, a loop > > >> > to > > > write > > >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. 
If > > >> > I am > > >> > wrong, please correct me. Thanks! > > >> > > >> What relationship does the number of CPUs have to the number of > > >> domains on a host? There could be thousands with just a few dozen > > >> CPUs, provided none or very few of them have high demands on > > >> CPU resources. Additionally please never forget that system sizes > > >> basically only ever grow. Plus we wouldn't want a latent issue here > > >> in case we ever end up needing to widen domain IDs beyond 16 bits. > > >> > > > How about a per socket array like this: > > > uint32_t domain_switch[1024]; > > > > > > Every bit represents a domain id. Then, we can handle this case as below: > > > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is > > > enough to > > >cover socket offline case. We do not need to clear it in > > > 'free_socket_resources'. > > > > > > 2. In
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-13 04:58:06, Jan Beulich wrote: > >>> On 13.04.17 at 12:49,wrote: > > On 17-04-13 03:41:44, Jan Beulich wrote: > >> >>> On 13.04.17 at 10:11, wrote: > >> > On 17-04-12 06:42:01, Jan Beulich wrote: > >> >> >>> On 12.04.17 at 14:23, wrote: > >> >> > On 17-04-12 03:09:56, Jan Beulich wrote: > >> >> >> >>> On 12.04.17 at 07:53, wrote: > >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote: > >> >> >> >> >>> On 01.04.17 at 15:53, wrote: > >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in > >> >> >> >> the > >> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you > >> >> >> >> have a few thousand VMs, the loop above may take a while. > >> >> >> >> > >> >> >> > Hmm, that may be a potential issue. I have two proposals below. > >> >> >> > Could you > >> >> >> > please help to check which one you prefer? Or provide another > >> >> >> > solution? > >> >> >> > > >> >> >> > 1. Start a tasklet in free_socket_resources() to restore > >> >> > 'psr_cos_ids[socket]' > >> >> >> >of all domains. The action is protected by 'ref_lock' to avoid > >> >> > confliction > >> >> >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet > >> >> >> > or > > memset > >> >> >> >the array to 0 in free_socket_resources(). > >> >> >> > > >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and > >> >> >> > change > > index > >> >> >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs > >> >> >> > per > > socket > >> >> >> >and can memset the array to 0 when socket is offline. But here > >> >> >> > is an > >> >> > issue > >> >> >> >that we do not know how many members this array should have. I > >> >> >> > cannot > >> >> > find > >> >> >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > >> >> > reallocation > >> >> >> >in 'psr_alloc_cos' if the newly created domain's id is bigger > >> >> >> > than > >> >> > current > >> >> >> >array number. 
> >> >> >> > >> >> >> The number of domains is limited by the special DOMID_* values. > >> >> >> However, allocating an array with 32k entries doesn't sound very > >> >> >> reasonable. > >> >> > > >> >> > I think 32K entries should be the extreme case. I can allocate e.g. > >> >> > 100 > > entries > >> >> > when the first domain is created. If a new domain's id exceeds 100, > > reallocate > >> >> > another 100 entries. The total number of entries allocated should be > >> >> > less > > than > >> >> > 32K. This is a functional requirement which cannot be avoided. How do > >> >> > you > >> >> > think? > >> >> > >> >> So how many entries would your array have once I start the 32,000th > >> >> domain (having at any one time at most a single one running, besides > >> >> Dom0)? > >> >> > >> > In such case, we have to keep a 32K array because the domain_id is the > > index to > >> > access the array. But this array is per socket so the whole memory used > > should > >> > not be too much. > >> > >> We carefully avoid any runtime allocations of order > 0, so if you > >> were to set up such an array, you'd need to use vmalloc()/vzalloc(). > >> But I continue to be unconvinced that we want such a large array > >> in the first place. > >> > >> > After considering this issue more, I think the original codes might not > >> > be > >> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at > >> > most > >> > 288 CPUs. So, I think the domains running at same time in reality may > >> > not > > be > >> > so many (no efficient resources). If this hypothesis is right, a loop to > > write > >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I > >> > am > >> > wrong, please correct me. Thanks! > >> > >> What relationship does the number of CPUs have to the number of > >> domains on a host? There could be thousands with just a few dozen > >> CPUs, provided none or very few of them have high demands on > >> CPU resources. 
Additionally please never forget that system sizes > >> basically only ever grow. Plus we wouldn't want a latent issue here > >> in case we ever end up needing to widen domain IDs beyond 16 bits. > >> > > How about a per socket array like this: > > uint32_t domain_switch[1024]; > > > > Every bit represents a domain id. Then, we can handle this case as below: > > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is > > enough to > >cover socket offline case. We do not need to clear it in > > 'free_socket_resources'. > > > > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to > > set > >the bit to 1 according to domain_id. If the old value is 0 and the > >'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0. > > > > 3. In 'psr_set_val()', test_and_set_bit(domain_id,
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 13.04.17 at 12:49,wrote: > On 17-04-13 03:41:44, Jan Beulich wrote: >> >>> On 13.04.17 at 10:11, wrote: >> > On 17-04-12 06:42:01, Jan Beulich wrote: >> >> >>> On 12.04.17 at 14:23, wrote: >> >> > On 17-04-12 03:09:56, Jan Beulich wrote: >> >> >> >>> On 12.04.17 at 07:53, wrote: >> >> >> > On 17-04-11 09:01:53, Jan Beulich wrote: >> >> >> >> >>> On 01.04.17 at 15:53, wrote: >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in >> >> >> >> the >> >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you >> >> >> >> have a few thousand VMs, the loop above may take a while. >> >> >> >> >> >> >> > Hmm, that may be a potential issue. I have two proposals below. >> >> >> > Could you >> >> >> > please help to check which one you prefer? Or provide another >> >> >> > solution? >> >> >> > >> >> >> > 1. Start a tasklet in free_socket_resources() to restore >> >> > 'psr_cos_ids[socket]' >> >> >> >of all domains. The action is protected by 'ref_lock' to avoid >> >> > confliction >> >> >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet >> >> >> > or > memset >> >> >> >the array to 0 in free_socket_resources(). >> >> >> > >> >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and >> >> >> > change > index >> >> >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per > socket >> >> >> >and can memset the array to 0 when socket is offline. But here is >> >> >> > an >> >> > issue >> >> >> >that we do not know how many members this array should have. I >> >> >> > cannot >> >> > find >> >> >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use >> >> > reallocation >> >> >> >in 'psr_alloc_cos' if the newly created domain's id is bigger >> >> >> > than >> >> > current >> >> >> >array number. >> >> >> >> >> >> The number of domains is limited by the special DOMID_* values. >> >> >> However, allocating an array with 32k entries doesn't sound very >> >> >> reasonable. 
>> >> > >> >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 > entries >> >> > when the first domain is created. If a new domain's id exceeds 100, > reallocate >> >> > another 100 entries. The total number of entries allocated should be >> >> > less > than >> >> > 32K. This is a functional requirement which cannot be avoided. How do >> >> > you >> >> > think? >> >> >> >> So how many entries would your array have once I start the 32,000th >> >> domain (having at any one time at most a single one running, besides >> >> Dom0)? >> >> >> > In such case, we have to keep a 32K array because the domain_id is the > index to >> > access the array. But this array is per socket so the whole memory used > should >> > not be too much. >> >> We carefully avoid any runtime allocations of order > 0, so if you >> were to set up such an array, you'd need to use vmalloc()/vzalloc(). >> But I continue to be unconvinced that we want such a large array >> in the first place. >> >> > After considering this issue more, I think the original codes might not be >> > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most >> > 288 CPUs. So, I think the domains running at same time in reality may not > be >> > so many (no efficient resources). If this hypothesis is right, a loop to > write >> > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am >> > wrong, please correct me. Thanks! >> >> What relationship does the number of CPUs have to the number of >> domains on a host? There could be thousands with just a few dozen >> CPUs, provided none or very few of them have high demands on >> CPU resources. Additionally please never forget that system sizes >> basically only ever grow. Plus we wouldn't want a latent issue here >> in case we ever end up needing to widen domain IDs beyond 16 bits. >> > How about a per socket array like this: > uint32_t domain_switch[1024]; > > Every bit represents a domain id. 
Then, we can handle this case as below: > 1. In 'psr_cpu_init()', clear the array to be 0. I think this place is enough > to >cover socket offline case. We do not need to clear it in > 'free_socket_resources'. > > 2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to > set >the bit to 1 according to domain_id. If the old value is 0 and the >'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to be 0. > > 3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the > bit >to 1 too. Then, update 'psr_cos_ids[socket]' according to find/pick flow. > > Then, we only use 4KB for one socket. This looks to come closer to something I'd consider acceptable, but I may not understand your intentions in full yet: For one, there's nowhere you clear the bit (other
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-13 03:41:44, Jan Beulich wrote: > >>> On 13.04.17 at 10:11,wrote: > > On 17-04-12 06:42:01, Jan Beulich wrote: > >> >>> On 12.04.17 at 14:23, wrote: > >> > On 17-04-12 03:09:56, Jan Beulich wrote: > >> >> >>> On 12.04.17 at 07:53, wrote: > >> >> > On 17-04-11 09:01:53, Jan Beulich wrote: > >> >> >> >>> On 01.04.17 at 15:53, wrote: > >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the > >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you > >> >> >> have a few thousand VMs, the loop above may take a while. > >> >> >> > >> >> > Hmm, that may be a potential issue. I have two proposals below. Could > >> >> > you > >> >> > please help to check which one you prefer? Or provide another > >> >> > solution? > >> >> > > >> >> > 1. Start a tasklet in free_socket_resources() to restore > >> > 'psr_cos_ids[socket]' > >> >> >of all domains. The action is protected by 'ref_lock' to avoid > >> > confliction > >> >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or > >> >> > memset > >> >> >the array to 0 in free_socket_resources(). > >> >> > > >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change > >> >> > index > >> >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per > >> >> > socket > >> >> >and can memset the array to 0 when socket is offline. But here is > >> >> > an > >> > issue > >> >> >that we do not know how many members this array should have. I > >> >> > cannot > >> > find > >> >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > >> > reallocation > >> >> >in 'psr_alloc_cos' if the newly created domain's id is bigger than > >> > current > >> >> >array number. > >> >> > >> >> The number of domains is limited by the special DOMID_* values. > >> >> However, allocating an array with 32k entries doesn't sound very > >> >> reasonable. > >> > > >> > I think 32K entries should be the extreme case. I can allocate e.g. 
100 > >> > entries > >> > when the first domain is created. If a new domain's id exceeds 100, > >> > reallocate > >> > another 100 entries. The total number of entries allocated should be > >> > less than > >> > 32K. This is a functional requirement which cannot be avoided. How do > >> > you > >> > think? > >> > >> So how many entries would your array have once I start the 32,000th > >> domain (having at any one time at most a single one running, besides > >> Dom0)? > >> > > In such case, we have to keep a 32K array because the domain_id is the > > index to > > access the array. But this array is per socket so the whole memory used > > should > > not be too much. > > We carefully avoid any runtime allocations of order > 0, so if you > were to set up such an array, you'd need to use vmalloc()/vzalloc(). > But I continue to be unconvinced that we want such a large array > in the first place. > > > After considering this issue more, I think the original codes might not be > > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most > > 288 CPUs. So, I think the domains running at same time in reality may not be > > so many (no efficient resources). If this hypothesis is right, a loop to > > write > > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am > > wrong, please correct me. Thanks! > > What relationship does the number of CPUs have to the number of > domains on a host? There could be thousands with just a few dozen > CPUs, provided none or very few of them have high demands on > CPU resources. Additionally please never forget that system sizes > basically only ever grow. Plus we wouldn't want a latent issue here > in case we ever end up needing to widen domain IDs beyond 16 bits. > How about a per socket array like this: uint32_t domain_switch[1024]; Every bit represents a domain id. Then, we can handle this case as below: 1. In 'psr_cpu_init()', clear the array to be 0. 
I think this place is enough to cover the socket offline case. We do not need
to clear it in 'free_socket_resources'.

2. In 'psr_ctxt_switch_to()', test_and_set_bit(domain_id, domain_switch) to
   set the bit to 1 according to domain_id. If the old value is 0 and
   'psr_cos_ids[socket]' is not 0, restore 'psr_cos_ids[socket]' to 0.

3. In 'psr_set_val()', test_and_set_bit(domain_id, domain_switch) to set the
   bit to 1 too. Then, update 'psr_cos_ids[socket]' according to the find/pick
   flow.

Then, we only use 4KB for one socket.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel
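[Editor's note] The three-step proposal above can be sketched as a stand-alone model. The helper names `psr_cpu_init_model()` and `domain_first_touch()` are illustrative only, and `test_and_set_bit()` here is a plain-C stand-in for Xen's atomic bitop of the same name; the real code would use the Xen primitive under the appropriate locking.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* One bit per domain ID: 1024 * 32 bits = 4KiB, covering the 32K usable
 * domain IDs below Xen's DOMID_FIRST_RESERVED. In the proposal there is
 * one such map per socket. */
static uint32_t domain_switch[1024];

/* Non-atomic stand-in for Xen's test_and_set_bit(). */
static int test_and_set_bit(unsigned int nr, uint32_t *addr)
{
    uint32_t mask = 1u << (nr & 31);
    int old = (addr[nr / 32] & mask) != 0;

    addr[nr / 32] |= mask;
    return old;
}

/* Step 1: psr_cpu_init() clears the map, so a socket that went through
 * an offline/online cycle starts from a clean slate. */
static void psr_cpu_init_model(void)
{
    memset(domain_switch, 0, sizeof(domain_switch));
}

/* Steps 2/3: in psr_ctxt_switch_to() / psr_set_val(), the first touch of
 * a domain since socket init is detectable from the old bit value; that
 * is the point where 'psr_cos_ids[socket]' would be restored or updated. */
static int domain_first_touch(unsigned int domid)
{
    return !test_and_set_bit(domid, domain_switch);
}
```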
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 13.04.17 at 10:11,wrote: > On 17-04-12 06:42:01, Jan Beulich wrote: >> >>> On 12.04.17 at 14:23, wrote: >> > On 17-04-12 03:09:56, Jan Beulich wrote: >> >> >>> On 12.04.17 at 07:53, wrote: >> >> > On 17-04-11 09:01:53, Jan Beulich wrote: >> >> >> >>> On 01.04.17 at 15:53, wrote: >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the >> >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you >> >> >> have a few thousand VMs, the loop above may take a while. >> >> >> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could >> >> > you >> >> > please help to check which one you prefer? Or provide another solution? >> >> > >> >> > 1. Start a tasklet in free_socket_resources() to restore >> > 'psr_cos_ids[socket]' >> >> >of all domains. The action is protected by 'ref_lock' to avoid >> > confliction >> >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or >> >> > memset >> >> >the array to 0 in free_socket_resources(). >> >> > >> >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change >> >> > index >> >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per >> >> > socket >> >> >and can memset the array to 0 when socket is offline. But here is an >> > issue >> >> >that we do not know how many members this array should have. I >> >> > cannot >> > find >> >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use >> > reallocation >> >> >in 'psr_alloc_cos' if the newly created domain's id is bigger than >> > current >> >> >array number. >> >> >> >> The number of domains is limited by the special DOMID_* values. >> >> However, allocating an array with 32k entries doesn't sound very >> >> reasonable. >> > >> > I think 32K entries should be the extreme case. I can allocate e.g. 100 >> > entries >> > when the first domain is created. If a new domain's id exceeds 100, >> > reallocate >> > another 100 entries. 
The total number of entries allocated should be less >> > than >> > 32K. This is a functional requirement which cannot be avoided. How do you >> > think? >> >> So how many entries would your array have once I start the 32,000th >> domain (having at any one time at most a single one running, besides >> Dom0)? >> > In such case, we have to keep a 32K array because the domain_id is the index > to > access the array. But this array is per socket so the whole memory used should > not be too much. We carefully avoid any runtime allocations of order > 0, so if you were to set up such an array, you'd need to use vmalloc()/vzalloc(). But I continue to be unconvinced that we want such a large array in the first place. > After considering this issue more, I think the original codes might not be > so unacceptable. Per my knowledge, Intel Xeon Phi chip can support at most > 288 CPUs. So, I think the domains running at same time in reality may not be > so many (no efficient resources). If this hypothesis is right, a loop to write > 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I am > wrong, please correct me. Thanks! What relationship does the number of CPUs have to the number of domains on a host? There could be thousands with just a few dozen CPUs, provided none or very few of them have high demands on CPU resources. Additionally please never forget that system sizes basically only ever grow. Plus we wouldn't want a latent issue here in case we ever end up needing to widen domain IDs beyond 16 bits. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
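[Editor's note] Jan's "order > 0" objection can be made concrete with a back-of-envelope sketch (helper names are illustrative; DOMID_FIRST_RESERVED is Xen's first special domain ID, 0x7ff0, so ordinary IDs stay below it). A per-socket uint32_t COS-ID array indexed by domain ID is well past one 4KiB page, hence the pointer at vmalloc()/vzalloc():

```c
#include <stddef.h>
#include <stdint.h>

#define DOMID_FIRST_RESERVED 0x7ff0u   /* first reserved DOMID_* value */
#define PAGE_SIZE_4K         4096u

/* Bytes needed for one uint32_t COS ID per possible ordinary domain ID. */
static size_t per_socket_array_bytes(void)
{
    return (size_t)DOMID_FIRST_RESERVED * sizeof(uint32_t);
}

/* Pages that allocation occupies: anything above one page is an
 * order > 0 runtime allocation, which Xen carefully avoids. */
static size_t per_socket_array_pages(void)
{
    return (per_socket_array_bytes() + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;
}
```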
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-12 06:42:01, Jan Beulich wrote: > >>> On 12.04.17 at 14:23,wrote: > > On 17-04-12 03:09:56, Jan Beulich wrote: > >> >>> On 12.04.17 at 07:53, wrote: > >> > On 17-04-11 09:01:53, Jan Beulich wrote: > >> >> >>> On 01.04.17 at 15:53, wrote: > >> >> > +info->cos_ref[cos]--; > >> >> > +spin_unlock(>ref_lock); > >> >> > + > >> >> > +d->arch.psr_cos_ids[socket] = 0; > >> >> > +} > >> >> > >> >> Overall, while you say in the revision log that this was a suggestion of > >> >> mine, I don't recall any such (and I've just checked the v9 thread of > >> >> this patch without finding anything), and hence it's not really clear to > >> >> me why this is needed. After all you should be freeing info anyway > >> > > >> > We discussed this in v9 05 patch. > >> > >> Ah, that's why I didn't find it. > >> > >> > I paste it below for your convenience to > >> > check. > >> > [Sun Yi]: > >> > > So, you think the MSRs values may not be valid after such process > >> > and > >> > > reloading (write MSRs to default value) is needed. If so, I would > >> > like > >> > > to do more operations in 'free_feature()': > >> > > 1. Iterate all domains working on the offline socket to change > >> > >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to > >> > init > >> > >status. > >> > > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. > >> > > > >> > > These can make the socket's info be totally restored back to init > > status. > >> > > >> > [Jan] > >> > Yes, that's what I think is needed. > >> > > >> >> (albeit I can't see this happening, which would look to be a bug in > >> >> patch 5), so getting the refcounts adjusted seems pointless in any > >> >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not > >> > > >> > We only free resources in 'socket_info[socekt]' but do not free itself. > >> > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]' > >> > is not a pointer that can be directly freed. 
> >> > socket_info = xzalloc_array(struct psr_socket_info, nr_sockets); > >> > > >> > That is the reason to reduce the 'info->cos_ref[cos]'. > >> > >> I see. But then there's no need to decrement it for each domain > >> using it, you could simply flush it to zero. > >> > >> >> certain - this may indeed by unavoidable, to match up with > >> >> psr_alloc_cos() using xzalloc. > >> >> > >> >> Furthermore I'm not at all convinced this is appropriate to do in the > >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you > >> >> have a few thousand VMs, the loop above may take a while. > >> >> > >> > Hmm, that may be a potential issue. I have two proposals below. Could you > >> > please help to check which one you prefer? Or provide another solution? > >> > > >> > 1. Start a tasklet in free_socket_resources() to restore > > 'psr_cos_ids[socket]' > >> >of all domains. The action is protected by 'ref_lock' to avoid > > confliction > >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or > >> > memset > >> >the array to 0 in free_socket_resources(). > >> > > >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change > >> > index > >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per > >> > socket > >> >and can memset the array to 0 when socket is offline. But here is an > > issue > >> >that we do not know how many members this array should have. I cannot > > find > >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > > reallocation > >> >in 'psr_alloc_cos' if the newly created domain's id is bigger than > > current > >> >array number. > >> > >> The number of domains is limited by the special DOMID_* values. > >> However, allocating an array with 32k entries doesn't sound very > >> reasonable. > > > > I think 32K entries should be the extreme case. I can allocate e.g. 100 > > entries > > when the first domain is created. If a new domain's id exceeds 100, > > reallocate > > another 100 entries. 
> > The total number of entries allocated should be less than
> > 32K. This is a functional requirement which cannot be avoided. How do you
> > think?
>
> So how many entries would your array have once I start the 32,000th
> domain (having at any one time at most a single one running, besides
> Dom0)?
>
In such case, we have to keep a 32K array because the domain_id is the index
to access the array. But this array is per socket so the whole memory used
should not be too much.

After considering this issue more, I think the original codes might not be so
unacceptable. Per my knowledge, the Intel Xeon Phi chip can support at most
288 CPUs. So, I think the domains running at the same time in reality may not
be so many (not enough resources). If this hypothesis is right, a loop to
write 'psr_cos_ids[socket]' of every domain to 0 may not take much time. If I
am wrong, please correct me. Thanks!
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 12.04.17 at 14:23,wrote: > On 17-04-12 03:09:56, Jan Beulich wrote: >> >>> On 12.04.17 at 07:53, wrote: >> > On 17-04-11 09:01:53, Jan Beulich wrote: >> >> >>> On 01.04.17 at 15:53, wrote: >> >> > +info->cos_ref[cos]--; >> >> > +spin_unlock(>ref_lock); >> >> > + >> >> > +d->arch.psr_cos_ids[socket] = 0; >> >> > +} >> >> >> >> Overall, while you say in the revision log that this was a suggestion of >> >> mine, I don't recall any such (and I've just checked the v9 thread of >> >> this patch without finding anything), and hence it's not really clear to >> >> me why this is needed. After all you should be freeing info anyway >> > >> > We discussed this in v9 05 patch. >> >> Ah, that's why I didn't find it. >> >> > I paste it below for your convenience to >> > check. >> > [Sun Yi]: >> > > So, you think the MSRs values may not be valid after such process and >> > > reloading (write MSRs to default value) is needed. If so, I would like >> > > to do more operations in 'free_feature()': >> > > 1. Iterate all domains working on the offline socket to change >> > >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init >> > >status. >> > > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. >> > > >> > > These can make the socket's info be totally restored back to init > status. >> > >> > [Jan] >> > Yes, that's what I think is needed. >> > >> >> (albeit I can't see this happening, which would look to be a bug in >> >> patch 5), so getting the refcounts adjusted seems pointless in any >> >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not >> > >> > We only free resources in 'socket_info[socekt]' but do not free itself. >> > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]' >> > is not a pointer that can be directly freed. >> > socket_info = xzalloc_array(struct psr_socket_info, nr_sockets); >> > >> > That is the reason to reduce the 'info->cos_ref[cos]'. >> >> I see. 
But then there's no need to decrement it for each domain >> using it, you could simply flush it to zero. >> >> >> certain - this may indeed by unavoidable, to match up with >> >> psr_alloc_cos() using xzalloc. >> >> >> >> Furthermore I'm not at all convinced this is appropriate to do in the >> >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you >> >> have a few thousand VMs, the loop above may take a while. >> >> >> > Hmm, that may be a potential issue. I have two proposals below. Could you >> > please help to check which one you prefer? Or provide another solution? >> > >> > 1. Start a tasklet in free_socket_resources() to restore > 'psr_cos_ids[socket]' >> >of all domains. The action is protected by 'ref_lock' to avoid > confliction >> >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or >> > memset >> >the array to 0 in free_socket_resources(). >> > >> > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index >> >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket >> >and can memset the array to 0 when socket is offline. But here is an > issue >> >that we do not know how many members this array should have. I cannot > find >> >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > reallocation >> >in 'psr_alloc_cos' if the newly created domain's id is bigger than > current >> >array number. >> >> The number of domains is limited by the special DOMID_* values. >> However, allocating an array with 32k entries doesn't sound very >> reasonable. > > I think 32K entries should be the extreme case. I can allocate e.g. 100 > entries > when the first domain is created. If a new domain's id exceeds 100, reallocate > another 100 entries. The total number of entries allocated should be less than > 32K. This is a functional requirement which cannot be avoided. How do you > think? 
So how many entries would your array have once I start the 32,000th domain (having at any one time at most a single one running, besides Dom0)?

Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-12 03:09:56, Jan Beulich wrote: > >>> On 12.04.17 at 07:53,wrote: > > On 17-04-11 09:01:53, Jan Beulich wrote: > >> >>> On 01.04.17 at 15:53, wrote: > >> > +info->cos_ref[cos]--; > >> > +spin_unlock(>ref_lock); > >> > + > >> > +d->arch.psr_cos_ids[socket] = 0; > >> > +} > >> > >> Overall, while you say in the revision log that this was a suggestion of > >> mine, I don't recall any such (and I've just checked the v9 thread of > >> this patch without finding anything), and hence it's not really clear to > >> me why this is needed. After all you should be freeing info anyway > > > > We discussed this in v9 05 patch. > > Ah, that's why I didn't find it. > > > I paste it below for your convenience to > > check. > > [Sun Yi]: > > > So, you think the MSRs values may not be valid after such process and > > > reloading (write MSRs to default value) is needed. If so, I would like > > > to do more operations in 'free_feature()': > > > 1. Iterate all domains working on the offline socket to change > > >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init > > >status. > > > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. > > > > > > These can make the socket's info be totally restored back to init > > status. > > > > [Jan] > > Yes, that's what I think is needed. > > > >> (albeit I can't see this happening, which would look to be a bug in > >> patch 5), so getting the refcounts adjusted seems pointless in any > >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not > > > > We only free resources in 'socket_info[socekt]' but do not free itself. > > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]' > > is not a pointer that can be directly freed. > > socket_info = xzalloc_array(struct psr_socket_info, nr_sockets); > > > > That is the reason to reduce the 'info->cos_ref[cos]'. > > I see. But then there's no need to decrement it for each domain > using it, you could simply flush it to zero. 
> > >> certain - this may indeed by unavoidable, to match up with > >> psr_alloc_cos() using xzalloc. > >> > >> Furthermore I'm not at all convinced this is appropriate to do in the > >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you > >> have a few thousand VMs, the loop above may take a while. > >> > > Hmm, that may be a potential issue. I have two proposals below. Could you > > please help to check which one you prefer? Or provide another solution? > > > > 1. Start a tasklet in free_socket_resources() to restore > > 'psr_cos_ids[socket]' > >of all domains. The action is protected by 'ref_lock' to avoid > > confliction > >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset > >the array to 0 in free_socket_resources(). > > > > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index > >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket > >and can memset the array to 0 when socket is offline. But here is an > > issue > >that we do not know how many members this array should have. I cannot > > find > >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > > reallocation > >in 'psr_alloc_cos' if the newly created domain's id is bigger than > > current > >array number. > > The number of domains is limited by the special DOMID_* values. > However, allocating an array with 32k entries doesn't sound very > reasonable. I think 32K entries should be the extreme case. I can allocate e.g. 100 entries when the first domain is created. If a new domain's id exceeds 100, reallocate another 100 entries. The total number of entries allocated should be less than 32K. This is a functional requirement which cannot be avoided. How do you think? > Sadly the other solution doesn't look very attractive > either, as there'll be quite a bit of synchronization needed (you'd > have to defer the same socket being re-used by a CPU being > onlined until you've done the cleanup). 
> This may need some more thought, but I can't see myself finding time for
> this any time soon, so I'm afraid I'll have to leave it to you for now.
>
Yes, the first option needs careful consideration of the synchronization,
which is more complex than the second option.

> Jan
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 12.04.17 at 07:53,wrote: > On 17-04-11 09:01:53, Jan Beulich wrote: >> >>> On 01.04.17 at 15:53, wrote: >> > +info->cos_ref[cos]--; >> > +spin_unlock(>ref_lock); >> > + >> > +d->arch.psr_cos_ids[socket] = 0; >> > +} >> >> Overall, while you say in the revision log that this was a suggestion of >> mine, I don't recall any such (and I've just checked the v9 thread of >> this patch without finding anything), and hence it's not really clear to >> me why this is needed. After all you should be freeing info anyway > > We discussed this in v9 05 patch. Ah, that's why I didn't find it. > I paste it below for your convenience to > check. > [Sun Yi]: > > So, you think the MSRs values may not be valid after such process and > > reloading (write MSRs to default value) is needed. If so, I would like > > to do more operations in 'free_feature()': > > 1. Iterate all domains working on the offline socket to change > >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init > >status. > > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. > > > > These can make the socket's info be totally restored back to init status. > > [Jan] > Yes, that's what I think is needed. > >> (albeit I can't see this happening, which would look to be a bug in >> patch 5), so getting the refcounts adjusted seems pointless in any >> event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not > > We only free resources in 'socket_info[socekt]' but do not free itself. > Below is how we allocate 'socket_info'. So, the 'socket_info[socekt]' > is not a pointer that can be directly freed. > socket_info = xzalloc_array(struct psr_socket_info, nr_sockets); > > That is the reason to reduce the 'info->cos_ref[cos]'. I see. But then there's no need to decrement it for each domain using it, you could simply flush it to zero. >> certain - this may indeed by unavoidable, to match up with >> psr_alloc_cos() using xzalloc. 
>> >> Furthermore I'm not at all convinced this is appropriate to do in the >> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you >> have a few thousand VMs, the loop above may take a while. >> > Hmm, that may be a potential issue. I have two proposals below. Could you > please help to check which one you prefer? Or provide another solution? > > 1. Start a tasklet in free_socket_resources() to restore 'psr_cos_ids[socket]' >of all domains. The action is protected by 'ref_lock' to avoid confliction >in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in tasklet or memset >the array to 0 in free_socket_resources(). > > 2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change index >from 'socket' to 'domain_id'. So we keep all domains' COS IDs per socket >and can memset the array to 0 when socket is offline. But here is an issue >that we do not know how many members this array should have. I cannot find >a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to use > reallocation >in 'psr_alloc_cos' if the newly created domain's id is bigger than current >array number. The number of domains is limited by the special DOMID_* values. However, allocating an array with 32k entries doesn't sound very reasonable. Sadly the other solution doesn't look very attractive either, as there'll be quite a bit of synchronization needed (you'd have to defer the same socket being re-used by a CPU being onlined until you've done the cleanup). This may need some more thought, but I can't see myself finding time for this any time soon, so I'm afraid I'll have to leave it to you for now. Jan ___ Xen-devel mailing list Xen-devel@lists.xen.org https://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
On 17-04-11 09:01:53, Jan Beulich wrote: > >>> On 01.04.17 at 15:53,wrote: > > --- a/xen/arch/x86/psr.c > > +++ b/xen/arch/x86/psr.c > > @@ -157,10 +157,26 @@ static void free_socket_resources(unsigned int socket) > > { > > unsigned int i; > > struct psr_socket_info *info = socket_info + socket; > > +struct domain *d; > > > > if ( !info ) > > return; > > > > +/* Restore domain cos id to 0 when socket is offline. */ > > +for_each_domain ( d ) > > +{ > > +unsigned int cos = d->arch.psr_cos_ids[socket]; > > +if ( cos == 0 ) > > Blank line between declaration(s) and statement(s) please. > Ok, will modify other places where have same issue. > > +continue; > > + > > +spin_lock(>ref_lock); > > +ASSERT(!cos || info->cos_ref[cos]); > > You've checked cos to be non-zero already above. > Yep. Thanks! > > +info->cos_ref[cos]--; > > +spin_unlock(>ref_lock); > > + > > +d->arch.psr_cos_ids[socket] = 0; > > +} > > Overall, while you say in the revision log that this was a suggestion of > mine, I don't recall any such (and I've just checked the v9 thread of > this patch without finding anything), and hence it's not really clear to > me why this is needed. After all you should be freeing info anyway We discussed this in v9 05 patch. I paste it below for your convenience to check. [Sun Yi]: > So, you think the MSRs values may not be valid after such process and > reloading (write MSRs to default value) is needed. If so, I would like > to do more operations in 'free_feature()': > 1. Iterate all domains working on the offline socket to change >'d->arch.psr_cos_ids[socket]' to COS 0, i.e restore it back to init >status. > 2. Restore 'socket_info[socket].cos_ref[]' to all 0. > > These can make the socket's info be totally restored back to init status. [Jan] Yes, that's what I think is needed. > (albeit I can't see this happening, which would look to be a bug in > patch 5), so getting the refcounts adjusted seems pointless in any > event. 
> Whether d->arch.psr_cos_ids[socket] needs clearing I'm not

We only free resources in 'socket_info[socket]' but do not free it itself.
Below is how we allocate 'socket_info'. So, 'socket_info[socket]' is not a
pointer that can be directly freed.

    socket_info = xzalloc_array(struct psr_socket_info, nr_sockets);

That is the reason to reduce 'info->cos_ref[cos]'.

> certain - this may indeed be unavoidable, to match up with
> psr_alloc_cos() using xzalloc.
>
> Furthermore I'm not at all convinced this is appropriate to do in the
> context of a CPU_UP_CANCELED / CPU_DEAD notification: If you
> have a few thousand VMs, the loop above may take a while.
>
Hmm, that may be a potential issue. I have two proposals below. Could you
please help to check which one you prefer? Or provide another solution?

1. Start a tasklet in free_socket_resources() to restore 'psr_cos_ids[socket]'
   of all domains. The action is protected by 'ref_lock' to avoid conflicts
   in 'psr_set_val'. We can reduce 'info->cos_ref[cos]' in the tasklet or
   memset the array to 0 in free_socket_resources().

2. Move 'psr_cos_ids[]' from 'domain' to 'psr_socket_info' and change the
   index from 'socket' to 'domain_id'. So we keep all domains' COS IDs per
   socket and can memset the array to 0 when the socket is offline. But here
   is an issue: we do not know how many members this array should have. I
   cannot find a macro something like 'DOMAIN_MAX_NUMBER'. So, I prefer to
   use reallocation in 'psr_alloc_cos' if the newly created domain's id is
   bigger than the current array number.
> > @@ -574,15 +590,210 @@ int psr_get_val(struct domain *d, unsigned int > > socket, > > return 0; > > } > > > > -int psr_set_l3_cbm(struct domain *d, unsigned int socket, > > - uint64_t cbm, enum cbm_type type) > > +/* Set value functions */ > > +static unsigned int get_cos_num(const struct psr_socket_info *info) > > { > > return 0; > > } > > > > +static int gather_val_array(uint32_t val[], > > +unsigned int array_len, > > +const struct psr_socket_info *info, > > +unsigned int old_cos) > > +{ > > +return -EINVAL; > > +} > > + > > +static int insert_val_to_array(uint32_t val[], > > As indicated before, I'm pretty convinced "into" would be more > natural than "to". > Got it. Thanks! > > + unsigned int array_len, > > + const struct psr_socket_info *info, > > + enum psr_feat_type feat_type, > > + enum cbm_type type, > > + uint32_t new_val) > > +{ > > +return -EINVAL; > > +} > > + > > +static int find_cos(const uint32_t val[], unsigned int array_len, > > +enum
Re: [Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
>>> On 01.04.17 at 15:53,wrote:
> --- a/xen/arch/x86/psr.c
> +++ b/xen/arch/x86/psr.c
> @@ -157,10 +157,26 @@ static void free_socket_resources(unsigned int socket)
>  {
>      unsigned int i;
>      struct psr_socket_info *info = socket_info + socket;
> +    struct domain *d;
>  
>      if ( !info )
>          return;
>  
> +    /* Restore domain cos id to 0 when socket is offline. */
> +    for_each_domain ( d )
> +    {
> +        unsigned int cos = d->arch.psr_cos_ids[socket];
> +        if ( cos == 0 )

Blank line between declaration(s) and statement(s) please.

> +            continue;
> +
> +        spin_lock(&info->ref_lock);
> +        ASSERT(!cos || info->cos_ref[cos]);

You've checked cos to be non-zero already above.

> +        info->cos_ref[cos]--;
> +        spin_unlock(&info->ref_lock);
> +
> +        d->arch.psr_cos_ids[socket] = 0;
> +    }

Overall, while you say in the revision log that this was a suggestion of
mine, I don't recall any such (and I've just checked the v9 thread of
this patch without finding anything), and hence it's not really clear to
me why this is needed. After all you should be freeing info anyway
(albeit I can't see this happening, which would look to be a bug in
patch 5), so getting the refcounts adjusted seems pointless in any
event. Whether d->arch.psr_cos_ids[socket] needs clearing I'm not
certain - this may indeed be unavoidable, to match up with
psr_alloc_cos() using xzalloc.

Furthermore I'm not at all convinced this is appropriate to do in the
context of a CPU_UP_CANCELED / CPU_DEAD notification: If you have a few
thousand VMs, the loop above may take a while.
> @@ -574,15 +590,210 @@ int psr_get_val(struct domain *d, unsigned int socket,
>      return 0;
>  }
>  
> -int psr_set_l3_cbm(struct domain *d, unsigned int socket,
> -                   uint64_t cbm, enum cbm_type type)
> +/* Set value functions */
> +static unsigned int get_cos_num(const struct psr_socket_info *info)
>  {
>      return 0;
>  }
>  
> +static int gather_val_array(uint32_t val[],
> +                            unsigned int array_len,
> +                            const struct psr_socket_info *info,
> +                            unsigned int old_cos)
> +{
> +    return -EINVAL;
> +}
> +
> +static int insert_val_to_array(uint32_t val[],

As indicated before, I'm pretty convinced "into" would be more
natural than "to".

> +                               unsigned int array_len,
> +                               const struct psr_socket_info *info,
> +                               enum psr_feat_type feat_type,
> +                               enum cbm_type type,
> +                               uint32_t new_val)
> +{
> +    return -EINVAL;
> +}
> +
> +static int find_cos(const uint32_t val[], unsigned int array_len,
> +                    enum psr_feat_type feat_type,
> +                    const struct psr_socket_info *info,
> +                    spinlock_t *ref_lock)

I don't think I did suggest adding another parameter here. The lock is
accessible from info, isn't it? In which case, as I _did_ suggest, you
should drop const from it instead. But ...

> +{
> +    ASSERT(spin_is_locked(ref_lock));

... the assertion seems rather pointless anyway with there just being
a single caller, which very visibly acquires the lock up front.

> +static int pick_avail_cos(const struct psr_socket_info *info,
> +                          spinlock_t *ref_lock,

Same here then.

> +int psr_set_val(struct domain *d, unsigned int socket,
> +                uint32_t val, enum cbm_type type)
> +{
> +    unsigned int old_cos;
> +    int cos, ret;
> +    unsigned int *ref;
> +    uint32_t *val_array;
> +    struct psr_socket_info *info = get_socket_info(socket);
> +    unsigned int array_len;
> +    enum psr_feat_type feat_type;
> +
> +    if ( IS_ERR(info) )
> +        return PTR_ERR(info);
> +
> +    feat_type = psr_cbm_type_to_feat_type(type);
> +    if ( feat_type > ARRAY_SIZE(info->features) ||

I think I did point out the off-by-one mistake here in an earlier patch
already.
> +         !info->features[feat_type] )
> +        return -ENOENT;
> +
> +    /*
> +     * Step 0:
> +     * old_cos means the COS ID current domain is using. By default, it is 0.
> +     *
> +     * For every COS ID, there is a reference count to record how many
> +     * domains are using the COS register corresponding to this COS ID.
> +     * - If ref[old_cos] is 0, that means this COS is not used by any domain.
> +     * - If ref[old_cos] is 1, that means this COS is only used by current
> +     *   domain.
> +     * - If ref[old_cos] is more than 1, that means multiple domains are
> +     *   using this COS.
> +     */
> +    old_cos = d->arch.psr_cos_ids[socket];
> +    ASSERT(old_cos < MAX_COS_REG_CNT);
> +
> +    ref = info->cos_ref;
> +
> +    /*
> +     * Step 1:
> +     * Gather a value array to store all features cos_reg_val[old_cos].
> +     * And, set the input new val into array according to
[Xen-devel] [PATCH v10 09/25] x86: refactor psr: L3 CAT: set value: implement framework.
As the set value flow is the most complicated one in psr, it will be
divided into several patches to make things clearer. This patch
implements the set value framework to show the whole picture first. It
also changes the domctl interface to make it more general.

To make the set value flow general and able to support multiple features
at the same time, it includes the steps below:
1. Get the COS ID that the current domain is using.
2. Gather a value array to store all features' current values, and
   replace the value of the feature being set with the new input value.
3. Find out if there is already a COS ID for which all features' values
   are the same as the array. If so, we can reuse this COS ID.
4. If none is found, pick an available COS ID. Only a COS ID whose ref
   is 0 or 1 can be picked.
5. Write the feature's MSR according to the COS ID and cbm_type.
6. Update ref according to the COS ID.
7. Save the COS ID into the current domain's psr_cos_ids[socket] so that
   we know which COS the domain is using on the socket.

So, some functions are abstracted, and the callback functions will be
implemented in subsequent patches.

Here is an example to understand the process. The CPU supports two
features, e.g. L3 CAT and L2 CAT. The user wants to set L3 CAT of Dom1
to 0x1ff.

1. At the initial time, the old_cos of Dom1 is 0. The COS registers
   hold the values below at this time.

            ---------------------------------
            | COS 0 | COS 1 | COS 2 | ...   |
   ------------------------------------------
   L3 CAT   | 0x7ff | 0x7ff | 0x7ff | ...   |
   ------------------------------------------
   L2 CAT   | 0xff  | 0xff  | 0xff  | ...   |
   ------------------------------------------

2. Gather the value array and insert the new value into it:
   val[0]: 0x1ff
   val[1]: 0xff
3. It cannot find a matching COS.
4. Pick COS 1 to store the value set.
5. Write the L3 CAT COS 1 register. The COS register values are changed
   to the below now.

            ---------------------------------
            | COS 0 | COS 1 | COS 2 | ...   |
   ------------------------------------------
   L3 CAT   | 0x7ff | 0x1ff | ...   | ...   |
   ------------------------------------------
   L2 CAT   | 0xff  | 0xff  | ...   | ...   |
   ------------------------------------------

6. ref[1] is increased to 1 because Dom1 is using it now.
7. Save 1 to Dom1's psr_cos_ids[socket].

Then, the user wants to set L3 CAT of Dom2 to 0x1ff too.
The old_cos of Dom2 is 0 too. Repeat the flow above. The val array
assembled is:
   val[0]: 0x1ff
   val[1]: 0xff

So, it can find a matching COS, COS 1, and reuse it for Dom2. ref[1] is
increased to 2 now because both Dom1 and Dom2 are using this COS ID.
Save 1 to Dom2's psr_cos_ids[socket].

Another thing that needs emphasizing is the context switch. When a
context switch happens, 'psr_ctxt_switch_to' is called by the system to
get the domain's COS ID from 'psr_cos_ids[socket]'. But
'psr_cos_ids[socket]' is set at step 7 above. So, there are three
scenarios, e.g.:

1. The user calls the domctl interface on Dom0 to set COS ID 1 for Dom1
   into its psr_cos_ids[]. Then, Dom1 is scheduled so that
   'psr_ctxt_switch_to()' is called, which makes COS ID 1 take effect.
   For this case, no action is needed.

2. Dom1 runs on CPU 1 with COS ID 1 in effect. At the same time, the
   user calls the domctl interface on Dom0 to set a new COS ID 2 for
   Dom1 into psr_cos_ids[]. After the time slice ends and Dom1 is
   scheduled again, the new COS ID 2 takes effect.

3. While a new COS ID is being set in psr_cos_ids[],
   'psr_ctxt_switch_to()' is called and accesses the same psr_cos_ids[]
   member through 'psr_assoc_cos'. The COS ID is constrained by cos_mask
   so that it cannot exceed cos_max. So even if the COS ID read here is
   stale, it is still a workable ID (within cos_max); another valid CBM
   is merely in effect for a short time, and on the next schedule the
   correct CBM takes effect.

None of these cases cause a race condition or harm the system. The PSR
features set cache capacity for a domain, and a cache setting only
becomes effective progressively. By the time it really takes effect, the
time slice used to schedule the domain may already have passed. So, even
if a stale COS ID is used to set ASSOC, only another valid CBM is in
effect for a short time during the cache preparation time; the correct
COS ID takes effect shortly after. This does not affect the cache
capacity setting much.
Signed-off-by: Yi Sun
---
v10:
  - restore domain cos id to 0 when socket is offline.
    (suggested by Jan Beulich)
  - check 'psr_cat_op.data' to make sure only the lower 32 bits are valid.
    (suggested by Jan Beulich)
  - remove unnecessary fixed width types of parameters and variables.
    (suggested by Jan Beulich)
  - rename 'insert_new_val_to_array' to 'insert_val_to_array'.
    (suggested by Jan Beulich)
  - input 'ref_lock' pointer into functions to check if it has been locked.
    (suggested by Jan Beulich)
  - add comment to declare that the set process is protected by 'domctl_lock'.