Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-18 Thread Tim Deegan
Hi Kevin,

At 14:09 +0100 on 11 Dec (1418303386), Tim Deegan wrote:
 At 02:03 + on 11 Dec (1418259797), Tian, Kevin wrote:
   From: Tim Deegan [mailto:t...@xen.org]
   Now since the code's not going to be in 4.5 anyway, it should be
   possible to _develop_ it in this manner, possibly in a private branch,
   even if the early stages aren't getting applied immediately.  We
   should be able to set up an rmrr branch on the public servers if that
   helps.  But again, that relies on having a design worked out in
   advance that both developers and maintainers are (within reason)
   committed to.
  
  that's a good suggestion. We'll send out a design review today, and then
  based on discussion we can see whether making sense to do such 
  incremental way.
 
 Sounds good!

I haven't seen this design doc yet -- if I missed it can someone point
me to it?

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-18 Thread Chen, Tiejun

On 2014/12/19 0:13, Tim Deegan wrote:

Hi Kevin,

At 14:09 +0100 on 11 Dec (1418303386), Tim Deegan wrote:

At 02:03 + on 11 Dec (1418259797), Tian, Kevin wrote:

From: Tim Deegan [mailto:t...@xen.org]
Now since the code's not going to be in 4.5 anyway, it should be
possible to _develop_ it in this manner, possibly in a private branch,
even if the early stages aren't getting applied immediately.  We
should be able to set up an rmrr branch on the public servers if that
helps.  But again, that relies on having a design worked out in
advance that both developers and maintainers are (within reason)
committed to.


that's a good suggestion. We'll send out a design review today, and then
based on discussion we can see whether making sense to do such
incremental way.


Sounds good!


I haven't seen this design doc yet -- if I missed it can someone point
me to it?



No.

Its still reviewed/discussed internally but it will come soon.

Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-11 Thread Tim Deegan
At 02:03 + on 11 Dec (1418259797), Tian, Kevin wrote:
  From: Tim Deegan [mailto:t...@xen.org]
  Now since the code's not going to be in 4.5 anyway, it should be
  possible to _develop_ it in this manner, possibly in a private branch,
  even if the early stages aren't getting applied immediately.  We
  should be able to set up an rmrr branch on the public servers if that
  helps.  But again, that relies on having a design worked out in
  advance that both developers and maintainers are (within reason)
  committed to.
 
 that's a good suggestion. We'll send out a design review today, and then
 based on discussion we can see whether making sense to do such 
 incremental way.

Sounds good!

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-10 Thread Jan Beulich
 On 10.12.14 at 04:39, kevin.t...@intel.com wrote:
 1. It's more efficient for new people to start from a small, well-defined 
 task
 in one area, and then spanning to adjacent areas gradually. Patience must 
 be given by the community to help them grow;

Yes. But if a large item like the RMRR one is being picked, this is a
recipe for failure.

 2. Unfortunately this RMRR effort grows from original two patches (very VT-d
 focused) to now v8 17 patches spanning many areas (including hypercall, mmu, 
 domain builder, hvmloader, etc.), and thus imposing both long learning curve
 and lots of frustrations being no achievement returned. Though having a 
 complete
 solution is desired, we need to help split the big task into step-by-step 
 approach 
 as long as:
   - the overall design is agreed
   - each step is self-contained 
   - each step won't be wasted moving forward. 
 That way new people can see incremental achievements, instead of being hit 
 down before final success. 
 
 Take this RMRR for example. Maybe we can split into steps like:
 
   step-1: setup RMRR identify mapping in hypervisor. fail if confliction. 
 no
 user space changes
   step-2: expose RMRR knowledge to userspace and detect confliction in
 domain builder and hvmloader
   step-3: reconcile e820/mmio to avoid confliction in a best effort
   step-4: miscellaneous enhancements, each can be ACK-ed individually:
   - handle guest CPU access to RMRR regions
   - handle conflicting RMRR regions
   - handle group RMRRs
   - re-enable USB device assignment
 
 That way Tiejun can focus on a self-contained task at one time, and then 
 observe
 continuous acknowledgements for his work. We don't need to claim RMRR fully
 ready in Xen until last patch is accepted, but that at least provides a way 
 to ack
 new people when working on complex features and also allow for partial usage 
 on his work.

If only this wouldn't result in regressions when done in steps (like you
outlined above, or likely also if split up in any other ways). Having to
do this in one go is the price you/we have to pay for this not having
got done properly from the beginning.

 3. Maintainers sometimes didn't give definitive guidance to the new people, 
 and the high-level design is not closed in the 1st place. e.g. when I thought 
 this v8
 version has everyone agreed on the design, there's another comment from Jan
 about using XENMEM_set_memory_map might be better, while back to Aug.
 when Tiejun raised that option the answer is I'm not sure. Note I 
 understand
 as a maintainer he might not have definite answers for all opens. But w/o a
 definitive guide new people may waste lots of effort on chasing a wrong 
 option,
 which is definitely frustrating. 

Main problem being that the maintainers to help here are primarily you,
and only then me or others - after all this is a VT-d only problem, not a
general IOMMU one. The fact that non-VT-d code gets touched doesn't
matter when considering just the design. And that's why I had asked
Tiejun to work with the two of you on getting the basis right. I don't
know how much of that possibly happened without the public seeing it,
but the results seem to suggest not all that much.

 So I'd suggest for such non-trivial task for a new people, all maintainers in 
 involved areas (xen, mmu, tools, vtd, etc) should first spend time to agree 
 on the high level design. At that stage, let's skip the coding problems to 
 save
 both time. After agreed design, then we can help the new people to improve 
 the coding to reach check-in criteria, which then becomes a converging 
 process.
 
 for this RMRR issue, let's close the design first, and then use staged 
 approach
 to get this patch series in.

Yes please. Till now (as said many times before) the only route I see
without grounds up design considerations is to disable pass-through for
devices associated with RMRRs. The longer the current situation lasts,
the more I'm tempted to put together a patch to do just that.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-10 Thread Tian, Kevin
 From: Jan Beulich [mailto:jbeul...@suse.com]
 Sent: Wednesday, December 10, 2014 5:01 PM
 
  On 10.12.14 at 04:39, kevin.t...@intel.com wrote:
  1. It's more efficient for new people to start from a small, well-defined
  task
  in one area, and then spanning to adjacent areas gradually. Patience must
  be given by the community to help them grow;
 
 Yes. But if a large item like the RMRR one is being picked, this is a
 recipe for failure.

RMRR is already there, just broken. Here what we are doing is to fix it. 
Regarding
to that, as long as the small item doesn't cause new failures or regressions, it
should be fine.

 
  2. Unfortunately this RMRR effort grows from original two patches (very
 VT-d
  focused) to now v8 17 patches spanning many areas (including hypercall,
 mmu,
  domain builder, hvmloader, etc.), and thus imposing both long learning curve
  and lots of frustrations being no achievement returned. Though having a
  complete
  solution is desired, we need to help split the big task into step-by-step
  approach
  as long as:
  - the overall design is agreed
  - each step is self-contained
  - each step won't be wasted moving forward.
  That way new people can see incremental achievements, instead of being hit
  down before final success.
 
  Take this RMRR for example. Maybe we can split into steps like:
 
  step-1: setup RMRR identify mapping in hypervisor. fail if confliction. 
  no
  user space changes
  step-2: expose RMRR knowledge to userspace and detect confliction in
  domain builder and hvmloader
  step-3: reconcile e820/mmio to avoid confliction in a best effort
  step-4: miscellaneous enhancements, each can be ACK-ed individually:
  - handle guest CPU access to RMRR regions
  - handle conflicting RMRR regions
  - handle group RMRRs
  - re-enable USB device assignment
 
  That way Tiejun can focus on a self-contained task at one time, and then
  observe
  continuous acknowledgements for his work. We don't need to claim RMRR
 fully
  ready in Xen until last patch is accepted, but that at least provides a way
  to ack
  new people when working on complex features and also allow for partial
 usage
  on his work.
 
 If only this wouldn't result in regressions when done in steps (like you
 outlined above, or likely also if split up in any other ways). Having to
 do this in one go is the price you/we have to pay for this not having
 got done properly from the beginning.

then let's sit down to clear the design first before going to review the detail.

 
  3. Maintainers sometimes didn't give definitive guidance to the new people,
  and the high-level design is not closed in the 1st place. e.g. when I 
  thought
  this v8
  version has everyone agreed on the design, there's another comment from
 Jan
  about using XENMEM_set_memory_map might be better, while back to Aug.
  when Tiejun raised that option the answer is I'm not sure. Note I
  understand
  as a maintainer he might not have definite answers for all opens. But w/o a
  definitive guide new people may waste lots of effort on chasing a wrong
  option,
  which is definitely frustrating.
 
 Main problem being that the maintainers to help here are primarily you,
 and only then me or others - after all this is a VT-d only problem, not a
 general IOMMU one. The fact that non-VT-d code gets touched doesn't
 matter when considering just the design. And that's why I had asked
 Tiejun to work with the two of you on getting the basis right. I don't
 know how much of that possibly happened without the public seeing it,
 but the results seem to suggest not all that much.

We worked with Tiejun on the design, and some level of code review (not much
as you did since it touches lots of different areas), and the version he sent 
out is 
what we discussed to be a right way to go. But since you tempt to have 
spontaneous different opinions in each series, let's close that first. We'll 
work
with Tiejun to send a design review request separately, and let's see how it 
works
and whether it may be split into small steps.

and... as you already noted, when 'RMRR' is a VT feature, the majority code
touched in the patch series are not in VT-d space. Somehow I view this is a
general feature development, i.e. how to reserve a resource from guest 
physical space. RMRR is just one client of this feature, and there could be
others. In such case, we need all the maintainers in corresponding areas to
help, meant you as the general maintainer, meant Tim for mmu part, and
meant Ian for domain builder part, etc... Regarding to design, we need all
on the table and come to agreement.

Thanks
Kevin


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-10 Thread Tim Deegan
Hi Kevin,

Thanks for taking the time to work through this.

At 03:39 + on 10 Dec (1418179184), Tian, Kevin wrote:
 1. It's more efficient for new people to start from a small, well-defined task
 in one area, and then spanning to adjacent areas gradually. Patience must 
 be given by the community to help them grow;
 
 2. Unfortunately this RMRR effort grows from original two patches (very VT-d
 focused) to now v8 17 patches spanning many areas (including hypercall, mmu, 
 domain builder, hvmloader, etc.), and thus imposing both long learning curve
 and lots of frustrations being no achievement returned. Though having a 
 complete
 solution is desired, we need to help split the big task into step-by-step 
 approach 
 as long as:
   - the overall design is agreed
   - each step is self-contained 
   - each step won't be wasted moving forward. 
 That way new people can see incremental achievements, instead of being hit 
 down before final success. 
 
 Take this RMRR for example. Maybe we can split into steps like:
 
   step-1: setup RMRR identify mapping in hypervisor. fail if confliction. 
 no
 user space changes
   step-2: expose RMRR knowledge to userspace and detect confliction in
 domain builder and hvmloader
   step-3: reconcile e820/mmio to avoid confliction in a best effort
   step-4: miscellaneous enhancements, each can be ACK-ed individually:
   - handle guest CPU access to RMRR regions
   - handle conflicting RMRR regions
   - handle group RMRRs
   - re-enable USB device assignment
 
 That way Tiejun can focus on a self-contained task at one time, and then 
 observe
 continuous acknowledgements for his work. We don't need to claim RMRR fully
 ready in Xen until last patch is accepted, but that at least provides a way 
 to ack
 new people when working on complex features and also allow for partial usage 
 on his work.

We had this discussion before and I think it was clear that the
maintainers in general are unhappy with a partial solution.  OTOH, if
we can agree on the roadmap, and Intel will commit to seeing the work
through, it might be possible.  I think Jan is the man to convince
here. :)

Now since the code's not going to be in 4.5 anyway, it should be
possible to _develop_ it in this manner, possibly in a private branch,
even if the early stages aren't getting applied immediately.  We
should be able to set up an rmrr branch on the public servers if that
helps.  But again, that relies on having a design worked out in
advance that both developers and maintainers are (within reason)
committed to.

 3. Maintainers sometimes didn't give definitive guidance to the new people, 
 and the high-level design is not closed in the 1st place. e.g. when I thought 
 this v8
 version has everyone agreed on the design, there's another comment from Jan
 about using XENMEM_set_memory_map might be better, while back to Aug.
 when Tiejun raised that option the answer is I'm not sure. Note I understand
 as a maintainer he might not have definite answers for all opens. But w/o a
 definitive guide new people may waste lots of effort on chasing a wrong 
 option,
 which is definitely frustrating. 

This is definitely a problem, and indeed frustrating for the
developers.  We had similar difficulties with PVH development, where
even though we planned the architecture up-front, once the code was
written it became clear that a different approach would have been
better.  I'm not sure what we can do here to make it more likely that
we get the design right first time.

I do think that figuring out the design in advance makes these
projects much smoother, and it's something we're seeing more of.  For
example David Vrabel's designs for the new event channel interface, or
indeed the design doc he wrote about grant table locking, where review
at the design stage may have avoided a bunch of implementation effort
altogether.

 So I'd suggest for such non-trivial task for a new people, all maintainers in 
 involved areas (xen, mmu, tools, vtd, etc) should first spend time to agree 
 on the high level design. At that stage, let's skip the coding problems to 
 save
 both time.

That sounds like a very good idea to me.

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-10 Thread Tian, Kevin
 From: Tim Deegan [mailto:t...@xen.org]
 Sent: Wednesday, December 10, 2014 7:12 PM
 
 Hi Kevin,
 
 Thanks for taking the time to work through this.
 
 At 03:39 + on 10 Dec (1418179184), Tian, Kevin wrote:
  1. It's more efficient for new people to start from a small, well-defined 
  task
  in one area, and then spanning to adjacent areas gradually. Patience must
  be given by the community to help them grow;
 
  2. Unfortunately this RMRR effort grows from original two patches (very
 VT-d
  focused) to now v8 17 patches spanning many areas (including hypercall,
 mmu,
  domain builder, hvmloader, etc.), and thus imposing both long learning curve
  and lots of frustrations being no achievement returned. Though having a
 complete
  solution is desired, we need to help split the big task into step-by-step
 approach
  as long as:
  - the overall design is agreed
  - each step is self-contained
  - each step won't be wasted moving forward.
  That way new people can see incremental achievements, instead of being hit
  down before final success.
 
  Take this RMRR for example. Maybe we can split into steps like:
 
  step-1: setup RMRR identify mapping in hypervisor. fail if confliction. 
  no
  user space changes
  step-2: expose RMRR knowledge to userspace and detect confliction in
  domain builder and hvmloader
  step-3: reconcile e820/mmio to avoid confliction in a best effort
  step-4: miscellaneous enhancements, each can be ACK-ed individually:
  - handle guest CPU access to RMRR regions
  - handle conflicting RMRR regions
  - handle group RMRRs
  - re-enable USB device assignment
 
  That way Tiejun can focus on a self-contained task at one time, and then
 observe
  continuous acknowledgements for his work. We don't need to claim RMRR
 fully
  ready in Xen until last patch is accepted, but that at least provides a way 
  to
 ack
  new people when working on complex features and also allow for partial
 usage
  on his work.
 
 We had this discussion before and I think it was clear that the
 maintainers in general are unhappy with a partial solution.  OTOH, if
 we can agree on the roadmap, and Intel will commit to seeing the work
 through, it might be possible.  I think Jan is the man to convince
 here. :)

I think w/ last 8 series Tiejun has sent out, there's no doubt Intel commits
to make a complete work. :-)

 
 Now since the code's not going to be in 4.5 anyway, it should be
 possible to _develop_ it in this manner, possibly in a private branch,
 even if the early stages aren't getting applied immediately.  We
 should be able to set up an rmrr branch on the public servers if that
 helps.  But again, that relies on having a design worked out in
 advance that both developers and maintainers are (within reason)
 committed to.

that's a good suggestion. We'll send out a design review today, and then
based on discussion we can see whether making sense to do such 
incremental way.

 
  3. Maintainers sometimes didn't give definitive guidance to the new people,
  and the high-level design is not closed in the 1st place. e.g. when I 
  thought
 this v8
  version has everyone agreed on the design, there's another comment from
 Jan
  about using XENMEM_set_memory_map might be better, while back to Aug.
  when Tiejun raised that option the answer is I'm not sure. Note I
 understand
  as a maintainer he might not have definite answers for all opens. But w/o a
  definitive guide new people may waste lots of effort on chasing a wrong
 option,
  which is definitely frustrating.
 
 This is definitely a problem, and indeed frustrating for the
 developers.  We had similar difficulties with PVH development, where
 even though we planned the architecture up-front, once the code was
 written it became clear that a different approach would have been
 better.  I'm not sure what we can do here to make it more likely that
 we get the design right first time.

understand and reasonable.

 
 I do think that figuring out the design in advance makes these
 projects much smoother, and it's something we're seeing more of.  For
 example David Vrabel's designs for the new event channel interface, or
 indeed the design doc he wrote about grant table locking, where review
 at the design stage may have avoided a bunch of implementation effort
 altogether.

yes. a formal review in advance would be lot better than mixing design 
comments in scattered in deep coding review comments. For this RMRR
example, Tiejun did send out some summary for his patch set, but not
abstracted enough to catch people's eye on key design opens. And having
a goal for 4.5 window really brought him hard time to balance code 
refactoring and learn new areas when each series was questioned with 
new design inputs.

 
  So I'd suggest for such non-trivial task for a new people, all maintainers 
  in
  involved areas (xen, mmu, tools, vtd, etc) should first spend time to agree
  on the high 

Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Jan Beulich
 On 09.12.14 at 08:47, tiejun.c...@intel.com wrote:
 On 2014/12/8 16:51, Jan Beulich wrote:
 The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
 is identical and can be factored out pretty easily afaict.
 
 What about this?
 
 struct get_reserved_device_memory {
  struct xen_reserved_device_memory_map map;
  unsigned int used_entries;
  struct domain *domain;
 };
 
 static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
u32 id, void *ctxt)
 {
  struct get_reserved_device_memory *grdm = ctxt;
  struct domain *d = grdm-domain;
  unsigned int i, hit_one = 0;
  u32 sbdf;
  struct xen_reserved_device_memory rdm = {
  .start_pfn = start, .nr_pages = nr
  };
 
  if ( !d-arch.hvm_domain.pci_force )
  {
  for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
  {
  sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
   d-arch.hvm_domain.pcidevs[i].bus,
   d-arch.hvm_domain.pcidevs[i].devfn);
  if ( sbdf == id )
  {
  hit_one = 1;
  break;
  }
  }
 
  if ( !hit_one )
  return 0;
  }

Why do you always pick other than the simplest possible solution?
You don't need a separate variable here, you can simply check
whether i reached d-arch.hvm_domain.num_pcidevs after the
loop. And even if you added a variable, it would want to be a
bool_t one with the way you use it.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Chen, Tiejun

On 2014/12/9 16:19, Jan Beulich wrote:

On 09.12.14 at 08:47, tiejun.c...@intel.com wrote:

On 2014/12/8 16:51, Jan Beulich wrote:

The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
is identical and can be factored out pretty easily afaict.


What about this?

struct get_reserved_device_memory {
  struct xen_reserved_device_memory_map map;
  unsigned int used_entries;
  struct domain *domain;
};

static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
u32 id, void *ctxt)
{
  struct get_reserved_device_memory *grdm = ctxt;
  struct domain *d = grdm-domain;
  unsigned int i, hit_one = 0;
  u32 sbdf;
  struct xen_reserved_device_memory rdm = {
  .start_pfn = start, .nr_pages = nr
  };

  if ( !d-arch.hvm_domain.pci_force )
  {
  for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
  {
  sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
   d-arch.hvm_domain.pcidevs[i].bus,
   d-arch.hvm_domain.pcidevs[i].devfn);
  if ( sbdf == id )
  {
  hit_one = 1;
  break;
  }
  }

  if ( !hit_one )
  return 0;
  }


Why do you always pick other than the simplest possible solution?


I don't intend it to be, but I may go a complicated way, even a wrong 
way, based on my understanding. But as one main maintainer, if you 
always say to me in such a reproachful word more than once, I have to 
consider you may hint constantly I'm not a suitable candidate to finish 
this. Its fair to me, I'd really like to quit this to ask my manager if 
it can deliver to other guy to make sure this can move forward.



You don't need a separate variable here, you can simply check
whether i reached d-arch.hvm_domain.num_pcidevs after the
loop. And even if you added a variable, it would want to be a


Are you saying this?

if ( i == d-arch.hvm_domain.num_pcidevs )
return 0;

But if the last one happens to one hit, 'i' is equal to 
d-arch.hvm_domain.num_pcidevs.


Thanks
Tiejun


bool_t one with the way you use it.

Jan





___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Jan Beulich
 On 09.12.14 at 10:12, tiejun.c...@intel.com wrote:
 On 2014/12/9 16:19, Jan Beulich wrote:
 On 09.12.14 at 08:47, tiejun.c...@intel.com wrote:
 On 2014/12/8 16:51, Jan Beulich wrote:
 The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
 is identical and can be factored out pretty easily afaict.

 What about this?

 struct get_reserved_device_memory {
   struct xen_reserved_device_memory_map map;
   unsigned int used_entries;
   struct domain *domain;
 };

 static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 u32 id, void *ctxt)
 {
   struct get_reserved_device_memory *grdm = ctxt;
   struct domain *d = grdm-domain;
   unsigned int i, hit_one = 0;
   u32 sbdf;
   struct xen_reserved_device_memory rdm = {
   .start_pfn = start, .nr_pages = nr
   };

   if ( !d-arch.hvm_domain.pci_force )
   {
   for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
   {
   sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
d-arch.hvm_domain.pcidevs[i].bus,
d-arch.hvm_domain.pcidevs[i].devfn);
   if ( sbdf == id )
   {
   hit_one = 1;
   break;
   }
   }

   if ( !hit_one )
   return 0;
   }

 Why do you always pick other than the simplest possible solution?
 
 I don't intend it to be, but I may go a complicated way, even a wrong 
 way, based on my understanding. But as one main maintainer, if you 
 always say to me in such a reproachful word more than once, I have to 
 consider you may hint constantly I'm not a suitable candidate to finish 
 this. Its fair to me, I'd really like to quit this to ask my manager if 
 it can deliver to other guy to make sure this can move forward.
 
 You don't need a separate variable here, you can simply check
 whether i reached d-arch.hvm_domain.num_pcidevs after the
 loop. And even if you added a variable, it would want to be a
 
 Are you saying this?
 
   if ( i == d-arch.hvm_domain.num_pcidevs )
   return 0;

Yes. Or use =.

 But if the last one happens to one hit, 'i' is equal to 
 d-arch.hvm_domain.num_pcidevs.

No, when the last one hits, i == d-arch.hvm_domain.num_pcidevs - 1.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Chen, Tiejun

On 2014/12/9 17:21, Jan Beulich wrote:

On 09.12.14 at 10:12, tiejun.c...@intel.com wrote:

On 2014/12/9 16:19, Jan Beulich wrote:

On 09.12.14 at 08:47, tiejun.c...@intel.com wrote:

On 2014/12/8 16:51, Jan Beulich wrote:

The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
is identical and can be factored out pretty easily afaict.


What about this?

struct get_reserved_device_memory {
   struct xen_reserved_device_memory_map map;
   unsigned int used_entries;
   struct domain *domain;
};

static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 u32 id, void *ctxt)
{
   struct get_reserved_device_memory *grdm = ctxt;
   struct domain *d = grdm-domain;
   unsigned int i, hit_one = 0;
   u32 sbdf;
   struct xen_reserved_device_memory rdm = {
   .start_pfn = start, .nr_pages = nr
   };

   if ( !d-arch.hvm_domain.pci_force )
   {
   for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
   {
   sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
d-arch.hvm_domain.pcidevs[i].bus,
d-arch.hvm_domain.pcidevs[i].devfn);
   if ( sbdf == id )
   {
   hit_one = 1;
   break;
   }
   }

   if ( !hit_one )
   return 0;
   }


Why do you always pick other than the simplest possible solution?


I don't intend it to be, but I may go a complicated way, even a wrong
way, based on my understanding. But as one main maintainer, if you
always say to me in such a reproachful word more than once, I have to
consider you may hint constantly I'm not a suitable candidate to finish
this. Its fair to me, I'd really like to quit this to ask my manager if
it can deliver to other guy to make sure this can move forward.


You don't need a separate variable here, you can simply check
whether i reached d-arch.hvm_domain.num_pcidevs after the
loop. And even if you added a variable, it would want to be a


Are you saying this?

if ( i == d-arch.hvm_domain.num_pcidevs )
return 0;


Yes. Or use =.


Okay.




But if the last one happens to one hit, 'i' is equal to
d-arch.hvm_domain.num_pcidevs.


No, when the last one hits, i == d-arch.hvm_domain.num_pcidevs - 1.



You're right.

Thanks
Tiejun

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Tim Deegan
At 08:19 + on 09 Dec (1418109561), Jan Beulich wrote:
 Why do you always pick other than the simplest possible solution?

Jan, please don't make personal comments like this in code review.  It
can only offend and demoralize contributors, and deter other people
from joining in.

I understand that it can be frustrating, and I'm sure I have lashed
out at people on-list in the past.  But remember that people who are
new to the project need time to learn, and keep the comments to the
code itself.

I can see that this series has been going for a long time, and is
still getting hung up on coding style issues.  Might it be useful to
have a round of internal review from some of the more xen-experienced
engineers at Intel before Jan looks at it again?

Cheers,

Tim.

___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Jan Beulich
 On 09.12.14 at 11:11, t...@xen.org wrote:
 At 08:19 + on 09 Dec (1418109561), Jan Beulich wrote:
 Why do you always pick other than the simplest possible solution?
 
 Jan, please don't make personal comments like this in code review.  It
 can only offend and demoralize contributors, and deter other people
 from joining in.

I apologize - I shouldn't have permitted myself to do so.

 I understand that it can be frustrating, and I'm sure I have lashed
 out at people on-list in the past.  But remember that people who are
 new to the project need time to learn, and keep the comments to the
 code itself.
 
 I can see that this series has been going for a long time, and is
 still getting hung up on coding style issues.  Might it be useful to
 have a round of internal review from some of the more xen-experienced
 engineers at Intel before Jan looks at it again?

I've been suggesting something along those lines a number of times,
with (apparently) no success at all.

Jan


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-09 Thread Chen, Tiejun

On 2014/12/9 18:22, Jan Beulich wrote:

On 09.12.14 at 11:11, t...@xen.org wrote:

At 08:19 + on 09 Dec (1418109561), Jan Beulich wrote:

Why do you always pick other than the simplest possible solution?


Jan, please don't make personal comments like this in code review.  It
can only offend and demoralize contributors, and deter other people
from joining in.


I apologize - I shouldn't have permitted myself to do so.


I understand that it can be frustrating, and I'm sure I have lashed


Actually myself also have this same feeling but this really isn't 
helpful to step forward.



out at people on-list in the past.  But remember that people who are
new to the project need time to learn, and keep the comments to the
code itself.

I can see that this series has been going for a long time, and is
still getting hung up on coding style issues.  Might it be useful to


Something is my fault here. Although I learn more about Xen from this 
series constantly, looks its hard to cover this big thread with that 
reasonable code under Xen circumstance. This is also why I claimed I 
can't do this right from the start.


And additionally, even some initial designs are not very clear or argued 
between us, so when we discuss something during each revision, it bring 
a new thought again...



have a round of internal review from some of the more xen-experienced
engineers at Intel before Jan looks at it again?


I've been suggesting something along those lines a number of times,
with (apparently) no success at all.



Actually we had a little bit discussion internal and some guys really 
brought me some comments previously, but I think they're too busy to 
review each patch carefully one line by one line, especially if I bring 
something new to address yours comments just in the process of public 
review.


Anyway, now let me ask if this can deliver to other suitable guy. As I 
said previously I really would like to quit if this can step next properly.


___
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel


Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-08 Thread Jan Beulich
 On 08.12.14 at 08:11, tiejun.c...@intel.com wrote:
 On 2014/12/4 23:50, Jan Beulich wrote:
 On 01.12.14 at 10:24, tiejun.c...@intel.com wrote:

 -if ( __copy_to_compat_offset(grdm-map.buffer, grdm-used_entries,
 - rdm, 1) )
 -return -EFAULT;
 +if ( d )
 +{
 +if ( d-arch.hvm_domain.pci_force )

 You didn't verify that the domain is a HVM/PVH one.
 
 Is this, ASSERT(is_hvm_domain(grdm.domain)), correct?

Certainly not, or do you want to crash the hypervisor because of bad
tools input?

 +{
 +if ( grdm-used_entries  grdm-map.nr_entries )
 +{
 +if ( __copy_to_compat_offset(grdm-map.buffer,
 + grdm-used_entries,
 + rdm, 1) )
 +{
 +rcu_unlock_domain(d);
 +return -EFAULT;
 +}
 +}
 +++grdm-used_entries;
 +}
 +else
 +{
 +for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
 +{
 +sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
 + d-arch.hvm_domain.pcidevs[i].bus,
 + d-arch.hvm_domain.pcidevs[i].devfn);
 +if ( sbdf == id )
 +{
 +if ( grdm-used_entries  grdm-map.nr_entries )
 +{
 +if ( __copy_to_compat_offset(grdm-map.buffer,
 + grdm-used_entries,
 + rdm, 1) )
 +{
 +rcu_unlock_domain(d);
 +return -EFAULT;
 +}
 +}
 +++grdm-used_entries;

 break;
 
 Added.
 

 Also trying to fold code identical on the if and else branches would
 seem pretty desirable.
 
 Sorry, I can't see what I'm missing.

The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
is identical and can be factored out pretty easily afaict.

 @@ -319,9 +358,13 @@ int compat_memory_op(unsigned int cmd, 
 XEN_GUEST_HANDLE_PARAM(void) compat)

   if ( !rc  grdm.map.nr_entries  grdm.used_entries )
   rc = -ENOBUFS;
 +
   grdm.map.nr_entries = grdm.used_entries;
 -if ( __copy_to_guest(compat, grdm.map, 1) )
 -rc = -EFAULT;
 +if ( grdm.map.nr_entries )
 +{
 +if ( __copy_to_guest(compat, grdm.map, 1) )
 +rc = -EFAULT;
 +}

 Why do you need this change?
 
 If we have no any entries, why do we still copy that?

That's not only a pointless optimization (the counter question being
Why add an extra conditional when the copying does no harm?), but
also not subject of this patch. Additionally iirc the field is an IN/OUT,
i.e. when no entries were found you want to tell the caller so.

 --- a/xen/drivers/passthrough/vtd/dmar.c
 +++ b/xen/drivers/passthrough/vtd/dmar.c
 @@ -904,17 +904,33 @@ int platform_supports_x2apic(void)

   int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
   {
 -struct acpi_rmrr_unit *rmrr;
 +struct acpi_rmrr_unit *rmrr, *rmrr_cur = NULL;
   int rc = 0;
 +unsigned int i;
 +u16 bdf;

 -list_for_each_entry(rmrr, acpi_rmrr_units, list)
 +for_each_rmrr_device ( rmrr, bdf, i )
   {
 -rc = func(PFN_DOWN(rmrr-base_address),
 -  PFN_UP(rmrr-end_address) - PFN_DOWN(rmrr-base_address),
 -  ctxt);
 -if ( rc )
 -break;
 +if ( rmrr != rmrr_cur )
 +{
 +rc = func(PFN_DOWN(rmrr-base_address),
 +  PFN_UP(rmrr-end_address) -
 +PFN_DOWN(rmrr-base_address),
 +  PCI_SBDF(rmrr-segment, bdf),
 +  ctxt);
 +
 +if ( unlikely(rc  0) )
 +return rc;
 +
 +/* Just go next. */
 +if ( !rc )
 +rmrr_cur = rmrr;
 +
 +/* Now just return specific to user requirement. */
 +if ( rc  0 )
 +return rc;

 Nice that you check for that, but I can't see this case occurring
 anymore. Did you lose some code? Also please don't write code
 
 We have three scenarios here:
 
 #1 rc  0 means this is an error;
 #2 rc = 0 means the tools caller don't know how many buffers it should 
 construct, so we need to count all entries as 'nr_entries' to return.
 #3 rc  0 means in some cases, we need to return some specific values, 
 like 1 to indicate we're hitting some RMRR range. Currently, we use gfn 
 to check this in case of memory populating, ept violation handler and 
 mem_access.

Yes, I saw that you make use of this in later patches. It just seemed
suspicious 

Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-08 Thread Chen, Tiejun

On 2014/12/8 16:51, Jan Beulich wrote:

On 08.12.14 at 08:11, tiejun.c...@intel.com wrote:

On 2014/12/4 23:50, Jan Beulich wrote:

On 01.12.14 at 10:24, tiejun.c...@intel.com wrote:


-if ( __copy_to_compat_offset(grdm-map.buffer, grdm-used_entries,
- rdm, 1) )
-return -EFAULT;
+if ( d )
+{
+if ( d-arch.hvm_domain.pci_force )


You didn't verify that the domain is a HVM/PVH one.


Is this, ASSERT(is_hvm_domain(grdm.domain)), correct?


Certainly not, or do you want to crash the hypervisor because of bad
tools input?


Based on konrad's hint, I hope this should work for you,

+if ( !is_hvm_domain(d) )
+return -ENOSYS;




+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+}
+else
+{
+for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
+{
+sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
+ d-arch.hvm_domain.pcidevs[i].bus,
+ d-arch.hvm_domain.pcidevs[i].devfn);
+if ( sbdf == id )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;


break;


Added.



Also trying to fold code identical on the if and else branches would
seem pretty desirable.


Sorry, I can't see what I'm missing.


The whole if-copy-unlock-and-return-EFAULT-otherwise-increment
is identical and can be factored out pretty easily afaict.


What about this?

struct get_reserved_device_memory {
struct xen_reserved_device_memory_map map;
unsigned int used_entries;
struct domain *domain;
};

static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
  u32 id, void *ctxt)
{
struct get_reserved_device_memory *grdm = ctxt;
struct domain *d = grdm-domain;
unsigned int i, hit_one = 0;
u32 sbdf;
struct xen_reserved_device_memory rdm = {
.start_pfn = start, .nr_pages = nr
};

if ( !d-arch.hvm_domain.pci_force )
{
for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
{
sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
 d-arch.hvm_domain.pcidevs[i].bus,
 d-arch.hvm_domain.pcidevs[i].devfn);
if ( sbdf == id )
{
hit_one = 1;
break;
}
}

if ( !hit_one )
return 0;
}

if ( grdm-used_entries  grdm-map.nr_entries )
{
if ( __copy_to_guest_offset(grdm-map.buffer,
grdm-used_entries,
rdm, 1) )
return -EFAULT;
}

++grdm-used_entries;

return 0;
}




@@ -319,9 +358,13 @@ int compat_memory_op(unsigned int cmd,

XEN_GUEST_HANDLE_PARAM(void) compat)


   if ( !rc  grdm.map.nr_entries  grdm.used_entries )
   rc = -ENOBUFS;
+
   grdm.map.nr_entries = grdm.used_entries;
-if ( __copy_to_guest(compat, grdm.map, 1) )
-rc = -EFAULT;
+if ( grdm.map.nr_entries )
+{
+if ( __copy_to_guest(compat, grdm.map, 1) )
+rc = -EFAULT;
+}


Why do you need this change?


If we have no any entries, why do we still copy that?


That's not only a pointless optimization (the counter question being
Why add an extra conditional when the copying does no harm?), but
also not subject of this patch. Additionally iirc the field is an IN/OUT,
i.e. when no entries were found you want to tell the caller so.


Right so I will recover that.




--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -904,17 +904,33 @@ int platform_supports_x2apic(void)

   int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
   {
-struct acpi_rmrr_unit *rmrr;
+struct acpi_rmrr_unit *rmrr, *rmrr_cur = NULL;
   int rc = 0;
+unsigned int i;
+u16 bdf;

-list_for_each_entry(rmrr, acpi_rmrr_units, list)
+for_each_rmrr_device ( rmrr, bdf, i )

Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-07 Thread Chen, Tiejun

On 2014/12/2 16:46, Tian, Kevin wrote:

From: Chen, Tiejun
Sent: Monday, December 01, 2014 5:24 PM

After we intend to expost that hypercall explicitly based on
XEN_DOMCTL_set_rdm, we need this rebase. I hope we can squash
this into that previous patch once Jan Ack this.


better to merge together, since it's the right thing to do based on previous
discussion.


As I said I will do this after this patch is acked.



one question about 'd-arch.hvm_domain.pci_force'. My impression is
that this flag enables force check, and while enabled, you'll always


Yes.


do selected BDF filtering by default. However from below code, seems


No.


pci_force is used to whether report all or selected regions. Am I reading
it wrong?


if ( d-arch.hvm_domain.pci_force )
{
...
}
else
{
for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
...
}

So by default, we first check d-arch.hvm_domain.pci_force without 
filtering all selected BDF :)


Thanks
Tiejun





Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
  xen/common/compat/memory.c | 75
++
  xen/common/memory.c| 71
+---
  xen/drivers/passthrough/vtd/dmar.c | 32 
  xen/include/public/memory.h|  5 +++
  xen/include/xen/iommu.h|  2 +-
  xen/include/xen/pci.h  |  2 +
  6 files changed, 148 insertions(+), 39 deletions(-)

diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
index 60512fa..e6a256e 100644
--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -22,27 +22,66 @@ struct get_reserved_device_memory {
  unsigned int used_entries;
  };

-static int get_reserved_device_memory(xen_pfn_t start,
-  xen_ulong_t nr, void *ctxt)
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
  {
  struct get_reserved_device_memory *grdm = ctxt;
+struct domain *d;
+unsigned int i;
+u32 sbdf;
+struct compat_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};

-if ( grdm-used_entries  grdm-map.nr_entries )
-{
-struct compat_reserved_device_memory rdm = {
-.start_pfn = start, .nr_pages = nr
-};
+if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+return -ERANGE;

-if ( rdm.start_pfn != start || rdm.nr_pages != nr )
-return -ERANGE;
+d = rcu_lock_domain_by_any_id(grdm-map.domid);
+if ( d == NULL )
+return -ESRCH;

-if ( __copy_to_compat_offset(grdm-map.buffer,
grdm-used_entries,
- rdm, 1) )
-return -EFAULT;
+if ( d )
+{
+if ( d-arch.hvm_domain.pci_force )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+}
+else
+{
+for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
+{
+sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
+ d-arch.hvm_domain.pcidevs[i].bus,
+
d-arch.hvm_domain.pcidevs[i].devfn);
+if ( sbdf == id )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if
( __copy_to_compat_offset(grdm-map.buffer,
+
grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+}
+}
+}
  }

-++grdm-used_entries;
-
+rcu_unlock_domain(d);
  return 0;
  }
  #endif
@@ -319,9 +358,13 @@ int compat_memory_op(unsigned int cmd,
XEN_GUEST_HANDLE_PARAM(void) compat)

  if ( !rc  grdm.map.nr_entries  grdm.used_entries )
  rc = -ENOBUFS;
+
  grdm.map.nr_entries = grdm.used_entries;
-if ( __copy_to_guest(compat, grdm.map, 1) )
-rc = -EFAULT;
+if ( grdm.map.nr_entries )
+{
+if ( __copy_to_guest(compat, grdm.map, 1) )
+rc = -EFAULT;
+}

  return rc;
  }
diff --git a/xen/common/memory.c b/xen/common/memory.c
index 4788acc..9ce82b1 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -698,24 +698,63 @@ 

Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-07 Thread Chen, Tiejun

On 2014/12/4 23:50, Jan Beulich wrote:

On 01.12.14 at 10:24, tiejun.c...@intel.com wrote:

--- a/xen/common/compat/memory.c
+++ b/xen/common/compat/memory.c
@@ -22,27 +22,66 @@ struct get_reserved_device_memory {
  unsigned int used_entries;
  };

-static int get_reserved_device_memory(xen_pfn_t start,
-  xen_ulong_t nr, void *ctxt)
+static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
+  u32 id, void *ctxt)
  {
  struct get_reserved_device_memory *grdm = ctxt;
+struct domain *d;
+unsigned int i;
+u32 sbdf;
+struct compat_reserved_device_memory rdm = {
+.start_pfn = start, .nr_pages = nr
+};

-if ( grdm-used_entries  grdm-map.nr_entries )
-{
-struct compat_reserved_device_memory rdm = {
-.start_pfn = start, .nr_pages = nr
-};
+if ( rdm.start_pfn != start || rdm.nr_pages != nr )
+return -ERANGE;

-if ( rdm.start_pfn != start || rdm.nr_pages != nr )
-return -ERANGE;
+d = rcu_lock_domain_by_any_id(grdm-map.domid);
+if ( d == NULL )
+return -ESRCH;


So why are you doing this in the call back (potentially many times)
instead of just once in compat_memory_op(), storing the pointer in
the context structure?


Okay.





-if ( __copy_to_compat_offset(grdm-map.buffer, grdm-used_entries,
- rdm, 1) )
-return -EFAULT;
+if ( d )
+{
+if ( d-arch.hvm_domain.pci_force )


You didn't verify that the domain is a HVM/PVH one.


Is this, ASSERT(is_hvm_domain(grdm.domain)), correct?




+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;
+}
+else
+{
+for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
+{
+sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
+ d-arch.hvm_domain.pcidevs[i].bus,
+ d-arch.hvm_domain.pcidevs[i].devfn);
+if ( sbdf == id )
+{
+if ( grdm-used_entries  grdm-map.nr_entries )
+{
+if ( __copy_to_compat_offset(grdm-map.buffer,
+ grdm-used_entries,
+ rdm, 1) )
+{
+rcu_unlock_domain(d);
+return -EFAULT;
+}
+}
+++grdm-used_entries;


break;


Added.



Also trying to fold code identical on the if and else branches would
seem pretty desirable.


Sorry, I can't see what I'm missing.




@@ -319,9 +358,13 @@ int compat_memory_op(unsigned int cmd, 
XEN_GUEST_HANDLE_PARAM(void) compat)

  if ( !rc  grdm.map.nr_entries  grdm.used_entries )
  rc = -ENOBUFS;
+
  grdm.map.nr_entries = grdm.used_entries;
-if ( __copy_to_guest(compat, grdm.map, 1) )
-rc = -EFAULT;
+if ( grdm.map.nr_entries )
+{
+if ( __copy_to_guest(compat, grdm.map, 1) )
+rc = -EFAULT;
+}


Why do you need this change?


If we have no any entries, why do we still copy that?




--- a/xen/drivers/passthrough/vtd/dmar.c
+++ b/xen/drivers/passthrough/vtd/dmar.c
@@ -904,17 +904,33 @@ int platform_supports_x2apic(void)

  int intel_iommu_get_reserved_device_memory(iommu_grdm_t *func, void *ctxt)
  {
-struct acpi_rmrr_unit *rmrr;
+struct acpi_rmrr_unit *rmrr, *rmrr_cur = NULL;
  int rc = 0;
+unsigned int i;
+u16 bdf;

-list_for_each_entry(rmrr, acpi_rmrr_units, list)
+for_each_rmrr_device ( rmrr, bdf, i )
  {
-rc = func(PFN_DOWN(rmrr-base_address),
-  PFN_UP(rmrr-end_address) - PFN_DOWN(rmrr-base_address),
-  ctxt);
-if ( rc )
-break;
+if ( rmrr != rmrr_cur )
+{
+rc = func(PFN_DOWN(rmrr-base_address),
+  PFN_UP(rmrr-end_address) -
+PFN_DOWN(rmrr-base_address),
+  PCI_SBDF(rmrr-segment, bdf),
+  ctxt);
+
+if ( unlikely(rc  0) )
+return rc;
+
+/* Just go next. */
+if ( !rc )
+rmrr_cur = rmrr;
+
+/* Now just return specific to user requirement. */
+if ( rc  0 )
+return rc;


Nice that you check for 

Re: [Xen-devel] [v8][PATCH 04/17] update the existing hypercall to support XEN_DOMCTL_set_rdm

2014-12-02 Thread Tian, Kevin
 From: Chen, Tiejun
 Sent: Monday, December 01, 2014 5:24 PM
 
 After we intend to expost that hypercall explicitly based on
 XEN_DOMCTL_set_rdm, we need this rebase. I hope we can squash
 this into that previous patch once Jan Ack this.

better to merge together, since it's the right thing to do based on previous
discussion.

one question about 'd-arch.hvm_domain.pci_force'. My impression is
that this flag enables force check, and while enabled, you'll always
do selected BDF filtering by default. However from below code, seems
pci_force is used to whether report all or selected regions. Am I reading
it wrong?

 
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
  xen/common/compat/memory.c | 75
 ++
  xen/common/memory.c| 71
 +---
  xen/drivers/passthrough/vtd/dmar.c | 32 
  xen/include/public/memory.h|  5 +++
  xen/include/xen/iommu.h|  2 +-
  xen/include/xen/pci.h  |  2 +
  6 files changed, 148 insertions(+), 39 deletions(-)
 
 diff --git a/xen/common/compat/memory.c b/xen/common/compat/memory.c
 index 60512fa..e6a256e 100644
 --- a/xen/common/compat/memory.c
 +++ b/xen/common/compat/memory.c
 @@ -22,27 +22,66 @@ struct get_reserved_device_memory {
  unsigned int used_entries;
  };
 
 -static int get_reserved_device_memory(xen_pfn_t start,
 -  xen_ulong_t nr, void *ctxt)
 +static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 +  u32 id, void *ctxt)
  {
  struct get_reserved_device_memory *grdm = ctxt;
 +struct domain *d;
 +unsigned int i;
 +u32 sbdf;
 +struct compat_reserved_device_memory rdm = {
 +.start_pfn = start, .nr_pages = nr
 +};
 
 -if ( grdm-used_entries  grdm-map.nr_entries )
 -{
 -struct compat_reserved_device_memory rdm = {
 -.start_pfn = start, .nr_pages = nr
 -};
 +if ( rdm.start_pfn != start || rdm.nr_pages != nr )
 +return -ERANGE;
 
 -if ( rdm.start_pfn != start || rdm.nr_pages != nr )
 -return -ERANGE;
 +d = rcu_lock_domain_by_any_id(grdm-map.domid);
 +if ( d == NULL )
 +return -ESRCH;
 
 -if ( __copy_to_compat_offset(grdm-map.buffer,
 grdm-used_entries,
 - rdm, 1) )
 -return -EFAULT;
 +if ( d )
 +{
 +if ( d-arch.hvm_domain.pci_force )
 +{
 +if ( grdm-used_entries  grdm-map.nr_entries )
 +{
 +if ( __copy_to_compat_offset(grdm-map.buffer,
 + grdm-used_entries,
 + rdm, 1) )
 +{
 +rcu_unlock_domain(d);
 +return -EFAULT;
 +}
 +}
 +++grdm-used_entries;
 +}
 +else
 +{
 +for ( i = 0; i  d-arch.hvm_domain.num_pcidevs; i++ )
 +{
 +sbdf = PCI_SBDF2(d-arch.hvm_domain.pcidevs[i].seg,
 + d-arch.hvm_domain.pcidevs[i].bus,
 +
 d-arch.hvm_domain.pcidevs[i].devfn);
 +if ( sbdf == id )
 +{
 +if ( grdm-used_entries  grdm-map.nr_entries )
 +{
 +if
 ( __copy_to_compat_offset(grdm-map.buffer,
 +
 grdm-used_entries,
 + rdm, 1) )
 +{
 +rcu_unlock_domain(d);
 +return -EFAULT;
 +}
 +}
 +++grdm-used_entries;
 +}
 +}
 +}
  }
 
 -++grdm-used_entries;
 -
 +rcu_unlock_domain(d);
  return 0;
  }
  #endif
 @@ -319,9 +358,13 @@ int compat_memory_op(unsigned int cmd,
 XEN_GUEST_HANDLE_PARAM(void) compat)
 
  if ( !rc  grdm.map.nr_entries  grdm.used_entries )
  rc = -ENOBUFS;
 +
  grdm.map.nr_entries = grdm.used_entries;
 -if ( __copy_to_guest(compat, grdm.map, 1) )
 -rc = -EFAULT;
 +if ( grdm.map.nr_entries )
 +{
 +if ( __copy_to_guest(compat, grdm.map, 1) )
 +rc = -EFAULT;
 +}
 
  return rc;
  }
 diff --git a/xen/common/memory.c b/xen/common/memory.c
 index 4788acc..9ce82b1 100644
 --- a/xen/common/memory.c
 +++ b/xen/common/memory.c
 @@ -698,24 +698,63 @@ struct get_reserved_device_memory {
  unsigned int used_entries;
  };
 
 -static int get_reserved_device_memory(xen_pfn_t start,
 -  xen_ulong_t nr, void *ctxt)
 +static int get_reserved_device_memory(xen_pfn_t start, xen_ulong_t nr,
 +  u32 id, void *ctxt)