Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 6:48 PM, Logan Gunthorpe wrote:
> 
> 
> On 13/03/18 04:29 PM, Sinan Kaya wrote:
>> If hardware doesn't support it, blacklisting should have been the right
>> path and I still think that you should remove all switch business from the 
>> code.
>> I did not hear enough justification for having a switch requirement
>> for P2P.
> 
> I disagree.
> 
>> You are also saying that root ports have issues not because of functionality 
>> but
>> because of performance. 
> 
> No... performance can also be an issue but the main justification for
> this is that many root ports do not support P2P at all and can fail in
> different ways (usually just dropping the TLPs).
> 
>> What if I come up with a very cheap/crappy switch (something like used in 
>> data
>> mining)?
> 
> Good luck. That's not how hardware is designed. PCIe switches that have
> any hope to compete with the existing market will need features like
> NTB, non-transparent ports, etc... and that implies a certain symmetry
> (ie there isn't a special host port because there may be more than one
> and it may move around) which implies that packets will always be able
> to forward between each ports which implies P2P will work.
> 

It is still a switch it can move packets but, maybe it can move data at
100kbps speed. 

What prevents that switch from trying P2P and having a bad user experience?

If everything is so broken, I was suggesting to at least list the switches
you have tested.

What's the problem with this?

Why do you want to assume that all switches are good and all root ports are
bad?

>> I have been doing my best to provide feedback. It feels like you are throwing
>> them over the wall to be honest.
>>
>> You keep implying "not my problem".
> 
> Well, the fact of the matter is that extending this in all the ways
> people like you want face huge problems on all sides. These are not
> trivial issues and holding back work that works for our problem because
> it doesn't solve your problem is IMO just going to grind development in
> this area to a halt. We have to have something we can agree on which is
> safe to start building on. The building can't just emerge fully formed
> in one go.
> 

What if the design is so canned that you can't change anything? 

I have been asking things like getting rid of switch search in ACS
enablement towards achieving generic P2P. You seem to be pushing back.
You said yourself P2P and isolation doesn't go together at this point
but you also care about isolation for other devices that are not doing
P2P.

Confusing...

> P2P proposal go back a long time and have never gotten anywhere because
> there are limitations and people want it to do things that are hard but
> don't want to contribute the work to solving those problems.
> 
>>> Well, if it's a problem for someone they'll have to solve it. We're
>>> targeting JBOFs that have no use for ACS / IOMMU groups at all.
>>
>> IMO, you (not somebody) should address this one way or the other before this
>> series land in upstream.
> 
> The real way to address this (as I've mentioned before) is with some way
> of doing ACS and iomem groups dynamically. But this is a huge project in
> itself and is never going to be part of the P2P patchset.

fair enough.

> 
>> Another assumption: There are other architectures like ARM64 where IOMMU
>> is enabled by default even if you don't use VMs for security reasons.
>> IOMMU blocks stray transactions.
> 
> True, but it doesn't change my point: ACS is not a requirement for Linux
> many many systems do not have it on at all or by default.

I don't think so.

It is not a requirement for you but it is a requirement for me (ARM64 guy).
Linux happens to run on multiple architectures. One exception invalidates your
point.

> 
>> Didn't the ACS behavior change suddenly for no good reason when we enabled
>> your code even though I might not be using the P2P but I happen to have
>> a kernel with P2P config option?
> 

If you are assuming that your kernel option should not be used by general
distributions like Ubuntu/redhat etc. and requires a kernel compilation,
creating a dependency to EXPERT is the right way to do. 

Distributions assume that no damage is done by enabling PCI bus options
under normal circumstances.

> Well no, presumably you made a conscious choice to turn the config
> option on and build a custom kernel for your box. That doesn't seem very
> sudden and the reason is that the two concepts are very much at odds
> with each other: you can't have isolation and still have transactions
> between devices.
> 
> Logan
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 6:00 PM, Logan Gunthorpe wrote:
> 
> 
> On 13/03/18 03:22 PM, Sinan Kaya wrote:
>> It sounds like you have very tight hardware expectations for this to work
>> at this moment. You also don't want to generalize this code for others and
>> address the shortcomings.
> 
> No, that's the way the community has pushed this work. Our original work
> was very general and we were told it was unacceptable to put the onus on
> the user and have things break if the hardware doesn't support it. I
> think that's a reasonable requirement. So the hardware use-cases were
> wittled down to the ones we can be confident about and support with
> reasonable changes to the kernel today.

If hardware doesn't support it, blacklisting should have been the right
path and I still think that you should remove all switch business from the code.
I did not hear enough justification for having a switch requirement
for P2P.

You are also saying that root ports have issues not because of functionality but
because of performance. 

If you know what is bad, create a list and keep adding it. You can't assume
universally that all root ports are broken/ have bad performance.

> 
>> To get you going, you should limit this change to the switch products that 
>> you have
>> validated via white-listing PCI vendor/device ids. Please do not enable this 
>> feature
>> for all other PCI devices or by default.
> 
> This doesn't seem necessary. We know that all PCIe switches available
> today support P2P and we are highly confident that any switch that would
> ever be produced would support P2P. As long as devices are behind a
> switch you don't need any white listing. This is what the current patch
> set implements. If you want to start including root ports then you will
> need a white list (and solve all the other problems I mentioned earlier).

What if I come up with a very cheap/crappy switch (something like used in data
mining)?

What guarantees that P2P will work with this device? You are making an
assumption here that all switches have good performance.

How is that any different from good switch vs. bad switch and good root
port vs. bad root port?

If it is universally broken, why don't you list the things that work?

> 
>> I think your code qualifies as a virus until this issue is resolved (so 
>> NAK). 
> 
> That seems a bit hyperbolic... "a virus"??!... please keep things
> constructive.
> 

Sorry, this was harsh. I'm taking "virus" word back. I apologize. 
But, I hold onto my NAK opinion. 

I have been doing my best to provide feedback. It feels like you are throwing
them over the wall to be honest.

You keep implying "not my problem".

> > I agree disabling globally would be bad. Somebody can always say I have
> > ten switches on my system. I want to do peer-to-peer on one switch only. 
> > Now,
> > this change weakened security for the other switches that I had no intention
> > with doing P2P.
> >
> > Isn't this a problem?
> 
> Well, if it's a problem for someone they'll have to solve it. We're
> targeting JBOFs that have no use for ACS / IOMMU groups at all.

IMO, you (not somebody) should address this one way or the other before this
series land in upstream.

>> You are delivering a general purpose P2P code with a lot of holes in it and
>> expecting people to jump through it.
> No, the code prevents users from screwing it up. It just requires a
> switch in the hardware which is hardly a high bar to jump through
> (provided you are putting some design thought into your hardware). And
> given it requires semi-custom hardware today, it's not something that
> needs to be on by default in any distributor kernel.
> 
>> Turning security off by default is also not acceptable. Linux requires ACS 
>> support
>> even though you don't care about it for your particular application.
> 
> That's not true. Linux does not require ACS support. In fact it's
> already off by default until you specifically turn on the IOMMU. (Which
> is not always the most obvious thing to enable.) And the only thing it
> really supports is enforcing isolation between VMs that are using
> different pieces of hardware in the system.

Another assumption: There are other architectures like ARM64 where IOMMU
is enabled by default even if you don't use VMs for security reasons.
IOMMU blocks stray transactions.

> 
>> I'd hate ACS to be broken due to some operating system enabling your CONFIG 
>> option.
> 
> ACS isn't "broken" by enabling the config option. It just makes the
> IOMMU groups and isolation less granular. (ie. devices behind a switch
> will be in the same group and not isolated from each-other).

Didn't the ACS behavior change suddenly for

Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 4:46 PM, Logan Gunthorpe wrote:
> 
> 
> On 13/03/18 01:53 PM, Sinan Kaya wrote:
>> I agree disabling globally would be bad. Somebody can always say I have
>> ten switches on my system. I want to do peer-to-peer on one switch only. Now,
>> this change weakened security for the other switches that I had no intention
>> with doing P2P.
>>
>> Isn't this a problem?
> 
> Well, if it's a problem for someone they'll have to solve it. We're
> targeting JBOFs that have no use for ACS / IOMMU groups at all.
> 
>> Can we specify the BDF of the downstream device we want P2P with during boot 
>> via
>> kernel command line?
> 
> That's a painful configuration burden. And then things might stop
> working if you change your topology at all and now have to change boot
> parameters.
> 

It sounds like you have very tight hardware expectations for this to work
at this moment. You also don't want to generalize this code for others and
address the shortcomings.

To get you going, you should limit this change to the switch products that you 
have
validated via white-listing PCI vendor/device ids. Please do not enable this 
feature
for all other PCI devices or by default.

I think your code qualifies as a virus until this issue is resolved (so NAK). 

Another option is for your CONFIG to depend on BROKEN/EXPERT.

You are delivering a general purpose P2P code with a lot of holes in it and
expecting people to jump through it.

Turning security off by default is also not acceptable. Linux requires ACS 
support
even though you don't care about it for your particular application.

I'd hate ACS to be broken due to some operating system enabling your CONFIG 
option.


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 3:19 PM, Logan Gunthorpe wrote:
> 
> 
> On 13/03/18 01:10 PM, Sinan Kaya wrote:
>> I was thinking of this for the pci_p2pdma_add_client() case for the
>> parent pointer.
>>
>> +struct pci_p2pdma_client {
>> +struct list_head list;
>> +struct pci_dev *client;
>> +struct pci_dev *provider;
>> +};
> 
> Yeah, that structure only exists in a list owned by the client and we
> only check the upstream bridge once per entry so I don't see the point.
> 
>> But then, Why bother searching for the switch at all?
> 
> Huh? We have to make sure all the client and provider devices are behind
> the same switch. How can we do that without "searching" for the switch?
> 

Sorry, I was thinking of ACS case you described below. The only thing code
cares is if the device is behind a switch or not at this moment.

> In the ACS case, we only disable ACS on downstream ports of switches. No
> sense disabling it globally as that's worse from an isolation point of
> view and not worth it given we require all P2P transactions to be behind
> a switch.

I agree disabling globally would be bad. Somebody can always say I have
ten switches on my system. I want to do peer-to-peer on one switch only. Now,
this change weakened security for the other switches that I had no intention
with doing P2P.

Isn't this a problem?

Can we specify the BDF of the downstream device we want P2P with during boot via
kernel command line?

> 
>> Even if the switch is there, there is no guarantee that it is currently
>> being used for P2P.
> 
> IOMMU groups are set at boot time and, at present, there's no way to
> dynamically change ACS bits without messing up the groups. So switches
> not used for P2P will not have ACS enabled when CONFIG_PCI_P2PDMA is set
> and I don't know of any good solution to that. Please see the ACS
> discussions in v1 and v2.

Given the implementation limitations, this might be OK as a short-term
solution. 

It depends on if Alex is comfortable with this.

> 
>> It seems that we are going with the assumption that enabling this config
>> option implies you want P2P, then we can simplify this code as well.
> 
> How so?
> 
> Logan
> 
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 2:44 PM, Logan Gunthorpe wrote:
>> Do you think you can keep a pointer to the parent bridge instead of querying 
>> it
>> via get_upstream_bridge_port() here so that we can reuse your
>> pci_p2pdma_disable_acs() in the future.
> 
> Keep a pointer where? pci_p2pdma_disable_acs() and pci_p2pdma_add_client() 
> are used in completely different cases on completely different devices. There 
> really is no overlap and no obvious place to store the port pointer (except 
> in the struct pci_dev itself, in which case why not just call the function 
> again).

I was thinking of this for the pci_p2pdma_add_client() case for the
parent pointer.

+struct pci_p2pdma_client {
+   struct list_head list;
+   struct pci_dev *client;
+   struct pci_dev *provider;
+};

You are right second code seems to be there to disable ACS altogether
when this kernel configuration is selected.

It doesn't have any relationship to the client code.

But then, Why bother searching for the switch at all?

Even if the switch is there, there is no guarantee that it is currently
being used for P2P.

It seems that we are going with the assumption that enabling this config
option implies you want P2P, then we can simplify this code as well.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-13 Thread Sinan Kaya
On 3/13/2018 12:43 PM, Logan Gunthorpe wrote:
> 
> 
> On 12/03/18 09:28 PM, Sinan Kaya wrote:
>> On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
>> Regarding the switch business, It is amazing how much trouble you went into
>> limit this functionality into very specific hardware.
>>
>> I thought that we reached to an agreement that code would not impose
>> any limits on what user wants.
>>
>> What happened to all the emails we exchanged?
> 
> It turns out that root ports that support P2P are far less common than anyone 
> thought. So it will likely have to be a white list. Nobody else seems keen on 
> allowing the user to enable this on hardware that doesn't work. The easiest 
> solution is still limiting it to using a switch. From there, if someone wants 
> to start creating a white-list then that's probably the way forward to 
> support root ports.

I thought only few root ports had problem. Thanks for clarifying that.

> 
> And there's also the ACS problem which means if you want to use P2P on the 
> root ports you'll have to disable ACS on the entire system. (Or preferably, 
> the IOMMU groups need to get more sophisticated to allow for dynamic changes).
> 

Do you think you can keep a pointer to the parent bridge instead of querying it
via get_upstream_bridge_port() here so that we can reuse your
pci_p2pdma_disable_acs() in the future.

+int pci_p2pdma_disable_acs(struct pci_dev *pdev)
+{
+   struct pci_dev *up;
+   int pos;
+   u16 ctrl;
+
+   up = get_upstream_bridge_port(pdev);
+   if (!up)
+   return 0;

Some future device would only deal with pci_p2pdma_add_client(() for 
whitelisting
instead of changing all of your code.

We should at least limit the usage of get_upstream_bridge_port() family of 
functions
to probe time only.

> Additionally, once you allow for root ports you may find the IOMMU getting in 
> the way.

We can tell iommu to do one to one translation by passing iommu.passthrough=1 
to kernel
command line to have identical behavior to your switch case.

> 
> So there are great deal more issues to sort out if you don't restrict to 
> devices behind switches.
> 
> Logan
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 01/11] PCI/P2PDMA: Support peer-to-peer memory

2018-03-12 Thread Sinan Kaya
On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
> +int pci_p2pdma_add_client(struct list_head *head, struct device *dev)

It feels like code tried to be a generic p2pdma provider first. Then got
converted to PCI, yet all dev parameters are still struct device.

Maybe, dev parameter should also be struct pci_dev so that you can get rid of
all to_pci_dev() calls in this code including find_parent_pci_dev() function.

Regarding the switch business, It is amazing how much trouble you went into
limit this functionality into very specific hardware.

I thought that we reached to an agreement that code would not impose
any limits on what user wants.

What happened to all the emails we exchanged?

I understand you are coming from what you tested. Considering that you
are singing up for providing a generic PCI functionality into the kernel,
why don't you just blacklist the products that you had problems with and
yet still allow other architectures to use your code with their root ports?

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 08/11] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-12 Thread Sinan Kaya
On 3/12/2018 9:55 PM, Sinan Kaya wrote:
> On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
>> -if (nvmeq->sq_cmds_io)
> 
> I think you should keep the code as it is for the case where
> (!nvmeq->sq_cmds_is_io && nvmeq->sq_cmds_io)

Never mind. I misunderstood the code.


> 
> You are changing the behavior for NVMe drives with CMB buffers.
> You can change the if statement here with the statement above.
> 
>> -memcpy_toio(>sq_cmds_io[tail], cmd, sizeof(*cmd));
>> -else
>> -memcpy(>sq_cmds[tail], cmd, sizeof(*cmd));
>> +memcpy(>sq_cmds[tail], cmd, sizeof(*cmd));
>>  
> 
> 


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.


Re: [PATCH v3 08/11] nvme-pci: Use PCI p2pmem subsystem to manage the CMB

2018-03-12 Thread Sinan Kaya
On 3/12/2018 3:35 PM, Logan Gunthorpe wrote:
> - if (nvmeq->sq_cmds_io)

I think you should keep the code as it is for the case where
(!nvmeq->sq_cmds_is_io && nvmeq->sq_cmds_io)

You are changing the behavior for NVMe drives with CMB buffers.
You can change the if statement here with the statement above.

> - memcpy_toio(>sq_cmds_io[tail], cmd, sizeof(*cmd));
> - else
> - memcpy(>sq_cmds[tail], cmd, sizeof(*cmd));
> + memcpy(>sq_cmds[tail], cmd, sizeof(*cmd));
>  


-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm 
Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux 
Foundation Collaborative Project.