Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Leon Romanovsky
On Wed, May 17, 2017 at 08:07:00PM -0400, Doug Ledford wrote:
> On 5/17/2017 6:37 PM, Parav Pandit wrote:
> > Hi Doug,
> >
>
> >> It would have been better with AF_INET/AF_INET6 and an option to enable
> >> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
> >> the presence of AF_SMC.  When IPv6 support is added, some sort of
> >> guessing logic will have to be put in place to try and determine if an
> >> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
> >> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
> >> normal element to stick the address into, and could rely on the kernel to
> >> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
> >> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
> >> future, but not a lot.  It may be that the answer is that in the future, 
> >> IPv6
> >> support is enabled by making the IPv6 API be
> >> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
> >> would suggest making the later API specifically call out AF_INET +
> >> setsockopt(SMC) be identical to AF_SMC.
> >>
> >
> > What are the shortcomings in my proposal [1] which I am reiterating below.
> > Bart also suggested to define new stream protocol for SMC similar to SCTP.
> >
> > (a) address family should be either AF_INET or AF_INET6
> > (b) socket() API to introduce new protocol as PROTO_SMC for in socket() 
> > system call.
> >
> > With this there is no additional setsockop() needed.
> >
> > With this - user space applications, do getaddrinfo() with hint as
> > hints.ai_family  = AF_INET;
> > hints.ai_socktype = SOCK_STREAM;
> >
> > getaddrinfo() returns back both the protocols TCP and SMC.
> > Famous database application such as Redis client iterates over all entries 
> > of getaddrinfo() and establishes connection to servers.
> >
> > There are few advantages of this interface.
> > 1. No change in any makefile of applications needed - unless user wants to 
> > specify explicitly that it wants to force SMC protocol.
> > 2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because 
> > bind() connect() of SMC checks for AF_INET).
> > 3. No major changes to glibc to process AF_SMC differently
> > glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ 
> > places).
> > A lot simpler test matrix for glibc for new protocol
> > 5. No need to recompile applications, as long as getaddrinfo returns all 
> > streaming protocols (TCP, SMC)
> > 6. for applications to make use of setsockopt() it needs another knob and 
> > hint from other places, which can be avoided because SMC TCP option 
> > negotiates with remote end
> >
> > And representing new protocol as new protocol for a given address family 
> > appears correct, compare to setting socket options.
> >
> > Tools like CRIU or similar tool might find a race conditions - when it 
> > queries socket option, SMC was not set, but later on SMC was set, and does 
> > incorrect handling.
> > Setting socket() with SMC protocol makes it easier to understand in this 
> > area as well.
>
> I have no problem with the proposal in itself, but as IBM released this
> publication and did their own implementation prior to submitting things
> upstream, and as there might exist in the field implementations, it
> depends on whether we wish to call those in the field implementations
> experimental and break them as we go to a final implementation of
> version 1, or if we consider version 1 baked.  I'm fine breaking it.
> After all, that's what happened with XRC back in the day and Mellanox
> learned a valuable lesson about upstream first.  I have no problem with
> IBM learning that same lesson IMO.  So, I find your proposal, including
> breaking the API of the version 1 implementation just taken into the
> kernel before it has had time to fully sit and gel, acceptable.

Doug,
I am amused that after your excellent summary you are not calling to
this v1 of protocol in its real word: BROKEN.

Current implementation and RFC are DOA (dead-on-arrival) and are not
usable for ALL RDMA key players (Mellanox, Intel, Chelsio, ...) and
ALL RDMA major users. It doesn't fit into existing infrastructure
(DB, HPC, glibc, e.t.c.) and doesn't even fulfill the expected from
cover letter (doesn't work with LD_PRELOAD_*).

The fact that IBM implemented it and used it doesn't mean that they
can't continue to leave it with out-of-tree solution for a little
bit more time till usable version will come.

>
> But this is where we kind of need a judgment from on high, and why I
> Cc:ed Linus on this thread.  Any input on this issue Linus?

It should be dropped/moved_to_staging as soon as possible, before
UAPI started to spread.

>
> > I have additional proposal for link groups, resource creation area. I will 
> > take that up after this discussion.
>
> Look forward to hearing it.
>
> > [1] https://patchwork.kernel.org/patch/9719375/
>
>
> --
> Doug Ledford 

Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Doug Ledford
On 5/17/2017 6:37 PM, Parav Pandit wrote:
> Hi Doug,
> 

>> It would have been better with AF_INET/AF_INET6 and an option to enable
>> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
>> the presence of AF_SMC.  When IPv6 support is added, some sort of
>> guessing logic will have to be put in place to try and determine if an
>> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
>> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
>> normal element to stick the address into, and could rely on the kernel to
>> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
>> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
>> future, but not a lot.  It may be that the answer is that in the future, IPv6
>> support is enabled by making the IPv6 API be
>> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
>> would suggest making the later API specifically call out AF_INET +
>> setsockopt(SMC) be identical to AF_SMC.
>>
> 
> What are the shortcomings in my proposal [1] which I am reiterating below.
> Bart also suggested to define new stream protocol for SMC similar to SCTP.
> 
> (a) address family should be either AF_INET or AF_INET6
> (b) socket() API to introduce new protocol as PROTO_SMC for in socket() 
> system call.
> 
> With this there is no additional setsockop() needed.
> 
> With this - user space applications, do getaddrinfo() with hint as
> hints.ai_family  = AF_INET;
> hints.ai_socktype = SOCK_STREAM;
> 
> getaddrinfo() returns back both the protocols TCP and SMC.
> Famous database application such as Redis client iterates over all entries of 
> getaddrinfo() and establishes connection to servers.
> 
> There are few advantages of this interface.
> 1. No change in any makefile of applications needed - unless user wants to 
> specify explicitly that it wants to force SMC protocol.
> 2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because bind() 
> connect() of SMC checks for AF_INET).
> 3. No major changes to glibc to process AF_SMC differently
> glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ 
> places).
> A lot simpler test matrix for glibc for new protocol
> 5. No need to recompile applications, as long as getaddrinfo returns all 
> streaming protocols (TCP, SMC)
> 6. for applications to make use of setsockopt() it needs another knob and 
> hint from other places, which can be avoided because SMC TCP option 
> negotiates with remote end
> 
> And representing new protocol as new protocol for a given address family 
> appears correct, compare to setting socket options.
> 
> Tools like CRIU or similar tool might find a race conditions - when it 
> queries socket option, SMC was not set, but later on SMC was set, and does 
> incorrect handling.
> Setting socket() with SMC protocol makes it easier to understand in this area 
> as well.

I have no problem with the proposal in itself, but as IBM released this
publication and did their own implementation prior to submitting things
upstream, and as there might exist in the field implementations, it
depends on whether we wish to call those in the field implementations
experimental and break them as we go to a final implementation of
version 1, or if we consider version 1 baked.  I'm fine breaking it.
After all, that's what happened with XRC back in the day and Mellanox
learned a valuable lesson about upstream first.  I have no problem with
IBM learning that same lesson IMO.  So, I find your proposal, including
breaking the API of the version 1 implementation just taken into the
kernel before it has had time to fully sit and gel, acceptable.

But this is where we kind of need a judgment from on high, and why I
Cc:ed Linus on this thread.  Any input on this issue Linus?

> I have additional proposal for link groups, resource creation area. I will 
> take that up after this discussion.

Look forward to hearing it.

> [1] https://patchwork.kernel.org/patch/9719375/


-- 
Doug Ledford 
GPG Key ID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



signature.asc
Description: OpenPGP digital signature


RE: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Parav Pandit
Hi Doug,

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of Doug Ledford
> Sent: Wednesday, May 17, 2017 3:38 PM
> To: David Miller <da...@davemloft.net>
> Cc: bart.vanass...@sandisk.com; torva...@linux-foundation.org;
> h...@lst.de; netdev@vger.kernel.org; linux-r...@vger.kernel.org;
> sta...@vger.kernel.org; ubr...@linux.vnet.ibm.com
> Subject: Re: [PATCH] net/smc: mark as BROKEN due to remote memory
> exposure
> 
> On Tue, 2017-05-16 at 15:28 -0400, Doug Ledford wrote:
> > I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
> > that's entirely appropriate in this case, and would have had the
> > desired side effect of keeping it out of any non-cutting edge distros
> > and warning people of possible API changes.  With EXPERIMENTAL gone,
> > the closest thing we have is drivers/staging, since that tends to
> > imply some of the same consequences.  I know you think BROKEN is
> > overly harsh, but I'm not sure we should just do nothing.  How about
> > we take a few days to let some of the RDMA people closely review the
> > 143 page
> > (egads!) rfc (http://www.rfc-editor.org/info/rfc7609) to see if we
> > think it can be fixed to use multiple link layers with the existing
> > API in place or if it will require something other than AF_SMC.  If we
> > need to break API, then I think we should either fix it ASAP and send
> > that fix to the 4.11 stable series (which probably violates the
> > normative stable patch size/scope) or if the fix will take longer than
> > this kernel cycle, then move it to staging both here and in 4.11
> > stable, and fix it there and then move it back.  Something like that
> > would prevent the kind of API flappage we ought not do
> 
> So, I've skimmed the entire RFC, focusing on things were I needed to.
>  Here's my take on it:
> 
> It would have been better with AF_INET/AF_INET6 and an option to enable
> SMC than AF_SMC.  The first implementation simply assumes AF_INET in
> the presence of AF_SMC.  When IPv6 support is added, some sort of
> guessing logic will have to be put in place to try and determine if an
> AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
> guaranteed way of telling.  Apps can use struct sockaddr_storage as their
> normal element to stick the address into, and could rely on the kernel to
> interpret it properly based on the AF_INET/AF_INET6 differentiation, and
> this breaks that.  The RFC gives *some* thought to adding IPv6 in the
> future, but not a lot.  It may be that the answer is that in the future, IPv6
> support is enabled by making the IPv6 API be
> AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then I
> would suggest making the later API specifically call out AF_INET +
> setsockopt(SMC) be identical to AF_SMC.
> 

What are the shortcomings in my proposal [1] which I am reiterating below.
Bart also suggested to define new stream protocol for SMC similar to SCTP.

(a) address family should be either AF_INET or AF_INET6
(b) socket() API to introduce new protocol as PROTO_SMC for in socket() system 
call.

With this there is no additional setsockop() needed.

With this - user space applications, do getaddrinfo() with hint as
hints.ai_family  = AF_INET;
hints.ai_socktype = SOCK_STREAM;

getaddrinfo() returns back both the protocols TCP and SMC.
Famous database application such as Redis client iterates over all entries of 
getaddrinfo() and establishes connection to servers.

There are few advantages of this interface.
1. No change in any makefile of applications needed - unless user wants to 
specify explicitly that it wants to force SMC protocol.
2. No need to do LD_LIBRARY_PRELOAD, (which won't work anyway because bind() 
connect() of SMC checks for AF_INET).
3. No major changes to glibc to process AF_SMC differently
glibc references IPPROTO_TCP at 22 places. (compare to AF_INET at 140+ places).
A lot simpler test matrix for glibc for new protocol
5. No need to recompile applications, as long as getaddrinfo returns all 
streaming protocols (TCP, SMC)
6. for applications to make use of setsockopt() it needs another knob and hint 
from other places, which can be avoided because SMC TCP option negotiates with 
remote end

And representing new protocol as new protocol for a given address family 
appears correct, compare to setting socket options.

Tools like CRIU or similar tool might find a race conditions - when it queries 
socket option, SMC was not set, but later on SMC was set, and does incorrect 
handling.
Setting socket() with SMC protocol makes it easier to understand in this area 
as well.

I have additional proposal for link groups, resource creation area. I will take 
that up after this discussion.

[

Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-17 Thread Doug Ledford
On Tue, 2017-05-16 at 15:28 -0400, Doug Ledford wrote:
> I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
> that's entirely appropriate in this case, and would have had the
> desired side effect of keeping it out of any non-cutting edge distros
> and warning people of possible API changes.  With EXPERIMENTAL gone,
> the closest thing we have is drivers/staging, since that tends to
> imply
> some of the same consequences.  I know you think BROKEN is overly
> harsh, but I'm not sure we should just do nothing.  How about we take
> a
> few days to let some of the RDMA people closely review the 143 page
> (egads!) rfc (http://www.rfc-editor.org/info/rfc7609) to see if we
> think it can be fixed to use multiple link layers with the existing
> API
> in place or if it will require something other than AF_SMC.  If we
> need
> to break API, then I think we should either fix it ASAP and send that
> fix to the 4.11 stable series (which probably violates the normative
> stable patch size/scope) or if the fix will take longer than this
> kernel cycle, then move it to staging both here and in 4.11 stable,
> and
> fix it there and then move it back.  Something like that would
> prevent
> the kind of API flappage we ought not do

So, I've skimmed the entire RFC, focusing on things were I needed to.
 Here's my take on it:

It would have been better with AF_INET/AF_INET6 and an option to enable
SMC than AF_SMC.  The first implementation simply assumes AF_INET in
the presence of AF_SMC.  When IPv6 support is added, some sort of
guessing logic will have to be put in place to try and determine if an
AF_SMC address is actually AF_INET or AF_INET6 since we won't have a
guaranteed way of telling.  Apps can use struct sockaddr_storage as
their normal element to stick the address into, and could rely on the
kernel to interpret it properly based on the AF_INET/AF_INET6
differentiation, and this breaks that.  The RFC gives *some* thought to
adding IPv6 in the future, but not a lot.  It may be that the answer is
that in the future, IPv6 support is enabled by making the IPv6 API be
AF_INET6 + setsockopt(SMC) or the equivalent.  If that's the case, then
I would suggest making the later API specifically call out AF_INET +
setsockopt(SMC) be identical to AF_SMC.

The protocol included a version header in the negotation messages.
 This is good as it allows us to almost immediately start work on
version 2 that fixes the shortcomings of version 1 while maintaining
back compatibility.

After reading the RFC, I can see why they only support one link layer:
RoCE.  The issue here is that they punted on the hard issue of doing
any sort of work to determine if security restrictions on the TCP
connection should also be applied to the RDMA connection.  The RFC
basically says "the RoCE link needs to have the same physical
restrictions (vlan) as the TCP/IP link so that the security limitations
are the same" because they don't do anything to check them essentially.
 For v2 of the protocol, and for different link layer support, this is
no longer sufficient, so there will be significant work to determine
the security context of specific TCP connections and then make sure
that they meet the security context of the RDMA links allowed.

Additionally, the same is likely to be true in terms of SELinux
options.  The addition of the IB/OPA link layers will throw a bit of a
monkey wrench in things because the SELinux control over those links is
slightly different than a normal TCP/IP SELinux control.  For instance,
you might create rules about processes and ports to make sure that the
httpd daemon can only access specific ports on TCP/IP, but on IB/OPA
you would need to create process and P_Key rules as IB/OPA don't have
the same port level semantics, and it's the P_Key on communications
that is enforced network wide, including in the switches, so
controlling what P_Key a process can send/receive on is your best way
to limit what a process can do.  Translation from one to the other
might be rather difficult to do in any sort of automated fashion.

There might have to be some additional work to get this to properly
account items for the RDMA control group elements that were recently
taken into the kernel.

Finally, the RDMA subsystem is in the process of switching to
structured option processing similar to how netlink does it.  For
version 2 of this protocol, since it will be interacting with the RDMA
core in many ways, will be simpler if it switches the on-wire
negotiation packets to follow the same methods as that will allow reuse
of code that it will have to have for properly dealing with the RDMA
subsystem in the future.

So, I'm fine with it being left as is since you accepted the patch that
makes note of the memory registration insecurity in the Kconfig text.
 The fact that this is a versioned protocol means that we can fix the
things we see as being wrong without having to have it all right from
the very start, it can be done 

Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 14:52 -0400, David Miller wrote:
> From: Doug Ledford 
> Date: Tue, 16 May 2017 14:03:22 -0400
> 
> > On Tue, 2017-05-16 at 13:36 -0400, David Miller wrote:
> >> From: Doug Ledford 
> >> Date: Tue, 16 May 2017 13:20:44 -0400
> >> 
> >> > Anyway, we're just talking out what happened, when what we
> really
> >> need
> >> > to focus on is moving forward.  Again, your thoughts on marking
> SMC
> >> > EXPERIMENTAL until it's fixed up and unfreezing the API in case
> we
> >> need
> >> > to adjust it to work on different link layers?
> >> 
> >> Something like:
> >> 
> >> http://patchwork.ozlabs.org/patch/762803/
> >> 
> >> with the addition of the EXPERIMENTAL dependency?
> >> 
> >> Sure.
> > 
> > Perfect.  I assume you'll submit it since it's in your patchworks?
> 
> Ok I applied the patch referenced above, but we don't actually have
> an EXPERIMENTAL symbol.  The closest thing we have is BROKEN and
> even in this situation that's a bit harsh.

I hadn't realized EXPERIMENTAL was gone.  Which is too bad, because
that's entirely appropriate in this case, and would have had the
desired side effect of keeping it out of any non-cutting edge distros
and warning people of possible API changes.  With EXPERIMENTAL gone,
the closest thing we have is drivers/staging, since that tends to imply
some of the same consequences.  I know you think BROKEN is overly
harsh, but I'm not sure we should just do nothing.  How about we take a
few days to let some of the RDMA people closely review the 143 page
(egads!) rfc (http://www.rfc-editor.org/info/rfc7609) to see if we
think it can be fixed to use multiple link layers with the existing API
in place or if it will require something other than AF_SMC.  If we need
to break API, then I think we should either fix it ASAP and send that
fix to the 4.11 stable series (which probably violates the normative
stable patch size/scope) or if the fix will take longer than this
kernel cycle, then move it to staging both here and in 4.11 stable, and
fix it there and then move it back.  Something like that would prevent
the kind of API flappage we ought not do

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Doug Ledford 
Date: Tue, 16 May 2017 14:03:22 -0400

> On Tue, 2017-05-16 at 13:36 -0400, David Miller wrote:
>> From: Doug Ledford 
>> Date: Tue, 16 May 2017 13:20:44 -0400
>> 
>> > Anyway, we're just talking out what happened, when what we really
>> need
>> > to focus on is moving forward.  Again, your thoughts on marking SMC
>> > EXPERIMENTAL until it's fixed up and unfreezing the API in case we
>> need
>> > to adjust it to work on different link layers?
>> 
>> Something like:
>> 
>> http://patchwork.ozlabs.org/patch/762803/
>> 
>> with the addition of the EXPERIMENTAL dependency?
>> 
>> Sure.
> 
> Perfect.  I assume you'll submit it since it's in your patchworks?

Ok I applied the patch referenced above, but we don't actually have
an EXPERIMENTAL symbol.  The closest thing we have is BROKEN and
even in this situation that's a bit harsh.

I also applied the patch which converts SMC over to using
IB_PD_UNSAFE_GLOBAL_RKEY.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 13:36 -0400, David Miller wrote:
> From: Doug Ledford 
> Date: Tue, 16 May 2017 13:20:44 -0400
> 
> > Anyway, we're just talking out what happened, when what we really
> need
> > to focus on is moving forward.  Again, your thoughts on marking SMC
> > EXPERIMENTAL until it's fixed up and unfreezing the API in case we
> need
> > to adjust it to work on different link layers?
> 
> Something like:
> 
> http://patchwork.ozlabs.org/patch/762803/
> 
> with the addition of the EXPERIMENTAL dependency?
> 
> Sure.

Perfect.  I assume you'll submit it since it's in your patchworks?

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Doug Ledford 
Date: Tue, 16 May 2017 13:20:44 -0400

> Anyway, we're just talking out what happened, when what we really need
> to focus on is moving forward.  Again, your thoughts on marking SMC
> EXPERIMENTAL until it's fixed up and unfreezing the API in case we need
> to adjust it to work on different link layers?

Something like:

http://patchwork.ozlabs.org/patch/762803/

with the addition of the EXPERIMENTAL dependency?

Sure.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 12:49 -0400, David Miller wrote:
> From: Doug Ledford 
> Date: Tue, 16 May 2017 12:42:43 -0400
> 
> > On Tue, 2017-05-16 at 12:29 -0400, David Miller wrote:
> >> From: Doug Ledford 
> >> Date: Tue, 16 May 2017 11:57:04 -0400
> >> 
> >> > Regardless though, I'm rather purturbed about this entire thing.
> >>  If
> >> > you are right that because this got into 4.11, it's now a done
> >> deal,
> >> > then the fact that this went through 4 review cycles on netdev@
> >> that,
> >> > as I understand it, spanned roughly one years time, and not one
> >> single
> >> > person bothered to note that this was as much an RDMA driver as
> >> > anything else, and not one person bothered to note that linux-
> rdma@ 
> >> was
> >> > not on the Cc: list, and not one person told the submitters that
> >> they
> >> > needed to include linux-rdma@ on the Cc: list of these
> submissions,
> >> and
> >> > you took it without any review comments from any RDMA people in
> the
> >> > course of a year, or an ack from me to show that the RDMA
> portion
> >> of
> >> > this had at least been given some sort of review, was a collosal
> >> fuckup
> >> > of cross tree maintainer cooperation.
> >> 
> >> We rely on people from various areas of expertiece to contribute
> to
> >> patch review on netdev and give appropriate feedback.
> >> 
> >> If you actually look through the history, I made many semantic
> >> reviews
> >> of the SMC changes, and kept pushing back.
> >> 
> >> And in fact I did this several times, making them go through
> several
> >> revisions, in the hopes that someone would review more of the meat
> >> and
> >> substance of the patch set.
> > 
> > If you want to walk to the mailbox, you walk to the mailbox, you
> don't
> > walk to the grocery store, to the gym, and never even go to the
> > mailbox.  Likewise, if you want review from RDMA experts, most
> (maybe
> > even all) don't subscribe to netdev@ because it's too high traffic,
> you
> > don't waste time on silly semantic pushbacks, you send a single
> email
> > that says "Please get review from linux-rdma@, thank you."  Don't
> beat
> > around the bush, be direct and get it over with.  That's exactly
> what I
> > do for all netdev@ related patches coming to linux-rdma@ without a
> > proper Cc: to netdev@.
> 
> Read my other email, it wasn't %100 clear to me that this was so
> strictly RDMA related.  And I kept pushing back with semantic changes
> in part because it wasn't clear.

See my other email.  I think in the fact that you are overloaded on
netdev@ you skimmed the cover letter, which is the one thing that would
have made things clear.

> So as far as I was concerned I was not necessarily going to the wrong
> store, in fact I wasn't sure what store to go to.
> 
> And none of the thousands of subscribers to netdev intuit'd this
> either.  Maybe there is a reason for that.

Yeah, it's an overloaded list would be my guess.  I imagine no one can
keep up with it fully in truth.  Maybe it's time to split some of it
out into sublists?  netdev-core/netdev-ethernet/netdev-packet?  I don't
know, but I know I can't keep up with it.  That you do as well as you
do is probably only because you know how to not spend too much time on
any one thing.  I'm sure you have people you know are submitting
important work that effects the overall net subsystem and you focus on
their stuff, but ancillary things probably only get half attention at
double speed in order to allow you to make it through the day.

> Furthermore, if netdev is too much traffic for one or two RDMA people
> to just casually subscribe to on the off chance that a situationm
> like
> this comes up, guess what it is like for me who has to read and
> review
> pretty much every single posting that is placed there?

I get it.  I don't have the time to both be responsible for linux-rdma
and follow netdev.  I can already tell you what would happen if I
tried.  Eventually netdev would get so far behind I would just mass
delete and start over.  So that doesn't solve the problem.

Anyway, we're just talking out what happened, when what we really need
to focus on is moving forward.  Again, your thoughts on marking SMC
EXPERIMENTAL until it's fixed up and unfreezing the API in case we need
to adjust it to work on different link layers?

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 12:41 -0400, David Miller wrote:
> From: Doug Ledford 
> Date: Tue, 16 May 2017 12:36:01 -0400
> 
> > On Tue, 2017-05-16 at 18:30 +0200, Christoph Hellwig wrote:
> >> On Tue, May 16, 2017 at 12:29:23PM -0400, David Miller wrote:
> >> > 
> >> > I can't push back on people with silly coding style and small
> >> > semantic
> >> > issues forever.  And I think I made a serious effort to keep the
> >> > patches getting posted over and over again to make sure they got
> >> > more
> >> > exposure.
> >> 
> >> You can tell them to go to linux-rdma.  I'm sending people to the
> >> right
> >> mailing list all the time.
> > 
> > Indeed.  Every single time a patch comes into linux-rdma that
> touches
> > things in net/ or include/net, unless it is exceedingly minor, I
> check
> > the To:/Cc: lines on the email and if netdev@ isn't included, or in
> the
> > case of complex/tricky items, you aren't directly Cc:ed, then I
> > specifically tell them to include netdev@ and/or you.  I've even
> had
> > things like a 12 patch series that buried three netdev@ appropriate
> > patches at different points in the series and told the submitter to
> > move all of the netdev@ related patches to the front and submit
> them to
> > netdev@ so they can be reviewed as a group before I would move on
> to
> > the others.  It's just what you do.  I've always considered that
> part
> > of my job.
> 
> To be quite honest it wasn't exceedingly clear, even to me, that this
> had such implications or was directly a RDMA thing.  From my
> perspective while reviewing I saw a patch series adding it's own
> protocol stack living inside of it's own directory under net/

OK.  Fair enough.  That implies to me that you probably dove right into
the patches and skimmed the cover letter (http://marc.info/?l=linux-s39
0=148397751211964=2), because when I read the cover letter, it
seems pretty clear to me that this is very much RDMA related.  In any
case, there are already two other code bases that live under net/ but
also involve RDMA heavily: nfs and rds.  This is a third now.  So, for
future reference, just like the RDMA related nfs/rds patches already
get Cc:ed to linux-rdma@, these probably should be too.

> And, if even one RDMA/infiniband person said to me "you really
> shouldn't apply this" then I would have dropped it on the spot.

I'm glad to hear that.  I wish we had managed to get together on this
sooner.

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Doug Ledford 
Date: Tue, 16 May 2017 12:42:43 -0400

> On Tue, 2017-05-16 at 12:29 -0400, David Miller wrote:
>> From: Doug Ledford 
>> Date: Tue, 16 May 2017 11:57:04 -0400
>> 
>> > Regardless though, I'm rather purturbed about this entire thing.
>>  If
>> > you are right that because this got into 4.11, it's now a done
>> deal,
>> > then the fact that this went through 4 review cycles on netdev@
>> that,
>> > as I understand it, spanned roughly one years time, and not one
>> single
>> > person bothered to note that this was as much an RDMA driver as
>> > anything else, and not one person bothered to note that linux-rdma@ 
>> was
>> > not on the Cc: list, and not one person told the submitters that
>> they
>> > needed to include linux-rdma@ on the Cc: list of these submissions,
>> and
>> > you took it without any review comments from any RDMA people in the
>> > course of a year, or an ack from me to show that the RDMA portion
>> of
>> > this had at least been given some sort of review, was a collosal
>> fuckup
>> > of cross tree maintainer cooperation.
>> 
>> We rely on people from various areas of expertiece to contribute to
>> patch review on netdev and give appropriate feedback.
>> 
>> If you actually look through the history, I made many semantic
>> reviews
>> of the SMC changes, and kept pushing back.
>> 
>> And in fact I did this several times, making them go through several
>> revisions, in the hopes that someone would review more of the meat
>> and
>> substance of the patch set.
> 
> If you want to walk to the mailbox, you walk to the mailbox, you don't
> walk to the grocery store, to the gym, and never even go to the
> mailbox.  Likewise, if you want review from RDMA experts, most (maybe
> even all) don't subscribe to netdev@ because it's too high traffic, you
> don't waste time on silly semantic pushbacks, you send a single email
> that says "Please get review from linux-rdma@, thank you."  Don't beat
> around the bush, be direct and get it over with.  That's exactly what I
> do for all netdev@ related patches coming to linux-rdma@ without a
> proper Cc: to netdev@.

Read my other email, it wasn't %100 clear to me that this was so
strictly RDMA related.  And I kept pushing back with semantic changes
in part because it wasn't clear.

So as far as I was concerned I was not necessarily going to the wrong
store, in fact I wasn't sure what store to go to.

And none of the thousands of subscribers to netdev intuit'd this
either.  Maybe there is a reason for that.

Furthermore, if netdev is too much traffic for one or two RDMA people
to just casually subscribe to on the off chance that a situationm like
this comes up, guess what it is like for me who has to read and review
pretty much every single posting that is placed there?



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 12:29 -0400, David Miller wrote:
> From: Doug Ledford 
> Date: Tue, 16 May 2017 11:57:04 -0400
> 
> > Regardless though, I'm rather purturbed about this entire thing.
>  If
> > you are right that because this got into 4.11, it's now a done
> deal,
> > then the fact that this went through 4 review cycles on netdev@
> that,
> > as I understand it, spanned roughly one years time, and not one
> single
> > person bothered to note that this was as much an RDMA driver as
> > anything else, and not one person bothered to note that linux-rdma@ 
> was
> > not on the Cc: list, and not one person told the submitters that
> they
> > needed to include linux-rdma@ on the Cc: list of these submissions,
> and
> > you took it without any review comments from any RDMA people in the
> > course of a year, or an ack from me to show that the RDMA portion
> of
> > this had at least been given some sort of review, was a collosal
> fuckup
> > of cross tree maintainer cooperation.
> 
> We rely on people from various areas of expertiece to contribute to
> patch review on netdev and give appropriate feedback.
> 
> If you actually look through the history, I made many semantic
> reviews
> of the SMC changes, and kept pushing back.
> 
> And in fact I did this several times, making them go through several
> revisions, in the hopes that someone would review more of the meat
> and
> substance of the patch set.

If you want to walk to the mailbox, you walk to the mailbox, you don't
walk to the grocery store, to the gym, and never even go to the
mailbox.  Likewise, if you want review from RDMA experts, most (maybe
even all) don't subscribe to netdev@ because it's too high traffic, you
don't waste time on silly semantic pushbacks, you send a single email
that says "Please get review from linux-rdma@, thank you."  Don't beat
around the bush, be direct and get it over with.  That's exactly what I
do for all netdev@ related patches coming to linux-rdma@ without a
proper Cc: to netdev@.

> Nobody do this for over a year.
> 
> I can't push back on people with silly coding style and small
> semantic
> issues forever.  And I think I made a serious effort to keep the
> patches getting posted over and over again to make sure they got more
> exposure.
> 
> I think it's unsettling that there are no RDMA experts, or at least
> people remotely knowledgable about this "networking" technology,
> subscribed to netdev and taking a cursory look at pactches that might
> be relevant and effect that technology either directly or indirectly.
> 
> So there is a lot of blame to go around.

Fine, allocate blame however, you like.  What I want to actually settle
is how we are going to move forward.  You didn't respond to that part
of my email.  Your thoughts?

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Doug Ledford 
Date: Tue, 16 May 2017 12:36:01 -0400

> On Tue, 2017-05-16 at 18:30 +0200, Christoph Hellwig wrote:
>> On Tue, May 16, 2017 at 12:29:23PM -0400, David Miller wrote:
>> > 
>> > I can't push back on people with silly coding style and small
>> > semantic
>> > issues forever.  And I think I made a serious effort to keep the
>> > patches getting posted over and over again to make sure they got
>> > more
>> > exposure.
>> 
>> You can tell them to go to linux-rdma.  I'm sending people to the
>> right
>> mailing list all the time.
> 
> Indeed.  Every single time a patch comes into linux-rdma that touches
> things in net/ or include/net, unless it is exceedingly minor, I check
> the To:/Cc: lines on the email and if netdev@ isn't included, or in the
> case of complex/tricky items, you aren't directly Cc:ed, then I
> specifically tell them to include netdev@ and/or you.  I've even had
> things like a 12 patch series that buried three netdev@ appropriate
> patches at different points in the series and told the submitter to
> move all of the netdev@ related patches to the front and submit them to
> netdev@ so they can be reviewed as a group before I would move on to
> the others.  It's just what you do.  I've always considered that part
> of my job.

To be quite honest it wasn't exceedingly clear, even to me, that this
had such implications or was directly a RDMA thing.  From my
perspective while reviewing I saw a patch series adding it's own
protocol stack living inside of it's own directory under net/

And, if even one RDMA/infiniband person said to me "you really
shouldn't apply this" then I would have dropped it on the spot.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
On Tue, 2017-05-16 at 18:30 +0200, Christoph Hellwig wrote:
> On Tue, May 16, 2017 at 12:29:23PM -0400, David Miller wrote:
> > 
> > I can't push back on people with silly coding style and small
> > semantic
> > issues forever.  And I think I made a serious effort to keep the
> > patches getting posted over and over again to make sure they got
> > more
> > exposure.
> 
> You can tell them to go to linux-rdma.  I'm sending people to the
> right
> mailing list all the time.

Indeed.  Every single time a patch comes into linux-rdma that touches
things in net/ or include/net, unless it is exceedingly minor, I check
the To:/Cc: lines on the email and if netdev@ isn't included, or in the
case of complex/tricky items, you aren't directly Cc:ed, then I
specifically tell them to include netdev@ and/or you.  I've even had
things like a 12 patch series that buried three netdev@ appropriate
patches at different points in the series and told the submitter to
move all of the netdev@ related patches to the front and submit them to
netdev@ so they can be reviewed as a group before I would move on to
the others.  It's just what you do.  I've always considered that part
of my job.

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD



Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Christoph Hellwig
On Tue, May 16, 2017 at 12:33:08PM -0400, David Miller wrote:
> That doesn't cover things that don't directly touch the RDMA code or
> infiniband infrastructure.
> 
> There should have been RDMA people on netdev who saw this thing and
> cried wolf, and they would have had about an entire year, and about
> 8 instances of this series being posted in which to do so.
> 
> It's not like this got posted once or twice and went in with zero
> review.

None outside of netdev.  Really, if you get rdma patches include
linux-rdma.  Or at least linux-kernel where I will usually catch
this sort of stuff.  netdev is too high traffic and too far away
from my area of work to watch it.

But for example when I wrote my first RDMA ULP I did subsribe
to linux-rdma, started extensive discussions and fixed up lots
of core code.  There is no excuse for other people to not even
try.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Christoph Hellwig 
Date: Tue, 16 May 2017 18:30:41 +0200

> On Tue, May 16, 2017 at 12:29:23PM -0400, David Miller wrote:
>> I can't push back on people with silly coding style and small semantic
>> issues forever.  And I think I made a serious effort to keep the
>> patches getting posted over and over again to make sure they got more
>> exposure.
> 
> You can tell them to go to linux-rdma.  I'm sending people to the right
> mailing list all the time.

That doesn't cover things that don't directly touch the RDMA code or
infiniband infrastructure.

There should have been RDMA people on netdev who saw this thing and
cried wolf, and they would have had about an entire year, and about
8 instances of this series being posted in which to do so.

It's not like this got posted once or twice and went in with zero
review.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Christoph Hellwig
On Tue, May 16, 2017 at 12:29:23PM -0400, David Miller wrote:
> I can't push back on people with silly coding style and small semantic
> issues forever.  And I think I made a serious effort to keep the
> patches getting posted over and over again to make sure they got more
> exposure.

You can tell them to go to linux-rdma.  I'm sending people to the right
mailing list all the time.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread David Miller
From: Doug Ledford 
Date: Tue, 16 May 2017 11:57:04 -0400

> Regardless though, I'm rather purturbed about this entire thing.  If
> you are right that because this got into 4.11, it's now a done deal,
> then the fact that this went through 4 review cycles on netdev@ that,
> as I understand it, spanned roughly one years time, and not one single
> person bothered to note that this was as much an RDMA driver as
> anything else, and not one person bothered to note that linux-rdma@ was
> not on the Cc: list, and not one person told the submitters that they
> needed to include linux-rdma@ on the Cc: list of these submissions, and
> you took it without any review comments from any RDMA people in the
> course of a year, or an ack from me to show that the RDMA portion of
> this had at least been given some sort of review, was a collosal fuckup
> of cross tree maintainer cooperation.

We rely on people from various areas of expertiece to contribute to
patch review on netdev and give appropriate feedback.

If you actually look through the history, I made many semantic reviews
of the SMC changes, and kept pushing back.

And in fact I did this several times, making them go through several
revisions, in the hopes that someone would review more of the meat and
substance of the patch set.

Nobody do this for over a year.

I can't push back on people with silly coding style and small semantic
issues forever.  And I think I made a serious effort to keep the
patches getting posted over and over again to make sure they got more
exposure.

I think it's unsettling that there are no RDMA experts, or at least
people remotely knowledgable about this "networking" technology,
subscribed to netdev and taking a cursory look at pactches that might
be relevant and effect that technology either directly or indirectly.

So there is a lot of blame to go around.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-16 Thread Doug Ledford
Hi Linus,

I've added you to this thread.  A quick synopsis: Dave sent you the
net/smc driver for 4.11.  Even though it lives in net/smc, it is, for
the most part, a net<->rdma translator and so it is as much an RDMA
driver as anything else.  And upon review, the rdma community does not
believe either the spec/rfc or the driver are the right way to engineer
this particular technology, and the implementation leaves much to be
desired.

On Sun, 2017-05-14 at 20:44 -0400, David Miller wrote:
> From: Bart Van Assche 
> Date: Sun, 14 May 2017 19:08:50 +
> 
> > What is your plan to avoid that applications start using and
> > depending on AF_SMC?
> 
> The API is out there already so we are out of luck, and neither
> you nor I nor anyone else can "stop" this from happening.

That's not true at all.  There's nothing that says we can't revert this
now before it goes any further.  It's only been in two kernels, I'm
positive it hasn't landed in any distros yet, and it can go back to
being something people can add on the side.  Futher, the "standard"
this is based on is not a real standard, it's just a publication and
has not been through a standard track.  I wouldn't consider this "out
there already" until there is a standard that has gone through the
standard track.

Regardless though, I'm rather purturbed about this entire thing.  If
you are right that because this got into 4.11, it's now a done deal,
then the fact that this went through 4 review cycles on netdev@ that,
as I understand it, spanned roughly one years time, and not one single
person bothered to note that this was as much an RDMA driver as
anything else, and not one person bothered to note that linux-rdma@ was
not on the Cc: list, and not one person told the submitters that they
needed to include linux-rdma@ on the Cc: list of these submissions, and
you took it without any review comments from any RDMA people in the
course of a year, or an ack from me to show that the RDMA portion of
this had at least been given some sort of review, was a collosal fuckup
of cross tree maintainer cooperation.

The SMC driver makes several mistakes that people tried to avoid with
previous RDMA standards, it only supports one out of the five possible
link layers (iWARP, IB, OPA, RoCEv1, RoCEv2), it uses a highly
discouraged and deprecated technique for memory registration that is
considered horribly insecure (handing the keys to the castle to anyone
who connects to the machine, aka, the entire memory space is registered
with one key and that key is given to remote connections, so they can
read any bit of kernel memory they want as opposed to whatever we tell
them to read), and the design as articuled in the published rfc seems
incomplete for dealing with any of the other link layers, indicating
that this should have probably stayed out until the rfc was discussed
and updated to address the shortcomings obviously present in the
current rfc.  With all of these issues outstanding against it, I hope
you can see why I think the way I do about you taking it without ever
consulting any of the RDMA community.

But that leaves us with the question of what to do moving forward.
 Probably the number one concern is that this protocol chose to create
a new AF as opposed to reusing the IPv4 and IPv6 address families and
adding an option similar to SCTP for enabling the new protocol on a
specific socket.  The concern is that we have means of addressing all
of the link layers the RDMA subsystem supports using IPv4 or IPv6 (sort
of...it's possible to have IB or OPA without IPoIB, which leaves them
without an IPv4 or IPv6 address, in which case the rdmacm can use
native GUIDs to resolve the other side, but that only works for verbs
connections, in the case of TCP connections, we always require IPoIB to
be present, and so IPv4 or IPv6 is always sufficient).  In the end,
switching this protocol to use AF_INET and AF_INET6 and a protocol
option to enable SMC may be what we need to do.  That, of course,
changes the user space API.  So, are we truly locked in at this point?
 I would suggest that, since this is only present in 4.11 and 4.12, and
I'm sure this has not landed in any distros as of yet (except maybe
something like Fedora rawhide), we can submit a patch to both the
current kernel and the 4.11 stable to set this code as
CONFIG_EXPERIMENTAL and mark the API as possibly going to undergo
change.  Then let the RDMA community work with IBM to get this properly
fixed so that this is a reasonable RDMA driver and not something the
community is ready to immediately trash, and only after we've got it
whipped into shape and the RDMA community is satisfied it is a
reasonable driver that can continue to work with future planned RDMA
subsystem updates and across various link layers, we remove the
EXPERIMENTAL marker and freeze the API for user space.

-- 
Doug Ledford 
    GPG KeyID: B826A3330E572FDD
   
Key fingerprint = AE6B 1BDA 122B 23B4 265B 

Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-15 Thread Leon Romanovsky
On Sun, May 14, 2017 at 11:51:16AM -0400, David Miller wrote:
> From: Christoph Hellwig 
> Date: Sun, 14 May 2017 07:58:48 +0200
>
> > this patch has not been superceeded by anything, can you explain why
> > it has been marked as such in patchworks?
>
> I think you're being overbearing by requiring this to be marked BROKEN
> and I would like you to explore other ways with the authors to fix
> whatever perceived problems you think SMC has.

Hi Dave,

Christoph proposed very clear way to remove BROKEN tag, by providing
secure environment by default to all users.

All ULPs in RDMA and lustre in staging [1] are working in secure mode
by default and I don't understand why SMC-R should be different.

From my experience such "BROKEN" tag will help for authors to get proper
priority from their managers to fix it in a first place, before they are
moved to anything else.

>
> You claim that this is somehow "urgent" is false.  You can ask
> distributions to disable SMC or whatever in the short term if it
> reallly, truly, bothers you.

It is not distributions only. For example, RDMA stack is released by
different providers. Mellanox OFED [2] is one of them and it is based on
upstream kernel.

Thanks

[1] e0ccf5d085ab ("staging: lustre: ko2iblnd: Adapt to the removal of 
ib_get_dma_mr()")
[2] 
http://www.mellanox.com/page/products_dyn?product_family=26=linux_sw_drivers

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


signature.asc
Description: PGP signature


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-15 Thread Sagi Grimberg

Hi Dave,


this patch has not been superceeded by anything, can you explain why
it has been marked as such in patchworks?


I think you're being overbearing by requiring this to be marked BROKEN
and I would like you to explore other ways with the authors to fix
whatever perceived problems you think SMC has.


Well, its not one's opinion, its a real problem. To be fair, this
security breach existed in other RDMA based storage protocols for
years in the past, but we cleaned it up completely.

Our assumption is that *if* the user is willingly choosing to expose its
entire physical address space to remote access to get some performance
boost that's fine, but to have the kernel expose it by default, without
even letting the user to control it is plain irresponsible. IMHO we
should *not* repeat the mistakes of the past and set a higher bar for
RDMA protocol implementations.


You claim that this is somehow "urgent" is false.  You can ask
distributions to disable SMC or whatever in the short term if it
reallly, truly, bothers you.


I doubt that is sufficient given that not all implementations
out there rely on distros. I'm afraid one time is too much in
this case.


RE: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-14 Thread Parav Pandit
Hi Dave,

> -Original Message-
> From: linux-rdma-ow...@vger.kernel.org [mailto:linux-rdma-
> ow...@vger.kernel.org] On Behalf Of David Miller
> Sent: Sunday, May 14, 2017 7:44 PM
> To: bart.vanass...@sandisk.com
> Cc: h...@lst.de; netdev@vger.kernel.org; linux-r...@vger.kernel.org;
> sta...@vger.kernel.org; ubr...@linux.vnet.ibm.com
> Subject: Re: [PATCH] net/smc: mark as BROKEN due to remote memory
> exposure
> 
> From: Bart Van Assche <bart.vanass...@sandisk.com>
> Date: Sun, 14 May 2017 19:08:50 +
> 
> > What is your plan to avoid that applications start using and depending
> > on AF_SMC?
> 

status = socket(AF_SMC, field, IPPROT_TCP);
Here,
- AF_SMC actually means AF_INET IPv4 addresses!
- IPPROTO_TCP means TCP and RDMA both when socket is AF_SMC.
- When creating socket addresses, use AF_INET based addresses.
-  When invoking bind(), listen(), connect() APIs, use AF_INET addresses 
instead.
- Supporting IPv6 is TBD with AF_SMC sockets.
- At user level get_addrinfo will continue to return AF_INET addresses.

Such explanation for socket APIs doesn't sound correct.

The primary motivation for SMC protocol was to simplify the applications and 
library to make use of RDMA.
This kind of API is against such simplicity and creates more confusion.
RFC only gives example and doesn't asks to create new socket family.
I can provide more data, but a simple grep in get_addrinfo() and friend 
functions in user space has heavy dependence on AF_INET and AF_INET6.

> The API is out there already so we are out of luck, and neither you nor I nor
> anyone else can "stop" this from happening.

I think it is still not too late to fix this API. SMC is released in v4.11 very 
recently.
v4.12 is still not out.
Given the limitation of protocol being RoCEv1 only, we might not have many 
users whose applications will stop functioning.
(Which will anyway won't work for RoCEv2, and IPv6 addresses).

I propose,
(a) AF_SMC socket 43 can be marked reserved in future kernel versions to avoid 
use.
(b) New protocol family that represents TCP and RDMA protocol, may be named 
IPPROTO_SMC even though it is not a protocol in IP header.

We can possibly target to have this fix in 4.13 kernel timeframe.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the
> body of a message to majord...@vger.kernel.org More majordomo info
> at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-14 Thread David Miller
From: Bart Van Assche 
Date: Sun, 14 May 2017 19:08:50 +

> What is your plan to avoid that applications start using and
> depending on AF_SMC?

The API is out there already so we are out of luck, and neither
you nor I nor anyone else can "stop" this from happening.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-14 Thread Bart Van Assche
On Sun, 2017-05-14 at 11:51 -0400, David Miller wrote:
> From: Christoph Hellwig 
> Date: Sun, 14 May 2017 07:58:48 +0200
> 
> > this patch has not been superceeded by anything, can you explain why
> > it has been marked as such in patchworks?
> 
> I think you're being overbearing by requiring this to be marked BROKEN
> and I would like you to explore other ways with the authors to fix
> whatever perceived problems you think SMC has.
> 
> You claim that this is somehow "urgent" is false.  You can ask
> distributions to disable SMC or whatever in the short term if it
> reallly, truly, bothers you.

Hello Dave,

There is agreement that the user-space API for using the SMC protocol must
be changed, namely by dropping AF_SMC and by making applications use the
SMC protocol through socket(AF_INET..., SOCK_STREAM, ...). What is your
plan to avoid that applications start using and depending on AF_SMC?

Thanks,

Bart.

Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-14 Thread David Miller
From: Christoph Hellwig 
Date: Sun, 14 May 2017 07:58:48 +0200

> this patch has not been superceeded by anything, can you explain why
> it has been marked as such in patchworks?

I think you're being overbearing by requiring this to be marked BROKEN
and I would like you to explore other ways with the authors to fix
whatever perceived problems you think SMC has.

You claim that this is somehow "urgent" is false.  You can ask
distributions to disable SMC or whatever in the short term if it
reallly, truly, bothers you.


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-13 Thread Christoph Hellwig
Dave,

this patch has not been superceeded by anything, can you explain why it
has been marked as such in patchworks?

On Thu, May 11, 2017 at 02:57:43PM +, Bart Van Assche wrote:
> On Wed, 2017-05-10 at 09:26 +0200, Christoph Hellwig wrote:
> > The driver has a lot of quality issues due to the lack of RDMA-side
> > review, and explicitly bypasses APIs to register all memory once a
> > connection is made, and thus allows remote access to memoery.
> > 
> > Mark it as broken until at least that part is fixed.
> 
> Since this is the only way to get the BROKEN marker in the v4.11 stable
> kernel series:
> 
> Acked-by: Bart Van Assche ---end quoted text---


Re: [PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-11 Thread Bart Van Assche
On Wed, 2017-05-10 at 09:26 +0200, Christoph Hellwig wrote:
> The driver has a lot of quality issues due to the lack of RDMA-side
> review, and explicitly bypasses APIs to register all memory once a
> connection is made, and thus allows remote access to memoery.
> 
> Mark it as broken until at least that part is fixed.

Since this is the only way to get the BROKEN marker in the v4.11 stable
kernel series:

Acked-by: Bart Van Assche 

[PATCH] net/smc: mark as BROKEN due to remote memory exposure

2017-05-10 Thread Christoph Hellwig
The driver has a lot of quality issues due to the lack of RDMA-side
review, and explicitly bypasses APIs to register all memory once a
connection is made, and thus allows remote access to memoery.

Mark it as broken until at least that part is fixed.

Signed-off-by: Christoph Hellwig 
Cc: sta...@vger.kernel.org

---
 net/smc/Kconfig | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/net/smc/Kconfig b/net/smc/Kconfig
index c717ef0896aa..fe6b78bc515f 100644
--- a/net/smc/Kconfig
+++ b/net/smc/Kconfig
@@ -1,6 +1,6 @@
 config SMC
tristate "SMC socket protocol family"
-   depends on INET && INFINIBAND
+   depends on INET && INFINIBAND && BROKEN
---help---
  SMC-R provides a "sockets over RDMA" solution making use of
  RDMA over Converged Ethernet (RoCE) technology to upgrade
@@ -8,6 +8,10 @@ config SMC
  The Linux implementation of the SMC-R solution is designed as
  a separate socket family SMC.
 
+ Warning: SMC will expose all memory for remote reads and writes
+ once a connection is established.  Don't enable this option except
+ for tightly controlled lab environment.
+
  Select this option if you want to run SMC socket applications
 
 config SMC_DIAG
-- 
2.11.0