Re: [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API

2015-08-13 Thread Steven Whitehouse

Hi,

On 12/08/15 17:42, Marcelo Ricardo Leitner wrote:

On 12-08-2015 12:33, David Laight wrote:

From: Marcelo Ricardo Leitner

Sent: 12 August 2015 14:16
On 12-08-2015 07:23, David Laight wrote:

From: Marcelo Ricardo Leitner

Sent: 11 August 2015 23:22
DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
needed, but this causes it to use sctp_do_peeloff() to mimic a
kernel_accept(), which causes a symbol dependency on the sctp module.

By switching it to 1-to-1 API we can avoid this dependency and also
reduce quite a lot of SCTP-specific code in lowcomms.c.

...

You still need to enable sctp notifications (I think the patch deleted
that code).
Otherwise you don't get any kind of indication if the remote system
'resets' (i.e. sends a new INIT chunk) on an existing connection.


Right, it would miss the restart event and could generate corrupted
tx/rx buffers by gluing parts of old messages with new ones.


Except that it is SCTP so you'd expect DATA chunks to contain entire
messages and so get unexpected message sequences rather than corrupt
messages.


I was thinking of cases where the buffer passed to recvmsg is too small to
hold the chunk, so that the remainder is left for another attempt
(sctp_recvmsg, around line 2130), but it sounds like we won't purge the rx
buffer when the reset happens, so that doesn't matter. The association
is replaced, but the buffers are kept.


Out-of-order messages aren't a problem for dlm. It can recover from
that just fine, as it doesn't have a specific handshake at the beginning
or anything like that, and upper layers are agnostic to that state
transition (disconnect/reconnect/...), so this should be fine.


I'm not sure that's true - DLM does rely on message ordering in some
cases in order to ensure correct functioning. So depending on how SCTP
is interfaced to DLM, it might potentially be an issue.


Steve.



Re: [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API

2015-08-13 Thread Marcelo Ricardo Leitner

On 13-08-2015 06:37, Steven Whitehouse wrote:

Hi,

On 12/08/15 17:42, Marcelo Ricardo Leitner wrote:

On 12-08-2015 12:33, David Laight wrote:

From: Marcelo Ricardo Leitner

Sent: 12 August 2015 14:16
On 12-08-2015 07:23, David Laight wrote:

From: Marcelo Ricardo Leitner

Sent: 11 August 2015 23:22
DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
needed, but this causes it to use sctp_do_peeloff() to mimic a
kernel_accept(), which causes a symbol dependency on the sctp module.

By switching it to 1-to-1 API we can avoid this dependency and also
reduce quite a lot of SCTP-specific code in lowcomms.c.

...

You still need to enable sctp notifications (I think the patch deleted
that code).
Otherwise you don't get any kind of indication if the remote system
'resets' (i.e. sends a new INIT chunk) on an existing connection.


Right, it would miss the restart event and could generate corrupted
tx/rx buffers by gluing parts of old messages with new ones.


Except that it is SCTP so you'd expect DATA chunks to contain entire
messages and so get unexpected message sequences rather than corrupt
messages.


I was thinking of cases where the buffer passed to recvmsg is too small to
hold the chunk, so that the remainder is left for another attempt
(sctp_recvmsg, around line 2130), but it sounds like we won't purge the rx
buffer when the reset happens, so that doesn't matter. The association
is replaced, but the buffers are kept.

Out-of-order messages aren't a problem for dlm. It can recover from
that just fine, as it doesn't have a specific handshake at the beginning
or anything like that, and upper layers are agnostic to that state
transition (disconnect/reconnect/...), so this should be fine.


I'm not sure that's true - DLM does rely on message ordering in some
cases in order to ensure correct functioning. So depending on how SCTP
is interfaced to DLM, it might potentially be an issue.


Yes, that ordering is still kept: it won't move a newer message ahead of
an older one. The problem would only arise if DLM had its own handshake
exposing its version and features: one peer (the old one) would get it out
of the blue and the other (the new one) would never get it. The same applies
if its messages depended on previous state, meaning LockMsgC is only
acceptable if LockMsgA was already performed on that connection. That is my
understanding from what David pointed out and what I checked here.


Then, as lowcomms previously allowed a connection to be closed without telling
anyone above it that it happened, it should be fine, right? It will just
finish processing the old messages and then start on the new ones, just
like before.


Thanks,
Marcelo



Re: [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API

2015-08-12 Thread David Laight
From: Marcelo Ricardo Leitner
 Sent: 12 August 2015 14:16
 On 12-08-2015 07:23, David Laight wrote:
  From: Marcelo Ricardo Leitner
  Sent: 11 August 2015 23:22
  DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
  needed, but this causes it to use sctp_do_peeloff() to mimic a
  kernel_accept(), which causes a symbol dependency on the sctp module.
 
  By switching it to 1-to-1 API we can avoid this dependency and also
  reduce quite a lot of SCTP-specific code in lowcomms.c.
  ...
 
  You still need to enable sctp notifications (I think the patch deleted
  that code).
  Otherwise you don't get any kind of indication if the remote system
  'resets' (i.e. sends a new INIT chunk) on an existing connection.
 
 Right, it would miss the restart event and could generate corrupted
 tx/rx buffers by gluing parts of old messages with new ones.

Except that it is SCTP so you'd expect DATA chunks to contain entire
messages and so get unexpected message sequences rather than corrupt
messages.
The problem is that the recovery is likely to be another reset.
(Particularly with M3UA where the source and destination port numbers
are likely to be the same and fixed.)

  It is probably enough to treat the MSG_NOTIFICATION as a fatal error
  and close the socket.
 
 Just so we are on the same page, you mean that after accepting the new
 association and enabling notifications on it, any further notification
 on it can be treated as a fatal error, right? Seems reasonable to me.

That's what I had to do.
The far end will probably see an additional disconnect, but it shouldn't
matter.

  This is probably a bug in the sctp stack - if a connection is reset
  but the user hasn't requested notifications then it should be
  converted to a disconnect indication and a new incoming connection.
 
 Maybe in such a case resets shouldn't be allowed at all? Because unless it
 happens at a moment of silence it will always lead to application buffer
 corruption. I checked the RFCs now but couldn't find anything restricting
 them to some condition.

I certainly expected the 'reset' to cause an inwards abortive disconnect
on the old socket and a new indication on the listening socket.
I think (hope) that is what you get for a TCP SYN that matches an existing
connection.

In our case I think they were happening when the remote system was power
cycled.

David




Re: [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API

2015-08-12 Thread David Laight
From: Marcelo Ricardo Leitner
 Sent: 11 August 2015 23:22
 DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
 needed, but this causes it to use sctp_do_peeloff() to mimic a
 kernel_accept(), which causes a symbol dependency on the sctp module.
 
 By switching it to 1-to-1 API we can avoid this dependency and also
 reduce quite a lot of SCTP-specific code in lowcomms.c.
...

You still need to enable sctp notifications (I think the patch deleted
that code).
Otherwise you don't get any kind of indication if the remote system
'resets' (i.e. sends a new INIT chunk) on an existing connection.

It is probably enough to treat the MSG_NOTIFICATION as a fatal error
and close the socket.
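
As a rough userspace sketch of that approach (not the lowcomms.c code): once notifications are enabled on the socket, for example through the SCTP_EVENTS socket option, anything that recvmsg() flags with MSG_NOTIFICATION is treated like a failed read and the socket is closed. The names classify_rx() and recv_or_close() are made up for illustration, and the fallback #define only exists so the sketch builds without the SCTP headers.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Linux provides MSG_NOTIFICATION via the SCTP headers (value 0x8000).
 * Fallback so this sketch compiles without them. */
#ifndef MSG_NOTIFICATION
#define MSG_NOTIFICATION 0x8000
#endif

enum rx_verdict { RX_DATA, RX_CLOSE };

/* Hypothetical helper: decide what to do with the result of one recvmsg().
 * EOF, errors and any SCTP notification all mean "give up on this socket". */
static enum rx_verdict classify_rx(ssize_t len, int msg_flags)
{
    if (len <= 0)
        return RX_CLOSE;
    if (msg_flags & MSG_NOTIFICATION)
        return RX_CLOSE;
    return RX_DATA;
}

/* Hypothetical helper: read one message, closing fd on anything but data.
 * Returns the number of bytes read, or -1 after the socket was closed. */
static ssize_t recv_or_close(int fd, void *buf, size_t buflen)
{
    struct iovec iov = { .iov_base = buf, .iov_len = buflen };
    struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
    ssize_t len = recvmsg(fd, &msg, 0);

    if (classify_rx(len, msg.msg_flags) == RX_CLOSE) {
        close(fd);
        return -1;
    }
    return len;
}
```

Closing on any notification is deliberately blunt: it trades an extra reconnect for never having to interpret individual events.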

This is probably a bug in the sctp stack - if a connection is reset
but the user hasn't requested notifications then it should be
converted to a disconnect indication and a new incoming connection.

David




Re: [Cluster-devel] [PATCH 4/6] dlm: use sctp 1-to-1 API

2015-08-12 Thread Marcelo Ricardo Leitner

On 12-08-2015 07:23, David Laight wrote:

From: Marcelo Ricardo Leitner

Sent: 11 August 2015 23:22
DLM is using the 1-to-many API but in a 1-to-1 fashion. That is, it's not
needed, but this causes it to use sctp_do_peeloff() to mimic a
kernel_accept(), which causes a symbol dependency on the sctp module.

By switching it to 1-to-1 API we can avoid this dependency and also
reduce quite a lot of SCTP-specific code in lowcomms.c.

...

You still need to enable sctp notifications (I think the patch deleted
that code).
Otherwise you don't get any kind of indication if the remote system
'resets' (i.e. sends a new INIT chunk) on an existing connection.


Right, it would miss the restart event and could generate corrupted
tx/rx buffers by gluing parts of old messages with new ones.
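
To illustrate the concern, here is a small sketch assuming a receiver that reassembles a message from several recvmsg() calls and relies on MSG_EOR to mark the last piece. struct reasm, reasm_feed() and reasm_reset() are hypothetical names, not anything in dlm. If a restart goes unnoticed between two pieces and the rest of the old message never arrives, the leftover bytes end up in front of the first new message unless the buffer is reset.

```c
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>   /* MSG_EOR */

/* Hypothetical reassembly buffer for one connection. */
struct reasm {
    char   buf[4096];
    size_t used;
};

/* Drop any partial message, e.g. when a restart is detected or after a
 * complete message has been consumed. */
static void reasm_reset(struct reasm *r)
{
    r->used = 0;
}

/* Append one received piece.
 * Returns 1 when a complete message is in r->buf (r->used bytes),
 * 0 when more pieces are expected, -1 if the piece does not fit. */
static int reasm_feed(struct reasm *r, const void *frag, size_t len,
                      int msg_flags)
{
    if (len > sizeof(r->buf) - r->used) {
        reasm_reset(r);
        return -1;
    }
    memcpy(r->buf + r->used, frag, len);
    r->used += len;
    return (msg_flags & MSG_EOR) ? 1 : 0;
}
```

Feeding "AB" without MSG_EOR and then "CD" with MSG_EOR yields the glued message "ABCD"; calling reasm_reset() in between, as a restart handler would, yields just "CD".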



It is probably enough to treat the MSG_NOTIFICATION as a fatal error
and close the socket.


Just so we are on the same page, you mean that after accepting the new
association and enabling notifications on it, any further notification
on it can be treated as a fatal error, right? Seems reasonable to me.



This is probably a bug in the sctp stack - if a connection is reset
but the user hasn't requested notifications then it should be
converted to a disconnect indication and a new incoming connection.


Maybe in such a case resets shouldn't be allowed at all? Because unless it
happens at a moment of silence it will always lead to application buffer
corruption. I checked the RFCs now but couldn't find anything restricting
them to some condition.


Thanks,
Marcelo