Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread Srinivas Eeda
Each node that has this patch sends an O2NET_MSG_KEEP_REQ_MAGIC 
every 2 seconds (the default). So nodes without this patch always 
receive a heartbeat message every 2 seconds.

Nodes without this patch respond with O2NET_MSG_KEEP_RESP_MAGIC to 
every keepalive packet they receive. So nodes with this patch always 
receive a response message.

So, in a mixed setup, both nodes will always hear the heartbeat from 
each other :).
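
The mixed-setup argument above can be checked with a small model. This is only a sketch under stated assumptions, not o2net code: time is cut into whole keepalive intervals, the two nodes exchange no other traffic, and an unpatched node is assumed to send a request only after an interval in which it heard nothing. All names (node_kind, silent_intervals, and so on) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

enum node_kind { NODE_OLD, NODE_NEW };

struct node {
    enum node_kind kind;
    bool heard_last_tick; /* did anything arrive in the previous interval? */
    int silent_ticks;     /* intervals in which nothing arrived */
};

/* Patched nodes send KEEP_REQ every interval; unpatched nodes (in this
 * coarse model) only after an interval in which they heard nothing. */
static bool sends_request(const struct node *n)
{
    return n->kind == NODE_NEW || !n->heard_last_tick;
}

/* Only unpatched nodes answer KEEP_REQ with KEEP_RESP (the patch drops it). */
static bool sends_response(const struct node *n)
{
    return n->kind == NODE_OLD;
}

/* Simulate `ticks` keepalive intervals between two otherwise idle nodes and
 * return how many node-intervals passed without any message arriving. */
static int silent_intervals(enum node_kind ka, enum node_kind kb, int ticks)
{
    struct node a = { ka, false, 0 };
    struct node b = { kb, false, 0 };

    assert(ticks >= 0);
    for (int t = 0; t < ticks; t++) {
        bool a_req = sends_request(&a);
        bool b_req = sends_request(&b);
        /* Messages arriving at each node during this interval. */
        int to_a = (b_req ? 1 : 0) + ((a_req && sends_response(&b)) ? 1 : 0);
        int to_b = (a_req ? 1 : 0) + ((b_req && sends_response(&a)) ? 1 : 0);

        a.heard_last_tick = to_a > 0;
        b.heard_last_tick = to_b > 0;
        if (to_a == 0)
            a.silent_ticks++;
        if (to_b == 0)
            b.silent_ticks++;
    }
    return a.silent_ticks + b.silent_ticks;
}
```

In this model silent_intervals(NODE_OLD, NODE_NEW, 10) and silent_intervals(NODE_NEW, NODE_NEW, 10) both come out as 0, which matches the claim that each side keeps hearing from the other. The model is too coarse to say anything useful about a pair of unpatched nodes.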

thanks,
--Srini



Joel Becker wrote:
 On Thu, Jan 28, 2010 at 08:51:11PM -0800, Srinivas Eeda wrote:
   
      case O2NET_MSG_KEEP_REQ_MAGIC:
 -        o2net_sendpage(sc, o2net_keep_resp,
 -                       sizeof(*o2net_keep_resp));
 +        /* Each node now sends keepalive message every
 +         * keepalive time interval. Hence no need for response
 +         */
          goto out;
 

   You still have to send the response.  Think about a mixed
 environment where some nodes have this fix and some do not.  The older
 software is still waiting on the response.
   The newer version can just ignore any responses it gets from
 other nodes.  But it has to send responses out just in case the other
 node is older.
   The only other alternative is to bump the o2net protocol
 version, and that means the cluster has to be shut down to upgrade.  Not
 a good choice.

 Joel

   


___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-devel


Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread Sunil Mushran
What's the harm in leaving it in?



Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread Srinivas Eeda
No harm; it just doubles the heartbeat messages, which is not required at all.

Sunil Mushran wrote:
 What's the harm in leaving it in?



Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread Sunil Mushran
How will it double? The node will send a keepalive only if it has
not heard from the other node for 2 secs.

Srinivas Eeda wrote:
 No harm, just doubles heartbeat messages which is not required at all.



Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread Srinivas Eeda
In the old code, a node cancels and requeues the keepalive work whenever 
it hears from the other node. If it hears nothing for 2 seconds, the 
queued work fires and sends a keepalive message. A requeue then happens 
only after it hears from the other node again.

With the new change, a node sends a keepalive every 2 seconds.
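
The two timer policies just described can be compared with a rough sketch. It is a model, not the o2net implementation: time runs in whole seconds, the keepalive work is assumed to be first queued at t = 0, and the function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Old policy: hearing from the peer cancels and requeues the keepalive work;
 * once the work fires it is not requeued until the peer is heard again.
 * `heard` holds whole-second arrival times, sorted ascending, all >= 0. */
static int keepalives_old(const int *heard, size_t nheard,
                          int interval, int duration)
{
    int sent = 0;
    int deadline = interval; /* assume the work is first queued at t = 0 */
    bool armed = true;
    size_t h = 0;

    assert(interval > 0 && duration >= 0);
    for (int t = 0; t <= duration; t++) {
        /* Traffic from the peer at time t: cancel and requeue. */
        while (h < nheard && heard[h] == t) {
            deadline = t + interval;
            armed = true;
            h++;
        }
        if (armed && t >= deadline) {
            sent++;        /* queued work fires: send one keepalive */
            armed = false; /* no requeue until the peer is heard again */
        }
    }
    return sent;
}

/* Patched policy as described: one keepalive every interval, regardless. */
static int keepalives_new(int interval, int duration)
{
    assert(interval > 0 && duration >= 0);
    return duration / interval;
}
```

With a 2 second interval over 10 seconds, the old policy sends nothing while the peer talks every second and sends a single keepalive if the peer stays silent, whereas the patched policy sends 5 in either case.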

Sunil Mushran wrote:
 How will it double? The node will send a keepalive only if it has
 not heard from the other node for 2 secs.



Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread srinivas eeda
OK, I'll modify the patch. One question: are the work items queued on 
o2net_wq and the execution of o2net_process_message always run in the 
context of the o2net thread, and are they synchronized with each other?

On 2/17/2010 3:50 PM, Sunil Mushran wrote:
 My understanding was that we'll also requeue after sending a keepalive,
 that is, not wait for the response before requeueing. But we'll still be
 smart about it, in the sense that we won't send a heartbeat if the nodes
 are already communicating otherwise.
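
One way to read the policy Sunil describes is sketched here: the keepalive is requeued both when traffic arrives and right after one is sent, so it only goes out when the link has been quiet for a full interval. This is an illustrative model only; it tracks received traffic in whole seconds, assumes the work is first queued at t = 0, and keepalives_requeue is a made-up name.

```c
#include <assert.h>
#include <stddef.h>

/* Count keepalives sent in [0, duration] when the timer is pushed back by
 * incoming traffic and also rearmed immediately after each send.
 * `heard` holds whole-second arrival times, sorted ascending, all >= 0. */
static int keepalives_requeue(const int *heard, size_t nheard,
                              int interval, int duration)
{
    int sent = 0;
    int deadline = interval; /* assume the work is first queued at t = 0 */
    size_t h = 0;

    assert(interval > 0 && duration >= 0);
    for (int t = 0; t <= duration; t++) {
        /* Any traffic from the peer pushes the next keepalive back. */
        while (h < nheard && heard[h] == t) {
            deadline = t + interval;
            h++;
        }
        if (t >= deadline) {
            sent++;                  /* link was quiet for a full interval */
            deadline = t + interval; /* requeue without waiting for a reply */
        }
    }
    return sent;
}
```

With a 2 second interval over 10 seconds, this sends 5 keepalives to a silent peer and none to a peer that talks every second.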



Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-17 Thread srinivas eeda

Yes, they don't expect or wait for a response to a keepalive message.

On 2/17/2010 5:49 PM, Joel Becker wrote:

On Wed, Feb 17, 2010 at 10:24:30AM -0800, Srinivas Eeda wrote:
  

Each node that has this patch would send a O2NET_MSG_KEEP_REQ_MAGIC
every 2 seconds(default). So, nodes without this patch would always
receive a heartbeat message every 2 seconds.



So, old nodes do not care if they never receive RESP_MAGIC as
long as they got some other message?

Joel

  

Re: [Ocfs2-devel] [PATCH 3/3] o2net: correct keepalive message protocol

2010-02-16 Thread Joel Becker
On Thu, Jan 28, 2010 at 08:51:11PM -0800, Srinivas Eeda wrote:
      case O2NET_MSG_KEEP_REQ_MAGIC:
 -        o2net_sendpage(sc, o2net_keep_resp,
 -                       sizeof(*o2net_keep_resp));
 +        /* Each node now sends keepalive message every
 +         * keepalive time interval. Hence no need for response
 +         */
          goto out;

You still have to send the response.  Think about a mixed
environment where some nodes have this fix and some do not.  The older
software is still waiting on the response.

The newer version can just ignore any responses it gets from
other nodes.  But it has to send responses out just in case the other
node is older.

The only other alternative is to bump the o2net protocol
version, and that means the cluster has to be shut down to upgrade.  Not
a good choice.
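
The compatibility rule proposed here (keep answering requests, ignore incoming responses) might look roughly like the following. It is a sketch, not the o2net receive path: the enum values are placeholders rather than the real magic numbers, and sketch_handle is a hypothetical name.

```c
#include <assert.h>

/* Placeholder message kinds; the real o2net magic values are not used here. */
enum sketch_msg { SKETCH_KEEP_REQ, SKETCH_KEEP_RESP, SKETCH_DATA };

enum sketch_action { ACT_SEND_KEEP_RESP, ACT_IGNORE, ACT_DISPATCH };

/* What a patched node would do with an incoming message under this rule. */
static enum sketch_action sketch_handle(enum sketch_msg msg)
{
    switch (msg) {
    case SKETCH_KEEP_REQ:
        /* The peer may be unpatched and expecting a reply, so always answer. */
        return ACT_SEND_KEEP_RESP;
    case SKETCH_KEEP_RESP:
        /* A patched node does not depend on replies; drop them. */
        return ACT_IGNORE;
    default:
        /* Everything else goes to the normal message handlers. */
        assert(msg == SKETCH_DATA);
        return ACT_DISPATCH;
    }
}
```

The wire protocol stays unchanged in this arrangement, so no protocol version bump is needed and patched and unpatched nodes can coexist during a rolling upgrade.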

Joel

-- 

Life's Little Instruction Book #464

Don't miss the magic of the moment by focusing on what's
 to come.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.bec...@oracle.com
Phone: (650) 506-8127
