rzo1 commented on code in PR #8584:
URL: https://github.com/apache/storm/pull/8584#discussion_r3183858754


##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)

Review Comment:
   The original 0.7.1 source had a comment backing this claim; the modern 
`KryoTupleSerializer` holds a `Kryo` field, and Kryo itself is not thread-safe 
(instances are scoped per executor/thread). Could you verify, or drop 
"thread-safe"?



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)

Review Comment:
   The Disruptor reference is outdated — 2.x replaced it with `JCQueue` 
(JCTools), see `storm-client/src/jvm/org/apache/storm/utils/JCQueue.java`.
   
   ```suggestion
   (Note: this walkthrough has been updated for v2.8.7. The message passing 
infrastructure was rewritten in 0.8.0 (originally Disruptor-based) and later 
moved to JCQueue.)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)

Review Comment:
   GitHub `/blob/` URLs don't resolve to directories — this should be `/tree/`, 
or better, point at the entry-point file:
   
   ```suggestion
      - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/Context.java)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)

Review Comment:
   Same `/blob/` issue, and the directory only contains one file:
   
   ```suggestion
      - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/Context.java)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
 - Receiving messages in tasks works differently in local mode and distributed 
mode
-   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj#L21)
-   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port is called a "virtual port", because it receives [task id, message] and 
then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L204)
-      - The virtual port implementation is here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/zilch/virtual_port.clj)
-      - Tasks listen on an in-memory ZeroMQ port for messages from the virtual 
port 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L201)
-        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L489)
-        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L382)
+   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)

Review Comment:
   Same `/blob/` directory issue:
   
   ```suggestion
      - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/Context.java)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
 - Receiving messages in tasks works differently in local mode and distributed 
mode
-   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj#L21)
-   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port is called a "virtual port", because it receives [task id, message] and 
then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L204)
-      - The virtual port implementation is here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/zilch/virtual_port.clj)
-      - Tasks listen on an in-memory ZeroMQ port for messages from the virtual 
port 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L201)
-        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L489)
-        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L382)
+   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
+   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port receives [task id, message] and then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/StormServerHandler.java)
+      - The message routing implementation is here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)

Review Comment:
   Same `/blob/` directory issue; original linked a specific file. Suggest 
pointing at `Server.java` (or removing this bullet — line 21 already covers 
Netty routing):
   
   ```suggestion
         - The message routing implementation is here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/Server.java)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
 - Receiving messages in tasks works differently in local mode and distributed 
mode
-   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj#L21)
-   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port is called a "virtual port", because it receives [task id, message] and 
then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L204)
-      - The virtual port implementation is here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/zilch/virtual_port.clj)
-      - Tasks listen on an in-memory ZeroMQ port for messages from the virtual 
port 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L201)
-        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L489)
-        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L382)
+   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
+   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port receives [task id, message] and then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/StormServerHandler.java)
+      - The message routing implementation is here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+      - Executors listen on an in-memory connection for messages 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/Executor.java)
+        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/bolt/BoltExecutor.java)
+        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/spout/SpoutExecutor.java)
 - Tasks are responsible for message routing. A tuple is emitted either to a 
direct stream (where the task id is specified) or a regular stream. In direct 
streams, the message is only sent if that bolt subscribes to that direct 
stream. In regular streams, the stream grouping functions are used to determine 
the task ids to send the tuple to.
-  - Tasks have a routing map from {stream id} -> {component id} -> {stream 
grouping function} 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L198)
-  - The "tasks-fn" returns the task ids to send the tuples to for either 
regular stream emit or direct stream emit 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L207)
-  - After getting the output task ids, bolts and spouts use the transfer-fn 
provided by the worker to actually transfer the tuples
-      - Bolt transfer code here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L429)
-      - Spout transfer code here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L329)
+  - Tasks have a routing map from {stream id} -> {component id} -> {stream 
grouping function} 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/Task.java)
+  - Stream grouping functions determine the task ids to send the tuples to for 
either regular stream emit or direct stream emit 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/Task.java)

Review Comment:
   Both bullets link to the same `Task.java` root. The original 0.7.1 doc had 
distinct line anchors. Either add `#L<line>` anchors at the relevant 
fields/methods, or merge into one bullet:
   
   ```suggestion
     - Tasks have a routing map from {stream id} -> {component id} -> {stream 
grouping function}; the grouping functions determine the task ids to send the 
tuples to for either regular stream emit or direct stream emit. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/Task.java)
   ```



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
 - Receiving messages in tasks works differently in local mode and distributed 
mode
-   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj#L21)
-   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port is called a "virtual port", because it receives [task id, message] and 
then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L204)
-      - The virtual port implementation is here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/zilch/virtual_port.clj)
-      - Tasks listen on an in-memory ZeroMQ port for messages from the virtual 
port 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L201)
-        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L489)
-        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L382)
+   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
+   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port receives [task id, message] and then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/StormServerHandler.java)
+      - The message routing implementation is here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+      - Executors listen on an in-memory connection for messages 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/Executor.java)
+        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/bolt/BoltExecutor.java)
+        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/spout/SpoutExecutor.java)
 - Tasks are responsible for message routing. A tuple is emitted either to a 
direct stream (where the task id is specified) or a regular stream. In direct 
streams, the message is only sent if that bolt subscribes to that direct 
stream. In regular streams, the stream grouping functions are used to determine 
the task ids to send the tuple to.
-  - Tasks have a routing map from {stream id} -> {component id} -> {stream 
grouping function} 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L198)
-  - The "tasks-fn" returns the task ids to send the tuples to for either 
regular stream emit or direct stream emit 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L207)
-  - After getting the output task ids, bolts and spouts use the transfer-fn 
provided by the worker to actually transfer the tuples
-      - Bolt transfer code here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L429)
-      - Spout transfer code here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L329)
+  - Tasks have a routing map from {stream id} -> {component id} -> {stream 
grouping function} 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/Task.java)
+  - Stream grouping functions determine the task ids to send the tuples to for 
either regular stream emit or direct stream emit 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/Task.java)
+  - After getting the output task ids, bolts and spouts use the transfer 
function provided by the worker to actually transfer the tuples
+      - Bolt transfer code here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/bolt/BoltExecutor.java)
+      - Spout transfer code here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/spout/SpoutExecutor.java)

Review Comment:
   The "Bolts listen here" (line 24) and "Bolt transfer code here" (line 30) 
bullets both link to `BoltExecutor.java`; same for `SpoutExecutor.java` on 
lines 25/31. Please add `#L<line>` anchors so each link lands on a distinct 
method, or merge the receive/transfer bullets — the right anchors are your call.



##########
docs/Message-passing-implementation.md:
##########
@@ -3,28 +3,30 @@ title: Message Passing Implementation
 layout: documentation
 documentation: true
 ---
-(Note: this walkthrough is out of date as of 0.8.0. 0.8.0 revamped the message 
passing infrastructure to be based on the Disruptor)
+
+(Note: this walkthrough has been updated for v2.8.7. As of 0.8.0, the message 
passing infrastructure has been based on the Disruptor)
 
 This page walks through how emitting and transferring tuples works in Storm.
 
 - Worker is responsible for message transfer
-   - `refresh-connections` is called every "task.refresh.poll.secs" or 
whenever assignment in ZK changes. It manages connections to other workers and 
maintains a mapping from task -> worker 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L123)
-   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L56)
-   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/0.7.1/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L26)
-   - The worker has a single thread which drains the transfer queue and sends 
the messages to other workers 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L185)
-   - Message sending happens through this protocol: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/protocol.clj)
-   - The implementation for distributed mode uses ZeroMQ 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/zmq.clj)
-   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing to get ZeroMQ installed) 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj)
+   - Connection management is handled by the `WorkerState` class which manages 
connections to other workers and maintains a mapping from task -> worker. 
Connection refresh is triggered every "task.refresh.poll.secs" or whenever 
assignment in ZK changes. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerState.java)
+   - Provides a "transfer function" that is used by tasks to send tuples to 
other tasks. The transfer function takes in a task id and a tuple, and it 
serializes the tuple and puts it onto a "transfer queue". There is a single 
transfer queue for each worker. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java)
+   - The serializer is thread-safe 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java)
+   - The worker drains the transfer queue and sends the messages to other 
workers 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java)
+   - Message sending happens through this interface: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/IConnection.java)
+   - The implementation for distributed mode uses Netty 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+   - The implementation for local mode uses in memory Java queues (so that 
it's easy to use Storm locally without needing external messaging dependencies) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
 - Receiving messages in tasks works differently in local mode and distributed 
mode
-   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/messaging/local.clj#L21)
-   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port is called a "virtual port", because it receives [task id, message] and 
then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/worker.clj#L204)
-      - The virtual port implementation is here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/zilch/virtual_port.clj)
-      - Tasks listen on an in-memory ZeroMQ port for messages from the virtual 
port 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L201)
-        - Bolts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L489)
-        - Spouts listen here: 
[code](https://github.com/apache/storm/blob/0.7.1/src/clj/org/apache/storm/daemon/task.clj#L382)
+   - In local mode, the tuple is sent directly to an in-memory queue for the 
receiving task 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/local/)
+   - In distributed mode, each worker listens on a single TCP port for 
incoming messages and then routes those messages in-memory to tasks. The TCP 
port receives [task id, message] and then routes it to the actual task. 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/StormServerHandler.java)
+      - The message routing implementation is here: 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/messaging/netty/)
+      - Executors listen on an in-memory connection for messages 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/Executor.java)

Review Comment:
   In 2.x, bolt/spout executors consume from their `JCQueue` receive queue 
rather than listening on an in-memory connection:
   
   ```suggestion
         - Executors consume from their receive queue (a `JCQueue`) 
[code](https://github.com/apache/storm/blob/v2.8.7/storm-client/src/jvm/org/apache/storm/executor/Executor.java)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


Reply via email to