pchang388 commented on issue #12701:
URL: https://github.com/apache/druid/issues/12701#issuecomment-1178206982

   #2 Continued
   
   After the 202 Accepted, Supervisor/Overlord will constantly hit the 
"/status" endpoint of the peon in order to see if it actually paused. After the 
202 it appears to only respond with "STARTING" status
   ```
   022-07-06T21:56:33,609 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Pause 
task[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc]
   2022-07-06T21:56:33,612 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.common.IndexTaskClient - HTTP POST: 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause
   2022-07-06T21:56:33,612 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 starting
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: DefaultHttpResponse(chunked: true)
   X-Druid-Task-Id: index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got response: 202 Accepted
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: org.jboss.netty.handler.codec.http.DefaultHttpChunk@6634e90e
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got chunk: 44B, last=false
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: org.jboss.netty.handler.codec.http.HttpChunk$1@47788630
   2022-07-06T21:56:35,618 DEBUG [HttpClient-Netty-Worker-11] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got chunk: 0B, last=true
   2022-07-06T21:56:35,618 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - 
GetStatus task[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc]
   2022-07-06T21:56:35,623 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.common.IndexTaskClient - HTTP GET: 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status
   2022-07-06T21:56:35,623 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 starting
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: DefaultHttpResponse(chunked: true)
   X-Druid-Task-Id: index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got response: 200 OK
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: org.jboss.netty.handler.codec.http.DefaultHttpChunk@4922d15b
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got chunk: 9B, last=false
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: org.jboss.netty.handler.codec.http.HttpChunk$1@47788630
   2022-07-06T21:56:35,628 DEBUG [HttpClient-Netty-Worker-16] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got chunk: 0B, last=true
   2022-07-06T21:56:35,628 INFO [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   2022-07-06T21:56:37,628 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - 
GetStatus task[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc]
   2022-07-06T21:56:37,632 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.common.IndexTaskClient - HTTP GET: 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status
   2022-07-06T21:56:37,632 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 starting
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: DefaultHttpResponse(chunked: true)
   X-Druid-Task-Id: index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got response: 200 OK
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: org.jboss.netty.handler.codec.http.DefaultHttpChunk@38855127
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got chunk: 9B, last=false
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 messageReceived: org.jboss.netty.handler.codec.http.HttpChunk$1@47788630
   2022-07-06T21:56:37,637 DEBUG [HttpClient-Netty-Worker-21] 
org.apache.druid.java.util.http.client.NettyHttpClient - [GET 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/status]
 Got chunk: 0B, last=true
   2022-07-06T21:56:37,637 INFO [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   2022-07-06T21:56:39,637 DEBUG [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - 
GetStatus task[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc]
   ```
   * This continues for some time, I count 287 occurrences in this peon task 
for similar messages for retries: 
   ```
   2022-07-06T21:56:35,628 INFO [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   ...
   2022-07-06T22:02:15,663 INFO [IndexTaskClient-REDACT-4] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   
   2022-07-06T22:02:16,518 INFO [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   
   2022-07-06T22:02:17,196 INFO [IndexTaskClient-REDACT-0] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   
   2022-07-06T22:02:17,680 INFO [IndexTaskClient-REDACT-4] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   
   2022-07-06T22:02:18,524 INFO [IndexTaskClient-REDACT-5] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   
   2022-07-06T22:02:19,206 INFO [IndexTaskClient-REDACT-0] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   ...
   ...
   2022-07-06T22:02:27,720 INFO [IndexTaskClient-REDACT-4] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Still 
waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to change its 
status to [PAUSED]; will try again in [PT2S]
   ```
   
   * Until eventually, it is killed via shutdown as it never paused and on the 
next pause attempt, it responded with "400 Bad Request - cannot pause a task in 
"PUBLISHING" phase". According to the log snippet below, it jumped from 
"STARTING" phase to "PUBLISHING" between "/status" checks/calls:
   ```
   2022-07-06T22:05:12,414 DEBUG [IndexTaskClient-REDACT-0] 
org.apache.druid.indexing.seekablestream.SeekableStreamIndexTaskClient - Pause 
task[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc]
   2022-07-06T22:05:12,417 DEBUG [IndexTaskClient-REDACT-0] 
org.apache.druid.indexing.common.IndexTaskClient - HTTP POST: 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause
   2022-07-06T22:05:12,417 DEBUG [IndexTaskClient-REDACT-0] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 starting
   2022-07-06T22:05:12,420 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: DefaultHttpResponse(chunked: true)
   X-Druid-Task-Id: index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc
   2022-07-06T22:05:12,421 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got response: 400 Bad Request
   2022-07-06T22:05:12,421 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: org.jboss.netty.handler.codec.http.DefaultHttpChunk@317bbf2c
   2022-07-06T22:05:12,421 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got chunk: 66B, last=false
   2022-07-06T22:05:12,421 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 messageReceived: org.jboss.netty.handler.codec.http.HttpChunk$1@47788630
   2022-07-06T22:05:12,421 DEBUG [HttpClient-Netty-Worker-9] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8100/druid/worker/v1/chat/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/pause]
 Got chunk: 0B, last=true
   2022-07-06T22:05:12,427 INFO [KafkaSupervisor-REDACT-Worker-1] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - Shutdown 
[index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] because: [An exception occured 
while waiting for task [index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc] to pause: 
[org.apache.druid.java.util.common.IAE: Received 400 Bad Request with body: 
Can't pause, task is not in a pausable state (state: [PUBLISHING])]]
   2022-07-06T22:05:12,427 DEBUG [KafkaSupervisor-REDACT-Worker-1] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 starting
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 messageReceived: DefaultHttpResponse(chunked: true)
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 Got response: 200 OK
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 messageReceived: org.jboss.netty.handler.codec.http.DefaultHttpChunk@3c164b04
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 Got chunk: 53B, last=false
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 messageReceived: org.jboss.netty.handler.codec.http.HttpChunk$1@47788630
   2022-07-06T22:05:12,439 DEBUG [HttpClient-Netty-Worker-6] 
org.apache.druid.java.util.http.client.NettyHttpClient - [POST 
http://REDACTED.host.com:8091/druid/worker/v1/task/index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc/shutdown]
 Got chunk: 0B, last=true
   2022-07-06T22:05:12,439 INFO [KafkaSupervisor-REDACT-Worker-1] 
org.apache.druid.indexing.overlord.RemoteTaskRunner - Sent shutdown message to 
worker: REDACTED.host.com:8091, status 200 OK, response: 
{"task":"index_kafka_REDACT_a5c10ee5effa63e_bhjndmoc"}
   ```
   * The Peon is able to respond with 202 but it's not clear on why it was 
stuck in "STARTING" phase even though it appeared we were ingesting data and 
doing actual work :
       * 
https://github.com/apache/druid/blob/master/indexing-service/src/main/java/org/apache/druid/indexing/seekablestream/SeekableStreamIndexTaskRunner.java
 (lines: 1770-1826)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to