We hit the same issue described in this topic. Our SLB system determines 
whether a broker is alive solely by the HTTP status code it returns.
Our Druid cluster is quite large, and a broker can take 10+ seconds from 
startup until it is ready to serve queries.

When our SLB system calls /druid/broker/v1/loadstatus during that window,
it might return
HTTP Code: 200
{
    "inventoryInitialized": false
}

Because the status code is 200, the SLB assumes that this broker can serve 
queries; hence, some of the requests fail until the broker is actually ready.
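To illustrate the gap, here is a minimal sketch of what a health check would have to do today in order to be correct: look at the body as well as the status code. The class and method names (`BrokerHealthCheck`, `isReady`) are hypothetical, and the regex match on the body is only an assumption about how a checker might inspect the response shown above. Our SLB cannot do this, since it only sees the status code.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Druid or of any SLB product.
class BrokerHealthCheck {
    // Matches the field from the loadstatus response only when it is true.
    private static final Pattern INITIALIZED =
        Pattern.compile("\"inventoryInitialized\"\\s*:\\s*true");

    // A broker counts as ready only if the call succeeded AND the body says so.
    public static boolean isReady(int statusCode, String body) {
        return statusCode == 200 && body != null && INITIALIZED.matcher(body).find();
    }

    public static void main(String[] args) {
        String starting = "{\n    \"inventoryInitialized\": false\n}";
        System.out.println(isReady(200, starting));                           // false
        System.out.println(isReady(200, "{\"inventoryInitialized\": true}")); // true
    }
}
```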

Proposal: 
Have a look at the latest code of the coordinator isLeader API:

```
    if (leading) {
      return Response.ok(response).build();
    } else {
      return Response.status(Response.Status.NOT_FOUND).entity(response).build();
    }
```

I suggest that we add another broker HTTP endpoint,
/druid/broker/v1/isReady.
If serverView is not initialized, it returns 404; otherwise it returns 200. This 
will make our SLB system work as expected.
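A rough sketch of the proposed behavior, just to make the idea concrete. It is not the actual Druid implementation: it uses the JDK's built-in `com.sun.net.httpserver.HttpServer` instead of Druid's resource classes, and `IsReadySketch`, `statusFor`, and the `BooleanSupplier` standing in for the serverView state are all hypothetical names.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the proposed broker endpoint.
class IsReadySketch {
    // Proposed mapping: 200 once the inventory is initialized, 404 before that.
    public static int statusFor(boolean inventoryInitialized) {
        return inventoryInitialized ? 200 : 404;
    }

    // Serves the proposed path; the supplier stands in for the serverView state.
    // Passing port 0 lets the OS pick a free port.
    public static HttpServer start(int port, BooleanSupplier inventoryInitialized)
            throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/druid/broker/v1/isReady", exchange -> {
            boolean ready = inventoryInitialized.getAsBoolean();
            byte[] body = ("{\"inventoryInitialized\": " + ready + "}")
                .getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().add("Content-Type", "application/json");
            exchange.sendResponseHeaders(statusFor(ready), body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        // Pretend the serverView is still loading, so the endpoint answers 404.
        HttpServer server = start(0, () -> false);
        System.out.println("isReady listening on port " + server.getAddress().getPort());
        server.stop(0);
    }
}
```

With this shape, an SLB that only checks the status code would keep the broker out of rotation until the endpoint starts returning 200.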

@pdeva if you agree with my proposal, I can open a pull request that fixes this 
issue and updates the documentation.  FYI.



[ Full content available at: 
https://github.com/apache/incubator-druid/issues/6172 ]