jihoonson commented on issue #6176: CuratorInventoryManager may not report inventory as initialized
URL: https://github.com/apache/incubator-druid/issues/6176#issuecomment-414789398

@himanshug thanks for the details.

> 1. switching coordinator to use HTTP (using HttpLoadQueuePeon) for segment assignment (load/drop)
> 2. switching broker/coordinator to use HTTP (using HttpServerInventoryView) for discovering what segments are served by queryable nodes (historicals, and peons doing indexing)
> 3. switching overlord to use HTTP for task mgmt (using HttpRemoteTaskRunner)
>
> In my comment above I was talking about trying making (1) and (2) default after a bit of testing on some more clusters that you have.
>
> looks like #6201 pertains to (3) , so let us not consider enabling (3) by default at this time until we get to the bottom of #6201 .

Thanks. It sounds good to me.

> However, after (1), (2) and (3) are done with druid clusters using HTTP . And, we remove coordinator/overlord service announcement that is always done in ZK, to support tranquility. Then , technically, it becomes possible to write extensions for discovery that don't necessarily use zookeeper and use say etcd instead. However, this is also an independent activity which will take its own time, so don't want to make it a prerequisite for trying out http or default to it as we gain more confidence with those features. And, remove zookeeper code in phases that is not needed (i.e. after say 4-6 months from a release where specific thing was made default)

This sounds good to me. What I meant by removing the remaining code that writes data to ZK is [this kind of thing](https://github.com/apache/incubator-druid/blob/master/indexing-service/src/main/java/io/druid/indexing/overlord/hrtr/HttpRemoteTaskRunnerFactory.java#L54). Could you tell us whether there are any reasons we can't remove these right now? If not, I think we should clear them out before making it the default.

It looks like these only remain for HTTP-based task allocation, so it might not be an issue for `1.` and `2.`.

For tranquility, I would say we don't have to improve tranquility to support HTTP-based overlords at this point.

As for the number of HTTP connections, I have checked those connections and they were all valid. The issue was the small number of HTTP server/client worker threads, not the large number of HTTP connections. (For the configurations that control the number of worker threads, see https://github.com/apache/incubator-druid/blob/master/server/src/main/java/io/druid/guice/http/DruidHttpClientConfig.java#L46 and https://github.com/apache/incubator-druid/blob/master/extensions-core/kafka-indexing-service/src/main/java/io/druid/indexing/kafka/supervisor/KafkaSupervisor.java#L281-L283.) So I'm not worried about a few additional HTTP connections, but we may need to change the default configurations, because the defaults should not confuse users. If there is only one additional connection per master/worker, I guess it would be fine.