gianm opened a new pull request, #12696:
URL: https://github.com/apache/druid/pull/12696

   Our servers talk to each other over HTTP. We have a low-level HTTP
   client (HttpClient) that is super-asynchronous and super-customizable
   through its handlers. It's also proven to be quite robust: we use it
   for Broker -> Historical communication over the wide variety of query
   types and workloads we support.
   
   But the low-level client has no facilities for service location or
   retries, which means we have a variety of high-level clients that
   implement these in their own ways. Some high-level clients do a better
   job than others. This patch adds a mid-level ServiceClient that makes
   it easier for high-level clients to be built correctly and harmoniously,
   and migrates some of the high-level logic to use ServiceClients.
   
   Main changes:
   
   1) Add ServiceClient in the org.apache.druid.rpc package. That package
      also contains supporting stuff like ServiceLocator and RetryPolicy
      interfaces, and a DiscoveryServiceLocator based on
      DruidNodeDiscoveryProvider.
   
   2) Add high-level OverlordClient in org.apache.druid.rpc.indexing.
   
   3) Add an indexing task client creator in TaskServiceClients. It uses
      SpecificTaskServiceLocator to find the tasks. This improves on
      ClientInfoTaskProvider by caching task locations for up to 30 seconds
      across calls, reducing load on the Overlord.
   
   4) Rework ParallelIndexSupervisorTaskClient to use a ServiceClient
      instead of extending IndexTaskClient.
   
   5) Rework RemoteTaskActionClient to use a ServiceClient instead of
      DruidLeaderClient.
   
   6) Rework LocalIntermediaryDataManager, TaskMonitor, and
      ParallelIndexSupervisorTask. As a result, MiddleManager, Peon, and
      Overlord no longer need IndexingServiceClient (which internally used
      DruidLeaderClient).
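
The location caching mentioned in item 3 can be pictured roughly as below. This is only a sketch under stated assumptions, not the code in this patch: `CachingLocator`, its `lookup` supplier (a stand-in for an asynchronous Overlord call), and the injected clock are hypothetical names. Only the 30-second upper bound comes from the description above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: remembers the result of an asynchronous location
// lookup and reuses it until it is 30 seconds old or has failed.
final class CachingLocator
{
  // "up to 30 seconds", per the PR description.
  static final long MAX_AGE_MS = 30_000;

  // Stand-in for an async call that asks the Overlord where a task runs.
  private final Supplier<CompletableFuture<String>> lookup;
  // Injected clock (milliseconds) so that expiry can be tested.
  private final LongSupplier clockMs;

  private CompletableFuture<String> cached;
  private long fetchedAtMs;

  CachingLocator(Supplier<CompletableFuture<String>> lookup, LongSupplier clockMs)
  {
    this.lookup = lookup;
    this.clockMs = clockMs;
  }

  // Returns a future rather than blocking; callers that arrive while a
  // lookup is still in flight share the same future.
  synchronized CompletableFuture<String> locate()
  {
    final long now = clockMs.getAsLong();
    final boolean failed = cached != null && cached.isCompletedExceptionally();

    if (cached == null || failed || now - fetchedAtMs >= MAX_AGE_MS) {
      cached = lookup.get();
      fetchedAtMs = now;
    }
    return cached;
  }
}
```

With a cache like this, repeated task-to-task calls inside the 30-second window trigger a single lookup, and a failed lookup is not reused.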
   
   There are some concrete benefits over the prior logic, namely:
   
   - DruidLeaderClient retries in its "go" method, but always exactly 5
     times, without sleeping between retries, and without retrying
     retryable HTTP codes like 502, 503, and 504. (It only retries on
     IOExceptions.) ServiceClient handles retries in a more reasonable way.
   
   - DruidLeaderClient's methods are all synchronous, whereas ServiceClient
     methods are asynchronous. The asynchronous API is used in one place so
     far: SpecificTaskServiceLocator, so we don't need to block a thread
     while trying to locate a task. It can be used in other places in the
     future.
   
   - HttpIndexingServiceClient does not properly handle all server errors.
     In some cases, it tries to parse a server error as a successful
     response (for example: in getTaskStatus).
   
   - IndexTaskClient currently makes an Overlord call on every task-to-task
     HTTP request, as a way to find where the target task is. ServiceClient,
     through SpecificTaskServiceLocator, caches these target locations
     for a period of time.
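
To make the retry gaps listed above concrete, here is a minimal sketch of a policy that treats 502, 503, and 504 as retryable and sleeps between attempts. It is not the RetryPolicy interface added by this patch: the class name, method names, exponential backoff, and constants are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.util.Set;

// Hypothetical sketch of a retry policy; names and backoff scheme are
// assumptions, not the API introduced in org.apache.druid.rpc.
final class SketchRetryPolicy
{
  // HTTP codes the description calls out as retryable.
  private static final Set<Integer> RETRYABLE_CODES = Set.of(502, 503, 504);

  private final int maxAttempts;
  private final long initialBackoffMs;
  private final long maxBackoffMs;

  SketchRetryPolicy(int maxAttempts, long initialBackoffMs, long maxBackoffMs)
  {
    this.maxAttempts = maxAttempts;
    this.initialBackoffMs = initialBackoffMs;
    this.maxBackoffMs = maxBackoffMs;
  }

  // True if another attempt is allowed after attemptsSoFar attempts.
  boolean canRetry(int attemptsSoFar)
  {
    return attemptsSoFar < maxAttempts;
  }

  // Retry on gateway-style server errors, not on every non-2xx response.
  boolean retryHttpResponse(int statusCode)
  {
    return RETRYABLE_CODES.contains(statusCode);
  }

  // Retry on I/O failures, which are the only thing the old logic retried.
  boolean retryThrowable(Throwable t)
  {
    return t instanceof IOException;
  }

  // Sleep time before the next attempt: doubles after each failed attempt
  // (initial, 2x, 4x, ...) and is capped at maxBackoffMs.
  long backoffMs(int attemptsSoFar)
  {
    final int shift = Math.max(0, Math.min(attemptsSoFar - 1, 20));
    return Math.min(initialBackoffMs << shift, maxBackoffMs);
  }
}
```

A caller would consult `retryHttpResponse` or `retryThrowable` after each failed attempt and, if `canRetry` allows, wait `backoffMs` before trying again.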


