I believe that today the (experimental) HTTP-based load queues already parallelize segment downloads. Adding similar functionality for the ZK-based load queues would definitely be useful, though, since nobody currently seems to be actively driving the work to enable HTTP-based load queues by default.

On Wed, Jan 30, 2019 at 7:20 PM Samarth Jain <sama...@apache.org> wrote:

> We noticed that it takes a long time for the historicals to download
> segments from deep storage (in our case S3). Looking closer at the code in
> ZKCoordinator, I noticed that the segment download is happening in a single
> threaded fashion. This download happens in the SingleThreadedExecutor
> service used by the PathChildrenCache. Looking at the commentary on
> https://github.com/apache/incubator-druid/issues/4421 and
> https://github.com/apache/incubator-druid/issues/3202, the executor service
> used in PathChildrenCache can only be single threaded.
>
> My proposal is to use a multi threaded ExecutorService that will be used to
> take action on the events to perform the download. The role of single
> threaded ExecutorService in PathChildrenCache will be simply to delegate
> the download task to this new executor service.
>
> Does that sound feasible? IMO, if this happens to be functionally correct,
> it should help significantly boost up the time it is taking historicals to
> download all the assigned segments.
>
> I would be more than happy to contribute this enhancement to the community.
>
> Thanks,
> Samarth
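
For reference, the delegation described in the quoted proposal could be sketched roughly as follows. This is only an illustration of the hand-off pattern using plain java.util.concurrent, not Druid's or Curator's actual code: the class name DelegatingLoadSketch, the methods loadAll and downloadSegment, and the simulated 100 ms download are all hypothetical stand-ins. The single-threaded executor plays the role of the PathChildrenCache event thread, and it does nothing except pass each download to a separate fixed-size pool.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class DelegatingLoadSketch {

    // Hypothetical stand-in for pulling one segment from deep storage (e.g. S3).
    static void downloadSegment(String segmentId) throws InterruptedException {
        Thread.sleep(100);
    }

    // Feeds every segment id through a single "event" thread, which only
    // delegates the actual download to a pool of numLoadingThreads workers.
    static Set<String> loadAll(List<String> segmentIds, int numLoadingThreads) {
        // Stands in for the single-threaded executor behind the ZK event callbacks.
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        // The new multi-threaded executor that performs the downloads.
        ExecutorService loadingPool = Executors.newFixedThreadPool(numLoadingThreads);
        Set<String> loaded = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(segmentIds.size());
        try {
            for (String id : segmentIds) {
                // The event thread returns immediately after the hand-off,
                // so it is never blocked on a slow download.
                eventThread.execute(() -> loadingPool.execute(() -> {
                    try {
                        downloadSegment(id);
                        loaded.add(id);
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        done.countDown();
                    }
                }));
            }
            if (!done.await(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("timed out waiting for downloads");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for downloads", e);
        } finally {
            eventThread.shutdown();
            loadingPool.shutdown();
        }
        return loaded;
    }

    public static void main(String[] args) {
        List<String> ids = List.of("seg-1", "seg-2", "seg-3", "seg-4");
        long start = System.nanoTime();
        Set<String> loaded = loadAll(ids, 4);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println("loaded " + loaded.size() + " segments in about " + elapsedMs + " ms");
    }
}
```

With four workers, the four simulated downloads overlap instead of running back to back. Note that the sketch ignores ordering between requests that touch the same segment (for example, a load followed by a drop), which a real change would presumably need to handle.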