That is very helpful, thank you! +1 for continuing with 0.12.2-RC1

On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie <clint.wy...@imply.io> wrote:

> Heya, sorry for the delay (and missing the sync, I'll try to get better
> about showing up). I've fixed a handful of coordinator bugs post 0.12.0
> (and not backported to 0.12.1). Some of these issues go far back, some back
> to when segment assignment priority for different tiers of historicals was
> introduced, and some are just oddities in the behavior of the balancer
> that I am unsure when were introduced. This is the complete list of fixes
> that are currently in 0.12.2 afaik, with a small description (see PRs and
> associated issues for more details):
>
> https://github.com/apache/incubator-druid/pull/5528 fixed an issue where
> movement did not drop the segment from the server the segment was being
> moved from (this one goes waaaay back, to batch segment announcements)
>
> https://github.com/apache/incubator-druid/pull/5529 changed the behavior of
> drop to use the balancer to choose where to drop segments from, based on
> behavior observed as a result of the issue fixed in 5528
>
> https://github.com/apache/incubator-druid/pull/5532 fixes an issue where
> primary assignment during load rule processing would assign an unavailable
> segment to every server with capacity until at least 1 historical had the
> segment (and drop it from all the others if they all loaded at the same
> time), choking load queues from doing useful things
>
> https://github.com/apache/incubator-druid/pull/5555 fixed a way for the
> HTTP-based coordinator to get stuck loading or dropping segments, with a
> companion PR, https://github.com/apache/incubator-druid/pull/5591, that
> fixed a lambda that wasn't friendly to older JVM versions
>
> https://github.com/apache/incubator-druid/pull/5888 makes balancing honor
> a load rule max load queue depth setting to help prevent movement from
> starving loading
>
> https://github.com/apache/incubator-druid/pull/5928 doesn't really fix
> anything, just does an early return to avoid doing pointless work
>
> Additionally, there are a couple of pairs of PRs that are not currently in
> 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> https://github.com/apache/incubator-druid/pull/5929, and their respective
> fixes, which have yet to be merged but have been performing well on our
> test cluster: https://github.com/apache/incubator-druid/pull/5987 and
> https://github.com/apache/incubator-druid/pull/5988. One of them makes
> balancing behave in a way more consistent with expectations by always
> trying to move maxSegmentsToMove and more correctly tracking what the
> balancer is doing, and one just adds better logging (without much extra log
> volume) due to frustrations I had chasing down all these other issues. Both
> of these were slated for 0.12.2 but were pulled out because of the issues
> (which the open PRs fix afaict). I would be in favor of sliding them in
> there, pending review of the fixes, but understand if they won't make the
> cut since they maybe fall a bit more on the cosmetic side of things. I'm
> pretty happy with the state of things on our test cluster right now, but
> without these 4 patches things should still be operating more correctly
> than they were before, the differences being that balancing moves
> somewhere between 0 and max, and that less useful logging makes future
> issues (which I have no doubt still lurk) harder to diagnose.
>
> Cheers,
> Clint
>
> On Tue, Jul 10, 2018 at 10:30 AM, Charles Allen <cral...@apache.org>
> wrote:
>
> > Brought this up in the dev sync:
> >
> > I saw a lot of PRs and fixes for Coordinator segment balancing related to
> > some regressions that happened in 0.12.x. Is anyone able to give a rundown
> > of the state of coordinator segment management for the 0.12.2 RC?
> >
> > On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa
> > <nbanga...@hortonworks.com> wrote:
> >
> > > +1
> > >
> > > --
> > > Nishant Bangarwa
> > >
> > > Hortonworks
> > >
> > > On 7/10/18, 3:57 AM, "Jihoon Son" <jihoon...@apache.org> wrote:
> > >
> > >     Related thread:
> > >
> > >     https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
> > >
> > >     Jihoon
> > >
> > >     On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <jihoon...@apache.org>
> > >     wrote:
> > >
> > >     > Hi all,
> > >     >
> > >     > We have no open issues or PRs for 0.12.2 (
> > >     > https://github.com/apache/incubator-druid/milestone/27). The 0.12.2
> > >     > branch is already available and all PRs for 0.12.2 have been merged
> > >     > into that branch.
> > >     >
> > >     > Let's vote on releasing RC1. Here is my +1.
> > >     >
> > >     > This is a non-ASF release.
> > >     >
> > >     > Best,
> > >     > Jihoon