+1

On Wed, Jul 11, 2018 at 9:44 AM, Gian Merlino <g...@apache.org> wrote:
> +1 from me too!
>
> On Wed, Jul 11, 2018 at 7:28 AM Charles Allen <cral...@apache.org> wrote:
>
> > That is very helpful, thank you!
> >
> > +1 for continuing with 0.12.2-RC1
> >
> > On Tue, Jul 10, 2018 at 6:51 PM Clint Wylie <clint.wy...@imply.io> wrote:
> >
> > > Heya, sorry for the delay (and missing the sync, i'll try to get better
> > > about showing up). I've fixed a handful of coordinator bugs post 0.12.0
> > > (and not backported to 0.12.1), some of these issues go far back, some
> > > back to when segment assignment priority for different tiers of
> > > historicals was introduced, some are just some oddities on the behavior
> > > of the balancer that I am unsure when were introduced. This is the
> > > complete list of fixes that are currently in 0.12.2 afaik, with a small
> > > description (see PRs and associated issues for more details)
> > >
> > > https://github.com/apache/incubator-druid/pull/5528 fixed an issue that
> > > movement did not drop the segment from the server the segment was being
> > > moved from (this one goes waaaay back, to batch segment announcements)
> > >
> > > https://github.com/apache/incubator-druid/pull/5529 changed behavior of
> > > drop to use the balancer to choose where to drop segments from, based on
> > > behavior observed caused by the issue of 5528
> > >
> > > https://github.com/apache/incubator-druid/pull/5532 fixes an issue where
> > > primary assignment during load rule processing would assign an
> > > unavailable segment to every server with capacity until at least 1
> > > historical had the segment (and drop it from all the others if they all
> > > loaded at the same time), choking load queues from doing useful things
> > >
> > > https://github.com/apache/incubator-druid/pull/5555 fixed a way for http
> > > based coordinator to get stuck loading or dropping segments and a
> > > companion PR that fixed a lambda that wasn't friendly to older jvm
> > > versions https://github.com/apache/incubator-druid/pull/5591
> > >
> > > https://github.com/apache/incubator-druid/pull/5888 makes balancing
> > > honor a load rule max load queue depth setting to help prevent movement
> > > from starving loading
> > >
> > > https://github.com/apache/incubator-druid/pull/5928 doesn't really fix
> > > anything, just does an early return to avoid doing pointless work
> > >
> > > Additionally, there are a couple of pairs of PRs that are not currently
> > > in 0.12.2: https://github.com/druid-io/druid/pull/5927 and
> > > https://github.com/apache/incubator-druid/pull/5929 and their respective
> > > fixes which have yet to be merged, but have been performing well on our
> > > test cluster, https://github.com/apache/incubator-druid/pull/5987 and
> > > https://github.com/apache/incubator-druid/pull/5988. One of them makes
> > > balancing behave in a way more consistent with expectations by always
> > > trying to move maxSegmentsToMove and more correctly tracking what the
> > > balancer is doing, and one just adds better logging (without much extra
> > > log volume) due to frustrations I had chasing down all these other
> > > issues. Both of these were slated for 0.12.2 but were pulled out because
> > > of the issues (which the open PRs fix afaict). I would be in favor of
> > > sliding them in there, pending review of the fixes, but understand if
> > > they won't make the cut since they maybe fall a bit more on the cosmetic
> > > side of things. I'm pretty happy of the state of things on our test
> > > cluster right now, but without these 4 patches things should still be
> > > operating more correctly than they were before, just the differences
> > > being with balancing moving somewhere between 0 and max, and less useful
> > > logging making future issues (which I have no doubts still lurk) harder
> > > to diagnose.
> > >
> > > Cheers,
> > > Clint
> > >
> > > On Tue, Jul 10, 2018 at 10:30 AM, Charles Allen <cral...@apache.org>
> > > wrote:
> > >
> > > > Brought this up in the dev sync:
> > > >
> > > > I saw a lot of PRs and fixes for Coordinator segment balancing related
> > > > to some regressions that happened in 0.12.x . Is anyone able to give a
> > > > rundown of the state of coordinator segment management for the 0.12.2
> > > > RC?
> > > >
> > > > On Tue, Jul 10, 2018 at 10:26 AM Nishant Bangarwa
> > > > <nbanga...@hortonworks.com> wrote:
> > > >
> > > > > +1
> > > > >
> > > > > --
> > > > > Nishant Bangarwa
> > > > >
> > > > > Hortonworks
> > > > >
> > > > > On 7/10/18, 3:57 AM, "Jihoon Son" <jihoon...@apache.org> wrote:
> > > > >
> > > > > Related thread:
> > > > >
> > > > > https://lists.apache.org/thread.html/76755aecfddb1210fcc3f08b1d4631784a8a5eede64d22718c271841@%3Cdev.druid.apache.org%3E
> > > > > .
> > > > >
> > > > > Jihoon
> > > > >
> > > > > On Mon, Jul 9, 2018 at 3:25 PM Jihoon Son <jihoon...@apache.org>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > We have no open issues and PRs for 0.12.2
> > > > > > (https://github.com/apache/incubator-druid/milestone/27). The
> > > > > > 0.12.2 branch is already available and all PRs for 0.12.2 have
> > > > > > merged into that branch.
> > > > > >
> > > > > > Let's vote on releasing RC1. Here is my +1.
> > > > > >
> > > > > > This is a non-ASF release.
> > > > > >
> > > > > > Best,
> > > > > > Jihoon