Tim hit me with the obvious question here. "I'm assuming there are reasons, but what about a little copy paste on some of these issues that you mentioned?"
I say the obvious question because I kind of flippantly jump through some lines of code and then say "and then you just do a, b, and c, and that's the ballgame." There are a lot of reasons I can't cut and paste though, and I can open almost any class and annotate a similar set of issues. Without diving into all the reasons: I would have already if it were that simple. I can certainly help address some things and lean on existing code and efforts, but at the moment the best I can do is work on things as needed by outside pressures, items, or demands. If I see others improving or redoing any of this core cloud code, though, I'd certainly lend a hand on those efforts. Outside of making changes based on external needs, I just got out from under the solo kamikaze, and I can't dive back in unless it's on contained items and goals that satisfy someone's needs, or by joining an existing multi-crew effort or goal. If I had to randomly pull threads, repeat efforts yet one more time, and funnel that work through a gauntlet of uninvolved, well-intentioned developers, neither I nor anyone else would be pleased.

Mark

On Fri, Oct 1, 2021 at 2:17 PM Mark Miller <[email protected]> wrote:

> That covers a lot of the current silliness you will see, pretty simply, as
> most of it comes down to removing silly stuff, but you can find some related
> wildness in ZkController#register.
>
> // check replica's existence in clusterstate first
> zkStateReader.waitForState(collection, 100, TimeUnit.MILLISECONDS,
>     (collectionState) -> getReplicaOrNull(collectionState, shardId, coreZkNodeName) != null);
>
> A 100ms wait is no biggie, and at least it uses waitForState, but we should
> not need to pull our own clusterstate from ZK here or care about waiting on
> it here - if there is an item of data we need, it should have been passed
> into the core create call.
>
> Next we get the shard terms object so we can later create our shard terms
> entry (LIR).
>
> It's slow and bug-inducingly complicated to have each replica do this here,
> fighting each other to add an initial entry. You can create the initial
> shard terms for a replica when you create or update the clusterstate (term
> {replicaname=0}), and you can do it in a single ZK call.
>
> // in this case, we want to wait for the leader as long as the leader might
> // wait for a vote, at least - but also long enough that a large cluster has
> // time to get its act together
> String leaderUrl = getLeader(cloudDesc, leaderVoteWait + 600000);
>
> Now we do getLeader, a polling operation that should not be one, and we
> potentially wait forever for it. As I mention in the notes on leader sync,
> there should be little wait here at most. It's also one of a variety of
> places where, even if you remove the polling, it hurts to wait at all. I'm a
> fan of thousands of cores per machine not being an issue, and in many of
> these cases you can't achieve that with 1000 threads hanging out all over,
> even if they are not blind polling. This is one of the simpler cases where
> that can be addressed. I break this method into two and enhance
> ZkStateReader's waitForState functionality so that you can pass a runnable
> to execute when ZkStateReader is notified and the given predicate matches.
> So there is no need for thousands, or hundreds, or dozens of slackers here.
> Do a couple of base register items, call waitForState with a runnable that
> runs the second part of the logic when a leader shows up in ZkStateReader,
> and go away. We can't eat up threads like this in all these cases.
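> Roughly, the shape of that callback-style wait is something like the sketch
> below - illustrative only; the class and method names here (including
> getLeaderOrNull and finishRegister) are stand-ins, not the actual
> ZkStateReader API:
>
> import java.util.Iterator;
> import java.util.concurrent.ConcurrentLinkedQueue;
> import java.util.function.Consumer;
> import java.util.function.Predicate;
>
> final class StateCallbackRegistry<S> {
>
>   private static final class Entry<S> {
>     final Predicate<S> predicate;
>     final Consumer<S> callback;
>     Entry(Predicate<S> predicate, Consumer<S> callback) {
>       this.predicate = predicate;
>       this.callback = callback;
>     }
>   }
>
>   private final ConcurrentLinkedQueue<Entry<S>> entries = new ConcurrentLinkedQueue<>();
>
>   // Register a (predicate, callback) pair instead of parking a thread.
>   // A real implementation would also test the latest known state here so a
>   // match that already happened is not missed.
>   void onStateMatch(Predicate<S> predicate, Consumer<S> callback) {
>     entries.add(new Entry<>(predicate, callback));
>   }
>
>   // Called by the single watcher thread whenever a new state is observed:
>   // matching callbacks fire once and are removed, so nothing sits in a wait loop.
>   void stateChanged(S newState) {
>     Iterator<Entry<S>> it = entries.iterator();
>     while (it.hasNext()) {
>       Entry<S> e = it.next();
>       if (e.predicate.test(newState)) {
>         it.remove();
>         e.callback.accept(newState);
>       }
>     }
>   }
> }
>
> With something of that shape in place, register() does its base items, calls
> registry.onStateMatch(state -> getLeaderOrNull(state, shardId) != null,
> state -> finishRegister(state)), and returns - instead of a thread per core
> blocking in getLeader.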
> Now you can also easily shut down and reload cores and do the various things
> that are currently harassed by waits like this slacking off in these wait
> loops.
>
> The rest is just a continuation of this game when it comes to leader
> selection and finalization, collection creation, and replica spin-up. You
> make ZkStateReader actually efficient. You make multiple and lazy
> collections work appropriately, and not be super inefficient.
>
> You make leader election a sensible bit of code. As part of ZkStateReader
> sensibility you remove the need for a billion client-based watches in ZK,
> and in many cases the need for a thousand watcher implementations and
> instances.
>
> You let the components dictate how often requests go to services and
> coalesce dependent code requests, instead of letting the dependents dictate
> service request cadence and size, and you do a lot less silliness like
> serializing large JSON structures for bit-sized data updates. Then scaling
> to tens and even hundreds of thousands of replicas and collections is doable
> even on single machines and a handful of Solr instances, to say nothing of
> pulling in more hardware. Everything required is cheap cheap cheap. It's the
> mountain of unrequired that is expensive expensive expensive.
>
> On Fri, Oct 1, 2021 at 12:47 PM Mark Miller <[email protected]> wrote:
>
>> Ignoring lots of polling, inefficiencies, early defensive raw sleeps,
>> various races and bugs, and a laundry list of items involved in making
>> leader processes good enough to enter a collection creation contest, here
>> is a more practical small set of notes, off the top of my head, from a
>> quick inspection of what is currently just in-your-face non-sensible.
>>
>> https://gist.github.com/markrmiller/233119ba84ce39d39960de0f35e79fc9
>
> --
> - Mark
>
> http://about.me/markrmiller

--
- Mark

http://about.me/markrmiller
