Sure, I will open a new thread. On Wed, 10 Dec 2025 at 03:43, Tsz Wo Sze <[email protected]> wrote:
> Hi Sammi and others, > > Thanks for working hard on addressing the review comments! > > The branch [1] is 313 commits behind master. Let's update it (and resolve > the conflicts, if there are any)? > > BTW, let's start a new email thread and use [VOTE 2] in the subject for the > new VOTE. It is earlier to follow. > > Tsz-Wo > [1] https://github.com/apache/ozone/tree/HDDS-5713 > > On Sun, Dec 7, 2025 at 11:45 PM Sammi Chen <[email protected]> wrote: > > > Thanks Sumit for the review and comments. > > > > Sorry for a late update. With the aggregated feedback and a discussion > with > > Symious, we decided to remove SCM as the proxy of disk balancer command > to > > datanode, so CLI will send directly to datanode. Here are the JIRAs of > this > > major change, HDDS-13598 < > https://issues.apache.org/jira/browse/HDDS-13598 > > > > > , HDDS-13878 <https://issues.apache.org/jira/browse/HDDS-13878>. And > > introduced a feature flag HDDS-13497 > > <https://issues.apache.org/jira/browse/HDDS-13497>. > > > > With the above changes, all major comments are addressed. Please help to > > take another review and cast your vote for disk balancer branch merge. > > > > > > Thanks, > > Sammi > > > > On Fri, 5 Sept 2025 at 20:08, Sumit Agrawal > > <[email protected]> wrote: > > > > > Hi All, > > > > > > Few feedback from my side over design: > > > 1. SCM act as proxy for CLI command like start / stop / config changes > > but > > > it has issue due to SCM async nature (API result consistency), > > > a. Command results for successful apply or failure of change can not > be > > > shown in *expected* *time* (for a happy path will take at least 2 HB > > time, > > > i.e. 1 min) > > > b. Command rejection failure is high, based on SCM state (especially > > for > > > all DN cases) like DN state un-healthy or not available or restarted, > SCM > > > restart, ... > > > IMO, we can remove this CLI command execution via SCM but can be direct > > > from Tool CLI itself as its redundant job which CLI also can do > > > efficiently. > > > > > > 2. The DN side having a file as persistence of configuration is complex > > and > > > error prone. DB can be used instead to keep configuration as there for > > open > > > containers. > > > a. file format versioning code for upgrade -- having additional code > > > b. file consistency, recovery on failure and atomic write -- > additional > > > code is there > > > c. file path configuration > > > d. amount of code is high 2-3kloc for this which can be handled in a > > few > > > 100s. > > > So I think we should replace the file with DB for keeping > configuration > > > in an atomic way which DB has inbuilt features and *proto* provides > > > implicit *upgrade capability.* > > > > > > IMO, we need to make the above 2 changes with respect to code > > > implementation. Others can share opinions. > > > > > > Regards > > > Sumit > > > > > > On Fri, Aug 22, 2025 at 9:33 PM Tsz-Wo Nicholas Sze < > [email protected]> > > > wrote: > > > > > > > > The SlidingWindow class is not related to disk balancer, ... > > > > > > > > Ethan, thanks for pointing it out! Then, we should not remove it > here. > > > > > > > > Tsz-Wo > > > > > > > > > > > > On Thu, Aug 21, 2025 at 1:18 PM Ethan Rose <[email protected]> wrote: > > > > > > > > > Thanks for working to update this for merge. > > > > > > > > > > The SlidingWindow class is not related to disk balancer, it has > been > > > > added > > > > > to master to be used by the volume scanner. It was added in > > HDDS-12852 > > > > > <https://issues.apache.org/jira/browse/HDDS-12852> and is being > used > > > in > > > > > HDDS-13108 <https://issues.apache.org/jira/browse/HDDS-13108> / > > > > > https://github.com/apache/ozone/pull/8843. Please don't remove it. > > > > > > > > > > Please update the wiki with the justification for why there are no > > > > > incompatible changes as described in this thread. Currently it > still > > > > > says "There > > > > > should not be any incompatible changes introduced with this > feature." > > > but > > > > > provides no justification. > > > > > > > > > > Regarding the feature flag, the status communicated in this thread > > must > > > > > also be clear to end users. Until this is considered stable and > > enabled > > > > by > > > > > default, please: > > > > > - Update the user doc added in > > > https://github.com/apache/ozone/pull/8837 > > > > > to > > > > > indicate that the feature is considered experimental/beta. > > > Additionally, > > > > I > > > > > don't see a mention of the feature flag in this doc. > > > > > - Update the config description added in > > > > > https://github.com/apache/ozone/pull/8869 to indicate the same. > > > > > - Update the CLI help message to indicate the same. > > > > > - Hide the CLI so it does not appear in help messages. > > > > > > > > > > Additionally, please make all the CLI flags kebab-case. They are > > using > > > a > > > > > mix of kebab and camel case right now. Ideally the command itself > > would > > > > be > > > > > `disk-balancer` as well, but since we already have > > `containerbalancer` > > > we > > > > > might want to leave that part as is for consistency. > > > > > > > > > > Thanks, > > > > > Ethan > > > > > > > > > > > > > > > > > > -- > > > *Sumit Agrawal* | Senior Staff Engineer > > > cloudera.com <https://www.cloudera.com> > > > [image: Cloudera] <https://www.cloudera.com/> > > > [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image: > > > Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: > > Cloudera > > > on LinkedIn] <https://www.linkedin.com/company/cloudera> > > > ------------------------------ > > > > > >
