This is an automated email from the ASF dual-hosted git repository.

jtuglu1 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/master by this push:
     new 120be6e11c5 refactor: adjust diskNormalized strategy to scale cost 
exponentially with disk utilization (#19422)
120be6e11c5 is described below

commit 120be6e11c5f5fdf072cfe47fd001d007c624942
Author: jtuglu1 <[email protected]>
AuthorDate: Mon Jun 1 13:40:31 2026 -0700

    refactor: adjust diskNormalized strategy to scale cost exponentially with 
disk utilization (#19422)
    
    The existing linear penalty factor is still ineffective in large-skew
    scenarios, where the CostBalancerStrategy cost can force a move/load even
    with the utilization-based penalty applied. This change divides the cost by
    the server's projected disk headroom, so the penalty grows steeply
    (hyperbolically rather than linearly) as utilization approaches 100% and
    near-full historicals are strongly penalized. This is particularly helpful
    when segment sizes on the cluster vary widely.
    
    This also marks the diskNormalized strategy as ready for production use.
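As a rough illustration of the headroom-based penalty described above, the following standalone Java sketch divides a raw cost by the projected free-disk fraction. The class and method names (`HeadroomCostSketch`, `normalizedCost`) are hypothetical and are not part of Druid; only the formula and the `EPSILON` floor of `1e-6` are taken from this commit.

```java
// Illustrative sketch only; class and method names are hypothetical, not Druid APIs.
final class HeadroomCostSketch {
    // Floor on the divisor, matching the EPSILON value introduced by the commit.
    static final double EPSILON = 1e-6;

    /**
     * Divides a raw placement cost by the projected free-disk fraction.
     * alreadyProjected is true when the server already holds (or is loading)
     * the segment, in which case its size is not counted a second time.
     */
    public static double normalizedCost(double rawCost, long sizeUsed, long segmentSize,
                                        long maxSize, boolean alreadyProjected) {
        if (maxSize <= 0) {
            return rawCost; // no capacity reported: return the raw cost to avoid NaN
        }
        long projectedUsed = sizeUsed + (alreadyProjected ? 0L : segmentSize);
        double usageRatio = (double) projectedUsed / maxSize;
        double headroom = Math.max(EPSILON, 1.0 - usageRatio);
        return rawCost / headroom;
    }

    public static void main(String[] args) {
        long maxSize = 10_000_000L;
        // Same raw cost at increasing fullness: the adjusted cost rises sharply.
        for (long used : new long[] {5_000_000L, 9_000_000L, 9_900_000L}) {
            System.out.println(used + " bytes used -> "
                + normalizedCost(10.0, used, 0L, maxSize, false));
        }
    }
}
```

Because the divisor shrinks toward zero as a server fills, the same raw cost is penalized far more at 99% utilization than at 50%. The growth follows 1 / headroom, which is hyperbolic rather than linear.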
---
 docs/configuration/index.md                        |   4 +-
 docs/design/coordinator.md                         |   2 +-
 .../DiskNormalizedCostBalancerStrategy.java        |  47 +++---
 .../DiskNormalizedCostBalancerStrategyConfig.java  |  10 +-
 .../DiskNormalizedCostBalancerStrategyTest.java    | 160 +++++++++++++++++++--
 5 files changed, 186 insertions(+), 37 deletions(-)

diff --git a/docs/configuration/index.md b/docs/configuration/index.md
index 12a7cd387dc..c064908ddd6 100644
--- a/docs/configuration/index.md
+++ b/docs/configuration/index.md
@@ -712,8 +712,8 @@ These Coordinator static configurations can be defined in 
the `coordinator/runti
 |`druid.coordinator.period`|The run period for the Coordinator. The 
Coordinator operates by maintaining the current state of the world in memory 
and periodically looking at the set of "used" segments and segments being 
served to make decisions about whether any changes need to be made to the data 
topology. This property sets the delay between each of these runs.|`PT60S`|
 |`druid.coordinator.startDelay`|The operation of the Coordinator works on the 
assumption that it has an up-to-date view of the state of the world when it 
runs, the current ZooKeeper interaction code, however, is written in a way that 
doesn’t allow the Coordinator to know for a fact that it’s done loading the 
current state of the world. This delay is a hack to give it enough time to 
believe that it has all the data.|`PT300S`|
 |`druid.coordinator.load.timeout`|The timeout duration for when the 
Coordinator assigns a segment to a Historical service.|`PT15M`|
-|`druid.coordinator.balancer.strategy`|The [balancing 
strategy](../design/coordinator.md#balancing-segments-in-a-tier) used by the 
Coordinator to distribute segments among the Historical servers in a tier. The 
`cost` strategy distributes segments by minimizing a cost function, 
`diskNormalized` weights these costs with the disk usage ratios of the servers 
and `random` distributes segments randomly.|`cost`|
-|`druid.coordinator.balancer.diskNormalized.moveCostSavingsThreshold`|Only 
used when `druid.coordinator.balancer.strategy` is `diskNormalized`. Minimum 
fractional cost reduction required before a segment is moved off a server that 
already holds it. A value of `0.05` requires the destination to be at least 5% 
cheaper than the source, which prevents oscillation between servers with 
similar disk utilization. Must be in `[0.0, 1.0)`; `0.0` disables the 
anti-oscillation discount.|`0.05`|
+|`druid.coordinator.balancer.strategy`|The [balancing 
strategy](../design/coordinator.md#balancing-segments-in-a-tier) used by the 
Coordinator to distribute segments among the Historical servers in a tier. The 
`cost` strategy distributes segments by minimizing a cost function, 
`diskNormalized` divides these costs by the projected available disk headroom 
of each server and `random` distributes segments randomly.|`cost`|
+|`druid.coordinator.balancer.diskNormalized.moveCostSavingsThreshold`|Only 
used when `druid.coordinator.balancer.strategy` is `diskNormalized`. Minimum 
fractional cost reduction required before a segment is moved off a server that 
already holds it. A value of `0.05` requires the destination to be at least 5% 
cheaper than the source, which prevents oscillation between servers with 
similar projected headroom. Must be in `[0.0, 1.0)`; `0.0` disables the 
anti-oscillation discount.|`0.05`|
 |`druid.coordinator.loadqueuepeon.http.repeatDelay`|The start and repeat delay 
(in milliseconds) for the load queue peon, which manages the load/drop queue of 
segments for any server.|1 minute|
 |`druid.coordinator.loadqueuepeon.http.batchSize`|Number of segment load/drop 
requests to batch in one HTTP request. Note that it must be smaller than or 
equal to the `druid.segmentCache.numLoadingThreads` config on Historical 
service. If this value is not configured, the coordinator uses the value of the 
`numLoadingThreads` for the respective server. | 
`druid.segmentCache.numLoadingThreads` |
 |`druid.coordinator.asOverlord.enabled`|Boolean value for whether this 
Coordinator service should act like an Overlord as well. This configuration 
allows users to simplify a Druid cluster by not having to deploy any standalone 
Overlord services. If set to true, then Overlord console is available at 
`http://coordinator-host:port/console.html` and be sure to set 
`druid.coordinator.asOverlord.overlordService` also.|false|
diff --git a/docs/design/coordinator.md b/docs/design/coordinator.md
index e63a5b4c3d5..f2d735000cf 100644
--- a/docs/design/coordinator.md
+++ b/docs/design/coordinator.md
@@ -88,7 +88,7 @@ But in a tier with several Historicals (or a low replication 
factor), segment re
 Thus, the Coordinator constantly monitors the set of segments present on each 
Historical in a tier and employs one of the following strategies to identify 
segments that may be moved from one Historical to another to retain balance.
 
 - `cost` (default): For a given segment in a tier, this strategy picks the 
server with the minimum "cost" of placing that segment. The cost is a function 
of the data interval of the segment and the data intervals of all the segments 
already present on the candidate server. In essence, this strategy tries to 
avoid placing segments with adjacent or overlapping data intervals on the same 
server. This is based on the premise that adjacent-interval segments are more 
likely to be used together [...]
-- `diskNormalized`: A derivative of the `cost` strategy that multiplies the 
cost of placing a segment on a server by the server's disk usage ratio 
(`diskUsed / maxSize`). This penalizes fuller servers and drives disk 
utilization to equalize across the tier, which is useful when historicals 
within a tier hold segments of widely varying sizes. To prevent oscillation 
when servers have similar utilization, a segment that is already placed on a 
server receives a cost discount; a move only fir [...]
+- `diskNormalized`: A derivative of the `cost` strategy that divides the cost 
of placing a segment on a server by the server's projected available disk 
headroom. The projected usage ratio is `(diskUsed + 
segmentSizeIfNotAlreadyProjected) / maxSize`, so the disk-adjusted cost is 
`cost / max(EPSILON, 1 - projectedUsageRatio)`. This strongly penalizes servers 
that would be nearly full after placement and drives disk utilization to 
equalize across the tier, which is useful when historicals w [...]
 - `random`: Distributes segments randomly across servers. This is an 
experimental strategy and is not recommended for a production cluster.
 
 All of the above strategies prioritize moving segments from the Historical 
with the least available disk space.
diff --git 
a/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java
 
b/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java
index e8b1b902dde..53180fa9e3e 100644
--- 
a/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java
+++ 
b/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategy.java
@@ -26,21 +26,22 @@ import org.apache.druid.timeline.DataSegment;
 
 /**
  * A {@link BalancerStrategy} which normalizes the cost of placing a segment on a
- * server as calculated by {@link CostBalancerStrategy} by multiplying it by the
- * server's disk usage ratio.
+ * server as calculated by {@link CostBalancerStrategy} by dividing by the
+ * server's projected available disk headroom.
  * <pre>
- * normalizedCost = cost * usageRatio
- *     where usageRatio = diskUsed / totalDiskSpace
+ * normalizedCost = cost / max(EPSILON, 1 - projectedUsageRatio)
+ *     where projectedUsageRatio = (diskUsed + segmentSizeIfNotAlreadyProjected) / totalDiskSpace
  * </pre>
- * This penalizes servers that are more full, driving disk utilization to equalize
- * across the tier. When all servers have equal disk usage, the behavior is identical
- * to {@link CostBalancerStrategy}. When historicals have different disk capacities,
- * this naturally accounts for both fill level and total capacity.
+ * The denominator approaches zero as a server approaches full, so the
+ * normalized cost grows sharply and disk fullness dominates the placement
+ * decision for nearly full servers, regardless of asymmetries in the locality
+ * cost. {@link #EPSILON} is a small numerical floor on the divisor to guard
+ * against division by zero (or by negative values during in-flight loads).
  * <p>
- * To prevent oscillation when servers have similar utilization, any server that
+ * To prevent oscillation when servers have similar headroom, any server that
  * is already projected to hold the segment (the source on a move, or a currently
  * serving node on a drop) receives a cost discount equal to
- * {@link #DEFAULT_MOVE_COST_SAVINGS_THRESHOLD}. A move therefore fires only when
+ * {@link DiskNormalizedCostBalancerStrategyConfig#DEFAULT_MOVE_COST_SAVINGS_THRESHOLD}. A move therefore fires only when
  * the destination saves at least this fraction of the source's cost. The default
  * is configurable via
  * {@code druid.coordinator.balancer.diskNormalized.moveCostSavingsThreshold}.
@@ -48,18 +49,17 @@ import org.apache.druid.timeline.DataSegment;
 public class DiskNormalizedCostBalancerStrategy extends CostBalancerStrategy
 {
   /**
-   * Default minimum fractional cost reduction required before a segment will
-   * be moved off a server that is already projected to hold it. A value of
-   * {@code 0.05} means the destination must be at least 5% cheaper than the
-   * source for the move to happen.
+   * Numerical floor on the headroom divisor to prevent division by zero or by
+   * negative values when {@code usageRatio >= 1.0} (possible for over-allocated
+   * servers or during in-flight loads).
    */
-  static final double DEFAULT_MOVE_COST_SAVINGS_THRESHOLD = 0.05;
+  static final double EPSILON = 1e-6;
 
   private final double sourceCostMultiplier;
 
   public DiskNormalizedCostBalancerStrategy(ListeningExecutorService exec)
   {
-    this(exec, DEFAULT_MOVE_COST_SAVINGS_THRESHOLD);
+    this(exec, DiskNormalizedCostBalancerStrategyConfig.DEFAULT_MOVE_COST_SAVINGS_THRESHOLD);
   }
 
   public DiskNormalizedCostBalancerStrategy(ListeningExecutorService exec, double moveCostSavingsThreshold)
@@ -85,19 +85,20 @@ public class DiskNormalizedCostBalancerStrategy extends CostBalancerStrategy
       return cost;
     }
 
-    // Guard against NaN propagation in the cost comparator if a server
-    // somehow reports a non-positive maxSize. Such a server cannot hold
-    // anything and will be rejected by canLoadSegment, so returning the
-    // raw cost is safe.
+    // A server with non-positive maxSize cannot hold anything and will be
+    // rejected by canLoadSegment; return the raw cost to avoid NaN propagation.
     final long maxSize = server.getMaxSize();
     if (maxSize <= 0) {
       return cost;
     }
 
-    double usageRatio = (double) server.getSizeUsed() / maxSize;
-    double normalizedCost = cost * usageRatio;
+    final boolean alreadyProjected = server.isProjectedSegment(proposalSegment);
+    final long projectedSizeUsed = server.getSizeUsed() + (alreadyProjected ? 0 : proposalSegment.getSize());
+    final double usageRatio = (double) projectedSizeUsed / maxSize;
+    final double headroom = Math.max(EPSILON, 1.0 - usageRatio);
+    double normalizedCost = cost / headroom;
 
-    if (server.isProjectedSegment(proposalSegment)) {
+    if (alreadyProjected) {
       normalizedCost *= sourceCostMultiplier;
     }
 
diff --git 
a/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyConfig.java
 
b/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyConfig.java
index 95680e7e7dd..1219eddc8ab 100644
--- 
a/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyConfig.java
+++ 
b/server/src/main/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyConfig.java
@@ -34,6 +34,14 @@ import javax.annotation.Nullable;
  */
 public class DiskNormalizedCostBalancerStrategyConfig
 {
+  /**
+   * Default minimum fractional cost reduction required before a segment will
+   * be moved off a server that is already projected to hold it. A value of
+   * {@code 0.05} means the destination must be at least 5% cheaper than the
+   * source for the move to happen.
+   */
+  static final double DEFAULT_MOVE_COST_SAVINGS_THRESHOLD = 0.05;
+
   /**
    * Minimum fractional cost reduction required to move a segment off a server
    * that is already projected to hold it. For example, a value of {@code 
0.05} means the
@@ -53,7 +61,7 @@ public class DiskNormalizedCostBalancerStrategyConfig
       @JsonProperty("moveCostSavingsThreshold") @Nullable Double 
moveCostSavingsThreshold
   )
   {
-    this.moveCostSavingsThreshold = Configs.valueOrDefault(moveCostSavingsThreshold, DiskNormalizedCostBalancerStrategy.DEFAULT_MOVE_COST_SAVINGS_THRESHOLD);
+    this.moveCostSavingsThreshold = Configs.valueOrDefault(moveCostSavingsThreshold, DEFAULT_MOVE_COST_SAVINGS_THRESHOLD);
 
     Preconditions.checkArgument(
         this.moveCostSavingsThreshold >= 0.0 && this.moveCostSavingsThreshold 
< 1.0,
diff --git 
a/server/src/test/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyTest.java
 
b/server/src/test/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyTest.java
index f57199c1f48..d6c3ad44e13 100644
--- 
a/server/src/test/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyTest.java
+++ 
b/server/src/test/java/org/apache/druid/server/coordinator/balancer/DiskNormalizedCostBalancerStrategyTest.java
@@ -120,6 +120,11 @@ public class DiskNormalizedCostBalancerStrategyTest
   }
 
   public static DataSegment getSegment(int index, String dataSource, Interval 
interval)
+  {
+    return getSegment(index, dataSource, interval, index * 100L);
+  }
+
+  public static DataSegment getSegment(int index, String dataSource, Interval 
interval, long size)
   {
     // Not using EasyMock as it hampers the performance of multithreads.
     DataSegment segment = new DataSegment(
@@ -131,7 +136,7 @@ public class DiskNormalizedCostBalancerStrategyTest
         new ArrayList<>(),
         null,
         0,
-        index * 100L
+        size
     );
     return segment;
   }
@@ -180,6 +185,16 @@ public class DiskNormalizedCostBalancerStrategyTest
     List<DataSegment> segments = IntStream.range(baseIndex, baseIndex + 
segmentCount)
         .mapToObj(DiskNormalizedCostBalancerStrategyTest::getSegment)
         .collect(Collectors.toList());
+    return buildServer(name, maxSize, sizeUsed, segments);
+  }
+
+  private static ServerHolder buildServer(
+      String name,
+      long maxSize,
+      long sizeUsed,
+      List<DataSegment> segments
+  )
+  {
     ImmutableDruidDataSource ds =
         new ImmutableDruidDataSource("DUMMY", Collections.emptyMap(), 
segments);
     return new ServerHolder(
@@ -228,8 +243,8 @@ public class DiskNormalizedCostBalancerStrategyTest
         newCostStrategy().findServersToLoadSegment(proposal, 
servers).next().getServer().getName()
     );
 
-    // DiskNormalized: A = 10 * 0.9 = 9.0, B = 60 * 0.1 = 6.0.
-    // The emptier server must win.
+    // DiskNormalized uses projected headroom: A ~= 10K / 0.09, B ~= 60K / 0.89.
+    // The emptier server wins despite the higher raw cost.
     Assert.assertEquals(
         "DiskNormalizedCostBalancerStrategy must prefer the emptier server",
         "B",
@@ -263,8 +278,8 @@ public class DiskNormalizedCostBalancerStrategyTest
     );
 
     // DiskNormalizedCostBalancerStrategy (default 5% threshold):
-    //   A: 38K * 0.80 * 0.95 = 28.88K
-    //   B: 40K * 0.20        =  8.00K
+    //   A: 38K / 0.20 * 0.95 = 180.5K
+    //   B: 40K / 0.80        =  50.0K
     // B wins decisively and the segment moves, reducing the skew.
     final ServerHolder diskNormalizedResult =
         
newDiskNormalizedStrategy().findDestinationServerToMoveSegment(segmentToMove, 
heavy, servers);
@@ -280,7 +295,7 @@ public class DiskNormalizedCostBalancerStrategyTest
   {
     final long maxSize = 10_000_000L;
     final ServerHolder source = buildServer("SOURCE", maxSize, 8_000_000L, 0, 
20);
-    final ServerHolder dest = buildServer("DEST", maxSize, 7_400_000L, 100, 
20);
+    final ServerHolder dest = buildServer("DEST", maxSize, 7_830_000L, 100, 
20);
 
     final DataSegment segmentToMove = getSegment(0);
     final List<ServerHolder> servers = new ArrayList<>();
@@ -293,17 +308,142 @@ public class DiskNormalizedCostBalancerStrategyTest
         
newDiskNormalizedStrategy().findDestinationServerToMoveSegment(segmentToMove, 
source, servers)
     );
 
-    // threshold=0 removes the discount; the same marginal difference now
-    // triggers the move. This proves the threshold is what blocks it above.
-    final BalancerStrategy noDiscount = new DiskNormalizedCostBalancerStrategy(
+    // Lowering the threshold to 1% reduces the discount; the same marginal
+    // difference now triggers the move. This proves the threshold is what
+    // blocks it above.
+    final BalancerStrategy onePercentThreshold = new 
DiskNormalizedCostBalancerStrategy(
         MoreExecutors.listeningDecorator(Execs.multiThreaded(1, 
"DiskNormalizedCostBalancerStrategyTest-%d")),
         0.01
     );
-    final ServerHolder movedTo = 
noDiscount.findDestinationServerToMoveSegment(segmentToMove, source, servers);
+    final ServerHolder movedTo = 
onePercentThreshold.findDestinationServerToMoveSegment(segmentToMove, source, 
servers);
     Assert.assertNotNull("With threshold=0.01, the marginal move should fire", 
movedTo);
     Assert.assertEquals("DEST", movedTo.getServer().getName());
   }
 
+  @Test
+  public void testNearFullServerIsNotChosenForNewSegmentLoad()
+  {
+    final long maxSize = 10_000_000L;
+    // A: 95% full, 5 same-DS DAY segments -> raw cost = 10 * K (low, few 
co-located segs)
+    final ServerHolder nearFull = buildServer("A", maxSize, 9_500_000L, 0, 5);
+    // B: 70% full, 20 same-DS DAY segments -> raw cost = 40 * K (higher, more 
co-located)
+    final ServerHolder partial = buildServer("B", maxSize, 7_000_000L, 100, 
20);
+
+    final DataSegment newSegment = getSegment(1000);
+    final List<ServerHolder> servers = new ArrayList<>();
+    servers.add(nearFull);
+    servers.add(partial);
+
+    // CostBalancerStrategy picks A because raw cost 10K < 40K.
+    Assert.assertEquals(
+        "Pure CostBalancerStrategy must pick the near-full server (lower raw 
cost)",
+        "A",
+        newCostStrategy().findServersToLoadSegment(newSegment, 
servers).next().getServer().getName()
+    );
+
+    // DiskNormalized uses projected headroom: A_norm = 10K / 0.04 = 250K,
+    // B_norm = 40K / 0.29 = 138K -> B wins.
+    Assert.assertEquals(
+        "DiskNormalized must prefer the emptier server despite its higher raw 
cost",
+        "B",
+        newDiskNormalizedStrategy().findServersToLoadSegment(newSegment, 
servers).next().getServer().getName()
+    );
+  }
+
+  @Test
+  public void testProjectedSegmentSizeIsUsedForNewSegmentLoad()
+  {
+    final long maxSize = 1_000_000L;
+    // A has the lower raw cost, but the 250 KB proposal would leave only 5% 
headroom.
+    final ServerHolder almostFullAfterLoad = buildServer("A", maxSize, 
700_000L, 0, 5);
+    // B has more co-located segments, but keeps 25% headroom after the 
proposal.
+    final ServerHolder moreHeadroomAfterLoad = buildServer("B", maxSize, 
500_000L, 100, 20);
+
+    final DataSegment largeSegment = getSegment(1000, "DUMMY", DAY, 250_000L);
+    final List<ServerHolder> servers = new ArrayList<>();
+    servers.add(almostFullAfterLoad);
+    servers.add(moreHeadroomAfterLoad);
+
+    // CostBalancerStrategy picks A because raw cost 10K < 40K.
+    Assert.assertEquals(
+        "Pure CostBalancerStrategy must pick the lower raw-cost server",
+        "A",
+        newCostStrategy().findServersToLoadSegment(largeSegment, 
servers).next().getServer().getName()
+    );
+
+    // If diskNormalized used current headroom, A would also win:
+    //   A_current = 10K / 0.30, B_current = 40K / 0.50.
+    // With projected headroom, B wins:
+    //   A_projected = 10K / 0.05, B_projected = 40K / 0.25.
+    Assert.assertEquals(
+        "DiskNormalized must account for the proposal size before choosing a 
server",
+        "B",
+        newDiskNormalizedStrategy().findServersToLoadSegment(largeSegment, 
servers).next().getServer().getName()
+    );
+  }
+
+  @Test
+  public void testNearFullServerIsNotChosenAsMoveDestination()
+  {
+    final long maxSize = 10_000_000L;
+    // SOURCE: 70% full, 20 same-DS DAY segments; segmentToMove is one of them.
+    final ServerHolder source = buildServer("SOURCE", maxSize, 7_000_000L, 0, 
20);
+    // DEST: 95% full, 5 same-DS DAY segments -> raw cost 10K < SOURCE's 38K.
+    final ServerHolder nearFullDest = buildServer("DEST", maxSize, 9_500_000L, 
100, 5);
+
+    final DataSegment segmentToMove = getSegment(0);
+    final List<ServerHolder> servers = new ArrayList<>();
+    servers.add(source);
+    servers.add(nearFullDest);
+
+    // CostBalancerStrategy: DEST raw cost (10K) < SOURCE raw cost (38K) -> 
recommends the move.
+    final ServerHolder costResult =
+        newCostStrategy().findDestinationServerToMoveSegment(segmentToMove, 
source, servers);
+    Assert.assertNotNull("CostBalancerStrategy must recommend moving to the 
near-full DEST", costResult);
+    Assert.assertEquals("DEST", costResult.getServer().getName());
+
+    // DiskNormalized: DEST_norm = 10K / 0.05 = 200K > SOURCE_norm = 38K / 0.30 * 0.95 ≈ 120K.
+    // Near-full DEST is too expensive after normalization -> no move.
+    Assert.assertNull(
+        "DiskNormalized must block the move to the near-full server",
+        
newDiskNormalizedStrategy().findDestinationServerToMoveSegment(segmentToMove, 
source, servers)
+    );
+  }
+
+  @Test
+  public void testProjectedSegmentSizePreventsMoveThatWouldFillDestination()
+  {
+    final long maxSize = 10_000_000L;
+    final DataSegment largeSegment = getSegment(0, "DUMMY", DAY, 2_500_000L);
+    final List<DataSegment> sourceSegments = new ArrayList<>();
+    sourceSegments.add(largeSegment);
+    IntStream.range(1, 20)
+             .mapToObj(DiskNormalizedCostBalancerStrategyTest::getSegment)
+             .forEach(sourceSegments::add);
+
+    // SOURCE is fuller before the move, but already projects the segment.
+    final ServerHolder source = buildServer("SOURCE", maxSize, 8_000_000L, 
sourceSegments);
+    // DEST has low raw cost, but loading the 2.5 MB segment would leave only 
5% headroom.
+    final ServerHolder dest = buildServer("DEST", maxSize, 7_000_000L, 100, 5);
+
+    final List<ServerHolder> servers = new ArrayList<>();
+    servers.add(source);
+    servers.add(dest);
+
+    // CostBalancerStrategy recommends the move because DEST raw cost (10K) < 
SOURCE raw cost (38K).
+    final ServerHolder costResult =
+        newCostStrategy().findDestinationServerToMoveSegment(largeSegment, 
source, servers);
+    Assert.assertNotNull("CostBalancerStrategy must recommend moving to the 
lower raw-cost DEST", costResult);
+    Assert.assertEquals("DEST", costResult.getServer().getName());
+
+    // If diskNormalized used current headroom, DEST would win: 10K / 0.30 < 38K / 0.20 * 0.95.
+    // With projected headroom, DEST is too full after placement: 10K / 0.05 > 38K / 0.20 * 0.95.
+    Assert.assertNull(
+        "DiskNormalized must not move a large segment to a server that would 
become too full",
+        
newDiskNormalizedStrategy().findDestinationServerToMoveSegment(largeSegment, 
source, servers)
+    );
+  }
+
   @Test
   public void testRejectsInvalidThreshold()
   {
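
The move decisions asserted in the tests above can be reproduced with a small standalone calculation. This is a minimal sketch, assuming that the source server's discount multiplier is `1 - moveCostSavingsThreshold` (as the `0.95` factors in the test comments suggest) and that a move fires only when the destination's adjusted cost is strictly lower than the source's discounted cost. `MoveDecisionSketch`, `diskAdjusted`, and `shouldMove` are hypothetical names, not Druid APIs.

```java
// Illustrative sketch only; names are hypothetical and do not mirror Druid's classes.
final class MoveDecisionSketch {
    static final double EPSILON = 1e-6;

    /** Raw cost divided by the projected free-disk fraction, floored at EPSILON. */
    public static double diskAdjusted(double rawCost, double projectedUsageRatio) {
        return rawCost / Math.max(EPSILON, 1.0 - projectedUsageRatio);
    }

    /**
     * Returns true when the destination is cheaper than the source after the
     * source receives its discount. Assumes the multiplier is (1 - threshold).
     */
    public static boolean shouldMove(double sourceRaw, double sourceRatio,
                                     double destRaw, double destRatio, double threshold) {
        double sourceCost = diskAdjusted(sourceRaw, sourceRatio) * (1.0 - threshold);
        double destCost = diskAdjusted(destRaw, destRatio);
        return destCost < sourceCost;
    }

    public static void main(String[] args) {
        // Source 80% full (raw 38K), destination 20% full (raw 40K):
        // 180.5K vs 50.0K, so the move fires.
        System.out.println(shouldMove(38.0, 0.80, 40.0, 0.20, 0.05)); // prints true
        // Source 70% full (raw 38K), destination 95% full (raw 10K):
        // about 120K vs 200K, so the move is blocked.
        System.out.println(shouldMove(38.0, 0.70, 10.0, 0.95, 0.05)); // prints false
    }
}
```

The second case shows the intent of the change: a destination with a much lower raw cost is still rejected once its small remaining headroom inflates the adjusted cost.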

