Hi Devs,

Is there a way to validate the capacity availability of cluster instances
when adding a resource and rebalancing it with WAGED? The resource
addition seems to be processed in an event pipeline, so when the rebalancer
encounters the "FAILED_TO_CALCULATE" exception, it doesn't seem to
propagate to the place where we add the resource. That makes it
tricky to validate capacity availability beforehand.

While looking into this I found [1], but I couldn't clearly understand the
usage of the "WAGED simulation API" mentioned there. What I've tried so far
is in the code at the end of this mail.

So the questions are:
- Is this approach correct?
- If so, can encountering "getIdealAssignmentForWagedFullAuto():
Calculation failed: Failed to compute BestPossibleState!" be
considered equivalent to "FAILED_TO_CALCULATE"?
- If so, is there a way to get the proper reason for the failure, like "Unable
to find any available candidate node for partition resource4_0; Fail
reasons: {resource4-resource4_0-ONLINE={c8cep_on_localhost_12002=[Node has
insufficient capacity]..."?
- Or is there a better way of doing this?

    try {
        IdealState newIS = getIdealState(resourceName);
        ResourceConfig newResourceConfig = new ResourceConfig(resourceName);
        // Set PARTITION_CAPACITY_MAP
        Map<String, String> capacityDataMap = ImmutableMap.of("CPU", "20", "MEMORY", "60");
        newResourceConfig.getRecord().setMapField(
                ResourceConfig.ResourceConfigProperty.PARTITION_CAPACITY_MAP.name(),
                Collections.singletonMap(ResourceConfig.DEFAULT_PARTITION_KEY,
                        OBJECT_MAPPER.writeValueAsString(capacityDataMap)));

        // Read existing cluster/instances/resources info
        final ZKHelixDataAccessor dataAccessor = new ZKHelixDataAccessor(CLUSTER_NAME,
                new ZkBaseDataAccessor.Builder<ZNRecord>().setZkAddress(ZK_ADDRESS).build());
        ClusterConfig clusterConfig =
                dataAccessor.getProperty(dataAccessor.keyBuilder().clusterConfig());
        List<InstanceConfig> instanceConfigs =
                dataAccessor.getChildValues(dataAccessor.keyBuilder().instanceConfigs(), true);
        List<String> liveInstances =
                dataAccessor.getChildNames(dataAccessor.keyBuilder().liveInstances());
        List<IdealState> idealStates =
                dataAccessor.getChildValues(dataAccessor.keyBuilder().idealStates(), true);
        List<ResourceConfig> resourceConfigs =
                dataAccessor.getChildValues(dataAccessor.keyBuilder().resourceConfigs(), true);

        // Do we need to add these?
        idealStates.add(newIS);
        resourceConfigs.add(newResourceConfig);

        // Verify that utilResult contains the assignment for the resources added
        Map<String, ResourceAssignment> utilResult = HelixUtil
                .getTargetAssignmentForWagedFullAuto(ZK_ADDRESS, clusterConfig, instanceConfigs,
                        liveInstances, idealStates, resourceConfigs);

    } catch (HelixException e) {
        // Getting "getIdealAssignmentForWagedFullAuto(): Calculation failed:
        // Failed to compute BestPossibleState!" means not enough capacity?
    }
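
Regarding the third question, one thing that might be worth trying in the
catch block above is to walk the cause chain of the caught exception and
print every nested message. This is only a minimal sketch: it assumes (not
confirmed) that the HelixException keeps the underlying rebalance failure as
its cause, and CauseChain / describe are hypothetical helper names, not part
of Helix.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Helix API.
final class CauseChain {
    private static final int MAX_DEPTH = 32;

    private CauseChain() {}

    /**
     * Returns "SimpleClassName: message" for the given throwable and each
     * nested cause, outermost first.
     */
    static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<>();
        int depth = 0;
        // The depth limit protects against cyclic cause chains.
        for (Throwable cur = t; cur != null && depth < MAX_DEPTH; cur = cur.getCause()) {
            lines.add(cur.getClass().getSimpleName() + ": " + cur.getMessage());
            depth++;
        }
        return lines;
    }
}
```

In the catch block this could be called as
CauseChain.describe(e).forEach(System.err::println). If the detailed "Fail
reasons" text is only logged by the rebalancer and not attached as a cause,
this will show nothing beyond the top-level message.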

[1] https://github.com/apache/helix/pull/1701

Thank you,
Grainier Perera.
