J-HowHuang opened a new pull request, #16136:
URL: https://github.com/apache/pinot/pull/16136
# Enhanced Tenant Rebalance Result with Aggregated Summary
## Summary
This PR enhances the tenant rebalance API to provide aggregated summaries
across all tables within a tenant, giving users a comprehensive view of the
rebalance operation's impact at the tenant level. Previously, the API only
returned individual table rebalance results, making it difficult to understand
the overall tenant-level changes.
## Changes
### Core Enhancements
1. **Enhanced `TenantRebalanceResult`** - Added aggregated summary fields:
   - `totalTables`: Total number of tables processed
   - `statusSummary`: Count of tables by status (DONE, NO_OP, FAILED, etc.)
   - `aggregatedPreChecksResult`: Aggregated pre-check results across all tables
   - `aggregatedRebalanceSummary`: Consolidated rebalance summary with server and segment information
2. **Smart Aggregation Logic** - Aggregates resources that overlap between tables:
   - Handles servers that appear in multiple tables
   - Correctly aggregates segment counts and data movement estimates
   - Preserves individual table details while providing tenant-level insights
3. **Comprehensive Test Coverage** - Added tests covering:
   - Basic aggregation scenarios
   - Complex overlapping server scenarios
   - Edge cases and validation
### New Output Format
The enhanced API now returns both individual table results and aggregated
tenant-level summaries:
```json
{
  "jobId": "tenant-rebalance-12345",
  "totalTables": 3,
  "statusSummary": {
    "DONE": 2,
    "NO_OP": 1
  },
  "aggregatedRebalanceSummary": {
    "serverInfo": {
      "numServersGettingNewSegments": 4,
      "numServers": {
        "valueBeforeRebalance": 9,
        "expectedValueAfterRebalance": 10
      },
      "serversAdded": ["server4", "server5"],
      "serversRemoved": ["server6"],
      "serversUnchanged": ["server1", "server2", "server3", "server7", "server8"],
      "serversGettingNewSegments": ["server4", "server5", "server7", "server8"],
      "serverSegmentChangeInfo": {
        "server1": {
          "serverStatus": "UNCHANGED",
          "totalSegmentsBeforeRebalance": 10,
          "totalSegmentsAfterRebalance": 10,
          "segmentsAdded": 0,
          "segmentsDeleted": 0,
          "segmentsUnchanged": 10
        }
        // ... more servers
      }
    },
    "segmentInfo": {
      "totalSegmentsToBeMoved": 18,
      "totalSegmentsToBeDeleted": 8,
      "maxSegmentsAddedToASingleServer": 5,
      "estimatedAverageSegmentSizeInBytes": 1444,
      "totalEstimatedDataToBeMovedInBytes": 26000,
      "numSegmentsInSingleReplica": {
        "valueBeforeRebalance": 32,
        "expectedValueAfterRebalance": 32
      },
      "numSegmentsAcrossAllReplicas": {
        "valueBeforeRebalance": 74,
        "expectedValueAfterRebalance": 84
      }
    },
    "tagsInfo": [
      {
        "tagName": "TestTenant_OFFLINE",
        "numSegmentsToDownload": 10,
        "numSegmentsUnchanged": 42,
        "numServerParticipants": 7
      },
      {
        "tagName": "TestTenant_REALTIME",
        "numSegmentsToDownload": 8,
        "numSegmentsUnchanged": 24,
        "numServerParticipants": 3
      }
    ]
  },
  "rebalanceTableResults": {
    "tableA_OFFLINE": {
      "jobId": "tableA_job",
      "status": "DONE",
      "description": "Table A rebalanced successfully"
      // ... individual table details
    }
    // ... more tables
  }
}
```
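As a rough illustration of how the two top-level counters relate to the per-table results, the sketch below derives `totalTables` and `statusSummary` from a `rebalanceTableResults`-shaped mapping. It is written in Python for brevity and is not the PR's Java code; `summarize_statuses` is a hypothetical helper name, and the `tableB_OFFLINE` and `tableC_REALTIME` entries are made-up sample data.

```python
from collections import Counter


def summarize_statuses(table_results):
    """Count tables per status from a rebalanceTableResults-style mapping."""
    # Hypothetical helper: each value is a dict carrying a "status" string.
    statuses = Counter(result["status"] for result in table_results.values())
    return {"totalTables": len(table_results), "statusSummary": dict(statuses)}


# Made-up sample shaped like the "rebalanceTableResults" field above.
sample = {
    "tableA_OFFLINE": {"jobId": "tableA_job", "status": "DONE"},
    "tableB_OFFLINE": {"jobId": "tableB_job", "status": "DONE"},
    "tableC_REALTIME": {"jobId": "tableC_job", "status": "NO_OP"},
}
summary = summarize_statuses(sample)
print(summary)
```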
## Special Aggregation Rules
Several fields use **derived aggregation** logic instead of simple summation:
### 1. **Server Status Aggregation**
- When a server appears in multiple tables with different statuses, the most
  significant status takes precedence:
  - `REMOVED` > `ADDED` > `UNCHANGED`
  - Example: If server1 is `UNCHANGED` in tableA but `REMOVED` from tableB,
    its aggregated status becomes `REMOVED`
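
The precedence rule can be sketched as a lookup table plus a `max`. This is a minimal Python illustration, not the PR's implementation; `STATUS_PRIORITY` and `aggregate_server_status` are hypothetical names.

```python
# Precedence from the rule above: REMOVED > ADDED > UNCHANGED.
STATUS_PRIORITY = {"UNCHANGED": 0, "ADDED": 1, "REMOVED": 2}


def aggregate_server_status(per_table_statuses):
    """Return the highest-precedence status among a server's per-table statuses."""
    return max(per_table_statuses, key=STATUS_PRIORITY.__getitem__)


# server1 is UNCHANGED in tableA but REMOVED from tableB.
print(aggregate_server_status(["UNCHANGED", "REMOVED"]))  # REMOVED
```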
### 2. **Max Segments Added to Single Server**
- **Derived from aggregated server data** rather than table-level maximums
- Calculates the maximum across all servers' total segments added (summed
across all their tables)
- More accurate than taking the max of individual table maximums
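
To see why the derived value can exceed every table-level maximum, here is a small Python sketch. The input shape (`{table: {server: segments_added}}`), the function name, and the sample numbers are assumptions made for illustration only.

```python
from collections import defaultdict


def max_segments_added_to_single_server(tables):
    """tables: {table_name: {server_name: segments_added}} (hypothetical shape)."""
    totals = defaultdict(int)
    for per_server in tables.values():
        for server, added in per_server.items():
            totals[server] += added  # sum each server's additions across its tables
    return max(totals.values(), default=0)


# Made-up sample data.
tables = {
    "tableA_OFFLINE": {"server4": 3, "server5": 1},
    "tableB_OFFLINE": {"server4": 2, "server7": 4},
}
# server4 receives 3 + 2 = 5 segments overall; the largest per-table maximum is only 4.
print(max_segments_added_to_single_server(tables))  # 5
```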
### 3. **Average Segment Size Calculation**
- **Weighted average** based on total segments moved and total data moved
- Formula: `totalEstimatedDataToBeMovedInBytes / totalSegmentsToBeMoved`
- Provides accurate tenant-level average, not simple arithmetic mean of
table averages
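
A short Python sketch of the formula, under two assumptions that the description does not state: the result is truncated to an integer (chosen because the sample output shows 1444 for 26000 / 18), and zero is reported when no segments move. The per-table split below is invented so that the totals match the sample output.

```python
def tenant_average_segment_size(tables):
    """tables: list of (segments_to_move, bytes_to_move) pairs, one per table."""
    total_segments = sum(segments for segments, _ in tables)
    total_bytes = sum(num_bytes for _, num_bytes in tables)
    if total_segments == 0:
        return 0  # assumption: report 0 when nothing moves
    # Integer division is an assumption, chosen to match the sample value 1444.
    return total_bytes // total_segments


# Made-up split of the 18 segments / 26000 bytes from the sample output.
tables = [(10, 10_000), (8, 16_000)]
print(tenant_average_segment_size(tables))  # 1444
```

With this split the table averages are 1000 and 2000 bytes, so a simple mean of table averages would give 1500 rather than 1444.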
### 4. **Server Count Calculations**
- Uses **set operations** to handle overlapping servers between tables
- `numServers.valueBeforeRebalance`: Count of unique servers with segments
before rebalance
- `numServers.expectedValueAfterRebalance`: Count of unique servers with
segments after rebalance
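
The set-based counting can be sketched as a union over each table's server sets. This is an illustrative Python sketch; the input shape and the function name `count_unique_servers` are assumptions, and the sample data is made up.

```python
def count_unique_servers(tables):
    """tables: {table: {"before": set_of_servers, "after": set_of_servers}} (hypothetical shape)."""
    before = set().union(*(t["before"] for t in tables.values()))
    after = set().union(*(t["after"] for t in tables.values()))
    return {
        "valueBeforeRebalance": len(before),
        "expectedValueAfterRebalance": len(after),
    }


# Made-up sample: server2 and server4 are shared between the two tables.
tables = {
    "tableA_OFFLINE": {
        "before": {"server1", "server2"},
        "after": {"server1", "server2", "server4"},
    },
    "tableB_OFFLINE": {
        "before": {"server2", "server6"},
        "after": {"server2", "server4"},
    },
}
counts = count_unique_servers(tables)
print(counts)
```

Summing the per-table counts here would report 4 servers before and 5 after, while the unions contain 3 unique servers in both cases.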
### 5. **Tag Participant Counting**
- Aggregates unique servers across all tables using the same tag
- Accounts for servers that may have been removed from some tables but still
exist in others
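
One way to picture the participant count is a per-tag union of servers across tables. The Python sketch below assumes a hypothetical input shape (`{table: {tag: servers_after_rebalance}}`) and made-up data; the actual aggregation in the PR may track more state than this.

```python
from collections import defaultdict


def count_tag_participants(table_tag_servers):
    """table_tag_servers: {table: {tag: set_of_servers}} (hypothetical shape)."""
    servers_by_tag = defaultdict(set)
    for tags in table_tag_servers.values():
        for tag, servers in tags.items():
            servers_by_tag[tag] |= servers  # union, so shared servers count once
    return {tag: len(servers) for tag, servers in servers_by_tag.items()}


# Made-up sample: server3 no longer serves tableA but still serves tableB,
# so it remains a participant for the TestTenant_OFFLINE tag.
sample = {
    "tableA_OFFLINE": {"TestTenant_OFFLINE": {"server1", "server2"}},
    "tableB_OFFLINE": {"TestTenant_OFFLINE": {"server2", "server3"}},
    "tableC_REALTIME": {"TestTenant_REALTIME": {"server7"}},
}
participants = count_tag_participants(sample)
print(participants)
```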
## Testing
Added comprehensive test coverage including:
1. **`testTenantRebalanceResultAggregation()`**
   - Basic aggregation across different table types (offline/realtime)
   - Scale-out, scale-in, and no-op scenarios
   - Status summary validation
2. **`testTenantRebalanceResultAggregationWithOverlappingServers()`**
   - Complex scenario with servers appearing in multiple tables
   - Validates correct handling of server status precedence
   - Tests derived calculations with overlapping resources
## Backward Compatibility
- **Fully backward compatible** - existing fields remain unchanged
- New aggregated fields are **optional additions**
- `verboseResult` parameter controls inclusion of individual table details
- Existing API consumers continue to work without modification
## Benefits
1. **Operational Visibility**: Clear tenant-level view of rebalance impact
2. **Resource Planning**: Understand total data movement and server changes
3. **Monitoring**: Easy status tracking across multiple tables
4. **Performance Insights**: Aggregated metrics for capacity planning
## Use Cases
- **Tenant Scaling**: Understand impact when adding/removing servers to a
tenant
- **Capacity Planning**: Estimate data movement and resource requirements
- **Operations Dashboard**: Monitor tenant rebalance status and progress
- **Automated Tooling**: Build automation based on aggregated metrics
## API Endpoint
```
POST /tenants/{tenantName}/rebalance
```
Query Parameters:
- `verboseResult=true` - Include individual table details (default: false)
- All existing rebalance parameters remain supported
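
For illustration, the request could be assembled as follows with Python's standard library. The controller address `http://localhost:9000` and the tenant name `TestTenant` are placeholder values, `build_tenant_rebalance_request` is a hypothetical helper, and the sketch omits any request body or additional parameters the endpoint may expect.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_tenant_rebalance_request(controller_url, tenant_name, verbose_result=False):
    """Build (but do not send) the POST request for a tenant rebalance."""
    query = urlencode({"verboseResult": str(verbose_result).lower()})
    base = controller_url.rstrip("/")
    url = f"{base}/tenants/{quote(tenant_name, safe='')}/rebalance?{query}"
    return Request(url, method="POST")


# Placeholder controller address and tenant name.
request = build_tenant_rebalance_request(
    "http://localhost:9000", "TestTenant", verbose_result=True
)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without a live controller.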
---
This enhancement provides crucial tenant-level insights while maintaining
full backward compatibility with existing integrations.