PG1204 opened a new issue, #5773:
URL: https://github.com/apache/texera/issues/5773

   ### Task Summary
   
   Type the per-operator statistics fields the backend already sends but the 
frontend drops, and expose a derived performance model from 
`WorkflowStatusService`, as the data foundation for the workflow heat-map 
overlay. No UI change.
   
   ### Context
   
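   The execution engine emits 11 statistics per operator over the websocket, 
but the frontend `OperatorStatistics` type in 
`frontend/src/app/workspace/types/execute-workflow.interface.ts` declares only 
6. The full set of 11 is listed below; the five fields annotated with a unit 
comment are the ones the frontend currently leaves out: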
   
   ```typescript
   export interface OperatorStatistics
     extends Readonly<{
       operatorState: OperatorState;
       aggregatedInputRowCount: number;
       aggregatedInputSize?: number; // bytes
       inputPortMetrics: Record<string, number>;
       aggregatedOutputRowCount: number;
       aggregatedOutputSize?: number; // bytes
       outputPortMetrics: Record<string, number>;
       numWorkers?: number;
       aggregatedDataProcessingTime?: number; // nanoseconds
       aggregatedControlProcessingTime?: number; // nanoseconds
       aggregatedIdleTime?: number; // nanoseconds
     }> {}
   ```
   
   The five timing and byte-size fields (`aggregatedInputSize`, 
`aggregatedOutputSize`, `aggregatedDataProcessingTime`, 
`aggregatedControlProcessingTime`, and `aggregatedIdleTime`) all arrive in the 
JSON payload but are untyped and unused, so they are silently discarded.
   
   `WorkflowStatusService` is currently a thin pass-through over the 
`OperatorStatisticsUpdateEvent` stream, with no derived performance model and 
no spec file. This sub-task makes it the single source of truth for 
per-operator performance data, ahead of the overlay built in the following 
sub-tasks.
   
   Before:  websocket -> OperatorStatistics (6 fields; timing & sizes dropped)
   After:   websocket -> OperatorStatistics (11 fields) -> derived, normalized 
per-operator performance metrics
   
   ### Proposed Change
   
   In `frontend/src/app/workspace/types/execute-workflow.interface.ts`:
   
   - Add the five missing fields to `OperatorStatistics`, all optional, so the 
partial objects built in `resetStatus` / `clearStatus` still type-check.
   
   New 
`frontend/src/app/workspace/service/workflow-status/performance-metrics.ts` 
(pure, framework-free):
   
   - A derived OperatorPerformanceMetrics model and a toPerformanceMetrics 
mapper that defensively defaults every field to 0.
   - A HeatmapView enum (Runtime, Throughput, I/O imbalance) and a 
rawMetricForView selector.
   - A normalizeScores helper returning [0, 1] scores, with skew handling and 
defined empty / single-operator / all-equal behavior.
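
The helpers listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the actual change: the property names of `OperatorPerformanceMetrics`, the `RawStatistics` input type, the `safe` helper, the throughput and I/O-imbalance formulas, the `log(1 + x)` skew handling, and the choice to score empty, single-operator, and all-equal inputs as 0 are all hypothetical.

```typescript
// Sketch only: every name except the OperatorStatistics field names,
// toPerformanceMetrics, HeatmapView, rawMetricForView and normalizeScores
// is an assumption made for illustration.

interface RawStatistics {
  aggregatedInputRowCount?: number;
  aggregatedOutputRowCount?: number;
  aggregatedInputSize?: number; // bytes
  aggregatedOutputSize?: number; // bytes
  aggregatedDataProcessingTime?: number; // nanoseconds
  aggregatedControlProcessingTime?: number; // nanoseconds
  aggregatedIdleTime?: number; // nanoseconds
}

interface OperatorPerformanceMetrics {
  inputRows: number;
  outputRows: number;
  inputBytes: number;
  outputBytes: number;
  dataProcessingNs: number;
  controlProcessingNs: number;
  idleNs: number;
}

enum HeatmapView {
  Runtime,
  Throughput,
  IoImbalance,
}

// Map anything that is not a finite, positive number (undefined, NaN, negatives) to 0.
function safe(value: unknown): number {
  return typeof value === "number" && isFinite(value) && value > 0 ? value : 0;
}

function toPerformanceMetrics(stats: RawStatistics): OperatorPerformanceMetrics {
  return {
    inputRows: safe(stats.aggregatedInputRowCount),
    outputRows: safe(stats.aggregatedOutputRowCount),
    inputBytes: safe(stats.aggregatedInputSize),
    outputBytes: safe(stats.aggregatedOutputSize),
    dataProcessingNs: safe(stats.aggregatedDataProcessingTime),
    controlProcessingNs: safe(stats.aggregatedControlProcessingTime),
    idleNs: safe(stats.aggregatedIdleTime),
  };
}

function rawMetricForView(m: OperatorPerformanceMetrics, view: HeatmapView): number {
  switch (view) {
    case HeatmapView.Runtime:
      return m.dataProcessingNs;
    case HeatmapView.Throughput:
      // Output rows per second of data-processing time; 0 if no time was recorded.
      return m.dataProcessingNs > 0 ? (m.outputRows * 1e9) / m.dataProcessingNs : 0;
    case HeatmapView.IoImbalance:
      // Output/input row ratio; defined as 0 when inputRows === 0 (assumption).
      return m.inputRows > 0 ? m.outputRows / m.inputRows : 0;
  }
}

function normalizeScores(raw: Record<string, number>): Record<string, number> {
  const ids = Object.keys(raw);
  // log(1 + x) compresses heavy tails so one slow operator does not flatten the rest.
  const scaled = ids.map(id => Math.log(1 + safe(raw[id])));
  const min = Math.min(...scaled);
  const max = Math.max(...scaled);
  const scores: Record<string, number> = {};
  ids.forEach((id, i) => {
    // Single-operator and all-equal inputs have max === min; score them 0 (assumption).
    const score = max > min ? (scaled[i] - min) / (max - min) : 0;
    scores[id] = Math.min(1, Math.max(0, score)); // clamp to [0, 1]
  });
  return scores;
}
```

Under these assumptions, `normalizeScores({ a: 0, b: 10, c: 1000 })` scores `a` as 0 and `c` as 1, with `b` strictly in between, and `normalizeScores({})` returns an empty object.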
   
   In 
frontend/src/app/workspace/service/workflow-status/workflow-status.service.ts:
   
   - Keep the existing pass-through API unchanged.
   - Add a derived performance-metrics stream (BehaviorSubject-backed) plus a 
synchronous snapshot getter.
   - Add a public setExternalStatus(...) ingestion method so a later sub-task 
can feed in restored historical statistics without touching private state.
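
A minimal sketch of how the service additions might fit together is given below. To stay dependency-free it uses a small `TinyBehaviorSubject` class as a stand-in for the RxJS `BehaviorSubject` the issue refers to. Apart from `setExternalStatus`, every class, method, and type name here is hypothetical, and the derived model is reduced to a single field for brevity.

```typescript
// Sketch only. TinyBehaviorSubject is a dependency-free stand-in for the RxJS
// BehaviorSubject; all names except setExternalStatus are hypothetical.

class TinyBehaviorSubject<T> {
  private listeners: Array<(value: T) => void> = [];
  constructor(private current: T) {}

  getValue(): T {
    return this.current;
  }

  next(value: T): void {
    this.current = value;
    this.listeners.forEach(listener => listener(value));
  }

  // Like BehaviorSubject, replay the current value to each new subscriber.
  subscribe(listener: (value: T) => void): () => void {
    this.listeners.push(listener);
    listener(this.current);
    return () => {
      this.listeners = this.listeners.filter(l => l !== listener);
    };
  }
}

type StatisticsById = Record<string, { aggregatedDataProcessingTime?: number }>;
type MetricsById = Record<string, { dataProcessingNs: number }>;

class WorkflowStatusServiceSketch {
  private readonly metrics = new TinyBehaviorSubject<MetricsById>({});

  // Stand-in for the handler the real service would run on OperatorStatisticsUpdateEvent.
  onStatisticsUpdate(stats: StatisticsById): void {
    this.publish(stats);
  }

  // Public ingestion path, e.g. for restored historical statistics.
  setExternalStatus(stats: StatisticsById): void {
    this.publish(stats);
  }

  // Derived stream: subscribers get the current value, then every update.
  subscribeToPerformanceMetrics(listener: (m: MetricsById) => void): () => void {
    return this.metrics.subscribe(listener);
  }

  // Synchronous snapshot of the latest derived metrics.
  getCurrentPerformanceMetrics(): MetricsById {
    return this.metrics.getValue();
  }

  private publish(stats: StatisticsById): void {
    const derived: MetricsById = {};
    Object.keys(stats).forEach(id => {
      const ns = stats[id].aggregatedDataProcessingTime;
      // Default missing or non-finite timings to 0 so NaN never reaches subscribers.
      derived[id] = { dataProcessingNs: typeof ns === "number" && isFinite(ns) ? ns : 0 };
    });
    this.metrics.next(derived);
  }
}
```

Routing both the websocket handler and `setExternalStatus` through one private `publish` step keeps the stream and the snapshot getter in agreement, whichever path the statistics arrive on.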
   
   ### Required Test
   - New performance-metrics.spec.ts: full mapping, missing-field defaults (no 
NaN), all-zero, I/O-imbalance ratio with inputRows === 0, unicode operator id, 
and normalizeScores edge cases (empty, single, all-equal, two-value, 
heavy-tail, clamping).
   - New workflow-status.service.spec.ts: derived stream emits on 
OperatorStatisticsUpdateEvent, non-matching events are ignored, the snapshot 
getter is synchronous, and resetStatus / clearStatus / setExternalStatus behave 
as expected.
   
   ### Related
   Umbrella issue #5772 
   RFC discussion #5216
   
   ### Task Type
   
   - [x] Refactor / Cleanup
   - [ ] DevOps / Deployment / CI
   - [ ] Testing / QA
   - [ ] Documentation
   - [x] Performance
   - [ ] Other

