DanielCarter-stack commented on PR #10637:
URL: https://github.com/apache/seatunnel/pull/10637#issuecomment-4110599165
<!-- code-pr-reviewer -->
<!-- cpr:pr_reply_v2_parts {"group": "apache/seatunnel#10637", "part": 1,
"total": 1} -->
## Issue 1: Missing test coverage for null values and edge cases
**Location**:
`seatunnel-engine/seatunnel-engine-ui/src/tests/detail-metrics.spec.ts`
**Problem Description**:
The test file covers main scenarios, but lacks tests for the following
important edge cases:
1. When `metricMap` is `undefined`, whether `readVertexMetricValue`
correctly returns 0
2. Behavior when `vertexName` does not contain identifier (e.g.,
`vertexName: 'simple-name'`)
3. Behavior when there is no matching key in metricMap
4. Priority validation when metricMap contains both prefixed keys and
original keys
5. Handling of `Number.isFinite` when metric value is a non-numeric string
(e.g., `'abc'`)
**Potential Risks**:
- Edge cases are not covered by tests, unexpected behavior may occur in
production environments
- When vertexName format does not match expectations, metric lookup may fail
- Unable to ensure code robustness under various abnormal inputs
**Impact Scope**:
- Direct impact: `resolveMetricKey`, `readVertexMetricValue`,
`collectVertexMetrics` functions in `detail-metrics.ts`
- Indirect impact: Metric display logic in `detail.tsx`
- Impact area: Only affects frontend UI metric display, does not affect
backend Job execution
**Severity**: MINOR
**Improvement Suggestions**:
```typescript
test('returns 0 when metricMap is undefined', () => {
expect(readVertexMetricValue(undefined, sourceVertex,
'fake.user_table')).toBe(0)
})
test('handles vertexName without identifier', () => {
const vertexWithoutIdentifier = {
vertexId: 1,
type: 'source',
vertexName: 'simple-source-name',
tablePaths: ['fake.user_table']
} as Vertex
const metricMap = { 'fake.user_table': '15' }
expect(readVertexMetricValue(metricMap, vertexWithoutIdentifier,
'fake.user_table')).toBe(15)
})
test('returns 0 when no matching key found', () => {
const metricMap = { 'other.table': '100' }
expect(readVertexMetricValue(metricMap, sourceVertex,
'fake.user_table')).toBe(0)
})
test('prefers prefixed key over raw key when both exist', () => {
const metricMap = {
'Source[0].fake.user_table': '10',
'fake.user_table': '999'
}
expect(readVertexMetricValue(metricMap, sourceVertex,
'fake.user_table')).toBe(10)
})
test('handles non-numeric metric values', () => {
const metricMap = { 'fake.user_table': 'invalid' }
expect(readVertexMetricValue(metricMap, sourceVertex,
'fake.user_table')).toBe(0)
})
```
---
## Issue 2: Key names returned by `collectVertexMetrics` may not match user
expectations
**Location**:
`seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:81-100`
**Problem Description**:
The key name format returned by the `collectVertexMetrics` function is
`${metricName}.${path}`, for example `TableSinkWriteCount.fake.user_table`.
However, when the actual key used is the prefixed `Sink[1].fake.user_table`,
the key name users see in the Configuration component is inconsistent with the
actual metricMap key name, which may cause confusion.
Although this does not affect functionality (Configuration component only
displays key-value pairs), it may make debugging difficult because the key name
users see does not match the actual key name returned by the backend.
**Potential Risks**:
- Users may be confused when viewing Job details why the displayed key name
is inconsistent with the one returned by the backend
- May not quickly locate the actual metricMap key when debugging issues
- Cannot distinguish which vertex the metric belongs to when multiple
vertices share the same table path
**Impact Scope**:
- Direct impact: `focusedVertex` computed property in `detail.tsx`
- Indirect impact: Configuration component display
- Impact area: Only affects UI display, does not affect functionality
**Severity**: MINOR
**Improvement Suggestions**:
Consider displaying both the original key name and the actual used key name
in the Configuration component, or keep the current status and add comments for
explanation.
Alternatively, if clearer display is needed, you can modify
`collectVertexMetrics` to return an object containing more information:
```typescript
export const collectVertexMetrics = (
metricName: string,
metricMap: MetricMap,
vertex: MetricVertex
): Record<string, { value: string; actualKey: string }> => {
const metrics: Record<string, { value: string; actualKey: string }> = {}
if (!metricMap) {
return metrics
}
vertex.tablePaths.forEach((path) => {
const metricKey = resolveMetricKey(metricMap, vertex, path)
if (metricKey !== undefined) {
metrics[`${metricName}.${path}`] = {
value: metricMap[metricKey],
actualKey: metricKey
}
}
})
return metrics
}
```
But this requires synchronized changes to the Configuration component, which
may be beyond the scope of this PR.
---
## Issue 3: Fallback logic of `resolveMetricKey` may lead to ambiguity
**Location**:
`seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:50-61`
**Problem Description**:
In the `resolveMetricKey` function, when an exact matching key cannot be
found, it attempts to fall back to suffix matching (lines 50-60). Specifically,
the fallback logic on line 60:
```typescript
return suffixedKeys.length === 1 ? suffixedKeys[0] : undefined
```
This logic may cause problems in certain situations:
1. When multiple vertices share the same table path but metrics are only
reported from one vertex, this fallback logic will incorrectly return the only
existing key
2. When multiple table paths with the same name belong to different
vertices, if there is no identifier, it may return the wrong metric value
**Potential Risks**:
- In scenarios where multiple vertices share table paths, incorrect metric
values may be returned
- When a vertex's metrics have not yet been reported, another vertex's
metric values may be used
- Causes users to see inaccurate metric data, misleading operations decisions
**Impact Scope**:
- Direct impact: `sourceCell` and `sinkCell` functions in `detail.tsx`
- Indirect impact: Metric display on Job details page
- Impact area: Metric display in multi-vertex scenarios
**Severity**: MAJOR
**Improvement Suggestions**:
In the fallback logic on line 60, you should be more cautious and only use
fallback when it is confirmed that there will be no confusion:
```typescript
// Only fallback when there is no identifier and only one matching key exists
if (!identifier && suffixedKeys.length === 1) {
return suffixedKeys[0]
}
// If there is an identifier but no matching prefix key, do not blindly
fallback
// In this case, undefined should be returned to let the caller handle it
if (identifier && suffixedKeys.length > 0) {
const sameVertexKey = suffixedKeys.find((key) =>
key.startsWith(`${identifier}.`))
if (sameVertexKey) {
return sameVertexKey
}
// If a table with the same name is found but does not belong to the
current vertex, return undefined instead of using another vertex's data
return undefined
}
return undefined
```
But this requires more careful consideration of various scenarios and may
need to add more test cases for verification.
---
## Issue 4: Performance Consideration - Multiple traversals of Object.keys
**Location**:
`seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail-metrics.ts:50-51`
**Problem Description**:
In the `resolveMetricKey` function, each call executes
`Object.keys(metricMap).filter((key) => key.endsWith(suffix))`, which traverses
all metricMap keys.
Considering:
1. `sourceCell` and `sinkCell` functions call `readVertexMetricValue` once
for each tablePath of each vertex
2. If a Job has 10 vertices and each vertex has 5 table paths,
`resolveMetricKey` will be called 50 times
3. Each call traverses all metricMap keys (possibly dozens to hundreds)
This may have performance issues when the number of metrics is large.
**Potential Risks**:
- In large Jobs with large amounts of metric data, page rendering
performance may decline
- Frequent array traversal and filtering operations may affect user
experience
- In real-time update scenarios (refreshing every 5 seconds), performance
issues become more obvious
**Impact Scope**:
- Direct impact: Metric table rendering in `detail.tsx`
- Indirect impact: Performance of Job details page
- Impact area: Details pages of large Jobs
**Severity**: MINOR
**Improvement Suggestions**:
Consider pre-building an index to speed up lookup, or use more efficient
data structures. However, since this is a frontend optimization and metrics are
usually not too many, the priority of this issue is lower.
If optimization is indeed needed, you can consider:
```typescript
// Pre-build a Map grouped by table path
const buildMetricPathIndex = (metricMap: MetricMap) => {
if (!metricMap) return new Map()
const index = new Map<string, string[]>()
Object.keys(metricMap).forEach(key => {
const parts = key.split('.')
if (parts.length >= 2) {
const path = parts.slice(1).join('.')
if (!index.has(path)) {
index.set(path, [])
}
index.get(path)!.push(key)
}
})
return index
}
```
But this increases code complexity and requires weighing performance against
maintainability.
---
## Issue 5: Missing metric handling for Transform type vertices
**Location**:
`seatunnel-engine/seatunnel-engine-ui/src/views/jobs/detail.tsx:95-126`
**Problem Description**:
Although the `VERTEX_IDENTIFIER_PATTERN` regular expression includes
`Transform`, the `sourceCell` and `sinkCell` functions only handle `source` and
`sink` type vertices, not `transform` type vertices.
This may lead to the following problems:
1. If Transform type vertices also have table metrics (although not very
common), these metrics will not be displayed
2. In the computed property of `focusedVertex`, there is also no handling
for transform type
**Potential Risks**:
- If Transform type vertices support table metrics in the future, these
metrics will not be able to be displayed in the UI
- Incomplete code structure may cause confusion for maintainers
**Impact Scope**:
- Direct impact: `sourceCell`, `sinkCell` and `focusedVertex` functions in
`detail.tsx`
- Indirect impact: Metric display of Transform type vertices
- Impact area: Transform type vertices
**Severity**: MINOR
**Improvement Suggestions**:
If it is certain that Transform type will not have table metrics, you can
add comments to explain in the code. Alternatively, for code completeness, you
can add support for Transform:
```typescript
const transformCell = (
row: Vertex,
key: string
) => {
if (row.type === 'transform') {
// If the Transform type also has table metrics, processing logic can be
added here
return row.tablePaths.reduce(
(s, path) => s + readVertexMetricValue(job.metrics?.[key], row, path),
0
)
}
return 0
}
```
But this requires first confirming whether the backend returns table metrics
for Transform.
---
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]