This is an automated email from the ASF dual-hosted git repository.

wu-sheng pushed a commit to branch feat/instance-topology-preview
in repository https://gitbox.apache.org/repos/asf/skywalking-horizon-ui.git

commit 4397390ab1e0229aafd6dfa4d8736ed8b9494611
Author: Wu Sheng <[email protected]>
AuthorDate: Sun Jun 7 10:37:35 2026 +0800

    feat(layer): list every service in the landing picker; landing + i18n hardening

    - Landing picker lists the whole layer, not just the metric-probed
      top-N: services below the cap show `low` in the order-by column
      (`—` elsewhere), still searchable and selectable. Header reads
      "metrics: top N". New `query.landingServiceCap` (default 100)
      bounds the metric fan-out; a cheap single-metric ranking pass
      picks the true top-N when a layer exceeds it.
    - Remove the stale "Landing KPI tile" Headline/Trend admin controls
      and all dead throughput/spark config (api-client wire types, BFF
      landing compute + setup schema + loader migration, UI data layer +
      metric catalog). The layer header renders every configured column
      as its own KPI with its own trend; the headline/trend fields drove
      nothing on screen.
    - i18n is remote-only at runtime: a missing OAP overlay falls back to
      the English source, never to bundled disk overlays (those stay for
      boot-seed / reset / diff only).
    - Dashboard + overview data caches key on the full widget config, not
      just ids, so a remote sync or preview edit that changes an MQE
      refires instead of serving stale data.
    - Config-bundle preload failure no longer strands the shell on
      "Initializing…": it seeds an empty unreachable bundle so routes
      render and the connectivity banner can recover.
    - An off-sample selected service resolves its name from the full
      roster, so the header KPIs match the dashboard the body queries.
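
The top-N selection described in the first bullet can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this commit: `Scored` and `pickTopN` are hypothetical names, and each service is assumed to already carry a nullable score from the single-metric ranking probe.

```typescript
// Hypothetical shape: one entry per service, scored by the order-by metric.
// `score` is null when the ranking probe returned no value for the service.
interface Scored {
  id: string;
  score: number | null;
}

// Illustrative helper (not from the commit): keep everything at or under
// the cap, otherwise keep the `cap` highest-scoring services.
function pickTopN(all: Scored[], cap: number): Scored[] {
  if (all.length <= cap) return all; // under the cap: probe every service
  return [...all] // copy so the caller's list order is left untouched
    .sort((a, b) => {
      // Services without a score sink to the end; the rest sort descending.
      if (a.score === null && b.score === null) return 0;
      if (a.score === null) return 1;
      if (b.score === null) return -1;
      return b.score - a.score;
    })
    .slice(0, cap);
}
```

Sorting before slicing is what makes the result a true top-N: capping by list order first and ranking afterwards would only rank whichever services happened to come first.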
---
 CHANGELOG.md                                       |  27 +++
 apps/bff/src/config/schema.ts                      |  16 ++
 apps/bff/src/http/config/bundle.ts                 |  37 +--
 apps/bff/src/http/config/dashboard.ts              |   3 +-
 apps/bff/src/http/config/overview.ts               |   4 +-
 apps/bff/src/http/config/setup.ts                  |  25 +-
 apps/bff/src/http/query/landing.ts                 | 264 ++++++++-------------
 apps/bff/src/http/query/menu.ts                    |   6 +-
 apps/bff/src/i18n/merge.ts                         |  51 ++--
 apps/bff/src/logic/layers/loader.ts                |  34 +--
 apps/bff/src/logic/templates/overlay.ts            |  17 +-
 apps/ui/src/api/client.ts                          |   9 +-
 apps/ui/src/api/scopes/layer.test.ts               |   4 +-
 apps/ui/src/api/scopes/layer.ts                    |   2 -
 apps/ui/src/controls/configBundle.ts               |  23 +-
 .../admin/layer-templates/LayerDashboardsAdmin.vue | 209 +---------------
 apps/ui/src/i18n/locales/de.json                   |   3 +
 apps/ui/src/i18n/locales/en.json                   |   3 +
 apps/ui/src/i18n/locales/es.json                   |   3 +
 apps/ui/src/i18n/locales/fr.json                   |   3 +
 apps/ui/src/i18n/locales/ja.json                   |   3 +
 apps/ui/src/i18n/locales/ko.json                   |   3 +
 apps/ui/src/i18n/locales/pt.json                   |   3 +
 apps/ui/src/i18n/locales/zh-CN.json                |   3 +
 apps/ui/src/layer/LayerServiceSelector.vue         | 164 ++++++++++---
 apps/ui/src/layer/LayerShell.vue                   |  49 ++--
 apps/ui/src/layer/useLayerLanding.ts               |   2 -
 .../render/layer-dashboard/useLayerDashboard.ts    |   6 +-
 .../ui/src/render/overview/useOverviewDashboard.ts |   5 +-
 apps/ui/src/state/setup.ts                         |   6 -
 apps/ui/src/utils/metricCatalog.ts                 |  13 -
 docs/setup/horizon-yaml.md                         |  35 +++
 packages/api-client/src/index.ts                   |   1 -
 packages/api-client/src/landing.ts                 |  23 +-
 packages/api-client/src/menu.ts                    |   5 -
 packages/api-client/src/setup.ts                   |  23 --
 36 files changed, 466 insertions(+), 621 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9ec7a25..e1949da 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -87,6 +87,33 @@ packages) plus the BFF's `HORIZON_VERSION` default.
 when you click **Reset to bundled** — matching the runtime, which renders the published version or blocks, never the bundle.
+### Layer landing & service list
+
+- **The layer landing now shows your services, not just an arbitrary 25.** It
+  used to cap the metric fan-out at the first 25 services *by list order* —
+  so larger layers hid the rest, and the "top" services weren't even the true
+  top (the cap happened before the ranking). Now it probes **all** services up
+  to a configurable cap and, when a layer exceeds it, runs a cheap single-
+  metric ranking pass to pick the **true top-N** by the landing's order-by
+  column. The service picker surfaces **"top N of M"** so the trim is never
+  silent. Queries drain through a bounded-concurrency pool, so a big layer
+  fans out in controlled waves rather than a thundering herd.
+- New `query.landingServiceCap` in `horizon.yaml` (default **100**) tunes how
+  many services a landing probes per request — raise it if your OAP + storage
+  can take the larger fan-out, lower it to protect a modest deployment.
+- **The service picker now lists the *whole* layer, not only the metric-probed
+  top-N.** Services that ranked below the metric cap on the order-by column now
+  appear as their own rows with **`low`** in that column (and `—` for the
+  others, which were never probed) instead of being hidden — every service
+  stays browsable, searchable, and selectable regardless of the cap. The
+  header chip reads **"metrics: top N"** to make the metric trim explicit.
+- **Removed the stale "Landing KPI tile" controls** (Headline / Trend line)
+  from the Layer-dashboards admin. They no longer matched the rendered layer
+  header — which shows every configured metric column as its own KPI with its
+  own trend line — so editing them changed nothing on screen. The header is
+  driven entirely by the service-list columns + default sort; the preview now
+  reflects that.
+ ### Documentation & release tooling - The website docs were brought current with the 0.6.0 build and the diff --git a/apps/bff/src/config/schema.ts b/apps/bff/src/config/schema.ts index 37a0005..0aea48f 100644 --- a/apps/bff/src/config/schema.ts +++ b/apps/bff/src/config/schema.ts @@ -304,6 +304,21 @@ const debugLogSchema = z redactAuthHeaders: true, }); +const querySchema = z + .object({ + /** Max services a layer landing runs metric MQE for, per request. The + * landing always lists ALL services, but only fetches column metrics + * for up to this many — the TRUE top-N by the landing's `orderBy` + * column (a cheap single-metric ranking pass picks them when a layer + * exceeds the cap). The UI surfaces "top N of M" whenever the cap + * bites, so nothing is silently dropped. Raise it if your OAP + + * storage backend can take the larger fan-out; lower it to protect a + * modest deployment. Default 100. */ + landingServiceCap: z.number().int().positive().default(100), + }) + .strict() + .default({ landingServiceCap: 100 }); + export const configSchema = z .object({ server: serverSchema.default({}), @@ -315,6 +330,7 @@ export const configSchema = z setup: setupSchema, alarms: alarmsSchema, debugLog: debugLogSchema, + query: querySchema, // Deprecated + ignored. The 3D-map config moved to OAP (a template kind); // the old file-backed `infra3d.file` knob is gone. 
Accepted here (rather // than rejected by `.strict()`) so an existing config carrying the block diff --git a/apps/bff/src/http/config/bundle.ts b/apps/bff/src/http/config/bundle.ts index 9f6a945..eff0884 100644 --- a/apps/bff/src/http/config/bundle.ts +++ b/apps/bff/src/http/config/bundle.ts @@ -71,12 +71,7 @@ import { formatName, parseEnvelope } from '../../logic/templates/names.js'; import { resync as resyncTemplates } from '../../logic/templates/sync.js'; import { logger } from '../../logger.js'; import type { Locale } from '../../i18n/index.js'; -import { - localizeContent, - getLayerOverlay, - getOverviewOverlay, - localeFromRequest, -} from '../../i18n/index.js'; +import { localizeContent, localeFromRequest } from '../../i18n/index.js'; export interface ConfigBundleDeps { config: ConfigSource; @@ -175,15 +170,10 @@ async function buildBundle( const layers: Record<string, ScopeMap> = {}; // Localize + slice a resolved layer template into per-scope widget sets. const addLayer = (picked: LayerTemplate): void => { - // Localize: OAP overlay wins where present, disk overlay fills the - // rest, English source falls through for the remainder. Both overlays - // key on the layer's `key` (upper-snake, matching OAP's enum). - const effective = localizeContent( - picked, - oapOverlayFor('layer', picked.key), - getLayerOverlay(picked.key, locale), - locale, - ); + // Localize against the OAP overlay row (keyed on the layer's + // upper-snake `key`); English fills the rest. Disk i18n is seed/reset + // only, never a runtime fill — same remote-first rule as the template. 
+ const effective = localizeContent(picked, oapOverlayFor('layer', picked.key), locale); const scopes: ScopeMap = {}; for (const scope of ['service', 'instance', 'endpoint'] as const) { const ws = widgetsForScope(effective, scope); @@ -216,14 +206,7 @@ async function buildBundle( diskOverviewIds.add(dash.id); const picked = pickOverviewContent(dash, remoteByName, preferLocal); if (picked === null) continue; // disabled - overviews.push( - localizeContent( - picked, - oapOverlayFor('overview', picked.id), - getOverviewOverlay(picked.id, locale), - locale, - ), - ); + overviews.push(localizeContent(picked, oapOverlayFor('overview', picked.id), locale)); } // Remote-only overviews: dashboards that live on OAP with no on-disk // base — created in the admin UI and pushed. The disk loop can't see @@ -236,11 +219,9 @@ async function buildBundle( if (!env || !isOverviewLike(env.content)) continue; const dash = env.content as OverviewDashboard; if (diskOverviewIds.has(dash.id)) continue; // already handled above - // Remote-only dashboards: no disk overlay, but a per-locale OAP - // overlay row may still apply. - overviews.push( - localizeContent(dash, oapOverlayFor('overview', dash.id), null, locale), - ); + // Remote-only dashboards: localize against the per-locale OAP overlay + // row when one exists, else English. 
+ overviews.push(localizeContent(dash, oapOverlayFor('overview', dash.id), locale)); } const syncStatus: BundleSyncStatus = { diff --git a/apps/bff/src/http/config/dashboard.ts b/apps/bff/src/http/config/dashboard.ts index 4e94509..a492840 100644 --- a/apps/bff/src/http/config/dashboard.ts +++ b/apps/bff/src/http/config/dashboard.ts @@ -36,7 +36,7 @@ import { resolveEffectiveLayer } from '../../logic/layers/effective.js'; import { oapOverlayContentFor } from '../../logic/templates/overlay.js'; import { defaultWidgetsFor } from '../../logic/dashboard/defaults.js'; import { scopeSchema } from '../query/dashboard.js'; -import { localizeContent, getLayerOverlay, localeFromRequest } from '../../i18n/index.js'; +import { localizeContent, localeFromRequest } from '../../i18n/index.js'; export interface DashboardConfigDeps { config: ConfigSource; @@ -75,7 +75,6 @@ export function registerDashboardConfigRoute(app: FastifyInstance, deps: Dashboa // (`GENERAL`), not the lowercase URL param — pass the template's // own key so the OAP translation overlay row actually matches. 
await oapOverlayContentFor(deps.uiTemplateClient, 'layer', rawTpl.key, locale), - getLayerOverlay(rawTpl.key, locale), locale, ) : null; diff --git a/apps/bff/src/http/config/overview.ts b/apps/bff/src/http/config/overview.ts index 8fca605..369f78e 100644 --- a/apps/bff/src/http/config/overview.ts +++ b/apps/bff/src/http/config/overview.ts @@ -45,7 +45,7 @@ import { resolveEffectiveOverview, } from '../../logic/overview/effective.js'; import { oapOverlayContentFor } from '../../logic/templates/overlay.js'; -import { localizeContent, getOverviewOverlay, localeFromRequest } from '../../i18n/index.js'; +import { localizeContent, localeFromRequest } from '../../i18n/index.js'; export interface OverviewRouteDeps { config: ConfigSource; @@ -68,7 +68,6 @@ export function registerOverviewRoutes(app: FastifyInstance, deps: OverviewRoute const ld = localizeContent<OverviewDashboard>( d, await oapOverlayContentFor(deps.uiTemplateClient, 'overview', d.id, locale), - getOverviewOverlay(d.id, locale), locale, ); return { @@ -110,7 +109,6 @@ export function registerOverviewRoutes(app: FastifyInstance, deps: OverviewRoute dashboard: localizeContent<OverviewDashboard>( dash, await oapOverlayContentFor(deps.uiTemplateClient, 'overview', dash.id, locale), - getOverviewOverlay(dash.id, locale), locale, ), reachable: true, diff --git a/apps/bff/src/http/config/setup.ts b/apps/bff/src/http/config/setup.ts index b4272b9..6c473ba 100644 --- a/apps/bff/src/http/config/setup.ts +++ b/apps/bff/src/http/config/setup.ts @@ -46,35 +46,18 @@ const landingColumnSchema = z }) .strict(); -const throughputSchema = z - .object({ - metric: z.string().min(1), - label: z.string().optional(), - unit: z.string().optional(), - mqe: z.string().optional(), - aggregation: aggregationSchema.optional(), - scale: z.number().finite().optional(), - precision: z.number().int().min(0).max(6).optional(), - }) - .strict(); - const landingSchema = z .object({ priority: z.number().int().min(0).max(99), topN: 
z.number().int().min(5).max(8), orderBy: z.string().min(1), columns: z.array(landingColumnSchema).max(5), - spark: z - .object({ - metric: z.string().min(1), - height: z.number().int().positive(), - }) - .strict() - .optional(), - throughput: throughputSchema.optional(), style: z.enum(['table', 'bar', 'mini-topology']), }) - .strict(); + // `.strip()` (not `.strict()`): setups persisted by older builds carry the + // retired `spark` / `throughput` keys — drop them silently on re-save + // instead of failing the whole payload. + .strip(); const layerConfigSchema = z .object({ diff --git a/apps/bff/src/http/query/landing.ts b/apps/bff/src/http/query/landing.ts index a7565df..3f8ef37 100644 --- a/apps/bff/src/http/query/landing.ts +++ b/apps/bff/src/http/query/landing.ts @@ -24,16 +24,13 @@ * ``` * { * topN, orderBy, columns: LandingColumn[], - * spark?: { metric, height }, - * throughput?: ThroughputConfig, * } * ``` * * One GraphQL trip lists services, a second batches per-service column - * MQE values (one alias per service × column), a third optional trip - * fetches the sparkline + throughput series for the surviving topN - * rows. Errors anywhere in the MQE batch are local — failing cells - * become `null`, the rest of the response stands. + * MQE values (one alias per service × column). Errors anywhere in the + * MQE batch are local — failing cells become `null`, the rest of the + * response stands. */ import type { FastifyInstance, FastifyReply, FastifyRequest } from 'fastify'; @@ -94,7 +91,27 @@ const LIST_SERVICES_QUERY = /* GraphQL */ ` `; const DEFAULT_WINDOW_MIN = 60; -const SERVICE_QUERY_CAP = 25; +// Services × columns are chunked into batches of N services per OAP +// round-trip — OAP enforces a per-request GraphQL complexity ceiling, so a +// 25×10 single batch reliably 5xx'd busy backends and blanked every cell. +// The batches then drain through a bounded-concurrency pool so a large +// layer fans out in controlled waves, not a thundering herd. 
The number of +// services probed per request is itself bounded by `query.landingServiceCap`. +const MAX_SERVICES_PER_BATCH = 6; +const LANDING_BATCH_CONCURRENCY = 8; + +/** Run `fn` over `items` with at most `limit` promises in flight at once. */ +async function mapPool<T>(items: T[], limit: number, fn: (item: T) => Promise<void>): Promise<void> { + let next = 0; + await Promise.all( + Array.from({ length: Math.min(limit, items.length) }, async () => { + while (next < items.length) { + const i = next++; + await fn(items[i]); + } + }), + ); +} const aggSchema = z.enum(['sum', 'avg']); const columnSchema = z.object({ @@ -107,15 +124,6 @@ const columnSchema = z.object({ scale: z.number().finite().optional(), precision: z.number().int().min(0).max(6).optional(), }); -const throughputSchema = z.object({ - metric: z.string().min(1), - label: z.string().optional(), - unit: z.string().optional(), - mqe: z.string().optional(), - aggregation: aggSchema.optional(), - scale: z.number().finite().optional(), - precision: z.number().int().min(0).max(6).optional(), -}); const bodySchema = z.object({ topN: z.number().int().min(1).max(8), orderBy: z.string().min(1), @@ -124,10 +132,6 @@ const bodySchema = z.object({ // per-layer header columns. Up to 3 + 5 = 8 in the worst case; 10 // gives headroom without making the BFF wide-open. columns: z.array(columnSchema).max(10), - spark: z - .object({ metric: z.string().min(1), height: z.number().int().positive() }) - .optional(), - throughput: throughputSchema.optional(), // Topbar time picker — same triplet shape the dashboard route accepts. 
// When all three are present the BFF queries OAP at the requested // window/precision; otherwise it falls back to the last-hour MINUTE @@ -226,10 +230,6 @@ function aggregateSeries( return out; } -function alias(serviceIdx: number, columnIdx: number): string { - return `r${serviceIdx}_c${columnIdx}`; -} - interface MqeRequest { expression: string; serviceName: string; @@ -329,69 +329,97 @@ export function registerLandingRoute(app: FastifyInstance, deps: LandingRouteDep } // Step 2 — resolve MQE expressions per column. - const sampled = services.slice(0, SERVICE_QUERY_CAP); const resolved = cfg.columns.map((c) => ({ column: c, expression: resolveMqe(c.metric, c.mqe, layerKey), })); - const coldStage = !!req.coldStage; - // Chunk services × columns into batches of N services per - // round-trip, fired in parallel. OAP's GraphQL server enforces a - // per-request complexity ceiling — same reason the dashboard route - // pins at 6 widgets per trip (see dashboard.ts:555). A 25-service - // × 10-column landing in one batch is 250 fragments and reliably - // 5xx'd on busy backends, blanking every cell. Chunked + - // Promise.all keeps wall-clock close to a single round-trip. 
- const MAX_SERVICES_PER_BATCH = 6; - const serviceChunks: { svc: (typeof sampled)[number]; sIdx: number }[][] = []; - for (let i = 0; i < sampled.length; i += MAX_SERVICES_PER_BATCH) { - serviceChunks.push( - sampled.slice(i, i + MAX_SERVICES_PER_BATCH).map((svc, idxInChunk) => ({ - svc, - sIdx: i + idxInChunk, - })), - ); - } - let mqeData: Record<string, MqeResultShape> = {}; - if (sampled.length > 0 && resolved.some((r) => !!r.expression)) { - try { - const chunkResults = await Promise.all( - serviceChunks.map(async (chunk) => { - const fragments: string[] = []; - for (const { svc, sIdx } of chunk) { - resolved.forEach(({ expression }, cIdx) => { - if (!expression) return; - fragments.push( - buildMqeFragment( - alias(sIdx, cIdx), - { expression, serviceName: svc.value, normal: svc.normal !== false }, - window, - coldStage, - ), - ); - }); - } - if (fragments.length === 0) return {} as Record<string, MqeResultShape>; - const query = `query LandingMqe { ${fragments.join('\n ')} }`; - return graphqlPost<Record<string, MqeResultShape>>(opts, query); - }), - ); - for (const chunk of chunkResults) Object.assign(mqeData, chunk); - } catch { - mqeData = {}; + + // Probe `cols` for every service in `svcList`, chunked into + // per-request batches and drained through the bounded pool. Keyed by + // `${serviceId}#${colIdx}` so the row assembly reads back by id, not + // by a fragile global index. Per-batch failures are local — those + // cells just stay empty. 
+ const probeColumns = async ( + svcList: typeof services, + cols: typeof resolved, + ): Promise<Map<string, MqeResultShape>> => { + const out = new Map<string, MqeResultShape>(); + if (svcList.length === 0 || !cols.some((c) => !!c.expression)) return out; + const chunks: (typeof svcList)[] = []; + for (let i = 0; i < svcList.length; i += MAX_SERVICES_PER_BATCH) { + chunks.push(svcList.slice(i, i + MAX_SERVICES_PER_BATCH)); + } + await mapPool(chunks, LANDING_BATCH_CONCURRENCY, async (batch) => { + const fragments: string[] = []; + const back: { a: string; key: string }[] = []; + batch.forEach((svc, li) => { + cols.forEach(({ expression }, ci) => { + if (!expression) return; + const a = `s${li}c${ci}`; + back.push({ a, key: `${svc.id}#${ci}` }); + fragments.push( + buildMqeFragment( + a, + { expression, serviceName: svc.value, normal: svc.normal !== false }, + window, + coldStage, + ), + ); + }); + }); + if (fragments.length === 0) return; + try { + const data = await graphqlPost<Record<string, MqeResultShape>>( + opts, + `query LandingMqe { ${fragments.join('\n ')} }`, + ); + for (const { a, key } of back) { + if (data[a] !== undefined) out.set(key, data[a]); + } + } catch { + /* batch-local failure → leave those cells empty */ + } + }); + return out; + }; + + // Bound the column fan-out to `landingServiceCap` services. The + // landing lists ALL services; when a layer exceeds the cap we run a + // cheap single-metric ranking pass (the `orderBy` column over every + // service) to pick the TRUE top-`cap`, then fetch the full columns for + // just those. At or under the cap we probe everyone directly. The UI + // surfaces "top N of M" so the trim is never silent. 
+ const cap = deps.config.current.query.landingServiceCap; + let sampled = services; + if (totalServiceCount > cap) { + const orderByCol = resolved.find((r) => r.column.metric === cfg.orderBy && r.expression); + if (orderByCol) { + const ranked = await probeColumns(services, [orderByCol]); + const scored = services.map((svc) => ({ svc, v: collapseToScalar(ranked.get(`${svc.id}#0`)) })); + scored.sort((a, b) => { + if (a.v == null && b.v == null) return 0; + if (a.v == null) return 1; + if (b.v == null) return -1; + return b.v - a.v; + }); + sampled = scored.slice(0, cap).map((x) => x.svc); + } else { + // No orderBy column to rank by — fall back to the first `cap`. + sampled = services.slice(0, cap); } } + const probed = await probeColumns(sampled, resolved); // Step 3 — assemble per-row metrics + retain the per-bucket // series so the layer header can render a sparkline under each // KPI (aggregated point-wise across topN below). const seriesByServiceMetric = new Map<string, Map<string, Array<number | null>>>(); - const rows: LandingServiceRow[] = sampled.map((svc, sIdx) => { + const rows: LandingServiceRow[] = sampled.map((svc) => { const metrics: Record<string, number | null> = {}; const seriesMap = new Map<string, Array<number | null>>(); resolved.forEach(({ column }, cIdx) => { - const raw = mqeData[alias(sIdx, cIdx)]; + const raw = probed.get(`${svc.id}#${cIdx}`); const series = collapseToSeries(raw); if (series) { seriesMap.set( @@ -426,69 +454,10 @@ export function registerLandingRoute(app: FastifyInstance, deps: LandingRouteDep }); const topRows = rows.slice(0, cfg.topN); - // Step 5 — sparkline + throughput series for the surviving topN. - const sparkExpr = cfg.spark ? resolveMqe(cfg.spark.metric, undefined, layerKey) : null; - const throughputCol = cfg.throughput; - const throughputExpr = throughputCol - ? 
resolveMqe(throughputCol.metric, throughputCol.mqe, layerKey) - : null; - - // The throughput tile reuses one MQE call per surviving row — but - // only when it's a different expression than the column already - // fetched. Most setups will pick throughput = orderBy, so we just - // reuse `rows` values in that case. - const sparkSeriesByRow = new Map<number, Array<number | null>>(); - const throughputSeriesByRow = new Map<number, Array<number | null>>(); - - if ((sparkExpr || throughputExpr) && topRows.length > 0) { - const sparkFragments: string[] = []; - topRows.forEach((row, i) => { - const svc = sampled.find((s) => s.id === row.serviceId); - if (!svc) return; - const r: MqeRequest = { expression: '', serviceName: svc.value, normal: svc.normal !== false }; - if (sparkExpr) { - sparkFragments.push(buildMqeFragment(`s${i}`, { ...r, expression: sparkExpr }, window, coldStage)); - } - if (throughputExpr && throughputExpr !== sparkExpr) { - sparkFragments.push(buildMqeFragment(`t${i}`, { ...r, expression: throughputExpr }, window, coldStage)); - } - }); - if (sparkFragments.length > 0) { - const sparkQuery = `query LandingSpark { ${sparkFragments.join('\n ')} }`; - try { - const sparkData = await graphqlPost<Record<string, MqeResultShape>>(opts, sparkQuery); - topRows.forEach((row, i) => { - const sk = sparkExpr ? collapseToSeries(sparkData[`s${i}`]) : null; - if (sk && cfg.spark) { - // Spark inherits the orderBy column's scale/precision if - // we have a matching column; otherwise raw. - const matchedCol = cfg.columns.find((c) => c.metric === cfg.spark!.metric); - const scaled = sk.map((v) => - postProcess(v, matchedCol?.scale, matchedCol?.precision), - ); - row.spark = scaled; - sparkSeriesByRow.set(i, scaled); - } - if (throughputExpr) { - const series = - throughputExpr === sparkExpr - ? 
sk - : collapseToSeries(sparkData[`t${i}`]); - if (series) { - const scaled = series.map((v) => - postProcess(v, throughputCol?.scale, throughputCol?.precision), - ); - throughputSeriesByRow.set(i, scaled); - } - } - }); - } catch { - // Soft-fail: leave spark / throughput-spark empty. - } - } - } - - // Step 6 — aggregates for the KPI tile. + // Step 5 — aggregates for the KPI tile. Each header column becomes a + // KPI: a point value (sum/avg across the topN per the column's + // `aggregation`) plus the point-wise aggregated series the header + // renders as a trend line beneath it. const aggregates: LandingAggregates = { serviceCount: totalServiceCount, metrics: {}, @@ -508,31 +477,6 @@ export function registerLandingRoute(app: FastifyInstance, deps: LandingRouteDep const agg = aggregateSeries(colSeries, kind); if (agg) aggregates.seriesByMetric[col.metric] = agg; } - if (throughputCol) { - const kind: AggregationKind = throughputCol.aggregation ?? 'sum'; - // Value: either reuse the per-row column value (when throughput - // matches a column) or compute it now from the throughput series. - const matchingCol = cfg.columns.find((c) => c.metric === throughputCol.metric); - const values = matchingCol - ? topRows.map((r) => r.metrics[throughputCol.metric] ?? null) - : topRows.map((_, i) => { - const series = throughputSeriesByRow.get(i); - if (!series) return null; - const finite = series.filter((v): v is number => v !== null); - if (finite.length === 0) return null; - return finite.reduce((a, b) => a + b, 0) / finite.length; - }); - aggregates.throughputMetric = throughputCol.metric; - aggregates.throughputValue = aggregate(values, kind); - const seriesList = topRows.map((_, i) => throughputSeriesByRow.get(i)); - aggregates.spark = aggregateSeries(seriesList, kind); - } else if (cfg.spark) { - // No throughput configured — surface the spark metric's aggregated - // series as a fallback so the KPI tile still has a trend line. 
- const kind: AggregationKind = 'avg'; - const seriesList = topRows.map((_, i) => sparkSeriesByRow.get(i)); - aggregates.spark = aggregateSeries(seriesList, kind); - } const body: LandingResponse = { layer: layerKey, diff --git a/apps/bff/src/http/query/menu.ts b/apps/bff/src/http/query/menu.ts index 2193493..8846059 100644 --- a/apps/bff/src/http/query/menu.ts +++ b/apps/bff/src/http/query/menu.ts @@ -36,7 +36,7 @@ import type { TemplateRow } from '../../logic/templates/sync.js'; import type { ServiceLayerCatalog } from '../../logic/services/service-layer-catalog.js'; import { logger } from '../../logger.js'; import type { Locale } from '../../i18n/index.js'; -import { localizeContent, getLayerOverlay, localeFromRequest } from '../../i18n/index.js'; +import { localizeContent, localeFromRequest } from '../../i18n/index.js'; import { oapOverlayContentFromRows } from '../../logic/templates/overlay.js'; /** @@ -267,9 +267,7 @@ function deriveLayer( // seed/reset source, never a render-time fallback; the per-page // features block in that state while the sidebar still navigates. const rawTpl = resolveLayerTemplate(rawKey, layerRowsByName); - const tpl = rawTpl - ? localizeContent<LayerTemplate>(rawTpl, oapOverlay, getLayerOverlay(rawKey, locale), locale) - : null; + const tpl = rawTpl ? localizeContent<LayerTemplate>(rawTpl, oapOverlay, locale) : null; if (tpl) { return { key: rawKey.toLowerCase(), diff --git a/apps/bff/src/i18n/merge.ts b/apps/bff/src/i18n/merge.ts index 6d5abe4..b39229b 100644 --- a/apps/bff/src/i18n/merge.ts +++ b/apps/bff/src/i18n/merge.ts @@ -81,33 +81,22 @@ export function localize<T>(source: T, overlay: unknown, locale: string): T { } /** - * Variant for layer / overview templates whose translations live as - * sibling rows on OAP (`horizon.<kind>.<key>.i18n.<locale>`) plus - * sibling files on disk (`*.i18n.<lang>.json`). 
+ * Localize a layer / overview template against its **OAP** translation + * overlay row (`horizon.<kind>.<key>.i18n.<locale>`), most-specific-wins + * per leaf: the OAP overlay value, else the English source. * - * Lookup precedence (most-specific wins per leaf): - * 1. `oapOverlay` — operator's pushed translations from the sibling - * OAP row. - * 2. `diskOverlay` — sibling `*.i18n.<lang>.json` file from disk. - * 3. Source (English). + * Runtime is REMOTE-only. The disk `*.i18n.<lang>.json` files are + * seed/reset defaults — boot-seed pushes each as a sibling OAP overlay + * row — NOT a render-time fill. So a key the OAP overlay doesn't carry + * falls through to **English**, never to the disk-shipped translation; + * the bundled overlay reaches the UI only by being synced to OAP, exactly + * like bundled templates. (Operators who want the full shipped + * translation reset-to-bundled, which re-seeds the OAP row.) * - * The merge is per-leaf: where the OAP overlay has a value the OAP - * value wins; where it's blank or absent the disk overlay fills in; - * where neither has it the English source falls through. So operators - * never lose disk-shipped translations by pushing one row — they - * compose. - * - * Defensive: any embedded `i18n` block on the source content is - * stripped before the merge. The split layout never writes embedded - * i18n; this guard exists only to keep us safe against hand-edited - * rows or future regressions. + * Defensive: any embedded `i18n` block on the source content is stripped + * before the merge — the split layout never writes embedded i18n. 
*/ -export function localizeContent<T>( - content: T, - oapOverlay: unknown, - diskOverlay: unknown, - locale: string, -): T { +export function localizeContent<T>(content: T, oapOverlay: unknown, locale: string): T { let source: T = content; if (content !== null && typeof content === 'object' && !Array.isArray(content)) { const record = content as unknown as Record<string, unknown>; @@ -117,16 +106,6 @@ export function localizeContent<T>( source = rest as unknown as T; } } - if (locale === 'en') return source; - // Apply disk first, then OAP — OAP wins where it has a value, disk - // fills the gaps. mergeLocalizedNode preserves source-shape and - // falls through to the previous-layer value on empty/missing. - let merged: T = source; - if (diskOverlay !== null && diskOverlay !== undefined) { - merged = mergeLocalizedNode(merged, diskOverlay) as T; - } - if (oapOverlay !== null && oapOverlay !== undefined) { - merged = mergeLocalizedNode(merged, oapOverlay) as T; - } - return merged; + if (locale === 'en' || oapOverlay === null || oapOverlay === undefined) return source; + return mergeLocalizedNode(source, oapOverlay) as T; } diff --git a/apps/bff/src/logic/layers/loader.ts b/apps/bff/src/logic/layers/loader.ts index db7d3e6..768cf93 100644 --- a/apps/bff/src/logic/layers/loader.ts +++ b/apps/bff/src/logic/layers/loader.ts @@ -145,17 +145,12 @@ export interface OverviewGroup { /** * Overview-tile config. `groups` is the canonical shape; legacy - * `metrics`/`throughput`/`spark` are migrated to a single auto-size - * group at load time. + * `metrics` is migrated to a single auto-size group at load time. */ export interface LayerOverviewConfig { groups?: OverviewGroup[]; /** @deprecated — wrapped into a single auto-size group on load. */ metrics?: OverviewMetric[]; - /** @deprecated — migrated to the first group's first metric. */ - throughput?: string; - /** @deprecated — sparkline follows the headline metric. 
*/ - spark?: string; } /** @@ -468,32 +463,13 @@ function load(): Map<string, LayerTemplate> { if (!parsed.header) parsed.header = { columns: [] }; parsed.metrics = parsed.header; - // Legacy: `metrics.throughput` + `metrics.spark` used to sit - // inside the header block. Promote to top-level `overview` so the - // tile rendering doesn't have to know about either spelling. - const legacyMetrics = parsed.metrics as - | (LayerHeaderConfig & { throughput?: string; spark?: string }) - | undefined; - if (legacyMetrics && (legacyMetrics.throughput || legacyMetrics.spark)) { - const ov: LayerOverviewConfig = { ...(parsed.overview ?? {}) }; - if (!ov.throughput && legacyMetrics.throughput) ov.throughput = legacyMetrics.throughput; - if (!ov.spark && legacyMetrics.spark) ov.spark = legacyMetrics.spark; - parsed.overview = ov; - delete legacyMetrics.throughput; - delete legacyMetrics.spark; - } - // Overview tile schema: self-contained `OverviewMetric[]`. Support - // three input shapes and normalize to the new one: + // two input shapes and normalize to the new one: // 1. `metrics: OverviewMetric[]` ← new shape, pass through. // 2. `metrics: string[]` ← previous shape (key refs // into the header columns); resolve each ref to a full entry. - // 3. `throughput` / `spark` strings ← oldest shape; resolve same - // way the column-ref path does. if (parsed.overview) { const ov = parsed.overview as LayerOverviewConfig & { - throughput?: string; - spark?: string; metrics?: unknown; }; const columns = parsed.header?.columns ?? []; @@ -530,10 +506,6 @@ function load(): Map<string, LayerTemplate> { } } } - if (resolved.length === 0) { - if (ov.throughput) resolved.push(fromRef(ov.throughput)); - if (ov.spark && ov.spark !== ov.throughput) resolved.push(fromRef(ov.spark)); - } // Assign auto-ids to any unkeyed entry. The id is what the SPA // threads through the landing query as the synthetic column key. resolved = resolved.map((m, i) => ({ id: m.id ?? 
`ov_${i}`, ...m })); @@ -564,8 +536,6 @@ function load(): Map<string, LayerTemplate> { // Keep the legacy `metrics` array in sync with the flattened // groups so any caller still reading the old field keeps working. ov.metrics = (ov.groups ?? []).flatMap((g) => g.metrics); - delete ov.throughput; - delete ov.spark; } out.set(parsed.key.toUpperCase(), parsed); } diff --git a/apps/bff/src/logic/templates/overlay.ts b/apps/bff/src/logic/templates/overlay.ts index f72479f..2f755e8 100644 --- a/apps/bff/src/logic/templates/overlay.ts +++ b/apps/bff/src/logic/templates/overlay.ts @@ -19,15 +19,14 @@ * Resolve the live **OAP translation overlay** content for one * (kind, key, locale) — the operator-pushed per-locale row. * - * The disk `*.i18n.<lang>.json` files are only seeds; the live - * translation is an OAP overlay row. The config bundle already merges - * this row on top of the disk overlay (`localizeContent(content, - * oapOverlay, diskOverlay, locale)`), so render routes that localize - * with the disk overlay ALONE show stale text once an operator pushes a - * translation. This helper gives those direct routes the same remote- - * first overlay the bundle uses. Reads the shared 30s sync cache, so - * it's cheap on the hot path; soft-fails to `null` (disk-only) on - * English / no-client / OAP-unreachable / parse error. + * Runtime translation is REMOTE-only: `localizeContent(content, + * oapOverlay, locale)` applies this OAP overlay over the English source, + * and nothing else — the disk `*.i18n.<lang>.json` files are seed / reset + * / diff sources, never a render-time fill (same remote-first rule as + * bundled templates). This helper gives the direct render routes the same + * OAP overlay the config bundle uses. Reads the shared 30s sync cache, so + * it's cheap on the hot path; soft-fails to `null` (→ English) on English + * / no-client / OAP-unreachable / parse error. 
*/ import type { UITemplateClient } from '@skywalking-horizon-ui/api-client'; diff --git a/apps/ui/src/api/client.ts b/apps/ui/src/api/client.ts index 6f714a6..7d82380 100644 --- a/apps/ui/src/api/client.ts +++ b/apps/ui/src/api/client.ts @@ -37,6 +37,7 @@ import type { DashboardWidget, DslDebuggingStatus, EndpointDependencyConfig, + LayerOverviewConfig, LocalState, MetricRow, ProcessTopologyConfig, @@ -288,10 +289,10 @@ export interface AdminLayerTemplate { precision?: number; }>; }; - overview?: { - throughput?: string; - spark?: string; - }; + /** Overview-tile config (group list). Edited on the Overview-templates + * admin, not here — surfaced for the translation preview + so a + * round-trip save preserves it. */ + overview?: LayerOverviewConfig; widgets: DashboardWidget[]; topology?: TopologyConfig; endpointDependency?: EndpointDependencyConfig; diff --git a/apps/ui/src/api/scopes/layer.test.ts b/apps/ui/src/api/scopes/layer.test.ts index 5357577..d7752c7 100644 --- a/apps/ui/src/api/scopes/layer.test.ts +++ b/apps/ui/src/api/scopes/layer.test.ts @@ -47,8 +47,6 @@ describe('LayerApi.landing', () => { topN: 5, orderBy: 'cpm', columns: [{ metric: 'cpm', label: 'CPM', mqe: 'service_cpm', aggregation: 'sum' }], - spark: { metric: 'service_cpm', height: 18 }, - throughput: { metric: 'cpm', mqe: 'service_cpm', label: 'CPM', unit: 'rpm' }, style: 'table', }, { step: 'MINUTE', startMs: 1, endMs: 2 }, @@ -59,10 +57,10 @@ describe('LayerApi.landing', () => { expect(calls[0][2]).toMatchObject({ topN: 5, orderBy: 'cpm', + columns: [{ metric: 'cpm', label: 'CPM', mqe: 'service_cpm', aggregation: 'sum' }], step: 'MINUTE', startMs: 1, endMs: 2, - spark: { metric: 'service_cpm', height: 18 }, }); }); diff --git a/apps/ui/src/api/scopes/layer.ts b/apps/ui/src/api/scopes/layer.ts index f434fdd..d97744f 100644 --- a/apps/ui/src/api/scopes/layer.ts +++ b/apps/ui/src/api/scopes/layer.ts @@ -52,8 +52,6 @@ export class LayerApi { topN: cfg.topN, orderBy: cfg.orderBy, columns: 
cfg.columns, - ...(cfg.spark ? { spark: cfg.spark } : {}), - ...(cfg.throughput ? { throughput: cfg.throughput } : {}), }; if (range) { body.step = range.step; diff --git a/apps/ui/src/controls/configBundle.ts b/apps/ui/src/controls/configBundle.ts index 3772157..c9513b6 100644 --- a/apps/ui/src/controls/configBundle.ts +++ b/apps/ui/src/controls/configBundle.ts @@ -155,7 +155,28 @@ export function ensureConfigBundle(): Promise<void> { 'err', `Config preload failed: ${err instanceof Error ? err.message : String(err)}`, ); - // Don't rethrow — the SPA falls back to per-page network reads. + // Don't rethrow — but DO unblock the shell. A network / non-2xx + // failure with no cached copy leaves `state` null, and the AppShell + // waits on `loaded` before rendering ANY route — so the app would + // hang on "Initializing…" forever. Seed an empty, unreachable bundle + // so `loaded` flips true: routes render, per-page reads + the + // connectivity banner take over, and the banner's retry can recover. 
+ if (state.value === null) { + const now = Date.now(); + state.value = { + etag: '', + generatedAt: now, + layers: {}, + overviews: [], + syncStatus: { + unreachable: true, + lastSuccessfulSyncAt: null, + generatedAt: now, + badges: [], + conflicts: [], + }, + }; + } } })(); return loadPromise; diff --git a/apps/ui/src/features/admin/layer-templates/LayerDashboardsAdmin.vue b/apps/ui/src/features/admin/layer-templates/LayerDashboardsAdmin.vue index 1ede4eb..4dd7e6e 100644 --- a/apps/ui/src/features/admin/layer-templates/LayerDashboardsAdmin.vue +++ b/apps/ui/src/features/admin/layer-templates/LayerDashboardsAdmin.vue @@ -53,7 +53,6 @@ import { buildExportEnvelope, downloadJson, pickJsonFile, validateImport } from import { usePreviewOverride } from '@/controls/previewOverride'; import TimeChart from '@/components/charts/TimeChart.vue'; import TopList from '@/components/charts/TopList.vue'; -import Sparkline from '@/components/charts/Sparkline.vue'; import { fmtMetric } from '@/utils/formatters'; import { stableStringify } from '@/utils/stableJson'; import { mockCardValue, mockLineSeries, mockRecordRows, mockTopGroups } from './widget-mock'; @@ -1222,9 +1221,7 @@ const TRACE_SOURCE_OPTIONS: Array<{ value: TraceSource; label: string; hint: str * honors at runtime to pick the trace backend. */ /** - * Metrics block editor — drives the service-list columns + default - * sort. Overview-only fields (throughput, spark) live in a separate - * block, so they're edited in their own card. + * Metrics block editor — drives the service-list columns + default sort. 
*/ function ensureMetrics(): NonNullable<AdminLayerTemplate['metrics']> { if (!draft.template) throw new Error('no template selected'); @@ -1233,23 +1230,11 @@ function ensureMetrics(): NonNullable<AdminLayerTemplate['metrics']> { } return draft.template.metrics as NonNullable<AdminLayerTemplate['metrics']>; } -function ensureOverview(): NonNullable<AdminLayerTemplate['overview']> { - if (!draft.template) throw new Error('no template selected'); - if (!draft.template.overview) { - (draft.template as AdminLayerTemplate).overview = {}; - } - return draft.template.overview as NonNullable<AdminLayerTemplate['overview']>; -} const metricsModel = computed(() => { if (!draft.template) return null; ensureMetrics(); return draft.template.metrics as NonNullable<AdminLayerTemplate['metrics']>; }); -const overviewModel = computed(() => { - if (!draft.template) return null; - ensureOverview(); - return draft.template.overview as NonNullable<AdminLayerTemplate['overview']>; -}); const metricsColumns = computed(() => { if (!draft.template) return []; const m = ensureMetrics(); @@ -1290,36 +1275,6 @@ const effectiveOrderBy = computed( () => metricsModel.value?.orderBy ?? metricsColumns.value[0]?.metric, ); -type MetricColumn = NonNullable<NonNullable<AdminLayerTemplate['metrics']>['columns']>[number]; -/** Headline column for the landing KPI tile: explicit `overview.throughput`, - * else the default-sort column. */ -const kpiHeadlineCol = computed<MetricColumn | undefined>(() => { - const key = overviewModel.value?.throughput ?? effectiveOrderBy.value; - return metricsColumns.value.find((c) => c.metric === key) ?? metricsColumns.value[0]; -}); -/** Trend column for the sparkline: explicit `overview.spark`, else the - * headline column. */ -const kpiSparkCol = computed<MetricColumn | undefined>(() => { - const key = overviewModel.value?.spark ?? overviewModel.value?.throughput ?? effectiveOrderBy.value; - return metricsColumns.value.find((c) => c.metric === key) ?? 
kpiHeadlineCol.value; -}); -/** Headline number — fmtMetric of a mock aggregate for the headline - * column (the real tile shows the value bare; unit lives in the label). */ -const kpiHeadlineValue = computed(() => - kpiHeadlineCol.value ? fmtMetric(previewBase(1) * (kpiHeadlineCol.value.scale ?? 1)) : '—', -); -/** Mock sparkline series for the real Sparkline component — deterministic - * per spark-column so the trend reads as real movement without MQE. */ -const kpiSparkValues = computed<number[]>(() => { - const seed = (kpiSparkCol.value?.metric ?? 'x').length; - const n = 24; - const a: number[] = []; - for (let i = 0; i < n; i++) a.push(14 + Math.sin(i * 0.7 + seed) * 7 + Math.sin(i * 0.23 + seed) * 4); - return a; -}); -/** Two-letter mark for the header icon tile (mirrors the live header). */ -const previewInitials = computed(() => (selectedTpl.value?.key ?? '??').slice(0, 2).toUpperCase()); - /** * Scope-aware `visibleWhen` placeholder + hover hint. Two supported * predicate forms: @@ -2126,54 +2081,12 @@ const namingTest = computed<NamingTestResult>(() => { </tr> </tbody> </table> - <!-- Preview: the per-layer landing KPI tile (headline + trend) - at the head — its config picks WHICH service-list column - feeds the tile, no new metrics — then the service-list - sample table. Mock values, no MQE fired. --> + <!-- Preview: the service-list sample table — shows how the + configured columns render (label, scale, precision, unit, + default-sort marker). Mock values, no MQE fired. 
--> <div v-if="metricsColumns.length > 0" class="metrics-preview"> <div class="metrics-preview-head"> - Preview <span class="sub">how this layer’s landing KPI tile + service list render (sample data)</span> - </div> - <div v-if="overviewModel" class="kpi-block"> - <div class="kpi-config"> - <div class="kpi-config-title">Landing KPI tile <span class="sub">which column is the headline + trend</span></div> - <label> - <span>Headline (throughput)</span> - <select v-model="overviewModel.throughput"> - <option :value="undefined">(default sort)</option> - <option v-for="c in metricsColumns" :key="c.metric" :value="c.metric">{{ c.label || c.metric }}</option> - </select> - </label> - <label> - <span>Trend line (spark)</span> - <select v-model="overviewModel.spark"> - <option :value="undefined">(headline)</option> - <option v-for="c in metricsColumns" :key="c.metric" :value="c.metric">{{ c.label || c.metric }}</option> - </select> - </label> - </div> - <!-- Faithful copy of the real layer header (LayerShell): - icon tile + identity + the kpi-strip. Reuses the same - class names + the Sparkline component so the preview - matches the live page. 
--> - <header class="sw-card preview-layer-head"> - <div class="layer-id-row"> - <div class="icon-tile" :style="{ background: selectedTpl.color || 'var(--sw-fg-3)' }">{{ previewInitials }}</div> - <div class="identity-text"> - <div class="title-row"><h1>{{ selectedTpl.alias || selectedTpl.key }}</h1></div> - <div class="sub">{{ SAMPLE_SERVICES.length }} {{ (selectedTpl.slots?.services || 'services').toLowerCase() }}</div> - </div> - <div class="kpi-strip"> - <div class="kpi"> - <div class="kpi-label"> - {{ kpiHeadlineCol?.label || kpiHeadlineCol?.metric || '—' }}<span v-if="kpiHeadlineCol?.unit" class="unit">({{ kpiHeadlineCol.unit }})</span> - </div> - <div class="kpi-value">{{ kpiHeadlineValue }}</div> - <Sparkline class="kpi-spark" :values="kpiSparkValues" :width="84" :height="18" color="var(--sw-accent)" :stroke="1.25" /> - </div> - </div> - </div> - </header> + Preview <span class="sub">how this layer’s service list renders (sample data)</span> </div> <div class="metrics-preview-scroll"> <table class="sw-table preview-table"> @@ -4141,118 +4054,6 @@ const namingTest = computed<NamingTestResult>(() => { font-size: 10px; } -/* Landing KPI tile preview: config head + a faithful copy of the real - * layer header. The header classes below mirror LayerShell so the - * preview matches the live page (kept in sync intentionally). 
*/ -.kpi-block { - display: flex; - flex-direction: column; - gap: 12px; - padding: 4px 4px 14px; -} -.kpi-config { - display: flex; - flex-wrap: wrap; - align-items: flex-end; - gap: 10px; -} -.kpi-config-title { - flex: 1 0 100%; - font-size: 11px; - font-weight: 600; - color: var(--sw-fg-1); -} -.kpi-config-title .sub { - margin-left: 6px; - font-weight: 400; - font-size: 10px; - color: var(--sw-fg-3); -} -.kpi-config label { - display: flex; - flex-direction: column; - gap: 3px; - font-size: 10px; - text-transform: uppercase; - letter-spacing: 0.04em; - color: var(--sw-fg-3); -} -.kpi-config select { - height: 28px; - padding: 0 8px; - background: var(--sw-bg-2); - border: 1px solid var(--sw-line-2); - border-radius: 4px; - color: var(--sw-fg-0); - font: inherit; - font-size: 11.5px; -} -.kpi-config select:focus { outline: none; border-color: var(--sw-accent); } -/* ↓ mirrors LayerShell.vue .layer-head / .layer-id-row / .kpi-strip. */ -.preview-layer-head { padding: 14px; } -.preview-layer-head .layer-id-row { - display: flex; - align-items: center; - gap: 12px; - min-width: 0; -} -.preview-layer-head .icon-tile { - width: 40px; - height: 40px; - border-radius: 10px; - display: grid; - place-items: center; - color: #fff; - font-weight: 700; - font-size: 14px; - letter-spacing: -0.02em; - flex: 0 0 40px; - background-blend-mode: multiply; - box-shadow: inset 0 0 0 1px rgba(255, 255, 255, 0.08); -} -.preview-layer-head .identity-text { min-width: 0; } -.preview-layer-head .title-row h1 { - margin: 0; - font-size: 18px; - font-weight: 600; - color: var(--sw-fg-0); - letter-spacing: -0.02em; -} -.preview-layer-head .sub { - margin-top: 4px; - font-size: 11.5px; - color: var(--sw-fg-3); -} -.preview-layer-head .kpi-strip { - display: flex; - gap: 22px; - flex-wrap: wrap; - align-items: flex-end; - margin-left: auto; -} -.preview-layer-head .kpi { text-align: right; min-width: 80px; } -.preview-layer-head .kpi-label { - font-size: 10px; - text-transform: uppercase; - 
letter-spacing: 0.08em; - color: var(--sw-fg-3); - margin-bottom: 2px; -} -.preview-layer-head .kpi-label .unit { - text-transform: none; - letter-spacing: 0; - margin-left: 2px; - font-size: 9.5px; -} -.preview-layer-head .kpi-value { - font-size: 18px; - font-weight: 600; - font-variant-numeric: tabular-nums; - letter-spacing: -0.02em; - color: var(--sw-fg-1); -} -.preview-layer-head .kpi-spark { display: block; margin-top: 4px; margin-left: auto; } - /* Menu-label (slot alias) editor grid. */ .alias-grid { display: grid; diff --git a/apps/ui/src/i18n/locales/de.json b/apps/ui/src/i18n/locales/de.json index 9066b5c..6e6e161 100644 --- a/apps/ui/src/i18n/locales/de.json +++ b/apps/ui/src/i18n/locales/de.json @@ -361,6 +361,9 @@ "No matches": "Keine Treffer", "filter by name…": "nach Name filtern…", "{n} of {total}": "{n} von {total}", + "metrics: top {n}": "Metriken: Top {n}", + "low": "niedrig", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "Metriken werden nur für die Top-{n}-Dienste nach {metric} erfasst; die übrigen werden als verkehrsarm aufgeführt. Erhöhen Sie query.landingServiceCap, um mehr zu erfassen.", "No services match": "Keine Services passen", "Last 15 min": "Letzte 15 Min", "Last 30 min": "Letzte 30 Min", diff --git a/apps/ui/src/i18n/locales/en.json b/apps/ui/src/i18n/locales/en.json index 0230eac..cdda62a 100644 --- a/apps/ui/src/i18n/locales/en.json +++ b/apps/ui/src/i18n/locales/en.json @@ -365,6 +365,9 @@ "No matches": "No matches", "filter by name…": "filter by name…", "{n} of {total}": "{n} of {total}", + "metrics: top {n}": "metrics: top {n}", + "low": "low", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. 
Raise query.landingServiceCap to probe more.", "No services match": "No services match", "Last 15 min": "Last 15 min", "Last 30 min": "Last 30 min", diff --git a/apps/ui/src/i18n/locales/es.json b/apps/ui/src/i18n/locales/es.json index 28927a6..74f6fd6 100644 --- a/apps/ui/src/i18n/locales/es.json +++ b/apps/ui/src/i18n/locales/es.json @@ -361,6 +361,9 @@ "No matches": "Sin coincidencias", "filter by name…": "filtrar por nombre…", "{n} of {total}": "{n} de {total}", + "metrics: top {n}": "métricas: top {n}", + "low": "bajo", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "Las métricas se obtienen para los {n} servicios principales por {metric}; el resto se listan como de bajo tráfico. Aumente query.landingServiceCap para obtener más.", "No services match": "Ningún servicio coincide", "Last 15 min": "Últimos 15 min", "Last 30 min": "Últimos 30 min", diff --git a/apps/ui/src/i18n/locales/fr.json b/apps/ui/src/i18n/locales/fr.json index 254f381..3c2cabf 100644 --- a/apps/ui/src/i18n/locales/fr.json +++ b/apps/ui/src/i18n/locales/fr.json @@ -361,6 +361,9 @@ "No matches": "Aucune correspondance", "filter by name…": "filtrer par nom…", "{n} of {total}": "{n} sur {total}", + "metrics: top {n}": "métriques : top {n}", + "low": "faible", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "Les métriques sont collectées pour les {n} premiers services par {metric} ; les autres sont listés comme à faible trafic. 
Augmentez query.landingServiceCap pour en collecter davantage.", "No services match": "Aucun service ne correspond", "Last 15 min": "15 dernières minutes", "Last 30 min": "30 dernières minutes", diff --git a/apps/ui/src/i18n/locales/ja.json b/apps/ui/src/i18n/locales/ja.json index 6629ec3..9262f2e 100644 --- a/apps/ui/src/i18n/locales/ja.json +++ b/apps/ui/src/i18n/locales/ja.json @@ -361,6 +361,9 @@ "No matches": "該当なし", "filter by name…": "名前で絞り込み…", "{n} of {total}": "{total} 件中 {n} 件", + "metrics: top {n}": "指標:上位 {n}", + "low": "低い", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "{metric} の上位 {n} サービスのみ指標を取得しています。残りは低トラフィックとして表示されます。さらに取得するには query.landingServiceCap を上げてください。", "No services match": "一致するサービスがありません", "Last 15 min": "直近 15 分", "Last 30 min": "直近 30 分", diff --git a/apps/ui/src/i18n/locales/ko.json b/apps/ui/src/i18n/locales/ko.json index 069ade1..b422261 100644 --- a/apps/ui/src/i18n/locales/ko.json +++ b/apps/ui/src/i18n/locales/ko.json @@ -361,6 +361,9 @@ "No matches": "일치 항목 없음", "filter by name…": "이름으로 필터링…", "{n} of {total}": "{total}건 중 {n}건", + "metrics: top {n}": "지표: 상위 {n}", + "low": "낮음", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "{metric} 기준 상위 {n}개 서비스만 지표를 수집합니다. 나머지는 낮은 트래픽으로 표시됩니다. 
더 수집하려면 query.landingServiceCap을 높이세요.", "No services match": "일치하는 서비스가 없습니다", "Last 15 min": "최근 15분", "Last 30 min": "최근 30분", diff --git a/apps/ui/src/i18n/locales/pt.json b/apps/ui/src/i18n/locales/pt.json index b3514fe..824bae0 100644 --- a/apps/ui/src/i18n/locales/pt.json +++ b/apps/ui/src/i18n/locales/pt.json @@ -361,6 +361,9 @@ "No matches": "Nenhuma correspondência", "filter by name…": "filtrar por nome…", "{n} of {total}": "{n} de {total}", + "metrics: top {n}": "métricas: top {n}", + "low": "baixo", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "As métricas são coletadas para os {n} principais serviços por {metric}; os demais são listados como de baixo tráfego. Aumente query.landingServiceCap para coletar mais.", "No services match": "Nenhum serviço corresponde", "Last 15 min": "Últimos 15 min", "Last 30 min": "Últimos 30 min", diff --git a/apps/ui/src/i18n/locales/zh-CN.json b/apps/ui/src/i18n/locales/zh-CN.json index 83ee593..839fb80 100644 --- a/apps/ui/src/i18n/locales/zh-CN.json +++ b/apps/ui/src/i18n/locales/zh-CN.json @@ -361,6 +361,9 @@ "No matches": "无匹配", "filter by name…": "按名称过滤…", "{n} of {total}": "{n} / {total}", + "metrics: top {n}": "指标:前 {n}", + "low": "偏低", + "Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. Raise query.landingServiceCap to probe more.": "仅为按 {metric} 排名前 {n} 的服务采集指标;其余按低流量列出。如需采集更多,请调高 query.landingServiceCap。", "No services match": "无匹配的服务", "Last 15 min": "最近 15 分钟", "Last 30 min": "最近 30 分钟", diff --git a/apps/ui/src/layer/LayerServiceSelector.vue b/apps/ui/src/layer/LayerServiceSelector.vue index 8cb63aa..b96d101 100644 --- a/apps/ui/src/layer/LayerServiceSelector.vue +++ b/apps/ui/src/layer/LayerServiceSelector.vue @@ -51,6 +51,22 @@ const props = withDefaults( * "ActiveMQ clusters", "Databases"). Falls back to the generic * "Service" when the layer defines no alias. 
*/ serviceLabel?: string; + /** Full layer roster (id + name for EVERY service). When supplied, + * the picker lists the WHOLE layer — services beyond the + * metric-probed `services` set (those that ranked below + * `query.landingServiceCap` on the order-by metric) render as + * "low <orderBy>" tail rows instead of being hidden, so every + * service stays browsable / searchable / selectable regardless of + * the metric fan-out cap. */ + roster?: ReadonlyArray<{ id: string; name: string }>; + /** Order-by metric key — labels the tail rows ("low RPM"). */ + orderBy?: string; + /** Total services in the layer (landing aggregates). When it exceeds + * the probed `services` count, the landing capped the metric fan-out + * at `query.landingServiceCap`: the top rows carry metrics, the rest + * list as "low <orderBy>". Surfaced as a "metrics: top N" chip so + * the trim is never silent. */ + totalCount?: number; }>(), { accent: 'var(--sw-accent)', @@ -66,10 +82,47 @@ const emit = defineEmits<{ (e: 'select', id: string): void }>(); const filter = ref(''); const page = ref(0); +// A probed row carries landing metrics; a tail row is a roster-only +// service that ranked below the metric cap — id+name with no numbers. +type PickerRow = + | { kind: 'probed'; id: string; name: string; row: LandingServiceRow } + | { kind: 'tail'; id: string; name: string }; + +// Probed rows first (already sorted by orderBy from the BFF), then the +// roster tail that wasn't metric-probed. Without a roster prop the +// behaviour is unchanged — just the probed set. 
+const allRows = computed<PickerRow[]>(() => { + const probed: PickerRow[] = props.services.map((r) => ({ + kind: 'probed', + id: r.serviceId, + name: r.serviceName, + row: r, + })); + const roster = props.roster; + if (!roster || roster.length === 0) return probed; + const probedIds = new Set(probed.map((p) => p.id)); + const tail: PickerRow[] = roster + .filter((r) => !probedIds.has(r.id)) + .map((r) => ({ kind: 'tail', id: r.id, name: r.name })); + return [...probed, ...tail]; +}); + +// Display label for the order-by metric, used in the "low <metric>" +// tail message. Prefer the matching column's header; fall back to the +// catalog short label, then the raw key. +const orderByLabel = computed(() => { + const ob = props.orderBy; + if (!ob) return ''; + const col = props.columns.find((c) => c.metric === ob); + return col?.label ?? metricMeta(ob).label ?? ob; +}); + +const probedCount = computed(() => props.services.length); + const filtered = computed(() => { const q = filter.value.trim().toLowerCase(); - if (q.length === 0) return props.services; - return props.services.filter((s) => s.serviceName.toLowerCase().includes(q)); + if (q.length === 0) return allRows.value; + return allRows.value.filter((r) => r.name.toLowerCase().includes(q)); }); const pageCount = computed(() => Math.max(1, Math.ceil(filtered.value.length / props.pageSize))); const currentPage = computed(() => Math.min(page.value, pageCount.value - 1)); @@ -94,7 +147,12 @@ function colorForStatus(s: 'ok' | 'warn' | 'err'): string { spellcheck="false" autocomplete="off" /> - <span class="count">{{ t('{n} of {total}', { n: filtered.length, total: services.length }) }}</span> + <span class="count">{{ t('{n} of {total}', { n: filtered.length, total: allRows.length }) }}</span> + <span + v-if="allRows.length > probedCount" + class="count capped" + :title="t('Metrics are probed for the top {n} services by {metric}; the rest are listed as low-traffic. 
Raise query.landingServiceCap to probe more.', { n: probedCount, metric: orderByLabel })" + >{{ t('metrics: top {n}', { n: probedCount }) }}</span> </header> <table class="sw-table picker-table"> <thead> @@ -111,31 +169,61 @@ function colorForStatus(s: 'ok' | 'warn' | 'err'): string { </tr> </thead> <tbody> - <tr - v-for="row in visible" - :key="row.serviceId" - class="row" - :class="{ active: row.serviceId === selectedId }" - @click="emit('select', row.serviceId)" - > - <td class="svc-col" :title="row.serviceName"> - <span class="pulse" :style="{ background: colorForStatus(statusForMetrics(row.metrics)) }" /> - <span v-if="identity(row.serviceName).cluster" class="group-chip"> - <span class="chip-alias">{{ identity(row.serviceName).clusterAlias }}</span> - <span class="chip-val">{{ identity(row.serviceName).cluster }}</span> - </span> - <span class="name-text">{{ row.shortName || identity(row.serviceName).display }}</span> - </td> - <td - v-for="c in columns" - :key="c.metric" - class="num" - :class="{ muted: row.metrics[c.metric] == null }" - :style="{ color: thresholdColor(c.metric, row.metrics[c.metric] ?? null) ?? undefined }" + <template v-for="r in visible" :key="r.id"> + <tr + v-if="r.kind === 'probed'" + class="row" + :class="{ active: r.id === selectedId }" + @click="emit('select', r.id)" > - {{ fmtMetric(row.metrics[c.metric]) }} - </td> - </tr> + <td class="svc-col" :title="r.name"> + <span class="pulse" :style="{ background: colorForStatus(statusForMetrics(r.row.metrics)) }" /> + <span v-if="identity(r.name).cluster" class="group-chip"> + <span class="chip-alias">{{ identity(r.name).clusterAlias }}</span> + <span class="chip-val">{{ identity(r.name).cluster }}</span> + </span> + <span class="name-text">{{ r.row.shortName || identity(r.name).display }}</span> + </td> + <td + v-for="c in columns" + :key="c.metric" + class="num" + :class="{ muted: r.row.metrics[c.metric] == null }" + :style="{ color: thresholdColor(c.metric, r.row.metrics[c.metric] ?? 
null) ?? undefined }" + > + {{ fmtMetric(r.row.metrics[c.metric]) }} + </td> + </tr> + <!-- Below the metric cap: roster-only row, no numbers probed. + Still selectable — picking it loads that service's own + dashboard (which probes per-service metrics directly). --> + <tr + v-else + class="row tail" + :class="{ active: r.id === selectedId }" + @click="emit('select', r.id)" + > + <td class="svc-col" :title="r.name"> + <span class="pulse tail-dot" /> + <span v-if="identity(r.name).cluster" class="group-chip"> + <span class="chip-alias">{{ identity(r.name).clusterAlias }}</span> + <span class="chip-val">{{ identity(r.name).cluster }}</span> + </span> + <span class="name-text">{{ identity(r.name).display }}</span> + </td> + <!-- "low" sits in the order-by column (that's the metric it + ranked low on); the others show "—" — they were never + probed, so they're unknown, not zero. --> + <td + v-for="c in columns" + :key="c.metric" + class="num" + :class="{ 'low-tail': c.metric === orderBy, muted: c.metric !== orderBy }" + > + {{ c.metric === orderBy ? t('low') : '—' }} + </td> + </tr> + </template> <tr v-if="visible.length === 0"> <td :colspan="columns.length + 1" class="empty"> {{ t('No services match') }} <code>{{ filter }}</code>. @@ -188,6 +276,11 @@ function colorForStatus(s: 'ok' | 'warn' | 'err'): string { margin-left: auto; font-variant-numeric: tabular-nums; } +.count.capped { + margin-left: 8px; + color: var(--sw-warn); + cursor: help; +} .picker-table { width: 100%; font-size: 11.5px; @@ -217,6 +310,23 @@ function colorForStatus(s: 'ok' | 'warn' | 'err'): string { .picker-table td.muted { color: var(--sw-fg-3); } +.picker-table tr.tail td { + /* The unprobed long-tail reads a notch quieter than measured rows. 
*/
+  opacity: 0.8;
+}
+.picker-table td.low-tail {
+  text-align: right;
+  color: var(--sw-fg-3);
+  font-size: 10.5px;
+  font-style: italic;
+  letter-spacing: 0.02em;
+}
+.pulse.tail-dot {
+  /* Neutral grey, never green: this service was NOT measured, so its
+     dot must not read as "healthy". */
+  background: var(--sw-fg-3);
+  opacity: 0.5;
+}
 .picker-table tr.row {
   cursor: pointer;
 }
diff --git a/apps/ui/src/layer/LayerShell.vue b/apps/ui/src/layer/LayerShell.vue
index fbf4e3c..40ed0e6 100644
--- a/apps/ui/src/layer/LayerShell.vue
+++ b/apps/ui/src/layer/LayerShell.vue
@@ -29,7 +29,7 @@
 import { computed, onBeforeUnmount, ref, watch } from 'vue';
 import { useQuery } from '@tanstack/vue-query';
 import { RouterLink, RouterView, useRoute, useRouter } from 'vue-router';
-import type { LayerDef } from '@skywalking-horizon-ui/api-client';
+import type { LayerDef, LandingServiceRow } from '@skywalking-horizon-ui/api-client';
 import { bffClient } from '@/api/client';
 import { useAuthStore } from '@/state/auth';
 import Icon from '@/components/icons/Icon.vue';
@@ -268,13 +268,37 @@ const aggregates = computed(() =>
 // null or no longer in the roster.
 const { selectedId, setSelected } = useSelectedService();
 const sampledServices = computed(() => landing.data.value?.sampledRows ?? landing.rows.value ?? []);
-const selectorColumns = computed(() => safeCfg.value.columns);
-const selectedRow = computed(
-  () =>
-    sampledServices.value.find((s) => s.serviceId === selectedId.value) ??
-    sampledServices.value[0] ??
-    null,
+// Total services in the layer — when it exceeds the probed sample, the
+// landing capped the metric fan-out (query.landingServiceCap) and the
+// selector surfaces "top N of M".
+const layerServiceTotal = computed<number | undefined>(
+  () => landing.data.value?.aggregates?.serviceCount ?? undefined,
 );
+const selectorColumns = computed(() => safeCfg.value.columns);
+// Full service roster (the layer's REAL catalog, independent of landing's
+// top-N sample which misses low-traffic services / anything beyond the
+// landingServiceCap). A URL `?service=` is validated against THIS, not the
+// sample, and `selectedRow` resolves an off-sample selection from it.
+const { services: fullRoster, isLoading: rosterLoading } = useLayerServices(layerKey);
+const selectedRow = computed<LandingServiceRow | null>(() => {
+  const id = selectedId.value;
+  if (id) {
+    // The sampled row carries metrics — prefer it.
+    const inSample = sampledServices.value.find((s) => s.serviceId === id);
+    if (inSample) return inSample;
+    // Valid but off-sample (low-traffic, or beyond landingServiceCap):
+    // resolve the NAME from the full roster so the header + KPI strip
+    // reflect the ACTUAL selected service the dashboard queries — not the
+    // top sampled one. Metrics aren't in the sample, so the KPIs show
+    // dashes (honest "not probed") rather than another service's numbers.
+    const inRoster = fullRoster.value.find((s) => s.id === id);
+    if (inRoster) {
+      return { serviceId: inRoster.id, serviceName: inRoster.name, metrics: {} };
+    }
+  }
+  // No (or genuinely stale) selection → default to the first sampled row.
+  return sampledServices.value[0] ?? null;
+});
 const selectedParsed = computed(() => parseServiceName(selectedRow.value?.serviceName));
 const selectedGroup = computed(() => selectedParsed.value.group);
 // Switch-button label — base name only when the service has a group
@@ -304,12 +328,6 @@ const isZipkinTrace = computed<boolean>(() => {
   return scopeSegment.value === 'trace' && layer.value?.traces?.source === 'zipkin';
 });

-// Full service roster (the layer's REAL catalog, independent of landing's
-// top-N sample which misses low-traffic services). A URL `?service=` is
-// validated against THIS, not the sample — otherwise a valid but
-// low-traffic deep link is wrongly treated as stale.
-const { services: fullRoster, isLoading: rosterLoading } = useLayerServices(layerKey);
-
 // Keep the URL-backed service selection honest for every page that
 // uses the shell picker. A `?service=` outside the landing sample is
 // trusted when it exists in the full roster; only a genuinely stale id
@@ -585,13 +603,16 @@ const serviceKpis = computed<HeaderKpi[]>(() => {
       Sits below the General header so the page reads top-to-bottom:
       layer identity → expanded service picker → sub-route body. -->
     <LayerServiceSelector
-      v-if="layer && pickerOpen && sampledServices.length > 0 && !viewOwnsServiceSelector"
+      v-if="layer && pickerOpen && (sampledServices.length > 0 || fullRoster.length > 0) && !viewOwnsServiceSelector"
       :services="sampledServices"
+      :roster="fullRoster"
+      :order-by="safeCfg.orderBy"
       :columns="selectorColumns"
       :selected-id="selectedId"
       :accent="layer.color"
       :naming-rule="layer.naming ?? null"
       :service-label="layer.slots.services"
+      :total-count="layerServiceTotal"
       @select="pickService"
     />

diff --git a/apps/ui/src/layer/useLayerLanding.ts b/apps/ui/src/layer/useLayerLanding.ts
index 246b21e..16ccca7 100644
--- a/apps/ui/src/layer/useLayerLanding.ts
+++ b/apps/ui/src/layer/useLayerLanding.ts
@@ -52,8 +52,6 @@ export function useLayerLanding(
     topN: cfg.value.topN,
     orderBy: cfg.value.orderBy,
     columns: cfg.value.columns,
-    spark: cfg.value.spark,
-    throughput: cfg.value.throughput,
   }));
   const rangeRef = range ?? computed<LandingRange | null>(() => null);
   const rangeKey = computed(() => {
diff --git a/apps/ui/src/render/layer-dashboard/useLayerDashboard.ts b/apps/ui/src/render/layer-dashboard/useLayerDashboard.ts
index 7b86a2b..eacc193 100644
--- a/apps/ui/src/render/layer-dashboard/useLayerDashboard.ts
+++ b/apps/ui/src/render/layer-dashboard/useLayerDashboard.ts
@@ -159,7 +159,11 @@ export function useLayerDashboard(
       entityRefs.instance ?? computed(() => null),
       entityRefs.endpoint ?? computed(() => null),
       rangeKey,
-      computed(() => widgetsList?.value.map((w) => w.id).join('|') ?? null),
+      // Key on the FULL widget config, not just ids: a remote sync or
+      // preview edit that keeps a widget's id but changes its MQE
+      // expressions / type must refire — an id-only key would serve the
+      // stale (wrong-expression) data from cache.
+      computed(() => (widgetsList?.value ? JSON.stringify(widgetsList.value) : null)),
     ],
     queryFn: async () => {
       const total = widgetsList?.value.length ?? 0;
diff --git a/apps/ui/src/render/overview/useOverviewDashboard.ts b/apps/ui/src/render/overview/useOverviewDashboard.ts
index 5210747..c344ba0 100644
--- a/apps/ui/src/render/overview/useOverviewDashboard.ts
+++ b/apps/ui/src/render/overview/useOverviewDashboard.ts
@@ -143,7 +143,10 @@ export function useOverviewDashboard(idRef: Ref<string>) {
     const entries = Array.from(layerRequests.value.entries());
     const range = rangeKey.value;
     return entries.map(([layer, reqs]) => ({
-      queryKey: ['overview-dashboard-data', idRef.value, layer, range],
+      // Include the MQE column set (`reqs`), not just the overview id:
+      // a remote sync or preview edit that keeps the id but changes a
+      // widget's MQE must refire, or the cache serves stale data.
+      queryKey: ['overview-dashboard-data', idRef.value, layer, range, JSON.stringify(reqs)],
       queryFn: () => {
         /* Service-count KPIs read from `aggregates.serviceCount`
          * — strip them from the MQE column list to avoid sending
diff --git a/apps/ui/src/state/setup.ts b/apps/ui/src/state/setup.ts
index b596bb2..22e506f 100644
--- a/apps/ui/src/state/setup.ts
+++ b/apps/ui/src/state/setup.ts
@@ -167,7 +167,6 @@ export function defaultLandingFor(
   }
   for (const c of ovCols) combined.push(c);

-  const headline = ovIds[0] ?? orderBy;
   return {
     priority: defaultPriority(layerKey),
     topN: 5,
@@ -180,11 +179,6 @@ export function defaultLandingFor(
     // metrics are SERVICE_INSTANCE-only and show as `—` on the
     // Service page header).
     headerColumns: headerCols,
-    spark: { metric: headline, height: 28 },
-    throughput: {
-      metric: headline,
-      aggregation: defaultAggregationFor(headline),
-    },
     // Flat list of overview metric ids (legacy back-compat for any
     // code path still reading it). Same set, flattened from groups.
     overviewMetrics: ovIds.length > 0 ? ovIds : [orderBy],
diff --git a/apps/ui/src/utils/metricCatalog.ts b/apps/ui/src/utils/metricCatalog.ts
index df555ae..07a265f 100644
--- a/apps/ui/src/utils/metricCatalog.ts
+++ b/apps/ui/src/utils/metricCatalog.ts
@@ -382,8 +382,6 @@ interface DefaultLandingSet {
   columns: Array<{ metric: string; label?: string; unit?: string }>;
   /** Metric key used to rank the top-N. */
   orderBy: string;
-  /** Sparkline metric (defaults to `orderBy` when omitted). */
-  spark?: string;
 }

 const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
@@ -412,7 +410,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'k8s.restart' },
     ],
     orderBy: 'k8s.cpu',
-    spark: 'k8s.cpu',
   },
   browser: {
     columns: [
@@ -422,7 +419,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'browser.js-err' },
     ],
     orderBy: 'browser.pv',
-    spark: 'browser.pv',
   },
   database: {
     columns: [
@@ -432,7 +428,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'db.conn' },
     ],
     orderBy: 'db.qps',
-    spark: 'db.qps',
   },
   mq: {
     columns: [
@@ -441,7 +436,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'mq.consumer-lag' },
     ],
     orderBy: 'mq.msg-rate',
-    spark: 'mq.msg-rate',
   },
   faas: {
     columns: [
@@ -451,7 +445,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'err' },
     ],
     orderBy: 'faas.invocations',
-    spark: 'faas.invocations',
   },
   genai: {
     columns: [
@@ -460,7 +453,6 @@ const LAYER_TYPE_DEFAULTS: Record<LayerCategory, DefaultLandingSet> = {
       { metric: 'genai.latency' },
     ],
     orderBy: 'genai.req',
-    spark: 'genai.tokens',
   },
 };

@@ -485,11 +477,6 @@ export function defaultOrderByForLayer(layerKey: string): string {
   return LAYER_TYPE_DEFAULTS[layerCategory(layerKey)].orderBy;
 }

-/** Sparkline metric for the landing card (falls back to the order-by key). */
-export function defaultSparkForLayer(layerKey: string): string {
-  const set = LAYER_TYPE_DEFAULTS[layerCategory(layerKey)];
-  return set.spark ?? set.orderBy;
-}

 /**
  * Generic RPC-shaped metrics every layer can render — surfaced as a
diff --git a/docs/setup/horizon-yaml.md b/docs/setup/horizon-yaml.md
index c2f8840..9a99286 100644
--- a/docs/setup/horizon-yaml.md
+++ b/docs/setup/horizon-yaml.md
@@ -14,6 +14,7 @@ This page is the top-level map. Each subsection has its own detail page:
 | `audit` | Audit log file path. | [audit](audit.md) |
 | `setup` / `alarms` | State file paths. | [files](files.md) |
 | `debugLog` | Wire-level request/response log for troubleshooting. | [debugLog](debug-log.md) |
+| `query` | Per-request query limits (the layer-landing service cap). | [below](#query-limits) |

 ## Top-level shape

@@ -93,6 +94,40 @@ Two changes require a process restart:
 - `server.host`, `server.port` — the listener already bound.
 - Capability probes — the OAP schema introspection cache is per-process.

+## Query limits
+
+```yaml
+query:
+  landingServiceCap: 100 # default
+```
+
+`query.landingServiceCap` bounds how many services a **layer landing** runs
+column-metric MQE for, per request. The service picker always **lists every
+service** in the layer, but only fetches metric columns for up to this many —
+and when a layer has more, the BFF runs one cheap single-metric pass (the
+landing's order-by column over every service) to pick the **true top-N**, then
+fetches the full columns for just those. Services below the cap still appear in
+the picker, showing **`low`** in the order-by column (and `—` for the others,
+which were never probed) — every service stays browsable and selectable. The
+picker header reads **"metrics: top N"** so the metric trim is never silent.
+
+- **Default `100`.** Most layers have fewer services and render in full.
+- **Raise it** (e.g. `300`, `500`) if your OAP and storage backend can take
+  the larger fan-out and you want metrics for more services at once.
+- **Lower it** to protect a modest deployment from heavy landings.
+
+**What it bounds.** The cap limits the **full-column** MQE fan-out (the
+expensive part — every configured column × service). When a layer exceeds
+it, the **true top-N** is found by a single cheap pass that evaluates only
+the order-by column for every service — so on a very large layer that one
+ranking pass still scales with the service count (it's one metric, batched
+through a bounded-concurrency pool, not the full column set). The cap is
+therefore a bound on the *expensive* fan-out, not a hard ceiling on total
+OAP traffic. If you need a hard ceiling on a pathological layer, lower the
+cap and pair it with a tighter OAP rate limit.
+
+Hot-reloadable — a change takes effect on the next landing request.
+
 ## Cross-references

 - A field that affects user-visible behavior at runtime is also visible on **Admin → Auth Status** (`/admin/auth-status`) for live verification — see [Admin Pages](../access-control/admin-pages.md).
diff --git a/packages/api-client/src/index.ts b/packages/api-client/src/index.ts
index 74ad536..5a35336 100644
--- a/packages/api-client/src/index.ts
+++ b/packages/api-client/src/index.ts
@@ -36,7 +36,6 @@ export type {
   LayerConfig,
   SetupResponse,
   SetupSavePayload,
-  ThroughputConfig,
 } from './setup.js';
 export type { LandingAggregates, LandingResponse, LandingServiceRow } from './landing.js';
 export type {
diff --git a/packages/api-client/src/landing.ts b/packages/api-client/src/landing.ts
index 88d2757..48fb2ef 100644
--- a/packages/api-client/src/landing.ts
+++ b/packages/api-client/src/landing.ts
@@ -34,12 +34,6 @@ export interface LandingServiceRow {
    * renders `null` as a muted em-dash.
    */
   metrics: Record<string, number | null>;
-  /**
-   * Sparkline series for `cfg.spark.metric`, when configured. Same order
-   * as the `step` buckets returned by OAP — left-to-right oldest-to-newest.
-   * `null` entries mark missing samples.
-   */
-  spark?: Array<number | null>;
 }

 /**
@@ -59,15 +53,6 @@ export interface LandingAggregates {
    * metrics, avg for ratio/latency metrics. Used by the per-layer
    * header to render a trend line under each KPI. */
   seriesByMetric: Record<string, Array<number | null>>;
-  /** Aggregated sparkline series for the `throughput.metric` (or `spark`)
-   * using the throughput aggregation. `null` when not configured. */
-  spark?: Array<number | null> | null;
-  /** Echo of the throughput metric key the spark series was computed
-   * against (so the UI can label the tile). */
-  throughputMetric?: string;
-  /** Value of the throughput metric across the layer (null when
-   * unconfigured or unmapped). */
-  throughputValue?: number | null;
 }

 export interface LandingResponse {
@@ -83,10 +68,10 @@ export interface LandingResponse {
   durationEnd: string;
   rows: LandingServiceRow[];
   /**
-   * All services the BFF probed for this layer (up to its internal cap,
-   * currently 25). `rows` is a sorted+sliced subset of this — the Overview
-   * card uses `rows`, the per-layer constellation / table uses the full
-   * `sampledRows` so deep-dive views don't lose context.
+   * Every service the BFF probed for this layer (up to `query.landingServiceCap`,
+   * default 100; the true top-N by `orderBy` when the layer exceeds it).
+   * `rows` is a sorted+sliced subset (the top-`topN`); the per-layer service
+   * picker uses the full `sampledRows` so it can list the whole layer.
    */
   sampledRows?: LandingServiceRow[];
   /** Whole-layer rollup KPIs for the Overview strip tile. */
diff --git a/packages/api-client/src/menu.ts b/packages/api-client/src/menu.ts
index c55f161..d2f7f7c 100644
--- a/packages/api-client/src/menu.ts
+++ b/packages/api-client/src/menu.ts
@@ -199,16 +199,11 @@ export interface OverviewGroup {
  * - `metrics: string[]` of column-key refs → resolved against
  *                                             `layer-header.columns` then
  *                                             wrapped as above.
- * - `throughput` / `spark` (oldest) → resolved same way.
  */
 export interface LayerOverviewConfig {
   groups?: OverviewGroup[];
   /** @deprecated — wrapped into a single auto-size group on load. */
   metrics?: OverviewMetric[];
-  /** @deprecated — migrated to the first group's first metric. */
-  throughput?: string;
-  /** @deprecated — sparkline follows the headline metric. */
-  spark?: string;
 }

 export interface LayerDef {
diff --git a/packages/api-client/src/setup.ts b/packages/api-client/src/setup.ts
index 43b091e..5397907 100644
--- a/packages/api-client/src/setup.ts
+++ b/packages/api-client/src/setup.ts
@@ -74,25 +74,6 @@ export interface LandingColumn {
   precision?: number;
 }

-/**
- * Headline throughput metric for the per-layer KPI strip tile. Optional —
- * when omitted, the strip falls back to the `orderBy` column's value
- * (also aggregated per the column's `aggregation` field).
- */
-export interface ThroughputConfig {
-  /** Short metric key (must match a column or stand alone). */
-  metric: string;
-  /** Display label override (default falls through to the metric catalog). */
-  label?: string;
-  unit?: string;
-  /** MQE override — same semantics as `LandingColumn.mqe`. */
-  mqe?: string;
-  /** Aggregation across services (defaults to `sum`). */
-  aggregation?: AggregationKind;
-  scale?: number;
-  precision?: number;
-}
-
 export interface LandingConfig {
   /** Lower number → higher on the Overview. */
   priority: number;
@@ -101,10 +82,6 @@ export interface LandingConfig {
   /** Metric key used to rank the top-N. */
   orderBy: string;
   columns: LandingColumn[];
-  /** Optional sparkline column. */
-  spark?: { metric: string; height: number };
-  /** Optional headline metric for the per-layer KPI strip tile. */
-  throughput?: ThroughputConfig;
   /** @deprecated kept for back-compat; new code reads `overviewGroups`. */
   overviewMetrics?: string[];
   /** Explicit per-layer page header columns — distinct from `columns`
