This is an automated email from the ASF dual-hosted git repository.
tqchen pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git
The following commit(s) were added to refs/heads/main by this push:
new 5d8c4d2727 [Web] Fix progress reporting when loading from cache (#18450)
5d8c4d2727 is described below
commit 5d8c4d2727fe678108ceaea48dd6ee35b2317fcf
Author: Mihail Yonchev <[email protected]>
AuthorDate: Tue Nov 18 16:41:26 2025 +0100
[Web] Fix progress reporting when loading from cache (#18450)
## Problem
When loading model shards from cache (not network), the progress indicator
always showed 0% because `fetchedBytes` was not incremented during the cache
loading phase in `fetchTensorCacheInternal()`.
The `reportCallback` function calculates progress as `fetchedBytes * 100 / totalBytes`,
but `fetchedBytes` was only updated during the network download phase (line 1361),
not during the cache loading phase (lines 1377-1427). This caused the progress
to remain at 0% until completion when loading from cache.
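
The stuck indicator can be reproduced in isolation. The snippet below is only a sketch: `progressPercent`, `shardSizes`, and `reportedBefore` are hypothetical names, the shard sizes are made up, and only the formula `fetchedBytes * 100 / totalBytes` comes from the description above.

```typescript
// Hypothetical stand-in for the progress formula described above;
// this is not the actual code from runtime.ts.
function progressPercent(fetchedBytes: number, totalBytes: number): number {
  return (fetchedBytes * 100) / totalBytes;
}

// Assumed shard sizes in bytes, for illustration only.
const shardSizes: number[] = [100, 200, 100];
const totalBytes: number = shardSizes.reduce((sum, n) => sum + n, 0);

// A cache-loading loop that never touches the byte counter:
// every progress report is computed from fetchedBytes === 0.
const fetchedBytes = 0;
const reportedBefore: number[] = [];
for (let i = 0; i < shardSizes.length; ++i) {
  // ... the shard would be read from cache here ...
  reportedBefore.push(progressPercent(fetchedBytes, totalBytes));
}
```

Under these assumptions every entry of `reportedBefore` is 0, however many shards have been processed.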
## Solution
This fix increments `fetchedBytes` and updates `timeElapsed` after processing
each cached shard (matching the behavior of the network download phase). The
progress callback now correctly reports:
- Percentage completed (`fetchedBytes * 100 / totalBytes`)
- MB loaded
- Time elapsed
## Changes
- Added `fetchedBytes += shard.nbytes;` after processing each cache shard
- Added `timeElapsed` update to ensure accurate time reporting
- Matches the pattern used in the download phase (lines 1360-1361)
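
The pattern listed above could look roughly like the following standalone sketch. It is not the code from `runtime.ts`: the `Shard` and `Report` interfaces and the `loadFromCache` function are hypothetical, `Date.now()` stands in for the runtime's own timer, and the reports are collected in an array instead of being passed to `reportCallback`.

```typescript
// Hypothetical minimal shard record; the real shard type has more fields.
interface Shard { nbytes: number; }

// Hypothetical shape of one progress report.
interface Report { shards: number; percent: number; elapsedSec: number; }

function loadFromCache(list: Shard[]): Report[] {
  const totalBytes = list.reduce((sum, s) => sum + s.nbytes, 0);
  const tstart = Date.now(); // assumption: substitute for the runtime's timer
  let fetchedBytes = 0;
  let fetchedShards = 0;
  const reports: Report[] = [];
  for (const shard of list) {
    // ... the shard would be read from cache here ...
    fetchedBytes += shard.nbytes; // count bytes for this shard
    const timeElapsed = Math.ceil((Date.now() - tstart) / 1000);
    reports.push({
      shards: ++fetchedShards, // pre-increment: report shards completed so far
      percent: (fetchedBytes * 100) / totalBytes,
      elapsedSec: timeElapsed,
    });
  }
  return reports;
}

// Assumed shard sizes, for illustration only.
const reports = loadFromCache([{ nbytes: 100 }, { nbytes: 200 }, { nbytes: 100 }]);
```

With these assumed sizes the reported percentages are 25, 75, and 100, and the shard counts are 1, 2, and 3.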
---
web/src/runtime.ts | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/web/src/runtime.ts b/web/src/runtime.ts
index 8143f970ed..41bc43b54c 100644
--- a/web/src/runtime.ts
+++ b/web/src/runtime.ts
@@ -1359,7 +1359,7 @@ export class Instance implements Disposable {
}
timeElapsed = Math.ceil((perf.now() - tstart) / 1000);
fetchedBytes += shard.nbytes;
- reportCallback(fetchedShards++, /*loading=*/false);
+ reportCallback(++fetchedShards, /*loading=*/false);
}
}
// We launch 4 parallel for loops to limit the max concurrency to 4 download
@@ -1373,6 +1373,10 @@ export class Instance implements Disposable {
]);
}
+ // Reset for the loading phase to avoid double counting with download phase
+ fetchedBytes = 0;
+ fetchedShards = 0;
+
// Then iteratively, load the shard from cache
for (let i = 0; i < list.length; ++i) {
const shard = list[i];
@@ -1421,7 +1425,9 @@ export class Instance implements Disposable {
throw err;
}
}
- reportCallback(i + 1, /*loading=*/true);
+ fetchedBytes += shard.nbytes;
+ timeElapsed = Math.ceil((perf.now() - tstart) / 1000);
+ reportCallback(++fetchedShards, /*loading=*/true);
}
}