Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: a31ae8e1e6d32c432efb43b3524917a41d6621be
https://github.com/WebKit/WebKit/commit/a31ae8e1e6d32c432efb43b3524917a41d6621be
Author: Wenson Hsieh <[email protected]>
Date: 2026-03-02 (Mon, 02 Mar 2026)
Changed paths:
M Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.h
M Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.mm
M Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm
M Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h
M Source/WebKit/UIProcess/API/mac/WKWebViewMac.mm
Log Message:
-----------
[AutoFill Debugging] Improve text recognition filtering performance when extracting visible text
https://bugs.webkit.org/show_bug.cgi?id=309008
rdar://170845857
Reviewed by Abrar Rahman Protyasha.
Speed up OCR-based visible text filtering when the `.textRecognition` flag is set in
`filterOptions`. Right now, we scan every paragraph in the DOM that's longer than a certain (short)
threshold by snapshotting the corresponding `SimpleRange` and running that through the Vision
framework. This can be wasteful if there are many paragraphs that require filtering, since we end
up taking a snapshot for each one and then passing it to `mediaanalysisd` through Vision for
analysis.
Instead, we can avoid performing redundant OCR in many cases by doing one up-front OCR pass over
the whole web page (or just the extraction target rect) to collect a lexicon of recognized words
that appear in the page, and then bailing early when validating text for any paragraph that is
composed mostly (arbitrarily, ≥90%) of words that appear in the up-front page OCR lexicon.
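As a rough illustration of the lexicon check described above (not the code in this patch), the
idea can be sketched in plain C++. The names `tokenizeWords`, `buildLexicon`, and
`isMostlyCoveredByLexicon` are hypothetical, and the tokenizer is assumed to be ASCII-only and
case-insensitive for brevity; the actual change may tokenize and compare words differently.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: split text into lowercase ASCII alphanumeric words.
// Any other byte (punctuation, whitespace, non-ASCII) acts as a separator.
std::vector<std::string> tokenizeWords(std::string_view text)
{
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c))
            current.push_back(static_cast<char>(std::tolower(c)));
        else if (!current.empty()) {
            words.push_back(std::move(current));
            current.clear();
        }
    }
    if (!current.empty())
        words.push_back(std::move(current));
    return words;
}

// Hypothetical helper: build the lexicon from the text recognized by the
// single up-front OCR pass over the page (or the extraction target rect).
std::unordered_set<std::string> buildLexicon(std::string_view recognizedPageText)
{
    auto words = tokenizeWords(recognizedPageText);
    return std::unordered_set<std::string>(words.begin(), words.end());
}

// Returns true when at least `threshold` (assumed 0.9, matching the ≥90%
// figure above) of the paragraph's words appear in the lexicon, meaning the
// per-paragraph snapshot and OCR could be skipped. Paragraphs with no words
// return false so that they fall back to the slower per-paragraph path.
bool isMostlyCoveredByLexicon(std::string_view paragraph,
    const std::unordered_set<std::string>& lexicon, double threshold = 0.9)
{
    auto words = tokenizeWords(paragraph);
    if (words.empty())
        return false;

    std::size_t matched = 0;
    for (const auto& word : words) {
        if (lexicon.count(word))
            ++matched;
    }
    return static_cast<double>(matched) >= threshold * static_cast<double>(words.size());
}
```

In this sketch the lexicon would be built once per extraction and then consulted for each
paragraph, so only paragraphs that return false would need their own snapshot and Vision request.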
* Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.mm:
(WebKit::recognizeText):
Also set `VNRequestTextRecognitionLevelFast` as the recognition level; this is necessary to
reliably recognize small text in otherwise very tall page snapshots.
* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(-[WKWebView _extractDebugTextWithConfigurationWithoutUpdatingFilterRules:assertionScope:completionHandler:]):
(-[WKWebView _requestTextExtractionInternal:completion:]):
(-[WKWebView _validateText:inFrame:inNode:completionHandler:]):
(-[WKWebView _clearTextExtractionFilterCache]):
* Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h:
* Source/WebKit/UIProcess/API/mac/WKWebViewMac.mm:
(-[WKWebView _web_didChangeContentSize:]):
Keep track of the last known page content size on macOS.
Canonical link: https://commits.webkit.org/308504@main