Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: a31ae8e1e6d32c432efb43b3524917a41d6621be
      
https://github.com/WebKit/WebKit/commit/a31ae8e1e6d32c432efb43b3524917a41d6621be
  Author: Wenson Hsieh <[email protected]>
  Date:   2026-03-02 (Mon, 02 Mar 2026)

  Changed paths:
    M Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.h
    M Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.mm
    M Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm
    M Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h
    M Source/WebKit/UIProcess/API/mac/WKWebViewMac.mm

  Log Message:
  -----------
  [AutoFill Debugging] Improve text recognition filtering performance when extracting visible text
https://bugs.webkit.org/show_bug.cgi?id=309008
rdar://170845857

Reviewed by Abrar Rahman Protyasha.

Speed up OCR-based visible text filtering when the `.textRecognition` flag is set in
`filterOptions`. Right now, we scan every paragraph in the DOM that's longer than a certain (short)
threshold by snapshotting the corresponding `SimpleRange` and running that through the Vision
framework. This can be wasteful if there are many paragraphs that require filtering, since we'll end
up taking a snapshot for each one, and then passing that through to `mediaanalysisd` through Vision
for analysis.

Instead, we can avoid performing redundant OCR in many cases by simply doing one up-front OCR pass
over the whole web page (or just the extraction target rect) to collect a lexicon of recognized
words that appear in the page, and then bailing early when validating text for any paragraph that
is composed mostly (arbitrarily, ≥90%) of words that appear in the up-front page OCR lexicon.
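
For illustration only, the lexicon check described above could be sketched roughly as follows in
plain C++. This is not the code from the patch: `splitIntoWords`, `buildLexicon`, and
`isCoveredByLexicon` are hypothetical names, the ASCII-only tokenization is an assumption (the
commit message does not say how words are split or compared), and the 0.9 default simply mirrors
the ≥90% figure mentioned above.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: split text into lowercased ASCII alphanumeric words.
// Real code would need Unicode-aware word segmentation.
std::vector<std::string> splitIntoWords(const std::string& text)
{
    std::vector<std::string> words;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c))
            current.push_back(static_cast<char>(std::tolower(c)));
        else if (!current.empty()) {
            words.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        words.push_back(current);
    return words;
}

using Lexicon = std::unordered_set<std::string>;

// Build the lexicon once, from the text recognized by the up-front page OCR pass.
Lexicon buildLexicon(const std::string& recognizedPageText)
{
    auto words = splitIntoWords(recognizedPageText);
    return Lexicon(words.begin(), words.end());
}

// Returns true when at least `threshold` of the paragraph's words are in the
// lexicon, in which case the per-paragraph snapshot and OCR could be skipped.
bool isCoveredByLexicon(const std::string& paragraph, const Lexicon& lexicon, double threshold = 0.9)
{
    auto words = splitIntoWords(paragraph);
    if (words.empty())
        return false; // Nothing to compare; fall back to the per-paragraph path.
    auto matches = std::count_if(words.begin(), words.end(),
        [&](const std::string& word) { return lexicon.count(word) > 0; });
    return static_cast<double>(matches) >= threshold * static_cast<double>(words.size());
}
```

In this sketch, a caller would build the lexicon once per extraction and call
`isCoveredByLexicon` for each paragraph, running the expensive per-paragraph snapshot only when it
returns false.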

* Source/WebKit/Platform/cocoa/ImageAnalysisUtilities.mm:
(WebKit::recognizeText):

Also set `VNRequestTextRecognitionLevelFast` as the recognition level; this is necessary to
reliably recognize small text in otherwise very tall page snapshots.

* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(-[WKWebView _extractDebugTextWithConfigurationWithoutUpdatingFilterRules:assertionScope:completionHandler:]):
(-[WKWebView _requestTextExtractionInternal:completion:]):
(-[WKWebView _validateText:inFrame:inNode:completionHandler:]):
(-[WKWebView _clearTextExtractionFilterCache]):
* Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h:
* Source/WebKit/UIProcess/API/mac/WKWebViewMac.mm:
(-[WKWebView _web_didChangeContentSize:]):

Keep track of the last known page content size on macOS.

Canonical link: https://commits.webkit.org/308504@main