Branch: refs/heads/main
Home: https://github.com/WebKit/WebKit
Commit: b137fef748b7711f22e63952f81b794968f6e504
https://github.com/WebKit/WebKit/commit/b137fef748b7711f22e63952f81b794968f6e504
Author: Wenson Hsieh <[email protected]>
Date: 2025-12-27 (Sat, 27 Dec 2025)
Changed paths:
M LayoutTests/fast/text-extraction/debug-text-extraction-shorten-urls-expected.txt
M LayoutTests/fast/text-extraction/debug-text-extraction-shorten-urls.html
M Source/WebKit/Shared/TextExtractionToStringConversion.cpp
M Source/WebKit/Shared/TextExtractionToStringConversion.h
A Source/WebKit/Shared/TextExtractionURLCache.cpp
A Source/WebKit/Shared/TextExtractionURLCache.h
M Source/WebKit/Sources.txt
M Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm
M Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm
M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtractionInternal.h
M Source/WebKit/WebKit.xcodeproj/project.pbxproj
Log Message:
-----------
[AutoFill Debugging] Part 2/2: Deduplicate shortened URLs and report replacements to the client
https://bugs.webkit.org/show_bug.cgi?id=304680
rdar://167149825
Reviewed by Richard Robinson.
When a client opts into URL shortening for text extraction, add a strategy to deduplicate different
URLs that are shortened to the same string. Combine that with a new property on
`_WKTextExtractionResult` that reports a mapping from each shortened URL to its original URL back
to the WebKit text extraction client.
See below for more details.
* LayoutTests/fast/text-extraction/debug-text-extraction-shorten-urls-expected.txt:
* LayoutTests/fast/text-extraction/debug-text-extraction-shorten-urls.html:
Augment this layout test to exercise the deduplication strategy.
* Source/WebKit/Shared/TextExtractionToStringConversion.cpp:
(WebKit::TextExtractionAggregator::~TextExtractionAggregator):
(WebKit::TextExtractionAggregator::stringForURL):
Add a helper method that uses `TextExtractionURLCache` to map each shortened URL string to a
deduplicated string.
(WebKit::addPartsForItem):
(WebKit::addTextRepresentationRecursive):
(WebKit::centerEllipsize): Deleted.
* Source/WebKit/Shared/TextExtractionToStringConversion.h:
(WebKit::TextExtractionOptions::TextExtractionOptions):
* Source/WebKit/Shared/TextExtractionURLCache.cpp: Added.
Add a new helper class that represents a cache of shortened URL strings and original URLs.
(WebKit::TextExtractionURLCache::clear):
Reset the cache (called during top level navigation, or if the web process swaps or exits).
(WebKit::TextExtractionURLCache::add):
Main entry point for updating the cache: this takes the shortened URL string and original URL as
arguments, and returns the shortened URL string saved to the cache, adjusting it by adding a
numbered suffix if necessary. For instance:
- Suppose `https://example.com/foo?bar=baz` is shortened to "example.com/foo".
- If `https://example.com/foo?bar=garply` appears later on, it will also get shortened to
  "example.com/foo", which is then deduplicated to "example.com/foo2".
- If `https://example.com/foo.html` appears, it will shorten to "example.com/foo.html" because the
  shortened version doesn't conflict with the other shortened strings above.
- If `https://example.com/foo.html?bar=baz` appears later on, it will also get shortened to
  "example.com/foo.html", which is then deduplicated to "example.com/foo2.html".
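The suffixing behavior described above can be sketched roughly as follows. This is a minimal
illustration using standard library containers, not the code in this commit: the class name
`URLCacheSketch`, the helper `withSuffix`, the use of `std::string` instead of WebKit's `String`
and `URL` types, and the rule for locating the file extension (the last `.` after the last `/`)
are all assumptions. It also assumes that an original URL seen before keeps the string it was
first assigned, which the log message does not state.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the cache described above; not the WebKit implementation.
class URLCacheSketch {
public:
    // Records originalURL under a unique shortened string and returns that string.
    std::string add(const std::string& shortened, const std::string& originalURL)
    {
        // Assumption: a URL that was already added keeps its earlier string.
        auto existing = m_shortenedByOriginal.find(originalURL);
        if (existing != m_shortenedByOriginal.end())
            return existing->second;

        // Try the plain shortened string first, then suffixes 2, 3, ... until one is unused.
        std::string candidate = shortened;
        for (unsigned suffix = 2; m_originalByShortened.count(candidate); ++suffix)
            candidate = withSuffix(shortened, suffix);

        m_originalByShortened.emplace(candidate, originalURL);
        m_shortenedByOriginal.emplace(originalURL, candidate);
        return candidate;
    }

    // Returns the original URL for a shortened string, or an empty string if unknown.
    std::string urlForShortenedString(const std::string& shortened) const
    {
        auto it = m_originalByShortened.find(shortened);
        return it == m_originalByShortened.end() ? std::string() : it->second;
    }

    // Drops all entries, e.g. when the page navigates.
    void clear()
    {
        m_originalByShortened.clear();
        m_shortenedByOriginal.clear();
    }

private:
    // Inserts the number before a file extension in the last path component,
    // or appends it when there is no such extension (assumed rule).
    static std::string withSuffix(const std::string& shortened, unsigned suffix)
    {
        auto slash = shortened.rfind('/');
        auto dot = shortened.rfind('.');
        auto position = shortened.size();
        if (slash != std::string::npos && dot != std::string::npos && dot > slash)
            position = dot;
        std::string result = shortened;
        result.insert(position, std::to_string(suffix));
        return result;
    }

    std::unordered_map<std::string, std::string> m_originalByShortened;
    std::unordered_map<std::string, std::string> m_shortenedByOriginal;
};
```

Feeding the four URLs from the list above through `add` in order yields "example.com/foo",
"example.com/foo2", "example.com/foo.html" and "example.com/foo2.html" under these assumptions.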
(WebKit::TextExtractionURLCache::urlForShortenedString const):
* Source/WebKit/Shared/TextExtractionURLCache.h: Added.
(WebKit::TextExtractionURLCache::create):
* Source/WebKit/Sources.txt:
* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(createEmptyTextExtractionResult):
(-[WKWebView _extractDebugTextWithConfigurationWithoutUpdatingFilterRules:completionHandler:]):
(-[WKWebView _clearTextExtractionFilterCache]):
* Source/WebKit/UIProcess/API/Cocoa/WKWebViewInternal.h:
Store a `TextExtractionURLCache` on the web view, and pass it into the text conversion pipeline.
Clear the cache upon navigation.
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h:
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm:
(-[_WKTextExtractionResult initWithTextContent:filteredOutAnyText:shortenedURLs:]):
(-[_WKTextExtractionResult shortenedURLs]):
Add support for a new property on `_WKTextExtractionResult` that contains a map of shortened
extracted URLs to their original URLs.
(-[_WKTextExtractionResult initWithTextContent:filteredOutAnyText:]): Deleted.
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtractionInternal.h:
* Source/WebKit/WebKit.xcodeproj/project.pbxproj:
Canonical link: https://commits.webkit.org/304963@main