Branch: refs/heads/main
  Home:   https://github.com/WebKit/WebKit
  Commit: 428e849ba738d0d3eaec46dde0a9a453b1f1d575
      
https://github.com/WebKit/WebKit/commit/428e849ba738d0d3eaec46dde0a9a453b1f1d575
  Author: Wenson Hsieh <[email protected]>
  Date:   2025-10-13 (Mon, 13 Oct 2025)

  Changed paths:
    M LayoutTests/fast/text-extraction/debug-text-extraction-basic.html
    A LayoutTests/fast/text-extraction/debug-text-extraction-lightweight-expected.txt
    A LayoutTests/fast/text-extraction/debug-text-extraction-lightweight.html
    M LayoutTests/resources/ui-helper.js
    M Source/WebCore/page/text-extraction/TextExtraction.cpp
    M Source/WebCore/page/text-extraction/TextExtractionTypes.h
    M Source/WebKit/Shared/TextExtractionToStringConversion.cpp
    M Source/WebKit/Shared/TextExtractionToStringConversion.h
    M Source/WebKit/Shared/WebCoreArgumentCoders.serialization.in
    M Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm
    M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h
    M Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm
    M Source/WebKit/UIProcess/Cocoa/TextExtraction/WKTextExtractionUtilities.mm
    M Tools/TestRunnerShared/UIScriptContext/Bindings/UIScriptController.idl
    M Tools/TestRunnerShared/UIScriptContext/UIScriptController.h
    M Tools/TestRunnerShared/UIScriptContext/UIScriptControllerShared.cpp
    M Tools/WebKitTestRunner/cocoa/UIScriptControllerCocoa.h
    M Tools/WebKitTestRunner/cocoa/UIScriptControllerCocoa.mm

  Log Message:
  -----------
  [AutoFill Debugging] Add support for several new extraction configuration options to limit text output
https://bugs.webkit.org/show_bug.cgi?id=300568
rdar://162378543

Reviewed by Aditya Keerthi and Abrar Rahman Protyasha.

Add support for 3 new text extraction configuration options:
- includeURLs
- includeRects
- maxWordsPerParagraph

See below for more details.

Test: fast/text-extraction/debug-text-extraction-lightweight.html

* LayoutTests/fast/text-extraction/debug-text-extraction-basic.html:
* LayoutTests/fast/text-extraction/debug-text-extraction-lightweight-expected.txt: Added.

Add a new layout test, similar to the existing `debug-text-extraction-basic.html`, which
exercises this new functionality by:

1. Forcing a word limit of 5
2. Omitting rects
3. Omitting URLs

* LayoutTests/fast/text-extraction/debug-text-extraction-lightweight.html: Copied from LayoutTests/fast/text-extraction/debug-text-extraction-basic.html.
* LayoutTests/resources/ui-helper.js:
(window.UIHelper.prototype.async requestDebugText):

Add options to request extraction with or without rects and URLs, along with a word limit.
We also make `requestDebugText` take the newly renamed `TextExtractionTestOptions`.

* Source/WebCore/page/text-extraction/TextExtraction.cpp:
(WebCore::TextExtraction::extractItemData):
* Source/WebCore/page/text-extraction/TextExtractionTypes.h:

Drive-by fix: also surface entire image `src` URLs here instead of just the last path
component (which was a compromise to keep output length small), now that clients can simply
avoid URLs in the output altogether.

* Source/WebKit/Shared/TextExtractionToStringConversion.cpp:
(WebKit::commaSeparatedString):
(WebKit::escapeString):
(WebKit::normalizedURLString):

Move these helpers to the top of the file.

(WebKit::TextExtractionAggregator::TextExtractionAggregator):
(WebKit::TextExtractionAggregator::~TextExtractionAggregator):
(WebKit::TextExtractionAggregator::create):
(WebKit::TextExtractionAggregator::includeRects const):
(WebKit::TextExtractionAggregator::includeURLs const):
(WebKit::TextExtractionAggregator::filter const):
(WebKit::TextExtractionAggregator::addLineForNativeMenuItemsIfNeeded):

To support these new behaviors (as well as easily support any new flags that need to be
plumbed through the text extraction lifecycle), we refactor `TextExtractionAggregator` so
that it also carries the `TextExtractionOptions` with it, along with other state like the
filtering callback and native menu items (which it appends at the end, before calling the
completion handler).
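
As a rough illustration of the pattern described above, the following standalone C++ sketch
shows an aggregator that owns its options and invokes a completion handler when it is
destroyed, after appending native menu items. It is not the WebKit implementation: `Options`,
`Aggregator`, `addLine`, the option defaults, and the "menu item:" line format are all
hypothetical stand-ins, and `std::shared_ptr` is used here in place of WebKit's own
ref-counting types.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for TextExtractionOptions. The flag names follow the
// commit message; the defaults and the menu item storage are assumptions.
struct Options {
    bool includeURLs { true };
    bool includeRects { true };
    std::vector<std::string> nativeMenuItems;
};

// Hypothetical stand-in for TextExtractionAggregator.
class Aggregator {
public:
    using Completion = std::function<void(std::string)>;

    static std::shared_ptr<Aggregator> create(Options options, Completion completion)
    {
        return std::shared_ptr<Aggregator>(new Aggregator(std::move(options), std::move(completion)));
    }

    // When the last reference goes away, append the native menu items and
    // hand the joined text to the completion handler.
    ~Aggregator()
    {
        for (const auto& item : m_options.nativeMenuItems)
            m_lines.push_back("menu item: " + item); // line format is made up

        std::string result;
        for (std::size_t i = 0; i < m_lines.size(); ++i) {
            if (i)
                result += '\n';
            result += m_lines[i];
        }
        m_completion(std::move(result));
    }

    bool includeRects() const { return m_options.includeRects; }
    bool includeURLs() const { return m_options.includeURLs; }
    void addLine(std::string line) { m_lines.push_back(std::move(line)); }

private:
    Aggregator(Options options, Completion completion)
        : m_options(std::move(options))
        , m_completion(std::move(completion))
    {
    }

    Options m_options;
    Completion m_completion;
    std::vector<std::string> m_lines;
};
```

Carrying the options on the aggregator means that code producing individual lines can consult
`includeRects()` or `includeURLs()` directly, rather than having each flag passed through
every function signature.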

(WebKit::partsForItem):
(WebKit::addPartsForText):
(WebKit::addPartsForItem):
(WebKit::addTextRepresentationRecursive):
(WebKit::convertToText):
* Source/WebKit/Shared/TextExtractionToStringConversion.h:
(WebKit::TextExtractionOptions::TextExtractionOptions):
(WebKit::convertToText): Deleted.
(): Deleted.
* Source/WebKit/Shared/WebCoreArgumentCoders.serialization.in:
* Source/WebKit/UIProcess/API/Cocoa/WKWebView.mm:
(joinAndTruncateLinesToWordLimit):

Add a helper method to truncate each line (separated by newlines) in a piece of extracted
text to the requested maximum word limit. Use it below, as the last step in the filtering
callback. In the case where `TEXT_EXTRACTION_FILTER` is disabled or sanitization is disabled
on the configuration but a word limit is still specified, we just make the filtering callback
truncate the raw output to the word limit.
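
The per-line truncation described above can be sketched in portable C++ as follows. This is
only an illustration under stated assumptions, not the code in `WKWebView.mm`:
`truncateLinesToWordLimit` is a hypothetical name, words are assumed to be
whitespace-delimited, and this version collapses runs of whitespace (including leading
indentation), which the real helper may well preserve.

```cpp
#include <sstream>
#include <string>

// Hypothetical sketch: keep at most maxWordsPerLine whitespace-delimited
// words from each newline-separated line, then rejoin the lines.
std::string truncateLinesToWordLimit(const std::string& text, std::size_t maxWordsPerLine)
{
    std::string result;
    std::istringstream lines(text);
    std::string line;
    bool firstLine = true;

    while (std::getline(lines, line)) {
        if (!firstLine)
            result += '\n';
        firstLine = false;

        // Copy words until the limit is reached or the line runs out.
        std::istringstream words(line);
        std::string word;
        std::size_t count = 0;
        while (count < maxWordsPerLine && (words >> word)) {
            if (count)
                result += ' ';
            result += word;
            ++count;
        }
    }
    return result;
}
```

With a limit of 5, as in the new layout test, a line of seven words keeps its first five,
while shorter lines pass through with only their whitespace normalized.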

(-[WKWebView _debugTextWithConfiguration:completionHandler:]):
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.h:
* Source/WebKit/UIProcess/API/Cocoa/_WKTextExtraction.mm:
(-[_WKTextExtractionConfiguration init]):
* Source/WebKit/UIProcess/Cocoa/TextExtraction/WKTextExtractionUtilities.mm:
(WebKit::createItemWithChildren):
* Tools/TestRunnerShared/UIScriptContext/Bindings/UIScriptController.idl:
* Tools/TestRunnerShared/UIScriptContext/UIScriptController.h:
(WTR::UIScriptController::requestTextExtraction):
(WTR::UIScriptController::requestDebugText):
* Tools/TestRunnerShared/UIScriptContext/UIScriptControllerShared.cpp:
(WTR::toTextExtractionTestOptions):
(WTR::toTextExtractionOptions): Deleted.

Augment test infrastructure to allow UI-side scripts to pass in configuration options when
using `UIScriptController.requestDebugText`.

* Tools/WebKitTestRunner/cocoa/UIScriptControllerCocoa.h:
* Tools/WebKitTestRunner/cocoa/UIScriptControllerCocoa.mm:
(WTR::createTextExtractionConfiguration):
(WTR::UIScriptControllerCocoa::requestTextExtraction):
(WTR::UIScriptControllerCocoa::requestDebugText):

Canonical link: https://commits.webkit.org/301420@main
