Title: [270158] trunk
Revision: 270158
Author: [email protected]
Date: 2020-11-21 21:51:10 -0800 (Sat, 21 Nov 2020)

Log Message

Implement audio capture for SpeechRecognition on macOS
https://bugs.webkit.org/show_bug.cgi?id=218855
<rdar://problem/71331001>

Reviewed by Youenn Fablet.

Source/WebCore:

Introduce SpeechRecognizer, which performs audio capture and speech recognition operations. On start,
SpeechRecognizer creates a SpeechRecognitionCaptureSource and begins capturing audio. On stop, SpeechRecognizer
clears the source and stops recognizing. SpeechRecognizer can only handle one request at a time, so calling
start on an already started SpeechRecognizer aborts the ongoing request.
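The one-request-at-a-time rule described above can be sketched as follows. This is a hypothetical, stripped-down model — `Recognizer`, `AbortCallback`, and the `int` client identifier are illustrative stand-ins, not WebCore's actual types — showing only the abort-on-restart behavior:

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch of the single-request rule: starting a recognizer that
// is already serving a client aborts the ongoing request first.
class Recognizer {
public:
    using AbortCallback = std::function<void(int clientId, const std::string& reason)>;

    explicit Recognizer(AbortCallback onAbort)
        : m_onAbort(std::move(onAbort)) { }

    void start(int clientId)
    {
        if (m_currentClient) {
            // Analogous to SpeechRecognizer::reset(): abort the ongoing request.
            m_onAbort(*m_currentClient, "Another request is started");
        }
        m_currentClient = clientId;
    }

    void stop() { m_currentClient.reset(); }

private:
    std::optional<int> m_currentClient;
    AbortCallback m_onAbort;
};
```

This mirrors the error message asserted by start-second-recognition.html below.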

Tests: fast/speechrecognition/start-recognition-then-stop.html
       fast/speechrecognition/start-second-recognition.html

* Headers.cmake:
* Modules/speech/SpeechRecognitionCaptureSource.cpp: Added.
(WebCore::SpeechRecognitionCaptureSource::SpeechRecognitionCaptureSource):
* Modules/speech/SpeechRecognitionCaptureSource.h: Added.
* Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp: Added. SpeechRecognitionCaptureSourceImpl provides
the implementation of SpeechRecognitionCaptureSource when ENABLE(MEDIA_STREAM) is true.
(WebCore::nextLogIdentifier):
(WebCore::nullLogger):
(WebCore::SpeechRecognitionCaptureSourceImpl::SpeechRecognitionCaptureSourceImpl):
(WebCore::SpeechRecognitionCaptureSourceImpl::~SpeechRecognitionCaptureSourceImpl):
(WebCore::SpeechRecognitionCaptureSourceImpl::audioSamplesAvailable): Push data to the buffer, then signal the
main thread to pull from the buffer and invoke the data callback.
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceStarted):
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceStopped):
(WebCore::SpeechRecognitionCaptureSourceImpl::sourceMutedChanged):
* Modules/speech/SpeechRecognitionCaptureSourceImpl.h: Added.
* Modules/speech/SpeechRecognizer.cpp: Added.
(WebCore::SpeechRecognizer::SpeechRecognizer):
(WebCore::SpeechRecognizer::reset):
(WebCore::SpeechRecognizer::start):
(WebCore::SpeechRecognizer::startInternal):
(WebCore::SpeechRecognizer::stop):
(WebCore::SpeechRecognizer::stopInternal):
* Modules/speech/SpeechRecognizer.h: Added.
(WebCore::SpeechRecognizer::currentClientIdentifier const):
* Sources.txt:
* SourcesCocoa.txt:
* WebCore.xcodeproj/project.pbxproj:
* platform/cocoa/MediaUtilities.cpp: Added.
(WebCore::createAudioFormatDescription):
(WebCore::createAudioSampleBuffer):
* platform/cocoa/MediaUtilities.h: Added.
* platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm: Move code for creating CMSampleBufferRef to
MediaUtilities.h/cpp so it can be shared between SpeechRecognition and UserMedia, as the speech recognition
backend will take CMSampleBufferRef as input.
(WebCore::createAudioFormatDescription): Deleted.
(WebCore::createAudioSampleBuffer): Deleted.
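The buffer hand-off noted for audioSamplesAvailable above can be sketched as below. This is a simplified, single-threaded model under stated assumptions: the task queue stands in for callOnMainThread(), and the lock-protected float buffer stands in for AudioSampleDataSource — none of these names are WebCore's.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Rough sketch of the hand-off: the capture thread pushes samples into a
// lock-protected buffer and schedules a main-thread task that pulls them and
// invokes the data callback.
struct SampleHandoff {
    std::mutex bufferLock;
    std::deque<float> buffer;                          // stand-in for AudioSampleDataSource
    std::deque<std::function<void()>> mainThreadTasks; // stand-in for callOnMainThread()
    std::function<void(std::vector<float>)> dataCallback;

    // Called on the capture thread: push under the lock, then schedule the pull.
    void samplesAvailable(const std::vector<float>& samples)
    {
        {
            std::lock_guard<std::mutex> locker(bufferLock);
            buffer.insert(buffer.end(), samples.begin(), samples.end());
        }
        mainThreadTasks.push_back([this, count = samples.size()] {
            std::vector<float> pulled;
            {
                std::lock_guard<std::mutex> locker(bufferLock);
                for (std::size_t i = 0; i < count && !buffer.empty(); ++i) {
                    pulled.push_back(buffer.front());
                    buffer.pop_front();
                }
            }
            dataCallback(std::move(pulled));
        });
    }

    // Drains scheduled tasks, as the main thread's run loop would.
    void runMainThread()
    {
        while (!mainThreadTasks.empty()) {
            auto task = std::move(mainThreadTasks.front());
            mainThreadTasks.pop_front();
            task();
        }
    }
};
```

The point of the split is that the data callback always runs on the main thread, while only the buffer itself is touched under the lock.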

Source/WebKit:

* UIProcess/SpeechRecognitionPermissionManager.cpp:
(WebKit::SpeechRecognitionPermissionManager::startProcessingRequest): Check and enable mock devices based on the
preference, as SpeechRecognition needs them for testing.
* UIProcess/SpeechRecognitionServer.cpp:
(WebKit::SpeechRecognitionServer::start):
(WebKit::SpeechRecognitionServer::requestPermissionForRequest):
(WebKit::SpeechRecognitionServer::handleRequest):
(WebKit::SpeechRecognitionServer::stop):
(WebKit::SpeechRecognitionServer::abort):
(WebKit::SpeechRecognitionServer::invalidate):
(WebKit::SpeechRecognitionServer::sendUpdate):
(WebKit::SpeechRecognitionServer::stopRequest): Deleted.
(WebKit::SpeechRecognitionServer::abortRequest): Deleted.
* UIProcess/SpeechRecognitionServer.h:
* UIProcess/WebPageProxy.cpp:
(WebKit::WebPageProxy::syncIfMockDevicesEnabledChanged):
* UIProcess/WebPageProxy.h:
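The reshaped server flow listed above (start goes through a permission request before the request is handled; stop/abort act on a request identifier) might look roughly like this. The types, the bool permission result, and every name here are simplifications for illustration, not WebKit's actual interfaces:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Loose model of the server-side flow: start() asks for permission first, and
// only a granted request reaches handleRequest(); stop()/abort() remove the
// request by identifier.
class RecognitionServer {
public:
    using PermissionChecker = std::function<bool(int requestId)>;

    explicit RecognitionServer(PermissionChecker checker)
        : m_checkPermission(std::move(checker)) { }

    void start(int requestId)
    {
        if (!m_checkPermission(requestId)) {
            m_deniedRequests.push_back(requestId);
            return;
        }
        handleRequest(requestId);
    }

    void stop(int requestId) { removeRequest(requestId); }
    void abort(int requestId) { removeRequest(requestId); }

    const std::vector<int>& activeRequests() const { return m_activeRequests; }
    const std::vector<int>& deniedRequests() const { return m_deniedRequests; }

private:
    void handleRequest(int requestId) { m_activeRequests.push_back(requestId); }

    void removeRequest(int requestId)
    {
        for (std::size_t i = 0; i < m_activeRequests.size(); ++i) {
            if (m_activeRequests[i] == requestId) {
                m_activeRequests.erase(m_activeRequests.begin() + i);
                return;
            }
        }
    }

    PermissionChecker m_checkPermission;
    std::vector<int> m_activeRequests;
    std::vector<int> m_deniedRequests;
};
```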

LayoutTests:

* TestExpectations:
* fast/speechrecognition/start-recognition-in-removed-iframe.html: Mark test as async to avoid flakiness.
* fast/speechrecognition/start-recognition-then-stop-expected.txt: Added.
* fast/speechrecognition/start-recognition-then-stop.html: Added.
* fast/speechrecognition/start-second-recognition-expected.txt: Added.
* fast/speechrecognition/start-second-recognition.html: Added.
* platform/wk2/TestExpectations:

Diff

Modified: trunk/LayoutTests/ChangeLog (270157 => 270158)


--- trunk/LayoutTests/ChangeLog	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/LayoutTests/ChangeLog	2020-11-22 05:51:10 UTC (rev 270158)
@@ -1,3 +1,19 @@
+2020-11-21  Sihui Liu  <[email protected]>
+
+        Implement audio capture for SpeechRecognition on macOS
+        https://bugs.webkit.org/show_bug.cgi?id=218855
+        <rdar://problem/71331001>
+
+        Reviewed by Youenn Fablet.
+
+        * TestExpectations:
+        * fast/speechrecognition/start-recognition-in-removed-iframe.html: Mark test as async to avoid flakiness.
+        * fast/speechrecognition/start-recognition-then-stop-expected.txt: Added.
+        * fast/speechrecognition/start-recognition-then-stop.html: Added.
+        * fast/speechrecognition/start-second-recognition-expected.txt: Added.
+        * fast/speechrecognition/start-second-recognition.html: Added.
+        * platform/wk2/TestExpectations:
+
 2020-11-21  Chris Dumez  <[email protected]>
 
         Poor resampling quality when using AudioContext sampleRate parameter

Modified: trunk/LayoutTests/TestExpectations (270157 => 270158)


--- trunk/LayoutTests/TestExpectations	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/LayoutTests/TestExpectations	2020-11-22 05:51:10 UTC (rev 270158)
@@ -179,6 +179,8 @@
 fast/forms/call-text-did-change-in-text-field-when-typing.html [ Skip ]
 http/tests/in-app-browser-privacy/ [ Skip ]
 fast/speechrecognition/permission-error.html [ Skip ]
+fast/speechrecognition/start-recognition-then-stop.html [ Skip ]
+fast/speechrecognition/start-second-recognition.html [ Skip ]
 
 # Only partial support on Cocoa platforms.
 imported/w3c/web-platform-tests/speech-api/ [ Skip ]

Modified: trunk/LayoutTests/fast/speechrecognition/start-recognition-in-removed-iframe.html (270157 => 270158)


--- trunk/LayoutTests/fast/speechrecognition/start-recognition-in-removed-iframe.html	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/LayoutTests/fast/speechrecognition/start-recognition-in-removed-iframe.html	2020-11-22 05:51:10 UTC (rev 270158)
@@ -5,15 +5,20 @@
 <script>
 description("Verify that process does not crash when starting recognition in a removed iframe.");
 
+if (window.testRunner) {
+    jsTestIsAsync = true;
+}
+
 function test()
 {
     iframe = document.getElementsByTagName('iframe')[0];
-    shouldNotThrow("iframe.contentWindow.startRecognition()"); 
+    shouldNotThrow("iframe.contentWindow.startRecognition()");
 }
 
 function removeFrame()
 {
     shouldNotThrow("iframe.parentNode.removeChild(iframe)");
+    setTimeout(() => finishJSTest(), 0);
 }
 
 window.addEventListener('load', test, false);

Added: trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop-expected.txt (0 => 270158)


--- trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop-expected.txt	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,16 @@
+Verify that events are received correctly when starting and stopping recognition normally.
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS recognition = new SpeechRecognition(); did not throw exception.
+PASS recognition.start() did not throw exception.
+Received start event
+Received audiostart event
+PASS recognition.stop() did not throw exception.
+Received audioend event
+Received end event
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop.html (0 => 270158)


--- trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop.html	                        (rev 0)
+++ trunk/LayoutTests/fast/speechrecognition/start-recognition-then-stop.html	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,37 @@
+<!DOCTYPE html>
+<html>
+<body>
+<script src=""></script>
+<script>
+description("Verify that events are received correctly when starting and stopping recognition normally.");
+
+if (window.testRunner) {
+    jsTestIsAsync = true;
+}
+
+shouldNotThrow("recognition = new SpeechRecognition();");
+recognition.onstart = (event) => {
+    debug("Received start event");
+}
+
+recognition.onaudiostart = (event) => {
+    debug("Received audiostart event");
+
+    shouldNotThrow("recognition.stop()");
+}
+
+recognition.onaudioend = (event) => {
+    debug("Received audioend event");
+}
+
+recognition.onend = (event) => {
+    debug("Received end event");
+
+    finishJSTest();
+}
+
+shouldNotThrow("recognition.start()");
+
+</script>
+</body>
+</html>
\ No newline at end of file

Added: trunk/LayoutTests/fast/speechrecognition/start-second-recognition-expected.txt (0 => 270158)


--- trunk/LayoutTests/fast/speechrecognition/start-second-recognition-expected.txt	                        (rev 0)
+++ trunk/LayoutTests/fast/speechrecognition/start-second-recognition-expected.txt	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,16 @@
+Verify that starting a second recognition aborts the ongoing one.
+
+On success, you will see a series of "PASS" messages, followed by "TEST COMPLETE".
+
+
+PASS recognition = new SpeechRecognition(); did not throw exception.
+PASS recognition.start() did not throw exception.
+PASS secondRecognition = new SpeechRecognition(); did not throw exception.
+PASS secondRecognition.start() did not throw exception.
+PASS receivedStart is true
+PASS event.error is "aborted"
+PASS event.message is "Another request is started"
+PASS successfullyParsed is true
+
+TEST COMPLETE
+

Added: trunk/LayoutTests/fast/speechrecognition/start-second-recognition.html (0 => 270158)


--- trunk/LayoutTests/fast/speechrecognition/start-second-recognition.html	                        (rev 0)
+++ trunk/LayoutTests/fast/speechrecognition/start-second-recognition.html	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,32 @@
+<!DOCTYPE html>
+<html>
+<body>
+<script src=""></script>
+<script>
+description("Verify that starting a second recognition aborts the ongoing one.");
+
+if (window.testRunner) {
+    jsTestIsAsync = true;
+}
+
+shouldNotThrow("recognition = new SpeechRecognition();");
+receivedStart = false;
+recognition.onstart = (event) => {
+    receivedStart = true;
+}
+
+recognition.onerror = (event) => {
+    shouldBeTrue("receivedStart");
+    shouldBeEqualToString("event.error", "aborted");
+    shouldBeEqualToString("event.message", "Another request is started");
+
+    finishJSTest();
+}
+
+shouldNotThrow("recognition.start()");
+shouldNotThrow("secondRecognition = new SpeechRecognition();");
+shouldNotThrow("secondRecognition.start()");
+
+</script>
+</body>
+</html>
\ No newline at end of file

Modified: trunk/LayoutTests/platform/wk2/TestExpectations (270157 => 270158)


--- trunk/LayoutTests/platform/wk2/TestExpectations	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/LayoutTests/platform/wk2/TestExpectations	2020-11-22 05:51:10 UTC (rev 270158)
@@ -804,4 +804,6 @@
 # WebKit2 only.
 js/throw-large-string-oom.html [ Pass ]
 fast/speechrecognition/permission-error.html [ Pass ]
+fast/speechrecognition/start-recognition-then-stop.html [ Pass ]
+fast/speechrecognition/start-second-recognition.html [ Pass ]
 fullscreen/full-screen-enter-while-exiting.html [ Pass ]

Modified: trunk/Source/WebCore/ChangeLog (270157 => 270158)


--- trunk/Source/WebCore/ChangeLog	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/ChangeLog	2020-11-22 05:51:10 UTC (rev 270158)
@@ -1,3 +1,57 @@
+2020-11-21  Sihui Liu  <[email protected]>
+
+        Implement audio capture for SpeechRecognition on macOS
+        https://bugs.webkit.org/show_bug.cgi?id=218855
+        <rdar://problem/71331001>
+
+        Reviewed by Youenn Fablet.
+
+        Introduce SpeechRecognizer, which performs audio capture and speech recognition operations. On start,
+        SpeechRecognizer creates a SpeechRecognitionCaptureSource and begins capturing audio. On stop, SpeechRecognizer
+        clears the source and stops recognizing. SpeechRecognizer can only handle one request at a time, so calling
+        start on an already started SpeechRecognizer aborts the ongoing request.
+
+        Tests: fast/speechrecognition/start-recognition-then-stop.html
+               fast/speechrecognition/start-second-recognition.html
+
+        * Headers.cmake:
+        * Modules/speech/SpeechRecognitionCaptureSource.cpp: Added.
+        (WebCore::SpeechRecognitionCaptureSource::SpeechRecognitionCaptureSource):
+        * Modules/speech/SpeechRecognitionCaptureSource.h: Added.
+        * Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp: Added. SpeechRecognitionCaptureSourceImpl provides
+        the implementation of SpeechRecognitionCaptureSource when ENABLE(MEDIA_STREAM) is true.
+        (WebCore::nextLogIdentifier):
+        (WebCore::nullLogger):
+        (WebCore::SpeechRecognitionCaptureSourceImpl::SpeechRecognitionCaptureSourceImpl):
+        (WebCore::SpeechRecognitionCaptureSourceImpl::~SpeechRecognitionCaptureSourceImpl):
+        (WebCore::SpeechRecognitionCaptureSourceImpl::audioSamplesAvailable): Push data to the buffer, then signal
+        the main thread to pull from the buffer and invoke the data callback.
+        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceStarted):
+        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceStopped):
+        (WebCore::SpeechRecognitionCaptureSourceImpl::sourceMutedChanged):
+        * Modules/speech/SpeechRecognitionCaptureSourceImpl.h: Added.
+        * Modules/speech/SpeechRecognizer.cpp: Added.
+        (WebCore::SpeechRecognizer::SpeechRecognizer):
+        (WebCore::SpeechRecognizer::reset):
+        (WebCore::SpeechRecognizer::start):
+        (WebCore::SpeechRecognizer::startInternal):
+        (WebCore::SpeechRecognizer::stop):
+        (WebCore::SpeechRecognizer::stopInternal):
+        * Modules/speech/SpeechRecognizer.h: Added.
+        (WebCore::SpeechRecognizer::currentClientIdentifier const):
+        * Sources.txt:
+        * SourcesCocoa.txt:
+        * WebCore.xcodeproj/project.pbxproj:
+        * platform/cocoa/MediaUtilities.cpp: Added.
+        (WebCore::createAudioFormatDescription):
+        (WebCore::createAudioSampleBuffer):
+        * platform/cocoa/MediaUtilities.h: Added.
+        * platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm: Move code for creating CMSampleBufferRef to
+        MediaUtilities.h/cpp so it can be shared between SpeechRecognition and UserMedia, as the speech recognition
+        backend will take CMSampleBufferRef as input.
+        (WebCore::createAudioFormatDescription): Deleted.
+        (WebCore::createAudioSampleBuffer): Deleted.
+
 2020-11-21  Chris Dumez  <[email protected]>
 
         Poor resampling quality when using AudioContext sampleRate parameter

Modified: trunk/Source/WebCore/Headers.cmake (270157 => 270158)


--- trunk/Source/WebCore/Headers.cmake	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/Headers.cmake	2020-11-22 05:51:10 UTC (rev 270158)
@@ -116,6 +116,8 @@
     Modules/plugins/PluginReplacement.h
     Modules/plugins/YouTubePluginReplacement.h
 
+    Modules/speech/SpeechRecognitionCaptureSource.h
+    Modules/speech/SpeechRecognitionCaptureSourceImpl.h
     Modules/speech/SpeechRecognitionConnection.h
     Modules/speech/SpeechRecognitionConnectionClient.h
     Modules/speech/SpeechRecognitionConnectionClientIdentifier.h
@@ -124,6 +126,7 @@
     Modules/speech/SpeechRecognitionRequestInfo.h
     Modules/speech/SpeechRecognitionResultData.h
     Modules/speech/SpeechRecognitionUpdate.h
+    Modules/speech/SpeechRecognizer.h
 
     Modules/streams/ReadableStreamChunk.h
     Modules/streams/ReadableStreamSink.h

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "SpeechRecognitionCaptureSource.h"
+
+#if ENABLE(MEDIA_STREAM)
+#include "CaptureDeviceManager.h"
+#include "RealtimeMediaSourceCenter.h"
+#include "SpeechRecognitionUpdate.h"
+#endif
+
+namespace WebCore {
+
+SpeechRecognitionCaptureSource::SpeechRecognitionCaptureSource(SpeechRecognitionConnectionClientIdentifier clientIdentifier, DataCallback&& dataCallback, StateUpdateCallback&& stateUpdateCallback)
+{
+#if ENABLE(MEDIA_STREAM)
+    Optional<CaptureDevice> captureDevice;
+    auto devices = RealtimeMediaSourceCenter::singleton().audioCaptureFactory().audioCaptureDeviceManager().captureDevices();
+    for (auto device : devices) {
+        if (!device.enabled())
+            continue;
+
+        if (!captureDevice)
+            captureDevice = device;
+
+        if (device.isDefault()) {
+            captureDevice = device;
+            break;
+        }
+    }
+
+    if (!captureDevice) {
+        auto error = SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, "No device is available for capture" };
+        stateUpdateCallback(SpeechRecognitionUpdate::createError(clientIdentifier, error));
+        return;
+    }
+
+    auto result = RealtimeMediaSourceCenter::singleton().audioCaptureFactory().createAudioCaptureSource(*captureDevice, { }, { });
+    if (!result) {
+        auto error = SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, result.errorMessage };
+        stateUpdateCallback(SpeechRecognitionUpdate::createError(clientIdentifier, error));
+        return;
+    }
+
+    m_impl = makeUnique<SpeechRecognitionCaptureSourceImpl>(clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback), result.source());
+#else
+    UNUSED_PARAM(clientIdentifier);
+    UNUSED_PARAM(dataCallback);
+    UNUSED_PARAM(stateUpdateCallback);
+#endif
+}
+
+} // namespace WebCore

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.h (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.h	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSource.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,56 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#include "SpeechRecognitionCaptureSourceImpl.h"
+#include "SpeechRecognitionConnectionClientIdentifier.h"
+
+namespace WTF {
+class MediaTime;
+}
+
+namespace WebCore {
+
+class AudioStreamDescription;
+class PlatformAudioData;
+class SpeechRecognitionCaptureSourceImpl;
+class SpeechRecognitionUpdate;
+
+class SpeechRecognitionCaptureSource {
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    ~SpeechRecognitionCaptureSource() = default;
+    using DataCallback = Function<void(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)>;
+    using StateUpdateCallback = Function<void(const SpeechRecognitionUpdate&)>;
+    SpeechRecognitionCaptureSource(SpeechRecognitionConnectionClientIdentifier, DataCallback&&, StateUpdateCallback&&);
+
+private:
+#if ENABLE(MEDIA_STREAM)
+    std::unique_ptr<SpeechRecognitionCaptureSourceImpl> m_impl;
+#endif
+};
+
+} // namespace WebCore

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,151 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "SpeechRecognitionCaptureSourceImpl.h"
+
+#if ENABLE(MEDIA_STREAM)
+
+#include "SpeechRecognitionUpdate.h"
+
+#if PLATFORM(COCOA)
+#include "CAAudioStreamDescription.h"
+#include "WebAudioBufferList.h"
+#endif
+
+namespace WebCore {
+
+#if !RELEASE_LOG_DISABLED
+static const void* nextLogIdentifier()
+{
+    static uint64_t logIdentifier = cryptographicallyRandomNumber();
+    return reinterpret_cast<const void*>(++logIdentifier);
+}
+
+static RefPtr<Logger>& nullLogger()
+{
+    static NeverDestroyed<RefPtr<Logger>> logger;
+    return logger;
+}
+#endif
+
+SpeechRecognitionCaptureSourceImpl::SpeechRecognitionCaptureSourceImpl(SpeechRecognitionConnectionClientIdentifier identifier, DataCallback&& dataCallback, StateUpdateCallback&& stateUpdateCallback, Ref<RealtimeMediaSource>&& source)
+    : m_clientIdentifier(identifier)
+    , m_dataCallback(WTFMove(dataCallback))
+    , m_stateUpdateCallback(WTFMove(stateUpdateCallback))
+    , m_source(WTFMove(source))
+{
+    m_source->addAudioSampleObserver(*this);
+    m_source->addObserver(*this);
+    m_source->start();
+
+#if !RELEASE_LOG_DISABLED
+    if (!nullLogger().get()) {
+        nullLogger() = Logger::create(this);
+        nullLogger()->setEnabled(this, false);
+    }
+
+    m_source->setLogger(*nullLogger(), nextLogIdentifier());
+#endif
+
+    auto weakThis = makeWeakPtr(this);
+}
+
+SpeechRecognitionCaptureSourceImpl::~SpeechRecognitionCaptureSourceImpl()
+{
+    m_source->removeAudioSampleObserver(*this);
+    m_source->removeObserver(*this);
+    m_source->stop();
+}
+
+void SpeechRecognitionCaptureSourceImpl::audioSamplesAvailable(const MediaTime& time, const PlatformAudioData& data, const AudioStreamDescription& description, size_t sampleCount)
+{
+#if PLATFORM(COCOA)
+    ASSERT(description.platformDescription().type == PlatformDescription::CAAudioStreamBasicType);
+    auto audioDescription = toCAAudioStreamDescription(description);
+    if (!m_dataSource || !m_dataSource->inputDescription() || *m_dataSource->inputDescription() != description) {
+        auto dataSource = AudioSampleDataSource::create(description.sampleRate() * 1, m_source.get());
+        if (dataSource->setInputFormat(audioDescription)) {
+            callOnMainThread([this, weakThis = makeWeakPtr(this)] {
+                if (weakThis)
+                    m_stateUpdateCallback(SpeechRecognitionUpdate::createError(m_clientIdentifier, SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, "Unable to set input format" }));
+            });
+            return;
+        }
+
+        if (dataSource->setOutputFormat(audioDescription)) {
+            callOnMainThread([this, weakThis = makeWeakPtr(this)] {
+                if (weakThis)
+                    m_stateUpdateCallback(SpeechRecognitionUpdate::createError(m_clientIdentifier, SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, "Unable to set output format" }));
+            });
+            return;
+        }
+        
+        if (auto locker = tryHoldLock(m_dataSourceLock))
+            m_dataSource = WTFMove(dataSource);
+        else
+            return;
+    }
+
+    m_dataSource->pushSamples(time, data, sampleCount);
+    callOnMainThread([this, weakThis = makeWeakPtr(this), time, audioDescription, sampleCount] {
+        if (!weakThis)
+            return;
+
+        auto data = WebAudioBufferList { audioDescription, static_cast<uint32_t>(sampleCount) };
+        {
+            auto locker = holdLock(m_dataSourceLock);
+            m_dataSource->pullSamples(*data.list(), sampleCount, time.timeValue(), 0, AudioSampleDataSource::Copy);
+        }
+
+        m_dataCallback(time, data, audioDescription, sampleCount);
+    });
+#else
+    m_dataCallback(time, data, description, sampleCount);
+#endif
+}
+
+void SpeechRecognitionCaptureSourceImpl::sourceStarted()
+{
+    ASSERT(isMainThread());
+    m_stateUpdateCallback(SpeechRecognitionUpdate::create(m_clientIdentifier, SpeechRecognitionUpdateType::AudioStart));
+}
+
+void SpeechRecognitionCaptureSourceImpl::sourceStopped()
+{
+    ASSERT(isMainThread());
+    ASSERT(m_source->captureDidFail());
+    m_stateUpdateCallback(SpeechRecognitionUpdate::createError(m_clientIdentifier, SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, "Source is stopped" }));
+}
+
+void SpeechRecognitionCaptureSourceImpl::sourceMutedChanged()
+{
+    ASSERT(isMainThread());
+    m_stateUpdateCallback(SpeechRecognitionUpdate::createError(m_clientIdentifier, SpeechRecognitionError { SpeechRecognitionErrorType::AudioCapture, "Source is muted" }));
+}
+
+} // namespace WebCore
+
+#endif

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.h (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.h	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognitionCaptureSourceImpl.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,81 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#if ENABLE(MEDIA_STREAM)
+
+#include "RealtimeMediaSource.h"
+#include "SpeechRecognitionConnectionClientIdentifier.h"
+
+#if PLATFORM(COCOA)
+#include "AudioSampleDataSource.h"
+#endif
+
+namespace WTF {
+class MediaTime;
+}
+
+namespace WebCore {
+
+class AudioStreamDescription;
+class PlatformAudioData;
+class SpeechRecognitionUpdate;
+enum class SpeechRecognitionUpdateType;
+
+class SpeechRecognitionCaptureSourceImpl
+    : public RealtimeMediaSource::Observer
+    , public RealtimeMediaSource::AudioSampleObserver {
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    using DataCallback = Function<void(const WTF::MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t)>;
+    using StateUpdateCallback = Function<void(const SpeechRecognitionUpdate&)>;
+    SpeechRecognitionCaptureSourceImpl(SpeechRecognitionConnectionClientIdentifier, DataCallback&&, StateUpdateCallback&&, Ref<RealtimeMediaSource>&&);
+    ~SpeechRecognitionCaptureSourceImpl();
+
+private:
+    // RealtimeMediaSource::AudioSampleObserver
+    void audioSamplesAvailable(const MediaTime&, const PlatformAudioData&, const AudioStreamDescription&, size_t) final;
+
+    // RealtimeMediaSource::Observer
+    void sourceStarted() final;
+    void sourceStopped() final;
+    void sourceMutedChanged() final;
+
+    SpeechRecognitionConnectionClientIdentifier m_clientIdentifier;
+    DataCallback m_dataCallback;
+    StateUpdateCallback m_stateUpdateCallback;
+    Ref<RealtimeMediaSource> m_source;
+
+#if PLATFORM(COCOA)
+    RefPtr<AudioSampleDataSource> m_dataSource;
+    Lock m_dataSourceLock;
+#endif
+};
+
+} // namespace WebCore
+
+#endif
+

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognizer.cpp (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognizer.cpp	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognizer.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,122 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "SpeechRecognizer.h"
+
+#include "SpeechRecognitionUpdate.h"
+
+#if PLATFORM(COCOA)
+#include "MediaUtilities.h"
+#include <pal/avfoundation/MediaTimeAVFoundation.h>
+#endif
+
+namespace WebCore {
+
+SpeechRecognizer::SpeechRecognizer(DelegateCallback&& callback)
+    : m_delegateCallback(WTFMove(callback))
+{
+}
+
+void SpeechRecognizer::reset()
+{
+    if (!m_clientIdentifier)
+        return;
+
+    if (m_source)
+        m_source = nullptr;
+
+    auto error = SpeechRecognitionError { SpeechRecognitionErrorType::Aborted, "Another request is started" };
+    m_delegateCallback(SpeechRecognitionUpdate::createError(*m_clientIdentifier, error));
+}
+
+void SpeechRecognizer::start(SpeechRecognitionConnectionClientIdentifier identifier)
+{
+    reset();
+
+    m_clientIdentifier = identifier;
+    m_delegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::Start));
+
+    startInternal();
+}
+
+void SpeechRecognizer::startInternal()
+{
+    auto dataCallback = [weakThis = makeWeakPtr(this)](const auto& time, const auto& data, const auto& description, auto sampleCount) {
+        if (!weakThis)
+            return;
+
+#if PLATFORM(COCOA)
+        auto buffer = createAudioSampleBuffer(data, description, PAL::toCMTime(time), sampleCount);
+        UNUSED_PARAM(buffer);
+#else
+        UNUSED_PARAM(time);
+        UNUSED_PARAM(data);
+        UNUSED_PARAM(description);
+        UNUSED_PARAM(sampleCount);
+#endif
+    };
+
+    auto stateUpdateCallback = [this, weakThis = makeWeakPtr(this)](const auto& update) {
+        if (!weakThis)
+            return;
+
+        ASSERT(m_clientIdentifier && m_clientIdentifier.value() == update.clientIdentifier());
+        m_delegateCallback(update);
+
+        if (update.type() == SpeechRecognitionUpdateType::Error)
+            m_source = nullptr;
+    };
+
+    m_source = makeUnique<SpeechRecognitionCaptureSource>(*m_clientIdentifier, WTFMove(dataCallback), WTFMove(stateUpdateCallback));
+}
+
+void SpeechRecognizer::stop(ShouldGenerateFinalResult shouldGenerateFinalResult)
+{
+    if (!m_clientIdentifier)
+        return;
+
+    stopInternal();
+
+    if (shouldGenerateFinalResult == ShouldGenerateFinalResult::Yes) {
+        // TODO: generate real result when speech recognition backend is implemented.
+        Vector<SpeechRecognitionResultData> resultDatas;
+        m_delegateCallback(SpeechRecognitionUpdate::createResult(*m_clientIdentifier, resultDatas));
+    }
+
+    m_delegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::End));
+    m_clientIdentifier = WTF::nullopt;
+}
+
+void SpeechRecognizer::stopInternal()
+{
+    if (!m_source)
+        return;
+
+    m_source = nullptr;
+    m_delegateCallback(SpeechRecognitionUpdate::create(*m_clientIdentifier, SpeechRecognitionUpdateType::AudioEnd));
+}
+
+} // namespace WebCore
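The commit message notes that `SpeechRecognizer` "can only handle one request at a time, so calling start on already started SpeechRecognizer would cause ongoing request to be aborted" — that is exactly the `reset()`-then-`start` sequence above. A standalone sketch of that state machine (illustrative names, `std::optional`/`std::function` standing in for WTF's `Optional`/`Function`):

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Sketch of SpeechRecognizer's one-request-at-a-time policy: start() first
// aborts any ongoing request (delivering an "aborted" error to its client),
// then begins the new one; stop() delivers "end" and clears the client id.
class MiniRecognizer {
public:
    using Update = std::pair<int, std::string>; // (client id, update type)
    using Delegate = std::function<void(const Update&)>;

    explicit MiniRecognizer(Delegate delegate)
        : m_delegate(std::move(delegate)) { }

    void start(int clientId)
    {
        reset(); // Abort the ongoing request, if any.
        m_clientId = clientId;
        m_delegate({ clientId, "start" });
    }

    void stop()
    {
        if (!m_clientId)
            return;
        m_delegate({ *m_clientId, "end" });
        m_clientId.reset();
    }

private:
    void reset()
    {
        if (!m_clientId)
            return;
        m_delegate({ *m_clientId, "error:aborted" });
        m_clientId.reset();
    }

    Delegate m_delegate;
    std::optional<int> m_clientId;
};
```

As in the real `reset()`, the aborted client still receives its error update before the new client's "start" update, so no request ends silently.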

Added: trunk/Source/WebCore/Modules/speech/SpeechRecognizer.h (0 => 270158)


--- trunk/Source/WebCore/Modules/speech/SpeechRecognizer.h	                        (rev 0)
+++ trunk/Source/WebCore/Modules/speech/SpeechRecognizer.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,58 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#include "SpeechRecognitionCaptureSource.h"
+#include "SpeechRecognitionConnectionClientIdentifier.h"
+
+namespace WebCore {
+
+class SpeechRecognitionUpdate;
+
+class SpeechRecognizer : public CanMakeWeakPtr<SpeechRecognizer> {
+    WTF_MAKE_FAST_ALLOCATED;
+public:
+    using DelegateCallback = Function<void(const SpeechRecognitionUpdate&)>;
+    WEBCORE_EXPORT explicit SpeechRecognizer(DelegateCallback&&);
+    WEBCORE_EXPORT ~SpeechRecognizer() = default;
+
+    WEBCORE_EXPORT void start(SpeechRecognitionConnectionClientIdentifier);
+    enum class ShouldGenerateFinalResult { No, Yes };
+    WEBCORE_EXPORT void stop(ShouldGenerateFinalResult = ShouldGenerateFinalResult::Yes);
+
+    Optional<SpeechRecognitionConnectionClientIdentifier> currentClientIdentifier() const { return m_clientIdentifier; }
+
+private:
+    void reset();
+    void startInternal();
+    void stopInternal();
+
+    Optional<SpeechRecognitionConnectionClientIdentifier> m_clientIdentifier;
+    DelegateCallback m_delegateCallback;
+    std::unique_ptr<SpeechRecognitionCaptureSource> m_source;
+};
+
+} // namespace WebCore
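`SpeechRecognizer` inherits `CanMakeWeakPtr<SpeechRecognizer>` so that the capture-source callbacks created in `startInternal()` can outlive the recognizer safely: each lambda captures `makeWeakPtr(this)` and bails out if the recognizer is gone. The same guard can be sketched in standard C++ with `std::weak_ptr` (WTF's `WeakPtr` does not require shared ownership, so this is an analogy, not the WebKit mechanism):

```cpp
#include <functional>
#include <memory>

// Sketch of the weak-pointer guard in SpeechRecognizer::startInternal: a
// callback handed to a longer-lived capture source first checks that its
// owner is still alive before touching any member state.
struct Owner : std::enable_shared_from_this<Owner> {
    int received = 0;

    std::function<void(int)> makeGuardedCallback()
    {
        return [weakThis = weak_from_this()](int value) {
            auto strongThis = weakThis.lock();
            if (!strongThis)
                return; // Owner was destroyed; drop the update.
            strongThis->received += value;
        };
    }
};
```

Without the guard, a late audio sample or state update arriving after destruction would be a use-after-free.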

Modified: trunk/Source/WebCore/Sources.txt (270157 => 270158)


--- trunk/Source/WebCore/Sources.txt	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/Sources.txt	2020-11-22 05:51:10 UTC (rev 270158)
@@ -205,6 +205,9 @@
 Modules/speech/SpeechRecognitionResult.cpp
 Modules/speech/SpeechRecognitionResultList.cpp
 Modules/speech/SpeechRecognitionUpdate.cpp
+Modules/speech/SpeechRecognitionCaptureSource.cpp
+Modules/speech/SpeechRecognitionCaptureSourceImpl.cpp
+Modules/speech/SpeechRecognizer.cpp
 Modules/speech/DOMWindowSpeechSynthesis.cpp
 Modules/speech/SpeechSynthesis.cpp
 Modules/speech/SpeechSynthesisEvent.cpp

Modified: trunk/Source/WebCore/SourcesCocoa.txt (270157 => 270158)


--- trunk/Source/WebCore/SourcesCocoa.txt	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/SourcesCocoa.txt	2020-11-22 05:51:10 UTC (rev 270158)
@@ -238,6 +238,7 @@
 platform/cocoa/KeyEventCocoa.mm
 platform/cocoa/LocalizedStringsCocoa.mm
 platform/cocoa/MIMETypeRegistryCocoa.mm
+platform/cocoa/MediaUtilities.cpp
 platform/cocoa/NetworkExtensionContentFilter.mm
 platform/cocoa/ParentalControlsContentFilter.mm
 platform/cocoa/PasteboardCocoa.mm

Modified: trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj (270157 => 270158)


--- trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/WebCore.xcodeproj/project.pbxproj	2020-11-22 05:51:10 UTC (rev 270158)
@@ -120,8 +120,8 @@
 		073794FE19F5864E00E5A045 /* RTCNotifiersMock.h in Headers */ = {isa = PBXBuildFile; fileRef = 073794F819F5864E00E5A045 /* RTCNotifiersMock.h */; };
 		0738E5EC2499839000DA101C /* AVOutputDeviceMenuControllerTargetPicker.mm in Sources */ = {isa = PBXBuildFile; fileRef = 0738E5EA249968AD00DA101C /* AVOutputDeviceMenuControllerTargetPicker.mm */; };
 		073A15542177A42600EA08F2 /* RemoteVideoSample.h in Headers */ = {isa = PBXBuildFile; fileRef = 073A15532177A39A00EA08F2 /* RemoteVideoSample.h */; settings = {ATTRIBUTES = (Private, ); }; };
-		073B87671E4385AC0071C0EC /* AudioSampleBufferList.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87631E43859D0071C0EC /* AudioSampleBufferList.h */; };
-		073B87691E4385AC0071C0EC /* AudioSampleDataSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87651E43859D0071C0EC /* AudioSampleDataSource.h */; };
+		073B87671E4385AC0071C0EC /* AudioSampleBufferList.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87631E43859D0071C0EC /* AudioSampleBufferList.h */; settings = {ATTRIBUTES = (Private, ); }; };
+		073B87691E4385AC0071C0EC /* AudioSampleDataSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 073B87651E43859D0071C0EC /* AudioSampleDataSource.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		074E82BB18A69F0E007EF54C /* PlatformTimeRanges.h in Headers */ = {isa = PBXBuildFile; fileRef = 074E82B918A69F0E007EF54C /* PlatformTimeRanges.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		075033A8252BD36800F70CE3 /* VideoPlaybackQualityMetrics.h in Headers */ = {isa = PBXBuildFile; fileRef = 075033A6252BD36800F70CE3 /* VideoPlaybackQualityMetrics.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		0753860314489E9800B78452 /* CachedTextTrack.h in Headers */ = {isa = PBXBuildFile; fileRef = 0753860114489E9800B78452 /* CachedTextTrack.h */; };
@@ -2795,6 +2795,8 @@
 		9393E600151A99F200066F06 /* CSSImageSetValue.h in Headers */ = {isa = PBXBuildFile; fileRef = 9393E5FE151A99F200066F06 /* CSSImageSetValue.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		939885C408B7E3D100E707C4 /* EventNames.h in Headers */ = {isa = PBXBuildFile; fileRef = 939885C208B7E3D100E707C4 /* EventNames.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		939B02EF0EA2DBC400C54570 /* WidthIterator.h in Headers */ = {isa = PBXBuildFile; fileRef = 939B02ED0EA2DBC400C54570 /* WidthIterator.h */; };
+		939C0D272564E47F00B3211B /* SpeechRecognizer.h in Headers */ = {isa = PBXBuildFile; fileRef = 939C0D2125648C3900B3211B /* SpeechRecognizer.h */; settings = {ATTRIBUTES = (Private, ); }; };
+		939C0D2B2564E7F300B3211B /* MediaUtilities.h in Headers */ = {isa = PBXBuildFile; fileRef = 939C0D292564E7F200B3211B /* MediaUtilities.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		93A0482825495506000AC462 /* SpeechRecognitionProvider.h in Headers */ = {isa = PBXBuildFile; fileRef = 93A0482625495500000AC462 /* SpeechRecognitionProvider.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		93A0482925495511000AC462 /* SpeechRecognitionResultData.h in Headers */ = {isa = PBXBuildFile; fileRef = 93A0481B254954E4000AC462 /* SpeechRecognitionResultData.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		93A0482A25495514000AC462 /* SpeechRecognitionRequest.h in Headers */ = {isa = PBXBuildFile; fileRef = 93A0481F254954E6000AC462 /* SpeechRecognitionRequest.h */; settings = {ATTRIBUTES = (Private, ); }; };
@@ -2856,6 +2858,8 @@
 		93F1D5BB12D532C400832BEC /* WebGLLoseContext.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F1D5B812D532C400832BEC /* WebGLLoseContext.h */; };
 		93F1D5C112D5335600832BEC /* JSWebGLLoseContext.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F1D5BF12D5335600832BEC /* JSWebGLLoseContext.h */; };
 		93F2CC932427FB9C005851D8 /* CharacterRange.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F2CC912427FB9A005851D8 /* CharacterRange.h */; settings = {ATTRIBUTES = (Private, ); }; };
+		93F6B81F2567A08C00A08488 /* SpeechRecognitionCaptureSourceImpl.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */; settings = {ATTRIBUTES = (Private, ); }; };
+		93F6B8222567A65600A08488 /* SpeechRecognitionCaptureSource.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		93F6F1EE127F70B10055CB06 /* WebGLContextEvent.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F6F1EB127F70B10055CB06 /* WebGLContextEvent.h */; };
 		93F925430F7EF5B8007E37C9 /* RadioButtonGroups.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F925410F7EF5B8007E37C9 /* RadioButtonGroups.h */; settings = {ATTRIBUTES = (Private, ); }; };
 		93F9B6E10BA0FB7200854064 /* JSComment.h in Headers */ = {isa = PBXBuildFile; fileRef = 93F9B6DF0BA0FB7200854064 /* JSComment.h */; };
@@ -11453,6 +11457,10 @@
 		939885C208B7E3D100E707C4 /* EventNames.h */ = {isa = PBXFileReference; fileEncoding = 30; indentWidth = 4; lastKnownFileType = sourcecode.c.h; path = EventNames.h; sourceTree = "<group>"; tabWidth = 8; usesTabs = 0; };
 		939B02EC0EA2DBC400C54570 /* WidthIterator.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WidthIterator.cpp; sourceTree = "<group>"; };
 		939B02ED0EA2DBC400C54570 /* WidthIterator.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WidthIterator.h; sourceTree = "<group>"; };
+		939C0D2125648C3900B3211B /* SpeechRecognizer.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognizer.h; sourceTree = "<group>"; };
+		939C0D2325648C4E00B3211B /* SpeechRecognizer.cpp */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognizer.cpp; sourceTree = "<group>"; };
+		939C0D282564E7F200B3211B /* MediaUtilities.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = MediaUtilities.cpp; sourceTree = "<group>"; };
+		939C0D292564E7F200B3211B /* MediaUtilities.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = MediaUtilities.h; sourceTree = "<group>"; };
 		93A0481B254954E4000AC462 /* SpeechRecognitionResultData.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionResultData.h; sourceTree = "<group>"; };
 		93A0481D254954E5000AC462 /* SpeechRecognitionConnection.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionConnection.h; sourceTree = "<group>"; };
 		93A0481F254954E6000AC462 /* SpeechRecognitionRequest.h */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionRequest.h; sourceTree = "<group>"; };
@@ -11539,6 +11547,10 @@
 		93F1D5BE12D5335600832BEC /* JSWebGLLoseContext.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = JSWebGLLoseContext.cpp; sourceTree = "<group>"; };
 		93F1D5BF12D5335600832BEC /* JSWebGLLoseContext.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = JSWebGLLoseContext.h; sourceTree = "<group>"; };
 		93F2CC912427FB9A005851D8 /* CharacterRange.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = CharacterRange.h; sourceTree = "<group>"; };
+		93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionCaptureSource.h; sourceTree = "<group>"; };
+		93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = SpeechRecognitionCaptureSourceImpl.h; sourceTree = "<group>"; };
+		93F6B81D25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognitionCaptureSourceImpl.cpp; sourceTree = "<group>"; };
+		93F6B81E25679F7100A08488 /* SpeechRecognitionCaptureSource.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = SpeechRecognitionCaptureSource.cpp; sourceTree = "<group>"; };
 		93F6F1EA127F70B10055CB06 /* WebGLContextEvent.cpp */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.cpp.cpp; path = WebGLContextEvent.cpp; sourceTree = "<group>"; };
 		93F6F1EB127F70B10055CB06 /* WebGLContextEvent.h */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.c.h; path = WebGLContextEvent.h; sourceTree = "<group>"; };
 		93F6F1EC127F70B10055CB06 /* WebGLContextEvent.idl */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = text; path = WebGLContextEvent.idl; sourceTree = "<group>"; };
@@ -23587,6 +23599,13 @@
 			path = cocoa;
 			sourceTree = "<group>";
 		};
+		93F6B80E2566EDE100A08488 /* cocoa */ = {
+			isa = PBXGroup;
+			children = (
+			);
+			path = cocoa;
+			sourceTree = "<group>";
+		};
 		946D37271D6CB2250077084F /* parser */ = {
 			isa = PBXGroup;
 			children = (
@@ -24423,6 +24442,8 @@
 				A5C974D011485FF10066F2AB /* KeyEventCocoa.mm */,
 				06E81ED60AB5D5E900C87837 /* LocalCurrentGraphicsContext.h */,
 				1A4832B21A953BA6008B4DFE /* LocalizedStringsCocoa.mm */,
+				939C0D282564E7F200B3211B /* MediaUtilities.cpp */,
+				939C0D292564E7F200B3211B /* MediaUtilities.h */,
 				C53D39331C97892D007F3AE9 /* MIMETypeRegistryCocoa.mm */,
 				A19D93491AA11B1E00B46C24 /* NetworkExtensionContentFilter.h */,
 				A19D93481AA11B1E00B46C24 /* NetworkExtensionContentFilter.mm */,
@@ -25615,6 +25636,7 @@
 		AA2A5AB716A485A400975A25 /* speech */ = {
 			isa = PBXGroup;
 			children = (
+				93F6B80E2566EDE100A08488 /* cocoa */,
 				AA2A5ABA16A485D500975A25 /* DOMWindow+SpeechSynthesis.idl */,
 				AA2A5AB816A485D500975A25 /* DOMWindowSpeechSynthesis.cpp */,
 				AA2A5AB916A485D500975A25 /* DOMWindowSpeechSynthesis.h */,
@@ -25624,6 +25646,10 @@
 				934950B72539434B0099F171 /* SpeechRecognitionAlternative.cpp */,
 				934950BC2539434E0099F171 /* SpeechRecognitionAlternative.h */,
 				934950BB2539434E0099F171 /* SpeechRecognitionAlternative.idl */,
+				93F6B81E25679F7100A08488 /* SpeechRecognitionCaptureSource.cpp */,
+				93F6B81B25679F6F00A08488 /* SpeechRecognitionCaptureSource.h */,
+				93F6B81D25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.cpp */,
+				93F6B81C25679F7000A08488 /* SpeechRecognitionCaptureSourceImpl.h */,
 				93A0481D254954E5000AC462 /* SpeechRecognitionConnection.h */,
 				93A04824254954E9000AC462 /* SpeechRecognitionConnectionClient.h */,
 				93A04820254954E6000AC462 /* SpeechRecognitionConnectionClientIdentifier.h */,
@@ -25648,6 +25674,8 @@
 				934950C5253943530099F171 /* SpeechRecognitionResultList.idl */,
 				93D6B76E254B8E1B0058DD3A /* SpeechRecognitionUpdate.cpp */,
 				93D6B76D254B8E1B0058DD3A /* SpeechRecognitionUpdate.h */,
+				939C0D2325648C4E00B3211B /* SpeechRecognizer.cpp */,
+				939C0D2125648C3900B3211B /* SpeechRecognizer.h */,
 				AA2A5ABD16A485D500975A25 /* SpeechSynthesis.cpp */,
 				AA2A5ABE16A485D500975A25 /* SpeechSynthesis.h */,
 				AA2A5ABF16A485D500975A25 /* SpeechSynthesis.idl */,
@@ -33534,6 +33562,7 @@
 				932CC0B71DFFD158004C0F9F /* MediaTrackConstraints.h in Headers */,
 				07C1C0E21BFB600100BD2256 /* MediaTrackSupportedConstraints.h in Headers */,
 				07611DC12440E59B00D80704 /* MediaUsageInfo.h in Headers */,
+				939C0D2B2564E7F300B3211B /* MediaUtilities.h in Headers */,
 				51E1BAC31BD8064E0055D81F /* MemoryBackingStoreTransaction.h in Headers */,
 				BCB16C180979C3BD00467741 /* MemoryCache.h in Headers */,
 				517139081BF64DEF000D5F01 /* MemoryCursor.h in Headers */,
@@ -34426,6 +34455,8 @@
 				626CDE0F1140424C001E5A68 /* SpatialNavigation.h in Headers */,
 				934950CD253943610099F171 /* SpeechRecognition.h in Headers */,
 				934950CE253943650099F171 /* SpeechRecognitionAlternative.h in Headers */,
+				93F6B8222567A65600A08488 /* SpeechRecognitionCaptureSource.h in Headers */,
+				93F6B81F2567A08C00A08488 /* SpeechRecognitionCaptureSourceImpl.h in Headers */,
 				93A0482C25495519000AC462 /* SpeechRecognitionConnection.h in Headers */,
 				93A0482E2549551E000AC462 /* SpeechRecognitionConnectionClient.h in Headers */,
 				93A0482D2549551B000AC462 /* SpeechRecognitionConnectionClientIdentifier.h in Headers */,
@@ -34440,6 +34471,7 @@
 				93A0482925495511000AC462 /* SpeechRecognitionResultData.h in Headers */,
 				934950D6253943810099F171 /* SpeechRecognitionResultList.h in Headers */,
 				93D6B771254BAB450058DD3A /* SpeechRecognitionUpdate.h in Headers */,
+				939C0D272564E47F00B3211B /* SpeechRecognizer.h in Headers */,
 				AA2A5AD416A4861100975A25 /* SpeechSynthesis.h in Headers */,
 				C14938072234551A000CD707 /* SpeechSynthesisClient.h in Headers */,
 				AA2A5AD216A4860A00975A25 /* SpeechSynthesisEvent.h in Headers */,

Added: trunk/Source/WebCore/platform/cocoa/MediaUtilities.cpp (0 => 270158)


--- trunk/Source/WebCore/platform/cocoa/MediaUtilities.cpp	                        (rev 0)
+++ trunk/Source/WebCore/platform/cocoa/MediaUtilities.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,73 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include "config.h"
+#include "MediaUtilities.h"
+
+#include "AudioStreamDescription.h"
+#include "WebAudioBufferList.h"
+#include <wtf/SoftLinking.h>
+#include <pal/cf/CoreMediaSoftLink.h>
+
+namespace WebCore {
+
+using namespace PAL;
+
+RetainPtr<CMFormatDescriptionRef> createAudioFormatDescription(const AudioStreamDescription& description)
+{
+    auto basicDescription = WTF::get<const AudioStreamBasicDescription*>(description.platformDescription().description);
+    CMFormatDescriptionRef format = nullptr;
+    auto error = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, basicDescription, 0, nullptr, 0, nullptr, nullptr, &format);
+    if (error) {
+        LOG_ERROR("createAudioFormatDescription failed with %d", error);
+        return nullptr;
+    }
+    return adoptCF(format);
+}
+
+RetainPtr<CMSampleBufferRef> createAudioSampleBuffer(const PlatformAudioData& data, const AudioStreamDescription& description, CMTime time, size_t sampleCount)
+{
+    // FIXME: check if we can reuse the format for multiple sample buffers.
+    auto format = createAudioFormatDescription(description);
+    if (!format)
+        return nullptr;
+
+    CMSampleBufferRef sampleBuffer = nullptr;
+    auto error = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, nullptr, false, nullptr, nullptr, format.get(), sampleCount, time, nullptr, &sampleBuffer);
+    if (error) {
+        LOG_ERROR("createAudioSampleBuffer with packet descriptions failed - %d", error);
+        return nullptr;
+    }
+    auto buffer = adoptCF(sampleBuffer);
+
+    error = CMSampleBufferSetDataBufferFromAudioBufferList(buffer.get(), kCFAllocatorDefault, kCFAllocatorDefault, 0, downcast<WebAudioBufferList>(data).list());
+    if (error) {
+        LOG_ERROR("createAudioSampleBuffer from audio buffer list failed - %d", error);
+        return nullptr;
+    }
+    return buffer;
+}
+
+} // namespace WebCore
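Both factory functions above follow the same shape when driving a CoreFoundation-style C API: call the `Create` function, early-return `nullptr` on a non-zero error code, otherwise adopt the +1 reference (`adoptCF`) so a smart pointer owns it from then on. A self-contained sketch of that create/check/adopt pattern, with a fake C API standing in for CoreMedia (the `Fake*` names are invented for illustration; `std::unique_ptr` with a custom deleter plays the role of `RetainPtr`):

```cpp
#include <memory>

// Stand-in for a CoreMedia-style Create function: returns an OSStatus-like
// error code and hands back an owning pointer through an out-parameter.
struct FakeFormat { int sampleRate; };

inline int FakeFormatCreate(int sampleRate, FakeFormat** out)
{
    if (sampleRate <= 0)
        return -1; // error
    *out = new FakeFormat { sampleRate };
    return 0; // success (noErr)
}

inline void destroyFormat(FakeFormat* format) { delete format; }

using FormatPtr = std::unique_ptr<FakeFormat, void (*)(FakeFormat*)>;

// Create, check the error code, then adopt ownership — mirroring the
// createAudioFormatDescription flow (error check + adoptCF).
inline FormatPtr createFormat(int sampleRate)
{
    FakeFormat* raw = nullptr;
    if (FakeFormatCreate(sampleRate, &raw))
        return { nullptr, destroyFormat }; // early return on failure
    return { raw, destroyFormat };        // adopt the +1 reference
}
```

Callers then only need a null check, exactly as `createAudioSampleBuffer` does with the format description before building the sample buffer.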

Added: trunk/Source/WebCore/platform/cocoa/MediaUtilities.h (0 => 270158)


--- trunk/Source/WebCore/platform/cocoa/MediaUtilities.h	                        (rev 0)
+++ trunk/Source/WebCore/platform/cocoa/MediaUtilities.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -0,0 +1,42 @@
+/*
+ * Copyright (C) 2020 Apple Inc. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY APPLE INC. AND ITS CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL APPLE INC. OR ITS CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF
+ * THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#pragma once
+
+#include <CoreMedia/CMTime.h>
+#include <wtf/RetainPtr.h>
+
+typedef const struct opaqueCMFormatDescription* CMFormatDescriptionRef;
+typedef struct opaqueCMSampleBuffer* CMSampleBufferRef;
+
+namespace WebCore {
+
+class AudioStreamDescription;
+class PlatformAudioData;
+
+RetainPtr<CMFormatDescriptionRef> createAudioFormatDescription(const AudioStreamDescription&);
+RetainPtr<CMSampleBufferRef> createAudioSampleBuffer(const PlatformAudioData&, const AudioStreamDescription&, CMTime, size_t sampleCount);
+
+} // namespace WebCore

Modified: trunk/Source/WebCore/platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm (270157 => 270158)


--- trunk/Source/WebCore/platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebCore/platform/mediarecorder/cocoa/MediaRecorderPrivateWriterCocoa.mm	2020-11-22 05:51:10 UTC (rev 270158)
@@ -34,6 +34,7 @@
 #include "MediaRecorderPrivate.h"
 #include "MediaRecorderPrivateOptions.h"
 #include "MediaStreamTrackPrivate.h"
+#include "MediaUtilities.h"
 #include "VideoSampleBufferCompressor.h"
 #include "WebAudioBufferList.h"
 #include <AVFoundation/AVAssetWriter.h>
@@ -443,40 +444,6 @@
         m_videoCompressor->addSampleBuffer(bufferWithCurrentTime.get());
 }
 
-static inline RetainPtr<CMFormatDescriptionRef> createAudioFormatDescription(const AudioStreamDescription& description)
-{
-    auto basicDescription = WTF::get<const AudioStreamBasicDescription*>(description.platformDescription().description);
-    CMFormatDescriptionRef format = nullptr;
-    auto error = CMAudioFormatDescriptionCreate(kCFAllocatorDefault, basicDescription, 0, NULL, 0, NULL, NULL, &format);
-    if (error) {
-        RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter CMAudioFormatDescriptionCreate failed with %d", error);
-        return nullptr;
-    }
-    return adoptCF(format);
-}
-
-static inline RetainPtr<CMSampleBufferRef> createAudioSampleBuffer(const PlatformAudioData& data, const AudioStreamDescription& description, CMTime time, size_t sampleCount)
-{
-    auto format = createAudioFormatDescription(description);
-    if (!format)
-        return nullptr;
-
-    CMSampleBufferRef sampleBuffer = nullptr;
-    auto error = CMAudioSampleBufferCreateWithPacketDescriptions(kCFAllocatorDefault, NULL, false, NULL, NULL, format.get(), sampleCount, time, NULL, &sampleBuffer);
-    if (error) {
-        RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter createAudioSampleBufferWithPacketDescriptions failed with %d", error);
-        return nullptr;
-    }
-    auto buffer = adoptCF(sampleBuffer);
-
-    error = CMSampleBufferSetDataBufferFromAudioBufferList(buffer.get(), kCFAllocatorDefault, kCFAllocatorDefault, 0, downcast<WebAudioBufferList>(data).list());
-    if (error) {
-        RELEASE_LOG_ERROR(MediaStream, "MediaRecorderPrivateWriter CMSampleBufferSetDataBufferFromAudioBufferList failed with %d", error);
-        return nullptr;
-    }
-    return buffer;
-}
-
 void MediaRecorderPrivateWriter::appendAudioSampleBuffer(const PlatformAudioData& data, const AudioStreamDescription& description, const WTF::MediaTime&, size_t sampleCount)
 {
     if (auto sampleBuffer = createAudioSampleBuffer(data, description, m_currentAudioSampleTime, sampleCount))

Modified: trunk/Source/WebKit/ChangeLog (270157 => 270158)


--- trunk/Source/WebKit/ChangeLog	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/ChangeLog	2020-11-22 05:51:10 UTC (rev 270158)
@@ -1,3 +1,29 @@
+2020-11-21  Sihui Liu  <[email protected]>
+
+        Implement audio capture for SpeechRecognition on macOS
+        https://bugs.webkit.org/show_bug.cgi?id=218855
+        <rdar://problem/71331001>
+
+        Reviewed by Youenn Fablet.
+
+        * UIProcess/SpeechRecognitionPermissionManager.cpp:
+        (WebKit::SpeechRecognitionPermissionManager::startProcessingRequest): Check and enable mock devices based on 
+        preference as SpeechRecognition needs it for testing.
+        * UIProcess/SpeechRecognitionServer.cpp:
+        (WebKit::SpeechRecognitionServer::start):
+        (WebKit::SpeechRecognitionServer::requestPermissionForRequest):
+        (WebKit::SpeechRecognitionServer::handleRequest):
+        (WebKit::SpeechRecognitionServer::stop):
+        (WebKit::SpeechRecognitionServer::abort):
+        (WebKit::SpeechRecognitionServer::invalidate):
+        (WebKit::SpeechRecognitionServer::sendUpdate):
+        (WebKit::SpeechRecognitionServer::stopRequest): Deleted.
+        (WebKit::SpeechRecognitionServer::abortRequest): Deleted.
+        * UIProcess/SpeechRecognitionServer.h:
+        * UIProcess/WebPageProxy.cpp:
+        (WebKit::WebPageProxy::syncIfMockDevicesEnabledChanged):
+        * UIProcess/WebPageProxy.h:
+
 2020-11-21  Simon Fraser  <[email protected]>
 
         Propagate the 'wheelEventGesturesBecomeNonBlocking' setting to the ScrollingTree

Modified: trunk/Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp (270157 => 270158)


--- trunk/Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/UIProcess/SpeechRecognitionPermissionManager.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -104,6 +104,7 @@
     m_speechRecognitionServiceCheck = computeSpeechRecognitionServiceAccess();
 
     if (m_page.preferences().mockCaptureDevicesEnabled()) {
+        m_page.syncIfMockDevicesEnabledChanged();
         m_microphoneCheck = CheckResult::Granted;
         m_speechRecognitionServiceCheck = CheckResult::Granted;
     }

Modified: trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.cpp (270157 => 270158)


--- trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.cpp	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -46,12 +46,11 @@
 void SpeechRecognitionServer::start(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier, String&& lang, bool continuous, bool interimResults, uint64_t maxAlternatives, WebCore::ClientOrigin&& origin)
 {
     MESSAGE_CHECK(clientIdentifier);
-    ASSERT(!m_pendingRequests.contains(clientIdentifier));
-    ASSERT(!m_ongoingRequests.contains(clientIdentifier));
+    ASSERT(!m_requests.contains(clientIdentifier));
     auto requestInfo = WebCore::SpeechRecognitionRequestInfo { clientIdentifier, WTFMove(lang), continuous, interimResults, maxAlternatives, WTFMove(origin) };
-    auto& pendingRequest = m_pendingRequests.add(clientIdentifier, makeUnique<WebCore::SpeechRecognitionRequest>(WTFMove(requestInfo))).iterator->value;
+    auto& newRequest = m_requests.add(clientIdentifier, makeUnique<WebCore::SpeechRecognitionRequest>(WTFMove(requestInfo))).iterator->value;
 
-    requestPermissionForRequest(*pendingRequest);
+    requestPermissionForRequest(*newRequest);
 }
 
 void SpeechRecognitionServer::requestPermissionForRequest(WebCore::SpeechRecognitionRequest& request)
@@ -64,68 +63,72 @@
             return;
 
         auto identifier = weakRequest->clientIdentifier();
-        auto takenRequest = m_pendingRequests.take(identifier);
         if (decision == SpeechRecognitionPermissionDecision::Deny) {
+            m_requests.remove(identifier);
             auto error = WebCore::SpeechRecognitionError { WebCore::SpeechRecognitionErrorType::NotAllowed, "Permission check failed"_s };
             sendUpdate(identifier, WebCore::SpeechRecognitionUpdateType::Error, error);
             return;
         }
 
-        m_ongoingRequests.add(identifier, WTFMove(takenRequest));
-        handleRequest(*m_ongoingRequests.get(identifier));
+        handleRequest(identifier);
     });
 }
 
+void SpeechRecognitionServer::handleRequest(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier)
+{
+    if (!m_recognizer) {
+        m_recognizer = makeUnique<SpeechRecognizer>([this, weakThis = makeWeakPtr(this)](auto& update) {
+            if (!weakThis)
+                return;
+
+            auto clientIdentifier = update.clientIdentifier();
+            if (!m_requests.contains(clientIdentifier))
+                return;
+
+            auto type = update.type();
+            if (type == SpeechRecognitionUpdateType::Error || type == SpeechRecognitionUpdateType::End)
+                m_requests.remove(clientIdentifier);
+
+            sendUpdate(update);
+        });
+    }
+
+    m_recognizer->start(clientIdentifier);
+}
+
 void SpeechRecognitionServer::stop(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier)
 {
     MESSAGE_CHECK(clientIdentifier);
-    if (m_pendingRequests.remove(clientIdentifier)) {
-        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
+    if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
+        m_recognizer->stop();
         return;
     }
 
-    ASSERT(m_ongoingRequests.contains(clientIdentifier));
-    stopRequest(*m_ongoingRequests.get(clientIdentifier));
+    if (m_requests.remove(clientIdentifier))
+        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
 }
 
 void SpeechRecognitionServer::abort(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier)
 {
     MESSAGE_CHECK(clientIdentifier);
-    if (m_pendingRequests.remove(clientIdentifier)) {
-        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
+    if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
+        m_recognizer->stop(WebCore::SpeechRecognizer::ShouldGenerateFinalResult::No);
         return;
     }
 
-    ASSERT(m_ongoingRequests.contains(clientIdentifier));
-    auto request = m_ongoingRequests.take(clientIdentifier);
-    abortRequest(*request);
-    auto update = WebCore::SpeechRecognitionUpdate::create(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
-    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update), m_identifier);
+    if (m_requests.remove(clientIdentifier))
+        sendUpdate(clientIdentifier, WebCore::SpeechRecognitionUpdateType::End);
 }
 
 void SpeechRecognitionServer::invalidate(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier)
 {
     MESSAGE_CHECK(clientIdentifier);
-    auto request = m_ongoingRequests.take(clientIdentifier);
-    if (request)
-        abortRequest(*request);
+    if (m_requests.remove(clientIdentifier)) {
+        if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier)
+            m_recognizer->stop();
+    }
 }
 
-void SpeechRecognitionServer::handleRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: start capturing audio and recognition.
-}
-
-void SpeechRecognitionServer::stopRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: stop capturing audio and finalizing results by recognizing captured audio.
-}
-
-void SpeechRecognitionServer::abortRequest(WebCore::SpeechRecognitionRequest& request)
-{
-    // TODO: stop capturing audio and recognition immediately without generating results.
-}
-
 void SpeechRecognitionServer::sendUpdate(WebCore::SpeechRecognitionConnectionClientIdentifier clientIdentifier, WebCore::SpeechRecognitionUpdateType type, Optional<WebCore::SpeechRecognitionError> error, Optional<Vector<WebCore::SpeechRecognitionResultData>> result)
 {
     auto update = WebCore::SpeechRecognitionUpdate::create(clientIdentifier, type);
@@ -133,9 +136,14 @@
         update = WebCore::SpeechRecognitionUpdate::createError(clientIdentifier, *error);
     if (type == WebCore::SpeechRecognitionUpdateType::Result)
         update = WebCore::SpeechRecognitionUpdate::createResult(clientIdentifier, *result);
-    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update), m_identifier);
+    sendUpdate(update);
 }
 
+void SpeechRecognitionServer::sendUpdate(const WebCore::SpeechRecognitionUpdate& update)
+{
+    send(Messages::WebSpeechRecognitionConnection::DidReceiveUpdate(update));
+}
+
 IPC::Connection* SpeechRecognitionServer::messageSenderConnection() const
 {
     return m_connection.ptr();

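The single-recognizer flow introduced above (one `m_requests` map, a lazily created recognizer whose update callback prunes finished requests, and start-while-busy aborting the ongoing request) can be sketched outside the patch. This is a minimal illustrative model, not WebKit API: `Recognizer`, `Server`, and `Update` are hypothetical stand-ins for `WebCore::SpeechRecognizer`, `SpeechRecognitionServer`, and `SpeechRecognitionUpdate`.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <optional>
#include <set>

enum class UpdateType { End, Error };

struct Update {
    uint64_t clientIdentifier;
    UpdateType type;
};

// Stand-in for WebCore::SpeechRecognizer: handles one client at a time;
// start() on a busy recognizer aborts the ongoing request first.
class Recognizer {
public:
    explicit Recognizer(std::function<void(const Update&)> callback)
        : m_callback(std::move(callback)) { }

    void start(uint64_t clientIdentifier)
    {
        if (m_currentClient)
            stop(false /* abort: no final result */);
        m_currentClient = clientIdentifier;
    }

    void stop(bool generateFinalResult = true)
    {
        if (!m_currentClient)
            return;
        auto client = *m_currentClient;
        m_currentClient.reset();
        // Finishing a request reports End (or Error on abort) to the server.
        m_callback({ client, generateFinalResult ? UpdateType::End : UpdateType::Error });
    }

    std::optional<uint64_t> currentClientIdentifier() const { return m_currentClient; }

private:
    std::function<void(const Update&)> m_callback;
    std::optional<uint64_t> m_currentClient;
};

// Stand-in for SpeechRecognitionServer: one request set, one lazily created
// recognizer; requests are removed when an End/Error update arrives.
class Server {
public:
    void start(uint64_t clientIdentifier)
    {
        m_requests.insert(clientIdentifier);
        if (!m_recognizer) {
            m_recognizer = std::make_unique<Recognizer>([this](const Update& update) {
                if (update.type == UpdateType::End || update.type == UpdateType::Error)
                    m_requests.erase(update.clientIdentifier);
            });
        }
        // Starting a second request implicitly aborts the first one.
        m_recognizer->start(clientIdentifier);
    }

    void stop(uint64_t clientIdentifier)
    {
        if (m_recognizer && m_recognizer->currentClientIdentifier() == clientIdentifier) {
            m_recognizer->stop();
            return;
        }
        m_requests.erase(clientIdentifier);
    }

    bool hasRequest(uint64_t clientIdentifier) const { return m_requests.count(clientIdentifier) > 0; }

private:
    std::set<uint64_t> m_requests;
    std::unique_ptr<Recognizer> m_recognizer;
};
```

This mirrors the behavior exercised by the new fast/speechrecognition/start-second-recognition.html test: starting a second recognition ends the first one rather than running both concurrently.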
Modified: trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.h (270157 => 270158)


--- trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.h	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/UIProcess/SpeechRecognitionServer.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -31,6 +31,7 @@
 #include <WebCore/SpeechRecognitionError.h>
 #include <WebCore/SpeechRecognitionRequest.h>
 #include <WebCore/SpeechRecognitionResultData.h>
+#include <WebCore/SpeechRecognizer.h>
 #include <wtf/Deque.h>
 
 namespace WebCore {
@@ -58,10 +59,9 @@
 
 private:
     void requestPermissionForRequest(WebCore::SpeechRecognitionRequest&);
-    void handleRequest(WebCore::SpeechRecognitionRequest&);
-    void stopRequest(WebCore::SpeechRecognitionRequest&);
-    void abortRequest(WebCore::SpeechRecognitionRequest&);
+    void handleRequest(WebCore::SpeechRecognitionConnectionClientIdentifier);
     void sendUpdate(WebCore::SpeechRecognitionConnectionClientIdentifier, WebCore::SpeechRecognitionUpdateType, Optional<WebCore::SpeechRecognitionError> = WTF::nullopt, Optional<Vector<WebCore::SpeechRecognitionResultData>> = WTF::nullopt);
+    void sendUpdate(const WebCore::SpeechRecognitionUpdate&);
 
     // IPC::MessageReceiver.
     void didReceiveMessage(IPC::Connection&, IPC::Decoder&) override;
@@ -72,9 +72,9 @@
 
     Ref<IPC::Connection> m_connection;
     SpeechRecognitionServerIdentifier m_identifier;
-    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_pendingRequests;
-    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_ongoingRequests;
+    HashMap<WebCore::SpeechRecognitionConnectionClientIdentifier, std::unique_ptr<WebCore::SpeechRecognitionRequest>> m_requests;
     SpeechRecognitionPermissionChecker m_permissionChecker;
+    std::unique_ptr<WebCore::SpeechRecognizer> m_recognizer;
 };
 
 } // namespace WebKit

Modified: trunk/Source/WebKit/UIProcess/WebPageProxy.cpp (270157 => 270158)


--- trunk/Source/WebKit/UIProcess/WebPageProxy.cpp	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/UIProcess/WebPageProxy.cpp	2020-11-22 05:51:10 UTC (rev 270158)
@@ -8174,6 +8174,13 @@
 #endif
 }
 
+void WebPageProxy::syncIfMockDevicesEnabledChanged()
+{
+#if ENABLE(MEDIA_STREAM)
+    userMediaPermissionRequestManager().syncWithWebCorePrefs();
+#endif
+}
+
 void WebPageProxy::beginMonitoringCaptureDevices()
 {
 #if ENABLE(MEDIA_STREAM)

Modified: trunk/Source/WebKit/UIProcess/WebPageProxy.h (270157 => 270158)


--- trunk/Source/WebKit/UIProcess/WebPageProxy.h	2020-11-22 04:51:13 UTC (rev 270157)
+++ trunk/Source/WebKit/UIProcess/WebPageProxy.h	2020-11-22 05:51:10 UTC (rev 270158)
@@ -1825,6 +1825,8 @@
     void requestSpeechRecognitionPermission(const WebCore::ClientOrigin&, CompletionHandler<void(SpeechRecognitionPermissionDecision)>&&);
     void requestSpeechRecognitionPermissionByDefaultAction(const WebCore::SecurityOrigin&, CompletionHandler<void(bool)>&&);
 
+    void syncIfMockDevicesEnabledChanged();
+
 private:
     WebPageProxy(PageClient&, WebProcessProxy&, Ref<API::PageConfiguration>&&);
     void platformInitialize();