Hi,

we've already implemented an open-source library that integrates WebGL, WebAudio, DeviceOrientation and gUM to create an easy-to-use VR/AR framework that runs on Rift, Glass, mobile, tablet, PC, etc.

Here's an overview of our API.

  https://buildar.com/awe/tutorials/intro_to_awe.js/index.html

And the project is in our GitHub repo.

  https://github.com/buildar/awe.js

Hope that's relevant.

PS: We're also working on a Depth Stream Extension proposal to add depth camera support to gUM.

roBman


On 27/03/14 5:58 AM, Lars Knudsen wrote:
I think it could make sense to put stuff like this as an extension on top of WebGL and WebAudio as they are the only two current APIs close enough to the bare metal/low latency/high performance to get a decent experience. Also - I seem to remember that some earlier generation VR glasses solved the game support problem by providing their own GL and Joystick drivers (today - probably device orientation events) so many games didn't have to bother (too much) with the integration.

In theory - we could:

 - extend (if needed at all) WebGL to provide stereo vision
 - hook up WebAudio as is (as it supports audio objects, Doppler effect, etc., similar to OpenAL)
 - hook up DeviceOrientation/Motion in desktop browsers if a WiiMote, HMD or other is connected
 - hook up getUserMedia as is to the potential VR camera

..and make it possible to do low latency paths/hooks between them if needed.

It seems that all (or at least most) of the components are already present - but proper hooks need to be made for desktop browsers at least (afaik .. it's been a while ;))

- Lars


On Wed, Mar 26, 2014 at 7:18 PM, Brandon Jones wrote:

    So there's a few things to consider regarding this. For one, I
    think your ViewEvent structure would need to look more like this:

    interface ViewEvent : UIEvent {
        // Where Quaternion is 4 floats. Prevents gimbal lock.
        readonly attribute Quaternion orientation;
        // offset X from the calibrated center 0 in millimeters
        readonly attribute float offsetX;
        // offset Y from the calibrated center 0 in millimeters
        readonly attribute float offsetY;
        // offset Z from the calibrated center 0 in millimeters
        readonly attribute float offsetZ;
        // Acceleration along X axis in m/s^2
        readonly attribute float accelerationX;
        // Acceleration along Y axis in m/s^2
        readonly attribute float accelerationY;
        // Acceleration along Z axis in m/s^2
        readonly attribute float accelerationZ;
    }
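
To make the gimbal lock point concrete, here is a minimal sketch (not part of either proposal). It assumes one common convention, yaw about Z, then pitch about Y, then roll about X; the type and function names are made up for illustration. With pitch at 90 degrees, two different roll/yaw pairs collapse to the same orientation, which a quaternion represents without ambiguity.

```typescript
// Illustrative only: these names are not part of any proposed API.
type OrientationQuat = { w: number; x: number; y: number; z: number };

// Assumed convention: yaw about Z, then pitch about Y, then roll about X.
function eulerToQuat(roll: number, pitch: number, yaw: number): OrientationQuat {
  const cr = Math.cos(roll / 2), sr = Math.sin(roll / 2);
  const cp = Math.cos(pitch / 2), sp = Math.sin(pitch / 2);
  const cy = Math.cos(yaw / 2), sy = Math.sin(yaw / 2);
  return {
    w: cr * cp * cy + sr * sp * sy,
    x: sr * cp * cy - cr * sp * sy,
    y: cr * sp * cy + sr * cp * sy,
    z: cr * cp * sy - sr * sp * cy,
  };
}

// Component-wise comparison with a small tolerance for rounding error.
function quatsClose(a: OrientationQuat, b: OrientationQuat, eps: number = 1e-9): boolean {
  return Math.abs(a.w - b.w) < eps && Math.abs(a.x - b.x) < eps &&
         Math.abs(a.y - b.y) < eps && Math.abs(a.z - b.z) < eps;
}

// With pitch at 90 degrees only (roll - yaw) matters, so two different
// angle triples describe the same orientation.
const straightUp = Math.PI / 2;
const tripleA = eulerToQuat(0.3, straightUp, 0.1);
const tripleB = eulerToQuat(0.5, straightUp, 0.3);
console.log(quatsClose(tripleA, tripleB)); // true
```

Away from that pitch the same two roll/yaw pairs give different quaternions, so the lost degree of freedom is specific to the locked configuration.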

    You have to deal with explicit units for a case like this and not
    clamped/normalized values. What would a normalized offset of 1.0
    mean? Am I slightly off center? At the other end of the room? It's
    meaningless without a frame of reference. Same goes
    for acceleration. You can argue that you can normalize to 1.0 ==
    9.8 m/s^2 but the accelerometers will happily report values
    outside that range, and at that point you might as well just
    report in a standard unit.
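
A small sketch of why a clamped, normalized acceleration loses information. It assumes the 1.0 == 9.8 m/s^2 normalization mentioned above; the function names and the 2 g reading are hypothetical.

```typescript
// Illustrative only: 1.0 == 9.8 m/s^2 is the normalization discussed above.
const ONE_G_MS2 = 9.8;

// Normalize to "g" and clamp to the range -1..1.
function toClampedNormalized(accelMs2: number): number {
  return Math.max(-1, Math.min(1, accelMs2 / ONE_G_MS2));
}

// Convert a normalized value back to m/s^2.
function fromNormalized(normalized: number): number {
  return normalized * ONE_G_MS2;
}

// Hypothetical reading of 2 g, which accelerometers can report.
const sharpHeadTurn = 2 * ONE_G_MS2;
const recovered = fromNormalized(toClampedNormalized(sharpHeadTurn));
console.log(sharpHeadTurn, recovered); // 19.6 9.8
```

Reporting the raw value in m/s^2 avoids both the clamp and the need to agree on what 1.0 means.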

    As for things like eye position and such, you'd want to query that
    separately (no sense in sending it with every device), along with
    other information about the device capabilities (screen
    resolution, FOV, lens distortion factors, etc.). And you'll
    want to account for the scenario where more than one device is
    connected to the browser.
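
One way to picture that split, as a sketch only: static capabilities are looked up per device, and each pose sample just carries an identifier saying which device produced it. Every name and value below is hypothetical; nothing here comes from a real or proposed API.

```typescript
// Hypothetical shapes: none of these names come from a real or proposed API.
interface HmdInfo {
  deviceId: string;
  resolutionPx: [number, number];
  horizontalFovDeg: number;
}

interface HmdPoseSample {
  deviceId: string; // which device produced this sample
  orientation: [number, number, number, number]; // quaternion components
}

// Static data, queried once per device. The values are placeholders.
const connectedHmds: { [deviceId: string]: HmdInfo } = {
  "hmd-0": { deviceId: "hmd-0", resolutionPx: [1280, 800], horizontalFovDeg: 90 },
  "hmd-1": { deviceId: "hmd-1", resolutionPx: [1920, 1080], horizontalFovDeg: 100 },
};

// Look up the capabilities of the device that produced a sample.
function infoForSample(sample: HmdPoseSample): HmdInfo | undefined {
  return connectedHmds[sample.deviceId];
}

const poseSample: HmdPoseSample = { deviceId: "hmd-1", orientation: [0, 0, 0, 1] };
const hmdInfo = infoForSample(poseSample);
console.log(hmdInfo ? hmdInfo.resolutionPx.join("x") : "unknown device"); // 1920x1080
```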

    Also, if this is going to be a high quality experience you'll want
    to be able to target rendering to the HMD directly and not rely on
    OS mirroring to render the image. This is a can of worms in and of
    itself: How do you reference the display? Can you manipulate a DOM
    tree on it, or is it limited to WebGL/Canvas2D? If you can render
    HTML there how do the appropriate distortions get applied, and how
    do things like depth get communicated? Does this new rendering
    surface share the same JavaScript scope as the page that launched
    it? If the HMD refreshes at 90 Hz and your monitor refreshes at
    60 Hz, when does requestAnimationFrame fire? These are not simple
    questions, and need to be considered carefully to make sure that
    any resulting API is useful.
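
The refresh-rate question can be illustrated with a toy model. It assumes, purely for the sake of argument, that new frames are only produced at the monitor's rate while the HMD refreshes faster; it says nothing about how any browser actually schedules requestAnimationFrame. The function name is made up.

```typescript
// Toy model, not a statement about how any browser schedules frames:
// the display refreshes at displayHz, but new frames are only produced
// at callbackHz (e.g. a callback tied to a slower monitor).
function countRepeatedFrames(displayHz: number, callbackHz: number, seconds: number = 1): number {
  let repeats = 0;
  let lastFrame = -1;
  const refreshes = displayHz * seconds;
  for (let i = 0; i < refreshes; i++) {
    // Index of the newest frame finished by the time of refresh i.
    const newestFrame = Math.floor((i * callbackHz) / displayHz);
    if (newestFrame === lastFrame) repeats++;
    lastFrame = newestFrame;
  }
  return repeats;
}

console.log(countRepeatedFrames(90, 60)); // 30
console.log(countRepeatedFrames(60, 60)); // 0
```

Under that assumption a 90 Hz headset would show a stale frame on every third refresh, which is one reason the scheduling question matters.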

    Finally, it's worth considering that for a VR experience to be
    effective it needs to be pretty low latency. Put bluntly: Browsers
    suck at this. Optimizing for scrolling large pages of flat
    content, text, and images is very different from optimizing for
    realtime, super low latency I/O. If you were to take an Oculus
    Rift and plug it into one of the existing browser/Rift demos
    <https://github.com/Instrument/oculus-bridge> with Chrome, you'd
    probably find that in the best case the rendering lags behind your
    head movement by about 4 frames. Even if your code is rendering at
    a consistent 60 Hz that means you're seeing ~67 ms of lag, which
    will result in a motion-sickness-inducing "swimming" effect where
    the world is constantly catching up to your head position. And
    that's not even taking into account the question of how well
    JavaScript/WebGL can keep up with rendering two high resolution
    views of a moderately complex scene, something that even modern
    gaming PCs can struggle with.
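
The ~67 ms figure is just frames multiplied by frame time. A minimal sketch of that arithmetic, with a made-up function name:

```typescript
// Lag in milliseconds when rendering trails head movement by a number
// of frames at a given refresh rate.
function lagMs(framesBehind: number, refreshHz: number): number {
  return (framesBehind * 1000) / refreshHz;
}

console.log(lagMs(4, 60).toFixed(1)); // 66.7
// Same frame count at 90 Hz, as a hypothetical comparison only.
console.log(lagMs(4, 90).toFixed(1)); // 44.4
```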

    That's an awful lot of work for technology that, right now, does
    not have a large user base and for which the standards and
    conventions are still being defined. I think that you'll have a
    hard time drumming up support for such an API until the technology
    becomes a little more widespread.

    (Disclaimer: I'm very enthusiastic about current VR research. If I
    sound negative it's because I'm being practical, not because I
    don't want to see this happen)

    --Brandon


    On Wed, Mar 26, 2014 at 12:34 AM, Brandon Andrews wrote:

        I searched, but I can't find anything relevant in the
        archives. Since pointer lock is now well supported, I think
        it's time to begin thinking about virtual reality APIs. Since
        this is a complex topic I think any spec should start simple.
        With that I'm proposing we have a discussion on adding a head
        tracking API. This should be very generic with just position and
        orientation information. So no matter if the data is coming
        from a webcam, a VR headset, or a pair of glasses with eye
        tracking in the future the interface would be the same. This
        event would be similar to mouse move with a high sample rate
        (which is why in the event the head tracking and eye tracking
        are in the same event representing a user's total view).

        interface ViewEvent : UIEvent {
            // radians, positive is slanting the head to the right
            readonly attribute float roll;
            // radians, positive is looking up
            readonly attribute float pitch;
            // radians, positive is looking to the right
            readonly attribute float yaw;
            // offset X from the calibrated center 0 in the range -1 to 1
            readonly attribute float offsetX;
            // offset Y from the calibrated center 0 in the range -1 to 1
            readonly attribute float offsetY;
            // offset Z from the calibrated center 0 in the range -1 to 1,
            // and 0 if not supported
            readonly attribute float offsetZ;
            // left eye X position in screen coordinates from -1 to 1
            // (but not clamped) where 0 is the default if not supported
            readonly attribute float leftEyeX;
            // left eye Y position in screen coordinates from -1 to 1
            // (but not clamped) where 0 is the default if not supported
            readonly attribute float leftEyeY;
            // right eye X position in screen coordinates from -1 to 1
            // (but not clamped) where 0 is the default if not supported
            readonly attribute float rightEyeX;
            // right eye Y position in screen coordinates from -1 to 1
            // (but not clamped) where 0 is the default if not supported
            readonly attribute float rightEyeY;
        }

        Then like the pointer lock spec the user would be able to
        request view lock to begin sampling head tracking data from
        the selected source. There would thus be a view lock change event.
        (It's not clear how the browser would list which sources to
        let the user choose from. So if they had a webcam method that
        the browser offered and an Oculus Rift then both would show
        and the user would need to choose).

        Now for discussion. Are there any features missing from the
        proposed head tracking API or features that VR headsets offer
        that need to be included from the beginning? Also I'm not sure
        what it should be called. I like "view lock", but it was my
        first thought so "head tracking" or something else might fit
        the scope of the problem better.

        Some justifications. The offset and head orientation are
        self-explanatory and calibrated by the device. The eye offsets
        would be more for a UI that selects or highlights things as
        the user moves their eyes around. Examples would be a web
        enabled HUD on VR glasses and a laptop with a precision
        webcam. The user calibrates with their device software which
        reports the range (-1, -1) to (1, 1) in screen space. The
        values are not clamped so the user can look beyond the
        calibrated ranges. Separate left and right eye values enable
        precision and versatility since most hardware supporting eye
        tracking will have raw values for each eye.
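
As a rough sketch of how a UI might consume the eye values: map the normalized, unclamped coordinates to pixels and check whether the gaze is still on screen. This assumes +x is right and +y is up, which the proposal does not specify, and it averages the two eyes as one simple choice among many. The type and function names are hypothetical.

```typescript
// Illustrative only. Assumes +x is right and +y is up in the normalized
// space; the proposal does not say which way y points.
type EyeSample = { leftEyeX: number; leftEyeY: number; rightEyeX: number; rightEyeY: number };
type GazePoint = { xPx: number; yPx: number; onScreen: boolean };

function gazeToPixels(s: EyeSample, widthPx: number, heightPx: number): GazePoint {
  // Average both eyes into one gaze point.
  const nx = (s.leftEyeX + s.rightEyeX) / 2;
  const ny = (s.leftEyeY + s.rightEyeY) / 2;
  return {
    xPx: ((nx + 1) / 2) * widthPx,
    yPx: ((1 - ny) / 2) * heightPx,
    // Values are not clamped, so a gaze can land outside the screen.
    onScreen: Math.abs(nx) <= 1 && Math.abs(ny) <= 1,
  };
}

const centered = gazeToPixels({ leftEyeX: 0, leftEyeY: 0, rightEyeX: 0, rightEyeY: 0 }, 1920, 1080);
console.log(centered.xPx, centered.yPx, centered.onScreen); // 960 540 true

const pastEdge = gazeToPixels({ leftEyeX: 1.5, leftEyeY: 0, rightEyeX: 1.5, rightEyeY: 0 }, 1920, 1080);
console.log(pastEdge.xPx, pastEdge.onScreen); // 2400 false
```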





--
Rob

Check out my new book "Getting started with WebRTC" - it's a 5 star hit
on Amazon http://www.amazon.com/dp/1782166300/?tag=packtpubli-20

CEO & co-founder
http://MOB-labs.com

Chair of the W3C Augmented Web Community Group
http://www.w3.org/community/ar

Invited Expert with the ISO, Khronos Group & W3C

