Jan writes,

> So why not 1Gb/sec? Or even higher in the next 5 years?

“Why The Internet Pipes Will Burst When Virtual Reality Takes Off”

Feb 9th, 2016. Written by Bo Begole (VP and Global Head of Huawei 
Technologies’ Media Lab)

http://www.forbes.com/sites/valleyvoices/2016/02/09/why-the-internet-pipes-will-burst-if-virtual-reality-takes-off/#47bcc01a64e8


Last month, I talked about the coming era of responsive media, media that 
changes the content dynamically to fit the consumer's attention, engagement and 
situation. Some of the technologies needed to make that happen are coming out 
now (VR goggles, emotion-sensing algorithms, and multi-camera systems), but 
there's one fly in the ointment: the bandwidth required by responsive media 
could be too much for the internet's pipes to carry.

That's a pretty dramatic claim so let me explain where this alarm is coming 
from. The aim of virtual reality is to generate a digital experience at the 
full fidelity of human perception - to recreate every photon your eyes would 
see, every small vibration your ears would hear and eventually other details 
like touch, smell and temperature.

That's a tall order because humans can process the equivalent of nearly 5.2 
gigabits per second of sound and light - about 200x what the US Federal 
Communications Commission predicts to be the future requirement for broadband 
networks (25 Mbps).

Woah! Bringing this down to the bottom line, assuming no head or body rotation, 
each of our 2 eyes can receive 720 million pixels, at 36 bits per pixel for 
full color and at 60 frames per second: that's 3.1 trillion (tera) bits per 
second! Today's compression standards can reduce that by a factor of 300, and 
even if future compression could reach a factor of 600 (which is the goal of 
future video standards), that still means we need 5.2 gigabits per second of 
network throughput; maybe more.
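
For anyone who wants to check the arithmetic, here is a quick back-of-envelope 
sketch in Python using the figures above; the 600:1 ratio is the article's 
optimistic future-codec assumption, and the 25 Mbps figure is the FCC broadband 
requirement mentioned earlier:

# Back-of-envelope check of the article's VR bandwidth figures.
pixels_per_eye = 720e6        # pixels each eye can receive
eyes = 2
bits_per_pixel = 36           # full color
frames_per_second = 60

raw_bps = pixels_per_eye * eyes * bits_per_pixel * frames_per_second
print(f"Uncompressed: {raw_bps / 1e12:.1f} Tbps")             # ~3.1 Tbps

compression = 600             # optimistic future codec
compressed_bps = raw_bps / compression
print(f"Compressed:   {compressed_bps / 1e9:.1f} Gbps")       # ~5.2 Gbps
print(f"vs 25 Mbps broadband: {compressed_bps / 25e6:.0f}x")  # ~207x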

"Hold on, Chicken Little," you are saying. "5.2 Gbps is just a theoretical 
upper limit. No cameras or displays today can deliver 30K resolution - we're 
only seeing 8K cameras coming out this year."

But there's the rub. Cinematographers and even consumers are no longer using 
just a single camera to create experiences. I mentioned several 360 degree 
panorama camera systems last month - these rigs generally consist of 16 or more 
outward facing cameras. At today's 4K resolution, 30 frames per second and 24 
bits per pixel, and using a 300:1 compression ratio, these rigs generate 300 
megabits per second of imagery. That's more than 10x the typical requirement 
for a high-quality 4K movie experience.
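
The rig figure works out much the same way; here is a rough check in Python, 
assuming "4K" means 3840 x 2160 (one common definition, not specified in the 
article) and using the 16-camera, 30 fps, 24 bits per pixel and 300:1 
compression figures above:

# Rough check of the 360-degree panorama rig figure.
cameras = 16
pixels_per_frame = 3840 * 2160   # assumed 4K resolution
fps = 30
bits_per_pixel = 24
compression = 300

raw_bps = cameras * pixels_per_frame * fps * bits_per_pixel
compressed_mbps = raw_bps / compression / 1e6
print(f"Rig output: {compressed_mbps:.0f} Mbps")  # ~319 Mbps, roughly the 300 cited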

There's more. While panorama camera rigs face outward, there's another kind of 
system where the cameras face inward to capture live events. This year's Super 
Bowl, for example, was covered by 70 cameras, 36 of which were devoted to a new 
kind of capture system called Eye Vision 360 which allows the action to be 
frozen while the audience pans around the center of the action to see the 
details of the play at the line! Did the ball cross the goal line? Was the 
player fouled? See for yourself by spinning the view to any angle you choose.

Previously, these kinds of effects were only possible in video games or 
Matrix-style movies because they require heavy computation to stitch the 
multiple views together. Heavy-duty post-processing means that such effects 
haven't been available during live action in the past, but soon enough they 
will be. That's because new network architectures can move the post-processing 
off high-end workstations at the cameras, so that processors at the edge of the 
network and the client display devices (VR goggles, smart TVs, tablets and 
phones) themselves can perform the advanced image processing needed to stitch 
the camera feeds into dramatic effects.

"Okay, Chicken Little," you continue, "but even when there are 16 or 36 or 70 
or more cameras capturing a scene, audiences today only see one view at a time. 
So the bandwidth requirements would not total up to the sum of all the cameras 
in the rig. Plus, dynamic caching and multicast should be able to reduce the 
load, by delivering content to thousands from a single feed."

Ah, but Responsive Media changes that. VR goggles and Tango-embedded tablets 
will let audiences dynamically select their individual point of view. That 
means two things: 1) the feed from all of the cameras needs to be available in 
an instant and 2) conventional multicast won't be possible when each audience 
member selects an individualized viewpoint. So, there it is. The network will 
be overwhelmed. The sky is falling.

Of course, network capacity is increasing every day, which will help alleviate 
the problem, but increasing the fixed network capacity is really only a finger 
in the dike.

In addition to increased capacity, the network architecture needs to be able to 
adapt dynamically by using technologies like software-defined networking and 
network function virtualization. By transforming our old network infrastructure 
away from hard-wired switches and router boxes and into a software platform, it 
will become more flexible in how resources are allocated to meet changing 
demands. New capacity can be added for major events at lower cost using 
commodity computer hardware rather than specialized, inflexible network boxes.

Finally, video/audio compression technologies are being developed that can 
achieve much higher compression ratios for these new multi-camera systems. 
Whereas conventional video compression gets most of its bang from the 
similarity between one frame and the next (called temporal redundancy), VR 
compression adds to that by also exploiting the similarity among the images 
from different cameras (the sky, trees, large buildings and so on, called 
spatial redundancy) and by using intelligent slicing and tiling techniques, so 
that full 360 degree video experiences can be delivered with less bandwidth.
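
To give a feel for how tiling helps, here is a toy illustration in Python. The 
numbers (a hypothetical 100 Mbps full-sphere stream, 32 tiles, 8 tiles in view, 
out-of-view tiles at 20% quality) are made-up assumptions for illustration, not 
figures from the article or from any particular codec:

# Toy illustration of viewport-dependent tiled streaming for 360-degree video.
full_sphere_mbps = 100     # hypothetical full-quality bitrate, whole sphere
tiles = 32                 # sphere split into 32 tiles
visible_tiles = 8          # tiles covering the viewer's current field of view
low_quality_factor = 0.2   # out-of-view tiles sent at 20% of full bitrate

per_tile = full_sphere_mbps / tiles
tiled_mbps = (visible_tiles * per_tile
              + (tiles - visible_tiles) * per_tile * low_quality_factor)
print(f"Naive full-sphere stream: {full_sphere_mbps} Mbps")
print(f"Viewport-tiled stream:    {tiled_mbps:.0f} Mbps")  # 40 Mbps here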

All of these advances may still not be enough to reach the theoretical limits 
of a fully immersive experience but they should carry us through for several 
years to come.

Ultimately, we likely need a fundamentally new network architecture - one that 
can dynamically multicast and cache multiple video feeds close to consumers and 
then perform advanced video processing within the network to construct 
individualized views.

Responsive media will require this kind of information-centric networking, 
which avoids the choke points of conventional host-centric networking and 
allows the network itself to optimally distribute bits only where they are 
needed, so that we can fully enjoy the future of individualized responsive 
media.


And also, quote:

“Virtual Reality (VR) video experiences will be the next major 
bandwidth-consuming application, according to ARRIS CTO Charles Cheevers.  
ARRIS estimates that a VR game in 720p will require 50 Mbps, and a 4K VR game 
(do they exist yet?!) will need 500 Mbps. “So maybe VR is the one that drives 
the need for gigabit speeds, gigabit Wi-Fi and all that stuff,” Cheevers said.”

Ref: 
http://www.onlinereporter.com/2016/06/17/arris-gives-us-hint-bandwidth-requirements-vr/

Cheers,
Stephen
