Re: [whatwg] resource hints and separating download from processing

2014-08-08 Thread Ben Maurer
Sweet, I think we're on the same page. I completely agree with your
high-level goal of a declarative fetch interface.

Agreed that matching is still critical, e.g. for the Link header use case.


On Thu, Aug 7, 2014 at 10:40 PM, Ilya Grigorik igrigo...@gmail.com wrote:




 On Thu, Aug 7, 2014 at 4:39 PM, Ben Maurer ben.mau...@gmail.com wrote:

 On Thu, Aug 7, 2014 at 3:21 PM, Ilya Grigorik igrigo...@gmail.com
 wrote:


 It would be nice if there was a more explicit relationship between
 the declarative fetch and the eventual use of the resource (assuming the
 resources are on the same page).


 I would like to break that dependency. I want layering separation where
 we have a clean way to fetch resources (with custom parameters like
 headers, priorities, dependencies, etc), and a layer that's responsible for
 consuming fetched resources for processing in the right context (enforcing
 security policies, etc), and at the right time -- e.g. see Example 3 under:
 https://igrigorik.github.io/resource-hints/#preload


 So I guess my worry here is that the loose dependency could be hard to
 debug. As a concrete example, we use crossorigin=anonymous on our script
 tags so that we can get stack traces. My understanding is that this
 requires an Origin header to be sent with the request and a CORS header in
 the response. If my <link rel=preload> doesn't have a crossorigin
 setting, the requests wouldn't match up.


 We can't control the response bits, but as far as emitting the right
 request headers and communicating stream priority + dependencies, all of
 that could be exposed via some set of options when the request is built up.
 This is where we get back to the syntax (e.g. params={...}, or some
 such), which would probably map directly to various Fetch API use cases and
 options. Effectively, I want a declarative Fetch interface.
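
 For illustration only -- the attribute and option names here are
 hypothetical, nothing is specced yet -- such a declarative fetch could
 look like:

 <link rel="preload" href="/assets/app.js"
   params='{"mode": "cors", "credentials": "omit", "priority": "high"}'>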


 I guess what I'm asking for here is some programmatic way of connecting
 the fetch with the consumption. For example, exposing the fetch object of
 the rel=preload and allowing you to construct a script tag explicitly with
 the fetch object.


 +1. That said, we still *need* a mechanism where the matching does not
 rely on manual plumbing with JavaScript - e.g. server returns a set of
 hints via Link header for critical resources, which must be matched by the
 user agent against appropriate requests later in the doc.
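
 For instance (illustrative syntax only):

 Link: </critical/app.js>; rel=preload
 Link: </critical/hero.css>; rel=preload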

 ig



Re: [whatwg] resource hints and separating download from processing

2014-08-08 Thread bizzbyster
+1 to breaking the dependency between fetching the resource and how it is later 
used in the document. This type of “late binding” enables many page 
optimizations.

Peter

PS. My apologies if you’ve already seen a similar message from me. I’m not sure 
whether the mailing list is accepting my messages yet.




Re: [whatwg] Notifications improvements

2014-08-08 Thread Peter Beverloo
Hi Andrew,

On Wed, Aug 6, 2014 at 12:59 PM, Andrew Wilson atwil...@google.com wrote:

 On Wed, Aug 6, 2014 at 12:48 PM, Anne van Kesteren ann...@annevk.nl
 wrote:
  On Wed, Aug 6, 2014 at 10:08 AM, Andrew Wilson atwil...@google.com
 wrote:
  I understand your concern that is driving this proposal: you don't
  want to provide rich APIs that can't be well implemented on every
  platform, and thereby fragment the web platform. I just don't want to
  see us go down this path of adding these new notification types that
  are so limited in ability that people will just keep using general
  notifications anyway - I'd rather just stick with the existing API.
 
  Are you unenthusiastic about any of the proposed additions (what about
  those already added?) or is this more about the more complex
  features such as indication of progress?

 I'm (somewhat) unenthusiastic about the new semantic types, because
 I'm not sure they'd get enough uptake to be worth the effort to
 implement (note that this is just my personal opinion - I'm no longer
 as heavily involved in the notification work within Chromium, so Peter
 B's opinion carries much more weight than mine when it comes to
 determining what we'd implement).


I find myself in favor of the semantic types.

The primary reason is that it allows us to provide a much more consistent
user experience than HTML notifications would, especially on platforms
where we don't control the rendering of the notification. On Chrome for
Android we want notifications to be visually indistinguishable (aside from
clarifying the origin) from those created by native apps. The proposed
semantic types would get us there.

Furthermore, support for HTML notifications would be much more difficult to
implement across the board. Some mobile OSes, notably Firefox OS and iOS,
wouldn't support it at all. Others, such as Android, theoretically could,
but won't, because it would mean creating an entire WebView --
causing very significant memory pressure on already resource-constrained
devices.

Implementation-wise, Chrome recently switched to rendering notifications
using a new message center implementation, which already supports rich data
such as progress bars, timestamps, lists and buttons. On Android, many of
these features come down to calling an extra method on the Java
Notification.Builder.

 I am quite enthusiastic about adding support in the API around
 allowing users to interact with notifications after the parent page
 has closed, however.

 
  Having a new field timestamp that carries a particular point in time
  related to the message seems quite useful for instance and not very
  intrusive.

 Perhaps I'm not understanding how this would be used. What would such
 a notification look like for (say) a calendar event ("Your 1PM meeting
 starts in 5 minutes") versus a countdown timer ("In 73 seconds, your
 hard boiled egg will be done cooking")? Would we have to provide a
 format string so the UA knows where to inject the time remaining, or
 would it always have to be put at the end? Do we need to specify
 granularity of updates (i.e. if I have a stopwatch app, I probably
 want to update every second, but for a calendar app, updating every 5
 minutes is probably sufficient, until we get down to the last 5
 minutes)?


I would argue that this is a feature that should be defined by the UA or
platform. I'm not sure whether a counting notification is the right way to
implement a countdown -- I would expect normal, foreground UI displaying a
counter, and a notification only if that UI was moved to the background and
the counter expired.

The timestamp doesn't necessarily need to be in the future either.

 At least for me, a calendar notification that is constantly updating
 its countdown every minute but has no snooze functionality is actually
 a mis-feature, because it's distracting.


The notification wouldn't visually re-announce itself on every update.

 Again, I don't want to be overly negative about this - maybe there are
 use cases like list notifications where the new API would be really
 useful, but as soon as I start thinking about more complex display
 scenarios, I immediately want to start having more control over the
 formatting of the text being displayed, and I wouldn't get that with
 this proposal.


I think the benefit of being able to closely match the UI and UX of native
notifications on each platform is something that's enabled by using
declarative properties, whereas that would be near impossible to achieve
with HTML notifications. As long as advanced features, especially the more
compatible ones (say, buttons or timestamps), can be feature-detected by
developers so that they can provide a fallback when they're not available,
I would be in favor of extending the feature set with Jonas' declarative
proposals.

Thanks,
Peter


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Ashley Gullen

 As Justin stated, 20% of current Chrome users fall back to canvas 2d.


1. What fraction of those 20% actually still get a GPU-accelerated canvas
vs. software rendering? Batching will be of very little use to the
software-rendered audience, making it an even smaller target market.
2. In Firefox's case, that number has dropped from 67% to 15% over a couple
of years. Surely in time it will fall even further, to a negligible
amount. Why standardise a feature whose target market is disappearing?


 Small developers that don't have the resources to develop for concurrent
 WebGL and Canvas2D code paths


They don't have to: there are free, high-quality open-source libraries like
Pixi.js that do this already, so even small developers have an easy way
to make use of a WebGL renderer without much extra effort.


 When do you envision that OpenGL drivers are bug free everywhere? History
 is not on your side here...
 I would much rather have something short term that can be implemented with
 low effort and improves performance.


No software is ever bug-free, but this is irrelevant. To not be
blacklisted, drivers don't need to be perfect, they just need to meet a
reasonable threshold of security and reliability. If a driver is insecure
or crashes constantly it is blacklisted. Drivers are being improved so they
are no longer this poor, and it is not unrealistic to imagine 99%+ of
drivers meeting this threshold in the near future, even if none of them are
completely bug-free.


 I don't really understand why you and Brian are so opposed to improving
 the performance of canvas 2D.


I see it as a feature targeted at a rapidly shrinking segment of the
market that will disappear in the long run, leaving the web platform with
unnecessary API cruft.


 Following your logic, why work on new canvas or SVG features as they can
 theoretically be emulated in WebGL?
 Or now that we have asm.js, why even bother with new JavaScript features?


I am in general against duplication on the web platform, but new features
deserve to be implemented if they have a valid use case or solve a real
problem. In this case I don't see that any real problem is being solved,
since widely available frameworks and engines already solve it with WebGL
in a way accessible even to individual developers, and this solution is
already production-grade and widely deployed.

On further thought, this particular proposal doesn't even appear to solve
the batching problem very well. Many games consist of large numbers of
rotated sprites. If a canvas2d batching facility needs to break the batch
every time it needs to call rotate(), this will revert to individual
draw-calls for many kinds of game. WebGL does not have this limitation and
can batch objects with a variety of scales, angles, tiling, opacity and
more into single calls. This is done by control over individual vertex
positions and texture co-ordinates, which is a fundamental break from the
style of the canvas2d API. Therefore even with the proposed batching
facility, for maximum performance it is still necessary to use WebGL. This
proposal solves a very narrowly defined performance problem.
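
For illustration, here is roughly how WebGL batchers bake per-sprite
rotation into vertex positions on the CPU, so any mix of angles still ends
up in one buffer and a single draw call (a sketch; the names are mine):

// Sketch: append one rotated sprite quad (centered at x, y) to a
// Float32Array of vertex positions; returns the new write offset.
function pushSprite(verts, offset, x, y, w, h, angle) {
  var c = Math.cos(angle), s = Math.sin(angle);
  var corners = [[-w/2, -h/2], [w/2, -h/2], [w/2, h/2], [-w/2, h/2]];
  for (var i = 0; i < 4; ++i) {
    verts[offset++] = x + corners[i][0] * c - corners[i][1] * s;
    verts[offset++] = y + corners[i][0] * s + corners[i][1] * c;
  }
  return offset;
}
// After filling 'verts' for all sprites, one upload and one draw:
// gl.bufferSubData(gl.ARRAY_BUFFER, 0, verts);
// gl.drawElements(gl.TRIANGLES, spriteCount * 6, gl.UNSIGNED_SHORT, 0);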

An alternate solution is for browser vendors to implement canvas2d entirely
in JS on top of WebGL. This reduces per-call overhead by staying in JS
land, while not needing to add any new API surface. In fact it looks like
this has already been attempted here:
https://github.com/corbanbrook/webgl-2d - I'd suggest researching, before
speccing this feature, whether the same performance goals can be achieved
with a well-batched JS implementation that reduces the number of calls
into the browser, which is exactly how existing WebGL renderers
outperform canvas2d despite both being GPU accelerated.

Ashley


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Justin Novosad
On Fri, Aug 8, 2014 at 10:25 AM, Ashley Gullen ash...@scirra.com wrote:

 As Justin stated, 20% of current Chrome users fall back to canvas 2d.


 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


~25% of Chrome users that do not have gpu-accelerated WebGL do have
gpu-accelerated 2d canvas. Nonetheless, the software implementation of 2D
canvas is way faster than software-emulated WebGL, so it makes sense to
fall back to 2D canvas any time accelerated WebGL is unavailable.


 No software is ever bug-free, but this is irrelevant. To not be
 blacklisted, drivers don't need to be perfect, they just need to meet a
 reasonable threshold of security and reliability. If a driver is insecure
 or crashes constantly it is blacklisted. Drivers are being improved so they
 are no longer this poor, and it is not unrealistic to imagine 99%+ of
 drivers meeting this threshold in the near future, even if none of them are
 completely bug-free.


The problem is the long tail of old devices. There is an astonishingly
large number of machines in the world running outdated OSes that no longer
receive OS or graphics driver updates. Also, AFAIK, display drivers tend
not to have auto-updates the way OSes and browsers do. So even if there is
an updated driver out there, most users are unlikely to install it until
they try to use some software that explicitly requires it.



  Many games consist of large numbers of rotated sprites.


The proposal includes the possibility of specifying a per-draw
transformation matrix. And as Katelyn suggested, we could add a variant
that takes an alpha value.
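
To sketch the shape of that (the parameter layout here is illustrative,
not the exact format from the proposal): each sprite could carry a source
rect plus a 2x3 transform in a single typed array, consumed by one call:

var a = Math.PI / 4, cos = Math.cos(a), sin = Math.sin(a);
// Illustrative layout per sprite: sx, sy, sw, sh (source rect) followed
// by a, b, c, d, e, f (2x3 transform).
var params = new Float32Array([
   0, 0, 32, 32,    1,   0,    0,   1, 100, 50,  // untransformed sprite
  32, 0, 32, 32,  cos, sin, -sin, cos, 200, 80   // sprite rotated 45 deg
]);
ctx.drawImageBatch(spriteSheet, params); // one crossing for the batch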

I will post an update to this thread as soon as I have more compelling
performance data. My goal is to demonstrate that a 2d canvas (with batched
drawImage calls) can yield performance characteristics that are
significantly superior to WebGL's for typical 2D sprite-based game use
cases, particularly on mobile platforms.  This will be possible by
leveraging the browser's compositing framework in ways that are not
possible with WebGL.  Stay tuned.

   -Justin


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Ashley Gullen

 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


 ~25% of Chrome users that do not have gpu-accelerated WebGL do have gpu
 accelerated 2d canvas.


So 25% of 20%... you are speccing a feature for just 5% of users, correct?
(Since batching likely makes no difference when software rendered)


 The problem is the long tail of old devices.


These will still disappear with time, just like old Windows 95 machines
have. Is your intent to spec a feature that will no longer be necessary in
the future?


 I will post an update to this thread as soon as I have more compelling
 performance data. My goal is to demonstrate that a 2d canvas (with batched
 drawImage calls) can yield performance characteristics that are
 significantly superior to WebGL's for typical 2D sprite-based game use
 cases, particularly on mobile platforms.


Here's some existing data comparing a WebGL renderer
(http://www.scirra.com/demos/c2/renderperfgl/) with canvas2d
(http://www.scirra.com/demos/c2/renderperf2d) in Chrome on a Nexus 5:

canvas2d: 360 objects @ 30 FPS
webgl: ~17,500 objects @ 30 FPS

WebGL is nearly 50x (fifty!) faster than canvas2d in these results.
Do you really consider this not fast enough? And by just how much further
do you hope to improve the result?


Re: [whatwg] Notifications improvements

2014-08-08 Thread Jonas Sicking
On Fri, Aug 8, 2014 at 5:48 AM, Peter Beverloo bever...@google.com wrote:
 I think the benefit of being able to closely match the UI and UX of native
 notifications on each platform is something that's enabled by using
 declarative properties, whereas that would be near impossible to achieve
 with HTML notifications. As long as advanced features, especially the more
 compatible ones (say, buttons or timestamps), can be feature-detected by
 developers so that they can provide a fallback when they're not available,
 I would be in favor of extending the feature set with Jonas' declarative
 proposals.

Cool! I had not thought about the need to feature detect, at least for
progress/list/timestamp. My thinking had been to require that the UA
provide a textual fallback on platforms that can't render those
widgets natively.

However I think you are right that we'll need feature detection here.
We might even need two types of it.

First of all, not all UAs are going to implement all of these features
right away, so obviously they won't provide any fallback rendering
either. Detecting this seems important.

Second, it might be good to enable pages to detect platforms where a
progress bar is rendered as a native progress bar, rather than as text
fallback. This way the page can choose not to use a progress bar and
instead create its own fallback.
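
A rough sketch of the first kind of detection (the "progress" option name
here is hypothetical, nothing is specced yet):

// Hypothetical: detect whether the UA knows about a progress widget.
if ('progress' in Notification.prototype) {
  new Notification("Uploading photo", { progress: 0.4 });
} else {
  new Notification("Uploading photo (40%)"); // page-provided fallback
}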

An important point here is that pages that don't choose to test for the
latter will still get a useful rendering of the notification.

/ Jonas


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:

 Sorry, in this context rgba multiplication refers to per-channel
 multipliers (instead of only one multiplier for the alpha channel), so
 that you can color tint images when drawing them. As mentioned, it's
 used for fades, drawing colored text, and similar effects.


I see. Any reason that this couldn't be done with a 'multiply' blend mode?


 Premultiplication is a different subject, sorry if I confused you with
 the similar language. There are past discussions about both in the
 list archives.

 On Thu, Aug 7, 2014 at 10:59 AM, Rik Cabanier caban...@gmail.com wrote:
 
 
 
  On Mon, Aug 4, 2014 at 4:35 PM, Katelyn Gadd k...@luminance.org wrote:
 
  Many, many uses of drawImage involve transform and/or other state
  changes per-blit (composite mode, global alpha).
 
  I think some of those state changes could be viably batched for most
  games (composite mode) but others absolutely cannot (global alpha,
  transform). I see that you handle transform with
  source-rectangle-and-transform (nice!) but you do not currently handle
  the others. I'd suggest that this needs to at least handle
  globalAlpha.
 
  Replacing the overloading with individual named methods is something
  I'm also in favor of. I think it would be ideal if the format-enum
  argument were not there so that it's easier to feature-detect what
  formats are available (for example, if globalAlpha data is added later
  instead of in the '1.0' version of this feature).
 
 
  We can define the functions so they throw a type error if an unknown
  enum is passed. That way you can feature detect future additions to the
  enum.

  What should we do about error detection in general? If we require the
  float array to be well formed before drawing, we need an extra pass to
  make sure it is correct. If we don't require it, we can skip that pass,
  but content could be partially drawn to the canvas before the exception
  is thrown.
 
 
  I get the impression that ordering is implicit for this call - the
  batch's drawing operations occur in exact order. It might be
  worthwhile to have a way to indicate to the implementation that you
  don't care about order, so that it is free to rearrange the draw
  operations by image and reduce state changes. Doing that in userspace
  js is made difficult since you can't easily do efficient table lookup
  for images.
 
  if rgba multiplication were to make it into canvas2d sometime in the
  next decade, that would nicely replace globalAlpha as a per-draw
  value. This is an analogue to per-vertex colors in 3d graphics and is
  used in virtually every hardware-accelerated 2d game out there,
  whether to tint characters when drawing text, fade things in and out,
  or flash the screen various colors. That would be another reason to
  make feature detection easier.
 
  Would it be possible to sneak rgba multiplication in under the guise
  of this feature? ;) Without it, I'm forced to use WebGL and reduce
  compatibility just for something relatively trivial on the
  implementer's side. (I should note that from what I've heard, Direct2D
  actually makes this hard to implement.)
 
 
  Is this the other proposal to control the format of the canvas buffer
 that
  is passed to WebGL?
 
 
  On the bright side there's a workaround for RGBA multiplication based
  on generating per-channel bitmaps from the source bitmap (k, r/g/b),
  then blending them source-over/add/add/add. drawImageBatch would
  improve perf for the r/g/b part of it, so it's still an improvement.
 
  On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org
 
  wrote:
   It looks reasonable to me.
  
   How do these calls interact with globalAlpha etc? You talk about
   decomposing them to individual drawImage calls; does that mean each
   image
   draw is treated as a separate composite operation?
  
   Currently you have to choose between using a single image or passing an
   array with one element per image-draw. It seems to me it would be more
   flexible to always pass an array but allow the parameters array to refer
   to an image by index. Did you consider that approach?
  
   Rob
   --
   oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso
 oaonogoroyo
   owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
   osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
   owohooo
   osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o
   o‘oRoaocoao,o’o
   oioso
   oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo
 oaonoyooonoeo
   owohooo
   osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono
 odoaonogoeoro
   ooofo
   otohoeo ofoioroeo ooofo ohoeololo.
 
 



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Fri, Aug 8, 2014 at 7:25 AM, Ashley Gullen ash...@scirra.com wrote:

 As Justin stated, 20% of current Chrome users fall back to canvas 2d.


 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


There will still be a noticeable gain since we wouldn't have to cross the
JS boundary as much.
More importantly though, things will still work for non-accelerated canvas
while WebGL won't.


 2. In Firefox's case, that number has reduced from 67% to 15% over a
 couple of years. Surely in time this will fall even further to a negligible
 amount. Why standardise a feature whose target market is disappearing?


How long will it be before we're at a reliable 100%? Graphics drivers are
still flaky and it's not as if they only came out a couple of years ago.
Note that the supported % for graphics layers (accelerated compositing)
is much lower.


 Small developers that don't have the resources to develop for concurrent
 WebGL and Canvas2D code paths


 They don't have to: there are free, high-quality open-source libraries
 like Pixi.js that do this already, so even small developers have an easy
 way to make use of a WebGL renderer without much extra effort.


 When do you envision that OpenGL drivers are bug free everywhere? History
 is not on your side here...
 I would much rather have something short term that can be implemented
 with low effort and improves performance.


 No software is ever bug-free, but this is irrelevant. To not be
 blacklisted, drivers don't need to be perfect, they just need to meet a
 reasonable threshold of security and reliability. If a driver is insecure
 or crashes constantly it is blacklisted. Drivers are being improved so they
 are no longer this poor, and it is not unrealistic to imagine 99%+ of
 drivers meeting this threshold in the near future, even if none of them are
 completely bug-free.


I think Justin can better fill you in on this. Chrome has to jump through
many hoops to make canvas reliable on top of OpenGL, and it still suffers
from random crashes when you stress the system.
Both Safari and Firefox use higher-level system calls and are more
reliable (albeit slower) than Chrome.


 I don't really understand why you and Brian are so opposed to improving
 the performance of canvas 2D.


 I see it as a feature targeted at a rapidly disappearing segment of the
 market that will disappear in the long run, leaving the web platform with
 unnecessary API cruft.


 Following your logic, why work on new canvas or SVG features as they can
 theoretically be emulated in WebGL?
 Or now that we have asm.js, why even bother with new JavaScript features?


 I am in general against duplication on the web platform, but new features
 deserve to be implemented if they have a valid use case or solve a real
 problem.


The problem is that a large number of drawImage calls have a lot of
overhead due to JS crossings and housekeeping. This proposal solves that.


 In this case I don't see that any real problem is being solved, since
 widely available frameworks and engines already solve it with WebGL in a
 way accessible even to individual developers, and this solution is already
 production-grade and widely deployed.


Sure, but that is in WebGL, which not everyone wants to use and which is
less widely supported.


 On further thought this particular proposal doesn't even appear to solve
 the batching problem very well. Many games consist of large numbers of
 rotated sprites. If a canvas2d batching facility needs to break the batch
 every time it needs to call rotate(), this will revert back to individual
 draw-calls for many kinds of game. WebGL does not have this limitation and
 can batch in to single calls objects of a variety of scales, angles,
 tiling, opacity and more. This is done by control over individual vertex
 positions and texture co-ordinates, which is a fundamental break from the
 style of the canvas2d API. Therefore even with the proposed batching
 facility, for maximum performance it is still necessary to use WebGL. This
 proposal solves a very narrowly defined performance problem.


I'm unsure if I follow. The point of Justin's proposal is to do just that
under the hood.
Why do you think the batching needs to be broken up? Did you see that the
proposal has a matrix per draw?


 An alternate solution is for browser vendors to implement canvas2d
 entirely in JS on top of WebGL. This reduces per-call overhead by staying
 in JS land, while not needing to add any new API surface. In fact it looks
 like this has already been attempted here:
 https://github.com/corbanbrook/webgl-2d -


Implementing Canvas on top of WebGL is not realistic. Please look into
what Chrome's implementation does to make canvas reliable and fast; that
cannot be achieved today.

I totally agree that if the web platform matures to a point where this IS
possible, we 

Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Boris Zbarsky

On 8/8/14, 8:43 PM, Rik Cabanier wrote:

The problem is that a large number of drawImage calls have a lot of
overhead due to JS crossings and housekeeping.


Could we please quantify this?

I measured the JS crossings part of this, because it's easy: Just have 
the C++ side of drawImage return immediately.  What I see if I do that 
is that a drawImage call that passes an HTMLImageElement and two 
integers takes about 17ns on my hardware in a current Firefox nightly on 
Mac.


For scale, a full-up drawImage call that actually does something takes 
the following amounts of time in various browsers I have [1]:


Firefox nightly: ~9500ns/call
Chrome dev: ~4300ns/call
Safari 7.0.5 and WebKit nightly: ~3000ns/call

all with noise (when averaged across 10^5 calls) that's way more than 
17ns.


So I'm not sure JS crossings is a significant performance cost here. 
I'd be interested in which parts of housekeeping would be shareable 
across the multiple images in the proposal and how much time those take 
in practice.


-Boris

[1] The specific testcase I used:

<!DOCTYPE html>
<img src="http://www.mozilla.org/images/mozilla-banner.gif">
<script>
  onload = function() {
    var c = document.createElement("canvas").getContext("2d");
    var count = 1e5;
    var img = document.querySelector("img");
    var start = new Date;
    for (var i = 0; i < count; ++i) c.drawImage(img, 0, 0);
    var stop = new Date;
    console.log((stop - start) / count * 1e6 + " ns per call");
  }
</script>



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Justin Novosad
The way you measured the JS crossing time does not include parameter
validation, if I am not mistaken. If you have 1000 sprite draws that draw
from the same sprite sheet, that is 1000 times you are verifying the same
image parameter (verifying that the image is complete, that its security
origin does not taint the canvas, fetching the decoded image data, etc.);
even though most of this stuff is cached, it is being looked up N times
instead of just once.
I will prototype this and come back with some hard data.




Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Boris Zbarsky

On 8/8/14, 9:46 PM, Justin Novosad wrote:

The way you measured the JS crossing time does not include parameter
validations if I am not mistaken.


It includes the validation Web IDL does (e.g. this is an 
HTMLImageElement) but not the specific validation this method does, 
correct.



If you have 1000 sprite draws that
draw from the same sprite sheet, that is 1000 times you are verifying
the same image parameter (verifying that the image is complete, that its
security origin does not taint the canvas, fetch the decoded image data,
etc.), even though most of this stuff is cached, it is being looked up N
times instead of just once.


True.  I just tested this in a Firefox nightly: I put the early return 
after we have looked up the decoded image data in our internal cache, or 
gotten it from the image if not cached, but before we've started doing 
anything with the position passed to drawImage.  The internal cache 
lookup already includes checks for things like tainting, completeness 
verification, etc, since there is nothing in the cache if the image is 
not complete and the canvas is already tainted as needed if we have 
cached decoded data for this (image,canvas) pair.


This version of drawImage takes about 47ns per call when averaged over 
1e5 calls (of which, recall, 17ns is the JS call overhead; the other 
30ns is the cache lookup, which hits).


That's starting to creep up into the 1.5% range of the time I see 
drawImage taking in the fastest drawImage implementation I have on hand.



I will prototype this and come back with some hard data.


That would be awesome!

-Boris


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Katelyn Gadd
A multiply blend mode by itself is not sufficient because the image
being rgba multiplied typically has alpha transparency. The closest
approximation is to generate (offline, in software with getImageData)
an image per channel - rgbk - and to source-over blend the 'k' channel
and then additive blend the r/g/b channels with individual alpha. This
approximates the per-channel alpha values with nearly equivalent
results (I say nearly equivalent because some browsers do weird stuff
with gamma/colorspace conversion that completely breaks this.)

If it helps, you could think of this operation as a layer group in
photoshop. The first layer in the group is the source image, the
second layer is a solid color filled layer containing the rgba
'multiplier', in multiply mode, and then the layer group has a mask on
it that contains the source image's alpha information. Note that this
representation (a layer group with two layers and a mask) implies that
drawing an image this way requires multiple passes, which is highly
undesirable. My current fallback requires 4 passes, along with 4
texture changes and two blend state changes. Not wonderful.
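
In canvas2d terms the fallback looks roughly like this (a sketch; the
per-channel images are precomputed offline, 'lighter' is the additive
composite mode, and 'tint' is the premultiplied multiplier color with
components in [0, 1]):

function drawTinted(ctx, kImg, rImg, gImg, bImg, tint, x, y) {
  ctx.globalCompositeOperation = "source-over";
  ctx.globalAlpha = 1;
  ctx.drawImage(kImg, x, y);                 // pass 1: black + alpha base
  ctx.globalCompositeOperation = "lighter";  // additive blending
  ctx.globalAlpha = tint.r;
  ctx.drawImage(rImg, x, y);                 // pass 2: red channel
  ctx.globalAlpha = tint.g;
  ctx.drawImage(gImg, x, y);                 // pass 3: green channel
  ctx.globalAlpha = tint.b;
  ctx.drawImage(bImg, x, y);                 // pass 4: blue channel
  ctx.globalAlpha = 1;
  ctx.globalCompositeOperation = "source-over";
}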


RGBA multiplication dates back to early fixed-function graphics
pipelines. If a blend with globalAlpha and a premultiplied source is
represented like this:

result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha )
                + ( dest(r, g, b) * (1 - (source(a) * globalAlpha)) )

Then if you take a premultiplied color constant and use that as the
multiplier for your image (instead of a global alpha value - this is
the input to rgba multiplication, i.e. a 'vertex color'):

result(r, g, b) = ( source-premultiplied(r, g, b)
                    * rgba-multiplier-premultiplied(r, g, b) )
                + ( dest(r, g, b)
                    * (1 - (source(a) * rgba-multiplier-premultiplied(a))) )

(Sorry if this is unclear, I don't have a math education)

So you basically take the global alpha multiplier and you go from that
to a per-channel multiplier. If you're using premultiplied alpha
already, this ends up being pretty straightforward... you just take a
color (premultiplied, like everything else) and use that as your
multiplier. You can multiply directly by each channel since the global
'alpha' part of the multiplier is already baked in by the
premultiplication step.

This is a really common primitive since it's so easy to implement, if
not entirely free - you're already doing that global alpha
multiplication, so you just introduce a different multiplier
per-channel, which is really trivial in a SIMD model like the ones
used in computer graphics. You go from vec4 * scalar to vec4 * vec4.
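
In plain JS terms, the per-pixel math is just (a sketch; all components
premultiplied and in [0, 1], with the alpha row being the usual
over-operator alpha):

// Per-channel multiply of a premultiplied source against a premultiplied
// multiplier color, composited over the destination.
function blendPixel(src, mult, dst) {
  var inv = 1 - src.a * mult.a;
  return {
    r: src.r * mult.r + dst.r * inv,
    g: src.g * mult.g + dst.g * inv,
    b: src.b * mult.b + dst.b * inv,
    a: src.a * mult.a + dst.a * inv
  };
}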


Text rendering is the most compelling reason to support this, IMO.
With this feature you can build glyph atlases inside 2d canvases
(using something like freetype, etc), then trivially draw colored
glyphs out of them without having to drop down into getImageData or
use WebGL. It is trivially expressed in most graphics APIs since it
uses the same machinery as a global alpha multiplier - if you're
drawing a premultiplied image with an alpha multiplier in hardware,
you're almost certainly doing vec4 * scalar in your shader. If you're
using the fixed-function pipeline from bad old 3d graphics, vec4 *
scalar didn't even exist - the right hand side was *always* another
vec4 so this feature literally just changed the constant on the right
hand side.


I harp on this feature since nearly every 2d game I encounter uses it,
and JSIL has to do it in software. If not for this one feature it
would be very easy to make the vast majority of ported titles Just
Work against canvas, which makes them more likely to run correctly on
mobile.



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Fri, Aug 8, 2014 at 7:54 PM, Katelyn Gadd k...@luminance.org wrote:

 A multiply blend mode by itself is not sufficient because the image
 being rgba multiplied typically has alpha transparency. The closest
 approximation is to generate (offline, in software with getImageData)
 an image per channel - rgbk - and to source-over blend the 'k' channel
 and then additive blend the r/g/b channels with individual alpha. This
 approximates the per-channel alpha values with nearly equivalent
 results (I say nearly equivalent because some browsers do weird stuff
 with gamma/colorspace conversion that completely breaks this.)

 If it helps, you could think of this operation as a layer group in
 photoshop. The first layer in the group is the source image, the
 second layer is a solid color filled layer containing the rgba
 'multiplier', in multiply mode, and then the layer group has a mask on
 it that contains the source image's alpha information. Note that this
 representation (a layer group with two layers and a mask) implies that
 drawing an image this way requires multiple passes, which is highly
 undesirable. My current fallback requires 4 passes, along with 4
 texture changes and two blend state changes. Not wonderful.


I see; you're asking for a feature like Photoshop's Color Overlay layer
effect. Is that correct?



 RGBA multiplication dates back to early fixed-function graphics
 pipelines. If a blend with globalAlpha and a premultiplied source is
 represented like this:

 result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha ) + (
 dest(r, g, b) * (1 - (source(a) * globalAlpha)) )

 Then if you take a premultiplied color constant and use that as the
 multiplier for your image (instead of a global alpha value - this is
 the input to rgba multiplication, i.e. a 'vertex color'):

 result(r, g, b) = ( source-premultiplied(r, g, b) *
 rgba-multiplier-premultiplied(r, g, b) ) + ( dest(r, g, b) * (1 -
 (source(a) * rgba-multiplier-premultiplied(a))) )

 (Sorry if this is unclear, I don't have a math education)

 So you basically take the global alpha multiplier and you go from that
 to a per-channel multiplier. If you're using premultiplied alpha
 already, this ends up being pretty straightforward... you just take a
 color (premultiplied, like everything else) and use that as your
 multiplier. You can multiply directly by each channel since the global
 'alpha' part of the multiplier is already baked in by the
 premultiplication step.

 This is a really common primitive since it's so easy to implement, if
 not entirely free - you're already doing that global alpha
 multiplication, so you just introduce a different multiplier
 per-channel, which is really trivial in a SIMD model like the ones
 used in computer graphics. You go from vec4 * scalar to vec4 * vec4.


 Text rendering is the most compelling reason to support this, IMO.
 With this feature you can build glyph atlases inside 2d canvases
 (using something like freetype, etc), then trivially draw colored
 glyphs out of them without having to drop down into getImageData or
 use WebGL. It is trivially expressed in most graphics APIs since it
 uses the same machinery as a global alpha multiplier - if you're
 drawing a premultiplied image with an alpha multiplier in hardware,
 you're almost certainly doing vec4 * scalar in your shader. If you're
 using the fixed-function pipeline from bad old 3d graphics, vec4 *
 scalar didn't even exist - the right hand side was *always* another
 vec4 so this feature literally just changed the constant on the right
 hand side.


 I harp on this feature since nearly every 2d game I encounter uses it,
 and JSIL has to do it in software. If not for this one feature it
 would be very easy to make the vast majority of ported titles Just
 Work against canvas, which makes them more likely to run correctly on
 mobile.


Maybe it would be best to bring this up as a separate topic on this mailing
list. (just copy/paste most of your message)

