Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Ashley Gullen

 As Justin stated, 20% of current Chrome users currently fall back to
 canvas 2d.


1. What fraction of those 20% actually still get a GPU accelerated canvas
vs. software rendered? Batching will be of very little use to the software
rendered audience, making it an even smaller target market.
2. In Firefox's case, that number has reduced from 67% to 15% over a couple
of years. Surely in time this will fall even further to a negligible
amount. Why standardise a feature whose target market is disappearing?


 Small developers that don't have the resources to develop for concurrent
 WebGL and Canvas2D code paths


They don't have to: there are free, high-quality open-source libraries like
Pixi.js that do this already, so even small developers have an easy way
to make use of a WebGL renderer without much extra effort.


 When do you envision that OpenGL drivers are bug free everywhere? History
 is not on your side here...
 I would much rather have something short term that can be implemented with
 low effort and improves performance.


No software is ever bug-free, but this is irrelevant. To not be
blacklisted, drivers don't need to be perfect, they just need to meet a
reasonable threshold of security and reliability. If a driver is insecure
or crashes constantly it is blacklisted. Drivers are being improved so they
are no longer this poor, and it is not unrealistic to imagine 99%+ of
drivers meeting this threshold in the near future, even if none of them are
completely bug-free.


 I don't really understand why you and Brian are so opposed to improving
 the performance of canvas 2D.


I see it as a feature targeted at a rapidly shrinking segment of the
market that will disappear in the long run, leaving the web platform with
unnecessary API cruft.


 Following your logic, why work on new canvas or SVG features as they can
 theoretically be emulated in WebGL?
 Or now that we have asm.js, why even bother with new JavaScript features?


I am in general against duplication on the web platform, but new features
deserve to be implemented if they have a valid use case or solve a real
problem. In this case I don't see that any real problem is being solved,
since widely available frameworks and engines already solve it with WebGL
in a way accessible even to individual developers, and this solution is
already production-grade and widely deployed.

On further thought this particular proposal doesn't even appear to solve
the batching problem very well. Many games consist of large numbers of
rotated sprites. If a canvas2d batching facility needs to break the batch
every time it needs to call rotate(), this will revert back to individual
draw-calls for many kinds of game. WebGL does not have this limitation and
can batch objects with a variety of scales, angles, tiling, opacity and
more into single calls. This is done by control over individual vertex
positions and texture co-ordinates, which is a fundamental break from the
style of the canvas2d API. Therefore even with the proposed batching
facility, for maximum performance it is still necessary to use WebGL. This
proposal solves a very narrowly defined performance problem.
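
To make the vertex-level batching described above concrete, here is a minimal sketch of how a WebGL renderer might pack many rotated sprites into one vertex array for a single draw call. The function name, the sprite fields and the 4-floats-per-vertex layout are assumptions made for illustration; they are not taken from any particular engine.

```javascript
// Hypothetical helper: pack rotated, scaled sprites into one vertex array so a
// WebGL renderer could submit them with a single draw call.
// Assumed layout per vertex: x, y, u, v (4 floats); 6 vertices per sprite.
function buildSpriteBatch(sprites) {
  const FLOATS_PER_SPRITE = 6 * 4;
  const out = new Float32Array(sprites.length * FLOATS_PER_SPRITE);
  // Unit-quad corners as two triangles: (TL, TR, BL) and (BL, TR, BR).
  const corners = [[0, 0], [1, 0], [0, 1], [0, 1], [1, 0], [1, 1]];
  sprites.forEach((s, i) => {
    const cos = Math.cos(s.angle);
    const sin = Math.sin(s.angle);
    corners.forEach(([cx, cy], j) => {
      // Offset from the sprite centre, scaled to the sprite size.
      const lx = (cx - 0.5) * s.width;
      const ly = (cy - 0.5) * s.height;
      const o = i * FLOATS_PER_SPRITE + j * 4;
      out[o] = s.x + lx * cos - ly * sin;      // rotated x
      out[o + 1] = s.y + lx * sin + ly * cos;  // rotated y
      out[o + 2] = s.u0 + cx * (s.u1 - s.u0);  // texture u
      out[o + 3] = s.v0 + cy * (s.v1 - s.v0);  // texture v
    });
  });
  return out;
}
```

Because the rotation is baked into the vertex positions, no per-sprite state change is needed between sprites that share a texture.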

An alternate solution is for browser vendors to implement canvas2d entirely
in JS on top of WebGL. This reduces per-call overhead by staying in JS
land, while not needing to add any new API surface. In fact it looks like
this has already been attempted here:
https://github.com/corbanbrook/webgl-2d - Before speccing this
feature, I'd suggest researching whether the same performance goals can be
achieved with a well-batched JS implementation that reduces the number of
calls into the browser, which is exactly how existing WebGL renderers
outperform canvas2d despite both being GPU accelerated.

Ashley


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Justin Novosad
On Fri, Aug 8, 2014 at 10:25 AM, Ashley Gullen ash...@scirra.com wrote:

 As Justin stated, 20% of current Chrome users currently fall back to
 canvas 2d.


 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


~25% of Chrome users that do not have GPU-accelerated WebGL do have
GPU-accelerated 2d canvas. Nonetheless, the software implementation of 2D
canvas is way faster than software-emulated WebGL, so it makes sense to
fall back to 2D canvas anytime accelerated WebGL is unavailable.


 No software is ever bug-free, but this is irrelevant. To not be
 blacklisted, drivers don't need to be perfect, they just need to meet a
 reasonable threshold of security and reliability. If a driver is insecure
 or crashes constantly it is blacklisted. Drivers are being improved so they
 are no longer this poor, and it is not unrealistic to imagine 99%+ of
 drivers meeting this threshold in the near future, even if none of them are
 completely bug-free.


The problem is the long tail of old devices. There is an astonishingly
large number of machines in the world running outdated OSes that no longer
receive OS or graphics driver updates. Also, AFAIK, display drivers
tend not to have auto-updates the way OSes and browsers do. So even if
there is an updated driver out there, most users are unlikely to install it
until they try to use some software that requires it explicitly.



  Many games consist of large numbers of rotated sprites.


The proposal includes the possibility of specifying a per draw
transformation matrix. And as Katelyn suggested, we could add a variant
that takes an alpha value.
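
As a rough illustration of what a per-draw matrix (plus the suggested alpha variant) could look like, here is a sketch of a fallback that decomposes one batch into individual drawImage calls. The function name and the 11-floats-per-draw layout are hypothetical and are not taken from the proposal text; only save, restore, setTransform, globalAlpha and drawImage are existing canvas 2D API.

```javascript
// Hypothetical parameter layout, NOT taken from the proposal:
// [sx, sy, sw, sh, a, b, c, d, e, f, alpha] for each draw.
const FLOATS_PER_DRAW = 11;

function drawImageBatchFallback(ctx, image, params) {
  if (params.length % FLOATS_PER_DRAW !== 0) {
    throw new TypeError('params length must be a multiple of ' + FLOATS_PER_DRAW);
  }
  ctx.save(); // remember the caller's transform and globalAlpha
  for (let i = 0; i < params.length; i += FLOATS_PER_DRAW) {
    const sx = params[i], sy = params[i + 1];
    const sw = params[i + 2], sh = params[i + 3];
    // Note: setTransform replaces the current matrix rather than composing with it.
    ctx.setTransform(params[i + 4], params[i + 5], params[i + 6],
                     params[i + 7], params[i + 8], params[i + 9]);
    ctx.globalAlpha = params[i + 10];
    // Draw the source rectangle at the origin; the matrix places it.
    ctx.drawImage(image, sx, sy, sw, sh, 0, 0, sw, sh);
  }
  ctx.restore();
}
```

A native implementation would presumably avoid the per-draw state changes that this fallback makes explicit, which is where any batching gain would come from.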

I will post an update to this thread as soon as I have more compelling
performance data. My goal is to demonstrate that a 2d canvas (with batched
drawImage calls) can yield performance characteristics that are
significantly superior to WebGL's for typical 2D sprite-based game use
cases, particularly on mobile platforms.  This will be possible by
leveraging the browser's compositing framework in ways that are not
possible with WebGL.  Stay tuned.

   -Justin


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Ashley Gullen

 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


 ~25% of Chrome users that do not have gpu-accelerated WebGL do have gpu
 accelerated 2d canvas.


So 25% of 20%... you are speccing a feature for just 5% of users, correct?
(Since batching likely makes no difference when software rendered)


 The problem is the long tail of old devices.


These will still disappear with time, just like old Windows 95 machines
have. Is your intent to spec a feature that will no longer be necessary in
future?


 I will post an update to this thread as soon as I have more compelling
 performance data. My goal is to demonstrate that a 2d canvas (with batched
 drawImage calls) can yield performance characteristics that are
 significantly superior to WebGL's for typical 2D sprite-based game use
 cases, particularly on mobile platforms.


Here's some existing data: comparing a WebGL renderer (
http://www.scirra.com/demos/c2/renderperfgl/) with canvas2d (
http://www.scirra.com/demos/c2/renderperf2d) in Chrome on a Nexus 5:

canvas2d: 360 objects @ 30 FPS
webgl: ~17,500 objects @ 30 FPS

WebGL is nearly 50 (fifty!) times faster than canvas2d in these results.
Do you really consider this not fast enough? And by just how much further
do you hope to improve the result?
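
Working through the figures quoted above, as a quick sanity check: the ratio of object counts, and the average frame time available per object at 30 FPS. The helper name is made up for this sketch, and the per-object figure is simply the frame budget divided by the object count, not a measured per-call cost.

```javascript
// Figures from the Nexus 5 results quoted above.
const canvas2dObjects = 360;
const webglObjects = 17500;

const FRAME_MS = 1000 / 30; // frame budget at 30 FPS

// Hypothetical helper: average frame time available per object, in microseconds.
function perObjectMicroseconds(objectsPerFrame) {
  return FRAME_MS / objectsPerFrame * 1000;
}

const speedup = webglObjects / canvas2dObjects;
console.log(speedup.toFixed(1));                               // 48.6
console.log(perObjectMicroseconds(canvas2dObjects).toFixed(1)); // 92.6
console.log(perObjectMicroseconds(webglObjects).toFixed(2));    // 1.90
```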


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:

 Sorry, in this context rgba multiplication refers to per-channel
 multipliers (instead of only one multiplier for the alpha channel), so
 that you can color tint images when drawing them. As mentioned, it's
 used for fades, drawing colored text, and similar effects.


I see. Any reason that this couldn't be done with a 'multiply' blend mode?


 Premultiplication is a different subject, sorry if I confused you with
 the similar language. There are past discussions about both in the
 list archives.

 On Thu, Aug 7, 2014 at 10:59 AM, Rik Cabanier caban...@gmail.com wrote:
 
 
 
  On Mon, Aug 4, 2014 at 4:35 PM, Katelyn Gadd k...@luminance.org wrote:
 
  Many, many uses of drawImage involve transform and/or other state
  changes per-blit (composite mode, global alpha).
 
  I think some of those state changes could be viably batched for most
  games (composite mode) but others absolutely cannot (global alpha,
  transform). I see that you handle transform with
  source-rectangle-and-transform (nice!) but you do not currently handle
  the others. I'd suggest that this needs to at least handle
  globalAlpha.
 
  Replacing the overloading with individual named methods is something
  I'm also in favor of. I think it would be ideal if the format-enum
  argument were not there so that it's easier to feature-detect what
  formats are available (for example, if globalAlpha data is added later
  instead of in the '1.0' version of this feature).
 
 
  We can define the functions so they throw a type error if an unknown
  enum is passed. That way you can feature detect future additions to
  the enum.
 
  What should we do about error detection in general? If we require the
  float array to be well formed before drawing, we need an extra pass to
  make sure that it is correct.
  If we don't require it, we can skip that pass but content could be
  partially drawn to the canvas before the exception is thrown.
 
 
  I get the impression that ordering is implicit for this call - the
  batch's drawing operations occur in exact order. It might be
  worthwhile to have a way to indicate to the implementation that you
  don't care about order, so that it is free to rearrange the draw
  operations by image and reduce state changes. Doing that in userspace
  js is made difficult since you can't easily do efficient table lookup
  for images.
 
  if rgba multiplication were to make it into canvas2d sometime in the
  next decade, that would nicely replace globalAlpha as a per-draw
  value. This is an analogue to per-vertex colors in 3d graphics and is
  used in virtually every hardware-accelerated 2d game out there,
  whether to tint characters when drawing text, fade things in and out,
  or flash the screen various colors. That would be another reason to
  make feature detection easier.
 
  Would it be possible to sneak rgba multiplication in under the guise
  of this feature? ;) Without it, I'm forced to use WebGL and reduce
  compatibility just for something relatively trivial on the
  implementer's side. (I should note that from what I've heard, Direct2D
  actually makes this hard to implement.)
 
 
  Is this the other proposal to control the format of the canvas buffer
 that
  is passed to WebGL?
 
 
  On the bright side there's a workaround for RGBA multiplication based
  on generating per-channel bitmaps from the source bitmap (k, r/g/b),
  then blending them source-over/add/add/add. drawImageBatch would
  improve perf for the r/g/b part of it, so it's still an improvement.
 
  On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org wrote:
   It looks reasonable to me.
  
   How do these calls interact with globalAlpha etc? You talk about
   decomposing them to individual drawImage calls; does that mean each
   image draw is treated as a separate composite operation?
  
   Currently you have to choose between using a single image or passing
   an array with one element per image-draw. It seems to me it would be
   more flexible to always pass an array but allow the parameters array
   to refer to an image by index. Did you consider that approach?
  
   Rob
   --
   oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso
 oaonogoroyo
   owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
   osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
   owohooo
   osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o
   o‘oRoaocoao,o’o
   oioso
   oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo
 oaonoyooonoeo
   owohooo
   osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono
 odoaonogoeoro
   ooofo
   otohoeo ofoioroeo ooofo ohoeololo.
 
 



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Fri, Aug 8, 2014 at 7:25 AM, Ashley Gullen ash...@scirra.com wrote:

 As Justin stated, 20% of current Chrome users currently fall back to
 canvas 2d.


 1. What fraction of those 20% actually still get a GPU accelerated canvas
 vs. software rendered? Batching will be of very little use to the software
 rendered audience, making it an even smaller target market.


There will still be a noticeable gain since we wouldn't have to cross the
JS boundary as much.
More importantly though, things will still work for non-accelerated canvas
while WebGL won't.


 2. In Firefox's case, that number has reduced from 67% to 15% over a
 couple of years. Surely in time this will fall even further to a negligible
 amount. Why standardise a feature whose target market is disappearing?


How long will it be before we're at a reliable 100%? Graphics drivers are
still flaky and it's not as if they only came out a couple of years ago.
Note that the supported % for graphics layers (accelerated compositing)
is much lower.


 Small developers that don't have the resources to develop for concurrent
 WebGL and Canvas2D code paths


 They don't have to: there are free, high-quality open-source libraries
 like Pixi.js that do this already, so even small developers have an easy
 way to make use of a WebGL renderer without much extra effort.


 When do you envision that OpenGL drivers are bug free everywhere? History
 is not on your side here...
 I would much rather have something short term that can be implemented
 with low effort and improves performance.


 No software is ever bug-free, but this is irrelevant. To not be
 blacklisted, drivers don't need to be perfect, they just need to meet a
 reasonable threshold of security and reliability. If a driver is insecure
 or crashes constantly it is blacklisted. Drivers are being improved so they
 are no longer this poor, and it is not unrealistic to imagine 99%+ of
 drivers meeting this threshold in the near future, even if none of them are
 completely bug-free.


I think Justin can better fill you in on this. Chrome has to jump through
many hoops to make canvas reliable on top of OpenGL and it still suffers
from random crashes when you stress the system.
Both Safari and Firefox use higher level system calls and are more reliable
(albeit slower) than Chrome.


 I don't really understand why you and Brian are so opposed to improving
 the performance of canvas 2D.


 I see it as a feature targeted at a rapidly disappearing segment of the
 market that will disappear in the long run, leaving the web platform with
 unnecessary API cruft.


 Following your logic, why work on new canvas or SVG features as they can
 theoretically be emulated in WebGL?
 Or now that we have asm.js, why even bother with new JavaScript features?


 I am in general against duplication on the web platform, but new features
 deserve to be implemented if they have a valid use case or solve a real
 problem.


The problem is that a large number of drawImage calls have a lot of
overhead due to JS crossings and housekeeping. This proposal solves that.


 In this case I don't see that any real problem is being solved, since
 widely available frameworks and engines already solve it with WebGL in a
 way accessible even to individual developers, and this solution is already
 production-grade and widely deployed.


Sure, but that is in WebGL, which not everyone wants to use and which is
less widely supported.


 On further thought this particular proposal doesn't even appear to solve
 the batching problem very well. Many games consist of large numbers of
 rotated sprites. If a canvas2d batching facility needs to break the batch
 every time it needs to call rotate(), this will revert back to individual
 draw-calls for many kinds of game. WebGL does not have this limitation and
 can batch in to single calls objects of a variety of scales, angles,
 tiling, opacity and more. This is done by control over individual vertex
 positions and texture co-ordinates, which is a fundamental break from the
 style of the canvas2d API. Therefore even with the proposed batching
 facility, for maximum performance it is still necessary to use WebGL. This
 proposal solves a very narrowly defined performance problem.


I'm unsure if I follow. The point of Justin's proposal is to do just that
under the hood.
Why do you think the batching needs to be broken up? Did you see that the
proposal has a matrix per draw?


 An alternate solution is for browser vendors to implement canvas2d
 entirely in JS on top of WebGL. This reduces per-call overhead by staying
 in JS land, while not needing to add any new API surface. In fact it looks
 like this has already been attempted here:
 https://github.com/corbanbrook/webgl-2d -


Implementing Canvas on top of WebGL is not realistic.
Please look into Chrome's implementation to make canvas reliable and fast.
This can not be achieved today.

I totally agree if the web platform matures to a point where this IS
possible, we 

Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Boris Zbarsky

On 8/8/14, 8:43 PM, Rik Cabanier wrote:

The problem is that a large number of drawImage calls have a lot of
overhead due to JS crossings and housekeeping.


Could we please quantify this?

I measured the JS crossings part of this, because it's easy: Just have 
the C++ side of drawImage return immediately.  What I see if I do that 
is that a drawImage call that passes an HTMLImageElement and two 
integers takes about 17ns on my hardware in a current Firefox nightly on 
Mac.


For scale, a full-up drawImage call that actually does something takes 
the following amounts of time in various browsers I have [1]:


Firefox nightly: ~9500ns/call
Chrome dev: ~4300ns/call
Safari 7.0.5 and WebKit nightly: ~3000ns/call

all with noise (when averaged across 10 calls) that's way more than 
17ns.


So I'm not sure JS crossings are a significant performance cost here. 
I'd be interested in which parts of housekeeping would be shareable 
across the multiple images in the proposal and how much time those take 
in practice.


-Boris

[1] The specific testcase I used:

<!DOCTYPE html>
<img src="http://www.mozilla.org/images/mozilla-banner.gif">
<script>
  onload = function() {
    var c = document.createElement("canvas").getContext("2d");
    var count = 100;
    var img = document.querySelector("img");
    var start = new Date;
    for (var i = 0; i < count; ++i) c.drawImage(img, 0, 0);
    var stop = new Date;
    // (stop - start) is in milliseconds; multiplying by 1e6 converts to ns.
    console.log((stop - start) / count * 1e6 + " ns per call");
  }
</script>



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Justin Novosad
The way you measured the JS crossing time does not include parameter
validations if I am not mistaken.  If you have 1000 sprite draws that draw
from the same sprite sheet, that is 1000 times you are verifying the same
image parameter (verifying that the image is complete, that its security
origin does not taint the canvas, fetching the decoded image data, etc.).
Even though most of this stuff is cached, it is being looked up N times
instead of just once.
I will prototype this and come back with some hard data.

On Fri, Aug 8, 2014 at 9:26 PM, Boris Zbarsky bzbar...@mit.edu wrote:

 On 8/8/14, 8:43 PM, Rik Cabanier wrote:

 The problem is that a large number of drawImage calls have a lot of
 overhead due to JS crossings and housekeeping.


 Could we please quantify this?

 I measured the JS crossings part of this, because it's easy: Just have
 the C++ side of drawImage return immediately.  What I see if I do that is
 that a drawImage call that passes an HTMLImageElement and two integers
 takes about 17ns on my hardware in a current Firefox nightly on Mac.

 For scale, a full-up drawImage call that actually does something takes the
 following amounts of time in various browsers I have [1]:

 Firefox nightly: ~9500ns/call
 Chrome dev: ~4300ns/call
 Safari 7.0.5 and WebKit nightly: ~3000ns/call

 all with noise (when averaged across 10 calls) that's way more than
 17ns.

 So I'm not sure JS crossings is a significant performance cost here. I'd
 be interested in which parts of housekeeping would be shareable across
 the multiple images in the proposal and how much time those take in
 practice.

 -Boris

 [1] The specific testcase I used:

 <!DOCTYPE html>
 <img src="http://www.mozilla.org/images/mozilla-banner.gif">
 <script>
   onload = function() {
     var c = document.createElement("canvas").getContext("2d");
     var count = 100;
     var img = document.querySelector("img");
     var start = new Date;
     for (var i = 0; i < count; ++i) c.drawImage(img, 0, 0);
     var stop = new Date;
     // (stop - start) is in milliseconds; multiplying by 1e6 converts to ns.
     console.log((stop - start) / count * 1e6 + " ns per call");
   }
 </script>



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Boris Zbarsky

On 8/8/14, 9:46 PM, Justin Novosad wrote:

The way you measured the JS crossing time does not include parameter
validations if I am not mistaken.


It includes the validation Web IDL does (e.g. this is an 
HTMLImageElement) but not the specific validation this method does, 
correct.



If you have 1000 sprite draws that
draw from the same sprite sheet, that is 1000 times you are verifying
the same image parameter (verifying that the image is complete, that its
security origin does not taint the canvas, fetch the decoded image data,
etc.), even though most of this stuff is cached, it is being looked up N
times instead of just once.


True.  I just tested this in a Firefox nightly: I put the early return 
after we have looked up the decoded image data in our internal cache, or 
gotten it from the image if not cached, but before we've started doing 
anything with the position passed to drawImage.  The internal cache 
lookup already includes checks for things like tainting, completeness 
verification, etc, since there is nothing in the cache if the image is 
not complete and the canvas is already tainted as needed if we have 
cached decoded data for this (image,canvas) pair.


This version of drawImage takes about 47ns per call when averaged over 
1e5 calls (of which, recall, 17ns is the JS call overhead; the other 
30ns is the cache lookup, which hits).


That's starting to creep up into the 1.5% range of the time I see 
drawImage taking in the fastest drawImage implementation I have on hand.
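
For reference, the percentages implied by the numbers in this thread can be reproduced with a few lines. The function name is made up for this sketch, and note that the 17ns and 47ns overheads were measured in Firefox only, so dividing them by other browsers' totals is a rough comparison at best.

```javascript
// Full drawImage call times reported earlier in the thread (ns per call).
const fullCallNs = {
  'Firefox nightly': 9500,
  'Chrome dev': 4300,
  'Safari 7.0.5 / WebKit nightly': 3000
};

// Overheads measured in Firefox nightly (ns per call).
const jsCrossingNs = 17;        // JS-to-C++ call overhead only
const crossingPlusLookupNs = 47; // call overhead plus the image cache lookup

// Hypothetical helper: overhead as a percentage of a full call.
function overheadPercent(overheadNs, totalNs) {
  return overheadNs / totalNs * 100;
}

for (const [browser, ns] of Object.entries(fullCallNs)) {
  console.log(browser + ': ' +
    overheadPercent(jsCrossingNs, ns).toFixed(2) + '% crossing, ' +
    overheadPercent(crossingPlusLookupNs, ns).toFixed(2) + '% crossing + lookup');
}
```

Against the fastest implementation (3000ns per call), 47ns works out to roughly 1.6%, which matches the "1.5% range" mentioned above.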



I will prototype this and come back with some hard data.


That would be awesome!

-Boris


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Katelyn Gadd
A multiply blend mode by itself is not sufficient because the image
being rgba multiplied typically has alpha transparency. The closest
approximation is to generate (offline, in software with getImageData)
an image per channel - rgbk - and to source-over blend the 'k' channel
and then additive blend the r/g/b channels with individual alpha. This
approximates the per-channel alpha values with nearly equivalent
results (I say nearly equivalent because some browsers do weird stuff
with gamma/colorspace conversion that completely breaks this.)

If it helps, you could think of this operation as a layer group in
photoshop. The first layer in the group is the source image, the
second layer is a solid color filled layer containing the rgba
'multiplier', in multiply mode, and then the layer group has a mask on
it that contains the source image's alpha information. Note that this
representation (a layer group with two layers and a mask) implies that
drawing an image this way requires multiple passes, which is highly
undesirable. My current fallback requires 4 passes, along with 4
texture changes and two blend state changes. Not wonderful.


RGBA multiplication dates back to early fixed-function graphics
pipelines. If a blend with globalAlpha and a premultiplied source is
represented like this:

result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha ) + (
dest(r, g, b) * (1 - (source(a) * globalAlpha)) )

Then if you take a premultiplied color constant and use that as the
multiplier for your image (instead of a global alpha value - this is
the input to rgba multiplication, i.e. a 'vertex color'):

result(r, g, b) = ( source-premultiplied(r, g, b) *
rgba-multiplier-premultiplied(r, g, b) ) + ( dest(r, g, b) * (1 -
(source(a) * rgba-multiplier-premultiplied(a))) )

(Sorry if this is unclear, I don't have a math education)

So you basically take the global alpha multiplier and you go from that
to a per-channel multiplier. If you're using premultiplied alpha
already, this ends up being pretty straightforward... you just take a
color (premultiplied, like everything else) and use that as your
multiplier. You can multiply directly by each channel since the global
'alpha' part of the multiplier is already baked in by the
premultiplication step.

This is a really common primitive since it's so easy to implement, if
not entirely free - you're already doing that global alpha
multiplication, so you just introduce a different multiplier
per-channel, which is really trivial in a SIMD model like the ones
used in computer graphics. You go from vec4 * scalar to vec4 * vec4.
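
The two blend equations above can be written out as a small sketch. Colours here are premultiplied [r, g, b, a] arrays with components in 0..1; the function names are made up for illustration, and applying the same formula to the alpha channel is an assumption (the equations above only spell out r, g and b).

```javascript
// Source-over blend of a premultiplied source with a scalar globalAlpha
// (vec4 * scalar).
function blendWithGlobalAlpha(src, dest, globalAlpha) {
  const inv = 1 - src[3] * globalAlpha;
  return src.map((s, i) => s * globalAlpha + dest[i] * inv);
}

// Same blend, but with a premultiplied per-channel multiplier
// (vec4 * vec4), i.e. a "vertex color".
function blendWithRgbaMultiplier(src, dest, mul) {
  const inv = 1 - src[3] * mul[3];
  return src.map((s, i) => s * mul[i] + dest[i] * inv);
}

// 50%-opaque white sprite pixel, tinted opaque red, over opaque blue.
const tinted = blendWithRgbaMultiplier([0.5, 0.5, 0.5, 0.5], [0, 0, 1, 1], [1, 0, 0, 1]);
console.log(tinted); // [0.5, 0, 0.5, 1]
```

Passing a multiplier of [g, g, g, g] to the second function gives the same result as calling the first with globalAlpha = g, which is the sense in which globalAlpha is a special case of rgba multiplication.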


Text rendering is the most compelling reason to support this, IMO.
With this feature you can build glyph atlases inside 2d canvases
(using something like freetype, etc), then trivially draw colored
glyphs out of them without having to drop down into getImageData or
use WebGL. It is trivially expressed in most graphics APIs since it
uses the same machinery as a global alpha multiplier - if you're
drawing a premultiplied image with an alpha multiplier in hardware,
you're almost certainly doing vec4 * scalar in your shader. If you're
using the fixed-function pipeline from bad old 3d graphics, vec4 *
scalar didn't even exist - the right hand side was *always* another
vec4 so this feature literally just changed the constant on the right
hand side.


I harp on this feature since nearly every 2d game I encounter uses it,
and JSIL has to do it in software. If not for this one feature it
would be very easy to make the vast majority of ported titles Just
Work against canvas, which makes them more likely to run correctly on
mobile.


On Fri, Aug 8, 2014 at 5:28 PM, Rik Cabanier caban...@gmail.com wrote:



 On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:

 Sorry, in this context rgba multiplication refers to per-channel
 multipliers (instead of only one multiplier for the alpha channel), so
 that you can color tint images when drawing them. As mentioned, it's
 used for fades, drawing colored text, and similar effects.


 I see. Any reason that this couldn't be done with a 'multiply' blend mode?


 Premultiplication is a different subject, sorry if I confused you with
 the similar language. There are past discussions about both in the
 list archives.



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-08 Thread Rik Cabanier
On Fri, Aug 8, 2014 at 7:54 PM, Katelyn Gadd k...@luminance.org wrote:

 A multiply blend mode by itself is not sufficient because the image
 being rgba multiplied typically has alpha transparency. The closest
 approximation is to generate (offline, in software with getImageData)
 an image per channel - rgbk - and to source-over blend the 'k' channel
 and then additive blend the r/g/b channels with individual alpha. This
 approximates the per-channel alpha values with nearly equivalent
 results (I say nearly equivalent because some browsers do weird stuff
 with gamma/colorspace conversion that completely breaks this.)

 If it helps, you could think of this operation as a layer group in
 photoshop. The first layer in the group is the source image, the
 second layer is a solid color filled layer containing the rgba
 'multiplier', in multiply mode, and then the layer group has a mask on
 it that contains the source image's alpha information. Note that this
 representation (a layer group with two layers and a mask) implies that
 drawing an image this way requires multiple passes, which is highly
 undesirable. My current fallback requires 4 passes, along with 4
 texture changes and two blend state changes. Not wonderful.


I see; you're asking for a feature like Photoshop's Color Overlay layer
effect. Is that correct?



 RGBA multiplication dates back to early fixed-function graphics
 pipelines. If a blend with globalAlpha and a premultiplied source is
 represented like this:

 result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha ) + (
 dest(r, g, b) * (1 - (source(a) * globalAlpha)) )

 Then if you take a premultiplied color constant and use that as the
 multiplier for your image (instead of a global alpha value - this is
 the input to rgba multiplication, i.e. a 'vertex color'):

 result(r, g, b) = ( source-premultiplied(r, g, b) *
 rgba-multiplier-premultiplied(r, g, b) ) + ( dest(r, g, b) * (1 -
 (source(a) * rgba-multiplier-premultiplied(a))) )

 (Sorry if this is unclear, I don't have a math education)

 So you basically take the global alpha multiplier and you go from that
 to a per-channel multiplier. If you're using premultiplied alpha
 already, this ends up being pretty straightforward... you just take a
 color (premultiplied, like everything else) and use that as your
 multiplier. You can multiply directly by each channel since the global
 'alpha' part of the multiplier is already baked in by the
 premultiplication step.

 This is a really common primitive since it's so easy to implement, if
 not entirely free - you're already doing that global alpha
 multiplication, so you just introduce a different multiplier
 per-channel, which is really trivial in a SIMD model like the ones
 used in computer graphics. You go from vec4 * scalar to vec4 * vec4.


 Text rendering is the most compelling reason to support this, IMO.
 With this feature you can build glyph atlases inside 2d canvases
 (using something like freetype, etc), then trivially draw colored
 glyphs out of them without having to drop down into getImageData or
 use WebGL. It is trivially expressed in most graphics APIs since it
 uses the same machinery as a global alpha multiplier - if you're
 drawing a premultiplied image with an alpha multiplier in hardware,
 you're almost certainly doing vec4 * scalar in your shader. If you're
 using the fixed-function pipeline from bad old 3d graphics, vec4 *
 scalar didn't even exist - the right hand side was *always* another
 vec4 so this feature literally just changed the constant on the right
 hand side.


 I harp on this feature since nearly every 2d game I encounter uses it,
 and JSIL has to do it in software. If not for this one feature it
 would be very easy to make the vast majority of ported titles Just
 Work against canvas, which makes them more likely to run correctly on
 mobile.


Maybe it would be best to bring this up as a separate topic on this mailing
list. (just copy/paste most of your message)


 On Fri, Aug 8, 2014 at 5:28 PM, Rik Cabanier caban...@gmail.com wrote:
 
 
 
  On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:
 
  Sorry, in this context rgba multiplication refers to per-channel
  multipliers (instead of only one multiplier for the alpha channel), so
  that you can color tint images when drawing them. As mentioned, it's
  used for fades, drawing colored text, and similar effects.
 
 
  I see. Any reason that this couldn't be done with a 'multiply' blend
  mode?
 
 
  Premultiplication is a different subject, sorry if I confused you with
  the similar language. There are past discussions about both in the
  list archives.
 



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-07 Thread Rik Cabanier
On Mon, Aug 4, 2014 at 4:35 PM, Katelyn Gadd k...@luminance.org wrote:

 Many, many uses of drawImage involve transform and/or other state
 changes per-blit (composite mode, global alpha).

 I think some of those state changes could be viably batched for most
 games (composite mode) but others absolutely cannot (global alpha,
 transform). I see that you handle transform with
 source-rectangle-and-transform (nice!) but you do not currently handle
 the others. I'd suggest that this needs to at least handle
 globalAlpha.

 Replacing the overloading with individual named methods is something
 I'm also in favor of. I think it would be ideal if the format-enum
 argument were not there so that it's easier to feature-detect what
 formats are available (for example, if globalAlpha data is added later
 instead of in the '1.0' version of this feature).


We can define the functions so they throw a type error if an unknown enum
is passed. That way you can feature detect future additions to the enum.

What should we do about error detection in general? If we require the float
array to be well formed before drawing, we need an extra pass to make sure
that it is correct.
If we don't require it, we can skip that pass but content could be
partially drawn to the canvas before the exception is thrown.
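
The trade-off can be sketched as follows. This is only an illustration: the fixed stride, the finite-number check and the draw callback are assumptions, not the proposed API.

```typescript
// Hypothetical sketch: the stride and the callback are assumptions, not the proposed API.
type DrawFn = (params: Float32Array) => void;

// A record is considered well formed here if all of its floats are finite.
function isWellFormed(params: Float32Array, offset: number, stride: number): boolean {
  for (let i = offset; i < offset + stride; i++) {
    if (!Number.isFinite(params[i])) return false;
  }
  return true;
}

// Option A: validate everything first, so a bad batch draws nothing (extra pass).
function drawBatchValidated(params: Float32Array, stride: number, draw: DrawFn): void {
  if (params.length % stride !== 0) throw new TypeError("truncated batch");
  for (let o = 0; o < params.length; o += stride) {
    if (!isWellFormed(params, o, stride)) throw new TypeError("malformed draw at float " + o);
  }
  for (let o = 0; o < params.length; o += stride) draw(params.subarray(o, o + stride));
}

// Option B: single pass; draws before the bad record have already happened.
function drawBatchSinglePass(params: Float32Array, stride: number, draw: DrawFn): void {
  if (params.length % stride !== 0) throw new TypeError("truncated batch");
  for (let o = 0; o < params.length; o += stride) {
    if (!isWellFormed(params, o, stride)) throw new TypeError("malformed draw at float " + o);
    draw(params.subarray(o, o + stride));
  }
}

// Three records of two floats each; the second one is malformed.
const badBatch = new Float32Array([0, 0, NaN, 1, 2, 2]);
let drawnValidated = 0;
let drawnSinglePass = 0;
try { drawBatchValidated(badBatch, 2, () => { drawnValidated++; }); } catch { /* expected */ }
try { drawBatchSinglePass(badBatch, 2, () => { drawnSinglePass++; }); } catch { /* expected */ }
```

With option A nothing reaches the canvas when the batch is bad; with option B the first record is drawn before the exception is thrown.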


 I get the impression that ordering is implicit for this call - the
 batch's drawing operations occur in exact order. It might be
 worthwhile to have a way to indicate to the implementation that you
 don't care about order, so that it is free to rearrange the draw
 operations by image and reduce state changes. Doing that in userspace
 js is made difficult since you can't easily do efficient table lookup
 for images.

 If rgba multiplication were to make it into canvas2d sometime in the
 next decade, that would nicely replace globalAlpha as a per-draw
 value. This is an analogue to per-vertex colors in 3d graphics and is
 used in virtually every hardware-accelerated 2d game out there,
 whether to tint characters when drawing text, fade things in and out,
 or flash the screen various colors. That would be another reason to
 make feature detection easier.

 Would it be possible to sneak rgba multiplication in under the guise
 of this feature? ;) Without it, I'm forced to use WebGL and reduce
 compatibility just for something relatively trivial on the
 implementer's side. (I should note that from what I've heard, Direct2D
 actually makes this hard to implement.)


Is this the other proposal to control the format of the canvas buffer that
is passed to WebGL?


 On the bright side there's a workaround for RGBA multiplication based
 on generating per-channel bitmaps from the source bitmap (k, r/g/b),
 then blending them source-over/add/add/add. drawImageBatch would
 improve perf for the r/g/b part of it, so it's still an improvement.
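
One reading of this workaround, worked through per pixel. The sketch assumes premultiplied source pixels, an opaque destination, a black "k" layer carrying the source alpha, and additive draws whose global alpha equals the tint component. All names are hypothetical.

```typescript
// Hypothetical per-pixel model; an assumption about how the workaround composes.
type RGB = [number, number, number];
interface Premul { r: number; g: number; b: number; a: number }

// Reference: multiply the source by the tint, then source-over onto an opaque destination.
function tintedSourceOver(src: Premul, tint: RGB, dst: RGB): RGB {
  const keep = 1 - src.a;
  return [src.r * tint[0] + dst[0] * keep, src.g * tint[1] + dst[1] * keep, src.b * tint[2] + dst[2] * keep];
}

// Workaround: one source-over draw of the k layer, then three additive draws.
function layeredWorkaround(src: Premul, tint: RGB, dst: RGB): RGB {
  const keep = 1 - src.a;
  // 1. k layer (black, source alpha) drawn source-over: darkens the destination.
  const out: RGB = [dst[0] * keep, dst[1] * keep, dst[2] * keep];
  // 2-4. r, g and b layers drawn additively, each scaled by its tint component.
  out[0] = Math.min(1, out[0] + src.r * tint[0]);
  out[1] = Math.min(1, out[1] + src.g * tint[1]);
  out[2] = Math.min(1, out[2] + src.b * tint[2]);
  return out;
}

const samplePixel: Premul = { r: 0.5, g: 0.25, b: 0.5, a: 0.5 };
const sampleTint: RGB = [1, 0.5, 0];
const white: RGB = [1, 1, 1];
const expectedPixel = tintedSourceOver(samplePixel, sampleTint, white);
const layeredPixel = layeredWorkaround(samplePixel, sampleTint, white);
```

Under these assumptions both paths give the same color, at the cost of four draws per sprite instead of one.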

 On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:
  It looks reasonable to me.
 
  How do these calls interact with globalAlpha etc? You talk about
  decomposing them to individual drawImage calls; does that mean each image
  draw is treated as a separate composite operation?
 
  Currently you have to choose between using a single image or passing an
  array with one element per image-draw. It seems to me it would be more
  flexible to always pass an array but allow the parameters array to refer
  to an image by index. Did you consider that approach?
 
  Rob
  --
  oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso oaonogoroyo
  owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
  osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
  owohooo
  osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o o‘oRoaocoao,o’o
  oioso
  oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo oaonoyooonoeo
  owohooo
  osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono odoaonogoeoro
  ooofo
  otohoeo ofoioroeo ooofo ohoeololo.



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-07 Thread Katelyn Gadd
Sorry, in this context rgba multiplication refers to per-channel
multipliers (instead of only one multiplier for the alpha channel), so
that you can color tint images when drawing them. As mentioned, it's
used for fades, drawing colored text, and similar effects.
Premultiplication is a different subject, sorry if I confused you with
the similar language. There are past discussions about both in the
list archives.

On Thu, Aug 7, 2014 at 10:59 AM, Rik Cabanier caban...@gmail.com wrote:



 On Mon, Aug 4, 2014 at 4:35 PM, Katelyn Gadd k...@luminance.org wrote:

 Many, many uses of drawImage involve transform and/or other state
 changes per-blit (composite mode, global alpha).

 I think some of those state changes could be viably batched for most
 games (composite mode) but others absolutely cannot (global alpha,
 transform). I see that you handle transform with
 source-rectangle-and-transform (nice!) but you do not currently handle
 the others. I'd suggest that this needs to at least handle
 globalAlpha.

 Replacing the overloading with individual named methods is something
 I'm also in favor of. I think it would be ideal if the format-enum
 argument were not there so that it's easier to feature-detect what
 formats are available (for example, if globalAlpha data is added later
 instead of in the '1.0' version of this feature).


 We can define the functions so they throw a type error if an unknown enum is
 passed. That way you can feature detect future additions to the enum.

 What should we do about error detection in general? If we require the float
 array to be well formed before drawing, we need an extra pass to make sure
 that it is correct.
 If we don't require it, we can skip that pass but content could be partially
 drawn to the canvas before the exception is thrown.


 I get the impression that ordering is implicit for this call - the
 batch's drawing operations occur in exact order. It might be
 worthwhile to have a way to indicate to the implementation that you
 don't care about order, so that it is free to rearrange the draw
 operations by image and reduce state changes. Doing that in userspace
 js is made difficult since you can't easily do efficient table lookup
 for images.

 If rgba multiplication were to make it into canvas2d sometime in the
 next decade, that would nicely replace globalAlpha as a per-draw
 value. This is an analogue to per-vertex colors in 3d graphics and is
 used in virtually every hardware-accelerated 2d game out there,
 whether to tint characters when drawing text, fade things in and out,
 or flash the screen various colors. That would be another reason to
 make feature detection easier.

 Would it be possible to sneak rgba multiplication in under the guise
 of this feature? ;) Without it, I'm forced to use WebGL and reduce
 compatibility just for something relatively trivial on the
 implementer's side. (I should note that from what I've heard, Direct2D
 actually makes this hard to implement.)


 Is this the other proposal to control the format of the canvas buffer that
 is passed to WebGL?


 On the bright side there's a workaround for RGBA multiplication based
 on generating per-channel bitmaps from the source bitmap (k, r/g/b),
 then blending them source-over/add/add/add. drawImageBatch would
 improve perf for the r/g/b part of it, so it's still an improvement.

 On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:
  It looks reasonable to me.
 
  How do these calls interact with globalAlpha etc? You talk about
  decomposing them to individual drawImage calls; does that mean each
  image draw is treated as a separate composite operation?
 
  Currently you have to choose between using a single image or passing an
  array with one element per image-draw. It seems to me it would be more
  flexible to always pass an array but allow the parameters array to refer
  to an image by index. Did you consider that approach?
 
  Rob
  --
  oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso oaonogoroyo
  owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
  osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
  owohooo
  osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o
  o‘oRoaocoao,o’o
  oioso
  oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo oaonoyooonoeo
  owohooo
  osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono odoaonogoeoro
  ooofo
  otohoeo ofoioroeo ooofo ohoeololo.




Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-06 Thread Rik Cabanier
On Tue, Aug 5, 2014 at 10:04 AM, Brian Blakely anewpage.me...@gmail.com
wrote:

 On Tue, Aug 5, 2014 at 11:21 AM, Justin Novosad ju...@google.com wrote:

  On Tue, Aug 5, 2014 at 7:47 AM, Ashley Gullen ash...@scirra.com wrote:
 
   I am against this suggestion. If you are serious about performance then
   you should use WebGL and implement your own batching system, which is
   what every major 2D HTML5 game framework I'm aware of does already. Adding
   batching features to canvas2d has three disadvantages in my view:
   
   1. Major 2D engines already support WebGL, so even if this new feature
   was supported, in practice it would not be used.
   2. There is opportunity cost in speccing something that is unlikely to
   be used and already well-covered by another part of the web platform. We
   could be speccing something else more useful.
   3. canvas2d should not end up being specced closer and closer to WebGL:
   canvas2d should be kept as a high-level easy-to-use API even with
   performance cost, whereas WebGL is the low-level high-performance API.
   These are two different use cases and it's good to have two different
   APIs to cover them. If you want to keep improving canvas2d performance I
   would worry you will simply end up reinventing WebGL.
  
  
  These are good points. The only counter argument I have to that is that a
  fallback from WebGL to canvas2d is unfortunately necessary for a
  significant fraction of users. Even on web-browsers that do support
  WebGL, gl may be emulated in software, which can be detected by web apps and
  warrants falling back to canvas2d (approx. 20% of Chrome users, for
  example). I realize that there is currently a clear ease of use vs.
  performance dichotomy between 2d and webgl, and this proposal blurs that
  boundary. Nonetheless, there is developer-driven demand for this based
  on a real-world problem. Also, if 2D canvas had better performance
  characteristics, it would not be necessary for some game engines to have
  dual (2d/webgl) implementations.
 
  -Justin
 

 My take is similar to Ashley's, and I wonder how buffing up the toy API
 (2D) compensates for the fact that the performance API (GL) has
 compatibility problems, even on platforms that support it.  If the goal is
 to solve the latter, why not introduce more direct proposals?


Can you explain what you're asking for? Are you asking for a proposal that
fixes compatibility?


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-06 Thread Rik Cabanier
On Tue, Aug 5, 2014 at 5:55 PM, Ashley Gullen ash...@scirra.com wrote:

 If your argument is that WebGL sometimes falls back to canvas2d, this
 generally only happens when the system has crappy drivers that are
 blacklisted for being insecure/unstable. The solution to this is to develop
 and distribute better drivers that are not blacklisted. This is already
 happening and making good progress - according to Mozilla's stats, the share of
 Firefox users who get WebGL support has increased from 33% in 2011 to 85% in 2014 (
 http://people.mozilla.org/~bjacob/gfx_features_stats/).


As Justin stated, about 20% of Chrome users currently fall back to canvas
2d.
This is a large chunk of the market.
Small developers that don't have the resources to develop for concurrent
WebGL and Canvas2D code paths will certainly code for just Canvas since
that will give them close to 100%.


 I feel it is likely
 to continue to approach ubiquitous WebGL support, making fallbacks
 unnecessary. This also solves the problem of having to have dual renderer
 implementations: only the WebGL renderer will be necessary, and this is far
 more compelling than a souped-up canvas2d, since WebGL can use shader
 effects, have advanced control over textures and co-ordinates, also do 3D,
 and so on. This cannot all be brought to canvas2d without simply
 reinventing WebGL. Further, crappy drivers can also cause software-rendered
 canvas2d as well, which is likely so slow to begin with that batching will
 have no important performance improvement. Software-rendered WebGL is just
 another workaround to crappy drivers (or in rare cases systems without
 GPUs, but then who's going to be gunning for high performance there?) and
 there is still no guarantee falling back to canvas2d will be
 GPU-accelerated, especially since the system already has such poor drivers
 that the browser has blacklisted it for WebGL support.

 The real problem is that there is not 100% WebGL support everywhere, but
 with drivers improving and Apple and Microsoft on board I'm sure that will
 fix itself eventually. Please don't spec features to improve canvas2d
 performance in the mean time; I don't see it having any long-term utility
 for the web platform.


When do you envision that OpenGL drivers are bug free everywhere? History
is not on your side here...
I would much rather have something short term that can be implemented with
low effort and improves performance.

I don't really understand why you and Brian are so opposed to improving the
performance of canvas 2D.
There are a lot of people that use and like its API. WebGL on the other
hand, has a very steep learning curve and problems are not always obvious.

Following your logic, why work on new canvas or SVG features as they can
theoretically be emulated in WebGL?
Or now that we have asm.js, why even bother with new JavaScript features?


 On 5 August 2014 16:21, Justin Novosad ju...@google.com wrote:

  On Mon, Aug 4, 2014 at 6:39 PM, Robert O'Callahan rob...@ocallahan.org
  wrote:
 
   It looks reasonable to me.
  
   How do these calls interact with globalAlpha etc? You talk about
   decomposing them to individual drawImage calls; does that mean each
   image draw is treated as a separate composite operation?
  
 
  Composited separately is the intent. A possible internal optimization:
  the implementation could group non-overlapping draws and composite them
  together.
 
 
   Currently you have to choose between using a single image or passing an
   array with one element per image-draw. It seems to me it would be more
   flexible to always pass an array but allow the parameters array to
   refer to an image by index. Did you consider that approach?
  
 
  Had not thought of that. Good idea.
 
  On Mon, Aug 4, 2014 at 7:35 PM, Katelyn Gadd k...@luminance.org wrote:
 
   I'd suggest that this needs to at least handle
   globalAlpha.
  
 
  It would be trivial to add an additional format that includes alpha.
 
 
   Replacing the overloading with individual named methods is something
   I'm also in favor of.
 
 
  That's something I pondered and was not sure about. Eliminating the
  parameter format argument would be nice. Your feature-detection argument
  is a really good reason.
 
  
   I get the impression that ordering is implicit for this call - the
   batch's drawing operations occur in exact order. It might be
   worthwhile to have a way to indicate to the implementation that you
   don't care about order, so that it is free to rearrange the draw
   operations by image and reduce state changes. Doing that in userspace
   js is made difficult since you can't easily do efficient table lookup
   for images.
  
 
  I am not sure exposing that in the API is a good idea because it opens
  the door to undefined behavior. It could result in different implementations
  producing drastically different yet compliant results.
  Perhaps implementations could auto-detect draw operations that are
  commutative based on a quick 

Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-05 Thread Ashley Gullen
I am against this suggestion. If you are serious about performance then you
should use WebGL and implement your own batching system, which is what
every major 2D HTML5 game framework I'm aware of does already. Adding
batching features to canvas2d has three disadvantages in my view:

1. Major 2D engines already support WebGL, so even if this new feature was
supported, in practice it would not be used.
2. There is opportunity cost in speccing something that is unlikely to be
used and already well-covered by another part of the web platform. We could
be speccing something else more useful.
3. canvas2d should not end up being specced closer and closer to WebGL:
canvas2d should be kept as a high-level easy-to-use API even with
performance cost, whereas WebGL is the low-level high-performance API.
These are two different use cases and it's good to have two different APIs
to cover them. If you want to keep improving canvas2d performance I would
worry you will simply end up reinventing WebGL.

As developers of a HTML5 game engine with WebGL support and fallback to
canvas2d, this has no utility to us at all. It would be more useful to work
on WebGL 2.

Ashley Gullen
Scirra.com



On 5 August 2014 00:35, Katelyn Gadd k...@luminance.org wrote:

 Many, many uses of drawImage involve transform and/or other state
 changes per-blit (composite mode, global alpha).

 I think some of those state changes could be viably batched for most
 games (composite mode) but others absolutely cannot (global alpha,
 transform). I see that you handle transform with
 source-rectangle-and-transform (nice!) but you do not currently handle
 the others. I'd suggest that this needs to at least handle
 globalAlpha.

 Replacing the overloading with individual named methods is something
 I'm also in favor of. I think it would be ideal if the format-enum
 argument were not there so that it's easier to feature-detect what
 formats are available (for example, if globalAlpha data is added later
 instead of in the '1.0' version of this feature).

 I get the impression that ordering is implicit for this call - the
 batch's drawing operations occur in exact order. It might be
 worthwhile to have a way to indicate to the implementation that you
 don't care about order, so that it is free to rearrange the draw
 operations by image and reduce state changes. Doing that in userspace
 js is made difficult since you can't easily do efficient table lookup
 for images.

 If rgba multiplication were to make it into canvas2d sometime in the
 next decade, that would nicely replace globalAlpha as a per-draw
 value. This is an analogue to per-vertex colors in 3d graphics and is
 used in virtually every hardware-accelerated 2d game out there,
 whether to tint characters when drawing text, fade things in and out,
 or flash the screen various colors. That would be another reason to
 make feature detection easier.

 Would it be possible to sneak rgba multiplication in under the guise
 of this feature? ;) Without it, I'm forced to use WebGL and reduce
 compatibility just for something relatively trivial on the
 implementer's side. (I should note that from what I've heard, Direct2D
 actually makes this hard to implement.)

 On the bright side there's a workaround for RGBA multiplication based
 on generating per-channel bitmaps from the source bitmap (k, r/g/b),
 then blending them source-over/add/add/add. drawImageBatch would
 improve perf for the r/g/b part of it, so it's still an improvement.

 On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:
  It looks reasonable to me.
 
  How do these calls interact with globalAlpha etc? You talk about
  decomposing them to individual drawImage calls; does that mean each image
  draw is treated as a separate composite operation?
 
  Currently you have to choose between using a single image or passing an
  array with one element per image-draw. It seems to me it would be more
  flexible to always pass an array but allow the parameters array to refer
  to an image by index. Did you consider that approach?
 
  Rob
  --
  oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso oaonogoroyo
  owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
  osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
  owohooo
  osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o o‘oRoaocoao,o’o
  oioso
  oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo oaonoyooonoeo
  owohooo
  osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono odoaonogoeoro
  ooofo
  otohoeo ofoioroeo ooofo ohoeololo.



Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-05 Thread Justin Novosad
On Mon, Aug 4, 2014 at 6:39 PM, Robert O'Callahan rob...@ocallahan.org
wrote:

 It looks reasonable to me.

 How do these calls interact with globalAlpha etc? You talk about
 decomposing them to individual drawImage calls; does that mean each image
 draw is treated as a separate composite operation?


Composited separately is the intent. A possible internal optimization: the
implementation could group non-overlapping draws and composite them together.


 Currently you have to choose between using a single image or passing an
 array with one element per image-draw. It seems to me it would be more
 flexible to always pass an array but allow the parameters array to refer to
 an image by index. Did you consider that approach?


Had not thought of that. Good idea.
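
A sketch of what such an index-based layout might look like. The record layout (image index, dx, dy) and the decodeBatch helper are assumptions for illustration; the proposal does not define them. Strings stand in for image objects so the sketch runs without a canvas.

```typescript
// Hypothetical layout: each record is [imageIndex, dx, dy]. Not part of the proposal.
interface DrawCall<I> { image: I; dx: number; dy: number }

function decodeBatch<I>(images: readonly I[], params: Float32Array): DrawCall<I>[] {
  const STRIDE = 3;
  if (params.length % STRIDE !== 0) throw new TypeError("truncated batch");
  const calls: DrawCall<I>[] = [];
  for (let o = 0; o < params.length; o += STRIDE) {
    const index = params[o];
    // The index travels as a float, so it must be checked before use.
    if (!Number.isInteger(index) || index < 0 || index >= images.length) {
      throw new RangeError("bad image index at float " + o);
    }
    calls.push({ image: images[index], dx: params[o + 1], dy: params[o + 2] });
  }
  return calls;
}

// Two images, three draws: the second image is drawn twice.
const decodedCalls = decodeBatch(
  ["hero.png", "coin.png"],
  new Float32Array([0, 10, 20, 1, 30, 40, 1, 50, 60]),
);
```

Each decoded entry corresponds to one individual drawImage(image, dx, dy) call, so a single batch can mix images without one array element per draw.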

On Mon, Aug 4, 2014 at 7:35 PM, Katelyn Gadd k...@luminance.org wrote:

 I'd suggest that this needs to at least handle
 globalAlpha.


It would be trivial to add an additional format that includes alpha.


 Replacing the overloading with individual named methods is something
 I'm also in favor of.


That's something I pondered and was not sure about. Eliminating the
parameter format argument would be nice. Your feature-detection argument is
a really good reason.


 I get the impression that ordering is implicit for this call - the
 batch's drawing operations occur in exact order. It might be
 worthwhile to have a way to indicate to the implementation that you
 don't care about order, so that it is free to rearrange the draw
 operations by image and reduce state changes. Doing that in userspace
 js is made difficult since you can't easily do efficient table lookup
 for images.


I am not sure exposing that in the API is a good idea because it opens the
door to undefined behavior. It could result in different implementations
producing drastically different yet compliant results.
Perhaps implementations could auto-detect draw operations that are
commutative based on a quick overlap analysis, and use that knowledge to
automatically group draw calls that use similar drawing state (e.g. the
same source GPU texture)
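
A rough sketch of that kind of overlap analysis, assuming axis-aligned destination rectangles and a composite mode such as source-over that only touches the drawn region. The Draw shape and the groupDraws helper are hypothetical; a real implementation would also have to account for transforms.

```typescript
// Hypothetical sketch; assumes axis-aligned rectangles and source-over compositing.
interface Draw { image: number; x: number; y: number; w: number; h: number }

function overlaps(a: Draw, b: Draw): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w && a.y < b.y + b.h && b.y < a.y + a.h;
}

// Groups draws by image while preserving the relative order of overlapping draws.
function groupDraws(draws: readonly Draw[]): Draw[][] {
  const batches: Draw[][] = [];
  for (const d of draws) {
    let target = -1;
    // Walk backwards over existing batches. Joining an earlier batch moves d ahead
    // of every later batch, so stop as soon as a later batch overlaps d.
    for (let i = batches.length - 1; i >= 0; i--) {
      if (batches[i][0].image === d.image) { target = i; break; }
      if (batches[i].some((other) => overlaps(other, d))) break;
    }
    if (target >= 0) batches[target].push(d);
    else batches.push([d]);
  }
  return batches;
}

// The third draw does not overlap the second, so it can join the first batch.
const disjoint = groupDraws([
  { image: 0, x: 0, y: 0, w: 10, h: 10 },
  { image: 1, x: 20, y: 0, w: 10, h: 10 },
  { image: 0, x: 40, y: 0, w: 10, h: 10 },
]);
// Here the third draw overlaps the second, so the original order must be kept.
const stacked = groupDraws([
  { image: 0, x: 0, y: 0, w: 10, h: 10 },
  { image: 1, x: 0, y: 0, w: 10, h: 10 },
  { image: 0, x: 5, y: 5, w: 10, h: 10 },
]);
```

Because only non-overlapping draws are reordered, the result should be identical to drawing in the original order, which avoids the undefined-behavior concern while still reducing texture switches.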



 Would it be possible to sneak rgba multiplication in under the guise
 of this feature? ;) Without it, I'm forced to use WebGL and reduce
 compatibility just for something relatively trivial on the
 implementer's side. (I should note that from what I've heard, Direct2D
 actually makes this hard to implement.)


I think that would make this feature significantly more complex to spec,
and to implement. It really should be treated as an orthogonal feature
request. Your suggestion is very use-case specific. A more general
incarnation of your request would be to have a blendMode parameter that
determines how the source image gets blended with the fillStyle. With
that, you resolve your specific use case by setting the fill style to an
rgba color, and using a multiply blend op to use it to modulate images.

On Tue, Aug 5, 2014 at 7:47 AM, Ashley Gullen ash...@scirra.com wrote:

 I am against this suggestion. If you are serious about performance then
 you should use WebGL and implement your own batching system, which is what
 every major 2D HTML5 game framework I'm aware of does already. Adding
 batching features to canvas2d has three disadvantages in my view:

 1. Major 2D engines already support WebGL, so even if this new feature was
 supported, in practice it would not be used.
 2. There is opportunity cost in speccing something that is unlikely to be
 used and already well-covered by another part of the web platform. We could
 be speccing something else more useful.
 3. canvas2d should not end up being specced closer and closer to WebGL:
 canvas2d should be kept as a high-level easy-to-use API even with
 performance cost, whereas WebGL is the low-level high-performance API.
 These are two different use cases and it's good to have two different APIs
 to cover them. If you want to keep improving canvas2d performance I would
 worry you will simply end up reinventing WebGL.


These are good points. The only counter argument I have to that is that a
fallback from WebGL to canvas2d is unfortunately necessary for a
significant fraction of users. Even on web-browsers that do support WebGL,
gl may be emulated in software, which can be detected by web apps and
warrants falling back to canvas2d (approx. 20% of Chrome users, for
example). I realize that there is currently a clear ease of use vs.
performance dichotomy between 2d and webgl, and this proposal blurs that
boundary. Nonetheless, there is developer-driven demand for this based on a
real-world problem. Also, if 2D canvas had better performance
characteristics, it would not be necessary for some game engines to have
dual (2d/webgl) implementations.

-Justin


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-05 Thread Brian Blakely
On Tue, Aug 5, 2014 at 11:21 AM, Justin Novosad ju...@google.com wrote:

 On Tue, Aug 5, 2014 at 7:47 AM, Ashley Gullen ash...@scirra.com wrote:

  I am against this suggestion. If you are serious about performance then
  you should use WebGL and implement your own batching system, which is
  what every major 2D HTML5 game framework I'm aware of does already. Adding
  batching features to canvas2d has three disadvantages in my view:
  
  1. Major 2D engines already support WebGL, so even if this new feature
  was supported, in practice it would not be used.
  2. There is opportunity cost in speccing something that is unlikely to be
  used and already well-covered by another part of the web platform. We
  could be speccing something else more useful.
  3. canvas2d should not end up being specced closer and closer to WebGL:
  canvas2d should be kept as a high-level easy-to-use API even with
  performance cost, whereas WebGL is the low-level high-performance API.
  These are two different use cases and it's good to have two different
  APIs to cover them. If you want to keep improving canvas2d performance I
  would worry you will simply end up reinventing WebGL.
 
 
 These are good points. The only counter argument I have to that is that a
 fallback from WebGL to canvas2d is unfortunately necessary for a
 significant fraction of users. Even on web-browsers that do support WebGL,
 gl may be emulated in software, which can be detected by web apps and
 warrants falling back to canvas2d (approx. 20% of Chrome users, for
 example). I realize that there is currently a clear ease of use vs.
 performance dichotomy between 2d and webgl, and this proposal blurs that
 boundary. Nonetheless, there is developer-driven demand for this based on a
 real-world problem. Also, if 2D canvas had better performance
 characteristics, it would not be necessary for some game engines to have
 dual (2d/webgl) implementations.

 -Justin


My take is similar to Ashley's, and I wonder how buffing up the toy API
(2D) compensates for the fact that the performance API (GL) has
compatibility problems, even on platforms that support it.  If the goal is
to solve the latter, why not introduce more direct proposals?


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-05 Thread Ashley Gullen
If your argument is that WebGL sometimes falls back to canvas2d, this
generally only happens when the system has crappy drivers that are
blacklisted for being insecure/unstable. The solution to this is to develop
and distribute better drivers that are not blacklisted. This is already
happening and making good progress - according to Mozilla's stats, the share of
Firefox users who get WebGL support has increased from 33% in 2011 to 85% in 2014 (
http://people.mozilla.org/~bjacob/gfx_features_stats/). I feel it is likely
to continue to approach ubiquitous WebGL support, making fallbacks
unnecessary. This also solves the problem of having to have dual renderer
implementations: only the WebGL renderer will be necessary, and this is far
more compelling than a souped-up canvas2d, since WebGL can use shader
effects, have advanced control over textures and co-ordinates, also do 3D,
and so on. This cannot all be brought to canvas2d without simply
reinventing WebGL. Further, crappy drivers can also cause software-rendered
canvas2d as well, which is likely so slow to begin with that batching will
have no important performance improvement. Software-rendered WebGL is just
another workaround to crappy drivers (or in rare cases systems without
GPUs, but then who's going to be gunning for high performance there?) and
there is still no guarantee falling back to canvas2d will be
GPU-accelerated, especially since the system already has such poor drivers
that the browser has blacklisted it for WebGL support.

The real problem is that there is not 100% WebGL support everywhere, but
with drivers improving and Apple and Microsoft on board I'm sure that will
fix itself eventually. Please don't spec features to improve canvas2d
performance in the mean time; I don't see it having any long-term utility
for the web platform.

Ashley



On 5 August 2014 16:21, Justin Novosad ju...@google.com wrote:

 On Mon, Aug 4, 2014 at 6:39 PM, Robert O'Callahan rob...@ocallahan.org
 wrote:

  It looks reasonable to me.
 
  How do these calls interact with globalAlpha etc? You talk about
  decomposing them to individual drawImage calls; does that mean each image
  draw is treated as a separate composite operation?
 

 Composited separately is the intent. A possible internal optimization: the
 implementation could group non-overlapping draws and composite them
 together.


  Currently you have to choose between using a single image or passing an
  array with one element per image-draw. It seems to me it would be more
  flexible to always pass an array but allow the parameters array to refer
  to an image by index. Did you consider that approach?
 

 Had not thought of that. Good idea.

 On Mon, Aug 4, 2014 at 7:35 PM, Katelyn Gadd k...@luminance.org wrote:

  I'd suggest that this needs to at least handle
  globalAlpha.
 

 It would be trivial to add an additional format that includes alpha.


  Replacing the overloading with individual named methods is something
  I'm also in favor of.


 That's something I pondered and was not sure about. Eliminating the
 parameter format argument would be nice. Your feature-detection argument is
 a really good reason.

 
  I get the impression that ordering is implicit for this call - the
  batch's drawing operations occur in exact order. It might be
  worthwhile to have a way to indicate to the implementation that you
  don't care about order, so that it is free to rearrange the draw
  operations by image and reduce state changes. Doing that in userspace
  js is made difficult since you can't easily do efficient table lookup
  for images.
 

 I am not sure exposing that in the API is a good idea because it opens the
 door to undefined behavior. It could result in different implementations
 producing drastically different yet compliant results.
 Perhaps implementations could auto-detect draw operations that are
 commutative based on a quick overlap analysis, and use that knowledge to
 automatically group draw calls that use similar drawing state (e.g. the
 same source GPU texture).
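
A minimal TypeScript sketch of what such an overlap analysis might look like. It is not taken from any browser implementation. The names Draw, Batch, overlaps and groupCommutativeDraws are hypothetical, the destination rectangles are assumed to be axis-aligned and untransformed, and the composite mode is assumed to touch only the pixels each draw covers (so that non-overlapping draws commute).

```typescript
// Hypothetical description of one draw: a texture id plus its destination rectangle.
interface Draw {
  image: string;
  x: number;
  y: number;
  w: number;
  h: number;
}

interface Batch {
  image: string;
  draws: Draw[];
}

// Axis-aligned rectangle intersection test (touching edges do not count).
function overlaps(a: Draw, b: Draw): boolean {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

// Greedy regrouping. Each draw is moved back into the most recent batch that
// uses the same image, but only if it does not overlap any draw in the batches
// it would be moved ahead of. Otherwise a new batch is started at the end.
function groupCommutativeDraws(draws: readonly Draw[]): Batch[] {
  const batches: Batch[] = [];
  for (const d of draws) {
    let placed = false;
    for (let i = batches.length - 1; i >= 0; i--) {
      const batch = batches[i];
      if (batch.image === d.image) {
        batch.draws.push(d); // issue order within a batch is preserved
        placed = true;
        break;
      }
      // d would have to jump ahead of this batch: only legal without overlap.
      if (batch.draws.some((other) => overlaps(other, d))) break;
    }
    if (!placed) batches.push({ image: d.image, draws: [d] });
  }
  return batches;
}
```

Because a draw is only ever moved ahead of draws it does not overlap, the pixels produced should match the in-order result under the stated assumptions, so the reordering stays invisible to the page.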


 
  Would it be possible to sneak rgba multiplication in under the guise
  of this feature? ;) Without it, I'm forced to use WebGL and reduce
  compatibility just for something relatively trivial on the
  implementer's side. (I should note that from what I've heard, Direct2D
  actually makes this hard to implement.)
 

 I think that would make this feature significantly more complex to spec,
 and to implement. It really should be treated as an orthogonal feature
 request. Your suggestion is very use-case specific. A more general
 incarnation of your request would be to have a blendMode parameter that
 determines how the source image gets blended with the fillStyle. With
 that, you resolve your specific use case by setting the fill style to an
 rgba color, and using a multiply blend op to use it to modulate images.

 On Tue, Aug 5, 2014 at 7:47 AM, Ashley Gullen ash...@scirra.com wrote:

  I am against this suggestion. If you are serious 

[whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-04 Thread Justin Novosad
Greetings all,

In a drive to satisfy some of the most demanding performance-critical
applications (2D platformer games), browser vendors have put a lot of
effort into optimizing CanvasRenderingContext2D, and the drawImage method
in particular. In a world where browsers have GPU-accelerated graphics
backends and complex multi-threaded rendering frameworks, the ultimate
hurdle in relieving computational burden from the browser's main thread is
per-draw-call overhead.

In a sprite-based animation, the application may be calling drawImage
hundreds or even thousands of times per animation frame. The script
bindings and browser-internal bookkeeping performed on a per-draw-call
basis are an O(N) cost that could be made O(1) by introducing batch
versions of drawImage. Benchmark experiments performed using Chrome on
mobile and desktop platforms show that this is currently the main obstacle
standing in the way of achieving near-native rendering performance.
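
To make the cost model concrete, here is a minimal TypeScript sketch. It is not the API described on the wiki page. The 8-float record layout and the names Sprite, DrawTarget, packSprites and drawImageBatchFallback are assumptions made for illustration. The idea is that script packs N sprites into one typed array and crosses the binding boundary once. The fallback loop shows the intended semantics: the batch behaves like N ordinary nine-argument drawImage calls, in order.

```typescript
// Hypothetical record layout: 8 floats per sprite, mirroring
// drawImage(image, sx, sy, sw, sh, dx, dy, dw, dh).
const FLOATS_PER_SPRITE = 8;

interface Sprite {
  sx: number; sy: number; sw: number; sh: number; // source rectangle
  dx: number; dy: number; dw: number; dh: number; // destination rectangle
}

// Minimal stand-in for the one context method this sketch needs.
interface DrawTarget<Image> {
  drawImage(image: Image,
            sx: number, sy: number, sw: number, sh: number,
            dx: number, dy: number, dw: number, dh: number): void;
}

// Pack every sprite into a single Float32Array so one call can carry them all.
function packSprites(sprites: readonly Sprite[]): Float32Array {
  const params = new Float32Array(sprites.length * FLOATS_PER_SPRITE);
  sprites.forEach((s, i) => {
    params.set([s.sx, s.sy, s.sw, s.sh, s.dx, s.dy, s.dw, s.dh],
               i * FLOATS_PER_SPRITE);
  });
  return params;
}

// Reference semantics in script: decompose the batch into individual draws.
// A native batch call would run the equivalent loop behind a single binding
// call, which is where the per-call overhead would be saved.
function drawImageBatchFallback<Image>(target: DrawTarget<Image>,
                                       image: Image,
                                       params: Float32Array): number {
  if (params.length % FLOATS_PER_SPRITE !== 0) {
    throw new RangeError("params length must be a multiple of 8");
  }
  let draws = 0;
  for (let o = 0; o < params.length; o += FLOATS_PER_SPRITE) {
    target.drawImage(image,
                     params[o], params[o + 1], params[o + 2], params[o + 3],
                     params[o + 4], params[o + 5], params[o + 6], params[o + 7]);
    draws++;
  }
  return draws;
}
```

A fallback of this shape could also serve as a polyfill on browsers without a batch method, at the cost of keeping the O(N) binding overhead.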

Here is what I am proposing:
http://wiki.whatwg.org/wiki/Canvas_Batch_drawImage

Feedback wanted.

Cheers,

   -Justin Novosad


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-04 Thread Robert O'Callahan
It looks reasonable to me.

How do these calls interact with globalAlpha etc? You talk about
decomposing them to individual drawImage calls; does that mean each image
draw is treated as a separate composite operation?

Currently you have to choose between using a single image or passing an
array with one element per image-draw. It seems to me it would be more
flexible to always pass an array but allow the parameters array to refer to
an image by index. Did you consider that approach?

Rob
-- 
oIo otoeololo oyooouo otohoaoto oaonoyooonoeo owohooo oioso oaonogoroyo
owoiotoho oao oboroootohoeoro oooro osoiosotoeoro owoiololo oboeo
osouobojoeocoto otooo ojouodogomoeonoto.o oAogoaoiono,o oaonoyooonoeo
owohooo
osoaoyoso otooo oao oboroootohoeoro oooro osoiosotoeoro,o o‘oRoaocoao,o’o
oioso
oaonosowoeoroaoboloeo otooo otohoeo ocooouoroto.o oAonodo oaonoyooonoeo
owohooo
osoaoyoso,o o‘oYooouo ofolo!o’o owoiololo oboeo oiono odoaonogoeoro
ooofo
otohoeo ofoioroeo ooofo ohoeololo.


Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage

2014-08-04 Thread Katelyn Gadd
Many, many uses of drawImage involve transform and/or other state
changes per-blit (composite mode, global alpha).

I think some of those state changes could be viably batched for most
games (composite mode) but others absolutely cannot (global alpha,
transform). I see that you handle transform with
source-rectangle-and-transform (nice!) but you do not currently handle
the others. I'd suggest that this needs to at least handle
globalAlpha.

Replacing the overloading with individual named methods is something
I'm also in favor of. I think it would be ideal if the format-enum
argument were not there so that it's easier to feature-detect what
formats are available (for example, if globalAlpha data is added later
instead of in the '1.0' version of this feature).

I get the impression that ordering is implicit for this call - the
batch's drawing operations occur in exact order. It might be
worthwhile to have a way to indicate to the implementation that you
don't care about order, so that it is free to rearrange the draw
operations by image and reduce state changes. Doing that in userspace
js is made difficult since you can't easily do efficient table lookup
for images.

If rgba multiplication were to make it into canvas2d sometime in the
next decade, that would nicely replace globalAlpha as a per-draw
value. This is an analogue to per-vertex colors in 3d graphics and is
used in virtually every hardware-accelerated 2d game out there,
whether to tint characters when drawing text, fade things in and out,
or flash the screen various colors. That would be another reason to
make feature detection easier.

Would it be possible to sneak rgba multiplication in under the guise
of this feature? ;) Without it, I'm forced to use WebGL and reduce
compatibility just for something relatively trivial on the
implementer's side. (I should note that from what I've heard, Direct2D
actually makes this hard to implement.)

On the bright side there's a workaround for RGBA multiplication based
on generating per-channel bitmaps from the source bitmap (k, r/g/b),
then blending them source-over/add/add/add. drawImageBatch would
improve perf for the r/g/b part of it, so it's still an improvement.
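
For readers unfamiliar with the trick, here is a per-pixel TypeScript sketch of one reading of that workaround. It is float arithmetic only, not canvas code. It assumes premultiplied pixels, an opaque destination, a "k" bitmap that keeps only the source alpha, per-channel bitmaps that keep one color channel plus the source alpha, and globalAlpha used to carry each tint component. All function and type names are hypothetical.

```typescript
// Premultiplied [r, g, b, a], each component in [0, 1].
type Pixel = readonly [number, number, number, number];
// Straight (non-premultiplied) tint factors [r, g, b, a].
type Tint = readonly [number, number, number, number];

// source-over: result = src + dst * (1 - src.alpha)
function sourceOver(src: Pixel, dst: Pixel): Pixel {
  const keep = 1 - src[3];
  return [src[0] + dst[0] * keep, src[1] + dst[1] * keep,
          src[2] + dst[2] * keep, src[3] + dst[3] * keep];
}

// Additive blend ("lighter" in canvas terms): result = min(1, src + dst)
function lighter(src: Pixel, dst: Pixel): Pixel {
  return [Math.min(1, src[0] + dst[0]), Math.min(1, src[1] + dst[1]),
          Math.min(1, src[2] + dst[2]), Math.min(1, src[3] + dst[3])];
}

// globalAlpha scales every premultiplied component of the source.
function withGlobalAlpha(p: Pixel, alpha: number): Pixel {
  return [p[0] * alpha, p[1] * alpha, p[2] * alpha, p[3] * alpha];
}

// What a native rgba multiply would compute: tint the source, then source-over.
function tintDirect(src: Pixel, tint: Tint, dst: Pixel): Pixel {
  const [tr, tg, tb, ta] = tint;
  return sourceOver(
    [src[0] * tr * ta, src[1] * tg * ta, src[2] * tb * ta, src[3] * ta], dst);
}

// The workaround: one source-over pass with the black "k" bitmap,
// then three additive passes with the r, g and b bitmaps.
function tintViaChannelBitmaps(src: Pixel, tint: Tint, dst: Pixel): Pixel {
  const [tr, tg, tb, ta] = tint;
  const k: Pixel = [0, 0, 0, src[3]];
  const r: Pixel = [src[0], 0, 0, src[3]];
  const g: Pixel = [0, src[1], 0, src[3]];
  const b: Pixel = [0, 0, src[2], src[3]];
  let out = sourceOver(withGlobalAlpha(k, ta), dst); // knock out the background
  out = lighter(withGlobalAlpha(r, tr * ta), out);   // add tinted red
  out = lighter(withGlobalAlpha(g, tg * ta), out);   // add tinted green
  out = lighter(withGlobalAlpha(b, tb * ta), out);   // add tinted blue
  return out;
}
```

Under these assumptions both functions give the same color for an opaque destination. The workaround costs four draws per sprite instead of one, which is why a batch call would help with the three additive passes.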

On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org wrote:
 It looks reasonable to me.

 How do these calls interact with globalAlpha etc? You talk about
 decomposing them to individual drawImage calls; does that mean each image
 draw is treated as a separate composite operation?

 Currently you have to choose between using a single image or passing an
 array with one element per image-draw. It seems to me it would be more
 flexible to always pass an array but allow the parameters array to refer to
 an image by index. Did you consider that approach?

 Rob