Re: [whatwg] resource hints and separating download from processing
Sweet, I think we're on the same page. I completely agree with your high level goal of a declarative fetch interface. Agreed that matching is still critical, eg for the Link header use case. On Thu, Aug 7, 2014 at 10:40 PM, Ilya Grigorik igrigo...@gmail.com wrote: On Thu, Aug 7, 2014 at 4:39 PM, Ben Maurer ben.mau...@gmail.com wrote: On Thu, Aug 7, 2014 at 3:21 PM, Ilya Grigorik igrigo...@gmail.com wrote: It would be nice if there was a more declarative relationship between the declarative fetch and the eventual use of the resource (assuming the resources are on the same page). I would like to break that dependency. I want layering separation where we have a clean way to fetch resources (with custom parameters like headers, priorities, dependencies, etc), and a layer that's responsible for consuming fetched resources for processing in the right context (enforcing security policies, etc), and at the right time -- e.g. see Example 3 under: https://igrigorik.github.io/resource-hints/#preload So I guess my worry here is that the loose dependency could be hard to debug. As a concrete example, we use crossorigin=anonymous on our script tags so that we can get stack traces. My understanding is that this requires an Origin header to be sent with the request and a CORS header in the response. If my link rel=preload doesn't have a crossorigin setting, the requests wouldn't match up. We can't control the response bits, but as far as emitting the right request headers and communicating stream priority + dependencies, all of that could be exposed via some set of options when the request is built up. This is where we get back to the syntax (e.g. params={...}, or some such), which would probably map directly to various Fetch API use cases and options. Effectively, I want a declarative Fetch interface. I guess what I'm asking for here is some programmatic way of connecting the fetch with the consumption. 
For example, exposing the fetch object of the rel=preload and allowing you to construct a script tag explicitly with the fetch object. +1. That said, we still *need* a mechanism where the matching does not rely on manual plumbing with JavaScript - e.g. server returns a set of hints via Link header for critical resources, which must be matched by the user agent against appropriate requests later in the doc. ig
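The crossorigin mismatch Ben describes can be made concrete with a small markup sketch (illustrative only; the rel=preload matching rules were still being drafted at this point, so attribute names and matching behavior are assumptions):

```html
<!-- The hint is fetched without CORS... -->
<link rel="preload" href="/static/app.js">

<!-- ...but the consumer requests CORS mode ("anonymous"), so the UA
     cannot match the preloaded response against this request and
     fetches /static/app.js a second time. -->
<script src="/static/app.js" crossorigin="anonymous"></script>

<!-- For the requests to match, the hint must carry the same mode: -->
<link rel="preload" href="/static/app.js" crossorigin="anonymous">
```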
Re: [whatwg] resource hints and separating download from processing
+1 to breaking the dependency between fetching the resource and how it is later used in the document. This type of “late binding” enables many page optimizations. Peter PS. My apologies if you’ve already seen a message saying similar. I’m not sure if the mailing list is accepting my messages or not yet. On Aug 8, 2014, at 2:17 AM, Ben Maurer ben.mau...@gmail.com wrote: Sweet, I think we're on the same page. I completely agree with your high level goal of a declarative fetch interface. Agreed that matching is still critical, eg for the Link header use case. On Thu, Aug 7, 2014 at 10:40 PM, Ilya Grigorik igrigo...@gmail.com wrote: On Thu, Aug 7, 2014 at 4:39 PM, Ben Maurer ben.mau...@gmail.com wrote: On Thu, Aug 7, 2014 at 3:21 PM, Ilya Grigorik igrigo...@gmail.com wrote: It would be nice if there was a more declarative relationship between the declarative fetch and the eventual use of the resource (assuming the resources are on the same page). I would like to break that dependency. I want layering separation where we have a clean way to fetch resources (with custom parameters like headers, priorities, dependencies, etc), and a layer that's responsible for consuming fetched resources for processing in the right context (enforcing security policies, etc), and at the right time -- e.g. see Example 3 under: https://igrigorik.github.io/resource-hints/#preload So I guess my worry here is that the loose dependency could be hard to debug. As a concrete example, we use crossorigin=anonymous on our script tags so that we can get stack traces. My understanding is that this requires an Origin header to be sent with the request and a CORS header in the response. If my link rel=preload doesn't have a crossorigin setting, the requests wouldn't match up. We can't control the response bits, but as far as emitting the right request headers and communicating stream priority + dependencies, all of that could be exposed via some set of options when the request is built up. 
This is where we get back to the syntax (e.g. params={...}, or some such), which would probably map directly to various Fetch API use cases and options. Effectively, I want a declarative Fetch interface. I guess what I'm asking for here is some programmatic way of connecting the fetch with the consumption. For example, exposing the fetch object of the rel=preload and allowing you to construct a script tag explicitly with the fetch object. +1. That said, we still *need* a mechanism where the matching does not rely on manual plumbing with JavaScript - e.g. server returns a set of hints via Link header for critical resources, which must be matched by the user agent against appropriate requests later in the doc. ig
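The "declarative fetch plus explicit consumption" layering being discussed can be sketched as a page-level registry of preloaded requests that consumers claim by URL and request mode. None of these names (`preload`, `claim`, the registry) are a real API; this is only a sketch of why matching has to consider the request mode, not just the URL:

```javascript
// Hypothetical sketch: a registry of declaratively-fetched resources.
// In a browser, preload() would correspond to <link rel=preload> (or a
// Link header) issuing a fetch; claim() is the consumption step.
var preloadRegistry = new Map();

function preload(url, options) {
  // Record the request's identity: URL *and* mode. A real UA would
  // kick off the network fetch here with these options.
  var entry = { url: url, options: options || {}, consumed: false };
  preloadRegistry.set(url + '|' + (entry.options.mode || 'no-cors'), entry);
  return entry;
}

function claim(url, options) {
  // Matching is on URL and mode together: a preload made without CORS
  // cannot satisfy a crossorigin=anonymous consumer, which is exactly
  // the debugging hazard Ben raises.
  var key = url + '|' + ((options && options.mode) || 'no-cors');
  var entry = preloadRegistry.get(key);
  if (entry) entry.consumed = true;
  return entry || null; // null -> the UA would issue a fresh request
}
```

A consumer that passes the same mode gets the preloaded entry back; one that passes a different mode gets `null`, i.e. a duplicate download.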
Re: [whatwg] Notifications improvements
Hi Andrew, On Wed, Aug 6, 2014 at 12:59 PM, Andrew Wilson atwil...@google.com wrote: On Wed, Aug 6, 2014 at 12:48 PM, Anne van Kesteren ann...@annevk.nl wrote: On Wed, Aug 6, 2014 at 10:08 AM, Andrew Wilson atwil...@google.com wrote: I understand your concern that is driving this proposal: you don't want to provide rich APIs that can't be well implemented on every platform, and thereby fragment the web platform. I just don't want to see us go down this path of adding these new notification types that are so limited in ability that people will just keep using general notifications anyway - I'd rather just stick with the existing API. Are you unenthusiastic about any of the proposed additions (what about those already added?) or is this more about the more complex features such as indication of progress? I'm (somewhat) unenthusiastic about the new semantic types, because I'm not sure they'd get enough uptake to be worth the effort to implement (note that this is just my personal opinion - I'm no longer as heavily involved in the notification work within Chromium, so Peter B's opinion carries much more weight than mine when it comes to determining what we'd implement). I find myself being in favor of the semantic types. The primary reason for this is that it allows us to provide a much more consistent user experience, especially on platforms where we don't control rendering of the notification, compared to HTML notifications. On Chrome for Android we want to provide for a consistent user experience, where notifications should be visually indistinguishable (aside from clarifying the origin) from those created by native apps. The proposed semantic types would get us there. Furthermore, support for HTML notifications will be much more difficult to implement across the board. Some mobile OSes, notably Firefox OS and iOS, wouldn't support this at all. 
Others, such as Android, theoretically could support it, but won't because it means creating an entire WebView -- causing very significant memory pressure on already resource constrained devices. Implementation wise, Chrome recently switched to rendering Notifications using a new message center implementation, which already supports rich data such as progress bars, timestamps, lists and buttons. On Android many of these features will come down to calling an extra method on the Java Notification.Builder. I am quite enthusiastic about adding support in the API around allowing users to interact with notifications after the parent page has closed, however. Having a new field timestamp that carries a particular point in time related to the message seems quite useful for instance and not very intrusive. Perhaps I'm not understanding how this would be used. What would such a notification look like for (say) a calendar event (Your 1PM meeting starts in 5 minutes) versus a countdown timer (In 73 seconds, your hard boiled egg will be done cooking)? Would we have to provide a format string so the UA knows where to inject the time remaining, or would it always have to be put at the end? Do we need to specify granularity of updates (i.e. if I have a stopwatch app, I probably want to update every second, but for a calendar app, updating every 5 minutes is probably sufficient, until we get down to the last 5 minutes)? I would argue that this is a feature that should be defined by the UA or platform. I'm not sure whether using a counting notification is the right way to implement a count-down -- I would expect normal, foreground UI displaying a counter, and a notification if that UI was moved to the background and the counter expired. The timestamp doesn't necessarily need to be in the future either. At least for me, a calendar notification that is constantly updating its countdown every minute but has no snooze functionality is actually a mis-feature, because it's distracting. 
The notification wouldn't visually re-announce itself on every update. Again, I don't want to be overly negative about this - maybe there are use cases like list notifications where the new API would be really useful, but as soon as I start thinking about more complex display scenarios, I immediately want to start having more control over the formatting of the text being displayed, and I wouldn't get that with this proposal. I think the benefit of being able to closely match the UI and UX of native notifications on the platforms is something that's enabled by using declarative properties, whereas that would be near impossible to do with HTML notifications. As long as advanced features, especially more compatible ones (say, buttons or timestamps) can be feature-detected by developers so that they can provide a fallback when they're not available, I would be in favor of extending the feature set with Jonas' declarative proposals. Thanks, Peter
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
As Justin stated, 20% of current Chrome users currently fall back to canvas 2d. 1. What fraction of those 20% actually still get a GPU accelerated canvas vs. software rendered? Batching will be of very little use to the software rendered audience, making it an even smaller target market. 2. In Firefox's case, that number has reduced from 67% to 15% over a couple of years. Surely in time this will fall even further to a negligible amount. Why standardise a feature whose target market is disappearing? Small developers that don't have the resources to develop for concurrent WebGL and Canvas2D code paths They don't have to: there are free, high-quality open-source libraries like Pixi.js that do this already, so even small developers have an easy way to make use of a WebGL renderer without much extra effort. When do you envision that OpenGL drivers are bug free everywhere? History is not on your side here... I would much rather have something short term that can be implemented with low effort and improves performance. No software is ever bug-free, but this is irrelevant. To not be blacklisted, drivers don't need to be perfect, they just need to meet a reasonable threshold of security and reliability. If a driver is insecure or crashes constantly it is blacklisted. Drivers are being improved so they are no longer this poor, and it is not unrealistic to imagine 99%+ of drivers meeting this threshold in the near future, even if none of them are completely bug-free. I don't really understand why you and Brian are so opposed to improving the performance of canvas 2D. I see it as a feature targeted at a rapidly disappearing segment of the market that will disappear in the long run, leaving the web platform with unnecessary API cruft. Following your logic, why work on new canvas or SVG features as they can theoretically be emulated in WebGL? Or now that we have asm.js, why even bother with new JavaScript features? 
I am in general against duplication on the web platform, but new features deserve to be implemented if they have a valid use case or solve a real problem. In this case I don't see that any real problem is being solved, since widely available frameworks and engines already solve it with WebGL in a way accessible even to individual developers, and this solution is already production-grade and widely deployed. On further thought this particular proposal doesn't even appear to solve the batching problem very well. Many games consist of large numbers of rotated sprites. If a canvas2d batching facility needs to break the batch every time it needs to call rotate(), this will revert back to individual draw-calls for many kinds of game. WebGL does not have this limitation and can batch objects of a variety of scales, angles, tiling, opacity and more into single calls. This is done by control over individual vertex positions and texture co-ordinates, which is a fundamental break from the style of the canvas2d API. Therefore even with the proposed batching facility, for maximum performance it is still necessary to use WebGL. This proposal solves a very narrowly defined performance problem. An alternate solution is for browser vendors to implement canvas2d entirely in JS on top of WebGL. This reduces per-call overhead by staying in JS land, while not needing to add any new API surface. In fact it looks like this has already been attempted here: https://github.com/corbanbrook/webgl-2d - I'd suggest, before speccing this feature, researching whether the same performance goals can be achieved with a well-batched JS implementation which reduces the number of calls into the browser, which is exactly how existing WebGL renderers outperform canvas2d despite both being GPU accelerated. Ashley
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On Fri, Aug 8, 2014 at 10:25 AM, Ashley Gullen ash...@scirra.com wrote: As Justin stated, 20% of current Chrome users currently fall back to canvas 2d. 1. What fraction of those 20% actually still get a GPU accelerated canvas vs. software rendered? Batching will be of very little use to the software rendered audience, making it an even smaller target market. ~25% of Chrome users that do not have gpu-accelerated WebGL do have gpu-accelerated 2d canvas. Nonetheless, the software implementation of 2D canvas is way faster than software-emulated WebGL, so it makes sense to fall back to 2D canvas anytime accelerated WebGL is unavailable. No software is ever bug-free, but this is irrelevant. To not be blacklisted, drivers don't need to be perfect, they just need to meet a reasonable threshold of security and reliability. If a driver is insecure or crashes constantly it is blacklisted. Drivers are being improved so they are no longer this poor, and it is not unrealistic to imagine 99%+ of drivers meeting this threshold in the near future, even if none of them are completely bug-free. The problem is the long tail of old devices. There is an astonishingly large number of machines in the world running outdated OSes that no longer receive updates, or graphics driver updates. Also, AFAIK, display drivers tend to not have auto-updates the way OSes and browsers do. So even if there is an updated driver out there, most users are unlikely to install it until they try to use some software that requires it explicitly. Many games consist of large numbers of rotated sprites. The proposal includes the possibility of specifying a per-draw transformation matrix. And as Katelyn suggested, we could add a variant that takes an alpha value. I will post an update to this thread as soon as I have more compelling performance data. 
My goal is to demonstrate that a 2d canvas (with batched drawImage calls) can yield performance characteristics that are significantly superior to WebGL's for typical 2D sprite-based game use cases, particularly on mobile platforms. This will be possible by leveraging the browser's compositing framework in ways that are not possible with WebGL. Stay tuned. -Justin
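One way to picture the per-draw transformation matrix mentioned above is as a flat parameter array with a fixed stride per sprite. The layout below (source rect plus a 2x3 affine matrix per draw) is purely illustrative, not the proposal's actual wire format, and the up-front validation pass is one possible answer to the error-detection question:

```javascript
// Hypothetical packing for a batched drawImage call: one Float32Array,
// fixed stride per sprite. Layout is illustrative only.
var FLOATS_PER_DRAW = 10; // sx, sy, sw, sh, a, b, c, d, e, f

function packBatch(draws) {
  var out = new Float32Array(draws.length * FLOATS_PER_DRAW);
  draws.forEach(function (d, i) {
    var o = i * FLOATS_PER_DRAW;
    out[o]     = d.sx; out[o + 1] = d.sy;   // source rect origin
    out[o + 2] = d.sw; out[o + 3] = d.sh;   // source rect size
    // Per-draw 2D affine transform, same element order as
    // setTransform(a, b, c, d, e, f) -- this is what lets rotated
    // sprites stay inside a single batch.
    out[o + 4] = d.a;  out[o + 5] = d.b;
    out[o + 6] = d.c;  out[o + 7] = d.d;
    out[o + 8] = d.e;  out[o + 9] = d.f;
  });
  return out;
}

// Validate the whole array before drawing anything, so a malformed
// batch throws without leaving the canvas partially drawn.
function validateBatch(params) {
  if (params.length % FLOATS_PER_DRAW !== 0)
    throw new TypeError('batch length must be a multiple of ' + FLOATS_PER_DRAW);
  for (var i = 0; i < params.length; i++)
    if (!isFinite(params[i]))
      throw new TypeError('non-finite value at index ' + i);
}
```

The trade-off is the extra validation pass over the array; the alternative is to validate lazily and accept that an exception may fire after some sprites have already been drawn.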
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
1. What fraction of those 20% actually still get a GPU accelerated canvas vs. software rendered? Batching will be of very little use to the software rendered audience, making it an even smaller target market. ~25% of Chrome users that do not have gpu-accelerated WebGL do have gpu-accelerated 2d canvas. So 25% of 20%... you are speccing a feature for just 5% of users, correct? (Since batching likely makes no difference when software rendered) The problem is the long tail of old devices. These will still disappear with time, just like old Windows 95 machines have. Is your intent to spec a feature that will no longer be necessary in future? I will post an update to this thread as soon as I have more compelling performance data. My goal is to demonstrate that a 2d canvas (with batched drawImage calls) can yield performance characteristics that are significantly superior to WebGL's for typical 2D sprite-based game use cases, particularly on mobile platforms. Here's some existing data: comparing a WebGL renderer ( http://www.scirra.com/demos/c2/renderperfgl/) with canvas2d ( http://www.scirra.com/demos/c2/renderperf2d) in Chrome on a Nexus 5:

canvas2d: 360 objects @ 30 FPS
webgl: ~17,500 objects @ 30 FPS

WebGL is nearly 50 (fifty!) times faster than canvas2d in these results. Do you really consider this not fast enough? And by just how much further do you hope to improve the result?
Re: [whatwg] Notifications improvements
On Fri, Aug 8, 2014 at 5:48 AM, Peter Beverloo bever...@google.com wrote: I think the benefit of being able to closely match the UI and UX of native notifications on the platforms is something that's enabled by using declarative properties, whereas that would be near impossible to do with HTML notifications. As long as advanced features, especially more compatible ones (say, buttons or timestamps) can be feature-detected by developers so that they can provide a fallback when they're not available, I would be in favor of extending the feature set with Jonas' declarative proposals. Cool! I had not thought about the need to feature detect. At least for progress/list/timestamp. My thinking had been to require that the UA provide a textual fallback on platforms that can't render those widgets natively. However I think you are right that we'll need feature detection here. We might even need two types of it. First of all, not all UAs are going to implement all of these features right away. So obviously they won't provide any fallback rendering either. Detecting this seems important. Second, it might be good to enable pages to detect platforms where a progress bar is rendered as a native progress bar, rather than as text fallback. This way the page can choose not to use a progress bar and instead create its own fallback. An important point here is that pages that don't choose to test for the latter will still get a useful rendering of the notification. / Jonas
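The first kind of detection Jonas describes (does the UA know this field at all?) could look something like the sketch below. The `Notification` interface isn't available outside a browser, so the prototype to probe is passed in; everything here, including the degrade-to-body-text fallback, is an illustrative assumption rather than a specced behavior:

```javascript
// Sketch: keep only the notification fields the UA understands, and
// fold unsupported ones into the plain-text body as a fallback.
// In a page, 'proto' would be Notification.prototype.
function buildOptions(proto, wanted) {
  var supported = {};
  var fallbackText = [];
  Object.keys(wanted).forEach(function (key) {
    if (key in proto) {
      supported[key] = wanted[key];                 // UA can render this
    } else {
      fallbackText.push(key + ': ' + wanted[key]);  // degrade to text
    }
  });
  if (fallbackText.length)
    supported.body =
      ((supported.body || '') + ' ' + fallbackText.join(', ')).trim();
  return supported;
}
```

The second kind of detection (is the field rendered natively or as UA-generated text?) would need an additional signal from the platform and isn't covered by this idiom.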
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote: Sorry, in this context rgba multiplication refers to per-channel multipliers (instead of only one multiplier for the alpha channel), so that you can color tint images when drawing them. As mentioned, it's used for fades, drawing colored text, and similar effects. I see. Any reason that this couldn't be done with a 'multiply' blend mode? Premultiplication is a different subject, sorry if I confused you with the similar language. There are past discussions about both in the list archives. On Thu, Aug 7, 2014 at 10:59 AM, Rik Cabanier caban...@gmail.com wrote: On Mon, Aug 4, 2014 at 4:35 PM, Katelyn Gadd k...@luminance.org wrote: Many, many uses of drawImage involve transform and/or other state changes per-blit (composite mode, global alpha). I think some of those state changes could be viably batched for most games (composite mode) but others absolutely cannot (global alpha, transform). I see that you handle transform with source-rectangle-and-transform (nice!) but you do not currently handle the others. I'd suggest that this needs to at least handle globalAlpha. Replacing the overloading with individual named methods is something I'm also in favor of. I think it would be ideal if the format-enum argument were not there so that it's easier to feature-detect what formats are available (for example, if globalAlpha data is added later instead of in the '1.0' version of this feature). We can define the functions so they throw a type error if an unknown enum is passed. That way you can feature detect future additions to the enum. What should we do about error detection in general? If we require the float array to be well-formed before drawing, we need an extra pass to make sure that it is correct. If we don't require it, we can skip that pass but content could be partially drawn to the canvas before the exception is thrown. 
I get the impression that ordering is implicit for this call - the batch's drawing operations occur in exact order. It might be worthwhile to have a way to indicate to the implementation that you don't care about order, so that it is free to rearrange the draw operations by image and reduce state changes. Doing that in userspace JS is made difficult since you can't easily do efficient table lookup for images. If rgba multiplication were to make it into canvas2d sometime in the next decade, that would nicely replace globalAlpha as a per-draw value. This is an analogue to per-vertex colors in 3d graphics and is used in virtually every hardware-accelerated 2d game out there, whether to tint characters when drawing text, fade things in and out, or flash the screen various colors. That would be another reason to make feature detection easier. Would it be possible to sneak rgba multiplication in under the guise of this feature? ;) Without it, I'm forced to use WebGL and reduce compatibility just for something relatively trivial on the implementer's side. (I should note that from what I've heard, Direct2D actually makes this hard to implement.) Is this the other proposal to control the format of the canvas buffer that is passed to WebGL? On the bright side there's a workaround for RGBA multiplication based on generating per-channel bitmaps from the source bitmap (k, r/g/b), then blending them source-over/add/add/add. drawImageBatch would improve perf for the r/g/b part of it, so it's still an improvement. On Mon, Aug 4, 2014 at 3:39 PM, Robert O'Callahan rob...@ocallahan.org wrote: It looks reasonable to me. How do these calls interact with globalAlpha etc? You talk about decomposing them to individual drawImage calls; does that mean each image draw is treated as a separate composite operation? Currently you have to choose between using a single image or passing an array with one element per image-draw. 
It seems to me it would be more flexible to always pass an array but allow the parameters array to refer to an image by index. Did you consider that approach? Rob
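Katelyn's point about efficient table lookup for images is worth a sketch: with a Map keyed on object identity, the userspace reordering she describes (group an unordered batch by source image to minimise state changes) becomes straightforward. This is an illustrative sketch of the idea, not part of any proposal:

```javascript
// Reorder a batch of draw operations so that all draws from the same
// source image are contiguous, reducing texture/state changes.
// The Map is keyed on the image object itself (identity lookup),
// which plain JS objects can't do since their keys must be strings.
function groupByImage(draws) {
  var byImage = new Map(); // image object -> that image's draw params
  draws.forEach(function (d) {
    var list = byImage.get(d.image);
    if (!list) byImage.set(d.image, list = []);
    list.push(d);
  });
  var ordered = [];
  byImage.forEach(function (list) {
    ordered.push.apply(ordered, list); // one run per image
  });
  return ordered;
}
```

This only helps when the author has opted out of strict ordering, since reordering changes the paint result wherever draws from different images overlap.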
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On Fri, Aug 8, 2014 at 7:25 AM, Ashley Gullen ash...@scirra.com wrote: As Justin stated, 20% of current Chrome users currently fall back to canvas 2d. 1. What fraction of those 20% actually still get a GPU accelerated canvas vs. software rendered? Batching will be of very little use to the software rendered audience, making it an even smaller target market. There will still be a noticeable gain since we wouldn't have to cross the JS boundary as much. More importantly though, things will still work for non-accelerated canvas while WebGL won't. 2. In Firefox's case, that number has reduced from 67% to 15% over a couple of years. Surely in time this will fall even further to a negligible amount. Why standardise a feature whose target market is disappearing? How long will it be before we're at a reliable 100%? Graphics drivers are still flaky and it's not as if they only came out a couple of years ago. Note that the supported % for graphics layers (accelerated compositing) is much lower. Small developers that don't have the resources to develop for concurrent WebGL and Canvas2D code paths They don't have to: there are free, high-quality open-source libraries like Pixi.js that do this already, so even small developers have an easy way to make use of a WebGL renderer without much extra effort. When do you envision that OpenGL drivers are bug free everywhere? History is not on your side here... I would much rather have something short term that can be implemented with low effort and improves performance. No software is ever bug-free, but this is irrelevant. To not be blacklisted, drivers don't need to be perfect, they just need to meet a reasonable threshold of security and reliability. If a driver is insecure or crashes constantly it is blacklisted. Drivers are being improved so they are no longer this poor, and it is not unrealistic to imagine 99%+ of drivers meeting this threshold in the near future, even if none of them are completely bug-free. 
I think Justin can better fill you in on this. Chrome has to jump through many hoops to make canvas reliable on top of OpenGL and it still suffers from random crashes when you stress the system. Both Safari and Firefox use higher level system calls and are more reliable (albeit slower) than Chrome. I don't really understand why you and Brian are so opposed to improving the performance of canvas 2D. I see it as a feature targeted at a rapidly disappearing segment of the market that will disappear in the long run, leaving the web platform with unnecessary API cruft. Following your logic, why work on new canvas or SVG features as they can theoretically be emulated in WebGL? Or now that we have asm.js, why even bother with new JavaScript features? I am in general against duplication on the web platform, but new features deserve to be implemented if they have a valid use case or solve a real problem. The problem is that a large number of drawImage calls have a lot of overhead due to JS crossings and housekeeping. This proposal solves that. In this case I don't see that any real problem is being solved, since widely available frameworks and engines already solve it with WebGL in a way accessible even to individual developers, and this solution is already production-grade and widely deployed. Sure, but that is in WebGL which not everyone wants to use and is less widely supported. On further thought this particular proposal doesn't even appear to solve the batching problem very well. Many games consist of large numbers of rotated sprites. If a canvas2d batching facility needs to break the batch every time it needs to call rotate(), this will revert back to individual draw-calls for many kinds of game. WebGL does not have this limitation and can batch in to single calls objects of a variety of scales, angles, tiling, opacity and more. 
This is done by control over individual vertex positions and texture co-ordinates, which is a fundamental break from the style of the canvas2d API. Therefore even with the proposed batching facility, for maximum performance it is still necessary to use WebGL. This proposal solves a very narrowly defined performance problem. I'm unsure if I follow. The point of Justin's proposal is to do just that under the hood. Why do you think the batching needs to be broken up? Did you see that the proposal has a matrix per draw? An alternate solution is for browser vendors to implement canvas2d entirely in JS on top of WebGL. This reduces per-call overhead by staying in JS land, while not needing to add any new API surface. In fact it looks like this has already been attempted here: https://github.com/corbanbrook/webgl-2d - Implementing Canvas on top of WebGL is not realistic. Please look into Chrome's implementation to make canvas reliable and fast. This cannot be achieved today. I totally agree if the web platform matures to a point where this IS possible, we
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On 8/8/14, 8:43 PM, Rik Cabanier wrote: The problem is that a large number of drawImage calls have a lot of overhead due to JS crossings and housekeeping. Could we please quantify this? I measured the JS crossings part of this, because it's easy: Just have the C++ side of drawImage return immediately. What I see if I do that is that a drawImage call that passes an HTMLImageElement and two integers takes about 17ns on my hardware in a current Firefox nightly on Mac. For scale, a full-up drawImage call that actually does something takes the following amounts of time in various browsers I have [1]:

Firefox nightly: ~9500ns/call
Chrome dev: ~4300ns/call
Safari 7.0.5 and WebKit nightly: ~3000ns/call

all with noise (when averaged across 10 calls); that's way more than 17ns. So I'm not sure JS crossings is a significant performance cost here. I'd be interested in which parts of housekeeping would be shareable across the multiple images in the proposal and how much time those take in practice. -Boris

[1] The specific testcase I used:

<!DOCTYPE html>
<img src="http://www.mozilla.org/images/mozilla-banner.gif">
<script>
onload = function() {
  var c = document.createElement("canvas").getContext("2d");
  var count = 100;
  var img = document.querySelector("img");
  var start = new Date;
  for (var i = 0; i < count; ++i)
    c.drawImage(img, 0, 0);
  var stop = new Date;
  console.log((stop - start) / count * 1e6 + " ns per call");
}
</script>
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
The way you measured the JS crossing time does not include parameter validations if I am not mistaken. If you have 1000 sprite draws that draw from the same sprite sheet, that is 1000 times you are verifying the same image parameter (verifying that the image is complete, that its security origin does not taint the canvas, fetch the decoded image data, etc.), even though most of this stuff is cached, it is being looked up N times instead of just once. I will prototype this and come back with some hard data. On Fri, Aug 8, 2014 at 9:26 PM, Boris Zbarsky bzbar...@mit.edu wrote: On 8/8/14, 8:43 PM, Rik Cabanier wrote: The problem is that a large number of drawImage calls have a lot of overhead due to JS crossings and housekeeping. Could we please quantify this? I measured the JS crossings part of this, because it's easy: Just have the C++ side of drawImage return immediately. What I see if I do that is that a drawImage call that passes an HTMLImageElement and two integers takes about 17ns on my hardware in a current Firefox nightly on Mac. For scale, a full-up drawImage call that actually does something takes the following amounts of time in various browsers I have [1]:

Firefox nightly: ~9500ns/call
Chrome dev: ~4300ns/call
Safari 7.0.5 and WebKit nightly: ~3000ns/call

all with noise (when averaged across 10 calls); that's way more than 17ns. So I'm not sure JS crossings is a significant performance cost here. I'd be interested in which parts of housekeeping would be shareable across the multiple images in the proposal and how much time those take in practice. -Boris

[1] The specific testcase I used:

<!DOCTYPE html>
<img src="http://www.mozilla.org/images/mozilla-banner.gif">
<script>
onload = function() {
  var c = document.createElement("canvas").getContext("2d");
  var count = 100;
  var img = document.querySelector("img");
  var start = new Date;
  for (var i = 0; i < count; ++i)
    c.drawImage(img, 0, 0);
  var stop = new Date;
  console.log((stop - start) / count * 1e6 + " ns per call");
}
</script>
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On 8/8/14, 9:46 PM, Justin Novosad wrote:
> The way you measured the JS crossing time does not include parameter validations if I am not mistaken.

It includes the validation Web IDL does (e.g. that this is an HTMLImageElement), but not the validation specific to this method, correct.

> If you have 1000 sprite draws that draw from the same sprite sheet, that is 1000 times you are verifying the same image parameter (verifying that the image is complete, that its security origin does not taint the canvas, fetch the decoded image data, etc.), even though most of this stuff is cached, it is being looked up N times instead of just once.

True. I just tested this in a Firefox nightly: I put the early return after we have looked up the decoded image data in our internal cache (or gotten it from the image if not cached), but before we've started doing anything with the position passed to drawImage. The internal cache lookup already includes checks for things like tainting and completeness verification, since there is nothing in the cache if the image is not complete, and the canvas is already tainted as needed if we have cached decoded data for this (image, canvas) pair.

This version of drawImage takes about 47ns per call when averaged over 1e5 calls (of which, recall, 17ns is the JS call overhead; the other 30ns is the cache lookup, which hits). That's starting to creep up into the 1.5% range of the time I see drawImage taking in the fastest drawImage implementation I have on hand.

> I will prototype this and come back with some hard data.

That would be awesome!

-Boris
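A toy model may help make Boris's point concrete: once decoded data is in the cache, a hit subsumes the completeness and tainting checks, because entries only exist for images that already passed them. This sketch is hypothetical (a real engine keys on more than the image object and caches far more state); `lookupDecoded` and the field names are invented for illustration.

```javascript
// Toy model of a decoded-image cache of the kind Boris describes.
// Invariants: only complete images get entries, and any tainting of the
// canvas happened at insert time - so a hit needs no re-validation.
var decodedCache = new Map();

function lookupDecoded(img, canvas) {
  var hit = decodedCache.get(img);
  if (hit !== undefined) return hit; // fast path: just the lookup cost

  // Slow path, taken once per image: validate, taint, decode, insert.
  if (!img.complete) throw new Error("image not complete");
  if (img.crossOriginTainted) canvas.tainted = true;
  var decoded = { pixels: "decoded:" + img.src }; // stand-in for real decode
  decodedCache.set(img, decoded);
  return decoded;
}
```

In this model the per-call cost that a batch API could share is the `Map` lookup itself, which matches the ~30ns-per-hit figure Boris measured being small but nonzero.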
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
A multiply blend mode by itself is not sufficient, because the image being rgba-multiplied typically has alpha transparency. The closest approximation is to generate (offline, in software with getImageData) an image per channel - rgbk - and to source-over blend the 'k' channel, then additively blend the r/g/b channels with individual alphas. This approximates the per-channel alpha values with nearly equivalent results. (I say nearly equivalent because some browsers do weird stuff with gamma/colorspace conversion that completely breaks this.)

If it helps, you could think of this operation as a layer group in Photoshop. The first layer in the group is the source image; the second layer is a solid-color-filled layer containing the rgba 'multiplier', in multiply mode; and the layer group has a mask on it that contains the source image's alpha information. Note that this representation (a layer group with two layers and a mask) implies that drawing an image this way requires multiple passes, which is highly undesirable. My current fallback requires 4 passes, along with 4 texture changes and two blend state changes. Not wonderful.

RGBA multiplication dates back to early fixed-function graphics pipelines. If a blend with globalAlpha and a premultiplied source is represented like this:

result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha ) + ( dest(r, g, b) * (1 - (source(a) * globalAlpha)) )

then you take a premultiplied color constant and use that as the multiplier for your image, instead of a global alpha value - this is the input to rgba multiplication, i.e. a 'vertex color':

result(r, g, b) = ( source-premultiplied(r, g, b) * rgba-multiplier-premultiplied(r, g, b) ) + ( dest(r, g, b) * (1 - (source(a) * rgba-multiplier-premultiplied(a))) )

(Sorry if this is unclear, I don't have a math education.) So you basically take the global alpha multiplier and go from that to a per-channel multiplier.

If you're using premultiplied alpha already, this ends up being pretty straightforward: you just take a color (premultiplied, like everything else) and use that as your multiplier. You can multiply directly by each channel, since the global 'alpha' part of the multiplier is already baked in by the premultiplication step. This is a really common primitive since it's so easy to implement, if not entirely free - you're already doing that global alpha multiplication, so you just introduce a different multiplier per channel, which is trivial in a SIMD model like the ones used in computer graphics. You go from vec4 * scalar to vec4 * vec4.

Text rendering is the most compelling reason to support this, IMO. With this feature you can build glyph atlases inside 2d canvases (using something like FreeType, etc.), then trivially draw colored glyphs out of them without having to drop down into getImageData or use WebGL. It is trivially expressed in most graphics APIs, since it uses the same machinery as a global alpha multiplier - if you're drawing a premultiplied image with an alpha multiplier in hardware, you're almost certainly doing vec4 * scalar in your shader. In the fixed-function pipeline from the bad old days of 3d graphics, vec4 * scalar didn't even exist - the right-hand side was *always* another vec4, so this feature literally just changed the constant on the right-hand side.

I harp on this feature because nearly every 2d game I encounter uses it, and JSIL has to do it in software. If not for this one feature, it would be very easy to make the vast majority of ported titles Just Work against canvas, which would make them more likely to run correctly on mobile.
On Fri, Aug 8, 2014 at 5:28 PM, Rik Cabanier caban...@gmail.com wrote:
> On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:
>> Sorry, in this context rgba multiplication refers to per-channel multipliers (instead of only one multiplier for the alpha channel), so that you can color-tint images when drawing them. As mentioned, it's used for fades, drawing colored text, and similar effects.
>
> I see. Any reason that this couldn't be done with a 'multiply' blend mode?
>
>> Premultiplication is a different subject, sorry if I confused you with the similar language. There are past discussions about both in the list archives.
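For concreteness, the two blend formulas Katelyn gives can be written as per-pixel functions in JavaScript (a software fallback of the kind JSIL implements). This is a sketch under stated assumptions: color components are floats in [0, 1], all colors are premultiplied, and the alpha-channel line uses the standard premultiplied source-over rule, which the formulas in the message leave implicit.

```javascript
// src, dest, tint: { r, g, b, a } with premultiplied color channels,
// each component a float in [0, 1].

// Global-alpha blend (Katelyn's first formula):
//   result(rgb) = src(rgb) * alpha + dest(rgb) * (1 - src.a * alpha)
function blendGlobalAlpha(src, dest, alpha) {
  var outA = src.a * alpha;
  return {
    r: src.r * alpha + dest.r * (1 - outA),
    g: src.g * alpha + dest.g * (1 - outA),
    b: src.b * alpha + dest.b * (1 - outA),
    a: outA + dest.a * (1 - outA) // standard premultiplied over
  };
}

// RGBA-multiply blend (the second formula): the scalar alpha becomes a
// per-channel premultiplied multiplier - vec4 * vec4 instead of vec4 * scalar.
function blendRgbaMultiply(src, dest, tint) {
  var outA = src.a * tint.a;
  return {
    r: src.r * tint.r + dest.r * (1 - outA),
    g: src.g * tint.g + dest.g * (1 - outA),
    b: src.b * tint.b + dest.b * (1 - outA),
    a: outA + dest.a * (1 - outA)
  };
}
```

With an opaque white tint `{r:1, g:1, b:1, a:1}` the second function reduces to the first at `alpha = 1`, which is the "same machinery as a global alpha multiplier" observation; a red tint zeroes the source's green and blue contributions, i.e. color-tinted sprite drawing.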
Re: [whatwg] [2D Canvas] Proposal: batch variants of drawImage
On Fri, Aug 8, 2014 at 7:54 PM, Katelyn Gadd k...@luminance.org wrote:
> A multiply blend mode by itself is not sufficient, because the image being rgba-multiplied typically has alpha transparency. The closest approximation is to generate (offline, in software with getImageData) an image per channel - rgbk - and to source-over blend the 'k' channel, then additively blend the r/g/b channels with individual alphas. This approximates the per-channel alpha values with nearly equivalent results. (I say nearly equivalent because some browsers do weird stuff with gamma/colorspace conversion that completely breaks this.)
>
> If it helps, you could think of this operation as a layer group in Photoshop. The first layer in the group is the source image; the second layer is a solid-color-filled layer containing the rgba 'multiplier', in multiply mode; and the layer group has a mask on it that contains the source image's alpha information. Note that this representation (a layer group with two layers and a mask) implies that drawing an image this way requires multiple passes, which is highly undesirable. My current fallback requires 4 passes, along with 4 texture changes and two blend state changes. Not wonderful.

I see; you're asking for a feature like the Photoshop Color Overlay layer effect. Is that correct?

> RGBA multiplication dates back to early fixed-function graphics pipelines. If a blend with globalAlpha and a premultiplied source is represented like this:
>
> result(r, g, b) = ( source-premultiplied(r, g, b) * globalAlpha ) + ( dest(r, g, b) * (1 - (source(a) * globalAlpha)) )
>
> then you take a premultiplied color constant and use that as the multiplier for your image, instead of a global alpha value - this is the input to rgba multiplication, i.e. a 'vertex color':
>
> result(r, g, b) = ( source-premultiplied(r, g, b) * rgba-multiplier-premultiplied(r, g, b) ) + ( dest(r, g, b) * (1 - (source(a) * rgba-multiplier-premultiplied(a))) )
>
> (Sorry if this is unclear, I don't have a math education.) So you basically take the global alpha multiplier and go from that to a per-channel multiplier.
>
> If you're using premultiplied alpha already, this ends up being pretty straightforward: you just take a color (premultiplied, like everything else) and use that as your multiplier. You can multiply directly by each channel, since the global 'alpha' part of the multiplier is already baked in by the premultiplication step. This is a really common primitive since it's so easy to implement, if not entirely free - you're already doing that global alpha multiplication, so you just introduce a different multiplier per channel, which is trivial in a SIMD model like the ones used in computer graphics. You go from vec4 * scalar to vec4 * vec4.
>
> Text rendering is the most compelling reason to support this, IMO. With this feature you can build glyph atlases inside 2d canvases (using something like FreeType, etc.), then trivially draw colored glyphs out of them without having to drop down into getImageData or use WebGL. It is trivially expressed in most graphics APIs, since it uses the same machinery as a global alpha multiplier - if you're drawing a premultiplied image with an alpha multiplier in hardware, you're almost certainly doing vec4 * scalar in your shader. In the fixed-function pipeline from the bad old days of 3d graphics, vec4 * scalar didn't even exist - the right-hand side was *always* another vec4, so this feature literally just changed the constant on the right-hand side.
>
> I harp on this feature because nearly every 2d game I encounter uses it, and JSIL has to do it in software. If not for this one feature, it would be very easy to make the vast majority of ported titles Just Work against canvas, which would make them more likely to run correctly on mobile.

Maybe it would be best to bring this up as a separate topic on this mailing list (just copy/paste most of your message).

> On Fri, Aug 8, 2014 at 5:28 PM, Rik Cabanier caban...@gmail.com wrote:
>> On Thu, Aug 7, 2014 at 7:11 PM, Katelyn Gadd k...@luminance.org wrote:
>>> Sorry, in this context rgba multiplication refers to per-channel multipliers (instead of only one multiplier for the alpha channel), so that you can color-tint images when drawing them. As mentioned, it's used for fades, drawing colored text, and similar effects.
>>
>> I see. Any reason that this couldn't be done with a 'multiply' blend mode?
>
> Premultiplication is a different subject, sorry if I confused you with the similar language. There are past discussions about both in the list archives.