Re: [whatwg] Canvas pixel manipulation and performance
On Thu, 26 Nov 2009, Jason Oster wrote:

I've been using canvas to draw pixel art (NES/SNES game screens and sprites) similar to what an emulator would do. Doing this kind of drawing requires direct access to the pixel buffer. My problem with the canvas spec (as it is now) is that it tends to artificially bound pixel drawing performance to JavaScript when doing any sort of pixel access. Setting four unsigned 8-bit array elements (R, G, B, and A) is a slower operation than setting just one unsigned 32-bit array element (RGBA or ABGR). Sadly, we don't have this latter option for canvas. My comment is a request for a new set of pixel access methods on the CanvasRenderingContext2D object. Specifically, alternatives to createImageData(), getImageData(), and putImageData() methods for providing an array of unsigned 32-bit elements for pixel manipulation.

This comes up every now and then, but in the big picture, this problem isn't a huge issue. I think it's better that we wait for the rest of the spec to be better implemented before we start adding more features to canvas. Even in canvas, there are a number of features that would probably be more important than this, such as reusable path objects, text on a path, or dotted or dashed line styles. If we had data showing that it was a problem, that might change matters.

-- 
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.
Re: [whatwg] Canvas pixel manipulation and performance
I guess this suggestion to access the full pixel data in a single array element has fallen by the wayside. Are there any direct objections to including additional API to allow this kind of behavior? It seems most developers believe it would be unnecessary, but I haven't heard much in the way of reasoning (technical nor personal). I cannot comment on the typical uses of accessing pixel data from script; if it is [in general] more important to have each of the R,G,B,A components separated for script access, or not. But for cases involving indexed palettes, having the ability to directly treat each pixel as a single property is very much desired. It is not going to provide a huge boost in performance. At worst, it will help make code cleaner. But at best, it will do that and [slightly?] reduce the performance penalty of reading/writing 3 superfluous (in my eyes) array accesses. The only negative aspect I can think of with additional API functions is the introduction of new developer confusion; Which one do I use? Thanks for listening, Jay
Re: [whatwg] Canvas pixel manipulation and performance
On Fri, Dec 4, 2009 at 9:30 AM, Jason Oster paras...@kodewerx.org wrote: I guess this suggestion to access the full pixel data in a single array element has fallen by the wayside. Are there any direct objections to including additional API to allow this kind of behavior? It seems most developers believe it would be unnecessary, but I haven't heard much in the way of reasoning (technical nor personal). I cannot comment on the typical uses of accessing pixel data from script; if it is [in general] more important to have each of the R,G,B,A components separated for script access, or not. But for cases involving indexed palettes, having the ability to directly treat each pixel as a single property is very much desired. It is not going to provide a huge boost in performance. At worst, it will help make code cleaner. But at best, it will do that and [slightly?] reduce the performance penalty of reading/writing 3 superfluous (in my eyes) array accesses. The only negative aspect I can think of with additional API functions is the introduction of new developer confusion; Which one do I use? I think you'd get more traction if you had performance measurements; minimally, profiles showing that this is hot in your current application. Ideally, you could do a prototype in one of the browsers supporting WebGL which exposes the ImageData's backing store as a WebGLUnsignedIntArray. If this showed a significant speedup it would provide strong motivation. -Ken
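The kind of measurement requested above can be sketched outside the browser. The following is a minimal, hypothetical micro-benchmark, not anything posted to the list: it assumes the typed arrays that were later standardized (Uint8ClampedArray and Uint32Array) as stand-ins for CanvasPixelArray and WebGLUnsignedIntArray, and the buffer size, run count, and function names are illustrative. It times only the array stores, not putImageData(), so it says nothing about where a real application spends its time.

```javascript
// Hypothetical micro-benchmark: sizes, run counts and names are illustrative.
const WIDTH = 256, HEIGHT = 256, PIXELS = WIDTH * HEIGHT;

// Use a high-resolution timer when one is available.
function now() {
  return typeof performance !== "undefined" ? performance.now() : Date.now();
}

// Four 8-bit stores per pixel, as with ImageData.data.
function fillBytes(bytes) {
  for (let i = 0; i < PIXELS; i++) {
    const o = i * 4;
    bytes[o] = 128;
    bytes[o + 1] = 128;
    bytes[o + 2] = 128;
    bytes[o + 3] = 255;
  }
}

// One 32-bit store per pixel, the style proposed in this thread.
function fillWords(words, value) {
  for (let i = 0; i < PIXELS; i++) {
    words[i] = value;
  }
}

// Average wall-clock milliseconds per call over `runs` calls.
function time(fn, runs) {
  const start = now();
  for (let r = 0; r < runs; r++) fn();
  return (now() - start) / runs;
}

// Both views share one buffer, so they write the same memory.
const buffer = new ArrayBuffer(PIXELS * 4);
const bytes = new Uint8ClampedArray(buffer);
const words = new Uint32Array(buffer);

const msBytes = time(() => fillBytes(bytes), 50);
// The constant's byte layout depends on platform endianness; for timing
// purposes only the number of stores matters.
const msWords = time(() => fillWords(words, 0xff808080), 50);

console.log("8-bit stores :", msBytes.toFixed(3), "ms per frame");
console.log("32-bit stores:", msWords.toFixed(3), "ms per frame");
```

Numbers from a loop like this vary widely between engines and runs, so they would complement a profile of the real application rather than replace it.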
Re: [whatwg] Canvas pixel manipulation and performance
On Dec 4, 2009, at 6:10 PM, Kenneth Russell wrote: I think you'd get more traction if you had performance measurements; minimally, profiles showing that this is hot in your current application. Ideally, you could do a prototype in one of the browsers supporting WebGL which exposes the ImageData's backing store as a WebGLUnsignedIntArray. If this showed a significant speedup it would provide strong motivation. My current application isn't necessarily the best test bed. I'll see what I can do to prototype it with some minimal test cases, though. Thanks for the suggestion. Jay
Re: [whatwg] Canvas pixel manipulation and performance
On Mon, Nov 30, 2009 at 4:46 PM, Kenneth Russell k...@google.com wrote: CanvasPixelArray specifies that values greater than 255, including +inf, are clamped to 255 and values less than 0, including -inf, are clamped to zero. WebGLUnsignedByteArray (as people will see in the WebGL draft spec this week or next) specifies that the conversion is done with a C-style cast. The results are different for out-of-range values. I was going to say: It doesn't include +/-inf, because http://whatwg.org/html5#dependencies says if a method with an argument that is a floating point number type (float) is passed an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception must be raised, and that probably applies to the CanvasPixelArray setter method. But it looks like the spec changed since I last looked, and the setter takes an 'octet' argument, so I think the conversion should happen as per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and CanvasPixelArray shouldn't define any conversion. (Filed as http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL and WebGL either match or can be made to match. -- Philip Taylor exc...@gmail.com
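The difference between clamping and a wrap-around conversion for out-of-range values can be illustrated with the typed arrays that were standardized after this thread. This is only a sketch under that assumption: Uint8ClampedArray stands in for the clamping CanvasPixelArray behaviour, and Uint8Array for a modular conversion, which is one plausible reading of "C-style cast". The helper names are hypothetical.

```javascript
// Store a value into a one-element clamping array and read it back.
function storeClamped(value) {
  const a = new Uint8ClampedArray(1);
  a[0] = value;
  return a[0];
}

// Store a value into a one-element modular (wrap-around) array and read it back.
function storeWrapped(value) {
  const a = new Uint8Array(1);
  a[0] = value;
  return a[0];
}

// Out-of-range and non-finite inputs are where the two conversions disagree.
const samples = [-10, 0, 255, 256, 300, Infinity, -Infinity, NaN];
for (const v of samples) {
  console.log(String(v).padStart(9), "clamped:", storeClamped(v), "wrapped:", storeWrapped(v));
}
```

For in-range integers the two agree; they diverge for values such as 300 (255 when clamped, 44 when wrapped) and for the infinities.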
Re: [whatwg] Canvas pixel manipulation and performance
On Mon, 30 Nov 2009 19:31:53 +0100, Philip Taylor excors+wha...@gmail.com wrote: But it looks like the spec changed since I last looked, and the setter takes an 'octet' argument, so I think the conversion should happen as per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and CanvasPixelArray shouldn't define any conversion. (Filed as http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL and WebGL either match or can be made to match. It would be nice if they used the same object/interface too... Maybe implementations of CanvasPixelArray should hide the interface and other details so that we can eventually convert into some kind of octet array if we get native support for that. -- Anne van Kesteren http://annevankesteren.nl/
Re: [whatwg] Canvas pixel manipulation and performance
I have to wonder if it's worth trying to micro-optimize web APIs like this. Your suggestions will squeeze out only a small amount of additional performance - the goals will get a bit higher and we'll be back at square one. I know NativeClient isn't a proposed spec or standardised piece of web infrastructure, but I think what you really need is the ability to scribble on a canvas from native code rather than JavaScript. Work done on that has the advantage of generalizing to all web APIs and use cases rather than just direct graphics access.
Re: [whatwg] Canvas pixel manipulation and performance
On Nov 29, 2009, at 4:19 AM, Mike Hearn wrote: I have to wonder if it's worth trying to micro-optimize web APIs like this. Your suggestions will squeeze out only a small amount of additional performance - the goals will get a bit higher and we'll be back at square one. I've always imagined that was the general flow of performance improvements; tune a little here, change that over there, and you knock some time off your overall benchmark. I hope I haven't mistaken what you are saying. I know NativeClient isn't a proposed spec or standardised piece of web infrastructure, but I think what you really need is the ability to scribble on a canvas from native code rather than JavaScript. Work done on that has the advantage of generalizing to all web APIs and use cases rather than just direct graphics access. That's one way to get a healthy performance boost (typically) but where does the web developer stand in this work? Are you suggesting native code should replace JavaScript? On the other hand, I have briefly considered using native code (via XPCOM) in my XULRunner application to grind out every bit of performance possible. I can understand where you are coming from.
Re: [whatwg] Canvas pixel manipulation and performance
That's one way to get a healthy performance boost (typically) but where does the web developer stand in this work? Are you suggesting native code should replace JavaScript?

For code where performance is critical (like complex animation code) yes. Don't get me wrong, I'm all for better JavaScript performance, but we have to be realistic. Compared to native code, JavaScript's performance will always be lacking - many clever tricks have been deployed to speed up JavaScript but even the fastest JS engines don't come close to the output of average C++ compilers/JVMs. The nature of JS makes it likely that this situation will remain true for a long time, perhaps forever.

So there are two possibilities here - one is to introduce ever more complexity into the web APIs for diminishing returns, even though a primary goal of the web APIs is simplicity. And the other is to just bind native code to those APIs, hopefully eliminating much of the marshalling overhead along the way. The latter approach has the advantage of not requiring novice-level developers to understand things like endianness or bit masking to draw some pixels (replace as appropriate for any given API), whilst allowing developers that need it to get the fastest execution securely possible.
Re: [whatwg] Canvas pixel manipulation and performance
On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:

On 11/29/09 12:15 AM, Kenneth Russell wrote:

I assume you meant JS bitwise operators? Do we have any indication that this would be faster than four array property sets? The bitwise ops in JS are not necessarily particularly fast.

Yes, that's what I meant. I don't have any data on whether this would currently be faster than the four separate byte stores.

Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual "if it's an integer, store as an integer" optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in WebKit, and in particular whether it's even the same when using V8 vs Nitro.

In Chromium (WebKit + V8), CanvasPixelArray property stores write individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray behave similarly but have simpler clamping semantics.

-Ken
Re: [whatwg] Canvas pixel manipulation and performance
On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote: On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote: Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual if it's an integer store as an integer optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in Webkit, and in particular whether it's even the same when using V8 vs Nitro. In Chromium (WebKit + V8), CanvasPixelArray property stores write individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray behave similarly but have simpler clamping semantics. Would it be helpful (for simplicity or performance or consistency etc) to change the specification of CanvasPixelArray to have those simpler clamping semantics? (I don't expect there would be compatibility problems with changing it now, particularly since Firefox doesn't implement clamping at all in CPA.) -- Philip Taylor exc...@gmail.com
Re: [whatwg] Canvas pixel manipulation and performance
On Sun, Nov 29, 2009 at 11:05 AM, Philip Taylor excors+wha...@gmail.com wrote: On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote: On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote: Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual if it's an integer store as an integer optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in Webkit, and in particular whether it's even the same when using V8 vs Nitro. In Chromium (WebKit + V8), CanvasPixelArray property stores write individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray behave similarly but have simpler clamping semantics. Would it be helpful (for simplicity or performance or consistency etc) to change the specification of CanvasPixelArray to have those simpler clamping semantics? (I don't expect there would be compatibility problems with changing it now, particularly since Firefox doesn't implement clamping at all in CPA.) It would. Vladimir Vukicevic from Mozilla was planning to raise this issue with the whatwg upon release of the first public draft of the WebGL spec. -Ken
Re: [whatwg] Canvas pixel manipulation and performance
On 11/29/09 1:20 PM, Jason Oster wrote:

Changeset 2b56c4771d5c reduced the number of pixel array elements accessed by caching the 256px x 256px rooms within the stage map, and passing the cached rooms to putImageData().

As opposed to doing what before the change?

The previous code used a non-cached approach, where every pixel in the canvas was explicitly drawn into the ImageData array. Keep in mind, the largest of these was 4864px × 3072px. If anything, the change took time away from JavaScript and placed it in native code: putImageData().

I'm not sure I follow. Looking at the diff, it looks like you used to do a single putImageData call, passing it this.fgmap.render(), right? Now you do a bunch of putImageData calls, passing this.fgmap[rooms[i++]].img, where right before that you called this.fgmap[i].render() for a bunch of i. I really don't see how this would have made things faster, unless render() is just not being called on all rooms now.

-Boris
Re: [whatwg] Canvas pixel manipulation and performance
The patch changed something like this:

    for (y in canvasHeight) {
        for (x in canvasWidth) {
            putPixel();
        }
    }

To something like this:

    for (y in roomHeight) {
        for (x in roomWidth) {
            putPixel();
        }
    }
    for (rooms_y in canvasHeight) {
        for (rooms_x in canvasWidth) {
            putRoom();
        }
    }

This pseudo-code is a bit unintelligible, please bear with me. Basically, there is a fixed number of rooms (256px x 256px images), only about 15 in the smaller stages. The full map might have a total area of 10 x 3 rooms; half of them are duplicated. The patch caches the rooms drawn, and then blits from the cache into the canvas.

this.fgmap used to be one giant canvas. Now it is an array of smaller canvases (the size of a single room). The .render() method draws pixels to the associated ImageData in each case (using the four R,G,B,A elements per pixel, as we are discussing). In this case, cutting down on the number of pixels that this.fgmap.render() needs to poke into ImageData made the overall drawing approximately 2x faster.

It might be important to note that this.fgmap.render() method also does some tile decoding (to convert the SNES tile format into a usable bitmap), and caches the results.

Does that make more sense? I know it is difficult to follow unfamiliar code, but I would like to clear up any questions you might have.

Jay

On Nov 29, 2009, at 1:03 PM, Boris Zbarsky wrote:

On 11/29/09 1:20 PM, Jason Oster wrote:

Changeset 2b56c4771d5c reduced the number of pixel array elements accessed by caching the 256px x 256px rooms within the stage map, and passing the cached rooms to putImageData().

As opposed to doing what before the change?

The previous code used a non-cached approach, where every pixel in the canvas was explicitly drawn into the ImageData array. Keep in mind, the largest of these was 4864px × 3072px. If anything, the change took time away from JavaScript and placed it in native code: putImageData().

I'm not sure I follow. Looking at the diff, it looks like you used to do a single putImageData call, passing it this.fgmap.render(), right? Now you do a bunch of putImageData calls, passing this.fgmap[rooms[i++]].img, where right before that you called this.fgmap[i].render() for a bunch of i. I really don't see how this would have made things faster, unless render() is just not being called on all rooms now.

-Boris
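The caching idea described in this message can be sketched without a canvas. The code below is a hypothetical illustration, not the editor's actual code: rooms are 4x4 instead of 256x256, a room's "pixels" are just its id, and a typed-array copy stands in for the native putImageData() blit. The names renderRoom, blitRoom, and drawStage are made up for this example.

```javascript
const ROOM = 4;          // stand-in for the 256px room size
let pixelWrites = 0;     // counts per-pixel stores done in script

// Expensive path: every pixel of the room is poked individually.
function renderRoom(id) {
  const pixels = new Uint32Array(ROOM * ROOM);
  for (let i = 0; i < pixels.length; i++) {
    pixels[i] = id;
    pixelWrites++;
  }
  return pixels;
}

// Cheap path: copy a finished room into the stage, one row at a time.
// In the real application this role is played by putImageData().
function blitRoom(dst, dstWidth, room, roomX, roomY) {
  for (let y = 0; y < ROOM; y++) {
    const start = (roomY * ROOM + y) * dstWidth + roomX * ROOM;
    dst.set(room.subarray(y * ROOM, (y + 1) * ROOM), start);
  }
}

// layout lists room ids in row-major order; duplicates are rendered once.
function drawStage(layout, cols) {
  const rows = layout.length / cols;
  const dstWidth = cols * ROOM;
  const dst = new Uint32Array(dstWidth * rows * ROOM);
  const cache = new Map();
  layout.forEach((id, n) => {
    if (!cache.has(id)) cache.set(id, renderRoom(id));
    blitRoom(dst, dstWidth, cache.get(id), n % cols, Math.floor(n / cols));
  });
  return dst;
}

// Six room slots, but only three distinct rooms.
const stage = drawStage([1, 2, 1, 2, 3, 1], 3);
console.log("pixel writes in script:", pixelWrites, "of", stage.length, "stage pixels");
```

With half the slots duplicated, the script-side pixel stores drop to half the stage area, which is consistent with the roughly 2x improvement reported above, assuming the blit itself is comparatively cheap.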
Re: [whatwg] Canvas pixel manipulation and performance
On 11/29/09 3:33 PM, Jason Oster wrote: It might be important to note that this.fgmap.render() method also does some tile decoding (to convert the SNES tile format into a usable bitmap), and caches the results. Does that make more sense? I know it is difficult to follow unfamiliar code, but I would like to clear up any questions you might have. So the new code has to do about half as much tile decoding, as well as half the number of imagedata[n] sets? Or was the decoding already being cached? -Boris
Re: [whatwg] Canvas pixel manipulation and performance
On Nov 29, 2009, at 1:57 PM, Boris Zbarsky wrote: So the new code has to do about half as much tile decoding, as well as half the number of imagedata[n] sets? Or was the decoding already being cached? Decoded tiles were already cached. It actually builds MORE tile caches now, though: one cache for each this.fgmap[]. So that's more (a lot more) tile decoding, but far fewer pixel pokes.
Re: [whatwg] Canvas pixel manipulation and performance
On Nov 29, 2009, at 10:59 AM, Kenneth Russell wrote:

On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:

On 11/29/09 12:15 AM, Kenneth Russell wrote:

I assume you meant JS bitwise operators? Do we have any indication that this would be faster than four array property sets? The bitwise ops in JS are not necessarily particularly fast.

Yes, that's what I meant. I don't have any data on whether this would currently be faster than the four separate byte stores.

Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual "if it's an integer, store as an integer" optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in WebKit, and in particular whether it's even the same when using V8 vs Nitro.

In Chromium (WebKit + V8), CanvasPixelArray property stores write individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray behave similarly but have simpler clamping semantics.

I don't know where you're getting that idea from -- the clamping semantics for CanvasPixelArray and WebGLUnsignedByteArray are identical.

The CanvasPixelArray implementation in WebKit has always matched the spec and been a clamping bytearray, i.e. one byte per channel, per pixel.

Just for future reference for all who are interested: in WebKit the JS interface to a DOM object is merely a binding to a C++ implementation, i.e. there's no reason to be concerned about different DOM object behaviour dependent on the JS engine - if there were a difference, it would imply a bug rather than a design choice.

--Oliver
-Ken
Re: [whatwg] Canvas pixel manipulation and performance
On 11/29/09 11:22 PM, Oliver Hunt wrote: The CanvasPixelArray implementation in WebKit has always matched the spec and been a clamping bytearray, eg. one byte per channel, per pixel. I assume you mean the spec as it is now and not the spec as it was when Gecko implemented get/putImageData? The latter was quite different, as I recall. Just for future reference for all who are interested: in WebKit the JS interface to a DOM object is merely a binding to a C++ implementation Sure; the point is that in Gecko the thing returned by getImageData().data is not a DOM object at all but a JS Array. Similarly, the object returned by getImageData() is not a DOM object, but a JS Object. Likewise, putImageData accepts any JS Object with the right properties on it (width, height, and data). I _think_ as of when the implementation was created some of this (e.g. the putImageData behavior) was called for by the then-whatwg spec. See http://www.whatwg.org/specs/web-apps/2007-10-26/ for example; I believe there were other drafts between the 2005-01 draft and that one that had still other behavior. As you note the spec has since changed; the Gecko implementation hasn't been changed yet pending us having some indication that the spec is actually stable now. Once burnt, twice shy and all. In any case, I simply didn't know whether CanvasPixelArray was implemented as a host object or a native object in webkit, and didn't want to make any claims about it as a result. -Boris
Re: [whatwg] Canvas pixel manipulation and performance
On 11/29/09 11:22 PM, Oliver Hunt wrote: I don't know where you're getting that idea from -- the clamping semantics for CanvasPixelArray and WebGLUnsignedByteArray are identical. Perhaps Kenneth included the rounding behavior (which seems to be different to me from a brief look at JavaScriptCore/wtf/ByteArray.h and WebCore/html/canvas/WebGLUnsignedByteArray.h) in clamping semantics? CanvasPixelArray, which uses ByteArray, rounds to nearest integer (ties rounded up), while WebGLUnsignedByteArray truncates. Note that neither implements what the current spec draft calls for (which is round-to-nearest, ties-to-even behavior). No opinion on whether this _should_ be what the spec calls for. -Boris
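The three rounding behaviours mentioned in this message can be compared side by side. This is a sketch for in-range, non-negative values only; the function names are hypothetical and the code is not taken from ByteArray.h or WebGLUnsignedByteArray.h.

```javascript
// Round to nearest, ties rounded up (as described for ByteArray).
function roundHalfUp(v) {
  return Math.floor(v + 0.5);
}

// Drop the fractional part (as described for WebGLUnsignedByteArray).
function truncate(v) {
  return Math.trunc(v);
}

// Round to nearest, ties to the even neighbour (as the spec draft called for).
function roundHalfEven(v) {
  const floor = Math.floor(v);
  const diff = v - floor;
  if (diff < 0.5) return floor;
  if (diff > 0.5) return floor + 1;
  return floor % 2 === 0 ? floor : floor + 1;
}

// Only exact .5 ties and the truncating case tell the three apart.
for (const v of [1.2, 1.5, 2.5, 3.5, 3.9]) {
  console.log(v, "half-up:", roundHalfUp(v), "truncate:", truncate(v), "half-even:", roundHalfEven(v));
}
```

The values 1.5 and 3.5 round the same way under both rounding rules, so a test that distinguishes the two needs a tie next to an even integer, such as 2.5.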
Re: [whatwg] Canvas pixel manipulation and performance
My apologies for the direct reply, Oliver. This was meant to go back to the list:

On Nov 26, 2009, at 3:35 PM, Oliver Hunt wrote:

WebGL has completely different constraints to that of the 2d canvas -- when the developer provides resources to GL the developer has to provide a myriad of type details, this means that the developer needs to be able to request storage of a specific type. The WebGL array types are specifically targeting this use case -- they don't make sense for canvas2d where the only storage is not a developer specified format.

That is understandable.

History has shown that any time a developer won't handle both byte orders -- developers tend to work on the assumption that if something works for them it must be correct, this is why we end up with sites that claim "This site needs IE/Safari/Firefox to run" type messages. Even conscientious developers who test multiple browsers, and validate their content, etc. will be able to produce accidentally broken sites because this would add a hardware dependency on spec behaviour.

We certainly don't want any more of that.

Realistically simply making a separate object whose indexes are 32-bit RGBA pixels would resolve the problem you're trying to describe -- the implementation would need to do byte order correction, but given that 2/3 canvas implementations already do unpre-premultiplied data conversion on putImageData this is unlikely to add any cost at all (in fact in the WebKit implementation I don't believe there would be any difference in the logic in get/putImageData).

Once again, I agree. My confusion on the type-specific arrays for WebGL is that they were specific and general enough to use in other cases. If they should not be used in 2D canvas implementations (or elsewhere) then a 2D-canvas-specific array or object would be the way forward.
Take for instance, the following pseudo code:

    var canvas = document.getElementById("canvas");
    var ctx = canvas.getContext("2d");
    var pixels = ctx.createUnsignedByteArray(8, 8);
    // Fill with medium gray
    for (var i = 0; i < 8 * 8; i++) {
        pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
    }
    ctx.putUnsignedByteArray(pixels, 0, 0);

Adding a function call would make your code much slower.

Yes, it would. Once again, this was only for illustrative purposes. More commonly, a Look-Up Table would be created, containing all of the colors used in the scene before any pixels are touched. For any kind of low-resolution pixel art (as found in classic gaming consoles), the palette is typically indexed and consists of 256 colors or fewer. In extreme cases, an LUT with thousands of colors would be far faster than using such a function call. I neglected to mention any optimal way of using a mapRGBA function; that's not what I was trying to illustrate.

I understand this is a bad way to fill a portion of a canvas with a solid color; this is for illustration purposes only. The overall idea is that setting fewer array elements per pixel will perform better.

Have you actually measured this? How long is spent in each part? I suspect if you're not using the dirty region arguments you're pushing back (and doing premult conversion) on a lot more pixels than necessary. Yes setting 4 properties is slower than setting 1, but where is your time actually being spent?

I have not directly done any measurements, sorry. What I do have is a Mercurial repository for a level editor project (which draws independent pixels directly to very large canvas elements) showing the progression of optimizations I've introduced. Many of the modifications intended to make the drawing faster have done so by avoiding pixel access wherever possible. Certainly it is not the most efficient code, but I've optimized enough to make the time spent setting pixel arrays worth investigating.
I still do not have any actual numbers to throw around, however. We've already seen the emergence of emulators written in JavaScript/Canvas. In fact, there are loads of them[4], and they would all benefit from having a better way to interact directly with canvas pixels. Of course, the use cases are not limited to emulation; my NES/SNES level editor projects would enjoy faster pixel manipulation as well. These kinds of projects can use arbitrarily sized canvases (up to 4864px × 3072px in one case[5]) and can take a good deal of time to fully render, even with several off-ImageData optimization tricks. Without seeing the code for your demo i'd have no idea whether what you're doing is actually efficient -- have you profiled? Both Safari and Firefox have built in profilers. The trouble with profiling my project is that it is a XULRunner application, and does not run directly in web browsers as-is. The code can largely be hacked to work as a web application. If you are interested in a demo of sorts, the code is all available
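The look-up table approach described in this message can be sketched as follows. This is a hypothetical illustration: it assumes the typed arrays standardized after this thread (Uint32Array) in place of the proposed 32-bit pixel array, a made-up four-colour palette, and helper names invented for the example. Each palette entry is packed once, up front, so the inner loop is a single indexed store per pixel with no function call.

```javascript
// Hypothetical 4-colour indexed palette, one [R, G, B, A] entry per index.
const paletteRGBA = [
  [0, 0, 0, 255],
  [128, 128, 128, 255],
  [255, 255, 255, 255],
  [255, 0, 0, 255],
];

// Detect platform byte order by looking at the first byte of the word 1.
function isLittleEndian() {
  return new Uint8Array(new Uint32Array([1]).buffer)[0] === 1;
}

// Pack every palette entry once so that, when stored as a 32-bit word,
// the bytes land in memory in R, G, B, A order on this platform.
function buildLut(palette, littleEndian) {
  const lut = new Uint32Array(palette.length);
  palette.forEach(([r, g, b, a], i) => {
    lut[i] = littleEndian
      ? ((a << 24) | (b << 16) | (g << 8) | r) >>> 0
      : ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
  });
  return lut;
}

// One store per pixel: palette index in, packed colour out.
function drawIndexed(indices, lut, out32) {
  for (let i = 0; i < indices.length; i++) {
    out32[i] = lut[indices[i]];
  }
}

const lut = buildLut(paletteRGBA, isLittleEndian());
const out32 = new Uint32Array(4);
drawIndexed([1, 3, 0, 2], lut, out32);
console.log(Array.from(new Uint8Array(out32.buffer)));
```

The byte-order check runs once when the table is built, which keeps the hardware dependency out of the per-pixel loop.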
Re: [whatwg] Canvas pixel manipulation and performance
On Sat, Nov 28, 2009 at 12:44 PM, Jason Oster paras...@kodewerx.org wrote: Once again, I agree. My confusion on the type-specific arrays for WebGL is that they were specific and general enough to use in other cases. If they should not be used in 2D canvas implementations (or elsewhere) then a 2D-canvas-specific array or object would be the way forward. I and other members of the WebGL working group are hoping that the new array-like types being introduced with this specification will be general enough to repurpose in other areas. The first public draft of the spec will be released in the next week or two, and we're hoping that will enable discussion with the broader web community. From a technical standpoint, it would be feasible to use the WebGLUnsignedIntArray to access the Canvas's pixel data, and assemble RGBA pixels into integer values using just JavaScript logical operators. To keep things sane, the specification would need to state something along the lines that the high (logical, not addressing) 8 bits are the red bits and the low 8 bits are the alpha bits. This means that implementations on big-endian and little-endian machines would need to store the data differently internally so that the behavior at the JavaScript level is identical; the WebGL array types currently deliberately do no byte swapping. -Ken
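The convention suggested above (red in the logical high 8 bits, alpha in the low 8 bits, identical behaviour on either byte order) can be sketched with bitwise operators and a DataView. This is an assumption-laden illustration, not part of any spec draft: the function names are hypothetical, and DataView is used here only because it lets the byte order of a store be stated explicitly.

```javascript
// Pack four 8-bit channels into one unsigned 32-bit value, red highest.
// `>>> 0` converts the signed result of the bitwise ops to unsigned.
function packRGBA(r, g, b, a) {
  return (((r & 0xff) << 24) | ((g & 0xff) << 16) | ((b & 0xff) << 8) | (a & 0xff)) >>> 0;
}

// Split a packed pixel back into [R, G, B, A].
function unpackRGBA(pixel) {
  return [(pixel >>> 24) & 0xff, (pixel >>> 16) & 0xff, (pixel >>> 8) & 0xff, pixel & 0xff];
}

// Write a packed pixel big-endian, so memory holds R, G, B, A in that
// order regardless of the platform's native byte order.
function storePixel(view, index, pixel) {
  view.setUint32(index * 4, pixel, false);
}

const pixelBuffer = new ArrayBuffer(8);
const view = new DataView(pixelBuffer);
storePixel(view, 0, packRGBA(128, 128, 128, 255)); // medium gray
storePixel(view, 1, packRGBA(255, 0, 0, 255));     // opaque red
console.log(Array.from(new Uint8Array(pixelBuffer)));
```

A plain 32-bit array view would skip the per-store byte-order handling but expose native endianness to script, which is the trade-off discussed in this message.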
Re: [whatwg] Canvas pixel manipulation and performance
On 11/28/09 11:42 PM, Kenneth Russell wrote:

From a technical standpoint, it would be feasible to use the WebGLUnsignedIntArray to access the Canvas's pixel data, and assemble RGBA pixels into integer values using just JavaScript logical operators.

I assume you meant JS bitwise operators? Do we have any indication that this would be faster than four array property sets? The bitwise ops in JS are not necessarily particularly fast.

-Boris
Re: [whatwg] Canvas pixel manipulation and performance
On 11/28/09 3:44 PM, Jason Oster wrote: The trouble with profiling my project is that it is a XULRunner application, and does not run directly in web browsers as-is. This is not an issue at all; any XULRunner application can be run in Firefox directly (with the right command-line flags). I'm happy to profile whatever set of operations you want for you in Firefox if you send me step-by-step instructions for performing that set of operations in your application, ideally with code pointers to the start and stop of the code to be profiled in your codebase. Probably offlist for all that... ;) Of course if you have a Mac you can do all this yourself; Shark is an excellent tool. Changeset 2b56c4771d5c reduced the number of pixel array elements accessed by caching the 256px x 256px rooms within the stage map, and passing the cached rooms to putImageData(). As opposed to doing what before the change? My point, as you concur, is that setting four array elements (properties) is slower than setting just one. While true, it may not necessarily be slower than setting one if the value to set is more expensive to compute as a result, and may not be the bottleneck to start with. The latter is hard to determine without profiling. My gut feel is that at least in Gecko this is not likely to be a performance bottleneck right now, nor much of a win with the proposed change, if any. But again, that would be easier to judge (at least for the first part) with a profile. -Boris
Re: [whatwg] Canvas pixel manipulation and performance
On Sat, Nov 28, 2009 at 9:00 PM, Boris Zbarsky bzbar...@mit.edu wrote:

On 11/28/09 11:42 PM, Kenneth Russell wrote:

From a technical standpoint, it would be feasible to use the WebGLUnsignedIntArray to access the Canvas's pixel data, and assemble RGBA pixels into integer values using just JavaScript logical operators.

I assume you meant JS bitwise operators? Do we have any indication that this would be faster than four array property sets? The bitwise ops in JS are not necessarily particularly fast.

Yes, that's what I meant. I don't have any data on whether this would currently be faster than the four separate byte stores.

-Ken
Re: [whatwg] Canvas pixel manipulation and performance
On 11/29/09 12:15 AM, Kenneth Russell wrote:

I assume you meant JS bitwise operators? Do we have any indication that this would be faster than four array property sets? The bitwise ops in JS are not necessarily particularly fast.

Yes, that's what I meant. I don't have any data on whether this would currently be faster than the four separate byte stores.

Are they even byte stores, necessarily? I know in Gecko imagedata is just a JS array at the moment; it stores each of R,G,B,A as a JS Number (with the usual "if it's an integer, store as an integer" optimization arrays do). That might well change in the future, and I hope it does, but that's the current code. I can't speak to what the behavior is in WebKit, and in particular whether it's even the same when using V8 vs Nitro.

-Boris
[whatwg] Canvas pixel manipulation and performance
Hello Group,

I've been using canvas to draw pixel art (NES/SNES game screens and sprites) similar to what an emulator would do. Doing this kind of drawing requires direct access to the pixel buffer. My problem with the canvas spec (as it is now) is that it tends to artificially bound pixel drawing performance to JavaScript when doing any sort of pixel access. Setting four unsigned 8-bit array elements (R, G, B, and A) is a slower operation than setting just one unsigned 32-bit array element (RGBA or ABGR). Sadly, we don't have this latter option for canvas.

My comment is a request for a new set of pixel access methods on the CanvasRenderingContext2D object. Specifically, alternatives to createImageData(), getImageData(), and putImageData() methods for providing an array of unsigned 32-bit elements for pixel manipulation.

One proposal is the reuse of the CanvasArrayBuffer introduced by WebGL[1]. The reference explains the use of CanvasArrayBuffer in the context of RGBA color space: "... RGBA colors, with each component represented as an unsigned byte." This appears to be a useful solution, with an existing implementation to build from (at least in Mozilla). The single concern here is that it neglects any mention of support for hardware utilizing native-ABGR (e.g. little endian) byte order, or more obscure formats. I assume the idea was to handle any necessary conversions in the back-end, including 32-bit color depth -> 16-bit color depth, for example.

A second option is allowing the web developer to handle byte order issues, similar in concept to SDL[2]. In addition to general endian handling, SDL also supports mapping color components to an unsigned 32-bit integer[3]. It seems to me this is the best way to cover hardware byte order/color depth independence while achieving the best user land performance possible.
Take, for instance, the following pseudocode:

    var canvas = document.getElementById("canvas");
    var ctx = canvas.getContext("2d");
    var pixels = ctx.createUnsignedByteArray(8, 8);
    // Fill with medium gray
    for (var i = 0; i < 8 * 8; i++) {
        pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
    }
    ctx.putUnsignedByteArray(pixels, 0, 0);

That appears more sane than the current method:

    var canvas = document.getElementById("canvas");
    var ctx = canvas.getContext("2d");
    var pixels = ctx.createImageData(8, 8);
    // Fill with medium gray
    for (var i = 0; i < 8 * 8; i++) {
        pixels.data[i * 4 + 0] = 128;
        pixels.data[i * 4 + 1] = 128;
        pixels.data[i * 4 + 2] = 128;
        pixels.data[i * 4 + 3] = 255;
    }
    ctx.putImageData(pixels, 0, 0);

I understand this is a bad way to fill a portion of a canvas with a solid color; this is for illustration purposes only. The overall idea is that setting fewer array elements per pixel will perform better.

We've already seen the emergence of emulators written in JavaScript/Canvas. In fact, there are loads of them[4], and they would all benefit from having a better way to interact directly with canvas pixels. Of course, the use cases are not limited to emulation; my NES/SNES level editor projects would enjoy faster pixel manipulation as well. These kinds of projects can use arbitrarily sized canvases (up to 4864px × 3072px in one case[5]) and can take a good deal of time to fully render, even with several off-ImageData optimization tricks.

Looking to discuss more options!

Jason Oster

[1] http://blog.vlad1.com/2009/11/06/canvasarraybuffer-and-canvasarray/
[2] http://www.libsdl.org/intro.en/usingendian.html
[3] http://www.libsdl.org/cgi/docwiki.cgi/SDL_MapRGBA
[4] http://www.google.com/search?q=javascript+emulator
[5] http://parasyte.kodewerx.org/projects/syndrome/stages/2009-07-05/12_wily3.png
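As a rough illustration of the one-store-per-pixel idea in the message above, here is a minimal sketch that assumes a runtime with typed arrays (Uint8ClampedArray and Uint32Array). The helper name mapRGBA is borrowed from the pseudocode and is hypothetical, as is the LITTLE_ENDIAN constant; the byte buffer is only a stand-in for the data of an 8x8 ImageData, so nothing here is an actual canvas method.

```javascript
// Detect the platform byte order once: write a known 32-bit value and
// look at which byte ends up first in memory.
const LITTLE_ENDIAN =
  new Uint8Array(new Uint32Array([0x11223344]).buffer)[0] === 0x44;

// Hypothetical helper mirroring the proposed ctx.mapRGBA(); it is NOT part of
// any canvas API. It packs four 8-bit components into one unsigned 32-bit
// value whose in-memory byte layout is R, G, B, A.
function mapRGBA(r, g, b, a) {
  // ">>> 0" converts the signed 32-bit result of the bitwise ops to unsigned.
  return LITTLE_ENDIAN
    ? ((a << 24) | (b << 16) | (g << 8) | r) >>> 0
    : ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
}

// Stand-in for the data of an 8x8 ImageData: 4 bytes per pixel, RGBA order.
const width = 8;
const height = 8;
const bytes = new Uint8ClampedArray(width * height * 4);

// A 32-bit view over the same memory: one element per pixel.
const pixels = new Uint32Array(bytes.buffer);

// Fill with medium gray using a single store per pixel.
const gray = mapRGBA(128, 128, 128, 255);
for (let i = 0; i < pixels.length; i++) {
  pixels[i] = gray;
}
```

In a browser, the same 32-bit view could presumably be created over the buffer of a real ImageData's data array before calling putImageData(). Whether this is actually faster than four byte stores is exactly the open question in this thread and would need measuring.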
Re: [whatwg] Canvas pixel manipulation and performance
On Nov 26, 2009, at 11:45 AM, Jason Oster wrote:

> Hello Group,
>
> I've been using canvas to draw pixel art (NES/SNES game screens and sprites) similar to what an emulator would do. Doing this kind of drawing requires direct access to the pixel buffer. My problem with the canvas spec (as it is now) is that it tends to artificially bound pixel drawing performance to JavaScript when doing any sort of pixel access. Setting four unsigned 8-bit array elements (R, G, B, and A) is a slower operation than setting just one unsigned 32-bit array element (RGBA or ABGR). Sadly, we don't have this latter option for canvas.
>
> My comment is a request for a new set of pixel access methods on the CanvasRenderingContext2D object. Specifically, alternatives to the createImageData(), getImageData(), and putImageData() methods that provide an array of unsigned 32-bit elements for pixel manipulation.
>
> One proposal is the reuse of the CanvasArrayBuffer introduced by WebGL[1]. The reference explains the use of CanvasArrayBuffer in the context of RGBA color space: "... RGBA colors, with each component represented as an unsigned byte." This appears to be a useful solution, with an existing implementation to build from (at least in Mozilla). The single concern here is that it neglects any mention of support for hardware utilizing native-ABGR (e.g. little endian) byte order, or more obscure formats. I assume the idea was to handle any necessary conversions in the back-end, including 32-bit color depth to 16-bit color depth, for example.

WebGL has completely different constraints from those of the 2d canvas -- when the developer provides resources to GL, the developer has to provide a myriad of type details, which means that the developer needs to be able to request storage of a specific type. The WebGL array types specifically target this use case -- they don't make sense for canvas2d, where the only storage is not in a developer-specified format.
> A second option is allowing the web developer to handle byte order issues, similar in concept to SDL[2]. In addition to general endian handling, SDL also supports mapping color components to an unsigned 32-bit integer[3]. It seems to me this is the best way to cover hardware byte order/color depth independence while achieving the best userland performance possible.

History has shown that developers won't reliably handle both byte orders -- developers tend to work on the assumption that if something works for them it must be correct, which is why we end up with sites that show "This site needs IE/Safari/Firefox to run" type messages. Even conscientious developers who test multiple browsers, validate their content, etc. will be able to produce accidentally broken sites, because this would add a hardware dependency on spec behaviour.

Realistically, simply making a separate object that indexes 32-bit RGBA pixels would resolve the problem you're trying to describe -- the implementation would need to get the byte order correct, but given that 2/3 canvas implementations already do unpremultiplied-to-premultiplied data conversion on putImageData, this is unlikely to add any cost at all (in fact, in the WebKit implementation I don't believe there would be any difference in the logic in get/putImageData).

> Take, for instance, the following pseudocode:
>
>     var canvas = document.getElementById("canvas");
>     var ctx = canvas.getContext("2d");
>     var pixels = ctx.createUnsignedByteArray(8, 8);
>     // Fill with medium gray
>     for (var i = 0; i < 8 * 8; i++) {
>         pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
>     }
>     ctx.putUnsignedByteArray(pixels, 0, 0);

Adding a function call would make your code much slower.
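The "separate object that indexes 32-bit RGBA pixels" suggested in the reply above might look roughly like the following sketch, in which the object rather than the hardware fixes the byte order. The class name PackedPixels and its get/set methods are hypothetical and not taken from any spec; the sketch assumes typed arrays and DataView are available, and the small byte buffer is a stand-in for the data of a 2x2 ImageData.

```javascript
// Hypothetical wrapper: the name and methods are illustrative only.
// Every pixel is exposed as one unsigned 32-bit value laid out as 0xRRGGBBAA,
// regardless of whether the underlying hardware is little or big endian.
class PackedPixels {
  constructor(bytes) {
    // bytes is expected to hold 4 bytes per pixel in R, G, B, A order.
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    this.length = bytes.byteLength >>> 2; // number of whole pixels
  }

  // Read pixel i as 0xRRGGBBAA. Passing false selects big-endian access,
  // so the most significant byte of the value is the first byte in memory (R).
  get(i) {
    return this.view.getUint32(i * 4, false);
  }

  // Write pixel i from a 0xRRGGBBAA value, again with an explicit byte order.
  set(i, rgba) {
    this.view.setUint32(i * 4, rgba, false);
  }
}

// Stand-in for the data of a 2x2 ImageData.
const data = new Uint8ClampedArray(2 * 2 * 4);
const px = new PackedPixels(data);

// Fill with medium gray, fully opaque.
for (let i = 0; i < px.length; i++) {
  px.set(i, 0x808080ff);
}
```

Because the byte order is stated explicitly on every access, script sees the same values on any hardware, which addresses the portability concern. The trade-off is a method call per pixel in script, which is the overhead the reply warns about; a native implementation of such an object would not necessarily pay that cost.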
> That appears more sane than the current method:
>
>     var canvas = document.getElementById("canvas");
>     var ctx = canvas.getContext("2d");
>     var pixels = ctx.createImageData(8, 8);
>     // Fill with medium gray
>     for (var i = 0; i < 8 * 8; i++) {
>         pixels.data[i * 4 + 0] = 128;
>         pixels.data[i * 4 + 1] = 128;
>         pixels.data[i * 4 + 2] = 128;
>         pixels.data[i * 4 + 3] = 255;
>     }
>     ctx.putImageData(pixels, 0, 0);
>
> I understand this is a bad way to fill a portion of a canvas with a solid color; this is for illustration purposes only. The overall idea is that setting fewer array elements per pixel will perform better.

Have you actually measured this? How long is spent in each part? I suspect that if you're not using the dirty region arguments, you're pushing back (and doing premult conversion on) a lot more pixels than necessary. Yes, setting 4 properties is slower than setting 1, but where is your time actually being spent?

> We've already seen the emergence of emulators written in JavaScript/Canvas. In fact, there are loads of them[4], and they would all benefit from having a better way to interact directly with canvas pixels. Of course, the use cases are not limited to emulation; my NES/SNES level editor projects would enjoy