Re: [whatwg] Canvas pixel manipulation and performance

2010-03-11 Thread Ian Hickson
On Thu, 26 Nov 2009, Jason Oster wrote:
 
 I've been using canvas to draw pixel art (NES/SNES game screens and 
 sprites) similar to what an emulator would do.  Doing this kind of 
 drawing requires direct access to the pixel buffer.
 
 My problem with the canvas spec (as it is now) is that it tends to 
 artificially bound pixel drawing performance to JavaScript when doing 
 any sort of pixel access.  Setting four unsigned 8-bit array elements 
 (R, G, B, and A) is a slower operation than setting just one unsigned 
 32-bit array element (RGBA or ABGR).  Sadly, we don't have this latter 
 option for canvas.

 My comment is a request for a new set of pixel access methods on the 
 CanvasRenderingContext2D object.  Specifically, alternatives to 
 createImageData(), getImageData(), and putImageData() methods for 
 providing an array of unsigned 32-bit elements for pixel manipulation.

This comes up every now and then, but in the big picture, this problem 
isn't a huge issue. I think it's better that we wait for the rest of the 
spec to be better implemented before we start adding more features to 
canvas. Even in canvas, there are a number of features that would 
probably be more important than this, such as reusable path objects, text 
on a path, or dotted or dashed line styles.

If we had data showing that it was a problem, that might change matters.

-- 
Ian Hickson   U+1047E)\._.,--,'``.fL
http://ln.hixie.ch/   U+263A/,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'


Re: [whatwg] Canvas pixel manipulation and performance

2009-12-04 Thread Jason Oster
I guess this suggestion to access the full pixel data in a single array 
element has fallen by the wayside.  Are there any direct objections to 
including additional API to allow this kind of behavior?  It seems most 
developers believe it would be unnecessary, but I haven't heard much in 
the way of reasoning (technical or personal).


I cannot comment on the typical uses of accessing pixel data from 
script; if it is [in general] more important to have each of the R,G,B,A 
components separated for script access, or not.  But for cases involving 
indexed palettes, having the ability to directly treat each pixel as a 
single property is very much desired.


It is not going to provide a huge boost in performance.  At worst, it 
will help make code cleaner.  But at best, it will do that and 
[slightly?] reduce the performance penalty of 3 superfluous (in my 
eyes) array accesses.  The only negative aspect I can think of with 
additional API functions is the introduction of new developer 
confusion: "Which one do I use?"


Thanks for listening,
Jay


Re: [whatwg] Canvas pixel manipulation and performance

2009-12-04 Thread Kenneth Russell
On Fri, Dec 4, 2009 at 9:30 AM, Jason Oster paras...@kodewerx.org wrote:
 I guess this suggestion to access the full pixel data in a single array
 element has fallen by the wayside.  Are there any direct objections to
 including additional API to allow this kind of behavior?  It seems most
 developers believe it would be unnecessary, but I haven't heard much in the
 way of reasoning (technical or personal).

 I cannot comment on the typical uses of accessing pixel data from script;
 if it is [in general] more important to have each of the R,G,B,A components
 separated for script access, or not.  But for cases involving indexed
 palettes, having the ability to directly treat each pixel as a single
 property is very much desired.

 It is not going to provide a huge boost in performance.  At worst, it will
 help make code cleaner.  But at best, it will do that and [slightly?] reduce
 the performance penalty of 3 superfluous (in my eyes) array accesses.  The
 only negative aspect I can think of with additional API functions is the
 introduction of new developer confusion: "Which one do I use?"

I think you'd get more traction if you had performance measurements;
minimally, profiles showing that this is hot in your current
application. Ideally, you could do a prototype in one of the browsers
supporting WebGL which exposes the ImageData's backing store as a
WebGLUnsignedIntArray. If this showed a significant speedup it would
provide strong motivation.

-Ken
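
One way to gather the kind of measurement suggested above is a small timing harness. The following is only a sketch under stated assumptions: it relies on typed arrays (`Uint8ClampedArray`, `Uint32Array`), which postdate this thread, and the names `fillBytes`, `fillWords`, and `timeIt` are hypothetical. The numbers it prints will vary by engine and say nothing about where a real application spends its time.

```javascript
// Hypothetical micro-benchmark: four 8-bit stores per pixel vs. one 32-bit store.
// Both views share a single backing buffer, standing in for ImageData storage.
const WIDTH = 256, HEIGHT = 256;
const buffer = new ArrayBuffer(WIDTH * HEIGHT * 4);
const bytes = new Uint8ClampedArray(buffer); // R, G, B, A per pixel
const words = new Uint32Array(buffer);       // one element per pixel

function fillBytes(r, g, b, a) {
  for (let i = 0; i < bytes.length; i += 4) {
    bytes[i] = r;
    bytes[i + 1] = g;
    bytes[i + 2] = b;
    bytes[i + 3] = a;
  }
}

function fillWords(packed) {
  for (let i = 0; i < words.length; i++) {
    words[i] = packed;
  }
}

// Runs fn repeatedly and reports the elapsed wall-clock time in milliseconds.
function timeIt(label, fn, runs) {
  const start = Date.now();
  for (let n = 0; n < runs; n++) fn();
  const elapsed = Date.now() - start;
  console.log(label + ": " + elapsed + " ms for " + runs + " runs");
  return elapsed;
}

// Write one gray pixel bytewise, then read it back as a 32-bit value so the
// packed constant matches this machine's byte order.
fillBytes(128, 128, 128, 255);
const packedGray = words[0];

const tBytes = timeIt("four byte stores", () => fillBytes(128, 128, 128, 255), 200);
const tWords = timeIt("one 32-bit store", () => fillWords(packedGray), 200);
```

A profile of the actual application would still be needed to show that these stores are the hot spot.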


Re: [whatwg] Canvas pixel manipulation and performance

2009-12-04 Thread Jason Oster
On Dec 4, 2009, at 6:10 PM, Kenneth Russell wrote:
 I think you'd get more traction if you had performance measurements;
 minimally, profiles showing that this is hot in your current
 application. Ideally, you could do a prototype in one of the browsers
 supporting WebGL which exposes the ImageData's backing store as a
 WebGLUnsignedIntArray. If this showed a significant speedup it would
 provide strong motivation.
My current application isn't necessarily the best test bed.  I'll see what I 
can do to prototype it with some minimal test cases, though.  Thanks for the 
suggestion.

Jay

Re: [whatwg] Canvas pixel manipulation and performance

2009-11-30 Thread Philip Taylor
On Mon, Nov 30, 2009 at 4:46 PM, Kenneth Russell k...@google.com wrote:
 CanvasPixelArray specifies that values greater than 255, including
 +inf, are clamped to 255 and values less than 0, including -inf, are
 clamped to zero. WebGLUnsignedByteArray (as people will see in the
 WebGL draft spec this week or next) specifies that the conversion is
 done with a C-style cast. The results are different for out-of-range
 values.

I was going to say: It doesn't include +/-inf, because
http://whatwg.org/html5#dependencies says "if a method with an
argument that is a floating point number type (float) is passed an
Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception
must be raised", and that probably applies to the CanvasPixelArray
setter method.

But it looks like the spec changed since I last looked, and the setter
takes an 'octet' argument, so I think the conversion should happen as
per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and
CanvasPixelArray shouldn't define any conversion. (Filed as
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL
and WebGL either match or can be made to match.

-- 
Philip Taylor
exc...@gmail.com
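
The difference between clamping and cast-style conversion for out-of-range values can be illustrated with two typed arrays. This is a sketch, assuming a modern engine: `Uint8ClampedArray` and `Uint8Array` are used here as stand-ins for the clamping and non-clamping byte arrays discussed in this thread, not as the exact types that were being specified at the time.

```javascript
// Uint8ClampedArray clamps out-of-range values to [0, 255].
// Uint8Array wraps modulo 256 and maps infinities to 0.
const clampedBytes = new Uint8ClampedArray(4);
const wrappedBytes = new Uint8Array(4);

const outOfRange = [300, -1, Infinity, -Infinity];
outOfRange.forEach((value, i) => {
  clampedBytes[i] = value;
  wrappedBytes[i] = value;
});

console.log(Array.from(clampedBytes)); // [255, 0, 255, 0]
console.log(Array.from(wrappedBytes)); // [44, 255, 0, 0]
```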


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-30 Thread Anne van Kesteren
On Mon, 30 Nov 2009 19:31:53 +0100, Philip Taylor  
excors+wha...@gmail.com wrote:

But it looks like the spec changed since I last looked, and the setter
takes an 'octet' argument, so I think the conversion should happen as
per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and
CanvasPixelArray shouldn't define any conversion. (Filed as
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL
and WebGL either match or can be made to match.


It would be nice if they used the same object/interface too... Maybe  
implementations of CanvasPixelArray should hide the interface and other  
details so that we can eventually convert into some kind of octet array if  
we get native support for that.



--
Anne van Kesteren
http://annevankesteren.nl/


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Mike Hearn
I have to wonder if it's worth trying to micro-optimize web APIs like
this. Your suggestions will squeeze out only a small amount of
additional performance - the goals will get a bit higher and we'll be
back at square one.

I know NativeClient isn't a proposed spec or standardised piece of web
infrastructure, but I think what you really need is the ability to
scribble on a canvas from native code rather than JavaScript. Work
done on that has the advantage of generalizing to all web APIs and use
cases rather than just direct graphics access.


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Jason Oster
On Nov 29, 2009, at 4:19 AM, Mike Hearn wrote:

 I have to wonder if it's worth trying to micro-optimize web APIs like
 this. Your suggestions will squeeze out only a small amount of
 additional performance - the goals will get a bit higher and we'll be
 back at square one.
I've always imagined that was the general flow of performance improvements; 
tune a little here, change that over there, and you knock some time off your 
overall benchmark.  I hope I haven't mistaken what you are saying.

 I know NativeClient isn't a proposed spec or standardised piece of web
 infrastructure, but I think what you really need is the ability to
 scribble on a canvas from native code rather than JavaScript. Work
 done on that has the advantage of generalizing to all web APIs and use
 cases rather than just direct graphics access.
That's one way to get a healthy performance boost (typically) but where does 
the web developer stand in this work?  Are you suggesting native code should 
replace JavaScript?  On the other hand, I have briefly considered using native 
code (via XPCOM) in my XULRunner application to grind out every bit of 
performance possible.  I can understand where you are coming from.

Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Mike Hearn
 That's one way to get a healthy performance boost (typically)
 but where does the web developer stand in this work?  Are
 you suggesting native code should replace JavaScript?

For code where performance is critical (like complex animation code), yes.

Don't get me wrong, I'm all for better JavaScript performance, but we
have to be realistic. Compared to native code, JavaScript's performance
will always be lacking - many clever tricks have been deployed to
speed up JavaScript but even the fastest JS engines don't come close
to the output of average C++ compilers/JVMs. The nature of JS makes it
likely that this situation will remain true for a long time, perhaps
forever.

So there are two possibilities here - one is to introduce ever more
complexity into the web APIs for diminishing returns, even though a
primary goal of the web APIs is simplicity. And the other is to just
bind native code to those APIs, hopefully eliminating much of the
marshalling overhead along the way.

The latter approach has the advantage of not requiring novice-level
developers to understand things like endianness or bit masking to draw
some pixels (replace as appropriate for any given API), whilst
allowing developers that need it to get the fastest execution securely
possible.


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Kenneth Russell
On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 11/29/09 12:15 AM, Kenneth Russell wrote:

 I assume you meant JS bitwise operators?  Do we have any indication that
 this would be faster than four array property sets?  The bitwise ops in
 JS are not necessarily particularly fast.

 Yes, that's what I meant. I don't have any data on whether this would
 currently be faster than the four separate byte stores.

 Are they even byte stores, necessarily?  I know in Gecko imagedata is just a
 JS array at the moment; it stores each of R,G,B,A as a JS Number (with the
 usual "if it's an integer, store as an integer" optimization arrays do).
 That might well change in the future, and I hope it does, but that's the
 current code.

 I can't speak to what the behavior is in Webkit, and in particular whether
 it's even the same when using V8 vs Nitro.

In Chromium (WebKit + V8), CanvasPixelArray property stores write
individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray
behave similarly but have simpler clamping semantics.

-Ken


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Philip Taylor
On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote:
 On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 Are they even byte stores, necessarily?  I know in Gecko imagedata is just a
 JS array at the moment; it stores each of R,G,B,A as a JS Number (with the
 usual "if it's an integer, store as an integer" optimization arrays do).
 That might well change in the future, and I hope it does, but that's the
 current code.

 I can't speak to what the behavior is in Webkit, and in particular whether
 it's even the same when using V8 vs Nitro.

 In Chromium (WebKit + V8), CanvasPixelArray property stores write
 individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray
 behave similarly but have simpler clamping semantics.

Would it be helpful (for simplicity or performance or consistency etc)
to change the specification of CanvasPixelArray to have those simpler
clamping semantics? (I don't expect there would be compatibility
problems with changing it now, particularly since Firefox doesn't
implement clamping at all in CPA.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Kenneth Russell
On Sun, Nov 29, 2009 at 11:05 AM, Philip Taylor excors+wha...@gmail.com wrote:
 On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote:
 On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 Are they even byte stores, necessarily?  I know in Gecko imagedata is just a
 JS array at the moment; it stores each of R,G,B,A as a JS Number (with the
 usual "if it's an integer, store as an integer" optimization arrays do).
 That might well change in the future, and I hope it does, but that's the
 current code.

 I can't speak to what the behavior is in Webkit, and in particular whether
 it's even the same when using V8 vs Nitro.

 In Chromium (WebKit + V8), CanvasPixelArray property stores write
 individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray
 behave similarly but have simpler clamping semantics.

 Would it be helpful (for simplicity or performance or consistency etc)
 to change the specification of CanvasPixelArray to have those simpler
 clamping semantics? (I don't expect there would be compatibility
 problems with changing it now, particularly since Firefox doesn't
 implement clamping at all in CPA.)

It would. Vladimir Vukicevic from Mozilla was planning to raise this
issue with the whatwg upon release of the first public draft of the
WebGL spec.

-Ken


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Boris Zbarsky

On 11/29/09 1:20 PM, Jason Oster wrote:

Changeset 2b56c4771d5c reduced the number of pixel array elements
accessed by caching the 256px x 256px rooms within the stage map, and
passing the cached rooms to putImageData().


As opposed to doing what before the change?

The previous code used a non-cached approach, where every pixel in the 
canvas was explicitly drawn into the ImageData array.  Keep in mind, the 
largest of these was 4864px × 3072px.  If anything, the change took time 
away from JavaScript and placed it in native code: putImageData().


I'm not sure I follow.  Looking at the diff, it looks like you used to 
do a single putImageData call, passing it this.fgmap.render(), right?


Now you do a bunch of putImageData calls, passing 
this.fgmap[rooms[i++]].img, where right before that you called 
this.fgmap[i].render() for a bunch of i.


I really don't see how this would have made things faster, unless 
render() is just not being called on all rooms now.


-Boris


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Jason Oster
The patch changed something like this:

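// Before: poke every pixel of the whole canvas
for (y = 0; y < canvasHeight; y++) {
  for (x = 0; x < canvasWidth; x++) {
    putPixel(x, y);
  }
}

To something like this:

// After: poke every pixel of one room (done once per cached room)
for (y = 0; y < roomHeight; y++) {
  for (x = 0; x < roomWidth; x++) {
    putPixel(x, y);
  }
}
// Then step across the canvas one room at a time, blitting from the cache
for (rooms_y = 0; rooms_y < canvasHeight; rooms_y += roomHeight) {
  for (rooms_x = 0; rooms_x < canvasWidth; rooms_x += roomWidth) {
    putRoom(rooms_x, rooms_y);
  }
}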

This pseudo-code is a bit unintelligible, please bear with me.  Basically, 
there is a fixed number of rooms (256px x 256px images), only about 15 in the 
smaller stages.  The full map might have a total area for 10 x 3 rooms; half of 
them are duplicated.  The patch caches the rooms drawn, and then blits from 
the cache into the canvas.

this.fgmap used to be one giant canvas.  Now it is an array of smaller canvases 
(the size of a single room).  The .render() method draws pixels to the 
associated ImageData in each case (using the four R,G,B,A elements per pixel, 
as we are discussing).  In this case, cutting down on the number of pixels that 
this.fgmap.render() needs to poke into ImageData made the overall drawing 
approximately 2x faster.

It might be important to note that this.fgmap.render() method also does some 
tile decoding (to convert the SNES tile format into a usable bitmap), and 
caches the results.

Does that make more sense?  I know it is difficult to follow unfamiliar code, 
but I would like to clear up any questions you might have.

Jay
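
The caching scheme described above can be sketched without a real canvas by counting pixel writes. Everything in this block is an illustrative assumption: the stage layout, the `renderRoom` stand-in for render(), and the `blit` callback that takes the place of putImageData() are hypothetical and not taken from the project's code.

```javascript
// Hypothetical sketch of room caching: render each unique room once, then
// reuse the cached pixels for every stage position that shows that room.
const ROOM_SIZE = 256;                  // rooms are 256px x 256px
const STAGE_COLS = 10, STAGE_ROWS = 3;  // example 10 x 3 stage
const UNIQUE_ROOMS = 15;                // so half of the cells are duplicates

// stageMap[row][col] holds the id of the room shown in that cell.
const stageMap = [];
for (let row = 0; row < STAGE_ROWS; row++) {
  const cells = [];
  for (let col = 0; col < STAGE_COLS; col++) {
    cells.push((row * STAGE_COLS + col) % UNIQUE_ROOMS);
  }
  stageMap.push(cells);
}

let pixelPokes = 0;

// Stand-in for render(): sets R, G, B, A for every pixel of one room.
function renderRoom(roomId) {
  const data = new Uint8ClampedArray(ROOM_SIZE * ROOM_SIZE * 4);
  for (let i = 0; i < data.length; i += 4) {
    data[i] = roomId * 16;
    data[i + 1] = roomId * 16;
    data[i + 2] = roomId * 16;
    data[i + 3] = 255;
    pixelPokes++;
  }
  return { width: ROOM_SIZE, height: ROOM_SIZE, data: data };
}

const roomCache = new Map();
function getRoom(roomId) {
  if (!roomCache.has(roomId)) {
    roomCache.set(roomId, renderRoom(roomId));
  }
  return roomCache.get(roomId);
}

// blit(room, x, y) stands in for a putImageData() call at pixel offset x, y.
function drawStage(blit) {
  for (let row = 0; row < STAGE_ROWS; row++) {
    for (let col = 0; col < STAGE_COLS; col++) {
      blit(getRoom(stageMap[row][col]), col * ROOM_SIZE, row * ROOM_SIZE);
    }
  }
}

let blits = 0;
drawStage(() => { blits++; });
console.log(blits + " blits, " + pixelPokes + " pixel pokes");
```

With this made-up layout the script pokes 15 rooms' worth of pixels instead of 30, while the number of blits stays at one per stage cell.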

On Nov 29, 2009, at 1:03 PM, Boris Zbarsky wrote:

 On 11/29/09 1:20 PM, Jason Oster wrote:
 Changeset 2b56c4771d5c reduced the number of pixel array elements
 accessed by caching the 256px x 256px rooms within the stage map, and
 passing the cached rooms to putImageData().
 
 As opposed to doing what before the change?
 The previous code used a non-cached approach, where every pixel in the 
 canvas was explicitly drawn into the ImageData array.  Keep in mind, the 
 largest of these was 4864px × 3072px.  If anything, the change took time 
 away from JavaScript and placed it in native code: putImageData().
 
 I'm not sure I follow.  Looking at the diff, it looks like you used to do a 
 single putImageData call, passing it this.fgmap.render(), right?
 
 Now you do a bunch of putImageData calls, passing this.fgmap[rooms[i++]].img, 
 where right before that you called this.fgmap[i].render() for a bunch of i.
 
 I really don't see how this would have made things faster, unless render() is 
 just not being called on all rooms now.
 
 -Boris



Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Boris Zbarsky

On 11/29/09 3:33 PM, Jason Oster wrote:

It might be important to note that this.fgmap.render() method also does some 
tile decoding (to convert the SNES tile format into a usable bitmap), and 
caches the results.

Does that make more sense?  I know it is difficult to follow unfamiliar code, 
but I would like to clear up any questions you might have.


So the new code has to do about half as much tile decoding, as well as 
half the number of imagedata[n] sets?  Or was the decoding already being 
cached?


-Boris


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Jason Oster
On Nov 29, 2009, at 1:57 PM, Boris Zbarsky wrote:
 So the new code has to do about half as much tile decoding, as well as half 
 the number of imagedata[n] sets?  Or was the decoding already being cached?
Decoded tiles were already cached.  It actually builds MORE tile caches now, 
though: one cache for each this.fgmap[].  So that's more (a lot more) tile 
decoding, but far fewer pixel pokes.

Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Oliver Hunt

On Nov 29, 2009, at 10:59 AM, Kenneth Russell wrote:

 On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 11/29/09 12:15 AM, Kenneth Russell wrote:
 
 I assume you meant JS bitwise operators?  Do we have any indication that
 this would be faster than four array property sets?  The bitwise ops in
 JS are not necessarily particularly fast.
 
 Yes, that's what I meant. I don't have any data on whether this would
 currently be faster than the four separate byte stores.
 
 Are they even byte stores, necessarily?  I know in Gecko imagedata is just a
 JS array at the moment; it stores each of R,G,B,A as a JS Number (with the
 usual "if it's an integer, store as an integer" optimization arrays do).
 That might well change in the future, and I hope it does, but that's the
 current code.
 
 I can't speak to what the behavior is in Webkit, and in particular whether
 it's even the same when using V8 vs Nitro.
 
 In Chromium (WebKit + V8), CanvasPixelArray property stores write
 individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray
 behave similarly but have simpler clamping semantics.

I don't know where you're getting that idea from -- the clamping semantics for 
CanvasPixelArray and WebGLUnsignedByteArray are identical.

The CanvasPixelArray implementation in WebKit has always matched the spec and 
been a clamping bytearray, i.e. one byte per channel, per pixel.

Just for future reference for all who are interested: in WebKit the JS 
interface to a DOM object is merely a binding to a C++ implementation, i.e. 
there's no reason to be concerned about different DOM object behaviour 
dependent on the JS engine - if there were a difference, it would imply a bug 
rather than a design choice.

--Oliver

 
 -Ken



Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Boris Zbarsky

On 11/29/09 11:22 PM, Oliver Hunt wrote:
 The CanvasPixelArray implementation in WebKit has always matched the 
 spec and been a clamping bytearray, eg. one byte per channel, per
 pixel.

I assume you mean the spec as it is now and not the spec as it was 
when Gecko implemented get/putImageData?  The latter was quite 
different, as I recall.



Just for future reference for all who are interested: in WebKit the JS 
interface to a DOM object is merely a binding to a C++ implementation


Sure; the point is that in Gecko the thing returned by 
getImageData().data is not a DOM object at all but a JS Array. 
Similarly, the object returned by getImageData() is not a DOM object, 
but a JS Object.  Likewise, putImageData accepts any JS Object with the 
right properties on it (width, height, and data).


I _think_ as of when the implementation was created some of this (e.g. 
the putImageData behavior) was called for by the then-whatwg spec.  See 
http://www.whatwg.org/specs/web-apps/2007-10-26/ for example; I believe 
there were other drafts between the 2005-01 draft and that one that had 
still other behavior.


As you note the spec has since changed; the Gecko implementation hasn't 
been changed yet pending us having some indication that the spec is 
actually stable now.  Once burnt, twice shy and all.


In any case, I simply didn't know whether CanvasPixelArray was 
implemented as a host object or a native object in webkit, and didn't 
want to make any claims about it as a result.


-Boris


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Boris Zbarsky

On 11/29/09 11:22 PM, Oliver Hunt wrote:

I don't know where you're getting that idea from -- the clamping semantics for 
CanvasPixelArray and WebGLUnsignedByteArray are identical.


Perhaps Kenneth included the rounding behavior (which seems to be 
different to me from a brief look at JavaScriptCore/wtf/ByteArray.h and 
WebCore/html/canvas/WebGLUnsignedByteArray.h) in clamping semantics? 
CanvasPixelArray, which uses ByteArray, rounds to nearest integer (ties 
rounded up), while WebGLUnsignedByteArray truncates.


Note that neither implements what the current spec draft calls for 
(which is round-to-nearest, ties-to-even behavior).  No opinion on 
whether this _should_ be what the spec calls for.


-Boris
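
The three rounding behaviours mentioned above (ties-to-even, truncation, and ties rounded up) can be compared side by side. This is a sketch assuming a modern engine; `Uint8ClampedArray`, `Uint8Array`, and `Math.round` are used only as convenient examples of each rule, not as the implementations discussed in this thread.

```javascript
const samples = [0.5, 1.5, 2.5, 1.9];

// Round to nearest, ties to even.
const tiesToEven = Array.from(Uint8ClampedArray.from(samples));
// Truncate toward zero.
const truncated = Array.from(Uint8Array.from(samples));
// Round to nearest, ties rounded up.
const tiesUp = samples.map((v) => Math.round(v));

console.log(tiesToEven); // [0, 2, 2, 2]
console.log(truncated);  // [0, 1, 2, 1]
console.log(tiesUp);     // [1, 2, 3, 2]
```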


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Jason Oster
My apologies for the direct reply, Oliver.  This was meant to go back to the 
list:


On Nov 26, 2009, at 3:35 PM, Oliver Hunt wrote:
 WebGL has completely different constraints to that of the 2d canvas -- when 
 the developer provides resources to GL the developer has to provide a myriad 
 of type details, this means that the developer needs to be able to request 
 storage of a specific type.  The WebGL array types are specifically targeting 
 this use case -- they don't make sense for canvas2d where the only storage is 
 not a developer specified format.
That is understandable.

 History has shown that developers won't handle both byte orders -- 
 developers tend to work on the assumption that if something works for them it 
 must be correct, this is why we end up with sites that show "This site needs 
 IE/Safari/Firefox to run" type messages.  Even conscientious developers who 
 test multiple browsers, and validate their content, etc will be able to 
 produce accidentally broken sites because this would add a hardware 
 dependency on spec behaviour.
We certainly don't want any more of that.

 Realistically, simply making a separate object that indexes 32-bit RGBA 
 pixels would resolve the problem you're trying to describe -- the 
 implementation would need to do byte order correction, but given that 2/3 
 canvas implementations already do unpremultiplied-to-premultiplied data 
 conversion on putImageData this is unlikely to add any cost at all (in fact 
 in the webkit implementation i don't believe there would be any difference 
 in the logic in get/putImageData).
Once again, I agree.  My confusion on the type-specific arrays for WebGL is 
that they were specific and general enough to use in other cases.  If they 
should not be used in 2D canvas implementations (or elsewhere) then a 
2D-canvas-specific array or object would be the way forward.

 Take for instance, the following pseudo code:
 
 var canvas = document.getElementById("canvas");
 var ctx = canvas.getContext("2d");
 var pixels = ctx.createUnsignedByteArray(8, 8);
 // Fill with medium gray
 for (var i = 0; i < 8 * 8; i++) {
  pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
 }
 ctx.putUnsignedByteArray(pixels, 0, 0);
 
 Adding a function call would make your code much slower.
Yes, it would.  Once again, this was only for illustrative purposes.  More 
commonly, a Look-Up Table would be created, containing all of the colors used 
in the scene before any pixels are touched.  For any kind of low-resolution 
pixel art (as found in classic gaming consoles), the palette is typically 
indexed and consists of 256 colors or fewer.  Even in extreme cases, an LUT 
with thousands of colors would be far faster than using such a function call.

I neglected to mention any optimal way of using a mapRGBA function; that's 
not what I was trying to illustrate.
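
A look-up table of the kind described above might look like the following sketch. It assumes typed arrays (which postdate this thread) rather than the proposed canvas methods; `packRGBA`, the four palette colors, and the 8 x 8 tile are made up for illustration.

```javascript
// Detect byte order so packed values land in memory as R, G, B, A.
const probe = new Uint32Array([0x11223344]);
const littleEndian = new Uint8Array(probe.buffer)[0] === 0x44;

// Hypothetical helper: pack four channel bytes into one 32-bit value.
function packRGBA(r, g, b, a) {
  return littleEndian
    ? ((a << 24) | (b << 16) | (g << 8) | r) >>> 0
    : ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
}

// Build the palette once, before any pixels are touched.
const palette = Uint32Array.from([
  packRGBA(0, 0, 0, 255),       // 0: black
  packRGBA(128, 128, 128, 255), // 1: medium gray
  packRGBA(255, 0, 0, 255),     // 2: red
  packRGBA(255, 255, 255, 255), // 3: white
]);

// An 8 x 8 tile of palette indices, here all medium gray.
const tileIndices = new Uint8Array(8 * 8).fill(1);

// Byte view (what ImageData exposes) and a 32-bit view over the same buffer.
const tileBytes = new Uint8ClampedArray(8 * 8 * 4);
const tileWords = new Uint32Array(tileBytes.buffer);

for (let i = 0; i < tileIndices.length; i++) {
  tileWords[i] = palette[tileIndices[i]]; // one store per pixel, no function call
}

console.log(Array.from(tileBytes.subarray(0, 4))); // [128, 128, 128, 255]
```

The endianness probe is exactly the kind of detail raised elsewhere in this thread as a risk of exposing 32-bit pixel access to script.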

 I understand this is a bad way to fill a portion of a canvas with a solid 
 color; this is for illustration purposes only.  The overall idea is that 
 setting fewer array elements per pixel will perform better.
 
 Have you actually measured this?  How long is spent in each part?  I suspect 
 if you're not using the dirty region arguments you're pushing back (and doing 
 premult conversion) on a lot more pixels than necessary.  Yes setting 4 
 properties is slower than setting 1, but where is your time actually being 
 spent?
I have not directly done any measurements, sorry.  What I do have is a Mercurial 
repository for a level editor project (which draws independent pixels directly 
to very large canvas elements) showing the progression of optimizations I've 
introduced.  Many of the modifications intended to make the drawing faster have 
done so by avoiding pixel access wherever possible.  Certainly it is not the 
most efficient code, but I've optimized enough to make the time spent setting 
pixel arrays worth investigating.  I still do not have any actual numbers to 
throw around, however.
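
The dirty region arguments mentioned in the quoted question are the optional trailing parameters of putImageData(). A minimal sketch of using them follows; the `createDirtyTracker` helper and the `fakeCtx` object are hypothetical, and `fakeCtx` only records calls so that the example runs without a browser.

```javascript
// Hypothetical helper: track the bounding box of touched pixels and pass it
// as the dirty rectangle, so only that region is pushed back to the canvas.
function createDirtyTracker() {
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  return {
    mark(x, y) {
      if (x < minX) minX = x;
      if (y < minY) minY = y;
      if (x > maxX) maxX = x;
      if (y > maxY) maxY = y;
    },
    flush(ctx, imageData) {
      if (maxX < minX) return false; // nothing was marked
      ctx.putImageData(imageData, 0, 0,
        minX, minY, maxX - minX + 1, maxY - minY + 1);
      minX = minY = Infinity;
      maxX = maxY = -Infinity;
      return true;
    },
  };
}

// Stand-in for a 2D context: records the arguments of each putImageData call.
const recordedCalls = [];
const fakeCtx = { putImageData: (...args) => recordedCalls.push(args) };

const tracker = createDirtyTracker();
tracker.mark(10, 20);
tracker.mark(12, 25);
tracker.flush(fakeCtx, { width: 256, height: 256 });

console.log(recordedCalls[0].slice(3)); // [10, 20, 3, 6]
```

In a browser, `fakeCtx` would be replaced by a real context from getContext("2d") and the second argument by a real ImageData object.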

 We've already seen the emergence of emulators written in JavaScript/Canvas.  
 In fact, there are loads of them[4], and they would all benefit from having 
 a better way to interact directly with canvas pixels.  Of course, the use 
 cases are not limited to emulation; my NES/SNES level editor projects would 
 enjoy faster pixel manipulation as well.  These kinds of projects can use 
 arbitrarily sized canvases (up to 4864px × 3072px in one case[5]) and can 
 take a good deal of time to fully render, even with several off-ImageData 
 optimization tricks.
 
 Without seeing the code for your demo i'd have no idea whether what you're 
 doing is actually efficient -- have you profiled?  Both Safari and Firefox 
 have built in profilers.

The trouble with profiling my project is that it is a XULRunner application, 
and does not run directly in web browsers as-is.  The code can largely be 
hacked to work as a web application.  If you are interested in a demo of 
sorts, the code is all available 

Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Kenneth Russell
On Sat, Nov 28, 2009 at 12:44 PM, Jason Oster paras...@kodewerx.org wrote:

 Once again, I agree.  My confusion on the type-specific arrays for WebGL is
 that they were specific and general enough to use in other cases.  If they
 should not be used in 2D canvas implementations (or elsewhere) then a
 2D-canvas-specific array or object would be the way forward.

I and other members of the WebGL working group are hoping that the new
array-like types being introduced with this specification will be
general enough to repurpose in other areas. The first public draft of
the spec will be released in the next week or two, and we're hoping
that will enable discussion with the broader web community.

From a technical standpoint, it would be feasible to use the
WebGLUnsignedIntArray to access the Canvas's pixel data, and
assemble RGBA pixels into integer values using just JavaScript logical
operators. To keep things sane, the specification would need to state
something along the lines that the high (logical, not addressing) 8
bits are the red bits and the low 8 bits are the alpha bits. This
means that implementations on big-endian and little-endian machines
would need to store the data differently internally so that the
behavior at the JavaScript level is identical; the WebGL array types
currently deliberately do no byte swapping.

-Ken
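
The logical layout described above (red in the high 8 bits, alpha in the low 8 bits, identical behaviour on any hardware) can be sketched with bitwise operators and a `DataView`. This is an illustration under assumptions, not the WebGL array API discussed here: `DataView` postdates this thread, and `packPixel` and `unpackPixel` are hypothetical names.

```javascript
// Logical layout: red in the high 8 bits, alpha in the low 8 bits.
function packPixel(r, g, b, a) {
  return ((r << 24) | (g << 16) | (b << 8) | a) >>> 0;
}

function unpackPixel(pixel) {
  return [pixel >>> 24, (pixel >>> 16) & 0xff, (pixel >>> 8) & 0xff, pixel & 0xff];
}

const pixelBytes = new Uint8ClampedArray(4);
const pixelView = new DataView(pixelBytes.buffer);

// Passing false requests a big-endian store, so the bytes land as R, G, B, A
// regardless of the byte order of the machine running the script.
pixelView.setUint32(0, packPixel(128, 64, 32, 255), false);

console.log(Array.from(pixelBytes));                      // [128, 64, 32, 255]
console.log(unpackPixel(pixelView.getUint32(0, false)));  // [128, 64, 32, 255]
```

Fixing the byte order at the access point is one way to get identical script-visible results on big-endian and little-endian machines; whether it is fast enough is a separate question.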


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Boris Zbarsky

On 11/28/09 11:42 PM, Kenneth Russell wrote:

From a technical standpoint, it would be feasible to use the
WebGLUnsignedIntArray to access the Canvas's pixel data, and
assemble RGBA pixels into integer values using just JavaScript logical
operators.


I assume you meant JS bitwise operators?  Do we have any indication that 
this would be faster than four array property sets?  The bitwise ops in 
JS are not necessarily particularly fast.


-Boris


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Boris Zbarsky

On 11/28/09 3:44 PM, Jason Oster wrote:

The trouble with profiling my project is that it is a XULRunner
application, and does not run directly in web browsers as-is.


This is not an issue at all; any XULRunner application can be run in 
Firefox directly (with the right command-line flags).  I'm happy to 
profile whatever set of operations you want for you in Firefox if you 
send me step-by-step instructions for performing that set of operations 
in your application, ideally with code pointers to the start and stop of 
the code to be profiled in your codebase.  Probably offlist for all 
that... ;)


Of course if you have a Mac you can do all this yourself; Shark is an 
excellent tool.



Changeset 2b56c4771d5c reduced the number of pixel array elements
accessed by caching the 256px x 256px rooms within the stage map, and
passing the cached rooms to putImageData().


As opposed to doing what before the change?


My point, as you concur, is that setting four array elements
(properties) is slower than setting just one.


While true, it may not necessarily be slower than setting one if the 
value to set is more expensive to compute as a result, and may not be 
the bottleneck to start with.  The latter is hard to determine without 
profiling.  My gut feel is that at least in Gecko this is not likely to 
be a performance bottleneck right now, nor much of a win with the 
proposed change, if any.  But again, that would be easier to judge (at 
least for the first part) with a profile.


-Boris


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Kenneth Russell
On Sat, Nov 28, 2009 at 9:00 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 11/28/09 11:42 PM, Kenneth Russell wrote:

 From a technical standpoint, it would be feasible to use the
 WebGLUnsignedIntArray to access the Canvas's pixel data, and
 assemble RGBA pixels into integer values using just JavaScript logical
 operators.

 I assume you meant JS bitwise operators?  Do we have any indication that
 this would be faster than four array property sets?  The bitwise ops in JS
 are not necessarily particularly fast.

Yes, that's what I meant. I don't have any data on whether this would
currently be faster than the four separate byte stores.
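
For concreteness, here is a rough sketch of what assembling a pixel with 
bitwise operators could look like.  It assumes a Uint32Array view over a 
plain ArrayBuffer as a stand-in for a WebGLUnsignedIntArray over canvas 
pixel data (nothing in the 2D context exposes such a view today), and the 
helper names isLittleEndian and packRGBA are made up for illustration.

```javascript
// Detect byte order once: write a 32-bit value, read back its first byte.
function isLittleEndian() {
  var probe = new Uint32Array([0x0a0b0c0d]);
  return new Uint8Array(probe.buffer)[0] === 0x0d;
}

// Pack four 8-bit components so the bytes land in memory as R, G, B, A.
function packRGBA(r, g, b, a, littleEndian) {
  var value = littleEndian
    ? (a << 24) | (b << 16) | (g << 8) | r
    : (r << 24) | (g << 16) | (b << 8) | a;
  return value >>> 0; // bitwise ops yield signed 32-bit; force unsigned
}

var buffer = new ArrayBuffer(8 * 8 * 4); // stand-in for an 8x8 pixel buffer
var bytes = new Uint8Array(buffer);      // byte view: R, G, B, A, R, G, ...
var words = new Uint32Array(buffer);     // one element per pixel

// Fill with medium gray using a single store per pixel.
var gray = packRGBA(128, 128, 128, 255, isLittleEndian());
for (var i = 0; i < words.length; i++) {
  words[i] = gray;
}
```

Note that the byte-order branch is exactly the kind of hardware dependency 
a page author would have to get right for this approach to be portable.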

-Ken


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-28 Thread Boris Zbarsky

On 11/29/09 12:15 AM, Kenneth Russell wrote:

I assume you meant JS bitwise operators?  Do we have any indication that
this would be faster than four array property sets?  The bitwise ops in JS
are not necessarily particularly fast.


Yes, that's what I meant. I don't have any data on whether this would
currently be faster than the four separate byte stores.


Are they even byte stores, necessarily?  I know in Gecko imagedata is 
just a JS array at the moment; it stores each of R,G,B,A as a JS Number 
(with the usual "if it's an integer, store it as an integer" optimization 
that arrays do).  That might well change in the future, and I hope it does, 
but that's the current code.


I can't speak to what the behavior is in Webkit, and in particular 
whether it's even the same when using V8 vs Nitro.


-Boris


[whatwg] Canvas pixel manipulation and performance

2009-11-26 Thread Jason Oster
Hello Group,

I've been using canvas to draw pixel art (NES/SNES game screens and sprites) 
similar to what an emulator would do.  Doing this kind of drawing requires 
direct access to the pixel buffer.

My problem with the canvas spec (as it is now) is that it tends to artificially 
bound pixel drawing performance to JavaScript when doing any sort of pixel 
access.  Setting four unsigned 8-bit array elements (R, G, B, and A) is a 
slower operation than setting just one unsigned 32-bit array element (RGBA or 
ABGR).  Sadly, we don't have this latter option for canvas.

My comment is a request for a new set of pixel access methods on the 
CanvasRenderingContext2D object.  Specifically, alternatives to 
createImageData(), getImageData(), and putImageData() methods for providing an 
array of unsigned 32-bit elements for pixel manipulation.

One proposal is the reuse of the CanvasArrayBuffer introduced by WebGL[1].  The 
reference explains the use of CanvasArrayBuffer in the context of RGBA color 
space: "... RGBA colors, with each component represented as an unsigned byte."  
This appears to be a useful solution, with an existing implementation to build 
from (at least in Mozilla).  The single concern here is that it neglects any 
mention of support for hardware utilizing native-ABGR (e.g. little endian) 
byte order, or more obscure formats.  I assume the idea was to handle any 
necessary conversions in the back-end, including 32-bit color depth to 16-bit 
color depth, for example.

A second option is allowing the web developer to handle byte order issues, 
similar in concept to SDL[2].  In addition to general endian handling, SDL also 
supports mapping color components to an unsigned 32-bit integer[3].  It seems 
to me this is the best way to cover hardware byte order/color depth 
independence while achieving the best user land performance possible.

Take for instance, the following pseudo code:

  var canvas = document.getElementById("canvas");
  var ctx = canvas.getContext("2d");
  var pixels = ctx.createUnsignedByteArray(8, 8);
  // Fill with medium gray
  for (var i = 0; i < 8 * 8; i++) {
    pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
  }
  ctx.putUnsignedByteArray(pixels, 0, 0);

That appears more sane than the current method:

  var canvas = document.getElementById("canvas");
  var ctx = canvas.getContext("2d");
  var pixels = ctx.createImageData(8, 8);
  // Fill with medium gray
  for (var i = 0; i < 8 * 8; i++) {
    pixels.data[i * 4 + 0] = 128;
    pixels.data[i * 4 + 1] = 128;
    pixels.data[i * 4 + 2] = 128;
    pixels.data[i * 4 + 3] = 255;
  }
  ctx.putImageData(pixels, 0, 0);

I understand this is a bad way to fill a portion of a canvas with a solid color; 
this is for illustration purposes only.  The overall idea is that setting fewer 
array elements per pixel will perform better.

We've already seen the emergence of emulators written in JavaScript/Canvas.  In 
fact, there are loads of them[4], and they would all benefit from having a 
better way to interact directly with canvas pixels.  Of course, the use cases 
are not limited to emulation; my NES/SNES level editor projects would enjoy 
faster pixel manipulation as well.  These kinds of projects can use arbitrarily 
sized canvases (up to 4864px × 3072px in one case[5]) and can take a good deal 
of time to fully render, even with several off-ImageData optimization tricks.

Looking to discuss more options!
Jason Oster


[1] http://blog.vlad1.com/2009/11/06/canvasarraybuffer-and-canvasarray/
[2] http://www.libsdl.org/intro.en/usingendian.html
[3] http://www.libsdl.org/cgi/docwiki.cgi/SDL_MapRGBA
[4] http://www.google.com/search?q=javascript+emulator
[5] http://parasyte.kodewerx.org/projects/syndrome/stages/2009-07-05/12_wily3.png

Re: [whatwg] Canvas pixel manipulation and performance

2009-11-26 Thread Oliver Hunt

On Nov 26, 2009, at 11:45 AM, Jason Oster wrote:

 Hello Group,
 
 I've been using canvas to draw pixel art (NES/SNES game screens and sprites) 
 similar to what an emulator would do.  Doing this kind of drawing requires 
 direct access to the pixel buffer.
 
 My problem with the canvas spec (as it is now) is that it tends to 
 artificially bound pixel drawing performance to JavaScript when doing any 
 sort of pixel access.  Setting four unsigned 8-bit array elements (R, G, B, 
 and A) is a slower operation than setting just one unsigned 32-bit array 
 element (RGBA or ABGR).  Sadly, we don't have this latter option for canvas.
 
 My comment is a request for a new set of pixel access methods on the 
 CanvasRenderingContext2D object.  Specifically, alternatives to 
 createImageData(), getImageData(), and putImageData() methods for providing 
 an array of unsigned 32-bit elements for pixel manipulation.
 
 One proposal is the reuse of the CanvasArrayBuffer introduced by WebGL[1].  
 The reference explains the use of CanvasArrayBuffer in the context of RGBA 
 color space: "... RGBA colors, with each component represented as an unsigned 
 byte."  This appears to be a useful solution, with an existing implementation 
 to build from (at least in Mozilla).  The single concern here is that it 
 neglects any mention of support for hardware utilizing native-ABGR (e.g. 
 little endian) byte order, or more obscure formats.  I assume the idea 
 was to handle any necessary conversions in the back-end, including 32-bit 
 color depth to 16-bit color depth, for example.
WebGL has completely different constraints from those of the 2d canvas.  When the 
developer provides resources to GL, the developer has to provide a myriad of 
type details, which means the developer needs to be able to request storage 
of a specific type.  The WebGL array types specifically target this use 
case; they don't make sense for canvas2d, where the only storage is not in a 
developer-specified format.

 A second option is allowing the web developer to handle byte order issues, 
 similar in concept to SDL[2].  In addition to general endian handling, SDL 
 also supports mapping color components to an unsigned 32-bit integer[3].  
 It seems to me this is the best way to cover hardware byte order/color depth 
 independence while achieving the best user land performance possible.

History has shown that developers won't handle both byte orders.  
Developers tend to work on the assumption that if something works for them it 
must be correct; this is why we end up with sites that show "This site needs 
IE/Safari/Firefox to run" type messages.  Even conscientious developers who 
test multiple browsers, validate their content, etc. will be able to produce 
accidentally broken sites, because this would add a hardware dependency to spec 
behaviour.

Realistically, simply making a separate object that indexes 32-bit RGBA 
pixels would resolve the problem you're trying to describe.  The 
implementation would need to do byte order correction, but given that 2/3 canvas 
implementations already do unpremultiplied-to-premultiplied data conversion on 
putImageData this is unlikely to add any cost at all (in fact in the WebKit 
implementation I don't believe there would be any difference in the logic in 
get/putImageData).

 
 Take for instance, the following pseudo code:
 
   var canvas = document.getElementById("canvas");
   var ctx = canvas.getContext("2d");
   var pixels = ctx.createUnsignedByteArray(8, 8);
   // Fill with medium gray
   for (var i = 0; i < 8 * 8; i++) {
     pixels.data[i] = ctx.mapRGBA(128, 128, 128, 255);
   }
   ctx.putUnsignedByteArray(pixels, 0, 0);

Adding a function call would make your code much slower.
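
One way to keep the call out of the inner loop, for the indexed-color 
drawing an NES/SNES renderer does, is to map each palette entry once up 
front and do a table lookup per pixel.  This is only a sketch under stated 
assumptions: mapRGBA below is a userland stand-in for the proposed method 
(it does not exist on real contexts), the four-entry palette is made up, 
and drawIndexed is a hypothetical helper writing into a Uint32Array.

```javascript
// Userland stand-in for the proposed ctx.mapRGBA(); not a real canvas method.
// Packing through a shared byte view keeps the bytes as R, G, B, A in memory
// on any platform, without a byte-order branch.
var scratchBytes = new Uint8Array(4);
var scratchWord = new Uint32Array(scratchBytes.buffer);
function mapRGBA(r, g, b, a) {
  scratchBytes[0] = r;
  scratchBytes[1] = g;
  scratchBytes[2] = b;
  scratchBytes[3] = a;
  return scratchWord[0];
}

// Made-up four-color palette, mapped once instead of once per pixel.
var paletteRGBA = [
  [0, 0, 0, 255],
  [85, 85, 85, 255],
  [170, 170, 170, 255],
  [255, 255, 255, 255]
];
var palette = new Uint32Array(paletteRGBA.length);
for (var c = 0; c < paletteRGBA.length; c++) {
  var entry = paletteRGBA[c];
  palette[c] = mapRGBA(entry[0], entry[1], entry[2], entry[3]);
}

// indices: one palette index per pixel; words: 32-bit view of the pixels.
function drawIndexed(indices, words) {
  for (var i = 0; i < indices.length; i++) {
    words[i] = palette[indices[i]]; // lookup and store, no call per pixel
  }
}
```

Whether the lookup actually beats a call depends on how well the engine 
inlines small functions, so this too would need measuring.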

 
 That appears more sane than the current method:
 
   var canvas = document.getElementById("canvas");
   var ctx = canvas.getContext("2d");
   var pixels = ctx.createImageData(8, 8);
   // Fill with medium gray
   for (var i = 0; i < 8 * 8; i++) {
     pixels.data[i * 4 + 0] = 128;
     pixels.data[i * 4 + 1] = 128;
     pixels.data[i * 4 + 2] = 128;
     pixels.data[i * 4 + 3] = 255;
   }
   ctx.putImageData(pixels, 0, 0);
 
 I understand this is a bad way to fill a portion of a canvas with a solid color; 
 this is for illustration purposes only.  The overall idea is that setting 
 fewer array elements per pixel will perform better.

Have you actually measured this?  How long is spent in each part?  I suspect 
that if you're not using the dirty region arguments, you're pushing back (and 
doing premult conversion on) a lot more pixels than necessary.  Yes, setting 4 
properties is slower than setting 1, but where is your time actually being 
spent?
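
As a sketch of what using the dirty region arguments could look like: 
track the bounding box of the pixels you touch and pass it as the last 
four arguments of putImageData(), so only that rectangle is pushed back.  
DirtyRect, setPixel and flush are hypothetical helper names, and the 
sketch assumes the ImageData is always put at (0, 0).

```javascript
// Smallest rectangle covering every pixel touched since the last flush.
function DirtyRect() {
  this.reset();
}
DirtyRect.prototype.reset = function () {
  this.x0 = Infinity;
  this.y0 = Infinity;
  this.x1 = -Infinity;
  this.y1 = -Infinity;
};
DirtyRect.prototype.mark = function (x, y) {
  if (x < this.x0) this.x0 = x;
  if (y < this.y0) this.y0 = y;
  if (x > this.x1) this.x1 = x;
  if (y > this.y1) this.y1 = y;
};
DirtyRect.prototype.isEmpty = function () {
  return this.x1 < this.x0;
};

// Write one pixel into the ImageData and remember that it changed.
function setPixel(imageData, dirty, x, y, r, g, b, a) {
  var i = (y * imageData.width + x) * 4;
  imageData.data[i] = r;
  imageData.data[i + 1] = g;
  imageData.data[i + 2] = b;
  imageData.data[i + 3] = a;
  dirty.mark(x, y);
}

// Push back only the touched rectangle, then start a fresh one.
function flush(ctx, imageData, dirty) {
  if (dirty.isEmpty()) return;
  ctx.putImageData(imageData, 0, 0,
                   dirty.x0, dirty.y0,
                   dirty.x1 - dirty.x0 + 1,
                   dirty.y1 - dirty.y0 + 1);
  dirty.reset();
}
```

A single bounding box is the simplest choice; if the touched pixels are 
scattered across the canvas it degenerates to the whole image, in which 
case per-tile rectangles would be the next thing to try.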

 
 We've already seen the emergence of emulators written in JavaScript/Canvas.  
 In fact, there are loads of them[4], and they would all benefit from having a 
 better way to interact directly with canvas pixels.  Of course, the use cases 
 are not limited to emulation; my NES/SNES level editor projects would enjoy