[whatwg] Proposal: Add CanvasRenderingContext2D.currentTransform for reading and writing the current transform matrix

2011-11-03 Thread Chris Jones
An important canvas use case that arose during the development of pdf.js[1] is 
implementing a fill-to-current-clip operation.  In PDF, the shading fill 
operator, sh, requires this.  In the cairo library, cairo_paint() is this 
operation.  The operation can be implemented by tracking the current transform 
matrix (CTM) in script, then at paint time, inverting the CTM and filling the 
canvas bounds transformed by the inverse CTM.  However, tracking the CTM from 
script is cumbersome and not likely to be performant.  It's also not always 
possible for script to track the CTM; for example, if an external library is 
passed a canvas to draw to, the library doesn't know the initial CTM.  Another 
use case that requires the CTM is creating a minimal temporary surface for a 
fill operation whose bounds are given in user-space coordinates, since canvas 
has no native support for creating the contents of such a fill.  This case 
also arose in pdf.js.

To that end, we propose a canvas interface that allows querying the CTM.  The 
concrete proposal is

  interface CanvasRenderingContext2D {
    // ...
    attribute SVGMatrix currentTransform;  // default is the identity matrix
    // ...
  };

The first use case above is satisfied by |inverseCtm = 
context.currentTransform.inverse();|.  This attribute also serves as a 
convenience attribute for updating the CTM.  The |currentTransform| attribute 
tracks updates made to the CTM through |setTransform()/scale()/translate()|.
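
For concreteness, here is a rough sketch of how the first use case might 
look on top of the proposed attribute (the |paintCanvasBounds| helper and 
its details are illustrative only, not part of the proposal):

  function paintCanvasBounds(ctx) {
    // Map the device-space canvas bounds back into user space with the
    // inverse CTM, then fill the resulting quad.
    var inv = ctx.currentTransform.inverse();  // SVGMatrix
    var w = ctx.canvas.width, h = ctx.canvas.height;
    var corners = [[0, 0], [w, 0], [w, h], [0, h]];
    ctx.beginPath();
    for (var i = 0; i < corners.length; i++) {
      // SVGMatrix maps (x, y) to (a*x + c*y + e, b*x + d*y + f).
      var x = inv.a * corners[i][0] + inv.c * corners[i][1] + inv.e;
      var y = inv.b * corners[i][0] + inv.d * corners[i][1] + inv.f;
      if (i === 0) ctx.moveTo(x, y); else ctx.lineTo(x, y);
    }
    ctx.closePath();
    ctx.fill();
  }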

Note: a similar API is implemented in Gecko as the sequence-typed attributes 
|mozCurrentTransform/mozCurrentTransformInverse|.  This violates WebIDL rules 
(sequences are not allowed as attribute types) and will be updated.  SVGMatrix 
was chosen as the replacement since it already has an |inverse()| method and 
so results in a smaller API surface area.  Mozilla will update Gecko when this 
proposal stabilizes.
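
The second use case can be sketched similarly: size a minimal temporary 
surface to the device-space bounds of a user-space rectangle (again, the 
helper is illustrative only):

  function makeTempSurface(ctx, x, y, w, h) {
    var m = ctx.currentTransform;
    var minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
    var pts = [[x, y], [x + w, y], [x + w, y + h], [x, y + h]];
    for (var i = 0; i < pts.length; i++) {
      // Transform each user-space corner into device space with the CTM.
      var dx = m.a * pts[i][0] + m.c * pts[i][1] + m.e;
      var dy = m.b * pts[i][0] + m.d * pts[i][1] + m.f;
      minX = Math.min(minX, dx); maxX = Math.max(maxX, dx);
      minY = Math.min(minY, dy); maxY = Math.max(maxY, dy);
    }
    var tmp = document.createElement("canvas");
    tmp.width = Math.ceil(maxX - minX);
    tmp.height = Math.ceil(maxY - minY);
    return tmp;
  }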

Cheers,
Chris

[1] https://github.com/mozilla/pdf.js


Re: [whatwg] document.write(\r): the spec doesn't say how to handle it.

2011-11-03 Thread Henri Sivonen
On Thu, Nov 3, 2011 at 1:57 AM, David Flanagan dflana...@mozilla.com wrote:
 Firefox, Chrome and Safari all seem to do the right thing: wait for the next
 character before tokenizing the CR.

See http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1247

Firefox tokenizes the CR immediately, emits an LF and then skips over
the next character if it is an LF. When I designed the solution
Firefox uses, I believed it was more correct and more compatible with
legacy than whatever the spec said at the time.

Chrome seems to wait for the next character before tokenizing the CR.

 And I think this means that the description of document.write needs to be 
 changed.

All along, I've felt that having U+0000 and CRLF handling as a stream 
preprocessing step was bogus and that both should happen upon tokenization. 
So far, I've managed to convince Hixie about U+0000 handling.

 Similarly, what should the tokenizer do if the document.write emits half of
 a UTF-16 surrogate pair as the last character?

The parser operates on UTF-16 code units, so a lone surrogate is emitted.

-- 
Henri Sivonen
hsivo...@iki.fi
http://hsivonen.iki.fi/


Re: [whatwg] document.write(\r): the spec doesn't say how to handle it.

2011-11-03 Thread David Flanagan

On 11/3/11 4:21 AM, Henri Sivonen wrote:

On Thu, Nov 3, 2011 at 1:57 AM, David Flanagan dflana...@mozilla.com wrote:

Firefox, Chrome and Safari all seem to do the right thing: wait for the next
character before tokenizing the CR.

See http://software.hixie.ch/utilities/js/live-dom-viewer/saved/1247

I hadn't used the live dom viewer before.  That's really useful!


Firefox tokenizes the CR immediately, emits an LF and then skips over
the next character if it is an LF. When I designed the solution
Firefox uses, I believed it was more correct and more compatible with
legacy than whatever the spec said at the time.
I'm having a Duh! moment... I currently wait for the next character, but 
what you describe also works, and allows the document.write() spec to make 
sense.



Chrome seems to wait for the next character before tokenizing the CR.


And I think this means that the description of document.write needs to be 
changed.

All along, I've felt that having U+0000 and CRLF handling as a stream 
preprocessing step was bogus and that both should happen upon tokenization. 
So far, I've managed to convince Hixie about U+0000 handling.
Each tokenizer state would have to add a rule for CR that said "emit an LF, 
save the current tokenizer state, and set the tokenizer state to the 'after 
CR' state".  Actually, tokenizer states that already have a rule for LF or 
whitespace would have to integrate this CR rule into that rule.  The new 
'after CR' state would then have two rules: on LF it would skip the character 
and restore the saved state; on anything else it would push the character 
back and restore the saved state.
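
In code, something like this toy skeleton (the state names and tokenizer 
object are made up for illustration):

  // On CR in any state: emit LF now, remember the state, take a detour.
  function handleCR(tok) {
    tok.emit("\n");
    tok.savedState = tok.state;
    tok.state = afterCR;
  }

  // The 'after CR' state: the next character decides what happens.
  function afterCR(tok, ch) {
    tok.state = tok.savedState;   // restore the interrupted state
    if (ch !== "\n")
      tok.pushBack(ch);           // not LF: reprocess it in that state
                                  // LF: swallow it, so CRLF yields one LF
  }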



Similarly, what should the tokenizer do if the document.write emits half of
a UTF-16 surrogate pair as the last character?

The parser operates on UTF-16 code units, so a lone surrogate is emitted.


The spec seems pretty unambiguous that it operates on code points (though I 
implemented mine using 16-bit code units).  §13.2.1: "The input to the HTML 
parsing process consists of a stream of Unicode code points."  Also, 
§13.2.2.3 includes a list of code points beyond the BMP that are parse 
errors.  And finally, the tests in 
http://code.google.com/p/html5lib/source/browse/testdata/tokenizer/unicodeCharsProblematic.test 
require unpaired surrogates to be converted to the U+FFFD replacement 
character.  (Though my experience is that modifying my tokenizer to pass 
those tests causes other tests to fail, which makes me wonder whether 
unpaired surrogates are only supposed to be replaced in some but not all 
tokenizer states.)
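
For reference, the case in question is a contrived split like this:

  // A surrogate pair split across two document.write() calls: the first
  // chunk ends on a lone lead surrogate.
  document.write("<p>\uD83D");   // lead half of U+1F600
  document.write("\uDE00</p>");  // trail half arrives in the next chunk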

Thanks, Henri!

David