Re: [whatwg] Need clarification on DOM exceptions thrown by canvas 2D drawImage

2011-08-09 Thread Philip Taylor
On Mon, Aug 8, 2011 at 10:08 PM, Jeff Muizelaar jmuizel...@mozilla.com wrote:

 On 2011-08-08, at 4:58 PM, Ian Hickson wrote:

 On Mon, 8 Aug 2011, Justin Novosad wrote:

 This inquiry is regarding this page of the specification:
 http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html
 In section 4.8.11.1.10 Images, about drawImage(), it is stated that "If
 one of the sw or sh arguments is zero, the implementation must raise an
 INDEX_SIZE_ERR exception". There are no other references to other
 circumstances under which INDEX_SIZE_ERR should be thrown, and there is
 no indication of what the correct behavior is when the source rectangle
 is completely or partially outside the bounds of the source image.

The spec used to throw exceptions on out-of-bounds source rectangles,
but that causes breakage because floats are imprecise (e.g.
http://www.jigzone.com/xmockup/oCanvasBug.php failed in Opera because
79.01  79 as 64-bit double, whereas other browsers
presumably rounded to 32-bit float first), so it had to be changed.
(http://html5.org/r/5373 first, then changed again because of
http://www.w3.org/Bugs/Public/show_bug.cgi?id=10799 to be consistent
with filtering behaviour.)

 A bit lower down in the same section, the spec says: "When the filtering
 algorithm requires a pixel value from outside the original image data, it
 must instead use the value from the nearest edge pixel. (That is, the
 filter uses 'clamp-to-edge' behavior.)"

 The clamp-to-edge behavior doesn't really work well with CoreGraphics'
 drawImage call. This means that this behaviour is not implemented in Firefox 
 on OS X and I expect WebKit doesn't implement it for a similar reason. I was 
 actually hoping the spec could be changed to the simpler behaviour of just 
 clamping the source rectangle to the bounds of the image. This behaviour is 
 easy to implement on all platforms and is still quite reasonable.

Does the clamp-to-edge behaviour work fine when the source rectangle
is entirely inside the image? e.g. the image

8800
8800
0088
0088

(where each digit is a pixel) drawn at 2x scale with bilinear
filtering should give

88862000
88862000
88862000
66653222
22235666
00026888
00026888
00026888

because of the filtering requirements. If CoreGraphics can't do that
then it's broken (per the spec) regardless of how source rectangles
are handled. Or is it able to do clamp-to-edge fine up to the edge of
the source image, just not extend that beyond the image when the
source rectangle is expanded further?
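
The expected grid above can be reproduced with a small sketch. This is only an illustration in Python, not anything from the spec or from CoreGraphics: `upscale_bilinear_clamped` is a hypothetical helper, and sampling at pixel centres is an assumption that happens to match the numbers in this message.

```python
import math

def upscale_bilinear_clamped(img, scale):
    """Upscale a 2D grid of values with bilinear filtering and clamp-to-edge."""
    h, w = len(img), len(img[0])

    def px(x, y):
        # Clamp-to-edge: coordinates outside the image use the nearest edge pixel.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = []
    for dy in range(h * scale):
        # Map the destination pixel centre back into source pixel-centre space.
        sy = (dy + 0.5) / scale - 0.5
        y0 = math.floor(sy)
        fy = sy - y0
        row = []
        for dx in range(w * scale):
            sx = (dx + 0.5) / scale - 0.5
            x0 = math.floor(sx)
            fx = sx - x0
            top = px(x0, y0) * (1 - fx) + px(x0 + 1, y0) * fx
            bottom = px(x0, y0 + 1) * (1 - fx) + px(x0 + 1, y0 + 1) * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

image = [[8, 8, 0, 0], [8, 8, 0, 0], [0, 0, 8, 8], [0, 0, 8, 8]]
rows = ["".join(str(round(v)) for v in row)
        for row in upscale_bilinear_clamped(image, 2)]
print("\n".join(rows))
```

The outermost rows and columns only come out as solid 8s and 0s because the filter clamps at the image edge, which is the behaviour being asked about.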

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Accept full CSS colors in the legacy color parsing algorithm

2011-04-13 Thread Philip Taylor
On Fri, Apr 8, 2011 at 10:26 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 4/8/11 1:54 PM, Tab Atkins Jr. wrote:

 In the legacy color parsing algorithm [...]
 Could we change those two steps to just say "If keyword is a valid CSS
 color value, then return the simple color corresponding to that
 value."?  (I guess, to fully match Webkit, you need to change the
 definition of simple color to take alpha into account.)

 Do you have web compat data here?

I don't know if this is relevant or useful but anyway:
http://philip.html5.org/data/font-colors.txt has some basic data for
<font color> values, http://philip.html5.org/data/bgcolors.txt for
body bgcolor. (Each line is the number of URLs that value was found
on (from the set from
http://philip.html5.org/data/dotbot-20090424.txt), followed by the
XML-encoded value.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas gradients color interpolation - change to premultiplied?

2010-11-23 Thread Philip Taylor
On Tue, Nov 23, 2010 at 8:43 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 Right now, canvas gradients interpolate their colors in
 non-premultiplied space; that is, the raw values of r, g, b, and a are
 interpolated independently.  This has the unfortunate effect that
 colors darken as they transition to transparent, as transparent is
 defined as rgba(0,0,0,0), a transparent black.  Under this scheme,
 the color halfway between yellow and transparent is
 rgba(127,127,0,.5), a partially-transparent dark yellow, rather than
 rgba(255,255,0,.5).*

If you define the gradient as interpolating from solid yellow to
transparent black, I'd expect that it *should* be semi-transparent
blackish-yellow in the middle.

If you want it to be pure yellow, don't use a keyword which is
explicitly specified as transparent black - define the gradient from
rgba(255,255,0,1) to rgba(255,255,0,0) instead. Then you'll get
rgba(255,255,0,0.5) in the middle.

 The rest of the platform has switched to using premultiplied colors
 for interpolation, because they react better in cases like this**.
 CSS transitions and CSS gradients now explicitly use premultiplied
 colors, and SVG ends up interpolating similarly (they don't quite have
 the same problem - they track opacity separate from color, so
 transitioning from color:yellow;opacity:1 to
 color:yellow;opacity:0 gives you color:yellow;opacity:.5 in the
 middle, which is the moral equivalent of rgba(255,255,0,.5)).

That sounds like SVG gradients *can't* be using premultiplied colours.
A transition from color:yellow;opacity:1 to color:black;opacity:0
will have rgba(127,127,0,0.5) in the middle, and it's impossible to
get that if you are using premultiplied colours. You'd have to have
A=1 at the start and A=0 at the end, so (with premultiplied colour)
the end would be interpreted as rgba(0,0,0,0), so you'd get the same
as interpolating to color:yellow;opacity:0 (i.e. rgba(255,255,0,0.5)
in the middle), which is not what SVG does.
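
The midpoint arithmetic in this exchange can be checked with a short sketch. It is Python purely for illustration; `mix_straight` and `mix_premultiplied` are hypothetical helpers, colours are (r, g, b, a) tuples with a in [0, 1], and no rounding to integers is done, so the dark yellow shows up as 127.5 rather than 127.

```python
def lerp(a, b, t):
    return a + (b - a) * t

def mix_straight(c0, c1, t):
    """Interpolate r, g, b and a independently (non-premultiplied)."""
    return tuple(lerp(x, y, t) for x, y in zip(c0, c1))

def mix_premultiplied(c0, c1, t):
    """Interpolate in premultiplied space, then convert back."""
    def premul(c):
        r, g, b, a = c
        return (r * a, g * a, b * a, a)

    r, g, b, a = mix_straight(premul(c0), premul(c1), t)
    if a == 0:
        # Fully transparent: the colour channels carry no information.
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)

YELLOW = (255, 255, 0, 1.0)
TRANSPARENT_BLACK = (0, 0, 0, 0.0)
TRANSPARENT_YELLOW = (255, 255, 0, 0.0)

print(mix_straight(YELLOW, TRANSPARENT_BLACK, 0.5))        # dark yellow at half alpha
print(mix_straight(YELLOW, TRANSPARENT_YELLOW, 0.5))       # pure yellow at half alpha
print(mix_premultiplied(YELLOW, TRANSPARENT_BLACK, 0.5))   # pure yellow at half alpha
print(mix_premultiplied(YELLOW, TRANSPARENT_YELLOW, 0.5))  # same as the line above
```

In premultiplied space both transparent endpoints collapse to (0, 0, 0, 0), so the two premultiplied results are identical, which is why the dark-yellow midpoint cannot be produced that way.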

http://www.w3.org/TR/SVGTiny12/painting.html#Gradients says explicitly
its behaviour is the non-premultiplied behaviour we currently get with
canvas. (gradient from fully transparent red, via partly transparent
dark yellow, to fully opaque lime - the RGB components of fully
transparent colours are preserved.)

Maybe CSS should have originally used the keyword transparentblack
instead of transparent (though the distinction didn't matter before
gradients existed) - changing the gradient algorithm solely to work
more intuitively when people happen to use that one particular
incorrectly-named keyword seems backwards, and a mistake in CSS.

(Perhaps CSS gradients could avoid this problem by overriding the
meaning of the transparent keyword, so that instead of rgba(0,0,0,0)
it means A=0 with the mean RGB of the adjacent colour stops. That
would let it work as people naturally expect when they use that
keyword, and they can use the rgba() syntax if they really want
transparent black or transparent yellow or transparent red etc.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Question about gradient stops in canvas and parsing as CSS colors

2010-09-24 Thread Philip Taylor
On Wed, Sep 22, 2010 at 3:49 PM, Anne van Kesteren ann...@opera.com wrote:
 On Wed, 22 Sep 2010 16:47:02 +0200, Boris Zbarsky bzbar...@mit.edu wrote:

 Clearly I happen to think Gecko's behavior is the sane one here, but
 there's a clear interoperability problem either way.  Certainly Opera and
 Gecko interpreted the spec differently.

 Might be the way we invoke the CSS code. I think the Gecko behavior makes
 sense. Philip, can your test suite cover this?

Added with the Gecko behaviour (and added a few other cases - Opera
10.61 fails some like rgba-solid-3):
http://dvcs.w3.org/hg/html/diff/5a95d6481bac/tests/submission/PhilipTaylor/tools/canvas/tests2d.yaml

Run the tests from
http://test.w3.org/html/tests/submission/PhilipTaylor/canvas/index.2d.fillStyle.parse.html
or (if that's down)
http://dvcs.w3.org/hg/html/raw-file/tip/tests/submission/PhilipTaylor/canvas/index.2d.fillStyle.parse.html

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas API: What should happen if non-finite floats are used

2010-09-08 Thread Philip Taylor
On Wed, Sep 8, 2010 at 9:02 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 On 9/8/10 2:22 PM, Oliver Hunt wrote:
 One old case that failed in the presence of exceptions was the old canvex
 demo at http://canvex.lazyilluminati.com/83/play.xhtml - this was one of the
 first cases I saw. After trying to make WebKit's implementation conform to
 the (older) spec by throwing exceptions on non-finite values, we had many
 canvas-using sites break, so we had to stop throwing.

 OK.  I can believe that this was the case at the time, but it certainly
 wasn't due to Firefox not throwing.  I can see how given people's penchant
 to create browser-specific content changing the webkit behavior could cause
 issues with sites that were targeting only webkit and didn't bother testing
 in anything else.

Canvex was originally written for and tested in Firefox 1.5/2.0 and
Opera 9. It wasn't tested in Safari (due to lack of Mac).

I think the relevant bug is
https://bugs.webkit.org/show_bug.cgi?id=13537 which was actually
caused by passing 0 sizes to drawImage, not by non-finite values.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas feedback (various threads)

2010-08-11 Thread Philip Taylor
On Wed, Aug 11, 2010 at 9:35 PM, Ian Hickson i...@hixie.ch wrote:
 On Thu, 29 Jul 2010, Gregg Tavares (wrk) wrote:
 source-over
    glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA);

 I tried searching the OpenGL specification for either glBlendFunc or
 GL_ONE_MINUS_SRC_ALPHA and couldn't find either. Could you be more
 specific regarding what exactly we would be referencing?  I'm not really
 sure I understand your proposal.

The OpenGL spec omits the gl/GL_ prefixes - search for BlendFunc
instead. (In the GL 3.0 spec, tables 4.1 (the FUNC_ADD row) and 4.2
seem relevant for defining the blend equations.)
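
As a rough illustration of what that blend function computes: with the additive blend equation, glBlendFunc(GL_ONE, GL_ONE_MINUS_SRC_ALPHA) weights the source by 1 and the destination by (1 - source alpha). The Python sketch below assumes premultiplied (r, g, b, a) colours with components in [0, 1]; `blend_source_over` is a hypothetical name, not a GL call.

```python
def blend_source_over(src, dst):
    """result = src * ONE + dst * ONE_MINUS_SRC_ALPHA, applied per channel."""
    src_alpha = src[3]
    return tuple(s + d * (1.0 - src_alpha) for s, d in zip(src, dst))

# Half-transparent yellow (premultiplied) drawn over opaque blue.
half_yellow = (0.5, 0.5, 0.0, 0.5)
opaque_blue = (0.0, 0.0, 1.0, 1.0)
print(blend_source_over(half_yellow, opaque_blue))
```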

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Default value of complete attribute on new Image objects

2010-08-10 Thread Philip Taylor
On Wed, Aug 11, 2010 at 12:23 AM, Ian Hickson i...@hixie.ch wrote:
 I've updated the spec to have complete return true if the src is the empty
 string.

Some canvas methods (drawImage, createPattern) are defined in terms of
the complete attribute ("If the image argument is an HTMLImageElement
object whose complete attribute is false, [...] then the
implementation must return without drawing anything."). Now that it
can be true when the image doesn't have any image data, what should
they do when passed such an image?

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Default value of complete attribute on new Image objects

2010-08-10 Thread Philip Taylor
On Wed, Aug 11, 2010 at 1:06 AM, Jonas Sicking jo...@sicking.cc wrote:
 On Tue, Aug 10, 2010 at 4:56 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Wed, Aug 11, 2010 at 12:23 AM, Ian Hickson i...@hixie.ch wrote:
 I've updated the spec to have complete return true if the src is the empty
 string.

 Some canvas methods (drawImage, createPattern) are defined in terms of
 the complete attribute ("If the image argument is an HTMLImageElement
 object whose complete attribute is false, [...] then the
 implementation must return without drawing anything."). Now that it
 can be true when the image doesn't have any image data, what should
 they do when passed such an image?

 Isn't the image fully loaded, just empty?

Depends how you define the concept of fully loaded, I guess. The
spec says an empty src is invalid and triggers an error event and
makes the image not available (but now also complete), so it's not
entirely the same as a normal non-empty image.

 Seems like drawing such an
 image should act normal. It just so happens that normal for an
 empty image would be to draw nothing?

 Just have to avoid divide-by-zero errors when creating patterns :)

Probably should do the same as a 0-pixel canvas ("If the image
argument is an HTMLCanvasElement object with either a horizontal
dimension or a vertical dimension equal to zero, then the
implementation must raise an INVALID_STATE_ERR exception."). (The spec
currently assumes complete HTMLImageElements always have non-zero
size, so the dimension check isn't applied to them.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] HTML resource packages

2010-08-04 Thread Philip Taylor
On Wed, Aug 4, 2010 at 1:31 AM, Justin Lebar justin.le...@gmail.com wrote:
 We at Mozilla are hoping to ship HTML resource packages in Firefox 4,
 and we wanted to get the WhatWG's feedback on the feature.

 For the impatient, the spec is here:

    http://people.mozilla.org/~jlebar/respkg/

It seems a bit surprising that [pkg.zip img1.png img2.png] provides
more files than [pkg.zip img1.png] but *fewer* files than [pkg.zip]
(which includes all files). I can imagine people would write code
like:

  print "<html packages='[cached-image-thumbnails.zip " . (join " ",
@thumbnails_which_are_not_out_of_date) . "]'>";

(intending the package to be updated infrequently, and used only for
images that haven't been modified since the last package update), and
they would get completely the wrong behaviour when the list is empty.
So maybe [pkg.zip] should mean no files (vs pkg.zip which still
means all files).


Filenames in zips are byte-strings, not Unicode-character-strings.
What should happen with non-ASCII in the zip's list of contents?
People will use standard zip programs and frequently end up with
various random character encodings in their file - would browsers
guess or decode as CP437 or decode as UTF-8 or fail? would they look
at the zip header's language encoding flag? etc.
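
For reference, the language encoding flag is bit 11 (0x800) of each entry's general purpose flags. The Python sketch below only shows how that bit can be inspected with the standard zipfile module; it says nothing about what browsers would or should do. `filename_encodings` and `make_sample_zip` are hypothetical helpers, and labelling unflagged names as CP437 follows the zip convention rather than any check of the actual bytes.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # general purpose bit 11: filename is UTF-8

def filename_encodings(zip_bytes):
    """Map each entry name to the encoding its header flags declare."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {
            info.filename: "utf-8" if info.flag_bits & UTF8_FLAG else "cp437"
            for info in zf.infolist()
        }

def make_sample_zip():
    """Build a tiny in-memory zip with one ASCII and one non-ASCII name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("logo.png", b"")
        zf.writestr("caf\u00e9.png", b"")
    return buf.getvalue()

print(filename_encodings(make_sample_zip()))
```

Archives written by other tools may leave the bit unset while still storing names in some local code page, which is the ambiguity raised above.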


What happens if the document contains multiple html elements (not
all the root element)? (e.g. if it's XHTML, or the elements are added
by scripts). The packages spec seems to assume there is only ever one.


The note at the end of 4.1 seems to be about avoiding problems like
http://evil.com/ saying:

<html packages="eviloverride.zip"> <!-- gets downloaded from evil.com -->
<base href="http://bank.com/">
<img src="http://bank.com/logo.png"> <!-- this shouldn't be
allowed to come from the .zip -->

Why is this particular example an important problem? If the attacker
wants to insert their own files into their own pages, they can just do
it directly without using packages. Since this is (I assume) only used
for resources like images and scripts and stylesheets, and not for a
hrefs or iframe hrefs, I don't see how it would let the attacker
circumvent any same-origin restrictions or do anything else dangerous.

The opposite way seems more dangerous, where evil.com says:

<html
packages="http://evil.com/redirect.cgi?http://secret-bank-intranet-server/packages.zip">
<img src="http://evil.com/logo.png">
<!-- now use canvas to read the pixel data of the secret logo,
since it was loaded from the evil.com origin -->

Is anything stopping that?

In 4.3 step 2: What is pkg-url initialised to? (The package href of p?)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] HTML resource packages

2010-08-04 Thread Philip Taylor
On Wed, Aug 4, 2010 at 9:01 PM, Justin Lebar justin.le...@gmail.com wrote:
 What happens if the document contains multiple html elements (not
 all the root element)? (e.g. if it's XHTML, or the elements are added
 by scripts). The packages spec seems to assume there is only ever one.

 The packages attribute should work like the manifest attribute currently works.
 I don't see language in the cache manifest section of HTML5 (6.6) specifying
 what happens when there are multiple html elements, so I hope I don't need to
 specify this either.  :)

http://whatwg.org/html#attr-html-manifest says:

  The manifest attribute only has an effect during the early stages
of document load. Changing the attribute dynamically thus has no
effect (and thus, no DOM API is provided for this attribute).

Its effect is triggered from http://whatwg.org/html#parser-appcache
(html token in the before html insertion mode) or from
http://whatwg.org/html#read-xml , so it will only ever run for the
root html element of the document.

The packages attribute is defined as running "Whenever the packages
attribute is changed (including when the document is first loaded, if
its html element has a packages attribute)", so it's not the same.
If you do want it to work the same then you'll need to hook into the
parser and ignore dynamic updates.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Allowing > in attribute values

2010-06-25 Thread Philip Taylor
On Thu, Jun 24, 2010 at 2:34 PM, Benjamin M. Schwartz
bmsch...@fas.harvard.edu wrote:
 [...]
 HTML5 is about making a spec that matches common practice, right?  In
 practice, no one puts > in attribute values.

The data disagrees: http://philip.html5.org/data/gt-in-attribute.txt

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] WebSockets: UDP

2010-06-03 Thread Philip Taylor
On Thu, Jun 3, 2010 at 7:28 AM, Erik Möller emol...@opera.com wrote:
 [...]
 One thing to remember here is that browsers have other means for
 communication as well. I'm not saying we shouldn't support reliable messages
 over UDP, but just pointing out the option.

Yep - the relevant use cases ought to be supported decently by the
platform, but not necessarily by this extension to the platform (it
might be a different extension or it might be (probably is) supported
already).

 - Protection against an attacker initiating a legitimate socket with a
 user and then redirecting it (with some kind of IP (un)hijacking) to a
 service behind the user's firewall (which isn't a problem when using
 TCP since the service will ignore packets when it hasn't done the TCP
 handshake; but UDP services might respond to a single packet from the
 middle of a websocket stream, so every single packet will have to be
 careful not to be misinterpreted dangerously by unsuspecting
 services).

 I don't quite follow what you mean here. Can you expand on this with an
 example?

I was thinking something like: A host at IP 11.11.11.11 on the public
internet runs some UDP service, like DNS or TFTP or something a bit
more secure. That service is restricted so it only responds to packets
received from IP 22.22.22.22 (a trusted user). The UDP Web Socket
handshake is carefully constructed so that it won't trigger dangerous
behaviour in any of those services (like how the TCP Web Socket uses a
safe HTTP-ish handshake).

An attacker hijacks the IP 11.11.11.11 from the perspective of the
user (by advertising new routes near the user), so the user's packets
to that address go to the attacker. The attacker gets the user to
visit a web page which sets up a UDP Web Socket with the attacker's
server at 11.11.11.11, doing all the handshake authentication
correctly.

The attacker then releases its hijacked address, so any subsequent Web
Socket packets will go to the original restricted service. Since
they're being received from the trusted user, the service will trust
them. Since the web browser has already done the Socket handshake, it
will believe it's talking to a legitimate Web Socket server and will
continue sending whatever data packets the attacker's script tells it
to.

The service will then be receiving and responding to
attacker-controlled packets, and will never have seen the carefully
constructed handshake that's designed to protect it. That's not a
danger for TCP services since they'll reject unexpected packets from
the middle of a TCP stream, but UDP services may accept packets from
the middle of a UDP Web Socket stream.

So it's not sufficient to carefully construct the Web Socket handshake
packets to not trigger unwanted behaviour in non-Socket services.
Every data packet sent on the Socket has to be carefully constructed
too.

(This might be a largely impractical or pointless attack, and there
are probably much easier ways to attack the exposed service, but I don't
know enough about security to judge that. Also I don't know what
packet construction would be sufficiently careful. But it seems like a
possible new concern that's introduced when using UDP in this
context.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] WebSockets: UDP

2010-06-02 Thread Philip Taylor
On Tue, Jun 1, 2010 at 9:02 PM, Erik Möller emol...@opera.com wrote:
 On Tue, 01 Jun 2010 21:14:33 +0200, Philip Taylor excors+wha...@gmail.com
 wrote:

 More feedback is certainly good, though I think the libraries I
 mentioned (DirectPlay/OpenTNL/RakNet/ENet (there's probably more)) are
 useful as an indicator of common real needs (as opposed to edge-case
 or merely perceived needs) - they've been used by quite a few games
 and they seem to have largely converged on a core set of features, so
 that's better than just guessing.

 I guess many commercial games write their own instead of reusing
 third-party libraries, and I guess they often reimplement very similar
 concepts to these, but it would be good to have more reliable
 information about that.


 I was hoping to be able to avoid looking at what the interfaces of a high vs
 low level option would look like this early on in the discussions, but
 perhaps we need to do just that; look at Torque, RakNet etc and find a least
 common denominator and see what the reactions would be to such an interface.

I'm trying to think of them mainly as indirect examples of use cases,
rather than as direct examples of interfaces. Under the assumption
that most games either use a library like these or implement a
comparable one themselves, and that the library designs are driven by
the game requirements, if a feature is supported by most of the
libraries then it's probably needed by many games; and if a feature is
unsupported in many of the libraries then it's probably unnecessary
for most games. (Also an assumption: games running in web browsers
will have similar needs to native games (though lagging many years
behind state-of-the-art); and we only ought to aim to support the
needs of most games, not all games.)

So they seem to suggest things like:
- many games need a combination of reliable and unreliable-ordered and
unreliable-unordered messages.
- many games need to send large messages (so the libraries do
automatic fragmentation).
- many games need to efficiently send tiny messages (so the libraries
do automatic aggregation).
- many games need some kind of security (I have no idea exactly what,
or how much is still relevant when the client is JavaScript and
trivial to tamper with).
- many games need to prioritise certain messages when bandwidth is limited.
- most games don't need low-level control over individual datagrams
and precise packet loss feedback, they're okay with the socket details
being abstracted away.
- ... probably lots more (and/or less); I'm not very familiar with the
details of the libraries so this is unlikely to be an accurate list,
but I think it may be a useful way to analyse the requirements.
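
To make the first item concrete, here is a minimal Python sketch of the receiving side of an unreliable-ordered channel: late or duplicate messages are dropped rather than retransmitted or reordered. The class name and the integer sequence numbers are assumptions for illustration, not taken from any of the libraries mentioned, and sequence-number wraparound is ignored.

```python
class UnreliableOrderedReceiver:
    """Delivers a message only if it is newer than the last one delivered."""

    def __init__(self):
        self.last_seq = -1

    def receive(self, seq, payload):
        if seq <= self.last_seq:
            return None  # stale or duplicate: drop it, nothing is resent
        self.last_seq = seq
        return payload

receiver = UnreliableOrderedReceiver()
arrivals = [(0, "pos A"), (2, "pos C"), (1, "pos B"), (3, "pos D")]
delivered = [p for p in (receiver.receive(s, m) for s, m in arrivals) if p is not None]
print(delivered)
```

An unreliable-unordered channel would simply deliver everything that arrives, and a reliable one would add acknowledgements and retransmission on top.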

(The solution suggested in your initial post
(socket.send(data_smaller_than_mtu) going over UDP) seems to be one
extreme, which combines with higher-level JS libraries to satisfy
these needs. I think I initially suggested the other extreme of
encoding all the features into the browser API. I guess the best
tradeoff depends largely on what non-game use cases exist that should
be satisfied by the same solution.)

 So, what would the minimal set of limitations be to make a UDP WebSocket
 browser-safe?

 -No listen sockets
 -No multicast
 -Reliable handshake with origin info
 -Automatic keep-alives
 -Reliable close handshake
 -Socket is bound to one address for the duration of its lifetime
 -Sockets open sequentially (like current DOS protection in WebSockets)
 -Cap on number of open sockets per server and total per user agent

Perhaps also:
- Cap or dynamic limit on bandwidth (you don't want a single web page
flooding the user's network connection and starving all the TCP
connections)
- Protection against session hijacking
- Protection against an attacker initiating a legitimate socket with a
user and then redirecting it (with some kind of IP (un)hijacking) to a
service behind the user's firewall (which isn't a problem when using
TCP since the service will ignore packets when it hasn't done the TCP
handshake; but UDP services might respond to a single packet from the
middle of a websocket stream, so every single packet will have to be
careful not to be misinterpreted dangerously by unsuspecting
services).
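
One common way to express a bandwidth cap like the first item is a token bucket. The Python sketch below is only an illustration under assumed numbers (bytes per second, burst size); the class and its parameters are hypothetical, and the clock is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Allows bursts up to burst_bytes, refilled at rate_bytes_per_sec."""

    def __init__(self, rate_bytes_per_sec, burst_bytes):
        self.rate = rate_bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def try_send(self, size, now):
        # Refill according to elapsed time, never above the burst capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size > self.tokens:
            return False  # over the limit: caller should queue or drop
        self.tokens -= size
        return True

bucket = TokenBucket(rate_bytes_per_sec=1000, burst_bytes=1500)
results = [bucket.try_send(1200, now) for now in (0.0, 0.5, 1.0)]
print(results)
```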

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] WebSockets: UDP

2010-06-01 Thread Philip Taylor
On Tue, Jun 1, 2010 at 11:12 AM, Erik Möller emol...@opera.com wrote:
 The use case I'd like to address in this post is Real-time client/server
 games.

 The majority of the on-line games of today use a client/server model over
 UDP and we should try to give game developers the tools they require to
 create browser based games. For many simpler games a TCP based protocol is
 exactly what's needed but for most real-time games a UDP based protocol is a
 requirement. [...]

 It seems to me the WebSocket interface can be easily modified to cope with
 UDP sockets [...]

As far as I'm aware, games use UDP because they can't use TCP (since
packet loss shouldn't stall the entire stream) and there's no
alternative but UDP. (And also because peer-to-peer usually requires
NAT punchthrough, which is much more reliable with UDP than with TCP).
They don't use UDP because it's a good match for their requirements,
it's just the only choice that doesn't make their requirements
impossible.

There are lots of features that seem very commonly desired in games: a
mixture of reliable and unreliable and reliable-but-unordered channels
(movement updates can be safely dropped but chat messages must never
be), automatic fragmentation of large messages, automatic aggregation
of small messages, flow control to avoid overloading the network,
compression, etc. And there are lots of libraries that build on top of
UDP to implement protocols halfway towards TCP in order to provide
those features:
http://msdn.microsoft.com/en-us/library/bb153248(VS.85).aspx,
http://opentnl.sourceforge.net/doxydocs/fundamentals.html,
http://www.jenkinssoftware.com/raknet/manual/introduction.html,
http://enet.bespin.org/Features.html, etc.

UDP sockets seem like a pretty inadequate solution for the use case of
realtime games - everyone would have to write their own higher-level
networking libraries (probably poorly and incompatibly) in JS to
provide the features that they really want. Browsers would lose the
ability to provide much security, e.g. flow control to prevent
intentional/accidental DOS attacks on the user's network, since they
would be too far removed from the application level to understand what
they should buffer or drop or notify the application about.

I think it'd be much more useful to provide a level of abstraction
similar to those game networking libraries - at least the ability to
send reliable and unreliable sequenced and unreliable unsequenced
messages over the same connection, with automatic
aggregation/fragmentation so you don't have to care about packet
sizes, and dynamic flow control for reliable messages and maybe some
static rate limit for unreliable messages. The API shouldn't expose
details of UDP (you could implement exactly the same API over TCP,
with better reliability but worse latency, or over any other protocols
that become well supported in the network).
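
As a sketch of the fragmentation half of that idea: a large message is split into chunks small enough for one datagram, each tagged so the receiver can put them back together in any arrival order. This is Python for illustration only; the tuple layout (message id, index, total, chunk) and the function names are assumptions, not a proposed wire format.

```python
def fragment(message_id, data, max_payload):
    """Split data into (message_id, index, total, chunk) tuples."""
    chunks = [data[i:i + max_payload] for i in range(0, len(data), max_payload)] or [b""]
    total = len(chunks)
    return [(message_id, index, total, chunk) for index, chunk in enumerate(chunks)]

def reassemble(fragments):
    """Return the original bytes, or None while fragments are still missing."""
    if not fragments:
        return None
    total = fragments[0][2]
    by_index = {index: chunk for _, index, _, chunk in fragments}
    if len(by_index) != total:
        return None
    return b"".join(by_index[i] for i in range(total))

data = bytes(range(256)) * 10          # 2560 bytes
frags = fragment(7, data, 1000)        # three fragments
print(len(frags), reassemble(list(reversed(frags))) == data)
```

Aggregation is the mirror image: several small messages packed into one datagram until the size limit is reached.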

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] WebSockets: UDP

2010-06-01 Thread Philip Taylor
On Tue, Jun 1, 2010 at 2:00 PM, Erik Möller emol...@opera.com wrote:
 [...]
 I've never heard any gamedevs complain how poorly UDP matches their needs so
 I'm not so sure about that, but you may be right it would be better to have
 a higher level abstraction. If we are indeed targeting the game developing
 community we should ask for their feedback rather than guessing what they
 prefer. I will grep my linked-in account for game-devs tonight and see if I
 can gather some feedback.

More feedback is certainly good, though I think the libraries I
mentioned (DirectPlay/OpenTNL/RakNet/ENet (there's probably more)) are
useful as an indicator of common real needs (as opposed to edge-case
or merely perceived needs) - they've been used by quite a few games
and they seem to have largely converged on a core set of features, so
that's better than just guessing.

I guess many commercial games write their own instead of reusing
third-party libraries, and I guess they often reimplement very similar
concepts to these, but it would be good to have more reliable
information about that.

 I suspect they prefer to be empowered with UDP rather than boxed into a
 high level protocol that doesn't fit their needs but I may be wrong.

If you put it like that, I don't see why anybody would not want to be
empowered :-)

But that's not the choice, since they could never really have UDP -
the protocol will perhaps have to be Origin-based, connection-oriented
(to exchange Origin information etc), with complex packet headers so
you can't trick it into talking to a DNS server, with rate limiting in
the browser to prevent DOS attacks, restricted to client-server (no
peer-to-peer since you probably can't run a socket server in the
browser), etc.

Once you've got all that, a simple UDP-socket-like API might not be
the most natural or efficient way to implement a higher-level
partially-reliable protocol - the application couldn't cooperate with
the low-level network buffering to prioritise certain messages, it
couldn't use the packet headers that have already been added on top of
UDP, it would have to send acks from a script callback which may add
some latency after a packet is received from the network, etc. So I
think there are some tradeoffs and it's not a question of one low-level
protocol vs one strictly more restrictive higher-level protocol.

 So the question to the gamedevs will be, and please make suggestions for
 changes and I'll do an email round tonight:

 If browser and server vendors agree on and standardize a socket based
 network interface to be used for real-time games running in the browsers, at
 what level would you prefer the interface to be?
 (Note that an interface for communicating reliably via TCP and TLS is
 already implemented.)
 - A low-level interface similar to a plain UDP socket
 - A medium-level interface allowing for reliable and unreliable channels,
 automatically compressed data, flow control, data priority etc
 - A high-level interface with ghosted entities

That first option sounds like you're offering something very much like
a plain UDP socket (and I guess anyone who's willing to write their
own high-level wrapper (which is only hundreds or thousands of lines
of code and not a big deal for a complex game) would prefer that since
they want as much power as possible), but (as above) I think that's
misleading - it's really a UDP interface on top of a protocol that has
some quite different characteristics to UDP. So I think the question
should be clearer that the protocol will necessarily include various
features and restrictions on top of UDP, and the choice is whether it
includes the minimal set of features needed for security and hides
them behind a UDP-like interface or whether it includes higher-level
features and exposes them in a higher-level interface.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] What will not work when we do not have server ?

2010-03-29 Thread Philip Taylor
On Mon, Mar 29, 2010 at 4:27 PM, Tab Atkins Jr. jackalm...@gmail.com wrote:
 On Mon, Mar 29, 2010 at 7:05 AM, narendra sisodiya
 narendra.sisod...@gmail.com wrote:
 Dear all,
     I have been making (for a long time now) some e-learning modules using HTML5.
 The idea is to make fully interactive lectures (audio, video, SVG
 animations, JavaScript, canvas, all sorts of good new web technologies, etc.).
 But there is a little problem. Students will be able to download them as a zip
 file. When they want to watch those HTML5-based interactive tutorials, all
 they need to do is click on index.html, which will open the tutorial.
  I want to ask what will not work in this mode.
 For example, I have checked that some basic jQuery Ajax demos work
 well at both URLs:
 http://localhost/narendra/demo.html OR file:///var/www/narenda/demo.html

 I want to know the list of all such drafts which will not work without a
 server, so that I can avoid them or try to find some workaround.

 Anything that requires a server-side language (PHP, ASP, Python, Ruby,
 etc.) won't work.  Anything that requires only client-side languages
 (HTML, CSS, Javascript) will.

But you also need to be careful about security rules for file://
differing from http://, e.g. Firefox 3 apparently considers files in
parent directories to be non-same-origin, so you can't use
XMLHttpRequest to get ../foo/bar.txt, and if you have an <img
src="../images/example.png"> and draw it on a canvas then you won't
be able to call toDataURL or getImageData, whereas it would be fine if
the files were on an http:// site.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Offscreen canvas (or canvas for web workers).

2010-03-15 Thread Philip Taylor
On Mon, Mar 15, 2010 at 7:05 AM, Maciej Stachowiak m...@apple.com wrote:
 Copying from one canvas to another is much faster than copying to/from
 ImageData. To make copying to a Worker worthwhile as a responsiveness
 improvement for rotations or downscales, in addition to the OffscreenCanvas
 proposal we would need a faster way to copy image data to a Worker. One
 possibility is to allow an OffscreenCanvas to be copied to and from a
 background thread. It seems this would be much much faster than copying via
 ImageData.

Maybe this indicates that implementations of getImageData/putImageData
ought to be optimised? e.g. do the expensive multiplications and
divisions in the premultiplication code with SIMD. (A seemingly
similar thing at http://bugzilla.openedhand.com/show_bug.cgi?id=1939
suggests SSE2 makes things 3x as fast). That would avoid the need to
invent new API, and would also benefit anyone who wants to use
ImageData for other purposes.
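
(To make the cost concrete, here is a rough scalar sketch in Python of the
per-pixel arithmetic involved. It is not taken from any browser: the
function names and the rounding choices are assumptions for illustration.
Storing a pixel means premultiplying each colour channel by alpha, and
reading it back means dividing the alpha out again, so every channel costs
a multiply and a divide. A SIMD version would do the same arithmetic on
several channels at once.)

```python
def premultiply(pixel):
    """Convert a non-premultiplied (r, g, b, a) pixel, 0-255 per channel,
    to premultiplied form: one multiply and one divide per colour channel."""
    r, g, b, a = pixel
    # (c * a + 127) // 255 rounds c * a / 255 to the nearest integer.
    return tuple((c * a + 127) // 255 for c in (r, g, b)) + (a,)


def unpremultiply(pixel):
    """Approximate inverse of premultiply. Fully transparent pixels carry
    no colour information, so they come back as transparent black."""
    r, g, b, a = pixel
    if a == 0:
        return (0, 0, 0, 0)
    # (c * 255 + a // 2) // a rounds c * 255 / a to the nearest integer.
    return tuple(min(255, (c * 255 + a // 2) // a) for c in (r, g, b)) + (a,)


stored = premultiply((255, 128, 0, 128))
restored = unpremultiply(stored)
```

The round trip is exact for this pixel, but in general low alpha values
lose colour precision, which is a separate issue from speed.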

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Multiple file download

2010-03-10 Thread Philip Taylor
On Wed, Mar 10, 2010 at 5:51 PM, Eric Uhrhane er...@google.com wrote:
 On Wed, Mar 10, 2010 at 12:28 AM, timeless timel...@gmail.com wrote:
 http://www.pkware.com/documents/casestudies/APPNOTE.TXT V. General
 Format of a .ZIP file

 the zip format is fairly streaming friendly, the directory is at the
 end of the file. And if you're actually generating a file which has so
 many records that you can't remember all of them, you're probably
 trying to attack my user agent, so I'm quite happy that you'd fail.

 Isn't a format that has its directory at the end about as
 streaming-UNfriendly as you can get?  You need to pull the whole thing
 down before you can take it apart.  With a .tar.gz, you can unpack
 files as they arrive.

Each file's compressed data is preceded with a header with enough
information to decompress it (filename etc), and then that information
is duplicated in the central directory at the end, so I believe you
can still do streaming decompression (as well as doing random access
once you've got the directory). And you can still do streaming
compression without even buffering a single file, by setting a flag
and moving a part of the file header (lengths and checksum) to just
after the compressed file data.

(But I never understood why pkunzip asked me to put in the last floppy
disk of a multi-disk zip before it would start decompressing the first
- maybe there's some reason that streaming decompression doesn't quite
work perfectly in practice?)
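
(A minimal sketch of the streaming-decompression half of this, in Python.
It uses the standard zipfile module only to build a test archive, then
reads the entries strictly front to back from the local file headers,
without ever parsing the central directory. The helper name stream_entries
is made up, and the sketch assumes the writer recorded the sizes in each
local header; the flag-and-trailing-descriptor case described above is
detected but deliberately not handled.)

```python
import io
import struct
import zipfile
import zlib

# Fixed 30-byte local file header that precedes each file's data.
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def stream_entries(stream):
    """Yield (name, data) for each file, reading sequentially and stopping
    as soon as something other than a local file header is seen."""
    while True:
        raw = stream.read(LOCAL_HEADER.size)
        if len(raw) < LOCAL_HEADER.size or raw[:4] != b"PK\x03\x04":
            return  # reached the central directory (or end of input)
        (_sig, _ver, flags, method, _time, _date, _crc,
         comp_size, _size, name_len, extra_len) = LOCAL_HEADER.unpack(raw)
        if flags & 0x08:
            # Sizes were deferred to a descriptor after the data (the
            # streaming-compression case); not handled in this sketch.
            raise NotImplementedError("data descriptor entries")
        encoding = "utf-8" if flags & 0x800 else "cp437"
        name = stream.read(name_len).decode(encoding)
        stream.read(extra_len)  # skip the extra field
        data = stream.read(comp_size)
        if method == 8:  # deflate, stored as a raw stream
            data = zlib.decompressobj(-15).decompress(data)
        elif method != 0:  # 0 means stored without compression
            raise NotImplementedError("compression method %d" % method)
        yield name, data


# Build a small archive in memory, then read it back sequentially.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("a.txt", b"hello")
    zf.writestr("dir/b.txt", b"world" * 100)
archive.seek(0)
entries = list(stream_entries(archive))
```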

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Parsing processing instructions in HTML syntax: 10.2.4.44 Bogus comment state

2010-03-03 Thread Philip Taylor
On Wed, Mar 3, 2010 at 10:55 AM, Brett Zamir bret...@yahoo.com wrote:
 On 3/2/2010 6:54 PM, Ian Hickson wrote:

 On Tue, 2 Mar 2010, Elliotte Rusty Harold wrote:


 Briefly it seems that <? causes the parser to go into Bogus comment
 state, which is fair enough. (I wouldn't really recommend that anyone
 use processing instructions in HTML syntax anyway.) However the parser
 comes out of that state at the first >. Because processing instructions
 can contain > and terminate only at the two character sequence ?> this
 could cause PI processing to terminate early and leave a lot more error
 handling and a confused parser state in the text yet to come.


 In HTML4, PIs ended at the first >, not at ?>. <?target data> is the
 syntax of PIs when the SGML options used by HTML4 are applied.

 In any case, the parser in HTML5 is based on what browsers do, which is
 also to terminate at the first >. It's unlikely that we can change that,
 given backwards-compatibility needs.


 Are there really a lot of folks out there depending on old HTML4-style
 processing instructions not being broken?

Yes, e.g. a load of pages like
http://www.forex.com.cn/html/2008-01/821561.htm (to pick one example
at random) say:

  <?xml:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" />

and don't have the string ?> anywhere.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Feature proposal - add method to CanvasRenderingContext2D

2010-03-03 Thread Philip Taylor
On Wed, Mar 3, 2010 at 1:08 PM, František Řezáč
frantisek.re...@calavera.info wrote:
 Description
 add overload of (or add similarly called) method createImageData to
 interface CanvasRenderingContext2D which would take two arguments:
 - encodedImageBinaryData
 - dataMimeType
 which are rather self explanatory.

 Reason
 The reason is to be able to supply output of the future File API
 standard (http://www.w3.org/TR/FileAPI/) into canvas.

The canvas API already lets you do:

  var img = new Image();
  img.onload = function() {
    ctx.drawImage(img, 0, 0);
    // do processing on the canvas
  };
  // get this string from readAsDataURL etc
  img.src = 'data:image/png;base64,...';

Is that sufficient for your use case?

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Error: Stray doctype.

2010-02-12 Thread Philip Taylor
On Fri, Feb 12, 2010 at 4:08 PM, Dean Edwards dean.edwa...@gmail.com wrote:
 http://html5.validator.nu/?doc=http://www.whatwg.org/specs/web-apps/current-work/multipage/

Oops, looks like a consequence of moving the multipage script to a
server with a different version of lxml. Fixed.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] some thoughts on sandboxed IFRAMEs

2010-02-05 Thread Philip Taylor
On Thu, Feb 4, 2010 at 11:12 AM, Ian Hickson i...@hixie.ch wrote:
 On Mon, 25 Jan 2010, Alex Russell wrote:

 AFAICT, the objections fall into several buckets:

   1.) Users might pick badly or may re-use nonces when they shouldn't.
   2.) Escaping  is believed to be more secure because it's likely to
 break more often, raising developer awareness
   3.) The fix to correct escaping problems is believed to be more reliable

 I'm interested in 2 and 3. Users will do dumb things, and both 2 and 3
 assumes a similar baseline scenario as 1; a developer did something
 dumb. Nonces need not be cryptographically strong for most apps, so
 the big problem is re-use. UA's have broad leeway here to prevent
 re-use on origins and deny sandboxing to containers that re-use the
 same nonces on a single page. They can even help by keeping a list of
 recently used nonces and denying reuse.

 Could you elaborate on how one could avoid reuse? That seems like a bad
 idea, since it would prevent any non-client caching mechanism from
 working. The problem is not nonce re-use, it's that the token has to be
 either unpredictable or unspoofable. (It could be predictable and
 unspoofable if it was constructed using a diagonal of the user's text.)

Seems like it should be easy to get secure tokens by doing:

  $token = sha512_hex($input);
  print "<sandbox token=$token>$input</sandbox token=$token>";

(or whatever the sandbox syntax is), so there's no need to worry about
cryptographically secure RNGs or nonces or reuse or caching problems.
Is this what you meant by a diagonal of the user's text?

(I'm assuming here that the UA treats the token as an opaque blob, it
doesn't try to recompute the hash itself, so it's robust to changes in
character encoding etc. People could still choose insecure tokens
instead, but it's pretty trivial to use the hash solution correctly in
most programming environments (easier than good random numbers). To
attack it, you'd have to pick two strings X and Y and a hash H such
that hash(X + "</sandbox token=" + H + ">" + Y) = H, which for a good hash
function should be hard, I think.)
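
(The same idea as a small Python sketch, using the standard hashlib
module. The sandbox syntax here is only a placeholder, since the real
syntax was not settled, and wrap_untrusted is a made-up helper name. The
point it shows is that the token is derived from the whole input, so an
attacker who embeds a closing delimiter with a guessed token changes the
very hash they were trying to guess.)

```python
import hashlib


def wrap_untrusted(user_input):
    """Wrap untrusted text in a sandbox whose token is the SHA-512 hex
    digest of the text itself. The markup syntax is hypothetical."""
    token = hashlib.sha512(user_input.encode("utf-8")).hexdigest()
    return '<sandbox token="%s">%s</sandbox token="%s">' % (
        token, user_input, token)


# An attempt to close the sandbox early with a guessed token.
attack = 'x</sandbox token="deadbeef"><script>alert(1)</script>'
wrapped = wrap_untrusted(attack)

# The delimiter the UA would actually look for.
real_token = hashlib.sha512(attack.encode("utf-8")).hexdigest()
real_close = '</sandbox token="%s">' % real_token
```

Here the real closing delimiter appears exactly once, at the end, so the
injected one is treated as inert content.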

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] HTMLInputElement::valueAsNumber and NaN Infinity

2010-01-25 Thread Philip Taylor
On Mon, Jan 25, 2010 at 9:55 AM, TAMURA, Kent tk...@chromium.org wrote:
 It seems the current spec doesn't define behavior in a case of setting NaN
 or Infinity to HTMLInputElement::valueAsNumber.

http://whatwg.org/html5#float-nan : "Except where otherwise specified,
if an IDL attribute that is a floating point number type (float) is
assigned an Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR
exception must be raised."

This case seems to apply for valueAsNumber.
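
(As a sketch of what that rule amounts to, in Python rather than in any
browser's code: the exception class and function names below are stand-ins
invented for illustration, not real DOM bindings.)

```python
import math


class NotSupportedError(Exception):
    """Stand-in for the DOM NOT_SUPPORTED_ERR exception."""


def assign_float_attribute(value):
    """Reject non-finite values before they reach a float-typed
    attribute, as the quoted rule requires."""
    if math.isnan(value) or math.isinf(value):
        raise NotSupportedError("non-finite value: %r" % value)
    return float(value)


def is_rejected(value):
    """Report whether assigning the value would raise."""
    try:
        assign_float_attribute(value)
    except NotSupportedError:
        return True
    return False
```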

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-30 Thread Philip Taylor
On Mon, Nov 30, 2009 at 4:46 PM, Kenneth Russell k...@google.com wrote:
 CanvasPixelArray specifies that values greater than 255, including
 +inf, are clamped to 255 and values less than 0, including -inf, are
 clamped to zero. WebGLUnsignedByteArray (as people will see in the
 WebGL draft spec this week or next) specifies that the conversion is
 done with a C-style cast. The results are different for out-of-range
 values.

I was going to say: It doesn't include +/-inf, because
http://whatwg.org/html5#dependencies says "if a method with an
argument that is a floating point number type (float) is passed an
Infinity or Not-a-Number (NaN) value, a NOT_SUPPORTED_ERR exception
must be raised", and that probably applies to the CanvasPixelArray
setter method.

But it looks like the spec changed since I last looked, and the setter
takes an 'octet' argument, so I think the conversion should happen as
per http://dev.w3.org/2006/webapi/WebIDL/#es-octet and
CanvasPixelArray shouldn't define any conversion. (Filed as
http://www.w3.org/Bugs/Public/show_bug.cgi?id=8405). Hopefully WebIDL
and WebGL either match or can be made to match.
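
(To show how the two behaviours differ for out-of-range values, here is a
Python sketch. clamp_to_octet follows the clamping described in the quoted
text; treating NaN as 0 and rounding to nearest are assumptions.
wrap_to_octet is modelled on a modular conversion in the style of WebIDL's
octet type: truncate, then reduce modulo 256. Whether the 2009 drafts
matched that exactly is not something this sketch establishes. Both
function names are invented.)

```python
import math


def clamp_to_octet(value):
    """Clamping store: out-of-range values, including infinities,
    saturate at 0 or 255. NaN handling and rounding are assumptions."""
    if math.isnan(value) or value <= 0:
        return 0
    if value >= 255:
        return 255
    return int(round(value))


def wrap_to_octet(value):
    """Modular conversion: truncate toward zero, then reduce modulo 256.
    Non-finite values become 0."""
    if math.isnan(value) or math.isinf(value):
        return 0
    truncated = int(math.copysign(math.floor(abs(value)), value))
    return truncated % 256


samples = [300, -5, 255.9, float("inf"), float("-inf")]
clamped = [clamp_to_octet(v) for v in samples]
wrapped = [wrap_to_octet(v) for v in samples]
```

In-range integers come out the same either way; only out-of-range and
non-finite inputs expose the difference.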

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas pixel manipulation and performance

2009-11-29 Thread Philip Taylor
On Sun, Nov 29, 2009 at 6:59 PM, Kenneth Russell k...@google.com wrote:
 On Sat, Nov 28, 2009 at 9:47 PM, Boris Zbarsky bzbar...@mit.edu wrote:
 Are they even byte stores, necessarily?  I know in Gecko imagedata is just a
 JS array at the moment; it stores each of R,G,B,A as a JS Number (with the
 usual if it's an integer store as an integer optimization arrays do).
  That might well change in the future, and I hope it does, but that's the
 current code.

 I can't speak to what the behavior is in Webkit, and in particular whether
 it's even the same when using V8 vs Nitro.

 In Chromium (WebKit + V8), CanvasPixelArray property stores write
 individual bytes to memory. WebGLByteArray and WebGLUnsignedByteArray
 behave similarly but have simpler clamping semantics.

Would it be helpful (for simplicity or performance or consistency etc)
to change the specification of CanvasPixelArray to have those simpler
clamping semantics? (I don't expect there would be compatibility
problems with changing it now, particularly since Firefox doesn't
implement clamping at all in CPA.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Superset encodings [Re: ISO-8859-* and the C1 control range]

2009-10-22 Thread Philip Taylor
On Thu, Oct 22, 2009 at 9:23 PM, Øistein E. Andersen li...@coq.no wrote:
 On 22 Oct 2009, at 17:15, NARUSE, Yui wrote:

 Finally, Why ISO 2022 series is discouraged is not clear.

 We agree on this point.

The string 숍訊昱穿 encoded as ISO-2022-KR is the bytes 0e 3c 73 63 72
69 70 74 3e. A UA that doesn't support ISO-2022-KR (e.g. Chrome, when
I last checked) will decode it as Windows-1252 and get the string
"<script>", which is bad. So a site that uses ISO-2022-KR is very
likely to expose some users to XSS attacks, which seems like a good
reason to discourage that encoding. The same applies to other ISO-2022
encodings.
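
(The fallback half of this is easy to check with a few lines of Python.
The sketch only decodes the byte sequence given above as Windows-1252; it
does not exercise an ISO-2022-KR codec, and the variable names are
arbitrary.)

```python
# The byte sequence quoted above: SO (0x0e) followed by seven-bit data.
raw = bytes.fromhex("0e3c7363726970743e")

# A user agent that falls back to Windows-1252 sees the shift-out
# control character followed by plain ASCII.
fallback = raw.decode("cp1252")
contains_tag = "<script>" in fallback
```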

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas Proposal: aliasClipping property

2009-10-16 Thread Philip Taylor
On Fri, Oct 16, 2009 at 2:41 AM, Charles Pritchard ch...@jumis.com wrote:
 Having gone back and forth with Robert a bit: I was able to recall the whys
 of a particular issue
 that could be handled in this version of the spec, regarding compositing.

 As far as I can tell; the area (width and height, extent) of source image A
 [4.8.11.13 Compositing]
 when source image A is a shape, is not defined by the spec.

 And so in Chrome, when compositing with a shape, the extent of image A is
 only that width
 and height the shape covers, whereas in Firefox, the extent of image A is
 equivalent to the
 extent of image B (the current bitmap). This led to an incompatibility
 between the two browsers.

I think the spec is clear on this (at least when I last looked; not
sure if it's changed since then). Image A is infinite and filled with
transparent black, then you draw the shape onto it (with no
compositing yet), and then you composite the whole of image A (using
globalCompositeOperation) on top of the current canvas bitmap. With
some composite operations that's a different result than if you only
composited pixels within the extent of the shapes you drew onto image
A.

(With most composite operations it makes no visible difference,
because compositing transparent black onto a bitmap has no effect, so
this only affects a few unusual modes.)

There is currently no definition of what the extent of a shape is
(does it include transparent pixels? shadows? what about text with a
bitmap font? etc), and it sounds like a complicated thing to define
and to implement interoperably, and I don't see obvious benefits to
users, so the current specced behaviour (using infinite bitmaps, not
extents) seems to me like the best approach (and we just need everyone
to implement it).
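
(A toy model of the two interpretations, in Python, on a one-dimensional
canvas with premultiplied (colour, alpha) pixels. Everything here is an
illustrative assumption: the function names, the four-pixel canvas, and
the choice to model only three composite operations.)

```python
TRANSPARENT = (0.0, 0.0)


def composite(op, src, dst):
    """Porter-Duff compositing of one premultiplied (colour, alpha) pixel
    from image A (src) onto the canvas bitmap (dst)."""
    sc, sa = src
    dc, da = dst
    if op == "source-over":
        return (sc + dc * (1 - sa), sa + da * (1 - sa))
    if op == "copy":
        return (sc, sa)
    if op == "source-in":
        return (sc * da, sa * da)
    raise ValueError(op)


def draw(op, canvas, shape, whole_image):
    """Composite a shape onto the canvas. `shape` maps pixel index to a
    pixel; every other pixel of image A is transparent black. With
    whole_image=False only pixels covered by the shape are composited
    (the "extent" interpretation)."""
    out = []
    for i, dst in enumerate(canvas):
        if i in shape or whole_image:
            out.append(composite(op, shape.get(i, TRANSPARENT), dst))
        else:
            out.append(dst)
    return out


canvas = [(1.0, 1.0)] * 4                 # opaque white
shape = {1: (0.5, 1.0), 2: (0.5, 1.0)}    # opaque grey, middle two pixels
```

With source-over the two interpretations agree. With copy (and likewise
source-in) the whole-image version clears the uncovered pixels to
transparent black, while the extent version leaves them untouched.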

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas Proposal: aliasClipping property

2009-10-16 Thread Philip Taylor
On Fri, Oct 16, 2009 at 2:25 PM, Robert O'Callahan rob...@ocallahan.org wrote:
 On Sat, Oct 17, 2009 at 1:06 AM, Philip Taylor excors+wha...@gmail.com
 wrote:

 I think the spec is clear on this (at least when I last looked; not
 sure if it's changed since then). Image A is infinite and filled with
 transparent black, then you draw the shape onto it (with no
 compositing yet), and then you composite the whole of image A (using
 globalCompositeOperation) on top of the current canvas bitmap. With
 some composite operations that's a different result than if you only
 composited pixels within the extent of the shapes you drew onto image
 A.


 Ah, so you mean Firefox is right in this case?

Yes, mostly. 
http://philip.html5.org/tests/canvas/suite/tests/index.2d.composite.uncovered.html
has relevant tests, matching what I believed the spec said - on
Windows, Opera 10 passes them all, Firefox 3.5 passes all except
'copy' (https://bugzilla.mozilla.org/show_bug.cgi?id=366283), Safari 4
and Chrome 3 fail them all.

(Looking at the spec quickly now, I don't see anything that actually
states this explicitly - the only reference to infinite transparent
black bitmaps is when drawing shadows. But
http://www.whatwg.org/specs/web-apps/current-work/multipage/the-canvas-element.html#drawing-model
is phrased in terms of rendering shapes onto an image, then
compositing the image within the clipping region, so I believe it is
meant to work as I said (and definitely not by compositing only within
the extent of the shape drawn onto the image).)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Stripping newlines from URI attributes

2009-07-30 Thread Philip Taylor
On Thu, Jul 30, 2009 at 2:37 PM, Elliotte Rusty
Harold elh...@ibiblio.org wrote:
 On Wed, Jul 29, 2009 at 5:49 PM, Kartikaya
 Gupta lists.wha...@stakface.com wrote:
 It seems that most browsers do some sort of newline and tab removal from URI
 attributes. For example, if you have

 <img src="foo
 bar.jpg">

 browsers will still render the image called foobar.jpg despite the CRLF 
 pair in the middle of the src attribute.
 [...]

 This is an area where we should not attempt (and probably simply
 cannot) maintain compatibility with existing browsers. They're just
 too broken.

We should attempt to maintain compatibility with existing content, and
whitespace in URI attributes seems very common in existing content,
e.g.:

http://www.topdogphotos.com/photo-gallery/gallery11.html (newlines in
<a href>, <img src>)

http://www.sprig.com/coyuchi_george_or_thor_hooded_baby_towel (tabs
and &#xD;&#xA; in <img src>)

and loads more.
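
(Roughly what that compatibility behaviour looks like, as a Python sketch.
The exact set of characters removed varies between browsers, so the
choices below are assumptions, and clean_url_attribute is a made-up name.)

```python
def clean_url_attribute(value):
    """Approximate what browsers do with whitespace in URL attributes:
    trim leading and trailing whitespace, then drop embedded tabs,
    carriage returns and line feeds. Embedded spaces are kept."""
    stripped = value.strip(" \t\r\n\f")
    return "".join(ch for ch in stripped if ch not in "\t\r\n")


src = clean_url_attribute("foo\r\nbar.jpg")
```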

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] the cite element

2009-07-27 Thread Philip Taylor
On Mon, Jul 27, 2009 at 3:20 PM, Erik Vorhes e...@textivism.com wrote:
 On Sun, Jul 19, 2009 at 4:58 AM, Ian Hickson i...@hixie.ch wrote:

 In practice, people haven't been confused between these two attributes as
 far as we can tell. People who use <cite> seem to use it for titles, and
 people who use cite="" seem to use it for URLs. (The latter is rare.)


 See http://www.four24.com/; note near the top of the source:
 <blockquote id="verse" cite="John 4:24">...

See http://philip.html5.org/data/cite-attribute-values.txt for some
data. (Looks like non-URI values are quite rare.)
Also maybe relevant: see http://philip.html5.org/data/cite.txt for
some older data about <cite>. (Looks like non-title uses are very
common.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Validation

2009-07-21 Thread Philip Taylor
On Tue, Jul 21, 2009 at 11:22 AM, Kristof
Zelechovski giecr...@stegny.2a.pl wrote:
 <!DOCTYPE html6> would be an abomination, unless the root element changes to
 <html6> also :-)

Also it would trigger quirks mode in many existing browsers, and in
any conforming HTML5 implementation. You'd have to use something like
<!DOCTYPE html SYSTEM "6"> as the shortest string that provides a
version identifier, if you insist on putting it in the doctype.

(The HTML5 doctype reflects that in practice there aren't several
independent carefully-separated languages - there's just a single
vaguely-defined mess called HTML, described in a range of
specifications and sometimes not specified at all, implemented
incrementally with various extensions and bugs and missing features in
various browsers, with people writing pages that mix all the different
features together. The version numbering is an artifact of the W3C's
process of developing a numbered sequence of specifications, and isn't
aligned with how HTML browsers or documents are usually written.

If you want to check that your pages are compatible with certain
browser releases, the language version number is a very bad
approximation - you'd want a tool that understands what features IE10
supports (maybe some (but not all) from HTML4, some (but not all) from
HTML5, some proprietary extensions, etc), and it would be misleading
to think that a pure HTML-version-N validator is going to be good
enough for that. Maybe you want some in-band mechanism for identifying
which pages a spider should check with which rules, but then something
like <meta name="check-ua-compatibility" content="ie=10;fx=5"> seems a
better solution than a language version number in the doctype; if the
problem is real, it should be examined independently of these
particular solutions.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] the cite element

2009-07-01 Thread Philip Taylor
On Wed, Jul 1, 2009 at 6:04 PM, Erik Vorhes e...@textivism.com wrote:
 On Wed, Jul 1, 2009 at 11:49 AM, Kristof
 Zelechovski giecr...@stegny.2a.pl wrote:
 I can imagine two reasons the CITE element cannot be defined as citing
 whom:
  1. Existing tools may assume it contains a title.

 Existing tools (which I would assume follow the HTML 4.01 spec) would
 be mistaken in their implementation of the cite element, then:
 CITE: Contains a citation or reference to other sources. (See
 http://www.w3.org/TR/html401/struct/text.html#h-9.2.1.) Moreover, in
 its sample usage, the HTML 4.01 spec uses cite for more than titles.

In practical usage it seems to be used for more than titles:
http://philip.html5.org/data/cite.txt. (But I haven't tried working
out what else it is used for, or how commonly it's used for titles.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Philip Taylor
On Fri, May 22, 2009 at 11:45 AM, Adrian Sutton adrian.sut...@ephox.com wrote:
 [...]
 Can anyone point to examples where the content is entirely hand crafted and
 a feed would actually make sense?

Perhaps a page like http://philip.html5.org/data.html - people might
want to subscribe in their feed reader to see all the exciting
updates, and the markup is all hand-written. It's not at all like a
blog, but maybe it's data that could be usefully represented with
Atom.

Currently the markup looks like:

  <ol>
    <li><a href="http://philip.html5.org/data/abbr-acronym.txt"><code>abbr</code>,
      <code>acronym</code> titles and contents.</a> <!-- 2008-02-03 -->
    <li><a href="http://philip.html5.org/data/spaced-uris.txt">URIs
      containing spaces.</a> <!-- 2008-02-02 -->
    ...
  </ol>

If I understand the spec correctly, I would have to write something like:

  <ol>
    <li>
      <article pubdate="2008-02-03T00:00:00Z">
        <h1><a href="http://philip.html5.org/data/abbr-acronym.txt"
        rel="bookmark"><code>abbr</code>, <code>acronym</code> titles and
        contents.</a></h1>
      </article>
    <li>
      <article pubdate="2008-02-02T00:00:00Z">
        <h1><a href="http://philip.html5.org/data/spaced-uris.txt"
        rel="bookmark">URIs containing spaces.</a></h1>
      </article>
    ...
  </ol>

and then it would hopefully work.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Removing the need for separate feeds

2009-05-22 Thread Philip Taylor
On Fri, May 22, 2009 at 2:02 PM, Adrian Sutton adrian.sut...@ephox.com wrote:
 On 22/05/2009 13:32, Philip Taylor excors+wha...@gmail.com wrote:
 Perhaps a page like http://philip.html5.org/data.html - people might
 want to subscribe in their feed reader to see all the exciting
 updates, and the markup is all hand-written. It's not at all like a
 blog, but maybe it's data that could be usefully represented with
 Atom.

 There are four articles on that page - do they really update often enough to
 warrant anything more than just adding plain If-Modified support to
 feedreaders and displaying the whole page when it changes?

The way I see it, there are 24 articles on the page (grouped into four
categories), each published independently at separate times. There
would be about a hundred if I kept that index up to date.

But I'm not sure this is a very compelling example, and I can't think
of any other cases where I'd possibly want to publish
non-database-backed data as both HTML and Atom.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Link rot is not dangerous

2009-05-15 Thread Philip Taylor
On Fri, May 15, 2009 at 6:25 PM, Shelley Powers
shell...@burningbird.net wrote:
 The most important point to take from all of this, though, is that link rot
 within the RDF world is an extremely rare and unlikely occurrence.

That seems to be untrue in practice - see
http://philip.html5.org/data/rdf-namespace-status.txt

The source data is the list of common RDF namespace URIs at
http://ebiquity.umbc.edu/resource/html/id/196/Most-common-RDF-namespaces
from three years ago. Out of those 284:
 * 56 are 404s. (Of those, 37 end with '#', so that URI itself really
ought to exist. In the other cases, it'd be possible that only the
prefix+suffix URIs are meant to exist. Some of the cases are just
typos, but I'm not sure how many.)
 * 2 are Forbidden. (Of those, 1 looks like a typo.)
 * 2 are Bad Gateway.
 * 22 could not connect to the server. (Of those, 2 weren't http://
URIs, and 1 was a typo. The others represent 13 different domains.)

(For the URIs which returned Redirect responses, I didn't check what
happens when you request the URI it redirected to, so there may be
more failures.)

Over a quarter of the most common namespace URIs don't resolve
successfully today, and most of those look like they should have
resolved when they were originally used, so link rot seems to be
common.
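
(The arithmetic behind "over a quarter", as a few lines of Python. The
counts are the ones listed above; redirects that might also fail are not
included, so this is a lower bound.)

```python
total = 284
failures = {
    "404": 56,
    "Forbidden": 2,
    "Bad Gateway": 2,
    "could not connect": 22,
}
failed = sum(failures.values())
fraction = failed / total
summary = "%d of %d (%.1f%%)" % (failed, total, 100 * fraction)
```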

(Major vocabularies like RSS and FOAF are likely to exist for a long
time, but they're the easiest cases to handle - we could just
pre-define the prefixes rss: and foaf: and have a centralised
database mapping them onto schemas/documentation/etc. It seems to me
that URIs are most valuable to let any tiny group make one for their
rarely-used vocabulary, and be guaranteed no name collisions without
needing to communicate with a centralised registry to ensure
uniqueness; but it's those cases that are most vulnerable to link rot,
and in practice the links appear to fail quite often.)

(I'm not arguing that link rot is dangerous - just that the numbers
indicate it's a common situation rather than an extremely rare
exception.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Philip Taylor
On Thu, May 14, 2009 at 1:25 PM, Dan Brickley dan...@danbri.org wrote:
 Having HTML5-microdata -to- RDF parsers is pretty critical to having test
 cases that help us all understand where RDFa-Classic and HTML5 diverge. I'm
 very happy to see this work being done and that there are multiple
 implementations.

 As far as I can see, the main point of divergence is around URI abbreviation
 mechanisms. But also HTML5 might not have a notion equivalent to RDF/RDFa's
 bNodes construct. The sooner we have these parsers the sooner we'll know for
 sure.

If I understand RDF correctly, the idea is that everything can be
URIs, subjects and objects can instead be blank nodes, and objects can
instead be literals. If we restrict literals to strings (optionally
with languages), then I think all triples must follow one of these
eight patterns:

  <urn:subject> <urn:predicate> <urn:object> .
  <urn:subject> <urn:predicate> "object" .
  <urn:subject> <urn:predicate> "object"@lang .
  <urn:subject> <urn:predicate> _:X .
  _:X <urn:predicate> <urn:object> .
  _:X <urn:predicate> "object" .
  _:X <urn:predicate> "object"@lang .
  _:X <urn:predicate> _:Y .

These cases can be trivially mapped into HTML5 microdata as:

  <div item>
    <link itemprop="about" href="urn:subject">
    <link itemprop="urn:predicate" href="urn:object">
  </div>

  <div item>
    <link itemprop="about" href="urn:subject">
    <meta itemprop="urn:predicate" content="object">
  </div>

  <div item>
    <link itemprop="about" href="urn:subject">
    <meta itemprop="urn:predicate" content="object" lang="lang">
  </div>

  <div item>
    <link itemprop="about" href="urn:subject">
    <meta itemprop="urn:predicate" item id="X">
  </div>

  <link subject="X" itemprop="urn:predicate" href="urn:object">

  <meta subject="X" itemprop="urn:predicate" content="object">

  <meta subject="X" itemprop="urn:predicate" content="object" lang="lang">

  <meta subject="X" itemprop="urn:predicate" item id="Y">

(There's the caveat about <link> and <meta> being moved into <head> in
some browsers; you can replace them with <a> and <span> instead.)

These aren't the most elegant ways of expressing complex structures
(because they don't make much use of nesting), but hopefully they
demonstrate that it's possible to express any RDF graph (that only
uses string literals) by decomposing into triples and then writing as
HTML with these patterns.

(If all the triples using a blank node have the same subject, then you
don't need to use 'id' and 'subject' because you can just nest the
markup instead, I think.)
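
(The mapping is mechanical enough to write down as a short Python sketch.
It follows the eight patterns listed above and the draft microdata
attribute names used in this message (item, itemprop, subject); the tuple
representation of terms and the function names are invented for
illustration, and it handles one triple at a time with no nesting.)

```python
from html import escape


def attr(name, value):
    """Format one quoted HTML attribute."""
    return '%s="%s"' % (name, escape(value, quote=True))


def object_attrs(obj):
    """Return the element name and attributes that carry the object."""
    kind = obj[0]
    if kind == "uri":
        return "link", [attr("href", obj[1])]
    if kind == "literal":
        attrs = [attr("content", obj[1])]
        if len(obj) > 2 and obj[2]:
            attrs.append(attr("lang", obj[2]))
        return "meta", attrs
    if kind == "bnode":
        return "meta", ["item", attr("id", obj[1])]
    raise ValueError(kind)


def triple_to_microdata(subject, predicate, obj):
    """Render one triple. Terms are tuples: ("uri", u), ("bnode", name)
    or ("literal", text, optional_lang)."""
    tag, attrs = object_attrs(obj)
    prop = attr("itemprop", predicate)
    if subject[0] == "bnode":
        # Attach the property to the item already declared with this id.
        parts = [attr("subject", subject[1]), prop] + attrs
        return "<%s %s>" % (tag, " ".join(parts))
    about = '<link itemprop="about" %s>' % attr("href", subject[1])
    inner = "<%s %s>" % (tag, " ".join([prop] + attrs))
    return "<div item>\n  %s\n  %s\n</div>" % (about, inner)


markup = triple_to_microdata(
    ("uri", "urn:subject"), "urn:predicate", ("literal", "object"))
```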

With my parser (in Firefox 3.0), the output triples (sorted into a
clearer order) are:

  <> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
  <> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
  <> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
  <> <http://www.w3.org/1999/xhtml/vocab#item> <urn:subject> .
  <urn:subject> <urn:predicate> <urn:object> .
  <urn:subject> <urn:predicate> "object" .
  <urn:subject> <urn:predicate> "object"@lang .
  <urn:subject> <urn:predicate> _:n0 .
  _:n0 <urn:predicate> <urn:object> .
  _:n0 <urn:predicate> "object" .
  _:n0 <urn:predicate> "object"@lang .
  _:n0 <urn:predicate> _:n1 .

which corresponds to what was desired.

So, I can't see any limits on expressivity other than that literals
must be strings. (But I'm not at all an expert on RDF, and I may have
missed something in the microdata spec, so please let me know if I'm
wrong!)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-14 Thread Philip Taylor
On Thu, May 14, 2009 at 2:54 PM, Philip Taylor excors+wha...@gmail.com wrote:
 [...]
   <urn:subject> <urn:predicate> _:X .
  [...]
   <div item>
     <link itemprop="about" href="urn:subject">
     <meta itemprop="urn:predicate" item id="X">
   </div>
 [...]
 So, I can't see any limits on expressivity other than that literals
 must be strings.

Hmm, I think I'm wrong here. 'id' has to be unique, which means this
pattern won't work if _:X is the object for triples with two different
subjects.

Additionally, there must be a chain from every blank node back to <>
via <http://www.w3.org/1999/xhtml/vocab#item>, else it won't get
serialised (since serialisation starts from top-level items and
recurses down the correspondence chains). As a consequence of this and
the previous point, it is impossible to express cycles (e.g. _:X
<urn:predicate> _:X, or any longer cycles) unless the cycle contains
<>.

So there are these two restrictions on the shapes of expressible RDF
graphs. (I can't think of any other restrictions, though...)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Philip Taylor
On Tue, May 12, 2009 at 11:55 AM, Eduard Pascual herenva...@gmail.com wrote:
 [...]
 (at least for now: many RDFa-aware agents vs. zero HTML5's
 microdata -aware agents)

HTML5 microdata parsers seem pretty trivial to write -
http://philip.html5.org/demos/microdata/demo.html is only about two
hundred lines to read all the data and to produce JSON and
N3-serialised RDF. It shouldn't take more than a few hours to produce
a similar library for other languages, including the time taken to
read the spec, so the implementation cost for generic parser libraries
doesn't seem like a significant problem.

The cost of integration with backend RDF-based systems seems more
significant - hopefully you could simply replace the frontend RDFa
parser with a microdata parser and generate the same RDF triples and
it would all work fine, but I don't know whether that's true in
practice (because maybe the microdata syntax is too restrictive to
represent the vocabularies people want to use, and so they'd have to
go to lots of extra effort to create a new vocabulary).

 [...] there are other cases where
 separate values might be needed: for example using a street address
 for the human-readable representation of a location and the exact
 geographic coordinates as the machine-readable (since not all
 micro-data parsers can rely on Google Maps's database to resolve
 street addresses, you know); or using a colored name (such as lime
 green displayed on lime green color) as the human-readable
 representation of a color, and the hexcode (like #00FF00) as the
 machine-readable representation.

You could replace
  <span itemprop="color">lime green</span>
  <span itemprop="location">1 High Street</span>
with
  <meta itemprop="color" content="#00FF00"><span>lime green</span>
  <meta itemprop="location.lat" content="56.78">
  <meta itemprop="location.long" content="-12.34"><span>1 High Street</span>
to get the desired output. (Not particularly elegant syntax, though.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-12 Thread Philip Taylor
On Tue, May 12, 2009 at 10:21 PM, Sam Ruby ru...@intertwingly.net wrote:
 On Tue, May 12, 2009 at 4:34 PM, Shelley Powers
 shell...@burningbird.net wrote:

 I
 would say if your fellow Google developers could understand how this all
 works, there is hope for others.

 if

 http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009May/0064.html

Also: The instructions at
http://google.com/support/webmasters/bin/answer.py?answer=146898 (and
related pages) alternate between
 xmlns:v="http://rdf.data-vocabulary.org"
and
 xmlns:v="http://rdf.data-vocabulary.org/"
seemingly at random.

(The first means that property="v:name" abbreviates the bogus URI
http://rdf.data-vocabulary.orgname, if I understand correctly. The
second means it's http://rdf.data-vocabulary.org/name, which is a
404. Perhaps they meant xmlns:v="http://rdf.data-vocabulary.org/#"
which would point at the relevant bit of the vocabulary RDF file?
Hopefully people won't actually deploy content using the inconsistent
namespaces before the documentation is fixed...)
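
A minimal sketch of why the trailing slash matters, assuming the usual
RDFa rule that a prefixed name expands by plain string concatenation of
the namespace URI and the part after the colon. The helper name
expand_curie is made up for this illustration:

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' by concatenating the namespace and the reference."""
    prefix, _, reference = curie.partition(":")
    return prefixes[prefix] + reference


# The three namespace spellings discussed above.
no_slash = expand_curie("v:name", {"v": "http://rdf.data-vocabulary.org"})
with_slash = expand_curie("v:name", {"v": "http://rdf.data-vocabulary.org/"})
with_hash = expand_curie("v:name", {"v": "http://rdf.data-vocabulary.org/#"})
```

Nothing is inserted between the two halves, so the namespace has to end
in whatever separator the vocabulary expects.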

(They've also got a <span><strong property="v:name"> and <span><span
property="v:locality"> and some unclosed <a>s, so it seems the
documentation writers are having difficulty even writing plain HTML.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-11 Thread Philip Taylor
On Mon, May 11, 2009 at 6:15 PM, Giovanni Gentili
giovanni.gent...@gmail.com wrote:
 * a user (or groups of users) wants to annotate
 items present on a generic web page with
 additional properties in a certain vocabulary.
 for example Joe wants to gather in a blog
 a series of personal annotation to movies
 (or other type of items) present in imdb.com.

 [...]

 this option requires that @subject accept:

 1) ID of an element with an item attribute, in the same Document
 or
 2) valid URL of an element with an item attribute elsewhere in the web
 or
 3) a valid URL (i.e. the item is the referred document or fragment)

For the RDF output, you can use <link property="about"
href="http://subject/"> to create triples whose subject is a URL. (I
believe in general you can also do:
  <meta item id="n0">
  <link subject="n0" property="about" href="http://subject/">
  <link subject="n0" property="http://predicate1/" href="http://object1/">
  <meta subject="n0" property="http://predicate2/" content="object2">
to represent arbitrary RDF triples.)

I don't think it would make sense for @subject to be a URL when
generating JSON output, because there wouldn't be anywhere to
represent that URL in the output structure. But there could be a
convention that properties called "about" indicate the URLs that the
item applies to, and then it would work with exactly the same markup
as the RDF case.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Annotating structured data that HTML has no semantics for

2009-05-10 Thread Philip Taylor
On Sun, May 10, 2009 at 11:32 AM, Ian Hickson i...@hixie.ch wrote:

 One of the more elaborate use cases I collected from the e-mails sent in
 over the past few months was the following:

   USE CASE: Annotate structured data that HTML has no semantics for, and
   which nobody has annotated before, and may never again, for private use or
   use in a small self-contained community.

 [...]

 To address this use case and its scenarios, I've added to HTML5 a simple
 syntax (three new attributes) based on RDFa.

There's a quickly-hacked-together demo at
http://philip.html5.org/demos/microdata/demo.html (works in at least
Firefox and Opera), which attempts to show you the JSON serialisation
of the embedded data, which might help in examining the proposal.

-- 
Philip Taylor
exc...@gmail.com


[whatwg] Typo

2009-04-24 Thread Philip Taylor
http://www.whatwg.org/specs/web-apps/current-work/multipage/forms.html#the-pattern-attribute
says:

For example, the following snippet:
<label> Part number:
 <input pattern="[0-9][A-Z]{3}" name="part"
title="A part number is a digit followed by three
uppercase letters."/>
</label>
...could cause the UA to display an alert such as:
 part number is a digit followed by three uppercase letters.
You cannot complete this form until the field is correct.

which is missing the "A" in the last-but-one line.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas Shadows - Unnecessary Barrier to Entry

2009-03-28 Thread Philip Taylor
On Fri, Mar 27, 2009 at 11:22 PM, Charles Pritchard ch...@jumis.com wrote:
 [...]
 We've been working on Javascript / Canvas projects for two years now.

 We're in the process of releasing full implementations targeting the Common
 Runtime Language,
 Java AWT, ActionScript and DCOM.

 I'm sure you can all recognize, that these components have their own vector
 APIs,
 and that we're only sending requests through as a proxy.

 While we can implement everything, even the non-zero winding rule,
 there one part of the specification that's absolutely rotten. And that's the
 #shadows section.

 I love a shadow, I love a good looking UI, but most of these APIs do not
 have shadow
 support for shapes.

Do the APIs not provide enough features so you can implement shadows
yourself? e.g. Firefox uses Cairo which doesn't have any native
support for shadows; but it can draw shapes onto an alpha-only
surface, manually blur the pixels (if you can implement getImageData
then I assume you must already have access to the raw pixels and can
do the blurring efficiently), then draw the shape again, and composite
everything appropriately, which results in a correct shadow
implementation. I don't see what makes this fundamentally harder than
implementing all the other required canvas features.

 [...]

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Historic dates in HTML5

2009-03-05 Thread Philip Taylor
On Thu, Mar 5, 2009 at 11:33 AM, j...@eatyourgreens.org.uk
j...@eatyourgreens.org.uk wrote:
 [...]
 Bruce Lawson uses <time> to mark up the dates of blog posts in the HTML5
 version of his wordpress templates. Is this incorrect usage of HTML5? If
 not, how should HTML5 blog templates work when the blog is dated from 1665
 (http://pepysdiary.com) or 1894 (http://www.cosmicdiary1894.blogspot.com/)?

This reminds me of the issue I had with the old <img
alt="{description}"> syntax.

People write software that takes some input, and outputs some markup.
They want to guarantee that their markup will be valid and correctly
interpreted by consumers, regardless of the input. (In the <img alt>
case, the problem was when the input resulted in legitimate
alternative (non-description) alt text that started with { and ended
with }, forcing the application to add complexity to make sure its
output won't be misinterpreted.)

In any situation where they use <time>, they'd probably want to write
something like:

  print "<time datetime=".$t->toISO8601Date().">".$t->toLocalisedHumanReadableDate()."</time>";

But given HTML5's restrictions against BCE years, they'd actually have
to write something more like:

  if ($t->getYear() > 0) { # (be careful not to write >= 0 here)
    print "<time class=time datetime=".$t->toISO8601Date().">".$t->toLocalisedHumanReadableDate()."</time>";
  } else {
    print "<span class=time>".$t->toLocalisedHumanReadableDate()."</span>";
  }

and make sure their stylesheets use the selector .time instead of
time, to guarantee everything is going to work correctly even with
unexpected input values.

So the restriction adds complexity (and bugs) to code that wants to be
good and careful and generate valid markup.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Historic dates in HTML5

2009-03-05 Thread Philip Taylor
On Thu, Mar 5, 2009 at 12:56 PM, James Graham jgra...@opera.com wrote:
 Philip Taylor wrote:

 and make sure their stylesheets use the selector .time instead of
 time, to guarantee everything is going to work correctly even with
 unexpected input values.

 So the restriction adds complexity (and bugs) to code that wants to be
 good and careful and generate valid markup.


 On the other hand the python datetime class doesn't seem to support years <=
 0 at all so consuming software written in python would have to re-implement
 the whole datetime module, potentially causing incompatibilities with third
 party libraries that expect datetimes to have year > 0. This seems like a
 great deal more effort than simply checking that dates are in the allowed
 range before serializing or consuming them in languages that do support
 years <= 0.

The Python datetime class doesn't seem to support years > 9999 either,
which HTML5 allows. So Python consumers will already have to do "if
not year <= 9999: discard this time element since I'm not going to be
able to do anything with it", and it's easy for them to change that to
"if not 1 <= year <= 9999: ...". That seems less effort than adding
checks into the producers.

If there is a desire that any valid HTML5 date-time string should be
representable in Python's datetime class, then HTML5 should limit it
to 4 digits and refuse to parse anything longer. If so, why Python's
datetime in particular? The C++ Boost.Date_Time
(http://www.boost.org/doc/libs/1_38_0/doc/html/date_time/gregorian.html)
is apparently limited to 1400-Jan-01 to 9999-Dec-31. Perl DateTime
and PHP DateTime and Java joda-time
(http://joda-time.sourceforge.net/field.html) seem happy with a range
of millions of years in both directions. I'm not sure about any other
libraries. The range 1..9999 seems pretty arbitrary since it only
matches Python, and 1..inf doesn't match anything, so neither seems
particularly justified by implementations.
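
For reference, the Python limits can be checked directly. This is only
a small sketch; the helper name fits_python_datetime is made up here,
while datetime.MINYEAR and datetime.MAXYEAR are the standard library's
own constants:

```python
import datetime


def fits_python_datetime(year):
    """True if the year can be stored in a datetime.date / datetime.datetime."""
    return datetime.MINYEAR <= year <= datetime.MAXYEAR


def try_build(year):
    """Return a date for 1 January of that year, or None if Python rejects it."""
    try:
        return datetime.date(year, 1, 1)
    except ValueError:
        return None
```

A consumer that wants to skip unrepresentable <time> values only needs
the one range check, whichever bounds the spec ends up with.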

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] proposed canvas 2d API additions

2009-02-28 Thread Philip Taylor
On Sat, Feb 28, 2009 at 8:38 PM, JustFillBug mozbug...@yahoo.com.au wrote:
 On 2006-04-26, Ian Hickson i...@hixie.ch wrote:
 On Mon, 24 Apr 2006, Vladimir Vukicevic wrote:

 We can always add isPointInStrokedPath if we ever want to bother with
 that (which is where the ...Fill bit came from in my API, because the
 region covered by a stroked path and that covered by a filled path are
 different, even though testing for a hit against a filled region would
 by far be the common case).

 We can also call the other one isPointOnPath(), if we want to keep the
 method names reasonably short. I'm not sure we'll ever need to add it,
 though. Getting people to click on a line is generally silly.

(Or maybe we could add a convertStrokeToPath() function, which
replaces the current path with a path representing the outline of what
you'd get if you stroked the current path, and then use isPointInPath
on it.)

 We do have a need of isPointOnPath() for editing Bezier lines
 interactively (on a font editing interface). [...]

 Doing point on curve in javascript is painful. And since checking
 isPointInPath() already needs to detect the on edge case, this shouldn't
 be too much of a burden on the browser developers.

 So please consider adding the isPointOnPath() call to the function.

What makes it painful? If you're only using Beziers, it doesn't seem
too hard to approximate the curves as line segments and then calculate
distances from that.
http://philip.html5.org/demos/canvas/bezier-approx.html is fairly
straightforward (and a much more accurate version shouldn't be much
more complex) and can detect when your mouse is near a curve. (But if
this is a common problem, it would indeed be nicer if the canvas API
provided the functionality instead of forcing you to reimplement it.)
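
A rough sketch of the approximation described above, written in Python
rather than taken from the demo page: flatten a cubic Bezier into
straight segments, then measure the distance from the mouse position
to each segment. The function names, the 32-segment default and the
tolerance parameter are all choices made for this illustration:

```python
import math


def cubic_bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve at parameter t (0 <= t <= 1)."""
    u = 1.0 - t
    x = u * u * u * p0[0] + 3 * u * u * t * p1[0] + 3 * u * t * t * p2[0] + t * t * t * p3[0]
    y = u * u * u * p0[1] + 3 * u * u * t * p1[1] + 3 * u * t * t * p2[1] + t * t * t * p3[1]
    return (x, y)


def flatten(p0, p1, p2, p3, segments=32):
    """Approximate the curve by segments + 1 points joined by straight lines."""
    return [cubic_bezier_point(p0, p1, p2, p3, i / segments) for i in range(segments + 1)]


def distance_to_segment(p, a, b):
    """Shortest distance from point p to the line segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def is_point_near_curve(p, p0, p1, p2, p3, tolerance, segments=32):
    """True if p lies within tolerance of the flattened curve."""
    points = flatten(p0, p1, p2, p3, segments)
    return any(distance_to_segment(p, a, b) <= tolerance
               for a, b in zip(points, points[1:]))
```

More segments give a closer fit at the cost of more distance checks per
mouse event.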

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Philip Taylor
On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote:
 Hi,

 have two questions to the all points on a line part of canvas' arcTo.
 A short example:

 moveTo(50,0);
 arcTo(100,0,  0,0, 10);

 This should add a new, from p1 infinite far away, point to the subpath
 and draw a straight line to it.

 Two questions.

 1) If I add lineTo(50, 50); after arcTo(..). Wouldn't it draw a quasi
 parallel line to the line of arcTo? Because (Xx, Yx) (mentioned in the
 spec) is infinite far away. That means, we will never reach this point
 in reality.

It should draw a really parallel line, with one end at (50,50) and the
other end infinitely far away in the direction determined by the
arcTo.

 2) We don't allow infinite values for moveTo or lineTo, but can make
 this happen with arcTo.
 The example above would be the same as lineTo(-Infinite, 0);
 But we can make moveTo(-Infinite, 0) too with the example above. Just
 make strokeStyle transparent, use arcTo from above and you're done. And
 moveTo(infinite, infinite); would be possible too.

You can moveTo(-1e+300, 0) and moveTo(1e+300, 2e+300), which are much
more similar to what arcTo is meant to do.

Considering the general case where the arcTo's points are not
perfectly horizontal, the idea is that the point is not simply a point
with coordinates (+/-Infinity, +/-Infinity) - it's really the
(theoretical) limit of a point with coordinates (x+dx*t, y+dy*t) as t
approaches infinity, where x,y,dx,dy represent the position/direction
of the (x1,y1)--(x2,y2) line.

Where the spec says "(x∞, y∞) is the point that is infinitely far away
from (x1, y1), that lies on the same line as (x0, y0), (x1, y1), and
(x2, y2)", you could read it as "...the point that is very very far
away from ...", e.g. take the (x1,y1)--(x2,y2) line and then move
1e+100 units in that direction, and it would be good enough that
nobody would notice the tiny error.
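
The "very very far away" reading can be sketched as follows. This is an
illustration only, not what any browser does; the function name
far_point and the default distance are assumptions made for the
example:

```python
import math


def far_point(x1, y1, x2, y2, distance=1e100):
    """Point reached by starting at (x1, y1) and moving `distance` units
    along the direction of the (x1, y1)--(x2, y2) line."""
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    return (x1 + dx / length * distance, y1 + dy / length * distance)


# Dirk's example: moveTo(50, 0); arcTo(100, 0, 0, 0, 10)
stand_in = far_point(100, 0, 0, 0)
```

For the example in this thread the stand-in point lies far out on the
negative x axis, matching the direction the spec text describes.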

You already have to handle something very similar to this case,
because (x2,y2) might be very very close to the line (x0,y0)--(x1,y1),
which means the start/end tangent points will be very very far away in
the appropriate direction. The special case where (x2,y2) is precisely
on the line is not really special - the points are just even further
(infinitely far) away in that direction.

As a concrete example: see
http://philip.html5.org/demos/canvas/arcto-inf.html, which I believe
should have output like
http://philip.html5.org/demos/canvas/arcto-inf.png (from Safari
3.0.4 for Windows). As (x2,y2) gets closer to the line of the first
two points, the start/end tangent points are pushed further over to
the left. When y2=0.1 they're far enough away that the two straight
lines are nearly horizontal; when y2=0 it's basically the same, except
now they're precisely horizontal.

So I think the spec's behaviour makes sense from a theoretical
perspective, because it avoids any discontinuities in the output when
the input variables are changed a tiny bit. And it made sense from a
practical perspective, because it matched the behaviour of Safari 3.0
(though apparently things have changed in 3.1).

But I don't know if it makes sense from the perspective of someone
who's got to write an independent implementation of it. Does the above
explanation make more sense than the text in the spec? and if so, does
it seem implementable? If so, it seems best to keep the spec's
behaviour and try to clarify the spec's text. But this doesn't seem
like an important case where users will be unhappy if e.g. the arcTo
call draws nothing when all the points are on the same line, so if
it's still a pain to implement the spec's behaviour then I would be
happy with changing what the spec requires.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Canvas arcTo all points on a line

2009-01-21 Thread Philip Taylor
On Wed, Jan 21, 2009 at 2:45 PM, Philip Taylor excors+wha...@gmail.com wrote:
 On Sat, Dec 27, 2008 at 9:37 AM, Dirk Schulze vb...@gmx.de wrote:
 Hi,

 have two questions to the all points on a line part of canvas' arcTo.
 A short example:

 moveTo(50,0);
 arcTo(100,0,  0,0, 10);

 This should add a new, from p1 infinite far away, point to the subpath
 and draw a straight line to it.

 [...]

After some discussion on IRC, it seems this part of the spec is not a
great idea.

As I understand it, the low-level graphics APIs have limited
coordinate range and rely on the "User agents may impose
implementation-specific limits on otherwise unconstrained inputs, e.g.
to prevent denial of service attacks, to guard against running out of
memory, or to work around platform-specific limitations." clause (and
common sense) to let them have undefined behaviour when people use
really large coordinate values. The infinitely-distant point required
by arcTo is a really large coordinate value, but we don't want this
case to be undefined behaviour (because it can occur with nice small
integer input values and people might accidentally use it).

Implementing the behaviour currently in the spec (with the
infinitely-distant point) is not trivial, because it requires code
unique to that special case (rather than falling naturally out of an
implementation of the rest of arcTo's behaviour) and has to be careful
to act enough like an infinitely-distant point while remaining within
the implementation limits.

And it seems like a rare edge case where people disagree on whether
the output is sensible, and nobody is really going to care what the
output is (as long as it's well defined); so it doesn't seem
worthwhile having everyone understand and implement the non-trivial
behaviour that's in the spec.

So, in the interest of having something that implementors are more
likely to converge on, I'd suggest replacing the behaviour in that
case (the "the direction from (x0, y0) to (x1, y1) is the opposite of
the direction from (x1, y1) to (x2, y2)" case) with simply drawing a
straight line from (x0, y0) to (x1, y1), which is easy and apparently
is what Safari on OS X already does. It's also the same as the other
case in that paragraph, so the whole paragraph can be collapsed to:

  Otherwise, if the points (x0, y0), (x1, y1), and (x2, y2) all lie
on a single straight line, then the method must add the point (x1, y1)
to the subpath, and connect that point to the previous point (x0, y0)
by a straight line.
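
As a sketch of how little code the collapsed rule needs, assuming a
subpath is just a list of (x, y) points and that an exact cross-product
test is acceptable for detecting collinear points (both are assumptions
of this example, as are the function names):

```python
def points_are_collinear(x0, y0, x1, y1, x2, y2):
    """True if (x0, y0), (x1, y1) and (x2, y2) lie on one straight line."""
    # Cross product of (p1 - p0) and (p2 - p1); zero means no turn at p1.
    cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
    return cross == 0


def arc_to(subpath, x1, y1, x2, y2, radius):
    """Handle only the collinear case of arcTo; the last subpath point is (x0, y0)."""
    x0, y0 = subpath[-1]
    if points_are_collinear(x0, y0, x1, y1, x2, y2):
        # Proposed behaviour: add (x1, y1), joined to (x0, y0) by a straight line.
        subpath.append((x1, y1))
        return subpath
    raise NotImplementedError("the general tangent-arc case is outside this sketch")
```

With Dirk's example (moveTo(50,0); arcTo(100,0, 0,0, 10)) this simply
extends the subpath to (100, 0), with no special far-away point.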

-- 
Philip Taylor
exc...@gmail.com


[whatwg] </html> with omitted tags

2008-12-26 Thread Philip Taylor
I can start with a simple document that's probably conforming and that
the validator doesn't complain about:

  <!DOCTYPE html><html><head><title></title></head><body></body></html>

Then I can read the "Writing HTML document: Optional tags" section, which says:

  A head element's end tag may be omitted if the head element is not
immediately followed by a space character or a comment.

  A body element's start tag may be omitted if the first thing inside
the body element is not a space character or a comment, except if the
first thing inside the body element is a script or style element.

  A body element's end tag may be omitted if the body element is not
immediately followed by a comment.

So I choose to omit the </head><body></body> because I think those
rules say I can do so. I get:

  <!DOCTYPE html><html><head><title></title></html>

But now I get a parse error, which I think is because the </html>
comes in the "in head" insertion mode and is "Any other end tag: Parse
error. Ignore the token.", so something seems wrong.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] 8.2.4.37: EOF handling

2008-12-22 Thread Philip Taylor
On Mon, Dec 22, 2008 at 9:33 PM, Edward Z. Yang
edwardzy...@thewritingpot.com wrote:
 Hello all,

 I think EOF should be handled explicitly in the states after we Consume
 the U+0023 NUMBER SIGN, since the spec as it stands right now implies
 that there will always be another character after the number sign. Or am
 I being a little redundant?

EOF is always treated as if it were a character, e.g. lots of places
say "Consume the next input character: ... EOF - ... Reconsume the
EOF character in the data state". If you have "&#" at the end of a
file, the next character is the EOF character, which is not 'x' or 'X'
and so it is "anything else". So it seems consistent and unambiguous
to me.
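
One way to picture this, as a sketch only: give the input stream an EOF
sentinel that is returned like any other character, so every "consume
the next input character" step has something to compare against. The
names EOF, next_char and classify_after_number_sign are made up for
this example and are not taken from the spec or from html5lib:

```python
EOF = object()  # sentinel that behaves as "one more character" at end of input


def next_char(text, pos):
    """Return the character at pos, or the EOF sentinel past the end."""
    return text[pos] if pos < len(text) else EOF


def classify_after_number_sign(text, pos):
    """Decide which branch applies to the character following '&#'."""
    c = next_char(text, pos)
    if c in ("x", "X"):
        return "hexadecimal"
    # Digits, other characters and EOF all land in the same branch.
    return "anything else"
```

Because EOF is simply not 'x' or 'X', no separate EOF rule is needed at
this step.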

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Consuming ampersands

2008-12-22 Thread Philip Taylor
On Tue, Dec 23, 2008 at 1:08 AM, Edward Z. Yang
edwardzy...@thewritingpot.com wrote:
 Hello all,

 When I'm consuming a character reference, when does the ampersand get
 consumed? This doesn't seem to be obvious from the documentation, which
 talks of consuming character references and number hash signs, but never
 the ampersand.

They're consumed in the state that comes before the character
reference state, e.g.:

  8.2.4.1 Data state
  Consume the next input character:
   - U+0026 AMPERSAND (&) ... switch to the character reference data state.

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Byte-wise tokenization algorithm

2008-12-21 Thread Philip Taylor
On Sun, Dec 21, 2008 at 5:41 AM, Ian Hickson i...@hixie.ch wrote:
 On Sat, 20 Dec 2008, Edward Z. Yang wrote:

 1. Given an input stream that is known to be valid UTF-8, is it possible
 to implement the tokenization algorithm with byte-wise operations only?
 I think it's possible, since all of the character matching parts of the
 algorithm map to characters in ASCII space.

 Yes. (At least, that's the intent; if you find anything that contradicts
 that, please let me know.)

I think there are some cases where it still should work but you might
have to be a little careful - e.g. "<table>foo" notionally results in
three parse errors according to the spec (one for each character token
which gets foster-parented), so "<table>☹" results in one if you work
with Unicode characters but three if you treat each UTF-8 byte as a
separate character token.

But in practice, tokenisers emit sequence-of-many-characters tokens
instead of single-character tokens, so they only emit one parse error
for "<table>foo", and the html5lib test cases assume that behaviour,
and it should work identically if you have sequence-of-many-bytes
tokens instead.
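
The counts can be checked with a few lines of Python. This is just an
illustration of the arithmetic, not tokeniser code; both function names
are made up here:

```python
def character_token_count(text, bytewise):
    """Number of single-unit character tokens: one per code point, or one per UTF-8 byte."""
    units = text.encode("utf-8") if bytewise else text
    return len(units)


def coalesced_token_count(text, bytewise):
    """Number of tokens when a run of units is emitted as a single token."""
    units = text.encode("utf-8") if bytewise else text
    return 1 if len(units) > 0 else 0
```

U+2639 takes three bytes in UTF-8, which is where the one-versus-three
difference comes from; once runs are coalesced the two approaches agree.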

(Apparently only the distinction between 0 and more-than-0 parse
errors is important as far as the spec is concerned, since that has an
effect on whether the document is conforming; but it seems useful for
implementors to share test cases that are precise about exactly where
all the parse errors are emitted, since that helps find bugs, and so
the parse error count is relevant.)

-- 
Philip Taylor
exc...@gmail.com


Re: [whatwg] Solving the login/logout problem in HTML

2008-11-26 Thread Philip Taylor
On Wed, Nov 26, 2008 at 10:12 AM, Ian Hickson [EMAIL PROTECTED] wrote:
 On Wed, 26 Nov 2008, Julian Reschke wrote:
 Ian Hickson wrote:
  ...
  As can be seen in the feedback below, there is interest in improving the So
  when you get to a page that expects you to be logged in, it return a 401
  with:
 
 WWW-Authenticate: HTML form="login"
 
  ...and there must be a <form> element with name="login", which represents
  the form that must be submitted to log in.
  ...

 For security reasons, I'd prefer that to be "the form element",
 instead of "a form element" -- having multiple copies of the name in
 the same document should be considered a fatal error.

 Having multiple form elements with the same name is already an error.

 I'm not sure what you mean by "fatal error". The spec precisely defines
 which form should be used in the case of multiple forms with the same
 name. Could you describe the attack scenario you are considering?

If I'm not misunderstanding things, there is a new attack scenario:

I post a comment on someone's blog, saying <a
href="/restricted-access.php?xsshole=<form
action=http://hacker.example.com/capture name=login><input
name=username><input name=password></form>">crawl me!</a>

On their blog's web server, restricted-access.php requires
authentication, and unauthenticated access results in a 401 with
'WWW-Authenticate: HTML form="login"' and the appropriate login form.
But inevitably there's some kind of XSS hole in that page, so
arbitrary markup can be inserted above the real login form. (Maybe
they pass an error message in a parameter, which will be displayed
above the form, but they forgot to escape the output.)

Their internal search engine crawler is configured to know a username
and password (and the form field names etc) for these restricted
areas. It follows the link from my blog comment, it notices the
WWW-Authenticate header, and like a good little bot it chooses to
parse the HTML page and find the matching form and fill in the fields
and submit the login details. But actually it's submitting my
XSS-inserted form, and sending the login details to me.

XSS holes already cause various security vulnerabilities; but they
can't currently result in sensibly-written crawlers unwittingly
submitting their login details to arbitrary third parties, so this is
a new risk.

I can imagine a few ways to avoid this problem:

 1) Don't write any pages with XSS holes.
 2) Detect tampering by refusing to submit login details if >= 2 forms
match the name.
 3) Only submit login details to same-origin URLs, or to some other
restricted set.
 4) Configure the crawler with the form submission URL, as well as the
form field names and values, so it doesn't have to trust the HTML.
 5) Change WWW-Authenticate so it gives all the details (submission
URL, field names, etc), so nobody has to trust the HTML.

But (1) is not going to happen in reality, so we should try to
minimise the damage when XSS holes exist. (2) won't work because the
attacker could write '...?xsshole=...<!--' and the second form would
be hidden. (3) is more sensible; perhaps the spec should explicitly
note that you need to be quite careful about not submitting login
forms to third-party sites unless you're sure you trust them?

But even with (3), I could write <a
href="/restricted-access.php?xsshole=<form
action=/public-pastebin.php... and the crawler would send the login
details to somewhere on the same host where I could still read them
back, which doesn't seem great.
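
A minimal sketch of what check (3) might look like in a crawler, using
only Python's standard urllib.parse. The function names and the example
host blog.example.com are made up for this illustration:

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that identifies a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_same_origin_submission(page_url, form_action):
    """True if the form's action resolves to the same origin as the page it came from."""
    target = urljoin(page_url, form_action)
    return origin(page_url) == origin(target)


page = "http://blog.example.com/restricted-access.php"
to_hacker = is_same_origin_submission(page, "http://hacker.example.com/capture")
to_pastebin = is_same_origin_submission(page, "/public-pastebin.php")
```

The first submission is refused, but the pastebin one passes the check,
which is exactly the weakness described above.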

So (4) is more sensible. You already have to configure the crawler
with the form field names, so you might as well tell it what URL to
submit to, and it shouldn't parse the HTML response or care about the
form element. (Then there's no need for WWW-Authenticate to even say
what the form name is.)

(5) is basically the same, except it's late-binding the form details
rather than hardcoding them into the crawler's configuration, and so
it makes it easy to change the server-side login handling without
reconfiguring everyone's crawlers.

(But the cost of the potential solutions to the vulnerability might be
greater than the cost of the vulnerability, so it might not be worth
doing anything - I don't have a useful opinion on that.)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Absent rev?

2008-11-19 Thread Philip Taylor
On Wed, Nov 19, 2008 at 9:35 AM, Martin McEvoy [EMAIL PROTECTED] wrote:
 [...]

 http://code.google.com/webstats/2005-12/linkrels.html

 [...]

 If you have a more up to date study on link relationships, please can I have
 a link?

http://philip.html5.org/data/link-rel-rev.txt has some more recent
data, from a different set of pages (and so with different biases,
e.g. there's lots of Wikipedia and IMDB pages using
rel=apple-touch-icon), with less processing (no case-insensitivity
or token-splitting).

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Absent rev?

2008-11-19 Thread Philip Taylor
On Wed, Nov 19, 2008 at 1:03 PM, Martin McEvoy [EMAIL PROTECTED] wrote:
 Philip Taylor wrote:
 http://philip.html5.org/data/link-rel-rev.txt has some more recent
 data, from a different set of pages (and so with different biases,
 e.g. there's lots of Wikipedia and IMDB pages using
 rel=apple-touch-icon), with less processing (no case-insensitivity
 or token-splitting).

 Thank you Philip that is the most useful set of data I have seen for a long
 time

 It basically says that the whole premise that HTML5 should drop *rev*  (a)
 because authors use it wrong, (b)  Many authors use rev-stylesheet  wrong,
  is a MYTH and an inaccurate assessment of  the *rev* attribute

 Out of the 127249 pages studied, only  0.09% actually use rev=stylesheet

The premise from near the beginning of this thread was:

 We did some studies and found that the attribute was almost never used,
 and most of the time, when it was used, it was a typo where someone meant
 to write rel= but wrote rev=.

I think that ought to say "... (excluding rev=made, which is
uninteresting since it's redundant with rel=author)". In that
case, rev is used on 0.2% of pages, which justifies the claim "almost
never used". And rev=stylesheet makes up 57% of those uses of rev,
which justifies the claim "most of the time ... it was a typo" (under
a loose definition of "typo" that includes people copying-and-pasting
without understanding the distinction between rel and rev, which is
the impression I get from looking at some of these pages). And looking
at some other values, e.g. <link rev="start" href="/" title="Home
Page" /> which seems like it ought to be rel instead, there are typos
in more cases than just rev=stylesheet. So the premise seems valid.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Absent rev?

2008-11-19 Thread Philip Taylor
On Wed, Nov 19, 2008 at 2:54 PM, Martin McEvoy [EMAIL PROTECTED] wrote:
 Philip Taylor wrote:

 rev=stylesheet makes up 57% of those uses of rev,


 How do you get that figure?

 even if you just compare rev=made(1157 instances) and rev=stylesheet(107
 instances) you get 9.25% of the examples use rev incorrectly

That figure was from the case of

 ... (excluding rev=made, which is
 uninteresting since it's redundant with rel=author) 

since that appears to be what Hixie meant (but forgot to say) when
claiming that most uses of rev were typos of rel.

(Case-insensitively, I counted 1259 rev=made, 122 rev=stylesheet,
and 1474 rev=... in total, which means 215 in total excluding
rev=made, and 122/215=57%.)
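
The arithmetic, spelled out using only the counts quoted above (the
variable names are just for this example):

```python
total_rev = 1474        # all rev=... uses, counted case-insensitively
rev_made = 1259         # rev=made
rev_stylesheet = 122    # rev=stylesheet

remaining = total_rev - rev_made        # uses of rev other than rev=made
share = rev_stylesheet / remaining      # fraction of those that are rev=stylesheet
percentage = round(share * 100)
```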

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] fixing the authentication problem

2008-10-21 Thread Philip Taylor
On Tue, Oct 21, 2008 at 2:16 PM, Aaron Swartz [EMAIL PROTECTED] wrote:
 The most common way of authenticating to web applications is:

 Client: GET /login
 Server: <html><form method="post">
 Client: POST /login
 user=joesmith01&password=secret
 Server: 200 OK
 Set-Cookie: acct=joesmith01,2008-10-21,sj89d89asd89s8d

 [...]

 My proposal: add something to HTML5 so that the transaction looks like this:

 Client: GET /login
 Server: <html><form method="post" pubkey="/pubkey.key">...
 Client: POST /login
 dXNlcj1qb2VzbWl0aDAxJnBhc3N3b3JkPXNlY3JldA==
 Server: 200 OK
 Set-Cookie: acct=joesmith01,2008-10-21,sj89d89asd89s8d

 where the base64 string is the form data encrypted with the key
 downloaded from /pubkey.key.

As I understand it: As an attacker, I can intercept that "dXN..."
string. Then I can simply make a login POST request myself at any time
in the future, sending the same encrypted string, and will get the
valid login cookies even though I don't know the password. So it
doesn't seem to work very well at keeping me out of the user's
account. Also this seems vulnerable to dictionary attacks, e.g. I can
easily encrypt "user=joesmith01&password=..." for every word in the
dictionary and will probably discover the user's password.
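
To make the dictionary attack concrete, here is a toy sketch. It
assumes the form data is encrypted deterministically (unpadded
"textbook" RSA is used as a stand-in); the proposal does not say which
scheme would be used, and a scheme with randomised padding would not
fall to this particular comparison. The key below is built from two
Mersenne primes purely for illustration and is not secure:

```python
# Toy public key (illustrative only): N = (2**127 - 1) * (2**521 - 1), E = 65537.
N = (2**127 - 1) * (2**521 - 1)
E = 65537


def encrypt(message):
    """Deterministic textbook-RSA encryption of a short byte string."""
    m = int.from_bytes(message, "big")
    if m >= N:
        raise ValueError("message too long for this toy key")
    return pow(m, E, N)


def dictionary_attack(intercepted, username, candidates):
    """Encrypt each guess with the public key and compare with the intercepted value."""
    for word in candidates:
        guess = f"user={username}&password={word}".encode("ascii")
        if encrypt(guess) == intercepted:
            return word
    return None


# What the attacker sees on the wire, and what they recover from it.
intercepted = encrypt(b"user=joesmith01&password=secret")
recovered = dictionary_attack(intercepted, "joesmith01", ["password", "letmein", "secret"])
```

The attacker never needs the private key: the public key plus a word
list is enough when identical plaintexts give identical ciphertexts.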

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] fixing the authentication problem

2008-10-21 Thread Philip Taylor
On Tue, Oct 21, 2008 at 2:52 PM, Aaron Swartz [EMAIL PROTECTED] wrote:
 As I understand it: As an attacker, I can intercept that "dXN..."
 string. Then I can simply make a login POST request myself at any time
 in the future, sending the same encrypted string, and will get the
 valid login cookies even though I don't know the password. So it
 doesn't seem to work very well at keeping me out of the user's
 account. Also this seems vulnerable to dictionary attacks, e.g. I can
 easily encrypt "user=joesmith01&password=..." for every word in the
 dictionary and will probably discover the user's password.

 I was simplifying; [...]

Simplifications make it hard to tell whether it's possible to use the
feature securely (and hard to tell what "securely" means in this
context), which is a necessary condition for usefulness, so it's
probably best to explain in detail exactly how you expect it'll be
used, and then people can try to pick holes in it :-) . (But at least
in my case, I know little enough about security that even if I can't
pick holes then I'd be unwilling to assume it's secure...)

 in real life, I expect the server will include a
 nonce with the form (as a hidden input), which they'll only permit to
 be used once.

That still doesn't help with the dictionary attacks, since the
attacker knows the nonce too. I'd guess the client has to add an extra
nonce (which is never transmitted in the clear) to avoid that problem.

For the server-generated nonce, the login form will have to be on a
page that is never cached, so that every client will get a new nonce
every time they load the page. That would prevent it being used in a
lot of cases where sites put a login box on every page (instead of
requiring the user to go through an extra login page), which is a
minor disadvantage of this scheme.

How will the server limit each nonce to being used once? If it stores
a list of every nonce that was ever used, it's going to be a pretty
large table and slow to check on any reasonably popular site. If it
encodes a timestamp in the nonce, it won't work if a user opens the
login page (causing the new nonce to be generated) in a background tab
and leaves it for a few days before trying to log in, which breaks the
usually-valid assumption that you can wait indefinitely between
separate HTTP requests. (Digest authentication avoids that problem
because it's defined at the HTTP level and can say that the browser
ought to respond immediately and to retry silently if the nonce was
stale.)
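
The trade-off can be sketched roughly as follows. This is only an
illustration, assuming the server keeps outstanding nonces in memory and
expires them after a fixed lifetime; the class name, lifetimeMs and the
explicit `now` argument are all made up for the example:

```javascript
// Hypothetical single-use nonce store. Expiring old nonces bounds the
// table size, at the cost of rejecting logins from pages that were left
// open for longer than `lifetimeMs`.
class NonceStore {
  constructor(lifetimeMs) {
    this.lifetimeMs = lifetimeMs;
    this.issued = new Map(); // nonce -> time it was issued
    this.counter = 0;
  }

  // Called when serving the (uncached) login page.
  issue(now) {
    this.prune(now);
    // A real server would use an unguessable random value here.
    const nonce = `n${this.counter++}`;
    this.issued.set(nonce, now);
    return nonce;
  }

  // Called when handling the login POST; true at most once per nonce.
  consume(nonce, now) {
    this.prune(now);
    return this.issued.delete(nonce);
  }

  // Drop nonces older than the lifetime so the table stays small.
  prune(now) {
    for (const [nonce, issuedAt] of this.issued) {
      if (now - issuedAt > this.lifetimeMs) this.issued.delete(nonce);
    }
  }
}
```

A replayed nonce and a nonce from a days-old background tab are both
rejected, which is exactly the usability problem described above.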

Probably more importantly, does this solve any of the security flaws
you indicated Digest authentication has? (i.e. how would it be better
than inventing a mechanism to allow custom styling of the browser's
username/password dialog box?)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] canvas shadow compositing oddities

2008-07-29 Thread Philip Taylor
On Sun, Jul 27, 2008 at 8:06 PM, Eric Butler [EMAIL PROTECTED] wrote:
 [...]
 However, following the spec's drawing model, there are a few operators that
 behave rather unexpectedly if the shadow color is left at its default value.
 For instance, since A in B always results in transparency if either A or B
 is fully transparent, source-in will always simply clear the clipping region
 to fully transparent no matter what the source and destination are.

Oops - that does seem quite broken. (It's probably my fault - I didn't
notice that problem when I was looking at how shadows should work...)

 It would seem Safari isn't quite following the spec here, since it appears
 to never draw shadows when the shadow color is fully transparent or
 something and doesn't encounter these issues.

As far as I can tell: It never draws shadows when shadowColor.alpha <
1/256, regardless of the other attributes. Also, it never draws
shadows when blur=0 and abs(offsetX) = 1 and abs(offsetY) = 1,
regardless of the colour.

In the cases where it does draw shadows, there's also an issue that
its compositor ignores the area outside the shape that's being drawn
(instead of treating it as transparent-black, as is required by the
spec and implemented by Opera and (usually) Firefox) - so in cases
like http://philip.html5.org/demos/canvas/shadow-composite.html with
the source-in mode, WebKit fails to clear the area outside the
shape/shadow to transparent-black. (I'm testing with Safari 3.0.4 - I
hope not much has changed since then). That is probably a sufficiently
unusual situation that it's sensible for the spec to stay as it is and
require WebKit to change, though the spec still needs to change for
the default shadows-disabled case.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] [canvas] imageRenderingQuality property

2008-06-30 Thread Philip Taylor
On 01/07/2008, Ian Hickson [EMAIL PROTECTED] wrote:
 [...]
  It seems better for the browser to simply detect when the graphics burden
  being placed on the hardware by the page is too much to be done at high
  quality given the current load on the CPU, and for the browser to
  automatically drop down to a lower fidelity, higher speed rendering on the
  fly when appropriate.

Sometimes the author will want to force best-quality rendering,
regardless of the performance impact. E.g. a photo manipulation
application might let you resize a segment of a photo, displaying a
live preview (where performance is more important than quality), and
then render the final resized image and store it in a canvas for
future processing. That final rendering needs to be the best possible
quality, so it's not acceptable for the browser to decide that it
should semi-randomly drop the quality because it detected the live
preview was CPU-intensive.

Similarly, a pseudo-3d FPS game might load textures at runtime and
perform some preprocessing (like resizing to be square, and rendering
lots of smaller copies to be used as mipmaps for distant walls so they
look prettier), and then draw that processed texture into the game
thousands of times a second. Since the preprocessing is only done
once, and its result is reused for the whole of the rest of the game,
it should be done at the highest possible quality, regardless of
performance.

So, adaptively reducing the quality and allowing no author control
seems like a bad idea.

Perhaps the imageRenderingQuality property could have values 'high'
and 'auto', where the default is 'high' (so that existing content
continues working the same as it always has, and to avoid surprising
authors by randomly switching the rendering quality when they have no
reason to expect such weird behaviour), and 'auto' means 'low (but
perhaps switch to high if the browser thinks it's going to be fast
enough)'. That would avoid the issue of authors setting quality='low'
and preventing high-speed users from getting the best quality output.

-- 
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas tests updated

2008-06-24 Thread Philip Taylor
I've recently updated my canvas tests at

  http://philip.html5.org/tests/canvas/suite/tests/

so they ought to be up-to-date with the latest version of the spec,
and have greater coverage than before. (Text rendering is the only
thing that's intentionally untested, though I may have missed some
other smaller areas.)

http://philip.html5.org/tests/canvas/suite/tests/results.html shows
the results I got from testing various browsers. Using the totally
unfair unrepresentative biased method of counting the number of
passes, the latest versions of Opera and WebKit and Konqueror are
roughly tied in first place while Firefox is last (except for IE which
doesn't really count), but even the best fail about a quarter of the
tests, so there's plenty of scope for bug-fixing :-)

Anyway, it's quite likely that a number of the tests are incorrect,
since nobody (including me) has reviewed them carefully; or that the
tests are correct according to the spec but the spec is incorrect
according to reality; so it would be good to get feedback from anyone
who notices issues in them.

-- 
Philip Taylor
[EMAIL PROTECTED]


[whatwg] commit-watchers mail format

2008-06-21 Thread Philip Taylor
The mails sent to [EMAIL PROTECTED] are not very
user-friendly. In particular, I collect them in Gmail and 'star'
interesting ones that I want to look at in more detail in the future.
When looking at the list of starred emails, I see:

  whatwg  WHATWG [html5] r1771 - / - Author: ianh Date: 2008-06-13 01:59:40 -0700 (Fri, 13 Jun 2008) New Revision: 1771 Modified …  13 Jun
  whatwg  WHATWG [html5] r1770 - / - Author: ianh Date: 2008-06-13 01:49:25 -0700 (Fri, 13 Jun 2008) New Revision: 1770 Modified …  13 Jun
  whatwg  WHATWG [html5] r1768 - / - Author: ianh Date: 2008-06-13 01:22:57 -0700 (Fri, 13 Jun 2008) New Revision: 1768 Modified …  13 Jun
  whatwg  WHATWG [html5] r1767 - / - Author: ianh Date: 2008-06-13 01:12:01 -0700 (Fri, 13 Jun 2008) New Revision: 1767 Modified …  13 Jun

which makes it impossible to work out what a given email is about, or
to find the email that's about a given change. So it could be nice if
the commit message was in the subject line, or at the top of the body
(so it would appear in Gmail's content snippet thing).

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Bad CSS on the multipage version

2008-06-04 Thread Philip Taylor
On 04/06/2008, Křištof Želechovski [EMAIL PROTECTED] wrote:
 Regarding your page at the URL
  http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level.html
  #the-embed:
  [...]
  Element headings (level 4) are invisible
  (obscured underneath the following content).

Seems to be an IE CSS bug like in
http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!DOCTYPE%20html%3E%0D%0A%3Cstyle%3E%0D%0A%20.a%20%7B%20border%3A2px%20blue%20solid%20%7D%0D%0A%20.b%20%7B%20border%3A2px%20green%20solid%3B%20background%3Ayellow%3B%20margin-top%3A-0.8em%20%7D%0D%0A%3C%2Fstyle%3E%0D%0A%3Cdiv%20class%3Da%3EThis%20text%20should%20be%20visible%20on%20top%20of%20the%20yellow%0D%0A%20%3Cdiv%20class%3Db%3E...%3C%2Fdiv%3E%0D%0A%3C%2Fdiv%3E

That case fails in IE7; it works in IE8 (and in recent versions of
Firefox, Opera, Safari, Konqueror).

I don't know if there's a 'proper' way to fix this, but adding
   h4 { position: relative; }
into the page's CSS makes it work correctly in IE7, and doesn't affect
any other browser.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] More ImageData issues

2008-04-29 Thread Philip Taylor
On 22/02/2008, Oliver Hunt [EMAIL PROTECTED] wrote:
  At the moment the spec merely states that
 putImageData(getImageData(sx,sy,..),sx,sy) should not
 result in any visible change to the canvas, however for those
 implementations that use a premultiplied buffer there is a necessary
 premultiplication stage during blitting that results in a loss of precision
 in some circumstances -- the most obvious being the case of alpha == 0, but
 many other cases exist, e.g. (254, 254, 254, alpha < 255).  This loss of
 precision has no actual effect on the visible output, but does mean that in
 the following case:
  imageData = context.getImageData(0,0,...);
  imageData.data[0]=254;
  imageData.data[1]=254;
  imageData.data[2]=254;
  imageData.data[3]=1;
  context.putImageData(imageData,0,0);
  imageData2 = context.getImageData(0,0,...);

  At this point implementations that use premultiplied buffers can't
 guarantee imageData.data[0] == imageData2.data[0]

  Currently no UA can guarantee a roundtrip so i would suggest the spec be
 updated to state that implementations do not have to guarantee a roundtrip
 for any pixel where alpha < 255.

The spec does not state that getImageData(putImageData(data)) == data,
which is where the problem would occur. It only states that
putImageData(getImageData) == identity function, which is not a
problem for premultiplied implementations (since the conversion from
premultiplied to non-premultiplied is lossless and reversible). So I
don't think the spec needs to change at all (except that it could have
a note mentioning the issue).

(getImageData can convert internal premultiplied (pr,pg,pb,a) into
ImageData's (r,g,b,a):

if (a == 0) {
    r = g = b = 0;
} else {
    r = (pr * 255) / a;
    g = (pg * 255) / a;
    b = (pb * 255) / a;
}

(using round-to-zero integer division). putImageData can convert the other way:

pr = (r*a + 254) / 255;
pg = (g*a + 254) / 255;
pb = (b*a + 254) / 255;

Then put(get()) has no effect on the values in the premultiplied buffer.)
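
The claim can be checked exhaustively. The sketch below restates the two
conversions for a single channel in JavaScript (the function names are
made up for illustration) and assumes a valid premultiplied buffer, i.e.
0 <= pr <= a:

```javascript
// Premultiplied (pr, a) -> non-premultiplied r, as getImageData would do.
function unpremultiply(pr, a) {
  if (a === 0) return 0;
  return Math.floor((pr * 255) / a); // round-to-zero; values are non-negative
}

// Non-premultiplied (r, a) -> premultiplied pr, as putImageData would do.
function premultiply(r, a) {
  return Math.floor((r * a + 254) / 255); // integer division that rounds up
}

// Checks put(get()) over every valid premultiplied value (0 <= pr <= a).
function putOfGetIsIdentity() {
  for (let a = 0; a <= 255; a++) {
    for (let pr = 0; pr <= a; pr++) {
      if (premultiply(unpremultiply(pr, a), a) !== pr) return false;
    }
  }
  return true;
}
```

The other direction is lossy: r=254 with a=1 is stored as pr=1, which
reads back as r=255, matching the example in the quoted mail.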

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Feeedback on dfn, abbr, and other elements related to cross-references

2008-04-21 Thread Philip Taylor
On 21/04/2008, Smylers [EMAIL PROTECTED] wrote:
 Can you link to examples of such webpages, which have abbr elements
  without title attributes?  What does that mark-up currently achieve?

Out of 130K pages from dmoz.org, I see 592 using abbr elements, and
36 of those using it at least once with no title attribute. If anyone
cares enough, they could look through the list to see how many are
bogus and how many are expecting something useful and what they seem
to be expecting.

Those 36 pages which used abbr with no title a couple of months ago:

http://bundesrecht.juris.de/gsgv_9
http://linuxdidattica.org/
http://markcronan.livejournal.com/33814.html
http://observer.guardian.co.uk/politics/story/0,6903,449920,00.html
http://outer-court.com/goodies/index.htm
http://spazioinwind.libero.it/saf/
http://tubewhore.livejournal.com/
http://www.artofeurope.com/wong/
http://www.beepworld.de/members10/princessa18/
http://www.cs.tut.fi/~jkorpela/latinaohje.html
http://www.danscamera.com/
http://www.fwbosheffield.org/
http://www.gnu.org/
http://www.jokan.de/technik-c2.html
http://www.mozilla.org/directory/
http://www.mozilla.org/projects/mathml/
http://www.offaly.ie/offalyhome/visitoffaly/Attractions/Family/bog+train.htm
http://www.rekordbog.dk/
http://www.seobythesea.com/
http://www.travelphp.com/
http://www.treseta.fi/
http://www.voyager.prima.de/cpp/books1.html
http://www.w3.org/TR/XMLHttpRequest/
(plus 5 more on guardian.co.uk, and 8 more on beepworld.de)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] ALT and equivalent representation

2008-04-18 Thread Philip Taylor
On 18/04/2008, Bill Mason [EMAIL PROTECTED] wrote:
  The example was a case of a hacker who replaces the Google logo on
 google.com with an image only containing the text WE HACKED YOUR SERVERS.
 We assume the hacker cares enough about accessibility to set the alt
 attribute to the same text.

More generally (and less hypothetically), this is any case where an
image is being used just to display text (in a nicer font, or nicer
colours, or animated and on fire, or some other reason it's worth
using an image instead of plain HTML).


  Since the image is no longer the company logo, it falls outside the logo
 discussion in the Icons requirement for alt.

I believe the company logo case is also unclear in the spec. See e.g.
http://www.google.com/ (when it's not a special day) - the image is
simply the word "Google" (as a page heading, so it should probably be
in <h1>), so common sense says it should have alt="Google". The spec
phrase "Icons: a short phrase or label with an alternative graphical
representation" sounds like it might apply here, but none of the cases
in that section seems to work: in particular, I don't think "the logo
is being used to represent the entity" would apply, because the
purpose of the image is not to represent the entity (as it would be in
e.g. a list of search engines that shows small images of all their
logos so you can choose your favourite), and instead its purpose is to
tell users what site they are on (and to make it look prettier). It
should be made clearer whether the existing case does or does not
apply. If it does not apply, it should be made clear what alt text to
use instead.


Since we're on this topic...


What should happen for 'tracker' images? (i.e. <img
src="http://evil.google.com/user-track.php?site=97519340" width=1
height=1 alt=???>)
As some examples, Geocities has alt="setstats", someone has
alt="statystyka", someone has alt="CrawlTrack: free crawlers and
spiders tracking script for webmaster- SEO script -script gratuit de
détection des robots pour webmaster", etc, and those examples do not
help users who are seeing the alt text.

Such images are pretty common, and they're not going to go away, so we
should minimise their harm by saying alt="" is appropriate. None of
the cases in the spec seem to cover this case yet.


http://validator.nu/?doc=http%3A%2F%2Fwww.google.com%2F&showimagereport=yes&showsource=yes
shows that some versions of Google (depending on cookies, IP address,
etc) implement the Google logo as four separate images,
approximately like:

.-------.-.-----.
| G o o |g| l e |
'-------+-+-----'
        '-' Suomi

where the "Suomi" (text, not image) is adjacent to the g's descender.

The "Goo" image has alt="Google", and the other three images have
alt="". When the page is viewed without images, that means it will say
"Google" instead of the logo, which is a good thing. But HTML5 says
that the alt text is equivalent to the image, which is not true (and
could only be satisfied by alt="Goo", alt="le", alt="Most of a g",
alt="A little bit of a g", which would be silly) - in this case, it is
the combination of alt texts on the whole page that is equivalent to
the combination of images on the page.

google.com is splitting the image up to fit it in a layout table,
which is non-conforming HTML5; but there are other more legitimate
reasons for having several img elements representing a single piece of
text, and in those cases it seems sensible to put alt="all the text"
on one image and alt="" on the others. Should HTML5 be changed to
accept this?


And as a more general point, the spec provides a list of cases for
using img (and how to use alt for those cases), but this list will
never be complete (especially since the case matches are all
subjective and open to interpretation in multiple ways), so there
needs to be a default case statement for images where the author
doesn't think any of the specific requirements applies.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Question about the PICS label in HTML5

2008-04-16 Thread Philip Taylor
On 16/04/2008, David Gerard [EMAIL PROTECTED] wrote:
 I may have missed it, but does anyone, anywhere, actually use PICS? I
  don't think I've even heard the name uttered in a few years - I
  assumed it had died of neglect and lack of interest.

About 1% of the pages listed on dmoz.org attempt to use it - see
http://philip.html5.org/data/pics-label.html

(I have no idea how many of those uses are syntactically valid (maybe
someone could test that if they're quite bored), or are appropriate
for the page's content.)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] A comment to character encoding declaration

2008-03-05 Thread Philip Taylor
On 03/03/2008, Jjgod Jiang [EMAIL PROTECTED] wrote:
  During the development of CJK information processing, many
  text encodings are just strict subsets of another one; for
  example, GB2312 is a subset of GBK, and GBK is a subset of
  GB18030. For compatibility purposes, a lot of web pages use a
  character encoding declaration like this:

  <meta http-equiv="Content-Type" content="text/html; charset=gb2312">

  in their header, yet they might use characters in GBK but
  not in GB2312. So, I think we can suggest clients to simply
  treat encodings like these as their biggest superset, for
  instance, treat GB2312 as GB18030.

Out of 130K pages from dmoz.org, I see 760 which are declared as
gb2312 (by HTTP Content-Type, meta content, etc).

Of those 760, 120 cause decoding errors in ICU4J when treated as
gb2312. 8 cause errors when treated as gbk, and the same 8 cause
errors as gb18030.

Those 8 are:
http://www.bigm.com.cn/dinosaur/anecdote/
http://www.ccpc.edu.cn
http://www.gdoverseaschn.com.cn/
http://www.jgbr.com.cn
http://www.liechebuluo.com
http://www.netbro.com.cn
http://www.tkdts.com
http://www.wuxi-accp.com/
and I haven't tried working out why they are causing errors.

The 120 are listed at
http://philip.html5.org/data/gb2312-errors.txt. I don't know how
many are really using gb18030, and how many are not actually gb* but
happen to be decoded without errors because they use compatible byte
sequences; but it does look like gb2312 is a fairly significant
problem if it's not treated as gbk/gb18030, so it would be helpful to
suggest/require it to be processed specially.
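
As a rough illustration of what "processed specially" could look like,
here is a sketch of a label override applied before choosing a decoder.
The table and function name are hypothetical, and the mapping is based
only on the subset relations described in the quoted mail:

```javascript
// Hypothetical override table: declared label -> encoding actually used
// for decoding. GB2312 is a subset of GBK, which is a subset of GB18030,
// so decoding with the superset accepts everything the subset allows.
const SUPERSET_OVERRIDES = new Map([
  ['gb2312', 'gb18030'],
  ['gbk', 'gb18030'],
]);

// Returns the encoding to decode with for a declared charset label.
function effectiveEncoding(declaredLabel) {
  const label = declaredLabel.trim().toLowerCase();
  return SUPERSET_OVERRIDES.get(label) || label;
}
```

Labels that are not in the table pass through unchanged (apart from
trimming and lower-casing).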

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Proposal for a link attribute to replace a href

2008-02-28 Thread Philip Taylor
On 28/02/2008, Shannon [EMAIL PROTECTED] wrote:
 http://wiki.whatwg.org/wiki/FAQ#Does_HTML5_support_href_on_any_element_like_XHTML_2.0.3F

 So 'backwards-compatibility', as defined by the same document, can be
 achieved by using javascript to walk the DOM and add
 'window.location(node.getAttribute('link'))' to the onclick handler of
 any nodes with a link attribute. I have done a very similar thing before
 to implement :hover on non-anchor elements in IE.

I imagine the script would have problems with incremental loading - if
someone clicks a link before the page has finished loading and before
the script has executed, then it won't work. Is there a way to avoid
that problem and make it work as well as a real implementation?

There are also tools like search engines that need to recognise links
and can't be fixed by compatibility scripts. (Fortunately it's much
easier to upgrade all the world's search engines than all its web
browsers, so this is a less significant issue than with browsers.)

 A global attribute offers several features that a
 does not - most importantly nested links and the ability to hyperlink
 block and interactive elements without breaking validation.

Are there cases where <div> ...<a href=... style="display:block;
width:100%; height:100%"> ... </a></div> is not adequate for making
block links?

 FAQ:  * It doesn't make sense for all elements, such as interactive
 elements like input and button, where the use of href would interfere
 with their normal function.

 As long as the spec is clear about which actions take precedence then
 this is not an issue.

Having to make the spec clear is an issue :-)
It would take quite a bit of effort to design and specify the feature
in sufficient detail, and to write test cases, and to update the spec
in response to implementor feedback about what would cause them fewer
problems. That is all much harder when the new feature interacts with
a lot of existing features (inputs, buttons, image maps, iframes, DOM
events, etc), compared to something fairly self-contained (like
video).

 How is a global link/href any more
 difficult than the existing implementations of onmouseup/down/whatever?
 It's basically the same thing - only *simpler* (no scripting, events,
 bubbling, etc).

As far as I'm aware, HTML elements currently have at most one default
click-event handler and any number of DOM handlers. The new link
attribute wouldn't be a DOM event handler (since it ought to behave
much more like a href), so it would either be an entirely new type
of event handler or it would break the assumption that there is a
single default handler, and I can imagine that that would cause
difficulties. (But I have no implementation experience so I could be
entirely wrong.)

There are cases like

  <button type=submit link=... onclick="event.preventDefault()">
  <button type=submit link="javascript:event.preventDefault()">
  <a href=1 link=2 onclick="window.location=3">
  <a href=1 link=2 onclick="window.location=3; return false">
  etc

where the interaction between several aspects of the event model would
have to be defined (and implemented and tested), requiring some new
complexity on top of the current href+onclick model.


Another issue is that <a href> has a number of other attributes for
links, and it would be bad design if a generalisation of href didn't
allow those attributes on other elements. That includes 'target'
(conflicts with <base target>, <form target>), 'type' (conflicts with
<style type>, <script type>, <embed type>, <object type>), 'media'
(conflicts with <style media>, <link media>), etc.

Is there a nice way to solve those conflicts? Renaming the link
attributes (the same as renaming 'href' to 'link') would be confusing
to people who already know HTML, and it would be hard to find good
names that aren't used already. Defining lots of exceptional cases for
certain attributes on certain elements would make the language harder
to understand and implement and test. Defining exceptions for a
category of 'non-visible elements' (script, style, etc) wouldn't work
since <script style=display:block> is not non-visible. I'm not sure
how this could be made to work well.


 Shannon


-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] html start tag token in the root element phase

2008-02-11 Thread Philip Taylor
On 29/06/2007, Henri Sivonen [EMAIL PROTECTED] wrote:
 If the spec dealt with the html start tag token directly in the
 root element phase, the parse error in the main phase wouldn't need
 to be conditional. (Implementations that experience a perf benefit
 from not mutating the attributes of a node probably want to hoist the
 html node creation to the root element phase for perf reasons, too.)

There's also an issue with:

  <!doctype html>
  foo
  <html>

not producing any parse error, because the <html> is the first start
tag token (at least under my interpretation) and therefore is
considered valid. Handling <html> specially in the root element phase
seems like a reasonable way of fixing this.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas line styles comments

2008-02-02 Thread Philip Taylor
Some comments on the newly modified version:


"The lineCap attribute defines the type of endings that UAs shall
place on the end of lines." - it seems weird to use "shall", since
this is the only place in the spec (except the list of RFC2119
keywords) that uses it. The other line* properties don't try to define
conformance requirements like that (e.g. they say "The lineWidth
attribute gives the width of lines" which is only informative), so I
can't tell whether the lineCap one is trying to be a requirement.


"The lineJoin attribute defines the type of corners that that UAs will
place where two lines meet." - s/that that/that/


"A join exists at any point in a subpath shared by two consecutive
pairs of lines." - should be "two consecutive lines" or "a consecutive
pair of lines".


"In addition to the point where the join occurs, two additional points
are relevant to each join: the corners found half the line width away
from the join point, perpendicular to the two lines joining at the
join point." - I'm not sure what that means. Nothing can be
perpendicular to both of the two lines (unless they're parallel). For
each line, there are the two corners half the line width away from the
join point perpendicular to that line, but that gives four corners in
total.

I suppose it'd be alright to say there are four corners, and then talk
about "the two corners on the outside of the join", since the meaning
of "outside" is obvious enough even if it's not defined (at least when
the lines aren't parallel).


"A filled triangle connecting ... with the third point of the triangle
being the point of the join itself (where the lines touch on the
inside of the join), must be rendered at all joins." - the "inside of
the join" bit seems unhelpful and unclear (since it's not the opposite
of the "outside of the join") - it'd be better just to say "... being
the join point, must be ...", since that's the term used earlier for
that point.


"The round value means that a filled arc connecting the two corners on
the outside of the join, with the diameter equal to the line width and
the origin at the point of the join, must be rendered at joins." - if
I was being pedantic (which I am) I'd say there are two possible arcs
connecting those two corners (one clockwise, one anticlockwise), so it
should specify which one is meant. But I don't know how to easily say
that, and an implementor would have to be silly to do it the wrong
way, so maybe a precise definition isn't needed.

Should lineJoin='round';moveTo(0,0);lineTo(100,0);lineTo(0,0);stroke()
draw a semicircle at (100,0) pointing rightwards? There is no outside
of the join there, so the spec doesn't say what should happen.


"The miter value means that a filled four-sided polygon must be
rendered at the join, with two of the lines being the perpendicular
edges of the joining lines, ..." - the miter-polygon lines aren't the
perpendicular edges - they're only half of each edge (between the join
point and the outside corners). It's probably easier to define the
polygon's points (being the join point, the two outside corners, and
the point where the two extended outside edges intersect).
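
Defining the polygon by its points is straightforward to compute. The
following is a sketch only, not spec text: the function name and the
{x, y} point representation are made up, it assumes both lines have
non-zero length, and it ignores miterLimit:

```javascript
// Returns the four points of the miter polygon for the join at p1 between
// the lines p0-p1 and p1-p2, or null when the lines are parallel (there is
// then no well-defined outside of the join). Points are {x, y} objects.
function miterPolygon(p0, p1, p2, lineWidth) {
  const unit = (from, to) => {
    const dx = to.x - from.x, dy = to.y - from.y;
    const len = Math.hypot(dx, dy);
    return { x: dx / len, y: dy / len };
  };
  const d1 = unit(p0, p1);
  const d2 = unit(p1, p2);

  // Sign of the turn; zero means the two lines are parallel.
  const cross = d1.x * d2.y - d1.y * d2.x;
  if (cross === 0) return null;
  const s = cross > 0 ? 1 : -1;

  // Unit normals pointing away from the other line, i.e. to the outside.
  const n1 = { x: s * d1.y, y: -s * d1.x };
  const n2 = { x: s * d2.y, y: -s * d2.x };
  const h = lineWidth / 2;

  // The two outside corners, half the line width from the join point.
  const cornerA = { x: p1.x + h * n1.x, y: p1.y + h * n1.y };
  const cornerB = { x: p1.x + h * n2.x, y: p1.y + h * n2.y };

  // The extended outside edges meet on the bisector of n1 and n2, at the
  // point whose offset along each normal is exactly h.
  const k = h / (1 + n1.x * n2.x + n1.y * n2.y);
  const tip = { x: p1.x + k * (n1.x + n2.x), y: p1.y + k * (n1.y + n2.y) };

  return [p1, cornerA, tip, cornerB];
}
```

For a right-angle join at (1,0) between (0,0)-(1,0) and (1,0)-(1,1) with
lineWidth 2, the outside corners are (1,-1) and (2,0) and the tip is
(2,-1).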

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas patterns, and miscellaneous other things

2008-02-02 Thread Philip Taylor
On 31/01/2008, Ian Hickson [EMAIL PROTECTED] wrote:
 I've made toDataURL() return data:, if it's faced with a 0-pixel image.
 It's arbitrary, but I guess it represents the image, at least!

That makes the "Note: When trying to use types other than image/png,
authors can check if the image was really returned in the requested
format by checking to see if the returned string starts with one of the
exact strings "data:image/png," or "data:image/png;"." now incorrect.
The non-image/png format might be unsupported, but someone might be
drawing a 0-pixel image and they'll get back something that doesn't
start with data:image/png[,;].

It does seem pretty weird to return text/plain content when asked for
an image. But I guess it's safer than e.g. returning an empty string,
since it won't get misinterpreted as a relative address when people do
img.src=canvas.toDataURL(), and I can't think of a better idea.
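
An author-side check that copes with the 0-pixel case might look like
the sketch below. The function name and return values are invented for
illustration; the only behaviour assumed is what is described above
(a "data:," result for 0-pixel images, PNG as the fallback format):

```javascript
// Classifies a string returned by toDataURL(requestedType).
// Returns 'empty' for the 0-pixel case, 'requested' if the image came back
// in the requested format, 'png' if the UA fell back to PNG, else 'other'.
function classifyDataURL(url, requestedType) {
  const hasType = (type) =>
    url.startsWith(`data:${type},`) || url.startsWith(`data:${type};`);
  if (url === 'data:,') return 'empty';
  if (hasType(requestedType)) return 'requested';
  if (hasType('image/png')) return 'png';
  return 'other';
}
```

Checking for "data:," first keeps the 0-pixel case from being mistaken
for an unsupported format.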

 User agents may impose implementation-specific limits on otherwise
 unconstrained inputs, e.g. to prevent denial of service attacks, to guard
 against running out of memory, or to work around platform-specific
 limitations. (See ...#hardwareLimitations.)

Does anything say that those limitations should be imposed by throwing
an exception, and not by e.g. returning null or aborting the entire
script?

 I'm assuming that the DOM Bindings
 for JS spec will define how 'undefined' really means 'null'

Hmm, I can imagine 'undefined' converted to a DOMString becoming the
string "undefined". (That's at least what
document.createTextNode(undefined) does). But I can just assume for
now it's meant to work like null.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas arcTo

2008-02-02 Thread Philip Taylor
On 31/01/2008, Ian Hickson [EMAIL PROTECTED] wrote:
 On Mon, 2 Jul 2007, Philip Taylor wrote:
  If the point (x2, y2) is on the line defined by the points (x0, y0) and
  (x1, y1) then the method must do nothing, as no arc would satisfy the
  above constraints. - why would no arc satisfy the constraints? If P0,
  P1, P2 are collinear and non-coincident, then (I think) any of the
  (infinitely many) circles which have the given radius and touch
  tangential to the line P0-P2 will satisfy the constraints (i.e. being
  tangential to P0-P1 at some point and to P1-P2 at some point).

 The idea is to just take the two (infinite) lines that are defined by the
 points (end at P1, cross P0 and P2), and draw a circle with the given
 radius between them.

 When the lines are the same line (i.e. P0-P1 is parallel to P1-P2) then
 no circle with a finite non-zero radius can touch the line tangentially at
 more than two points, since for each half of the circle, every point has a
 different tangent, and the two points on opposite sides of the circle are
 tangents to parallel but distinct lines unless the radius is zero.

 No?

The circle can't touch tangentially at two distinct points, but
nothing said there had to be two distinct points. There just had to be
one point on the circle tangential to one line, and one point
tangential to the other line, so they could easily be equal points.
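
For reference, the collinear case being argued about is the one detected
by a test like the following sketch. The function name is made up, it
assumes (x0, y0) and (x1, y1) are distinct, and the exact comparison is
only reliable for inputs that carry no rounding error:

```javascript
// True when (x2, y2) lies on the infinite line through (x0, y0) and
// (x1, y1).
function isOnLine(x0, y0, x1, y1, x2, y2) {
  // Cross product of (P1 - P0) and (P2 - P0); zero means collinear.
  return (x1 - x0) * (y2 - y0) - (y1 - y0) * (x2 - x0) === 0;
}
```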


About the updated specification:

"the method must add a point (x&inf;, y&inf;)" - s/&inf;/&infin;/

"the infinite line that crosses the point (x0, y0) and ends at the
point (x1, y1)" - it could be clearer to say "half-infinite line". (It
seems the technical term is "ray" or "half-line", but those aren't as
clear.)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas line styles comments

2008-02-02 Thread Philip Taylor
On 02/02/2008, Kristof Zelechovski [EMAIL PROTECTED] wrote:
 The rounding arc should be chosen
 so that it is not contained in the convex hull of the stroke path segments
 terminated at the points where the arc begins.

I believe I can see the idea there, but I can't quite tell what that
phrase means about "terminating". The "contained within" also seems
inaccurate, because e.g.
lineWidth=100;moveTo(0,0);lineTo(1,0);lineTo(1,1) would result in a
convex hull that doesn't contain either arc, though I think it'd be
alright if it said "does not intersect" instead.

A possible alternative that seems simpler and (I think) correct
(except in the special parallel case): "The rounding arc should be
chosen so that if it was closed, it would not contain the join point."

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas line styles comments

2008-02-02 Thread Philip Taylor
On 02/02/2008, Kristof Zelechovski [EMAIL PROTECTED] wrote:
 You considered the convex hull of the original lines to get that paradox;
 I had the stroke path segments in mind.
 (Stroke path segments are the path equivalent of the stroked curve
 when the stroke operator is not allowed and must be replaced by the fill
 operator).
 Each line corresponds to two parallel stroke path segments;
 two of them intersect and the other two get joint with an arc.
 One of the possible arcs is in the convex hull of those stroke path
 segments.

If the two lines are very short, their stroke paths will (if I
understand correctly) look like

   .-.
   | |
   | |
   | |
 .-|-*---.
 '-|-|---'
   | |
   | |
   '-'

where the * is the join point and the short lines are the two parallel
stroke path segments of each line. Then the convex hull is nearly a
square rotated by 45 degrees, like

   .-.
  /| |'-
/  | |  '-
  /| |'-.
 .-|-*---.
 '-|-|---'
  '.   | |.-'
'-.| |_.-'
   '-'

and so an arc with radius lineWidth/2 from the rightmost point going
clockwise to the upmost point will not be contained entirely within
that nearly-square. So neither arc is within the convex hull.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] More random comments on the putImageData definition

2008-01-23 Thread Philip Taylor
On 23/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote:
 It would be great if putImageData
 could take a source region, in addition to the destination.  One of
 the primary reasons for using get/putImageData is to allow JS to
 rapidly blit data to the screen, however without an ability to blit
 only a subregion of the image data the only available options are to
 either re-blit the entire imagedata region (which can be expensive due
 to the need for [un]premultiplying in some (all?) implementations),

((Opera does non-premultiplied colour internally.))

 or create and populate a new ImageData object which still requires more
 work than would ideally be necessary.

You can also create a temporary canvas and putImageData once onto
that, and then drawImage sections onto the screen as they are needed.
That lets you draw lots of sections lots of times quickly (since
you're mostly drawing from the optimised canvas surface format, not
from a JS array), which perhaps helps in some (most?) of the cases.
(You still have to do a single putImageData of the whole data to get
it onto the temporary canvas, but if there are parts of the data you
aren't ever using then you should just make the ImageData smaller and
cut out the unused bits.)

-- 
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas bits

2008-01-21 Thread Philip Taylor
"The ImageData object's width is greater than zero." (and subsequent
lines) is wrong, since it's talking about an object that's explicitly
not an ImageData.

What happens with NaN in imagedata.data? (NaN is a Number, so it's
allowed in the data array. It's not below 0, or above 255, and it
can't be rounded to the nearest integer.)
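
For comparison, here is a minimal sketch of how those cases come out in an engine where the pixel array is a Uint8ClampedArray. That is an assumption about the environment, not something the spec text under discussion says, and the helper name storeAsPixels is made up for illustration.

```javascript
// Hypothetical helper: store arbitrary Numbers the way a
// Uint8ClampedArray does (clamp to 0..255, round to an integer).
function storeAsPixels(values) {
  return Array.from(new Uint8ClampedArray(values));
}

// NaN, a value below 0, a value above 255, and a non-integer.
const stored = storeAsPixels([NaN, -20, 300, 127.6]);
console.log(stored);
```

Under that typed-array conversion NaN is stored as 0, which is one possible answer to the question, not necessarily the one the spec should pick.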

"Note: The transformation is applied to the path when it is drawn" -
oh no it isn't.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas ImageData comments

2008-01-18 Thread Philip Taylor
On 18/01/2008, Ian Hickson [EMAIL PROTECTED] wrote:
 On Sat, 16 Jun 2007, Philip Taylor wrote:
 
  Colour spaces are not dealt with at all, but are particularly relevant
  for getImageData (else you have no idea what the values mean).

 Fixed, in theory. But since I have no idea what I'm talking about here,
 you'll have to check closely to make sure I didn't babble incoherently.

I don't know much about colour spaces either, so someone else should
check that it's sane :-)

  maybe it's safest to just say that all colours throughout the canvas API
  must be handled consistently in the same colour space (without saying
  exactly which it is).

 Wouldn't that mean that different browsers could have different effects
 when rendering external images -- with gamma -- to the canvas?

I guess so; and things like gradients wouldn't work consistently if
the colour space wasn't consistent.

Maybe the desired properties are:
- drawImage(img) onto a displayed canvas should look the same as
the original img, regardless of whether the image has gamma etc.
- toDataURL should return the same raw pixel data as getImageData, at
least for image/png (though other formats might make that impossible),
for consistency.
- drawImage(toDataURL()) should have no effect.

I'd also like:
- fillStyle = 'rgb(r, g, b)'; fillRect(...); getImageData returns
exactly [r, g, b, 255].
mainly because that makes it possible to write test cases that use
getImageData to check the results.

I don't know if any of these are wrong, or if others are missing. And
I have no idea if this is trivial for implementors, or if it's
impossible. So I don't have any useful suggestions.

  "The putImageData(image, dx, dy) method must take the given ImageData
  structure, and draw it at the specified location dx,dy in the canvas
  coordinate space, mapping each pixel represented by the ImageData
  structure into one device pixel." - how should it 'draw it'? Given the
  requirement on putImageData(getImageData(...)), it has to be replacing
  the pixels in that area rather than doing anything like normal drawing,
  but that isn't explicit.

 Is it better now?

It looks clear enough to me.

  In the example code:
  [...]
  "function FillCload(data, x, y) { ... }" - should be "function
  FillCloud(data, x, y) { ... }".

That error was replaced with "function AddCload(data, x, y) { ... }" - s/a/u/

  "The width and height (w and h) might be different than the sw and sh
  arguments to the function" - 'different than' sounds a bit odd to me
  here; maybe I'd prefer 'different from'.

Oops, I was wrong to mention that - 'different than' seems to be
common in some Englishes, and I don't want to complain when it's just
dialect variations.

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Minor addition/rewording for canvas section

2008-01-13 Thread Philip Taylor
On 13/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote:
 Writing to a canvas from a different origin isn't considered a threat,
 the problem is
 evil.example.com reading data from the canvas after naive.example.com
 has put
 private/confidential information into the canvas.

In that case, evil.example.com shouldn't be allowed to read anything
(pixel data or context state) from the canvas after naive.example.com
has done anything at all to it (e.g. calling fillRect, or setting
fillStyle, etc), because otherwise some potentially-private
information will be leaked. (putImageData can be emulated using
fillRect, so it wouldn't make much sense to have different security
restrictions depending on which equivalent mechanism you use.)

Don't the normal same-origin restrictions already prevent
naive.example.com and evil.example.com accessing the same canvas
element, in the same way as (I assume) they prevent evil.example.com
accessing an <input type=password>.value from a naive.example.com
document?

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Minor addition/rewording for canvas section

2008-01-13 Thread Philip Taylor
On 13/01/2008, Oliver Hunt [EMAIL PROTECTED] wrote:
 I did wonder about why other origins could read anything myself, so
 you're not
 alone -- it just seemed especially odd to allow images to be written
 safely but not
 ImageData.

As far as I'm aware, different origins can never read and write the
same canvas. Images are given special consideration because scripts
already have access to Image objects where the image has a different
origin to the script, like:

  // on a page on www.example.com
  var img = new Image();
  img.onload = function () { ctx.drawImage(img, 0, 0); }
  img.src = 'http://google.com/images/logo.gif';

The canvas reading/writing all happens in the same origin - it's just
the image itself that is not the same origin.

The same does not apply to ImageData, because scripts don't have
access to ImageData objects from other origins.

-- 
Philip Taylor
[EMAIL PROTECTED]


[whatwg] must only ambiguity

2007-12-21 Thread Philip Taylor
"Documents and document fragments" / "Structure" says "Authors must only
use elements in the HTML namespace in the contexts where they are
allowed, as defined for each element."

That phrase is unclear. It could be interpreted as:

Authors must { only use elements in the HTML namespace } in { the
contexts where [elements in the HTML namespace] are allowed }, i.e.
contexts expecting HTML namespaced elements mustn't contain foreign
content.

Authors must { [...] use elements in the HTML namespace } [only] { in
the contexts where they are allowed }, i.e. HTML elements must not be
used where they aren't allowed.

Authors must only { use elements in the HTML namespace in the
contexts where they are allowed }, i.e. pretty much every imaginable
action in the entire world is disallowed, except for using elements
where allowed.

A suggested replacement: "Authors must not use elements in the HTML
namespace except where allowed by the context defined for the
element."


Similarly, "Authors must only put elements inside an element if that
element allows them to be there according to its content model" should
be fixed to say something like "Authors must not put elements inside
an element unless that element allows them to be there according to
its content model."


More generally, all uses of "must only" and "may only" etc seem
dangerous. The spec says "The key words [...] in the normative parts
of this document are to be interpreted as described in RFC2119", but
instead they have to be interpreted as described by the standard
English grammar rules when they're used in complex phrases like "must
only", which makes the spec harder to read when you're trying to read
the normative requirements, and can cause misunderstanding. (Does that
make things particularly harder for non-native-English-speaking
people?)

The conformance requirements would be clearer if all occurrences of "x
must only y" and "x may only y" were replaced by "x must not { not y
}" or by "x may y, and x must not { not y }".

Similarly, "x should only y if z" (e.g. "authors should only use these
elements if the absence of those elements would change the meaning of
the content") should be replaced by "x should y if z, and should not
do so otherwise" or "x should not y if not z" (depending on which
directions the 'should' applies in).

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] HTML5 and URI Templates

2007-12-17 Thread Philip Taylor
On 17/12/2007, James M Snell [EMAIL PROTECTED] wrote:
 It should be possible for us to also do something like:

   <form action="http://example.org/form_processor"
         template="http://example.org?{-join|&|a,b}"
         method="POST">
     <input name="a" type="text" />
     <input name="b" type="text" />
     <input name="c" type="text" />
     <input name="d" type="text" />
   </form>

 [...]

 HTML5 Post:

   POST /example.org?a=w&b=x
   Host: example.org
   ...
   c=y&d=z

 HTML4 Post:

   POST /form_processor
   Host: example.org
   ...
   a=w&b=x&c=y&d=z

 - James

Presumably people will use more than one templated form on their site,
but won't want lots of separate form_processors, so they would have to
use

   <form action="http://example.org/form_processor?{-join|&|a,b}"
         template="http://example.org?{-join|&|a,b}"
         method="POST">

or something theoretically more correct like

   <form action="http://example.org/form_processor?%7B-join%7C&amp;%7Ca,b%7D"
         template="http://example.org?{-join|&amp;|a,b}"
         method="POST">

and then they can drop in a standard generic form_processor script to
handle everything automatically.


Most legacy browser users could be handled by a script which adds
onsubmit hooks to rewrite the 'action' attribute before submitting. (I
assume that'd work correctly in current browsers, but haven't tested
it). That would avoid the need for repeating the template URI twice
(with the associated risks of typing one of them wrong and not
noticing), if you don't want to handle scriptless users.
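
A rough sketch of the rewriting step such an onsubmit hook would need, assuming "{-join|sep|names}" expands to name=value pairs joined by sep (which is what the a=w&b=x example above implies) and that fields used by the template are dropped from the submitted body. The function name expandTemplate is made up, only -join and plain {name} are handled, and the DOM wiring (reading the inputs, setting form.action) is left out.

```javascript
// Hypothetical helper, not part of any spec or library.
// template: URI template string; fields: { name: value } from the form.
function expandTemplate(template, fields) {
  const has = (name) => Object.prototype.hasOwnProperty.call(fields, name);
  const pair = (name) =>
    encodeURIComponent(name) + '=' + encodeURIComponent(fields[name]);
  const used = new Set();

  const url = template.replace(/\{([^{}]*)\}/g, (match, expr) => {
    // Assumed -join semantics: name=value pairs joined by the separator.
    const join = /^-join\|([^|]*)\|(.*)$/.exec(expr);
    if (join) {
      const names = join[2].split(',').filter(has);
      names.forEach((name) => used.add(name));
      return names.map(pair).join(join[1]);
    }
    // Plain {name}: substitute the encoded value, or nothing if absent.
    if (has(expr)) {
      used.add(expr);
      return encodeURIComponent(fields[expr]);
    }
    return '';
  });

  // Fields not consumed by the template stay in the request body.
  const body = Object.keys(fields)
    .filter((name) => !used.has(name))
    .map(pair)
    .join('&');

  return { url, body };
}

const result = expandTemplate('http://example.org?{-join|&|a,b}',
                              { a: 'w', b: 'x', c: 'y', d: 'z' });
console.log(result.url, result.body);
```

The hook would then set the form's action to result.url before the browser submits, which still leaves the feature-detection question below unanswered.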

(How would the script know when it should do the rewriting, and when
it should leave everything to the browser? There's no obvious feature
test it can perform.)


Wondering about why this feature would be used:

If everyone who uses template URIs uses these backward-compatibility
additions (which they have to, unless they have no users), why would a
browser implement native support for template URIs? (The reason I can
think of is that it provides a slightly better user experience,
because you can go directly to the destination rather than being
delayed by a round-trip to form_processor, but that's no faster than
the scripted approach.)

If everyone who uses template URIs has to use these
backward-compatibility additions, why would they go to that effort
instead of using some server-side redirection logic to perform the
desired processing at the normal non-templated ugly URI? (Maybe it
makes the system cleaner if the server code has a nice URI-based API
and the client code does the mapping onto that, but I have no idea how
much difference it really makes. More significantly, it allows the
direct use of external resources that have sufficiently nice URIs but
don't have an equivalent GET/POST form-accessible API. I haven't seen
any other obvious useful uses yet.)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] HTML5 and URI Templates

2007-12-16 Thread Philip Taylor
On 16/12/2007, Henri Sivonen [EMAIL PROTECTED] wrote:
 On Dec 16, 2007, at 05:28, James M Snell wrote:

  <form template="http://example.org{-prefix|/|foo}?bar={bar}"
        method="POST">
   Foo: <input name="foo" type="input">
   Bar: <input name="bar" type="input">
  </form>

 What's the backward-compatibility story of this feature? (Both
 behavior of URI templates in legacy browsers and ensuring that
 existing content doesn't use braces.)

Out of ~15K random pages from dmoz.org, I see two with braces in <form action>:

http://www.bornsvilkar.dk/ - <form name="mainform" method="post"
action="BV.Main.BV.Browse.aspx?path=%2fwww_bornsvilkar_dk%2fbornsvilkar&amp;layout={0685D858-53CA-4F7E-A3C8-53D1BD7F277D}"
id="mainform">

http://bip.wokiss.pl/margoninm/ - <FORM
ACTION='index.php?pid=2&opcje=a:1:{i:0;s:6:"wyszuk";}' METHOD='POST'
ENCTYPE='multipart/form-data' name="wyszukiwarka" style="margin:0px;">


But the original example had <form template> which would avoid that
conflict. (The only template attributes I see are one page with
<widget template> and one with <edittag:edit template>.)

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Compatibility problems with HTML5 Canvas spec.

2007-09-25 Thread Philip Taylor
On 25/09/2007, Oliver Hunt [EMAIL PROTECTED] wrote:
 Firefox 2/3 and Safari 2 clear the context's path on strokeRect/
 fillRect, this violates the spec -- but there are many websites that
 now rely on such behaviour despite the behaviour defined in html5.
 This means that those browsers that match the current draft (eg.
 Safari 3 and Opera 9.x) fail to render these websites correctly.

How hard would it be to get those sites fixed? If there are problems
in something like PlotKit or Reflection.js, which lots of people copy
onto their own servers, then it would be a pain to break
compatibility. If it's just sites like canvaspaint.org where there is
a single copy of the code and the developer still exists and can
update it, it seems a much less significant problem to break
compatibility.

 Unfortunately it isn't really an edge case as it's a relatively
 common occurrence -- people expect that the rect drawing function (for
 example) will clear the path, so expect clearRect
 (myCanvasElement.width, myCanvasElement.height) to clear the rect and
 reset the path, and other similarly exciting things :-/

Firefox also resets the path on drawImage and putImageData, unlike
Opera and Safari 3 - do people depend on that behaviour too?

 --Oliver

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Issues concerning the base element and xml:base

2007-08-07 Thread Philip Taylor
On 07/08/07, Ian Hickson [EMAIL PROTECTED] wrote:
 This is how it stood back in May (using a sample of several hundred
 thousand pages taken mostly from the more popular sites); number of unique
 URIs in base href attributes as a percentage of all pages parsed:

   0: 93.7%
   1:  6.31%
   2:  0.0308%
   3:  0.00105%
   4:  0.00197%

 This is how it stands as of today (using the same sampling method):

   0: 94.1%
   1:  5.93%
   2:  0.0215%
   3:  0.000928%
   4:  0.000288%

 (All numbers rounded to three significant figures.)

That rounding seems quite misleading - if I haven't forgotten how to
do statistics, and if the details I am forgetting are not critical
ones, and if I'm not misinterpreting how you collected the data, then
the samples are independent and from a binomial distribution that can
be approximated as a normal distribution with standard deviation
sqrt(n*p*(1-p)), and if assuming n=100,000 and guessing p from the
data then the 95%-confidence (+/- 2 s.d.) ranges are something like:

0:  (93.7 +/- 0.15)%
1:  (6.3 +/- 0.15)%
2:  (0.03 +/- 0.01)%
3:  (0.001 +/- 0.002)%
4:  (0.002 +/- 0.003)%

and

0:  (94.1 +/- 0.15)%
1:  (5.9 +/- 0.15)%
2:  (0.02 +/- 0.01)%
3:  (0.001 +/- 0.002)%
4:  (0.0003 +/- 0.001)%

(though the normal approximation breaks down in the <= 0.002% bits),
so you can't determine anything about changes in frequency beyond the
zero/one cases.
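
A small sketch of the arithmetic behind those ranges, under the same assumptions as above (independent samples, n=100,000 guessed rather than known, normal approximation). The function names are made up. Dividing the count's standard deviation sqrt(n*p*(1-p)) by n gives the standard deviation of the proportion, sqrt(p*(1-p)/n).

```javascript
// Half-width of the +/- 2 s.d. range, in percentage points,
// for an observed proportion p out of n samples.
function twoSigmaPercent(p, n) {
  return 2 * Math.sqrt((p * (1 - p)) / n) * 100;
}

// True if the two +/- 2 s.d. ranges overlap, i.e. the data
// can't distinguish the two proportions at this confidence.
function rangesOverlap(p1, p2, n) {
  const lo1 = p1 * 100 - twoSigmaPercent(p1, n);
  const hi1 = p1 * 100 + twoSigmaPercent(p1, n);
  const lo2 = p2 * 100 - twoSigmaPercent(p2, n);
  const hi2 = p2 * 100 + twoSigmaPercent(p2, n);
  return lo1 <= hi2 && lo2 <= hi1;
}

const n = 100000; // assumed sample size, not a figure from the data
console.log(twoSigmaPercent(0.0631, n).toFixed(2));  // one-URI case, May
console.log(rangesOverlap(0.0631, 0.0593, n));       // one-URI case, May vs today
console.log(rangesOverlap(0.000308, 0.000215, n));   // two-URI case, May vs today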

-- 
Philip Taylor
[EMAIL PROTECTED]


[Whatwg] IE-only character entity references

2007-07-30 Thread Philip Taylor
IE undocumentedly recognises some which nobody else does:

aafs  U+206D  ACTIVATE ARABIC FORM SHAPING
ass   U+206B  ACTIVATE SYMMETRIC SWAPPING
iafs  U+206C  INHIBIT ARABIC FORM SHAPING
iss   U+206A  INHIBIT SYMMETRIC SWAPPING
lre   U+202A  LEFT-TO-RIGHT EMBEDDING
lro   U+202D  LEFT-TO-RIGHT OVERRIDE
nads  U+206E  NATIONAL DIGIT SHAPES
nods  U+206F  NOMINAL DIGIT SHAPES
pdf   U+202C  POP DIRECTIONAL FORMATTING
rle   U+202B  RIGHT-TO-LEFT EMBEDDING
rlo   U+202E  RIGHT-TO-LEFT OVERRIDE
zwsp  U+200B  ZERO WIDTH SPACE

(I believe that list is complete.)

The first eleven were suggested on
https://listserv.heanet.ie/cgi-bin/wa?A2=ind9605&L=html-wg&P=4579 some
time ago but don't seem to have gone very far (except into IE).

I can see some legitimate users at
http://www.tasb.com/services/field/staff/index.aspx?print=true and
http://www.pelesoft.co.il/ and maybe there's a few dozen or hundred
more elsewhere (but I can't measure it easily). There's some in
text-art at http://yy28.60.kg/test/read.cgi/maido3/1096370177/l50
and quite a lot in weird places like
http://cheese.2ch.net/life/kako/1010/10103/1010391447.html or
http://zerosen52.gozaru.jp/log/1093422333.html that I don't
understand but that seem to all be on 2channel (or copied from it).
I've no idea how common they are in general.

Are these used significantly on the web, or would they be considered
highly useful if anyone knew they existed, or should HTML5 just ignore
them?

-- 
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] canvas Firefox support for toDataURL broken

2007-07-10 Thread Philip Taylor

On 10/07/07, dev [EMAIL PROTECTED] wrote:

Hey,
I am not sure if this is the correct place to post this, so forgive me
if I am wrong (and point it out too).

The spec states that toDataURL("image/svg+xml") should return the image
in svg format; if it can't support that then a png image should be
returned. But it seems Firefox throws an exception on
canvas.toDataURL("image/svg+xml") whereas it should be returning the
image in png format.


That's correct, and it's just a bug in Firefox. Throwing an exception
on toDataURL("image/png", null) is a vaguely similar bug. (Opera
agrees with the spec in both cases. Safari doesn't implement toDataURL
at all). Probably Firefox should change its behaviour, unless it has
good reasons not to, in which case possibly the spec should change to
match it.

https://bugzilla.mozilla.org/ is the best place for reporting bugs
like this, under component 'Core' / 'Layout: Canvas'. (I've got a load
of test failures recorded at
http://canvex.lazyilluminati.com/tests/tests/results.html, and more
from not-quite-finished tests - I've been waiting to have more
completeness before reporting all the found bugs, but I keep getting
distracted by other things and haven't got around to that yet...)


Regards,
dev


--
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] Canvas arcTo

2007-07-03 Thread Philip Taylor

Which straight line do you mean?

In the first case, the constraints are:

* There is a circle with the given radius.
* The infinite line P0-P1 is tangential to that circle.
* The infinite line P1-P2 is tangential to that circle.
* The Arc is the shortest arc of that circle, between the points where
the circle touches the two lines.

When P0-P1-P2 is a straight line, there is a circle (among many
others) which satisfies the first three constraints, and there is a
zero-length arc of that circle which satisfies the fourth constraint.
(You can't then re-calculate the circle's radius from the arc, because
the arc is just a single point, but I don't think that means the arc
doesn't exist as part of a finite circle). That's not very useful when
you want to draw stuff since there are infinitely many distinct things
you could draw, but it's not the case that there's nothing you could
draw.


In the second case, there is one distinct circle (with zero radius)
which touches both the lines, and there is one distinct point which
the start and end tangent points must be equal to, and the shortest
arc which joins those two points has zero length. There's still
infinitely many such arcs and it gets a bit confusing if you want to
work out its direction (in order to draw line joins and caps), but
you'd always be drawing at least a line from P0 to P1.

(To handle that confusion about the zero-sized arc, I think my earlier
suggestion should be modified to say "... Otherwise, if x1=x2 and
y1=y2, or if the line defined by the points (x0, y0) and (x1, y1) is
parallel and in the same direction as the line defined by the points
(x1, y1) and (x2, y2), ** or if radius is zero, ** then the method
must connect the point (x0, y0) to the point (x1, y1) by a straight
line and add the point (x1, y1) to the subpath. ...")


Actually, I just realised there's still a problem in the normal
non-parallel non-zero-size case, because there are four different
circles which have the two infinite lines as tangents. (And you have
to use infinite lines rather than finite lines, to handle the second
case in http://canvex.lazyilluminati.com/misc/arcto.html like Safari).
So I think it would have to say something like:


Otherwise, let L01 be the line through the points (x0, y0) and (x1,
y1), and let L12 be the line through the points (x1, y1) and (x2, y2).
Consider the circle that has L01 and L12 as tangents, and has its
origin and the point (x2, y2) on the same side of L01, and has its
origin and the point (x0, y0) on the same side of L12, and has radius
radius. The points at which this circle touches these two lines are
called the start and end tangent points respectively. Let The Arc be
the shortest arc given by the circumference of this circle, joining
the start and end tangent points.


unless I got anything else wrong.
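
To check that the wording picks out one circle, here is a sketch of how I'd compute it for the non-collinear case. The function name and the {x, y} point objects are made up for illustration; it takes the circle sitting in the wedge between the ray P1->P0 and the ray P1->P2, which is the one whose centre is on the P2 side of L01 and the P0 side of L12.

```javascript
// Hypothetical helper. Assumes P0, P1, P2 are not collinear and
// radius > 0; the collinear and zero-radius cases are handled elsewhere.
function tangentCircle(p0, p1, p2, radius) {
  const unit = (from, to) => {
    const dx = to.x - from.x;
    const dy = to.y - from.y;
    const len = Math.hypot(dx, dy);
    return { x: dx / len, y: dy / len };
  };
  const u = unit(p1, p0); // direction from P1 towards P0
  const v = unit(p1, p2); // direction from P1 towards P2

  // Half the angle between the two rays at P1.
  const cosTheta = Math.max(-1, Math.min(1, u.x * v.x + u.y * v.y));
  const half = Math.acos(cosTheta) / 2;

  // Distance from P1 to each tangent point, and from P1 to the centre.
  const toTangent = radius / Math.tan(half);
  const toCentre = radius / Math.sin(half);

  // The centre lies on the bisector of the two rays.
  const b = unit({ x: 0, y: 0 }, { x: u.x + v.x, y: u.y + v.y });

  return {
    start: { x: p1.x + u.x * toTangent, y: p1.y + u.y * toTangent },
    end: { x: p1.x + v.x * toTangent, y: p1.y + v.y * toTangent },
    centre: { x: p1.x + b.x * toCentre, y: p1.y + b.y * toCentre },
  };
}

// Right-angle corner at the origin, radius 2.
const corner = tangentCircle({ x: 10, y: 0 }, { x: 0, y: 0 }, { x: 0, y: 10 }, 2);
console.log(corner);
```

The other three circles tangent to both infinite lines are the reflections of this one into the other three wedges around P1, which is why the "same side" conditions are needed.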

On 03/07/07, Kristof Zelechovski [EMAIL PROTECTED] wrote:

The questioned wording is correct: a straight line has infinite radius and
thus does not match the requirement if the radius is finite.
Chris

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Philip Taylor
Sent: Monday, July 02, 2007 1:42 PM
To: WHATWG
Subject: [whatwg] Canvas arcTo

"If the point (x2, y2) is on the line defined by the points (x0, y0)
and (x1, y1) then the method must do nothing, as no arc would satisfy
the above constraints." - why would no arc satisfy the constraints? If
P0, P1, P2 are collinear and non-coincident, then (I think) any of the
(infinitely many) circles which have the given radius and touch
tangential to the line P0-P2 will satisfy the constraints (i.e. being
tangential to P0-P1 at some point and to P1-P2 at some point).

[snip]

"Negative or zero values for radius must cause the implementation to
raise an INDEX_SIZE_ERR exception." - why not allow zero? You just get
an arc at P1 with zero length, with the start and end tangent points
both at P1, so the effect would be a straight line from P0 to P1,
without needing to handle it as a special case. Safari works like
that.



--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas arc

2007-07-03 Thread Philip Taylor

For the 'arc' function:

What if startAngle = endAngle? What if endAngle > 2π + startAngle?
(The endAngle = 2π + startAngle case isn't interesting since
floating-point imprecision means it will never occur.)

In practice: (see the left half of
http://canvex.lazyilluminati.com/misc/arc.html for a (unhelpfully
unlabelled) random collection of examples)

If startAngle = endAngle:
 Firefox (2+3), Safari (3): Nothing is drawn.
 Opera (9.2+9.5): If anticlockwise = true, a full circle is drawn;
otherwise, nothing is drawn.

If endAngle > startAngle + 2π:
 Opera is weird and buggy and would require too much effort to analyse.
 Firefox and Safari mostly match:
 (Assume startAngle = 0 in all the following)

 If endAngle = 2π + ε [where ε is a small positive real number]:
   A full circle is drawn.
 If endAngle = 3π - ε:
   If anticlockwise, 0 to -π is drawn; otherwise a full circle is
drawn, and the 0 to π part is drawn twice (i.e. drawn on top of
itself, which is visible due to antialiasing effects).
 If endAngle = 2nπ - ε for integer n > 1:
   If anticlockwise, nothing is drawn; otherwise:
  Firefox: A full circle is drawn twice.
  Safari: A full circle is drawn n times.
 (Swapping startAngle vs endAngle is equivalent to swapping clockwise
vs anticlockwise.)

So, for FF/Safari: When startAngle - endAngle is in the opposite
direction to the (anti)clockwise flag, the two angles are treated
modulo 2π and the arc is drawn between them in the appropriate
direction. When it's the same direction as the (anti)clockwise flag,
Safari extends the path all the way from startAngle to endAngle (going
round the whole circle multiple times if necessary), and Firefox does
the same except it skips all but the first full
going-round-the-whole-circle bit (so it goes round 1 <= n < 2 times,
if abs(startAngle-endAngle) > 2π).

It seems sensible to adopt either Firefox's or Safari's approach
(which differ only in the amount of overdraw). It's probably easier to
use Firefox's, so then Safari would just have to mod the angles a
little before drawing them, because I can't see any other reason to
choose one approach over the other, and I can't see any reason to
choose a totally different approach.

Talking about arcs is confusing when the arc is more than a full
circle and wraps around itself and isn't really a mathematical arc any
more, so I think it's necessary to not define the operation in terms
of arcs. The best I can think of is:


Let da = endAngle - startAngle.
If anticlockwise is true, and da > 0 or da < -2π, then let d = (da % 2π) - 2π.
If anticlockwise is false, and da < 0 or da > 2π, then let d = (da % 2π) + 2π.
If neither of these cases applies, then let d = (da % 2π).
In this algorithm, the % operator is defined to have the same
semantics as the ECMAScript % operator.
The arc is defined by the points (radius*cos(a), radius*sin(a)) for
all a between startAngle and startAngle + d. The points at a =
startAngle and at a = startAngle + d are the path's start and end
points respectively.


(The relevance of using the ECMAScript % operator is that (-3) % 2 =
-1, etc, so it handles negative numbers (and floating-point numbers)
in the way that is needed here, and I can't think of a better way to
say the same thing that's still as well-defined and not horribly
verbose.)
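
The algorithm can be sketched directly in script. The function name arcSweep is made up, and the comparison operators are my reading of the Firefox behaviour described above (da > 0 or da < -2π for the anticlockwise case, da < 0 or da > 2π for the clockwise case), so treat them as an assumption rather than settled spec text.

```javascript
// Returns the signed sweep d, so the arc runs from startAngle
// to startAngle + d. Negative d means anticlockwise.
function arcSweep(startAngle, endAngle, anticlockwise) {
  const TWO_PI = 2 * Math.PI;
  const da = endAngle - startAngle;
  if (anticlockwise && (da > 0 || da < -TWO_PI)) {
    // Wrong direction, or more than one full turn: wrap, then go
    // anticlockwise (possibly one extra full circle).
    return (da % TWO_PI) - TWO_PI;
  }
  if (!anticlockwise && (da < 0 || da > TWO_PI)) {
    return (da % TWO_PI) + TWO_PI;
  }
  // Already within one turn in the requested direction.
  return da % TWO_PI;
}

// A few of the cases from the list above, with startAngle = 0.
const eps = 0.01;
console.log(arcSweep(0, 0, false));
console.log(arcSweep(0, 3 * Math.PI - eps, true));
console.log(arcSweep(0, 3 * Math.PI - eps, false));
console.log(arcSweep(0, 4 * Math.PI - eps, true));
```

The sweep never exceeds two full turns in magnitude, which corresponds to the Firefox behaviour rather than Safari's n-times overdraw.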

The right half of http://canvex.lazyilluminati.com/misc/arc.html is
implemented as above, and gives exactly the same behaviour as FF in
all the cases I have tried.

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas arcTo

2007-07-02 Thread Philip Taylor

As implemented, the operation of arcTo in Firefox (2, 3) and Opera
(9.2, 9.5) is utterly unrelated to the spec and arguably crazy. At
least Opera has the right spirit and tries drawing arcs between
points, though they're the wrong points and they're always
semicircles. Safari nearly matches the spec, and it's still sensible
when it disagrees with the spec, so that's the only one that's
relevant to consider. There are some examples at
http://canvex.lazyilluminati.com/misc/arcto.html.

"If the point (x2, y2) is on the line defined by the points (x0, y0)
and (x1, y1) then the method must do nothing, as no arc would satisfy
the above constraints." - why would no arc satisfy the constraints? If
P0, P1, P2 are collinear and non-coincident, then (I think) any of the
(infinitely many) circles which have the given radius and touch
tangential to the line P0-P2 will satisfy the constraints (i.e. being
tangential to P0-P1 at some point and to P1-P2 at some point).

When P0-P1 and P1-P2 are parallel and the same direction, Safari
just draws the line P0-P1. When they are parallel but opposing
directions, it instead draws a line from P0 to a point infinitely far
from P0 in the direction P1-P2. That is sensible in both cases since
it's equal to the limit as the two lines tend towards parallelism.

If P0=P1 (and either P2=P1 or P2!=P1) then Safari does nothing at all
and does not add any points to the subpath (or, equivalently, it does
add the point P1 to the subpath, which has no effect since the line
P0-P1 has zero length). If P1=P2 and P0!=P1, then it adds the point
P1 to the subpath. Both of these seem generally sane - there's no
sensible limit as the points tend towards coincidence, so there's no
real correct answer, and drawing the straight line P0-P1 seems an
adequate thing to do.

"Negative or zero values for radius must cause the implementation to
raise an INDEX_SIZE_ERR exception." - why not allow zero? You just get
an arc at P1 with zero length, with the start and end tangent points
both at P1, so the effect would be a straight line from P0 to P1,
without needing to handle it as a special case. Safari works like
that.

So, I think the following definition would cover all the cases and match Safari:


The arcTo(x1, y1, x2, y2, radius) method must do nothing if the
context has no subpaths. If the context does have a subpath, then the
behaviour depends on the arguments and the last point in the subpath.

Let the point (x0, y0) be the last point in the subpath. If x0=x1 and
y0=y1, then the method must do nothing. Otherwise, if x1=x2 and y1=y2,
or if the line defined by the points (x0, y0) and (x1, y1) is parallel
and in the same direction as the line defined by the points (x1, y1)
and (x2, y2), then the method must connect the point (x0, y0) to the
point (x1, y1) by a straight line and add the point (x1, y1) to the
subpath.

Otherwise, if the line defined by the points (x0, y0) and (x1, y1) is
parallel and in the opposite direction to the line defined by the
points (x1, y1) and (x2, y2), then the method must connect the point
(x0, y0) to the point obtained by extending an infinite distance from
(x0, y0) in the direction of the line defined by (x1, y1) and (x2,
y2), and add that new point to the subpath.

Otherwise, let The Arc be the shortest arc given by the circumference
of the circle that has one point tangent to the line defined by the
points (x0, y0) and (x1, y1), another point tangent to the line
defined by the points (x1, y1) and (x2, y2), and that has radius
radius. The points at which this circle touches these two lines are
called the start and end tangent points respectively. The method must
connect the point (x0, y0) to the start tangent point by a straight
line, then connect the start tangent point to the end tangent point by
The Arc, and finally add the start and end tangent points to the
subpath.

Negative values for radius must cause the implementation to raise an
INDEX_SIZE_ERR exception.
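
As a rough sketch of the case analysis above (not code from any browser), the proposed definition could be classified like this. The function name, the returned object shapes, the exact floating-point comparisons and the choice to check the radius first are all assumptions made for illustration; p0 is null when the context has no subpath.

```javascript
// Hypothetical helper: classifies an arcTo() call per the proposed wording.
// Returns a description of what would be added to the subpath.
function classifyArcTo(p0, p1, p2, radius) {
  if (radius < 0) throw new RangeError("INDEX_SIZE_ERR: negative radius");
  if (p0 === null) return { kind: "nothing" }; // no subpath
  if (p0.x === p1.x && p0.y === p1.y) return { kind: "nothing" };

  // a points from P1 back to P0, b points from P1 on to P2.
  const ax = p0.x - p1.x, ay = p0.y - p1.y;
  const bx = p2.x - p1.x, by = p2.y - p1.y;
  const cross = ax * by - ay * bx;
  const dot = ax * bx + ay * by;

  if (bx === 0 && by === 0) return { kind: "lineToP1" };
  if (cross === 0) {
    // Collinear: dot < 0 means P0->P1 and P1->P2 run in the same direction.
    return { kind: dot < 0 ? "lineToP1" : "lineToInfinity" };
  }

  // Angle at P1 between the two lines; tangent points lie at distance
  // d = radius / tan(theta / 2) from P1 along each line.
  const theta = Math.atan2(Math.abs(cross), dot);
  const d = radius / Math.tan(theta / 2);
  const la = Math.hypot(ax, ay), lb = Math.hypot(bx, by);
  return {
    kind: "arc",
    start: { x: p1.x + (d * ax) / la, y: p1.y + (d * ay) / la },
    end: { x: p1.x + (d * bx) / lb, y: p1.y + (d * by) / lb },
  };
}
```

With radius 0 the distance d is 0, so both tangent points collapse onto P1 and the result is just a straight line from P0 to P1, which is the behaviour argued for above.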


--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] WF2 - form action=

2007-06-30 Thread Philip Taylor

WF2 says:

 When the [form element's action] attribute is absent, UAs must act
as if the action attribute was the empty string, which is a relative
URI reference, and would thus point to the current document (or the
specified base URI, if any).

But: 
http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C%21DOCTYPE%20html%3E%0D%0A%3Cbase%20href%3D%22http%3A//google.com%22%3E%3Cform%3E%3Cinput%20type%3Dsubmit%3E

In IE7, FF2, FF3 and Opera 9.2, it ignores the base URI and always
submits to the current page. In Safari 3, it does take account of the
base URI. In all of them, <form action=""> does the same as <form>, and
<form action="."> does take account of the base URI. Perhaps it would
be sensible to follow the majority.
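
To make the difference concrete, here is a small sketch of the two behaviours using the standard URL constructor. The function name and the followMajority flag are made up for illustration; this only models the observations above, not any browser's actual code.

```javascript
// Hypothetical model of where a form submits.
// action: the attribute value, or undefined when the attribute is absent.
function resolveFormAction(action, documentUrl, baseUrl, followMajority) {
  const attr = action === undefined ? "" : action;
  if (attr === "" && followMajority) {
    // IE7/FF2/FF3/Opera 9.2 as described: the base URI is ignored.
    return documentUrl;
  }
  // WF2 wording (and Safari 3): resolve relative to the base URI, if any.
  return new URL(attr, baseUrl || documentUrl).href;
}

const doc = "http://example.org/dir/page.html";
const base = "http://google.com";
console.log(resolveFormAction(undefined, doc, base, false)); // resolved against base
console.log(resolveFormAction(undefined, doc, base, true)); // current page
console.log(resolveFormAction(".", doc, base, true)); // "." always uses the base
```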

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas - non-standard globalCompositeOperation

2007-06-27 Thread Philip Taylor

In addition to the standard values for globalCompositeOperation (and
ignoring 'darker'), Gecko supports:

   clear: The Porter-Duff 'clear' operator, which always sets the
output to rgba(0, 0, 0, 0).

   over: Synonym for 'source-over'. The code says "not part of spec,
kept here for compat". (It looks like FF1.5 had a broken
'source-over', and implemented 'over' like a correct 'source-over'.
'source-over' was fixed in FF2.0, and 'over' left unchanged.)

(See 
http://lxr.mozilla.org/mozilla/source/content/canvas/src/nsCanvasRenderingContext2D.cpp#1703.)

WebKit supports:

   clear: Same as above.

   highlight: Synonym for source-over. (See
http://developer.apple.com/documentation/Cocoa/Reference/ApplicationKit/Classes/NSImage_Class/Reference/Reference.html#//apple_ref/doc/c_ref/NSCompositeHighlight
- NSCompositeHighlight: Deprecated. Mapped to
NSCompositeSourceOver.)

(See 
http://trac.webkit.org/projects/webkit/browser/trunk/WebCore/platform/graphics/GraphicsTypes.cpp#L34.)

Opera is very nice and doesn't do anything wrong.

The spec clearly defines the behaviour here: any attempts to set such
values must be ignored.
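
A minimal sketch of that "ignore unknown values" rule, for reference. The class name is hypothetical, and the set of standard values is my understanding of the list in the spec at the time (with 'darker' left out), so treat it as an assumption.

```javascript
// Assumed list of standard values ('darker' deliberately omitted).
const STANDARD_OPERATIONS = new Set([
  "source-atop", "source-in", "source-out", "source-over",
  "destination-atop", "destination-in", "destination-out", "destination-over",
  "lighter", "copy", "xor",
]);

// Hypothetical stand-in for a 2D context, modelling only this attribute.
class ContextModel {
  constructor() {
    this._op = "source-over";
  }
  get globalCompositeOperation() {
    return this._op;
  }
  set globalCompositeOperation(value) {
    // Unknown values are silently ignored: no exception, no state change.
    if (STANDARD_OPERATIONS.has(value)) this._op = value;
  }
}

const ctx = new ContextModel();
ctx.globalCompositeOperation = "clear"; // non-standard, ignored
console.log(ctx.globalCompositeOperation);
```

Authors can use the same read-back trick to detect whether a browser accepted a value.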



'clear' is pretty useless, since it's exactly equivalent to doing
globalAlpha = 0; globalCompositeOperation = 'copy' or (depending on
the transform matrix) clearRect(0, 0, w, h). The spec already omits
the Porter-Duff 'B' operator (which sets the output to be equal to the
destination bitmap, i.e. is equivalent to not drawing anything at
all), so it does not seem reasonable to argue for adding 'clear' just
for completeness. I can't think of any other reasons for it to be
added to the spec, other than for interoperability.
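
The equivalence can be checked per pixel. This is only a sketch under simple assumptions: pixels are premultiplied [r, g, b, a] arrays with components in 0..1, the pixel is inside the clipping region, and the function names are invented here.

```javascript
// 'copy': the output is the source (scaled by globalAlpha); the
// destination is ignored.
function compositeCopy(src, dst, globalAlpha) {
  return src.map((c) => c * globalAlpha);
}

// Porter-Duff 'clear': the output is always transparent black.
function compositeClear(src, dst) {
  return [0, 0, 0, 0];
}

const src = [0.5, 0.25, 0, 0.5];
const dst = [0, 0.8, 0, 0.8];
// With globalAlpha = 0, 'copy' produces the same pixel as 'clear'.
console.log(compositeCopy(src, dst, 0), compositeClear(src, dst));
```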



As far as I can imagine, for each non-standard value, the possible
situations are:

* No content relies on that value.
 => Web browsers should remove support for it: it has no purpose, and
it may result in authors accidentally using that value and becoming
confused when their code doesn't work in other browsers, which will be
irritating for everyone, and it will evolve into the next situation:

* Web content relies on that value.
 => It should be added to the spec, because it's necessary for
handling web content.

* Non-web, browser-specific content (extensions, widgets, etc) relies
on that value, and web content doesn't.
 => It should be disabled except when run in the extension/widget/etc
context, to avoid the problems as in the first case. That may cause
minor confusion to the extension/widget/etc authors about why their
code [which is relying on undocumented features] works differently if
they run it on the web instead, but that seems insignificant compared
to having interoperability problems on the web.

* Nobody cares.
 => Nothing happens.


Am I missing any issues here? Would any browser developer think one of
the first three situations applies, and be willing to make the
necessary changes in that case?

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas patterns, and miscellaneous other things

2007-06-23 Thread Philip Taylor

What should happen if you try drawing a 0x0-pixel repeating pattern?
(I can't find a way to make a 0x0 image that any browser will load,
but the spec says you can make a 0x0 canvas. Firefox and Opera can't
make a 0x0 canvas - it acts like it's 300x150 pixels instead. Safari
returns null from createPattern when it's 0x0.)


On a somewhat related note: What should canvas.width = canvas.height
= 0; canvas.toDataURL() do, given that you can never make a valid 0x0
PNG? (Firefox and Opera make the canvas 300x150 pixels instead, so you
can't actually get it that small. Safari can make it that small, but
doesn't implement toDataURL.)

Similarly, what should toDataURL do when the canvas is really large
and the browser doesn't want to give you a data URI? (Opera returns
'undefined' if it's >= 30001 pixels in any dimension, and crashes if
it's 3 in each dimension. Firefox (2 and trunk) crashes or hangs
on Linux if it's >= 32768 pixels in any dimension, and crashes on
Windows if it's >= 65536 pixels).

More generally, the spec says "If the user agent does not support the
requested type, it must return the image using the PNG format" - what
if it does support the requested type, but still doesn't want to give
you a data URI, e.g. because it's the wrong size (too large, too
small, not a multiple of 4, etc) or because of other environmental
factors (e.g. it wants you to do
getContext('vendor-2d').enableVectorCapture() before
toDataURL('image/svg+xml'))? (Presumably it would be some combination
of falling back to PNG (if you asked for something else), returning
undefined, and throwing exceptions.)
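
Until that is defined, authors would have to code defensively against all of those outcomes. A sketch of what that might look like, where the helper name and the returned object shape are invented for illustration and `canvas` is anything with width, height and toDataURL:

```javascript
// Hypothetical defensive wrapper around toDataURL().
// Returns null when no usable data URI comes back.
function tryToDataURL(canvas, type) {
  // A 0x0 canvas cannot be encoded as a valid PNG, so give up early.
  if (canvas.width === 0 || canvas.height === 0) return null;
  let result;
  try {
    result = canvas.toDataURL(type);
  } catch (e) {
    return null; // the UA chose to throw
  }
  // Some UAs return undefined instead of a string.
  if (typeof result !== "string" || !result.startsWith("data:")) return null;
  // The MIME type sits between "data:" and the first ';' or ','.
  const mime = result.slice(5, result.search(/[;,]/));
  return {
    url: result,
    type: mime,
    fellBack: type !== undefined && mime !== type,
  };
}
```

The caller can then tell apart "got what I asked for", "got PNG instead" and "got nothing", which are the three outcomes guessed at above.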


"If the empty string or null is specified, repeat must be assumed." -
why allow null, but not undefined or missing? (It would seem quite
reasonable for createPattern(img) to default to a repeating pattern).
(Currently all implementations throw exceptions for undefined/missing,
and Opera and Safari throw for null.)
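
For illustration, the more lenient behaviour suggested here could be sketched as below. The function name is hypothetical, and the SyntaxError for unrecognised strings is my reading of the spec's SYNTAX_ERR requirement rather than something stated in this message.

```javascript
// Hypothetical normalisation of createPattern()'s repetition argument,
// treating undefined/missing the same as null and the empty string.
function normalizeRepetition(repetition) {
  if (repetition === undefined || repetition === null || repetition === "") {
    return "repeat";
  }
  const allowed = ["repeat", "repeat-x", "repeat-y", "no-repeat"];
  if (!allowed.includes(repetition)) {
    throw new SyntaxError("SYNTAX_ERR: unrecognised repetition value");
  }
  return repetition;
}

console.log(normalizeRepetition()); // missing argument defaults to repeating
console.log(normalizeRepetition("no-repeat"));
```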


'complete' for images is underspecified, so it's not possible to test
the related createPattern/drawImage requirements. (Is it set before
onload is called? Can it be set as soon as the Image() constructor
returns? Can it be set at an arbitrary point during execution of the
script that called the Image() constructor? Is it reset when you
change src? etc. Implementations all seem to disagree in lots of
ways.)


About radial gradients: "If x0 = x1 and y0 = y1 and r0 = r1, then the
radial gradient must paint nothing." - that conflicts with the
previous "must" for following the algorithm, so it's not precise about
which you must do. It should probably say "If ... then the radial
gradient must paint nothing. Otherwise, radial gradients must be
rendered by following these steps:".


<code title=dom-attr-complete>complete</code> (twice) - looks like
it should be dom-img-complete, so it points to #complete.

createPattern(image, repetition) - the parameters should be in vars.

"The images are not be scaled by this process" - s/be //

interface HTMLCanvasElement : HTMLElement {
 attribute unsigned long width;
 attribute unsigned long height;
^ incorrect indentation (should have two more spaces).

Somewhere totally unrelated:
interface HTMLDetailsElement : HTMLElement {
  attribute boolean open;
^ incorrect indentation (should have nine more spaces).

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas line styles comments

2007-06-19 Thread Philip Taylor
.]]

* If the value is bevel, then no extra rendering is needed.

* If the value is round, then UAs must add a filled arc connecting the
corners of the strokes on the outside of the join, with the arc's
diameter equal to the line width and with its origin at the point of
the join.

* If the value is miter, then the intersection point P of the two
tangents to the edges of the strokes on the outside of the join is
calculated. If the distance from P to the join is greater than or
equal to the miter limit ratio multiplied by the line width, then no
extra rendering is needed. Otherwise, a triangle must be added between
P and the corners of the strokes on the outside of the join.
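
The miter rule above can be sketched numerically. This follows the wording in this message (comparing against the miter limit ratio multiplied by the line width), which is not necessarily what any implementation or later spec text does; the function name and argument layout are made up. For a join at J between segments A-J and J-B with angle theta at J, the outer edges meet at distance (lineWidth / 2) / sin(theta / 2) from J.

```javascript
// Hypothetical check: should a miter triangle be added at join j?
// a and b are the far ends of the two lines meeting at j.
function miterJoin(a, j, b, lineWidth, miterLimit) {
  const ux = a.x - j.x, uy = a.y - j.y;
  const vx = b.x - j.x, vy = b.y - j.y;
  const cos = (ux * vx + uy * vy) / (Math.hypot(ux, uy) * Math.hypot(vx, vy));
  // Clamp against rounding error before taking the arc cosine.
  const theta = Math.acos(Math.max(-1, Math.min(1, cos)));
  // Distance from the join to the intersection point P of the outer edges.
  // A line folding back on itself (theta = 0) gives Infinity.
  const miterLength = lineWidth / 2 / Math.sin(theta / 2);
  return {
    miterLength,
    drawMiter: miterLength < miterLimit * lineWidth,
  };
}

// Right-angle corner, lineWidth 2: P is sqrt(2) away from the join.
console.log(miterJoin({ x: -10, y: 0 }, { x: 0, y: 0 }, { x: 0, y: 10 }, 2, 10));
```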

The final stroke shape of a path is the union of the line strokes,
line caps and line joins for all of its subpaths. [[In particular,
there's no non-zero winding number rule. Also, subpaths aren't drawn
separately - they're just combined into one shape which then gets
filled and composited.]]

...much later...

The stroke() method must calculate the final stroke shape of the
current path, using the lineWidth, lineJoin, lineCap, and (if
appropriate) miterLimit attributes, and then fill this shape using the
strokeStyle attribute.


(Hopefully there aren't too many errors in there.)

(Is it worth having diagrams (kind of like
http://canvex.lazyilluminati.com/misc/linejoin.png), so normal people
can tell what the interesting bits here actually mean? Or is that best
left for tutorials and user reference guides?)


There are some other issues I'm currently aware of, possibly requiring
more complexity:

What happens when a stroked path has zero length, in terms of drawing
the line caps/joins? In particular, square caps are impossible because
the line does not have a defined direction (assuming we're not having
dashed paths for now). In Firefox 2 and Opera, nothing is drawn for
zero-length paths. In Firefox 3 and Safari, round caps/joins are drawn
(because the direction of the line doesn't matter in that case, so the
output is well-defined), and nothing else is drawn.

What happens when a stroked path contains a line with zero length,
between non-zero-length lines? As far as I can tell, zero-length lines
never have any effect (e.g. line-joins get drawn between two
non-consecutive non-zero-length lines if they have only zero-length
lines between them, so the earlier suggestion for defining 'join' is
wrong) - except when the path has no non-zero-length lines in it, in
which case the presence of a zero-length line causes round caps to be
drawn in FF3/Safari. (...except in FF3 when it's a zero-length
quadratic/Bézier curve). Maybe it'd be best just to require that lines
with zero length are never added to the subpath - so if you don't add
any non-zero-length ones, the subpath will be empty and won't get
drawn, which is slightly incompatible with Safari/FF3 but hopefully
easy to fix in them, and compatible with Opera/FF2.
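
A tiny model of that suggested rule, just to show the effect. The class and method names are hypothetical and only straight lines are handled:

```javascript
// Hypothetical subpath model that never stores zero-length lines.
class SubpathModel {
  constructor() {
    this.points = [];
  }
  moveTo(x, y) {
    this.points = [{ x, y }];
  }
  lineTo(x, y) {
    const last = this.points[this.points.length - 1];
    // A line back to the current point has zero length: drop it.
    if (last && last.x === x && last.y === y) return;
    this.points.push({ x, y });
  }
  // With fewer than two points there is nothing to stroke, so no caps
  // or joins are drawn at all.
  hasStroke() {
    return this.points.length >= 2;
  }
}

const p = new SubpathModel();
p.moveTo(50, 50);
p.lineTo(50, 50); // ignored
console.log(p.hasStroke());
```

Because dropped lines never reach the join logic, joins are naturally computed between the surrounding non-zero-length lines.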

--
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] HTML syntax: comments before doctype and doctype sniffing

2007-06-18 Thread Philip Taylor

On 18/06/07, Ian Hickson [EMAIL PROTECTED] wrote:

On Sun, 3 Dec 2006, Simon Pieters wrote:
 Also, as an additional constraint in the syntax section, the entire
 doctype probably should (or must) be within the first 1024 bytes,
 because AFAIK browsers generally only sniff for the first 1024 bytes,
 and if they don't find the entire doctype within that then you get
 quirks mode.

I couldn't reproduce that.


In Firefox 2:

javascript:s='?';for(i=0;i<1006;++i)s+='
';window.location='data:text/html,'+s+'<!doctype
html><script>document.write(document.compatMode)</script>'

javascript:s='?';for(i=0;i<1007;++i)s+='
';window.location='data:text/html,'+s+'<!doctype
html><script>document.write(document.compatMode)</script>'

The first produces CSS1Compat, the second BackCompat. As far as I can
tell, Firefox requires the doctype to be found when parsing [using
standards-mode rules] the first 1024 characters (not bytes) from the
first non-whitespace character, and then it reparses the whole
document in quirks mode if necessary.

--
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] HTML syntax: comments before doctype and doctype sniffing

2007-06-18 Thread Philip Taylor

On 18/06/07, Martin Payne [EMAIL PROTECTED] wrote:

Philip Taylor wrote:
 In Firefox 2:

 javascript:s='?';for(i=0;i<1006;++i)s+='
 ';window.location='data:text/html,'+s+'<!doctype
 html><script>document.write(document.compatMode)</script>'

 javascript:s='?';for(i=0;i<1007;++i)s+='
 ';window.location='data:text/html,'+s+'<!doctype
 html><script>document.write(document.compatMode)</script>'

 The first produces CSS1Compat, the second BackCompat.

Not for me it doesn't (Mozilla/5.0 (X11; U; Linux i686; en-US;
rv:1.8.1.4) Gecko/20070603 Fedora/2.0.0.4-2.fc8 Firefox/2.0.0.4). Both
render in standards mode for me.


Hmm, that might have been some unfortunate line wrapping - it's
probably better to write:

javascript:s='?';for(i=0;i<1006;++i)s+='%20';window.location='data:text/html,'+s+'<!doctype%20html><script>document.write(document.compatMode)</script>'

javascript:s='?';for(i=0;i<1007;++i)s+='%20';window.location='data:text/html,'+s+'<!doctype%20html><script>document.write(document.compatMode)</script>'

where each should be one line with no spaces. Then I get the
CSS1Compat/BackCompat difference when just copying those into the
location bar, in Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.8.1.4)
Gecko/20070515 Firefox/2.0.0.4 and Mozilla/5.0 (X11; U; Linux i686;
en-US; rv:1.9a6pre) Gecko/20070618 Minefield/3.0a6pre.

(IE7 and Opera 9 don't appear to have any limit on how early the
doctype should appear.)
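
To avoid line-wrapping problems entirely, the test URL can be built by a small function instead of pasted. This is only a sketch: the function name is made up, and it assumes the markup in the test is a doctype followed by a script element printing document.compatMode.

```javascript
// Hypothetical builder for the data: URL used in the test above.
// paddingCount is the number of encoded spaces placed before the doctype.
function buildSniffTestUrl(paddingCount) {
  let s = "?";
  for (let i = 0; i < paddingCount; ++i) s += "%20";
  return (
    "data:text/html," +
    s +
    "<!doctype%20html><script>document.write(document.compatMode)</script>"
  );
}

// In a browser: window.location = buildSniffTestUrl(1006);
// then try 1007 and compare the compatMode that gets written out.
console.log(buildSniffTestUrl(2));
```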

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas ImageData comments

2007-06-15 Thread Philip Taylor
.

"If getContext() is called with that exact string for tis contextId
argument ..." - s/tis/its/

"while one could create an ImageData object, one would net necessarily
know what resolution the canvas expected" - s/net/not/

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Canvas shadow rendering

2007-06-14 Thread Philip Taylor
 the images (particularly image A) have infinite
size, so a shape drawn entirely off-screen can still cast shadows into
the visible area.

The algorithm as specified is quite horrendously inefficient, but the
obvious optimisations are to skip the entire shadow part if the shadow
colour is fully transparent, and to perform the Gaussian blur by doing
the horizontal and vertical components separately and ignoring the
bits where max(u,v) > shadowBlur (because G will be so small that its
contribution to the shadow will be lost in the rounding errors). I
assume implementors can work that out for themselves, since it's just
a standard Gaussian blur - the only peculiar bit is the mapping from
shadowBlur to σ. So that's all alright.
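
For anyone who wants the separable version spelled out, here is a sketch operating on a 2-D array of alpha values. The function names are invented, sigma and radius are passed in directly because the mapping from shadowBlur to σ is not reproduced here, and samples outside the array are treated as transparent, which is a simplification of the infinite-image model.

```javascript
// Normalised 1-D Gaussian kernel, truncated at `radius` samples each side.
function gaussianKernel(sigma, radius) {
  const kernel = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const g = Math.exp(-(i * i) / (2 * sigma * sigma));
    kernel.push(g);
    sum += g;
  }
  return kernel.map((g) => g / sum);
}

// Convolve every row with the kernel (outside samples count as 0).
function blurRows(img, kernel) {
  const radius = (kernel.length - 1) / 2;
  return img.map((row) =>
    row.map((_, x) => {
      let acc = 0;
      for (let k = -radius; k <= radius; k++) {
        const xx = x + k;
        if (xx >= 0 && xx < row.length) acc += row[xx] * kernel[k + radius];
      }
      return acc;
    })
  );
}

function transpose(img) {
  return img[0].map((_, x) => img.map((row) => row[x]));
}

// Horizontal pass, then vertical pass via transposition.
function separableBlur(alpha, sigma, radius) {
  const kernel = gaussianKernel(sigma, radius);
  const horizontal = blurRows(alpha, kernel);
  return transpose(blurRows(transpose(horizontal), kernel));
}
```

Two 1-D passes cost O(radius) per pixel instead of O(radius squared) for the direct 2-D sum, which is the whole point of the optimisation.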


One odd issue is with clearRect - Safari sort of applies shadows to
that, as in http://canvex.lazyilluminati.com/misc/shadow/shadow4.html,
except it ignores the colour and just uses the blur. It seems sensible
to just call that a bug, and require that shadows never apply to
clearRect since it doesn't go through the Drawing Model at all. I'll
try to look out for any other possible problem areas.

--
Philip Taylor
[EMAIL PROTECTED]


[whatwg] Numerical imprecision in charset detection

2007-06-01 Thread Philip Taylor

8.2.2. The input stream: "If the next six characters are not
'charset'" - s/six/seven/

--
Philip Taylor
[EMAIL PROTECTED]


Re: [whatwg] noscript should be allowed in head

2007-05-30 Thread Philip Taylor

On 30/05/07, Maciej Stachowiak [EMAIL PROTECTED] wrote:


On May 30, 2007, at 2:02 AM, Julian Reschke wrote:
 So let's rephrase this question: will there be a conformance class
 for HTML5 consumers that *only* accept conforming documents? (Keep
 in mind that these consumers may not even have a DOM or a
 Javascript engine).

Do you mean: (A) only documents that meet all document conformance
criteria (B) only documents that meet all *machine-checkable*
conformance criteria or (C) documents that would not trigger any
parse errors if the parsing algorithm were applied?


Perhaps it would be better to rephrase as: Will there be a conformance
class for HTML5 consumers that process conforming documents according
to the spec, but process non-conforming documents in an undefined way?
(Some non-conforming documents might still be processed according to
the spec, instead of being rejected, so it doesn't *only* accept
conforming documents. That makes it not be impossible, when using the
full definition of conformance.)

At least that's how I interpret the original intent - it means tools
in systems with guaranteed document conformance (i.e. not taking input
from the general web) could be simplified while still claiming to be
conformant and still being interoperable with other such tools. They
would only have to be compatible with the rules for processing
conforming documents, instead of being compatible with the rules
defined by browsers for non-conforming documents. (Is that
interpretation correct, or am I totally missing the point?)

(I'm not sure whether it's that useful to be able to claim conformance
for its own sake. Interoperability is useful, but maybe that can be
achieved by imagining a new spec which just says "If a document is
conforming according to the definition in HTML5, then it must be
processed as described in HTML5, otherwise the document should be
rejected but anything may happen", and all the tools can follow that,
so there's no need for HTML5 itself to explicitly allow that.)


 (Keep
 in mind that these consumers may not even have a DOM or a
 Javascript engine).


http://www.whatwg.org/specs/web-apps/current-work#non-scripted already
defines UA conformance when there's no scripting, which seems to cover
those cases.

--
Philip Taylor
[EMAIL PROTECTED]

